Jacaranda 2 HSC 3rd Edition


MICHAEL ANDRIESSEN • PETER PENTLAND

RICHARD GAUT • BRUCE McKAY • JILL TACON

JACARANDA HSC SCIENCE


Third edition published 2008 by
John Wiley & Sons Australia, Ltd
42 McDougall Street, Milton, Qld 4064
Typeset in 10.5/12pt New Baskerville
© Michael Andriessen, Peter Pentland, Richard Gaut, Bruce McKay,
Jillian Tacon and Upgrade Business Systems (Ric Morante) 2008
First edition published 2001
© Michael Andriessen, Peter Pentland, Richard Gaut and
Bruce McKay 2001
Second edition published 2003
© Michael Andriessen, Peter Pentland, Richard Gaut, Bruce McKay
and Jillian Tacon 2003
The moral rights of the authors have been asserted.
National Library of Australia
Cataloguing-in-Publication data

Title: Physics 2 HSC course/Michael Andriessen ... [et al.].
Edition: 3rd ed.
ISBN: 978 0 7314 0823 8 (pbk.)
Notes: Includes index.
Target audience: For secondary school age.
Subjects: Physics — Textbooks.
Other authors/contributors: Andriessen, Michael.
Dewey number: 530

Reproduction and communication for educational purposes


The Australian Copyright Act 1968 allows a maximum of one chapter
or 10% of the pages of this work, whichever is the greater, to be
reproduced and/or communicated by any educational institution
for its educational purposes provided that the educational
institution (or the body that administers it) has given a
remuneration notice to Copyright Agency Limited (CAL).
Reproduction and communication for other purposes
Except as permitted under the Act (for example, a fair dealing
for the purposes of study, research, criticism or review), no part
of this book may be reproduced, stored in a retrieval system,
communicated or transmitted in any form or by any means without
prior written permission. All inquiries should be made to the
publisher.

All activities have been written with the safety of both teacher and
student in mind. Some, however, involve physical activity or the use
of equipment or tools. All due care should be taken when
performing such activities. Neither the publisher nor the authors
can accept responsibility for any injury that may be sustained when
completing activities described in this textbook.

Front and back cover images: © Photodisc, Inc.


Illustrated by the Wiley Art Studio
Printed in Singapore by
Craft Print
10 9 8 7 6 5 4 3 2 1
CONTENTS

Preface viii
About eBookPLUS ix
Syllabus grid x
Acknowledgements xvi

HSC CORE MODULE: Space

Chapter 1: Earth’s gravitational field 2
1.1 The Earth’s gravity 3
1.2 Weight 6
1.3 Gravitational potential energy 7
Summary 10
Questions 10
Practical activities 11

Chapter 2: Launching into space 13


2.1 Projectile motion 14
2.2 Escape velocity 23
2.3 Lift-off 24
Summary 33
Questions 33
Practical activities 35

Chapter 3: Orbiting and re-entry 38


3.1 In orbit 39
3.2 Re-entry 50
Summary 56
Questions 56
Practical activities 58

Chapter 4: Gravity in the solar system 60


4.1 The Law of Universal Gravitation 61
4.2 Gravitational fields 65
4.3 The slingshot effect 66
Summary 70
Questions 70

Chapter 5: Space and time 71


5.1 The aether model 72
5.2 Special relativity 74
5.3 Consequences of special relativity 77
Summary 93
Questions 93
Practical activities 96

HSC CORE MODULE: Motors and generators

Chapter 6: The motor effect and DC electric motors 100
6.1 The motor effect 103
6.2 Forces between two parallel conductors 105
6.3 Torque 107
6.4 DC electric motors 109
Summary 116
Questions 116
Practical activities 120

Chapter 7: Generating electricity 122
7.1 The discoveries of Michael Faraday 123
7.2 Electromagnetic induction 126
7.3 Generating a potential difference 127
7.4 Lenz’s law 128
7.5 Eddy currents 131
Summary 134
Questions 134
Practical activities 137

Chapter 8: Generators and power distribution 139


8.1 Generators 140
8.2 Electric power generating stations 146
8.3 Transformers 148
8.4 Power distribution 151
8.5 Electricity and society 156
Summary 157
Questions 157
Practical activities 160

Chapter 9: AC electric motors 163


9.1 Main features of an AC motor 164
9.2 Energy transformations and transfers 169
Summary 171
Questions 171
Practical activities 172

HSC CORE MODULE: From ideas to implementation

Chapter 10: Cathode rays and the development of television 174
10.1 The discovery of cathode rays 175
10.2 Effect of electric fields on cathode rays 177
10.3 Effect of magnetic fields on cathode rays 182
10.4 Determining the charge-to-mass ratio of cathode rays 183
10.5 Cathode rays — waves or particles? 184
10.6 Applications of cathode rays 186
Summary 189
Questions 189
Practical activities 191

Chapter 11: The photoelectric effect and black body radiation 193
11.1 Maxwell’s theory of electromagnetic waves 194
11.2 Heinrich Hertz and experiments with radio waves 196
11.3 The black body problem and the ultraviolet catastrophe 199
11.4 What do we mean by ‘classical physics’ and ‘quantum theory’? 202
11.5 The photoelectric effect 202
Summary 209
Questions 209
Practical activities 211

Chapter 12: The development and application of transistors 212
12.1 Conductors, insulators and semiconductors 213
12.2 Band structures in semiconductors 216
12.3 Doping and band structure 219
12.4 Thermionic devices 220
12.5 Solid state devices 222
12.6 Thermionic versus solid state devices 224
12.7 Invention of the transistor 225
12.8 Integrated circuits 227
Summary 230
Questions 230
Practical activities 231

Chapter 13: Superconductivity 232


13.1 Interference 233
13.2 Diffraction 235
13.3 X-ray diffraction 235
13.4 Bragg’s experiment 238
13.5 The crystal lattice structure of metals 239
13.6 Superconductivity 240
13.7 How is superconductivity explained? 243
Summary 251
Questions 251
Practical activities 253

HSC OPTION MODULE: Astrophysics

Chapter 14: Looking and seeing 256
14.1 Galileo’s telescopes 257
14.2 Atmospheric absorption of the electromagnetic spectrum 258
14.3 Telescopes 261
14.4 Seeing 265
14.5 Modern methods to improve telescope performance 265
Summary 270
Questions 270
Practical activities 272

Chapter 15: Astronomical measurement 274


15.1 Astrometry 275
15.2 Spectroscopy 279
15.3 Photometry 289
Summary 299
Questions 299
Practical activities 302

Chapter 16: Binaries and variables 305


16.1 Binaries 306
16.2 Variables 312
Summary 316
Questions 316
Practical activities 318

Chapter 17: Star lives 320
17.1 Star birth 321
17.2 Main sequence star life 324
17.3 Star life after the main sequence 327
17.4 Star death 332
Summary 336
Questions 336
Practical activities 338

HSC OPTION MODULE: Medical physics

Chapter 18: The use of ultrasound in medicine 340
18.1 What type of sound is ultrasound? 341
18.2 Using ultrasound to detect structure inside the body 343
18.3 Producing and detecting ultrasound: the piezoelectric effect 347
18.4 Gathering and using information in an ultrasound scan 348
18.5 Using ultrasound to examine blood flow 352
Summary 358
Questions 358

Chapter 19: Electromagnetic radiation as a diagnostic tool 361
19.1 X-rays in medical diagnosis 362
19.2 CT scans in medical diagnosis 368
19.3 Endoscopes in medical diagnosis 373
Summary 378
Questions 378
Practical activities 380

Chapter 20: Radioactivity as a diagnostic tool 381


20.1 Radioactivity and the use of radioisotopes 382
20.2 Positron emission tomography (PET) 392
20.3 Imaging methods working together 394
Summary 396
Questions 396

Chapter 21: Magnetic resonance imaging as a diagnostic tool 398
21.1 The patient and the image using MRI 399
21.2 The MRI machine: effect on atoms in the patient 402
21.3 Medical uses of MRI 410
21.4 Comparison of the main imaging techniques 412
Summary 415
Questions 415

HSC OPTION MODULE: From quanta to quarks

Chapter 22: The atomic models of Rutherford and Bohr 418
22.1 The Rutherford model of the atom 419
22.2 Bohr’s model of the atom 423
22.3 Bohr’s postulates 427
22.4 Mathematics of the Rutherford and Bohr models 429
22.5 Limitations of the Bohr model of the atom 434
Summary 435
Questions 435
Practical activities 437

Chapter 23: Development of quantum mechanics 440
23.1 Diffraction 441
23.2 Steps towards a complete quantum theory model of the atom 444
Summary 452
Questions 452

Chapter 24: Probing the nucleus 453


24.1 Discoveries pre-dating the nucleus 454
24.2 Discovery of the neutron 458
24.3 Discovery of the neutrino 461
24.4 The strong nuclear force 466
24.5 Mass defect and binding energy of the nucleus 468
Summary 472
Questions 472

Chapter 25: Nuclear fission and other uses of nuclear physics 474
25.1 Energy from the nucleus 475
25.2 The discovery of nuclear fission 476
25.3 The development of the atom bomb 480
25.4 Nuclear fission reactors 484
25.5 Medical and industrial applications of radioisotopes 489
25.6 Neutron scattering 492
Summary 493
Questions 493

Chapter 26: Quarks and the Standard Model of particle physics 495
26.1 Instruments used by particle physicists 496
26.2 The Standard Model of particle physics 503
Summary 516
Questions 516
Practical activities 518

Glossary 521
Appendix 1: Formulae and data sheet 526
Appendix 2: Periodic table 528
Appendix 3: Key words for examination questions 529
Answers to numerical questions 531
Index 536

PREFACE
This third edition of Physics 2: HSC Course is revised and updated to meet
all the requirements of the amended Stage 6 Physics Syllabus for Year 12
students in New South Wales. Written by a team of experienced Physics
teachers, Physics 2 offers a complete resource with coverage of the three
core modules as well as three option modules: Quanta to Quarks,
Astrophysics and Medical Physics. An additional option topic, The Age of
Silicon, is available online.

Physics 2 features:
• full-colour, high-quality, detailed illustrations to enhance students’
understanding of Physics concepts
• clearly written explanations and sample problems
• interest boxes focusing on up-to-date information, current research
and new discoveries
• practical activities at the end of each chapter to support the syllabus
investigations
• key terms highlighted and defined in the context of the chapters and
in a complete glossary
• chapter reviews that provide a summary and a range of
problem-solving and descriptive questions.

eBookPLUS: Next generation teaching and learning

This title features eBookPLUS: an electronic version of the textbook and
a complementary set of targeted digital resources. These flexible and
engaging ICT activities are available online at the JacarandaPLUS website
(www.jacplus.com.au).
eBookPLUS icons within the text direct students to the online
resources, which include:
• eModelling: Excel spreadsheets that provide examples of numerical and
algebraic modelling
• eLessons: Video and animations that reinforce study by bringing key
concepts to life
• Interactivities: Interactive study activities that enhance student
understanding of key concepts through hands-on experience
• Weblinks: HTML links to other useful support material on the internet.

About eBookPLUS
Physics 2: HSC Course, 3rd edition features eBookPLUS: an electronic version of the entire textbook and supporting
multimedia resources. It is available for you online at the JacarandaPLUS website (www.jacplus.com.au).

Using the JacarandaPLUS website

To access your eBookPLUS resources, simply log on to www.jacplus.com.au.
There are three easy steps for using the JacarandaPLUS system.

Step 1. Create a user account

The first time you use the JacarandaPLUS system, you will need to create
a user account. Go to the JacarandaPLUS home page (www.jacplus.com.au)
and follow the instructions on screen.

Step 2. Enter your registration code


Once you have created a new
account and logged in, you will
be prompted to enter your unique
registration code for this book,
which is printed on the inside
front cover of your textbook.

LOGIN
Once you have created your account,
you can use the same email address and
password in the future to register any
JacarandaPLUS books.

Step 3. View or download eBookPLUS resources


Your eBook and supporting resources are provided
in a chapter-by-chapter format. Simply select the
desired chapter from the drop-down list and navigate
through the tabs to locate the appropriate resource.

Minimum requirements

• Internet Explorer 7, Mozilla Firefox 1.5 or Safari 1.3
• Adobe Flash Player 9
• Javascript must be enabled (most browsers are enabled by default).

Troubleshooting

• Go to the JacarandaPLUS help page at www.jacplus.com.au
• Contact John Wiley & Sons Australia, Ltd.
  Email: [email protected]
  Phone: 1800 JAC PLUS (1800 522 7587)

SYLLABUS GRID
Core module: SPACE (chapters 1–5, pages 1–98)
1. The Earth has a gravitational field that exerts a force on objects both on it and around it
Students learn to:
• define weight as the force on an object due to a gravitational field (page 8)
• explain that a change in gravitational potential energy is related to work done (page 7)
• define gravitational potential energy as the work done to move an object from a very large distance away to a point in a gravitational field (pages 7–9)
  $E_p = -G \dfrac{m_1 m_2}{r}$

Students:
• perform an investigation and gather information to determine a value for acceleration due to gravity using pendulum motion or computer-assisted technology and identify reasons for possible variations from the value 9.8 m s⁻² (page 11)
• gather secondary information to predict the value of acceleration due to gravity on other planets (pages 5, 10, 12)
• analyse information using the expression F = mg to determine the weight force for a body on Earth and for the same body on other planets (pages 6, 10, 12)

2. Many factors have to be taken into account to achieve a successful rocket launch, maintain a stable orbit and return to Earth
Students learn to:
• describe the trajectory of an object undergoing projectile motion within the Earth’s gravitational field in terms of horizontal and vertical components (pages 14–23)
• describe Galileo’s analysis of projectile motion (page 14)
• explain the concept of escape velocity in terms of the: (pages 23–4)
  – gravitational constant
  – mass and radius of the planet
• outline Newton’s concept of escape velocity (page 23)
• identify why the term ‘g forces’ is used to explain the forces acting on an astronaut during launch (pages 26–31)
• discuss the effect of the Earth’s orbital motion and its rotational motion on the launch of a rocket (pages 31–2)
• analyse the changing acceleration of a rocket during launch in terms of the: (pages 25, 26–7; see also 36–7)
  – Law of Conservation of Momentum
  – forces experienced by astronauts
• analyse the forces involved in uniform circular motion for a range of objects, including satellites orbiting the Earth (pages 39–41)
• compare qualitatively low Earth and geo-stationary orbits (pages 47–8)
• define the term orbital velocity and the quantitative and qualitative relationship between orbital velocity, the gravitational constant, mass of the central body, mass of the satellite and the radius of the orbit using Kepler’s Law of Periods (pages 41–4)
• account for the orbital decay of satellites in low Earth orbit (pages 49–50)
• discuss issues associated with safe re-entry into the Earth’s atmosphere and landing on the Earth’s surface (pages 50–5)
• identify that there is an optimum angle for safe re-entry for a manned spacecraft into the Earth’s atmosphere and the consequences of failing to achieve this angle (page 51)

Students:
• solve problems and analyse information to calculate the actual velocity of a projectile from its horizontal and vertical components using: (pages 19–22, 33–4)
  $v_x^2 = u_x^2$
  $v = u + at$
  $v_y^2 = u_y^2 + 2 a_y \Delta y$
  $\Delta x = u_x t$
  $\Delta y = u_y t + \tfrac{1}{2} a_y t^2$
• perform a first-hand investigation, gather information and analyse data to calculate initial and final velocity, maximum height reached, range and time of flight of a projectile for a range of situations by using simulations, data loggers and computer analysis (page 35)
• identify data sources, gather, analyse and present information on the contribution of one of the following to the development of space exploration: Tsiolkovsky, Oberth, Goddard, Esnault-Pelterie, O’Neill or von Braun (page 32)
• solve problems and analyse information to calculate the centripetal force acting on a satellite undergoing uniform circular motion about the Earth using (pages 40–1, 56, 58–9)
  $F = \dfrac{mv^2}{r}$
• solve problems and analyse information using: (pages 42–4, 56–7)
  $\dfrac{r^3}{T^2} = \dfrac{GM}{4\pi^2}$

3. The Solar System is held together by gravity


Students learn to:
• describe a gravitational field in the region surrounding a massive object in terms of its effects on other masses in it (pages 65–6)
• define Newton’s Law of Universal Gravitation (pages 61–2)
  $F = G \dfrac{m_1 m_2}{d^2}$
• discuss the importance of Newton’s Law of Universal Gravitation in understanding and calculating the motion of satellites (pages 62–5)
• identify that a slingshot effect can be provided by planets for space probes (pages 66–9)

Students:
• present information and use available evidence to discuss the factors affecting the strength of the gravitational force (pages 61–4, 70)
• solve problems and analyse information using (pages 61–2, 70)
  $F = G \dfrac{m_1 m_2}{d^2}$

4. Current and emerging understanding about time and space has been dependent upon earlier models of the transmission of light
Students learn to:
• outline the features of the aether model for the transmission of light (page 72)
• describe and evaluate the Michelson-Morley attempt to measure the relative velocity of the Earth through the aether (pages 72–4)
• discuss the role of the Michelson-Morley experiments in making determinations about competing theories (page 74)
• outline the nature of inertial frames of reference (pages 74–5)
• discuss the principle of relativity (pages 74–5)
• describe the significance of Einstein’s assumption of the constancy of the speed of light (pages 75–6)
• identify that if c is constant then space and time become relative (page 76)
• discuss the concept that length standards are defined in terms of time in contrast to the original metre standard (page 77)
• explain qualitatively and quantitatively the consequence of special relativity in relation to:
  – the relativity of simultaneity (pages 77–8)
  – the equivalence between mass and energy (pages 88–9)
  – length contraction (pages 81–4)
  – time dilation (pages 78–9)
  – mass dilation (pages 85–8)
• discuss the implications of mass increase, time dilation and length contraction for space travel (pages 89–92)

Students:
• gather and process information to interpret the results of the Michelson-Morley experiment (pages 96–7)
• perform an investigation to help distinguish between non-inertial and inertial frames of reference (pages 97–8)
• analyse and interpret some of Einstein’s thought experiments involving mirrors and trains and discuss the relationship between thought and reality (pages 75–6, 77–8)
• analyse information to discuss the relationship between theory and the evidence supporting it, using Einstein’s predictions based on relativity that were made many years before evidence was available to support it (pages 80–1)
• solve problems and analyse information using:
  $E = mc^2$ (pages 89, 95)
  $l_v = l_0 \sqrt{1 - \dfrac{v^2}{c^2}}$ (pages 84, 94–5)
  $t_v = \dfrac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}}$ (pages 81, 94–5)
  $m_v = \dfrac{m_0}{\sqrt{1 - \dfrac{v^2}{c^2}}}$ (pages 88, 95)
Core module: MOTORS AND GENERATORS (chapters 6–9, pages 101–72)
1. Motors use the effect of forces on current-carrying conductors in magnetic fields
Students learn to:
• discuss the effect on the magnitude of the force on a current-carrying conductor of variations in: (pages 104–5)
  – the strength of the magnetic field in which it is located
  – the magnitude of the current in the conductor
  – the length of the conductor in the external magnetic field
  – the angle between the direction of the external magnetic field and the direction of the length of the conductor
• describe qualitatively and quantitatively the force between long parallel current-carrying conductors: (pages 105–7)
  $\dfrac{F}{l} = k \dfrac{I_1 I_2}{d}$
• define torque as the turning moment of a force using: $\tau = Fd$ (pages 107–8)
• identify that the motor effect is due to the force acting on a current-carrying conductor in a magnetic field (pages 104–5)
• describe the forces experienced by a current-carrying loop in a magnetic field and describe the net result of the forces (pages 102–3)
• describe the main features of a DC electric motor and the role of each feature (pages 109–11)
• identify that the required magnetic fields in DC motors can be produced either by current-carrying coils or permanent magnets (pages 109–11, 112)

Students:
• solve problems using: (pages 107, 117–19)
  $\dfrac{F}{l} = k \dfrac{I_1 I_2}{d}$
• perform a first-hand investigation to demonstrate the motor effect (page 120)
• solve problems and analyse information about the force on current-carrying conductors in magnetic fields using (pages 105, 117–19)
  $F = BIl \sin\theta$
• solve problems and analyse information about simple motors using: (pages 114, 118–19, 121)
  $\tau = nBIA \cos\theta$
• identify data sources, gather and process information to qualitatively describe the application of the motor effect in: (pages 114–15)
  – the galvanometer
  – the loudspeaker

2. The relative motion between a conductor and magnetic field is used to generate an electrical voltage
Students learn to:
• outline Michael Faraday’s discovery of the generation of an electric current by a moving magnet (pages 123–6)
• define magnetic field strength B as magnetic flux density (page 126)
• describe the concept of magnetic flux in terms of magnetic flux density and surface area (pages 126–7)
• describe generated potential difference as the rate of change of magnetic flux through a circuit (pages 127–8)
• account for Lenz’s Law in terms of conservation of energy and relate it to the production of back emf in motors (pages 128–30)
• explain that, in electric motors, back emf opposes the supply emf (pages 129–30)
• explain the production of eddy currents in terms of Lenz’s Law (pages 131–2)

Students:
• perform an investigation to model the generation of an electric current by moving a magnet in a coil or a coil near a magnet (pages 137–8)
• plan, choose equipment or resources for, and perform a first-hand investigation to predict and verify the effect on a generated electric current when: (pages 137–8)
  – the distance between the coil and the magnet is varied
  – the strength of the magnet is varied
  – the relative motion between the coil and the magnet is varied
• gather, analyse and present information to explain how induction is used in cooktops in electric ranges (page 133)
• gather secondary information to identify how eddy currents have been utilised in electromagnetic braking (page 132)

3. Generators are used to provide large-scale power production


Students learn to:
• describe the main components of a generator (pages 140–1)
• compare the structure and function of a generator to an electric motor (pages 141, 109, 164–5)
• describe the differences between AC and DC generators (pages 142–5)
• discuss the energy losses that occur as energy is fed through transmission lines from the generator to the consumer (pages 151–4)
• assess the effects of the development of AC generators on society and the environment (pages 147–8, 156)

Students:
• plan, choose equipment or resources for, and perform a first-hand investigation to demonstrate the production of an alternating current (pages 160–1)
• gather secondary information to discuss advantages and disadvantages of AC and DC generators and relate these to their use (page 160)
• analyse secondary information on the competition between Westinghouse and Edison to supply electricity to cities (pages 147–8)
• gather and analyse information to identify how transmission lines are: (pages 154–5)
  – insulated from supporting structures
  – protected from lightning strikes

4. Transformers allow generated power to be either increased or decreased before it is used


Students learn to:
• describe the purpose of transformers in electrical circuits (pages 148–9)
• compare step-up and step-down transformers (page 149)
• identify the relationship between the ratio of the number of turns in the primary and secondary coils and the ratio of primary to secondary voltage (pages 149–50)
• explain why voltage transformations are related to conservation of energy (pages 149–50)
• explain the role of transformers in electricity sub-stations (page 154)
• discuss why some electrical appliances in the home that are connected to the mains domestic power supply use a transformer (pages 148–9, 153)
• discuss the impact of the development of transformers on society (pages 152–3, 156)

Students:
• perform an investigation to model the structure of a transformer to demonstrate how secondary voltage is produced (pages 160–1)
• solve problems and analyse information about transformers using: (pages 150–1)
  $\dfrac{V_p}{V_s} = \dfrac{n_p}{n_s}$
• gather, analyse and use available evidence to discuss how difficulties of heating caused by eddy currents in transformers may be overcome (page 151)
• gather and analyse secondary information to discuss the need for transformers in the transfer of electrical energy from a power station to its point of use (pages 152–4, 162)

5. Motors are used in industries and the home usually to convert electrical energy into more useful forms of energy
Students learn to:
• describe the main features of an AC electric motor (pages 164–9)

Students:
• perform an investigation to demonstrate the principle of an AC induction motor (page 172)
• gather, process and analyse information to identify some of the energy transfers and transformations involving the conversion of electrical energy into more useful forms in the home and industry (pages 169–70)

Core module: FROM IDEAS TO IMPLEMENTATION (chapters 10–13, pages 173–254)


1. Increased understanding of cathode rays led to the development of television
Students learn to:
• explain why the apparent inconsistent behaviour of cathode rays caused debate as to whether they were charged particles or electromagnetic waves (pages 184–5)
• explain that cathode ray tubes allowed the manipulation of a stream of charged particles (pages 175–6)
• identify that moving charged particles in a magnetic field experience a force (page 182)
• identify that charged plates produce an electric field (pages 177–9)
• describe quantitatively the force acting on a charge moving through a magnetic field $F = qvB \sin\theta$ (page 182)
• discuss qualitatively the electric field strength due to a point charge, positive and negative charges and oppositely charged parallel plates (pages 177–9)
• describe quantitatively the electric field due to oppositely charged parallel plates (pages 177–9)
• outline Thomson’s experiment to measure the charge : mass ratio of an electron (page 183)
• outline the role of: (pages 186–8)
  – electrodes in the electron gun
  – the deflection plates or coils
  – the fluorescent screen in the cathode ray tube of conventional TV displays and oscilloscopes

Students:
• perform an investigation and gather first-hand information to observe the occurrence of different striation patterns for different pressures in discharge tubes (page 191)
• perform an investigation to demonstrate and identify properties of cathode rays using discharge tubes: (page 192)
  – containing a Maltese cross
  – containing electric plates
  – with a fluorescent display screen
  – containing a glass wheel
  – analyse the information gathered to determine the sign of the charge on cathode rays
• solve problems and analyse information using: (pages 178, 181, 182, 189–90)
  $F = qvB \sin\theta$
  $F = qE$
  and
  $E = \dfrac{V}{d}$

2. The reconceptualisation of the model of light led to an understanding of the photoelectric effect and black body radiation
Students learn to:
• describe Hertz’s observation of the effect of a radio wave on a receiver and the photoelectric effect he produced but failed to investigate (pages 196–8, 202–3)
• outline qualitatively Hertz’s experiments in measuring the speed of radio waves and how they relate to light waves (pages 196–8)
• identify Planck’s hypothesis that radiation emitted and absorbed by the walls of a black body cavity is quantised (pages 199–201)
• identify Einstein’s contribution to quantum theory and its relation to black body radiation (page 205)
• explain the particle model of light in terms of photons with particular energy and frequency (page 204)
• identify the relationships between photon energy, frequency, speed of light and wavelength: $E = hf$ and $c = f\lambda$ (pages 201–5)

Students:
• perform an investigation to demonstrate the production and reception of radio waves (page 211)
• identify data sources, gather, process and analyse information and use available evidence to assess Einstein’s contribution to quantum theory and its relation to black body radiation (pages 205–6)
• identify data sources and gather, process and present information to summarise the use of the photoelectric effect in: (pages 207–8)
  – solar cells
  – photocells
• solve problems and analyse information using: $E = hf$ and $c = f\lambda$ (pages 201, 209–10)
• process information to discuss Einstein and Planck’s differing views about whether science research is removed from social and political forces (page 206)

3. Limitations of past technologies and increased research into the structure of the atom resulted in the invention of transistors
Students learn to:
• identify that some electrons in solids are shared between atoms and move freely (pages 213–14)
• describe the difference between conductors, insulators and semiconductors in terms of band structures and relative electrical resistance (pages 213–20)
• identify absences of electrons in a nearly full band as holes, and recognise that both electrons and holes help to carry current (pages 216–17, 219–20)
• compare qualitatively the relative number of free electrons that can drift from atom to atom in conductors, semiconductors and insulators (pages 213–15)
• identify that the use of germanium in early transistors is related to lack of ability to produce other materials of suitable purity (pages 218–19)
• describe how ‘doping’ a semiconductor can change its electrical properties (pages 217, 219–20)
• identify differences in p- and n-type semiconductors in terms of the relative number of negative charge carriers and positive holes (pages 219–20)
• describe differences between solid state and thermionic devices and discuss why solid state devices replaced thermionic devices (pages 220–5)

Students:
• perform an investigation to model the behaviour of semiconductors, including the creation of a hole or positive charge on the atom that has lost the electron and the movement of electrons and holes in opposite directions when an electric field is applied across the semiconductor (pages 230, 231)
• gather, process and present secondary information to discuss how shortcomings in available communication technology led to an increased knowledge of the properties of materials with particular reference to the invention of the transistor (pages 225–6, 231)
• identify data sources, gather, process, analyse information and use available evidence to assess the impact of the invention of transistors on society with particular reference to their use in microchips and microprocessors (pages 225–8)

4. Investigations into the electrical properties of particular metals at different temperatures led to the identification of superconductivity and the
exploration of possible applications
Students learn to:
• outline the methods used by the Braggs to determine crystal structure (pages 236–9)
• identify that metals possess a crystal lattice structure (page 239)
• describe conduction in metals as a free movement of electrons unimpeded by the lattice (page 240)
• identify that resistance in metals is increased by the presence of impurities and scattering of electrons by lattice vibrations (page 240)
• describe the occurrence in superconductors below their critical temperature of a population of electron pairs unaffected by electrical resistance (pages 243–6)
• discuss the BCS theory (pages 243–4)
• discuss the advantages of using superconductors and identify limitations to their use (pages 240–2, 246–50)

Students:
• process information to identify some of the metals, metal alloys and compounds that have been identified as exhibiting the property of superconductivity and their critical temperatures (pages 241–2)
• perform an investigation to demonstrate magnetic levitation (pages 253–4)
• analyse information to explain why a magnet is able to hover above a superconducting material that has reached the temperature at which it is superconducting (pages 240–1, 245)
• gather and process information to describe how superconductors and the effects of magnetic fields have been applied to develop a maglev train (pages 248–9)
• process information to discuss possible applications of superconductivity and the effects of those applications on computers, generators and motors and transmission of electricity through power grids (pages 246–8)

Option module: ASTROPHYSICS (chapters 14–17, pages 255–338)
1. Our understanding of celestial objects depends upon observations made from Earth or space near the Earth
Students learn to: page
• discuss Galileo’s use of the telescope to identify features of the Moon 257–8
• discuss why some wavebands can be more easily detected from space 258–60
• define the terms ‘resolution’ and ‘sensitivity’ of telescopes 262–4
• discuss the problems associated with ground-based astronomy in terms of resolution and absorption of radiation and atmospheric distortion 265
• outline methods by which the resolution and/or sensitivity of ground-based systems can be improved, including: 265–8
– adaptive optics
– interferometry
– active optics

Students: page
• identify data sources, plan, choose equipment or resources for, and perform an investigation to demonstrate why it is desirable for telescopes to have a large diameter objective lens or mirror in terms of both sensitivity and resolution 272

2. Careful measurement of a celestial object’s position, in the sky, (astrometry) may be used to determine its distance
Students learn to: page
• define the terms parallax, parsec, light-year 275–6
• explain how trigonometric parallax can be used to determine the distance to stars 275–7
• discuss the limitations of trigonometric parallax measurements 277–8

Students: page
• solve problems and analyse information to calculate the distance to a star given its trigonometric parallax using: $d = \frac{1}{p}$ 277, 299–300
• gather and process information to determine the relative limits to trigonometric parallax distance determinations using recent ground-based and space-based telescopes 302–3

3. Spectroscopy is a vital tool for astronomers and provides a wealth of information


Students learn to: page
• account for the production of emission and absorption spectra and compare these with a continuous black body spectrum 280–2
• describe the technology needed to measure astronomical spectra 279–80
• identify the general types of spectra produced by stars, emission nebulae, galaxies and quasars 285
• describe the key features of stellar spectra and describe how these are used to classify stars 286–7
• describe how spectra can provide information on surface temperature, rotational and translational velocity, density and chemical composition of stars 288–9

Students: page
• perform a first-hand investigation to examine a variety of spectra produced by discharge tubes, reflected sunlight, or incandescent filaments 303
• analyse information to predict the surface temperature of a star from its intensity/wavelength graph 300, refer 281–2

4. Photometric measurements can be used for determining distance and comparing objects
Students learn to: page
• define absolute and apparent magnitude 291
• explain how the concept of magnitude can be used to determine the distance to a celestial object 291–2
• outline spectroscopic parallax 293–5
• explain how two-colour values (i.e. colour index, B-V) are obtained and why they are useful 293–7
• describe the advantages of photoelectric technologies over photographic methods for photometry 298

Students: page
• solve problems and analyse information using: $M = m - 5\log\left(\frac{d}{10}\right)$ and $\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}$ to calculate the absolute or apparent magnitude of stars using data and a reference star 291, 292, 293, 294–5, 300–1
• perform an investigation to demonstrate the use of filters for photometric measurements 303–4
• identify data sources, gather, process and present information to assess the impact of improvements in measurement technologies on our understanding of celestial objects 289, 298

5. The study of binary and variable stars reveals vital information about stars
Students learn to: page
• describe binary stars in terms of the means of their detection: visual, eclipsing, spectroscopic and astrometric 306–10
• explain the importance of binary stars in determining stellar masses 306–8
• classify variable stars as either intrinsic or extrinsic and periodic or non-periodic 312–14
• explain the importance of the period-luminosity relationship for determining the distance of cepheids 315

Students: page
• perform an investigation to model the light curves of eclipsing binaries using computer simulation 318–19
• solve problems and analyse information by applying: $m_1 + m_2 = \frac{4\pi^2 r^3}{GT^2}$ 308, 318

6. Stars evolve and eventually ‘die’


Students learn to: page
• describe the processes involved in stellar formation 321–4
• outline the key stages in a star’s life in terms of the physical processes involved 321–34, 335
• describe the types of nuclear reactions involved in Main Sequence and post-Main Sequence stars 324–8, 331
• discuss the synthesis of elements in stars by fusion 332
• explain how the age of a globular cluster can be determined from its zero-age main sequence plot for an H–R diagram 329–30
• explain the concept of star death in relation to: 332–3
– planetary nebula
– supernovae
– white dwarfs
– neutron stars/pulsars
– black holes

Students: page
• present information by plotting Hertzsprung–Russell diagrams for: nearby or brightest stars; stars in a young open cluster; stars in a globular cluster 330
• analyse information from an H–R diagram and use available evidence to determine the characteristics of a star and its evolutionary stage 335, 338
• present information by plotting on an H–R diagram the pathways of stars of 1, 5 and 10 solar masses during their life cycle 335

Option module: MEDICAL PHYSICS (chapters 18–21, pages 339–416)
1. The properties of ultrasound waves can be used as diagnostic tools
Students learn to: page
• identify the differences between ultrasound and sound in normal hearing range 341–2
• describe the piezoelectric effect and the effect of using an alternating potential difference with a piezoelectric crystal 347
• define acoustic impedance: $Z = \rho v$ and identify that different materials have different acoustic impedances 344–5
• describe how the principles of acoustic impedance and reflection and refraction are applied to ultrasound 344–6
• define the ratio of reflected to initial intensity as: $\frac{I_r}{I_0} = \frac{[Z_2 - Z_1]^2}{[Z_2 + Z_1]^2}$ 345–6
• identify that the greater the difference in acoustic impedance between two materials, the greater is the reflected proportion of the incident pulse 345–6
• describe the situations in which A scans, B scans, and phase and sector scans would be used and the reasons for the use of each 348–51
• describe the Doppler effect in sound waves and how it is used in ultrasonics to obtain flow characteristics of blood moving through the heart 352–5
• outline some cardiac problems that can be detected through the use of the Doppler effect 354–5

Students: page
• solve problems and analyse information to calculate the acoustic impedance of a range of materials, including bone, muscle, soft tissue, fat, blood and air and explain the types of tissues that ultrasound can be used to examine 345, 358
• gather secondary information to observe at least two ultrasound images of body organs 344, 350, 355–6
• identify data sources and gather information to observe the flow of blood through the heart from a Doppler ultrasound video image 355, 360
• identify data sources, gather, process and analyse information to describe how ultrasound is used to measure bone density 351–2
• solve problems and analyse information using: $Z = \rho v$ and $\frac{I_r}{I_0} = \frac{[Z_2 - Z_1]^2}{[Z_2 + Z_1]^2}$ 344–6, 358–9

2. The physical properties of electromagnetic radiation can be used as diagnostic tools


Students learn to: page
• describe how X-rays are currently produced 362–3
• compare the differences between ‘soft’ and ‘hard’ X-rays 366
• explain how a computed axial tomography (CT) scan is produced 368–70
• describe circumstances where a CT scan would be a superior diagnostic tool compared to either X-rays or ultrasound 371–7
• explain how an endoscope works in relation to total internal reflection 373–7
• discuss differences between the role of coherent and incoherent bundles of fibres in an endoscope 375
• explain how an endoscope is used in: 376–7
– observing internal organs
– obtaining tissue samples of internal organs for further testing

Students: page
• gather information to observe at least one image of a fracture on an X-ray film and X-ray images of other body parts 361, 367, 368
• gather secondary information to observe a CT scan image and compare the information provided by CT scans to that provided by an X-ray image for the same body part 371–2, 377
• perform a first-hand investigation to demonstrate the transfer of light by optical fibres 380
• gather secondary information to observe internal organs from images produced by an endoscope 377, 379

3. Radioactivity can be used as a diagnostic tool


Students learn to: page
• outline properties of radioactive isotopes and their half lives that are used to obtain scans of organs 382–4
• describe how radioactive isotopes may be metabolised by the body to bind or accumulate in the target organ 384, 385
• identify that, during decay of specific radioactive nuclei, positrons are given off 392–3
• discuss the interaction of electrons and positrons resulting in the production of gamma rays 392
• describe how the positron emission tomography (PET) technique is used for diagnosis 393–5

Students: page
• perform an investigation to compare an image of bone scan with an X-ray image 390, 397
• gather and process secondary information to compare a scanned image of at least one healthy body part or organ with a scanned image of its diseased counterpart 388, 390, 391

4. The magnetic field produced by nuclear particles can be used as a diagnostic tool
Students learn to: page
• identify that the nuclei of certain atoms and molecules behave as small magnets 399–400
• identify that protons and neutrons in the nucleus have properties of spin and describe how net spin is obtained 400–1
• explain that the behaviour of nuclei with a net spin, particularly hydrogen, is related to the magnetic field they produce 400–1
• describe the changes that occur in the orientation of the magnetic axis of nuclei before and after the application of a strong magnetic field 403–4
• define precessing and relate the frequency of the precessing to the composition of the nuclei and the strength of the applied external magnetic field 404–5
• discuss the effect of subjecting precessing nuclei to pulses of radio waves 405–8
• explain that the amplitude of the signal given out when precessing nuclei relax is related to the number of nuclei present 408–9
• explain that large differences would occur in the relaxation time between tissue containing hydrogen-bound water molecules and tissues containing other molecules 409–10

Students: page
• perform an investigation to observe images from magnetic resonance image (MRI) scans, including a comparison of healthy and damaged tissue 408, 410, 411, 414
• identify data sources, gather, process and present information using available evidence to explain why MRI scans can be used to: 410–13
– detect cancerous tissues
– identify areas of high blood flow
– distinguish between grey and white matter in the brain
• gather and process secondary information to identify the function of the electromagnet, radio frequency oscillator, radio receiver and computer in the MRI equipment 402
• identify data sources, gather and process information to compare the advantages and disadvantages of X-rays, CT scans, PET scans and MRI scans 412–13, 415
• gather, analyse information and use available evidence to assess the impact of medical applications of physics on society 415

Option module: From QUANTA TO QUARKS (chapters 22–26, pages 417–520)
1. Problems with the Rutherford model of the atom led to the search for a model that would better explain the observed phenomena
Students learn to: page
• discuss the structure of the Rutherford model of the atom, the existence of the nucleus and electron orbits 421–3
• analyse the significance of the hydrogen spectrum in the development of Bohr’s model of the atom 424
• define Bohr’s postulates 427–8
• discuss Planck’s contribution to the concept of quantised energy 423
• describe how Bohr’s postulates led to the development of a mathematical model to account for the existence of the hydrogen spectrum: $\frac{1}{\lambda} = R\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$ 429–33
• discuss the limitations of the Bohr model of the hydrogen atom 434

Students: page
• perform a first-hand investigation to observe the visible components of the hydrogen spectrum 437–9
• process and present diagrammatic information to illustrate Bohr’s explanation of the Balmer series 433–4
• solve problems and analyse information using: $\frac{1}{\lambda} = R\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$ 425, 435
• analyse secondary information to identify the difficulties with the Rutherford–Bohr model, including its inability to completely explain: 434
– the spectra of larger atoms
– the relative intensity of spectral lines
– the existence of hyperfine spectral lines
– the Zeeman effect

2. The limitations of classical physics gave birth to quantum physics


Students learn to: page
• describe the impact of de Broglie’s proposal that any kind of particle has both wave and particle properties 444–6
• define diffraction and identify that interference occurs between waves that have been diffracted 441–3
• describe the confirmation of de Broglie’s proposal by Davisson and Germer 446
• explain the stability of the electron orbits in the Bohr atom using de Broglie’s hypothesis 446–7

Students: page
• solve problems and analyse information using: $\lambda = \frac{h}{mv}$ 445, 452
• gather, process, analyse and present information and use available evidence to assess the contributions made by Heisenberg and Pauli to the development of atomic theory 448–51

3. The work of Chadwick and Fermi in producing artificial transmutations led to practical applications of nuclear physics
Students learn to: page
• define the components of the nucleus (protons and neutrons) as nucleons and contrast their properties 457
• discuss the importance of conservation laws to Chadwick’s discovery of the neutron 460
• define the term ‘transmutation’ 456
• describe nuclear transmutations due to natural radioactivity 456
• describe Fermi’s initial experimental observation of nuclear fission 476–8
• discuss Pauli’s suggestion of the existence of the neutrino and relate it to the need to account for the energy distribution of electrons emitted in β-decay 461–5
• evaluate the relative contributions of electrostatic and gravitational forces between nucleons 467
• account for the need for the strong nuclear force and describe its properties 467–8
• explain the concept of a mass defect using Einstein’s equivalence between mass and energy 468–70
• describe Fermi’s demonstration of a controlled nuclear chain reaction in 1942 481–2
• compare requirements for controlled and uncontrolled nuclear chain reactions 482–7

Students: page
• perform a first-hand investigation or gather secondary information to observe radiation emitted from a nucleus using a Wilson cloud chamber or similar detection device 518–20
• solve problems and analyse information to calculate the mass defect and energy released in natural transmutation and fission reactions 470–1, 473, 494

4. An understanding of the nucleus has led to large science projects and many applications
Students learn to: page
• explain the basic principles of a fission reactor 484–8
• describe some medical and industrial applications of radioisotopes 489–91
• describe how neutron scattering is used as a probe by referring to the properties of neutrons 492
• identify ways by which physicists continue to develop their understanding of matter, using accelerators as a probe to investigate the structure of matter 496–502
• discuss the key features and components of the standard model of matter, including quarks and leptons 503–13

Students: page
• gather, process and analyse information to assess the significance of the Manhattan Project to society 480–4
• identify data sources, and gather, process and analyse information to describe the use of: 489–91
– a named isotope in medicine
– a named isotope in agriculture
– a named isotope in engineering

ACKNOWLEDGEMENTS
The authors would like to thank the following people for their support during the writing of this book: Michael Andriessen gives special thanks to Christine, Sam and Luke for their understanding and patience; Peter Pentland wishes especially to thank his wife Helen Kennedy for her continuing support; Bruce McKay is indebted to close friend and former colleague Barry Mott for his valuable advice; Richard Gaut thanks Stephen and Greta for their helpful suggestions and feedback; Jill Tacon wishes to thank Dr Manjula Sharma and Dr Joe Khachan from the University of Sydney for their valuable advice and Lee Collins from Westmead Hospital for his expert assistance. Yoka McCallum’s contribution and encouragement is also appreciated, and thanks must go to Dean Bunn for his permission to adapt some practical activities and other material from Physics for a Modern World.

The authors and publisher are grateful to the following individuals and organisations for their permission to reproduce photographs and other copyright material.

Images
• © Alan Bean/Novaspace Galleries: 2.2
• AIP Emilio Segrè Visual Archives, W.F. Meggers Collection: 22.8
• © ANSTO: 20.3, 20.4/Gentech® generator courtesy of ANSTO
• Otto Rogge/ANTPhoto.com.au: 11.21
• Bhathal, R., Astronomy for the HSC, Kangaroo Press 1993, p. 47. Reproduced by permission of Ragbir Bhathal: 17.12(a–c)
• © Black & Decker: 9.1
• © Boeing: 3.9
• The Royal Institution, London, UK/Bridgeman Art Library: 7.1
• Courtesy of Brookhaven National Laboratory: 26.8(a, b)
• Cavendish Laboratory, Cambridge University: 10.1
• © University of Queensland, Centre for Magnetic Resonance: 21.15(a–c)
• Chicago Historical Society, Birth of the Atomic Age, Painting by Gary Sheahan, 1957: 25.5
• © Corbis Australia/Australian Picture Library: 13.1; 18.16/Belt/Corbis/Annie Griffiths; 8.17/Corbis; 8.15, 8.16, 11.2, 11.5, 11.12, 23.11, 23.12, 23.13, 23.14, 23.15/Corbis/Bettman; 21.14(d)/Corbis/Howard Sochurek; 7.4/Corbis/Hulton-Deutsch Collection
• © CERN: 26.2
• © CFHT, 1996 Used with permission: 14.17
• Adapted from Physics in Medical Diagnosis, Chapman & Hall 1997, p. 213 fig. 5.23. Reproduced with permission of the author Dr Trevor A Delchar: 18.14(a)
• © Digital Stock/Corbis Corporation: page 99; 2.4, 13.18
• © Digital Vision: 2.1
• © Dorling Kindersley: 6.1
• Image courtesy of Elaine Collin: 19.19
• © Corbis/Greg Smith: 11.22
• Image courtesy of the European Space Agency © ESA: 14.13, 15.6
• Courtesy of EMI Archives: 19.10
• Fermilab National Accelerator Laboratory: 26.1, 26.7, 26.10
• Dave Finley, National Radio Astronomy Observatory, courtesy Associated Universities, Inc., and the National Science Foundation: 14.14
• © Fundamental Photographs, New York: 15.8(a); 15.14, 22.9, 23.6/Wabash Instrument Corp.; 23.2(a)/Ken Kay
• © Getty Images/Hulton Archive: 14.2, 22.2, 23.8
• David Grabham: 19.7, 21.14(c), 21.18(a, b)
• Medical Physics by J Pope, p. 81, fig 4.13 Reprinted by permission of Harcourt Education Ltd: 21.13
• Reprinted by permission of HarperCollins Publishers Ltd. © Illingworth 1994: 15.21, 16.10
• Terry Herter: 16.15, 16.16
• Adapted and reproduced from Martin Hollins, Medical Physics, 2nd edition, Nelson Thornes 2001, pp. 114, 122, 127, 100, 101, 186, 197: 18.6, 18.17, 19.6, 19.16, 19.17, 20.7(b), 21.6
• ‘The Fermion Particles’ and ‘The Boson Force Carriers’, From Quarks to the Cosmos: Tools of Discovery by Leon M Lederman and David N Schramm. © 1989, 1995 Scientific American Library. Reprinted by permission of Henry Holt and Company, LLC: table 26.4
• Courtesy of Hologic: 18.11
• © www.imageaddict.com.au: 14.8, 14.10
• ImageState: 10.5
• Kamioka Observatory, ICRR (Institute for Cosmic Ray Research), The University of Tokyo: 24.1
• IEE Review March 2001, Vol 47, 102, p 42. Reproduced with permission of IET. www.theiet.org: 13.25
• Jared Schneidman Design: 12.24
• JAS Photography/John Sowden: 2.3
• Kenneth Krane, Modern Physics 2nd edition, 1996, p. 24. Used by permission of John Wiley & Sons, Inc.: 5.4
• Resnick, R, Introduction to Special Relativity, figures 1.4 and 1.6, John Wiley & Sons Inc., © 1968: 5.5
• Halliday et al., Fundamentals of Physics Extended, 5th edition, John Wiley & Sons, Inc. 1997, figures 37.27 (b, c and d), 43.3, 37.22, 43.6. Used by permission of John Wiley & Sons, Inc.: 13.9, 13.12, 22.5, 22.12, 24.15
• Webster, J G (ed), Medical Instrumentation, 3rd edition, 1998, p. 559, adapted and used by permission of John Wiley & Sons, Inc.: 20.7(a)
• Adapted from ‘The Particle Adventure’, produced by the Particle Data Group, Lawrence Berkeley National Laboratory: 26.3, 26.4
• Bruce McKay: 13.4, 22.1, 23.2(b, c), 26.11
• David Malin Images: 15.1
• Mary Evans Picture Library: 14.3
• Cambridge Encyclopedia of Astronomy, ed. by Dr Simon Mitton, Jonathon Cape, 1977, © Cambridge University. Reproduced with permission of Simon Mitton: 15.8(b), 17.13
• Barbara Mochejska (Copernicus Astronomical Center), Andrew Szentgyorgyi (Harvard-Smithsonian Center for Astrophysics), F. L. Whipple Observatory: 17.11(b)
• NASA: 2.28, 3.1/MSFC, 3.6, 3.8, 4.8/NSSDC, 14.1/STSCI, 14.7, 17.3/NOAO, 17.16/The Hubble Heritage Team, 17.18(a)/ESA/ASU/J Hester and A Loll, 17.18(b)/CXC/ASU/J Hester et al.
• © Newspix: 20.13/Jody D’Arcy, 25.2
• © Moriel NessAiver: 21.2
• © Department of Nuclear Medicine, The Queen Elizabeth Hospital, South Australia: 20.7(d)
• De Jong et al., Heinmann Physics Two, p. 237. Reproduced with permission from Pearson Education Australia: 13.2
• Adapted from David Heffernan, Physics Contexts 2, Pearson Education Australia 2002, p. 315. Reproduced with permission of Pearson Education Australia: 18.8
• Peter Pentland: 9.9
• © Philips Medical Systems: 18.3(b), 19.11
• © Photodisc, Inc.: pages 1, 173, 255, 339, 417; 1.1, 4.1, 8.1, 8.24(a, b), 10.22, 11.19, 17.1, 18.1, 18.3(c), 19.9(a), 20.15(a), 21.17, 25.1
• photolibrary.com: 9.4/Tom Marexchal, 13.26(b)/Photo Researchers Inc., 17.11(a)/John Chumack; 18.15(a, b), 19.14, 20.9, 21.1, 21.3, 21.14(a, b), 23.1/Phototake; 19.8/Phototake Science; 19.9(b), 20.14/Phototake Science/Science Ltd; 20.1/Phototake Science/ISM
• photolibrary.com/Science Photo Library: 4.2, 5.1, 7.2, 14.5, 15.22, 19.1, 22.7; 3.11/NASA; 10.26(b)/Peter Apraham; 11.1, 14.6/David Nunuk; 12.1/John Walsh; 12.23/Martin Dohrn; 12.25(b)/Andrew Syred; 18.7/P Saada/Eurelios; 20.5/Philippe Plailly; 20.6(a)/Dr P Marazzi; 20.6(b, c)/CNRI; 20.11/Zephyr; 20.15(b)/Hank Morgan; 21.16/Dept of Cognitive Neurology
• © The Picture Source/Terry Oakley: 6.24, 12.12(b), 12.22, 12.25(a)
• © Greg Pitt: 20.7(c)
• Science Museum/Science & Society Picture Library: 24.8
• Reprinted from Medical Physics, 1978, John Wiley & Sons Inc. with permission from the authors, John R Cameron and James G Skofronick © 1992: 19.5(a, b)
• Peter Storer: 8.22
• Sudbury Neutrino Observatory/R Chambers: 24.13
• © SP-AusNet: 8.21
• Courtesy of Transrapid International-USA, Inc.: 13.26(a)
• Reproduced with kind permission of TransGrid: 8.20
• Photo © UC Regents/Lick Observatory: 16.1
• © Richard Wainscoat: 14.19
• Jack Washburn: 13.11(b)
• Westmead Hospital Medical Physics Department: 18.14(b), 20.10(a, b), 20.12(a, b), 20.16
• Justine Wong: 19.13(a–d)
• Courtesy of Xerox Corporation: 10.9

Text
• Extracts from Physics Stage 6 Syllabus © Board of Studies NSW, 2002 (pages x–xv)
• Data and formulae sheets from Physics Higher School Certificate Examination © Board of Studies NSW (pages 526–7)
• Higher School Certificate Assessment Support Document © Board of Studies NSW, 1999 (key words, pages 529–30)

Every effort has been made to trace the ownership of copyright material. Information that will enable the publisher to rectify any error or omission in subsequent editions will be welcome. In such cases, please contact the Permissions Section of John Wiley & Sons Australia, Ltd, who will arrange for the payment of the usual fee.

HSC CORE MODULE
Chapter 1
Earth’s gravitational field

Chapter 2
Launching into space

Chapter 3
Orbiting and re-entry

Chapter 4
Gravity in the solar system

Chapter 5
Space and time

SPACE
CHAPTER 1
EARTH’S GRAVITATIONAL FIELD

Remember
Before beginning this chapter, you should be able to:
• recall and apply Newton’s Second Law of Motion:
F = ma.

Key content
At the end of this chapter you should be able to:
• make a comparison between the acceleration due to gravity at
various places over the Earth’s surface as well as at other
locations throughout the solar system
• define weight as the force on an object due to a gravitational
field
• explain that work done to raise or lower a mass in a
gravitational field is directly related to a change in the
gravitational potential energy of the mass
• calculate the weight of a body on Earth, above the Earth or on
other planets
• define gravitational potential energy as the work done in
moving an object from a very large distance away to a point in
a gravitational field.

Figure 1.1 The Earth rising as seen from the Moon


Gravity is a force of attraction that exists between any two masses. Usually
this is a very small, if not negligible, force. However, when one or both of
the masses is as large as a planet, then the force becomes very significant
indeed. The force of attraction between the Earth and our own bodies is
the force we call our weight. This force exists wherever we are on or near the
Earth’s surface (although, as we shall see, with some variation). We can say
that a gravitational field exists around the Earth and we live within that field.

1.1 THE EARTH’S GRAVITY


The Earth is surrounded by a gravitational field. This type of field,
discussed in more general terms in chapter 4, is a vector field within
which a mass will experience a force.
(Other vector fields include electric
and magnetic fields.) The gravitational
field around the Earth can be drawn as
shown in figure 1.2. Note that the
direction of a field line at any point is
the direction of the force experienced
by a mass placed at that point.

Figure 1.2 The gravitational field around the Earth

The field vector: g


A field vector is a single vector that describes the strength and direction of a uniform vector field. For a gravitational field the field vector is g, which is defined in this way:
$$g = \frac{F}{m}$$
where
F = force exerted (N) on mass m
m = mass (kg) in the field
g = the field vector (N kg⁻¹).
Vector symbols are indicated here in bold italics. The direction of the
vector g is the same as the direction of the associated force.
Note that a net force applied to a mass will cause it to accelerate.
Newton’s Second Law describes this relationship:

$$a = \frac{F}{m}$$
where
a = acceleration (m s⁻²).
Hence, we can say that the field vector g also represents the acceleration due to gravity and we can calculate its value at the Earth’s surface as described below.

1.1 Using a pendulum to determine g
The Law of Universal Gravitation (discussed in more detail in
chapter 4) says that the magnitude of the force of attraction between the
Earth and an object on the Earth’s surface is given by:
$$F = G\frac{m_E m_O}{r_E^2}$$



where
$m_E$ = the mass of the Earth = $5.97 \times 10^{24}$ kg
$m_O$ = mass of the object (kg)
$r_E$ = radius of the Earth = $6.38 \times 10^{6}$ m.
From the defining equation for g, given above, we can see that the force experienced by the mass can also be described by:
$$F = m_O g.$$
Equating these two we get:
$$m_O g = G\frac{m_E m_O}{r_E^2}.$$
This simplifies to give:
$$g = G\frac{m_E}{r_E^2} = \frac{(6.672 \times 10^{-11})(5.974 \times 10^{24})}{(6.378 \times 10^{6})^2}.$$
Hence, g ≈ 9.80 m s⁻².
The value of the Earth’s radius used here, 6378 km (at the Equator), is an average value so the value of g calculated, 9.80 m s⁻², also represents an average value.

Variations in the value of g


Variation with geographical location
The actual value of the acceleration due to gravity, g, that will apply in a
given situation will depend upon geographical location. Minor variations
in the value of g around the Earth’s surface occur because:
• the Earth’s crust or lithosphere shows variations in thickness and
structure due to factors such as tectonic plate boundaries and dense
mineral deposits. These variations can alter local values of g.
• the Earth is not a perfect sphere, but is flattened at the poles. This
means that the value of g will be greater at the poles, since they are
closer to the centre of the Earth.
• the spin of the Earth creates a centrifuge effect that reduces the effective value of g. The effect is greatest at the Equator and there is no effect at the poles.
As a result of these factors, the acceleration due to gravity at the surface of the Earth varies from a minimum value of 9.782 m s⁻² at the Equator to a maximum value of 9.832 m s⁻² at the poles. The usual value used in equations requiring g is 9.8 m s⁻².
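
The size of the spin effect can be estimated with a short calculation. The sketch below is an illustration only: it assumes the centripetal acceleration formula a = ω²r, the equatorial radius of 6378 km used earlier in this chapter, and a rotation period of about 86 164 s (one sidereal day), a figure that does not appear in the text.

```python
import math

# The radius is the value quoted in the text; the rotation period
# (one sidereal day) is an outside assumption made for this sketch.
R_EQUATOR = 6.378e6        # m
ROTATION_PERIOD = 86164.0  # s

def spin_reduction(radius_m, period_s):
    """Centripetal acceleration omega^2 * r at the Equator, in m s^-2."""
    omega = 2 * math.pi / period_s  # angular speed, rad s^-1
    return omega ** 2 * radius_m

reduction = spin_reduction(R_EQUATOR, ROTATION_PERIOD)
observed_range = 9.832 - 9.782  # pole value minus Equator value, from the text

print(f"Reduction in g at the Equator due to spin: {reduction:.3f} m s^-2")
print(f"Observed pole-to-Equator difference:       {observed_range:.3f} m s^-2")
```

Under these assumptions the spin accounts for roughly 0.03 m s⁻² of the 0.05 m s⁻² difference; the remainder would come from the other factors listed above, such as the flattening of the Earth at the poles.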

Variation with altitude


The formula for g shows that the value of g will also vary with altitude
above the Earth’s surface. By using a value of r equal to the radius of the
Earth plus altitude, the following values can easily be calculated. It is
clear from table 1.1 that the effect of the Earth’s gravitational field is felt
quite some distance out into space.
The formula used is:
$$g = G\frac{m_E}{(r_E + \text{altitude})^2}.$$
Note that as altitude increases the value of g decreases, dropping to zero
only when the altitude has an infinite value.

Table 1.1 The variation of g with altitude above Earth’s surface

ALTITUDE (km)    g (m s⁻²)    COMMENT
0                9.80         Earth’s surface
8.8              9.77         Mt Everest Summit
80               9.54         Arbitrary beginning of space
200              9.21         Mercury capsule orbit altitude
250              9.07         Space shuttle minimum orbit altitude
400              8.68         Space shuttle maximum orbit altitude
1 000            7.32         Upper limit for low Earth orbit
40 000           0.19         Communications satellite orbit altitude
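
The altitude formula is easy to evaluate for a whole list of altitudes. The following is a minimal sketch, assuming the constants used in the worked calculation of g earlier in this chapter (G = 6.672 × 10⁻¹¹ N m² kg⁻², mass of the Earth 5.974 × 10²⁴ kg, radius 6.378 × 10⁶ m); the function name `g_at_altitude` is introduced here for illustration only.

```python
# Constants as used in the worked calculation of g in the text (SI units).
G = 6.672e-11        # universal gravitational constant, N m^2 kg^-2
M_EARTH = 5.974e24   # mass of the Earth, kg
R_EARTH = 6.378e6    # radius of the Earth, m

def g_at_altitude(altitude_km):
    """Return g (m s^-2) at a given altitude above the Earth's surface."""
    r = R_EARTH + altitude_km * 1000.0  # distance from the Earth's centre, m
    return G * M_EARTH / r ** 2

# Altitudes (km) taken from table 1.1
for altitude_km in (0, 8.8, 80, 200, 250, 400, 1000, 40000):
    print(f"{altitude_km:>8} km   g = {g_at_altitude(altitude_km):.2f} m s^-2")
```

With these constants the results should agree with table 1.1 to within about 0.02 m s⁻²; small differences can arise from the rounding of the constants used.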

Variation with planetary body


The formula for g also shows that the value of g depends upon the mass
and radius of the central body which, in examples so far, has been the
Earth. Other planets and natural satellites (moons) have a variety of
masses and radii, so that the value of g elsewhere in our solar system can
be quite different from that on Earth. The following table presents a few
examples.
The formula used here is:
$$g = G\frac{m_{planet}}{r_{planet}^2}.$$

1.2 Weight values in the solar system and g

Table 1.2 A comparison of g on the surface of other planetary bodies

BODY       MASS (kg)       RADIUS (km)    g (m s⁻²)
Moon       7.35 × 10²²     1 738          1.6
Mars       6.42 × 10²³     3 397          3.7
Jupiter    1.90 × 10²⁷     71 492         24.8
Pluto      1.31 × 10²²     1 151          0.66
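
The same formula can be applied to each body in table 1.2. This is a sketch under stated assumptions: the masses and radii are those listed in the table, G is taken as 6.67 × 10⁻¹¹ N m² kg⁻² (the rounded value used in sample problem 1.1), and the names `BODIES` and `surface_g` are introduced for illustration.

```python
G = 6.67e-11  # universal gravitational constant, N m^2 kg^-2 (rounded value)

# name: (mass in kg, radius in km), as listed in table 1.2
BODIES = {
    "Moon": (7.35e22, 1738),
    "Mars": (6.42e23, 3397),
    "Jupiter": (1.90e27, 71492),
    "Pluto": (1.31e22, 1151),
}

def surface_g(mass_kg, radius_km):
    """Return g (m s^-2) at the surface of a body of given mass and radius."""
    radius_m = radius_km * 1000.0  # convert km to m before squaring
    return G * mass_kg / radius_m ** 2

for name, (mass_kg, radius_km) in BODIES.items():
    print(f"{name:<8} g = {surface_g(mass_kg, radius_km):.2f} m s^-2")
```

Note the conversion of the radius from kilometres to metres; leaving the radius in kilometres would make every result too large by a factor of one million.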

SAMPLE PROBLEM 1.1: Determining acceleration due to gravity above the Moon

For each of the Apollo lunar landings, the command module continued orbiting the Moon at an altitude of about 110 km, awaiting the return of the Moon walkers. Determine the value of acceleration due to gravity at that altitude above the surface of the Moon (the radius of the Moon is 1738 km).

SOLUTION
$$g = G\,\frac{m_M}{(r_M + \text{altitude})^2}
    = \frac{(6.67 \times 10^{-11})(7.35 \times 10^{22})}{(1.738 \times 10^{6} + 1.1 \times 10^{5})^2}
    = 1.4\ \text{m s}^{-2}$$

That is, the acceleration due to gravity operating on the orbiting command module was approximately 1.4 m s⁻².

CHAPTER 1 EARTH’S GRAVITATIONAL FIELD 5


1.2 WEIGHT
Weight is defined as the force on a mass due to the gravitational field of a large celestial body, such as the Earth. Since it is a force, it is measured in newtons. We can use Newton’s Second Law to define a simple formula for weight:
Newton’s Second Law states:
F = ma
and hence:
W = mg
where
W = weight (N)
m = mass (kg)
g = acceleration due to gravity at that place (m s⁻²).

Figure 1.3 There is always a gravitational force between any two masses. When one of the masses is as large as a planet, the force on a small mass is called weight.

SAMPLE PROBLEM 1.2: Determining the weight of an astronaut

What would be the weight of a 100 kg astronaut (a) on the Earth, (b) on the Moon and (c) in an orbiting space shuttle?

SOLUTION
Use the values of g shown in tables 1.1 and 1.2. Note that the astronaut’s mass does not change with position but weight does.
(a) On the Earth: W_E = m g_E = 100 × 9.8 = 980 N
(b) On the Moon: W_M = m g_M = 100 × 1.6 = 160 N
(c) In orbit: W_O = m g_O = 100 × 8.68 (at maximum altitude) = 868 N

Many numerical HSC questions allocate marks for working. These marks are earned at the substitution line; that is, the line following the formula into which data values are correctly substituted. Usually no marks are awarded for a formula on its own. You should develop the habit of showing all your workings, especially the substitution line.

There is an apparent contradiction in this last answer. The astronaut in orbit still has a considerable weight, rather than being weightless. However, the answer is correct because, as we have already seen, the Earth’s gravitational field extends quite some distance out into space. Why then do space shuttle astronauts experience weightlessness? As we shall see later, the weightlessness they feel is not real but only apparent, and is a consequence of their orbital motion around the Earth.

1.3 GRAVITATIONAL POTENTIAL ENERGY
Gravitational potential energy, Ep, is the energy of a mass due to its position within a gravitational field. Here on Earth, the Ep of an object at some point, x, above the ground is easily found as it is equal to the work done to move the object from the ground up to the point, x, as shown in figure 1.4.

Gravitational potential energy Ep = work done to move to the point
= force required × distance moved (since work W = Fr)
= (mg) × h
= mgh

Figure 1.4 Gravitational potential energy, Ep = work done to move up to the point from the ground (the zero level). (Diagram labels: Ground; Point x; Work done = Fr = mgh.)

Hence, in this case Ep = mgh. We chose the ground as our starting point because this is our defined zero level; that is, the place where Ep = 0. Note that since work must be done on the object to lift it, it acquires energy. Hence, at point x, Ep is greater than zero.

On a larger, planetary scale we need to rethink our approach. Due to the inverse square relationship in the Law of Universal Gravitation, the force of attraction between a planet and an object will drop to zero only at an infinite distance from the planet. For this reason we will now choose infinity (or some point a very large distance away) as our level of zero potential energy. On a large scale, gravitational potential energy is defined as the work done to move an object from infinity (or some point very far away) to a point within a gravitational field.

There is a strange side effect of our choice of zero level. Because gravitation is a force of attraction, work must be done on the object to move it from a point, x, to infinity; that is, against the field so that it gains energy, Ep.
Therefore, Ep at infinity > Ep at point x
but Ep at infinity = 0
so that Ep at point x < 0
that is, Ep at point x has a negative value! (see figure 1.5)

Figure 1.5 Different levels of Ep. Work must be done to move a mass against a gravitational field, so that Ep at surface < Ep at x < Ep at ∞ (= 0 by our definition). If we choose a planet’s surface as the zero level, Ep at x has a positive value. If infinity is chosen as the zero level, Ep has a negative value.



Using the same approach as earlier, the gravitational potential
energy, Ep, of an object at a point, x, in a gravitational field is equal to
the work done to move the object from the zero energy level at
infinity (or some point very far away) to point x. It can be shown
mathematically that:
$$E_p = -G\,\frac{m_1 m_2}{r}$$
where
m1 = mass of planet (kg)
m2 = mass of object (kg)
r = distance separating masses (m).
Figure 1.6 graphs this relationship to show how Ep varies in value in the
space around a planet.

Figure 1.6 A graph showing how the negative value for gravitational potential energy, Ep, increases with distance up to a maximum value of zero. (Axes: Ep plotted against r; the value marked at the planet’s radius, rp, is −G m mp/rp.)

SAMPLE PROBLEM 1.3: Gravitational potential energy in the Sun–Earth–Moon system

Given the following data, determine the gravitational potential energy of:
(a) the Moon within the Earth’s gravitational field
(b) the Earth within the Sun’s gravitational field.

Mass of the Earth = 5.97 × 10²⁴ kg
Mass of the Moon = 7.35 × 10²² kg
Mass of the Sun = 1.99 × 10³⁰ kg
Earth–Moon distance = 3.84 × 10⁸ m on average
Earth–Sun distance = 1.50 × 10¹¹ m on average (one astronomical unit, AU)

SOLUTION
(a)
$$E_p = -G\,\frac{m_E m_M}{r}
      = -\frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})(7.35 \times 10^{22})}{3.84 \times 10^{8}}
      = -7.62 \times 10^{28}\ \text{J}$$
That is, the gravitational potential energy of the Moon is approximately −7.62 × 10²⁸ J. Put another way, the work that would be done in moving the Moon from a very large distance away from Earth to its current distance would be −7.62 × 10²⁸ J.

(b)
$$E_p = -G\,\frac{m_E m_S}{r}
      = -\frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})(1.99 \times 10^{30})}{1.50 \times 10^{11}}
      = -5.28 \times 10^{33}\ \text{J}$$
That is, the gravitational potential energy of the Earth is approximately −5.28 × 10³³ J. The negative sign indicates that this would be work done by the system (not on the system) in moving the Earth from a very large distance away from the Sun to its present orbital distance. This negative work represents potential energy lost by the system as the Earth and the Sun are brought together (converted into other forms of energy, most probably kinetic). Since the Ep is reduced below the zero level (see figure 1.6), it is quite appropriate that it should appear as a negative value.

This missing energy actually lends stability to a system, since the Earth
would need to get this amount of energy back from somewhere if ever it
were to separate from the Sun. It can be thought of as binding energy,
since the lack of this energy binds a system together.
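
The two calculations in sample problem 1.3 can be reproduced with a short script. This is a sketch under the same data as the sample problem, assuming G = 6.67 × 10⁻¹¹ N m² kg⁻²; `potential_energy` is an illustrative name rather than anything defined in the text.

```python
G = 6.67e-11  # universal gravitational constant (N m^2 kg^-2)


def potential_energy(m1, m2, r):
    """Gravitational potential energy (J) of two masses (kg) a distance r (m) apart.

    The zero level is taken at infinite separation, so the result is negative.
    """
    return -G * m1 * m2 / r


# Data from sample problem 1.3
ep_moon = potential_energy(5.97e24, 7.35e22, 3.84e8)    # Moon in Earth's field
ep_earth = potential_energy(5.97e24, 1.99e30, 1.50e11)  # Earth in Sun's field

print(f"Moon in Earth's field: {ep_moon:.2e} J")
print(f"Earth in Sun's field:  {ep_earth:.2e} J")
```

Doubling r in the function halves the size of the (negative) result, which matches the 1/r shape of the curve in figure 1.6.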



CHAPTER REVIEW

SUMMARY
• The field vector g describes the strength and direction of the gravitational field at any point. At the surface of the Earth it has an average value of 9.8 N kg⁻¹ or m s⁻².
• The value of g at any specific point on the Earth’s surface can vary from the average figure due to a number of factors. It will also vary with altitude.
• Weight is the force on an object due to a significant gravitational field (W = mg).
• The gravitational potential energy of an object at some point within a gravitational field is equivalent to the work done in moving the object from an infinite distance to that point.
$$E_p = -G\,\frac{m_1 m_2}{r}$$

QUESTIONS
1. Define weight.
2. In general terms only, describe the variation in g that would be experienced in a spacecraft travelling directly from the planet Mars to its moon, Phobos, 9380 km away.
3. The gravitational field vector g has an average value, on the surface of the Earth, of 9.8 N kg⁻¹ or m s⁻². Show that the two alternative units quoted are equivalent.
4. Complete the following table to calculate the acceleration due to gravity and weight force experienced by an 80 kg person standing on the surface of each of the planets or moons indicated.

BODY      MASS (kg)       RADIUS (km)   g ON SURFACE (m s⁻²)   WEIGHT OF 80 kg PERSON THERE (N)
Mercury   3.30 × 10²³     2440
Venus     4.87 × 10²⁴     6052
Io        8.94 × 10²²     1821
Callisto  1.08 × 10²³     2410

5. The moon of Pluto, Charon (pronounced Kair-on), discovered in 1978, is one of the largest moons, in proportion to its planet or dwarf planet (as Pluto is), in the solar system.
(a) The mass of Charon is 1.62 × 10²¹ kg while the mass of Pluto is 1.31 × 10²² kg. Calculate the ratio of the mass of Charon to the mass of Pluto.
(b) The radius of Charon is 593 km and the radius of Pluto is 1151 km. Calculate the ratio of the radius of Charon to the radius of Pluto.
(c) Calculate the ratio of the density of Charon to the density of Pluto.
(d) Calculate the ratio of g on Charon to g on Pluto.
6. Identify four different factors that cause the value of g to vary around the Earth.
7. Construct a graph that shows the value of g each 5000 km above the surface of the Earth up to an altitude of 40 000 km (which corresponds to the altitude of communications satellites). The mass of the Earth is 5.97 × 10²⁴ kg and the radius of the Earth is 6378 km at the Equator.
8. Define gravitational potential energy Ep.
9. Explain the reason for the selection of infinity as the place of zero gravitational potential energy.
10. Explain how this selection of zero level results in any point within a gravitational field having a negative gravitational potential energy.
11. Calculate the gravitational potential energy of a 1000 kg communications satellite orbiting the Earth at an altitude of 40 000 km. Use the data provided in question 7.
12. Use the following data to calculate the gravitational potential energy of (a) Callisto as it orbits within Jupiter’s gravitational field, and (b) Jupiter as it orbits within the Sun’s gravitational field.
Mass of Jupiter = 1.90 × 10²⁷ kg
Mass of Callisto = 1.08 × 10²³ kg
Mass of the Sun = 1.99 × 10³⁰ kg
Jupiter–Callisto distance = 1.88 × 10⁹ m on average
Jupiter–Sun distance = 7.78 × 10¹¹ m on average

PRACTICAL ACTIVITIES

1.1 USING A PENDULUM TO DETERMINE g

Aim
To determine the rate of acceleration due to gravity using the motion of a pendulum.

Apparatus
retort stand
bosshead and clamp
approximately 1 metre of string
50 g mass carrier or pendulum bob
stopwatch
metre rule

Theory
When a simple pendulum swings with a small angle, the mass on the end performs a good approximation of the back-and-forth motion called simple harmonic motion. The period of the pendulum, that is, the time taken to complete a single full back-and-forth swing, depends upon just two variables: the length of the string and the rate of acceleration due to gravity. The formula for the period is as shown below:
$$T = 2\pi\sqrt{\frac{l}{g}}$$
where
T = period of the pendulum (s)
l = length of the pendulum (m)
g = rate of acceleration due to gravity (m s⁻²).

Method
1. Set up the retort stand and clamp on the edge of a desk as shown in figure 1.7. Tie on the string and adjust its length to about 90 cm before attaching the 50 g mass carrier or pendulum bob to its end.
2. Using the metre rule, carefully measure the length of the pendulum from the knot at its top to the base of the mass carrier. Enter this length in your results table.
3. Set the pendulum swinging gently (30° maximum deviation from vertical) and use the stopwatch to time 10 complete back-and-forth swings. Be sure to start and stop the stopwatch at an extreme of the motion rather than somewhere in the middle. Enter your time for 10 swings in the results table.
4. Repeat steps 2 and 3 at least five times, after shortening the string by 5 cm each time.

Figure 1.7 Apparatus for practical activity 1.1 (labels: retort stand; bosshead and clamp; string; pendulum; mass carrier)

Results
Copy the table below into your practical book to record your results, and then complete the other columns of information.

TRIAL   TIME FOR 10 OSCILLATIONS (s)   PERIOD T (s)   PERIOD SQUARED T² (s²)   LENGTH OF PENDULUM (m)
1
2
3
4
5

Draw a graph of period squared versus length of the pendulum. Plot T² on the vertical axis and length on the horizontal axis.

Analysis
1. Your graph should display a straight-line relationship. Draw a line of best fit and evaluate the gradient.
2. Rearrange the pendulum equation given earlier to the form, T² = kl, where k is a combination of constants.
3. Compare this formula with the general equation for a straight line: y = kx. This comparison shows that if T² forms the y-axis and length, l, forms the x-axis, the expression you derived for k in step 2 should correspond to the gradient of the graph you have drawn. Write down your expression:
gradient = (complete).
4. Use your expression to calculate a value for g, the acceleration due to gravity.

Questions
1. This method usually produces very accurate results. Can you suggest a reason why it should be so reliable?
2. What are the sources of error in this experiment?
3. What could you do to improve the method of this experiment to make it even more accurate?
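
The analysis for this activity can also be carried out numerically instead of by hand on graph paper. The sketch below is one possible approach, not the prescribed method: it squares the period formula to get T² = (4π²/g)l, fits a least-squares line to T² against l, and recovers g from the gradient. The function name `g_from_readings` and the readings listed are hypothetical placeholders, not measured data; substitute your own results.

```python
import math


def g_from_readings(readings):
    """Estimate g (m s^-2) from (length in m, time for 10 swings in s) pairs.

    Fits a straight line to T^2 against l by least squares and uses
    gradient = 4*pi^2 / g, which follows from T = 2*pi*sqrt(l/g).
    """
    lengths = [length for length, _ in readings]
    t_squared = [(t10 / 10) ** 2 for _, t10 in readings]  # period squared
    n = len(readings)
    mean_l = sum(lengths) / n
    mean_t2 = sum(t_squared) / n
    s_xy = sum((l - mean_l) * (t2 - mean_t2) for l, t2 in zip(lengths, t_squared))
    s_xx = sum((l - mean_l) ** 2 for l in lengths)
    gradient = s_xy / s_xx
    return 4 * math.pi ** 2 / gradient


# Hypothetical readings for illustration only (NOT real measurements)
example_readings = [(0.90, 19.0), (0.85, 18.5), (0.80, 18.0), (0.75, 17.4), (0.70, 16.8)]
print(f"Estimated g = {g_from_readings(example_readings):.1f} m s^-2")
```

Timing 10 swings rather than one reduces the effect of reaction-time error on each period, and fitting all five points rather than using a single reading averages out some of the remaining scatter.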



1.2 WEIGHT VALUES IN THE SOLAR SYSTEM AND g

Aim
To research g and weight values throughout the solar system.

Theory
The value of g on the surface of a planet depends upon the mass of the planet and its radius. The equation relating these variables is:
$$g = G\,\frac{m_{\text{planet}}}{r_{\text{planet}}^2}$$

Method
The table below lists the 16 most massive objects in our solar system, excluding the Sun, in descending order of mass. However, the mass values and radii have not been provided. Conduct research to determine these figures and then perform the calculations to complete the table as shown.

Results
A comparison of gravity throughout the solar system

BODY       CENTRE OF ORBIT   MASS (kg)   RADIUS (km)   g ON SURFACE (m s⁻²)   WEIGHT OF 100 kg PERSON ON SURFACE (N)
Jupiter    Sun
Saturn     Sun
Neptune    Sun
Uranus     Sun
Earth      Sun
Venus      Sun
Mars       Sun
Mercury    Sun
Ganymede   Jupiter
Titan      Saturn
Callisto   Jupiter
Io         Jupiter
Moon       Earth
Europa     Jupiter
Triton     Neptune
Pluto      Sun

Analysis
Draw a bar graph of your results, with the bodies in their mass order along the horizontal axis, and acceleration due to gravity on the vertical axis. You may be surprised at some of the results.

Questions
1. How does g on Jupiter compare with the rest of the plotted results?
2. How does g on Saturn, Neptune, Uranus and Venus compare with g on Earth?
3. How does g on Uranus compare with g on Venus?
4. How does g on Mars compare with g on Mercury?
5. How does g on all of the natural satellites (moons) listed compare with g on Pluto?
6. There is some debate over whether Pluto should be downgraded in official status from a ‘planet’. Can you provide one good argument

CHAPTER 2
LAUNCHING INTO SPACE

Figure 2.1 The space shuttle launching from Cape Canaveral. It uses two solid-fuel rocket engines to supplement the thrust of its own liquid-fuel rocket engines.

Remember
Before beginning this chapter, you should be able to:
• describe the nature of velocity and calculate values using: $v = \frac{\Delta r}{t}$
• describe the nature of acceleration and calculate values using: $a = \frac{\Delta v}{\Delta t} = \frac{v - u}{t}$
• describe the nature of kinetic energy and calculate values using: $E_k = \frac{1}{2}mv^2$
• describe the nature of gravitational potential energy using: $E_p = -G\,\frac{m_1 m_2}{r}$
• describe the nature of momentum and calculate values using: $p = mv$
• apply Newton’s Second Law of Motion: $\Sigma F = ma$.

Key content
At the end of this chapter you should be able to:
• describe the trajectory of a projectile in terms of horizontal and vertical components
• solve projectile motion problems that require you to analyse the motion and determine the velocity of the projectile at any time, the maximum height reached, the time of flight or the range of the motion
• describe Galileo’s contribution to our understanding of projectile motion
• explain the concept of escape velocity
• outline Isaac Newton’s concept of escape velocity
• identify why the term ‘g forces’ is used to describe forces acting upon an astronaut during a typical launch or re-entry
• discuss the effect of the Earth’s orbital and rotational motion on the launch of a rocket
• analyse a rocket’s launch acceleration in terms of forces and conservation of momentum
• present information on one notable figure in the history of rocket development and space exploration.

In chapter 1 we looked at the nature of gravitational fields. In this
chapter we discuss the issue of escaping the Earth’s gravitational field, at
least as far as reaching an orbit around the Earth. This is still well within
the reach of the Earth’s gravitational field even though orbiting
astronauts do not apparently feel its effects. This field is all that holds
them in an orbit around the Earth and stops them from heading off
further into space.
We will begin by considering simple projectiles, then move from them
to projectiles launched into space, and then on to rockets launched into
space.

2.1 PROJECTILE MOTION

A projectile is any object that is thrown, dropped or otherwise launched into the air. This includes such things as a box dropped from a plane, a thrown ball, a struck golf ball, a kicked football, or even a fired bullet or cannonball. For our purposes, however, it does not include a rocket, because the thrust of a rocket continues well into its flight. Projectiles are projected into the air and then left to complete their unpowered flight.

Throughout the flight the projectile is subject to just one force, the force of gravity, and just one acceleration, the acceleration due to gravity, g, or 9.8 m s⁻² downwards near the surface of the Earth. This rate of acceleration applies to all objects, large or small. It is natural to think of heavier objects falling faster than lighter ones, but this is not what happens. All objects are accelerated towards the Earth at the same rate. This was first realised by Galileo Galilei.

Galileo postulated that all masses, whether large or small, fall at the same rate, and he conducted experiments to try to prove just that. However, air resistance gets in the way of such experiments and made the job quite difficult for him. He eventually overcame this difficulty by rolling balls down highly-polished inclines instead of simply dropping them, thereby reducing the effective acceleration. This lower rate of acceleration was less affected by air resistance and was easier to measure.

When astronauts went to the Moon, one of the experiments they performed was to drop a hammer and a feather together. On the Moon there is no air to get in the way, and the rate of acceleration due to gravity is much less than here on Earth, so that things fall more slowly. As figure 2.2 shows, these two factors made the experiment much easier to perform and the result was clear: the feather and the hammer hit the Moon’s surface at the same time.

Figure 2.2 An artist’s impression of astronaut David Scott dropping a hammer and a feather on the Moon to demonstrate that all objects fall at the same rate

In order to simplify our analysis of a projectile’s motion we will ignore the effect of air resistance.

The trajectory
The trajectory of a projectile is the path that it follows during its flight. In the absence of air resistance, the path of the flight of a projectile will trace out the shape of a parabola as shown by the photograph in figure 2.3, taken with the aid of a stroboscope. A stroboscope is a light that produces quick flashes at regular (usually small) time periods. If used with a camera, instead of a regular flashgun, a stroboscopic photograph is produced which shows multiple images of a moving object.

Figure 2.3 A stroboscopic photograph of a ball undergoing projectile motion

To understand and analyse this motion we must note an observation first made by Galileo: the motion of a projectile can be regarded as two separate and independent motions superimposed upon each other. The first is a vertical motion, which is subject to acceleration due to gravity, and the second is a horizontal motion, which experiences no acceleration. Figure 2.4 places these two motions within a frame of reference, using the y-axis for the vertical motion and the x-axis for the horizontal motion.

Because the two motions are perpendicular, and therefore independent, we can treat them separately and analyse them separately.

Figure 2.4 A frame of reference for the vertical and horizontal component motions of a projectile (vertical motion along the y-axis; horizontal motion along the x-axis)

Acceleration equations
You will recall from the Preliminary course topic ‘Moving about’ that whenever a moving object changes its velocity, such as a thrown ball, then it has accelerated. This acceleration is defined by the following equation:
$$a = \frac{\Delta v}{\Delta t} = \frac{v - u}{t}$$
where
a = acceleration (m s⁻²)
∆v = change in velocity (m s⁻¹)
∆t = change in time (s)
v = final velocity (m s⁻¹)
u = initial velocity (m s⁻¹)
t = time taken (s).

CHAPTER 2 LAUNCHING INTO SPACE 15


Using this equation, a further set of equations describing accelerated motion can be derived. These equations are shown together here as a set:
$$v = u + at$$
$$v^2 = u^2 + 2ar$$
$$r = ut + \tfrac{1}{2}at^2$$
where
r = displacement (m).
We will use this set to derive equations specific to the vertical and horizontal motions.

The vertical motion

When a ball is thrown directly up, it is accelerated due to gravity directly down. As a result it will rise up, slow to a halt in the air and then fall back to Earth. As it falls it will speed up until, when back at its starting point, it is going as fast as it was when thrown. Furthermore, the time taken to fall from its peak height to the ground exactly equals the time taken to rise to the peak height. Figure 2.5 shows this motion broken up into equal time segments. Notice that we have taken up to be the positive direction, so that acceleration is always in the negative direction.

Figure 2.5 (a and b) The motion of a ball thrown vertically upwards: (a) going up; (b) going down

In adapting the acceleration equations for the vertical motion we need to note the following variables:
a = ay = 9.8 m s⁻² down (as shown in figure 2.5(b))
v = vy
u = uy
r = ∆y (since displacement = change of position on the y-axis).
Hence, our three equations become:
$$v_y = u_y + a_y t$$
$$v_y^2 = u_y^2 + 2a_y\,\Delta y$$
$$\Delta y = u_y t + \tfrac{1}{2}a_y t^2$$

SAMPLE PROBLEM 2.1: Calculating the height, time and velocity of a ball thrown vertically

A ball is thrown directly upwards with a velocity of 45 km h⁻¹. Ignoring air resistance, determine:
(a) its peak height
(b) its time of flight
(c) its velocity after 0.5 s
(d) its velocity after 1.5 s.

SOLUTION
We shall take up to be the positive direction. Note that the velocity has not been given in a standard SI unit and must first be converted into m s⁻¹.
$$45\ \text{km h}^{-1} = \left(\frac{45}{3.6}\right)\ \text{m s}^{-1} = 12.5\ \text{m s}^{-1}$$

(a) Let us consider just the first half of the motion; that is, the rise up to the peak. We can say that for this segment:
uy = 12.5 m s⁻¹, vy = 0 m s⁻¹, ay = −9.8 m s⁻², ∆y = ?
The equation to use has these four variables:
$$v_y^2 = u_y^2 + 2a_y\,\Delta y$$
$$0^2 = 12.5^2 + 2 \times (-9.8)\,\Delta y$$
$$\therefore \Delta y = 7.97\ \text{m} \approx 8.0\ \text{m}$$
That is, the maximum height reached by the ball is 8.0 m.

(b) We must still focus on the ball’s rise up to its peak height. We can now say that:
uy = 12.5 m s⁻¹, vy = 0 m s⁻¹, ay = −9.8 m s⁻², ∆y = 7.97 m, t = ?
The equation to use is:
$$v_y = u_y + a_y t$$
$$0 = 12.5 + (-9.8)t$$
$$\therefore t = 1.28\ \text{s}$$
This is the time to rise to the peak height. By symmetry, it will take the ball just as long to fall, so that the total trip time is:
2 × 1.28 = 2.56 s ≈ 2.6 s

(c) We can now consider the entire up and down motion as a whole, and we can list the following data:
uy = 12.5 m s⁻¹, ay = −9.8 m s⁻², t = 0.5 s, vy = ?
The right equation to use has these four variables:
$$v_y = u_y + a_y t = 12.5 + (-9.8 \times 0.5) = 7.6\ \text{m s}^{-1}$$
That is, the velocity of the ball after 0.5 s is 7.6 m s⁻¹ upward.

(d) We continue to consider the entire motion as a whole, and can list the following data:
uy = 12.5 m s⁻¹, ay = −9.8 m s⁻², t = 1.5 s, vy = ?
The right equation to use has these four variables:
$$v_y = u_y + a_y t = 12.5 + (-9.8 \times 1.5) = -2.2\ \text{m s}^{-1}$$
That is, after 1.5 s the ball is falling and its velocity is 2.2 m s⁻¹ downwards.
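
The four parts of sample problem 2.1 can be checked with a few lines of code. This is a minimal sketch, assuming up is positive and g = 9.8 m s⁻²; the names `G_ACC` and `vertical_velocity` are illustrative. Because the script carries unrounded intermediate values, the flight time comes out as 2.55 s rather than 2 × 1.28 = 2.56 s; both round to 2.6 s.

```python
G_ACC = 9.8  # magnitude of acceleration due to gravity (m s^-2)


def vertical_velocity(u_y, t):
    """Vertical velocity (m s^-1, up positive) t seconds after launch at u_y."""
    return u_y - G_ACC * t  # vy = uy + ay*t with ay = -9.8


u_y = 45 / 3.6                        # 45 km/h converted to m/s
peak_height = u_y ** 2 / (2 * G_ACC)  # from vy^2 = uy^2 + 2*ay*dy with vy = 0
time_to_peak = u_y / G_ACC            # from vy = uy + ay*t with vy = 0
time_of_flight = 2 * time_to_peak     # rise and fall take equal times

print(f"(a) peak height      = {peak_height:.2f} m")
print(f"(b) time of flight   = {time_of_flight:.2f} s")
print(f"(c) velocity at 0.5 s = {vertical_velocity(u_y, 0.5):.1f} m/s")
print(f"(d) velocity at 1.5 s = {vertical_velocity(u_y, 1.5):.1f} m/s")
```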

The horizontal motion

If a ball is pushed horizontally, ideally, once it is under way, it experiences no acceleration at all in its direction of motion. This is hard to visualise only because we are used to the force of friction, which resists any motion. The effect of this friction can be minimised by considering a flat disc sliding across a horizontal air table, such as an air hockey table.

If no acceleration is experienced, the disc will travel with a uniform, unchanging velocity. If we were to mark the position of the disc at regular time intervals, it would look like figure 2.6.

Figure 2.6 The motion of a disc sliding over an air table with uniform velocity

This is also the nature of the horizontal portion of a projectile’s motion. Once free of the ground, there is no friction for a thrown ball, other than air resistance which we are currently ignoring, and so the ball will travel sideways above the ground in the same manner as the motion shown in figure 2.6.

In adapting the acceleration equations for the horizontal motion we need to note the following variables:
a = 0 m s⁻²
v = vx
u = ux
r = ∆x (since displacement = change of position on the x-axis).
Hence, our three equations become:
$$v_x = u_x \quad \text{(that is, horizontal velocity is uniform)}$$
$$v_x^2 = u_x^2$$
$$\Delta x = u_x t$$
Any object travelling with a velocity will eventually bump into something and bring the motion to an end. The disc on the air table will, sooner or later, collide with the wall of the table, and a thrown ball will eventually collide with the ground. The point at which the ball hits the ground and stops moving defines the maximum horizontal displacement, or range. We can modify the third equation above to show this:
Range = maximum ∆x = ux × trip time
where
ux = horizontal velocity of a projectile (m s⁻¹).

SAMPLE PROBLEM 2.2: Calculating the velocity of a flat disc on an air table

A flat disc slides across a 1.5 m wide air table in 0.5 s. What was its velocity?

SOLUTION
∆x = 1.5 m, t = 0.5 s, ux = ?
∆x = ux t
1.5 = ux × 0.5
∴ ux = 3.0 m s⁻¹
That is, the disc slid across the air table with a velocity of 3.0 m s⁻¹.

SAMPLE PROBLEM 2.3: Calculating the velocity of a bullet fired at a target

An air gun is fired horizontally at a target 81 m away and the bullet takes just 0.35 s to strike it. What was the velocity of the bullet?

SOLUTION
∆x = 81 m, t = 0.35 s, ux = ?
∆x = ux t
81 = ux × 0.35
∴ ux = 231 m s⁻¹ ≈ 230 m s⁻¹
That is, the bullet travelled to the target at a velocity of approximately 230 m s⁻¹.

SAMPLE PROBLEM 2.4: Calculating the range of the bullet

The gun is now fired into the distance, in a direction that ensures that the bullet won’t hit anything during its flight. If the bullet takes 0.5 s to fall to the ground and stop, what was its range?

SOLUTION
ux = 230 m s⁻¹, trip time t = 0.5 s, range = ?
∆x = ux t
= 230 × 0.5
= 115 m ≈ 120 m
That is, the bullet managed to travel a total of approximately 120 m before it stopped because it hit the ground.

Putting the two parts together

We now have a set of equations that describe each of the vertical and horizontal components of the projectile motion. They are summarised in table 2.1.

(Margin reference: 2.1 Modelling projectile motion.)

Table 2.1 The equation set for projectile motion

GENERAL FORM             x-DIRECTION (HORIZONTAL)   y-DIRECTION (VERTICAL)
ACCELERATION EQUATION    Note: a = 0                Note: a = 9.8 m s⁻² down
v = u + at               vx = ux                    vy = uy + ay t
v² = u² + 2ar            vx² = ux²                  vy² = uy² + 2ay ∆y
r = ut + ½at²            ∆x = ux t                  ∆y = uy t + ½ay t²

Let us now look at how the accelerated vertical motion and the non-accelerated horizontal motion superimpose to give the parabolic trajectory of a projectile. Figure 2.7 shows the vertical motion on the left and the horizontal motion along the base. Each successive image in both motions occurs after the same periods of time. We now regard the images as time-matched pairs and use them as coordinates to plot the combined motion of the projectile. Figure 2.7 shows how this is done.

Figure 2.7 Combining the independent vertical motion (uniform downward acceleration) and horizontal motion (uniform velocity) to produce the more complex parabolic trajectory of a projectile
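
The idea of time-matched pairs can be sketched in code: at each time step the horizontal equation gives x and the vertical equation gives y, and the pairs together trace the trajectory. This is an illustrative sketch only, assuming no air resistance, g = 9.8 m s⁻² and up as positive; the function name `trajectory`, the launch values (35 m s⁻¹ at 30°) and the 0.5 s time step are arbitrary choices for the example.

```python
import math

G_ACC = 9.8  # magnitude of acceleration due to gravity (m s^-2)


def trajectory(speed, angle_deg, dt, steps):
    """Return a list of (t, x, y) points for a projectile launched from the origin."""
    theta = math.radians(angle_deg)
    u_x = speed * math.cos(theta)  # horizontal component, stays constant
    u_y = speed * math.sin(theta)  # initial vertical component
    points = []
    for i in range(steps + 1):
        t = i * dt
        x = u_x * t                            # dx = ux*t (no acceleration)
        y = u_y * t - 0.5 * G_ACC * t ** 2     # dy = uy*t + (1/2)*ay*t^2, ay = -9.8
        points.append((t, x, y))
    return points


for t, x, y in trajectory(35, 30, 0.5, 7):
    print(f"t = {t:.1f} s   x = {x:6.1f} m   y = {y:5.1f} m")
```

In the output the x values are evenly spaced while the gaps between successive y values change steadily, which is the combination that produces a parabola.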



The velocity of the projectile
Projectiles are most commonly sent out at some angle to the horizontal. This initial velocity can quite easily be resolved into vertical and horizontal components using trigonometry, as shown in figure 2.8. Performing this calculation determines the initial velocity in the vertical direction and the initial velocity in the horizontal direction.

Figure 2.8 The initial velocity at some angle to the horizontal can be resolved into vertical and horizontal components, uy and ux.
$$u_y = u\sin\theta \qquad u_x = u\cos\theta$$

SAMPLE PROBLEM 2.5a: Calculating the vertical and horizontal components of a projectile

A cannon is fired at a velocity of 400.0 m s⁻¹ 30.0° above horizontal. Determine the vertical and horizontal components of this initial velocity.

SOLUTION
Using the method shown in figure 2.8, the following expressions can be deduced:
The vertical component, uy = 400.0 sin 30.0° = 200.0 m s⁻¹
The horizontal component, ux = 400.0 cos 30.0° = 346.4 m s⁻¹.

The velocity of the projectile at other times during the motion can be found by combining the vertical and horizontal velocities together in a vector addition. Figure 2.9 shows how this vector addition is performed at a position where the projectile has almost risen to a peak. Notice that the real velocity of the projectile is directed at a tangent to the trajectory.

Figure 2.9 Determining the velocity of a projectile at any point in the motion
$$|v| = \sqrt{v_y^2 + v_x^2} \qquad \theta = \tan^{-1}\left(\frac{v_y}{v_x}\right)$$

SAMPLE PROBLEM 2.5b Calculating the velocity of the cannonball

Determine the velocity of the cannonball from sample problem 2.5a,
30.0 s after firing.

SOLUTION
We must first consider the vertical motion to deduce the vertical velocity
after 30.0 s. We shall take upwards to be the positive direction.
$u_y = 200.0\ \text{m s}^{-1},\quad a_y = -9.8\ \text{m s}^{-2},\quad t = 30.0\ \text{s},\quad v_y = ?$
$v_y = u_y + a_y t = 200.0 + (-9.8 \times 30.0) = -94.0\ \text{m s}^{-1}$
That is, the vertical velocity is 94.0 m s⁻¹ downwards.
We already know that the horizontal velocity is constant, so that after
30.0 s vx = ux = 346 m s⁻¹.
Finally, we need to add vy and vx together in a vector triangle as shown in
figure 2.10.

Figure 2.10 The vector addition of vx (346 m s⁻¹) and vy (−94.0 m s⁻¹)

$v = \sqrt{v_y^2 + v_x^2} = \sqrt{(-94.0)^2 + 346^2} = 359\ \text{m s}^{-1}$
$\theta = \tan^{-1}\left(\dfrac{94.0}{346}\right) = 15.2^\circ$
That is, the velocity of the cannonball 30.0 s after firing is 359 m s⁻¹ at
15° below horizontal.
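
The two-step method used for the cannonball (vertical motion first, then
vector addition) can be sketched in Python. This is a minimal illustration
that assumes g = 9.8 m s⁻² and no air resistance; the function name
velocity_at is hypothetical.

```python
import math

G_ACCEL = 9.8  # magnitude of acceleration due to gravity (m/s^2)

def velocity_at(speed, angle_deg, t):
    """Return (magnitude, direction) of a projectile's velocity t seconds
    after launch. Direction is in degrees; negative means below horizontal."""
    angle = math.radians(angle_deg)
    vx = speed * math.cos(angle)                # constant horizontal velocity
    vy = speed * math.sin(angle) - G_ACCEL * t  # vy = uy + ay*t, with ay = -g
    magnitude = math.hypot(vx, vy)              # sqrt(vx^2 + vy^2)
    direction = math.degrees(math.atan2(vy, vx))
    return magnitude, direction

# Cannonball from sample problem 2.5a, 30.0 s after firing.
v, theta = velocity_at(400.0, 30.0, 30.0)
print(f"v = {v:.0f} m/s at {abs(theta):.1f} degrees below horizontal")
```

Because the code keeps unrounded intermediate values, its results may
differ slightly in the last digit from a hand calculation.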

In figure 2.11 this calculation has been performed for several points
along the trajectory of a fired bullet to show how the velocity varies
throughout the motion. You can see how the velocity reduces to a
minimum at the peak, because at this point the vertical velocity is zero
although the horizontal velocity remains. As the projectile falls from its
peak, its velocity increases again until, at the end of the trajectory, it has
the same value as the initial velocity and even the same angle to the
horizontal, although now it is directed below the horizontal.

eModelling: Modelling a stunt driver. A spreadsheet for a powerful general
model of projectile motion (doc-0007).

Figure 2.11 Velocity determined at many points along the trajectory of a
bullet fired from an air gun. The bullet leaves at 35 m s⁻¹, 30° above
horizontal, so ux = 35 cos 30° = 30.3 m s⁻¹ and uy = 35 sin 30° = 17.5 m s⁻¹.
Where vy has a magnitude of 12.3 m s⁻¹, v = 32.7 m s⁻¹ at 22° to the
horizontal; on landing, v = 35 m s⁻¹ at 30° below horizontal. vx remains
30.3 m s⁻¹ throughout.

Determining other quantities
We are now in a position to outline a strategy for calculating other
dimensions of a projectile’s path, such as the height, range and trip time.
To determine maximum height, follow these steps:
1. Resolve initial velocity, u, into component uy .
2. Consider the vertical motion up to the peak.
3. Note that vy = 0 in this case.
4. Select an acceleration equation to suit the available data.
5. Calculate ∆y, which will be maximum height.

SAMPLE PROBLEM 2.6a Calculating the maximum height of a tennis ball

A tennis ball is struck at a velocity of 25 m s⁻¹ at 15° above horizontal.
Calculate the maximum height reached by this ball.

SOLUTION
First, determine uy and ux:
$u_y = 25 \sin 15^\circ = 6.47\ \text{m s}^{-1}$
$u_x = 25 \cos 15^\circ = 24.2\ \text{m s}^{-1}$
Next, consider the vertical motion up to the peak:
$u_y = 6.47\ \text{m s}^{-1},\quad v_y = 0\ \text{m s}^{-1},\quad a_y = -9.8\ \text{m s}^{-2},\quad \Delta y = ?$
$v_y^2 = u_y^2 + 2a_y\Delta y$
$0^2 = 6.47^2 + 2(-9.8)\Delta y$
$\Delta y = 2.1\ \text{m}$
That is, the maximum height reached by the tennis ball is 2.1 m above
the ground.



To determine trip time, follow these steps:
1. Resolve initial velocity, u, into component uy .
2. Consider the vertical motion up to the peak.
3. Note that vy = 0 in this case.
4. Select an acceleration equation to suit the available data.
5. Calculate t, time to rise to the peak.
6. Double this time to find the trip time, since it takes just as long to fall
as to rise.

SAMPLE PROBLEM 2.6b Calculating the time for the struck tennis ball to
return to the ground

Referring back to the tennis ball in sample problem 2.6a, determine the
time it takes to return to the ground.

SOLUTION
Once again, consider the vertical motion up to the peak:
$u_y = 6.47\ \text{m s}^{-1},\quad v_y = 0\ \text{m s}^{-1},\quad a_y = -9.8\ \text{m s}^{-2},\quad t = ?$
$v_y = u_y + a_y t$
$0 = 6.47 + (-9.8)t$
$\therefore t = 0.66\ \text{s}$
and hence,
trip time = 2t = 2 × 0.66 = 1.32 s.
That is, the time taken for the tennis ball to complete its flight and strike
the ground is 1.32 s.

To determine the range, follow these steps:


1. Resolve initial velocity, u, into components uy and ux.
2. Analyse the vertical motion to find the trip time as shown above.
3. Now consider the horizontal motion and calculate the range using
∆x = uxt.

SAMPLE PROBLEM 2.6c Calculating the range of the tennis ball’s trajectory

Referring once again back to the tennis ball in sample problem 2.6a,
calculate the range of its trajectory.

SOLUTION
This time, consider just the horizontal portion of its motion:
$u_x = 24.2\ \text{m s}^{-1},\quad \text{trip time } t = 1.32\ \text{s},\quad \text{range} = ?$
$\Delta x = u_x t = 24.2 \times 1.32 = 31.9\ \text{m} \approx 32\ \text{m}$
That is, the tennis ball covered 32 m before striking the ground.
Note: Tennis players are able to strike a ball at this sort of velocity and
higher, and still keep it within the court, because they impart spin to the
ball, which can dramatically alter the trajectory. This course does not go
into the effects of spin.
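
The three strategies above (maximum height, trip time and range) can be
combined into one short Python sketch. It assumes no air resistance,
g = 9.8 m s⁻², and that the projectile lands at the same height from which
it was launched; the function name flight_summary is hypothetical.

```python
import math

def flight_summary(speed, angle_deg, g=9.8):
    """Return (max_height, trip_time, range) for a projectile that is
    launched from, and lands at, ground level."""
    angle = math.radians(angle_deg)
    ux = speed * math.cos(angle)
    uy = speed * math.sin(angle)
    max_height = uy ** 2 / (2 * g)  # from vy^2 = uy^2 + 2*ay*dy with vy = 0
    trip_time = 2 * uy / g          # twice the time taken to reach the peak
    horizontal_range = ux * trip_time
    return max_height, trip_time, horizontal_range

# Tennis ball from sample problems 2.6a to 2.6c: 25 m/s at 15 degrees.
height, time, distance = flight_summary(25, 15)
print(f"height = {height:.1f} m, time = {time:.2f} s, range = {distance:.0f} m")
```

The code carries unrounded values from step to step, so small differences
from the hand-worked answers come from intermediate rounding only.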

Air resistance
In all of our work on projectile motion we have ignored the effect of air
resistance on the motion of the projectile. The reason for this is that it is
simply too difficult for us to account for, since it depends on many factors
such as the shape, surface area and texture of the projectile, as well as its
velocity through the air. In the real world, air resistance acts as a
retarding force in both the vertical and horizontal directions. As a result,
the path of the projectile is distorted away from a perfect parabola to the
shape shown in figure 2.12.

Figure 2.12 Air resistance (Fa.r.) opposes the velocity of a projectile at
any given moment and distorts the trajectory away from a parabolic shape.
The figure compares the path of a projectile with air resistance against
the path without air resistance.

2.2 ESCAPE VELOCITY
Isaac Newton wrote that it should be possible to launch a projectile fast
enough so that it achieves an orbit around the Earth. As shown in figure
2.13, his reason was that a stone thrown from a tall tower will cover a
considerable range before striking the ground. If it is thrown faster, it will
travel further before stopping. If thrown faster still, it will have an even
greater range. If thrown fast enough then, as the stone falls, the Earth’s
surface curves away, so that the falling stone never actually lands on the
ground and orbits the Earth. It was only a thought experiment, of course.
He had no way of testing this idea but it does hit upon one important
fact: for any given altitude, there is a specific velocity required for
any object to achieve a stable circular orbit.

Figure 2.13 Newton’s suggestion for achieving an orbit (paths A to E of a
stone thrown from a tower at increasing speeds)

If this specific velocity is exceeded slightly, then the object will follow
an elliptical orbit around the Earth. If the specific velocity is exceeded
further still, then the object will follow a parabolic or hyperbolic path
away from the Earth. This is the manner in which space probes depart
the Earth and head off into space.
We will now consider a situation similar to Newton’s. Imagine throwing
a stone directly up. When thrown, the stone will rise to a certain height
before falling back to Earth. If thrown faster, it will rise higher. If thrown
fast enough, it should rise up and continue to rise, slowing down but
never falling back to Earth, and finally coming to rest only when it has
completely escaped the Earth’s gravitational field. The initial velocity
required to achieve this is known as escape velocity.

Escape velocity is the initial velocity required by a projectile to rise
vertically and just escape the gravitational field of a planet.



By considering the kinetic and gravitational potential energy of a
projectile, it can be shown mathematically that the escape velocity of a
planet depends only upon the universal gravitational constant, the mass and
the radius of the planet. Their relationship is shown in the equation that
follows.
$\text{Escape velocity} = \sqrt{\dfrac{2Gm_{\text{planet}}}{r_{\text{planet}}}}$
where
G = universal gravitational constant = 6.67 × 10⁻¹¹ N m² kg⁻²
mplanet = mass of the planet (kg)
rplanet = radius of the planet (m).
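
A sketch of the energy argument is given here for reference. It assumes the
gravitational potential energy expression $E_p = -Gm_{\text{planet}}m/r$ and
a projectile that only just comes to rest at a very great distance, where
both its kinetic and potential energy are zero. The symbols m (mass of the
projectile), $E_k$ and $E_p$ are labels introduced for this sketch.

```latex
\begin{align*}
(E_k + E_p)_{\text{launch}} &= (E_k + E_p)_{\text{far away}} \\
\tfrac{1}{2} m v^2 - \frac{G m_{\text{planet}}\, m}{r_{\text{planet}}} &= 0 + 0 \\
v^2 &= \frac{2 G m_{\text{planet}}}{r_{\text{planet}}} \\
v &= \sqrt{\frac{2 G m_{\text{planet}}}{r_{\text{planet}}}}
\end{align*}
```

The mass of the projectile cancels, which is why the escape velocity
depends only on G and on the mass and radius of the planet.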
We are now in a position to calculate the escape velocity for Earth:
$\text{Escape velocity} = \sqrt{\dfrac{2Gm_{\text{Earth}}}{r_{\text{Earth}}}}$
$= \sqrt{\dfrac{2(6.67 \times 10^{-11})(5.97 \times 10^{24})}{6.38 \times 10^{6}}}$
$= 11\,200\ \text{m s}^{-1} \approx 40\,000\ \text{km h}^{-1}$
That is, the escape velocity on Earth is about 40 000 km h⁻¹. This is a
considerable velocity, but remember that this is the velocity with which a
projectile must be launched directly up in order to completely escape the
Earth’s gravitational field. It does not apply to a rocket, which continues
its thrust well after launch.

SAMPLE PROBLEM 2.7 Calculating escape velocity

Determine the escape velocity of the planet Venus, given that its mass is
4.87 × 10²⁴ kg and its radius is 6052 km.

SOLUTION
$\text{Escape velocity} = \sqrt{\dfrac{2Gm_{\text{Venus}}}{r_{\text{Venus}}}}$
$= \sqrt{\dfrac{2(6.67 \times 10^{-11})(4.87 \times 10^{24})}{6.052 \times 10^{6}}}$
$= 10\,360\ \text{m s}^{-1} \approx 37\,300\ \text{km h}^{-1}$
That is, escape velocity on the planet Venus is approximately 37 300 km h⁻¹.
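
The escape velocity equation is easy to evaluate for any planet. The
Python sketch below is an illustration only; the function name
escape_velocity is an assumption, and the masses and radii are the values
quoted in this section.

```python
import math

G = 6.67e-11  # universal gravitational constant (N m^2 kg^-2)

def escape_velocity(mass_kg, radius_m):
    """Return the escape velocity (m/s) from the surface of a planet."""
    return math.sqrt(2 * G * mass_kg / radius_m)

planets = {
    "Earth": (5.97e24, 6.38e6),
    "Venus": (4.87e24, 6.052e6),  # radius of 6052 km converted to metres
}
for name, (mass, radius) in planets.items():
    v = escape_velocity(mass, radius)
    print(f"{name}: {v:.0f} m/s = {v * 3.6:.0f} km/h")
```

Multiplying by 3.6 converts metres per second to kilometres per hour.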

2.3 LIFT-OFF
Let us now turn our attention to powered projectiles, that is, rockets.
Whereas projectiles receive an initial velocity and are then left to fall
through a trajectory, rockets receive a force called thrust from their
engine(s) for a significant portion of their upwards flight, and become
more conventional projectiles only after their engines are exhausted.

Thrust is the force delivered to a rocket by its engines.

Rockets
A rocket engine is different from most other engines in that it carries
with it both its fuel and oxygen supply. Any fuel needs oxygen to burn
and most engines, such as jet engines or internal combustion engines,
obtain the necessary oxygen from the air around them. However, in
space there is no air or other atmosphere, which makes a rocket engine
the natural choice.
Modern rockets can use either solid or liquid propellants. Solid rocket
propellant is a manufactured mixture of a fuel, such as a mixture of
hydrogen compounds and carbon, with an oxidiser, or oxygen supply,
being a mixture of oxygen compounds. The dry, solid propellant is
packed into an insulated cylindrical vessel, usually with a hollow core
through its middle. The hollow core is not necessary, but it increases the
surface area available for burning, and therefore the thrust. The end of
the cylinder is fitted with a nozzle. Finally, an igniter built into the
cylinder sparks off the rapid burning of the propellant. Hot gases are
produced at an extreme rate and are forced out through the nozzle.

Figure 2.14 The structure of a solid-fuel rocket (labelled parts: insulated
casing, solid mixture of fuel and oxidiser, hollow core, nozzle)

Liquid-propellant rockets keep both the liquid fuel, such as kerosene
or liquid hydrogen, and the oxidiser, usually liquid oxygen, in separate
storage tanks. Pumps force each liquid from its tank and spray it
into a combustion chamber where the two mix as they burn, producing the
hot gases that are expelled out through a nozzle (see figure 2.15).

Figure 2.15 The structure of a liquid-fuel rocket (labelled parts: liquid
fuel, liquid oxidiser, pumps, combustion chamber, nozzle)

The forward motion of the rocket can be understood by recalling the
Law of Conservation of Momentum. This law states that during any
interaction in a closed system the total momentum of the system remains
unchanged. Stated another way, this means that during a launch, the
momentum of the gases shooting out of the rear of the rocket must be
equal in magnitude, and opposite in direction, to the forward momentum
gained by the rocket itself, as shown in figure 2.16. This means that during
any one-second time interval:
Total change in momentum = 0
$\therefore -\Delta p_{\text{gases}} = \Delta p_{\text{rocket}}$
$-\Delta(mv)_{\text{gases}} = \Delta(mv)_{\text{rocket}}$
where
Δp = change in momentum (kg m s⁻¹)
m = mass (kg)
v = velocity (m s⁻¹).
This means that the backward momentum of the gases (−Δp) is exactly
equal in magnitude to the forward momentum of the rocket (+Δp),
endowing the rocket with forward velocity. It is important to note that,
while the mass of the gases during any given second is less than the mass
of the rocket, their velocity is much greater, so that their momenta are
equal but opposite. You should also recall that:
Δp = impulse = Ft
where
F = force (N)
t = time (s)
so that
$-(Ft)_{\text{gases}} = (Ft)_{\text{rocket}}$
or, for any one-second interval,
$-F_{\text{gases}} = F_{\text{rocket}}$.
This is Newton’s Third Law of Motion. This law says that for every
force there is an equal but opposite force, and this is also the case here.
The rocket is forcing a large volume of gases backward behind it, and the
gases, in turn, force the rocket forward as shown in figure 2.16. Although
the two forces are equal and opposite, the rocket experiences just one of
them: the forward push that we call thrust.

Figure 2.16 Momentum and force acting on a rocket (−p gases and −F gases
directed backward; +p rocket and +F rocket directed forward)
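
The momentum and impulse relationships above can be illustrated with a
short Python sketch. The numbers used (250 kg of gas expelled at
2500 m s⁻¹ during one second) are hypothetical values chosen only for
illustration, and the function name rocket_impulse is an assumption.

```python
def rocket_impulse(gas_mass, gas_speed, interval=1.0):
    """Return (dp_gases, dp_rocket, thrust) for gases expelled backward
    during the given time interval. Forward is the positive direction."""
    dp_gases = -gas_mass * gas_speed  # gases gain backward momentum
    dp_rocket = -dp_gases             # total change in momentum is zero
    thrust = dp_rocket / interval     # from dp = F * t
    return dp_gases, dp_rocket, thrust

# Hypothetical values: 250 kg of gas leaving at 2500 m/s in one second.
dp_gases, dp_rocket, thrust = rocket_impulse(250, 2500)
print(f"dp gases = {dp_gases} kg m/s, dp rocket = {dp_rocket} kg m/s")
print(f"thrust = {thrust} N")
```

The sketch treats the gas speed as constant over the interval, which is a
simplification.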



PHYSICS IN FOCUS
The engines of the space shuttle
The thrust of a solid-fuel rocket engine cannot be varied once started,
since the fuel is ignited and burns at the maximum possible rate until it is
exhausted. The thrust of a liquid-fuel engine can be throttled to some
extent, by varying the amount of fuel and oxidiser that enter the
combustion chamber, allowing some control over the thrust delivered by the
engine and the resultant rate of acceleration.
In figure 2.1 you can clearly see the solid rocket boosters either side of
the space shuttle. After about two minutes of firing, they are exhausted and
then separate from the shuttle; they fall into the ocean and are recovered
for reuse. The large brown tank in the middle is the external tank, which
holds the liquid fuel for the shuttle’s three engines on the upward
journey. Once in orbit, this tank is released; it eventually falls into the
ocean but is not recovered. The liquid-fuel engines can be throttled and
this gives the shuttle the ability to vary the launch thrust between 50%
and 100%, an important feature that minimises the forces experienced by
the crew.

Thrust and acceleration


A rocket at various points in its lift-off and flight is shown in figure 2.17.
As it is a mass subject to several forces, it will accelerate according to
Newton’s Second Law:
$\Sigma F = ma$
$\therefore a = \dfrac{\Sigma F}{m}$

Figure 2.17 The forces and acceleration a rocket is subjected to during a
launch. Also shown is the g force experienced by the astronauts within.
(W = weight, R = reaction, T = thrust.) The values shown for successive
points in the launch, from the launch pad (left) through to orbit and far
space (right), are:

Quantity                 Values at successive points
R                        W     W/2   0     0      0     0     0
T                        0     W/2   W     1.5W   3W    0     >0
a = ΣF/m                 0     0     0     0.5g   2g    −g    T/m
g force = (g + a)/9.8    1     1     1     1.5    3     0     T/(9.8m)

As shown in figure 2.17, the rocket is subject to the following forces:
• its weight force directed downward
• its thrust (the force delivered by the engines) directed upward
• the reaction force of the ground on the rocket (equal to the difference
between the weight and the thrust while the rocket is on the ground)
directed upward
• air resistance directed downward against the motion of the rocket once
it has left the ground. This air resistance force can become significant
as the speed of the rocket builds, but at the relatively low speeds of early
lift-off its effect can be ignored.

SAMPLE PROBLEM 2.8 Calculating a model rocket’s lift-off acceleration

A model rocket has a mass of 100.0 g and is able to produce a thrust of
4.50 N. Determine its initial rate of acceleration upon lift-off.

SOLUTION
$a = \dfrac{\Sigma F}{m} = \dfrac{T - mg}{m}$
$= \dfrac{4.50 - (0.100 \times 9.8)}{0.100}$
$= 35\ \text{m s}^{-2}$
That is, the rocket’s initial rate of acceleration will be 35 m s⁻².
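
The calculation in sample problem 2.8 can be written as a small Python
function. This is a sketch that ignores air resistance, as the text does at
lift-off; the function name liftoff_acceleration is hypothetical.

```python
def liftoff_acceleration(thrust, mass, g=9.8):
    """Return the upward acceleration (m/s^2) of a rocket of the given
    mass (kg) producing the given thrust (N), ignoring air resistance."""
    net_force = thrust - mass * g  # thrust up, weight down
    return net_force / mass

# Model rocket from sample problem 2.8: 100.0 g and 4.50 N.
a = liftoff_acceleration(4.50, 0.100)
print(f"a = {a:.1f} m/s^2")
```

Note that the mass must be converted to kilograms before it is used.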

A rocket’s acceleration will not be constant, however, because fuel
constitutes up to 90% of the mass of a typical rocket. As the fuel is burnt,
the mass of the rocket decreases although the thrust remains essentially
constant. Additionally, the gravitational field vector, g, reduces slightly
with increasing altitude, as seen in chapter 1. The result is that a rocket’s
rate of acceleration will increase as its flight progresses, so that its
velocity climbs ever more steeply rather than at a steady rate (the gain in
velocity depends on the logarithm of the ratio of the rocket’s initial mass
to its current mass).
Consequently, the acceleration equation above can apply only at an
instant in time, provided the mass and thrust are known at that instant.
More detailed rocket equations exist for the enthusiast which allow the
calculation of a rocket’s velocity, maximum height and required fuel
load; however, this theory is beyond the scope of the HSC physics course.
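
The way acceleration grows as fuel is consumed can be illustrated by
applying the acceleration equation at several instants. The Python sketch
below assumes constant thrust, a constant rate of fuel consumption,
g = 9.8 m s⁻² and no air resistance. The rocket used (0.100 kg at launch,
including 0.020 kg of fuel burnt over 2.0 s) is hypothetical, as is the
function name acceleration_profile.

```python
def acceleration_profile(thrust, initial_mass, fuel_mass, burn_time,
                         steps=4, g=9.8):
    """Return a list of (time, acceleration) pairs during the burn,
    assuming fuel is consumed at a constant rate."""
    profile = []
    for i in range(steps + 1):
        t = burn_time * i / steps
        mass = initial_mass - fuel_mass * (t / burn_time)  # mass remaining
        profile.append((t, (thrust - mass * g) / mass))
    return profile

# Hypothetical model rocket: 4.50 N thrust, 0.100 kg including 0.020 kg fuel.
for t, a in acceleration_profile(4.50, 0.100, 0.020, 2.0):
    print(f"t = {t:.1f} s, a = {a:.1f} m/s^2")
```

Each value is valid only at its own instant, which is the point made in
the text above.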

g forces
Your body is a mass lying somewhere within a gravitational field, and
therefore experiences a true weight, W = mg. The sensation of weight that
you feel, however, derives from your apparent weight, which is equal to
the sum of the contact forces resisting your true weight. This includes the
normal reaction force of the floor on your body, or the thrust of a rocket
engine.
The term ‘g force’ is used to express a person’s apparent weight as a
multiple of his/her normal true weight (that is, weight when standing on
the surface of the Earth).
Hence, $g\ \text{force} = \dfrac{\text{apparent weight}}{\text{normal true weight}}$.
Figure 2.18 shows the forces acting upon an astronaut during a launch.
The astronaut’s body is exerting a downward weight force on the floor,
and the floor meets this with an upward reaction force equal to m × g. In
addition, the floor is exerting an upward accelerating force equal to m × a.
The astronaut feels an apparent weight = mg + ma.

Figure 2.18 The forces acting on an astronaut during a launch (upward
forces: reaction = mg, and ma; downward force: W = mg)

Therefore $g\ \text{force} = \dfrac{\text{apparent weight}}{\text{normal true weight}} = \dfrac{mg + ma}{9.8m}$
and hence, $g\ \text{force} = \dfrac{g + a}{9.8}$
where
g = acceleration due to gravity at altitude (m s⁻²)
a = acceleration of the rocket (m s⁻²)
m = mass of astronaut (kg).
Note that g force is closely related to acceleration.
It is common to experience variations in g force when riding up or
down in an elevator, so let us compare this situation with that for an
astronaut during launch. This is shown in figure 2.19. When the rocket
and elevator are both stationary, the only forces acting are the weight and
reaction force, which are equal in magnitude but opposite in direction.
In this case, the apparent weight equals the true weight and the occupant
experiences a g force of one (that is, a one g load).

Figure 2.19 An occupant of an elevator experiences types of forces that
are similar to those experienced by an astronaut during a launch. The
panels pair the rocket with the elevator (F = force of the floor on the
occupant, W = weight):
Before lift-off / Stationary: F = W ∴ g force = 1
Lift-off / Accelerating up: F > W ∴ g force > 1
Accelerating down: F < W ∴ g force < 1
Between rocket stages / Free fall: F = 0 ∴ g force = 0
When the elevator begins to accelerate upwards, it is analogous to the
rocket lifting off. The floor will exert an upwards force on the occupant
of (mg + ma) so that the occupant experiences a g force of
$\dfrac{g + a}{9.8}$, which is a value greater than one.
When the elevator accelerates downwards, the floor exerts an upward
force less than the occupant’s weight, so that the g force experienced is
$\dfrac{g - a}{9.8}$, which is less than one. If the elevator were in free fall,
the downward rate of acceleration would be g, so that the g force would
have a value of zero. In other words, the floor would exert no force on the
occupant, and the occupant would experience a zero apparent weight, that
is, weightlessness within the accelerating frame of reference of the
elevator.
This situation is analogous to a multi-stage rocket after it has jettisoned
a spent stage but before it has ignited the next. During those few seconds
there is only the downward acceleration due to gravity, so that the
astronauts experience a zero g load (weightlessness).
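
The elevator cases can be compared with a short Python sketch of the g
force equation. It treats upward acceleration as positive, so a downward
acceleration is entered as a negative value; the function name g_force and
the elevator accelerations of 2.0 m s⁻² are assumptions for illustration.

```python
def g_force(a, g=9.8):
    """Return the g force for a vertical acceleration a (m/s^2, upward
    positive) where the local acceleration due to gravity is g."""
    return (g + a) / 9.8

cases = {
    "stationary": 0.0,
    "accelerating up": 2.0,     # hypothetical elevator acceleration
    "accelerating down": -2.0,  # hypothetical elevator acceleration
    "free fall": -9.8,
}
for label, a in cases.items():
    print(f"{label}: g force = {g_force(a):.2f}")
```

Using a signed acceleration means one expression covers both the (g + a)
and (g − a) forms given in the text.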

SAMPLE PROBLEM 2.9 Calculating the g force on a model rocket

A model rocket has a pre-launch mass of 94.2 g, of which 6.24 g is solid
propellant. It is able to deliver a thrust of 4.15 N for a period of 1.2 s.
Assuming that the rocket is fired directly up, determine:
(a) the initial rate of acceleration and g force
(b) the final rate of acceleration and g force just prior to exhaustion of
the fuel.

SOLUTION
We shall assume up to be the positive direction, and that g = 9.8 m s⁻² at
the relatively low height achieved by this rocket.
(a) Determine initial acceleration as follows:
$a = \dfrac{\Sigma F}{m} = \dfrac{T - mg}{m}$
$= \dfrac{4.15 - (0.0942 \times 9.8)}{0.0942}$
$= 34\ \text{m s}^{-2}$
Also, g force can be determined as follows:
$g\ \text{force} = \dfrac{g + a}{9.8} = \dfrac{9.8 + 34.3}{9.8} = 4.5$
(b) Determine final acceleration as follows:
Final mass = 94.2 − 6.24 = 88.0 g
Hence, $a = \dfrac{\Sigma F}{m} = \dfrac{T - mg}{m}$
$= \dfrac{4.15 - (0.0880 \times 9.8)}{0.0880}$
$= 37\ \text{m s}^{-2}$
Also, final g force can be determined as follows:
$g\ \text{force} = \dfrac{g + a}{9.8} = \dfrac{9.8 + 37.4}{9.8} = 4.8$
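
Both parts of sample problem 2.9 follow the same two steps, so they can be
checked with one Python function. This sketch assumes g = 9.8 m s⁻² and no
air resistance; the function name accel_and_g_force is hypothetical.

```python
def accel_and_g_force(thrust, mass, g=9.8):
    """Return (acceleration, g force) for a rocket fired directly up."""
    a = (thrust - mass * g) / mass
    return a, (g + a) / 9.8

launch_mass = 0.0942               # 94.2 g in kg
final_mass = launch_mass - 0.00624  # after 6.24 g of propellant is burnt

a_initial, g_initial = accel_and_g_force(4.15, launch_mass)
a_final, g_final = accel_and_g_force(4.15, final_mass)
print(f"initial: a = {a_initial:.1f} m/s^2, g force = {g_initial:.1f}")
print(f"final:   a = {a_final:.1f} m/s^2, g force = {g_final:.1f}")
```

When g is 9.8 m s⁻² the g force reduces to T/(9.8m), so it depends only on
the thrust and the mass at that instant.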
As shown in figure 2.20, riders on a roller-coaster also experience
variations in g force. When a roller-coaster zooms down through a dip in its
track and turns upward, the riders experience an upward acceleration
and, hence, a g force greater than one. However, when rolling over the
top of a crest in the track and accelerating downhill, the riders will
experience a downward acceleration and a g force less than 1.

Figure 2.20 Riders on a roller-coaster experience changing g forces
(g force < 1 over a crest; g force > 1 through a dip).

This idea has been used to provide training for astronauts with a
simulated weightless environment. The subjects sit within an aircraft,
which flies a trajectory very similar to that of a roller-coaster ride, as
shown in figure 2.21. At first the plane flies down in a shallow dive
before turning hard up into a 50° climb. This turn will create a g load
of approximately 2.25, but soon after the pilot pushes the nose over
and throttles the engines down, turning the aircraft into a free-falling
projectile following a parabolic path. During this phase the subjects
within experience about half a minute of near-weightlessness before the
pilot needs to throttle up the engines again to recover the dive and
repeat the process.

Figure 2.21 Aircraft trajectory to simulate near-weightlessness (typical
altitude 6 to 9 km; about 2.25 g at the start and end of each parabola,
near zero g force over the top)

Variations in acceleration and g forces during a typical launch
As shown in figure 2.17, prior to lift-off a rocket has zero acceleration
because of the balance that exists between the weight force and the
reaction force plus thrust. The astronaut within is experiencing a one g
load. This initial condition will not change until the building thrust
exceeds the weight of the rocket, at which point the rocket will lift off.
Since the thrust now exceeds the weight, there is a net force upwards
on the rocket, which begins to accelerate upwards. The g force
experienced by the rocket will have a value slightly greater than one. From
this point onwards, the mass of the rocket begins to decrease as fuel is
consumed and, hence, the rate of acceleration and subsequent g force
steadily climb, reaching maximum values just before the rocket has
exhausted its fuel.
At this point a single-stage rocket becomes a projectile, eventually
falling to Earth. A multi-stage rocket, however, drops the spent stage
away, momentarily experiencing zero g conditions as it coasts. The
second-stage rocket fires and quickly develops the necessary thrust to
exceed the effective weight at its altitude, and then starts to accelerate
again. The g force experienced by the rocket and astronaut begins again
at a value marginally greater than one and gradually builds to its
maximum value just as the second-stage fuel supply is exhausted. If there
is a third stage, the process is repeated.
The variation in g forces during the launch of Saturn V, a large
three-stage rocket used to launch the Apollo spacecraft, is shown in figure
2.22. Note that the jagged peaks on the graph are due to the sequential
shutdown of the multiple rocket engines of each stage, a technique
designed specifically to avoid extreme g forces.

2.2 Acceleration and load during the Apollo 10 launch

This figure shows that Apollo astronauts experienced a peak load of four g
during lift-off. This is a significant force: at four g a person begins to
lose their colour vision and peripheral vision. Soon after this the person
will black out, although each individual has their own threshold level. In
the first manned US space flight, astronaut Alan Shepard had to tolerate a
peak g force of 6.3 g during launch. Rocket design has improved since then;
space shuttle astronauts never experience loads greater than three g due to
the shuttle’s ability to throttle back its liquid fuel engines. This is
discussed in more detail in chapter 3.

Figure 2.22 Variations in g forces during an Apollo–Saturn V launch (g
force, from 0 to 4, plotted against seconds since lift-off, from 0 to 800,
across the 1st, 2nd and 3rd stages; sequential shutdown of multiple
engines avoids excessive peaks)

The effect of the Earth’s motion on a launch
Why do cricket fast bowlers run up to the wicket before bowling the ball?
The answer is that the velocity at which the ball is bowled is greater than
it would have been if the bowler had not run up. This is because the
velocity of the ball relative to the ground is equal to the velocity of the
ball relative to the bowler, plus the velocity of the bowler relative to the
ground. Algebraically, this is expressed as:
${}_{\text{ball}}v_{\text{ground}} = {}_{\text{ball}}v_{\text{bowler}} + {}_{\text{bowler}}v_{\text{ground}}$
In other words, a moving platform (the bowler) offers a boost to the
velocity of a projectile (the ball) launched from it, if launched in the
direction of motion of the platform.
The same principle applies to a rocket launched from the Earth. Consider
that the Earth is revolving around the Sun at approximately
107 000 km h⁻¹ relative to the Sun. In addition, the Earth rotates once on
its axis per day so that a point on the Equator has a rotational velocity of
approximately 1700 km h⁻¹ relative to the Earth’s centre. Hence, the Earth
is itself a moving platform with two different motions which can be
exploited in a rocket launch to gain a boost in velocity.
Engineers planning to launch a rocket into orbit can exploit the
Earth’s rotation in order to achieve the velocity needed for a stable orbit.
This is done by launching in the direction of the Earth’s rotation; that is,
by launching toward the east, as shown in figure 2.23. In this way, the
rotational velocity of the launch site adds to the velocity that the rocket
gains relative to the launch site, to produce a higher orbital velocity
relative to the Earth’s centre.

Figure 2.23 A rocket heading into orbit is launched to the east to receive a
velocity boost from the Earth’s rotational motion.
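
The size of the boost available from the Earth’s rotation can be estimated
with a short Python sketch. It assumes a spherical Earth of radius
6.38 × 10⁶ m rotating once every 24 hours, and a launch site at sea level;
the function name rotational_speed_kmh and the sample latitude of 28.5° are
assumptions for illustration.

```python
import math

EARTH_RADIUS = 6.38e6        # metres, as used earlier in this chapter
ROTATION_PERIOD = 24 * 3600  # seconds, assuming one rotation per 24 hours

def rotational_speed_kmh(latitude_deg):
    """Return the eastward speed (km/h) of a point on the Earth's surface
    at the given latitude, due to the Earth's rotation alone."""
    circle_radius = EARTH_RADIUS * math.cos(math.radians(latitude_deg))
    speed = 2 * math.pi * circle_radius / ROTATION_PERIOD  # m/s
    return speed * 3.6

print(f"Equator: {rotational_speed_kmh(0):.0f} km/h")
print(f"Latitude 28.5 degrees: {rotational_speed_kmh(28.5):.0f} km/h")
```

The boost is greatest at the Equator and falls to zero at the poles, which
is one reason launch sites tend to be placed at low latitudes.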
In a similar way, engineers planning a rocket mission heading further
into space can exploit the Earth’s revolution around the Sun by planning
the launch for a time of year when the direction of the Earth’s
orbital velocity corresponds to the desired heading. Only then is the
rocket launched up into orbit. The rocket is allowed to proceed around
its orbit until the direction of its orbital velocity corresponds with the
Earth’s, and then its engines are fired to push it out of orbit and further
into space, as shown in figure 2.24. In this way the Earth’s orbital
velocity relative to the Sun adds to the rocket’s orbital velocity relative to
the Earth, to produce a higher velocity achieved by the rocket relative to
the Sun.

Figure 2.24 The flight of a rocket heading into space is timed so that it
can head out in the direction of the Earth’s motion and thereby receive an
extra boost.



Planning of this sort clearly favours certain times of the year over
others, or even certain times of day depending upon the flight mission.
These favourable periods are referred to as launch windows.

PHYSICS IN FOCUS
Space exploration and rocket science pioneers
The Chinese discovered gunpowder and used it to create fireworks. By the
eleventh century, the Chinese were using simple rockets called fire
arrows as weapons. In the late 1700s a British artillery officer, William
Congreve, developed simple rockets for use by the British army. The Hale
rockets followed 50 years later. Despite all of this, modern rocket science
didn’t begin in earnest until the late 1800s and early 1900s. Listed here
are some of the most notable pioneers.

Konstantin Tsiolkovsky (1857–1935) was a Russian mathematics teacher who
took an interest in rocketry, being inspired by Jules Verne’s book From the
Earth to the Moon. Working entirely on his own, he developed precise
calculations for space flight and the details of many aspects of rocket
design and space exploration. His work was purely theoretical as he
performed no experiments, but his published work influenced rocket
development around the world, especially in Russia. His ideas were wide
ranging: from the very pragmatic, such as the design of a liquid-fuel
rocket engine featuring throttling capability and multi-staging, to the
(then) fanciful, such as space stations and artificial gravity, terraforming
of other planets and extraterrestrial life. He was the inspiration for men
such as Sergei Korolev (1906–1966) who was the Russian Chief Constructor
responsible for Sputnik I and the Vostok rocket. Sputnik I was the world’s
first artificial satellite, while the Vostok rocket was used to send Yuri
Gagarin into a single orbit of the Earth on 12 April 1961. This was the
first time that a person had entered space.

Robert H. Goddard (1882–1945) was an American college professor of physics
with a passion for rocketry. Also inspired by Jules Verne as a boy, Goddard
decided early to dedicate his life to rocketry. Unlike Tsiolkovsky, Goddard
was an engineer and an experimentalist. He conceived ideas then tested
them, patenting those that were successful. He built and tested the
world’s first liquid-fuel rocket, which solved many technical problems such
as fuel valving for throttle, start and stop, fuel injection, engine
cooling and ignition. He was the first to use gyroscopes and vanes for
guidance, and to separate the payload from the rocket in flight and
return it to Earth.

Hermann Oberth (1894–1989) was born in Romania but lived in Germany. Purely
a theorist, he was yet another inspired by Jules Verne. He wrote a doctoral
thesis titled By Rocketry to Space. Although the University of Heidelberg
rejected the thesis, he had the work published privately as a book. It
promptly sold out. The subject of rocketry captured the public’s
imagination and Oberth was himself inspiring a new generation of rocket
scientists. He was an early member of the VfR, or Society for Space Travel,
and published another book titled The Road to Space Travel. This work won an
award and Oberth used the prize money to purchase rocket motors for the
VfR, assisting its development efforts. One of those inspired by Oberth was
Wernher von Braun (1912–1977) who became the rocket engineer responsible
for the development of the V2 rocket, which was used to bomb London during
World War II, and later the Mercury-Redstone rocket which put the first
Americans into space.

Robert Esnault-Pelterie (1881–1957) was a French rocket pioneer. He
published two important books: Astronautics in 1930 and Astronautics
Complement in 1934. He suggested the idea that rockets be used as
long-range ballistic missiles, and the French Army employed him to develop
these rockets. He experimented with various liquid fuels in rocket motors
of his design, starting with liquid oxygen and gasoline, then nitrogen
peroxide and benzene, before attempting liquid oxygen and
tetranitromethane. This last combination caused him a major hand injury.

Theodore von Karman (1881–1963) was born in Hungary but later settled in
America. In the 1930s he became a professor of aeronautics at Caltech.
There he established the ‘Jet Propulsion Laboratory’ dedicated to rocket
work. The JPL still exists today, working closely with NASA and
specialising in exploration of the solar system by space probe.

CHAPTER REVIEW
SUMMARY
• A projectile is any object that is launched into the air.
• The path of a projectile, called its trajectory, has a parabolic shape if air resistance is ignored. The trajectory can be analysed mathematically by regarding the vertical and horizontal components of the motion separately.
• The vertical motion of a projectile is uniformly accelerated motion and can be analysed using these equations:
  $v_y = u_y + a_y t$
  $\Delta y = u_y t + \frac{1}{2} a_y t^2$
  $v_y^2 = u_y^2 + 2 a_y \Delta y$.
• The horizontal motion of a projectile is constant velocity and can be analysed using these equations:
  $v_x = u_x$
  $v_x^2 = u_x^2$
  $\Delta x = u_x t$.
• Escape velocity is the vertical velocity that a projectile would need to just escape the gravitational field of a planet. It is given by the equation:
  Escape velocity $= \sqrt{\frac{2Gm_{planet}}{r_{planet}}}$.
• A rocket is different from a projectile because it continues to be propelled after it is launched, accelerating throughout most of its upward journey. Rockets differ from other engines such as jets because they carry with them the oxygen required to burn their fuel.
• The forward progress of a rocket can be explained using Newton’s Third Law (equal and opposite forces) as well as by the conservation of momentum (total change in momentum equals zero).
• The acceleration of a rocket can be determined using Newton’s Second Law: $\Sigma F = ma$.
• The term ‘g force’ is used to describe apparent weight as a multiple of normal weight, and is closely related to acceleration.
• During a rocket launch the g force experienced by an astronaut increases because the mass of fuel is reducing, even though the thrust remains essentially the same.
• The revolution and rotation of the Earth can be used to provide a launched rocket with an additional boost, allowing it to save fuel in achieving its target velocity.

QUESTIONS
1. Explain why it is that the vertical and horizontal components of a projectile’s motion are independent of each other. Identify any common variables.
2. Describe the trajectory of a projectile.
3. List any assumptions we are making in our treatment of projectile motion.
4. Describe Galileo’s contribution to our knowledge of projectile motion.
5. What is the mathematical significance of vertical and horizontal motions being perpendicular?
6. Describe the strategy you can employ to determine a projectile’s:
   (a) velocity
   (b) maximum height
   (c) trip time
   (d) range.
7. Describe the effect of air resistance on the trajectory of a projectile.
8. A volleyball player sets the ball for a team mate. In doing so she taps the ball up at 5.0 m s⁻¹ at an angle of 80.0° above the horizontal. If her fingers tapped the ball at a height of 1.9 m above the floor, calculate the maximum height to which the ball rises.
9. An ‘extreme’ cyclist wants to perform a stunt in which he rides up a ramp, launching himself into the air, then flies through a hoop and lands on another ramp. The angle of each ramp is 30.0° and the cyclist is able to reach the launch height of 1.50 m with a launching speed of 30.0 km h⁻¹. Calculate:
   (a) the maximum height above the ground that the lower edge of the hoop could be placed
   (b) how far away the landing ramp should be placed.
10. A football is kicked with a velocity of 35.0 m s⁻¹ at an angle of 60.0°. Calculate:
   (a) the ‘hang time’ of the ball (time in the air)
   (b) the length of the kick.
11. A basketball player stands 2.50 m from the ring. He faces the backboard, jumps up so that his hands are level with the ring and launches the ball at 5.00 m s⁻¹ at an angle of 50.0° above the horizontal. Calculate whether he will score.
12. A cannon’s maximum range is achieved with a firing angle of 45.0° above the horizontal. If its muzzle velocity is 750.0 m s⁻¹, calculate the range achieved with a firing angle of:
   (a) 40.0°
   (b) 45.0°
   (c) 50.0°.



13. A coastal defence cannon fires a shell horizontally from the top of a 50.0 m high cliff, directed out to sea as shown in figure 2.25, with a velocity of 1060.0 m s⁻¹. Calculate the range of the shell’s trajectory.

Figure 2.25 [diagram labels: 1060.0 m s⁻¹; 50.0 m]

14. To increase the range of the shell in question 13, the cannon is lifted, so that it now points up at an angle of 45.0° as shown in figure 2.26. Calculate the new range.

Figure 2.26 [diagram labels: 1060.0 m s⁻¹; 45°; 50.0 m]

15. Identify the variables upon which the escape velocity of the Earth depends. If the mass of the Earth were somehow changed to four times its real value, state how the value of the escape velocity would change.
16. Outline Newton’s concept of escape velocity.
17. Calculate the escape velocity of the following bodies, using the data shown in the following table.

BODY     | MASS (kg)  | RADIUS (km) | ESCAPE VELOCITY (m s⁻¹)
Mercury  | 3.3 × 10²³ | 2410        |
Venus    | 4.9 × 10²⁴ | 6052        |
Io       | 8.9 × 10²² | 1821        |
Callisto | 1.1 × 10²³ | 2400        |

18. A certain model rocket has a pre-launch mass of 87.3 g, of which 10.5 g is propellant. It is able to deliver a thrust of 6.10 N. Assuming that the rocket is fired directly up, calculate:
   (a) the initial rate of acceleration and g force
   (b) the final rate of acceleration and g force just prior to exhaustion of the fuel.
19. If a rocket had a mass of 32 000 kg, of which 85% was fuel, and a thrust of 400 000 N, calculate:
   (a) the rate of acceleration and g force at lift-off
   (b) the rate of acceleration and g force just prior to exhaustion of the fuel. Assume it is travelling horizontally and accelerating up to orbital velocity.
20. Identify the stage of a space mission during which an astronaut experiences the greatest g forces. Describe strategies that spacecraft designers can employ to ensure the survival of living occupants as well as delicate payloads.
21. Discuss the manner in which the rotation of the Earth and the revolution of the Earth around the Sun can be utilised by rocket designers.
22. (a) Explain rocket propulsion in terms of the Law of Conservation of Momentum.
   (b) Construct a diagram of a rocket to show the force pair that must exist due to Newton’s Third Law.
PRACTICAL ACTIVITIES
2.1 MODELLING PROJECTILE MOTION

Aim
To model projectile motion by studying the motion of a ball bearing projected onto an inclined plane.

Apparatus
30 cm × 30 cm board
ball bearing
retort stand and clamp
graph paper
carbon paper
30 cm ruler (the ramp)

Theory
Galileo found that he could slow down the action of acceleration due to gravity by rolling a ball down a slope. In this way things happened slowly enough for him to observe them. We are going to use that same strategy to slow down a projectile motion by projecting a ball bearing across an inclined plane. Recall that projectile motion can be considered as the addition of two linear motions at right angles to each other: the horizontal, constant velocity motion and the vertical, constant acceleration motion.
In the horizontal motion: $\Delta x = u_x t$
In the vertical motion: $\Delta y = u_y t + \frac{1}{2} a_y t^2$, but we will let $u_y = 0$.
Thus $\Delta y = \frac{1}{2} a_y t^2$.
We will assume that frictional forces can be neglected.

Figure 2.27 The path of the projectile (ball bearing) is marked as it rolls down the ramp on the carbon paper. [diagram labels: clamp, ball bearing, retort stand, ruler, carbon paper, graph paper, 30 cm × 30 cm board, books for support]

Method
1. Set up the apparatus as shown in figure 2.27.
2. Set up the inclined plane at an angle of approximately 20° and place the graph paper on it so that the ball will enter onto the inclined plane at a major division on the paper.
3. Clamp the ruler so that the ball bearing rolling from it onto the inclined plane will be projected horizontally. Adjust the angle of the ruler so the path of the ball bearing will fit on the graph paper.
4. Having adjusted the apparatus, place a piece of carbon paper on the graph paper and record the motion of the ball bearing projected onto the inclined plane.
5. Remove the carbon paper and highlight the path for easier analysis.
6. We will assume that the horizontal velocity of the ball bearing’s motion remained constant. Therefore, the ball bearing took equal times to travel horizontally between the major divisions on the graph paper. Thus we can arbitrarily call one of these major divisions a unit of time. Beginning at the point where the ball entered the graph paper, label these major divisions 0, 1, 2, 3 . . . time intervals.

Analysis
1. Record and tabulate the distance down the slope that the ball bearing travelled during each time interval.
2. Determine the average speed of the ball bearing down the slope during each time interval. Your answers should be in cm per time unit.
3. Plot a graph of average speed down the slope versus time and determine a value for the acceleration of the ball down the slope. Your answer will be in cm per (time unit)².

Questions
1. What do these graphs indicate about the motion of the ball down the plane?
2. What assumptions have been made in order to obtain these results?
3. How would the path of the ball bearing differ if:
   (a) the inclined plane was raised to a steeper angle while keeping the ramp as it was?
   (b) the angle of the ramp was raised and the inclined plane was kept as it was?
4. The ball moves faster across the bottom of the paper than across the top, which represents an increase in kinetic energy. What is the source of this extra energy? Try to find out why the rolling mass of the ball introduces a problem into this energy conversion.



2.2 ACCELERATION AND LOAD DURING THE APOLLO 10 LAUNCH

Aim
To determine acceleration and load conditions applicable during the launch of Apollo 10.

Theory
Apollo 10 was launched on 18 May 1969, carrying a crew of three: Stafford, Young and Cernan. The rocket used for the Apollo missions was the Saturn V, the largest rocket ever built. Figure 2.28 is a press release diagram from 1969 and we will use it to extract some performance figures.

Note: The diagram distinguishes between the launch vehicle and the spacecraft. The spacecraft differed between missions, but on Apollo 10 it consisted of the Lunar Module (LM) with a mass of 13 941 kg, and the Service Module (SM) and Command Module (CM) with a combined mass of 28 834 kg. The CM was the only part of the entire rocket to return to Earth.

The launch vehicle, the Saturn V rocket itself, was made up of three stages. The first stage consisted of five Rocketdyne F-1 engines, which burned liquid oxygen and kerosene. Upon launch it would burn for approximately 150 s before it was exhausted at an altitude of about 70 km. It then separated from the rocket and tumbled back down to the ocean.

The second stage used five Rocketdyne J-2 engines, which burned liquid oxygen and liquid hydrogen. After separation of the first stage, this second stage would ignite and burn for 365 s before it, too, separated from what remained of the rocket. Separation of stages was always accomplished by several small ‘interstage’ retro-rockets, which literally pulled the stage off. By now the rocket was at an altitude of 185 km with a speed of over 25 000 km h⁻¹.

The third stage, consisting of just one Rocketdyne J-2 engine, was then fired for 142 s before being shut down. The purpose of this burn was to insert the rocket into a 190 km high orbit with an orbital velocity of 28 100 km h⁻¹. The third stage rocket was not yet exhausted (it would be needed later to propel the craft out of orbit and toward the Moon), so let’s assume that half its fuel load was consumed in this burn.

Figure 2.28 Specifications of an Apollo rocket

Method
1. Note that the specifications in figure 2.28 are not SI units. The first task is to extract the information in part 1 of Results, and convert it to SI units. You will need the following conversion factors:
   Height: 1 ft = 0.3048 m
           1 mile = 1.609 km
   Mass:   1 lb = 0.4536 kg
   Thrust: 1 lb = 4.448 N

2. Now use the information just extracted to fill in and complete the table at the bottom of this page. You will need the following formulas:

   Acceleration due to gravity: $g = G\frac{m_E}{(r_E + \text{altitude})^2}$

   Acceleration of a rocket: $a = \frac{T - mg\cos\theta}{m}$
   (Allows for angle other than vertical.)

   Acceleration load or g force $= \frac{T}{9.8m}$
   (To be consistent with above equation.)

   where
   G = universal gravitation constant = 6.67 × 10⁻¹¹ N m² kg⁻²
   m_E = mass of Earth = 5.97 × 10²⁴ kg
   r_E = radius of Earth = 6.38 × 10⁶ m
   T = thrust (N)
   θ = angle of thrust from vertical (°).

Results
1. Extract the following information from the theory above and from figure 2.28:

   Spacecraft
   Height = ____ ft = ____ m
   Mass = ____ lb = ____ kg

   Instrument unit
   Height = ____ ft = ____ m
   Mass = ____ lb = ____ kg

   Third stage
   Height = ____ ft = ____ m
   Mass fuelled = ____ lb = ____ kg
   Mass dry = ____ lb = ____ kg
   Thrust = ____ lb = ____ N

   Second stage
   Height = ____ ft = ____ m
   Mass fuelled = ____ lb = ____ kg
   Mass dry = ____ lb = ____ kg
   Thrust = ____ lb = ____ N

   First stage
   Height = ____ ft = ____ m
   Mass fuelled = ____ lb = ____ kg
   Mass dry = ____ lb = ____ kg
   Thrust = ____ lb = ____ N

   Entire Apollo 10 rocket
   Launch height = ____ m
   Launch mass = ____ kg

2. Complete the results table below.

Questions
1. According to your table, what was the minimum and maximum g load experienced?
   The actual maximum g loads experienced by Apollo astronauts at each stage were never quite as high as this, because they would turn the rocket engines off sequentially, which would remove approximately 0.5 g from the peak. In addition, air resistance would reduce the acceleration and resulting g force. Also, the minimum g loads were lower than those calculated, because between stages the rocket would coast for a few seconds, essentially in free fall, which placed the astronauts temporarily under a zero g load.
2. When do the greatest g loads occur during such a mission? See if you can find out the maximum loads experienced by an Apollo crew.

STAGE                    | TOTAL MASS OF ROCKET (kg) | AVAILABLE THRUST (N) | ALTITUDE (km) | g (m s⁻²) | ASSUMED ANGLE OF THRUST θ (°) | a (m s⁻²) | g FORCE LOAD
Launch / Start 1st stage |                           |                      |               |           | 0                             |           |
End 1st stage            |                           |                      |               |           | 45                            |           |
Start 2nd stage          |                           |                      |               |           | 45                            |           |
End 2nd stage            |                           |                      |               |           | 87                            |           |
Start 3rd stage          |                           |                      |               |           | 87                            |           |
End 3rd stage            |                           |                      |               |           | 90                            |           |
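The three formulas listed in step 2 of the Method can be sketched in a few lines of Python, which may help when checking hand calculations for the results table. This is a minimal sketch only: the function names are our own, and the thrust and mass used in the example are hypothetical round numbers, not values read from figure 2.28.

```python
import math

G = 6.67e-11           # universal gravitation constant (N m^2 kg^-2)
EARTH_MASS = 5.97e24   # mass of Earth (kg)
EARTH_RADIUS = 6.38e6  # radius of Earth (m)


def gravity_at_altitude(altitude_m):
    """g = G * m_E / (r_E + altitude)^2, in m s^-2."""
    return G * EARTH_MASS / (EARTH_RADIUS + altitude_m) ** 2


def rocket_acceleration(thrust_n, mass_kg, g_local, angle_deg):
    """a = (T - m g cos(theta)) / m, with theta measured from the vertical."""
    weight_component = mass_kg * g_local * math.cos(math.radians(angle_deg))
    return (thrust_n - weight_component) / mass_kg


def g_force(thrust_n, mass_kg):
    """Acceleration load = T / (9.8 m), as a multiple of normal weight."""
    return thrust_n / (9.8 * mass_kg)


# Hypothetical example values (NOT taken from figure 2.28):
thrust = 1.0e6   # N
mass = 5.0e4     # kg
g_ground = gravity_at_altitude(0.0)
a_vertical = rocket_acceleration(thrust, mass, g_ground, 0.0)
print(f"g at ground level = {g_ground:.2f} m s^-2")
print(f"a = {a_vertical:.2f} m s^-2, g force = {g_force(thrust, mass):.2f}")
```

Each row of the results table would use its own mass, thrust, altitude and assumed angle in place of the example values.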



CHAPTER 3
ORBITING AND RE-ENTRY

Remember
Before beginning this chapter, you should be able to:
• describe and apply Newton’s Second Law of Motion: $\Sigma F = ma$
• state the definition of momentum: $p = mv$
• state the definition of kinetic energy: $E_k = \frac{1}{2}mv^2$.

Key content
At the end of this chapter you should be able to:
• analyse the forces involved in a range of uniform circular motions, including the motion of a satellite orbiting the Earth
• calculate the centripetal force acting on a satellite
• compare low Earth orbits to geostationary orbits
• define, describe and apply Kepler’s Law of Periods
• define orbital velocity
• solve problems using Kepler’s Law of Periods
• account for the orbital decay of satellites in low Earth orbit
• discuss the problems associated with a safe re-entry into the Earth’s atmosphere and return to the Earth’s surface
• identify the need for an optimum angle of re-entry and the consequences of failing to achieve it.

Figure 3.1 An artist’s impression of the space shuttle in orbit. It occupies a low Earth orbit with an altitude between 250 km and 400 km. Upon re-entry the space shuttle uses a unique flight pattern to minimise the load on its occupants.
After a successful launch the next challenge is to sufficiently accelerate a
spacecraft in order to place it into an orbit around the Earth. There are
several different types and shapes of orbit, although our focus is upon
low and geostationary circular orbits. In order to discover the orbital
velocities required by these different types of orbit we will first look at
circular motion and then apply this theory to orbits.
Most orbiting spacecraft do not need to be returned to Earth,
although they will eventually fall back of their own accord. However,
those with passengers do need to be returned, and in such a way as to
keep the occupants alive. This means dealing with the potentially fatal
problems of re-entering the Earth’s atmosphere — the re-entry angle,
the heat of re-entry and high g forces.

3.1 IN ORBIT
Once a launched rocket has achieved a sufficient altitude above the surface of the Earth, it can be accelerated into the desired orbit. It must attain a specific speed that is dependent upon the mass of the Earth and the geometry of the orbit. If that speed is not reached, the spacecraft will follow a shortened elliptical orbit that dips back down toward the atmosphere, possibly causing immediate re-entry; if the speed is exceeded, the spacecraft will follow an elongated elliptical orbit that takes it away from the Earth. To see why this speed is so crucial we first need to study the simplest orbital motion: a uniform speed along a circular path around the Earth.

[Margin reference: 3.1 Investigating circular motion]

Uniform circular motion


Uniform circular motion is circular motion with a uniform orbital speed. As an example of circular motion, imagine you have a rock tied to a string and are whirling it around your head in a horizontal plane. Because the path of the rock is in a horizontal plane, as shown in figure 3.2, gravity plays no part in its motion. The Greek, Aristotle, considered that circular motion was a perfect and natural motion, but it is far from it. If you were to let go of the string, the rock would fly off at a tangent to the circle, a demonstration of Newton’s First Law of Motion. Newton stated that an object would continue in uniform motion in a straight line unless acted upon by a force. In the case of our rock, the force keeping it within a circular path is the tension in the string, and it is always directed back towards the hand at the centre of the circle. Without that force the rock will travel in a straight line.

Figure 3.2 The string keeps the rock travelling in a circle, and gravity keeps a satellite in a circular orbit. [diagram labels: path; rock; orbital velocity, v; centripetal acceleration, a_C; tension = centripetal force]

The same is true of a spacecraft in orbit around the Earth, or any object in circular motion: some force is needed to keep it there, and that force is directed back towards the centre of the circle. In the case of the spacecraft, it is the gravitational attraction between the Earth and the spacecraft that acts to maintain the circular motion that is the orbit.

Centripetal force is the force that acts to maintain circular motion and is directed towards the centre of the circle.

The force required to maintain circular motion, known as centripetal force, can be determined using the following equation.

Centripetal force, $F_C = \frac{mv^2}{r}$

where
F_C = centripetal force (N)
m = mass of object in motion (kg)
v = instantaneous or orbital velocity of the mass (m s⁻¹)
r = radius of circular motion (m).



Newton’s Second Law states that wherever there is a net force acting on an object there is an associated acceleration. Since this centripetal force is the only force acting on the motion, we can say that:

Centripetal force, $F_C$ = mass × centripetal acceleration, $a_C$

and therefore,

centripetal acceleration, $a_C = \frac{v^2}{r}$.

It may not be immediately apparent to you that the whirling rock in circular motion is accelerating. Consider that at any instant the velocity of the rock is at a tangent to the circle. As it progresses around the circle, the direction of its velocity is constantly changing, even though its magnitude (its speed) remains unchanged. Now recall that velocity is a vector quantity, so to change it you need only change its direction. Hence, the velocity vector of the rock is constantly changing with time. This is the centripetal acceleration and it is also directed towards the centre of the circle, as shown in figure 3.2.

Centripetal acceleration is always present in uniform circular motion. It is associated with centripetal force and is also directed towards the centre of the circle.

Several common circular motions and their centripetal forces are shown in table 3.1. In each case the centripetal force is directed back towards the centre of the circle.

Table 3.1 A comparison of common circular motions

MOTION                           | F_C PROVIDED BY . . .
Whirling rock on a string        | The string
Electron orbiting atomic nucleus | Electron–nucleus electrical attraction
Car cornering                    | Friction between tyres and road
Moon revolving around Earth      | Moon–Earth gravitational attraction
Satellite revolving around Earth | Satellite–Earth gravitational attraction

Centripetal force and acceleration on a whirled rock

SAMPLE PROBLEM 3.1
A rock of mass 250 g is attached to the end of a 1.5 m long string and whirled in a horizontal circle at 15 m s⁻¹. Calculate the centripetal force and acceleration of the rock.

SOLUTION
Centripetal force, $F_C = \frac{mv^2}{r} = \frac{0.250 \times 15^2}{1.5} = 37.5$ N

Centripetal acceleration can be found using its formula or, more simply, using Newton’s Second Law.

$a_C = \frac{F_C}{m} = \frac{37.5}{0.25} = 150$ m s⁻²
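The arithmetic in sample problem 3.1 can be checked with a short Python sketch. The function names below are our own illustrative choices; the numbers are those given in the problem.

```python
def centripetal_force(mass_kg, speed_m_s, radius_m):
    """Return the centripetal force F_C = m v^2 / r in newtons."""
    return mass_kg * speed_m_s ** 2 / radius_m


def centripetal_acceleration(speed_m_s, radius_m):
    """Return the centripetal acceleration a_C = v^2 / r in m s^-2."""
    return speed_m_s ** 2 / radius_m


# Sample problem 3.1: a 250 g rock on a 1.5 m string whirled at 15 m s^-1.
force = centripetal_force(0.250, 15.0, 1.5)
acceleration = centripetal_acceleration(15.0, 1.5)
print(f"F_C = {force:.1f} N, a_C = {acceleration:.0f} m s^-2")
```

Dividing the force by the mass gives the same acceleration, which is the Newton’s Second Law route used in the solution.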

Calculating frictional force on a turning car

SAMPLE PROBLEM 3.2
A car of mass 1450 kg is driven around a bend of radius 70.0 m. Determine the frictional force required between the tyres and the road in order to allow the car to travel at 70.0 km h⁻¹.

SOLUTION
The frictional force between the tyres and the road must provide sufficient centripetal force for the circular motion involved.

Firstly, note that 70.0 km h⁻¹ $= \frac{70.0}{3.6}$ m s⁻¹ = 19.4 m s⁻¹.

Centripetal force, $F_C = \frac{mv^2}{r} = \frac{1450 \times 19.4^2}{70.0} = 7800$ N.

That is, the total frictional force provided by the tyres must be at least 7800 N, or an average force of 1950 N per tyre.
From the previous chapters you will recall that an astronaut in orbit around the Earth still experiences an acceleration due to gravity of about 8.8 m s⁻². This, in turn, means that the astronaut still has significant true weight. It should now be clear that this acceleration due to gravity acts as the centripetal acceleration of the orbital motion and the astronaut’s weight forms the centripetal force. Why, then, does the astronaut feel weightless? It is for the same reason that a person in a falling elevator also experiences weightlessness during the fall.

You should also recall that apparent weight is the sensation of weight created by those forces resisting a body’s true weight. In the case of an astronaut in an orbiting spacecraft, and of a person in a falling elevator, there are no resisting forces acting on the person so that there is no apparent weight. Referring back to figures 2.17 and 2.19, we can see that the acceleration in both cases is −g (taking ‘up’ to be the positive direction). Therefore, the g force experienced is:

g force $= \frac{g + a}{9.8} = \frac{g + (-g)}{9.8} = 0$, that is, zero apparent weight.

Kepler’s third law: the Law of Periods
Johannes Kepler (1571–1630) discovered his third law, the Law of Periods, through trial and error in the very early 1600s. He had access to extraordinarily detailed observations of the motions of the planets made by Tycho Brahe, and in attempting to analyse them he came upon this relationship, among others.

Period, T, is the time taken to complete one orbit.

He expressed this law in this form:

$\left(\frac{r^3}{T^2}\right)_{\text{planet 1}} = \left(\frac{r^3}{T^2}\right)_{\text{planet 2}}$

This relation can be used to compare any two bodies orbiting the same object, for example, any two moons orbiting Jupiter or any two planets orbiting the Sun. An alternative expression of this law is:

$\frac{r^3}{T^2} = k$ for any satellites orbiting a common central mass,



where
r = the radius of the orbit of any given satellite
T = the period of that satellite’s orbit
k = a constant.

In chapter 4 we will see how the Law of Universal Gravitation can be used to derive an expression for the constant k, so that Kepler’s Law of Periods becomes:

$\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$

where
G = the universal gravitation constant = 6.67 × 10⁻¹¹ N m² kg⁻²
M = the mass of the central body.

Students of astrophysics will learn that the variable M in Kepler’s Law of Periods actually refers to the combined mass of the system. The derivation can be found in chapter 16 on page 307. In the case of a satellite, however, M is approximately equal to the mass of the planet being orbited.

Calculating the periods of satellites

SAMPLE PROBLEM 3.3
Calculate the periods of three different satellites orbiting the Earth at altitudes of (a) 250 km, (b) 400 km and (c) 40 000 km. The radius of the Earth is 6.38 × 10⁶ m and the Earth’s mass is 5.97 × 10²⁴ kg.

SOLUTION
(a) At an altitude of 250 km:

$\frac{r^3}{T^2} = \frac{GM_E}{4\pi^2}$

$\frac{(6.38 \times 10^6 + 250 \times 10^3)^3}{T^2} = \frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})}{4\pi^2}$

∴ T = 5375 seconds = 89.6 minutes

Note that the radius of the orbit equals the radius of the Earth plus the altitude.

(b) At an altitude of 400 km:

$\frac{r^3}{T^2} = \frac{GM_E}{4\pi^2}$

$\frac{(6.38 \times 10^6 + 400 \times 10^3)^3}{T^2} = \frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})}{4\pi^2}$

∴ T = 5560 seconds = 92.7 minutes

(c) At an altitude of 40 000 km:

$\frac{r^3}{T^2} = \frac{GM_E}{4\pi^2}$

$\frac{(6.38 \times 10^6 + 40\,000 \times 10^3)^3}{T^2} = \frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})}{4\pi^2}$

∴ T = 99 450 seconds = 27.6 hours
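Rearranging Kepler’s Law of Periods for T gives $T = 2\pi\sqrt{r^3 / (GM)}$, and the three calculations above can be repeated with a short Python sketch. The function name `orbital_period` and the constant names are our own; the constants are the values quoted in this sample problem.

```python
import math

G = 6.67e-11           # universal gravitation constant (N m^2 kg^-2)
EARTH_MASS = 5.97e24   # mass of the Earth (kg)
EARTH_RADIUS = 6.38e6  # radius of the Earth (m)


def orbital_period(altitude_m, central_mass=EARTH_MASS, body_radius=EARTH_RADIUS):
    """Period in seconds from r^3 / T^2 = G M / (4 pi^2)."""
    r = body_radius + altitude_m  # orbit radius = planet radius + altitude
    return 2 * math.pi * math.sqrt(r ** 3 / (G * central_mass))


for altitude_km in (250, 400, 40_000):
    period_s = orbital_period(altitude_km * 1e3)
    print(f"{altitude_km:>6} km: T = {period_s:.0f} s = {period_s / 60:.1f} min")
```

The printed values can differ from the worked answers in the last digit or two, because the worked answers are rounded.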
Orbital velocity
Orbital velocity is the instantaneous direction and speed of an object in circular motion along its path. For uniform circular motion, its magnitude is constant and inversely proportional to the period of the orbit, as follows:

orbital velocity $v = \frac{\text{circumference of the circle}}{\text{period } T}$

$v = \frac{2\pi r}{T}$
This general formula can be applied to any object in circular motion.

Calculating the orbital velocity of a turning car

SAMPLE PROBLEM 3.4
Calculate the orbital velocity of a car that travels completely around a 20.0 m radius roundabout in 8.00 s.

SOLUTION
$v = \frac{2\pi r}{T} = \frac{2\pi \times 20.0}{8.00} = 15.7$ m s⁻¹

Calculating the orbital velocity of satellites

SAMPLE PROBLEM 3.5
Determine the orbital velocities of the three satellites in sample problem 3.3, that is, with orbits of altitude (a) 250 km, (b) 400 km, and (c) 40 000 km.

SOLUTION
(a) At an altitude of 250 km:

$v = \frac{2\pi r}{T} = \frac{2\pi(6.38 \times 10^6 + 250 \times 10^3)}{5375}$ = 7750 m s⁻¹ ≈ 27 900 km h⁻¹

(b) At an altitude of 400 km:

$v = \frac{2\pi r}{T} = \frac{2\pi(6.38 \times 10^6 + 400 \times 10^3)}{5560}$ = 7660 m s⁻¹ ≈ 27 600 km h⁻¹

(c) At an altitude of 40 000 km:

$v = \frac{2\pi r}{T} = \frac{2\pi(6.38 \times 10^6 + 40\,000 \times 10^3)}{99\,450}$ = 2930 m s⁻¹ ≈ 10 550 km h⁻¹
Note the decrease in orbital velocity as the radius of the orbit increases.
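As a rough check on sample problem 3.5, the relation $v = 2\pi r / T$ can be sketched in Python. The function name is our own, and the periods are the rounded values from sample problem 3.3, so the results agree with the worked answers only to about three significant figures.

```python
import math

EARTH_RADIUS = 6.38e6  # radius of the Earth (m)


def orbital_velocity_from_period(orbit_radius_m, period_s):
    """v = 2 pi r / T for uniform circular motion, in m s^-1."""
    return 2 * math.pi * orbit_radius_m / period_s


# (altitude in m, period in s) pairs taken from sample problem 3.3
orbits = [(250e3, 5375), (400e3, 5560), (40_000e3, 99_450)]
for altitude_m, period_s in orbits:
    v = orbital_velocity_from_period(EARTH_RADIUS + altitude_m, period_s)
    # multiplying m s^-1 by 3.6 converts to km h^-1
    print(f"{altitude_m / 1e3:>6.0f} km: v = {v:.0f} m s^-1 = {v * 3.6:.0f} km h^-1")
```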
If this expression for orbital velocity is substituted into Kepler’s Law of Periods, then a formula for orbital velocity emerges that is specific to a satellite.

$v = \frac{2\pi r}{T}$ so $T = \frac{2\pi r}{v}$

∴ $\frac{r^3}{\left(\frac{2\pi r}{v}\right)^2} = \frac{GM}{4\pi^2}$

Rearranging gives

$v = \sqrt{\frac{GM}{r}}$

where
v = orbital velocity (m s⁻¹)
G = universal gravitation constant = 6.67 × 10⁻¹¹ N m² kg⁻²
M = central mass (kg)
r = radius of the orbit (m).
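The rearrangement is stated without its intermediate steps. One way to write them out, using only the symbols already defined, is:

```latex
\begin{align*}
\frac{r^3}{\left(\dfrac{2\pi r}{v}\right)^2} &= \frac{GM}{4\pi^2} \\
\frac{r^3 v^2}{4\pi^2 r^2} &= \frac{GM}{4\pi^2} \\
r v^2 &= GM \\
v &= \sqrt{\frac{GM}{r}}
\end{align*}
```

The factor of $4\pi^2$ cancels from both sides, which is why the period does not appear in the final expression.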



Note that the orbital velocity of a satellite depends on:
• the mass of the planet being orbited
• the radius of the orbit. For a satellite orbiting a planet, this is equal to the radius of the planet plus the altitude of the orbit.

Hence, for the case of a satellite orbiting the Earth, the formula becomes:

$v = \sqrt{\frac{Gm_E}{r_E + \text{altitude}}}$

where
v = orbital velocity (m s⁻¹)
m_E = mass of the Earth = 5.97 × 10²⁴ kg
r_E = radius of the Earth = 6.38 × 10⁶ m
altitude = height of orbit above the ground (m).

It is clear from this formula that, for a satellite orbiting the Earth, altitude is the only variable that determines the orbital velocity required for a specific orbit. Further, the greater the radius of the orbit, the lower that velocity is.

Calculating orbital velocity from altitude

SAMPLE PROBLEM 3.6
Verify the results of sample problem 3.5 by calculating the orbital velocities directly from the altitudes.

SOLUTION
(a) At an altitude of 250 km:

$v = \sqrt{\frac{Gm_E}{r_E + \text{altitude}}} = \sqrt{\frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})}{(6.38 \times 10^6 + 250 \times 10^3)}}$ = 7750 m s⁻¹ ≈ 27 900 km h⁻¹

(b) At an altitude of 400 km:

$v = \sqrt{\frac{Gm_E}{r_E + \text{altitude}}} = \sqrt{\frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})}{(6.38 \times 10^6 + 400 \times 10^3)}}$ = 7660 m s⁻¹ ≈ 27 600 km h⁻¹

(c) At an altitude of 40 000 km:

$v = \sqrt{\frac{Gm_E}{r_E + \text{altitude}}} = \sqrt{\frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})}{(6.38 \times 10^6 + 40\,000 \times 10^3)}}$ = 2930 m s⁻¹ ≈ 10 550 km h⁻¹
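The same verification can be sketched in Python using $v = \sqrt{Gm_E / (r_E + \text{altitude})}$. The function and constant names are our own; the constants are the values used throughout this chapter.

```python
import math

G = 6.67e-11           # universal gravitation constant (N m^2 kg^-2)
EARTH_MASS = 5.97e24   # mass of the Earth (kg)
EARTH_RADIUS = 6.38e6  # radius of the Earth (m)


def orbital_velocity_from_altitude(altitude_m):
    """Circular orbital velocity around the Earth, in m s^-1."""
    return math.sqrt(G * EARTH_MASS / (EARTH_RADIUS + altitude_m))


for altitude_km in (250, 400, 40_000):
    v = orbital_velocity_from_altitude(altitude_km * 1e3)
    print(f"{altitude_km:>6} km: v = {v:.0f} m s^-1 = {v * 3.6:.0f} km h^-1")
```

Because no period is needed, this route avoids carrying a rounded value of T from one calculation into the next.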

Orbital energy
Any satellite travelling in a stable circular orbit at a given orbital radius has a characteristic total mechanical energy E. This is the sum of its kinetic energy $E_k$ (due to its orbital velocity) and its gravitational potential energy $E_p$ (due to its height). The kinetic energy equation is:

$E_k = \frac{1}{2}mv^2$, and we have seen that $v = \sqrt{\frac{GM}{r}}$.

Combining these equations gives a new equation for the kinetic energy of an orbiting satellite:

$E_k = \frac{GMm}{2r}$

where
M = mass of the central body being orbited (kg)
m = mass of the satellite (kg)
r = radius of the orbit (m).

We know from chapter 1 that $E_p = -\frac{GMm}{r}$. Note that for a stable circular orbit the value of $E_k$ is always half the magnitude of $E_p$, but positive in value. An expression for total mechanical energy can now be determined.

Mechanical energy $E = E_k + E_p = \frac{1}{2}\left(\frac{GMm}{r}\right) - \left(\frac{GMm}{r}\right)$

∴ $E = -\frac{1}{2}\left(\frac{GMm}{r}\right)$

This equation looks very similar to the equation for $E_p$ and also represents a negative energy well. The value of the mechanical energy of a satellite orbiting a planet depends only on the masses involved and the radius of the orbit. A lower orbit produces a more negative value of E and, therefore, less energy, while a higher orbit corresponds to more energy.

A useful concept for comparing orbits is the specific orbital energy of a satellite, which is the mechanical energy per kilogram.

Specific orbital energy $\varepsilon = \frac{E}{m} = -\frac{GM}{2r}$ for circular orbits.

Refer to table 3.2, which lists orbital data for several different types of satellites. The first four rows list satellites with near-circular orbits but with increasing radii. Note that the specific orbital energy also increases as radius increases.
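The trend described here can be illustrated with a short Python sketch of $\varepsilon = -GM/(2r)$ for circular orbits around the Earth. The function name is our own, and the two altitudes are illustrative round figures (roughly a low Earth orbit and a geostationary orbit) rather than exact entries from table 3.2.

```python
G = 6.67e-11           # universal gravitation constant (N m^2 kg^-2)
EARTH_MASS = 5.97e24   # mass of the Earth (kg)
EARTH_RADIUS = 6.38e6  # radius of the Earth (m)


def specific_orbital_energy(altitude_m):
    """Mechanical energy per kilogram, -GM / (2r), for a circular orbit (J kg^-1)."""
    r = EARTH_RADIUS + altitude_m
    return -G * EARTH_MASS / (2 * r)


low_orbit = specific_orbital_energy(400e3)      # illustrative low Earth orbit
geo_orbit = specific_orbital_energy(35_787e3)   # illustrative geostationary altitude
print(f"400 km:    {low_orbit / 1e6:.1f} MJ kg^-1")
print(f"35 787 km: {geo_orbit / 1e6:.1f} MJ kg^-1")
```

Both values are negative, and the higher orbit has the less negative value, that is, the greater energy.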

Elliptical orbits
The preceding theory assumes that we are dealing with circular orbits, but that is usually not the case. The most common orbital shape is an ellipse, or oval shape. Kepler also realised this and stated it as his first law. Ellipses can be round or elongated; the degree of stretch is known as eccentricity. Referring to figure 3.3, we can see that eccentricity is defined as the ratio $\frac{c}{2a}$, where c is the distance between the two focuses of the ellipse and a is the semi-major axis. In fact, a circle is an ellipse with an eccentricity of zero. Most satellites are placed into near-circular orbits, as shown in table 3.2, but there are a few notable exceptions. Each of these types of orbits is discussed over the next few pages.

Much of the information shown in table 3.2 can be calculated if the semi-major axis a is known, and this can be determined from the apogee and perigee distances.

Figure 3.3 Various dimensions of an ellipse [diagram labels: Earth at one focus; apogee (furthest point); perigee (closest point); a = semi-major axis; b = semi-minor axis; c = distance between foci; eccentricity = c/2a]



Table 3.2 A sample of various Earth satellites as at February 2008, sorted according to the specific orbital
energy of their orbits.

| Satellite name | Purpose | Orbit description | Specific orbital energy (MJ kg⁻¹) | Period (min) | Inclination (degrees) | Perigee altitude (km) | Velocity at perigee (km h⁻¹) | Apogee altitude (km) | Velocity at apogee (km h⁻¹) | Eccentricity |
|---|---|---|---|---|---|---|---|---|---|---|
| GENESAT | Biological research and amateur radio beacon | Low Earth orbit | −29.4 | 93 | 40 | 397 | 27 590 | 401 | 27 580 | 0.0050 |
| USA 197 | Military eye-in-the-sky | Polar low Earth orbit | −28.4 | 97 | 97.8 | 627 | 27 140 | 630 | 27 120 | 0.0024 |
| IRIDIUM 95 | Satellite phone communication | Low Earth orbit | −28.2 | 98 | 86.6 | 670 | 27 050 | 674 | 27 040 | 0.0030 |
| NOAA 18 | Weather | Polar low Earth orbit | −27.5 | 102 | 98.8 | 845 | 26 740 | 866 | 26 660 | 0.0123 |
| RASCOM 1 | African communications | Transfer orbit to geostationary position | −8.1 | 638 | 5.4 | 587 | 35 650 | 35 745 | 5 900 | 0.9677 |
| MOLNIYA 3–53 | TV and military communications | Elliptical Molniya orbit | −7.5 | 718 | 64.9 | 1 047 | 34 570 | 39 308 | 5 620 | 0.9481 |
| NAVSTAR 59 | Global Positioning System | High altitude GPS orbit | −7.5 | 718 | 55.2 | 20 092 | 13 980 | 20 273 | 13 890 | 0.0045 |
| OPTUS D2 | Australian and NZ television communications | Geostationary orbit | −4.7 | 1436 | 0 | 35 776 | 11 060 | 35 798 | 11 060 | 0.0003 |
| SKYNET 5B | Military communications | Geostationary orbit | −4.7 | 1436 | 0.1 | 35 773 | 11 060 | 35 810 | 11 060 | 0.0005 |

Perigee or periapsis?
In orbital mechanics, the general term for the point of closest approach to the
central body is periapsis, and the furthest point is apoapsis. However,
these terms adapt to the body being orbited. When considering satellites
orbiting the Earth, the terms become perigee and apogee. If orbiting the
Moon, the terms become perilune and apolune; and if the Sun is orbited,
then they are perihelion and aphelion. There is a range of other
similar terms for other celestial objects that can be orbited.

Some relevant equations are:

Semi-major axis $a = \frac{r_A + r_P}{2} = \frac{\text{apogee altitude} + \text{perigee altitude}}{2} + r_E$

Eccentricity $e = \frac{c}{2a} = \frac{r_A - r_P}{r_A + r_P}$

Specific orbital energy $\varepsilon = -\frac{GM}{2a}$

where $r_A$ and $r_P$ are the apogee and perigee distances measured from the
centre of the Earth, and $r_E$ is the radius of the Earth.

The specific orbital energy equation is a more general form than that
given earlier, and has been used to order the satellites in the table. You
should note that as the size of the near-circular orbits increases so too
does the energy. Note also that the specific orbital energy of an elliptical
orbit lies between that of circular orbits that correspond to the ellipse’s
perigee and apogee altitudes.
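
As a rough illustration of these three equations, the sketch below applies them to the RASCOM 1 altitudes from table 3.2. The constants are assumptions (the usual rounded G, with the Earth's mass and radius taken from this chapter's review questions) and the function name is hypothetical.

```python
# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)


def orbit_from_altitudes(perigee_alt_km, apogee_alt_km):
    """Return (semi-major axis in m, eccentricity, specific orbital energy in J/kg)."""
    r_p = R_EARTH + perigee_alt_km * 1e3   # perigee distance from Earth's centre
    r_a = R_EARTH + apogee_alt_km * 1e3    # apogee distance from Earth's centre
    a = (r_a + r_p) / 2
    e = (r_a - r_p) / (r_a + r_p)
    energy = -G * M_EARTH / (2 * a)
    return a, e, energy


# RASCOM 1 transfer orbit: perigee 587 km, apogee 35 745 km (table 3.2).
a, e, energy = orbit_from_altitudes(587, 35745)
print(f"a = {a / 1e3:.0f} km, e = {e:.4f}, energy = {energy / 1e6:.1f} MJ per kg")
```

Under these assumptions the energy comes out near the tabulated −8.1 MJ kg⁻¹. The eccentricity, computed from distances to the Earth's centre as the equation requires, comes out near 0.72 rather than the tabulated 0.9677; the tabulated figure appears to correspond to using altitudes in place of radii.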

The velocity of a satellite at any point along an elliptical path can be
calculated using the following general equation:
$$v = \sqrt{GM\left(\frac{2}{r} - \frac{1}{a}\right)}$$
where
r = the orbital radius at the point being considered.
The satellite velocities at apogee and perigee in table 3.2 were
calculated using this formula. (You should confirm that for a circular
orbit a = r and that this formula simplifies to the orbital velocity equation
given on page 44.) Notice how the velocities of each satellite are least at
the apogee and greatest at the perigee, that is, the satellites move quickly
when closest to the Earth and slow down as they move further away. This,
of course, is just what is described in Kepler’s second law.
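
A minimal sketch of this calculation for the RASCOM 1 transfer orbit follows. It assumes the rounded value of G and the Earth data from this chapter's review questions; `orbital_speed` is an illustrative name, not something defined in the text.

```python
import math

# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)


def orbital_speed(r, a):
    """Speed (m/s) at distance r from the Earth's centre on an orbit of semi-major axis a."""
    return math.sqrt(G * M_EARTH * (2 / r - 1 / a))


# RASCOM 1: perigee altitude 587 km, apogee altitude 35 745 km (table 3.2).
r_perigee = R_EARTH + 587e3
r_apogee = R_EARTH + 35745e3
a = (r_perigee + r_apogee) / 2

v_perigee_kmh = orbital_speed(r_perigee, a) * 3.6   # convert m/s to km/h
v_apogee_kmh = orbital_speed(r_apogee, a) * 3.6

print(f"perigee: {v_perigee_kmh:.0f} km/h, apogee: {v_apogee_kmh:.0f} km/h")
```

The two speeds should land close to the tabulated 35 650 and 5 900 km h⁻¹, and setting a = r in `orbital_speed` reduces it to the circular orbital velocity equation.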
The Molniya orbit was developed specifically for this reason. Devised
for Russian communications (as most of Russia lies too far north to be
satisfactorily covered by a geostationary satellite), this very eccentric orbit
places a high apogee over the desired location. A Molniya satellite will
cruise slowly through this apogee before zipping around through the low
perigee and returning quickly to the coverage area.

Types of orbit
Spacecraft or satellites placed into orbit will generally be placed into one
of two altitudes: either a low Earth orbit or a geostationary orbit.
A low Earth orbit is generally an orbit higher than approximately
250 km, in order to avoid atmospheric drag, and lower than approximately
1000 km, which is the altitude at which the Van Allen radiation
belts start to appear. These belts are regions of high radiation trapped by
the Earth’s magnetic field and pose significant risk to live space travellers
as well as to electronic equipment. The space shuttle utilises a low Earth
orbit somewhere between 250 km and 400 km depending upon the
mission. At 250 km, an orbiting spacecraft has a velocity of 27 900 km h⁻¹
and takes just 90 minutes to complete an orbit of the Earth.

Definition: A low Earth orbit is an orbit higher than 250 km and lower than
1000 km.
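
The 250 km figures can be reproduced from the circular orbital velocity equation and the relation T = 2πr/v. The sketch below assumes the rounded value of G and the Earth data quoted in this chapter's review questions.

```python
import math

# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)

altitude = 250e3                       # orbital altitude (m)
r = R_EARTH + altitude                 # orbital radius (m)

v = math.sqrt(G * M_EARTH / r)         # circular orbital velocity (m/s)
period = 2 * math.pi * r / v           # orbital period (s)

v_kmh = v * 3.6
period_min = period / 60
print(f"v = {v_kmh:.0f} km/h, T = {period_min:.1f} min")
```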
A geostationary orbit is at an altitude at which the period of the orbit
precisely matches the rotation period of the Earth. If over the Equator,
such an orbit would allow a satellite to remain ‘parked’ over a fixed point
on the surface of the Earth throughout the day and night. From the Earth
such a satellite appears to be stationary in the sky, always located in the
same direction regardless of the time of day. This is particularly useful for
communications satellites because a receiving dish need only point to a
fixed spot in the sky in order to remain in contact with the satellite.

Definition: A geostationary orbit is at an altitude at which the period of the
orbit precisely matches that of the Earth. This corresponds to an altitude of
approximately 35 800 km.
The altitude of such an orbit can be calculated from Kepler’s Law of
Periods. Firstly, the period of the orbit must equal the length of one sidereal
day; that is, the time it takes the Earth to rotate once on its axis, relative
to the stars. This is 3 minutes and 56 seconds less than a 24-hour solar day,
so that T is set to be 86 164 s. The radius of the orbit then works out to be
42 168 km, or 6.61 Earth radii. Subtracting the radius of the Earth gives the
altitude as approximately 35 800 km. This places such satellites at the upper
limits of the Van Allen radiation belts and near the edge of the
magnetosphere, making them useful for scientific purposes as well. Australia
has the AUSSAT and OPTUS satellites in geostationary orbits.
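
The calculation described above can be sketched by rearranging Kepler's Law of Periods for r. The constants are assumptions (rounded G, with the Earth's mass and radius from this chapter's review questions), so the result agrees with the quoted 42 168 km only to within rounding.

```python
import math

# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)

T = 86164           # one sidereal day (s), as given in the text

# Kepler's Law of Periods, r^3 / T^2 = GM / (4 pi^2), rearranged for r.
r = (G * M_EARTH * T**2 / (4 * math.pi**2)) ** (1 / 3)

radius_km = r / 1e3
altitude_km = (r - R_EARTH) / 1e3
earth_radii = r / R_EARTH
print(f"r = {radius_km:.0f} km ({earth_radii:.2f} Earth radii), altitude = {altitude_km:.0f} km")
```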
If a satellite at this height is not positioned over the Equator but at
some other latitude, it will not remain fixed at one point in the sky.
Instead, from the Earth the satellite will appear to trace out a ‘figure of
eight’ path each 24 hours. It still has a period equal to Earth’s, however,
so this orbit is referred to as geosynchronous.



A transfer orbit is a path used to manoeuvre a satellite from one orbit
to another. Satellites headed for a geostationary orbit are first placed into
a low Earth orbit and then boosted up from there using a transfer orbit,
which has a specific orbital energy that lies between that of the lower and
higher circular orbits. Orbital manoeuvres utilise Keplerian motion,
which is not always intuitive. In order to move a satellite into a different
orbit, the satellite’s energy must be changed; this is achieved by rapidly
altering the kinetic energy. Rockets are fired to change the satellite’s
velocity by a certain amount, referred to as ‘delta-v’ (∆v), which will
increase or decrease the kinetic energy (and therefore the total energy)
to alter the orbit as desired. However, as soon as the satellite begins to
change altitude, transformations between the $E_p$ and the $E_k$ occur, so its
speed is continually changing.

Definition: A transfer orbit is an orbit used to manoeuvre a satellite from one
orbit to another.
The simplest and most fuel-efficient path is a Hohmann transfer orbit, as
shown in figure 3.4. This is a transfer ellipse that touches the lower orbit
at its perigee and touches the higher orbit at its apogee. The Hohmann
transfer involves two relatively quick (called ‘impulsive’) rocket boosts.
(In orbital mechanics, the word ‘impulsive’ describes a quick change in
velocity and energy.) In order to move to a higher orbit, the first boost
increases the satellite’s velocity, stretching the circular low Earth orbit
out into a transfer ellipse. The perigee is the fastest point on this transfer
orbit, and as the satellite moves along the ellipse it slows again. When it
finally reaches the apogee, it will be at the correct altitude for its new
orbit, but it will be moving too slowly, the apogee being the slowest point on
the ellipse. At this point the rockets are fired again, to increase the
velocity to that required for the new higher, stable and circular orbit.

Figure 3.4 A Hohmann transfer orbit used to raise a satellite from a low Earth
orbit of altitude 400 km up to a geostationary orbit of altitude 35 800 km.
[Diagram values: low Earth orbit, v = 27 600 km h⁻¹; perigee of the transfer
ellipse, v = 36 200 km h⁻¹ (Δv = 8600 km h⁻¹); apogee of the transfer ellipse,
v = 5820 km h⁻¹ (Δv = 5280 km h⁻¹); geostationary orbit, v = 11 100 km h⁻¹.
The spacecraft slows when moving out to apogee, though total E does not change.]

In order to move down from a higher to a lower orbit, the process is
reversed, requiring two negative delta-v rocket boosts, that is, retro-firing
of the rockets. These two boosts will slow the satellite, hence changing its
orbit, first from the higher circular orbit into a transfer ellipse that
reaches down to lower altitudes, then from the perigee of the transfer
ellipse into a circular low Earth orbit.
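
The two boosts of the upward transfer in figure 3.4 can be estimated with the velocity equation for elliptical orbits. This is a sketch only: it assumes ideal impulsive burns, the rounded value of G, and the Earth data from this chapter's review questions, and the helper name `speed` is hypothetical.

```python
import math

# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)


def speed(r, a):
    """Speed (m/s) at distance r on an orbit with semi-major axis a."""
    return math.sqrt(G * M_EARTH * (2 / r - 1 / a))


r_low = R_EARTH + 400e3       # low Earth orbit radius (m), altitude from figure 3.4
r_high = R_EARTH + 35800e3    # geostationary orbit radius (m)
a_transfer = (r_low + r_high) / 2   # semi-major axis of the transfer ellipse

# First boost: from the circular low orbit onto the transfer ellipse at perigee.
dv1 = speed(r_low, a_transfer) - speed(r_low, r_low)
# Second boost: from the transfer ellipse at apogee onto the circular high orbit.
dv2 = speed(r_high, r_high) - speed(r_high, a_transfer)

dv1_kmh = dv1 * 3.6
dv2_kmh = dv2 * 3.6
print(f"first boost: {dv1_kmh:.0f} km/h, second boost: {dv2_kmh:.0f} km/h")
```

Both values should fall close to the Δv figures shown in figure 3.4; small differences come from rounding in the figure. Reversing the signs of both boosts describes the downward transfer.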

PHYSICS IN FOCUS
Other types of orbit
There are several, more unusual types of orbit. One is a low altitude polar
orbit. A satellite flying 1000 km over the North Pole and then the South Pole,
orbiting the Earth once every 100 minutes, will, over the course of 24 hours,
be able to survey the entire globe as it spins beneath it.

If the plane of the orbit is about 8 degrees off the north–south plane, the
mass of the Earth’s equatorial bulge causes the plane of the orbit to slowly
rotate in time with the Earth’s motion around the Sun, so that it maintains its
attitude to the Sun. Such a sun-synchronous orbit allows a satellite to always
orbit along a path over the twilight between day and night, known as the
terminator.

An even stranger orbit is an orbit at a Lagrangian point. A Lagrangian point
is a position in space at which a satellite can maintain a stable orbit despite
the gravitational influence of two significant masses, in this case the Sun and
the Earth. Any spacecraft launched toward the Sun will, according to Kepler’s
Law of Periods, begin to speed up as its orbital radius (its distance from the
Sun) decreases and the gravitational pull of the Sun increases, thereby
increasing the centripetal force. This means that it will begin to speed ahead
of the Earth and eventually lose contact. However, the Earth’s gravity is
pulling in the opposite direction and, at a particular distance, it will reduce
the pull of the Sun just enough to allow the spacecraft to slow to a speed that
matches the Earth’s progress. This point is known as the L1 point and is
approximately four times the distance to the Moon, or one-hundredth of the
distance to the Sun.

A spacecraft placed at the L1 point will orbit the Sun along with the Earth,
maintaining its relative position between the Sun and the Earth throughout the
year. This strategy is ideal for studying the solar wind, and has been used for
the Advanced Composition Explorer (ACE) and the Solar and Heliospheric
Observatory (SOHO). Rather than residing at the point, these spacecraft move in
small orbits around it.

There are several other Lagrangian points, such as the L2 directly behind the
Earth, at which point the Earth’s gravity adds to the Sun’s to speed up a
satellite so that it can accompany the Earth, as well as more points around the
Moon.

Orbital decay
All satellites in low Earth orbit are subject to some degree of atmospheric
drag that will eventually decay their orbit and limit their lifetimes. Although
the atmosphere is very thin 1000 km above the surface of the Earth, it is still
sufficient to cause some friction with a satellite. This friction causes a
gradual (‘non-impulsive’) loss of energy to heat. Referring back to the
equation for orbital energy, you can see that a loss of orbital energy
necessarily means a loss of altitude, so that this gradual loss of energy
causes a low-orbiting satellite to slowly spiral back towards the Earth.

Figure 3.5 Forces acting on a satellite. [Diagram labels: atmospheric drag
$F_d$; lift $F_d \sin\theta$; drag $F_d \cos\theta$; angle θ; weight W;
velocity v.]

A number of factors combine to make this an accelerating process. Referring
to figure 3.5, we see that the two forces acting on a low-orbiting satellite
are its weight and atmospheric drag. The equation for the atmospheric drag is
as follows:

$$F_d = -\tfrac{1}{2}\rho v^2 C_d A$$



where
ρ = air density (kg m⁻³)
v = velocity (m s⁻¹)
$C_d$ = coefficient of drag (a ratio expressing how streamlined the shape is)
A = cross-sectional area (m²).
As the satellite descends this force increases for two reasons:
• The density of the atmosphere increases by a factor of almost 10⁹ from
an altitude of 150 km down to ground level.
• As the satellite loses altitude some of its $E_p$ is transformed into $E_k$
and it speeds up; it is literally starting to fall back to Earth. The drag
force is proportional to the square of the velocity, so as the speed increases
the drag increases much more sharply.
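
The two effects in this list can be illustrated with the drag equation. Every input value in the sketch below is hypothetical, chosen only to show the scaling; none of them comes from the text, and the function returns the magnitude of the force (the minus sign in the equation indicates that drag opposes the motion).

```python
def drag_force_magnitude(air_density, speed, drag_coefficient, area):
    """Magnitude of the atmospheric drag force (N): 0.5 * rho * v^2 * Cd * A."""
    return 0.5 * air_density * speed**2 * drag_coefficient * area


# Hypothetical illustrative inputs (not taken from the text).
rho = 1e-11      # air density (kg m^-3)
v = 7700.0       # orbital speed (m s^-1)
cd = 2.2         # coefficient of drag
area = 10.0      # cross-sectional area (m^2)

base = drag_force_magnitude(rho, v, cd, area)
denser = drag_force_magnitude(10 * rho, v, cd, area)   # ten times the air density
faster = drag_force_magnitude(rho, 2 * v, cd, area)    # twice the speed

print(f"base drag: {base:.4f} N")
print(f"10x density gives {denser / base:.0f}x the drag")
print(f"2x speed gives {faster / base:.0f}x the drag")
```

Drag scales in direct proportion to density but with the square of the speed, which is why both effects together make orbital decay an accelerating process.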
At an altitude of about 80 km the atmospheric drag increases sufficiently
to start slowing the descending satellite; however, by now the
increasing air density means that the braking effect is building quite
rapidly. At some point, usually around an altitude of 60 km, the atmospheric
drag increases sharply, leading to a catastrophe of heat and g
force. Braking occurs so suddenly that the heat generated usually
becomes sufficient to vaporise all but the largest of satellites, in addition
to the generation of extreme g forces.
Designers plan for an expected satellite lifetime by building in small
rocket boosters so that the satellite can be lifted periodically back up to
its intended orbital altitude. Figure 3.6 shows the changes in the altitude
of the International Space Station over several years. Being a very large,
unstreamlined shape, it experiences more drag than most other satellites;
it loses about 90 metres of altitude per day due to atmospheric drag. It is
lifted regularly — the larger lifts were done by visiting space shuttles
(which ceased for some time after the Columbia disaster in 2003) and the
smaller ones by Russian rockets.
Figure 3.6 Altitude changes of the International Space Station. The drops in
altitude represent orbital decay due to atmospheric drag. The increases in
altitude are lifts performed by visiting spacecraft. [Graph: average altitude
(km), from 320 to 400, plotted against year, from 1999 to 2006.]

However, the actual service life of a satellite can be unpredictable as
the atmosphere itself can change. For example, an increase in solar radiation
can cause the atmosphere to expand and rise up to meet a satellite,
increasing the atmospheric density at that altitude. Many satellites have
been prematurely lost this way during a solar cycle maximum, including
the first US space station, Skylab, in 1979.

3.2 RE-ENTRY
The process of deliberately leaving a stable Earth orbit and re-entering
the atmosphere in order to return to the surface of the Earth is known as
‘de-orbiting’. De-orbiting has some significant differences from orbital
decay. Orbital decay is an unintended, gradual (‘non-impulsive’) process
resulting in a spiral path downward, whereas de-orbiting is a deliberate,
impulsive orbital manoeuvre resulting in an elliptical path down to the
atmosphere.

The de-orbit manoeuvre
The first phase of the de-orbit manoeuvre is to alter the spacecraft’s orbit
into a transfer ellipse that intersects the atmosphere at the desired angle.
A shallow angle is selected to minimise the extreme heating and g forces
that can destroy a spacecraft on an uncontrolled re-entry. However, if the
angle is too shallow, the spacecraft may skip off the atmosphere instead
of penetrating it. For example, as shown in figure 3.7, the optimum angle
for the Apollo missions was between 5.2° and 7.2°.

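Figure 3.7 Re-entry of an Apollo capsule. [Diagram labels: lift and drag acting
on the capsule; altitude axis (km) marked 0, 30, 60, 90 and 120; range marked
1.9 × 10³, 3.7 × 10³ and 5.6 × 10³ km; angles of 2.0° and 5.2°. Landing
velocities: 3 chutes, 9.5 m s⁻¹; 2 chutes, 11 m s⁻¹.]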

Just as for other transfer ellipses, in order to collapse the shape of the
stable circular orbit into a smaller transfer ellipse, some energy must be lost.
This is achieved by a retroburn of the spacecraft’s rockets — that is, pointing
them ahead of the spacecraft and executing a short burn to quickly reduce
the velocity and thus the kinetic energy. The required transfer ellipse must
be calculated in advance, as this will determine when and for how long the
retroburn must occur in order to achieve the required delta-v.
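
As a rough sketch of that advance calculation, the code below estimates the retroburn delta-v needed to turn a circular orbit into a transfer ellipse whose perigee dips into the atmosphere. The starting altitude of 400 km and target perigee of 60 km are hypothetical values chosen for illustration, the constants are the rounded G and the Earth data from this chapter's review questions, and the re-entry angle itself is not modelled.

```python
import math

# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)


def retroburn_delta_v(start_alt_km, perigee_alt_km):
    """Speed reduction (m/s) to drop from a circular orbit onto a transfer ellipse."""
    r_start = R_EARTH + start_alt_km * 1e3      # becomes the apogee of the ellipse
    r_perigee = R_EARTH + perigee_alt_km * 1e3
    a = (r_start + r_perigee) / 2               # semi-major axis of the ellipse
    v_circular = math.sqrt(G * M_EARTH / r_start)
    v_ellipse = math.sqrt(G * M_EARTH * (2 / r_start - 1 / a))
    return v_circular - v_ellipse


# Hypothetical example: 400 km circular orbit, transfer ellipse perigee at 60 km.
delta_v = retroburn_delta_v(400, 60)
print(f"retroburn delta-v is roughly {delta_v:.0f} m/s")
```

The burn is small compared with the orbital speed of several kilometres per second, which is consistent with the description of a short retroburn.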
The next phase is the glide phase, in which the spacecraft’s orbit
changes and carries it down to meet the atmosphere at the ‘re-entry
interface’, around 120 km altitude. While the spacecraft is gliding down,
no further manoeuvring is required. However, the spacecraft is speeding
up again because gravitational potential energy is transforming into
kinetic energy as it falls.
The third phase of the process is the actual re-entry into and through
the atmosphere, during which the issues of heat, g forces, ionisation
blackout and reaching the surface must be dealt with.

An extreme heat
Why is it that re-entry produces so much heat? Consider that the
spacecraft has a velocity, even after retrofiring, of tens of thousands of
kilometres per hour. This velocity means that the spacecraft has significant
kinetic energy. Additionally, the altitude of the spacecraft’s orbit
means that it also has considerable gravitational potential energy, which
is also lost as the spacecraft’s altitude decreases during re-entry.
As the spacecraft re-enters, it experiences friction with the molecules
of the atmosphere. This friction is a force directed against the
motion of the spacecraft and causes it to decelerate; that is, to slow
down. The enormous kinetic energy the spacecraft possesses is converted
into heat, and that heat can cause the spacecraft to reach
extreme temperatures.
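
To get a feel for the amount of energy involved, the sketch below estimates the kinetic energy and the gravitational potential energy drop, per kilogram, for a spacecraft returning from a 250 km circular orbit. It assumes the rounded value of G and the Earth data from this chapter's review questions, and it ignores the small speed the spacecraft still has at landing.

```python
import math

# Assumed constants: rounded G; Earth data from this chapter's review questions.
G = 6.67e-11        # gravitational constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # mean radius of the Earth (m)


def energy_to_lose_per_kg(altitude_km):
    """Return (kinetic, potential drop) in J per kg for a circular orbit."""
    r = R_EARTH + altitude_km * 1e3
    v = math.sqrt(G * M_EARTH / r)                       # orbital speed (m/s)
    kinetic = 0.5 * v**2                                 # kinetic energy per kg
    potential_drop = G * M_EARTH * (1 / R_EARTH - 1 / r) # Ep lost descending to the surface
    return kinetic, potential_drop


kinetic, potential_drop = energy_to_lose_per_kg(250)
print(f"kinetic: {kinetic / 1e6:.1f} MJ per kg")
print(f"potential drop: {potential_drop / 1e6:.2f} MJ per kg")
```

Under these assumptions the kinetic energy is more than ten times the potential energy drop, so most of the heat of re-entry comes from the spacecraft's orbital speed rather than from its height.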

Figure 3.8 The Apollo 8 capsule re-entering the Earth’s atmosphere

Research into the heat of re-entry was first conducted in the early
1950s in the USA, not for space flight but to build a durable warhead for
intercontinental ballistic missiles (ICBMs). A typical ICBM reaches an
altitude of 1400 km and has a range of 10 000 km. Computer design had
produced streamlined missiles with long needle-shaped nose cones, but
these designs reached temperatures of 7500° during re-entry — high
enough to vaporise the nose cone.
In 1952, aeronautical engineer Harry Julian Allen, working with only
pen and paper, calculated that the best shape for re-entry was a blunt
one. When a blunt shape collides with the upper atmosphere at re-entry
speeds, it produces a shockwave of compressed air in front of itself, much
like the bow wave of a boat. Most of the heat is then generated in the
compressing air, and significantly less heat is caused by friction of the air
against the object itself.
Allen’s discovery led to a new design of warhead, one that would
detach from the rocket at altitude and re-enter the atmosphere backwards,
presenting its blunt rear end as it fell. This was also the same basic
design used for early space capsules such as the Mercury, Gemini and
Apollo missions (see figure 3.7), as well as the planned Orion spacecraft.
The first Mercury rocket, in particular, was simply a slightly modified
ICBM, with the detachable nose cone adapted for occupation by a single
short person. The space shuttle still uses this idea: by keeping its nose
well up during re-entry, it presents a flat underbelly to the atmosphere to
create the shockwave.

However, the blunt shape would still experience high temperatures,
and a protective ‘heat shield’ was needed. After considerable research a
technique called ablation was settled on. In this technique the nose cone
is covered with a ceramic material, such as fibreglass, which is vaporised
or ‘ablated’ during re-entry heating. The vaporising of the surface
dissipates the heat and carries it away. This technique was used with success
on all of the Mercury, Gemini and Apollo capsules during the 1950s,
1960s and 1970s; and will be used again on the Orion capsules. The
Orion spacecraft is planned to take astronauts back to the Moon by 2020.
As it returns it will re-enter Earth’s atmosphere with greater velocity than
the space shuttle, about 40 000 km h⁻¹. The ablative heat shield developed
for this purpose is shown in figure 3.9. Called the TPS (Thermal
Protection System), it is constructed of a modern material called
Phenolic Impregnated Carbon Ablator (PICA).
The space shuttle’s slower orbital speed allows it to use a different
approach. Each shuttle has a covering of insulating tiles. These tiles are
made of glass fibre but are approximately 90% air. This gives them excellent
thermal insulation properties and also conserves mass. Unfortunately
they are also porous, so they absorb water, meaning that the
surface must be waterproofed between each flight. Damage to the space
shuttle Columbia’s heat shield is thought to have caused its disintegration
and the loss of seven astronauts on 1 February 2003. Investigators believe
the scorching air of re-entry penetrated a cracked panel on the left wing
and melted the metal support structures inside. Events such as these have
prompted the planned retirement of the space shuttle fleet in 2010.

Figure 3.9 Boeing have developed this prototype PICA heat shield for the
new Orion spacecraft, which will return astronauts to Earth in a
capsule much like the Apollo capsules, although on a larger scale.
Copyright © Boeing

Decelerating g forces
Even with a spacecraft designed to cope with re-entry heating, there is
still another major consideration — the survival of any living occupants.
Greater angles of re-entry mean greater rates of deceleration. This means
a faster rate of heat build-up as kinetic energy is converted, but it also
means greater g forces experienced by the occupants of the spacecraft. In
the 1950s this was a very real concern, because designers were aware that
the re-entry angles required by their spacecraft would generate loads up
to 20 g. Unfortunately, early studies using large centrifuges suggested that
g forces on astronauts should be restricted to 3 g if possible, and that 8 g
represented a maximum safe load although symptoms such as chest pain
and loss of consciousness could be experienced at this level.
Research was conducted into ways to increase a human’s tolerance of
g forces. The findings included those below:



• A transverse application of g load is easiest to cope with because blood
is not forced away from the brain. This meant that the astronaut
should be lying down at take-off, not standing or sitting vertically.
• An ‘eyeballs-in’ application of g loads is easier to tolerate than
‘eyeballs-out’. This meant that the astronaut should lift off forwards
(facing up) but re-enter backwards (facing up) since the g forces are
always directed upwards. This was consistent with the idea of the
nose cone presenting its rear face as it re-entered, as in figure 3.7.
• Supporting the body in as many places as possible increases tolerance.
Supporting suits, webbing, netting and plaster casts were all tried, with
designers eventually settling on a contoured couch, built of fibreglass
and moulded to suit the body of a specific astronaut. Using this couch
volunteers were successfully subjected to loads of up to 20 g, though
this represented the limit of their tolerance.
In the Mercury rocket program that followed soon after this work, all
of the above strategies were employed to the benefit of the astronauts.
During his flight, Alan Shepard, the first American in space, was
subjected to a maximum lift-off g force of 6.3 and a maximum re-entry
g force of 11.6.
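
To put these loads into more familiar units, the sketch below converts a g load into an acceleration and into the supporting force on an astronaut. It assumes g = 9.8 m s⁻² and a hypothetical astronaut mass of 80 kg; neither value is given in this section.

```python
G0 = 9.8   # assumed acceleration due to gravity at the Earth's surface (m s^-2)


def acceleration_from_g_load(g_load):
    """Acceleration (m s^-2) equivalent to a load expressed in multiples of g."""
    return g_load * G0


def supporting_force(mass_kg, g_load):
    """Force (N) the couch must exert on a body of this mass at this g load."""
    return mass_kg * acceleration_from_g_load(g_load)


astronaut_mass = 80.0   # hypothetical mass (kg), for illustration only

for label, g_load in [("lift-off", 6.3), ("re-entry", 11.6)]:
    a = acceleration_from_g_load(g_load)
    force = supporting_force(astronaut_mass, g_load)
    print(f"{label}: {g_load} g = {a:.1f} m/s^2, force on astronaut = {force:.0f} N")
```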
Figure 3.10 The application of g force upon an astronaut during lift-off
and re-entry. [Diagram labels: velocity v, acceleration a and force F, shown
for lift-off and for re-entry.]

PHYSICS FACT
In 1959, a remarkable experiment was conducted by R. Flanagan
Gray, a physician working at the US Navy’s centrifuge laboratory in
Pennsylvania. The Navy was interested in water immersion as a means
of body support to increase tolerance of high g forces. Gray designed
a large aluminium capsule that could be fitted to the centrifuge and
filled with water. It was nicknamed the ‘Iron Maiden’, which sounds
ominous, but it was fitted with an emergency flushing mechanism.
Gray tested it himself. Completely submerged inside the capsule, he
held his breath as the centrifuge wound up, subjected him to a load
of 31 g for 5 seconds, then wound down again. This established a new
record for tolerance of g forces.

Ionisation blackout
An additional problem was discovered early in the development of space
flight. As heat builds up around a spacecraft during re-entry, atoms in the
air around it become ionised, forming a layer around the spacecraft. Radio
signals cannot penetrate this layer of ionised particles, preventing
communication between the ground and the spacecraft. All telemetry and
verbal communication by radio is cut off for the duration of this ionisation
blackout, the length of which depends upon the re-entry profile.
Apollo capsules would experience an ionisation blackout of three to
four minutes, whereas the space shuttle suffers a somewhat longer period
of 16 minutes.

Definition: Ionisation blackout is a period of no communication with a
spacecraft due to a surrounding layer of ionised atoms forming in the heat of
re-entry.

Reaching the surface


Figure 3.7 (page 51) shows how a typical Apollo capsule, which would
contain three astronauts, would reach an altitude of 400 000 ft or 120 km,
considered the ‘entry interface’, at a re-entry angle between 5.2° and
7.2°. It would then descend from this altitude over a range of 2800 to
4600 km, continually slowing down and, at some point, suffering ionisation
blackout. In the last portion of its descent, parachutes would be
released to slow it to about 33 km h⁻¹. Finally, it would splash down into
the ocean to await recovery by a naval vessel. This was essentially the
same strategy as that used by the earlier Mercury and Gemini spacecraft.
Soviet missions often ended a little differently, as they descended
over land. The early Vostok capsules included an ejection seat, and the
cosmonaut would eject at a suitable altitude, descending to the ground
by parachute. Later Soyuz spacecraft, and the Chinese Shenzhou spacecraft
that were based upon them, improved on this somewhat. With the
occupants remaining inside, the capsules descended by a series of parachutes.
In the last few moments the heat shield would be jettisoned,
revealing a set of retrorockets. These were fired at a height of just 2
metres to provide a soft landing onto the ground.
As figure 3.11 shows, the space shuttle displays a unique solution to the
problem of reaching the surface of the Earth without subjecting its occupants
to loads greater than 3 g. As it has wings, the pilot is able to control
the attitude of the space shuttle and direct its descent. During the period
of maximum deceleration and heat, its nose is held up at an angle of 40°,
which slows its progress and presents the underbelly as a protected blunt
surface. Past this stage, it is flown in a series of sharp-banking S-turns in
order to control its descent, much like a snow skier descending a steep
mountain. When it is just 1.5 km from the runway, it is gliding down an
18° gradient, much steeper than the 3° approach of a large airliner.
When it is 500 m above the ground, speed brakes (special flaps
that increase drag) are applied so that it settles into a 1.5° final approach.
The crew deploys the landing gear and within seconds the space shuttle touches
down on its runway.

Figure 3.11 The space shuttle has a unique approach to the problems of
re-entry. This is an artist’s impression of the shuttle descending at
an angle through the upper atmosphere.



CHAPTER REVIEW

SUMMARY
• Orbital motion is an example of uniform circular motion. Centripetal force
is the force required to maintain circular motion and is given by the equation:

$$F_C = \frac{mv^2}{r}$$

• The period of a satellite’s orbit is related to its radius by this expression
of Kepler’s Law of Periods:

$$\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$$

• The period of a satellite’s orbit is related to its velocity by this
expression:

$$T = \frac{2\pi r}{v}$$

• Combining the above two equations leads to an expression for the orbital
velocity of a satellite around the Earth:

$$v = \sqrt{\frac{G m_E}{r}}$$

• Low Earth orbit refers to any orbit below an altitude of approximately
1000 km and will typically involve an orbital period of approximately
one-and-a-half hours. A geostationary orbit is defined to be an orbit with an
orbital period equal to the rotation period of the Earth, which places it
permanently over a fixed point on the Earth’s surface. This requires an
altitude of approximately 35 800 km.
• Satellites in low Earth orbit are continually subjected to some small degree
of atmospheric friction, which is influenced by many factors. This friction
gradually removes energy from the satellite and decays its orbit.
• In order to re-enter the Earth’s atmosphere, a spacecraft must deliberately
lose velocity in such a way that it strikes the atmosphere at an optimum angle.
If the angle is too shallow, the spacecraft may skip off the atmosphere and
fail to re-enter. If the angle is too steep, the heat of re-entry and the
g forces produced would be too great to ensure the survival of the spacecraft
or its occupants.
• The heat of re-entry at its maximum produces a layer of ionised particles
around a spacecraft that prevents radio communication with the spacecraft.

QUESTIONS
1. Construct a diagram of a satellite orbiting the Earth, showing all forces
acting on the satellite.
2. A 400 g rock is tied to the end of a 2 m long string and whirled until it
has a speed of 12.5 m s⁻¹. Calculate the centripetal force and acceleration
experienced by the rock.
3. A 900 kg motorcycle, travelling at 70 km h⁻¹, rounds a bend in the road
with a radius of 17.5 m. Calculate the centripetal force required from the
friction between the tyres and the road.
4. (a) Define the orbital velocity of a satellite.
(b) Describe its relationship to:
(i) the gravitational constant G
(ii) the mass of the planet it is orbiting
(iii) the mass of the satellite
(iv) the radius of the orbit
(v) the altitude of the satellite.
(c) State this relationship in algebraic form.
5. If a person were to hurl a rock of mass 0.25 kg horizontally from the top
of Mt Everest, at an altitude of 8.8 km above sea level, calculate:
(a) the velocity required by the rock so that it would orbit the Earth and
return to the thrower, still waiting on Mt Everest. Note: The mass of the
Earth is 5.97 × 10²⁴ kg, and its mean radius is 6380 km.
(b) how long the thrower would have to wait for the rock’s return.
(c) the magnitude and direction of the centripetal force acting on the rock.
6. Apollo rockets to the Moon would always begin their mission by launching
into a low orbit with an approximate altitude of just 180 km. An orbit of this
height can decay quite rapidly. However, they were never there for very long
before boosting out of orbit towards the Moon. For their orbit calculate:
(a) the required velocity in km h⁻¹
(b) the period of the orbit
(c) the acceleration of the spacecraft
(d) the centripetal force acting on the spacecraft.
Use 110 000 kg as the mass of the spacecraft in orbit. Additional data can be
found in the previous question.
7. Use Kepler’s Law of Periods to calculate the time required for the
following planets to complete an orbit of the Sun. Note that the radius
of the Earth’s orbit is 150 × 10⁶ km.

56 SPACE
CHAPTER REVIEW
RADIUS OF ORBIT TIME TO ORBIT SUN
6
PLANET (× 10 km) (IN EARTH YEARS)
Mercury 58.5

Venus 109

Mars 229

Jupiter 780

Saturn 1430

8. Compare, in words only, low Earth orbits and geostationary orbits.
9. Distinguish between a geostationary orbit and a geosynchronous orbit.
10. Calculate the altitude, period and velocity data to complete the
following table.

TYPE OF ORBIT    ALTITUDE (km)    PERIOD (h)    VELOCITY (m s⁻¹)
Low Earth        360
Geostationary                     24

11. Explain the limited lifetimes of low Earth orbits.
12. Describe the magnitude and direction of the g forces of re-entry into the
Earth’s atmosphere.
13. Construct a list of some of the dangers of atmospheric re-entry with a
short discussion of each. Your list should include at least four items.


3.1 INVESTIGATING CIRCULAR MOTION

Aim
To examine some of the factors affecting the motion of an object undergoing
uniform circular motion, and then to determine the quantitative relationship
between the variables of force, velocity and radius.

Apparatus
rubber stopper
glass tube
50 g mass carrier
stopwatch
string
sticky tape
metre rule
50 g slot masses

Theory
Recall that the expression for the centripetal force that causes circular
motion is as follows:
$F_C = \frac{mv^2}{r}$
where
F_C = centripetal force (N)
m = mass of object in motion (kg)
v = instantaneous velocity of mass (m s⁻¹)
r = radius of circular motion (m).
In this experiment, the centripetal force is provided by the weight of some
masses hanging on a mass carrier, so that here:
$F_C = m_C g$
where
m_C = mass of mass carrier + masses (kg)
g = acceleration due to gravity = 9.8 m s⁻².
You should also recall the relationship between the period of an object in
circular motion and its orbital velocity:
$T = \frac{2\pi r}{v}$
where
T = period (s)
r = radius of motion (m)
v = orbital velocity (m s⁻¹).

Method
1. Record the mass of the rubber stopper being used as a bob.
2. Attach the rubber stopper to a length of string approximately 1.5 m long,
then thread the loose end of the string through the glass tube.
3. Attach the mass carrier to the loose end of the string as shown in
figure 3.12.
4. Place a piece of sticky tape on the string at the point shown in
figure 3.12 so that the distance, r, is 40 cm.
5. Hold the glass tube and move it in a small circle so as to get the rubber
bob moving in a circular path. The mass carrier will provide the centripetal
force to keep the bob moving in its circular path. Adjust your frequency of
rotation so that the sticky tape just touches the bottom of the glass tube.
This will keep the radius of the bob’s orbit steady.
6. Record the time for the bob to complete 10 revolutions at a constant speed
then calculate and record the period. Do this three times and then use the
average of these as the correct period. Use the radius and period to
calculate the orbital velocity of the bob.
7. Repeat steps 4 to 6 for radii of 0.60 m, 0.80 m, and 1.0 m.
8. Repeat steps 3 to 7 using masses of 100 g, then 150 g, and finally 200 g.

Figure 3.12 (diagram labels: r, string, glass tube, mass carrier, masses; the
radius is marked by adhesive tape)

Results
Tabulate your results as shown in the table below.

FORCE             RADIUS   PERIOD (10 REVOLUTIONS)   MEAN PERIOD   ORBITAL VELOCITY v
50 g × gravity    1.0 m
                  0.8 m
                  0.6 m
                  0.4 m
100 g × gravity   1.0 m
                  0.8 m
                  0.6 m
                  0.4 m

Analysis
1. From the results above, calculate the orbital velocity of the bob and
complete the table.
2. For each of the radii used with 50 g, construct a graph of v² versus r.
3. Repeat this for the 100 g, 150 g and 200 g results.

Questions
1. What is the relationship that these graphs indicate?
2. What does the slope of your v² versus r graph represent?
3. What role does gravity play in the results in this experiment?


CHAPTER 4 GRAVITY IN THE SOLAR SYSTEM

Remember
Before beginning this chapter, you should be able to:
• recall Kepler’s Law of Periods and be able to use it to solve problems and
analyse information:
$\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$
• calculate the centripetal force acting on a satellite orbiting the Earth in
uniform circular motion:
$F = \frac{mv^2}{r}$
• state the definition of momentum: $p = mv$
• state the definition of kinetic energy: $E_k = \frac{1}{2}mv^2$
• use the conservation of momentum and kinetic energy to analyse
one-dimensional elastic collisions.

Key content
At the end of this chapter you should be able to:
• define Newton’s Law of Universal Gravitation
• describe the gravitational field in the region surrounding a massive object
such as a planet
• discuss the importance of Newton’s Law of Universal Gravitation to an
understanding of the orbital motion of satellites
• identify the slingshot effect as used by space probes.

Figure 4.1 The orbit of the Moon around the Earth is determined by the
gravitational forces that the two bodies exert on each other.

Our solar system consists of eight planets and currently three known (but
potentially many) dwarf planets performing orbital motion around the
Sun. Most of the planets have smaller satellites performing orbital
motion around them. In chapter 3, we looked at circular motion as the
simplest model of orbital motion. We learned that this type of motion
requires a centripetal force, that is, a force that acts continually on the
orbiting body and is always directed towards the central body. We also
noted that for planets and satellites, the force of gravity provides the cen-
tripetal force.
In this chapter we further examine the force of gravity, and also the
nature of gravitational fields, within the solar system.

4.1 THE LAW OF UNIVERSAL GRAVITATION
You will recall from your work in the Preliminary Course that it was Isaac
Newton who was first able to provide a formula to describe the manner in
which gravity acts. This formula, now known as the Law of Universal
Gravitation, is as follows:
$F = G\frac{m_1 m_2}{r^2}$
where
F = force of gravity between two masses (N)
G = universal gravitation constant = 6.67 × 10⁻¹¹ N m² kg⁻²
m₁, m₂ = the two masses involved (kg)
r = the distance between their centres of mass (m).
Note that this is always an attractive force and is exerted equally on
both masses. It depends only upon the value of the two masses and
their separation distance. Note also that the force is inversely
proportional to the square of the distance. Hence, in any given situation, if
the distance were to double, the value of the force would drop to
one-quarter of its previous value.

Figure 4.2 Isaac Newton

SAMPLE PROBLEM 4.1 Determining gravitational forces in the Sun–Earth–Moon system
Given the following data, determine the magnitude of the gravitational
attraction between:
(a) the Earth and the Moon
(b) the Earth and the Sun.
Mass of the Earth = 5.97 × 10²⁴ kg
Mass of the Moon = 7.35 × 10²² kg
Mass of the Sun = 1.99 × 10³⁰ kg
Earth–Moon distance = 3.84 × 10⁸ m on average
Earth–Sun distance = 1.50 × 10¹¹ m on average (one astronomical unit, AU)

SOLUTION
(a)
$F = G\frac{m_E m_M}{r^2} = \frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})(7.35 \times 10^{22})}{(3.84 \times 10^{8})^2} = 1.98 \times 10^{20}\ \text{N}$
That is, the magnitude of the gravitational force of attraction between
the Earth and the Moon is approximately 1.98 × 10²⁰ N.



(b)
$F = G\frac{m_E m_S}{r^2} = \frac{(6.67 \times 10^{-11})(5.97 \times 10^{24})(1.99 \times 10^{30})}{(1.50 \times 10^{11})^2} = 3.52 \times 10^{22}\ \text{N}$
That is, the magnitude of the gravitational force of attraction between
the Earth and the Sun is approximately 3.52 × 10²² N, or about 180
times greater than the Earth–Moon attraction.
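
The two results in sample problem 4.1 can be checked numerically. The following is a minimal Python sketch and is not part of the original text; the function name `gravitational_force` and the constant names are illustrative assumptions, while the data are those given in the problem. The last lines also illustrate the inverse square relationship: doubling the separation reduces the force to one-quarter.

```python
G = 6.67e-11  # universal gravitation constant (N m^2 kg^-2)


def gravitational_force(m1, m2, r):
    """Magnitude of the gravitational force (N) between masses m1 and m2 (kg)
    whose centres of mass are r metres apart."""
    return G * m1 * m2 / r**2


# Data from sample problem 4.1
M_EARTH = 5.97e24       # kg
M_MOON = 7.35e22        # kg
M_SUN = 1.99e30         # kg
R_EARTH_MOON = 3.84e8   # m
R_EARTH_SUN = 1.50e11   # m

f_earth_moon = gravitational_force(M_EARTH, M_MOON, R_EARTH_MOON)
f_earth_sun = gravitational_force(M_EARTH, M_SUN, R_EARTH_SUN)

print(f"Earth-Moon force: {f_earth_moon:.3g} N")
print(f"Earth-Sun force:  {f_earth_sun:.3g} N")
print(f"Earth-Sun force / Earth-Moon force: {f_earth_sun / f_earth_moon:.0f}")

# Inverse square law: the same masses at twice the separation
f_doubled = gravitational_force(M_EARTH, M_MOON, 2 * R_EARTH_MOON)
print(f"Force at double the distance / original force: {f_doubled / f_earth_moon:.2f}")
```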

SAMPLE PROBLEM 4.2 Determining smaller gravitational forces
Determine the gravitational force of attraction between two 250 g apples
that are lying on a desk 1.0 m apart.

SOLUTION
$F = G\frac{m_{\text{apple 1}}\, m_{\text{apple 2}}}{r^2} = \frac{(6.67 \times 10^{-11})(2.5 \times 10^{-1})(2.5 \times 10^{-1})}{(1.0)^2} = 4.2 \times 10^{-12}\ \text{N}$
Comparing this tiny force with those forces calculated in sample problem
4.1, we can see that gravitation requires huge masses to produce
significant forces.

Universal gravitation and the motion of satellites
In chapter 3, we learned that the force of gravity serves as the centripetal
force that maintains the orbital motion of a satellite or planet, and we
assumed this motion to be uniform circular motion. (Most orbits are
ellipses with slight eccentricities that make them almost circular, so this
assumption is a fair one that simplifies the situation.) This force holds
the solar system together, tying the planets to the Sun and the satellites to
their planets, keeping them on an orbital path and determining their
orbital speeds.
In chapter 3, we also derived an equation for the orbital velocity of a satel-
lite from Kepler’s Law of Periods. In order to emphasise the importance
of gravity to the motion of satellites and planets, that same equation can
be derived from Newton’s expression for universal gravitation by equating
it to the expression for centripetal force. The following focuses on the
specific case of a satellite orbiting the Earth, but the equations are equally
applicable to satellites orbiting other planets or planets orbiting the Sun.
First, the gravitational attraction between a satellite and the Earth
would be given by the following expression:
$F_G = G\frac{m_E m_S}{r^2}$
where
m_E = mass of the Earth = 5.97 × 10²⁴ kg
m_S = mass of the satellite (kg)
r = radius of the orbit (m)
G = universal gravitation constant.

This gravitational force of attraction also serves as the centripetal force
for the circular orbital motion, hence:
$F_G = F_C$.
Therefore, we can equate the formula for F_G with that for F_C:
$G\frac{m_E m_S}{r^2} = \frac{m_S v^2}{r}$
$\therefore v = \sqrt{\frac{G m_E}{r}}$
where
v = orbital velocity (m s⁻¹).
Note: The radius of the orbit, r, is the sum of the radius of the Earth
and the altitude of the orbit.
$r = r_E + \text{altitude}$ (m)
where
r = radius of orbit (m)
r_E = radius of the Earth (m) = 6.38 × 10⁶ m
altitude = height of orbit above the ground (m).
Again, we may note that the orbital velocity required for a particular
orbit depends only on the mass and radius of the Earth (or other central
body) and the altitude of the orbit. Further, the greater the radius
of the orbit, the slower the orbital velocity that is required to maintain
the orbit.
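
As an illustration of how orbital velocity falls as the orbit gets larger, the short Python sketch below evaluates the orbital velocity formula for three altitudes. It is a sketch only and is not part of the original text: the function names `orbital_velocity` and `orbital_period` are assumptions made for this example, and the constants are the values quoted in this chapter. The period is found from T = 2πr/v.

```python
import math

G = 6.67e-11        # universal gravitation constant (N m^2 kg^-2)
M_EARTH = 5.97e24   # mass of the Earth (kg)
R_EARTH = 6.38e6    # radius of the Earth (m)


def orbital_velocity(altitude, central_mass=M_EARTH, central_radius=R_EARTH):
    """Speed (m/s) needed for a circular orbit at the given altitude (m)."""
    r = central_radius + altitude  # orbital radius = radius of body + altitude
    return math.sqrt(G * central_mass / r)


def orbital_period(altitude, central_mass=M_EARTH, central_radius=R_EARTH):
    """Period (s) of a circular orbit at the given altitude, from T = 2*pi*r/v."""
    r = central_radius + altitude
    return 2 * math.pi * r / orbital_velocity(altitude, central_mass, central_radius)


# A low Earth orbit, the upper limit of low Earth orbit, and geostationary altitude
for altitude_km in (300, 1000, 35800):
    altitude_m = altitude_km * 1e3
    v = orbital_velocity(altitude_m)
    period_h = orbital_period(altitude_m) / 3600
    print(f"{altitude_km:>6} km: v = {v:7.0f} m/s, T = {period_h:5.2f} h")
```

The printed speeds decrease as the altitude increases, and the period at 35 800 km comes out close to one day, consistent with the description of a geostationary orbit.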

SAMPLE PROBLEM 4.3 Determining orbital velocity
Calculate the orbital velocity of the Earth along its orbit around the Sun,
and compare this to the value determined by considering the period of
its motion.

SOLUTION
First, the orbital velocity of the Earth can be calculated directly:
$v = \sqrt{\frac{G m_S}{r}} = \sqrt{\frac{(6.67 \times 10^{-11})(1.99 \times 10^{30})}{1.50 \times 10^{11}}} = 29\,700\ \text{m s}^{-1} = 29.7\ \text{km s}^{-1}$
Orbital velocity can also be calculated from the known period and radius
of the Earth’s orbit:
$v = \frac{2\pi r}{T} = \frac{2\pi(1.50 \times 10^{11})}{365.26 \times 24 \times 3600} = 29\,900\ \text{m s}^{-1} = 29.9\ \text{km s}^{-1}$
These two answers compare very well, and represent the same value after
allowing for rounding errors.



SAMPLE PROBLEM 4.4 Comparing centripetal and gravitational forces acting on the Moon
The period of the Moon’s orbit around the Earth is 27.26 days. Use this
information, along with the data in sample problem 4.1, to calculate the
orbital velocity of the Moon. Use this answer to then calculate the
centripetal force acting on the Moon, assuming its orbit to be circular.
Finally, compare the calculated value for centripetal force to the
gravitational force between the Earth and the Moon calculated in sample
problem 4.1.

SOLUTION
Orbital velocity:
$v = \frac{2\pi r}{T} = \frac{2\pi(3.84 \times 10^{8})}{27.26 \times 24 \times 3600} = 1024\ \text{m s}^{-1}$
Centripetal force:
$F_C = \frac{m_{\text{Moon}} v^2}{r} = \frac{(7.35 \times 10^{22})(1024)^2}{3.84 \times 10^{8}} = 2.01 \times 10^{20}\ \text{N}$
Referring back to sample problem 4.1, the value for the gravitational force
between the Earth and the Moon was calculated to be 1.98 × 10²⁰ N, which
compares very closely to this result allowing for rounding error.

SAMPLE PROBLEM 4.5 Comparing centripetal and gravitational forces acting on the Earth
Repeat sample problem 4.4, but this time focusing on the Earth in its
orbit of the Sun.

SOLUTION
Orbital velocity:
$v = \frac{2\pi r}{T} = \frac{2\pi(1.50 \times 10^{11})}{365.25 \times 24 \times 3600} = 29\,900\ \text{m s}^{-1}$
Centripetal force:
$F_C = \frac{m_{\text{Earth}} v^2}{r} = \frac{(5.97 \times 10^{24})(29\,900)^2}{1.50 \times 10^{11}} = 3.56 \times 10^{22}\ \text{N}$
Referring back to sample problem 4.1, the value for the gravitational
force between the Sun and the Earth was calculated to be 3.52 × 10²² N.
Once again, the closeness of these values demonstrates that gravity
functions as the centripetal force for the orbital motion of satellites and
planets.

System mass or central mass?
The derivation shown here uses the case of a small mass performing circular
motion around a much larger mass. However, this is not what actually happens.
Even assuming that the orbits are circular, the two bodies involved each
orbit the common centre of mass of the two-body system. The distance of each
mass from the centre of its orbit is thus not equal to the separation
distance of the masses. This means that the distance variables in the
gravitation and centripetal force expressions are not the same. When this
difference is allowed for, the variable M in Kepler’s Law of Periods becomes
the mass of the system, rather than just the central mass. However, in the
case of a satellite, or even a planet, this difference is insignificant.
The full derivation of this expression can be found in ‘Astrophysics’,
chapter 16, on page 307, where it is applied to the situation of two stars
orbiting each other.

Deriving the constant in Kepler’s Law of Periods
In chapter 3, we learned that Kepler had originally stated his Law of
Periods in the form $\frac{r^3}{T^2} = k$, but he was not able to determine an
expression for the constant k. When Isaac Newton was devising his Law of
Universal Gravitation, he found that he was able to derive such an
expression. The derivation begins by equating the gravitation and
centripetal forces to give an equation for orbital velocity, just as we have
already done.
$G\frac{mM}{r^2} = \frac{mv^2}{r}$
$\therefore v = \sqrt{\frac{GM}{r}}$ where M is the large central mass.
The expression relating period to orbital velocity, $v = \frac{2\pi r}{T}$, is
then substituted into the equation, which can then be rearranged to the form
of Kepler’s Law of Periods.
$\frac{2\pi r}{T} = \sqrt{\frac{GM}{r}}$
$\therefore \frac{r^3}{T^2} = \frac{GM}{4\pi^2}$ = the constant k
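
The rearrangement from the substituted equation to Kepler's Law of Periods can be set out line by line. The intermediate steps below are a worked sketch that fills in the algebra using the same symbols; they are not part of the original text.

```latex
\begin{align*}
\frac{2\pi r}{T} &= \sqrt{\frac{GM}{r}} \\
\frac{4\pi^2 r^2}{T^2} &= \frac{GM}{r}
  && \text{(squaring both sides)} \\
4\pi^2 r^3 &= GM\,T^2
  && \text{(multiplying both sides by } rT^2\text{)} \\
\frac{r^3}{T^2} &= \frac{GM}{4\pi^2} = k
  && \text{(dividing both sides by } 4\pi^2 T^2\text{)}
\end{align*}
```

Because G and 4π² are constants, k depends only on the central mass M, which is why every body orbiting the same central mass shares the same value of r³/T².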

4.2 GRAVITATIONAL FIELDS
A vector field can be said to exist in any space within which a force vector
can act. You will recall that a vector is any quantity that has both
magnitude and direction. Force is one example. Corresponding to the force
vector that acts within it, a vector field also has strength and direction.
Some examples of vector fields are magnetic fields, electric fields and
gravitational fields.
A gravitational field is a field within which any mass will experience a
gravitational force. Since the force of gravity will act on any mass near the
Earth, a gravitational field exists around the Earth, and can be drawn as
shown in figure 4.3. Note that on a small scale, such as the interior of a
room, the field lines — or lines of force — appear parallel and point
down since this is the direction of the force that would be experienced by
a mass placed within the field.
Definition: A gravitational field is a field within which any mass will
experience a gravitational force. The field has both strength and direction.
The gravitational field of a planet or star extends some distance
from it. Figure 4.3 shows the shape of the gravitational field around
the Earth. We can see now that the field has a radial pattern with the
field lines pointing towards the Earth’s centre, again because this is
the direction of the force that would be experienced by a mass within
the field. Note that closer to the Earth, the field lines are closer
together. This indicates that the field, and its force, are stronger in
this region.
In chapter 1, we learned that the field vector g also describes the
strength and direction of a gravitational field at some point within that
field. Of course, g is more commonly known as ‘acceleration due to
gravity’ and has a value of 9.8 m s⁻² at the Earth’s surface.

Figure 4.3 The gravitational field represented (a) within a room, and
(b) around a planet

Of course, any large object near the Earth, such as the Moon, will have
a gravitational field of its own, and the two fields will combine to form a
more complex field, such as that shown in figure 4.4. Note that there is a
point between the two, but somewhat closer to the Moon, at which the
strength of the field is zero. In other words, the gravitational attraction
of the Earth and that of the Moon are precisely equal but opposite in
direction. Such points exist between any two masses, but become notice-
able when considering planets and stars that are close enough to be
gravitationally bound together.

Figure 4.4 The gravitational field around the Earth and Moon. The overall
shape depends upon the relative strengths of the two fields involved.
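
The position of the zero-field point between the Earth and the Moon can be estimated by setting the two field strengths equal. If d is the distance of the point from the Earth's centre and D is the Earth–Moon separation, then G m_E/d² = G m_M/(D − d)², which rearranges to d = D/(1 + √(m_M/m_E)). The Python sketch below evaluates this using the data from sample problem 4.1. It is an illustration only: it treats both bodies as stationary (orbital motion is ignored), and the function name `null_point_from_larger_mass` is an assumption made for this example.

```python
import math

M_EARTH = 5.97e24   # mass of the Earth (kg)
M_MOON = 7.35e22    # mass of the Moon (kg)
D = 3.84e8          # average Earth-Moon distance (m)


def null_point_from_larger_mass(m_large, m_small, separation):
    """Distance (m) from the larger mass to the point on the line joining the
    two masses where their gravitational fields cancel (static case only)."""
    return separation / (1 + math.sqrt(m_small / m_large))


d = null_point_from_larger_mass(M_EARTH, M_MOON, D)

print(f"Distance from Earth's centre: {d:.3g} m")
print(f"Distance from Moon's centre:  {D - d:.3g} m")
print(f"Fraction of the way from Earth to Moon: {d / D:.2f}")
```

Because the Earth is far more massive than the Moon, the point lies much nearer the Moon than the Earth.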

4.3 THE SLINGSHOT EFFECT
When the Mariner 10 spacecraft reached Venus on 5 February 1974, it
conducted a very successful survey of the planet, but that wasn’t the end
of its mission. Its flight path was shifted, aiming it closer in towards the
planet, flinging it around and accelerating it towards Mercury. This was
the first mission to use the slingshot effect, also known as a planetary
swing-by or gravity-assist manoeuvre, to pick up speed and proceed on to
another target. The Mariner 10 arrived at Mercury a little under two
months later, flying just 705 km above the surface of the planet.
Definition: The slingshot effect, or planetary swing-by or gravity-assist
manoeuvre, is a strategy used with space probes to achieve a change in
velocity with little expenditure of fuel.
During a slingshot, a spacecraft deliberately passes close to a large
mass, such as a planet, so that the mass’s gravity pulls the spacecraft in
toward it. This causes the spacecraft to accelerate, and it heads around
the planet and departs in a different direction. The departure speed of
the spacecraft relative to the planet is the same as the approach speed
relative to the planet, but the change in direction can result in a real
change in velocity relative to the Sun. In general, a spacecraft will
approach a planet at an angle to the planet’s orbital path. By swinging
behind the planet an increase in speed can be achieved, and by swinging
in front of the planet’s path a decrease in speed is achieved. These
trajectories are shown in figure 4.5. The major benefit of the slingshot
effect is that the change in velocity is achieved with very little
expenditure of fuel.

Figure 4.5 (a) Swinging behind a planet’s path results in an increase in
speed relative to the Sun. (b) Swinging in front of a planet’s path results
in a decrease in speed.

In order to understand the slingshot effect, we need to analyse it as if it
were a perfectly elastic one-dimensional collision. Even though there is
no contact, the interaction behaves as a collision. However, because the
bodies do not touch in any way, there are no energy losses and, hence,
the ‘collision’ is elastic. Figure 4.6 represents the situation in this way.
Note the variables we are using: v_i is the initial velocity of the
spacecraft, V_i is the initial velocity of the planet, m is the mass of the
spacecraft and K × m is the mass of the planet, where K has a very large
value. Note: After the ‘collision’ there are two unknowns — the final
velocity of the planet, V_f, and that of the spacecraft, v_f.

Figure 4.6 The velocity gained by the slingshot effect (diagram labels:
planet, V_i, V_f; spacecraft, v_i, v_f = v_i + 2V_i)


Applying the conservation of momentum to this collision:
Initial momentum = final momentum
p_i of planet + p_i of spacecraft = p_f of planet + p_f of spacecraft
$KmV_i + m(-v_i) = KmV_f + mv_f$
$KV_i - v_i = KV_f + v_f$ [1]
Applying now the conservation of kinetic energy:
Initial kinetic energy = final kinetic energy
E_ki of planet + E_ki of spacecraft = E_kf of planet + E_kf of spacecraft
$\frac{1}{2}KmV_i^2 + \frac{1}{2}m(-v_i)^2 = \frac{1}{2}KmV_f^2 + \frac{1}{2}mv_f^2$
$KV_i^2 + v_i^2 = KV_f^2 + v_f^2$ [2]
Solving equations [1] and [2] simultaneously for v_f, and noting that K is
very large, leads to the following expression:
$v_f = v_i + 2V_i$
where
v_f = maximum exit velocity of spacecraft (m s⁻¹)
v_i = entry velocity of spacecraft (m s⁻¹)
V_i = velocity of planet (m s⁻¹).
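
The simultaneous solution of equations [1] and [2] is not shown in the text. One way of carrying it out is sketched below, using the same symbols and sign conventions: each equation is rearranged so that the planet terms are on the left, one is divided by the other, and K is then allowed to become very large. This working is an added illustration, not part of the original text.

```latex
\begin{align*}
K(V_i - V_f) &= v_f + v_i
  && \text{from [1]} \\
K(V_i - V_f)(V_i + V_f) &= (v_f - v_i)(v_f + v_i)
  && \text{from [2], factorising} \\
V_i + V_f &= v_f - v_i
  && \text{dividing the second line by the first} \\
V_f &= v_f - v_i - V_i \\
KV_i - v_i &= K(v_f - v_i - V_i) + v_f
  && \text{substituting into [1]} \\
v_f &= \frac{(K - 1)\,v_i + 2K\,V_i}{K + 1} \\
v_f &\to v_i + 2V_i
  && \text{as } K \to \infty
\end{align*}
```

The exact result shows that v_f = v_i + 2V_i is a limiting value: for any finite mass ratio K the exit velocity is very slightly smaller.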

This expression represents the maximum velocity achievable from
the slingshot effect. This is achieved by a head-on rendezvous as
described; at other angles, a spacecraft achieves lower velocities. The
final velocity of the planet is marginally less than its initial velocity. Note
that the kinetic energy of the system has been conserved, with the
spacecraft gaining some kinetic energy and the planet losing an
equivalent amount.
As an example, let us now consider a spacecraft approaching Jupiter
almost head-on for a slingshot manoeuvre. Jupiter has an orbital velocity
of about 13 000 m s⁻¹ relative to the Sun, and let us assume that the
spacecraft has a velocity of 15 000 m s⁻¹ relative to the Sun. Using the
equation above, the exit velocity of the spacecraft will be
(15 000 + 2 × 13 000) or 41 000 m s⁻¹.
In order to visualise how this effect occurs, it is helpful to consider
the relative velocities. If we consider the velocity of the spacecraft
relative to Jupiter then the situation is as if Jupiter were standing still.
When a small object collides elastically with a very large, massive
object, the small object will rebound without loss of speed — consider
a ball bouncing against a wall. Similarly, our spacecraft approaches
Jupiter at (13 000 + 15 000) or 28 000 m s⁻¹ relative to it, swings around
behind it and then slingshots back out in front at 28 000 m s⁻¹ relative
to Jupiter.
Look now at what has happened to the velocity of the spacecraft
relative to the Sun. If Jupiter’s velocity is 13 000 m s⁻¹ and the spacecraft
is moving ahead 28 000 m s⁻¹ faster than it, the spacecraft is now travelling
at 41 000 m s⁻¹ relative to the Sun.
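
The Jupiter example can be reproduced with the exact one-dimensional elastic collision result, v_f = [(K − 1)v_i + 2K V_i]/(K + 1), which approaches v_i + 2V_i when K is very large. The Python sketch below is an illustration only and is not part of the original text: the function names are assumptions, and the 1000 kg spacecraft mass is a hypothetical figure chosen simply to give a value for K. The mass of Jupiter is the value used in the chapter questions.

```python
def slingshot_exit_speed(v_i, V_i, K):
    """Exit speed (m/s) of the spacecraft after a head-on elastic 'collision'
    with a planet K times more massive; v_i and V_i are the initial speeds."""
    return ((K - 1) * v_i + 2 * K * V_i) / (K + 1)


def planet_speed_change(v_i, V_i, K):
    """Change in the planet's speed (m/s); negative means the planet slows."""
    return -2 * (v_i + V_i) / (K + 1)


V_JUPITER = 13_000.0    # Jupiter's orbital speed relative to the Sun (m/s)
V_SPACECRAFT = 15_000.0 # spacecraft speed relative to the Sun (m/s)
M_JUPITER = 1.9e27      # mass of Jupiter (kg)
M_SPACECRAFT = 1000.0   # hypothetical spacecraft mass (kg), assumed for illustration

K = M_JUPITER / M_SPACECRAFT

v_f = slingshot_exit_speed(V_SPACECRAFT, V_JUPITER, K)
dV = planet_speed_change(V_SPACECRAFT, V_JUPITER, K)

print(f"Spacecraft exit speed: {v_f:.0f} m/s")
print(f"Change in Jupiter's speed: {dV:.2e} m/s")
```

The planet's loss of speed is calculated as a separate quantity because it is far too small to show up if it is subtracted directly from 13 000 m s⁻¹ in ordinary floating-point arithmetic.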
A planetary swing-by of Jupiter very similar to that just described
was performed by the space probe Ulysses in February 1992, as shown
in figure 4.7. Ulysses used the swing-by to accelerate it out of the plane
of the solar system but still in an orbit of the Sun. This allowed it to
be the first spacecraft ever to explore this region of interplanetary
space.

Figure 4.7 Ulysses used a slingshot around Jupiter to accelerate it out of
the plane of the solar system. (Diagram labels: launch, October 1990; Jupiter
encounter, February 1992; ecliptic crossing, February 1995; Earth orbit;
Jupiter orbit; Sun; south trajectory.)

Some slingshot manoeuvres have become quite complex and sophisticated.
A case in point is the trajectory of the International Sun–Earth Explorer
3, or ISEE-3 space probe. It was initially placed into an orbit about the
Lagrangian L1 point between the Earth and the Sun. Here, it performed
investigations on the solar wind and its connection with the Earth’s
magnetic field. In June 1982, it was removed from there and sent through a
series of slingshots around the Moon and the Earth as shown in figure 4.8.
The end result of this complicated series of manoeuvres was the probe
heading into an orbit around the Sun and aiming for Comet
Giacobini–Zinner. At this point the probe was renamed the International
Cometary Explorer (ICE). It met up with Comet Giacobini–Zinner on
11 September 1985, making many important measurements of its plasma tail,
and was then able to travel on and meet up with Comet Halley in March 1986,
making it the first spacecraft to investigate two comets in this way.

Figure 4.8 The trajectory of the ISEE-3/ICE space probe, showing its
manoeuvres from launch to halo orbit to comet exploration. (Diagram labels
include: Delta 2914, launched August 12, 1978; halo orbit; L1; L2; Moon
orbit; Comet Giacobini–Zinner, 11-9-85; Comet Halley, 28-3-86.)


SUMMARY
• The Law of Universal Gravitation was proposed by Isaac Newton to describe
the force of attraction between any two masses. It is described by this
equation:
$F = G\frac{m_1 m_2}{r^2}$
• Gravity acts as the centripetal force for the orbital motion of satellites
and planets.
• Equating the formulas for universal gravitation and centripetal force
allows the derivation of the orbital velocity formula.
• A gravitational field is a vector field surrounding any mass within which
another mass will experience a gravitational force. Gravity is such a weak
force that it takes a massive object (such as a planet) to create a
significant gravitational field.
• The slingshot effect is a manoeuvre in which a spacecraft uses the gravity
of a planet to make a change in velocity with very little expenditure of
fuel.

QUESTIONS
1. Define Newton’s Law of Universal Gravitation.
2. (a) List the variables upon which the force of gravity between two masses
depends.
(b) Note that time is not one of the variables, implying that this force acts
instantaneously throughout space, faster even than light. Does this seem
reasonable to you? Explain your answer.
3. State what would happen to the strength of the force of attraction between
two masses if the distance between them were to halve and the masses
themselves were each to double.
4. Given the following data, calculate the magnitude of the gravitational
attraction between:
(a) Jupiter and its moon Callisto
(b) Jupiter and the Sun.
Mass of Jupiter = 1.9 × 10²⁷ kg
Mass of Callisto = 1.1 × 10²³ kg
Mass of the Sun = 1.99 × 10³⁰ kg
Jupiter–Callisto distance = 1.88 × 10⁹ m on average
Jupiter–Sun distance = 7.78 × 10¹¹ m on average
5. Calculate the magnitude of the force of gravitation between a book of mass
1 kg and a pen of mass 50 g lying just 15 cm from the centre of mass of the
book.
6. Calculate the force of gravity between a 72.5 kg astronaut and the Earth
(mass = 5.97 × 10²⁴ kg and radius 6.378 × 10⁶ m), using the Law of Universal
Gravitation:
(a) standing on the ground prior to launch
(b) at an altitude of 285 km after launch.
7. For each of the orbiting bodies shown in the following table, calculate
the orbital velocity from the period then use it to calculate the centripetal
force. Also calculate the value of the gravitational force acting on the body
and indicate how well the two forces compare.

ORBITING BODY                    ORBITAL PERIOD    CENTRAL BODY                  ORBITAL RADIUS
Satellite in low Earth orbit,    90.6 minutes      Earth, m = 5.97 × 10²⁴ kg     Altitude = 300 km
m = 1360 kg                                                                      ∴ r = 6.68 × 10⁶ m
Venus, m = 4.9 × 10²⁴ kg         225 Earth days    The Sun, m = 1.99 × 10³⁰ kg   1.09 × 10¹¹ m
Callisto, m = 1.1 × 10²³ kg      16.7 Earth days   Jupiter, m = 1.90 × 10²⁷ kg   1.88 × 10⁹ m

8. (a) Discuss the reasons why Newton’s Law of Universal Gravitation is
important to an understanding of the motions of satellites.
(b) Describe how the use of this law allows calculation of the motion of
satellites.
9. (a) State the nature of the slingshot effect or planetary swing-by.
(b) This is also known as a ‘gravity-assist manoeuvre’. Briefly describe the
role of gravity in this manoeuvre.
(c) State the laws of physics underlying this effect.
(d) Identify the benefits achieved by this manoeuvre.

CHAPTER 5 SPACE AND TIME

Remember
Before beginning this chapter, you should be able to:
• describe Newton’s First Law of Motion
• apply the definition of velocity: $v = \frac{\Delta r}{t}$.

Key content
At the end of this chapter you should be able to:
• outline the features of the aether model for light transmission
• describe and evaluate the Michelson–Morley experiment
• discuss the role of experiments in the scientific method, with reference to
the Michelson–Morley experiments and the use of thought experiments in
relativity
• outline the nature of inertial frames of reference
• discuss the principle of relativity
• recognise the constancy of the speed of light and discuss the implications
for theories of space and time
• discuss the definition of the metre in terms of time and the speed of light
• discuss qualitatively and quantitatively the following consequences of
relativity: the relativity of simultaneity, time dilation, length
contraction, mass dilation and the equivalence of mass and energy
• discuss the implications of time dilation, length contraction and mass
dilation for space travel.

Figure 5.1 Albert Einstein in 1905

Much of the following physics of relativity sprang out of considerations of
light — of what form it takes, how it moves from one place to another
and how fast. For many centuries physicists argued over whether light
takes the form of a shower of tiny particles, like buckshot from a shotgun,
or whether it is in the form of a wave motion like sound waves. Then, in
1801, Thomas Young performed an experiment that showed that light
rays could interfere with each other to produce a pattern, which is a
property unique to waves. The early nineteenth century saw a series of
further demonstrations of the wave nature of light.
In 1864, James Clerk Maxwell seemed to put the issue beyond doubt
when he produced a brilliant set of equations to explain the behaviour of
electric and magnetic fields, and then used the equations to show that
these fields could move together as waves through space at the speed of
light. Additionally, he proposed that light was also a form of these
electromagnetic waves. The German scientists Helmholtz and Hertz were
able, in 1887, to produce experimental evidence for the existence of
these waves.

5.1 THE AETHER MODEL


Having concluded that light moves as a waveform, nineteenth-century
physicists turned to other wave motions in order to better understand
light. There were many others known, including sound waves, water
waves, and earthquake waves. All of these waveforms need a medium
through which to travel, and so it was believed that light waves would also
require a medium. Nobody could find such a medium but belief in its
existence was so strong that it was given a name, the ‘luminiferous
aether’, and its properties were identified. The aether:
• filled all of space, had low density and was perfectly transparent
• permeated all matter and yet was completely permeable to material
objects
• had great elasticity to support and propagate the light waves.

The aether was the proposed medium for light and other electromagnetic waves, before it was realised that these waveforms do not need a medium in order to travel.

This list of properties may seem odd to us now and the whole concept
of the aether may seem strange in hindsight, but bear in mind that
nineteenth-century physicists were trying to understand a phenomenon
completely unknown to them. It is not unlike the situation facing
modern cosmologists in trying to understand why the universe seems to
have much more matter than can be observed, and why the expansion of
the universe seems to be accelerating. Some explanations of these
modern-day puzzles attribute some similarly unusual properties to other-
wise ‘ordinary’ space.
The search for the aether was to occupy physicists for several decades
before it was finally accepted that (a) the aether does not actually exist,
and (b) electromagnetic waves (including light) are unique in that they
do not require a medium of any sort in order to move.

A medium is the material through which a wave travels.

The Michelson–Morley experiment
If you were in a boat, how could you tell whether the boat was moving?
You might simply look over the side to see if water is flowing past. If you
wanted to be certain, you might dip your hand in to feel the flowing
water.
Similarly, if the aether did exist, our Earth, moving through space at
about 30 km s$^{-1}$ as it orbits the Sun, should be moving through the aether.
From our point of view, we should experience a flow of aether past us or,
as it became known, an ‘aether wind’. However, the aether was thought to
be extremely tenuous, so any aether wind would be hard to detect. There
were many experiments designed and performed to detect it, but they all
failed. The assumption was that the detection mechanisms were simply not
sensitive enough.
The definitive experiment to detect the aether wind was performed by
A. A. Michelson and E. W. Morley in 1887. It was exceedingly sensitive.
Michelson was awarded the Nobel Prize in Physics in 1907 for his precision
optical instruments and the measurements made with them.
In order to understand how the experiment worked, consider the
analogy shown in figure 5.2. Two identical speedboats are going to have
a race on a river, over two different courses. Boat A will head upstream
for 2 km before turning around and heading back. Boat B will head
directly across the 2 km-wide river before returning. Each boat is
capable of a boat speed of 5 km h$^{-1}$ and each completes a 4 km circuit.
However, the current in the river affects the velocity of each boat
differently, as boat A heads directly along the current while boat B heads
across it. As figure 5.3 shows, each boat has a different effective velocity
(relative to the bank) and boat A takes 15 minutes longer to complete
the course.

5.1 Modelling the Michelson–Morley experiment

Figure 5.2 Two identical speedboats race over different 4 km courses, one against and then back along the current, and the other across the current. Boat A travels 2 km upstream and back; boat B crosses the 2 km-wide river and back. The current flows at 3 km h$^{-1}$. Boat B has to head upstream slightly to struggle against the current and still head directly across the river.

Figure 5.3 The current affects each boat differently, causing the boat that heads across the current to win every time.

(a) The situation for boat A heading along the river
Journey out (against current): boat speed = 5 km h$^{-1}$ through the water = 2 km h$^{-1}$ as seen from the river bank
$$\therefore \text{time taken} = \frac{\text{distance}}{\text{speed}} = \frac{2\ \text{km}}{2\ \text{km h}^{-1}} = 1\ \text{h}$$
Return journey (with current): boat speed = 5 km h$^{-1}$ through the water = 8 km h$^{-1}$ as seen from the river bank
$$\therefore \text{time taken} = \frac{\text{distance}}{\text{speed}} = \frac{2\ \text{km}}{8\ \text{km h}^{-1}} = 0.25\ \text{h} = 15\ \text{min}$$
Hence, total time taken = 1 h 15 min.

(b) The situation for boat B heading across the river
Boat speed = 5 km h$^{-1}$ through the water, but this boat must head slightly upstream so that it can travel directly across. From Pythagoras’ theorem,
$$\text{boat speed} = \sqrt{5^2 - 3^2} = 4\ \text{km h}^{-1}\ \text{as seen from the river bank}$$
$$\therefore \text{time taken} = \frac{\text{distance}}{\text{speed}} = \frac{2\ \text{km}}{4\ \text{km h}^{-1}} = 0.5\ \text{h each way}$$
Hence, total time taken = 1 h and this boat wins!

CHAPTER 5 SPACE AND TIME 73


If the object of the race was to determine the speed of the current,
then it could be calculated from the difference in arrival times of the two
speedboats. Also, by repeating the race with the boats swapped (A
heading across the river and B heading along the river), any difference
between the boats could be eliminated as a cause for the time difference
as boat A should now win by the same margin.
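The boat race arithmetic can be checked with a short calculation. The following is a minimal Python sketch and is not part of the original text. The function names `race_times` and `current_from_times` are hypothetical, and recovering the current from the ratio of the two finishing times (with the boat speed assumed to be known) is only one possible way of doing it.

```python
import math

def race_times(boat_speed, current_speed, leg_km):
    """Return (time_along, time_across) in hours for the two round trips."""
    # Boat A: slowed by the current on the way out, helped by it on the way back.
    time_along = (leg_km / (boat_speed - current_speed)
                  + leg_km / (boat_speed + current_speed))
    # Boat B: must angle upstream, so its speed across the river
    # (relative to the bank) comes from Pythagoras' theorem.
    speed_across = math.sqrt(boat_speed**2 - current_speed**2)
    time_across = 2 * leg_km / speed_across
    return time_along, time_across

def current_from_times(boat_speed, time_along, time_across):
    """Recover the current speed from the two finishing times.

    Uses time_across / time_along = sqrt(1 - (current / boat_speed)**2).
    """
    ratio = time_across / time_along
    return boat_speed * math.sqrt(1 - ratio**2)

# Values from the river example: 5 km/h boats, 3 km/h current, 2 km legs.
t_along, t_across = race_times(5, 3, 2)
print(t_along, t_across)                        # times in hours
print(current_from_times(5, t_along, t_across)) # recovered current in km/h
```

With the figures used in the river example, the sketch gives 1.25 h for boat A and 1.0 h for boat B, and the recovered current is 3 km h$^{-1}$ to within rounding.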
This is essentially what Michelson and Morley did: they raced two
light rays over two courses, one into the supposed aether wind and one
across it, then swung the apparatus through 90 degrees to swap the
rays. They were looking for a difference between the rays as they finished
their race, from which they could calculate the value of the aether wind.
Figure 5.4 shows their apparatus, while figure 5.5 uses a simplified
diagram to show how it worked.

Figure 5.4 The apparatus of the Michelson–Morley experiment set up on a large stone block, to keep it rigid, and floating on mercury so that it can be easily rotated 90 degrees

Figure 5.5 A simplified diagram showing the light rays’ paths in the Michelson–Morley experiment. A light ray from the source is split into two by the half-silvered mirror. Ray A heads into the aether wind then reflects against mirror M1 and returns. Ray B heads across the aether wind before reflecting back. Both rays finish their journey at the telescope, where they are compared. (Labels in the diagram: S, M1, M2, T and v.)

The method of comparing the light rays involves a very sensitive effect
called ‘interference’, and hence this apparatus is referred to as an
‘interferometer’. Essentially, when looking into the telescope a pattern of
light and dark bands will be seen, as shown in figure 5.5. If the aether wind
exists, so that one light ray is indeed faster than the other, then when the
apparatus is rotated, so that the rays are swapped, the interference
pattern should be seen to shift. However, no such shift was observed.
The experiment was repeated many times by Michelson and Morley, at
different times of the day and year, but no evidence of an aether wind
was ever found. The scientific community was not quick to abandon the
aether model, however, and adapted the theory to keep it alive. One
suggestion was that a large object such as a planet could drag the aether
along with it. Another was that objects contract in the direction of the
aether wind. However, none of these modifications survived close
scrutiny. Further, the Michelson–Morley experiment has been repeated many
times since 1887 by different groups with more and more sensitive
equipment, and no evidence of an aether has ever been found. Yet belief in the
necessity of the aether was so strong that physicists found it difficult to let
go of the idea until, in 1905, Albert Einstein showed that the aether was
not necessary at all.

With hindsight, the result of the Michelson–Morley experiment has been able to help scientists of the twentieth century to reject the aether model and accept Einstein’s relativity. In this sense, it has been an important experiment in helping others to decide between the competing theories, along with the comparative success of relativity experiments. It is important to note, however, that it did not sway scientific belief at the time. Aether supporters saw the null result only as an indication that their model needed improvement. Einstein, although apparently aware of the Michelson–Morley result, was not influenced by it and was unconcerned with proposed aether model modifications. He was approaching the problem from an entirely different direction.

5.2 SPECIAL RELATIVITY


Inertial frames of reference and the principle of relativity
Three hundred years before Einstein, Galileo posed a simple idea, now
called the ‘principle of relativity’, which states that all steady motion is
relative and cannot be detected without reference to an outside point.

5.2 Non-inertial frames of reference

The idea can be found built into Newton’s First Law of Motion as well.
Put another way, if you are travelling inside a vehicle you cannot tell if
you are moving at a steady velocity or standing still without looking out
the window. You may have experienced this personally when sitting in a
train and an adjacent train begins to roll — at first you may think that
your own train is moving until you look out of a window on the other side
of the carriage.
There are two points that must be reinforced:
• The principle of relativity applies only for non-accelerated steady
motion; that is, standing at rest or moving with a uniform velocity. This
is referred to as an inertial frame of reference. Situations that involve
acceleration are called non-inertial frames of reference.
• This principle states that within an inertial frame of reference you
cannot perform any mechanical experiment or observation that would
reveal to you whether you were moving with uniform velocity or
standing still.

An inertial frame of reference is a non-accelerated environment. Only steady motion or no motion is allowed. A non-inertial frame of reference experiences acceleration.

As an example, if you were seated in a very smooth train and you held
up a string with a small object tied to the end, the object would hang so
that the string was vertical. However, as the train pulled away from the
station you would notice the object swing backward so that the string was
no longer vertical. This would continue until the train reached its
cruising speed and stopped accelerating, at which point the object would
move forward so that the string was once again vertical. When rounding
bends in the track you would notice the string leaning one way or the
other, but once the track straightened out the string would once again be
vertical, just as it was when standing at the station. This plumb bob is
operating as a simple accelerometer but it is unable to distinguish
between being motionless and steady motion.
In the late nineteenth century, belief in the aether posed a difficult
problem for the principle of relativity, because the aether was supposed
to be stationary in space and light was supposed to have a fixed velocity
relative to the aether. This meant that if a scientist set up equipment to
measure the speed of light from the back of a train carriage to the front,
and it turned out that the light was slower than it should be, the train
must be moving into the aether. Put another way, this optical experiment
provides a way to violate the principle of relativity where no mechanical
experiment could.

A constant speed of light


Around the turn of the twentieth century, Albert Einstein puzzled over
the apparent violation of the principle of relativity posed by the aether
model. He had an ability to reduce a problem down to its simplest
form and present it as a thought experiment. In this case the question
he posed was: if I were travelling in a train at the speed of light and I
held up a mirror, would I be able to see my own reflection? If the
aether model was right, light could go no faster than the train. It could
never catch up with the mirror to return as a reflection. The principle
of relativity is thus violated because seeing one’s reflection disappear
would be a way to detect motion. On the other hand, if the principle
of relativity were not to be violated, the reflection must be seen
normally, which means that it is moving away from the mirror holder at
$3 \times 10^8$ m s$^{-1}$. However, this would mean that an observer on the
embankment next to the train would see that light travelling at twice
its normal speed.



This was a considerable dilemma but Einstein had a strong belief in the
unity of physics and decided that the principle of relativity must not be
violated and the reflection in the mirror must always be seen. This, in turn,
meant that the aether did not exist. As a way out of the dilemma described,
he also decided that the train rider and the person on the embankment
must both observe the light travelling at its normal speed of $3 \times 10^8$ m s$^{-1}$.
But how can this be? Einstein realised that if both observers were to see
the same speed of light, and since $\text{speed} = \dfrac{\text{distance}}{\text{time}}$, then the distance
and time witnessed by both observers must be different.

A published article
These ideas were explicitly stated in Einstein’s 1905 paper titled ‘On the
Electrodynamics of Moving Bodies’, which presented:
• a first postulate: the laws of physics are the same in all inertial frames of
reference; that is, the principle of relativity always holds
• a second postulate: the speed of light in empty space always has the
same value, c, which is independent of the motion of the observer;
that is, everyone always observes the same speed of light regardless of
their motion
• a statement: the luminiferous aether is superfluous; that is, it is no
longer needed to explain the behaviour of light. Einstein now had the
confidence to set this concept aside.

Space–time
In Newtonian physics, distance and velocity can be relative terms, but
time is an absolute and fundamental quantity. Figure 5.6 uses the
example of the train rider with the mirror, and shows how the velocity
of the light is characterised by two events — the light leaving his face
and the light arriving at the mirror (if we consider just the forward
part of the light’s journey). Remembering that the train is travelling at
the speed of light, Newtonian physics says that the observer on the
embankment outside the train records precisely double the distance of
the journey of the light compared with that recorded by the train rider;
however, they both record the same time.
Since velocity, $v = \dfrac{\text{distance covered}}{\text{time taken}}$, this means that the observer on
the embankment would measure a velocity of light twice that measured
by the train rider.

Figure 5.6 The train traveller in the light-speed train looks at his reflection in a mirror. An observer on the embankment outside the train sees the light travelling twice as far.

However, Einstein’s theory says that this is not what will occur. Rather,
both the observer on the embankment and the train rider will measure
precisely the same value for the velocity of light, called ‘c’. He realised
that this could only be true if the observer and the rider observed dif-
ferent times as well as different distances in such a way that distance
divided by time always equals the same value, c.
Einstein radically altered the assumptions of Newtonian physics so that
now the speed of light is absolute, and space and time are both relative
quantities that depend upon the motion of the observer. In other words,
the measured length of an object and the time taken by an event depend
entirely upon the velocity of the observer. Further to this, since neither
space nor time is absolute, the theory of relativity has replaced them
with the concept of a space–time continuum. Any event then has four
dimensions (three space coordinates plus a time coordinate) that fully
define its position within its frame of reference.

PHYSICS FACT
Definition of the metre
The metre as a unit of length was first defined in 1793 when the French government decreed it to be $1 \times 10^{-7}$ times the length of the Earth’s quadrant passing through Paris. This arc was surveyed and then three platinum standards and several iron copies were made. When it was discovered that the quadrant survey was incorrect, the metre was redefined as the distance between two marks on a bar. In 1875 the Metre Convention laid the foundation for what later became the Système International (SI) of units, and the definition became more formal: a metre was the distance between two lines scribed on a single bar of platinum–iridium alloy. Copies, or ‘artefacts’, were made for dissemination of this standard.
There is always a need for the accuracy of a unit of measure to keep pace with improvements in technology and science, so the metre has since been redefined twice.
The current definition of the metre uses the constancy of the speed of light in a vacuum (299 792 458 m s$^{-1}$) and the accuracy of the definition of one second (9 192 631 770 oscillations of the $^{133}$Cs atom), to achieve a definition that is both highly accurate and consistent with the idea of space–time. One metre is now defined as the length of the path travelled by light in a vacuum during the time interval of $\dfrac{1}{299\,792\,458}$ of a second.
The term ‘light-year’ is a similar distance unit, being the length of the path travelled by light in a time interval of one year. One light-year is approximately equal to $9.4607 \times 10^{12}$ km.
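The definitions in this box lend themselves to a quick numerical check. The following is a minimal Python sketch and is not part of the original text. The function names are hypothetical, and the length of a year is assumed to be 365.25 days, which is a common convention but an assumption here.

```python
C = 299_792_458  # speed of light in m/s, exact by definition of the metre

# Assumption: one year is taken as 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

def light_travel_time(distance_m):
    """Time in seconds for light to cover distance_m metres in a vacuum."""
    return distance_m / C

def light_year_km():
    """Distance in kilometres travelled by light in one (assumed) year."""
    return C * SECONDS_PER_YEAR / 1000

print(light_travel_time(1.0))  # time for light to travel one metre, in seconds
print(light_year_km())         # one light-year in kilometres
```

Under these assumptions, light takes about 3.34 ns to travel one metre, and one light-year comes out at about $9.461 \times 10^{12}$ km.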

5.3 CONSEQUENCES OF SPECIAL RELATIVITY

Part of the intriguing nature of relativity is that it begins with ideas that
are so logical as to be inescapable. But these seemingly simple ideas lead
to conclusions that can be amazing and, at first, difficult to accept. In this
section, we will examine some of the more well-established consequences
of the theory of relativity.

The relativity of simultaneity


As a way of better understanding how time is affected by relativity,
Einstein analysed our perception of simultaneous events. He pointed
out that when we state the time of an event, we are, in fact, making a
judgement about simultaneous events. For example, if we say ‘school
begins at 9 am’, then we are really saying that the ringing of a certain
bell and the appearance of ‘9 am’ on a certain clock are simultaneous
events.
Einstein contended that if an observer sees two events to be simul-
taneous then any other observer, in relative motion to the first, generally
will not judge them to be simultaneous. In other words, simultaneous
events in one frame of reference are not necessarily observed to be
simultaneous in a different frame of reference. This is known as the
relativity of simultaneity. To illustrate his contention, Einstein offered
the thought experiment shown in figure 5.7.
An operator of a lamp rides in the middle of a rather special train
carriage. The doors at either end of the carriage are light-operated. At an
instant in time when the operator happens to be alongside an observer
on the embankment (outside the moving train), the operator switches on
the lamp which, in turn, opens the doors.



Figure 5.7 A thought experiment to illustrate the relativity of simultaneity. (a) As seen by train traveller: back door open, front door open. (b) As seen by stationary observer: the train has moved on from the original position of the lamp; the back door is open but the front door is still closed, because the light waves haven’t reached the front door yet.

The operator of the lamp will see the two doors opening
simultaneously. The distance of each door from the lamp is the same
and light will travel at the same speed (c) both forward and backward so
that each door receives the light at the same time and they open
simultaneously.
The observer on the embankment, however, sees the situation differ-
ently. After the lamp is turned on, but before the light has reached the
doors, the train has moved so that the front door is now further away and
the back door is closer. He sees the light travelling both forward and
backward at the same speed (c), but the forward journey is now longer
than the backward journey, so that the back door is seen to open before
the front door. They are most definitely not judged to be simultaneous
events.
It is tempting to ask who is correct — the operator in the train or the
observer on the embankment. The answer is that they both are. Both
observers judged the situation correctly from their different frames of
reference and this is a direct consequence of the constancy of the speed
of light.
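The embankment observer’s reasoning can be put into numbers. The following is a minimal Python sketch and is not part of the original text. The function name `door_opening_times` is hypothetical, and `carriage_length` is assumed to be the length of the carriage as measured by the observer on the embankment.

```python
C = 299_792_458.0  # speed of light in m/s

def door_opening_times(carriage_length, v, c=C):
    """Times (s) at which light from a central lamp reaches each door,
    as judged from the embankment while the train moves forward at speed v."""
    half = carriage_length / 2
    # The back door moves toward the light, so the gap closes at c + v.
    t_back = half / (c + v)
    # The front door moves away from the light, so the gap closes at c - v.
    t_front = half / (c - v)
    return t_back, t_front

# Hypothetical example: a 30 m carriage moving at half the speed of light.
t_back, t_front = door_opening_times(30.0, 0.5 * C)
print(t_back, t_front, t_front / t_back)
```

At half the speed of light the front door opens three times later than the back door in this frame, whereas setting `v = 0` (the operator’s own frame) gives equal times.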

The relativity of time


We have already seen that time is perceived differently by observers in
relative motion to each other. We are now going to determine an
equation that shows this mathematically. Much of the work that follows
uses the relationship that distance = velocity × time or, when we are
talking about the passage of light, distance = $ct$. This relationship comes
from the definition of velocity.
The following thought experiment uses a ‘light clock’. As shown in
figure 5.8, a light pulse is released by a lamp, travels the length of the
clock and is then reflected back to a sensor next to the lamp. When the
sensor receives the pulse of light, it goes ‘click’.

Figure 5.8 A light clock. A lamp and a receiver sit at one end and a mirror at the other, a distance L away; the receiver goes ‘click’ when the pulse returns.

For this thought experiment we shall return to the train scenario
favoured by Einstein. Imagine a traveller, seated in a speeding train. The
light clock is arranged vertically, with the lamp at the ceiling and the
mirror on the floor. An observer is watching from the embankment
outside the train. Our question now is this: when a light pulse is released,
how long does it take to travel down to the mirror and return back to the
ceiling, as seen by both the train traveller and the observer?
Let us first examine the situation as seen by the train traveller in the
rest frame; that is, the frame within which the event occurs. If $L$ is the
height of the train carriage, for the total journey we can say that:
$$\text{distance} = 2L = ct_0$$
where
$t_0$ = time taken as seen by traveller
$L$ = height of the carriage
so that
$$t_0 = \frac{2L}{c}.$$

A rest frame is the frame of reference within which a measured event occurs or a measured object lies at rest.

Examine now the situation as seen by the observer on the embankment.
Figure 5.9 compares the way the situation is viewed by
each person. From outside the train the observer sees the light travelling
along a much longer journey, and its length can be determined using
Pythagoras’ theorem:
$$\text{Total journey} = ct_v = 2\sqrt{L^2 + \left(\frac{vt_v}{2}\right)^2} = \sqrt{4L^2 + v^2t_v^2}.$$
Squaring this expression gives:
$$c^2t_v^2 = 4L^2 + v^2t_v^2.$$
Rearranging this leads to:
$$t_v^2 = \frac{4L^2}{c^2 - v^2}.$$
Taking the square root of both sides:
$$t_v = \frac{2L}{c\sqrt{1 - \dfrac{v^2}{c^2}}}$$
but, from above, $t_0 = \dfrac{2L}{c}$.
Substituting this into the expression gives:
$$t_v = \frac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}} \quad \text{(the time dilation equation)}$$
where
$t_0$ = time taken in the rest frame of reference
= proper time
$t_v$ = time taken as seen from the frame of reference in relative motion
to the rest frame
$v$ = velocity of the train
$c$ = speed of light.
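The derivation can be checked numerically by solving the Pythagorean path equation directly and comparing the answer with the time dilation equation. The following is a minimal Python sketch and is not part of the original text. The function names are hypothetical, and the repeated-substitution method used to solve for the time is simply one convenient choice.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def light_clock_time_outside(height_m, v, iterations=200):
    """Solve c*t = 2*sqrt(L**2 + (v*t/2)**2) for t by repeated substitution.

    height_m is the carriage height L, v is the train speed in m/s (v < C).
    """
    t = 2 * height_m / C  # start from the rest-frame time t0
    for _ in range(iterations):
        t = 2 * math.sqrt(height_m**2 + (v * t / 2) ** 2) / C
    return t

def time_dilation_formula(t0, v):
    """Closed-form result: t_v = t0 / sqrt(1 - v**2 / c**2)."""
    return t0 / math.sqrt(1 - (v / C) ** 2)

# Hypothetical example: a 3 m high carriage moving at 0.6c.
height = 3.0
v = 0.6 * C
t0 = 2 * height / C
print(light_clock_time_outside(height, v) / t0)  # ratio from the geometry
print(time_dilation_formula(t0, v) / t0)         # ratio from the equation
```

For a speed of 0.6c both approaches give a ratio of 1.25, so the path geometry and the equation agree.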
Note that $t_0$ is the time taken for the clock to go ‘click’ as observed by
the train traveller, while $t_v$ is the time taken as observed by the person on
the embankment. Looking at the time dilation equation, we can see that,
for any non-zero $v$, the term $\sqrt{1 - \dfrac{v^2}{c^2}}$ is always less than one so that $t_v$ is always greater than $t_0$.
This means that the clock takes longer to go ‘click’ as observed by the
person on the embankment or, put another way, the outside observer
hears the light clock clicking more slowly than does the train traveller. Time is
passing more slowly on the train as observed by the person outside the
train!

Figure 5.9 The length of the path of the light is perceived differently by the train traveller and the observer on the embankment. The observer sees the light travel further but with the same speed, hence time slowed down on the train. (a) As seen by the train traveller. (b) As seen by an observer on an embankment outside the train, for whom the train moves a distance $vt_v$ during the light’s journey.


This effect is called time dilation and can be generally stated as follows:
the time taken for an event to occur within its own rest frame is called
the proper time $t_0$. Measurements of this time, $t_v$, made from any other
inertial reference frame in relative motion to the first, are always greater.
The degree of time dilation varies with velocity as shown in figure 5.10.
The effect can be most simply stated as: moving clocks run slow.

Time dilation is the slowing down of events as observed from a reference frame in relative motion.

Figure 5.10 The degree of time dilation varies with velocity. (Graph of $t_v/t_0$, from 0 to 10, against velocity $v/c$, from 0 to 1.)

This rather startling conclusion has been experimentally verified by
comparing atomic clocks that have been flown over long journeys with
clocks that have remained stationary for the same period. These
experiments are possible only because of the extreme accuracy of atomic clocks
built over the last few decades, even though Einstein predicted this effect
about 100 years ago.
Further supporting evidence has been found in the abundance of
muons (once called mu-mesons) striking the ground after having been
created in the upper atmosphere by incoming cosmic rays. What is
surprising is that the muons have a velocity of about 0.996c and, at that
speed, should take approximately 16 µs to travel through the atmosphere.
However, when measured in a laboratory, muons have an average lifetime
of approximately 2.2 µs. This anomaly can be explained by the fact that
2.2 µs represents their proper lifetime, as measured in their rest frame.
As observed from the ground, their lifetime is dilated by their relativistic
speed to roughly 25 µs, which is long enough for many of them to
complete the 16 µs journey.
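The size of the effect for the cosmic-ray particles can be estimated from the time dilation equation. The following is a minimal Python sketch and is not part of the original text. The name `lorentz_factor` is the standard name for the quantity $1/\sqrt{1 - v^2/c^2}$ but is introduced here rather than in the text, and the input values are the approximate figures quoted in this section.

```python
import math

def lorentz_factor(beta):
    """Return 1 / sqrt(1 - beta**2), where beta = v / c."""
    return 1 / math.sqrt(1 - beta**2)

PROPER_LIFETIME_US = 2.2  # average lifetime in the particle's rest frame (µs)
BETA = 0.996              # speed as a fraction of the speed of light
FLIGHT_TIME_US = 16.0     # approximate time to cross the atmosphere (µs)

gamma = lorentz_factor(BETA)
dilated_lifetime_us = gamma * PROPER_LIFETIME_US

print(gamma)                                 # factor by which time is dilated
print(dilated_lifetime_us)                   # lifetime seen from the ground (µs)
print(dilated_lifetime_us > FLIGHT_TIME_US)  # can the journey be completed?
```

With these figures the dilation factor is a little over 11, so the lifetime observed from the ground is roughly 25 µs, which is longer than the 16 µs flight time.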

SAMPLE PROBLEM 5.1
A time-dilated sneeze
A train traveller sneezes just as his train passes through a station. The sneeze
takes precisely 1.000 s as measured by another person seated next to the
sneezer. If the train is travelling at half the speed of light, how long does
the sneeze take as seen by a person standing on the platform of the station?
SOLUTION
The proper time, $t_0$, is the time as observed within the sneezer’s rest frame, and
therefore is 1.000 s. The time dilation equation is needed to determine the
time, $t_v$, as observed from the frame in relative motion; that is, the platform:
$$t_v = \frac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{1.000}{\sqrt{1 - \left(\dfrac{0.5c}{c}\right)^2}} = 1.155\ \text{s}.$$

SAMPLE PROBLEM 5.2
A time-dilated yawn
Continuing the last problem, if the person standing on the platform
yawned just as the train was passing through, and this yawn lasted 2.000 s
as measured by the yawner, what would be the duration of the yawn as
measured by the train travellers?
SOLUTION
In this case the station platform is the rest frame of the yawn so that
$t_0$ = 2.000 s and the time as measured from the train is $t_v$.
$$\therefore t_v = \frac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{2.000}{\sqrt{1 - \left(\dfrac{0.5c}{c}\right)^2}} = 2.309\ \text{s}.$$
Notice that each observer sees time dilated in the other frame of
reference. This is central to relativity. There is no absolute frame of reference.
No inertial frame is to be preferred over another and relativistic effects
are reciprocal: each observer sees the same effects in the other frame.
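Both sample problems use the same equation, so a single small function reproduces them. The following is a minimal Python sketch and is not part of the original text. The function name `dilated_time` is hypothetical, and the speed is passed as the fraction $v/c$ for convenience.

```python
import math

def dilated_time(proper_time_s, beta):
    """Time observed from a frame moving at beta = v/c relative to the rest frame."""
    return proper_time_s / math.sqrt(1 - beta**2)

# Sample problem 5.1: a 1.000 s sneeze on the train, seen from the platform.
print(round(dilated_time(1.000, 0.5), 3))  # 1.155

# Sample problem 5.2: a 2.000 s yawn on the platform, seen from the train.
print(round(dilated_time(2.000, 0.5), 3))  # 2.309
```

The same function is used in both directions, which reflects the point that each observer sees the other’s clock running slow.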

The relativity of length

As a consequence of perceiving time differently, observers in differing
frames of reference also perceive length differently; that is, lengths that
are parallel to the direction of motion. In order to understand how this
occurs we will construct another thought experiment.
This time our train traveller has arranged the light clock so that it runs
the length of the train, with the lamp and sensor located on the back wall
and the mirror on the front wall. As the train passes the observer on the
embankment, the light clock emits a light pulse which travels to the front
wall and then returns to the back wall where it is picked up by the sensor,
which then goes ‘click’. This journey is observed by both people, but
what is the length of the journey that each perceives?
Let us start with the situation as seen by the train traveller. Figure
5.11(a) shows that this is a simple situation:
$$\text{length of light journey} = ct_0 = 2L_0$$
where
$L_0$ = length of train as perceived by train traveller
$t_0$ = time taken as perceived by train traveller.

The situation seen by the observer at the side of the track is somewhat
different because the train is moving at the same time, lengthening the
forward leg of the light pulse’s journey and shortening the return leg, as
shown in figure 5.11(b).
If $t_1$ is the time taken for the forward part of the journey, then:
$$\text{length of forward journey} = ct_1 = L_v + vt_1$$
and hence,
$$t_1 = \frac{L_v}{c - v}$$
where
$L_v$ = length of the train as measured by the observer on the embankment.
Similarly, $t_2$ is the time taken for the return, so that:
$$\text{length of the return journey} = ct_2 = L_v - vt_2$$
and hence,
$$t_2 = \frac{L_v}{c + v}.$$
The time for the whole journey as seen by the observer on the
embankment is:
$$t_v = t_1 + t_2 = \frac{L_v}{c - v} + \frac{L_v}{c + v}$$
which can be rearranged to give
$$t_v = \frac{2L_v}{c\left(1 - \dfrac{v^2}{c^2}\right)}.$$
It is now crucial to appreciate that each observer perceives time
differently. To take that into account, we need to equate the time dilation
equation to the one just derived:
$$\frac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{2L_v}{c\left(1 - \dfrac{v^2}{c^2}\right)}$$
but we know that $t_0 = \dfrac{2L_0}{c}$, so
$$\frac{2L_0}{c\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{2L_v}{c\left(1 - \dfrac{v^2}{c^2}\right)}$$
which reduces down to give
$$L_v = L_0\sqrt{1 - \frac{v^2}{c^2}} \quad \text{(the length contraction equation)}$$
where
$L_0$ = the length of an object measured from its rest frame
$L_v$ = the length of an object measured from a different frame of reference
$v$ = relative speed of the two frames of reference
$c$ = speed of light.
Figure 5.11 A thought experiment to derive the relativity of length. (a) As seen by the train traveller: the light clock runs the length of the train, $L_0$. (b) As seen by an observer outside the train: the forward leg of the light’s journey is $L_v + vt_1$ and the return leg is $L_v - vt_2$.

Returning to the thought experiment, this equation means that, since
the term $\sqrt{1 - \dfrac{v^2}{c^2}}$ is always less than one, the length of the train as
observed by the person on the embankment is less than that observed by
the person inside the train. The person outside the train has seen the
train shorten, and the faster it goes the shorter it gets!
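The consistency of the two results can be checked numerically: if the train really is contracted by the factor $\sqrt{1 - v^2/c^2}$, then the out-and-back light journey timed from the embankment should equal the dilated version of the traveller’s time. The following is a minimal Python sketch and is not part of the original text. The function names are hypothetical, and units are chosen so that $c = 1$ purely for convenience.

```python
import math

def round_trip_time_outside(rest_length, beta, c=1.0):
    """Out-and-back light time along the train, as timed from the embankment."""
    v = beta * c
    contracted = rest_length * math.sqrt(1 - beta**2)  # L_v from the contraction equation
    t_forward = contracted / (c - v)  # the front wall runs away from the pulse
    t_return = contracted / (c + v)   # the back wall runs toward the pulse
    return t_forward + t_return

def dilated_round_trip(rest_length, beta, c=1.0):
    """The traveller's time t0 = 2*L0/c, dilated for the embankment observer."""
    t0 = 2 * rest_length / c
    return t0 / math.sqrt(1 - beta**2)

for beta in (0.1, 0.5, 0.9):
    print(beta, round_trip_time_outside(20.0, beta), dilated_round_trip(20.0, beta))
```

For each speed the two printed times agree to within rounding, which is exactly the condition used above to obtain the length contraction equation.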



This effect is called length contraction and can be generally stated as
follows: the length of an object measured within its rest frame is called its
proper length, $L_0$, or rest length. Measurements of this length, $L_v$, made
from any other inertial reference frame in relative motion parallel to that
length, are always less.
It can be most simply stated as: moving objects shorten in the direction
of their motion.

Length contraction is the shortening of an object in the direction of its motion as observed from a reference frame in relative motion.

This is another surprising result of relativity and it is interesting to see how
the degree of length contraction varies with velocity, as shown in figure 5.12.
Notice that as velocity approaches the speed of light, the observed length
approaches zero. If this were a spaceship blasting past a planet at near light speed,
the inhabitants of the planet would see a very short spaceship of nearly zero
length, but the space travellers would
notice no change at all to the length of their ship. They would, instead,
briefly observe a wafer-thin planet in their windows, since from their
inertial frame of reference it is the planet in rapid motion, not themselves.

Figure 5.12 The degree of length contraction varies with velocity. (Graph of $L_v/L_0$, from 0 to 1.0, against velocity $v/c$, from 0 to 1.)

A length-contracted train
SAMPLE PROBLEM 5.3 When stationary, the carriages on the state’s new VVFT (very, very fast train) are each 20 m long. How long would each carriage appear to a person standing on a station platform as this express train speeds through at half the speed of light?
SOLUTION The proper length, $L_0$, of a carriage is 20 m, while the length as seen from the platform is $L_v$.
$$L_v = L_0\sqrt{1 - \frac{v^2}{c^2}} = 20\sqrt{1 - (0.5)^2} = 17.32\ \text{m}$$

A length-contracted person
SAMPLE PROBLEM 5.4 An occupant of the VVFT looks out of a window and catches a quick glimpse of the person standing on the platform. If the thickness (from chest to back) of that person measured on the platform is 30 cm, what is the thickness observed from the train?
SOLUTION The proper length (thickness in this case), $L_0$, is 30 cm and is measured from the platform since that is the rest frame of that person. The thickness observed from the train is $L_v$.
$$L_v = L_0\sqrt{1 - \frac{v^2}{c^2}} = 30\sqrt{1 - (0.5)^2} = 25.98\ \text{cm}$$
Notice that observers in each frame of reference perceive lengths in the other frame to be contracted.
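
Both sample problems use the same formula, so they can be checked with a few lines of Python. This is a minimal sketch rather than part of the original text; the function name `contracted_length` and the parameter `beta` (the ratio v/c) are illustrative assumptions.

```python
import math

def contracted_length(proper_length, beta):
    """Return the length L_v seen from a frame moving at beta = v/c
    relative to the object's rest frame, where proper_length is L_0."""
    if not 0 <= beta < 1:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return proper_length * math.sqrt(1 - beta ** 2)

# Sample problem 5.3: a 20 m carriage seen from the platform at 0.5c
carriage_m = contracted_length(20.0, 0.5)

# Sample problem 5.4: a 30 cm thick person seen from the train at 0.5c
person_cm = contracted_length(30.0, 0.5)

print(f"carriage: {carriage_m:.2f} m, person: {person_cm:.2f} cm")
```

The result carries whatever unit the proper length was given in, which is why the second answer comes out in centimetres.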

PHYSICS IN FOCUS
Faster than light?
Can anything travel faster than the speed of light (so-called ‘superluminal’ velocities)? According to the theory of relativity and the principle of causality, the answer to this question is ‘no’. The principle of causality says that a cause must happen before its effect. Yet, in July 2000, a team of physicists led by L. J. Wang made headlines when they claimed to have made a light pulse travel much faster than the speed of light, so fast that it went backwards in time. This result would appear to violate both relativity and causality; however, all is not as it seems.
To achieve their result the physicists passed a light wave through a specially prepared medium (caesium gas) to produce ‘anomalous dispersion’. In normal dispersion, such as occurs in glass, the blue light component in a light ray is slowed more than the red. In anomalous dispersion, the red is slowed more than the blue. The effect of slowing the red is to change the way that the components of the light add together, to make the overall wave pulse appear to shift backward in time. It is thus a wave interference effect rather than a genuine superluminal velocity.
Wang et al. point out that this is the case, and that since it is a wave effect, no object with mass could travel this way. Additionally, because of the nature of the effect, no information could travel faster than light speed this way either. They note that their effect does not violate relativity or causality. Perhaps surprisingly, scientists have been performing this type of experiment for almost twenty years, but the light pulses have been so distorted that the results are inconclusive. The success of Wang et al. has been to design an experiment that avoids the distortions.

The relativity of mass


The first postulate of relativity states that the laws of physics are the same
in all inertial frames of reference. Einstein felt strongly that this should
also apply to the law of conservation of momentum, but it did not seem
to. To demonstrate the problem consider the following example, illus-
trated in figure 5.13.

Figure 5.13 A collision between spacecraft (two panels, before collision and after collision, each showing spacecraft A and B with the string between them)

A long string is stretched through space. Two identical spacecraft travel toward each other on either side of the string, each with a velocity of 0.3c relative to the string. As they meet they collide with a glancing blow that marginally reduces their velocity along the string but also gives each a small velocity away from the string. The momentum prior to this collision is zero as the spacecraft have equal masses and equal but opposite velocities. Following the collision, the spacecraft are moving apart,



however, their velocities in the x direction are still equal but opposite and almost unchanged. Due to the symmetry of the collision, the velocities of the spacecraft in the y direction, although very small, are also equal but opposite. As a result the total momentum in both the x and y directions is zero, hence momentum has been conserved.

The situation is different as seen from the point of view of one of the spacecraft, however, and the difference is due to time dilation and its effect on the y velocities.

As seen by a passenger of spacecraft A, prior to the collision spacecraft B speeds toward it with a velocity of 0.6c (actually 0.55c; see the note on adding velocities below), strikes it a sideways blow and then departs at a slight angle to its original direction. Due to the presence of the string, the passengers of spacecraft A can identify that they are now moving slowly away from the string, covering, say, 10 metres in one second, giving a velocity of 10 m s$^{-1}$. Looking across, the passenger sees that the clocks in spacecraft B are running slow, so that it covers 10 metres in
$$\frac{1}{\sqrt{1 - 0.6^2}} = 1.25\ \text{s}.$$
The velocity of spacecraft B is thus
$$\frac{10\ \text{m}}{1.25\ \text{s}} = 8\ \text{m s}^{-1}.$$
Therefore, the y velocities of the spacecraft are not identical and momentum is not conserved in the y direction.

Note on adding velocities: In relativity, velocities do not simply add together. For example, if two cars travel toward each other, each with a velocity of 90 km h$^{-1}$, then the velocity of one car as seen from the other will be 180 km h$^{-1}$. However, two spacecraft approaching each other with velocities of 0.9c will have a closing velocity of 0.99c, rather than 1.8c. The formula that applies here is:
$$\text{combined velocity} = \frac{v_1 + v_2}{1 + \dfrac{v_1 v_2}{c^2}}.$$
(The proof of this formula is not within the scope of this course.) Using this expression we can see that the two spacecraft referred to in the text, each with velocity 0.3c, actually approach each other at 0.55c, not 0.6c as mentioned.

Algebraically:
py before collision = 0
py after collision = mAvA + mBvB
= m(10) + m(−8)
= 2m where m = mA = mB
Hence, momentum is not conserved.
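
The numbers in this argument are easy to reproduce. The sketch below is an illustration only: the helper names `combined_velocity` and `dilated_time` are assumptions, speeds are written as fractions of c, and the 0.6c figure is the text's simplified closing speed.

```python
import math

def combined_velocity(beta1, beta2):
    """Relativistic sum of two collinear speeds, each given as a fraction of c."""
    return (beta1 + beta2) / (1 + beta1 * beta2)

def dilated_time(proper_time, beta):
    """Time interval seen from a frame in relative motion at beta = v/c."""
    return proper_time / math.sqrt(1 - beta ** 2)

# Closing speeds: 0.9c + 0.9c and 0.3c + 0.3c
closing_fast = combined_velocity(0.9, 0.9)
closing_slow = combined_velocity(0.3, 0.3)

# y motion of spacecraft B as judged from A, using the text's 0.6c
time_b = dilated_time(1.0, 0.6)      # seconds taken to cover 10 m
v_by = 10.0 / time_b                 # m/s

# y momentum after the collision in units of m (equal rest masses assumed)
p_y_after = 10.0 - v_by

print(f"{closing_fast:.2f}c, {closing_slow:.2f}c, {time_b:.2f} s, {v_by:.1f} m/s, {p_y_after:.1f}m")
```

Run as written, this gives closing speeds of roughly 0.99c and 0.55c, a dilated time of 1.25 s, and a y momentum of about 2m rather than zero.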

Einstein believed very strongly that momentum must be conserved in all inertial frames of reference. In order to solve this dilemma he suggested that the mass of an object must increase, or dilate, at relativistic speeds by a factor that compensates for the effect of time dilation on speed measurement. We can use this idea to derive a formula for mass dilation.
Referring back to the spacecraft problem, assume that total momentum is conserved in the y direction, as seen by the passenger of spacecraft A. Here $r$ is the distance each spacecraft moves away from the string, $t_0$ is the time A takes to cover it, and $t_v$ is the time B takes as judged from A.
$$\text{momentum before collision} = \text{momentum after collision}$$
$$0 = m_A v_A + m_B v_B \quad \text{(as seen by A)}$$
$$0 = m_A\left(\frac{r}{t_0}\right) - m_B\left(\frac{r}{t_v}\right)$$
$$\text{hence}\quad \frac{m_A}{t_0} = \frac{m_B}{t_v} \quad\text{and}\quad t_v = \frac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}}$$
$$\text{so that, as seen by A,}\quad m_B = \frac{m_A}{\sqrt{1 - \dfrac{v^2}{c^2}}}$$
This expression shows how the mass of spacecraft B increases with
speed, as seen by the passenger of spacecraft A. By substituting this new
expression back into the problem we can see how the momentum
works out.

$$p_{By}\ \text{after collision} = m_B v_B$$
$$= \left(\frac{m_A}{\sqrt{1 - \dfrac{v^2}{c^2}}}\right)\left(\frac{r}{t_0 \Big/ \sqrt{1 - \dfrac{v^2}{c^2}}}\right)$$
$$= \left(\frac{m_A}{\sqrt{1 - \dfrac{v^2}{c^2}}}\right)\left(\frac{r\sqrt{1 - \dfrac{v^2}{c^2}}}{t_0}\right)$$
$$= \frac{m_A r}{t_0}$$
Hence, we can now say:
$$p_y\ \text{before collision} = 0$$
$$p_y\ \text{after collision} = p_{Ay} + p_{By} = m_A v_A + m_B v_B = m_A\left(\frac{r}{t_0}\right) - \left(\frac{m_A r}{t_0}\right) = 0$$
Hence, momentum is now conserved.
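
The cancellation can also be confirmed numerically. The following sketch assumes the text's illustrative values (a y velocity of 10 m s⁻¹ for spacecraft A and a relative speed of 0.6c); the function `y_momentum_after` and its flag `use_mass_dilation` are hypothetical names introduced for this check.

```python
import math

def y_momentum_after(m_a, v_ay, beta, use_mass_dilation):
    """Total y momentum after the collision as judged from spacecraft A.

    m_a  : rest mass of each spacecraft (kg)
    v_ay : y velocity of A (m/s)
    beta : relative speed of the spacecraft as a fraction of c
    """
    factor = math.sqrt(1 - beta ** 2)
    v_by = -v_ay * factor                 # B's y velocity, reduced by time dilation
    m_b = m_a / factor if use_mass_dilation else m_a
    return m_a * v_ay + m_b * v_by

classical = y_momentum_after(1.0, 10.0, 0.6, use_mass_dilation=False)
relativistic = y_momentum_after(1.0, 10.0, 0.6, use_mass_dilation=True)
print(f"without mass dilation: {classical:.2f}, with mass dilation: {relativistic:.2f}")
```

With constant mass the total is about 2 kg m s⁻¹ per kilogram of spacecraft; once B's mass is dilated by the same factor that slows its y velocity, the two terms cancel.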

The masses of the two spacecraft were originally identical, so the expression relating the masses can be generalised to the form
$$m_v = \frac{m_0}{\sqrt{1 - \dfrac{v^2}{c^2}}}$$
where
$m_0$ = mass measured in the rest frame of reference = rest mass
$m_v$ = mass as seen from the frame of reference in relative motion to the rest frame
$v$ = velocity
$c$ = speed of light
This effect is called mass dilation and can be generally stated as follows: the mass of an object within its own rest frame is called its rest mass, $m_0$. Measurements of this mass, $m_v$, made from any other inertial reference frame in relative motion to the first, are always greater. The degree of mass dilation varies with velocity as shown in figure 5.14. The effect can be most simply stated as: moving objects gain mass.

Mass dilation is the increase in the mass of an object as observed from a reference frame in relative motion.

Figure 5.14 Mass as a function of velocity (vertical axis: $m_v/m_0$, from 0 to 10; horizontal axis: velocity $v/c$, from 0 to 1)

Experimental evidence for mass dilation came quickly. In 1909 it was noticed that beta particles (electrons) emitted by different radioactive substances possessed different charge to mass ratios. The various particles were travelling at significant fractions of the speed of light. Furthermore, the greater the speed of the beta particle, the smaller was its charge:mass ratio. When the effect of mass dilation was accounted for, the beta particles were all found to have the same charge:mass ratio. Modern particle accelerators, however, demonstrate mass dilation every time they are used. As they accelerate particles, such as electrons or protons, to relativistic speeds, ever greater forces are required as the particles’ masses progressively increase.



The rest mass of an electron
SAMPLE PROBLEM 5.5 The rest mass of an electron is $9.109 \times 10^{-31}$ kg. Calculate its mass if it is travelling at 80 per cent of the speed of light.
SOLUTION
$$m_v = \frac{m_0}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{9.109 \times 10^{-31}}{\sqrt{1 - \dfrac{(0.8c)^2}{c^2}}} = \frac{9.109 \times 10^{-31}}{0.6} = 1.518 \times 10^{-30}\ \text{kg}$$
That is, the electron’s mass is approximately 1.7 times its rest mass.

The mass of an electron near the speed of light
SAMPLE PROBLEM 5.6 Calculate the mass of an electron if travelling at 99.9 per cent of the speed of light.
SOLUTION
$$m_v = \frac{m_0}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{9.109 \times 10^{-31}}{\sqrt{1 - \dfrac{(0.999c)^2}{c^2}}} = \frac{9.109 \times 10^{-31}}{0.0447} = 2.037 \times 10^{-29}\ \text{kg}$$
That is, the electron’s mass this time is approximately 22 times its rest mass.
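
Sample problems 5.5 and 5.6 differ only in the speed used, which makes them a natural fit for a short loop. This is a sketch under stated assumptions: `relativistic_mass` and `ELECTRON_REST_MASS` are illustrative names, and the rest mass is the value quoted in the problems.

```python
import math

ELECTRON_REST_MASS = 9.109e-31  # kg, as quoted in sample problem 5.5

def relativistic_mass(rest_mass, beta):
    """Mass m_v observed from a frame in which the object moves at beta = v/c."""
    if not 0 <= beta < 1:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return rest_mass / math.sqrt(1 - beta ** 2)

for beta in (0.8, 0.999):
    m_v = relativistic_mass(ELECTRON_REST_MASS, beta)
    ratio = m_v / ELECTRON_REST_MASS
    print(f"v = {beta}c: m_v = {m_v:.3e} kg ({ratio:.1f} times the rest mass)")
```

Trying values of `beta` closer and closer to 1 shows the steep rise plotted in figure 5.14.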

The equivalence of mass and energy


Note in figure 5.14 that as the speed of an object approaches the speed
of light c, its mass approaches an infinite value. It is this enormous
increase in mass that prevents any object from exceeding the speed of
light. This is because an applied force is required to create acceleration.
Acceleration leads to higher velocities, which eventually leads to
increased mass. This means that further accelerations will require ever-
greater force. As mass becomes infinite, an infinite force would be
required to achieve any acceleration at all. Sufficient force can never be
supplied to accelerate beyond the speed of light.
But herein lies a problem — if a force is applied to an object, then
work is done on it. Another way to say this is that energy is given to the
object. In the sort of situation we are considering, this energy would take
the form of increased kinetic energy as the object speeds up. But at near
light speed the object does not speed up as we would normally expect, so
where is the energy going? The applied force is giving energy to the
object and the object does not acquire the kinetic energy we would
expect. Instead, it acquires extra mass, as shown in figure 5.14. Einstein
made an inference here and stated that the mass (or inertia) of the
object contained the extra energy.

Relativity results in a new definition of energy as follows:
$$E = E_k + mc^2$$
where
$E$ = total energy
$E_k$ = kinetic energy
$m$ = mass
$c$ = speed of light.
Notice that when an object is stationary, so that it has no kinetic energy, it still has some energy due to its mass. This is called its mass energy or rest energy and is given by:
$$E = mc^2$$
where
$E$ = rest energy (J)
$m$ = mass (kg)
$c$ = speed of light ($3 \times 10^8$ m s$^{-1}$).

Rest energy is the energy equivalent of a stationary object’s mass, measured within the object’s rest frame.
This famous equation states clearly that there is an equivalence between mass and energy: mass has an energy equivalent and vice versa. The speed of light squared is a very large number, however, and this means that if you were able to convert just a small amount of matter it would yield an enormous amount of energy. Just one kilogram of mass, for example, is equivalent to $9 \times 10^{16}$ joules, or 90 million billion joules, of energy.
This equivalence may seem strange when first seen; however, you must bear in mind that it has been proven experimentally many times, and demonstrated most dramatically as the energy released by a nuclear bomb.

The rest energy of an electron
SAMPLE PROBLEM 5.7 What is the energy equivalent of an electron of mass $9.109 \times 10^{-31}$ kg?
SOLUTION
$$E = mc^2 = (9.109 \times 10^{-31})(3 \times 10^8)^2 = 8.1981 \times 10^{-14}\ \text{J}$$

The rest energy of a uranium atom
SAMPLE PROBLEM 5.8 What is the energy equivalent of an atom of uranium, which has a mass of 238 amu or $3.953 \times 10^{-25}$ kg?
SOLUTION
$$E = mc^2 = (3.953 \times 10^{-25})(3 \times 10^8)^2 = 3.558 \times 10^{-8}\ \text{J}$$
Compare this to the energy yield of $3.2 \times 10^{-11}$ J from a typical fission reaction in which only part of the mass of a uranium atom is converted to energy as the nucleus splits into two smaller parts.
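
The rest energy calculations above can be reproduced in a few lines. This sketch is illustrative only: the name `rest_energy` is an assumption, and the speed of light is rounded to 3 × 10⁸ m s⁻¹ as in the worked solutions.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the worked solutions

def rest_energy(mass_kg, c=C):
    """Energy equivalent (in joules) of a stationary mass, E = m c^2."""
    return mass_kg * c ** 2

electron_j = rest_energy(9.109e-31)    # sample problem 5.7
uranium_j = rest_energy(3.953e-25)     # sample problem 5.8
kilogram_j = rest_energy(1.0)          # one kilogram of matter

# Fraction of a uranium atom's rest energy released by a typical fission
fission_fraction = 3.2e-11 / uranium_j

print(f"{electron_j:.4e} J, {uranium_j:.3e} J, {kilogram_j:.1e} J, {fission_fraction:.1e}")
```

The last figure shows that a fission reaction releases a little under one-thousandth of the atom's total rest energy.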

Relativistic space flight


Designers of a new kind of spacecraft called a light sail make the remark-
able claim that these craft could journey to Proxima Centauri, our closest
neighbouring star and shortest interstellar journey, at a speed of 0.1c or
10% of the speed of light. This is far in excess of current achievable velo-
cities. Assuming it to be true, how long would such a journey take?



The distance to Proxima Centauri is approximately four light-years, or $3.7869 \times 10^{13}$ km. A speed of 0.1c is equal to $1.08 \times 10^{8}$ km h$^{-1}$. The time taken is easily calculated:
$$\text{speed} = \frac{\text{distance}}{\text{time taken}}$$
$$\text{so time taken} = \frac{\text{distance}}{\text{speed}} = \frac{37\,869\,000\,000\,000\ \text{km}}{108\,000\,000\ \text{km h}^{-1}} = 350\,640\ \text{hours} = 40\ \text{years}.$$
When the distances and speeds are this large, a simpler calculation results if distance is expressed in light-years and speed is expressed in terms of c, as follows:
$$\text{time taken} = \frac{\text{distance}}{\text{speed}} = \frac{4c\ \text{years}}{0.1c} = 40\ \text{years}.$$
However, this is the time taken as observed from Earth. The space travellers within the spacecraft will, according to relativity, record a slightly different travel time. There are two ways to calculate it:
• Method 1: If the time recorded on Earth, $t_v$, is 40 years, the rest time, $t_0$, lapsed on the spacecraft, can be calculated using the time dilation equation:
$$t_v = \frac{t_0}{\sqrt{1 - \left(\dfrac{v}{c}\right)^2}}$$
$$\therefore\ t_0 = t_v\sqrt{1 - \left(\frac{v}{c}\right)^2} = 40\sqrt{1 - (0.1)^2} = 39.799\ \text{years} = 39\ \text{years}\ 292\ \text{days}.$$
In other words, the spacecraft reaches its destination 73.25 days, or almost two-and-a-half months, short of 40 years.
• Method 2: The occupants of the spacecraft see the distance they have to cover contracted according to the length contraction equation:
$$L_v = L_0\sqrt{1 - \left(\frac{v}{c}\right)^2} = 4\sqrt{1 - (0.1)^2} = 3.9799\ \text{light-years}.$$
$$\text{Now, time taken} = \frac{\text{distance}}{\text{speed}} = \frac{3.9799c\ \text{years}}{0.1c} = 39.799\ \text{years} = 39\ \text{years}\ 292\ \text{days}.$$
This method produces the same result as the previous method: the spacecraft actually arrives 73.25 days short of 40 years.
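
That the two methods must agree can be seen by coding them side by side. The sketch below assumes distances in light-years, speeds as a fraction of c and a year of 365.25 days; all function names are illustrative rather than taken from the text.

```python
import math

DAYS_PER_YEAR = 365.25  # assumed length of a year

def earth_time(distance_ly, beta):
    """Travel time in years as observed from Earth."""
    return distance_ly / beta

def ship_time_by_time_dilation(distance_ly, beta):
    """Method 1: shorten the Earth-observed time by the time dilation factor."""
    return earth_time(distance_ly, beta) * math.sqrt(1 - beta ** 2)

def ship_time_by_length_contraction(distance_ly, beta):
    """Method 2: contract the distance first, then divide by the speed."""
    contracted_ly = distance_ly * math.sqrt(1 - beta ** 2)
    return contracted_ly / beta

t_earth = earth_time(4.0, 0.1)
t_method1 = ship_time_by_time_dilation(4.0, 0.1)
t_method2 = ship_time_by_length_contraction(4.0, 0.1)
saving_days = (t_earth - t_method1) * DAYS_PER_YEAR

print(f"Earth: {t_earth:.1f} y, ship: {t_method1:.3f} y and {t_method2:.3f} y, "
      f"saving about {saving_days:.0f} days")
```

Both methods multiply the same quantity by the same factor, only in a different order, which is why the ship times match. The unrounded saving is close to 73 days; the 73.25 days quoted above comes from rounding the ship time to 292 whole days.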

This example illustrates the influence that relativity can have upon space travel when speeds become ‘relativistic’, which usually means 10% of the speed of light or faster. When speeds are less than this, the effects are almost negligible. When speeds become greater than this, the effects become significant. As figures 5.10 and 5.12 show, the effects intensify sharply with speeds faster than 0.9c.
Table 5.1 compares space travel at a variety of speeds, showing the time passed on board a spacecraft during one Earth day, as well as the length of external objects and distances as a percentage of the original length.

Table 5.1 A comparison of relativistic effects

| SPACECRAFT | SPEED (km h$^{-1}$) | RATIO $v/c$ | TIME PASSED ON SPACECRAFT IN ONE EARTH DAY | CONTRACTED LENGTHS AS % OF ORIGINAL |
|---|---|---|---|---|
| Space shuttle | 28 000 | 0.000 026 | 23 h 59 min 59.999 972 s | 99.999 999 97 |
| Fast space probe | 100 000 | 0.000 093 | 23 h 59 min 59.999 630 s | 99.999 999 6 |
| Light sail | 108 000 000 | 0.1 | 23 h 52 min 46.92 s | 99.499 |
| Starship Intastella | 972 000 000 | 0.9 | 10 h 27 min 40.89 s | 43.59 |
| Starship Galactica | 1 079 892 000 | 0.999 9 | 0 h 20 min 21.85 s | 1.4 |

Astronauts in orbit around the Earth will not observe any noticeable effect at all, since the length of their day is just 30 microseconds shorter than on Earth. Even in a current-day speedster at 100 000 km h$^{-1}$ the days are just 400 microseconds shorter and lengths are still 99.999 9996% of their former selves.
The situation is very different at 0.9c, however. While back on Earth
24 hours pass, at this speed less than 10.5 hours elapse and external
objects such as planets have squashed up to 44% of their former lengths.
Consider now travelling in the Galactica. This flyer manages to zoom
along at 99.99% of the speed of light and, in the course of one Earth day,
just over 20 minutes have passed on board. Lengths have contracted to
just 1.4% of their original lengths and the four light-year trip to Proxima
Centauri would be completed in just over 20 days according to the ship’s
clock. It would be natural at this point to wonder how this could be poss-
ible — if light takes four years to cover the distance, how could this star-
ship, travelling at very nearly the speed of light, manage the journey in
20 days? The answer is that as observed from the Earth the starship does
take four years; however, the clocks on the starship, both electronic and
biological, are running slow so that by their reckoning only 20 days pass.
It should be pointed out that the energy costs of achieving these types of speeds would be prohibitive, even assuming that such speeds were technically possible. Acceleration is always the most energy-costly phase of a space mission. As we have seen earlier, the combined effect of mass dilation and time dilation is that accelerating beyond 0.9c requires ever greater forces and energy input for only marginal increases in speed.
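
The entries of table 5.1 all follow from a single factor, $\sqrt{1 - v^2/c^2}$, applied to one Earth day and to 100%. The following sketch regenerates them; it assumes c = 3 × 10⁸ m s⁻¹ (that is, 1.08 × 10⁹ km h⁻¹), and the function name `one_earth_day_on_board` is hypothetical.

```python
import math

C_KM_PER_H = 1.08e9         # speed of light in km/h, assuming c = 3e8 m/s
SECONDS_PER_DAY = 86_400

def one_earth_day_on_board(speed_km_h):
    """Return (v/c, hours, minutes, seconds, length %) for one Earth day."""
    beta = speed_km_h / C_KM_PER_H
    factor = math.sqrt(1 - beta ** 2)
    on_board_s = SECONDS_PER_DAY * factor        # time elapsed on the spacecraft
    hours, remainder = divmod(on_board_s, 3600)
    minutes, seconds = divmod(remainder, 60)
    return beta, int(hours), int(minutes), seconds, 100 * factor

craft = {
    "Space shuttle": 28_000,
    "Fast space probe": 100_000,
    "Light sail": 108_000_000,
    "Starship Intastella": 972_000_000,
    "Starship Galactica": 1_079_892_000,
}

for name, speed in craft.items():
    beta, h, m, s, pct = one_earth_day_on_board(speed)
    print(f"{name:20s} v/c = {beta:.6f}  {h:2d} h {m:2d} min {s:9.6f} s  {pct:.8f} %")
```

For the two slowest craft the differences appear only in the fifth decimal place of the seconds, which matches the microsecond-level effects described in the text.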

The twins paradox


Einstein himself suggested one of the strangest effects of relativity. His idea was that a living organism could be placed in a box and taken on a relativistic flight and then returned to its starting place almost without ageing, while similar organisms that had remained behind had long since died of old age.



Let us reconsider the problem using the flight of the Starship Galactica described earlier. Consider twins, Martha and Arthur. Martha steps aboard the starship and speeds off to Proxima Centauri at maximum speed as described previously. This journey has taken Martha just 20.5 days but for Arthur, watching closely through a telescope, four years have passed. Martha immediately turns the starship around and returns to her brother. Upon arrival she is just 41 days older than when she left; however, Arthur has aged eight years waiting for her. It is not hard to appreciate that had the journey been to a star further away, Martha could have arrived back after her brother had grown old and died.
This problem is often considered a paradox, because the principle of
relativity demands that no inertial frame of reference be preferred over
others. In other words, relativity’s effects should be reversible simply by
looking at them from a different viewpoint. Length contraction, for
instance, depends upon the viewer’s frame of reference. An Earthbound
viewer looking at the Starship Galactica will see it compressed to just 1.4%
of its original length, while travellers within the starship would see the
Earth squashed up to 1.4% of its original thickness.
If this problem is viewed differently, then it is Martha who sees Arthur disappearing away with the Earth at 0.9999c and then returning, so Arthur should be the younger. The apparent paradox is this: if both points of view are valid then each sibling will see the other as younger than themselves. How can this be possible?
The answer is that this particular problem is not reversible. Martha’s
frame (the starship) has not remained inertial — it has accelerated and
decelerated, turned around and then repeated its accelerations. It is very
easy to tell the difference between the two frames simply by having each
sibling carry an accelerometer. Hence, the two frames of reference are
not equivalent and there is no paradox because Martha will definitely be
younger and Arthur older.

A relativistic interstellar journey
SAMPLE PROBLEM 5.9 If Martha were to throttle back the Starship Galactica so that it cruised to Proxima Centauri at just 0.8c, how much older would Arthur be upon her immediate return?
SOLUTION The trip time as observed by Arthur back on Earth is calculated as follows:
$$\text{time taken} = \frac{\text{distance}}{\text{speed}} = \frac{2 \times 4c\ \text{years}}{0.8c} = 10\ \text{years}.$$
The trip time for Martha can be found using the time dilation equation:
$$t_v = \frac{t_0}{\sqrt{1 - \left(\dfrac{v}{c}\right)^2}}$$
$$\therefore\ t_0 = t_v\sqrt{1 - \left(\frac{v}{c}\right)^2} = 10\sqrt{1 - (0.8)^2} = 6\ \text{years}.$$
Hence, Arthur will be four years older than Martha upon her return.
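
The same calculation covers any cruising speed, including the Galactica's full-speed trip discussed in the text. This is a minimal sketch that ignores the acceleration phases, as the worked solution does; the function name `round_trip_years` is an assumption.

```python
import math

DAYS_PER_YEAR = 365.25  # assumed length of a year

def round_trip_years(distance_ly, beta):
    """Return (years elapsed on Earth, years elapsed on board) for a trip
    out and back at constant speed beta = v/c."""
    earth_years = 2 * distance_ly / beta
    ship_years = earth_years * math.sqrt(1 - beta ** 2)
    return earth_years, ship_years

# Sample problem 5.9: Proxima Centauri and back at 0.8c
arthur, martha = round_trip_years(4.0, 0.8)
print(f"Arthur ages {arthur:.1f} y, Martha ages {martha:.1f} y, "
      f"difference {arthur - martha:.1f} y")

# The full-speed trip at 0.9999c
arthur_fast, martha_fast = round_trip_years(4.0, 0.9999)
print(f"Arthur ages {arthur_fast:.1f} y, Martha ages "
      f"{martha_fast * DAYS_PER_YEAR:.0f} days")
```

The second case gives about eight years for Arthur and about 41 days for Martha, in line with the figures quoted in the discussion of the twins.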

CHAPTER REVIEW
SUMMARY
• The aether was the hypothesised medium for light and other electromagnetic waves. It was transparent and could not be detected, yet belief in its existence was strong since all other known waveforms require a medium through which to travel.
• The Michelson–Morley experiment, which used light and an effect called interference, was devised to detect the aether. It was extremely sensitive yet failed to detect any indication of the existence of the aether.
• An inertial frame of reference is a non-accelerated environment. It allows for uniform velocity motion or a state of rest only.
• The principle of relativity states that it is not possible to detect uniform velocity motion while within a frame of reference, without referring to another frame. Classical physics established this principle for mechanics but not optics. Einstein included optics by extending the principle to include all the laws of physics.
• Einstein also postulated that the speed of light has the same value, c, in all reference frames; that is, to all observers.
• Time becomes a relative term once it is accepted that the speed of light is an absolute term. Distance, or space, is also a relative term.
• The SI unit of length, the metre, is defined in terms of the speed of light and time.
• Two events in different places that are judged by an observer to be simultaneous will not be simultaneous as judged by another observer in relative motion to the first.
• The time taken for an event to occur within its rest frame is called the proper time, $t_0$. The time taken, $t_v$, as judged by observers in relative motion, will always be longer.
$$t_v = \frac{t_0}{\sqrt{1 - \dfrac{v^2}{c^2}}}$$
• The length of an object measured within its rest frame is called its proper length, $L_0$. Measured by observers in relative motion the length, $L_v$, will always be shorter.
$$L_v = L_0\sqrt{1 - \frac{v^2}{c^2}}$$
• The mass of an object within its own rest frame is called its rest mass, $m_0$. Measurements of this mass, $m_v$, made by observers in relative motion will always be greater.
$$m_v = \frac{m_0}{\sqrt{1 - \dfrac{v^2}{c^2}}}$$
• The rest mass of an object is equivalent to a certain quantity of energy. Conversion between the two can occur under extraordinary circumstances:
$$E = mc^2.$$
• Time dilation and length contraction could theoretically allow exceptionally long space journeys within reasonable periods of time, as judged by the travellers. However, relativity also indicates that the cost of energy to do this would be prohibitive.

QUESTIONS
1. Outline the features of the aether model and the reasons that scientists believed that it needed to exist.
2. List the supposed features of the aether.
3. (a) Identify the objective of the Michelson–Morley experiment.
(b) Construct a diagram showing the paths of the light rays in the Michelson–Morley experiment.
(c) Write a one-paragraph description of how the apparatus worked.
(d) The experiment had a very definite result. Outline this result.
4. Evaluate the success of the Michelson–Morley experiment in proving or disproving the aether model. Discuss the role it played in the development of ideas.
5. Outline the essential aspects of an inertial frame of reference and identify a method to distinguish an inertial from a non-inertial frame of reference.
6. You are in a spaceship heading, you think, in free motion towards Pluto; however, you are far from any reference point to check your progress. Suddenly a comet approaches from behind and overtakes you, heading in the same direction. Identify which of the following interpretations of events is correct and any method that could distinguish them.



(a) The comet is also travelling towards Pluto but at a higher speed.
(b) Your spaceship is stationary but the comet is heading towards Pluto.
(c) The comet is stationary and you are travelling away from Pluto.
(d) You are both travelling away from Pluto, but you have a higher speed.
7. Identify any method you could use to tell if your spacecraft, far from any planet or star, is standing still or travelling with a uniform velocity. What is the name given to this idea?
8. In classical physics, time was assumed to be absolute and the maximum velocity of interaction assumed infinite. Explain how these same concepts are viewed in the theory of relativity.
9. A traveller in a very fast train holds a mirror at arm’s length and looks at his reflection. Both he and an observer outside the train see the same velocity of light for the traveller’s image. Since velocity = distance/time, describe what has happened to the length of the traveller’s arm, and the time taken for the reflection to return, as seen by the outside observer.
10. (a) Compare the current definition of a metre to the original metre standard.
(b) Evaluate the light-time definition, such as the metre and the light-year.
11. Identify the technologies required to support the current definition of a metre.
12. Complete the table of distances shown below. You will need the following conversion factors:
1 light-year = 0.3066 parsecs = 63 240 AU = $9.4605 \times 10^{12}$ km
1 parsec = 3.2616 light-years = 206 265 AU = $30.857 \times 10^{12}$ km
1 astronomical unit (AU) = $1.5813 \times 10^{-5}$ light-years = $1.496 \times 10^{8}$ km.

STAR    DISTANCE (light-years)    DISTANCE (parsecs)    DISTANCE (km)
Canopus    23
Rigel    900
Arcturus    10
Hadar    $3.0857 \times 10^{15}$

13. In one paragraph, describe what is meant by the ‘relativity of simultaneity’.
14. If two events are observed to occur at the same place and time by one person, will they be seen in the same way by all observers? Explain.
15. Pete the flying ace needs to hide his new experimental high-velocity plane in the hangar, away from the view of foreign spy satellites. ‘You’ll ne’er do it,’ says Jock, the Scottish flight engineer, ‘The hangar’s barely 80 m long but yer-r-r plane there is over 120.’ ‘She’ll be right, mate,’ replies Pete, ‘It’s just a matter of going fast enough!’ Jock stands at the hangar door while he watches Pete approach in his plane.
(a) Jock sees the plane contract as it speeds up. Calculate how fast it must be travelling for him to be able to quickly close the door with the plane, at least momentarily, contained inside.
(b) At the speed you determined in part (a), calculate the length of the hangar as seen by Pete in the approaching plane. Will the plane fit in the hangar as judged by Pete?
(c) Discuss how it is possible for Pete and Jock to perceive the situation so differently.
16. As the Moon orbits the Earth it has an orbital speed of approximately 3660 km h$^{-1}$. If its proper diameter is 3480 km, calculate what we observe as its diameter.
17. A muon is a subatomic particle with an average lifetime of 2.2 microseconds when stationary. When in a burst of cosmic rays in the upper atmosphere, muons are observed to have a lifetime of 16 microseconds. Calculate their speed.
18. If our galaxy, the Milky Way, is 20 kiloparsecs or 65 000 light-years in radius, calculate how fast a spacecraft would need to travel so that its occupants could travel right across it in 45 years.
19. Calculate the travel time to the following destinations travelling at 0.999c. Use the conversion factors given in problem 12.
(a) Pluto when at a distance of 39.1 AU from the Earth
(b) Proxima Centauri at a distance of 1.295 parsecs
(c) Sirius at a distance of $8.177 \times 10^{13}$ km
(d) Alpha Crucis at a distance of 522 light-years
(e) Andromeda galaxy at a distance of 690 kiloparsecs

20. The length of a spacecraft is observed, by someone on a nearby planet, to shrink to half its proper length. Calculate:
(a) the speed of the spacecraft relative to the planet
(b) the observed mass of the spacecraft if its rest mass is $5 \times 10^{4}$ kg
(c) the amount of time passed on the planet when one second has passed on the spacecraft, as observed from the planet.
21. A super rocket racer has a proper length of 30 m, a rest mass of 300 000 kg and can fly at 0.3c. Calculate:
(a) the length of the aircraft at speed when observed from the Earth
(b) the time difference between the clocks of the pilot and his airbase if they were perfectly synchronised prior to lift-off and the racer was aloft and at speed for 10 h according to the pilot’s clock
(c) the mass of the speeding aircraft when observed from the Earth.
22. A Martian spaceship travelling at near-light speed passes a stationary Venusian spaceship. Each is made of glass and in each the occupants are holding a dance party.
(a) Describe the dancing Venusians as seen by the Martians.
(b) Describe the dancing Martians as seen by the Venusians.
(c) Are your answers to (a) and (b) contradictory? Explain.
23. Define rest energy.
24. Identify different situations in which rest energy is extracted.
25. Calculate the rest energy of the following objects:
(a) a proton of mass $1.673 \times 10^{-27}$ kg
(b) an alpha particle of mass 4.0015 amu or $6.6465 \times 10^{-27}$ kg
(c) a carbon atom of mass 12.0000 amu or $1.9932 \times 10^{-26}$ kg
(d) 5 mg of aspirin
(e) 1 kg of sugar.
26. In terms of the energy required, accelerating a spacecraft to light speed is an impossibility. Explain why this is so.



5.1 MODELLING THE MICHELSON–MORLEY EXPERIMENT

Aim
To model the Michelson–Morley experiment.

Theory
The ‘luminiferous aether’ was the medium that nineteenth-century physicists assumed was needed for the propagation of light waves. In 1887, Michelson and Morley performed a highly sensitive experiment to detect the ‘aether wind’. This is the apparent velocity of the aether moving past us, caused by the Earth moving through the stationary aether. The apparatus is shown in figure 5.4 (page 74). It produced two light rays and directed them along equal-length but separate paths, as shown in figure 5.5, one into the aether wind and one across it, and then compared the rays at the end of their journey to see if one was ahead of the other.

The model
This is a book exercise using an analogy as shown in figure 5.15. A large open-top shark cage, of the sort used by long-distance swimmers, is attached to the side of a boat. Its dimensions are 10 m by 10 m. The boat and the cage are both moving through the water at 25 cm s⁻¹. In one of the leading corners are two swimmers who are about to have a race. Swimmer A will head across the leading edge of the cage and back, while swimmer B will head along the side of the cage and back. Each is capable of swimming at 1 m s⁻¹, relative to the water. Who will win?

Figure 5.15 An analogy for the Michelson–Morley experiment. Two swimmers race, across and along, a large moving shark cage.

In this analogy of the Michelson–Morley experiment, the water represents the aether, the cage represents the Earth, the apparent current represents the aether wind and the swimmers represent the two light rays.

Analysis
Swimmer A
This swimmer will need to head into the current slightly in order to swim directly across. Add the swimmer’s velocity relative to the water to the apparent current to determine the swimmer’s velocity relative to the cage, as shown in figure 5.16. Use this information to determine the time taken to swim directly across the cage and back again.

Figure 5.16 Adding the velocities of the swimmer who heads across the shark cage (velocity of swimmer relative to water + velocity of water relative to boat = velocity of swimmer relative to boat)

Swimmer B
This swimmer’s forward journey is different from the return journey so each part needs to be treated separately. During the forward journey, the swimmer is swimming with the current but when returning, the swimmer is heading against the current.
For each part, determine the swimmer’s velocity relative to the cage and then use this information to determine the time taken to swim along the cage and back again. Add these two times to find the time for the total journey travelled by swimmer B.

Comparison
Now compare the times you have calculated for each swimmer. Which one wins the race?

Questions
1. Will the winner of this race always win the race, regardless of boat speed?
2. If the speed of the current is doubled, how would this affect the winning time?
3. What does it mean if the swimmers’ race is a tie; that is, there is no winner?

4. In the Michelson–Morley experiment there was a null result (that is, no winner in our model). Given that their apparatus was supposed to be sensitive enough to detect the aether, what conclusions can be drawn from this?

5.2 NON-INERTIAL FRAMES OF REFERENCE

Aim
To use an accelerometer to distinguish between inertial and non-inertial frames of reference.

Apparatus
accelerometer (either a stand-alone device or data logger attachment)
dynamics trolley
string
50 g mass carrier with extra masses

Theory
A frame of reference is an environment within which an object resides. An inertial reference frame is one that is at rest or in uniform velocity. A non-inertial reference frame is one that is undergoing acceleration.
The principle of relativity states that when residing in an inertial reference frame it is not possible to tell whether the frame is at rest or in uniform velocity without referring to another frame; it is only possible to distinguish between inertial and non-inertial frames. Since acceleration involves force, any force-detecting device can identify a non-inertial frame of reference.
An accelerometer is a device that identifies the direction and magnitude of an acceleration.

Method
1. Familiarise yourself with the device by holding it horizontally and moving from side to side noting the change in its indicated acceleration.
2. (a) Place the accelerometer upon a dynamics trolley arranged as shown in figure 5.17. Ensure that the string is sufficiently long so that the hanging mass strikes the floor well before the trolley reaches the pulley. When a mass of 100 g has been placed on the hanging end of the string, release the trolley and observe the scale. What was the reading of the acceleration? Once the mass reaches the floor the trolley will stop accelerating. What do you observe on the accelerometer?
(b) Repeat this procedure twice more, with the hanging mass set to 200 g and 400 g. Record the observed rate of acceleration before and after the mass strikes the floor.
3. This part will require the cooperation of someone with a car.
(a) While seated in the car hold the accelerometer parallel with the sides of the car. Ask the driver to accelerate, coast for a while and then brake to a stop. Describe your observations.

Figure 5.17 (labels: accelerometer, dynamics trolley, bench top, pulley, mass carrier and masses)



(b) Now hold the accelerometer parallel with the front and back of the car. Ask the driver to drive around a few corners, preferably right-angle bends. Describe your observations.

Results
Part 2(a) Hanging mass 100 g
Indicated acceleration before mass strikes floor = ______ m s⁻²
What type of reference frame is this?
Indicated acceleration after mass strikes floor = ______ m s⁻²
What type of reference frame is this?

Part 2(b) Hanging mass 200 g
Indicated acceleration before mass strikes floor = ______ m s⁻²
Indicated acceleration after mass strikes floor = ______ m s⁻²
Hanging mass 400 g
Indicated acceleration before mass strikes floor = ______ m s⁻²
Indicated acceleration after mass strikes floor = ______ m s⁻²

Part 3
Observations when accelerating, coasting and braking:

Observations when cornering:

Questions
1. What is the effect of increasing the hanging mass?
2. How does the motion of an object change when undergoing acceleration?
3. What name is given to frames of reference that experience acceleration?
4. Is the accelerometer display different when travelling at a steady speed compared with standing at rest?
5. What name is given to the reference frames referred to in question 4?

HSC CORE MODULE
MOTORS AND GENERATORS

Chapter 6
The motor effect and DC electric motors

Chapter 7
Generating electricity

Chapter 8
Generators and power distribution

Chapter 9
AC electric motors

CHAPTER 6
THE MOTOR EFFECT AND DC ELECTRIC MOTORS
Remember
Before beginning this chapter, you should be able to:
• describe the behaviour of the magnetic poles when
they are brought close together
• define the direction of a magnetic field at a point as
the direction that the N pole of a compass needle
would point when placed at that point
• describe the magnetic field around single magnetic
poles and pairs of magnetic poles
• describe the nature of a magnetic field produced by
an electric current in a straight current-carrying
conductor
• explain how the right-hand grip rule can determine
the direction of current in, or the magnetic field
lines around, a current-carrying conductor
• compare the nature and generation of magnetic
fields by solenoids and a bar magnet.

Key content
At the end of this chapter you should be able to:
• identify the factors that affect the magnitude and direction of the force acting on current-carrying conductors in magnetic fields
• use the right-hand push rule to determine the direction of the force acting on current-carrying conductors in magnetic fields
• describe the force between long, parallel current-carrying conductors
• define torque
• describe the motor effect
• describe the main features of a DC electric motor and the role of each feature
• identify two methods for providing the magnetic field for a DC motor.

Figure 6.1 A disassembled motor from an electric drill. How does it work and what are the functions of all its parts?
What would your life be like without electricity? Modern industrialised nations are dependent on electricity. Electricity is easy to produce and distribute, and is easily transformed into other forms of energy. Electric motors are used to transform electricity into useful mechanical energy. They are used in homes, for example in refrigerators, vacuum cleaners and many kitchen appliances, and in industry and transport.
In this module we will explore how electricity is used to drive electric motors, how it is produced and how it is distributed from the power stations to the consumers.
Use the box below to revise your work on magnetic fields from the Preliminary Course (‘Electrical energy in the home’). This material is fundamental to the understanding of how DC electric motors operate.

PHYSICS IN FOCUS
Review of magnetic fields
• The law of magnetic poles states that opposite poles of magnets attract each other and like poles of magnets repel each other.
• Magnetic fields are represented in diagrams using lines. These show the direction and strength of the field. The density of the field lines represents the strength of the magnetic field. The closer the lines are together, the stronger the field.
• The direction of the magnetic field at a particular point is given by the direction of the force on the N pole of a magnet placed within the magnetic field. It is shown by arrows on the magnetic field lines.
• Magnetic field lines never cross. When a region is influenced by the magnetic fields of two or more magnets or devices, the magnetic field lines show the strength and direction of the resultant magnetic field acting in the region. They show the combined effect of the individual magnetic fields.
• The spacing of the magnetic field lines represents the strength of the magnetic field. It follows that field lines that are an equal distance apart represent a uniform magnetic field.
• Magnetic field lines leave the N pole of a magnet and enter the S pole.
• The diagrams in figure 6.2 represent the magnetic fields around (a) a single bar magnet, (b) two N poles close to each other, and (c) a horseshoe magnet.
• In a diagram, as seen in figure 6.3, magnetic field lines going out of the page are represented using dot points (•). This is like an observer seeing the pointy end of an arrow as it approaches.

Figure 6.2 Magnetic field lines for (a) a single bar magnet, (b) two N poles close to each other and (c) a horseshoe magnet



• Magnetic field lines going into the page, also seen in figure 6.3, are represented using crosses (×). This is as an archer would see the rear end of an arrow as it leaves the bow.
Figure 6.3 Magnetic field lines coming out of the page as observed by A, and going into the page, as observed by B
• The movement of charged particles, as occurs in an electric current, produces a magnetic field. The magnetic field is circular in nature around the current-carrying conductor, as shown in figure 6.4, and can be represented using concentric field lines. The field gets weaker as the distance from the current increases.
Figure 6.4 (a) Compasses can be used to show the circular nature of the magnetic field around a straight current-carrying conductor. (b) The magnetic field is circular and stronger closer to the wire.
• The direction of the magnetic field around a straight current-carrying conductor is found using the right-hand grip rule, as shown in figure 6.5. When the right hand grips the conductor with the thumb pointing in the direction of conventional current, the curl of the fingers gives the direction of the magnetic field around the conductor.
Figure 6.5 The right-hand grip rule (labels: direction of current, direction of magnetic field lines)
• When a current-carrying conductor is bent into a loop, the effect is to concentrate the magnetic field within the loop, as shown in figure 6.6.
Figure 6.6 The magnetic field of a loop: (a) 3-D representation (b) 2-D representation
• A solenoid is a coil of insulated wire that can carry an electric current and is shown in figure 6.7. The number of times that the wire has been wrapped around a tube to make the solenoid is known as the number of ‘turns’ or ‘loops’ of the solenoid. The magnetic fields around each loop of wire add together to produce a magnetic field similar to that of a bar magnet. Note that the magnetic field goes through the centre of the solenoid as well as outside it.
Figure 6.7 The magnetic field around a current-carrying solenoid



• The direction of the magnetic field produced by a solenoid can be determined using another right-hand grip rule; see figure 6.8. In this case, the right hand grips the solenoid with the fingers pointing in the same direction as the conventional current flowing in the loops of wire and the thumb points to the end of the solenoid that acts like the N pole of a bar magnet; that is, the end of the solenoid from which the magnetic field lines emerge.
Figure 6.8 Determining the N pole of a solenoid (fingers in direction of current; thumb points to N pole)
• Another method for determining the poles of a solenoid is to look at a diagram of the ends of the solenoid (see figure 6.9), and mark in the direction of the conventional current around the solenoid. Then mark on the diagram the letter N or S that has the ends of the letter pointing in the same direction as the current. N is for an anticlockwise current, S is for a clockwise current.
Figure 6.9 Another method for determining the poles of a solenoid
• An electromagnet is a solenoid that has a soft iron core. When a current flows through the solenoid, the iron core becomes a magnet. The polarity of the iron core is the same as the polarity of the solenoid. The core produces a much stronger magnetic field than is produced by the solenoid alone. In figure 6.10 the magnetic field of a permanent magnet is compared to that of an electromagnet.
• The strength of an electromagnet can be increased by:
– increasing the current through the solenoid
– adding more turns of wire per unit length for a long solenoid
– increasing the amount of soft iron in the core.
Figure 6.10 A permanent magnet and an electromagnet: (a) permanent magnet (b) electromagnet. Note the polarity of the iron core.

A solenoid consists of a coil of wire wound uniformly into a cylinder.

6.1 THE MOTOR EFFECT
A current-carrying conductor produces a magnetic field. When the current-carrying conductor passes through an external magnetic field, the magnetic field of the conductor interacts with the external magnetic field and the conductor experiences a force. This effect was discovered in 1821 by Michael Faraday (1791–1867) and is known as the motor effect. The direction of the force on the current-carrying conductor in an external magnetic field can be determined using the right-hand push rule and can be seen in figure 6.11 (discussed further in chapter 7).

The motor effect is the action of a force experienced by a current-carrying conductor in an external magnetic field.

The right-hand push rule (also called the right-hand palm rule) is used to find the direction of the force acting on a current-carrying conductor in an external magnetic field.

Practical activity 6.1: The motor effect

Figure 6.11 The right-hand push rule for a current-carrying conductor (labels: direction of current flow, direction of the magnetic field from N to S, direction of the force on the conductor)

Factors affecting the magnitude of the force
The magnitude of the force on a straight conductor in a magnetic field depends on the following factors:
• the strength of the external magnetic field. The force is proportional to the magnetic field strength, B
• the magnitude of the current in the conductor. The force is proportional to current, I
• the length of the conductor in the field. The force is proportional to the length, l
• the angle between the conductor and the external magnetic field. The force is at a maximum when the conductor is at right angles to the field, and it is zero when the conductor is parallel to the field. The magnitude of the force is proportional to the component of the field that is at right angles to the conductor. If θ is the angle between the field and the conductor, then the force is the maximum value multiplied by the sine of θ.
These factors are shown in figure 6.12 and can be expressed mathematically as:
$$F = BIl \sin\theta.$$

Figure 6.12 A conductor at an angle to a magnetic field (label: y = l sin θ)



SAMPLE PROBLEM 6.1 Force on a current-carrying conductor
If a conductor of length 8.0 cm carries a current of 30 mA, calculate the magnitude of the force acting on it when in a magnetic field of strength 0.25 T if:
(a) the conductor is at right angles to the field
(b) the conductor makes an angle of 30° with the field
(c) the conductor is parallel with the field.

SOLUTION
Use the equation:
$$F = BIl \sin\theta$$
where
$$l = 8.0 \times 10^{-2}\ \text{m}, \quad I = 3.0 \times 10^{-2}\ \text{A}, \quad B = 0.25\ \text{T}.$$
(a)
$$
\begin{aligned}
F &= BIl \sin 90^\circ \\
  &= 3.0 \times 10^{-2} \times 8.0 \times 10^{-2} \times 0.25 \times 1 \\
  &= 6.0 \times 10^{-4}\ \text{N}
\end{aligned}
$$
(b)
$$
\begin{aligned}
F &= BIl \sin 30^\circ \\
  &= 3.0 \times 10^{-2} \times 8.0 \times 10^{-2} \times 0.25 \times 0.5 \\
  &= 3.0 \times 10^{-4}\ \text{N}
\end{aligned}
$$
(c)
$$
\begin{aligned}
F &= BIl \sin 0^\circ \\
  &= 3.0 \times 10^{-2} \times 8.0 \times 10^{-2} \times 0.25 \times 0 \\
  &= 0
\end{aligned}
$$
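
The three cases in sample problem 6.1 can also be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `force_on_conductor` and its parameter names are chosen here purely for illustration.

```python
import math


def force_on_conductor(b_tesla, current_a, length_m, angle_deg):
    """Return the force magnitude F = B I l sin(theta) in newtons.

    angle_deg is the angle between the conductor and the field, in degrees.
    """
    return b_tesla * current_a * length_m * math.sin(math.radians(angle_deg))


# Values taken from sample problem 6.1
B = 0.25       # magnetic field strength (T)
I = 30e-3      # current (A), i.e. 30 mA
L = 8.0e-2     # length of conductor in the field (m), i.e. 8.0 cm

for angle in (90, 30, 0):
    force = force_on_conductor(B, I, L, angle)
    print(f"theta = {angle:2d} degrees: F = {force:.1e} N")
```

Working in SI units throughout (amperes and metres rather than mA and cm) is what lets the formula be applied directly; the conversions are done once, where the values are defined.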

6.2 FORCES BETWEEN TWO PARALLEL CONDUCTORS
If a finite distance separates two parallel current-carrying conductors, then each conductor will experience a force due to the interaction of the magnetic fields that exist around each.
Figure 6.13 shows the situation where two long parallel conductors carry currents I₁ and I₂ in the same direction.
Figure 6.13(a) shows the magnetic field of conductor 1 in the region of conductor 2. Conductor 2 lies in the magnetic field due to conductor 1. The right-hand push rule shows that conductor 2 experiences a force directed towards conductor 1.
Similarly, figure 6.13(b) shows the magnetic field of conductor 2 in the region of conductor 1. The right-hand push rule shows that conductor 1 experiences a force directed towards conductor 2. This means that the conductors are forced towards each other.

Practical activity 6.2: The force between two parallel current-carrying conductors

Figure 6.13 The forces acting on two long parallel conductors carrying currents in the same direction: (a) magnetic field due to I₁ (b) magnetic field due to I₂



Figure 6.14 shows the situation where two long parallel conductors carry currents I₁ and I₂ in opposite directions.
Figure 6.14(a) shows the magnetic field of conductor 1 in the region of conductor 2. The right-hand push rule shows that conductor 2 experiences a force directed away from conductor 1.
Similarly, figure 6.14(b) shows the magnetic field of conductor 2 in the region of conductor 1. The right-hand push rule shows that conductor 1 experiences a force directed away from conductor 2. This means that the conductors are forced apart.
Note that, for each pair of wires, the forces acting on the two conductors are equal in magnitude but opposite in direction. This is true even if the conductors carry currents of different magnitudes.

Figure 6.14 The forces acting on two long parallel conductors carrying currents in opposite directions: (a) magnetic field due to I₁ (b) magnetic field due to I₂

Determining the magnitude of the force between two parallel conductors
The magnetic field strength at a distance, d, from a long straight conductor carrying a current, I, can be found using the formula:
$$B = \frac{kI}{d}$$
where
$$k = 2.0 \times 10^{-7}\ \text{N A}^{-2}.$$
Note that in this equation k is a constant derived from careful experimentation and that d is the perpendicular distance from the wire to the point at which B is to be calculated.
Figure 6.15 shows two parallel conductors, X and Y, that are carrying currents I₁ and I₂ respectively. X and Y are separated by a distance of d.

Figure 6.15 Two parallel current-carrying conductors

The magnetic field strength in the region of Y due to the current flowing through X is:
$$B_X = \frac{kI_1}{d}.$$
The magnitude of the force experienced by a length, l, of conductor Y due to the external magnetic field provided by conductor X is:
$$F = I_2 l B_X,$$
or
$$F = I_2 l \left(\frac{kI_1}{d}\right).$$
This can be rearranged to give the formula:
$$\frac{F}{l} = k\frac{I_1 I_2}{d}.$$
A similar process can be used to show that the same formula will give the force experienced by a length, l, of conductor X due to the magnetic field created by the current flowing in conductor Y.

SAMPLE PROBLEM 6.2 Force between parallel conductors
What is the magnitude and direction of the force acting on a 5.0 cm length of conductor X in figure 6.15 if I₁ is 3.2 A, I₂ is 1.2 A, and the separation of X and Y is 25 cm?

SOLUTION
QUANTITY    VALUE
F           ?
k           2.0 × 10⁻⁷ N A⁻²
l           5.0 × 10⁻² m
I₁          3.2 A
I₂          1.2 A
d           0.25 m

Use the equation:
$$\frac{F}{l} = \frac{kI_1 I_2}{d}.$$
This transposes to give:
$$
\begin{aligned}
F &= \frac{klI_1 I_2}{d} \\
  &= \frac{2.0 \times 10^{-7} \times 5.0 \times 10^{-2} \times 3.2 \times 1.2}{0.25} \\
  &= 1.5 \times 10^{-7}\ \text{N}.
\end{aligned}
$$
To determine the direction of the force, first find the direction of the magnetic field at X, due to the current in Y, by using the right-hand grip rule. The field is out of the page. Next determine the direction of the force on X using the right-hand push rule. This shows that the force is to the right.
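
The magnitude calculated in sample problem 6.2 can be reproduced with a short Python sketch. This is an illustration only, not part of the original text: the function names `field_from_wire` and `force_between_wires` are hypothetical, and the code deals only with magnitudes (the direction still has to come from the right-hand rules).

```python
K = 2.0e-7  # the constant k quoted in the text (N A^-2)


def field_from_wire(current_a, distance_m):
    """Field strength B = kI/d (tesla) at perpendicular distance d from a long straight wire."""
    return K * current_a / distance_m


def force_between_wires(i1_a, i2_a, length_m, separation_m):
    """Force magnitude F = k l I1 I2 / d (newtons) on a length l of either conductor."""
    # Field of one conductor at the position of the other, then F = I l B
    return i2_a * length_m * field_from_wire(i1_a, separation_m)


# Values from sample problem 6.2
force = force_between_wires(i1_a=3.2, i2_a=1.2, length_m=5.0e-2, separation_m=0.25)
print(f"F = {force:.1e} N")
```

Swapping the two current arguments gives the same result, which matches the statement that both conductors experience forces of equal magnitude.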

6.3 TORQUE
A torque can be thought of as the turning effect of a force acting on an object. Examples of this turning effect occur when you turn on a tap, turn the steering wheel of a car, turn the handlebars of a bicycle or loosen a nut using a spanner, as shown in figure 6.16. It is easier to rotate an object if the force, F, is applied at a greater distance, d, from the pivot axis. It is also easier to rotate the object if the force is at right angles to a line joining the pivot axis to its point of application.

Torque is the turning effect of a force. It is the product of the tangential component of the force and the distance the force is applied from the axis of rotation.

The torque, τ, increases when the force, F, is applied at a greater distance, d, from the pivot axis. It is greatest when the force is applied at right angles to a line joining the point of application of the force and the pivot axis.



If the force is perpendicular to the line joining the point of application of the force and the pivot point, the following formula can be used:
$$\tau = Fd.$$
The SI unit for torque is the newton metre (N m).

Figure 6.16 A force is applied to a spanner to produce torque on a nut and the spanner. (Pivot point is the centre of the bolt.)

If the force is not perpendicular to the line joining the point of application of the force and the pivot point, the component of the force that is perpendicular to the line (see figure 6.17) can be used. The magnitude of the torque can then be calculated using the following formula:
$$\tau = Fd \sin\theta$$
where θ is the angle between the force and the line joining the point of application of the force and the pivot axis.

Figure 6.17 Calculating torque when F and d are not perpendicular

SAMPLE PROBLEM 6.3 Calculating torque
A lever is free to rotate about a point, P. Calculate the magnitude of the torque acting on the lever if a force of 24 N acts at right angles to the lever at a distance of 0.75 m from P. The situation is shown in figure 6.18.

Figure 6.18 (a 24 N force acting at right angles to the lever, 0.75 m from P)

SOLUTION
QUANTITY    VALUE
F           24 N
d           0.75 m
τ           ?

$$
\begin{aligned}
\tau &= Fd \\
     &= 24 \times 0.75 \\
     &= 18\ \text{N m}
\end{aligned}
$$

SAMPLE PROBLEM 6.4 Calculating torque
What would be the magnitude of the torque in sample problem 6.3 if the force was applied at an angle of 26° to the lever, as shown in figure 6.19?

Figure 6.19 (a 24 N force acting at 26° to the lever, 0.75 m from P)

SOLUTION
QUANTITY    VALUE
F           24 N
d           0.75 m
θ           26°
τ           ?

$$
\begin{aligned}
\tau &= Fd \sin\theta \\
     &= 24 \times 0.75 \times \sin 26^\circ \\
     &= 7.9\ \text{N m}
\end{aligned}
$$
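
Both torque calculations above use the same relationship, with sample problem 6.3 being the special case θ = 90°. A minimal Python sketch (the function name `torque` and its default argument are choices made here for illustration, not something defined in the text):

```python
import math


def torque(force_n, distance_m, angle_deg=90.0):
    """Return the torque magnitude tau = F d sin(theta) in newton metres.

    angle_deg is the angle between the force and the line joining the
    pivot to the point of application; 90 degrees gives tau = F d.
    """
    return force_n * distance_m * math.sin(math.radians(angle_deg))


print(f"Sample problem 6.3: {torque(24, 0.75):.0f} N m")
print(f"Sample problem 6.4: {torque(24, 0.75, 26):.1f} N m")
```

Setting the angle to 0° gives zero torque, which corresponds to a force directed along the lever towards or away from the pivot.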



6.4 DC ELECTRIC MOTORS
An electric motor is a device that transforms electrical potential energy into rotational kinetic energy. Electric motors produce rotational motion by passing a current through a coil in a magnetic field. Electric motors that operate using direct current (DC) are discussed in this section. The operation of electric motors that use alternating current (AC) is discussed in chapter 9.

Anatomy of a motor
A simplified diagram of a single-turn DC motor is shown in figure 6.20 (which shows only the parts of the DC motor that produce rotational motion).
The magnets provide an external magnetic field in which the coil rotates. As the magnets are fixed to the casing of the motor and are stationary, they are known as the stator. The stator sometimes consists of a pair of electromagnets.

The stator is the non-rotating magnetic part of the motor.

The coil carries a direct current. In figure 6.20 the coil has only one loop of wire and this is shown with straight sides. This makes it easier to visualise how forces on the sides come about and to calculate the magnitudes of forces. The coil is wound onto a frame known as an armature. This is usually made of ferromagnetic material and it is free to rotate on an axle. The armature and coil together are known as the rotor. The armature axle protrudes from the casing, enabling the movement of the coil to be used to do work.

The armature is a frame around which the coil of wire is wound, which rotates in the motor’s magnetic field.

Figure 6.20 (a) The functional parts of a simplified electric motor (labels: stationary magnets, coil (rotor), split-ring commutator, brush, source of emf) (b) The direction of current flow in the coil and the direction of the forces acting on the sides

The force acting on the sides of the coil that are perpendicular to the magnetic field can be calculated using the previously discussed formula for calculating the force on a current-carrying conductor in a magnetic field:
$$F = BIl \sin\theta.$$
Real motor rotors have many loops or turns of wire on them. If the coil has n turns of wire on it, then these sides experience a force that is n times greater. In this case:
$$F = nBIl \sin\theta.$$
This extra force increases the torque acting on the sides of the coil.
The split-ring commutator and the brushes form a mechanical switch that changes the direction of the current through the coil every half turn so that the coil continues rotating in the same direction. The operation of the commutator is discussed in a later section of this chapter.

A commutator is a device for reversing the direction of a current flowing through an electric circuit, for example, the coil of a motor.



The source of emf (electromotive force), for example a battery, drives
the current through the coil.

How a DC motor operates


Figure 6.21 shows the simplified DC motor at five positions throughout a
single rotation. The coil has been labelled with the letters K, L, M and N
so that it is possible to observe the motion of the coil as it completes one
rotation.

Figure 6.21 Forces acting on the sides of a current-carrying loop, shown at positions (a) to (e). The lower part of the diagram shows cross-sections of the coil.

In figure 6.21(a), the side LK has a force acting on it that is vertically upwards. Side MN has a force of equal magnitude acting on it that is vertically downwards. In this position the forces acting on the sides are perpendicular to the line joining the axle (the pivot line) to the place of application of the force. This means that the torque acting on the coil is at its maximum value. Note that the current is flowing in the direction of K to L.
In figure 6.21(b), the side LK still has a force acting on it that is vertically upwards. Similarly, side MN still has a force of equal magnitude acting on it that is vertically downwards. In this position the forces acting on the sides are almost parallel to the line joining the axle (the pivot line) to the place of application of the force. This means that the torque acting on the coil is almost zero. It is just after this position that the commutator changes the direction of the current through the coil. The momentum of the coil keeps the coil rotating even though the torque is very small.
Figure 6.21(c) shows the situation when the coil has moved a little further than in the previous diagram, and the current direction through the coil has been reversed. The force acting on side LK is now downwards and the force acting on side MN is now upwards. This changing of direction of the forces and the momentum of the coil enable the coil to keep rotating in the same direction. If the current through the coil did not change its direction of flow through the coil, the coil would rock back and forth about this position. Note that the current is now flowing in the direction of L to K and the torque acting on the coil is still clockwise.
Figure 6.21(d) shows the position of the coil when the torque is again at a maximum value. In this case side MN has the upward force acting on it.
Figure 6.21(e) shows the position of the coil when the torque is again virtually zero and the current has again been reversed. Note that the current is again flowing in the direction of K to L and that there is still a clockwise torque acting on the coil.
The magnitude of the forces acting on sides LK and MN remained constant throughout the rotation just described. However, the torque acting on the coil changed in magnitude.
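
The contrast between a constant force and a changing torque can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the book's own method: the coil values (B, I, side length, width) are hypothetical, the rotation angle is measured from the maximum-torque position of figure 6.21(a), each side is taken to be half the coil width from the axle, and the absolute value stands in for the commutator keeping the torque in the same rotational sense.

```python
import math


def side_force(n_turns, b_tesla, current_a, side_length_m):
    """Force on one side of the coil, F = n B I l (the side stays at 90 degrees to the field)."""
    return n_turns * b_tesla * current_a * side_length_m


def coil_torque(force_n, width_m, rotation_deg):
    """Total torque magnitude on the coil from both sides, using tau = F d sin(theta).

    rotation_deg is the rotation from the position of maximum torque, so the
    angle between the force and the line from the axle to the side is
    (90 - rotation_deg) degrees. Each side is width/2 from the axle.
    """
    theta = math.radians(90.0 - rotation_deg)
    per_side = force_n * (width_m / 2) * abs(math.sin(theta))  # abs models the commutator
    return 2 * per_side


# Hypothetical coil values, chosen only for illustration
F = side_force(n_turns=1, b_tesla=0.25, current_a=2.0, side_length_m=0.10)

for rotation in (0, 45, 90, 135, 180):
    tau = coil_torque(F, width_m=0.08, rotation_deg=rotation)
    print(f"rotation = {rotation:3d} degrees: F = {F:.3f} N, torque = {tau:.4f} N m")
```

The force printed on every line is the same, while the torque falls to zero a quarter turn after the maximum-torque position and returns to its maximum after a half turn.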

Commutators
The commutator is a mechanical switch that automatically changes the direction of the current flowing through the coil when the torque falls to zero. Figure 6.22 provides a close-up look at a commutator. It consists of a split metal ring, each part of which is connected to one end of the coil. As the coil rotates, first one half of the ring and then the other makes contact with a given brush. This reverses the direction of the current through the coil. Conducting contacts called brushes connect the commutator to the DC source of emf. Graphite, which is used in the brushes, is a form of carbon which conducts electricity and is also used as a lubricant. The contacts are called brushes because they brush against the commutator as it turns. The brushes are necessary to stop the connecting wires from becoming tangled.

A split metal ring is the two-piece conducting metal surface of a commutator. Each part is connected to the coil.

The brushes are conductors that make electrical contact with the moving split metal ring of the commutator.

Figure 6.22 A close-up look at a split-ring commutator (labels: insulator, commutator, brush, to −ve terminal, to +ve terminal)



Electromagnet The magnetic field in a
N Commutator DC motor
The magnetic field of a DC motor can be provided
Brush either by permanent magnets (see figure 6.24) or by
electromagnets. The permanent magnets are fixed to
Insulator the body of the motor. Electromagnets can be created
Axle
using a soft iron shape that has coils of wire around it.
S The current that flows through the armature coil can
be used in the electromagnet coils. One arrangement
for achieving this effect is shown in figure 6.23.

– +
Changing the speed of a DC motor
Figure 6.23 Using an electromagnet to provide the Increasing the maximum torque acting on the sides can
magnetic field. Note that the coil is not shown in this increase the speed of a DC motor. This can be achieved
diagram! by:
• increasing the force acting on the sides
• increasing the width of the coil
• using more than one coil mounted on the armature.
The force can be increased by
• increasing the current in the coil (this is achieved by increasing the
emf across the ends of the coil)
• increasing the number of loops of wire in the coil
• producing a stronger magnetic field with the stator
• using a soft iron core in the centre of the loop. (The core then acts
like an electromagnet that changes the direction of its poles when the
current changes direction through the coil.) The soft iron core is a
part of the armature.
(Practical activity 6.3: A model DC motor)
Another method used to increase the average torque acting on the coil
and armature is to have two or more coils that are wound onto the
armature. This arrangement also means that the motor runs more smoothly
than a single-coil motor. Having more than one coil requires a
commutator that has two opposite segments for each coil. A stator with
curved magnetic poles keeps the force at right angles to the line
joining the position of application and the axle for longer. This keeps
the torque at its maximum value for a longer period of time.
Figure 6.24 shows many of these features in a small battery-operated DC
motor. Note that only one of the stator magnets is shown and that it is
curved. The poles of this magnet are on the inside and outside surfaces.
The armature has three iron lobes that form the cores of the coils. The
coils are made from enamelled copper wire wound in series on the lobes
of the armature. The enamel insulates the wire and prevents short
circuits.

Figure 6.24 A cutaway look at a battery-operated DC motor

112 MOTORS AND GENERATORS


PHYSICS FACT
• Michael Faraday came up with the idea of an electric motor in 1821.
• The first electric motor was created by accident when two
generators were connected together by a worker at the Vienna
Exhibition in 1873.
• The French engineer and inventor Zénobe–Théophile Gramme
produced the first commercial motors in 1873.
• Direct current (DC) motors were installed in trains in Germany
and Ireland in the 1880s.
• Nikola Tesla patented the first significant alternating current (AC)
motor in 1888.

Calculating the torque of a coil in a DC motor

Consider a single coil of length, l, and width, w, lying in a magnetic
field, B. The plane of the coil makes an angle, θ, with the magnetic
field. The coil carries a current, I, and is free to rotate about a
central axis. This situation is shown in figure 6.25.

Figure 6.25 The plane of the coil at an angle, θ, to the magnetic field
(labels: sides KL and MN, length l, width w, current I, field B between
the N and S poles)

Side KL experiences a vertically upward force of IlB. Side MN
experiences a vertically downward force of IlB.
Both forces exert a clockwise torque on the coil. The magnitude of the
torque on each side of the coil is given by:
    τ = Fd sin φ
where φ is the angle between the force acting on side KL or MN and the
line joining that side to the axis of rotation. Note that as φ decreases
from 90° to 0°, θ, which is now the angle between the plane of the coil
and the magnetic field, increases from 0° to 90°.
Also,
    F = IlB, d = w/2 and φ = (90 − θ)°.
Therefore the total torque acting on the coil is given by:
    τ = 2 × IlB × (w/2) × sin (90 − θ)°.
Since l × w = A, the area of the coil, and sin (90 − θ)° = cos θ, the
total torque acting on a coil can be expressed as:
    τ = BIA cos θ.
If the coil has n loops of wire on it, the above formula becomes:
    τ = nBIA cos θ.
(Remember that θ is the angle between the plane of the coil and the
magnetic field.)
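
The derivation above can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function names and the coil values are hypothetical and chosen only for illustration. It builds the torque from the force on each long side and compares it with the compact formula τ = nBIA cos θ.

```python
import math

def torque_from_forces(B, I, l, w, theta_deg):
    """Total torque (N m) on a single loop, built from the force on each long side."""
    F = I * l * B                        # force on each side of length l (N)
    d = w / 2                            # distance from each side to the axis (m)
    phi = math.radians(90 - theta_deg)   # angle used in tau = F d sin(phi)
    return 2 * F * d * math.sin(phi)     # both sides contribute equal torques

def torque_from_area(B, I, l, w, theta_deg, n=1):
    """Torque (N m) from the compact formula tau = n B I A cos(theta)."""
    return n * B * I * (l * w) * math.cos(math.radians(theta_deg))

# Hypothetical coil: 0.10 m x 0.05 m carrying 2.0 A in a 0.30 T field
for theta in (0, 30, 60, 90):
    a = torque_from_forces(0.30, 2.0, 0.10, 0.05, theta)
    b = torque_from_area(0.30, 2.0, 0.10, 0.05, theta)
    print(theta, round(a, 6), round(b, 6))
```

Both columns agree at every angle: the torque is greatest when the plane of the coil is parallel to the field (θ = 0°) and falls to zero when the plane is perpendicular to it (θ = 90°).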



SAMPLE PROBLEM 6.5 Calculating torque on a coil
A coil contains 15 loops and its plane is sitting at an angle of 30° to
the direction of a magnetic field of 7.6 mT. The coil has dimensions as
shown in figure 6.26 and a 15 mA current passes through the coil.
Determine the magnitude of the torque acting on the coil and the
direction (clockwise or anticlockwise) of the coil’s rotation.

Figure 6.26 (coil of dimensions 8.0 cm × 12.0 cm carrying current I, with
its plane at 30° to the field B between the N and S poles)

SOLUTION
Use the relationship τ = nBIA cos θ.

QUANTITY    VALUE
n           15 loops
B           7.6 × 10⁻³ T
A           9.6 × 10⁻³ m²
I           1.5 × 10⁻² A
θ           30°
τ           ?

τ = 15 × 7.6 × 10⁻³ × 1.5 × 10⁻² × 9.6 × 10⁻³ × cos 30°
  = 1.4 × 10⁻⁵ N m
To determine the direction of rotation of the coil, apply the right-hand
push rule to the left-hand side of the coil. This shows that the direction
in this case is anticlockwise.
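
The arithmetic in this sample problem can be reproduced with a few lines of Python. This is a sketch added for checking only; the variable names are assumptions, and the values are those given in the problem statement.

```python
import math

# Values from sample problem 6.5, converted to SI units
n = 15                    # number of loops
B = 7.6e-3                # magnetic field strength (T)
I = 15e-3                 # current (A)
length_m = 12.0e-2        # coil length (m)
width_m = 8.0e-2          # coil width (m)
theta_deg = 30            # angle between the plane of the coil and the field

area = length_m * width_m                                  # coil area (m^2)
torque = n * B * I * area * math.cos(math.radians(theta_deg))

print(f"A = {area:.1e} m^2")
print(f"torque = {torque:.2e} N m")   # prints torque = 1.42e-05 N m
```

To two significant figures this matches the answer of 1.4 × 10⁻⁵ N m.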

PHYSICS IN FOCUS
The galvanometer
A galvanometer is a device used to measure the magnitude and direction
of small direct currents (DC). A schematic diagram of a galvanometer is
shown in figure 6.27.
The coil consists of many loops of wire and it is connected in series
with the rest of the circuit so that the current in the circuit flows
through the coil. When the current flows, the coil experiences a force
due to the presence of the external magnetic field (the motor effect).
The iron core of the coil increases the magnitude of this force. The
needle rotates until the magnetic force acting on the coil is balanced
by the force of a counter-balancing spring.
Note that the magnets around the core are curved. This results in a
radial magnetic field; the plane of the coil will always be parallel to
the magnetic field and the torque will be constant no matter how far the
coil is deflected. This also means that the scale of the galvanometer is
linear, with the amount of deflection being proportional to the current
flowing through the coil.

Figure 6.27 The galvanometer (labels: scale, fixed iron core, moveable
coil, permanent magnet with N and S poles, current I)



PHYSICS IN FOCUS
Loudspeakers
Loudspeakers are devices that transform electrical energy into sound
energy. A loudspeaker consists of a circular magnet that has one pole on
the outside and the other on the inside. This is shown in figure 6.28.
A coil of wire (known as the voice coil) sits in the space between the
poles. The voice coil is connected to the output of an amplifier. The
amplifier provides a current that changes direction at the same
frequency as the sound that is to be produced. The current also changes
magnitude in proportion to the amplitude of the sound. The voice coil is
caused to vibrate or move in and out of the magnet by the motor effect.
The direction of movement of the voice coil can be determined using the
right-hand push rule. This can be shown by examining figure 6.28(b).
When the current in the coil is anticlockwise the force on the coil is
out of the page. When the current is clockwise, the force on the coil is
into the page. The voice coil is connected to a paper speaker cone that
creates sound waves in the air as it vibrates. When the magnitude of the
current increases, so too does the force on the coil. When the force on
the coil increases, it moves more and the produced sound is louder.

Figure 6.28 A schematic diagram of a loudspeaker (a) Side view (b) End
view, showing that the field lines of the permanent magnet are always
perpendicular to the current in the coil (labels: ring pole, central
pole, moveable voice coil attached to speaker cone, speaker cone, field
lines of magnet, current flow in voice coil)



CHAPTER REVIEW

SUMMARY
• According to the motor effect, a current-carrying conductor in a
magnetic field will experience a force that is perpendicular to the
direction of the magnetic field. The direction of the force is
determined using the right-hand push rule.
• The right-hand push rule is applied by:
– extending the fingers in the direction of the magnetic field
– pointing the thumb in the direction of the current in the conductor.
The palm of the hand indicates the direction of the force.
• The magnitude of the force, F, on a current-carrying conductor is
proportional to the strength of the magnetic field, B, the magnitude of
the current, I, the length, l, of the conductor in the external field
and the sine of the angle between the conductor and the field:
    F = BIl sin θ.
• If the conductor is parallel to the magnetic field, there is no force.
• Two long parallel current-carrying conductors will exert a force on
each other. The magnitude of this force is determined using the
following formula:
    F/l = k I₁I₂/d.
• If the currents are in the same direction, the conductors attract each
other. If the currents are in opposite directions, the conductors repel
each other.
• Torque is the turning effect (moment) of a force. The magnitude of the
torque is determined using the following formula:
    τ = Fd sin θ
where θ is the angle between the force and the line joining the point of
application of the force and the pivot axis.
• The torque acting on the coil of an electric motor is given by the
formula:
    τ = nBIA cos θ
where θ is the angle between the plane of the coil and the magnetic
field.
• A DC electric motor is one application of the motor effect.
• A DC electric motor has a current-carrying coil that rotates about an
axis in an external magnetic field.
• Galvanometers and loudspeakers are other applications of the motor
effect.

QUESTIONS
1. State the law of magnetic poles.
2. Draw a bar magnet and the magnetic field around it. Label the diagram
to show that you understand the characteristics of magnetic field lines.
3. Are the north magnetic pole of the Earth and the north pole of a bar
magnet of the same polarity? Explain your reasoning.
4. Figure 6.29 shows three bar magnets and some of the field lines of the
resulting magnetic field.
(a) Copy and complete the diagram to show the remaining field lines.
(b) Label the polarities of the magnets.

Figure 6.29 (three bar magnets; one pole is labelled S)

5. Draw a diagram to show the direction of the magnetic field lines
around a conductor when the current is (a) travelling towards you and
(b) away from you.
6. Each diagram in figure 6.30 represents two parallel current-carrying
conductors. In each case, determine whether the conductors attract or
repel each other. Explain your reasoning.

Figure 6.30 (two diagrams, (a) and (b))

7. Each empty circle in figure 6.31 represents a plotting compass near a
coiled conductor. Copy the diagram and label the N and S poles of each
coil, and indicate the direction of the needle of each compass.

Figure 6.31 (two diagrams, (a) and (b); labels: compass, conductor)

8. The diagrams in figure 6.32 show electromagnets. Identify which poles
are N and which are S.



Figure 6.32 (three electromagnet diagrams, (a), (b) and (c))

9. In figure 6.33 a current-carrying conductor is in the field of a
U-shaped magnet. Identify the direction in which the conductor is forced.

Figure 6.33 (a conductor between the N and S poles of a U-shaped magnet)

10. Identify the direction of the force acting on each of the
current-carrying conductors shown in figure 6.34. Use the terms ‘up the
page’, ‘down the page’, ‘into the page’, ‘out of the page’, ‘left’ and
‘right’.

Figure 6.34 (four diagrams, (a) to (d), each showing a conductor carrying
a current I in a magnetic field)

11. Deduce both the magnitude and direction of the forces acting on the
lengths of conductors shown in figure 6.35.

Figure 6.35 (a) B = 4.5 mT, I = 3.5 A, l = 43 cm (b) B = 6.5 × 10⁻⁴ T,
I = 2.5 A, l = 10.5 cm, with an angle of 63° marked

12. A student wishes to demonstrate the strength of a magnetic field in
the region between the poles of a horseshoe magnet. He sets up the
apparatus shown in figure 6.36.

Figure 6.36 (labels: ammeter A, newton meter, support for meter, wire,
magnet poles N and S)

The length of wire in the magnetic field is 2.0 cm. When the ammeter
reads 1.0 A, the force measured on the newton meter is 0.25 N.
(a) What is the strength of the magnetic field?
(b) In this experiment the wire moves to the right. In what direction is
the current flowing, up or down the page?
13. A wire with the shape shown in figure 6.37 carries a current of
2.0 A. It lies in a uniform magnetic field of strength 0.60 T.



Figure 6.37 (labels: points A, B, C, D and O; 5.0 cm; I = 2.0 A)

(a) Calculate the magnitude of the force acting on the section of wire,
AB.
(b) Which of the following gives the direction of the force acting on
the wire at the point, C?
(i) Into the page
(ii) Out of the page
(iii) In the direction OC
(iv) In the direction CO
(v) In the direction OD
(vi) In the direction DO
(c) Which of the following gives the direction of the net force acting
on the semicircular section of wire?
(i) Into the page
(ii) Out of the page
(iii) In the direction OC
(iv) In the direction CO
(v) In the direction OD
(vi) In the direction DO
14. A wire of length 25 cm lies at right angles to a magnetic field of
strength 4.0 × 10⁻² T. A current of 1.8 A flows in the wire. Calculate
the magnitude of the force that acts on the wire.
15. Two long straight parallel current-carrying wires are separated by
6.3 cm. One wire carries a current of 3.4 A upward and the other carries
a current of 2.5 A downward.
(a) Evaluate the magnitude of the force acting on a 45 cm length of one
of the wires.
(b) Is the force between the wires attraction or repulsion?
16. Evaluate the magnitude of the force acting on a 40 cm length of one
of the two long wires shown in figures 6.38 (a), (b) and (c).

Figure 6.38 (three pairs of parallel wires; currents: (a) 5.0 A and
5.0 A, (b) 4.0 A and 2.5 A, (c) 1.5 A and 2.0 A)

17. The diagram in figure 6.39 represents a side view of a single loop in
a DC electric motor. Identify the direction of the forces acting on
sides A and B of the loop.

Figure 6.39 (sides A and B of the loop between the S and N poles)

18. Figure 6.40 shows the functional parts of a type of DC electric
motor.

Figure 6.40 (parts labelled A, B, C and D; magnet poles N and S)

(a) Name the parts labelled A to D in the diagram.
(b) Describe the functions of the parts labelled A to D.
19. A coil is made up of 50 loops of wire and its plane is at an angle of
45° to the direction of a magnetic field of strength 0.025 T. The coil
has the dimensions shown in figure 6.41 and a current of 1.5 A flows
through it in the direction shown on the diagram.
(a) Identify the direction of the force acting on side AB.
(b) Calculate the magnitude of the force acting on side AB.
(c) Calculate the area of the coil.
(d) Calculate the magnitude of the torque acting on the coil when it is
in the position shown in the diagram.

Figure 6.41 (coil ABCD carrying current I, with its plane at 45° to the
field B)

Distance labels on figures 6.38 and 6.41, in the order they appear:
15 cm, 18 cm, 25 cm, 15 cm, 42 cm.



20. A student makes a model motor. She makes a rectangular coil with 25
turns of wire with a length of 0.050 m and width 0.030 m. The coil is
free to rotate about an axis that is represented by a dotted line in
figure 6.42.

Figure 6.42 (coil ABCD between the N and S poles; sides of 0.03 m and
0.05 m)

At the instant shown the plane of the coil is parallel to the direction
of the magnetic field. The magnetic field strength is 0.45 T. When the
current to the coil is activated it has a magnitude of 1.75 A in the
direction ADCB.
(a) Calculate the magnitude and direction of the force acting on side CD
when the current is flowing and the coil is in the position shown in the
diagram.
(b) When the current begins to flow, the net force acting on the coil is
zero yet the coil begins to rotate. Why does this occur?
(c) Describe what happens to the magnitude and direction of the force
acting on side CD as the coil swings through an angle of 60°.
(d) Describe three things the student could do to get the coil to rotate
at a faster rate.



PRACTICAL ACTIVITIES

6.1 THE MOTOR EFFECT

Aim
To observe the direction of the force on a current-carrying conductor in
an external magnetic field.

Apparatus
variable DC power supply
variable resistor (15 Ω rheostat)
connecting wires
retort stand
clamp
two pieces of thick card or balsa wood 10 cm × 10 cm
strip of aluminium foil 1 cm × 30 cm (approximately)
two drawing pins
switch
horseshoe magnet

Method
1. Pin the foil strip between the pieces of card. Rest one card on the
bench-top and support the other with the clamp and retort stand.
2. Connect the wires to the power pack’s DC terminals, switch, variable
resistor and strip as shown in figure 6.43. This will produce a current
in the strip.
3. Position the horseshoe magnet so that the strip is between the poles.
Note the position of the poles of the magnet and the direction of the
current through the strip when the switch is closed.
4. Set the power pack to its lowest value and turn it on.
5. Briefly close the switch and record the movement of the foil strip.
6. Turn the magnet over so that the magnetic field is in the opposite
direction across the strip.
7. Briefly close the switch and record the movement of the foil strip.

Figure 6.43 The set-up for the motor effect activity (labels: cardboard,
strip of aluminium foil, magnet poles N and S)

Analysis
1. Did the strip experience a force when a current flowed?
2. Verify that the movement of the aluminium strip is in accordance with
the right-hand push rule.

6.2 THE FORCE BETWEEN TWO PARALLEL CURRENT-CARRYING CONDUCTORS

Aim
To observe the direction of the forces between two parallel
current-carrying conductors.

Apparatus
variable DC power supply
variable resistor (15 Ω rheostat)
connecting wires
retort stand
clamp
two pieces of thick card or balsa wood 10 cm × 10 cm
two strips of aluminium foil 1 cm × 30 cm (approximately)
four drawing pins
push switch

Method
1. Pin each foil strip between the pieces of card so that they are
parallel when the top card is supported by the clamp and retort stand.
2. Connect the wires to the power pack’s DC terminals, switch, variable
resistor and strips as shown in figure 6.44(a). This will produce
currents in the strips that are flowing in opposite directions.
3. Set the power pack to its lowest value and turn it on.
4. Briefly close the switch and record the movement of the foil strips.



5. Connect the wires to the power pack’s DC terminals, switch, variable
resistor and strips as shown in figure 6.44(b). This will produce
currents in the strips that are flowing in the same direction.
6. Briefly close the switch and record the movement of the foil strips.

Figure 6.44 (a) The set-up for currents flowing in opposite directions
(b) The set-up for currents flowing in the same direction (both shown
with the current off; labels: top card, bottom card)

Analysis
Account for your observations.

6.3 A MODEL DC MOTOR

Aim
To build a model DC motor.

Apparatus
6 V, DC power supply
two bar magnets
thin insulated copper wire
two cylindrical corks, one thin and one thick
aluminium foil
glue
adhesive tape
five bamboo skewers
thick rectangular piece of polystyrene

Method
1. Push one of the skewers through the centres of the corks as shown in
figure 6.45.
2. Glue two pieces of foil onto the small cork with two thin gaps
between them to make a split ring commutator.
3. Wrap the thin copper wire around the thick cork 50 times, as shown.
Hold in place with adhesive tape.
4. Strip the ends of the wire and connect to each of the foil strips of
the commutator.
5. Make sure that the centres of the commutator strips line up with the
centre of the coil windings.
6. Push the other skewers into the foil to support the coil and
commutator, as shown.
7. Use the foil to make a set of brushes and use drawing pins to position
them so that they touch opposite sides of the commutator, as shown.
8. Position the two magnets on opposite sides of the coil so that a
N pole faces a S pole.
9. Connect the DC supply to the brushes and observe the motion of the
armature.
10. Vary the spacing of the magnets.
11. Find two ways to reverse the direction of rotation of the armature.

Figure 6.45 The set-up for a model motor (labels: bamboo skewers, 50 turns
of insulated copper wire (armature), large cork, small cork, commutator
(with foil strips), magnets N and S, foil brush, 6 V DC supply,
polystyrene base)

Analysis
1. Describe the effect of varying the spacing of the magnets on the speed
of rotation of the armature. Account for this effect.
2. Describe two methods for reversing the direction of rotation of the
armature.
3. Identify the role of the following:
– magnets
– coil.



CHAPTER 7
GENERATING ELECTRICITY
Remember
Before beginning this chapter, you should be able to:
• define current, I, as the rate of flow of charge, Q, in a
circuit
• use the formula Q = It
• recall that the potential difference, V, is the amount
of energy transformed in a circuit element, per
coulomb of charge passing through the circuit
element
• recall that the emf (E or ε) is the amount of energy
supplied by a source, per coulomb of charge passing
through the source
• state Ohm’s Law for metal conductors at a constant
temperature, V = IR.

Figure 7.1 Faraday’s magnetic laboratory, 1852 (watercolour on paper,
by Harriet Jane Moore (1801–1884))

Key content
At the end of this chapter you should be able to:
• outline Michael Faraday’s discovery of the generation
of an electric current by a moving magnet
• define the magnetic field strength, B, as magnetic
flux density
• describe the concept of magnetic flux in terms of
magnetic flux density and surface area
• describe generated potential difference as the rate of
change of magnetic flux through a circuit
• account for Lenz’s Law in terms of conservation of
energy, and relate it to the production of back emf in
motors
• explain that, in electric motors, back emf opposes the
supply emf
• explain the production of eddy currents in terms of
Lenz’s Law.
7.1 THE DISCOVERIES OF MICHAEL FARADAY
Michael Faraday (1791–1867) was the son of an English blacksmith. He
started his working life at the age of twelve as an errand boy at a
bookseller’s store and later became a bookbinder’s assistant. At the age
of nineteen he attended a series of lectures at the Royal Institution in
London that were given by Sir Humphry Davy. This led to Faraday studying
chemistry by himself. In 1813 he applied to Davy for a job at the Royal
Institution and was hired as a research assistant. He soon showed his
abilities as an experimenter and made important contributions to the
understanding of chemistry, electricity and magnetism. He later became
the superintendent of the Royal Institution.

Figure 7.2 1830 portrait of Michael Faraday

In September 1821, following the 1820 discovery by Hans Christian Oersted
(1777–1851) that an electric current produces a magnetic field, Michael
Faraday discovered that a current-carrying conductor in a magnetic field
experiences a force. This became known as the motor effect (see
chapter 6).
Almost 10 years later, in August 1831, Faraday discovered electromagnetic
induction. This is the generation of an emf and/or electric current
through the use of a magnetic field. Faraday’s discovery was not
accidental. He and other scientists spent many years searching for ways
to produce an electric current using a magnetic field. Faraday’s
breakthrough eventually led to the development of the means of
generating electrical energy in the vast quantities that we use in our
society today.

PHYSICS FACT
Joseph Henry
American Joseph Henry (1797–1878) seems to have observed an
induced current before Faraday, but Faraday published his
results first and investigated the subject in more detail.

First experiments
In his first successful experiment, Faraday set out to produce and detect
a current in a coil of wire by the presence of a magnetic field set up by
another coil. He appears to have coiled about 70 m of copper wire around
a block of wood. A second length of copper wire was then coiled around
the block in the spaces between the first coil. The coils were separated
with twine. One coil was connected to a galvanometer and the other to a
battery. (A galvanometer is an instrument for detecting small electric
currents. Faraday’s early efforts to detect an induced current failed
because of the lack of sensitivity of his galvanometers.) A simplified
diagram of this experiment is shown in figure 7.3.
When the battery circuit (or primary circuit) was closed, Faraday
observed ‘there was a sudden and very slight effect [deflection] at the
galvanometer.’ This means that Faraday had observed a small brief current
that was created in the galvanometer circuit (or secondary circuit). A
similar effect was also produced when the current in the battery circuit
was stopped, but the momentary deflection of the galvanometer needle
was in the opposite direction.

Figure 7.3 A simplified diagram of Faraday’s first experiment (labels:
primary circuit with battery and switch, secondary circuit with
galvanometer, block of wood)

Faraday was careful to emphasise that the current in the galvanometer
circuit was a temporary one and that no current existed when the
current in the battery circuit was at a constant value.
Faraday modified this experiment by winding the secondary coil
around a glass tube. He placed a steel needle in the tube and closed the
primary circuit. He then removed the needle and found that it had been
magnetised. This also showed that a current had been produced
(induced) in the secondary circuit. It was the magnetic field of the
induced current in the secondary circuit that had magnetised the needle.
The next experiment was to place a steel needle in the secondary coil
when a current was flowing in the primary coil. The primary current was
stopped and the needle was again found to be magnetised, but with its
poles reversed compared with the first experiment.

Iron ring experiment
In a further experiment, Faraday used a ring made of soft iron (see
figure 7.4). He wound a primary coil on one side and connected it to a
battery and switch. He wound a secondary coil on the other side and
connected it to a galvanometer. A simplified diagram of this experiment
is shown in figure 7.5.

Figure 7.4 Photograph of the apparatus (two coils of insulated copper
wire wound around an iron ring) that Faraday used to induce an electric
current on 29 August 1831



Figure 7.5 A simplified diagram of Faraday’s iron ring experiment
(labels: iron ring, primary circuit with battery and switch, secondary
circuit with galvanometer)

Faraday’s iron ring apparatus, an iron ring with a primary and secondary
coil wrapped around it, is the basis of modern electrical transformers.

When the current was set up in the primary coil, the galvanometer
needle immediately responded, as Faraday stated, ‘to a degree far beyond
what has been described when the helices [coils] without an iron core were
used, but although the current in the primary was continued, the effect
was not permanent, for the needle soon came to rest in its natural position,
as if quite indifferent to the attached electromagnet’. When the current in
the primary coil was stopped, the galvanometer needle moved in the
opposite direction. He concluded that when the magnetic field of the
primary coil was changing, a current was induced in the secondary coil.

Using a moving magnet


Faraday was also able to show that moving a magnet near a coil could
generate an electric current in the coil. The diagrams in figure 7.6 show
the effect when the N pole of a magnet is brought near a coil, held
stationary, and then taken away from the coil.


Figure 7.6 (a) When the N pole of a bar magnet is brought near one end of the coil, the
galvanometer needle momentarily deflects in one direction, indicating that a current has been
induced in the coil circuit. (b) When the magnet is held without moving near the end of the
coil, the needle stays at the central point of the scale (no deflection), indicating no induced
current. (c) When the N pole of the magnet is taken away from the coil, the galvanometer
needle momentarily deflects in the opposite direction to the first situation, indicating that an
induced current exists and that it is flowing in the opposite direction.

Similar results occur when the S pole is moved near the same end of
the coil, except that the deflection of the galvanometer needle is in the
opposite direction to when the N pole moves in the same direction.



Another observation from Faraday’s experiments with a coil and a
moving magnet is that the magnitude of the induced current depends on
the speed at which the magnet is moving towards or away from the coil. If
the magnet moves slowly, a small current is induced. If the magnet moves
quickly, the induced current has a greater magnitude. This observation is
illustrated in figure 7.7. Note that in the case that has been illustrated
the S pole is approaching the coil and the current is in the opposite
direction to when the N pole approaches the coil from the same side.

Figure 7.7 (a) A slow-moving magnet induces a small current.
(b) A fast-moving magnet induces a larger current.

You can repeat some of Faraday’s experiments by doing practical
activities 7.1 (Inducing current in a coiled conductor) and 7.2
(Linking coils).

7.2 ELECTROMAGNETIC INDUCTION

Electromagnetic induction is the creation of an emf in a conductor when
it is in relative motion to a magnetic field, or it is situated in a
changing magnetic field. Such an emf is known as an induced emf. In a
closed conducting circuit, the emf gives rise to a current known as an
induced current.

Induction can be defined as a process where one object with magnetic or
electrical properties can produce the same properties in another object
without making physical contact.

Faraday demonstrated that it was possible to produce (or induce) a
current in a coil by using a changing magnetic field. For there to be a
current in the coil, there must have been an emf induced in the coil. We
will now examine how this is achieved.

Magnetic flux
The magnetic field in a region can be represented diagrammatically
using field or flux lines. You can imagine the magnetic field ‘flowing’
out from the N pole of a magnet, spreading out around the magnet and then
‘flowing’ back into the magnet through the S pole. The field lines on a
diagram show the direction of magnetic force experienced by the N pole
of a test compass if it were placed at that point. The closeness (or
density) of the lines represents the strength of the magnetic field. The
closer together the lines, the stronger the field.

The word ‘flux’ comes from the Latin word fluo meaning ‘flow’. Flux is a
state of flowing or movement. In physics, flux is the rate of flow of a
fluid, radiation or particles.

Magnetic flux is the name given to the amount of magnetic field
passing through a given area. It is given the symbol ΦB. In the SI system,
ΦB is measured in weber (Wb). If the particular area, A, is perpendicular
to a uniform magnetic field of strength B (as shown in figure 7.8)
then the magnetic flux ΦB is the product of B and A.
    ΦB = BA
The strength of a magnetic field, B, is also known as the magnetic flux
density. It is the amount of magnetic flux passing through a unit area.
In the SI system, B is measured in tesla (T) or weber per square metre
(Wb m⁻²).



The magnetic flux, ΦB, passing through an area is reduced if the
magnetic field is not perpendicular to the area, and ΦB is zero if the
magnetic field is parallel to the area. The above relationship between
magnetic flux, magnetic flux density and area is often written as:
    ΦB = B⊥A
where B⊥ is the component of the magnetic flux density that is
perpendicular to the area, A.

Figure 7.8 The magnetic field passing through an area at right angles
(labels: area A, field B)
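
As an illustration of ΦB = B⊥A, the short Python sketch below computes the flux for a field that meets the area at different angles. It is not from the original text: the function name, the choice to measure the angle between the field and the plane of the area, and the numerical values are all assumptions made for this example.

```python
import math

def magnetic_flux(B, A, angle_to_plane_deg=90.0):
    """Flux (Wb) through area A (m^2) for a uniform field B (T).

    angle_to_plane_deg is the angle between the field and the plane of the
    area (an assumed convention): 90 means the field is perpendicular to
    the area, 0 means it is parallel to the area.
    """
    b_perp = B * math.sin(math.radians(angle_to_plane_deg))  # component of B perpendicular to the area
    return b_perp * A

# Hypothetical values: a 0.50 T field and an area of 0.020 m^2
for angle in (90, 30, 0):
    print(angle, magnetic_flux(0.50, 0.020, angle))
```

The flux is largest when the field is perpendicular to the area and falls to zero when the field is parallel to it, as described above.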

7.3 GENERATING A POTENTIAL DIFFERENCE
For a current to flow through the galvanometer in Faraday’s experiments
there must be an electromotive force (emf, symbol E or ε). The
magnitude of the current through the galvanometer depends on the
resistance of the circuit and the magnitude of the emf generated in
the circuit.
Faraday noted that there had to be a change occurring in the apparatus
for an emf to be created. The quantity that was changing in each case was
the amount of magnetic flux threading (or passing through) the coil in
the galvanometer circuit (see figure 7.9). The rate at which the magnetic
flux changes determines the magnitude of the generated emf.

Figure 7.9 A galvanometer circuit showing magnetic flux threading the
coil (labels: coil, magnet N pole, magnetic field lines threading the
circuit or coil, galvanometer)

This gives Faraday’s Law of Induction, which can be stated as follows:
The induced emf in a circuit is equal in magnitude to the rate at which
the magnetic flux through the circuit is changing with time.
Faraday’s law can be written in equation form as:
    ε = −∆ΦB/∆t.
The negative sign in the above equation indicates the direction of
the induced emf. This is explained in section 7.4, Lenz’s Law.

PHYSICS FACT
The symbol for the Greek letter delta is ∆. It is used in mathematics and physics to represent a change in a quantity. The change in a quantity is calculated by subtracting the initial value from the final value. For example, the change in your bank balance over a month is the final balance minus your initial balance.
When calculating quantities using Faraday’s Law of Induction,
$$\Delta \Phi_B = \Phi_{B\,\text{final}} - \Phi_{B\,\text{initial}}$$
Since ΦB = B⊥A, a change in ΦB can be caused by a change in the magnetic field strength, B, or in the area of the coil that is perpendicular to the magnetic field, or both.

If a coil has n turns of wire on it, the emf induced by a change in the
magnetic flux threading the coil would be n times greater than that
produced if the coil had only one turn of wire.
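As a rough numerical illustration of Faraday’s Law of Induction for a coil of n turns, the Python sketch below computes the induced emf from an initial flux, a final flux and the time taken. The function name and all of the values are hypothetical, chosen only to show the arithmetic.

```python
def induced_emf(flux_initial_wb, flux_final_wb, time_s, turns=1):
    """Return the induced emf (V) for a coil with the given number of turns.

    Uses emf = -n * (change in flux) / (time taken), where the change in
    flux is the final value minus the initial value.
    """
    delta_flux = flux_final_wb - flux_initial_wb
    return -turns * delta_flux / time_s

# Hypothetical case: the flux through the coil falls from 0.010 Wb to zero in 0.50 s.
single_turn = induced_emf(0.010, 0.0, 0.50)
fifty_turns = induced_emf(0.010, 0.0, 0.50, turns=50)

print(f"1 turn:   {single_turn:.2f} V")
print(f"50 turns: {fifty_turns:.2f} V")
```

The emf for the 50-turn coil is 50 times that of the single turn, because the same flux change threads every turn.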

Rotating coils in uniform magnetic fields


When a coil rotates in a magnetic field, as occurs in generators and
motors, the flux threading the coil is a maximum when the plane of the
coil is perpendicular to the direction of the magnetic field. If the plane
of the coil is parallel to the direction of the magnetic field, the flux
threading the coil is zero, so rotating the coil changes the magnetic flux.
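The way the flux changes as a coil turns can be sketched in Python. This is an illustration under stated assumptions: the angle is measured between the field and the normal to the plane of the coil, and the field strength, coil area and time for a quarter turn are made-up values.

```python
import math

def coil_flux(b_tesla, area_m2, angle_deg):
    """Flux (Wb) through a flat coil.

    angle_deg is the angle between the field and the normal to the plane of
    the coil (assumed convention): 0 means the plane of the coil is
    perpendicular to the field, 90 means it is parallel to the field.
    """
    return b_tesla * area_m2 * math.cos(math.radians(angle_deg))

B = 0.20            # tesla (illustrative)
A = 0.010           # square metres (illustrative)
quarter_turn_s = 0.005  # time for a quarter turn (illustrative)

for angle in (0, 30, 60, 90):
    print(f"{angle:2d} degrees: {coil_flux(B, A, angle):.5f} Wb")

# Average emf over the quarter turn from maximum flux to zero flux.
flux_start = coil_flux(B, A, 0)
flux_end = coil_flux(B, A, 90)
average_emf = -(flux_end - flux_start) / quarter_turn_s
print(f"average emf over the quarter turn: {average_emf:.2f} V")
```

Turning the coil faster shortens the time for the same change in flux, so the average emf is larger.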

7.4 LENZ’S LAW


H. F. Lenz (1804–1865) was a German scientist who, without knowledge
of the work of Faraday and Henry, duplicated many of their experiments.
Lenz discovered a way to predict the direction of an induced current.
This method is given the name Lenz’s Law. It can be stated in the
following way:
An induced emf always gives rise to a current that creates a magnetic field that
opposes the original change in flux through the circuit.
This is a consequence of the Principle of Conservation of Energy. The
minus sign in Faraday’s Law of Induction is placed there to remind us of
the direction of the induced emf.

Using Lenz’s Law


When determining the direction of the induced emf, it is useful to use the field line method for representing magnetic fields. Figure 7.10 shows the effect of a magnet moving closer to a coil connected to a galvanometer. The coil is wound on a cardboard tube. As the magnet approaches the coil, the magnetic flux density within the coil increases. The induced current sets up a magnetic field (shown in dotted lines) that opposes this change. The approaching magnet increases the number of field lines pointing to the left that pass through the coil. The induced current in the coil produces field lines that point to the right to counter this increase.
The direction of the induced current in the coil can be deduced using the right-hand rule for coils. The thumb points in the direction of the induced magnetic field within the coil, and the curl of the fingers holding the coil shows the direction of the induced current in the coil. Note that magnetic field lines do not cross. Dotted lines have been used to show the general direction of the induced field lines, not the resultant field.

Practical activity 7.3: The direction of induced currents

Figure 7.10 The N pole of a magnet approaches a coil. Note that the induced magnetic field of the coil repels the approaching N pole.

SAMPLE PROBLEM 7.1 Induced current in a coil
A metal ring initially lies in a uniform magnetic field, as shown in figure 7.11. The ring is then removed from the magnetic field. In which direction does the induced current flow in the ring?

Figure 7.11

SOLUTION
Initially the magnetic field lines of the external field are passing into the page through the ring. As the ring is removed from the field, these field lines reduce in number. The induced current flows in such a way as to create a magnetic field to replace the missing lines. Therefore, the current must flow in a clockwise direction in the ring, as indicated by using the right-hand rule for coils. The current stops flowing when the entire ring has been removed from the external magnetic field.
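The reasoning used in this kind of problem can be written as a small decision procedure. The Python sketch below is one possible way of encoding it, not a standard routine: the function name and the text labels are invented for the example, and directions are described as seen by a reader looking at the page.

```python
def induced_current_direction(field_direction, flux_change):
    """Predict the induced current direction in a loop lying in the page.

    field_direction: 'into page' or 'out of page' (external field)
    flux_change:     'increasing' or 'decreasing'
    Returns 'clockwise' or 'anticlockwise' as seen looking at the page.
    """
    opposite = {"into page": "out of page", "out of page": "into page"}
    if field_direction not in opposite:
        raise ValueError("field_direction must be 'into page' or 'out of page'")
    if flux_change not in ("increasing", "decreasing"):
        raise ValueError("flux_change must be 'increasing' or 'decreasing'")

    # Lenz's Law: the induced field opposes the change in flux. It replaces
    # lost field lines (same direction) or cancels added ones (opposite).
    if flux_change == "decreasing":
        induced_field = field_direction
    else:
        induced_field = opposite[field_direction]

    # Right-hand rule for coils: a clockwise current produces a field
    # directed into the page inside the loop.
    return "clockwise" if induced_field == "into page" else "anticlockwise"

# Sample problem 7.1: field into the page, flux decreasing as the ring leaves.
print(induced_current_direction("into page", "decreasing"))
```

For the ring in sample problem 7.1 this gives a clockwise current, in agreement with the solution above.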

Lenz’s Law and the Principle of Conservation of Energy
What would happen if the opposite of Lenz’s Law were true? That is, suppose a changing flux in a coil produced a magnetic flux in the same direction as the original change of flux. This would lead to a greater change in flux threading the coil, which in turn would lead to an even greater change in flux. The induced current would continue to increase in magnitude, fed by its own changing flux. In fact we would be creating energy without doing any work. This clearly cannot occur.
The Principle of Conservation of Energy states:
Energy can be neither created nor destroyed, but it can be transformed from one form to another.
To create electrical energy in a coil, work must be done. Energy is required to move a magnet towards or away from a coil. Some of this energy is transformed into electrical energy in the coil.

Lenz’s Law and the production of back emf in motors
Electric motors use an input voltage to produce a current in a coil to make the coil rotate in an external magnetic field. It has been shown that an emf is induced in a coil that is rotating in an external magnetic field. The emf is produced because the amount of the magnetic flux that is threading the coil is constantly changing as the coil rotates. The emf induced in the motor’s coil, as it rotates in the external magnetic field, is in the opposite direction to the input voltage or supply emf. If this were not the case, the current would increase and the motor coil would go faster and faster forever. The induced emf produced by the rotation of a motor coil is known as the back emf because it is in the opposite direction to the supply emf.
Back emf is an electromotive force that opposes the main current flow in a circuit. When the coil of a motor rotates, a back emf is induced in the coil due to its motion in the external magnetic field.
The net voltage across the coil equals the input voltage (or supply emf) minus the back emf. If there is nothing attached to an electric motor to slow it down (and if we ignore the minimal friction effects of an electric motor), the speed of the armature coil increases until the back emf is equal to the supply emf. When this occurs, there is no net voltage across the coil and therefore no current flowing through the coil. With no current through the coil there is no net force acting on it and the armature rotates at a constant rate.
When there is a load on the motor, the coil rotates at a slower rate and the back emf is reduced. There will be a voltage across the armature coil and a current flows through it, resulting in a force that is used to do the work. Since the armature coil of a motor has a fixed resistance, the net voltage across it determines the magnitude of the current that flows.
The smaller the back emf is, the greater the current flowing through the coil. If a motor is overloaded, it rotates too slowly. The back emf is reduced and the voltage across the coil remains high, resulting in a high current through the coil that could burn out the motor. Motors are usually protected by a series resistor from the initially high currents produced when they are switched on. This resistor is switched out of the circuit at higher speeds because the back emf results in a lower current in the coil.

SAMPLE PROBLEM 7.2 Currents in electric motors
The armature winding of an electric motor has a resistance of 10 Ω. The motor is connected to a 240 V supply. When the motor is operating with a normal load, the back emf is equal to 232 V.
(a) What is the current that passes through the motor when it is first started?
(b) What is the current that passes through the motor when it is operating normally?

SOLUTION
(a) When the motor is first started, there is no back emf. The voltage drop across the motor is 240 V.

QUANTITY   VALUE
V          240 V
R          10 Ω
I          ?

V = IR
$$I = \frac{V}{R} = \frac{240}{10} = 24\ \text{A}$$

(b) When the motor is operating normally, the voltage drop across the motor equals the input voltage minus the back emf.
So V = 240 V − 232 V = 8 V.

QUANTITY   VALUE
V          8 V
R          10 Ω
I          ?

V = IR
$$I = \frac{V}{R} = \frac{8}{10} = 0.8\ \text{A}$$

This example shows that electric motors require large currents when starting compared with when they are operating normally.
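The two parts of sample problem 7.2 follow the same pattern: find the net voltage across the armature, then apply V = IR. The Python sketch below repeats that arithmetic using the values from the problem. The function name is simply a label chosen for this illustration.

```python
def armature_current(supply_v, back_emf_v, resistance_ohm):
    """Current (A) through the armature coil.

    The net voltage across the coil is the supply emf minus the back emf,
    and the current follows from V = IR.
    """
    net_voltage = supply_v - back_emf_v
    return net_voltage / resistance_ohm

# Values from sample problem 7.2.
starting = armature_current(240.0, 0.0, 10.0)     # no back emf at start-up
running = armature_current(240.0, 232.0, 10.0)    # back emf of 232 V under normal load

print(f"starting current: {starting:.1f} A")
print(f"running current:  {running:.1f} A")
```

Changing the back emf in the second call shows how quickly the current rises when a motor is slowed by a heavy load.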

7.5 EDDY CURRENTS
Charged particles moving in magnetic fields
Moving charged particles, for example electrons or alpha particles, produce magnetic fields. The direction of the magnetic field is found using the right-hand grip rule (see the boxed section Review of magnetic fields on page 101).
The right-hand grip rule is used to find the direction of a magnetic field around a straight current-carrying conductor.
When moving charged particles enter an external magnetic field, the magnetic field created by the moving charged particles interacts with the external magnetic field. (An external magnetic field is one that already exists or that is caused by another source.)
When the moving charged particles enter the magnetic field at right angles to the field, they experience a force that is at right angles to the velocity and to the direction of the external field. The direction of the force is determined by using the right-hand push rule (also known as the right-hand palm rule) and is demonstrated in figure 7.12.
To use the right-hand push rule, position your right hand so that:
• the fingers point in the direction of the external field
• the thumb points in the direction of conventional current flow (this means in the direction of the velocity of positive charges or in the opposite direction to the velocity of negative charges)
• the direction of the force on the particles is directly away from the palm of the hand.
Figure 7.12 The right-hand push rule for moving charged particles
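The direction given by the right-hand push rule can be checked with a vector calculation. The Python sketch below assumes the standard relationship that the magnetic force on a charge is the charge multiplied by the cross product of velocity and field, which is not stated in this section. The axes (x to the right, y up the page, z out of the page), the function names and the numerical values are choices made for this example.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def magnetic_force(charge, velocity, field):
    """Force on a moving charge: charge times (velocity cross field)."""
    return tuple(charge * component for component in cross(velocity, field))

# Axes assumed for this example: x to the right, y up the page, z out of the page.
velocity = (2.0, 0.0, 0.0)    # moving to the right
field = (0.0, 0.0, -0.5)      # field directed into the page

force_on_positive = magnetic_force(+1.0, velocity, field)
force_on_negative = magnetic_force(-1.0, velocity, field)

print("positive charge:", force_on_positive)   # y component positive: up the page
print("negative charge:", force_on_negative)   # y component negative: down the page
```

The force on the positive charge is up the page, which is what the push rule gives with the fingers pointing into the page and the thumb pointing to the right. Reversing the sign of the charge reverses the force.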
Magnetic fields and eddy currents
Induced currents do not occur only in coils and wires. They can also occur:
• when there is a magnetic field acting on part of a metal object and there is relative movement between the magnetic field and the object
• when a conductor is moving in an external magnetic field
• when a metal object is subjected to a changing magnetic field.
Such currents are known as eddy currents. Eddy currents are an application of Lenz’s Law. The magnetic fields set up by the eddy currents oppose the changes in the magnetic field acting in the regions of the metal objects.
An eddy current is a circular or whirling current induced in a conductor that is stationary in a changing magnetic field, or that is moving through a magnetic field. Eddy currents resemble the eddies or swirls left in the water after a boat has gone by.
Figure 7.13 shows one method of production of an eddy current. A rectangular sheet of metal is being removed from an external magnetic field that is directed into the page. On the left of the edge of the magnetic field, charged particles in the metal sheet experience a force because they are moving relative to the magnetic field. By applying the right-hand push rule, it can be seen that positive charges experience a force up the page in this region. To the right of the edge of the magnetic field, charged particles experience no force. Therefore the charged particles that are free to move at the edge of the field contribute to an upward current that is able to flow downward in the metal that is outside the field. This forms a current loop that is known as an eddy current.
Figure 7.13 The production of eddy currents in a sheet of metal. The eddy current loop can be explained in terms of the right-hand rule.
The side of the eddy current loop that is inside the magnetic field experiences a force due to the magnetic field. The direction of the force on the eddy current can be determined using the right-hand push rule and it is always opposite to the direction of motion of the sheet. (This means, referring to figure 7.13, it is harder to move the metal to the right when the magnetic field is present than when the field is not present.)

Eddy currents in switching devices


Induction switches are electronic devices that detect the presence of metals and switch on another part of a circuit. Walk-through metal detectors at airports use induction switching devices.
Induction switching devices consist of a high-frequency oscillator, an analysing circuit and a relay. The oscillator produces an alternating current in a coil. This produces an electromagnetic field with a frequency of up to 22 MHz. When a metal object comes near the coil, eddy currents are created in the object. The eddy currents place a load on the coil and the frequency of the oscillator is reduced. The analysing circuit monitors the frequency of the oscillator and, when it falls below a certain threshold value, switches on an alarm circuit using the relay. The threshold frequency can be adjusted so that small loads such as a few coins or metal buttons and zippers will not trigger the alarm, but larger loads such as guns and knives will.

PHYSICS IN FOCUS
Electromagnetic braking
Consider a metal disk that has a part of it influenced by an external magnetic field, as illustrated in figure 7.14(a). As the disk is made of metal, the movement of the metal through the region of magnetic field causes eddy currents to flow. Using the right-hand push rule, it can be shown that the eddy current within the magnetic field in figure 7.14 will be upwards. The current follows a downward return path through the metal outside the region of magnetic influence. This is shown in figure 7.14(b).
Figure 7.14 (a) A rotating metal disk acted upon by a magnetic field (b) The current that flows in the disk
The magnetic field exerts a force on the induced eddy current. This can be shown to oppose the motion of the disk in the example on the previous page by applying the right-hand push rule. In this way eddy currents can be utilised in smooth braking devices in trams and trains. An electromagnet is switched on so that an external magnetic field affects part of a metal wheel or the steel rail below the vehicle. Eddy currents are established in the part of the metal that is influenced by the magnetic field. These currents inside the magnetic field experience a force that acts in the opposite direction to the relative motion of the train or tram, as explained below. In the case of the wheel, the wheel is slowed down. In the case of the rail, the force acts in a forward direction on the rail and there is an equal and opposite force that acts on the train or tram.
Note: The right-hand push rule is used twice. The first time we use it, we show that an eddy current is produced. The thumb points in the direction of movement of the metal disk through the field because we imagine that the metal contains many positive charges moving through the field. We push in the direction of the force on these charges. This push gives us the direction of the eddy current.
The second time we use the right-hand push rule, we show that there is a force opposing the motion of the metal. Our thumb is put in the direction of the current in the field (the eddy current), then we push in the direction of the force on the moving charges (which are part of the metal disk). We then see that the force is always in the opposite direction to the movement of the metal.

PHYSICS IN FOCUS
Induction heating
Another effect of eddy currents is that they cause an increase in the temperature of the metal. This is due to the collisions between moving charges and the atoms of the metal, as well as the direct agitation of atoms by a magnetic field changing direction at a high frequency.
Induction heating is the heating of an electrically conducting material by the production of eddy currents within the material. This is caused by a changing magnetic field that passes through the material. Induction heating is undesirable in electrical equipment such as motors, generators and transformers, but it has been put to good use with induction cookers and induction furnaces.
Applying the principle of induction to cook tops in electric ranges
A gas stove top cooks food by burning gas to produce hot gases. The gases then flow across the bottom of a saucepan and transfer heat into it by conduction. However, a large amount of the thermal energy in the gases is carried away into the environment of the kitchen. The heat transferred to the saucepan is used to cook the food.
Some electric cook tops contain induction cookers instead of heating coils. An induction cooker sets up a rapidly changing magnetic field that induces eddy currents in the metal of the saucepan placed on the cook top. The eddy currents cause the metal to heat up directly without the loss of thermal energy that occurs with gas cooking. The heat produced in the metal saucepan is used to cook the food. The induction coils of the cooker are separated from the saucepan by a ceramic top plate. Induction cookers have an efficiency of about 80% while gas cookers have an efficiency rating of about 43%. A diagram of an induction cooker is shown in figure 7.15.
Figure 7.15 An induction cooker (labels: saucepan, ceramic top plate, induction coil, electronic alternating current generator). As the current is alternating in the coil, there is a changing magnetic field that cuts through the metallic saucepan, causing eddy currents in the saucepan.
Induction furnaces
An induction furnace makes use of the heating effect of eddy currents to melt metals. This type of furnace consists of a container made from a non-metal material that has a high melting point and that is surrounded by a coil. The metal is placed in the container. The coil is supplied with an alternating current that can have a range of frequencies and this produces a changing magnetic field through the metal. Eddy currents in the metal raise its temperature until it melts. The eddy currents also produce a stirring effect in the molten metal, making the production of alloys easier. Induction furnaces take less time to melt the metal than flame furnaces. They are also cleaner and more efficient. A diagram of an induction furnace is shown in figure 7.16.
Figure 7.16 An induction furnace (labels: coil, high melting-point container, metal)
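The efficiency figures quoted for cookers can be turned into a rough energy comparison. In the Python sketch below, the efficiencies of about 80% and about 43% come from the text, while the function name and the 1.0 MJ of heat delivered to the saucepan are assumptions made for illustration.

```python
def energy_supplied(useful_heat_j, efficiency):
    """Energy (J) that must be supplied to deliver the given useful heat.

    efficiency is a fraction between 0 and 1 (useful heat / energy supplied).
    """
    return useful_heat_j / efficiency

useful_heat = 1.0e6   # joules delivered to the saucepan (hypothetical)

induction_input = energy_supplied(useful_heat, 0.80)   # induction cooker, about 80%
gas_input = energy_supplied(useful_heat, 0.43)         # gas cooker, about 43%

print(f"induction cooker: {induction_input / 1e6:.2f} MJ supplied")
print(f"gas cooker:       {gas_input / 1e6:.2f} MJ supplied")
```

On these figures the gas cooker needs nearly twice the energy input of the induction cooker to deliver the same heat to the saucepan.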

CHAPTER REVIEW
SUMMARY
• Magnetic flux, ΦB, is the amount of magnetic field passing through a given area. It depends on the strength of the field, B, as well as the area, A.
• The magnetic flux through a coil is the product of the area, A, of the coil and the component of the magnetic field strength, B, that is perpendicular to the area: ΦB = B⊥A.
• Magnetic field strength is also known as magnetic flux density.
• Faraday’s Law of Induction states that a changing magnetic flux through a circuit induces an emf in the circuit.
• The magnitude of an induced emf depends on the rate of change of magnetic flux through a circuit.
• Lenz’s Law states that an induced emf always gives rise to a current that creates a magnetic field that, in turn, opposes the original change in flux through the circuit.
• Eddy currents are created when there is relative movement between a magnetic field and a metal object. The area of the magnetic field, however, does not cover the whole of the metal object. Eddy currents are also created when a conducting material is in the presence of a changing magnetic field.
• Eddy currents increase the temperature of metal objects.
• When a metal object is moving relative to a region affected by a magnetic field, the region of magnetic field exerts a force on the eddy currents that opposes the relative motion of the object to the field.

QUESTIONS
1. Explain how Michael Faraday was able to produce an electrical current using a magnet.
2. Define the concept of magnetic flux in terms of magnetic flux density and surface area.
3. Calculate the magnetic flux threading (or passing through) the areas in the following cases.
(a) An area of 1.5 m² is perpendicular to a magnetic field of flux density 2.0 T.
(b) An area of 0.75 m² is perpendicular to a magnetic field of strength 0.03 T.
(c) A rectangle with a length of 4.0 cm and width 3.0 cm is perpendicular to a magnetic field of flux density 5.0 × 10⁻³ T.
(d) A circle of radius 7.0 cm that is parallel with a magnetic field of flux density 5.0 × 10⁻³ T.
4. Evaluate the direction of the induced current through the galvanometer in each of the galvanometer circuit coils shown in figure 7.17. The arrows represent the motion of the coil or the magnet.
Figure 7.17
5. Describe the effect that the speed of movement of a magnet has on the magnitude of a current induced in a coil.
6. A magnet moving near a conducting loop induces a current in the circuit as shown in figure 7.18. The magnet is on the far side of the loop and is moving in the direction indicated by the dotted line. Describe two ways in which the magnet can be moving to induce the current as shown.
Figure 7.18

7. Deduce the direction of the induced current through the galvanometer in figure 7.19 when:
(a) the switch is closed
(b) the switch remains closed and a steady current flows in the battery circuit
(c) the switch is opened.
Figure 7.19
8. A flexible metal loop is perpendicular to a magnetic field as shown in figure 7.20(a). It is distorted to the shape shown in figure 7.20(b). Is the direction of the induced current in the loop clockwise or anticlockwise? Explain your answer.
Figure 7.20
9. Figure 7.21 shows a loop of wire connected in series to a source of emf and a variable resistor. Describe the direction of the induced current in the central loop when the resistance of the outer loop circuit is increasing. Explain your reasoning.
Figure 7.21
10. In what direction, clockwise or anticlockwise, is the induced current in the loop of wire in each situation shown in figure 7.22?
Figure 7.22 (a) I increasing (b) I decreasing (c) I decreasing (d) I increasing
11. A metal rectangle has a length of 7.0 cm and a width of 4.0 cm. It is initially at rest in a uniform magnetic field of strength 0.50 T as shown in figure 7.23. The rectangle is completely removed from the magnetic field in 0.28 s.
(a) What is the initial magnetic flux through the rectangle?
(b) In what direction, clockwise or anticlockwise, is the induced current in the rectangle when it is being removed from the magnetic field?
Figure 7.23 (a 7.0 cm by 4.0 cm rectangle in a field B = 0.50 T)

12. A square loop of wire has sides of length 6.5 cm. The loop is sitting in a magnetic field of strength 1.5 × 10⁻³ T as shown in figure 7.24. The magnetic field is reduced to 0 T in a period of 5.0 ms.
(a) What is the flux through the loop initially?
(b) What will be the effect on the induced emf if a 25-loop coil is used, rather than a single loop?
(c) In what direction will the current flow in the loop?
Figure 7.24 (a 6.5 cm by 6.5 cm loop in a field B)
13. Use diagrams to show three ways to change the flux passing through a conductor loop.
14. (a) Explain the production of a back emf in an electric motor.
(b) Describe how the back emf of an electric motor is produced.
(c) Explain why the back emf in an electric motor opposes the supply emf.
(d) How does the back emf determine the maximum rotating speed of an electric motor?
(e) How can overloading an electric motor cause it to burn out?
15. The armature winding of an electric motor has a resistance of 5.0 Ω. The motor is connected to a 240 V supply. When the motor is operating with a normal load, the back emf is equal to 237 V.
(a) Calculate the current that passes through the motor when it is first started.
(b) Calculate the current that passes through the motor when it is operating normally.
16. The armature winding of an electric drill has a resistance of 10 Ω. The drill is connected to a 240 V supply. When the drill is operating normally, the current drawn is 2.0 A.
(a) Calculate the current that passes through the drill when it is first started.
(b) Calculate the back emf of the drill when it is operating normally.
17. (a) What is an eddy current?
(b) Discuss how eddy currents are produced.
(c) Describe how eddy currents raise the temperature of metals.
18. A rectangular sheet of aluminium is pulled from a magnetic field as shown in figure 7.25.
(a) Copy the diagram and indicate the position and direction of an eddy current loop.
(b) Which way does the force due to the external magnetic field and the eddy current act on the aluminium sheet?
Figure 7.25 (labels: B, direction of motion)
19. Discuss how eddy currents are utilised in the following situations:
(a) induction electric cook tops
(b) electromagnetic braking of trains or trams
(c) induction furnaces.

PRACTICAL ACTIVITIES
7.1 INDUCING CURRENT IN A COILED CONDUCTOR
Aim
(a) To study ways of inducing a current in a coiled conductor
(b) To study factors affecting the size of the induced current.
Apparatus
galvanometer
two coils having different numbers of turns of wire
two bar magnets, of different strengths, if possible
an iron core that fits into one of the coils. This could be made by taping large iron nails together.
connecting wires
Theory
You will be reproducing some of Faraday’s experiments. Modern galvanometers are much more sensitive than those available to Faraday.
Method
1. Connect the coil with the fewest turns to the galvanometer. Push the N pole of a bar magnet into the coil. Describe what happens.
2. Hold the magnet stationary near the coil. Describe what happens.
3. Withdraw the N pole from the coil. Describe what happens.
4. Repeat steps 1 and 3 at different speeds. Describe what happens.
5. Hold the magnet stationary and move the coil in different directions. Rotate the coil so that first one end and then the other approaches the magnet. Describe what happens.
6. Place the iron core in the coil and touch it with the N pole. Remove the magnet. Describe what happens.
7. Design an experiment to examine the factors that affect the size of the induced current.
Analysis
1. Relate your results to Faraday’s law of electromagnetic induction.
2. Describe how you would make a generator to create a relatively large current.

7.2 LINKING COILS
Aim
To see if the magnetic field of a current-carrying coil can induce a current in another coil.
Apparatus
galvanometer
two coils having different numbers of turns of wire, preferably one of which fits into the other
an iron core that fits into the smaller of the coils. This could be made by taping large iron nails together.
a 10 Ω resistor
a power supply
connecting wires
Theory
A current flowing in a coil will create a magnetic field that threads the second coil. When there is a changing magnetic field threading the second coil, an emf will be induced in the coil.
Method
1. Set up the apparatus as shown in figure 7.26. Set the power supply to 2.0 V.
Figure 7.26
2. Close the switch and observe the effects on the galvanometer.
3. Open the switch and observe the effects on the galvanometer.
4. Put an iron core in the smaller coil and repeat steps 2 and 3.
Analysis
Relate your results to Faraday’s law of electromagnetic induction.
Questions
1. When was a current induced in the secondary (galvanometer) coil?
2. What happened when a steady current was flowing in the primary (power supply) coil?
3. What effect did the iron core have on the induced current?

7.3 THE DIRECTION OF INDUCED CURRENTS
Aim
To learn how to predict the direction of an induced current.
Apparatus
galvanometer
coil
bar magnet
connecting wires
battery
Theory
Lenz’s Law states that the direction of an induced current in a coil is such that the magnetic field that it establishes opposes the change of the original flux threading the coil.
Method
1. Carefully examine the coil to see which way the wire is coiled around the cylinder.
2. Use a battery to establish which way the galvanometer deflects when currents flow through it in different directions.
3. Connect the coil to the galvanometer.
4. Push the N pole of the magnet towards the coil. Note the deflection of the galvanometer. Answer the following questions.
• In which direction did the current flow in the coil?
• Was the end of the coil nearest the magnet a N or S pole?
• Does the magnetic field of the coil assist or oppose the motion of the magnet?
• Does the magnetic field of the coil assist or oppose the change in flux threading the coil?
5. Pull the N pole away from the coil. Answer the questions of step 4.
6. Push the S pole of the magnet towards the coil. Answer the questions of step 4.
7. Pull the S pole away from the coil. Answer the questions of step 4.
Analysis
Do your results verify Lenz’s Law? Explain.

CHAPTER 8 GENERATORS AND POWER DISTRIBUTION
Figure 8.1 High-voltage transmission lines are used to distribute power from the generators to the consumers.
Remember
Before beginning this chapter, you should be able to:
• state Ohm’s Law for metal conductors at a constant temperature: V = IR
• recall that the work done in a circuit component is equal to the amount of energy transformed in the component
• apply the formulas: W = VQ, Q = It, and W = VIt
• recall that power is the rate of doing work: $P = \frac{W}{t}$
• recall that the power dissipated in a metal conductor is given by the following formulas: $P = VI$, $P = \frac{V^2}{R}$ and $P = I^2 R$
• recall that magnetic flux is the amount of magnetic field passing through an area
• recall that a changing magnetic field through a circuit induces an emf across the ends of the wire that makes up the circuit
• apply Faraday’s law: The magnitude of the induced emf depends on the rate of change of magnetic flux through the coil.
• apply Lenz’s Law: The induced emf in a coil is such that if a current were to flow, it would produce a magnetic field that opposes the change in flux threading the coil.

Key content
At the end of this chapter you should be able to:
• describe the main components of a generator
• compare the structure and function of a generator to an electric motor
• describe the differences between AC and DC generators
• discuss the energy losses that occur as energy is fed through transmission lines from the generator to the consumer
• assess the effects of the development of AC generators on society and the environment
• outline the competition between Westinghouse and Edison to supply electricity to cities
• describe the purpose of transformers in electric circuits
• compare step-up and step-down transformers
• identify the relationship between the ratio of the number of turns in the primary and secondary coils of a transformer and the ratio of primary to secondary voltage
• explain why current transformations are related to the Principle of Conservation of Energy
• solve problems and analyse information about transformers using: $\frac{V_p}{V_s} = \frac{n_p}{n_s}$
• explain the role of transformers in electricity substations
• discuss why some domestic electrical appliances use a transformer
• discuss the impact of the development of transformers on society.
If you have ever experienced a power blackout you will realise how dependent society has become on electricity. We use it for lighting, warmth, cooling systems and refrigeration of our food. It runs our computers, radios, televisions, DVD and CD players, and other appliances. Much of industry also depends on the supply of electrical energy. Electricity provides safe, well-lit and comfortable office environments and powers machinery in factories, hospital equipment and communications technology.
Imagine what your life would be like if the principles of electromagnetic induction had not yet been discovered. There would be no cars as we know them because the ignition system relies on devices such as generators (alternators) and transformers (coils). What sort of music would you be listening to if there were no electric guitars and keyboard instruments?
Modern Western society is dependent on the production and transmission of electrical energy. In this chapter we will look at how electricity is produced by generators and how it is transferred from the power stations to homes and other consumers.

8.1 GENERATORS
A generator is a device that transforms mechanical kinetic energy into electrical energy. In its simplest form, a generator consists of a coil of wire that is forced to rotate about an axis in a magnetic field.
eBook plus Weblink: Generator applet
As the coil rotates, the magnitude of the magnetic flux threading (or passing through) the area of the coil changes. This changing magnetic flux produces a changing emf across the ends of the wire that makes up the coil. This is in accordance with Faraday’s Law of Induction (see chapter 7), which can be stated as:
The induced emf in a coil is equal in magnitude to the rate at which the
magnetic flux through the coil is changing with time.
The magnetic field of a generator can be provided either by using permanent magnets, as shown in figure 8.2(a), or by using an electromagnet, as shown in figure 8.2(b).

Figure 8.2 (a) Permanent magnets provide the magnetic field. (b) An electromagnet provides the magnetic field.
The stationary functioning parts of a generator are called the stator, and the rotating parts are called the rotor. In figure 8.2(a) and (b), the stators consist of the sections that produce the magnetic fields (permanent magnets or electromagnets). The rotors are the coils.
The stator is the stationary part of an electrical rotating machine.
The rotor is the rotating part of an electrical rotating machine.



If the coil of a generator is forced to rotate at a constant rate, the flux
threading the coil and the emf produced across the ends of the wire of
the coil vary with time as shown in figure 8.3 below.
In figure 8.3 the magnetic field is directed to the right. The corners of
the coil have been labelled L, K, M and N so that you can see how the
coil is rotating.
Beneath the diagrams of the coil is an end view of the sides LK and
MN showing the direction of the induced current that would flow
through the sides at that instant if the generator coil was connected to a
load. The arrows on this part of the diagram show the direction of
movement of the sides of the coil.
Figure 8.3 The variation of flux and emf of a generator coil as it completes a single revolution: (a) the coil at positions (i) to (v); (b) end views of sides LK and MN at the same positions; (c) flux, φ, versus time; (d) emf, E, versus time

The next section down in the diagram is a graph showing the variation
of magnetic flux through the coil as a function of time.
The last section of the diagram is a graph showing the variation of emf
that would be induced in the coil (if there was a gap in the coil between
the points L and M) as a function of time. The emf is given by the negative
of the gradient of a graph of magnetic flux threading the coil versus time.
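The gradient relationship can be written out explicitly. As a sketch only, assume a single-turn coil of area $A$ rotating at a constant angular speed $\omega$ in a uniform field of strength $B$, with $t = 0$ taken at position (i). The symbols $A$, $B$ and $\omega$ are introduced here for illustration and do not appear in the text above.

```latex
\Phi(t) = BA\cos(\omega t)
\qquad\Rightarrow\qquad
E(t) = -\frac{\Delta\Phi}{\Delta t} \;\longrightarrow\; BA\,\omega\sin(\omega t)
\quad\text{as } \Delta t \to 0
```

Under these assumptions the emf is zero whenever the flux is at a maximum or minimum, and the emf has its greatest magnitude whenever the flux passes through zero.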



In figure 8.3(a)(i) the flux threading the coil is at a maximum value.
The emf is zero, as the gradient of the flux versus time graph is zero,
which means that there is no change in flux through the coil at this
instant.
In figure 8.3(a)(ii) the flux threading the coil is zero. The emf is at a
maximum positive value, as the flux versus time graph has a maximum
negative gradient. At this instant the change in flux is happening at a
maximum rate.
In figure 8.3(a)(iii) the coil is again perpendicular to the magnetic field, but it is now reversed relative to its original orientation. The flux threading the coil is at a maximum negative value. The emf is zero, as the gradient of the flux versus time graph is again zero, meaning that at this instant there is again no change in flux.
In figure 8.3(a)(iv) the flux threading the coil is again zero. The emf now has its maximum negative value, as the gradient of the flux versus time graph has its maximum positive value. At this instant the change in flux is again happening at a maximum rate.
In figure 8.3(a)(v) the flux threading the coil is again at a maximum value. The emf is zero, as the gradient of the flux versus time graph is zero and there is no change in flux at this instant. And so the cycle continues.
The frequency and amplitude of the voltage produced by a generator depend on the rate at which the rotor turns. If the rotor is turning at twice the original rate, then the period of the voltage signal halves, the frequency doubles and the amplitude doubles. This is shown in figure 8.4.
Figure 8.4 Doubling the frequency of rotation doubles the maximum induced emf.
The effectiveness of generators is increased by winding the coil onto an iron core armature. The iron core makes the coil behave like an electromagnet. This intensifies the changes in flux threading the coil as it is forced to rotate and increases the magnitude of the emf that is induced. This effect also occurs when the number of turns of wire on the armature is increased. The coil then behaves like a number of individual coils connected in series. If there are n turns of wire on the armature, the maximum emf will be n times that of a single coil rotating at the same rate.
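The dependence of the maximum emf on rotation rate and number of turns can be checked numerically. The following is a minimal sketch, assuming an idealised coil in a uniform field so that the peak emf is n × B × A × ω. The function names and the numerical values are hypothetical and are chosen only for illustration.

```python
import math

def peak_emf(turns, field_strength, coil_area, frequency):
    """Peak emf (V) of an idealised coil: n * B * A * omega, with omega = 2*pi*f."""
    omega = 2 * math.pi * frequency
    return turns * field_strength * coil_area * omega

def period(frequency):
    """Period (s) of the voltage signal for a given rotation frequency (Hz)."""
    return 1.0 / frequency

# Illustrative values only (not from the text): 0.20 T field, 0.010 m^2 coil.
base = peak_emf(1, 0.20, 0.010, 5.0)            # one turn at 5 Hz
doubled_speed = peak_emf(1, 0.20, 0.010, 10.0)  # same coil turned twice as fast
fifty_turns = peak_emf(50, 0.20, 0.010, 5.0)    # 50 turns at the original rate

print(doubled_speed / base)         # ratio of peak emfs when the rate doubles
print(period(10.0) / period(5.0))   # ratio of periods when the rate doubles
print(fifty_turns / base)           # ratio of peak emfs for 50 turns versus 1
```

The ratios, rather than the absolute values, are the point of the sketch: they do not depend on the field strength or coil area chosen.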

AC generators
Figure 8.3, on page 141, shows how a coil forced to rotate smoothly in a
magnetic field has a varying emf induced across the ends of the coil.
The value of the emf varies sinusoidally with time. (This means that the
graph of emf versus time has the same shape as a graph of sin x versus
x.) If such an emf signal were placed across a resistor, the current
flowing through the resistor would periodically alternate its direction.
In other words, the emf across the ends of a coil rotating at a constant rate in a magnetic field produces an alternating current (AC). Alternating current electrical systems are used across the world for electrical power distribution.
This type of AC generator connects the coil to the external circuit or
distribution system by the use of slip rings. Slip rings rotate with the coil.
A slip ring system is shown in figure 8.5 on the following page.
In figure 8.5, side LK of the coil is connected to slip ring B while side MN is connected to slip ring A. Brushes make contact with the slip rings and transfer the emf (or current) to the terminals of the generator. In this case, the terminals are the external points of the generator where it connects to the load.
A terminal is the free end of a cell or battery to which a connection is made to the rest of a circuit.

Figure 8.5 The functional parts of an AC generator

Which way will the current flow?


When asked to determine the direction of the current in the generator
or some other part of a circuit connected to the generator, there are two
methods that can be used.
The first method is to consider the magnetic force on a positive test
charge in one side of the coil. The direction of the velocity of the
charge in the magnetic field depends on the direction of the rotation
of the coil. The direction of the magnetic force is determined using
the right-hand push rule (see figure 8.6 below). The direction of the
force acting on the test charge is also the direction of the current on
that side of the generator. It is then a matter of following that direction
around the coil to the terminals. Note that the terminal from which
the current emerges at a particular instant is acting as the positive
terminal.
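The right-hand push rule gives the same direction as the vector relationship F = qv × B for a positive charge. The sketch below checks this for a charge moving up the page in a field directed to the right. The coordinate axes and the function names are assumptions made here for illustration and are not part of the text.

```python
def cross(a, b):
    """Vector cross product a x b for 3-component tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def force_direction(charge_sign, velocity, field):
    """Direction of F = q v x B; only the sign of the charge is used."""
    fx, fy, fz = cross(velocity, field)
    return (charge_sign * fx, charge_sign * fy, charge_sign * fz)

# Assumed axes: +x to the right, +y up the page, +z out of the page.
velocity_up = (0, 1, 0)
field_right = (1, 0, 0)

force = force_direction(+1, velocity_up, field_right)
print(force)  # a negative z component means the force is directed into the page
```

With these axes, a side of the coil that lies along the z direction would carry induced current into the page at that instant. Which end of the side that corresponds to depends on how the coil is drawn.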

Figure 8.6 Using the right-hand push rule to determine the direction of current flow in a generator coil


The other method is to apply Lenz’s Law to the coil. First determine
the way in which the flux threading the coil is changing at the instant in
question. The current induced in the coil will produce a magnetic field
that opposes the change in flux through the coil. Once you have established the direction of the flux produced by the induced current, apply the right-hand grip rule for coils to determine the direction of the current around the coil.
Both methods are illustrated in sample problem 8.1.

Determining the polarity of a generator’s terminals


SAMPLE PROBLEM 8.1
Figure 8.7 shows an AC generator at a particular instant. At this instant, which of the terminals, A or B, is positive?
Figure 8.7 [Diagram of an AC generator showing the direction of rotation, the coil corners, the N and S poles, the axle, and terminals A and B]

SOLUTION
Test charge method
Consider a positive test charge in the side labelled LK. At the instant shown, this positive charge is moving upwards in a magnetic field directed to the right. Applying the right-hand push rule, the positive charge is forced in the direction from L towards K. This situation is shown in figure 8.8.
Figure 8.8 Positive charges are pushed in the direction from L towards K.
Side LK is connected to the slip ring leading to terminal A. Side MN is connected to the slip ring leading to terminal B. If a current were to flow, it would emerge from terminal B. Therefore, terminal B is positive at the instant shown.
Using Lenz’s Law
At the instant shown in figure 8.7, the flux is increasing to the right through the coil as it is forced to rotate in the indicated direction. The induced current in the coil will therefore produce a magnetic field that passes through the coil to the left to oppose the external change in magnetic flux through the coil. The right-hand grip rule for coils (thumb in the direction of the induced magnetic field through the coil, fingers grip the coil pointing in the direction of the current in the coil) shows that the induced current is clockwise around the coil as we view it. The current then emerges from the generator through terminal B. Therefore, terminal B is positive at the instant shown.

DC generators
A direct current (DC) is a current where the flow of charge is in one direction only. Direct currents provided by a battery or dry cell usually have a steady value. Direct currents may also vary with time, but keep flowing in the same direction. DC generators provide such currents.
A simple DC generator consists of a coil that rotates in a magnetic field. (This also occurs in an AC generator.) The difference between an AC and a DC generator is in the way that the current is provided to the external circuit. An AC generator uses slip rings. A DC generator uses a split ring commutator to connect the rotating coil to the terminals. (Remember that a commutator is a switching device for reversing the direction of an electric current.) The functional parts of a simple DC generator are shown in figure 8.9. The magnets have been omitted for clarity.
Figure 8.9 A simple DC generator
This diagram of a simple DC generator should remind you of a DC motor (see chapter 6), as it has the same parts. In the generator the coil is forced to rotate in the magnetic field. This induces an emf in the coil. The emf is transferred to the external circuit via the brushes that make contact with the commutator. When the emf of the coil changes direction, the brushes swap over the side of the coil they are connected to, thus causing the emf supplied to the external circuit to be in one direction only. The result of this process is shown in figure 8.10.
Figure 8.10 The output from a simple DC generator
The output from a DC generator can be made smoother by including more coils set at regular angles on the armature. Each coil is connected to two segments of a multi-part commutator and the brushes make contact only with the segments connected to the coil producing the greatest emf at a particular time. A two-coil DC generator is shown in figure 8.11(a) and its output is shown in figure 8.11(b). Note that in this case the commutator has four segments.
Figure 8.11 (a) A two-coil DC generator (b) The output from a two-coil DC generator
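The smoothing effect of a second coil can be illustrated numerically. The following is a minimal sketch, assuming that each coil produces a sinusoidal emf of the same peak value, that the commutator simply reverses the negative half of each cycle, and that the second coil is mounted at 90° to the first. The function names are hypothetical.

```python
import math

def single_coil_output(angle):
    """Commutated output of one coil, as a fraction of its peak emf."""
    return abs(math.sin(angle))

def two_coil_output(angle):
    """Output when the brushes always contact the coil with the greater emf.

    The second coil is assumed to be mounted at 90 degrees to the first.
    """
    return max(abs(math.sin(angle)), abs(math.cos(angle)))

# Sample one full revolution in 1-degree steps.
angles = [math.radians(d) for d in range(361)]
single_min = min(single_coil_output(a) for a in angles)
two_coil_min = min(two_coil_output(a) for a in angles)

print(round(single_min, 3))    # lowest output of the single-coil generator
print(round(two_coil_min, 3))  # lowest output of the two-coil generator
```

Under these assumptions the single-coil output falls to zero twice per revolution, whereas the two-coil output never falls below about 71% of the peak value.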

You can investigate the operation and structure of an AC generator and a DC motor used as a DC generator by doing practical activity 8.1 (Observing the output of a hand-operated generator).


8.2 ELECTRIC POWER GENERATING STATIONS
Electric power generating stations provide electrical power to domestic and industrial consumers. In a power station, mechanical or heat energy is transformed into electrical energy by means of a turbine connected to a generator. A turbine is a machine whose shaft is rotated by jets of steam or water directed onto blades attached to a wheel. Figure 8.12 shows a simple turbine and generator combination.
Figure 8.12 A turbine drives a generator. [Diagram labels: source of energy (water, steam or wind), turbine, electric generator, electric energy output]
The generators used in power stations have a different structure to those studied so far. A typical generator has an output of 22 kV. This requires the use of massive coils which would place huge forces on bearings if they were required to rotate. To eliminate this problem, a power station generator has stationary coils mounted on an iron core (making up the stator). The coils are linked in pairs on opposite sides of the rotor. The rotor is a DC supplied electromagnet that spins with a frequency of 50 Hz. A simplified diagram of a power station generator is shown in figure 8.13. In this diagram only one set of linked coils is shown.
Figure 8.13 A single-coil generator [Diagram labels: rotor with DC supply via slip rings, iron core, stator generating coils, AC output shown as voltage versus time]

Power station generators have three sets of coils mounted at angles of 120° to each other on the stator. This means that each generator produces three sets of voltage signals that are out of phase with each other by 120°. This is known as three-phase power generation. Each generator is connected to four lines, one line for each phase and a return ground line. Figure 8.14 shows the arrangement of the coils on the stator and the voltage outputs of each set of coils.
Figure 8.14 Three-phase power generation
There are two main types of power station used in Australia: fossil fuel steam stations and hydroelectric stations.
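The three voltage signals produced in three-phase generation can be sketched numerically. The example below assumes three ideal sinusoidal outputs of equal peak value, separated in phase by 120°. The function name and the sample angles are hypothetical.

```python
import math

def phase_voltages(angle_deg, peak=1.0):
    """Instantaneous voltages of three coil sets spaced 120 degrees apart."""
    return [peak * math.sin(math.radians(angle_deg - shift))
            for shift in (0, 120, 240)]

# Print the three voltages and their sum at a few rotor angles.
for angle in (0, 30, 90, 200):
    v1, v2, v3 = phase_voltages(angle)
    print(angle, round(v1, 3), round(v2, 3), round(v3, 3),
          round(abs(v1 + v2 + v3), 3))
```

In this idealised, balanced case the three voltages add to zero at every instant, even though each one varies sinusoidally.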



PHYSICS FACT
AC versus DC: Westinghouse and Edison
Thomas Edison (1847–1931) is credited with many inventions, including the electric light bulb and the phonograph. He was the first person to establish a business supplying electricity to cities. Electricity was initially supplied to cities for lighting streets and houses. Edison General Electric Company opened the first electric power station in New York City in 1882 and began by installing street lighting systems.
Figure 8.15 Thomas Edison
Edison used a direct current (DC) system. His generators, also known as dynamos, used a commutator to give a DC output. The commutator proved to be a problem with the high-speed, steam-driven generators of the 1890s. Edison’s power stations could only supply areas a few kilometres away. Power losses in power lines vary directly with the square of the current in the lines and with the resistance of the lines. He therefore relied on thick copper cables to carry the electric current, as this reduced the resistance of the lines.
The first alternating current (AC) system was demonstrated in Paris in 1883. Nikola Tesla (1856–1943) was a Serbian-born scientist who worked for Edison Electric Company in France repairing one of the electrical plants. The company refused to pay him when he completed the project. He then moved to America and was hired by Thomas Edison. Edison promised to pay Tesla $50 000 if he redesigned the dynamos used to produce DC electricity. When Tesla completed the project, Edison did not pay him; he said he had been joking about the money.
Tesla left the Edison Electric Company and worked for two years as a labourer while developing an AC generating and transmission system. He also invented electrical generators and motors to use with AC.
George Westinghouse (1846–1914) saw the advantages of using AC for supplying cities. He purchased the patent rights for Tesla’s generators and motors for $1 000 000.
Figure 8.16 George Westinghouse
Edison saw the threat to his business posed by the Westinghouse AC system and tried to discredit it. He published an 83-page booklet entitled ‘A Warning! From The Edison Electric Light Co.’ in which he described the horrible deaths of people who had supposedly come into contact with Westinghouse’s AC cables. In 1887, Edison held a public demonstration in New Jersey to show the dangers of AC electricity. He set up a 1000-volt Westinghouse AC generator attached to a metal plate and used it to kill a dozen animals.
The electric chair
The Edison research facility hired an inventor, Harold P. Brown, and his assistant Dr Fred Peterson to develop an electric chair using AC electricity. They acquired a 1000-volt Westinghouse generator and used it to kill dogs, cows and horses. They often invited the media to witness the experiments.
On 4 June 1888, electrocution became the method of execution in the state of New York. While still on the payroll of Edison, Dr Peterson headed a committee that advised the government on the best method of electrocution. The committee recommended the use of AC electricity. Westinghouse refused to sell generators to the New York state prisons for use in the electric chair, but Edison and Brown found a way to provide them.
(continued)

The first execution using the electric chair took place in New York in August 1890 and the victim was William Kemmler. Edison tried to describe the people who died in the electric chair as having been ‘Westinghoused’. Westinghouse supposedly hired the best lawyer of the day to defend Kemmler and to attack electrocution as being a ‘cruel and unusual form of punishment’. (Cruel and unusual forms of punishment are banned in the American Bill of Rights.)
Figure 8.17 The first electric chair, used in the 1890s
Niagara Falls Power Plant
In 1887, a group of businessmen in Buffalo, USA, pledged a prize of $100 000 to the inventors of the world to design a system that would use the power of Niagara Falls to provide electricity to the city, some 30 kilometres away. Westinghouse and Edison were competitors for the prize.
In 1891, the International Electrical Exhibition was held in Frankfurt, Germany. An AC line was set up to carry sizeable quantities of electrical power from Frankfurt to Lauffen, a distance of about 180 kilometres. Tests of the system showed that only 23 per cent of the power was lost. This demonstration enabled Westinghouse to win the contract in 1893. The first AC generators were installed at Niagara Falls in 1896.

8.3 TRANSFORMERS
Transformers are devices that increase or decrease AC voltages. They are used in television sets and computer monitors that have cathode ray tubes (see chapter 10) to provide the very high voltages needed to drive the cathode ray tubes. They are used in electronic appliances such as radios to provide lower voltages for amplifier circuits. They are also found in answering machines, cordless phones, digital cameras, battery chargers, digital clocks, computers, phones, printers, electronic keyboards, the electric power distribution system and many other devices.
A transformer is a magnetic circuit with two multi-turn coils wound onto a common core.
Transformers consist of two coils of insulated wire called the primary and secondary coils. These coils can be wound together onto the same soft iron core, or linked by a soft iron core. The structure of the most common type of transformer and its circuit symbol are shown in figure 8.18.
eBook plus Weblink: Transformer applet

Figure 8.18 (a) A transformer with coils linked by a soft iron core. The core is made of many iron layers separated by an insulator. (b) Circuit symbol


Transformers are designed so that almost all the magnetic flux produced in the primary coil threads the secondary coil. When an alternating current flows through the primary coil, a constantly changing magnetic flux threads (or passes through) the secondary coil. This constantly changing flux passing through the secondary coil produces an AC voltage at the terminals of the secondary coil with the same frequency as the AC voltage supplied to the terminals of the primary coil.
If a steady direct current were to flow through the primary coil, it would produce a constant flux threading the secondary coil. No voltage would be induced in the secondary coil, as a changing flux is needed.
The difference between the primary voltage, $V_p$, and the secondary voltage, $V_s$, is in their magnitudes. The secondary voltage can be greater than or less than the primary voltage, depending on the design of the transformer. The magnitude of the secondary voltage depends on the number of turns of wire on the primary coil, $n_p$, and secondary coil, $n_s$.
Practical activity 8.2: Making a simple transformer
If the transformer is ideal, it is 100% efficient and the energy input at the primary coil is equal to the energy output of the secondary coil. The rate of change of flux, $\frac{\Delta\Phi}{\Delta t}$, through both coils is the same. Faraday’s Law can be used to show that the secondary voltage is found using the formula:
$V_s = n_s \frac{\Delta\Phi}{\Delta t}$.
Similarly, the input primary voltage, $V_p$, is related to the change in flux by the equation:
$V_p = n_p \frac{\Delta\Phi}{\Delta t}$.
Dividing these equations produces the transformer equation:
$\frac{V_p}{V_s} = \frac{n_p}{n_s}$.
If $n_s$ is greater than $n_p$, the output voltage, $V_s$, will be greater than the input voltage, $V_p$. Such a transformer is known as a step-up transformer.
If $n_s$ is less than $n_p$, the output voltage, $V_s$, will be less than the input voltage, $V_p$. Such a transformer is known as a step-down transformer.
A step-up transformer provides an output voltage that is greater than the input voltage.
A step-down transformer provides an output voltage that is less than the input voltage.
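The transformer equation can be applied directly in a short calculation. The following is a minimal sketch for an ideal transformer. The function names and the turn counts are hypothetical and are chosen only to give one step-down and one step-up case.

```python
def secondary_voltage(primary_voltage, primary_turns, secondary_turns):
    """Ideal transformer: Vp / Vs = np / ns, rearranged for Vs."""
    return primary_voltage * secondary_turns / primary_turns

def transformer_type(primary_turns, secondary_turns):
    """Classify a transformer by comparing ns with np."""
    if secondary_turns > primary_turns:
        return "step-up"
    if secondary_turns < primary_turns:
        return "step-down"
    return "no change in voltage"

# Illustrative turn counts (not from the text) with a 240 V AC supply.
print(secondary_voltage(240.0, 1000, 50), transformer_type(1000, 50))
print(secondary_voltage(240.0, 100, 2000), transformer_type(100, 2000))
```

The calculation applies only to AC voltages, since a steady direct current in the primary coil induces no voltage in the secondary coil.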
Transformers and the Principle of Conservation of Energy
The Principle of Conservation of Energy states that energy can be neither created nor destroyed but that it can be transformed from one form to another. This means that if a step-up transformer gives a greater voltage at the output, there must be some kind of a trade-off. The rate of supply of energy to the primary coil must be greater than or equal to the rate of supply of energy from the secondary coil. For example, if 100 J of energy is supplied each second to the primary coil, then the maximum amount of energy that can be obtained each second from the secondary coil is 100 J. You cannot get more energy out of a transformer than you put into it. (Some energy is usually transformed into thermal energy in the transformer due to the occurrence of eddy currents in the iron core. In other words, eddy currents in the iron core cause the transformer to heat up.)
Practical activity 8.3: Transformer ins and outs
There is a decrease in useable energy whenever energy is transformed from one form to another. The ‘lost’ energy is said to be dissipated, usually as thermal energy.
The rate of supply of energy is known as power and is found using the equation:
$P = VI$.


In ideal transformers there is assumed to be no power loss and the primary power is equal to the secondary power. In this case:
$P_p = P_s$.
Substituting the power formula stated earlier, this equation becomes:
$V_p I_p = V_s I_s$.
Combining this equation with the transformer equation we get another very important relationship for transformers:
$\frac{I_s}{I_p} = \frac{n_p}{n_s}$.
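The combining step is short enough to write out in full. Rearranging the ideal power equation for the ratio of currents, and then substituting the transformer equation, gives the following. This holds only under the ideal, lossless assumption.

```latex
V_p I_p = V_s I_s
\quad\Rightarrow\quad
\frac{I_s}{I_p} = \frac{V_p}{V_s} = \frac{n_p}{n_s}
```

A transformer that steps the voltage up by a given factor therefore steps the current down by the same factor.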

Transformer calculations
SAMPLE PROBLEM 8.2
The transformer in an electric piano reduces a 240 V AC voltage to a 12.0 V AC voltage. If the secondary coil has 30 turns and the piano draws a current of 500 mA, calculate the following quantities:
(a) the number of turns in the primary coil
(b) the current in the primary coil
(c) the power output of the transformer.

SOLUTION
(a)
QUANTITY    VALUE
Vp          240 V
Vs          12.0 V
ns          30
np          ?

$\frac{V_p}{V_s} = \frac{n_p}{n_s}$
$\Rightarrow n_p = \frac{n_s V_p}{V_s} = \frac{30 \times 240}{12.0} = 600$
Therefore the primary coil has 600 turns.

(b)
QUANTITY    VALUE
Ip          ?
Is          500 mA
ns          30
np          600

$\frac{I_s}{I_p} = \frac{n_p}{n_s}$
$\Rightarrow I_p = \frac{n_s I_s}{n_p} = \frac{30 \times 500}{600} = 25$ mA

(c)
QUANTITY    VALUE
Vs          12.0 V
Is          500 mA
Ps          ?

$P_s = V_s I_s = 12.0 \times 500 = 6000$ mW $= 6.0$ W
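The working in sample problem 8.2 can be checked with a few lines of code. This is a sketch under the same ideal-transformer assumption used in the solution. The function and variable names are hypothetical, and the current is entered in amperes rather than milliamperes.

```python
def primary_turns(v_primary, v_secondary, n_secondary):
    """Rearranged transformer equation: np = ns * Vp / Vs."""
    return n_secondary * v_primary / v_secondary

def primary_current(i_secondary, n_primary, n_secondary):
    """Rearranged current relationship: Ip = ns * Is / np."""
    return n_secondary * i_secondary / n_primary

v_p, v_s = 240.0, 12.0   # volts, from the problem statement
n_s = 30                 # turns on the secondary coil
i_s = 0.500              # amperes (500 mA)

n_p = primary_turns(v_p, v_s, n_s)
i_p = primary_current(i_s, n_p, n_s)
p_s = v_s * i_s          # power output of the secondary coil, in watts
p_p = v_p * i_p          # power input to the primary coil, in watts

print(n_p, i_p, p_s, p_p)
```

Comparing the last two values is a useful check: for an ideal transformer the primary power and the secondary power should agree.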

Reducing heat losses due to eddy currents


As we saw in chapter 7, eddy currents within a metal are circular movements of electrons due to a changing flux passing through the metal. These circular movements are at right angles to the direction of the changing flux. By constructing the iron core from many layers of iron that are coated with an insulator, the size of the eddy currents is reduced and the losses due to heating effects are reduced. Such a core is called a laminated iron core. The cross-sections of the thin layers are perpendicular to the direction of the magnetic flux, so the size of the eddy currents is greatly reduced, as illustrated in figure 8.19.
A thin layer of any substance is called a lamina. One meaning of the word laminate is to build up an object from a substance by placing layer upon layer of the substance.
Figure 8.19 Eddy currents in (a) an ordinary iron core, (b) a laminated iron core
Another method for reducing eddy current losses in transformers is to use materials called ferrites, which are complex oxides of iron and other metals. These materials are good transmitters of magnetic flux, but are poor conductors of electricity, so the magnitudes of eddy currents are significantly reduced.

8.4 POWER DISTRIBUTION


Power stations are usually situated large distances from cities where most
of the consumers are located. This presents problems with power losses
in the transmission lines. Transmission lines are essentially long metallic
conductors which have significant resistance. This means that they have a
significant voltage drop across them when they carry a large current.
This could result in greatly decreased voltages available to the consumer,
as illustrated in sample problem 8.3 on page 152.
The resistance of a metallic conductor is proportional to its resistivity, ρ, and its length, l, and is inversely proportional to its cross-sectional area, A:
$R = \frac{\rho l}{A}$.
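The way resistance scales with length and cross-sectional area can be sketched in code. The resistivity, lengths and areas below are assumed values chosen only for illustration (the resistivity is roughly that of aluminium). They are not taken from the text.

```python
def line_resistance(resistivity, length, area):
    """R = rho * l / A, in SI units (ohm metre, metre, square metre)."""
    return resistivity * length / area

RHO = 2.8e-8  # ohm metre; an assumed, aluminium-like value

thin = line_resistance(RHO, 10_000.0, 1.0e-4)       # 10 km of 1 cm^2 conductor
thick = line_resistance(RHO, 10_000.0, 2.0e-4)      # same length, double the area
long_line = line_resistance(RHO, 20_000.0, 1.0e-4)  # double the length

print(thin, thick, long_line)
```

Doubling the cross-sectional area halves the resistance, while doubling the length doubles it.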



The voltage drop, V, across a conductor equals the current, I, multiplied by the resistance, R. That is:
$V = IR$.
The rate of energy transfer in a conductor is called power, P, where:
$P = VI$.
If you know the current through a conductor and its resistance, the previous equation becomes:
$P = I^2R$.
Therefore, the power lost in a transmission line is given by the formula:
$P_{loss} = I^2R$
where
I = current flowing through the transmission line
R = the resistance of the transmission line.
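The step from the power equation to the power loss formula is a single substitution of Ohm's Law, with V taken as the voltage drop across the conductor itself:

```latex
P = VI = (IR)\,I = I^{2}R
```

Note that V here is the voltage drop across the transmission line, not the voltage at which the power is transmitted.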

Transmission line calculations


SAMPLE PROBLEM 8.3
A power station generates electric power at 120 kW. It sends this power to a town 10 km away through transmission lines that have a total resistance of 0.40 Ω. If the power is transmitted at 240 V, calculate:
(a) the current in the transmission lines
(b) the voltage drop across the transmission lines
(c) the voltage available in the town
(d) the power loss in the transmission lines.

SOLUTION
(a) For this calculation, use the station’s power and the voltage across the transmission lines.
$P = VI$
$\Rightarrow I = \frac{P}{V} = \frac{120\,000}{240} = 500$ A
(b) $V = IR = 500 \times 0.40 = 200$ V
(c) $V_{town} = V_{station} - V_{lines} = 240 - 200 = 40$ V
(d) $P_{loss} = I^2R = (500)^2 \times 0.40 = 100\,000$ W $= 100$ kW
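The four parts of sample problem 8.3 follow the same chain of formulas, so they can be checked together. The sketch below uses a hypothetical function name and the values given in the problem.

```python
def transmission_line(power, station_voltage, line_resistance):
    """Return current, line voltage drop, delivered voltage and power loss."""
    current = power / station_voltage             # from P = VI
    voltage_drop = current * line_resistance      # from V = IR
    delivered_voltage = station_voltage - voltage_drop
    power_loss = current ** 2 * line_resistance   # from P_loss = I^2 R
    return current, voltage_drop, delivered_voltage, power_loss

current, drop, v_town, loss = transmission_line(120_000.0, 240.0, 0.40)

print(current, drop, v_town, loss)
print(loss / 120_000.0)  # fraction of the generated power lost in the lines
```

The final line shows how much of the generated power never reaches the town when the transmission voltage is this low.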

Using transformers to reduce power loss


Sample problem 8.3 demonstrates the difficulties involved in transmitting electrical energy at low voltages over large distances. The solution is to use transformers to step up the voltage before transmission. For a given transmitted power, if the voltage is increased, the current is reduced. Recall that the power lost in transmission lines is given by the formula:
$P_{loss} = I^2R$.
If the transmission voltage is doubled, the current is halved and the power loss is reduced by a factor of four. If the current is reduced by a factor of 10, the power loss is reduced by a factor of 100, and so on.
Practical activity 8.4: Transmission line power losses
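The effect of the transmission voltage on power loss can be tabulated for a fixed transmitted power. The sketch below reuses the 120 kW and 0.40 Ω values from sample problem 8.3. The higher transmission voltages and the function name are assumptions introduced for illustration.

```python
def line_loss(power, voltage, resistance):
    """Power lost in the lines (W) when `power` is sent at `voltage`."""
    current = power / voltage          # from P = VI
    return current ** 2 * resistance   # from P_loss = I^2 R

POWER = 120_000.0   # W, from sample problem 8.3
RESISTANCE = 0.40   # ohm, from sample problem 8.3

# 240 V is the original case; the other voltages are illustrative.
for voltage in (240.0, 480.0, 2400.0, 24_000.0):
    print(voltage, line_loss(POWER, voltage, RESISTANCE))
```

Each tenfold increase in transmission voltage reduces the calculated loss by a factor of 100, provided the transmitted power and line resistance stay the same.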



Using transformers enables electricity to be supplied over large distances without wasting too much electrical energy. This has had a significant effect on society. If transformers were not used in the power distribution system, either power stations would have to be built in the cities and towns or the users of electricity would have to be located near the power stations. The latter would mean that industries and population centres would have to be located near the energy sources such as hydroelectric dams and coal mines. The former would mean that fossil fuel stations would dump their pollution on the nearby population centres.
eBook plus eModelling: Modelling power transmission. Use a spreadsheet to model the energy loss incurred over a distance. doc-0040

NSW electrical distribution system


The New South Wales electrical distribution system is shown in figure
8.20. The Bayswater power station in the Hunter Valley has four
660-megawatt generators that each have an output voltage of 23 kV.
(Each set of coils in the generators has an output of 220 MW.) The
three-phase power then enters a transmission substation where transformers step up the voltage to 330 kV.
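The size of this step-up can be related to the transformer equation and the power loss formula. The sketch below treats the transmission substation as a single ideal transformer, which is a simplifying assumption. Only the two voltages come from the text, and the variable names are hypothetical.

```python
GENERATOR_VOLTAGE = 23e3      # V, from the text
TRANSMISSION_VOLTAGE = 330e3  # V, from the text

turns_ratio = TRANSMISSION_VOLTAGE / GENERATOR_VOLTAGE  # ns / np
current_ratio = 1 / turns_ratio                         # Is / Ip (ideal case)
loss_ratio = current_ratio ** 2                         # relative I^2 R loss

print(round(turns_ratio, 1))     # how many times more secondary turns
print(round(1 / loss_ratio))     # factor by which the line loss is reduced
```

On these assumptions, transmitting at 330 kV rather than 23 kV reduces the loss in a given line by a factor of roughly 200.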

Figure 8.20 The New South Wales electrical supply network


The transmission lines end at a terminal station where the
voltage is stepped down to 66 kV for transmission to zone power
substations like the one shown in figure 8.21. There is usually a
zone substation in each regional centre and in each munici-
pality of a city. Here the voltage is stepped down to values of
11 kV and 22 kV. Power substations can perform three tasks:
1. step down the voltage using transformers
2. split the distribution voltage to go in different directions
3. allow, using circuit breakers and switches, the substation to be disconnected from the transmission grid, or sections of the distribution grid to be switched on and off.

Figure 8.21 Transformers at a substation

Finally, pole transformers, as shown in figure 8.22, step the voltage down to 415 V for industry and 240 V for domestic consumption.

Figure 8.22 A pole transformer

PHYSICS IN FOCUS
Protecting power transmission lines from lightning
When lightning strikes, it will usually pass between the bottom of a thundercloud and the highest point on the Earth below. This means that it will strike tall trees, the tops of buildings such as church spires and the metal power towers used to support high voltage power transmission lines. Many such power towers have a cable running between them known as the continuous earth line. This cable normally carries no current, but it may carry a current if a fault develops in the system. A second function of this cable is that it acts as a continuous lightning conductor. If this cable or a tower is struck by lightning, the electricity of the lightning will be conducted to the Earth by the metal towers and the transmission lines will not suffer from a sudden surge of voltage that could damage substations.



PHYSICS IN FOCUS
Insulating transmission lines
In dry air sparks can jump a distance of 1 cm for every 10 000 V of potential difference. Therefore, a 330 kV line will spark to a metal tower if it comes within a distance of 33 cm. In high humidity conditions the distance is larger. To prevent sparks jumping from transmission lines to the metal support towers, large insulators separate them from each other. It is important that these insulators are strong and have high insulating properties. Suspension insulators, illustrated in figure 8.23, are used for all high voltage power lines operating at voltages above 33 kV, where the towers or poles are in a straight line. Note that the individual sections of the insulators are disk shaped. This is because dust and grime collect on the insulators and can become a conductor when wet. Many wooden poles catch fire after the first rain following a prolonged dry period because a current flows across wet dirty insulators. The disk shape of the insulator sections increases the distance that a current has to pass over the surface of the insulator and so decreases the risk. There is also less chance that dirt and grime will collect on the undersides of the sections, and these are also less likely to get wet.
Both the continuous earth wire and insulators are clearly visible in figure 8.1 at the start of this chapter.

Figure 8.23 Suspension insulator used for high voltage transmission lines (labels: static dischargers; disk shape of insulators increases the leakage path; transmission cable)
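The dry-air rule of thumb quoted in this box (1 cm of spark distance for every 10 000 V) is a simple proportion. The sketch below, with a hypothetical function name, applies it to the line voltages mentioned in this chapter. It assumes dry air only; as the text notes, the distance is larger in humid conditions.

```python
SPARK_VOLTS_PER_CM = 10_000.0  # dry-air rule of thumb quoted in the text


def spark_distance_cm(voltage_v):
    """Approximate distance (cm) a spark can jump in dry air at the given voltage."""
    return voltage_v / SPARK_VOLTS_PER_CM


if __name__ == "__main__":
    for kilovolts in (11, 66, 330):
        distance = spark_distance_cm(kilovolts * 1000.0)
        print(f"{kilovolts:>3} kV line: keep conductors more than {distance:.1f} cm from the tower")
```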

PHYSICS IN FOCUS
Household use of transformers
Australian houses are provided with AC electricity that has a value of 240 V RMS. Most electronic circuits are designed to operate at low DC voltages of between 3 V and 12 V. Therefore, household appliances that have electronic circuits in them will either have a ‘power-cube’ transformer that plugs directly into the power outlet socket, or have transformers built into them.
Power-cube transformers can be found in rechargeable appliances such as ‘dust buster’ vacuum cleaners, electric keyboards, answering machines, cordless telephones and laptop computers. You can probably find more in your own home. These transformers also have a rectifier circuit built into them that converts AC to DC.

The RMS value of an AC voltage is a way of describing a voltage that is continuously changing. The voltage actually swings between −339 V and +339 V at a frequency of 50 Hz. This voltage has the same heating effect on a metal conductor as a DC voltage of 240 V; hence, we usually describe it as 240 V.

Figure 8.24 Most electrical appliances operate at low voltages and have transformers built into them or come with power-cube transformers.
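The link between a 240 V RMS supply and a voltage that swings between about −339 V and +339 V can be checked numerically. The following sketch assumes an ideal sinusoidal supply; the function names and the choice of 1000 samples per cycle are illustrative only.

```python
import math


def peak_from_rms(v_rms):
    """Peak voltage of a sinusoidal supply with the given RMS value."""
    return v_rms * math.sqrt(2)


def sampled_sine(peak_v, frequency_hz, n_samples=1000):
    """Evenly spaced voltage samples over exactly one cycle of a sine wave."""
    period = 1.0 / frequency_hz
    return [
        peak_v * math.sin(2 * math.pi * frequency_hz * (k * period / n_samples))
        for k in range(n_samples)
    ]


def rms(samples):
    """Root-mean-square value: square each sample, average, take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    peak = peak_from_rms(240.0)
    print(f"Peak voltage: {peak:.1f} V")
    print(f"RMS of sampled waveform: {rms(sampled_sine(peak, 50.0)):.1f} V")
```

The RMS calculation mirrors the heating-effect idea: heating depends on the square of the voltage, so the squares are averaged before the square root is taken.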



8.5 ELECTRICITY AND SOCIETY
The development of electrical generators and motors has affected many
phases of modern life, but not always in the ways predicted.
It was first predicted that electric machines would do all physical labour, giving workers more leisure time. Backbreaking housework would be eliminated by electrical gadgets, freeing still more time for leisure. What has happened instead is a reduction in unskilled jobs and an increase in unemployment.
Another prediction was that people could go back to living in the
countryside (which was considered to be the ideal place to live) and that
society would become more decentralised. This prediction arose during
the industrial revolution of the nineteenth century when people were
drawn to big cities where the power supplies were located. During this
time the energy supplies were smog-producing steam driven machines.
Electric power stations, however, are built at the source of their energy,
the coal mines or where dams are built to provide hydro-electricity. The
development of transformers and the power distribution system meant
that cities, factories and other industries could be located at large dis-
tances from the power stations.
Rather than going back to living in the countryside, however, the middle
classes moved out into the outer suburbs, living in bigger houses on bigger
blocks of land. The poorer citizens were concentrated into poorer inner
suburbs. However, the benefits of modern society that follow from the use
of electrical energy are available to most rural communities.
The use of electrical generators and motors has also had a dramatic
effect on the environment. Fossil fuel power stations have environmental
effects such as thermal pollution, acid rain and air pollution due to the
release of particles and oxides of nitrogen and sulfur. The huge amounts
of carbon dioxide released by power stations contribute to the enhanced
greenhouse effect that is thought to be raising the Earth’s temperature.



CHAPTER REVIEW

SUMMARY

• A changing magnetic flux through a coil can induce a voltage across the terminals of the coil.
• A generator is a device that transforms mechanical energy into electrical energy.
• One type of generator has a rotating coil in an externally produced magnetic field.
• An AC generator uses slip rings to connect its terminals to the coil.
• A DC generator uses a split ring commutator to connect its terminals to the coil.
• Transformers are devices that can convert an AC input voltage signal to a higher or lower AC output voltage.
• A transformer consists of a primary and secondary coil, usually linked by a soft iron core.
• The following equations apply to ideal transformers:
$\frac{V_p}{V_s} = \frac{n_p}{n_s}$ and $\frac{I_s}{I_p} = \frac{n_p}{n_s}$.
• Losses in transmission lines can be calculated using the formula $P_{\text{loss}} = I^2 R$.
• Power losses in transmission lines can be reduced by using step-up transformers to increase the transmission voltage, thereby reducing the transmission current.

QUESTIONS

1. Identify the types of energy transformation that occur in electrical generators.
2. Figure 8.25 shows a generator.
(a) Name the parts of the generator labelled A, B, C, D and E.
(b) Describe the function of each of these parts.
(c) What type of generator (AC or DC) is this?

Figure 8.25

3. A hand-turned generator is connected to a light globe in series with a switch. Is it easier to turn the coil when the switch is open or when it is closed? Explain your reasoning.
4. Describe the effect of the following changes on the size of the current produced by a generator:
(a) the number of loops of the coil is increased
(b) the rate of rotation of the coil is decreased
(c) the strength of the magnetic field is increased
(d) an iron core is used in the coil rather than having an air coil.
5. A student builds a model electric generator, similar to that shown in figure 8.26. The coil consists of 50 turns of wire. The student rotates the coil in the direction indicated on the diagram. The coil is rotated a quarter turn (90°).
(a) In which direction does the current flow through the external load?
(b) Sketch a flux versus time graph and an emf versus time graph as the coil completes one rotation at a steady rate starting from the instant shown in the diagram.

Figure 8.26 (labels: axis of rotation, loop, N and S poles, slip rings, load, points P and Q)

6. A rectangular coil of wire is placed in a uniform magnetic field, B, that is directed out of the page. This is shown in figure 8.27(a). At the instant shown the coil is parallel to the page.

Figure 8.27(a)

The coil rotates about the axis XY at a steady rate. The time variation of the voltage drop induced between points P and Q for one complete rotation is shown in figure 8.27(b).

Figure 8.27(b) (graph of E against t, with times A, B, C, D and E marked on the time axis)

(a) At which time(s) could the coil be in the position shown in figure 8.27(a)? Justify your answer.
(b) Which of the graphs in figure 8.27(c) shows the variation of voltage versus time if the coil is rotated at twice the original speed?

Figure 8.27(c) (four graphs, A to D, of E against t, each with times A, B, C, D and E marked on the time axis)

7. In a power station, an electromagnet is rotated close to a set of coils. The electromagnet is supplied with a direct current. An idealised diagram of this arrangement is shown in figure 8.28.
(a) In this arrangement, which part(s) make up the rotor and which make up the stator?
(b) Explain why it is necessary to rotate the electromagnet to produce an emf.
(c) Explain why energy must be provided to the electromagnet to keep it rotating at a constant speed.

Figure 8.28 (labels: DC, N, S, AC output)

8. Describe the main difference between AC and DC generators.
9. Draw a labelled cross-section diagram of a simple transformer. Referring to your diagram, explain how it operates in terms of the principles of electromagnetic induction.
10. Explain why a steady DC current input will not operate a transformer.
11. A transformer changes 240 V to 15 000 V. There are 4000 turns on the secondary coil.
(a) Identify what type of transformer this is.
(b) Calculate how many turns there are on the primary coil.
12. A doorbell is connected to a transformer that has 720 turns in the primary coil and 48 turns in the secondary coil. If the input voltage is 240 V AC, calculate the voltage that is delivered to the doorbell.
13. A school power pack that operates from a 240 V mains supply consists of a transformer with 480 turns on the primary coil. It has two outputs, 2 V AC and 6 V AC.

(a) Calculate the number of turns on the 2 V and 6 V secondary coils.
(b) The power rating of the transformer is 15 W. Calculate the maximum current that can be drawn from the 6 V secondary coil.
14. An ideal transformer has 100 turns on the primary coil and 2000 turns on the secondary coil. The primary voltage is 20 V. The current in the secondary coil is 0.50 A. Calculate:
(a) the secondary voltage
(b) the output power
(c) the input power
(d) the current flowing through the primary coil.
15. A transformer has 110 turns on the primary coil and 330 turns on the secondary coil.
(a) Identify this type of transformer.
(b) By what factor does it change the voltage?
16. An ideal transformer is designed to provide a 9.0 V output from a 240 V input. The primary coil is fitted with a 1.0 A fuse.
(a) Calculate the ratio
$\frac{\text{number of turns on the primary coil}}{\text{number of turns on the secondary coil}}$.
(b) Calculate the maximum current that can be delivered from the output terminals.
17. A neon sign requires 12 kV to operate. Calculate the ratio
$\frac{\text{number of turns on the primary coil}}{\text{number of turns on the secondary coil}}$
of the sign’s transformer if it is connected to a 240 V supply.
18. A 20.0 W transformer gives an output voltage of 25 V. The input current is 15 A.
(a) Calculate the input voltage.
(b) Is this a step-up or step-down transformer?
(c) Calculate the output current.
19. Describe the advantages gained by transmitting AC electrical power at high voltages over large distances.
20. A power station generates electric power at 120 kW. It sends this power to a town 10 kilometres away through transmission lines that have a total resistance of 0.40 Ω. If the power is transmitted at $5.00 \times 10^5$ V, calculate:
(a) the current in the transmission lines
(b) the voltage drop across the transmission lines
(c) the voltage available at the town
(d) the power loss in the transmission lines.
21. A generator coil at the Bayswater power station produces 220 MW of power in a single phase at 23 kV. This voltage is stepped up to 330 kV by a transformer in the transmission substation. Calculate:
(a) the ratio
$\frac{\text{number of turns on the primary coil}}{\text{number of turns on the secondary coil}}$
for the transformer
(b) the output power of the transformer
(c) the current in the transmission line.
22. A generator has an output of 20 kW at 4.0 kV. It supplies a factory via two long cables with a total resistance of 16 Ω.
(a) Calculate the current in the cables.
(b) Calculate the power loss in the cables.
(c) Calculate the voltage between the ends of the cables at the factory.
(d) Describe how the power supplied to the workshop could be increased.
23. Write a brief essay discussing one of the following topics.
(a) Describe the impact that the development of AC and DC generators has had on society.
(b) Describe what life would be like if transformers had not been invented.

PRACTICAL ACTIVITIES

8.1 OBSERVING THE OUTPUT OF A HAND-OPERATED GENERATOR

Aim
(a) To observe the output voltage of an AC generator using a cathode ray oscilloscope.
(b) To use a DC motor as a generator and to observe its output voltage using a cathode ray oscilloscope.
(c) To observe what happens when the coil of a generator is rotated at different speeds.
(d) To compare the force required to rotate a coil of a generator when a load is connected to the generator.

Apparatus
hand-cranked model generator
DC motor
cathode ray oscilloscope (CRO)
2 V light globe
connecting wires

Theory
Rotating a coil in a magnetic field induces a voltage across the terminals of the coil.
An AC generator connects to the coil with slip rings.
A DC generator uses a split-ring commutator.
When a current flows through the coil of a generator there is a magnetic force that opposes the motion of the coil.

Method
1. Connect the AC generator to the CRO as shown in figure 8.29.
2. Turn the coil slowly and describe the trace on the CRO, noting the period, the peak voltage obtained and the shape of the trace. Sketch the trace.
3. Repeat step 2, rotating the coil quickly.
4. Connect the light globe across the terminals of the generator. Comment on the ease of rotating the coil compared with when the globe was not in use.
5. Connect the CRO across the terminals of a DC motor.
6. Turn the shaft of the motor by hand and describe the trace on the CRO, noting the period, the peak voltage obtained and the shape of the trace. Sketch the trace.

Figure 8.29 (labels: direction of rotation, N and S poles, coil sides X, Y, X' and Y', slip rings, brushes, bulb, CRO)

Analysis
Relate your observations to the theory presented in this chapter.

8.2 MAKING A SIMPLE TRANSFORMER

Aim
(a) To set up a simple transformer
(b) To observe that a steady DC current does not produce an output current from a transformer.

Apparatus
galvanometer
two coils having different numbers of turns, one coil fitting inside the other
1.5 V cell
switch
variable resistor
connecting wires

Theory
A transformer consists of a primary and secondary coil. In this experiment the primary coil fits inside the secondary coil. The changing flux produced in the primary coil induces a current in the secondary coil.

Method
1. Place the smaller (primary) coil in the larger (secondary) coil. Connect the secondary coil to the galvanometer. Connect the primary coil in series with the switch, variable resistor and cell as shown in figure 8.30.
2. Set the variable resistor to its lowest value.
3. Observe the effects on the galvanometer as you close the switch, keep it closed for five seconds and then open the switch.
4. Describe what happens.
5. Close the switch and change the value of the variable resistor slowly and rapidly. Open the switch.
6. Record your observations.

Figure 8.30 (labels: secondary coil (top view), primary coil (top view), 1.5 V cell, galvanometer G)

Analysis
1. Relate your observations to Faraday’s Law of Electromagnetic Induction.
2. Describe the conditions necessary for the operation of a transformer.

8.3 TRANSFORMER INS AND OUTS

Aim
To use an AC input voltage to produce an AC output voltage and to compare their values.

Apparatus
two coils having a different number of turns of wire, preferably with the number of turns on each being known, one coil fitting inside the other
two 2 V light globes in holders
AC power source
switch
dual trace cathode ray oscilloscope
connecting wires
long iron nails that fit inside the smaller coil to produce a soft iron core

Theory
An AC input voltage will induce an AC output voltage.
If the secondary coil has more turns than the primary, the device is a step-up transformer and the secondary voltage will be greater than the primary voltage.
The inclusion of an iron core increases the effectiveness of the transformer.

Method
1. Place the smaller (primary) coil in the other (secondary) coil.
2. Connect the secondary coil to a light globe.
3. Connect the primary coil in series with the switch, a light globe and the AC source.
4. Adjust the AC source to its lowest value (less than 2 V).
5. Connect one set of input leads of the CRO across the terminals of the primary coil, and the other across the terminals of the secondary coil. The set-up is illustrated in figure 8.31.
6. Close the switch and observe the traces on the CRO.
7. Compare the primary and secondary peak voltages and frequencies.
8. Insert iron nails in the primary coil and repeat steps 6 and 7.

Figure 8.31 (labels: secondary coil (top view), primary coil (top view), 2 V AC, CRO)

Analysis
• Explain your observations in terms of the theory you have studied.
• What effect did the insertion of the iron nails have on the effectiveness of the transformer?

8.4 TRANSMISSION LINE POWER LOSSES

Aim
(a) To investigate the effects of resistance in transmission lines
(b) To investigate the use of transformers in power distribution systems.

Apparatus
Transmission line experiment
(Available from Haines Educational P/L, www.haines.com.au)

Theory
Power losses in transmission lines can be reduced by using transformers to step up the voltage for transmission and step it down again for use. This kit models transmission lines by using resistance wire.

Method
1. Set up the equipment as described in the instruction brochure.
2. Transmit power from an AC supply to the load globe using the transmission lines alone. Note the brightness of the globe and the current in the transmission lines.
3. Measure the voltage output of the supply and the voltage at the globe.
4. Repeat steps 1 and 2, this time using the transformers.

Analysis
Comment on your results.

CHAPTER 9 AC ELECTRIC MOTORS
Remember
Before beginning this chapter, you should be able to:
• find the polarity of a solenoid by using the right-hand rule
• recall that an electromagnet is a soft iron core
surrounded by a coil of wire. An electromagnet acts like
a magnet when a current flows through the coil.
• recall that voltages and currents can be induced in wires
and coils by the relative movement between the wire or
coil and a magnetic field
• recall that a changing magnetic field induces voltages and
currents in wires and coils
• apply Lenz’s Law: The direction of the induced current
is such that its magnetic action tends to resist the change
by which it is produced.
• recall that a current-carrying conductor in a magnetic
field experiences a force. This force is at a maximum when
the current is perpendicular to the magnetic field.
• find the direction of the force on a current-carrying conductor in a magnetic field by using the right-hand push rule: The thumb points in the direction of the conventional current. The fingers point in the direction of the external magnetic field. The palm indicates the direction of the force.
• recall that generators and motors have two main parts, the stator and the rotor
• recall that alternating current (AC) periodically changes direction. In Australia, AC electricity has a frequency of 50 Hz.
• recall that a power station has generators that have three sets of coils in their stators set 120° apart. They produce three AC currents that are 120° out of phase with each other. This is known as three-phase power generation.
• recall that a transformer consists of two coils, known as the primary and secondary, which are usually linked by a soft iron core. Transformers step up or step down AC voltages.

Figure 9.1 A cutaway view of an electric drill. The motor is a universal motor. Such AC motors are used extensively in power tools and household appliances.

Key content
At the end of this chapter you should be able to:
• describe the main features of an AC motor
• identify some energy transfers and transformations
involving conversion of electrical energy into more
useful forms.
As we have seen in previous chapters, alternating current (AC) is widely
used in today’s world. It is easier to produce in power generating stations
and easier to distribute over large distances with small energy losses due
to the use of transformers. AC electricity is also produced at a very
precise frequency. In Australia this frequency is 50 Hz.
AC motors are used when very precise speeds are required, for
example in electric clocks. AC motors operate using an alternating
current (AC) electrical supply. Electrical energy is usually transformed
into rotational kinetic energy.

9.1 MAIN FEATURES OF AN AC MOTOR
As with the DC motors, AC generators and DC generators that have been
studied earlier, AC motors have two main parts. These are called the
stator and the rotor.
The stator is the stationary part of the motor and it is usually con-
nected to the frame of the machine. The stator of an AC motor provides
the external magnetic field in which the rotor rotates. The magnetic field
produces a torque on the rotor.
Most AC motors have a cylindrical rotor that rotates about the axis of
the motor’s shaft. This type of motor usually rotates at high speed, with
the rotor completing about one revolution for each cycle of the AC
electricity supply. This means that Australian AC motors rotate at about
50 revolutions per second or 3000 revolutions per minute. If slower
speeds are required, they are achieved using a speed-reducing gearbox.
This type of motor is found in electric clocks, electric drills, fans, pumps,
compressors, conveyors, and other machines in factories. The rotor is
mounted on bearings that are attached to the frame of the motor. In
most AC motors the rotor is mounted horizontally and the axle is
connected to a gearbox and fan. The fan cools the motor.
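The speed figures quoted above follow directly from the supply frequency. The sketch below assumes, as the text does, about one rotor revolution per cycle of the supply; the function names and the 30:1 gearbox ratio are hypothetical values chosen only to illustrate the arithmetic.

```python
def rotor_speed_rpm(supply_frequency_hz, revolutions_per_cycle=1.0):
    """Rotor speed in revolutions per minute for a given supply frequency."""
    revolutions_per_second = supply_frequency_hz * revolutions_per_cycle
    return revolutions_per_second * 60.0


def geared_speed_rpm(motor_rpm, gear_ratio):
    """Output speed of a speed-reducing gearbox with the given reduction ratio."""
    return motor_rpm / gear_ratio


if __name__ == "__main__":
    motor_rpm = rotor_speed_rpm(50.0)  # Australian mains frequency
    print(f"Motor shaft: about {motor_rpm:.0f} rpm")
    # A 30:1 reduction is an assumed example, not a value from the text.
    print(f"After a 30:1 gearbox: about {geared_speed_rpm(motor_rpm, 30.0):.0f} rpm")
```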
Both the rotor and the stator have a core of ferromagnetic material,
usually steel. The core strengthens the magnetic field. The parts of the
core that experience alternating magnetic flux are made up of thin steel
laminations separated by insulation to reduce the flow of eddy currents
that would greatly reduce the efficiency of the motor.

The universal motor


There are two main classifications of AC motors. Single-phase motors operate on one of the three phases produced at power generation plants. Single-phase AC motors can operate on domestic electricity. Polyphase motors operate on two or three of the phases produced at power generation plants. One type of single-phase AC electric motor is the universal motor.

A universal motor is a series-wound motor that may be operated on either AC or DC electricity.

Universal motors are designed to operate on DC and AC electricity. They are constructed on similar lines to the DC motor studied in chapter 6. The rotor has several coils wound onto the rotor armature. The ends of these coils connect to opposite segments of a commutator. The external magnetic field is supplied by the stator electromagnets that are connected in series with the coils of the armature via brushes. The interaction between the current in a coil of the armature and the external magnetic field produces the torque that makes the rotor rotate. Even though the direction of the current is changing 100 times per second when the motor is connected to the mains, the universal motor will continue to rotate in the same direction because the magnetic field flux of the stator is also changing direction 100 times every second.
A variable resistor controls the speed of a universal motor by varying the current through the coils of the armature and the field coils of the stator. The universal motor is commonly used for small machines such as portable drills and food mixers. Figure 9.2 shows a schematic diagram of a universal motor, and figure 9.3 shows a diagram of a universal motor that has been taken apart.

Figure 9.2 A universal motor (labels: field coils, commutator on armature, brush, DC or AC supply, variable resistor to control speed of motor)

Figure 9.3 A dismantled universal motor (labels: bearings, shaft, commutator, brush housing, armature coils, field electromagnet core, brushes)

AC induction motors

An induction motor is an AC machine in which torque is produced by the interaction of a rotating magnetic field produced by the stator and currents induced in the rotor.

Induction motors are so named because a changing magnetic field that is set up in the stator induces a current in the rotor. This is similar to what happens in a transformer, with the stator corresponding to the primary coil of the transformer and the rotor corresponding to the secondary. One difference is that in an induction motor the two parts are separated by a thin air gap. Another difference is that in induction motors the rotor (secondary coil) is free to move.
The simplest form of AC induction motor is known as the squirrel-cage motor. It is called a squirrel-cage motor because the rotor resembles the cage or wheel that people use to exercise their squirrels or pet mice. It is an induction motor because no current passes through the rotor directly from the mains supply. The current in the rotor is induced in the conductors that make up the cage of the rotor by a changing magnetic field, as explained later in this chapter.

CHAPTER 9 AC ELECTRIC MOTORS 165


Squirrel-cage induction motors are by far the most common type of AC motor used domestically and in industry. Squirrel-cage induction motors are found in some power drills, beater mixers, vacuum cleaners, electric saws, hair dryers, food processors and fan heaters, to name but a few.

The structure of AC induction motors


The easiest type of induction motor to understand is the
three-phase induction motor. This operates by using each
phase of AC electricity that is generated in power stations
and supplied to factories.
Household electric motors are single-phase motors. This is
because houses are usually supplied with only one phase of
the three phases that are produced in power stations. It is not
important to understand how the rotating magnetic field is
achieved in single-phase AC induction motors; therefore, this
chapter will concentrate on the three-phase motor, as its
workings are easier to visualise.

The stator of three-phase induction motors


In both single- and three-phase AC induction motors, the stator sets up a rotating magnetic field that has a constant magnitude. The stator of a three-phase induction motor usually consists of three sets of coils that have iron cores. The stator is connected to the frame of the motor and surrounds a cylindrical space in which it sets up a rotating magnetic field. In three-phase induction motors, this is achieved by connecting each of the three pairs of field coils to a different phase of the mains electrical supply. The coils that make a pair are located on opposite sides of the stator and they are linked electrically. The magnetic field inside the stator rotates at the same frequency as the mains supply; that is, at 50 Hz. A cutaway diagram of a stator is shown in figure 9.5.

Figure 9.4 A mouse exercise wheel is similar to a squirrel cage.
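A short numerical model shows why three coil pairs fed from three phases give a field of constant magnitude that rotates at the supply frequency. This is an idealised sketch, not the authors' method: it assumes each coil pair produces a sinusoidal field of unit amplitude along its own axis, with the axes set 120° apart and each phase lagging the previous one by 120°. All names are hypothetical.

```python
import math

# Directions of the three coil-pair axes around the stator (radians).
COIL_AXIS_ANGLES = (0.0, 2 * math.pi / 3, 4 * math.pi / 3)


def net_field(t, frequency_hz=50.0, b0=1.0):
    """Return (Bx, By): the vector sum of the three coil-pair fields at time t."""
    omega = 2 * math.pi * frequency_hz
    bx = 0.0
    by = 0.0
    for axis_angle in COIL_AXIS_ANGLES:
        # Each phase lags by the same angle as the position of its coil pair.
        strength = b0 * math.cos(omega * t - axis_angle)
        bx += strength * math.cos(axis_angle)
        by += strength * math.sin(axis_angle)
    return bx, by


if __name__ == "__main__":
    for millis in range(0, 21, 5):
        bx, by = net_field(millis / 1000.0)
        magnitude = math.hypot(bx, by)
        angle = math.degrees(math.atan2(by, bx)) % 360
        print(f"t = {millis:2d} ms  |B| = {magnitude:.3f}  direction = {angle:5.1f} deg")
```

In this model the magnitude stays at 1.5 times the single-coil amplitude while the direction advances steadily, completing one turn in 20 ms, which is one cycle of a 50 Hz supply.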

Figure 9.5 The rotating magnetic field set up by the stator. Note that in this stator there are three pairs of field coils and that each pair is connected. (Labels: field coil pair, stator, rotating magnetic field produced by stator)

The magnetic field rotates at exactly the same rate as the electro-
magnet in the power station generator that provides the AC electricity.
Each pair of coils in the stator of the generator supplies a corresponding
pair of coils in the stator of the motor. Therefore, the magnetic field in
the motor rotates at exactly the same rate as the electromagnet in the
generator. This is represented in figure 9.6.

Figure 9.6 Supplying three-phase electrical power to the motor (labels: generator with electromagnet; distribution system with phases a, b and c and earth; motor with magnetic field rotating at the same rate as the electromagnet of the generator)

The squirrel-cage rotor

The rotor of the AC induction motor consists of a number of conducting bars made of either aluminium or copper. These are attached to two rings, known as end rings, at either end of the bars. This forms an object that is sometimes called a squirrel-cage rotor (see figure 9.7). The end rings ‘short-circuit’ the bars and allow a current to flow from one side to the other of the cage.

A squirrel-cage rotor is an assembly of parallel conductors and short-circuiting end rings in the shape of a cylindrical squirrel cage.

Figure 9.7 A squirrel-cage rotor (labels: end rings, copper or aluminium rotor bars)

The bars and end rings are encased in a laminated iron armature as shown in figure 9.8. The iron intensifies the magnetic field passing through the conductors of the rotor cage and the laminations decrease the heating losses due to eddy currents. The armature is mounted on a shaft that passes out through the end of the motor. Bearings reduce friction and allow the armature to rotate freely.

As the electromagnet of the generator rotates, it influences three sets of linked coils. This produces three sets of AC voltage signals. These signals are ‘out of phase’ with each other. See figure 8.14 on page 146.

Figure 9.8 The rotor of an AC motor (labels: conductors, laminated iron, iron laminations, shaft, end ring)



Figure 9.9 below shows a cutaway model of a fully assembled induction
motor. Note the field coils of the stator and the squirrel-cage rotor with a
laminated iron core. Also note that the shaft in this case is connected to a
gearbox so that a lower speed than 3000 revolutions per minute can be
achieved, and that the cooling fan is mounted on the shaft.

The operation of AC induction motors


As the magnetic field rotates in the cylindrical space within the stator, it passes over the bars of the cage. This has the same effect as the bars moving in the opposite direction through a stationary magnetic field. The relative movement of the bars through the magnetic field creates a current in the bars. Bars carrying a current in a magnetic field experience a force. The discussion on the opposite page shows that the force in this case is always in the same direction as the movement of the magnetic field. The cage is then forced to ‘chase’ the magnetic field around inside the stator.

9.1 Demonstrating the principle of an AC induction motor

Figure 9.9 A cutaway view of an induction motor (labels: stator
laminations; cooling fan; squirrel-cage rotor; gear box; housing; stator
field coil)

Figure 9.10(a) on the following page shows an end view of the magnetic
field as it moves across a conductor bar of the squirrel cage. The
magnetic field moving to the right across the conductor bar has the same
effect as the conductor bar moving to the left across the magnetic field.
You can use the right-hand push rule to determine the direction of the
induced current in the conductor. The thumb points to the left (the
direction of movement for positive charges relative to the magnetic
field), the fingers point up the page (the direction of the magnetic field)
and the palm of the hand shows the direction of the force on positive
charges and consequently the direction of the induced current. This will
show that the current in the bar is flowing into the page.
There is now a current flowing in the conductor bar as shown in figure
9.10(b). The direction of the force acting on the induced current is
determined using the right-hand push rule. Therefore, the force on the
conductor is to the right, which is in the same direction as the movement
of the magnetic field. This is shown in figure 9.10(c).
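The two applications of the right-hand push rule described above can be checked with a vector cross product. The following is a minimal sketch, not part of the original text: the axis convention (+x to the right, +y up the page, +z out of the page) and the helper name `cross` are assumptions made for illustration.

```python
# Axes assumed for this sketch: +x is to the right, +y is up the page,
# +z is out of the page (so -z is into the page).

def cross(a, b):
    """Return the cross product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

field = (0, 1, 0)                 # magnetic field points up the page
bar_relative_motion = (-1, 0, 0)  # bar moves left relative to the field

# Step 1: the force on positive charges in the bar gives the direction
# of the induced current (velocity x field).
current_direction = cross(bar_relative_motion, field)

# Step 2: the force on the bar carrying that current (current x field).
force_direction = cross(current_direction, field)

print(current_direction)  # (0, 0, -1): into the page
print(force_direction)    # (1, 0, 0): to the right
```

The second result points in the same direction as the movement of the magnetic field, which is why the cage is pushed along with the field.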

Figure 9.10 (a) The induced current in a conductor bar
(b) The direction of the magnetic field of the induced current flowing in the bar
(c) The force acting on the bar carrying the induced current
(labels: imaginary S pole of the stator; movement of magnetic field;
magnetic field; conductor bar; current induced in conductor; direction of
force on conductor)

Slip
If the bars of the squirrel cage were to rotate at exactly the same rate as
the magnetic field, there would be no relative movement between the
bars and the magnetic field and there would be no induced current and
no force. If the cage is to experience a force there must be relative
movement, such as the cage constantly ‘slipping’ behind the magnetic
field. When operating under a load, the retarding force slows the cage
down so that it is moving slower than the field. The difference in
rotational speed between the cage and the field is known as the slip
speed. This means that the rotor is always travelling at a slower speed
than the magnetic field of the stator when the motor is doing work.
When any induction motor does work, the rotor slows down. You can
hear this happen when the beaters of a mixer are put into a thick mixture
or when a power drill is pushed into a thick piece of wood. When this
occurs, the amount of slip is increasing. This means that the relative
movement between the magnetic field and the conductor bars is greater
and that the induced current and magnetic force due to the current are
increased.

Slip speed is the difference between the speed of the rotating magnetic
field and the speed of the rotor.
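The definition of slip speed can be illustrated with a short calculation. This is a sketch only: the field speed of 3000 revolutions per minute follows from the 50 rotations per second mentioned in this chapter, but the rotor speeds and the function name `slip_speed` are assumed values chosen for illustration.

```python
def slip_speed(field_rpm, rotor_rpm):
    """Slip speed: field speed minus rotor speed (both in rev/min)."""
    return field_rpm - rotor_rpm

# A field rotating 50 times per second turns 3000 times per minute.
FIELD_RPM = 50 * 60

# Hypothetical rotor speeds for an increasing load on the motor.
for rotor_rpm in (3000, 2950, 2850):
    print(rotor_rpm, slip_speed(FIELD_RPM, rotor_rpm))
```

A slip speed of zero corresponds to the case described above in which no current is induced and no force acts on the cage; heavier loads give larger slip speeds.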

Power of AC induction motors


Power is the rate of doing work. Work is done when energy is trans-
formed from one type to another. Induction motors are considered to
produce low power because the amount of mechanical work they achieve
is low compared with the electrical energy consumed. The electrical
power consumed by a motor is calculated using the formula P = VI, where
V is the voltage at the terminals of the motor, and I is the current flowing
through the coils of the stator. The ‘lost power’ of induction motors is
consumed in magnetising the working parts of the motor and in creating
induction currents in the rotor.
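As a rough illustration of the formula P = VI and of ‘lost power’, the sketch below compares the electrical power consumed with the mechanical power delivered. All of the numbers and both function names are assumptions made for this example; they are not measurements of any real motor.

```python
def electrical_power(voltage, current):
    """Electrical power consumed, P = VI (watts)."""
    return voltage * current

def lost_power(voltage, current, mechanical_power):
    """Power consumed that does not appear as mechanical output (watts)."""
    return electrical_power(voltage, current) - mechanical_power

# Assumed values: 240 V at the terminals, 2.0 A in the stator coils,
# and 300 W of mechanical output at the shaft.
consumed = electrical_power(240, 2.0)
lost = lost_power(240, 2.0, 300)
print(consumed, lost)
```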

9.2 ENERGY TRANSFORMATIONS AND TRANSFERS
The Principle of Conservation of Energy states that energy cannot be
created or destroyed, but it can be transformed from one type to
another. ‘Transform’ means to change form. Energy transfers happen
when energy moves from one place to another.



The energy transformations that occur when an electric appliance is
operating depend on what the machine is doing. Consider the oper-
ation of a hair dryer as an example. The electric motor transforms
electrical energy into mechanical energy (the rotor spins). Some elec-
trical energy is also transformed into internal energy due to eddy
currents in the laminated iron core. The mechanical energy of the
rotor is transformed into sound and internal energy within the motor,
and into the kinetic energy of air particles by a fan that is attached to
the shaft of the rotor. Energy is transferred from the motor to the air
by the rotor shaft and the fan. The air passes through a heating
element where electrical energy is transformed into internal energy
and light energy. Internal energy is then transferred out of the dryer
by direct conduction to the air particles and by convection as the air
particles carry the energy from the dryer.



CHAPTER REVIEW

SUMMARY
• AC electric motors are used in many machines in households and in
  industry.
• The basic operating principles of AC electric motors are the same as for
  DC motors. A current-carrying conductor in a magnetic field experiences
  a force, the direction of which is given by the right-hand push rule.
• The stator of an AC induction motor consists of field coils that set up a
  rotating magnetic field that has a constant magnitude. This field rotates
  50 times every second.
• A squirrel-cage rotor has conductor bars that are short-circuited by two
  end rings. The bars are embedded in a laminated iron core. (The
  laminations reduce energy losses due to eddy currents.)
• A current is induced in the bars of the rotor as the magnetic field moves
  across them.
• The bars experience a force because they are carrying a current in a
  magnetic field. The direction of the force is the same as the direction of
  movement of the magnetic field. The rotor is therefore forced to chase
  the magnetic field.
• For the current to be induced in the rotor bars, the rotor must slip
  behind the magnetic field.

QUESTIONS
1. Explain why AC electric motors are used in situations that require
   accurate speed, such as clocks and tape recorders.
2. Describe how universal motors operate on both AC and DC electricity.
3. Describe the main features of an AC induction motor.
4. Figure 9.11 shows a cutaway diagram of an AC induction motor.
   (a) Name the parts labelled A to E.
   (b) Explain the functions of the parts labelled A to E.

   Figure 9.11 (cutaway diagram with parts labelled A to E)

5. Briefly describe how the rotating magnetic field is produced in a
   three-phase AC electric motor.
6. Describe the construction of a squirrel-cage rotor.
7. Explain how a current is produced in the rotor bars of an AC induction
   motor.
8. Describe the purpose of the end rings of a squirrel cage.
9. (a) What is slip?
   (b) Explain why slip is necessary for the operation of a squirrel-cage
       induction motor.
10. Account for the ‘lost power’ of induction motors.
11. Describe the energy transformations and transfers that occur in the
    operation of a household electric mixer.



PRACTICAL ACTIVITIES

9.1 DEMONSTRATING THE PRINCIPLE OF AN AC INDUCTION MOTOR

Aim
(a) To investigate the direction of a current in a conductor that is moving
    relative to a magnetic field
(b) To investigate the factors affecting the magnitude of a current in a
    conductor that is moving relative to a magnetic field.

Apparatus
a copper or aluminium rod. If none are available, a length of copper wire
will do.
two bar magnets or a horseshoe magnet
a galvanometer
connecting wires

Theory
A current is induced in a conductor when there is relative movement
between it and a magnetic field. The magnitude depends on the orientation
of the conductor to the field, the speed of the relative motion between the
conductor and the magnetic field and the strength of the magnetic field.

Method
Set up the apparatus as shown in figure 9.12. Place two bar magnets on two
books with an N pole on the left and an S pole on the right. Note the
separation of the poles, as this affects the strength of the magnetic field.
1. Connect the galvanometer to the conductor.
2. Move the conductor downward slowly between the poles of the magnet.
   Note the magnitude and direction of the current through the conductor.
3. Move the conductor downward quickly between the poles of the magnet.
   Note the magnitude and direction of the current through the conductor.
4. Move the conductor upward slowly between the poles of the magnet.
   Note the magnitude and direction of the current through the conductor.
5. Move the conductor upward quickly between the poles of the magnet.
   Note the magnitude and direction of the current through the conductor.
6. Place the conductor at an angle to the magnetic field and move it
   upward quickly between the poles of the magnet. Compare the magnitude
   of the current through the conductor with the result obtained in step 5.
7. Double the separation of the magnetic poles. What effect will this have
   on the strength of the magnetic field?
8. Repeat step 5.
9. Return the magnetic poles to their original separation. Support the
   conductor and move the poles upward past the conductor. Do you get the
   same direction of current as when the conductor moved downward
   between the poles?

Analysis
Relate your observations to the theory presented in this chapter.

Figure 9.12 (labels: conductor; galvanometer)
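The three factors listed in the Theory section of this activity can be explored numerically before the experiment is carried out. The sketch below assumes the standard expression for the voltage induced across a straight conductor, emf = BLv sin θ, which is not stated in this chapter; the function name and all numerical values are hypothetical. The galvanometer reading is taken to be proportional to this voltage.

```python
import math

def induced_emf(field_strength, length, speed, angle_degrees):
    """Induced voltage (V) for a straight conductor of the given length (m)
    moving at the given speed (m/s) through a field (T), where
    angle_degrees is the angle between the motion and the field.
    Assumes emf = B * L * v * sin(angle)."""
    return field_strength * length * speed * math.sin(math.radians(angle_degrees))

# Hypothetical values for a classroom set-up.
slow = induced_emf(0.05, 0.10, 0.5, 90)         # slow movement
fast = induced_emf(0.05, 0.10, 1.0, 90)         # speed doubled
angled = induced_emf(0.05, 0.10, 1.0, 30)       # conductor moved at an angle
weak_field = induced_emf(0.025, 0.10, 1.0, 90)  # weaker field (poles further apart)

print(slow, fast, angled, weak_field)
```

Under this assumption, doubling the speed doubles the reading, while moving at an angle or weakening the field reduces it, which matches the comparisons asked for in steps 3, 6 and 8.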



HSC CORE MODULE
FROM IDEAS TO IMPLEMENTATION

Chapter 10
Cathode rays and the development of television

Chapter 11
The photoelectric effect and black body radiation

Chapter 12
The development and application of transistors

Chapter 13
Superconductivity

CHAPTER 10
CATHODE RAYS AND THE DEVELOPMENT OF TELEVISION
Remember
Before beginning this chapter, you should be able to:
• describe the trajectory of a particle in a uniform
gravitational field
• describe the properties of electrostatic charges and the
fields associated with them
• describe electric potential difference, or the voltage
between two points, as the work done in moving a unit
charge between those points
• describe the effect that an electric field has on charged
particles
• recall and perform calculations using F = qE, P = VI
and Energy = VIt
• describe the fields produced by magnetic poles
• describe the magnetic fields produced around current-
carrying conductors.

Key content
At the end of this chapter you should be able to:
• explain how cathode ray tubes allowed a stream of
charged particles to be manipulated
• explain why the apparently inconsistent behaviour of
cathode rays caused debate as to whether they were
charged particles or electromagnetic waves
• list the properties of cathode rays and describe the
key experimental observations from which these
properties were deduced
• identify that moving charged particles in a magnetic
field experience a force
• describe the effect that a magnetic field has on
charged particles
• describe, qualitatively, the electric field between
parallel charged plates
• outline Thomson’s experiment in which he measured
the charge-to-mass ratio of the electron
• sketch a cathode ray tube; label the electrodes, electron
gun, the deflection plates or coils and the fluorescent
screen, and describe their role in television displays
and oscilloscopes
• perform calculations using $E = \frac{V}{d}$, $F = qE$ and
$F = qvB \sin\theta$
• using the example of cathode rays, discuss the
application of the scientific method to develop an
understanding of this phenomenon.

Figure 10.1 The famous physicist J. J. Thomson (1856–1940) was celebrated
for his experiments with the electron.
10.1 THE DISCOVERY OF CATHODE RAYS
In the early part of the nineteenth century, the discovery of electricity
had a profound effect on the study of science. By the 1850s, much was
known about which solids and liquids were electric conductors or
insulators, and it was thought that gases were electric insulators.
The development of a vacuum, using pumps to remove the air from glass
tubes, was also being actively researched at this time. As improved vacuum
pumps were developed, scientists were able to experiment with gases at very
low pressures. In 1855, a German physicist, Heinrich Geissler (1814–1879),
refined a vacuum pump so that it could be made to evacuate a glass tube to
within 0.01 per cent of normal air pressure. Geissler’s friend, Julius Plücker
(1801–1868), took these tubes and sealed a metal plate, called an electrode,
to each end of the tube. The electrodes made electrical connections
through the glass and were sealed to maintain the partial vacuum in the
tube. These were then connected to a high-voltage source, as illustrated in
figure 10.2. To their surprise, the evacuated tube actually conducted an
electric current. What puzzled them more was the fact that the glass at the
positive end, or anode, of the vacuum tube glowed with a pale green light.
What type of invisible ‘ray’ caused this glow or fluorescence?
Whatever it was must have originated at the negative electrode, or
cathode, of the vacuum tube. Another physicist, Eugen Goldstein
(1850–1930), who was studying these same effects, named the rays that
caused the glow ‘cathode rays’, and the tubes became known as cathode ray
tubes or discharge tubes (see figure 10.3). Early experimenters used
these tubes to investigate all of the properties of cathode rays and X-rays.
Some modified the cathode ray tube to include a rectangular metal plate
covered in zinc sulphide inside the tube. This plate had a horizontal slit
cut into the end nearest the cathode and the plate was slightly bent so
that the cathode rays formed a horizontal beam. When the cathode rays
struck this material it appeared fluorescent and showed the path of the
rays through the tube.
Cathode ray tubes have been refined and developed and are now used
in television sets, computers and many other applications.

Partial vacuums are often described as ‘rarefied air’.

Fluorescence is the emission of light from a material when it is exposed to
streams of particles or external radiation.

Cathode rays are now known to be streams of electrons emitted within an
evacuated tube from a cathode (negative electrode) to an anode (positive
electrode). They were first observed in discharge tubes.

A cathode ray tube or a discharge tube is a sealed glass tube from which
most of the air is removed by vacuum pump. A beam of electrons travels
from the cathode to the anode and can be deflected by electrical and/or
magnetic fields.

Figure 10.2 Production of cathode rays in a discharge tube, as used by
Plücker (labels: glass glows here; anode; cathode)

Figure 10.3 A variety of early discharge tubes used in experiments
(labels: Maltese cross; rotating wheel; electric plates; screen display;
induction coil)



Discharge tubes
Discharge tubes evacuated to different air pressures were found to
produce different effects.
For example, in practical activity 10.1 (page 191), an induction coil
acts as a step-up transformer, delivering a high voltage across the set of
discharge tubes. At low pressures, electrons can accelerate to faster
speeds before colliding with gas particles. Initially, a current will flow
even though nothing can be seen. The first effect that can be observed is
a steady luminous discharge known as a ‘glow discharge’. As the pressure
is lowered further, a number of colourful effects can be seen.
At first, most of the tube is occupied by a bright luminous region called
a ‘positive column’ which appears to start from the anode and is broken
up into a series of bands or striations (see figure 10.4). Near the cathode, a
weaker glow can be seen. The striations are separated by ‘dark spaces’.
These discharges and spaces are named after some of the scientists who
examined them, for example, ‘Aston’s dark space’, ‘Crookes’ dark
space’ and ‘Faraday’s dark space’. The colours of the discharge depend
on the gas used. In low pressure air, the positive column is a brilliant pink
and the negative glow is deep blue.

(See practical activity 10.1, Discharge tubes.)

Figure 10.4 Some of the effects observed in discharge tubes (labels:
perforated cathode (cathode glow); cathode rays (electrons); striations
(bright regions); anode; canal rays (positive ions); Crookes’ dark space;
Faraday’s dark space; positive column)

PHYSICS IN FOCUS
Everyday uses of discharge tubes
Neon signs colour the night in every city street. They are long tubes with
most of the air removed. A small amount of gas is introduced which, when
excited by a high potential, glows with a characteristic colour. For example,
when the added gas is neon, the kinetic energy of the electrons is sufficient
to ionise the gas around the cathode causing the emission of a reddish
light.
Fluorescent tubes in the home contain mercury vapour at low pressure.
The light produced is in the ultraviolet region of the electromagnetic
spectrum. To produce visible light, a thin coating of a powder is spread on
the inside surface of the tube. The ultraviolet radiation causes this coating
to fluoresce with the familiar bright white light.

Figure 10.5 Discharge tubes are used in the neon lights that are often a
feature of city skylines at night.



10.2 EFFECT OF ELECTRIC FIELDS ON CATHODE RAYS

Review your study of electric fields from the Preliminary Course topic,
‘Electrical energy in the home’.

You are familiar with three types of fields: gravitational, electric and
magnetic. An electric field exists in any region where an electric charge
experiences a force. There are two types of charge: positive and negative.
We define the direction of the electric field as the direction in which a
positive charge will experience a force when placed in an electric field.
This definition of an electric field allows us to describe the fields around
a charge (see figure 10.6). Using Faraday’s ‘lines of force’, we see that these
lines radiate from a point at the centre of the charge. For a positive charge,
lines of force leave the centre of the charge and radiate in all directions from
it. For a negative charge, the lines are directed radially into the centre of the
charge.
If a positive charge is placed near another positive charge, it will
experience a force of repulsion; that is, a force which acts in the
direction of the arrow.
A number of rules apply to the interpretation of these lines of force
diagrams (see figure 10.6).
• Field lines begin on positive charges and end on negative charges.
• Field lines never cross.
• Field lines that are close together represent strong fields.
• Field lines that are well separated represent weak fields.
• A positive charge placed in the field will experience a force in the
direction of the arrow.
• A negative charge placed in the field will experience a force in the
direction opposite to the arrow.

Figure 10.6 Electric fields around charges
(a) Electric field around a single positive charge
(b) Opposite charges
(c) Like charges

Uniform electric fields

A uniform electric field can be made by placing charges on two parallel
plates which are separated by a small distance compared with their length.
These electric fields are very useful in physics and were used by prominent
scientists such as Robert Millikan and J. J. Thomson (see page 180 onwards)
when investigating the properties of small charged particles.
Consider the electric field between two plates that are separated by d
metres, as shown in figure 10.7(a).

Figure 10.7(a) Electric field (E) between two parallel plates (labels:
separation d in metres; potential difference V in volts; $E = \frac{V}{d}$)



The magnitude, or intensity, of an electric field is determined by
finding the force acting on a unit charge placed at that point. The
symbol for electric field is E.

$E = \frac{F}{q}$

where
E = electric field intensity (in newtons per coulomb, N C$^{-1}$)
F = electric force (in newtons, N)
q = electric charge (in coulombs, C).
When the potential difference, or voltage, is applied to the plates, a
uniform electric field is produced. The strength of this field is the same
at all points between the plates, except near the edges where it ‘bulges’
slightly (see figure 10.7(b)).

Figure 10.7(b) The electric field is uniform except at the edges of the
plates where it bulges slightly.

The magnitude of the electric field, E, is given by:

$E = \frac{V}{d}$

where
V = potential difference, in volts
d = separation of the plates, in metres.
This can be derived by recalling that potential difference is the change in
potential energy per unit charge moving from one point to the other.
The amount of energy or work is given by:

$W = qV$

Also, the work done by a force is the product of the force and the distance
moved, d. In this case, F = qE. Hence, the amount of work is given by:

$W = Fd = qEd$

It follows that: $V = Ed$ or $E = \frac{V}{d}$.
Remember that work done is equal to the gain in energy.
A small positive charge released next to the positive plate will experience a
force that will accelerate the charge. The charge will increase its kinetic
energy.

$W = qV = \frac{1}{2}mv^2$

This shows that the amount of work done depends only on the potential
difference and the charge, and is the same for both uniform and non-
uniform electric fields.
Electric field strength
SAMPLE PROBLEM 10.1
What is the electric field strength between two parallel plates separated
by 5.0 mm, if a potential difference of 48 volts is applied across them?
SOLUTION
V = 48 volts
d = 5.0 mm = $5.0 \times 10^{-3}$ m

$E = \frac{V}{d} = \frac{48}{5.0 \times 10^{-3}} = 9600\ \mathrm{V\,m^{-1}}$ (or $\mathrm{N\,C^{-1}}$)

Moving charge through a potential difference
SAMPLE PROBLEM 10.2
How much work is done moving a charge of 3.6 µC through a potential
difference of 15 volts?
SOLUTION
q = $3.6 \times 10^{-6}$ C
V = 15 volts

$W = qV = 3.6 \times 10^{-6} \times 15 = 5.4 \times 10^{-5}\ \mathrm{J}$
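The two calculations in sample problems 10.1 and 10.2 can be checked with a few lines of code. This is a minimal sketch; the function names are chosen for illustration only and are not part of the original text.

```python
def field_strength(potential_difference, separation):
    """Electric field between parallel plates, E = V / d (V/m)."""
    return potential_difference / separation

def work_done(charge, potential_difference):
    """Work done moving a charge through a potential difference, W = qV (J)."""
    return charge * potential_difference

# Sample problem 10.1: 48 V across plates 5.0 mm apart.
e_field = field_strength(48, 5.0e-3)

# Sample problem 10.2: 3.6 microcoulombs moved through 15 V.
work = work_done(3.6e-6, 15)

print(e_field)  # approximately 9600 V/m
print(work)     # approximately 5.4e-5 J
```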



Velocity of a charge between plates
SAMPLE PROBLEM 10.3
Two parallel plates are separated by a distance of 5.0 mm. A potential
difference of 200 volts is connected across them. A small object with a
mass of $1.8 \times 10^{-12}$ kg is given a positive charge of 12 µC. It is
released from rest near the positive plate. Calculate the velocity gained as
it moves from the positive plate to the negative plate.
SOLUTION
d = $5.0 \times 10^{-3}$ m
V = 200 V
q = $1.2 \times 10^{-5}$ C
m = $1.8 \times 10^{-12}$ kg

$qV = \frac{1}{2}mv^2$

$v^2 = \frac{2qV}{m} = \frac{2(1.2 \times 10^{-5} \times 2.00 \times 10^{2})}{1.8 \times 10^{-12}}$

$v = 5.2 \times 10^{4}\ \mathrm{m\,s^{-1}}$
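The rearrangement used in sample problem 10.3 can be written as a small function. This is an illustrative sketch; the function name `speed_gained` is hypothetical, and the calculation assumes the charge starts from rest and that all of the work qV becomes kinetic energy.

```python
import math

def speed_gained(charge, potential_difference, mass):
    """Speed reached from rest, from qV = (1/2) m v^2 (m/s)."""
    return math.sqrt(2 * charge * potential_difference / mass)

# Sample problem 10.3: q = 12 microcoulombs, V = 200 V, m = 1.8e-12 kg.
v = speed_gained(1.2e-5, 200, 1.8e-12)
print(v)  # about 5.2e4 m/s when rounded to two significant figures
```

Note that the plate separation does not appear in the calculation, in keeping with the statement that the work done depends only on the potential difference and the charge.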

PHYSICS IN FOCUS
Protection against lightning: pointed conductors
Lightning strikes are an example of a massive electrical discharge over a
short interval of time. Large cumulonimbus clouds generate a distribution
of charge between the top of the cloud and the bottom. As the cloud moves
over the ground the negative charge in the cloud repels electrons in the
ground, producing a potential difference between the cloud and the
ground. When this potential difference is large enough to overcome the
resistance of the air, there is a discharge that we see as lightning.
Uncontrolled discharge can be very destructive.
Benjamin Franklin, who experimented with electricity in the 1750s, came
up with the first ‘lightning rod’ to act as a conductor and protect buildings
from damage. The device is based on the fact that the electric field around
a conducting object depends both on its charge and its shape. The field is
strongest near sharp points on the object. The field can become
sufficiently strong so that the air molecules lose electrons, becoming ions.
Eventually, sufficient air molecules are ionised and the air surrounding the
charged body becomes a conductor. The charge can then leak away into the
air (see figure 10.8).
A Van de Graaff generator is round to allow a large electric charge to build
up. When the charge is great enough, an electric spark can jump across a
small gap. If we attach a fine wire with a sharp point to the dome of the
generator, no spark can be obtained. This is because the charge leaks away
into the surrounding air before it builds to a high enough level for a spark
to form.
For buildings, a lightning protection system involves attaching a pointed
metal object at the highest part of the roof and running a system of metal
straps from it to carry the charge safely to ground, where the strap is
buried a metre into the earth. The charge from a lightning strike can drain
quickly through the conductor and prevent a fire.

Figure 10.8 Pointed conductors discharge rapidly in air.
(a) A negatively charged conductor attracts positively charged ions in the
air, and neutralisation occurs rapidly. Positively charged ions are
attracted, electrons jump from the rod to the ions, and negatively charged
ions are repelled.
(b) A positively charged conductor attracts negatively charged ions in the
air, and neutralisation occurs rapidly. Negatively charged ions are
attracted, electrons jump from the ions to the rod, and positively charged
ions are repelled.
(labels: pointed conductor; neutral molecule)



PHYSICS IN FOCUS
The photocopier machine: charged plates
Photocopiers which scan an image and produce a dry image (that is, one
that does not use photographic solutions and special photosensitive paper)
were first developed by Chester Carlson in a process called xerography, on
22 October 1938. The first commercial unit utilising a completely dry
process was the Xerox-914 released in 1960.
Some semiconductors (selenium, arsenic and tellurium) act as
‘photoconductors’. That is, they act as insulators in the dark and electrical
conductors in the light. A thin layer of semiconductor material is deposited
onto the surface of a drum. At the start of a copy cycle, this drum is given a
uniform electrostatic charge. The page to be copied is illuminated with a
strong light and an image of the page is formed by a lens on the charged
photoconductor. White areas light up the photoconductor and it becomes a
conductor so that its charge leaks away to the metal backing. The areas
corresponding to the black parts of the page remain charged. This latent
image is then developed.
A fine powder of small glass beads, covered with a black toner, is gently
brushed over the drum. (The toner is actually a black coloured
thermoplastic substance which melts and impregnates the paper.) The
toner sticks only where there is remaining charge, that is, the black areas
of the original image.
Next, a blank sheet of paper is rolled over the drum and the toner is
transferred to the paper giving an exact image of the original. The image is
then ‘fixed’ by heating this page. Finally, the drum is cleared and made
ready for the next copy cycle.

Figure 10.9 A magnified image from Xerox showing toner particles clinging
to a tiny carrier bead by means of electrostatic forces. From the bead, the
negatively charged toner particles are attracted to the charged drum and
then to the paper.

J. J. Thomson
The work of English physicist Joseph John Thomson (1856–1940) centred
around cathode rays (see figure 10.1, page 174). By incorporating
charged plates inside the cathode ray tube, Thomson was able to verify an
earlier hypothesis by Crookes that cathode rays would be deflected by
electric fields (see figure 10.10). In Thomson’s experiment, the cathode
rays passed between parallel plates connected to a battery. He observed that
the rays were deflected towards the positively charged plate, showing that
the rays behaved as negative charges. (See page 183 for a more detailed
description of Thomson’s breakthrough.)

Figure 10.10 The apparatus for J. J. Thomson’s experiments with cathode
rays (labels: secondary coil of an induction coil; high voltage; glass tube
(evacuated to low pressure); cathode; gas discharge provides free
electrons; anode with slit; double collimator tubes; direction of travel of
the cathode rays; Helmholtz coils (not to scale); charged plates; light)



PHYSICS FACT
Millikan’s oil drop experiment
In 1909, American physicist, Robert A. Millikan, was able to use the
uniform electric field created between two parallel plates to investigate
the properties of a charge. His set-up (see figure 10.11) involved an
atomiser which sprayed a fine mist of oil drops into his apparatus
(region A).
Some drops drifted into region B and came under the influence of the
electric field, E. As the oil drops entered this region they were momentarily
exposed to a beam of X-rays, resulting in some of the oil drops becoming
charged. The gravitational field of the Earth exerts a force directed
vertically down (weight) which can be counteracted by an electric field
produced between parallel plates by a source of variable voltage. The
spaces between the plates could be viewed through a microscope. By
careful adjustments of the voltage it was possible for one drop to be held
stationary, or made to travel with uniform velocity. That is, the forces
acting on the drop were balanced.

Weight force (down) = $F_g = mg$
Electric force (up) = $F_E = Eq$

For this drop to be suspended between plates, $F_g = F_E$.
Having suspended an oil drop, Millikan could then determine the charge
on that particular oil drop by solving for q; that is, $q = \frac{mg}{E}$.
Millikan needed to determine the mass of the oil drop. His approach was
to measure the terminal velocity of the oil drop when the electric field was
turned off and it fell under the force of gravity alone. By using equations
from fluid mechanics, he could calculate the radius of the oil drop. By
using an oil with a known density, he was able to determine the mass of the
oil drop.
Millikan’s remarkable findings, for which he won a Nobel Prize in 1923,
showed that the charge on an oil drop was not of just any arbitrary value.
Instead, the charge always occurred in ‘packets’ or multiples of some
smallest value. This value was calculated to be $1.6 \times 10^{-19}$ C and
was called the ‘elementary charge’, the charge found on an electron.

Figure 10.11 Apparatus for Millikan’s oil drop experiment (labels:
atomiser; region A; plate X; region B; battery; oil drop; electric field E;
plate Y; X-rays; microscope)

Field strength and the charge on an oil drop
SAMPLE PROBLEM 10.4
An oil drop of mass $6.8 \times 10^{-6}$ g is suspended between two parallel
plates which are separated by a distance of 3.5 mm, as shown in figure 10.12.
(a) What is the electric field strength between the plates?
(b) What is the charge that must exist on the oil drop?
(c) How many excess electrons must be present on the oil drop?

Figure 10.12 (labels: positive upper plate and negative lower plate
separated by 3.5 mm; potential difference 110 V; oil drop)

SOLUTION
(a) Using the equation $E = \frac{V}{d}$:

$E = \frac{110}{3.5 \times 10^{-3}} = 3.1 \times 10^{4}\ \mathrm{V\,m^{-1}}$ down.

(b) Since the oil drop is suspended, $qE = mg$, ∴ $q = \frac{mg}{E}$.
Now $6.8 \times 10^{-6}$ g = $6.8 \times 10^{-9}$ kg

∴ $q = \frac{6.8 \times 10^{-9} \times 9.8}{3.1 \times 10^{4}} = 2.1 \times 10^{-12}\ \mathrm{C}$.

(c) Number = $\frac{\text{charge on drop}}{\text{charge on electron}} = \frac{2.1 \times 10^{-12}}{1.6 \times 10^{-19}} = 1.3 \times 10^{7}$ electrons.
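The three steps of sample problem 10.4 can be chained together in code. The sketch below is illustrative only: the function names are hypothetical, the value of 110 V is read from figure 10.12, and g is taken as 9.8 m s⁻² as in the worked solution. Because the code does not round the intermediate answers, its results differ slightly in the third significant figure from the rounded values above.

```python
ELEMENTARY_CHARGE = 1.6e-19  # coulombs, magnitude of the charge on an electron
G = 9.8                      # gravitational field strength, N/kg

def field_strength(potential_difference, separation):
    """Electric field between parallel plates, E = V / d (V/m)."""
    return potential_difference / separation

def charge_on_suspended_drop(mass, e_field):
    """Charge needed to balance the weight, from qE = mg (C)."""
    return mass * G / e_field

def excess_electrons(charge):
    """Number of elementary charges that make up the given charge."""
    return charge / ELEMENTARY_CHARGE

e_field = field_strength(110, 3.5e-3)            # part (a)
charge = charge_on_suspended_drop(6.8e-9, e_field)  # part (b), mass in kg
electrons = excess_electrons(charge)             # part (c)

print(e_field, charge, electrons)
```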

10.3 EFFECT OF MAGNETIC FIELDS ON CATHODE RAYS

Review your study of magnetic fields from the Preliminary Course topic ‘Electrical energy in the home’ and the HSC Course topic ‘Motors and generators’.

Magnetic fields exert forces on electric currents; that is, on moving charged particles. If a particle with charge q is moving with velocity v, perpendicularly to a magnetic field of strength B, the particle will experience a magnetic force F, given by F = qvB.
The direction of the force is given by the right-hand rule. (If the particle has a positive charge, the direction of the conventional current is that of the velocity; if the particle has a negative charge, the direction of the conventional current is opposite to that of the velocity.) This is illustrated in figure 10.13 for an electron (negative charge) and a proton (positive charge).

Figure 10.13 Forces on an electron and a proton moving perpendicularly to a magnetic field

If the velocity is at an angle θ to the magnetic field, the force is given by F = qvB sin θ. To find the direction of the force, use the component of the velocity perpendicular to the magnetic field and the right-hand rule. This is illustrated in figure 10.14 where the direction of the force is into the page.

Figure 10.14 Force on a charged particle moving at an angle, θ, to a magnetic field

If the charged particle is moving parallel to the magnetic field, θ = 0, and therefore F = 0.
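The dependence of the force on the angle θ can be illustrated with a short calculation. This is a sketch only: the function name and the sample values are assumptions chosen for illustration, not data from the text.

```python
import math


def magnetic_force(q, v, B, theta_deg=90.0):
    """Magnitude of the magnetic force, F = |q| v B sin(theta), in newtons.

    theta_deg is the angle between the velocity and the magnetic field.
    """
    return abs(q) * v * B * math.sin(math.radians(theta_deg))


# Illustrative values (assumed): a proton-sized charge moving at 1.0e5 m/s in a 0.50 T field.
for angle in (0, 30, 90):
    print(angle, magnetic_force(1.6e-19, 1.0e5, 0.50, angle))
```

The force is zero when the particle moves parallel to the field (θ = 0) and greatest when it moves perpendicular to it (θ = 90°), where the expression reduces to F = qvB.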

Effects of magnetic fields on electrons


SAMPLE PROBLEM 10.5
An electron of charge $-1.6 \times 10^{-19}$ C is projected into a region where a magnetic field exists, as shown in the diagram. If the velocity of the electron is $2.5 \times 10^{4}$ m s$^{-1}$, determine:
(a) the force on the electron at the instant it enters the magnetic field
(b) the shape of the path which the electron follows.

Figure 10.15 (magnetic field $B = 2.0 \times 10^{-2}$ T directed into the page)

SOLUTION
(a) $$F = qvB \sin\theta = 2.0 \times 10^{-2} \times 1.6 \times 10^{-19} \times 2.5 \times 10^{4} = 8.0 \times 10^{-17}\ \text{N downwards.}$$
(b) The path that the electron follows will be circular. This is because the magnetic force is always acting perpendicular to the velocity of the electron.
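As a hedged extension of this sample problem, the radius of the circular path can be estimated by treating the magnetic force as the centripetal force, so that qvB = mv²/r. The sketch below assumes an electron mass of 9.1 × 10⁻³¹ kg (the value given in the chapter review data); the function names are hypothetical.

```python
ELECTRON_CHARGE = 1.6e-19  # magnitude of the electron charge, C
ELECTRON_MASS = 9.1e-31    # kg (assumed from the chapter review data)


def force_perpendicular(q, v, B):
    """Magnetic force on a charge moving at right angles to the field: F = |q| v B."""
    return abs(q) * v * B


def orbit_radius(m, v, q, B):
    """Radius of the circular path from qvB = m v^2 / r, so r = m v / (|q| B)."""
    return m * v / (abs(q) * B)


F = force_perpendicular(ELECTRON_CHARGE, 2.5e4, 2.0e-2)
r = orbit_radius(ELECTRON_MASS, 2.5e4, ELECTRON_CHARGE, 2.0e-2)
print(f"F = {F:.2g} N, r = {r:.2g} m")
```

With these values the radius comes out at a few micrometres, which shows how tightly a slow electron is turned by even a modest field.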



10.4 DETERMINING THE CHARGE-TO-MASS RATIO OF CATHODE RAYS

J. J. Thomson was an intuitive and brilliant experimentalist. Following on from his experiment showing that cathode rays were deflected by electric fields, he succeeded in measuring the charge-to-mass ratio of the cathode ray particles, called electrons. Thomson built a cathode ray tube with charged parallel plates (called capacitor plates) to provide a uniform electric field and a source of uniform magnetic field. Using this apparatus, he investigated the effect of cathode rays passing through both fields (see figure 10.16). The fields were oriented at right angles to each other and this had the effect of producing forces on the cathode rays that directly opposed each other (see the ‘Physics fact’ below).

The name ‘electrons’ was first suggested for the natural unit of electricity by G. J. Stoney in 1891.

Thomson’s experiment involved two stages:
1. varying the magnetic and electric fields until their opposing forces cancelled, leaving the cathode rays undeflected. This effect is shown in figure 10.16(a). By equating the magnetic and electric force equations, Thomson was able to determine the velocity of the cathode-ray particles.
2. applying the same strength magnetic field (alone) and determining the radius of the circular path travelled by the charged particles in the magnetic field (see figure 10.16(b)).
Thomson combined the results and obtained the magnitude of the charge-to-mass ratio for the charged particles that constituted cathode rays.

Figure 10.16 (a) A beam of negatively charged particles left undeflected by the combination of a magnetic field out of the page, and an electric field up the page (b) A negatively charged particle deflected by a magnetic field out of the page. The mechanics of circular motion describes the path, with the centripetal force provided by the magnetic force acting on the particle.
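The two stages can be combined algebraically. The following is a sketch of the standard derivation rather than a reproduction of Thomson's own working; it assumes E and B are the field strengths used in stage 1, r is the radius measured in stage 2, and m is the mass of the particle.

```latex
% Stage 1: electric and magnetic forces cancel, which fixes the speed.
\[ qE = qvB \quad\Rightarrow\quad v = \frac{E}{B} \]

% Stage 2: the magnetic force alone provides the centripetal force.
\[ qvB = \frac{mv^{2}}{r} \quad\Rightarrow\quad \frac{q}{m} = \frac{v}{Br} \]

% Substituting the speed from stage 1 gives the charge-to-mass ratio
% in terms of measurable quantities only.
\[ \frac{q}{m} = \frac{E}{B^{2} r} \]
```

Neither the charge nor the mass needs to be known separately; only their ratio follows from E, B and r.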

PHYSICS FACT

When charged particles enter an electric field they follow a trajectory under the influence of an electric force.

Figure 10.17 (electron between a positive upper plate and a negative lower plate, showing the direction of force on the electron)

Similarly, when a charged particle enters a magnetic field, it experiences a magnetic force. The direction of this force is given by the right-hand palm rule.

Figure 10.18 (electron entering a magnetic field, B, directed into the page)

We can combine these two effects by arranging the electric field, magnetic field and the velocity of the particle at right angles to each other. For example, by adjusting the strengths of the electric and magnetic fields, their effects on the motion of a charged particle can cancel each other out. The particles can then travel along a straight path.

Figure 10.19 (particle with velocity v in perpendicular fields E and B)

In figure 10.10 (see page 180), there are two sets of electric fields. The first accelerates the electrons through a set of collimators to produce a narrow beam. This beam then passes through a combination of electric and magnetic fields that can be adjusted.
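The cancelling of the two forces can be put into numbers. When the electric force qE equals the magnetic force qvB, the charge divides out and only particles with speed v = E/B travel straight through. The sketch below uses assumed, purely illustrative field strengths and hypothetical function names.

```python
def undeflected_speed(E, B):
    """Speed at which the electric and magnetic forces cancel: qE = qvB, so v = E / B."""
    return E / B


def net_force(q, v, E, B):
    """Magnitude of the net force when the two forces act in opposite directions."""
    return abs(q) * abs(E - v * B)


# Assumed values for illustration only.
E_FIELD = 2.0e4   # electric field strength, V/m
B_FIELD = 5.0e-3  # magnetic field strength, T

v_straight = undeflected_speed(E_FIELD, B_FIELD)
print(v_straight)                                        # speed of the undeflected particles, m/s
print(net_force(1.6e-19, v_straight, E_FIELD, B_FIELD))  # approximately zero at this speed
```

Particles moving faster or slower than this speed experience a non-zero net force and are deflected.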



10.5 CATHODE RAYS: WAVES OR PARTICLES?
In 1875, twenty years after their discovery, William Crookes (1832–1919)
designed a number of tubes to study cathode rays (some of these are
shown in figure 10.3, page 175). He found that the cathode rays did not
penetrate metals and travelled in straight lines. It was initially thought
that the rays may be an electromagnetic wave because of the similarity in
their behaviour to light. This was discounted when Crookes discovered
that the cathode rays were deflected by magnetic fields, an effect which
did not occur with light.
In a paper read to the Paris Academy of Science in 1895, Jean Perrin (1870–1942) described the two main hypotheses concerning the nature of cathode rays:
Some physicists think, with Goldstein, Hertz and Lenard, that this phenomenon is like light of very short wavelength. Others think, with Crookes and J. J. Thomson, that these rays are formed by matter which is negatively charged and moving with great velocity, and on this hypothesis their mechanical properties, as well as the manner in which they curve in a magnetic field, are readily explicable.

(See practical activity 10.2, Properties of cathode rays.)
The way that physicists set out to understand the nature of cathode rays
shows how the scientific method is used to solve problems. That is, observ-
ations from experiments are interpreted and a hypothesis developed to
explain what is thought to be happening. Opposing models may arise,
with supporters of each side arguing strongly for their belief. The argu-
ment may eventually be resolved either by improved experiments or with
greater understanding of the phenomenon.
In this case, the debate about whether cathode rays were electro-
magnetic waves or streams of charged particles remained unsolved until
1897, when J. J. Thomson showed beyond doubt that the rays were
streams of negatively charged particles, which we now call electrons. Why
was the debate so prolonged? The problem was the apparently
inconsistent behaviour of rays. For example, the following observations
from cathode ray experiments fitted the wave model:
• they travelled in straight lines
• if an opaque object was placed in their path, a shadow of that object
appeared
• they could pass through thin metal foils without damaging them.
The following observations fitted the particle model:
• the rays left the cathode at right angles to the surface
• they were obviously deflected by magnetic fields
• they did not appear to be deflected by electric fields
• small paddlewheels turned when placed in the path of the rays
• they travelled considerably more slowly than light.
The main restriction for the charged particle theory was the absence of
deflection in electric fields. However, Thomson showed that this was due
to the rays themselves. He stated:
‘. . . on repeating the experiment I first got the same result, but
subsequent experiments showed that the absence of deflection is
due to the conductivity conferred on the rarefied gas by the
cathode rays. On measuring this conductivity . . . it was found to

decrease very rapidly with the exhaustion of the gas . . . at very high exhaustions there might be a chance of detecting the deflection of cathode rays by an electrostatic force.’

Within the tube, the cathode rays ionised the gas. The ions were attracted to the plate with the opposite charge and the line-up of ions effectively neutralised the charge on the plate, allowing the cathode rays to pass by unaffected.
After evacuating the chamber, Thomson observed deflection and that the particles were always deflected towards the positive plate, which confirmed that they were negatively charged particles. The deflection of cathode rays in tubes of different gas pressure is shown in figure 10.20.

Figure 10.20 The path of cathode rays (a) at high gas pressure and (b) at low gas pressure

The ability of cathode rays to penetrate thin metal foils was still unexplained. The answer lay, not simply with the properties of cathode rays, but with the model of the atom. If the atom was not a solid object, but much more open, it might be possible for very small particles to pass through thin foil. Although not considered at this time, Ernest Rutherford (1871–1937) would use a similar approach to change the model of the atom (discussed in chapter 22, pages 419–423).

PHYSICS FACT
X-rays: discovery and application

In 1895, a type of radiation was discovered by Wilhelm Röntgen (1845–1923) while he was experimenting with cathode rays. He found that, in a dark room, a screen covered with a sensitive fluorescent material (barium platino-cyanide) glowed when it was placed near the end of a cathode ray tube (see figure 10.21).

Figure 10.21 Röntgen’s apparatus (a glass tube surrounded with dark paper, with a high electric potential across it; cathode rays, which are high-speed electrons, travel along the tube and invisible rays, the X-rays, reach the screen)

Since cathode rays could not pass through the glass at the end of the tube, he deduced that this fluorescence must be due to a new form of radiation. He called this radiation ‘X-rays’ as their properties were not known. Later research showed that X-rays were produced when high-speed electrons interacted with matter, such as the glass in the cathode ray tube.
X-rays were later found to be electromagnetic waves, similar to light but with a much smaller wavelength (see chapter 13, pages 235–239, for further discussion of X-rays).
X-rays are so useful because they can:
• penetrate many substances
• expose photographic film
• cause certain substances to fluoresce
• be reflected and refracted.
The most common use of X-rays is in the field of medicine, for diagnosing illness or injury as well as treating illnesses such as cancer. X-ray machines are also used widely to check luggage at airports, analyse the welding of metal parts in an aircraft wing, and look at things that we otherwise could not see.

Figure 10.22 X-ray of a knee replacement



10.6 APPLICATIONS OF CATHODE RAYS
Cathode ray tubes have a multitude of applications, ranging from small
screens at an automatic teller machine through to radar screens,
television or computer monitors, and cathode ray oscilloscopes (CRO).
They are also found in many different medical instruments. They are all
based on the simple cathode ray tube.

Parts of a cathode ray tube


The main parts of a cathode ray tube are the electron gun, the deflecting plates and the fluorescent screen (see figure 10.23).
• In the electron gun, the heating filament heats the cathode, releasing electrons by thermionic emission. A number of electrodes are used to control the ‘brightness’ of the beam, to focus the beam and accelerate the electrons along the tube. Electrons are negatively charged particles and the positively charged anode develops a strong electric field that exerts a force on the electrons, accelerating them along the tube.
• Two sets of parallel deflecting plates are charged to produce an electric field that can deflect the beam of electrons separately, up or down and left or right. These fields are used to move the beam so that the electrons can be directed to all points on the fluorescent screen. Electric current passing through coils around the cathode ray tube produces magnetic fields that control the movement of the electron beam (as outlined on page 183). This ‘trace’ on the screen produces the visible output, such as the picture on a television.
• The glass screen is coated with layers of a fluorescent material. It emits light when high energy electrons strike it.

Figure 10.23 The parts of a simple cathode ray tube

Television
Television sets use a cathode ray tube as their output device. A colour tele-
vision camera records images through three coloured filters — red, blue
and green (see figure 10.24). The information is transmitted to the receiver
which then directs the appropriate signal to one of three electron guns,
each corresponding to one of the primary colours. The picture is then
reconstituted on the screen by an additive process involving three coloured
phosphors. Each electron gun stimulates its appropriate phosphor.
Each television image is made up of 625 horizontal lines of dots. The current in the deflection coils is varied so that the electron beam sweeps across the screen, building up the picture line by line. Each picture is formed from two passes of the electron beam: the odd-numbered lines are drawn first, then the beam ‘flies’ back to the start and ‘draws in’ the even-numbered lines. Each scan takes one-fiftieth of a second. This is shorter than the time that the retina/brain system retains each image, so the screen does not seem to flicker.
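The scanning figures quoted above lead directly to the picture rate and line rate. The following is a simplified sketch using only the numbers in the paragraph (625 lines, two scans per image, one-fiftieth of a second per scan); it ignores the time spent on flyback, and the function names are hypothetical.

```python
LINES_PER_IMAGE = 625   # horizontal lines in each complete image
SCANS_PER_IMAGE = 2     # odd-numbered lines, then even-numbered lines
SCAN_TIME = 1 / 50      # seconds for each scan


def image_time():
    """Time to build one complete image from its two scans, in seconds."""
    return SCANS_PER_IMAGE * SCAN_TIME


def images_per_second():
    """Number of complete images drawn each second."""
    return 1 / image_time()


def lines_per_second():
    """Number of horizontal lines drawn each second."""
    return LINES_PER_IMAGE * images_per_second()


print(image_time(), images_per_second(), lines_per_second())
```

On these assumptions a complete image takes one twenty-fifth of a second, giving 25 images and about 15 600 lines drawn every second.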



Figure 10.24 (a) Simplified side view of a colour television (b) The main parts of a colour television

Dots of phosphorescent paint on the screen convert the energy of the electron beam into light. After being excited, the phosphor continues to emit light for some time, which helps to minimise screen flicker. In a colour television, the phosphorescent dots come in three colours (red, green and blue) and the many colours that you see on the screen are formed from combinations of these three.
Three electron beams scan the screen. They come from slightly different directions through holes in a shadow mask to control the brightness of the three sets of phosphors (see figure 10.25).

Figure 10.25 Enlarged section of the shadow mask in the cathode ray tube of a TV set. The beams of electrons are directed through the mask and hit a pattern of phosphor dots on the inside of the screen.

Cathode ray oscilloscopes
The introduction of electronic control systems into all forms of science and industry has seen the cathode ray oscilloscope (CRO) become one of the most widely utilised test instruments. Because of its ability to make ‘voltages’ visible, the cathode ray oscilloscope is a powerful diagnostic and development tool.
A CRO uses a cathode ray tube to display a variety of electrical signals. The horizontal deflection is usually provided by a time base (or sweep generator), which allows the voltage (on the vertical axis) to be plotted as a function of time (on the horizontal axis). This enables complex waveforms or very short pulses to be displayed and measured. Figure 10.26 shows the basic controls and the front panel of a typical CRO.



Figure 10.26 (a) The basic controls of a single trace CRO (b) The front panel of a typical CRO

The timebase control allows the technician to select a variety of sweep rates. This sets the ‘time per division’ for the figure drawn on the screen. The bottom rotary switches control the amplitude of the displayed waveform. Each centimetre of the grid can then be used to measure the voltage of the input waveform.
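Reading a trace from the grid is a matter of multiplying the number of divisions by the control settings. The sketch below assumes a hypothetical trace and hypothetical settings (5 ms per division and 2 V per division); the function names are illustrative only.

```python
def period_from_trace(divisions_per_cycle, time_per_division):
    """Period of the waveform: horizontal divisions per cycle x time per division."""
    return divisions_per_cycle * time_per_division


def frequency_from_trace(divisions_per_cycle, time_per_division):
    """Frequency is the reciprocal of the period."""
    return 1 / period_from_trace(divisions_per_cycle, time_per_division)


def peak_voltage(divisions_to_peak, volts_per_division):
    """Peak voltage: vertical divisions from the centre line to the peak x volts per division."""
    return divisions_to_peak * volts_per_division


# Assumed readings: one cycle spans 4 divisions, and the peak is 3 divisions above the centre line.
print(period_from_trace(4, 5e-3))     # period in seconds
print(frequency_from_trace(4, 5e-3))  # frequency in hertz
print(peak_voltage(3, 2.0))           # peak voltage in volts
```

For these assumed readings the waveform has a period of 20 ms, a frequency of 50 Hz and a peak voltage of 6 V.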
Figure 10.27 shows the major parts of the oscilloscope. The input
waveform enters from the left on this diagram. As can be seen, one part
of the signal is amplified and the voltage produced goes to the vertical
deflecting plates. The other part passes to the trigger and time base and
is then passed to the horizontal deflection plates.
The combination of both signals produces the waveform displayed on the front screen.
There are many types of cathode ray oscilloscope: single, double or
multiple trace; analogue and digital; and storage CROs that allow
technicians to record and store waveforms for comparison.

Figure 10.27 The main parts of a CRO (the signal input passes through the A.C./D.C. switch and Y-amplifier to the Y-plates; the trigger and time base drive the X-plates, moving the spot across the screen and blanking it out during flyback)



CHAPTER REVIEW

SUMMARY
• Cathode ray tubes were used to investigate the properties of cathode rays.
• Cathode rays were found to be negatively charged particles.
• Charged parallel plates produce a uniform electric field.
• A charged particle moving with a velocity, v, at an angle, θ, through a magnetic field of strength, B, experiences a force, F. The magnitude, in newtons, is given by:
$$F = qvB \sin\theta.$$
F, v and B are all vector quantities, and each has a direction associated with it.
• The strength of a uniform electric field, E, in volts per metre, produced by parallel plates, separated by a distance, d, and charged by an applied voltage, V, is given by:
$$E = \frac{V}{d}.$$
• Thomson’s experiment, using perpendicular electric and magnetic fields, allowed him to determine the charge-to-mass ratio of an electron. The value he determined was $1.759 \times 10^{11}$ C kg$^{-1}$.
• Cathode ray tubes are made up of a number of components including an electron gun, the electric field and a fluorescent screen.
• Cathode ray tubes have many applications including cathode ray oscilloscopes and televisions.
• Cathode ray oscilloscopes enable engineers to develop new electronic circuits by making the behaviour of pulses or waves visible.

QUESTIONS
Note: The charge on a single electron is taken as $-1.6 \times 10^{-19}$ C.
1. A beam of electrons moves at right angles to a magnetic field of flux density $6.0 \times 10^{-2}$ T. The electrons have a velocity of $2.5 \times 10^{7}$ m s$^{-1}$. What is the magnitude of the force acting on each electron?
2. A stream of doubly ionised particles (missing two electrons and therefore carrying a positive charge of twice the electronic charge) moves at a velocity of $3.0 \times 10^{4}$ m s$^{-1}$ perpendicular to a magnetic field of $9.0 \times 10^{-2}$ T. What is the magnitude of the force acting on each ion?
3. An electron is travelling at right angles to a magnetic field of flux density 0.60 T with a velocity of $1.8 \times 10^{6}$ m s$^{-1}$. What is the force experienced by the particle?
4. Given that the mass of the electron in question 3 is $9.1 \times 10^{-31}$ kg, what is the acceleration of the particle in the direction of the force acting on it?
5. A pair of parallel plates is arranged as shown in figure 10.28. The plates are 5.0 cm apart and a potential difference of 200 V is applied across them.

Figure 10.28 (parallel plates 5 cm apart with 200 V applied across them)

Data:
Charge on electron = $-1.6 \times 10^{-19}$ C
Mass of electron = $9.1 \times 10^{-31}$ kg
Mass of proton = $1.67 \times 10^{-27}$ kg
(a) Calculate the magnitude and direction of the electric field between the plates.
(b) Calculate the force acting on an electron placed between the plates.
(c) Calculate the force acting on a proton placed between the plates.
(d) Explain why these two forces are different.
(e) Calculate the work done in moving both the electron and the proton from one plate to the other.
6. Look up the meaning of ‘conservation of charge’. In a discharge tube, cathode rays were formed and moved from the cathode (the negative electrode). If these rays carried an electric charge, where was the corresponding amount of positive charge?
7. Draw the electric field lines between a positive and negative charge of equal magnitude. In which area is the electric field strongest?
8. Calculate the electric force on a charge of $1.0 \times 10^{-6}$ C placed in a uniform electric field of 20 N C$^{-1}$.
9. Draw the electric field lines between two parallel plates placed 5.0 cm apart.



10. The electric field between parallel plates can be considered ‘uniform’ only in the region between the plates that is well away from the edges of the plates. What is meant by this statement?
11. Negatively charged latex spheres are introduced between two charged plates and are held stationary by the electric field. Each sphere has a mass of $2.4 \times 10^{-12}$ kg and the strength of the field required to counter their weight is $4.9 \times 10^{7}$ N C$^{-1}$. Sketch this arrangement, identifying the positive and the negative plate, and determine the charge on the spheres.
12. Two parallel plates are separated by a distance of 10.0 cm. The potential difference between the plates is 20.0 V.
(a) Calculate the electric field between the plates, assuming the field to be uniform.
(b) A charge of $+2.0 \times 10^{-3}$ C is placed in the field. Calculate the force acting on this charge.
13. A charge of 5.25 mC, moving with a velocity of 300 m s$^{-1}$ due north east, enters a uniform magnetic field of 0.310 T directed vertically downwards, into the page. Calculate the magnetic force on the charge.

Figure 10.29 (charge q moving with velocity v at 45° east of north in a magnetic field B = 0.310 T directed into the page)

14. Why can cathode rays be observed and manipulated within a vacuum tube and not in air?
15. Describe the forces acting on charged particles entering a uniform magnetic field at right angles.
16. If charged particles enter a magnetic field at angles other than at right angles, describe their path.
17. List the properties of cathode rays which can be described:
(a) as wave motion
(b) by a particle model.
18. Explain how the properties of cathode rays were demonstrated using the evacuated tubes in which a metal cross was mounted in the path of the rays, and in which a small paddle wheel was able to roll along glass rails.
19. Describe the path of an electron when it enters the region between parallel plates across which a potential difference of 1500 V is applied. Sketch the arrangement for an electron entering with a horizontal velocity of $2.4 \times 10^{4}$ m s$^{-1}$ at right angles to the electric field.
20. Describe the conditions needed for an electron entering a magnetic field to undergo uniform circular motion.
21. In a tube similar to that used in Thomson’s electromagnetic experiment, a magnetic field of $1.00 \times 10^{-2}$ T is sufficient to allow the electrons to pass undeflected through the electric deflection plates. The plates are 10 mm apart and have a potential difference of 300 V across them.
(a) What is the strength of the electric field between the plates?
(b) What was the speed of the electrons as they entered the region between the plates?
(c) What was the strength of the magnetic force acting on the electrons?
22. Using the library or the internet, research how J. J. Thomson used his understanding of the nature of cathode rays to develop a new model of the atom.
23. Compare the methods used to control the movement of the cathode-ray beam in a CRO and in a television set.



PRACTICAL ACTIVITIES

10.1 DISCHARGE TUBES

Aim
To observe the effect that different gas pressures have on an electric discharge passed through a discharge tube.

Apparatus
power pack
two plug–plug leads
one set of discharge tubes (with varying pressures)
induction coil
two plug–clip leads

Theory
The high voltage produced by the induction coil is applied across the terminals inside the discharge tubes. One plate (the cathode) becomes highly negative and releases a ray (cathode ray or electron). The electron passes through the gas in the tube and excites electrons in the atoms of the gas contained in the tube. The pressure of the gas determines the density of the atoms and therefore the nature of the collisions which take place between the electrons and atoms. Therefore, different discharge effects under different pressures can be observed (refer back to figure 10.4, page 176).

Method
Safety note
When the induction coil is connected to the discharge tube, X-rays are produced. However, it is the cathode rays hitting the glass or metal within the discharge tube that creates the X-rays, not the induction coil. If the experiment uses a minimum operating voltage these X-rays will be of a low energy and are significantly reduced after passing through the glass.
We need to deal with induction coils with extreme care because of the high voltages associated with them. Your teacher will set up the equipment related to the induction coil.
1. Attach the induction coil to the power pack using the two plug–plug leads. Adjust the points on the induction coil to obtain a continuous spark from the coil. Switch off the power pack.
2. Set the power pack at the correct setting for the induction coil (usually 6 volts) and turn it on.
3. Attach the negative terminal of the induction coil to the cathode of the discharge tube marked with the highest pressure (40 mmHg) and attach the positive terminal to the other end as shown in figure 10.30. Switch on the power pack.
4. Sketch a diagram of the pattern observed in this tube and describe it carefully.
5. Repeat the above procedure using each of the discharge tubes and see if you can observe streamers, Faraday’s dark space, cathode glow, Crookes’ dark space, striations and the positive column. Carefully describe each pattern, identifying each of the effects mentioned. (Tubes to be used should be 40 mmHg, 10 mmHg, 6 mmHg, 3 mmHg, 0.14 mmHg and 0.03 mmHg. These are represented as A, B, C, D, E and F in figure 10.30.)

Figure 10.30 Set-up for practical activity 10.1 (DC power pack connected to the induction coil, which is connected to the cathode and anode of discharge tubes A to F)

Questions
1. What effects were common throughout all tubes?
2. If the striations are produced by electrons (cathode rays) striking atoms and causing light to be released, give an explanation for the occurrence of variation in the patterns for different pressures.



10.2 PROPERTIES OF CATHODE RAYS

Aim
To determine some of the properties of the rays which come from the cathode of a discharge tube.

Apparatus
two power packs
two plug–plug leads
one pair of magnets
induction coil
four plug–clip leads
discharge tubes (Maltese cross, electric plates, rotating wheel, screen display)

Theory
This experiment will most likely be performed as a class demonstration by your teacher. The discharge tubes used are illustrated in figure 10.3, page 175 and are similar to those Sir William Crookes would have used.

Method
Before starting, it would be advisable to read the ‘Analysis’ section of this experiment so as to plan what you should record during the experiment.
1. Connect the power pack to the induction coil and set it at 6 volts. Adjust the points on the induction coil so that a strong steady spark is being produced, as in practical activity 10.1.
2. Connect the terminals of the induction coil to the discharge tube containing the Maltese cross (Crookes’ tube). Observe the end of the tube containing the cross when the cross is down and when it is up.
3. Replace the Crookes’ tube with the tube containing the electric plates and connect the terminals of the plate to its high DC voltage supply. Observe the effects of the electric field on the cathode rays.
4. Connect the tube with the fluorescent screen display to the induction coil and record the effect of placing a set of bar magnets around the cathode rays as shown in figure 10.3 (page 175).
5. Finally, attach the tube containing the glass wheel on tracks to the induction coil and observe the effects that the cathode rays have on the wheel when the tube is horizontal.

Analysis
1. For each of the tubes placed in the circuit, sketch a diagram of the tube and the effect caused by the cathode rays.
2. Using the laws of electromagnetism, determine the charge that is evident on the cathode rays.

Questions
1. What are five properties of cathode rays which can be deduced from this experiment?
2. From these results, can we conclusively say that the cathode rays are electrons? Why or why not?



CHAPTER 11 THE PHOTOELECTRIC EFFECT AND BLACK BODY RADIATION

Figure 11.1 Radio telescopes are pointed into space. They collect radio waves and other electromagnetic radiation from galaxies. The SETI Project (Search for Extra-Terrestrial Intelligence) utilises radio telescopes to ‘listen’ for intelligent signals from other intelligent beings. The electromagnetic spectrum we know today extends from the wavelength of gamma rays, as small as $10^{-14}$ m, through to radio waves with wavelengths of $10^{5}$ m. That knowledge has come from the work of two of the giants of science, James Clerk Maxwell and Heinrich Hertz.

Remember
Before beginning this chapter, you should be able to:
• define and apply the terms medium, displacement, amplitude, period, crest, trough, transverse wave, frequency, wavelength and velocity to the wave model
• recall the terms velocity, frequency and wavelength, and their appropriate units, and solve numerical problems using v = f λ
• recall that different types of radiation make up the electromagnetic spectrum and that they are propagated through space at a constant speed, c
• explain that the relationship between the intensity of electromagnetic radiation and the distance from a source is determined by the inverse square law $I \propto \dfrac{1}{d^{2}}$
• recall the arguments in the previous chapter about the nature of cathode rays
• recall that the wavelength of the radiation emitted from an object depends on its temperature (black body radiation).

Key content
At the end of this chapter you should be able to:
• outline Hertz’s experiments with the speed of radio waves, their properties compared to light, and the photoelectric effect
• describe the model of the black body and its role in understanding the particle nature of light
• identify the contributions that Planck and Einstein made to the concept of quantised electromagnetic energy and describe how this relates to the particle model of light
• identify the relationships between photon energy, frequency, speed of light and wavelength by using the formulas E = hf and c = f λ
• outline the use of the photoelectric effect in breathalysers, solar cells and photocells.
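As a brief preview of how the formulas E = hf and c = f λ fit together, the sketch below calculates the frequency and photon energy at the two ends of the spectrum mentioned in the caption to figure 11.1. The constants are rounded standard values and the function names are hypothetical; none of this is taken from the chapter's own worked examples.

```python
PLANCK = 6.626e-34       # Planck's constant, J s (rounded standard value)
SPEED_OF_LIGHT = 3.00e8  # speed of light, m/s (rounded standard value)


def frequency(wavelength):
    """Frequency from c = f * lambda, in hertz."""
    return SPEED_OF_LIGHT / wavelength


def photon_energy(wavelength):
    """Photon energy from E = h f, in joules."""
    return PLANCK * frequency(wavelength)


for name, wavelength in (("gamma ray", 1e-14), ("radio wave", 1e5)):
    print(name, frequency(wavelength), photon_energy(wavelength))
```

The short-wavelength gamma ray has a far higher frequency, and therefore a far greater photon energy, than the long-wavelength radio wave.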
In this chapter we will look at the changing ideas regarding the nature of
light in the latter part of the nineteenth century and the early twentieth
century. We will also look at how changes in theory and experimentation
led to an understanding of black body radiation and the photoelectric
effect and also introduced the ‘quantum theory’.

11.1 MAXWELL’S THEORY OF
ELECTROMAGNETIC WAVES
The passage of light across the vast universe was
apparent to all who looked toward the heavens and
saw the stars. Explaining how that could be was
another matter. According to previous wave theory,
waves were propagated through a medium. What
was the medium in which light travelled?
One of the problems that nineteenth century
scientists had in understanding how electromag-
netic waves carry energy through the vacuum of
space was that mechanical waves vibrate in a
medium. Was there a medium filling space? One
name given to this unproven medium was the
luminiferous aether. The presence or absence of the
aether was hotly debated. Proof of its presence or
absence became one of the great goals of science.
In the end, Albert Einstein simply said that its
existence or absence was irrelevant. It could not be
detected and made no difference to the passage of
light.

Figure 11.2 James Clerk Maxwell (1831–1879)

Based on observations that a changing magnetic
field induces an electric field in the region around
a magnet, and that a magnetic field is induced in
the region around a conductor carrying an electric
current, James Clerk Maxwell concluded that the
mutual induction of time- and space-changing
electric and magnetic fields should allow the
following unending sequence of events.
• A time-varying electric field in one region produces a time- and space-
varying magnetic field at all points around it.
• This varying magnetic field then similarly produces a varying electric
field in its neighbourhood.
• Thus, if an electromagnetic disturbance is started at one location (for
example, by vibrating charges in a hot gas or in a radio antenna) the
disturbance can travel out to distant points through the mutual gener-
ation of electric and magnetic fields.
• The electric and magnetic fields propagate through space in the form
of an ‘electromagnetic wave’ (illustrated in figure 11.3).
The upshot of this sequence of events was clear. Light, and indeed any
electromagnetic wave, does not need a medium to propagate. Electro-
magnetic waves are self-propagating. Once started, they have the capacity
to continue forever without continuous energy input. You are probably
familiar with the idea that the light we see coming from distant stars took
millions, if not billions, of years to reach the Earth. It is possible that the
origin of that light no longer exists, yet you can still see the light that
emanated from the object.

194 FROM IDEAS TO IMPLEMENTATION


Figure 11.3 A diagrammatic representation of an electromagnetic wave
(Diagram labels: changing electric field E, changing magnetic field B, direction of energy transfer.)

Maxwell’s theory gave a definite connection between light and electricity.
In a paper titled ‘A dynamical theory of the electromagnetic field’,
which he presented before the Royal Society in 1864, Maxwell expressed
four fundamental mathematical equations that have become known as
‘Maxwell’s equations’.
Maxwell’s equations predicted that light and electromagnetic waves
must be transverse waves and that the waves must all travel at the speed of
light. They also implied that a full range of frequencies of electro-
magnetic waves should exist. In other words, the equations suggested the
existence of an electromagnetic spectrum.
At the time of these predictions, only visible light, infra-red and ultraviolet
radiation were known and confirmed to exist. One look at the spectrum shown in figure
11.4 allows you to see how little was known of the complete electromagnetic
spectrum known to exist today. Maxwell’s equations also suggested that the
speed of all waves of the full electromagnetic spectrum, if they did exist, was
a definite quantity that he estimated as $3.11 \times 10^{8}$ m s$^{-1}$. Maxwell’s theoretical
calculations were supported by the experimental data of French physicist
Armand Hippolyte Louis Fizeau (1819–1896) who had determined a figure
very close to this for the speed of light. In 1849, Fizeau’s experiments to
measure the speed of light had obtained a value of $3.15 \times 10^{8}$ m s$^{-1}$.

Figure 11.4 The electromagnetic spectrum
(Diagram: in order of increasing frequency and energy, and decreasing wavelength, the bands are radio waves (emergency and defence radio, AM radio, TV and FM radio, microwaves and radar), infra-red radiation, the visible spectrum, ultraviolet radiation, X-rays and gamma rays. Frequency axis: $10^{4}$ to $10^{22}$ hertz. Wavelength axis: $10^{6}$ to $10^{-14}$ metres.)

CHAPTER 11 THE PHOTOELECTRIC EFFECT AND BLACK BODY RADIATION 195


11.2 HEINRICH HERTZ AND
EXPERIMENTS WITH RADIO WAVES
Maxwell’s two most important predictions were that:
• electromagnetic waves could exist with many different fre-
quencies
• all such waves would propagate through space at the speed of
light.
In 1886, Heinrich Hertz conducted a series of experiments
that verified these predictions. Unfortunately, Maxwell had
died in 1879 and did not see this experimental confir-
mation of the theoretical predictions of his equations.
Hertz reasoned that he might be able to produce some of the
electromagnetic waves predicted by Maxwell’s equations, with
frequencies other than those of visible light. He thought he
could produce these waves by creating a rapidly oscillating
electric field with an induction coil that caused rapid
sparking across a gap in a conducting circuit.
In his experiments that confirmed Maxwell’s predic-
tions, Hertz used an induction coil to produce sparks
between the spherical electrodes of the transmitter. He
observed that when a small length of wire was bent into
a loop so that there was a small gap and held near the
sparking induction coil, a spark would jump across the
gap in the loop. He observed that this occurred when a
spark jumped across the terminals of the induction coil
(see figure 11.6). This sparking occurred even though the
loop was not connected to a source of electrical current.
Hertz concluded this loop was a detector of the electromagnetic
waves generated by the transmitter. This provided the first
experimental evidence of the existence of electromagnetic waves.

Figure 11.5 Heinrich Hertz (1857–1894)

Figure 11.6 Hertz, using an induction coil and a spark gap, succeeded in generating and
detecting electromagnetic waves. He measured the speed of these waves, observed their
interference, reflection, refraction and polarisation. In this way, he demonstrated that they all
have the properties characteristic of light.
(Diagram labels: transmitter with oscillating spark, connected to a high voltage source; receiver with spark induced by arriving radio waves; separation up to several hundred metres.)

Hertz then showed that these new electromagnetic waves could be
reflected from a metal mirror, and refracted as they passed through a
prism made from pitch. This demonstrated that the waves behaved
similarly to light waves in that they could be reflected and refracted.
Additionally, Hertz was able to show that, like light, the new electromagnetic
waves could be polarised. By rotating the receiver loop, Hertz showed that the
waves originating from the electrodes connected to the induction coil behaved
as if they were polarised. When the detector
loop was perpendicular to the transmitter gap, the radio waves from the
gap produced no spark (see figure 11.7). The spark in the receiver was
caused by the electric current set up in the conducting wire. When the
detector loop was parallel to the spherical electrodes attached to the
induction coil (see figure 11.8), the spark in the receiver was at
maximum. At intermediate angles it was proportionally less. This was a
behaviour similar to that shown by polarised light waves after the light
has passed through an analyser, such as a sheet of Polaroid. It
demonstrated that the newly generated electromagnetic waves were
polarised.

Figure 11.7 No spark was detected when the detector loop was rotated.
Figure 11.8 Hertz detected the waves when the detector loop was placed like this.
(Diagram labels in both figures: charged plates, transmitter, induction coil, detector loop. In one figure the detector loop is parallel with the transmitter; in the other it is at right angles to the transmitter.)

Hertz was the first to observe what we refer to as the ‘photoelectric effect’.
This is discussed further on pages 202–208.

If Hertz’s interpretation was correct, and electromagnetic waves did
travel through space from the coil to the loop, he reasoned there must
be a small delay between the appearance of the first and second spark.
The spark in the detector cannot occur at exactly the same time as the
spark in the induction coil because even travelling at the speed of light it
takes a finite time for the wave to move from one point to another. Hertz
measured this speed in 1888 by using a determined frequency from an
oscillating circuit and a measured wavelength as determined by interference
effects for the waves produced and found that it corresponded to
the speed predicted by Maxwell’s equations; it was the same as the speed
of light.
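
The calculation behind this measurement is the wave relationship between speed, frequency and wavelength. The short Python sketch below shows the arithmetic only. The frequency and wavelength values are hypothetical round numbers chosen for illustration and are not Hertz’s recorded data; the function name `wave_speed` is likewise just a label for this example.

```python
def wave_speed(frequency_hz: float, wavelength_m: float) -> float:
    """Return the wave speed v = f * wavelength, in metres per second."""
    return frequency_hz * wavelength_m


# Hypothetical illustrative readings (assumptions, not historical data).
frequency_hz = 5.0e7   # assumed oscillator frequency, 50 MHz
wavelength_m = 6.0     # assumed wavelength found from interference effects

speed = wave_speed(frequency_hz, wavelength_m)
print(f"v = {speed:.2e} m/s")  # prints: v = 3.00e+08 m/s
```

With readings of this size the product comes out at the speed of light, which is the kind of agreement with Maxwell’s prediction that the text describes.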
To measure this wavelength, Hertz connected both the transmitter and
the detector loop with a length of wire. He had already shown, by
rotating the second loop, that the waves produced by the sparking
behaved as if they were polarised. The spark in the receiver was caused by
the electric current set up in the conducting wire. At intermediate
angles, interference of the currents provided a measure of the
wavelength of the radio waves through the air.
The speed of transmission of the waves was measured using a
technique borrowed from optics. Lloyd’s mirror uses interference between
two beams of light from the same source. One beam travels directly from
the source to a detector. The other reaches the detector after reflecting
from a mirror set at a small angle. The two beams interfere constructively
or destructively when they arrive at the detector. It is possible to use the
pattern produced to determine the wavelength of the waves.
Hertz carried out a modification of this experiment, reflecting the
waves from a metal plate. He suggested that the waves produced had a
wavelength longer than that of light, which should make measurements easier.
Knowing the frequency of the waves and their wavelength, he obtained a
value for the speed of transmission. His value was similar to the speed of
light measured by Fizeau (see page 195).
With this set of experiments and procedures, electromagnetic waves of a
known frequency could be generated for the first time. Today the waves
originally produced by Hertz in his experiments are known as radio waves.
Hertz never transmitted his radio waves over distances longer than a few
hundred metres. The unit for frequency was changed from ‘cycles per
second’ to the ‘hertz’, honouring the contribution of Heinrich Hertz.
Using the microwave apparatus available in schools, large wax prisms
and metal plates, students can reproduce many of these results — dem-
onstrating that electromagnetic waves have all the properties of light.

Radio waves and their frequencies


The discovery of radio waves was made by Hertz; however, the develop-
ment of a practical radio transmitter was left to the Italian, Guglielmo
Marconi (1874–1937). Marconi’s experiments showed that for radio
waves:
• long-wavelength waves penetrate further than short-wavelength waves
• tall aerials are more effective for producing highly penetrating radio
waves than short aerials.
The earliest radio messages were sent in 1895 by Marconi across his
family estate, a distance of approximately three kilometres. By 1901 he
was sending radio messages across the Atlantic Ocean from Cornwall,
England to Newfoundland, Canada.
Different frequency radio waves can be generated easily and precisely
by oscillating electric currents in aerials of different length. This is
because the frequency of the waves generated faithfully matches the
frequency of the alternating current generating them. This ease of generation
allows radio waves to be utilised extensively for many purposes. Applications
include communication technologies, such as radio, television
and mobile phones, and other technologies such as microwave cooking
and radar. Essentially, the only difference between any of these waves
used for these different purposes is the frequency of the waves generated
by the transmitting aerials. For communications, sections of the available
electromagnetic spectrum or bands of spectrum are used (see figure 11.4
on page 195). These chunks of spectrum use many single frequencies to
transmit without interference.

11.1 Producing and transmitting radio waves

PHYSICS IN FOCUS
How do radio aerials work?
One form of aerial is the dipole antenna. Using an alternating electric
current in the antenna, the electrons are continually accelerated back and
forth. Electric charges (electrons) move back and forth along a length of
conductor. The electric and magnetic fields of these moving charges produce
an electromagnetic radiation that has the same frequency as the alternating
current operating in the antenna. The electric current travels along the
entire length of the aerial. Therefore, the greater the length of the aerial
the longer the time for each cycle and hence, the lower the frequency of the
electromagnetic wave generated. Long antennae therefore generate
long-wavelength radio waves whereas short-length antennae produce
short-wavelength radio waves, which are called microwaves.
For communication bands such as radio waves, the length of the antenna can
be as small as one half of the wavelength of the carrier wave (see
figure 11.9(a) and (b)). When you consider the wavelength of radio waves,
you can understand why radio antennae must be so tall. To avoid building
very large antennae they are often mounted on buildings. One end is on the
roof and the other end is grounded. This is to reduce the overall length of
the actual antenna structure while still producing an antenna that is the
height of the building and the antenna structure.

Figure 11.9(a) A dipole antenna
(Diagram labels: antenna length $\frac{\lambda}{2}$, energy, AC current.)
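
The half-wavelength rule described in this box can be turned into a quick estimate of antenna length. The Python sketch below assumes an ideal dipole exactly half a wavelength long and uses the rounded speed of light from this chapter. The three carrier frequencies are typical illustrative values chosen for this example, not figures taken from the text, and `half_wave_dipole_length` is a hypothetical helper name.

```python
SPEED_OF_LIGHT = 3.00e8  # m/s, rounded value used in this chapter


def half_wave_dipole_length(frequency_hz: float) -> float:
    """Return the length in metres of an ideal half-wavelength antenna."""
    wavelength_m = SPEED_OF_LIGHT / frequency_hz  # from c = f * wavelength
    return wavelength_m / 2


# Illustrative carrier frequencies (assumed values for this sketch).
examples = [
    ("AM radio, 1 MHz", 1.0e6),
    ("FM radio, 100 MHz", 1.0e8),
    ("microwaves, 2.45 GHz", 2.45e9),
]

for label, frequency_hz in examples:
    length_m = half_wave_dipole_length(frequency_hz)
    print(f"{label}: about {length_m:.3g} m")
```

The AM example gives an antenna of about 150 m, which is why such aerials need tall structures, while the microwave example is only a few centimetres long.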

Figure 11.9(b) A simple aerial transmits a radio wave.
(Diagram labels: sound, fluctuating direct current, high frequency AC producer, aerial, radio wave with fluctuating amplitude.)

11.3 THE BLACK BODY PROBLEM AND THE
ULTRAVIOLET CATASTROPHE
When an object such as a filament in a light globe is heated (but not
burned) it glows with different colours: black, red, yellow and blue-white
as it gets hotter. To understand how radiation is emitted for all objects,
and how the wavelength of the radiation varies with temperature, creative
experiments involving the behaviour of standard objects called ‘black
bodies’ were required. A black body is one which absorbs all incoming
radiation. The use of black bodies was necessary because all objects
behave slightly differently in terms of the radiation they emit at different
temperatures. Scientists could use the standard black body in experi-
ments to study the nature of radiation emitted at different temperatures,
and then extrapolate their findings for other objects.

As an example of an object used to model a black body, imagine you
drilled a very small hole through the wall of an induction furnace (an
efficient oven in which the temperature can be set to known values). At
a temperature of 1000°C, the walls of such a model black body will emit
all types of radiation, including visible light and infra-red and
ultraviolet radiation, but they will not be able to escape the furnace except
through the small hole. They will be forced to bounce around in the
furnace cavity until the walls of the furnace absorb them. As the walls
absorb the radiation they will increase in energy. This causes the walls
to release radiation of a different wavelength, eventually establishing an
equilibrium situation. All radiation entering through the small hole is
absorbed by the walls, so the radiation leaving the hole in the side of
the furnace is characteristic of the equilibrium temperature that exists
in the furnace cavity. This emitted radiation is given the name black
body radiation.
As figure 11.10 shows, the radiation emitted from a black body extends
over all wavelengths of the electromagnetic spectrum. However, the relative
intensity varies considerably and is characteristic of a specific
temperature.

Figure 11.10 The peak in intensity moves to lower wavelength and higher
frequency radiation with increasing temperature.
(Graph: intensity against wavelength, 0 to 2 µm, with curves for 3000 K, 4000 K and 6000 K.)
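
The shift of the peak with temperature can be estimated numerically. The Python sketch below uses Wien’s displacement law, $\lambda_{max} = \frac{b}{T}$, which is not stated in this chapter and is brought in here as an assumption to illustrate the trend. The constant `WIEN_CONSTANT` is the standard value of $b$, and the three temperatures are those of the curves in figure 11.10.

```python
WIEN_CONSTANT = 2.898e-3  # m K, Wien's displacement constant (not from the text)


def peak_wavelength_um(temperature_k: float) -> float:
    """Return the wavelength of peak emission in micrometres."""
    peak_m = WIEN_CONSTANT / temperature_k
    return peak_m * 1e6  # convert metres to micrometres


for temperature_k in (3000, 4000, 6000):
    peak = peak_wavelength_um(temperature_k)
    print(f"{temperature_k} K: peak near {peak:.2f} µm")
```

The hotter the body, the shorter the wavelength at which its emission peaks, which is the behaviour the figure caption describes.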
Black bodies absorb all radiation that falls on them. That energy is
spread throughout the object. The cavity walls within the black body also
get hotter. As the walls of the cavity get hotter, the emission of more
intense, shorter wavelength radiation from the cavity occurs. Physicists
used a spectrometer to measure how much light of each colour, or
wavelength, was emitted from the hole in the side of the black body models
they constructed. The shape of the intensity versus wavelength curves on
the graphs that they created presented a problem for the physicists
attempting to explain quantitatively the intensity and wavelength
variations that occurred.
The problem was how to explain the results theoretically. The traditional
mathematics based on thermodynamics predicted that the pattern of
radiation should be different to that which the physicists found
occurred.

Figure 11.11 The predicted curve for the classical model and the actual
curve obtained from experiment
(Graph: radiance against wavelength, 0 to 4 µm, with one curve labelled ‘Classical theory’ and one labelled ‘Experiment and Planck’s theory’.)
The ‘classical’ wave-theory of light predicted that, as the wavelength of
radiation emitted becomes shorter, the radiation intensity would
increase. In fact, it would increase without limit. This would mean that, as
the energy (that was emitted from the walls of the black body and then
re-absorbed) decreased in wavelength from the visible into the ultraviolet
portion of the spectrum, the intensity of the radiation emitted from the
hole in the black body would approach infinity. This increase in energy
level would violate the principle of conservation of energy and could not
be explained by existing theories. This effect was called the ‘ultraviolet
catastrophe’.
The experimental data from black body experiments (see figure 11.11)
showed that the radiation intensity curve corresponding to a given
temperature has a definite peak, passing through a maximum and then
declining. This could not be explained.
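
The contrast between the classical prediction and the measured curve can be sketched numerically. The Python example below compares two standard formulas that are not written out in this chapter and are used here as assumptions: the classical (Rayleigh–Jeans) expression $\frac{2ckT}{\lambda^4}$ and Planck’s expression $\frac{2hc^2}{\lambda^5} \cdot \frac{1}{e^{hc/\lambda kT} - 1}$ for the radiance per unit wavelength. The temperature of 4000 K and the sample wavelengths are arbitrary illustrative choices.

```python
import math

H = 6.63e-34    # J s, Planck's constant (value used in this chapter)
C = 3.00e8      # m/s, speed of light (rounded)
K_B = 1.38e-23  # J/K, Boltzmann's constant (not from the text)


def classical_radiance(wavelength_m: float, temperature_k: float) -> float:
    """Classical (Rayleigh-Jeans) prediction; grows without limit as wavelength shrinks."""
    return 2 * C * K_B * temperature_k / wavelength_m**4


def planck_radiance(wavelength_m: float, temperature_k: float) -> float:
    """Planck's prediction; rises to a peak and then declines."""
    exponent = H * C / (wavelength_m * K_B * temperature_k)
    return (2 * H * C**2 / wavelength_m**5) / math.expm1(exponent)


TEMPERATURE_K = 4000  # illustrative temperature
for wavelength_um in (4.0, 2.0, 1.0, 0.5, 0.25, 0.1):
    wavelength_m = wavelength_um * 1e-6
    classical = classical_radiance(wavelength_m, TEMPERATURE_K)
    planck = planck_radiance(wavelength_m, TEMPERATURE_K)
    print(f"{wavelength_um:5.2f} µm  classical {classical:.2e}  Planck {planck:.2e}")
```

Reading down the output, the classical column keeps climbing as the wavelength shortens towards the ultraviolet, while the Planck column passes through a maximum and falls away, matching the shape of the experimental curve in figure 11.11.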
The German scientist, Max Planck, arrived at a revolutionary expla-
nation for the nature of the radiation emitted in experiments. Planck
proposed that energy would be exchanged between the particles of the
black body and the equilibrium radiation field. Using an analogy with the
transmission of radio waves (with an aerial of specific length indicating
the frequency of the radio wave radiation produced), the relatively high
frequency of light emitted by a black body required an ‘aerial’ of a size
similar to that of the atom for its production. The question was, how did
this come to be?

Planck came up with a revolutionary idea to explain the results
observed in experiments. He assumed that the radiant energy,
although exchanged between the particles of the black body and the
radiant energy field in continuous amounts, may be treated statistically
as if it was exchanged in multiples of a small ‘lump’. Each lump is
characteristic of each frequency of radiation emitted. He described this
small, average packet as a ‘quantum’ of energy, that could be described
by $hf$, where $f$ was the frequency, and $h$ a small constant, now called
‘Planck’s constant’ ($h = 6.63 \times 10^{-34}$ J s).
Therefore:
$E = hf$
where
$E$ = energy, measured in joules
$h$ = Planck’s constant = $6.63 \times 10^{-34}$ J s
$f$ = frequency in hertz.

Figure 11.12 Max Planck (1858–1947) has come to be recognised as one of
the key people involved in the development of modern physics.

This equation models the quantum relationship. This modification was
seen by Planck as a small correction to classical thermodynamics. It
turned out to be a most significant step towards the development of a
totally new branch of physics: the quantum theory. Although he is
considered to be the first to introduce this theory, Planck was never
comfortable with the strict application of the quantum theory. He had invented
the quantum theory but believed that all he had really done was to invent
a mathematical trick to explain the results of black body radiation
experiments. He failed to accept the quantisation of radiation until later in his
career when the quantum theory was backed up with more examples and
supporting evidence.
Armed with the Planck relationship and the knowledge that the speed
of electromagnetic radiation ($c = 3 \times 10^{8}$ m s$^{-1}$) was the product of the
frequency of the radiation and the wavelength of the radiation
($c = f \times \lambda$), the energy in one quantum, or photon, of light of any known
wavelength was then able to be determined.

The term photon describes a unit (or ‘packet’) of energy relating
to the quantum description of matter. It is also a particle of
electromagnetic energy with a zero rest mass.

Calculating photon energy

SAMPLE PROBLEM 11.1 What is the energy of an ultraviolet-light photon, wavelength = $3.00 \times 10^{-7}$ m?

SOLUTION
$c = f\lambda$ so $f = \frac{c}{\lambda}$
$$E = hf = \frac{hc}{\lambda} = \frac{6.63 \times 10^{-34} \times 3.00 \times 10^{8}}{3.00 \times 10^{-7}} = 6.63 \times 10^{-19}\ \text{J}$$
In this way the energy of a light photon of any known wavelength of light
can be determined.
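
The same calculation can be repeated for any wavelength with a short function. The Python sketch below reproduces sample problem 11.1 using the constants quoted in this chapter. The function names are hypothetical labels chosen for this example.

```python
PLANCK_CONSTANT = 6.63e-34  # J s
SPEED_OF_LIGHT = 3.00e8     # m/s


def photon_energy_from_frequency(frequency_hz: float) -> float:
    """Return the photon energy E = h * f, in joules."""
    return PLANCK_CONSTANT * frequency_hz


def photon_energy_from_wavelength(wavelength_m: float) -> float:
    """Return the photon energy E = h * c / wavelength, in joules."""
    frequency_hz = SPEED_OF_LIGHT / wavelength_m  # from c = f * wavelength
    return photon_energy_from_frequency(frequency_hz)


# Sample problem 11.1: ultraviolet photon of wavelength 3.00e-7 m
energy_j = photon_energy_from_wavelength(3.00e-7)
print(f"E = {energy_j:.2e} J")  # prints: E = 6.63e-19 J
```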

11.4 WHAT DO WE MEAN BY
‘CLASSICAL PHYSICS’ AND
‘QUANTUM THEORY’?
A useful analogy is that of a slippery dip. Classical physics says that the
difference in height from the top to the bottom is a continuous ‘slide’.
Quantum mechanics says that there are a set number of small steps (the
ladder) between the top of the slide and the ground.

Classical physics can be described, in broad terms, as physics up to the end
of the nineteenth century. It relied on Newton’s mechanics and included
Maxwell’s theories of electromagnetism. Classical physics still applies to
large-scale phenomena and to the motion of bodies at speeds very much
less than the speed of light. The quantum theory applies to the very small
scale, particularly at the atomic level. Energy is believed to occur in
discrete ‘packets’ or ‘quanta’. Energy packets can be absorbed by an atom,
and then re-radiated. Classical physics predicts that the emission of
electromagnetic radiation is continuous; that is, can occur in any amount.
The difference between the classical description of ‘continuous’
energy and the new discrete, or quantised description of energy may be
understood with a simple ‘thought experiment’. Consider an old-
fashioned balance, pivoted in the centre, with large pans suspended from
each side. On one side we place a bucket with water and on the other we
add house bricks to balance the bucket and water. We can change the
weight of bricks by adding or removing one brick at a time. If each
brick has a mass of 2 kg, then our smallest packet, or quantum, of mass is
2 kg. On the other side, we can increase the weight by adding water.
Without extending this example too far, we have a discrete variable (the
bricks) and a continuous variable (the water) which can be added in
smaller and smaller drops.
The continuous variable (water) is similar to the classical model of
energy, and the bricks correspond to the quantum model in which
energy can be exchanged in multiples of a small number. This simple
example uses only one size of brick, or quantum. In fact the size of the
quantum value varies with the energy effect we are studying.
Why don’t we see these ‘jumps’ in light? The constant, $h$, now called
‘Planck’s constant’, is equal to $6.63 \times 10^{-34}$ J s. The quantum of energy
is very small and, for all practical purposes, it is too small to be observed.
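
A rough count of photons shows how small each quantum is. The Python sketch below assumes a hypothetical source emitting 1 W of light at a single wavelength of 550 nm. Both values are illustrative assumptions for this example, not figures from the text.

```python
PLANCK_CONSTANT = 6.63e-34  # J s
SPEED_OF_LIGHT = 3.00e8     # m/s


def photons_per_second(power_w: float, wavelength_m: float) -> float:
    """Return how many photons per second carry the given power at one wavelength."""
    energy_per_photon_j = PLANCK_CONSTANT * SPEED_OF_LIGHT / wavelength_m
    return power_w / energy_per_photon_j


# Hypothetical source: 1 W of green light at 550 nm (assumed values).
rate = photons_per_second(1.0, 550e-9)
print(f"about {rate:.1e} photons per second")
```

The result is of the order of a billion billion photons every second, so the individual ‘jumps’ merge into what appears to be a continuous flow of energy.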

11.5 THE PHOTOELECTRIC EFFECT


The photoelectric effect is the name given to the release of
electrons from a metal surface exposed to electromagnetic
radiation. For example, when a clean surface of sodium metal is
exposed to ultraviolet light, electrons are liberated from the
surface.

The photoelectric effect is one of several processes for removing
electrons from a metal surface. The effect was first observed by Hertz in
1887 when investigating the production and detection of electromagnetic
waves using a spark gap in an electric circuit (see pages 196–198). Hertz
used an induction coil to produce an oscillating spark. Hertz called the
transmitting loop, spark A, and the detecting loop, spark B. In Hertz’s
own words, describing his detection of the photoelectric effect:
surface. ‘I occasionally enclosed spark B in a dark case so as to more easily
make the observations; and in so doing I observed that the
maximum spark length became decidedly smaller inside the case
than it was before. On removing, in succession, the various parts of
the case, it was seen that the only portion of it which exercised this
prejudicial effect was that which screened the spark B from spark A.
The glass partition exhibited this effect not only in the immediate
neighbourhood of spark B, but also when it was interposed at
greater distance from B between A and B.’

Hertz had discovered the photoelectric effect. He had found that
illuminating the spark gap in the receiving loop with ultraviolet light
from the transmitting gap gave stronger sparks in the receiving loop.
Glass used as a shield between the transmitting and receiving loops
blocked the UV. This reduced the intensity of sparking in the receiving
loop. When quartz was used as a shield there was no drop in the intensity
of sparking in the receiving loop. Quartz allowed the UV from the
transmitted spark to fall on the detector.
Wilhelm Hallwachs (1859–1922) read the extract written by Hertz
describing the photoelectric effect in a journal and designed a simpler
method to measure this effect. He placed a clean plate of zinc on an
insulating stand and attached it by a wire to a gold leaf electroscope. He
charged the electroscope negatively and observed that the charge leaked
away quite slowly. When the zinc was exposed to ultraviolet light from an
arc lamp, or from burning magnesium, charge leaked away quickly. If the
electroscope was positively charged, there was no fast discharge (see
figure 11.13).

eBook plus
Weblinks: Explaining the photoelectric effect; The photoelectric effect
eModelling: Modelling the photoelectric effect
A simple spreadsheet can easily be set up to enable playing with
the idea of the ‘work function’ of a metal. Setting up the formulae
for the spreadsheet will also help reinforce the relationship between
the various physical quantities. (doc-0042)

Figure 11.13 A positively charged electroscope is not affected by
illumination with UV light, while the charge on a negatively charged
electroscope discharges.
(Diagram labels: UV light falling on a zinc plate; positively charged electroscope, no reaction; negatively charged electroscope, discharges.)
In 1899, J. J. Thomson established that the ultraviolet light caused
electrons to be emitted from a sheet of zinc metal and showed that these
electrons were the same particles found in cathode rays. He did this by
enclosing the metal surface to be exposed to ultraviolet light in a vacuum
tube (see figure 11.14).
The new feature of this experiment was that the electrons were ejected
from the metal by radiation rather than by the strong electric field used
in the cathode ray tube. At the time, recent investigations of the atom
had revealed that electrons were contained in atoms and it was proposed
that perhaps they could be excited by the oscillating electric field.
In 1902, Hungarian-born German physicist Philipp von Lenard
(1862–1947) studied how the energy of emitted photoelectrons varied with the
intensity of the light used. He used a carbon arc lamp with which he was
able to adjust the light intensity. He found in his investigations using a
vacuum tube that photoelectrons emitted by the metal cathode struck
another plate, the collector.

Figure 11.14 Apparatus used to demonstrate the photoelectric effect
(Diagram labels: light source, light, evacuated tube, metal surface, photoelectrons, collector, galvanometer G, variable voltage.)

When each electron struck the collector, a small electric current was
produced that could be measured. To measure the energy of the
electrons emitted, Lenard charged the collector negatively to repel the
electrons. By doing so, Lenard ensured that only electrons ejected with
enough energy would be able to overcome this potential hill (see figure
11.15). Surprisingly, he found that there was a well-defined minimum
voltage, $V_{stop}$, at which the current dropped to zero (see figure 11.16).
Lenard was also able to filter the arc light to investigate the effect that
different frequencies of incident electromagnetic radiation had on
photoelectron emission.
Lenard observed that:
• doubling the light intensity would double the number of electrons
emitted
• there was no change in the kinetic energy of the photoelectrons as the
light intensity increased
• the maximum kinetic energy of the electrons depends on the
frequency of the light illuminating the metal, as figure 11.17 shows.

Figure 11.15 The voltage applied across the variable resistor opposes
the motion of the photoelectrons. The electrons that reach the opposite
electrode create a small current, measured by the galvanometer. The
value of the voltage at which the current drops to zero is known as
the stopping voltage.
(Diagram labels: light, cathode, galvanometer G, 9 V supply.)

Figure 11.16 For a given frequency, photoelectrons are emitted with the
same maximum kinetic energy because the electrons are all stopped by the
same voltage. Increasing the intensity of the light increases the number of
electrons released from the surface, causing an increase in the photocurrent.
(Graph: photocurrent (A) against stopping voltage (V), with curves shown for increasing light intensity.)

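
The stopping voltage gives a direct measure of the maximum kinetic energy of the photoelectrons, because an electron that is only just stopped has lost all of its kinetic energy climbing the potential hill: $KE_{max} = eV_{stop}$. The Python sketch below applies this relationship. The electron charge is the standard value, the 2.0 V reading is a hypothetical example rather than one of Lenard’s measurements, and the function name is a label chosen for this example.

```python
ELEMENTARY_CHARGE = 1.60e-19  # C, magnitude of the charge on an electron


def kinetic_energy_from_stopping_voltage(stopping_voltage_v: float) -> float:
    """Return the maximum photoelectron kinetic energy in joules (KE = e * V_stop)."""
    return ELEMENTARY_CHARGE * stopping_voltage_v


# Hypothetical reading: the photocurrent falls to zero at 2.0 V.
ke_max_j = kinetic_energy_from_stopping_voltage(2.0)
print(f"KE_max = {ke_max_j:.2e} J")
```

Because the stopping voltage does not change when the light is made brighter, neither does this maximum kinetic energy, which is the second of Lenard’s observations.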
Explanation of the photoelectric effect
Classical physics was unable to explain the photoelectric effect. Maxwell had
predicted electromagnetic radiation, and Hertz had confirmed that light was
electromagnetic radiation. Light was a wave phenomenon, and according to
the classical theory of Maxwell, somehow or other the light waves falling on
the surface of a metal would cause the emission of electrons.

Figure 11.17 For a given light intensity, increasing the frequency
of the light increases the maximum kinetic energy with which the
photoelectrons are emitted.
(Graph: photocurrent (A) against stopping voltage (V), with curves shown for increasing light frequency.)

The key observations a theory had to explain are:
1. If the radiation falling on the metal surface is going to cause emission
of photoelectrons, they may be emitted almost immediately after the
light falls on the metal. There may be no significant time delay, even if
the intensity of the light, and hence the rate at which energy is being
transferred to the metal surface, is very low.
2. There is a cut-off (or threshold) frequency. Radiation of lower fre-
quency than a particular value will not cause the emission of photo-
electrons, regardless of how bright the light source is. (Very intense
light, which carries a large amount of energy, cannot cause emission
of photoelectrons if the frequency of the light is less than the
threshold frequency.)
3. If the light does cause the emission of photoelectrons, increasing the
intensity of the light will increase the number of photoelectrons
emitted per second.
4. The energy of the photoelectrons does not depend on the intensity of
the light falling on the surface of the metal, but it does depend on the
frequency of the light. Light of higher frequency causes the emission
of photoelectrons with higher energy.



Classical physics had difficulties with three of these four observations.
• According to a wave theory of light, the light waves would distribute
their energy across the whole of the metal surface. It might be
expected that all electrons in the atoms of the metal (or at least all of
the outer electrons) would gain energy from the light waves. If the
light was very faint, it could take a considerable time for the electrons
to gain sufficient energy and for one electron to be able to escape
from the metal surface, but this is not what is observed.
• There was no way to explain the cut-off frequency.
• There was no way to explain the relationship between the frequency of
the light and the kinetic energy of the most energetic electrons.
• The only observation that could be explained by a wave theory was the
fact that increasing the intensity of light increases the rate of emission
of photoelectrons.
The problems associated with explaining the photoelectric effect were
solved by Einstein in 1905. Although his paper is often referred to as his
photoelectric paper, it was in fact a paper on the quantum nature of
light, and the photoelectric effect was just one of several examples used
by Einstein to illustrate the quantum nature of light.
In this paper, Einstein states: ‘The simplest conception is that a light
quantum transfers its entire energy to a single electron.’ In other words,
the light quantum is acting as a particle in a collision with an electron.
This ‘light quantum’ model of light is able to explain all four
observations of the photoelectric effect. (The term photon was not introduced
until 1926, but we will use it at this stage to refer to a ‘light quantum’.)
If a photon strikes a metal surface, it will collide with a single electron
in an atom in the metal. All of the energy of the photon will be passed on
to the electron. This electron may gain sufficient energy to escape from
the metal surface, so there is no problem with a time delay. Even with
very faint light, the first photon to strike the metal surface could possibly
cause the emission of an electron.
The energy of a photon is related to its frequency (E = hf). A certain
minimum energy is required to cause the emission of an electron from
the metal surface (this is the work function for that metal). If the energy
of the photon is less than this value, the photon cannot cause an electron
to be emitted. This explains the threshold frequency.

The work function is the minimum energy required to release the
electron from the surface of a particular material.

Increasing the intensity of the light increases the rate at which photons
fall on the metal surface, hence the rate of emission of photoelectrons
increases.
As the frequency of the light is increased beyond the cut-off frequency,
more energy is provided to the electrons, hence the kinetic energy of the
most energetic electrons increases.
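
The photon explanation above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the work function `W_ASSUMED` is an assumed value (roughly that of sodium), the two wavelengths are arbitrary choices, and the function names are hypothetical. It shows that emission depends on the energy of a single photon (E = hf), and that the intensity only sets how many photons arrive each second.

```python
# Physical constants (rounded)
H = 6.626e-34      # Planck's constant, J s
C = 3.00e8         # speed of light in a vacuum, m/s
EV = 1.602e-19     # joules per electronvolt


def photon_energy(frequency_hz):
    """Energy of one photon, E = hf, in joules."""
    return H * frequency_hz


def threshold_frequency(work_function_j):
    """Lowest frequency whose photons can free an electron: f0 = W / h."""
    return work_function_j / H


def can_emit(frequency_hz, work_function_j):
    """True when a single photon carries at least the work function."""
    return photon_energy(frequency_hz) >= work_function_j


def photon_rate(power_w, frequency_hz):
    """Photons arriving per second in a beam of the given power."""
    return power_w / photon_energy(frequency_hz)


# Assumed, illustrative work function (roughly that of sodium).
W_ASSUMED = 2.3 * EV

red = C / 700e-9      # frequency of 700 nm red light, Hz
violet = C / 400e-9   # frequency of 400 nm violet light, Hz

# Red photons each carry too little energy, however bright the beam is.
print("red light ejects electrons:", can_emit(red, W_ASSUMED))
print("violet light ejects electrons:", can_emit(violet, W_ASSUMED))

# Doubling the beam power doubles the photon arrival rate,
# so the rate of emission doubles while the photon energy is unchanged.
print("rate ratio:", photon_rate(2e-3, violet) / photon_rate(1e-3, violet))
```

With these assumed numbers the red light is below the threshold frequency and the violet light is above it, which mirrors observations 2 and 3 in the list of key observations.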
This last idea played an important part in establishing the particle
nature of light. In 1905, observations of sufficient accuracy were not
available to Einstein to test his equation; it was not until 1916 that
Millikan, reluctantly, provided that evidence. Little attention had been
paid to Einstein’s ideas on the photoelectric effect in the intervening
eleven years. Millikan announced his results in 1916 but said, ‘The
Einstein equation accurately represents the energy of the electron
emission under irradiation with light [but] the physical theory upon
which the equation is based [is] totally unreasonable.’ However, Millikan
also stated that his results, combined with Einstein’s equation, provided
‘the most direct and most striking evidence so far obtained for the reality
of Planck’s h’.



Einstein’s photoelectric equation
As previously discussed, when different metal surfaces are illuminated
with monochromatic light, electrons may be ejected from the metal surface.
These electrons are called photoelectrons. Different metals hold electrons
with different forces. Providing the photons of light illuminating the metal
have sufficient energy (are of a high enough frequency) to overcome the
energy holding the electrons in the metal, the electrons may be emitted.
Only a small proportion of such electrons will in fact escape from the metal
surface, and the emitted electrons will have a spread of energies, as some
electrons may have required energy to move them to the metal surface. We
will deal with the most energetic electrons emitted.
If a graph is plotted of the maximum kinetic energy of the emitted
electrons versus the frequency of the light, the gradient of the lines
representing different metals is the same (see figure 11.18). The point at
which the lines intercept the frequency axis is a measure of the threshold
frequency for that metal. If the frequency of the monochromatic light is
below this threshold frequency, no photoelectrons will be emitted from the
metal surface.
The lines for all of the metals are parallel and have a gradient equal to
Planck’s constant.

Figure 11.18 This graph shows the maximum kinetic energy with which the
photoelectrons are emitted versus the frequency of light, for five different
metals (potassium, sodium, zinc, tungsten and platinum). Note that the
gradient of all the lines is equal to Planck’s constant. (Axes: maximum
kinetic energy (eV) versus frequency ($\times 10^{15}$ Hz). Each line meets
the frequency axis at the threshold frequency, $f_0$, and extrapolates to
$-W$ on the energy axis.)
If we apply the general equation of a straight line, y = mx + b, to any of
the lines on this graph, we find that:
$E_{k\,\max} = hf - W$
Comparing terms, y corresponds to $E_{k\,\max}$, the gradient m to $h$,
x to $f$ and the intercept b to $-W$.
This equation is an energy equation:
$E_{k\,\max}$ = the maximum kinetic energy of an emitted electron
$W$ = the minimum energy required to remove the electron from
the metal surface (the work function of the metal)
$hf$ = the energy of the incident photon.
The energy of Einstein’s ‘light quantum’ is hf, so this equation
represents an interaction between an individual quantum of light (a
photon) and an individual electron.
Of course, we now have light behaving as a particle in the photoelectric
effect but as a wave in other phenomena (such as interference
and diffraction). The photon has a dual wave and particle nature.
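
Einstein's photoelectric equation can be explored numerically. The Python sketch below is a minimal illustration under stated assumptions: the two work functions are hypothetical values chosen only to give two parallel lines, the function names are invented for this example, and the stopping voltage is found from the standard relation that the work done by the retarding voltage on one electron, eV, equals the maximum kinetic energy.

```python
H = 6.626e-34         # Planck's constant, J s
E_CHARGE = 1.602e-19  # magnitude of the electron charge, C (also J per eV)


def max_kinetic_energy(frequency_hz, work_function_j):
    """Einstein's photoelectric equation, Ek max = hf - W, in joules.

    Returns 0.0 below the threshold frequency, where no electrons are emitted.
    """
    return max(H * frequency_hz - work_function_j, 0.0)


def stopping_voltage(frequency_hz, work_function_j):
    """Voltage that just stops the most energetic electrons: e V = Ek max."""
    return max_kinetic_energy(frequency_hz, work_function_j) / E_CHARGE


# Hypothetical work functions in eV, chosen only to illustrate the graph.
ASSUMED_WORK_FUNCTIONS_EV = {"metal A": 2.0, "metal B": 4.0}

f1, f2 = 1.2e15, 1.5e15  # two frequencies above both thresholds, Hz
for name, w_ev in ASSUMED_WORK_FUNCTIONS_EV.items():
    w = w_ev * E_CHARGE
    # Gradient of the Ek max versus f line between the two frequencies.
    gradient = (max_kinetic_energy(f2, w) - max_kinetic_energy(f1, w)) / (f2 - f1)
    print(name, "gradient =", gradient, "J s; threshold =", w / H, "Hz")
```

Both hypothetical metals give the same gradient (Planck's constant, to within rounding) but different threshold frequencies, which is the behaviour of the parallel lines described for figure 11.18.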

PHYSICS FACT

Both Planck and Einstein lived in Germany during the early part of the
twentieth century. Working in the same area of physics, they were firm
friends. However, during World War I, this friendship became strained.
Einstein was a pacifist, while Planck strongly supported the German cause,
even though he lost his son in battle in 1915.
With the rise of Hitler and the anti-Semitic movement, Einstein, who was
Jewish, emigrated to the United States during the 1930s. Planck was able to
continue his academic career in Berlin, even in the face of the hostility of
anti-Semitic groups towards the ‘decadent Jewish science’ of relativity and
the quantum theory.
It is difficult to understand the pressures experienced by Planck, who
tried to protect his Jewish friends and students throughout the period.
Unlike Einstein, he did not see the moral imperative of opposing Hitler but
tried to compromise and work within the system. Einstein remained a
pacifist, yet some would say that he compromised some of these ideals with
his support of the development of the atom bomb. He later wrote of the pain
he experienced when the bombs were finally used. See chapter 25 for more
detail on the nuclear bomb.



Applications of the photoelectric effect
Solar cells — an important use of the photovoltaic effect

Figure 11.19 A panel of commercial solar cells

A photocell is a device that uses the photoelectric effect. These
devices include photovoltaic, or solar, cells, which convert
electromagnetic energy, such as sunlight, into electrical energy.
Other examples are photoconductive cells and phototubes.

A photocell is a device that converts energy from sunlight into electrical
energy. The first true solar cell was made in 1889 by Charles Fritts, using
a thin selenium wafer covered with a thin layer of gold. By 1927,
light-sensitive devices, such as light meters for photography, became
generally available.
Modern solar cells use silicon and gallium arsenide. They use focusing
devices, such as lenses, to achieve efficiencies of greater than 37% in
converting light energy into electricity. An efficiency of 37% means that 37%
of the light energy falling onto the cell is converted into electricity.
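
The meaning of an efficiency figure can be shown with a short calculation. This Python sketch is an illustration only: the incident power of 1000 W is an assumed round number, and the function name is hypothetical.

```python
def electrical_power(incident_power_w, efficiency):
    """Electrical output of a cell: efficiency (0 to 1) times incident light power."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be a fraction between 0 and 1")
    return efficiency * incident_power_w


# Assumed illustration: 1000 W of light falling on the cell's collecting area.
incident = 1000.0
output = electrical_power(incident, 0.37)
print(output, "W converted to electricity;", incident - output, "W not converted")
```

At 37% efficiency, a little over a third of the incident light power becomes electrical power and the remainder is not converted.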
The work function of most materials requires the electromagnetic energy
to have a frequency near that of ultraviolet light to allow electrons to be
emitted. This limits the application of photocells using the photoelectric
effect. Devices that use p–n junctions are more commonly found in the
generation of power and in the detection and measurement of light.
In a solar cell (see figure 11.20), light energy is applied to the junction
region of a semiconductor diode where p-type silicon is in contact with
n-type silicon. Electrons are released from the silicon crystal lattice
because of the photoelectric effect. This has the effect of raising the
junction voltage. For the solar cell to work, the n-type layer is exposed to
light and the p-type layer is not. On the light-exposed side of the solar
cell a fine grid of metal provides electrical contacts. These contacts are
able to collect the photoelectrons emitted from the light-exposed n-type
silicon surface.

Figure 11.20 A solar (photovoltaic) cell showing the junction between the
n-type silicon and p-type silicon, metal grid, external circuit and electron
path. Note that the direction of hole and electron migration from the
junction is opposite.



Photoconductive cells
A photoconductive cell, or photo-resistor, uses the fact that electrical
resistance is affected by light falling onto it.

Most modern photoconductive cells are made of semiconductor material.
This material is described more fully in chapter 12. Briefly, it is a
mixture of elements that has some, but not all, electrons free to conduct
electricity. The rest are confined in a lattice structure between atoms.
To produce the photoconductive effect, light energy causes the electrons
to be released from their valence bonds in a material. The number of free
or mobile electrons in a semiconductor is limited and the addition of the
light-released electrons raises the conductivity, or reduces the resistance,
of the semiconductor. The resistance may change by a factor of many
thousands.
Many different substances are photoconductive, that is, they conduct when
exposed to light but do not when in the dark. Examples of photoconductive
materials include lead sulfide (PbS), lead selenide (PbSe) and lead
telluride (PbTe). These are extremely sensitive and become conductive when
exposed to electromagnetic radiation in the infra-red range. Cadmium sulfide
(CdS) is also sensitive and becomes photoconductive when exposed to
light in the visible range.

Figure 11.21 An example of the use of solar cells to provide energy to
power a telephone in a remote location
The photoconductive cell or photo-resistor was the first photoelectric
device created. It was developed in the late nineteenth century.
Photoconductive cells are used as switches to turn street lights on and off
according to the amount of light energy naturally available at any
particular time. They can also be used as light gates. You have used light
gates to measure velocity by detecting the time that a light beam is
blocked by a glider moving along a linear air-track. Such gates are used
in industry as counting devices on production lines, and in packaging,
and can be employed in alarm systems.
Because photoconductive cells can register the amount of light, they
can be used to ascertain when concentrations of particles exceed certain
levels in liquids and gases. When light scatters from particles in the water,
for example, less light passes through into the detector. This can be
calibrated to measure the amount of pollution. They can also be used to test
the purity of commercially produced drinks and water supplies.
Photoconductive cells are part of the sensor that is used to scan
barcodes on grocery items at the supermarket checkout. They are also used
in light meters to measure the intensity of illumination in photography.
Figure 11.22 Example of the use of a photoconductive cell

Phototubes
Phototubes, also known as photocells, are commonly used as the ‘electric
eyes’ to open automatic doors in shopping centres. In public toilets, the
cells are used to turn taps on and off when people wash their hands. Some
holiday resorts use the ‘eyes’ to trigger a presentation of descriptive audio
or visual information when a tourist enters a room. They can also be used
in astronomy to measure electromagnetic radiation from celestial objects
such as stars and galaxies. Photomultiplier cells are extremely sensitive
detectors. They effectively multiply the number of electrons released by a
factor of a million.



CHAPTER REVIEW

SUMMARY
• In 1887, Hertz showed that both radio and light waves are electromagnetic
  waves, involving varying or changing electric fields coupled with a
  changing magnetic field that is perpendicular to the electric field.
• The relationship $v = f\lambda$ relates the frequency (Hz) and the
  wavelength (m) to the velocity (m s$^{-1}$). In a vacuum, all
  electromagnetic waves travel at the speed of light (usually referred to
  as c) and thus $c = f\lambda$.
• The photoelectric effect is the release of electrons from materials,
  usually metals, by the action of light or some other electromagnetic
  radiation such as X-rays or gamma radiation.
• An oscillating or vibrating atom can emit electromagnetic energy only in
  discrete packets, or quanta. For light, in particular, the basic quantum
  of energy is a photon.
• The relationship between the frequency, f, and the amount of energy
  emitted, E, is given by the formula E = hf where h is Planck’s constant.
• Quantum mechanics states that the amount of energy emitted from a black
  body (or any other atomic-level energy transformation) is quantised so
  that it can increase only in certain small steps.
• Albert Einstein explained the photoelectric effect, based on Planck’s work
  on black body radiation, and included the development of the idea of a
  quantum of energy or photon being the basic unit of energy.
• As scientists, Einstein and Planck both experienced pressure from social
  and political forces and coped in different ways.
• When a photon hits electrons in a metal it will release either all or none
  of its energy. An individual electron cannot accept energy from more than
  one photon.
• The amount of energy required to overcome the attractive forces holding an
  electron in the electron ‘sea’ is called the work function of the metal.
• The maximum kinetic energy of the electron after it has been ejected from
  the metal’s surface can be determined by measuring the stopping voltage,
  $V_{stop}$.
• A photocell is a device which uses the photoelectric effect to generate or
  control an electric current. Applications include solar cells and
  breathalysers.

QUESTIONS
1. A beam of monochromatic light falls onto a cold, perfect black body and
   imparts 0.10 mW of power to it. If the wavelength of the light is
   $5.0 \times 10^{-7}$ m, calculate:
   (a) the frequency of the light
   (b) the energy per photon for the light
   (c) the number of photons per second striking the black body.
2. A beam of UV light of frequency $7.0 \times 10^{15}$ Hz is incident on
   the apparatus shown in figure 11.23. If the maximum kinetic energy of an
   emitted electron is $9.0 \times 10^{-19}$ J, calculate:
   (a) the potential required to stop electrons reaching the collector
   (b) the work function of the material on which the light is shining
   (c) the threshold frequency of the material.

Figure 11.23 (Diagram labels: evacuated tube, metal surface, collector,
emitted electrons, voltmeter V.)

3. Examine figures 11.7 and 11.8 (page 197). Explain why Hertz came to the
   conclusion that the waves produced by the sparks were polarised.
4. Demonstrate the dependence of colour on the temperature of a black body.
   In figure 11.10 (page 200), what is the significance of the peak of the
   curve?
5. Maxwell, Michelson and Hertz carried out experiments on electromagnetic
   waves of different frequencies. Compare their observations and discuss
   how an understanding of the electromagnetic spectrum was developed.



6. A scientist is investigating the effect of different types of radiation
   on the surface of a piece of sodium metal. A method is devised to cut a
   new surface across the sodium plate while in vacuo, since sodium is
   highly reactive and oxidises quickly. The apparatus is finally set up as
   shown in figure 11.24. The two variables under investigation will be the
   frequency, f, of the radiation and the kinetic energy, $E_k$, of the
   photoelectrons.
   (a) Should the sodium plate be positively or negatively charged in order
       to make the proper investigations?
   (b) Results for the experiment are as follows:

       FREQUENCY OF INCIDENT LIGHT     STOPPING POTENTIAL
       ($\times 10^{14}$ Hz)           (volts)
        5.4                            0.45
        6.8                            1.00
        7.3                            1.15
        8.1                            1.59
        9.4                            2.15
       11.9                            2.91

       Record these results on an $E_{k\,\max}$ versus frequency graph.
   (c) Determine the threshold frequency for the sodium metal.
   (d) Determine a value for Planck’s constant, h, from your graph.
   (e) What is the work function for sodium?

Figure 11.24 (Diagram labels: evacuated tube, incident radiation of
frequency f, Na plate, collector, ammeter A, voltmeter V, power pack.)

7. Above a particular (and specific to each material) threshold frequency of
   electromagnetic radiation, electrons are ejected immediately. Below this
   threshold frequency, electrons are never ejected. Explain how the photon
   model for light, rather than the wave model, explains this behaviour.
8. A photon collides with an electron and is scattered backwards so that it
   travels back along its original path. Describe and explain the expected
   wavelength of the scattered light.
9. The light from a red light-emitting diode (LED) has a frequency of
   $4.63 \times 10^{14}$ Hz. What is the energy change of electrons that
   produce this light?
10. We can detect light when our eye receives as little as
    $2.00 \times 10^{-17}$ J. How many photons of light, with a wavelength
    of $5.50 \times 10^{-7}$ m, is this?
11. A red laser emitting 600 nm ($\lambda = 6.0 \times 10^{-7}$ m)
    wavelength light and a blue laser emitting 450 nm light emit the same
    power. Compare their rate of emitting photons.
12. One electron ejected from a clean zinc plate by ultraviolet light has
    kinetic energy of $4.0 \times 10^{-19}$ J.
    (a) What would be the kinetic energy of this electron when it reached
        the anode, if a retarding voltage of 0.90 V was applied between
        anode and cathode?
    (b) What is the minimum retarding voltage that would prevent this
        electron reaching the anode?
    (c) All electrons ejected from the zinc plate are prevented from
        reaching the anode by a retarding voltage of 4.3 V. What is the
        maximum kinetic energy of the electrons ejected from the zinc?
    (d) Sketch a graph of photocurrent versus voltage for this metal’s
        surface. Use an arbitrary photocurrent scale.
13. The waves of the electromagnetic spectrum share some similarities and
    have some differences. What are their similarities? What are their
    differences?
14. (a) What is the wavelength of the radio waves broadcast by station 2MMM
        in Sydney if the frequency of the broadcast is 104.9 MHz?
    (b) What is the energy of a photon of that wave?
15. Arrange the following electromagnetic waves in order of increasing
    energy levels: long-wave radio waves, gamma ray, blue light, red light,
    infra-red light, microwave radio waves, X-rays.



16. What were the features of radio waves that were demonstrated by Hertz’s
    experiments?
17. What is the energy of an X-ray photon of wavelength
    $2.5 \times 10^{-11}$ m?
18. If a black body was to gain sufficient energy to raise its temperature
    from 2000 K to 4000 K, describe how a plot of its radiant intensity
    versus radiant wavelength of the electromagnetic radiation would change.

PRACTICAL ACTIVITIES

11.1 PRODUCING AND TRANSMITTING RADIO WAVES

Aim
To demonstrate the production and transmission of radio waves.

Apparatus
induction coil
transformer rectifier with leads
small transistor radio

Method
1. Adjust the gap on the induction coil to about 5 mm and adjust the
   transformer to 6 V DC.
   Warning: The spark across the gap of an induction coil generates
   long-wavelength X-rays and short-wavelength ultraviolet radiation. These
   are potentially dangerous, and students should not stand closer than one
   metre.
2. Adjust the tuner of the radio, so that it does not receive a station.
3. Move around the room and try to estimate where the radio can receive the
   static noise from the spark.
4. Adjust the gap to 10 mm, and repeat the exercise.
5. Change the tuner of the radio and scan across the range of wavelengths.

Questions
1. What is the maximum distance from the induction coil at which the radio
   receives static noise from the 5 mm spark?
2. Is the distance different when the spark is produced by a 10 mm gap?
3. Can you detect any pattern in the static received at different
   wavelengths?
4. An induction coil is an example of a transformer. What can you infer
   about the voltage across the gap and the resulting charge movement
   observed as the spark?
5. In what form is the energy transferred from the spark to the radio? In
   what manner must charges move to produce this energy?



CHAPTER 12 THE DEVELOPMENT AND APPLICATION OF TRANSISTORS
Remember
Before beginning this chapter, you should be able to:
• discuss how the length, cross-sectional area,
temperature and type of material affect the
movement of electricity through a conductor
• describe the relationship between potential
difference, current and power dissipated
• recall the arguments relating to the ‘wave–particle’
debate of matter and energy.

Key content
At the end of this chapter you should be able to:
• identify that some electrons in solids are shared
between atoms and can move freely
• describe the difference between conductors,
insulators and semiconductors in terms of band
structures and relative electrical resistance
• identify absences of electrons in a nearly full band as
holes, and recognise that both electrons and holes
help to carry current
• compare the relative number of free electrons that
can drift from atom to atom in conductors,
semiconductors and insulators
• discuss the use of germanium and silicon as raw
materials in transistors
• describe how ‘doping’ a semiconductor can change
its electrical properties
• identify the differences in p-type and n-type
semiconductors in terms of the relative number of
negative charge carriers and positive holes
• discuss the differences between solid state and
thermionic devices and discuss why solid state devices
have largely replaced thermionic devices
• assess the impact on society of the invention of
transistors, relating particularly to microchips and
microprocessors.

Figure 12.1 An example of an integrated circuit

In this chapter you will extend your knowledge of the electrical nature of
matter and how this electrical nature led to the development of transistors
and integrated circuits. Thermionic (radio) valves, although often
unreliable, were used in all electronic appliances during the first part of
the twentieth century. The invention of the transistor, which did the
same job more reliably in most applications, enabled the miniaturisation
of electronics. This miniaturisation in turn made possible the
integrated circuit used in portable radios, CD players, computers and
digital phones.

PHYSICS IN FOCUS
Band structure in solids
You may previously have encountered the shell structure of electrons in
atoms, and you may have related this to the spectra emitted when atoms are
excited by electrical discharge or heating. (Sprinkle some sodium chloride
into a Bunsen flame to observe the bright orange spectrum of sodium.)
These spectral lines are produced by electrons being excited into a higher
energy shell and jumping back to a lower energy shell, emitting light of a
particular frequency as they do so. All atoms of an element have the same
electron shells or energy levels when they exist as individual atoms, but
this situation changes when they are present in solids.
In 1925, Wolfgang Pauli proposed what became known as the Pauli Exclusion
Principle. It can be stated simply that no two electrons can simultaneously
occupy the same energy state. (This is important in the electron structure
of individual atoms. All the electrons cannot collapse into the lowest
energy shell in an atom.)
In a gas, there is no problem with two well-separated atoms having
electrons in precisely the same energy state, but this does become a problem
as the atoms are brought closer together and the electrons from different
atoms begin to interact with each other. This interaction results in a
slight change in energy of the levels so that no two electrons have
identical energy. As more and more atoms are pushed closer together, this
results in what were precise energy levels in the individual atoms being
spread into energy bands in the solid.

Figure 12.2 The band structure of different types of solids, semiconductors
in particular, is important in the development of the solid state devices
studied in this chapter. (Diagram: energy plotted against separation of
atoms, showing the energy levels in isolated atoms spreading into energy
bands at the separation of atoms in the crystal.)

12.1 CONDUCTORS, INSULATORS AND SEMICONDUCTORS
Different materials vary greatly in their ability to conduct electricity.
Their conductivity depends on the ease with which electrons are
able to move through the crystal lattice. In materials that are good
insulators, the atoms in the lattice are held by strong covalent bonds in
which electron pairs are shared between atoms. This sharing means that

CHAPTER 12 THE DEVELOPMENT AND APPLICATION OF TRANSISTORS 213


electrons are held tightly and are not available to conduct electricity
through the lattice.
Metal lattices, on the other hand, consist of an orderly array of positive
metal ions (see figure 12.3). To maintain stability, valence electrons are
‘delocalised’, or free to move like a cloud of negative charges throughout
the lattice. These electrons can conduct electricity through the lattice.
Delocalised electrons in the metal lattice move randomly between atoms.
Under the influence of an electric field, a net drift in the direction
opposite to the electric field is superimposed on this random motion.
This net motion produces the electric current.

Figure 12.3 In a metal, the positive ions of the lattice are surrounded by
delocalised electrons.

The de Broglie model of electron orbitals around the nucleus of an
atom (see pages 215–216) explained why electrons in an atom can have
only certain, well-defined energies. For any particular element, the
highest energy level available for electrons to occupy may, or may not, be
completely filled. Elements whose outermost energy level or shell is not
filled tend to fill it by bonding with other elements. They
may do this by:
• gaining electrons from other atoms, forming ionic bonds
• giving electrons to other atoms, forming ionic bonds
• sharing electrons with other atoms, such as in covalent bonds.
The effect of this transfer of electrons in ionic bonds, and of the sharing
of electrons in covalent bonds, is to fill electron shells.
A semiconductor is a material in which resistance decreases as its
temperature rises. Its resistivity lies between that of a conductor and
an insulator.

A valence band is the energy band in a solid in which the outermost
electrons are found.

When atoms of any type of substance, including insulators, semiconductors
and conductors, are very close together (such as in a solid)
their highest electron energy levels overlap in a continuous fashion.
Regions within these highest energy levels are called energy bands. The
highest energy band occupied by electrons at absolute zero is called the
valence band. In a conductor, the valence band and the conduction band
overlap (see figure 12.4(a) below). Electrons from the valence band are
able to move freely because the band is only partially filled.
In an insulator, electrons completely fill the valence band, and the gap
between it and the conduction band is large (see figure 12.4(b) below).
The electrons cannot move under the influence of an electric field unless
they are given sufficient energy to cross the large energy gap to the
conduction band.

Figure 12.4 Energy bands for (a) a conductor and (b) an insulator.
(Diagram: energy levels of the valence band and conduction band, which
overlap in the conductor and are separated by an energy gap in the
insulator.)



In an ionic crystal such as sodium chloride, the bond between the
sodium and the chlorine atom has been accomplished by the sodium atom
giving away an electron in its outermost, unfilled electron orbital level to
the chlorine atom. This enables the chlorine to fill its outermost electron
orbital level without overlapping of bands. Both the positively charged
sodium (that has lost an electron) and the negatively charged chlorine
(that has gained the electron from the sodium atom) are ions. They have
filled outer electron orbitals corresponding to completely filled bands.
Solid sodium chloride therefore behaves as an insulator. The sodium and
chlorine ions have no surplus electrons in their outer electron orbital
energy levels to transfer from atom to atom. The application of an electric
field cannot cause the net movement of a current of electrons. However, it
is not impossible for an electric current to flow in an insulator. If the
applied electric field is sufficiently large, even an insulator can conduct.

PHYSICS FACT
de Broglie’s wave model
The concept of the wave–particle duality of light states that light can act
both as a wave and as a particle at different times. If, for example, we
carry out an experiment to study the interference of light, we are observing
light behaving as a wave. Similarly, by experimenting with the photoelectric
effect, we are observing light behaving as a particle. French physicist
Louis de Broglie (1892–1987) proposed that electrons should also demonstrate
a wave-like nature.
Working with Einstein’s theory and Planck’s quantum theory, he derived an
expression relating momentum and wavelength.
He combined Einstein’s equation, $E = mc^2$, and Planck’s equation,
$E = hf$, to show that
$mc^2 = hf$.
Using momentum, $p = mv$, and replacing v with c (since c = the velocity of
light), he obtained
$pv = hf$.
After substituting $\lambda = \frac{v}{f}$,
$\lambda = \frac{h}{p} = \frac{h}{mv}$
is obtained
where
$m$ = mass (kg)
$v$ = velocity (m s$^{-1}$)
$c$ = speed of light in a vacuum
$p$ = momentum = $mv$ (N s)
$h$ = Planck’s constant = $6.626 \times 10^{-34}$ J s
$\lambda$ = wavelength (m)
$f$ = frequency (Hz).
In about 1923, de Broglie put forward the idea that, just as light could be
thought of as having particle characteristics, electrons could act as a
wave. Describing his approach many years later, he said that as he observed
the patterns formed by standing waves in a string he wondered what would
happen if the string was bent into a circle. Stable patterns would form when
multiples of the wavelength corresponded to the circumference (or length) of
the string. Extending this to the circumference of the Bohr orbit in an
atom, and using the wavelength of an electron, only certain orbits would be
stable, namely those for which the circumference was equal to a multiple
of the wavelength. Here was an opportunity to explain the assumptions made
by Bohr. The assumptions were:
• electrons could occupy stable, non-radiating orbits. Electrons moving in
  circular motion undergo acceleration and, according to classical physics,
  should radiate electromagnetic energy. Losing energy would drive the
  electron into increasingly smaller orbits, finally collapsing into the
  nucleus. Since we know that atoms are stable, the energy levels that the
  electrons occupy must also be stable and cannot radiate energy. These are
  the ‘non-radiating’ states referred to by Bohr.
• radiation emission and absorption by atoms can only occur in quantised
  amounts. Electrons can only move between these discrete levels and must
  absorb or emit only the amount of energy needed to move between energy
  levels.
The cornerstone of de Broglie’s idea was that the electron orbiting the atom
must have a standing-wave pattern of vibration so that its orbit does not
destructively interfere with itself. Since the orbital level represents an
energy level, only electron energy levels where the electron orbits the
nucleus with a standing wave
(continued)

CHAPTER 12 THE DEVELOPMENT AND APPLICATION OF TRANSISTORS 215


pattern consisting of a ‘whole’ number of wave- have an energy precisely equal to the energy dif-
lengths is possible. Intermediate electron energy ference between the orbital energy levels. Again
levels can’t be stable as they would produce an this energy release must be of a specific size — a
interfering wave character in the orbiting elec- quantum. De Broglie’s hypothesis is discussed
tron. The electron can absorb energy and move further in the option topic ‘From Quanta to
to a higher standing-wave energy level but this Quarks’ (see pages 444–447).
energy absorption must be of a specific amount; De Broglie’s wave model of electrons allowed
that is, a quantum. The same electron can move electrons to orbit the nucleus only when the cir-
to a lower energy orbital (that produces a cumference of the circular orbit was a whole
standing-wave pattern of orbit) by emitting radi- number of wavelengths (see figure 12.5).
ation energy as a photon. This photon must
Figure 12.5 A model of the atom showing an electron as a standing wave.
(a) Circumference = 2 wavelengths, 4 wavelengths and 9 wavelengths.
(b) Unless a whole number of wavelengths fit into the circular hoop,
destructive interference occurs and causes the vibrations to die out
rapidly.
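
The relationship λ = h/(mv) and the standing-wave condition can be checked numerically. The following is a minimal sketch, not part of the original text. The electron mass and the radius and speed of the first Bohr orbit of hydrogen are assumed standard values, and the function names are chosen only for illustration. If the model holds, the number of wavelengths that fit around the first orbit should come out very close to 1.

```python
import math

H = 6.626e-34           # Planck's constant (J s), as quoted in the text
M_ELECTRON = 9.109e-31  # electron mass (kg); assumed standard value

# Assumed values for the first Bohr orbit of hydrogen (not from the text).
BOHR_RADIUS = 5.292e-11  # m
BOHR_SPEED = 2.188e6     # m/s


def de_broglie_wavelength(mass_kg, speed_m_s):
    """Return the de Broglie wavelength, lambda = h / (m v), in metres."""
    return H / (mass_kg * speed_m_s)


def wavelengths_in_orbit(radius_m, mass_kg, speed_m_s):
    """Return how many wavelengths fit around a circular orbit."""
    circumference = 2 * math.pi * radius_m
    return circumference / de_broglie_wavelength(mass_kg, speed_m_s)


if __name__ == "__main__":
    lam = de_broglie_wavelength(M_ELECTRON, BOHR_SPEED)
    n = wavelengths_in_orbit(BOHR_RADIUS, M_ELECTRON, BOHR_SPEED)
    print(f"wavelength = {lam:.3e} m")
    print(f"wavelengths per orbit = {n:.3f}")
```

An orbit for which this ratio is not a whole number would, in de Broglie’s picture, interfere destructively with itself and could not be stable.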

12.2 BAND STRUCTURES IN SEMICONDUCTORS
In a semiconductor, the gap between the valence and the conduction
bands is smaller than that in an insulator (see figure 12.6).

Figure 12.6 Energy bands in (a) a conductor (b) semiconductor and (c) an
insulator. [Energy-level diagram: in (a) the conduction band and valence
band overlap; in (b) and (c) the conduction band and valence band are
separated by an energy gap.]

Table 12.1 Differences between the structure of conductors and
non-conductors

                   TYPE OF MATERIAL
                   INSULATOR
                   (NON-CONDUCTOR)     SEMI-CONDUCTOR    CONDUCTOR
Valence band       Completely filled   Almost filled     Partly filled
Conduction band    Well separated      Just separated    Overlapping
Energy gap         Large               Small             Very small/non-existent



In many materials, including metals, resistivity increases with
temperature. In some materials, such as the semiconductors germanium and
silicon, resistivity decreases markedly with increasing temperature (see
figure 12.7). The increased thermal energy causes some electrons to move
to levels in the conduction band from the valence band. Once there, they
are free to move under the influence of an electric field.

Figure 12.7 Resistivity as a function of temperature for a pure silicon
semiconductor
At absolute zero, all of the electrons
in a semiconductor occupy the valence band and the material acts as an
insulator. As the temperature of the semiconductor material increases,
thermal energy allows some electrons to cross the gap into the conduc-
tion band. This leaves the valence band unfilled. This means that holes
have been created in the valence band where the electrons have left.
These holes actually act as a positive flow of current moving in the oppo-
site direction to the electron current flow. (Note that electron current
refers to the movement of the electrons and is in the opposite direction
to conventional current.) Thus, conduction is possible in both the con-
duction band as a flow of electrons and in the valence band as a flow of
positive holes. The hole current flows towards the negative potential
while the electron current flows towards the positive potential. The speed
of the electron current is, however, much greater. This is because the
hole current must move from single atom to single atom whereas the
electron current can simply flow through the overlapping conduction
bands of adjacent atoms.
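
The effect of temperature on the number of electrons able to cross the gap can be illustrated with a rough calculation. The sketch below is not from the original text. It assumes the simple exponential factor exp(−Eg/2kT) as a relative measure of thermal excitation, and an energy gap of about 1.12 eV for silicon. Both are assumptions made for illustration only.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K (assumed standard value)
E_GAP_SI = 1.12    # approximate energy gap of silicon in eV (assumption)


def excitation_factor(temp_k, gap_ev=E_GAP_SI):
    """Relative measure, exp(-Eg / 2kT), of electrons excited across the gap."""
    if temp_k <= 0:
        # At absolute zero no electrons reach the conduction band.
        return 0.0
    return math.exp(-gap_ev / (2 * K_B_EV * temp_k))


if __name__ == "__main__":
    for temp in (0, 250, 300, 350, 400):
        print(f"T = {temp:3d} K  relative excitation = {excitation_factor(temp):.3e}")
```

Under these assumptions, a rise from 300 K to 400 K increases the factor by more than a hundredfold, which is consistent with the marked fall in resistivity described above.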
Sometimes an impurity atom (dopant), of a different type to the atom
making up the main crystal lattice of a semiconductor material, is present
in a semiconductor crystal lattice. If that dopant atom has a different
number of valence electrons from the atom of the semiconductor it
replaces, extra energy levels can be formed within the energy gap between
the valence and conduction bands. This means it is easier for these
materials to conduct because the energy difference between the valence and
conduction bands for such dopant atoms is less. The number of dopant
atoms needed to create a difference in the ability of a semiconductor to
conduct is very small. Replacing only one in every 200 000 semiconductor
atoms with a dopant atom is enough.

A dopant is a tiny amount of an impurity that is placed in an otherwise
pure crystal lattice to alter its electrical properties.
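
To give a sense of scale for a doping level of one atom in every 200 000, the following sketch estimates how many dopant atoms this represents in each cubic centimetre of silicon. The atomic density of silicon used here (about 5 × 10^22 atoms per cubic centimetre) is an assumed approximate value that does not appear in the text.

```python
SI_ATOMS_PER_CM3 = 5.0e22   # approximate atomic density of silicon (assumption)
DOPING_RATIO = 1 / 200_000  # one dopant atom per 200 000 host atoms (from the text)


def dopant_density(host_atoms_per_cm3, ratio):
    """Return the number of dopant atoms per cubic centimetre."""
    return host_atoms_per_cm3 * ratio


if __name__ == "__main__":
    density = dopant_density(SI_ATOMS_PER_CM3, DOPING_RATIO)
    print(f"dopant atoms per cubic centimetre = {density:.2e}")
```

Even this very small proportion corresponds to an enormous number of dopant atoms, each able to contribute a charge carrier.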
There are two main types of semiconductor materials: intrinsic and
extrinsic.
• Intrinsic — the semiconducting properties of the material occur
naturally. No doping of the crystal lattice is necessary to enable the
material to act as a semiconductor. Examples include the elements
silicon and germanium.
• Extrinsic — the semiconducting properties of the material are manufac-
tured to behave in the required manner. Generally this means that the
material is a naturally occurring semiconductor that has its
semiconducting properties modified by the addition of dopant atoms.
These are generally silicon or germanium with small impurity levels of
dopants, such as phosphorus or boron.
Nearly all of the semiconductors used in modern electronics are
extrinsic and based on silicon.



Table 12.2 Some comparisons of resistivity between conductors and
semiconductors

MATERIAL                      TYPE                        RESISTIVITY (Ω m)
Aluminium                     Metallic conductor          $2.5 \times 10^{-8}$
Copper                                                    $1.6 \times 10^{-8}$
Iron                                                      $9.0 \times 10^{-8}$
Silver                                                    $1.5 \times 10^{-8}$
Constantan (copper, nickel)   Alloy metal conductor       $4.9 \times 10^{-7}$
Nichrome (Ni, Cr, Fe)                                     $1.1 \times 10^{-6}$
Germanium                     Semi-metal semiconductor    0.9
Silicon                                                   2000
Glass                         Non-metallic insulator      From $10^{10}$ to $10^{14}$
Mica                                                      From $10^{11}$ to $10^{15}$
Polystyrene                                               From $10^{15}$ to $10^{19}$
Wax                                                       From $10^{12}$ to $10^{17}$
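
The size of the differences in table 12.2 is easier to appreciate when the resistivities are converted into resistances. The sketch below is an illustration only. It uses the standard relationship R = ρL/A, which is not stated in this section, and an arbitrarily chosen sample 1 m long with a cross-sectional area of 1 mm². The resistivity values are copied from the table.

```python
# Resistivities in ohm metres, copied from table 12.2.
RESISTIVITY = {
    "copper": 1.6e-8,
    "nichrome": 1.1e-6,
    "germanium": 0.9,
    "silicon": 2000.0,
}


def resistance(material, length_m, area_m2):
    """Return the resistance in ohms of a uniform sample, R = rho * L / A."""
    return RESISTIVITY[material] * length_m / area_m2


if __name__ == "__main__":
    length = 1.0  # m (illustrative choice)
    area = 1.0e-6  # m^2, that is 1 square millimetre (illustrative choice)
    for name in RESISTIVITY:
        print(f"{name:10s} R = {resistance(name, length, area):.3e} ohm")
```

For the same dimensions, the pure silicon sample has a resistance more than one hundred thousand million times that of the copper sample.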

Making a semiconductor
The most widely used semiconductor materials are made from crystals
of elements from Group 4 of the periodic table. These elements have
four electrons in their valence band. They fill the valence band to eight
electrons by sharing an electron with each of four adjacent atoms. Each
of these four atoms also contributes a single electron — forming a pair
of electrons that is a bond between the atoms. In turn, each of the four
atoms bonded to the first atom share a single electron with four adja-
cent atoms. In this way, each of the atoms has its own four valence band
electrons and shares four single electrons from four adjacent atoms.
The atom then appears to have eight electrons in its valence band and
the band is filled. This sharing of electrons between atoms to form a
bond between the atoms is known as covalent bonding.
Two Group 4 elements, silicon and germanium, were predicted to be
ideal for the production of electronic components because of their
semiconducting properties.

A covalent bond is a strong chemical bond formed between atoms by the
sharing of electrons in the valence band.

Silicon
The conducting properties of silicon can be related to its crystal structure.
Silicon crystal forms the so-called diamond lattice where each atom has
four nearest neighbours at the vertices of a tetrahedron (see figure 12.8).
The tetrahedron consists of a silicon atom at the centre with the four other
silicon atoms bonded to it and forming a triangular pyramid about it.
This fourfold tetrahedral coordination uses the four outer (valence)
electrons of each silicon atom. According to the quantum theory, the
energy of each electron in the crystal must lie within well-defined bands.
The next higher band above the valence band, where the outer four
electrons exist, is the conduction band. The conduction band is separated
from the valence band by an energy gap. Heating the semiconducting
material enables some electrons to gain enough energy to jump that gap
from the valence band to the conduction band. This means one bond of
the tetrahedron is no longer complete. This incomplete bond is a hole.

Figure 12.8 The lattice structure of silicon (labels: silicon atom,
electron)

Germanium and zone refining


Germanium was the first Group 4 element that could be sufficiently
purified to behave as a semiconductor. Germanium as an element is



relatively rare, making up less than 1.5 parts per million of the Earth’s
crust. It is never found in an uncombined form in nature, existing only as
a compound.
Early diodes and transistors were made from germanium because
suitable industrial techniques were developed to purify the germanium
to the ultrapure level required for semiconductors during World War II.
Germanium has one major problem when used in electronic com-
ponents: it becomes a relatively good conductor when it gets too hot. The
conductivity level means that hot germanium electronic components
allow too much electric current to pass through them. This can damage
the electronic equipment and cause it to fail to perform the task for which
it was designed. The problem is that the resistance to electric current flow
that makes the semiconductor useful in electronic components also
generates heat.

Advantages of silicon
Silicon was the other element with semiconducting properties that was
predicted to be ideal for the production of electronic components.
Unlike germanium, silicon is very common in the Earth’s crust. Like
germanium, silicon never appears as a free element in nature. Silicon is
always combined into chemical compounds so it has to be purified
before it can be used in the production of semiconductors. Almost every
grain of sand you see is made of silicon dioxide, so silicon as a raw
material is far more plentiful than the rare germanium.
The problem with using silicon in electronic components is that it is
more difficult to purify. However, silicon makes the most useful semi-
conductors for electronics. It is affected less by higher temperatures in
terms of maintaining its performance level.
The first commercial silicon transistors were made in 1954 by Gordon Teal
working for Texas Instruments. After the production of those first silicon
transistors, the germanium transistors were largely phased out of
production, except for specialised applications. From the 1960s onwards,
silicon became the material of choice for making solid state devices. It is
much more abundant than germanium and retains its semiconducting
properties at higher temperatures.

12.3 DOPING AND BAND STRUCTURE


A pure semiconductor (called an intrinsic semiconductor) has the right
number of electrons to fill the valence band. Semiconductors can
conduct electricity only if electrons are introduced into the conduction
band, or are removed from the valence band to create holes. Electrons
are the negative charge carriers in the conduction band, and holes in the
valence band act as positive charge carriers.
As we have seen, the process of doping is one method of enhancing
the conductivity of a semiconductor. A tiny amount of an impurity atom
is introduced into the semiconductor crystal lattice to alloy with the
material. This process produces designer semiconductors that are said to
be extrinsic semiconductors.

Extrinsic semiconductors
There are two types of extrinsic semiconductors: n-type semiconductors
and p-type semiconductors.



N-type semiconductors are formed when a Group 5 impurity atom
(such as phosphorus or arsenic) is substituted into a silicon crystal
lattice, replacing an atom of silicon (see figure 12.9). Group 5 atoms have
five electrons in the valence band whereas silicon and germanium, as
Group 4 atoms, have only four electrons in the valence band. When this
substitution occurs, four of the five outermost electrons from the doping
atom fill the valence band just like electrons from a silicon atom would.
The one extra electron is promoted to the conduction band. These
impurity atoms are in this case called donor atoms. The extra electrons
from the donor atoms in the conduction band are mobile. Since they are
electrons, and hence carry a negative electrical charge, a semiconductor
doped in such a way so as to produce an excess of negative charge
carriers is called an n-type semiconductor.

Figure 12.9 Lattice structure of silicon doped with arsenic (labels:
silicon atom, arsenic atom, electron, extra electron)

P-type semiconductors are formed in a similar fashion to n-type
semiconductors. A Group 3 atom (such as boron or gallium) is substituted
into the crystal lattice in place of a Group 4 atom of silicon or
germanium (see figure 12.10). The Group 3 atom has only three electrons
in the valence band. This means that when such an atom replaces a
Group 4 atom, there is one electron short in the tetrahedral structure.
This means that a hole has effectively been incorporated into the crystal
lattice without the need to elevate an electron from the valence band to
the conduction band. The holes act as positive charge carriers that are
mobile and carry current.

Figure 12.10 Lattice structure of silicon doped with gallium (labels:
silicon atom, gallium atom, electron, electron hole)

When the doping impurity has only three electrons, for example, when
indium is added to germanium, there is one site unfilled by electrons.
This hole actually acts as a mobile positive charge carrier when an
electric field is applied. The movement of the electron is shown in
figure 12.11. Under the effect of an electric field, electrons move
towards the positive terminal of the source and move into any hole. Since
there are no free electrons available, the deficient indium atoms near
the positive terminal attract electrons from their neighbours,
disrupting covalent bonds. This creates new holes in the adjacent atoms.
As electrons continue to move towards the positive terminal the holes
move in the opposite direction.

Figure 12.11 Lattice structure of silicon doped with indium showing the
electron movement (labels: electron gap, movement of the gap as it is
filled by an electron)

12.4 THERMIONIC DEVICES


In electrical appliances there is often a need to control the direction of
current flow, convert AC into DC, switch current flow on or off or
amplify a current. Devices within the appliance control this. Prior to the
invention of solid state devices, such as the solid state diode or the
transistor, thermionic or valve devices accomplished these tasks.
Thermionic devices utilise heated filaments and terminals set in glass
vacuum tubes, such as the older radio valves. The filament in the vacuum
valve is heated by an electric current, causing it to liberate electrons and
act as a cathode. These electrons are then accelerated by a high potential
difference towards an anode. The simplest example of such a valve is the
diode (see figure 12.12).

A valve is a thermionic device in which two or more electrodes are
enclosed in a glass vacuum tube. The name comes from the rectifying
property of the device; that is, the current flows in only one direction.

A diode contains only two electrodes.



Figure 12.12 (a) Example of a thermionic diode. Electrons are emitted from
a heated filament and accelerated to the anode (labels: anode, filament,
heating circuit). (b) A photograph of a thermionic valve, of the type used
in the mid-twentieth century

A diode has two elements inside the glass vacuum tube: a cathode and
a plate (or anode). The cathode may be heated directly (with current
flowing through the cathode) or indirectly (with a separate filament).
With the negative terminal of a battery attached to the cathode and the
positive terminal attached to the anode, electrons will flow through the
diode, creating an electric current. If the battery is connected in
reverse, no current will flow. Such ‘unidirectional conduction’ makes a
diode suitable as an ‘electronic switch’ and for converting AC into DC.
This is called rectification.
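
Rectification can be modelled very simply. The sketch below is an illustration only and treats the diode as ideal: it passes any positive voltage unchanged and blocks any negative voltage completely. A real valve or junction diode does not behave this perfectly, and the function names are chosen only for this example.

```python
import math


def ac_samples(amplitude, n_points):
    """Return one cycle of a sinusoidal AC voltage as a list of samples."""
    return [amplitude * math.sin(2 * math.pi * i / n_points) for i in range(n_points)]


def ideal_diode(voltage):
    """Pass positive voltages unchanged and block negative voltages."""
    return voltage if voltage > 0 else 0.0


def rectify(samples):
    """Apply the ideal diode to every sample (half-wave rectification)."""
    return [ideal_diode(v) for v in samples]


if __name__ == "__main__":
    supply = ac_samples(amplitude=10.0, n_points=12)
    for v_in, v_out in zip(supply, rectify(supply)):
        print(f"in = {v_in:7.2f} V   out = {v_out:6.2f} V")
```

The output never changes sign, so the current it drives always flows in the same direction, although it still rises and falls.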
Thomas Edison (1847–1931) (of electric light fame) observed that an
electric current flowed between the cathode and the positively charged
plate in the first diode. Lee de Forest (1873–1961) added a third
electrode to the vacuum diode and demonstrated that the valve now
acted as a current amplifier. This third electrode was called the ‘grid’ and
the valve was called the triode.
The grid circuit can be adjusted separately from the anode (or plate)
circuit. Because the grid is closer to the cathode than the anode is, a
voltage placed on the grid has a much larger effect on the electric field
within the valve than the same voltage placed on the anode. The grid can
therefore be used to control the anode current (see figure 12.13(a)).
When an alternating voltage, or signal, is applied to the grid, the
electron current is an amplified replica of the signal voltage. Because a
high voltage is applied between the cathode and the anode, small
variations in the grid voltage produce amplified signals in the anode
circuit.

PHYSICS FACT

Lee de Forest’s invention of the triode valve in 1906 began the
electronic age. He called the triode the ‘audion’. He did not realise
the potential of his device until 1912 when he discovered that the
triode could be used to amplify sound and electromagnetic waves.
The invention of this device made possible the whole range of elec-
tronics we accept as normal today: radio, television, radar, computers
and long-distance telephony.

This was important in the early days of radio reception. A radio wave
consists of a ‘carrier wave’ with the signal superimposed. The wave hit-
ting an antenna generates an alternating electrical current. The carrier
wave is removed, leaving a small AC current signal. This small AC signal
current can be applied to the grid, with the amplified signal passing to a
loudspeaker to produce sound.
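
The idea of a signal superimposed on a carrier wave, and the effect of passing the combined wave through a diode, can be sketched numerically. The example below is an illustration only. The carrier and voice frequencies, the modulation depth and the ideal-diode model are all assumptions chosen to keep the arithmetic simple, and they do not describe any particular radio.

```python
import math


def am_signal(t, carrier_hz, voice_hz, depth=0.5):
    """Return a carrier wave whose amplitude is varied by a voice signal."""
    envelope = 1.0 + depth * math.sin(2 * math.pi * voice_hz * t)
    return envelope * math.sin(2 * math.pi * carrier_hz * t)


def detect(samples):
    """Pass the samples through an ideal diode, removing the negative half."""
    return [max(s, 0.0) for s in samples]


if __name__ == "__main__":
    sample_rate = 100_000  # samples per second (illustrative)
    times = [i / sample_rate for i in range(2000)]
    wave = [am_signal(t, carrier_hz=10_000, voice_hz=500) for t in times]
    detected = detect(wave)
    print(f"largest value of modulated wave = {max(wave):.3f}")
    print(f"smallest value after the diode  = {min(detected):.3f}")
```

After the diode, the peaks of the remaining half-waves still rise and fall with the voice signal, which is why the voice information can be recovered.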



Figure 12.13 (a) Example of a triode amplifier circuit (labels: anode,
grid, cathode) (b) A voice current superimposed on a carrier wave after
passing through a diode. Graphs of current against time: (i) Carrier
current (ii) Voice current from microphone (iii) Carrier with voice added
(iv) Effect of passing (iii) through diode

12.5 SOLID STATE DEVICES


Modern appliances utilise solid state devices, such as transistors and
integrated circuits. Solid state devices are made from semiconductor
materials. For example, a junction or interface between a p-type
semiconductor and an n-type semiconductor acts as a diode (see
figure 12.14). This combination allows current to flow in only one
direction. Electrons move across the junction from the n-type
semiconductor (see figure 12.15), neutralising available holes in the
p-type semiconductor material (see figure 12.16). This zone adjacent to
the interface is called the ‘depletion zone’.
The depletion zone exerts a ‘force’ that resists the movement of any
more electrons into the region. This effectively means that the
conventional electrical current flow is confined to one direction only;
that is, from the positive terminal of the battery, into the p-type
semiconductor material and then out of the n-type semiconductor material
to the negative terminal of the battery. A diode connected in such a way
is said to have a forward bias applied to it.

Figure 12.14 The holes and electrons in a p–n junction (labels: p–n
junction, holes in p-type semiconductor, electrons in n-type
semiconductor)

Figure 12.15 Energy band changes in a doped n-type semiconductor. The
level is called ‘donor’ because electrons are donated to the conduction
band. (Labels: conduction band, donor level, valence band)

Figure 12.16 Energy band changes in a doped p-type semiconductor. The
level is called ‘acceptor’ because electrons are accepted from the
valence band. (Labels: conduction band, acceptor level, valence band)

eBook plus Weblink: Semiconductors: p- and n-type



If the battery is connected the other way — that is, with the p-type
material connected to the negative terminal of the battery and the n-type
material connected to the positive terminal — no current, or only very
minute currents, can flow through the diode. The diode in such a case is
acting as a large resistor and is said to be reverse biased (see figure 12.17).
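
The contrast between forward and reverse bias can be illustrated with the ideal diode equation, I = Is(e^(V/VT) − 1). This equation does not appear in the text and is introduced here only as a commonly used model. The saturation current Is and the thermal voltage VT in the sketch below are assumed, illustrative values.

```python
import math

SATURATION_CURRENT = 1e-12  # A; assumed illustrative value
THERMAL_VOLTAGE = 0.02585   # V; approximate value of kT/q near 300 K (assumption)


def diode_current(voltage, i_s=SATURATION_CURRENT, v_t=THERMAL_VOLTAGE):
    """Return the current in amperes through an ideal p-n junction diode."""
    return i_s * (math.exp(voltage / v_t) - 1.0)


if __name__ == "__main__":
    for v in (-5.0, -1.0, 0.0, 0.3, 0.6):
        print(f"V = {v:5.1f} V   I = {diode_current(v):.3e} A")
```

Under these assumptions, a forward bias of 0.6 V gives a current of the order of milliamperes, while any reverse bias gives only a minute current of about Is flowing the other way.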

Figure 12.17 (a) A junction diode. (b) Its circuit symbol. The arrow on a
diode indicates the direction of conventional current flow possible
through the diode. Note that it is unidirectional. (Labels: p-type is the
anode and n-type is the cathode. Forward bias current flows when the
positive battery terminal is connected to the anode and the negative
terminal to the cathode. The reverse bias connection has the negative
terminal on the anode and the positive terminal on the cathode.)

PHYSICS IN FOCUS
The p-n junction
A small region at the junction of a p-type semiconductor and an n-type
semiconductor is the key to the operation of diodes, transistors and
photovoltaic cells. This region is called the depletion region or
depletion zone, and it is formed by the diffusion of charge carriers from
one type of material to the other.
In n-type material, the dominant charge carriers are the electrons from
the donor atoms. The n-type material will contain some positively charged
donor ions and some electrons contributed by the donors. There will be a
relatively small number of electrons and positive holes due to the
material’s intrinsic semiconductor properties.
In p-type material, the dominant charge carriers are the positive holes
from the acceptor atoms. The p-type material will contain some negatively
charged acceptor ions and the corresponding positive holes. Of course,
there will also be a relatively small number of electrons and positive
holes due to the material’s intrinsic semiconductor properties.
The mobile charge carriers from one type of material will diffuse into
the other type of material. The thickness of the depletion zone is much
less than the distance that the charge carriers can diffuse through the
material.
As electrons from the n-type material diffuse into the p-type material,
they will introduce negative charge to the p-type material and leave the
n-type material positively charged.
Similarly, as holes from the p-type material diffuse into the n-type
material, they will introduce positive charge to the n-type material and
leave the p-type material negatively charged.
The diffusion of electrons from n-type to p-type and the diffusion of
holes from p-type to n-type both contribute to the build-up of positive
charge on the n-type material and negative charge on the p-type material.
Hence an electric field is established across the depletion zone. As the
diffusion of charge carriers increases, this electric field also
increases, until it opposes further diffusion of the charge carriers
across the junction.

Figure 12.18 The electric field at a p-n junction. The orange dots
represent donor ions. These are the dopant atoms that have lost their
‘extra’ electron and become positively charged ions. The electrons have
diffused across the boundary into the p-type material. The red dots
represent acceptor ions. These are the dopant atoms that have accepted an
electron (which has diffused across from the n-type material). The
electrons have effectively filled the positive holes in the acceptor
atoms, and they have become negatively charged. The + signs represent a
very small number of positive holes that still exist, and the – signs
represent the very small number of conduction band electrons that are
still present. (Labels: n (contains donor atoms), excess positive charge;
p (contains acceptor atoms), excess negative charge; electric field)



As we can see in figure 12.18, the depletion zone at the junction
contains donor and acceptor ions and very few electrons or positive
holes, so it is almost devoid of mobile charge carriers.
(Even in equilibrium, there is a small flow of electrons from n-type to
p-type through the depletion zone. These electrons undergo recombination
with holes in the p-type material. There is an equal flow of electrons
from p-type to n-type. These electrons are produced by thermal generation
in the p-type material. The recombination current and thermal generation
current must be equal and opposite unless either of the two is altered by
the application of a potential difference across the junction.)
The presence of the electric field can be used to explain both the diode
nature of the p-n junction and also how the p-n junction acts as a
photovoltaic cell.

Energy bands in a p-n junction diode
The electric field is responsible for a change in energy of the valence
and conduction bands across the junction.
When a positive potential is applied to the n-type material and a
negative potential is applied to the p-type material, as shown in figure
12.19(b), the energy difference is increased and almost no electrons are
able to pass from the n-type to the p-type. (It is a similar case for
holes, which are unable to travel from p-type to n-type.) This is known
as reverse bias.
When a negative potential is applied to the n-type material and a
positive potential is applied to the p-type material, as shown in figure
12.19(c), the energy difference is decreased. Now large numbers of
electrons will be able to flow from n-type to p-type, and there will be a
correspondingly large flow of holes from p-type to n-type. This is known
as forward bias.

Figure 12.19 (a) The energies of the valence and conduction bands change
across a p-n junction when the junction is in thermal equilibrium and
there is no electric potential difference applied across the junction.
(b) The change in the band energies when a reverse bias voltage is
applied. Neither electrons nor holes will be able to flow across the
junction. (c) The band energies when a forward bias voltage is applied.
Large numbers of electrons will be able to flow from the n-type material
to the p-type material with a correspondingly large flow of holes from
p-type to n-type. (Each panel plots electron energy for the conduction
band and valence band across the n-type material, the transition region
and the p-type material, with the direction of the electric field
marked.)

(continued)

12.6 THERMIONIC VERSUS SOLID STATE DEVICES
Thermionic devices, such as radio valves and amplifiers, cannot match
the efficiency, cost or reliability of solid state devices. Appliances utilising
valves had a number of disadvantages compared with appliances utilising
solid state devices.
• Thermionic devices and appliances were bulky. Even radios advertised
as ‘portable’ could be lifted only with difficulty by a child. So much
power was required that batteries had to be large or numerous. A
‘portable’ radio that needed twelve D-sized batteries was not unusual.
• A large amount of heat was developed by the valves. This required
engineering solutions to protect surrounding electronics.



• Valves are fragile. Like a light globe, there is a seal between the evacu-
ated glass tube and the Bakelite (an early plastic) base through which
the leads pass from internal connections to the pins in the base. This
meant radios or tape recorders could not be carried as easily, or
treated as roughly as modern tape recorders or portable CD players.
• The cathode was coated with a metal that released large numbers of
electrons. The heat that was produced slowly boiled off the metal
coating and the coating reacted with the traces of gas present in the
tube.
• Valves had a relatively short lifetime. Technicians would start testing a
malfunctioning appliance by testing the valves, even replacing them to
see if the fault disappeared. Solid state devices are now among the least
likely faults. One of the original uses for the valve was in telephone
exchanges. As telephone networks began to expand rapidly in the late
1940s and 1950s, the unreliability of the valve began to be intolerable.
• Individual sockets and valves were mounted on a metal chassis.
Components were connected by insulated wires to other discrete com-
ponents. There was often movement between the chassis and sockets,
leading to broken solder joints. The glass envelopes of the valve were
fragile and the seals frequently broke, allowing air into the valve and
destroying it.
• High voltages were required to correctly bias the triodes to amplify
signals. This is in contrast to a silicon transistor that requires around
0.6 V to do the same job.

12.7 INVENTION OF THE TRANSISTOR


William Shockley is considered the ‘father’ of the transistor. However, he
was not present on 17 November 1947 when scientists John Bardeen and
Walter Brattain, at Bell Laboratories, observed that when electrical contacts
were applied to a crystal of germanium, the output power was larger than
the input. Shockley saw the potential and, over the next few months, greatly
extended the understanding of the physics of semiconductors. (For more
detail on the team’s work, see Practical activity 12.2, page 231.)
A transistor is a semiconductor device that can act as a switch or as part
of an amplifier. There are two types: npn transistors and pnp transistors
(see figure 12.20).

A transistor is a tiny switch that changes the size or direction of
electric current as a result of very small changes in the voltage across
it. Transistors are used in sound amplifiers and in a wide range of
electronic devices. Today, a single chip of silicon can hold many
microscopic transistors and is called an integrated circuit.

(Practical activity 12.2: Invention of the transistor)

Figure 12.20 npn and pnp transistors with their symbols: (a) npn
transistor (b) pnp transistor. In each, the three layers are labelled
collector, base and emitter.



A transistor is simply a combination of two junctions. One type consists
of a thin layer of n-type material between two sections of p-type
material (pnp). The other is the reverse, with a thin section of p-type
between two sections of n-type material (npn). The three connections are
called the emitter, base and collector (see figure 12.21).

Figure 12.21 A block diagram of a pnp transistor. Arrows indicate
electron movement. The collector (p), base (n) and emitter (p) are
connected to two bias voltages (bias voltage 1 and bias voltage 2).

Each of these junctions must be correctly biased in order for the
transistor to work.
For a pnp transistor, mobile electrons in the n-region initially move away
from the junctions towards the positive terminal. The holes in each of the
p-regions also move away from the junctions towards the negative
terminals. When the emitter is slightly positive, or forward biased, holes
move across the junction into the n-region, or base. Note that the junction
that is forward biased is the one between the emitter and the base. Most of
the holes do not recombine with electrons in the base but flow across the
second junction into the collector. The input impedance (electrical
resistance) between the emitter and the base is low, whereas the output
impedance is high. Small changes in the voltage of the base cause large
changes in the voltage drop across the collector resistance, making this
arrangement an amplifier.

Figure 12.22 Some of the numerous transistor cases and types

226 FROM IDEAS TO IMPLEMENTATION


12.8 INTEGRATED CIRCUITS
Integrated circuits (ICs) are tiny electronic circuits used to perform a
specific electronic function. An IC is formed as a single unit by diffusing
impurities into a single crystal of silicon. Each IC is really a complex
circuit built up in a three-dimensional manner in layers (see, for
example, figure 12.1, page 212). Each vertical section of the circuit can
be etched into the silicon by means of light beams.
Large-scale integrated circuits (LSI) have as many as five million circuit
elements, such as resistors and transistors, combined on one square of
silicon often less than 1.3 cm on a side. Hundreds of these can be arrayed
on a single wafer. All of the interconnections between the components
are built into the chip. This reduces the problems of variable resistance
in wiring and soldering components and improves the speed of signal
transmission. These chips are assembled into packages containing all of
the external connections to facilitate insertion into printed circuit
boards. (A summary of the process is shown in figure 12.24.)

The solid state revolution


The earliest solid state devices were discrete components; that is, each
transistor or diode was a separate item. Compared with thermionic valves,
they were small, used much less power and were more reliable.
The invention of integrated circuits made miniaturisation of electronic
circuits possible. Whole amplifiers could be built on a single chip
and connected to a circuit board. This meant that there were fewer
connections, less wiring and less heat produced than in comparable
thermionic device appliances. An added advantage of solid state devices was
that the time taken for signals to move around the circuit was much
shorter. Computers run much faster and handle incredible amounts of
data. Large-scale integration (LSI) has since led to hand-held calculators
that have more computing power than the computer onboard the Voyager
spacecraft.
By 1960, vacuum tubes were being supplanted by transistors, because
of their reliability and their cheaper construction costs. Computers
changed from a room full of cabinets requiring airconditioning and
stable power supplies to table-top models and eventually to hand-held
devices.
Discrete transistors soon gave way to integrated circuits (ICs), allowing
complete appliances to be built onto a single chip. The first silicon chip
was made in 1958 and, by 1964, ICs contained about ten individual
components on a chip approximately 3 mm square. In 1970, the same size
chip held one thousand components and, importantly, cost no more to
produce.
Figure 12.23 Thousands of tiny components make up the integrated
circuits of an everyday calculator.

In 1971, Intel produced its first microprocessor. As microcomputers
became popular, the demand for mass-produced ICs increased, further
reducing the cost. By 1983, over half a million components could fit on a
single chip. Now they hold millions of components and microprocessors
can be found in most household appliances.
In late 2007 Intel introduced a new line of power-efficient
microprocessors codenamed Penryn. They are based on a 45-nanometre
process that uses smaller and lower-power transistors. The laptop version was
introduced in 2008 and consumes 25 watts of power, a significant
reduction from the 35 watts of the older 65-nanometre chips. As well as
reducing power usage, the new chips can operate at higher clock speeds,
delivering 40% to 60% improvement in video and imaging performance.
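
The component counts quoted above (about ten in 1964, one thousand in 1970 and over half a million by 1983) can be turned into a rough average doubling time. The short Python sketch below is illustrative only: it assumes steady exponential growth between the quoted dates, and the function name `doubling_time` is hypothetical.

```python
import math

# (year, approximate components per chip) as quoted in the text.
# The 1983 figure is "over half a million", so 500 000 is a lower bound.
milestones = [(1964, 10), (1970, 1_000), (1983, 500_000)]


def doubling_time(year_a, count_a, year_b, count_b):
    """Average time in years for the component count to double,
    assuming steady exponential growth between the two dates."""
    doublings = math.log2(count_b / count_a)
    return (year_b - year_a) / doublings


for (y0, c0), (y1, c1) in zip(milestones, milestones[1:]):
    years = doubling_time(y0, c0, y1, c1)
    print(f"{y0} to {y1}: count doubled roughly every {years:.1f} years")
```

On these figures the count doubled in roughly a year during the 1960s and somewhat more slowly afterwards.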



Figure 12.24 Chip fabrication is the result of many steps and many chips
are produced on a single wafer of silicon.
(a) Production of the wafer of silicon with a silicon dioxide coating.
    Labelled parts of the prepared silicon wafer: photoresist, silicon
    dioxide layer, silicon nitride layer, silicon substrate.
(b) Pattern projected onto the wafer. Light is projected through a reticle
    (or mask) and a lens.
(c) Each patterned image turns into a single chip.
(d) After the photosensitive coating is removed, the exposed areas are
    etched away.
(e) Doping by showering with ions of the impurity produces the transistor
    (doped region).
    New photoresist is spun on wafer, and steps (b) to (d) are repeated.
(f) The transistors are connected by metal connections (metal connector).

Figure 12.25 (a) Packaged silicon chips (b) A magnified view of a
silicon chip

Socially, the advent of transistors has made continuing changes to the
way we live our lives. This book is being written using integrated circuits.
We communicate with friends using mobile phones that use chips. The
way we do business, travel, shop and are entertained: almost all our
activities rely on silicon microchip technology.

PHYSICS FACT
Applications of semiconductors: photovoltaic cells
One of the most important uses for silicon transistors is in photovoltaic
(PV) cells or solar cells. The operation of solar cells is based on the
formation of a junction. The natural potential difference of the junction
permits current to flow from the p side to the n side, remembering that the
definition of conventional current is from the positive terminal through the
circuit to the negative terminal.
To make solar cells, discs or wafers of crystalline silicon undergo a
number of steps, such as:
• grinding and cleaning
• doping
• metallisation
• anti-reflection coating.
The resulting cell is shown in figure 12.26 with the p-type and n-type
materials joined to produce a ‘sandwich’.



When light strikes the cell, some is absorbed within the semiconductor
material and the energy of the absorbed light is transferred to the
semiconductor. The light photons knock the electrons from the valence band
to the conduction band, allowing them to move freely. By placing metal
contacts on the top and bottom of the solar cell, the current of moving
electrons can be used externally. Common uses of photovoltaic cells
include the powering of calculators, telephones, lights and radio warning
beacons.

Figure 12.26 Principal structure of a crystalline solar cell (labelled
parts: front contact, n-type silicon, pn junction, p-type silicon, back
contact)

Every PV cell has an electric field which forms where the n-type and p-type
silicon are in contact. The free electrons in the n side move into the free
holes on the p side. Before this movement of electrons, the p- and n-type
semiconductor silicon wafers making up the PV cell are electrically
neutral, but now the electrons move to the p side. The holes at the p–n
junction fill first, making it difficult for later electrons to move across
the junction. Eventually, a level of equilibrium is reached in terms of the
charge distribution about the junction. This equilibrium produces an
electric field between the p- and n-type sides of the PV cell. This field
acts as a diode, allowing electrons to flow from the p side to the n side of
the PV cell, but not the reverse. It acts like an energy hill: electrons
can easily go down the hill (to the n side), but cannot reverse up the hill
(to the p side). This one-way current can be stimulated to continue if the
electrons are collected by a metal plate on the n-type side of the PV cell
and fed back to the p-type side of the PV cell by a circuit wire. The
current of electrons in the external circuit can do work because of the
potential difference that exists between the p- and n-type sides of the
PV cell.
The mechanism of current production by a PV cell is as follows. When light,
in the form of photons, hits a cell, its energy frees electron–hole pairs.
Each photon with more than the minimum energy level will free exactly one
electron. As a result, the freeing of the electron produces a hole. The
electric field will then send the electron to the n side, and the hole will
tend to migrate to the p side of the PV cell.
If an external circuit is provided (as shown in figure 12.27), electrons
will flow through the circuit path to the p side, releasing energy and
doing work.
To minimise energy losses from the PV cell, it is covered by a metallic
contact grid that shortens the distance that electrons have to travel. The
metal conducting grid also covers only a small proportion of the surface of
the wafer. In spite of this, some photons are blocked by the metal
conducting grid. The grid cannot be too small or its own resistance will
increase and be unacceptably high. There is a trade-off between grid size
and current generation.
An anti-reflective coating (see figure 12.28) is applied to the top of the
PV cells to reduce reflection losses, as every photon that is reflected is a
photon that cannot be used for energy. The PV cell is covered with glass for
protection. Most PV cells are made up into modules of many cells.
Modules must be made by connecting several cells to increase both current
(connecting in parallel) and voltage (connecting in series) and adding
terminals on the back.
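
The effect of connecting cells in series and in parallel, as described above, can be illustrated with a short calculation. This is a minimal Python sketch for identical, ideal cells. The per-cell values (0.5 V and 3.0 A) and the function name `module_output` are assumptions chosen for illustration and do not come from the text. Losses and mismatch between cells are ignored.

```python
def module_output(cell_voltage, cell_current, cells_in_series, strings_in_parallel):
    """Ideal output of a module built from identical cells.

    Connecting cells in series adds their voltages; connecting strings of
    cells in parallel adds their currents. Losses are ignored.
    Returns (voltage in V, current in A, power in W).
    """
    voltage = cell_voltage * cells_in_series
    current = cell_current * strings_in_parallel
    return voltage, current, voltage * current


# Assumed example: two parallel strings, each of 36 cells in series.
v, i, p = module_output(0.5, 3.0, cells_in_series=36, strings_in_parallel=2)
print(f"Module output: {v:.1f} V, {i:.1f} A, {p:.0f} W")
```

Changing the number of cells per string changes only the voltage, while changing the number of parallel strings changes only the current.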

Figure 12.27 Photons hitting electrons, causing them to be released
from their structure and move in the diode, thus releasing energy to the
load (which may, for example, be a calculator or a light)

Figure 12.28 Final cross-section of a photovoltaic cell (labelled layers:
cover glass, anti-reflective coating, contact grid, n-type silicon, p-type
silicon, back contact)



CHAPTER REVIEW

SUMMARY
• The number of freely moving electrons dictates whether a material will
  be a conductor, semiconductor or insulator.
• The valence energy band is used for electrons when atoms bond. The
  conduction band is for electrons that can move from one atom to another
  and conduct electricity.
• Conducting materials contain a very small energy band gap, a partially
  filled valence band with gaps or holes and an overlap between the
  valence band and the conduction band.
• Insulators have strong bonding links between electrons and large energy
  band gaps between the filled valence band and conduction band of
  electrons.
• Semiconductors contain weaker bonding links between the electrons and a
  smaller energy gap between the electron-filled valence band and the
  conduction band.
• Substituting other elements into the silicon crystal lattice (doping) is
  used to vary the conductivity of the silicon.
• Intrinsic semiconductors mainly conduct current because of the
  excitation of a valence electron into the conduction band level.
• Extrinsic semiconductors mainly conduct using either excess electrons or
  excess holes in their structure.
• n-type extrinsic semiconduction is produced by the addition of an
  impurity with an extra valence electron above four valence electrons.
  p-type extrinsic semiconductors are produced by the addition of an
  impurity with one less valence electron below four valence electrons.
• Diodes and transistors are two examples of solid state electronic devices
  made from semiconducting material. When large numbers of transistors and
  other components are combined on a single chip it is called an
  integrated circuit.
• Diodes are used as rectifiers and transform alternating current into
  direct current.
• Semiconductors and transistors have a large number of advantages over
  the large, hot, evacuated tubes called thermionic devices. They allow the
  passage of information with very low power, an increase in speed of
  transmission with a much lower energy use, more components to fit in a
  confined space, portability and flexibility and are strong enough to be
  used in many situations.

QUESTIONS
1. Calculate the wavelength of an electron travelling at a speed of
   5.00 × 10⁵ m s⁻¹.
2. Explain, using band theory, why it is possible for an electric current
   to flow in an insulator.
3. Use the band theory to explain why metals are good conductors.
4. What is a hole current and how does a hole current differ from an
   electron current?
5. Describe how you could model the production and movement of electrons
   and holes in a semiconductor that has an applied electric field from
   left to right using a Chinese checker set and marbles.
6. Outline why germanium was used for early transistors instead of the
   silicon now typically used.
7. Explain why doping with an atom that has five valence electrons makes
   the doped silicon a much better conductor than undoped silicon.
8. Describe the difference between an n-type semiconductor and a p-type
   semiconductor.
9. Explain why it was so desirable to develop solid state devices to
   replace the thermionic devices such as the triode and the diode valve.
10. The transistor can act as an amplifier or as a switching device. The
    use of the transistor as a switching device led directly to the
    development of the microchip and microprocessor. What is a transistor?
11. Describe the difference between an n-type transistor and a p-type
    transistor.
12. Why does a hole current always move in the opposite direction to an
    electron current operating in the same transistor?



PRACTICAL ACTIVITIES

12.1 BAND STRUCTURES

Aim
To investigate a model of the difference between conductors, insulators
and semiconductors in terms of band structures.

Apparatus
A computer with internet access
eBook plus Weblink: Conductors

Theory
Materials are labelled as conductors, semiconductors or insulators
depending on their ability to conduct electrical charge. Conduction occurs
when a substance contains electrons with sufficient energy to occupy the
conduction band energy level.

Method
1. Log on to the web site given above. You should be viewing the page with
   the heading ‘Let’s Imagine’.
2. View each page, finishing with the heading ‘Let’s Summarise’.

Analysis
Describe the model used for the behaviour of electrons in:
(a) conductors
(b) semiconductors
(c) insulators.

Questions
What is the essential difference in terms of band structure between
conductors, semiconductors and insulators?

12.2 INVENTION OF THE TRANSISTOR

Aim
To research the work of the team of Nobel Prize winners who developed the
transistor and, eventually, integrated circuits.

Apparatus
A computer with internet access
eBook plus Weblinks: The development of the transistor; William Shockley

Theory
The invention of the transistor is an interesting example of team
dynamics. William Shockley considered himself the ‘ideas man’. He would
think of a design concept and pass it on to John Bardeen, the theorist, who
would try to find a way to make it work. Bardeen then passed the design
to Walter Brattain, the experimentalist, who could build anything in a lab.
For two years, Bardeen and Brattain tried unsuccessfully to make
Shockley’s idea of a field transistor work. Finally they worked on an idea
of their own and, in 1947, invented the world’s first transistor: the point
contact transistor. Slighted but spurred on by this, Shockley immediately
designed the very successful bipolar transistor. The trio won the Nobel
Prize for their work, but, sadly, jealousies split up the team.
Another scientist, John Atalla, later developed Shockley’s original idea.
Ironically, most of today’s applications, including integrated circuits,
use the field effect transistor.

Method
Access the web sites quoted above. Using the biographies and the timeline
(under ‘Transistorized!’), work in groups to compile reports on the
contributions of each of the scientists involved in the development of the
transistor, and present the reports to the class.



CHAPTER 13 SUPERCONDUCTIVITY
Remember
Before beginning this chapter, you should be able to:
• recall that electrical conduction in metals is related to
the movement of electrons through the lattice
• recall the principle of superposition.

Key content
At the end of this chapter you should be able to:
• outline the methods used by the Braggs to determine
the structure of crystals
• explain that metals possess a crystal lattice structure
• identify that the conducting properties of a metal are
related to the large numbers of electrons able to
move through the crystal lattice
• discuss how the lattice structure impedes the paths of
electrons, resulting in the generation of heat
• identify that resistance in metals is increased by the
presence of impurities and scattering of electrons by
lattice vibrations
• describe the occurrence in superconductors below
their critical temperature of a population of electron
pairs unaffected by electrical resistance
• describe how superconductors and magnetic fields have been applied to
the development of a maglev train
• discuss the BCS theory
• discuss the advantages and limitations of superconductors and possible
applications in electricity transmission.

Figure 13.1 A maglev train. This Japanese train operates on mutually
repelling magnetic fields and achieves speeds of over 500 km h⁻¹.

In 1911, research into the structure and electrical behaviour of materials
led to the identification of some materials for which the electrical
resistance almost disappeared when the temperature approached absolute
zero. This property became known as superconductivity.
An explanation of superconductivity depended on an understanding
of the crystalline structure of conductors and the way in which electrons
interacted with it. W. Bragg used the diffraction of X-rays from a regular
crystal to determine its structure.

13.1 INTERFERENCE
When you have previously studied the behaviour of waves, you probably
observed that the amplitude displacements produced by the waves would
combine when they passed through each other. This effect is called
superposition. The amplitude of the resultant wave at every point was
found by adding the displacements of each wave. You should recall that a
wave disturbance due to two or more sources can usually be taken as the
algebraic sum of the individual disturbances. In the special case where
the sources vibrate with the same frequency and with a constant phase,
some fascinating interference effects occur.
Figure 13.2 shows how waves in water from two slits in a barrier produce
a characteristic pattern of interference. This is a property of all waves.

Interference is the interaction of two or more waves, producing
regions of maximum amplitude (constructive interference) and
zero amplitude (destructive interference). The Michelson–Morley
experiment (see chapter 5, pages 72–74) used the interference
of light in an attempt to measure the movement of the Earth
through the aether.

Figure 13.2 Constructive and destructive interference produced by
two in-phase sets of circular water waves

The two slits act as separate sources of coherent circular waves. As they
spread out, they interfere with each other. In some directions the waves
combine constructively, making waves of larger amplitude. In other
directions they combine destructively, so that there is little or no resulting
wave amplitude. The wedge-shaped areas of sharp contrast indicate the
crests (light) and troughs (dark) of strongly reinforced waves. In some
cases the waves arrive at the same point in time and space totally (180°)
out of phase. The resulting wave amplitude is zero so there are no
contrasting lines, and we see a region with no wave motion.

Light waves are coherent when there is a constant phase
difference between them; that is, the peaks line up and the troughs
line up.
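
Superposition of two waves of the same frequency can be checked numerically. The Python sketch below is illustrative only (the function name `peak_resultant` and the unit amplitude are assumptions). It adds the displacements of two equal-amplitude sine waves over one cycle and reports the largest resultant displacement for a given phase difference.

```python
import math


def peak_resultant(amplitude, phase_difference_deg, steps=360):
    """Largest resultant displacement when two waves of equal amplitude
    and frequency are added point by point over one full cycle."""
    phi = math.radians(phase_difference_deg)
    peak = 0.0
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        # Principle of superposition: displacements simply add.
        total = amplitude * math.sin(angle) + amplitude * math.sin(angle + phi)
        peak = max(peak, abs(total))
    return peak


print("In phase (0 degrees):        ", round(peak_resultant(1.0, 0), 3))
print("Out of phase (180 degrees):  ", round(peak_resultant(1.0, 180), 3))
```

Waves that arrive in phase give a resultant of twice the single-wave amplitude, while waves that are 180° out of phase cancel.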

CHAPTER 13 SUPERCONDUCTIVITY 233


The condition for complete constructive interference is that waves
from two sources arrive at the same point with the same phase. Since the
waves are continuous, each point one full wavelength apart will have the
same phase. The condition for constructive interference may be
described in terms of the difference in path length. Points will be in
phase so long as the difference in the path travelled by each wave is a
whole number of wavelengths. If we let n be any whole number, then we can
describe this difference in path lengths as
∆D = nλ where λ = wavelength.
Light waves also demonstrate interference effects. Young’s ‘double slit’
experiment provides a clear example of the geometry, and the
relationships necessary for constructive interference. In figure 13.3 the
light from a small aperture falls on two small slits.

eBook plus
Interactivity: Young’s experiment (interference effects with white light)
int-0051
eLesson: Young’s experiment (interference effects with white light)
int-0027

Figure 13.3 Geometrical construction for describing the interference
pattern of two point sources (Young’s experiment). The diagram shows the
slit separation d, the path lengths D1 and D2 to the point P, the angle θ,
the slit-to-screen distance L, the distance y, the path difference nλ, and
the central, 1st-order and 2nd-order bright spots on the screen AB.

The two slits are at a distance, d, apart and a screen is placed a
distance, L, from the slits. The light from each slit travels a distance
shown as D1 and D2 respectively and meets at a point, P, a distance, y,
above the midpoint of the screen. If the waves meet at P when they are in
phase, they will interfere constructively. The points of the waves are in
phase if the path difference that each wave has travelled with respect to
the other is a whole number multiple of the wavelength λ.
The difference in path lengths ∆D = D1 − D2.
The constructive interference, also known as a ‘maximum’, occurs when
d sin θ = nλ, where n is an integer, and nλ = yd/L.
This relationship permits the determination of the wavelength of light
of a particular frequency using simple measurements.
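
As an illustration of how the relationship nλ = yd/L turns simple measurements into a wavelength, here is a minimal Python sketch. The measurement values (slit separation 0.20 mm, screen distance 1.5 m, first-order bright spot 4.1 mm from the centre) and the function name `wavelength_from_fringe` are assumptions chosen for the example, not data from the text. The relationship holds when y is much smaller than L.

```python
def wavelength_from_fringe(d, y, L, n=1):
    """Wavelength (m) from n * wavelength = y * d / L.

    d: slit separation (m)
    y: distance of the nth-order bright spot from the central spot (m)
    L: distance from the slits to the screen (m)
    n: order of the bright spot (1, 2, 3, ...)
    """
    return y * d / (n * L)


# Assumed measurements for illustration only.
wavelength = wavelength_from_fringe(d=0.20e-3, y=4.1e-3, L=1.5, n=1)
print(f"Wavelength is about {wavelength * 1e9:.0f} nm")
```

With these assumed measurements the result falls in the green part of the visible spectrum.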
When monochromatic light (light of a single frequency) is passed
through a pair of closely spaced slits in a screen or grating of some kind,
the light passing through each slit acts as a new point source of light.
Because the light is from a single monochromatic source it is coherent
when passing through the slits. The resulting interference of the light from
these two sources causes a pattern of alternating light and dark lines, or
fringes, on a screen placed some distance away from the slits that is solely
the product of the path difference between the sources and the screen.
The bright areas that result on the screen are called maxima; the dark
areas minima. The centre of each maximum, where it is the brightest, is
where the light from the two sources is completely in phase at the screen.
The darkest area in the centre of each minimum is where the light from
the two sources is completely out of phase at the screen.

The term maxima refers to points on an interference pattern where
the peaks of each set of waves coincide. This produces a bright
spot when light is used and is a point of constructive interference.

The term minima refers to points on an interference pattern where
peaks of one wave coincide with troughs of the other. This
produces a dark spot and is a point of destructive interference.

The values of y, d and L can easily be measured from a screen placed some
distance from a monochromatic light source that is transmitted through a
double slit in



a slide or diffraction grating. However, when doing this it is important to
make sure that all measurements are made to the central bright region of
any maximum that is formed on the screen.

A diffraction grating is a device consisting of a large number of slits
and is used to produce a spectrum.

13.2 DIFFRACTION
We are familiar with shadows cast on a wall by an object and know that the
shadow has the same shape as the object. However, if we look carefully, we
will see that the edges of the shadow are a little fuzzy, that is, they are not
perfectly sharp. This lack of sharply defined edges on the shadow is due
to the phenomenon of diffraction.
Young’s double slit experiment showed that light does not travel past an
object in straight lines, but spreads out around the object’s edges as
waves. These waves can interfere with each other as they spread out. This
spreading out of light that occurs around an object or when light is
passing through a small aperture is called diffraction. It is pronounced
when the waves have to travel different paths to a point some distance
from the source and in doing so travel paths that have differences in
length that approach either multiples of half or full wavelengths. A
diffraction pattern is shown in figure 13.4.

Diffraction refers to the spreading out of light waves around the edge
of an object or when light passes through a small aperture.

eBook plus
eModelling: Modelling interference and diffraction
Spreadsheets help explore interference and diffraction and their
combined effects
doc-0041

Figure 13.4 Diffraction pattern of a circular aperture (an opening through
which light passes), showing maxima and minima

13.3 X-RAY DIFFRACTION


Diffraction is a property of all waves, including electromagnetic
radiation. Diffraction effects increase as the physical dimension of the
aperture approaches the wavelength of the waves. Diffraction of waves
results in interference that produces dark and bright rings or spots. The
precise nature of these effects is dependent on the geometry of the object
causing the diffraction.
X-rays were discovered by Röntgen towards the end of the nineteenth
century (see the Physics fact in chapter 10, page 185). A study of their
nature revealed they were electromagnetic waves. Although similar to light
and radio waves, X-rays were determined by experiment to have a
wavelength much shorter than that of visible light, in the order of 10⁻¹⁰ m.
Within a short period, scientists studying these new electromagnetic waves
were able to reliably produce X-rays of a specific frequency.

All modern X-ray tubes are basically the same as the type developed by
Coolidge. This device became the standard method of producing X-rays.
In the X-ray tube, electrons are accelerated by a high potential
difference or voltage to produce the X-rays as they hit a target within the
Coolidge tube. Figure 13.5 shows a Coolidge X-ray tube.

eBook plus
Weblink: X-ray diffraction

Figure 13.5 A Coolidge X-ray tube, invented in 1913 by American
physicist William D. Coolidge (labelled parts: cathode, anode, cooling fins,
voltage source for heater, high-voltage source, electrons, X-rays)



A diffraction grating for visible light is a device for producing
interference effects such as spectra. A grating consists of a large number of
equidistant parallel lines engraved on a glass or metal surface. The distance
between the lines is of the same order as the wavelength of the light.
Note that the larger the number of slits on a grating, the sharper the
image obtained. This is why a diffraction grating produces a
well-separated pattern of narrow peaks (see figure 13.6).

Figure 13.6 A diffraction grating is a set of accurately ruled lines on a
glass that produces a sharp pattern of maxima and minima. Panels (a) and (b)
show the intensity pattern, with maxima of order m = 0, 1, 2, 3 on either
side of the centre. Panel (c) shows the path length difference between
adjacent rays leaving slits a distance d apart at an angle θ towards point P
on the viewing screen.

X-rays are electromagnetic radiation with a wavelength of the order of
1 × 10⁻¹⁰ m. Compare this with a wavelength of 5.5 × 10⁻⁷ m for green
light in the middle of the visible spectrum. A standard optical diffraction
grating cannot be used to discriminate between different wavelengths in
the X-ray wavelength range as it can in the visible spectral range. An
optical grating can split visible light up into a series of spectral lines or
the rainbow of colours you are probably familiar with from earlier
learning. The discrimination of different wavelengths in the X-ray range
requires an instrument capable of measuring an angle of less than 0.0019°
required for their diffraction.
In 1912, the German physicist Max Von Laue (1879–1960) proposed that the
regular spacing of a crystal, such as sodium chloride, might form a natural
three-dimensional ‘diffraction grating’ for X-rays.

A crystal is a naturally occurring solid with a regular polyhedral
shape. All crystals of the same substance grow so that they have
the same angles between their faces. The atoms that make up the
crystal have a regular arrangement called crystal lattice.
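
The size of the angle involved can be estimated from the grating relationship d sin θ = nλ. The Python sketch below assumes an optical grating with a line spacing of 3.0 × 10⁻⁶ m. That spacing is an assumption, since the text does not state one, although it appears consistent with the quoted angle of about 0.0019°. The function name `first_order_angle_deg` is hypothetical.

```python
import math


def first_order_angle_deg(wavelength, line_spacing):
    """Angle in degrees of the first-order (n = 1) maximum,
    from line_spacing * sin(theta) = wavelength."""
    return math.degrees(math.asin(wavelength / line_spacing))


d = 3.0e-6  # assumed grating line spacing in metres

x_ray_angle = first_order_angle_deg(1.0e-10, d)   # X-rays
green_angle = first_order_angle_deg(5.5e-7, d)    # green light

print(f"X-ray first-order angle: {x_ray_angle:.4f} degrees")
print(f"Green first-order angle: {green_angle:.1f} degrees")
```

Green light is deflected through an easily measured angle, while the X-ray maximum lies almost on top of the undeflected beam. This is why a grating with atomic-scale spacing, such as a crystal, is needed.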

Figure 13.7 A representation of Von Laue’s original diffraction experiment
(labelled parts: X-ray tube, hole in screen to collimate X-ray beam,
crystal, screen coated with photographic emulsion)



This experiment was carried out by his colleagues, W. Friedrich and
P. Knipping, who bombarded a crystal of zinc sulphide. They obtained a
diffraction pattern on photographic film (see figure 13.7).
British physicist Sir William Henry Bragg (1862–1942) and his
Australian-born son Sir William Lawrence Bragg (1890–1971) developed
an X-ray spectrometer to systematically study diffraction of X-rays from
crystal surfaces. They proposed that X-rays, because of their short
wavelength (in the order of the size of the atom, 10⁻¹⁰ m), could
penetrate the surface of matter and ‘reflect’ from the atomic lattice
planes within the crystals.
When X-rays enter a crystal such as sodium chloride (see figure 13.8),
they are scattered (absorbed and re-emitted) in all directions.
In some directions the scattered waves undergo destructive
interference, resulting in an intensity minimum; in other directions, the
interference is constructive, resulting in an intensity maximum. This
process of scattering and interference is a form of diffraction. The
Braggs observed that the maxima occurred in specific directions. They
concluded that the X-rays were reflected from the regularly spaced
parallel planes of the crystal that were formed by the arrangement of
atoms in the crystal lattice. This effect is shown in figure 13.9. For the
first time, thanks to the work of the Braggs, it was possible to look at the
arrangement of the atoms in a solid material.
The work of W. L. Bragg provided a mathematical analysis of their
experiments, deriving the relationship between the spacing of the crystal
planes, the wavelength of the radiation and the angle of reflection. X-ray
diffraction provides a tool for studying both X-ray spectra and the
arrangement of atoms in a crystal.
To study spectra, a crystal is chosen with a known interplanar
spacing, d. A detector is mounted on a device called a goniometer (see
figure 13.11) and can be rotated through a range of angles to measure
the crystal rotation angles at which the maxima occur. A chart recorder
produces a trace of X-ray intensity against rotation angle.
Alternatively, the crystal itself can be studied with a monochromatic
X-ray beam of known wavelength. This enables the determination of the
spacing of various crystal planes as well as the structure of the crystal unit
cell (smallest possible crystal unit, shown in figure 13.10).

Figure 13.8 Crystal structure of sodium chloride

Figure 13.9 The interference in emitted X-rays is caused by some
X-rays reflecting from lower levels or adjacent atomic layers.

Figure 13.10 a0 is the width of the ‘unit cell’ of a sodium chloride crystal.

Figure 13.11 (a) A flat plate camera used for X-ray diffraction of a crystal (b) A Laue pattern of a silicon crystal

CHAPTER 13 SUPERCONDUCTIVITY 237


Bragg’s results revealed a predictable relationship among several
factors. They were:
• the distance between similar atomic planes of a crystal (the
  interplanar spacing), known as the ‘d-spacing’, measured in angstroms
• the angle of incidence (the angle θ), measured in degrees
• the wavelength of the radiation, λ, which in this case is approximately
  1.54 angstroms ($1.54 \times 10^{-10}$ m).
Bragg’s Law summarises this relationship between the maxima and the
diffraction angles as:
nλ = 2d sin θ
where
n = an integer taking values of 1, 2, 3 . . . etc.
θ = diffraction angle in degrees.
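As a rough illustration of how Bragg's law is used, the short Python sketch
below rearranges nλ = 2d sin θ to find the plane spacing d from a measured
angle. The wavelength is the 1.54 angstrom value quoted above. The function
name `bragg_spacing` and the 15.9° angle are assumptions chosen only for
illustration; they are not measurements taken from the Braggs' work.

```python
import math


def bragg_spacing(wavelength_m, theta_deg, order=1):
    """Return the plane spacing d (metres) from n * lambda = 2 * d * sin(theta)."""
    return order * wavelength_m / (2.0 * math.sin(math.radians(theta_deg)))


# Wavelength quoted in the text: 1.54 angstroms.
WAVELENGTH_M = 1.54e-10

# Hypothetical reading (an assumption): first-order maximum at 15.9 degrees.
d = bragg_spacing(WAVELENGTH_M, theta_deg=15.9, order=1)
print(f"d = {d / 1e-10:.2f} angstroms")
```

A useful check is the special case θ = 30°, where sin θ = 0.5 and the
first-order spacing equals the wavelength.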
A crystal was found to be made of atoms arranged in a regular three-
dimensional pattern. The diffraction pattern obtained by transmission
existed as a pattern of spots. Analysis of these patterns made it possible to
determine the position of atoms in the crystal.

13.4 BRAGG’S EXPERIMENT


An X-ray diffraction instrument has a source of X-rays. This is an X-ray
tube operating at about 40 000 volts. The target of the high-energy
electrons in the tube may be copper or chromium and is mounted on a base
of metal through which cooling water flows. The X-rays produced as the
accelerated electrons hit the target material are collimated using parallel
plates of metal covered in molybdenum. This causes the X-ray beam to
become parallel. The parallel X-ray beam strikes the sample under
investigation and the scattered X-rays coming from the material are then
detected very precisely. (The Braggs actually used an ionisation chamber
to detect the X-rays. An ionisation chamber is a gas-filled detector that
produces a current pulse when X-rays ionise the gas inside the detector.
This enabled them to determine the position and intensity of the
diffracted X-rays.)
The process of X-ray diffraction by a crystal is quite complex. The
pattern of maxima and minima occurs as if the X-rays were reflected by a
system of parallel planes. However, the X-rays are not actually reflected,
but scattered. This model uses reflection to simplify the description and
calculations.

eBook plus Weblink: Bragg diffraction experiment with microwaves
eBook plus Weblink: Bragg’s law and diffraction applet
Figure 13.12 A set of reference planes and Bragg diffraction along the planes of a crystal

The Bragg experiment utilised X-rays reflected from adjacent atomic
planes within the crystal, as shown in figures 13.12 and 13.13. The
reflected X-rays interfered constructively and destructively, producing
the familiar pattern. Measurement of the angles allows the spacing
and arrangement of the crystal to be determined.

Figure 13.13 As a result of X-ray diffraction the intensity of the detected
X-rays varies according to the angle θ.

238 FROM IDEAS TO IMPLEMENTATION


As shown in figure 13.14, the rays of the incident beam are always in
phase and parallel until the top beam strikes the top layer of the crystal
at an atom (labelled $A_2$). The second beam continues to the next layer
where it is reflected by atom $B_2$. The second beam must travel the extra
distance $PB_2 + B_2Q$. If the total distance $PB_2 + B_2Q$ is a whole number of
wavelengths then the two waves will be in phase.
$n\lambda = PB_2 + B_2Q$
Each of these two segments is the side opposite the angle θ in a
right-angled triangle whose hypotenuse is the plane spacing d, so
$PB_2 = B_2Q = d \sin\theta$ (see figure 13.12). The condition for a maximum
therefore becomes $n\lambda = 2d \sin\theta$, which is Bragg’s law.

Figure 13.14 Describing Bragg’s law requires the use of both geometry and
trigonometry. The lower beam must travel the extra distance ($PB_2 + B_2Q$) to
continue travelling parallel and adjacent to the top beam.
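The path-difference argument of figure 13.14 can be sketched numerically.
The Python example below lists every order n for which a maximum is
possible and confirms that the extra distance travelled by the lower ray,
2d sin θ, is then a whole number of wavelengths. The wavelength is the
1.54 angstrom value quoted earlier in the chapter; the plane spacing of
2.82 angstroms and the function names are assumptions made for
illustration only.

```python
import math


def bragg_angles(wavelength_m, spacing_m):
    """Return {n: theta in degrees} for every order n with n * lambda <= 2 * d."""
    angles = {}
    n = 1
    while n * wavelength_m <= 2.0 * spacing_m:
        ratio = n * wavelength_m / (2.0 * spacing_m)
        angles[n] = math.degrees(math.asin(ratio))
        n += 1
    return angles


def path_difference(spacing_m, theta_deg):
    """Extra distance PB2 + B2Q = 2 * d * sin(theta) travelled by the lower ray."""
    return 2.0 * spacing_m * math.sin(math.radians(theta_deg))


WAVELENGTH_M = 1.54e-10   # wavelength quoted in the text
SPACING_M = 2.82e-10      # assumed plane spacing, for illustration only

for n, theta in bragg_angles(WAVELENGTH_M, SPACING_M).items():
    extra = path_difference(SPACING_M, theta)
    print(f"n = {n}: theta = {theta:.1f} degrees, "
          f"path difference = {extra / WAVELENGTH_M:.2f} wavelengths")
```

No maximum exists for orders where nλ exceeds 2d, because sin θ cannot be
greater than 1. This is why only a few orders appear for a given crystal
and wavelength.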

13.5 THE CRYSTAL LATTICE STRUCTURE OF METALS
As you saw in chapter 12, the properties of solids depend on the type of
bonding. The classical model of metals (see figure 13.15) describes
valence electrons as being common property of all of the atoms in the
metal, forming an ‘electron cloud’. These electrons are said to be
‘delocalised’. Because of the random direction of movement of these
electrons, with equal numbers moving in each direction, a steady state is
established. That is, there will be no net transport of electric charge.
Figure 13.15 Random motion of electrons in a metallic lattice

When an electric field is applied, it produces a small component of
velocity in the direction opposite to the field (because electrons have a
negative charge). The applied electric field creates a force that drives the
electrons in a common direction. At any point in a conductor, the average
velocity of the electrons is proportional to the strength of the electric field.
If ‘piling up’ of electrons occurs it can create a reverse potential. This can
oppose the electric field. If, for example, there is not a complete circuit,
there cannot be a continual flow of electrons and charge will ‘pile up’. This
means that there will be a short interval after the application of a potential
difference, in which a current will flow. However, the build-up of charge
opposes the potential difference and the current stops. As such, potential
difference can exist across a conductor in an incomplete circuit without
any current flowing through it. This is exactly the situation in a dry cell
battery. A potential difference exists between the terminals of a dry cell
battery without a current necessarily flowing in the circuit.



When a current flows, charge moves under the effect of the applied
field. As the charge moves, work is continually being done by the
electric field. For each coulomb of charge moving through a potential
difference of one volt, one joule of work is done. For a constant
current, no kinetic energy is gained, so all of the work done must go into
heat. This is generally called ‘joule heating’. It might be noted that
joule heating is independent of the direction of current flow and is, in
fact, irreversible.
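The bookkeeping behind joule heating can be sketched in a few lines of
Python. The first function restates the rule given above (one coulomb
through one volt is one joule). The second uses the standard relation
P = VI for a steady current, which follows from that rule because current
is charge per second; it is not stated explicitly in this section. The
12 V, 2 A and 60 s figures are hypothetical values chosen for illustration.

```python
def work_done_joules(charge_coulombs, potential_difference_volts):
    """Work done by the field on a charge moving through a potential difference."""
    return charge_coulombs * potential_difference_volts


def heating_power_watts(current_amperes, potential_difference_volts):
    """Rate of joule heating for a steady current: P = V * I."""
    return potential_difference_volts * current_amperes


# One coulomb moving through one volt: one joule of work.
print(work_done_joules(1.0, 1.0))

# Hypothetical example: a steady 2 A current across 12 V for one minute.
current = 2.0    # amperes, i.e. coulombs per second
voltage = 12.0   # volts
seconds = 60.0
energy = heating_power_watts(current, voltage) * seconds
print(f"Heat produced in {seconds:.0f} s: {energy:.0f} J")
```

Reversing the sign of both the current and the potential difference leaves
the product unchanged, which is one way of seeing that joule heating does
not depend on the direction of current flow.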
Since direct current and low-frequency alternating current travel
through the body of the metal, the greater the cross-sectional area of the
wire (see figure 13.16), the more electrons there are to move along the
wire, and the greater the current which can flow for a given value of
applied voltage. The atoms that form the lattice vibrate more as their
temperature increases. As the electrons begin to move, they collide with
impurities and tiny imperfections in the lattice.
The resistance increases as a direct result of these collisions with
irregularities in the crystal lattice (see figure 13.17). Inside a
superconductor the behaviour of electrons is very different. The impurities and
crystal lattice are still there, but the movement of electrons is significantly
different. The electrons pass almost unobstructed through the lattice.
Because the electrons do not collide with anything, superconductors can
transmit electric current with no appreciable loss of energy.

Figure 13.16 Cross-sectional area determines the amount of space in
which electrons can travel. Wire A has a cross-section radius of r; wire B
has a cross-section radius of 2r.
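The effect of cross-sectional area for the two wires of figure 13.16 can be
estimated with the standard relation R = ρL/A for a uniform wire. This
relation is not given in the text above, and the resistivity, length and
radius used here are assumed values for illustration only. The point of the
sketch is the ratio: doubling the radius gives four times the area and one
quarter of the resistance.

```python
import math


def wire_resistance(resistivity_ohm_m, length_m, radius_m):
    """Resistance of a uniform round wire: R = rho * L / (pi * r^2)."""
    area = math.pi * radius_m ** 2
    return resistivity_ohm_m * length_m / area


RHO_COPPER = 1.68e-8   # ohm metres, approximate room-temperature value (assumed)
LENGTH_M = 100.0       # hypothetical wire length
RADIUS_M = 0.5e-3      # hypothetical radius r of wire A

r_a = wire_resistance(RHO_COPPER, LENGTH_M, RADIUS_M)       # wire A, radius r
r_b = wire_resistance(RHO_COPPER, LENGTH_M, 2 * RADIUS_M)   # wire B, radius 2r
print(f"Wire A: {r_a:.2f} ohm, wire B: {r_b:.2f} ohm, ratio: {r_a / r_b:.1f}")
```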

(a)
Electrons are obstructed by
impurities in the lattice.

(b)
Electrons move easily in a
uniform, vibration-free lattice.

(c)
Electrons are hindered by
vibrations in the lattice.

Figure 13.17 The presence of impurities and lattice structure determines the motion of
electrons through a conductor.

13.6 SUPERCONDUCTIVITY
A photograph of a magnet floating above a curved disk is one of the most
widely published images of superconductivity working (see figure 13.18).
We have seen that electric power is better transmitted as alternating
current at very high voltage than as direct current. One advantage of AC
transmission is that less energy is lost due to the heating effects caused by
the resistance of wires to the flow of electric current. AC transmission
also allows the use of very efficient transformers to step up and down the
voltage to the required difference in potential.



Figure 13.18 A small permanent magnet hovering above a superconducting ceramic disc

The battle over whether transmission of power would be via AC or DC
was apparently settled during the latter part of the nineteenth century,
with AC transmission winning out. Today the advent of the
superconductor may swing the advantage back to a system of DC transmission.
The problem is that even copper wire, which is an excellent
conductor of electricity, offers some resistance to the passage of that
electricity. A copper wire carrying a current of a mere 100 A, if the copper
has a resistance of 8 Ω km$^{-1}$, will dissipate a huge amount of power for
each kilometre the electricity must be transported. The amount of power
dissipated per kilometre is determined from the relationship $P = I^2 R$. For
the case described, a 100 A current moving through a distance of only
1 km will dissipate energy at a rate of 80 000 W. When you consider that
almost every domestic dwelling has the capacity to draw a current of
around 100 A through the total of its circuits, and then consider the
number of kilometres between the user and the power station
generating the electricity, it is clear that the generation and distribution
of power for domestic use is a relatively inefficient and
energy-wasting process.

You should recall that the Kelvin scale of temperature defines absolute
zero (0 K) as 273.15 degrees Celsius below zero.

Imagine the advantages of removing resistance to the flow of electricity
along the copper wires from the power station altogether. The study of
superconductivity suggests this is possible. (See also pages 246–249.)
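The power-loss estimate for the copper wire can be reproduced with a short
Python sketch. The 100 A current and 8 Ω per kilometre are the values used
in the text; the function name and the second, lower-current case are
assumptions added for comparison.

```python
def power_loss_watts(current_amperes, resistance_per_km, length_km):
    """Power dissipated in a wire: P = I^2 * R, with R = (ohm per km) * km."""
    return current_amperes ** 2 * resistance_per_km * length_km


# Values from the text: 100 A through copper with a resistance of 8 ohm per km.
print(power_loss_watts(100.0, 8.0, 1.0))

# Hypothetical comparison: the same wire carrying one tenth of the current.
print(power_loss_watts(10.0, 8.0, 1.0))
```

The first call gives the 80 000 W figure quoted in the text. Because the
loss depends on the square of the current, cutting the current to one tenth
cuts the loss to one hundredth. This is the reason power is transmitted at
very high voltage, and it also shows what would be gained if R itself could
be made zero.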
The earliest experiments demonstrating that electrical resistance in a
wire disappeared at low temperatures took place in 1911, when
H. K. Onnes (1853–1926) discovered the first superconductor, mercury.
Onnes’ results are shown in figure 13.19.

Figure 13.19 A graph of resistance (Ω) versus temperature (K) in mercury
wire as measured by Onnes

The processes for cooling matter to near absolute zero involve using a
succession of liquefied gases down to about 4.2 K. Lower temperatures
may be achieved by successive magnetisation and demagnetisation. In
1906, Onnes liquefied hydrogen (20.4 K) by continuous vacuum
pumping to ensure maximum expansion of the cooling gas. In
1908, using liquid hydrogen, Onnes succeeded in liquefying helium at
4.2 K. In 1911, he used that liquid helium as a coolant and discovered
that the electrical resistance of some metals dropped rapidly to
almost zero below a temperature that was characteristic of that metal
(see figure 13.20).

Figure 13.20 The rapid change in resistance (Ω) with temperature (K):
normal conductivity above the critical temperature, Tc, and
superconductivity below it

The characteristic temperature at which a metal becomes
superconducting is called its critical temperature, Tc. Table 13.1 gives the
critical temperature for some elements.
Since the first superconducting metals were discovered, a number of
ceramics have been developed that have a much higher critical
temperature and, therefore, are easier and cheaper to use as superconductors
than metals or metal alloys. The resistance of the ceramic YBCO at
varying temperatures is shown in figure 13.21.

Figure 13.21 Resistance (mΩ) versus temperature (K) for the ceramic
YBCO (YBa2Cu3O7)

Table 13.1 Critical temperature (Tc) values for some elements. The ceramics
have a much higher critical temperature and therefore are easier to use as
superconductors.

ELEMENT/ALLOY                           Tc (K)    Tc (°C)
Aluminium                               1.20      −271.95
Hafnium                                 0.35      −272.8
Lead                                    7.22      −265.93
Mercury                                 4.12      −269.03
Niobium-aluminium-germanium alloy       21        −252.15
Technetium                              11.2      −261.95
Tin                                     3.73      −269.42
Tin-niobium alloy                       18        −255.15
Titanium                                0.53      −272.62
Uranium                                 0.8       −272.35
Metal oxide ceramics
YBa2Cu3O7 (YBCO)                        90        −183.15
HgBa2Ca2Cu3O8                           133       −140.15
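The two temperature columns of table 13.1 are related by a fixed offset,
and the practical meaning of a high critical temperature can be sketched by
asking which materials would be superconducting in a given coolant. The
Python example below uses a selection of the Tc values from the table. The
4.2 K boiling point of helium is from the text; the figure of about 77 K
for liquid nitrogen is an assumption added here, as are the function names.

```python
ABSOLUTE_ZERO_C = -273.15   # 0 K expressed in degrees Celsius

# Critical temperatures in kelvin, a selection copied from table 13.1.
CRITICAL_TEMPERATURE_K = {
    "Aluminium": 1.20,
    "Lead": 7.22,
    "Mercury": 4.12,
    "Tin-niobium alloy": 18.0,
    "YBa2Cu3O7 (YBCO)": 90.0,
    "HgBa2Ca2Cu3O8": 133.0,
}


def kelvin_to_celsius(temperature_k):
    """Convert a temperature from kelvin to degrees Celsius."""
    return temperature_k + ABSOLUTE_ZERO_C


def superconducting_at(coolant_temperature_k):
    """Names of the listed materials whose Tc lies above the coolant temperature."""
    return [name for name, tc in CRITICAL_TEMPERATURE_K.items()
            if tc > coolant_temperature_k]


for name, tc in CRITICAL_TEMPERATURE_K.items():
    print(f"{name}: {tc} K = {kelvin_to_celsius(tc):.2f} degrees C")

# Boiling points: helium 4.2 K (from the text); nitrogen about 77 K (assumed).
print(superconducting_at(4.2))
print(superconducting_at(77.0))
```

Of the materials listed, only the two metal oxide ceramics have a critical
temperature above 77 K, which is why they are described as easier and
cheaper to use.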

Very pure metals that become superconductors are known as Type I
superconductors and high-temperature ceramics that become
superconductors are known as Type II superconductors.



13.7 HOW IS SUPERCONDUCTIVITY EXPLAINED?
It was nearly 50 years after the discovery of superconductivity that a
satisfactory explanation was provided. After Onnes discovered
superconductivity, many great physicists attempted to find an explanation. Albert
Einstein, Werner Heisenberg, Niels Bohr and Wolfgang Pauli are some of
those who tried but failed. Some of them had worked on it before
quantum mechanics was developed, and it is no wonder that they failed,
as superconductivity can only be explained using quantum mechanics.
Success finally came to John Bardeen (1908–1991), Leon Cooper
(1930– ) and J. Robert Schrieffer (1931– ) in 1957. In 1955, John
Bardeen, who was awarded the 1956 Nobel Prize for Physics for the
invention of the transistor, was keen to recruit Leon Cooper, who had
recently received his PhD, to work with him on the problem of
superconductivity. Bardeen had left the Bell Laboratories, the scene of his
transistor success, and Robert Schrieffer was then a graduate student.
Bardeen had already made two unsuccessful attempts, but in 1957, with
the help of Cooper and Schrieffer, it was a case of third time lucky. The
three of them were awarded the Nobel Prize for Physics in 1972.
Their theory, now called the BCS theory (for the initials of Bardeen,
Cooper and Schrieffer), has not received the general recognition it
deserved, possibly because it is deeply founded in quantum mechanics,
and attempts at classical explanations fail badly. In 2007, at a conference
honouring the 50th anniversary of their paper, Dr Cooper recalled the
difficulty of the calculations relating to the collective quantum behaviour
of millions and millions of electrons. At the time of publication in 1957,
many physicists who read their paper failed to comprehend it. At the
2007 conference, Dr Cooper jokingly recalled that when Bardeen was
trying to recruit him to work on superconductivity, Bardeen failed to
mention his previous two failures!
A clue to the nature of superconductivity had been provided by
Walther Meissner and Robert Ochsenfeld in 1933 when they measured
the magnetic field in a superconductor and discovered it was precisely
zero. They also showed that if a magnetic field was present inside a
superconductor above its critical temperature, it would become zero when the
material was cooled below its critical temperature and became
superconducting. This is now known as the Meissner effect.
In his presentation in 2007, Dr Cooper recalled ‘the simple facts of
superconductivity (as of 1955)’. He mentioned the discovery of
superconductivity by Kamerlingh Onnes and the discoveries of Meissner and
Ochsenfeld. He also mentioned that there was evidence (from specific
heats) to suggest that there was an energy gap involved. The so-called
isotope effect, which showed that the critical temperature depended on the
mass of the ions present in the lattice, provided evidence for the
involvement of phonons (lattice vibrations) in the process.
Bardeen, Cooper and Schrieffer considered that superconductivity
occurred because of an interaction between electrons. They initially tried
to use all the methods then available for dealing with such a problem,
but did so without success.
They knew that electrons repel each other with the Coulomb force
between the negatively charged electrons. This contributes an average
energy of 1 electron volt per atom. The average energy associated with the
superconducting transition could be estimated as $10^{-8}$ electron volts per
atom. This huge difference in energy made the problem very difficult.



Bardeen and David Pines (a student of Bardeen before the arrival of
Cooper) had shown that there could be an interaction between electrons
and phonons in the lattice, that under some conditions electrons could
interact through the exchange of phonons, and that such an interaction
could possibly be attractive.
(We have seen in section 13.5 that lattice vibrations increase the
resistance of a metal because lattice vibrations, or phonons, scatter electrons.
However, sometimes phonons can be exchanged between electrons and
produce an attraction between the electrons.)
The idea of an energy gap is important. Somehow, by exchanging
phonons, an electron is able to reach a lower energy state. Once it is in that
state, there is an energy gap that the electron must overcome to get out
of that state. In other words, the energy gap helps to keep electrons in
the superconducting state. Bardeen believed that understanding this
energy gap was the key to understanding superconductivity.
Dr Cooper found that solving equations with millions and millions of
electrons was virtually impossible. However, he found that from his
equations he could establish a pairing of electrons; hence the idea of the
‘Cooper pair’ was born. This was a step beyond the previous discovery of
electrons interacting via phonons. Cooper found that when the electrons
did interact, they grouped into pairs. Even though these pairs may be
constantly breaking and reforming, the key to superconductivity is the
electrons interacting and forming Cooper pairs. When Cooper came up
with this idea, it was apparent that the two electrons in the pair must have
a considerable separation, otherwise the Coulomb repulsion would
completely dominate the attractive force between the electrons of the pair.
Cooper noted that there would be an overlap of Cooper pairs and that
something like $10^6$ or $10^7$ pairs could occupy the same volume.
We encountered the Pauli exclusion principle when we first met band
structure in solids. Electrons normally obey the Pauli exclusion principle
because they are fermions (or spin-$\frac{1}{2}$ particles). The electrons in a Cooper
pair have opposite spins, so the Cooper pairs are bosons and are not
restricted by the Pauli exclusion principle. In fact, all the Cooper pairs
can occupy the same energy state, and even though huge numbers of
Cooper pairs overlap, they do not interact with each other.

13.1 Temperature change in superconductors
The mathematical problem of dealing with vast numbers of particles still
remained, though Cooper had set them on the right track. Robert
Schrieffer made the next breakthrough. He applied statistical methods to solve
the difficulties associated with the Cooper pairs. He realised that the
Cooper pairs seemed to merge into one large group that moved along
together. He provided a simple analogy of a line of skaters who are linked
arm in arm. If one of the skaters hits a bump, that skater is supported by
all the others and continues to move with the group. Although this may
sound simple, the mathematics associated with it is very complex.
Once the mathematical problems had been resolved, the BCS team
had a theory that could explain all the phenomena related to type I
superconductivity. Bardeen, Cooper and Schrieffer were awarded the
Nobel Prize for Physics in 1972. (This was Bardeen’s second Nobel prize
in Physics.)
However, the problem of type II superconductors remains unsolved. BCS
theory cannot explain it, and so far there are no satisfactory theories.
In summary, the key to understanding superconductivity in type I
superconductors is the formation of Cooper pairs. A Cooper pair consists
of two electrons that are a considerable distance apart. The attractive
force between the two electrons is provided by the exchange of phonons
(lattice vibrations). Formation of a Cooper pair lowers the energy of the
electrons, and there is an energy gap they have to cross to jump out of
the superconducting state. (Of course, raising the temperature will
achieve this and destroy the superconducting state.) The two electrons in
a pair have opposite spin, so the Cooper pairs are not constrained by the
Pauli exclusion principle. There will be perhaps $10^6$ or $10^7$ Cooper pairs
overlapping each other, but they can all be in the same energy state. It
can be considered that all the members of this large association of
Cooper pairs assist all the other members to move through the lattice.
When a superconducting material in its normal state is placed in a
magnetic field, the magnetic field strength inside the material is almost
the same as the magnetic field strength outside the material. If the
material is in its superconducting state, currents flow in the
superconductor to produce a magnetic field that cancels the applied magnetic
field inside the superconductor. That is, the magnetic field inside the
superconductor is always zero.
This expulsion of the magnetic field from inside a superconductor is
called the Meissner effect and is illustrated in figure 13.22.

13.2 Levitation and the Meissner effect

Figure 13.22 (a) The magnetic field in a superconductor in its normal state:
magnetic field lines pass through the superconductor. (b) The expulsion of
the magnetic field by a superconductor in its superconducting state: currents
flow inside the superconductor and magnetic field lines are expelled from it.

If a magnet is brought near to a superconductor, the currents in the
superconductor that expel the magnetic field create magnetic poles
which cause repulsion between the magnet and the superconductor. If a
small magnet is placed above a superconductor, the repulsive force on
the magnet can balance the weight of the magnet, causing it to be
suspended above the superconductor. This is shown in figure 13.23 and
explains the levitation of the magnet shown in figure 13.18, page 241.

Figure 13.23 The currents produced in a superconductor lead to
levitation of a magnet. (Labels: small rare earth magnet, currents produced
in superconductor, superconductor.)



The effects that are described here, zero resistance and the Meissner
effect, are the macroscopic properties of materials that become
superconducting.
There are also very important microscopic or quantum effects shown
by superconductors. One such property is the ‘tunnelling effect’. This
is a consequence of the wave nature of the electron, which allows the
electron to pass through regions that are forbidden by classical physics
because of high electric potential. In 1962, Brian Josephson (1940– )
discovered that, if two superconducting metals were separated by an
insulating barrier, it was possible for electron pairs to pass through the
barrier without resistance.

13.3 Resistance and superconductors

There are two important consequences of this. First, even in the absence
of a potential difference between two identical superconductors, a
tunnelling current will flow across a Josephson junction. Secondly, when a
constant potential difference, V, is applied, the current that flows will
oscillate with a constant frequency, f. This second effect allows the most
precise measurement of the fundamental quantum constant $\frac{e}{h}$,
since $f = \frac{2eV}{h}$.
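The Josephson frequency relation f = 2eV/h can be evaluated directly. The
Python sketch below uses the defined SI values of e and h; the applied
potential difference of one microvolt and the function name are
assumptions chosen for illustration.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # e, in coulombs
PLANCK_CONSTANT_J_S = 6.62607015e-34    # h, in joule seconds


def josephson_frequency_hz(voltage_volts):
    """Oscillation frequency of the junction current: f = 2 * e * V / h."""
    return 2.0 * ELEMENTARY_CHARGE_C * voltage_volts / PLANCK_CONSTANT_J_S


# Hypothetical applied potential difference of one microvolt.
f = josephson_frequency_hz(1.0e-6)
print(f"f = {f / 1e6:.1f} MHz")
```

Even a microvolt gives a frequency of several hundred megahertz. Since
frequency can be measured very accurately, measuring f for a known V gives
a very precise value of e/h.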

Such a Josephson junction acts as a superfast switch (see figure
13.24). This property is a major advantage in computers where the
processing time depends on the speed at which signal pulses can be
transmitted. It also makes the very precise measurement of magnetic flux
possible.

Figure 13.24 The Josephson effect allows superconducting material to act
as a switch. (Labels: two superconductors separated by an oxide layer,
electron pair.)

Applications of superconductivity
One of the major problems facing the widespread use of superconductivity
is the very low temperatures required to produce superconductivity effects.
Researchers are working on ‘high-temperature superconductors’ which will
retain their superconductive properties at temperatures that can be more
easily managed.
The potential application for superconductors is almost unlimited. We
will look at a few examples.

Power transmission
The ability to conduct electricity without losing power in heat
stimulated research into engineering applications. Electrical transmission
lines lose an appreciable amount of energy due to the resistance of the
wires. If materials can be developed which overcome the physical
problems that make high-temperature superconductors brittle, very large
current densities could be conducted in relatively thin wires. This
would reduce the cost of power and the ever-increasing need for
new power stations. Superconducting wires could carry
three to five times as much current as conventional transmission lines.
The current in such transmission lines would of course be DC rather
than the conventional AC. This is because the constant
direction-switching in AC causes energy losses and heating. That would defeat
the purpose of the thinner wires and would counter the low
superconducting temperatures. One experimental electricity transmission line
uses an HTS (high-temperature superconductor) material wound
around a hollow core which carries a liquid helium coolant.

Power generation
At the point of generation, generators built with superconducting
magnets, which would not require the presence of an iron core, would
potentially be only a fraction of the size and mass of present generators.
Less fossil fuel would be required to produce electricity, which would
reduce the emissions of greenhouse gases and other pollutants from power
plants. Figure 13.25 shows an example of the use of superconductors in
the power generation industry.

Figure 13.25 A section of the prototype for a high-temperature superconducting fault current
limiter (FCL). FCLs, sometimes called ‘chokes’, are important devices in the energy industry
for controlling faults in power supplies. The superconducting elements are housed in a
stainless steel vacuum flask filled with nitrogen. (Labelled parts: bushing, steel enclosure,
foam insulation, liquid nitrogen, vacuum, foil winding, superconducting bars.)

Power storage
One of the major problems faced by power stations is that electrical
energy cannot be stored easily. Essentially, electricity must be used
immediately. Superconducting magnetic energy storage (SMES) is one
possible answer to this problem. These facilities use a large ring structure
constructed using an HTS material and refrigerators. Electrical energy in
the form of a direct current can be introduced into the device. The current
is introduced as DC because the constant switching of direction in AC
produces some energy loss. The direct current would flow around
the SMES device’s circular path indefinitely without energy loss until
required, whereupon it can be retrieved and converted into alternating
current for delivery to domestic users and industry. Alternatively, it could be
transported by a superconducting transmission system as DC. The big
advantage of the SMES electricity storage system is that the power
generation machinery can continuously operate at peak efficiency levels no
matter whether demand is at a maximum or minimum. This minimises
the need to build new power stations and potentially opens the way to the
use of large-scale solar power stations with energy produced during the
daylight being stored for 24-hour use.

Electronics
There is enormous scope for the use of superconductors in electronics.
The speed and further miniaturisation of computer chips are limited by
the generation of heat (due to resistance of the electric current flow
required to make them run), and by the speed with which signals can be
conducted. Superconductive film, used as connecting conductors, may
result in more densely packed semiconductor chips. These could
transmit information several orders of magnitude faster.
Superconducting digital electronic components have achieved switching times
of 9 picoseconds ($9 \times 10^{-12}$ s). This is much faster than any conventional
switching device. The development of Josephson junctions has led to
very sensitive microwave detectors and magnetometers that are used in
geological surveys.
Medical diagnostics
Superconducting magnets have a vital role in the development of new
diagnostic tools. The intense magnetic fields used in magnetic resonance
imaging (MRI) instruments are ideal applications of superconductors.
The magnetic solenoids need to be large enough to allow a person to
enter. To produce a magnetic field of four tesla, an ordinary solenoid
winding would have to be a metre thick. There would be several hundred
kilowatts of power dissipated as heat for every metre length of conductor.
With superconducting alloys of tin-niobium, magnetic fields of 4 T can
be established easily. Furthermore, once the desired strength of magnetic
field is energised, that is, the current level in the superconducting
solenoid is achieved, it runs in a ‘persistent current mode’. In other words,
the current is simply cycling around the solenoid without energy loss.
Therefore, the device does not require the input of any more electric
power to maintain the magnetic field. Normal solenoids would require
constant power input.
The MRI works by tuning radio frequencies to
produce photons that have energies similar to the difference between the
‘spin up’ and ‘spin down’ states in a hydrogen atom in the human body.
The signal produced is essentially a measure of the concentration of
hydrogen atoms. From this information an image of the soft tissue in a
person’s body can be computer generated without the need for invasive
surgery or without the inherent disadvantages of using sound energy in
ultrasounds. Smaller magnetic solenoids can be placed around a person’s
body to follow the contours of the body. By changing the radio
frequencies of the instrument and with the use of computer reconstruction,
three-dimensional maps of the soft tissues of a patient can be made that
are highly detailed and accurate.
The ability to measure magnetic flux
precisely is used in a SQUID (Superconducting Quantum Interference
Device). This is an instrument to measure tiny magnetic fluxes that
generate electrical impulses in the device. Magnetic fields as small as $10^{-13}$ T
occur with small variations in current within the brain or the heart.
These provide important diagnostic information for doctors.
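The radio frequency that an MRI instrument must use can be estimated from
the field strength. The Python sketch below assumes the standard resonance
relation for hydrogen nuclei, f = (γ/2π)B with γ/2π of about 42.577 MHz per
tesla, and the photon energy E = hf. Neither relation is given in the text;
they are supplied here as assumptions, along with the function names. The
4 T field is the value mentioned above.

```python
PLANCK_CONSTANT_J_S = 6.62607015e-34      # h, in joule seconds
PROTON_GYROMAGNETIC_HZ_PER_T = 42.577e6   # gamma / (2 * pi) for hydrogen nuclei


def resonance_frequency_hz(field_tesla):
    """Radio frequency matching the spin-up/spin-down energy gap in a field B."""
    return PROTON_GYROMAGNETIC_HZ_PER_T * field_tesla


def photon_energy_joules(frequency_hz):
    """Energy of one photon of the given frequency: E = h * f."""
    return PLANCK_CONSTANT_J_S * frequency_hz


f = resonance_frequency_hz(4.0)   # the 4 T field mentioned in the text
print(f"f = {f / 1e6:.1f} MHz, photon energy = {photon_energy_joules(f):.2e} J")
```

The resonance frequency is proportional to the field, so a stronger and
steadier magnet gives a higher and better defined frequency. This is one
reason the persistent current mode of a superconducting solenoid suits MRI.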
The magnetically levitated train
Magnetic levitation (maglev) suspends an object so that it is free of
contact with any surface. This has the effect of providing a frictionless
contact with the ground, making it particularly appropriate for
high-speed trains. A typical maglev train is streamlined to reduce air
resistance and travels along a guideway (see figure 13.1, page 232, and
figure 13.26). Once the train is levitated, by continually changing the
polarity of alternate magnets along the track, a series of attractions and
repulsions is generated that provides the force to overcome air
resistance and accelerate the train along the guideway. Speeds of 517 km h$^{-1}$
have been achieved in Japan. The main obstacle to higher speeds is the
air resistance encountered by the train during motion. The enormous
amount of electrical power needed by the train is an obstacle to its
wider use.



There are two different maglev systems.
• The electromagnetic suspension system (EMS), currently used in
Germany, uses conventional electromagnets mounted under the train
  on structures that wrap around the guideway to provide the lift and to
  create the frictionless running surface. This system is unstable
because of the varying distances between the magnets and the
guideway. This instability needs to be monitored closely and computers
have provided the control to correct the instability. The lifting force is
produced by the attraction between the electromagnets under the train
and the ferromagnetic stator on the underside of the guideway. The magnets
pull the train up towards the stator, lifting it clear of the track.
• The electrodynamic suspension (EDS) system, developed in Japan,
uses superconducting magnets on the vehicle and electrically conduc-
tive strips or coils in the guideway to levitate the train. This does not
require the same degree of computer monitoring and adjustment
while travelling, but the requirement for very low temperatures
means that, for the moment, this is not a practical system. The system
for accelerating the train along a guideway is similar to the EMS
system.

Figure 13.26 (a) Use of magnets in suspending the train by electromagnetic levitation (diagram labels: guidance rail, guidance magnet, longstator, levitation magnet) (b) A maglev train, based on the EMS system



Particle accelerators
Most high energy particle accelerators now use superconducting mag-
nets. The Tevatron at Fermilab uses 744 superconducting magnets in a
ring of circumference 6.2 km.
The Large Hadron Collider (LHC) at CERN has a circumference of
27 km. It uses 1232 large dipole superconducting magnets, as well as many
smaller ones, bringing the total to over 1700 magnets. Each of the large
dipole magnets is 15 metres long and weighs 35 tonnes. A total length of
over 7000 kilometres of niobium–titanium superconducting cable is used
in these magnets. It has been estimated that if conventional electromagnets
had been used, the accelerator would have had to be over 120 kilometres
in circumference to produce the same energies, and its electricity demands
would have been phenomenal. (For more information on the LHC, see
‘Recent developments in accelerator research’, page 512.)

Table 13.2 A time line of the development of superconductivity

1911 Dutch physicist Heike Kamerlingh Onnes discovers superconductivity in mercury at a temperature of 4 K.
Onnes immediately predicts many uses for superconductors. One prediction involves the ability to produce
electrical motors without the need for an iron core in the electromagnet because the current carried would be
so much greater than possible with conventional wiring. Onnes’ vision is thwarted by the difficulties of finding
superconductors that can be made into wires easily and carry the current densities required to realise
it. The search for HTS begins.

1913 Onnes is awarded the Nobel Prize in physics for his research into the properties of matter at low temperature.

1933 W. Meissner and R. Ochsenfeld discover the Meissner effect.

1941 Scientists report superconductivity in niobium nitride at 16 K.

1953 Vanadium silicide (V₃Si) found to superconduct at 17.5 K.

1962 Westinghouse scientists develop the first commercial niobium-titanium superconducting wire.

1962 English physicist Brian Josephson predicts the ‘tunnelling’ phenomenon in which pairs of electrons can pass
through a thin insulating strip between superconductors.

1968 IBM begins a research program to produce a Josephson junction computer.

1972 John Bardeen, Leon Cooper and John Schrieffer win the Nobel Prize in physics for the first successful theory of
how superconductivity works.

1972 The Japanese test their first magnetically levitated (maglev) rail vehicle using a niobium-titanium
superconductor.

1982 The first MRI machines are placed in hospitals for evaluation. These use superconducting wires that create a
powerful magnetic field. These machines are considered the most significant advance in imaging devices since
the X-ray machine.

1986 IBM researchers A. Müller and G. Bednorz make a ceramic compound of lanthanum, barium, copper and oxygen
that superconducts at 35 K.

1987 University researchers at Houston use yttrium instead of lanthanum to produce a superconductor that operates
at 92 K.

1988 Allen Hermann of the University of Arkansas makes a superconducting ceramic containing calcium and thallium
that superconducts at 120 K, well above the boiling point of liquid nitrogen (77 K).

1993 A. Schilling, M. Cantoni and J. Guo produce a superconductor from mercury, barium and copper with a
maximum transition temperature of 133 K.

1996 US researchers demonstrate a 200 horsepower motor and a 2.4 kilowatt current limiter based on HTS.
A 50-metre HTS transmission line is built.



CHAPTER REVIEW

SUMMARY
• Max von Laue first showed that X-rays were diffracted into different patterns by rock crystals.
• Bragg’s Law is the description of the relationship between the wavelength of X-rays hitting and bouncing off a crystal surface and the distance between the particle layers in the crystal’s structure. It is described by the simple equation:
nλ = 2d sin θ.
• X-ray crystallography deals with the scattering of X-rays by arrays of atoms. When such arrays are very regular, such as in crystals, it is possible to interpret the results of the photographic pattern caused by this diffraction in terms of the atomic array in the structure of the crystal.
• Metals possess a crystal lattice structure.
• The temperature at which a metal achieves superconducting ability is called the critical temperature, Tc. Ceramic alloys have a much higher critical temperature and therefore are easier to use as superconductors than metals or metal alloys.
• The Meissner effect is the expulsion of a magnetic field from inside a superconductor.
• The BCS theory describes pairs of electrons (Cooper pairs) which perform a coordinated motion inside the crystal structure of the conductor.
• There are two types of superconductor: Type I are made of metals, have low critical temperatures and produce small magnetic fields; Type II are made from ceramic compounds which have a higher Tc and produce more useful magnetic fields.
• Superconductors are used in superconducting magnets (for example, the maglev train), transmission of current and computer switching.

QUESTIONS
1. In an X-ray diffraction experiment, a maximum intensity of X-rays was found to be at 15° and the interatomic distance was 2.5 × 10⁻¹⁰ m.
(a) Would you expect to observe maximum intensities at other angles?
(b) Explain why this happens.
2. Explain the significance of the Braggs’ use of regular crystal planes instead of a diffraction grating with X-rays.
3. Describe the role that Cooper pairs of electrons play in superconductivity.
4. (a) Describe two main differences between Type I and Type II superconductors.
(b) When you experiment with superconductors in the lab, you use Type II superconductors. Explain why you would not normally use Type I.
5. (a) What is the resistance of a superconductor in the normal state if 300 mA of current are passing through the sample and 4.2 mV are measured across the voltage probes?
(b) What potential difference is required to force 300 mA through the same superconductor, if the resistance is 1.0 × 10⁻⁴ Ω?
6. Consider wiring the superconductor from question 5(b) in series with a 10 Ω resistor and connected to a 1.5 V battery. How much electrical current will flow through the superconductor?
7. Explain how a flat superconductor is able to levitate a small magnet above its surface.
8. The following data were obtained from a YBCO superconductor.
(a) Calculate the resistance (and complete the table on page 252) for each trial given that a constant current of 100 mA was flowing through the sample.
(b) Explain why some of the voltages obtained in this experiment show negative values.
(c) Using the information in the table on the following page, graph resistance versus temperature.
(d) Find the point on the graph with the largest slope.
(e) Estimate the critical temperature, Tc, from your graph.
(f) Is this a Type I or a Type II superconductor? Explain.
9. List and describe two applications of superconductors.
10. (a) List the problems that scientists must overcome before superconductors can be used effectively.
(b) Suggest some ways of overcoming these problems.
11. What are the advantages of ‘high temperature’ superconductors?
12. Describe the differences between conductors and superconductors.



VOLTAGE (V)      TEMPERATURE (K)    RESISTANCE (Ω)
0.001 0370       118.2
0.001 0270       116.1
0.001 0600       114.8
0.001 0490       112.9
0.001 0350       110.9
0.001 0200       109.1
0.001 0090       106.9
0.001 0010       105.0
0.000 9890       103.5
0.000 9750       102.2
0.000 9670       100.0
0.000 9510       97.9
0.000 9440       95.8
0.000 9180       95.0
0.000 9110       94.3
0.000 8920       93.8
0.000 8440       93.5
0.000 7830       93.2
0.000 6390       93.0
0.000 5050       92.6
0.000 3790       92.3
0.000 2430       92.1
0.000 0930       91.7
0.000 0100       91.4
0.000 0030       91.0
0.000 0002       90.8
−0.000 0002      90.1
−0.000 0001      89.9
0.000 0003       89.5
−0.000 0001      88.8
0.000 0001       88.5
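
The resistance column is filled in using Ohm's law, R = V/I. The following is a minimal Python sketch of that calculation, assuming the constant 100 mA current stated in question 8(a) and using only four of the readings for illustration. The function and variable names are illustrative and do not come from the text.

```python
def resistance(voltage_v, current_a):
    """Return resistance in ohms from Ohm's law, R = V / I."""
    if current_a == 0:
        raise ValueError("current must be non-zero")
    return voltage_v / current_a


SAMPLE_CURRENT_A = 0.100  # constant 100 mA stated in question 8(a)

# Four illustrative (voltage in V, temperature in K) readings from the table.
readings = [
    (0.0010370, 118.2),
    (0.0009670, 100.0),
    (0.0005050, 92.6),
    (0.0000100, 91.4),
]

# Pair each temperature with the resistance calculated at that temperature.
results = [(temp_k, resistance(volts, SAMPLE_CURRENT_A)) for volts, temp_k in readings]

for temp_k, ohms in results:
    print(f"{temp_k:6.1f} K  {ohms:.6f} ohm")
```

The same function can be applied to every row of the table before graphing resistance against temperature.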



PRACTICAL ACTIVITIES

13.1 TEMPERATURE CHANGE IN SUPERCONDUCTORS

Aim
To determine the Tc of a superconductor using the Meissner effect.

Theory
One way to measure the critical temperature of a superconductor is by using the Meissner effect. When the temperature of a superconductor is lowered to below the critical temperature, Tc, the superconductor will push the magnetic field ‘out of itself’. This will result in the magnet being forced to levitate and float above the superconductor.
By noting the temperature changes as this levitation occurs it is possible to obtain the critical temperature.

Apparatus
YBCO superconductor with attached thermocouple
small magnet
digital voltmeter
liquid nitrogen
If available, a temperature probe attached to a data logger is more useful. Data collected can be transferred to a computer and can be processed by all students using an appropriate software package.

Method
1. Attach the thermocouple lead from the superconductor to a digital voltmeter set to the millivolt range.
2. Completely immerse the superconducting disc in liquid nitrogen with the thermocouple on the bottom. Calibrate your thermocouple according to the manufacturer’s specifications.
3. Balance your magnet above the superconducting material and observe the levitation due to the Meissner effect. When the liquid nitrogen has almost completely boiled away the temperature will begin to increase.
4. Observe the magnet as the disc warms.
Note: Even though the top of the disc warms up before the bottom, the Meissner effect will continue (but diminish) until the entire superconductor is above the critical temperature.

Results
Observe the effects of each step of the method and record the temperature when the magnet comes to a complete rest on the superconducting disc.

Analysis
Predict what will happen as the liquid nitrogen boils away.

Questions
Why should the thermocouple be placed on the bottom during step 2 of the method?

13.2 LEVITATION AND THE MEISSNER EFFECT

Aim
To observe the Meissner effect: the levitation of a magnet above a superconductor.

Apparatus
superconducting pellet
neodymium-iron-boron (or other strong) magnet
liquid nitrogen
petri dish
dewar flask or styrofoam cup
non-magnetic (non-metallic) tweezers
insulated gloves

Theory
Levitation of a magnet above a superconductor is an example of the Meissner effect. If the temperature of a superconductor is lower than its critical temperature, a magnetic field cannot penetrate the superconductor.

Method
1. Carefully fill the styrofoam cup with liquid nitrogen. Place the petri dish on top of the styrofoam cup and carefully pour in enough liquid nitrogen until the liquid is about 1 cm deep. Wait until the boiling subsides.
2. Using non-metallic (non-magnetic) tweezers, carefully place the superconductor in the liquid nitrogen in the petri dish. Again wait until the boiling subsides.
3. Using the same non-metallic tweezers, carefully place a small magnet about 2 mm above the centre of the superconductor pellet. Release the magnet.
4. If the magnet jumps away from the superconductor, try placing the magnet on the pellet and then placing both in the petri dish and wait until the boiling subsides.
5. While the magnet is suspended above the superconducting pellet, gently rotate the magnet using the non-magnetic tweezers.

Results
Describe what you observe and note any changes that occur over time.

Questions
1. At what temperature would you predict the magnet to move away from the superconductor?
2. Predict what should happen using the alternative procedure outlined in step 4 of the method. Is your prediction supported by your observations? Explain.
3. Explain what you observe when the magnet is gently rotated using non-magnetic tweezers.

13.3 RESISTANCE AND SUPERCONDUCTORS

Aim
To produce a switch and hence observe the resistance within superconductors.

Apparatus
YBCO superconductive wire with attached leads
two dry cell batteries with holder/attachments
three volt light globe with holder
liquid nitrogen
dewar flask or styrofoam cup (as an alternative)

Method
1. Connect the superconductor, light globe and batteries in series. When the superconductor is at room temperature it is in the normal state and will therefore have a high resistance. As a result of having a high resistance in series in the circuit, the globe will not light.
2. Your teacher will place the superconductor into the liquid nitrogen.
3. Your teacher will then remove the superconductor from the liquid nitrogen. The globe will begin to dim and eventually go out.

Analysis
At each stage describe what you observe and explain the observations.



HSC OPTION MODULE
ASTROPHYSICS
Chapter 14 Looking and seeing
Chapter 15 Astronomical measurement
Chapter 16 Binaries and variables
Chapter 17 Star lives

CHAPTER 14
LOOKING AND SEEING
Remember
Before beginning this chapter, you should be able to:
• describe the electromagnetic spectrum and its
components
• describe atmospheric filtering of the electromagnetic
spectrum.

Key content
At the end of this chapter you should be able to:
• describe the selective absorption of the
electromagnetic spectrum by the atmosphere and
relate this to the need to observe those wavelengths
from space
• discuss Galileo’s use of his telescope to observe the
features of the Moon
• define the terms resolution and sensitivity of
telescopes, and be able to calculate the resolution of
a variety of telescopes
• demonstrate why telescopes need a large diameter
objective lens or mirror for sensitivity and resolution
• discuss the problems associated with ground-based
astronomy in terms of resolution and selective
absorption of electromagnetic radiation
• outline methods being employed to try to improve
the resolution and/or sensitivity of ground-based
systems, including active optics, adaptive optics and
interferometry.

Figure 14.1 The Hubble Deep Field is a small portion
of the sky selected for its absence of foreground stars.
Although a very small portion of the northern sky (about
the size of a grain of sand held at arm’s length), the
Hubble Deep Field shows over 3000 galaxies at various
distances. This image is a compilation of over 300 separate
exposures taken by the Hubble Space Telescope.
14.1 GALILEO’S TELESCOPES
In 1609, an Italian scientist named Galileo Galilei heard the news that a
Dutchman had pieced together a few glass lenses in such a way as to
make distant objects seem much closer. Galileo set about constructing
such a device for himself. First he had to work out the optics involved,
then grind the lenses and then build it. His first attempt was quite poor
but clearly demonstrated its potential. He built more and better
telescopes (two are shown in figure 14.2) until he had one with a
magnification of over thirty times.
Five years earlier, Galileo witnessed the appearance of a new star in the
heavens — something that was not supposed to occur according to the
prevailing Aristotelian view, endorsed by the Roman Catholic Church,
that the heavens were perfect and unchanging. That event sparked
within him an interest in astronomy so that, when finally he had a tele-
scope of his own, it was natural for him to point it upward, at the night
sky. He was the first person to do so.
The first heavenly object to catch his attention was the Moon. Of that
he wrote:
. . . the surface of the moon is not smooth, uniform, and precisely
spherical as a great number of philosophers believe it (and other
heavenly bodies) to be, but is uneven, rough, and full of cavities and
prominences, being not unlike the face of the Earth, relieved by
chains of mountains and deep valleys.

Figure 14.2 Two of Galileo’s telescopes on display at the museum of science in Florence, Italy

Figure 14.3 A sample of the sketches Galileo made of the Moon

Galileo was even able to devise a means of calculating the height of a
mountain on the Moon from a measurement of its shadow. These
observations of the Moon were startling enough but, during the course of that
year and the next, Galileo made a number of other startling discoveries,
such as the moons of Jupiter, that would fundamentally challenge the way
science would regard the heavens.



Such was the impact of the first application of a telescope to the field
of astronomy. Telescopes literally provide us with a window to the
heavens and, as technology has improved, so too has our ability to build
bigger and better telescopes which, in turn, yield more discoveries and
better science.

14.2 ATMOSPHERIC ABSORPTION OF THE ELECTROMAGNETIC SPECTRUM
You will recall from your Year 11 studies of Physics that visible light is but
one part of a larger spectrum of electromagnetic radiation. The various
components of the spectrum are summarised in table 14.1. Note that
there is no clear boundary between adjacent parts of the spectrum, and
so there is some overlap in wavelength.

The electromagnetic spectrum was discussed in Physics 1: Preliminary
Course, chapter 3. It is also illustrated in this text in chapter 11
(see page 195).

Table 14.1 Components of the electromagnetic spectrum

EM SPECTRUM      WAVELENGTH (m)              COMMENT
Gamma rays       < 10⁻¹⁰                     Absorbed by the atmosphere.
X-rays           10⁻¹¹ to 10⁻⁷               Absorbed by the atmosphere.
Ultraviolet      10⁻⁸ to 4 × 10⁻⁷            Mostly absorbed by the atmosphere.
Visible light    4 × 10⁻⁷ to 7 × 10⁻⁷        Not absorbed by the atmosphere.
Infra-red        7 × 10⁻⁷ to 1 × 10⁻²        Freely penetrates haze but is incompletely absorbed by the atmosphere.
Radio waves      1 × 10⁻³ to 1 × 10⁶         A broad grouping of microwaves and radio bands (uhf, vhf, hf, mf and lf). Not absorbed by the atmosphere.

As indicated in table 14.1, not all of the electromagnetic spectrum can
penetrate the atmosphere of the Earth. Despite the fact that all
components of the electromagnetic spectrum strike the outer atmosphere
from space, only visible light, radio waves and some UV and IR make it
through to the ground (see figure 14.4).
This, in turn, means that ground-based telescopes can operate only in
the visible spectrum (optical telescopes such as the Anglo-Australian
Telescope, shown in figure 14.5) or in the radio bands (radio telescopes
such as Parkes, shown in figure 14.6 on page 260). Observations of other
frequencies must be carried out either from a plane or high-altitude
balloon in the upper atmosphere or from a spacecraft above the
atmosphere, such as the Hubble Space Telescope (see figure 14.7,
page 260). For ground-based telescopes, some atmospheric effects still
have to be taken into account as discussed in section 14.4 (page 265).

See the ‘Physics in focus’ boxes on pages 264, 266 and 268 for
information on the latest in space and ground-based telescopes around
the world.

Figure 14.4 The Earth’s atmosphere filters out most of the electromagnetic radiation coming from space. Two notable ‘windows’ exist in this barrier: visible light and radio waves. (Graph of altitude, in km, against wavelength for X-rays and gamma rays, ultraviolet, visible and near infra-red, infra-red, microwaves, radio waves and long waves.)

Figure 14.5 The 3.9 metre Anglo-Australian Telescope (AAT) at Siding Springs

Figure 14.6 The Parkes 64-metre radio telescope

Figure 14.7 The Hubble Space Telescope (HST)
14.3 TELESCOPES
There are many different designs
for telescopes, yet all of the popular
designs are based upon just two basic
arrangements — refracting telescopes
and reflecting telescopes.

Refracting telescopes
A refracting telescope, such as that shown in
figure 14.8, is the style of telescope that
most people recognise. Lenses are used
to gather and focus the starlight by
refraction, or bending, of the rays. As
figure 14.9 shows, the light enters at
one end and is focused by two lenses
to form an image in an observing
eye located at the other end. This
arrangement of lenses causes an
image to be seen upside-down and
back-to-front; however, this is not a
problem when observing stars.

Figure 14.8 A common refracting telescope

Figure 14.9 The arrangement of lenses inside an astronomical refracting telescope (diagram labels: light from a star, objective lens, optical tube, eyepiece lens, eye)

Refracting telescopes are preferred for planetary and lunar observations
because the unobstructed light path of a refractor results in good image
contrast, which is important when observing planets. They are less suited
to observing stars because the lenses can introduce image errors (called
aberrations), and because large lenses are expensive to manufacture
accurately.

Reflecting telescopes
Figure 14.10 shows the type of reflecting telescope found in NSW high
schools. This type of telescope uses a parabolic concave mirror to gather
and focus the starlight by reflection of the rays. Figure 14.11, on the
following page, shows a variety of common designs. The most basic
design, shown in figure 14.11(a), is the prime focus. This is the design
used by radio telescopes, with the signal coming from the detector in
electronic form. For optical work, however, it is necessary to direct the
light out of the telescope tube. School telescopes use the design shown in
figure 14.11(b), known as a Newtonian reflector since Isaac Newton first
suggested it. Larger research telescopes use the Cassegrain design shown
in figure 14.11(c), which directs the light through a hole in the primary
mirror. This design can be produced on a large scale far less expensively
than similarly sized refracting telescopes.

Figure 14.10 A common Newtonian reflecting telescope

Figure 14.11 A number of reflecting telescope designs: (a) the prime focus, (b) the Newtonian, and (c) the Cassegrain

Figure 14.12 The focal length of a convex lens or concave mirror is the distance between the lens/mirror and the focus when parallel light enters.

Telescope performance
Many newcomers to telescopes can become unduly concerned with
magnification. Of the three performance measures discussed here,
magnification is the least important. In fact, it is mentioned here only
because its discussion demonstrates the practical effect of changing the
eyepiece lens of a telescope.
Any convex lens or concave mirror has a focal length, as shown in
figure 14.12. A telescope has two focal lengths of concern: the focal
length, f, of the telescope itself (that of the objective lens in a simple
refractor, or that of the primary mirror in a simple reflector) and the
focal length, fe, of the telescope eyepiece. The magnification, m, of the
telescope can be calculated using the expression:
$$ m = \frac{f}{f_e} $$

SAMPLE PROBLEM 14.1 Calculating magnification
A telescope has a focal length of 125 cm and it is fitted with an eyepiece
with a focal length of 12.5 mm. Determine its magnification.

SOLUTION
$$ m = \frac{f}{f_e} = \frac{1.25\ \text{m}}{0.0125\ \text{m}} = 100 $$
The magnification is 100×.

SAMPLE PROBLEM 14.2 Changing the eyepiece
The telescope in sample problem 14.1 is now fitted with a different eyepiece,
this time with a focal length of 20 mm. What is the new magnification?

SOLUTION
$$ m = \frac{f}{f_e} = \frac{1.25\ \text{m}}{0.020\ \text{m}} = 62.5 $$
The magnification is 62.5×.
This lower magnification may seem less desirable. However, it will give a
wider and brighter field of view, which can make the job of locating
specific stars easier.
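
The effect of swapping eyepieces can be explored with a short calculation. The following is a minimal Python sketch of the magnification expression m = f / fe, using the focal lengths from sample problems 14.1 and 14.2. The function and variable names are illustrative and do not come from the text.

```python
def magnification(telescope_focal_length_m, eyepiece_focal_length_m):
    """Return the magnification m = f / f_e (both lengths in metres)."""
    if eyepiece_focal_length_m <= 0:
        raise ValueError("eyepiece focal length must be positive")
    return telescope_focal_length_m / eyepiece_focal_length_m


TELESCOPE_F_M = 1.25  # 125 cm telescope from sample problem 14.1

# Compare the two eyepieces used in sample problems 14.1 and 14.2.
for eyepiece_mm in (12.5, 20.0):
    m = magnification(TELESCOPE_F_M, eyepiece_mm / 1000)
    print(f"{eyepiece_mm} mm eyepiece: {m:.1f}x")
```

A shorter eyepiece focal length always gives a higher magnification for the same telescope, which is why eyepieces are the part that is swapped.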

14.1 Comparing the light-gathering ability of different sized lenses

The sensitivity of a telescope is its ability to pick up faint objects for
observation, or its light-gathering power. This depends upon the collecting
area of the lens or mirror, since a larger area means more light is being
gathered and focused to form an image. The collecting area of the
telescope is proportional to the square of its radius, or diameter (which is
the dimension usually quoted). Therefore, a larger diameter telescope will
usually mean a more sensitive one. A 100 mm school telescope, such as the
one shown in figure 14.10, will be much less sensitive than the 3.9 m AAT
shown in figure 14.5 simply because of the significant difference in diameter.
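
To put a number on that difference, the collecting areas of the two telescopes can be compared. The following is a minimal Python sketch, assuming each aperture is a plain circle and ignoring any light blocked by a secondary mirror. The function names are illustrative and do not come from the text.

```python
import math


def collecting_area(diameter_m):
    """Area of a circular aperture, pi * (D / 2)**2, in square metres."""
    return math.pi * (diameter_m / 2) ** 2


def light_gathering_ratio(diameter_a_m, diameter_b_m):
    """How many times more light aperture A collects than aperture B."""
    return collecting_area(diameter_a_m) / collecting_area(diameter_b_m)


# The 3.9 m AAT compared with a 100 mm school telescope.
ratio = light_gathering_ratio(3.9, 0.100)
print(f"The 3.9 m mirror collects about {ratio:.0f} times more light")
```

Because area scales with the square of the diameter, a mirror 39 times wider gathers roughly 39² times as much light.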
The theoretical resolution of a telescope is its ability to distinguish two
close objects as separate images. It is measured as an angle and depends
upon the wavelength of light, or other electromagnetic radiation being
collected, as well as the diameter of the telescope. The following formula
for resolution is sometimes called the ‘Dawes limit’:
$$ R = \frac{2.1 \times 10^{5}\,\lambda}{D} $$
where
R = resolution (arcsec, or seconds of arc)
λ = wavelength (m)
D = diameter (m).
Note that a smaller angle indicates a higher resolution.

SAMPLE PROBLEM 14.3 Theoretical resolution of the Parkes telescope
What is the theoretical resolution of the 64 m Parkes radio telescope
when observing radio waves of wavelength 3 cm?

SOLUTION
$$ R = \frac{2.1 \times 10^{5}\,\lambda}{D} = \frac{2.1 \times 10^{5} \times 0.03}{64} = 98\ \text{arcsec} $$

SAMPLE PROBLEM 14.4 Theoretical resolution of a small telescope
What is the theoretical resolution of a 100 mm Newtonian reflecting
telescope when observing starlight with a wavelength of approximately 500 nm?

SOLUTION
$$ R = \frac{2.1 \times 10^{5}\,\lambda}{D} = \frac{2.1 \times 10^{5} \times 500 \times 10^{-9}}{0.100} = 1.05\ \text{arcsec} $$



SAMPLE PROBLEM 14.5 Theoretical resolution of the Anglo-Australian Telescope
Determine the theoretical resolution of the 3.9 m Anglo-Australian
Telescope when observing starlight of wavelength 500 nm.

SOLUTION
$$ R = \frac{2.1 \times 10^{5}\,\lambda}{D} = \frac{2.1 \times 10^{5} \times 500 \times 10^{-9}}{3.9} = 0.027\ \text{arcsec} $$
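
The three resolution calculations in sample problems 14.3 to 14.5 follow the same pattern, so they can be collected into one short function. The following is a minimal Python sketch of the resolution formula quoted in the text, with wavelength and diameter both in metres. The function and dictionary names are illustrative and do not come from the text.

```python
def resolution_arcsec(wavelength_m, diameter_m):
    """Theoretical resolution R = 2.1e5 * wavelength / diameter, in arcsec."""
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    return 2.1e5 * wavelength_m / diameter_m


# (wavelength in m, diameter in m) for the three sample problems.
cases = {
    "Parkes, 3 cm radio waves": (0.03, 64.0),
    "100 mm Newtonian, 500 nm light": (500e-9, 0.100),
    "AAT, 500 nm light": (500e-9, 3.9),
}

for name, (wavelength_m, diameter_m) in cases.items():
    r = resolution_arcsec(wavelength_m, diameter_m)
    print(f"{name}: {r:.3g} arcsec")
```

Keeping both quantities in metres avoids the most common slip in these problems, which is mixing centimetres or nanometres with metres.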

We can see from the sample problems above that a radio telescope, by
the nature of the wavelengths it observes, is restricted to very poor
resolutions. However, radio telescopes usually have very large collecting
areas and therefore can be very sensitive devices. Another factor
contributing to their sensitivity is that radio signals can be amplified with
very little increase in noise using electronic amplifiers. This cannot be
done with light signals.
By comparison, when looking at the stars with just a 100 mm optical
telescope you will be enjoying a far superior resolution of about 1 arcsec.
However, this telescope is much less sensitive than a radio telescope. To
look at the stars with increased sensitivity, we will need to move to a larger
optical telescope such as the Anglo-Australian Telescope which, by virtue
of its 3.9 m mirror, enjoys a much brighter field of view. Theoretically, it
should also enjoy a much greater resolution than the small telescope.
Ironically, however, it does not because of atmospheric blurring, or ‘seeing’.

14.2 The Australian Telescope Compact Array

PHYSICS IN FOCUS
The S-Cam
The S-Cam (Superconducting Camera) is an example of technology pushing optical telescope
sensitivity to its limits. It incorporates a cryogenic light sensor built using superconductors and
cooled to just 1 K. It is able to register individual photons of light, very quickly recording their position
as well as directly measuring their colour. The information is
accumulated in a database that allows the examination of very
quick variations in light. Such variations are typical of some
astronomical events, such as the optical explosions
associated with gamma-ray bursts; these events could not
be adequately studied with previous technology. The
S-Cam is currently fitted to the 4.2 metre William
Herschel Telescope, located at the
Observatorio de Roque de los Muchachos
on the island of La Palma in the Canary
Islands.

Figure 14.13 The S-Cam has been developed by the European Space Agency (ESA).

14.4 SEEING
If you look across a car park or along a road on a very hot day you will see
ripples rise from the surface. You notice it because the moving hot air
distorts the light passing through it. The same thing occurs to starlight
entering the Earth’s atmosphere. Turbulent air distorts the path of the
starlight through it, making the stars appear to twinkle, and blurs their
image. This effect is known as ‘seeing’, and will normally blur the image
of a star to about 1 arcsec. The best locations in the world for looking at
stars, such as Mauna Kea, Hawaii, have a seeing of about 0.5 arcsec.

‘Seeing’ refers to the twinkling and blurring of a star’s light due to
atmospheric distortion.

This imposes a practical limit on the achievable resolution from a large
optical telescope. The Anglo-Australian Telescope, when opened in 1974,
was restricted to a seeing of about 1 arcsec, despite its theoretical
resolution of approximately 0.03 arcsec. Ironically, this is no better than a
small 100 mm telescope with a theoretical resolution of 1 arcsec,
although the AAT’s view is much brighter.
Radio telescopes are not affected as much by seeing, by virtue of the
longer wavelengths they observe. There is some effect when observing
wavelengths of a few millimetres — water vapour and oxygen in the atmos-
phere tend to absorb radio signals of this wavelength. In addition, rain can
be a factor since raindrops are a few millimetres in size. However, wave-
lengths longer than this are not affected by atmospheric blurring.
There is one other obstacle to viewing that should be mentioned —
the Sun. Obviously the Sun interferes with optical viewing, restricting
optical astronomers to night viewing. Less obviously, the Sun is also a
source of interference for radio astronomers since it is a strong radio
source. This usually prevents radio telescope observations within 90° of
the Sun, unless a particularly strong radio source is being viewed, such as
certain quasars.

14.5 MODERN METHODS TO IMPROVE TELESCOPE PERFORMANCE
The capability limits of telescopes appeared to have been reached several
decades ago. Radio telescopes are quite sensitive and are not bothered by
seeing conditions but have quite poor resolution. Optical telescopes are
expensive to manufacture in large diameters but they must be large to be
sensitive. However, seeing limits their effective resolution to no better
than a 200 mm telescope even at the very best locations in the world.
Recently there have been many innovative approaches to overcoming
these barriers to effective ground-based astronomy.

Interferometry
The resolution problem of radio telescopes can be overcome by using
many radio dishes laid out in a large pattern, and then combining their
signals together to make them behave as a single radio telescope with a
much larger diameter.
This has been done in New Mexico, USA, to create the Very Large
Array (VLA) shown in figure 14.14. The VLA is made up of 27 radio
dishes set out in a large Y pattern up to 36 km across. Each dish is 25 m
in diameter but, when combined electronically, they provide the
resolution of a dish 36 km in diameter and the sensitivity of a dish 130 m
in diameter.
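
The 130 m sensitivity figure can be checked with a short calculation. The sketch below assumes that sensitivity depends only on total collecting area and that all 27 dishes are identical; the function name is illustrative only.

```python
import math

def equivalent_dish_diameter(n_dishes: int, dish_diameter_m: float) -> float:
    """Diameter of one dish with the same collecting area as n identical dishes."""
    total_area = n_dishes * math.pi * (dish_diameter_m / 2) ** 2
    return 2 * math.sqrt(total_area / math.pi)

# 27 dishes, each 25 m across (values quoted in the text)
vla_equivalent = equivalent_dish_diameter(27, 25.0)
print(f"Equivalent single dish: {vla_equivalent:.0f} m")
```

Because area scales with the square of the diameter, the result is simply 25 m multiplied by the square root of 27, which agrees with the quoted 130 m.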



Figure 14.14 The Very Large Array
(VLA) in New Mexico. The dishes can
be moved along tracks to give four
different arrangements of 36 km
across, 10 km across, 3.6 km across or
just 1 km across.

We can use the resolution formula on the largest array size of 36 km
when observing the shortest receivable wavelength of 7 mm, to calculate
the maximum theoretical resolution:
$$R = \frac{2.1 \times 10^{5}\,\lambda}{D} = \frac{2.1 \times 10^{5} \times 0.7\ \text{cm}}{3.6 \times 10^{6}\ \text{cm}} = 0.04\ \text{arcsec}.$$
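
The same formula can be wrapped in a small function to compare instruments. This is a minimal sketch: the function name is illustrative, and the 3.9 m aperture for the AAT and the 500 nm observing wavelength are assumptions not stated in this section.

```python
def resolution_arcsec(wavelength: float, diameter: float) -> float:
    """Theoretical resolution R = 2.1e5 * wavelength / diameter, in arcsec.

    Wavelength and diameter must be in the same length unit.
    """
    return 2.1e5 * wavelength / diameter

# VLA: 7 mm waves across a 36 km array (both in metres)
print(f"VLA:    {resolution_arcsec(0.007, 36_000):.2f} arcsec")

# Optical telescopes at an assumed wavelength of 500 nm
print(f"AAT:    {resolution_arcsec(500e-9, 3.9):.2f} arcsec")   # assumed 3.9 m mirror
print(f"100 mm: {resolution_arcsec(500e-9, 0.100):.2f} arcsec")
```

Under these assumptions the values agree with the figures quoted in the text: about 0.04 arcsec for the VLA, about 0.03 arcsec for the AAT and about 1 arcsec for a 100 mm telescope.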
This means that at 0.04 arcsec the VLA is challenging large optical
telescopes for theoretical resolution and, in practical terms, is bettering
them since it is not bothered as much by seeing.
The VLA is an interferometer, which means that it combines the data
from each element of the array to form an interference pattern.
Computers are used to mathematically analyse these patterns to reveal
information about the structure of the radio source. This technique is
known as interferometry.

Interferometry is a technique used to combine the data from several
elements of an antenna array in order to achieve a higher resolution.
Interferometry techniques have also been used to ‘unblur’ the images
from large optical telescopes. ‘Speckle interferometry’ uses many images
from a telescope, keeping each exposure short enough to freeze the
atmospheric blur. A computer is then used to process the many exposures and
extract more exact information about the star or other object.

eBook plus Weblink: The Square Kilometre Array

PHYSICS IN FOCUS
The Square Kilometre Array (SKA)
The SKA is a new generation radio telescope array that may place
Australia at the forefront of radio astronomy. An international project
that may be located on Australian soil, the SKA will be an array of
radio antennas linked to work as a radio telescope with an effective
collecting area of one square kilometre. The antennas will be arranged
into groups of about one hundred, each group forming an array station.
Approximately 80 of these stations will be arranged in a spiral pattern
up to 400 km from a core array as shown in figure 14.15. There will also
be several array stations located even further from the core in order to
give the SKA a high resolution capability.

Figure 14.15 The planned layout of the SKA, a new generation radio
telescope array (diagram labels: array station, inner core, 200 km scale)

Active optics
The most recent developments for optical telescopes are active and
adaptive optical systems. These systems seek to detect the errors in
starlight caused by atmospheric blurring and then to optically correct them
automatically. If done properly, the telescope operator should be aware only
of improved seeing.
Active optics use a slow feedback system to correct sagging or other
deformities in the primary mirror of large modern reflector telescopes.
In the past, large telescopes such as the AAT used primary mirrors with
a thickness about one sixth of their diameter in order to ensure that
they did not deform as the telescope was moved around the sky. However,
there is a new generation of 8 to 10 m reflecting telescopes that use
thin mirrors, just 20 cm thick approximately. These mirrors will
certainly change shape as the telescope changes direction or heats up or
cools down. However, the back of the mirror is fitted with many actuators
that can push or pull the mirror back into the correct shape.
When the light leaves the primary mirror, but before it reaches the
final lens (where the eyepiece is in a small telescope), it is slowly
sampled by a ‘wavefront sensor’. This is a type of interferometer, which
can detect how the incoming light has been altered. By sampling
slowly, the effect of atmospheric turbulence is eliminated and any
remaining effect is then due to deformities in the primary mirror. A
computer calculates the required shape adjustments and then moves
the actuators as required every few minutes.
The first telescope to use active optics was the 3.5 m New Technology
Telescope in Chile, which uses 75 actuators under its primary mirror.
The most notable use of active optics is in the new 10 m telescopes
Keck I and Keck II in Hawaii (see page 269). The enormous mirrors in
these telescopes are each made up of 36 separate pieces of appropriate
shapes. The position and curvature of each piece is controlled by the
active system and adjusted twice per second.

Adaptive optics
Adaptive optics use a more aggressive approach in an attempt to correct
effects of atmospheric turbulence. A wavefront sensor is still employed
between the primary mirror and the lens, as shown in figure 14.16. This
time, however, rapid computer-calculated corrections are fed to one or
two secondary mirrors that ‘straighten out’ the light. These corrections
are made at up to 1000 times per second, and this speed is the major
difference between adaptive and active systems. Figure 14.17, on the
following page, shows how adaptive optics allows a binary star to be
seen more clearly.

Adaptive optics use a fast feedback system to attempt to correct for
effects of atmospheric turbulence.

Figure 14.16 A typical adaptive optical system layout (diagram labels:
star; telescope; incoming light wavefronts have been distorted by
atmosphere; tip-tilt mirror; deformable mirror; beam splitter; wavefront
sensor; computer; signals to control tip-tilt mirror; signals to control
deformable mirror; light wavefronts now corrected; focal-plane image)



One of the possible secondary mirrors is called a ‘tip-tilt’ mirror, which
is able to adjust for slight changes in the position of the light. (A tip-tilt
system is used in the Anglo-Australian Telescope.) The other is a
deliberately deformable mirror to adjust for deformities in the light. Making
this type of image correction presents a considerable technological challenge
and some development is still required before many large telescopes can
successfully adopt adaptive optics.

Figure 14.17 Adaptive optics used to improve the seeing on the Canada–France–Hawaii
Telescope (CFHT) at Mauna Kea Observatory, Hawaii. On the left can be seen a raw image
of an unresolved close binary (double) star system. The two stars are 0.38 arcsec apart but
seeing is approximately 0.7 arcsec. In the centre, the adaptive optics have been turned on and
the two stars can now be seen quite clearly. In the third image, computer enhancement has
been added. © CFHT, 1996. Used with permission.

PHYSICS IN FOCUS
Advanced telescope technology
NASA’s ‘Great Observatories Program’ has worked to place four telescopes
in space to cover the whole electromagnetic spectrum, including regions
not observable from the ground. Being above the atmosphere also
eliminates atmospheric blurring of images. The four telescopes are
described below.
1. The Hubble Space Telescope (HST) was put into orbit in 1990. It needed
repairs soon after but has since demonstrated the remarkable clarity
possible with no seeing to blur its images. HST detects visible and
ultraviolet light, so that its view of the universe is much as we see it.
2. The Compton Gamma Ray Observatory (GRO) detected high-energy gamma
rays. GRO was put into orbit in 1991 but was brought down on 4 June 2000
after failure of one of its gyroscopes six months earlier, and crashed
into the Pacific Ocean. Gamma rays are produced by high-energy processes.
This meant that GRO could look at events such as solar flares, supernovae
and hot, spinning disks of matter around black holes. These gamma rays
are not visible to us so the view that GRO could see was a view of the
world foreign to our eyes.
3. The Chandra X-Ray Observatory (formerly called AXAF) was named after
Nobel Prize winner Subrahmanyan Chandrasekhar. It was launched into orbit
in 1999 to observe X-rays that, although less energetic than gamma rays,
are produced by similar events: the hot matter associated with objects
such as supernovae, quasars or black holes.
4. The Spitzer Space Telescope (formerly called SIRTF) was launched in
2003 for what became a five-year mission. Spitzer observes the infra-red
light produced by cool objects such as nebulae discs in which stars are
born, and discs of dust or planets around other stars. To observe these,
Spitzer was placed into an Earth-trailing orbit around the Sun, where its
cryogenically cooled instruments suffer less heating than if it were in
an Earth orbit.

Spectrum bands shown: gamma rays, X-rays, ultraviolet, visible light,
infra-red, microwave, radio waves.
Compton: observes hot events such as supernova explosions at
temperatures of ~10 billion degrees.
Chandra: also observes hot events, which can also be associated with
black holes, at temperatures of ~10 million degrees.
Hubble: observes nearby stars and distant galaxies at temperatures of
~10 000 degrees.
Spitzer: observes events of the early universe at temperatures of
~100 degrees.

Figure 14.18 NASA’s space observatories span the electromagnetic spectrum. Compton and Chandra observe hot events such as
supernovae and black holes. Hubble observes nearby stars and distant galaxies. Spitzer observes the birth places of planets, stars
and even whole galaxies.

Since the 1990s, many advanced ground-based telescope facilities have
been initiated, such as the pair of telescopes known as Keck I and Keck
II, located on Mauna Kea in Hawaii (shown in figure 14.19). Each has a
10 m primary mirror, easily the largest in the world. The mirrors
themselves are extraordinary, being made up of 36 smaller hexagonal
mirrors 1.8 m in diameter. Each has the appropriate shape ground into it,
and each is held in place by a computer-controlled active optical system
so that they act together as a single near-perfect mirror.
A 10 m mirror has 17 times the light-gathering area of the Hubble Space
Telescope. Being free of the atmosphere, the HST can see finer detail,
but the Keck telescopes are more sensitive so they can see fainter
objects. A common approach of research teams is to first use the HST to
find and pinpoint distant objects and then to use the Kecks to explore
those objects.
Mauna Kea, a dormant volcano in Hawaii, offers some of the best seeing
conditions in the world. Surrounded by thermally stable ocean, the
surrounding air is unusually still. Despite this, the Kecks have an
adaptive optical system, which eliminates even the seeing from this site.
In addition to this, the two telescopes were linked by interferometry in
2001 to deliver the resolving power of a telescope with a mirror 85 m in
diameter!
As remarkable as the Keck telescopes are, there are plans to apply this
same technology to telescopes with mirrors of more than 20 m diameter.
These are known as Extremely Large Telescopes (ELTs). Two examples that
are currently being planned are the European Extremely Large Telescope
and the Thirty Metre Telescope.

Figure 14.19 Telescopes Keck I and Keck II on Mauna Kea, Hawaii
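
The factor of 17 quoted for the Keck mirrors follows from the fact that light-gathering area scales with the square of the diameter. The sketch below assumes a 2.4 m primary mirror for the HST, a value not given in this section, and uses an illustrative function name.

```python
def light_gathering_ratio(diameter_a: float, diameter_b: float) -> float:
    """Ratio of collecting areas of two circular apertures (same units)."""
    return (diameter_a / diameter_b) ** 2

KECK_DIAMETER_M = 10.0   # from the text
HST_DIAMETER_M = 2.4     # assumed value, not stated in this section

ratio = light_gathering_ratio(KECK_DIAMETER_M, HST_DIAMETER_M)
print(f"Keck gathers about {ratio:.0f} times the light of the HST")
```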



CHAPTER REVIEW

SUMMARY
• Galileo was the first person to point a telescope at the sky and the
object of his first studies was the Moon. He was able to observe the
texture of the surface, commenting on the craters and even measuring the
height of a mountain.
• All of the various electromagnetic wavelengths impinge upon the upper
atmosphere, yet only visible light, some infra-red, microwaves and radio
waves successfully penetrate through to the ground; the other wavelengths
are absorbed by the atmosphere.
• Ground-based telescopes are therefore restricted to detecting visible
light and radio waves.
• Space-based telescopes are used to observe all electromagnetic wave
bands without the restrictions of ground-based telescopes.
• Two basic designs of telescopes are refractors (which use lenses) and
reflectors (which use mirrors).
• The sensitivity of a telescope is its ability to pick up faint objects
for observation, or its light-gathering power. This depends upon the
collecting area of the lens or mirror, and hence the square of its
diameter.
• The theoretical resolution of a telescope is its ability to distinguish
two close objects as separate images.
$$R = \frac{2.1 \times 10^{5}\,\lambda}{D}$$
• Seeing refers to atmospheric blurring of a star’s light. It is caused
by turbulence within the Earth’s atmosphere and severely restricts the
practical resolution of large optical telescopes.
• Radio telescopes, which are normally quite sensitive, can improve their
resolution by being connected into an array and using interferometry to
give an effectively much larger dish diameter.
• Large ground-based optical telescopes are combating atmospheric
blurring by introducing computer-controlled adaptive optics to smooth out
the twinkling of stars and unblur their images.
• The latest space telescopes include the Hubble Space Telescope, the
Compton Gamma Ray Observatory, the Chandra X-ray Observatory and the
Spitzer Space Telescope. A new generation of 8 to 10 m ground-based
telescopes includes the twin 10 m Keck telescopes in Hawaii.

QUESTIONS
1. State the years in which Galileo built and began to use his
telescopes.
2. Identify the initial object of his observations and what he learnt
from those observations.
3. (a) List the components of the electromagnetic spectrum in order from
least to most energetic. Which has the highest frequency?
(b) Identify the components that are absorbed by the atmosphere as they
attempt to penetrate it from space.
(c) Discuss strategies that can be employed to systematically observe
these radiations.
4. State which components of the electromagnetic spectrum can be observed
by ground-based telescopes.
5. Identify the two basic designs used in telescopes. Draw a sketch of
each.
6. Define the sensitivity of a telescope.
7. Define the resolution of a telescope.
8. Define the magnification of a telescope.
9. Calculate the theoretical resolution of the following optical
telescopes when observing light of wavelength 500 nm:
(a) a 50 mm refracting telescope
(b) a 50 mm reflecting telescope
(c) a 100 mm Newtonian reflector
(d) a 200 mm Cassegrain reflector
(e) a 3 m Schmidt telescope
(f) an 8 m Schmidt telescope.
10. Calculate the resolution of a 200 mm optical telescope when viewing
light of wavelength:
(a) 500 nm
(b) 600 nm
(c) 700 nm.
11. Calculate the resolution of the following radio telescopes when
observing 3 cm radio waves:
(a) a 50 mm dish (why is this impractical?)
(b) a 10 m dish
(c) a 30 m dish
(d) a 70 m dish
(e) a 200 m array
(f) a 1 km array
(g) a 15 km array.
12. Calculate the resolution of a 50 m radio telescope when observing the
following wavelengths:
(a) 1 mm (b) 1 m (c) 1 km.

13. Calculate the magnification and resolution of a 300 mm Cassegrain
telescope used to observe light of wavelength 650 nm. The telescope has a
focal length of 1000 mm and is fitted with:
(a) a 25 mm eyepiece
(b) a 10 mm eyepiece.
14. The Very Large Array in New Mexico, USA, is a set of 27 radio
antennas linked by interferometry to give the resolution of a single dish
of diameter 36 km and the sensitivity of a dish of diameter 130 m.
(a) With reference to the sensitivity, calculate the effective collecting
area of the array.
(b) Calculate the theoretical resolution of the VLA when observing radio
waves of wavelength 10 cm.
15. The Square Kilometre Array, planned to be constructed in Australia,
will be an array of up to 1000 radio antennas with an effective
collecting area of 1 km².
(a) Calculate the diameter a single dish would need in order to have this
collecting area.
(b) Referring to figure 14.15, if the outlying array stations were 400 km
from the centre of the array, what would be the theoretical resolution of
the SKA when observing radio waves of wavelength 10 cm?
16. Radio telescope facilities across Australia have formed a network
called the Australian Long Baseline Array (LBA). In this system, selected
telescopes are used to observe the same object at much the same time;
however, these observations are made independently of each other. Data
from each telescope are transported to a correlator in Sydney and the
interferometry is then applied. Although results are delayed, very long
baselines can be used to give extraordinary resolutions.
(a) If the Parkes radio telescope and the 15 m radio telescope at Perth
are used, then a baseline of approximately 3000 km is achieved. Calculate
the theoretical resolution for such an observation.
(b) Another cooperating telescope is at Hartebeesthoek in South Africa.
If this is used with Narrabri then a baseline of 9853 km is achieved.
Calculate the effective theoretical resolution.
(c) The network provides a variety of baselines down to 113 km, between
Narrabri and Mopra. Discuss reasons why a radio astronomer would
deliberately choose to use a smaller baseline than the maximum available.
17. Describe the condition known as ‘seeing’ and how it is caused.
18. Discuss how ‘seeing’ severely restricts the capabilities of large
ground-based telescopes.
19. Discuss strategies currently being employed in modern, large,
ground-based telescopes to counter the effects of ‘seeing’.
20. Compare the resolution and sensitivity of a typical radio telescope
to that of a large optical telescope.
21. Outline strategies being employed in modern radio telescope
facilities to improve their resolution.
22. Compare the resolution and sensitivity of a radio telescope array
such as the VLA or SKA to that of a large optical telescope.
23. Describe NASA’s ‘Great Observatories Program’. What does it hope to
achieve?
24. Describe the twin 10 m telescopes Keck I and Keck II.
25. Compare the capabilities of the Hubble Space Telescope and the Keck I
telescope.



PRACTICAL ACTIVITIES

14.1 COMPARING THE LIGHT-GATHERING ABILITY OF DIFFERENT SIZED LENSES

Aim
To demonstrate the extra light-gathering ability of larger lenses.

Apparatus
at least two biconvex lenses of different diameter
light meter
two polarising filters
one sunny day

Warning!
Do not use the lenses to look at the Sun. Blindness can result.

Method
1. Set up the light meter in the sunlight. The needle on the meter may
already indicate a maximum value. If this is the case, place both
polarising filters, one upon the other, over the sensor panel and then
rotate the upper filter only until the needle on the meter slides back
to a central reading.
2. Starting with the smallest lens, measure its diameter and record this
information. Use the lens to gather the Sun’s light onto the panel of
the light meter. Do not focus the light to a point, but rather create a
circle of light that fills the panel of the light meter. Record the
reading from the light meter.
3. Repeat this method with each lens of different diameter, in ascending
order.

Results
LENS DIAMETER (cm) | LENS AREA (m²) | LIGHT METER READING

Questions
1. Based on the information in your table, what can be said about the
reading on the light meter as lens diameter is increased?
2. What does this observation tell us about lenses of larger diameter?
3. What are the implications of this finding for telescopes?

14.2 THE AUSTRALIAN TELESCOPE COMPACT ARRAY

Aim
To use the internet to find out more about radio astronomy at the
Australian Telescope Compact Array.

Apparatus
Internet access is all that is required for this activity.

Method
The Australian Telescope Compact Array (ATCA) is a radio telescope array
of six 22 m antennas located at Culgoora, NSW. It is operated by the
Australian Telescope National Facility (ATNF), which is part of the
CSIRO, and also operates several other installations such as the Parkes
radio telescope.

eBook plus Weblink: The Australian Telescope National Facility

You will be presented with access to each of the telescope facilities
operated by the ATNF. Investigate each of the options, and when you are
ready, select ‘ATCA (Narrabri)’. When the page loads, select ‘information
for the public’, then select the ‘ATCA Live!’ option. Alternatively, if
time is short, you can go directly to this page using the following
link:

eBook plus Weblink: ATCA Live!

Use the information on this web page to answer the following questions.

Questions
1. Write down the date and time at which you are
completing this exercise.
2. What are the current weather conditions at
Culgoora?
3. Which antenna(s), if any, are currently offline?
4. (a) What object(s) are the other antennas
tracking?
(b) What is the right ascension and declination
of this object?
(c) Consult a star map. Within which constellation
does this object lie?
(d) What is the closest star, of magnitude 6
(approximate naked-eye limit) or brighter
to this object?
5. (a) What frequencies are being observed?
(b) To what radio bands do these frequencies
correspond? (Refer to figure 14.4.)
6. How have the antennas been configured? Draw
their configuration in your practical book.
7. Notice that the telescopes are arranged along an east–west line.
This gives good resolution in this direction but poor resolution in
the north–south direction. How does the ATCA overcome this
difficulty? An explanation can be found at the following link.

Weblink:
eBook plus Virtual radio interferometry



CHAPTER 15 ASTRONOMICAL MEASUREMENT
Remember
Before beginning this chapter, you should be able to:
• describe the dimensions of an ellipse
• describe the concept of a black body
• explain the Bohr model of the atom
• describe star groups such as giants, main sequence, and white
dwarfs
• describe the Hertzsprung–Russell diagram and the star
groupings found within it, in particular main sequence stars, red
giants and white dwarfs.

Key content
At the end of this chapter you should be able to:
• define the terms parallax, parsec and light-year
• explain how trigonometric parallax can be used to determine
the distance to stars
• calculate the distance to a star given its trigonometric parallax,
using $d = \frac{1}{p}$
• discuss the relative limitations with trigonometric parallax
measurements
• compare the relative limits of ground-based astrometric
measurements to space-based measurements
• account for the production of emission and absorption spectra,
and compare these with a continuous black body spectrum
• describe the technology needed to measure astronomical
spectra
• identify the general types of spectra — continuous, emission and
absorption, and identify astronomical objects that produce each
• describe the key features of stellar spectra and explain how these
are used to classify stars
• describe how stellar spectra can provide a variety of information
such as chemical composition, surface temperature, rotational
and translational velocity, and density
• define apparent and absolute magnitude
• calculate brightness ratios using $\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}$
• determine the distance to a star using the distance modulus
$M = m - 5 \log\left(\frac{d}{10}\right)$
• outline the method of distance approximation called
spectroscopic parallax
• explain how colour index is obtained and why it is useful
• describe the advantages of photoelectric technologies over
photographic methods for photometry.

Figure 15.1 Photographic astronomer David Malin at the prime focus of
the Anglo-Australian Telescope
15.1 ASTROMETRY
Astrometry is positional astronomy; the branch of astronomy concerned
with the careful measurement of position, and changes of position, of a
star or other celestial object, to a high order of accuracy. These apparent
position changes can be due to the real motion of the body, or the
motion of the Earth around its orbit, representing a shifting point of
observation. This latter case is of particular interest here because it allows
a measurement of distance, and the technique involved is the focus of
this section.

Astrometry is the careful measurement of a celestial object’s position,
and changes of position, to a high order of accuracy.

Parallax
Try this little experiment. Hold up your index finger so that it is vertical
and about 10 cm in front of your face. Close your left eye and note the
position of your finger against the background. Now open your left eye and
close the right. You will notice that the position of your finger has
apparently shifted against the background. This effect is known as parallax.
Parallax is the apparent change in position of a nearby object as seen
against a distant background due to a change in position of the observer.

Trigonometric parallax
Trigonometric parallax is a method of determining distances by using
triangulation together with parallax. The method is used by surveyors to
determine terrestrial distances, and is used by astronomers to determine
distances to certain nearby stars.

Trigonometric parallax is a method of using trigonometry to solve the
triangle formed by parallax to determine distance.
As shown in figure 15.2, if the length of the baseline (formed by the
motion of the observer) is known, and the angle of deviation, θ,
measurable, then the distance to the object can be calculated using
trigonometry.
$$\tan\theta = \frac{\text{baseline}}{\text{distance}} \quad\text{so}\quad \text{distance} = \frac{\text{baseline}}{\tan\theta}$$
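
The triangulation can be sketched in a few lines of code. The surveying numbers below (a 100 m baseline and a 2° angle of deviation) are hypothetical values chosen only for illustration, as is the function name.

```python
import math

def distance_from_parallax(baseline: float, angle_deg: float) -> float:
    """Distance to an object from distance = baseline / tan(angle).

    The result is in the same unit as the baseline.
    """
    return baseline / math.tan(math.radians(angle_deg))

# Hypothetical surveying example: 100 m baseline, 2 degree deviation
print(f"{distance_from_parallax(100.0, 2.0):.0f} m")
```

Halving the angle roughly doubles the calculated distance, which is why very distant objects need either a very long baseline or a very precise angle measurement.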

Figure 15.2 Parallax allows distance to be calculated because a
triangle is formed and trigonometry can be applied. (Diagram labels:
initial viewing position; final viewing position; baseline; angle of
deviation, θ; distance; closer object; initial apparent position; final
apparent position; background.)

For astronomical purposes, our viewing position of the heavens
changes regularly in several ways. The rotation of the Earth results in a
diurnal parallax, in which the viewing position changes over the course
of one evening. In this case the length of the baseline, that is, the
diameter of the Earth, is so small that this method can really be used to
determine distances only within our own solar system.
The motion of the Earth in its orbit around the Sun results in an annual
parallax, and this provides a larger and much more useful baseline.



Annual parallax
Observations of a nearby star during the course of a year will show
apparent shifts in its position against the background of more distant
stars. The largest shift will be between observations made six months
apart.
The annual parallax, p, of a star is half the angle through which the
star appears to shift as the Earth moves from one side of its orbit to the
other. It can be seen from the triangle formed in figure 15.3 that:
$$\text{Distance of star from Earth} = \frac{\text{radius of Earth's orbit}}{\sin p}.$$

Annual parallax, p, is half the angle through which a nearby star
appears to shift against the backdrop of distant stars, over a
particular six-month period.

Figure 15.3 The formation of the annual parallax angle, p (Diagram
labels: Sun; Earth's orbit; Earth, 1st position; Earth, 2nd position;
1 AU; close star; 1st apparent position; 2nd apparent position.)

The largest annual parallax observed belongs to Proxima Centauri and
is 0.772 seconds of arc, as measured by the HIPPARCOS satellite (see page
278). With angles this small (measured in radians) we can make the
following approximation:
$$\sin\theta \approx \tan\theta \approx \theta.$$
In addition to this, if the radius of the Earth’s orbit is expressed as
1 AU (Astronomical Unit), then the above formula for distance can be
stated as:
$$d = \frac{1}{p}$$
where
d = distance from Earth (parsecs, pc)
p = parallax (arcsec).
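
How good the small-angle approximation is can be checked numerically. The sketch below compares the exact expression, 1 AU divided by sin p, with the simplified d = 1/p; the function names are illustrative only, and the parallax value is the one quoted for Proxima Centauri.

```python
import math

# Number of arcseconds in one radian (about 206 265)
ARCSEC_PER_RADIAN = 180 * 3600 / math.pi

def distance_au_exact(parallax_arcsec: float) -> float:
    """Exact Earth-star distance in AU: (1 AU) / sin(p)."""
    p_radians = parallax_arcsec / ARCSEC_PER_RADIAN
    return 1.0 / math.sin(p_radians)

def distance_pc_approx(parallax_arcsec: float) -> float:
    """Small-angle result d = 1/p, in parsecs."""
    return 1.0 / parallax_arcsec

au_per_parsec = distance_au_exact(1.0)       # distance for p = 1 arcsec
proxima_exact_pc = distance_au_exact(0.772) / au_per_parsec
proxima_approx_pc = distance_pc_approx(0.772)

print(f"1 pc is about {au_per_parsec:.0f} AU")
print(f"Difference between methods: {abs(proxima_exact_pc - proxima_approx_pc):.2e} pc")
```

For stellar parallaxes the two methods agree to far better than any measurement precision, which is why the simple reciprocal relationship is used.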

Hence, distance is inversely proportional to parallax. The greater a
star’s annual parallax the closer it must be to us, and vice versa. Note that
this expression has led to the definition of a new unit of astronomical
distance known as the parsec.

The parsec (parallax-second)


One parsec (pc) is the distance from the Earth to a point that has an
annual parallax of one arcsec, as shown in figure 15.4.
In fact, no star is as close as 1 pc to us. As already mentioned, the
closest star, Proxima Centauri, has an annual parallax of 0.772 arcsec,
which places it at a distance of 1.29 pc.

One parsec, or parallax-second, is the distance that corresponds to an
annual parallax of 1 second of arc.

Figure 15.4 When p = 1 arcsec, then d = 1 parsec. (Diagram labels:
1 parsec; 1 AU; 1 arcsec.)

For the purpose of comparison between various length units, note that:
$$1\ \text{parsec} = 30.857 \times 10^{12}\ \text{km} = 206\,265\ \text{AU} = 3.2616\ \text{light-years}$$
where
1 light-year = the distance travelled through space in one year by light
$$1\ \text{light-year} = 9.4605 \times 10^{12}\ \text{km} = 0.3066\ \text{parsecs}.$$

One light-year is the distance travelled through space in one year by
light or other electromagnetic wave. It corresponds to a distance of
0.3066 parsecs or $9.4605 \times 10^{12}$ km.

Calculating distance using annual parallax


SAMPLE PROBLEM 15.1
Determine the distance, in pc and light-years, of Procyon with an annual
parallax of 0.286 arcsec.
SOLUTION
$$d = \frac{1}{p} = \frac{1}{0.286} = 3.50\ \text{pc} = 3.50 \times 3.26 = 11.4\ \text{light-years}$$
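
The working in this sample problem can be reproduced with a short function. This is a sketch only; the function name is illustrative, and the conversion factor is the one listed in the unit comparison earlier in this section.

```python
LIGHT_YEARS_PER_PARSEC = 3.2616  # conversion factor quoted in the text

def parallax_to_distance(parallax_arcsec: float) -> tuple[float, float]:
    """Return (distance in parsecs, distance in light-years) from d = 1/p."""
    distance_pc = 1.0 / parallax_arcsec
    return distance_pc, distance_pc * LIGHT_YEARS_PER_PARSEC

pc, ly = parallax_to_distance(0.286)  # Procyon
print(f"Procyon: {pc:.2f} pc = {ly:.1f} light-years")
```

The same function can be applied to any star whose annual parallax is known, such as the 0.772 arcsec value quoted for Proxima Centauri.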

The parallactic ellipse


It was stated earlier that annual parallax was measured between two
observations of a star made six months apart, and this was based on
geometry such as that shown in figure 15.3. However, not just any
six-month period will do. It must be the right six-month period.
As shown in figure 15.5, if the position of a nearby star is determined
frequently throughout a year, then it will appear to trace out an ellipse,
called the parallactic ellipse. The ellipse is due to the fact that most stars
are at some angle to the plane of the Earth’s orbit, measured from the
Sun. (If they were located perpendicularly, as shown in figure 15.3, then
the star would trace out a circle.)
The annual parallax is more properly defined as the angle subtended by
the semi-major axis of the parallactic ellipse. The star reaches the ends
of the major axis only twice in the course of a year, when the
Earth-Sun-star angle is 90 degrees. These two events are six months apart,
and this is the specific six-month period that must be used to determine
the angle.

Figure 15.5 The annual parallax is half the angle subtended by the major
axis of the parallactic ellipse. This ellipse is the apparent path traced
out by a nearby star over the course of a year. (Diagram labels:
parallactic ellipse; major axis; angles p and 2p; these positions occur
on two specific dates six months apart.)



Limitations
The very small angular measurements described on the previous page
have traditionally been made photographically using large ground-based
optical telescopes. However, such observations are affected by atmos-
pheric blurring or ‘seeing’ (as described in chapter 14, page 265) so that
the smallest parallax that can be measured from the ground is approxi-
mately 0.03 arcsec. This corresponds to a maximum distance, measur-
able to reasonable accuracy, of approximately 30 pc. In astronomical
terms this is only a very short distance; however, this astrometric tech-
nique has provided a foundation of measurements upon which other
techniques have built.
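
The link between the smallest measurable parallax and the largest measurable distance can be checked with a short calculation. The sketch below assumes the relation d = 1/p (d in parsecs, p in arcseconds) used in the parallax calculations earlier in this chapter; the function name `distance_pc` is only illustrative.

```python
LY_PER_PC = 3.26  # light-years per parsec, as used in the earlier worked example


def distance_pc(parallax_arcsec):
    """Distance in parsecs from an annual parallax in arcseconds (d = 1/p)."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be a positive angle")
    return 1.0 / parallax_arcsec


# Smallest parallax measurable from the ground, as quoted in the text.
ground_limit_pc = distance_pc(0.03)
print(f"Ground-based limit: about {ground_limit_pc:.0f} pc "
      f"({ground_limit_pc * LY_PER_PC:.0f} light-years)")
```

The result is a little over 33 pc, which the text rounds to approximately 30 pc.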

PHYSICS IN FOCUS
Astrometric satellites
The perfect way to overcome atmospheric blurring is to get above the
atmosphere and make observations from space. Before NASA launched its
Hubble Space Telescope, the European Space Agency (ESA) put into orbit a
290-mm astrometric telescope aboard a satellite called HIPPARCOS. Between
1989 and 1993, it was able to measure the parallax of approximately
120 000 stars to a precision of 0.001 arcsec. This is over 10 times more
precise than ground-based measurements, and extends the maximum distance
determined by astrometric means to 1000 pc. Its results are available in
the ‘HIPPARCOS Catalogue’, which can be accessed on the internet.
ESA’s planned next-generation astrometric satellite has been dubbed
‘Gaia’. It is intended that Gaia will measure star positions and
parallaxes to a precision of 10 microarcsec, which is 100 times more
precise than that achieved by HIPPARCOS. This will allow it to determine
star distances right across our galaxy to a good accuracy (10–20 per
cent), provided that a star is bright enough to be measured. Approximately
one billion such stars will be logged, which represents approximately one
per cent of the stars in our galaxy, the Milky Way.

Figure 15.6 The HIPPARCOS satellite

15.1 Accessing star data
15.2 Annual parallax precision

278 ASTROPHYSICS
15.2 SPECTROSCOPY
Consider the following three observations:
• looking at a torch light which has been covered over with yellow plastic
• looking at the yellow flame produced by a spray of sodium chloride
solution into a Bunsen burner flame
• looking at a reflection of the yellow light produced by the Sun.
In each case you are observing yellow light, but the composition of that
light is different. The difference is not apparent to your naked eye, how-
ever, and in order to discover this difference you will need to use a device
known as a spectroscope. You will then be able to examine the components
of the light and draw many inferences about the material that produced it.
This is the field of spectroscopy and, by using it, astronomers have been
able to learn a great deal about the observable objects in the universe.

Making spectra
We must first be aware that most light is a mixture of wavelengths or
colours. If we were able to spread out or disperse a light ray, we would be
able to observe the spectrum of colours within that light. This is what a
spectroscope does and it can be attached to the eyepiece of a telescope to
examine the spectra of starlight. It is made up of several elements working
together, as shown in figure 15.7 below. There must first be a light source
and this will be followed by several slits to form the light into a flat,
vertical beam. The light then enters either a triangular prism or a
diffraction grating, both of which have the ability to disperse light out
into its spectrum. Because the light is in the form of a flat beam, the
spectrum spreads out as a rectangular strip. The spectrum can then be
recorded on a photographic plate or examined in more detail with a small
telescope.

A spectroscope is a device used to spread a light into its spectrum. It
can be attached to the eyepiece of a telescope to examine the spectra of
starlight.

Figure 15.7 A simple spectroscope (Diagram labels: light from source;
slits; prism or diffraction grating.)

If a photographic plate is used, then spectra such as those shown in
figure 15.8(a) on page 280 will result. If a small telescope is used, then
an electronic sensor, called a photometer, can be attached and used to
sense the intensity of each wavelength. This can produce a spectrum in the
form of a graph, such as that shown in figure 15.8(b) on page 280. The
device is then known as a spectrophotometer.



Figure 15.8 (a) An example of a visible line spectrum of hydrogen atoms
produced using a photographic plate in a spectroscope (b) Using an
electronic sensor can produce an intensity graph.

It is understandable that spectroscopy was first based on the visible
spectrum of light, these wavelengths being most apparent to us and most
easily observed. With the use of appropriate electronic sensors, however,
spectroscopy is no longer restricted just to the electromagnetic radiation
that we can see.

PHYSICS IN FOCUS
The S-Cam: Spectrophotometer in a chip
On page 264 we discussed the S-Cam, a new CCD (video camera)
for electronic astronomical observations that greatly increases
the sensitivity of an optical telescope. Although it is still in develop-
ment, this cryogenic superconducting camera has the ability to
record the position and colour of individual photons of light as they
are received. All this information is quickly compiled into a database
by a computer. This also gives the S-Cam the ability to function as an
extremely accurate spectrophotometer, making observations quickly
and simply without the need for intervening filters or prisms that
would normally reduce sensitivity.

Types of spectra
At the beginning of this section there were three different examples of light
sources mentioned: a light globe, a vapour and a star. Each produces a dif-
ferent type of spectrum — continuous, emission and absorption.

Continuous spectra
If the light source for a spectroscope is a hot, glowing solid, liquid or
high-pressure gas, then a continuous rainbow-like spectrum will be
produced such as that shown in figures 15.9 and 15.10. A common, everyday
source is an ordinary incandescent light globe. ‘Incandescent’ means
bright or glowing. Among celestial objects, a common source is the inner
layers of a star, which are made up of hot, dense gas and therefore
produce a continuous spectrum. Galaxies also produce continuous spectra,
this being the feature that first distinguished them from nebulae.

Incandescent means bright or glowing. Like black bodies, most substances
become incandescent when they become hot enough.

Figure 15.9 Incandescent solids, liquids and gases under high pressure
produce continuous spectra. (Diagram labels: incandescent bulb; continuous
spectrum.)

Figure 15.10 A continuous spectrum (wavelength scale from 750 nm to
400 nm)

PHYSICS IN FOCUS
Black body radiation
Black body radiation is the electromagnetic radiation that is emitted by a
black body at a particular temperature. It is distributed continuously,
but not evenly, across the various wavelengths, as shown in figure 15.11.

Figure 15.11 Black body radiation curves. Note that as the temperature
increases, the curve becomes higher, indicating greater energy output, and
the peak of the curve shifts to shorter wavelengths. (Graph of intensity
against wavelength, 0 to 3000 nm, spanning the UV, visible and IR regions,
with curves for 3000 K, 6000 K and 8000 K.)

In figure 15.11, you will notice that each curve is specific to a
particular temperature and has a peak intensity corresponding to a
particular frequency. There are several noticeable trends which are
explained here for interested students (although not essential content for
this course):
1. As the temperature increases, the peak moves toward the shorter
wavelengths. At lower temperatures the radiation lies mostly in the
infra-red region but, as temperature increases, the peak moves into the
visible spectrum, and at higher temperatures the peak has moved well into
the ultraviolet. This relationship can be written as:
$$\lambda_{max} T = W$$
where
$\lambda_{max}$ = wavelength of maximum output (m)
$T$ = temperature (K)
$W$ = a constant
= $2.9 \times 10^{-3}$ m K.
This is known as Wien’s Law, and it can be used to determine the
approximate surface temperature of a star. If, when observing the light
from a star, the wavelength of maximum output can be measured (using a
spectrophotometer) then the surface temperature can be calculated.
2. The colour of a black body changes with temperature. Referring to
figure 15.11, as the temperature increases, the distribution of
wavelengths within the visible spectrum changes. Applying this to stars we
can say that low-temperature stars appear red. As the temperature
increases, the wavelengths peak in the yellow, and the star will appear
yellow. At slightly higher temperatures the spread is more even and a star
will appear white. As the peak moves beyond the visible spectrum and into
the ultraviolet with yet higher temperatures, the distribution within the
visible spectrum is concentrated at the blue end, and so the star will
appear blue.
3. As temperature increases, the black body radiation curve becomes higher
and broader, indicating that more total energy is being emitted. This can
be expressed by the following relationship, known as Stefan’s Law:
$$L = 4\pi R^2 \sigma T^4$$
where
$L$ = luminosity
= total energy output per second (joules/second, J s$^{-1}$)
= total power output (watts, W)
$T$ = temperature (kelvin, K)
$R$ = radius of the star (metres, m)
$\sigma$ = Stefan’s constant
= $5.6705 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$.
Note that, while a star’s energy output is very sensitive to its
temperature (proportional to the fourth power of T), it also depends upon
the square of its radius. As an example, let us compare the two stars
Castor and Deneb. Both are almost equally bright stars in our night sky
and both have the same surface temperature. However, Deneb is much further
away and much more luminous than Castor. Its greater energy output is due
to its much larger radius: it is a bright supergiant, whereas Castor is a
main sequence star, like the Sun.
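
Wien’s Law and Stefan’s Law can be tried out numerically. The following is a minimal sketch using the constants quoted above; the peak wavelength of 500 nm and the radius of 6.96 × 10^8 m are assumed, roughly Sun-like illustration values that do not come from the text, and the function names are hypothetical.

```python
import math

WIEN_CONSTANT = 2.9e-3       # m K, the constant W quoted in the text
STEFAN_CONSTANT = 5.6705e-8  # W m^-2 K^-4, Stefan's constant quoted in the text


def surface_temperature(peak_wavelength_m):
    """Wien's Law rearranged: T = W / lambda_max."""
    return WIEN_CONSTANT / peak_wavelength_m


def luminosity(radius_m, temperature_k):
    """Stefan's Law: L = 4 pi R^2 sigma T^4, in watts."""
    return 4 * math.pi * radius_m ** 2 * STEFAN_CONSTANT * temperature_k ** 4


# Assumed example values (not taken from the text).
peak_wavelength = 500e-9  # metres
radius = 6.96e8           # metres

temperature = surface_temperature(peak_wavelength)
print(f"Surface temperature: {temperature:.0f} K")
print(f"Luminosity: {luminosity(radius, temperature):.2e} W")

# Same temperature, ten times the radius: the output rises by a factor of 100.
print(luminosity(10 * radius, temperature) / luminosity(radius, temperature))
```

The last line mirrors the Castor and Deneb comparison: two stars with the same surface temperature can differ greatly in luminosity purely because of radius.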

In chapter 11 black bodies were discussed. Their consideration is
important here also. A black body is a hypothetical object that is a
perfect absorber and emitter of electromagnetic radiation. When at the
temperature of its surroundings it emits as much radiation as it absorbs.
The emitted radiation has a continuous distribution of wavelengths and
depends only on the temperature of the surface of the body.
At optical to infra-red wavelengths, incandescent bodies, including stars,
produce a continuous spectrum that, in the manner that they radiate
energy, approximates a black body. Much is known of the nature of black
body radiation. This allows information to be learned about a star from
the close examination of the range of wavelengths and the intensity of the
radiation it produces.

Emission spectra
An emission spectrum has the appearance of a long, dark rectangle upon
which appear discrete bright coloured bands. Emission spectra are produced
by hot glowing (incandescent) gases of low density like that shown in
figure 15.12.

Figure 15.12 Incandescent, low-density gases produce emission spectra.
(Diagram labels: excited vapour; emission spectrum. This light consists of
discrete wavelengths.)

In order to understand how these discrete spectral lines are produced, we
need to consider a simple model of the atom. In this model, shown in
figure 15.13, the nucleus can be considered as a single positively charged
body at the centre, and the electrons as tiny negatively charged particles
orbiting the nucleus. In 1913, the Danish physicist Niels Bohr (1885–1962)
suggested that the electrons were not free to orbit anywhere around the
nucleus, but could orbit only at specific radii. Each specific radius
represents a specific energy level.
Bohr restricted his original treatment to the hydrogen atom as this is the
simplest, having just one electron. He suggested that the electron, which
normally occupied the lowest energy level or ‘ground state’, E 1, could be
given some energy which would cause it to jump up to a higher energy
level, or ‘excited state’, say E 2. The energy could be given by a
collision with other particles or with light. Soon after occupying the
excited state, the electron will drop back to the more stable ground
state. The energy it loses in doing this (E 2 − E 1) is given off in the
form of a photon, or packet of electromagnetic radiation.

Figure 15.13 The different electron energy levels suggested by Niels Bohr
(Diagram labels: nucleus; electron; energy levels E1, E2 and E3 in order
of increasing energy; electron excited to higher energy level; photon
emitted as electron drops back to lower energy level, E 2 − E 1 = hf.)
When Bohr suggested this process, Max Planck (1858–1947) had already
suggested that electromagnetic radiation occurred in packets of energy
called photons. These can be thought of as elementary particles with zero
rest mass and charge, travelling at the speed of light. A beam of light is
thus a shower of photons, with the intensity of the beam dependent upon
the number of photons in the shower.

A photon is a quantum (or discrete packet) of electromagnetic radiation.
It can be thought of as an elementary particle with zero rest mass and
charge, travelling at the speed of light.

Planck had further deduced that the energy of each photon depended only
upon the frequency (and hence the wavelength) of the radiation it
contained, so that
E = hf
where
E = energy (joules, J)
f = frequency (hertz, Hz)
h = Planck’s constant
= $6.6 \times 10^{-34}$ J s.
However, in the case of our atom, the energy of the photon must equal
the difference in energy of the two states involved, so that:
E 2 − E 1 = hf
where
E 1 = ground state
E 2 = excited state.
If the electron had been excited to an even higher excited state, then
it may return to the ground state in a single large jump, or alternatively
in a set of smaller jumps. Each particular jump down between different
energy levels represents a different amount of energy, and therefore a
photon of radiation of different frequency or wavelength given off.
As a result, a hydrogen atom that has been excited tends to produce
many photons, with a set of discrete frequencies unique to its own set of
energy levels. When this light is directed through a spectroscope, then
the spectrum produced will contain only discrete wavelengths, or lines,
rather than a continuous spread of colours. The set of lines seen is so
unique to hydrogen that it can be regarded as a fingerprint of that
element. If the spectrum of an unknown mix of elements is examined
and that particular set of lines appears, then it can be stated with confi-
dence that the mixture contains hydrogen.
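
The relationship E 2 − E 1 = hf can be used to estimate where hydrogen’s lines fall. The sketch below is illustrative only: it assumes the standard Bohr-model result that level n of hydrogen has an energy of about −13.6/n² electronvolts, which is not stated in this section, along with assumed values for the speed of light and the electronvolt. The function names are hypothetical.

```python
PLANCK_CONSTANT = 6.626e-34  # J s (the text rounds this to 6.6e-34)
SPEED_OF_LIGHT = 3.00e8      # m/s, assumed value
JOULES_PER_EV = 1.602e-19    # assumed conversion factor
GROUND_STATE_EV = 13.6       # assumed size of hydrogen's ground-state energy


def level_energy_ev(n):
    """Bohr-model energy of hydrogen level n in electronvolts (assumption)."""
    return -GROUND_STATE_EV / n ** 2


def emitted_photon(n_upper, n_lower):
    """Return (frequency in Hz, wavelength in nm) for a drop between levels."""
    if n_upper <= n_lower:
        raise ValueError("the electron must drop from a higher level to a lower one")
    energy_j = (level_energy_ev(n_upper) - level_energy_ev(n_lower)) * JOULES_PER_EV
    frequency = energy_j / PLANCK_CONSTANT           # from E2 - E1 = hf
    wavelength_nm = SPEED_OF_LIGHT / frequency * 1e9
    return frequency, wavelength_nm


for upper in (3, 4, 5):
    f, wavelength = emitted_photon(upper, 2)
    print(f"level {upper} to level 2: f = {f:.3e} Hz, wavelength = {wavelength:.0f} nm")
```

Under these assumptions, drops to the second level give wavelengths in the visible range, with the drop from level 3 producing the red line near 656 nm that dominates the glow of hydrogen gas.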



It is more difficult to fully explain the spectra produced by larger
atoms or molecules, but the principle that each produces a characteristic
set of spectral lines that can be regarded as a unique identifying finger-
print, still holds true. Figure 15.14 shows the characteristic emission
spectra of several elements.
Emission spectra can be produced by certain hot interstellar gas clouds
known as emission nebulae. These nebulae are heated by ultraviolet radi-
ation from a nearby star until they become hot enough to shine by their
own light (usually red light, which is characteristic of the hydrogen gas
that makes up most of these nebulae). Quasars are another type of astro-
nomical object that produces emission spectra. These are extremely dis-
tant and old objects that emit more energy than a hundred galaxies
combined. Their spectra can be spread right across the electromagnetic
spectrum, with the bulk lying within infra-red wavelengths.

Figure 15.14 The emission spectra of several elements. Each has a unique
and characteristic pattern. (Spectra shown: mercury, cadmium, strontium,
barium and calcium.)

Absorption spectra
An absorption spectrum has the appearance of a continuous spectrum upon
which discrete dark lines appear. These lines represent wavelengths that
are missing from the otherwise continuous spectrum. Absorption spectra are
produced by a cool, non-luminous gas placed in front of a continuous
spectrum source, as shown in figure 15.15.

Figure 15.15 Absorption spectra are produced by a cool, non-luminous gas
placed in front of a source of a continuous spectrum. The gas absorbs and
then re-emits certain characteristic wavelengths in all directions,
leaving the original direction deficient in those wavelengths. (Diagram
labels: incandescent bulb producing continuous spectrum; cool gas absorbs
certain wavelengths and re-emits them in all directions; this light is
deficient in certain wavelengths; absorption spectrum.)

As the continuous spectrum shines through the gas, the atoms of the gas
will absorb those particular wavelengths that they would emit if the gas
were hot. These are the frequencies that correspond to differences between
energy levels within the atoms. Absorbing these wavelengths raises their
electrons to excited states. The radiation is immediately re-emitted as
the electrons drop back down to the ground state. However, this radiation
is re-emitted in all directions, not just the original direction of the
incident light.

This means that the original light is now deficient in those particular
wavelengths. In general, the wavelengths missing from an absorption
spectrum correspond to the bright lines in the emission spectrum of the
same gas if it were hot and glowing. Therefore, the absorption spectrum
of a cool gas contains the same identifying pattern of lines that are con-
tained within emission spectra of hot gases.
This idea is the basis of stellar spectroscopy, since stars produce absorp-
tion spectra. The reason for this is that the main body of the star is hot,
dense gas and therefore produces a continuous spectrum. Surrounding
the star is a cooler and less dense atmosphere, which absorbs certain
wavelengths and re-emits them in all directions, resulting in an
absorption spectrum.
This is shown in figure 15.16.

Figure 15.16 Stars produce absorption spectra. The hot, dense gas which
forms the star is the continuous spectrum source. The surrounding
atmosphere is the cool, non-luminous gas which absorbs certain wavelengths
from the spectrum before re-emitting them away in all directions. (Diagram
labels: star; outer atmosphere; atmospheric particles absorb and re-emit
certain wavelengths in all directions; light received is deficient in
certain wavelengths; absorption spectrum recorded is missing certain
wavelengths.)
Table 15.1 summarises the three types of spectra, how they are
produced and what types of celestial objects produce them.

Table 15.1 Types of spectra and their production by celestial objects

TYPE OF SPECTRUM | GENERALLY PRODUCED BY: | CELESTIALLY PRODUCED BY:
Continuous | Hot solids, liquids, gases under pressure | Galaxies, inner layers of stars
Emission | Incandescent low-density gases | Emission nebulae, quasars
Absorption | Cool gases in front of continuous spectrum | Atmosphere of stars

15.3 Spectra

Spectral analysis of starlight


In the late 1800s, the technique of spectroscopy began to be applied to the
light of celestial bodies. Since then it has become one of the most valuable
tools of the professional astronomer, as the spectral ‘fingerprints’ reveal the
chemical composition of stars, nebulae and galaxies, as well as the atmos-
phere of the other planets of our own solar system.



Stellar spectroscopy, concentrating on the spectra of stars, can deduce
a great deal of other information, such as surface temperature, velocity
and density, as the following sections describe.

Stellar spectroscopy is the examination of the spectra of stars in order
to learn more about their composition, surface temperature, velocity,
density, etc.

Spectral classification
Most stars are made up of a very similar set of elements and compounds,
yet their spectra can vary considerably. The reason is that different atoms
and molecules produce spectral lines of very different strengths at different
temperatures. At lower temperatures, molecules can exist near the surface
of a star and they produce particular spectral lines. At hotter temperatures,
these molecules can no longer exist and the spectral lines produced belong
to neutral atoms. At higher temperatures the atoms become ionised, and
the spectral lines produced are characteristic of these particles.
The stars have been classified into a set of spectral classes, each desig-
nated by a letter, according to the main spectral lines evident. When placed
in order of decreasing temperature they are the seven spectral classes O, B,
A, F, G, K and M. This can be remembered using the mnemonic ‘Oh, Be A
Fine Girl (or Guy), Kiss Me’. Table 15.2 summarises the main spectral
features of each class, as well as the temperature and colour (recalling that
the colour of a star depends upon its temperature) that correspond to each.

PHYSICS FACT

In recent years, astronomers have discovered a new class of stars, which
they have named class L. These are dwarf stars cooler than M class stars,
with a surface temperature less than 2500 K and possibly as low as 1600 K.

Table 15.2 Spectral classifications and their corresponding features. Note that
in astronomy, the term ‘metal’ refers to any element other than hydrogen or
helium.

SPECTRAL CLASS | TEMPERATURE (K) | COLOUR | SPECTRAL FEATURES
O | 28 000–50 000 | Blue | Ionised helium lines; strong UV component
B | 10 000–28 000 | Blue | Neutral helium lines
A | 7500–10 000 | Blue-white | Strong hydrogen lines; ionised metal lines
F | 6000–7500 | White | Strong metal lines; weak hydrogen lines
G | 5000–6000 | Yellow | Ionised calcium lines; metal lines present
K | 3500–5000 | Orange | Neutral metals dominate; strong molecular lines
M | 2500–3500 | Red | Molecular lines dominate; strong neutral metals

Each spectral class has been further divided into subgroups by attaching
a digit, from 0 to 9, following the letter. As an example, a small section of
the classification system would be as follows:
-B8-B9-A0-A1-A2-A3-A4-A5-A6-A7-A8-A9-F0-F1-F2-
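
The temperature ranges of table 15.2 can be turned into a simple lookup. This is only a sketch: the function name `spectral_class` is hypothetical, a temperature lying exactly on a boundary is assigned to the hotter class (an arbitrary choice), and no attempt is made to assign the 0 to 9 subgroup digit.

```python
# (class letter, minimum temperature in K, colour), hottest first, from table 15.2
SPECTRAL_CLASSES = [
    ("O", 28000, "Blue"),
    ("B", 10000, "Blue"),
    ("A", 7500, "Blue-white"),
    ("F", 6000, "White"),
    ("G", 5000, "Yellow"),
    ("K", 3500, "Orange"),
    ("M", 2500, "Red"),
]
MAX_TEMPERATURE_K = 50000  # upper limit of class O in table 15.2


def spectral_class(temperature_k):
    """Return (class letter, colour) for a surface temperature in kelvin."""
    if not 2500 <= temperature_k <= MAX_TEMPERATURE_K:
        raise ValueError("temperature is outside the range covered by table 15.2")
    for letter, minimum_k, colour in SPECTRAL_CLASSES:
        if temperature_k >= minimum_k:
            return letter, colour
    raise ValueError("no matching class")  # not reachable after the range check


print(spectral_class(5800))   # a star at a Sun-like temperature
print(spectral_class(12000))
```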

PHYSICS IN FOCUS
Luminosity classes

When considering black body radiation and temperature on page 281, we saw
that stars with similar temperatures could still have very different
luminosity (energy output) due to a difference in size. To account for
this, the spectral classification system was extended in 1941, by the
inclusion of eight luminosity classes as listed in table 15.3. Each is
indicated by a Roman numeral that appears as a suffix behind the spectral
class. For example, Castor is an A2V star (a blue-white main sequence
star) while Deneb is an A2Ia star (a blue-white bright supergiant). Each
luminosity class corresponds to specific regions of a Hertzsprung–Russell
diagram (see figure 15.17).

Table 15.3 The eight luminosity classes

Ia | Bright supergiant
Ib | Supergiant
II | Bright giant
III | Giant
IV | Subgiant
V | Main sequence
VI | Subdwarf
VII | White dwarfs

The luminosity class of a star can be learned from its spectra, in
particular the pressure broadening of its spectral lines. (See the text
under ‘Density’ on page 289.)

Figure 15.17 The eight luminosity classes plotted on a Hertzsprung–Russell
diagram (Vertical axis: luminosity, with a scale running from −8 to 16;
horizontal axis: temperature classes B0, A0, F0, G0, K0, M0; regions
labelled Ia, Ib, II, III, IV, V, VI and VII.)



Other inferred information
Temperature
As can be seen from table 15.2 on page 286, once a star’s spectral class
has been identified from its main spectral lines, then its temperature can
be deduced. (There is another method. Using a spectrophotometer, each
wavelength can be examined for intensity in order to discover the
wavelength of maximum energy output, λmax. Once this has been determined,
the temperature of the star can be calculated using Wien’s Law, as
described on page 281. This measurement would also suggest the spectral
class and, hence, composition of the star.)
Translational velocity
If a star moves away from us then the patterns of recognisable lines
within its spectrum, as we observe them, appear at slightly longer
wavelengths. We say they have red-shifted because the lines now appear a
few nanometres closer to the red end of the spectrum. Similarly, if the
star were to move toward us then its spectral lines would be blue-shifted.
This is due to the Doppler effect and, by measuring the extent of the
wavelength shift, the radial velocity of the star (velocity toward or away
from us) can be calculated. As figure 15.18 shows, if this is combined
with the proper motion of the star (sideways velocity as seen by us) then
the star’s real velocity relative to the Sun can be computed.

Figure 15.18 If a star’s radial velocity is combined with its proper
motion then its real velocity relative to the Sun can be determined.
(Diagram labels: Sun; Earth; radial velocity (determined from red shift);
proper motion (determined from astrometric measurement); real velocity
relative to the Sun (‘space velocity’).)
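
The calculation outlined above can be sketched in a few lines. The text does not give the formula, so this sketch assumes the usual low-speed Doppler approximation v = c × Δλ/λ, and it assumes that the star’s proper motion has already been converted into a sideways (tangential) speed using its distance. The wavelengths, the tangential speed and the function names are hypothetical illustration values.

```python
import math

SPEED_OF_LIGHT_KM_S = 3.00e5  # km/s, assumed value


def radial_velocity(rest_wavelength_nm, observed_wavelength_nm):
    """Radial velocity in km/s; positive means moving away (red shift).

    Uses the low-speed approximation v = c * (shift / rest wavelength).
    """
    shift = observed_wavelength_nm - rest_wavelength_nm
    return SPEED_OF_LIGHT_KM_S * shift / rest_wavelength_nm


def space_velocity(radial_km_s, tangential_km_s):
    """Combine the two perpendicular components, as in figure 15.18."""
    return math.hypot(radial_km_s, tangential_km_s)


# Hypothetical measurement: a line with a rest wavelength of 656.3 nm
# is observed at 656.4 nm.
v_radial = radial_velocity(656.3, 656.4)
v_tangential = 30.0  # km/s, assumed
print(f"Radial velocity: {v_radial:.1f} km/s")
print(f"Space velocity: {space_velocity(v_radial, v_tangential):.1f} km/s")
```

A negative result from `radial_velocity` indicates a blue shift, that is, a star approaching us.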

PHYSICS IN FOCUS
The Doppler effect
The Doppler effect is an apparent change in the frequency or wavelength of
a wave as a result of the source and the observer moving relative to each
other (see figure 15.19).
If a source of waves (these could be sound waves, light waves, microwaves,
radio waves, etc.) is moving at a significant fraction of the speed of
those waves, then the wavelength is effectively shortened ahead of its
motion, and lengthened behind it. Observers at these locations will
experience these changes in wavelength and frequency and, by measuring the
change, the velocity of the source can be calculated.
The same effect can be produced if the source is stationary and the
observer is moving, so it is really relative motion between the source and
observer that is significant.
The Doppler effect is used by police radar units to measure the velocity
of speeding cars. In astronomy it is used to measure the velocity of stars
relative to us. Very fast and distant celestial objects are usually
assumed to owe their velocity to the general expansion of the universe,
and in this case the degree of Doppler shifting can also give an
indication of distance.

Figure 15.19 The Doppler effect. When the source is moving quickly
compared to the speed of the waves it produces, then observer 1 ‘sees’ a
longer wavelength than emitted, observer 3 sees a shorter wavelength,
while observers 2 and 4 see the same wavelength as that emitted by the
source. (Legend: S = Source, O1 = Observer 1, O2 = Observer 2,
O3 = Observer 3, O4 = Observer 4.)

Rotational velocity
Smaller Doppler shifts (both red and blue) can be caused by a star’s own
rotational velocity, or by its participation in a rotating double star
system. In the case of a single, rapidly rotating star, the atoms moving
quickly away from us on one side and atoms moving quickly toward us on the
other side combine to produce a slight but simultaneous red and blue shift
which broadens the spectral lines. The faster the star rotates, the
greater this Doppler broadening effect is.
In the case of a rotating double star system seen from its edge, at
certain times one star is blue-shifted while the other is red-shifted. At
some later period this situation will reverse and, by keeping track of
this, the rotational period and velocities can be calculated.
Density
High density and pressure within the atmosphere of the star can also
broaden its spectral lines. The effect is progressive — the greater the
atmospheric density and pressure, the greater the ‘pressure broadening’.
The spectral lines of a supergiant star (with a particularly low density
atmosphere) are much narrower than those of a more dense main
sequence star (such as our Sun) of the same spectral class.

15.3 PHOTOMETRY
Photometry is the measurement of the brightness of a source of light or
other radiation. In astronomy this is applied to the light from stars as
well as other celestial objects. Astronomical photometry has a long
history, beginning in Greece over two thousand years ago. Early
measurements of star brightness were judged by eye. More recently,
photographic techniques were applied with a corresponding increase in
accuracy. Most recently, electronic devices have been employed to measure
star brightness with further improvements in sensitivity and faster
response times. These devices may be photomultiplier tubes that offer high
sensitivity to very low light levels, or charge coupled devices (CCDs),
such as those found within video cameras, that can produce digitised
images for computer processing.
However it is measured, much can be learned from the knowledge of a
star’s brightness and how it compares to other stars.

Measuring brightness and luminosity


When looking at the night sky it is obvious that the stars vary in
brightness. A star’s brightness, in watts per square metre, is a measure
of the intensity of the radiation reaching the Earth from the star. This
depends upon the luminosity of the star as well as its distance from us.
We briefly considered luminosity on page 282 and saw that luminosity
depends upon the radius of the star and, especially, upon its temperature.
The luminosity of a star, in watts, is the total energy radiated by it per
second. Since it is the rate of energy output, it is also the power output
of the star and is sometimes called intrinsic or absolute brightness (see
figure 15.20).

Figure 15.20 Brightness and luminosity are related. (Diagram: brightness
depends on luminosity and on distance$^{-2}$; luminosity in turn depends
on radius$^{2}$ and temperature$^{4}$.)

PHYSICS FACT
Detectors aboard satellites have measured the amount of radiant energy per
square metre per second reaching us from the Sun. From this, astronomers
have been able to calculate the total power output of the Sun as
$3.83 \times 10^{26}$ watts, and this is designated by the symbol L⊙. The
Sun’s power output, L⊙, is used as a standard for comparison with other
stars.
In this method, a measurement of the Sun’s brightness and a knowledge of
its distance has allowed a calculation of its luminosity.
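
The link between brightness, luminosity and distance can be made concrete with the inverse square law. The sketch below assumes that a star’s energy spreads evenly over a sphere, so that brightness = luminosity / (4π × distance²). This formula and the Earth-Sun distance of 1.496 × 10^11 m are assumptions that do not appear in the text, and the function names are hypothetical.

```python
import math

SOLAR_LUMINOSITY_W = 3.83e26   # total power output of the Sun, from the text
EARTH_SUN_DISTANCE_M = 1.496e11  # metres, assumed value


def brightness(luminosity_w, distance_m):
    """Brightness in W/m^2 at a given distance (inverse square law)."""
    return luminosity_w / (4 * math.pi * distance_m ** 2)


def luminosity_from_brightness(brightness_w_m2, distance_m):
    """The reverse calculation: total power output from measured brightness."""
    return brightness_w_m2 * 4 * math.pi * distance_m ** 2


sun_brightness = brightness(SOLAR_LUMINOSITY_W, EARTH_SUN_DISTANCE_M)
print(f"Brightness of the Sun at the Earth: {sun_brightness:.0f} W/m^2")

# Doubling the distance reduces the brightness to one quarter.
print(brightness(SOLAR_LUMINOSITY_W, 2 * EARTH_SUN_DISTANCE_M) / sun_brightness)
```

`luminosity_from_brightness` follows the direction of the calculation described above, going from a measured brightness and a known distance to the luminosity.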



Magnitudes
During the second century BC, the Greek astronomer Hipparchus estab-
lished a scale to record the brightness of stars, an extension of which is
still in use today. He defined the brightest star that he could see as being
of magnitude 1 and the faintest star as magnitude 6. Therefore, his scale
was a reverse one, with lower numbers indicating brighter stars.
Since Hipparchus established his scale, stars have been found that are
brighter than magnitude 1 (the brightest star in the night sky is Sirius
with a magnitude of −1.4) and dimmer than magnitude 6 (large tele-
scopes can observe stars fainter than magnitude 25). Of course,
Hipparchus did not have a telescope to help him.
In the 1850s scientists began to realise that the human eye did not
respond to light increases in a linear way. For example, a light that is
doubled in intensity is not perceived to be twice as bright. William
Herschel had observed that a magnitude 1 star was approximately 100
times brighter than a star of magnitude 6. In 1856 Norman Pogson fixed
this observation as a mathematical definition:
If $m_B - m_A = 5$ then $\frac{I_A}{I_B} = 100$
where
$m_A$ = magnitude of star A (brighter star)
$m_B$ = magnitude of star B (dimmer star)
$\frac{I_A}{I_B}$ = brightness ratio of the two stars.
This means that:
If $m_B - m_A = 1$ then $\frac{I_A}{I_B} = \sqrt[5]{100} = 2.512$.
In other words, a star of magnitude, say, 4 is 2.512 times brighter than
a star of magnitude 5. This is known as Pogson’s ratio, and the com-
pounding nature of this ratio can be seen in table 15.4. The magnitude
scale, now based upon the mathematical relationship described, is known
as Pogson’s scale.
Table 15.4 Brightness ratios using the Pogson scale
MAGNITUDE DIFFERENCE BETWEEN TWO STARS | BRIGHTNESS RATIO OF THE STARS
0 | $2.512^{0} = 1$
1 | $2.512^{1} = 2.512$
2 | $2.512^{2} = 6.310$
3 | $2.512^{3} = 15.85$
4 | $2.512^{4} = 39.81$
5 | $2.512^{5} = 100$
6 | $2.512^{6} = 251$
7 | $2.512^{7} = 631$
8 | $2.512^{8} = 1585$
9 | $2.512^{9} = 3981$
10 | $2.512^{10} = 10\,000$

In general, the brightness ratio of any two stars can be calculated using
the following formula:
$$\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}$$
where
$m_A$ = magnitude of star A (brighter star)
$m_B$ = magnitude of star B (dimmer star)
$\frac{I_A}{I_B}$ = brightness ratio of the two stars.

Calculating a brightness ratio


SAMPLE PROBLEM 15.2
The Sun has a magnitude of −26.8. The brightest star in the night sky is
Sirius with a magnitude of −1.4. How much brighter does the Sun appear
compared to Sirius?
SOLUTION
$$\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}$$
$$\frac{I_{\text{Sun}}}{I_{\text{Sirius}}} = 100^{(m_{\text{Sirius}} - m_{\text{Sun}})/5}$$
$$= 100^{(-1.4 - (-26.8))/5}$$
$$= 100^{5.08}$$
$$= 1.4 \times 10^{10}$$
In other words, the Sun appears over ten billion times brighter than Sirius.

Another brightness ratio


SAMPLE PROBLEM 15.3
Proxima Centauri is the closest star to our solar system and yet it is quite
faint, with a magnitude of 11. Calculate the brightness ratio of Algol
(magnitude 2.1) compared to Proxima Centauri.
SOLUTION
$$\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}$$
$$\frac{I_{\text{Algol}}}{I_{\text{Proxima}}} = 100^{(m_{\text{Proxima}} - m_{\text{Algol}})/5}$$
$$= 100^{(11 - 2.1)/5}$$
$$= 100^{1.78}$$
$$= 3600$$
That is, Algol appears about 3600 times brighter than Proxima Centauri.
Proxima Centauri is so faint that a good telescope is required to see it.
The concept of magnitudes has developed over the years for various
purposes and we now need to be more specific about what type of
magnitude we are dealing with at any time.

Apparent magnitude
Apparent magnitude, given the symbol m, is the magnitude given to a star
as viewed from Earth. This is the same magnitude that we have been
discussing in the previous section of work. Apparent magnitude is a measure
of the brightness of a star and is therefore influenced by the distance of
the star as well as its intrinsic brightness (and any intervening matter
such as interstellar dust, which can make a star look dimmer than it
otherwise would be). Measurements of apparent magnitude can be
performed photographically or photoelectrically.



Absolute magnitude
Absolute magnitude, given the symbol M, is defined to be the magnitude
that a star would have if it were viewed from a standard distance of 10
parsecs. Because distance has been set to a standard, it is no longer an
influence and so absolute magnitude is a measure of the intrinsic
brightness or luminosity of a star. Although it is not a quantity that is
directly measurable, there are other ways that absolute magnitudes can be
deduced, and it proves to be a useful means to make direct comparisons
between stars. For example, Achernar and Betelgeuse are both bright
stars with apparent magnitudes of 0.45; however, Achernar has an
absolute magnitude of −2.8 while Betelgeuse’s is much brighter again at
−5.14. The reason the two stars appear the same is that Betelgeuse is
much further away from us than is Achernar.

The distance modulus


Consider the following comparison: Castor and Vega are both A class
main sequence stars with absolute magnitudes of approximately 0.6. Vega
lies at a distance of 7.76 pc and has an apparent magnitude of 0.03.
However, Castor lies roughly twice as distant at 15.8 pc and therefore is
not as bright, with an apparent magnitude of 1.58.
The close relationship between apparent magnitude, absolute magnitude
and distance is expressed in the following formula:
$$M = m - 5 \log\left(\frac{d}{10}\right)$$
where
M = absolute magnitude
m = apparent magnitude
d = distance (pc).
Rearranging this equation produces a number of other accepted forms
of the expression, as follows:
$$m - M = 5 \log\left(\frac{d}{10}\right)$$
$$m - M = 5 \log d - 5.$$
The term (m − M) is known as the distance modulus, and the above
formula as the distance modulus formula. The distance modulus is directly
related to the distance of a star. Using this expression, the distance to a
star can be calculated once the apparent and absolute magnitudes have
been determined.
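The distance modulus formula can be used in either direction. The Python
sketch below is a minimal illustration, assuming base-10 logarithms and
distances in parsecs; the function names are chosen for this example and are
not part of the text. It checks the Vega and Castor figures quoted above.

```python
import math


def absolute_magnitude(apparent_magnitude, distance_pc):
    """M = m - 5 log10(d / 10), with d in parsecs."""
    return apparent_magnitude - 5 * math.log10(distance_pc / 10)


def distance_from_magnitudes(apparent_magnitude, absolute_magnitude_value):
    """Rearranged form: d = 10 ** ((m - M + 5) / 5), in parsecs."""
    return 10 ** ((apparent_magnitude - absolute_magnitude_value + 5) / 5)


# Vega: m = 0.03 at 7.76 pc. Castor: m = 1.58 at 15.8 pc.
vega_M = absolute_magnitude(0.03, 7.76)
castor_M = absolute_magnitude(1.58, 15.8)
print(f"Vega M = {vega_M:.2f}, Castor M = {castor_M:.2f}")
```

Both stars come out with an absolute magnitude close to 0.6, which is why
the more distant Castor appears fainter.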

Calculating distance using the distance modulus


SAMPLE PROBLEM 15.4
Achernar has an apparent magnitude of 0.45 and an absolute magnitude
of −2.77. Calculate its distance.
SOLUTION
$$M = m - 5 \log\left(\frac{d}{10}\right)$$
$$-2.77 = 0.45 - 5 \log\left(\frac{d}{10}\right)$$
$$-3.22 = -5 \log\left(\frac{d}{10}\right)$$
$$0.644 = \log\left(\frac{d}{10}\right)$$
$$\therefore \frac{d}{10} = 10^{0.644} = 4.4$$
$$\therefore d = 4.4 \times 10 = 44 \text{ pc}$$

Utilising annual parallax and the distance modulus
SAMPLE PROBLEM 15.5
According to the HIPPARCOS Catalogue, Altair has a parallax of 194.44
milliarcsec and an apparent magnitude of 0.76. Calculate:
(a) its distance, and
(b) its absolute magnitude.
SOLUTION
(a) $$d = \frac{1}{p} = \frac{1}{0.194\,44} = 5.14 \text{ pc}$$
(b) $$M = m - 5 \log\left(\frac{d}{10}\right)$$
    $$= 0.76 - 5 \log\left(\frac{5.14}{10}\right)$$
    $$= 0.76 - 5 \log 0.514$$
    $$= 0.76 - 5(-0.289)$$
    $$= 2.2$$
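The two-step calculation for Altair can be sketched in Python as follows.
This is an illustration only; the function names are assumptions made for
the example, and the parallax is converted from milliarcseconds to
arcseconds before use.

```python
import math


def distance_from_parallax(parallax_arcsec):
    """d = 1 / p, with p in seconds of arc and d in parsecs."""
    return 1 / parallax_arcsec


def absolute_magnitude(apparent_magnitude, distance_pc):
    """M = m - 5 log10(d / 10), with d in parsecs."""
    return apparent_magnitude - 5 * math.log10(distance_pc / 10)


# Altair: parallax 194.44 milliarcsec, apparent magnitude 0.76.
altair_parallax_arcsec = 194.44 / 1000
altair_distance = distance_from_parallax(altair_parallax_arcsec)
altair_M = absolute_magnitude(0.76, altair_distance)

print(f"d = {altair_distance:.2f} pc, M = {altair_M:.1f}")
```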

Spectroscopic parallax
Spectroscopic parallax is the name given to a method of using the
Hertzsprung–Russell (H–R) diagram and the distance modulus formula
to determine the approximate distance of a star.
When you studied ‘The Cosmic Engine’ in Year 11, you were introduced
to the H–R diagram. This is a graph of luminosity or absolute magnitude
(vertical axis) versus temperature or spectral class (horizontal
axis). When many stars are plotted onto an H–R diagram, certain star
groupings become apparent, such as the main sequence, red giants and
white dwarfs, as shown in figure 15.21.

Figure 15.21 An H–R diagram showing the various star groupings

The method involves the following steps:


1. Using photometry, measure the apparent magnitude, m, of the star in
question.
2. Using spectroscopy, determine the spectral class of the star. It is also
important to note the luminosity class, in order to determine to which
group the star belongs. Earlier in this chapter we discussed how the
width of the spectral lines is an indicator of the luminosity class of a star.
3. The H–R diagram is now consulted. Starting on the horizontal axis at
the appropriate spectral class, a vertical line is drawn to the middle of
the correct star group. This will place the star on the H–R diagram in
approximately the correct position. From this position a horizontal
line is drawn across to the vertical axis, so that the absolute magnitude,
M, can be read off.
4. Now that m and M are both known, the distance modulus formula is
applied in order to calculate the distance, d.
This technique can be applied to many stars; however, the value deter-
mined for the absolute magnitude can carry a large percentage error
and, therefore, so too will the calculated distance. In other words, the
distance determined by this method is approximate only.
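The four steps can be sketched in code. The Python example below is a
minimal illustration only. The lookup table APPROX_ABSOLUTE_MAGNITUDE is a
hypothetical stand-in for reading a value off the H–R diagram, and it holds
just the two approximate readings used in the sample problems of this
section (a K5 III giant and an M5 V main sequence star).

```python
# Step 3 stand-in: approximate absolute magnitudes read from an H-R diagram,
# keyed by (spectral class, luminosity class). Illustrative values only.
APPROX_ABSOLUTE_MAGNITUDE = {
    ("K5", "III"): 0.0,   # middle of the giants
    ("M5", "V"): 12.5,    # red dwarf on the main sequence
}


def spectroscopic_distance_pc(apparent_magnitude, spectral_class, luminosity_class):
    """Steps 3 and 4: look up M, then apply the distance modulus formula."""
    M = APPROX_ABSOLUTE_MAGNITUDE[(spectral_class, luminosity_class)]
    return 10 ** ((apparent_magnitude - M + 5) / 5)


# Steps 1 and 2 supply m (photometry) and the classes (spectroscopy).
print(round(spectroscopic_distance_pc(0.9, "K5", "III")))  # Aldebaran
print(round(spectroscopic_distance_pc(11, "M5", "V")))     # Proxima Centauri
```

Because M is only read approximately from the diagram, the distances
returned are estimates and can differ noticeably from parallax distances.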

Calculating the distance to Aldebaran using spectroscopic parallax


SAMPLE PROBLEM 15.6
Aldebaran is a bright red star with an apparent magnitude of 0.9 and belongs
to spectral class K5 and luminosity class III, which means that it is a red giant.
Use spectroscopic parallax to determine its approximate distance.
SOLUTION
With reference to the H–R diagram in figure 15.21, we locate spectral
class K5 along the horizontal axis and move from here vertically up to the
middle of the giants. From this point, slide horizontally across to read off
an absolute magnitude, M, of approximately 0. The distance modulus
formula is then applied:
$$M = m - 5 \log\left(\frac{d}{10}\right)$$
$$0 = 0.9 - 5 \log\left(\frac{d}{10}\right)$$
$$-0.9 = -5 \log\left(\frac{d}{10}\right)$$
$$0.18 = \log\left(\frac{d}{10}\right)$$
$$\therefore \frac{d}{10} = 10^{0.18} = 1.5$$
$$\therefore d = 1.5 \times 10 = 15 \text{ pc}.$$
This problem serves as a good example of the approximate nature of
this technique. The annual parallax of Aldebaran has been accurately
measured to be 50 milliarcsec. Therefore, its distance can be much more
accurately calculated using:
$$d = \frac{1}{p} = \frac{1}{0.05} = 20 \text{ pc}.$$
The source of the error lies in the approximation of M, as Aldebaran’s
real absolute magnitude lies closer to −0.6.

Calculating the distance to Proxima Centauri using spectroscopic parallax

SAMPLE PROBLEM 15.7
Proxima Centauri, the closest star to our solar system, is an M5V star with
an apparent magnitude of 11. Determine its distance using spectroscopic
parallax.
SOLUTION
Spectral class M5 indicates that Proxima is a red star. Luminosity class V
indicates that it is a main sequence star (these stars are known as red
dwarfs). From the H–R diagram in figure 15.21 the absolute magnitude
can be approximated at 12.5. Applying the distance modulus equation:
$$M = m - 5 \log\left(\frac{d}{10}\right)$$
$$12.5 = 11 - 5 \log\left(\frac{d}{10}\right)$$
$$1.5 = -5 \log\left(\frac{d}{10}\right)$$
$$-0.3 = \log\left(\frac{d}{10}\right)$$
$$\therefore \frac{d}{10} = 10^{-0.3} = 0.5$$
$$\therefore d = 0.5 \times 10 = 5 \text{ pc}.$$
It is well known that Proxima Centauri actually lies at a distance of 1.3 pc.
Once again the source of the inaccuracy lies in the approximation of M,
as Proxima Centauri’s actual absolute magnitude is almost 15.5.
The sample problems above demonstrate the technique, as well as the
approximate nature of the distances it produces. What then is its use? It
is not used when there is a more accurate solution available. Rather, it
can be used to give a ‘ball-park figure’ when no other technique can.
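The size of the discrepancy in each sample problem follows from the distance
modulus formula itself. As a sketch of the reasoning, let ΔM (a symbol
introduced here for illustration) be the estimated absolute magnitude minus
the true absolute magnitude. Since the formula can be rearranged to
$d = 10 \times 10^{(m - M)/5}$, the estimated and true distances are related by:

$$\frac{d_{\text{est}}}{d_{\text{true}}} = 10^{(M_{\text{true}} - M_{\text{est}})/5} = 10^{-\Delta M/5}$$

For Aldebaran, ΔM = 0 − (−0.6) = 0.6, so the estimate is too small by a
factor of about $10^{-0.12} \approx 0.76$ (compare 15 pc with 20 pc). For
Proxima Centauri, ΔM = 12.5 − 15.5 = −3, so the estimate is too large by a
factor of about $10^{0.6} \approx 4$ (compare 5 pc with 1.3 pc).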

Measuring colour
It has already been noted that the stars vary in colour. This variation is
not obvious when looking with the naked eye. Aldebaran, found in the
constellation of Taurus, and Betelgeuse, found in Orion, are both red
giants yet their colour is a subtle pink when studied with unaided eyes.
Look with a good set of binoculars or a telescope, however, and the
colour becomes obvious. Similarly, the blue colour of stars such as Rigel,
also found in the constellation of Orion, is quite faint until looked at with
a telescope. All of these stars are quite bright, but just how bright they
appear depends very much upon the colour sensitivity of the device and
method used.
Early measurements of star magnitudes were done by eye, which can
be a surprisingly sensitive discriminator. However, the human eye is most
sensitive to the yellow–green portion of the visible spectrum. As a result,
the red and blue stars mentioned earlier are not judged by the eye to be
as bright as they really are. Magnitude determined this way is referred to
as visual magnitude.
Later measurements of star magnitudes were made photographically,
and called ‘photographic magnitudes’. However, a problem arose.
Photographic magnitudes were inconsistent with visual magnitudes. The source
of the inconsistency was that photographic film is most sensitive to the
blue end of the visible spectrum, so that blue stars were measured to be
brighter than by eye, while yellow and red stars were measured to be
fainter.
Today, magnitudes are measured using photometers. Not only are
these devices very sensitive and accurate, but they are also sensitive to a
much wider range of wavelengths such as ultraviolet and infra-red, to
which the human eye is quite insensitive. In order to maintain consis-
tency with naked eye observations, stars are observed through a yellow–
green filter, called a V (for visual) filter, when apparent magnitudes are
being measured.



Colour magnitudes
The influence of colour sensitivity upon magnitude measurements, as
described above, has been taken advantage of to devise a
way to quantify star colours. By placing a standard set of coloured filters
in front of a photometer, three different colour magnitudes for each star
can be measured. The filters are described in table 15.5. Each filter
transmits a broad band of wavelengths, and the wavelength at the centre
of this band is also indicated in the table.
Note that one of these filters is the yellow–green filter (V), used to
simulate visual magnitudes. Another is a blue filter (B), which is used to
simulate photographic magnitudes. The ultraviolet filter (U) utilises the
extra sensitivity available from the photometer. This has become an
internationally accepted system referred to as the UBV system.
Table 15.5 Colour filters used in the UBV system
NAME   COLOUR         CENTRE WAVELENGTH   BASED UPON
U      Ultraviolet    365 nm              (none)
B      Blue           440 nm              Photographic magnitude
V      Yellow–green   550 nm              Visual magnitude

The letters U, B and V are also used to denote a star’s apparent magni-
tude as measured through each of the filters. Note that a red star, such as
Betelgeuse, will appear brightest through the V filter, so its V magnitude
will be lower than its B or U. Similarly a blue star, such as Rigel, will
appear brightest through the B filter so that its B magnitude will be lower
than its V or U.
Comparisons such as these can be useful. However, because these
colour magnitudes are numbers, comparisons can be made numerically,
and this is the purpose of the colour index.

PHYSICS FACT

The UBV system of stellar magnitudes has been extended into the
red and infra-red wavelengths, better utilising the capabilities of
photometers and creating a system able to produce two-colour values
for a much greater range of stars. The filters that have been added
are listed in table 15.6.
Table 15.6 Filters used to extend the UBV system
NAME CENTRE WAVELENGTH
R 700 nm or 0.7 µm
I 900 nm or 0.9 µm
J 1250 nm or 1.25 µm
H 1.6 µm
K 2.2 µm
L 3.4 µm
M 5.0 µm
N 10.2 µm
Q 21 µm

Colour index
By subtracting one colour magnitude from another, a numerical two-
colour value will result that expresses the colour of a star. The most
standard of this type of numerical comparison is the colour index.
Colour index is the difference between the photographic magnitude,
B, and the visual magnitude, V.
Colour index = B − V
The application of this formula results in a numerical scale that
expresses colour. To see how this occurs, recall that a red star is brighter
through a V filter than a B, so that its V magnitude is lower than its B
magnitude. Therefore, the expression B − V will result in a small positive
number. A blue star is the reverse of this. A blue star is brightest through
a B filter, and so its B magnitude will be less than its V magnitude. The
expression B − V will result in a small negative number. By definition,
stars of spectral class A0 have a colour index of zero. These stars have a
surface temperature of 10 000 K and a blue-white colour.
Table 15.7 shows the range of the colour index scale and how it correlates
with colour, as well as temperature and spectral class.
Table 15.7 The correlation of colour index, colour, temperature and
spectral class

COLOUR INDEX   COLOUR       SPECTRAL CLASS   TEMPERATURE (K)
−0.6           Blue         O                28 000–50 000
               Blue         B                10 000–28 000
0              Blue–white   A                7500–10 000
               White        F                6000–7500
+0.6           Yellow       G                5000–6000
               Orange       K                3500–5000
+2.0           Red          M                2500–3500

(See practical activity 15.4, Colour filters.)

Note that the relationship between colour index and temperature is not
linear: colour index −0.6 to zero covers a temperature range of
40 000 K, colour index zero to +0.6 covers a range of 4000 K, and colour
index +0.6 to +2.0 covers about 4000 K. If a spectrophotometer is available,
Wien’s law can be used to give a more accurate value for the star’s
temperature (see page 281).
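The colour index lends itself to a simple numerical treatment. The Python
sketch below is an illustration only: the function names are chosen for this
example, the B and V magnitudes passed to colour_index are made-up values,
and the colour bands are just the four colour index values listed in
table 15.7 (−0.6, 0, +0.6 and +2.0), so the descriptions are deliberately
coarse.

```python
def colour_index(b_magnitude, v_magnitude):
    """Colour index = B - V."""
    return b_magnitude - v_magnitude


def describe_colour_index(index):
    """Place a colour index between the values listed in table 15.7."""
    if index < -0.6 or index > 2.0:
        return "outside the range of table 15.7"
    if index < 0:
        return "between blue (class O) and blue-white (class A)"
    if index == 0:
        return "blue-white (class A0, about 10 000 K)"
    if index <= 0.6:
        return "between blue-white (class A) and yellow (class G)"
    return "between yellow (class G) and red (class M)"


# Hypothetical star with B = 1.5 and V = 1.0 (illustrative magnitudes).
print(colour_index(1.5, 1.0), describe_colour_index(colour_index(1.5, 1.0)))

# Colour indexes quoted for Aldebaran and Spica in the chapter questions.
print(describe_colour_index(1.538))
print(describe_colour_index(-0.235))
```

A positive result indicates a star redder and cooler than class A0, while a
negative result indicates a bluer, hotter star.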

Using colour index information


SAMPLE PROBLEM 15.8 Three stars are measured to have colour indexes of +0.5, 0 and −0.5.
What can be said of each star?
SOLUTION Refer to table 15.7. Of the first star (colour index +0.5) we can say that its
colour is white–yellow, its spectral class is about F5 and its surface tem-
perature is approximately 6500 K. Of the second star (colour index 0) we
can say that its colour is blue–white, its spectral class is A0 and its tem-
perature is approximately 10 000 K. Of the third star (colour index −0.5)
we can say that its colour is blue, its spectral class is about O5, and its
temperature is approximately 30 000 K to 40 000 K.



PHYSICS IN FOCUS
Photoelectrics versus photographics for photometry
Photometry involves the measurement of the brightness or magnitude of a
source of light such as a star.
This can be done photographically, using specially prepared emulsions, in
the field known as photographic photometry. This method involves
making a photograph of a portion of the sky. When the photograph is
developed, a measurement is made of the size and density of the spot made
by each star. Brighter stars expose a larger area of film and hence appear
on a photograph as a larger, denser spot (as shown in figure 15.22). Each
spot is compared to standard spot sizes and densities to determine the
stars’ magnitudes.
Photographic emulsions are restricted to the visible spectrum including
the near-infra-red and near-ultraviolet; however, particular emulsions
can restrict this range further. In addition, fine detail can be recorded
photographically, often to a higher resolution than can be achieved
electronically.
Photoelectric photometry is more common. These systems use a
combination of a filter and an electronic sensor such as a Charge Coupled
Device (CCD), which is a light-sensing array also found in a video camera
(CCDs for astronomy have a higher resolution). A photomultiplier tube may
also be used. Both devices convert the light input into an electronic
signal that can be multiplied, digitised, analysed and stored
electronically. All of this can be done quite quickly and remotely if
necessary.
These devices have a much wider range of wavelengths to which they are
sensitive than do photographic emulsions. Together with a suitable filter
they can sense intensities over broad wavelength bands, as in the UBV
system, or very narrow bands, useful when searching for the presence of a
particular element in a celestial object. In addition, the electronic
detectors in use today are more sensitive to faint light sources than is
photographic emulsion, although they cannot, as yet, achieve the same
resolution.

Figure 15.22 On a photograph, brighter stars appear as larger spots.

CHAPTER REVIEW
SUMMARY
• Annual parallax, p, is half the angle, in seconds of arc, through which
  a nearby star appears to shift as the Earth moves between two positions
  six months apart.
• The distance, in parsecs, to a nearby star is given by $d = \frac{1}{p}$.
• Atmospheric blurring reduces the accuracy of ground-based astrometric
  measurements. Space-based measurements avoid this problem, allowing the
  ability to measure more distant stars.
• A spectroscope is a device that uses a prism or a diffraction grating to
  separate light into its spectrum.
• There are three different types of spectra: continuous (produced inside
  stars), emission (produced by quasars and certain nebulae) and
  absorption (produced by stars).
• Black body radiation is approximated by a star’s energy output in optical
  and infra-red wavelengths. It is spread continuously but not evenly
  across the electromagnetic spectrum. It has a peak output that is
  dependent upon temperature.
• The position and appearance of the lines in a spectrum can reveal details
  of a star’s composition, temperature, velocity and density.
• The spectral classes, from hottest to coolest, are O, B, A, F, G, K, M.
• Apparent magnitude, m, is the magnitude given to a star as viewed from
  Earth. It is a measure of a star’s brightness.
• A star that is one magnitude lower than another is 2.512 times brighter.
  The brightness ratio of any two stars can be calculated using the
  following formula:
  $$\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}.$$
• Absolute magnitude, M, is the magnitude that a star would have if viewed
  from a standard distance of 10 parsecs. It is a measure of a star’s
  luminosity.
• If m and M are known, then a star’s distance can be calculated using the
  following formula:
  $$M = m - 5 \log\left(\frac{d}{10}\right) \quad \text{or} \quad m - M = 5 \log d - 5.$$
• Spectroscopic parallax is a method of approximating a star’s distance
  using an H–R diagram to estimate M, and a measurement of m, to then
  calculate d.
• Star magnitudes can be measured through colour filters to produce the
  colour magnitudes U, B and V.
• Colour index = B − V. This is a numerical measure of the colour of stars
  and produces a scale from −0.6 (blue) to 0.0 (blue–white) to +2.0 (red).
• Photoelectric photometry has a greater range, degree of sensitivity and
  ability to produce easily digitised images. Photographic photometry can
  achieve a higher resolution.

QUESTIONS
Astrometry
1. Complete the following table of conversion factors:

                   km     AU     l-y    pc
   1 km =          1
   1 AU =                 1
   1 light-year =                1
   1 parsec =                           1

2. Define:
   (a) an astronomical unit (AU)
   (b) annual parallax
   (c) parsec
   (d) light-year.
3. (a) Construct a triangle to explain the method of astronomical distance
       determination using trigonometric parallax.
   (b) Identify the baseline in this triangle.
   (c) What mathematical assumption was made in the development of this
       technique?
4. Using the HIPPARCOS Catalogue web site the following annual parallaxes
   were found. Use them to calculate the distance to each star, in parsecs.
   (a) Achernar, p = 0.0227 arcsec
   (b) Acrux, p = 0.0102 arcsec
   (c) Aldebaran, p = 0.0501 arcsec
   (d) Algol, p = 0.0351 arcsec
   (e) Altair, p = 0.1944 arcsec
   (f) Antares, p = 5.4 milliarcsec (mas)
   (g) Arcturus, p = 88.9 mas
   (h) Barnard’s Star, p = 549.0 mas
   (i) Hadar, p = 6.2 mas
   (j) Mira, p = 7.8 mas



5. Use the following star distances to calculate the corresponding annual
   parallaxes (data based on information from the HIPPARCOS Catalogue).
   (a) Castor, d = 15.8 pc
   (b) Pollux, d = 10.3 pc
   (c) Rigel, d = 237.0 pc
   (d) Sirius, d = 2.6 pc
   (e) Bellatrix, d = 74.5 pc
   (f) Betelgeuse, d = 131.1 pc
   (g) Canopus, d = 95.9 pc
   (h) Capella, d = 12.9 pc
   (i) Deneb, d = 990.1 pc
   (j) Fomalhaut, d = 7.7 pc
6. For each of the stars listed in question 5, convert their distance from
   parsecs to light-years.
7. (a) State the main limiting factor with the trigonometric technique when
       used with ground-based telescopes.
   (b) Discuss means that can be used to overcome this limitation.
8. (a) Compare the precision of astrometric measurements made by
       traditional ground-based telescopes with that of space-based
       telescopes.
   (b) Extend this comparison to the limits of distance determinations
       using each type of telescope.
9. In chapter 14, on pages 268 and 269, we looked at some highly advanced,
   ground-based telescopes. Discuss the improvements to astrometric
   measurements that these telescopes offer.

Spectroscopy
10. Define a black body, and black body radiation.
11. Describe how the theory of black bodies can be applied to stars.
12. Referring to figure 15.11 (page 281), explain the changes that occur to
    a black body’s radiation curve as its temperature increases.
13. Use the graph of intensity versus wavelength shown in figure 15.11
    (page 281) to predict the approximate surface temperature of a star
    with its peak intensity:
    (a) at 1000 nm
    (b) at ultraviolet wavelengths
    (c) in the visible spectrum.
14. The spectrum of a star has its peak intensity at a wavelength of
    400 nm. The star has a radius of $6.90 \times 10^{8}$ m. Using the
    equations on pages 281 and 282 calculate:
    (a) the surface temperature of the star
    (b) the total energy output per second (power output) of the star.
15. Describe the parts of a spectroscope.
16. List the three types of spectra, along with the astronomical objects
    that may produce each type.
17. Describe the process that produces discrete bright lines in an emission
    spectrum.
18. Describe the process that produces discrete dark lines in an absorption
    spectrum.
19. Describe the general trend of dominant lines in stellar spectra,
    beginning with cool stars and moving to hot stars.
20. List the names of the spectral classes, beginning with hot stars. Next
    to each write down the dominant spectral feature and the temperature
    range for each class.
21. Describe the method by which the spectrum of a star can be used to
    deduce information about that star’s:
    (a) surface temperature
    (b) chemical composition
    (c) translational velocity
    (d) rotational velocity
    (e) density.

Photometry
22. Describe the difference between the brightness and luminosity of a
    star.
23. Describe in point form three basic features of the magnitude scale as
    devised by Hipparchus.
24. (a) What was the definition that Pogson used to describe Hipparchus’
        magnitude scale mathematically?
    (b) Identify the significance of Pogson’s ratio.
25. (a) If two stars differ in brightness by one magnitude, state how much
        brighter one is compared to the other.
    (b) If the two stars differ in magnitude by five, calculate their
        brightness ratio.
26. Calculate the brightness ratios of each of the pairs of stars below. Be
    sure to state clearly which star is the brighter.
    (a) Achernar (m = 0.45) and Algol (m = 2.09)
    (b) Antares (m = 1.06) and Arcturus (m = −0.05)
    (c) Barnard’s star (m = 9.54) and Hadar (m = 0.61)
    (d) Pollux (m = 1.16) and Procyon (m = 0.4)
    (e) Mira (m = 6.47) and Proxima Centauri (m = 11.01)
    (f) Mira (m = 6.47) and Algol (m = 2.09)

27. The apparent magnitude of the Sun is −26.7 while that of the full Moon
    is −12.5. Calculate how much brighter the Sun is compared to the Moon.
28. (a) Define apparent magnitude and absolute magnitude.
    (b) State what characteristic of a star each measures.
29. (a) If a star’s apparent magnitude, m, were numerically greater than
        its absolute magnitude, M, what does this tell you about its
        distance from us? Identify the sign of the distance modulus.
    (b) If M > m, what does this tell you about a star’s distance? Identify
        the sign of the distance modulus now.
    (c) If M = m, what can be inferred about a star’s distance? Identify
        the value of the distance modulus.
30. Acrux has an apparent magnitude of 0.77 and an absolute magnitude of
    −4.19.
    (a) Calculate how much brighter this star would be if it were located
        at a distance of 10 pc rather than its true distance.
    (b) Calculate the true distance of Acrux using the distance modulus
        formula.
31. Use the distance modulus formula to calculate the missing data in the
    following table.

    STAR        m        M        d
    Rigel       0.18     −6.69
    Bellatrix   1.64              74.5
    Capella              −0.48    12.9
    Sirius      −1.44             2.64
    Deneb       1.25     −8.73
    Altair               2.2      5.14
    Achernar    0.45              44.1
    Spica       0.98     −3.55

32. The following is an extract of data from the HIPPARCOS Catalogue. Use
    the parallax formula $d = \frac{1}{p}$ as well as the distance modulus
    formula to complete the table. Limit your answers to three significant
    figures.

    STAR         PARALLAX (mas)   DISTANCE (pc)   m        M
    Fomalhaut    130.08                           1.17
    Vega         128.93                           0.03
    Canopus      10.43                            −0.62
    Betelgeuse   7.63                             0.45
    Rigil Kent   742.12                           −0.01

33. Describe the steps involved in spectroscopic parallax.
34. Distances determined by spectroscopic parallax can involve a high
    degree of error. Identify the source of this error.
35. Fomalhaut is an A3V star while Vega is an A0V. Determine their
    distances using the spectroscopic parallax techniques, and then compare
    these distances to those calculated in question 32. Comment on any
    differences.
36. Canopus is an F0I star. Determine its distance using spectroscopic
    parallax and compare this figure to that calculated in question 32.
37. Explain the difference that occurs between apparent magnitudes
    determined by eye and those determined photographically.
38. Identify the three filters used in the UBV system. Include wavelengths
    for each filter.
39. (a) Compare an orange star’s apparent magnitude as measured through a B
        filter, to that measured through a V filter.
    (b) Make the same comparison for a blue star.
    (c) Repeat the comparison, this time for a white star.
40. (a) Define colour index.
    (b) Construct a number line to represent the colour index scale. Upon
        this line write down star colours that correspond to particular
        values.
41. Aldebaran has a colour index of 1.538. State its approximate:
    (a) colour
    (b) spectral class
    (c) temperature.
42. Spica has a colour index of −0.235. Discuss what can be inferred about
    this star based solely upon this single piece of information.



PRACTICAL ACTIVITIES

15.1 ACCESSING STAR DATA

Aim
To access the HIPPARCOS Catalogue search facility in order to access
up-to-date star data.

Apparatus
Internet access

Background information
During the period 1989 to 1993 the HIPPARCOS satellite surveyed 118 218
stars, measuring their annual parallax to an accuracy of one milliarcsec.
It took until 1997 to analyse and compile the data into the HIPPARCOS
Catalogue. Auxiliary instruments aboard the satellite recorded other
parameters of over a million other stars, and these data have been
compiled to form the Tycho Catalogue.

Method
Portions of the HIPPARCOS and Tycho Catalogues can be accessed at their
search facility web site. To locate the page either perform a web search
using their name as the search string, or use the weblink provided.

eBook plus Weblink: The HIPPARCOS Catalogue

To retrieve information on particular stars you will first need to know
the HIP number for the star. There is a link for this function, known as
‘Simbad’, at the bottom of the Research Tools page on the HIPPARCOS
Catalogue site. Find this link and determine the HIP number for the star
known as Procyon. (The HIP number can also be found at any number of third
party database websites.) Now go back to the search page and enter this
number in the field labelled ‘HIPPARCOS Identifier’. Print out the results
of your search.
On the printout, highlight the annual parallax of this star, then include
a calculation of the distance.
There is something unusual about Procyon. Can you spot what it is?
You should now be able to retrieve information on other stars of your
choice.

15.2 ANNUAL PARALLAX PRECISION

Aim
To compare the limits of annual parallax as measured by ground-based and
space-based methods.

Background information
Errors can be propagated through a calculation. In general, if
$$A = \frac{1}{B}$$
then
percentage error in A = percentage error in B.
Hence, since
$$d = \frac{1}{p}$$
we can say that
percentage error in d = percentage error in p.

Method
By reference to this textbook, or otherwise, determine the accuracy limit,
in seconds of arc, of measurements of annual parallax made from the
ground-based telescopes, as well as from the HIPPARCOS satellite. For the
purpose of comparison, we will also include the anticipated accuracy of the
future GAIA satellite. Enter this information into a results table as
shown.
Next, determine parallax angles for which each accuracy limit represents a
1% error and record this in the appropriate spaces. Repeat this process,
determining angles that would have 10% and 50% errors.
Finally, use these parallax angles to calculate the corresponding
distances that would have a 1%, 10% and a 50% error in each case.

Results

                            GROUND-BASED   SPACE-BASED   SPACE-BASED
                            TELESCOPE      HIPPARCOS     GAIA
Accuracy limit
Parallax with 1% error
Parallax with 10% error
Parallax with 50% error
Distance with 1% error
Distance with 10% error
Distance with 50% error
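The calculation asked for in activity 15.2 can be sketched in Python. This
is a minimal illustration under stated assumptions: the function names are
chosen for this example, and only the HIPPARCOS accuracy of one milliarcsec
quoted in activity 15.1 is used. The ground-based and GAIA limits are left
for you to look up and pass in as the first argument.

```python
def parallax_for_error(accuracy_limit_arcsec, fractional_error):
    """Parallax (arcsec) at which the accuracy limit is this fraction of p."""
    return accuracy_limit_arcsec / fractional_error


def distance_from_parallax(parallax_arcsec):
    """d = 1 / p, with p in seconds of arc and d in parsecs."""
    return 1 / parallax_arcsec


# HIPPARCOS accuracy limit: one milliarcsec, expressed in seconds of arc.
HIPPARCOS_LIMIT_ARCSEC = 0.001

for fractional_error in (0.01, 0.10, 0.50):
    p = parallax_for_error(HIPPARCOS_LIMIT_ARCSEC, fractional_error)
    d = distance_from_parallax(p)
    print(f"{fractional_error:.0%} error: p = {p:.3f} arcsec, d = {d:.0f} pc")
```

Because the percentage error in d equals the percentage error in p, each
distance printed carries the same percentage error as its parallax.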

Questions (b) Turn on the available gas discharge tubes in
turn, allowing each time to warm up. Examine
1. What is the main source of error for ground- each with the spectroscope, then describe and
based measurement? draw its appearance.
2. How many times more accurate is HIPPARCOS (c) Turn on the incandescent globe and place a
than ground-based measurements? large beaker of coloured solution in front of it.
3. How many times more accurate will GAIA be Using the spectroscope, view the light through
than HIPPARCOS (as planned)? the beaker. Try several different colourings such
4. Bearing in mind that there is always some error as copper sulfate or potassium permanganate.
in any measurement, what would you regard as an Describe and draw the spectra you observe.
acceptable percentage error to an astronomical
distance, if it is to be stated with confidence?
Questions
1. Part (a) should have produced a continuous spec-
15.3 trum. What astronomical object produces this
type of spectrum?
SPECTRA 2. Part (b) should have produced emission spectra.
What astronomical objects produce this type of
spectrum?
3. Part (c) should have produced absorption spectra.
What astronomical objects produce this type of
Aim spectrum?
To observe each of the three types of spectra —
continuous, emission and absorption.
15.4
Apparatus
spectroscope COLOUR FILTERS
gas discharge tubes
incandescent lamp
coloured solutions

Background information Aim


Each of the three types of spectra are produced by To demonstrate the use of colour filters for photo-
different types of sources as outlined in the table metric measurements.
below.
Apparatus
SPECTRUM PRODUCED BY LAB SOURCE light ray kit with colour filters
data logger with light sensor or a light meter
Continuous Incandescent Incandescent
spectra solids, liquids globe
and high
Theory
density gases The human eye is most sensitive to the yellow–
green portion of the spectrum and this sensitivity
Emission spectra Incandescent Gas discharge is reflected in visual magnitudes. Early photo-
low density tubes graphic emulsions were most sensitive to the blue
gases end of the spectrum, and this is reflected in photo-
graphic magnitudes. Photoelectric photometry
Absorption Non-luminous Coloured
simulates these by the use of colour filters. The
spectra fluid in front of solution in front
UBV system uses an ultraviolet filter (U), a blue
a continuous of a globe
filter (B) and a yellow–green filter (V).
spectrum

Method
Method (a) Place a red filter before the light ray kit lamp, to
(a) Turn on the incandescent globe and examine its simulate a red star. Darken the room and place
spectrum with the spectroscope. Describe and your light sensor at a distance from the lamp
then draw its appearance using coloured pencils. that produces appropriate output levels. Record

CHAPTER 15 ASTRONOMICAL MEASUREMENT 303


the reading from the light sensor. Next, place a
yellow filter (which we will use to simulate the V
filter) in front of the sensor and record the
output reading that results. Finally, replace the
yellow filter with a blue filter (for the B simu-
lation) and record the output reading.
(b) Replace the red filter in front of the lamp with
a blue filter to simulate a blue star. Repeat the
measurements made in part (a).

Results
(a) Red star: No filter reading =
V filter reading =
B filter reading =
(b) Blue star: No filter reading =
V filter reading =
B filter reading =

Questions
1. (a) Compare the ‘no filter’ readings from parts (a) and (b) above. Were the readings the same, or did the device show a greater sensitivity to one of them?
(b) If they were different, can you suggest
another explanation for the difference?
2. (a) Through which colour filter is the red star
brightest?
(b) Explain how this would result in a positive
colour index for a red star.
3. (a) Through which colour filter is the blue star
brightest?
(b) Explain how this would result in a negative
colour index for a blue star.
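A short Python sketch may help with questions 2 and 3. It assumes that the colour index is B − V, that the light sensor output is proportional to the light received, and that a magnitude difference follows the usual −2.5 log rule; the function name and the sample readings are hypothetical, not measured values.

```python
import math


def colour_index(b_reading, v_reading):
    """Colour index B - V (in magnitudes) from two readings in the same linear unit."""
    return -2.5 * math.log10(b_reading / v_reading)


# Hypothetical readings: a 'red star' is dimmer through the B filter than the V filter.
red_star = colour_index(b_reading=50.0, v_reading=100.0)
# A 'blue star' is brighter through the B filter than the V filter.
blue_star = colour_index(b_reading=100.0, v_reading=50.0)
print(f"red star  B - V = {red_star:+.2f}")
print(f"blue star B - V = {blue_star:+.2f}")
```

Because a smaller reading corresponds to a larger (fainter) magnitude, the star that is dimmer in B ends up with a positive index, and the star that is brighter in B with a negative one.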
CHAPTER 16
BINARIES AND VARIABLES

Remember
Before beginning this chapter, you should be able to:
• describe the Law of Universal Gravitation
• describe the role of centripetal force in circular motion
• describe and draw a Hertzsprung–Russell (H–R) diagram and the placement of main sequence, red giants and white dwarfs upon it.

Key content
At the end of this chapter you should be able to:
• describe binary stars in terms of the means of their detection; that is, visual, eclipsing, spectroscopic and astrometric
• explain the importance of binary stars in determining stellar masses
• use Kepler’s Third Law to calculate the mass of a binary star system and solve problems using
  $m_1 + m_2 = \dfrac{4\pi^2 r^3}{GT^2}$
• classify variable stars as either extrinsic or intrinsic and non-periodic or periodic
• explain the importance of the period–luminosity relationship for determining the distance of Cepheid variables.

Figure 16.1 Two exposures of a binary or double star: the Dog Star, Sirius A (the brightest star in the sky, after the Sun) and the Pup, Sirius B (the first white dwarf to be discovered). Sirius B is indicated by the arrows.

Binary stars are double star systems. Variable stars are stars that vary in brightness. These two apparently different stellar objects have at least two things in common. Firstly, an unresolved binary can appear to be a variable star. Secondly, the study of each type of object has made an invaluable contribution to the advancement of astronomy. The study of binaries has increased our knowledge of the mass of stars, while the study of variables has given us a reliable distance-measuring tool.

16.1 BINARIES
A binary star system consists of two stars in orbit about their common
centre of mass. It may come as a surprise to learn that more than half of
the main sequence stars known are not single stars, but are members of
binary (double) or other multiple star systems. Binaries, in particular, are
useful because their motion can be analysed to determine their masses.
To understand the significance of this we need to realise that there is
no direct way to measure the mass of an isolated star. As we have seen
already in studying this astrophysics option, by analysing the light from a
single star we can infer a great deal about its size, temperature, luminosity,
composition, density and velocity. However, none of this information will lead us directly to the knowledge of a star’s mass.
In order to determine the mass of a star we need to observe its gravitational effect on another object. For instance, the mass of the Sun can be determined by analysing its effect on the motion of the Earth (or other planet). The vast majority of single stars show no such effect that can be analysed; however, binary stars do. In a sense, binaries have functioned as stellar scales because through them so much has been learned of the masses of stars that even the mass of single stars can be inferred.
Binary systems have been classified according to the way that they have been detected, placing them into the following four groups: visual, eclipsing, spectroscopic and astrometric. Bear in mind, however, that there is no physical difference between any of these binary systems.

Figure 16.2 Each star follows an elliptical path around the centre of mass of the system with the more massive star closer. (Diagram labels: the more massive star follows a smaller ellipse; centre of mass.)

Visual binaries
A visual binary can be resolved by a telescope; that is, a good telescope can clearly show both stars in the system. The brighter of the pair is called the primary and is designated with the letter A; the other star is the secondary and carries the letter B. Examples that are easily observed are Alpha Crucis A and B, Gamma Andromedae A and B, and Alpha Centauri A and B, although this last example is actually a triple system with the third star, Alpha Centauri C, not visible with a small telescope.
Suspected visual binaries may appear close only by line-of-sight, so it is sometimes necessary to observe the stars for many years to be sure that they are in motion around each other. Each star follows an elliptical orbit around the centre of mass of the system, with the star of larger mass tracing out a smaller ellipse. This is shown in figure 16.2.
By observing closely in order to measure the period of the motion and the separation of the stars, it is possible to calculate the total mass of the system. To see how this is possible, consider the simplified system shown in figure 16.3 in which the ellipses have been simplified to circles.

Figure 16.3 A simplified binary system (stars of mass m1 and m2 lie at distances r1 and r2 from the centre of mass, with r1 + r2 = r)


The centre of mass of the system is the point at which
$m_1 r_1 = m_2 r_2$, but $r_2 = r - r_1$,
so that $m_1 r_1 = m_2 (r - r_1)$
$\therefore r_1 = \dfrac{m_2 r}{m_1 + m_2}$
or $r_1 = \dfrac{m_2 r}{M}$, where $M = m_1 + m_2$.
You will recall from your work for the module ‘Space’ (page 41) that it is the gravitational attraction between the two stars that acts as the centripetal force that keeps each star in its orbit.
Therefore $F_{\text{gravitational}} = F_{\text{centripetal}}$:
$\dfrac{G m_1 m_2}{r^2} = \dfrac{m_1 v^2}{r_1}$.
Recall that the orbital period, T, is related to orbital speed by:
$v = \dfrac{2\pi r_1}{T}$.
Substituting this into the previous equation gives:
$\dfrac{G m_2}{r^2} = \dfrac{4\pi^2 r_1}{T^2}$.
The expression for $r_1$ above is substituted into this expression:
$\dfrac{G m_2}{r^2} = \dfrac{4\pi^2 m_2 r}{T^2 M}$
and therefore $M = \dfrac{4\pi^2 r^3}{GT^2}$
where
M = total mass of the binary system (kg) = m1 + m2
m1 = mass of star 1 (kg)
m2 = mass of star 2 (kg)
r = separation distance of the stars (m)
T = orbital period of the binary system (s).
Note that by rearranging this formula it becomes Kepler’s Third Law:
$\dfrac{r^3}{T^2} = \dfrac{GM}{4\pi^2}$.
By using this equation it is possible to calculate the mass of the binary system, but not the individual star masses. In order to do that a measurement must be made of the distance from one of the stars to the centre of mass. This is not an easy measurement to make, since the inclination of the orbit relative to us must be known. However, if the measurement can be made then, by combining the following three relations (used above), the individual masses m1 and m2 can be calculated:
$m_1 r_1 = m_2 r_2$
so that $\dfrac{r_1}{r_2} = \dfrac{m_2}{m_1}$.
Now, if $r_2 = r - r_1$
and $M = m_1 + m_2$
then $\dfrac{r_1}{r - r_1} = \dfrac{M - m_1}{m_1}$
which simplifies to $m_1 = \dfrac{M(r - r_1)}{r}$
where
r1 = distance of star 1 from the centre of mass of the binary system (m).
This expression allows each star’s mass to be calculated once its distance from the centre of mass of the system is known.

Calculating the mass of a binary system


SAMPLE PROBLEM 16.1
Sirius A and B, shown in figure 16.1, are known as the ‘Dog Star’ and ‘The Pup’. Sirius A is a white, A-class main sequence star and Sirius B was the first white dwarf to be discovered. The system has a trigonometric parallax of 379.2 milliarcsec, which means that it lies at a distance of just 2.64 pc. The pair is observed to have a period of 18 295.4 days (just over 50 years). If their separation is $3.0 \times 10^{9}$ km, calculate the mass of the system.

SOLUTION
Note that 18 295.4 days = $1.58 \times 10^{9}$ s
and $3.0 \times 10^{9}$ km = $3.0 \times 10^{12}$ m.
$M = \dfrac{4\pi^2 r^3}{GT^2}$
$= \dfrac{4\pi^2 (3.0 \times 10^{12})^3}{(6.672 \times 10^{-11})(1.58 \times 10^{9})^2}$
$= 6.4 \times 10^{30}$ kg

Calculating the mass of the stars within a binary system


SAMPLE PROBLEM 16.2
The stars in a visual binary system are observed to have an orbital period of $1.8 \times 10^{8}$ s and are $5.0 \times 10^{8}$ km apart. Further measurement determines that the more massive star is $1.5 \times 10^{8}$ km from the centre of mass of the system. Determine:
(a) the total mass of the system
(b) the masses of each star.

SOLUTION
(a) $M = \dfrac{4\pi^2 r^3}{GT^2}$
$= \dfrac{4\pi^2 (5.0 \times 10^{11})^3}{(6.672 \times 10^{-11})(1.8 \times 10^{8})^2}$
$= 2.3 \times 10^{30}$ kg
(b) $m_1 = \dfrac{M(r - r_1)}{r}$
$= \dfrac{2.3 \times 10^{30}\,(5.0 \times 10^{11} - 1.5 \times 10^{11})}{5.0 \times 10^{11}}$
$= 1.6 \times 10^{30}$ kg
$m_2 = M - m_1$
$= 2.3 \times 10^{30} - 1.6 \times 10^{30}$
$= 0.7 \times 10^{30}$ kg

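The two sample problems can be checked with a short Python sketch. The function names are hypothetical; the value of G and the input data are those used in the worked solutions, and the orbits are assumed to be circular, as in the simplified derivation.

```python
import math

G = 6.672e-11  # gravitational constant used in the sample problems (N m^2 kg^-2)


def total_mass(separation_m, period_s):
    """Total mass of a binary from M = 4*pi^2*r^3 / (G*T^2)."""
    return 4 * math.pi**2 * separation_m**3 / (G * period_s**2)


def individual_masses(total, separation_m, r1_m):
    """Masses of star 1 and star 2, where r1 is star 1's distance from the centre of mass."""
    m1 = total * (separation_m - r1_m) / separation_m
    return m1, total - m1


# Sample problem 16.1: Sirius A and B (period converted from days to seconds).
M_sirius = total_mass(3.0e12, 18295.4 * 86400)
print(f"Sirius system: {M_sirius:.1e} kg")

# Sample problem 16.2: total mass, then the mass of each star.
M = total_mass(5.0e11, 1.8e8)
m1, m2 = individual_masses(M, 5.0e11, 1.5e11)
print(f"M = {M:.1e} kg, m1 = {m1:.1e} kg, m2 = {m2:.1e} kg")
```

The sketch carries unrounded values through the calculation, so its results can differ slightly in the last digit from working that rounds M before finding the individual masses.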
Eclipsing binaries
An eclipsing binary is a binary system whose orbital plane is seen edge on by us. This means that at some stage through an orbit each star eclipses the other and blocks its light. These binaries are characterised by their light curve, a graph of brightness versus time. A typical example is the light curve of Algol, the first eclipsing binary discovered, shown in figure 16.4.

Figure 16.4 The light curve of Algol (relative brightness plotted against time in hours, showing the primary minimum and the shallower secondary minimum)

Algol A is a spectral class B8 main sequence star, while Algol B is a fainter
spectral class K2 sub-giant. When the stars are side-by-side from our point
of view, then the system produces maximum light. When Algol B is in front
of Algol A (the primary eclipse) the brighter star is hidden and the light
received by us drops significantly. When Algol A is in front of Algol B (the
secondary eclipse) the dimmer star is hidden and the light received by us
drops again but not as much as during the primary eclipse. As a result of
these variations the light curve shows a regular pattern of asymmetrical
dips and the period of the motion can easily be measured as the time
between successive primary or secondary minima.
The duration of the eclipses can also reveal the diameter of each star,
and in Algol’s case the two stars are of very similar size (2.9 solar radii for
A and 3.5 solar radii for B) although their masses are quite different (3.7
solar masses for A and 0.8 solar masses for B).
Figure 16.5 presents an example with two stars of quite different sizes. In this case a small hot star and a large cool star orbit each other. While their luminosities may be quite similar, the primary eclipse in this case is with the larger star in front of the smaller star (which produces more light than a similar sized area of the larger star).

Practical activity 16.1: Eclipsing binaries
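The reason the primary eclipse occurs with the small hot star hidden can be illustrated with a rough Python sketch. It rests on several assumptions that are not part of this text: each star is treated as a uniform disc radiating like a black body (light per unit area proportional to T^4), the eclipses are central, and the radii and temperatures are invented for illustration. The function name is hypothetical.

```python
def eclipse_depths(r_small, t_hot, r_large, t_cool):
    """Fraction of the system's light lost in each eclipse.

    Returns (loss when the small hot star is hidden behind the large star,
             loss when the small hot star passes in front of the large star).
    """
    light_small = r_small**2 * t_hot**4      # proportional to disc area x T^4
    light_large = r_large**2 * t_cool**4
    total = light_small + light_large
    hot_star_hidden = light_small / total
    # In front, the small star covers an equal-sized patch of the cooler surface.
    cool_patch_covered = r_small**2 * t_cool**4 / total
    return hot_star_hidden, cool_patch_covered


# Invented values: radii in solar radii, temperatures in kelvin.
primary, secondary = eclipse_depths(r_small=1.0, t_hot=10000.0, r_large=3.0, t_cool=5000.0)
print(f"hot star hidden:     {primary:.0%} of the light lost")
print(f"cool patch covered:  {secondary:.0%} of the light lost")
```

In both eclipses the same area is covered, so the deeper dip is the one in which the hotter surface is hidden.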

Figure 16.5 The light curve typical of a binary system with a small hot star and a large cool star (relative brightness, from 0% to 100%, plotted against time with the period marked; the corresponding star positions are: small, hot star behind; small, hot star in front; small, hot star behind again)

Spectroscopic binaries
A spectroscopic binary is an unresolved pair whose binary nature is revealed by alternating Doppler shifting of their spectral lines. Spectroscopic detection of a binary system is most likely if the period of the motion is short and the individual star velocities are high. Consequently, most spectroscopic binaries are close binary systems.
Figure 16.6, on the following page, shows four positions in the rotation of a binary system. The system is viewed from its edge; that is, along the plane of the orbits of the stars. When in positions 1 and 3, the stars are moving across our line of sight so that a single regular absorption spectrum is observed. However, when in position 2 star A is moving away from us, so its spectral lines are slightly red-shifted. At the same time star B is moving towards us so that its lines are slightly blue-shifted. This results in a doubling (or splitting) of the spectral lines, and a measurement of the degree of shifting can lead to the velocities of each star. In position 4 the situation is similar to position 2; however, the motions are reversed.

Practical activity 16.2: Spectroscopic binaries

Regular observation of the spectrum of such binaries will reveal their period. This, in combination with the velocities of the stars, allows the circumference of the orbit, and hence the separation of the two stars, to be calculated. Kepler’s Third Law can then be employed to calculate the total mass of the system.
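The chain of reasoning in the last paragraph (period and velocities give the orbit circumference, then the separation, then the mass) can be sketched in Python. The sketch assumes circular orbits viewed exactly edge on; the function names and the measurements are hypothetical values chosen only for illustration.

```python
import math

G = 6.672e-11  # gravitational constant (N m^2 kg^-2), same value as the sample problems


def orbit_radius(speed_m_s, period_s):
    """Radius of a circular orbit from its circumference: 2*pi*r = v*T."""
    return speed_m_s * period_s / (2 * math.pi)


def spectroscopic_total_mass(v1_m_s, v2_m_s, period_s):
    """Total mass from both orbital speeds and the period, via M = 4*pi^2*r^3 / (G*T^2)."""
    separation = orbit_radius(v1_m_s, period_s) + orbit_radius(v2_m_s, period_s)
    return 4 * math.pi**2 * separation**3 / (G * period_s**2)


# Hypothetical measurements, for illustration only.
period = 10 * 86400        # a 10-day period, in seconds
v1, v2 = 30e3, 60e3        # orbital speeds from the Doppler shifts, in m/s
M = spectroscopic_total_mass(v1, v2, period)
print(f"separation = {orbit_radius(v1, period) + orbit_radius(v2, period):.2e} m")
print(f"total mass = {M:.2e} kg")
```

If the orbit is tilted, the Doppler shifts give only the line-of-sight part of each velocity, so speeds obtained this way would be underestimates.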

Figure 16.6 The doubling of spectral lines characteristic of spectroscopic binaries (for each of the four positions in the sequence the diagram shows the star positions, the view from the side and the spectrum recorded from red to violet; the spectral lines are single in positions 1 and 3 and doubled in positions 2 and 4)

Note that this analysis works properly only if the orbits are viewed from
the edge. Therefore, it is quite possible that a spectroscopic binary can also
be an eclipsing binary. Unfortunately, it is quite difficult to determine the
angle of inclination of the plane of the orbit to us, and without this vital
piece of information we cannot calculate the individual masses of the stars.

Astrometric binaries
In an astrometric binary one of the stars is too faint to be observed; however, the visible star can be seen to have an orbital motion. This shows itself as a detectable ‘wobble’ in the star’s proper motion, or motion relative to the rest of the sky. From this, astronomers infer the presence of the unseen partner. Astrometric measurement of the visible star’s wobble can reveal the period of the orbit as well as its size, leading to an estimation of the mass of the system and, possibly, the individual star masses.

PHYSICS FACT

The unseen partners in astrometric binaries have been found to have a very wide range of masses. Very high mass partners have been found, providing early evidence of the existence of black holes.
Modern astrometric techniques (see chapter 15) have detected stars
with very small wobbles indicating very small mass unseen partners,
providing the first evidence of the existence of planets outside our
solar system. Most of these planets have masses similar to Jupiter and
many are positioned quite close to their star.

The mass–luminosity relationship
A major benefit of the study of binary systems has been the ability to determine star masses. If the luminosities of the main sequence stars are also measured and plotted against their masses, then a relationship between the two becomes apparent. Known as the mass–luminosity relationship, as shown in figure 16.7, it shows that for most main sequence stars the luminosity is proportional to the fourth power of its mass.
$L \propto \text{mass}^4$

Figure 16.7 The mass–luminosity relationship for main sequence stars (luminosity, in multiples of the Sun’s luminosity from $10^{-3}$ to $10^{5}$, plotted against mass, in solar masses from 0.1 to 10)

This has implications for interpreting the main sequence on an H–R diagram. As luminosity increases up the sequence, so too does mass, although not by as much (because of the fourth power relationship). Also, the brighter, more massive stars have shorter lifetimes. This is because, although they are larger and have more fuel, they are also much more luminous and therefore are burning that fuel at a much faster rate. This new information about the main sequence, all flowing from the study of binaries, is summarised in figure 16.8.
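The fourth-power relationship and its consequence for lifetimes can be sketched in Python. The lifetime estimate is an assumption added for illustration: it takes the available fuel to be proportional to mass and the rate of use to be proportional to luminosity, so that lifetime scales as M/L. The function names are hypothetical.

```python
def luminosity_solar(mass_solar, exponent=4.0):
    """Luminosity in solar units for a main sequence star of the given mass (L ∝ M^4)."""
    return mass_solar ** exponent


def relative_lifetime(mass_solar, exponent=4.0):
    """Lifetime relative to the Sun, assuming lifetime ∝ fuel / burn rate = M / L."""
    return mass_solar / luminosity_solar(mass_solar, exponent)


for mass in (0.5, 1.0, 2.0, 10.0):
    print(f"{mass:>4} solar masses: L = {luminosity_solar(mass):g} L_sun, "
          f"lifetime = {relative_lifetime(mass):g} x Sun")
```

On these assumptions a star of two solar masses is 16 times as luminous as the Sun but has only twice the fuel, so it lasts about one-eighth as long.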

Figure 16.8 The main sequence moves from dim red dwarfs to bright blue giants. Moving up the main sequence luminosity increases, mass increases and length of lifetime decreases. (The diagram plots luminosity, from low to high, against temperature, from high to low. Blue giants, at the top of the sequence, have very high luminosity, high mass and short lives; red dwarfs, at the bottom, have very low luminosity, low mass and long lives. Arrows along the sequence are labelled ‘Mass increases’, ‘Luminosity increases greatly’ and ‘Lifetime shortens’.)

PHYSICS IN FOCUS
Naming stars
Johann Bayer (1572–1625), a German astronomer, introduced the system of naming stars in 1603. In this system a star’s name is a letter (or letter combination) followed by the Latin version of the name of the constellation within which it lies. The letters are assigned in order of brightness, beginning with the letters of the Greek alphabet, followed by lower-case letters, followed by upper-case letters. Hence, Alpha Centauri is the brightest star in the constellation of Centaurus, and Delta Cephei is the fourth brightest star in the constellation of Cepheus.
In 1862 the system was modified for variable stars, whereby the letters R to Z were reserved for this purpose, and when these were exhausted in a particular constellation then two letter combinations were used, beginning with RR, RS, RT, and so on, down to ZZ. Two well known examples are T Tauri and RR Lyrae.



16.2 VARIABLES
These are stars that vary in brightness with time. There is a variety of types of variable stars, along with a variety of apparent causes. Approximately 30 000 variable stars have been recorded to date. To put some order to this assortment of stars, they can be classified using the system shown in figure 16.9. Each type shown in this figure is discussed in the following sections.
The initial step in this system is to classify the stars as either extrinsic or
intrinsic variables.

Variable stars
  Extrinsic
    Eclipsing binaries
    Rotating variables
  Intrinsic
    Non-periodic
      Supernovae
      Novae
      Flare stars
      R Coronae Borealis
      T Tauri
    Periodic
      Mira
      RV Tauri
      Cepheids
        Type I (classical)
        Type II (W Virginis)
      RR Lyrae

Figure 16.9 Classification of variable stars

Extrinsic variables
With extrinsic variables the variation in brightness is due to some process
external to the star. This includes eclipsing binaries, already discussed in
this chapter, as well as rotating variables. The latter group are stars with
hotter or cooler areas on their surface that move in and out of view as the
star rotates, thereby altering its brightness. Extrinsic variables are
summarised in table 16.1.

Table 16.1 Extrinsic variables

TYPE | DESCRIPTION
Eclipsing binaries | Orbiting stars periodically eclipse each other.
Rotating variables | Large cool/hot spots change star’s brightness as it rotates.

Figure 16.10 Locations of specific variables on an H–R diagram (absolute magnitude, from −6 to 10, plotted against spectral type, from B0 to M0; marked on the diagram are the main sequence, the Sun, the Cepheid instability strip, Type I (classical) Cepheids, Type II (W Virginis) Cepheids, RV Tauri stars, RR Lyrae stars, Mira stars, dwarf Cepheids, T Tauri stars and flare stars)

Intrinsic variables
The brightness variation in this case is due to changes within the star
itself. Many of the intrinsic variable stars occupy specific locations on an
H–R diagram, and these are shown in figure 16.10. Intrinsic variables are
further classified as non-periodic or periodic variables.

Non-periodic variables
These intrinsic variables show irregular variations in brightness. This
group shows a variety of types, summarised in table 16.2. They include
supernovae, novae, flare stars, R Coronae Borealis, and T Tauri stars.

Table 16.2 Types of non-periodic variables

TYPE | BRIGHTNESS VARIATION | DESCRIPTION
Supernovae | Temporary increase to M < −15 before fading away | A violent explosion destroying the star, leaving behind a compact, high density object (neutron star or black hole) and an expanding shell of gas.
Novae | Sudden increase of about 10 magnitudes before returning to normal | A close binary pair in which hydrogen-rich material is drawn from one star to the other, a white dwarf. Eventually enough material accumulates to react, creating the nova explosion. The star returns to normal though a shell of gas may have been ejected.
Flare stars (UV Ceti stars) | Sudden increase >2 magnitudes, returning to normal within an hour | Red dwarfs which experience intense outbursts of energy from small areas of their surface.
R Coronae Borealis | Sudden decrease of about 4 magnitudes, slowly fluctuating back to normal | Supergiant stars rich in carbon which periodically accumulates in the outer atmosphere, strongly absorbing light, before being blown away.
T Tauri | Irregular, unpredictable variations. Light is usually obscured by gas cloud, necessitating observations in the infra-red. | Young protostars still contracting from the gas cloud in which they lie. They are rotating rapidly and losing mass. Light variation is due to this activity in the outer layers.

PHYSICS FACT

In an extreme version of the nova process described in table 16.2, a white dwarf can accumulate too much mass from a companion star, producing a runaway reaction that leads to the star’s destruction in a ‘type I’ supernova.

Periodic variables
Periodic variables display a regular pattern of brightness variation. The
various types of periodic variables can be characterised by their light
curve parameters as shown in figure 16.11.

Figure 16.11 A generic light curve for a periodic variable, showing the parameters used for description (magnitude plotted against time, with the period, the amplitude and the median light marked)

These stars are also known as pulsating variables. The regular variation in brightness is, in general, due to a disequilibrium between the two forces that act upon a star to determine its size: the gravitational force and the radiation pressure, as represented in figure 16.12. This can occur whenever there is a change in radiation pressure.

Figure 16.12 The two forces that determine a star’s size are its radiation pressure (from the energy source within) pushing outwards, and its own gravitation (due to the mass of the star) pulling inwards. When these two forces are in disequilibrium the star will pulsate in size, temperature, luminosity and brightness.

Included in this category are Mira variables, RV Tauri variables, Cepheids and RR Lyrae variables. The properties of each are summarised in table 16.3.

Table 16.3 Periodic (pulsating) variables

TYPE | PERIOD (DAYS) | AMPLITUDE (MAGNITUDES) | MEDIAN LIGHT (MAGNITUDE) | COMMENT
Mira | 80–1000 | 2.5–10 | No typical value | Long period, pulsating red giants and supergiants.
RV Tauri | 20–150 | No typical value | No typical value | Yellow supergiants. Alternating deep and shallow minima on light curve.
Cepheid | 1–50 | 0.1 to 2 | −1.5 to 5 | Very luminous yellow supergiants. Type I (young) and Type II (older).
RR Lyrae | <1 | <2 | 0 to +1 | Old giants. Always have M ≈ +0.6.

Cepheid and RR Lyrae variables are of particular importance to astronomers as they offer another means of distance measurement.

Period–luminosity relationship
From 1908 to 1912, Henrietta Leavitt studied Cepheids located in the Small Magellanic Cloud, which is one of two small irregular galaxies close to our own (the other is the Large Magellanic Cloud). The Cepheids she studied are therefore all at a similar distance from us. Cepheid variables can be recognised by their characteristic light curve, which displays a sharp increase in brightness followed by a slower decrease to complete the oscillation, as shown in figure 16.13. Henrietta Leavitt recognised that those Cepheids with longer periods of oscillation were also, on average, more luminous. This is now known as the period–luminosity relationship.

Figure 16.13 The typical light curve of a Cepheid variable (apparent magnitude, from 3.4 to 4.6, plotted against time in days, from 0 to 15)

It was later discovered that there are two types of Cepheids — dubbed
type I and type II. Type I (or classical) Cepheids are massive, young,
second generation stars while type II (or W Virginis stars) are small, old,
red, first generation stars.
The period–luminosity relationship for both types is shown in figure
16.14. It is this relationship that allows the distance to a Cepheid to be
calculated. The steps in the method are:
1. Establish the type of Cepheid being observed (by spectral analysis).
2. From the light curve of the Cepheid, determine the period.
3. From the period–luminosity relationship, use the period to determine
the star’s average absolute magnitude, M.
4. From direct observation, measure the star’s average apparent
magnitude, m.
5. Use the distance modulus formula $m - M = 5\log\left(\dfrac{d}{10}\right)$ to calculate the distance to the star.
This method of distance determination can be complicated by interstellar dust, which can make a star appear dimmer than it otherwise would. This will lead to a calculated distance greater than the correct value. Nevertheless, this technique has proved to be a particularly useful tool and has been used to find distances within our galaxy as well as distances to neighbouring galaxies.

Figure 16.14 The period–luminosity relationship (absolute magnitude, from 0 to −6, plotted against period in days, from 0.3 to 100; separate lines are shown for type I Cepheids, type II Cepheids and RR Lyrae stars)
PHYSICS FACT

RR Lyrae variables are more populous than Cepheids and just as useful for distance measurement. Several thousand of these old giant stars are known, and their value lies in the fact that they always have a similar average absolute magnitude of +0.6. This is also shown in figure 16.14. Once an RR Lyrae variable is recognised, a measurement of its average apparent magnitude will immediately allow a distance calculation using the distance modulus formula.

CHAPTER REVIEW

SUMMARY
• A binary star system consists of two stars in orbit about their common centre of mass.
• Binary systems have been classified into four groups: visual, eclipsing, spectroscopic and astrometric.
• A visual binary can be resolved by a telescope.
• If the period of the motion and the separation of the stars are known, then it is possible to calculate the total mass of a binary system by using a rearranged form of Kepler’s Third Law:
  $m_1 + m_2 = \dfrac{4\pi^2 r^3}{GT^2}$.
• An eclipsing binary is a system whose orbital plane is seen edge on by us, meaning that at some stage through an orbit each star eclipses the other and blocks its light.
• Eclipsing binaries demonstrate a light curve with a characteristic asymmetrical, double-minimum cycle.
• A spectroscopic binary is an unresolved pair that shows an alternating Doppler shifting of its spectral lines. This creates a regular doubling of the spectral lines.
• An astrometric binary has one star too faint to be observed; however, the visible star can be seen to have an orbital motion.
• The study of binaries has revealed the mass–luminosity relationship, which allows masses of main sequence stars to be inferred.
• Variables are stars that vary in brightness with time. They can be classified as either extrinsic or intrinsic variables.
• Extrinsic variables owe their variation in brightness to some process external to the star. Intrinsic variables owe their variation to changes within the star itself. They can be further classified as non-periodic or periodic variables.
• Non-periodic variables are intrinsic variables that show irregular variations in brightness.
• Periodic (or pulsating) variables display a regular pattern of brightness variation.
• Cepheids are a type of periodic variable that possesses a period–luminosity relationship. It allows a calculation of distance to the variable star.

QUESTIONS
Binaries
1. Define a binary star.
2. Draw and describe the orbits of a binary pair.
3. Describe the nature of a visual binary.
4. (a) State the measurements that need to be taken of a binary in order to calculate the mass of the system.
   (b) What further piece of information is required in order to calculate the masses of the individual stars?
5. Use the information in the following table to calculate the mass, in kilograms and solar masses, of the binary systems specified. Data: 1 solar mass = $1.989 \times 10^{30}$ kg.

BINARY SYSTEM | PERIOD (HOURS) | SEPARATION (km) | TOTAL MASS OF SYSTEM (kg) | TOTAL MASS OF SYSTEM (SOLAR MASSES)
a | 39.5 | $5.50 \times 10^{6}$ | |
b | 52.6 | $8.88 \times 10^{6}$ | |
c | 123 | $1.05 \times 10^{7}$ | |
d | 426 | $5.89 \times 10^{7}$ | |
e | 752 | $1.04 \times 10^{8}$ | |
f | 1150 | $5.82 \times 10^{7}$ | |
g | 3590 | $1.66 \times 10^{8}$ | |
h | 10 500 | $3.94 \times 10^{8}$ | |
i | 27 800 | $4.00 \times 10^{8}$ | |
j | 43 800 | $5.55 \times 10^{8}$ | |

6. Data required: solar mass = $1.989 \times 10^{30}$ kg; solar radius = 696 000 km
   (a) A close binary pair is observed to have a period of just 55 hours and a separation of 27.5 solar radii. Calculate the total mass of the binary system in kilograms and solar masses.
   (b) The ratio of orbital radii (distance from star to centre of mass) is observed to be 3 : 5. Calculate the individual star masses.

7. Describe the nature of an eclipsing binary.
8. (a) Construct and describe the light curve of an eclipsing binary, in which one star is small and hot while the other is large and cool.
   (b) Your sketch should show two unequal minima per cycle. Identify the primary minimum. Explain your choice and describe the positions of the stars at this point.
9. (a) Compare the primary eclipse to the secondary eclipse for an eclipsing binary.
   (b) Explain the brightness difference between the primary and the secondary eclipses.
10. Describe the information that can be learned from the light curve of an eclipsing binary.
11. Describe the nature of a spectroscopic binary.
12. Construct a diagram to explain the spectral line doubling observed with spectroscopic binaries.
13. Describe the information that can be learned from the spectrum of a spectroscopic binary.
14. Identify the most serious limitation to the analysis of spectroscopic binaries.
15. Define the nature of an astrometric binary.
16. Describe the mass–luminosity relationship and how it was discovered.
17. Describe what the mass–luminosity relationship tells us about main sequence stars.
18. Stars at the top of the main sequence are bright and therefore massive. This means that they have more fuel to ‘burn’ compared to duller, smaller stars. Explain why they should be expected to have shorter lifetimes.

Variables
19. Define a variable star.
20. Outline the system of classification used for variable stars.
21. Define an extrinsic variable and identify some examples.
22. Define an intrinsic variable.
23. Describe the nature of a non-periodic variable. Identify some examples.
24. (a) Describe the nature of a periodic variable.
    (b) Explain why these stars are also known as pulsating variables.
25. One of the most important types of periodic variable to astronomers is Cepheids.
    (a) Describe the period–luminosity relationship of Cepheids.
    (b) Explain why this relationship is important.
    (c) Describe the process of distance determination using Cepheids.
    (d) Identify when a Cepheid variable is brightest: when it is largest or smallest. Explain why. (You will need to do some extra research to discover the answer to this.)



16.1 ECLIPSING BINARIES

Aim
To model the light curve produced by an eclipsing binary using a computer simulation.

Apparatus
There are several computer programs available to perform this simulation; however, there are also several internet sites which run the simulation as a java applet. This activity utilises an applet developed at Cornell University and provided via the following weblink.

eBook plus Weblink: Eclipsing binary applet

Theory
If we have a side view of a binary system then the light received from it will vary as the stars eclipse each other. The primary eclipse is when the brighter of the two stars is eclipsed and it produces the greatest drop in light intensity received. This is clearly seen on the system’s light curve as the deepest of the two dips produced by the eclipses. The period of the motion is the time taken between successive primary eclipses, or twice the time between the primary and secondary eclipse.
Recall that Kepler’s Third Law links the period of a binary system to its total mass and star separation.

$$\frac{r^{3}}{T^{2}} = \frac{GM}{4\pi^{2}}$$

Method
Figure 16.15 shows the java applet at Cornell University. The screen shows a simulation of the orbiting stars and the light curve they produce, and a small point moves along the light curve to indicate the progress of the system. The applet provides control over the viewing angle, the star separation, and the spectral class of each star.
Notice the effect of each of the following actions on the period of the motion, being sure to press the ‘enter values’ button after each alteration.
1. Increasing and decreasing the separation of the stars.
2. Altering the star types. The applet begins with stars that are similar but not identical. Try making the stars the same (both O, both F, both M), and quite different (one O and one M, one B and one K).
Finally, note that the light curve will alter if the system is not viewed exactly from the side. Try varying the viewing angle, and note the change in the light curve.

Figure 16.15 A simulation of an eclipsing binary system


PRACTICAL ACTIVITIES

Questions
1. How did the period change when the star separation was:
   (a) increased
   (b) decreased?
2. Describe and draw the light curve produced when the stars are:
   (a) both large
   (b) both average mass
   (c) both small
   (d) very different
   (e) slightly different.

16.2 SPECTROSCOPIC BINARIES

Aim
To model the spectral line doubling observed from a spectroscopic binary.

Apparatus
Internet access. This activity utilises a java applet developed at Cornell University and provided via the following weblink.

eBook plus Weblink: Spectroscopic binary applet

Background information
Refer to figure 16.6. An unresolved close binary can produce regular doubling of its spectral lines as the two stars move towards and away from us. The cause of the line shifting is the Doppler effect. Regular observations can determine the period as well as the radial velocities of the two stars. This can lead to a knowledge of the mass of the binary system.

Method
Figure 16.16 shows the java applet at Cornell University. The screen shows a simulation of the orbiting stars, a graph of their velocities relative to us, and their spectral lines. As the stars orbit, a small point moves along the graph to indicate the progress of the system, and the spectral lines periodically double. The applet provides control over the masses of each star (M1, M2), the star separation, a, the inclination, i, of the orbit to us, as well as other parameters.
Notice the effect of each of the following actions on the spectral line doubling, being sure to press the ‘enter’ button after each alteration.
1. Vary the masses M1 and M2 so that they are similar but high, similar but low, and quite different.
2. Vary the star separation a, increasing as well as decreasing it.
3. Vary the angle of inclination i between 90 and 0 degrees.

Figure 16.16 A simulation of a spectroscopic binary system

Questions
1. How did the doubling change when the star masses were:
   (a) similar but high
   (b) similar but low
   (c) quite different?
2. What is radial velocity?
3. What effect does varying the star separation have on:
   (a) the radial velocities of the stars
   (b) the doubling of the spectral lines?
4. Describe the effect of a combination of high values for both M1 and M2 as well as a low star separation.
5. What happens to the line doubling effect when the angle of inclination is reduced?
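
The Doppler explanation of line doubling given in the background information for activity 16.2 can be put into numbers with a small sketch. It assumes the low-speed Doppler formula Δλ/λ = v/c and that only the line-of-sight component of each star’s speed, v sin i, shifts the line. The function name, the chosen rest wavelength (656.3 nm, the hydrogen-alpha line) and the speeds are illustrative assumptions, not values from the applet.

```python
import math

C_KM_S = 299_792.458  # speed of light, km/s


def line_splitting_nm(rest_wavelength_nm, v1_km_s, v2_km_s, inclination_deg=90.0):
    """Separation between the two Doppler-shifted copies of one spectral line.

    v1 is the orbital speed of the star moving towards us and v2 the speed
    of the star moving away, at the moment of greatest doubling.
    Only the component along the line of sight, v * sin(i), shifts the line.
    """
    sin_i = math.sin(math.radians(inclination_deg))
    approaching = rest_wavelength_nm * (1 - v1_km_s * sin_i / C_KM_S)  # blue shifted
    receding = rest_wavelength_nm * (1 + v2_km_s * sin_i / C_KM_S)     # red shifted
    return receding - approaching


# Assumed speeds of 100 km/s and 50 km/s, viewed at three inclinations.
for inclination in (90, 30, 0):
    split = line_splitting_nm(656.3, 100, 50, inclination)
    print(f"i = {inclination:2d} degrees: splitting = {split:.3f} nm")
```

In this simple model the doubling shrinks as the inclination is reduced and vanishes when the orbit is seen face-on, since neither star then moves towards or away from the observer.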


CHAPTER 17 STAR LIVES
Remember
Before beginning this chapter, you should be able to:
• describe the nature of the force of gravity
• interpret a written nuclear equation, including the
various symbols used
• interpret a Hertzsprung–Russell diagram.

Key content
At the end of this chapter you should be able to:
• describe the processes involved in stellar formation
• outline the key stages in a star’s life
• describe the nuclear processes within a star that
correspond to the stages outlined in the above point
• discuss the synthesis of elements inside stars,
including the heavier elements on the periodic table
• explain how the age of a globular cluster can be
determined from its turn-off point on an H–R
diagram
• explain the concept of star death in relation to
planetary nebulae, supernovae, white dwarfs,
neutron stars/pulsars and black holes
• draw the evolutionary tracks of stars of 1, 5 and 10
solar masses, on an H–R diagram.

Figure 17.1 The Great Nebula in Orion is one of the very few nebulae that can be seen with a pair of binoculars. It is a region of hydrogen, ionised by the hot stars within it. The dark regions contain dust, which obscures background light. Behind the nebula lies a large molecular cloud, which is a site of star formation.

In this chapter we are going to discuss the life of a star. Of course, a star
is not really alive — this is simply a useful analogy. Stars form from giant
interstellar clouds, they experience a long period of activity and then,
when their fuel is spent, they shut down. We will refer to these periods as
the star’s birth, lifetime and death, respectively. Each is spectacular in its
own way.

17.1 STAR BIRTH


The interstellar medium
Many people think that interstellar space, the space between the stars, is
a vacuum but this is not so. This space is filled with a sparse and irregular
gas as well as even more sparse grains of dust. Stars are born out of this
interstellar medium, so we shall examine it in more detail.
The interstellar gas is mostly hydrogen with some helium and traces of other elements. It covers broad regions in the form of neutral atoms, charged ions or molecules, and is distributed either in clouds (‘nebulae’) or as intercloud gas. The hotter regions of ionised hydrogen are easily observed as nebulae around hot stars, such as the Great Nebula in Orion shown in figure 17.1. These regions absorb ultraviolet radiation given off by the stars and re-emit it at visible wavelengths. The neutral hydrogen, mostly concentrated in the plane of the galaxy, has been detected only more recently using radio telescopes, since it gives off radiation with a wavelength of 21 cm.
The interstellar medium consists of gas and dust.
The interstellar gas occurs as regions of neutral atoms, ions or molecules. It is mostly hydrogen.
Most of the molecular gas is located in enormous cold clouds. By far the most common molecule in these clouds is molecular hydrogen (about half the hydrogen in our galaxy appears to be in this form), but other molecules have been detected. These include water, ammonia, carbon monoxide, methane, ethanol and other carbon-based molecules. Occasionally these molecules collide with each other, become excited and radiate at ultraviolet, visible and infra-red wavelengths, but especially at millimetre (radio) wavelengths.
The molecular clouds are several tens of light years across, have densities of several billion molecules per cubic metre, and masses of about a thousand solar masses. Further, these clouds are arranged in huge cloud complexes, and play a vital role in the process of star birth.
The interstellar dust is much more tenuous, with just one grain per cubic metre on average. It is thought to be formed in the outer atmosphere of cool supergiant stars before being blown away by the star’s stellar wind. Each grain of dust is thought to be composed of a core and a mantle, with the core consisting of silicates (or iron or graphite) and the mantle made up of a mixture of ices (water, carbon dioxide, methane, ammonia). This structure is represented in figure 17.2.
The interstellar dust is made of grains of silicates and ices in a core and mantle structure, just one micrometre across.
Dust clouds can be detected because they affect any light trying to pass through them, by a reddening or extinction of the light. The reddening effect is caused by the scattering of the bluer wavelengths and is similar to the reddening of the Sun’s light that we see at sunset. Extinction refers to the dimming or complete blocking of light, and is caused by scattering as well as absorption of the light. An interstellar cloud with sufficient dust to completely block the light of stars or a nebula behind it is called a dark nebula. A good example is the Horsehead Nebula shown in figure 17.3 on the following page.

Figure 17.2 A model for the structure of an interstellar dust grain (labels: core, probably silicates; mantle, ices; possible thin surface; overall size 1 micrometre)

CHAPTER 17 STAR LIVES 321


Figure 17.3 The Horsehead Nebula

In the interstellar medium, dust and molecules are associated. It appears that the dust grains act as a site of molecule formation. Consequently, molecular clouds invariably contain plenty of dust. Therein lies a problem for astronomers, because molecular clouds are where stars are born but the dust blocks our view and prevents us seeing the process of star birth, at least in visible light. Infra-red and radio wave radiation penetrates the dust so that infra-red and radio telescopes can reveal what optical telescopes cannot. As we saw in chapter 14 (page 268), NASA’s space infra-red telescope, the Spitzer Space Telescope, is designed for just this purpose.

Gravitational collapse
A molecular cloud that is sufficiently cool and massive will contract under its own gravity. It begins slowly but, as it draws itself in, the gravitational freefall speeds up. The density increases more quickly at its centre and, being denser, it experiences greater gravity and contracts even faster. The cloud now has two parts: a rapidly contracting core and the slower contracting surroundings.

As the core contracts, the gravitational potential energy of its gas particles is being converted to kinetic energy, so that it heats up. This heat creates an outwards pressure that works against the gravitational collapse, but only slightly at first. As the core gets hotter and hotter, this pressure builds, slowing and eventually stopping the collapse and stabilising the size of the core, which is then called a protostar (shown in figure 17.4). This process takes approximately one million years.
A protostar is a new star before it begins to produce any nuclear energy in its core.
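
As a rough plausibility check on the collapse timescale of about one million years, the sketch below uses the standard free-fall time for a uniform, pressure-free cloud, t = √(3π/(32Gρ)). This formula is not given in the text and ignores the pressure that slows the real collapse, so it is an assumption introduced purely for illustration. The density of 3 × 10^9 hydrogen molecules per cubic metre is an assumed reading of ‘several billion molecules per cubic metre’.

```python
import math

G = 6.674e-11               # gravitational constant, N m^2 kg^-2
H2_MASS = 2 * 1.6735e-27    # mass of one hydrogen molecule, kg
SECONDS_PER_YEAR = 3.156e7


def free_fall_time_years(molecules_per_m3):
    """Free-fall collapse time of a uniform cloud of molecular hydrogen."""
    density = molecules_per_m3 * H2_MASS                    # kg per cubic metre
    seconds = math.sqrt(3 * math.pi / (32 * G * density))
    return seconds / SECONDS_PER_YEAR


# Assumed density: 3e9 molecules per cubic metre.
print(f"Free-fall time: {free_fall_time_years(3e9):.2e} years")
```

Under these assumptions the result is a little under a million years, the same order of magnitude as the figure quoted above. Denser clouds collapse faster, because the time varies as one over the square root of the density.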

Figure 17.4 The formation of a protostar (labels: accreting protostar; mass continues to fall in upon the protostar; surrounding cloud of gas and dust; the dust obscures events)

The protostar is hidden from our view because the surrounding molecular
cloud contains obscuring dust. However, this surrounding material is still
contracting and continues to rain in upon the protostar. The protostar slowly
increases its mass by this accretion. It then begins to behave as a T Tauri
variable, developing strong stellar winds that blow away the remnants of the
surrounding cloud. Finally, we can see the forming star with visible light.
With no source of energy, the protostar begins a slow shrinkage. This decrease in size causes it to become less luminous but also heats its core further. Eventually the core may reach a temperature high enough to trigger the nuclear fusion of the hydrogen within it (approximately eight million kelvin). This new long-lasting energy source stabilises the star. It is now a zero-age main sequence star, a smaller and less luminous but more stable object than the protostar it once was. Its mass is somewhere between about 0.08 and 100 solar masses. Were it smaller than this, the protostar would not have heated sufficiently to begin nuclear fusion; if it were larger than this, the protostar would have overheated and blown itself apart.
Note that a plot of the main sequence using only zero-age stars is referred to as the zero-age main sequence (ZAMS). It forms the complete diagonal, main sequence shape shown on most generic H–R diagrams.
The zero-age main sequence (ZAMS) is a plot of the main sequence using only zero-age stars.
So far this discussion has focused on the formation of a single star. However, the portion of cloud that has suffered gravitational collapse is usually of several solar masses and is spinning as well. As it contracts, conservation of the angular momentum of the spinning cloud makes it spin faster, and this causes the cloud to fragment into smaller spinning parts, so that a group of stars is formed. Each smaller part is also spinning, which eventually causes further fragmentation and leads to systems of planets around the stars.



PHYSICS FACT

Angular momentum is the rotational equivalent of linear momentum. When a spinning object becomes smaller (not less massive), it must spin faster in order to conserve its angular momentum. This idea is used by divers, gymnasts and figure skaters who spin faster when they tuck in their arms or legs.
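
The spin-up described in this Physics Fact can be estimated with a minimal sketch. It assumes the contracting object behaves as a uniform sphere of constant mass, for which the moment of inertia is I = (2/5)MR² and the angular momentum is L = Iω. Real clouds are neither uniform nor rigid, so the function below (a hypothetical helper, not from the text) shows only the scaling.

```python
def spin_rate_after_contraction(omega_initial, radius_initial, radius_final):
    """New spin rate of a uniform sphere that shrinks at constant mass.

    Conserving L = (2/5) M R^2 * omega gives
    omega_final = omega_initial * (R_initial / R_final)^2.
    """
    return omega_initial * (radius_initial / radius_final) ** 2


# Halving the radius makes the sphere spin four times faster.
print(spin_rate_after_contraction(1.0, 2.0, 1.0))
# Shrinking to one hundredth of the radius gives a 10 000-fold spin-up.
print(spin_rate_after_contraction(1.0, 100.0, 1.0))
```

Because the spin rate depends on the square of the radius ratio, even a modest contraction produces a large increase in rotation.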

The process of star birth described can be traced on an H–R diagram, but the pathway is slightly different for different mass stars, as shown in figure 17.5. For a star of approximately one solar mass, the process takes about 50 million years.

Figure 17.5 The pathways of stellar birth on an H–R diagram (luminosity L in units of LSun against temperature T in K; tracks for stars of 0.1, 1, 5 and 10 solar masses)

17.2 MAIN SEQUENCE STAR LIFE


A main sequence star is characterised by the fusion of hydrogen to helium in its core, surrounded by unused, or non-reacting, hydrogen layers. The stability of the star comes from the equilibrium it has achieved, both hydrostatic and thermal. The hydrostatic equilibrium is the balance between the outward radiation pressure and the inward gravitational force, as represented in figure 16.12. The thermal equilibrium refers to the balance between the rate at which energy is produced in the core of the star and the rate at which energy is radiated away from the surface.
A main sequence star is characterised by the fusion of hydrogen to helium in its core.

Hydrogen ‘burning’
The source of the energy in the core of the star is the fusion of hydrogen. This is commonly referred to as ‘hydrogen burning’, although it must be remembered that this process is not the chemical reaction of combustion. Rather, it is the joining, under high temperatures, of hydrogen nuclei to form helium nuclei. This nuclear reaction results in a slight decrease in mass, and the lost mass is transformed into the energy released in accordance with Einstein’s equation, $E = mc^{2}$. The net reaction can be written as follows:

$$4\,{}^{1}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + 2e^{+} + 2\nu$$

where
$e^{+}$ = positron (positive electron)
$\nu$ = neutrino (small, chargeless particle with almost no mass).
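
The ‘slight decrease in mass’ can be estimated from atomic masses. The sketch below is an illustrative calculation, not part of the original text: it uses standard rounded atomic masses for hydrogen-1 and helium-4 and the conversion 1 u = 931.494 MeV. Using atomic rather than nuclear masses means the energy released when the positrons annihilate is included, as is the small share carried off by the neutrinos.

```python
M_H1 = 1.00782503      # atomic mass of hydrogen-1, u
M_HE4 = 4.00260325     # atomic mass of helium-4, u
U_TO_MEV = 931.494     # energy equivalent of 1 u, MeV
C = 2.998e8            # speed of light, m/s


def hydrogen_fusion_summary():
    """Energy released when four hydrogens become one helium."""
    mass_in = 4 * M_H1
    mass_lost = mass_in - M_HE4            # u
    energy_mev = mass_lost * U_TO_MEV      # E = m c^2, expressed in MeV
    fraction = mass_lost / mass_in         # share of the fuel mass converted
    joules_per_kg = fraction * C ** 2      # energy per kilogram of hydrogen
    return energy_mev, fraction, joules_per_kg


energy, fraction, per_kg = hydrogen_fusion_summary()
print(f"Energy per helium nucleus formed: {energy:.2f} MeV")
print(f"Fraction of mass converted: {fraction:.4%}")
print(f"Energy per kilogram of hydrogen: {per_kg:.2e} J")
```

On these figures well under one per cent of the fuel mass is converted to energy, which is why the decrease in mass is described as slight.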

While this is the net reaction in all main sequence stars, there are two different mechanisms to achieve it.

The proton–proton chain
The first is known as the proton–proton (p–p) chain, and it is the first to start in all stars when they reach the main sequence. This process begins with the following two reactions:

$${}^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{2}_{1}\mathrm{H} + e^{+} + \nu$$
$${}^{1}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He}$$

where
${}^{2}_{1}\mathrm{H}$ = deuterium (heavy hydrogen)
${}^{3}_{2}\mathrm{He}$ = light helium.

The proton–proton (p–p) chain is the hydrogen fusion mechanism that is first to occur in main sequence stars.

These two reactions must both occur twice before the final reaction can take place:

$${}^{3}_{2}\mathrm{He} + {}^{3}_{2}\mathrm{He} \rightarrow {}^{4}_{2}\mathrm{He} + 2\,{}^{1}_{1}\mathrm{H}$$

Note that six ${}^{1}_{1}\mathrm{H}$ nuclei go into the reactions but two are returned so that the net reaction has four hydrogens combining to produce a helium. In figure 17.6 we have tried to represent this chain in a flow diagram.
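
The bookkeeping of the proton–proton chain can be checked mechanically. The following sketch is an illustration only: the particle labels and helper functions are invented for the example. It confirms that each step conserves mass number and charge, then adds the steps (the first two counted twice) so that the intermediate nuclei cancel and the net reaction remains. Photons carry neither mass number nor charge, so they are left out.

```python
from collections import Counter

# (mass number A, charge Z) for each particle in the chain.
PARTICLES = {
    "H-1": (1, 1), "H-2": (2, 1), "He-3": (3, 2), "He-4": (4, 2),
    "e+": (0, 1), "nu": (0, 0),
}

# (times the step occurs, reactants, products)
PP_CHAIN = [
    (2, Counter({"H-1": 2}), Counter({"H-2": 1, "e+": 1, "nu": 1})),
    (2, Counter({"H-1": 1, "H-2": 1}), Counter({"He-3": 1})),
    (1, Counter({"He-3": 2}), Counter({"He-4": 1, "H-1": 2})),
]


def totals(side):
    """Total mass number and total charge of one side of a reaction."""
    a = sum(PARTICLES[p][0] * n for p, n in side.items())
    z = sum(PARTICLES[p][1] * n for p, n in side.items())
    return a, z


def is_balanced(reactants, products):
    return totals(reactants) == totals(products)


def net_reaction(steps):
    """Add all steps together and cancel anything made and then used up."""
    consumed, produced = Counter(), Counter()
    for times, reactants, products in steps:
        for particle, n in reactants.items():
            consumed[particle] += times * n
        for particle, n in products.items():
            produced[particle] += times * n
    # Counter subtraction keeps only positive counts, so intermediates vanish.
    return consumed - produced, produced - consumed


print(all(is_balanced(r, p) for _, r, p in PP_CHAIN))
fuel, products = net_reaction(PP_CHAIN)
print(dict(fuel), "->", dict(products))
```

The cancellation step reproduces the statement above: six hydrogen nuclei are used, two are returned, and four are consumed overall.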

Figure 17.6 The proton–proton chain



The CNO cycle
There is another mechanism present, known as the carbon–nitrogen–oxygen (CNO) cycle, although in smaller, cooler stars it does not produce much energy. However, in stars more massive than the Sun the core temperature exceeds 1.6 × 10^7 K, and at approximately this temperature the CNO cycle becomes the dominant process.
The CNO cycle is the hydrogen fusion mechanism that dominates in hotter main sequence stars.
The CNO cycle is a six-stage process in which carbon acts as a catalyst. That is, a ${}^{12}_{6}\mathrm{C}$ nucleus goes into the first reaction and is returned in the last. The reactions are listed below in the order they occur, and are drawn in figure 17.7 as a cycle. Note that the net reaction is still that four hydrogens combine to produce a helium.

$${}^{1}_{1}\mathrm{H} + {}^{12}_{6}\mathrm{C} \rightarrow {}^{13}_{7}\mathrm{N}$$
$${}^{13}_{7}\mathrm{N} \rightarrow {}^{13}_{6}\mathrm{C} + e^{+} + \nu$$
$${}^{1}_{1}\mathrm{H} + {}^{13}_{6}\mathrm{C} \rightarrow {}^{14}_{7}\mathrm{N}$$
$${}^{1}_{1}\mathrm{H} + {}^{14}_{7}\mathrm{N} \rightarrow {}^{15}_{8}\mathrm{O}$$
$${}^{15}_{8}\mathrm{O} \rightarrow {}^{15}_{7}\mathrm{N} + e^{+} + \nu$$
$${}^{1}_{1}\mathrm{H} + {}^{15}_{7}\mathrm{N} \rightarrow {}^{12}_{6}\mathrm{C} + {}^{4}_{2}\mathrm{He}$$
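
The catalytic role of carbon can be traced step by step. The sketch below is a simplified illustration with invented function names: each nucleus is held as a pair (mass number, atomic number), a proton capture adds one to both, a positron decay lowers the atomic number by one, and the final step captures a proton and releases a helium nucleus.

```python
SYMBOLS = {6: "C", 7: "N", 8: "O"}


def proton_capture(a, z):
    return a + 1, z + 1


def beta_plus_decay(a, z):
    # A proton in the nucleus becomes a neutron; a positron and neutrino leave.
    return a, z - 1


def proton_capture_alpha_emission(a, z):
    # Gain one proton (A+1, Z+1), then lose a helium-4 nucleus (A-4, Z-2).
    return a + 1 - 4, z + 1 - 2


# The six stages of the CNO cycle, in the order listed above.
CNO_STEPS = [
    proton_capture, beta_plus_decay, proton_capture,
    proton_capture, beta_plus_decay, proton_capture_alpha_emission,
]
PROTON_STEPS = (proton_capture, proton_capture_alpha_emission)


def run_cno_cycle(start=(12, 6)):
    """Follow one catalyst nucleus through the whole cycle."""
    nucleus = start
    path = [nucleus]
    for step in CNO_STEPS:
        nucleus = step(*nucleus)
        path.append(nucleus)
    return path


def label(nucleus):
    a, z = nucleus
    return f"{SYMBOLS[z]}-{a}"


def protons_consumed():
    return sum(step in PROTON_STEPS for step in CNO_STEPS)


path = run_cno_cycle()
print(" -> ".join(label(n) for n in path))
print("Protons consumed per cycle:", protons_consumed())
```

The path ends on the same carbon-12 nucleus it started with, while four protons have been absorbed and one helium nucleus released, matching the net reaction.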

Figure 17.7 The CNO cycle

Both of the above mechanisms, the proton–proton chain and the CNO
cycle, can occur simultaneously within a star. However, in less massive,
cooler stars the p–p chain is dominant, while in more massive, hotter
stars the CNO cycle dominates. This is represented in figure 17.8 on the
following page.

The helium produced by hydrogen burning collects at the centre of the star, since it is denser than the hydrogen. Here it accumulates, building a store of material that will become the star’s next energy source when its hydrogen supply finally runs down. How long this will take depends upon the mass of the star. A star of about one solar mass will spend approximately 10 billion years burning hydrogen on the main sequence, whereas a high mass star will have a lifetime of just a few million years. Many low mass stars have expected lifetimes longer than the present age of the universe. (We discussed the relationship between a star’s mass and its lifetime in the previous chapter.)
Whatever the star’s mass, it does eventually run out of hydrogen fuel in its core. At this point, the star’s life on the main sequence is over and it is about to become a red giant.

Figure 17.8 The lower end of the main sequence is populated by smaller, cooler stars in which the proton–proton chain dominates. In the more massive, hotter stars in the upper main sequence the CNO cycle dominates. (Axes: luminosity L in units of LSun against surface temperature in K; the p–p chain and CNO cycle regions meet at a core temperature of 1.6 × 10^7 K.)
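
The dependence of main sequence lifetime on mass can be sketched with a simple scaling. The fuel available is roughly proportional to the mass M, and the rate of use is the luminosity L, so the lifetime scales as M/L. The sketch assumes a mass–luminosity relationship of the form L ∝ M^3.5; this exponent is a commonly quoted approximation and an assumption here, not a value from the text. The result is anchored to the 10 billion years quoted above for one solar mass.

```python
SOLAR_LIFETIME_YEARS = 1.0e10   # main sequence lifetime of a one solar mass star


def main_sequence_lifetime_years(mass_solar, luminosity_exponent=3.5):
    """Approximate lifetime from t proportional to M / L, with L ~ M^exponent."""
    return SOLAR_LIFETIME_YEARS * mass_solar ** (1 - luminosity_exponent)


for mass in (0.5, 1, 5, 10, 30):
    years = main_sequence_lifetime_years(mass)
    print(f"{mass:>4} solar masses: about {years:.1e} years")
```

With this assumed exponent a star of 10 solar masses lasts only a few tens of millions of years, while a star of half a solar mass outlasts the present age of the universe.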

17.3 STAR LIFE AFTER THE MAIN SEQUENCE

A red giant is characterised by a helium-burning core surrounded by a hydrogen-burning shell. Let us look at how this structure develops, as represented in figure 17.9.
A red giant is a star characterised by a helium-burning core surrounded by a hydrogen-burning shell.

Figure 17.9 The developing layers within a star during the transition from main sequence to red giant (stages shown: main sequence star with a core of fusing H inside non-reacting H; core H runs out at the end of main sequence life, leaving non-reacting He; the core contracts and shell burning begins around hot, non-reacting He; the shell expands to form a red giant; core He burning begins, giving a red giant that is a possible pulsating variable)



At the end of its main sequence life, a star’s core is almost completely non-reacting helium. It is surrounded by a shell of unused hydrogen. With no radiation pressure to support the core, it begins to contract under gravity. This contraction heats up the core and also the shell surrounding it. Hydrogen fusion to helium begins in the shell, producing energy and increasing the luminosity of the star. Under its own radiation pressure, the shell begins to expand and this expansion causes the surface temperature to decrease. The shell expansion continues until it is enormous and the surface temperature is comparatively cool. The star is now a red giant and on an H–R diagram it has shifted up and to the right of its former main sequence position.
Meanwhile, the non-reacting helium core has been heating up. If the star is less than approximately 0.5 solar masses, it is doubtful that it will ever reach sufficient temperature for anything further to ignite. Such small mass stars are already near the end of their life.
If the star’s mass is higher, the core will reach a sufficient temperature for helium fusion to begin. High mass stars have hotter cores, which achieve this ignition smoothly. However, in the core of stars of intermediate mass the helium fusion begins very suddenly. This is referred to as a helium flash. The star adjusts to the new energy source by reducing its radius and luminosity slightly, moving down and to the left on the H–R diagram, towards the Cepheid instability strip. It is at this point that the hydrogen-burning shell can become sufficiently unstable to cause the star to pulsate as a periodic variable, driven by the changed radiation pressure within.
A helium flash is the sudden onset of helium fusion in the core of a new red giant.
The transition of stars from the main sequence across to the red giants, as represented on an H–R diagram, has been summarised in figure 17.10. Note that very massive stars move across to the supergiants rather than the red giants.

Figure 17.10 The transition from main sequence to giant, as represented on an H–R diagram (luminosity L in units of LSun against temperature T in K; tracks for stars of 0.1, 1, 5 and 10 solar masses, marked with shell burning and core helium burning stages)

PHYSICS IN FOCUS
Evidence for the main sequence to red giant transition

Earlier in this chapter we discussed how protostars contract out of giant molecular clouds. These clouds have sufficient mass to form many hundreds of thousands of stars, and often stars will be formed in clusters. There are two distinct types of clusters that can be observed: open (or galactic) clusters and globular clusters. Examples of both are shown in figure 17.11.
One obvious difference between these two types is that globular clusters contain many more stars; however, a less obvious difference is that open clusters contain spectral class O and B stars, whereas globular clusters do not. Recall that these hot, massive stars have short lifetimes. This means that open clusters are younger than globular clusters.

Figure 17.11 (a) An open cluster (the Pleiades) and (b) a globular cluster (M3)


Figure 17.12 H–R diagrams of (a) the nearest and brightest stars, (b) an open cluster such as the Pleiades, and (c) a globular cluster such as M3. Notice that in (c) the top of the main sequence is missing as these stars have evolved and shifted to the red giant zone.

This difference is revealed also in an H–R plot of the stars within a cluster. Look first at figure 17.12(a), which is an H–R plot of a sampling of nearest and brightest stars. This represents a random sampling of star types and, not unexpectedly, each of the prominent star groups is represented. However, clusters are not a random sample, because all the stars within a cluster were formed at much the same time and so they are all of approximately the same age.
Look now at figure 17.12(b). If the stars within an open cluster are catalogued and plotted on an H–R diagram, it looks like this. We can see that they occupy almost the entire zero-age main sequence. If the same exercise is then performed with the stars within a globular cluster, it looks like figure 17.12(c), and this time the top of the main sequence is missing. However, there are now stars occupying the red giant region of the H–R diagram. This indicates that the missing stars have already moved on to become red giants and have shifted to the right.
The highest remaining point of the main sequence group is called the turn-off point. If this exercise is repeated for many clusters and a composite H–R diagram is constructed, it looks like figure 17.13. Notice that each cluster shows a different turn-off point, and this can be used to infer the age of a cluster.
As a cluster ages, the main sequence shortens from the top as the stars progressively evolve into red giants in order of their mass. The result is that the position of the turn-off point acts as an indication of a cluster’s age. Using this method, the oldest clusters appear to be almost as old as the universe (12–15 billion years) while the Pleiades is estimated to be just 100 million years old.

Figure 17.13 A composite H–R diagram of several clusters. Note that the lower the turn-off point, the older the cluster. (Axes: luminosity relative to the Sun against temperature in K; clusters plotted against the ZAMS: Pleiades, M3, M41, M11, Hyades and M67, with the Sun marked for reference.)
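
The idea of reading a cluster’s age from its turn-off point can be sketched numerically. The example below is a rough illustration under stated assumptions, not the method astronomers actually use: it assumes the mass–luminosity relationship L ∝ M^3.5 and a main sequence lifetime proportional to M/L, scaled to 10 billion years for a star like the Sun. The cluster’s age is then taken to be the lifetime of a star sitting exactly at the turn-off point. The function name and the sample luminosities are hypothetical.

```python
SOLAR_LIFETIME_YEARS = 1.0e10
LUMINOSITY_EXPONENT = 3.5     # assumed mass-luminosity exponent, L ~ M^3.5


def cluster_age_from_turnoff(turnoff_luminosity_solar):
    """Estimate a cluster's age from the luminosity of its turn-off point."""
    # Invert L = M^3.5 to find the mass of a star at the turn-off point.
    turnoff_mass = turnoff_luminosity_solar ** (1 / LUMINOSITY_EXPONENT)
    # That star is just leaving the main sequence, so its lifetime is the age.
    return SOLAR_LIFETIME_YEARS * turnoff_mass / turnoff_luminosity_solar


for luminosity in (1000, 100, 10, 1):
    age = cluster_age_from_turnoff(luminosity)
    print(f"Turn-off at {luminosity:>4} LSun: age about {age:.1e} years")
```

The output follows the trend noted in figure 17.13: the lower the turn-off point on the main sequence, the older the cluster.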

The triple alpha reaction
The fusion of helium in the core of a red giant star proceeds by the process known as the triple alpha reaction. You should recall that an alpha particle is a helium nucleus, so that in this reaction three helium nuclei combine to form a single carbon nucleus. The equation is as follows:

$$3\,{}^{4}_{2}\mathrm{He} \rightarrow {}^{12}_{6}\mathrm{C} + \gamma$$

The triple alpha reaction is the process of helium fusion in the core of a red giant.
The carbon nucleus can easily fuse with another helium nucleus to form oxygen, using the following reaction:

$${}^{12}_{6}\mathrm{C} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{16}_{8}\mathrm{O} + \gamma$$

Note that the triple alpha reaction produces just 10% of the energy per kilogram of fuel compared with hydrogen burning. The fuel is used up quickly so that the time a star spends as a red giant may be just 10–20% of its prior life as a main sequence star.
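
The comparison of energy yields per kilogram of fuel can be checked from atomic masses. The sketch below is illustrative only: it uses standard rounded atomic masses (carbon-12 is exactly 12 u by definition) and compares the fraction of fuel mass converted to energy in the triple alpha reaction with the fraction converted in hydrogen burning.

```python
M_H1 = 1.00782503      # atomic mass of hydrogen-1, u
M_HE4 = 4.00260325     # atomic mass of helium-4, u
M_C12 = 12.0           # atomic mass of carbon-12, u (exact by definition)
U_TO_MEV = 931.494     # energy equivalent of 1 u, MeV


def mass_fraction_converted(fuel_mass_u, product_mass_u):
    """Fraction of the fuel mass that is turned into energy."""
    return (fuel_mass_u - product_mass_u) / fuel_mass_u


triple_alpha_energy_mev = (3 * M_HE4 - M_C12) * U_TO_MEV
helium_fraction = mass_fraction_converted(3 * M_HE4, M_C12)
hydrogen_fraction = mass_fraction_converted(4 * M_H1, M_HE4)
yield_ratio = helium_fraction / hydrogen_fraction

print(f"Energy per carbon nucleus formed: {triple_alpha_energy_mev:.2f} MeV")
print(f"Helium burning yield relative to hydrogen burning: {yield_ratio:.1%}")
```

The ratio comes out at a little under one tenth, consistent with the figure of about 10% quoted above.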

Post-helium burning
When the star has exhausted its supply of helium in the core, the fusion
reactions cease there. The core is now largely composed of non-reacting
carbon and oxygen, although hydrogen fusion is still going on in the
shell. What happens from this point depends upon the star’s mass.
A star of one solar mass is near the end of its energy-producing life.
The still-fusing shell expands and becomes unstable, pulsating irregularly
and shedding material, already in its death throes.
However, larger mass stars still have some life left in them. The non-
reacting carbon contracts under gravity, heating up and igniting a
helium-burning shell just below the hydrogen-burning shell. This new
shell-burning causes the star to expand again, moving diagonally up
and right on the H–R diagram. This helium-burning shell is turbulent
and, unsupported from beneath like this, can make the star pulsate as a
non-periodic variable.
If the star is larger than about five solar masses, then its core becomes hot enough to begin fusion of carbon to neon and magnesium, possibly starting quickly in a ‘carbon flash’.
When the carbon supply is exhausted, a very massive star may proceed further. As each energy source runs out the core contracts under gravity and heats up. This ignites the element that has been produced by the shell immediately above it, creating a new shell of energy production (see figure 17.14). The core contracts further, heating sufficiently to ignite a new, heavier energy source (produced by previous fusions). Oxygen can be fused to silicon and sulfur, and the most massive stars are able to fuse these to form an iron core, but here the reactions must stop. This is because the fusion of iron, or any element heavier, consumes energy rather than producing it. Eventually, however massive the star, it finds itself at the end of its life, unable to initiate any new energy source.

Figure 17.14 A very massive star can develop many layers of shell burning as it finds successively heavier core fuels each time the core contracts and heats up. When it develops an iron core, however, it can go no further. (Layers from the outside in: H → He; He → C, O; C → Ne, Mg; O → Si, S; Si, S → Fe; Fe core.)



PHYSICS IN FOCUS
The nucleosynthesis of heavy elements in stars

By now you may have wondered how heavier elements such as gold or uranium are manufactured. We have already seen that hydrogen, helium and lithium were created in the big bang, that main sequence stars fuse their hydrogen into helium, that red giants fuse their helium into carbon and oxygen, and that massive stars are able to continue to fuse these elements further to form heavier elements again. The most massive stars are able to fuse elements up to iron to produce energy, and no further. But iron is just number 26 on the periodic table, which contains over 100 different elements. So how are the elements heavier than iron created?
There are two different processes for the manufacture, or nucleosynthesis, of these elements. Both processes require an input of energy and a supply of neutrons.
The first process is the slow capture of neutrons inside red giants that have achieved a helium-burning shell. The neutrons are captured by nuclei to form heavier ones. This slow process is capable of generating elements up to lead on the periodic table, including gold.
The second process is the fast capture of neutrons in a supernova explosion. In this environment there is sufficient energy available to allow the rapid formation of the elements heavier than lead, such as uranium.
The two processes complement each other to provide the wide range of elements found here on Earth and listed in the periodic table.

17.4 STAR DEATH


Sooner or later, a giant star develops a core of material that it cannot fuse
(either because it cannot get it hot enough to fuse, or the core is iron which
will not fuse to produce energy) surrounded by still-fusing shells. From this
point the death of every star follows a similar pattern: the shells will be shed
into space and the core will collapse under gravity. The way in which this
happens depends upon the mass of the star, as represented in figure 17.15.

Figure 17.15 A schematic diagram showing the evolution of a star as a function of its
mass. Note that all stars follow the same general pattern.
[Diagram: molecular cloud → protostar → main sequence → red giant (number of burning
shells depends on mass). A star of original mass ≤ 5 M☉ then forms a planetary nebula,
leaving behind a white dwarf (core < 1.4 M☉). A star of original mass > 5 M☉ ends in a
supernova, leaving behind a neutron star (core < 3 M☉) or a black hole (core > 3 M☉).
Note: M☉ = solar mass.]

332 ASTROPHYSICS
Stars of approximately five solar masses or less
The unsupported shells are unstable, producing bursts of energy known as thermal
pulses, as well as extraordinarily high ‘superwinds’. These combine to blow material
rapidly away from the star and disperse the shells until all that is left is the core. The
dispersed material is initially in the form of an expanding shell-shaped nebula around
the core, known as a planetary nebula. (An historical name since the nebula can look
like a planet through a small telescope.) Seen from our perspective this shell looks
very much like a ring, as the photograph of the Ring Nebula in figure 17.16 shows.

A planetary nebula is a shell-shaped cloud of gas that is the blown-away outer layers
of a star.

Figure 17.16 The Ring Nebula

The core, which has a mass less than 1.4 solar masses, collapses under the force of
gravity to form a white dwarf. This is a very dense star (about 10⁹ kg/m³), which means
that a star the size of our Sun would crush down to the size of the Earth. The white
dwarf is composed of ‘degenerate matter’, a form of highly crushed matter. The
pressure of the high-speed electrons within this matter (‘electron degeneracy
pressure’) is all that is left to act against gravity and stabilise the star’s size.

A white dwarf is a dense star made of degenerate matter. It is the end point of small-
to medium-sized stars.
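
As a rough check of the density quoted above, the following Python sketch squeezes one solar mass into a sphere the size of the Earth. The solar mass and Earth radius are standard reference values that are assumed here rather than taken from this chapter, and the function name is purely illustrative.

```python
import math

# Assumed reference values (not given in this chapter)
SOLAR_MASS_KG = 1.989e30
EARTH_RADIUS_M = 6.371e6


def mean_density(mass_kg, radius_m):
    """Mean density of a uniform sphere, in kg per cubic metre."""
    volume_m3 = (4.0 / 3.0) * math.pi * radius_m ** 3
    return mass_kg / volume_m3


# One solar mass crushed down to the size of the Earth
density = mean_density(SOLAR_MASS_KG, EARTH_RADIUS_M)
print(f"Mean density: {density:.1e} kg per cubic metre")
```

Under these assumptions the result is of the order of 10⁹ kg/m³, in line with the figure quoted for a white dwarf.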
The mass limit of 1.4 solar masses is an important one to note. Known as the
Chandrasekhar limit, it is the greatest mass that a non-rotating core can have and still
become a white dwarf. Rotation increases the limit somewhat. If the mass of the core is
greater than this, even electron degeneracy pressure is not enough to hold back the
gravitational collapse.

The Chandrasekhar limit (1.4 solar masses) is the greatest mass that a non-rotating
white dwarf can have.
Eventually the planetary nebula disperses into space and the white
dwarf simply cools down to form a stellar corpse known as a black dwarf.

Stars of more than five solar masses


The shells of massive stars can develop multi-layered structures; however, they still
experience the unstable pulsations and superwind of the less massive giants. The
difference here, though, is the mass of the core, which is greater than 1.4 solar
masses. The core begins to collapse under the force of its own gravity, but this time
electron degeneracy pressure will not stop it. Just how far this crush goes depends
again on the mass of the core.

Practical activity 17.1: Researching stellar objects



If the core is less than approximately three solar masses (but greater than 1.4 solar
masses) then the matter is crushed to such an extent that electrons and protons are
forced together to form a sea of neutrons, and it is the neutron degeneracy pressure
that finally halts the collapse. The core has become a neutron star, with a density of
approximately 10¹⁷ kg m⁻³ and just 10 to 15 km in diameter.

A neutron star is the extremely dense remnant of the core (1.4 to 3 solar masses) of a
massive star. It is composed of neutron matter.

If the core is greater than approximately three solar masses, nothing can stop its
gravitational collapse to a black hole. The matter is crushed down to a point of infinite
density, known as the singularity. The gravity of this point is so great that nothing,
including light, can escape it from within a certain radius, called the ‘event horizon’.
It is this characteristic that makes black holes black.

A black hole is the crushed remnant of the core (greater than 3 solar masses) of a very
massive star. Theoretically, it is a point of zero volume and infinite density.

In both of these cases, the collapse of the core draws in the remaining gases of the
shells of the star and they rebound from this implosion with a supernova. This is a
violent explosion of uncontrolled nuclear reactions that completely blows away the
material that was the various layers of the star, leaving behind just the highly dense
core.

A supernova is a violent explosion of uncontrolled nuclear reactions that completely
blows away the various layers of a massive star (original mass greater than five solar
masses).
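
The mass thresholds described in this section can be summarised as a simple decision rule. The Python sketch below is only an illustration of that rule: the function and constant names are invented for this example, and the limits are the approximate values quoted in the text, not sharp physical boundaries.

```python
# Approximate limits quoted in the text, in solar masses
CHANDRASEKHAR_LIMIT = 1.4
NEUTRON_STAR_LIMIT = 3.0
SUPERNOVA_THRESHOLD = 5.0


def shell_fate(original_mass):
    """How the outer shells are shed, based on the star's original mass."""
    if original_mass <= SUPERNOVA_THRESHOLD:
        return "planetary nebula"
    return "supernova"


def core_remnant(core_mass):
    """What the collapsed core becomes, based on the mass of the core."""
    if core_mass <= 0:
        raise ValueError("core mass must be positive")
    if core_mass < CHANDRASEKHAR_LIMIT:
        return "white dwarf"
    if core_mass < NEUTRON_STAR_LIMIT:
        return "neutron star"
    return "black hole"


# Example: a Sun-like star and a very massive star (core masses are illustrative)
for original, core in ((1.0, 0.6), (25.0, 4.0)):
    print(original, "solar masses:", shell_fate(original), "leaving a", core_remnant(core))
```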

PHYSICS IN FOCUS
Pulsars
Neutron stars spin very rapidly, up to 600 revolutions per second. This is because the
angular momentum of a comparatively slowly spinning giant’s core is conserved as it
shrinks to a neutron star. This is analogous to a spinning ice skater who pulls in her
arms to spin even faster.
In addition, these stars possess very strong magnetic fields. Such strong, quickly
rotating magnetic fields result in the emission of a beam of electromagnetic radiation
from each magnetic pole, as shown in figure 17.17. The beam traces out the surface of a
cone as the star rotates. If one of these beams happens to sweep across the Earth, we
will ‘see’ a very regular pulsation of radiation each time the beam swings past.
In 1967, while a research student at Cambridge University, British radio astronomer
Jocelyn Bell (1943–) discovered just such a repetitive, pulsating source of radio waves
for the first time. Such sources were dubbed ‘pulsars’, and confirmation of their
connection with neutron stars came the following year with the discovery of a pulsar at
the centre of the Crab Nebula. This nebula, shown in figure 17.18 on the following page,
is the remnant of a supernova observed by Chinese and Japanese astronomers in 1054.
There are currently over 500 pulsars known, with periods ranging from
1.54 milliseconds to 4 seconds. The radiation they emit can occur at radio, optical,
X-ray as well as gamma ray wavelengths.

Figure 17.17 The strong magnetic fields of the quickly rotating neutron star produce
beams of electromagnetic radiation from each magnetic pole that sweep through the
galaxy like a lighthouse beacon.
[Diagram labels: Earth; neutron star spins rapidly; magnetic axis; intense magnetic
field. If a beam swings by Earth, we see regular flashes of radiation: radio, optical,
X-ray or gamma rays.]

Figure 17.18 Two very different views of the Crab Nebula. This nebula is a remnant of a supernova observed in 1054. In 1968,
a pulsar was observed in its centre, confirming that pulsars are neutron stars that happen to be sweeping their beam of radiation
past our line of sight. The image on the left is an optical photograph taken by the Palomar Observatory; this is our usual view of
the nebula. The image on the right is an X-ray photograph of the same nebula, taken by the Chandra X-ray Observatory shortly
after it was placed in orbit. This previously unseen view clearly shows the powerhouse neutron star within.

Evolutionary tracks
In earlier sections we traced the paths followed by different stars on an
H–R diagram, through their early and middle lives. Let us now complete
the picture by including their deaths. Figure 17.19 presents this infor-
mation for stars of approximately 0.1, 1, 5 and 10 solar masses.

Figure 17.19 The evolutionary paths of several different stars on an H–R diagram
[Diagram: luminosity L (in units of the Sun’s luminosity, from 10⁻⁴ to 10⁶) plotted
against temperature T (K), running from 40 000 down to 2500. Tracks are drawn for
stars of 0.1, 1, 5 and 10 solar masses. Labelled regions and stages: main sequence;
shell H burning; core He burning, possible variable; shell He burning; core burning;
shell burning; red giants; supergiants; white dwarfs.]



CHAPTER REVIEW

SUMMARY
• Interstellar space is filled with an uneven spread of gas and dust, sometimes forming
  clouds or nebulae. The gas is mostly hydrogen, and occurs in the form of neutral
  atoms, ions and molecules. The sparse dust is made up of icy grains.
• The dust is associated with molecular clouds: the grains act as the site of molecule
  formation.
• A molecular cloud that is sufficiently cool and massive will collapse under gravity to
  form a heated core called a protostar. This core is hidden from our view by the dust
  in the cloud.
• The protostar accretes some material from the cloud and then blows off the rest,
  revealing itself.
• When the star is hot enough to begin the fusion of hydrogen in its core, it becomes a
  main sequence star. This is the longest and most stable period of its lifetime.
• In cooler stars the dominant hydrogen-burning mechanism is the proton–proton chain;
  in hotter stars it is the CNO cycle.
• When core hydrogen runs out, the core contracts and heats up. The shell of previously
  unused hydrogen ignites and expands to turn the star into a red giant. The core then
  ignites to begin helium-burning. The star can become unstable and behave as a
  periodic (pulsating) variable.
• When the core fuel runs out, the core will contract and heat up. If the star is massive
  enough it will ignite a new layer of shell-burning as well as a new and heavier fuel in
  the core. This process can be repeated over and over but must stop when the core
  consists of iron.
• If the original star was five solar masses or less, it will shed its shells as a
  planetary nebula, while the core will be compacted into a white dwarf.
• If the original star was greater than five solar masses, it will shed its shells
  violently in a supernova. If the core was less than three solar masses, it will be
  compressed into a neutron star; if greater than this, it will be crushed to a black
  hole.
• A pulsar is a rotating neutron star that happens to swing its beam of electromagnetic
  radiation past us as it spins.
• The H–R diagrams of whole star clusters show a turn-off point that can be used to
  infer the age of the cluster. The lower the turn-off point, the older the cluster.

QUESTIONS
1. (a) Describe interstellar gas.
   (b) Identify the types of substance that can be found within interstellar gas.
2. (a) Explain how ionised hydrogen is different from neutral hydrogen.
   (b) If it is known that a region of hydrogen is ionised, what can be inferred about
       its temperature?
3. (a) Construct a list of molecules that can be found in the interstellar medium.
   (b) What are ‘organic’ molecules, and do you think it strange that they should
       appear in the interstellar medium?
4. Describe the composition of an interstellar molecular cloud.
5. (a) Describe an interstellar dust grain.
   (b) State where these grains are thought to originate.
6. Interstellar dust is closely associated with molecular clouds. Explain why.
7. The presence of dust within molecular clouds, the site of star formation, causes a
   seeing difficulty for astronomers. Describe this problem and the method used to
   overcome it.
8. In point form, outline the steps of formation of a star by gravitational collapse.
9. (a) Describe a protostar.
   (b) Compare the luminosity, radius and core temperature of a protostar to the
       zero-age main sequence star it eventually forms.
10. Identify the event that marks the transition from protostar to zero-age main
    sequence star.
11. (a) Write a description of the proton–proton chain without using equations.
        Mention reactants and products.
    (b) Repeat the exercise with the CNO cycle.
12. (a) Identify the hydrogen-burning mechanism that is dominant in cool stars.
    (b) Identify the hydrogen-burning mechanism that dominates in hot stars.
    (c) State the core temperature that marks the crossover between the reactions in
        (a) and (b).
    (d) Can both reactions occur at the same time within a star? Explain.
13. (a) Explain the role of carbon in the CNO cycle.
    (b) State why nitrogen and oxygen are mentioned in the name of this process.
14. (a) Identify the net reaction of hydrogen burning in the core of a main sequence
        star.
    (b) Describe what happens to the helium produced by hydrogen burning.
15. In point form, summarise the steps in the transition of a star from main sequence
    to red giant.
16. Describe the two layers of reactions typical of a red giant.
17. (a) Write a description of the triple alpha reaction.
    (b) Identify the temperature required to initiate this reaction.
18. Describe a helium flash, and identify the types of star that experience it.
19. Identify the heaviest element able to be fused within the core of a star of:
    (a) 0.1 solar mass
    (b) 1 solar mass
    (c) 10 solar masses
    (d) 50 solar masses.
20. State when a giant is vulnerable to regular pulsations of luminosity.
21. Describe the structure of a massive star late in its giant stage.
22. Describe in general terms what becomes of:
    (a) the various outer layers or shells of a star
    (b) the core
    during the final stages of a star’s life.
23. Discuss how the process of a star losing its outer shells depends upon its mass.
24. Describe the various final states for the core of a star and link each to the mass of
    the original star and the mass of the star’s core.
25. (a) Construct a diagram that shows each step of a star’s life, in a general form.
    (b) Include on this diagram the possible variations in the giant stage, noting mass
        with each variation.
    (c) Include possible variations in the final stages, noting mass with each
        variation.
26. Construct a Hertzsprung–Russell diagram, including main sequence, red giants and
    white dwarfs. On this one diagram draw the evolutionary tracks of stars of 1, 5 and
    10 solar masses, using a different colour for each path.
27. When pulsars were discovered, it was first thought that the pulsations were a
    communication from space. What feature of the signal do you think quickly dispelled
    this suspicion?
28. (a) Compare a pulsar and a neutron star.
    (b) There must be many neutron stars in our galaxy, but only about 500 have been
        discovered. Explain why more pulsars have not been found. (Hint: It has to do
        with the radiation beam.)
29. Describe a cluster’s turn-off point on an H–R diagram, and explain what it can tell
    us.
30. (a) It has been suggested that the Earth is partially supernova remnant. Discuss
        this contention.
    (b) Bearing in mind that the solar system was formed from the same molecular cloud
        as the Sun, what does your answer to part (a) tell us about the Sun?



PRACTICAL ACTIVITIES

17.1 RESEARCHING STELLAR OBJECTS

Aim
To access up-to-date information on various stellar objects.

Apparatus
Internet access

Method
Use the internet to access some or all of the online astronomy databases listed
below. Each normally has a specialty, so it is useful to sample a variety of sites. In
this way, gather data, plus a picture (if possible), for two of each of the objects listed:
• main sequence stars
• variables
• binaries
• giants
• open clusters
• globular clusters
• supernovae
• white dwarfs/neutron stars/black holes
• nebulae (especially planetary)
• galaxies.

Weblinks (eBook plus):
• The Messier Database
• MOST Supernova Remnant Catalogue (MSC)
• A Catalogue of Galactic Supernova Remnants
• The Double Star Library
• The HIPPARCOS and Tycho Catalogues

Try a general search engine or web crawler service if you still cannot find some
information.

HSC OPTION MODULE
MEDICAL PHYSICS

Chapter 18 The use of ultrasound in medicine
Chapter 19 Electromagnetic radiation as a diagnostic tool
Chapter 20 Radioactivity as a diagnostic tool
Chapter 21 Magnetic resonance imaging as a diagnostic tool

CHAPTER 18
THE USE OF ULTRASOUND IN MEDICINE
Remember
Before beginning this chapter you should be able to:
• recall the features of waves, including speed,
frequency and wavelength
• distinguish between transverse and longitudinal
waves
• outline the properties of waves including reflection,
refraction and scattering.

Key content
At the end of this chapter you should be able to:
• describe the properties and production of ultrasound
• describe the piezoelectric effect and the effect of an
alternating potential difference on a piezoelectric
crystal
• describe how acoustic impedance affects the
behaviour of ultrasound
• calculate the acoustic impedance of a variety of
materials
• calculate the amount of reflected ultrasound signal at
various interfaces
• describe the situations where different types of
ultrasound scans would be used
• describe the use of Doppler ultrasonics in detecting
cardiac problems
• describe how ultrasound is used to measure bone
density.
Figure 18.1 Ultrasound is an important technique for
medical diagnosis. This photograph shows a pregnant
woman undergoing an ultrasound examination of her foetus.
In this chapter we will look at the properties of ultrasound waves and how
they are applied to medical imaging. Images of organs can be produced
because ultrasound waves penetrate and interact with the body, as in the
example of a pregnant woman undergoing an ultrasound (see figure
18.1). Movement such as blood flow in veins and arteries can also be
measured using ultrasound. Ultrasound is one of the most frequently
used imaging techniques in medical diagnosis.

18.1 WHAT TYPE OF SOUND IS ULTRASOUND?

Ultrasound is very high frequency sound. Ultrasound waves are sound waves with
frequency greater than that of normal human hearing. That is, the frequency is greater
than 20 000 hertz.

Ultrasound is very high frequency sound. Ultrasound waves are sound waves that have a
frequency above the range of human hearing, that is, greater than 20 000 hertz.

You will recall that, in the Preliminary Course topic ‘The world communicates’, you
learnt that sound waves are longitudinal waves and need a medium in which to travel.
The particles of the medium oscillate back and forth in the same direction as the wave
travels through the medium, producing a series of compressions and rarefactions. The
compressions and rarefactions correspond to pressure differences in the medium. It is
with reference to these compressions and rarefactions that the wavelength can be
measured.

Figure 18.2 Wavelength is the distance between successive compressions or
rarefactions of the longitudinal wave. The pressure in the medium is maximum at a
compression and minimum at a rarefaction.
[Diagram: a longitudinal wave shown as alternating compressions (C) and rarefactions
(R), above a graph of air pressure (varying between +ΔP and −ΔP about normal air
pressure) against distance from source (m). The wavelength is marked between
successive compressions.]

The amplitude of a wave is the maximum displacement of the particles on either side of
the equilibrium position. The greater the amplitude of the wave, the greater the
intensity of the wave and the greater the energy it is carrying.
The frequency (f ) is the number of oscillations the particles make per
second and is measured in hertz (Hz). The wavelength (λ) is the distance
between two successive compressions or between two rarefactions. Fre-
quency and wavelength are related by the equation
v = fλ
where v is the speed of the wave in the medium.

CHAPTER 18 THE USE OF ULTRASOUND IN MEDICINE 341


If the frequency is measured in hertz (Hz) and the wavelength in metres (m), the speed
is in metres per second (m s⁻¹).
Ultrasound can be reflected, refracted, scattered and superimposed
in the same way as audible sound or transverse waves such as light.
These properties are important for the use of ultrasound in medical
diagnosis.

Ultrasound and medical diagnosis


In medical diagnosis, ultrasound is reflected off parts of the body, then detected and
analysed to produce an image. The images produced are structural, not functional.
Ultrasound scanning provides a safe way of observing internal organs,
as no damaging effects of ultrasound used in medical diagnosis are
known. The frequency of the ultrasound determines the clarity of the
image obtained.

Why isn’t audible sound used for medical diagnosis?


In order to produce a clear image we must be able to distinguish
between different internal parts of the body, some of which may be very
close together. For example, we may want to detect the individual
fingers of a foetus in the womb, or the parts of a heart valve. Audible
sound will not distinguish between such close objects. A sound wave’s
ability to produce an image depends on the sound wave’s wavelength
and hence its frequency. Audible sound has a frequency below 20 000
Hz and a correspondingly poorer ability to produce images of small
objects than higher frequency ultrasound. In producing an ultrasound
image, echoes from two objects close together must be able to be
detected. Generally, if the separation or size of the objects is less than
the wavelength of the sound wave, the objects cannot be detected. (We
say they cannot be resolved.)
The higher the frequency of the sound wave, the shorter the wave-
length and the better the resolution that is possible. For example,
in water the wavelength of sound at the limit of hearing (20 000 Hz) is
75 mm, whereas 1.2 MHz ultrasound has a wavelength of 1.2 mm, and 3.5
MHz ultrasound has a wavelength of 0.43 mm. This means 3.5 MHz ultra-
sound can be used to distinguish objects in water that are 0.43 mm apart.
This is a much better resolution than that obtained by audible sound;
even at the limit of hearing, objects in the water would have to be 75 mm
apart to be distinguished.
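
The wavelengths quoted above can be checked with v = fλ. The Python sketch below is illustrative only: the function name is invented for this example, and the speed of 1500 m s⁻¹ is an assumption (the quoted figures appear to correspond to roughly that value, whereas table 18.2 lists 1430 m s⁻¹ for water).

```python
def wavelength_mm(speed_m_per_s, frequency_hz):
    """Return the wavelength in millimetres, from v = f * lambda."""
    return speed_m_per_s / frequency_hz * 1000.0


# Assumed speed of sound in water (m/s); not stated in the passage itself
ASSUMED_SPEED_IN_WATER = 1500.0

for frequency_hz in (20_000, 1.2e6, 3.5e6):
    lam = wavelength_mm(ASSUMED_SPEED_IN_WATER, frequency_hz)
    print(f"{frequency_hz:>10.0f} Hz -> {lam:.2f} mm")
```

With this assumed speed, the limit of hearing gives 75 mm and 3.5 MHz gives about 0.43 mm, which is the smallest separation the text says can be resolved at that frequency.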

What frequency will produce the clearest image?


We may ask why we do not use incredibly high frequencies with corre-
spondingly small wavelengths so that we can detect very fine detail.
A compromise on the frequency used must be made because the
absorption of the wave increases as the frequency increases. For
example, a 50 MHz ultrasound does not penetrate nearly as well
through tissue as a 10 MHz ultrasound (see table 18.1). Hence the fre-
quency used must be suitable for the part of the body being analysed. A
small organ on the surface of the body, such as the eye, can be examined
with a much higher frequency than would be useful for examining
deep abdominal organs. Medical diagnosis uses ultrasound in the range
1 MHz to tens of MHz.

342 MEDICAL PHYSICS


PHYSICS FACT
The frequency range used for diagnosis varies depending on the part of the body that is being exam-
ined. Some examples are given in table 18.1.

Table 18.1 Ultrasound frequency for different parts of the body being imaged

FREQUENCY CHOSEN | PART OF BODY BEING EXAMINED | TYPICAL PENETRATION DEPTH | REASON FOR CHOOSING THIS FREQUENCY
50 MHz | Skin or areas reached through surgery, such as blood vessel walls and cartilage. | a few mm | The region must be close to the ultrasound as absorption of the ultrasound energy by the tissue is significant and hence it does not penetrate tissues effectively.
10–15 MHz | Eye | 1 cm | The organ is small and on the surface of the body, so absorption of the ultrasound by tissues is not a problem.
4–10 MHz | Thyroid, carotid artery, breast | 5 cm | These organs are quite close to the surface of the body, so absorption of the ultrasound is not a problem at this frequency range.
3–5 MHz | Liver, heart, other abdominal organs | 10–20 cm | These organs, such as the heart, uterus and liver, are deep in the body. Ultrasound with a frequency that is too high will be absorbed before it reaches the organ.

18.2 USING ULTRASOUND TO DETECT STRUCTURE INSIDE THE BODY

A pulse of ultrasound is directed into the body and echoes from tissue boundaries are
detected. Some of the energy of each pulse reflects off the junction between different
substances in the body and some passes through. This is similar to what happens when a
beam of light strikes the surface between water and glass: some light is reflected from
the surface and some is transmitted. Because the surfaces inside the body are not
usually flat, the reflection results in scattering of the ultrasound. The ultrasound
that comes back to the detector is analysed.
The properties of body materials such as bone, skin and muscle are dif-
ferent and, as a result, sound propagates through them at different
speeds and with different amounts of absorption.



Figure 18.3 (a) Reflection and transmission of ultrasound. Ultrasound is reflected
from each side of the organ and from the vertebra. These reflected waves, which are
much weaker than the original ultrasound, are detected and used to make the image.
(b) Typical heads used for producing and detecting the ultrasound (c) An ultrasound
image of a foetus
[Diagram labels for (a): ultrasound signal produced and received here; abdominal wall;
organ; vertebra.]

Acoustic impedance
The extent to which body tissues transmit sound varies. The acoustic impedance, Z, of a
material measures how readily sound will pass through a material and is defined by the
formula
Z = ρv
where
Z is acoustic impedance (in kg m⁻² s⁻¹)
ρ is the density of the medium (kg m⁻³)
v is the velocity of sound in the material (m s⁻¹).

Acoustic impedance, Z, is a measure of how readily sound will pass through a material.
It is measured in kg m⁻² s⁻¹.

Table 18.2 shows various body materials and their properties from
which their acoustic impedance can be calculated.



Table 18.2 Properties of selected body materials

MEDIUM | DENSITY (ρ) (kg m⁻³) | VELOCITY (v) (m s⁻¹)
Air | 1.3 | 330
Water | 1000 | 1430
Eye: aqueous humour | 1000 | 1500
Eye: vitreous humour | 1000 | 1520
Eye: lens | 1140 | 1620
Soft tissue such as nerves (average) | 1060 | 1540
Muscle (average) | 1075 | 1590
Fat | 952 | 1450
Liver | 1050 | 1570
Brain | 1025 | 1540
Blood | 1060 | 1570
Bone | 1400–1908 | 4080

SAMPLE PROBLEM 18.1 Ultrasound travelling through fat
An ultrasound wave of 2.0 MHz is travelling through fat. Calculate:
(a) the wavelength of the ultrasound
(b) the acoustic impedance of the fat.

SOLUTION
(a) Using the wave equation,
v = fλ
1450 = 2.0 × 10⁶ × λ
λ = 7.25 × 10⁻⁴
The wavelength of the ultrasound in fat is 7.25 × 10⁻⁴ m (0.725 mm).
(b) Using the acoustic impedance formula,
Z = ρv
Z = 952 × 1450
= 1.38 × 10⁶
The acoustic impedance of fat is 1.38 × 10⁶ kg m⁻² s⁻¹.
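
The same calculation can be repeated for any material in table 18.2. The Python sketch below is a minimal illustration; the dictionary and function names are invented for this example, and only a few rows of the table are copied in.

```python
# (density in kg m^-3, speed of sound in m s^-1), values copied from table 18.2
BODY_MATERIALS = {
    "air": (1.3, 330),
    "water": (1000, 1430),
    "fat": (952, 1450),
    "muscle": (1075, 1590),
    "blood": (1060, 1570),
}


def acoustic_impedance(density, speed):
    """Acoustic impedance Z = rho * v, in kg m^-2 s^-1."""
    return density * speed


for name, (density, speed) in BODY_MATERIALS.items():
    z = acoustic_impedance(density, speed)
    print(f"{name:>7}: Z = {z:.3g} kg m^-2 s^-1")
```

The value for fat agrees with part (b) of the sample problem, and the value for air is several thousand times smaller than that of any of the tissues listed.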

Reflection of ultrasound
The acoustic impedances of two adjoining tissues are used to calculate the intensity of
the reflected pulse compared with the incoming one. The following formula enables us
to determine the fraction of the intensity reflected at a surface between two media,
such as bone and muscle:
$$\frac{I_r}{I_0} = \frac{(Z_2 - Z_1)^2}{(Z_2 + Z_1)^2}$$
where
I_r is the intensity of the pulse reflected back (W m⁻²)
I_0 is the intensity of the pulse incident on the surface (W m⁻²)
Z_1 and Z_2 are the acoustic impedances of media 1 and 2 (kg m⁻² s⁻¹).



PHYSICS FACT
Often, reflected ultrasound is very weak because, when the wave travels through
tissue, much of the energy is absorbed and converted to heat. Why not send a very high
intensity signal so that more energy can be reflected and detection of the reflected
wave will be easier?
The power levels for diagnostic medical ultrasound must be kept at very low intensity
of 0.01–20 watts per square centimetre (W cm⁻²) so that heating or destruction of
tissue does not occur when energy is absorbed by tissues inside the body. Ultrasound
at continuous power levels of about 1 W cm⁻² is used to heat tissues for therapy and,
at power levels of 10³ W cm⁻², may be used to destroy tissue. Ultrasound can be used to
heat tissue deep in the body, and so produce relief from pain in the joints of
sufferers of arthritis and stimulate blood flow to damaged tissues.
Recent advances have been reported in research into the treatment of prostate cancer.
Investigations have shown that ultrasound of sufficient intensity can destroy cancer
cells. Ultrasound has also been used to break up gallstones and kidney stones, avoiding
the need for surgery, and to break up the lens of the eye during a cataract operation.

SAMPLE PROBLEM 18.2 Reflection of ultrasound
Calculate the percentage of ultrasound that is reflected at the junction between fat
and muscle, using information from table 18.2.

SOLUTION
The acoustic impedance of fat is 1.38 × 10⁶ kg m⁻² s⁻¹ (from sample problem 18.1).
We need to calculate the acoustic impedance of muscle.
Z = ρv
Z = 1075 × 1590
Acoustic impedance of muscle = 1.71 × 10⁶ kg m⁻² s⁻¹
Using the reflection equation given earlier, with fat as medium 1 and muscle as
medium 2:
$$\frac{I_r}{I_0} = \frac{(1.71 \times 10^{6} - 1.38 \times 10^{6})^2}{(1.71 \times 10^{6} + 1.38 \times 10^{6})^2} = 0.011$$
This is the ratio of reflected intensity to incident intensity. This ratio must be
converted to a percentage (0.011 × 100%).
Hence 1.1% of the incident signal is reflected back.
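
The working above can be reproduced in a few lines of Python. This is a sketch only: the function name is invented for this example, and the densities and velocities are the table 18.2 values for fat and muscle.

```python
def reflected_fraction(z1, z2):
    """Fraction of incident intensity reflected at a boundary between two media."""
    return ((z2 - z1) / (z2 + z1)) ** 2


# Acoustic impedances from Z = rho * v, using table 18.2
z_fat = 952 * 1450
z_muscle = 1075 * 1590

fraction = reflected_fraction(z_fat, z_muscle)
print(f"Reflected at a fat-muscle boundary: {fraction * 100:.1f}%")
```

Because the difference is squared, the result is the same whichever medium is labelled 1 or 2.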
If there is a very large difference in acoustic impedance between the two materials, a
large fraction of the ultrasound will be reflected back. This is what happens when
ultrasound is directed through the air to the skin. The very large difference in the
acoustic impedance of air and skin means that most of the ultrasound will be reflected
off the skin and will not enter the body. To minimise this reflection at the skin
surface, a gel with an acoustic impedance similar to that of skin is placed on the skin.
This gel excludes air from the space between the ultrasound transducer and the skin
and provides an acoustic match. The signal can pass from the transducer through the
skin with very little reflection.

An ultrasound transducer is a device for converting electrical energy to ultrasound
energy or for converting ultrasound energy to electrical energy.

In a pregnant woman, the bladder is between the outer surface of the body and the
foetus. A pregnant woman must have a bladder full of urine in order to obtain a
successful ultrasound of the foetus. The full bladder lifts the uterus to a good
position for imaging and pushes the lower intestine out of the way of the signal. The
lower intestine contains a lot of gas that would reflect the ultrasound and make the
imaging of the foetus difficult.
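
The effect of an air gap, and of gas in the intestine, can be estimated with the same reflection formula. The Python sketch below rests on several assumptions that are not in the text: skin is approximated by the average soft tissue values from table 18.2, the gel impedance of 1.5 × 10⁶ kg m⁻² s⁻¹ is a hypothetical figure chosen only to be close to that of tissue, and absorption is ignored so that whatever is not reflected is taken to be transmitted.

```python
def reflected_fraction(z1, z2):
    """Fraction of incident intensity reflected at a boundary between two media."""
    return ((z2 - z1) / (z2 + z1)) ** 2


# Acoustic impedances (kg m^-2 s^-1) from Z = rho * v, using table 18.2
z_air = 1.3 * 330
z_tissue = 1060 * 1540      # average soft tissue, used here as a stand-in for skin
z_gel = 1.5e6               # hypothetical coupling gel, assumed close to tissue

for label, z_first in (("air", z_air), ("gel", z_gel)):
    reflected = reflected_fraction(z_first, z_tissue)
    transmitted = 1.0 - reflected   # assumes no absorption at the boundary
    print(f"{label} to tissue: {reflected:.2%} reflected, {transmitted:.2%} transmitted")
```

Under these assumptions almost all of the intensity is reflected at an air to tissue boundary, while well under 1% is reflected when the gel replaces the air.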



18.3 PRODUCING AND DETECTING ULTRASOUND: THE PIEZOELECTRIC EFFECT

An ultrasound transducer produces ultrasound of a specific frequency and this same
transducer is capable of detecting the reflected ultrasound. In order to produce
ultrasound, a material must be made to vibrate at a very fast rate, of the order of
1.5 × 10⁶ Hz. The material used is a piezoelectric crystal. If an electric field
(potential difference) is applied across the crystal, the shape of the crystal changes.
This behaviour of the crystal is called the piezoelectric effect. By reversing the
potential difference repeatedly and rapidly, the crystal can be made to vibrate and
produce frequencies in the ultrasound range. Such a crystal is part of the ultrasound
transducer (see figure 18.4).

The piezoelectric effect is the conversion of electrical energy to mechanical energy
resulting in the change in shape of a piezoelectric crystal when it is subjected to a
potential difference.
Figure 18.4 An ultrasound transducer
[Diagram labels: metal outer casing; backing block; electrodes apply an alternating
potential difference; power cable; piezoelectric crystal; acoustic insulator; plastic
‘nose’.]

Naturally occurring crystalline substances exhibiting the piezoelectric
effect include quartz, lithium sulfate and barium titanate. The most
commonly used substance for a transducer for medical purposes is the
artificial ceramic material, lead zirconate titanate (PZT).

A crystalline substance is a solid in which the atoms or molecules are
arranged in a regular pattern.
Conducting material on either side of the crystal of the transducer
permits the application of an alternating potential difference across the
crystal. The backing block dampens the vibration. If there were no backing
block, the crystal would continue to vibrate after the electric field was taken
away, in the same way that a drum continues to vibrate after it has been
struck. The backing block dampens the vibration of the crystal in the same
way as placing a hand on a drum skin would stop the drum’s vibrations.
When ultrasound strikes this crystal there is a variation in pressure felt
at the crystal surface. (Recall from your Preliminary Course work in
Physics 1, chapter 2 that compressions and rarefactions are detected as
pressure differences in the medium through which the wave is travelling.)
When a compression meets the surface, the crystal is compressed;
when a rarefaction meets the surface, the crystal relaxes. The crystal
changes shape in the same way as when it was generating ultrasound.
In this case the changing crystal causes a changing electric
field to be produced across the crystal. This changing electric field
varies in time with the frequency of the ultrasound and can be detected in
the transducer. This means that the transducer can be used to detect
ultrasound as well as to produce it. The voltage produced is greater when
the amplitude of the ultrasound is greater. Hence the transducer
provides information about the ultrasound wave intensity.

CHAPTER 18 THE USE OF ULTRASOUND IN MEDICINE 347


PHYSICS FACT
History of the use of ultrasound
The first device for the production of ultrasound was a pipe,
produced in 1820 and called the Wollaston whistle. It was developed
by the English scientist W. H. Wollaston, who was determining the
limits of human hearing.
Modern medical transducers are based on the piezoelectric effect,
which was discovered by Jacques and Pierre Curie in 1880. These
improved as developments in electronics grew from the introduction
of radar, just before and during World War II. At this stage, imaging
based on the pulse-echo principle was possible.
A patent was filed in 1940 for an ultrasound device to detect flaws
in metal structures. In 1948, Karl Dussik, who was trying to detect
brain tumours, unsuccessfully attempted the first medical application
of ultrasound. It was not until 1952 that the first successful medical
use of pulse-echo imaging was described by J. J. Wild and J. M. Reid.
Six years later, in 1958, the first commercial equipment for ultrasonic
imaging appeared.

18.4 GATHERING AND USING INFORMATION IN AN ULTRASOUND SCAN
An ultrasound transducer detects reflected signals from different body
structures. A computer then analyses the signal to obtain information
about the location of the structures to produce an image. Different types
of scans are chosen to suit particular purposes.

A-scans
An A-scan is a range-measuring system that records the time for an
ultrasonic pulse to travel to an interface in the body and be reflected
back. In an A-scan the ultrasound pulses are directed into the body in
one line and the reflected signal is detected. The intensity of the
reflected beams is plotted on a graph as a function of time. In this way
the position of various features can be determined from the time lapse
between sending the signal and receiving its echo and a knowledge of
the speed of sound in the tissue. The intensity of the reflected beam
provides information about the type of material through which the
ultrasound is travelling.
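
The range calculation described above can be sketched in a few lines of Python.
This is a minimal illustration rather than part of the original text: the
function name `echo_depth` and the sample echo times are assumptions, and the
speed of 1540 m s⁻¹ is the value for body tissue quoted in the chapter questions.
Because the pulse travels to the interface and back, the depth is half the
distance covered in the measured time.

```python
def echo_depth(echo_time_s, speed_m_per_s=1540.0):
    """Depth of a reflecting interface from the round-trip echo time."""
    # The pulse covers the depth twice (out and back), hence the factor of 2.
    return speed_m_per_s * echo_time_s / 2.0


# Hypothetical echo times (in seconds) read from an A-scan trace.
echo_times = [10e-6, 30e-6]
for t in echo_times:
    print(f"echo after {t * 1e6:.0f} microseconds: depth {echo_depth(t) * 1000:.1f} mm")
```

With these assumed times the interfaces lie roughly 7.7 mm and 23.1 mm below the
transducer, which is the scale of distances measured inside the eye.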
An A-scan provides one-dimensional information about the location of
the reflecting boundaries. Originally this type of scan was used to
determine the midline position of the brain and detect any abnormalities
there caused by tumours, because the midline would be displaced by a
tumour. A-scans are no longer used for this as more sophisticated
methods of imaging the brain have been developed. A-scans are still used
in ophthalmology for the diagnosis of eye disease and for measurements
of distances in the eye, where no image of the interior of the eye is
needed (see figure 18.5).



Figure 18.5 (a) The reflected ultrasound from parts of the eye (cornea, lens, retina
and sclera) displayed on an oscilloscope (b) Ultrasound studies of a detached retina.
The A-scan trace shows an echo (s) from the front of the eye, an echo (r) from the
retina and an echo (s) from the back of the eye. In a normal eye the echo from the
retina would blend with the echo from the back of the eye.

B-scans
In a B-scan the intensities of the reflected ultrasound are represented as
spots of varying brightness, the brightest spot corresponding to the most
intense reflected ultrasound. By moving the transducer probe, the body is
viewed from a range of angles. Several series of spots are obtained, each series
corresponding to a different line through the body. Together these spots give a
2-D picture of a cross-section through the body (see figure 18.6).

A B-scan displays the reflected ultrasound as a spot, the brightness
of which is determined by the intensity of the ultrasound.

Figure 18.6 Building up a B-scan image of a foetus. Labels: probe, body cross-section,
placenta, limbs, foetal skull and spine; panels (b), (c) and (d) each show a trace.



Sector scans and phase scans
Sector scans are scans of a fan-shaped section of the body. They are made
up of a number of B-scans, which build up an image of the sector in the
body through a series of dots of varying intensities.

Sector scans are scans in the shape of a sector. They are made
from a series of B-scans.

When this type of scanning was first used, a single transducer was
rocked back and forth manually so that the ultrasound pulses would
sweep across a sector of the body. This required skill and experience to
achieve clear images and is now rarely used in hospital work. However, its
advantage is that it requires a small entry ‘window’ into the body and is still
valuable in imaging through a small space, such as the space between the bones
of an infant’s skull to obtain an image of the infant’s brain, as shown in
figure 18.7.

Figure 18.7 Sector scan of an infant’s brain

Modern scanning techniques use an array of transducers very close together in
the one probe head. This enables very clear images to be produced and also
allows the possibility of real-time scans (scans that are produced faster than
16 images per second and displayed on the monitor at that rate). Real-time
scans allow movement to be monitored and so are used to examine, for example,
foetal movement or heart movement.

Figure 18.8 (a) When the transducers send their signals together the signals are in phase.
The wavefront of the signal is parallel to the transducers. (b) When the transducers fire at
different times the signal fired first will travel furthest. The waves at the probe surface will be
out of phase with one another. The wavefront, which is a tangent to the crests of the advancing
waves, will have changed direction compared with (a). In the diagram, arrowheads represent
the time of stimulation of each transducer and arcs represent the moving crests.

There may be as many as several hundred transducers in an array.
These transducers may be fired simultaneously, in which case they
produce a wavefront parallel to the transducers. If the transducers are
fired in close succession so that they are slightly out of phase with one
another, they produce a wavefront that strikes the surface at an angle
other than 90°. By changing the time between firing of the transducers,
the phase difference of the waves from the transducers and hence the
direction of the ultrasound beam can be altered. The beam can be swept
from side to side, producing a scan over a wide arc. Improvements in
transducer array construction and electronic processing have led to
improvements in image quality. The firing is electronic and very
accurate. By using this phased array of transducers, a phase scan is produced.
Ultrasound scanning using phase scans is the most common scanning
technique used today.

A phase scan is a scan produced using an array of transducers. The
phase difference between the signals from each transducer may
be varied to produce this scan.
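
The link between firing times and beam direction can be sketched numerically.
For equally spaced transducers a distance d apart, firing each one a time
Δt = d sin θ / v after its neighbour tilts the wavefront by an angle θ away from
the straight-ahead direction. This relation is standard wave geometry rather
than something stated in the text, and the element spacing, number of elements
and function name below are illustrative assumptions only.

```python
import math


def firing_delays(n_elements, spacing_m, steer_angle_deg, speed_m_per_s=1540.0):
    """Firing time (s) of each transducer, relative to the first one fired."""
    # Extra time between neighbouring elements: d * sin(theta) / v.
    step = spacing_m * math.sin(math.radians(steer_angle_deg)) / speed_m_per_s
    delays = [i * step for i in range(n_elements)]
    # Shift so the earliest firing is at t = 0 (needed for negative angles).
    earliest = min(delays)
    return [t - earliest for t in delays]


# Assumed example: 5 elements, 0.3 mm apart, beam steered 20 degrees to one side.
for i, t in enumerate(firing_delays(5, 0.3e-3, 20.0)):
    print(f"element {i}: fire at {t * 1e9:.1f} ns")
```

Setting the angle to zero gives equal firing times, which corresponds to the
in-phase case of figure 18.8(a). Changing the sign of the angle reverses the
firing order and sweeps the beam to the other side.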

PHYSICS FACT
Phase difference
The phase difference represents the relative positions of two waves compared
with one another. If two identical waves are generated side by side at the same
instant and travel through a medium, the crests of the waves are always together
and there is no phase difference between the waves. If the crest of one wave is
always in line with the trough of the other wave then the phase difference is
half a wavelength. More generally, if the two identical waves are generated at
slightly different times from one another, the first crest of one wave will
always be ahead of the first crest of the second wave and the waves will be out
of phase. (There is a phase difference.) The wavefront, or line representing
the approaching waves, is the line that is a tangent to the crests of the waves.
If arcs represent the crests of the waves travelling through a medium,
figure 18.9 shows waves where there is (a) no phase difference
and (b) two distinct phase differences.

Figure 18.9 The size of the phase difference determines the direction in which
waves propagate. (a) no phase difference (b) phase difference. Labels: waves
generated along this line, wavefront, direction of approaching waves.
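
As a small numerical illustration (not from the original text), the phase
difference between two identical waves can be expressed as the fraction of a
cycle by which one lags the other, that is, the time delay multiplied by the
frequency. The function name and the sample values are assumptions chosen for
the example.

```python
def phase_difference_in_wavelengths(delay_s, frequency_hz):
    """Phase difference as a fraction of one wavelength (0 means in phase)."""
    cycles = delay_s * frequency_hz
    # Whole cycles make no difference, so keep only the fractional part.
    return cycles % 1.0


# Two assumed 2 MHz waves, the second started 0.25 microseconds after the first.
print(phase_difference_in_wavelengths(0.25e-6, 2.0e6))
```

A result close to 0.5 means the crest of one wave lines up with the trough of
the other, the half-wavelength case described above.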

PHYSICS IN FOCUS
Ultrasound and bone density
There are several techniques for measuring bone density. These vary in their
usefulness for detecting the risk of osteoporosis, a disorder in which reduced
bone density leads to brittle bones.
Normal X-rays are not suitable for the early detection of osteoporosis. This is
because X-rays show changes due to loss of bone density only when approximately
30 per cent of bone has been lost. Diagnosis needs to be made earlier than this.
Ultrasound measurement of bone density is more effective and the technique is
readily available, often through mobile units at pharmacies. The patient inserts
a foot into a warm water bath and ultrasound waves are directed through the
heel, as shown in figure 18.10.

Figure 18.10 Transducers send and receive the ultrasound signal through the heel
bone to provide data for the analysis of bone density.

The speed of the ultrasound through the bone and the ultrasound attenuation
(degree of absorption) are measured. Normal bone has a higher speed of
ultrasound and larger attenuation than osteoporotic bone. Speed and attenuation
are combined to give an index from which an estimate of heel bone mineral
density is reported.
It is not possible by this method to measure sites of fractures that occur in
people with osteoporosis, such as at the hip or spine. Although it is a
cost-effective method of screening for bone density, ultrasound on its own does
not predict the probability of fractures and so is not recommended for the
diagnosing of osteoporosis. If an abnormal result is obtained from the bone
density analysis, a DEXA examination should be undertaken to test for
osteoporosis.
DEXA (Dual Energy X-ray Absorptiometry) using X-rays is regarded as the most
reliable method of measuring bone density and can detect small changes only
6–12 months after a previous measurement. The densities of the lumbar spine and
left hip are usually measured. The DEXA procedure is different from a normal
X-ray because low-energy X-rays are used. (The X-ray dose is so low that the
radiographer can remain in the room with the patient.) People who are shown to
have low bone density through ultrasound measurement are referred for a DEXA
scan, because DEXA measures bone density with high accuracy and precision. More
commonly, a person will be sent directly for a DEXA scan and not for ultrasound
testing for bone density.

Figure 18.11 A heel ultrasound scanner

Further information about ultrasound measurement of bone density may be found
from the Royal Adelaide Hospital website or using key words of ‘ultrasound bone
mineral density’ in an internet search engine.

Weblink: Ultrasound measurement of bone density
An interesting discussion of ultrasound and X-rays from the Health Report on ABC
Radio National. This information is still reliable, although it was recorded in
1999.

18.5 USING ULTRASOUND TO EXAMINE BLOOD FLOW
Ultrasound may be used to measure blood flow and hence detect problems
such as threatened blockages in arteries. An understanding of the
Doppler effect is needed to understand how ultrasound is used to
measure blood flow.

The Doppler effect is the apparent change in frequency observed when there is
relative movement between a source of a sound and an observer.

The Doppler effect
You may have noticed the change in pitch when a car sounding its horn
or an ambulance sounding its siren drives past. The pitch becomes
lower as the source of sound passes and moves away. The reason for this
is shown in figure 18.12.
In figure 18.12(a) we see the sound waves generated by the horn.
Sarah and Sam hear the exact frequency that the horn makes. In figure
18.12(b) we see that the car is running into the sound waves it makes in
front of it and moving away from the sound waves that travel out behind
it. This means that the wavelength of the sound wave is shorter in front of
the car and the frequency that is heard by Sarah is higher than normal.
Behind the car, the wavelength of the sound wave is longer than normal
and hence the frequency of the sound heard by Sam is lower.
The same effect occurs if we drive towards a stationary source of sound.
We pass more waves per second when we travel towards the source
compared with when we were stationary, and hence we hear a note of
higher pitch than normal. Similarly, as we travel away from the source we
meet fewer waves per second and hear a sound of lower pitch than normal.
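
The size of the pitch change can be estimated with the standard relation for a
moving source and a stationary listener: f′ = f v / (v − v_s) in front of the
source and f′ = f v / (v + v_s) behind it, where v is the speed of sound and v_s
the speed of the source. The text does not quote this formula, so treat the
sketch below as a supplementary illustration. The horn frequency, the car speed
and the speed of sound in air (340 m s⁻¹) are assumed values.

```python
def heard_frequency(source_hz, source_speed, approaching, sound_speed=340.0):
    """Frequency heard by a stationary listener when the source moves."""
    # Waves bunch up ahead of the source and stretch out behind it.
    if approaching:
        return source_hz * sound_speed / (sound_speed - source_speed)
    return source_hz * sound_speed / (sound_speed + source_speed)


horn = 400.0  # Hz, assumed horn frequency
car = 20.0    # m/s, assumed car speed (72 km/h)
print("Sarah (car approaching):", round(heard_frequency(horn, car, True), 1), "Hz")
print("Sam (car receding):", round(heard_frequency(horn, car, False), 1), "Hz")
```

With these assumed numbers Sarah hears about 425 Hz and Sam about 378 Hz, so
the horn sounds higher in front of the car and lower behind it.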



Figure 18.12 Change in frequency heard when source moves. (a) The car is stationary
between Sam and Sarah. (b) The car is moving away from Sam and towards Sarah.

This apparent change in frequency of a sound when there is relative
movement between the source of a sound and the observer is called the
Doppler effect.

Doppler ultrasound in practice


In Doppler ultrasound, the change in frequency is measured and analysed
to give information about the rate of blood flow in the body, and
particularly through the heart.
An ultrasound beam is directed into the body and some of this ultrasound is
reflected off blood cells moving with the blood. Due to the movement of
the blood cells, the reflected ultrasound that is received by the transducer
will have changed in frequency compared with the signal originally transmitted.
In fact, the Doppler effect has to be taken into account twice. To
illustrate this, imagine the blood is moving towards the transducer. The blood
cells will receive a signal at a higher frequency than that given out by the
transducer. These blood cells then act as a source when they reflect the
signal. They reflect the higher frequency wave and move into the
wave at the same time, resulting in a further increase in frequency (see
figure 18.13). This higher frequency is received by the transducer.

Figure 18.13 Double Doppler effect. The transducer produces a signal of frequency f.
The moving blood cells receive a signal of frequency f₁, greater than f. The blood
cells reflect waves of frequency f₁, then move into these reflected waves, further
increasing their frequency to f₂. The transducer receives a signal of frequency f₂,
which is greater than f₁. Total frequency change is Δf = f₂ − f.



For example, if the blood flow is 1 m s⁻¹ and a 5 MHz ultrasound is
used, the frequency change is approximately 3 kHz, which is in the audible
range. Note that the frequency change is what is measured. An
experienced practitioner can listen to the frequency change and make
judgements about whether the flow is towards or away from the transducer and
whether the blood flow rate is normal. Of course the signal can also be
electronically analysed and displayed on a screen for examination.
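
The worked figure above can be explored with the usual approximation for the
double Doppler shift, Δf ≈ 2 f v cos θ / c, where c is the speed of sound in
tissue and θ is the angle between the beam and the direction of flow. This
formula and the values c = 1540 m s⁻¹ and θ = 60° are assumptions brought in for
illustration; the text itself quotes only the result. Under these assumptions
the shift comes out close to the 3 kHz mentioned, while a beam pointing straight
along the flow would give roughly twice that.

```python
import math


def doppler_shift(transmit_hz, blood_speed, beam_angle_deg=0.0, sound_speed=1540.0):
    """Approximate frequency change for ultrasound reflected from moving blood."""
    # Only the component of the blood velocity along the beam contributes.
    along_beam = blood_speed * math.cos(math.radians(beam_angle_deg))
    # Factor of 2: one shift when the cells receive the wave, one when they reflect it.
    return 2.0 * transmit_hz * along_beam / sound_speed


print(round(doppler_shift(5.0e6, 1.0, beam_angle_deg=60.0)), "Hz at 60 degrees")
print(round(doppler_shift(5.0e6, 1.0, beam_angle_deg=0.0)), "Hz head-on")
```

The dependence on cos θ is the reason the amount of Doppler shift changes with
the angle at which the beam meets the moving blood.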
The ultrasound reflected from internal tissues may pose a problem as
these waves interfere with the echo ultrasound being analysed. Particular
types of signals must be used to overcome this problem. These are
discussed below.

Choosing the best ultrasound signal


If a continuous signal is used there must be separate transmitter and
receiver transducers within the one head so that the wave being sent
does not interfere with the one being received. The change in
frequency can be detected through headphones or measured electronically.
Using a reference frequency, the new frequency is measured
above or below this reference frequency. The electronic device
measures beats, equal to the difference in frequency between the
transmitted and detected waves. The beats can be read from a display of
amplitude of the signal against time. By this means, flow towards or
away from the transducer can be determined and information about
the blood flow at an instant can be determined. Note that the amount
of Doppler shift also depends on the angle at which the beam strikes
the moving blood.
The drawback of the continuous Doppler signal is that it does not
convey clear information about deep blood vessels due to scattering and
reflection from soft tissues encountered by the ultrasound as it
penetrates the body. Although the continuous Doppler signal will work for
blood vessels close to the skin, it is not suitable for studying the heart,
which is deep in the body.
Most Doppler systems now use pulsed signals. These allow
examination of blood vessels deep under the skin. The position of the blood
vessel can be determined with a traditional B-scan. Then a pulsed signal
is directed into the blood vessel. The time between pulses is
determined by how deep the vessel is, as the return pulse must be received
before the next pulse is sent. If a deeper blood vessel needs to be
examined a longer period of time is waited before the next pulse is sent. The
timing of the pulses is determined electronically and they are only
microseconds apart.
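
The timing rule above can be put into numbers with a short sketch. The echo
must make the round trip to the vessel and back before the next pulse leaves,
so the shortest usable interval is 2 × depth ÷ speed. The function name and the
vessel depths below are assumptions for illustration, and 1540 m s⁻¹ is the
speed of sound in the body quoted in the chapter questions.

```python
def min_pulse_interval(depth_m, sound_speed=1540.0):
    """Shortest time between pulses so each echo returns before the next pulse."""
    # The pulse must travel down to the vessel and back: distance = 2 * depth.
    return 2.0 * depth_m / sound_speed


for depth_cm in (2, 5, 10):  # assumed vessel depths
    interval_us = min_pulse_interval(depth_cm / 100) * 1e6
    print(f"depth {depth_cm} cm: wait at least {interval_us:.0f} microseconds")
```

For these depths the waiting time ranges from a few tens of microseconds to a
little over a hundred, and it grows in proportion to the depth of the vessel.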
Many modern Doppler instruments allow an image of the anatomy of
the body and blood flow to be recorded in real time. This is called
real-time, two-dimensional colour flow imaging. A pulse-echo system is used
to obtain a phase scan of, for example, the heart. Pulsed Doppler
techniques are used at the same time to obtain blood flow information. This
information is colour-coded to indicate whether the blood is flowing
towards or away from the transducer. Red, orange and yellow are used for
blood flowing towards the transducer, the colour depending on the
speed of the blood. Dark blue to light blue is used for blood flowing away
from the transducer. The colour varies across the artery, showing that
the blood flows more quickly in the centre of the artery.

To view Doppler ultrasound video images of blood flow through the
heart, search the internet using key words such as ‘Doppler colour flow
imaging’, ‘Doppler ultrasound images’ or ‘Colour Doppler echocardiography’.



Figure 18.14 (a) Block diagram of a real-time, two-dimensional, colour flow imaging system.
A phased array transducer produces a sector scan. Distance information is extracted by the
pulse-echo system, and velocities are determined from the size of the Doppler shift. From
these two signals an overall display is made, in which the blood flow is colour coded.
Blocks shown: transducer, steering of ultrasound beam, scan position data, detection of
velocity of blood flow, colour addition, pulse-echo imaging to detect depth, greyscale,
image formatter and colour display. (b) Display of an ultrasound scan showing blood flow
through a section of the carotid artery



Table 18.3 Medical uses for ultrasound scans

| MEDICAL AREA AND ORGANS EXAMINED | HOW ULTRASOUND IS USED |
|---|---|
| Cardiology: the heart | A real-time ultrasound phase scan of the beating heart allows diagnosis of abnormal heart wall motion or disease or fluid accumulation in the region around the heart (an echocardiogram). When the scan is combined with Doppler colour flow imaging, the valves can be checked to see if they open and shut correctly and if they leak. The blood flow in the vessels and heart chambers can also be checked. |
| Endocrinology: the thyroid gland | Ultrasound phase scans are used to detect cysts, tumours or goitres. |
| Gynaecology: female reproductive organs | Ultrasound phase scans are used to detect blocked oviducts or ectopic pregnancy (foetus attached in the fallopian tubes, not the uterus). |
| Obstetrics: the foetus and uterus | Ultrasound scans are used to determine the position of the foetus, placenta and umbilical cord before birth (see figure 18.3(c), page 344). Multiple births can be detected. Ultrasound can be used during amniocentesis to guide the safe sampling of amniotic fluid. |
| Paediatrics: the infant | A sector scan can be used to image an infant’s brain through the unclosed space in the skull. |
| Renal: the kidneys | Pulsed ultrasound phase scans are used to detect kidney tumours, kidney stones or blockages of the renal tubes. |
| Vascular: the arteries | Doppler ultrasound can be used to detect blood flow, particularly through the carotid artery, to assess the chance of a stroke. Blood flow through the umbilical cord can also be studied. |

Figure 18.15 Ultrasound images can be used to identify diseased organs, such as in
this comparison of (a) an early stage and (b) late stage of cancer of the liver.



Figure 18.16 Ultrasound is safe
for use with babies and foetuses.

Table 18.4 Advantages and disadvantages of ultrasound for medical diagnosis

ADVANTAGES
• No damaging side-effects are known.
• It is non-invasive. As no surgery is involved, there is no risk of infection.
• It is more effective than conventional X-rays in producing images of soft tissues.
• The equipment is relatively inexpensive.
• The equipment is safe, portable and can be operated from a wall socket.
• Real-time imaging is possible.
• ‘Keyhole’ surgery can be carried out at sites close to the body surface by the
  use of real-time ultrasound imaging while operating. For example, a damaged part
  of an artery can be located using ultrasound and a balloon catheter can be
  inserted to help repair it through a small incision using local anaesthetic.

DISADVANTAGES
• As the interface between gas and soft tissue reflects 99.9 per cent of the
  ultrasound energy that hits it, images cannot be obtained of structures that
  lie on, for example, the far side of the lungs or intestines.
• It is difficult to obtain good ultrasound images of the brain of an adult, as
  most of the ultrasound is reflected from the tissue/bone interface.
• Images are not as clear as those obtained by many other techniques.
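
The 99.9 per cent figure for a gas and soft tissue boundary can be checked with
the intensity reflection relation used in this chapter,
I_r/I_0 = (Z₂ − Z₁)²/(Z₂ + Z₁)², where Z₁ and Z₂ are the acoustic impedances on
either side of the boundary. The sketch below is illustrative only. The
impedance of air (429 kg m⁻² s⁻¹) is the value quoted in the chapter questions,
while the soft tissue impedance is an assumed typical value, not a figure taken
from table 18.2.

```python
def reflected_fraction(z1, z2):
    """Fraction of incident intensity reflected at a boundary (I_r / I_0)."""
    return ((z2 - z1) / (z2 + z1)) ** 2


Z_AIR = 429.0           # kg m^-2 s^-1, value quoted in the chapter questions
Z_SOFT_TISSUE = 1.63e6  # kg m^-2 s^-1, assumed typical value for illustration

fraction = reflected_fraction(Z_SOFT_TISSUE, Z_AIR)
print(f"Reflected at a tissue-air boundary: {fraction:.1%}")
```

Because the two impedances differ by a factor of several thousand, almost all
of the energy is reflected and very little is left to image anything beyond the
gas. When the impedances on the two sides are equal the reflected fraction
falls to zero, which is the purpose of the gel placed between the transducer
and the skin.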



CHAPTER REVIEW

SUMMARY
• Ultrasound is high frequency sound above the range of normal hearing.
• As the frequency of the wave increases, the absorption of the wave increases.
• As the energy of the wave increases, the heating effect on the tissues increases.
• A piezoelectric crystal can be made to vibrate and produce ultrasound by
  applying an alternating voltage. This is the principle behind generation of
  ultrasound in a transducer.
• Ultrasound can make a piezoelectric crystal vibrate and produce an alternating
  voltage. This is the principle that allows a transducer to detect ultrasound.
• Acoustic impedance (Z) can be calculated using Z = ρv where ρ is the density of
  the material and v is the velocity of the sound through the material. Acoustic
  impedance measures how readily sound passes through a material.
• At interfaces where the acoustic impedance changes, ultrasound is reflected
  and refracted.
• The ratio of the intensity of reflected signal (I_r) to initial signal (I_0)
  is given by

  $$\frac{I_r}{I_0} = \frac{(Z_2 - Z_1)^2}{(Z_2 + Z_1)^2}.$$

• A-scans measure in one dimension and record depth of structure in the body.
  They are used mainly for scanning the eye.
• B-scans record intensity of reflection in two dimensions and build up an image
  of internal structure through a series of dots of varying intensities.
• Sector scans are scans in the shape of a sector, made up from a series of B-scans.
• Phase scans are scans produced using an array of transducers. There can be a
  phase difference between the signals from the transducers.
• Bone density can be investigated by measuring the speed of ultrasound and
  attenuation of ultrasound in the heel bone.
• The Doppler effect is the apparent change in frequency observed when there is
  relative movement between the source and an observer.
• Pulsed Doppler techniques are used to measure the speed of blood through blood
  vessels and through the heart.
• The Doppler effect can be used to detect heart problems such as abnormal heart
  wall motion, faulty valves and accumulation of fluid around the heart.

QUESTIONS
1. Contrast ultrasound with the sound used for normal hearing in humans.
2. Describe how ultrasound is transmitted through body material.
3. Describe what is meant by the piezoelectric effect.
4. Outline why a piezoelectric crystal can be made to produce and receive
   ultrasound waves.
5. Using the data from table 18.2 (page 345), calculate the acoustic impedance
   of muscle.
6. If the acoustic impedance of blood is 1.59 × 10⁶ kg m⁻² s⁻¹ and the velocity
   of sound through the blood is 1570 m s⁻¹, calculate the density of blood.
7. The density of air is 1.3 kg m⁻³ and its acoustic impedance is
   429 kg m⁻² s⁻¹. Calculate the velocity of ultrasound through air.
8. (a) Using the data from table 18.2, calculate the acoustic impedance of:
       (i) soft tissue
       (ii) bone of density 1600 kg m⁻³.
   (b) Calculate the ratio of the amplitudes of reflected and transmitted
       ultrasound travelling from soft tissue to the bone in part (a).
9. Using the data from table 18.2, calculate the fraction of incident
   ultrasound intensity reflected from a liver-muscle interface.
10. The value of the ratio of reflected intensity to incident intensity
    (I_r/I_0) for sound at various interfaces is found in the table on the
    opposite page. Use it to answer the following questions.
    (a) Identify the tissue interface at which there is the most reflection.
    (b) Identify the tissue interface at which there is the least reflection.
    (c) Identify the tissue interface at which the greatest amount of
        absorption occurs.

    (d) If an ultrasound signal of intensity 60 mW cm⁻² meets a fat-bone
        interface, calculate the intensity of the reflected signal.
    (e) (i) If the ultrasound signal striking a fat–muscle interface is
            80 mW cm⁻², calculate how much energy travels into the muscle.
        (ii) Describe what happens to this energy.

| TISSUE OR MATERIAL INTERFACE | I_r/I_0 |
|---|---|
| Air–water | 0.999 |
| Water–blood | 0.0057 |
| Brain–fat | 0.0044 |
| Fat–muscle | 0.011 |
| Fat–bone | 0.029 |
| Skin–air | 0.999 |
| Water–brain | 0.024 |

11. If ultrasound travels towards the lungs, which are full of air, what will
    happen at the interface between the lung and the surrounding tissue?
    Justify your answer.
12. Using the information from table 18.2 (page 345), compare the percentage of
    ultrasound reflected at the junction between fat and liver with the
    percentage of ultrasound reflected at the junction between liver and fat.
13. A pregnant woman needs to have a bladder full of urine if she wishes to
    have a successful ultrasound of her baby. Explain why an empty bladder
    would make an ultrasound unsuccessful.
14. A low-intensity ultrasonic beam of 15 mW cm⁻² is used to study the lens of
    the eye. Use the data in table 18.2 to calculate the intensity of the
    reflected beam if we assume the fluid in front of the lens is aqueous
    humour.
15. For the following question, assume the density of skin is 1010 kg m⁻³ and
    the velocity of sound through skin is 1540 m s⁻¹. A 1 MHz transducer
    requires the use of a gel on the skin to avoid acoustic mismatch at the
    skin-transducer interface.
    (a) Describe what would happen if air was between the transducer and the
        skin.
    (b) Calculate the optimum acoustic impedance of the gel and justify your
        answer.
    (c) Calculate the speed of the ultrasound in the gel if the gel is made of
        material of density 1200 kg m⁻³.
16. Outline the difference between an ultrasonic A-scan and an ultrasonic
    B-scan.
17. Using a specific situation where A-scans are used in the body, outline why
    they are sometimes referred to as ‘range finders’.
18. Figure 18.17 shows an oscilloscope display of pulse amplitude against time
    for an ultrasound A-scan through a person’s abdomen.

    Figure 18.17 An oscilloscope display of pulse amplitude against time for an
    ultrasound A-scan. Axes: strength of signal against time (ms), marked at 0,
    0.1, 0.2 and 0.3. Labelled features: generated pulse, spinal column echo.

    (a) Explain how the spacings of the pulses are interpreted.
    (b) Give two reasons why the amplitude of the reflected pulses varies.
    (c) If the speed of ultrasound through water and soft tissue is
        approximately 1500 m s⁻¹, estimate the distance between the front of
        the patient’s abdomen and the spinal column.
    (d) Name the type of scan now more commonly used for this part of the body.
19. A body structure at a depth of 350 mm is to be imaged using a B-scan. In
    order to obtain a clear image the reflected signal must be received before
    the next pulse is sent.
    (a) Assuming the sound speed is 1540 m s⁻¹ in the body, calculate the
        minimum time between pulses that may be used to provide an unambiguous
        image.
    (b) Explain why a faster rate of pulse would produce an image that was not
        clear.
20. Use a table to discuss the value of A-scans, B-scans, sector scans and
    phase scans as ultrasound imaging techniques.



21. (a) Use a flow diagram to describe how
ultrasound is used to obtain bone density
information.
(b) Outline its effectiveness in the diagnosis of
osteoporosis. (You may wish to use the websites
and key words suggested on page 352.)
22. Outline how ultrasound Doppler effect is used
to monitor foetal heart rate.
23. Outline what is meant by ‘real-time imaging’
and assess its value in ultrasound diagnosis.
24. Carry out internet research, using the key
words suggested on page 354, to observe and
describe blood flow from the heart by
observing ultrasound video images.
25. Collect at least two ultrasound images of body
organs. One source of these images is the
internet (search using key word ‘ultrasound
images’). Compare the images.


CHAPTER 19 ELECTROMAGNETIC RADIATION AS A DIAGNOSTIC TOOL
Remember
Before beginning this chapter you should be able to:
• recall the features of X-rays, including speed,
frequency and wavelength range
• outline what is meant by the photoelectric effect
• describe what is meant by critical angle and outline
the conditions needed for total internal reflection.

Key content
At the end of this chapter you should be able to:
• describe how X-rays are produced
• compare the differences between ‘soft’ and ‘hard’
X-rays
• explain how a computed axial tomography (CT) scan
is produced
• identify X-ray images of fractures and other body
parts
• compare the information from a CT scan image with
that provided by an X-ray image of the same body
part
• describe when a CT scan would be a superior
diagnostic tool to either X-rays or ultrasound
• explain how an endoscope works with particular
reference to the transfer of light through the optic
fibres
• observe images produced by an endoscope
• discuss the role of coherent and non-coherent
bundles of optic fibres in an endoscope
• explain how an endoscope is used to observe internal
organs and to assist in obtaining tissue samples.

Figure 19.1 X-ray of the lungs showing damage to the right-hand side due to
tuberculosis

In this chapter we will discuss the properties of electromagnetic radiation
as they apply to medical diagnosis. X-rays are used frequently in areas of
medicine and dentistry. It is likely that you or people you know have had
X-rays at some time, for example, to check the development of teeth at
the dentist or at a hospital for a suspected broken bone. (Figure 19.1
shows an X-ray of the lungs.) A more complex procedure is the CT scan.
It also uses X-rays and may be used to obtain images of cross-sections
through the body to enable diagnosis of such problems as brain disorders,
ruptured spinal discs or damage to soft tissue in association with
a bone fracture.
We will also consider the development of optic fibres and their use in
endoscopes, which have allowed light to be used to examine the body
internally and even to operate using keyhole surgery.

19.1 X-RAYS IN MEDICAL DIAGNOSIS


What are X-rays?
X-rays are part of the electromagnetic spectrum, discussed in the Preliminary
Course (see Physics 1, chapter 3). X-rays are electromagnetic waves of very
high frequency and very short wavelength, in the range 0.001 nm to 10 nm.
Because of their high frequency, and hence high energy, they will penetrate
flesh and may cause ionisation of atoms they encounter.

PHYSICS FACT
Effect of X-radiation on the body
If the intensity of X-radiation striking the body is great enough, it may be
absorbed and cause electrons to be removed from atoms or molecules
(ionisation). The effect may be harmful, which is why X-radiation is often
referred to as ‘harmful ionising radiation’. One reaction that may occur is the
ionisation of water molecules in the body and the subsequent formation of
hydroxyl and hydrogen free radicals. (Free radicals are uncharged fragments of
a molecule resulting from a covalent bond being broken. Each free radical has
an unpaired electron and is highly reactive.) These free radicals may alter
base structures and sequences in DNA in chromosomes, causing mutations.
The result may be somatic effects that affect only the person exposed to the
radiation, or hereditary effects that affect the reproductive organs and may
be passed on to the person’s children.
Radiation that can cause damage to the body includes alpha (α), beta (β) and
gamma (γ) radiation as well as X-rays. The amount of radiation present is
measured in units called sieverts (Sv).
Dose limits that are considered to be safe are set by government bodies and
therefore vary from one country to another. For example, in Australia the
recommended limit for the general population is 1 mSv per year. This appears
to be a conservative value as the limit for radiation workers is set at 20 mSv
per year averaged over 5 consecutive years. These values are in addition to the
background radiation from the Earth and from cosmic rays, which amounts to a
value under 3 mSv. Approximate values for radiation from various sources are
listed below.

SOURCE OF RADIATION                                             APPROXIMATE DOSE
Aircraft crew additional annual exposure due to cosmic rays     2000 µSv
Dental X-ray                                                    < 10 µSv
Chest X-ray                                                     20 µSv
Pelvic X-ray                                                    70 µSv
Mammogram                                                       < 4000 µSv
‘Barium meal’ X-ray                                             3000 µSv
CT scan of head                                                 2000 µSv
CT scan of chest                                                8000 µSv
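
To get a feel for the scale of these figures, the short Python sketch below converts the listed doses from microsieverts to millisieverts and expresses each as a multiple of the 1 mSv annual limit quoted for the general population. The dose values are those listed in this box (entries given as ‘less than’ are entered at their upper bound); the variable names are illustrative only, and the comparison is for scale rather than a statement about how dose limits are applied to medical procedures.

```python
# Doses from the list in this box, in microsieverts (uSv).
# Entries quoted as "< value" are entered at their upper bound.
doses_usv = {
    "Aircraft crew (annual, cosmic rays)": 2000,
    "Dental X-ray": 10,
    "Chest X-ray": 20,
    "Pelvic X-ray": 70,
    "Mammogram": 4000,
    "Barium meal X-ray": 3000,
    "CT scan of head": 2000,
    "CT scan of chest": 8000,
}

PUBLIC_LIMIT_MSV = 1.0  # annual limit for the general population (from the text)

# 1 mSv = 1000 uSv
doses_msv = {source: dose / 1000 for source, dose in doses_usv.items()}

for source, dose in doses_msv.items():
    multiple = dose / PUBLIC_LIMIT_MSV
    print(f"{source:38s} {dose:6.2f} mSv  ({multiple:.2f} x annual public limit)")
```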



When X-rays pass through the body, energy is absorbed by the body
tissue and the intensity of the beam is reduced. Denser material, such as
bone, absorbs more X-radiation.

Production of X-rays
In the topic ‘From ideas to implementation’ (see chapter 10), you learnt
that X-rays are emitted from a cathode ray tube when the cathode rays
strike the glass of the tube. Similar principles are used to produce X-rays
for medical diagnosis.

Figure 19.2 An X-ray tube (labels: evacuated chamber; heated filament; electron
beam; metal target (tungsten); anode mounting (copper); coolant circulates
here; window; X-rays; very high potential difference)

A cross-section of an X-ray tube is shown in figure 19.2. The tube is highly
evacuated, and a very high potential difference, from 25 000 to 250 000 volts,
is applied between the anode and cathode.
The cathode is a filament of wire through which a current is passed.
Electrons are emitted from the hot filament and a metal focusing cup
directs the electrons towards the anode. The very high potential difference
between the cathode and anode accelerates the electrons to the anode. The
anode is usually made of tungsten that can withstand the high temperatures
generated. When the electrons strike the tungsten they are absorbed and some
of their energy is converted to X-rays. By placing the tungsten target at an
angle to the incoming electron beam, the X-rays that are emitted from the
tungsten can be sent in a predetermined direction.
Tungsten is usually used for the target as it has a very high melting
point of about 3400°C and emits X-rays when struck by electrons. The
production of X-rays is not very efficient as only about 1 per cent of the
energy reaching the target is converted to X-rays, the rest being converted
to heat. The heat generated in the target each second can be enough to bring
a cup of water to boiling point. Hence it is important to prevent the target
from overheating or melting. Copper, a good conductor of heat, is used for the
anode mountings, and oil circulating in the outer region near the anode
carries heat away by convection. Overheating is also reduced by rotating the
target at a rapid rate of approximately 3600 revolutions per minute, so that
the heat produced is distributed over a large area.
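
As a rough illustration of this energy budget, the Python sketch below estimates how the electrical power delivered to the target divides between X-rays and heat. The 1 per cent efficiency is the approximate figure given in the text. The tube voltage (100 kV, within the range quoted earlier) and the tube current (0.5 A) are assumed values chosen only for illustration, as are the function and variable names.

```python
# Sketch of the energy budget of an X-ray tube.
# Assumed for illustration only: tube voltage 100 kV, tube current 0.5 A.
# The 1 per cent conversion efficiency is the approximate figure from the text.

XRAY_EFFICIENCY = 0.01  # fraction of the beam energy converted to X-rays


def tube_power_budget(voltage_volts, current_amps, efficiency=XRAY_EFFICIENCY):
    """Return (total power, X-ray power, heat power) in watts."""
    total = voltage_volts * current_amps  # electrical power, P = V I
    xray = efficiency * total             # small fraction emitted as X-rays
    heat = total - xray                   # remainder heats the target
    return total, xray, heat


total, xray, heat = tube_power_budget(100_000, 0.5)
print(f"Electrical power to target: {total / 1000:.1f} kW")
print(f"Converted to X-rays:        {xray / 1000:.1f} kW")
print(f"Converted to heat:          {heat / 1000:.1f} kW")
```

With these assumed values nearly all of the 50 kW supplied appears as heat in the target, which is why the copper mounting, circulating oil and rotating target are needed.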

Use and detection of X-rays


Since X-rays cannot be focused, the images from X-rays are shadows of
objects placed in the beam. To obtain the sharpest image it is necessary
to have an object that is as still as possible and illuminated by an X-ray
beam of small cross-sectional area, with the detecting plate as close to the
object as possible. In this way, blurring of the image is minimised as the
penumbra of the shadow is reduced. The surrounding material also
affects the sharpness of the image as scattering of X-rays from surrounding
tissue may occur. This is illustrated in figure 19.3, using light
instead of X-rays, a foot to represent internal bone and cloudy water to
represent surrounding body tissue.

Figure 19.3 Obtaining a sharp shadow image. (a) A narrow source (pinhole
opening) produces a sharp shadow of the object. (b) An extended source or large
distance between object and screen results in a blurry image due to the large
penumbra. (c) Cloudy water scatters light and produces a blurry image.
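
The geometry of parts (a) and (b) of figure 19.3 can be put into numbers using similar triangles: the width of the penumbra is approximately the width of the source multiplied by the object-to-detector distance and divided by the source-to-object distance. The Python sketch below applies this relation. It is a simplified geometric model that ignores scattering, and the distances and source widths used are assumed values for illustration only.

```python
def penumbra_width(source_width, source_to_object, object_to_detector):
    """Width of the blurred edge (penumbra) of a shadow, by similar triangles.

    All lengths must be in the same unit; the result is in that unit.
    """
    return source_width * object_to_detector / source_to_object


# Assumed example values, all in millimetres.
narrow_close = penumbra_width(1.0, 1000.0, 50.0)   # small source, plate close
wide_close = penumbra_width(5.0, 1000.0, 50.0)     # extended source
narrow_far = penumbra_width(1.0, 1000.0, 300.0)    # plate far from object

print(f"Narrow source, plate close: {narrow_close:.2f} mm")
print(f"Wide source, plate close:   {wide_close:.2f} mm")
print(f"Narrow source, plate far:   {narrow_far:.2f} mm")
```

In this model both a wider source and a larger gap between object and detecting plate increase the penumbra, in line with the conditions for a sharp image described above.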

A narrow beam of X-rays can be obtained by controlling the angle of
the anode target. This technique allows the beam of electrons to strike
the target over a reasonable area while significantly reducing the width of
the X-ray beam, as illustrated in figure 19.4.

Figure 19.4 Electrons hitting the target over a wide area produce a narrow beam
of X-rays. (Labels: filament; wide beam of electrons; rotating tungsten target;
anode angle; narrow beam of X-rays)
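
The effect of the anode angle can be estimated with simple trigonometry. If the electrons strike a strip of the target of a given length, the width of that strip as seen along the direction of the emerging X-ray beam is the length multiplied by the sine of the anode angle. The Python sketch below assumes the anode angle is measured between the target face and the direction of the emerging X-ray beam, a definition the text does not state; the strip length and angles are illustrative values only.

```python
import math


def effective_focal_spot(actual_length_mm, anode_angle_deg):
    """Apparent source width seen along the X-ray beam direction.

    Assumes the anode angle is measured between the target face and the
    direction of the emerging X-ray beam (an assumption, not from the text).
    """
    return actual_length_mm * math.sin(math.radians(anode_angle_deg))


actual_mm = 6.0  # assumed length of the strip struck by the electrons
for angle in (10, 17, 30):  # assumed anode angles in degrees
    width = effective_focal_spot(actual_mm, angle)
    print(f"Anode angle {angle:2d} degrees: apparent source width {width:.2f} mm")
```

Under this assumption a smaller anode angle gives a narrower apparent source while the electrons still spread their heat over the full strip.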

The X-ray beam is directed at the part of the patient being imaged.
Some tissues absorb X-rays very well and cast a shadow on the detecting
screen. Bone is more dense than soft tissue and absorbs X-rays more strongly.
Consequently bones produce a clear image when X-rayed.
X-rays may be detected on a photographic film or by an image intensifier.
The photographic film is used when a record of the image is required. An
image intensifier allows direct viewing of the X-ray image. X-rays strike a
phosphor screen that produces light. This light stimulates a photocathode to
produce electrons that are accelerated to strike an output phosphor screen,
producing more light than was originally generated and intensifying the image
up to 1000 times. The image produced can be viewed directly by the eye, a
movie camera or a TV camera. The viewing area can be altered while the X-ray
process is occurring.

PHYSICS FACT

In shoe shops in the first half of the twentieth century, a fluorescent
screen was used to observe the X-rays passing through a person’s
foot to see if toes were squashed by shoes that were too tight. These
screens were banned from about 1960 because of the danger from
exposure to scattered radiation, a danger that clearly outweighed
any benefit in shoe fitting, as is obvious from figure 19.5.

Figure 19.5 (a) Shoe fluoroscope showing a considerable amount of scattered
X-radiation (dashed lines) striking other parts of the boy’s body, for example,
the reproductive organs. (b) Viewed from above: X-rays are used to produce an
image of the boy’s feet. (Labels: viewing ports; lead glass; fluorescent
screen; X-ray source)

A fluorescent screen did not produce a bright enough image to
view easily. Rather than increasing the intensity of the X-ray beam, the
fluorescent screen was replaced, in medical diagnosis, by the X-ray
intensifying technique using phosphor screens and photocathodes.

Types of X-rays
As outlined earlier, X-rays are produced when electrons strike a target.
There are two mechanisms by which X-rays are produced.
The first mechanism produces X-rays with a range of frequencies. The
electrons are slowed down by the target atom and some of each electron’s
kinetic energy is converted to electromagnetic radiation corresponding to
X-radiation. The frequency of the X-radiation depends on the amount by
which the electron has been slowed, or, in other words, the amount of
braking that has occurred. This radiation is called Bremsstrahlung radiation,
from the German for ‘braking radiation’.
The second mechanism results in the ionisation of the atom and so the
frequency of the X-radiation produced depends on the nature of the target
atom. A series of frequencies that make up the characteristic or line spectrum
is produced. To produce X-radiation of a particular frequency, an electron
is knocked out of an inner shell of an atom by the approaching
electron. An outer shell electron takes the inner shell electron’s place,
losing energy equal to the energy difference between the two shells. The
energy difference between electrons in the two shells determines the
frequency of the X-ray produced.
Figure 19.6 shows the X-rays produced by a typical target.

Figure 19.6 The energy of the X-ray photons (and the corresponding wavelength)
plotted against the intensity of the X-rays. (Axes: relative intensity, 0 to
1.0, against photon energy, 0 to 200 keV, with the corresponding wavelength in
nm. The continuous curve shows the wavelengths due to Bremsstrahlung radiation
and ends at Emax (λmin); the sharp peaks are the characteristic wavelengths.)
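
The two horizontal scales in figure 19.6 are linked by the photon relation E = hc/λ, and the cut-off at Emax corresponds to an electron giving all of the kinetic energy it gained from the tube voltage to a single photon. The Python sketch below converts between the two scales. The constants are rounded values and the function names are illustrative; the 200 kV tube voltage is an assumed example within the range quoted earlier in the chapter.

```python
PLANCK = 6.626e-34            # Planck's constant, J s (rounded)
LIGHT_SPEED = 3.00e8          # speed of light, m/s (rounded)
ELECTRON_CHARGE = 1.602e-19   # magnitude of the electron charge, C (rounded)


def wavelength_nm(photon_energy_kev):
    """Wavelength in nanometres of a photon with the given energy in keV."""
    energy_joules = photon_energy_kev * 1000 * ELECTRON_CHARGE
    return PLANCK * LIGHT_SPEED / energy_joules * 1e9


def max_photon_energy_kev(tube_voltage_volts):
    """Largest photon energy from a tube: E_max = eV, so V volts gives V eV."""
    return tube_voltage_volts / 1000


for energy in (50, 100, 150, 200):
    print(f"{energy:3d} keV corresponds to {wavelength_nm(energy):.4f} nm")

e_max = max_photon_energy_kev(200_000)  # assumed 200 kV tube voltage
print(f"E_max = {e_max:.0f} keV, lambda_min = {wavelength_nm(e_max):.4f} nm")
```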

Hard and soft X-rays


A thin sheet of material placed in the path of the X-ray beam will act as a
filter and absorb more low-energy photons than high-energy photons.
The beam will now have a higher proportion of high-energy photons, which are
more penetrating. The beam is said to be hard. By contrast, a beam of X-rays
with lower energy photons is less penetrating and is said to be soft.
Note that the higher the energy of the photons, the higher their frequency
and the shorter their wavelength.

Hard X-rays consist of high-energy photons and are more penetrating than soft
X-rays, which have lower energy photons.

Hard X-rays are preferred for imaging as they penetrate the body and
are absorbed by material such as bone, allowing images of the bone to be
observed. Soft X-rays are not useful for imaging as they will not penetrate
the body. They only expose the patient to additional, useless and possibly
harmful X-radiation.

Using conventional X-rays as a diagnostic tool


X-rays have a number of different effects on the tissues of the body,
depending on the energy of the X-ray photons and the time of exposure
to them. For diagnostic purposes, the optimal photon energy is around
30 keV, resulting in the best contrast between different tissues. At
this energy the photoelectric effect dominates. This means the X-rays
are absorbed by the tissues and electrons are released. The extent of the
X-ray absorption depends on the cube of the number of protons in the
nuclei of the atoms encountered. For example, bone, which has a high
atomic density (high number of protons in the nuclei), will attenuate the
beam about eleven times more than the surrounding tissue and hence
produce a strong X-ray shadow, allowing a very good image of the bone
to be obtained.

Attenuation of a signal like an X-ray beam is the reduction in intensity of
the beam.

Atomic density values are high for bone, moderate for soft tissue and
low for air. Hence the skeleton is imaged very well by X-rays.

Figure 19.7 X-ray showing a fracture of the radius and ulna in the forearm

Imaging parts of the body


To image soft tissue, an artificial contrast medium that absorbs X-rays
readily may be introduced. Iodine in a compound is introduced into the
bloodstream for investigations of the circulatory system. To X-ray the
gastrointestinal tract, which is composed of soft tissue, a ‘barium meal’
consisting of a thick suspension of barium sulfate is swallowed by the
patient or introduced into the intestines through the anus. The barium
compound absorbs X-rays and gives a clear image, as shown in figure 19.8.
A chest X-ray is the most common way of detecting lung cancer or
tuberculosis. The X-ray must be taken from several different directions to
overcome the problem that the heart sometimes obstructs a clear view of the
lungs.
The teeth and jaw are X-rayed to detect tooth decay as well as crowded teeth
or wisdom teeth, before surgery or orthodontic treatment is recommended.
Someone who has swallowed a foreign object may be X-rayed to locate the
position of the object.

Figure 19.8 Barium sulfate in the bowel shows as white contrast. Two
narrowings, one due to a tumour, are shown. The black sections of the bowel
are where the barium compound has coated the wall of the bowel, which is
filled with gas.



An X-ray may be needed to determine whether a patient has a metallic
implant before an MRI examination is ordered (see figure 19.9(a)).
For imaging the breast, which is an area of continuous soft tissue,
careful choice of the X-ray beam and film detector provides high resolution
(see figure 19.9(b)). Molybdenum targets in the tube and low voltage
maximise photoelectric attenuation. High tube current and short
exposure time minimise image blur due to movement by the patient.

Figure 19.9 (a) An X-ray image showing a heart pacemaker (b) An X-ray image of
a breast showing a tumour

As we will see in the next section, a better technique, called computed
axial tomography (CT), is used for imaging soft tissue as it detects small
differences in X-ray attenuation.

19.2 CT SCANS IN MEDICAL DIAGNOSIS
Computed axial tomography scanning (or CT scanning) uses X-rays to
obtain an image of a cross-section through the body. Very slight differences
in X-ray attenuation can be measured and so soft tissue can be
accurately imaged.



PHYSICS FACT

Godfrey N. Hounsfield was born in England in 1919 and educated as
an electrical engineer. He joined the medical systems section of the firm EMI
in 1951 and his long career in medical research and engineering led to his
invention of the computed axial tomography scanner, for which he earned the
Nobel Prize for Medicine in 1979.
Tomography comes from the Greek word tomos meaning ‘slice’. The CT
scanner produces an image of a slice through the object it is examining.
Hounsfield analysed the data by computer, using a technique that was
originally developed for use in astronomy.

Figure 19.10 The first commercial CT scanner. The head is surrounded by a
rubber bag filled with water to absorb scattered X-rays.

How is a CT scan produced?


A CT scanner consists of an X-ray tube that is rotated around the patient
being imaged. The tube and detection mechanism are mounted on a frame
called a gantry. The part of the patient’s body being scanned is positioned
in a gap in the gantry. An image is obtained in the plane being examined.
The patient, on a bed, is moved slowly through the gantry so that a series
of images of ‘slices’ through the body may be obtained.

Figure 19.11 A modern CT scanner showing the control console and gantry
assembly



The X-ray source must produce a very narrow beam so that the path of
the X-rays can be carefully controlled. To produce the narrow beam the
tube voltages are high and consequently a lot of heat must be conducted
away from the anode in the tube generating the X-rays. This requirement,
coupled with the tube movement during scanning, results in the
tubes failing and having to be replaced after a few months of use. The
cost of such replacement is high.
The beam needs to be filtered to remove some soft X-rays, which are
not needed. This ensures that the beam is relatively homogeneous and
the dose to the patient’s skin is reduced. The X-rays are detected by an
array of several hundred detectors. The newer detectors convert the
X-radiation directly into electrical signals that go to multiple
integrated-circuit amplifiers.
The patient is accurately positioned in the gantry so that a plane of the
body can be scanned. A beam of X-rays is sent through the patient and
detected on the other side. The beam is then rotated, usually by 1°, and
another beam transmitted and detected. This process is repeated until an
angle of 180° has been swept out.

PHYSICS FACT

Early CT machines used a single X-ray beam and a single detector,
and to rotate over the full 180° took about 20 minutes. This was
because each beam scan took about 5 seconds, followed by a rotation
of 1° and then a 1-second delay to allow the machine to stop
vibrating. Recent improvements using several hundred stationary
detectors and a rotating X-ray tube reduce the scan time to less than
2 seconds and the slice to 2 mm. In more recent machines, the table
on which the patient lies is moved in a smooth, stepless motion while
the tube assembly rotates continuously. This has resulted in a
spiral scanning method in which image data is obtained faster than 5
images per second. By using paired detectors it is possible to scan two
slices 1 cm apart simultaneously. The increased speed of data collection
has made possible such studies as CT angiography (dynamic
studies of the blood vessels of a beating heart). CT angiograms
obtained in a 9-second scan are increasingly popular to diagnose
blocked arteries in seemingly healthy people.
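
The ‘about 20 minutes’ quoted for early machines can be checked with a line or two of arithmetic. The Python sketch below assumes one beam scan at each 1° step over 180°, with the 5-second scan and 1-second settling delay given above, and ignores the time taken by the rotation itself; the variable names are illustrative.

```python
# Estimate of the total scan time for an early single-beam CT machine,
# using the figures quoted in the text.
SWEEP_DEGREES = 180      # total angle swept out
STEP_DEGREES = 1         # rotation between successive beam scans
SCAN_SECONDS = 5         # time for each beam scan
SETTLE_SECONDS = 1       # delay for vibrations to die away

steps = SWEEP_DEGREES // STEP_DEGREES
total_seconds = steps * (SCAN_SECONDS + SETTLE_SECONDS)
total_minutes = total_seconds / 60

print(f"{steps} positions, {total_seconds} s in total = {total_minutes:.0f} minutes")
```

The result of 18 minutes is consistent with the figure of about 20 minutes once the rotation time is allowed for.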

The data from the scan is collected, displayed and reconstructed using
a powerful computer. The computer analyses the absorption of the X-rays
at each measured point in the slice. For example, if X-ray beam absorption
is measured at 160 distinct points along each scanning path and 1°
increments in angle are used, approximately 29 000 distinct pieces of
data about X-ray absorption are obtained. The reconstruction, which is
explained in simplified form in figure 19.12, is the result of around one
million computations. The image can be displayed on a TV screen or
stored in the computer’s memory and used with other data to produce
an image in a different plane.
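
The figure of approximately 29 000 pieces of data follows directly from the numbers quoted. As a quick check, the Python sketch below multiplies the number of points along each scanning path by the number of 1° positions in a 180° sweep; the variable names are illustrative.

```python
# Number of absorption measurements in one slice, using the figures in the text.
POINTS_PER_PATH = 160    # distinct points measured along each scanning path
SWEEP_DEGREES = 180      # total angle swept out
STEP_DEGREES = 1         # angular increment between scanning paths

paths = SWEEP_DEGREES // STEP_DEGREES
measurements = POINTS_PER_PATH * paths

print(f"{POINTS_PER_PATH} points x {paths} angles = {measurements} measurements")
```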
In recent years, full CT body scans have been advertised for those who
want to detect problems before symptoms appear. The medical profession
has criticised this practice, citing several reasons. People are exposed
to unnecessary radiation, potential problems may not be detected and
harmless abnormalities may be found. Hence people are given either false
security or false alarms. (For internet information about full CT body scans,
use ‘CT scans’ or ‘full body scans’ as key words in a search engine.)



Figure 19.12 Creating a CT scan image

(a) The attenuation is measured at many points (pixels) from different angles.
(Here only 2 angles at 90° are recorded.)
Attenuation measured along each row:     5  4  2  4  5
Attenuation measured along each column:  5  4  2  4  5

(b) The total attenuation of the X-rays at each pixel is calculated (for
example, attenuation = 5 + 5 = 10 or attenuation = 5 + 2 = 7).

          column:   5   4   2   4   5
row 5              10   9   7   9  10
row 4               9   8   6   8   9
row 2               7   6   4   6   7
row 4               9   8   6   8   9
row 5              10   9   7   9  10

(c) A shade of grey is assigned to each pixel and from this the image is
created. (As an example assign the darkest shade to the smallest number.)
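
The simplified two-angle example in figure 19.12 can be reproduced in a few lines of Python. The sketch below adds the row and column attenuation readings to obtain a total for each pixel and then assigns a grey level, with the smallest total given the darkest shade as the figure suggests. It follows the simplified scheme of the figure only and is not the reconstruction method used in a real scanner; the 0 to 255 grey scale and the function names are assumptions made for illustration.

```python
# Attenuation readings from figure 19.12, taken at two angles 90 degrees apart.
row_readings = [5, 4, 2, 4, 5]
column_readings = [5, 4, 2, 4, 5]


def total_attenuation(rows, columns):
    """Add the row and column readings that pass through each pixel."""
    return [[r + c for c in columns] for r in rows]


def to_grey_levels(grid, darkest=0, lightest=255):
    """Map the smallest total to the darkest shade and the largest to the lightest."""
    low = min(min(row) for row in grid)
    high = max(max(row) for row in grid)
    span = high - low
    return [[darkest + (value - low) * (lightest - darkest) // span
             for value in row] for row in grid]


pixels = total_attenuation(row_readings, column_readings)
shades = to_grey_levels(pixels)

for row in pixels:
    print(" ".join(f"{value:2d}" for value in row))
```

The printed grid matches the totals in part (b) of the figure, and the centre pixel, which has the smallest total, receives the darkest shade.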

Figure 19.13 Images showing positions of (a) transverse and (b) longitudinal
‘slices’ of the upper leg (femur) taken by a CT scan (c) Two transverse slices
through the femur showing positions of a tumour (d) Two longitudinal slices
showing the same tumour



CT scans as a diagnostic tool compared with X-rays and ultrasound
CT scans are more expensive than ultrasound and significantly more expensive
than conventional X-rays. They are, however, a superior diagnostic tool
to ultrasound and X-rays when fine detail is needed. This is the case when an
image of the brain is required.
CT scans provide detail to distinguish between areas where the density
difference is quite small even though a dense material shields the area. In
the brain, the density range is only a few per cent but the bony skull is so
dense that most of the X-rays are absorbed by the skull. A conventional X-ray
will therefore provide an image of the dense skull rather than the brain
tissue inside. However, by taking X-ray images from many angles in a CT scan,
the material along the path of the X-ray beams can be distinguished clearly.
Due to the method of obtaining and analysing the image it is possible to see
behind bone using a CT scan. With ultrasound, imaging behind bone is not
possible as the ultrasound signals are reflected strongly from a tissue-bone
interface so they do not reach the tissue beyond the bone and cannot give
information about this material. Hence ultrasound cannot be used to image the
brain through the skull.

Figure 19.14 Using computer analysis, the data from images of ‘slices’ through
the body can be combined to produce a three-dimensional image of the area
under investigation.

X-rays are valuable when there is high natural contrast between the
tissues to be viewed. This means there is a large difference in proton
number. The proton number is high for bone, moderate for soft tissue and
low for air. Hence X-rays are good for diagnosing bone problems such as
fractures, dislocation and arthritis. Conventional X-rays can also be used to
image soft tissue if an artificial contrast medium is introduced, such as a
barium meal if the digestive tract is being imaged (see page 367).
CT scans provide much better resolution of soft tissue than conventional
X-rays. They can be used to investigate soft tissue damage due to
bone fracture or ruptured spinal discs, or in other areas where a contrast
medium cannot be easily introduced. CT scans are also used to scan the
liver and kidneys to obtain resolution better than 1 mm. This means
differences separated by 1 mm can be detected. Ultrasound cannot produce
as clear an image as this.
CT scans are preferred for imaging the lungs (see figure 19.1, page
361). Although conventional X-rays give adequate routine lung
screening, CT scans provide clearer detail. Ultrasound cannot image past
air spaces in the lungs.
Ultrasound is the preferred imaging method for viewing blood flow,
whereas CT scans are limited unless sophisticated CT angiography is used.
X-rays need a contrast dye such as an iodine compound to image this area.
X-rays of sufficient intensity are harmful ionising radiation whereas
ultrasound does not produce any ionisation. Hence ultrasound is safe for
use with foetuses. Ultrasound could in fact be used many times with the
same patient without any harmful effects. The specific gain from a CT
scan or X-ray would need to be considered in each case before exposing
the patient to the X-ray doses involved.
Often a cheaper, quicker and relatively portable imaging technique using
X-rays or ultrasound can give an initial diagnosis that could lead to further
testing for tissue damage or internal bleeding by ordering a CT scan.

19.3 ENDOSCOPES IN MEDICAL DIAGNOSIS
Endoscopes are optical instruments using light for looking inside the
body to examine body organs, cavities and joints. Modern endoscopes
use optical fibres to transfer light to and from the area being examined.
Light is transferred along the optical fibre by total internal reflection,
the same concept that you studied in relation to communication technologies
in Physics 1, chapter 4.

An optical fibre is a glass core surrounded by a cladding of lower refractive
index. Light is transferred along the optical fibre by total internal
reflection.

PHYSICS FACT

The first endoscope was invented in the late nineteenth century. It
was a straight, rigid metal tube illuminated by an oil lamp and it
appears that sword swallowers were the first subjects for experimentation.
The tale is told that one such recruit exclaimed, ‘I’ll swallow a
sword anytime, but I’m damned if I’ll swallow a trumpet!’
In the twentieth century, Rudolf Schindler produced an endoscope
with a semi-flexible tube using prisms and lenses to bend the
light into an arc. This endoscope was still quite bulky and must
have been uncomfortable. A real breakthrough came in the 1960s
with the invention of the optical fibre. Endoscopes that were very
flexible and of narrow diameter could be made, much to the relief
of patients.

Optical fibres and transmission of light


An optical fibre is a glass core surrounded by a glass cladding of lower
refractive index than the core. You will recall from your Preliminary Course
studies that a critical angle exists for light travelling in the core. If this
critical angle is reached or exceeded by light striking the core-cladding
interface, the light is totally internally reflected and trapped in the core.
By reflecting off the core-cladding interface, the light can travel along the
core whether the core is bent or straight (see figure 19.15).
Usually optical fibres are grouped together in a bundle. For endoscopes
the bundle of optical fibres may contain up to 10 000 fibres.

19.1 Transfer of light by optic fibres

Figure 19.15 Cross-section of a glass fibre showing total internal reflection
(labels: light; core; cladding)
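
The critical angle at the core-cladding interface follows from Snell’s law: sin θc = n(cladding) / n(core), with the angle measured from the normal to the interface. The Python sketch below calculates it and tests whether a ray is trapped in the core. The refractive indices (1.62 for the core and 1.52 for the cladding) are assumed values for illustration only, since the text does not give figures, and the function names are hypothetical.

```python
import math


def critical_angle_deg(n_core, n_cladding):
    """Critical angle (from the normal) at the core-cladding interface."""
    if n_cladding >= n_core:
        raise ValueError("cladding must have a lower refractive index than the core")
    return math.degrees(math.asin(n_cladding / n_core))


def is_trapped(angle_of_incidence_deg, n_core, n_cladding):
    """True if light striking the interface at this angle is totally reflected."""
    return angle_of_incidence_deg >= critical_angle_deg(n_core, n_cladding)


N_CORE = 1.62       # assumed refractive index of the glass core
N_CLADDING = 1.52   # assumed refractive index of the glass cladding

theta_c = critical_angle_deg(N_CORE, N_CLADDING)
print(f"Critical angle: {theta_c:.1f} degrees")
print("Ray at 80 degrees trapped:", is_trapped(80, N_CORE, N_CLADDING))
print("Ray at 60 degrees trapped:", is_trapped(60, N_CORE, N_CLADDING))
```

With these assumed indices only rays striking the interface at a glancing angle (about 70° or more from the normal) stay in the core, which is the situation shown in figure 19.15.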



Structure and use of an endoscope
Figure 19.16 shows a diagram of a typical endoscope. The long flexible shaft
containing the fibres is protected from damage by a helical steel band inside
steel mesh. The outer coating is plastic to prevent chemical damage and
to make the shaft waterproof and easy to move through the body.
The shaft is about 10 mm in diameter and may be up to 2 m long.
The end inserted in the patient’s body, known as the distal end, is able to
bend in directions that are controlled from the viewing end by the operator.
The distal end contains a lens to focus the image onto the end of the fibre
bundle. The shaft of the endoscope contains a number of distinct parts as
listed in table 19.1.

Figure 19.16 A fibre-optic endoscope (labels: eyepiece; lens adjustment;
camera mount; deflection control; air, CO2, water and suction controls;
channel for operating instruments; suction; air; CO2 in; water in; flexible
section; controllable bending section; distal end; air/water nozzle; objective
lens; illumination lenses; biopsy/suction channel; optical fibre)

Table 19.1 Function of parts of the endoscope shaft

PART OF SHAFT                                      FUNCTION
Non-coherent optic fibre bundles (usually two)     To guide the light to the area to be examined
Coherent optic fibre bundle                        To transmit the image back to the eyepiece for viewing
Water pipe                                         To wash the distal face of the endoscope to keep the optical section clear
Operations channel                                 To insert surgical instruments to perform specific tasks (see figure 19.17 for some surgical instruments)
Control cables                                     To operate the flexible end
Additional optional channel                        To suck or pump in air or carbon dioxide

A coherent optic fibre bundle is one in which the optic fibres keep the same
position relative to one another. A non-coherent optic fibre bundle is one in
which the fibres are not kept in the same position relative to one another.

Once the image is sent to the viewing end of the endoscope it may be
viewed by the operator directly or captured as a still photograph or video
record (see figure 19.19, page 377). In this way an operation can be
accurately controlled and also recorded for later study.



Figure 19.17 Some accessories for an endoscope (surgical instruments: biopsy
forceps, grasping forceps, scissors, washing, snare, diathermy cutter;
cameras: 35 mm still camera, 16 mm movie camera, TV camera; controls; inputs:
CO2 supply, air supply, water, suction pump; light sources: conventional,
xenon, stroboscopic flash, halogen)

How do endoscopes work?


Endoscopes work because light can be transmitted along them to illuminate
internal areas of the body and then the image of those areas can be
carried back along the fibres of the endoscope.
Bundles of optic fibres are needed to carry the light to and from the
inside of the body. These fibres operate by the principle of total internal
reflection as indicated previously. The optic fibre bundles carrying light
into the body can be non-coherent, and hence less expensive, as their role
is only to illuminate the internal area. These non-coherent bundles can
have thicker fibres, which are more efficient at transmitting light as there
are fewer reflections along a given path length and hence fewer opportunities
for the light to be lost through the sides of the fibre (see figure 19.18(b)).
The optic fibre bundle transmitting the image back along the shaft must be
coherent so that when the light reaches the viewing area it will be in
the same relative position as it was when it left the object being viewed
(see figure 19.18(a)). It is an advantage to make these fibres as narrow as
possible with the core to cladding ratio as large as possible to ensure that
there are many beams of light transmitting the image. The image will
therefore be very clear; in other words, it will have good resolution.

Figure 19.18 (a) A coherent bundle of optic fibres. Through coherent bundles,
the parts of the image (light reflected from the object) are transmitted in
correct relative positions. (b) A non-coherent bundle of optic fibres. Through
non-coherent bundles, light is transmitted but the image has lost its shape as
parts are not in correct positions.
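
The difference between the two kinds of bundle can be modelled very simply by treating each fibre as carrying one element of the image from a position at the object end to a position at the viewing end. In the Python sketch below a coherent bundle delivers each element to the same relative position, while a non-coherent bundle delivers the same light to scrambled positions. The six-element ‘image’ and the particular scrambled arrangement are hypothetical, chosen only for illustration.

```python
def transmit(image, fibre_map):
    """Return the pattern seen at the viewing end of a fibre bundle.

    fibre_map[i] is the position at the object end of the fibre that
    emerges at position i at the viewing end.
    """
    return [image[source] for source in fibre_map]


# A hypothetical six-element image at the object end of the bundle.
image = ["A", "B", "C", "D", "E", "F"]

coherent_map = [0, 1, 2, 3, 4, 5]        # fibres keep their relative positions
non_coherent_map = [3, 0, 5, 1, 4, 2]    # hypothetical scrambled arrangement

seen_coherent = transmit(image, coherent_map)
seen_non_coherent = transmit(image, non_coherent_map)

print("Coherent bundle:    ", " ".join(seen_coherent))
print("Non-coherent bundle:", " ".join(seen_non_coherent))
```

Both bundles deliver all of the light, so either can be used for illumination, but only the coherent bundle preserves the arrangement needed to form an image.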



Using an endoscope
The endoscope is inserted through a natural orifice in the body or
through a small incision. It allows doctors to see inaccessible parts of the
body and in some cases to carry out minor surgical procedures. Such a
procedure, guided by use of an endoscope, is called keyhole surgery.
Some of the uses of endoscopes are indicated in table 19.2.

Table 19.2 Uses of endoscopes

NAME OF PROCEDURE    PLACE OF INSERTION OF ENDOSCOPE                       PURPOSE OF PROCEDURE
Arthroscopy          Through skin near joint                               To examine joints and carry out repairs such as removal of torn cartilage
Bronchoscopy         Through bronchial tubes                               To examine trachea and lungs to show problems such as inflammation, bronchitis, cancer and tuberculosis
Colonoscopy          Through the anus                                      To detect problems such as polyps, tumours, ulceration and inflammation in the colon and large intestine
Colposcopy           Through the vagina                                    To look for problems such as inflammation and cancer in the vagina and cervix (in females)
Cystoscopy           Through the urinary tract                             To examine the bladder, urethra and opening of the prostate gland (in males)
Endoscope biopsy     Through a natural opening or through an incision      To remove specimens of tissue for examination and analysis by a pathologist
Gastroscopy          Through the mouth                                     To look for the source of problems such as bleeding from the lining of the oesophagus, stomach and duodenum
Laparoscopy          Through an incision in the abdominal wall             To examine abdominal organs including the stomach, liver and fallopian tubes (in females)

If a tumour is detected, a small sample may be taken for analysis, using
the sampling implements in the endoscope. Such a sample is called a
biopsy.
Polyps may be surgically removed using an endoscope with an attachment.
Sometimes lasers are used in conjunction with endoscopes when
surgery is being carried out, as lasers will cut without distorting the area
around the incision and without causing bleeding.
Before the use of endoscopes, open surgery was needed to examine
and treat internal organs. The procedure using endoscopes is less
invasive and carries less risk, allowing the patient to recover quickly. It
may be carried out in an outpatients department and so not require
admission of the patient to hospital. The cost of health care is thus
reduced.

376 MEDICAL PHYSICS


Figure 19.19 Images taken during an operation on a damaged shoulder,
using an endoscope



CHAPTER REVIEW

SUMMARY
• X-rays are produced by the collision of electrons with a target material.
• Soft X-rays are less penetrating and have lower frequency than hard X-rays.
• A CT scan is produced by the computer analysis of the attenuation of
  X-rays moving around a slice of the body.
• A series of consecutive 2-D scans can be stored by the computer and
  combined to produce a 3-D image.
• CT scans can distinguish soft tissue with small differences in density and
  can produce an image of tissue behind bone.
• CT scans are expensive compared with conventional radiographs.
• CT scans can provide confirmation of suspected problems, such as
  tumours, damaged cartilage or internal bleeding.
• Endoscopes use optical fibres to transfer light to and from the inside of
  the body to enable internal structures to be seen.
• Non-coherent bundles of optical fibres transfer light to the inside of the
  body.
• An image is formed when light is transferred along coherent bundles of
  optical fibres from inside the body to the light receptor.

QUESTIONS
1. Outline the property of electrons that allows them to be focused using
   electric and magnetic fields but prevents X-rays from being focused.
2. X-rays used for diagnosis are generated with tube voltages of about
   70 kV compared with several megavolts when X-rays are needed to
   destroy tissue.
   (a) State what is meant by ‘diagnosis’.
   (b) Outline the effect that X-rays produced with tube voltages of several
       megavolts will have on body tissue.
3. With the aid of a labelled diagram, give a description of the way in which
   X-rays are produced.
4. For a typical X-ray tube with a tungsten target:
   (a) sketch a graph that shows how the intensity of the resulting
       X-radiation varies with photon energy
   (b) explain (i) the range of frequencies obtained and (ii) the sharp
       peaks on your graph
   (c) redraw the graph to show how it would be changed if some soft
       X-rays were removed by filtering.
5. (a) Outline how the attenuation of X-rays changes for different
       materials in the body.
   (b) Describe and account for the appearance of an X-ray image of part
       of the body containing bone, muscle and air spaces.
6. State a difference between ultrasound and X-rays and outline why this
   difference is important for the way each is used.
7. Use a table to compare hard and soft X-rays.
8. This question refers to figure 19.15 on page 373. Assume the external
   medium is air.
   (a) Explain what is meant by the ‘critical angle’ of a material.
   (b) Outline why a critical angle is important in an optical fibre.
   (c) Describe the function of the cladding in an optical fibre.
   (d) State the relationship between the refractive index of the cladding
       and the refractive index of the core of the fibre.
9. (a) Describe how the fibres are positioned in a coherent bundle.
   (b) Explain why a coherent bundle is necessary in an endoscope.
   (c) Explain why the endoscope has to be used in conjunction with a
       powerful light source.
   (d) Which properties of the fibre bundle affect the ability of the
       observer to see small details when using the instrument?
10. ‘The main principle behind the operation of an endoscope is the
    transfer of light to and from the internal organs of the body.’ Evaluate
    this statement.
11. Use a table to summarise situations in which CT scans are a superior
    diagnostic tool to X-rays or ultrasound. For each situation, outline why
    X-rays and ultrasound are inferior.
12. An endoscope is used to take a biopsy of a small tumour in the
    oesophagus, which leads from the mouth to the stomach. Explain how
    an endoscope can be used to do this.
13. Explain how an endoscope could be used to examine and repair a torn
    ligament inside the knee joint.

14. Examine the image of the lungs taken by an
X-ray (figure 19.1 on page 361) and a CT scan
of the upper leg (figure 19.13 on page 371).
Compare the information provided by each
image.
15. Find at least three different X-ray images on
the internet, using key words such as ‘X-ray
image of fracture’, ‘mammogram’, ‘barium
meal X-ray’, ‘lung X-ray’. Outline why X-rays
have been successful in producing the image
in each case.
16. From the internet or other sources, find at
least three images of internal organs obtained
by using an endoscope. If searching the
Internet, use key words such as ‘laparoscopy
images’ or ‘colonoscopy images’. For each
image, describe the internal organ as viewed
from the endoscope.
17. (a) Describe how a CT scan is obtained.
(b) Outline why improvement in computer
technology is linked strongly with clearer
CT scans.
18. From the internet or other sources, find at
least three different images of the body
obtained using CT scanning. For each image,
describe the detail that is visible and outline
how the image would be different if
ultrasound or X-rays were used.



PRACTICAL ACTIVITIES

19.1 TRANSFER OF LIGHT BY OPTIC FIBRES

Aim
(a) To demonstrate the transfer of light by optic fibres.
(b) (Extension) To compare light transferred through optic fibres with
    light transferred through the air.

Apparatus
optic fibre bundle
light source
light probe and data logger (or lightmeter)

Theory
Light is transferred by total internal reflection along an optic fibre.
The intensity of light travelling through air decreases according to the
inverse square law (see Physics 1 Preliminary Course, 3rd edition, chapter 3).

Method
1. (a) Shine the light in one end of the optic fibre bundle.
   (b) Observe the light coming from the other end.
   (c) Bend the bundle and again observe the light coming out the other end.
   (d) By carrying out the experiment in a dark room, note whether light is
       lost through the sides of the optic fibre.
   (e) Carry out step (d) with most of the bundle under water and note any
       changes to your observation. If there was a change, outline reasons
       based on refractive index.
   (f) Record your results as labelled diagrams.
2. (a) Using a light probe and data logger, compare the intensity of the
       light coming out of the optic fibre with the light entering.
   (b) Using the inverse square law, compare the light intensity coming
       through the probe, as measured in part (b), point 1 above, with
       that expected for light travelling the same distance through air.

Analysis
Relate your observation to the operation of an endoscope.



CHAPTER 20
RADIOACTIVITY AS A DIAGNOSTIC TOOL
Remember
Before beginning this chapter you should be able to:
• recall the structure of the atom
• recall the nature of alpha, beta and gamma radiation
• recall the law of conservation of momentum.

Key content
At the end of this chapter you should be able to:
• outline properties of radioactive isotopes
• sketch in general terms what is meant by half-life
• recognise from a list and name radioisotopes used to
obtain scans of organs
• describe how radioisotopes are metabolised by
the body so that they are found in the target
organ
• compare a bone scan with an X-ray
• identify that positron emission occurs during
the decay of certain radioactive nuclei
• discuss the interaction of electrons and
positrons to produce gamma rays in the context
of positron emission tomography (PET)
• describe how positron emission tomography (PET)
technique is used for medical diagnosis
• compare the scan of at least one healthy body organ
with its diseased counterpart.

Figure 20.1 This image, taken using a radioactive tracer, shows cancer of
the lower section of the left kidney

In this chapter the properties of radioisotopes will be applied to medical
diagnosis. Cancer is often treated using ‘radiotherapy’. The idea of using
a radioactive material to kill cancerous cells is well known. Less well
known is the use of a radioactive material inside the body to diagnose
disease. Use of radioactive material in the body may seem very risky
because of the danger associated with radioactivity. In fact the use of
radioisotopes and more recently PET to image organs and study their
function has become a very common, effective and safe means of
diagnosis. The image in figure 20.1 (page 381), taken using a radioactive
tracer, shows a cancer in the patient’s left kidney.

20.1 RADIOACTIVITY AND THE USE OF RADIOISOTOPES
Diagnostic nuclear medicine is essentially a functional imaging technique
involving the imaging of physiological processes. This is to be contrasted
with imaging using X-rays, where only structural information is obtained.
For the purposes of medical diagnosis, radioactive substances may be
introduced into the body and used to target areas of interest. The
radiation produced is measured and used to determine the health of the
organ or section of the body under investigation.

Properties of radioisotopes
Each element has a particular number of protons in the nucleus. However,
the number of neutrons in each element may vary. The atoms of the same
element with different numbers of neutrons are isotopes of the element.
For example, hydrogen (H) has isotopes $^{1}_{1}\mathrm{H}$,
$^{2}_{1}\mathrm{H}$ and $^{3}_{1}\mathrm{H}$. All these isotopes have the
same chemical properties because they have one proton in their nucleus and
one orbiting electron. $^{1}_{1}\mathrm{H}$ has no neutrons,
$^{2}_{1}\mathrm{H}$ has one neutron and $^{3}_{1}\mathrm{H}$ has two.
All elements have more than one isotope; some occur naturally and some
may be made artificially. Sometimes the isotope is unstable and is then
known as a radioactive isotope or radioisotope. $^{3}_{1}\mathrm{H}$ is the
only radioactive isotope of hydrogen. Radioisotopes are unstable and will
decay by emitting particles from their nucleus. Sometimes a new radioactive
substance is produced and the decay continues until a stable isotope forms.

Isotopes are atoms of the same element with different numbers of
neutrons. A radioactive isotope (or radioisotope) is an isotope that is
unstable and will emit particles from the nucleus until it becomes stable.

Radioactive decay is the emission of particles from the nucleus of a
radioactive element.

Types of radiation: alpha (α), beta (β) or gamma (γ)
In your studies of ‘The cosmic engine’ in the Preliminary Course, you
learnt about emissions from radioactive material. The radiation from a
radioactive isotope may consist of alpha (α) particles, beta (β) particles
or gamma (γ) radiation.
Alpha particles consist of two protons and two neutrons held together.
Alpha particles are helium nuclei ($^{4}_{2}\mathrm{He}$). Their emission
results in the mass number of the radioisotope decreasing by 4 and the
atomic number decreasing by 2. The following is an example of decay by the
emission of alpha particles from the parent nucleus radium-226:
radium-226 → radon-222 + α particle
$$^{226}_{88}\mathrm{Ra} \rightarrow {}^{222}_{86}\mathrm{Rn} + {}^{4}_{2}\mathrm{He}$$

Radon-222 is known as the daughter nucleus.



Beta particles are electrons. They are formed in the nucleus when a
neutron changes to a proton and an electron. The emitted electron is
called a beta particle. The following is an example of beta particle decay:
oxygen-19 → fluorine-19 + β particle
$$^{19}_{8}\mathrm{O} \rightarrow {}^{19}_{9}\mathrm{F} + {}^{0}_{-1}\beta$$

In beta decay, the atomic number of the new element increases by 1 and
the mass number remains the same.
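
The bookkeeping in the alpha and beta decay examples can be sketched in a few lines of Python. This is only an illustration of the rules stated in the text (alpha emission lowers the mass number by 4 and the atomic number by 2; beta emission leaves the mass number unchanged and raises the atomic number by 1). The function names are hypothetical and chosen for this sketch.

```python
def alpha_decay(mass_number, atomic_number):
    """Return (mass number, atomic number) of the daughter after alpha emission."""
    # An alpha particle carries away 2 protons and 2 neutrons.
    return mass_number - 4, atomic_number - 2


def beta_decay(mass_number, atomic_number):
    """Return (mass number, atomic number) of the daughter after beta emission."""
    # A neutron becomes a proton, so the nucleon count is unchanged.
    return mass_number, atomic_number + 1


# Radium-226 (Z = 88) decays to radon-222 (Z = 86).
print(alpha_decay(226, 88))  # (222, 86)

# Oxygen-19 (Z = 8) decays to fluorine-19 (Z = 9).
print(beta_decay(19, 8))  # (19, 9)
```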
Gamma radiation is electromagnetic radiation of very short wavelength
and very high energy. This means that it can readily penetrate the body.
Gamma radiation is frequently produced in conjunction with alpha and
beta particles. The decay of oxygen-19 in the example above produces
gamma radiation as well as beta particles.
Usually the gamma radiation is emitted less than a microsecond after
the emission of alpha or beta particles, but sometimes there is a delay if
the daughter nucleus is left in an excited state, known as a metastable
state. This is the case with technetium-99m, a very important radioisotope
used in medical diagnosis as a gamma radiation emitter. (The ‘m’
in ‘99m’ means this isotope is metastable.)

A metastable nucleus is in an excited state for a period of time before
decaying.

The characteristics of α, β and γ radiation are summarised in table 20.1.

Table 20.1 Characteristics of α, β and γ radiation

RADIATION | SYMBOL | REST MASS (kg) | RELATIVE CHARGE | NATURE | PENETRATION
Alpha | α | $6.6 \times 10^{-27}$ | +2 | helium nucleus | Stopped by a sheet of paper or 7 cm of air
Beta | β | $9.1 \times 10^{-31}$ | −1 | electron | Stopped in a few mm of tissue
Gamma | γ | 0 | 0 | electromagnetic radiation of wavelength shorter than an X-ray | Absorbed in many cm of tissue

Half-life of a radioactive isotope


Not all radioactive isotopes decay at the same rate. The rate of decay is
measured by the half-life of the radioisotope. This is the time taken for
half the radioactive material to decay.

Half-life is the time taken for half the radioactive material in a sample
to decay.

Half-lives can vary significantly. Uranium-238 has a half-life of
$4.5 \times 10^{9}$ years whereas polonium-214 has a half-life of only
$1.6 \times 10^{-4}$ seconds.
A radioisotope with a very long half-life is unsuitable for medical
diagnosis as it lingers in the patient after all necessary measurements are
taken. This can pose a danger to the patient and people in close contact
because of the radiation emitted. On the other hand, if the half-life is too
short, the radioisotope either loses its useful radiation before
measurements can be taken or has to be administered in a dangerously large
dose. Radioisotopes with half-lives ranging from several minutes to days
are used for medical diagnosis.
The decay of a radioisotope can be plotted on a graph from which the
half-life can be read. In the graph in figure 20.2, we see that there is
initially 100 g of sodium-24. From the graph, the mass has halved to 50 g
after 15 hours. The half-life (T1/2) of sodium-24 is therefore 15 hours.
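
The halving read from the graph can also be written as a formula: after a time t, the mass remaining is the initial mass multiplied by (1/2) raised to the power t divided by the half-life. The short Python sketch below applies this to the sodium-24 values quoted above (100 g initially, half-life 15 hours). The function name is an assumption made for this illustration.

```python
def remaining_mass(initial_mass, half_life, elapsed):
    """Mass left after `elapsed` time; `elapsed` and `half_life` share units."""
    # Each half-life that passes multiplies the remaining mass by one half.
    return initial_mass * 0.5 ** (elapsed / half_life)


# Sodium-24: 100 g initially, half-life of 15 hours.
for hours in (0, 15, 30, 45, 60):
    print(hours, "hours:", remaining_mass(100.0, 15.0, hours), "g")
```

For these times the sketch gives 100, 50, 25, 12.5 and 6.25 g, which are the points a decay curve for sodium-24 should pass through.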

CHAPTER 20 RADIOACTIVITY AS A DIAGNOSTIC TOOL 383


Figure 20.2 The radioactive decay of sodium-24 (graph of mass (g), from 0
to 100, against time (hours), from 0 to 60)

SAMPLE PROBLEM 20.1a Radioactive decay of iodine-123
A 20 mg sample of iodine-123 is to be used as a radioactive tracer in the
body. The half-life of the iodine-123 is 13 hours.
(a) How long will it take for 17.5 mg to decay?
(b) Calculate how much iodine-123 will remain after 26 hours.

SOLUTION
(a) In 1 half-life, 10 mg of iodine-123 will decay. This will leave 10 mg
    iodine-123.
    In the second half-life, 5 mg iodine-123 will decay, leaving 5 mg
    iodine-123.
    In the third half-life, 2.5 mg iodine-123 will decay.
    Altogether, 17.5 mg (10 + 5 + 2.5 mg) iodine will have decayed in 3
    half-lives or 39 hours.
(b) 26 hours is 2 half-lives (2 × 13 hours).
    After 1 half-life, 10 mg of iodine-123 will decay leaving 10 mg iodine-123.
    After 2 half-lives, 5 mg iodine-123 will decay leaving 5 mg iodine-123.
    5 mg iodine-123 will remain after 26 hours.

SAMPLE PROBLEM 20.1b Radioactive decay
A radioisotope sample has a half-life of 10.0 minutes.
(a) Calculate the time it will take the activity to drop from 8.0 MBq
    (mega becquerels) to 4.0 MBq.
(b) Calculate the time it will take for its activity to be 1.0 MBq.

SOLUTION
(a) When half the sample has decayed the activity will also halve. This
    assumes that the atoms formed are not radioactive. Hence the time
    needed to reduce the activity to 4.0 MBq is one half-life, or 10.0 minutes.
(b) Halving the activity each half-life means 3 half-lives have passed
    before the activity is 1.0 MBq. The time taken is 30.0 minutes.
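
Counting half-lives works neatly when the activity falls by a power of two. As a hedged generalisation, the number of half-lives needed is the base-2 logarithm of the ratio of initial to final activity, assuming (as in the solution) that the decay products are not radioactive. The function name below is hypothetical.

```python
import math


def time_to_reach(initial_activity, target_activity, half_life):
    """Time for activity to fall from initial to target (units of half_life)."""
    # Number of halvings required to go from the initial to the target activity.
    half_lives = math.log2(initial_activity / target_activity)
    return half_lives * half_life


# Values from the sample problem: half-life 10.0 minutes, starting at 8.0 MBq.
print(time_to_reach(8.0, 4.0, 10.0))  # part (a): one half-life
print(time_to_reach(8.0, 1.0, 10.0))  # part (b): three half-lives
```

The same function also handles ratios that are not powers of two, for example a fall from 8.0 MBq to 3.0 MBq, where counting whole half-lives is not enough.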

PHYSICS FACT
Determining the size of the dose of radioisotope for medical diagnosis
When a radioisotope is introduced into the body, other factors in addition
to the half-life of the radioisotope need to be considered.
The radioisotope is removed from the patient’s body by processes such as
respiration, urination and defecation. Also, some patients metabolise
the chemical to which the radioisotope is attached more quickly than
others, so it is important that the characteristics of the particular
patient are considered when dosages are being determined.
The half-life must be long enough to allow useful readings to be taken
after the radioisotope has been taken up by the targeted organ. For
example, calculations show that, although iodine-125 has a half-life of
60 days, half the administered radioisotope will be removed from an
average patient’s body in 16 days.
Generally, if the radioisotope remains in the patient’s body for a long
period of time, the half-life of the radioisotope should be comparable to
the time taken to carry out the investigation, to minimise the dose to the
patient. When the radioisotope is excreted in about the same time as is
needed for the investigation, a longer half-life radioisotope can be
safely used.



PHYSICS IN FOCUS
Producing radioisotopes: the medical cyclotron
The effectiveness of nuclear medicine for diagnosis of disease has
depended on the ability to:
• produce radioisotopes
• detect the gamma radiation produced.
The production of radioisotopes became possible with the development of
the cyclotron by E. O. Lawrence in 1931. From the mid 1940s, a range
of radioisotopes was available from the United States and later from the
United Kingdom.
Most of the radioisotopes used in nuclear medicine in Australia are made
by ANSTO at two major facilities: the OPAL nuclear research reactor, at
Lucas Heights in the south of Sydney, and the National Medical Cyclotron,
at Royal Prince Alfred Hospital in Sydney. Some radioisotopes are
imported from South Africa and Canada. Besides the ANSTO facilities,
there are also hospital-based cyclotrons in Melbourne, Brisbane and Perth.
Cyclotrons are needed to make radioisotopes for positron emission
tomography (PET), a diagnostic technique discussed later in this chapter.
In late 2007 ANSTO announced a partnership with Siemens Medical Solutions
to build two new state-of-the-art cyclotrons. These will supply hospitals
with more of the isotopes needed for PET, increasing the availability of
PET scanning.

Figure 20.3 The National Medical Cyclotron, located near Royal Prince
Alfred Hospital in Sydney

Metabolising of radioactive isotopes by the body
Radioisotopes that emit alpha particles are not used in the diagnosis of
disease because the alpha particles cause damaging ionisation inside the
body.
Beta particles travel further than alpha particles before they are
absorbed but their ionisation damage is much less. They are used in
therapy but not in diagnosis of disease.
The most useful radioisotopes for nuclear medicine are those that emit
gamma radiation only. Technetium-99m and iodine-123 are two such
isotopes. A gamma-emitting radioisotope inside the body can be detected
outside the body because gamma radiation is very penetrating.
Common radioisotopes used in medical diagnosis are listed in
table 20.2 on the following page.
The radioisotope is chosen on the basis of its ability to target the organ
to be studied. First, the radioisotope needs to be chemically attached to a
compound that would normally be metabolised by the organ of interest.
When this compound is chemically attached (‘labelled’) with the
radioisotope, it is called a radiopharmaceutical. For example, glucose is a
compound that is readily absorbed by the brain. Hence glucose is labelled
to become a radiopharmaceutical for imaging brain function.

A radiopharmaceutical is a compound that has been labelled with a
radioisotope.



Table 20.2 Radioisotopes used in medical diagnosis

RADIOISOTOPE | PRODUCTION SITE | HALF-LIFE | FUNCTION
Chromium-51 | Nuclear reactor | 27.70 days | Used to label red blood cells and measure gastrointestinal protein loss.
Gallium-67 | Cyclotron | 3.26 days | Used to detect tumours and infections.
Molybdenum-99 | Nuclear reactor | 65.94 hours | Used as the ‘parent’ in a generator to produce technetium-99m, which is the most widely used isotope in nuclear medicine.
Technetium-99m | ‘Milked’ from molybdenum-99 | 6 hours | Used to investigate bone metabolism and locate bone disease; assess thyroid function; study liver disease and disorders of its blood supply; monitor cardiac output, blood volume and circulation clots; monitor blood flow in lungs; assess blood and urine flow in kidneys and bladder; investigate brain blood flow and function; estimate total body plasma and blood count.
Iodine-123 | Cyclotron | 13 hours | Used to monitor thyroid function, evaluate thyroid gland size and detect dysfunction of the adrenal gland. Also used to assess stroke damage.
Iodine-131 | Nuclear reactor | 8 days | Used to diagnose and treat various diseases associated with the thyroid gland. Used in the diagnosis of disorders of the adrenal medulla. Used for imaging some endocrine tumours.
Thallium-201 | Cyclotron | 3.05 days | Used to detect the location of damaged heart muscle.

PHYSICS IN FOCUS
Radioisotopes emitting gamma radiation
Both iodine-123 and technetium-99m are valuable radioisotopes because
they decay by the emission of gamma radiation only.
Iodine-123 is more expensive than iodine-131, an emitter of beta and
gamma radiation. Iodine-131 has been used in the investigation of the
thyroid gland. However, it emits beta radiation, leading to larger
radiation doses than desirable. Also, the energy of the gamma radiation
produced from iodine-131 is very high, resulting in poor image quality
when detected by the gamma camera. (The operation of a gamma camera is
discussed later in this chapter, page 388.) The half-life of 8 days for
iodine-131 is relatively long, resulting in exposure of the patient to
radiation well after the testing has been carried out. By contrast,
iodine-123 has a half-life of 13 hours, also concentrates in the thyroid
gland and emits gamma rays of energy that can be detected clearly by the
gamma camera.
Technetium-99m has a half-life of only 6 hours so it must be produced in
the hospital where it is to be used. A purpose-built generator system is
used to obtain the technetium-99m when needed. The generator contains the
‘parent’ isotope, molybdenum-99, which decays to the metastable
‘daughter’ isotope technetium-99m. The technetium-99m is flushed from the
molybdenum using a saline solution. The flushing is called elution. The
molybdenum remains in the generator as it is chemically attached to a
central column. The technetium-99m is said to be ‘milked’ from the
molybdenum. This operation usually happens daily, allowing the technetium
sufficient time to build up. Because the molybdenum has a half-life of
approximately 66 hours it must be replaced weekly as by that time the
rate of production of technetium is too low to be of value.
Technetium-99m has the added advantage that it readily attaches to
different compounds to form radioactive tracers.
Figure 20.4 shows a cross-section through a technetium generator and
figure 20.5 shows a technician preparing radiopharmaceuticals for tracer
studies in a hospital nuclear medicine department.

Figure 20.4 A cross-section through a typical technetium generator used in
hospitals to generate technetium-99m

Figure 20.5 The preparation of labelled compounds for tracer studies
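
The weekly replacement of the molybdenum-99 in a generator can be checked with the half-life relation. The sketch below is an illustration only: it uses the approximate 66-hour half-life quoted in the text and assumes simple exponential decay of the parent isotope, ignoring any details of how the generator is loaded or eluted. The function name is hypothetical.

```python
def fraction_remaining(half_life, elapsed):
    """Fraction of a radioisotope left after `elapsed` time (same units)."""
    return 0.5 ** (elapsed / half_life)


MO99_HALF_LIFE_HOURS = 66.0  # approximate value quoted in the text

# Fraction of the molybdenum-99 left after one day and after one week.
after_one_day = fraction_remaining(MO99_HALF_LIFE_HOURS, 24.0)
after_one_week = fraction_remaining(MO99_HALF_LIFE_HOURS, 7 * 24.0)

print(f"after 1 day:  {after_one_day:.2f}")
print(f"after 1 week: {after_one_week:.2f}")
```

After a week less than one-fifth of the original molybdenum-99 remains, so technetium-99m is produced at less than one-fifth of the initial rate, which is consistent with the weekly replacement described above.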

Using radioactive isotopes to target body organs
As indicated in table 20.2 (page 386), particular radioisotopes are chosen
to target particular organs. The radiopharmaceutical is injected into the
bloodstream, inhaled or taken orally, and its passage through the body is
traced by measuring the radiation it emits.
Sometimes an image is taken after a period that may be up to several
hours. The radioisotope has accumulated in the target organ, so this
image measures the amount of radiation emitted from different organs
and shows where the radioisotope has accumulated.
In other situations, a series of images is taken over a period of time
starting from when the radioisotope is first introduced. This type of
investigation shows the distribution of the radioisotope and rate of
absorption or excretion by various organs. The images may be taken over
a few minutes for a heart or lung study or over a period up to half an
hour for a kidney or bladder investigation.



In analysing the images, radiologists identify ‘hot spots’ with a higher
than normal concentration of radioisotope and ‘cold spots’ showing a
lack of radioisotope. These areas often indicate disease (see figure 20.6).

Figure 20.6 (a) A goitre is an enlargement of the thyroid gland that can be
shown by variation in the uptake of the radiopharmaceutical.
(b) Scan showing an enlarged thyroid gland
(c) Scan showing a normal thyroid gland

Obtaining an image
The image is obtained by measuring the amount of gamma radiation
coming out of the patient’s body using a gamma camera. The gamma
camera is a stationary imaging system that collects gamma radiation over
a large area. It converts the gamma rays into light flashes (scintillations)
which are transformed into amplified electrical signals. These are then
analysed and processed to form an image.
A gamma camera is shown in figure 20.7. The three main sections are
the collimator, the sodium iodide crystal and the phototubes. Gamma
rays travelling at right angles to the sodium iodide crystal enter the
camera through a lead collimator. Usually the collimator is a circular slab
of lead with many holes perpendicular to the face (see figure 20.7(c)).
Gamma rays striking the crystal from other angles would degrade the
image and so are blocked by the lead collimator.
The radiation detector is a single sodium iodide crystal 30 to 40 cm in
diameter and 1.2 cm thick. An array of photomultiplier tubes is arranged
in a hexagonal pattern at the rear of the detector.
When a gamma ray enters the sodium iodide crystal, the light from the
resulting scintillation spreads through the crystal and each
photomultiplier tube receives some of the total light. The fraction of the total light
seen by each tube depends on how close that tube is to the original point
of entry of the gamma ray. The resulting electrical pulses from each
photomultiplier tube are decoded and converted to signals to be
displayed on a computer screen. The image showing the gamma ray
output from the organ is constructed from all the gamma rays detected.
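
The text does not give the decoding method, but the idea that each tube's share of the light depends on its distance from the scintillation suggests a simple position estimate: a signal-weighted average of the tube positions. The Python sketch below is a simplified illustration under that assumption, not the circuitry of any real gamma camera. The tube coordinates and signal values are invented for the example.

```python
def estimate_position(tubes):
    """Estimate where a gamma ray struck the crystal.

    tubes: list of (x, y, signal) tuples, one per photomultiplier tube,
    where (x, y) is the tube centre and signal is the light it received.
    """
    total = sum(signal for _, _, signal in tubes)
    if total == 0:
        raise ValueError("no light detected")
    # Tubes that saw more light pull the estimate towards themselves.
    x = sum(tx * signal for tx, _, signal in tubes) / total
    y = sum(ty * signal for _, ty, signal in tubes) / total
    return x, y


# Hypothetical readings (positions in cm, signals in arbitrary units).
readings = [(0.0, 0.0, 2.0), (6.0, 0.0, 1.0), (0.0, 6.0, 1.0)]
print(estimate_position(readings))  # (1.5, 1.5)
```

The estimate lies closest to the tube with the largest signal, which matches the description that the nearest tube receives the largest fraction of the light.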



Figure 20.7 (a) A cross-sectional view of a gamma camera (b) Top view
showing photomultiplier tubes arranged in a hexagonal pattern (c) A
magnified view of the surface of a lead collimator (the coin shows
relative size) (d) A gamma camera used for radioisotope imaging.
(Labels in (a), from the patient side: lead collimator, sodium iodide
crystal, glass, optical light guide, light deflector, photomultipliers,
lead shield; output signals x+, x−, y+ and y− for processing; screen
display. In (b) the tubes are numbered 1 to 19.)

Medical applications
Thyroid investigations
The thyroid gland metabolises iodine. A drink of a dilute solution of
sodium iodide tagged with iodine-123 is administered and its accumulation
is measured from 10 minutes to 48 hours after being administered. An image
of the goitre may be obtained as in figure 20.6 (page 388), or the uptake
of the isotope may be graphed and compared with a standard as in
figure 20.8.
Thyroid investigations now commonly use technetium-99m, which is also
taken up by the thyroid but is more readily released than iodine.

Figure 20.8 Uptake of iodine-123 by the thyroid gland (graph of uptake (%)
against time (hours), from 0 to 48: hyperthyroid (overactive) > 50%;
normal 30–50%; hypothyroid (underactive) 15–30%)

The heart
Human serum albumin is labelled with technetium-99m and injected
into the patient to measure the efficiency of the heart as a pump. The
passage of the radiopharmaceutical is monitored through the heart
chambers. Thallium-201 as part of thallium chloride is injected and
monitored to assess damage caused by a stroke or to measure the effect on
the heart of exercise or drugs (see figure 20.9).



Figure 20.9 Performance of heart muscle using thallium-201
A series of images produces ‘slices’ through a chamber in the heart.
The left image shows a ‘hole’ in the muscle during exercise, probably
indicating a blockage to that part of the muscle. A ‘hole’ is seen when
there is no gamma radiation from that area, and shows up as no blue or red.
The right image, taken during resting of the patient, shows the muscle is
alive because the ‘hole’ has disappeared, indicating blood is flowing. The
blockage can possibly be cleared, leading to recovery.

Bones, lungs and brain
Technetium-99m is used in imaging the bones, lungs and brain.
Polyphosphate ions are labelled with technetium-99m and injected,
accumulating in bone within an hour. The image shows the function of
the bone. Areas of increased blood flow show up as ‘hot spots’. Such
areas are frequently associated with disease. Bone imaging often shows
up bone tumours and stress fractures earlier than standard X-rays, which
only show the structure of the skeleton (see figure 20.10).

Figure 20.10 (a) An X-ray of a broken leg (b) A bone scan showing (i)
a healthy skeleton and (ii) a skeleton with tumours. (Note the white spot
on the right arm in each bone scan shows where the isotope was injected.)



Brain studies using technetium-99m as a tracer measure blood flow
through the brain, allowing dementia and stroke damage to be identified
(see figure 20.11).

Figure 20.11 Images of ‘slices’ through the brain show areas of reduced
activity due to stroke damage (shown in red).

To study the lungs, technetium-99m attached to albumin is coagulated,
mixed with saline and injected into the veins in the arm. It
becomes trapped in the fine capillaries in the lung and allows a map to
be made of the functioning capillaries. Any blockage in the lung, perhaps
due to a clot, shows as a region without any radioactive tracer. This blood
flow study is called a perfusion study.
To enable the health of the airways to be studied, the patient inhales
an aerosol labelled with technetium-99m. This ventilation study shows,
over about half an hour, ‘cold spots’ where the radioisotope has not
accumulated because the airway is blocked (see figure 20.12).

Figure 20.12 Lung studies
(a) A normal perfusion study and ventilation study of the lungs
(b) Front view of lung scans of a patient with a blockage in the left
pulmonary artery: (i) the perfusion scan shows no blood flow to the left
lung; (ii) the ventilation scan shows both lungs as the airway is not
blocked.

To determine the volume of blood in the body, a measured quantity of
a radioisotope is administered, and after a period of time a sample of
blood is taken. If the activity of the tracer in the blood is measured, the
dilution of the tracer and hence the volume of blood in the body can be
calculated. This procedure, known as dilution analysis, is valuable in
investigating disorders such as anaemia, assessing stroke damage and
monitoring blood loss as a result of an accident.
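The dilution calculation described above can be sketched numerically. This is a minimal illustration only: it assumes the tracer mixes uniformly, that none leaves the bloodstream and that decay during the waiting period is negligible. The function name and the sample numbers are hypothetical, not clinical values.

```python
def blood_volume_litres(injected_activity_bq, sample_activity_bq, sample_volume_litres):
    """Estimate total blood volume from the dilution of a tracer.

    Assumes uniform mixing, no loss of tracer and negligible decay
    between injection and sampling (simplifying assumptions).
    """
    if sample_activity_bq <= 0 or sample_volume_litres <= 0:
        raise ValueError("sample activity and sample volume must be positive")
    # Activity per litre of blood once the tracer has mixed
    concentration_bq_per_litre = sample_activity_bq / sample_volume_litres
    # Volume needed to dilute the injected activity to that concentration
    return injected_activity_bq / concentration_bq_per_litre


# Hypothetical numbers: 200 kBq injected; a 10 mL blood sample reads 400 Bq
volume = blood_volume_litres(200_000, 400, 0.010)
print(f"Estimated blood volume: {volume:.1f} L")
```

The lower the activity in the sample, the more the tracer has been diluted and the larger the calculated volume.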

CHAPTER 20 RADIOACTIVITY AS A DIAGNOSTIC TOOL 391


PHYSICS FACT
Radioactivity and safety issues
In a hospital, the general public, medical teams and patients must be
protected from overexposure to radioactive material. Strict guidelines are
implemented to control and monitor exposure to radiation.
Areas where work is carried out with ionising radiation are clearly marked
as controlled areas with limited access. Equipment is checked regularly to
make sure it does not leak radioactive material. Personnel distance
themselves from radioactive material where possible and wear monitors to
measure their exposure to radioactive sources. These monitors are checked
regularly.

Figure 20.13 A radiation warning sign. The trefoil is the internationally
recognised symbol for radiation.

20.2 POSITRON EMISSION TOMOGRAPHY (PET)
Positron emission tomography, known as PET, is used to diagnose and
monitor brain disorders, investigate heart and lung functioning and
detect the location and spread of tumours. Using particular radio-
pharmaceuticals, a cross-sectional image through an organ can be
obtained or a region of the body can be imaged, allowing the function of
an area to be determined. A PET image of the brain shows the patient’s
responses to factors such as noise, illumination and changes in mental
concentration.

Figure 20.14 A PET scan showing a build-up of fluid in the lungs
(pulmonary edema)



Positron-electron interactions
Positrons are positively charged beta particles formed when a proton
disintegrates to form a neutron and a positron. A positron is identical to
an electron except that its charge is positive instead of negative.

Certain radioisotopes decay by the emission of positrons. Positrons are
positively charged beta particles. That is, they are identical to electrons
except that they have positive charge instead of negative charge. Positrons
are formed when a proton disintegrates to form a neutron and a positron.
Radioisotopes that are deficient in neutrons often decay in this way. For
example, carbon-11 decays to boron-11, emitting a positron ($\beta^{+}$) as
shown below.

$$^{11}_{6}\mathrm{C} \rightarrow {}^{11}_{5}\mathrm{B} + {}^{0}_{+1}\beta$$

When a positron meets an electron they ‘annihilate’ each other,
converting their combined energy and mass into two gamma rays (γ rays).
The energy of each of these gamma rays is 0.51 MeV (million electron
volts) and they travel in opposite directions, as momentum is conserved in
the interaction. This process is sometimes called ‘pair annihilation’.

MeV stands for million electron volts and is the energy gained by one
electron accelerating through a potential difference of one million volts.

How a PET scan is obtained
To obtain a PET scan, a suitable pharmaceutical is labelled with a
positron-emitting radioisotope. The radiopharmaceutical is usually
injected into the patient but sometimes the chemical is inhaled. After a
short period of time the radiopharmaceutical has accumulated in par-
ticular areas of the body and begun to decay by the emission of positrons.
These positrons travel a short distance, of the order of a few millimetres,
before they encounter electrons in the body. Pair annihilation takes place
and two gamma ray photons are produced. The gamma photons travel in
opposite directions from the site of annihilation and emerge from the
body where they are detected by gamma cameras.
Modified gamma cameras surround the patient in the section being
scanned. The cameras do not have multihole collimators so that gamma
photons from all angles can be detected. Pairs of gamma photons travelling
in opposite directions are detected and their relative intensity measured.
By taking measurements of pairs of gamma photons from all angles
and correlating these measurements with known attenuation coefficients
for gamma rays passing through tissue, the position of the decaying
radioisotope can be approximately determined. In this way, an image is
produced showing where radioisotopes accumulate. It takes about half a
million gamma ray pairs to produce a useful image.
A PET imaging system detecting emissions from a region of the brain is
illustrated in figure 20.15.
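The figure of about 0.51 MeV per gamma ray quoted for pair annihilation can be checked from the rest mass of the electron. The sketch below assumes the positron and electron are essentially at rest when they annihilate, so each photon carries the rest energy of one particle. The constants are rounded textbook values and the function name is illustrative.

```python
ELECTRON_MASS_KG = 9.109e-31   # rest mass of an electron (a positron has the same mass)
SPEED_OF_LIGHT_M_S = 2.998e8   # speed of light in a vacuum
JOULES_PER_EV = 1.602e-19      # one electron volt expressed in joules


def annihilation_photon_energy_mev():
    """Energy of each gamma ray from annihilation at rest, in MeV.

    Two particles of equal mass convert into two photons, so each
    photon carries the rest energy of one particle: E = m c^2.
    """
    rest_energy_joules = ELECTRON_MASS_KG * SPEED_OF_LIGHT_M_S ** 2
    return rest_energy_joules / JOULES_PER_EV / 1e6


print(f"Energy per gamma ray: {annihilation_photon_energy_mev():.2f} MeV")
```

Because both photons always have this same energy and leave in opposite directions, the detectors can treat a matched pair as coming from a single annihilation event.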

Figure 20.15 (a) Some PET images of brain activity (b) A patient undergoing
a PET scan of their brain (c) Cross-section showing pairs of gamma rays
travelling in opposite directions and reaching detectors. (Labels in (c):
detector; photomultipliers; annihilation of electron and positron produces
two gamma rays.)



Isotopes used in PET
Common isotopes used in PET are listed in table 20.3.

Table 20.3 Common isotopes used in PET

RADIOISOTOPE    SYMBOL                   HALF-LIFE
Carbon-11       $^{11}_{6}\mathrm{C}$    20.4 minutes
Nitrogen-13     $^{13}_{7}\mathrm{N}$    10.0 minutes
Oxygen-15       $^{15}_{8}\mathrm{O}$    2.13 minutes
Fluorine-18     $^{18}_{9}\mathrm{F}$    109.8 minutes

To image brain function, the patient may inhale a small quantity of
carbon monoxide (CO) made with carbon-11 because CO quickly
accumulates in the brain. Oxygen-15 can be used to label oxygen to study
blood volume in the brain or to label water to study blood flow in the
brain. Fluorine-18 or carbon-11 are used in research on the brain related
to Parkinson’s disease and schizophrenia.
In all these studies it is important that the amount of radiopharma-
ceutical administered is large enough to obtain a good image in the time
needed for the procedure but small enough to minimise patient expo-
sure to radiation.
As can be seen from table 20.3, the half-lives of isotopes suitable for
PET are very small. The isotopes must be created on the day of use and,
except for fluorine-18, must be made at the site of use. A cyclotron is
needed for their production (see the box on page 385). This poses a
serious limitation, as most hospitals cannot afford the cost of an on-site
cyclotron and facility for producing radiopharmaceuticals. In New South
Wales, the cyclotron near the Royal Prince Alfred Hospital is used to pro-
duce isotopes for the PET facilities at that hospital (see figure 20.3 on
page 385). Fortunately the longer half-life of fluorine-18 means tracers
can be labelled with fluorine-18 and shipped to nearby hospitals.
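The effect of these short half-lives on transport can be illustrated with the half-life relationship: the fraction remaining after time t is (1/2) raised to the power t divided by the half-life. The sketch below uses the half-lives from table 20.3. The 60-minute delay is a hypothetical figure chosen only for illustration.

```python
# Half-lives in minutes, taken from table 20.3
HALF_LIFE_MINUTES = {
    "carbon-11": 20.4,
    "nitrogen-13": 10.0,
    "oxygen-15": 2.13,
    "fluorine-18": 109.8,
}


def fraction_remaining(half_life_minutes, elapsed_minutes):
    """Fraction of the original activity left after the elapsed time."""
    return 0.5 ** (elapsed_minutes / half_life_minutes)


# Hypothetical delay between production and use
DELAY_MINUTES = 60
for isotope, half_life in HALF_LIFE_MINUTES.items():
    remaining = fraction_remaining(half_life, DELAY_MINUTES)
    print(f"{isotope}: {remaining:.1%} of the activity remains after {DELAY_MINUTES} min")
```

Under this assumed delay, fluorine-18 keeps most of its activity while the other three isotopes have largely decayed, which is consistent with only fluorine-18 being shipped between sites.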

20.3 IMAGING METHODS WORKING TOGETHER
Medical imaging to obtain both functional and structural images is often
needed for adequate diagnosis. CT scans are used to obtain structural
images. (MRI, which is discussed in chapter 21, is also generally used for
structural images, although it can be used to obtain functional images in
certain parts of the body.) Radioisotopes, on the other hand, allow func-
tional information to be gathered from which the site of a tumour can be
determined. A nuclear medicine image may show tumours but not very
much normal tissue. Hence it may be difficult to determine the position
of the tumour relative to other structures. If a CT or MRI scan is
obtained at the same time, the location of the tumour can be established
precisely. In fact, scanning devices combining CT with PET are available
already, and small-scale systems combining MRI with PET have been
developed for use in veterinary hospitals.



PHYSICS FACT

PET has found important applications in the studies of brain function
and metabolism. As early as 1902, P. Ehrlich suggested that
there was a barrier that prevented many molecules in the blood-
stream from gaining access to the brain. It was called the blood-brain
barrier (BBB). This model suggested that some molecules could
penetrate the barrier and some could not. Obviously it was impor-
tant that certain molecules cross this barrier for healthy brain func-
tion. β-D-glucose is one such molecule. It is often referred to as the
brain’s ‘fuel’.
β-D-glucose may be suitably labelled so that PET scans of the
brain’s function can be made. Fluorine-18 is commonly used for this
labelling. Fluorine-18 replaces a hydroxyl group on glucose to
produce 2-fluoro-2-deoxy-D-glucose (FDG). Healthy brain function
correlates with regions where glucose is found in known concen-
trations. However, tumours require more oxygen and therefore more
glucose. As a result, tumours show up in a PET scan due to the
accumulation of glucose in the tumour.
A compound containing technetium-99m is also of the correct size
to cross the BBB. Technetium-99m is used to image the brain in a pro-
cess called single photon emission computed tomography (SPECT).
SPECT also enables a cross-sectional image of the brain to be made by
rotating the gamma camera around the head, but the position of the
decaying radioisotope is not as clearly identified as with PET.

Evaluating imaging using radioisotopes


The advantages and disadvantages of using radioisotopes as a diagnostic
tool in medicine can be summarised as follows.
Advantages:
• Body function can be assessed through examining particular organs
and flow rates of blood and water.
• Stress fractures can be identified early by detecting increased activity of
bone cells.
• Volume of blood and water can be measured using dilution analysis.
• Whole body scanning will allow disease of the skeleton to be assessed.
Disadvantages:
• Resolution of the images is poor compared with other methods of
imaging.
• Some radiation risk exists due to use of radioisotopes.
• The radioisotope is usually injected,
so this is an invasive procedure.
• Radioactive waste needs to be dis-
posed of with care.
• Costs are greater than for ultra-
sound or X-rays (and are similar to
MRI).

Figure 20.16 Comparison of PET (top) and MRI scans (bottom) of a patient
with a brain tumour



CHAPTER REVIEW

SUMMARY

• Radiopharmaceuticals are taken up by particular organs in the body.
• Gamma radiation from the radioisotope is detected and used to make an
image of the organ.
• The rate at which the radioisotope accumulates in the target organ
indicates the health of the organ.
• The half-life of the radioisotope and length of time needed for the
procedure must be considered when choosing an appropriate radioisotope.
• PET uses radioisotopes that are positron emitters.
• Positrons and electrons annihilate in the body, producing two gamma rays.
• Detecting the position where the gamma rays originate enables the
position of the positron-emitter to be mapped.
• PET scans indicate the biochemistry, metabolism and function of a
particular area.
• PET scans are used for studying the brain and heart, detecting cancers at
an early stage and monitoring cancers during treatment.

QUESTIONS

1. Define the following terms:
(a) radioisotope
(b) radioactive decay
(c) emissions from radioactive nuclei.
2. (a) Using an example, outline what is meant by half-life of a
radioisotope.
(b) Using the example given in (a), outline how the exposure to
radioactive emissions could be decreased.
3. Using a specific example of a radioisotope, describe how it accumulates
in the target organ.
4. A particular isotope has a half-life of 100 days. Discuss the
suitability of this isotope for use in medical diagnosis.
5. Describe the problems associated with using a radioisotope of very
short half-life for medical diagnosis.
6. A sample of a particular radioisotope has a half-life of 2.0 minutes.
(a) Calculate the time it will take the activity to drop from 4.0 MBq
(mega becquerels) to 1.0 MBq.
(b) Calculate the time it will take for its activity to be 0.25 MBq.
7. The function of the lungs can be studied using a radioactive gas. The
choices are krypton-81m or xenon-133 and their properties are in the
table below.

ISOTOPE        EMISSION PRODUCTS    HALF-LIFE
Krypton-81m    γ                    13 seconds
Xenon-133      β, γ                 5.3 days

Evaluate the claim that ‘Xenon should be used in preference to krypton for
investigations of lung function’.
8. (a) Choose two specific radioactive isotopes used in medical diagnosis
and outline where they would be used in the body. Justify your answer.
(b) Explain why α-emitting radioisotopes are not used for medical imaging.
9. State two factors, other than its emissions, that affect the choice of
a radioisotope for a tracer study.
10. Carbon-11 has a half-life of 20 minutes and bromine-75 has a half-life
of 100 minutes. If samples of these isotopes initially have the same
activity, show on the same graph how their activities vary with time.
11. Identify a radioactive tracer study in which the tracer:
(a) mixes with the substance under investigation
(b) is accumulated in the organ of interest.
12. Use a flow diagram to outline the steps in obtaining technetium-99m
from its parent isotope.
13. Explain why technetium-99m is such an ideal radioisotope for medical
imaging.
14. Describe how a radioisotope of your choice is used in a PET
investigation. In your answer you should name the isotope and state what
radiation is emitted and how it is monitored. You should describe what
measurements are made and how they are used to obtain a result. You should
also mention any precautions or safety procedures.
15. (a) Describe what is meant by a positron.
(b) Identify how positrons may be obtained.

(c) Identify issues associated with positron–
electron interaction and describe how this
interaction is used in medical diagnosis.
16. Examine figure 20.1 (page 381), which shows a
study of kidneys. Compare the diseased kidney
with the healthy one and outline reasons for
the observed differences in the images.
17. Figure 20.12 (page 391) shows two different
types of studies of lungs.
(a) Contrast the studies.
(b) Relate the type of study to the disease
diagnosed.
18. Figure 20.10 (page 390) shows an X-ray of a
leg and a bone scan of the body.
(a) Compare the X-ray image with the bone
scan.
(b) Explain why there are differences in the
images obtained.
19. Using the internet or other sources:
(a) find a scanned image of at least two
healthy body parts or organs (the image
should have been obtained using radio-
isotopes)
(b) find a scanned image of the diseased
counterpart of the body parts or organs
(c) compare the images and outline why the
differences are obvious in the images.
(In a search engine, use phrases such as
‘radiopharmaceuticals’, ‘nuclear medicine
images’, ‘PET images’, ‘bone scans’ or
‘brain scans’ as key words. Go to a par-
ticular hospital web site and search for
‘nuclear medicine department’.)



CHAPTER 21
MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL
Remember
Before beginning this chapter you should be able to:
• recall what is meant by radio frequency
electromagnetic radiation
• recall that there is a magnetic field associated with a
charge moving in a circle
• recall what is meant by a superconductor.

Key content
At the end of this chapter you should be able to:
• describe how the net spin of protons and neutrons in
the nucleus is produced
• explain that nuclei with net spin produce a magnetic
field and this influences their response to an external
magnetic field
• describe the response of nuclei to a strong magnetic
field
• relate frequency of precession to the composition of
the nuclei and the strength of the applied magnetic
field
• discuss the effect of subjecting precessing nuclei to
pulses of radio waves
• explain that the amplitude of the radio signal emitted
by the nuclei as they relax increases as the number of
nuclei present increases
• contrast the relaxation time between tissue
containing hydrogen-bound water molecules and
tissue containing other molecules
• compare MRI scans of healthy and damaged tissue
• explain why MRI scans can be used to distinguish
between grey and white matter in the brain, to
identify areas of high blood flow and to detect
cancerous tissue
• identify the function of the following parts of MRI
equipment: electromagnet, radio frequency
oscillator, radio receiver and computer
• compare advantages and disadvantages of X-ray
images, CT scans, PET scans and MRI scans
• assess the impact of medical applications of physics
on society.

Figure 21.1 An image of a brain taken using MRI, showing multiple sclerosis
plaques (shown as black patches)

In this chapter you will learn how the properties of magnetic fields
of nuclei are used to produce images for medical diagnosis. Magnetic
resonance imaging (MRI) uses strong magnetic fields and the magnetic
properties of nuclei in the body to obtain clear images of the brain,
spinal cord and soft tissues such as muscle, tendons, cartilage and joints.
It produces excellent spatial resolution and hence fine detail between dif-
ferent tissues can be detected in the areas that are imaged.
Since 1977 when the first whole-body magnetic resonance image was
produced, MRI has developed at a remarkable pace and is now an indis-
pensable imaging technique. It poses minimum risk to the patient and
can provide accurate information about organ function and biochem-
istry as well as body anatomy.

21.1 THE PATIENT AND THE IMAGE USING MRI
If you are a patient undergoing an MRI scan, you are placed on a bed
that moves on a gantry inside a region where the magnetic field is very
strong. Before the MRI machine is turned on, all metal objects must be
removed from the patient and from the room because the magnet is so
strong that loose magnetic material can become missiles (see figure
21.2). Eddy currents are induced in any nearby metallic material when
the machine is operating because there are changing magnetic fields in
the MRI machine. If you have a pacemaker or other metallic implant you
will not be permitted to undergo MRI.
The following sequence of steps occurs while you are in the strong
magnetic field:
1. A pulse of electromagnetic radiation in the radio frequency range is
sent into your body.
2. This pulse is turned off.
3. Nuclei in your body produce a radio frequency pulse as a result of the
pulse that was sent into your body.
4. The radio waves emitted are analysed, amplified and processed by a
computer to form part of the image that is displayed on a screen.
5. The process is repeated rapidly with many radio frequency pulses
while the whole area of interest is scanned.

Figure 21.2 This metal chair was not secured in the room with the MRI
machine and, due to the strong magnetism, became stuck in the machine.

Figure 21.3 A patient having a scan using MRI

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 399


To understand MRI it is necessary to understand:
• how magnetic effects in the body arise
• how external magnets and then electromagnetic radiation can interact
with the nuclei in the body
• how data are collected from the radio waves produced to create an image.

Magnets in the body


Our bodies are made of a relatively small variety of elements combined
into a large variety of compounds. Hydrogen is the most commonly
imaged element in MRI because it is abundant in the body and gives the
strongest signal when subjected to the MRI process that we will outline.
About one-tenth of the mass of our body is hydrogen and of this approxi-
mately 70 per cent is in water molecules, 20 per cent is in fats and a small
amount is in proteins.
It is possible to use other nuclei to produce an image. These include
carbon-13, fluorine-19, sodium-23 and phosphorus-31. However, they are
not as abundant in the body as hydrogen and the signal from them is not
as strong.
Net spin is a property of a nucleus. If a nucleus has a net spin it
behaves as a tiny magnet.

All the nuclei used in magnetic resonance imaging have a net spin. Net
spin is a concept used in quantum mechanics and is described in more
detail in the Physics fact on page 401. If the nucleus of an atom has a net
spin, it may behave as a small magnet (see figure 21.4).
For this discussion the nucleus of a hydrogen atom will be considered,
because hydrogen is the most commonly imaged element in MRI. The
nucleus of a hydrogen atom is a proton and has a net spin. Individual
protons behave as small magnets. The magnetic field associated with the
protons in hydrogen atoms will be randomly orientated until an external
strong magnetic field is applied.
Protons become aligned due to the interaction of their magnetic field
with an external magnetic field. They align themselves either in the
direction of the external magnetic field (parallel) or in the opposite
direction to the external magnetic field (anti-parallel). (The parallel and
anti-parallel result follows from quantum mechanics and is beyond the
scope of this discussion.) Each of these orientations corresponds to a
slightly different energy state and this fact is important in the MRI
process. The lower energy state corresponds to parallel alignment and
slightly more of the protons are found in this state when an external
magnetic field is applied.

Figure 21.4 The magnetic field associated with a hydrogen proton is like
that around a bar magnet. (Labels: proton; spin axis; B (magnetic field);
N and S poles.)

Figure 21.5 Alignment of protons in an external magnetic field: (a) no
external field (random orientation); (b) in an external field $B_0$
(alignment)



PHYSICS FACT
Net spin of a nucleus
You will recall from the Preliminary Course (Physics 1 Preliminary Course,
3rd edition, chapter 11) that linear momentum is possessed by objects
moving in a straight line. Spinning objects and orbiting objects have
momentum called angular momentum. In classical mechanics, angular momentum
has a direction perpendicular to the plane in which the object is spinning
or orbiting. If the object is moving clockwise, when viewed from above, the
direction of the angular momentum is down. If the object is moving
anticlockwise, the direction of the angular momentum is up.
In atomic physics, electrons can exist in different energy levels around
the nucleus. Using quantum mechanics, each electron can be thought of as
having a spin angular momentum and an orbital angular momentum. The total
angular momentum is a combination of the spin angular momentum and the
orbital angular momentum. (This is often thought of as an electron that is
spinning and orbiting the nucleus, although the electrons are not actually
spinning or orbiting the nucleus like a satellite around the Earth. The
angular momentum of the electron is a concept from quantum mechanics.)
The nucleus of an atom is composed of positively charged protons and
neutral neutrons, collectively called nucleons. The model that deals with
nuclear spin is called the Nuclear Shell Model. Each nucleon has an energy
level associated with it, just like the energy levels of electrons. Each
nucleon has a total angular momentum which is a combination of the spin
angular momentum and the orbital angular momentum. The equation from
quantum mechanics that gives the total angular momentum is:

$$J = \frac{Ih}{2\pi}$$

where
J = total angular momentum
h = Planck’s constant
I = net spin.
The net spin can be zero, multiples of $\frac{1}{2}$, or whole numbers.
To calculate the net spin for nuclei we need to follow some rules from
quantum mechanics. These are:
• There can be a maximum of two protons or two neutrons in any energy
level in the nucleus. A proton cannot be paired with a neutron.
• The pair of neutrons will have opposite spins to one another, called
spin-up and spin-down. The same rule applies to a pair of protons. It
should be noted that spin is a concept from quantum mechanics and there is
no evidence that the protons and neutrons are actually spinning.
In general:
1. If the mass number is even but the atomic number is odd, the net
nuclear spin is a whole number.
2. If the mass number and the atomic number are both even, the net nuclear
spin is zero.
3. If the mass number is odd, the net nuclear spin is a multiple of
$\frac{1}{2}$.
It is very difficult to calculate the net spin for a nucleus without using
quantum mechanics at a level beyond this course. Usually the net spin of a
nucleus is looked up in a table to determine whether the spin is non-zero.
The nucleus would behave as a small magnet if the spin was non-zero.
Table 21.1 shows the net spin for some nuclei.

Table 21.1 Net spin for selected nuclei

NUCLEUS          NET SPIN
Hydrogen-1       1/2
Helium-4         0
Carbon-13        1/2
Fluorine-19      1/2
Sodium-23        3/2
Aluminium-27     5/2
Phosphorus-31    1/2

Although any nucleus with a non-zero net spin can respond to an external
magnetic field and could be used in MRI, it is the nucleus of hydrogen,
$^{1}_{1}\mathrm{H}$, that is used most frequently, because hydrogen is a
very common component of chemicals in the human body.



PHYSICS FACT
Energy levels of protons
We are familiar with the idea that electrons orbiting an atom can have
discrete energy levels. (We say their energy is quantised.) If an electron
jumps from a high energy level to a lower energy level it releases a bundle
of energy, known as a quantum, equivalent to the difference in energy of
the two levels. This energy is often measured in the convenient units of
electron volts (eV) where 1 eV = $1.6 \times 10^{-19}$ joules.
The energy released in these electron transitions is of the order of 10 eV
and often corresponds to a frequency in the visible electromagnetic
radiation range. This is what happens when particular elements are excited
and then return to a lower energy level, emitting their own characteristic
colours and allowing the element to be identified (this is discussed in the
Astrophysics option, chapter 15, pages 282–284).
A similar absorption or release of energy occurs when hydrogen protons,
having been subjected to a strong magnetic field, move between parallel and
anti-parallel orientations in the hydrogen atom. The energy difference
between these two levels is only about 0.2 µeV and corresponds to
electromagnetic radiation of frequency about 42.5 MHz. This is in the radio
frequency range. Hence by directing the correct radio frequency waves at
the protons, they can be made to absorb energy and change from being
parallel to being anti-parallel to the external magnetic field.

21.2 THE MRI MACHINE: EFFECT ON ATOMS IN THE PATIENT
An MRI machine must provide:
• a strong magnetic field
• additional weaker varying magnetic fields, called gradient magnetic fields
• pulses of radio frequency waves
• detectors of the radio frequency waves
• computers to analyse the signals received.

Figure 21.6 A block diagram of an MRI machine. (Components shown: computer
with data processor, display, storage and operator interface; RF
transmitter and amplifier; RF receiver and amplifier; magnet power supply;
position selection; field gradient coils; receiver and transmitter RF
coils; main magnet.)



Effects on the orientation of nuclei of applying
a strong magnetic field
The patient undergoing MRI lies on a bed in a cylindrical bore called a
gantry and is subjected to a uniform magnetic field of strength between
0.2 and 2 teslas (T). The strength of this field can be appreciated when
we realise that it is more than 10 000 times the strength of the Earth’s
magnetic field. A magnetic field with a strength of approximately 2 T
results in an image of better resolution than that obtained when the field
is 1 T or less.
The magnetic field causes protons in the hydrogen atoms in the
patient’s body to try to line up either parallel or anti-parallel to the
external field.

PHYSICS FACT
The strong external magnetic field
You may ask how such a strong magnetic field can be produced by a machine.
There are three possible ways such a magnet could be produced.
1. Permanent magnets
The permanent magnets are usually made of an alloy of aluminium, cobalt and
nickel (alnico). Due to their large weight, only 0.2 T field strength can
be achieved. Even then the magnet weighs about 80 tonnes! They need no
power supply, the field does not extend significantly beyond the magnet and
the running costs are low. However, the magnet cannot be switched off and
the scanning time is long, producing an image of only reasonable quality.
In figure 21.7 we can see that the external magnetic field does not exist
along the patient’s body.
2. Electromagnets
Electromagnets are created by passing a direct current through a coil of
copper wire. Magnetic field strengths no greater than 0.5 T are generated
due to the significant loss of energy through heat in the wires. For
example, to generate a magnetic field of 0.15 T, over 60 per cent of the
energy put in is converted to heat in the wires. Up to 150 litres of water
per minute must be pumped through the system to remove this heat. The
running cost is high as a large power supply is needed. An advantage of
having a power supply is that it can be switched on and off. The magnetic
field extends beyond the coils so shielding is necessary. The weight of the
magnet is only about 2 tonnes and installation costs are relatively low.
Although the scan time is long, the image quality is good.

Figure 21.7 Magnetic field from a permanent magnet (labels: large permanent
magnet; N and S poles; $B_0$)

Figure 21.8 Magnetic field from an electromagnet (labels: electromagnet;
electric current; horizontal magnetic field)
(continued)



3. Superconducting magnets
These magnets use a coil of superconducting material, which is cooled with
liquid helium to the critical temperature. This means that huge currents
can flow with very little power input and magnetic fields up to 2.5 T can
be obtained. Shielding is needed because the large magnetic field extends
outside the coil. Cost of installation is high for this magnet, which
weighs about 6 tonnes. The maintenance of the coolants also adds to the
running costs. To shut the magnet down the coolant must be slowly drained.
The use of this magnet allows short scan times with excellent image
quality. The limited supply of helium as a coolant has fuelled research to
find suitable superconducting material that becomes a superconductor at the
temperature of liquid nitrogen or even at higher temperatures.

Figure 21.9 A superconducting magnet shown in cross-section. The external
magnetic field runs along the length of the patient’s body. (Labels: coils
of superconducting material; liquid nitrogen at temperature 77 kelvin;
liquid helium (coolant) at temperature 4 kelvin; vacuum chambers; shield
cooled by liquid nitrogen; bore at room temperature, 293 kelvin, width 1 m;
$B_0$; outer casing; nitrogen vent; helium vent.)

Precession
When the nuclei with net spin change their magnetic field orientation in
response to the external magnetic field, they do not remain in a steady
position along the external magnetic field, but rather precess around the
direction of the magnetic field.
To understand precession, consider a spinning top. It stays in the
upright position while it is spinning rapidly. However, if it slows down and
starts to tip over due to gravity, or if you tilt it off its vertical orientation
and allow the force of gravity to try to tip it over, the spinning top starts
to wobble on its axis, tracing out a conical path. This motion, called
precession, is similar to the motion of the nuclei in response to the force of
the external magnetic field (see figure 21.10).

Precession is the movement, in a conical path, of the axis of a
spinning object.

The frequency with which a nucleus precesses in a given magnetic field
is called the Larmor frequency. The Larmor frequency is different for
different nuclei in the same magnetic field, as illustrated in table 21.2.
We can use the Larmor frequency in a given magnetic field to identify an
element.

The Larmor frequency is the frequency with which the spin axis of a
nucleus precesses about the direction of an external magnetic field, in
response to the force due to that field.

Table 21.2 Larmor frequency of nuclei

NUCLEUS          LARMOR FREQUENCY IN A 1.0 T MAGNETIC FIELD (MHz)
Hydrogen-1       42.57
Carbon-13        10.70
Phosphorus-31    17.24

404 MEDICAL PHYSICS


The Larmor frequency depends on the strength of the external magnetic field. The Larmor frequency is 42.57B0 MHz for hydrogen protons, where B0 is the strength of the external magnetic field in tesla. This equation shows that the Larmor frequency is proportional to the strength of the applied external magnetic field. If the external field strength is 1 T, the Larmor frequency is 42.57 MHz. The frequency 42.57 MHz is in the radio frequency range, a fact that is critical for the MRI process because radio frequency signals are made to interact with the precessing hydrogen protons.

Figure 21.10 Precession of a nucleus due to the application of an external magnetic field (Labels: precessional orbit; magnetic field of nucleus; nucleus; external magnetic field)
Calculating the Larmor frequency

SAMPLE PROBLEM 21.1
Calculate the Larmor frequency for hydrogen protons in a magnetic field of strength 1.6 T.

SOLUTION
For hydrogen protons:
Larmor frequency = 42.57B0 MHz
where B0 = 1.6 T
∴ Larmor frequency = 42.57 × 1.6 MHz
                   = 68.11 MHz
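
The calculation in sample problem 21.1 can also be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are hypothetical, the values per tesla are the ones listed in table 21.2, and the sketch assumes that the Larmor frequency of every nucleus scales in direct proportion to B0 (the text states this explicitly only for hydrogen protons).

```python
# Larmor frequency per tesla (MHz/T), read from table 21.2 (values at 1.0 T).
LARMOR_MHZ_PER_TESLA = {
    "hydrogen-1": 42.57,
    "carbon-13": 10.70,
    "phosphorus-31": 17.24,
}


def larmor_frequency_mhz(nucleus, b0_tesla):
    """Return the Larmor frequency in MHz, assuming it is proportional to B0."""
    return LARMOR_MHZ_PER_TESLA[nucleus] * b0_tesla


# Repeat sample problem 21.1 (B0 = 1.6 T) for each nucleus in the table.
for name in LARMOR_MHZ_PER_TESLA:
    print(f"{name}: {larmor_frequency_mhz(name, 1.6):.2f} MHz at 1.6 T")
```

Because each nucleus gives a different frequency in the same field, the frequency at which resonance occurs indicates which nucleus is responding.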

Application of radio frequency pulses


With the patient in the strong magnetic field, so that the protons in the hydrogen atoms are precessing around the direction of the external magnetic field, a pulse of radio frequency electromagnetic radiation is beamed into the patient. The frequency is chosen to correspond exactly with the Larmor frequency. This is the frequency of precession of the protons. The protons will resonate with the radio frequency, and so absorb its energy, move to a higher energy level and precess in phase with one another. A radio oscillator produces pulses of a precise frequency. The radio oscillator can change the frequency of the signal to match different Larmor frequencies.

To resonate means to absorb energy when an applied frequency matches the natural frequency of an object.
The amount of energy absorbed is very small, corresponding to the
small energy difference between the parallel and anti-parallel precessing
protons. (Recall from the Physics fact on page 402 that this energy is in
the radio frequency range.) When the pulse is switched off, the protons
release the absorbed energy. The intensity and duration of the energy
signal released is analysed, enabling part of an image of a ‘slice’ through
the patient’s body to be obtained. (The complex steps from releasing the
energy to creating an image are discussed later in the chapter.)
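
To get a feel for how small the absorbed energy is, the photon relationship E = hf can be applied to a radio frequency photon at the Larmor frequency. The sketch below is an illustration only; the function name is hypothetical and the constants are rounded textbook values.

```python
PLANCK_J_S = 6.626e-34     # Planck's constant (J s), rounded
JOULES_PER_EV = 1.602e-19  # one electronvolt expressed in joules, rounded


def photon_energy_joules(frequency_hz):
    """Energy of one photon from E = hf."""
    return PLANCK_J_S * frequency_hz


# Larmor frequency of hydrogen protons in a 1.0 T field (42.57 MHz).
energy = photon_energy_joules(42.57e6)
print(f"E = {energy:.3e} J = {energy / JOULES_PER_EV:.3e} eV")
```

The result is of the order of 10⁻²⁶ J per photon, many orders of magnitude below the energy of an X-ray photon.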
You may be wondering how a single slice can be imaged when the
Larmor frequency is the same for all the protons that are precessing from
one end of the patient’s body to the other. Ability to distinguish between
‘slices’ of the patient is achieved by changing the strength of the strong
field slightly and uniformly along the length of the patient’s body by the
use of gradient coils. In this way, the Larmor frequency changes along
the patient’s body. Then the radio frequency oscillator can change the
frequency of the pulse of the signal to match the particular Larmor fre-
quency of the chosen ‘slice’ of the body.

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 405


The gradient magnetic field
A gradient magnetic field, which changes by small uniform amounts, is applied along the length of the patient’s body. This field adds to the external field, varying it by no more than 1 per cent (see figure 21.11). This means that the Larmor frequency changes along the length of the patient’s body.

A gradient magnetic field is one that changes by small known increments throughout the region of the field.

Figure 21.11 A gradient magnetic field changes by small amounts over a distance. (Graph of gradient field B against distance: on one side of zero the gradient field adds to B0 because B and B0 are in the same direction; on the other side the gradient field subtracts from B0.)

The gradient magnetic field can be generated from oppositely wound coils that have the same current flowing but vary uniformly in how tightly they are wound. In figure 21.12, the direction of the current in the right-hand coil produces a magnetic field which adds to the external field. The field is stronger at the far right-hand end of the coil because the coils are wound more tightly. The way the coils are wound on the left-hand side results in a field which is opposite in direction to the external field and so reduces the external field. The tightness with which the coils are wound changes uniformly and so the strength of the field produced changes uniformly.

Figure 21.12 A gradient field generated by coils of wire (Labels: external field B0; gradient field B)

Determining the gradient magnetic field


SAMPLE PROBLEM 21.2 A small child of height 1.0 m is placed in the magnetic field (1.5 T) of an
MRI machine. A gradient magnetic field is applied to the external field.
This gradient field increases the external field by 0.005 T for every 10 cm
between the midpoint of the child and her head, and decreases the external
field in the same way between the midpoint of the child and her feet.
Determine:
(a) the total external magnetic field strength at the child’s chin, which is
20 cm below the top of her head
(b) the total external magnetic field strength at the bottom of her feet.

SOLUTION (a) The child’s chin is 20 cm below the top of her head, so it is 50 − 20 = 30 cm above her midpoint. As the field is increased by
0.005 T for every 10 cm moved towards her head,
external field strength = 1.5 + 3 × 0.005 T
= 1.515 T
(b) The child’s feet are 50 cm below her midpoint. As the field is
decreased by 0.005 T for every 10 cm moved below her midpoint,
external field strength = 1.5 − 5 × 0.005 T
= 1.475 T
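
The reasoning in sample problem 21.2 can be sketched in Python to show how the gradient field gives each position along the body its own Larmor frequency. This is a minimal illustration: the function and variable names are hypothetical, positions are measured from the child's midpoint (positive towards the head), and the 42.57 MHz per tesla figure for hydrogen protons is taken from the text.

```python
GAMMA_H_MHZ_PER_T = 42.57  # hydrogen protons, from the text


def total_field_tesla(b0_tesla, gradient_t_per_m, offset_m):
    """Total field at a signed distance from the midpoint (positive = towards the head)."""
    return b0_tesla + gradient_t_per_m * offset_m


# 0.005 T for every 10 cm is a gradient of 0.05 T per metre.
gradient = 0.005 / 0.10

for label, offset in [("chin", 0.30), ("midpoint", 0.0), ("feet", -0.50)]:
    field = total_field_tesla(1.5, gradient, offset)
    frequency = GAMMA_H_MHZ_PER_T * field
    print(f"{label}: {field:.3f} T, Larmor frequency {frequency:.2f} MHz")
```

Tuning the radio frequency pulse to one of these frequencies selects the corresponding ‘slice’ of the body.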

406 MEDICAL PHYSICS


Removal of the radio frequency pulses
When the radio frequency pulse is stopped, the protons release their absorbed energy. This energy released is the same as that absorbed from the radio frequency pulse. The energy is detected by a radio receiver, then the signal is sent to computers to be analysed.
How can this radio frequency signal be analysed to identify where in
the slice the signal is coming from and to distinguish one type of
hydrogen compound from another? We will look at these questions in
the following sections.

Localising the signal within a slice


In order to determine exactly where particular signals originate, two further small gradient fields in the plane of the slice are applied.
In one direction the gradient field modifies the phase of the precessing protons very slightly. That is, it makes the protons slightly out of step with one another as they precess.
In the other direction, at right angles to the first, the gradient field alters the frequency of the precessing protons. It is common for the gradient fields in the two directions to divide the ‘slice’ into 256 × 256 voxels. The response from protons in each voxel can be determined from the very complicated returning signal (see figure 21.13). This signal is analysed mathematically by a process called Fourier transformation and each voxel is given an intensity value from which an image can be constructed. The gradient fields need to be switched on and off rapidly, allowing the pulses to be repeated at a fast rate. For an MRI machine using a magnetic field of 1.5 T, the radio frequency signal within each pulse oscillates about 64 million times per second (42.57 × 1.5 ≈ 63.9 MHz). Every time a pulse is sent in and then switched off, new information about the composition of particular voxels is obtained.

A voxel is a small volume. It is part of a ‘slice’ through the body.
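
The idea behind Fourier transformation can be illustrated with a deliberately simplified, one-dimensional sketch. It assumes a single row of eight voxels in which voxel k emits a signal at its own frequency offset of k cycles per measurement; the voxel amplitudes and all function names are hypothetical. A real scanner uses both phase and frequency encoding and a two-dimensional transform, so this is a toy model, not the method used in practice.

```python
import cmath


def simulate_signal(amplitudes):
    """Add up the signals from all voxels; voxel k oscillates at k cycles per record."""
    n_samples = len(amplitudes)
    return [
        sum(a * cmath.exp(2j * cmath.pi * k * n / n_samples)
            for k, a in enumerate(amplitudes))
        for n in range(n_samples)
    ]


def discrete_fourier_transform(signal):
    """Separate the combined signal into the contribution at each frequency."""
    n_samples = len(signal)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n, s in enumerate(signal)) / n_samples
        for k in range(n_samples)
    ]


# Hypothetical signal strengths for a row of eight voxels.
voxel_amplitudes = [0.0, 1.0, 0.2, 0.8, 0.8, 0.3, 1.0, 0.0]

combined = simulate_signal(voxel_amplitudes)
recovered = [abs(x) for x in discrete_fourier_transform(combined)]
print([round(r, 3) for r in recovered])
```

Although the receiver records only the combined signal, the transform recovers an intensity value for each voxel, which is what allows an image to be built.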
Figure 21.13 Localising the signal within a single slice. (Diagram: a grid showing part of slice XX at 1.5 T, in which each cell is a single pixel, or picture element. A gradient field By along the y direction provides phase encoding: the protons are all at the same frequency but slightly out of phase, with the proton spin vector given a ‘large kick’ or a ‘small kick’. A gradient field Bx along the x direction provides frequency encoding: the protons are all in phase but at slightly different frequencies, from about 63.85 MHz to 63.89 MHz as Bx rises from 1.5000 T to 1.5010 T.)

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 407


Distinguishing one type of hydrogen compound from
another — factors influencing the strength of the signal
returned from the patient
If the signal strength returned is large, the area on the image is bright;
and if the signal is small, the area is dark. The signal is influenced by
many factors. However, the three most important factors are:
• proton density
• type of tissue
• rate of radio frequency pulse (the rate will enable contrast to be changed).
The first two factors are properties of the material being imaged. The
third factor is imposed on the material in order to change the contrast in
the image.

Figure 21.14 Images using MRI and showing clear contrast of soft tissue (a) MRI of the lungs, showing cancer on the right lung (in orange) (b) MRI of a normal pelvis (c) A patient’s spinal column showing damaged discs (circled) (d) MRI image of a normal skull

408 MEDICAL PHYSICS


The greater the density of hydrogen protons, the larger the signal and
the brighter the image. Air and outer bone have no hydrogen and so
appear dark in an MRI image. Cerebrospinal fluid in the brain and spinal
column has a large amount of water that is not bound to other mol-
ecules. The hydrogen protons in the water are quite mobile, therefore
water produces a strong signal and shows up bright on an image. Soft
tissue also appears quite bright. However, it is difficult to differentiate
between sections of the soft tissue, unless relaxation time, discussed
below, is taken into account.
The type of tissue determines how easily the protons can release
their energy to neighbouring nuclei. By examining the time it takes to
‘relax’ or reach the original energy state, the type of tissue can be
identified.
By changing the rate at which the radio frequency pulse is sent into the patient, we can make the image of one type of tissue brighter or less bright and hence change the contrast in the image, and identify what type of tissue we are looking at.

Contrast refers to the brightness difference between parts of the image.
Measuring relaxation time and changing the
contrast in images
When the radio frequency pulse is removed, the protons move back to their original energy level. This process is called relaxation and the time it takes to do that is called the relaxation time.

Relaxation refers to the precessing nuclei moving back to their original energy state.

There are two relaxation times that can be measured, and the measurements can be used to emphasise different aspects of the image. The resulting different images are said to be T1 weighted or T2 weighted.
For the relaxation time called T1, relaxation energy is transferred
from precessing protons to the surrounding molecules. If the mol-
ecules have a natural frequency of oscillation close to the Larmor fre-
quency, energy is more rapidly released. The T1 value is small. This is
true for larger molecules or bound water molecules, which move more
slowly than free water molecules. Fat, liver and spleen tissues have a
short T1. An image emphasising small T1 values is said to be T1
weighted.
You may be wondering how the T1 image can be made. It depends on
the rate at which the radio frequency pulses, mentioned in the previous
section, are switched on and off. If the radio frequency pulses are
repeated quickly before all the protons have had time to relax, those with
a short relaxation time will keep absorbing and releasing energy and
appear bright. (Fat molecules, which are large, will appear bright in this
case.)
For the relaxation time called T2, we are measuring the time for pro-
tons to go out of phase with one another by exchanging energy with one
another. T2 is short for solids and larger molecules which are found in
tendons, muscle and liver, and long for watery tissues. Images recording
the long T2 relaxation time are said to be T2 weighted.
Once again, the pulse rate of the radio frequency signals is important
in obtaining T2 weighted images. If the radio frequency pulses are
repeated more slowly, watery tissues will have had time to relax before
the next pulse is sent in and so they will appear bright. Table 21.3 on the
following page summarises the features involved in the different
weighted images, and figure 21.15 shows some examples of images.
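
The effect of the pulse repetition rate on brightness can be sketched with a simple recovery model. The model is an assumption brought in for illustration and does not come from this chapter: it takes the signal available at each new pulse to be proportional to 1 − e^(−TR/T1), where TR is the time between pulses. The T1 values below are round, illustrative figures rather than measured data, and the names are hypothetical.

```python
import math

# Illustrative (assumed) T1 relaxation times in milliseconds.
ASSUMED_T1_MS = {"fat": 250.0, "watery tissue": 3000.0}


def relative_signal(repeat_time_ms, t1_ms):
    """Fraction of protons that have relaxed, and can absorb again, after one repeat time."""
    return 1.0 - math.exp(-repeat_time_ms / t1_ms)


for repeat_time in (500.0, 6000.0):
    print(f"Time between pulses: {repeat_time:.0f} ms")
    for tissue, t1 in ASSUMED_T1_MS.items():
        print(f"  {tissue}: relative signal {relative_signal(repeat_time, t1):.2f}")
```

Under these assumptions, rapid pulsing leaves the fat signal strong while the watery tissue signal stays weak, which is the contrast described for T1 weighted images. Slow pulsing gives watery tissue time to relax, so it is no longer suppressed.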

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 409


Table 21.3 Types of images and their features

T1 weighted
  Appearance of organs: Fat and larger molecules are bright. Water is dark.
  Usefulness: For body structure. Excellent for soft tissue detail.

T2 weighted
  Appearance of organs: Watery tissues and diseased tissues bright. Tendons, muscle and liver are dark.
  Usefulness: Preferred for investigating diseased areas.

Proton density weighted images
  Appearance of organs: Urine and cerebrospinal fluid are highlighted because the density of water (protons) is high.
  Usefulness: Preferred for showing diseased organs.

Figure 21.15 The same section through the brain showing images that are (a) proton density weighted, (b) T1 weighted and (c) T2 weighted.

Each time the pulse is switched on and off, the gradient fields used to
locate signals within a ‘slice’ are also switched on and off. The rapid
switching of the fields causes the loud noises heard by the patient during
the scan.
If the manipulation of contrast is inadequate, artificial contrast agents
may be injected intravenously or taken orally. Agents that travel through
the blood are used for highlighting tissues with a large number of blood
vessels, such as tumours, or for imaging blood vessels themselves. The
contrast agents have the potential to make the signal stronger from
specific tissues or even from regions in which specific genes are being
expressed. When genes are being expressed, they are producing certain
proteins that can be detected using MRI. Genetic research is already
using MRI in this way.

21.3 MEDICAL USES OF MRI


MRI is considered to be the best diagnostic imaging technique for struc-
tural resolution and contrast. It depicts soft tissue so well that it is the
preferred choice for imaging the brain and spine, where it is able to show
suspected tumours and slipped discs. It is useful for imaging areas with
large amounts of water as these areas have many hydrogen nuclei.
Cancerous tumours contain different amounts of water from normal tissue or are surrounded by watery tissue. They can be distinguished in an MRI scan because of the different brightness due to different proton density.
Grey matter in the brain and spinal cord contains hydrogen bound in
a different way from that in white matter. As a result, the relaxation times
for hydrogen protons are different in grey matter and white matter. This
means they are able to be distinguished in an MRI scan. The fact that
nerve cells of grey matter can be distinguished from those of white
matter can be used in the diagnosis of multiple sclerosis.
A clear image of the brain and spinal cord can be made without the
skull or spine interfering, because bone contains no hydrogen and will
not show up in an MRI scan.

PHYSICS FACT

Cardiac MRI allows investigation of congenital abnormalities and coronary heart disease to be carried out. Improvements in the speed of MRI have made abdominal imaging possible. Early MRI machines took 10 minutes to scan 24 ‘slices’ of the body and this can now be done in under 1 second. Injection of a contrast agent into the blood, combined with rapid imaging techniques, now allows blood flow in the kidneys to be examined and narrowing of the arteries due to fatty plaques to be seen.

Functional MRI allows parts of the brain to be investigated while changes are taking place. There is an increased flow of oxygenated blood to areas that are stimulated. Knowledge of the magnetic properties of oxygenated blood allows parts of the brain involved in processing sensory data or motor tasks to be identified and studied (see figure 21.16). Parts of the brain may be able to be studied prior to surgery.

Figure 21.16 Brain activation showing increased blood flow due to a simple motor task

With the development of special non-metallic tools and more open magnet geometries for MRI machines, minimally invasive procedures and open surgery can be performed inside the MRI scanner.

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 411


21.4 COMPARISON OF THE MAIN
IMAGING TECHNIQUES
Table 21.4 provides some comparison between imaging techniques.
Improvements are being made in the machines used in all these imaging
methods, and students are advised to search the internet for the latest
advances. At the time of printing, the white boxes represent the pre-
ferred method for imaging the organ or tissue indicated.

Table 21.4 Comparing imaging techniques

Cost of machine (capital cost)
  Ultrasound: Moderately expensive
  X-rays: Least expensive
  CT: Quite expensive
  Nuclear: Quite expensive
  MRI: Very expensive

Mobility of machines
  Ultrasound: Portable machines commonly used
  X-rays: Small portable machines available
  CT: Fixed machines
  Nuclear: Fixed machines
  MRI: Very few mobile machines

Spatial resolution (ability to see fine detail)
  Ultrasound: 1–5 mm
  X-rays: 0.1 mm
  CT: 0.25 mm
  Nuclear: 5–15 mm
  MRI: 0.3–1 mm

Time for examination
  Ultrasound: Moderate
  X-rays: Very fast
  CT: Moderate
  Nuclear: May be long, depending on tracer and procedure.
  MRI: Relatively long but some procedures are now quite short

Comfort and safety
  Ultrasound: No known hazards
  X-rays: Small dose of ionising radiation
  CT: Usually higher dose of ionising radiation than for X-rays
  Nuclear: Moderate dose of ionising radiation from radioisotopes
  MRI: Some claustrophobia from lying inside the bore containing the magnetic field. Patients with metallic implants cannot be scanned.

Imaging soft tissue of abdomen
  Ultrasound: Excellent, especially for obstetric cases, as it is safe and real-time imaging is possible (see page 344)
  X-rays: Image poor; needs contrast medium
  CT: Good for whole abdomen scan
  Nuclear: Good for growth of tumours and functional study of liver and kidneys (see page 381)
  MRI: Good resolution for specific areas e.g. kidneys

Imaging soft tissue of joints
  Ultrasound: Reasonable if bone can be bypassed
  X-rays: Poor contrast
  CT: Good; preferred to MRI when extra bone detail is needed (see page 414)
  Nuclear: Poor resolution but good for functional information
  MRI: Excellent for studying muscles, tendons and cartilage (see page 408)

Imaging heart and circulation
  Ultrasound: Excellent for structure and using Doppler technique for blood flow (see page 355)
  X-rays: Contrast medium is needed
  CT: Limited use with digital imaging techniques
  Nuclear: Good for blood flow studies
  MRI: Good resolution and ability to measure blood flow

Imaging chest
  Ultrasound: Poor as air–tissue boundary reflects sound waves
  X-rays: Adequate for routine lung screening (see page 361)
  CT: Better detail than X-rays
  Nuclear: Good for functional studies of blood and air flow
  MRI: Not good for imaging air spaces

Imaging brain and spinal cord region
  Ultrasound: Poor as bone–tissue boundary blocks sound waves
  X-rays: Limited use as bone blocks most waves
  CT: Good and preferred to MRI for details of bone of spine (see page 414)
  Nuclear: PET scans are useful for showing function
  MRI: Excellent for giving good contrast between tissues

Imaging bone
  Ultrasound: Poor as waves are blocked by bone
  X-rays: Gives very good resolution (see page 367)
  CT: Good when more complicated structures must be viewed
  Nuclear: Good for whole body bone cancer and early diagnosis of stress fractures
  MRI: Signal is weak so of limited use.

Figure 21.17 Comparison of (a) PET and (b) MRI scans of the brain

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 413


Figure 21.18 (a) CT scan of the lower disc in the spine (b) MRI scan of the same area of discs of the spine

414 MEDICAL PHYSICS


CHAPTER REVIEW

SUMMARY

• Nuclei with net spin align themselves in an external magnetic field and precess about the field direction with a frequency dependent on the strength of the field.
• Subjecting precessing nuclei to pulses of radio waves at the Larmor frequency causes some nuclei to move from a low to a high energy state.
• When the high energy nuclei relax, they produce a signal, the intensity of which is related to the number of nuclei present.
• The relaxation signal allows information about the abundance of the atom and its bonding with neighbouring atoms to be determined.
• By changing the rate at which the radio frequency pulses are switched on and off, different aspects of the image can be emphasised.
• The MRI machine consists of a strong magnet and at least three other varying magnets, a radio frequency oscillator, a radio receiver, and a computer.
• An MRI scan detects soft tissue clearly, allowing cancerous tissue and areas of high blood flow to be detected and grey and white matter in the brain to be distinguished.

QUESTIONS

1. Compare the types of strong magnets that may be used for the MRI machine. (See the Physics fact on magnets, page 403.) Use a table for your answer.

2. (a) Outline what is meant by:
       (i) proton spin
       (ii) proton precession.
   (b) Describe how an external magnetic field influences a hydrogen proton.

3. (a) Outline two reasons why the hydrogen nucleus is imaged more than other nuclei in MRI.
   (b) When protons are in a strong magnetic field they can occupy one of two possible energy states. Describe these energy states and state which is the higher of the two.
   (c) Describe in what sense the MRI technique is nuclear.

4. (a) Explain the function of the magnetic gradient fields used in MRI.
   (b) How many magnetic gradient fields are used in an MRI machine?
   (c) A patient of height 1.8 m is positioned horizontally in a steady, uniform, horizontal magnetic field of flux density 1.0 T, running from her feet to her head. If a gradient field of 8.0 mT m⁻¹ is applied horizontally from her feet to her head (as in figure 21.11, page 406), calculate the magnetic flux density 1.4 m from her feet.

5. Outline why gradient magnetic fields are needed to image a slice through the body.

6. Explain why artificial metal joints between bones and metal fillings in teeth are a problem with MRI.

7. Use figure 21.15 (page 410) to answer these questions.
   (a) Explain why the cerebrospinal fluid that surrounds the brain may look brighter than brain tissue on an MRI image.
   (b) Predict whether grey or white matter in the brain would look darker on an MRI image. Justify your answer.

8. (a) Compare the superconducting external magnetic field used in MRI with the Earth’s magnetic field.
   (b) (i) What are superconducting magnets and why do most MRI scanners use superconducting magnets?
       (ii) Are superconducting magnets electromagnets? Justify your answer.

9. Describe two different pieces of information that can be obtained from the relaxation of the protons in the nucleus of a hydrogen atom.

10. ‘Medical physics has produced a wide range of harmless techniques that avoid the use of invasive surgery.’ Evaluate this statement.

11. Examine the photograph in figure 21.14(c) (page 408). From this MRI image, compare the healthy and damaged discs.

12. Research and then explain why MRI scans can be used to:
    (a) detect cancerous tissue
    (b) identify areas of high blood flow
    (c) distinguish between grey and white matter in the brain.
    To help with your research use a search engine and key words such as ‘MRI and cancer’, ‘blood flow and MRI’, ‘grey and white matter and MRI’, ‘relaxation time and MRI’, ‘T1 and T2 weighted images’, ‘proton density and MRI’.

CHAPTER 21 MAGNETIC RESONANCE IMAGING AS A DIAGNOSTIC TOOL 415


13. Compare the advantages and disadvantages of X-ray scans, CT scans, PET scans and MRI scans. Use your own research and information gathered from chapters 18–21 of this book. For example, figure 21.18 on page 414 compares CT and MRI scans of the spine.

14. Assess the impact of medical applications of physics on society. Plan your answer to this question before you begin to write a full answer. In your final answer make sure you address the meaning of the verb ‘assess’. Ideas for planning are given below.
    (a) Make a list of medical applications of physics. Be more specific than simply listing, for example, ultrasound. List particular medical applications of ultrasound.
    (b) Consider the impacts on society that are important. They might include:
        (i) cost, affecting the time a patient is away from work or the time of treatment or place of treatment
        (ii) cost of equipment, which might affect the health budget
        (iii) ability to diagnose early, leading to effect on the recovery rate of patients. The effect on people closely associated with the patient should be considered.
        (iv) type of diagnosis and treatment available. Include to what extent the procedure is invasive.
    Resources to use in planning could include:
    • newspaper articles about medical applications relevant to the areas you have studied
    • relevant journal articles such as those found in New Scientist
    • relatives or friends who can recall medical diagnoses and treatment available 20, 40 and 60 years ago.
    Finally, link the impacts on society with the medical applications you have identified.
    In your presentation of the answer to the question, begin with a statement of your assessment of the impact. In the following paragraphs, link the applications and the evidence to support your assessment. Conclude with a summary statement of your assessment based on the evidence that you have provided.

416 MEDICAL PHYSICS


HSC OPTION MODULE
FROM QUANTA TO QUARKS

Chapter 22
The atomic models of Rutherford and Bohr

Chapter 23
Development of quantum mechanics

Chapter 24
Probing the nucleus

Chapter 25
Nuclear fission and other uses of nuclear physics

Chapter 26
Quarks and the Standard Model of particle physics

CHAPTER 22
THE ATOMIC MODELS OF RUTHERFORD AND BOHR
Remember
Before beginning this chapter, you should be able to:
• recall the discovery of the electron by J. J. Thomson
• outline Thomson’s ‘plum pudding’ model of the
atom
• recall the contributions of Planck and Einstein to the
development of the quantum model of light
(photons)
• state the relationship between the energy and
frequency of a photon (E = hf).

Key content
At the end of this chapter you should be able to:
• discuss the main features of the Rutherford model of
the atom and identify difficulties with this model
• understand the role that the hydrogen spectrum
played in leading Bohr to formulate his model of the
atom
• discuss the contribution of Planck to the concept of
quantised energy
• state Bohr’s postulates
• understand that with Bohr’s postulates superimposed on the Rutherford atom, it is possible to derive a theoretical equation for the hydrogen spectrum that is in agreement with Balmer’s empirical equation
• solve problems using $\frac{1}{\lambda} = R\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$
• discuss the limitations of the Rutherford–Bohr model of the hydrogen atom.

Figure 22.1 Photograph of ‘aurora australis’, the southern lights. The stars’ trails indicate that the photograph is a time exposure of several minutes. In an aurora, atoms of the gases in the upper atmosphere emit radiation after being excited by interactions with charged particles from the Sun.
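
As a preview of the kind of problem listed in the key content, the hydrogen spectrum equation can be evaluated numerically. The sketch below is an illustration only: the function name is hypothetical, the Rydberg constant is a rounded value (about 1.097 × 10⁷ m⁻¹), and the transitions chosen are those ending on n = 2, which give the visible lines of the Balmer series.

```python
RYDBERG_PER_M = 1.097e7  # Rydberg constant R (per metre), rounded


def wavelength_nm(n_final, n_initial):
    """Wavelength in nanometres from 1/lambda = R (1/n_f^2 - 1/n_i^2)."""
    inverse_wavelength = RYDBERG_PER_M * (1 / n_final**2 - 1 / n_initial**2)
    return 1e9 / inverse_wavelength


# Transitions ending on n = 2.
for n_initial in range(3, 7):
    print(f"n = {n_initial} to n = 2: {wavelength_nm(2, n_initial):.1f} nm")
```

The first of these lines falls at about 656 nm, in the red part of the visible spectrum.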
In the late nineteenth century, many physicists believed that the answers
had been found to all the major questions in physics. Electricity, mag-
netism, light, mechanics, cosmology, gravity — all, they claimed, could be
understood using the theories of Newton and Maxwell, which we now
refer to as ‘classical’ theories. Many chemists thought similarly about
chemistry. They were sure that with elements, each with its own indivis-
ible atom, and the discovery of the periodic table, there was little left to
discover. There were minor problems in physics but it seemed likely that
these would soon be explained in terms of the existing theories. Even as
some discoveries of new phenomena occurred, there still seemed no
doubt that classical physics would explain all.
However, discoveries made from 1895 onwards eventually saw the demise of classical physics. Some aspects of classical physics were found to be inadequate and were replaced with a theory that became known as quantum theory.

Quantum refers to a quantity or an amount (from the Latin word quantum meaning ‘how much’). In ‘classical physics’ an object could possess any amount of energy. In quantum theory objects could possess only certain discrete amounts of energy. Instead of being ‘continuous’, energy was available only in ‘packets’.

The ideas that led to quantum theory flew in the face of accepted science. Although there were still groups of scientists who denied the existence of atoms, the late nineteenth century saw the atom become generally accepted as a small, indivisible chunk of matter. This was to be challenged by the work of J. J. Thomson and Ernest Rutherford. After fighting for the existence of atoms, many scientists regarded it as heresy when Thomson proposed that electrons were constituents of atoms. Rutherford proposed a nuclear atom and then Bohr looked at introducing ideas of quantum theory to atomic structure. As quantum theory developed, aspects of it were so strange that even some of the most famous physicists were not happy to apply it but found it the only possible way to explain their observations.
Perhaps the most amazing thing is that technology based on quantum theory works. It has given physics and chemistry a firm scientific base. We will study the findings of some of the most significant physicists in these early stages of understanding, and particularly the work of Rutherford and Bohr.

‘Anyone who is not shocked by quantum theory has not understood it.’
Niels Bohr (1885–1962)

‘I don’t like it, and I’m sorry I ever had anything to do with it.’
Erwin Schrödinger (1887–1961)

22.1 THE RUTHERFORD MODEL OF THE ATOM

In 1895, Ernest Rutherford (1871–1937), a New Zealand born physicist, went to work with Joseph John (J. J.) Thomson (1856–1940) at the Cavendish Laboratory at Cambridge University in England. As we saw in chapter 10 (pages 180–185), J. J. Thomson had identified the electron as a component of the atom in 1897. The model of the atom changed from the small indestructible sphere of Dalton to the ‘plum pudding’ model of Thomson. Negatively charged electrons were considered to be distributed throughout a sphere of positive charge. Ernest Rutherford and John Townsend helped Thomson with his work that led to this discovery of the electron, although it was Thomson who designed and performed the crucial experiment.

Rutherford’s early work at the Cavendish Laboratory also included wireless signalling and at one time he held the world record for distance communication of about a kilometre.

The first alpha particle scattering experiment


In 1898, Rutherford moved to McGill University in Montreal where he investigated radioactivity. While there, Rutherford had a difference of opinion with Henri Becquerel (1852–1908), who had discovered radioactivity in 1896. Rutherford had shown that alpha particles were deflected by magnetic fields. Becquerel studied the passage of alpha particles through magnetic fields and believed (incorrectly) that he had observed that the radius of curvature of the alpha particles increased as they moved greater distances through the magnetic field. He believed that the alpha particles increased in mass when passing through air and the increase in mass was responsible for the increase in radius. He did not like this idea but preferred it to the other alternative, that alpha particles increase their velocity as they pass through air (see ‘Becquerel’s predicament’ below).
Rutherford demonstrated that alpha particles slow down as they collide
with air molecules. He repeated Becquerel’s experiment but with alpha
particles passing through a magnetic field in air and also in a vacuum. He
found that the beam of alpha particles was wider in air than in a vacuum.
He observed that when he put a thin mica sheet in the beam it deflected
the alpha particles by up to two degrees. Rutherford calculated that a
relatively large electric field must be present in the mica sheet to deflect
the alpha particles. As far as is known, he did not speculate about the
origin of that electric field.
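
The disagreement with Becquerel turns on the relation r = mv/(qB) set out in ‘Becquerel’s predicament’ below. The following is a minimal sketch of that relation, assuming an alpha particle of mass 6.644 × 10^−27 kg and charge 2 × 1.602 × 10^−19 C. The field strength of 1.0 T and the halved speed are illustrative assumptions, not values from the experiments.

```python
# Sketch of r = m v / (q B) for an alpha particle in a uniform magnetic field.
# The 1.0 T field and the slowed speed are illustrative assumptions.
ALPHA_MASS = 6.644e-27          # kg
ALPHA_CHARGE = 2 * 1.602e-19    # C (two elementary charges)


def path_radius(mass, speed, charge, field):
    """Radius (m) of the circular path of a charged particle moving
    at right angles to a uniform magnetic field."""
    return mass * speed / (charge * field)


if __name__ == "__main__":
    fast = path_radius(ALPHA_MASS, 1.6e7, ALPHA_CHARGE, 1.0)
    slow = path_radius(ALPHA_MASS, 0.8e7, ALPHA_CHARGE, 1.0)
    print(f"radius at 1.6e7 m/s: {fast:.3f} m")
    print(f"radius at 0.8e7 m/s: {slow:.3f} m")
```

With m, q and B fixed, halving the speed halves the radius, which is consistent with Rutherford's finding that alpha particles slow down in air.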

Figure 22.2 Ernest Rutherford (Lord Rutherford of Nelson), was awarded the Nobel Prize for Chemistry in 1908. The award that year was a matter of intrigue. In an attempt to award Nobel prizes to two atomists (Planck and Rutherford) in the same year, Dr Arrhenius, Director of the Nobel Institute for Physical Chemistry, arranged for Rutherford to be nominated for the chemistry prize and for that prize to be determined before the physics prize. In the end, Planck did not win the physics prize that year, but was eventually awarded it ten years later.

PHYSICS FACT
Becquerel’s predicament
A charged particle moving in a circular path in a uniform magnetic field will experience a centripetal force of magnitude $F_c = \frac{mv^2}{r}$, which is provided by the magnetic force of magnitude $F = qvB$. When these equations are combined, an expression for the radius of the path can be determined as $r = \frac{mv}{qB}$.
We can see Becquerel’s predicament. He believed that r was increasing and hence either m or v (or both) would have to increase as q and B were constant. Although he did not like the idea of an increasing mass he preferred it to an increasing velocity and defended it very strongly when Rutherford challenged him.
The real problem was the photographic detection method Becquerel was using. The radius actually decreased.

Geiger and Marsden’s alpha particle scattering experiment
Rutherford did nothing more with alpha particle scattering until 1907
when he moved to Manchester, England. There he inherited Dr Hans
Geiger (1882–1942), a German physicist, as his assistant. Rutherford
returned to his investigations of the scattering of alpha particles, this time
by very thin metal foils. Rutherford suggested to Ernest Marsden (1889–
1970), an undergraduate student being trained in radioactive detection
techniques by Geiger, that Marsden could determine whether alpha par-
ticles were directly reflected from a metal surface. Marsden observed that
a very small fraction of the alpha particles were reflected from a thin gold
foil. (About 1 in 8000 alpha particles were deflected at an angle greater
than 90°.) Geiger and Marsden published these results in 1909. They used
a very simple apparatus with a thin conical tube containing ‘radium
emanation’, which we now know as radon, as their source of alpha
particles (see figure 22.3).

Figure 22.3 Drawing of the apparatus used by Geiger and Marsden, from their original paper published in 1909. AB is the conical tube sealed at the end with a mica sheet. P represents a lead shield that prevents the alpha particles from travelling directly to the scintillation screen, S. RR represents the thin metal foil and M the low power microscope through which the scintillations were observed.

Rutherford and Geiger also developed the Rutherford–Geiger detector, later improved by Geiger and Müller and known as the GM tube or, more commonly, a ‘Geiger counter’.

Geiger and Rutherford had confirmed that each scintillation (flash of light observed on the scintillation screen) was produced by an alpha particle and that all of the alpha particles produced a scintillation.
Geiger and Marsden reported the number of scintillations observed per minute for a number of different metals. They also investigated the number of scintillations observed per minute for different thicknesses of gold foil. The radon gas was at low pressure in the conical tube but the experiment was performed in air. The alpha particles from the conical tube did not form a well-defined beam and, while Geiger and Marsden were able to detect that alpha particles had been deflected through large angles, their simple apparatus was not able to detect a significant change in the number of particles deflected through different angles. (The apparatus was later refined to permit the measurement of angles and the experiment was also performed in an evacuated chamber; see figure 22.4.)

A scintillation is a flash of light observed on a scintillation screen. Another example of scintillation is electrons striking the screen of a cathode ray oscilloscope. The screen produces many scintillations when it is struck by electrons. Of course, the continuous beam of electrons produces a constant glow, not individual flashes as would be observed when alpha particles hit such a screen.

PHYSICS FACT

Several years later, during World War I, Hans Geiger and Ernest
Marsden found themselves on opposite sides of the same sector of
the front line in France. In 1915, Marsden had been appointed Pro-
fessor of Physics at Victoria College in Wellington, New Zealand.
Marsden joined the New Zealand army as a signals officer and
returned to the fighting in France where he won the Military Cross.
While in France he received a letter from Geiger, congratulating him
on his appointment. Rutherford also kept in touch with Geiger and
other German scientists by sending letters via mutual friends.
In one of his last lectures, Ernest Rutherford described his reaction
to Marsden’s discovery of deflection of alpha particles through large
angles as ‘the most incredible event that has ever happened to me in
my life. It was almost as incredible as if you had fired a fifteen-inch
shell at a piece of tissue-paper and it came back and hit you’.

The nuclear atom


It was two years after the publication of the paper by Geiger and Marsden
that Rutherford explained the result by proposing a nuclear atom.
Rutherford informed Geiger that he knew what the atom looked like.
Using his nuclear model, he had worked out the relative numbers of
alpha particles that would be scattered through different angles and
Geiger began a series of careful experiments (see figure 22.4 on the
following page).

Figure 22.4 The apparatus used to study alpha particle scattering in 1911. In this version, the microscope and scintillation screen can be rotated to observe the alpha particles at different angles. Polonium was used as the alpha particle source, the metal foil used was gold and the apparatus was evacuated. (Diagram labels: source of α particles (polonium); metal foil; metal box; vacuum; zinc sulfide screen; microscope and outer box can be rotated; the angle of deflection, marked at 0°, 90° and 150°. The vast majority of the α particles pass straight through the metal foil. α particles scattered by large angles are rare events.)

Geiger made observations of the number of alpha particles scattered through different angles and his results confirmed the predictions of Rutherford. The distribution of the scattered alpha particles confirmed that the scattering was caused by an inverse square force, which Rutherford took to be the electrostatic force.
As the thickness of the gold foil was increased, the number of alpha particles scattered first increased but then remained constant. This confirmed that the alpha particles were in fact interacting with the atoms of gold. The scattering of alpha particles through small angles could be accounted for by the alpha particles undergoing a large number of interactions with different atoms, each interaction contributing a small amount to the total scattering. Even scattering at about 90° could be accounted for by multiple scattering. However, the probability of multiple scattering producing deflections of more than 90° (see figure 22.5) was so small that Rutherford concluded that the deflection must be due to the encounter of the alpha particle with a single atom.
Once it was established that the deflection was due to an encounter with a single atom, Rutherford showed that the charge that caused the deflection must be concentrated in a region about 10 000 times smaller than the radius of the atom. The alpha particles, which were known to have a velocity of about $1.6 \times 10^{7}$ m s$^{-1}$, would penetrate to within $3 \times 10^{-12}$ cm of the centre of the atom before being turned back. He concluded that most of the mass and positive charge of the atom must be concentrated in a very small nucleus. Rutherford’s model of the atom is shown in figure 22.6.

Figure 22.5 Deflection of alpha particles (Diagram labels: incident α particles; target foil; atom; nucleus.)

Figure 22.6 Diagram of the Rutherford model of the atom of hydrogen (Diagram labels: electron (−), in orbit around nucleus; proton (+), the nucleus.)

PHYSICS FACT
A scale model of a hydrogen atom?
The radius of a hydrogen atom (in its first excited state; see page 433) is about $2.1 \times 10^{-10}$ m. The radius of a proton is about 0.85 femtometres ($0.85 \times 10^{-15}$ m). Physicists sometimes call this unit a fermi, named after Enrico Fermi (see chapter 24, pages 462–463). The ratio of the radius of this atom to the radius of its nucleus is about $2.5 \times 10^{5}$. This would make it very difficult to construct an accurate scale model of an atom in your laboratory. If your laboratory was 10 m across and this represented the diameter of the atom, the diameter of the nucleus would have to be $4 \times 10^{-5}$ m or 40 microns.
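
The scale-model arithmetic can be checked with a short sketch. It uses only the two radii quoted in the Physics Fact; the function name and the 10 m model size passed to it are illustrative.

```python
# Sketch of the scale-model calculation, using the radii quoted in the text.
ATOM_RADIUS = 2.1e-10       # m, hydrogen atom in its first excited state
PROTON_RADIUS = 0.85e-15    # m, radius of the proton


def size_ratio():
    """Ratio of the atom's radius to the radius of its nucleus."""
    return ATOM_RADIUS / PROTON_RADIUS


def model_nucleus_diameter(model_atom_diameter):
    """Diameter (m) of the nucleus in a model whose atom has the given diameter (m)."""
    return model_atom_diameter / size_ratio()


if __name__ == "__main__":
    print(f"ratio of radii: {size_ratio():.2e}")
    print(f"nucleus in a 10 m model: {model_nucleus_diameter(10.0) * 1e6:.0f} microns")
```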

Electrons in the Rutherford atom
When Rutherford published his paper ‘The scattering of alpha particles
by matter and the structure of the atom’ (1911) he stated: ‘It will be
shown that the main deductions from the theory are independent of
whether the central charge is supposed to be positive or negative. For
convenience, the sign will be assumed to be positive. The question of the
stability of the atom proposed need not be considered at this stage, for
this will obviously depend upon the minute structure of the atom, and on
the motion of the constituent parts . . .’
Rutherford knew that if the electrons were in orbit around the
nucleus, they would be accelerating. They would be expected to be
emitting electromagnetic radiation in accordance with the theories of
Maxwell and, of course, the atom would be unstable.

22.2 BOHR’S MODEL OF THE ATOM


Niels Bohr (1885–1962), a Danish physicist, was one of eleven Nobel prize winners who were trained by Rutherford. One of Bohr’s first contributions was to predict that a hydrogen atom would contain only one electron outside the positively charged nucleus. (At the same time, others predicted that one-electron atoms could not exist.)

Before moving to Manchester to work with Rutherford, Bohr had worked for a short time at the Cavendish Laboratory under J. J. Thomson. Bohr and Thomson did not get along. At their first meeting, Bohr informed Thomson that one of Thomson’s equations was wrong. There were several similar incidents and Bohr later recalled his disappointment that Thomson was not interested to learn that his work was incorrect. Bohr did acknowledge that his own lack of knowledge of the English language contributed to the failure of the two men to hit it off.

Figure 22.7 Niels Bohr

Planck, Einstein and ‘quantised energy’
Bohr attempted to apply the new quantum ideas of Planck and Einstein to the model of the hydrogen atom. As we saw in chapter 11 (page 201), Planck had managed to find an equation that solved the problem of the ‘ultraviolet catastrophe’ that troubled the theory of black-body radiation. Planck held a traditionalist’s view of physics and was opposed to the statistical processes of Boltzmann. After attempting to explain his black-body equation, Planck reluctantly tried to derive it using the methods of Boltzmann. This involved dividing the energies up into small amounts and eventually should have finished with an integration in which all the energies would have been added together and would have experienced the problem of an infinite energy. However, before that final step, Planck realised that he had reached his equation for black-body radiation and therefore did not ‘complete’ the process. Einstein later showed that the problem of infinities will occur in any process where ‘classical’ theories and quantum theories are linked.
Planck interpreted his result as meaning that the ‘atomic oscillators’ that produced the radiation could vibrate only with certain discrete amounts of energy. These discrete amounts of energy were called quanta.

A quantum (plural: quanta) of energy can be considered to be the smallest amount of energy possible in a given situation. Planck’s atomic oscillators could oscillate only with certain precise amounts of energy.

Einstein later extended this idea to the radiation itself being quantised. Einstein’s ‘quanta of light’ were later named ‘photons’ by Gilbert Lewis (1875–1946).

A photon is the quantum of electromagnetic radiation which exhibits both a particle and wave nature.

Bohr uses quantum theory to explain the spectrum of hydrogen
Bohr knew that, somehow, atoms must produce radiation that formed a characteristic spectrum for each element (see Physics in focus on ‘The spectra of gases’). Bohr realised that the ‘atomic oscillators’ of Planck were probably electrons in the atom. The Rutherford model failed to provide any information about the radius of the atom or the orbital frequencies of the electrons. Bohr attempted to introduce the quantum ideas of Planck to the atom, but at first failed.
Early in 1913, Bohr was introduced to Balmer’s equation (see below) for the wavelengths of the spectral lines of hydrogen and it ‘made everything clear to him’. After seeing this equation, Bohr realised how electrons were arranged in the hydrogen atom and also how quantum ideas could be introduced to the atom.

An empirical equation is one that has no theoretical basis but can be used to calculate correct values. Kepler’s Third Law, $T^2 \propto R^3$, which you encountered in ‘The Cosmic Engine’, is another example of an empirical equation.

PHYSICS FACT
Balmer’s equation
Johann Jakob Balmer (1825–1898) completed a PhD in mathematics in 1849. He became a teacher at a girls’ school in Basel, Switzerland, and had a desire to ‘grasp the harmonic relationships of nature and art numerically’. Anders Ångström (1814–1874) had measured the wavelengths of four of the spectral lines of hydrogen (now known as the Balmer series). Balmer found an equation that enabled him to calculate the wavelengths of these and, he believed, the infinite number of spectral lines emitted by hydrogen.
Balmer’s equation was $\lambda = b\left(\frac{n^2}{n^2 - 2^2}\right)$ and the constant b was found empirically to be 364.56 nm.
Janne Rydberg (1854–1919) modified Balmer’s equation for wavelength to produce the familiar equation:
$\frac{1}{\lambda} = R_H\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$
where
$\lambda$ = wavelength of the emitted radiation
$R_H$ = Rydberg’s constant ($R_H = 1.097 \times 10^{7}$ m$^{-1}$)
$n_f$ and $n_i$ are integers.
The wavelengths of the visible lines of hydrogen correspond to $n_f = 2$ and $n_i$ = 3, 4, 5 or 6. Of course, this is an empirical equation (Balmer played around with numbers until he arrived at something that worked).
Sometimes the equation $\frac{1}{\lambda} = R_H\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$ is known as the Rydberg equation. Sometimes it is called Balmer’s equation. Rydberg had attempted to find his own equation for the spectral lines of hydrogen. He was unsuccessful and, as his contribution was to modify Balmer’s equation, we will continue to refer to it as the Balmer equation.

Figure 22.8 Johann Jakob Balmer

Calculating the wavelengths of hydrogen spectral lines
SAMPLE PROBLEM 22.1
Calculate the wavelength of the visible spectral line of hydrogen with the longest wavelength.
SOLUTION
From Balmer’s equation $\frac{1}{\lambda} = R_H\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$, we can see that the longest wavelength will occur when the term $\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$ is smallest.
As the visible spectral lines correspond to $n_f = 2$, the smallest value will be when $n_i = 3$.
$\frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - \frac{1}{3^2}\right)$
$= 1.097 \times 10^{7}\left(\frac{1}{2^2} - \frac{1}{3^2}\right)$
$= 1.524 \times 10^{6}$
$\lambda = 6.562 \times 10^{-7}$ m
The wavelength is $6.562 \times 10^{-7}$ m. This is the wavelength of the red line in the hydrogen spectrum in figure 22.9.
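
The same calculation can be repeated for all four visible lines. The sketch below assumes only the Rydberg constant quoted in the text; the function name `balmer_wavelength` is illustrative.

```python
# Sketch: wavelengths of the visible hydrogen lines from
# 1/lambda = R_H (1/n_f^2 - 1/n_i^2), with n_f = 2.
RYDBERG = 1.097e7   # m^-1, value quoted in the text


def balmer_wavelength(n_i, n_f=2):
    """Wavelength (m) of the line emitted in a transition from n_i down to n_f."""
    if n_i <= n_f:
        raise ValueError("n_i must be greater than n_f for emission")
    inverse_wavelength = RYDBERG * (1 / n_f**2 - 1 / n_i**2)
    return 1 / inverse_wavelength


if __name__ == "__main__":
    for n_i in (3, 4, 5, 6):
        print(f"n_i = {n_i}: {balmer_wavelength(n_i) * 1e9:.1f} nm")
```

Without intermediate rounding, the n_i = 3 line comes out at about 656.3 nm. The worked solution obtains 656.2 nm because it rounds 1/λ to four significant figures before inverting.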

PHYSICS IN FOCUS
The spectra of gases
There are three types of emission spectra: continuous spectra, bright-line spectra and band spectra. Continuous spectra are produced by incandescent objects, bright-line spectra are produced by excited gases and band spectra are produced by excited molecules. We will consider the bright-line emission spectra of excited gases and also the absorption spectra of cool gases (as shown in figure 22.9).
Spectral lines are produced as images of the slit that is an essential component of any spectroscope. After passing through the slit, the different wavelengths of light are diffracted by different amounts by a grating or dispersed by a prism by different amounts. Hence, the images of the slit corresponding to the different wavelengths are separated. When the slit is very narrow, closely spaced lines can be resolved (distinguished from one another). If the slit is wider, more light is admitted, but at the expense of the resolution.

Figure 22.9 A continuous white light spectrum, the emission spectra of excited atoms of the elements sodium, mercury and hydrogen, and an absorption spectrum. The red line in the hydrogen spectrum is known as the Hα line, and the other lines as Hβ, Hγ and Hδ respectively. (Panels: continuous spectrum; sodium; mercury; hydrogen; absorption spectrum. Each panel has a scale running from 400 to 750.)

Figure 22.10 When the light emitted from an excited gas is passed through a prism, a series of coloured lines is observed. The lines correspond to the colours (wavelengths) of light emitted by the gas. (Diagram labels: slit; prism or diffraction grating; photographic plate.)

Figure 22.11 When white light is passed through a cool gas, the gas absorbs radiation from the light. After the light has been passed through a prism, the colours corresponding to the wavelengths of light absorbed by the gas are absent from the spectrum. An absorption spectrum of dark lines on a bright coloured background is observed. (Diagram labels: incandescent bulb producing continuous spectrum; light from source; cool gas absorbs light of certain wavelengths, and the gas atoms then emit light of the same wavelengths but emit it in all directions; this light is deficient in certain wavelengths; absorption spectrum.)

An emission spectrum (see figure 22.10) is produced when a gas is excited. A gas can be excited by heating it or by passing an electrical discharge through it. The emission spectrum is a series of narrow coloured lines on a dark background. Each element has its own characteristic spectrum and this can be used to identify the gas.
An absorption spectrum (see figure 22.11) is produced when white light is passed through a cool gas. The atoms in the gas absorb energy from the white light. The atoms will then re-emit the energy that was absorbed. The energy will be emitted as light and it will be emitted in random directions. Therefore, the transmitted beam of light will be deficient in light at those energies or wavelengths. When this light is analysed, it will show a continuous spectrum of the white light with a series of narrow dark lines across it.

An emission spectrum is a series of brightly coloured lines on a dark background that is produced when light from an excited gas is viewed through a spectroscope.

An absorption spectrum is a series of dark lines on a coloured background that is produced when white light is passed through a cool gas and viewed through a spectroscope.

PHYSICS IN FOCUS
Observing spectra using a simple spectroscope
Small, hand-held, direct vision spectroscopes can be used to examine spectra. However, if accurate measurements of wavelength are needed, a spectrometer that incorporates a collimator, prism or diffraction grating and telescope (see figure 22.12) is required. The method for using such a spectroscope is given in practical activity 22.1, ‘The spectrum of hydrogen’ (page 437).

Figure 22.12 A simple spectroscope or spectrometer. The light enters the collimator, C, which focuses parallel light rays onto the prism or grating, G. The telescope, T, which has been focused for parallel light rays, is used to observe the dispersed or diffracted light. The telescope can be rotated and accurate measurements can be made of the angle through which the light has been deviated. This enables the wavelengths of the spectral lines to be calculated.

A spectacular example of emission from excited atoms occurs in the production of an aurora (figure 22.1). In this case the colours are produced by excited atoms or ions present in the atmosphere above about 60 km. Neutral oxygen atoms can produce pink and green colours, nitrogen molecules produce a red–violet colour and nitrogen ions can produce blue–violet. The atoms or ions are excited by interactions with charged particles from the Sun, usually after intense solar activity. Auroras do not usually extend very far north but occasionally they are visible from New South Wales.

22.3 BOHR’S POSTULATES


While Bohr believed that he knew the arrangement of electrons, he
could not explain why the electrons were arranged in this way. Bohr pub-
lished three papers between April and August 1913. In these papers,
known as the great trilogy, he started with the problem of electrons in the
Rutherford model and pointed out that the accelerating electrons must
lose energy by radiation and collapse into the nucleus. He then applied
quantum theory to the atom. He generally assumed that the orbits of the
electron were circular.
Bohr was awarded the Nobel Prize for Physics in 1922 and in his Nobel
lecture stated in reference to Rutherford’s discovery of the nucleus:
‘This discovery made it quite clear that by classical conceptions alone it
was quite impossible to understand the most essential properties of
atoms. One was therefore led to seek for a formulation of the principles
of the quantum theory that could immediately account for the stability in
atomic structure and the radiation sent out from atoms, of which the
observed properties bear witness. Such a formulation I proposed [1913]
in the form of two postulates.’
When an electron is in a stationary state, it will orbit the nucleus without emitting any electromagnetic radiation.

Bohr continued with rather lengthy statements of his postulates. Simpler
statements are:
1. Electrons in an atom exist in ‘stationary states’ in which they possess
an unexplainable stability. Any permanent change in their motion
must consist of a complete transition from one stationary state to
another.
2. In contradiction to the classical electromagnetic theory, no radiation
is emitted from an atom in a stationary state. A transition between two
stationary states will be accompanied by emission or absorption of
electromagnetic radiation (a photon). The frequency, f, of this
photon is given by the relation:

hf = E1 − E 2

where
h = Planck’s constant
E1 and E 2 = values of the energy of two stationary states that form the
initial and final states of the atom.
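
The relation in the second postulate can be illustrated with a short sketch. The energies used in the example (−3.4 eV and −13.6 eV) are the values found for the second and first stationary states of hydrogen later in this chapter; the function name and the use of electron volts as input units are illustrative choices.

```python
# Sketch of Bohr's second postulate: hf = E1 - E2 for a transition
# between two stationary states. Energies are supplied in electron volts.
PLANCK = 6.626e-34          # J s
JOULES_PER_EV = 1.602e-19   # J


def photon_frequency(initial_energy_ev, final_energy_ev):
    """Frequency (Hz) of the photon emitted or absorbed in the transition."""
    energy_change = abs(initial_energy_ev - final_energy_ev) * JOULES_PER_EV
    return energy_change / PLANCK


if __name__ == "__main__":
    # Electron drops from the second to the first stationary state of hydrogen.
    frequency = photon_frequency(-3.4, -13.6)
    print(f"photon frequency: {frequency:.3e} Hz")
```

Taking the absolute value means the same function serves for emission and absorption, since the photon carries the magnitude of the energy difference in either case.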

Bohr then introduced what is generally known as his quantisation condition and is sometimes called his third postulate.
An electron in a stationary state has an angular momentum that is an integral multiple of $\frac{h}{2\pi}$ (Planck’s constant divided by 2π).
Bohr actually proposed that the kinetic energy of an electron was $\frac{n}{2}hf$ (where f is the orbital frequency of the electron), but this reduces to the quantisation condition given if the orbits are circular.

The angular momentum, L, of a point mass, m, which is in circular motion of radius, r, with velocity, v, is given by:
$L = mvr$
Angular momentum is the rotational equivalent to linear momentum and is an important quantity in rotational motion. (It follows a similar conservation principle to linear momentum.)

In his first postulate, Bohr put forward one of the most audacious hypotheses ever proposed in physics by predicting that electrons exist in states in which they do not radiate energy. The second postulate involves the quantum of energy being emitted or absorbed when an electron jumps from one stationary state to another and hence explains the origin of spectral lines. The quantisation condition is really an intuitive guess.
Using these postulates together with the energy of electrons calculated from ‘classical’ physics applied to the Rutherford model, it is possible to derive a theoretical equation for the wavelengths of the spectral lines of hydrogen. It is a great success of the Bohr model that this theoretical equation is the same as the empirical equation of Balmer.

Many famous physicists had addressed the problem of electrons being in non-uniform motion without radiating energy. This had become important after Thomson had discovered the electron and was not just associated with the Rutherford model.

22.4 MATHEMATICS OF THE RUTHERFORD AND BOHR MODELS
In the following sections we will derive an expression for the classical
energy of the Rutherford hydrogen atom and then impose Bohr’s postu-
lates on that atom. This will enable us to calculate the energies of the
stationary states of the hydrogen atom and then calculate the change in
energy of an electron involved in a transition between two stationary states.
Finally, this change in energy will enable us to calculate the frequency (or
wavelength) of the spectral lines of hydrogen.

The ‘classical’ energy of the Rutherford hydrogen atom
When you studied the escape velocity of an object fired from the Earth,
you found the total energy of the object was the sum of its kinetic energy
and its gravitational potential energy. When this total energy was zero,
the object had just enough energy to escape from the Earth. If the total
energy was negative, the object was unable to escape the Earth.
In a similar way we can calculate the total energy of a proton and elec-
tron. This time it is the sum of the kinetic energy and the electrical
potential energy. The zero point will be when the electron has just
enough energy to escape from the proton.
Kinetic energy of electron:
$E_k = \frac{1}{2} m_e v^2$.
The electron is held in orbit around the proton by the electrical force of magnitude:
$F_E = \frac{k q_e^2}{r^2}$
where
$q_e$ = magnitude of the charge on the proton and electron ($1.602 \times 10^{-19}$ C).

We know that this electrical force provides the centripetal force of magnitude:
$F_c = \frac{m_e v^2}{r}$
$F_c = F_E$
$\frac{m_e v^2}{r} = \frac{k q_e^2}{r^2}$
$\frac{1}{2}\frac{m_e v^2}{r} = \frac{1}{2}\frac{k q_e^2}{r^2}$
$\frac{1}{2} m_e v^2 = \frac{1}{2}\frac{k q_e^2}{r}$
$E_k = \frac{1}{2}\frac{k q_e^2}{r}$.
The potential energy of the electron is given by:
$E_p = -\frac{k q_e^2}{r}$.
The total energy is the sum of the kinetic and potential energies.
Total energy $= E_k + E_p$
$= \frac{1}{2}\frac{k q_e^2}{r} - \frac{k q_e^2}{r}$
$= -\frac{1}{2}\frac{k q_e^2}{r}$
This is the total ‘classical’ energy of Rutherford’s hydrogen atom.
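
As a numerical check on these expressions, the sketch below evaluates the kinetic, potential and total energies at a chosen orbital radius. The value k = 8.988 × 10^9 N m^2 C^−2 for the electrostatic constant and the radius of 5.3 × 10^−11 m are assumptions made for illustration; neither is stated in this section.

```python
# Sketch: classical energies of an electron in a circular orbit about a proton.
# K and the example radius are assumed values, not taken from the text.
K = 8.988e9       # N m^2 C^-2, electrostatic constant (assumed)
Q_E = 1.602e-19   # C, magnitude of the charge on the proton and electron


def classical_energies(radius):
    """Return (kinetic, potential, total) energy in joules at the given radius (m)."""
    kinetic = 0.5 * K * Q_E**2 / radius
    potential = -K * Q_E**2 / radius
    return kinetic, potential, kinetic + potential


if __name__ == "__main__":
    kinetic, potential, total = classical_energies(5.3e-11)
    print(f"E_k = {kinetic:.3e} J, E_p = {potential:.3e} J, total = {total:.3e} J")
    print(f"total = {total / Q_E:.2f} eV")
```

The total is negative and equal in magnitude to the kinetic energy, which is the sense in which the electron is bound to the proton.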

Radii of the ‘stationary states’ of the Bohr hydrogen atom
When Bohr’s quantisation condition is applied to the ‘classical’ hydrogen atom, the electron is restricted to stationary states in which the angular momentum of the electron is an integer multiple of Planck’s constant, divided by 2π.
Angular momentum $= \frac{nh}{2\pi}$
$m_e v r = \frac{nh}{2\pi}$
In this equation, n is an integer, known as the principal quantum number.

The value of n for each stationary state or orbit of the Bohr atom is called the principal quantum number of that stationary state or orbit.

We can obtain an expression for the radius of the stationary states corresponding to each value of the integer, n:
$m_e v r = \frac{nh}{2\pi}$
$r = \frac{nh}{2\pi m_e v}$
$r^2 = \frac{n^2 h^2}{4\pi^2 m_e^2 v^2}$.
From the earlier equation, $\frac{m_e v^2}{r} = \frac{k q_e^2}{r^2}$, we can obtain an expression for $v^2$:
$v^2 = \frac{k q_e^2}{m_e r}$.
Substituting this gives:
$r^2 = \frac{n^2 h^2}{4\pi^2 m_e^2 \frac{k q_e^2}{m_e r}}$
$r_n = \frac{n^2 h^2}{4\pi^2 m_e k q_e^2}$
where
$r_n$ = the radius of the stationary state corresponding to the integer n.
The radius of the stationary state corresponding to n = 1 will be:
$r_1 = \frac{1^2 h^2}{4\pi^2 m_e k q_e^2}$
$= \frac{h^2}{4\pi^2 m_e k q_e^2}$.
We can combine the expressions for $r_n$ and $r_1$ to give $r_n = n^2 r_1$.

Figure 22.13 The relative radii of the orbits of an electron in different stationary states in a hydrogen atom (the orbits shown are labelled $r_1$, $4r_1$ and $9r_1$)
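
The expression for the radius can be evaluated directly. This is a sketch under stated assumptions: the values used for Planck’s constant, the electron mass and the electrostatic constant k are standard rounded values that are not all quoted in this section, and the function name is illustrative.

```python
import math

# Sketch: radii of the stationary states, r_n = n^2 h^2 / (4 pi^2 m_e k q_e^2).
# H, M_E and K are assumed standard values (rounded).
H = 6.626e-34     # J s, Planck's constant
M_E = 9.109e-31   # kg, mass of the electron
K = 8.988e9       # N m^2 C^-2, electrostatic constant
Q_E = 1.602e-19   # C, magnitude of the electron charge


def stationary_state_radius(n):
    """Radius (m) of the stationary state with principal quantum number n."""
    return n**2 * H**2 / (4 * math.pi**2 * M_E * K * Q_E**2)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n = {n}: r = {stationary_state_radius(n):.3e} m")
```

The n = 2 result is close to the 2.1 × 10^−10 m quoted earlier for hydrogen in its first excited state, and the radii scale as 1 : 4 : 9 as in figure 22.13.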

Energies of the ‘stationary states’ of the Bohr atom
If we now return to the classical energy of the Rutherford hydrogen atom (energy $= -\frac{1}{2}\frac{k q_e^2}{r}$) and impose the restriction that the only possible energies correspond to values of radius given by $r_n = \frac{n^2 h^2}{4\pi^2 m_e k q_e^2}$, we can calculate the value of these energy states:
$E_n = -\frac{1}{2}\,\frac{k q_e^2}{\frac{n^2 h^2}{4\pi^2 m_e k q_e^2}}$
$= -\frac{1}{2}\,\frac{4\pi^2 k^2 m_e q_e^4}{n^2 h^2}$
$= -\frac{1}{n^2}\left(\frac{2\pi^2 k^2 m_e q_e^4}{h^2}\right)$.
Again we can see that:
$E_1 = -\left(\frac{2\pi^2 k^2 m_e q_e^4}{h^2}\right)$
and hence
$E_n = \frac{1}{n^2} E_1$
remembering that $E_1$ has a negative value.
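
Evaluating the expression for E_1 gives a useful check on the algebra. As before, this is a sketch that assumes standard rounded values for h, m_e and k; only q_e is quoted in this section.

```python
import math

# Sketch: energies of the stationary states,
# E_n = -(1/n^2) * 2 pi^2 k^2 m_e q_e^4 / h^2.
# H, M_E and K are assumed standard values (rounded).
H = 6.626e-34     # J s
M_E = 9.109e-31   # kg
K = 8.988e9       # N m^2 C^-2
Q_E = 1.602e-19   # C


def stationary_state_energy(n):
    """Energy (J) of the stationary state with principal quantum number n."""
    ground_state = -2 * math.pi**2 * K**2 * M_E * Q_E**4 / H**2
    return ground_state / n**2


if __name__ == "__main__":
    e1 = stationary_state_energy(1)
    print(f"E_1 = {e1:.3e} J = {e1 / Q_E:.2f} eV")
```

With these constants E_1 comes out close to −2.18 × 10^−18 J, or about −13.6 eV.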

Calculating the energies of electrons in the hydrogen atom
SAMPLE PROBLEM 22.2
Given that the energy of an electron in the first stationary state of hydrogen is $E_1 = -2.179 \times 10^{-18}$ J, determine the energy in electron volts (eV) of an electron in the following stationary states of the hydrogen atom:
(i) the first stationary state (n = 1)
(ii) the second stationary state (n = 2)
(iii) the tenth stationary state (n = 10).
SOLUTION
(i) We have been given this energy in joules so it is only a matter of converting to electron volts:
$1\ \text{eV} = 1.602 \times 10^{-19}$ J
$2.179 \times 10^{-18}\ \text{J} = \frac{2.179 \times 10^{-18}}{1.602 \times 10^{-19}}$ eV
$= 13.60$ eV.
The energy of the first stationary state is −13.6 eV.
(ii) The energy of an electron in the second stationary state, for which n = 2, is given by:
$E_n = \frac{1}{n^2} E_1$
$E_2 = \frac{1}{2^2} E_1$
$= \frac{-13.6}{4}$
$= -3.4$ eV.
(iii) The energy of an electron in the tenth stationary state, for which n = 10, is given by:
$E_n = \frac{1}{n^2} E_1$
$E_{10} = \frac{1}{10^2} E_1$
$= \frac{-13.6}{100}$
$= -1.36 \times 10^{-1}$ eV.
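
The three parts of this problem follow one pattern, which can be sketched as a single function. It uses only the value of E_1 and the conversion factor given in the problem; the function name is illustrative.

```python
# Sketch of sample problem 22.2: convert E_1 to electron volts,
# then apply E_n = E_1 / n^2.
E1_JOULES = -2.179e-18      # J, given in the problem
JOULES_PER_EV = 1.602e-19   # J


def energy_in_ev(n):
    """Energy (eV) of an electron in the nth stationary state of hydrogen."""
    e1_ev = E1_JOULES / JOULES_PER_EV
    return e1_ev / n**2


if __name__ == "__main__":
    for n in (1, 2, 10):
        print(f"n = {n}: E = {energy_in_ev(n):.3f} eV")
```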

Electron volt
When an electron gains energy as it is accelerated across a potential difference of V volts, its gain in energy is given by $W = q_e V$. When the electron is accelerated across a potential difference of 1.0 V, it will gain energy equal to $1.602 \times 10^{-19} \times 1.0 = 1.602 \times 10^{-19}$ J. The gain in energy of an electron accelerated across a potential difference of 1.0 V is also called 1.0 electron volts (eV). 1.0 eV = $1.602 \times 10^{-19}$ J.

Theoretical expression for wavelengths of the spectral lines of hydrogen
We are able to combine the expression for the energies of the stationary states with Bohr’s second postulate to derive an expression for the energy differences between stationary states and, hence, the energies of the photons that may be emitted or absorbed by hydrogen.
We will consider the emission of a photon as an electron jumps from a higher energy initial state, $E_i$, to a lower energy final state, $E_f$.
The change in energy of the electron is:
$$\Delta E = E_i - E_f = \frac{1}{n_i^2} E_1 - \frac{1}{n_f^2} E_1 = E_1 \left( \frac{1}{n_i^2} - \frac{1}{n_f^2} \right).$$
This is the energy of the emitted photon, hf.


We can now derive an expression for the frequency and wavelength of
the photon.
$$hf = E_1 \left( \frac{1}{n_i^2} - \frac{1}{n_f^2} \right)$$
$$f = \frac{-E_1}{h} \left( \frac{1}{n_f^2} - \frac{1}{n_i^2} \right)$$
$$\frac{c}{\lambda} = \frac{-E_1}{h} \left( \frac{1}{n_f^2} - \frac{1}{n_i^2} \right)$$
$$\frac{1}{\lambda} = \frac{-E_1}{hc} \left( \frac{1}{n_f^2} - \frac{1}{n_i^2} \right)$$

This equation is of the same form as Balmer’s equation. If the value of $\frac{-E_1}{hc}$ is calculated, it agrees with the value of the Rydberg constant in Balmer’s equation. (Remember that E1 is a negative quantity and, hence, −E1 is positive.)
Balmer’s equation is an empirical equation (see page 424). A theoretical equation derived from Bohr’s model of the atom now agrees with the empirical equation. This is a major achievement and offers very strong support for the Bohr model.
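The agreement can be checked numerically. The sketch below evaluates −E1/(hc) using the constants quoted in this chapter's sample problems and compares the result with the accepted Rydberg constant of about 1.097 × 10^7 m^−1. That comparison value is an assumption brought in from standard tables, not a figure quoted in this section, and the names used are illustrative only.

```python
# Evaluate -E_1 / (h c) and compare it with the Rydberg constant.
E1_JOULES = -2.179e-18       # energy of the first stationary state (J)
PLANCK = 6.626e-34           # Planck's constant (J s)
LIGHT_SPEED = 3.00e8         # speed of light (m/s)
RYDBERG_ACCEPTED = 1.097e7   # accepted value (m^-1), assumed from standard tables


def rydberg_from_e1(e1=E1_JOULES, h=PLANCK, c=LIGHT_SPEED):
    """Return -E_1 / (h c) in m^-1. E_1 is negative, so the result is positive."""
    return -e1 / (h * c)


if __name__ == "__main__":
    value = rydberg_from_e1()
    difference = abs(value - RYDBERG_ACCEPTED) / RYDBERG_ACCEPTED * 100
    print(f"-E1/(hc) = {value:.4e} m^-1")
    print(f"difference from accepted value = {difference:.2f} %")
```

With constants rounded to three or four significant figures, the two values should agree to within a small fraction of one per cent.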

432 FROM QUANTA TO QUARKS


Emission of photons from a hydrogen atom
SAMPLE PROBLEM 22.3 (a) Given that the energy of the first stationary state of hydrogen is
−13.60 eV, calculate the energy of the fourth stationary state of the
hydrogen atom.
(b) Use this information to calculate the frequency of the photon
emitted when an electron undergoes a transition from the state n = 4
to the state n = 1.
(c) Calculate the wavelength of the radiation emitted.
SOLUTION (a) The energy of the fourth stationary state is:
$$E_n = \frac{1}{n^2} E_1$$
$$E_4 = \frac{1}{4^2} E_1 = \frac{-13.6}{16} = -0.85\ \text{eV}.$$
(b) The energy of the emitted photon is the difference between the energies of the two states:
$$E_4 - E_1 = -0.85\ \text{eV} - (-13.60\ \text{eV}) = 12.75\ \text{eV}.$$
Energy of photon = $12.75 \times 1.602 \times 10^{-19}$ J
$$f = \frac{E}{h} = \frac{12.75 \times 1.602 \times 10^{-19}}{6.626 \times 10^{-34}} = 3.083 \times 10^{15}\ \text{Hz}$$
(c) The wavelength will be calculated from:
$$\lambda = \frac{c}{f} = \frac{3.00 \times 10^{8}}{3.083 \times 10^{15}} = 9.73 \times 10^{-8}\ \text{m}.$$
(Of course the wavelength could have been calculated directly from
Balmer’s equation.)
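The steps of this sample problem can be collected into one calculation. This is a sketch under the same rounded constants used above; `photon_from_transition` is a name invented for this illustration.

```python
# Photon emitted when an electron in hydrogen drops from n_initial to n_final.
E1_EV = -13.60              # energy of the first stationary state (eV)
JOULES_PER_EV = 1.602e-19   # 1 eV expressed in joules
PLANCK = 6.626e-34          # Planck's constant (J s)
LIGHT_SPEED = 3.00e8        # speed of light (m/s)


def photon_from_transition(n_initial, n_final):
    """Return (energy in eV, frequency in Hz, wavelength in m) of the photon."""
    if n_initial <= n_final:
        raise ValueError("emission requires n_initial > n_final")
    # Energy lost by the electron: E_i - E_f, with E_n = E_1 / n^2.
    energy_ev = E1_EV / n_initial**2 - E1_EV / n_final**2
    energy_joules = energy_ev * JOULES_PER_EV
    frequency = energy_joules / PLANCK      # f = E / h
    wavelength = LIGHT_SPEED / frequency    # lambda = c / f
    return energy_ev, frequency, wavelength


if __name__ == "__main__":
    energy, frequency, wavelength = photon_from_transition(4, 1)
    print(f"E = {energy:.2f} eV, f = {frequency:.3e} Hz, lambda = {wavelength:.2e} m")
```

Calling the function with other pairs of states gives the lines of the other spectral series.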

The hydrogen atom explained


We are now able to calculate the wavelengths of the many spectral lines of
the hydrogen atom. The original series of spectral lines was known as the
Balmer series and contained the four spectral lines in the visible region
of the spectrum. These lines correspond to electron jumps to the second
lowest energy state, or first excited state, (n = 2) of the hydrogen atom.
The wavelengths of the spectral lines in other series can be calculated
using Bohr’s equation and are shown in figure 22.14 on the following
page. The Paschen series of infra-red lines had already been discovered
but other series of lines were found later and their wavelengths were in
agreement with Bohr’s theory. The series of lines in the ultraviolet and
infra-red, named after their discoverers, are:
• Lyman series, discovered in 1916. These were ultraviolet lines with transitions to the ground state (n = 1).
• Paschen series, discovered in 1908. These were infra-red lines with transitions to the second excited state (n = 3).
• Brackett series, discovered in 1922. These were infra-red lines with transitions to the third excited state (n = 4).
• Pfund series, discovered in 1924. These were infra-red lines with transitions to the fourth excited state (n = 5).

An electron has the lowest possible amount of energy when it is in the ground state. If it exists in a stationary state in which it has more energy, it is said to be in an excited state.
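As an illustration of how such wavelengths are obtained, the sketch below uses the equation derived earlier, with −E1/(hc) replaced by the Rydberg constant, to calculate the four visible lines of the Balmer series (transitions from n = 3, 4, 5 and 6 to n = 2). The value 1.097 × 10^7 m^−1 is the accepted Rydberg constant, assumed here from standard tables, and the function name is illustrative only.

```python
# Wavelengths of hydrogen spectral lines from
# 1/lambda = R (1/n_f^2 - 1/n_i^2).
RYDBERG = 1.097e7   # Rydberg constant (m^-1), assumed accepted value


def line_wavelength_nm(n_final, n_initial):
    """Return the wavelength in nanometres for a jump n_initial -> n_final."""
    if n_initial <= n_final:
        raise ValueError("emission requires n_initial > n_final")
    inverse_wavelength = RYDBERG * (1 / n_final**2 - 1 / n_initial**2)  # m^-1
    return 1e9 / inverse_wavelength


if __name__ == "__main__":
    # The four visible Balmer lines end on the first excited state, n = 2.
    for n_initial in (3, 4, 5, 6):
        print(f"n = {n_initial} -> 2: {line_wavelength_nm(2, n_initial):.1f} nm")
```

Changing the first argument to 1, 3, 4 or 5 gives lines of the Lyman, Paschen, Brackett and Pfund series respectively.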



[Figure 22.14: (a) energy level diagram showing the transitions that make up the Lyman, Balmer, Paschen, Brackett and Pfund series; (b) electron orbits n = 1 to n = 6 showing the Lyman series (ultraviolet), Balmer series (visible light), Paschen series (infra-red), Brackett series (infra-red) and Pfund series (infra-red).]

Principal quantum number (n)    Energy E (eV)
7                               −0.28
6                               −0.38
5                               −0.54
4                               −0.85
3                               −1.51
2                               −3.4
1                               −13.6

Figure 22.14 (a) Atomic energy level view of the spectral series of hydrogen (b) Electron orbit view of the spectral series of hydrogen. Note that the radii of the orbits of the electrons are not to scale.

22.5 LIMITATIONS OF THE BOHR MODEL OF THE ATOM
The Bohr model takes the first step to introduce quantum theory to the
hydrogen atom but it is only a first step. The model has the following
limitations:
• it cannot be used to calculate the wavelengths of the spectral lines of all
other atoms
• the Bohr model works reasonably well for atoms with one electron in
their outer shell but does not work for any of the others
• examination of spectra shows that the spectral lines are not of equal
intensity but the Bohr model does not explain why some electron
transitions would be favoured over others
• careful observations with better instruments showed that there were
other lines, known as the hyperfine lines. There must be some splitting
of the energy levels of the Bohr atom but the Bohr model cannot
account for this
• when a gas is excited while in a magnetic field, the emission spectrum
produced shows a splitting of the spectral lines (called the Zeeman
effect). Again, the Bohr model cannot account for this
• finally, the Bohr model is a mixture of classical physics and quantum
physics and this, in itself, is a problem.
In the next chapter we will examine some of the observations that supported
ideas from Bohr’s model but also see that there was a need to break completely
from classical physics and move to a new theory — quantum mechanics.



CHAPTER REVIEW

SUMMARY

• The scattering of alpha particles through large angles by very thin gold foils led Rutherford to propose that an atom consisted of a very small, dense, positively charged nucleus. Electrons were in orbit about the nucleus at distances very large compared to the dimensions of the nucleus.
• A major problem with the Rutherford model was that it did not account for any properties of the electrons in the atom, in particular how the electrons could be accelerating without emitting electromagnetic radiation.
• Bohr extended the Rutherford model by formulating two postulates that enabled him to apply the quantum ideas of Planck and Einstein to the Rutherford atom.
• Bohr’s postulates enabled him to describe an atom in which electrons existed in stable ‘stationary states’ where they did not emit electromagnetic radiation. The transition of an electron from one stationary state to another would be accompanied by the emission or absorption of a quantum of electromagnetic radiation or a photon.
• Using his model of the atom, Bohr was able to derive a theoretical expression for the wavelengths of the spectral lines of hydrogen which was in agreement with Balmer’s empirical formula.
• While successful in explaining the wavelengths of the spectral lines in the hydrogen spectrum, Bohr’s model failed to account for the relative intensities of the lines, the existence of the hyperfine structure of the lines or for the splitting of spectral lines when the excited gas was in a magnetic field. Bohr’s model was also a strange mixture of classical physics and quantum physics.

QUESTIONS

1. Use Balmer’s equation to calculate the wavelength of the radiation emitted from an excited hydrogen atom when an electron undergoes a transition from the state n = 5 to:
(a) the state n = 1
(b) the state n = 2
(c) the state n = 3.
2. (a) Calculate the wavelengths of the lines of the Balmer series corresponding to transitions from the states n = 8, n = 10, n = 12.
(b) What trend do you notice in the wavelengths as the value of n increases?
3. The radius of the orbit of an electron in the ground state of the hydrogen atom is $5.3 \times 10^{-11}$ m. Calculate the radius of the orbit of an electron when it is in each of the following states:
(a) the state n = 2
(b) the state n = 3
(c) the state n = 4.
4. (a) State which photon, red or blue, has the higher frequency.
(b) State which photon, red or blue, has the longer wavelength.
(c) State which photon, red or blue, has the higher energy.
5. If the atoms in a sample of hydrogen were all in the state n = 5, how many different spectral lines could possibly be produced by the gas as the electrons returned to the ground state?
6. Given that E1 = −13.6 eV, E2 = −3.40 eV, E3 = −1.51 eV, E4 = −0.85 eV, E5 = −0.54 eV, calculate the wavelengths of:
(a) the first two lines in the Lyman series
(b) the first two lines in the Balmer series
(c) the first two lines in the Paschen series.
7. (a) What is the wavelength of the longest wavelength spectral line of the Pfund series?
(b) What is the wavelength of the shortest wavelength line of the Pfund series?
8. The ‘series limit’ is the term applied to the shortest wavelength spectral line in each of the spectral series of hydrogen.
(a) What value of $n_i$ would be used to calculate the wavelength of the series limit?
(b) Calculate the series limit for the Lyman, Balmer and Paschen series of hydrogen.
(c) How many electron volts of energy would be carried by a photon corresponding to the series limit of the Lyman series?
9. Figure 22.15 is an energy level diagram for energies of the stationary states in atoms of a gas, Q.
(a) (i) Determine the energy of the photon emitted when an electron in the state n = 3 undergoes a transition to the state n = 2.
(ii) Determine the frequency and wavelength of this photon.



(b) Determine the wavelength of the photon absorbed by this gas when an electron undergoes a transition from the state n = 1 to the state n = 4.

State     Energy (eV)
          0
n = 4     −1.6
n = 3     −3.7
n = 2     −5.5
n = 1     −10.4

Figure 22.15 The energy level diagram for the gas, Q

10. The emission spectrum of a particular gas has eight bright lines in the visible region as shown in figure 22.16. The absorption spectrum of the same gas has only three lines in the visible region as shown.
(a) Explain why each of the absorption lines corresponds to one of the emission lines.
(b) Explain why there is not a corresponding absorption line for five of the emission lines.

[Figure 22.16: emission spectrum (upper panel) and absorption spectrum (lower panel)]
Figure 22.16 The emission and absorption spectra for a particular gas

11. An absorption spectrum is produced when the atoms in a cool gas absorb energy from white light passing through the gas. These excited atoms then re-emit the energy and return to low energy states. How can this re-emission occur but there still be dark lines in the absorption spectrum?
12. Balmer predicted accurately the wavelengths of the visible spectral lines and invisible spectral lines of hydrogen that had not been detected. Bohr did the same about thirty years later. Explain why Bohr’s prediction is considered more important than that of Balmer.
13. What evidence supports the idea that the electron energies in the hydrogen atom are discrete?
14. If electrons in hydrogen atoms obeyed the rules of classical mechanics instead of those of quantum mechanics, would the hydrogen atoms produce a line spectrum or a continuous spectrum? Explain your answer.
15. Explain why each element has its own characteristic spectrum.
16. Two spectral lines of hydrogen have frequencies of $2.7 \times 10^{14}$ Hz (infra-red) and $4.6 \times 10^{14}$ Hz (red).
(a) Explain how you could use this information to determine the frequency of a higher frequency spectral line of hydrogen.
(b) Calculate the frequency of that line.



PRACTICAL ACTIVITIES

22.1 THE SPECTRUM OF HYDROGEN

Aim
To observe the spectral lines of hydrogen, measure their wavelengths and compare these values with the theoretical values.

Apparatus
hydrogen spectral tube and power supply
spectroscope

Theory
According to the Bohr model of the hydrogen atom, when an electron jumps from a higher energy state to a lower energy state, it will emit a photon. When an electron jumps to the state n = 2 from any of the states from n = 3 to n = 6, the emitted photon will be in the visible region of the spectrum.
A spectroscope can be used to measure the deviation of the spectral lines. The wavelengths of the spectral lines can then be calculated.
The wavelengths of the spectral lines will be given by:
$$\lambda = \frac{d \sin\theta}{n}$$
where
λ = wavelength
θ = angle of deviation
d = distance between lines on the grating
n = order of spectra.
The theoretical values of the wavelengths, based on Bohr’s theory of the hydrogen atom, can be calculated after the energies of the states n = 2 to n = 6 have been calculated.
The energy of the ground state, n = 1, is −13.6 eV. The energies of the other states are given by
$$E_n = \frac{E_1}{n^2}.$$

Method
In this experiment, the hydrogen spectral tube is switched on and the radiation viewed through a spectroscope.

Setting up the hydrogen spectral tube
Different types of spectral tube and power supply may be used but we will describe a special power supply that is designed for spectral tubes. The spectral tube can be clamped in place on a vertical metal rod mounted on top of the power supply. (The rod is maintained at Earth potential and is safe to touch when the power supply is switched on.)
Some hydrogen spectral tubes are very faint and this makes measurement of the spectral lines very difficult. Hydrogen spectral tubes should probably be replaced fairly regularly as they tend to become fainter over time.

The spectroscope
The spectroscope consists of two tubes, one of which can be rotated around a small central table. One tube, the fixed one, is a collimator and the moveable one is a telescope. A small prism or a diffraction grating can be mounted on the small table. Figure 22.12 (page 427) shows a diagram of a spectroscope.
There is an adjustable narrow slit at the front of the collimator. The collimator is set up to shine parallel rays of light onto the diffraction grating or prism. (For the remainder of this practical activity, we will assume that a diffraction grating is being used. We will assume that the information about the number of lines per metre is provided. It is possible to calibrate a grating, and a procedure to do this is included in the last section of the method.)
The light that passes through the diffraction grating deviates through an angle that depends on the wavelength of the light and the number of lines per metre ruled on the diffraction grating.
The telescope is rotated around the table and the image of the narrow slit is observed at different angles for the different wavelengths of light. These angles can be measured, usually with the help of a vernier scale fixed to the telescope.

Setting up the spectroscope
Setting up the spectroscope involves two parts: adjusting the telescope for parallel light rays and then adjusting the collimator to produce parallel light rays.
There should be fine cross-wires visible in the eyepiece of the telescope. These cross-wires should be in sharp focus and an adjustment of the eyepiece in its holder may have to be made if they are not sharply focused.
The telescope should be pointed at a distant object and the focus adjusted using the objective lens of the telescope, lens L3 in figure 22.12, until the image of the distant object is sharply focused. (In fact any object outside should be far enough away.) The telescope is then aligned with the collimator and the lens on the collimator, lens L2 in figure 22.12, is adjusted until the slit is seen sharply focused.



A light source, possibly a brighter spectral tube than the hydrogen tube, can now be set up in front of the slit and the slit width adjusted until narrow spectral lines can be viewed when the telescope is rotated to the appropriate position.
By clamping the telescope and then using the fine adjustment, it should be possible to align the cross-wires visible in the eyepiece with the spectral line. A measurement can then be made.

Reading a vernier scale
A vernier scale has ten lines on the moveable scale in the space of nine lines on the fixed scale. This enables an extra decimal place to be determined. This extra digit corresponds to the position of the line on the vernier, moveable scale that aligns with any one of the lines on the fixed scale.

[Figure 22.17: a fixed scale marked 330, 335, 340 above a vernier scale marked 0, 0.5, 1.0, with arrows A and B]
Figure 22.17 Reading a vernier scale. The position of the zero mark on the vernier scale, indicated by arrow A, is just less than 331. The line on the vernier scale that matches a line on the fixed scale is 0.6, as indicated by arrow B. Therefore, the reading is 330.6°.

Measuring the wavelengths of the spectral lines of hydrogen
The lines will probably be quite faint and it will probably be necessary to have the apparatus in a darkened room to observe the lines clearly. The most difficult part is aligning the spectral lines with the cross-wires. If the room is completely dark, it will be impossible to see the cross-wires. A small amount of field illumination is necessary to be able to see the cross-wires.
There should be no problem with making the measurement for the straight through position. There should be sufficient light coming directly through the slit to make locating the image of the slit on the cross-wires quite easy.
Record this value and then record the reading of as many of the spectral lines as possible. (If it is possible to measure any of the spectral lines of the second order spectrum it is worth doing so.)

Calibration of diffraction grating
If necessary, the diffraction grating could be calibrated using a sodium vapour spectral tube. Set up this tube and observe the angle to the very bright orange line in the first order spectrum of sodium (n = 1). This line is really a double line, the wavelengths of the lines being 589.0 nm and 589.6 nm. You can use the information in the equation
$$\lambda = \frac{d \sin\theta}{n}$$
to calculate d.

Results
Record your results in a table similar to the table below and calculate the wavelengths of the spectral lines.
The number of lines per centimetre or perhaps even the number of lines per inch is probably supplied with the diffraction grating. It will be necessary to convert this to lines per metre and d is the inverse of this value.
Record the reading of the straight through position θ0.

SPECTRAL LINE COLOUR    POSITION θ    ANGLE θ − θ0    ORDER OF SPECTRA (n)    WAVELENGTH
Faint violet                                          1
Violet                                                1
Blue-green                                            1
Red                                                   1
                                                      2
                                                      2
                                                      2
                                                      2

Analysis
1. The energy of the ground state of hydrogen is E1 = −13.6 eV.
The energies of the other states are given by
$$E_n = \frac{E_1}{n^2}.$$
Determine the energy, in electron volts, of the energy states n = 2, 3, 4, 5, and 6.
2. Draw an energy level diagram and calculate the energies (in electron volts) of photons emitted when an electron jumps to the n = 2 state from each of the four higher energy states.
3. Convert these values from electron volts to joules and calculate the wavelengths of these photons.



Use:
$$E = hf = \frac{hc}{\lambda}$$
$$\lambda = \frac{hc}{E}$$
where
1 eV = $1.6 \times 10^{-19}$ J
h = $6.626 \times 10^{-34}$ J s.
4. Compare the values of the wavelengths calcu-
lated above with the values determined from
the measurements of the angles.
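The two calculations in this activity, the measured wavelength from the grating equation and the theoretical wavelength from the energy levels, can be set out as follows. This is only a sketch: the function names are invented for illustration, and the grating of 6.0 × 10^5 lines per metre and the angle readings in the example are hypothetical values, not measurements from the text.

```python
import math

# Constants as used in this chapter.
PLANCK = 6.626e-34          # Planck's constant (J s)
LIGHT_SPEED = 3.00e8        # speed of light (m/s)
JOULES_PER_EV = 1.602e-19   # 1 eV expressed in joules
E1_EV = -13.6               # ground-state energy of hydrogen (eV)


def grating_wavelength(lines_per_metre, reading_deg, straight_through_deg=0.0, order=1):
    """Measured wavelength (m) from lambda = d sin(theta) / n."""
    d = 1.0 / lines_per_metre   # spacing between lines on the grating
    theta = math.radians(reading_deg - straight_through_deg)
    return d * math.sin(theta) / order


def theoretical_balmer_wavelength(n_upper):
    """Theoretical wavelength (m) for a jump from n_upper to n = 2."""
    photon_ev = E1_EV / n_upper**2 - E1_EV / 2**2   # E_i - E_f
    return PLANCK * LIGHT_SPEED / (photon_ev * JOULES_PER_EV)   # lambda = hc / E


if __name__ == "__main__":
    # Hypothetical readings: straight through at 330.6 degrees, red line at 353.8 degrees.
    measured = grating_wavelength(6.0e5, 353.8, straight_through_deg=330.6)
    theory = theoretical_balmer_wavelength(3)
    print(f"measured = {measured * 1e9:.1f} nm, theoretical = {theory * 1e9:.1f} nm")
```

Repeating the comparison for each observed line, and for second order lines with `order=2`, gives the values needed for step 4 of the analysis.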

Questions
1. How accurate do you consider your determination
of the wavelengths of the spectral lines? Aside from
any difficulty with aligning the spectral lines with
the cross-wires, you are restricted to measuring the
angle to the nearest 0.1°. Consider how a change
in angle of 0.1° will alter your calculations.
2. Taking into account the expected accuracy of
your observations, do you consider that your
results are in agreement with the theoretical
values of the wavelengths of these four spectral
lines of hydrogen?



CHAPTER 23 DEVELOPMENT OF QUANTUM MECHANICS

Remember
Before beginning this chapter, you should be able to:
• recall Planck and Einstein’s contributions to the early development of quantum theory
• recall the features of Bohr’s model of the atom in which he introduced the ideas of quantum theory to atomic structure
• realise that there were difficulties with the Bohr model, not the least of these being that it mixed classical and quantum physics
• recall that photons of light possessed the nature of both waves and particles.

Key content
At the end of this chapter you should be able to:
• recognise that diffraction is a wave phenomenon and that a diffraction pattern is produced by the interference of diffracted waves
• describe de Broglie’s proposal that matter has a wave nature as well as a particle nature
• solve problems using $\lambda = \frac{h}{mv}$
• describe the impact of de Broglie’s proposal
• describe the experimental evidence provided by Davisson and Germer confirming the wave nature of electrons
• use de Broglie’s matter waves to explain the stability of the stationary states of the hydrogen atom
• assess the contributions of Heisenberg and Pauli to the development of a quantum mechanical model of the atom.

Figure 23.1 Neodymium, YAG (yttrium aluminum garnet), argon and dye lasers. Lasers are used in a very wide range of fields including communication, medicine, measurement, holography, entertainment and scientific research. Common devices such as CD players and supermarket scanners use laser technology. These are just two of the many applications in use today that are directly related to the theory of quantum mechanics.
I like relativity and quantum theories
because I don’t understand them
and they make me feel as if space shifted about like a swan that can’t settle,
refusing to sit still and be measured;
and as if the atom were an impulsive thing
always changing its mind.
D. H. Lawrence

As we saw at the end of the previous chapter, Bohr had taken the first steps in applying quantum ideas to atomic structure but there were problems associated with his model of the atom. Despite these problems, Bohr’s model, which reached its peak in 1922, was able to explain the periodic table and make accurate predictions about the properties of then undiscovered elements (see page 447).
In the 1920s, there was still a perceived problem with the nature of light. In 1924, Einstein wrote: ‘There are therefore two theories of light, both indispensable and, as one must admit today despite twenty years of tremendous effort on the part of theoretical physicists, without any logical connection.’ (Reference from an article by Einstein in Berliner Tageblatt, 20 April 1924, quoted in Abraham Pais, Inward Bound.)
When Einstein made reference to the wave theory of light and the
particle theory of light being without any logical connection, he was
unaware of the predictions of Louis de Broglie that particle and wave
natures were inextricably linked. Einstein was soon called on to make
comment on de Broglie’s doctoral thesis. In it, de Broglie predicted that
not only did light have a dual wave and particle nature, but particles also
had a wave nature. Einstein was impressed with de Broglie’s ‘crazy idea’.
Other famous physicists expressed discontent with the state of physics
in the early 1920s. In 1924, Max Born wrote, ‘At the most we possess only
a few unclear hints’; and in 1925, Wolfgang Pauli wrote, ‘Physics at the
moment is very muddled.’ The important breakthrough was supplied by
Werner Heisenberg who devised his theory of matrix mechanics, later to
be known as quantum mechanics.
Before we can study the wave nature of matter as predicted by de Broglie,
we must first study diffraction, a phenomenon exhibited by waves,
which was important in detecting the wave nature of particles.

23.1 DIFFRACTION
Diffraction of light occurs when light passes through a very finely ruled
grating, or when it is reflected from a surface with fine lines ruled across
it. It also occurs when light passes a barrier or passes through a small
opening (see figure 23.2). It is not easy to observe because the dimensions
of the barrier or opening must be comparable to the wavelength of light.

Figure 23.2 (a) Diffraction of monochromatic light by a straight edge, a razor blade (b) An
enlargement of the shadow shows bright and dark lines. The arrows indicate the edge of the
geometric shadow. A small amount of light passes behind the straight edge but this is too faint
to be observed. A series of bright and dark lines are observed next to the edge. (c) Diffraction
of light by a small circular opening. This time a series of bright and dark circles is observed.

CHAPTER 23 DEVELOPMENT OF QUANTUM MECHANICS 441


An explanation of diffraction
In the seventeenth century, Christiaan Huygens proposed that light was a wave. He proposed what has become known as Huygens’ Principle, which states: ‘Every point on a wavefront may be considered to act as a source of circular secondary wavelets that travel in the direction of the wave. The new wavefront will be tangential to the secondary wavelets.’ Huygens’ Principle is shown in figure 23.3.
This principle can be used to derive the laws of reflection and refraction. It also helps to explain diffraction.

A wavefront is either the crest or trough of a wave. The wavefront is perpendicular to the direction of the velocity of the wave.

[Figure 23.3: (a) a plane wavefront AA´ and the new wavefront BB´; (b) a curved wavefront spreading from a source and the new curved wavefront]

Figure 23.3 (a) Huygens’ Principle explains the propagation of a plane wave. Each point acts as a point source and a new plane wave is formed. AA´ is the original wavefront and the new wavefront is BB´. (b) Each point on the curved wavefront acts as a point source and a new curved wavefront is formed.

If we use Huygens’ Principle to explain the propagation of a wave (the formation of a new wavefront), it is necessary that all the point sources contribute to the production of the new wavefront. When a barrier blocks part of the wave, not all the point sources will be able to contribute to the new wavefront.

[Figure 23.4: (a) waves from source 1 and source 2 meeting at P with constructive interference; (b) waves from source 1 and source 2 meeting at P with destructive interference]

Figure 23.4 Interference of two identical waves (a) Where the crest of one wave meets the crest of another wave, constructive interference occurs and the resultant displacement will be twice that of one wave. (b) Where a crest from one source meets a trough from the other source, destructive interference occurs and there will be no displacement.

Interference of light was first demonstrated by Thomas Young in the early nineteenth century (see figure 23.5). He used two narrow slits as light sources and produced a pattern of bright and dark lines on a screen. The bright lines occurred at positions where waves met in phase (trough met trough or crest met crest), and the dark lines where the waves met out of phase (crest met trough).



If we consider a point on the new wavefront, some of the wavelets that
were blocked would have interfered destructively with the wavelets that
reach that point. Others would have interfered constructively. This effect
is shown in figure 23.4. If the net effect was that more of the blocked
wavelets would have interfered destructively than constructively, that
point on the new wavefront will be of greater intensity than the incident
wavefront. As we move further across the pattern, the relative amounts
will change and we will observe successive regions that are brighter and
then fainter than the incident light. The intensities gradually become
closer to that of the incident light.

Figure 23.5 Thomas Young’s


original drawing of a two-source
(double-slit) interference pattern. A
and B are the point sources. The dark
circles represent the wave crests and
the troughs are the white spaces in
between crests. Destructive
interference occurs on the screen at C,
D, E, and F where crest meets trough.

If we observe a straight edge or a narrow slit, we will notice bright and dark straight lines. If it is a circular aperture, we will see a series of bright and dark rings.

We can explain the light and dark regions of a diffraction pattern in terms of the constructive and destructive interference of wavelets from the point sources.

A transmission diffraction grating is a transparent material that has many fine lines ruled across it. (A grating may have many thousands of lines per centimetre.) These lines can be considered to be breaking the wavefront into point sources. The interference of light from these many point sources produces a diffraction pattern (see figure 23.6).

Figure 23.6 The pattern produced


when light from a mercury vapour
lamp is shone through a transmission
diffraction grating
A reflection diffraction grating is a reflecting surface with many lines
ruled across it. The ‘reflected’ light is not reflected with an angle of inci-
dence equal to the angle of reflection because the gaps between the lines
act as point sources.
If a laser, which is a source of coherent light, is shone onto part of the scale of a metal ruler at an angle of incidence close to 90°, the ‘reflected’ light will produce a series of bright spots, a diffraction pattern. This effect is shown in figure 23.7.

If light waves maintain a constant phase relationship, they are said to be coherent.
[Figure 23.7: a beam from a laser is incident on the scale of a steel rule; the bright spots of the diffraction pattern are observed on a screen]

Figure 23.7 A diffraction pattern can be observed when light from a laser is reflected from the scale on a metal ruler.



23.2 STEPS TOWARDS A COMPLETE
QUANTUM THEORY MODEL OF
THE ATOM
At the end of chapter 22 we saw that Bohr’s model of the atom did not
explain certain aspects of the spectrum of hydrogen. It could not
account for hyperfine spectral lines or the fact that spectral lines were
split if the gas was placed in a magnetic field. However, perhaps the most
puzzling aspect of the model was how Bohr could throw out some
features of classical physics while retaining others and how he could
impose quantum theory features on an otherwise classical model.

Louis de Broglie’s proposal that particles had a wave nature
The correct pronunciation of de Broglie is ‘de broy’.

Louis de Broglie (1892–1987) was a French nobleman who had the misfortune to have his studies of physics interrupted in 1913 by what should have been a short period of compulsory military service. Because of World War I, his military service continued until 1919.
He did, however, return to his studies and, in 1923, published three short papers on light quanta. He then prepared his doctoral thesis. In it he argued that the fact that nobody had managed to perform an experiment that settled once and for all whether light was a wave or a particle was because the two kinds of behaviour are inextricably linked.

Figure 23.8 Prince Louis Victor de Broglie, who was awarded the Nobel Prize in 1929 for his discovery of the wave nature of electrons

The expressions for the energy and momentum of light quanta:
$$E = hf \quad \text{and} \quad p = \frac{hf}{c}$$
have quantities that are properties of particles on the left-hand side and quantities that are properties of waves on the right-hand side.

$$E = hf \quad \text{and} \quad E = mc^2$$
$$mc^2 = hf$$
$$mc = \frac{hf}{c}$$
$$\therefore p = \frac{hf}{c}$$
De Broglie made the bold proposal that all particles must have a wave
nature as well as a particle nature.
Electrons had been thought of as well-behaved particles except for the
fact that they occupied distinct energy states in the hydrogen atom.
These energy states were associated with integers. De Broglie was aware
of other phenomena in physics that were associated with integers. These
included the interference of waves and the vibration of standing waves.
He stated: ‘This fact suggested to me the idea that electrons, too, could
not be regarded simply as corpuscles, but that periodicity must be
assigned to them.’
His work was not just idle speculation. His great achievement was to
take this idea and develop it mathematically. He described how matter
waves ought to behave and suggested ways that they could be observed.



The wavelength of a photon was Planck’s constant divided by its
momentum and de Broglie proposed that, similarly, the wavelength of a
moving particle would be Planck’s constant divided by its momentum.
For a photon, using c = fλ, the momentum is:
$$p = \frac{hf}{c} = \frac{h}{\lambda}$$
so that:
$$\lambda = \frac{h}{p}.$$
The de Broglie wavelength of a particle of mass m moving with velocity v
is therefore:
$$\lambda = \frac{h}{mv}.$$
The examiners of de Broglie’s thesis liked his mathematics but did not
believe that it had a physical significance. When de Broglie was
questioned about this he disagreed and claimed that it should be possible
to observe the wave nature of a beam of electrons diffracted from the
surface of a crystal. The examiners accepted de Broglie’s thesis and were
influenced by Einstein’s comment ‘I believe it is a first feeble ray of light
on this worst of our physics enigmas’.

Diffraction is a phenomenon exhibited by waves when waves either bend
behind a barrier or the wavefront is broken up into many small sources
(see previous section).

Most other developments in physics in the past had occurred when a
theory was developed to explain an observation. Here, de Broglie did the
opposite. He made a prediction based on his theoretical work and
suggested the observations that would support his theory.
In fact, de Broglie had initiated the revolution in which Heisenberg,
Schrödinger, Dirac, Born, Pauli and others developed a detailed
theory called quantum mechanics. Even before experimental evidence
of the wave nature of electrons had been observed, the development
of quantum mechanics was well on its way (see Physics in focus,
page 449).

Quantum mechanics is the name given to a set of physical laws that
apply to objects the size of atoms or smaller. The concepts of
wave–particle duality and uncertainty lie at the heart of quantum
mechanics.

Quantum mechanics is a complete theory, not a mixture of classical
and quantum ideas as had been used previously by Bohr. The ‘old’
quantum-theory model of the atom (which still retained some classical
physics) reached its peak in about 1922 but was then replaced by
quantum mechanics, a completely quantum theory.
In quantum mechanics, particles have both a wave and a particle
nature and the rules of mechanics that are obeyed on a macroscopic
scale are not obeyed. The uncertainty principle and wave–particle duality
lie at the heart of quantum mechanics.

Determining the wavelength of a moving electron


SAMPLE PROBLEM 23.1
Calculate the wavelength of an electron moving with a velocity of
$5.00 \times 10^{5}\ \mathrm{m\,s^{-1}}$ and compare this value to the
wavelength of visible light.
(At this velocity, relativistic effects can be ignored.)

SOLUTION
$$\lambda = \frac{h}{mv} = \frac{6.63 \times 10^{-34}}{9.11 \times 10^{-31} \times 5.00 \times 10^{5}} = 1.46 \times 10^{-9}\ \mathrm{m}$$
The shortest wavelength of visible light (violet) is about 400 nm. The
wavelength of the electron moving at
$5.00 \times 10^{5}\ \mathrm{m\,s^{-1}}$ is about 300 times less
than the wavelength of violet light.
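
The arithmetic in Sample problem 23.1 can be checked with a short Python sketch. It assumes the same rounded constants used in the solution; the function name de_broglie_wavelength is chosen here for illustration and does not come from the text.

```python
# Rounded constants as used in Sample problem 23.1.
PLANCK_CONSTANT = 6.63e-34   # J s
ELECTRON_MASS = 9.11e-31     # kg


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Return the non-relativistic de Broglie wavelength h / (m v) in metres."""
    if mass_kg <= 0 or speed_m_per_s <= 0:
        raise ValueError("mass and speed must be positive")
    return PLANCK_CONSTANT / (mass_kg * speed_m_per_s)


electron_wavelength = de_broglie_wavelength(ELECTRON_MASS, 5.00e5)
violet_wavelength = 400e-9  # m, the approximate figure quoted in the text
ratio = violet_wavelength / electron_wavelength

print(f"electron wavelength = {electron_wavelength:.3g} m")
print(f"violet / electron   = {ratio:.0f}")
```

The ratio comes out a little under 300, which is consistent with the rounded comparison in the solution. The same function can be reused for any slow-moving particle, provided relativistic effects can be ignored.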



Confirmation of de Broglie’s matter waves
In 1922 and 1923, Clinton Davisson (1881–1958) and Charles Kunsman
studied the strange behaviour of electrons scattered from the surface of
crystals. They explained their scattering results as being caused by the
structure of the atoms that were bombarded by the electrons. After
de Broglie’s prediction of matter waves, Walther Elsasser, a 21-year-old
student of Max Born, published a brief note explaining the results of
these experiments in terms of the wave nature of the scattered electrons.
This was not appreciated or accepted by Davisson and Kunsman.

We shall see in the Physics in focus section on page 449 that Max Born
played an important role in working with Heisenberg to develop quantum
mechanics. Born made a major contribution to the mathematics of the
theory which he felt was generally overlooked.

In 1927, Davisson, working this time with Lester Germer (1896–1971),
studied the surface of a piece of nickel by examining the scattering of
electrons from it (see figure 23.9 below). The surface of the nickel
consisted of many microscopic crystals bonded together at random
orientations and it was expected that even the smoothest possible surface
would appear rough to the electrons.

Figure 23.9 The experiment of Davisson and Germer. Electrons are
accelerated in the electron gun and fired at the surface of a nickel
crystal. As the angle to the electron detector (θ) was changed, the
diffraction pattern was observed.
(Labels: power supply, electron gun, electron beam travelling through a
vacuum, nickel crystal, electron detector.)

During the course of their experiment, an accident occurred and air
entered the vacuum chamber. An oxide film formed on the metal surface.
In an attempt to remove the oxide film, Davisson and Germer
heated the metal to a temperature just below its melting point. They did
not know this, but it had the effect of annealing the metal. Large single
crystal regions, which were larger than the width of their electron beam,
were produced.
The results were now very different. Davisson and Germer were familiar
with X-ray diffraction and with de Broglie’s theory of matter waves.
They recognised that the electrons were being diffracted. As diffraction is
a property of waves and not particles, they had established that electrons
had a wave nature as well as a particle nature.

In 1906, J. J. Thomson received the Nobel prize for discovering that the
electron was a particle. In 1937, his son, George, shared a Nobel prize
that was awarded for detecting that the electron had a wave nature.

G. P. Thomson (1892–1975), a son of J. J. Thomson, made a similar
discovery in England about the same time. If Davisson had accepted the
idea of Walther Elsasser, he might have received the Nobel prize by
himself. As it was he shared it with G. P. Thomson in 1937.

Bohr’s electron orbits explained


When de Broglie developed the idea of matter waves, he had believed
that the orbits of the electron in the hydrogen atom were something like
standing waves. This idea is depicted in figure 23.10.



The condition for a standing wave to be formed on a string fixed at
each end is that the length of the string must be an integral number of
half wavelengths.
If we consider an electron as setting up a standing wave pattern as it
orbits around a nucleus, there must be an integral number of wavelengths
in that pattern.

Figure 23.10 (a) A standing wave in a string (b) Circular standing
waves for n = 2, n = 3 and n = 4

If the circumference is taken as 2πr then there are n wavelengths in the
circumference, nλ = 2πr.
The de Broglie wavelength is:
$$\lambda = \frac{h}{mv} \quad \text{and} \quad n\lambda = 2\pi r.$$
So:
$$\frac{nh}{mv} = 2\pi r$$
$$mvr = \frac{nh}{2\pi}.$$
This is Bohr’s quantisation condition that angular momentum can exist
only in integer multiples of $\frac{h}{2\pi}$. The quantised electron
orbits of Bohr can be explained by de Broglie’s proposal that particles
have a wave nature and that their wavelength is
$\lambda = \frac{h}{mv}$.
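
The standing-wave condition can be illustrated numerically with the following Python sketch. It assumes the standard Bohr-model expressions for the radius and speed of the nth orbit of hydrogen, which are not given in this section, together with rounded constants. Because those expressions were themselves derived from Bohr's quantisation condition, the sketch is a consistency check rather than an independent proof. The function names are illustrative only.

```python
import math

# Rounded constants; assumptions of this sketch, not values quoted in the text.
H = 6.626e-34          # Planck's constant, J s
M_E = 9.109e-31        # electron mass, kg
E_CHARGE = 1.602e-19   # elementary charge, C
EPSILON_0 = 8.854e-12  # permittivity of free space, F/m
HBAR = H / (2 * math.pi)


def bohr_orbit(n):
    """Return (radius in m, speed in m/s) of the nth Bohr orbit of hydrogen."""
    radius = 4 * math.pi * EPSILON_0 * HBAR**2 * n**2 / (M_E * E_CHARGE**2)
    speed = E_CHARGE**2 / (4 * math.pi * EPSILON_0 * HBAR * n)
    return radius, speed


def wavelengths_in_orbit(n):
    """Number of de Broglie wavelengths that fit around the nth orbit."""
    radius, speed = bohr_orbit(n)
    wavelength = H / (M_E * speed)          # lambda = h / (m v)
    return 2 * math.pi * radius / wavelength


for n in (1, 2, 3, 4):
    radius, _ = bohr_orbit(n)
    print(n, f"{radius:.3e} m", round(wavelengths_in_orbit(n), 6))
```

For each n, the circumference holds n whole wavelengths to within rounding error, which is the circular standing-wave picture of figure 23.10(b).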

The Bohr model revisited


We have just seen that we can use the wave nature of electrons to account
for the stability of the electron orbits in the Bohr model. This also pro-
vides a reason for the quantisation condition used by Bohr in 1913.
The Bohr model had other significant successes. In 1913, H. G.
Moseley had shown in his work on the emission of X-rays from various
substances that the wavelengths of the emitted X-rays were in agreement
with the values predicted by Bohr’s theory.
Also in 1913, Johannes Stark showed that when a gas was placed in an
electric field, the spectral lines were split. Bohr was able to show that this
could be explained by his theory.
In 1914, two German scientists, James Franck and Gustav Hertz, found
that when mercury atoms were bombarded by electrons, the atoms would
not absorb energy below a certain critical value from the incident elec-
trons. Bohr immediately realised that this could be interpreted in terms
of his theory, but Franck and Hertz continued for some time to provide
their own incorrect interpretation.
In 1922, Bohr developed an explanation of the atomic structure that
underlies the regularities of the periodic table. He considered that atoms
are built up of shells of orbiting electrons with the shells being filled, in
the case of uranium, by 2, 8, 18, 32, 18, 8 and 6 electrons. Uranium is
the naturally occurring element with the highest atomic number. However,
Bohr could not explain why these were the numbers of electrons
in the shells.

The periodic table is reproduced in Appendix 2, page 528.

Chemists had successfully predicted the properties of previously
unknown elements using the periodic table, but Bohr’s variation
predicted that element 72, when discovered, would not belong to the rare
earth group as chemists predicted but rather be a valency four metal
similar to zirconium. Georg de Hevesy and Dirk Coster discovered the
new element, hafnium, and the night before Bohr was awarded his Nobel
Prize they relayed to him the information that it had the properties he
had predicted. This was the high point of the ‘old’ quantum theory. (The
term ‘old’ is applied to quantum theory before 1925.)
The wave nature of electrons may have helped with an understanding
of the electron orbits, but it did not help with any of the other difficulties
we have experienced with the Bohr model.

A new quantum theory


The Physics in focus section on pages 449–451 provides an outline of
some of the steps that were taken in replacing the old quantum theory of
Bohr with the new theory of quantum mechanics.
The complete quantum theory came about after breakthroughs by
Werner Heisenberg (1901–1976) and Erwin Schrödinger (1887–1961).
These scientists, independently in 1925 and 1926, discovered different
forms of the same theory. (Heisenberg’s matrix mechanics and
Schrödinger’s wave mechanics were later shown to be equivalent.)
Heisenberg introduced the uncertainty principle and Bohr completed
the theory with his principle of complementarity.
By the time of the Solvay Conference of October 1927, the old
quantum theory had been replaced. At this conference, Schrödinger
presented a paper on his wave function theory but he declined to discuss the
interpretation of the wave functions (which Born interpreted as being
related to the probability of finding an electron in a certain location).

Figure 23.11 Werner Heisenberg (1901–1976)

The theory is now called quantum mechanics, and Bohr’s ideas, along
with Heisenberg’s uncertainty principle and Born’s probability
interpretation, became known as the Copenhagen interpretation. At the Solvay
Conference, Einstein raised his first public objections to quantum
mechanics and he was to continue to debate with Bohr this interpretation of
quantum mechanics. Einstein never accepted that quantum mechanics
was a ‘complete’ theory and the Copenhagen interpretation is still
considered obscure by some physicists today. It is no wonder that Bohr made
his famous statement, ‘Anyone who is not shocked by quantum theory
has not understood it.’
We have not even scratched the surface of quantum mechanics. We
have seen that there were major problems with the ideas of the original
quantum theory and that, in the process of overcoming those problems,
a new theory was developed that required a modification of our ideas
about the physical world.
In this strange new theory there is no such thing as a particle or a wave
but rather there is a wave–particle duality, and making an accurate
observation of one property means that another property cannot be measured
accurately. We have not studied the mathematics of either the matrix
mechanics of Heisenberg or the wave mechanics of Schrödinger, and we
have not studied the probability interpretation of Born. We are therefore
not in a position to see why quantum theory and, in particular, the
Copenhagen interpretation is as shocking as Bohr suggests.

Figure 23.12 Erwin Schrödinger (1887–1961)

A deeper study of quantum mechanics leads to an atomic world that is
fuzzy and nebulous and in which, according to Bohr and the Copen-
hagen interpretation, nothing actually exists until it is observed. The
clockwork world of Newton becomes a world of quantum uncertainty
where nothing is predictable. Even worse, events can occur without
having a cause and quantum particles can suddenly pop into existence.
Yet, despite all the problems of interpretation, quantum mechanics is
an incredibly successful theory. Consider the following facts:
• Quantum mechanics helps us explain and control the properties of
metals, insulators, semiconductors and superconductors.



• The inventors of the transistor acknowledge the part quantum theory
played in their discovery. That discovery led to the development of
ever more powerful computers and microcomputers and a revolution
in communications and information technology.
• Lasers and masers are quantum devices.
• Quantum mechanics explains the structure of the atom and nucleus as
well as mechanical and thermal properties of solids.
• Quantum mechanics gave chemistry a firm base and explained
chemical bonding. The new areas of molecular biology and genetic
engineering have also arisen from quantum chemistry.
• In astrophysics, the processes that occur in stars can be explained by
quantum mechanics, and even our theories regarding such exotic
objects as black holes are based on quantum mechanics. There has even
been the suggestion that our universe began as a ‘quantum fluctuation’.
Quantum mechanics is a theory that has changed the world and our view
of it.

PHYSICS IN FOCUS
The development of quantum mechanics

Many physicists contributed to the development of the theory of quantum
mechanics. Many arguments on the interpretation of the meaning of
quantum mechanics took place in the twentieth century and continue
today. Some of the major contributors are mentioned below with a brief
statement about their contribution. Research on the internet will reveal
more about their lives and contributions.

Heisenberg and Bohr
Werner Heisenberg (1901–1976) heard Niels Bohr (1885–1962) lecture on
the periodic table when Bohr visited Germany. Heisenberg was impressed
with Bohr, although he believed that Bohr did not understand the reason
why his theories were correct. Bohr was also impressed with Heisenberg,
who had objected to one of Bohr’s statements. Bohr liked to identify
smart people who were not afraid to speak up. In a similar way he had
picked out Richard Feynman when he first met him at Los Alamos during
the Manhattan Project, the project to develop the atomic bomb (see
chapter 25, page 481).
Bohr invited Heisenberg to Copenhagen and they took a fresh look at
quantum theory. Heisenberg thought Bohr’s electron orbits were fanciful
and Bohr suggested to Heisenberg that he should forget about electron
orbits around the atom. Heisenberg decided to reject a mechanical model
completely and to look for patterns in numbers; in other words, to
develop a completely mathematical model.

Heisenberg develops quantum mechanics
In May 1925, Heisenberg suffered from very bad hayfever and took a
holiday for two weeks. He went to Heligoland where there was little
pollen and he soon recovered. While there, he developed his mathematical
theory of quantum mechanics. In it he had arrays of numbers that
Max Born (1882–1970) later realised were matrices. In the next few
months, Heisenberg worked with Born and Pascual Jordan (1902–1980) to
develop what he called ‘a coherent mathematical framework, one that
promised to embrace all the multifarious aspects of atomic physics’. At
the time, this was referred to as matrix mechanics.

Figure 23.13 Max Born (1882–1970)



Pauli applies quantum mechanics to hydrogen
Wolfgang Pauli (1900–1958), like Heisenberg, was a student of Arnold
Sommerfeld at Munich. Pauli was a friend of Heisenberg and had
influenced his path into atomic physics. Pauli took Heisenberg’s quantum
mechanics and applied it, with difficulty, to the hydrogen atom. He was
able to derive Balmer’s equation and Rydberg’s constant using quantum
mechanics.
Bohr had done the same thing in 1913 using his inconsistent assumptions
of classical physics and quantum theory. Bohr was delighted with the
work of Pauli and the success of the new quantum mechanics.

Figure 23.14 Wolfgang Pauli (1900–1958)

Pauli and his exclusion principle
The first three ‘quantum numbers’ are the principal quantum number n,
from the Bohr model, the angular momentum quantum number l, and
the magnetic quantum number m.
Pauli used Bohr’s idea of shells of electrons and, in 1925, realised that if
he introduced a fourth quantum number, he could explain the maximum
number of electrons in each shell. The fourth quantum number was
associated with ‘spin’. (For more on spin see page 465.)
The maximum number of electrons in each shell corresponded to the
number of different sets of quantum numbers available for each shell.
Pauli’s exclusion principle states that no two electrons can have the same
set of quantum numbers.
Pauli’s exclusion principle provided the reason for electrons in atoms
being arranged in shells with the maximum number of electrons being 2,
8, 18, 32, 18, 8 from the first to sixth shell.

Schrödinger: a different approach
De Broglie’s work on the wave nature of particles might not have received
wide publicity if it had not been brought to the attention of Einstein. In
1925, Erwin Schrödinger (1887–1961) read a comment of Einstein on
de Broglie’s work that referred to it as more than a mere analogy.
Schrödinger then set about trying to restore some of the familiar concepts
of waves to quantum theory. He eventually derived equations that looked
like the equations used to describe real waves, and it seemed that he had
managed to bring quantum ideas back towards a much more comfortable
formulation associated in some way with classical physics.
Heisenberg did not like Schrödinger’s approach. Heisenberg did not see
how continuous waves could be used to describe the discontinuous
behaviour of an electron jumping from one state to another. Many papers
started appearing on Schrödinger’s wave mechanics and very few on
Heisenberg’s matrix mechanics. Schrödinger did not like Heisenberg’s
non-visual interpretation or the use of matrices. Most physicists of the
time preferred Schrödinger’s approach to Heisenberg’s approach.
However, it was not long before Schrödinger demonstrated that the two
different approaches were simply different versions of the same thing.
He showed that Heisenberg’s matrices could be generated in
Schrödinger’s theory and Schrödinger’s waves could be produced from
Heisenberg’s matrices.
Schrödinger later spent time with Bohr and was most disappointed to
find that his ‘waves’ were not real waves at all. Max Born showed that
they were associated with the probability of finding an electron at a
particular location.

Heisenberg and uncertainty
In late 1926, Heisenberg showed that uncertainty is an inherent property
of quantum mechanics. He showed that there are pairs of quantities that
cannot be determined simultaneously. If you know the position of a
particle, say an electron, accurately, then you cannot know its momentum
accurately. If you determine its momentum accurately, you cannot specify
its position accurately.
This is represented by Heisenberg’s uncertainty principle:
$$\Delta x \times \Delta p \geq \frac{h}{2\pi}$$
where
∆x and ∆p = the inherent uncertainties in position and momentum
h = Planck’s constant.
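
To get a feel for the size of the effect, the following Python sketch evaluates the smallest momentum uncertainty allowed by the relation above. It uses the form of the principle quoted in this text (many texts state the bound as h/4π instead). The confinement distance of 1.0 × 10⁻¹⁰ m, roughly the size of an atom, is a hypothetical value chosen for illustration, and the function name is not from the text.

```python
import math

PLANCK_CONSTANT = 6.63e-34  # J s
ELECTRON_MASS = 9.11e-31    # kg


def min_momentum_uncertainty(delta_x_m):
    """Smallest momentum uncertainty (kg m/s) allowed by dx * dp >= h / (2 pi)."""
    if delta_x_m <= 0:
        raise ValueError("position uncertainty must be positive")
    return PLANCK_CONSTANT / (2 * math.pi * delta_x_m)


# Hypothetical example: an electron located to within about one atomic diameter.
delta_x = 1.0e-10                          # m
delta_p = min_momentum_uncertainty(delta_x)
delta_v = delta_p / ELECTRON_MASS          # non-relativistic speed uncertainty

print(f"delta_p >= {delta_p:.2e} kg m/s")
print(f"delta_v >= {delta_v:.2e} m/s")
```

Under these assumptions the speed of the electron is uncertain by roughly a million metres per second, which suggests why a sharply defined classical orbit inside an atom is not a meaningful picture.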



Bohr and the principle of complementarity
Bohr had a problem with Heisenberg’s uncertainty principle. It was
based on wave–particle dualism and also on the fact that an observation
of an atomic system would disturb the system. A slightly oversimplified
version of Bohr’s principle of complementarity that addresses this is that,
when you make an observation of the particle nature of something, it still
has a wave nature but you do not see the wave nature during that
observation, or vice versa.

The Dirac equation
Paul Dirac (1902–1984) extended quantum mechanics and derived the
Dirac equation, which added relativity to quantum theory. It predicted
correctly the spin of electrons (which is a relativistic effect). It also
predicted the existence of a particle similar to an electron but with a
positive charge. The anti-electron or positron was observed by Carl
Anderson in 1932.
Dirac discovered that the equations of quantum mechanics have the same
structure as the equations of classical physics and that the equations of
classical physics can be obtained from quantum mechanics by using very
large quantum numbers or setting Planck’s constant to zero.

Figure 23.15 Paul Dirac (1902–1984)



CHAPTER REVIEW

SUMMARY
• Bohr’s model of the atom used both the ideas of classical and quantum
  physics. This was the fundamental problem of his model.
• Diffraction is a wave phenomenon and a diffraction pattern is produced
  when diffracted waves interfere.
• Louis de Broglie proposed that the wave and particle behaviour of light
  were inextricably linked. He then predicted that matter must also have
  both a particle and a wave nature. The wavelength of particles would be
  equal to Planck’s constant divided by the momentum of the particle.
• Einstein considered that de Broglie’s theory might be the first step in
  explaining the problem of the apparent dual nature of light.
• De Broglie proposed that electrons would undergo diffraction. Davisson
  and Germer observed diffraction of electrons reflected from the surface
  of a crystal and supplied the evidence for the wave nature of electrons.
• Heisenberg proposed a mathematical model rather than a mechanical
  model as the basis for his theory of quantum mechanics.
• Pauli applied the ideas of quantum mechanics to the hydrogen atom and
  was able to derive Balmer’s equation and the value of the Rydberg
  constant.
• Pauli introduced a fourth quantum number and used his exclusion
  principle to explain why there was a maximum number of electrons
  possible in each shell of electrons.

QUESTIONS
1. Outline the ways in which the quantum mechanics developed by
   Heisenberg improved on the Bohr model that had successfully predicted
   the wavelengths of the spectral lines of hydrogen.
2. Explain why Bohr was said to be delighted after Pauli used a new
   approach to calculate the wavelengths of the spectral lines of hydrogen.
3. When de Broglie was examined for his PhD, his thesis was first thought
   by his examiners to bear little relationship to reality.
   (a) What did de Broglie predict that made it seem unrelated to reality?
   (b) What did de Broglie suggest could be observed to support his
       prediction?
   (c) How did the work of Davisson and Germer provide evidence to
       support de Broglie?
4. If a proton and an electron are travelling with equal velocities, state
   which has the longer de Broglie wavelength.
5. If one electron travels twice as fast as another electron, state which one
   has the greater wavelength.
6. (a) If an electron travelling at $1.0 \times 10^{4}\ \mathrm{m\,s^{-1}}$
       was accelerated to $2.0 \times 10^{4}\ \mathrm{m\,s^{-1}}$, calculate
       the ratio of its new wavelength to its original wavelength.
   (b) If an electron travelling at $1.0 \times 10^{8}\ \mathrm{m\,s^{-1}}$
       was accelerated to $2.0 \times 10^{8}\ \mathrm{m\,s^{-1}}$, would it
       change its wavelength by the same amount as the electron in
       part (a)? Explain your answer.
7. (a) Calculate the de Broglie wavelength of an electron in a television
       set that hits the screen with a velocity one-tenth of the velocity of
       light.
   (b) With what velocity would you roll a ball of mass 0.1 kg if it is to
       have the same de Broglie wavelength as the electron in part (a)?
8. A neutron emitted when a uranium-235 nucleus undergoes fission may
   have an energy of about 1 MeV. A ‘thermal’ neutron that would be
   captured by a uranium-235 nucleus in a nuclear reactor would have an
   energy of about 0.02 MeV.
   (a) Calculate the wavelength of a 1 MeV neutron.
   (b) Calculate the wavelength of a 0.02 MeV neutron.



CHAPTER 24 PROBING THE NUCLEUS
Remember
Before beginning this chapter, you should be able to:
• recall the features of the model of the atom in terms
of the arrangement and properties of protons,
neutrons and electrons
• describe the nature of alpha, beta and gamma rays
and recall their properties in terms of ionising power,
penetrating power and deflection by magnetic and
electric fields.
• recall the relationship between energy and mass, $E = mc^{2}$.

Key content
At the end of this chapter you should be able to:
• define the term ‘transmutation’ and describe the
transmutations involved in naturally occurring
radioactivity
• describe Chadwick’s discovery of the neutron and the
part played by conservation laws in this discovery
• contrast the properties of protons and neutrons and
describe them as nucleons
• describe the problems associated with the energy
distribution of electrons emitted in beta decay that
led Pauli to predict the existence of the neutrino
• describe the properties of the strong nuclear force
and realise that over very short ranges it is much
stronger than the electrostatic and gravitational
forces between nucleons
• explain the concept of mass defect (using $E = mc^{2}$)
and be able to calculate the mass defect of nuclei and
the energy associated with nuclear reactions.

Figure 24.1 A wide-angle view of the Super-Kamiokande
detector 1000 m below ground in Japan as it was nearing
completion. The view is from the bottom of the 42 m high,
39 m diameter tank. The top and walls of the tank at this
stage were covered with about 9000 light-sensitive
phototubes. On completion it was filled with 50 000 tonnes
of ultra-pure water. In 1998, results from this detector
indicated neutrino oscillation and hence, neutrino mass.

24.1 DISCOVERIES PRE-DATING THE NUCLEUS
In chapter 22 we studied the development of the nuclear atom. In this
chapter, we will study the development of the theories of the nucleus itself.
Some important discoveries and investigations were made before the dis-
covery of the nucleus and we will start by examining those investigations.

Discovery and early investigations of radioactivity

Henri Becquerel discovered radioactivity in 1896 when he was studying
the radiation emitted from phosphorescent substances that had
previously been exposed to sunlight. Becquerel found by accident that a salt
of uranium, potassium–uranyl sulfate, continuously emitted radiation
regardless of whether or not it had been exposed to sunlight. This
radiation penetrated matter, passing through black paper (opaque to light)
and causing a photographic plate to become darkened. It seemed to be
similar in nature to X-rays, which had been recently discovered by
Wilhelm Roentgen (1845–1923).

A phosphorescent substance absorbs radiation of one wavelength and
then emits radiation of a different wavelength over a period of time. The
hands of some analogue watches are coated with a phosphorescent
substance to enable them to be seen in the dark.

In 1898, Rutherford showed that there were two components (alpha and
beta rays) of the radiation discovered by Becquerel, and in 1900 Paul
Villard (1860–1934) discovered the third component (gamma rays). You
encountered alpha, beta and gamma rays in the preliminary course unit
‘The Cosmic Engine’. The properties of the radiation are reviewed in
‘Physics in focus’ below.

PHYSICS FACT
The radiation discovered by Becquerel

Uranium is an emitter of alpha particles, which have a low penetrating
power and would have been stopped by the black paper.
What really caused the photographic plate to be darkened was the beta
particles emitted by the thorium produced by the alpha particle
emission from uranium.

PHYSICS IN FOCUS
Review of the properties and identities of alpha, beta and gamma radiation

The properties of alpha, beta and gamma radiation can be summarised as follows.

Penetrating power
Figure 24.2 shows that the penetrating power is lowest for alpha particles, which can be stopped by a
sheet of paper or a few centimetres of air. Beta particles will be stopped by many metres of air or a sheet
of aluminium about a centimetre thick. Gamma rays may pass through a few centimetres of lead or
many metres of concrete before being stopped.



Figure 24.2 The relative penetrating powers of alpha, beta, and gamma
radiation. α particles are absorbed in a few centimetres of air, or by a
piece of paper or layer of dead skin. β particles are absorbed in about
100 cm of air or a few centimetres of aluminium. γ rays are barely
affected by air and are absorbed in many centimetres of lead.
(Barriers shown: paper, aluminium, lead, concrete.)

Ionising power
As might be expected, the ionising power is the inverse of the
penetrating power. Alpha particles interact most strongly with matter and
hence have a low penetrating power and high ionising power. The
ionising power of beta particles is lower and that of gamma rays is very
low (see figure 24.3).

Figure 24.3 Ionisation caused by radioactive emissions passing through gas
(a) Alpha particles cause intense ionisation in a gas.
(b) Beta particles ionise a gas much less than alpha particles do.
(c) Gamma rays ionise a gas even less than beta particles.

Deflection by a magnetic field
The paths of the different types of radiation through a magnetic field
proved to be harder to observe, but once detected they indicated that
alpha particles were positively charged, beta particles were negatively
charged and gamma rays were neutral.

Figure 24.4 The paths of alpha, beta and gamma rays through magnetic
fields
(Labels: source of radioactivity, lead shield, magnetic field, α particles,
γ rays, β particles, detection apparatus (e.g. photographic film).)



Table 24.1 Summary of properties and identities of alpha, beta and gamma radiation

TYPE OF RADIATION | PENETRATING POWER | IONISING POWER | PATH THROUGH MAGNETIC FIELD    | NATURE OF RADIATION (IN TODAY’S TERMS)
Alpha             | Very low          | Very high      | Curved path of positive charge | Helium nucleus
Beta              | High              | Moderate       | Curved path of negative charge | Electron
Gamma             | Extremely high    | Very low       | Not deflected                  | High energy photon

PHYSICS FACT
Detection of radioactivity

One of the earliest methods of detecting radioactivity used a
radioactive source’s ionising power. As shown in figure 24.5, if a
radioactive source was brought near to a charged electroscope, the
electroscope would discharge. It did not matter whether the radioactive
source emitted alpha particles or beta particles or what the sign of the
charge was on the electroscope. The electroscope was discharged because
it attracted ions of opposite charge to the electroscope.
The radioactive particles produced both positively and negatively
charged ions when passing through air and hence the electroscope was
discharged.
Apart from the scintillation method used by Rutherford and his
co-workers, the other methods used to detect radioactivity relied on the
ionising power of the radiation. This is one of the reasons why detecting
the neutral particles, neutrons and neutrinos, proved so difficult.

Figure 24.5 Discharging an electroscope with a radioactive source.
Positive ions are attracted to the electroscope and cause rapid
discharging.
(Labels: source of alpha particles, leaf electroscope.)

Naturally occurring radioactivity explained


In 1902, Rutherford and the English chemist Frederick Soddy (1877–1956) published a paper proposing that the emission of radioactivity was the result of ‘radioactive transformation’. When a radioactive atom emitted an alpha particle or beta particle, the atom split into two. The alpha or beta particle was emitted and what remained was a heavy leftover part that was chemically and physically different from the parent atom (see figures 24.6 and 24.7).
This ‘transformation’, ‘disintegration’, ‘decay’ or ‘transmutation’ was responsible for turning one element into another.
After the discovery of the nucleus, the transmutation was identified as the emission of alpha or beta particles from the nucleus.

When a radioactive atom emitted an alpha particle or a beta particle, an atom of a new element was produced. This process by which a new daughter element was formed from a parent element was termed transmutation.

Figure 24.6 The alpha decay of radium-226 (88 protons, 138 neutrons) to radon-222 (86 protons, 136 neutrons) with the emission of an α particle

Figure 24.7 The beta decay of thorium-234 (90 protons, 144 neutrons) to protactinium-234 (91 protons, 143 neutrons) with the emission of a β particle

PHYSICS FACT
Writing nuclear equations
The formulas for nuclei are written in the form ${}^{A}_{Z}\mathrm{X}$ where X is the symbol for the element, A is the mass number (number of protons plus neutrons) and Z is the atomic number (number of protons).
The term nuclide is used to denote a nucleus characterised by particular values of Z and A. If a group of nuclides share the same atomic number but have different mass numbers, they are referred to as isotopes of that element.
In any nuclear reaction, the sum of the mass numbers before the reaction must be equal to the sum of the mass numbers after the reaction. The sum of the atomic numbers before the reaction must likewise be equal to the sum of the atomic numbers after the reaction.
This can be slightly complicated by the fact that if a beta decay is involved, the electron is assigned an atomic number of negative 1. It may be necessary to look up a periodic table to determine the element formed if not all the information is supplied.
The equations for the transmutations associated with some common examples of alpha decay and beta decay are:
$${}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He}$$
$${}^{234}_{90}\mathrm{Th} \rightarrow {}^{234}_{91}\mathrm{Pa} + {}^{0}_{-1}\mathrm{e}.$$
We can see that alpha decay reduces the atomic number by two and the mass number by four, and beta decay increases the atomic number by one and leaves the mass number unchanged.

The term nuclide refers to a particular nucleus with certain values of Z (atomic number) and A (mass number).
An isotope is a nuclide that has the same number of protons but different numbers of neutrons.
We will see later that another particle, an antineutrino, is also emitted during beta decay. As we are dealing with the transmutations observed and explained in the early 1900s we will omit the antineutrino at present.
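The two conservation rules for nuclear equations can be checked mechanically. The short Python sketch below is only an illustration: the tuple layout (symbol, A, Z) and the function name `is_balanced` are our own choices, not notation used elsewhere in this book.

```python
# Each nuclide or particle is written as (symbol, mass number A, atomic number Z).

def total_mass_number(side):
    """Sum of the mass numbers on one side of a nuclear equation."""
    return sum(a for _, a, _ in side)


def total_atomic_number(side):
    """Sum of the atomic numbers on one side of a nuclear equation."""
    return sum(z for _, _, z in side)


def is_balanced(reactants, products):
    """Return True if both mass number and atomic number are conserved."""
    return (total_mass_number(reactants) == total_mass_number(products)
            and total_atomic_number(reactants) == total_atomic_number(products))


# Alpha decay of uranium-238.
alpha_decay = ([("U", 238, 92)], [("Th", 234, 90), ("He", 4, 2)])

# Beta decay of thorium-234; the electron is assigned A = 0 and Z = -1.
beta_decay = ([("Th", 234, 90)], [("Pa", 234, 91), ("e", 0, -1)])

print(is_balanced(*alpha_decay))  # True
print(is_balanced(*beta_decay))   # True
```

The same check can be applied to any of the other nuclear equations in this chapter by listing the particles on each side.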

SAMPLE PROBLEM 24.1 Radioactive decay
The decay series starting with uranium-238 proceeds by alpha decay and beta decay until the stable isotope lead-206 is reached.
(a) How many alpha decays are involved in this series?
(b) How many beta decays are involved in this series?

SOLUTION
(a) As alpha decay is the only decay that reduces the mass number, and each alpha decay causes a decrease in mass number of four, there must be $\frac{238 - 206}{4} = 8$ alpha decays.
(b) The atomic number of uranium is 92 and that of lead is 82. As there are eight alpha decays, these would reduce the atomic number by 16. However, it decreased only by 10. Hence, there must have been six beta decays, each one increasing the atomic number by one.
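The reasoning in this solution can be sketched as a small Python function. The function name and argument order are hypothetical, and the sketch assumes that only alpha decays and beta-minus decays occur in the series.

```python
def count_decays(a_parent, z_parent, a_daughter, z_daughter):
    """Count the alpha and beta-minus decays linking a parent to a daughter nuclide."""
    mass_drop = a_parent - a_daughter
    if mass_drop < 0 or mass_drop % 4 != 0:
        raise ValueError("mass number must fall by a multiple of four")
    # Only alpha decay changes the mass number, by four each time.
    n_alpha = mass_drop // 4
    # Each alpha decay lowers Z by two; each beta-minus decay raises Z by one.
    n_beta = z_daughter - (z_parent - 2 * n_alpha)
    if n_beta < 0:
        raise ValueError("series cannot be built from alpha and beta-minus decays")
    return n_alpha, n_beta


# Uranium-238 (Z = 92) to lead-206 (Z = 82).
print(count_decays(238, 92, 206, 82))  # (8, 6)
```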

24.2 DISCOVERY OF THE NEUTRON


After the discovery of the nucleus, it seemed logical to assume that the nucleus contained protons and electrons. It was possible to explain radioactive transmutations in terms of emission of alpha and beta particles from the nucleus of protons and electrons. However, there were major problems with the idea of a nucleus containing these constituents.
In 1920, Rutherford proposed that a neutral particle, with mass comparable to that of a proton, must be another constituent of the nucleus. He named this particle a neutron. In future research he and James Chadwick (1891–1974) continued to look out for any result that would suggest the existence of such a particle.

Experiments involving artificially induced radioactivity
The radioactivity that we have encountered so far has been associated
with natural alpha, beta and gamma emitters. Rutherford was the first to
use alpha particles to produce nuclear reactions.

The first artificially induced transmutation


In 1919, Rutherford bombarded nitrogen gas with alpha particles from
bismuth-214. A positively charged particle which was more penetrative
than an alpha particle was produced. This particle was identified as a
proton.
What had occurred, as shown in figure 24.8, was that the alpha particle
had combined with the nitrogen nucleus and a proton had been emitted.
The alpha particles from the bismuth-214 source were able to approach
the nucleus very closely and occasionally make contact with it. The equation for this reaction is:
$${}^{4}_{2}\mathrm{He} + {}^{14}_{7}\mathrm{N} \rightarrow {}^{17}_{8}\mathrm{O} + {}^{1}_{1}\mathrm{H}.$$

PHYSICS FACT
Alpha-particle-induced nuclear reactions
When the first alpha particle scattering experiments were performed, low-energy alpha particles were used and those that approached a gold nucleus (containing 79 protons) were strongly repelled. In the alpha-particle-induced reaction with nitrogen, the alpha particles had a much higher energy than those used in the early experiments and there was only a weak repelling force from a nitrogen nucleus that contained only 7 protons. An energetic alpha particle was able to make contact with the nitrogen nucleus.
Various writers have commented that Rutherford was fortunate that he did not use a source of very powerful alpha particles when he performed his first alpha particle scattering experiments!
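To get a feel for how much the charge of the target nucleus matters, we can estimate the distance of closest approach in a head-on collision by equating the alpha particle's kinetic energy to the electrostatic potential energy. The Python sketch below rests on several assumptions of our own: the target nucleus is treated as a fixed point charge, the rounded constant 1.44 MeV fm is used for $e^2/4\pi\varepsilon_0$, and the energy of 5 MeV is an illustrative value only.

```python
COULOMB_MEV_FM = 1.44  # e^2 / (4 pi epsilon_0) in MeV fm (rounded standard value)


def closest_approach_fm(z_target, alpha_energy_mev):
    """Head-on distance of closest approach (in fm) of an alpha particle
    (charge 2e) to a fixed point nucleus of charge z_target * e."""
    return COULOMB_MEV_FM * 2 * z_target / alpha_energy_mev


energy = 5.0  # MeV, assumed for illustration
d_gold = closest_approach_fm(79, energy)
d_nitrogen = closest_approach_fm(7, energy)

print(f"gold:     {d_gold:.1f} fm")
print(f"nitrogen: {d_nitrogen:.1f} fm")
```

At the same energy the alpha particle gets about eleven times closer to the centre of a nitrogen nucleus than to that of a gold nucleus, which is consistent with the much weaker repulsion described above.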

Figure 24.8 The photograph shows alpha particle tracks through a cloud chamber filled with nitrogen gas. The diagram helps to identify an event where a nitrogen nucleus has been struck by an alpha particle. A proton is ejected upwards and the resulting oxygen nucleus recoils downwards. (Diagram labels: path of incident alpha particle; nitrogen nucleus; path of ejected proton; path of recoil of resulting oxygen nucleus.)

An artificially induced radioactivity


In 1930, Bothe and Becker (in Germany) fired alpha particles at beryllium and found that a highly penetrating radiation was produced. The radiation seemed to be similar to gamma rays (high-energy photons) but it was much more highly penetrating than the gamma rays previously observed. It was found to have an energy of about 10 MeV, again much higher than that previously observed for gamma rays.

We have seen that in atomic physics it was convenient to measure energy in electron-volts. In nuclear physics, energy is usually measured in million electron volts (MeV). The energies associated with nuclear processes are very much larger than those associated with atomic processes.

In France, Frédéric Joliot (1900–1958) and his wife Irène Curie (1897–1956) (daughter of Marie Curie), studied this mysterious radiation and let it fall on a block of paraffin. Paraffin is a hydrocarbon very rich in hydrogen atoms. They found that the radiation knocked protons (hydrogen nuclei) from the paraffin (see figure 24.9). The energy of the protons was about 5 MeV. Of course, now that charged particles (protons) were involved, it was much easier to determine their properties. They also found that many more protons than expected were emitted from the paraffin. If gamma rays had been responsible, their very high penetrating power would have resulted in fewer interactions with protons.
The high energy of the protons (5 MeV) was a problem because applying the conservation of energy and conservation of momentum to the collision between a gamma ray and a proton yielded a value for the incident gamma ray of at least 50 MeV. This was a major dilemma because the energy of the incident alpha particles was only about 5 MeV. In other words, if this was the correct interpretation, there had to have been a tenfold increase in energy in the interaction!
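The figure of ‘at least 50 MeV’ can be checked with a short calculation. The Python sketch below assumes the most favourable case, a head-on collision in which the photon is scattered straight back from a proton initially at rest. Conservation of energy gives $E - E' = T$ and conservation of momentum gives $E + E' = pc$, where $T$ is the proton's kinetic energy and $(pc)^2 = T^2 + 2Tm_pc^2$. The proton rest energy of 938.27 MeV is a standard value that is not quoted in this section, and the function name is our own.

```python
import math

PROTON_REST_ENERGY_MEV = 938.27  # m_p c^2, standard value (not from this section)


def min_photon_energy_mev(proton_ke_mev, rest_energy_mev=PROTON_REST_ENERGY_MEV):
    """Smallest photon energy that can give a free proton at rest the stated
    kinetic energy (head-on collision, photon scattered straight back)."""
    t = proton_ke_mev
    # Relativistic momentum of the recoiling proton, expressed as pc in MeV.
    proton_pc = math.sqrt(t * t + 2 * t * rest_energy_mev)
    # Adding E - E' = T and E + E' = pc gives 2E = T + pc.
    return (t + proton_pc) / 2


gamma_energy = min_photon_energy_mev(5.0)
print(f"{gamma_energy:.1f} MeV")
```

Under these assumptions a 5 MeV proton requires a photon of about 51 MeV, in line with the value quoted above.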

Figure 24.9 The reaction of alpha particles with beryllium produced a mysterious radiation that knocked protons out of paraffin. (Diagram labels: source of radioactivity; α particles; beryllium; unknown radiation; paraffin wax; protons; detection device.)

PHYSICS FACT
Rutherford’s prediction of the neutron
In his Bakerian lecture of 1920, Rutherford had suggested that ‘it may be possible for an electron to combine much more closely with the hydrogen nucleus than is the case in the ordinary hydrogen atom’. He later used the term neutron. It is worth noting that Rutherford’s conjecture about the existence of the neutron had not received wide publication and it had not been read by either Joliot or his wife. Some years later Joliot commented on the fact that he had not read Rutherford’s Bakerian lecture and that, had he done so, it was possible or probable that he and his wife would have identified the neutron before Chadwick.

Chadwick identifies the neutron


Just over two weeks after reading the paper of the Joliot–Curies, James
Chadwick (1891–1974) had completed his work and submitted a paper
on ‘Possible existence of a neutron’ (1932). In that time Chadwick
applied conservation of energy and conservation of momentum to the
interaction of a neutral particle (of mass similar to that of a proton) with
a proton. Chadwick made measurements of the recoil of nuclei of
hydrogen and nitrogen after interactions with his proposed neutron. The
measurements were difficult but led to the mass of a neutron being
calculated to be 1.15 times that of a proton.
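The method of comparing recoils can be sketched in a few lines of Python. For a head-on elastic collision of a neutron of mass $m$ and speed $v$ with a stationary nucleus of mass $M$, conservation of energy and momentum give a recoil speed of $2mv/(m + M)$, so the ratio of the hydrogen and nitrogen recoil speeds is $(m + M_N)/(m + M_H)$. The recoil speeds used below are illustrative values assumed for this example, not data taken from this text, and the masses are expressed in units of the proton mass with nitrogen taken as 14.

```python
def neutron_mass_from_recoils(v_hydrogen, v_nitrogen,
                              m_hydrogen=1.0, m_nitrogen=14.0):
    """Neutron mass (in proton masses) from the maximum recoil speeds of
    hydrogen and nitrogen nuclei struck head-on by the same neutrons."""
    ratio = v_hydrogen / v_nitrogen
    # ratio = (m + M_N) / (m + M_H), rearranged for m.
    return (m_nitrogen - ratio * m_hydrogen) / (ratio - 1)


# Illustrative recoil speeds in m/s (assumed values).
m_n = neutron_mass_from_recoils(3.3e7, 4.7e6)
print(f"{m_n:.2f} proton masses")
```

With these assumed speeds the result is close to the value of 1.15 quoted above. Because the answer depends on a ratio of two speeds that are hard to measure, small errors in either speed change the calculated mass noticeably.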
At this time (1932), there was doubt expressed about whether or not the conservation laws of classical physics would apply to nuclear processes. Some leading physicists were adamant that they would but others, including Bohr, thought otherwise. In fact it was 1936 before Bohr dropped his ideas of non-conservation of energy.
As Chadwick’s neutron identification depended on the conservation laws and there was doubt expressed about them at the time, he concluded his paper ‘Up to the present, all the evidence is in favour of the neutron . . . [unless] the conservation of energy and momentum be relinquished at some point’.
The nuclear equation for the reaction of alpha particles with beryllium is:
$${}^{9}_{4}\mathrm{Be} + {}^{4}_{2}\alpha \rightarrow {}^{12}_{6}\mathrm{C} + {}^{1}_{0}\mathrm{n}.$$


PHYSICS FACT
Problems with electrons and protons in close association
It is worth noting that there are major difficulties with the concept of electrons and protons in close association either as a single particle (the neutron) or generally in the nuclei of atoms. The masses of atoms could not be explained in terms of numbers of protons and electrons. Another problem involved the de Broglie wavelength of an electron. How could an electron with an energy of a few MeV be confined to a region with a radius of $5 \times 10^{-15}$ m when its de Broglie wavelength was large compared to this radius?
These difficulties were overlooked at the time because, after all, an alpha particle seemed to be 4 protons and 2 electrons combined very tightly together.
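The size of the mismatch can be estimated with a short calculation. The Python sketch below uses the relativistic relation $(pc)^2 = T^2 + 2Tm_ec^2$ for an electron of kinetic energy $T$, together with $\lambda = hc/pc$. The constants (1239.84 MeV fm for $hc$ and 0.511 MeV for the electron rest energy) are standard values not quoted in this section, and the energies of 1 MeV and 3 MeV are our own illustrative reading of ‘a few MeV’.

```python
import math

HC_MEV_FM = 1239.84               # Planck constant times speed of light, MeV fm
ELECTRON_REST_ENERGY_MEV = 0.511  # m_e c^2
NUCLEAR_RADIUS_FM = 5.0           # 5 x 10^-15 m, the radius quoted in the text


def de_broglie_wavelength_fm(kinetic_energy_mev):
    """De Broglie wavelength (in fm) of an electron with the given kinetic energy."""
    t = kinetic_energy_mev
    pc = math.sqrt(t * t + 2 * t * ELECTRON_REST_ENERGY_MEV)
    return HC_MEV_FM / pc


for energy in (1.0, 3.0):
    wavelength = de_broglie_wavelength_fm(energy)
    print(f"{energy} MeV: {wavelength:.0f} fm, "
          f"{wavelength / NUCLEAR_RADIUS_FM:.0f} times the nuclear radius")
```

For both energies the wavelength comes out at hundreds of femtometres, far larger than the 5 fm radius of the region in which the electron was supposed to be confined.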

There were no naturally occurring neutron emitters but now, with a high-
energy alpha particle source (such as polonium) and some beryllium, it was
possible to produce neutrons and conduct neutron scattering experiments.

24.3 DISCOVERY OF THE NEUTRINO


The discovery of the neutron had gone a long way to help explain the
nucleus but problems with beta decay remained. Eventually, the only way
to solve these problems was to predict the existence of another neutral
particle, the neutrino.

Puzzles and problems of beta decay


There are good reasons why an electron cannot possibly be confined to a
nucleus (see Physics fact above), and beta decay had been a problem
since its discovery. Attempts to explain beta decay in a similar manner to
alpha decay were doomed to failure.
All alpha particles emitted from a particular radioactive species had
the same energy, but beta particles seemed to be emitted with a range of
energies. There was considerable debate as to whether the beta particles
had a continuous or line spectrum.
Detection methods proved to be confusing and photographic methods
of detection, which tended to favour line spectra, were not sufficiently
sensitive to determine the continuous spectrum.
Prior to World War I, James Chadwick had used Geiger’s ‘point
counter’ to detect beta particles that had been deflected by a magnetic
field. He detected beta particles with a continuous range of radii
indicating that they had been emitted with a wide spread of energies.
Figure 24.10 is a graph of these energies.
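The link between path radius and energy can be illustrated numerically. The radius of the path of a charged particle in a magnetic field is $r = mv/qB$; for beta particles with energies approaching 1 MeV the momentum $mv$ must be replaced by the relativistic momentum $p$, so the Python sketch below uses $r = p/qB$ with $(pc)^2 = T^2 + 2Tm_ec^2$. The field strength of 0.1 T and the chosen energies are assumed values for illustration, not details of Chadwick's apparatus.

```python
import math

SPEED_OF_LIGHT = 2.998e8          # m/s
ELECTRON_REST_ENERGY_EV = 0.511e6  # m_e c^2 in eV


def path_radius_m(kinetic_energy_mev, field_tesla):
    """Radius (in m) of the circular path of a beta particle moving at right
    angles to a uniform magnetic field."""
    t = kinetic_energy_mev * 1e6  # kinetic energy in eV
    pc_ev = math.sqrt(t * t + 2 * t * ELECTRON_REST_ENERGY_EV)
    # With p = pc / c and q = e, r = p / (qB) reduces to pc[eV] / (c B).
    return pc_ev / (SPEED_OF_LIGHT * field_tesla)


field = 0.1  # tesla, assumed for illustration
for energy in (0.2, 0.5, 1.0):
    print(f"{energy} MeV: r = {100 * path_radius_m(energy, field):.1f} cm")
```

A continuous spread of radii in a fixed field therefore corresponds to a continuous spread of beta particle energies.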
Some very competent physicists, including Otto Hahn (1879–1968)
and Lise Meitner (1878–1968), had claimed to observe many lines in the
beta particle spectrum.
In fact, for many years Hahn and Meitner continued to cling to their
theory that all beta particles were emitted from the nucleus with the same
energy but that this energy was modified as the beta particles left the atom.
How could one beta decay be associated with emission of a certain amount of energy from a nucleus but another beta decay from a similar nucleus be associated with a different amount of energy? After all, both decays produced the same new nucleus. It is not difficult to see why Hahn and Meitner had developed their theory about the same energy on emission or why Bohr continued to doubt that conservation of energy applied to nuclear processes. Bohr’s view in particular could be called desperate and, initially, so was Pauli’s solution.

As we saw in section 22.1, the radius of the path of a charged particle through a magnetic field is given by $r = \frac{mv}{qB}$. Beta particles, all with the same charge to mass ratio, had paths of different radii when travelling through a uniform magnetic field (see figure 24.11). Therefore, they must have had different velocities and, hence, been emitted with different energies.

Figure 24.10 The distribution of energy of beta particles (graph of the relative number of β particles against the kinetic energy of the β particles, from 0 to 1.2 MeV)

Figure 24.11 The variation of radius of the beta particles as they travelled through a magnetic field indicated that they had a wide range of energies. (Diagram labels: magnetic field; β particles; detector; the different radii of the beta particles are due to their different velocities (energies).)

In an attempt to resolve the paradoxes involving beta decay, in 1931 Wolfgang Pauli (1900–1958) took the bold step of predicting that there must be another sub-atomic particle.
Pauli himself later stated, ‘In June 1931, on the occasion of a conference in Pasadena, I proposed the following interpretation: the conservation laws remain valid, the expulsion of beta particles being accompanied by a very penetrating radiation of neutral particles, which has not yet been observed.’
This was prior to Chadwick’s discovery of the neutron and Pauli referred to his predicted particle as a ‘neutron’. It was later renamed ‘neutrino’ by Enrico Fermi (1901–1954) to avoid any confusion with the neutron of Rutherford and Chadwick.

PHYSICS FACT
Pauli and the neutrino
Pauli was reluctant at first to speak about the neutrino. Fermi invited him to speak at a conference in Rome but Pauli was still very cautious about it and would speak privately only to Fermi.
Pauli told astronomer Walter Baade, ‘Today I have done the worst thing for a theoretical physicist. I have invented something which can never be detected experimentally.’ Baade offered to bet a crate of champagne that the particle would be detected and Pauli accepted. Pauli could never win the bet as there was no time limit specified but in the mid-1950s the bet was paid.

Fermi explains beta decay


By late 1933, Fermi had completed his famous paper on beta decay. In it, he outlined the problems of the continuous spectrum of beta particles, and electrons being present in the nucleus. He then stated that he would explain beta decay in terms of Pauli’s suggestion of the emission of a lightweight neutral particle, the neutrino, along with the electron, and also in terms of the suggestion of Werner Heisenberg (1901–1976) that the nucleus contained only ‘heavy’ particles, protons and neutrons.
Fermi also proposed that the number of electrons and neutrinos was not constant. Electrons and neutrinos could be created or could disappear just like photons. He accounted for the process in which a neutron was transformed into a proton with emission of an electron and a neutrino.
He also dealt with more advanced quantum aspects and was able to produce the shape of the spectrum associated with beta decay. He did this for three different values of the mass of the neutrino and found that the shape that most closely resembled the observed shape was for a neutrino mass of zero, or very close to zero.
By the 1950s the decay of other subatomic particles had been found to be similar in character to beta decay, and this led to the idea that a new force was associated with these decays. The force became known as the weak nuclear force or weak force and joined the gravitational force, the electromagnetic force and the strong nuclear force as a fundamental force of nature.
Fermi’s neutrino is now called an antineutrino. (In terms of some of its other properties it is more sensible to classify it as an antiparticle rather than a particle.)

Although we have seen that Fermi’s neutrino is known today as an antineutrino, we will continue to use the term neutrino in a generic sense that can apply to either neutrinos or antineutrinos.

In 1979, Sheldon Glashow (1932– ), Abdus Salam (1926–1996) and Steven Weinberg (1933– ) were awarded the Nobel prize for showing that the weak nuclear force and the electromagnetic force could be viewed as different aspects of a single force, called the electroweak force. This reduced the number of fundamental forces from four to three, the other two being gravity and the strong nuclear force that we will encounter later in this chapter.
In 1984, Carlo Rubbia (1934– ) and Simon van der Meer (1925– ) were awarded the Nobel prize for their experimental verification of the theory of the electroweak force.

There are now three different simple types of beta decay. They are beta-minus, beta-plus and electron capture. Figure 24.12 shows diagrams representing beta-minus and beta-plus decay and the equations are given below.
Beta-minus decay:
$${}^{1}_{0}\mathrm{n} \rightarrow {}^{1}_{1}\mathrm{p} + {}^{0}_{-1}\mathrm{e} + {}^{0}_{0}\bar{\nu}$$
Beta-plus decay:
$${}^{1}_{1}\mathrm{p} \rightarrow {}^{1}_{0}\mathrm{n} + {}^{0}_{+1}\mathrm{e} + {}^{0}_{0}\nu$$
Anti-particles are written with a bar above the symbol.
Unless stated otherwise, the term beta decay will be used to refer to beta-minus decay.

Figure 24.12 Diagrams representing (a) beta-minus decay and (b) beta-plus decay. In (a) a neutron in a nucleus decays to a proton, and a beta particle, $\mathrm{e}^{-}$, and an antineutrino, $\bar{\nu}$, are emitted. In (b) a proton in a nucleus decays to a neutron, and a positron, $\mathrm{e}^{+}$, and a neutrino, $\nu$, are emitted.

When Fermi tried to publish his paper in the prestigious journal ‘Nature’, it was rejected as being too far removed from reality. It remains one of the classic papers of physics.

Detection of neutrinos
When Pauli predicted that the neutrino would never be observed, he
imagined that it had a mass similar to the mass of an electron. After
Fermi had predicted that it had a lower or zero mass, the detection
would have seemed even more remote.



PHYSICS FACT
Interaction of neutrinos with matter
If you hold out your hand as if to catch something coming from the direction of the Sun, about $10^{13}$ neutrinos pass through it every second. If you did the same thing at night, with the Earth between you and the Sun, again about $10^{13}$ neutrinos would pass through it every second. The chance of a neutrino interacting with matter is extremely small.
After Fermi’s paper on beta decay, Hans Bethe, a US physicist born in Germany in 1906, and Rudolf Peierls (1907–1995) calculated that a neutrino could travel through about 1000 light-years of water before it would be absorbed. They were certain that neutrinos would never be detected by inverse beta decay. Peierls commented about 50 years later that they had not allowed for the existence of nuclear reactors, or the ingenuity of experimental physicists.
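A rough idea of how small the chance of interaction is can be obtained by turning the figure of 1000 light-years into an effective target area per nucleon. The Python sketch below is an order-of-magnitude estimate only. It assumes that the 1000 light-years can be treated as a mean free path $\lambda = 1/n\sigma$, that every nucleon in the water counts as a possible target, and that water has a density of 1000 kg m⁻³; the value used for the light-year is a standard one not quoted in this section.

```python
LIGHT_YEAR_M = 9.461e15          # metres in one light-year
ATOMIC_MASS_UNIT_KG = 1.661e-27  # approximate mass of one nucleon
WATER_DENSITY_KG_M3 = 1000.0

# Number of nucleons in each cubic metre of water.
nucleons_per_m3 = WATER_DENSITY_KG_M3 / ATOMIC_MASS_UNIT_KG

# Treat 1000 light-years as the mean free path of the neutrino in water.
mean_free_path_m = 1000 * LIGHT_YEAR_M

# Mean free path = 1 / (number density x cross-section).
cross_section_m2 = 1 / (nucleons_per_m3 * mean_free_path_m)

print(f"{cross_section_m2:.1e} m^2 per nucleon")
```

The result is of the order of $10^{-49}$ m², roughly twenty orders of magnitude smaller than the geometric cross-section of a nucleon of radius $10^{-15}$ m.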

In 1953, Cowan and Reines built a detector that was the forerunner of
some of the detectors used today. Their detector contained a tank of liquid
that emitted scintillations after gamma rays passed through the tank.
Photomultiplier tubes around the tank detected light (the scintillations)
emitted by the liquid. They hoped to observe scintillations caused by
gamma rays produced by the annihilation of a positron and an electron,
and also the gamma rays emitted after a nucleus had gained a neutron.
As we have already seen, the process of beta decay produces antineutrinos, and Cowan and Reines used the antineutrinos produced in beta decays occurring in a nuclear reactor.
Their experiment relied on the process of inverse beta decay in which a proton interacted with an antineutrino and produced a neutron and a positron.
$${}^{0}_{0}\bar{\nu} + {}^{1}_{1}\mathrm{p} \rightarrow {}^{1}_{0}\mathrm{n} + {}^{0}_{+1}\mathrm{e}$$
The results from this detector led them to believe that they had probably
detected events produced by antineutrinos. This was confirmed in 1956
when they built an improved version of the detector and counted about
three events per hour. This indicated inverse beta decay.

PHYSICS FACT
Neutrinos from SN1987A
On 23 February 1987, various neutrino detectors registered an increase in neutrinos. The increase was small but significant (only 12 neutrino events were detected as an estimated $10^{16}$ neutrinos passed through the Japanese neutrino detector, Kamiokande II, the predecessor of Super-Kamiokande). Some hours later, SN1987A, the first supernova visible to the naked eye for about 400 years, was observed in the Large Magellanic Cloud, a distance of about 50 kpc (kiloparsecs) from Earth. (One parsec is approximately three light years.)

Properties of neutrinos
As we have already seen, neutrinos have an incredibly high penetrating
power and only very rarely interact with matter. Neutrinos have other
properties, for example:



• they are neutral
• they have either zero or an extremely small mass
• they travel at, or extremely close to, the speed of light
• they possess both momentum and energy (and carry away from a beta
decay the momentum and energy that was previously seen to be
missing after a beta decay)
• they have an intrinsic spin (see Physics fact below).

PHYSICS FACT
The concept of ‘spin’
Although spin is not dealt with in this course, it is a particularly
important concept in quantum mechanics. An electron in the
Bohr atom has angular momentum because it is in orbit about the
nucleus. It has another component of angular momentum that can be
considered to be due to it rotating on its axis. It can be thought of as
similar to the Earth in orbit around the Sun and the Earth rotating
on its axis as it does so. As spin is really a relativistic effect, this analogy
breaks down but it serves as a starting point.
The spin of an orbiting electron in an atom is quantised and gives
the fourth quantum number. (The first three are the principal
quantum number, n, from the Bohr model, the angular momentum
quantum number, l, and the magnetic quantum number, m.)
Similarly, nucleons have an intrinsic angular momentum or spin,
and this is also quantised.

PHYSICS IN FOCUS
Recent discoveries related to neutrinos
There have been many mysteries about neutrinos, some of which have been solved by recent discoveries. In chapter 26 we will study the ‘standard model of particle physics’. Perhaps further study of neutrinos will help take physics beyond the standard model. Two neutrino detectors have figured in these important discoveries:
• the Super-Kamiokande (SK) detector in Japan
• the Sudbury Neutrino Observatory (SNO) detector in Canada.
In 1998, observations made with the Super-Kamiokande neutrino detector indicated that neutrinos could change from one type to another. (There are three types of neutrino in the standard model: the electron neutrino, the muon neutrino and the tau neutrino.) Neutrinos had previously been thought to have zero mass but, if they oscillate from one type to another, they must possess some mass, perhaps as small as one millionth of the mass of an electron. The ‘K2K long baseline neutrino oscillation experiment’ indicated that neutrinos do change from one type to another. Information on this experiment can be found at http://neutrino.kek.jp/.
A particular puzzle was the fact that less than half the predicted solar neutrinos were detected on Earth. The Sun produces electron neutrinos and the early neutrino detectors could detect only electron neutrinos. Researchers at the Sudbury Neutrino Observatory in Canada have now confirmed that the electron neutrinos make up one-third of the total number and that the two other types (muon and tau neutrinos) account for the remainder, bringing the total number into agreement with models of the processes occurring in the solar core. (It is possible to detect all three types of neutrino at SNO.) This not only explains the solar neutrino problem but also confirms the fact that some of the electron neutrinos produced in the Sun have changed into muon or tau neutrinos before they reach the Earth.
In December 2002, more evidence for neutrino oscillation was presented by a group of researchers in Japan and the United States. Their experiments using the Kamioka Liquid Scintillator Neutrino Detector (KamLAND) in Japan have indicated that antineutrinos of one type can change into antineutrinos of another type.

Super-Kamiokande (SK) detector
The Super-Kamiokande neutrino detector (see figure 24.1, page 453) suffered a major disaster in 2001 when it was being refilled after cleaning and maintenance. One of the photomultiplier tubes imploded when the tank was about three-quarters full. The shock waves from the implosion destroyed most of the photomultiplier tubes that were underwater at that time (about 6700 of the total number of 11 146 photomultiplier tubes).
The detector was rebuilt in two stages. First the remaining photomultiplier tubes were rearranged (and protected from the effects of another implosion, should one occur) to enable observations to continue; then new photomultiplier tubes were installed. The detector was returned to its full potential in June 2006. It is now referred to as Super-Kamiokande III.
It was very expensive to fully repair the SK detector as each photomultiplier tube costs about US$3000. However, the cost has been justified because of the results achieved from SK:
‘It is this reviewer’s opinion that SK has to be regarded as an astonishing success. SK produced more physics than any other experiment in particle physics in the last 10 years and a large fraction of such physics has turned out to be unexpected. Indeed it is fair to say that the only clear indications we have of physics beyond the standard model come today from neutrino oscillations and SK is the single most important experiment behind them.’ (A member of SAGENAP quoted in Department of Energy, National Science Foundation, Report of Scientific Assessment Group on Experimental Non-Accelerator Physics (SAGENAP), 12–14 March 2002)

Sudbury Neutrino Observatory (SNO)
The detector at the Sudbury Neutrino Observatory is similar to the Super-Kamiokande detector but is considerably smaller, holding only 1000 tonnes of water with about 10 000 light sensors surrounding it. The key feature of the SNO detector is that it is used with ultra-pure heavy water, or heavy water containing salt. When used with heavy water, electron neutrinos can be detected. When salt is added to the heavy water, all three types of neutrino can be detected. It was this feature that enabled the solar neutrino problem to be solved.
It is worth noting that the detection rate at SNO is about one neutrino per hour and that four years of data were required to produce the first meaningful results from SNO.

Figure 24.13 Outside view of the photomultiplier tube support structure at the Sudbury Neutrino Observatory

2002 Nobel Prize in Physics
Raymond Davis of the University of Pennsylvania and Masatoshi Koshiba of Tokyo University shared one half of the 2002 Nobel Prize for their pioneering work in neutrino detection.

24.4 THE STRONG NUCLEAR FORCE


Both the gravitational and electrostatic forces are inverse square forces
so both should become large at the small separation of nucleons in a
nucleus. The force of gravity will provide an attractive force between
proton–proton, proton–neutron and neutron–neutron, but there will
be an electrostatic repulsion between pairs of protons. Another force,
the strong nuclear force, is present in the nucleus and holds nucleons
together.



Relative strengths of gravitational and electrostatic forces between nucleons
The magnitude of the gravitational force between two masses is given by $F_G = \frac{G m_1 m_2}{r^2}$ and the magnitude of the electrostatic force between two charges is given by $F_E = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r^2}$.
We can see that the ratio of the gravitational force, $F_G$, to the electrostatic force, $F_E$, between two protons is given by:
$$\frac{F_G}{F_E} = \frac{G m_1 m_2}{r^2} \times \frac{4\pi\varepsilon_0 r^2}{q_1 q_2} = \frac{G m_p^2 \times 4\pi\varepsilon_0}{q_p^2}.$$
Using this equation with the following data:
Mass of proton, $m_p = 1.673 \times 10^{-27}$ kg
Mass of neutron, $m_n = 1.675 \times 10^{-27}$ kg
Charge on proton, $q_p = 1.602 \times 10^{-19}$ C
Universal gravitation constant, $G = 6.673 \times 10^{-11}\ \mathrm{N\,m^2\,kg^{-2}}$
Permittivity of free space, $\varepsilon_0 = 8.854 \times 10^{-12}\ \mathrm{C^2\,N^{-1}\,m^{-2}}$
shows that the gravitational force is only $8.1 \times 10^{-37}$ times the electrostatic force.
Clearly, the attractive force of gravity between nucleons is so small as to
be insignificant when compared to the electrostatic repulsion between
protons. There must be another force present in the nucleus to hold the
nucleons together. That force is called the strong nuclear force.
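The ratio of the two forces is easily verified. The Python sketch below uses only the data listed above; the function name is our own. Because both forces follow an inverse square law, the separation $r$ cancels and does not appear in the calculation.

```python
import math

G = 6.673e-11          # universal gravitation constant, N m^2 kg^-2
EPSILON_0 = 8.854e-12  # permittivity of free space, C^2 N^-1 m^-2
M_PROTON = 1.673e-27   # mass of proton, kg
Q_PROTON = 1.602e-19   # charge on proton, C


def gravity_to_electrostatic_ratio(m1, m2, q1, q2):
    """Ratio F_G / F_E for two charged masses at any separation."""
    return G * m1 * m2 * 4 * math.pi * EPSILON_0 / (q1 * q2)


ratio = gravity_to_electrostatic_ratio(M_PROTON, M_PROTON, Q_PROTON, Q_PROTON)
print(f"{ratio:.1e}")  # 8.1e-37
```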

Properties of the strong nuclear force


The properties of the strong nuclear force include:
• an independence of charge and a similar force between proton–proton,
neutron–neutron and proton–neutron when electrostatic forces are
ignored
• a very strong attractive force, much stronger than the electrostatic repulsion
between protons. (At very short distances, much less than the diameter
of a nucleon, it changes from attraction to repulsion; see figure 24.14.)
• a very short range force acting over a distance of only about $10^{-15}$ m.
Every proton in a nucleus repels every other proton but the strong
nuclear force exists only between a nucleon and its nearest neighbours.
This is indicated by the almost uniform density of nuclear
matter and also by the nearly uniform binding energy per nucleon.
• a favouring of the binding of pairs of nucleons with opposite spins (see
note on spin) and pairs of pairs with each pair having a total spin of
zero. (This helps account for the exceptional stability of two protons
and two neutrons in an alpha particle.)

Figure 24.14 The graph shows how the strong nuclear force between two
nucleons varies with the separation of the nucleons. (Vertical axis: force
between nucleons, with repulsion above and attraction below the axis;
horizontal axis: separation of nucleons in femtometres, $10^{-15}$ m.)

This ‘old’ strong force, which is well described by the exchange of pions, is
now seen as a consequence of the complexities of the processes involving
interactions between quarks and their messenger particles called gluons.

The fundamental forces so far, with the exception of gravity, have been
shown to involve the exchange of particles called ‘exchange’ particles or
‘messenger’ particles. In electromagnetism, the exchange particle is a
photon, which is massless, and the electromagnetic force extends to
infinity. The strong force as described above is carried by the pi meson or
pion which is about 273 times heavier than an electron. The large mass is
associated with a very short range force.
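
The link between a heavy exchange particle and a short range can be illustrated with a rough estimate. The sketch below assumes the common order-of-magnitude relation range ≈ ħc / (mc²), which is not derived in this text, together with standard values for ħc and the electron rest energy. All names in the code are illustrative only.

```python
# Assumed standard values (not given in the text)
HBAR_C_MEV_FM = 197.3             # hbar * c in MeV femtometres
ELECTRON_REST_ENERGY_MEV = 0.511  # rest energy of an electron in MeV

# From the text: the pion is about 273 times heavier than an electron
PION_MASS_IN_ELECTRON_MASSES = 273

def force_range_fm(rest_energy_mev):
    """Rough range (in fm) of a force carried by a particle of given rest energy.

    Uses the assumed estimate range ~ hbar*c / (m c^2).
    A massless exchange particle gives an infinite range.
    """
    if rest_energy_mev == 0:
        return float("inf")
    return HBAR_C_MEV_FM / rest_energy_mev

pion_rest_energy_mev = PION_MASS_IN_ELECTRON_MASSES * ELECTRON_REST_ENERGY_MEV
pion_range_fm = force_range_fm(pion_rest_energy_mev)
photon_range_fm = force_range_fm(0.0)

print(f"Pion rest energy: {pion_rest_energy_mev:.1f} MeV")
print(f"Estimated range of pion exchange: {pion_range_fm:.2f} fm")
print(f"Range of photon exchange: {photon_range_fm}")
```

The estimate for the pion is of the order of a femtometre, in line with the range of about $10^{-15}$ m quoted for the strong nuclear force, while the massless photon gives an unlimited range.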

24.5 MASS DEFECT AND BINDING ENERGY OF THE NUCLEUS
The energy associated with a nuclear process is incredibly large compared
to that associated with an atomic process. You have already
encountered the equivalence of mass and energy and we now apply this
concept to nuclear processes.

Mass defect
The key to the large energy involved in nuclear reactions is the fact that
mass and energy are equivalent and are linked by Einstein’s relationship,
$E = mc^2$. The other important fact is that the mass of any nucleus is not the
sum of the masses of its constituent protons and neutrons. The difference
between the mass of a nucleus and the total mass of its constituent
nucleons is called the mass defect of the nucleus. Rather than define
mass in kilograms, it is usual to use atomic mass units for the masses of
nuclei. The conversion factor is:
1 atomic mass unit, u = $1.661 \times 10^{-27}$ kg.
The masses of protons, neutrons and electrons in atomic mass units are:
mass of a proton, $m_p$ = 1.007 276 u
mass of a neutron, $m_n$ = 1.008 665 u
mass of an electron, $m_e$ = 0.000 548 580 u.

The mass defect of a nucleus is the difference between the mass of the
constituent nucleons and the mass of the nucleus.

The mass of a deuterium atom, an atom of the isotope of hydrogen,
with a neutron as well as a proton in its nucleus, is 2.014 102 u.
Therefore, the mass of a deuterium nucleus is 2.014 102 − 0.000 549 =
2.013 553 u (the mass of the atom − the mass of the electron).
The total mass of an isolated proton and an isolated neutron would be
1.007 276 + 1.008 665 = 2.015 941 u.
If this proton and neutron combined to form a deuterium nucleus,
they would have to lose 2.015 941 − 2.013 553 = 0.002 388 u.
The mass defect of deuterium is 0.002 388 u and if a proton and a neutron
combined, energy equivalent to a mass of 0.002 388 u would be released.
If more nucleons could be added to build bigger nuclei, energy would
be released and the total mass defect would increase.

It is possible to convert the mass defect in atomic mass units to a mass in
kilograms and then use $E = mc^2$ to find the energy in joules that would be
released. This energy in joules can then be converted to an energy in
MeV. However, it is much easier to use the standard conversion factor where
the energy equivalent of a mass of 1 u is 931.5 MeV. On data sheets this is
stated as $1\ \mathrm{u} = 931.5\ \mathrm{MeV}/c^2$.
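
The two conversion routes described above can be compared with a short calculation. This is a sketch only: the speed of light and the joule-per-electronvolt factor are assumed standard values that are not listed in this section, and the variable names are chosen for illustration.

```python
# Values quoted in the text
U_TO_KG = 1.661e-27          # kg per atomic mass unit
U_TO_MEV = 931.5             # energy equivalent of 1 u, in MeV
M_PROTON_U = 1.007276
M_NEUTRON_U = 1.008665
M_ELECTRON_U = 0.000549
M_DEUTERIUM_ATOM_U = 2.014102

# Assumed standard values (not listed in this section)
SPEED_OF_LIGHT = 2.998e8     # m/s
JOULES_PER_MEV = 1.602e-13   # J per MeV

# Mass defect of the deuterium nucleus, in u
nucleus_mass_u = M_DEUTERIUM_ATOM_U - M_ELECTRON_U
mass_defect_u = (M_PROTON_U + M_NEUTRON_U) - nucleus_mass_u

# Long route: u -> kg -> joules (E = mc^2) -> MeV
mass_defect_kg = mass_defect_u * U_TO_KG
energy_joules = mass_defect_kg * SPEED_OF_LIGHT ** 2
energy_mev_long = energy_joules / JOULES_PER_MEV

# Short route: multiply by 931.5 MeV per u
energy_mev_short = mass_defect_u * U_TO_MEV

print(f"Mass defect: {mass_defect_u:.6f} u")
print(f"Energy via kilograms and joules: {energy_mev_long:.3f} MeV")
print(f"Energy via 931.5 MeV per u:      {energy_mev_short:.3f} MeV")
```

The two results agree to within the rounding of the constants used, which is why the single conversion factor is normally preferred.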
Binding energy
If we now tried to do just the opposite, that is, to split our deuterium
nucleus into an isolated proton and neutron, we would find that it was
not possible. There would not be sufficient mass for an isolated proton
and neutron to exist. If we really wanted to accomplish the separation, we
would have to provide the missing mass, the mass defect of the deuterium
nucleus. Somehow, we would have to supply energy to the deuterium
nucleus and have that energy converted into mass. The exact amount of
energy that would have to be converted into mass would be the energy
equivalent of the mass defect. We call this energy the binding energy of
the deuterium nucleus. The mass defect of the deuterium nucleus was
0.002 388 u. The equivalent energy is 0.002 388 × 931.5 = 2.224 MeV. The
binding energy of a deuterium nucleus is 2.224 MeV.

The binding energy of a nucleus is the energy equivalent of the mass
defect of the nucleus. It is the energy that would have to be
provided and converted to mass to enable all the nucleons in a
nucleus to be separated from each other.

If a proton and neutron combined to form a deuterium nucleus,
2.224 MeV of energy would be released:
${}^{1}_{1}\mathrm{p} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{2}_{1}\mathrm{H} + 2.224\ \mathrm{MeV}$.
If we wanted to split a deuterium nucleus into an isolated proton and
neutron, we would have to supply 2.224 MeV of energy that could be
converted into mass.
We saw in the previous section that as the number of nucleons in a
nucleus increased, so did the mass defect. This means that the total
binding energy of the nucleus must also have increased. The total binding
energy of a nucleus must be related to the stability of that nucleus, but it
is difficult to obtain useful information from the total binding energy.
However, the stability of the nucleus is indicated by the average
binding energy per nucleon. This gives a measure of how strongly an
average nucleon is bound to a particular nucleus. The graph of average
binding energy per nucleon against mass number (figure 24.15) shows
that the most stable nuclei have a mass number of about 50 to 60. The
most stable nucleus is, in fact, iron-56.

The average binding energy per nucleon is the total binding energy
of a nucleus divided by the number of nucleons in the
nucleus. It is a measure of the stability of the nucleus.

We can see from figure 24.15 that if we were able to join together
light nuclei, we would produce nuclei with a higher average binding
energy per nucleon and, hence, energy would be released. This is the
process of nuclear fusion. Some atomic masses for light nuclides are
given in table 24.2.

Figure 24.15 A graph of average binding energy per nucleon (MeV) plotted against mass number A

Also, if we were able to take a large mass number nucleus and split it in
two, we would produce two new nuclei with higher average binding
energy per nucleon than the original nucleus. Again energy would be
released. This is the process of nuclear fission. We will study this process
in chapter 25.



Table 24.2 Atomic masses for some light nuclides

ELEMENT AND ISOTOPE            NEUTRON     ATOMIC      ATOMIC       MASS
                               NUMBER, N   NUMBER, Z   MASS (u)     NUMBER, A
Hydrogen (${}^{1}_{1}$H)       0           1           1.007 825    1
Deuterium (${}^{2}_{1}$H)      1           1           2.014 102    2
Tritium (${}^{3}_{1}$H)        2           1           3.016 049    3
Helium (${}^{3}_{2}$He)        1           2           3.016 029    3
Helium (${}^{4}_{2}$He)        2           2           4.002 603    4
Lithium (${}^{6}_{3}$Li)       3           3           6.015 121    6
Lithium (${}^{7}_{3}$Li)       4           3           7.016 003    7
Beryllium (${}^{9}_{4}$Be)     5           4           9.012 182    9
Boron (${}^{10}_{5}$B)         5           5           10.012 937   10
Boron (${}^{11}_{5}$B)         6           5           11.009 305   11
Carbon (${}^{12}_{6}$C)        6           6           12.000 000   12
Carbon (${}^{13}_{6}$C)        7           6           13.003 355   13
Nitrogen (${}^{14}_{7}$N)      7           7           14.003 074   14
Nitrogen (${}^{15}_{7}$N)      8           7           15.000 109   15
Oxygen (${}^{16}_{8}$O)        8           8           15.994 915   16
Oxygen (${}^{17}_{8}$O)        9           8           16.999 131   17
Oxygen (${}^{18}_{8}$O)        10          8           17.999 160   18
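
The atomic masses in table 24.2 can be used to estimate the average binding energy per nucleon for several light nuclides. The sketch below follows the approach used in this chapter of comparing the atomic mass with the total mass of the atom's protons, neutrons and electrons. The function and variable names are illustrative, and only a few rows of the table are included.

```python
# Particle masses in atomic mass units, as quoted in the text
M_PROTON_U = 1.007276
M_NEUTRON_U = 1.008665
M_ELECTRON_U = 0.000549
U_TO_MEV = 931.5  # energy equivalent of 1 u, in MeV

# (name, neutron number N, atomic number Z, atomic mass in u) from table 24.2
NUCLIDES = [
    ("deuterium", 1, 1, 2.014102),
    ("helium-4", 2, 2, 4.002603),
    ("lithium-7", 4, 3, 7.016003),
    ("carbon-12", 6, 6, 12.000000),
    ("oxygen-16", 8, 8, 15.994915),
]

def binding_energy_per_nucleon(n, z, atomic_mass_u):
    """Average binding energy per nucleon in MeV, from an atomic mass."""
    # Mass of Z protons, Z electrons and N neutrons taken separately
    constituents_u = z * (M_PROTON_U + M_ELECTRON_U) + n * M_NEUTRON_U
    mass_defect_u = constituents_u - atomic_mass_u
    return mass_defect_u * U_TO_MEV / (n + z)

for name, n, z, mass in NUCLIDES:
    value = binding_energy_per_nucleon(n, z, mass)
    print(f"{name:10s} {value:5.2f} MeV per nucleon")
```

The values rise from about 1 MeV per nucleon for deuterium to nearly 8 MeV per nucleon for oxygen-16, with helium-4 standing out above its neighbours, which matches the shape of the left-hand side of figure 24.15.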

Determining mass defect and binding energy


SAMPLE PROBLEM 24.2 The mass of a helium atom is 4.002 603 u.
(a) Calculate the mass defect of the helium nucleus.
(b) Calculate the total binding energy of the helium nucleus.
(c) Calculate the average binding energy per nucleon of helium.

SOLUTION (a) The total mass of the constituents of a helium atom (two protons, two
neutrons and two electrons) is:
2 (1.007 276 + 1.008 665 + 0.000 549) = 4.032 980 u.
Mass defect = 4.032 980 − 4.002 603
= 0.030 377 u
(b) Binding energy = Mass defect × 931.5
= 28.30 MeV
(c) Average binding energy per nucleon = 28.30 / 4
= 7.08 MeV per nucleon



Energy change in nuclear reactions
Note: We use atomic masses when performing calculations to do with
nuclear reactions (even if the reaction involves particles such as alpha
particles that do not contain electrons). By doing so, we ensure that
we have accounted for the mass of the same number of electrons on each side
of the equation.

When we considered the formation of a deuterium nucleus from a
proton and neutron, we were really dealing with the nuclear reaction:
${}^{1}_{1}\mathrm{H} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{2}_{1}\mathrm{H} + \text{energy}$.
We know that energy was released because the product nucleus had less
mass than the two reacting nucleons.
We can treat any nuclear reaction in the same way. If the mass of the
products is less than the mass of the reacting nuclei, energy will be
released. The energy released will be the energy equivalent to the
decrease in mass.
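
The rule just stated can be written as a small helper. This is a minimal sketch: the function name `energy_released_mev` is illustrative, and the example uses the atomic masses of hydrogen and deuterium from table 24.2 together with the neutron mass given earlier. A positive result means energy is released and a negative result means energy must be supplied.

```python
U_TO_MEV = 931.5  # energy equivalent of 1 u, in MeV

def energy_released_mev(reactant_masses_u, product_masses_u):
    """Energy released (MeV) from the decrease in mass in a reaction.

    Masses are atomic masses in u. A negative value means the products
    have more mass than the reactants, so energy is absorbed.
    """
    decrease_in_mass_u = sum(reactant_masses_u) - sum(product_masses_u)
    return decrease_in_mass_u * U_TO_MEV

# Formation of deuterium: hydrogen-1 atom + neutron -> deuterium atom
M_HYDROGEN_U = 1.007825
M_NEUTRON_U = 1.008665
M_DEUTERIUM_U = 2.014102

deuterium_energy = energy_released_mev(
    [M_HYDROGEN_U, M_NEUTRON_U], [M_DEUTERIUM_U]
)
print(f"Energy released forming deuterium: {deuterium_energy:.3f} MeV")
```

The result agrees with the 2.224 MeV binding energy of deuterium found earlier, which also shows that using atomic masses on both sides gives the same answer as working with bare nuclei.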

Calculating the energy released in nuclear fission


SAMPLE PROBLEM 24.3 A possible fission reaction for uranium-235 is given below. Find the
energy (in MeV) released when one uranium-235 nucleus undergoes
such a fission.
${}^{235}_{92}\mathrm{U} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{139}_{57}\mathrm{La} + {}^{95}_{42}\mathrm{Mo} + 2({}^{1}_{0}\mathrm{n}) + 7({}^{0}_{-1}\mathrm{e})$

Atomic masses:
${}^{139}$La = 138.9061 u
${}^{95}$Mo = 94.9057 u
${}^{235}$U = 235.0439 u
SOLUTION Total mass of reactants = 235.0439 + 1.008 665
= 236.0526 u (to four decimal places)
Total mass of products = 138.9061 + 94.9057 + 2 × 1.008 665 + 7 × 0.000 549
= 235.8330 u (to four decimal places)
Decrease in mass = 236.0526 − 235.8330
= 0.2196 u
Energy released = 0.2196 × 931.5
= 204.6 MeV



CHAPTER REVIEW

SUMMARY
• Radioactivity was discovered in 1896. There were found to be three
different components of the radiation and these were termed alpha,
beta and gamma.
• It was found that transmutation occurred when an atom emitted an
alpha or beta particle and an atom of a new element was formed.
• In alpha decay, the mass number decreased by four and the atomic
number decreased by two. In beta decay, the mass number did not change
but the atomic number increased by one.
• The properties of the nucleus could not be explained by assuming it
contained protons and electrons and the existence of another nuclear
particle, the neutron, was predicted. Chadwick discovered this particle
in 1932.
• The spectrum of energies of beta particles emitted by nuclei seemed to
indicate that neither energy nor momentum was conserved in the emission
of beta particles. Pauli overcame this problem by predicting the
existence of another neutral particle, the neutrino. Fermi used this
prediction to explain beta decay.
• The emission of a beta particle involves overcoming the weak nuclear
force.
• The gravitational and electrostatic forces cannot hold the nucleons in
a nucleus together and another force which is incredibly strong over a
very short range, the strong nuclear force, holds nucleons together.
• The mass of a nucleus is less than the masses of its constituent
nucleons. This difference in mass is known as the mass defect and the
energy equivalent to this mass is called the binding energy of the
nucleus.
• The energy equivalent to the change in mass in a nuclear reaction is
emitted or absorbed in the reaction.

QUESTIONS
1. Even though radioactive decay was discovered before the nucleus was
identified, it was not thought to be a chemical reaction. State the
properties of radioactive decay that exclude it from being a chemical
reaction.
2. State the numbers of protons and neutrons in each of the following
nuclei:
(i) aluminium-27
(ii) lead-208
(iii) radon-220
(iv) polonium-218
(v) uranium-238.
3. What changes in atomic number and mass number result from the
emission of:
(a) an alpha particle?
(b) a beta particle?
(c) a gamma ray?
4. Write nuclear reactions for the following decays:
(a) polonium-214 emits an alpha particle
(b) thallium-210 emits a beta particle.
5. If nuclei do not contain electrons, how can beta decay be accounted
for?
6. Complete the following nuclear equations:
(a) ${}^{212}_{?}\mathrm{Pb} \rightarrow {}^{212}_{?}\mathrm{Bi} + {}^{?}_{?}?$
(b) ${}^{212}_{?}\mathrm{Bi} \rightarrow {}^{?}_{?}? + {}^{0}_{-1}\mathrm{e}$
(c) ${}^{?}_{?}? \rightarrow {}^{208}_{?}\mathrm{Pb} + {}^{4}_{2}\mathrm{He}$.
7. Complete the following nuclear equations:
(a) ${}^{27}_{13}\mathrm{Al} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{?}_{?}? + {}^{1}_{0}\mathrm{n}$
(b) ${}^{10}_{5}\mathrm{B} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{?}_{?}? + {}^{1}_{1}\mathrm{H}$
(c) ${}^{27}_{13}\mathrm{Al} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{?}_{?}? + {}^{1}_{1}\mathrm{H}$
(d) ${}^{?}_{?}? + {}^{4}_{2}\mathrm{He} \rightarrow {}^{35}_{17}\mathrm{Cl} + {}^{1}_{1}\mathrm{H}$
(e) ${}^{9}_{4}\mathrm{Be} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{?}_{?}? + {}^{4}_{2}\mathrm{He}$
(f) ${}^{22}_{11}\mathrm{Na} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{?}_{?}? + {}^{1}_{1}\mathrm{H}$.
8. What property of neutrons makes them particularly useful for producing
nuclear reactions?
9. Why was a particle that it was initially thought would be impossible
to detect predicted to be involved in beta decay?
10. (a) Why don’t the repulsive electrostatic forces between protons
cause the protons in a nucleus to fly apart?
(b) Why would the electrostatic forces between protons have a greater
chance of making a large nucleus rather than a light one break apart?
11. If it was possible for a nucleus of carbon-12 to absorb a neutron
into its nucleus and form carbon-13, calculate the amount of energy that
would be released.
The atomic masses are:
carbon-12 12.000 000 u
carbon-13 13.003 354 u.
12. When a proton is fired at a lithium nucleus, a nuclear reaction in
which two alpha particles are produced may occur:
${}^{7}_{3}\mathrm{Li} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{4}_{2}\mathrm{He}$.
Ignoring any initial kinetic energy of the lithium atom and the proton,
calculate the total kinetic energy of the two alpha particles after the
reaction.
The atomic mass of ${}^{7}_{3}\mathrm{Li}$ is 7.016 003 u.
Note: even though we are dealing with a proton and two alpha particles in
this reaction, we still use the atomic masses of hydrogen and helium. By
doing so, we ensure that we have accounted for the mass of the same
number of electrons on each side of the equation.
13. The first artificial nuclear transmutation which Rutherford
performed in 1919 was the reaction between an alpha particle and a
nitrogen nucleus. The alpha particles used in the experiment were emitted
from bismuth-214.
${}^{14}_{7}\mathrm{N} + {}^{4}_{2}\mathrm{He} \rightarrow {}^{17}_{8}\mathrm{O} + {}^{1}_{1}\mathrm{H}$.
(a) Use the masses provided to determine the change in energy (in MeV)
that occurred in the reaction.
(b) You should have noticed that energy was absorbed in the reaction.
What was the source of this energy?
The atomic masses are:
${}^{14}_{7}\mathrm{N}$ = 14.003 074 u
${}^{17}_{8}\mathrm{O}$ = 16.999 131 u
${}^{4}_{2}\mathrm{He}$ = 4.002 603 u
${}^{1}_{1}\mathrm{H}$ = 1.007 825 u.
14. Calculate the amount of energy released in the following nuclear
reaction:
${}^{14}_{7}\mathrm{N} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{15}_{7}\mathrm{N} + {}^{1}_{1}\mathrm{H}$.
The atomic masses are:
${}^{14}_{7}\mathrm{N}$ = 14.003 074 u
${}^{2}_{1}\mathrm{H}$ = 2.014 102 u
${}^{15}_{7}\mathrm{N}$ = 15.000 108 u
${}^{1}_{1}\mathrm{H}$ = 1.007 825 u.
15. The process of nuclear fusion occurring in the Sun involves a number
of steps but can be summarised in the equation:
$4({}^{1}_{1}\mathrm{H}) \rightarrow {}^{4}_{2}\mathrm{He} + 2({}^{0}_{1}\mathrm{e}) + 2\nu$.
How much energy is released in this process?
16. Is total binding energy or average binding energy per nucleon a
better indicator of the stability of a nucleus? Explain your answer.



CHAPTER 25
NUCLEAR FISSION AND OTHER USES OF NUCLEAR PHYSICS
Remember
Before beginning this chapter, you should be able to:
• recall that the strong nuclear force between nucleons
is a very strong but very short range force
• recall that energy will be released in a nuclear
reaction if there is a decrease in mass in that reaction
• recall that fission of a heavy nucleus releases energy.

Key content
At the end of this chapter you should be able to:
• describe Fermi’s attempts to produce transuranic
elements and explain why the interpretation of his
observations changed after the discovery of nuclear
fission
• describe Fermi’s first demonstration of a nuclear
chain reaction in an atomic pile in 1942
• compare the requirements for a controlled and an
uncontrolled nuclear chain reaction
• explain the basic principles of a fission reactor
• describe the use of a named isotope in each of the
fields of medicine, engineering and agriculture
• explain, by referring to the properties of neutrons,
why neutron scattering is used as a probe
• assess the significance of the Manhattan Project to
society.

Figure 25.1 An atomic bomb test. The first atomic test
took place at Alamogordo in New Mexico at 5:29:45 am,
16 July 1945. William Laurence, the official journalist and
only ‘outsider’ present at the test stated, ‘It was like the
grand finale of a mighty symphony of the elements,
fascinating and terrifying, uplifting and crushing, ominous,
devastating, full of great promise and great forebodings’.

25.1 ENERGY FROM THE NUCLEUS
In 1903, Rutherford was already aware of the vast amount of energy available
from radioactivity. He concluded a lecture by commenting that,
based on Pierre Curie’s experiments, each gram of radium gave out sufficient
energy in its lifetime to raise 500 tonnes a mile high. In 1916, he
commented on the possibility of a nuclear bomb: ‘Fortunately at the
present time we had not found out a method of so dealing with these
forces, and personally I am hopeful we should not discover it until man
was living at peace with his neighbour.’
In 1933, Rutherford made a famous statement that suggested that it
would be impossible to obtain energy from the nucleus, ‘It is a very
poor and inefficient way of producing energy, and anyone who looked
for a source of power in the transformation of atoms was talking
moonshine.’ Perhaps he had changed his mind from his earlier views
or perhaps he hoped that it would not be possible.

PHYSICS FACT
A chain reaction
Leo Szilard (1898–1964) recalled reading a report of Rutherford’s
1933 statement in The Times. After reading it he walked through
London, stopped at a red light on the corner of Southampton Row
and wondered if Rutherford might be wrong. He thought about
firing neutrons, not alpha particles, at a nucleus and realised as the
light turned green ‘that if we could find an element which is split by
neutrons and which would emit two neutrons when it absorbs one
neutron, such an element, if assembled in sufficiently large mass,
could sustain a nuclear chain reaction.’
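
Szilard’s idea can be illustrated with a very simple count. The sketch below is an idealised model, not a description of a real assembly: it assumes that every neutron released goes on to cause another fission and that each fission releases exactly two neutrons, as in the quotation. The function name is illustrative.

```python
def neutrons_in_generation(generation, neutrons_per_fission=2):
    """Number of neutrons in a given generation of an idealised chain reaction.

    Assumes one neutron starts the chain (generation 0) and that every
    neutron causes a fission releasing `neutrons_per_fission` new neutrons.
    """
    return neutrons_per_fission ** generation

for generation in (0, 1, 2, 10, 80):
    count = neutrons_in_generation(generation)
    print(f"Generation {generation:2d}: {count:.3g} neutrons")
```

Under these assumptions the number of neutrons doubles with every generation, so only a few dozen generations are needed for the count to become enormous.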

PHYSICS IN FOCUS
Sir Mark Oliphant
Sir Mark Oliphant (1901–2000), one of Australia’s great
scientists, worked with Rutherford at the Cavendish Laboratory
for the ten years before Rutherford’s death and described
it as the most wonderful time of his life. Sir Mark recalled
that in about 1934 or 1935, while Rutherford was absent
from the Cavendish Laboratory, he had performed some
experiments. The aim of these experiments was to see if it
was possible to get a net gain of energy by modifying an
experiment in which deuterium atoms were bombarded
with accelerated deuterium nuclei. He obtained a negative
result but when Rutherford returned and Sir Mark informed
him of the experiment, Rutherford was at first very angry. Sir
Mark suggested to Rutherford’s biographer, John Campbell,
that perhaps Rutherford was aware of the enormous
energy available and had hoped that energy would never
be able to be efficiently extracted from the nucleus.

Figure 25.2 Sir Mark Oliphant




Sir Mark Oliphant was later a member of the Manhattan Project that
developed the atomic bomb. It is ironic that Sir Mark worked on the
development of the fission bomb and had discovered the reaction that
occurs in a fusion bomb as he was opposed to the use of nuclear weapons.
Sir Mark Oliphant recalled of his work with Rutherford:

In this work, which we did together, we were able to discover two new
kinds of atomic species; one was hydrogen of mass 3 (tritium) unknown
until that time, and the other helium of mass 3, also unknown. These new
atoms were produced as a result of atomic transformations induced by our
ion beam hitting targets of lithium, beryllium and other materials.
Incidentally, at the same time, we were able to show that heavy
hydrogen nuclei, that is to say the cores of heavy hydrogen atoms, could
be made to react with one another to produce a good deal of energy and
new kinds of atoms. This particular reaction, which we discovered at
this time, is the basic reaction in the so-called hydrogen bomb.

Sir Mark modestly did not state that the work leading to these
discoveries was possible because he had constructed a particle
accelerator that was able to accelerate charged particles to the energies
necessary to induce the reactions. (The reactions could be induced with
earlier accelerators but his accelerator achieved higher reaction rates.)
The reaction induced between ‘the cores of heavy hydrogen atoms’ produced
helium-3 and a neutron.
${}^{2}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + {}^{1}_{0}\mathrm{n}$
In 1934, Sir Mark duplicated his accelerator and operated it while
Rutherford gave the first public display of a nuclear fusion reaction to a
Friday night meeting of the Royal Institution in London.
After the war, Sir Mark returned to the University of Birmingham in
England but in 1950 moved back to Australia as the Director of the
Research School of Physical Sciences at the newly established Australian
National University. He established the Australian Academy of Science
and was its first president (1954–1956).
He retired from ANU in 1967 and was appointed to a five-year term as
State Governor of South Australia in 1971. He retired to Canberra in 1976
and continued to promote science and technology and to foster the
development of Australian science until his death in July 2000.

25.2 THE DISCOVERY OF NUCLEAR FISSION
After the discovery of the neutron, Enrico Fermi (1901–1954) and his
co-workers in Rome led the world in neutron physics. They set out to
study the neutron bombardment of as many elements as possible.
With heavy elements there was often a delayed emission of a beta particle,
which resulted in the production of an element with a higher
atomic number. This happened with uranium (see figure 25.3) and
suggested that an element with an atomic number greater than 92 was
formed.

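Figure 25.3 Bombardment of uranium with neutrons may produce transuranic elements.
(The diagram shows ${}^{238}$U (92 p, 146 n) capturing a neutron to form
${}^{239}$U (92 p, 147 n), which emits an electron to become ${}^{239}$Np
(93 p, 146 n), which in turn emits an electron to become ${}^{239}$Pu
(94 p, 145 n).)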


PHYSICS FACT
Transuranic elements

It is possible to identify the products of neutron bombardment by
measuring the half-lives of the radiation they emit. In 1934, Fermi
reported that an activity with a half-life of 13 minutes was not due to
any known isotope of any element and must be due to an element with
an atomic number greater than 92. Elements with an atomic number
greater than 92 are known as transuranic elements.

Half-life is the time taken for half the radioactive nuclei in a sample
to decay. If we exclude the activity of daughter nuclei, it is the time
taken for the activity of a particular sample to drop to half
its initial value.

${}^{238}_{92}\mathrm{U} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{239}_{92}\mathrm{U}$
${}^{239}_{92}\mathrm{U} \rightarrow {}^{239}_{93}\mathrm{Np} + {}^{0}_{-1}\mathrm{e}$
${}^{239}_{93}\mathrm{Np} \rightarrow {}^{239}_{94}\mathrm{Pu} + {}^{0}_{-1}\mathrm{e}$

It is possible for a uranium-238 nucleus to capture a neutron and
then form plutonium-239 after emitting two beta particles.
(This reaction does occur and it is how plutonium can be produced
for use in reactors or weapons. However, after the discovery of nuclear
fission, it was realised that radiation originally attributed to transuranic
elements was really associated with isotopes of much lighter elements.)
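
The definition of half-life used in this box can be turned into a short calculation. The sketch below assumes the usual consequence of that definition, namely that the fraction of the initial activity remaining after a time t is (1/2) raised to the power t divided by the half-life. It uses the 13-minute half-life reported by Fermi as the example, and the function name is illustrative.

```python
HALF_LIFE_MINUTES = 13.0  # half-life of the activity reported by Fermi in 1934

def fraction_remaining(elapsed, half_life):
    """Fraction of the initial activity left after `elapsed` time.

    `elapsed` and `half_life` must be in the same units. The activity of
    daughter nuclei is ignored, as in the definition of half-life.
    """
    return 0.5 ** (elapsed / half_life)

for minutes in (0, 13, 26, 39, 60):
    fraction = fraction_remaining(minutes, HALF_LIFE_MINUTES)
    print(f"After {minutes:2d} min: {fraction:.3f} of the initial activity")
```

Measuring how quickly the activity falls in this way gives the half-life, which acts as a signature for the isotope producing the radiation.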

Fermi’s neutron bombardment of uranium


We now know that when Fermi and his associates bombarded uranium
with neutrons, they must have produced nuclear fission. The production
of transuranic elements would also have occurred but the observation of
radioactivity that Fermi took as evidence for the production of transuranic
elements was certainly due to the production of fission products.
In his biography of Fermi, Emilio Segré (1905–1989), a co-worker and
also a Nobel prize winner, comments: ‘As is well known, the discovery of
fission required a reappraisal of the work on the radioactivities induced
in uranium. It has occasionally been said that Fermi’s was the only Nobel
Prize awarded for an unsubstantiated discovery – on the assumption
that discovery of the transuranic elements was the reason for the prize. It
is clear from the presentation statement, however, that this was not the
case’ (see the next section, page 478).
The first observations that confirmed nuclear fission were made by Otto
Frisch using an ionisation chamber to detect the particles emitted (see page
478). Interestingly, Fermi and his group performed a similar experiment
in early 1935 when trying to detect alpha radiation from their supposed
transuranic elements. They were able to detect beta radiation but were
unable to detect the alpha radiation they thought should be emitted.
Segré recalls that they reasoned that the alpha particles emitted from the
transuranic elements would be more energetic than those emitted from
uranium. They set up a sample of uranium in front of an ionisation
chamber and irradiated the uranium with neutrons from a source surrounded
by paraffin. They did not want to detect the natural alpha radiation
from the uranium, so they covered the uranium with aluminium
foil. They hoped this would stop the alpha particles emitted from the
uranium but would not stop the higher energy alpha particles they
expected to be emitted from their new elements. They did not manage to
detect anything at all and Segré recalls that the aluminium foil prevented



them from observing the large ionisation pulses associated with fission.
He also states that even if they had observed these pulses, it is impossible
to say if they would have interpreted them correctly.

Slow neutrons are more effective than fast neutrons
During the course of their neutron bombardment experiments, Fermi’s
co-workers, Edoardo Amaldi (1908–1989) and Bruno Pontecorvo (1913–
1993) discovered that the same experiment yielded different results
when performed in different parts of the same room. Most noticeably,
the activity was much greater after substances had been irradiated with
neutrons on a wooden table rather than on a marble table. The results
did not make sense.
Fermi set out to investigate this strange phenomenon and planned to
use a block of lead between the neutron source and the target. He had it
carefully machined, not his usual custom, and then at the last minute
changed his mind and substituted a rough piece of paraffin. The result
was a spectacular increase in the intensity of the activation.
Fermi had discovered that slow neutrons were much better at
irradiation than fast neutrons. Neutrons did not need to have a large
energy to closely approach a nucleus because there was no electrostatic
repulsion of the neutron. Neutrons travelling at low speeds spent more
time in the vicinity of the nucleus and hence had a better chance of
being captured. (This is associated with the de Broglie wavelength of the
neutrons. The slower neutrons have a much longer wavelength and
hence have a much greater possibility of capture by a nucleus.) The
action of the paraffin is similar to that of a moderator in a nuclear
reactor (see page 486).
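
The remark about the de Broglie wavelength can be made concrete by comparing a slow and a fast neutron. This is a sketch under stated assumptions: the kinetic energies chosen (0.025 eV for a slow neutron and 2 MeV for a fast one) are typical illustrative values that do not come from this text, the constants are standard values, and the non-relativistic relation λ = h / √(2mE) is used.

```python
import math

# Assumed standard constants
PLANCK_CONSTANT = 6.626e-34   # J s
NEUTRON_MASS = 1.675e-27      # kg
JOULES_PER_EV = 1.602e-19     # J per eV

# Assumed illustrative kinetic energies (not from the text)
SLOW_NEUTRON_EV = 0.025       # a slow (thermal) neutron
FAST_NEUTRON_EV = 2.0e6       # a fast neutron

def de_broglie_wavelength_m(kinetic_energy_ev):
    """Non-relativistic de Broglie wavelength (m) of a neutron."""
    kinetic_energy_j = kinetic_energy_ev * JOULES_PER_EV
    momentum = math.sqrt(2.0 * NEUTRON_MASS * kinetic_energy_j)
    return PLANCK_CONSTANT / momentum

slow_wavelength = de_broglie_wavelength_m(SLOW_NEUTRON_EV)
fast_wavelength = de_broglie_wavelength_m(FAST_NEUTRON_EV)

print(f"Slow neutron wavelength: {slow_wavelength:.2e} m")
print(f"Fast neutron wavelength: {fast_wavelength:.2e} m")
print(f"Ratio slow/fast: {slow_wavelength / fast_wavelength:.0f}")
```

With these assumed energies the slow neutron has a wavelength thousands of times longer than the fast neutron, which is the sense in which the text describes slow neutrons as having a much longer wavelength.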
Fermi was awarded the Nobel prize in 1938 for ‘. . . your discovery of new
radioactive substances belonging to the entire race of elements and for the
discovery you made in the course of the selective power of slow neutrons.’
The winning of the Nobel prize was a most fortuitous event for Fermi.
Fermi and his Jewish wife, Laura, were granted permission to travel to
Sweden to accept the prize but then did not return to Italy and fled to the
USA. Many other Jewish physicists had escaped from Germany.
Near the end of 1938 Lise Meitner and her nephew Otto Frisch, two
others who had fled, realised that earlier experiments indicated the
fission of uranium with the potential release of a vast amount of energy.

Meitner and Frisch identify fission


Before fleeing Germany, Lise Meitner (1878–1968) had been working
with Otto Hahn (1879–1968) and Fritz Strassmann (1902–1980). They
studied the substances into which the heaviest elements transmuted
after neutron bombardment. While on a Christmas holiday in Sweden
with her nephew Otto Frisch (1904–1979), Meitner received a letter
from Hahn. Hahn wrote that he had identified barium rather than
radium as one of the products of bombardment of uranium with slow
neutrons.
Meitner replied to Hahn, ‘Your radium results are very amazing. A
process that works with slow neutrons and leads to barium! . . . To me for
the time being the hypothesis of such an extensive burst seems very difficult
to accept, but we have experienced so many surprises in nuclear
physics that one cannot say without hesitation about anything: “It’s
impossible”.’ When Frisch arrived they discussed Hahn’s letter. Later
Frisch summarised the discussion. They noted that there was insufficient



energy present to chip enough protons and alpha particles off a uranium
nucleus to produce barium, and no other particles had been observed. It
was impossible for a nucleus to be cleaved across. Then they began to
consider Bohr’s liquid drop model of the nucleus. They considered that
if the nucleus was like a liquid drop, it could become unstable and split
in two. The two parts would be forced apart by the electrostatic repulsion
between them. (The strong nuclear force has already been overcome
once the two parts are separated.) This process is shown in figure 25.4.
They calculated that the two parts would be forced apart at extremely
high velocities corresponding to an energy of about 200 MeV.
Figure 25.4 Stages in a fission process (a) A ${}^{235}$U nucleus absorbs a thermal neutron. (b) A
${}^{236}$U nucleus with excess energy is formed and oscillates. (c) The oscillations may cause a neck
to form but the strong nuclear force is still in control. (d) As the neck narrows, the electrostatic
repulsion begins to overcome the strong nuclear force. (e) The electrostatic force causes the
neck to break. (f) Now that the two fragments have been separated, there is no attraction but
only the electrostatic repulsion which forces the two fragments apart at high velocity.

Meitner was able to calculate the energy with the use of the mass
defects of the nuclei that she had memorised. She calculated that if the
uranium nucleus did indeed split in two, the mass of the products would
be approximately equal to the mass of the uranium nucleus minus one
fifth the mass of a proton. Using $E = mc^2$ they calculated that about
200 MeV would be released.
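
The size of this estimate can be checked with the conversion factor from chapter 24. The sketch below simply converts one fifth of the proton mass into MeV using 931.5 MeV per atomic mass unit. It is an order-of-magnitude check and not a reconstruction of the original calculation.

```python
M_PROTON_U = 1.007276  # mass of a proton in u (from chapter 24)
U_TO_MEV = 931.5       # energy equivalent of 1 u, in MeV

# Decrease in mass estimated for the fission: one fifth of a proton mass
mass_decrease_u = M_PROTON_U / 5
fission_energy_mev = mass_decrease_u * U_TO_MEV

print(f"Decrease in mass: {mass_decrease_u:.4f} u")
print(f"Energy released:  {fission_energy_mev:.0f} MeV")
```

The result is a little under 190 MeV, consistent with the figure of about 200 MeV.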
Their calculations confirmed the fission of uranium and explained the
origin of the barium. When the uranium nucleus split in two, barium was
one of the many elements that could be formed.
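The fission reaction that produces barium is:
${}^{235}_{92}\mathrm{U} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{236}_{92}\mathrm{U} \rightarrow {}^{141}_{56}\mathrm{Ba} + {}^{92}_{36}\mathrm{Kr} + 3\,{}^{1}_{0}\mathrm{n}$.
Another example of one of the many fission reactions that may occur is:
${}^{235}_{92}\mathrm{U} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{236}_{92}\mathrm{U} \rightarrow {}^{147}_{57}\mathrm{La} + {}^{87}_{35}\mathrm{Br} + 2\,{}^{1}_{0}\mathrm{n}$.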
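
Both fission equations can be checked for balance: the mass numbers and the atomic numbers must each add up to the same total on both sides. The following sketch does this check, with each particle written as a pair (mass number, atomic number). The function and constant names are illustrative.

```python
def is_balanced(reactants, products):
    """Check that total mass number A and atomic number Z are conserved.

    Each particle is a tuple (A, Z).
    """
    total_a_in = sum(a for a, _ in reactants)
    total_z_in = sum(z for _, z in reactants)
    total_a_out = sum(a for a, _ in products)
    total_z_out = sum(z for _, z in products)
    return total_a_in == total_a_out and total_z_in == total_z_out

NEUTRON = (1, 0)
U_235 = (235, 92)
BA_141 = (141, 56)
KR_92 = (92, 36)
LA_147 = (147, 57)
BR_87 = (87, 35)

barium_reaction_ok = is_balanced([U_235, NEUTRON], [BA_141, KR_92] + [NEUTRON] * 3)
lanthanum_reaction_ok = is_balanced([U_235, NEUTRON], [LA_147, BR_87] + [NEUTRON] * 2)

print("Barium reaction balanced:   ", barium_reaction_ok)
print("Lanthanum reaction balanced:", lanthanum_reaction_ok)
```

The same check is a useful first step when completing nuclear equations, since an unknown product is fixed by the mass number and atomic number needed to balance the totals.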

The first observations of fission


Frisch returned to Copenhagen and informed Bohr of their theory just as
Bohr was about to travel to the USA. A few days later, Frisch performed
the first experiment that actually confirmed the fission reaction. Uranium
was bombarded with neutrons and the particles emitted were passed into
an ionisation chamber. The highly energetic nuclei that were the products
of the fission of uranium were easily detected. The output from the
ionisation chamber was viewed on an oscilloscope.
News of fission reached the USA before Frisch and Meitner had
published their paper and many others performed experiments similar
to that of Frisch.
At Columbia University in New York, Herbert Anderson, a graduate student, and Fermi set out
to perform such an experiment. Their later recollections differed but it
seems that Fermi went to a conference in Washington and did not
actually observe the fission.



PHYSICS FACT
Fission
Otto Frisch demonstrated his fission reaction to many people.
One was an American biologist, William Arnold, who was
studying in Copenhagen. Frisch asked Arnold what biologists called it
when a bacterium split in two and Arnold replied that it was binary
fission. Frisch asked if he could use the term fission by itself and
hence the biological term associated with reproduction of cells
became the term used in physics for the process in the most destructive
weapon that has ever been used in war.

25.3 THE DEVELOPMENT OF THE


ATOM BOMB
eBook plus Weblinks: The Quebec Agreement; Manhattan Project files; Einstein’s letter to President Roosevelt

Although nuclear fission was recognised as a reality, the feasibility of an
atom bomb was still doubtful. Leo Szilard’s idea of a chain reaction was
the key but there was doubt about the mass of uranium required to make
a bomb. Szilard was sure that it would be possible and lobbied the US
government to proceed. In 1939 he approached Albert Einstein who
wrote a letter to President Roosevelt outlining his belief that America
should be actively researching the possibility of a nuclear bomb. This
letter has since become famous.
Others had also realised that a chain reaction might be possible.
Preliminary research had also taken place in England. Just a few hours
before Japan entered the war, the US government made a commitment
to expand its nuclear research. In 1942, the US government made an
agreement to work with the British government. The ‘Manhattan
Project’, where an atomic bomb was planned to be designed and
constructed, was commenced. Britain supplied the information it already
possessed but it was unclear how much information would be shared
after the war.

PHYSICS FACT
Early ideas of an atomic bomb
Otto Frisch returned to the University of Birmingham where he
worked with Mark Oliphant and another expatriate German
Jew, Rudolf Peierls (1907–1995). In 1940, Oliphant worked on the
top secret radar development, but Frisch and Peierls were both
technically enemy aliens and were banned from the radar work.
Frisch and Peierls, who later worked with Oliphant on the
Manhattan Project, did their own work on the atomic bomb. They
considered that the possibility of using pure uranium-235 as the fuel had
been overlooked and concluded that as little as 1 kg of uranium-235
would be suitable for a bomb.
About 40 kg of uranium was eventually used and it was about 89%
U-235.

480 FROM QUANTA TO QUARKS


The Manhattan Project
It was thought that there were two different pathways to an atomic bomb:
using uranium-235 as fuel or using plutonium-239 as fuel. (The
plutonium nucleus contains two protons and two neutrons more than a
uranium-235 nucleus.)
The problem was to produce sufficient quantities of each fuel. Not knowing
which would be better, the United States Government decided to
proceed with both methods. Vast plants were constructed to produce pure
U-235 from natural uranium and to produce Pu-239 from the neutron
bombardment of U-238.
The first nuclear reactor
eBook plus Weblink: Events at Stagg Field: The first atomic pile. An eyewitness account of the events at Stagg Field on 2 December 1942, by Corbin Allardice and Edward R. Trapnell.

The first man-made nuclear reactor, or atomic pile as it was then known,
was built in a squash court under a grandstand at Stagg Field in Chicago.
Enrico Fermi was the head of the group which designed and constructed
the reactor. During its design and construction, Fermi and his group
solved many of the problems associated with nuclear reactors. They took
out patents on ‘neutronic reactors’ but, after the war, assigned these
patents to the US government without compensation.
The aim was to see if it was possible to obtain a neutron multiplication
factor greater than one. This would mean that a chain reaction could
occur. If a chain reaction did occur, it would create a means of producing
plutonium for use as a fuel in atomic bombs.
Fermi realised that impurities in the uranium and the moderator
would result in the capture of some neutrons. This, in turn, would
prevent them from causing fissions. The question was whether sufficient
neutrons would produce fissions to enable a chain reaction to occur.
(Even before the success of this reactor, construction of full-scale reactors
designed to produce plutonium had commenced. These reactors were
constructed at Hanford in the state of Washington.)
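
The significance of a multiplication factor greater than one can be seen with a simple count of neutrons from one generation to the next. The sketch below is an idealised illustration only: the function name, the starting population of 1000 neutrons and the values of k are assumptions chosen for the example, and real reactor behaviour also depends on timing effects that are not modelled here.

```python
def neutron_population(k, generations, start=1000.0):
    """Neutron count after each generation for a multiplication factor k."""
    counts = [start]
    for _ in range(generations):
        # Each generation produces k times as many neutrons as the last.
        counts.append(counts[-1] * k)
    return counts

for k in (0.95, 1.00, 1.05):
    final = neutron_population(k, 100)[-1]
    print(f"k = {k:.2f}: {final:.3g} neutrons after 100 generations")
```

With k below one the population dies away, with k equal to one it is steady, and with k even slightly above one it grows without limit unless neutrons are absorbed.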
Fermi’s atomic pile contained 50 tonnes of natural uranium in the
form of 22 000 slugs. These were dispersed throughout 400 tonnes of
graphite which had been machined into 40 000 graphite bricks. Graphite
was used as the moderator (see page 486) because at that time it was the
only material that was available in sufficient quantity and of the required
degree of purity.

Figure 25.5 The first nuclear reactor. Few photographs were taken but paintings have been reproduced many times. Enrico Fermi slowly withdrew the control rods and the radiation produced was measured by Geiger counters and plotted on a chart recorder.



The test of the pile occurred on 2 December 1942 and is shown as a
sketch in figure 25.5 on the previous page. At 9.45 am they started slowly
withdrawing the neutron-absorbing control rods. After each six- or
perhaps 12-inch step, Fermi performed calculations on his slide rule and
predicted where the trace on the chart recorder would level off. At about
11.45 am the automatic safety rod, which had been set at too low a level,
was triggered. Fermi called a break for lunch and the process was
resumed at 2.00 pm. At 3.25 pm Fermi predicted that the trace would
now not level off and a few minutes later reported that the reaction was
self-sustaining. Twenty-eight minutes later the control rods were inserted
and the reaction was stopped.

Research at Los Alamos


The theoretical work on the atomic bomb was done at Los Alamos in
New Mexico where a new secret laboratory was built on the site of a boys’
ranch school about 100 km from Santa Fe.
The greatest scientists in the free world gathered at Los Alamos. They
all realised the terrible potential of the weapon that they were designing;
however, they were initially driven by the fear that German scientists, and
there were still famous atomic scientists in Germany, would build the
atomic bomb and give Hitler incredible power.
Once the project commenced, there was the same excitement that
accompanies any new scientific work, in this case magnified by the
presence of so many great scientific minds.
The smallest amount of fuel necessary to sustain a chain reaction is called the critical size or critical mass. As the size increases, the volume to surface area ratio increases and a smaller proportion of neutrons are lost from the fuel.

The problem was to assemble a mass greater than the critical mass in a
very short period of time. Different methods were used for the different
fuels. The plutonium bomb had normal explosives packed around a
sphere of plutonium. The plutonium was slightly less than a critical mass.
The explosives forced the sphere of plutonium into a smaller sphere of
super-critical mass (see figure 25.6). (The increase in density caused by
the compression from detonating the explosives resulted in the
formation of a super-critical mass.)
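
The link between size and neutron loss can be illustrated for a bare sphere, where the volume to surface area ratio works out to one third of the radius. The sketch below assumes a spherical piece of fuel; the radii are arbitrary illustrative values and say nothing about actual critical sizes.

```python
import math

def volume_to_surface_ratio(radius):
    """Volume divided by surface area for a sphere (metres in, metres out)."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    surface = 4.0 * math.pi * radius ** 2
    return volume / surface  # simplifies to radius / 3

for r in (0.02, 0.04, 0.08):
    print(f"radius {r:.2f} m: ratio {volume_to_surface_ratio(r):.4f} m")
```

Doubling the radius doubles the ratio, so neutrons are produced throughout a volume that grows faster than the surface through which they can escape.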

Figure 25.6 A diagram of an implosion bomb that is similar to ‘Fat Man’, the bomb dropped on Nagasaki on 9 August 1945 (a) A slightly sub-critical mass of plutonium is surrounded by a shell of explosives (b) Detonation of the explosive compresses the plutonium, producing a super-critical mass and then a nuclear explosion. Diagram labels: (a) Before firing; (b) Detonation; trigger control; detonator; container; chemical explosive; subcritical mass of Pu-239.

The uranium bomb was simpler with two sub-critical masses of
uranium being fired into each other to assemble the super-critical mass
(see figure 25.7). This was known as the gun method.
The implosion method for plutonium was tried in the first test of an
atomic bomb, the Trinity test at Alamogordo in New Mexico. It was not
thought necessary to test the gun method because it used a simple
method of assembling the super-critical mass. Therefore, the uranium
bomb was dropped without a test.



Figure 25.7 A diagram of a gun-style bomb that is similar to ‘Little Boy’, the bomb dropped on Hiroshima on 6 August 1945 (a) Two separated sub-critical masses of nearly pure uranium-235 (b) Detonation of the explosive charge forces the two sub-critical pieces together at high speed (c) A super-critical mass is rapidly assembled and a nuclear explosion occurs. Diagram labels: (a) Before firing; separating tube; two pieces of enriched uranium-235, both less than the critical mass; explosive charge; (b) Detonation; (c) Compressed piece of uranium created which exceeds critical mass.

Speculation remains as to the state of the German atomic bomb project.
Werner Heisenberg, who was in charge of the German project, actually met
Niels Bohr in Copenhagen in 1941 before Bohr fled to the USA.
Apparently, Bohr believed that Heisenberg did not realise that a small critical
mass was possible if U-235 was used. Bohr thought that Heisenberg was
building something more like a nuclear reactor although he did gain the
impression that there had been a major German nuclear effort.
After the end of the war with Germany, ten of the top German scientists
who were thought to have made contributions to nuclear weapon research
were taken to England and held in Farm Hall for many months. Their
conversations were bugged and transcripts indicate the surprise of the scientists
after the atomic bomb was dropped on Hiroshima. (In fact they thought at
first that it was simply a propaganda trick.) The transcripts then indicate
that the German scientists came up with their own version, or Lesart,
describing their view of atomic weapons during the war. This version was
that the German scientists had not wanted the atomic bomb either because
it was impossible to achieve during the expected duration of the war or
because they simply did not want to create an atomic bomb.

Heisenberg’s recollection of the 1941 meeting was conveyed to Robert Jungk
and printed in Jungk’s book Brighter than a Thousand Suns: A Personal
History of the Atomic Scientists. It is rather different from Bohr’s
recollection. Letters exist in the Bohr archives that Bohr wrote to Heisenberg
but did not send. One letter was found in the pages of Bohr’s own copy of
Jungk’s book. It was reported that Bohr took exception to Heisenberg’s
description of their meeting. In Jungk’s book, Heisenberg suggests that he
proposed that physicists should not work on such a weapon and that he
would see that German physicists would not build an atomic bomb if
Bohr would use his influence to stop Allied physicists from building one. The
conversation was spoken in guarded terms and it is possible that neither
really understood what the other was referring to. Michael Frayn’s play
Copenhagen deals with this meeting of Heisenberg and Bohr. The letters,
which were to be released in 2012 on the fiftieth anniversary of the death of
Bohr, were released in 2002 because of the intense interest created by the play.
As can be seen, Bohr made a number of attempts to write to Heisenberg but did
not actually send him a letter.

Views of some Manhattan Project physicists
Work on the atomic bomb did not stop after Germany was defeated. It is
not hard to see that the Manhattan Project scientists wanted to see the
conclusion of their work.
Leo Szilard saw no need to continue after the defeat of Germany and,
in an attempt to stop the bomb being dropped, organised a petition to
President Truman. (It was Szilard who had played an important part in
starting the bomb project and had lobbied Einstein to convince
President Roosevelt of the need for the project.)
After the successful test of the first plutonium bomb at Trinity in 1945,
many scientists did not want to see it used as a weapon. There were
attempts to stop it from being used. Some wanted to invite Japanese
leaders to a demonstration of the weapon. However, it was too late for
that; the politicians now had control.
eBook plus Weblinks: Documents from the Niels Bohr Archive; Leo Szilard online; Manhattan Project online

We have seen previously (page 476) that Sir Mark Oliphant was
strongly opposed to the use of nuclear weapons. This was a fairly
common view of Manhattan Project scientists. An interesting view of
many of the scientists was that the atomic bomb would see the end to war
as it would be too horrible to even contemplate a nuclear war. Others
were of the view that the secrets should be shared. Little did they know
that the Russians already knew the important details, as the result of
espionage and the cooperation of some Los Alamos scientists, and that a
nuclear arms race was about to begin.
Richard Feynman (1918–1988) recalled that after the Trinity test there
were parties, but one man, Bob Wilson, who was responsible for Feynman
becoming involved with the project, was just moping around. Wilson said
that they had done a terrible thing. Feynman later said, ‘You see, what
happened to me — what happened to the rest of us — is that we started
for a good reason, then when you’re working very hard to accomplish
something and it’s a pleasure, it’s excitement. And you stop thinking, you
know; you just stop. Bob Wilson was the only one who was still thinking
about it at that moment.’
One of the Manhattan Project physicists, Nobel prize winner Hans Bethe
(1906–2005), wrote an open letter in 1994 calling for scientists to cease
work on the production of weapons of mass destruction.

Open letter from Hans Bethe


As the Director of the Theoretical Division at Los Alamos, I participated
at the most senior level in the World War II Manhattan Project that
produced the first atomic weapons.
Now, at age 88, I am one of the few remaining such senior persons alive.
Looking back at the half century since that time, I feel the most intense
relief that these weapons have not been used since World War II, mixed
with the horror that tens of thousands of such weapons have been built
since that time – one hundred times more than any of us at Los Alamos
could ever have imagined.
Today we are rightly in an era of disarmament and dismantlement of
nuclear weapons. But in some countries nuclear weapons development still
continues. Whether and when the various nations of the world can agree to
stop this is uncertain. But individual scientists can still influence this
process by withholding their skills.
Accordingly, I call on all scientists in all countries to cease and desist
from work creating, developing, improving and manufacturing further
nuclear weapons; and, for that matter, other weapons of potential mass
destruction such as chemical and biological weapons.
Hans A. Bethe

25.4 NUCLEAR FISSION REACTORS


After the release of energy by an atomic bomb had been achieved, fission
reactors followed. The principles of a fission reactor are similar to those
of an atomic bomb except that the release of energy is controlled.
A fissile nucleus is a nucleus that may undergo fission.

In an atomic bomb the release of energy has to occur in an extremely
short time. All the fissile nuclei present must capture neutrons and
undergo fission. As many as possible of the neutrons produced in each
fission must produce further fissions. Because of the very short time
involved, slow neutrons are useless in a bomb. The fuel would be blown
apart before the neutrons could be slowed or captured.

Neutrons in a nuclear reactor


In a nuclear reactor that has reached its desired power level, one neutron
from each fission must produce another fission to maintain the reaction
at a steady rate.
We will examine the principles of the operation of fission reactors that
use uranium as their fuel.
If we consider the interaction between a neutron and a nucleus, the
neutron is either scattered, with the loss of some energy, or captured
(see figure 25.8). If it is captured by a fissile nucleus there is a
possibility of fission. Of course, another alternative is that the neutron may
simply escape from the fuel.
Natural uranium has 0.7% fissile U-235 and the remainder is non-fissile
U-238. If the percentage of U-235 is increased by the process of
enrichment, the amount of uranium needed as fuel can be reduced. Similarly, if
enriched uranium is used, the problem of the loss of neutrons is reduced
and less efficient moderators can be used.

Figure 25.8 Neutrons in a nuclear reactor (a) Some neutrons escape (b) High-energy neutrons released during one fission should collide with nuclei in the moderator, losing much of their energy as they do so. When their energy has been greatly reduced, they may be captured by a fissile nucleus and produce another fission. The zig-zag line in the last part of the path before fission indicates a thermal neutron. (c) Some neutrons are captured by non-fissile nuclei. (d) Some neutrons are captured by a fissile nucleus but do not produce fission. Diagram key: * fission; • neutron capture without fission. Diagram labels: moderator; coolant; fuel rod.

The problem of leakage of neutrons


Neutrons are highly penetrating and may travel a large distance
through matter before being captured, perhaps escaping from the fuel.
As the size of the fuel is increased, the proportion of neutrons that escape
will decrease, and when the critical mass or critical size is reached the
problem of leakage is overcome.

The problem of the energy of neutrons


Fission is induced most efficiently by low-speed or ‘thermal’ neutrons.
The fast neutrons produced in fissions should be slowed by a moderator
to enhance the probability of producing further fission.

The capture of neutrons


eBook plus Weblink: The Los Alamos Primer

The fact that low-speed neutrons rather than high-speed neutrons are
more likely to be captured by a uranium-235 nucleus is associated with
the wavelength of the neutron. A slow or ‘thermal’ neutron has an
energy of about 0.025 eV and hence has a de Broglie wavelength of about
$1.8 \times 10^{-10}$ m. It may still interact with a nucleus even if it passes this far
from it.
A 1.0 MeV neutron, however, has a de Broglie wavelength of only about
$2.9 \times 10^{-14}$ m and hence must make a very close approach to the nucleus
if it is to interact with it.
Robert Serber, who prepared the Los Alamos Primer, compared this to
an archer shooting at a target. It is as if for fast arrows, the target has a
diameter of one metre but for slow arrows, the target has magically
expanded to 13 metres in diameter!
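
The two wavelengths can be checked with the de Broglie relation λ = h/p, taking the momentum from the non-relativistic expression p = √(2mE). The sketch below is only an illustrative calculation: the rounded constants and the function name are assumptions introduced here, not values taken from the text.

```python
import math

PLANCK = 6.626e-34        # Planck's constant in J s (rounded)
NEUTRON_MASS = 1.675e-27  # neutron mass in kg (rounded)
EV = 1.602e-19            # joules per electronvolt (rounded)

def de_broglie_wavelength(energy_ev):
    """Non-relativistic wavelength (m) of a neutron with kinetic energy in eV."""
    momentum = math.sqrt(2.0 * NEUTRON_MASS * energy_ev * EV)
    return PLANCK / momentum

thermal = de_broglie_wavelength(0.025)  # thermal neutron
fast = de_broglie_wavelength(1.0e6)     # 1.0 MeV neutron
print(f"thermal: {thermal:.2e} m, fast: {fast:.2e} m, ratio: {thermal / fast:.0f}")
```

Because the wavelength varies as one over the square root of the energy, reducing the energy by a factor of 40 million lengthens the wavelength by a factor of several thousand.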



An unfortunate result of slowing the neutrons is that they must be
slowed through an energy range where they are likely to be captured
but then do not produce fission. Capture of neutrons with an energy in
this range can be reduced by not mixing the fuel and moderator
uniformly. The uranium fuel is present as uranium oxide in the fuel rods.
A neutron produced in a fission in a fuel rod should slow while in the
moderator and then re-enter another fuel rod where it may produce
another fission.

PHYSICS FACT
Moderators
A moderator should contain nuclei with a low mass. If a neutron undergoes
a collision with a mass much larger than itself, it will rebound elastically
from the mass (imagine a ping pong ball bouncing off a bowling ball).
Momentum and energy are conserved in the collision and the ping pong
ball rebounds with almost no change in energy. However, if an object
collides with another object of similar mass, it will pass all, or almost all,
of its energy on to the second object. This suggests that the best way to
slow down neutrons is to have them collide with protons. Unfortunately,
however, there is a high probability of the proton capturing the neutron.
The next best way is for the neutron to collide with a deuterium nucleus.
The chance of capture, and hence forming tritium, ${}^{3}_{1}\mathrm{H}$, is very low.
While this sounds ideal and is used in some reactors, there is the drawback
that heavy water, also known as deuterium oxide, is expensive to produce.
Carbon, in the form of graphite, is an alternative and, while not as efficient
as heavy water at slowing neutrons, it is economically viable.
Some nuclear reactors use water as a moderator and accept the fact that
some neutrons will be lost as they are captured by the hydrogen nuclei
(protons). Those reactors must use enriched uranium to compensate for
this loss of neutrons.
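
The ping pong ball argument can be made quantitative. For a head-on elastic collision with a nucleus at rest, conservation of momentum and kinetic energy give the fraction of the neutron's energy transferred as 4A/(1 + A)², where A is the ratio of the nuclear mass to the neutron mass. The sketch below assumes head-on collisions and approximates A by the mass number; glancing collisions transfer less, so these figures are upper limits.

```python
def max_energy_fraction_lost(mass_number):
    """Fraction of a neutron's kinetic energy passed to a stationary nucleus
    in a head-on elastic collision (mass ratio approximated by mass number)."""
    a = float(mass_number)
    return 4.0 * a / (1.0 + a) ** 2

for name, a in (("hydrogen-1", 1), ("deuterium", 2),
                ("carbon-12", 12), ("uranium-238", 238)):
    print(f"{name}: {max_energy_fraction_lost(a):.3f}")
```

The fraction falls steadily as the target nucleus becomes heavier, which is why light nuclei make the most effective moderators.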

Coolant
As we have seen, the products of a fission are fired apart with extremely
high kinetic energies. This kinetic energy is transferred to the atoms and
molecules in the reactor core as thermal energy. A coolant is used to
extract this thermal energy. If enriched uranium is used as the fuel, it is
possible to use ordinary water (under pressure) as the coolant. The
absorption of neutrons by the hydrogen nuclei in the water is
compensated for by increasing the percentage of U-235.
If heavy water is used as the moderator, it also performs the role of the
coolant. A pressurised water reactor (PWR) uses water under high
pressure as its coolant. A boiling water reactor (BWR) uses water, still
under pressure, but not enough to stop it from boiling.
A high temperature gas-cooled reactor (HTGR) uses helium gas as its
coolant.

Control rods
eBook plus Weblink: Nuclear control rods

Control rods are made of a neutron-absorbing material such as cadmium
or boron. They can be raised from the reactor core to increase the reaction
rate, or lowered into the core to decrease the rate or shut down
the reactor.
A reactor is critical when one neutron from each fission produces
another fission. Reactors are designed to be supercritical but are
maintained at the critical level by use of the control rods.



Figure 25.9 An advanced gas-cooled reactor (AGR). This reactor has a graphite moderator, uses enriched uranium oxide as fuel and is cooled by carbon dioxide gas that is pumped through the core. The core would contain about 120 tonnes of fuel and 1500 tonnes of graphite. Diagram labels: fuel loading holes; control rods; steam to turbines; pump; water in; container of pre-stressed concrete which creates a pressure vessel and biological shield.

Producing electricity
The coolant that passes through the reactor core passes through a heat
exchange unit where it heats coolant from another circuit. (As a safety
precaution, the coolant from the core would usually not pass outside the
main reactor building.)
The coolant from the second circuit would carry the thermal energy to
a boiler where it would heat water to produce steam to drive a turbine to
produce electricity (see figure 25.10 below).
Safety of nuclear reactors will always be of major concern, but
theoretically a nuclear reactor should be very safe. In an emergency, control rods
should shut down the reactor in a very short period of time. Some people
would argue that despite the problem of disposing of long half-life
radioactive wastes from a fission reactor, nuclear reactors are a far more
environmentally safe means of producing electricity than using power
stations that are fed by carbon dioxide-producing fossil fuels.
However, there is a stigma attached to nuclear processes.

Figure 25.10 A nuclear power station. Hot water under pressure from the reactor boils water in the steam generator and this steam drives a turbine, which in turn drives an electricity generator. Diagram labels: control rods; reactor core (moderator); reactor pressure vessel; coolant in; coolant out; water (hot); water (cool); water (high pressure); water (low pressure); pump; steam (high pressure); steam (low pressure); turbine; generator; electric power; steam condenser.



Radioactive waste products
Even if nuclear reactors were completely safe, there would still be a
problem associated with the radioactive waste products from the reactor.
If the fuel rods remain in the reactor core for about three years, only a
small proportion of the uranium nuclei present will have undergone
fission when the fuel rods are removed. The fuel rods will contain
uranium-235, uranium-238, and some plutonium-239 (formed when
uranium-238 absorbs neutrons). There will be other radioactive waste
products that must be stored safely for long periods.
Discharged fuel rods are usually stored under water at the reactor site
for a few months and then transported to a reprocessing plant. They are
probably stored again and then the fuel is dissolved in nitric acid and
separated into unused uranium, plutonium and other wastes. While large
volumes of low-activity waste are produced and stored until the activity
reaches a safe level, relatively small volumes of high-activity waste must be
stored safely for long periods.
eBook plus Weblink: Synroc

Several methods have been used. A vitrification process is used where
the radioactive wastes are incorporated into borosilicate glass for
immobilisation. This process has been developed in France over a long period
of time and is in use at reprocessing plants in the UK and France.
An Australian invention, Synroc, which was first developed at the
Australian National University in Canberra, may offer significant
advantages over the vitrification process in long-term performance and,
possibly, overall cost savings.

PHYSICS IN FOCUS
Chernobyl
There have been several nuclear accidents involving nuclear reactors. The
worst occurred at Chernobyl on 26 April 1986 when a reactor and
building caught fire and a large amount of radioactive material escaped to
the atmosphere.
The reactor was a type RBMK 1000 which was graphite moderated and
was cooled by boiling water. The design was such that the neutrons
were fully moderated by graphite. The reactivity was reduced because the
water in the coolant tubes absorbed neutrons. (The neutrons combined
with protons to produce deuterium.) The reactor was designed to operate
with a mixture of steam and water in coolant tubes. A ‘steam void’
captures fewer neutrons than a similar volume of water.
When the accident occurred the reactor was being run in far from normal
conditions. The output had been reduced to test whether the inertia of a
steam turbine, when isolated from both the driving steam supply from the
reactor and from the electricity grid, would be sufficient to run reactor
pumps for a short period. This was designed to improve reactor safety
procedures. (A few weeks earlier, a similar test of reactor safety had
been completed at a research reactor in the USA. In that test, the reactor
had shut itself down.)
The aim of the test was to reduce the reactor power to 1000 MW and shut
off the steam supply to one of the turbogenerators. Measurements
would then be taken to see how long the inertia of the turbogenerator
would provide enough electricity to drive four of the eight
water-coolant-circulating pumps. The other four pumps would be controlled
from the electricity grid in the normal manner.
A number of the safety features were overridden or bypassed to enable the
test to go ahead. These included the emergency core-cooling system being
rendered inoperable and the local control of the automatic reactor power
control rods being disconnected. The power was reduced but it was
impossible to stabilise at 1000 MW and the power fell to about 30 MW. The
tubes were full of water (no steam) and this reduced reactivity as the
maximum number of hydrogen nuclei were available to absorb neutrons.
In an attempt to increase reactor power, almost all auto and manual
control rods were raised as far as possible and this increased reactor
power to 200 MW. Raising all control rods was against standard operating
instructions and would have activated a safety trip reinserting control
rods but this safety trip mechanism had been rendered inoperable for
the test.
At 200 MW the steam was shut off to one of the turbogenerators. The
automatic safety system designed to shut down the reactor in event of
steam supply failure had been shorted out for the test. As the
turbogenerator ran down, the speed of the pumps feeding the water coolant
circuit fell and steam production in the pressure tubes increased. This
increased the reactivity of the system. (With more steam and less water in
the coolant tubes, the absorption of neutrons decreased.)
The reactor power increased from 200 MW to over 500 MW in three
seconds and continued to rise exponentially!
The operator attempted to close down by inserting all control rods but it
took 10 seconds to insert control rods and shut down. As the control
rods were inserted, the reaction continued and was concentrated at the
bottom of the core.
In this time, the pressure tubes had become void of water and the
superheated steam interacted with the zirconium fuel cladding, releasing
fuel and fission products. The superheated steam also interacted with the
zirconium pressure tubes producing hydrogen and rupturing the tubes.
There seem to have been two explosions. The first was when the steam
and fuel interacted, the explosion breaching the reactor building. The
second was caused by the interaction of the hydrogen (produced in the
steam zirconium interaction) combining with carbon monoxide and the
air that had entered the building after the first explosion. The incoming
air ignited the exposed graphite moderator, which was still near its
normal reactor temperature of 770°C.
The flames were eventually extinguished by sand dropped from helicopter
flights over the reactor building.
There was no nuclear explosion. The explosions were entirely chemical in
nature. There had previously been safety concerns with this type of reactor
and changes had been made to address those concerns. Those changes
were ignored in the test.
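
The phrase 'rise exponentially' can be given a rough scale. If the power followed P = P₀e^(t/τ) while rising from 200 MW to 500 MW in three seconds, the time constant τ follows from those two figures. The sketch below is purely illustrative arithmetic on the numbers quoted above; it assumes a single constant growth rate, uses 500 MW although the account says 'over 500 MW', and is not a reconstruction of the accident.

```python
import math

def e_folding_time(p_start, p_end, elapsed):
    """Time constant tau (s) for exponential growth from p_start to p_end."""
    return elapsed / math.log(p_end / p_start)

def power_after(p_start, tau, t):
    """Power after t seconds of growth with time constant tau."""
    return p_start * math.exp(t / tau)

tau = e_folding_time(200.0, 500.0, 3.0)  # figures quoted in the account
doubling_time = tau * math.log(2.0)
print(f"tau = {tau:.2f} s, doubling time = {doubling_time:.2f} s")
```

On these assumptions the power would double in a little over two seconds, which is short compared with the 10 seconds needed to insert the control rods.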

PHYSICS FACT
The number of naturally occurring elements
A common claim in school science textbooks is that there are 92 naturally
occurring elements. In fact the elements technetium (element 43) and
promethium (element 61) are not naturally occurring elements on Earth.
Many texts claim that uranium is the heaviest naturally occurring element
but that too is incorrect.
In 1972, an estimated two tonnes of plutonium was located in the bed of
the Oklo River in the Republic of Gabon when uranium deposits were
being mined. The existence of the plutonium has been explained by
considering that a ‘natural fission reactor’ produced the plutonium in
prehistoric times. The percentage of uranium-235 in natural uranium would
have been higher (uranium-235 has a shorter half-life than uranium-238)
and water flowing over the uranium would have acted as a moderator. Other
long half-life isotopes of elements that could have been produced in the
fission have also been identified, further supporting the natural reactor idea.
This leaves us with the current thinking that there are 91 elements
occurring naturally on Earth.
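
The claim that natural uranium once held a higher percentage of uranium-235 can be checked by running radioactive decay backwards. The sketch below is an estimate under stated assumptions: the half-lives are standard rounded values, not figures from the text, and the age of two billion years is an assumed illustration, since the text says only 'prehistoric times'. The present-day fraction of 0.7% is the figure given earlier in the chapter.

```python
HALF_LIFE_U235 = 7.04e8  # years (standard rounded value, assumed here)
HALF_LIFE_U238 = 4.47e9  # years (standard rounded value, assumed here)

def u235_fraction_in_past(fraction_now, years_ago):
    """Fraction of uranium atoms that were U-235 a given time ago."""
    n235_now = fraction_now
    n238_now = 1.0 - fraction_now
    # Going back in time, each isotope doubles once per half-life.
    n235_then = n235_now * 2.0 ** (years_ago / HALF_LIFE_U235)
    n238_then = n238_now * 2.0 ** (years_ago / HALF_LIFE_U238)
    return n235_then / (n235_then + n238_then)

past = u235_fraction_in_past(0.007, 2.0e9)  # assumed age: 2 billion years
print(f"U-235 fraction then: {past * 100:.1f}%")
```

On these assumptions the uranium-235 content comes out several times higher than it is today, comparable with the enriched fuel used in reactors moderated by ordinary water.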

25.5 MEDICAL AND INDUSTRIAL APPLICATIONS OF RADIOISOTOPES
Radioisotopes are used in a wide variety of ways in areas such as
medical imaging and treatment, preservation of food, measuring and
testing of materials and inspection of metal and welds. Some of the
properties of the radioisotopes discussed below are summarised in
table 25.1 (page 491).



Nuclear medicine
It has been estimated that in 1995, over 250 000 Australians underwent
procedures that involved the use of radiopharmaceuticals. Exploratory
surgery has become less common as new diagnostic techniques, many of
which are based on nuclear medicine, become readily available.
X-rays have long been used to examine the structure of the body, but
techniques associated with nuclear medicine are able to provide
information on the functions of the body.
Radioisotopes carried in the blood can help doctors detect clogged
arteries or check the function of the circulatory system. Some chemicals
collect in specific organs or tissues. Radioactive tracers that concentrate in
an organ or tissue enable an image of that organ or tissue to be formed.
(More information on the use of radioisotopes in medicine can be
found in the ‘Medical Physics’ module, pages 382–395.)
The radioisotopes used have short half-lives, which is an advantage for
the patient but means that they cannot be stored for very long in the
hospital. Technetium-99m is a very commonly used radioisotope. It has a
half-life of six hours and is produced through the decay of the radioisotope
molybdenum-99 which is formed in nuclear reactors. Gallium-67 and
thallium-201 are other commonly used radioisotopes. Gallium-67 is used in
the detection and localisation of tumours and thallium-201 is used in the
diagnosis of coronary artery disease and other heart conditions.
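The practical effect of a short half-life can be illustrated with a short calculation. The sketch below assumes simple exponential decay and uses the six-hour half-life of technetium-99m quoted above; the function name and the sample times are chosen here for illustration only.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of the original radioactive nuclei left after elapsed_hours."""
    return 0.5 ** (elapsed_hours / half_life_hours)


TC99M_HALF_LIFE_H = 6.0  # technetium-99m half-life, as quoted in the text

for hours in (6, 12, 24):
    left = fraction_remaining(hours, TC99M_HALF_LIFE_H)
    print(f"after {hours:2d} h: {left:.4f} of the sample remains")
```

After one day only about one sixteenth of the original sample remains, which is why such isotopes cannot be stored for long in a hospital.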
Radioisotopes are also used in therapeutic applications. When living
tissue is exposed to high levels of radiation, the cells may be destroyed or
damaged in a way that stops them from reproducing. Radioisotopes such
as cobalt-60 are used to destroy malignant tumours. Many types of cancer
are treated by radiation therapy.
Positron Emission Tomography (PET)
Positron Emission Tomography is a non-invasive means of producing
diagnostic images. The patient is usually injected with a metabolically
active tracer, a molecule that will be used by the body, which contains a
positron-emitting isotope such as carbon-11, nitrogen-13, oxygen-15 or
fluorine-18. Glucose labelled with carbon-11, which has a half-life of
20 minutes, can be used to study the brain.
The positron-emitting isotopes are prepared by bombarding the appro-
priate elements with protons in a cyclotron. (A cyclotron is a type of particle
accelerator.) Carbon-11 can be formed when nitrogen-14 is bombarded
with protons. This results in the emission of an alpha-particle:
$^{14}_{7}\mathrm{N} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{11}_{6}\mathrm{C} + {}^{4}_{2}\mathrm{He}$
In PET, after a positron is emitted, it will combine with and annihilate
an electron, usually after travelling less than a millimetre. This produces
two gamma rays that travel in opposite directions (see figure 25.11).
When two gamma rays are detected simultaneously by detectors on
opposite sides of the patient, they must have been emitted from the
line joining the detectors.
After about half a million such events have been recorded, a computer
is used to perform a tomographic reconstruction that can be
either two-dimensional or three-dimensional if multiple sections
have been taken. See chapter 20, pages 392–394 for more information
and illustrations about PET.

Figure 25.11 In PET, when an annihilation of a positron and an electron occurs, two
photons that travel in opposite directions are produced. As there is a complete ring of
detectors around the patient, the photons trigger detectors on opposite sides simultaneously.
After about half a million events, an image is constructed by computer.
(Diagram labels: detector, photomultipliers, gamma rays; annihilation of electron
and positron produces two gamma rays.)
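The energy carried by each annihilation photon follows from mass–energy equivalence. The sketch below is a minimal illustration, assuming the positron and electron are effectively at rest and that the energy is shared equally between the two photons. The electron mass is the value quoted in the chapter review questions; the conversion factor of 931.494 MeV per atomic mass unit is a standard value that is not given in this section.

```python
ELECTRON_MASS_U = 0.000548580  # mass of an electron (or positron) in u
MEV_PER_U = 931.494            # energy equivalent of 1 u in MeV (standard value)


def annihilation_photon_energy_kev(mass_u: float = ELECTRON_MASS_U) -> float:
    """Energy of each of the two photons, in keV, for annihilation at rest."""
    total_energy_mev = 2 * mass_u * MEV_PER_U  # rest energy of the pair
    return total_energy_mev / 2 * 1000         # shared equally, converted to keV


print(f"each gamma ray carries about {annihilation_photon_energy_kev():.0f} keV")
```

Because both photons have the same energy, they must also have equal and opposite momenta, which is consistent with their travelling in opposite directions.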

490 FROM QUANTA TO QUARKS


Industrial and agricultural applications
Industrial radiography is used to inspect metal parts and welds for
defects. Radiation from iridium-192 or cobalt-60 is beamed at an object
and the radiation that passes through is recorded on special photo-
graphic film. More radiation will pass through cracks or flaws and hence
these can be detected. An example of the use of this process is the radio-
graphing of jet engine turbine blades to ensure that the internal cooling
passages have been manufactured correctly.
Radioisotopes in gauges are used to monitor and control the thickness
of sheet metals, textiles, metal foils, paper and photographic film. The
amount of radiation passing through the material depends on its thick-
ness and density. Radiation is passed through the material as it is
processed and the amount of radiation detected indicates whether or not
the material is of the correct thickness.
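The way a gauge converts a detector reading into a thickness can be sketched as follows. This is an illustration only: it assumes the standard exponential attenuation law for a narrow beam, I = I0 exp(−μx), and the attenuation coefficient used is a hypothetical value, not a property of any material named in the text.

```python
import math

# Hypothetical attenuation coefficient (per mm); a real gauge would be calibrated.
MU_PER_MM = 0.05


def transmitted_fraction(thickness_mm: float, mu_per_mm: float = MU_PER_MM) -> float:
    """Fraction of the incident radiation that passes through the sheet."""
    return math.exp(-mu_per_mm * thickness_mm)


def thickness_from_fraction(fraction: float, mu_per_mm: float = MU_PER_MM) -> float:
    """Invert the attenuation law to recover thickness from a detector reading."""
    return -math.log(fraction) / mu_per_mm


target_mm = 2.0
reading = transmitted_fraction(2.1)  # the sheet is slightly too thick
measured_mm = thickness_from_fraction(reading)
print(f"measured {measured_mm:.2f} mm against a target of {target_mm:.2f} mm")
```

A thicker or denser sheet transmits less radiation, so a falling count rate signals that the rollers should be adjusted.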
Other uses of radioactive gauges include:
• monitoring roads, buildings and bridges
• use in the exploration for oil, gas and minerals
• detecting explosives in luggage at airports.
A very common application of a radioisotope is the use of americium-241
in smoke detectors. (Americium is another transuranic element. It has an
atomic number of 95 and is produced by the decay of plutonium-241.)
Agricultural uses of radioisotopes include the use of tracers in plant
nutrients. If phosphorus-32 or nitrogen-15 is used in fertilisers, data
about the rate at which the plant takes up the fertiliser can be obtained.
This, in turn, can yield information that will allow more efficient use of
the fertiliser.
The Sterile Insect Technique (SIT) involves irradiating male insects
reared in a laboratory to sterilise them. They are then released in large
numbers in infested areas. They mate with females but no offspring are
produced.

Table 25.1 Properties of some radioisotopes

NAME             EMISSION                                              HALF-LIFE
Phosphorus-32    beta (1710 keV) with range of 6 m in air              14.3 days
Cobalt-60        beta (318 keV) and gamma (1333 keV)                   5.3 years
Molybdenum-99    beta                                                  67 hours
Technetium-99m   gamma (140 keV)                                       6.03 hours
Iridium-192      beta (672 keV) and gamma (468 keV)                    73.83 days
Thallium-201     gamma (135 and 167 keV) and photons (68 to 80 keV)    73 hours
Americium-241    alpha                                                 432.7 years

Using radiation to preserve food is still a controversial use of
radioisotopes. A search of the internet will yield a number of sites vehemently
opposed to food irradiation. About forty countries have now approved
the irradiation of food.



Irradiation is most useful in the areas of:
• preservation
• sterilisation
• controlling airborne diseases
• controlling sprouting, ripening and insect damage.
Not all foods can be irradiated; some fruits become soft and dairy
products develop an unpleasant taste.

25.6 NEUTRON SCATTERING


Neutron scattering has become an important tool in many fields of study.
In Australia, neutron scattering investigations are carried out at the
Australian Nuclear Science and Technology Organisation, ANSTO, at Lucas
Heights in Sydney.
The main tools used to detect scattered neutrons are diffractometers
and spectrometers. Diffractometers are used to determine atomic and
molecular structure when there has been elastic scattering of neutrons.
Spectrometers are used when neutrons have been inelastically scattered
and information about quantities associated with atomic motion or
energy is required.
Neutron scattering has been used for research in fields such as
geology, environmental science, biology and biotechnology, engineering,
materials science, physics and chemistry.
X-rays are more intense and more common than neutron sources but
there are areas where neutrons have an advantage over X-rays. Some of
these advantages are listed below.
• The neutron has a wave nature. The de Broglie wavelength of a
thermal neutron is comparable to the spacing between atoms in
molecules. Neutrons scattered from an atomic lattice will therefore
produce interference patterns.
• The neutron has a magnetic moment, which makes it an ideal tool for
studying magnetic structures and materials.
• Neutrons have an energy similar to the vibrational energy of atoms in
solids and liquids. This enables neutrons to be used to study the
motion of atoms in molecules in detail.
• Neutrons can be used to study materials without causing destruction.
• Neutrons interact strongly with nuclei. The strength of the interaction
varies for different nuclei, which makes it possible to study isotopes of
light elements.
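The first advantage listed above can be checked with a short calculation of the de Broglie wavelength, λ = h/p. The sketch below assumes a non-relativistic neutron with a kinetic energy of 0.025 eV, a typical room-temperature thermal energy; that figure and the constants are standard values rather than values taken from this chapter.

```python
import math

PLANCK_H = 6.626e-34      # Planck's constant, J s
NEUTRON_MASS = 1.675e-27  # neutron mass, kg
EV = 1.602e-19            # joules per electronvolt


def de_broglie_wavelength_m(kinetic_energy_ev: float,
                            mass_kg: float = NEUTRON_MASS) -> float:
    """Non-relativistic de Broglie wavelength: lambda = h / sqrt(2 m E)."""
    momentum = math.sqrt(2 * mass_kg * kinetic_energy_ev * EV)
    return PLANCK_H / momentum


wavelength = de_broglie_wavelength_m(0.025)  # assumed thermal energy
print(f"thermal neutron wavelength is about {wavelength * 1e9:.2f} nm")
```

The result is a little under 0.2 nm, which is of the same order as the spacing between atoms in solids, so a crystal lattice behaves like a diffraction grating for thermal neutrons.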
The disadvantage of neutron scattering is that a nuclear reactor is
required to produce the neutrons.
The HIFAR reactor at ANSTO was replaced with a new reactor, OPAL,
in 2007. There were seven instruments used in neutron scattering investi-
gations when HIFAR was operating, and this was increased to nine when
OPAL started. However, all has not been plain sailing with OPAL. OPAL
first went critical in August 2006 and was operating at full power in
November 2006. HIFAR was permanently shut down in January 2007.
OPAL was officially opened by the then Prime Minister John Howard in
April 2007 but experienced problems in July 2007. Loose fuel plates
enabled water to seep into the heavy water in the reactor, so it was shut
down. It remained shut down for the rest of 2007 and was to be restarted
in 2008.



CHAPTER REVIEW

SUMMARY
• As early as 1903 it was realised by Rutherford
that a vast amount of energy was associated with
processes involving radioactivity.
• During the 1930s some physicists realised that
there was the possibility of a chain reaction
which would produce an uncontrolled release
of energy from the nucleus.
• Fermi and his group conducted research in
which neutrons bombarded heavy elements.
When slow or ‘thermal’ neutrons were used,
new isotopes, some of which were thought to be
of transuranic elements, were produced.
• The element barium was identified as being
present after uranium was bombarded with
slow neutrons. This was interpreted by Meitner
and Frisch as being evidence of the fission of a
nucleus of uranium.
• Refugee physicists from Germany were very
concerned about Germany developing nuclear
weapons and lobbied the US government to
undertake research into the possible development
of nuclear weapons.
• The Manhattan Project was commenced. It was
to produce the fuel for both a uranium bomb
and a plutonium bomb, and to complete the
design of an atomic bomb.
• During World War II, the greatest physicists in
the free world worked on the Manhattan Project.
Work on the atomic bomb continued after the
war with Germany ended and the first atomic
bomb was designed and constructed at Los
Alamos. It was tested successfully in July 1945.
• Against the wishes of many of those physicists,
atomic bombs were dropped on the Japanese
cities of Hiroshima and Nagasaki.
• Nuclear reactors, which used thermal neutrons
rather than the fast neutrons used in weapons,
were developed and used for generation of
electricity.
• In nuclear reactors, control rods are used to
control the rate at which fission reactions
occur. A controlled chain reaction in which, on
average, one neutron from each fission produces
another fission, releases energy which
ultimately is used to generate electricity.
• The debate about the safety and environmentally
sensitive aspects of nuclear power production
has continued unabated. Nuclear
accidents such as the one that occurred at Chernobyl
have provided ammunition for those
opposed to nuclear reactors.
• There are now many uses for radioisotopes,
some of which are produced in nuclear reactors
and some which are produced in particle accelerators
(such as a cyclotron).
• Medical imaging techniques that require radioisotopes
are providing new diagnostic techniques.
These techniques are replacing much
of the exploratory surgery that used to be
carried out.
• There are also industrial and agricultural techniques
that use radiation from radioisotopes.
• Neutron scattering provides another very useful
tool for examining the properties of materials.

QUESTIONS
1. Which has a greater mass, a uranium nucleus
before fission or the products after fission?
Explain.
2. What is the isotope formed when uranium-238
captures a neutron?
(i) The nucleus of this isotope is unstable
because it has an excess of neutrons. What
happens to it to allow it to restore stability?
(ii) What is the isotope eventually formed?
3. List the things that can happen to a neutron
produced in a fission of uranium-235.
4. State some reasons why a chain reaction does
not occur in a natural deposit of uranium.
5. Why was there such a large mass of graphite
(about 400 tons) in the first atomic reactor?
6. The strong nuclear force between adjacent protons
in a nucleus is very much greater than the
electrostatic repulsion between the protons.
However, Otto Frisch believed that the electrostatic
force was the force that accelerated the
product nuclei formed in a fission. Explain.
7. Explain how the chain reaction within a
nuclear reactor is maintained at a steady level.
8. Near the end of a three year period in a nuclear
reactor, most of the energy released from a
uranium fuel rod comes from the fission of
plutonium. Explain.
9. The RBMK nuclear reactor at Chernobyl was
cooled by boiling water and was designed to
operate with a mixture of liquid water and
steam in the coolant tubes.



(i) Explain how the rate of fissions taking place
would be affected by filling the coolant
tubes with liquid water (no steam). (This
happened before the Chernobyl accident.)
(ii) Explain how suddenly reducing the flow of
water through the coolant tubes would
contribute to the nuclear disaster. (Why
did the rate of fissions increase so dramatically
when the water in the coolant tubes
started to boil?)
10. The discovery of nuclear fission came after it
was confirmed that barium, and not radium,
was really present in samples of uranium that
had been bombarded with neutrons. Suggest
why it was so difficult to distinguish between
very small amounts of radium and barium. (A
periodic table may be helpful.)
11. Rutherford estimated that there was sufficient
energy in one gram of radium to raise 500
tonnes a mile high. Presumably, one gram of
uranium would contain similar energy, perhaps
enough to raise 500 tonnes, 1.6 km high.
The energy output from the first uranium
bomb was about 20 kilotonnes. The mass of
uranium in that bomb was about 40 kg.
(1 kilotonne is equal to $4.2 \times 10^{12}$ J.) How
accurate was Rutherford’s 1903 estimate?
12. Calculate the energy released in the reaction
of two deuterium nuclei as demonstrated in
Rutherford and Oliphant’s experiment.
$^{2}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + {}^{1}_{0}\mathrm{n}$
Masses:
$^{2}_{1}\mathrm{H}$ = 2.014 102 u
$^{3}_{2}\mathrm{He}$ = 3.016 029 u
$^{1}_{0}\mathrm{n}$ = 1.008 665 u
13. Fluorine-18 is a positron-emitting isotope used
in PET that can be prepared in a cyclotron by
bombarding oxygen-18 with a proton.
(i) Complete the nuclear reaction:
$^{18}_{8}\mathrm{O} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{18}_{9}\mathrm{F} + \,?$
(ii) What other very highly penetrating particle
would be produced in this reaction?
14. (i) Calculate the energy (in keV) of each of
the gamma rays produced in PET when a
positron and an electron annihilate each
other. (Any kinetic energy associated with
the positron and electron is so small that it
can be neglected.)
(ii) Explain why the two gamma rays that are
detected must travel in opposite directions.
(The mass of an electron and the mass of a
positron are identical and each is equal to
0.000 548 580 u.)
15. In 1994, Bertram N. Brockhouse and Clifford
G. Shull were awarded the Nobel Prize for
Physics for their work on developing neutron
scattering as an investigative tool. List the
properties of neutrons that make them such
an important tool for investigating properties
of matter in such a wide variety of fields.



CHAPTER 26
QUARKS AND THE STANDARD MODEL OF PARTICLE PHYSICS
Remember
Before beginning this chapter, you should be able to:
• recall how charged particles interact and move within
electric and magnetic fields
• recall the methods used by the pioneers of atomic
and nuclear research to detect ionising radiation
• define the properties of the strong nuclear force that
binds nucleons together in a nucleus.

Key content
At the end of this chapter you should be able to:
• describe how a cloud chamber can be used to detect
charged particles
• identify why particle accelerators are needed to probe
the structure of matter
• identify the contribution that particle accelerators
make to our understanding of the structure of matter
• recall the key features and components of the
Standard Model.
Figure 26.1 An aerial view of the Fermi National
Accelerator Laboratory (Fermilab) at Batavia, Illinois in
the USA. The main accelerator, 6.28 km in circumference, is
clearly visible as the circle in the top half of the photograph.
The circle in the lower half is the main injector ring. Some of
the other accelerator and storage rings are just visible near
the main building on the very left. Many important
discoveries have been made at Fermilab, with possibly the
greatest being the discovery of the top quark in 1995.

26.1 INSTRUMENTS USED BY PARTICLE PHYSICISTS
Many of the most important discoveries in physics made between fifty
and one hundred years ago were made, sometimes by accident, by a
single physicist working with a very simple apparatus. We have seen that
Fermi was awarded a Nobel prize for the work completed after he put a
rough piece of paraffin in the path of the neutrons that he was using to
irradiate a sample. We have also noted (chapter 22) the basic apparatus
used by Marsden in his desktop alpha-particle scattering experiment.
Rutherford and his co-workers used simple scintillation detectors to
observe the scattering of alpha particles. This method of detection
enabled them to count the alpha particles. However, despite the signifi-
cant results achieved with such simple apparatus, better detectors that
would provide information such as the charge and energy of the particles
were required.
These earlier physicists used alpha-particle sources that were naturally
occurring alpha-particle emitters. Some of these produced alpha par-
ticles of much higher energy than others. When alpha particles were
used to induce artificial radioactivity, it soon became apparent that par-
ticles with higher energies still would be more useful.
The quest for better detectors saw the use of the cloud chamber and
the development of the bubble chamber. In recent times, these were
superseded by larger and more complex multicomponent detectors.
Examples of these are the detectors used at the high-energy accelerator
facilities, such as CERN, Fermilab and Brookhaven.
The quest for higher energy particles saw the development of a variety
of particle accelerators. The higher energy particles from the particle
accelerators were used to bombard nuclei and produce a wide variety of
new particles.
As a result, the days of simple experiments are long since gone and
now discoveries and advances in the field of particle physics require very
expensive equipment and perhaps many hundreds of physicists working
together on a single project. The ‘Physics in focus’ section on the dis-
covery of the top quark (pages 510–511) provides an example of how
modern research is carried out.
We will look at the design and use of some of the particle detectors and
particle accelerators that have been used throughout the twentieth century.

Particle detectors
Both the cloud chamber and bubble chamber were very useful particle
detectors. In the following section, we will see how they were used to
detect particles, and examine some of the discoveries made using them.

Cloud chambers
The cloud chamber was invented by C. T. R. Wilson before the end of the
nineteenth century, but not used to detect particles until about 1910. It
remained in use until about 1960.

Note: There is usually a maximum concentration of a vapour that can be
present in air. (Humidity is the amount of water vapour in the air.
It is expressed as a percentage of the maximum amount of water vapour
that can be held by the air.) In a supersaturated vapour, the amount is
greater than 100%. This may seem strange; however, some vapours can
condense only if there are particles, such as dust or ions, on which
condensation can commence. The passage of an ionising particle
through a cloud chamber produces the ions on which the vapour can
condense.

A cloud chamber contains a supersaturated vapour (see the note
above). As ionising radiation passes through the vapour, fine droplets of
vapour form on the ions produced by the radiation. This leaves a visible
vapour trail showing the path of the particle. If the chamber is in a
magnetic field, the path of a charged particle will be curved, with the
direction of the curve indicating the charge of the particle.
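The link between the direction of curvature and the sign of the charge comes from the magnetic force on a moving charge, F = qv × B. The sketch below is a minimal illustration using unit vectors; the choice of axes (velocity along +x, field along +z, out of the page) is an assumption made here for the example.

```python
def cross(a, b):
    """Vector cross product a x b for 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def magnetic_force(charge, velocity, field):
    """Magnetic force F = q v x B (arbitrary units)."""
    return tuple(charge * component for component in cross(velocity, field))


velocity = (1, 0, 0)  # particle moving in the +x direction
field = (0, 0, 1)     # magnetic field directed out of the page (+z)

print("positive charge:", magnetic_force(+1, velocity, field))  # pushed towards -y
print("negative charge:", magnetic_force(-1, velocity, field))  # pushed towards +y
```

Particles of opposite charge travelling in the same direction are pushed to opposite sides, so their tracks curve in opposite senses. This is how a track made by a positron can be distinguished from one made by an electron.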



There are two main types of cloud chamber: the expansion cloud
chamber and the diffusion cloud chamber. Small versions are available
for use in school laboratories. There are various liquids that could pro-
duce a supersaturated vapour but propan-2-ol has been found to work
well in the cloud chambers used in school laboratories. A small amount
of the alcohol is usually soaked into a felt ring or disc and evaporation
provides the supersaturated vapour. (The features and peculiarities of
the different types of cloud chamber are covered in Practical activity 26.1,
page 518.)

PHYSICS FACT
Discoveries made with a cloud chamber
In 1919, Rutherford demonstrated the reaction of an alpha particle
with a nitrogen nucleus. In 1932, P. M. S. Blackett (1897–1974)
demonstrated the same reaction in a cloud chamber (see figure 24.8,
page 459).
In 1933, Carl D. Anderson (1905–1991) observed a track in a cloud
chamber that was made by a particle similar to an electron but with a
positive charge, hence discovering the anti-electron or positron.

Bubble chambers
In 1952, Donald Glaser invented the bubble chamber. It is
claimed that the observation of bubbles in glasses of beer
played a significant part in the invention.
The bubble chamber has a similar principle of operation
to the cloud chamber except that the bubble chamber con-
tains a superheated liquid. (A superheated liquid exists in the
liquid state at a temperature above its normal boiling point.)
Propane and pentane were used in early bubble chambers
and hydrogen in later ones. When ionising radiation passes
through the liquid, localised boiling occurs on the ions and
leaves a trail of bubbles. Bubble chambers were much better
detectors than cloud chambers because of the greater density
of the substance in the chamber. A 10 cm bubble chamber
was approximately equivalent to a 10 m cloud chamber.
Figure 26.2 shows a bubble chamber at CERN that was dis-
mantled in 1984 after being used for over six million
photographs.

Figure 26.2 The 3.7 m bubble chamber at CERN. Before being
dismantled in 1984, this bubble chamber was used for over six million
photographs.

Modern detectors
Detectors in use at large nuclear research facilities, such as
CERN, Fermilab and Brookhaven, are now larger and more
complex than bubble chambers. In typical high-energy experiments
performed at these facilities, multicomponent detectors
are used to record what may be millions of events and to
store them on computer for later analysis. The Collider Detector at
Fermilab is shown in figure 26.10 (see page 511).
The function of a detector is to record the trajectory, energy and
momentum of the particles produced in a collision ‘event’. If two
beams of particles, perhaps protons and antiprotons, with similar



energies collide head-on, the particles produced could travel in any
direction and a large cylindrical detector would be used. Such a
detector would commonly have four different regions. There would be
an inner tracking chamber surrounded in turn by an electromagnetic
calorimeter, then a hadronic calorimeter and finally a muon chamber
(see figure 26.3). The inner tracking chamber contains a gas, and as
the charged particles produced in the collision event traverse this
chamber, they produce ions. The ions may be collected on thin metal
wires and produce a small electrical pulse. Once the presence of the
ions has been detected, the tracks of the particles that produced the
ions can be deduced. Many very short-lived particles do not leave
tracks but they may decay into particles that do.
The calorimeters are made of dense materials, which absorb the
energy of the particles, interleaved with sensitive detector materials.
The different materials are segmented, so it is possible to determine
where a particle was finally absorbed. The electromagnetic calorimeter
is optimised to measure the energy and positions of electrons and
photons that interact via the electromagnetic force. The hadronic
calorimeter is optimised to measure the energy and positions of hadrons,
which interact via the strong force (see page 504 for a description of
hadrons).
Only muons (and neutrinos) are able to pass through the two
calorimeters. Any charged particle that reaches the outermost layer,
the muon chamber, must be a muon. Neutrinos, of course, continue
without interacting with any part of the detector. The passage of
different types of particle through a detector is shown in figure 26.4.

Figure 26.3 A simplified end-on view of a cylindrical detector that might be
used for a colliding beam experiment. (Labelled components: beam pipe (centre),
tracking chamber, E-M calorimeter, hadron calorimeter, magnetised iron, muon
chambers. Particle tracks shown: neutron, photon, electron, π±, proton, muon.)

Figure 26.4 The passage of particles through the different sections of a
multicomponent detector. (Layers shown, from innermost to outermost: tracking
chamber, electromagnetic calorimeter, hadron calorimeter, muon chamber.
Particles shown: photons, e±, muons, π±, protons, neutrons.)
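The way in which the layers of such a detector work together to identify a particle can be sketched as a simple lookup. The table below is a deliberately simplified reading of the description above: it assumes each particle either does or does not register in a layer, and it ignores the partial energy deposits that occur in a real detector. The names used are illustrative only.

```python
# Each key records: (track in tracking chamber, absorbed in E-M calorimeter,
#                    absorbed in hadron calorimeter, hit in muon chamber)
SIGNATURES = {
    (False, True, False, False): "photon",
    (True, True, False, False): "electron or positron",
    (True, False, True, False): "charged hadron (such as a pion or proton)",
    (False, False, True, False): "neutral hadron (such as a neutron)",
    (True, False, False, True): "muon",
    (False, False, False, False): "nothing recorded (for example, a neutrino)",
}


def identify(track: bool, em: bool, hadron: bool, muon: bool) -> str:
    """Return the particle type suggested by the pattern of detector signals."""
    return SIGNATURES.get((track, em, hadron, muon), "unrecognised signature")


print(identify(track=True, em=False, hadron=False, muon=True))
print(identify(track=False, em=True, hadron=False, muon=False))
```

Only charged particles leave a track in the inner chamber, so a photon and an electron, which are both absorbed in the electromagnetic calorimeter, can still be told apart.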



Particle accelerators
A very simple accelerator is the electron gun in the tube of a television
set. Electrons are accelerated across a large potential difference and
then directed at the screen of the television set. Some of the early
particle accelerators were similar to this in that they were single-stage
electrostatic accelerators. Higher energies were possible when the
particles were accelerated many times, such as in a linear accelerator.
The development of cyclic accelerators such as the cyclotron saw large
energies possible with smaller devices but, as we will see, modern
accelerators have become very large and complex devices.

The first particle accelerators


An early particle accelerator that was used to accelerate protons to
770 keV was developed by John D. Cockcroft and Ernest T. S. Walton at
the Cavendish Laboratory in 1932. It was an electrostatic machine that
gave the protons a single high energy ‘kick’. Another electrostatic accel-
erator is the Van de Graaff generator that you have probably encoun-
tered. Large versions were capable of reaching 1.5 MeV or higher.
These accelerators have since been improved as it became apparent
that many more important discoveries could be made with particles of a
higher energy, and new accelerators were developed. However, despite
these new accelerators, the earlier accelerators are still found to be
useful. It is interesting to note that at Fermilab, the initial step or pre-
acceleration is provided by a Cockcroft–Walton accelerator and at
Brookhaven, the initial acceleration of the Relativistic Heavy Ion Collider
(RHIC) is provided by tandem Van de Graaff accelerators.

Linear accelerators
The most famous linear accelerator is at the Stanford Linear Accelerator
Center (SLAC). Charged particles are fired through a three-kilometre-
long evacuated tube. The charged particles pass through one cylindrical
electrode and are then accelerated by an electric field as they pass
through a gap before encountering another electrode. This process is
repeated and the particles increase their energy. Of course, the alter-
nating accelerating potential has to keep in step with the particles and
this requires the cylindrical electrodes to become longer and longer (see
figure 26.5 on the next page). Eventually it becomes impractical to add
extra stages to a linear accelerator. At SLAC, electrons were accelerated
to a velocity very close to that of light.
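The need for progressively longer electrodes can be illustrated numerically. The sketch below assumes non-relativistic protons that start from rest and gain the same energy at every gap, with each drift tube taking half a period of the alternating potential to cross, so that its length is v/(2f). The gap voltage and frequency are hypothetical values chosen for the example, not the parameters of any real machine.

```python
import math

PROTON_CHARGE_C = 1.602e-19
PROTON_MASS_KG = 1.673e-27


def drift_tube_lengths(gap_voltage_v: float, frequency_hz: float, n_tubes: int):
    """Lengths (m) of successive drift tubes for a non-relativistic proton."""
    lengths = []
    for n in range(1, n_tubes + 1):
        kinetic_energy_j = n * PROTON_CHARGE_C * gap_voltage_v  # after n gaps
        speed = math.sqrt(2 * kinetic_energy_j / PROTON_MASS_KG)
        lengths.append(speed / (2 * frequency_hz))  # half a period inside the tube
    return lengths


# Hypothetical machine: 100 kV across each gap, driven at 10 MHz.
for number, length in enumerate(drift_tube_lengths(1e5, 1e7, 5), start=1):
    print(f"tube {number}: {length:.3f} m")
```

The lengths grow in proportion to the square root of the tube number. Once particles approach the speed of light their speed hardly changes, so in that limit the tube lengths level off even though the energy keeps rising.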

Cyclotrons
Like a linear accelerator, a cyclotron is able to give a charged particle
many ‘kicks’ as it passes through the electric fields between the ‘dees’
of the cyclotron (see figure 26.6 on the following page). Again the par-
ticles move through an evacuated region. The whole apparatus lies
between the poles of a large magnet. Therefore, the particles move in
circular paths, with the radii of the paths increasing each time the par-
ticle gains energy as it passes through the gap between the dees. When
the particles reach the limit of the magnetic field they are deflected
into a target. Very high energy cyclotrons are not possible for a number
of reasons. Eventually size would become prohibitive and also, as the
particles reached very high velocities, the relativistic increase in mass
would mean that the particles would become out of step with the applied
alternating potential.
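The loss of synchronisation can be seen from the orbital frequency of a charged particle in a magnetic field, f = qB/(2πγm), where γ is the relativistic factor by which the mass effectively increases. The sketch below is an illustration only; the 1.5 T field is a hypothetical value.

```python
import math

PROTON_CHARGE_C = 1.602e-19
PROTON_MASS_KG = 1.673e-27


def orbital_frequency_hz(b_tesla: float, gamma: float = 1.0,
                         charge_c: float = PROTON_CHARGE_C,
                         mass_kg: float = PROTON_MASS_KG) -> float:
    """Orbital frequency f = qB / (2 pi gamma m); gamma = 1 is the slow-particle limit."""
    return charge_c * b_tesla / (2 * math.pi * gamma * mass_kg)


B_FIELD_T = 1.5  # hypothetical field strength
for gamma in (1.0, 1.05, 1.2):
    frequency_mhz = orbital_frequency_hz(B_FIELD_T, gamma) / 1e6
    print(f"gamma = {gamma:.2f}: orbital frequency = {frequency_mhz:.2f} MHz")
```

At low speeds the frequency does not depend on the radius of the path, which is why a fixed-frequency alternating potential works. As γ rises the particles take longer to complete each orbit and arrive at the gap late.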



Figure 26.5 A linear accelerator. Charged particles are accelerated when they pass through
the gaps between the cylindrical electrodes, or drift tubes. It is necessary to keep the charged
particles in step with the radio-frequency high-voltage applied to the electrodes. Hence, the
drift tubes have to become progressively longer. The SLAC, which was built in 1967, is three
kilometres long and accelerates electrons to 20 GeV. After modifications were completed in
1987, the SLAC was able to produce 50 GeV electrons.
(Diagram labels: electron or ion source, vacuum tube, drift tubes, target, R/F power supply.)

Figure 26.6 A top view of a cyclotron. Positively charged particles from the ion source, S,
travel in semicircular arcs as they pass through the ‘dees’. They are accelerated by a high
voltage as they pass from one dee to the other and as their velocity increases, so does the
radius of their path. Finally, near the outside of the dees, they are deflected towards the
target. The dees are hollow cylinders of non-magnetic metal and the apparatus is in an
evacuated container between the poles of a powerful magnet. (Note that for positively charged
particles to travel clockwise as shown, the magnetic field must be directed out of the page.)
(Diagram labels: dees, ion source S, deflector plate, to target, high-frequency alternating voltage.)

Synchrotrons
The main accelerators today are synchrotrons. Synchrotrons keep the
particles in a path of constant radius. As the particles gain energy, the
magnetic field is increased to maintain the same path. Many powerful
magnets are required around this path. The particles move through a
small-diameter, evacuated tube that forms a large-diameter ring.



On each circuit around the ring, the particles pass through regions
where an applied radio-frequency electric field provides the ‘kick’ that
increases their energy. The radio frequency increases as the particles
gain energy and take shorter and shorter periods to complete their
orbits. A disadvantage of a synchrotron is that a ‘batch’ of particles
must complete its journey through the accelerator before another batch
can enter. However, the advantage of the synchrotron, in terms of the
energy that can be achieved, far outweighs this disadvantage.
The dimensions and statistics of the large accelerators are impressive.
The main accelerator, the Tevatron at Fermilab, has a circumference of
6.28 km. The original accelerator was able to accelerate protons to about
200 GeV. The magnets used in this accelerator are the light blue and dis-
tant red sections in the middle of figure 26.7. When higher energies were
required, another accelerator was built below the first. It uses super-
conducting magnets to steer the particles around the ring, and because
these do not heat up, they can be left on for very long periods. Another
accelerator in a separate ring now accelerates protons to 150 GeV, at
which point they are transferred to the new accelerator and accelerated
further to about 1000 GeV (1 TeV).
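The magnetic field needed to hold protons on a path of constant radius can be estimated from r = p/(qB). The sketch below is a rough, idealised estimate: it treats the ring as a perfect circle with a uniform field all the way round, and takes the momentum of a very high energy proton as E/c. A real ring has gaps between its bending magnets, so the actual field in those magnets must be somewhat higher than this estimate.

```python
import math

ELEMENTARY_CHARGE_C = 1.602e-19
SPEED_OF_LIGHT = 2.998e8  # m/s


def bending_field_tesla(energy_gev: float, circumference_m: float) -> float:
    """Uniform field needed to keep an ultra-relativistic proton on the ring."""
    radius_m = circumference_m / (2 * math.pi)
    momentum = energy_gev * 1e9 * ELEMENTARY_CHARGE_C / SPEED_OF_LIGHT  # p = E/c
    return momentum / (ELEMENTARY_CHARGE_C * radius_m)


RING_CIRCUMFERENCE_M = 6280.0  # the 6.28 km ring described in the text
for energy_gev in (150, 1000):
    field = bending_field_tesla(energy_gev, RING_CIRCUMFERENCE_M)
    print(f"{energy_gev:5d} GeV protons need about {field:.2f} T")
```

The field must rise in step with the energy, by a factor of almost seven between injection at 150 GeV and the final energy of about 1000 GeV. Fields of several tesla sustained for long periods are one reason superconducting magnets are used.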
The Large Electron Positron collider (LEP) at CERN occupied a
tunnel of 27 km circumference straddling the Swiss–French border. In
2000 it was shut down, and construction of its replacement, the Large
Hadron Collider (LHC) began. The LHC was scheduled to commence
operation in 2005 but that was delayed until 2007. Then in 2007 a
problem with one of the magnet support structures caused a further
delay until 2008. The LHC has 1232 superconducting magnets, each
15 m long, around 85% of its circumference. The magnets were supplied
by Fermilab. These magnets will be powered by superconducting cables
carrying currents of 12 000 amps and will be cooled by liquid helium to
−271°C. The LHC should be able to accelerate protons to 7 TeV and col-
lide these protons with other protons travelling in the opposite direction,
also with an energy of 7 TeV. It should be able to collide heavy ions, such
as lead, with a total energy of 1250 TeV, about thirty times that of the
RHIC at Brookhaven. (For more on the LHC and RHIC, see ‘Recent
developments in accelerator research’, page 512.)

Figure 26.7 Part of the 6.28 km-circumference Fermilab Tevatron
accelerator. The yellow and red sections on the floor are parts of
the new accelerator, which can accelerate protons to nearly 1000 GeV.

CHAPTER 26 QUARKS AND THE STANDARD MODEL OF PARTICLE PHYSICS 501


Fixed target collisions versus colliders
In collisions between high-energy particles and a target, conservation of
momentum and conservation of energy naturally apply. The momentum
of the products of any interaction must be the same as the momentum of
the incident particle. This means that the products of the interaction
have considerable kinetic energy and, hence, only a small amount of the
kinetic energy of the incident particles is available for the production of
new particles. When a 400 GeV proton from an accelerator is fired at a
target and then interacts with a particle in the target, the amount of
energy available for producing new particles is only 27 GeV. If the accel-
erator produces 1000 GeV protons, the energy available for producing
new particles will be only 42 GeV.
In the late 1970s, this problem was overcome by producing inter-
actions where the total momentum was very small. Instead of firing a
high energy particle into a stationary target, accelerators were modi-
fied to accelerate protons and antiprotons to the same high energies.
Carlo Rubbia (1934 – ), at CERN, was the driving force behind the
necessary modifications being made to produce a collider. Although it
was not an easy task at the time, antiprotons were produced in large
numbers, injected into the accelerator and accelerated simultaneously
with the protons. The antiprotons travelled around the accelerator in
the opposite direction to the protons. When they reached their
maximum energy the beams of protons and antiprotons were
deflected to intersect.
The total momentum of the proton and antiproton before inter-
action is close to zero and, hence, the total momentum of the products
of any interaction has the same near zero value. Therefore, if the pro-
tons and antiprotons were each accelerated to about 400 GeV, the total
energy available for the production of new particles should be 800 GeV.
(In fact, the 400 GeV accelerator at CERN was able to accelerate the
protons and antiprotons to about 260 GeV and this gave a total energy
of 520 GeV.)
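The contrast between the two arrangements can be checked with a short calculation. The sketch below is an illustration only: it assumes a proton striking a proton at rest, treats the quoted beam energy as the total energy of the proton, and uses the standard special-relativity result that the energy in the centre-of-mass frame is the square root of (2m²c⁴ + 2Emc²) for a fixed target and 2E for two equal beams meeting head-on. The function names are hypothetical.

```python
import math

PROTON_MASS_GEV = 0.9383  # proton rest energy in GeV

def fixed_target_energy(beam_energy_gev, mass_gev=PROTON_MASS_GEV):
    """Centre-of-mass energy when a beam proton hits a proton at rest."""
    return math.sqrt(2.0 * mass_gev ** 2 + 2.0 * beam_energy_gev * mass_gev)

def collider_energy(beam_energy_gev):
    """Centre-of-mass energy for two equal-energy beams meeting head-on."""
    return 2.0 * beam_energy_gev

for beam in (400.0, 1000.0):
    fixed = fixed_target_energy(beam)
    collide = collider_energy(beam)
    print(f"{beam:6.0f} GeV beam: fixed target {fixed:5.1f} GeV, "
          f"collider {collide:6.0f} GeV")
```

Under these assumptions the fixed-target values come out at roughly 27 GeV and 43 GeV, close to the figures quoted above; small differences depend on rounding and on how the rest energy of the protons is counted. The available energy grows only as the square root of the beam energy for a fixed target, but in direct proportion to it for a collider.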
Today, the highest energy accelerators are colliders, but there are still
many experiments performed with fixed target accelerators.
Fixed target accelerators can produce very large numbers of inter-
actions as the high-energy particles are fired into a dense target. They
also have the advantage of using secondary beams of particles that have
been formed by the interactions in the primary collision of the original
high-energy particles with the target. These secondary beams of particles
can be used to strike other targets and perform other experiments, or
even to form tertiary beams for other experiments. More than 15 dif-
ferent experiments were run simultaneously using the fixed target accel-
erator at Fermilab. Fixed target accelerators also have the advantage of a
wide range of target materials.
Although colliders have the obvious advantage of higher energies,
they produce a far smaller number of interactions and they are limited
by the choice of colliding objects (which have generally been electrons
and positrons or protons and antiprotons). This situation is changing as
the RHIC (Relativistic Heavy Ion Collider) came on line in 2000 and
the LHC (Large Hadron Collider) is now being developed at CERN
(page 512).
The Tevatron at Fermilab can be used for fixed-target research or con-
verted to form a storage-ring collider using protons and antiprotons. As a
collider, the Tevatron can accelerate the protons and antiprotons to
980 GeV for a total energy of 1960 GeV.



26.2 THE STANDARD MODEL OF
PARTICLE PHYSICS
In the previous sections we encountered the discovery of just a few of the
many sub-atomic particles. In the next section we will encounter some
more. Before 1970, well over 200 different particles had been discovered.
We note that some particles were predicted by new theories that were
being developed. With the discovery of so many sub-atomic particles, a
search was begun for a structure that would help to explain the existence
of the particles and perhaps even identify some truly ‘fundamental’
particles. The existence of quarks as fundamental particles was predicted
and the concept of quarks gained acceptance when the first quarks were
detected.
A model called the ‘Standard Model’ of particle physics was developed.
The Standard Model is a mathematical description of all known particles
and the forces between them. It enables us to explain all the behaviour of
these particles. A very close interplay between experimental and
theoretical physics prompted the development of the Standard Model. There
are still problems with some aspects of it, and as the 1999 Nobel prize
winner Gerard ’t Hooft (1946– ) has said, ‘We do admit that the model is
not absolutely perfect . . . however, the degree of perfection reached is
quite impressive’.

eBook plus Weblink: The Particle Adventure
This is an excellent web site on particle physics and the Standard Model.

New particles
In the early 1930s, the proton and neutron had been identified as the
constituents of the nucleus, and electrons were known to be in orbit
around the nucleus. With this knowledge the constituents of matter
seemed to have been identified. In 1933, the situation became a little
clouded when Carl Anderson discovered the positron and when Pauli
predicted the existence of the neutrino.
It was soon realised that more and more particles were awaiting
discovery.
In 1936, Anderson and Seth Neddermeyer observed tracks in a cloud
chamber that did not match those of electrons or protons. The tracks were
produced by cosmic rays and were too thin to be made by protons. The
particles that made them penetrated thick lead plates that would stop
electrons. It seemed that the tracks were made by particles that had a
mass between that of an electron and a proton. In fact the mass was
shown to be about 100 MeV/c². A new particle had been discovered. This
particle was originally called a mesotron but is now called a muon.
(Rather than use grams or kilograms for mass, nuclear physicists usually
use MeV or, more correctly, MeV/c² as their measure of mass. The masses of
some common particles are given in table 26.1.)

Table 26.1 Particle masses

PARTICLE        MASS (MeV/c²)
Proton (p)      938.3
Neutron (n)     939.6
Electron (e)    0.511
Muon (µ)        105.7
Pion (π+)       139.6

The muon at first appeared to be the particle that Hideki Yukawa
(1907–1981) had proposed in 1935 to explain the strong nuclear force.
He had predicted a particle with a mass about 200 times that of an
electron. However, there were problems with this idea and after the
development of a very fine-grained photographic emulsion in 1947, another
particle, the pion, with a mass of 140 MeV/c² was discovered. The pion was
the particle predicted by Yukawa. In fact, the pion disintegrated into a
muon and a neutrino.

Initially, the classification of particles was done by mass: leptons
(light), mesons (intermediate) and baryons (heavy).
The muon, being a particle of intermediate mass, was originally
called a mu-meson. This classification system has been changed and
particles are now classified in terms of their interaction. The muon is
not a meson but a member of the group called leptons (see page 504).
Particles are now classified as hadrons and leptons. Particles which
experience the strong force are called hadrons and are named after the
Greek word for strong. Particles that do not experience the strong force
are called leptons.



Developments leading to the Standard Model
By the early 1960s, about 100 particles had been discovered using new
accelerators and improved particle detectors. Attempts were made to
organise these particles and perhaps find an underlying structure.
The organisation of elements (as shown in the periodic table, see
Appendix 2, page 528) was understood when the details of atomic
structure, in terms of a nucleus and orbiting electrons, had been
discovered. Physicists at this time wondered whether a pattern would also
be discovered for hadrons. When Enrico Fermi was asked about the names of
some of these particles he made his famous response, ‘If I could remember
the names of all these particles I would have been a botanist’.
While we sympathise with Fermi’s view, we have to look at some of the
terms that are collectively assigned to different groups of particles.
Hadrons are particles (including mesons and baryons) that experience
the strong force. Leptons are particles that do not experience the strong
force. All leptons have half-integer spin and are called fermions. They
obey the Pauli exclusion principle.
Some hadrons are fermions, having a half-integer spin, and some are
bosons with either integer or zero spin. Bosons do not obey the Pauli
exclusion principle.

Hadrons
Hadrons are particles that experience the strong nuclear force. Mesons
and baryons are both hadrons.

Baryons
Baryons are hadrons that have half-integer spin (and are fermions).
Examples are the proton and neutron. Some other baryons are included in
table 26.3 on page 508.

Mesons
Mesons are hadrons that have zero or integer spins. Some of the mesons
with zero spin are included in table 26.3 on page 508.

Leptons
Leptons are particles which do not experience the strong nuclear force.
They are all fermions with half-integer spin. An electron is a lepton.

Fermions
Fermions are particles that have half-integer spins. They obey the
Pauli exclusion principle.

Bosons
Bosons are particles that have either integer or zero spin. They do not
obey the Pauli exclusion principle. Bosons are force-carrying particles.

PHYSICS FACT
The Pauli exclusion principle forbids two fermions from existing
in exactly the same quantum state. (The arrangement of electrons in
atoms is a reflection of the Pauli exclusion principle. Electrons, which
are fermions, cannot accumulate in the lowest energy state because they
cannot exist in the same quantum state.) However, it is a different
situation with bosons.
In 1995, physicists at Boulder, Colorado, managed to produce a Bose–
Einstein condensate in which about 2000 rubidium-87 atoms were confined
to a single quantum state of approximately zero energy. The rubidium-87
atoms are bosons which do not obey the Pauli exclusion principle, so
it is possible to have a large number in the same quantum state.
The Eightfold Way
In 1961, Murray Gell-Mann (1929– ) in the USA and Yuval Ne’eman
(1925–2006), an Israeli theorist in England, independently discovered a
method of organising particles. Gell-Mann called the method, perhaps a
little irreverently, by the Buddhist term ‘The Eightfold Way’.
The theory suggested that there was a missing particle. It was called the
Ω− (omega minus). In 1963, a search for this particle was started using
the bubble chamber at Brookhaven. The bubble chamber was about two
metres in diameter and contained liquid hydrogen. Every few seconds, a
burst of kaons (K-mesons) collided with protons (the nuclei of the atoms
of liquid hydrogen) in the bubble chamber. This produced a spray of
particles that, it was hoped, would include the Ω−. Eventually, in
photograph number 50 321, an event indicating the existence of the Ω− was
discovered. This photograph is shown in figure 26.8 along with a sketch
that shows the particles involved in the interaction.
This confirmed the Eightfold Way’s organisation of particles, but the
reason for this organisation was still unknown.

eBook plus Weblink: Bose–Einstein condensation

Gell-Mann had discovered that many particles could be organised into
families of eight or ten. He believed that his theory was related to the
concept of group theory from mathematics. The particles were graphed in
terms of certain quantum numbers that were given such exotic names as
‘isotopic spin’ and ‘strangeness’.



Quarks
In 1964, Murray Gell-Mann and George Zweig (1937– ), both from the
California Institute of Technology, but working independently, proposed
that there were three fundamental particles that were the constituents of
hadrons. Gell-Mann named these particles ‘quarks’.

Figure 26.8 The bubble chamber photograph taken at Brookhaven National
Laboratory that shows the path of the Ω− particle. The diagram at the
right identifies the particles responsible for the various trails. Dotted
lines show the paths of particles not visible in the photograph. The
incident K− particle (1) collided with a proton at (3). The short tail at
(3) was produced by the Ω− before it decayed to the π− and ultimately a
number of other particles of which some left trails in the photograph.

The idea for the existence of quarks came from the arrangement of
baryons and mesons in the patterns of the ‘Eightfold Way’.
The idea of quarks simplified the structure of hadrons. It was believed
that each baryon was composed of three quarks and that each meson was
composed of two quarks (a quark and an antiquark). However, there was
still doubt about the reality of quarks.
In the late 1960s and early 1970s, experiments conducted at SLAC
(Stanford Linear Accelerator Center) provided convincing evidence of the
existence of quarks. These experiments involved bouncing high-energy
electrons off protons in an attempt to discover the structure of protons.
In some ways this was using a similar principle to Rutherford’s scattering
experiment in which he bounced alpha particles off atoms to find the
structure of atoms. However, the energies involved in Rutherford’s
scattering experiments were much smaller.

George Zweig named his particles ‘aces’ while Murray Gell-Mann
named his ‘quarks’.
Gell-Mann had in mind the sound ‘kwork’ rhyming with squawk or pork.
He then came across the quote ‘Three quarks for Muster Mark’ from
Finnegans Wake by James Joyce. He decided that he could spell the name of
the particle ‘quark’ but pronounce it to rhyme with squawk. Gell-Mann
tried to reason why Joyce may have intended ‘quark’ to sound as he wanted
but from the quote it seems that it should rhyme with ‘Mark’.
Gell-Mann’s name stuck while Zweig’s was lost. Both pronunciations
of quark are accepted today.



In his book The God Particle, Leon Lederman (1922– ), Nobel prize
winner and a past director of Fermilab, presented the following extended
analogy for many puzzles in physics, particularly particle physics. He wrote
of a World Cup soccer match attended by some intelligent extraterrestrial
beings from the planet he calls Twilo. He describes these beings as similar
to humans, except that they cannot see objects with the sharp juxta-
positions of black and white, like zebras or soccer balls.
The Twiloans watch the soccer game and at first are totally mystified at
seeing people running around and doing many strange things for appar-
ently no reason, and also the reaction of the crowd at certain events. They
make charts of what is happening but nothing makes sense until one of them
suggests that there is an invisible ball involved. He has noticed that there is
a slight hemispherical bulge in the net of the goal just before the referee
blows his whistle, the crowd cheers madly and a point is added to the score.
Once this idea of a ball that is invisible to Twiloans is accepted, all the pre-
vious charts remain valid but now there is a meaning to all the events.
Quarks cannot be observed directly and a theory has been developed
that predicts that they cannot exist outside hadrons. However, like the
invisible soccer ball, other observations make sense only if the presence
of quarks is accepted.

Unification of the electromagnetic force and the weak nuclear force
The unification of the electromagnetic force and the weak nuclear force
resulted from breakthroughs by Steven Weinberg (1933– ), working in the
USA, and Abdus Salam (1926–1996), working independently in England. They
extended the idea of a force-carrying particle to the weak nuclear force and
argued that the weak and electromagnetic forces were really the same thing.
The force-carrying particle, or boson, for the electromagnetic force
was the photon. Weinberg and Salam proposed that there would be three
force-carrying bosons called W+, W− and Z0, for the weak interactions.
These bosons were termed intermediate vector bosons. These bosons were
very heavy, about one hundred times the mass of the proton. Ultimately
however, the electromagnetic and weak force were manifestations of the
same force that they called the electroweak force.
It was the search for these intermediate vector bosons that led Carlo
Rubbia to convert the 400 GeV accelerator at CERN to a collider (page
502). The total energy of 520 GeV from the collisions was then sufficient
to enable the discovery of the W particles. The theory predicted the mass
of the ‘W’s to be about 80 GeV.
In 1983, Rubbia worked with Simon Van der Meer (1925– ) and a group
of about 130 physicists and provided the final evidence of the electroweak
force with the discovery of the Z0 boson. In 1984, Rubbia and Van der Meer
shared the Nobel prize for the discovery of the W and Z particles.

Particles of the Standard Model


The particles of the Standard Model are quarks and leptons and are
shown in table 26.2 on the following page. Today it is accepted that there
are six flavours of quarks and six flavours of leptons. The description,
‘flavours’, just means different types of quarks or leptons. Quarks and
leptons can be divided neatly into groups called ‘generations’. All the
visible matter in the universe is composed of first-generation quarks and
leptons, the up and down quarks, and electrons.

eBook plus Weblink: The Standard Model



Table 26.2 The particles of the Standard Model

            LEPTONS                                                    QUARKS
GENERATION  NAME               SYMBOL  REST MASS (MeV)  ELECTRIC CHARGE  NAME     SYMBOL  REST MASS (MeV)  ELECTRIC CHARGE

I           Electron neutrino  νe      ≈0               0                Up       u       ≈5               +2/3
I           Electron           e       0.511            −1               Down     d       ≈7               −1/3
II          Muon neutrino      νµ      ≈0               0                Charm    c       1500             +2/3
II          Muon               µ       105.7            −1               Strange  s       ≈150             −1/3
III         Tau neutrino       ντ      <35              0                Top      t       170 000          +2/3
III         Tau                τ       1784             −1               Bottom   b       ≈5000            −1/3

Quarks
Gell-Mann and Zweig first proposed that hadrons were composed of only
three quarks. It was predicted that there were three different types of
quarks and these were called up, down and strange. Later it was
necessary to add more quarks and these became charm, discovered in
1974, bottom, discovered in 1977, and top, discovered in 1995.
In the strange language of particle physics, these types of quarks
became known as ‘flavours’. There are six different flavours of quarks,
up, u, down, d, strange, s, charm, c, top, t, and bottom, b. The top and
bottom quarks were sometimes referred to as ‘truth’ and ‘beauty’ but top
and bottom are now the accepted names. See ‘Physics in focus’ (pages
510–511) for an account of the discovery of the top quark.
The particles of the Standard Model are given in table 26.2 above. Quarks
possess charges that are either +2/3 or −1/3 of the magnitude of the charge
on an electron. A proton is composed of two up quarks, each of charge +2/3,
and one down quark with a charge of −1/3, giving the proton a charge of +1.
A neutron is also composed of up and down quarks, but one up and
two down quarks are required to produce a neutral particle.
Other combinations of quarks that form baryons (composed of three
quarks) and mesons (composed of a quark and an antiquark) are listed
in table 26.3 on the following page.
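The charge arithmetic for these combinations can be checked with a few lines of code. The sketch below is an illustration only: it takes the quark charges from table 26.2, expressed as fractions of the proton charge, and adds them for a given quark content. The function name and the use of a trailing ‘~’ to mark an antiquark are hypothetical conventions chosen for this example.

```python
from fractions import Fraction

# Quark charges in units of the proton charge, as listed in table 26.2.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}

def hadron_charge(quarks):
    """Add up the quark charges; a trailing '~' marks an antiquark,
    whose charge is opposite to that of the corresponding quark."""
    total = Fraction(0)
    for q in quarks:
        if q.endswith("~"):
            total -= QUARK_CHARGE[q[0]]
        else:
            total += QUARK_CHARGE[q]
    return total

print("proton  (u u d):", hadron_charge(["u", "u", "d"]))
print("neutron (u d d):", hadron_charge(["u", "d", "d"]))
print("pi+     (u d~): ", hadron_charge(["u", "d~"]))
```

Exact fractions are used rather than decimals so that sums such as 2/3 + 2/3 − 1/3 come out as whole numbers without rounding error.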

Leptons
The first known leptons were the electron, the muon and neutrinos.
Leptons are regarded as being fundamental particles.
In 1961, the alternating gradient synchrotron at Brookhaven was
used to bombard a beryllium target with 15 GeV protons. Among the
products were pions which then decayed into muons and neutrinos. A
steel barrier over 10 m thick was used to filter out all particles except
for the neutrinos. Studies of this beam of neutrinos showed that in the
unlikely event that they did interact with matter, it was always muons
and not electrons that were associated with the interactions. Therefore,
it was concluded that there were in fact two types, or flavours, of
neutrinos: electron neutrinos and muon neutrinos. In 1976, the
Stanford Positron Electron Annihilation Ring (SPEAR) provided evi-
dence of another lepton, named tau.
It was predicted that another neutrino, the tau neutrino, must be the
sixth lepton of the Standard Model, and this was accepted well before
there was any experimental evidence for it. In July 2000, an
international collaboration of scientists at Fermilab announced the
first direct evidence for the tau neutrino.

Table 26.3 Some hadrons and their properties


The composition of baryons and mesons. The particles in the table are
composed of combinations of two or three quarks given the names up, u, down,
d, strange, s and charm c. In the case of mesons, it is a combination of a quark
and an antiquark. Baryons are composed of three quarks.

PARTICLE   MASS (MeV/c²)   CHARGE RATIO (Q/e)   SPIN   QUARK CONTENT

Mesons
π0         135.0           0                    0      uū, dd̄
π+         139.6           +1                   0      ud̄
π−         139.6           −1                   0      dū
K+         493.7           +1                   0      us̄
K−         493.7           −1                   0      sū
η0         547.5           0                    0      uū, dd̄, ss̄

Baryons
p          938.3           +1                   1/2    uud
n          939.6           0                    1/2    udd
Λ0         1116            0                    1/2    uds
Σ+         1189            +1                   1/2    uus
Σ0         1193            0                    1/2    uds
Σ−         1197            −1                   1/2    dds
Ξ0         1315            0                    1/2    uss
Ξ−         1321            −1                   1/2    dss
∆++        1231            +2                   3/2    uuu
Ω−         1672            −1                   3/2    sss
Λc+        2285            +1                   1/2    udc



PHYSICS IN FOCUS
Boson force-carriers in the Standard Model
There are four fundamental forces, the electromagnetic force, the weak
nuclear force, the strong nuclear force and the force of gravity. The
Standard Model, at present, describes three of these forces, the
electromagnetic force and weak nuclear force (which are unified in the
electroweak force) and the strong nuclear force, in terms of
force-carrying particles called bosons.
As we saw on page 506, there are four boson force-carriers that have
been experimentally identified with the electroweak force. These are the
W+, W−, Z0 and photon. There are eight gluons that have been identified
that are associated with the strong force. The force of gravity is not yet
included but a boson, the graviton, has been predicted to be the
force-carrier for the force of gravity. However, it has not yet been
discovered. In the next sections we will encounter another feature of
quarks: the colour charge. We will see that colour charge is associated
with gluons and the strong force.

Colour: another property of quarks
Quarks are fermions, which are particles that have a spin of 1/2.
Fermions obey the Pauli exclusion principle, so it seems that it should
not be possible for baryons to consist of three particles all in the
same quantum state. The Ω− consists of three strange quarks, apparently
with two of these in identical quantum states. This appears to violate
the Pauli exclusion principle. Two quarks can share an otherwise
identical quantum state because they can have spins in opposite
directions, called spin up or spin down. If a third quark is added, it
must also have spin up or spin down and hence the Pauli exclusion
principle is apparently violated.
This difficulty was overcome by proposing that each quark has three
different varieties. These varieties were called ‘colour’ but this, of
course, has nothing to do with real colour. It is just another whimsical
name applied to quarks.
The colours are usually called red, green and blue and it was assumed
that the Pauli exclusion principle applied differently to each colour.
This then means that the Ω− consists of a red s, a green s, and a blue s,
so there is no longer a problem with the Pauli exclusion principle.
While individual quarks have colour, any particle composed of quarks
must have no net colour. A combination of a red, a green and a blue quark
results in a particle lacking in colour. Baryons always contain one quark
of each colour and, hence, are colourless.
While quarks have colour, antiquarks have anticolour. Mesons, which are
a quark–antiquark pair, are colourless because they have a quark for
colour and a corresponding antiquark for anticolour.

Gluons and the strong force
When we encountered the strong nuclear force in chapter 24 (page 466),
we noted that it involved the exchange of pions between nucleons. In the
Standard Model, the boson force-carriers for the strong force are gluons
that are exchanged between quarks. (The role of mesons in the force
between nucleons is really a more complex and secondary example of the
strong force.)
Gluons have no mass but they carry the ‘colour charge’. They actually
carry a colour and an anticolour. We are used to positive and negative
charges. With colour charge we are dealing with something similar to
positive and negative charge, but with three varieties rather than two.
The three different varieties are called the colour charges.

Table 26.4 The particles and boson force-carriers of the Standard Model.
Gluons have no mass but carry the colour ‘charge’. They carry a colour and
an anticolour. While the first pair of gluons shown appear to be both red
blue, one is red antiblue and the other is blue antired. It is not easy to
represent anticolours in a diagram!

(continued)



Figure 26.9 (a) The structure of π+, which contains an up quark and an
antidown quark. In this diagram the up quark is blue and the antidown
quark is antiblue. (b) The blue up quark emits a blue-antired gluon
leaving a red up quark. (c) The antiblue antidown quark absorbs the
blue-antired gluon and hence forms an antired antidown quark. The π+ now
has changed to having a red up quark and an antired antidown quark.

The study of the interaction of light and electrons is called quantum
electrodynamics (QED). The study of the colour force and interactions
involving gluons is called quantum chromodynamics (QCD).
Figure 26.9 shows the exchange of a gluon between the up and antidown
quarks of a π+ (a positively charged pion). The π+ can be red-antired,
green-antigreen or blue-antiblue. It remains a π+ when gluons are
exchanged between the up and antidown quarks, even though the colours of
the quark and antiquark change. As shown in (b) and (c) the colour of the
blue up quark and antiblue antidown quark will be changed to red and
antired when the blue up quark emits a blue-antired gluon that is absorbed
by the antiblue antidown quark.

PHYSICS IN FOCUS
The discovery of the top quark
The bottom quark was discovered in experiments conducted at Fermilab in
1977. It had a mass of about 5 GeV. After this discovery it was
predicted that another quark, the top quark, must exist and it was
thought that it would have a mass between 10 and 30 GeV. The Standard
Model predicted many of the properties of this undiscovered quark, but
did not limit its mass. By 1988, experiments at CERN had not discovered
the top quark and it was concluded that its mass must be greater than
41 GeV.
The Fermilab collider had been activated in 1985, and in 1988 and 1989
the CDF (Collider Detector at Fermilab) group were in intense competition
with the group at CERN. An energy of 77 GeV was reached without the top
quark being detected. Leon Lederman, the director of Fermilab during the
1980s, decided that some local competition would be a positive move.
Another group, DØ (pronounced ‘dee zero’), was created and began
recording data in 1992. CDF and DØ were international collaborations of
more than 400 physicists each. They also included large numbers of
engineers and technical staff.
The CDF and DØ collaborations constructed enormous, complicated
instruments to try to detect the ‘signature’ of the top as it passed
through the detector. The two groups had different approaches but
expected that if one group found any evidence of the top, the other
should be able to find supporting evidence. If a top and anti-top were
produced, each would decay almost instantly into a W and a bottom quark.
Hence, the top and anti-top would produce two Ws and a bottom and
antibottom. Unfortunately, neither the W nor the bottom or antibottom
could be observed directly. What was observed was a ‘jet’, a directed beam
of particles that travelled in roughly the same direction as the original
top quark.



In 1994, the CDF group had iso-
lated 12 events that may have
involved a top–antitop pair. There
was a possibility that the 12 events
were really just caused by background
events that gave the same signature as
the top-antitop. However, the group
estimated that 5.7 of such back-
ground events were to be expected
but that the probability of all the
events being background events was
less than 1 in 400.
The CDF group then estimated the
total energy of the particles in the
jets, and also the associated leptons,
and found that all of the energies
clustered in a small range around 175
GeV. If it had been background
events, a wider spread of energies
would have been expected.
CDF then had to write a paper to
satisfy its 400 members. This did not
prove to be an easy task. The paper
was submitted in April 1994.
The DØ group had focused their
research on the search for a lighter
particle and had little to support
CDF. DØ performed a re-analysis
based on the higher mass and found
promising results. When the final
presentation was made in March
1995, both CDF and DØ showed over-
whelming evidence for the top quark.
More events had been detected and
the probability that background
events could explain the data had
been reduced to less than 1 in
500 000.
How very different this is from the
work performed in the Cavendish
Laboratory during the first half of the
twentieth century!

Figure 26.10 The Collider Detector at Fermilab (CDF). The detector,
which is three storeys high, is shown with the modules of the central
calorimeter moved to the sides. The detector provides 70 000 channels of
data to computers.

The Standard Model — today and beyond


The Standard Model is a great achievement and a large number of
experiments have confirmed, sometimes to incredible precision, the
predictions of the Standard Model. However, it is acknowledged that
there are serious flaws with it. Some of those are listed below and on
the following page.
• It is incompatible with Einstein’s general theory of relativity. There-
fore, unification of the forces cannot involve the force of gravity. The
Standard Model is a quantum–mechanical model while general rela-
tivity is not.

• The Standard Model provides no reason for the numbers of particles.
Why are there six quarks and six leptons? Is it coincidence or is there
an underlying reason?
• Is there some underlying truly fundamental particle such as a
‘leptoquark’? This might explain why three leptons have electrical
charges of one unit and quarks have electrical charges of $+\frac{2}{3}$ or
$-\frac{1}{3}$ of this unit of electrical charge.
• The Standard Model does not have a mechanism to generate the
observed masses of particles. (This objection may disappear if the
Higgs boson is detected.)
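
The fractional quark charges mentioned in the list above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the quark content used in the chapter review (a proton is two up quarks and one down quark, a neutron is one up and two down), and the names `UP`, `DOWN` and `total_charge` are made up for this example.

```python
from fractions import Fraction

# Quark charges in units of the elementary charge, as quoted in the text.
UP = Fraction(2, 3)
DOWN = Fraction(-1, 3)


def total_charge(quark_charges):
    """Add up the charges of the quarks that make up a particle."""
    return sum(quark_charges, Fraction(0))


# Assumed quark content: proton = up, up, down; neutron = up, down, down.
proton = total_charge([UP, UP, DOWN])
neutron = total_charge([UP, DOWN, DOWN])

# An antiquark carries the opposite charge to its quark.
antiproton = total_charge([-UP, -UP, -DOWN])

print(proton, neutron, antiproton)
```

Exact fractions are used instead of floating-point numbers so that the sums come out as whole units of charge with no rounding error.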

PHYSICS FACT
The Higgs boson
A difficulty with the electroweak unification theory was that the W and Z
bosons are massless at some energies but need to acquire mass at lower
energies. At high temperatures and energies, the W and Z are similar to a
photon, the other force carrier of the electroweak force, which has no
rest mass.
At lower energies, the W and Z need to acquire mass, while the photon
remains with zero rest mass. Peter Higgs (1929– ), a British physicist
working in Edinburgh, was among those who proposed a mechanism for
providing mass for these particles. He proposed the existence of a new
field now called the Higgs field. A particle called the Higgs particle or
boson is associated with this field. At low temperatures, space will be
filled with Higgs particles. The W and Z interact with the Higgs particles
and do not travel through empty space at the velocity of light. They have
acquired an effective mass through their interaction with the Higgs
particles. The photon does not interact with the Higgs particles and so
continues to travel at the velocity of light. (Note that empty space being
filled with Higgs particles is in some ways similar to the idea of an ether
filling all space.)
At higher temperatures and energies, the interactions of the Higgs
particles are such that they do not fill space and the W and Z can pass
through space at the speed of light. They are no longer slowed down and
hence have no mass.
The Higgs field can also account for other quarks and leptons having
mass. In fact, it may answer the question of what we really mean by mass.
(It is a mechanism for providing mass but cannot accurately account for
exactly how much mass these particles have.) The mass of the Higgs boson
is not known but some physicists think that they are on the verge of
discovering the particle.
The search for the Higgs boson has been a massive task and Peter Higgs
himself has stated ‘When I consider the huge sums going for this, the
lifetimes spent in the search, I can’t help but think: “Good heavens, what
have I done?” ’

Recent developments in accelerator research


In June 2000, the Relativistic Heavy Ion Collider (RHIC) commenced
operation at Brookhaven. Beams of gold ions are accelerated to 99.995%
of the velocity of light. Two beams travel in opposite directions and are
then deflected to intersect. There has been speculation that there may be
sufficient energy to produce a quark–gluon plasma.
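
To get a feel for what 99.995% of the velocity of light means, the Lorentz factor from special relativity can be evaluated for the RHIC beam speed quoted above. This is a minimal sketch: the formula is the standard relativistic one and is not given in this section, and the function name `lorentz_factor` is chosen only for this example.

```python
import math


def lorentz_factor(beta):
    """Return gamma = 1 / sqrt(1 - beta^2), where beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


# Beam speed quoted in the text, as a fraction of the speed of light.
beta_gold = 0.99995

gamma = lorentz_factor(beta_gold)
print(f"Lorentz factor: {gamma:.1f}")
```

A Lorentz factor of about 100 means the total energy of each ion is roughly one hundred times its rest energy, which is why so much energy is available when the two beams intersect.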
The Large Electron Positron collider (LEP) at CERN was scheduled to
end its 11-year life at the end of September 2000. However, its life was
extended for an extra two months after some events were detected that
may have been decays associated with the Higgs boson. They did not,
however, provide evidence of the Higgs boson.
The LEP is being replaced by the Large Hadron Collider (LHC),
which will occupy the same 27 km circumference tunnel. The LHC,
originally designed to come on line in 2005, then delayed to 2007, has been
further delayed to 2008. The Higgs boson has proved elusive; it is beyond

512 FROM QUANTA TO QUARKS


the reach of existing particle accelerators, but researchers hope that the
LHC will be able to detect it.
Future directions of research
In the United States, the National Academy of Sciences has set up a
special committee to assess key questions about the nature of the
universe. The committee has identified eleven key questions that it hopes
will either be answered in the next decade or be taken up in the
decades that follow.
Their eleven questions are:
• What is dark matter?
• What are the masses of neutrinos, and how have they shaped the
evolution of the universe?
• What is the nature of dark energy?
• Are protons unstable?
• How did the universe begin?
• Are there new states of matter at exceedingly high densities and temperatures?
• Is a new theory of matter and light needed at the highest energies?
• How were the elements from iron to uranium made?
• Are there additional space–time dimensions?
• Did Einstein have the last word on gravity?
• How do cosmic accelerators work and what are they accelerating?
Source: Connecting Quarks with the Cosmos: Eleven Science Questions
for the New Century, 8 January 2001, Committee on the Physics
of the Universe

Further information is available at www.nationalacademies.org/bpa.
Most of these questions involve the close linking of particle physics and
cosmology. Perhaps the new accelerators currently being developed will
be able to shed light on some of these questions.

Table 26.5 Timeline of events in quantum and nuclear physics

1885 Balmer’s equation for the spectral lines of hydrogen

1895 C. T. R. Wilson begins developing cloud chamber.


Wilhelm Röntgen discovers X-rays.

1896 Henri Becquerel discovers radioactivity.

1897 J. J. Thomson discovers the electron.

1898 Rutherford discovers that radioactivity consists of alpha and beta radiation.

1900 Villard discovers radioactivity also consists of gamma rays.


Planck develops quantum theory.
Rydberg modifies Balmer’s equation for spectral lines of hydrogen.

1902 Transformation theory of radioactivity

1903 Hydrogen atom thought to contain about a thousand electrons

1905 Einstein proposes light quantum.


Einstein introduces special relativity.
Einstein introduces $E = mc^2$.

1906 Rutherford discovers alpha particle scattering.


Beta particle spectrum detected

1908 Death of Becquerel


Paschen series (infra-red) of spectral lines of hydrogen detected


1909 Geiger and Marsden publish the results of their experiment investigating the scattering of alpha particles
by thin metal foils.
Rutherford and Royds identify alpha particles as doubly charged helium ions.

1911 Ernest Rutherford predicts the nuclear atom (based on his interpretation of the results of Geiger and Marsden).

1912 Cosmic radiation is discovered during manned balloon flights.

1913 Bohr publishes three papers that include his postulates that form the basis of the Bohr model of the atom.
Millikan determines the charge on the electron.

1914 First detection of continuous beta spectrum

1916 Lyman series (ultraviolet) of spectral lines of hydrogen detected.

1919 Rutherford identifies the proton as a constituent of nuclei.

1920 Rutherford predicts the existence of the neutron.

1922 Brackett series (infra-red) of spectral lines of hydrogen detected

1924 De Broglie introduces particle-wave duality.


Pfund series (infra-red) of spectral lines of hydrogen detected

1925 Pauli proposes his exclusion principle.


Heisenberg’s first paper on quantum theory is published.
Uhlenbeck and Goudsmit discover spin.
Born, Heisenberg and Jordan’s paper on matrix mechanics is published.

1926 Equation of hydrogen spectrum derived from matrix mechanics.


Schrödinger’s first paper on wave mechanics is published.
G. N. Lewis uses the name ‘photon’ for a light quantum.

1927 Davisson and Germer discover electron diffraction.


Heisenberg introduces the uncertainty principle.
Bohr introduces his idea of complementarity.

1928 Dirac presents his equation.

1931 Lawrence invents the cyclotron.


Pauli predicts the existence of the particle now known as the neutrino.
Dirac proposes the existence of the positron.

1932 Chadwick discovers the neutron.


Anderson discovers the positron.

1933 Fermi publishes the theory of beta decay.


Szilard has the idea of chain reaction.

1934 Fermi discovers that slow neutrons are much better than fast neutrons when irradiating elements.
Discovery of $\beta^+$ radioactivity

1936 Discovery of meson, later called muon, in cosmic rays

1937 Death of Rutherford

1938 Hahn and Strassmann discover the presence of barium after bombarding uranium with slow neutrons.

1939 Meitner and Frisch realise that nuclear fission is taking place in the experiments of Hahn and Strassmann.



1941 Plutonium discovered
Term ‘nucleon’ introduced

1942 The Manhattan Project commences.


Fermi produces controlled fission in an atomic pile in Chicago.

1945 First atomic bomb test


Two atomic bombs are dropped on Japanese cities Hiroshima and Nagasaki.

1946 Term ‘lepton’ introduced

1953 First bubble chamber pictures taken

1954 Foundation of CERN


Death of Fermi

1955 Discovery of the antiproton


Death of Einstein

1956 Cowan and Reines detect antineutrino

1958 Death of Pauli

1961 Gell-Mann proposes ‘Eightfold Way’.


Death of Schrödinger

1962 Discovery of the muon neutrino


Death of Bohr

1964 Hypothesis that all hadrons are composed of three quarks (and antiquarks)
Introduction of fourth quark

1976 Death of Heisenberg

1977 Discovery of fifth quark, bottom

1983 Discovery of the W and Z bosons

1984 Death of Dirac

1986 Chernobyl nuclear accident

1995 Discovery of sixth quark, top

2000 Detection of tau neutrino



CHAPTER REVIEW

SUMMARY
• The method that physicists have used to discover information on the
most basic structure of matter is to bombard the matter with high-energy
particles. The particles produced in these interactions are then detected
and analysed. This process has led to a quest for particles of higher and
higher energies.
• Cloud chambers were very useful detectors in the first half of the
twentieth century but were replaced by bubble chambers, which were much
more efficient detectors. These, in turn, have been replaced by
multi-component detectors that include tracking chambers and calorimeters.
• With the invention of new types of particle accelerators, the energies to
which charged particles can be accelerated have increased dramatically.
The original one-stage electrostatic accelerators such as the
Cockcroft–Walton machine and the Van de Graaff generator are still used as
the pre-accelerators in some of today’s biggest accelerators.
• While linear accelerators and cyclotrons were once very important
accelerators, the main accelerators today are synchrotrons, in which
particles are accelerated around a constant radius path.
• Early accelerators fired high-energy particles at fixed targets; however,
conservation of momentum and energy restricts the amount of energy that
can be used in the production of new particles. Much more energy is
available for particle production when similar mass particles, travelling
in opposite directions (and hence having little total momentum), collide.
A recently constructed collider is the RHIC at Brookhaven. The LHC is an
even more energetic collider being built at CERN; originally due in 2005,
it is now expected to come on line in 2008.
• By 1960, hundreds of different types of particles had been detected and
there seemed to be little pattern to link all the particles. Murray
Gell-Mann and Yuval Ne’eman discovered that there was an underlying
organisation of all these particles, which Gell-Mann called the ‘Eightfold
Way’. This organisation method prompted the prediction of the existence of
another particle. The actual detection of the particle confirmed this
method of organisation.
• Gell-Mann and Zweig believed that there was an underlying structure and
that each baryon was made up of three truly fundamental particles, named
quarks by Gell-Mann. They believed that mesons were composed of a quark
and an antiquark.
• This led to the development of the ‘Standard Model’ in which all matter
can be formed from twelve fermions: six flavours of leptons and six
flavours of quarks. These particles interact with each other via the
electroweak force, the strong force and the force of gravity. These
interactions occur through force-carrying bosons. Eight gluons carry the
strong force or colour force between quarks. The force between leptons is
carried by the three intermediate vector bosons called $W^+$, $W^-$ and
$Z^0$, and the photon. The graviton, the conjectured force carrier of the
gravitational force, has not been detected.
• The Standard Model may have been a great scientific achievement; however,
it has serious flaws. It is incompatible with the general theory of
relativity, Einstein’s theory of gravity. It cannot explain why there are
six leptons and six quarks. It does not have a mechanism to explain the
mass of particles, but perhaps the detection of the Higgs boson will
alleviate this difficulty.
• The construction of more powerful accelerators such as the colliders RHIC
and LHC might enable physicists to produce a QGP (quark–gluon plasma). If
they are able to do this it would produce matter in the form that
dominated the universe one millionth of a second after the big bang. The
LHC may have sufficient energy to detect the Higgs boson.

QUESTIONS
1. In 1930, Ernest Lawrence constructed the first cyclotron. It was
approximately 12.5 cm in diameter and accelerated hydrogen ions in a
magnetic field of 1.27 T between the poles of a magnet 10 cm in diameter.
The accelerating voltage applied to the dees was 2000 V and Lawrence
determined that the hydrogen ions had been accelerated to an energy of
80 000 eV.
(a) How many times had the hydrogen ions experienced the 2000 V
accelerating potential and how many orbits of the cyclotron did they
complete?
(b) Determine the velocity of an 80 000 eV hydrogen ion (proton).
(c) A charged particle moving in a circular path in a magnetic field has a
centripetal force of magnitude $F_c = \frac{mv^2}{r}$ provided by the
magnetic force $F_B = qvB$. Determine the radius of the path of an
80 000 eV proton in a magnetic field of 1.22 T.
2. The Stanford Linear Accelerator (SLAC) was used to accelerate electrons
to very high velocities. Suggest a reason why a linear accelerator was
preferable to a cyclic accelerator, such as a cyclotron, as a device to
produce very high energy electrons. (Hint: you may wish to think back to
the Rutherford model of the atom and the electrons in orbit around the
nucleus.)
3. The latest high-energy particle accelerators that have been constructed
have been colliders, or at least have been able to function as colliders
as well as fixed-target devices.
(a) Consider a collider where a beam of 200 GeV protons interacts with a
beam of 200 GeV antiprotons, and an accelerator where a beam of 400 GeV
protons is fired at a fixed target. With reference to momentum and
energy, explain why there would be much more energy available for
formation of new particles in the collider than in the fixed-target
accelerator.
(b) Obviously the difference in available energy shows that a collider has
a significant advantage over a fixed-target accelerator. However, there
are some advantages that a fixed-target accelerator has over a collider.
List some of these advantages.
4. (a) A proton is composed of two up quarks and one down quark, and a
neutron is composed of one up quark and two down quarks. Show that a
proton will have a charge of plus one unit and a neutron will be neutral.
(b) An antiproton would be expected to have a charge opposite to that of a
proton. Identify the quarks (or antiquarks) in an antiproton and show
that it will have a charge of minus one unit.
(c) A neutron has no charge but still has an antiparticle. Identify the
quarks in an antineutron and show that an antineutron is neutral.
5. In 1983, particle physicists in the United States proposed that a new
accelerator be constructed. This new accelerator was called the
Superconducting Supercollider (SSC) and was to be approximately
86 kilometres in circumference. It was planned to be approximately
60 metres below the ground surrounding the city of Waxahachie, Texas. The
total cost was estimated to be eight billion dollars.
Construction began in 1990, but the project was cancelled in 1994 when it
was about 20% complete. (A search on the internet will yield information
about the project and reasons for and against stopping the construction.)
Prepare material that could be used in a debate, either to support or to
oppose the expenditure of such a large amount of money on such a project.
6. The Large Electron–Positron collider (LEP) at CERN was planned to be
shut down on 11 September 2000. (The LEP is to be replaced by the Large
Hadron Collider (LHC) which may start in 2008.) Researchers using the LEP
thought that they were on the verge of discovering the Higgs boson and
requested that the LEP continue. Its life was extended until 2 November
2000. By that time researchers thought that they were even closer to the
discovery of the Higgs. They believed that they had reduced the
possibility of their results being due to statistical fluctuations to less
than two parts in 1000. They believed that in 2001 they would be able to
increase the energy of the LEP and complete a run which would reduce the
uncertainty to three parts in 10 million.
Researchers were shocked when they were not granted a further extension in
the life of the LEP and it was finally shut down on 2 November 2000.
There is still considerable debate about the decision not to extend the
life of the LEP. Much of the information is available on the internet.
Research this decision and decide if you support the closure or if you
think that the life of the LEP should have been extended.

CHAPTER 26 QUARKS AND THE STANDARD MODEL OF PARTICLE PHYSICS 517


PRACTICAL ACTIVITIES

26.1 CLOUD CHAMBERS

Aim

To observe the tracks produced by charged particles passing through a
cloud chamber.

Safety

Tongs or forceps should be used when handling radioactive materials. They
should not be handled with bare hands. The sources encased in perspex are
probably not suitable for use in cloud chambers.
Some cloud chambers have sources already mounted. Special care should be
taken with others that rely on a radioactive salt such as thorium oxide.
Gloves should be worn if it is necessary to open a bottle containing
thorium oxide powder. You should wash your hands after handling any
radioactive sources.
If using a diffusion cloud chamber, gloves should be worn when handling
dry ice and special care taken when breaking and/or crushing the dry ice.

Introduction and theory

As there are two different types of cloud chamber available for use in
schools, we will describe how each can be used. The principle of operation
is the same for both types as they rely on the production of a
supersaturated vapour (see page 496). As ionising radiation passes through
the supersaturated vapour, droplets condense on the ions formed and reveal
the path of the radiation.

Diffusion cloud chamber

The diffusion cloud chamber was developed in 1939 by Dr Alexander
Langsdorf, Jnr at the University of California at Berkeley. A diffusion
cloud chamber can operate continuously, apparently for hours at a time
instead of the few seconds of an expansion type chamber, but the small
devices used in schools are unable to maintain the correct conditions for
much longer than fifteen to twenty minutes. In these cloud chambers,
alcohol from the top of the chamber evaporates and then diffuses downwards
towards the base of the chamber. The base of the chamber is cooled with
dry ice (beneath the base) and as the alcohol vapour diffuses downwards,
the vapour becomes supersaturated. There should be a region in the chamber
that maintains a supersaturated vapour and the trails of charged particles
through this region can be observed.

Expansion cloud chamber

This was the original type of cloud chamber and was invented by
C. T. R. Wilson in 1911 to detect radiation. In such a cloud chamber, a
piston below the chamber was lowered to reduce the pressure and cool the
gas in the chamber. This produced a supersaturated vapour. In the
expansion cloud chamber made in Australia by IEC, the expansion is
performed by a bicycle pump (with the washer reversed to extract air from
the chamber).

Figure 26.11 A Wilson cloud chamber

Reduction of the pressure causes the vapour to become supersaturated and,
if conditions are right, trails appear. A voltage applied to a metal ring
inside the chamber sweeps the ions from the chamber before another
expansion or pressure reduction can be performed. A disadvantage of such a
chamber is that it is then necessary to wait perhaps a minute for the
alcohol to evaporate into the chamber before repeating the process. The
small school version can usually be used without this wait. Once the
correct conditions have been attained, withdrawing the handle of the
bicycle pump can usually be done every few seconds.

Part A: Observing tracks in a diffusion cloud chamber

Apparatus
cloud chamber (which probably has a built-in radioactive source)
woollen cloth
alcohol (propan-2-ol, also known as iso-propyl alcohol, is recommended but
ethyl alcohol should work)
dry ice
light source (possibly a microscope lamp)

Method
We will assume that the cloud chamber has a built-in source.
Remove the perspex top and moisten the felt ring with a few drops of
alcohol. Turn the chamber upside-down, unscrew the base and remove the
foam pad. Place some small pieces of dry ice over the black metal plate.
These will be held against the metal by the foam when the base is screwed
back on.

Figure 26.12 A diffusion cloud chamber (labelled parts: transparent top,
charged by rubbing with cloth; felt ring containing alcohol; radioactive
source; black metal plate; dry ice; foam pad surrounded by polystyrene;
wedge used to level the chamber)

Place the chamber right side up on small wedges (probably provided) and
level it as carefully as possible. If the metal plate is not horizontal,
convection currents can hinder the production of tracks. Replace the
perspex top and rub it gently with the woollen cloth to charge it.
(Charging the top of the chamber produces an electric field that will
remove dust and ions and this should assist in the production of tracks in
the chamber.)
After a few minutes, tracks should become visible. If a light is shone
horizontally through the side of the chamber, the visibility of the tracks
may be enhanced.
The chamber should continue to operate while the temperature difference
between the top and bottom of the chamber region is maintained. If the
trails become hard to see, recharging the perspex top may help.
It is sufficient to set up a cloud chamber and observe the tracks of alpha
particles through the chamber.
If the cloud chamber works particularly well you may be able to observe
other tracks that do not come from the source. (Cloud chambers played an
important role in the discovery of cosmic rays.)

Part B: Observing tracks in an expansion cloud chamber
The IEC Wilson’s Expansion Cloud Chamber is described below. Usually
expansion cloud chambers have the disadvantage that they provide a brief
view of the tracks when expansion occurs and then there is a delay before
the expansion can be repeated. However, with this particular cloud
chamber, expansions can be performed every few seconds.

Apparatus
Wilson’s Expansion Cloud Chamber including modified bicycle pump and
radioactive source
alcohol (propan-2-ol, also known as iso-propyl alcohol, is recommended but
ethyl alcohol should work)
light source (possibly a microscope lamp)
high-voltage DC power supply (at least 300 V)

Method
Preparing the radioactive source
The cloud chamber is supplied with a small quantity of thorium oxide. The
thorium oxide can be used to prepare a point radioactive source (which can
be screwed into the side of the chamber) or radon 220 gas, produced from
the decay of thorium, can be used as the source. (If you search for
information on the decay of thorium, you will find that radon 220 is
produced part of the way along the decay series.)



Special care must be taken if it is necessary to transfer thorium oxide
into the squeeze bottle that is used to inject the radon into the cloud
chamber. Your teacher will probably do this for you. (The apparatus works
well with about 10 grams of thorium oxide in the squeeze bottle.)
Setting up the expansion cloud chamber
Make sure that the base of the cloud chamber is level. Remove and clean
the glass top. Pour a few millilitres of propan-2-ol onto the black metal
disc at the bottom of the chamber (2 to 3 millilitres works well). Replace
the top and gently tighten the screws to produce a seal.
Attach the hose from the squeeze bottle containing the thorium oxide to
the fitting on the side of the chamber.
Connect the terminals on the side and base of the chamber to a
high-voltage power supply. (The instructions say up to 600 volts can be
used but excellent tracks were observed with a voltage of 200 volts. This
high voltage is essential for the production of tracks.)
Set up the microscope lamp so that it shines horizontally into the
chamber. (It should not be too close to the chamber as it is essential
that the chamber is not heated.)
Release the Mohr clip on the hose connecting the squeeze bottle to the
chamber, squeeze the bottle gently once and then replace the clip. (As the
purpose of this is to inject some radon 220 gas into the chamber, it is
easier to do this if the bicycle pump is disconnected, which means that
the chamber is not sealed.)
Reconnect the bicycle pump, and quickly and smoothly withdraw the handle
of the pump.
Tracks should be visible throughout the chamber. The radon gas should have
spread throughout the chamber and, as it decays by alpha particle
emission, the short tracks produced by alpha particles should be visible
throughout the chamber.
The trails disappear quickly but repeating the expansion within a few
seconds should produce another set of trails.
Occasionally much longer trails, which are probably thinner than those of
the alpha particles, will be observed. What particles are likely to
produce these trails and what might be their origin?
The number of tracks produced will gradually decrease. (Check the
half-life of radon 220 and you will see why.) Eventually it will be
necessary to squeeze another puff of gas into the chamber.

Further investigations
It is an achievement to set up a cloud chamber and observe the trails. If
you find that you are able to get your cloud chamber working very well
without any difficulty you might like to try to investigate further.
You could try different sources and compare the tracks produced by
different types of radiation. (This might be impractical with a diffusion
cloud chamber but might well be possible with the expansion cloud
chamber.)
You could try to produce a magnetic field in the chamber and see if it is
possible to deflect the particles in the magnetic field.


GLOSSARY
A
A-scan: a range-measuring system that records the time for an ultrasonic
pulse to travel to an interface in the body and be reflected back
absolute magnitude, M: the magnitude that a star would have if it were
viewed from a standard distance of 10 parsecs
absorption spectrum: a series of dark lines on a coloured background that
is produced when white light is passed through a cool gas and viewed
through a spectroscope
active optics: a slow feedback system to correct sagging or other
deformities in the primary mirror of large modern reflector telescopes
acoustic impedance, Z: a measure of how readily sound will pass through a
material. It is measured in kg m$^{-2}$ s$^{-1}$.
adaptive optics: use a fast feedback system to attempt to correct for
effects of atmospheric turbulence
aether: the proposed medium for light and other electromagnetic waves,
before it was realised that these waveforms do not need a medium in order
to travel
angular momentum, L: of a point mass, m, which is in circular motion of
radius, r, with velocity, v, is given by: L = mvr. Angular momentum is the
rotational equivalent to linear momentum and is an important quantity in
rotational motion. (It follows a similar conservation principle to linear
momentum.)
annual parallax, p: half the angle through which a nearby star appears to
shift against the backdrop of distant stars, over a particular six-month
period
apparent magnitude, m: the magnitude given to a star as viewed from Earth
armature: a frame around which a coil of wire is wound, which rotates in a
motor’s magnetic field
astrometry: the careful measurement of a celestial object’s position, and
changes of position, to a high order of accuracy
attenuation: the reduction in intensity of a signal
average binding energy per nucleon: the total binding energy of a nucleus
divided by the number of nucleons in the nucleus. It is a measure of the
stability of the nucleus.

B
B-scan: displays the reflected ultrasound as a spot, the brightness of
which is determined by the intensity of the ultrasound
back emf: an electromotive force that opposes the main current flow in a
circuit. When the coil of a motor rotates, a back emf is induced in the
coil due to its motion in the external magnetic field.
baryons: hadrons that have half-integer spin (and are fermions). Examples
are the proton and neutron. Some other baryons are included in table 26.1.
binding energy: the energy equivalent of the mass defect of the nucleus.
It is the energy that would have to be provided and converted to mass to
enable all the nucleons in a nucleus to be separated from each other.
black hole: the crushed remnant of the core (greater than 5 solar masses)
of a very massive star. Theoretically, it is a point of zero volume and
infinite density.
bosons: particles that have either integer or zero spin. They do not obey
the Pauli exclusion principle. Bosons are force-carrying particles.
brushes: conductors that make electrical contact with the moving split
metal ring of a commutator

C
cathode/anode: cathode rays are now known to be streams of electrons
emitted within an evacuated tube from a cathode (negative electrode) to an
anode (positive electrode). They were first observed in discharge tubes.
cathode ray tube or discharge tube: a sealed glass tube from which most of
the air is removed by vacuum pump. A beam of electrons travels from the
cathode to the anode and can be deflected by electrical and/or magnetic
fields.
centripetal acceleration: always present in uniform circular motion. It is
associated with centripetal force and is also directed towards the centre
of the circle.
centripetal force: the force that acts to maintain circular motion and is
directed towards the centre of the circle
Chandrasekhar limit: (1.4 solar masses) the greatest mass that a
non-rotating white dwarf can have
CNO (carbon–nitrogen–oxygen) cycle: the hydrogen fusion mechanism that
dominates in hotter main sequence stars
coherent: when there is a constant phase difference between light waves;
that is, the peaks line up and the troughs line up. Also refers to an
optic fibre bundle in which the optic fibres keep the same position
relative to one another.
colour index: the difference between a star’s photographic magnitude, B,
and its visual magnitude, V
commutator: a device for reversing the direction of a current flowing
through an electric circuit, for example, the coil of a motor
contrast: refers to the brightness difference between parts of the image
covalent bond: a strong chemical bond formed between atoms by the sharing
of electrons in the valence band
critical mass: smallest amount of fuel necessary to sustain a chain
reaction. As the size increases, the volume to surface area ratio
increases and a smaller proportion of neutrons are lost from the fuel.
crystal: a naturally occurring solid with a regular polyhedral shape. All
crystals of the same substance grow so that they have the same angles
between their faces. The atoms that make up the crystal have a regular
arrangement called a crystal lattice.
crystalline: a solid in which the atoms or molecules are arranged in a
regular pattern

D
diffraction: refers to the spreading out of light waves around the edge of
an object or when light passes through a small aperture
diffraction grating: a device consisting of a large number of slits, used
to produce a spectrum

diode: a device that contains only two electrodes
distance modulus: equal to (apparent magnitude − absolute magnitude). It is directly related to the distance of a star from Earth.
dopant: a tiny amount of an impurity that is placed in an otherwise pure crystal lattice to alter its electrical properties
Doppler effect: the apparent change in frequency observed when there is relative movement between a source of sound and an observer
drift velocity: the average velocity of electrons in a conductor under the influence of an electric field

E
eddy current: a circular or whirling current induced in a conductor that is stationary in a changing magnetic field, or that is moving through a magnetic field. Eddy currents resemble the eddies or swirls left in the water after a boat has gone by.
electromagnetic induction: the generation of an emf and/or electric current through the use of a magnetic field
emission spectrum: a series of brightly coloured lines on a dark background that is produced when light from an excited gas is viewed through a spectroscope
empirical equation: one that has no theoretical basis but can be used to calculate correct values. Kepler’s Third Law, T² ∝ R³, which you encountered in ‘The Cosmic Engine’, is another example of an empirical equation.
escape velocity: the initial velocity required by a projectile to rise vertically and just escape the gravitational field of a planet
excited state: when an electron exists in a stationary state in which it has more energy than in the ground state

F
fermions: particles that have half-integer spins. They obey the Pauli exclusion principle.
field vector: a single vector that describes the strength and direction of a uniform vector field. For a gravitational field, the field vector is g.
fissile: a nucleus that may undergo fission
fluorescence: the emission of light from a material when it is exposed to streams of particles or external radiation
flux: from the Latin word fluo meaning ‘flow’. Flux is a state of flowing or movement. In physics, flux is the rate of flow of a fluid, radiation or particles.

G
galvanometer: an instrument for detecting small electrical currents
geostationary orbit: an orbit at an altitude at which the period of the orbit precisely matches the Earth’s period of rotation. This corresponds to an altitude of approximately 35 800 km.
gradient magnetic field: a magnetic field that changes by small known increments throughout the region of the field
gravitational field: a field within which any mass will experience a gravitational force. The field has both strength and direction.
gravitational potential energy: Ep, the energy of a mass due to its position within a gravitational field. On a large scale, gravitational potential energy is defined as the work done to move an object from infinity (or some point very far away) to a point within a gravitational field.
ground state: the state an electron is in when it has the lowest possible amount of energy

H
hadrons: particles that experience the strong nuclear force. Mesons and baryons are both hadrons.
half-life: the time taken for half the radioactive nuclei in a sample to decay. If we exclude the activity of daughter nuclei, it is the time taken for the activity of a particular sample to drop to half its initial value.
hard X-rays: consist of high-energy photons and are more penetrating than soft X-rays, which have lower energy photons
heliosphere: the zone around the solar system dominated by the Sun’s magnetic field and solar wind. It is bounded by the heliopause, approximately 100 AU from the Sun.
helium flash: the sudden onset of helium fusion in the core of a new red giant

I
incandescent: bright or glowing. Like black bodies, most substances become incandescent when they become hot enough.
induction: a process where one object with magnetic or electrical properties can produce the same properties in another object without making physical contact
induction motor: an AC machine in which torque is produced by the interaction of a rotating magnetic field produced by the stator and currents induced in the rotor
inertial frame of reference: a non-accelerated environment. Only steady motion or no motion is allowed. A non-inertial frame of reference experiences acceleration.
integrated circuit (IC): an electronic circuit in which all the components, such as transistors, diodes, resistors, capacitors and connections, are made in or on a single piece of semiconductor, such as a silicon chip
interference: the interaction of two or more waves, producing regions of maximum amplitude (constructive interference) and zero amplitude (destructive interference). The Michelson–Morley experiment used the interference of light in an attempt to measure the movement of the Earth through the aether.
interferometry: a technique used to combine the data from several elements of an antenna array in order to achieve a higher resolution
interstellar dust: made of grains of silicates and ices in a core and mantle structure, just one micrometre across
interstellar gas: occurs as regions of neutral atoms, ions or molecules. It is mostly hydrogen.
interstellar medium: consists of gas and dust
ionisation blackout: a period of no communication with a spacecraft due to a surrounding layer of ionised atoms forming in the heat of re-entry

isotope: a nuclide that has the same number of protons but a different number of neutrons compared to another nuclide of the same element

L
Larmor frequency: the frequency with which the spin axis of a nucleus precesses about the direction of an external magnetic field
length contraction: the shortening of an object in the direction of its motion as observed from a reference frame in relative motion
leptons: particles which do not experience the strong nuclear force. They are all fermions with half-integer spin. An electron is a lepton.
light-year: the distance travelled through space in one year by light or other electromagnetic wave. It corresponds to a distance of 0.3066 parsecs or 9.4605 × 10¹² km.
low Earth orbit: an orbit higher than 250 km and lower than 1000 km

M
magnetic flux, ΦB: the amount of magnetic field passing through a given area. In the SI system, ΦB is measured in weber (Wb).
magnetic flux density: the strength of a magnetic field, B. In the SI system, B is measured in tesla (T) or weber per square metre (Wb m⁻²).
magnetosphere: the region around a planet in which the planet’s magnetic field exerts an influence
main sequence star: characterised by the fusion of hydrogen to helium in its core
mass defect: the difference between the mass of the constituent nucleons of a nucleus and the mass of the nucleus
mass dilation: the increase in the mass of an object as observed from a reference frame in relative motion
maxima: refers to points on an interference pattern where the peaks of each set of waves coincide. This produces a bright spot when light is used and is a point of constructive interference.
medium: the material through which a wave travels
mesons: hadrons that have zero or integer spins. Some of the mesons with zero spin are included in table 26.2.
metastable: a nucleus in an excited state for a period of time before decaying
MeV: a million electron volts; the energy gained by one electron accelerating through a potential difference of one million volts
minima: refers to points on an interference pattern where peaks of one wave coincide with troughs of the other. This produces a dark spot and is a point of destructive interference.
motor effect: the action of a force experienced by a current-carrying conductor in an external magnetic field

N
net spin: a property of a nucleus. If a nucleus has a net spin it behaves as a tiny magnet.
neutron star: the extremely dense remnant of the core (1.4 to 3 solar masses) of a massive star. It is composed of neutron matter.
non-coherent: refers to an optic fibre bundle in which the fibres are not kept in the same position relative to one another
nuclide: refers to a particular nucleus with certain values of Z (atomic number) and A (mass number)

O
optical fibre: a glass core surrounded by a cladding of lower refractive index. Light is transferred along the optical fibre by total internal reflection.

P
parallax: the apparent shift in position of a close object against a distant background due to a change in position of the observer
parsec: a parallax-second, the distance that corresponds to an annual parallax of 1 second of arc
period, T: the time taken to complete one orbit
phase scan: a scan produced using an array of transducers. The phase difference between the signals from each transducer may be varied to produce this scan.
phosphorescent: a substance that absorbs radiation of one wavelength and then emits radiation of a different wavelength over a period of time. The hands of some analogue watches are coated with a phosphorescent substance to enable them to be seen in the dark.
photocell: a device that uses the photoelectric effect. These devices include photovoltaic cells and solar cells which convert electromagnetic energy, such as sunlight, into electrical energy.
photoconductive cell: or photo-resistor, uses the fact that electrical resistance is affected by light falling onto it
photoelectric effect: the name given to the release of electrons from a metal surface exposed to electromagnetic radiation. For example, when a clean surface of sodium metal is exposed to ultraviolet light, electrons are liberated from the surface.
photometry: the measurement of the brightness of a source of light or other radiation
photon: a quantum (or discrete packet) of electromagnetic radiation. It can be thought of as an elementary particle with zero rest mass and charge, travelling at the speed of light.
piezoelectric effect: the conversion of electrical energy to mechanical energy resulting in the change in shape of a piezoelectric crystal when it is subjected to a potential difference
planetary nebula: a shell-shaped cloud of gas that is the blown-away outer layers of a star
positrons: positively charged beta particles formed when a proton disintegrates to form a neutron and a positron. A positron is identical to an electron except that its charge is positive instead of negative.
precession: the movement, in a conical path, of the axis of a spinning object
principal quantum number: the value of n for each stationary state or orbit of the Bohr atom
projectile: any object launched into the air
proton–proton (p–p) chain: the hydrogen fusion mechanism that is first to occur in main sequence stars

protostar: a new star before it begins to produce any nuclear energy in its core

Q
quantum (plural: quanta): can be considered to be the smallest amount of energy possible in a given situation. Planck’s atomic oscillators could oscillate only with certain precise amounts of energy.
quantum mechanics: the name given to a set of physical laws that apply to objects the size of atoms or smaller. The concepts of wave–particle duality and uncertainty lie at the heart of quantum mechanics.
quantum theory: based on quantity or amounts (from the Latin word quantum meaning ‘how much’). In ‘classical physics’ an object could possess any amount of energy. In quantum theory objects could possess only certain discrete amounts of energy. Instead of being ‘continuous’, energy is available only in ‘packets’.

R
radioactive decay: the emission of particles from the nucleus of a radioactive element
radioactive isotope or radioisotope: an isotope that is unstable and will emit particles from the nucleus until it becomes stable
radiopharmaceutical: a compound that has been labelled with a radioisotope
radio telescope: a large dish or array aimed at the sky that detects radio waves arriving from space. The signal is fed to computers that are able to compile the information into an image.
red giant: a star characterised by a helium-burning core surrounded by a hydrogen-burning shell
relaxation: refers to precessing nuclei moving back to their original energy state
resolution: the ability to distinguish closely spaced points as separate points. The resolution limit is the smallest separation of points that can be distinguished as distinct.
resonate: to absorb energy when an applied frequency matches the natural frequency of an object
rest energy: the energy equivalent of a stationary object’s mass, measured within the object’s rest frame
rest frame: the frame of reference within which a measured event occurs or a measured object lies at rest
right-hand grip rule: used to find the direction of a magnetic field around a straight current-carrying conductor
right-hand push rule: (also called the right-hand palm rule) used to find the direction of the force acting on a moving charged particle or current-carrying conductor in an external magnetic field
rotor: the rotating part of an electrical rotating machine

S
scintillation: a flash of light observed on a scintillation screen. Another example of scintillation is electrons striking the screen of a cathode ray oscilloscope. The screen produces many scintillations when it is struck by electrons. Of course, the continuous beam of electrons produces a constant glow, not individual flashes as would be observed when alpha particles hit such a screen.
sector scan: a scan in the shape of a sector, made from a series of B-scans
seeing: refers to the twinkling and blurring of a star’s light due to atmospheric distortion
semiconductor: a material in which resistance decreases as its temperature rises. Its resistivity lies between that of a conductor and an insulator.
slingshot effect: or planetary swing-by, is a manoeuvre used with space probes to pick up speed and proceed on to another target
slip speed: the difference between the speed of the rotating magnetic field and the speed of the rotor
soft X-rays: X-rays consisting of low-energy photons
solenoid: consists of a coil of wire wound uniformly into a cylinder
space–time: a single four-dimensional concept that considers space and time as being bound together
spectroscope: a device used to spread light into its spectrum. It can be attached to the eyepiece of a telescope to examine the spectra of starlight.
spectroscopic parallax: a method of using the H–R diagram and the distance modulus formula to determine the approximate distance of a star
speed of light: 3.0 × 10⁸ m s⁻¹, or approximately 1080 million km h⁻¹. It is the theoretical maximum velocity in our universe.
split metal ring: the two-piece conducting metal surface of a commutator. Each part is connected to the coil.
squirrel-cage rotor: an assembly of parallel conductors and short-circuiting end rings in the shape of a cylindrical squirrel cage
stationary state: the state an electron is in when it orbits the nucleus without emitting any electromagnetic radiation
stator: the non-rotating magnetic part of a motor
stellar spectroscopy: the examination of the spectra of stars in order to learn more about their composition, surface temperature, velocity, density, etc.
step-down transformer: provides an output voltage that is less than the input voltage
step-up transformer: provides an output voltage that is greater than the input voltage
stroboscope: a light that produces quick flashes at regular (usually small) time periods
supernova: a violent explosion of uncontrolled nuclear reactions that completely blows away the various layers of a massive star (original mass greater than five solar masses)

T
terminal: the free end of a cell or battery to which a connection is made to the rest of a circuit
theoretical resolution: a telescope’s ability to distinguish two close objects as separate images. It is measured as an angle.
thrust: the force delivered to a rocket by its engines

time dilation: the slowing down of events as observed from a reference frame in relative motion
torque: the turning effect of a force. It is the product of the tangential component of the force and the distance the force is applied from the axis of rotation.
trajectory: the path that a projectile follows during its flight
transfer orbit: an orbit used to manoeuvre a satellite from one orbit to another
transformer: a magnetic circuit with two multi-turn coils wound onto a common core
transistor: a tiny switch that changes the size or direction of electric current as a result of very small changes in the voltage across it. Transistors are used in sound amplifiers and in a wide range of electronic devices. Today, a single chip of silicon can hold many microscopic transistors and is called an integrated circuit.
transmutation: when a radioactive atom emits an alpha particle or a beta particle and an atom of a new element is produced. A new daughter element is formed from a parent element.
trigonometric parallax: a method of using trigonometry to solve the triangle formed by parallax to determine distance
triple alpha reaction: the process of helium fusion in the core of a red giant

U
ultrasound: very high frequency sound. Ultrasound waves are sound waves that have a frequency above the range of human hearing, that is, greater than 20 000 hertz.
ultrasound transducer: a device for converting electrical energy to ultrasound energy or for converting ultrasound energy to electrical energy
uniform circular motion: circular motion with a uniform orbital speed
universal motor: a series-wound motor that may be operated on either AC or DC electricity

V
valence band: the energy band in a solid in which the outermost electrons are found
valve: a thermionic device in which two or more electrodes are enclosed in a glass tube. The name comes from the rectifying property of the device; that is, the current flows in only one direction.
vector: any quantity that has both magnitude and direction. Force is one example.
visual magnitude: refers to magnitude as judged by eye, or more accurately by a photometer fitted with a yellow–green filter
voltage: the electrical pressure between two points that is capable of producing a flow of current between the points when they are connected by a closed circuit
voxel: a small volume, part of a ‘slice’ through the body

W
wavefront: either the crest or trough of a wave. The wavefront is perpendicular to the direction of the velocity of the wave.
weight: the force on a mass due to the gravitational field of a large celestial body, such as the Earth
white dwarf: a dense star made of degenerate matter. It is the end point of small- to medium-sized stars.
work function: the energy required to release the electron from the surface of a particular material

X
X-rays: electromagnetic waves of very high frequency and very short wavelength

Z
zero-age main sequence (ZAMS): a plot of the main sequence using only zero-age stars
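
The glossary entry for the light-year quotes two equivalent distances (0.3066 parsecs and 9.4605 × 10¹² km), and the parsec is defined from an annual parallax of 1 second of arc. The following is a minimal Python sketch of how those figures could be cross-checked. The rounded speed of light, the 365.25-day year and the length of the astronomical unit are assumptions introduced here for illustration; they are not values stated in the glossary.

```python
import math

# Assumed reference values; none of these come from the glossary itself.
C_M_PER_S = 2.998e8                      # speed of light in m/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # one year of 365.25 days, in seconds
AU_KM = 1.496e8                          # one astronomical unit in km


def light_year_km():
    """Distance light travels in one year, in km."""
    return C_M_PER_S * SECONDS_PER_YEAR / 1000.0


def parsec_km():
    """Distance at which 1 AU subtends an angle of 1 second of arc, in km."""
    one_arcsec_rad = math.radians(1.0 / 3600.0)
    return AU_KM / math.tan(one_arcsec_rad)


def light_year_in_parsecs():
    """One light-year expressed in parsecs."""
    return light_year_km() / parsec_km()


def speed_of_light_km_per_h():
    """Speed of light converted from m/s to km/h."""
    return C_M_PER_S * 3600.0 / 1000.0


if __name__ == "__main__":
    print(f"1 light-year = {light_year_km():.4e} km")
    print(f"1 parsec     = {parsec_km():.4e} km")
    print(f"1 light-year = {light_year_in_parsecs():.4f} pc")
    print(f"c            = {speed_of_light_km_per_h():.3e} km/h")
```

With these assumed inputs the light-year comes out close to the quoted 9.4605 × 10¹² km and close to 0.3066 parsecs; small differences in the last digit depend on the rounding of the inputs.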

APPENDIX 1: Formulae and data sheet
DATA SHEET
Numerical values of several constants

Charge on the electron, qe                −1.602 × 10⁻¹⁹ C
Mass of electron, me                      9.109 × 10⁻³¹ kg
Mass of neutron, mn                       1.675 × 10⁻²⁷ kg
Mass of proton, mp                        1.673 × 10⁻²⁷ kg
Speed of sound in air                     340 m s⁻¹
Earth’s gravitational acceleration, g     9.8 m s⁻²
Speed of light, c                         3.00 × 10⁸ m s⁻¹
Magnetic force constant (k ≡ µ0/2π)       2.0 × 10⁻⁷ N A⁻²
Universal gravitational constant, G       6.67 × 10⁻¹¹ N m² kg⁻²
Mass of Earth                             6.0 × 10²⁴ kg
Planck’s constant, h                      6.626 × 10⁻³⁴ J s
Rydberg’s constant, RH (hydrogen)         1.097 × 10⁷ m⁻¹
Atomic mass unit, u                       1.661 × 10⁻²⁷ kg
                                          931.5 MeV/c²
1 eV                                      1.602 × 10⁻¹⁹ J
Density of water, ρ                       1.00 × 10³ kg m⁻³
Specific heat capacity of water           4.18 × 10³ J kg⁻¹ K⁻¹
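
The data sheet gives the atomic mass unit both in kilograms and as an energy equivalent in MeV/c². As an illustration of how the two are related, here is a minimal Python sketch that converts a mass to its rest energy in MeV using only data sheet values. The function name `rest_energy_mev` is hypothetical and chosen for this example.

```python
# Values copied from the data sheet.
C = 3.00e8            # speed of light, m s^-1
U_KG = 1.661e-27      # atomic mass unit, kg
M_E_KG = 9.109e-31    # mass of electron, kg
EV_J = 1.602e-19      # 1 eV expressed in joules


def rest_energy_mev(mass_kg):
    """Rest energy E = m c^2, converted from joules to MeV."""
    energy_joules = mass_kg * C ** 2
    return energy_joules / EV_J / 1.0e6


if __name__ == "__main__":
    print(f"1 u      -> {rest_energy_mev(U_KG):.1f} MeV")
    print(f"electron -> {rest_energy_mev(M_E_KG):.3f} MeV")
```

Because the data sheet rounds c to 3.00 × 10⁸ m s⁻¹, the result for 1 u comes out at roughly 933 MeV, slightly above the tabulated 931.5 MeV/c²; a more precise value of c brings the two into closer agreement.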

FORMULAE SHEET
PRELIMINARY COURSE

The world communicates
  $v = f\lambda$
  $I \propto \frac{1}{d^2}$
  $\frac{v_1}{v_2} = \frac{\sin i}{\sin r}$

Electrical energy in the home
  $E = \frac{F}{q}$
  $R = \frac{V}{I}$
  $P = VI$
  $\text{Energy} = VIt$

Moving about
  $v_{av} = \frac{\Delta r}{\Delta t}$
  $a_{av} = \frac{\Delta v}{\Delta t} = \frac{v - u}{t}$
  $\Sigma F = ma$
  $F = \frac{mv^2}{r}$
  $E_k = \frac{1}{2}mv^2$
  $W = Fs$
  $p = mv$
  $\text{Impulse} = Ft$

The cosmic engine
  $\text{Brightness} = \frac{\text{luminosity}}{4\pi r^2}$
  $\lambda_{max} T = W$
  $v = H_0 D$

HSC COURSE

Space
  $E_p = -G\frac{m_1 m_2}{r}$
  $F = mg$
  $v_x^2 = u_x^2$
  $v = u + at$
  $v_y^2 = u_y^2 + 2a_y \Delta y$
  $\Delta x = u_x t$
  $\Delta y = u_y t + \frac{1}{2} a_y t^2$
  $\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$
  $F = \frac{G m_1 m_2}{d^2}$
  $E = mc^2$
  $L_v = L_0 \sqrt{1 - \frac{v^2}{c^2}}$
  $t_v = \frac{t_0}{\sqrt{1 - \frac{v^2}{c^2}}}$
  $m_v = \frac{m_0}{\sqrt{1 - \frac{v^2}{c^2}}}$

Motors and generators
  $\frac{F}{l} = k\frac{I_1 I_2}{d}$
  $F = BIl \sin\theta$
  $\tau = Fd$
  $\tau = nBIA \cos\theta$
  $\frac{V_p}{V_s} = \frac{n_p}{n_s}$

From ideas to implementation
  $F = qvB \sin\theta$
  $E = \frac{V}{d}$
  $E = hf$
  $c = f\lambda$

Astrophysics
  $d = \frac{1}{p}$
  $M = m - 5\log\left(\frac{d}{10}\right)$
  $\frac{I_A}{I_B} = 100^{(m_B - m_A)/5}$
  $m_1 + m_2 = \frac{4\pi^2 r^3}{GT^2}$

Medical physics
  $Z = \rho v$
  $\frac{I_r}{I_0} = \frac{[Z_2 - Z_1]^2}{[Z_2 + Z_1]^2}$

From quanta to quarks
  $\frac{1}{\lambda} = R_H\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$
  $\lambda = \frac{h}{mv}$

The age of silicon
  $A_0 = \frac{V_{out}}{V_{in}}$
  $\frac{V_{out}}{V_{in}} = -\frac{R_f}{R_i}$
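
Two of the formulae on this sheet, Kepler's Law of Periods from 'Space' and the distance modulus formula from 'Astrophysics', are often rearranged before use. The following is a minimal Python sketch of both rearrangements, using G and the mass of Earth from the data sheet. The Earth radius of 6.38 × 10⁶ m is an assumption (it is not on the data sheet), and the function names are hypothetical. The sample inputs are taken from the answers section: a geostationary period of 23.93 h (chapter 3, question 10) and Rigel's magnitudes m = 0.18, M = −6.69 (chapter 15, question 31).

```python
import math

G = 6.67e-11          # universal gravitational constant, N m^2 kg^-2 (data sheet)
M_EARTH = 6.0e24      # mass of Earth, kg (data sheet)
R_EARTH = 6.38e6      # radius of Earth, m (assumed; not on the data sheet)


def orbital_radius(period_s, central_mass_kg=M_EARTH):
    """Solve r^3 / T^2 = G M / (4 pi^2) for the orbital radius r, in metres."""
    return (G * central_mass_kg * period_s ** 2 / (4 * math.pi ** 2)) ** (1 / 3)


def distance_from_magnitudes(m, M):
    """Solve M = m - 5 log10(d / 10) for the distance d, in parsecs."""
    return 10 ** ((m - M + 5) / 5)


if __name__ == "__main__":
    period = 23.93 * 3600  # geostationary period in seconds
    altitude_km = (orbital_radius(period) - R_EARTH) / 1000
    print(f"Geostationary altitude: about {altitude_km:.0f} km")
    print(f"Distance to Rigel: about {distance_from_magnitudes(0.18, -6.69):.0f} pc")
```

With these inputs the altitude lands near the 35 800 km quoted in the glossary, and the distance to Rigel comes out near the 237 pc listed in the answers.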

Group
1 18

Period 1 2
1 H He
1.008 4.003
Hydrogen 2 13 14 15 16 17 Helium

3 4 Atomic number 5 6 7 8 9 10

2 Li Be Symbol B C N O F Ne
6.941 9.012 Atomic weight 10.81 12.01 14.01 16.00 19.00 20.18
Lithium Beryllium Name Boron Carbon Nitrogen Oxygen Fluorine Neon

11 12 13 14 15 16 17 18
3 Na Mg Al Si P S Cl Ar
22.99 24.31 26.98 28.09 30.97 32.07 35.45 39.95
Sodium Magnesium 3 4 5 6 7 8 9 10 11 12 Aluminium Silicon Phosphorus Sulfur Chlorine Argon

19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
4 K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr
39.10 40.08 44.96 47.87 50.94 52.00 54.94 55.85 58.93 58.69 63.55 65.39 69.72 72.61 74.92 78.96 79.90 83.80
Potassium Calcium Scandium Titanium Vanadium Chromium Manganese Iron Cobalt Nickel Copper Zinc Gallium Germanium Arsenic Selenium Bromine Krypton

37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
5 Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe
85.47 87.62 88.91 91.22 92.91 95.94 [98] 101.1 102.9 106.4 107.9 112.4 114.8 118.7 121.8 127.6 126.9 131.3
Rubidium Strontium Yttrium Zirconium Niobium Molybdenum Technetium Ruthenium Rhodium Palladium Silver Cadmium Indium Tin Antimony Tellurium Iodine Xenon

55 56 57–71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86
6 Cs Ba * Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
132.9 137.3 178.5 180.9 183.8 186.2 190.2 192.2 195.1 197.0 200.6 204.4 207.2 209.0 [209] [210] [222]
Caesium Barium Lanthanides Hafnium Tantalum Tungsten Rhenium Osmium Iridium Platinum Gold Mercury Thallium Lead Bismuth Polonium Astatine Radon
APPENDIX 2: Periodic table

87 88 89–103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
7 Fr Ra ** Rf Db Sg Bh Hs Mt Ds Rg Uub Uuq Uuh Uuo
[223] [226] [261] [262] [266] [264] [265] [268] [271] [280] [272] [285] [289] [294]
Francium Radium Actinides Rutherfordium Dubnium Seaborgium Bohrium Hassium Meitnerium Darmstadtium Roentgenium Ununbium Ununquadium Ununhexium Ununoctium

*Lanthanide series
57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu
138.9 140.1 140.9 144.2 [145] 150.4 152.0 157.3 158.9 162.5 164.9 167.3 168.9 173.0 175.0
Lanthanum Cerium Praseodymium Neodymium Promethium Samarium Europium Gadolinium Terbium Dysprosium Holmium Erbium Thulium Ytterbium Lutetium

**Actinide series
89 90 91 92 93 94 95 96 97 98 99 100 101 102 103
Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
[227] 232.0 231.0 238.0 [237] [244] [243] [247] [247] [251] [252] [257] [258] [259] [262]
Actinium Thorium Protactinium Uranium Neptunium Plutonium Americium Curium Berkelium Californium Einsteinium Fermium Mendelevium Nobelium Lawrencium

• For elements with no stable nuclides, the mass of the longest living isotope is given in square brackets.
• The atomic weights of Np and Tc are given for the isotopes 237Np and 99Tc.
APPENDIX 3: Key words for examination questions
HSC syllabus documents and examination questions use the following key words that state what students
are expected to be able to do.

Account Account for: state reasons for, report on. Give an account of: narrate a series of
events or transactions
Analyse Identify components and the relationship between them; draw out and relate
implications
Apply Use, utilise, employ in a particular situation
Appreciate Make a judgement about the value of
Assess Make a judgement of value, quality, outcomes, results or size
Calculate Ascertain/determine from given facts, figures or information
Clarify Make clear or plain
Classify Arrange or include in classes/categories
Compare Show how things are similar or different
Construct Make; build; put together items or arguments
Contrast Show how things are different or opposite
Critically (analyse/evaluate) Add a degree or level of accuracy, depth, knowledge and understanding, logic,
questioning, reflection and quality to (analysis/evaluation)
Deduce Draw conclusions
Define State meaning and identify essential qualities
Demonstrate Show by example
Describe Provide characteristics and features
Discuss Identify issues and provide points for and/or against
Distinguish Recognise or note/indicate as being distinct or different from; note differences
between
Evaluate Make a judgement based on criteria; determine the value of
Examine Inquire into
Explain Relate cause and effect; make the relationships between things evident; provide why
and/or how
Extract Choose relevant and/or appropriate details
Extrapolate Infer from what is known
Identify Recognise and name
Interpret Draw meaning from

Investigate Plan, inquire into and draw conclusions about
Justify Support an argument or conclusion
Outline Sketch in general terms; indicate the main features of
Predict Suggest what may happen based on available information
Propose Put forward (for example a point of view, idea, argument, suggestion) for
consideration or action
Recall Present remembered ideas, facts or experiences
Recommend Provide reasons in favour
Recount Retell a series of events
Summarise Express, concisely, the relevant details
Synthesise Put together various elements to make a whole
© Board of Studies NSW, 2003

ANSWERS TO NUMERICAL QUESTIONS
CHAPTER 1
4.
    g on surface (m s⁻²)    Weight of 80 kg person there (N)
    3.7                     296
    8.9                     712
    1.8                     143
    1.3                     101
5. (a) 0.124
   (b) 0.515
   (c) 0.904
   (d) 0.466
11. −8.59 × 10⁹ J
12. (a) −7.4 × 10³⁰ J
    (b) −3.24 × 10³⁵ J

CHAPTER 2
8. 3.14 m
9. (a) 2.39 m
   (b) 6.14 m
10. (a) 6.2 s
    (b) 108.5 m
11. Yes
12. (a) 56 500 m
    (b) 57 400 m
    (c) 56 500 m
13. 3390 m
14. 115 000 m
17. Mercury: 4250 m s⁻¹; Venus: 10 400 m s⁻¹; Io: 2550 m s⁻¹; Callisto: 2470 m s⁻¹
18. (a) a = 60 m s⁻², g = 7.1
    (b) a = 69.6 m s⁻², g = 8.1
19. (a) a = 2.7 m s⁻², g = 1.3
    (b) a = 83 m s⁻², g = 8.5

CHAPTER 3
2. F = 31 N, a = 78 m s⁻²
3. 19 400 N
5. (a) 28 400 km h⁻¹
   (b) 85 min
   (c) 2.44 N
6. (a) 28 050 km h⁻¹
   (b) 88.1 min
   (c) 9.25 m s⁻² towards Earth’s centre
   (d) 1 020 000 N
7. Mercury: 0.244 Earth years; Venus: 0.619 Earth years; Mars: 1.89 Earth years; Jupiter: 11.9 Earth years; Saturn: 29.4 Earth years
10. Low Earth: 360.0, 1.53, 7686
    Geostationary: 35 800, 23.93, 3070

CHAPTER 4
4. (a) 3.59 × 10²¹ N
   (b) 4.17 × 10²³ N
5. 1.48 × 10⁻¹⁰ N
6. (a) 710 N
   (b) 650 N
7. (a) Satellite:
       orbital velocity, 7721 m s⁻¹
       centripetal force, 1.21 × 10⁴ N
       gravitational force, 1.21 × 10⁴ N
   (b) Venus:
       orbital velocity, 3.52 × 10⁴ m s⁻¹
       centripetal force, 5.6 × 10²² N
       gravitational force, 5.5 × 10²² N
   (c) Callisto:
       orbital velocity, 8186 m s⁻¹
       centripetal force, 3.9 × 10²¹ N
       gravitational force, 3.9 × 10²¹ N

CHAPTER 5
12.
    Star        Distance (light-years)    Distance (parsecs)    Distance (km)
    Canopus     75                        23                    7.1 × 10¹⁴
    Rigel       900                       276                   8.5 × 10¹⁵
    Arcturus    32.6                      10                    3.1 × 10¹⁴
    Hadar       326                       100                   3.1 × 10¹⁵
15. (a) 0.745c (b) 53.4 m
16. 3479.99999998 km
17. 0.99c
18. 0.99999994c
19. Pluto: 15 min; Proxima Centauri: 69 days; Sirius: 141 days; Alpha Crucis: 23.36 years; Andromeda: 100 717 years
20. (a) 0.866c (b) 1 × 10⁵ kg
    (c) 2 s
21. (a) 28.6 m
    (b) 28 min 59 s
    (c) 3.14 × 10⁵ kg
25. (a) 1.506 × 10⁻¹⁰ J
    (b) 5.9819 × 10⁻¹⁰ J
    (c) 1.7939 × 10⁻⁹ J
    (d) 4.5 × 10¹¹ J
    (e) 9 × 10¹⁶ J



CHAPTER 6
11. (a) 6.8 × 10⁻³ N, down the page
    (b) 1.5 × 10⁻⁴ N, out of the page
12. (a) 12.5 T
13. (a) 6.0 × 10⁻² N
14. 1.8 × 10⁻² N
15. (a) 1.2 × 10⁻⁵ N
16. (a) 1.3 × 10⁻⁵ N
    (b) 3.2 × 10⁻⁶ N
    (c) 5.7 × 10⁻⁵ N
19. (b) 0.34 N
    (c) 2.7 × 10⁻² m²
    (d) 3.6 × 10⁻² N m
20. (a) 0.98 N, upwards

CHAPTER 7
3. (a) 3.0 Wb
   (b) 2.3 × 10⁻² Wb
   (c) 6.0 × 10⁻⁶ Wb
   (d) 0
11. (a) 1.4 × 10⁻³ Wb
12. (a) 6.3 × 10⁻³ Wb
    (b) Would be 25 times greater
15. (a) 48 A
    (b) 0.6 A
16. (a) 24 A
    (b) 220 V

CHAPTER 8
11. (b) 64
12. 16 V
13. (a) 2.0 V −4; 6.0 V −12
    (b) 2.5 A
14. (a) 400 V
    (b) 200 W
    (c) 200 W
    (d) 10 A
15. (b) 3 (increase)
16. (a) 26
    (b) 26 A
17. 0.020
18. (a) 1.3 V
    (c) 0.8 A
20. (a) 0.24 A
    (b) 0.096 V
    (c) 500 kV
    (d) 2.3 × 10⁻² W
21. (a) 7.0 × 10⁻²
    (b) 220 MW
    (c) 6.7 × 10² A
22. (a) 5.0 A
    (b) 400 W
    (c) 3.9 × 10³ V

CHAPTER 10
1. 2.4 × 10⁻¹³ N
2. 8.6 × 10⁻¹⁶ N
3. 1.7 × 10⁻¹³ N
4. 1.9 × 10¹⁷ m s⁻¹
5. (a) 4.0 × 10³ V m⁻¹ left
   (b) 6.4 × 10⁻¹⁶ N right
   (c) 6.4 × 10⁻¹⁶ N left
   (e) 3.2 × 10⁻¹⁷ J each
8. 2.0 × 10⁻⁷ N
11. 4.8 × 10⁻¹⁹ C
12. (a) 2.00 × 10² V m⁻²
    (b) 0.40 N
13. 0.488 N
21. (a) 3.00 × 10⁴ V m⁻¹
    (b) 3.00 × 10⁶ m s⁻¹
    (c) 4.8 × 10¹⁵ N

CHAPTER 11
1. (a) 6.0 × 10¹⁴ Hz
   (b) 4.0 × 10⁻¹⁹ J
   (c) 2.5 × 10¹⁴
2. (a) 5.6 V
   (b) 3.7 × 10⁻¹⁸ J
   (c) 5.6 × 10¹⁵ Hz
6. (c) 4.2 × 10¹⁴ Hz
   (d) 6.6 × 10⁻³⁴ J s
   (e) 2.8 × 10⁻¹⁹ J
9. 3.07 × 10⁻¹⁹ J
10. 55
11. 1.33 (number of red photons per second / number of blue photons per second)
12. (a) 2.6 × 10⁻¹⁹ J
    (b) 2.5 V
    (c) 6.9 × 10⁻¹⁹ J
14. (a) 2.86 m
    (b) 6.95 × 10⁻²⁶ J
17. 8.0 × 10⁻¹⁵ J

CHAPTER 12
1. 1.5 × 10⁻⁹ m

CHAPTER 13
5. (a) 0.014 W
   (b) 3.0 × 10⁻⁵ V
6. 0.15 A

CHAPTER 14
9. (a) 2.1 arcsec
   (b) 2.1 arcsec
   (c) 1.1 arcsec
   (d) 0.53 arcsec
   (e) 0.035 arcsec
   (f) 0.013 arcsec



10. (a) 0.53 arcsec
    (b) 0.63 arcsec
    (c) 0.74 arcsec
11. (a) 1.3 × 10⁵ arcsec
    (b) 630 arcsec
    (c) 210 arcsec
    (d) 90 arcsec
    (e) 32 arcsec
    (f) 6.3 arcsec
    (g) 0.42 arcsec
12. (a) 4.2 arcsec
    (b) 4200 arcsec
    (c) 4.2 × 10⁶ arcsec
13. (a) m = 40 × magnification, R = 0.46 arcsec
    (b) m = 100 × magnification, R = 0.46 arcsec
14. (a) 1.3 × 10⁴ m²
    (b) 0.6 arcsec
15. (a) 1128 m
    (b) 0.02 arcsec
16. (a) 0.007 arcsec
    (b) 0.002 arcsec

CHAPTER 15
1.
                       km               AU               l-y               pc
    1 km =             1                6.685 × 10⁻⁹     1.057 × 10⁻¹³     3.2408 × 10⁻¹⁴
    1 AU =             1.49 × 10⁸       1                1.5813 × 10⁻⁵     4.848 × 10⁻⁶
    1 light year =     9.4605 × 10¹²    6.324 × 10⁴      1                 0.3066
    1 parsec =         3.086 × 10¹³     206 265          3.2616            1
4. (a) 44.05 pc
   (b) 98.04 pc
   (c) 19.96 pc
   (d) 28.49 pc
   (e) 5.144 pc
   (f) 190 pc (to 2 significant figures)
   (g) 11.2 pc
   (h) 1.82 pc
   (i) 160 pc
   (j) 130 pc
5. (a) 0.0633 arcsec
   (b) 0.097 arcsec
   (c) 0.00422 arcsec
   (d) 0.38 arcsec
   (e) 0.0134 arcsec
   (f) 0.00763 arcsec
   (g) 0.0104 arcsec
   (h) 0.0775 arcsec
   (i) 0.001 arcsec
   (j) 0.13 arcsec
6. (a) 51.5 light-years
   (b) 33.6 light-years
   (c) 773 light-years
   (d) 8.5 light-years
   (e) 243 light-years
   (f) 427.6 light-years
   (g) 313 light-years
   (h) 42.1 light-years
   (i) 3200 light-years
   (j) 25 light-years
13. (a) 3000 K
    (b) 8000 K
    (c) 6000 K
14. (a) 7.25 × 10³ K
    (b) 9.37 × 10²⁶ W
26. (a) ≈ 4.2
    (b) ≈ 2.8
    (c) ≈ 3700
    (d) ≈ 2
    (e) ≈ 65
    (f) ≈ 56
27. ≈ 2.5 × 10²⁸
30. (a) ≈ 96
    (b) ≈ 98 pc
31.
    Star         m        M        d
    Rigel        0.18     −6.69    237
    Bellatrix    1.64     −2.72    74.5
    Capella      0.07     −0.48    12.9
    Sirius       −1.44    1.45     2.64
    Deneb        1.25     −8.73    991
    Altair       0.75     2.2      5.14
    Achernar     0.45     −2.77    44.1
    Spica        0.98     −3.55    81



32.
Star          Parallax (mas)   Distance (pc)   m        M
Fomalhaut     130.08           7.69            1.17     1.74
Vega          128.93           7.76            0.03     0.58
Canopus       10.43            95.9            −0.62    −5.5
Betelgeuse    7.63             131             0.45     −5.1
Rigil Kent    742.12           1.35            −0.01    4.3

35. Fomalhaut: 10 pc; Vega: 8 pc
36. 75 pc
41. Based on the colour index, Aldebaran is a red star of spectral class K with a surface temperature of approximately 3500 K.
42. Based on the colour index, Spica is a blue-white star of spectral class B with a surface temperature of approximately 15 000 K.

CHAPTER 16
5.
Total mass of system (kg)    Total mass of system (solar masses)
$4.87 \times 10^{30}$        2.45
$1.16 \times 10^{31}$        5.81
$3.54 \times 10^{30}$        1.78
$5.15 \times 10^{31}$        25.9
$9.17 \times 10^{31}$        46.1
$6.80 \times 10^{30}$        3.42
$1.61 \times 10^{31}$        8.10
$2.53 \times 10^{31}$        12.7
$3.78 \times 10^{30}$        1.90
$4.08 \times 10^{30}$        2.05
6. (a) $1.05 \times 10^{32}$ kg
(b) $7.16 \times 10^{9}$ m

CHAPTER 18
5. $1.71 \times 10^{6}$ kg m$^{-2}$ s$^{-1}$
6. $1.01 \times 10^{3}$ kg m$^{-3}$ (1.01 g cm$^{-3}$)
7. 330 m s$^{-1}$
8. (a) (i) $1.63 \times 10^{6}$ kg m$^{-2}$ s$^{-1}$
(ii) $6.53 \times 10^{6}$ kg m$^{-2}$ s$^{-1}$
(b) 3:2
9. 0.000319
10. (d) 1.74 mW cm$^{-2}$
(e) 79.12 mW cm$^{-2}$
14. 0.16 mW cm$^{-2}$
15. (b) $1.56 \times 10^{6}$ kg m$^{-2}$ s$^{-1}$
(c) 1300 m s$^{-1}$
18. (c) 18 cm
19. (a) $4.5 \times 10^{-4}$ s

CHAPTER 20
6. (a) 4.0 minutes
(b) 8.0 minutes

CHAPTER 21
4. (c) 1.004 T

CHAPTER 22
1. (a) $9.496 \times 10^{-8}$ m
(b) $4.341 \times 10^{-7}$ m
(c) $1.282 \times 10^{-6}$ m
2. (a) $3.889 \times 10^{-7}$ m, $3.798 \times 10^{-7}$ m, $3.751 \times 10^{-7}$ m
3. (a) $2.1 \times 10^{-10}$ m
(b) $4.8 \times 10^{-10}$ m
(c) $8.5 \times 10^{-10}$ m
5. 10
6. (a) $1.22 \times 10^{-7}$ m, $1.03 \times 10^{-7}$ m
(b) $6.57 \times 10^{-7}$ m, $4.87 \times 10^{-7}$ m
(c) $1.88 \times 10^{-6}$ m, $1.28 \times 10^{-6}$ m
7. (a) $7.65 \times 10^{-6}$ m
(b) $2.30 \times 10^{-6}$ m
8. (a) ∞
(b) $9.12 \times 10^{-8}$ m, $3.65 \times 10^{-7}$ m, $8.22 \times 10^{-7}$ m
(c) 13.6 eV



9. (a) (i) 1.8 eV
(ii) $4.35 \times 10^{14}$ Hz, $6.89 \times 10^{-7}$ m
(b) $1.41 \times 10^{-7}$ m
16. (b) $7.3 \times 10^{14}$ Hz

CHAPTER 23
7. (a) $2.43 \times 10^{-11}$ m
(b) $2.73 \times 10^{-22}$ m s$^{-1}$
8. (a) $2.86 \times 10^{-14}$ m
(b) $2.02 \times 10^{-13}$ m

CHAPTER 24
11. 4.95 MeV
12. 17.3 MeV
13. (a) 1.19 MeV absorbed
14. 8.6 MeV
15. 25.7 MeV

CHAPTER 25
12. 3.27 MeV
14. (i) 511 keV

CHAPTER 26
1. (a) 40 times or 20 orbits
(b) $3.9 \times 10^{6}$ m s$^{-1}$
(c) 3.2 cm



INDEX
A-scans (ultrasound) 348–9 Balmer’s equation 424
absolute magnitude 292 band structures 231
absorption spectra 284–5, 303, 425, 426 doping, and 219–20
AC electric motors semiconductors, in 216–19
energy transformations and transfers 169–70 solids, in 213
induction motors 165–9 baryons 504, 505, 508
main features 164 Becquerel, Henri 419, 420, 454
universal motor 164–5 Becquerel’s predicament 420
AC electricity beta decay
household use 155 Fermi explanation 462–3
versus DC 147–8 problems of 461–2
AC generators 142–3 beta particles 383
AC induction motors 165–9, 172 deflection by a magnetic field 455
operation 168–9 distribution of energy 462
power 169 penetrating power 454
slip speed 169 properties 383, 456
squirrel-cage rotor 167–8 Bethe, Hans 484
stator of three-phase 166–7 binary stars 306
structure 166–8 astrometric binaries 310
acceleration, rocket lift-off 24–7 eclipsing binaries 308–9, 318–19
acceleration due to gravity 3–4, 6, 65 mass–luminosity relationship 311
pendulum determination 11 spectroscopic binaries 309–10, 319
variations 4–5 visual binaries 306–8
weight values in the solar system 12 binding energy 469–70
acceleration equations 15–16 black body radiation 199–201, 281–2
acoustic impedance 344–5 black hole 334
active optics 267 blood flow measurement by ultrasound 352–5
adaptive optics 267–9 Bohr, Niels 449
aether model 72–4, 75 periodic table explanation 447–8
agricultural uses of radioisotopes 491–2 principle of complementarity 448, 451
air resistance 22–3 views on atomic bomb 483
alpha particles 382 Bohr equation 432
deflection by magnetic field 455 Bohr’s model of the atom 283, 423–7, 433
penetrating power 454 de Broglie explanation of Bohr’s electron orbits 446–7
properties 383, 456 energies of ‘stationary states’ 431–2
alpha particles scattering experiments 458 limitations 434
Geiger and Marsden 420–1, 422 mathematics of 429–34
Rutherford and Becquerel 419–20 postulates 427–8
Anglo-Australian Telescope 259, 264, 268 quantum theory to explain hydrogen spectrum 424
angular momentum 428, 465 radii of ‘stationary states’, hydrogen atom 430
annual parallax 276, 277 ‘stationary states’ of electrons 428
precision 302–3 bone density and ultrasound 351–2
anode 175 bone imaging 390
antineutrino 463, 464 Born, Max 446, 448, 449
antiprotons 502 bosons 504, 509–10, 512
apparent magnitude 291 Bragg’s experiment 238–9
armature 5 Bragg’s Law 238, 239
artificially induced radioactivity 458, 459–60 Bragg’s X-ray diffraction studies 237–8
artificially induced transmutations 458 brain, imaging studies 390–1, 410, 411
astrometric binaries 310 breathalysers 208
astrometric satellites 278 Bremsstrahlung radiation 366
astrometry 275–8 brightness
atomic bomb development 480–4 measurement 289
atomic masses, light nuclides 470 stars 290–1
atomic models brightness ratios, stars 290–1
Bohr’s 423–7, 427–8, 430–1, 432, 434 brushes 111
quantum theory steps 444–51 BCS theory 243
Rutherford’s 419–23, 429–30 bubble chambers 497
attenuation of a signal 367
Australian Telescope Compact Array 272–3 carbon–nitrogen–oxygen (CNO) cycle 326–7
average binding energy per nucleon 469 cathode 175
cathode ray oscilloscope (CRO) 187–8
B-scans (ultrasound) 349 cathode ray tubes 175, 176
back emf in motors, and Lenz’s Law 129–30 component parts 186

cathode rays DC generators 144–5
applications 186–8 de Broglie, Louis
charge-to-mass ratio 183 explanation for Bohr’s electron orbits 446–7
discovery 175–6 matter waves 444–6
electric field effects 177–82 wave model of electrons 215–16, 441
magnetic field effects 182 de Broglie wavelength 444–6
Thomson’s experiments 180, 183, 184–5 de Forest, Lee 221
waves or particles 184–5 deflecting plates 186
causality, principle of 85 density, stars 289
centripetal acceleration 40 de-orbiting 50
centripetal force 39, 41 de-orbit manoeuvre 51
Chadwick, James 460–1 DEXA (Dual Energy X-ray Absorptiometry) 352
identification of neutron 460–1 diffraction 235, 441, 445
Chandra X-Ray Observatory 268, 269 electrons 446
charge-to-mass ratio of cathode rays 183 explanation 442–3
Chernobyl nuclear accident 488–9 X-rays 235–8
classical physics 202 diffraction grating 235, 236, 443
photoelectric effect 204–5 diffusion cloud chamber 518
cloud chambers 496–7, 518–20 diodes 220, 221, 222
CNO cycle 326–7 Dirac equation 451
coherent circular waves 233 discharge tubes 175, 176, 191
coherent light 443 everyday uses 176
coherent optic fibre bundle 374, 375 distance modulus 292–3
coiled conductor dopant 217
induced currents in 126, 137 doping, and band structures 219–20
using a moving magnet in 125 Doppler effect 288, 352–3
colliders 502, 512 Doppler ultrasound
colour filters 303–4 blood flow measurement 352–5
colour index, stars 297 choosing the best signal 354–5
colour magnitudes, stars 296
practice, in 353–4
colour measurement, stars 295
colour television 186–7
Earth’s gravitational field 3–12
commutators 109, 111
review 10
Compton Gamma Ray Observatory 268, 269
Earth’s rotational motion 31–2
computed axial tomography see CT scans
eclipsing binaries 308–9, 318–19
conductors 213–16, 239
resistivity 217 eddy currents
continuous spectra 280–2, 303, 425 heat losses in transformers 151
contrast (image) 409 magnetic fields, and 131–2
Coolidge X-ray tube 235 switching devices, in 132
Cooper pairs 244–5 Edison, Thomas 147–8, 221
covalent bonding 218 Eightfold Way 504
critical mass 482 Einstein, Albert 75–6, 85, 91, 205, 206, 423, 424, 441
Crookes, William 184 photoelectric equation 206
crystal lattice structure of metals 239–40 theory of relativity 76
crystalline substances 347 electric chair 147–8
crystals, X-ray diffraction 236, 236–8 electric field strength 178
CT scans 368–73 electric fields, effect on cathode rays 177–82
diagnostic tool, as 372–3 electric motors, DC 109–14, 121
production 369–71 electric power generating stations 146
Curie, Irène 459 electrical resistance
current-carrying conductor see also parallel current-carrying low temperature effects 241–2
conductor superconductors, in 246, 254
magnetic field 103–4, 120, 131, 137–8 electricity
magnitude of the force on 104–5 AC/DC 147–8, 155
right-hand push rule 104 society, and 156
cyclotrons 385, 499–500 electricity production, nuclear fission reactor 487
electromagnetic braking 132
Davisson, Clinton 446 electromagnetic force, unification of 506
DC electric motors 109–14 electromagnetic induction 123, 126–7
anatomy 109–10 electromagnetic levitation 249
calculating torque of a coil 113–14 electromagnetic spectrum 195
changing speed 112 atmospheric absorption 258–60
commutators 111 components 258
magnetic field 112 electromagnetic waves 185, 236
model 121 Maxwell’s theory 194–5
operation 110–11 electromagnets 103, 140, 403
DC electricity, versus AC 147–8 electron gun 186

electronics, superconductor applications 247–8 Geiger, Hans 420–1, 422
electrons 184, 419, 503 see also cathode rays Gell-Mann, Murray 504, 505, 507
charge 181 generators 140–5
de Broglie wave model 215–16, 441 AC 142–3
diffraction 446 current direction 143–4
excited state 433 DC 144–5
ground state 433 hand-operated, output 160
magnetic field effects 182 magnetic flux and emf variation 141–2
positron interactions 393 power stations 146
protons in close proximity, and 461 geostationary orbit 47
Rutherford atomic model, in 423 geosynchronous orbit 46
spin 465 germanium, for semiconductors 218–19
stationary states 428 Germer, Lester 446
superconducting state, in 243 gluons 509–10
electrostatic forces, nucleons 467 gradient magnetic field 406
elementary charge 181 gravitational attraction, and satellite motion 62–4
elements, naturally occurring 489 gravitational collapse 322–4
elliptical orbits 45–7 gravitational field vector g 3–4
emission spectra 282–4, 303, 425, 426 variations
empirical equation 424 altitude, with 4–5
endoscopes geographical location, with 4
medical diagnosis, in 373–7 planetary body, with 4
operation 375 gravitational fields 3–5, 65–6
structure 374–5 weight and 6
usage 376–7 gravitational forces, nucleons 467
energy, and mass 88–9 gravitational potential energy 7–9
energy bands 213, 214, 224
energy transformations and transfers 169–70 hadrons 504, 508
escape velocity 23–4 Hahn, Otto 478
exclusion principle (Pauli) 450, 504 hair dryer 170
expansion cloud chamber 518 half-life 383–4, 477
extrinsic semiconductor materials 217 Hallwachs, Wilhelm 203
extrinsic semiconductors 219–20 hard X-rays 366
extrinsic variables 312 heart muscle, imaging studies 389–90
heavy elements synthesis, stars 332
Faraday, Michael 123 Heisenberg, Werner 448, 451
electromagnetic induction 123, 126–7 uncertainty principle 450
first experiments 123–4 work on German atomic bomb project 483
iron ring experiment 124–5 helium flash 328
motor effect 103–4 Henry, Joseph 123
using a moving magnet 125–6 Hertz’s experiments with radio waves 196–8
Faraday’s Law of Induction 127, 149 Hertzsprung–Russell diagrams 287, 324, 328, 330, 335
fault current limiter (FCL) 247 Higgs boson 512
Fermi, Enrico 504 HIPPARCOS Catalogue 278, 302
explanation of beta decay 462–3 Hounsfield, Godfrey N. 369
neutron bombardment of uranium 476, 477–8 Hubble Deep Field 256
fermions 504, 509 Hubble Space Telescope (HST) 260, 268
field vector g 3–4 Huygens’ Principle 442
fissile nucleus 484 hydrogen atom 422
fixed target accelerators 502 ‘classical’ energy 429–30
fluorescence 175 energies of ‘stationary states’ 431–2
frequency 341 quantum mechanics perspective 450
ultrasound 342–3 radii of ‘stationary states’ 430
Frisch, Otto 478–9, 480 spectral lines explanation 432–4
hydrogen fusion mechanisms (main sequence stars) 324–5
g forces 27–30 carbon–nitrogen–oxygen cycle 326–7
decelerating 53–4 proton–proton chain 325
variations during rocket launch 30–1 hydrogen protons
Galileo’s telescopes 257 external magnetic field, in 400, 403
galvanometer 114, 123 Larmor frequency 405
gamma camera 388–9 hydrogen spectrum 424, 437–9
gamma radiation 383, 386–7, 456 quantum ideas 424–5
deflection by magnetic field 455 theoretical expression for wavelengths 432–3
ionising power 455
penetrating power 454–5 incandescent light 280
gases, spectra 425–6 induced currents
Geiger counter 421 coiled conductor, in 126, 137

direction 138 current-carrying conductor 103–5, 120, 131, 137–8
linked coils 137–8 current-carrying solenoid, around 102
induction 126 DC electric motors 112
induction heating 133 direction around a solenoid 103
industrial uses of radioisotopes 491 eddy currents, and 131–2
inertial frames of reference 74–5 effect on orientation of nuclei 403, 404–5
insulating transmission lines 155 hydrogen protons, and 400, 403
insulators 213–16 radioactive emissions deflection by 455
integrated circuits (ICs) 225, 227–9 review 101–3
interference 233–5, 442–3 rotating coils in 128
interferometry 265–6 superconductors, and 245
interstellar dust 321, 322 magnetic flux 126–7
interstellar gas 321 variation in generator coil 141–2
interstellar medium 321–4 magnetic flux density 126
intrinsic semiconductor materials 217 magnetic resonance imaging see MRI
intrinsic semiconductors 219 magnitude of stars 290–2
intrinsic variables 313 main sequence stars 324
iodine-123 386, 389 carbon–nitrogen–oxygen (CNO) cycle 326–7
iodine-131 386 hydrogen ‘burning’ 324–5
ionisation blackout 50 proton–proton chain 325
ionising power, radiation 455 transition to red giants 329–30
isotopes 382, 457 see also radioisotopes Manhattan Project 480, 481–4
first nuclear reactor 481–2
Joliot, Frédéric 459 physicists’ views 483–4
Josephson junction 246 research at Los Alamos 482–3
Marconi’s radio wave experiments 198
kaons 504 Marsden, Ernest 420–1, 496
Keck telescopes 269 mass
Kelvin scale of temperature 241 energy, and 88–9
Kepler’s Law of Periods 41–2, 43, 46, 60, 62, 307, 309 relativity of 85–9
constant derivation 61 mass defect 468–71
Kunsman, Charles 446 mass dilation 87
mass energy 89
Lagrangian point 49 mass–luminosity relationship 311
large-scale integrated circuits (LSI) 227 matter waves (de Broglie) 444–6
Larmor frequency 404–5 confirmation of 446
lattice structures 218 maxima 234
doping effects 219–20 Maxwell’s theory of electromagnetic waves 194–5
metals 239–40 medical cyclotron 385
Law of Conservation of Momentum 25, 68 medical diagnosis
Law of Universal Gravitation 3–4, 42, 61–5, 70 CT scan use 368–73
Lenard, Philipp von 203–4 endoscopy use 373–7
length, relativity of 81–4 MRI use 248, 399–500, 410–11
length contraction 84 PET scans 392–4, 395, 490
lenses radioisotope use 382, 384, 386, 387–91, 489–90
light-gathering ability 272 SQUID (Superconducting Quantum Interference
magnification, and 262–3 Device) 248
Lenz’s Law 128–30, 144 superconducting magnets use 248
Principle of Conservation of Energy, and 129 ultrasound use 342–6, 351–2
production of back emf in motors, and 129–30 X-ray use 366–8
leptons 504, 506, 507–8 medical imaging
lift-off (rockets) 24–32 combined techniques 394–5
light transmission by optical fibres 373, 380 comparison of techniques 412–14
lightning protection 154, 179 medium 68
linear accelerators 499 Meissner effect 245, 253–4
Los Alamos Laboratory 482–3 Meitner, Lise 478–9
loudspeakers 115 mesons 504, 505, 508
low altitude polar orbit 49 metal lattice 214
Low Earth orbit 49 metals
luminosity classes, stars 287 crystal lattice structure 239–40
lungs, imaging studies 390, 391 superconductors 242
metastable nucleus 383
maglev trains 248–9 metre, definition 77
magnetic field lines 101–3 MeV 393, 459
magnetic fields Michelson–Morley experiment 72–4, 233
cathode ray effects 182 modelling 96–7
charged particles in 102, 131 Millikan’s oil drop experiment 181–2

minima 234 nucleus
motor effect 103–5, 120 binding energy 469–70
MRI energy from 475–6
image and the patient 399–400 Larmor frequency 404–5
magnets in the body 400 mass defect 468–71
medical uses 410–11 net spin 400, 401
MRI machine precession 404, 405
application of radio frequency pulses 405–6 nuclides 457
contrast in images 409, 410 atomic masses 470
distinguishing one type of hydrogen compound from
another 408–9 Oliphant, Sir Mark 475–6, 480
effect on atoms in the patient 402–10 optical fibres
effect on nuclei orientation in strong magnetic endoscopes, in 374, 375
field 403–5 light transmission 373, 380
precession 404–5 optics
relaxation time, measuring 409–10 active 267
removal of radio frequency pulses 407–9 adaptive 267–9
muons 503 orbit
elliptical 45–7
n-type semiconductors 219–20, 222 types of 47–9
naming stars 311 orbital decay 49–50
NASA’s ‘Great Observatories Program’ 268–9 orbital energy 44–5
naturally occurring elements 489 orbital motion 39–50
naturally occurring radioactivity 456–7 orbital velocity 42–4, 70
net spin of a nucleus 400, 401
neutrinos 503 p–n junction 222, 223–4
detection 463–4, 466 p-type semiconductors 219–20, 222
discovery of 461–6 parallactic ellipse 277–8
interaction with matter 464 parallax 275
properties 464–5 annual 276, 277, 302–3
recent discoveries 465–6 spectroscopic 293–5
types of 508 trigonometric 275
neutron scattering 492 parallel current-carrying conductor
neutron stars 334 forces between 105–7, 120–1
neutrons magnitude of the force 106–7
discovery 458–61 parallel plates, electric field between 177–9
in a nuclear reactor 484–6 parsec (parallax-second) 276–8
slow and fast 478 particle accelerators 250, 499–502, 512–13
Newton’s Law of Universal Gravitation 3–4, 42, 61–5 particle detectors 496–8
Newton’s Second Law of Motion 3, 6, 26, 40 modern detectors 497–8
Newton’s Third Law of Motion 25 particle masses 503
non-coherent optic fibre bundles 374, 375 particle physics
non-inertial frames of reference 71, 97–8 and cosmology 513
non-periodic variables 313 new particles 503
npn transistors 225 Standard Model 504–6
NSW electrical distribution system 153–4 timeline 513–15
nuclear atom (Rutherford model) 422 Pauli, Wolfgang
nuclear equations 457 exclusion principle 450, 504
nuclear fission prediction of neutrino 462, 503
discovery 476–80 quantum mechanics to hydrogen, application of 450
first observations 479 Peierls, Rudolf 480
Meitner and Frisch experiments 478–9 pendulum, to determine g 11
nuclear fission reactor 484–9 penetrating power, radiation 454–5
Chernobyl accident 488–9 period–luminosity relationship 315
control rods 486 periodic table, Bohr’s explanation 447–8
coolant 486 periodic variables 313–14
electricity production 487 periods, Law of 41–2
moderators 486 permanent magnets 103, 140, 403
neutrons in 484–6 PET scans 392, 395, 490
radioactive waste products 488 isotopes used 394
nuclear medicine 382, 385, 386, 387–91, 490 operation 393–4
nuclear physics, timeline 513–15 phase difference 351
nuclear power station 487 phase scans (ultrasound) 350–1
nuclear reactions, energy change 471 phosphorescent substances 454
nuclear reactor, first 481–2 photocells 207
nucleons photoconductive cells 208
gravitational and electrostatic forces 467 photocopier machine 180
strong nuclear force 466–8, 468 photoelectric effect 197, 202–8

applications 207–8 quantum physics, timeline 513–15
explanation 204–5 quantum theory 201, 202, 419, 423–4, 448–9
photoelectric equation 206 hydrogen spectrum 424–5
photoelectric photometry 298 model of the atom 444–51
photographic photometry 298 quarks 505–6, 507
photometry 289–98 colour properties 509
photons 201, 203–4, 283, 424 discovery of top quark 510–
phototubes 208
photovoltaic cells 207–8, 228–9 radiant energy 289
photovoltaic effect 207 radiation
piezoelectric effect 347 properties 383, 454–6
pions 503 types of 382–3
Planck, Max 200–1, 206, 283, 423, 424 radio aerials, operation 199
Planck’s constant 201 radio frequency pulses
Planck’s equation 201 application 405–6
planetary swing-by 66–9 removal 407–9
plutonium bomb 482 radio telescopes 260, 264, 265
pnp transistors 225 radio waves
Pogson scale 290 carrier waves and superimposed signal 221
pointed conductors 179 frequencies and 198
positron emission tomography see PET scans Hertz’s experiments with 196–8
positrons 393, 503 Marconi’s experiments 198
post-helium burning 331 producing and transmitting 211
potential difference 127–8 radioactive decay 382, 383–4, 457–8
moving charge through 178 radioactive waste products 488
power 149 radioactivity
AC induction motor 169 artificially induced 458, 459–60
power distribution 151–4 detection 456
NSW 153–4 early investigations 454
transformers to reduce power loss 152–3 naturally occurring 456–7
power generation, superconductor use 246–7 safety issues 392
power station generators 146
radioisotopes
power storage, superconductor use 247
advantages/disadvantages 395
power transmission lines see transmission lines
body organs, targeting 387–9
precession 404, 405
emitting gamma radiation 386–7
principle of complementarity 448, 451
half-life 383–4
Principle of Conservation of Energy 169
Lenz’s Law, and 129 industrial and agricultural applications 491–2
transformers, and 149–51 medical diagnosis 382, 384, 386, 387–91, 395, 489–90
principle of relativity 74–5 metabolising by the body 385
projectile motion 14–23 PET scans 392–4, 395, 490
modelling 35 production 385
projectiles 14 properties 382, 491
acceleration equations 15–16 radiopharmaceuticals 385, 390
air resistance 22–3 red giants 327–31
combined vertical and horizontal motions 19 main sequence transition to, evidence 329–30
horizontal motion 17–19 post-helium burning 331
maximum height 21 triple alpha reaction 331
range 22 re-entry (spacecraft) 50–3
trajectory 15 decelerating g forces 53–4
trip time 22 extreme heat 51–3
velocity 20–1 ionisation blackout 54
vertical motion 16–17 reaching the surface 54–5
proton–proton (p–p) chain 325 reflecting telescopes 261–2
protons 503 reflection of ultrasound 345–6
antiprotons, and 502 refracting telescopes 261
electrons in close proximity, and 461 relativistic space flight 89–91
energy levels 402 relativity see also special relativity
external magnetic fields, in 400 length, of 81–4
resonation 405 mass, of 85–9
protostar 323 principle of 74–5
pulsars 334 simultaneity, of 77–8
theory of 72, 81
quanta 423 time, of 78–81
quantum chromodynamics (QCD) 510 resistance see electrical resistance
quantum electrodynamics (QED) 510 resonation (protons) 405
quantum mechanics 445, 448–9 rest energy 89
development 449–51 rest frame 79

right-hand grip rule 102, 103, 128, 131 space–time continuum 76
right-hand push rule 104, 131, 143 spacecraft
rocket science pioneers 32 re-entry 50–5
rockets slingshot effect 66–9, 70
Apollo 10 launch 36–7 special relativity
Earth’s motion, effect on launch 31–2 consequences 77–92
g forces 27–30 constant speed of light 75–6
lift-off 24–7 inertial frames of reference 74–5
thrust and acceleration 26–7 space–time continuum 76
variations in acceleration and g 30–1 spectra
Röntgen, Wilhelm 185, 235, 454 absorption 284–5, 303, 425, 426
rotational velocity, stars 288–9 continuous 280–2, 303, 425
rotor 140 emission 282–4, 303, 425, 426
Rutherford, Ernest 185, 454, 496 gases, of 425–6
alpha particle scattering experiments 419–21 making 279–80
artificially induced transmutation 458 observing with spectroscope 427
energy from the nucleus 475, 476 spectral analysis, starlight 285–6
nuclear atom 421–2 spectral classes, stars 286
prediction of the neutron 458, 460 spectrophotometer 280
Rutherford model of the atom 419–23 spectroscope 279, 427
‘classical’ energy of hydrogen atom 429–30 spectroscopic binaries 309–10, 319
electrons in 423 spectroscopic parallax 293–5
mathematics of 429–34 spectroscopy 279–89
speed of light 75–6, 195
satellite motion, and gravitational attraction 62–4 faster than 85
satellites spin, electrons 465
orbital decay 49–50 Spitzer Space Telescope 268, 322
orbital velocity 42–4 split-metal ring 111
periods of 41–2 split-ring commutator 111
types of orbits 47–9 Square Kilometre Array (SKA) 266
S-Cam, the 264, 280 SQUID (Superconducting Quantum Interference
Schrödinger, Erwin, wave function theory 448, 450 Device) 248
scintillations 421 squirrel-cage rotor 167–8
sector scans (ultrasound) 350–1 Standard Model (particle physics) 503
seeing 261, 278 boson force-carriers 509–10
semiconductors 213–16 developments leading to 504–8
applications 228–9 particles 506–11
band structures 213 today and beyond 511–13
doping and band structure 219–20 Stanford Linear Accelerator Center (SLAC) 505
making 218–19 star birth
resistivity 217–18 gravitational collapse 322–4
Shockley, William 225, 231 interstellar medium 321–4
silicon star death 332–5
doping effect on lattice structure 219–20 star life
lattice structure 218 main sequence, after 327–31
semiconductors, for 218–19 main sequence stars 324–7
silicon chips 227–8 starlight, spectral analysis 285–6
simple harmonic motion 11 stars
simultaneity, relativity of 77–8 absolute magnitude 292
singularity 334 absorption spectra 285
slingshot effect 66–9, 70 apparent magnitude 291
slip speed 169 binary 306–11
sodium chloride 215, 236 class L stars 286
crystal structure 237 colour index 297
soft X-rays 366 colour magnitudes 296–7
solar cells 207–8, 228–9 colour measurement 295
solar system weight values and g 12 data 302
solenoid density 289
determining poles of 103 distance modulus 292–3
magnetic field around 102 evolutionary tracks 335
solid state devices 222–3, 227–8 heavy elements synthesis 332
versus thermionic devices 224–5 luminosity classes 287
sound waves 341–2 magnitudes 290–92
space exploration 32 mass–luminosity relationship 311
space shuttle measuring brightness and luminosity 289
engines 26 naming 311
re-entry 51 period–luminosity relationship 315

rotational velocity 288–9 time, relativity of 78–81
spectral classes 286 time dilation 79–81
spectroscopic parallax 293–5 torque 107–8
temperature 288 coil in DC motor, calculating 113–14
translational velocity 288 total internal reflection 373
variable 312–15 transformers 148–51
stars of five solar masses or less 333 AC input and output voltage 161
stars of more than five solar masses 333–4 eddy current heat losses 151
stationary states household use 155
electrons 428 Principle of Conservation of Energy, and 149–51
energies, Bohr hydrogen atom 431–2 reducing transmission line power loss 152–3
radii, Bohr hydrogen atom 430 simple 160–1
stator 109, 140 transistors 225–6, 231
three-phase induction motor 166–7 translational velocity, stars 288
Stefan’s Law 282 transmission lines
stellar birth 321–4 insulating 155
stellar object research 338 power losses 152–3, 162
stellar spectroscopy 286 protection from lightning 154
step-down transformer 149 superconductor use 246
step-up transformer 149 transmutations 456–7
stroboscope 15 artificially induced 458
strong magnetic fields 403–4 transuranic elements 476–7
strong nuclear force 466–8 trigonometric parallax 275
gluons and 509–10 triodes 221
properties 467–8 triple alpha reaction 331
Sudbury Neutrino Observatory 466 twins paradox 91–2
Super-Kamiokande neutrino detector 466
superconducting magnets 404 UBV system 296–7
superconductivity 240–42 ultrasound 341–57
applications 246–50
advantages/disadvantages 357
BCS theory 243–4
blood flow, measurement of 352–5
explanation 243–50
bone density, and 351–2
timeline 250
comparison with X-rays and CAT scans for
superconductors
diagnosis 372–3
critical temperatures 242
detecting structure inside body 343–6
levitation and Meissner effect 245, 253–4
magnetic field effects 245 history of use 348
resistance, and 246, 254 medical diagnosis, and 342–6, 357
temperature changes 244, 253 piezoelectric effect 347
tunnelling effect 246 reflection 344, 345–6
superluminal velocities 85 transmission 344
supernovae 334, 464 type of sound 341–2
supersaturated vapour 496 ultrasound scans 348–9
switching devices, eddy currents in 132 A-scans 348–9
synchrotrons 500–1 B-scans 349
medical uses 356
technetium-99m 386–7, 389, 390, 391 phase scans 350–1
telescopes 259–60 sector scans 350–1
advanced telescope technology 268–9 ultrasound transducer 346, 350
Galileo’s 257 ultraviolet catastrophe 200
improving performance 265–9 uncertainty principle (Heisenberg) 450
performance 262–4 uniform circular motion 39–41, 56, 58–9
reflecting 261–2 uniform electric fields 177–9
refracting 261 universal motor 164–5
theoretical resolution 263–4 uranium bomb 482
television 186–7
temperature, stars 288 valence bands 214
terminals 142 valves 220
thallium-201 389–90 vapour, supersaturated 496
theoretical resolution of telescopes 263–4 variable stars 312
theory of relativity 72, 85 variables
thermionic devices 220–1 Cepheids 315
versus solid state devices 224–5 extrinsic variables 312
Thomson, J. J. 180, 183, 184–5, 203 intrinsic variables 313
‘plum pudding’ model of the atom 419 non-periodic variables 313
three-phase power generation 146 period–luminosity relationship 315
thrust 24, 26 periodic variables 313–14
thyroid investigations 389 RR Lyrae 315

vector 61
vector field 3 X-radiation
velocity effect on the body 362–3, 365
projectiles 20–1 frequency 366
superluminal 85 X-ray diffraction 235–8
Very Large Array (VLA) 265–6 Bragg’s experiment 238–9
visual binaries 306–8 X-rays 236
visual magnitude 296 comparison to CT scans and ultrasound for
Von Laue’s diffraction experiment 236–7 diagnosis 372–3
voxels 407 CT scans, use in 368–71
definition 362
wave equation 341–2 diagnostic tool, as 366–8
wavefront 442 discovery and application 185
wavelength 341 imaging parts of the body 367–8, 390
weak nuclear force 506 production 363
weight 6 types of 365–6
Westinghouse, George 147–8 use and detection 363–5
Wien’s Law 281 Young’s ‘double slit’ experiment 234, 235, 442, 443
Wilson cloud chamber 518
work function 205 zero-age main sequence (ZAMS) 323

