
Savelyev - Physics - A General Course - Vol 2 - Mir

This document is the preface and table of contents for Volume II of Physics: A General Course by Igor Savelyev. The preface states that the volume covers electromagnetism, waves, and optics using the International System of Units. It was intended for higher technical schools and is the result of 25 years of work in the Department of General Physics at the Moscow Institute of Engineering Physics. The table of contents lists 21 chapters in three parts, covering topics such as electric and magnetic fields, conductors, electromagnetism, Maxwell's equations, motion of charged particles, electrical conduction, oscillations, waves, and optics.


I. V.

SAVELYEV

PHYSICS
A GENERAL COURSE
(In three volumes)

VOLUME II
ELECTRICITY
AND MAGNETISM
WAVES
OPTICS

MIR PUBLISHERS
MOSCOW
Translated from Russian by G. Leib

First published 1980


Revised from the 1978 Russian edition
Second printing 1985
Third printing 1989

Printed in the Union of Soviet Socialist Republics

ISBN 5-03-000902-7, 1978


ISBN 5-03-000900-0, 1980

PREFACE

The main content of the present volume is the science of electromagnetism and the
science of waves (elastic, electromagnetic, and light).
The International System of Units (SI) has been used throughout the book,
although the reader is simultaneously acquainted with the Gaussian system. In
addition to a list of symbols, the appendices at the end of the book give the units of
electrical and magnetic quantities in the SI and in the Gaussian system of units, and
also compare the form of the basic formulas of electromagnetism in both systems.
The course is the result of twenty-five years’ work in the Department of General
Physics of the Moscow Institute of Engineering Physics. I am grateful to my
colleagues and friends for their helpful discussions, criticism, and advice during
the preparation of the book.
The present course is intended above all for higher technical schools with an
extended syllabus in physics. The material has been arranged, however, so that the
book can be used as a teaching aid for higher technical schools with an ordinary
syllabus simply by omitting some sections.

Igor Savelyev

Moscow, November, 1979



Contents

Preface

I ELECTRICITY AND MAGNETISM

Chapter 1. ELECTRIC FIELD IN A VACUUM
1.1 Electric Charge
1.2 Coulomb’s Law
1.3 Systems of Units
1.4 Rationalized Form of Writing Formulas
1.5 Electric Field. Field Strength
1.6 Potential
1.7 Interaction Energy of a System of Charges
1.8 Relation Between Electric Field Strength and Potential
1.9 Dipole
1.10 Field of a System of Charges at Great Distances
1.11 A Description of the Properties of Vector Fields
1.12 Circulation and Curl of an Electrostatic Field
1.13 Gauss’s Theorem
1.14 Calculating Fields with the Aid of Gauss’s Theorem

Chapter 2. ELECTRIC FIELD IN DIELECTRICS
2.1 Polar and Non-Polar Molecules
2.2 Polarization of Dielectrics
2.3 The Field Inside a Dielectric
2.4 Space and Surface Bound Charges
2.5 Electric Displacement Vector
2.6 Examples of Calculating the Field in Dielectrics
2.7 Conditions on the Interface Between Two Dielectrics
2.8 Forces Acting on a Charge in a Dielectric
2.9 Ferroelectrics

Chapter 3. CONDUCTORS IN AN ELECTRIC FIELD
3.1 Equilibrium of Charges on a Conductor
3.2 A Conductor in an External Electric Field
3.3 Capacitance
3.4 Capacitors

Chapter 4. ENERGY OF AN ELECTRIC FIELD
4.1 Energy of a Charged Conductor
4.2 Energy of a Charged Capacitor
4.3 Energy of an Electric Field

Chapter 5. STEADY ELECTRIC CURRENT
5.1 Electric Current
5.2 Continuity Equation
5.3 Electromotive Force
5.4 Ohm’s Law. Resistance of Conductors
5.5 Ohm’s Law for an Inhomogeneous Circuit Section
5.6 Multiloop Circuits. Kirchhoff’s Rules
5.7 Power of a Current
5.8 The Joule-Lenz Law

Chapter 6. MAGNETIC FIELD IN A VACUUM
6.1 Interaction of Currents
6.2 Magnetic Field
6.3 Field of a Moving Charge
6.4 The Biot-Savart Law
6.5 The Lorentz Force
6.6 Ampere’s Law
6.7 Magnetism as a Relativistic Effect
6.8 Current Loop in a Magnetic Field
6.9 Magnetic Field of a Current Loop
6.10 Work Done When a Current Moves in a Magnetic Field
6.11 Divergence and Curl of a Magnetic Field
6.12 Field of a Solenoid and Toroid

Chapter 7. MAGNETIC FIELD IN A SUBSTANCE
7.1 Magnetization of a Magnetic
7.2 Magnetic Field Strength
7.3 Calculation of the Field in Magnetics
7.4 Conditions at the Interface of Two Magnetics
7.5 Kinds of Magnetics
7.6 Gyromagnetic Phenomena
7.7 Diamagnetism
7.8 Paramagnetism
7.9 Ferromagnetism

Chapter 8. ELECTROMAGNETIC INDUCTION
8.1 The Phenomenon of Electromagnetic Induction
8.2 Induced E.M.F.
8.3 Ways of Measuring the Magnetic Induction
8.4 Eddy Currents
8.5 Self-Induction
8.6 Current When a Circuit Is Opened or Closed
8.7 Mutual Induction
8.8 Energy of a Magnetic Field
8.9 Work in Magnetic Reversal of a Ferromagnetic

Chapter 9. MAXWELL’S EQUATIONS
9.1 Vortex Electric Field
9.2 Displacement Current
9.3 Maxwell’s Equations

Chapter 10. MOTION OF CHARGED PARTICLES IN ELECTRIC AND MAGNETIC FIELDS
10.1 Motion of a Charged Particle in a Homogeneous Magnetic Field
10.2 Deflection of Moving Charged Particles by an Electric and a Magnetic Field
10.3 Determination of the Charge and Mass of an Electron
10.4 Determination of the Specific Charge of Ions. Mass Spectrographs
10.5 Charged Particle Accelerators

Chapter 11. THE CLASSICAL THEORY OF ELECTRICAL CONDUCTANCE OF METALS
11.1 The Nature of Current Carriers in Metals
11.2 The Elementary Classical Theory of Metals
11.3 The Hall Effect

Chapter 12. ELECTRIC CURRENT IN GASES
12.1 Semi-Self-Sustained and Self-Sustained Conduction
12.2 Semi-Self-Sustained Gas Discharge
12.3 Ionization Chambers and Counters
12.4 Processes Leading to the Appearance of Current Carriers
12.5 Gas-Discharge Plasma
12.6 Glow Discharge
12.7 Arc Discharge
12.8 Spark and Corona Discharges

Chapter 13. ELECTRICAL OSCILLATIONS
13.1 Quasistationary Currents
13.2 Free Oscillations in a Circuit Without a Resistance
13.3 Free Damped Oscillations
13.4 Forced Electrical Oscillations
13.5 Alternating Current

II WAVES

Chapter 14. ELASTIC WAVES
14.1 Propagation of Waves in an Elastic Medium
14.2 Equations of a Plane and a Spherical Wave
14.3 Equation of a Plane Wave Propagating in an Arbitrary Direction
14.4 The Wave Equation
14.5 Velocity of Elastic Waves in a Solid Medium
14.6 Energy of an Elastic Wave
14.7 Standing Waves
14.8 Oscillations of a String
14.9 Sound
14.10 The Velocity of Sound in Gases
14.11 The Doppler Effect for Sound Waves

Chapter 15. ELECTROMAGNETIC WAVES
15.1 The Wave Equation for an Electromagnetic Field
15.2 Plane Electromagnetic Wave
15.3 Experimental Investigation of Electromagnetic Waves
15.4 Energy of Electromagnetic Waves
15.5 Momentum of Electromagnetic Field
15.6 Dipole Emission

III OPTICS

Chapter 16. OPTICS
16.1 The Light Wave
16.2 Representation of Harmonic Functions Using Exponents
16.3 Reflection and Refraction of a Plane Wave at the Interface Between Two Dielectrics
16.4 Luminous Flux
16.5 Photometric Quantities and Units
16.6 Geometrical Optics
16.7 Centered Optical System
16.8 Thin Lenses
16.9 Huygens’ Principle

Chapter 17. INTERFERENCE OF LIGHT
17.1 Interference of Light Waves
17.2 Coherence
17.3 Ways of Observing the Interference of Light
17.4 Interference of Light Reflected from Thin Plates
17.5 The Michelson Interferometer
17.6 Multibeam Interference

Chapter 18. DIFFRACTION OF LIGHT
18.1 Introduction
18.2 Huygens-Fresnel Principle
18.3 Fresnel Zones
18.4 Fresnel Diffraction from Simple Barriers
18.5 Fraunhofer Diffraction from a Slit
18.6 Diffraction Grating
18.7 Diffraction of X-Rays
18.8 Resolving Power of an Objective
18.9 Holography

Chapter 19. POLARIZATION OF LIGHT
19.1 Natural and Polarized Light
19.2 Polarization in Reflection and Refraction
19.3 Polarization in Double Refraction
19.4 Interference of Polarized Rays
19.5 Passing of Plane-Polarized Light Through a Crystal Plate
19.6 A Crystal Plate Between Two Polarizers
19.7 Artificial Double Refraction
19.8 Rotation of Polarization Plane

Chapter 20. INTERACTION OF ELECTROMAGNETIC WAVES WITH A SUBSTANCE
20.1 Dispersion of Light
20.2 Group Velocity
20.3 Elementary Theory of Dispersion
20.4 Absorption of Light
20.5 Scattering of Light
20.6 The Vavilov-Cerenkov Effect

Chapter 21. MOVING-MEDIA OPTICS
21.1 The Speed of Light
21.2 Fizeau’s Experiment
21.3 Michelson’s Experiment
21.4 The Doppler Effect

APPENDICES
A.1 List of Symbols
A.2 Units of Electrical and Magnetic Quantities
A.3 Basic Formulas of Electricity and Magnetism

PART I

ELECTRICITY AND
MAGNETISM

Chapter 1
ELECTRIC FIELD IN A VACUUM

1.1. Electric Charge

All bodies in nature are capable of becoming electrified, i.e., acquiring an electric
charge. The presence of such a charge manifests itself in that a charged body
interacts with other charged bodies. Two kinds of electric charges exist. They are
conventionally called positive and negative. Like charges repel each other, and
unlike charges attract each other.
An electric charge is an integral part of certain elementary particles¹. The
charge of all elementary particles (if it is not absent) is identical in magnitude. It can
be called an elementary charge. We shall use the symbol 𝑒 to denote a positive
elementary charge.
The elementary particles include, in particular, the electron (carrying the
negative charge −𝑒), the proton (carrying the positive charge +𝑒), and the neutron
(carrying no charge). These particles are the bricks of which the atoms and molecules
of any substance are built; therefore, all bodies contain electric charges. The
particles carrying charges of different signs are usually present in a body in equal
numbers and are distributed over it with the same density. The algebraic sum of
the charges in any elementary volume of the body equals zero in this case, and each
such volume (as well as the body as a whole) will be neutral. If in some way or
other we create a surplus of particles of one sign in a body (and, correspondingly,
a shortage of particles of the opposite sign), the body will be charged. It is also
possible, without changing the total number of positive and negative particles, to
cause them to be redistributed in a body so that one part of it has a surplus of
charges of one sign and the other part a surplus of charges of the opposite sign.

¹Elementary particles are defined as such microparticles whose internal structure at the present
level of development of physics cannot be conceived as a combination of other particles.

This can be done by bringing a charged body close to an uncharged metal one.
Since a charge 𝑞 is formed by a plurality of elementary charges, it is an integral
multiple of 𝑒:
𝑞 = ±𝑁𝑒. (1.1)
An elementary charge is so small, however, that macroscopic charges may be con-
sidered to have continuously changing magnitudes.
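To get a feel for why macroscopic charges appear continuous, the following minimal sketch counts the elementary charges contained in a given charge using Eq. (1.1). The value of 𝑒 is the one quoted in Eq. (1.10); the function name and the 1 µC example are assumptions made purely for illustration.

```python
# Elementary charge in coulombs, as quoted in Eq. (1.10).
E_CHARGE = 1.60e-19


def elementary_charge_count(q_coulombs: float) -> int:
    """Return N in q = ±N e, rounded to the nearest whole number."""
    return round(abs(q_coulombs) / E_CHARGE)


if __name__ == "__main__":
    n = elementary_charge_count(1e-6)  # a 1 microcoulomb charge (illustrative)
    print(f"1 uC corresponds to about {n:.3e} elementary charges")
    # Adding or removing one elementary charge changes q by a relative amount 1/N.
    print(f"relative step per elementary charge: {1 / n:.1e}")
```

With 𝑁 of the order of 10¹² even for a microcoulomb, a change by a single elementary charge is far too small to notice in macroscopic measurements.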
If a physical quantity can take on only definite discrete values, it is said to be
quantized. The fact expressed by Eq. (1.1) signifies that an electric charge is quantized.
The magnitude of a charge measured in different inertial reference frames will
be found to be the same. Hence, an electric charge is relativistically invariant. It
thus follows that the magnitude of a charge does not depend on whether the charge
is moving or at rest.
Electric charges can vanish and appear again. Two elementary charges of
opposite signs always appear or vanish simultaneously, however. For example,
an electron and a positron (a positive electron) meeting each other annihilate, i.e.,
transform into neutral gamma-photons. This is attended by vanishing of the charges
−𝑒 and +𝑒. In the course of the process called the birth of a pair, a gamma-photon
getting into the field of an atomic nucleus transforms into a pair of particles—an
electron and a positron. This process causes the charges −𝑒 and +𝑒 to appear.
Thus, the total charge of an electrically isolated system² cannot change. This
statement forms the law of electric charge conservation.
We must note that the law of electric charge conservation is associated very
closely with the relativistic invariance of a charge. Indeed, if the magnitude of a
charge depended on its velocity, then by bringing charges of one sign into motion
we would change the total charge of the relevant isolated system.

1.2. Coulomb’s Law

The law obeyed by the force of interaction of point charges was established ex-
perimentally in 1785 by the French physicist Charles A. de Coulomb (1736-1806). A
point charge is defined as a charged body whose dimensions may be disregarded
in comparison with the distances from this body to other bodies carrying an electric
charge.
Using a torsion balance (Fig. 1.1) similar to that employed by H. Cavendish to
determine the gravitational constant (see Vol. I, Sec. 6.1), Coulomb measured the
force of interaction of two charged spheres depending on the magnitude of the
charges on them and on the distance between them. He proceeded from the fact
that when a charged metal sphere was touched by an identical uncharged sphere,
the charge would be distributed equally between the two spheres.

Fig. 1.1

²A system is referred to as electrically isolated if no charged particles can penetrate through the
surface confining it.

As a result of his experiments, Coulomb arrived at the conclusion that the force
of interaction between two stationary point charges is proportional to the magnitude of
each of them and inversely proportional to the square of the distance between them. The
direction of the force coincides with the straight line connecting the charges.
It must be noted that the direction of the force of interaction along the straight
line connecting the point charges follows from considerations of symmetry. An
empty space is assumed to be homogeneous and isotropic. Consequently, the only
direction distinguished in the space by stationary point charges introduced into it
is that from one charge to the other. Assume that the force 𝑭 acting on the charge
𝑞₁ (Fig. 1.2) makes the angle 𝛼 with the direction from 𝑞₁ to 𝑞₂, and that 𝛼 differs
from both 0 and 𝜋. Owing to axial symmetry, however, there are no grounds to single
out the force 𝑭 from the multitude of forces of other directions making the same angle 𝛼
with the axis 𝑞₁-𝑞₂ (the directions of these forces form a cone with a cone angle of
2𝛼). This difficulty vanishes when 𝛼 equals 0 or 𝜋.
Coulomb’s law can be expressed by the formula
$$\boldsymbol{F}_{12} = -k \frac{q_1 q_2}{r^2}\, \hat{\boldsymbol{e}}_{12}. \tag{1.2}$$
Here, 𝑘 is a proportionality constant assumed to be positive, 𝑞₁ and 𝑞₂ are the
magnitudes of the interacting charges, 𝑟 is the distance between the charges, 𝒆̂₁₂ is the unit
vector directed from the charge 𝑞₁ to 𝑞₂, and 𝑭₁₂ is the force acting on the charge 𝑞₁
(Fig. 1.3; the figure corresponds to the case of like charges).

Fig. 1.2 Fig. 1.3

The force 𝑭 21 differs from 𝑭 12 in its sign:
$$\boldsymbol{F}_{21} = k \frac{q_1 q_2}{r^2}\, \hat{\boldsymbol{e}}_{12}. \tag{1.3}$$
The magnitude of the interaction force, which is the same for both charges, can
be written in the form
$$F = k \frac{|q_1 q_2|}{r^2}. \tag{1.4}$$
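The vector form of Coulomb’s law in Eqs. (1.2) and (1.3) can be sketched in a few lines of code. The function name, the tuple representation of positions, and the sample charges are illustrative assumptions; the default value of 𝑘 is the rounded SI value 9 × 10⁹ N m² C⁻² that follows from Eqs. (1.9) and (1.11).

```python
import math

K_SI = 9e9  # N m^2 C^-2, rounded SI value of k (see Eqs. (1.9) and (1.11))


def force_on_q1(q1, q2, r1, r2, k=K_SI):
    """Force on charge q1 due to q2, following Eq. (1.2).

    r1 and r2 are (x, y, z) positions in metres; charges are signed, in coulombs.
    """
    d = [b - a for a, b in zip(r1, r2)]       # vector from q1 to q2
    r = math.sqrt(sum(c * c for c in d))      # distance between the charges
    e12 = [c / r for c in d]                  # unit vector from q1 to q2
    scale = -k * q1 * q2 / r**2               # minus sign: like charges repel
    return [scale * c for c in e12]


if __name__ == "__main__":
    # Two like charges 10 cm apart on the x axis (illustrative values).
    f12 = force_on_q1(1e-6, 2e-6, (0, 0, 0), (0.1, 0, 0))
    f21 = force_on_q1(2e-6, 1e-6, (0.1, 0, 0), (0, 0, 0))
    print("force on q1:", f12)
    print("force on q2:", f21)
```

Swapping the roles of the two charges reverses the unit vector, so the two forces come out equal in magnitude and opposite in direction, as Eqs. (1.2) and (1.3) require.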
Experiments show that the force of interaction between two given charges does
not change if other charges are placed near them. Assume that we have the charge
𝑞a and, in addition, 𝑁 other charges 𝑞1 , 𝑞2 , . . . , 𝑞𝑁 . It can be seen from the above
that the resultant force 𝑭 with which all the 𝑁 charges 𝑞𝑖 act on 𝑞a is
$$\boldsymbol{F} = \sum_{i=1}^{N} \boldsymbol{F}_{\mathrm{a},i} \tag{1.5}$$
where 𝑭 a, 𝑖 is the force with which the charge 𝑞𝑖 acts on 𝑞a in the absence of the
other 𝑁 − 1 charges.
The fact expressed by Eq. (1.5) permits us to calculate the force of interaction
between charges concentrated on bodies of finite dimensions, knowing the law of
interaction between point charges. For this purpose, we must divide each charge
into charges d𝑞 so small that they can be treated as point charges, use Eq. (1.2)
to calculate the force of interaction between the charges d𝑞 taken in pairs, and
then perform vector summation of these forces. Mathematically, this procedure
coincides completely with the calculation of the force of gravitational attraction
between bodies of finite dimensions (see Vol. I, Sec. 6.1).
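The procedure of dividing an extended charge into point-like elements d𝑞 can be illustrated numerically. The sketch below is not taken from the book: it assumes a thin, uniformly charged rod and a point charge lying on the rod's axis, so that all elementary forces are collinear and a scalar sum suffices. The geometry, the function names, and the rounded SI value of 𝑘 are assumptions for illustration.

```python
K_SI = 9e9  # N m^2 C^-2, rounded SI value of k (an assumption for illustration)


def rod_force_on_point_charge(q, total_charge, length, gap, n=100_000, k=K_SI):
    """Force on a point charge q lying on the axis of a uniformly charged rod.

    The rod is cut into n elements dq, each treated as a point charge, and the
    pairwise Coulomb forces are summed. `gap` is the distance from q to the
    near end of the rod.
    """
    dq = total_charge / n
    dx = length / n
    total = 0.0
    for i in range(n):
        x = gap + (i + 0.5) * dx      # distance from q to the centre of element i
        total += k * q * dq / x**2
    return total


def rod_force_exact(q, total_charge, length, gap, k=K_SI):
    """Closed-form value of the same integral: k q Q / (gap (gap + length))."""
    return k * q * total_charge / (gap * (gap + length))


if __name__ == "__main__":
    args = (1e-9, 1e-6, 0.5, 0.1)     # q, Q, rod length, gap (illustrative values)
    print("summed over elements:", rod_force_on_point_charge(*args))
    print("closed form:         ", rod_force_exact(*args))
```

The two numbers agree closely once the elements are much shorter than the distance to the point charge, which is the condition under which each element may be treated as a point charge.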
All experimental facts available lead to the conclusion that Coulomb’s law holds
for distances from 10⁻¹⁵ m to at least several kilometres. There are grounds to
presume that for distances smaller than 10⁻¹⁶ m the law stops being correct. For
very great distances, there are no experimental confirmations of Coulomb’s law.
But there are also no reasons to expect that this law stops being obeyed at very
great distances between charges.

1.3. Systems of Units

We can make the proportionality constant in Eq. (1.2) equal to unity by properly
choosing the unit of charge (the units for 𝐹 and 𝑟 were established in mechanics). The
relevant unit of charge (when 𝐹 and 𝑟 are measured in cgs units) is called the
absolute electrostatic unit of charge (cgse_q). It is the magnitude of a charge that
interacts with a force of 1 dyn in a vacuum with an equal charge at a distance of
1 cm from it.
Careful measurements (they are described in Sec. 10.3) showed that an
elementary charge is
$$e = 4.80 \times 10^{-10}\ \text{cgse}_q. \tag{1.6}$$
Adopting the units of length, mass, time, and charge as the basic ones, we
can construct a system of units of electrical and magnetic quantities. The system
based on the centimetre, gramme, second, and the cgse_q unit is called the absolute
electrostatic system of units (the cgse system). It is founded on Coulomb’s law,
i.e., the law of interaction between charges at rest. On a later page, we shall become
acquainted with the absolute electromagnetic system of units (the cgsm system)
based on the law of interaction between conductors carrying an electric current.
The Gaussian system in which the units of electrical quantities coincide with those
of the cgse system, and of magnetic quantities with those of the cgsm system, is also
an absolute system.
Equation (1.4) in the cgse system becomes
$$F = \frac{|q_1 q_2|}{r^2}. \tag{1.7}$$
This equation is correct if the charges are in a vacuum. It has to be refined
for charges in a medium (see Sec. 2.8).
USSR State Standard GOST 9867-61, which came into force on January 1, 1963,
prescribes the preferable use of the International System of Units (SI). The basic
units of this system are the metre, kilogramme, second, ampere, kelvin, candela,
and mole. The SI unit of force is the newton (N), equal to 10⁵ dynes.
In establishing the units of electrical and magnetic quantities, the SI system, like
the cgsm one, proceeds from the law of interaction of current-carrying conductors
instead of charges. Consequently, the proportionality constant in the equation of
Coulomb’s law is a dimensional quantity that differs from unity.
The SI unit of charge is the coulomb (C). It has been found experimentally that
$$1\ \text{C} = 2.998 \times 10^9 \approx 3 \times 10^9\ \text{cgse}_q. \tag{1.8}$$
To form an idea of the magnitude of a charge of 1 C, let us calculate the force
with which two point charges of 1 C each would interact with each other if they
were 1 m apart. By Eq. (1.7)
$$F = \frac{3 \times 10^9 \times 3 \times 10^9}{100^2}\ \text{cgse}_F = 9 \times 10^{14}\ \text{dyn} = 9 \times 10^9\ \text{N} \approx 10^9\ \text{kgf}. \tag{1.9}$$
An elementary charge expressed in coulombs is
$$e = 1.60 \times 10^{-19}\ \text{C}. \tag{1.10}$$
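The unit conversions behind Eqs. (1.9) and (1.10) can be retraced with a short calculation. The sketch below evaluates Coulomb’s law in cgse units, Eq. (1.7), after converting coulombs and metres with the factors given in the text; the constant and function names are illustrative assumptions.

```python
CGSE_PER_COULOMB = 2.998e9   # Eq. (1.8): 1 C expressed in cgse units of charge
DYN_PER_NEWTON = 1e5         # 1 N = 10^5 dyn
CM_PER_METRE = 100.0


def force_newtons(q1_coulombs, q2_coulombs, r_metres):
    """Magnitude of the Coulomb force, evaluated in cgse units via Eq. (1.7)."""
    q1 = q1_coulombs * CGSE_PER_COULOMB
    q2 = q2_coulombs * CGSE_PER_COULOMB
    r_cm = r_metres * CM_PER_METRE
    force_dyn = abs(q1 * q2) / r_cm**2
    return force_dyn / DYN_PER_NEWTON


if __name__ == "__main__":
    # Two charges of 1 C each, 1 m apart: about 9e9 N, as in Eq. (1.9).
    print(f"F = {force_newtons(1.0, 1.0, 1.0):.3e} N")
    # Elementary charge converted from Eq. (1.6): about 1.60e-19 C, as in Eq. (1.10).
    e_cgse = 4.80e-10
    print(f"e = {e_cgse / CGSE_PER_COULOMB:.3e} C")
```

Using the unrounded factor 2.998 × 10⁹ gives a force slightly below 9 × 10⁹ N; the text rounds the factor to 3 × 10⁹.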

1.4. Rationalized Form of Writing Formulas

Many formulas of electrodynamics when written in the cgs systems (in particular, in
the Gaussian one) include as factors 4𝜋 and the so-called electromagnetic constant
𝑐 equal to the speed of light in a vacuum. To eliminate these factors in the formulas
that are most important in practice, the proportionality constant in Coulomb’s law
is taken equal to 1/(4𝜋𝜀₀). The equation of the law for charges in a vacuum will thus
become
$$F = \frac{1}{4\pi\varepsilon_0} \frac{|q_1 q_2|}{r^2}. \tag{1.11}$$
The other formulas change accordingly. This modified way of writing formulas
is called rationalized. Systems of units constructed with the use of rationalized
formulas are also called rationalized. They include the SI system.
The quantity 𝜀₀ is called the electric constant. It has the dimension of
capacitance divided by length. It is accordingly expressed in units called the farad
per metre. To find the numerical value of 𝜀₀, we shall introduce the values of the
quantities corresponding to the case of two charges of 1 C each and 1 m apart into
Eq. (1.11). By Eq. (1.9), the force of interaction in this case is 9 × 10⁹ N. Using this
value of the force, and also 𝑞₁ = 𝑞₂ = 1 C and 𝑟 = 1 m in Eq. (1.11), we get
$$9 \times 10^9 = \frac{1}{4\pi\varepsilon_0} \frac{|1 \times 1|}{1^2}$$
whence
$$\varepsilon_0 = \frac{1}{4\pi \times 9 \times 10^9} = 0.885 \times 10^{-11}\ \text{F m}^{-1}. \tag{1.12}$$
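The arithmetic leading to Eq. (1.12) is easy to reproduce. The following sketch solves Eq. (1.11) for the electric constant given a known force between two known charges; the function name is an illustrative assumption, and the second call uses the unrounded force that follows from the factor 2.998 × 10⁹ in Eq. (1.8).

```python
import math


def epsilon_0_from_force(force_n, q1=1.0, q2=1.0, r=1.0):
    """Solve Eq. (1.11) for the electric constant, in F/m.

    force_n is the force in newtons between charges q1 and q2 (in coulombs)
    separated by r metres in a vacuum.
    """
    return abs(q1 * q2) / (4 * math.pi * force_n * r**2)


if __name__ == "__main__":
    # Rounded force from Eq. (1.9).
    print(f"eps0 = {epsilon_0_from_force(9e9):.3e} F/m")
    # Force obtained with the unrounded conversion factor 2.998e9 of Eq. (1.8).
    print(f"eps0 = {epsilon_0_from_force(8.988e9):.3e} F/m")
```

Both results round to 0.885 × 10⁻¹¹ F m⁻¹ at the three significant figures used in Eq. (1.12).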
The Gaussian system of units was widely used and is continuing to be used
in physical publications. We therefore consider it essential to acquaint our reader
with both the SI and the Gaussian system. We shall set out the material in the SI
units showing at the same time how the formulas look in the Gaussian system. The
fundamental formulas of electrodynamics written in the SI and the Gaussian system
are compared in Appendix A.3.

Fig. 1.4

1.5. Electric Field. Field Strength

Charges at rest interact through an electric field³. A charge alters the properties
of the space surrounding it—it sets up an electric field in it. This field manifests
itself in that an electric charge placed at a point of it experiences the action of a
force. Hence, to see whether there is an electric field at a given place, we must place
a charged body (in the following we shall say simply a charge for brevity) at it and
determine whether or not it experiences the action of an electric force. We can
evidently assess the “strength” of the field according to the magnitude of the force
exerted on the given charge.
Thus, to detect and study an electric field, we must use a “test” charge. For the
force acting on our test charge to characterize the field “at the given point”, the
test charge must be a point one. Otherwise, the force acting on the charge will
characterize the properties of the field averaged over the volume occupied by the
body that carries the test charge.
Let us study the field set up by the stationary point charge 𝑞 with the aid of the
point test charge 𝑞t . We place the test charge at a point whose position relative to
the charge 𝑞 is determined by the position vector 𝒓 (Fig. 1.4). We see that the test
charge experiences the force
$$\boldsymbol{F} = q_\mathrm{t} \left( \frac{1}{4\pi\varepsilon_0} \frac{q}{r^2}\, \hat{\boldsymbol{e}}_r \right) \tag{1.13}$$
[see Eqs. (1.3) and (1.11)]. Here 𝒆̂ᵣ is the unit vector of the position vector 𝒓.
A glance at Eq. (1.13) shows that the force acting on our test charge depends not
only on the quantities determining the field (on 𝑞 and 𝒓), but also on the magnitude
of the test charge 𝑞ₜ. If we take different test charges 𝑞ₜ′, 𝑞ₜ″, etc., then the forces 𝑭′,
𝑭″, etc. which they experience at the given point of the field will be different. We
can see from Eq. (1.13), however, that the ratio 𝑭/𝑞ₜ for all the test charges will be the
same and depend only on the values of 𝑞 and 𝒓 determining the field at the given
point. It is therefore natural to adopt this ratio as the quantity characterizing an
electric field:
$$\boldsymbol{E} = \frac{\boldsymbol{F}}{q_\mathrm{t}}. \tag{1.14}$$
This vector quantity is called the electric field strength (or intensity) at a given
point (i.e., at the point where the test charge 𝑞ₜ experiences the action of the force
𝑭).

³We shall see in Sec. 6.2 that moving charges interact through a magnetic field in addition to an
electric field.

According to Eq. (1.14), the electric field strength numerically equals the force
acting on a unit point charge at the given point of the field. The direction of the
vector 𝑬 coincides with that of the force acting on a positive charge.
It must be noted that Eq. (1.14) also holds when the test charge is negative (𝑞t < 0).
In this case, the vectors 𝑬 and 𝑭 have opposite directions.
We have arrived at the concept of electric field strength when studying the field
of a stationary point charge. Definition (1.14), however, also covers the case of a field
set up by any collection of stationary charges, but here the following clarification
is needed. The arrangement of the charges setting up the field being studied may
change under the action of the test charge. This will happen, for example, when the
charges producing the field are on a conductor and can freely move within its limits.
Therefore, to avoid appreciable alterations in the field being studied, a sufficiently
small test charge must be taken.
It follows from Eqs. (1.13) and (1.14) that the field strength of a point charge varies
directly with the magnitude of the charge 𝑞 and inversely with the square of the
distance 𝑟 from the charge to the given point of the field:
$$\boldsymbol{E} = \frac{1}{4\pi\varepsilon_0} \frac{q}{r^2}\, \hat{\boldsymbol{e}}_r. \tag{1.15}$$
The vector 𝑬 is directed along the radial straight line passing through the charge
and the given point of the field, from the charge if the latter is positive and toward
the charge if it is negative.
In the Gaussian system, the equation for the field strength of a point charge in
a vacuum has the form
$$\boldsymbol{E} = \frac{q}{r^2}\, \hat{\boldsymbol{e}}_r. \tag{1.16}$$
The unit of electric field strength is the strength at a point where unit force
(1 N in the SI and 1 dyn in the Gaussian system) acts on unit charge (1 C in the SI
and 1 cgse_q in the Gaussian system). This unit has no special name in the Gaussian
system. The SI unit of electric field strength is called the volt per metre (V m⁻¹) [see
Eq. (1.44)].
According to Eq. (1.15), a charge of 1 C produces the following field strength in a
vacuum at a distance of 1 m from this charge:
$$E = \frac{1}{4\pi \left( 1 / (4\pi \times 9 \times 10^9) \right)} \frac{1}{1^2} = 9 \times 10^9\ \text{V m}^{-1}.$$
This strength in the Gaussian system is
$$E = \frac{q}{r^2} = \frac{3 \times 10^9}{100^2} = 3 \times 10^5\ \text{cgse}_E.$$
Comparing these two results, we find that
$$1\ \text{cgse}_E = 3 \times 10^4\ \text{V m}^{-1}. \tag{1.17}$$
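The comparison that yields Eq. (1.17) can be checked numerically. The sketch below evaluates the magnitude of the point-charge field in both systems, using Eq. (1.15) with the value of the electric constant from Eq. (1.12) and Eq. (1.16) with the rounded conversion factor from Eq. (1.8). Function and constant names are illustrative assumptions.

```python
import math

EPSILON_0 = 0.885e-11      # F/m, Eq. (1.12)
CGSE_PER_COULOMB = 3e9     # rounded value from Eq. (1.8)
CM_PER_METRE = 100.0


def field_si(q_coulombs, r_metres):
    """Magnitude of the field of a point charge in V/m, Eq. (1.15)."""
    return q_coulombs / (4 * math.pi * EPSILON_0 * r_metres**2)


def field_gaussian(q_cgse, r_cm):
    """Magnitude of the field of a point charge in cgse units, Eq. (1.16)."""
    return q_cgse / r_cm**2


if __name__ == "__main__":
    e_si = field_si(1.0, 1.0)                                        # about 9e9 V/m
    e_gauss = field_gaussian(1.0 * CGSE_PER_COULOMB, 1.0 * CM_PER_METRE)
    print(f"SI:       {e_si:.3e} V/m")
    print(f"Gaussian: {e_gauss:.3e} cgse units")
    print(f"1 cgse unit of field strength is about {e_si / e_gauss:.3e} V/m")
```

The ratio comes out close to 3 × 10⁴ V m⁻¹; the small deviation reflects the rounding of the constants used.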
According to Eq. (1.14), the force exerted on a test charge is
𝑭 = 𝑞t 𝑬.
It is obvious that any point charge 𝑞⁴ at a point of a field with the strength 𝑬 will
experience the force
𝑭 = 𝑞𝑬. (1.18)
If the charge 𝑞 is positive, the direction of the force coincides with that of the vector
𝑬. If 𝑞 is negative, the vectors 𝑭 and 𝑬 are directed oppositely.
We mentioned in Sec. 1.2 that the force with which a system of charges acts on a
charge not belonging to the system equals the vector sum of the forces which each
of the charges of the system exerts separately on the given charge [see Eq. (1.5)].
Hence it follows that the field strength of a system of charges equals the vector sum of
the field strengths that would be produced by each of the charges of the system separately:
$$\boldsymbol{E} = \sum_i \boldsymbol{E}_i. \tag{1.19}$$
This statement is called the principle of electric field superposition.
The superposition principle allows us to calculate the field strength of any
system of charges. By dividing extended charges into sufficiently small fractions d𝑞,
we can reduce any system of charges to a collection of point charges. We calculate
the contribution of each of such charges to the resultant field by Eq. (1.15).
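The superposition principle translates directly into a sum over point charges. The sketch below adds the contributions given by Eq. (1.15) for an arbitrary list of charges; the data layout (a list of charge and position pairs), the function name, the sample values, and the rounded value 9 × 10⁹ for 1/(4𝜋𝜀₀) are illustrative assumptions.

```python
import math

K_SI = 9e9  # 1/(4 pi eps0) in N m^2 C^-2, rounded


def field_at(point, charges, k=K_SI):
    """Field strength vector at `point` from a list of (q, (x, y, z)) point charges."""
    total = [0.0, 0.0, 0.0]
    for q, pos in charges:
        d = [p - s for p, s in zip(point, pos)]   # from the charge to the field point
        r = math.sqrt(sum(c * c for c in d))
        for axis in range(3):
            # Eq. (1.15): (k q / r^2) times the unit vector d / r
            total[axis] += k * q * d[axis] / r**3
    return total


if __name__ == "__main__":
    # Two unlike charges 2 cm apart; field at the midpoint (illustrative values).
    pair = [(1e-9, (0.01, 0.0, 0.0)), (-1e-9, (-0.01, 0.0, 0.0))]
    print(field_at((0.0, 0.0, 0.0), pair))
```

For the pair of unlike charges the two contributions at the midpoint point the same way and add, whereas for two equal like charges they would cancel.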
An electric field can be described by indicating the magnitude and direction of
the vector 𝑬 for each of its points. The combination of these vectors forms the field
of the electric field strength vector (compare with the field of the velocity vector,
Vol. I, Sec. 9.1). The velocity vector field can be represented very illustratively with
the aid of flow lines. Similarly, an electric field can be described with the aid of
strength lines, which we shall call for short 𝑬 lines or field lines. These lines are
drawn so that a tangent to them at every point coincides with the direction of the
vector 𝑬. The density of the lines is selected so that their number passing through
a unit area at right angles to the lines equals the numerical value of the vector 𝑬.
Hence, the pattern of field lines permits us to assess the direction and magnitude of
the vector 𝑬 at various points of space (Fig. 1.5).

Fig. 1.5 Fig. 1.6

⁴In Eq. (1.15), 𝑞 stands for the charge setting up the field. In Eq. (1.18), 𝑞 stands for the charge
experiencing the force 𝑭 at a point of strength 𝑬.

The 𝑬 lines of a point charge field are a collection of radial straight lines directed
away from the charge if it is positive and toward it if it is negative (Fig. 1.6). One
end of each line is at the charge, and the other extends to infinity. Indeed, the total
number of lines intersecting a spherical surface of arbitrary radius 𝑟 will equal the
product of the density of the lines and the surface area of the sphere 4𝜋𝑟². We
have assumed that the density of the lines numerically equals 𝐸 = (1/4𝜋𝜀₀)(𝑞/𝑟²).
Hence, the number of lines is (1/4𝜋𝜀₀)(𝑞/𝑟²) 4𝜋𝑟² = 𝑞/𝜀₀. This result signifies that
the number of lines at any distance from a charge will be the same. It thus follows
that the lines do not begin and do not terminate anywhere except at the charge.
Beginning at the charge, they extend to infinity (the charge is positive), or arriving
from infinity, they terminate at the charge (the latter is negative). This property of
the 𝑬 lines is common for all electrostatic fields, i.e., fields set up by any system of
stationary charges: the field lines can begin or terminate only at charges or extend
to infinity.
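The statement that the number of lines crossing a sphere does not depend on its radius can be verified with a few lines of code. The sketch multiplies the line density, taken numerically equal to the field strength of a point charge, by the sphere area; the function name and the sample charge are illustrative assumptions, and the value of the electric constant is the one in Eq. (1.12).

```python
import math

EPSILON_0 = 0.885e-11  # F/m, Eq. (1.12)


def line_count(q, r):
    """Line density E(r) times the sphere area 4 pi r^2 for a point charge q."""
    density = q / (4 * math.pi * EPSILON_0 * r**2)   # numerically equal to E
    return density * 4 * math.pi * r**2


if __name__ == "__main__":
    q = 1e-9  # coulombs (illustrative)
    for r in (0.01, 1.0, 1000.0):
        print(f"r = {r:8.2f} m -> {line_count(q, r):.6e}")
    print(f"q / eps0        -> {q / EPSILON_0:.6e}")
```

The factors of 𝑟² cancel, so every radius gives the same value 𝑞/𝜀₀ up to floating-point rounding.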

1.6. Potential

Let us consider the field produced by a stationary point charge 𝑞. At any point of
this field, the point charge 𝑞0 experiences the force
1 𝑞𝑞0
𝑭= 𝒆ˆ 𝑟 = 𝐹 (𝑟) 𝒆ˆ 𝑟 . (1.20)
4𝜋 𝜀0 𝑟 2
Here 𝐹 (𝑟) is the magnitude of the force 𝑭, and 𝒆ˆ 𝑟 is the unit vector of the position
vector 𝒓 determining the position of the charge 𝑞0 relative to the charge 𝑞.
Fig. 1.7

The force (1.20) is a central one (see Vol. I, Sec. 3.4). A central field of forces is
conservative. Consequently, the work done by the forces of the field on the charge
𝑞0 when it is moved from one point to another does not depend on the path. This
work is
$$A_{12} = \int_1^2 F(r)\,\hat{\boldsymbol{e}}_r \cdot \mathrm{d}\boldsymbol{l} \tag{1.21}$$
where d𝒍 is the elementary displacement of the charge 𝑞0. Inspection of Fig. 1.7
shows that the scalar product 𝒆̂𝑟 · d𝒍 equals the increment of the magnitude of the
position vector 𝒓, i.e., d𝑟. Equation (1.21) can therefore be written in the form
$$A_{12} = \int_1^2 F(r)\,\mathrm{d}r$$
[compare with Eq. (3.24) of Vol. I]. Introduction of the expression for 𝐹 (𝑟) yields
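$$A_{12} = \frac{q q_0}{4\pi\varepsilon_0}\int_{r_1}^{r_2}\frac{\mathrm{d}r}{r^2} = \frac{1}{4\pi\varepsilon_0}\left(\frac{q q_0}{r_1} - \frac{q q_0}{r_2}\right). \tag{1.22}$$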
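
As a check on Eq. (1.22), the integral of 𝐹(𝑟) d𝑟 can be estimated numerically and compared with the closed form. This is a sketch under assumed values of the charges and distances; the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)     # the factor 1/(4*pi*eps0)

def work_closed_form(q, q0, r1, r2):
    """Right-hand side of Eq. (1.22)."""
    return K * q * q0 * (1 / r1 - 1 / r2)

def work_numeric(q, q0, r1, r2, steps=20_000):
    """Midpoint-rule estimate of the integral of F(r) dr from r1 to r2."""
    h = (r2 - r1) / steps
    total = 0.0
    for n in range(steps):
        r = r1 + (n + 0.5) * h
        total += K * q * q0 / r**2 * h
    return total

# Illustrative (assumed) values: charges in coulombs, distances in metres.
print(work_closed_form(1e-6, 1e-9, 0.1, 0.5))
print(work_numeric(1e-6, 1e-9, 0.1, 0.5))
```

The two printed values agree closely; the small residual comes from the finite step of the midpoint rule.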
The work of the forces of a conservative field can be represented as a decrement
of the potential energy:
𝐴12 = 𝑊p,1 − 𝑊p,2 . (1.23)
A comparison of Eqs. (1.22) and (1.23) leads to the following expression for the
potential energy of the charge 𝑞0 in the field of the charge 𝑞:
$$W_{\mathrm{p}} = \frac{1}{4\pi\varepsilon_0}\,\frac{q q_0}{r} + \text{constant}.$$
The value of the constant in the expression for the potential energy is usually chosen
so that when the charge moves away to infinity (i.e., when 𝑟 = ∞), the potential
energy vanishes. When this condition is observed, we get
$$W_{\mathrm{p}} = \frac{1}{4\pi\varepsilon_0}\,\frac{q q_0}{r}. \tag{1.24}$$
Let us use the charge 𝑞0 as a test charge for studying the field. By Eq. (1.24), the
potential energy which the test charge has depends not only on its magnitude 𝑞0,
but also on the quantities 𝑞 and 𝑟 determining the field. Thus, we can use this energy
to describe the field just like we used the force acting on the test charge for this
purpose.
Different test charges 𝑞t′, 𝑞t″, etc. will have different energies 𝑊p′, 𝑊p″, etc. at the
same point of a field. But the ratio 𝑊p/𝑞t will be the same for all the charges [see
Eq. (1.24)]. The quantity
$$\varphi = \frac{W_{\mathrm{p}}}{q_{\mathrm{t}}} \tag{1.25}$$
is called the field potential at a given point and is used together with the field
strength 𝑬 to describe electric fields.
It can be seen from Eq. (1.25) that the potential numerically equals the potential
energy which a unit positive charge would have at the given point of the field.
Substituting for the potential energy in Eq. (1.25) its value from (1.24), we get the
following expression for the potential of a point charge:
$$\varphi = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{r}. \tag{1.26}$$
In the Gaussian system, the potential of the field of a point charge in a vacuum
is determined by the formula
$$\varphi = \frac{q}{r}. \tag{1.27}$$
Let us consider the field produced by a system of 𝑁 point charges 𝑞1 , 𝑞2 , . . . , 𝑞𝑁 .
Let 𝑟1 , 𝑟2 , . . . , 𝑟𝑁 be the distances from each of the charges to the given point of
the field. The work done by the forces of this field on the charge 𝑞0 will equal
the algebraic sum of the work done by the forces set up by each of the charges
separately:
$$A_{12} = \sum_{i=1}^{N} A_i.$$
By Eq. (1.22), each work 𝐴𝑖 equals
$$A_i = \frac{1}{4\pi\varepsilon_0}\left(\frac{q_i q_0}{r_{i,1}} - \frac{q_i q_0}{r_{i,2}}\right)$$
where 𝑟𝑖,1 is the distance from the charge 𝑞𝑖 to the initial position of the charge 𝑞0,
and 𝑟𝑖,2 is the distance from 𝑞𝑖 to the final position of the charge 𝑞0. Hence,
$$A_{12} = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i q_0}{r_{i,1}} - \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i q_0}{r_{i,2}}.$$
Comparing this equation with Eq. (1.23), we get the following expression for the
potential energy of the charge 𝑞0 in the field of a system of charges:
$$W_{\mathrm{p}} = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i q_0}{r_i}$$
from which it can be seen that
$$\varphi = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i}{r_i}. \tag{1.28}$$
Comparing this formula with Eq. (1.26), we arrive at the conclusion that the
potential of the field produced by a system of charges equals the algebraic sum of the
potentials produced by each of the charges separately. Whereas the field strengths are
added vectorially in the superposition of fields, the potentials are added algebraically.
This is why it is usually much simpler to calculate the potentials than the electric
field strengths.
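
The algebraic addition of potentials in Eq. (1.28) can be sketched in a few lines. The representation of a charge as a pair (value, position) and the function name are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def potential(charges, point):
    """Potential at `point` by Eq. (1.28).

    charges: list of (q, (x, y, z)) pairs, an assumed data layout.
    """
    return sum(K * q / math.dist(pos, point) for q, pos in charges)

# Two opposite charges (assumed values); the point is equidistant from both,
# so the two contributions cancel algebraically.
pair = [(1e-9, (0.0, 0.0, 0.0)), (-1e-9, (1.0, 0.0, 0.0))]
print(potential(pair, (0.5, 1.0, 0.0)))
print(potential(pair[:1], (2.0, 0.0, 0.0)), K * 1e-9 / 2.0)
```

No directions have to be tracked here, which is the simplification the text refers to: only distances enter the sum.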
Examination of Eq. (1.25) shows that the charge 𝑞 at a point of a field with the
potential 𝜑 has the potential energy
𝑊p = 𝑞𝜑. (1.29)
Hence, the work of the field forces on the charge 𝑞 can be expressed through the
potential difference:
𝐴12 = 𝑊p,1 − 𝑊p,2 = 𝑞 (𝜑1 − 𝜑2 ) . (1.30)
Thus, the work done on a charge by the forces of a field equals the product of the
magnitude of the charge and the difference between the potentials at the initial and
final points (i.e., the potential decrement).
If the charge 𝑞 is removed from a point having the potential 𝜑 to infinity (where
by convention the potential vanishes), then the work of the field forces will be
𝐴∞ = 𝑞𝜑. (1.31)
Hence, it follows that the potential numerically equals the work done by the forces of
a field on a unit positive charge when the latter is removed from the given point to
infinity. Work of the same magnitude must be done against the electric field forces
to move a unit positive charge from infinity to the given point of a field.
Equation (1.31) can be used to establish the units of potential. The unit of potential
is taken equal to the potential at a point of a field when work equal to unity is
required to move unit positive charge from infinity to this point. The SI unit of
potential called the volt (V) is taken equal to the potential at a point when work of
1 joule has to be done to move a charge of 1 coulomb from infinity to this point:
$$1\ \mathrm{J} = 1\ \mathrm{C}\times 1\ \mathrm{V},\quad\text{thus}\quad 1\ \mathrm{V} = \frac{1\ \mathrm{J}}{1\ \mathrm{C}}. \tag{1.32}$$
The absolute electrostatic unit of potential (cgse𝜑) is taken equal to the potential
at a point when work of 1 erg has to be done to move a charge of 1 cgse𝑞 from
infinity to this point. Expressing 1 J and 1 C in Eq. (1.32) through cgse units, we shall
find the relation between the volt and the cgse potential unit:
$$1\ \mathrm{V} = \frac{1\ \mathrm{J}}{1\ \mathrm{C}} = \frac{10^{7}\ \mathrm{erg}}{3\times 10^{9}\ \mathrm{cgse}_q} = \frac{1}{300}\ \mathrm{cgse}_\varphi. \tag{1.33}$$
Thus, 1 cgse𝜑 equals 300 V.
A unit of energy and work called the electron-volt (eV) is frequently used in
physics. An electron-volt is defined as the work done by the forces of a field on a
charge equal to that of an electron (i.e., on the elementary charge 𝑒) when it passes
through a potential difference of 1 V:
1 eV = 1.60 × 10⁻¹⁹ C × 1 V = 1.60 × 10⁻¹⁹ J = 1.60 × 10⁻¹² erg. (1.34)
Multiple units of the electron-volt are also used:
1 keV (kiloelectron-volt) = 10³ eV,
1 MeV (megaelectron-volt) = 10⁶ eV,
1 GeV (gigaelectron-volt) = 10⁹ eV.
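
The unit relations (1.33) and (1.34) amount to simple arithmetic, sketched below with the rounded conversion factors used in the text. The constant and function names are illustrative assumptions.

```python
E_CHARGE = 1.60e-19         # elementary charge in C, rounded as in Eq. (1.34)
ERG_PER_JOULE = 1e7         # 1 J = 10**7 erg
CGSE_Q_PER_COULOMB = 3e9    # 1 C = 3 * 10**9 cgse units of charge (rounded)

def ev_to_joule(energy_ev):
    """Energy in joules of a given number of electron-volts."""
    return energy_ev * E_CHARGE

# One volt expressed in cgse units of potential, Eq. (1.33).
volt_in_cgse = ERG_PER_JOULE / CGSE_Q_PER_COULOMB

print(volt_in_cgse, 1 / 300)
print(ev_to_joule(1.0) * ERG_PER_JOULE)  # 1 eV in erg
print(ev_to_joule(1e6))                  # 1 MeV in joules
```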

1.7. Interaction Energy of a System of Charges

Equation (1.24) can be considered as the mutual potential energy of the charges 𝑞
and 𝑞0. Using the symbols 𝑞1 and 𝑞2 for these charges, we get the following formula
for their interaction energy:
$$W_{\mathrm{p}} = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r_{12}}. \tag{1.35}$$
The symbol 𝑟12 stands for the distance between the charges.
Let us consider a system consisting of 𝑁 point charges 𝑞1 , 𝑞2 , . . . , 𝑞𝑁 . We showed
in Sec. 3.6 of Vol. I that the energy of interaction of such a system equals the sum of
the energies of interaction of the charges taken in pairs:
$$W_{\mathrm{p}} = \frac{1}{2}\sum_{(i\neq k)} W_{\mathrm{p},ik}(r_{ik}) \tag{1.36}$$
[see Eq. (3.60) of Vol. I].
According to Eq. (1.35)
$$W_{\mathrm{p},ik} = \frac{1}{4\pi\varepsilon_0}\,\frac{q_i q_k}{r_{ik}}.$$
Using this equation in (1.36), we find that
$$W_{\mathrm{p}} = \frac{1}{2}\sum_{(i\neq k)} \frac{1}{4\pi\varepsilon_0}\,\frac{q_i q_k}{r_{ik}}. \tag{1.37}$$
In the Gaussian system, the factor 1/(4𝜋 𝜀0 ) is absent in this equation.


In Eq. (1.37), summation is performed over the subscripts 𝑖 and 𝑘. Both subscripts
pass independently through all the values from 1 to 𝑁. Addends for which the value
of the subscript 𝑖 coincides with that of 𝑘 are not taken into consideration. Let us
write Eq. (1.37) as follows:
$$W_{\mathrm{p}} = \frac{1}{2}\sum_{i=1}^{N} q_i \sum_{\substack{k=1\\(k\neq i)}}^{N} \frac{1}{4\pi\varepsilon_0}\,\frac{q_k}{r_{ik}}. \tag{1.38}$$
The expression
$$\varphi_i = \frac{1}{4\pi\varepsilon_0}\sum_{\substack{k=1\\(k\neq i)}}^{N} \frac{q_k}{r_{ik}}$$
is the potential produced by all the charges except 𝑞𝑖 at the point where the charge
𝑞𝑖 is. With this in view, we get the following formula for the interaction energy:

$$W_{\mathrm{p}} = \frac{1}{2}\sum_{i=1}^{N} q_i \varphi_i. \tag{1.39}$$
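
The equivalence of the pair sum and Eq. (1.39) can be verified for a small system. The sketch below assumes charges stored as (value, position) pairs; both function names are hypothetical.

```python
import math
from itertools import combinations

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def energy_by_pairs(charges):
    """Sum of the pair energies (1.35), each pair counted once."""
    return sum(K * qi * qk / math.dist(ri, rk)
               for (qi, ri), (qk, rk) in combinations(charges, 2))

def energy_by_potentials(charges):
    """Eq. (1.39): half the sum of q_i times the potential phi_i
    produced at the site of q_i by all the other charges."""
    total = 0.0
    for i, (qi, ri) in enumerate(charges):
        phi_i = sum(K * qk / math.dist(ri, rk)
                    for k, (qk, rk) in enumerate(charges) if k != i)
        total += qi * phi_i
    return 0.5 * total

# Three charges with assumed values and positions.
system = [(1e-9, (0.0, 0.0, 0.0)),
          (-2e-9, (0.1, 0.0, 0.0)),
          (3e-9, (0.0, 0.2, 0.1))]
print(energy_by_pairs(system), energy_by_potentials(system))
```

The factor 1/2 in Eq. (1.39) compensates for the fact that the double sum visits every pair twice.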

1.8. Relation Between Electric Field Strength and Potential

An electric field can be described either with the aid of the vector quantity 𝑬, or
with the aid of the scalar quantity 𝜑. There must evidently be a definite relation
between these quantities. If we bear in mind that 𝑬 is proportional to the force
acting on a charge and 𝜑 to the potential energy of the charge, it is easy to see that
this relation must be similar to that between the potential energy and the force.
The force 𝑭 is related to the potential energy by the expression
𝑭 = −∇𝑊p (1.40)
[see Eq. (3.32) of Vol. I]. For a charged particle in an electrostatic field, we have
𝑭 = 𝑞𝑬 and 𝑊p = 𝑞𝜑. Introducing these values into Eq. (1.40), we find that
𝑞𝑬 = −∇(𝑞𝜑).
The constant 𝑞 can be put outside the gradient sign. Doing this and then cancelling
𝑞, we arrive at the formula
𝑬 = −∇𝜑 (1.41)
establishing the relation between the field strength and potential.
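Taking into account the definition of the gradient [see Eq. (3.31) of Vol. I], we
can write that
$$\boldsymbol{E} = -\frac{\partial\varphi}{\partial x}\,\hat{\boldsymbol{e}}_x - \frac{\partial\varphi}{\partial y}\,\hat{\boldsymbol{e}}_y - \frac{\partial\varphi}{\partial z}\,\hat{\boldsymbol{e}}_z. \tag{1.42}$$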
Hence, Eq. (1.41) has the following form in projections onto the coordinate axes:
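$$E_x = -\frac{\partial\varphi}{\partial x},\quad E_y = -\frac{\partial\varphi}{\partial y},\quad E_z = -\frac{\partial\varphi}{\partial z}. \tag{1.43}$$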
Similarly, the projection of the vector 𝑬 onto an arbitrary direction 𝑙 equals the
derivative of 𝜑 with respect to 𝑙 taken with the opposite sign, i.e., the rate of
diminishing of the potential when moving along the direction 𝑙:
$$E_l = -\frac{\partial\varphi}{\partial l}. \tag{1.44}$$
It is easy to see that Eq. (1.44) is correct by choosing 𝑙 as one of the coordinate axes
and taking Eq. (1.43) into account.
Let us explain Eq. (1.41) using as an example the field of a point charge. The
potential of this field is expressed by Eq. (1.26). Passing over to Cartesian coordinates,
we get the expression
$$\varphi = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{r} = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{(x^2+y^2+z^2)^{1/2}}.$$
The partial derivative of this function with respect to 𝑥 is
$$\frac{\partial\varphi}{\partial x} = -\frac{1}{4\pi\varepsilon_0}\,\frac{qx}{(x^2+y^2+z^2)^{3/2}} = -\frac{1}{4\pi\varepsilon_0}\,\frac{qx}{r^3}.$$
Similarly,
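$$\frac{\partial\varphi}{\partial y} = -\frac{1}{4\pi\varepsilon_0}\,\frac{qy}{r^3},\qquad \frac{\partial\varphi}{\partial z} = -\frac{1}{4\pi\varepsilon_0}\,\frac{qz}{r^3}.$$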
Using the found values of the derivatives in Eq. (1.42), we arrive at the expression
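$$\boldsymbol{E} = \frac{1}{4\pi\varepsilon_0}\,\frac{q\,(x\hat{\boldsymbol{e}}_x + y\hat{\boldsymbol{e}}_y + z\hat{\boldsymbol{e}}_z)}{r^3} = \frac{1}{4\pi\varepsilon_0}\,\frac{q\boldsymbol{r}}{r^3} = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{r^2}\,\hat{\boldsymbol{e}}_r$$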
that coincides with Eq. (1.15).
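
The same result can be reproduced numerically by differentiating the potential with central differences. This is a minimal sketch: the step size, the sample point, and the function names are assumptions.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def phi(q, x, y, z):
    """Potential (1.26) of a point charge at the origin."""
    return K * q / math.sqrt(x * x + y * y + z * z)

def field_from_potential(q, x, y, z, h=1e-6):
    """E = -grad(phi), Eq. (1.41), estimated with central differences."""
    ex = -(phi(q, x + h, y, z) - phi(q, x - h, y, z)) / (2 * h)
    ey = -(phi(q, x, y + h, z) - phi(q, x, y - h, z)) / (2 * h)
    ez = -(phi(q, x, y, z + h) - phi(q, x, y, z - h)) / (2 * h)
    return ex, ey, ez

def field_exact(q, x, y, z):
    """Closed form q*r / (4*pi*eps0*r**3) derived in the text."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return K * q * x / r3, K * q * y / r3, K * q * z / r3

print(field_from_potential(1e-9, 0.3, -0.4, 1.2))
print(field_exact(1e-9, 0.3, -0.4, 1.2))
```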
Equation (1.41) allows us to find the field strength at every point from the known
values of 𝜑. We can also solve the reverse problem, i.e., find the potential difference
between two arbitrary points of a field according to the given values of 𝑬. For this
purpose, we shall take advantage of the circumstance that the work done by the
forces of a field on the charge 𝑞 when it is moved from point 1 to point 2 can be
calculated as
$$A_{12} = \int_1^2 q\boldsymbol{E}\cdot\mathrm{d}\boldsymbol{l}.$$
At the same time in accordance with Eq. (1.30), this work can be written as
𝐴12 = 𝑞 (𝜑1 − 𝜑2 ) .
Equating these two expressions and cancelling 𝑞, we obtain
$$\varphi_1 - \varphi_2 = \int_1^2 \boldsymbol{E}\cdot\mathrm{d}\boldsymbol{l}. \tag{1.45}$$
The integral can be taken along any line joining points 1 and 2 because the work
of the field forces is independent of the path. For a circuit around a closed
contour, 𝜑1 = 𝜑2 , and Eq. (1.45) becomes
$$\oint \boldsymbol{E}\cdot\mathrm{d}\boldsymbol{l} = 0 \tag{1.46}$$
(the circle on the integral sign indicates that integration is performed over a closed
contour). It must be noted that this relation holds only for an electrostatic field. We
shall see on a later page that the field of moving charges (i.e., a field changing with
time) is not a potential one. Therefore, condition (1.46) is not observed for it.
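
Condition (1.46) can be illustrated for the field of a point charge by summing 𝑬 · d𝒍 around a circle. The contour, the charge, and the number of steps below are assumed values chosen only for the sketch.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def point_charge_field(q, x, y):
    """Field of a charge q at the origin, evaluated in the plane z = 0."""
    r3 = (x * x + y * y) ** 1.5
    return K * q * x / r3, K * q * y / r3

def circulation(q, cx, cy, radius, steps=2000):
    """Midpoint-rule estimate of the closed-contour integral of E . dl
    around a circle of the given radius centred at (cx, cy)."""
    dt = 2 * math.pi / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        ex, ey = point_charge_field(q, x, y)
        # dl = (-R sin t, R cos t) dt is the tangential displacement.
        total += (-ex * math.sin(t) + ey * math.cos(t)) * radius * dt
    return total

print(circulation(1e-9, 1.0, 0.5, 0.4))
```

The printed value differs from zero only by rounding error, whereas the individual contributions to the sum are of the order of hundredths of a volt.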
An imaginary surface all of whose points have the same potential is called an
equipotential surface. Its equation has the form
𝜑(𝑥, 𝑦, 𝑧) = constant.
The potential does not change in movement along an equipotential surface over
the distance d𝑙 (d𝜑 = 0). Hence, according to Eq. (1.44), the tangential component
of the vector 𝑬 to the surface equals zero. We thus conclude that the vector 𝑬 at
every point is directed along a normal to the equipotential surface passing through
the given point. Bearing in mind that the vector 𝑬 is directed along a tangent to
an 𝑬 line, we can easily see that the field lines at every point are orthogonal to the
equipotential surfaces.
An equipotential surface can be drawn through any point of a field. Consequently,
we can construct an infinite number of such surfaces. They are
conventionally drawn so that the potential difference for two adjacent surfaces is
the same everywhere. Thus, the density of the equipotential surfaces allows us to
assess the magnitude of the field strength. Indeed, the denser the equipotential
surfaces, the more rapidly the potential changes when moving along a normal
to the surface. Hence, |∇𝜑| is greater at the given place, and, therefore, 𝐸 is greater
too.
Figure 1.8 shows equipotential surfaces (more exactly, their intersections with
the plane of the drawing) for the field of a point charge. In accordance with the
nature of the dependence of 𝐸 on 𝑟, equipotential surfaces become denser the
nearer we approach the charge.
Equipotential surfaces for a homogeneous field are a collection of equispaced
planes at right angles to the direction of the field.

Fig. 1.8   Fig. 1.9

1.9. Dipole

An electric dipole is defined as a system of two point charges +𝑞 and −𝑞 identical
in value and opposite in sign, the distance between which is much smaller than that
to the points at which the field of the system is being determined. The straight line
passing through both charges is called the dipole axis.
Let us first calculate the potential and then the field strength of a dipole. This
field has axial symmetry. Therefore, the pattern of the field in any plane passing
through the dipole axis will be the same, the vector 𝑬 being in this plane. The
position of a point relative to the dipole will be characterized with the aid of the
position vector 𝒓 or with the aid of the polar coordinates 𝑟 and 𝜃 (Fig. 1.9). We shall
introduce the vector 𝒍 passing from the negative charge to the positive one. The
position of the charge +𝑞 relative to the centre of the dipole is determined by the
vector 𝒂, and of the charge −𝑞 by the vector −𝒂. It is obvious that 𝒍 = 2𝒂. We shall
designate the distances to a given point from the charges +𝑞 and −𝑞 by 𝑟+ and 𝑟− ,
respectively.
Owing to the smallness of 𝑎 in comparison with 𝑟, we can assume approximately
that
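$$\begin{aligned} r_+ &= r - a\cos\theta = r - \boldsymbol{a}\cdot\hat{\boldsymbol{e}}_r,\\ r_- &= r + a\cos\theta = r + \boldsymbol{a}\cdot\hat{\boldsymbol{e}}_r. \end{aligned} \tag{1.47}$$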
The potential at a point determined by the position vector 𝒓 is
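$$\varphi(\boldsymbol{r}) = \frac{1}{4\pi\varepsilon_0}\left(\frac{q}{r_+} - \frac{q}{r_-}\right) = \frac{1}{4\pi\varepsilon_0}\,\frac{q\,(r_- - r_+)}{r_+ r_-}.$$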
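Fig. 1.10

The product 𝑟+𝑟− can be replaced with 𝑟². The difference 𝑟− − 𝑟+, according to Eqs.
(1.47), is 2(𝒂 · 𝒆̂𝑟) = 𝒍 · 𝒆̂𝑟. Hence,
$$\varphi(\boldsymbol{r}) = \frac{1}{4\pi\varepsilon_0}\,\frac{q\,(\boldsymbol{l}\cdot\hat{\boldsymbol{e}}_r)}{r^2} = \frac{1}{4\pi\varepsilon_0}\,\frac{\boldsymbol{p}\cdot\hat{\boldsymbol{e}}_r}{r^2} \tag{1.48}$$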
where
𝒑 = 𝑞𝒍 (1.49)
is a characteristic of a dipole called its electric moment. The vector 𝒑 is directed
along the dipole axis from the negative charge to the positive one (Fig. 1.10).
A glance at Eq. (1.48) shows that the field of a dipole is determined by its electric
moment 𝒑. We shall see below that the behaviour of a dipole in an external electric
field is also determined by its electric moment 𝒑. A comparison with Eq. (1.26) shows
that the potential of a dipole field diminishes with the distance more rapidly (as
1/𝑟 2 ) than the potential of a point charge field (which changes in proportion to 1/𝑟).
It can be seen from Fig. 1.9 that 𝒑 · 𝒆ˆ 𝑟 = 𝑝 cos 𝜃. Therefore, Eq. (1.48) can be
written as follows:
$$\varphi(r,\theta) = \frac{1}{4\pi\varepsilon_0}\,\frac{p\cos\theta}{r^2}. \tag{1.50}$$
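
The quality of the approximation behind Eq. (1.50) can be examined by comparing it with the exact potential of the two charges. The sketch assumes that the dipole lies on the 𝑧-axis with its centre at the origin; the numerical values and names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def potential_two_charges(q, l, r, theta):
    """Exact potential of +q at z = +l/2 and -q at z = -l/2."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    r_plus = math.hypot(x, z - l / 2)
    r_minus = math.hypot(x, z + l / 2)
    return K * q * (1 / r_plus - 1 / r_minus)

def potential_dipole(p, r, theta):
    """Approximate potential, Eq. (1.50)."""
    return K * p * math.cos(theta) / r**2

q, l = 1e-9, 1e-3                # assumed charge (C) and separation (m)
for r in (0.01, 0.1, 1.0):
    exact = potential_two_charges(q, l, r, 0.7)
    approx = potential_dipole(q * l, r, 0.7)
    print(r, exact, approx, abs(exact - approx) / abs(exact))
```

The relative discrepancy falls roughly as (𝑙/𝑟)², which is why the formula is reserved for distances large compared with the size of the dipole.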
To find the field strength of a dipole, let us calculate the projections of the
vector 𝑬 onto two mutually perpendicular directions by Eq. (1.44). One of them is
determined by the motion of a point due to the change in the distance 𝑟 (with 𝜃
fixed), the other by the motion of the point due to the change in the angle 𝜃 (with
𝑟 fixed, see Fig. 1.9). The first projection is obtained by differentiation of Eq. (1.50)
with respect to 𝑟:
$$E_r = -\frac{\partial\varphi}{\partial r} = \frac{1}{4\pi\varepsilon_0}\,\frac{2p\cos\theta}{r^3}. \tag{1.51}$$
We shall find the second projection (let us designate it by 𝐸 𝜃 ) by taking the ratio
of the increment of the potential 𝜑 obtained when the angle 𝜃 grows by d𝜃 to the
distance 𝑟 d𝜃 over which the end of the segment 𝑟 moves [in this case the quantity
d𝑙 in Eq. (1.44) equals 𝑟 d𝜃]. Thus,
$$E_\theta = -\frac{1}{r}\,\frac{\partial\varphi}{\partial\theta}.$$
Introducing the value of the derivative of function (1.50) with respect to 𝜃 we get
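$$E_\theta = \frac{1}{4\pi\varepsilon_0}\,\frac{p\sin\theta}{r^3}. \tag{1.52}$$
The sum of the squares of Eqs. (1.51) and (1.52) gives the square of the magnitude of
the vector 𝑬 (see Fig. 1.9):
$$E^2 = E_r^2 + E_\theta^2 = \left(\frac{1}{4\pi\varepsilon_0}\right)^2\left(\frac{p}{r^3}\right)^2\left(4\cos^2\theta + \sin^2\theta\right) = \left(\frac{1}{4\pi\varepsilon_0}\right)^2\left(\frac{p}{r^3}\right)^2\left(1 + 3\cos^2\theta\right).$$
Hence
$$E = \frac{1}{4\pi\varepsilon_0}\,\frac{p}{r^3}\left(1 + 3\cos^2\theta\right)^{1/2}. \tag{1.53}$$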
Assuming in Eq. (1.53) that 𝜃 = 0, we get the strength on the dipole axis:
$$E_\parallel = \frac{1}{4\pi\varepsilon_0}\,\frac{2p}{r^3}. \tag{1.54}$$
The vector 𝑬∥ is directed along the dipole axis. This is in agreement with the axial
symmetry of the problem. Examination of Eq. (1.51) shows that 𝐸𝑟 > 0 when 𝜃 = 0,
and 𝐸𝑟 < 0 when 𝜃 = 𝜋. This signifies that in any case the vector 𝑬∥ has a direction
coinciding with that from −𝑞 to +𝑞 (i.e., with the direction of 𝒑). Equation (1.54) can
therefore be written in the vector form:
$$\boldsymbol{E}_\parallel = \frac{1}{4\pi\varepsilon_0}\,\frac{2\boldsymbol{p}}{r^3}. \tag{1.55}$$
Assuming in Eq. (1.53) that 𝜃 = 𝜋/2, we get the field strength on the straight line
passing through the centre of the dipole and perpendicular to its axis:
$$E_\perp = \frac{1}{4\pi\varepsilon_0}\,\frac{p}{r^3}. \tag{1.56}$$
By Eq. (1.51), when 𝜃 = 𝜋/2, the projection 𝐸𝑟 equals zero. Hence, the vector 𝑬 ⊥ is
parallel to the dipole axis. It follows from Eq. (1.52) that when 𝜃 = 𝜋/2, the projection
𝐸 𝜃 is positive. This signifies that the vector 𝑬 ⊥ is directed toward the growth of the
angle 𝜃, i.e., antiparallel to the vector 𝒑.
The field strength of a dipole is characterized by the circumstance that it
diminishes with the distance from the dipole in proportion to 1/𝑟³, i.e., more rapidly
than the field strength of a point charge (which diminishes in proportion to 1/𝑟²).
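
Equations (1.51) to (1.56) can be collected in a short sketch that also confirms the 1/𝑟³ law. The function names and the numerical values are assumptions.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def dipole_field_components(p, r, theta):
    """Projections E_r and E_theta, Eqs. (1.51) and (1.52)."""
    e_r = K * 2 * p * math.cos(theta) / r**3
    e_theta = K * p * math.sin(theta) / r**3
    return e_r, e_theta

def dipole_field_magnitude(p, r, theta):
    """Magnitude of E, Eq. (1.53)."""
    return K * p / r**3 * math.sqrt(1 + 3 * math.cos(theta) ** 2)

p, r = 1e-12, 0.5                # assumed moment (C*m) and distance (m)
print(dipole_field_magnitude(p, r, 0.0))           # on the axis, Eq. (1.54)
print(dipole_field_magnitude(p, r, math.pi / 2))   # perpendicular, Eq. (1.56)
print(dipole_field_magnitude(p, 2 * r, 1.1) / dipole_field_magnitude(p, r, 1.1))
```

The last printed ratio is 1/8: doubling the distance weakens the field eightfold.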
Figure 1.11 shows 𝑬 lines (the solid lines) and equipotential surfaces (the dashed
lines) of the field of a dipole. According to Eq. (1.50), when 𝜃 = 𝜋/2, the potential
vanishes for all the 𝑟’s. Thus, all the points of a plane at right angles to the dipole
axis and passing through its middle have a zero potential. This could have been
predicted because the distances from the charges +𝑞 and −𝑞 to any point of this
plane are identical.
Fig. 1.11   Fig. 1.12

Now let us turn to the behaviour of a dipole in an external electric field. If a
dipole is placed in a homogeneous electric field, the charges +𝑞 and −𝑞 forming
the dipole will be under the action of the forces 𝑭1 and 𝑭2 equal in magnitude, but
opposite in direction (Fig. 1.12). These forces form a couple whose arm is 𝑙 sin 𝛼, i.e.,
depends on the orientation of the dipole relative to the field. The magnitude of each
of the forces is 𝑞𝐸. Multiplying it by the arm, we get the magnitude of the torque
acting on a dipole:
𝑇 = 𝑞𝐸𝑙 sin 𝛼 = 𝑝𝐸 sin 𝛼 (1.57)
(𝑝 is the electric moment of the dipole). It is easy to see that Eq. (1.57) can be written
in the vector form
𝑻 = 𝒑 × 𝑬. (1.58)
The torque (1.58) tends to turn a dipole so that its electric moment 𝒑 is in the direction
of the field.
Let us find the potential energy belonging to a dipole in an external electric
field. By Eq. (1.29), this energy is
𝑊p = 𝑞𝜑+ − 𝑞𝜑− = 𝑞(𝜑+ − 𝜑− ). (1.59)
Here 𝜑+ and 𝜑− are the values of the potential of the external field at the points
where the charges +𝑞 and −𝑞 are placed.
The potential of a homogeneous field diminishes linearly in the direction of
the vector 𝑬. Assuming that the 𝑥-axis is this direction (Fig. 1.13), we can write that
𝐸 = 𝐸 𝑥 = −d𝜑/d𝑥. A glance at Fig. 1.13 shows that the difference 𝜑+ − 𝜑− equals the
increment of the potential on the segment 𝛥𝑥 = 𝑙 cos 𝛼:
$$\varphi_+ - \varphi_- = \frac{\mathrm{d}\varphi}{\mathrm{d}x}\,l\cos\alpha = -El\cos\alpha.$$
Introducing this value into Eq. (1.59), we find that
𝑊p = −𝑞𝐸𝑙 cos 𝛼 = −𝑝𝐸 cos 𝛼. (1.60)
Here 𝛼 is the angle between the vectors 𝒑 and 𝑬. We can therefore write Eq. (1.60)
in the form
$$W_{\mathrm{p}} = -\boldsymbol{p}\cdot\boldsymbol{E}. \tag{1.61}$$

Fig. 1.13   Fig. 1.14
We must note that this expression takes no account of the energy of interaction of
the charges +𝑞 and −𝑞 forming a dipole.
We have obtained Eq. (1.61) assuming for simplicity’s sake that the field is
homogeneous. This equation also holds, however, for an inhomogeneous field.
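
The vector forms (1.58) and (1.61) are easy to evaluate directly. In the sketch below the field is taken along the 𝑥-axis and the dipole lies in the 𝑥𝑦-plane; these choices, the numbers, and the helper names are assumptions.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def torque(p, e):
    """Eq. (1.58): T = p x E."""
    return cross(p, e)

def potential_energy(p, e):
    """Eq. (1.61): W_p = -p . E."""
    return -dot(p, e)

p0, e0, alpha = 2e-12, 1e4, math.radians(30)   # assumed values
p = (p0 * math.cos(alpha), p0 * math.sin(alpha), 0.0)
e = (e0, 0.0, 0.0)
print(torque(p, e))            # z-component equals -p0*e0*sin(alpha)
print(potential_energy(p, e))  # equals -p0*e0*cos(alpha)
```

The negative 𝑧-component of the torque corresponds to a rotation that reduces 𝛼, i.e., one that turns 𝒑 toward the direction of the field.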
Let us consider a dipole in an inhomogeneous field that is symmetrical relative
to the 𝑥-axis⁵. Let the centre of the dipole be on this axis, the dipole electric moment
making with the axis an angle 𝛼, differing from 𝜋/2 (Fig. 1.14). In this case, the forces
acting on the dipole charges are not identical in magnitude. Therefore, apart from
the rotational moment (torque), the dipole will experience a force tending to move
it in the direction of the 𝑥-axis. To find the value of this force, we shall use Eq. (1.40),
according to which
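$$F_x = -\frac{\partial W_{\mathrm{p}}}{\partial x},\quad F_y = -\frac{\partial W_{\mathrm{p}}}{\partial y},\quad F_z = -\frac{\partial W_{\mathrm{p}}}{\partial z}.$$
In view of Eq. (1.60), we can write
$$W_{\mathrm{p}}(x, y, z) = -pE(x, y, z)\cos\alpha$$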
(we consider the orientation of the dipole relative to the vector 𝑬 to be constant,
𝛼 = constant).
For points on the 𝑥-axis, the derivatives of 𝐸 with respect to 𝑦 and 𝑧 are zero.
Accordingly, ∂𝑊p /∂𝑦 = ∂𝑊p /∂𝑧 = 0. Thus, only the force component 𝐹 𝑥 differs
from zero. It is
$$F_x = -\frac{\partial W_{\mathrm{p}}}{\partial x} = p\,\frac{\partial E}{\partial x}\cos\alpha. \tag{1.62}$$
This result can also be obtained if we take account of the fact that the field strengths
at the points where the charges +𝑞 and −𝑞 are located (see Fig. 1.14) differ by the amount
(∂𝐸/∂𝑥)𝑙 cos 𝛼. Accordingly, the difference between the forces acting on the charges
is 𝑞(∂𝐸/∂𝑥)𝑙 cos 𝛼, which coincides with Eq. (1.62).

⁵A particular case of such a field is that of a point charge if we take a straight line passing through
the charge as the 𝑥-axis.

Fig. 1.15
When 𝛼 is less than 𝜋/2, the value of 𝐹 𝑥 determined by Eq. (1.62) is positive. This
signifies that under the action of the force the dipole is pulled into the region of a
stronger field (see Fig. 1.14). When 𝛼 is greater than 𝜋/2, the dipole is pushed out of
the field.
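
Equation (1.62) can be compared with the net force on the two charges for the particular case mentioned in the footnote, the field of a point charge 𝑄 placed at the origin. All numerical values and function names in this sketch are assumptions.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def force_by_formula(p, alpha, big_q, x0):
    """Eq. (1.62) with E(x) = K*Q/x**2 on the x-axis, so dE/dx = -2*K*Q/x**3."""
    de_dx = -2 * K * big_q / x0**3
    return p * math.cos(alpha) * de_dx

def force_on_two_charges(q, l, alpha, big_q, x0):
    """x-component of the net Coulomb force on +q and -q, whose centre is
    at (x0, 0) and whose axis makes the angle alpha with the x-axis."""
    fx = 0.0
    for sign in (+1, -1):
        x = x0 + sign * (l / 2) * math.cos(alpha)
        y = sign * (l / 2) * math.sin(alpha)
        r3 = (x * x + y * y) ** 1.5
        fx += sign * q * K * big_q * x / r3
    return fx

q, l, alpha, big_q, x0 = 1e-9, 1e-4, 0.6, 1e-6, 0.5   # assumed values
print(force_by_formula(q * l, alpha, big_q, x0))
print(force_on_two_charges(q, l, alpha, big_q, x0))
```

With 𝑄 > 0 and 𝛼 < 𝜋/2 both values are negative: the force points toward the charge 𝑄, that is, into the region of the stronger field.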
In the case shown in Fig. 1.15, only the derivative ∂𝐸/∂𝑦 differs from zero for
points on the 𝑦-axis. Therefore, the force acting on the dipole is determined by the
component
$$F_y = -\frac{\partial W_{\mathrm{p}}}{\partial y} = p\,\frac{\partial E}{\partial y},\quad (\cos\alpha = 1).$$
The derivative ∂𝐸/∂𝑦 is negative. Consequently, the force is directed as shown in
the figure. Thus, in this case too, the dipole is pulled into the field.
We note that just as −∂𝑊p/∂𝑥 gives the projection of the force acting on
the system onto the 𝑥-axis, so the derivative of Eq. (1.60) with respect to 𝛼,
taken with the opposite sign, gives the projection of the torque onto the 𝛼-“axis”:
𝑇𝛼 = −𝑝𝐸 sin 𝛼. The minus sign appears because the 𝛼-“axis” and the torque
𝑻 are directed oppositely (see Fig. 1.12).

1.10. Field of a System of Charges at Great Distances

Let us take a system of 𝑁 charges 𝑞1 , 𝑞2 , . . . , 𝑞𝑁 in a volume having linear dimensions
of the order of 𝑙, and study the field set up by this system at distances 𝑟 that are great
in comparison with 𝑙 (𝑟 ≫ 𝑙). We take the origin of coordinates 𝑂 inside the volume
occupied by the system and shall determine the positions of the charges with the
aid of the position vectors 𝒓𝑖 (Fig. 1.16; to simplify the figure, we have shown only
the position vector of the 𝑖-th charge).
The potential at the point determined by the position vector 𝒓 is
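$$\varphi(\boldsymbol{r}) = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i}{|\boldsymbol{r} - \boldsymbol{r}_i|}. \tag{1.63}$$

Fig. 1.16

Owing to the smallness of 𝑟𝑖 in comparison with 𝑟, we can assume that
$$|\boldsymbol{r} - \boldsymbol{r}_i| = r - \boldsymbol{r}_i\cdot\hat{\boldsymbol{e}}_r = r\left(1 - \frac{\boldsymbol{r}_i\cdot\hat{\boldsymbol{e}}_r}{r}\right)$$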
[compare with Eqs. (1.47)]. Introduction of this expression into Eq. (1.63) yields
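$$\varphi(\boldsymbol{r}) = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i}{r}\,\frac{1}{1 - (\boldsymbol{r}_i\cdot\hat{\boldsymbol{e}}_r/r)}. \tag{1.64}$$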
Using the formula
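$$\frac{1}{1-x} \approx 1 + x$$
which holds when 𝑥 ≪ 1, we can transform Eq. (1.64) as follows:
$$\varphi(\boldsymbol{r}) = \frac{1}{4\pi\varepsilon_0}\sum_{i=1}^{N}\frac{q_i}{r}\left(1 + \frac{\boldsymbol{r}_i\cdot\hat{\boldsymbol{e}}_r}{r}\right) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{r}\sum_{i=1}^{N} q_i + \frac{1}{4\pi\varepsilon_0}\,\frac{1}{r^2}\left(\sum_{i=1}^{N} q_i\boldsymbol{r}_i\right)\cdot\hat{\boldsymbol{e}}_r. \tag{1.65}$$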
The first term of the expression obtained is the potential of the field of a point
charge having the value 𝑞 = ∑𝑖 𝑞𝑖 [compare with Eq. (1.26)]. The second term has
the same form as the expression determining the potential of a dipole field, the part
of the electric moment of the dipole being played by the quantity
$$\boldsymbol{p} = \sum_{i=1}^{N} q_i\boldsymbol{r}_i. \tag{1.66}$$
This quantity is called the dipole electric moment of a system of charges. It is
easy to verify that for a dipole Eq. (1.66) transforms into the expression 𝒑 = 𝑞𝒍 which
we are already familiar with.
If the total charge of a system is zero (∑𝑖 𝑞𝑖 = 0), the value of the dipole moment
does not depend on our choice of the origin of coordinates. To convince ourselves
that this is true, let us take two arbitrary origins of coordinates 𝑂 and 𝑂′ (Fig. 1.17).
The position vectors of the 𝑖-th charge drawn from these points are related as
follows:
$$\boldsymbol{r}_i' = \boldsymbol{b} + \boldsymbol{r}_i \tag{1.67}$$
(the meaning of the vector 𝒃 is clear from the figure). With account taken of Eq. (1.67), the
dipole moment in the system with the origin 𝑂′ is
$$\boldsymbol{p}' = \sum_i q_i\boldsymbol{r}_i' = \sum_i q_i(\boldsymbol{b} + \boldsymbol{r}_i) = \boldsymbol{b}\sum_i q_i + \sum_i q_i\boldsymbol{r}_i.$$
The first addend equals zero (because ∑𝑖 𝑞𝑖 = 0). The second one is 𝒑, the dipole
moment in a coordinate system with its origin at 𝑂. We have thus obtained that
𝒑′ = 𝒑.

Fig. 1.17   Fig. 1.18
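
The independence of the dipole moment from the origin for a neutral system is easily confirmed numerically. The charges, their positions, and the shifted origin below are assumed values; the function name is hypothetical.

```python
import math

def dipole_moment(charges, origin=(0.0, 0.0, 0.0)):
    """Eq. (1.66), with the position vectors drawn from `origin`.

    charges: list of (q, (x, y, z)) pairs, an assumed data layout.
    """
    return tuple(sum(q * (pos[k] - origin[k]) for q, pos in charges)
                 for k in range(3))

# A neutral system: the charges sum to zero.
neutral = [(2e-9, (0.1, 0.0, 0.0)),
           (-1e-9, (0.0, 0.2, 0.0)),
           (-1e-9, (0.0, 0.0, -0.3))]
print(dipole_moment(neutral))
print(dipole_moment(neutral, origin=(5.0, -3.0, 2.0)))

# A system with a non-zero total charge: the moment depends on the origin.
charged = neutral + [(1e-9, (0.0, 0.0, 0.0))]
print(dipole_moment(charged))
print(dipole_moment(charged, origin=(5.0, -3.0, 2.0)))
```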
Equation (1.65) is in essence the first two terms of the series expansion of function
(1.63) in powers of 𝑟𝑖/𝑟. When ∑𝑖 𝑞𝑖 ≠ 0, the first term of Eq. (1.65) makes the main
contribution to the potential (the second term diminishes in proportion to 1/𝑟²
and is therefore much smaller than the first one). For an electrically neutral system
(∑𝑖 𝑞𝑖 = 0), the first term equals zero, and the potential is determined mainly by the
second term of Eq. (1.65). This is how matters stand, in particular, for the field of a
dipole.
For the system of charges depicted in Fig. 1.18a and called a quadrupole, both
∑𝑖 𝑞𝑖 and 𝒑 equal zero so that Eq. (1.65) gives a zero value of the potential. Actually,
however, the field of a quadrupole, although it is much weaker than that of a dipole
(with the same values of 𝑞 and 𝑙), differs from zero. The potential of the field set up
by a quadrupole is determined mainly by the third term of the expansion that is
proportional to 1/𝑟³. To obtain this term, we must take into consideration quantities
of the order of (𝑟𝑖/𝑟)² which we disregarded in deriving Eq. (1.65). For the system of
charges shown in Fig. 1.18b and called an octupole, the third term of the expansion
also equals zero. The potential of the field of such a system is determined by the
fourth term of the expansion, which is proportional to 1/𝑟⁴.
It must be noted that the quantity equal to ∑𝑖 𝑞𝑖 in the numerator of the first
term of Eq. (1.65) is called a monopole or a zero-order multipole, a dipole is also
called a first-order multipole, a quadrupole is called a second-order multipole,
and so on.
Thus, in the general case, the field of a system of charges at great distances
can be represented as the superposition of fields set up by multipoles of different
orders—a monopole, dipole, quadrupole, octupole, etc.
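
To see how well the first two terms of the expansion represent the field, Eq. (1.65) can be compared with the exact sum (1.63). The sketch below uses an assumed system of three charges confined to a region about 1 cm across and an observation point about 13 m away.

```python
import math

EPS0 = 8.8541878128e-12          # electric constant, F/m
K = 1 / (4 * math.pi * EPS0)

def potential_exact(charges, point):
    """Eq. (1.63): direct sum over all the charges."""
    return sum(K * q / math.dist(pos, point) for q, pos in charges)

def potential_two_terms(charges, point):
    """Eq. (1.65): monopole term plus dipole term."""
    r = math.sqrt(sum(c * c for c in point))
    e_r = tuple(c / r for c in point)
    total_q = sum(q for q, _ in charges)
    p = tuple(sum(q * pos[k] for q, pos in charges) for k in range(3))
    p_dot_er = sum(pk * ek for pk, ek in zip(p, e_r))
    return K * (total_q / r + p_dot_er / r**2)

system = [(3e-9, (0.01, 0.0, 0.0)),
          (-1e-9, (0.0, 0.01, 0.0)),
          (1e-9, (0.0, 0.0, -0.01))]
point = (3.0, 4.0, 12.0)
print(potential_exact(system, point), potential_two_terms(system, point))
```

The remaining discrepancy is of the order of (𝑟𝑖/𝑟)² and corresponds to the quadrupole and higher terms that Eq. (1.65) leaves out.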

1.11. A Description of the Properties of Vector Fields

To continue our study of the electric field, we must acquaint ourselves with the
mathematical tools used to describe the properties of vector fields. These tools
are called vector analysis. In the present section, we shall treat the fundamental
concepts and selected formulas of vector analysis, and also prove its two main
theorems—the Ostrogradsky-Gauss theorem (sometimes called Gauss’s divergence
theorem) and Stokes’s theorem.
The quantities used in vector analysis can be best illustrated for the field of the
velocity vector of a flowing liquid. We shall therefore introduce these quantities
while dealing with the flow of an ideal incompressible liquid, and then extend the
results obtained to vector fields of any nature.
We are already acquainted with one of the concepts of vector analysis. This is
the gradient, used to characterize scalar fields. If the value of the scalar quantity
𝜑 = 𝜑(𝑥, 𝑦, 𝑧) is assigned to every point P having the coordinates 𝑥, 𝑦, 𝑧, we say
that the scalar field of 𝜑 has been defined. The gradient of the quantity 𝜑 is defined as
the vector
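$$\operatorname{grad}\varphi = \frac{\partial\varphi}{\partial x}\,\hat{\boldsymbol{e}}_x + \frac{\partial\varphi}{\partial y}\,\hat{\boldsymbol{e}}_y + \frac{\partial\varphi}{\partial z}\,\hat{\boldsymbol{e}}_z. \tag{1.68}$$
The increment of the function 𝜑 upon displacement over the length d𝒍 =
𝒆̂𝑥 d𝑥 + 𝒆̂𝑦 d𝑦 + 𝒆̂𝑧 d𝑧 is
$$\mathrm{d}\varphi = \frac{\partial\varphi}{\partial x}\,\mathrm{d}x + \frac{\partial\varphi}{\partial y}\,\mathrm{d}y + \frac{\partial\varphi}{\partial z}\,\mathrm{d}z$$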
which can be written in the form
d𝜑 = grad 𝜑 · d𝒍. (1.69)
Now we shall go over to establishing the characteristics of vector fields.
Vector flux. Assume that the flow of a liquid is characterized by the field of
the velocity vector. The volume of liquid flowing in unit time through an imaginary
surface 𝑆 is called the flux of the liquid through this surface. To find the flux, let
us divide the surface into elementary sections of the size 𝛥𝑆. It can be seen from
Fig. 1.19 that during the time 𝛥𝑡 a volume of liquid equal to
$$\Delta V = (\Delta S\cos\alpha)\,v\,\Delta t$$
will pass through section 𝛥𝑆. Dividing this volume by the time 𝛥𝑡, we shall find the
flux through surface 𝛥𝑆:
$$\Delta\Phi = \frac{\Delta V}{\Delta t} = \Delta S\,v\cos\alpha.$$

Fig. 1.19
Passing over to differentials, we find that
d𝛷 = (𝑣 cos 𝛼) d𝑆. (1.70)
Equation (1.70) can be written in two other ways. First, if we take into account that
𝑣 cos 𝛼 gives the projection of the velocity vector onto the normal 𝒆ˆ 𝑛 to area d𝑆, we
can write Eq. (1.70) in the form
d𝛷 = 𝑣𝑛 d𝑆. (1.71)
Second, we can introduce the vector d𝑺 whose magnitude equals that of area d𝑆,
while its direction coincides with the direction of a normal 𝒏̂ to the area:
$$\mathrm{d}\boldsymbol{S} = \mathrm{d}S\,\hat{\boldsymbol{n}}.$$
Since the direction of the vector 𝒏̂ is chosen arbitrarily (it can be directed to either
side of the area), d𝑺 is not a true vector, but a pseudovector. The angle 𝛼
in Eq. (1.70) is the angle between the vectors 𝒗 and d𝑺. Hence, this equation can be
written in the form
d𝛷 = 𝒗 · d𝑺. (1.72)
By summating the fluxes through all the elementary areas into which we have
divided surface 𝑆, we get the flux of the liquid through 𝑆:
$$\Phi_v = \int_S \boldsymbol{v}\cdot\mathrm{d}\boldsymbol{S} = \int_S v_n\,\mathrm{d}S. \tag{1.73}$$

Fig. 1.20

A similar expression written for an arbitrary vector field 𝒂, i.e., the quantity
$$\Phi_a = \int_S \boldsymbol{a}\cdot\mathrm{d}\boldsymbol{S} = \int_S a_n\,\mathrm{d}S \tag{1.74}$$
is called the flux of the vector 𝒂 through surface 𝑆. In accordance with this
definition, the flux of a liquid can be called the flux of the vector 𝒗 through the
relevant surface [see Eq. (1.73)].
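
The summation over elementary areas that leads to Eq. (1.74) can be imitated numerically. The sketch is restricted, as a simplifying assumption, to a flat rectangular surface lying in the plane 𝑧 = 0, whose normal is directed along +𝑧 or −𝑧; the function name and the sample fields are illustrative.

```python
def flux_through_rectangle(field, width, height, n=200, normal_sign=+1):
    """Sum of a_n * dS over an n-by-n grid of elementary areas.

    field: function (x, y, z) -> (ax, ay, az).
    The surface is 0 <= x <= width, 0 <= y <= height in the plane z = 0;
    normal_sign selects the normal along +z (+1) or -z (-1).
    """
    dx, dy = width / n, height / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * dx, (j + 0.5) * dy
            az = field(x, y, 0.0)[2]
            total += normal_sign * az * dx * dy   # a_n * dS
    return total

def uniform(x, y, z):
    """Uniform flow inclined to the normal; only the z-component crosses."""
    return (1.0, 0.0, 2.0)

def sheared(x, y, z):
    """Non-uniform field whose normal component grows with x."""
    return (0.0, 0.0, x)

print(flux_through_rectangle(uniform, 2.0, 3.0))                   # 2 * 6 = 12
print(flux_through_rectangle(uniform, 2.0, 3.0, normal_sign=-1))   # sign reversed
print(flux_through_rectangle(sheared, 1.0, 1.0))                   # integral of x = 0.5
```

Reversing the normal changes only the sign of the result, which shows that the flux is an algebraic quantity.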
The flux of a vector is an algebraic quantity. Its sign depends on the choice of
the direction of a normal to the elementary areas into which surface 𝑆 is divided
in calculating the flux. Reversal of the direction of the normal changes the sign of
𝑎𝑛 and, therefore, the sign of the quantity (1.74). The customary practice for closed
surfaces is calculation of the flux “emerging outward” from the region enclosed
by the surface. Accordingly, in the following we shall always imply that 𝒏̂ is an
outward normal.
We can give an illustrative geometrical interpretation of the vector flux. For
this purpose, we shall represent a vector field by a system of lines 𝒂 constructed so
that the density of the lines at every point is numerically equal to the magnitude of
the vector 𝒂 at the same point of the field (compare with the rule for constructing
the lines of the vector 𝑬 set out at the end of Sec. 1.5). Let us find the number 𝛥𝑁 of
intersections of the field lines with the imaginary area 𝛥𝑆. A glance at Fig. 1.20 shows
that this number equals the density of the lines (i.e., 𝑎) multiplied by 𝛥𝑆⊥ = 𝛥𝑆 cos 𝛼:
𝛥𝑁 (=) 𝑎𝛥𝑆 cos 𝛼 = 𝑎𝑛 𝛥𝑆.
We are speaking only about the numerical equality between 𝛥𝑁 and 𝑎𝑛 𝛥𝑆. This
is why the equality sign is enclosed in parentheses. According to Eq. (1.74), the
expression 𝑎𝑛 𝛥𝑆 is 𝛥𝛷𝑎, the flux of the vector 𝒂 through area 𝛥𝑆. Thus,
𝛥𝑁 (=) 𝛥𝛷𝑎 . (1.75)
For the sign of 𝛥𝑁 to coincide with that of 𝛥𝛷𝑎 , we must consider those inter-
sections to be positive for which the angle 𝛼 between the positive direction of a
field line and a normal to the area is acute. The intersection should be considered
negative if the angle 𝛼 is obtuse. For the area shown in Fig. 1.20, all three intersections
are positive: 𝛥𝑁 = +3 (𝛥𝛷𝑎 in this case is also positive because 𝑎𝑛 > 0). If
the direction of the normal in Fig. 1.20 is reversed, the intersections will become
negative (𝛥𝑁 = −3), and the flux 𝛥𝛷𝑎 will also be negative.

Fig. 1.21   Fig. 1.22
Summation of Eq. (1.75) over the finite imaginary surface 𝑆 yields
$$\sum \Delta\Phi_a \;(=)\; \sum \Delta N = N_+ - N_- \tag{1.76}$$
where 𝑁+ and 𝑁− are the total numbers of positive and negative intersections of the
field lines with surface 𝑆, respectively.
The reader may be puzzled by the circumstance that since the flux, as a rule, is
expressed by a fractional number, the number of intersections of the field lines with
a surface, being equated to the flux, will also be fractional. Do not be confused by this,
however. Field lines are a purely conventional image devoid of physical meaning.
Let us take an imaginary surface in the form of a strip of paper whose bottom
part is twisted relative to the top one through the angle 𝜋 (Fig. 1.21). The direction
of a normal must be chosen identically for the entire surface. Hence, if in the top
part of the strip a positive normal is directed to the right, then in the bottom part a
normal will be directed to the left. Accordingly, the intersections of the field lines
depicted in Fig. 1.21 with the top half of the surface must be considered positive, and
with the bottom half, negative.
An outward normal is considered to be positive for a closed surface (Fig. 1.22).
Therefore, the intersections corresponding to outward protrusion of the lines (in
this case the angle 𝛼 is acute) must be taken with the plus sign, and the ones appearing
when the lines enter the surface (in this case the angle 𝛼 is obtuse) must be taken
with the minus sign.
Inspection of Fig. 1.22 shows that when the field lines pass through a closed surface
without interruption, each line intersecting the surface enters it and emerges from it
the same number of times. As a result, the flux of the corresponding vector through
this surface equals zero. It is easy to see that if field lines begin or end inside a surface, the
vector flux through the closed surface will numerically equal the difference between
the number of lines beginning inside the surface (𝑁beg ) and the number of lines
terminating inside the surface (𝑁term ):
𝛷𝑎 (=) 𝑁beg − 𝑁term . (1.77)
The sign of the flux depends on which of these numbers is greater. When 𝑁beg is
equal to 𝑁term , the flux equals zero.

Fig. 1.23

Divergence. Assume that we are given the field of the velocity vector of an
incompressible continuous liquid. Let us take an imaginary closed surface 𝑆 in
the vicinity of point P (Fig. 1.23). If in the volume confined by this surface no
liquid appears and no liquid vanishes, then the flux flowing outward through the
surface will evidently equal zero. A liquid flux 𝛷𝑣 other than zero will indicate that
there are liquid sources or sinks inside the surface, i.e., points at which the liquid
enters the volume (sources) or emerges from it (sinks). The value of the flux
determines the total algebraic power of the sources and sinks⁶. When the sources
predominate over the sinks, the flux will be positive, and when
the sinks predominate, negative.

⁶The power of a source (sink) is defined as the volume of liquid discharged (absorbed) in unit time.
A sink can be considered as a source with a negative power.

The quotient obtained when dividing the flux 𝛷𝑣 by the volume which it flows
out from, i.e.,
$$\frac{\Phi_v}{V} \tag{1.78}$$
gives the average unit power of the sources confined in the volume 𝑉 . In the limit
when 𝑉 tends to zero, i.e., when the volume 𝑉 contracts to point P, expression (1.78)
gives the true unit power of the sources at point P, which is called the divergence
of the vector 𝒗 (it is designated by div 𝒗). Thus, by definition,
$$\operatorname{div}\mathbf{v} = \lim_{V\to P}\frac{\Phi_v}{V}.$$
The divergence of any vector 𝒂 is determined in a similar way:
$$\operatorname{div}\mathbf{a} = \lim_{V\to P}\frac{\Phi_a}{V} = \lim_{V\to P}\frac{1}{V}\oint_S \mathbf{a}\cdot d\mathbf{S}. \tag{1.79}$$
The integral is taken over an arbitrary closed surface 𝑆 surrounding point P⁷; 𝑉 is the
volume confined by this surface. Since in the limit transition 𝑉 →P the surface 𝑆
also shrinks to zero, we can assume that expression (1.79) does not depend on the
shape of the surface. This assumption is confirmed by strict calculations.
Let us surround point P with a spherical surface of an extremely small radius 𝑟
(Fig. 1.24). Owing to the smallness of 𝑟, the volume 𝑉 enclosed by the sphere will
also be very small. We can therefore consider with a high degree of accuracy that
the value of div 𝒂 within the limits of the volume 𝑉 is constant⁸. In this case, we
can write in accordance with Eq. (1.79) that
𝛷𝑎 ≈ (div 𝒂)𝑉
where 𝛷𝑎 is the flux of the vector 𝒂 through the surface surrounding the volume 𝑉 .
By Eq. (1.77), 𝛷𝑎 numerically equals 𝑁beg , the number of lines of 𝒂 beginning inside 𝑉 , if div 𝒂 at
point P is positive, or −𝑁term , where 𝑁term is the number of lines of 𝒂 terminating inside 𝑉 , if div 𝒂
at point P is negative.
It follows from the above that the lines of the vector 𝒂 begin in the closest
vicinity of a point with a positive divergence. The field lines “diverge” from this
point; the latter is the “source” of the field (Fig. 1.24a). On the other hand, in the
vicinity of a point with a negative divergence, the lines of the vector 𝒂 terminate.
The field lines “converge” toward this point; the latter is the “sink” of the field
(Fig. 1.24b). The greater the absolute value of div 𝒂, the bigger is the number of lines
that begin or terminate in the vicinity of the given point.
It can be seen from definition (1.79) that the divergence is a scalar function
of the coordinates determining the positions of points in space (briefly—a point
function). Definition (1.79) is the most general one that is independent of the kind
of coordinate system used.
Let us find an expression for the divergence in a Cartesian coordinate system.
We shall consider a small volume in the form of a parallelepiped with edges parallel
to the coordinate axes in the vicinity of point 𝑃 (𝑥, 𝑦, 𝑧) (Fig. 1.25). The vector flux
through the surface of the parallelepiped is formed from the fluxes passing through
each of the six faces separately.

⁷The circle on the integral sign signifies that integration is performed over a closed surface.
⁸It is assumed that the value of div 𝒂 changes continuously, without any jumps, when passing
from one point of a field to another.

Fig. 1.24 Fig. 1.25

Let us find the flux through the pair of faces perpendicular to the 𝑥-axis (in
Fig. 1.25 these faces are designated by shaded areas and by the numbers 1 and 2). The
outward normal 𝒏ˆ 2 to face 2 coincides with the direction of the 𝑥-axis. Hence, for
points of this face, 𝑎𝑛2 = 𝑎𝑥 . The outward normal 𝒏ˆ 1 to face 1 is directed oppositely
to the 𝑥-axis. Therefore, for points on this face, 𝑎𝑛1 = −𝑎𝑥 . The flux through face 2
can be written in the form
𝑎𝑥,2 𝛥𝑦 𝛥𝑧
where 𝑎𝑥,2 is the value of 𝑎𝑥 averaged over face 2. The flux through face 1 is
−𝑎𝑥,1 𝛥𝑦 𝛥𝑧
where 𝑎𝑥,1 is the average value of 𝑎𝑥 for face 1. The total flux through faces 1 and 2
is determined by the expression
(𝑎𝑥,2 − 𝑎𝑥,1 ) 𝛥𝑦 𝛥𝑧. (1.80)
The difference 𝑎𝑥,2 − 𝑎𝑥,1 is the increment of the average (over a face) value
of 𝑎𝑥 upon a displacement along the 𝑥-axis by 𝛥𝑥. Owing to the smallness of the
parallelepiped (we remind our reader that we shall let its dimensions shrink to zero),
this increment can be written in the form (∂𝑎𝑥 /∂𝑥) 𝛥𝑥, where the value ∂𝑎𝑥 /∂𝑥 is
taken at point P⁹. Therefore, Eq. (1.80) becomes
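$$\frac{\partial a_x}{\partial x}\,\Delta x\,\Delta y\,\Delta z = \frac{\partial a_x}{\partial x}\,\Delta V.$$
Similar reasoning allows us to obtain the following expressions for the fluxes
through the pairs of faces perpendicular to the 𝑦- and 𝑧-axes:
$$\frac{\partial a_y}{\partial y}\,\Delta x\,\Delta y\,\Delta z = \frac{\partial a_y}{\partial y}\,\Delta V, \qquad \frac{\partial a_z}{\partial z}\,\Delta x\,\Delta y\,\Delta z = \frac{\partial a_z}{\partial z}\,\Delta V.$$
Thus, the total flux through the entire closed surface is determined by the expression
$$\Phi_a = \left(\frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}\right)\Delta V.$$
Dividing this expression by 𝛥𝑉 , we find the divergence of the vector 𝒂 at point
𝑃 (𝑥, 𝑦, 𝑧):
$$\operatorname{div}\mathbf{a} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}. \tag{1.81}$$

⁹The inaccuracy which we tolerate here vanishes when the volume shrinks to point P in the limit
transition.
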
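As a numerical illustration of Eq. (1.81), we can compare the flux through a small cube divided by its volume with the sum of the partial derivatives. The sketch below rests on assumptions of our own that are not taken from the text: the field 𝒂 = (𝑥²𝑦, 𝑦𝑧, 𝑥𝑧²), the point P, and the edge length h are arbitrary choices, and the average of a component over a face is approximated by its value at the centre of the face.

```python
def field(x, y, z):
    """Hypothetical smooth vector field a = (x^2 y, y z, x z^2)."""
    return (x * x * y, y * z, x * z * z)


def divergence_exact(x, y, z):
    # Eq. (1.81): d(x^2 y)/dx + d(y z)/dy + d(x z^2)/dz
    return 2 * x * y + z + 2 * x * z


def divergence_from_flux(x, y, z, h=1e-3):
    """Outward flux through a cube of edge h centred on (x, y, z), divided by its volume."""
    half = h / 2
    area = h * h
    flux = 0.0
    # Pair of faces perpendicular to the x-axis: +a_x on one face, -a_x on the other.
    flux += (field(x + half, y, z)[0] - field(x - half, y, z)[0]) * area
    # Pair of faces perpendicular to the y-axis.
    flux += (field(x, y + half, z)[1] - field(x, y - half, z)[1]) * area
    # Pair of faces perpendicular to the z-axis.
    flux += (field(x, y, z + half)[2] - field(x, y, z - half)[2]) * area
    return flux / h**3


P = (0.7, -1.2, 0.4)  # arbitrary point chosen for the illustration
approx = divergence_from_flux(*P)
exact = divergence_exact(*P)
print(approx, exact)
```

For this polynomial field the two numbers agree to within rounding error. For a general smooth field the agreement should improve as h is reduced, in line with footnote 9.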
The Ostrogradsky-Gauss Theorem. If we know the divergence of the vector
𝒂 at every point of space, we can calculate the flux of this vector through any closed
surface of finite dimensions. Let us first do this for the flux of the vector 𝒗 (a
liquid flux). The product of div 𝒗 and d𝑉 gives the power of the sources of the
liquid confined within the volume d𝑉 . The sum of such products, i.e., the integral
$\int_V (\operatorname{div}\mathbf{v})\,dV$,
gives the total algebraic power of the sources confined in the volume 𝑉 over which
integration is performed. Owing to the incompressibility of the liquid, the total power
of the sources must equal the liquid flux emerging through the surface 𝑆 enclosing the
volume 𝑉 . We thus arrive at the equation
$$\oint_S \mathbf{v}\cdot d\mathbf{S} = \int_V (\operatorname{div}\mathbf{v})\,dV.$$
A similar equation holds for a vector field of any nature:
$$\oint_S \mathbf{a}\cdot d\mathbf{S} = \int_V (\operatorname{div}\mathbf{a})\,dV. \tag{1.82}$$
This relation is called the Ostrogradsky-Gauss theorem. The integral on the left-hand
side of the equation is calculated over an arbitrary closed surface 𝑆, and the
integral on the right-hand side over the volume 𝑉 enclosed by this surface.
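Equation (1.82) can be checked numerically for a simple case. The following sketch is an illustration only: the field 𝒂 = (𝑥²𝑦, 𝑦𝑧, 𝑥𝑧²), the unit cube as the closed surface, and the midpoint rule with n subdivisions are all our own assumptions.

```python
def a(x, y, z):
    """Hypothetical vector field a = (x^2 y, y z, x z^2)."""
    return (x * x * y, y * z, x * z * z)


def div_a(x, y, z):
    # Divergence of the field above, from Eq. (1.81).
    return 2 * x * y + z + 2 * x * z


def midpoints(n):
    """Centres of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def surface_flux(n=20):
    """Outward flux of a through the surface of the unit cube (left side of Eq. 1.82)."""
    pts = midpoints(n)
    dS = 1.0 / n**2
    total = 0.0
    for u in pts:
        for w in pts:
            # On each pair of faces the outward normals point in opposite directions.
            total += (a(1.0, u, w)[0] - a(0.0, u, w)[0]) * dS
            total += (a(u, 1.0, w)[1] - a(u, 0.0, w)[1]) * dS
            total += (a(u, w, 1.0)[2] - a(u, w, 0.0)[2]) * dS
    return total


def volume_integral(n=20):
    """Integral of div a over the unit cube (right side of Eq. 1.82)."""
    pts = midpoints(n)
    dV = 1.0 / n**3
    return sum(div_a(x, y, z) for x in pts for y in pts for z in pts) * dV


flux = surface_flux()
sources = volume_integral()
print(flux, sources)
```

Both quantities come out equal to 1.5 within rounding error, because the midpoint rule happens to be exact for the integrands of this particular field.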
Circulation. Let us revert to the flow of an ideal incompressible liquid. Imag-
ine a closed line—the contour 𝛤. Assume that in some way or other we have
instantaneously frozen the liquid in the entire volume except for a very thin closed
channel of constant cross section including the contour 𝛤 (Fig. 1.26). Depending
on the nature of the velocity vector field, the liquid in the channel formed will
either be stationary or move along the contour (circulate) in one of the two possible
directions. Let us take the quantity equal to the product of the velocity of the liquid
in the channel and the length of the contour 𝑙 as a measure of this motion. This
quantity is called the circulation of the vector 𝒗 around the contour 𝛤. Thus,
circulation of 𝒗 around 𝛤 = 𝑣𝑙
(since we assumed that the channel has a constant cross section, the magnitude of
the velocity, 𝑣, is a constant).

Fig. 1.26

At the moment when the walls freeze, the velocity component perpendicular
to a wall will be eliminated in each of the liquid particles, and only the velocity
component tangent to the contour, 𝑣𝑙 , will remain. The momentum d𝒑𝑙 is
associated with this component. The magnitude of the momentum for a liquid
particle contained within a segment of the channel of length d𝑙 is 𝜌𝜎 𝑣𝑙 d𝑙 (𝜌 is the
density of the liquid, and 𝜎 is the cross-sectional area of the channel). Since the
liquid is ideal, the action of the walls can change only the direction of the vector
d𝒑𝑙 , but not its magnitude. The interaction between the liquid particles will cause a
redistribution of the momentum between them that will level out the velocities of
all the particles. The algebraic sum of the tangential components of the momenta
cannot change: the momentum acquired by one of the interacting particles equals
the momentum lost by the second particle. This signifies that
$$\rho\sigma v l = \oint_\Gamma \rho\sigma v_l\,dl$$
where 𝑣 is the circulation velocity, 𝑙 is the length of the contour, and 𝑣𝑙 (with the
subscript 𝑙) is the tangential component of the liquid’s velocity in the volume 𝜎 d𝑙
at the moment of time preceding the freezing of the channel walls. Cancelling 𝜌𝜎 , we get
$$\text{circulation of } \mathbf{v} \text{ around } \Gamma = v l = \oint_\Gamma v_l\,dl.$$
The circulation of any vector 𝒂 around an arbitrary closed contour 𝛤 is determined
in a similar way:
$$\text{circulation of } \mathbf{a} \text{ around } \Gamma = \oint_\Gamma \mathbf{a}\cdot d\mathbf{l} = \oint_\Gamma a_l\,dl. \tag{1.83}$$
It may seem that for the circulation to be other than zero the vector lines must
be closed or at least bent in some way or other in the direction of circumventing the
contour. It is easy to see that this assumption is wrong. Let us consider the laminar
flow of water in a river. The velocity of the water directly at the river bottom is
zero and grows as we approach the surface of the water (Fig. 1.27). The streamlines
(lines of the vector 𝒗) are straight. Notwithstanding this fact, the circulation of the
vector 𝒗 around the contour depicted by the dashed line obviously differs from zero.
On the other hand, in a field with curved lines, the circulation may equal zero.

Fig. 1.27 Fig. 1.28

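Both statements can be illustrated with a short numerical sketch of definition (1.83). Everything specific in it is our own assumption rather than part of the text: the river is modelled by the linear profile 𝑣𝑥 = 𝑘𝑦, the field with curved lines is taken to be a vortex with circular streamlines and speed 1/𝑟, and the line integral is estimated by the midpoint rule.

```python
def circulation(field, vertices, steps=1000):
    """Midpoint-rule estimate of the closed line integral of field . dl around a polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for s in range(steps):
            t = (s + 0.5) / steps
            fx, fy = field(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            total += fx * dx + fy * dy
    return total


def river(x, y, k=0.5):
    """Straight streamlines along x; the speed grows linearly with the height y above the bed."""
    return (k * y, 0.0)


def vortex(x, y):
    """Circular streamlines about the origin with speed 1/r (assumed example)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


# Rectangle 2 wide and 1 high resting on the river bed, traversed counterclockwise.
rect = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
C_river = circulation(river, rect)

# Square that does not enclose the axis of the vortex.
square = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (1.0, 2.0)]
C_vortex = circulation(vortex, square)

print(C_river, C_vortex)
```

With these choices the straight-line river flow gives a circulation of −1 (only the top side of the rectangle contributes), whereas the vortex with curved lines gives a value that vanishes to within the integration error.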
Circulation has the property of additivity. This signifies that the sum of the
circulations around contours 𝛤1 and 𝛤2 enclosing neighboring surfaces 𝑆1 and 𝑆2
(Fig. 1.28) equals the circulation around contour 𝛤 enclosing surface 𝑆, which is the
sum of surfaces 𝑆1 and 𝑆2 . Indeed, the circulation 𝐶1 around the contour bounding
surface 𝑆1 can be represented as the sum of the integrals
$$C_1 = \oint_{\Gamma_1} \mathbf{a}\cdot d\mathbf{l} = \int_{1,\,(I)}^{2} \mathbf{a}\cdot d\mathbf{l} + \int_{2,\,(\text{int.})}^{1} \mathbf{a}\cdot d\mathbf{l}. \tag{1.84}$$
The first integral is taken over section 𝐼 of the outer contour, the second over the
interface between surfaces 𝑆1 and 𝑆2 in direction 2-1.
Similarly, the circulation 𝐶2 around the contour bounding surface 𝑆2 is
$$C_2 = \oint_{\Gamma_2} \mathbf{a}\cdot d\mathbf{l} = \int_{2,\,(II)}^{1} \mathbf{a}\cdot d\mathbf{l} + \int_{1,\,(\text{int.})}^{2} \mathbf{a}\cdot d\mathbf{l}. \tag{1.85}$$
The first integral is taken over section 𝐼𝐼 of the outer contour, the second over the
interface between surfaces 𝑆1 and 𝑆2 in direction 1-2.
The circulation around the contour bounding total surface 𝑆 can be represented
in the form
$$C = \oint_{\Gamma} \mathbf{a}\cdot d\mathbf{l} = \int_{1,\,(I)}^{2} \mathbf{a}\cdot d\mathbf{l} + \int_{2,\,(II)}^{1} \mathbf{a}\cdot d\mathbf{l}. \tag{1.86}$$
The second addends in Eqs. (1.84) and (1.85) differ only in their sign. Therefore, the
sum of these expressions will equal Eq. (1.86). Thus,
𝐶 = 𝐶1 + 𝐶2 . (1.87)
Equation (1.87), which we have proved, does not depend on the shape of the
surfaces and holds for any number of addends. Hence, if we divide an arbitrary
open surface 𝑆 into a great number of elementary surfaces 𝛥𝑆¹⁰ (Fig. 1.29), then
the circulation around the contour bounding 𝑆 can be written as the sum of the
elementary circulations 𝛥𝐶 around the contours bounding the 𝛥𝑆’s:
$$C = \sum_i \Delta C_i. \tag{1.88}$$

¹⁰In the figure, the elementary surfaces are depicted in the form of rectangles. Actually, their shape
may be absolutely arbitrary.

Fig. 1.29

Curl. The additivity of the circulation permits us to introduce the concept
of unit circulation, i.e., consider the ratio of the circulation 𝐶 to the area
of surface 𝑆 around which the circulation “flows”. When surface 𝑆 is finite, the
ratio 𝐶/𝑆 gives the mean value of the unit circulation. This value characterizes the
properties of a field averaged over surface 𝑆. To obtain the characteristic of the
field at point P, we must reduce the dimensions of the surface, making it shrink to
point P. The ratio 𝐶/𝑆 then tends to a limit that characterizes the properties of the field
at point P.
Thus, let us take an imaginary contour 𝛤 in a plane passing through point P,
and consider the expression
$$\lim_{S\to P}\frac{C_a}{S} \tag{1.89}$$
where 𝐶𝑎 is the circulation of the vector 𝒂 around the contour 𝛤 and 𝑆 is the surface
area enclosed by the contour.
Limit (1.89) calculated for an arbitrarily oriented plane cannot be an exhaustive
characteristic of the field at point P because the magnitude of this limit depends
on the orientation of the contour in space in addition to the properties of the field
at point P. This orientation can be given by the direction of a positive normal 𝒏ˆ
to the plane of the contour (a positive normal is one that is associated with the
direction of circumvention of the contour in integration by the right-hand screw
rule). In determining limit (1.89) at the same point P for different directions 𝒏ˆ , we
shall obtain different values. For opposite directions, these values will differ only
in their sign (reversal of the direction 𝒏ˆ is equivalent to reversing the direction of
circumvention of the contour in integration, which only causes a change in the
sign of the circulation). For a certain direction of the normal, the magnitude of
expression (1.89) at the given point will be maximum.
Thus, quantity (1.89) behaves like the projection of a vector onto the direction
of a normal to the plane of the contour around which the circulation is taken. The
maximum value of quantity (1.89) determines the magnitude of this vector, and
the direction of the positive normal 𝒏ˆ at which the maximum is reached gives the
direction of the vector. This vector is called the curl of the vector 𝒂. Its symbol is
curl 𝒂. Using this notation, we can write expression (1.89) in the form
$$(\operatorname{curl}\mathbf{a})_n = \lim_{S\to P}\frac{C_a}{S} = \lim_{S\to P}\frac{1}{S}\oint_\Gamma \mathbf{a}\cdot d\mathbf{l}. \tag{1.90}$$
We can obtain a graphical picture of the curl of the vector 𝒗 by imagining a
small and light fan impeller placed at the given point of a flowing liquid (Fig. 1.30).
At the spots where the curl differs from zero, the impeller will rotate, and the greater
the projection of the curl onto the impeller axis, the faster it will rotate.
Equation (1.90) defines the vector curl 𝒂. This definition is a most general one
that does not depend on the kind of coordinate system used. To find expressions for
the projections of the vector curl 𝒂 onto the axes of a Cartesian coordinate system,
we must determine the values of quantity (1.90) for such orientations of area 𝑆 for
which the normal 𝒏ˆ to the area coincides with one of the axes 𝑥, 𝑦, 𝑧. If, for example,
we direct 𝒏ˆ along the 𝑥-axis, then (1.90) becomes (curl 𝒂)𝑥 . Contour 𝛤 in this case
is arranged in a plane parallel to the coordinate plane 𝑦𝑧. Let us take this contour
in the form of a rectangle with the sides 𝛥𝑦 and 𝛥𝑧 (Fig. 1.31, the 𝑥-axis is directed
toward us in this figure; the direction of circumvention indicated in the figure is
associated with the direction of the 𝑥-axis by the right-hand screw rule). Section 1
of the contour is opposite in direction to the 𝑧-axis. Therefore, 𝑎𝑙 on this section
coincides with −𝑎𝑧 . Similar reasoning shows that 𝑎𝑙 on sections 2, 3, and 4 equals
𝑎 𝑦 , 𝑎𝑧 , and −𝑎 𝑦 , respectively. Hence, the circulation can be written in the form
$$(a_{z,3} - a_{z,1})\,\Delta z - (a_{y,4} - a_{y,2})\,\Delta y \tag{1.91}$$
where 𝑎𝑧,3 and 𝑎𝑧,1 are the average values of 𝑎𝑧 on sections 3 and 1, respectively,
and 𝑎 𝑦,4 and 𝑎 𝑦,2 are the average values of 𝑎 𝑦 on sections 4 and 2.

Fig. 1.30 Fig. 1.31

The difference 𝑎𝑧,3 − 𝑎𝑧,1 is the increment of the average value of 𝑎𝑧 on the section
𝛥𝑧 when this section is displaced in the direction of the 𝑦-axis by 𝛥𝑦. Owing to the
smallness of 𝛥𝑦 and 𝛥𝑧, this increment can be represented in the form (∂𝑎𝑧 /∂𝑦) 𝛥𝑦,
where the value of ∂𝑎𝑧 /∂𝑦 is taken for point P¹¹. Similarly, the difference 𝑎 𝑦,4 − 𝑎 𝑦,2
can be represented in the form (∂𝑎 𝑦 /∂𝑧) 𝛥𝑧. Using these expressions in Eq. (1.91) and
putting the common factor outside the parentheses, we get the following expression
for the circulation:
$$\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right)\Delta y\,\Delta z = \left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right)\Delta S$$
where 𝛥𝑆 is the area enclosed by the contour. Dividing the circulation by 𝛥𝑆, we find the
expression for the projection of curl 𝒂 onto the 𝑥-axis:
$$(\operatorname{curl}\mathbf{a})_x = \frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}. \tag{1.92}$$
We can find by similar reasoning that
$$(\operatorname{curl}\mathbf{a})_y = \frac{\partial a_x}{\partial z} - \frac{\partial a_z}{\partial x}, \tag{1.93}$$
$$(\operatorname{curl}\mathbf{a})_z = \frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y}. \tag{1.94}$$
It is easy to see that any of the equations (1.92)-(1.94) can be obtained from the
preceding one [Eq. (1.94) should be considered as the preceding one for Eq. (1.92)] by
the so-called cyclic transposition of the coordinates, i.e., by replacing the coordinates
according to the scheme
𝑥 → 𝑦 → 𝑧 → 𝑥.
Thus, the curl of the vector 𝒂 is determined in the Cartesian coordinate system
by the following expression:
$$\operatorname{curl}\mathbf{a} = \hat{\mathbf{e}}_x\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right) + \hat{\mathbf{e}}_y\left(\frac{\partial a_x}{\partial z} - \frac{\partial a_z}{\partial x}\right) + \hat{\mathbf{e}}_z\left(\frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y}\right). \tag{1.95}$$
Below we shall indicate a more elegant way of writing this expression.

¹¹The inaccuracy which we tolerate here vanishes when the contour shrinks to point P in the limit
transition.
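The construction leading to Eqs. (1.92)-(1.94) can be repeated numerically: for each axis we take a small square contour in the perpendicular coordinate plane, compute the circulation as in expression (1.91), and divide by the enclosed area. The sketch below is a hedged illustration; the field 𝒂 = (𝑦𝑧, −𝑥𝑧, 𝑥𝑦), the point P and the side h are our own arbitrary choices, and the average of a component over a side is replaced by its value at the midpoint of the side.

```python
def a(x, y, z):
    """Hypothetical vector field a = (y z, -x z, x y)."""
    return (y * z, -x * z, x * y)


def curl_exact(x, y, z):
    # Eq. (1.95) applied to the field above gives (2x, 0, -2z).
    return (2 * x, 0.0, -2 * z)


def shifted(p, axis, d):
    """Copy of point p displaced by d along the given axis (0 = x, 1 = y, 2 = z)."""
    q = list(p)
    q[axis] += d
    return tuple(q)


def curl_component(p, i, h=1e-3):
    """Circulation around a square of side h normal to axis i, divided by its area."""
    j, k = (i + 1) % 3, (i + 2) % 3  # cyclic order x -> y -> z -> x
    half = h / 2
    # Sides parallel to axis k, displaced along axis j (cf. the first term of 1.91).
    circ = (a(*shifted(p, j, +half))[k] - a(*shifted(p, j, -half))[k]) * h
    # Sides parallel to axis j, displaced along axis k (cf. the second term of 1.91).
    circ -= (a(*shifted(p, k, +half))[j] - a(*shifted(p, k, -half))[j]) * h
    return circ / (h * h)


P = (0.3, -0.8, 1.5)  # arbitrary point chosen for the illustration
curl_numeric = tuple(curl_component(P, i) for i in range(3))
curl_analytic = curl_exact(*P)
print(curl_numeric, curl_analytic)
```

The index arithmetic `(i + 1) % 3`, `(i + 2) % 3` is simply the cyclic transposition of the coordinates mentioned above, so a single function yields all three projections.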


Stokes’ Theorem. Knowing the curl of the vector 𝒂 at every point of surface
𝑆 (not necessarily plane), we can calculate the circulation of this vector around
contour 𝛤 enclosing 𝑆 (the contour may also not be plane). For this purpose, we
divide the surface into very small elements 𝛥𝑆. Owing to their smallness, these
elements can be considered as plane. Therefore in accordance with Eq. (1.90), the
circulation of the vector 𝒂 around the contour bounding 𝛥𝑆 can be written in the
form
𝛥𝐶 ≈ (curl 𝒂)𝑛 𝛥𝑆 = curl 𝒂 · 𝛥𝑺 (1.96)
where 𝒏ˆ is a positive normal to surface element 𝛥𝑆.
In accordance with Eq. (1.88), summation of expression (1.96) over all the 𝛥𝑆’s
yields the circulation of the vector 𝒂 around contour 𝛤 bounding 𝑆:
$$C = \sum \Delta C \approx \sum \operatorname{curl}\mathbf{a}\cdot\Delta\mathbf{S}.$$
Performing a limit transition in which all the 𝛥𝑆’s shrink to zero (their number
grows without limit), we arrive at the equation
$$\oint_\Gamma \mathbf{a}\cdot d\mathbf{l} = \int_S (\operatorname{curl}\mathbf{a})\cdot d\mathbf{S}. \tag{1.97}$$
Equation (1.97) is called Stokes’ theorem. Its meaning is that the circulation of the
vector 𝒂 around an arbitrary contour 𝛤 equals the flux of the vector curl 𝒂 through
an arbitrary surface 𝑆 bounded by the given contour.
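A simple numerical check of Stokes’ theorem may be helpful here. The sketch below is an illustration under our own assumptions: the plane field 𝒂 = (−𝑦², 𝑥𝑦, 0), for which (curl 𝒂)𝑧 = 3𝑦, the unit square in the 𝑥𝑦-plane as the surface, and the midpoint rule for both integrals. It covers only the special case of a plane surface.

```python
def a(x, y):
    """Hypothetical plane field a = (-y^2, x y); its z-component is zero."""
    return (-y * y, x * y)


def curl_z(x, y):
    # (curl a)_z = d(a_y)/dx - d(a_x)/dy = y + 2y
    return 3 * y


def circulation_unit_square(n=200):
    """Line integral of a . dl around the unit square, traversed counterclockwise."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += a(t, 0.0)[0] * h   # bottom side, traversed along +x
        total += a(1.0, t)[1] * h   # right side, traversed along +y
        total -= a(t, 1.0)[0] * h   # top side, traversed along -x
        total -= a(0.0, t)[1] * h   # left side, traversed along -y
    return total


def curl_flux_unit_square(n=200):
    """Flux of curl a through the unit square with the normal along +z."""
    h = 1.0 / n
    return sum(curl_z((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h


lhs = circulation_unit_square()
rhs = curl_flux_unit_square()
print(lhs, rhs)
```

The counterclockwise traversal and the normal along +𝑧 are related by the right-hand screw rule; both sides of Eq. (1.97) come out equal to 1.5 for this field.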
The Del Operator. The writing of the formulas of vector analysis is simplified quite
considerably if we introduce a vector differential operator designated by the symbol
∇ (nabla or del) and called the del operator or the Hamiltonian operator. This
operator denotes a vector with the components ∂/∂𝑥, ∂/∂𝑦 and ∂/∂𝑧. Consequently,
$$\nabla = \hat{\mathbf{e}}_x\frac{\partial}{\partial x} + \hat{\mathbf{e}}_y\frac{\partial}{\partial y} + \hat{\mathbf{e}}_z\frac{\partial}{\partial z}. \tag{1.98}$$
This vector has no meaning by itself. It acquires a meaning in combination with
the scalar or vector function by which it is symbolically multiplied. Thus, if we
multiply the vector ∇ by the scalar 𝜑 we obtain the vector
$$\nabla\varphi = \hat{\mathbf{e}}_x\frac{\partial\varphi}{\partial x} + \hat{\mathbf{e}}_y\frac{\partial\varphi}{\partial y} + \hat{\mathbf{e}}_z\frac{\partial\varphi}{\partial z} \tag{1.99}$$
which is the gradient of the function 𝜑 [see Eq. (1.68)].
The scalar product of the vectors ∇ and 𝒂 gives the scalar
∇ · 𝒂 = ∇𝑥 𝑎𝑥 + ∇ 𝑦 𝑎 𝑦 + ∇𝑧 𝑎𝑧 (1.100)
which we can see to be the divergence of the vector 𝒂 [see Eq. (1.81)].
Finally, the vector product of the vectors ∇ and 𝒂 gives a vector with the
components (∇ × 𝒂)𝑥 = ∇ 𝑦 𝑎𝑧 − ∇𝑧 𝑎 𝑦 = ∂𝑎𝑧 /∂𝑦 − ∂𝑎 𝑦 /∂𝑧, etc., that coincide with
the components of curl 𝒂 [see Eqs. (1.92)-(1.94)]. Hence, writing the vector
product in the form of a determinant, we have
$$\operatorname{curl}\mathbf{a} = \nabla\times\mathbf{a} = \begin{vmatrix} \hat{\mathbf{e}}_x & \hat{\mathbf{e}}_y & \hat{\mathbf{e}}_z \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ a_x & a_y & a_z \end{vmatrix}. \tag{1.101}$$
Thus, there are two ways of denoting the gradient, divergence, and curl:
∇𝜑 ≡ grad 𝜑, ∇ · 𝒂 ≡ div 𝒂, ∇ × 𝒂 ≡ curl 𝒂.
The use of the del symbol has a number of advantages. We shall therefore use such
symbols in the following. One must accustom oneself to identify the symbol ∇𝜑
with the words “gradient of phi” (i.e., to say not “del phi”, but “gradient of phi”), the
symbol ∇ · 𝒂 with the words “divergence of a” and, finally, the symbol ∇ × 𝒂 with
the words “curl of a”.
When using the vector ∇, one must remember that it is a differential operator
acting on all the functions to the right of it. Consequently, in transforming expres-
sions including ∇, one must take into consideration both the rules of vector algebra
and those of differential calculus. For example, the derivative of the product of the
functions 𝜑 and 𝜓 is
(𝜑𝜓)′ = 𝜑′𝜓 + 𝜑𝜓′.
Accordingly,
grad (𝜑𝜓) = ∇(𝜑𝜓) = 𝜓∇𝜑 + 𝜑∇𝜓 = 𝜓 grad 𝜑 + 𝜑 grad 𝜓. (1.102)
Similarly,
div (𝜑𝒂) = ∇ · (𝜑𝒂) = 𝒂 · (∇𝜑) + 𝜑(∇ · 𝒂). (1.103)
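Equation (1.103) can be verified directly in components. The short derivation below is a sketch that uses only Eqs. (1.81) and (1.99) and the ordinary rule for differentiating a product:

```latex
\begin{align*}
\nabla\cdot(\varphi\mathbf{a})
 &= \frac{\partial(\varphi a_x)}{\partial x}
  + \frac{\partial(\varphi a_y)}{\partial y}
  + \frac{\partial(\varphi a_z)}{\partial z} \\
 &= \left(a_x\frac{\partial\varphi}{\partial x}
  + a_y\frac{\partial\varphi}{\partial y}
  + a_z\frac{\partial\varphi}{\partial z}\right)
  + \varphi\left(\frac{\partial a_x}{\partial x}
  + \frac{\partial a_y}{\partial y}
  + \frac{\partial a_z}{\partial z}\right) \\
 &= \mathbf{a}\cdot(\nabla\varphi) + \varphi\,(\nabla\cdot\mathbf{a}).
\end{align*}
```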
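The gradient of a function 𝜑 is a vector function. Therefore, the divergence and
curl operations can be performed with it:
$$\operatorname{div}\operatorname{grad}\varphi = \nabla\cdot\nabla\varphi = (\nabla\cdot\nabla)\varphi = \left(\nabla_x^2 + \nabla_y^2 + \nabla_z^2\right)\varphi = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} = \Delta\varphi \tag{1.104}$$
(Δ is the Laplacian operator);
$$\operatorname{curl}\operatorname{grad}\varphi = \nabla\times(\nabla\varphi) = (\nabla\times\nabla)\varphi = 0 \tag{1.105}$$
(we remind our reader that the vector product of a vector and itself is zero).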
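The symbolic argument for the curl of a gradient can also be checked in components. The sketch below assumes that 𝜑 has continuous second derivatives, so that the order of differentiation is immaterial. It writes out the 𝑥-component with the aid of Eqs. (1.92) and (1.99):

```latex
\begin{align*}
(\nabla\times\nabla\varphi)_x
 = \frac{\partial}{\partial y}\left(\frac{\partial\varphi}{\partial z}\right)
 - \frac{\partial}{\partial z}\left(\frac{\partial\varphi}{\partial y}\right)
 = \frac{\partial^2\varphi}{\partial y\,\partial z}
 - \frac{\partial^2\varphi}{\partial z\,\partial y}
 = 0.
\end{align*}
```

The 𝑦- and 𝑧-components vanish in the same way, as follows from a cyclic transposition of the coordinates.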
Let us apply the divergence and curl operations to the function curl 𝒂:
div curl 𝒂 = ∇ · (∇ × 𝒂) = 0 (1.106)
(a scalar triple product equals the volume of a parallelepiped constructed on the
vectors being multiplied (see Vol. I, p. 22); if two of these vectors coincide, the
volume of the parallelepiped equals zero);
curl curl 𝒂 = ∇ × (∇ × 𝒂) = ∇(∇ · 𝒂) − (∇ · ∇)𝒂 = grad div 𝒂 − Δ𝒂 (1.107)
[we have used Eq. (1.35) of Vol. I, namely, 𝒂 × (𝒃 × 𝒄) = 𝒃(𝒂 · 𝒄) − 𝒄(𝒂 · 𝒃)].
Equation (1.106) signifies that the field of a curl has no sources. Hence, the lines
of the vector curl 𝒂 have neither a beginning nor an end. It is exactly for this reason
that the flux of a curl through any surface 𝑆 resting on the given contour 𝛤 is the
same [see Eq. (1.97)].
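We shall note in conclusion that when the del operator is used, Eqs. (1.82) and
(1.97) can be given the form
$$\oint_S \mathbf{a}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{a}\,dV \quad \text{(the Ostrogradsky-Gauss theorem)}, \tag{1.108}$$
$$\oint_\Gamma \mathbf{a}\cdot d\mathbf{l} = \int_S (\nabla\times\mathbf{a})\cdot d\mathbf{S} \quad \text{(Stokes' theorem)}. \tag{1.109}$$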

1.12. Circulation and Curl of an Electrostatic Field

We established in Sec. 1.6 that the forces acting on the charge 𝑞 in an electrostatic
field are conservative. Hence, the work of these forces on any closed path 𝛤 is zero:
$$A = \oint_\Gamma q\mathbf{E}\cdot d\mathbf{l} = 0.$$
Cancelling 𝑞, we get
$$\oint_\Gamma \mathbf{E}\cdot d\mathbf{l} = 0 \tag{1.110}$$
[compare with Eq. (1.46)].
The integral on the left-hand side of Eq. (1.110) is the circulation of the vector 𝑬
around contour 𝛤 [see expression (1.83)]. Thus, an electrostatic field is characterized
by the fact that the circulation of the strength (intensity) vector of this field around any
closed contour equals zero.
Let us take an arbitrary surface 𝑆 resting on contour 𝛤 for which the circulation
is calculated (Fig. 1.32). According to Stokes’ theorem [see Eq. (1.109)], the integral of
curl 𝑬 taken over this surface equals the circulation of the vector 𝑬 around contour 𝛤:
$$\int_S (\nabla\times\mathbf{E})\cdot d\mathbf{S} = \oint_\Gamma \mathbf{E}\cdot d\mathbf{l}. \tag{1.111}$$
Since the circulation equals zero, we arrive at the conclusion that
$$\int_S (\nabla\times\mathbf{E})\cdot d\mathbf{S} = 0.$$
This condition must be observed for any surface 𝑆 resting on an arbitrary contour 𝛤.
This is possible only if the curl of the vector 𝑬 at every point of the field equals zero:
∇ × 𝑬 = 0. (1.112)

Fig. 1.32 Fig. 1.33

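Condition (1.110) lends itself to a numerical check for the field of a single point charge. The sketch below is only an illustration with assumptions of our own: the charge sits at the origin, the factor 𝑞/(4𝜋𝜀0) is set equal to unity, the closed contour is an arbitrary non-planar quadrilateral, and the line integral is estimated by the midpoint rule. For contrast, the same contour is used with the field (−𝑦, 𝑥, 0), which has a non-zero curl and therefore could not be an electrostatic field.

```python
import math


def coulomb_field(p, charge_pos=(0.0, 0.0, 0.0), kq=1.0):
    """Field of a point charge; kq stands for q / (4 pi eps0) in arbitrary units."""
    r = [p[i] - charge_pos[i] for i in range(3)]
    dist3 = math.sqrt(sum(c * c for c in r)) ** 3
    return [kq * c / dist3 for c in r]


def rotational_field(p):
    """Field (-y, x, 0) with curl (0, 0, 2), used only for contrast."""
    return [-p[1], p[0], 0.0]


def circulation(field, vertices, steps=4000):
    """Midpoint-rule estimate of the closed line integral of field . dl around a polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        start, end = vertices[i], vertices[(i + 1) % n]
        dl = [(end[k] - start[k]) / steps for k in range(3)]
        for s in range(steps):
            t = (s + 0.5) / steps
            mid = [start[k] + t * (end[k] - start[k]) for k in range(3)]
            f = field(mid)
            total += sum(f[k] * dl[k] for k in range(3))
    return total


# Arbitrary closed contour that does not pass through the charge.
contour = [(1.0, -1.0, 0.5), (3.0, -1.0, -0.5), (3.0, 2.0, 1.0), (1.0, 2.0, 0.0)]
C_E = circulation(coulomb_field, contour)
C_rot = circulation(rotational_field, contour)
print(C_E, C_rot)
```

The circulation of the Coulomb field vanishes to within the integration error, while the rotational field gives 12 for this contour, i.e., twice the area of the projection of the contour onto the 𝑥𝑦-plane.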
By analogy with the fan impeller shown in Fig. 1.30, let us imagine an electrical
“impeller” in the form of a light hub with spokes whose ends carry identical positive
charges 𝑞 (Fig. 1.33; the entire arrangement must be small in size). At the points of
an electric field where curl 𝑬 differs from zero, such an impeller would rotate with
an acceleration that is the greater, the larger the projection of the curl onto the
impeller axis. For an electrostatic field, such an imaginary arrangement would not
rotate with any orientation of its axis.
Thus, a feature of an electrostatic field is that it is a non-circuital (irrotational) one. We
established in the preceding section that the curl of the gradient of a scalar function
equals zero [see expression (1.105)]. Therefore, the equality to zero of curl 𝑬 at every
point of a field makes it possible to represent 𝑬 in the form of the gradient of a scalar
function 𝜑 called the potential. We have already considered this representation
in Sec. 1.8 [see Eq. (1.41); the minus sign in this equation was taken from physical
considerations].
We can immediately conclude from the need to observe condition (1.110) that
the existence of an electrostatic field of the kind shown in Fig. 1.34 is impossible.
Indeed, for such a field, the circulation around the contour shown by the dashed line
would differ from zero, which contradicts condition (1.110). It is also impossible for
a field differing from zero in a restricted volume to be homogeneous throughout
this volume (Fig. 1.35). In this case, the circulation around the contour shown by the
dashed line would differ from zero.

Fig. 1.34 Fig. 1.35 Fig. 1.36

1.13. Gauss’s Theorem

We established in the preceding section what the curl of an electrostatic field equals.
Now let us find the divergence of a field. For this purpose, we shall consider the
field of a point charge 𝑞 and calculate the flux of the vector 𝑬 through closed surface
𝑆 surrounding the charge (Fig. 1.36). We showed in Sec. 1.5 that the number of
lines of the vector 𝑬 beginning at a point charge +𝑞 or terminating at a charge −𝑞
numerically equals 𝑞/𝜀0 .
By Eq. (1.77), the flux of the vector 𝑬 through any closed surface equals the
number of lines coming out, i.e., beginning on the charge, if it is positive, and the
number of lines entering the surface, i.e., terminating on the charge, if it is negative.
Taking into account that the number of lines beginning or terminating at a point
charge numerically equals 𝑞/𝜀0 (see Sec. 1.5), we can write that
$$\Phi_E = \frac{q}{\varepsilon_0}. \tag{1.113}$$
The sign of the flux coincides with that of the charge 𝑞. The dimensions of both
sides of Eq. (1.113) are identical.
Now let us assume that a closed surface surrounds 𝑁 point charges 𝑞1 , 𝑞2 , . . . , 𝑞𝑁 .
On the basis of the superposition principle, the strength 𝑬 of the field set up by
all the charges equals the sum of the strengths 𝑬 𝑖 set up by each charge separately:
$\mathbf{E} = \sum_i \mathbf{E}_i$. Hence,
$$\Phi_E = \oint_S \mathbf{E}\cdot d\mathbf{S} = \oint_S \left(\sum_i \mathbf{E}_i\right)\cdot d\mathbf{S} = \sum_i \oint_S \mathbf{E}_i\cdot d\mathbf{S}.$$
Each of the integrals inside the sum sign equals 𝑞𝑖 /𝜀0 . Therefore,
$$\Phi_E = \oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\sum_{i=1}^{N} q_i. \tag{1.114}$$
The statement we have proved is called Gauss’s theorem. According to it, the flux
of an electric field strength vector through a closed surface equals the algebraic sum of
the charges enclosed by this surface divided by 𝜀0 .
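Gauss’s theorem can be illustrated numerically by integrating the Coulomb field of a point charge over a closed surface whose shape has nothing to do with the symmetry of the field. In the sketch below the surface is a cube with an edge of 2 m centred on the origin, the charge of 1 nC is placed off-centre, and the surface integral is estimated by the midpoint rule. All these values are our own illustrative assumptions, and the value of 𝜀0 is the CODATA 2018 figure.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m (CODATA 2018)


def e_field(q, src, p):
    """Coulomb field at point p of a point charge q located at src (SI units)."""
    rx, ry, rz = p[0] - src[0], p[1] - src[1], p[2] - src[2]
    r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
    k = q / (4 * math.pi * EPS0)
    return (k * rx / r3, k * ry / r3, k * rz / r3)


def flux_through_cube(q, src, half=1.0, n=100):
    """Outward flux of E through the cube |x|, |y|, |z| <= half (midpoint rule)."""
    h = 2 * half / n
    total = 0.0
    for i in range(n):
        u = -half + (i + 0.5) * h
        for j in range(n):
            w = -half + (j + 0.5) * h
            for axis in range(3):
                for sign in (+1.0, -1.0):
                    p = [u, w]
                    p.insert(axis, sign * half)  # point on the face normal to this axis
                    # The outward normal component is +E[axis] or -E[axis].
                    total += sign * e_field(q, src, p)[axis] * h * h
    return total


q = 1e-9  # C, illustrative value
expected = q / EPS0
flux_inside = flux_through_cube(q, (0.2, -0.3, 0.1))   # charge inside the cube
flux_outside = flux_through_cube(q, (3.0, 0.0, 0.0))   # charge outside the cube
print(flux_inside, expected, flux_outside)
```

With the charge inside the cube the flux agrees with 𝑞/𝜀0 to within the integration error, and with the charge outside it is close to zero, because every field line that enters the cube also leaves it.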
When considering fields set up by macroscopic charges (i.e., charges formed by
an enormous number of elementary charges), the discrete structure of these charges
is disregarded, and they are considered to be distributed in space continuously with
a finite density everywhere. The volume density of a charge 𝜌 is determined by
analogy with the density of a mass as the ratio of the charge d𝑞 to the infinitely
small (physically) volume d𝑉 containing this charge:
$$\rho = \frac{dq}{dV}. \tag{1.115}$$
In the given case, by an infinitely small (physically) volume we must understand a
volume which on the one hand is sufficiently small for the density within its limits
to be considered uniform, and on the other is sufficiently large for the discreteness
of the charge not to manifest itself.
Knowing the charge density at every point of space, we can find the total charge
surrounded by closed surface 𝑆. For this purpose, we must calculate the integral of
𝜌 with respect to the volume enclosed by the surface:
$$\sum_i q_i = \int_V \rho\,dV.$$
Thus, Eq. (1.114) can be written in the form
$$\oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_V \rho\,dV. \tag{1.116}$$
Replacing the surface integral with a volume one in accordance with Eq. (1.108),
we have
$$\int_V \nabla\cdot\mathbf{E}\,dV = \frac{1}{\varepsilon_0}\int_V \rho\,dV.$$
The relation which we have arrived at must be observed for any arbitrarily chosen
volume 𝑉 . This is possible only if the values of the integrands for every point of
space are the same. Hence, the divergence of the vector 𝑬 is associated with the
density of the charge at the same point by the equation
$$\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\,\rho. \tag{1.117}$$
This equation expresses Gauss’s theorem in the differential form.
For a flowing liquid, ∇ · 𝒗 gives the unit power of the sources of the liquid at a
given point. By analogy, charges are said to be sources of an electric field.

1.14. Calculating Fields with the Aid of Gauss’s Theorem

Gauss’s theorem permits us in a number of cases to find the strength of a field in a
much simpler way than by using Eq. (1.15) for the field strength of a point charge and
the field superposition principle. We shall demonstrate the possibilities of Gauss’s
theorem by employing a few examples that will be useful for our further exposition.
Before starting on our way, we shall introduce the concepts of surface and linear
charge densities.
If a charge is concentrated in a thin surface layer of the body carrying the charge,
the distribution of the charge in space can be characterized by the surface density
𝜎 , which is determined by the expression
$$\sigma = \frac{dq}{dS}. \tag{1.118}$$
Here d𝑞 is the charge contained in the layer of area d𝑆. By d𝑆 is meant an infinitely
small (physically) section of the surface.
If a charge is distributed over the volume or surface of a cylindrical body
(uniformly in each section), the linear charge density is used, i.e.,
$$\lambda = \frac{dq}{dl} \tag{1.119}$$
where d𝑙 is the length of an infinitely small (physically) segment of the cylinder, and
d𝑞 is the charge concentrated on this segment.
Field of an Infinite Homogeneously Charged Plane. Assume that the sur-
face charge density at all points of a plane is identical and equal to 𝜎 ; for definiteness
we shall consider the charge to be positive. It follows from considerations of sym-
metry that the field strength at any point is directed at right angles to the plane.
Indeed, since the plane is infinite and charged homogeneously, there is no reason
why the vector 𝑬 should deflect to a side from a normal to the plane. It is further
evident that at points symmetrical relative to the plane, the field strength is identical
in magnitude and opposite in direction.
Let us imagine a cylindrical surface with generatrices perpendicular
to the plane and bases of area 𝛥𝑆 arranged symmetrically relative to the plane
(Fig. 1.37). Owing to symmetry, we have 𝐸′ = 𝐸″ = 𝐸. We shall apply Gauss’s
theorem to the surface. The flux through the side part of the surface will be absent
because 𝐸 𝑛 at each point of it is zero. For the bases, 𝐸 𝑛 coincides with 𝐸. Hence,
the total flux through the surface is 2𝐸 𝛥𝑆. The surface encloses the charge 𝜎 𝛥𝑆.
According to Gauss’s theorem, the condition must be observed that
$$2E\,\Delta S = \frac{\sigma\,\Delta S}{\varepsilon_0}$$

Fig. 1.37 Fig. 1.38

whence
$$E = \frac{\sigma}{2\varepsilon_0}. \tag{1.120}$$
The result we have obtained does not depend on the length of the cylinder. This
signifies that at any distances from the plane, the field strength is identical in
magnitude. The field lines are shown in Fig. 1.38. For a negatively charged plane,
the result will be the same except for the reversal of the direction of the vector 𝑬
and the field lines.
If we take a plane of finite dimensions, for instance a charged thin plate¹², then
the result obtained above will hold only for points whose distance from the edge
of the plate considerably exceeds their distance from the plate itself (in
Fig. 1.39 the region containing such points is outlined by a dashed line). At points
at an increasing distance from the plane or approaching its edges, the field will
differ more and more from that of an infinite charged plane. It is easy to imagine
the nature of the field at great distances if we take into account that at distances
considerably exceeding the dimensions of the plate, the field it sets up can be treated
as that of a point charge.
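The two limiting cases just described can be illustrated with a uniformly charged disk. The sketch below assumes the standard on-axis result for a disk of radius 𝑎, 𝐸 = (𝜎/2𝜀0)[1 − 𝑧/√(𝑧² + 𝑎²)], which is not derived in this section; the function names and numerical values are hypothetical. Close to the disk the field should approach Eq. (1.120), and far from it the field of a point charge 𝑞 = 𝜎𝜋𝑎².

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def plane_field(sigma):
    """Field of an infinite uniformly charged plane, Eq. (1.120)."""
    return sigma / (2.0 * EPS0)


def disk_axis_field(sigma, a, z):
    """On-axis field of a uniformly charged disk of radius a (assumed formula)."""
    return sigma / (2.0 * EPS0) * (1.0 - z / math.sqrt(z * z + a * a))


def point_charge_field(q, r):
    """Field of a point charge q at the distance r."""
    return q / (4.0 * math.pi * EPS0 * r * r)


sigma = 1e-6   # surface charge density, C/m^2 (illustrative)
a = 1.0        # disk radius, m (illustrative)
q_disk = sigma * math.pi * a * a

near_ratio = disk_axis_field(sigma, a, 0.001) / plane_field(sigma)
far_ratio = disk_axis_field(sigma, a, 100.0) / point_charge_field(q_disk, 100.0)
print(near_ratio, far_ratio)  # both ratios are close to 1
```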
Field of Two Uniformly Charged Planes. The field of two parallel infinite
planes carrying opposite charges with a constant surface density 𝜎 identical in
magnitude can be found by superposition of the fields produced by each plane
separately (Fig. 1.40). In the region between the planes, the fields being added have
the same direction, so that the resultant field strength is
$$E = \frac{\sigma}{\varepsilon_0}. \tag{1.121}$$
¹²For a plate, 𝜎 in Eq. (1.120) should be understood as the charge concentrated on 1 m² of the plate
over its entire thickness. In metal bodies, the charge is distributed over the external surface. Therefore,
by 𝜎 we should understand twice the charge density on either surface of the
metal plate.

Fig. 1.39 Fig. 1.40

Outside the volume bounded by the planes, the fields being added have opposite
directions so that the resultant field strength equals zero.
Thus, the field is concentrated between the planes. The field strength at all
points of this region is identical in value and in direction; consequently, the field is
homogeneous. The field lines are a collection of parallel equispaced straight lines.
The result we have obtained also holds approximately for planes of finite dimen-
sions if the distance between them is much smaller than their linear dimensions
(a parallel-plate capacitor). In this case, appreciable deviations of the field from
homogeneity are observed only near the edges of the plates (Fig. 1.41).
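The superposition used above can be written out explicitly. In the following sketch (a one-dimensional illustration with names and values of our own choosing), the plane carrying +𝜎 is placed at 𝑥 = 0 and the plane carrying −𝜎 at 𝑥 = 𝑑; the field of each plane is taken from Eq. (1.120) with the sign depending on the side of the plane.

```python
EPS0 = 8.8541878128e-12  # electric constant, F/m


def plane_field_x(sigma, x_plane, x):
    """x-component of the field of an infinite plane located at x_plane."""
    magnitude = sigma / (2.0 * EPS0)
    # The field points away from a positively charged plane on both sides.
    return magnitude if x > x_plane else -magnitude


def two_planes_field_x(sigma, d, x):
    """Superposition of the fields of the planes +sigma (x = 0) and -sigma (x = d)."""
    return plane_field_x(sigma, 0.0, x) + plane_field_x(-sigma, d, x)


sigma = 1e-6  # C/m^2 (illustrative)
d = 0.01      # spacing of the planes, m (illustrative)

inside = two_planes_field_x(sigma, d, d / 2)
left = two_planes_field_x(sigma, d, -1.0)
right = two_planes_field_x(sigma, d, d + 1.0)
print(inside, left, right)  # sigma/EPS0 between the planes, zero outside
```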
Field of an Infinite Charged Cylinder. Assume that the field is produced
by an infinite cylindrical surface of radius 𝑅 whose charge has a constant surface
density 𝜎 . Considerations of symmetry show that the field strength at any point
must be directed along a radial line perpendicular to the cylinder axis, and that the
magnitude of the strength can depend only on the distance 𝑟 from the cylinder axis.
Let us imagine a closed cylindrical surface of radius 𝑟 and height
ℎ coaxial with the charged surface (Fig. 1.42). For the bases of the cylinder, we have 𝐸 𝑛 = 0,
for the side surface 𝐸 𝑛 = 𝐸(𝑟) (the charge is assumed to be positive). Hence, the
flux of the vector 𝑬 through the surface being considered is 𝐸(𝑟) × 2𝜋𝑟ℎ. If 𝑟 > 𝑅,
the charge 𝑞 = 𝜆ℎ (where 𝜆 is the linear charge density) will be enclosed by the surface.
Applying Gauss’s theorem, we find that
$$E(r) \cdot 2\pi r h = \frac{\lambda h}{\varepsilon_0}.$$
Hence,
$$E(r) = \frac{1}{2\pi\varepsilon_0} \frac{\lambda}{r} \quad (r > R). \tag{1.122}$$
If 𝑟 < 𝑅, the closed surface being considered contains no charges inside, owing to
which 𝐸(𝑟) = 0.

Fig. 1.41 Fig. 1.42

Thus, there is no field inside a uniformly charged cylindrical surface of infinite
length. The field strength outside the surface is determined by the linear charge
density 𝜆 and the distance 𝑟 from the cylinder axis.
The field of a negatively charged cylinder differs from that of a positively charged
one only in the direction of the vector 𝑬. A glance at Eq. (1.122) shows that by reducing
the cylinder radius 𝑅 (with a constant linear charge density 𝜆), we can obtain a field
with a very great strength near the surface of the cylinder.
Introducing 𝜆 = 2𝜋 𝑅𝜎 into Eq. (1.122) and assuming that 𝑟 = 𝑅, we get the
following value for the field strength in direct proximity to the surface of a cylinder:
$$E(R) = \frac{\sigma}{\varepsilon_0}. \tag{1.123}$$
The superposition principle makes it simple to find the field of two coaxial
cylindrical surfaces carrying a linear charge density 𝜆 of the same magnitude, but of
opposite signs (Fig. 1.43). There is no field inside the smaller and outside the larger
cylinders. The field strength in the gap between the cylinders is determined by
Eq. (1.122). This also holds for cylindrical surfaces of a finite length if the gap between
the surfaces is much smaller than their length (a cylindrical capacitor). Appreciable
deviations from the field of surfaces of an infinite length will be observed only near
the edges of the cylinders.
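The results for the cylinder can be collected in a short numerical sketch. The function names and the numbers below are our own assumptions; the code evaluates Eq. (1.122), checks that it reduces to Eq. (1.123) at 𝑟 = 𝑅 when 𝜆 = 2𝜋𝑅𝜎, and superposes two coaxial cylinders carrying +𝜆 and −𝜆.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def cylinder_field(lam, big_r, r):
    """Radial field of an infinite charged cylindrical surface, Eq. (1.122).

    Points with r >= big_r are treated as lying outside the surface.
    """
    if r < big_r:
        return 0.0
    return lam / (2.0 * math.pi * EPS0 * r)


def coaxial_field(lam, r_inner, r_outer, r):
    """Superposition: +lam on the inner cylinder, -lam on the outer one."""
    return cylinder_field(lam, r_inner, r) + cylinder_field(-lam, r_outer, r)


sigma = 1e-6                          # C/m^2 (illustrative)
big_r = 0.01                          # cylinder radius, m (illustrative)
lam = 2.0 * math.pi * big_r * sigma   # linear density equivalent to sigma

surface_field = cylinder_field(lam, big_r, big_r)
gap_field = coaxial_field(lam, 0.01, 0.02, 0.015)
outer_field = coaxial_field(lam, 0.01, 0.02, 0.05)
print(surface_field, sigma / EPS0, gap_field, outer_field)
```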
Fig. 1.43

Field of a Charged Spherical Surface. The field produced by a spherical
surface of radius 𝑅 whose charge has a constant surface density 𝜎 will obviously
be a centrally symmetrical one. This signifies that the direction of the vector 𝑬 at
any point passes through the centre of the sphere, while the magnitude of the field
strength is a function of the distance 𝑟 from the centre of the sphere. Let us imagine
a surface of radius 𝑟 that is concentric with the charged sphere. For all points of
this surface, 𝐸 𝑛 = 𝐸(𝑟). If 𝑟 > 𝑅, the entire charge 𝑞 distributed over the sphere
will be inside the surface. Hence,
$$E(r) \cdot 4\pi r^2 = \frac{q}{\varepsilon_0}$$
whence
$$E(r) = \frac{1}{4\pi\varepsilon_0} \frac{q}{r^2} \quad (r > R). \tag{1.124}$$
A spherical surface of radius 𝑟 less than 𝑅 will contain no charges, owing to
which for 𝑟 < 𝑅 we get 𝐸(𝑟) = 0.
Thus, there is no field inside a spherical surface whose charge has a constant
surface density 𝜎 . Outside this surface, the field is identical with that of a point
charge of the same magnitude at the centre of the sphere.
Using the superposition principle, it is easy to show that the field of two con-
centric spherical surfaces (a spherical capacitor) carrying charges +𝑞 and −𝑞 that
are identical in magnitude and opposite in sign is concentrated in the gap between
the surfaces, the magnitude of the field strength in the gap being determined by
Eq. (1.124).
Field of a Volume-Charged Sphere. Assume that a sphere of radius 𝑅 has a
charge with a constant volume density 𝜌. The field in this case has central symmetry.
It is easy to see that the same result is obtained for the field outside the sphere [see
Eq. (1.124)] as for a sphere with a surface charge. The result will be different for
points inside the sphere, however. A spherical surface of radius 𝑟 (𝑟 < 𝑅) contains a
charge equal to 𝜌 × 4𝜋𝑟³/3. Therefore, Gauss’s theorem for such a surface will be
written as follows:
$$E(r) \cdot 4\pi r^2 = \frac{1}{\varepsilon_0} \rho \, \frac{4}{3} \pi r^3.$$
Hence, substituting 𝑞/(4𝜋𝑅³/3) for 𝜌, we get
$$E(r) = \frac{1}{4\pi\varepsilon_0} \frac{q}{R^3} r \quad (r \leqslant R). \tag{1.125}$$
Thus, the field strength inside a sphere grows linearly with the distance 𝑟 from
the centre of the sphere. Outside the sphere, the field strength diminishes according
to the same law as for the field of a point charge.
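The two spherical cases can be compared in a brief sketch. The function names and numbers are illustrative assumptions of ours; the code implements Eqs. (1.124) and (1.125) and shows that the field of the volume-charged sphere is continuous at 𝑟 = 𝑅, whereas inside the charged spherical surface there is no field at all.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def shell_field(q, big_r, r):
    """Field of a uniformly charged spherical surface of radius big_r."""
    if r < big_r:
        return 0.0
    return q / (4.0 * math.pi * EPS0 * r * r)            # Eq. (1.124)


def ball_field(q, big_r, r):
    """Field of a sphere of radius big_r charged uniformly over its volume."""
    if r <= big_r:
        return q * r / (4.0 * math.pi * EPS0 * big_r ** 3)  # Eq. (1.125)
    return q / (4.0 * math.pi * EPS0 * r * r)              # Eq. (1.124)


q = 1e-9      # total charge, C (illustrative)
big_r = 0.1   # radius, m (illustrative)

at_surface = ball_field(q, big_r, big_r)
just_outside = ball_field(q, big_r, big_r * (1.0 + 1e-9))
half_way = ball_field(q, big_r, big_r / 2.0)
print(at_surface, just_outside, half_way, shell_field(q, big_r, big_r / 2.0))
```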

Chapter 2
ELECTRIC FIELD IN DIELECTRICS

2.1. Polar and Non-Polar Molecules

Dielectrics (or insulators) are defined as substances not capable of conducting an
electric current. Ideal insulators do not exist in nature. All substances, even if to
a negligible extent, conduct an electric current. But substances called conductors
conduct a current from 10¹⁵ to 10²⁰ times better than substances called dielectrics.
If a dielectric is introduced into an electric field, then the field and the dielectric
itself undergo appreciable changes. To understand why this happens, we must
take into account that atoms and molecules contain positively charged nuclei and
negatively charged electrons.
A molecule is a system with a total charge of zero. The linear dimensions of
this system are very small, of the order of a few angstroms (the angstrom—Å—is
a unit of length equal to 10⁻¹⁰ m that is very convenient in atomic physics). We
established in Sec. 1.10 that the field set up by such a system is determined by the
magnitude and orientation of the dipole electric moment
$$\mathbf{p} = \sum_i q_i \mathbf{r}_i \tag{2.1}$$
(summation is performed both over the electrons and over the nuclei). True, the
electrons in a molecule are in motion, so that this moment constantly changes. The
velocities of the electrons are so high, however, that the mean value of the moment
(2.1) is detected in practice. For this reason in the following by the dipole moment
of a molecule, we shall mean the quantity
$$\mathbf{p} = \sum_i q_i \langle \mathbf{r}_i \rangle \tag{2.2}$$
(for nuclei, 𝒓𝑖 is simply taken as ⟨𝒓𝑖⟩ in this sum). In other words, we shall consider
that the electrons are at rest relative to the nuclei at certain points obtained by
averaging the positions of the electrons in time.
The behaviour of a molecule in an external electric field is also determined
by its dipole moment. We can verify this by calculating the potential energy of a
molecule in an external electric field. Selecting the origin of coordinates inside the
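molecule and taking advantage of the smallness of ⟨𝒓𝑖⟩, let us write the potential at
the point where the 𝑖-th charge is in the form
$$\varphi_i = \varphi + \nabla\varphi \cdot \langle \mathbf{r}_i \rangle$$
where 𝜑 is the potential at the origin of coordinates [see Eq. (1.69)]. Hence,
$$W_{\mathrm{p}} = \sum_i q_i \varphi_i = \sum_i q_i (\varphi + \nabla\varphi \cdot \langle \mathbf{r}_i \rangle) = \varphi \sum_i q_i + \nabla\varphi \cdot \sum_i q_i \langle \mathbf{r}_i \rangle.$$
Taking into account that ∑𝑖 𝑞𝑖 = 0 and substituting −𝑬 for ∇𝜑, we get
$$W_{\mathrm{p}} = -\mathbf{E} \cdot \sum_i q_i \langle \mathbf{r}_i \rangle = -\mathbf{p} \cdot \mathbf{E} = -pE \cos\alpha.$$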
Differentiating this expression with respect to 𝛼, we get Eq. (1.57) for the rotational
moment; differentiating with respect to 𝑥, we arrive at the force (1.62).
Thus, a molecule is equivalent to a dipole both with respect to the field it sets
up and with respect to the forces it experiences in an external field. The positive
charge of this dipole equals the total charge of the nuclei and is at the “centre of
gravity” of the positive charges; the negative charge equals the total charge of the
electrons and is at the “centre of gravity” of the negative charges.
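The equivalence of a neutral system of charges to a dipole in an external field can be checked directly. The sketch below is an illustration under our own assumptions: two hypothetical charges ±𝑒 at arbitrary averaged positions are placed in a homogeneous field, for which the potential is exactly linear, 𝜑(𝒓) = 𝜑 − 𝑬 · 𝒓. The energy summed over the charges is compared with −𝒑 · 𝑬.

```python
def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical neutral system: (charge in C, averaged position in m).
charges = [
    (+1.6e-19, (0.5e-10, 0.0, 0.0)),
    (-1.6e-19, (-0.5e-10, 0.2e-10, 0.0)),
]

e_field = (1.0e5, 2.0e5, 0.0)  # homogeneous external field, V/m (illustrative)
phi_origin = 3.0               # potential at the origin, V (drops out of the result)

# Energy as the sum of q_i * phi_i with phi_i = phi - E . r_i
w_direct = sum(q * (phi_origin - dot(e_field, r)) for q, r in charges)

# Dipole moment p = sum of q_i * r_i, Eq. (2.2), and the energy -p . E
p = tuple(sum(q * r[k] for q, r in charges) for k in range(3))
w_dipole = -dot(p, e_field)

print(w_direct, w_dipole)  # the two values agree to rounding error
```

The value chosen for the potential at the origin has no effect on the result because the total charge of the system is zero.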
In symmetrical molecules (such as H₂, O₂, N₂), the centres of gravity of the
positive and negative charges coincide in the absence of an external electric field.
Such molecules have no intrinsic dipole moment and are called non-polar. In
asymmetrical molecules (such as CO, NH, HCl), the centres of gravity of the charges
of opposite signs are displaced relative to each other. In this case, the molecules
have an intrinsic dipole moment and are called polar.
Under the action of an external electric field, the charges in a non-polar molecule
become displaced relative to one another, the positive ones in the direction of the
field, the negative ones against the field. As a result, the molecule acquires a dipole
moment whose magnitude, as shown by experiments, is proportional to the field
strength (intensity). In the rationalized system, the constant of proportionality is
written in the form 𝜀0 𝛽, where 𝜀0 is the electric constant, and 𝛽 is a quantity called
the polarizability of a molecule. Since the directions of 𝒑 and 𝑬 coincide, we can
write that
𝒑 = 𝛽𝜀0 𝑬. (2.3)
The dipole moment has a dimension of [𝑞]L. By Eq. (1.15), the dimension of 𝜀0 𝑬 is
[𝑞]L⁻². Hence, the polarizability of a molecule 𝛽 has the dimension L³.
The process of polarization of a non-polar molecule proceeds as if the positive
and negative charges of the molecule were bound to one another by elastic forces.
A non-polar molecule is therefore said to behave in an external field like an elastic
dipole.
The action of an external field on a polar molecule consists mainly in tending
to rotate the molecule so that its dipole moment is arranged in the direction of the
field. An external field has virtually no effect on the magnitude of a dipole moment.
Consequently, a polar molecule behaves in an external field like a rigid dipole.

2.2. Polarization of Dielectrics

In the absence of an external electric field, the dipole moments of the molecules
of a dielectric usually either equal zero (non-polar molecules) or are oriented
chaotically in space (polar molecules). In both cases, the total dipole
moment of a dielectric equals zero¹.
A dielectric becomes polarized under the action of an external field. This
signifies that the resultant dipole moment of the dielectric becomes other than
zero. It is quite natural to take the dipole moment of a unit volume as the quantity
characterizing the degree of polarization. If the field or the dielectric (or both) are
not homogeneous, the degrees of polarization at different points of the dielectric
will differ. To characterize the polarization at a given point, we must separate an
infinitely small (physically) volume 𝛥𝑉 containing this point, find the sum ∑ 𝒑
of the moments of the molecules confined in this volume, and take the ratio
$$\mathbf{P} = \frac{1}{\Delta V} \sum_{\Delta V} \mathbf{p}. \tag{2.4}$$
The vector quantity 𝑷 defined by Eq. (2.4) is called the polarization of a dielectric.
The dipole moment 𝒑 has the dimension [𝑞]L. Consequently, the dimension of
𝑷 is [𝑞]L⁻², i.e., it coincides with the dimension of 𝜀0 𝑬 [see Eq. (1.15)].
The polarization of isotropic dielectrics of any kind is associated with the field
strength at the same point by the simple relation
𝑷 = 𝜒𝜀0 𝑬 (2.5)
where 𝜒 is a quantity independent of 𝑬 called the electric susceptibility of a
dielectric². It was indicated above that the dimensions of 𝑷 and 𝜀0 𝑬 are identical.
Hence, 𝜒 is a dimensionless quantity.

¹In Sec. 2.9, we shall acquaint ourselves with substances that can have a dipole moment in the
absence of an external field.
²In anisotropic dielectrics, the directions of 𝑷 and 𝑬, generally speaking, do not coincide. In this
case, the relation between 𝑷 and 𝑬 is described by the equations
$$P_x = \varepsilon_0 (\chi_{xx} E_x + \chi_{xy} E_y + \chi_{xz} E_z),$$
$$P_y = \varepsilon_0 (\chi_{yx} E_x + \chi_{yy} E_y + \chi_{yz} E_z),$$
$$P_z = \varepsilon_0 (\chi_{zx} E_x + \chi_{zy} E_y + \chi_{zz} E_z).$$
The combination of the nine quantities 𝜒𝑖𝑗 forms a symmetrical tensor of rank two called the tensor
of the dielectric susceptibility [compare with Eqs. (5.30) of Vol. I]. This tensor characterizes the
electrical properties of an anisotropic dielectric.

In the Gaussian system of units, Eq. (2.5) has the form
$$\mathbf{P} = \chi \mathbf{E}. \tag{2.6}$$
For dielectrics built of non-polar molecules, Eq. (2.5) follows from the following
simple considerations. The volume 𝛥𝑉 contains a number of molecules equal to
𝑛𝛥𝑉, where 𝑛 is the number of molecules per unit volume. Each of the moments 𝒑
is determined in this case by Eq. (2.3). Hence,
$$\sum_{\Delta V} \mathbf{p} = n \Delta V \beta \varepsilon_0 \mathbf{E}.$$
Dividing this expression by 𝛥𝑉, we get the polarization 𝑷 = 𝑛𝛽𝜀0 𝑬. Finally, introducing
the symbol 𝜒 = 𝑛𝛽, we arrive at Eq. (2.5).
For dielectrics built of polar molecules, the orienting action of the external
field is counteracted by the thermal motion of the molecules tending to scatter
their dipole moments in all directions. As a result, a certain preferred orientation
of the dipole moments of the molecules sets in along the direction of the field. The
relevant statistical calculations, which agree with experimental data, show that the
polarization is proportional to the field strength, i.e., they lead to Eq. (2.5). The electric
susceptibility of such dielectrics varies inversely with the absolute temperature.
In ionic crystals, the separate molecules lose their individuality. An entire
crystal is, as it were, a single giant molecule. The lattice of an ionic crystal can
be considered as two lattices inserted into each other, one of which is formed by
the positive, and the other by the negative ions. When an external field acts on
the crystal ions, both lattices are displaced relative to each other, which leads to
polarization of the dielectric. The polarization in this case too is associated with the
field strength by Eq. (2.5). We must note that the linear relation between 𝑬 and 𝑷
described by Eq. (2.5) may be applied only to not too strong fields [a similar remark
relates to Eq. (2.3)].

2.3. The Field Inside a Dielectric

The charges in the molecules of a dielectric are called bound. The action of a
field can only cause bound charges to be displaced slightly from their equilibrium
positions; they cannot leave the molecule containing them.
Following the example of L. Landau and E. Lifshitz³, we shall call charges that,
although they are within the boundaries of a dielectric, are not inside its molecules,
and also charges outside a dielectric, extraneous ones⁴.
The field in a dielectric is the superposition of the field 𝑬 extr produced by the
extraneous charges, and the field 𝑬 bound of the bound charges. The resultant field is
called microscopic (or true):
$$\mathbf{E}_{\text{micro}} = \mathbf{E}_{\text{extr}} + \mathbf{E}_{\text{bound}}. \tag{2.7}$$
The microscopic field changes greatly within the limits of the intermolecular
distances. Owing to the motion of the bound charges, the field 𝑬 micro also changes
with time. These changes are not detected in a macroscopic consideration. Therefore,
a field is characterized by the quantity (2.7) averaged over an infinitely small
(physically) volume, i.e.,
$$\mathbf{E} = \langle \mathbf{E}_{\text{micro}} \rangle = \langle \mathbf{E}_{\text{extr}} \rangle + \langle \mathbf{E}_{\text{bound}} \rangle.$$
In the following, we shall designate the averaged field of the extraneous charges
by 𝑬₀, and the averaged field of the bound charges by 𝑬′. Accordingly, we shall
define a macroscopic field as the quantity
$$\mathbf{E} = \mathbf{E}_0 + \mathbf{E}'. \tag{2.8}$$
The polarization 𝑷 is a macroscopic quantity. Therefore, 𝑬 in Eq. (2.5) should
be understood as the strength determined by Eq. (2.8).
In the absence of dielectrics (i.e., in a “vacuum”), the macroscopic field is
$$\mathbf{E} = \mathbf{E}_0 = \langle \mathbf{E}_{\text{extr}} \rangle.$$
It is exactly this quantity that is understood to be 𝑬 in Eq. (1.117).
If the extraneous charges are stationary, the field determined by Eq. (2.8) has
the same properties as an electrostatic field in a vacuum. In particular, it can be
characterized with the aid of the potential 𝜑 related to the field strength (2.8) by Eqs.
(1.41) and (1.45).

2.4. Space and Surface Bound Charges

When a dielectric is not polarized, the volume density 𝜌′ and the surface density
𝜎′ of the bound charges equal zero. Polarization causes the surface density, and in
some cases also the volume density of the bound charges to become different from
zero.
³See L. D. Landau and E. M. Lifshitz. Elektrodinamika sploshnykh sred (Electrodynamics of
Continuous Media). Moscow, Gostekhizdat (1957), p. 57.
⁴It is customary practice to call such charges free. This name is extremely unsuccessful, however,
because in a number of cases extraneous charges are not at all free.

Fig. 2.1 Fig. 2.2

Figure 2.1 shows schematically a polarized dielectric with nonpolar (a) and polar
(b) molecules. Inspection of the figure shows that the polarization is attended by
the appearance of a surplus of bound charges of one sign in the thin surface layer of
the dielectric. If the normal component of the field strength 𝑬 for the given section
of the surface is other than zero, then under the action of the field, charges of one
sign will move away inward, and of the other sign will emerge.
There is a simple relation between the polarization 𝑷 and the surface density
of the bound charges 𝜎′. To find it, let us consider an infinite plane-parallel plate
of a homogeneous dielectric placed in a homogeneous electric field (Fig. 2.2). Let
us mentally separate an elementary volume in the plate in the form of a very thin
cylinder with generatrices parallel to 𝑬 in the dielectric, and with bases of area 𝛥𝑆
coinciding with the surfaces of the plate. The magnitude of this volume is
𝛥𝑉 = 𝑙 𝛥𝑆 cos 𝛼
where 𝑙 is the distance between the bases of the cylinder and 𝛼 the angle between the
vector 𝑬 and an outward normal to the positively charged surface of the dielectric.
The volume 𝛥𝑉 has a dipole electric moment of the magnitude
𝑃 𝛥𝑉 = 𝑃𝑙 𝛥𝑆 cos 𝛼
(𝑃 is the magnitude of the polarization).
From the macroscopic viewpoint, the volume being considered is equivalent to
a dipole formed by the charges +𝜎′𝛥𝑆 and −𝜎′𝛥𝑆 with a spacing of 𝑙. Therefore, its electric
moment can be written in the form 𝜎′𝛥𝑆𝑙. Equating the two expressions for the
electric moment, we get
$$P l\,\Delta S \cos\alpha = \sigma' \Delta S\, l.$$
Hence, we get the required relation between 𝜎′ and 𝑷:
$$\sigma' = P \cos\alpha = P_n \tag{2.9}$$
where 𝑃n is the projection of the polarization onto an outward normal to the
relevant surface. For the right-hand surface in Fig. 2.2, we have 𝑃n > 0, accordingly,
𝜎′ for it is positive; for the left-hand surface 𝑃n < 0, accordingly, 𝜎′ for it is negative.
Expressing 𝑷 through 𝜒 and 𝑬 by means of Eq. (2.5), we arrive at the formula
$$\sigma' = \chi \varepsilon_0 E_n \tag{2.10}$$
where 𝐸n is the normal component of the field strength inside the dielectric. Ac-
cording to Eq. (2.10), at the places where the field lines emerge from the dielectric
(𝐸n > 0), positive bound charges come up to the surface, while where the field lines
enter the dielectric (𝐸n < 0), negative surface charges appear.
Equations (2.9) and (2.10) also hold in the most general case when an inhomogeneous
dielectric of an arbitrary shape is in an inhomogeneous electric field. By
𝑃n and 𝐸n in this case, we must understand the normal component of the relevant
vector taken in direct proximity to the surface element for which 𝜎′ is being
determined.
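Equations (2.9) and (2.10) amount to taking the scalar product of 𝑷 with the outward unit normal. The sketch below illustrates this for the plate of Fig. 2.2 under assumptions of our own: the susceptibility, the field strength inside the plate and the angle 𝛼 are hypothetical values, and the normal to the right-hand face is taken along the 𝑥-axis.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def bound_surface_density(p_vec, normal):
    """sigma' = P_n, the projection of P onto the outward unit normal, Eq. (2.9)."""
    return dot(p_vec, normal)


chi = 2.0                    # hypothetical susceptibility
e_mag = 1.0e5                # hypothetical field strength inside the plate, V/m
alpha = math.radians(30.0)   # angle between E and the normal to the right face

e_vec = (e_mag * math.cos(alpha), e_mag * math.sin(alpha), 0.0)
p_vec = tuple(chi * EPS0 * c for c in e_vec)      # Eq. (2.5)

sigma_right = bound_surface_density(p_vec, (1.0, 0.0, 0.0))
sigma_left = bound_surface_density(p_vec, (-1.0, 0.0, 0.0))
expected = chi * EPS0 * e_mag * math.cos(alpha)   # Eq. (2.10)
print(sigma_right, sigma_left, expected)
```

The face from which the field lines emerge acquires a positive bound charge, and the opposite face a negative one of the same magnitude.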
Now let us turn to finding the volume density of the bound charges appear-
ing inside an inhomogeneous dielectric. Let us consider an imaginary small area
𝛥𝑆 (Fig. 2.3) in an inhomogeneous isotropic dielectric with non-polar molecules.
Assume that a unit volume of the dielectric has 𝑛 identical particles with a charge
of +𝑒 and 𝑛 identical particles with a charge of −𝑒. In close proximity to area 𝛥𝑆,
the electric field and the dielectric can be considered homogeneous. Therefore,
when the field is switched on, all the positive charges near 𝛥𝑆 will be displaced
over the same distance 𝑙1 in the direction of 𝑬, and all the negative charges will
be displaced in the opposite direction over the same distance 𝑙2 (see Fig. 2.3). A
certain number of charges of one sign (positive if 𝛼 < 𝜋/2 and negative if 𝛼 > 𝜋/2)
will pass through area 𝛥𝑆 in the direction of a normal to it, and a certain number
of charges of the opposite sign (negative if 𝛼 < 𝜋/2 and positive if 𝛼 > 𝜋/2) in
the direction opposite to 𝒏̂. Area 𝛥𝑆 will be intersected by all the charges +𝑒 that
were at a distance of not over 𝑙1 cos 𝛼 from it before the field was switched on, i.e.,
by all the +𝑒’s in an oblique cylinder of volume 𝑙1 𝛥𝑆 cos 𝛼. The number of these
charges is 𝑛𝑙1 𝛥𝑆 cos 𝛼, while the charge they carry in the direction of a normal to
the area is 𝑒𝑛𝑙1 𝛥𝑆 cos 𝛼 (when 𝛼 > 𝜋/2, the charge carried in the direction of the
normal as a result of displacement of the charges +𝑒 will be negative). Similarly,
area 𝛥𝑆 will be intersected by all the charges −𝑒 in the volume 𝑙2 𝛥𝑆 cos 𝛼. These
charges will carry a charge of 𝑒𝑛𝑙2 𝛥𝑆 cos 𝛼 in the direction of a normal to the area
(inspection of Fig. 2.3 shows that when 𝛼 < 𝜋/2, the charges −𝑒 will carry the charge
−𝑒𝑛𝑙2 𝛥𝑆 cos 𝛼 through 𝛥𝑆 in the direction opposite to 𝒏̂, which is equivalent to
carrying the charge 𝑒𝑛𝑙2 𝛥𝑆 cos 𝛼 in the direction of 𝒏̂).

Fig. 2.3

Thus, when the field is switched on, the charge
$$\Delta q' = e n l_1 \Delta S \cos\alpha + e n l_2 \Delta S \cos\alpha = e n (l_1 + l_2) \Delta S \cos\alpha$$
is carried through area 𝛥𝑆 in the direction of a normal to it. The sum 𝑙1 + 𝑙2 is the
distance 𝑙 over which the positive and negative bound charges are displaced relative to
one another in the dielectric. As a result of this displacement, each pair of charges
acquires the dipole moment 𝑝 = 𝑒𝑙 = 𝑒(𝑙1 + 𝑙2). The number of such pairs in a unit
volume is 𝑛. Consequently, the product 𝑒(𝑙1 + 𝑙2)𝑛 = 𝑒𝑙𝑛 = 𝑝𝑛 gives the magnitude
of the polarization 𝑃. Thus, the charge passing through area 𝛥𝑆 in the direction of
a normal to it when the field is switched on is [see Eq. (2.9)]
$$\Delta q' = P\,\Delta S \cos\alpha.$$
Since the dielectric is isotropic, the directions of the vectors 𝑬 and 𝑷 coincide
(see Fig. 2.3). Consequently, 𝛼 is the angle between the vectors 𝑷 and 𝒏̂, and in this
connection we can write
$$\Delta q' = (\mathbf{P} \cdot \hat{\mathbf{n}})\, \Delta S.$$
Passing over from deltas to differentials, we get
$$dq' = (\mathbf{P} \cdot \hat{\mathbf{n}})\, dS = \mathbf{P} \cdot d\mathbf{S}.$$
We have found the bound charge d𝑞′ that passes through elementary area d𝑆 in
the direction of a normal to it when the field is switched on; 𝑷 is the polarization
set up under the action of the field at the location of area d𝑆.
Let us imagine closed surface 𝑆 inside the dielectric. When the field is switched
on, a bound charge 𝑞′ will intersect this surface and emerge from it. This charge is
$$q'_{\text{em}} = \oint_S dq' = \oint_S \mathbf{P} \cdot d\mathbf{S}$$
(we have agreed to take the outward normal to area d𝑆 for closed surfaces). As a
result, a surplus bound charge will appear in the volume enclosed by surface 𝑆. Its
value is
$$q'_{\text{sur}} = -q'_{\text{em}} = -\oint_S \mathbf{P} \cdot d\mathbf{S} = -\Phi_P \tag{2.11}$$
(𝛷P is the flux of the vector 𝑷 through surface 𝑆).
Fig. 2.4

Introducing the volume density of the bound charges 𝜌′, we can write
$$q'_{\text{sur}} = \int_V \rho' \, dV$$
(the integral is taken over the volume enclosed by surface 𝑆). We thus arrive at the
formula
$$\int_V \rho' \, dV = -\oint_S \mathbf{P} \cdot d\mathbf{S}.$$
Let us transform the surface integral according to the Ostrogradsky-Gauss theorem
[see Eq. (1.108)]. The result is
$$\int_V \rho' \, dV = -\int_V \nabla \cdot \mathbf{P} \, dV.$$
This equation must be observed for any arbitrarily chosen volume 𝑉 . This is possible
only if the following equation is observed at every point of the dielectric:
$$\rho' = -\nabla \cdot \mathbf{P}. \tag{2.12}$$
Consequently, the density of bound charges equals the divergence of the polarization
𝑷 taken with the opposite sign.
We obtained Eq. (2.12) when considering a dielectric with non-polar molecules.
This equation also holds, however, for dielectrics with polar molecules.
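The integral relation (2.11) and its differential form (2.12) can be compared numerically. The sketch below uses a hypothetical polarization field 𝑷 = (𝐴𝑥², 𝐵𝑦, 0) chosen by us purely for illustration; it estimates 𝜌′ = −∇ · 𝑷 by central differences, integrates it over a unit cube by the midpoint rule, and compares the result with the flux of 𝑷 through the faces of the cube taken with the opposite sign.

```python
A = 2.0e-6  # hypothetical coefficient, C/m^4
B = 1.0e-6  # hypothetical coefficient, C/m^3


def polarization(x, y, z):
    """Hypothetical polarization field (Px, Py, Pz), C/m^2."""
    return (A * x * x, B * y, 0.0)


def bound_density(x, y, z, h=1e-4):
    """rho' = -div P, Eq. (2.12), estimated by central differences."""
    d_px = (polarization(x + h, y, z)[0] - polarization(x - h, y, z)[0]) / (2 * h)
    d_py = (polarization(x, y + h, z)[1] - polarization(x, y - h, z)[1]) / (2 * h)
    d_pz = (polarization(x, y, z + h)[2] - polarization(x, y, z - h)[2]) / (2 * h)
    return -(d_px + d_py + d_pz)


def flux_through_unit_cube(n=20):
    """Outward flux of P through the cube 0 <= x, y, z <= 1 (midpoint rule)."""
    d = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * d, (j + 0.5) * d
            total += (polarization(1.0, u, v)[0] - polarization(0.0, u, v)[0]) * d * d
            total += (polarization(u, 1.0, v)[1] - polarization(u, 0.0, v)[1]) * d * d
            total += (polarization(u, v, 1.0)[2] - polarization(u, v, 0.0)[2]) * d * d
    return total


def bound_charge_in_unit_cube(n=20):
    """Volume integral of rho' over the same cube (midpoint rule)."""
    d = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += bound_density((i + 0.5) * d, (j + 0.5) * d, (k + 0.5) * d) * d ** 3
    return total


flux = flux_through_unit_cube()
q_bound = bound_charge_in_unit_cube()
print(flux, q_bound)  # q_bound is equal to -flux, Eq. (2.11)
```

For this field ∇ · 𝑷 is positive everywhere inside the cube, so the surplus bound charge is negative, in agreement with the graphical interpretation of Eq. (2.12).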
Equation (2.12) can be given a graphical interpretation. Points with a positive
∇ · 𝑷 are sources of the field of the vector 𝑷, and the lines of 𝑷 diverge from them
(Fig. 2.4). Points with a negative ∇ · 𝑷 are sinks of the field of the vector 𝑷, and the
lines of 𝑷 converge at them. In polarization of the dielectric, the positive bound
charges are displaced in the direction of the vector 𝑷, i.e., in the direction of the
lines 𝑷; the negative bound charges are displaced in the opposite direction (in the
figure the bound charges belonging to separate molecules are encircled by ovals).
As a result, a surplus of negative bound charges is formed at places with a positive
∇ · 𝑷, and a surplus of positive bound charges at places with a negative ∇ · 𝑷.
Bound charges differ from extraneous ones only in that they cannot leave
the confines of the molecules which they are in. Otherwise, they have the same
properties as all other charges. In particular, they are sources of an electric field.
Therefore, when the density of the bound charges 𝜌′ differs from zero, Eq. (1.117)
must be written in the form
$$\nabla \cdot \mathbf{E} = \frac{1}{\varepsilon_0} (\rho + \rho'). \tag{2.13}$$
Here 𝜌 is the density of the extraneous charges.
Let us introduce Eq. (2.5) for 𝑷 into Eq. (2.12) and use Eq. (1.103). The result is
$$\rho' = -\nabla \cdot (\chi \varepsilon_0 \mathbf{E}) = -\varepsilon_0 \nabla \cdot (\chi \mathbf{E}) = -\varepsilon_0 [\mathbf{E} \cdot \nabla\chi + \chi \nabla \cdot \mathbf{E}].$$
Substituting for ∇ · 𝑬 its value from Eq. (2.13), we arrive at the equation
$$\rho' = -\varepsilon_0 (\mathbf{E} \cdot \nabla\chi) - \chi\rho - \chi\rho'.$$
Hence,
$$\rho' = -\frac{1}{1 + \chi} [\varepsilon_0 (\mathbf{E} \cdot \nabla\chi) + \chi\rho]. \tag{2.14}$$
We can see from Eq. (2.14) that the volume density of bound charges can differ
from zero in two cases: (1) if a dielectric is not homogeneous (∇𝜒 ≠ 0), and (2) if at
a given place in a dielectric the density of the extraneous charges is other than zero
(𝜌 ≠ 0).
When there are no extraneous charges in a dielectric, the volume density of the
bound charges is
$$\rho' = -\frac{\varepsilon_0}{1 + \chi} (\mathbf{E} \cdot \nabla\chi). \tag{2.15}$$

2.5. Electric Displacement Vector

We noted in the preceding section that not only extraneous, but also bound charges
are sources of a field. Accordingly,
$$\nabla \cdot \mathbf{E} = \frac{1}{\varepsilon_0} (\rho + \rho') \tag{2.16}$$
[see Eq. (2.13)].
Equation (2.16) is of virtually no use for finding the vector 𝑬 because it expresses
the properties of the unknown quantity 𝑬 through bound charges, which in turn
are determined by the unknown quantity 𝑬 [see Eqs. (2.10) and (2.14)].
Calculation of the fields is often simplified if we introduce an auxiliary quantity
whose sources are only extraneous charges 𝜌. To establish what this quantity looks
like, let us introduce Eq. (2.12) for 𝜌′ into Eq. (2.16):
$$\nabla \cdot \mathbf{E} = \frac{1}{\varepsilon_0} (\rho - \nabla \cdot \mathbf{P})$$
whence it follows that
∇ · (𝜀0 𝑬 + 𝑷) = 𝜌 (2.17)
(we have put 𝜀0 inside the del symbol). The expression in parentheses in Eq. (2.17)
is the required quantity. It is designated by the symbol 𝑫 and is called the electric
displacement (or electric induction).
Thus, the electric displacement is a quantity determined by the relation
𝑫 = 𝜀0 𝑬 + 𝑷. (2.18)
Inserting Eq. (2.5) for 𝑷, we get
𝑫 = 𝜀0 𝑬 + 𝜒𝜀0 𝑬 = 𝜀0 (1 + 𝜒)𝑬. (2.19)
The dimensionless quantity
𝜀 =1+ 𝜒 (2.20)
is called the relative permittivity or simply the permittivity of a medium⁵. Thus,
Eq. (2.19) can be written in the form
𝑫 = 𝜀0 𝜀𝑬. (2.21)
According to Eq. (2.21), the vector 𝑫 is proportional to the vector 𝑬. We remind our
reader that we are dealing with isotropic dielectrics. In anisotropic dielectrics, the
vectors 𝑬 and 𝑫, generally speaking, are not collinear.
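For an isotropic dielectric, Eqs. (2.5) and (2.18) to (2.21) reduce to simple arithmetic with the magnitudes of the collinear vectors. The sketch below is only an illustration: the concentration 𝑛, the polarizability 𝛽 and the field strength are hypothetical numbers, and 𝜒 = 𝑛𝛽 is the relation obtained in Sec. 2.2 for non-polar molecules.

```python
EPS0 = 8.8541878128e-12  # electric constant, F/m


def susceptibility(n, beta):
    """chi = n * beta for a dielectric built of non-polar molecules."""
    return n * beta


def polarization(chi, e):
    """Magnitude of P, Eq. (2.5)."""
    return chi * EPS0 * e


def displacement(chi, e):
    """Magnitude of D from its definition, Eq. (2.18)."""
    return EPS0 * e + polarization(chi, e)


n = 2.5e25      # molecules per m^3 (hypothetical)
beta = 1.0e-29  # polarizability, m^3 (hypothetical)
e = 1.0e4       # field strength in the dielectric, V/m (hypothetical)

chi = susceptibility(n, beta)
eps = 1.0 + chi                 # relative permittivity, Eq. (2.20)
d_from_definition = displacement(chi, e)
d_from_permittivity = EPS0 * eps * e   # Eq. (2.21)
print(chi, eps, d_from_definition, d_from_permittivity)
```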
In accordance with Eqs. (1.15) and (2.21), the electric displacement of the
field of a point charge in a vacuum is
$$\mathbf{D} = \frac{1}{4\pi} \frac{q}{r^2} \hat{\mathbf{e}}_r. \tag{2.22}$$
The unit of electric displacement is the coulomb per square metre (C m⁻²).
Equation (2.17) can be written as
∇ · 𝑫 = 𝜌. (2.23)
Integration of this equation over the arbitrary volume 𝑉 yields
$$\int_V \nabla \cdot \mathbf{D} \, dV = \int_V \rho \, dV.$$
Let us transform the left-hand side according to the Ostrogradsky-Gauss theorem
[see Eq. (1.108)]:
$$\oint_S \mathbf{D} \cdot d\mathbf{S} = \int_V \rho \, dV. \tag{2.24}$$
⁵The so-called absolute permittivity of a medium 𝜀a = 𝜀0 𝜀 is introduced in electrical engineering.
This quantity has no physical meaning, however, and we shall not use it.
The quantity on the left-hand side of Eq. (2.24) is 𝛷𝐷, the flux of the vector 𝑫 through closed
surface 𝑆, while that on the right-hand side is the sum of the extraneous charges
∑𝑖 𝑞𝑖 enclosed by this surface. Hence, Eq. (2.24) can be written in the form
$$\Phi_D = \sum_i q_i. \tag{2.25}$$
Equations (2.24) and (2.25) express Gauss’s theorem for the vector 𝑫: the flux of the
electric displacement through a closed surface equals the algebraic sum of the extraneous
charges enclosed by this surface.
In a vacuum, 𝑷 = 0, so that the quantity 𝑫 determined by Eq. (2.18) transforms
into 𝜀0 𝑬, and Eqs. (2.24) and (2.25) transform into Eqs. (1.114) and (1.116).
The unit of the flux of the electric displacement vector is the coulomb. By
Eq. (2.25), a charge of 1 C sets up a displacement flux of 1 C through the surface
surrounding it.
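Gauss's theorem for 𝑫 can be checked numerically. The sketch below is only an illustration and is not part of the original text: the function name, the grid sizes, and the choice of a sphere centred at the origin are assumptions. It sums 𝐷𝑛 d𝑆 for the field (2.22) of a point charge over a spherical surface using the midpoint rule.

```python
import math


def displacement_flux_through_sphere(q, charge_pos, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the flux of D through a sphere centred at the origin.

    q          -- point charge in coulombs (hypothetical input)
    charge_pos -- (x, y, z) position of the charge in metres
    radius     -- radius of the closed spherical surface in metres
    """
    x0, y0, z0 = charge_pos
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    flux = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal at the surface point
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            # Vector from the charge to the surface point
            rx, ry, rz = radius * nx - x0, radius * ny - y0, radius * nz - z0
            r = math.sqrt(rx * rx + ry * ry + rz * rz)
            # Normal component of D = q r_vec / (4 pi r^3), cf. Eq. (2.22)
            d_n = q / (4.0 * math.pi * r ** 3) * (rx * nx + ry * ny + rz * nz)
            # Surface element of the sphere: R^2 sin(theta) d(theta) d(phi)
            flux += d_n * radius ** 2 * sin_t * d_theta * d_phi
    return flux
```

For a charge placed anywhere inside the sphere the result should be close to 𝑞, and for a charge outside it should be close to zero, in agreement with Eq. (2.25).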
The field of the vector 𝑫 can be depicted with the aid of electric displacement
lines (we shall call them displacement lines for brevity’s sake). Their direction and
density are determined in exactly the same way as for the lines of the vector 𝑬 (see
Sec. 1.5). The lines of the vector 𝑬 can begin and terminate at both extraneous and
bound charges. The sources of the field of the vector 𝑫 are only extraneous charges.
Hence, displacement lines can begin or terminate only at extraneous charges. These
lines pass without interruption through points at which bound charges are placed.
The electric induction⁶ in the Gaussian system is determined by the expression
𝑫 = 𝑬 + 4𝜋 𝑷. (2.26)
Substituting for 𝑷 in this equation its value from Eq. (2.6), we get
𝑫 = (1 + 4𝜋 𝜒)𝑬. (2.27)
The quantity
𝜀 = 1 + 4𝜋𝜒 (2.28)
is called the permittivity. Introducing this quantity into Eq. (2.27) we get
𝑫 = 𝜀𝑬. (2.29)
In the Gaussian system, the electric induction in a vacuum coincides with the
field strength 𝑬. Consequently, the electric induction of the field of a point charge
in a vacuum is determined by Eq. (1.16).
By Eq. (2.22) the electric displacement set up by a charge of 1 C at a distance of
1 m is
𝐷 = (1/4𝜋)(𝑞/𝑟²) = (1/4𝜋)(1/1²) = 1/(4𝜋) C m⁻².

⁶The term “electric displacement” is not applied to quantity (2.27).


In the Gaussian system, the electric induction in this case is
𝐷 = 𝑞/𝑟² = (3 × 10⁹)/10⁴ = 3 × 10⁵ cgse𝐷.
Thus,
1 C m⁻² = 4𝜋 × 3 × 10⁵ cgse𝐷.
In the Gaussian system, the expressions of Gauss’s theorem have the form
∮_𝑆 𝑫 · d𝑺 = 4𝜋 ∫_𝑉 𝜌 d𝑉 (2.30)
𝛷𝐷 = 4𝜋 Σ_𝑖 𝑞𝑖. (2.31)
According to Eq. (2.31), a charge of 1 C sets up a flux of the electric induction vector
of 4𝜋𝑞 = 4𝜋 × 3 × 10⁹ cgse𝛷𝐷. The following relation thus exists between the units
of flux of the vector 𝑫:
1 C = 4𝜋 × 3 × 10⁹ cgse𝛷𝐷.
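The two unit relations obtained above can be collected into a pair of small conversion helpers. This is a minimal sketch, assuming the rounded value 1 C = 3 × 10⁹ cgse units of charge used in the text; the function names are hypothetical.

```python
import math

# Rounded conversion factor used in the text: 1 C = 3e9 cgse units of charge
CGSE_CHARGE_PER_COULOMB = 3e9


def displacement_si_to_gaussian(d_si):
    """Convert D from C/m^2 to cgse units: 1 C/m^2 = 4*pi * 3e5 cgse_D."""
    # 1 m^2 = 1e4 cm^2, hence the factor 3e9 / 1e4 = 3e5
    return d_si * 4.0 * math.pi * CGSE_CHARGE_PER_COULOMB / 1e4


def displacement_flux_si_to_gaussian(flux_si):
    """Convert a flux of D from coulombs to cgse units: 1 C = 4*pi * 3e9 cgse_Phi."""
    return flux_si * 4.0 * math.pi * CGSE_CHARGE_PER_COULOMB
```

Converting the displacement 1/(4𝜋) C m⁻² found above for a charge of 1 C at a distance of 1 m should give back the Gaussian value of 3 × 10⁵ cgse units.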

2.6. Examples of Calculating the Field in Dielectrics

We shall consider several examples of fields in dielectrics to reveal the meaning of
the quantities 𝑫 and 𝜀.
Field Inside a Flat Plate. Let us consider two infinite parallel oppositely
charged planes. Let the field they produce in a vacuum be characterized by the
strength 𝑬0 and the displacement 𝑫0 = 𝜀0 𝑬0. Let us introduce into this field a
plate of a homogeneous isotropic dielectric and arrange it as shown in Fig. 2.5. The
dielectric becomes polarized under the action of the field, and bound charges of
density 𝜎′ will appear on its surfaces. These charges will set up a homogeneous
field inside the plate whose strength by Eq. (1.121) is 𝐸′ = 𝜎′/𝜀0. In the given case,
𝐸′ = 0 outside the dielectric.
The field strength 𝐸0 is 𝜎/𝜀0, where 𝜎 is the density of the extraneous charges on
the planes. The fields 𝑬0 and 𝑬′ are directed oppositely to each other; hence,
inside the dielectric we have
𝐸 = 𝐸0 − 𝐸′ = 𝐸0 − 𝜎′/𝜀0 = (1/𝜀0)(𝜎 − 𝜎′). (2.32)
Outside the dielectric, 𝐸 = 𝐸0.
The polarization of the dielectric is due to field (2.32). The latter is perpendicular
to the surfaces of the plate. Hence, 𝐸n = 𝐸, and in accordance with Eq. (2.10),
𝜎′ = 𝜒𝜀0 𝐸. Using this value in Eq. (2.32), we get
𝐸 = 𝐸0 − 𝜒𝐸
Fig. 2.5

whence
𝐸 = 𝐸0/(1 + 𝜒) = 𝐸0/𝜀. (2.33)
Thus, in the given case, the permittivity 𝜀 shows by what factor the field is
weakened in a dielectric.
Multiplying Eq. (2.33) by 𝜀0 𝜀, we get the electric displacement inside the plate:
𝐷 = 𝜀0 𝜀𝐸 = 𝜀0 𝐸0 = 𝐷0. (2.34)
Hence, the electric displacement inside the plate coincides with that of the external
field 𝐷0 . Substituting 𝜎 /𝜀0 for 𝐸0 in Eq. (2.34), we find
𝐷 = 𝜎. (2.35)
To find 𝜎′, let us express 𝐸 and 𝐸0 in Eq. (2.33) through the charge densities:
(1/𝜀0)(𝜎 − 𝜎′) = 𝜎/(𝜀0 𝜀)
whence
𝜎′ = ((𝜀 − 1)/𝜀) 𝜎. (2.36)
Figure 2.5 has been drawn assuming that 𝜀 = 3. Accordingly, the density of
the field lines in the dielectric is one-third of that outside the plate. The lines are
equally spaced because the field is homogeneous. In the given case, 𝜎′ can be found
without resorting to Eq. (2.36). Indeed, since the field intensity inside the plate is
one-third of that outside it, then of three field lines beginning (or terminating) on
extraneous charges, two must terminate (or begin respectively) on bound charges.
It thus follows that the density of the bound charges must be two-thirds that of the
extraneous charges.
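The relations for the flat plate can be collected into a short numerical sketch. The function name and the returned dictionary keys are hypothetical; the formulas are Eqs. (2.33), (2.35), and (2.36), with 𝜎 the density of the extraneous charges and 𝜎′ that of the bound charges.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def plate_fields(sigma, eps):
    """Field quantities inside a dielectric plate placed between charged planes.

    sigma -- surface density of the extraneous charges, C/m^2
    eps   -- relative permittivity of the plate (dimensionless)
    """
    e0 = sigma / EPS0                      # field of the extraneous charges alone
    e = e0 / eps                           # Eq. (2.33): field weakened by the factor eps
    d = EPS0 * eps * e                     # Eq. (2.34): equals sigma, cf. Eq. (2.35)
    sigma_bound = (eps - 1.0) / eps * sigma  # Eq. (2.36)
    return {"E0": e0, "E": e, "D": d, "sigma_bound": sigma_bound}
```

With 𝜀 = 3, as assumed for Fig. 2.5, the bound charge density returned is two-thirds of 𝜎, and 𝐸 agrees with (𝜎 − 𝜎′)/𝜀0 from Eq. (2.32).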
In the Gaussian system, the field strength 𝐸′ produced by the bound charges 𝜎′
is 4𝜋𝜎′. Therefore, Eq. (2.32) becomes
𝐸 = 𝐸0 − 𝐸′ = 𝐸0 − 4𝜋𝜎′. (2.37)
The surface density 𝜎′ is associated with the field strength 𝐸 by the equation 𝜎′ =
𝜒𝐸n. We can thus write that
𝐸 = 𝐸0 − 4𝜋𝜒𝐸
whence
𝐸 = 𝐸0/(1 + 4𝜋𝜒) = 𝐸0/𝜀.
Thus, the permittivity 𝜀, like its counterpart 𝜀 in the SI, shows how many times
the field inside a dielectric weakens. Therefore, the values of 𝜀 in the SI and the
Gaussian system coincide. Hence, taking into account Eqs. (2.20) and (2.28), we
conclude that the susceptibilities in the Gaussian system ( 𝜒Gs ) and in the SI ( 𝜒SI )
differ from each other by the factor 4𝜋:
𝜒SI = 4𝜋 𝜒Gs . (2.38)
Field Inside a Spherical Layer. Let us surround a charged sphere of radius 𝑅
with a concentric spherical layer of a homogeneous isotropic dielectric (Fig. 2.6).
The bound charge 𝑞′1 distributed with the density 𝜎′1 will appear on the internal
surface of the layer (𝑞′1 = 4𝜋𝑅1²𝜎′1), and the charge 𝑞′2 distributed with the density
𝜎′2 will appear on its external surface (𝑞′2 = 4𝜋𝑅2²𝜎′2). The sign of the charge 𝑞′2
coincides with that of the charge 𝑞 of the sphere, while 𝑞′1 has the opposite sign. The
charges 𝑞′1 and 𝑞′2 set up a field at a distance 𝑟 exceeding 𝑅1 and 𝑅2, respectively,
that coincides with the field of a point charge of the same magnitude [see Eq. (1.124)].
The charges 𝑞′1 and 𝑞′2 produce no field inside the surfaces over which they are
distributed. Hence, the field strength 𝐸′ inside the dielectric is
𝐸′ = (1/4𝜋𝜀0)(𝑞′1/𝑟²) = (1/4𝜋𝜀0)(4𝜋𝑅1²𝜎′1/𝑟²) = (1/𝜀0)(𝑅1²𝜎′1/𝑟²)
and is opposite in direction to the field strength 𝐸0. The resultant field in the dielectric
is
𝐸(𝑟) = 𝐸0 − 𝐸′ = (1/4𝜋𝜀0)(𝑞/𝑟²) − (1/𝜀0)(𝑅1²𝜎′1/𝑟²). (2.39)
It diminishes in proportion to 1/𝑟². We can, therefore, state that
𝐸(𝑅1)/𝐸(𝑟) = 𝑟²/𝑅1² ⇒ 𝐸(𝑅1) = 𝐸(𝑟) 𝑟²/𝑅1²,
Fig. 2.6

where 𝐸(𝑅1) is the field strength in the dielectric in direct proximity to the internal
surface of the layer. It is exactly this strength that determines the quantity 𝜎′1:
𝜎′1 = 𝜒𝜀0 𝐸(𝑅1) = 𝜒𝜀0 𝐸(𝑟) 𝑟²/𝑅1² (2.40)
(at each point of the surface |𝐸n| = 𝐸).
Introducing Eq. (2.40) into Eq. (2.39), we get
𝐸(𝑟) = (1/4𝜋𝜀0)(𝑞/𝑟²) − (1/𝜀0)(𝑅1²/𝑟²) 𝜒𝜀0 𝐸(𝑟) (𝑟²/𝑅1²) = 𝐸0(𝑟) − 𝜒𝐸(𝑟).
From this equation, we find that inside the dielectric 𝐸 = 𝐸0/𝜀, and, consequently,
𝐷 = 𝜀0 𝐸0 [compare with Eqs. (2.33) and (2.34)].
The field inside the dielectric changes in proportion to 1/𝑟². Therefore, the
relation 𝜎′1 : 𝜎′2 = 𝑅2² : 𝑅1² holds. Hence, it follows that 𝑞′1 and 𝑞′2 are equal in
magnitude. Consequently, the fields set up by these charges at distances exceeding 𝑅2
cancel each other, so that outside the spherical layer 𝐸′ = 0 and 𝐸 = 𝐸0.
Assuming that 𝑅1 = 𝑅 and 𝑅2 = ∞, we arrive at the case of a charged sphere
immersed in an infinite homogeneous and isotropic dielectric. The field strength
outside such a sphere is
𝐸 = (1/4𝜋𝜀0)(𝑞/(𝜀𝑟²)). (2.41)
The strength of the field set up in an infinite dielectric by a point charge will be the
same.
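The statements about the spherical layer can be verified with a few lines of code. The sketch below is illustrative only; the function name and argument names are assumptions. It works with magnitudes: the bound surface densities follow from 𝜎′ = 𝜒𝜀0 𝐸 with 𝐸 taken from Eq. (2.41), and the bound charges are the densities multiplied by the areas of the two surfaces.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def spherical_layer_bound_charges(q, eps, r1, r2):
    """Magnitudes of the bound charge densities and charges on a spherical layer.

    q      -- charge of the central sphere, C
    eps    -- relative permittivity of the layer
    r1, r2 -- inner and outer radii of the layer, m
    """
    chi = eps - 1.0  # Eq. (2.20)

    def field_in_layer(r):
        # Eq. (2.41): field of the charge q weakened by the factor eps
        return q / (4.0 * math.pi * EPS0 * eps * r ** 2)

    sigma1 = chi * EPS0 * field_in_layer(r1)  # density on the inner surface
    sigma2 = chi * EPS0 * field_in_layer(r2)  # density on the outer surface
    q1 = 4.0 * math.pi * r1 ** 2 * sigma1
    q2 = 4.0 * math.pi * r2 ** 2 * sigma2
    return sigma1, sigma2, q1, q2
```

The two bound charges come out equal in magnitude, each being (𝜀 − 1)/𝜀 of 𝑞, while the densities are in the ratio 𝑅2² : 𝑅1².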
Both examples considered above are characterized by the fact that the dielectric
was homogeneous and isotropic, and the surfaces enclosing it coincided with the
Fig. 2.7 Fig. 2.8

equipotential surfaces of the field of extraneous charges. The result we have obtained
in these cases is a general one. If a homogeneous and isotropic dielectric completely
fills the volume enclosed by equipotential surfaces of the field of extraneous charges,
then the electric displacement vector coincides with the vector of the field strength of the
extraneous charges multiplied by 𝜀0 , and, therefore, the field strength inside the dielectric
is 1/𝜀 of that of the field strength of the extraneous charges.
If the above conditions are not observed, the vectors 𝑫 and 𝜀0 𝑬0 do not coincide.
Figure 2.7 shows the field in the plate of a dielectric. The plate is skewed relative to
the planes carrying extraneous charges. The vector 𝑬′ is perpendicular to the faces
of the plate, therefore, 𝑬 and 𝑬0 are not collinear. The vector 𝑫 is directed the same
as 𝑬, consequently, 𝑫 and 𝜀0 𝑬0 do not coincide in direction. We can show that they
also fail to coincide in magnitude. In the examples considered above, owing to the
specially selected shape of the dielectric, the field 𝑬′ differed from zero only inside
the dielectric. In the general case, 𝑬′ may differ from zero outside the dielectric too.
Let us place a rod made of a dielectric into an initially homogeneous field (Fig. 2.8).
Owing to polarization, bound charges of opposite signs are formed on the ends of
the rod. Their field outside the rod is equivalent to the field of a dipole (the lines of
𝑬′ are dashed in the figure). It is easy to see that the resultant field 𝑬 near the
ends of the rod is greater than the field 𝑬0.

2.7. Conditions on the Interface Between Two Dielectrics

Near the interface between two dielectrics, the vectors 𝑬 and 𝑫 must comply with
definite boundary conditions following from the relations (1.112) and (2.23):
∇ × 𝑬 = 0, ∇ · 𝑫 = 𝜌.
Let us consider the interface between two dielectrics with the permittivities 𝜀1
and 𝜀2 (Fig. 2.9). We choose an arbitrarily directed 𝑥-axis on this surface. We take a
small rectangular contour of length 𝑎 and width 𝑏 that is partly in the first dielectric
Fig. 2.9 Fig. 2.10

and partly in the second one. The 𝑥-axis passes through the middle of the sides 𝑏.
Assume that a field has been set up in the first dielectric whose strength is 𝑬 1 ,
and in the second one whose strength is 𝑬 2 . Since ∇ × 𝑬 = 0, the circulation of
the vector 𝑬 around the contour we have chosen must equal zero [see Eq. (1.110)].
With small dimensions of the contour and the direction of circumvention shown in
Fig. 2.9, the circulation of the vector 𝑬 can be written in the form
∮ 𝐸𝑙 d𝑙 = 𝐸1,𝑥 𝑎 − 𝐸2,𝑥 𝑎 + ⟨𝐸𝑏⟩ 2𝑏 (2.42)
where ⟨𝐸𝑏⟩ is the mean value of 𝐸𝑙 on sections of the contour perpendicular to the
interface. Equating this expression to zero, we arrive at the equation
(𝐸2,𝑥 − 𝐸1,𝑥) 𝑎 = ⟨𝐸𝑏⟩ 2𝑏.
In the limit, when the width 𝑏 of the contour tends to zero, we get
𝐸1,𝑥 = 𝐸2,𝑥 . (2.43)
The values of the projections of the vectors 𝑬 1 and 𝑬 2 onto the 𝑥-axis are taken in
direct proximity to the interface between the dielectrics.
Equation (2.43) is obeyed when the 𝑥-axis is selected arbitrarily. It is only
essential that this axis be in the plane of the interface between the dielectrics.
Inspection of Eq. (2.43) shows that with such a selection of the 𝑥-axis when 𝐸1,𝑥 = 0,
the projection 𝐸2,𝑥 will also equal zero. This signifies that the vectors 𝑬 1 and
𝑬 2 at two close points taken at opposite sides of the interface are in the same plane
as a normal to the interface. Let us represent each of the vectors 𝑬 1 and 𝑬 2 in the
form of the sum of the normal and tangential components:
𝑬 1 = 𝑬 1,𝑛 + 𝑬 1,𝜏 , 𝑬 2 = 𝑬 2,𝑛 + 𝑬 2,𝜏 .
In accordance with Eq. (2.43)
𝐸1,𝜏 = 𝐸2,𝜏 . (2.44)
Here 𝐸𝑖,𝜏 is the projection of the vector 𝑬 𝑖 onto the unit vector 𝝉ˆ directed along the
line of intersection of the dielectric interface with the plane containing the vectors
𝑬 1 and 𝑬 2 .
Substituting in accordance with Eq. (2.21) the projections of the vector 𝑫 divided
by 𝜀0 𝜀 for the projections of the vector 𝑬, we get the proportion
𝐷1,𝜏/(𝜀0 𝜀1) = 𝐷2,𝜏/(𝜀0 𝜀2)
whence it follows that
𝐷1,𝜏/𝐷2,𝜏 = 𝜀1/𝜀2. (2.45)
Now let us take an imaginary cylindrical surface of height ℎ on the interface
between the dielectrics (Fig. 2.10). Base 𝑆1 is in the first dielectric, and base 𝑆2 in the
second. Both bases are identical in size (𝑆1 = 𝑆2 = 𝑆) and are so small that within
the limits of each of them the field may be considered homogeneous. Let us apply
Gauss’s theorem [see Eq. (2.25)] to this surface. If there are no extraneous charges
on the interface between the dielectrics, the right-hand side in Eq. (2.25) equals zero.
Hence, 𝛷𝐷 = 0.
The flux through base 𝑆1 is 𝐷1,𝑛 𝑆, where 𝐷1,𝑛 is the projection of the vector
𝑫 in the first dielectric onto the normal 𝒏ˆ 1 . Similarly, the flux through base 𝑆2 is
𝐷2,𝑛 𝑆, where 𝐷2,𝑛 is the projection of the vector 𝑫 in the second dielectric onto the
normal 𝒏ˆ 2. The flux through the side surface can be written in the form ⟨𝐷𝑛⟩ 𝑆side,
where ⟨𝐷𝑛⟩ is the value of 𝐷𝑛 averaged over the entire side surface, and 𝑆side is the
magnitude of this surface. We can thus write that
𝛷𝐷 = 𝐷1,𝑛 𝑆 + 𝐷2,𝑛 𝑆 + ⟨𝐷𝑛⟩ 𝑆side = 0. (2.46)
If the altitude ℎ of the cylinder is made to tend to zero, then 𝑆side will also tend to
zero. Hence, in the limit, we get
𝐷1,𝑛 = −𝐷2,𝑛 .
Here 𝐷𝑖,𝑛 is the projection onto 𝒏ˆ 𝑖 of the vector 𝑫 in the 𝑖-th dielectric in direct
proximity to its interface with the other dielectric. The signs of the projections are
different because the normals 𝒏ˆ 1 and 𝒏ˆ 2 to the bases of the cylinder have opposite
directions. If we project 𝑫1 and 𝑫2 onto the same normal, we get the condition
𝐷1,𝑛 = 𝐷2,𝑛 . (2.47)
Using Eq. (2.21) to replace the projections of 𝑫 with the corresponding projec-
tions of the vector 𝑬 multiplied by 𝜀0 𝜀, we get the relation
𝜀0 𝜀1 𝐸1,𝑛 = 𝜀0 𝜀2 𝐸2,𝑛
whence
𝐸1,𝑛/𝐸2,𝑛 = 𝜀2/𝜀1. (2.48)
The results we have obtained signify that when passing through the interface
between two dielectrics, the normal component of the vector 𝑫 and the tangential
component of the vector 𝑬 change continuously. The tangential component of the
vector 𝑫 and the normal component of the vector 𝑬, however, experience a
discontinuity when passing through the interface.
Equations (2.44), (2.45), (2.47), and (2.48) determine the conditions which the
vectors 𝑬 and 𝑫 must comply with on the interface between two dielectrics (if there
are no extraneous charges on this interface). We have obtained these equations for
an electrostatic field. They also hold, however, for fields varying with time (see
Sec. 16.3).
The conditions we have found also hold for the interface between a dielectric
and a vacuum. In this case, one of the permittivities must be taken equal to unity.
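The boundary conditions lend themselves to a compact numerical illustration. The following sketch is not from the original text; the function name and the component ordering are assumptions. Given the normal and tangential components of 𝑬 on one side of a charge-free interface, it returns the components on the other side using Eqs. (2.44) and (2.48).

```python
import math


def field_across_interface(e1_n, e1_t, eps1, eps2):
    """Components of E just beyond a charge-free interface between two dielectrics.

    e1_n, e1_t -- normal and tangential components of E in the first dielectric
    eps1, eps2 -- relative permittivities of the first and second dielectrics
    Both normal components are projected onto the same normal.
    """
    e2_t = e1_t                # Eq. (2.44): tangential component of E is continuous
    e2_n = e1_n * eps1 / eps2  # Eq. (2.48): follows from continuity of D_n
    return e2_n, e2_t
```

For a dielectric bordering on a vacuum, one of the permittivities is simply set to 1, as noted above.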
We must note that condition (2.47) can be obtained on the basis of the fact that
the displacement lines pass through the interface between two dielectrics without
being interrupted (Fig. 2.11). According to the rule for drawing these lines, the
number of lines arriving at area 𝛥𝑆 from the first dielectric is 𝐷1 𝛥𝑆1 = 𝐷1 𝛥𝑆 cos 𝛼1 .
Similarly, the number of lines emerging from area 𝛥𝑆 into the second dielectric is
𝐷2 𝛥𝑆2 = 𝐷2 𝛥𝑆 cos 𝛼2 . If the lines are not interrupted at the interface, both these
numbers must be the same:
𝐷1 𝛥𝑆 cos 𝛼1 = 𝐷2 𝛥𝑆 cos 𝛼2 .
Cancelling 𝛥𝑆 and taking into account that the product 𝐷 cos 𝛼 gives the value of
the normal component of the vector 𝑫, we arrive at condition (2.47).
The displacement lines are bent (refracted) on the interface between dielectrics,
owing to which the angle 𝛼 between a normal to the interface and the line 𝑫 changes.
Inspection of Fig. 2.12 shows that
tan 𝛼1 : tan 𝛼2 = (𝐷1,𝜏/𝐷1,𝑛) : (𝐷2,𝜏/𝐷2,𝑛)
whence with account taken of Eqs. (2.45) and (2.47), we get the law of displacement
line refraction:
tan 𝛼1/tan 𝛼2 = 𝜀1/𝜀2. (2.49)
When displacement lines pass into a dielectric with a lower permittivity 𝜀, the angle
made by them with a normal diminishes, hence, the lines are spaced farther apart;
when the lines pass into a dielectric with a higher permittivity 𝜀, on the contrary,
they become closer together.
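As a small worked example of the refraction law (2.49), the sketch below computes the angle 𝛼2 on the far side of the interface. The function name is hypothetical, and the angle of incidence is assumed to lie between 0 and 𝜋/2.

```python
import math


def refracted_angle(alpha1, eps1, eps2):
    """Angle (radians) between the displacement line and the normal in medium 2.

    alpha1     -- angle to the normal in the first dielectric, 0 <= alpha1 < pi/2
    eps1, eps2 -- relative permittivities of the first and second dielectrics
    """
    # Eq. (2.49): tan(alpha1) / tan(alpha2) = eps1 / eps2
    return math.atan(math.tan(alpha1) * eps2 / eps1)
```

For lines passing from a vacuum (𝜀1 = 1) into a dielectric with 𝜀2 = 3 at 30° to the normal, the angle grows to 60°; in the reverse direction it diminishes, as stated above.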

2.8. Forces Acting on a Charge in a Dielectric

If we introduce into an electric field in a vacuum a charged body of such small
dimensions that the external field within the body can be considered homogeneous,
Fig. 2.11 Fig. 2.12

then the body will experience the force
𝑭 = 𝑞𝑬. (2.50)
To place a charged body in a field set up in a dielectric, a cavity must be made in
the latter. In a fluid dielectric, the body itself forms the cavity by displacing the
dielectric from the volume it occupies. The field inside the cavity 𝑬 cav will differ
from that in a continuous dielectric. Thus, we cannot calculate the force exerted
on a charged body placed in a cavity as the product of the charge 𝑞 and the field
strength 𝑬 in the dielectric before the body was introduced into it.
When calculating the force acting on a charged body in a fluid dielectric, we
must take another circumstance into account. Mechanical tension is set up on the
boundary with the body in the dielectric. This sets up an additional mechanical
force 𝑭 ten acting on the body.
Thus, the force acting on a charged body in a dielectric, generally speaking,
cannot be determined by Eq. (2.50), and it is usually a very complicated task to
calculate it. These calculations give an interesting result for a fluid dielectric. The
resultant of the electric force 𝑞𝑬 cav and the mechanical force 𝑭 ten is found to be
exactly equal to 𝑞𝑬, where 𝑬 is the field strength in the continuous dielectric:
𝑭 = 𝑞𝑬 cav + 𝑭 ten = 𝑞𝑬. (2.51)
The strength of the field produced in a homogeneous infinitely extending
dielectric by a point charge is determined by Eq. (2.41). Hence, we get the following
expression for the forces of interaction of two point charges immersed in a
homogeneous infinitely extending dielectric:
𝐹 = (1/4𝜋𝜀0)(𝑞1 𝑞2/(𝜀𝑟²)). (2.52)
This formula expresses Coulomb’s law for charges in a dielectric. It holds only for
fluid dielectrics.
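A minimal numerical sketch of Eq. (2.52) follows. The function name is hypothetical, and the formula is applied under the stated restriction to a homogeneous, infinitely extending fluid dielectric.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def coulomb_force_in_fluid_dielectric(q1, q2, r, eps=1.0):
    """Magnitude of the force between two point charges, Eq. (2.52).

    q1, q2 -- charges, C
    r      -- distance between the charges, m
    eps    -- relative permittivity of the surrounding fluid (1 for a vacuum)
    """
    return abs(q1 * q2) / (4.0 * math.pi * EPS0 * eps * r ** 2)
```

With 𝜀 = 81, the value quoted in this chapter for water, the force is 81 times smaller than between the same charges in a vacuum.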
Some authors characterize Eq. (2.52) as “the most general expression of Coulomb’s
law”. In this connection, we shall cite Richard P. Feynman: “Many older books on
electricity start with the ’fundamental’ law that the force between two charges
is. . . [Eq. (2.52) is given]. . . , a point of view which is thoroughly unsatisfactory. For
one thing, it is not true in general; it is true only for a world filled with a liquid.
Secondly, it depends on the fact that 𝜀 is a constant which is only approximately
true for most real materials”⁷.
We shall not treat questions relating to the forces acting on a charge inside a
cavity made in a solid dielectric.

2.9. Ferroelectrics

There is a group of substances that can have the property of spontaneous polar-
ization in the absence of an external field. They are called ferroelectrics. This
phenomenon was first discovered for Rochelle salt, and the first detailed investiga-
tion of the electrical properties of this salt was carried out by the Soviet physicists I.
Kurchatov and P. Kobeko.
Ferroelectrics differ from the other dielectrics in a number of features:
1. Whereas the permittivity 𝜀 of ordinary dielectrics is only several units, reach-
ing as an exception several scores (for example, for water 𝜀 = 81), the permittivity
of ferroelectrics may be of the order of several thousands.
2. The dependence of 𝑃 on 𝐸 is not linear (see branch 1 of the curve shown in
Fig. 2.13). Hence, the permittivity depends on the field strength.
3. When the field changes, the values of the polarization 𝑃 (and, therefore,
of the displacement 𝐷 too) lag behind the field strength 𝐸. As a result, 𝑃 and 𝐷
are determined not only by the value of 𝐸 at the given moment, but also by the
preceding values of 𝐸, i.e., they depend on the preceding history of the dielectric.
This phenomenon is called hysteresis (from the Greek word “husterein”—to come
late, be behind). Upon cyclic changes of the field, the dependence of 𝑃 on 𝐸 follows
the curve shown in Fig. 2.13 and called a hysteresis loop. When the field is initially
switched on, the polarization grows with 𝐸 according to branch 1 of the curve.
Diminishing of 𝑃 takes place along branch 2. When 𝐸 vanishes, the substance retains
a value of the polarization 𝑃r called the residual polarization. The polarization
vanishes only under the action of an oppositely directed field 𝐸c . This value of the
field strength is called the coercive force. Upon a further change in 𝐸, branch 3 of
the hysteresis loop is obtained, and so on.
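The qualitative features of a hysteresis loop can be mimicked with a toy model. The sketch below is purely illustrative: the hyperbolic-tangent shape, the function names, and the parameters `p_sat` and `width` are assumptions and do not come from the text or from any real material. It reproduces only branches 2 and 3 of the loop, not the initial branch 1.

```python
import math


def toy_polarization(e, p_sat, e_c, width, field_decreasing):
    """Polarization on one branch of a toy hysteresis loop.

    e                -- field strength
    p_sat            -- saturation polarization (assumed model parameter)
    e_c              -- coercive force: field magnitude at which P vanishes
    width            -- steepness parameter of the assumed tanh shape
    field_decreasing -- True for the branch traced while E decreases
    """
    # The decreasing-field branch crosses P = 0 at E = -e_c,
    # the increasing-field branch at E = +e_c.
    shift = -e_c if field_decreasing else e_c
    return p_sat * math.tanh((e - shift) / width)


def residual_polarization(p_sat, e_c, width):
    """Polarization retained at E = 0 after the field has been reduced to zero."""
    return toy_polarization(0.0, p_sat, e_c, width, field_decreasing=True)
```

In this model the polarization at 𝐸 = 0 depends on which branch is being traced, which is the dependence on the preceding history described above.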

⁷R. P. Feynman, R. B. Leighton, M. Sands. The Feynman Lectures on Physics. Vol. II. Reading,
Mass., Addison-Wesley (1965), p. 10-8.
Fig. 2.13

The behaviour of the polarization of ferroelectrics is similar to that of the
magnetization of ferromagnetics (see Sec. 7.9), and this is the origin of their name.
Only crystalline substances having no centre of symmetry can be ferroelectrics.
For example, the crystals of Rochelle salt belong to the rhombic system (see Sec.
13.2 of Vol. I). The interaction of the particles in a ferroelectric crystal leads to
the fact that their dipole moments line up spontaneously parallel to one another.
In exceptional cases, the identical orientation of the dipole moments extends to the
entire crystal. Ordinarily, however, regions appear in a crystal in whose confines
the dipole moments are parallel to one another, but the directions of polarization
in different regions are different. Thus, the resultant moment of an entire crystal
may equal zero. The regions of spontaneous polarization are also called domains.
Under the action of an external field, the moments of the domains rotate as a single
whole, arranging themselves in the direction of the field.
Every ferroelectric has a temperature at which the substance loses its unusual
properties and becomes a normal dielectric. This temperature is called the Curie
point. Rochelle salt has two Curie points, namely, −15 °C and 22 °C, and it behaves
like a ferroelectric only in the interval between these two temperatures. Its electrical
properties are conventional at temperatures below −15 °C and above 22 °C.
Chapter 3
CONDUCTORS IN AN ELECTRIC FIELD

3.1. Equilibrium of Charges on a Conductor

The carriers of a charge in a conductor are capable of moving under the action of a
vanishingly small force. Therefore, the following conditions must be observed for
the equilibrium of charges on a conductor:
1. The strength of the field everywhere inside the conductor must be zero:
𝑬 = 0. (3.1)
In accordance with Eq. (1.41), this signifies that the potential inside the con-
ductor must be constant (𝜑 = constant).
2. The strength of the field on the surface of the conductor must be directed
along a normal to the surface at every point:
𝑬 = 𝑬𝑛. (3.2)
Consequently, when the charges are in equilibrium, the surface of the con-
ductor will be an equipotential one.
If a charge 𝑞 is imparted to a conducting body, the charge will be distributed so
as to observe conditions of equilibrium. Let us imagine an arbitrary closed surface
completely confined in a body. When the charges are in equilibrium, there is no field
at every point inside the conductor; therefore, the flux of the electric displacement
vector through the surface vanishes. According to Gauss’s theorem, the sum of
the charges inside the surface will also equal zero. This holds for a surface of any
dimensions arbitrarily arranged inside a conductor. Hence, in equilibrium, there
can be no surplus charges anywhere inside a conductor—they will all be distributed
over the surface of the conductor with a certain density 𝜎 .
Since there are no surplus charges in a conductor in the state of equilibrium,
Fig. 3.1

the removal of substance from a volume taken inside the conductor will have no
effect whatsoever on the equilibrium arrangement of the charges. Thus, a surplus
charge will be distributed on a hollow conductor in the same way as on a solid one,
i.e., along its external surface. No surplus charges can be located on the surface of a
cavity in the state of equilibrium. This conclusion also follows from the fact that
the like elementary charges forming the given charge 𝑞 mutually repel one another
and, consequently, tend to take up positions at the farthest distance apart.
Imagine a small cylindrical surface formed by normals to the surface of a
conductor and bases of the magnitude d𝑆, one of which is inside and the other
outside the conductor (Fig. 3.1). The flux of the electric displacement vector through
the inner part of the surface equals zero because 𝑬 and, consequently, 𝑫 vanish
inside the conductor. Outside the conductor in direct proximity to it, the field
strength 𝑬 is directed along a normal to the surface. Hence, for the side surface
of the cylinder protruding outward, 𝐷𝑛 = 0, and for the outside base 𝐷𝑛 = 𝐷 (the
outside base is assumed to be very close to the surface of the conductor). Hence,
the displacement flux through the surface being considered is 𝐷 d𝑆, where 𝐷 is the
value of the displacement in direct proximity to the surface of the conductor. The
cylinder contains an extraneous charge 𝜎 d𝑆 (𝜎 is the charge density at the given
spot on the surface of the conductor).
Applying Gauss’s theorem, we get 𝐷 d𝑆 = 𝜎 d𝑆, i.e., 𝐷 = 𝜎 . We thus see that the
strength of the field near the surface of the conductor is
𝐸 = 𝜎/(𝜀0 𝜀) (3.3)
where 𝜀 is the permittivity of the medium surrounding the conductor [compare
with Eq. (1.123) obtained for the case when 𝜀 = 1].
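Equation (3.3) can be cross-checked against the field of a charged sphere. In the sketch below (function names are hypothetical), the charge 𝑞 of a conducting sphere of radius 𝑅 is spread uniformly with the density 𝜎 = 𝑞/(4𝜋𝑅²), and the field at the surface obtained from Eq. (3.3) is compared with Eq. (2.41) evaluated at 𝑟 = 𝑅.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant, F/m


def surface_field(sigma, eps=1.0):
    """Field strength just outside a conductor, Eq. (3.3)."""
    return sigma / (EPS0 * eps)


def sphere_field(q, r, eps=1.0):
    """Field of a charged sphere in an infinite dielectric, Eq. (2.41)."""
    return q / (4.0 * math.pi * EPS0 * eps * r ** 2)
```

The two expressions should agree at the surface of the sphere for any permittivity of the surrounding medium.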
Let us consider the field produced by the charged conductor shown in Fig. 3.2.
Fig. 3.2 Fig. 3.3

At great distances from the conductor, equipotential surfaces have the shape of a
sphere that is characteristic of a point charge (owing to the lack of space, a spherical
surface is shown in the figure at a small distance from the conductor; the dashed lines
are field lines). As we approach the conductor, the equipotential surfaces become
more and more similar to the surface of the conductor, which is an equipotential
one. Near the projections, the equipotential surfaces are denser, hence, the field
strength is also greater here. It thus follows that the density of the charges on the
projections is especially great [see Eq. (3.3)]. We can arrive at the same conclusion
by taking into account that owing to their mutual repulsion, charges tend to take
up positions as far as possible from one another.
Near depressions in a conductor, the equipotential surfaces have a lower density
(see Fig. 3.3). Accordingly, the field strength and the density of the charges at these
spots will be smaller. In general, the density of charges with a given potential
of a conductor is determined by the curvature of the surface—it grows with an
increase in the positive curvature (convexity) and diminishes with an increase in the
negative curvature (concavity). The density of charges is especially high on sharp
points. Consequently, the field strength near such points may be so great that the
gas molecules surrounding the conductor become ionized. Ions of the sign opposite
to that of 𝑞 are attracted to the conductor and neutralize its charge. Ions of the same
sign as 𝑞 begin to move away from the conductor, carrying along neutral molecules
of the gas. The result is a noticeable motion of the gas called an electric wind. The
charge of the conductor diminishes, it flows off the point, as it were, and is carried
away by the wind. This phenomenon is therefore called emanation of a charge from
a point.
Fig. 3.4

3.2. A Conductor in an External Electric Field

When an uncharged conductor is introduced into an electric field, the charge carriers
come into motion: the positive ones in the direction of the vector 𝑬, the negative
ones in the opposite direction. As a result, charges of opposite signs called induced
charges appear at the ends of the conductor (Fig. 3.4, the dashed lines depict the
external field lines). The field of these charges is directed oppositely to the external
field. Hence, the accumulation of charges at the ends of a conductor leads to
weakening of the field in it. The charge carriers will be redistributed until conditions
(3.1) and (3.2) are observed, i.e., until the strength of the field inside the conductor
vanishes and the field lines outside the conductor are perpendicular to its surface
(see Fig. 3.4). Thus, a neutral conductor introduced into an electric field disrupts
part of the field lines—they terminate on the negative induced charges and begin
again on the positive ones.
The induced charges distribute themselves over the outer surface of a conduc-
tor. If a conductor contains a cavity, then upon equilibrium distribution of the
induced charges, the field inside it vanishes. Electrostatic shielding is based on
this phenomenon. If an instrument is to be protected from the action of external
fields, it is surrounded by a conducting screen. The external field is compensated
inside the screen by the induced charges appearing on its surface. Such a screen
also functions quite well if it is made not solid, but in the form of a dense network.

3.3. Capacitance

A charge 𝑞 imparted to a conductor distributes itself over its surface so that the
strength of the field inside the conductor vanishes. Such a distribution is the only
possible one. Therefore, if we impart to a conductor already carrying the charge
𝑞 another charge of the same magnitude, then the second charge must distribute
itself over the conductor in exactly the same way as the first one. Otherwise, the
charge will set up in the conductor a field differing from zero. We must note that
this holds only for a conductor remote from other bodies (an isolated conductor). If
other bodies are near the conductor, the imparting to the latter of a new portion of
charge will produce either a change in the polarization of these bodies or a change
in the induced charges on them. As a result, similarity in the distribution of different
portions of the charge will be violated.
Thus, charges differing in magnitude distribute themselves on an isolated con-
ductor in a similar way (the ratio of the densities of the charge at two arbitrary
points on the surface of the conductor with any magnitude of the charge will be the
same). It thus follows that the potential of an isolated conductor is proportional to
the charge on it. Indeed, an increase in the charge a certain number of times leads
to an increase in the strength of the field at every point of the space surrounding the
conductor the same number of times. Accordingly, the work needed for transferring
a unit charge from infinity to the surface of a conductor, i.e., the potential of the
conductor, grows the same number of times. Thus, for an isolated conductor
𝑞 = 𝐶𝜑. (3.4)
The constant of proportionality 𝐶 between the potential and the charge is called
the capacitance. From Eq. (3.4), we get
𝐶 = 𝑞/𝜑. (3.5)
In accordance with Eq. (3.5), the capacitance numerically equals the charge which
when imparted to a conductor increases its potential by unity.
Let us calculate the potential of a charged sphere of radius 𝑅. The potential
difference and the field strength are related by Eq. (1.45). We can therefore find the
potential of the sphere 𝜑 by integrating Eq. (2.41) over 𝑟 from 𝑅 to ∞ (we assume
that the potential at infinity equals zero):
𝜑 = (1/4𝜋𝜀0) ∫_𝑅^∞ (𝑞/(𝜀𝑟²)) d𝑟 = (1/4𝜋𝜀0)(𝑞/(𝜀𝑅)). (3.6)
Comparing Eqs. (3.5) and (3.6), we find that the capacitance of an isolated sphere of
radius 𝑅 immersed in a homogeneous infinite dielectric of permittivity 𝜀 is
𝐶 = 4𝜋 𝜀0 𝜀𝑅. (3.7)
The unit of capacitance is the capacitance of a conductor whose potential
changes by 1 V when a charge of 1 C is imparted to it. This unit of capacitance is
called the farad (F). In the Gaussian system, the formula for the capacitance of an
isolated sphere has the form
𝐶 = 𝜀𝑅. (3.8)
Since 𝜀 is a dimensionless quantity, the capacitance determined by Eq. (3.8) has the
dimension of length. The unit of capacitance is the capacitance of an isolated sphere
with a radius of 1 cm in a vacuum. This unit of capacitance is called the centimetre.
According to Eq. (3.5),
$$1\ \text{F} = \frac{1\ \text{C}}{1\ \text{V}} = \frac{3\times 10^{9}}{1/300}\ \text{cgse}_C = 9\times 10^{11}\ \text{cm}. \tag{3.9}$$
An isolated sphere having a radius of 9 × 10¹¹ cm, i.e., a radius roughly 1400 times
greater than that of the Earth, would have a capacitance of 1 F. We can thus see
that the farad is a very large unit. For this reason, submultiples of a farad are used
in practice—the millifarad (mF), the microfarad (µF), the nanofarad (nF), and the
picofarad (pF) (see Vol. I, Table 3.1).
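The orders of magnitude quoted for an isolated sphere are easy to check from Eq. (3.7). The following is a minimal sketch in Python; the value of the Earth's mean radius and the names `sphere_capacitance`, `c_earth`, and `r_one_farad` are assumptions introduced here for illustration only.

```python
import math

EPS0 = 8.854e-12  # electric constant, F/m


def sphere_capacitance(radius_m, eps_r=1.0):
    """Capacitance of an isolated sphere, Eq. (3.7): C = 4*pi*eps0*eps*R."""
    return 4 * math.pi * EPS0 * eps_r * radius_m


R_EARTH = 6.37e6  # assumed mean radius of the Earth, m

# Capacitance of a sphere of the Earth's size in a vacuum, in farads.
c_earth = sphere_capacitance(R_EARTH)

# Radius of a sphere whose capacitance in a vacuum is exactly 1 F, in metres.
r_one_farad = 1.0 / (4 * math.pi * EPS0)

print(f"C(Earth) = {c_earth * 1e6:.0f} uF")
print(f"R for 1 F = {r_one_farad:.2e} m = {r_one_farad / R_EARTH:.0f} Earth radii")
```

With these inputs the capacitance of an Earth-sized sphere comes out at several hundred microfarads, and the radius needed for 1 F is of the order of 10¹¹ cm.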

3.4. Capacitors

Isolated conductors have a small capacitance. Even a sphere of the Earth’s size has
a capacitance of only 700 µF. Devices are needed in practice, however, that with
a low potential relative to the surrounding bodies would accumulate charges of
an appreciable magnitude (i.e., would have a high charge “capacity”). Such devices,
called capacitors, are based on the fact that the capacitance of a conductor grows
when other bodies are brought close to it. This is due to the circumstance that under
the action of the field set up by the charged conductor, induced (on a conductor)
or bound (on a dielectric) charges appear on the body brought up to it. Charges
of the sign opposite to that of the charge 𝑞 of the conductor will be closer to the
conductor than charges of the same sign as 𝑞 and, consequently, will have a greater
influence on its potential. Therefore, when a body is brought close to a charged
conductor, the potential of the latter diminishes in absolute value. According to
Eq. (3.5), this signifies an increase in the capacitance of the conductor.
Capacitors are made in the form of two conductors placed close to each other.
The conductors forming a capacitor are called its plates. To prevent external bodies
from influencing the capacitance of a capacitor, the plates are shaped and arranged
relative to each other so that the field set up by the charges accumulating on them
is concentrated inside the capacitor. This condition is satisfied (see Sec. 1.14) by
two plates arranged close to each other, two coaxial cylinders, and two concentric
spheres. Accordingly, parallel-plate (plane), cylindrical, and spherical capacitors are
encountered. Since the field is confined inside a capacitor, the electric displacement
lines begin on one plate and terminate on the other. Consequently, the extraneous

charges produced on the plates have the same magnitude and are opposite in sign.
The basic characteristic of a capacitor is its capacitance, by which is meant a
quantity proportional to the charge 𝑞 and inversely proportional to the potential
difference between the plates:
$$C = \frac{q}{\varphi_1 - \varphi_2}. \tag{3.10}$$
The potential difference 𝜑1 − 𝜑2 is called the voltage across the relevant points¹.
We shall use the symbol 𝑈 to designate the voltage. Hence, Eq. (3.10) can be written
as follows:
$$C = \frac{q}{U}. \tag{3.11}$$
Here, 𝑈 is the voltage across the plates.
The capacitance of capacitors is measured in the same units as that of isolated
conductors (see the preceding section).
The magnitude of the capacitance is determined by the geometry of the capacitor
(the shape and dimensions of the plates and their separation distance), and also by
the dielectric properties of the medium filling the space between the plates. Let us
find the equation for the capacitance of a parallel-plate capacitor. If the area of a
plate is 𝑆 and the charge on it is 𝑞, then the strength of the field between the plates is
$$E = \frac{\sigma}{\varepsilon_0\varepsilon} = \frac{q}{\varepsilon_0\varepsilon S}$$
[see Eqs. (1.121) and (2.33); 𝜀 is the permittivity of the medium filling the gap between
the plates].
In accordance with Eq. (1.45), the potential difference between the plates is
$$\varphi_1 - \varphi_2 = Ed = \frac{qd}{\varepsilon_0\varepsilon S}.$$
Hence, for the capacitance of a parallel-plate capacitor, we get
$$C = \frac{\varepsilon_0\varepsilon S}{d} \tag{3.12}$$
where 𝑆 is the area of a plate, 𝑑 is the separation distance of the plates, and 𝜀 is the
permittivity of the substance filling the gap.
It must be noted that the accuracy of determining the capacitance of a real
parallel-plate capacitor by Eq. (3.12) is the greater, the smaller is the separation
distance 𝑑 in comparison with the linear dimensions of the plates.
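Equation (3.12) can be turned into a short numerical estimate. The sketch below is only an illustration: the plate dimensions, the gap, and the permittivity of the filling are hypothetical values chosen for the example, and fringing of the field at the plate edges is ignored, as in the derivation.

```python
EPS0 = 8.854e-12  # electric constant, F/m


def parallel_plate_capacitance(area_m2, gap_m, eps_r=1.0):
    """Eq. (3.12): C = eps0 * eps * S / d, edge effects neglected."""
    return EPS0 * eps_r * area_m2 / gap_m


# Hypothetical capacitor: 10 cm x 10 cm plates separated by 1 mm.
c_air = parallel_plate_capacitance(area_m2=0.01, gap_m=1e-3)
# The same capacitor filled with a dielectric of assumed permittivity 5.
c_filled = parallel_plate_capacitance(area_m2=0.01, gap_m=1e-3, eps_r=5.0)

print(f"air gap: {c_air * 1e12:.1f} pF, filled: {c_filled * 1e12:.1f} pF")
```

The example shows how small the capacitance of a laboratory-size capacitor is when expressed in farads, and that filling the gap raises it in proportion to the permittivity.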
It can be seen from Eq. (3.12) that the dimension of the electric constant 𝜀0 equals
the dimension of capacitance divided by that of length. Accordingly, 𝜀0 is measured
in farads per metre [see Eq. (1.12)].
¹A more general definition of the quantity called voltage will be given in Sec. 5.3 [see Eq. (5.18)].
If we disregard the dispersion of the field near the plate edges, we can easily
obtain the following equation for the capacitance of a cylindrical capacitor:
$$C = \frac{2\pi\varepsilon_0\varepsilon l}{\ln(R_2/R_1)} \tag{3.13}$$
where 𝑙 is the length of the capacitor, and 𝑅1 and 𝑅2 are the radii of the internal and external
plates.
The accuracy of determining the capacitance of a real capacitor by Eq. (3.13)
is the greater, the smaller is the separation distance of the plates 𝑑 = 𝑅2 − 𝑅1 in
comparison with 𝑙 and 𝑅1 .
The capacitance of a spherical capacitor is
$$C = 4\pi\varepsilon_0\varepsilon\,\frac{R_1 R_2}{R_2 - R_1} \tag{3.14}$$
where 𝑅1 and 𝑅2 are the radii of the internal and external plates.
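When the gap between the plates is small in comparison with their radii, Eqs. (3.13) and (3.14) should both pass over into the parallel-plate formula (3.12) with 𝑆 equal to the area of the inner plate. A minimal numerical sketch of this limit follows; the radii, length, and gap are hypothetical values chosen only for illustration.

```python
import math

EPS0 = 8.854e-12  # electric constant, F/m


def parallel_plate_capacitance(area, gap, eps_r=1.0):
    """Eq. (3.12)."""
    return EPS0 * eps_r * area / gap


def cylindrical_capacitance(length, r1, r2, eps_r=1.0):
    """Eq. (3.13): C = 2*pi*eps0*eps*l / ln(R2/R1)."""
    return 2 * math.pi * EPS0 * eps_r * length / math.log(r2 / r1)


def spherical_capacitance(r1, r2, eps_r=1.0):
    """Eq. (3.14): C = 4*pi*eps0*eps*R1*R2 / (R2 - R1)."""
    return 4 * math.pi * EPS0 * eps_r * r1 * r2 / (r2 - r1)


# Hypothetical geometry: inner radius 10 cm, gap 0.1 mm, cylinder length 50 cm.
r1, gap, length = 0.10, 1e-4, 0.50
r2 = r1 + gap

c_cyl = cylindrical_capacitance(length, r1, r2)
c_cyl_flat = parallel_plate_capacitance(2 * math.pi * r1 * length, gap)

c_sph = spherical_capacitance(r1, r2)
c_sph_flat = parallel_plate_capacitance(4 * math.pi * r1 ** 2, gap)

print(f"cylinder / flat estimate: {c_cyl / c_cyl_flat:.5f}")
print(f"sphere   / flat estimate: {c_sph / c_sph_flat:.5f}")
```

Both ratios differ from unity only by a correction of the order of 𝑑/𝑅1.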
Apart from the capacitance, every capacitor is characterized by the maximum
voltage 𝑈max that may be applied across its plates without the danger of a breakdown.
When this voltage is exceeded, a spark jumps across the space between the plates.
The result is destruction of the dielectric and failure of the capacitor.

Chapter 4
ENERGY OF
AN ELECTRIC FIELD

4.1. Energy of a Charged Conductor

The charge 𝑞 on a conductor can be considered as a system of point charges 𝛥𝑞.


In Sec. 1.7, we obtained the following expression for the energy of interaction of a
system of charges [see Eq. (1.39)]:

$$W_\mathrm{p} = \frac{1}{2}\sum_i q_i\varphi_i. \tag{4.1}$$
Here, 𝜑𝑖 is the potential set up by all the charges except 𝑞𝑖 at the point where the
charge 𝑞𝑖 is.
The surface of a conductor is equipotential. Therefore, the potentials of the
points where the point charges 𝛥𝑞 are located are identical and equal the potential
𝜑 of the conductor. Using Eq. (4.1), we get the following expression for the energy
of a charged conductor
$$W_\mathrm{p} = \frac{1}{2}\sum\varphi\,\Delta q = \frac{1}{2}\varphi\sum\Delta q = \frac{1}{2}\varphi q. \tag{4.2}$$
Taking into account Eq. (3.5), we can write that
$$W_\mathrm{p} = \frac{\varphi q}{2} = \frac{q^2}{2C} = \frac{C\varphi^2}{2}. \tag{4.3}$$
Any of these expressions gives the energy of a charged conductor.

4.2. Energy of a Charged Capacitor

Assume that the potential of a capacitor plate carrying the charge +𝑞 is 𝜑1 and that of
a plate carrying the charge −𝑞 is 𝜑2 . Consequently, each of the elementary charges

Fig. 4.1

𝛥𝑞 into which the charge +𝑞 can be divided is at a point with the potential 𝜑1 , and
each of the charges into which the charge −𝑞 can be divided is at a point with the
potential 𝜑2 . By Eq. (4.1), the energy of such a system of charges is
$$W_\mathrm{p} = \frac{1}{2}\left[(+q)\varphi_1 + (-q)\varphi_2\right] = \frac{1}{2}q(\varphi_1 - \varphi_2) = \frac{1}{2}qU. \tag{4.4}$$
Using Eq. (3.11), we can write three expressions for the energy of a charged capacitor:
$$W_\mathrm{p} = \frac{qU}{2} = \frac{q^2}{2C} = \frac{CU^2}{2}. \tag{4.5}$$
Equation (4.5) differs from (4.3) only in containing 𝑈 instead of 𝜑.
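That the three expressions in Eq. (4.5) are equivalent can be seen at a glance from a numerical example. The sketch below assumes a hypothetical capacitor of 0.1 µF carrying a charge of 2 µC; the function name `capacitor_energy_forms` is introduced here only for illustration.

```python
def capacitor_energy_forms(charge, capacitance):
    """Return the three forms of Eq. (4.5) for a given charge and capacitance."""
    voltage = charge / capacitance  # Eq. (3.11): U = q / C
    return (
        charge * voltage / 2,              # qU / 2
        charge ** 2 / (2 * capacitance),   # q^2 / (2C)
        capacitance * voltage ** 2 / 2,    # CU^2 / 2
    )


# Hypothetical values: q = 2 microcoulombs, C = 0.1 microfarad.
forms = capacitor_energy_forms(charge=2e-6, capacitance=1e-7)
for name, value in zip(("qU/2", "q^2/(2C)", "CU^2/2"), forms):
    print(f"{name:>9} = {value:.3e} J")
```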
The expression for the potential energy permits us to find the force with which
the plates of a parallel-plate capacitor attract each other. Let us assume that the
separation distance of the plates can be changed. We shall associate the origin of
the 𝑥-axis with the left-hand plate (Fig. 4.1). The coordinate 𝑥 of the second plate
will, therefore, determine the separation distance 𝑑 of the plates. According to Eqs.
(3.12) and (4.5), we have
$$W_\mathrm{p} = \frac{q^2}{2C} = \frac{q^2}{2\varepsilon_0\varepsilon S}\,x.$$
Let us differentiate this expression with respect to 𝑥, assuming that the charge on
the plates is constant (the capacitor is disconnected from a voltage source). As a
result, we obtain the projection of the force exerted on the right-hand plate onto
the 𝑥-axis:
$$F_x = -\frac{\partial W_\mathrm{p}}{\partial x} = -\frac{q^2}{2\varepsilon_0\varepsilon S}.$$
The magnitude of this expression gives the force with which the plates attract each
other:
$$F = \frac{q^2}{2\varepsilon_0\varepsilon S}. \tag{4.6}$$
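The step from the energy to the force can be reproduced numerically: the derivative of 𝑊p with respect to 𝑥 at constant charge is approximated by a central difference and compared with Eq. (4.6). This is only a sketch under assumed, hypothetical values of the charge, plate area, and gap; the step `h` of the difference quotient is likewise an arbitrary choice.

```python
EPS0 = 8.854e-12  # electric constant, F/m


def energy(q, x, area, eps_r=1.0):
    """W_p = q^2 * x / (2 * eps0 * eps * S) for a plate separation x."""
    return q ** 2 * x / (2 * EPS0 * eps_r * area)


def force_x(q, x, area, eps_r=1.0, h=1e-9):
    """F_x = -dW_p/dx at constant charge, by a central difference."""
    return -(energy(q, x + h, area, eps_r) - energy(q, x - h, area, eps_r)) / (2 * h)


# Hypothetical capacitor: q = 10 nC, S = 100 cm^2, x = 1 mm, air gap.
q, area, x = 1e-8, 0.01, 1e-3

f_numeric = force_x(q, x, area)          # projection onto the x-axis
f_formula = q ** 2 / (2 * EPS0 * area)   # magnitude according to Eq. (4.6)

print(f"F_x (numerical) = {f_numeric:.4e} N")
print(f"F   (Eq. 4.6)   = {f_formula:.4e} N")
```

The negative sign of the numerical projection corresponds to a force directed toward the left-hand plate, i.e., to attraction.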

Now, let us try to calculate the force of attraction between the plates of a parallel-
plate capacitor as the product of the strength of the field produced by one of the
plates and the charge concentrated on the other one. By Eq. (1.120), the strength of
the field set up by one plate is
$$E = \frac{\sigma}{2\varepsilon_0} = \frac{q}{2\varepsilon_0 S}. \tag{4.7}$$
A dielectric weakens the field in the space between the plates 𝜀 times, but this occurs
only inside the dielectric [see Eq. (2.33) and the related text]. The charges on the
plates are outside the dielectric and are, therefore, acted upon by the field strength
given by Eq. (4.7). Multiplying the charge of a plate 𝑞 by this strength, we get the
following expression for the force:
$$F_0 = \frac{q^2}{2\varepsilon_0 S}. \tag{4.8}$$
Equations (4.6) and (4.8) do not coincide. The value of the force given by Eq. (4.6)
obtained from the expression for the energy agrees with experimental data. The
explanation is that apart from the “electric” force given by Eq. (4.8), the plates
experience mechanical forces from the side of the dielectric that tend to spread
them apart (see Sec. 2.8; we must note that we have in mind a fluid dielectric). There
is a dispersed field at the edges of the plates whose magnitude diminishes with
an increasing distance from the edges (Fig. 4.2). The molecules of the dielectric
have a dipole moment and experience the action of a force pulling them into the
region with the stronger field [see Eq. (1.62)]. The result is an increase in the pressure
between the plates and the appearance of a force that weakens the force given by
Eq. (4.8) 𝜀 times.
If a charged capacitor with an air gap is partially immersed in a liquid dielectric,
the latter will be drawn into the space between the plates (Fig. 4.3). This phenomenon
is explained as follows. The permittivity of air virtually equals unity. Consequently,
before the plates are immersed in the dielectric, we can consider that the capacitance
of the capacitor is 𝐶0 = 𝜀0 𝑆/𝑑, and its energy is 𝑊0 = 𝑞²/(2𝐶0). When the
space between the plates is partially filled with the dielectric, the capacitor can be
considered as two capacitors connected in parallel, one of which has a plate area
of 𝑥𝑆 (𝑥 is the relative part of the space filled with the liquid) and is filled with a
dielectric for which 𝜀 > 1, and the other has a plate area equal to (1 − 𝑥)𝑆. In the
parallel connection of capacitors, their capacitances are summated:
$$C = C_1 + C_2 = \frac{\varepsilon_0 S(1 - x)}{d} + \frac{\varepsilon_0\varepsilon S x}{d} = C_0 + \frac{\varepsilon_0(\varepsilon - 1)S}{d}\,x > C_0.$$
Since 𝐶 > 𝐶0 , the energy 𝑊 = 𝑞²/(2𝐶) will be smaller than 𝑊0 (the charge 𝑞 is
assumed to be constant—the capacitor was disconnected from the voltage source

Fig. 4.2 Fig. 4.3

before being immersed in the liquid). Hence, the filling of the space between the
plates with the dielectric is favourable from the energy viewpoint. This is why the
dielectric is drawn into the capacitor and its level in the space separating the plates
rises. This, in turn, results in an increase in the potential energy of gravity. In
the long run, the level of the dielectric in the space will establish itself at a certain
height corresponding to the minimum total energy (electrical and gravitational).
The above phenomenon is similar to the capillary rise of a liquid in the narrow
space between plates (see Sec. 14.5 of Vol. I).
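The equilibrium level can be estimated by minimizing the total energy numerically. The sketch below rests on assumptions that are not part of the text: the plates are vertical rectangles of width `width` and height `height` whose lower edge just touches the free surface of the liquid, the liquid column of height 𝑥·`height` has its centre of mass at half this height, and the charge, dimensions, density, and permittivity are hypothetical values.

```python
EPS0 = 8.854e-12  # electric constant, F/m
G = 9.81          # free-fall acceleration, m/s^2


def total_energy(x, q, width, height, gap, eps_r, density):
    """Electrical plus gravitational energy for a filled fraction x (0..1)."""
    c0 = EPS0 * width * height / gap            # capacitance with an air gap
    c = c0 * (1 + (eps_r - 1) * x)              # two capacitors in parallel
    electrical = q ** 2 / (2 * c)               # constant charge on the plates
    rise = x * height                           # height of the liquid column
    gravitational = density * G * width * gap * rise ** 2 / 2
    return electrical + gravitational


def equilibrium_fraction(q, width, height, gap, eps_r, density, steps=100_000):
    """Scan x over [0, 1] and return the value giving the minimum total energy."""
    candidates = (i / steps for i in range(steps + 1))
    return min(
        candidates,
        key=lambda x: total_energy(x, q, width, height, gap, eps_r, density),
    )


# Hypothetical set-up: 10 cm x 10 cm plates, 1 mm gap, oil-like liquid.
params = dict(width=0.10, height=0.10, gap=1e-3, eps_r=2.2, density=900.0)
q = 1.77e-7  # charge corresponding to roughly 2 kV across the air-filled gap

x_eq = equilibrium_fraction(q, **params)
print(f"filled fraction = {x_eq:.4f}, rise = {x_eq * params['height'] * 1e3:.2f} mm")
```

With these values the minimum lies inside the interval, at a rise of a few millimetres: the gain in electrical energy is balanced by the growth of the gravitational energy.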
The drawing of the dielectric into the space between plates can also be explained
from the microscopic viewpoint. There is a nonuniform field at the edges of the
capacitor plates. The molecules of the dielectric have an intrinsic dipole moment or
acquire it under the action of the field; therefore, they experience forces that tend to
transfer them to the region of the strong field, i.e., into the capacitor. These forces
cause the liquid to be drawn into the space between the plates until the electric
forces exerted on the liquid at the plate edges are balanced by the weight of the
liquid column.

4.3. Energy of an Electric Field

The energy of a charged capacitor can be expressed through quantities characterizing
the electric field in the space between the plates. Let us do this for a
parallel-plate capacitor. Introducing expression (3.12) for the capacitance into the
equation 𝑊p = 𝐶𝑈²/2 [see Eq. (4.5)], we get
$$W_\mathrm{p} = \frac{CU^2}{2} = \frac{\varepsilon_0\varepsilon S U^2}{2d} = \frac{\varepsilon_0\varepsilon}{2}\left(\frac{U}{d}\right)^2 Sd.$$
The ratio 𝑈/𝑑 equals the strength of the field between the plates; the product 𝑆𝑑 is
the volume occupied by the field. Hence,
$$W_\mathrm{p} = \frac{\varepsilon_0\varepsilon E^2}{2}\,V. \tag{4.9}$$
The equation 𝑊p = 𝑞²/(2𝐶) relates the energy of a capacitor to the charge on
its plates, while Eq. (4.9) relates this energy to the field strength. It is logical to ask
the question: where, after all, is the energy localized (i.e., concentrated), what is the
carrier of the energy—charges or a field? This question cannot be answered within
the scope of electrostatics, which studies the fields of fixed charges that are constant
in time. Constant fields and the charges producing them cannot exist separately
from each other. Fields varying in time, however, can exist independently of the
charges producing them and propagate in space in the form of electromagnetic
waves. Experiments show that electromagnetic waves transfer energy. In partic-
ular, the energy due to which life exists on the Earth is supplied from the Sun by
electromagnetic waves; the energy that causes a radio receiver to sound is carried
from the transmitting station by electromagnetic waves, etc. These facts make us
acknowledge the circumstance that the carrier of energy is a field.
If a field is homogeneous (which is the case in a parallel-plate capacitor), the
energy confined in it is distributed in space with a constant density 𝑤 equal to the
energy of the field divided by the volume it occupies. Inspection of Eq. (4.9) shows
that the density of the energy of a field of strength 𝐸 set up in a medium with the
permittivity 𝜀 is
$$w = \frac{\varepsilon_0\varepsilon E^2}{2}. \tag{4.10}$$
With account taken of Eq. (2.21), we can write Eq. (4.10) as follows:
$$w = \frac{\varepsilon_0\varepsilon E^2}{2} = \frac{ED}{2} = \frac{D^2}{2\varepsilon_0\varepsilon}. \tag{4.11}$$
In an isotropic dielectric, the directions of the vectors 𝑬 and 𝑫 coincide. We
can, therefore, write the equation for the energy density in the form
$$w = \frac{\mathbf{E}\cdot\mathbf{D}}{2}.$$
Substituting for 𝑫 in this equation its value from Eq. (2.18), we get the following
expression for 𝑤:
$$w = \frac{\mathbf{E}\cdot(\varepsilon_0\mathbf{E} + \mathbf{P})}{2} = \frac{\varepsilon_0 E^2}{2} + \frac{\mathbf{E}\cdot\mathbf{P}}{2}. \tag{4.12}$$
The first addend in this expression coincides with the energy density of the field 𝑬
in a vacuum. The second addend, as we shall proceed to prove, is the energy spent
for polarization of the dielectric.
The polarization of a dielectric consists in that the charges contained in the
molecules are displaced from their positions under the action of the electric field 𝑬.
The work done to displace the charges 𝑞𝑖 over the distances d𝒓𝑖 per unit volume of
the dielectric is
$$\mathrm{d}A = \sum_{V=1} q_i\,\mathbf{E}\cdot\mathrm{d}\mathbf{r}_i = \mathbf{E}\cdot\mathrm{d}\Big(\sum_{V=1} q_i\mathbf{r}_i\Big)$$
(we consider for simplicity's sake that the field is homogeneous; the sums are taken
over the charges contained in a unit volume). According to
Eq. (2.1), $\sum_{V=1} q_i\mathbf{r}_i$ equals the dipole moment of a unit volume, i.e., the polarization
of the dielectric 𝑷. Hence,
d𝐴 = 𝑬 d𝑷. (4.13)
The vector 𝑷 is related to the vector 𝑬 by the expression 𝑷 = 𝜒𝜀0 𝑬 [see Eq. (2.5)].
Hence, d𝑷 = 𝜒𝜀0 d𝑬. Using this value of d𝑷 in Eq. (4.13), we get the expression
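$$\mathrm{d}A = \chi\varepsilon_0\,\mathbf{E}\cdot\mathrm{d}\mathbf{E} = \mathrm{d}\left(\frac{\chi\varepsilon_0 E^2}{2}\right) = \mathrm{d}\left(\frac{\mathbf{E}\cdot\mathbf{P}}{2}\right).$$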
Finally, integration gives us the following expression for the work done to polarize
a unit volume of the dielectric:
$$A = \frac{\mathbf{E}\cdot\mathbf{P}}{2}, \tag{4.14}$$
which coincides with the second addend in Eq. (4.12). Thus, expressions (4.11), apart
from the intrinsic energy of a field 𝜀0 𝐸²/2, include the energy (𝑬 · 𝑷)/2 spent for
the polarization of the dielectric when the field is set up.
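The division of the energy density into the two addends of Eq. (4.12) can be illustrated numerically. The sketch assumes the usual relation 𝜀 = 1 + 𝜒 between the permittivity and the susceptibility of an isotropic dielectric; the field strength and permittivity used are hypothetical values.

```python
EPS0 = 8.854e-12  # electric constant, F/m


def energy_density_parts(e_field, eps_r):
    """Split w = eps0*eps*E^2/2 into the vacuum part and the polarization part."""
    chi = eps_r - 1.0                        # susceptibility, assuming eps = 1 + chi
    polarization = chi * EPS0 * e_field      # P = chi * eps0 * E
    vacuum_part = EPS0 * e_field ** 2 / 2    # first addend of Eq. (4.12)
    polarization_part = e_field * polarization / 2  # second addend, Eq. (4.14)
    return vacuum_part, polarization_part


# Hypothetical field of 100 kV/m in a dielectric with eps = 3.
e_field, eps_r = 1.0e5, 3.0
vacuum_part, polarization_part = energy_density_parts(e_field, eps_r)
total = EPS0 * eps_r * e_field ** 2 / 2      # Eq. (4.10)

print(f"vacuum part       = {vacuum_part:.4e} J/m^3")
print(f"polarization part = {polarization_part:.4e} J/m^3")
print(f"sum / Eq. (4.10)  = {(vacuum_part + polarization_part) / total:.6f}")
```

For 𝜀 = 3, two-thirds of the energy density is accounted for by the work of polarization.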
Knowing the density of the field energy at every point, we can find the energy of
the field confined in any volume 𝑉 . For this purpose, we must calculate the integral
$$W = \int_V w\,\mathrm{d}V = \int_V \frac{\varepsilon_0\varepsilon E^2}{2}\,\mathrm{d}V. \tag{4.15}$$
Let us calculate, as an example, the energy of the field of a charged conducting
sphere of radius 𝑅 placed in a homogeneous infinite dielectric. The field strength
here is a function only of 𝑟:
$$E = \frac{1}{4\pi\varepsilon_0}\frac{q}{\varepsilon r^2}.$$
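Let us divide the space surrounding our sphere into concentric spherical layers of
thickness d𝑟. The volume of a layer is d𝑉 = 4𝜋𝑟² d𝑟. It contains the energy
$$\mathrm{d}W = w\,\mathrm{d}V = \frac{\varepsilon_0\varepsilon}{2}\left(\frac{1}{4\pi\varepsilon_0}\frac{q}{\varepsilon r^2}\right)^2 4\pi r^2\,\mathrm{d}r = \frac{1}{2}\frac{q^2}{4\pi\varepsilon_0\varepsilon}\frac{\mathrm{d}r}{r^2}.$$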
The energy of the field is
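$$W = \int \mathrm{d}W = \frac{1}{2}\frac{q^2}{4\pi\varepsilon_0\varepsilon}\int_R^\infty \frac{\mathrm{d}r}{r^2} = \frac{1}{2}\frac{q^2}{4\pi\varepsilon_0\varepsilon R} = \frac{q^2}{2C}$$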
[according to Eq. (3.7), 4𝜋 𝜀0 𝜀𝑅 is the capacitance of a sphere].
The expression we have obtained coincides with that for the energy of a con-
ductor having the capacitance 𝐶 and carrying the charge 𝑞 [see Eq. (4.3)].
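The same result can be obtained by summing the energy of thin spherical layers numerically, which mirrors the derivation step by step. In the sketch below, the layers are spaced geometrically and the summation is cut off at a finite outer radius, so a small part of the energy (of the order of 𝑅/𝑟max) is deliberately left out; the charge, radius, and number of layers are hypothetical.

```python
import math

EPS0 = 8.854e-12  # electric constant, F/m


def field_energy_outside_sphere(q, radius, eps_r=1.0, r_max_factor=1e4, shells=20_000):
    """Sum w * dV over spherical layers from R to r_max_factor * R."""
    ratio = r_max_factor ** (1.0 / shells)   # geometric growth of layer radii
    total = 0.0
    r_in = radius
    for _ in range(shells):
        r_out = r_in * ratio
        r_mid = math.sqrt(r_in * r_out)      # representative radius of the layer
        e_field = q / (4 * math.pi * EPS0 * eps_r * r_mid ** 2)
        w = EPS0 * eps_r * e_field ** 2 / 2  # energy density, Eq. (4.10)
        total += w * 4 * math.pi * r_mid ** 2 * (r_out - r_in)
        r_in = r_out
    return total


# Hypothetical sphere: q = 1 nC, R = 5 cm, in a vacuum.
q, radius = 1e-9, 0.05
w_numeric = field_energy_outside_sphere(q, radius)
w_formula = q ** 2 / (2 * 4 * math.pi * EPS0 * radius)   # q^2 / (2C), Eq. (3.7)

print(f"numerical sum = {w_numeric:.6e} J")
print(f"q^2 / (2C)    = {w_formula:.6e} J")
```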

Chapter 5
STEADY ELECTRIC CURRENT

5.1. Electric Current

If a total charge other than zero is carried through an imaginary surface, an electric
current (or simply a current) is said to flow through this surface. A current can
flow in solids (metals, semiconductors), liquids (electrolytes), and in gases (the flow
of a current through a gas is called a gas discharge).
For a current to flow, the given body (or given medium) must contain charged
particles that can move within the limits of the entire body. Such particles are called
current carriers. The latter may be electrons, or ions, or, finally, macroscopic
particles carrying a surplus charge (for example, charged dust particles and droplets).
A current is produced if there is an electric field inside a body. The charge
carriers participate in the molecular thermal motion and, consequently, travel with
a certain velocity 𝒗 even in the absence of a field. But in this case, an identical
number of carriers of either sign pass on the average in both directions through an
arbitrary area mentally drawn in the body, so that the current is zero. When a field
is switched on, ordered motion with the velocity 𝒖 is superposed onto the chaotic
motion of the carriers with the velocity 𝒗¹. The velocity of the carriers will thus be
𝒗 + 𝒖. Since the mean value of 𝒗 (but not of 𝑣) equals zero, the mean velocity
of the carriers is ⟨𝒖⟩:
⟨𝒗 + 𝒖⟩ = ⟨𝒗⟩ + ⟨𝒖⟩ = ⟨𝒖⟩.
It follows from what has been said above that an electric current can be defined as
the ordered motion of electric charges.
A quantitative characteristic of an electric current is the magnitude of the charge
carried through the surface being considered in unit time. It is called the current
strength, or more often simply the current. We must note that a current is in
essence a flow of a charge through a surface (compare with the flow of a fluid,
energy flux, etc.).
¹Similarly, in a gas flow, ordered motion is superposed onto the chaotic thermal motion of the
molecules.
If the charge d𝑞 is carried through a surface during the time d𝑡, then the current
is
$$I = \frac{\mathrm{d}q}{\mathrm{d}t}. \tag{5.1}$$
An electric current may be produced by the motion of either positive or negative
charges. The transfer of a negative charge in one direction is equivalent to the
transfer of a positive charge of the same magnitude in the opposite direction. If
a current is produced by carriers of both signs, the positive carriers transferring
the charge d𝑞+ in one direction through the given surface during the time d𝑡, and
the negative carriers the charge d𝑞− in the opposite direction during the same time,
then
$$I = \frac{\mathrm{d}q^{+}}{\mathrm{d}t} + \frac{|\mathrm{d}q^{-}|}{\mathrm{d}t}.$$
The direction of motion of the positive carriers has been conventionally as-
sumed to be the direction of a current.
A current may be distributed non-uniformly over the surface through which it
is flowing. A current can be characterized in greater detail by means of the current
density vector 𝒋. This vector numerically equals the current d𝐼 through the area
d𝑆⊥ arranged at the given point perpendicular to the direction of motion of the
carriers divided by the magnitude of this area:
$$j = \frac{\mathrm{d}I}{\mathrm{d}S_\perp}. \tag{5.2}$$
The direction of 𝒋 is taken as that of the velocity vector 𝒖+ of the ordered motion of
the positive carriers (or as the direction opposite to that of the vector 𝒖− ).
The field of the current density vector can be depicted by means of current
lines that are constructed in the same way as the streamlines in a flowing liquid, the
lines of the vector 𝑬, etc.
Knowing the current density vector at every point of space, we can find the
current 𝐼 through any surface 𝑆:

$$I = \int_S \mathbf{j}\cdot\mathrm{d}\mathbf{S}. \tag{5.3}$$
It can be seen from Eq. (5.3) that the current is the flux of the current density vector
through a surface [see Eq. (1.74)].
Assume that a unit volume contains 𝑛+ positive carriers and 𝑛− negative ones.
The algebraic value of the carrier charges is 𝑒+ and 𝑒− , respectively. If the carriers
acquire the average velocities 𝑢+ and 𝑢− under the action of the field, then 𝑛+ 𝑢+

positive carriers will pass in unit time through unit area², and they will transfer
the charge 𝑒+ 𝑛+ 𝑢+ . Similarly, the negative carriers will transfer the charge 𝑒− 𝑛− 𝑢−
in the opposite direction. We, thus, get the following expression for the current
density:
𝑗 = 𝑒+ 𝑛+ 𝑢+ + 𝑒− 𝑛− 𝑢− . (5.4)
This expression can be given a vector form:
𝒋 = 𝑒+ 𝑛+ 𝒖+ + 𝑒− 𝑛− 𝒖− (5.5)
(both addends have the same direction: the vector 𝒖−
is directed oppositely to the
vector 𝒋; when it is multiplied by the negative scalar 𝑒− , we get a vector of the same
direction as 𝒋).
The product 𝑒+ 𝑛+ gives the charge density of the positive carriers 𝜌+ . Similarly,
𝑒− 𝑛− gives the charge density of the negative carriers 𝜌− . Hence, Eq. (5.5) can be
written in the form
𝒋 = 𝜌+ 𝒖+ + 𝜌− 𝒖− (5.6)
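Equation (5.4) allows us to estimate how slow the ordered motion of the carriers is. The sketch below treats the one-dimensional case with signed charges and signed velocities; the concentration of conduction electrons in copper (about 8.5 × 10²⁸ m⁻³), the current, and the wire cross section are assumed values used only for illustration.

```python
E_CHARGE = 1.602e-19  # elementary charge, C


def current_density(e_pos, n_pos, u_pos, e_neg, n_neg, u_neg):
    """Eq. (5.4) along one axis: j = e+ n+ u+ + e- n- u- (signed quantities)."""
    return e_pos * n_pos * u_pos + e_neg * n_neg * u_neg


# Hypothetical copper wire: 10 A through a cross section of 1 mm^2.
n_electrons = 8.5e28          # assumed concentration of conduction electrons, m^-3
j_target = 10.0 / 1.0e-6      # current density, A/m^2

# Speed of ordered motion needed to carry this current density.
drift_speed = j_target / (E_CHARGE * n_electrons)

# Electrons carry a negative charge and move against the direction of j.
j_check = current_density(0.0, 0.0, 0.0, -E_CHARGE, n_electrons, -drift_speed)

print(f"drift speed = {drift_speed * 1e3:.2f} mm/s")
print(f"j from Eq. (5.4) = {j_check:.3e} A/m^2")
```

The product of the negative charge and the oppositely directed velocity gives a positive current density, in agreement with the remark on the direction of both addends.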
A current that does not change with time is called steady (do not confuse with
a direct current whose direction is constant, but whose magnitude may vary). For a
steady current, we have
$$I = \frac{q}{t}, \tag{5.7}$$
where 𝑞 is the charge carried through the surface being considered during the finite
time 𝑡.
In the SI, the unit of current, the ampere (A), is a basic one. Its definition will be
given on a later page (see Sec. 6.1). The unit of charge, the coulomb (C), is defined
as the charge carried in one second through the cross section of a conductor at a
current of one ampere.
The unit of current in the cgse system is the current at which one cgse unit of
charge (1 cgse𝑞 ) is carried through a given surface in one second. From Eqs. (1.8)
and (5.7) we find that
$$1\ \text{A} = 3\times 10^{9}\ \text{cgse}_I. \tag{5.8}$$

²The expression for the number of molecules flying in unit time through unit area contains, in
addition, the factor 1/4 due to the fact that the molecules move chaotically [see Eq. (11.23) of Vol. I].
This factor is not present in the given case because all the carriers of a given sign have ordered motion
in one direction.

Fig. 5.1 Fig. 5.2

5.2. Continuity Equation

Let us consider an imaginary closed surface 𝑆 (Fig. 5.1) in a medium in which a
current is flowing. The expression $\oint_S \mathbf{j}\cdot\mathrm{d}\mathbf{S}$ gives the charge emerging in a unit
time from the volume 𝑉 enclosed by surface 𝑆. Owing to charge conservation, this
quantity must equal the rate of diminishing of the charge 𝑞 contained in the given
volume:
$$\oint_S \mathbf{j}\cdot\mathrm{d}\mathbf{S} = -\frac{\mathrm{d}q}{\mathrm{d}t}.$$
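Substituting for 𝑞 its value $\int_V \rho\,\mathrm{d}V$, we get the expression
$$\oint_S \mathbf{j}\cdot\mathrm{d}\mathbf{S} = -\frac{\mathrm{d}}{\mathrm{d}t}\int_V \rho\,\mathrm{d}V = -\int_V \frac{\partial\rho}{\partial t}\,\mathrm{d}V. \tag{5.9}$$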
We have written the partial derivative of 𝜌 with respect to 𝑡 inside the integral
because the charge density may depend not only on time, but also on the coordinates
(the integral $\int_V \rho\,\mathrm{d}V$ is a function only of time). Let us transform the left-hand side
of Eq. (5.9) in accordance with the Ostrogradsky-Gauss theorem. As a result, we get
$$\int_V (\nabla\cdot\mathbf{j})\,\mathrm{d}V = -\int_V \frac{\partial\rho}{\partial t}\,\mathrm{d}V. \tag{5.10}$$
Equation (5.10) must be observed upon an arbitrary choice of the volume 𝑉 over
which the integrals are taken. This is possible only if at every point of space the
condition is observed that
$$\nabla\cdot\mathbf{j} = -\frac{\partial\rho}{\partial t}. \tag{5.11}$$
Equation (5.11) is known as the continuity equation. It [like Eq. (5.9)] expresses the
law of charge conservation. According to Eq. (5.11), the charge diminishes at points
that are sources of the vector 𝒋.
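The continuity equation can be checked on a simple one-dimensional example. Assume, purely for illustration, a charge cloud with a Gaussian profile moving as a whole with the constant velocity 𝑣, so that 𝜌(𝑥, 𝑡) = exp[−(𝑥 − 𝑣𝑡)²] and 𝑗 = 𝜌𝑣. The sketch below approximates both derivatives in Eq. (5.11) by central differences; the profile, the velocity, and the step `h` are assumptions of the example.

```python
import math


def rho(x, t, v=2.0):
    """Charge density of a rigidly moving Gaussian cloud (arbitrary units)."""
    return math.exp(-(x - v * t) ** 2)


def j(x, t, v=2.0):
    """Current density of the moving cloud: j = rho * v."""
    return rho(x, t, v) * v


def continuity_residual(x, t, h=1e-5):
    """Return dj/dx + d(rho)/dt, which Eq. (5.11) requires to vanish."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    dj_dx = (j(x + h, t) - j(x - h, t)) / (2 * h)
    return dj_dx + drho_dt


for x, t in [(0.3, 0.1), (1.0, 0.5), (-0.7, 0.2)]:
    print(f"x = {x:+.1f}, t = {t:.1f}: residual = {continuity_residual(x, t):.2e}")
```

The residual is zero to within the accuracy of the difference quotients: where the divergence of 𝒋 is positive, the charge density diminishes at the same rate.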
For a steady current, the potential at different points, the charge density, and
other quantities are constant. Hence, for a steady current, Eq. (5.11) has the form
∇ · 𝒋 = 0. (5.12)

Fig. 5.3

Thus, for a steady current, the vector 𝒋 has no sources. This signifies that the current
lines begin nowhere and terminate nowhere. Hence, the lines of a steady current
are always closed. Accordingly, $\oint_S \mathbf{j}\cdot\mathrm{d}\mathbf{S}$ equals zero. Therefore, for a steady current,
the picture similar to that shown in Fig. 5.1 has the form shown in Fig. 5.2.

5.3. Electromotive Force

If an electric field is set up in a conductor and no measures are taken to maintain
it, then motion of the current carriers will lead very rapidly to vanishing of the
field inside the conductor and stopping of the current. To maintain a current for a
sufficiently long time, it is necessary to continuously remove from the end of the
conductor with the lower potential (the current carriers are assumed to be positive)
the charges carried to it by the current, and continuously supply them to the end
with the higher potential (Fig. 5.3). In other words, it is necessary to circulate the
charges along a closed path. This agrees with the fact that the lines of a steady
current are closed (see the preceding section).
The circulation of the strength vector of an electrostatic field equals zero. There-
fore, in a closed circuit, in addition to sections on which the positive carriers travel
in the direction of a decrease in the potential 𝜑, there must be sections on which the
positive charges are carried in the direction of a growth in 𝜑, i.e., against the forces
of the electrostatic field (see the part of the circuit in Fig. 5.3 shown by the dashed line).
Motion of the carriers on these sections is possible only with the aid of forces of
a non-electrostatic origin, called extraneous forces. Thus, to maintain a current,
extraneous forces are needed that act either over the entire length of the circuit
or on separate sections of it. These forces may be due to chemical processes, the
diffusion of the current carriers in a non-uniform medium or through the interface
between two different substances, to electric (but not electrostatic) fields set up by
magnetic fields varying with time (see Sec. 9.1), etc.
Extraneous forces can be characterized by the work they do on charges travelling
along a circuit. The quantity equal to the work done by the extraneous forces
on a unit positive charge is called the electromotive force (e.m.f.) ℰ acting in
a circuit or on a section of it. Hence, if the work of the extraneous forces on the
charge 𝑞 is 𝐴, then
$$\mathcal{E} = \frac{A}{q}. \tag{5.13}$$
A comparison of Eqs. (1.31) and (5.13) shows that the dimension of the e.m.f.
coincides with that of the potential. Therefore, ℰ is measured in the same units as
𝜑.
The extraneous force 𝑭extr acting on the charge 𝑞 can be represented in the
form
$$\mathbf{F}_\text{extr} = \mathbf{E}^{*} q. \tag{5.14}$$
The vector quantity 𝑬* is called the strength of the extraneous force field. The
work of the extraneous forces on the charge 𝑞 on circuit section 1-2 is
$$A_{12} = \int_1^2 \mathbf{F}_\text{extr}\cdot\mathrm{d}\mathbf{l} = q\int_1^2 \mathbf{E}^{*}\cdot\mathrm{d}\mathbf{l}.$$
Dividing this work by 𝑞, we get the e.m.f. acting on the given section:
$$\mathcal{E}_{12} = \int_1^2 \mathbf{E}^{*}\cdot\mathrm{d}\mathbf{l}. \tag{5.15}$$
A similar integral calculated for a closed circuit gives the e.m.f. acting in this circuit:

$$\mathcal{E} = \oint \mathbf{E}^{*}\cdot\mathrm{d}\mathbf{l}. \tag{5.16}$$
Thus, the e.m.f. acting in a closed circuit can be determined as the circulation of the
strength vector of the extraneous forces.
In addition to extraneous forces, a charge experiences the forces of an electro-
static field 𝑭 𝐸 = 𝑞𝑬. Hence, the resultant force acting at each point of a circuit on
the charge 𝑞 is
𝑭 = 𝑭 𝐸 + 𝑭 extr = 𝑞(𝑬 + 𝑬 ∗ ).
The work done by this force on the charge 𝑞 on circuit section 1-2 is determined
by the expression
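$$A_{12} = q\int_1^2 \mathbf{E}\cdot\mathrm{d}\mathbf{l} + q\int_1^2 \mathbf{E}^{*}\cdot\mathrm{d}\mathbf{l} = q(\varphi_1 - \varphi_2) + q\mathcal{E}_{12}. \tag{5.17}$$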
The quantity numerically equal to the work done by the electrostatic and
extraneous forces in moving a unit positive charge is defined as the voltage drop
or simply the voltage 𝑈 on the given section of the circuit. According to Eq. (5.17),
$$U_{12} = \varphi_1 - \varphi_2 + \mathcal{E}_{12}. \tag{5.18}$$
A section of a circuit on which no extraneous forces act is called homogeneous.

A section on which the current carriers experience extraneous forces is called
inhomogeneous. For a homogeneous section of a circuit
𝑈12 = 𝜑1 − 𝜑2 , (5.19)
i.e., the voltage coincides with the potential difference across the ends of the section.
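The difference between the voltage and the potential difference is conveniently illustrated by a small numerical example based on Eqs. (5.17) and (5.18). The potentials and the e.m.f. used below are hypothetical, and the function names are introduced only for this sketch.

```python
def section_voltage(phi1, phi2, emf12=0.0):
    """Eq. (5.18): U12 = phi1 - phi2 + emf12, all quantities in volts."""
    return phi1 - phi2 + emf12


def work_on_charge(q, phi1, phi2, emf12=0.0):
    """Eq. (5.17): total work of electrostatic and extraneous forces on q."""
    return q * section_voltage(phi1, phi2, emf12)


# Homogeneous section: the voltage equals the potential difference, Eq. (5.19).
u_homogeneous = section_voltage(phi1=5.0, phi2=3.0)

# Inhomogeneous section: the charge moves toward the higher potential,
# and an assumed e.m.f. of 2 V exactly compensates the potential rise.
u_inhomogeneous = section_voltage(phi1=3.0, phi2=5.0, emf12=2.0)

print(f"homogeneous section:   U12 = {u_homogeneous} V")
print(f"inhomogeneous section: U12 = {u_inhomogeneous} V")
print(f"work on 1 C (homogeneous): {work_on_charge(1.0, 5.0, 3.0)} J")
```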

5.4. Ohm’s Law. Resistance of Conductors

The German physicist Georg Ohm (1789-1854) experimentally established a law
according to which the current flowing in a homogeneous (in the meaning that no
extraneous forces are present) metal conductor is proportional to the voltage drop 𝑈 in
the conductor:
$$I = \frac{1}{R}\,U. \tag{5.20}$$
We remind our reader that for a homogeneous conductor the voltage 𝑈 coincides
with the potential difference 𝜑1 − 𝜑2 [see Eq. (5.19)].
The quantity designated by the symbol 𝑅 in Eq. (5.20) is called the electrical
resistance of a conductor. The unit of resistance is the ohm (Ω) equal to the
resistance of a conductor in which a current of 1 A flows at a voltage of 1 V.
The value of the resistance depends on the shape and dimensions of a conduc-
tor and also on the properties of the material it is made of. For a homogeneous
cylindrical conductor
$$R = \rho\,\frac{l}{S}, \tag{5.21}$$
where 𝑙 is the length of the conductor, 𝑆 its cross-sectional area, and 𝜌 the coefficient
depending on the properties of the material and called the resistivity of the
substance.
If 𝑙 = 1 and 𝑆 = 1, then 𝑅 numerically equals 𝜌. In the SI, 𝜌 is measured in
ohm-metres (Ω m).
Let us find the relation between the vectors 𝒋 and 𝑬 at the same point of a
conductor. In an isotropic conductor, the ordered motion of the current carriers
takes place in the direction of the vector 𝑬. Therefore, the directions of the vectors
𝒋 and 𝑬 coincide³. Let us mentally separate an elementary cylindrical volume with
generatrices parallel to the vectors 𝒋 and 𝑬 in the vicinity of a certain point (Fig. 5.4).
A current equal to 𝑗 d𝑆 flows through the cross section of the cylinder. The voltage
across the cylinder is 𝐸 d𝑙, where 𝐸 is the field-strength at the given point. Finally,
the resistance of the cylinder, according to Eq. (5.21), is 𝜌(d𝑙/d𝑆). Using these values
in Eq. (5.20), we arrive at the equation
$$j\,\mathrm{d}S = \frac{1}{\rho}\frac{\mathrm{d}S}{\mathrm{d}l}\,E\,\mathrm{d}l \quad\text{or}\quad j = \frac{1}{\rho}E.$$
Taking advantage of the fact that the vectors 𝒋 and 𝑬 have the same direction, we
can write
$$\mathbf{j} = \frac{1}{\rho}\mathbf{E} = \sigma\mathbf{E}. \tag{5.22}$$
This equation expresses Ohm's law in the differential form.

³In anisotropic bodies, the directions of the vectors 𝒋 and 𝑬, generally speaking, do not coincide.
The relation between 𝒋 and 𝑬 for such bodies is achieved with the aid of the conductivity tensor.

Fig. 5.4
The quantity 𝜎 in Eq. (5.22) that is the reciprocal of 𝜌 is called the conductivity
of a material. The unit that is the reciprocal of the ohm is called the siemens (S).
The unit of 𝜎 is accordingly the siemens per metre (S m−1 ).
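The integral form (5.20) and the differential form (5.22) of Ohm's law must give the same current density in a uniform wire. The sketch below verifies this for a hypothetical copper wire; the resistivity of copper near room temperature (about 1.7 × 10⁻⁸ Ω m), the dimensions of the wire, and the applied voltage are assumed values.

```python
RHO_CU = 1.7e-8  # assumed resistivity of copper near room temperature, ohm*m


def wire_resistance(resistivity, length, area):
    """Eq. (5.21): R = rho * l / S for a homogeneous cylindrical conductor."""
    return resistivity * length / area


# Hypothetical wire: 10 m long, 1 mm^2 cross section, 1 V across its ends.
length, area, voltage = 10.0, 1.0e-6, 1.0

resistance = wire_resistance(RHO_CU, length, area)
current = voltage / resistance          # Eq. (5.20)
j_integral = current / area             # current density from I / S

e_field = voltage / length              # uniform field inside the wire
j_differential = e_field / RHO_CU       # Eq. (5.22): j = E / rho

print(f"R = {resistance:.3f} ohm, I = {current:.2f} A")
print(f"j from I/S   = {j_integral:.3e} A/m^2")
print(f"j from E/rho = {j_differential:.3e} A/m^2")
```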
Let us assume for simplicity’s sake that a conductor contains carriers of only
one sign. According to Eq. (5.5), the current density in this case is
𝒋 = 𝑒𝑛𝒖. (5.23)
A comparison of this expression with Eq. (5.22) leads us to the conclusion that the
velocity of ordered motion of current carriers is proportional to the field strength
𝑬, i.e., to the force imparting ordered motion to the carriers. Proportionality of
the velocity to the force applied to a body is observed when apart from the force
producing the motion, the body experiences the force of resistance of the medium.
This force is due to the interaction of the current carriers with the particles which
the substance of the conductor is built of. The presence of the force of resistance
to ordered motion of the current carriers results in the electrical resistance of a
conductor.
The ability of a substance to conduct an electric current is characterized by
its resistivity 𝜌 or conductivity 𝜎 . Their magnitude is determined by the chemical
nature of the substance and the surrounding conditions, in particular the ambient
temperature.
The resistivity 𝜌 varies directly with the absolute temperature 𝑇 for most metals
at temperatures close to room temperature:
𝜌 ∝ 𝑇. (5.24)
Deviations from this proportion are observed at low temperatures (Fig. 5.5). The
dependence of 𝜌 on 𝑇 usually follows curve 1. The magnitude of the residual

Fig. 5.5

resistivity 𝜌res depends very greatly on the purity of the material and the presence of
residual mechanical stresses in the specimen. This is why 𝜌res appreciably diminishes
after annealing. The resistivity 𝜌 of a perfectly pure metal with an ideal regular
crystal lattice vanishes at absolute zero.
The resistance of a large group of metals and alloys vanishes abruptly at a
temperature of the order of several kelvins (curve 2 in Fig. 5.5). This phenomenon,
called superconductivity, was first discovered in 1911 by the Dutch scientist Heike
Kamerlingh Onnes (1853-1926) for mercury. Superconductivity was later discovered
in lead, tin, zinc, aluminium, and other metals, as well as in a number of alloys. Every
superconductor has its own critical temperature 𝑇cr at which it passes over into
the superconducting state. The superconducting state is destroyed when a sufficiently
strong magnetic field acts on a superconductor. The magnitude of the critical field 𝐵cr
(the symbol 𝐵 stands for the magnetic induction; see Sec. 6.2) destroying superconductivity
equals zero when 𝑇 = 𝑇cr and grows as the temperature is lowered.
A complete theoretical substantiation of superconductivity was given in 1957 by
J. Bardeen, L. Cooper, and J. Schrieffer (see Vol. III, Sec. 8.2).
The temperature dependence of resistance underlies the design of resistance
thermometers. Such a thermometer is a metal (usually platinum) wire wound onto a
porcelain or mica body. A resistance thermometer graduated according to constant
temperature points makes it possible to measure both low and high temperatures
with an accuracy of the order of several hundredths of a kelvin. Recent times
have seen semiconductor resistance thermometers coming into greater and greater
favour.

5.5. Ohm’s Law for an Inhomogeneous Circuit Section

On an inhomogeneous section of a circuit, the extraneous forces 𝑒𝑬* act on the current
carriers in addition to the electrostatic forces 𝑒𝑬. Extraneous forces are capable of
producing ordered motion of current carriers to the same extent as electrostatic

Fig. 5.6

forces are. We found in the preceding section that in a homogeneous conductor,
the average velocity of ordered motion of the current carriers is proportional to the
electrostatic force 𝑒𝑬. It is quite obvious that, where extraneous forces are exerted
on the carriers in addition to the electrostatic forces, the average velocity of ordered
motion of the carriers will be proportional to the total force 𝑒𝑬 + 𝑒𝑬 ∗ . Accordingly,
the current density at these points is proportional to the sum of the strengths 𝑬 + 𝑬 ∗ :
𝒋 = 𝜎 (𝑬 + 𝑬 ∗ ) . (5.25)
Equation (5.25) generalizes Eq. (5.22) to an inhomogeneous conductor. It
expresses Ohm’s law for an inhomogeneous section of a circuit in the differential
form.
We can pass over from Ohm’s law in the differential form to its integral form.
Let us consider an inhomogeneous section of a circuit. Assume that there is a line
inside this section (we shall call it the current path) complying with the following
conditions: (1) in every cross section perpendicular to the path, the quantities 𝒋, 𝜎 ,
𝑬, 𝑬 ∗ have the same values with sufficient accuracy, and (2) the vectors 𝒋, 𝑬, and
𝑬 ∗ at every point are directed along a tangent to the path. The cross section of the
conductor may vary (Fig. 5.6).
Let us choose an arbitrary direction of motion along the path. Assume that the
chosen direction corresponds to motion from end 1 to end 2 of the circuit
section (direction 1-2). Let us project the vectors in Eq. (5.25) onto the path element
d𝒍. The result is
$$j_l = \sigma\,(E_l + E^*_l). \tag{5.26}$$
Owing to our assumption, the projection of each of the vectors equals the magnitude
of the vector taken with the sign plus or minus depending on the direction of the
vector relative to d𝒍. For example, 𝑗𝑙 = 𝑗 if the current flows in direction 1-2, and
𝑗𝑙 = −𝑗 if it flows in direction 2-1.
Owing to charge conservation, the steady current in each section must be the
same. Therefore, the quantity 𝐼 = 𝑗𝑙 𝑆 is constant along the path. The current in this
case should be treated as an algebraic quantity. We remind our reader that we have
chosen direction 1-2 arbitrarily. Hence, if the current flows in the chosen direction,
it should be considered positive, and if it flows in the opposite direction (i.e., from

end 2 to end 1), it should be considered negative.


Let us substitute the ratio 𝐼/𝑆 for 𝑗_𝑙 and the reciprocal of the resistivity, 1/𝜌, for the
conductivity 𝜎 in Eq. (5.26). We get
$$I\,\frac{\rho}{S} = E_l + E^*_l.$$
Multiplication of the above equation by d𝑙 and integration along the path yield
$$I\int_1^2 \rho\,\frac{\mathrm{d}l}{S} = \int_1^2 E_l\,\mathrm{d}l + \int_1^2 E^*_l\,\mathrm{d}l.$$
The quantity 𝜌(d𝑙/𝑆) is the resistance of the path section of length d𝑙, and the
integral of this quantity is the resistance 𝑅 of the circuit section. The first integral
in the right-hand side gives 𝜑1 − 𝜑2 , and the second integral gives the e.m.f. E12
acting on the section. We, thus, arrive at the equation
$$IR = \varphi_1 - \varphi_2 + \mathcal{E}_{12}. \tag{5.27}$$
The e.m.f. E12 , like the current 𝐼, is an algebraic quantity. When the e.m.f.
facilitates the motion of the positive current carriers in the selected direction (in
direction 1-2), we have E12 > 0. If the e.m.f. prevents the motion of the positive
carriers in the given direction, E12 < 0.
Let us write Eq. (5.27) in the form
$$I = \frac{\varphi_1 - \varphi_2 + \mathcal{E}_{12}}{R}. \tag{5.28}$$
This equation expresses Ohm’s law for an inhomogeneous circuit section. Assuming
that 𝜑1 = 𝜑2 , we get the equation of Ohm’s law for a closed circuit:
$$I = \frac{\mathcal{E}}{R}. \tag{5.29}$$
Here, E is the e.m.f. acting in the circuit, and 𝑅 is the total resistance of the entire
circuit.
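The sign conventions of Eq. (5.28) can be checked with a short Python sketch. The numerical values are arbitrary illustrative assumptions, and the function name is hypothetical.

```python
def section_current(phi1, phi2, emf12, resistance):
    """Eq. (5.28): I = (phi1 - phi2 + E12) / R.

    The result is positive when the current flows in the chosen direction 1-2
    and negative when it flows in direction 2-1.
    """
    return (phi1 - phi2 + emf12) / resistance


# The e.m.f. facilitates the motion of positive carriers in direction 1-2 (E12 > 0).
print(section_current(5.0, 3.0, 4.0, 2.0))    # 3.0
# The e.m.f. opposes the motion in direction 1-2 (E12 < 0); the current reverses.
print(section_current(5.0, 3.0, -4.0, 2.0))   # -1.0
# Closed circuit: phi1 = phi2, so Eq. (5.28) reduces to Eq. (5.29), I = E / R.
print(section_current(0.0, 0.0, 12.0, 6.0))   # 2.0
```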

5.6. Multiloop Circuits. Kirchhoff’s Rules

The calculation of multiloop circuits or networks is considerably simplified if we
use two rules formulated by the German physicist Gustav Kirchhoff (1824-1887).
The first of them relates to the junctions of a circuit. A junction is defined as a
point where three or more conductors meet (Fig. 5.7). A current flowing toward a
junction is considered to have one sign (plus or minus), and a current flowing out
of a junction is considered to have the opposite sign (minus or plus).
Kirchhoff’s first rule, also called the junction rule, states that the algebraic sum

Fig. 5.7 Fig. 5.8

of all the currents coming into a junction must be zero:


$$\sum_k I_k = 0. \tag{5.30}$$
This rule follows from the continuity equation, i.e., in the long run from the law
of charge conservation. For a steady current, ∇ · 𝒋 equals zero everywhere [see
Eq. (5.21)]. Hence, the flux of the vector 𝒋, i.e., the algebraic sum of the currents
flowing through an imaginary closed surface surrounding a junction, must be zero.
Equation (5.30) can be written for each of the 𝑁 junctions of a circuit. Only 𝑁 − 1
equations will be independent, however, whereas the 𝑁-th one will be a corollary
of them.
The second rule relates to any closed loop separated from a multiloop circuit (see,
for example, loop 1-2-3-4-1 in Fig. 5.8). Let us choose a direction of circumvention
(for example, clockwise as in the figure) and apply Ohm’s law to each unbranched
loop section:
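$$\begin{aligned}
I_1 R_1 &= \varphi_1 - \varphi_2 + \mathcal{E}_1,\\
I_2 R_2 &= \varphi_2 - \varphi_3 + \mathcal{E}_2,\\
I_3 R_3 &= \varphi_3 - \varphi_4 + \mathcal{E}_3,\\
I_4 R_4 &= \varphi_4 - \varphi_1 + \mathcal{E}_4.
\end{aligned}$$
When these expressions are summed, the potentials cancel, and we get
the equation
$$\sum_k I_k R_k = \sum_k \mathcal{E}_k, \tag{5.31}$$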
that expresses Kirchhoff’s second rule, also called the loop rule.
Equation (5.31) can be written for all the closed loops that can be separated
mentally in a given multiloop circuit. Only the equations for the loops that cannot

Fig. 5.9

be obtained by the superposition of other loops on one another will be independent,
however. For example, for the circuit depicted in Fig. 5.9, we can write three
equations:
(1) for loop 1-2-3-6-1,
(2) for loop 3-4-5-6-3, and
(3) for loop 1-2-3-4-5-6-1.
The last loop is obtained by superposition of the first two. The three equations will,
therefore, not all be independent. We can take any two equations of the three as
independent ones.
In writing the equations of the loop rule, we must assign the signs of the
currents and e.m.f.’s in accordance with the chosen direction of circumvention.
For example, the current 𝐼1 in Fig. 5.9 must be considered negative because it
flows oppositely to the chosen direction of circumvention. The e.m.f. must also be
considered negative because it acts in the direction opposite to that of circumvention,
and so on.
We may choose the direction of circumvention in each loop absolutely arbi-
trarily and independently of the choice of the directions in the other loops. It may
happen, here, that the same current or the same e.m.f. may be included in different
equations with opposite signs (this happens with the current 𝐼2 in Fig. 5.9 for the
indicated directions of circumvention in the loops). This is of no significance, how-
ever, because a change in the direction around a loop results only in a reversal of all
the signs in Eq. (5.31).
In compiling the equations, remember that the same current flows in any cross
section of an unbranched part of a circuit. For example, the same current 𝐼2 flows
from junction 6 to the current source E2 as from the source E2 to junction 3.

The number of independent equations compiled in accordance with the junction
and loop rules equals the number of different currents flowing in a multiloop circuit.
Therefore, if we know the e.m.f.’s and resistances for all the unbranched sections,
we can calculate all the currents. We can also solve other problems, for instance
find the e.m.f.’s that must be connected in each of the sections of a circuit to obtain
the required currents with the given resistances.
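The procedure can be sketched in Python for a hypothetical two-loop circuit. This circuit is an assumption made for illustration and is not the one shown in Fig. 5.9: two sources with e.m.f.'s e1 and e2 and resistances r1 and r2 are connected in parallel to a common resistor r3. The currents i1 and i2 are taken to flow through the sources toward the common junction, and i3 to flow away from it through r3. One junction equation and two loop equations give three independent equations for the three unknown currents.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        # Bring the row with the largest pivot into position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def two_loop_currents(e1, e2, r1, r2, r3):
    """Currents (i1, i2, i3) in the assumed two-source circuit."""
    a = [
        [1.0, 1.0, -1.0],   # junction rule: i1 + i2 - i3 = 0
        [r1, 0.0, r3],      # loop rule, loop with source 1: i1*r1 + i3*r3 = e1
        [0.0, r2, r3],      # loop rule, loop with source 2: i2*r2 + i3*r3 = e2
    ]
    return solve_linear(a, [0.0, e1, e2])


i1, i2, i3 = two_loop_currents(10.0, 5.0, 1.0, 2.0, 3.0)
print(i1, i2, i3)
```

A negative value obtained for one of the currents (here i2) means only that this current actually flows oppositely to the direction assumed for it.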

5.7. Power of a Current

Let us consider an arbitrary section of a steady current circuit across whose ends
the voltage 𝑈 is applied. The charge 𝑞 = 𝐼𝑡 will flow during the time 𝑡 through every
cross section of the conductor. This is equivalent to the fact that the charge 𝐼𝑡 is
carried during the time 𝑡 from one end of the conductor to the other. The forces of
the electrostatic field and the extraneous forces acting on the given section do the
work
𝐴 = 𝑈𝑞 = 𝑈 𝐼𝑡 (5.32)
[we remind our reader that the voltage 𝑈 is determined as the work done by the
electrostatic and extraneous forces in moving a unit positive charge; see Eq. (5.18)].
Dividing the work 𝐴 by the time 𝑡 during which it is done, we get the power
developed by the current on the circuit section being considered:
$$P = UI = (\varphi_1 - \varphi_2)I + \mathcal{E}_{12}I. \tag{5.33}$$
This power may be spent for the work done by the circuit section being consid-
ered on external bodies (for this purpose the section must move in space), for the
proceeding of chemical reactions, and, finally, for heating the given circuit section.
The ratio of the power 𝛥𝑃 developed by a current in the volume 𝛥𝑉 of a
conductor to the magnitude of this volume is called the unit power of the current
𝑃u , corresponding to the given point of the conductor. By definition, the unit power
is
$$P_\mathrm{u} = \frac{\Delta P}{\Delta V}. \tag{5.34}$$
Loosely speaking, the unit power is the power developed in unit volume of a
conductor.
An expression for the unit power can be obtained proceeding from the following
considerations. The force 𝑒(𝑬 + 𝑬 ∗ ) develops a power of
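$$P' = e(\boldsymbol{E} + \boldsymbol{E}^*)\cdot(\boldsymbol{v} + \boldsymbol{u})$$
upon the motion of a current carrier. Let us average this expression for the carriers
confined in the volume 𝛥𝑉 within which 𝑬 and 𝑬* may be considered constant.
The result is
$$\langle P'\rangle = e(\boldsymbol{E} + \boldsymbol{E}^*)\cdot\langle\boldsymbol{v} + \boldsymbol{u}\rangle = e(\boldsymbol{E} + \boldsymbol{E}^*)\cdot\langle\boldsymbol{v}\rangle + e(\boldsymbol{E} + \boldsymbol{E}^*)\cdot\langle\boldsymbol{u}\rangle = e(\boldsymbol{E} + \boldsymbol{E}^*)\cdot\langle\boldsymbol{u}\rangle$$
(remember that ⟨𝒗⟩ = 0).
We can find the power 𝛥𝑃 developed in the volume 𝛥𝑉 by multiplying ⟨𝑃′⟩
by the number of current carriers in this volume, i.e., by 𝑛𝛥𝑉 (𝑛 is the number of
carriers in unit volume). Thus,
$$\Delta P = \langle P'\rangle\, n\,\Delta V = e(\boldsymbol{E} + \boldsymbol{E}^*)\cdot\langle\boldsymbol{u}\rangle\, n\,\Delta V = \boldsymbol{j}\cdot(\boldsymbol{E} + \boldsymbol{E}^*)\,\Delta V$$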
[see Eq. (5.23)]. Hence,
𝑃u = 𝒋 · (𝑬 + 𝑬 ∗ ) (5.35)
This expression is a differential form of the integral equation (5.33).

5.8. The Joule-Lenz Law

When a conductor is stationary and no chemical transformations occur in it, the
work of a current given by Eq. (5.32) goes to increase the internal energy of the
conductor, and as a result the latter gets heated. It is customary to say that when a
current flows in a conductor, the heat
𝑄 = 𝑈 𝐼𝑡
is liberated. Substituting 𝑅𝐼 for 𝑈 in accordance with Ohm’s law, we get the formula
$$Q = RI^2 t. \tag{5.36}$$
Equation (5.36) was established experimentally by the British physicist James
Joule (1818-1889) and independently of him by the Russian physicist Emil Lenz (1804-
1865), and is called the Joule-Lenz law.
If the current varies with time, then the amount of heat liberated during the
time 𝑡 is calculated by the equation
$$Q = \int_0^t RI^2\,\mathrm{d}t. \tag{5.37}$$
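As an illustration of Eq. (5.37), the integral can be evaluated numerically. The following Python sketch uses the trapezoidal rule; the resistance, the current amplitude, the period, and the sinusoidal law of current variation are assumed purely for the example, and the function name is hypothetical.

```python
import math


def joule_heat(resistance, current, t_end, steps=10_000):
    """Approximate Q = integral from 0 to t_end of R * I(t)**2 dt (Eq. 5.37)
    by the trapezoidal rule; `current` is a function of time."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0   # end points count half
        total += weight * resistance * current(k * dt) ** 2
    return total * dt


R = 10.0        # ohm, illustrative value
I0 = 2.0        # A, illustrative amplitude
PERIOD = 0.02   # s, illustrative period

# Constant current: the integral reduces to Eq. (5.36), Q = R * I0**2 * t.
q_const = joule_heat(R, lambda t: I0, PERIOD)
# Sinusoidal current over one period: Q = R * I0**2 * PERIOD / 2.
q_sine = joule_heat(R, lambda t: I0 * math.sin(2 * math.pi * t / PERIOD), PERIOD)
print(q_const, q_sine)
```

For the sinusoidal current, the heat liberated during a period is half of that liberated by a constant current equal to the amplitude.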
We can pass over from Eq. (5.36) determining the heat liberated in an entire
conductor to an expression characterizing the liberation of heat at different spots of
the conductor. Let us separate in a conductor, in the same way as we did in deriving
Eq. (5.22), an elementary volume in the form of a cylinder (see Fig. 5.4). According to
the Joule-Lenz law, the following amount of heat will be liberated in this volume
during the time d𝑡:
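$$\mathrm{d}Q = RI^2\,\mathrm{d}t = \frac{\rho\,\mathrm{d}l}{\mathrm{d}S}\,(j\,\mathrm{d}S)^2\,\mathrm{d}t = \rho j^2\,\mathrm{d}V\,\mathrm{d}t \tag{5.38}$$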
(d𝑉 = d𝑆 d𝑙 is the magnitude of the elementary volume).

Dividing Eq. (5.38) by d𝑉 and d𝑡, we shall find the amount of heat liberated in
unit volume per unit time:
$$Q_\mathrm{u} = \rho j^2. \tag{5.39}$$
By analogy with the name of the quantity (5.34), the quantity 𝑄u can be called the
unit thermal power of a current.
Equation (5.39) is a differential form of the Joule-Lenz law. It can be obtained
from Eq. (5.35). Substituting 𝒋/𝜎 = 𝜌𝒋 for 𝑬 + 𝑬 ∗ in Eq. (5.35) [see Eq. (5.25)], we arrive
at the expression
$$P_\mathrm{u} = \rho\boldsymbol{j}^2,$$
that coincides with Eq. (5.39).
It must be noted that Joule and Lenz established their law for a homogeneous
circuit section. As follows from what has been said in the present section, however,
Eqs. (5.36) and (5.39) also hold for an inhomogeneous section provided that the
extraneous forces acting in it have a non-chemical origin.
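The agreement between the integral form (5.36) and the differential form (5.39) can be verified numerically for a homogeneous cylindrical conductor. In the Python sketch below, the resistivity and the dimensions of the wire are assumed illustrative values, and the function names are hypothetical.

```python
def heat_integral_form(rho, length, area, current, time):
    """Eq. (5.36): Q = R * I**2 * t with R = rho * l / S."""
    resistance = rho * length / area
    return resistance * current ** 2 * time


def heat_differential_form(rho, length, area, current, time):
    """Eq. (5.39) times volume and time: Q = rho * j**2 * V * t with j = I / S."""
    j = current / area
    volume = area * length
    return rho * j ** 2 * volume * time


# Assumed values: a copper-like wire 2 m long, 1 mm^2 in cross section, 3 A for 10 s.
args = (1.7e-8, 2.0, 1.0e-6, 3.0, 10.0)
q_int = heat_integral_form(*args)
q_diff = heat_differential_form(*args)
print(q_int, q_diff)   # both close to 3.06 J
```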

Chapter 6
MAGNETIC FIELD
IN A VACUUM

6.1. Interaction of Currents

Experiments show that electric currents exert a force on one another. For example,
two thin straight parallel conductors carrying a current (we shall call them line
currents) attract each other if the currents in them flow in the same direction, and
repel each other if the currents flow in opposite directions. The force of interaction
per unit length of each of the parallel conductors is proportional to the magnitudes
of the currents 𝐼1 and 𝐼2 in them and inversely proportional to the distance 𝑏
between them:
$$F_\mathrm{u} = k\,\frac{2I_1 I_2}{b}. \tag{6.1}$$
We have designated the proportionality constant 2𝑘 for reasons that will become
clear on a later page.
The law of interaction of currents was established in 1820 by the French physicist
André Ampère (1775-1836). A general expression of this law suitable for conductors
of any shape will be given in Sec. 6.6. Equation (6.1) is used to establish the unit
of current in the SI and in the absolute electromagnetic system (cgsm) of units.
The SI unit of current—the ampere—is defined as the constant current which,
if maintained in two straight parallel conductors of infinite length, of negligible
cross section, and placed 1 metre apart in vacuum, would produce between these
conductors a force equal to 2 × 10⁻⁷ newton per metre of length.
The unit of charge, called the coulomb, is defined as the charge passing in 1
second through the cross section of a conductor in which a constant current of
1 ampere is flowing. Accordingly, the coulomb is also called the ampere-second
(A s).

Equation (6.1) is written in the rationalized form as follows:


$$F_\mathrm{u} = \frac{\mu_0}{4\pi}\,\frac{2I_1 I_2}{b}, \tag{6.2}$$
where 𝜇0 is the so-called magnetic constant [compare with Eq. (1.11)]. To find the
numerical value of 𝜇0 , we shall take advantage of the fact that according to the
definition of the ampere, when 𝐼1 = 𝐼2 = 1 A and 𝑏 = 1 m, the force 𝐹u is obtained
equal to 2 × 10⁻⁷ N m⁻¹.
Let us use these values in Eq. (6.2):
$$2\times10^{-7} = \frac{\mu_0}{4\pi}\,\frac{2\times1\times1}{1}.$$
Hence,
$$\mu_0 = 4\pi\times10^{-7} = 1.26\times10^{-6}\ \mathrm{H\,m^{-1}} \tag{6.3}$$
(the symbol H m⁻¹ stands for henry per metre; see Sec. 8.5).
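The arithmetic leading to Eq. (6.3) can be reproduced in a few lines of Python. The sketch below inverts Eq. (6.2) with the values from the definition of the ampere given in the text; the function name and the second set of currents are illustrative assumptions.

```python
import math


def force_per_length(i1, i2, b, mu0):
    """Eq. (6.2): F_u = (mu0 / 4 pi) * 2 * I1 * I2 / b, in N/m."""
    return mu0 / (4 * math.pi) * 2 * i1 * i2 / b


# Definition of the ampere used in the text: I1 = I2 = 1 A, b = 1 m, F_u = 2e-7 N/m.
F_DEF = 2e-7
mu0 = F_DEF * 4 * math.pi * 1.0 / (2 * 1.0 * 1.0)   # Eq. (6.2) solved for mu0
print(mu0)                                          # about 1.2566e-6 H/m

# Illustrative use: two line currents of 10 A each at a distance of 0.1 m.
f_example = force_per_length(10.0, 10.0, 0.1, mu0)  # about 2e-4 N/m
print(f_example)
```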
The constant 𝑘 in Eq. (6.1) can be made equal to unity by choosing an appropriate
unit of current. This is how the absolute electromagnetic unit of current (cgsm𝐼 )
is established. It is defined as the current which, if maintained in a thin straight
conductor of infinite length, would act on an equal and parallel line current at a
distance of 1 cm from it with a force equal to 2 dyn per centimetre of length.
In the cgse system, the constant 𝑘 is a dimensional quantity different from unity.
According to Eq. (6.1), the dimension of 𝑘 is determined as follows:
$$[k] = \frac{[F_\mathrm{u} b]}{[I]^2} = \frac{[F]}{[I]^2}. \tag{6.4}$$
We have taken into account that the dimension of 𝐹u is the dimension of force
divided by the dimension of length; hence, the dimension of the product 𝐹u 𝑏 is that
of force. According to Eqs. (1.7) and (5.7):
$$[F] = \frac{[q]^2}{\mathrm{L}^2}; \qquad [I] = \frac{[q]}{\mathrm{T}}.$$
Using these values in Eq. (6.4), we find that
$$[k] = \frac{\mathrm{T}^2}{\mathrm{L}^2}.$$
Consequently, in the cgse system, 𝑘 can be written in the form
$$k = \frac{1}{c^2}, \tag{6.5}$$
where 𝑐 is a quantity having the dimension of velocity and called the electromag-
netic constant. To find its value, let us use relation (1.8) between the coulomb
and the cgse unit of charge, which was established experimentally. A force of
2 × 10⁻⁷ N m⁻¹ is equivalent to 2 × 10⁻⁴ dyn cm⁻¹. According to Eq. (6.1), this is the
force with which currents of 3 × 10⁹ cgse𝐼 (i.e., 1 A) each interact when 𝑏 = 100 cm.
Thus,
$$2\times10^{-4} = \frac{1}{c^2}\,\frac{2\times3\times10^{9}\times3\times10^{9}}{100},$$
whence
$$c = 3\times10^{10}\ \mathrm{cm\,s^{-1}} = 3\times10^{8}\ \mathrm{m\,s^{-1}}. \tag{6.6}$$
The value of the electromagnetic constant coincides with that of the speed of
light in a vacuum. From J. Maxwell’s theory, there follows the existence of electro-
magnetic waves whose speed in a vacuum equals the electromagnetic constant 𝑐.
The coincidence of 𝑐 with the speed of light in a vacuum gave Maxwell the grounds
to assume that light is an electromagnetic wave.
The value of 𝑘 in Eq. (6.1) is 1 in the cgsm system and 1/𝑐² = 1/(3 × 10¹⁰)²
s² cm⁻² in the cgse system. Hence, it follows that a current of 1 cgsm𝐼 is equivalent
to a current of 3 × 10¹⁰ cgse𝐼:
1 cgsm𝐼 = 3 × 10¹⁰ cgse𝐼 = 10 A. (6.7)
Multiplying this relation by 1 s, we get
1 cgsm𝑞 = 3 × 10¹⁰ cgse𝑞 = 10 C. (6.8)
Thus,
$$I_\mathrm{cgsm} = \frac{1}{c}\,I_\mathrm{cgse}. \tag{6.9}$$
Accordingly,
$$q_\mathrm{cgsm} = \frac{1}{c}\,q_\mathrm{cgse}. \tag{6.10}$$
There is a definite relation between the constants 𝜀0 , 𝜇0 , and 𝑐. To establish it,
let us find the dimension and numerical value of the product 𝜀0 𝜇0 . In accordance
with Eq. (1.11), the dimension of 𝜀0 is
$$[\varepsilon_0] = \frac{[q]^2}{\mathrm{L}^2[F]}. \tag{6.11}$$
According to Eq. (6.2)
$$[\mu_0] = \frac{[F_\mathrm{u} b]}{[I]^2} = \frac{[F]\,\mathrm{T}^2}{[q]^2}. \tag{6.12}$$
Multiplication of Eqs. (6.11) and (6.12) yields
$$[\varepsilon_0\mu_0] = \frac{\mathrm{T}^2}{\mathrm{L}^2} = \frac{1}{[v]^2} \tag{6.13}$$
(𝑣 is the speed).
With account taken of Eqs. (1.11) and (6.3), the numerical value of the product
𝜀0𝜇0 is
$$\varepsilon_0\mu_0 = \frac{1}{4\pi\times9\times10^{9}}\times4\pi\times10^{-7} = \frac{1}{(3\times10^{8})^2}\ \mathrm{s^2\,m^{-2}}. \tag{6.14}$$

Finally, taking into account Eqs. (6.6), (6.13), and (6.14), we get the relation
interesting us:
$$\varepsilon_0\mu_0 = \frac{1}{c^2}. \tag{6.15}$$

6.2. Magnetic Field

Currents interact through a field called the magnetic field. This name originated from
the fact that, as the Danish physicist Hans Oersted (1777-1851) discovered in 1820, the
field set up by a current has an orienting action on a magnetic needle. Oersted
stretched a wire carrying a current over a magnetic needle free to rotate on a pivot.
When the current was switched on, the needle aligned itself at right angles to the
wire. Reversing the current caused the needle to turn in the opposite direction.
Oersted’s experiment shows that a magnetic field has a sense of direction and
must be characterized by a vector quantity. The latter is designated by the symbol 𝑩.
It would be logical to call 𝑩 the magnetic field strength, by analogy with the electric
field strength 𝑬. For historical reasons, however, the basic force characteristic of
a magnetic field was called the magnetic induction. The name magnetic field
strength was given to an auxiliary quantity 𝑯 similar to the auxiliary characteristic
𝑫 of an electric field.
A magnetic field, unlike its electric counterpart, does not act on a charge at rest.
A force appears only when a charge is moving.
A current-carrying conductor is an electrically neutral system of charges in
which the charges of one sign are moving in one direction, and the charges of the
other sign in the opposite direction (or are at rest). It thus follows that a magnetic
field is set up by moving charges.
Thus, moving charges (currents) change the properties of the space surrounding
them—they set up a magnetic field in it. This field manifests itself in that forces are
exerted on charges moving in it (currents).
Experiments show that the superposition principle holds for a magnetic field,
the same as for an electric field: the field 𝑩 set up by several moving charges (currents)
equals the vector sum of the fields 𝑩𝑖 set up by each charge (current) separately:
$$\boldsymbol{B} = \sum_i \boldsymbol{B}_i \tag{6.16}$$
[compare with Eq. (1.19)].

Fig. 6.1

6.3. Field of a Moving Charge

Space is isotropic; consequently, if a charge is stationary, all directions are
equivalent. This underlies the fact that the electrostatic field set up by a point
charge is spherically symmetrical.
If a charge travels with the velocity 𝒗, a preferred direction (that of the vector
𝒗) appears in space. We can, therefore, expect the magnetic field produced by a
moving charge to have axial symmetry. We must note that we have in mind free
motion of a charge, i.e., motion with a constant velocity. For an acceleration to
appear, the charge must experience the action of a field (electric or magnetic). This
field by its very existence would violate the isotropy of space.
Let us consider the magnetic field set up at point 𝑃 by the point charge 𝑞
travelling with the constant velocity 𝒗 (Fig. 6.1). The disturbances of the field are
transmitted from point to point with the finite velocity 𝑐. For this reason, the
induction 𝑩 at point 𝑃 at the moment of time 𝑡 is determined not by the position of
the charge at the same moment 𝑡, but by its position at an earlier moment of time
𝑡 − 𝜏:
𝑩(𝑃, 𝑡) = 𝑓 {𝑞, 𝒗, 𝒓(𝑡 − 𝜏)}.
Here, 𝑃 signifies the collection of the coordinates of point 𝑃 determined in a sta-
tionary reference frame, and 𝒓(𝑡 − 𝜏) is the position vector drawn to point 𝑃 from
the point where the charge was at the moment 𝑡 − 𝜏.
If the velocity of the charge is much smaller than 𝑐 (𝑣 ≪ 𝑐), then the retardation
time 𝜏 will be negligibly small. In this case, we can consider that the value of 𝑩 at
the moment 𝑡 is determined by the position of the charge at the same moment 𝑡. If
this condition is observed, then
𝑩(𝑃, 𝑡) = 𝑓 {𝑞, 𝒗, 𝒓(𝑡)} (6.17)
[we remind our reader that 𝒗 = constant, therefore, 𝒗(𝑡 − 𝜏) = 𝒗(𝑡)].
The form of function (6.17) can be established only experimentally. But before
giving the results of experiments, let us try to find the logical form of this relation.

The simplest assumption is that the magnitude of the vector 𝑩 is proportional to
the charge 𝑞 and the velocity 𝑣 (when 𝒗 → 0, a magnetic field is absent). We have to
“construct” the vector 𝑩 we are interested in from the scalar 𝑞 and the two given
vectors 𝒗 and 𝒓. This can be done by vector multiplication of the given vectors and
then by multiplying their product by the scalar. The result is the expression
𝑞(𝒗 × 𝒓). (6.18)
The magnitude of this expression grows with an increasing distance from the charge
(with increasing 𝑟). It is improbable that the characteristic of a field will behave
in this way—for the fields that we know (electrostatic, gravitational), the field
does not grow with an increasing distance from the source, but, on the contrary,
weakens, varying in proportion to 1/𝑟 2 . Let us assume that the magnetic field of a
moving charge behaves in the same way when 𝑟 changes. We can obtain an inverse
proportion to the square of 𝑟 by dividing Eq. (6.18) by 𝑟 3 . The result is
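$$\frac{q(\boldsymbol{v}\times\boldsymbol{r})}{r^3}. \tag{6.19}$$
Experiments show that when 𝑣 ≪ 𝑐, the magnetic induction of the field of a
moving charge is determined by the formula
$$\boldsymbol{B} = k'\,\frac{q(\boldsymbol{v}\times\boldsymbol{r})}{r^3}, \tag{6.20}$$
where 𝑘′ is a proportionality constant.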
We must stress once more that the reasoning which led us to expression (6.19)
must by no means be considered as the derivation of Eq. (6.20). This reasoning does
not have conclusive force. Its aim is to help us understand and memorize Eq. (6.20).
This equation itself can be obtained only experimentally.
It can be seen from Eq. (6.20) that the vector 𝑩 at every point 𝑃 is directed at
right angles to the plane passing through the direction of the vector 𝒗 and point 𝑃,
so that rotation in the direction of 𝑩 forms a right-handed system with the direction
of 𝒗 (see the circle with the dot in Fig. 6.1). We must note that 𝑩 is a pseudovector.
The value of the proportionality constant 𝑘′ depends on our choice of the units
of the quantities in Eq. (6.20). This equation is written in the rationalized form as
follows:
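$$\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{q(\boldsymbol{v}\times\boldsymbol{r})}{r^3}. \tag{6.21}$$
This equation can be written in the form
$$\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{q(\boldsymbol{v}\times\hat{\boldsymbol{e}}_r)}{r^2} \tag{6.22}$$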
[compare with Eq. (1.15)]. It must be noted that in similar equations when 𝜀0 is in the
denominator, 𝜇0 is in the numerator, and vice versa.
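Equation (6.21) is easy to evaluate numerically. The Python sketch below computes 𝑩 for a point charge, assuming 𝑣 ≪ 𝑐 as the text requires; the charge, velocity, and distance are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7   # magnetic constant, H/m, Eq. (6.3)


def cross(a, b):
    """Vector product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def moving_charge_field(q, v, r):
    """Eq. (6.21): B = (mu0 / 4 pi) * q * (v x r) / r**3, valid for v << c.

    r is the position vector drawn from the charge to the field point.
    """
    r_mag = math.sqrt(sum(x * x for x in r))
    factor = MU0 / (4 * math.pi) * q / r_mag ** 3
    return tuple(factor * x for x in cross(v, r))


q = 1.602e-19            # C, charge of a proton (illustrative choice)
v = (1.0e6, 0.0, 0.0)    # m/s, along the x-axis, much smaller than c

# Field point on the y-axis, 1 nm from the charge: B is directed along +z.
b_side = moving_charge_field(q, v, (0.0, 1.0e-9, 0.0))
# Field point on the line of motion: v x r = 0, so the field vanishes.
b_ahead = moving_charge_field(q, v, (1.0e-9, 0.0, 0.0))
print(b_side, b_ahead)
```

The first result is perpendicular to both 𝒗 and 𝒓, in agreement with the right-handed rule stated above, while the field on the line of motion is zero.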

The SI unit of magnetic induction is called the tesla (T) in honour of the
Serbian-American electrical engineer and inventor Nikola Tesla (1856-1943).
The units of the magnetic induction 𝐵 are chosen in the cgse and cgsm systems
so that the constant 𝑘′ in Eq. (6.20) equals unity. Hence, the same relation holds
between the units of 𝐵 in these systems as between the units of charge:
1 cgsm𝐵 = 3 × 10¹⁰ cgse𝐵 (6.23)
[see Eq. (6.8)].
The cgsm unit of magnetic induction has a special name: the gauss (Gs).
The German mathematician Karl Gauss (1777-1855) proposed a system of units
in which all the electrical quantities (charge, current, electric field strength, etc.)
are measured in cgse units, and all the magnetic quantities (magnetic induction,
magnetic moment, etc.) in cgsm units. This system of units was named the Gaussian
one, in honour of its author.
In the Gaussian system, owing to Eqs. (6.9) and (6.10), all the equations containing
the current or charge in addition to magnetic quantities include one multiplier 1/𝑐
for each quantity 𝐼 or 𝑞 in the relevant equation. This multiplier converts the value
of the pertinent quantity (𝐼 or 𝑞) expressed in cgse units to a value expressed in cgsm
units (the cgsm system of units is constructed so that the proportionality constants
in all the equations equal 1). For example, in the Gaussian system, Eq. (6.20) has the
form
$$\boldsymbol{B} = \frac{1}{c}\,\frac{q(\boldsymbol{v}\times\boldsymbol{r})}{r^3}. \tag{6.24}$$
We must note that the appearance of a preferred direction in space (the direction
of the vector 𝒗) when a charge moves leads to the electric field of the moving charge
also losing its spherical symmetry and becoming axially symmetrical. The relevant
calculations show that the 𝑬 lines of the field of a freely moving charge have the
form shown in Fig. 6.2. The vector 𝑬 at point 𝑃 is directed along the position vector
𝒓 drawn from the point where the charge is at the given moment to point 𝑃. The
magnitude of the field strength is determined by the equation
$$E = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{r^2}\,\frac{1 - v^2/c^2}{\left[1 - (v^2/c^2)\sin^2\theta\right]^{3/2}}, \tag{6.25}$$
where 𝜃 is the angle between the direction of the velocity 𝒗 and the position vector
𝒓.
When 𝑣 ≪ 𝑐, the electric field of a freely moving charge at each moment of time
is virtually indistinguishable from the electrostatic field set up by a stationary charge at
the point where the moving charge is at the given moment. It must be remembered,
however, that this “electrostatic” field moves together with the charge. Hence, the
field at each point of space changes with time.

Fig. 6.2

At values of 𝑣 comparable with 𝑐, the field in directions at right angles to 𝒗 is
appreciably stronger than in the direction of motion at the same distance from the
charge (see Fig. 6.2 drawn for 𝑣/𝑐 = 0.8). The field “flattens out” in the direction of
motion and is concentrated mainly near a plane passing through the charge and
perpendicular to the vector 𝒗.
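The degree of "flattening" follows directly from the angular factor in Eq. (6.25). A short Python sketch, with a hypothetical function name, evaluates this factor for 𝑣/𝑐 = 0.8, the value for which Fig. 6.2 is drawn.

```python
import math


def field_factor(beta, theta):
    """Angular factor of Eq. (6.25), i.e. E divided by q / (4 pi eps0 r**2).

    beta = v/c; theta is the angle between the velocity and the position vector.
    """
    return (1 - beta ** 2) / (1 - beta ** 2 * math.sin(theta) ** 2) ** 1.5


beta = 0.8
along = field_factor(beta, 0.0)             # in the direction of motion
across = field_factor(beta, math.pi / 2)    # at right angles to the velocity
print(along, across, across / along)
```

For 𝑣/𝑐 = 0.8, the factor equals 0.36 in the direction of motion and about 1.67 at right angles to it, so that the transverse field is more than four times stronger than the longitudinal one at the same distance. When 𝑣/𝑐 tends to zero, the factor tends to unity for all directions.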

6.4. The Biot-Savart Law

Let us determine the nature of the magnetic field set up by an arbitrary thin wire
through which a current flows. We shall consider a small element of the wire of
length d𝑙. This element contains 𝑛𝑆 d𝑙 current carriers (𝑛 is the number of carriers
in a unit volume, and 𝑆 is the cross-sectional area of the wire where the element d𝑙
has been taken). At the point whose position relative to the element d𝑙 is determined
by the position vector 𝒓 (Fig. 6.3), a separate carrier of current 𝑒 sets up a field with
the induction
$$\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{e[(\boldsymbol{v} + \boldsymbol{u})\times\boldsymbol{r}]}{r^3}$$
[see Eq. (6.21)]. Here, 𝒗 is the velocity of chaotic motion, and 𝒖 is the velocity of
ordered motion of the carrier.
The value of the magnetic induction averaged over the current carriers in the
element d𝑙 is
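$$\langle\boldsymbol{B}\rangle = \frac{\mu_0}{4\pi}\,\frac{e[(\langle\boldsymbol{v}\rangle + \langle\boldsymbol{u}\rangle)\times\boldsymbol{r}]}{r^3} = \frac{\mu_0}{4\pi}\,\frac{e(\langle\boldsymbol{u}\rangle\times\boldsymbol{r})}{r^3}$$
(⟨𝒗⟩ = 0). Multiplying this expression by the number of carriers in an element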
of the wire (equal to 𝑛𝑆 d𝑙), we get the contribution to the field introduced by the
element d𝑙:
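$$\mathrm{d}\boldsymbol{B} = \langle\boldsymbol{B}\rangle\, nS\,\mathrm{d}l = \frac{\mu_0}{4\pi}\,\frac{S[(ne\langle\boldsymbol{u}\rangle)\times\boldsymbol{r}]\,\mathrm{d}l}{r^3}$$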

Fig. 6.3

(we have put the scalar multipliers 𝑛 and 𝑒 inside the sign of the vector product).
Taking into account that 𝑛𝑒⟨𝒖⟩ = 𝒋, we can write
$$\mathrm{d}\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{S(\boldsymbol{j}\times\boldsymbol{r})\,\mathrm{d}l}{r^3}. \tag{6.26}$$
Let us introduce the vector d𝒍 directed along the axis of the current element d𝑙
in the same direction as the current. The magnitude of this vector is d𝑙. Since the
directions of the vectors 𝒋 and d𝒍 coincide, we can write the equation
𝒋 d𝑙 = 𝑗 d𝒍. (6.27)
Performing such a substitution in Eq. (6.26), we get
$$\mathrm{d}\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{Sj(\mathrm{d}\boldsymbol{l}\times\boldsymbol{r})}{r^3}.$$
Finally, taking into account that the product 𝑆 𝑗 gives the current 𝐼 in the wire, we
arrive at the final expression determining the magnetic induction of the field set up
by a current element of length d𝑙:
$$\mathrm{d}\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{I(\mathrm{d}\boldsymbol{l}\times\boldsymbol{r})}{r^3}. \tag{6.28}$$
We have derived Eq. (6.28) from Eq. (6.21). Equation (6.28) was actually established
experimentally before Eq. (6.21) was known. Moreover, the latter equation was
derived from Eq. (6.28).
In 1820, the French physicists Jean Biot (1774-1862) and Felix Savart (1791-1841)
studied the magnetic fields set up by currents flowing along thin wires of various shapes. The French
astronomer and mathematician Pierre Laplace (1749-1827) analysed the experimental
data obtained and found that the magnetic field of any current can be calculated
as the vector sum (superposition) of the fields set up by the separate elementary

Fig. 6.4

sections of the currents. Laplace obtained Eq. (6.28) for the magnetic induction
of the field set up by a current element of length d𝑙. In this connection, Eq. (6.28)
is called the Biot-Savart-Laplace law, or more briefly the Biot-Savart law. A
glance at Fig. 6.3 shows that the vector d𝑩 is directed at right angles to the plane
passing through d𝒍 and the point for which the field is being calculated so that
rotation about d𝒍 in the direction of d𝑩 is associated with d𝒍 by the right-hand
screw rule. The magnitude of d𝑩 is determined by the expression
$$\mathrm{d}B = \frac{\mu_0}{4\pi}\,\frac{I\,\mathrm{d}l\,\sin\alpha}{r^2}, \tag{6.29}$$
where 𝛼 is the angle between the vectors d𝒍 and 𝒓.
Let us use Eq. (6.28) to calculate the field of a line current, i.e., the field set up
by a current flowing through a thin straight wire of infinite length (Fig. 6.4). All
the vectors d𝑩 at a given point have the same direction (in our case beyond the
drawing). Therefore, addition of the vectors d𝑩 may be replaced with addition of
their magnitudes. The point for which we are calculating the magnetic induction is
at the distance 𝑏 from the wire.
Inspection of Fig. 6.4 shows that
$$r = \frac{b}{\sin\alpha}, \qquad \mathrm{d}l = \frac{r\,\mathrm{d}\alpha}{\sin\alpha} = \frac{b\,\mathrm{d}\alpha}{\sin^2\alpha}.$$
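Let us introduce these values into Eq. (6.29):
$$\mathrm{d}B = \frac{\mu_0}{4\pi}\,\frac{Ib\,\mathrm{d}\alpha\,\sin\alpha\,\sin^2\alpha}{b^2\sin^2\alpha} = \frac{\mu_0}{4\pi}\,\frac{I}{b}\,\sin\alpha\,\mathrm{d}\alpha.$$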

Fig. 6.5

The angle 𝛼 varies within the limits from 0 to 𝜋 for all the elements of an infinite
line current. Hence,
$$B = \int\mathrm{d}B = \frac{\mu_0}{4\pi}\,\frac{I}{b}\int_0^{\pi}\sin\alpha\,\mathrm{d}\alpha = \frac{\mu_0}{4\pi}\,\frac{2I}{b}.$$
Thus, the magnetic induction of the field of a line current is determined by the
formula
$$B = \frac{\mu_0}{4\pi}\,\frac{2I}{b}. \tag{6.30}$$
The magnetic induction lines of the field of a line current are a system of
concentric circles surrounding the wire (Fig. 6.5).
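
As a rough numerical check of Eq. (6.30), the contributions (6.29) of the elements of a long but finite straight wire can be summed directly. The Python sketch below is only an illustration: the function names, the midpoint summation, and the value taken for 𝜇0 (4𝜋 × 10⁻⁷ H/m) are assumptions made for the example and do not come from the text.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic constant, H/m (assumed classical SI value)

def line_current_field(current, b, half_length, steps=200_000):
    """Sum Eq. (6.29) over a straight wire running from -half_length to +half_length."""
    dl = 2.0 * half_length / steps
    total = 0.0
    for i in range(steps):
        x = -half_length + (i + 0.5) * dl  # midpoint of the current element
        r = math.hypot(x, b)               # distance from the element to the field point
        sin_alpha = b / r                  # sine of the angle between dl and r
        total += MU0 / (4 * math.pi) * current * dl * sin_alpha / r**2
    return total

def line_current_formula(current, b):
    """Closed form of Eq. (6.30) for an infinite line current."""
    return MU0 / (4 * math.pi) * 2.0 * current / b
```

When the half-length of the wire is much greater than 𝑏, the summed value should approach the closed form, because the distant elements contribute very little.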

6.5. The Lorentz Force

A charge moving in a magnetic field experiences a force which we shall call mag-
netic. The force is determined by the charge 𝑞, its velocity 𝒗, and the magnetic
induction 𝑩 at the point where the charge is at the moment of time being considered.
The simplest assumption is that the magnitude of the force 𝐹 is proportional to
each of the three quantities 𝑞, 𝑣, and 𝐵. In addition, 𝐹 can be expected to depend on
the mutual orientation of the vectors 𝒗 and 𝑩. The direction of the vector 𝑭 should
be determined by those of the vectors 𝒗 and 𝑩.
To “construct” the vector 𝑭 from the scalar 𝑞 and the vectors 𝒗 and 𝑩, let us find
the vector product of 𝒗 and 𝑩 and then multiply the result obtained by the scalar 𝑞.
The result is the expression
𝑞(𝒗 × 𝑩). (6.31)
It has been established experimentally that the force 𝑭 acting on a charge moving
in a magnetic field is determined by the formula
𝑭 = 𝑘𝑞(𝒗 × 𝑩), (6.32)

Fig. 6.6

where 𝑘 is a proportionality constant depending on the choice of the units for the
quantities in the formula.
It must be borne in mind that the reasoning which led us to expression (6.31)
must by no means be considered as the derivation of Eq. (6.32). This reasoning does
not have conclusive force. Its aim is to help us memorize Eq. (6.32). The correctness
of this equation can be established only experimentally.
We must note that Eq. (6.32) can be considered as a definition of the magnetic
induction 𝑩.
The unit of magnetic induction 𝑩, the tesla, is defined so that the proportionality
constant 𝑘 in Eq. (6.32) equals unity. Hence, in SI units, this equation
becomes
𝑭 = 𝑞(𝒗 × 𝑩). (6.33)
The magnitude of the magnetic force is
𝐹 = 𝑞𝑣𝐵 sin 𝛼, (6.34)
where 𝛼 is the angle between the vectors 𝒗 and 𝑩. It can be seen from Eq. (6.34) that
a charge moving along the lines of a magnetic field does not experience the action
of a magnetic force.
The magnetic force is directed at right angles to the plane containing the vectors
𝒗 and 𝑩. If the charge 𝑞 is positive, then the direction of the force coincides with
that of the vector 𝒗 × 𝑩. When 𝑞 is negative, the directions of the vectors 𝑭 and
𝒗 × 𝑩 are opposite (Fig. 6.6).
Since the magnetic force is always directed at right angles to the velocity of
a charged particle, it does no work on the particle. Hence, we cannot change the
energy of a charged particle by acting on it with a constant magnetic field.
The force exerted on a charged particle that is simultaneously in an electric and
a magnetic field is
𝑭 = 𝑞𝑬 + 𝑞(𝒗 × 𝑩). (6.35)
This expression was obtained from the results of experiments by the Dutch physicist
Hendrik Lorentz (1853-1928) and is called the Lorentz force.
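
Equation (6.35) and the statement that the magnetic force does no work can be illustrated with a few lines of Python. This is a minimal sketch; the helper names `cross`, `dot`, and `lorentz_force` are chosen here for illustration, and vectors are represented as plain 3-tuples.

```python
def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(q, E, v, B):
    """Eq. (6.35): F = qE + q(v x B), all quantities in SI units."""
    v_cross_b = cross(v, B)
    return tuple(q * (E[i] + v_cross_b[i]) for i in range(3))

# A negative charge moving along x in a field directed along z, with no electric field.
velocity = (1.0e5, 0.0, 0.0)
force = lorentz_force(-1.602e-19, (0.0, 0.0, 0.0), velocity, (0.0, 0.0, 0.5))
power = dot(force, velocity)  # rate of work done by the magnetic force
```

With the electric field set to zero, `power` vanishes because the force is perpendicular to the velocity.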

Fig. 6.7 Fig. 6.8

Assume that the charge 𝑞 is moving with the velocity 𝒗 parallel to a straight
infinite wire along which the current 𝐼 flows (Fig. 6.7).
According to Eqs. (6.30) and (6.34), the charge in this case experiences a magnetic
force whose magnitude is
$$F = qvB = qv\,\frac{\mu_0}{4\pi}\,\frac{2I}{b}, \tag{6.36}$$
where 𝑏 is the distance from the charge to the wire. The force is directed toward
the wire when the charge is positive if the directions of the current and motion of
the charge are the same, and away from the wire if these directions are opposite
(see Fig. 6.7). When the charge is negative, the direction of the force is reversed, the
other conditions being equal.
Let us consider two like point charges 𝑞1 and 𝑞2 moving along parallel straight
lines with the same velocity 𝑣 that is much smaller than 𝑐 (Fig. 6.8). When 𝑣 ≪ 𝑐,
the electric field does not virtually differ from the field of stationary charges (see
Sec. 6.3). Therefore, the magnitude of the electric force 𝐹e exerted on the charges
can be considered equal to
$$F_{\mathrm{e},1} = F_{\mathrm{e},2} = F_{\mathrm{e}} = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r^2}. \tag{6.37}$$
Equations (6.21) and (6.33) give us the following expression for the magnetic force
𝐹m exerted on the charges:
$$F_{\mathrm{m},1} = F_{\mathrm{m},2} = F_{\mathrm{m}} = \frac{\mu_0}{4\pi}\,\frac{q_1 q_2 v^2}{r^2} \tag{6.38}$$
(the position vector 𝒓 is perpendicular to 𝒗).
Let us find the ratio between the magnetic and electric forces. It follows from
Eqs. (6.37) and (6.38) that
$$\frac{F_{\mathrm{m}}}{F_{\mathrm{e}}} = \varepsilon_0\mu_0 v^2 = \frac{v^2}{c^2} \tag{6.39}$$

[see Eq. (6.15)]. We have obtained Eq. (6.39) on the assumption that 𝑣 ≪ 𝑐. This
ratio holds, however, for any 𝑣.
The forces 𝑭 e and 𝑭 m are directed oppositely. Figure 6.8 has been drawn for
like and positive charges. For like negative charges, the directions of the forces will
remain the same, while the directions of the vectors 𝑩1 and 𝑩2 will be reversed. For
unlike charges, the directions of the electric and magnetic forces will be the reverse
of those shown in the figure.
Inspection of Eq. (6.39) shows that the magnetic force is weaker than the Coulomb
one by a factor equal to the square of the ratio of the speed of the charge to that of
light. The explanation is that the magnetic interaction between moving charges is
a relativistic effect (see Sec. 6.7). Magnetism would disappear if the speed of light
were infinitely great.
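
To get a feeling for how small the ratio (6.39) is at everyday speeds, it can be evaluated numerically from Eqs. (6.37) and (6.38). The sketch below is illustrative only; the function names and the value taken for 𝜇0 are assumptions, and 𝜀0 is computed from 𝜇0 and 𝑐 through the relation 𝜀0𝜇0 = 1/𝑐² of Eq. (6.15).

```python
import math

C = 299_792_458.0          # speed of light in a vacuum, m/s
MU0 = 4e-7 * math.pi       # magnetic constant, H/m (assumed classical SI value)
EPS0 = 1.0 / (MU0 * C**2)  # electric constant from eps0 * mu0 = 1 / c^2

def electric_force(q1, q2, r):
    """Eq. (6.37): Coulomb force between two point charges."""
    return q1 * q2 / (4 * math.pi * EPS0 * r**2)

def magnetic_force(q1, q2, v, r):
    """Eq. (6.38): magnetic force between two charges moving side by side with speed v."""
    return MU0 / (4 * math.pi) * q1 * q2 * v**2 / r**2

def force_ratio(v):
    """Ratio F_m / F_e of Eq. (6.39); the charges and the distance cancel out."""
    return magnetic_force(1.0, 1.0, v, 1.0) / electric_force(1.0, 1.0, 1.0)
```

For speeds of the order of 10⁻³ m/s, which are typical of the ordered motion of carriers in a metal, the ratio is of the order of 10⁻²³.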

6.6. Ampere’s Law

If a wire carrying a current is in a magnetic field, then each of the current carriers
experiences the force
𝑭 = 𝑒[(𝒗 + 𝒖) × 𝑩] (6.40)
[see Eq. (6.33)]. Here, 𝒗 is the velocity of chaotic motion of a carrier, and 𝒖 is the
velocity of ordered motion. The action of this force is transferred from a current
carrier to the conductor along which it is moving. As a result, a force acts on a wire
with current in a magnetic field.
Let us find the value of the force d𝑭 exerted on an element of a wire of length
d𝑙. We shall average Eq. (6.40) over the current carriers contained in the element d𝑙:
⟨𝑭⟩ = 𝑒[(⟨𝒗⟩ + ⟨𝒖⟩) × 𝑩] = 𝑒(⟨𝒖⟩ × 𝑩) (6.41)
(𝑩 is the magnetic induction at the place where the element d𝑙 is). The wire element
contains 𝑛𝑆 d𝑙 carriers (𝑛 is the number of carriers in unit volume, and 𝑆 is the
cross-sectional area of the wire at the given place). Multiplying Eq. (6.41) by the
number of carriers, we find the force we are interested in:
d𝑭 = ⟨𝑭⟩ 𝑛𝑆 d𝑙 = [(𝑛𝑒⟨𝒖⟩) × 𝑩]𝑆 d𝑙.
Taking into account that 𝑛𝑒⟨𝒖⟩ is the current density 𝒋, and 𝑆 d𝑙 gives the volume
of a wire element d𝑉 , we can write
d𝑭 = ( 𝒋 × 𝑩) d𝑉 . (6.42)
Hence, we can obtain an expression for the density of the force, i.e., for the force
acting on unit volume of the conductor
𝑭 u.v = 𝒋 × 𝑩. (6.43)

Fig. 6.9 Fig. 6.10

Let us write Eq. (6.42) in the form


d𝑭 = ( 𝒋 × 𝑩)𝑆 d𝑙.
Replacing in accordance with Eq. (6.27) 𝒋𝑆 d𝑙 with 𝑗𝑆 d𝒍 = 𝐼 d𝒍, we arrive at the
equation
d𝑭 = 𝐼 (d𝒍 × 𝑩). (6.44)
This equation determines the force exerted on a current element d𝒍 in a magnetic
field. Equation (6.44) was established experimentally by Ampere and is called Am-
pere’s law.
We have obtained Ampere’s law on the basis of Eq. (6.33) for the magnetic force.
The expression for the magnetic force was actually obtained from the experimentally
established equation (6.44).
The magnitude of the force (6.44) is calculated by the equation
d𝐹 = 𝐼 𝐵 d𝑙 sin 𝛼, (6.45)
where 𝛼 is the angle between the vectors d𝒍 and 𝑩 (Fig. 6.9). The force is normal to
the plane containing the vectors d𝒍 and 𝑩.
Let us use Ampere’s law to calculate the force of interaction between two parallel
infinitely long line currents in a vacuum. If the distance between the currents is
𝑏 (Fig. 6.10), then each element of the current 𝐼2 will be in a magnetic field whose
induction is 𝐵1 = (𝜇0 /4𝜋) (2𝐼1 /𝑏) [see Eq. (6.30)]. The angle 𝛼 between the elements
of the current 𝐼2 and the vector 𝑩1 is a right one. Hence, according to Eq. (6.45), the
force acting on unit length of the current 𝐼2 is
$$F_{21,\mathrm{u}} = I_2 B_1 = \frac{\mu_0}{4\pi}\,\frac{2I_1 I_2}{b}. \tag{6.46}$$

Fig. 6.11 Fig. 6.12

Equation (6.46) coincides with Eq. (6.2).


We get a similar equation for the force 𝐹12,u exerted on unit length of the current
𝐼1 . It is easy to see that when the currents flow in the same direction they attract
each other, and when they flow in opposite directions they repel each other.
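
Equation (6.46) is easy to evaluate numerically. In the sketch below, the function names and the sign convention (currents of the same sign flow in the same direction) are assumptions introduced for the illustration, as is the value taken for 𝜇0.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic constant, H/m (assumed classical SI value)

def force_per_unit_length(i1, i2, b):
    """Magnitude of Eq. (6.46) in N/m for currents i1, i2 (A) a distance b (m) apart."""
    return MU0 / (4 * math.pi) * 2.0 * abs(i1 * i2) / b

def interaction(i1, i2):
    """Currents of the same sign are taken to flow in the same direction."""
    if i1 * i2 > 0:
        return "attraction"
    if i1 * i2 < 0:
        return "repulsion"
    return "none"
```

Two currents of 1 A each at a distance of 1 m give about 2 × 10⁻⁷ N per metre of length.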

6.7. Magnetism as a Relativistic Effect

There is a deep relation between electricity and magnetism. On the basis of the
postulates of the theory of relativity and of the invariance of an electric charge, we
can show that the magnetic interaction of charges and currents is a corollary of
Coulomb’s law. We shall show this on the example of a charge moving parallel to
an infinite line current with the velocity 𝑣0 ¹ (Fig. 6.11).
According to Eq. (6.36), the magnetic force acting on a charge in the case being
considered is
$$F = qv_0\,\frac{\mu_0}{4\pi}\,\frac{2I}{b} \tag{6.47}$$
(the meaning of the symbols is clear from Fig. 6.11). The force is directed toward the
conductor carrying the current (𝑞 > 0). Before commencing to derive Eq. (6.47) for
the force on the basis of Coulomb’s law and relativistic relations, let us consider
the following effect. Assume that we have an infinite linear train of point charges
of an identical magnitude 𝑒 spaced a very small distance 𝑙0 apart (Fig. 6.12). Owing
to the smallness of 𝑙0 , we can speak of the linear density of the charges 𝜆0 which
obviously is
$$\lambda_0 = \frac{e}{l_0}. \tag{6.48}$$
Let us bring the charges into motion along the train with the identical velocity 𝑢.
The distance between the charges will therefore diminish and become equal to
$$l = l_0\left(1 - \frac{u^2}{c^2}\right)^{1/2}$$

¹We have used the symbol 𝑣0 for the velocity of a charge to make the notation similar to that in
Chap. 8 of Vol. I.

Fig. 6.13

[see Eq. 8.19 of Vol. I]. The magnitude of the charges owing to their invariance,
however, remains the same. As a result, the linear density of the charges observed
in the reference frame relative to which the charges are moving will change and
become equal to
$$\lambda = \frac{e}{l} = \frac{\lambda_0}{\sqrt{1 - u^2/c^2}}. \tag{6.49}$$
Now let us consider in the reference frame K two infinite trains formed by
charges of the same magnitude, but of opposite signs, moving in opposite directions
with the same velocity 𝑢 and virtually coinciding with each other (Fig. 6.13a). The
combination of these trains is equivalent to an infinite line current having the value
$$I = 2\lambda u = \frac{2\lambda_0 u}{\sqrt{1 - u^2/c^2}}, \tag{6.50}$$
where 𝜆 is the quantity determined by Eq. (6.49). The total linear density of the
charges of a train equals zero, therefore an electric field is absent. The charge 𝑞
experiences a magnetic force whose magnitude according to Eqs. (6.47) and (6.50) is
$$F = qv_0\,\frac{\mu_0}{4\pi}\,\frac{4\lambda_0 u}{b\sqrt{1 - u^2/c^2}}. \tag{6.51}$$

Let us pass over to the reference frame K′ relative to which the charge 𝑞 is at
rest (Fig. 6.13b). In this frame, the charge 𝑞 also experiences a force (let us denote
it by 𝐹′). This force cannot be of a magnetic origin, however, because the charge
𝑞 is stationary. The force 𝐹′ has a purely electrical origin. It appears because the
linear densities of the positive and negative charges in the trains are now different
(we shall see below that the density of the negative charges is greater). The surplus
negative charge distributed over a train sets up an electric field that acts on the
positive charge 𝑞 with the force 𝐹′ directed toward the train (see Fig. 6.13b).
Let us calculate the force 𝐹′ and convince ourselves that it “equals” the force
𝐹 determined by Eq. (6.51). We have taken the word “equals” in quotation marks
because force is not an invariant quantity. Upon transition from one inertial reference
frame to another, the force transforms according to a quite complicated law.
In a particular case, when the force 𝑭′ is perpendicular to the relative velocity of
the frames K and K′ (𝑭′ ⊥ 𝒗0), the transformation has the form
$$\boldsymbol{F} = \frac{\boldsymbol{F}'\sqrt{1 - v_0^2/c^2} + \boldsymbol{v}_0(\boldsymbol{F}'\cdot\boldsymbol{v}')/c^2}{1 + (\boldsymbol{v}_0\cdot\boldsymbol{v}')/c^2}$$
(𝒗′ is the velocity of a particle experiencing the force 𝑭′ and measured in the frame
K′). If 𝒗′ = 0 (which occurs in the problem we are considering), the formula for
transformation of the force is as follows:
$$\boldsymbol{F} = \boldsymbol{F}'\left(1 - \frac{v_0^2}{c^2}\right)^{1/2}.$$
A glance at this formula shows that the force perpendicular to 𝒗0 exerted on a
particle at rest in the frame K’ is also perpendicular to the vector 𝒗0 in the frame K.
The magnitude of the force in this case, however, is transformed by the formula
$$F = F'\left(1 - \frac{v_0^2}{c^2}\right)^{1/2}. \tag{6.52}$$
The densities of the charges in the positive and negative trains measured in the
frame K′ have the values [see Eq. (6.49)]
$$\lambda'_+ = \frac{\lambda_0}{\sqrt{1 - u_+^{\prime 2}/c^2}}, \qquad \lambda'_- = -\frac{\lambda_0}{\sqrt{1 - u_-^{\prime 2}/c^2}}, \tag{6.53}$$


where 𝑢′+ and 𝑢′− are the velocities of the charges +𝑒 and −𝑒 measured in the frame
K′. Upon a transition from the frame K to the frame K′, the projection of the velocity
of a particle onto the direction 𝑥 coinciding with the direction of 𝒗0 is transformed
by the equation
$$u'_x = \frac{u_x - v_0}{1 - u_x v_0/c^2}$$
[see Eqs. (8.28) of Vol. I; we have substituted 𝑢 and 𝑢′ for 𝑣 and 𝑣′]. For the charges
+𝑒, the component 𝑢𝑥 equals 𝑢, for the charges −𝑒 it equals −𝑢 (see Fig. 6.13a). Hence,
$$u'_{x+} = \frac{u - v_0}{1 - uv_0/c^2}, \qquad u'_{x-} = \frac{-u - v_0}{1 + uv_0/c^2}.$$
Since the remaining projections equal zero, we get
$$u'_+ = \frac{|u - v_0|}{1 - uv_0/c^2}, \qquad u'_- = \frac{u + v_0}{1 + uv_0/c^2}. \tag{6.54}$$
To simplify our calculations, let us pass over to relative velocities:
$$\beta_0 = \frac{v_0}{c}, \qquad \beta = \frac{u}{c}, \qquad \beta'_+ = \frac{u'_+}{c}, \qquad \beta'_- = \frac{u'_-}{c}.$$

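Equations (6.53) and (6.54) therefore acquire the form
$$\lambda'_+ = \frac{\lambda_0}{\sqrt{1 - \beta_+^{\prime 2}}}, \qquad \lambda'_- = -\frac{\lambda_0}{\sqrt{1 - \beta_-^{\prime 2}}}, \tag{6.55}$$
$$\beta'_+ = \frac{|\beta - \beta_0|}{1 - \beta\beta_0}, \qquad \beta'_- = \frac{\beta + \beta_0}{1 + \beta\beta_0}. \tag{6.56}$$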
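With account taken of these equations, we get the following expression for the total
density of the charges:
$$\begin{aligned} \lambda' = \lambda'_+ + \lambda'_- &= \frac{\lambda_0}{\left[1 - \left(\dfrac{\beta - \beta_0}{1 - \beta\beta_0}\right)^2\right]^{1/2}} - \frac{\lambda_0}{\left[1 - \left(\dfrac{\beta + \beta_0}{1 + \beta\beta_0}\right)^2\right]^{1/2}} \\ &= \frac{\lambda_0(1 - \beta\beta_0)}{\sqrt{(1 - \beta\beta_0)^2 - (\beta - \beta_0)^2}} - \frac{\lambda_0(1 + \beta\beta_0)}{\sqrt{(1 + \beta\beta_0)^2 - (\beta + \beta_0)^2}}. \end{aligned}$$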
It is easy to see that
$$(1 - \beta\beta_0)^2 - (\beta - \beta_0)^2 = (1 + \beta\beta_0)^2 - (\beta + \beta_0)^2 = (1 - \beta_0^2)(1 - \beta^2).$$

Consequently,
$$\lambda' = \frac{-2\lambda_0\beta\beta_0}{\sqrt{(1 - \beta_0^2)(1 - \beta^2)}} = \frac{-2\lambda_0 uv_0}{c^2\sqrt{1 - v_0^2/c^2}\,\sqrt{1 - u^2/c^2}}. \tag{6.57}$$

In accordance with Eq. (1.122), an infinitely long filament carrying a charge of
density 𝜆′ sets up a field whose strength at the distance 𝑏 from the filament is
$$E' = \frac{1}{2\pi\varepsilon_0}\,\frac{\lambda'}{b}.$$
In this field, the charge 𝑞 experiences the force
$$F' = qE' = \frac{q\lambda'}{2\pi\varepsilon_0 b}.$$
Introduction of Eq. (6.57) yields (we have omitted the minus sign)
$$F' = \frac{q\lambda_0 uv_0}{\pi\varepsilon_0 bc^2\sqrt{1 - v_0^2/c^2}\,\sqrt{1 - u^2/c^2}} = qv_0\,\frac{\mu_0}{4\pi}\,\frac{4\lambda_0 u}{b\sqrt{1 - u^2/c^2}}\,\frac{1}{\sqrt{1 - v_0^2/c^2}} \tag{6.58}$$
[we remind our reader that 𝜇0 = 1/(𝜀0𝑐²); see Eq. (6.15)].
The expression obtained differs from Eq. (6.51) only in the factor 1/√(1 − 𝑣0²/𝑐²).


We can, therefore, write that
$$F = F'\left(1 - \frac{v_0^2}{c^2}\right)^{1/2},$$
where 𝐹 is the force determined by Eq. (6.51), and 𝐹′ is the force determined by
Eq. (6.58). A comparison with Eq. (6.52) shows that 𝐹 and 𝐹′ are the values of the
same force determined in the frames K and K′.
We must note that in a frame moving relative to the frame K with a velocity
differing from the velocity 𝑣0 of the charge, the force exerted on the charge
would consist of both electric and magnetic forces.
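
The chain of Eqs. (6.51) to (6.58) can also be checked numerically. The sketch below computes the magnetic force in the frame K and the electric force in the frame K′ independently and leaves their comparison through Eq. (6.52) to the caller. The function names are assumptions introduced for illustration. To avoid loss of precision when two nearly equal charge densities are subtracted, it is convenient to call the functions in units where 𝑐 = 1 and 𝜀0 = 𝜇0 = 1, so that 𝜀0𝜇0𝑐² = 1 still holds; this choice of units is likewise an assumption of the example.

```python
import math

def force_in_K(q, v0, lam0, u, b, c, mu0):
    """Magnetic force of Eq. (6.51) on the charge q moving with velocity v0 in the frame K."""
    return q * v0 * mu0 / (4 * math.pi) * 4 * lam0 * u / (b * math.sqrt(1 - u**2 / c**2))

def force_in_K_prime(q, v0, lam0, u, b, c, eps0):
    """Electric force on the charge q at rest in the frame K', following Eqs. (6.53)-(6.54)."""
    # Speeds of the positive and negative trains measured in K', Eq. (6.54).
    u_plus = abs(u - v0) / (1 - u * v0 / c**2)
    u_minus = (u + v0) / (1 + u * v0 / c**2)
    # Linear charge densities of the two trains in K', Eq. (6.53).
    lam_plus = lam0 / math.sqrt(1 - u_plus**2 / c**2)
    lam_minus = -lam0 / math.sqrt(1 - u_minus**2 / c**2)
    lam_total = lam_plus + lam_minus
    # Field of an infinite charged filament acting on the stationary charge.
    return q * abs(lam_total) / (2 * math.pi * eps0 * b)
```

For any 𝑢 and 𝑣0 smaller than 𝑐, the first value should equal the second multiplied by √(1 − 𝑣0²/𝑐²), in agreement with Eq. (6.52).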
The results we have obtained signify that an electric and a magnetic field are
inseparably linked with each other and form a single electromagnetic field. Upon
a special choice of the reference frame, a field may be either purely electric or
purely magnetic. Relative to other reference frames, however, the same field is a
combination of an electric and a magnetic field.
In different inertial reference frames, the electric and magnetic fields of the
same collection of charges are different. A derivation beyond the scope of a general
course in physics leads to the following equations for the transformation of fields
when passing over from a reference frame K to a reference frame K’ moving relative
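to it with the velocity 𝒗0 :
$$\begin{aligned} E'_x &= E_x, & E'_y &= \frac{E_y - v_0 B_z}{\sqrt{1 - \beta^2}}, & E'_z &= \frac{E_z + v_0 B_y}{\sqrt{1 - \beta^2}}, \\ B'_x &= B_x, & B'_y &= \frac{B_y + (v_0/c^2)E_z}{\sqrt{1 - \beta^2}}, & B'_z &= \frac{B_z - (v_0/c^2)E_y}{\sqrt{1 - \beta^2}}. \end{aligned} \tag{6.59}$$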
Here, 𝐸𝑥, 𝐸𝑦, 𝐸𝑧, 𝐵𝑥, 𝐵𝑦, 𝐵𝑧 are the components of the vectors 𝑬 and 𝑩 characterizing
an electromagnetic field in the frame K, similar primed symbols are the
components of the vectors 𝑬′ and 𝑩′ characterizing the field in the frame K′. The
Greek letter 𝛽 stands for the ratio 𝑣0/𝑐.
Resolving the vectors 𝑬 and 𝑩, and also 𝑬′ and 𝑩′, into their components
parallel to the vector 𝒗0 (and, consequently, to the axes 𝑥 and 𝑥′) and perpendicular
to this vector (i.e., representing, for example, 𝑬 in the form 𝑬 = 𝑬∥ + 𝑬⊥, etc.), we
can write Eqs. (6.59) in the vector form:
$$\begin{aligned} \boldsymbol{E}'_\parallel &= \boldsymbol{E}_\parallel, & \boldsymbol{E}'_\perp &= \frac{\boldsymbol{E}_\perp + (\boldsymbol{v}_0\times\boldsymbol{B}_\perp)}{\sqrt{1 - \beta^2}}, \\ \boldsymbol{B}'_\parallel &= \boldsymbol{B}_\parallel, & \boldsymbol{B}'_\perp &= \frac{\boldsymbol{B}_\perp - (1/c^2)(\boldsymbol{v}_0\times\boldsymbol{E}_\perp)}{\sqrt{1 - \beta^2}}. \end{aligned} \tag{6.60}$$

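In the Gaussian system of units, Eqs. (6.60) have the form
$$\begin{aligned} \boldsymbol{E}'_\parallel &= \boldsymbol{E}_\parallel, & \boldsymbol{E}'_\perp &= \frac{\boldsymbol{E}_\perp + (1/c)(\boldsymbol{v}_0\times\boldsymbol{B}_\perp)}{\sqrt{1 - \beta^2}}, \\ \boldsymbol{B}'_\parallel &= \boldsymbol{B}_\parallel, & \boldsymbol{B}'_\perp &= \frac{\boldsymbol{B}_\perp - (1/c)(\boldsymbol{v}_0\times\boldsymbol{E}_\perp)}{\sqrt{1 - \beta^2}}. \end{aligned} \tag{6.61}$$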
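When 𝛽 ≪ 1 (i.e., 𝑣0 ≪ 𝑐), Eqs. (6.60) are simplified as follows:
$$\begin{aligned} \boldsymbol{E}'_\parallel &= \boldsymbol{E}_\parallel, & \boldsymbol{E}'_\perp &= \boldsymbol{E}_\perp + \boldsymbol{v}_0\times\boldsymbol{B}_\perp, \\ \boldsymbol{B}'_\parallel &= \boldsymbol{B}_\parallel, & \boldsymbol{B}'_\perp &= \boldsymbol{B}_\perp - \frac{1}{c^2}(\boldsymbol{v}_0\times\boldsymbol{E}_\perp). \end{aligned}$$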

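Adding these equations in pairs, we get
$$\begin{aligned} \boldsymbol{E}' &= \boldsymbol{E}'_\parallel + \boldsymbol{E}'_\perp = \boldsymbol{E}_\parallel + \boldsymbol{E}_\perp + (\boldsymbol{v}_0\times\boldsymbol{B}_\perp) = \boldsymbol{E} + (\boldsymbol{v}_0\times\boldsymbol{B}_\perp), \\ \boldsymbol{B}' &= \boldsymbol{B}'_\parallel + \boldsymbol{B}'_\perp = \boldsymbol{B}_\parallel + \boldsymbol{B}_\perp - \frac{1}{c^2}(\boldsymbol{v}_0\times\boldsymbol{E}_\perp) = \boldsymbol{B} - \frac{1}{c^2}(\boldsymbol{v}_0\times\boldsymbol{E}_\perp). \end{aligned} \tag{6.62}$$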
Since the vectors 𝒗0 and 𝑩∥ are collinear, their vector product equals zero.
Hence, 𝒗0 × 𝑩 = 𝒗0 × 𝑩∥ + 𝒗0 × 𝑩⊥ = 𝒗0 × 𝑩⊥. Similarly, 𝒗0 × 𝑬 = 𝒗0 × 𝑬⊥. With
this taken into account, Eqs. (6.62) can be given the form
$$\boldsymbol{E}' = \boldsymbol{E} + \boldsymbol{v}_0\times\boldsymbol{B}, \qquad \boldsymbol{B}' = \boldsymbol{B} - \frac{1}{c^2}(\boldsymbol{v}_0\times\boldsymbol{E}). \tag{6.63}$$
Fields are transformed by means of these equations if the relative velocity of the
reference frames 𝑣0 is much smaller than the speed of light in a vacuum 𝑐 (𝑣0 ≪ 𝑐).
Equations (6.63) acquire the following form in the Gaussian system of units:
$$\boldsymbol{E}' = \boldsymbol{E} + \frac{1}{c}(\boldsymbol{v}_0\times\boldsymbol{B}), \qquad \boldsymbol{B}' = \boldsymbol{B} - \frac{1}{c}(\boldsymbol{v}_0\times\boldsymbol{E}). \tag{6.64}$$
In the example in the frame K considered at the beginning of this section, in
which the charge 𝑞 travelled with the velocity 𝒗0 parallel to a current-carrying wire,
there was only the magnetic field 𝑩⊥ perpendicular to 𝒗0; the components 𝑩∥, 𝑬⊥,
and 𝑬∥ equalled zero. According to Eqs. (6.60), in the frame K′, in which the charge
𝑞 is at rest (this frame travels relative to K with the velocity 𝒗0), the component 𝑩′⊥
equal to 𝑩⊥/√(1 − 𝛽²) is observed and, in addition, the perpendicular component of
the electric field 𝑬′⊥ = (𝒗0 × 𝑩⊥)/√(1 − 𝛽²).
In the frame K, the charge experiences the force
𝑭 = 𝑞(𝒗0 × 𝑩⊥ ). (6.65)
Since the charge 𝑞 is at rest in the frame K′, it experiences in this frame only the
electric force
$$\boldsymbol{F}' = q\boldsymbol{E}'_\perp = \frac{q(\boldsymbol{v}_0\times\boldsymbol{B}_\perp)}{\sqrt{1 - \beta^2}}. \tag{6.66}$$
A comparison of Eqs. (6.65) and (6.66) yields 𝑭 = 𝑭′√(1 − 𝛽²), which coincides with
Eq. (6.52).
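
The component form of the field transformation in SI units can be written as a short function. The sketch below assumes, as in the text, that the frame K′ moves along the 𝑥-axis of the frame K with the velocity 𝑣0; the function name and the tuple representation of the fields are choices made for this illustration.

```python
import math

def transform_fields(E, B, v0, c):
    """Transform E and B (3-tuples, SI units) from frame K to frame K' moving along x with speed v0."""
    gamma = 1.0 / math.sqrt(1.0 - (v0 / c) ** 2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    # Components parallel to v0 are unchanged; perpendicular ones are mixed.
    E_prime = (Ex, gamma * (Ey - v0 * Bz), gamma * (Ez + v0 * By))
    B_prime = (Bx, gamma * (By + v0 * Ez / c**2), gamma * (Bz - v0 * Ey / c**2))
    return E_prime, B_prime

# A purely magnetic field in K, viewed from a frame moving with v0 = 0.6 c (units with c = 1).
E_prime, B_prime = transform_fields((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 0.6, 1.0)
```

In this example an electric field perpendicular to both 𝒗0 and 𝑩 appears in the frame K′, which is the field responsible for the force on the stationary charge.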

6.8. Current Loop in a Magnetic Field

Let us see how a loop carrying a current behaves in a magnetic field. We shall begin
with a homogeneous field (𝑩 = constant). According to Eq. (6.44), a loop element d𝒍
experiences the force
d𝑭 = 𝐼 (d𝒍 × 𝑩). (6.67)
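The resultant of such forces is
$$\boldsymbol{F} = \oint I\,(\mathrm{d}\boldsymbol{l}\times\boldsymbol{B}). \tag{6.68}$$
Putting the constant quantities 𝐼 and 𝑩 outside the integral, we get
$$\boldsymbol{F} = I\left[\left(\oint\mathrm{d}\boldsymbol{l}\right)\times\boldsymbol{B}\right].$$
The integral ∮ d𝒍 equals zero, therefore, 𝑭 = 0. Thus, the resultant force exerted on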
a current loop in a homogeneous magnetic field equals zero. This holds for loops of
any shape (including non-planar ones) with an arbitrary arrangement of the loop
relative to the direction of the field. Only homogeneity of the field is essential for
the resultant force to equal zero.
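
The vanishing of the resultant force can be illustrated for a closed loop made of straight segments, for which Eq. (6.68) reduces to a finite sum. The sketch below is illustrative; the function names and the representation of the loop as a list of vertices are assumptions.

```python
def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def net_force(vertices, current, B):
    """Sum dF = I (dl x B) over the straight segments of a closed polygonal loop."""
    total = [0.0, 0.0, 0.0]
    n = len(vertices)
    for k in range(n):
        start, end = vertices[k], vertices[(k + 1) % n]  # the last segment closes the loop
        dl = [end[i] - start[i] for i in range(3)]
        dF = cross(dl, B)
        for i in range(3):
            total[i] += current * dF[i]
    return tuple(total)

# A non-planar loop in a homogeneous field of arbitrary direction.
loop = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (1.0, 2.0, 0.0), (0.0, 1.0, 1.0)]
force = net_force(loop, 3.0, (0.3, -0.2, 0.7))
```

Because the segment vectors of a closed loop add up to zero, every component of `force` vanishes to within rounding error, whatever the shape of the loop.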
In the following, we shall limit ourselves to a consideration of plane loops. Let
us calculate the resultant torque set up by the forces (6.67) applied to a loop. Since
the sum of these forces equals zero in a homogeneous field, the resultant torque
relative to any point will be the same. Indeed, the resultant torque relative to point
O is determined by the expression
$$\boldsymbol{T} = \int(\boldsymbol{r}\times\mathrm{d}\boldsymbol{F}),$$
where 𝒓 is the position vector drawn from point O to the point of application of
the force d𝑭. Let us take point O′ displaced relative to O by the vector 𝒃. Hence,
𝒓 = 𝒃 + 𝒓′, and accordingly 𝒓′ = 𝒓 − 𝒃. Therefore, the resultant torque relative to
point O′ is
$$\boldsymbol{T}' = \int(\boldsymbol{r}'\times\mathrm{d}\boldsymbol{F}) = \int([\boldsymbol{r} - \boldsymbol{b}]\times\mathrm{d}\boldsymbol{F}) = \int(\boldsymbol{r}\times\mathrm{d}\boldsymbol{F}) - \int(\boldsymbol{b}\times\mathrm{d}\boldsymbol{F}) = \boldsymbol{T} - \boldsymbol{b}\times\int\mathrm{d}\boldsymbol{F} = \boldsymbol{T},$$
because ∫ d𝑭 = 0. The torques calculated relative to two arbitrarily taken points O and O′
were found to coincide. We, thus, conclude that the torque does not depend on the
selection of the point relative to which it is taken (compare with a couple of forces).

Fig. 6.14

Let us consider an arbitrary plane current loop in a homogeneous magnetic
field 𝑩. Assume that the loop is oriented so that a positive normal to the loop 𝒏̂ is at
right angles to the vector 𝑩 (Fig. 6.14). A normal is called positive if its direction is
associated with that of the current in the loop by the right-hand screw rule.
Let us divide the area of the loop into narrow strips of width d𝑦 parallel to
the direction of the vector 𝑩 (see Fig. 6.14a; Fig. 6.14b is an enlarged view of one
of these strips). The force d𝑭1 directed beyond the drawing is exerted on the
loop element d𝒍1 enclosing the strip at the left. The magnitude of this force is
d𝐹1 = 𝐼𝐵 d𝑙1 sin 𝛼1 = 𝐼𝐵 d𝑦 (see Fig. 6.14b). The force d𝑭2 directed toward us is
exerted on the loop element d𝒍2 enclosing the strip at the right. The magnitude of
this force is d𝐹2 = 𝐼𝐵 d𝑙2 sin 𝛼2 = 𝐼𝐵 d𝑦.
The result we have obtained signifies that the forces applied to opposite loop
elements d𝒍1 and d𝒍2 form a couple whose torque is
d𝑇 = 𝐼𝐵𝑥 d𝑦 = 𝐼𝐵 d𝑆
(d𝑆 is the area of a strip). A glance at Fig. 6.14 shows that the vector d𝑻 is perpendicular
to the vectors 𝒏̂ and 𝑩 and, consequently, can be written in the form
d𝑻 = 𝐼(𝒏̂ × 𝑩) d𝑆.
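Summation of this equation over all the strips yields the torque acting on the loop:
$$\boldsymbol{T} = \int I(\hat{\boldsymbol{n}}\times\boldsymbol{B})\,\mathrm{d}S = I(\hat{\boldsymbol{n}}\times\boldsymbol{B})\int\mathrm{d}S = I(\hat{\boldsymbol{n}}\times\boldsymbol{B})\,S \tag{6.69}$$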
(the field is assumed to be homogeneous, therefore, the product 𝒏ˆ × 𝑩 is the same
for all the strips and can be put outside the integral). The quantity 𝑆 in Eq. (6.69) is
the area of the loop.
Equation (6.69) can be written in the form
𝑻 = (𝐼𝑆𝒏̂) × 𝑩. (6.70)

Fig. 6.15

This equation is similar to Eq. (1.58) determining the torque exerted on an electric
dipole in an electric field. The analogue of 𝑬 in Eq. (6.70) is the vector 𝑩, and that
of the electric dipole moment 𝒑 is the expression 𝐼𝑆𝒏̂. This served as the grounds
to call the quantity
𝒑m = 𝐼𝑆𝒏̂ (6.71)
the magnetic dipole moment of a current loop. The direction of the vector 𝒑m
coincides with that of a positive normal to the loop.
Using the notation of Eq. (6.71), we can write Eq. (6.70) as follows:
𝑻 = 𝒑m × 𝑩 (𝒑m ⊥ 𝑩). (6.72)
Now, let us assume that the direction of the vector 𝑩 coincides with that of a
positive normal to the loop 𝒏̂ and, therefore, with that of the vector 𝒑m too (Fig. 6.15).
In this case, the forces exerted on different elements of the loop are in one plane,
that of the loop. The force exerted on the loop element d𝒍 is determined by Eq. (6.67).
Let us calculate the resultant torque produced by such forces relative to point O in
the plane of the loop:
$$\boldsymbol{T} = \int\mathrm{d}\boldsymbol{T} = \int(\boldsymbol{r}\times\mathrm{d}\boldsymbol{F}) = I\oint[\boldsymbol{r}\times(\mathrm{d}\boldsymbol{l}\times\boldsymbol{B})]$$
(𝒓 is the position vector drawn from point O to the element d𝒍). Let us transform
the integrand by means of Eq. (1.35) of Vol. I. The result is
$$\boldsymbol{T} = I\left[\oint(\boldsymbol{r}\cdot\boldsymbol{B})\,\mathrm{d}\boldsymbol{l} - \boldsymbol{B}\oint(\boldsymbol{r}\cdot\mathrm{d}\boldsymbol{l})\right].$$

The first integral equals zero because the vectors 𝒓 and 𝑩 are mutually perpendicular.
The scalar product inside the second integral is 𝑟 d𝑟 = d(𝑟²/2). The
second integral can, therefore, be written in the form
$$\frac{1}{2}\,\boldsymbol{B}\oint\mathrm{d}(r^2).$$

Fig. 6.16

The total differential of the function 𝑟² is inside the integral. The sum of the
increments of a function along a closed path is zero. Hence, the second addend in
the expression for 𝑻 is zero too. We have, thus, proved that the resultant torque 𝑻
relative to any point O in the plane of the loop is zero. The resultant torque relative
to all other points has the same value (see above).
Thus, when the vectors 𝒑m and 𝑩 have the same direction, the magnetic forces
exerted on separate portions of a loop do not tend to turn the loop nor shift it from
its position. They only tend to stretch the loop in its plane. If the vectors 𝒑m and 𝑩
have opposite directions, the magnetic forces tend to compress the loop.
Assume that the directions of the vectors 𝒑m and 𝑩 form an arbitrary angle
𝛼 (Fig. 6.16). Let us resolve the magnetic induction 𝑩 into two components: 𝑩∥
parallel to the vector 𝒑m and 𝑩⊥ perpendicular to it, and consider the action of
each component separately. The component 𝑩∥ will set up forces stretching or
compressing the loop. The component 𝑩⊥ whose magnitude is 𝐵 sin 𝛼 will lead to
the appearance of a torque that can be calculated by Eq. (6.72):
𝑻 = 𝒑m × 𝑩 ⊥ .
Inspection of Fig. 6.16 shows that
𝒑m × 𝑩⊥ = 𝒑m × 𝑩.
Consequently, in the most general case, the torque exerted on a plane current loop
in a homogeneous magnetic field is determined by the equation
𝑻 = 𝒑m × 𝑩. (6.73)
The magnitude of the vector 𝑻 is
𝑇 = 𝑝m 𝐵 sin 𝛼. (6.74)
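
Equation (6.73) can be compared with a direct summation of the torques 𝒓 × d𝑭 over the elements of a circular loop. The sketch below approximates the ring by a regular polygon lying in the 𝑥𝑦-plane with the current flowing counter-clockwise when viewed from the side of positive 𝑧, so that the positive normal is directed along 𝑧. The function names, the polygonal approximation, and the choice of the centre of the ring as the reference point are assumptions of this illustration.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torque_on_ring(current, radius, B, segments=2000):
    """Sum r x dF with dF = I (dl x B) over a polygon inscribed in the ring."""
    total = [0.0, 0.0, 0.0]
    for k in range(segments):
        phi0 = 2 * math.pi * k / segments
        phi1 = 2 * math.pi * (k + 1) / segments
        p0 = (radius * math.cos(phi0), radius * math.sin(phi0), 0.0)
        p1 = (radius * math.cos(phi1), radius * math.sin(phi1), 0.0)
        dl = tuple(p1[i] - p0[i] for i in range(3))
        mid = tuple(0.5 * (p0[i] + p1[i]) for i in range(3))  # point of application of dF
        dF = tuple(current * c for c in cross(dl, B))
        dT = cross(mid, dF)
        for i in range(3):
            total[i] += dT[i]
    return tuple(total)

def torque_formula(current, radius, B):
    """Eq. (6.73): T = p_m x B with p_m = I S n directed along the z-axis."""
    p_m = (0.0, 0.0, current * math.pi * radius**2)
    return cross(p_m, B)
```

The two results should agree to within the small difference between the area of the polygon and that of the circle.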
To increase the angle 𝛼 between the vectors 𝒑m and 𝑩 by d𝛼, the following work
must be done against the forces exerted on a loop in a magnetic field:
d𝐴 = 𝑇 d𝛼 = 𝑝m 𝐵 sin 𝛼 d𝛼. (6.75)

Fig. 6.17

Upon turning to its initial position, a loop can return the work spent for its rotation
by doing it on some other body. Hence, the work (6.75) goes to increase the potential
energy 𝑊p,mech which a current loop has in a magnetic field, by the magnitude
d𝑊p,mech = 𝑝m 𝐵 sin 𝛼 d𝛼.
Integration yields
𝑊p,mech = −𝑝m 𝐵 cos 𝛼 + constant.
Assuming that constant = 0, we get the following expression:
𝑊p,mech = −𝑝m 𝐵 cos 𝛼 = −𝒑m · 𝑩 (6.76)
[compare with Eq. (1.61)].
Parallel orientation of the vectors 𝒑m and 𝑩 corresponds to the minimum energy
(6.76) and, consequently, to the position of stable equilibrium of a loop.
The quantity expressed by Eq. (6.76) is not the total potential energy of a current
loop, but only the part of it that is due to the existence of the torque (6.73). To stress
this, we have provided the symbol of the potential energy expressed by Eq. (6.76)
with the subscript “mech”. Apart from 𝑊p,mech , the total potential energy of a loop
includes other addends.
Now let us consider a plane current loop in an inhomogeneous magnetic field.
For simplicity, we shall first consider the loop to be circular. Assume that the field
changes the fastest in the direction 𝑥 coinciding with that of 𝑩 where the centre of
the loop is, and that the magnetic moment of the loop is oriented along 𝑩 (Fig. 6.17a).
Here, 𝑩 ≠ constant, and the force (6.68) does not have to be zero. The force d𝑭
exerted on a loop element is perpendicular to 𝑩, i.e., to the magnetic field line where
it intersects d𝒍. Therefore, the forces applied to different loop elements form a

symmetrical conical fan (Fig. 6.17b). Their resultant 𝑭 is directed toward a growth
in 𝑩 and, therefore, pulls the loop into the region with a stronger field. It is quite
obvious that the more rapidly the field changes (the greater ∂𝐵/∂𝑥 is), the smaller is the
apex angle of the cone and the greater, other conditions being equal, is the resultant
force 𝑭. If we reverse the direction of the current (now 𝒑m is antiparallel to 𝑩), the
directions of all the forces d𝑭 and of their resultant 𝑭 will be reversed (Fig. 6.17c).
Hence, with such a mutual orientation of the vectors 𝒑m and 𝑩, the loop will be
pushed out of the field.
It is a simple matter to find a quantitative expression for the force 𝑭 by using
Eq. (6.76) for the energy of a loop in a magnetic field. If the orientation of the
magnetic moment relative to the field remains constant (𝛼 = constant), then 𝑊p,mech
will depend only on 𝑥 (through 𝐵). Differentiating 𝑊p,mech with respect to 𝑥 and
changing the sign of the result, we get the projection of the force onto the 𝑥-axis:
$$F_x = -\frac{\partial W_{\mathrm{p,mech}}}{\partial x} = p_{\mathrm{m}}\,\frac{\partial B}{\partial x}\,\cos\alpha.$$
We assume that the field changes only slightly in the other directions. Hence, we
may disregard the projections of the force onto the other axes and assume that
𝐹 = 𝐹𝑥. Thus,
$$F = p_{\mathrm{m}}\,\frac{\partial B}{\partial x}\,\cos\alpha. \tag{6.77}$$
According to the equation we have obtained, the force exerted on a current loop
in an inhomogeneous magnetic field depends on the orientation of the magnetic
moment of the loop relative to the direction of the field. If the vectors 𝒑m and
𝑩 coincide in direction (𝛼 = 0), then the force is positive, i.e., is directed toward
a growth in 𝑩 (∂𝐵/∂𝑥 is assumed to be positive; otherwise, the sign and
the direction of the force will be reversed, but the force will pull the loop into the
region of a strong field as before). If 𝒑m and 𝑩 are antiparallel (𝛼 = 𝜋), the force
is negative, i.e., directed toward diminishing of 𝑩. We have already obtained this
result qualitatively with the aid of Fig. 6.17.
It is quite evident that apart from the force (6.77), a current loop in an inhomo-
geneous magnetic field will also experience the torque (6.73).

6.9. Magnetic Field of a Current Loop

Let us consider the field set up by a current flowing in a thin wire having the shape
of a circle of radius 𝑅 (a ring current). We shall determine the magnetic induction
at the centre of the ring current (Fig. 6.18). Every current element produces at the
centre an induction directed along a positive normal to the loop. Therefore, vector

Fig. 6.18 Fig. 6.19

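summation of the d𝑩’s consists in summation of their magnitudes. By Eq. (6.29),
$$\mathrm{d}B = \frac{\mu_0}{4\pi}\,\frac{I\,\mathrm{d}l}{R^2}$$
(𝛼 = 𝜋/2). Let us integrate this expression over the entire loop:
$$B = \int\mathrm{d}B = \frac{\mu_0}{4\pi}\,\frac{I}{R^2}\oint\mathrm{d}l = \frac{\mu_0}{4\pi}\,\frac{I}{R^2}\,2\pi R = \frac{\mu_0}{4\pi}\,\frac{2(I\pi R^2)}{R^3}.$$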
The expression in parentheses is the magnitude of the magnetic dipole moment
𝑝m [see Eq. (6.71)]. Hence, the magnetic induction at the centre of a ring current has
the value
$$B = \frac{\mu_0}{4\pi}\,\frac{2p_{\mathrm{m}}}{R^3}. \tag{6.78}$$
Inspection of Fig. 6.18 shows that the direction of the vector 𝑩 coincides with
that of a positive normal to the loop, i.e., with that of the vector 𝒑m . Therefore,
Eq. (6.78) can be written in the vector form:
$$\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{2\boldsymbol{p}_{\mathrm{m}}}{R^3}. \tag{6.79}$$
Now let us find 𝑩 on the axis of the ring current at the distance 𝑟 from
the centre of the loop (Fig. 6.19). The vectors d𝑩 are perpendicular to the planes
passing through the relevant element d𝒍 and the point where we are seeking the
field. Hence, they form a symmetrical conical fan (Fig. 6.19b). We can conclude
from considerations of symmetry that the resultant vector 𝑩 is directed along the
axis of the loop. Each of the component vectors d𝑩 contributes d𝑩∥ equal in
magnitude to d𝐵 sin 𝛽 = d𝐵(𝑅/𝑏) to the resultant vector. The angle 𝛼 between d𝒍
and 𝒃 is a right one, hence,
$$\mathrm{d}B_\parallel = \mathrm{d}B\,\frac{R}{b} = \frac{\mu_0}{4\pi}\,\frac{I\,\mathrm{d}l}{b^2}\,\frac{R}{b} = \frac{\mu_0}{4\pi}\,\frac{IR\,\mathrm{d}l}{b^3}.$$


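Integrating over the entire loop and substituting √(𝑅² + 𝑟²) for 𝑏, we obtain
$$\begin{aligned} B = \int\mathrm{d}B_\parallel &= \frac{\mu_0}{4\pi}\,\frac{IR}{b^3}\oint\mathrm{d}l = \frac{\mu_0}{4\pi}\,\frac{IR}{b^3}\,2\pi R = \frac{\mu_0}{4\pi}\,\frac{2(I\pi R^2)}{(R^2 + r^2)^{3/2}} \\ &= \frac{\mu_0}{4\pi}\,\frac{2p_{\mathrm{m}}}{(R^2 + r^2)^{3/2}}. \end{aligned} \tag{6.80}$$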
This equation determines the magnitude of the magnetic induction on the axis of a
ring current. With a view to the vectors 𝑩 and 𝒑m having the same direction, we
can write Eq. (6.80) in the vector form:
$$\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{2\boldsymbol{p}_\mathrm{m}}{(R^2 + r^2)^{3/2}}. \tag{6.81}$$

This expression does not depend on the sign of 𝑟. Hence, at points on the axis
symmetrical relative to the centre of the current, 𝑩 has the same magnitude and
direction.
When 𝑟 = 0, Eq. (6.81) transforms, as should be expected, into Eq. (6.79) for the
magnetic induction at the centre of a ring current.
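The summation described above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it adds up the vector contributions d𝑩 of short current elements and compares the result with Eq. (6.80). The orientation of the ring (in the 𝑥𝑦 plane, current counterclockwise when seen from +𝑧), the function names, and the sample values of 𝐼, 𝑅 and 𝑟 are assumptions made for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in SI units, T*m/A


def ring_axis_field_closed_form(current, radius, r):
    """Magnitude of B on the axis of a ring current, Eq. (6.80)."""
    p_m = current * math.pi * radius**2  # magnetic dipole moment of the ring
    return MU0 / (4 * math.pi) * 2 * p_m / (radius**2 + r**2) ** 1.5


def ring_axis_field_summed(current, radius, r, segments=2000):
    """Vector sum of dB = (mu0/4pi) I dl x b / b^3 over the ring elements.

    Assumed geometry: ring in the xy plane, current counterclockwise when
    seen from +z, field point at (0, 0, r) on the axis.
    """
    bx = by = bz = 0.0
    dphi = 2 * math.pi / segments
    for k in range(segments):
        phi = (k + 0.5) * dphi
        x, y = radius * math.cos(phi), radius * math.sin(phi)
        # current element dl (tangent to the ring) and vector b to the field point
        dl = (-radius * math.sin(phi) * dphi, radius * math.cos(phi) * dphi, 0.0)
        b = (-x, -y, r)
        b_cubed = (b[0] ** 2 + b[1] ** 2 + b[2] ** 2) ** 1.5
        coef = MU0 / (4 * math.pi) * current / b_cubed
        # components of the cross product dl x b
        bx += coef * (dl[1] * b[2] - dl[2] * b[1])
        by += coef * (dl[2] * b[0] - dl[0] * b[2])
        bz += coef * (dl[0] * b[1] - dl[1] * b[0])
    return bx, by, bz


bx, by, bz = ring_axis_field_summed(current=1.0, radius=0.1, r=0.05)
print("summed components:", bx, by, bz)
print("Eq. (6.80):       ", ring_axis_field_closed_form(1.0, 0.1, 0.05))
```

The transverse components of the sum cancel to within rounding error, which is the "conical fan" symmetry argument in numerical form, and the axial component reproduces Eq. (6.80). Setting 𝑟 = 0 in the closed form gives 𝜇0𝐼/(2𝑅), the value at the centre of the ring.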
For great distances from a loop, we may disregard 𝑅2 in the denominator in
comparison with 𝑟 2 . Equation (6.81) now becomes
$$\boldsymbol{B} = \frac{\mu_0}{4\pi}\,\frac{2\boldsymbol{p}_\mathrm{m}}{r^3} \quad \text{(along the current axis)}, \tag{6.82}$$
which is similar to Eq. (1.55) for the electric field strength along the axis of a dipole.
Calculations beyond the scope of the present book show that a magnetic dipole
moment 𝒑m can be ascribed to any system of currents or moving charges localized
in a restricted portion of space (compare with the electric dipole moment of a
system of charges). The magnetic field of such a system at distances that are great in
comparison with its dimensions is determined through 𝒑m using the same equations
as those used to determine the field of a system of charges at great distances through
the electric dipole moment (see Sec. 1.10). In particular, the field of a plane loop of
any shape at great distances from it is
$$B = \frac{\mu_0}{4\pi}\,\frac{p_\mathrm{m}}{r^3}\sqrt{1 + 3\cos^2\theta}, \tag{6.83}$$
where 𝑟 is the distance from the loop to the given point, and 𝜃 is the angle between
the direction of the vector 𝒑m and the direction from the loop to the given point
of the field [compare with Eq. (1.53)]. When 𝜃 = 0, Eq. (6.83) gives the same value as
Eq. (6.82) for the magnitude of the vector 𝑩.
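As a quick check of the limiting cases, here is a short sketch that evaluates the far field of a loop in the form 𝐵 = (𝜇0/4𝜋)(𝑝m/𝑟³)√(1 + 3 cos² 𝜃), which is the form that reduces to Eq. (6.82) at 𝜃 = 0. The function names and the sample values of 𝑝m and 𝑟 are illustrative assumptions.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in SI units, T*m/A


def dipole_field_magnitude(p_m, r, theta):
    """Far-field magnitude of B for a loop with dipole moment p_m.

    theta is the angle between p_m and the direction to the field point.
    """
    return MU0 / (4 * math.pi) * p_m / r**3 * math.sqrt(1 + 3 * math.cos(theta) ** 2)


def axis_far_field(p_m, r):
    """Far-field magnitude of B on the axis of the loop, Eq. (6.82)."""
    return MU0 / (4 * math.pi) * 2 * p_m / r**3


p_m, r = 0.02, 1.5  # A*m^2 and m, illustrative values
print("theta = 0:    ", dipole_field_magnitude(p_m, r, 0.0))
print("on the axis:  ", axis_far_field(p_m, r))
print("theta = pi/2: ", dipole_field_magnitude(p_m, r, math.pi / 2))
```

On the axis the two expressions agree, while in the plane of the loop (𝜃 = 𝜋/2) the field at the same distance is half as great.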
Figure 6.20 shows the magnetic field lines of a ring current. It shows only the
lines in one of the planes passing through the current axis. A similar picture will be
observed in any of these planes.
It follows from everything said in this and the preceding section that the

Fig. 6.20

magnetic dipole moment is a very important characteristic of a current loop. It


determines both the field set up by a loop and the behaviour of the loop in an
external magnetic field.

6.10. Work Done When a Current Moves in a Magnetic Field

Let us consider a current loop formed by stationary wires and a movable rod of
length 𝑙 sliding along them (Fig. 6.21). Let the loop be in an external magnetic field
which we shall assume to be homogeneous and at right angles to the plane of the
loop. With the directions of the current and field shown in the figure, the force 𝑭
exerted on the rod will be directed to the right and will equal
𝐹 = 𝐼 𝐵𝑙.
When the rod moves to the right by dℎ, this force does the positive work
d𝐴 = 𝐹 dℎ = 𝐼 𝐵𝑙 dℎ = 𝐼 𝐵 d𝑆, (6.84)
where d𝑆 is the shaded area (see Fig. 6.21a).
Let us see how the magnetic induction flux 𝛷 through the area of the loop will
change when the rod moves. We shall agree, when calculating the flux through the
area of a current loop, that the quantity 𝒏̂ in the equation
$$\Phi = \int \boldsymbol{B}\cdot\hat{\boldsymbol{n}}\,\mathrm{d}S$$

Fig. 6.21

is a positive normal, i.e., one that forms a right-handed system with the direction of
the current in the loop (see Sec. 6.8). Hence, in the case shown in Fig. 6.21a, the flux
will be positive and equal to 𝐵𝑆 (𝑆 is the area of the loop). When the rod moves to
the right, the area of the loop receives the positive increment d𝑆. As a result, the
flux also receives the positive increment d𝛷 = 𝐵 d𝑆. Equation (6.84) can, therefore,
be written in the form
d𝐴 = 𝐼 d𝛷. (6.85)
When the field is directed toward us (Fig. 6.21b), the force exerted on the rod is
directed to the left. Therefore when the rod moves to the right through the distance
dℎ, the magnetic force does the negative work
d𝐴 = −𝐼 𝐵𝑙 dℎ = −𝐼 𝐵 d𝑆. (6.86)
In this case, the flux through the loop is −𝐵𝑆. When the area of the loop grows
by d𝑆, the flux receives the increment d𝛷 = −𝐵 d𝑆. Hence, Eq. (6.86) can also be
written in the form of Eq. (6.85).
The quantity d𝛷 in Eq. (6.85) can be interpreted as the flux through the area
covered by the rod when it moves. We can say accordingly that the work done by
the magnetic force on a portion of a current loop equals the product of the current
and the magnitude of the magnetic flux through the surface covered by this portion
during its motion.
Equations (6.84) and (6.86) can be combined into a single vector expression. For
this purpose, we shall associate with the rod the vector 𝒍 having the direction of
the current (Fig. 6.22). Regardless of the direction of the vector 𝑩 (toward us or away
from us), the force exerted on the rod can be represented in the form
𝑭 = 𝐼𝒍 × 𝑩.
When the rod moves through the distance d𝒉, the force does the work
d𝐴 = 𝑭 · d𝒉 = 𝐼 (𝒍 × 𝑩) · d𝒉.
Let us perform a cyclic transposition of the multipliers in this triple scalar product

Fig. 6.22

[see Eq. (1.34) of Vol. I]. The result is


d𝐴 = 𝐼 𝑩 · (d𝒉 × 𝒍). (6.87)
A glance at Fig. 6.22 shows that the vector product (d𝒉 × 𝒍) equals in magnitude
the area d𝑆 described by the rod during its motion and has the direction of a positive
normal 𝒏̂. Hence,
d𝐴 = 𝐼 (𝑩 · 𝒏̂) d𝑆. (6.88)
In the case shown in Fig. 6.22a, we have 𝑩 · 𝒏̂ = 𝐵, and we arrive at Eq. (6.84). In the
case shown in Fig. 6.22b, we have 𝑩 · 𝒏̂ = −𝐵, and we arrive at Eq. (6.86).
The expression 𝑩 · 𝒏̂ d𝑆 determines the increment of the magnetic flux through
the loop due to motion of the rod. Thus, Eq. (6.88) can be written in the form of
(6.85). But Eq. (6.88) has an advantage over (6.85) because we “automatically” get the
sign of d𝛷 from it and, consequently, the sign of d𝐴 too.
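The way the vector expressions deliver the sign automatically can be seen in a small numerical sketch. The coordinate system is an assumption made here for illustration (𝑥 to the right, 𝑦 upward in the plane of the loop, 𝑧 toward the reader, current flowing through the rod in the −𝑦 direction), as are the helper names and the sample numbers.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def work_via_force(current, l_vec, b_vec, dh_vec):
    """dA = F . dh with F = I l x B."""
    force = tuple(current * c for c in cross(l_vec, b_vec))
    return dot(force, dh_vec)


def work_via_flux(current, l_vec, b_vec, dh_vec):
    """dA = I B . (dh x l), the form obtained by the cyclic transposition."""
    return current * dot(b_vec, cross(dh_vec, l_vec))


I = 2.0                   # current, A (illustrative)
l_vec = (0.0, -0.1, 0.0)  # rod of length 0.1 m, current along -y (assumed)
dh = (0.01, 0.0, 0.0)     # displacement of 0.01 m to the right
b_away = (0.0, 0.0, -0.5)    # field of 0.5 T directed away from the reader
b_toward = (0.0, 0.0, 0.5)   # field of 0.5 T directed toward the reader

for name, b_vec in (("away", b_away), ("toward", b_toward)):
    print(name, work_via_force(I, l_vec, b_vec, dh), work_via_flux(I, l_vec, b_vec, dh))
```

Both forms give the same number for each direction of the field: +𝐼𝐵𝑙 dℎ when the field is directed away from us and −𝐼𝐵𝑙 dℎ when it is directed toward us, with no separate bookkeeping of signs.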
Let us consider a rigid current loop of any shape in an arbitrary magnetic field.
We shall find the work done upon an arbitrary infinitely small displacement of the
loop. Assume that the loop element d𝒍 was displaced by d𝒉 (Fig. 6.23). The magnetic
force does the following work on it:
d𝐴el = 𝐼 (d𝒍 × 𝑩) · d𝒉. (6.89)
Here, 𝑩 is the magnetic induction at the place where the loop element d𝒍 is.
Performing a cyclic transposition of the multipliers in Eq. (6.89), we get
d𝐴el = 𝐼 𝑩 · (d𝒉 × d𝒍). (6.90)
The magnitude of the vector product d𝒉 × d𝒍 equals the area of a parallelogram
constructed on the vectors d𝒉 and d𝒍, i.e., the area d𝑆 described by the element d𝒍
during its motion. The direction of the vector product coincides with that of a
positive normal to the area d𝑆. Consequently,
𝑩 · (d𝒉 × d𝒍) = (𝑩 · 𝒏̂) d𝑆 = d𝛷el , (6.91)
where d𝛷el is the increment of the magnetic flux through the loop due to the dis-
placement of the loop element d𝒍.

Fig. 6.23

With a view to Eq. (6.91), we can write Eq. (6.90) in the form
d𝐴el = 𝐼 d𝛷el . (6.92)
Summation of Eq. (6.92) over all the loop elements yields an expression for the work
of the magnetic forces upon an arbitrary infinitely small displacement of the loop:
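$$\mathrm{d}A = \int \mathrm{d}A_\mathrm{el} = \int I\,\mathrm{d}\Phi_\mathrm{el} = I\int \mathrm{d}\Phi_\mathrm{el} = I\,\mathrm{d}\Phi \tag{6.93}$$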
(d𝛷 is the total increment of the flux through the loop).
To find the work done upon a finite arbitrary displacement of a loop, let us
integrate Eq. (6.93) over the entire displacement:
$$A_{12} = \int \mathrm{d}A = I\int \mathrm{d}\Phi = I\,(\Phi_2 - \Phi_1). \tag{6.94}$$
Here, 𝛷1 and 𝛷2 are the values of the magnetic flux through the loop in its initial
and final positions. The work done by the magnetic forces on the loop thus equals
the product of the current and the increment of the magnetic flux through the loop.
In particular, when a plane loop rotates in a homogeneous field from a position
in which the vectors 𝒑m and 𝑩 are directed oppositely (in this position 𝛷 = −𝐵𝑆) to
a position in which these vectors have the same direction (in this position 𝛷 = 𝐵𝑆),
the magnetic forces do the following work on the loop:
𝐴 = 𝐼 [𝐵𝑆 − (−𝐵𝑆)] = 2𝐼𝐵𝑆.
The same result is obtained with the aid of the expression 𝑊 = −𝒑m · 𝑩 for the
potential energy of a loop in a magnetic field (see Sec. 6.8):
𝐴 = 𝑊init − 𝑊fin = 𝑝m 𝐵 − (−𝑝m 𝐵) = 2𝑝m 𝐵 = 2𝐼𝑆𝐵
(𝑝m = 𝐼𝑆).
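The same work can also be obtained by adding up the work of the torque over the rotation. The sketch below assumes that the torque acting on the loop has the magnitude 𝑝m𝐵 sin 𝛼, where 𝛼 is the angle between 𝒑m and 𝑩 (the standard form, taken here as an assumption), and integrates it numerically from 𝛼 = 𝜋 to 𝛼 = 0. The function names and the sample values are illustrative.

```python
import math


def work_from_flux(current, b, area):
    """A = I (Phi_2 - Phi_1) for a rotation from Phi = -BS to Phi = +BS."""
    phi_initial = -b * area
    phi_final = b * area
    return current * (phi_final - phi_initial)


def work_from_torque(current, b, area, steps=20_000):
    """Midpoint-rule integral of p_m * B * sin(alpha) over alpha from 0 to pi.

    This equals the work done by the field while alpha decreases from pi to 0,
    assuming a torque of magnitude p_m * B * sin(alpha).
    """
    p_m = current * area
    d_alpha = math.pi / steps
    return sum(p_m * b * math.sin((k + 0.5) * d_alpha) * d_alpha for k in range(steps))


I, B, S = 2.0, 0.5, 0.01  # A, T, m^2 (illustrative values)
print("I (Phi_2 - Phi_1):   ", work_from_flux(I, B, S))
print("integral of torque:  ", work_from_torque(I, B, S))
```

Both routes give 2𝐼𝐵𝑆, in agreement with the two calculations in the text.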
We must note that the work expressed by Eq. (6.94) is done not at the expense of
the energy of the external magnetic field, but at the expense of the source maintain-
ing a constant current in the loop. We shall show in Sec. 8.2 that when the magnetic
flux through a loop changes, an induced e.m.f. ℰi = −(d𝛷/d𝑡) is set up in the loop.
Hence, the source, in addition to the work done to liberate the Joule heat, must also
do work against the induced e.m.f. determined by the expression
$$A = \int \mathrm{d}A = -\int \mathcal{E}_\mathrm{i}\,I\,\mathrm{d}t = \int \frac{\mathrm{d}\Phi}{\mathrm{d}t}\,I\,\mathrm{d}t = \int I\,\mathrm{d}\Phi = I\,(\Phi_2 - \Phi_1),$$

that coincides with Eq. (6.94).

6.11. Divergence and Curl of a Magnetic Field

The absence of magnetic charges in nature² results in the fact that the lines of
the vector 𝑩 have neither a beginning nor an end. Therefore, in accordance with
Eq. (1.77), the flux of the vector 𝑩 through a closed surface must equal zero. Thus,
for any magnetic field and an arbitrary closed surface, the condition

$$\Phi_B = \oint_S \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{S} = 0 \tag{6.95}$$
is observed. This equation expresses Gauss’s theorem for the vector 𝑩: the flux of
the magnetic induction vector through any closed surface equals zero.
Substituting a volume integral for the surface one in Eq. (6.95) in accordance
with Eq. (1.108), we find that

$$\int_V \nabla\cdot\boldsymbol{B}\,\mathrm{d}V = 0.$$
The condition which we have arrived at must be observed for any arbitrarily chosen
volume 𝑉 . This is possible only if the integrand at each point of the field is zero.
Thus, a magnetic field has the property that its divergence is zero everywhere:
∇ · 𝑩 = 0. (6.96)
Let us now turn to the circulation of the vector 𝑩. By definition, the circulation
equals the integral

$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l}. \tag{6.97}$$
It is the simplest to calculate this integral for the field of a line current. Assume
that a closed loop is in a plane perpendicular to the current (Fig. 6.24; the current is
perpendicular to the plane of the drawing and is directed beyond the drawing). At
each point of the loop, the vector 𝑩 is directed along a tangent to the circumference
passing through this point. Let us substitute 𝐵 d𝑙_𝐵 for 𝑩 · d𝒍 in the expression
for the circulation (d𝑙_𝐵 is the projection of a loop element onto the direction of
the vector 𝑩). Inspection of the figure shows that d𝑙_𝐵 equals 𝑏 d𝛼, where 𝑏 is the
distance from the wire carrying the current to d𝒍, and d𝛼 is the angle through which
a radial straight line turns when it moves along the loop over the element d𝒍. Thus,

²The British physicist Paul Dirac made the assumption that magnetic charges (called Dirac’s
monopoles) should exist in nature. Searches for these charges have meanwhile given no results and
the question of the existence of Dirac’s...

Fig. 6.24

introducing Eq. (6.30) for 𝐵, we get


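$$\boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = B\,\mathrm{d}l_B = \frac{\mu_0}{4\pi}\,\frac{2I}{b}\,b\,\mathrm{d}\alpha = \frac{\mu_0 I}{2\pi}\,\mathrm{d}\alpha. \tag{6.98}$$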
With a view to Eq. (6.98), we have
$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = \frac{\mu_0 I}{2\pi}\oint \mathrm{d}\alpha. \tag{6.99}$$
Upon circumvention of the loop enclosing the current, the radial straight line
constantly turns in one direction, therefore, ∮ d𝛼 = 2𝜋. Matters are different if the
current is not enclosed by the loop (Fig. 6.24b). Here, upon circumvention of the
loop, the radial straight line first turns in one direction (segment 1-2), and then in
the opposite one (2-1), owing to which ∮ d𝛼 equals zero. With a view to this result,
we can write that
$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = \mu_0 I, \tag{6.100}$$
where 𝐼 must be understood as the current enclosed by the loop. If the loop does
not enclose the current, the circulation of the vector 𝑩 is zero.
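The turning of the radial straight line can be followed numerically. The sketch below is an illustration under assumed geometry: the wire is perpendicular to the 𝑥𝑦 plane, the loop is an ellipse traversed counterclockwise, and the current is taken as positive for this direction of circumvention. It sums the increments d𝛼 of the polar angle of the radial line and multiplies the total by 𝜇0𝐼/(2𝜋), as in Eq. (6.99). All names and numbers are hypothetical.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in SI units, T*m/A


def ellipse(center, a, b, n=400):
    """Points of an ellipse traversed counterclockwise (an example loop)."""
    cx, cy = center
    return [(cx + a * math.cos(2 * math.pi * k / n),
             cy + b * math.sin(2 * math.pi * k / n)) for k in range(n)]


def total_turning_angle(loop_points, wire_xy):
    """Sum of the increments d_alpha of the radial line drawn from the wire."""
    wx, wy = wire_xy
    total = 0.0
    n = len(loop_points)
    for k in range(n):
        x0, y0 = loop_points[k]
        x1, y1 = loop_points[(k + 1) % n]
        d_alpha = math.atan2(y1 - wy, x1 - wx) - math.atan2(y0 - wy, x0 - wx)
        # bring each small increment back into the interval (-pi, pi]
        if d_alpha > math.pi:
            d_alpha -= 2 * math.pi
        elif d_alpha <= -math.pi:
            d_alpha += 2 * math.pi
        total += d_alpha
    return total


def circulation(current, loop_points, wire_xy):
    """Circulation of B according to Eq. (6.99)."""
    return MU0 * current / (2 * math.pi) * total_turning_angle(loop_points, wire_xy)


loop = ellipse(center=(0.3, 0.0), a=1.0, b=0.5)
print("wire inside the loop: ", circulation(10.0, loop, (0.0, 0.0)), "expected", MU0 * 10.0)
print("wire outside the loop:", circulation(10.0, loop, (2.0, 0.0)))
```

For the wire inside the loop the total angle is 2𝜋 and the circulation equals 𝜇0𝐼; for the wire outside the loop the increments of opposite sign cancel and the circulation vanishes to within rounding error.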
The sign of expression (6.100) depends on the direction of circumvention of the
loop (the angle 𝛼 is measured in the same direction). If the direction of circumvention
forms a right-handed system with the direction of the current, quantity (6.100) is
positive, in the opposite case it is negative. The sign can be taken into consideration
by assuming 𝐼 to be an algebraic quantity. A current whose direction is associated
with that of circumvention of a loop by the right-hand screw rule must be considered
positive; a current of the opposite direction will be negative.
Equation (6.100) will allow us to easily recall Eq. (6.30) for 𝐵 of the field of a
line current. Imagine a plane loop in the form of a circle of radius 𝑏 (Fig. 6.25). At
each point of this loop, the vector 𝑩 has the same magnitude and is directed along a
tangent to the circle. Hence, the circulation equals the product of 𝐵 and the length
of the circumference 2𝜋𝑏, and Eq. (6.100) has the form
𝐵 × 2𝜋𝑏 = 𝜇0 𝐼.

Fig. 6.25 Fig. 6.26

Thus, 𝐵 = 𝜇0 𝐼/(2𝜋𝑏) [compare with Eq. (6.30)].


The case of a non-planar loop (Fig. 6.26) differs from that of a plane one con-
sidered above, only in that upon motion along the loop the radial straight line not
only turns about the wire, but also moves along it. All our reasoning which led us
to Eq. (6.100) remains true if we understand d𝛼 to be the angle through which the
projection of the radial straight line onto a plane perpendicular to the current turns.
The total angle of rotation of this projection is 2𝜋 if the loop encloses the current,
and zero otherwise. We thus again arrive at Eq. (6.100).
We have obtained Eq. (6.100) for a line current. We can show that it also holds
for a current flowing in a wire of an arbitrary shape, for example, for a ring current.
Assume that a loop encloses several wires carrying currents. Owing to the
superposition principle [see Eq. (6.16)]:
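$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = \oint \Bigl(\sum_k \boldsymbol{B}_k\Bigr)\cdot\mathrm{d}\boldsymbol{l} = \sum_k \oint \boldsymbol{B}_k\cdot\mathrm{d}\boldsymbol{l}.$$
Each of the integrals in this sum equals 𝜇0𝐼𝑘. Hence,
$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = \mu_0\sum_k I_k \tag{6.101}$$
(remember that 𝐼𝑘 is an algebraic quantity).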
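Equation (6.101) lends itself to a direct numerical check by superposition. In the sketch below, which is only an illustration, several straight parallel wires pierce the plane of a circular loop; the field of each wire is taken as 𝜇0𝐼/(2𝜋𝑏) directed along the tangent, and the circulation of the total field is computed by the midpoint rule along the loop. The positions and currents of the wires, the loop, and the function names are assumed for the example; a positive current is one directed toward the reader, the loop being traversed counterclockwise.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in SI units, T*m/A


def field_of_wire(current, wire_xy, point_xy):
    """B of a long straight wire perpendicular to the xy plane.

    Magnitude mu0 I / (2 pi b), direction along the tangent to the circle
    of radius b centred on the wire.
    """
    dx = point_xy[0] - wire_xy[0]
    dy = point_xy[1] - wire_xy[1]
    coef = MU0 * current / (2 * math.pi * (dx * dx + dy * dy))
    return (-coef * dy, coef * dx)


def circulation(wires, loop_points):
    """Midpoint-rule estimate of the circulation of the total field."""
    total = 0.0
    n = len(loop_points)
    for k in range(n):
        x0, y0 = loop_points[k]
        x1, y1 = loop_points[(k + 1) % n]
        mid = ((x0 + x1) / 2, (y0 + y1) / 2)
        bx = by = 0.0
        for current, wire_xy in wires:  # superposition of the separate fields
            fx, fy = field_of_wire(current, wire_xy, mid)
            bx += fx
            by += fy
        total += bx * (x1 - x0) + by * (y1 - y0)
    return total


n_points = 2000
loop = [(math.cos(2 * math.pi * k / n_points), math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)]
wires = [(3.0, (0.2, 0.1)),    # inside the loop, positive current
         (-1.0, (-0.4, 0.3)),  # inside the loop, negative current
         (5.0, (2.0, 0.0))]    # outside the loop, contributes nothing

print("circulation:            ", circulation(wires, loop))
print("mu0 * (3 - 1) for check:", MU0 * 2.0)
```

Only the algebraic sum of the enclosed currents, here 3 A − 1 A, enters the result; the wire outside the loop changes the field at every point of the loop but not the circulation.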
If currents flow in the entire space where a loop is, the algebraic sum of the
currents enclosed by the loop can be represented in the form
$$\sum_k I_k = \int_S \boldsymbol{j}\cdot\mathrm{d}\boldsymbol{S} = \int_S \boldsymbol{j}\cdot\hat{\boldsymbol{n}}\,\mathrm{d}S. \tag{6.102}$$
The integral is taken over an arbitrary surface 𝑆 bounded by the loop. The vector 𝒋 is
the current density at the point where area element d𝑆 is; 𝒏̂ is a positive normal
to this element (i.e., a normal forming a right-handed system with the direction of
circumvention of the loop in calculating the circulation).

Substituting Eq. (6.102) for the sum of the currents in Eq. (6.101), we obtain
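$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = \mu_0\int_S \boldsymbol{j}\cdot\mathrm{d}\boldsymbol{S}.$$
Transforming the left-hand side according to Stokes’s theorem, we arrive at the
equation
$$\int_S (\nabla\times\boldsymbol{B})\cdot\mathrm{d}\boldsymbol{S} = \mu_0\int_S \boldsymbol{j}\cdot\mathrm{d}\boldsymbol{S}.$$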
This equation must be obeyed with an arbitrary choice of the surface 𝑆 over which
the integrals are taken. This is possible only if the integrands have identical values
at every point. We, thus, arrive at the conclusion that the curl of the magnetic
induction vector is proportional to the current density vector at the given point:
∇ × 𝑩 = 𝜇0 𝒋. (6.103)
The proportionality constant in the SI system is 𝜇0 .
We must note that Eqs. (6.101) and (6.103) hold only for the field in a vacuum in
the absence of time-varying electric fields.
Thus, we have found the divergence and curl of a magnetic field in a vacuum. Let
us compare the equations obtained with the similar equations for an electrostatic
field in a vacuum. According to Eqs. (1.112), (1.117), (6.96) and (6.103):
∇ · 𝑬 = 𝜌/𝜀0 (the divergence of 𝑬 equals 𝜌 divided by 𝜀0)
∇ × 𝑬 = 0 (the curl of 𝑬 equals zero)
∇ · 𝑩 = 0 (the divergence of 𝑩 equals zero)
∇ × 𝑩 = 𝜇0 𝒋 (the curl of 𝑩 equals 𝜇0 multiplied by 𝒋).
A comparison of these equations shows that an electrostatic and a magnetic
field are of an appreciably different nature. The curl of an electrostatic field equals
zero; consequently, an electrostatic field is potential and can be characterized by
the scalar potential 𝜑. The curl of a magnetic field at points where there is a current
differs from zero. Accordingly, the circulation of the vector 𝑩 is proportional to
the current enclosed by a loop. This is why we cannot ascribe to a magnetic field a
scalar potential that would be related to 𝑩 by an equation similar to Eq. (1.41). This
potential would not be unique—upon each circumvention of the loop and return
to the initial point it would receive an increment equal to 𝜇0 𝐼. A field whose curl
differs from zero is called a vortex or a solenoidal one.
Since the divergence of the vector 𝑩 is zero everywhere, this vector can be
represented as the curl of a function 𝑨:
𝑩 = ∇ × 𝑨 (6.104)
(the divergence of a curl always equals zero [see Eq. (1.106)]). The function 𝑨 is called
the vector potential. A treatment of the vector potential is beyond the scope of
the present book.

6.12. Field of a Solenoid and Toroid

A solenoid is a wire wound in the form of a spiral onto a round cylindrical body. The
magnetic field lines of a solenoid are arranged approximately as shown in Fig. 6.27.
The direction of these lines inside the solenoid forms a right-handed system with
the direction of the current in the turns.
A real solenoid has a current component along its axis. In addition, the linear
density of the current 𝑗lin (equal to the ratio of the current d𝐼 to an element of
solenoid length d𝑙) changes periodically along the solenoid. The average value of
this density is
$$\langle j_\mathrm{lin}\rangle = \left\langle\frac{\mathrm{d}I}{\mathrm{d}l}\right\rangle = nI, \tag{6.105}$$
where 𝑛 is the number of solenoid turns per unit length and 𝐼 the current in the
solenoid.
In the science of electromagnetism, a great part is played by an imaginary
infinitely long solenoid having no axial current component and, in addition, having
a constant linear current density 𝑗lin along its entire length. The reason for this
is that the field of such a solenoid is homogeneous and is bounded by the volume
of the solenoid (similarly, the electric field of an infinite parallel-plate capacitor is
homogeneous and is bounded by the volume of the capacitor).
In accordance with what has been said above, let us imagine a solenoid in the
form of an infinite thin-walled cylinder around which flows a current of constant
linear density
𝑗lin = 𝑛𝐼. (6.106)
Let us divide the cylinder into identical ring currents—“turns”. Examination of
Fig. 6.28 shows that each pair of turns arranged symmetrically relative to a plane
perpendicular to the solenoid axis sets up a magnetic induction parallel to the axis
at any point of this plane. Hence, the resultant of the field at any point inside and
outside an infinite solenoid can only have a direction parallel to the axis.
It can be seen from Fig. 6.27 that the directions of the field inside and outside
a finite solenoid are opposite. The directions of the fields do not change when
the length of a solenoid is increased, and in the limit, when 𝑙 → ∞, they remain
opposite. In an infinite solenoid, as in a finite one, the direction of the field inside
the solenoid forms a right-handed system with the direction in which the current

Fig. 6.27 Fig. 6.28

flows around the cylinder.


It follows from the vector 𝑩 and the axis being parallel that the field both inside
and outside an infinite solenoid must be homogeneous. To prove this, let us take an
imaginary rectangular loop 1-2-3-4 inside a solenoid (Fig. 6.29; 4-1 is along the axis
of the solenoid).
Passing clockwise around the loop, we get the value (𝐵2 − 𝐵1 )𝑎 for the cir-
culation of the vector 𝑩. The loop does not enclose the currents, therefore the
circulation must be zero [see Eq. (6.101)]. Hence, it follows that 𝑩1 = 𝑩2 . Arranging
section 2-3 of the loop at any distance from the axis, we shall always find that the
magnetic induction 𝑩2 at this distance equals the induction 𝑩1 on the solenoid axis.
Thus, the homogeneity of the field inside the solenoid has been proved.
Now let us turn to loop 1′-2′-3′-4′. We have depicted the vectors 𝑩′1 and 𝑩′2 by
a dashed line since, as we shall find out in the following, the field outside an infinite
solenoid is zero. Meanwhile, all that we know is that the possible direction of the
field outside the solenoid is opposite to that of the field inside it. Loop 1′-2′-3′-4′
does not enclose the currents; therefore, the circulation of the vector 𝑩′ around
this loop, equal to (𝐵′1 − 𝐵′2)𝑎, must be zero. It thus follows that 𝑩′1 = 𝑩′2. The
distances from the solenoid axis to sections 1′-4′ and 2′-3′ were taken arbitrarily.
Consequently, the value of 𝑩′ at any distance from the axis will be the same outside
the solenoid. Thus, the homogeneity of the field outside the solenoid has been
proved too.
The circulation around the loop shown in Fig. 6.30 is 𝑎(𝐵 + 𝐵′) (for clockwise
circumvention). This loop encloses a positive current of magnitude 𝑗lin 𝑎. In
accordance with Eq. (6.101), the following equation must be observed:
𝑎(𝐵 + 𝐵′) = 𝜇0 𝑗lin 𝑎,

Fig. 6.29 Fig. 6.30

or after cancelling 𝑎 and replacing 𝑗lin with 𝑛𝐼 [see Eq. (6.106)]


𝐵 + 𝐵′ = 𝜇0 𝑛𝐼. (6.107)
This equation shows that the field both inside and outside an infinite solenoid is
finite.
Let us take a plane at right angles to the solenoid axis (Fig. 6.31). Since the field
lines 𝑩 are closed, the magnetic fluxes through the inner part 𝑆 of this plane and
through its outer part 𝑆′ must be the same. Since the fields are homogeneous and
normal to the plane, each of the fluxes equals the product of the relevant value of the
magnetic induction and the area penetrated by the flux. We, thus, get the expression
𝐵𝑆 = 𝐵′𝑆′.
The left-hand side of this equation is finite, the factor 𝑆′ in the right-hand side
is infinitely great. Hence, it follows that 𝐵′ = 0.
Thus, we have proved that the magnetic induction outside an infinitely long
solenoid is zero. The field inside the solenoid is homogeneous. Assuming in
Eq. (6.107) that 𝐵′ = 0, we arrive at an equation for the magnetic induction
inside a solenoid:
𝐵 = 𝜇0 𝑛𝐼. (6.108)
The product 𝑛𝐼 is called the number of ampere-turns per metre. At 𝑛 = 1000
turns per metre and a current of 1 A, the magnetic induction inside a solenoid is
4𝜋 × 10⁻⁴ T = 4𝜋 Gs.
The symmetrically arranged turns make an identical contribution to the mag-
netic induction on the axis of a solenoid [see Eq. (6.81)]. Therefore, at the end of a
semi-infinite solenoid, the magnetic induction on its axis equals half the value given
by Eq. (6.108):
$$B = \frac{1}{2}\,\mu_0 nI. \tag{6.109}$$
Practically, if the length of a solenoid is considerably greater than its diameter,

Fig. 6.31 Fig. 6.32

Eq. (6.108) will hold for points in the central part of the solenoid, and Eq. (6.109) for
points on its axis near its ends.
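How well Eqs. (6.108) and (6.109) describe a long but finite solenoid can be estimated by treating it as a stack of ring currents and adding up their fields on the axis with the aid of Eq. (6.80). The following is a rough sketch under these assumptions; the dimensions (length 2 m, radius 1 cm), the winding density of 1000 turns per metre, the current of 1 A, and the function names are illustrative.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in SI units, T*m/A


def ring_axis_field(current, radius, distance):
    """Eq. (6.80) written as mu0 I R^2 / (2 (R^2 + r^2)^(3/2))."""
    return MU0 * current * radius**2 / (2 * (radius**2 + distance**2) ** 1.5)


def solenoid_axis_field(n, current, radius, length, z, slices=20_000):
    """Field on the axis at the distance z from the middle of the solenoid.

    The winding is replaced by thin rings, each of width dz and carrying
    the current n * I * dz (constant linear current density n * I).
    """
    dz = length / slices
    total = 0.0
    for k in range(slices):
        z_ring = -length / 2 + (k + 0.5) * dz
        total += ring_axis_field(n * current * dz, radius, z - z_ring)
    return total


n, I, R, L = 1000.0, 1.0, 0.01, 2.0
ideal = MU0 * n * I  # Eq. (6.108)
print("mu0 n I:          ", ideal)
print("middle / mu0 n I: ", solenoid_axis_field(n, I, R, L, 0.0) / ideal)
print("end / mu0 n I:    ", solenoid_axis_field(n, I, R, L, L / 2) / ideal)
```

For a length a hundred times the diameter, the ratio in the middle is very close to unity and the ratio at the end is very close to one half.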
A toroid is a wire wound onto a body having the shape of a torus (Fig. 6.32). Let
us take a loop in the form of a circle of radius 𝑟 whose centre coincides with that of
a toroid. Owing to symmetry, the vector 𝑩 at every point must be directed along a
tangent to the loop. Hence, the circulation of 𝑩 is

$$\oint \boldsymbol{B}\cdot\mathrm{d}\boldsymbol{l} = B\cdot 2\pi r$$
(𝑩 is the magnetic induction at the points through which the loop passes).
If a loop passes inside a toroid, it encloses the current 2𝜋 𝑅𝑛𝐼 (𝑅 is the radius of
the toroid, and 𝑛 is the number of turns per unit of its length). In this case,
𝐵 × 2𝜋𝑟 = 𝜇0 2𝜋 𝑅𝑛𝐼,
whence
$$B = \mu_0 nI\,\frac{R}{r}. \tag{6.110}$$
A loop passing outside a toroid encloses no currents, hence, we have 𝐵×2𝜋𝑟 = 0
for it. Thus, the magnetic induction outside a toroid is zero.
For a toroid whose radius 𝑅 considerably exceeds the radius of a turn, the ratio
𝑅/𝑟 for all the points inside the toroid differs only slightly from unity, and instead
of Eq. (6.110) we get an equation coinciding with Eq. (6.108) for an infinitely long
solenoid. In this case, the field may be considered homogeneous in each of the toroid
sections. The field is directed differently in different sections. We can, therefore,
speak of the homogeneity of the field within the entire toroid only conditionally,
bearing in mind the identical magnitude of 𝑩.
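The degree of this approximate homogeneity is easy to estimate from Eq. (6.110). In the sketch below the toroid radius 𝑅, the radius of a turn 𝑎, and the remaining values are assumed for illustration only.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in SI units, T*m/A


def toroid_field(n, current, R, r):
    """Eq. (6.110): B = mu0 n I R / r inside the toroid."""
    return MU0 * n * current * R / r


n, I = 1000.0, 1.0   # turns per metre and current in amperes (illustrative)
R, a = 0.5, 0.01     # toroid radius and turn radius in metres (illustrative)

b_inner = toroid_field(n, I, R, R - a)   # point nearest to the centre of the toroid
b_middle = toroid_field(n, I, R, R)      # coincides with mu0 n I
b_outer = toroid_field(n, I, R, R + a)   # point farthest from the centre
relative_spread = (b_inner - b_outer) / b_middle

print(b_inner, b_middle, b_outer, relative_spread)
```

With 𝑎/𝑅 = 0.02, the magnitude of 𝑩 changes by about 4 per cent across a section, and at 𝑟 = 𝑅 it coincides with the value 𝜇0𝑛𝐼 for an infinitely long solenoid.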
A real toroid has a current component along its axis. This component sets up a
field similar to that of a ring current in addition to the field given by Eq. (6.110).

Chapter 7
MAGNETIC FIELD
IN A SUBSTANCE

7.1. Magnetization of a Magnetic

We assumed in the preceding chapter that the conductors carrying a current are
in a vacuum. If the conductors carrying a current are in a medium, the magnetic
field changes. The explanation is that any substance is a magnetic, i.e., is capable of
acquiring a magnetic moment under the action of a magnetic field (of becoming
magnetized). The magnetized substance sets up the magnetic field 𝑩′ that is
superposed onto the field 𝑩0 produced by the currents. Both fields produce the resultant
field
𝑩 = 𝑩0 + 𝑩′ (7.1)
[compare with Eq. (2.8)].
The true (microscopic) field in a magnetic varies greatly within the limits of in-
termolecular distances. By 𝑩 is meant the averaged (macroscopic) field (see Sec. 2.3).
To explain the magnetization of bodies, Ampere assumed that ring currents (molec-
ular currents) circulate in the molecules of a substance. Every such current has a
magnetic moment and sets up a magnetic field in the surrounding space. In the ab-
sence of an external field, the molecular currents are oriented chaotically, owing to
which the resultant field set up by them equals zero. The total magnetic moment of
a body also equals zero because of the chaotic orientation of the magnetic moments
of its separate molecules. The action of a field causes the magnetic moments of the
molecules to acquire a predominating orientation in one direction, owing to which
the magnetic becomes magnetized: its total magnetic moment becomes other than
zero. The magnetic fields of individual molecular currents in this case no longer
compensate one another, and the field 𝑩′ appears.

It is quite natural to characterize the magnetization of a magnetic by the
magnetic moment of unit volume. This quantity is called the magnetization and is
denoted by the symbol 𝑴. If a magnetic is magnetized inhomogeneously, the
magnetization at a given point is determined by the following expression:
$$\boldsymbol{M} = \frac{1}{\Delta V}\sum_{\Delta V}\boldsymbol{p}_\mathrm{m}, \tag{7.2}$$
where 𝛥𝑉 is an infinitely small volume (from the physical viewpoint) taken in the
vicinity of the point being considered, and 𝒑m is the magnetic moment of a separate
molecule. Summation is performed over all the molecules confined in the volume
𝛥𝑉 [compare with Eq. (2.4)].
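The definition (7.2) can be illustrated by a toy calculation contrasting chaotic and fully aligned molecular moments. Everything numerical in the sketch below is an assumption chosen for illustration: the moment of one molecule, the number of molecules, the volume, and the use of a seeded pseudo-random generator to imitate chaotic orientation.

```python
import math
import random


def magnetization(moments, volume):
    """Eq. (7.2): vector sum of the molecular moments divided by the volume."""
    return tuple(sum(m[i] for m in moments) / volume for i in range(3))


def norm(v):
    """Magnitude of a 3-component vector."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)


def random_direction(rng):
    """Unit vector distributed uniformly over the sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(phi), s * math.sin(phi), z)


P_M = 9.27e-24   # moment of one molecule, A*m^2 (illustrative value)
N = 50_000       # number of molecules in the volume (illustrative)
VOLUME = 1e-21   # the volume Delta V, m^3 (illustrative)

rng = random.Random(0)  # fixed seed so that the run is repeatable
chaotic = [tuple(P_M * c for c in random_direction(rng)) for _ in range(N)]
aligned = [(0.0, 0.0, P_M)] * N

m_chaotic = norm(magnetization(chaotic, VOLUME))
m_aligned = norm(magnetization(aligned, VOLUME))
print("chaotic orientation:", m_chaotic, "A/m")
print("full alignment:     ", m_aligned, "A/m")
```

With chaotic orientation the moments almost cancel, and the residual decreases relative to the aligned value as the number of molecules grows; with full alignment the magnetization equals the number of molecules in unit volume multiplied by 𝑝m.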
The field 𝑩′, like the field 𝑩0 , has no sources. Therefore, the divergence of the
resultant field given by Eq. (7.1) is zero:
∇ · 𝑩 = ∇ · 𝑩0 + ∇ · 𝑩′ = 0. (7.3)
Thus, Eq. (6.96) and, consequently, Eq. (6.95), hold not only for a field in a vacuum,
but also for a field in a substance.

7.2. Magnetic Field Strength

Let us write an expression for the curl of the resultant field (7.1):
∇ × 𝑩 = ∇ × 𝑩0 + ∇ × 𝑩′.
According to Eq. (6.103), ∇ × 𝑩0 = 𝜇0 𝒋, where 𝒋 is the density of the macroscopic
current. Similarly, the curl of the vector 𝑩′ must be proportional to the density of
the molecular currents:
∇ × 𝑩′ = 𝜇0 𝒋mol .
Consequently, the curl of the resultant field is determined by the equation
∇ × 𝑩 = 𝜇0 (𝒋 + 𝒋mol ). (7.4)
Inspection of Eq. (7.4) shows that when calculating the curl of a field in a mag-
netic, we encounter a difficulty similar to that which we encountered when dealing
with an electric field in a dielectric [see Eq. (2.16)]: to determine the curl of 𝑩, we
must know the density not only of the macroscopic, but also of the molecular
currents. But the density of the molecular currents, in turn, depends on the value of
the vector 𝑩. The way of circumventing this difficulty is also similar to the one we
took advantage of in Sec. 2.5. We are able to find such an auxiliary quantity whose
curl is determined only by the density of the macroscopic currents.
To find the form of this auxiliary quantity, let us attempt to express the density
of the molecular currents 𝒋mol through the magnetization of a magnetic 𝑴 (in

Fig. 7.1 Fig. 7.2

Sec. 2.5 we expressed the density of the bound charges through the polarization of a
dielectric 𝑷). For this purpose, let us calculate the algebraic sum of the molecular
currents enclosed by a loop 𝛤. This sum is

$$\int_S \boldsymbol{j}_\mathrm{mol}\cdot\mathrm{d}\boldsymbol{S}, \tag{7.5}$$
where 𝑆 is a surface bounded by the loop.
The algebraic sum of the molecular currents includes only the molecular currents
that are “threaded” onto the loop (see the current 𝐼′mol in Fig. 7.1). The currents
that are not “threaded” onto the loop either do not intersect the surface bounded by
the loop at all, or intersect it twice, once in one direction and once in the opposite
one (see the current 𝐼″mol in Fig. 7.1). As a result, their contribution to the algebraic
sum of the currents enclosed by the loop equals zero.


A glance at Fig. 7.2 shows that the contour element d𝑙 making the angle 𝛼 with
the direction of magnetization 𝑴 threads onto itself those molecular currents whose
centres are inside an oblique cylinder of volume 𝑆mol cos 𝛼 d𝑙 (where 𝑆mol is the
area enclosed by a separate molecular current). If 𝑛 is the number of molecules in
unit volume, then the total current enclosed by the element d𝑙 is 𝐼mol 𝑆mol 𝑛 cos 𝛼 d𝑙.
The product 𝐼mol 𝑆mol equals the magnetic moment 𝑝m of an individual molecular
current. Hence, the expression 𝐼mol 𝑆mol 𝑛 is the magnetic moment of unit volume,
i.e., it gives the magnitude of the vector 𝑴, while 𝐼mol 𝑆mol 𝑛 cos 𝛼 gives the projection
of the vector 𝑴 onto the direction of the element d𝑙. Thus, the total molecular
current enclosed by the element d𝑙 is 𝑴 · d𝒍, while the sum of the molecular currents
enclosed by the entire loop [see Eq. (7.5)] is
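$$\int_S \boldsymbol{j}_\mathrm{mol}\cdot\mathrm{d}\boldsymbol{S} = \oint \boldsymbol{M}\cdot\mathrm{d}\boldsymbol{l}.$$
Transforming the right-hand side according to Stokes’s theorem, we get
$$\int_S \boldsymbol{j}_\mathrm{mol}\cdot\mathrm{d}\boldsymbol{S} = \int_S (\nabla\times\boldsymbol{M})\cdot\mathrm{d}\boldsymbol{S}.$$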

Fig. 7.3

The equation which we have arrived at must be obeyed when the surface 𝑆 has been
chosen arbitrarily. This is possible only if the integrands are equal at every point of
a magnetic:
𝒋mol = ∇ × 𝑴. (7.6)
Thus, the density of the molecular currents is determined by the value of the curl
of the magnetization. When ∇ × 𝑴 = 0, the molecular currents of individual
molecules are oriented so that their sum on an average is zero.
Equation (7.6) allows us to make the following illustrative interpretation. Figure
7.3 shows the magnetization vectors 𝑴 1 and 𝑴 2 in direct proximity to a certain
point P. This point and both vectors are in the plane of the drawing. Loop 𝛤 depicted
by a dash line is also in the plane of the drawing. If the nature of the magnetization
is such that the vectors 𝑴 1 and 𝑴 2 are identical in magnitude, then the circulation
of 𝑴 around loop 𝛤 will be zero. Accordingly, ∇ × 𝑴 at point P will also be zero.
The molecular currents 𝑖′1 and 𝑖′2 flowing in the loops depicted in Fig. 7.3 by solid
lines can be associated with the magnetizations 𝑴1 and 𝑴2 . These loops are in a
plane normal to the plane of the drawing. With an identical direction of the vectors
𝑴1 and 𝑴2 , the directions of the currents 𝑖′1 and 𝑖′2 at point P will be opposite. Since
𝑀1 = 𝑀2 , the currents 𝑖′1 and 𝑖′2 are identical in magnitude, owing to which the
resultant molecular current at point P, like ∇ × 𝑴, will be zero: 𝒋mol = 0.
Now let us assume that 𝑀1 > 𝑀2 . Therefore, the circulation of 𝑴 around loop
𝛤 will differ from zero. Accordingly, the field of the vector 𝑴 at point P will be
characterized by the vector ∇ × 𝑴 directed beyond the drawing. A greater molecular
current corresponds to a greater magnetization; hence, 𝑖′1 > 𝑖′2 . Consequently, at
point P, there will be observed a resultant current other than zero characterized
by the density 𝒋mol . The latter, like ∇ × 𝑴, is directed beyond the drawing. When
𝑀1 < 𝑀2 , the vectors ∇ × 𝑴 and 𝒋mol will be directed toward us instead of beyond
the drawing.

Thus, at points where the curl of the magnetization is other than zero, the
density of the molecular currents also differs from zero, the vectors ∇ × 𝑴 and 𝒋mol
having the same direction [see Eq. (7.6)].
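Equation (7.6) can be tried out on two simple model magnetizations: a homogeneous one, and one whose magnitude changes in the direction at right angles to 𝑴, as in the case 𝑀1 ≠ 𝑀2 discussed above. The sketch below estimates ∇ × 𝑴 by central differences. The choice of axes, the two model fields, and the numerical values are assumptions made for illustration.

```python
def curl(field, point, h=1e-4):
    """Central-difference estimate of the curl of a vector field at a point.

    field(x, y, z) must return a 3-component tuple.
    """
    def partial(component, axis):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        return (field(*plus)[component] - field(*minus)[component]) / (2 * h)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


M0 = 1.0e3         # magnetization at x = 0, A/m (illustrative)
GRADIENT = 2.0e4   # rate of change of M across the sample, A/m^2 (illustrative)


def uniform_magnetization(x, y, z):
    """M directed along y and identical at all points."""
    return (0.0, M0, 0.0)


def graded_magnetization(x, y, z):
    """M directed along y, with a magnitude that grows linearly with x."""
    return (0.0, M0 + GRADIENT * x, 0.0)


point = (0.01, 0.0, 0.0)
print("uniform M: j_mol =", curl(uniform_magnetization, point))
print("graded M:  j_mol =", curl(graded_magnetization, point))
```

For the homogeneous magnetization the density of the molecular currents vanishes, while for the graded one it is directed at right angles to the plane containing 𝑴 and the direction of its variation, and equals the rate of change of 𝑀 in magnitude.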
Let us introduce Eq. (7.6) for the density of the molecular currents into Eq. (7.4):
∇ × 𝑩 = 𝜇0 𝒋 + 𝜇0 ∇ × 𝑴.
Dividing this equation by 𝜇0 and combining the curls, we get
$$\nabla \times \left( \frac{\mathbf{B}}{\mu_0} - \mathbf{M} \right) = \mathbf{j}. \tag{7.7}$$
Whence it follows that
$$\mathbf{H} = \frac{\mathbf{B}}{\mu_0} - \mathbf{M} \tag{7.8}$$
is our required auxiliary quantity whose curl is determined only by the macroscopic
currents. This quantity is called the magnetic field strength.
In accordance with Eq. (7.7),
∇×𝑯 = 𝒋 (7.9)
(the curl of the vector 𝑯 equals the vector of the density of the macroscopic currents).
Let us take an arbitrary surface 𝑆 bounded by a loop 𝛤 and form the expression
$$\int_S (\nabla \times \mathbf{H}) \cdot d\mathbf{S} = \int_S \mathbf{j} \cdot d\mathbf{S}.$$
According to Stokes’s theorem, the left-hand side of this equation is equivalent to
the circulation of the vector 𝑯 around loop 𝛤. Hence,
$$\oint_\Gamma \mathbf{H} \cdot d\mathbf{l} = \int_S \mathbf{j} \cdot d\mathbf{S}. \tag{7.10}$$
If macroscopic currents flow through wires enclosed by a loop, Eq. (7.10) can be
written in the form
$$\oint_\Gamma \mathbf{H} \cdot d\mathbf{l} = \sum_k I_k. \tag{7.11}$$
Equations (7.10) and (7.11) express the theorem on the circulation of the vector 𝑯:
the circulation of the magnetic field strength vector around a loop equals the algebraic
sum of the macroscopic currents enclosed by this loop.
The magnetic field strength 𝑯 is the analogue of the electric displacement 𝑫.
It was originally assumed that magnetic masses similar to electric charges exist in
nature, and the science of magnetism developed along the lines of that of electricity.
Back in those times, the relevant names were introduced: the “magnetic induction”
for 𝑩 and the “field strength” (formerly “field intensity”) for 𝑯. It was later established
that no magnetic masses exist in nature and that the quantity called the magnetic
induction is actually the analogue not of the electric displacement 𝑫, but of the
electric field strength 𝑬 (accordingly, 𝑯 is the analogue of 𝑫 instead of 𝑬). It was
decided not to change the established terminology, however, all the more so because,
owing to the different nature of an electric and a magnetic field (an electrostatic
field is potential, a magnetic one is solenoidal¹), the quantities 𝑩 and 𝑫 display many
similarities in their behaviour (for example, the 𝑩 lines, like the 𝑫 lines, are not
disrupted at the interface between two media).
In a vacuum, 𝑴 = 0, therefore, 𝑯 transforms into 𝑩/𝜇0 and Eqs. (7.10) and (7.11)
transform into Eqs. (6.103) and (6.101).
In accordance with Eq. (6.30), the strength of the field of a line current in a
vacuum is determined by the expression
$$H = \frac{1}{4\pi} \frac{2I}{b}, \tag{7.12}$$
whence it can be seen that the magnetic field strength has a dimension equal to that
of current divided by that of length. In this connection, the SI unit of magnetic field
strength is called the ampere per metre (A m−1 ).
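
The theorem on the circulation of 𝑯 can be checked numerically against the field (7.12) of a line current. The following is only a minimal sketch: the function names, the square loop, and the value of the current are illustrative assumptions, and the wire is assumed to run along the z-axis through the origin.

```python
import math

def h_field_line_current(x, y, current):
    """H (A/m) of an infinite straight wire along z through the origin, Eq. (7.12)."""
    r2 = x * x + y * y
    # |H| = I / (2*pi*b); the direction is tangential to a circle around the wire
    return (-current * y / (2 * math.pi * r2), current * x / (2 * math.pi * r2))

def circulation_square(center, half_side, current, steps=4000):
    """Midpoint-rule circulation of H around an axis-aligned square, counterclockwise."""
    cx, cy = center
    corners = [(cx - half_side, cy - half_side), (cx + half_side, cy - half_side),
               (cx + half_side, cy + half_side), (cx - half_side, cy + half_side)]
    total = 0.0
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for s in range(steps):
            xm, ym = x0 + (s + 0.5) * dx, y0 + (s + 0.5) * dy
            hx, hy = h_field_line_current(xm, ym, current)
            total += hx * dx + hy * dy  # H . dl on this small segment
    return total

# Hypothetical current of 5 A: one loop encloses the wire, the other does not.
enclosing = circulation_square((0.3, -0.2), 1.0, 5.0)
outside = circulation_square((3.0, 0.0), 1.0, 5.0)
print(enclosing, outside)
```

Within the accuracy of the numerical integration, the first value reproduces the enclosed current and the second one vanishes, whatever the position of the wire inside or outside the loop.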
In the Gaussian system, the magnetic field strength is defined as the quantity
𝑯 = 𝑩 − 4𝜋 𝑴. (7.13)
It follows from this definition that in a vacuum 𝑯 coincides with 𝑩. Accordingly,
the unit of 𝑯 in the Gaussian system, called the oersted (Oe), has the same value
and dimension as the unit of magnetic induction—the gauss (Gs). In essence, the
oersted and gauss are different names of the same unit. If the latter measures 𝑯, it
is called the oersted, and if it measures 𝑩, the gauss.
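
The statement that the oersted and the gauss are the same unit can be illustrated by a short conversion. This is a sketch only: the conversion factors (1 Oe corresponding to 1000/4𝜋 A m⁻¹, 1 T = 10⁴ Gs) and the value of 𝜇0 are standard ones not quoted in the text, and the function names are assumptions made for the example.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in H/m (standard value, not quoted in the text)

def oersted_to_ampere_per_metre(h_oe):
    # 1 Oe corresponds to 1000/(4*pi) A/m, i.e. about 79.58 A/m
    return h_oe * 1e3 / (4 * math.pi)

def tesla_to_gauss(b_tesla):
    # 1 T = 10^4 Gs
    return b_tesla * 1e4

# In a vacuum B = mu0 * H in the SI, so a field strength of 1 Oe
# should correspond to an induction of 1 Gs.
h_si = oersted_to_ampere_per_metre(1.0)
b_gauss = tesla_to_gauss(MU0 * h_si)
print(h_si, b_gauss)
```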
It is customary practice to associate the magnetization not with the magnetic
induction, but with the field strength. It is assumed that at every point of a magnetic
𝑴 = 𝜒m 𝑯, (7.14)
where 𝜒m is a quantity characteristic of a given magnetic and called the magnetic
susceptibility². Experiments show that for weakly magnetic (non-ferromagnetic)
substances in not too strong fields 𝜒m is independent of 𝑯. According to Eq. (7.8),
the dimension of 𝑯 coincides with that of 𝑴. Hence, 𝜒m is a dimensionless quantity.
Using Eq. (7.14) for 𝑴 in Eq. (7.8), we get
$$\mathbf{H} = \frac{\mathbf{B}}{\mu_0} - \chi_{\mathrm{m}} \mathbf{H},$$

¹A solenoidal field is one having no sources. At each point of such a field, the divergence is zero.
²In anisotropic media, the directions of the vectors 𝑴 and 𝑯, generally speaking, do not coincide.
For such media, the relation between the vectors 𝑴 and 𝑯 is expressed by means of the magnetic
susceptibility tensor (see footnote 2 on page 55).
whence
$$\mathbf{H} = \frac{\mathbf{B}}{\mu_0 (1 + \chi_{\mathrm{m}})}. \tag{7.15}$$
The dimensionless quantity
𝜇 = 1 + 𝜒m , (7.16)
is called the relative permeability or simply the permeability of a substance³.
Unlike the dielectric susceptibility 𝜒 that can have only positive values (the polar-
ization 𝑷 in an isotropic dielectric is always directed along the 𝑬 field), the magnetic
susceptibility 𝜒m may be either positive or negative. Hence, the permeability may
be either greater or smaller than unity.
With account taken of Eq. (7.16), Eq. (7.15) can be written as follows:
$$\mathbf{H} = \frac{\mathbf{B}}{\mu_0 \mu}. \tag{7.17}$$
Thus, the magnetic field strength 𝑯 is a vector having the same direction as the
vector 𝑩, but whose magnitude is 𝜇0 𝜇 times smaller (in anisotropic media the vectors
𝑯 and 𝑩, generally speaking, do not coincide in direction).
Equation (7.14) relating the vectors 𝑴 and 𝑯 has exactly the same form in the
Gaussian system too. Using this equation in Eq. (7.13), we get
𝑯 = 𝑩 − 4𝜋 𝜒m 𝑯,
whence
$$\mathbf{H} = \frac{\mathbf{B}}{1 + 4\pi \chi_{\mathrm{m}}}. \tag{7.18}$$
The dimensionless quantity
𝜇 = 1 + 4𝜋 𝜒m , (7.19)
is called the permeability of a substance. Introducing this quantity into Eq. (7.18),
we get
$$\mathbf{H} = \frac{\mathbf{B}}{\mu}. \tag{7.20}$$
The value of 𝜇 in the Gaussian system of units coincides with its value in the
SI. A comparison of Eqs. (7.16) and (7.19) shows that the value of the magnetic
susceptibility in the SI is 4𝜋 times that of 𝜒m in the Gaussian system:
𝜒m,SI = 4𝜋 𝜒m,Gs . (7.21)
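
Relations (7.16), (7.19), and (7.21) can be tied together in a few lines. The sketch below uses a hypothetical Gaussian susceptibility value chosen only for illustration; the function names are likewise assumptions.

```python
import math

def chi_gauss_to_si(chi_gauss):
    # Eq. (7.21): the SI susceptibility is 4*pi times the Gaussian one
    return 4 * math.pi * chi_gauss

def permeability_from_chi_si(chi_si):
    # Eq. (7.16)
    return 1 + chi_si

def permeability_from_chi_gauss(chi_gauss):
    # Eq. (7.19)
    return 1 + 4 * math.pi * chi_gauss

chi_gs = -1.0e-6  # hypothetical value for a weakly diamagnetic substance
mu_si = permeability_from_chi_si(chi_gauss_to_si(chi_gs))
mu_gs = permeability_from_chi_gauss(chi_gs)
print(mu_si, mu_gs)  # the permeability is the same in both systems
```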

³The so-called absolute permeability 𝜇a = 𝜇0 𝜇 is introduced in electrical engineering. This
quantity has no physical meaning, however, and we shall not use it.
Fig. 7.4

7.3. Calculation of the Field in Magnetics

Let us consider the field produced by an infinitely long round magnetized rod. We
shall consider the magnetization 𝑴 to be the same everywhere and directed along
the axis of the rod. Let us divide the rod mentally into layers of thickness d𝑙 at right
angles to the axis. We shall divide each layer in turn into small cylindrical elements
with bases of an arbitrary shape and of area d𝑆 (Fig. 7.4a). Each such element has
the magnetic moment
d𝑝m = 𝑀 d𝑆 d𝑙. (7.22)
The field d𝑩′ set up by an element at distances that are great in comparison with
its dimensions is equivalent to the field that would be produced by the current 𝐼 = 𝑀 d𝑙
flowing around the element along its side surface (see Fig. 7.4b). Indeed, the magnetic
moment of such a current is d𝑝m = 𝐼 d𝑆 = 𝑀 d𝑙 d𝑆 [compare with Eq. (7.22)], while
the magnetic field at great distances is determined only by the magnitude and
direction of the magnetic moment (see Sec. 6.9).
The imaginary currents flowing in the section of the surface common for two
adjacent elements are identical in magnitude and opposite in direction, therefore
their sum is zero. Thus, when summating the currents flowing around the side
surfaces of the elements of one layer, only the currents flowing along the side surface
of the layer will remain uncompensated.
It follows from the above that a rod layer of thickness d𝑙 sets up a field equivalent
to the one which would be produced by the current 𝑀 d𝑙 flowing around the layer
along its side surface (the linear density of this current is 𝑗lin = 𝑀). The entire
infinite magnetized rod sets up a field equivalent to the field of a cylinder around
which flows a current having the linear density 𝑗lin = 𝑀. We established in Sec. 6.2
that outside such a cylinder the field vanishes, while inside it the field is homogeneous
and equals 𝜇0 𝑗lin in magnitude.
We have, thus, determined the nature of the field 𝑩′ set up by a homogeneously
magnetized infinitely long round rod. Outside the rod, the field vanishes. Inside it,
the field is homogeneous and equals
$$\mathbf{B}' = \mu_0 \mathbf{M}. \tag{7.23}$$
Assume that we have a homogeneous field 𝑩0 set up by macrocurrents in a
vacuum. According to Eq. (7.17), the strength of this field is
$$\mathbf{H}_0 = \frac{\mathbf{B}_0}{\mu_0}. \tag{7.24}$$
Let us introduce into this field (we shall call it an external one) an infinitely long
round rod of a homogeneous and isotropic magnetic, arranging it along the direction
of 𝑩0 . It follows from considerations of symmetry that the magnetization 𝑴 set up
in the rod is collinear with the vector 𝑩0 .
The magnetized rod produces inside itself the field 𝑩′ determined by Eq. (7.23).
The field inside the rod, as a result, becomes equal to
$$\mathbf{B} = \mathbf{B}_0 + \mathbf{B}' = \mathbf{B}_0 + \mu_0 \mathbf{M}. \tag{7.25}$$
Using this value of 𝑩 in Eq. (7.8), we get the strength of the field inside the rod
$$\mathbf{H} = \frac{\mathbf{B}}{\mu_0} - \mathbf{M} = \frac{\mathbf{B}_0}{\mu_0} = \mathbf{H}_0$$
[see Eq. (7.24)]. Thus, the strength of the field in the rod coincides with that of the
external field.
Multiplying 𝑯 by 𝜇0 𝜇, we get the magnetic induction inside the rod:
$$\mathbf{B} = \mu_0 \mu \mathbf{H} = \mu_0 \mu \frac{\mathbf{B}_0}{\mu_0} = \mu \mathbf{B}_0. \tag{7.26}$$
Hence, it follows that the permeability 𝜇 shows how many times the field increases
in a magnetic [compare with Eq. (7.26)].
It must be noted that since the field 𝑩′ is other than zero only inside the rod,
the magnetic field outside the rod remains unchanged.
The result we have obtained is correct when a homogeneous and isotropic
magnetic fills the volume bounded by surfaces formed by the strength lines of
the external field⁴. Otherwise, the field strength determined by Eq. (7.8) does not
coincide with 𝑯 0 = 𝑩0 /𝜇0 .
It is conditionally assumed that the field strength in a magnetic is
𝑯 = 𝑯 0 − 𝑯 d, (7.27)
where 𝑯 0 is the external field, and 𝑯 d is the so-called demagnetizing field. The

⁴We remind our reader that for an electric field 𝑫 = 𝑫0 provided that a homogeneous and
isotropic dielectric fills the volume bounded by equipotential surfaces, i.e., surfaces orthogonal to the
strength lines of the external field.
latter is assumed to be proportional to the magnetization


𝑯 d = 𝑁 𝑴. (7.28)
The proportionality constant 𝑁 is known as the demagnetization factor. It de-
pends on the shape of a magnetic. We have seen that 𝑯 = 𝑯 0 for a body whose
surface is not intersected by strength lines of the external field, i.e., the demagneti-
zation factor is zero. For a thin disk perpendicular to the external field, 𝑁 = 1, and
for a sphere, 𝑁 = 1/3.
The relevant calculations show that when a homogeneous and isotropic mag-
netic having the shape of an ellipsoid is placed in a homogeneous external field, the
magnetic field in it is also homogeneous, although it differs from the external one.
This also holds for a sphere, which is a particular case of an ellipsoid, and for a long
rod and a thin disk, which can be considered as the extreme cases of an ellipsoid.
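
Relations (7.27) and (7.28), combined with the linear law (7.14), give 𝐻 = 𝐻0/(1 + 𝑁𝜒m) inside the body. The sketch below evaluates this for the three shapes mentioned above. It assumes a weakly magnetic substance for which 𝜒m does not depend on 𝐻; the numerical values of 𝐻0 and 𝜒m are hypothetical, and the function name is an illustrative choice.

```python
def internal_field(h0, chi_m, n_factor):
    """Solve H = H0 - N*M together with M = chi_m*H (Eqs. 7.27, 7.28, 7.14)."""
    h = h0 / (1.0 + n_factor * chi_m)
    m = chi_m * h
    return h, m

# Demagnetization factors quoted in the text
SHAPES = {
    "long rod along the field": 0.0,
    "sphere": 1.0 / 3.0,
    "thin disk across the field": 1.0,
}

h0 = 100.0   # hypothetical external field strength, A/m
chi_m = 3.0  # hypothetical susceptibility, assumed independent of H
for name, n_factor in SHAPES.items():
    h, m = internal_field(h0, chi_m, n_factor)
    print(f"{name}: H = {h:.2f} A/m, M = {m:.2f} A/m")
```

For the rod the internal field strength coincides with the external one, in agreement with the result obtained above; for the other shapes it is reduced by the demagnetizing field.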
In conclusion, let us find the field strength of an infinitely long solenoid filled
with a homogeneous and isotropic magnetic (or submerged in an infinite homogeneous
and isotropic magnetic). Applying the theorem on circulation [see Eq. (7.11)]
to the loop shown in Fig. 6.30, we get the equation 𝐻𝑎 = 𝑛𝑎𝐼. Hence,
𝐻 = 𝑛𝐼. (7.29)
Thus, the field strength inside an infinitely long solenoid equals the product of the
current and the number of turns per unit length. Outside the solenoid, the field
strength vanishes.

7.4. Conditions at the Interface of Two Magnetics

Near the interface of two magnetics, the vectors 𝑩 and 𝑯 must comply with definite
boundary conditions that follow from the relations
∇ · 𝑩 = 0, ∇×𝑯 = 𝒋 (7.30)
[see Eqs. (7.3) and (7.9)]. We are considering stationary fields, i.e., ones that do not
vary with time.
Let us take on the interface of two magnetics of permeabilities 𝜇1 and 𝜇2 an
imaginary cylindrical surface of height ℎ with bases 𝑆1 and 𝑆2 at different sides of
the interface (Fig. 7.5). The flux of the vector 𝑩 through this surface is
$$\Phi_B = B_{1,\mathrm{n}} S + B_{2,\mathrm{n}} S + \langle B_{\mathrm{n}} \rangle S_{\text{side}} \tag{7.31}$$
[compare with Eq. (2.46)].
Since ∇ · 𝑩 = 0, the flux of the vector 𝑩 through any closed surface is zero.
Equating expression (7.31) to zero and making the transition ℎ → 0, we arrive at
the equation 𝐵1,n = −𝐵2,n . If we project 𝑩1 and 𝑩2 onto the same normal, we get
Fig. 7.5 Fig. 7.6

the condition
𝐵1,n = 𝐵2,n (7.32)
[compare with Eq. (2.47)].
Replacing in accordance with Eq. (7.17) the components of 𝑩 with the corre-
sponding components of 𝑯 multiplied by 𝜇0 𝜇, we get the equation
𝜇0 𝜇1 𝐻1,n = 𝜇0 𝜇2 𝐻2,n ,
whence
$$\frac{H_{1,\mathrm{n}}}{H_{2,\mathrm{n}}} = \frac{\mu_2}{\mu_1}. \tag{7.33}$$
Now let us take a rectangular loop on the interface of the magnetics (Fig. 7.6)
and calculate the circulation of 𝑯 for it. With small dimensions of the loop, the
circulation can be written in the form
$$\oint H_l \, dl = H_{1,\tau} a - H_{2,\tau} a + \langle H_l \rangle 2b, \tag{7.34}$$
where $\langle H_l \rangle$ is the average value of $H_l$ on the parts of the loop at right angles to
the interface. If no macroscopic currents flow along the interface of the magnetics,
∇ × 𝑯 within the limits of the loop will equal zero. Consequently, the circulation
will also be zero. Assuming that Eq. (7.34) is zero and performing the limit transition
𝑏 → 0, we arrive at the expression
𝐻1,𝜏 = 𝐻2,𝜏 (7.35)
[compare with Eq. (2.44)].
Replacing the components of 𝑯 with the corresponding components of 𝑩
divided by 𝜇0 𝜇, we get the relation
$$\frac{B_{1,\tau}}{\mu_0 \mu_1} = \frac{B_{2,\tau}}{\mu_0 \mu_2},$$
whence it follows that
$$\frac{B_{1,\tau}}{B_{2,\tau}} = \frac{\mu_1}{\mu_2}. \tag{7.36}$$
Summarizing, we can say that in passing through the interface between two
magnetics, the normal component of the vector 𝑩 and the tangential component of
Fig. 7.7

the vector 𝑯 change continuously.


The tangential component of the vector 𝑩 and the normal component of the
vector 𝑯 in passing through the interface of the magnetics, however, experience a
discontinuity. Thus, when passing through the interface of two media, the vector 𝑩
behaves similarly to the vector 𝑫, and the vector 𝑯 similarly to the vector 𝑬.
Figure 7.7 shows the behaviour of the 𝑩 lines when intersecting the surface
between two magnetics. Let the angles between the 𝑩 lines and a normal to the
interface be 𝛼1 and 𝛼2 , respectively. The ratio of the tangents of these angles is
$$\frac{\tan \alpha_1}{\tan \alpha_2} = \frac{B_{1,\tau}/B_{1,\mathrm{n}}}{B_{2,\tau}/B_{2,\mathrm{n}}},$$
whence with a view to Eqs. (7.32) and (7.36) we get a law of refraction of the magnetic
field lines similar to Eq. (2.49):
$$\frac{\tan \alpha_1}{\tan \alpha_2} = \frac{\mu_1}{\mu_2}. \tag{7.37}$$
Upon passing into a magnetic with a greater value of 𝜇, the magnetic field lines
deviate from a normal to the surface. This leads to crowding of the lines. The
crowding of the 𝑩 lines in a substance with a great permeability makes it possible
to form magnetic beams, i.e., impart the required shape and direction to them. In
particular, for magnetic shielding of a space, it is surrounded with an iron screen. A
glance at Fig. 7.8 shows that the crowding of the magnetic field lines in the body of
the screen results in weakening of the field inside it.
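
The law of refraction (7.37) is easy to evaluate numerically. The sketch below is illustrative only: the function name and the permeability values are assumptions, and the angle of incidence is taken between 0 and 90 degrees.

```python
import math

def refracted_angle(alpha1_deg, mu1, mu2):
    """Angle (degrees) between the B line and the normal in medium 2, from Eq. (7.37)."""
    # tan(alpha1) / tan(alpha2) = mu1 / mu2  =>  tan(alpha2) = tan(alpha1) * mu2 / mu1
    tan_alpha2 = math.tan(math.radians(alpha1_deg)) * mu2 / mu1
    return math.degrees(math.atan(tan_alpha2))

# Hypothetical case: a line arriving from air (mu close to 1) almost along the
# normal enters a substance with mu = 1000 and turns nearly parallel to the surface.
print(refracted_angle(1.0, 1.0, 1000.0))
# With equal permeabilities the line is not refracted at all.
print(refracted_angle(30.0, 2.0, 2.0))
```

The strong deviation from the normal in the first case is what produces the crowding of the lines inside a screen of high permeability.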
Figure 7.9 is a schematic view of a laboratory electromagnet. It consists of an
iron core onto which coils supplied with a current are fitted. The magnetic field
lines are mainly concentrated inside the core. Only in the narrow air gap do they
pass in a medium with a low value of 𝜇. The vector 𝑩 intersects the boundaries
between the air gap and the core along a normal to the interface. It thus follows,
Fig. 7.8 Fig. 7.9

in accordance with Eq. (7.32), that the magnetic induction in the gap and in the
core is identical in value. Let us apply the theorem on the circulation of 𝑯 to the
loop along the axis of the core. We can assume that the field strength is identical
everywhere in the iron and is 𝐻iron = 𝐵/(𝜇0 𝜇iron ). In the air, 𝐻air = 𝐵/(𝜇0 𝜇air ). Let
us denote the length of the loop section in the iron by 𝑙iron and in the gap by 𝑙air .
The circulation can, thus, be written in the form 𝐻iron 𝑙iron + 𝐻air 𝑙air . According to
Eq. (7.11), this circulation must equal 𝑁 𝐼, where 𝑁 is the total number of turns of
the electromagnet coils, and 𝐼 is the current. Thus,
$$\frac{B}{\mu_0 \mu_{\text{iron}}} l_{\text{iron}} + \frac{B}{\mu_0 \mu_{\text{air}}} l_{\text{air}} = NI.$$
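Hence,
$$B = \mu_0 I \frac{N}{\dfrac{l_{\text{air}}}{\mu_{\text{air}}} + \dfrac{l_{\text{iron}}}{\mu_{\text{iron}}}} \approx \mu_0 I \frac{N}{l_{\text{air}} + \dfrac{l_{\text{iron}}}{\mu_{\text{iron}}}}$$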
(𝜇air differs from unity only in the fifth digit after the decimal point).
Usually, 𝑙air is of the order of 0.1 m, 𝑙iron is of the order of 1 m, while 𝜇iron reaches
values of the order of several thousands. We may, therefore, disregard the second
term in the denominator and write that
$$B = \mu_0 I \frac{N}{l_{\text{air}}}. \tag{7.38}$$
Consequently, the magnetic induction in the gap of an electromagnet has the same
value as it would have inside a toroid without a core when 𝑁/𝑙air turns are wound
on the torus per unit length [see Eq. (6.110)]. By increasing the total number of turns
and reducing the dimensions of the air gap, we can obtain fields with a high value of
𝐵. In practice, fields with 𝐵 of the order of several teslas (several tens of thousands
of gausses) are obtained with the aid of electromagnets having an iron core.
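
The quality of approximation (7.38) can be judged by comparing it with the full expression for 𝐵. The sketch below does so for the orders of magnitude quoted above; the number of turns, the current, and the particular value of 𝜇iron are hypothetical, and 𝜇iron is assumed constant (saturation of the iron is ignored).

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in H/m (standard value, not quoted in the text)

def gap_induction_exact(turns, current, l_air, l_iron, mu_iron, mu_air=1.0):
    """B from the circulation of H along the axis of the core (expression preceding Eq. 7.38)."""
    return MU0 * current * turns / (l_air / mu_air + l_iron / mu_iron)

def gap_induction_approx(turns, current, l_air):
    """B from Eq. (7.38), neglecting the contribution of the iron section."""
    return MU0 * current * turns / l_air

# Hypothetical electromagnet: 1000 turns, 10 A, 0.1 m gap, 1 m of iron, mu_iron = 5000
b_exact = gap_induction_exact(1000, 10.0, 0.1, 1.0, 5000.0)
b_approx = gap_induction_approx(1000, 10.0, 0.1)
print(b_exact, b_approx, (b_approx - b_exact) / b_exact)
```

For these values the two results differ by a fraction of a percent, which is why the iron section can be disregarded.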
7.5. Kinds of Magnetics

Equation (7.14) determines the magnetic susceptibility 𝜒m of a unit volume of a
substance. This susceptibility is often replaced with the molar (for chemically
simple substances—the atomic) susceptibility 𝜒m,mol ( 𝜒m,at ) related to one mole
of a substance. It is evident that 𝜒m,mol = 𝜒m 𝑉mol , where 𝑉mol is the volume of a
mole of a substance. Whereas 𝜒m is a dimensionless quantity, 𝜒m,mol is measured in
m3 mol−1 .
Depending on the sign and magnitude of the magnetic susceptibility, all mag-
netics are divided into three groups:
(1) diamagnetics, for which 𝜒m is negative and small in absolute value (| 𝜒m,mol |
is about 10⁻¹¹ m³ mol⁻¹ to 10⁻¹⁰ m³ mol⁻¹);
(2) paramagnetics, for which 𝜒m is also not great, but positive ( 𝜒m,mol is about
10⁻¹⁰ m³ mol⁻¹ to 10⁻⁹ m³ mol⁻¹);
(3) ferromagnetics, for which 𝜒m is positive and reaches very great values
( 𝜒m,mol is about 1 m³ mol⁻¹). In addition, unlike diamagnetics and paramagnetics,
for which 𝜒m does not depend on 𝐻, the susceptibility of ferromagnetics
is a function of the magnetic field strength.
Thus, the magnetization 𝑴 in isotropic substances may either coincide in
direction with 𝑯 (in paramagnetics and ferromagnetics), or be directed oppositely
to it (in diamagnetics). We remind our reader that in isotropic dielectrics the
polarization is always directed in the same way as 𝑬.

7.6. Gyromagnetic Phenomena

The nature of molecular currents became clear after the British physicist Ernest
Rutherford (1871-1937) established experimentally that the atoms of all substances
consist of a positively charged nucleus and negatively charged electrons travelling
around it.
The motion of electrons in atoms obeys quantum laws; in particular, the con-
cept of a trajectory cannot be applied to the electrons travelling in an atom. The
diamagnetism of a substance can be explained, however, by using the very simple
Bohr model of an atom. According to this model, the electrons in atoms travel along
stationary circular orbits.
Assume that an electron is moving with the speed 𝑣 in an orbit of radius 𝑟
(Fig. 7.10). The charge 𝑒𝜈, where 𝑒 is the charge of an electron and 𝜈 is its number of
revolutions a second, will be carried through an area at any place along the path of
the electron in one second. Hence, an electron travelling in orbit will form the ring
current 𝐼 = 𝑒𝜈. Since the charge of an electron is negative, the direction of motion
Fig. 7.10

of the electron and the direction of the current will be opposite. The magnetic
moment of the current set up by an electron is
$$p_{\mathrm{m}} = IS = e \nu \pi r^2.$$
The product 2𝜋𝑟𝜈 gives the speed of the electron 𝑣, therefore, we can write that
$$p_{\mathrm{m}} = \frac{evr}{2}. \tag{7.39}$$
The moment (7.39) is due to the motion of an electron in orbit and is, therefore,
called the orbital magnetic moment. The direction of the vector 𝒑m forms a
right-handed system with the direction of the current, and a left-handed one with
that of motion of the electron (see Fig. 7.10).
An electron moving in orbit has the angular momentum
𝐿 = 𝑚𝑣𝑟 (7.40)
(𝑚 is the mass of an electron). The vector 𝑳 is called the orbital angular momen-
tum of an electron. It forms a right-handed system with the direction of motion of
the electron. Hence, the vectors 𝒑m and 𝑳 are directed oppositely.
The ratio of the magnetic moment of an elementary particle to its angular
momentum is called the gyromagnetic (or magnetomechanical) ratio. For an
electron, it is
$$\frac{p_{\mathrm{m}}}{L} = -\frac{e}{2m} \tag{7.41}$$
(𝑚 is the mass of an electron; the minus sign indicates that the magnetic moment
and the angular momentum are directed oppositely).
Owing to its rotation about the nucleus, an electron is similar to a spinning top
or gyroscope. This circumstance underlies the so-called gyromagnetic phenom-
ena consisting in that the magnetization of a magnetic leads to its rotation, and,
conversely, the rotation of a magnetic leads to its magnetization. The existence of
the first phenomenon was proved experimentally by A. Einstein and W. de Haas,
and of the second by S. Barnett.
Fig. 7.11

Einstein and de Haas based their experiment on the following reasoning. If
we magnetize a rod made of a magnetic, then the magnetic moments of the electrons
will be aligned in the direction of the field and the angular momenta in the
opposite direction. As a result, the total angular momentum of the electrons $\sum_i \mathbf{L}_i$
will become other than zero (initially, owing to the chaotic orientation of the individual
momenta, it equalled zero). The angular momentum of the system rod +
electrons must remain unchanged. Therefore, the rod acquires the angular momentum
$-\sum_i \mathbf{L}_i$ and, consequently, begins to rotate. A change in the direction of
magnetization leads to a change in the direction of rotation of the rod.
A mechanical model of this experiment can be carried out by seating a person
on a rotatable stool and having him hold a massive rotating wheel in his hands.
When he holds the axle of the wheel upward, he begins to rotate in the direction
opposite to that of rotation of the wheel. When he turns the axle downward, he
begins to rotate in the other direction.
Einstein and de Haas conducted their experiment as follows (Fig. 7.11). A thin
iron rod was suspended on an elastic thread and placed inside a solenoid. The
thread was twisted very slightly when the rod was magnetized using a constant
magnetic field. The resonance method was used to increase the effect—the solenoid
was fed with an alternating current whose frequency was chosen equal to the
natural frequency of mechanical oscillations of the system. In these conditions, the
amplitude of the oscillations reached values that could be measured by watching the
displacement of a light spot reflected by a mirror fastened to the thread. The data
obtained in the experiment were used to calculate the gyromagnetic ratio, which
was found to equal −(𝑒/𝑚). Thus, the sign of the charge of the carriers setting up
the molecular currents coincided with the sign of the charge of an electron. The
result obtained, however, was double the expected value of the gyromagnetic ratio
(7.41).
To understand Barnett’s experiment, we must remember that when an attempt
is made to bring a gyroscope into rotation about a certain direction, the gyroscope
axis turns so that the directions of the natural and forced rotations of the gyroscope
coincide (see Sec. 5.9 of Vol. I). If we place a gyroscope fastened in a universal
joint on the disk of a centrifugal machine and begin to rotate it, the gyroscope axis
will align itself vertically, and in such a way that the direction of rotation of the
gyroscope will coincide with that of the disk. When the direction of rotation of the
centrifugal machine is reversed, the gyroscope axis will turn through 180 degrees,
i.e., in such a way that the directions of the two rotations will again coincide.
Barnett rotated an iron rod very rapidly about its axis and measured the pro-
duced magnetization. Barnett also obtained a value for the gyromagnetic ratio from
the results of his experiment double that given by Eq. (7.41).
It was discovered later that apart from the orbital magnetic moment (7.39)
and the orbital angular momentum (7.40), an electron has its intrinsic angular
momentum 𝐿s and magnetic moment 𝑝m,s for which the gyromagnetic ratio is
$$\frac{p_{\mathrm{m,s}}}{L_{\mathrm{s}}} = -\frac{e}{m}, \tag{7.42}$$
i.e., coincides with the value obtained in the experiments conducted by Einstein
and de Haas and by Barnett. It thus follows that the magnetic properties of iron are
due not to the orbital, but to the intrinsic magnetic moment of its electrons.
Attempts were initially made to explain the existence of the intrinsic magnetic
moment and angular momentum of an electron by considering it as a charged
sphere spinning about its axis. Accordingly, the intrinsic angular momentum of
an electron was named its spin. It was discovered quite soon, however, that such
a notion results in a number of contradictions, and it became necessary to reject
the hypothesis of a “spinning” electron. It is assumed at present that the intrinsic
angular momentum (spin) and the intrinsic (spin) magnetic moment associated with
it are inherent properties of an electron like its mass and charge.
Not only electrons, but also other elementary particles have a spin. The spin⁵ of
elementary particles is an integral or half-integral multiple of the quantity ℏ equal
to Planck’s constant ℎ divided by 2𝜋:
$$\hbar = \frac{h}{2\pi} = 1.05 \times 10^{-34}\ \text{J s} = 1.05 \times 10^{-27}\ \text{erg s}. \tag{7.43}$$

⁵More exactly, the maximum value of the projection of the spin onto a direction singled out in
space, for example, onto that of the external field.
In particular, for an electron, 𝐿s = ℏ/2; in this connection, the spin of an electron
is said to equal 1/2. Thus, ℏ is a natural unit of angular momentum, just as the
elementary charge 𝑒 is a natural unit of charge.
In accordance with Eq. (7.42), the intrinsic magnetic moment of an electron is
$$p_{\mathrm{m,s}} = -\frac{e}{m} L_{\mathrm{s}} = -\frac{e}{m} \frac{\hbar}{2} = -\frac{e\hbar}{2m}. \tag{7.44}$$
The quantity⁶
$$\mu_{\mathrm{B}} = \frac{e\hbar}{2m} = 0.927 \times 10^{-23}\ \text{J T}^{-1} = 0.927 \times 10^{-20}\ \text{erg Gs}^{-1} \tag{7.45}$$
is called the Bohr magneton. Hence, the intrinsic magnetic moment of an electron
equals one Bohr magneton.
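
The numerical value in Eq. (7.45) and the factor of two between the ratios (7.41) and (7.42) can be reproduced directly. The sketch below uses standard CODATA values of the constants, which are not quoted in the text; the function names are illustrative assumptions.

```python
E_CHARGE = 1.602176634e-19     # elementary charge, C (CODATA value, assumed)
HBAR = 1.054571817e-34         # reduced Planck constant, J s (CODATA value, assumed)
M_ELECTRON = 9.1093837015e-31  # electron mass, kg (CODATA value, assumed)

def bohr_magneton():
    # Eq. (7.45): mu_B = e*hbar / (2*m), in J/T
    return E_CHARGE * HBAR / (2 * M_ELECTRON)

def orbital_gyromagnetic_ratio():
    # Eq. (7.41): p_m / L = -e / (2*m), in C/kg
    return -E_CHARGE / (2 * M_ELECTRON)

def spin_gyromagnetic_ratio():
    # Eq. (7.42): p_m,s / L_s = -e / m, in C/kg
    return -E_CHARGE / M_ELECTRON

mu_b = bohr_magneton()
print(mu_b)        # in J/T
print(mu_b * 1e3)  # in erg/Gs, since 1 J/T = 10^3 erg/Gs
print(spin_gyromagnetic_ratio() / orbital_gyromagnetic_ratio())
```

The last line shows the ratio of the spin value to the orbital value, which is the factor of two found by Einstein and de Haas and by Barnett.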
The magnetic moment of an atom consists of the orbital and intrinsic moments
of the electrons in it, and also of the magnetic moment of the nucleus (which is
due to the magnetic moments of the elementary particles—protons and neutrons—
forming the nucleus). The magnetic moment of a nucleus is much smaller than
the moments of the electrons. For this reason, it may be disregarded when
considering many questions, and we may consider the magnetic moment of an
atom to equal the vector sum of the magnetic moments of its electrons. The magnetic
moment of a molecule may also be considered equal to the sum of the magnetic
moments of all its electrons.
O. Stern and W. Gerlach determined the magnetic moments of atoms experimentally.
They passed a beam of atoms through a greatly inhomogeneous magnetic
field. The inhomogeneity of the field was achieved by using a special shape of
the electromagnet pole shoes (Fig. 7.12). By Eq. (6.77), the atoms of the beam must
experience the force
$$F = p_{\mathrm{m}} \frac{\partial B}{\partial x} \cos \alpha,$$
whose magnitude and sign depend on the angle 𝛼 made by the vector 𝒑m with the
direction of the field. When the moments of the atoms are distributed chaotically
by directions, the beam contains particles for which the values of 𝛼 vary within the
limits from 0 to 𝜋. It was assumed accordingly that a narrow beam of atoms after
passing between the poles would form on a screen a continuous extended trace
whose edges would correspond to atoms having orientations at angles of 𝛼 = 0 and
𝛼 = 𝜋 (Fig. 7.13). The experiment gave unexpected results. Instead of a continuous
extended trace, separate lines were obtained that were arranged symmetrically with
respect to the trace of the beam obtained in the absence of a field.

⁶According to the equation 𝑊 = −𝒑m · 𝑩, the dimension of magnetic moment equals that of
energy (joule or erg) divided by the dimension of magnetic induction (tesla or gauss).
Fig. 7.12 Fig. 7.13

The Stern-Gerlach experiment showed that the angles at which the magnetic
moments of atoms are oriented relative to a magnetic field can have only discrete
values, i.e., that the projection of a magnetic moment onto the direction of a field is
quantized.
The number of possible values of the projection of the magnetic moment onto
the direction of the magnetic field for different atoms is different. It is two for silver,
aluminium, copper, and the alkali metals, four for vanadium, nitrogen, and the
halogens, five for oxygen, six for manganese, nine for iron, ten for cobalt, etc.
Measurements gave values of the order of several Bohr magnetons for the
magnetic moments of atoms. Some atoms showed no deflections (see, for example,
the trace of mercury and magnesium atoms in Fig. 7.13), which indicates that they
have no magnetic moment.

7.7. Diamagnetism

An electron travelling in an orbit is like a spinning top. Therefore, all the features
of behaviour of gyroscopes under the action of external forces must be inherent
in it, in particular, precession of the electron orbit must appear in the appropriate
conditions. The conditions needed for precession appear if an atom is in an external
magnetic field 𝑩 (Fig. 7.14). In this case, the torque 𝑻 = 𝒑m × 𝑩 is exerted on the
orbit. It tends to align the orbital magnetic moment of an electron 𝒑m with the
direction of the field (the angular momentum 𝑳 will then be directed against the field). The
torque 𝑻 causes the vectors 𝒑m and 𝑳 to precess about the direction of the magnetic
induction vector 𝑩. The angular velocity of this precession is simple to find (see Sec. 5.9 of Vol. I).
During the time d𝑡, the vector 𝑳 receives the increment d𝑳 equal to
d𝑳 = 𝑻 d𝑡.
Fig. 7.14

The vector d𝑳, like the vector 𝑻, is perpendicular to the plane passing through the
vectors 𝑩 and 𝑳; its magnitude is
|d𝑳| = 𝑝m 𝐵 sin 𝛼 d𝑡,
where 𝛼 is the angle between 𝒑m and 𝑩.
During the time d𝑡, the plane containing the vector 𝑳 will turn about the direction
of 𝑩 through the angle
$$d\theta = \frac{|d\mathbf{L}|}{L \sin \alpha} = \frac{p_{\mathrm{m}} B \sin \alpha \, dt}{L \sin \alpha} = \frac{p_{\mathrm{m}}}{L} B \, dt.$$
Dividing this angle by the time d𝑡, we find the angular velocity of precession
$$\omega_{\mathrm{L}} = \frac{d\theta}{dt} = \frac{p_{\mathrm{m}}}{L} B.$$
Introducing the value of the ratio of the magnetic moment and angular momentum
from Eq. (7.41), we get
$$\omega_{\mathrm{L}} = \frac{eB}{2m}. \tag{7.46}$$
The frequency (7.46) is called the frequency of Larmor precession or simply
the Larmor frequency. It depends neither on the angle of inclination of an orbit
with respect to the direction of the magnetic field nor on the radius of the orbit or
the speed of the electron, and, consequently, is the same for all the electrons in an
atom.
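
To get a feeling for the magnitude of the Larmor frequency (7.46), it can be evaluated for a given induction. In the sketch below the values of 𝑒 and 𝑚 are standard CODATA ones not quoted in the text, the field of 1 T is a hypothetical example, and the function name is an assumption.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge, C (CODATA value, assumed)
M_ELECTRON = 9.1093837015e-31  # electron mass, kg (CODATA value, assumed)

def larmor_angular_frequency(b_tesla):
    # Eq. (7.46): omega_L = e*B / (2*m), in rad/s
    return E_CHARGE * b_tesla / (2 * M_ELECTRON)

omega_l = larmor_angular_frequency(1.0)             # hypothetical field of 1 T
revolutions_per_second = omega_l / (2 * math.pi)    # precession turns per second
print(omega_l, revolutions_per_second)
```

The result contains neither the radius of the orbit nor the speed of the electron, in line with the remark that the frequency is the same for all the electrons in an atom.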
The precession of an orbit causes additional motion of the electron about the
direction of the field. If the distance 𝑟′ from the electron to an axis parallel to 𝑩 and
passing through the centre of the orbit did not change, the additional motion of
Fig. 7.15

the electron would occur along a circle of radius 𝑟′ (see the unshaded circle in the
right part of Fig. 7.14). The ring current 𝐼′ = 𝑒(𝜔L /2𝜋) (see the shaded circle) would
correspond to it. The magnetic moment of this current is
$$p'_{\mathrm{m}} = I' S' = e \frac{\omega_{\mathrm{L}}}{2\pi} \pi r'^2 = \frac{e \omega_{\mathrm{L}}}{2} r'^2, \tag{7.47}$$
and is directed oppositely to 𝑩 (see the figure). It is called the induced magnetic
moment.
Actually, owing to the motion of an electron in its orbit, the distance 𝑟′ constantly
changes. Therefore, in Eq. (7.47), we must replace 𝑟′² with its average value in time
⟨𝑟′²⟩. The latter depends on the angle 𝛼 characterizing the orientation of the orbit
plane relative to 𝑩. In particular, for an orbit perpendicular to the vector 𝑩, the
quantity 𝑟′ is constant and equals the radius of the orbit 𝑟. For an orbit whose
plane passes through the direction of 𝑩, the quantity 𝑟′ varies according to the law
𝑟′ = 𝑟 sin(𝜔𝑡), where 𝜔 is the angular velocity of revolution of an electron in its orbit
(Fig. 7.15; the vector 𝑩 and the orbit are in the plane of the drawing). Consequently,
⟨𝑟′²⟩ = ⟨𝑟² sin²(𝜔𝑡)⟩ = 𝑟²/2 (the quantity ⟨sin²(𝜔𝑡)⟩ = 1/2). Averaging over all
possible values of 𝛼, considering them to be equally probable, yields
$$\langle r'^2 \rangle = \frac{2}{3}\,r^2. \tag{7.48}$$
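The averaging that leads to Eq. (7.48) can be checked numerically. The sketch below assumes that "equally probable" orientations means the normal to the orbit plane is distributed uniformly over all directions in space (weight ½ sin 𝛼 d𝛼); this reading, the function names and the step counts are assumptions of the example. For a circular orbit whose normal makes the angle 𝛼 with 𝑩, the squared distance to the axis at orbital phase 𝜑 is 𝑟²(cos²𝜑 + sin²𝜑 cos²𝛼).

```python
import math

# Minimal sketch: numerical check of <r'^2> = (2/3) r^2, Eq. (7.48).
# Assumption: orbit normals are isotropically distributed (weight 0.5*sin(alpha)).


def mean_r_perp_sq(alpha, r=1.0, steps=2000):
    """Time average of r'^2 over one revolution for an orbit tilted by alpha."""
    total = 0.0
    for i in range(steps):
        phi = 2.0 * math.pi * (i + 0.5) / steps  # orbital phase (midpoint rule)
        total += r**2 * (math.cos(phi) ** 2 + (math.sin(phi) * math.cos(alpha)) ** 2)
    return total / steps


def orientation_average(r=1.0, steps=2000, orbit_steps=200):
    """Average of mean_r_perp_sq over isotropically distributed orbit normals."""
    d_alpha = math.pi / steps
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) * d_alpha
        total += mean_r_perp_sq(alpha, r, orbit_steps) * 0.5 * math.sin(alpha) * d_alpha
    return total


print("orbit perpendicular to B:", mean_r_perp_sq(0.0))
print("orbit plane containing B:", mean_r_perp_sq(math.pi / 2))
print("average over orientations:", orientation_average())
```

The two limiting cases reproduce 𝑟² and 𝑟²/2 quoted in the text, and the orientation average comes out close to 2𝑟²/3.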
Using in Eq. (7.47) the value (7.46) for 𝜔L and (7.48) for ⟨𝑟′²⟩, we get the following
expression for the average value of the induced magnetic moment of one electron:
$$\langle p'_{\mathrm{m}} \rangle = -\frac{e^2}{6m}\,r^2 B \tag{7.49}$$
(the minus sign reflects the circumstance that the vectors ⟨𝒑′ₘ⟩ and 𝑩 have opposite
directions). We assumed the orbit to be circular. In the general case (for example,
for an elliptical orbit), we must take ⟨𝑟²⟩ instead of 𝑟², i.e., the mean square of the
distance from an electron to the nucleus.

Summation of Eq. (7.49) over all the electrons yields the induced magnetic
moment of an atom
$$p'_{\mathrm{m,at}} = \sum \langle p'_{\mathrm{m}} \rangle = -\frac{e^2 B}{6m} \sum_{k=1}^{Z} \langle r_k^2 \rangle \tag{7.50}$$
(𝑍 is the atomic number of a chemical element; the number of electrons in an atom
is 𝑍).
Thus, the action of an external magnetic field sets up precession of the electron
orbits with the same angular velocity (7.46) for all the electrons. The additional
motion of the electrons due to precession leads to the production of an induced
magnetic moment of an atom [Eq. (7.50)] directed against the field. Larmor preces-
sion appears in all substances without exception. When atoms by themselves have a
magnetic moment, however, a magnetic field not only induces the moment (7.50),
but also has an orienting action on the magnetic moments of atoms, aligning them
in the direction of the field. The positive (i.e., directed along the field) magnetic
moment that appears may be considerably greater than the negative induced mo-
ment. The resultant moment is, therefore, positive and the substance behaves like a
paramagnetic.
Diamagnetism is found only in substances whose atoms have no magnetic
moment (the vector sum of the orbital and spin magnetic moments of the atom
electrons is zero). If we multiply Eq. (7.50) by the Avogadro constant 𝑁A for such
a substance, we get the magnetic moment for a mole of the substance. Dividing
it by the field strength 𝐻, we find the molar magnetic susceptibility 𝜒m,mol . The
permeability of diamagnetics virtually equals unity. We can therefore assume that
𝐵/𝐻 = 𝜇0 . Thus,
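$$\chi_{\mathrm{m,mol}} = \frac{N_{\mathrm{A}}\,p'_{\mathrm{m,at}}}{H} = -\frac{\mu_0 N_{\mathrm{A}} e^2}{6m} \sum_{k=1}^{Z} \langle r_k^2 \rangle. \tag{7.51}$$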
We must note that the strict quantum-mechanical theory gives exactly the same
expression.
Introduction of the numerical values of 𝜇0 , 𝑁A , 𝑒 and 𝑚 in Eq. (7.51) yields
$$\chi_{\mathrm{m,mol}} = -3.55 \times 10^{9} \sum_{k=1}^{Z} \langle r_k^2 \rangle.$$
The radii of electron orbits have a value of the order of 10⁻¹⁰ m. Hence, a molar
diamagnetic susceptibility of the order of 10⁻¹¹ to 10⁻¹⁰ is obtained, which agrees
quite well with experimental data.
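The numerical coefficient in the last formula can be reproduced directly from Eq. (7.51). In the sketch below the constants are standard SI values (with the classical value of 𝜇0), while the sample atom with two electrons and a mean square radius of (10⁻¹⁰ m)² for each is purely hypothetical and serves only to show the order of magnitude.

```python
import math

# Minimal sketch: coefficient mu0*N_A*e^2/(6*m) of Eq. (7.51) in SI units.
MU_0 = 4.0e-7 * math.pi        # magnetic constant, H/m (classical value)
N_A = 6.02214076e23            # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19     # elementary charge, C
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def diamagnetic_coefficient():
    """Return mu0*N_A*e^2/(6*m); multiplying by sum(<r_k^2>) in m^2 gives m^3/mol."""
    return MU_0 * N_A * E_CHARGE**2 / (6.0 * M_ELECTRON)


def molar_diamagnetic_susceptibility(mean_square_radii):
    """Eq. (7.51) for a list of <r_k^2> values (m^2), one entry per electron."""
    return -diamagnetic_coefficient() * sum(mean_square_radii)


# Hypothetical atom: two electrons, each with <r^2> = (1e-10 m)^2.
chi = molar_diamagnetic_susceptibility([1.0e-20, 1.0e-20])
print(f"coefficient: {diamagnetic_coefficient():.3e}")
print(f"chi_mol for the hypothetical atom: {chi:.2e}")
```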

7.8. Paramagnetism

If the magnetic moment 𝑝m of the atoms differs from zero, the relevant substance is
paramagnetic. A magnetic field tends to align the magnetic moments of the atoms
along 𝑩, while thermal motion tends to scatter them uniformly in all directions. As
a result, a certain preferential orientation of the moments is established along the
field. Its value grows with increasing 𝑩 and diminishes with increasing temperature.
The French physicist and chemist Pierre Curie (1859-1906) established experimen-
tally a law (named Curie’s law in his honour) according to which the susceptibility
of a paramagnetic is
$$\chi_{\mathrm{m,mol}} = \frac{C}{T}, \tag{7.52}$$
where 𝐶 is the Curie constant depending on the kind of substance and 𝑇 the absolute
temperature.
The classical theory of paramagnetism was developed by the French physicist
Paul Langevin (1872-1946) in 1905. We shall limit ourselves to a treatment of this
theory for not too strong fields and not very low temperatures.
According to Eq. (6.76), an atom in a magnetic field has the potential energy 𝑊 =
−𝑝m 𝐵 cos 𝜃 that depends on the angle 𝜃 between the vectors 𝒑m and 𝑩. Therefore,
the equilibrium distribution of the moments by directions must obey Boltzmann’s
law (see Sec. 11.8 of Vol. I). According to this law, the probability of the fact that the
magnetic moment of an atom will make with the direction of the vector 𝑩 an angle
within the limits from 𝜃 to 𝜃 + d𝜃 is proportional to
$$\exp\left(-\frac{W}{kT}\right) = \exp\left(\frac{p_{\mathrm{m}} B \cos\theta}{kT}\right).$$
Introducing the notation
$$a = \frac{p_{\mathrm{m}} B}{kT}, \tag{7.53}$$
we can write the expression determining the probability in the form
exp(𝑎 cos 𝜃). (7.54)
In the absence of a field, all the directions of the magnetic moments are equally
probable. Consequently, the probability of the fact that the direction of a moment
will form with a certain direction 𝑧 an angle within the limits from 𝜃 to 𝜃 + d𝜃 is
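$$(\mathrm{d}P_\theta)_{B=0} = \frac{\mathrm{d}\Omega_\theta}{4\pi} = \frac{2\pi \sin\theta\,\mathrm{d}\theta}{4\pi} = \frac{1}{2}\sin\theta\,\mathrm{d}\theta. \tag{7.55}$$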
Here, d𝛺 𝜃 = 2𝜋 sin 𝜃 d𝜃 is the solid angle enclosed between cones having apex
angles of 𝜃 and 𝜃 + d𝜃 (Fig. 7.16).
Fig. 7.16

When a field is present, the multiplier (7.54) appears in the expression for the
probability:
$$\mathrm{d}P_\theta = A \exp(a\cos\theta)\,\frac{1}{2}\sin\theta\,\mathrm{d}\theta \tag{7.56}$$
(𝐴 is a proportionality constant that is as yet unknown).
The magnetic moment of an atom has a magnitude of the order of one Bohr
magneton, i.e., about 10⁻²³ J T⁻¹ [see Eq. (7.45)]. At the usually achieved fields, the
magnetic induction is of the order of 1 T (10⁴ Gs). Hence, 𝑝m 𝐵 is of the order
of 10⁻²³ J. The quantity 𝑘𝑇 at room temperature is about 4 × 10⁻²¹ J. Thus, 𝑎 =
𝑝m 𝐵/(𝑘𝑇) is much smaller than unity, and exp(𝑎 cos 𝜃) may be replaced with the
approximate expression 1 + 𝑎 cos 𝜃. In this approximation, Eq. (7.56) becomes
$$\mathrm{d}P_\theta = \frac{1}{2} A (1 + a\cos\theta)\sin\theta\,\mathrm{d}\theta.$$
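The order-of-magnitude estimate of 𝑎 given above can be made concrete. The sketch below takes one Bohr magneton for 𝑝m, 𝐵 = 1 T and 𝑇 = 300 K; these specific values and the function name are assumptions chosen to match the orders of magnitude quoted in the text.

```python
import math

# Minimal sketch: size of a = p_m*B/(k*T), Eq. (7.53), under typical conditions.
K_BOLTZMANN = 1.380649e-23        # Boltzmann constant, J/K
BOHR_MAGNETON = 9.2740100783e-24  # Bohr magneton, J/T


def langevin_parameter(p_m, b_tesla, temperature):
    """Return the dimensionless ratio a = p_m*B/(k*T)."""
    return p_m * b_tesla / (K_BOLTZMANN * temperature)


a = langevin_parameter(BOHR_MAGNETON, 1.0, 300.0)  # assumed: 1 T, 300 K
error = math.exp(a) - (1.0 + a)  # what is dropped when exp(a) is replaced by 1 + a
print(f"a = {a:.3e}, linearization error at cos(theta) = 1: {error:.1e}")
```

With 𝑎 of the order of 10⁻³, the neglected terms are of the order of 𝑎², which supports replacing the exponential with 1 + 𝑎 cos 𝜃.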
The constant 𝐴 can be found by proceeding from the fact that the sum of the
probabilities of all possible values of the angle 𝜃 must equal unity:
$$1 = \int_0^{\pi} \frac{1}{2} A (1 + a\cos\theta)\sin\theta\,\mathrm{d}\theta.$$
Hence, 𝐴 = 1, so that
$$\mathrm{d}P_\theta = \frac{1}{2}(1 + a\cos\theta)\sin\theta\,\mathrm{d}\theta.$$
Assume that unit volume of a paramagnetic contains 𝑛 atoms. Consequently,
the number of atoms whose magnetic moments form angles from 𝜃 to 𝜃 + d𝜃 with
the direction of the field will be
$$\mathrm{d}n_\theta = n\,\mathrm{d}P_\theta = \frac{1}{2}\,n(1 + a\cos\theta)\sin\theta\,\mathrm{d}\theta.$$
Each of these atoms makes a contribution of 𝑝m cos 𝜃 to the resultant magnetic
moment. Therefore, we get the following expression for the magnetic moment of
unit volume (i.e., for the magnetization):
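$$M = \int_0^{\pi} p_{\mathrm{m}}\cos\theta\,\mathrm{d}n_\theta = \frac{1}{2}\,n p_{\mathrm{m}} \int_0^{\pi} (1 + a\cos\theta)\cos\theta\sin\theta\,\mathrm{d}\theta = \frac{n p_{\mathrm{m}} a}{3}.$$
Substitution for 𝑎 of its value from Eq. (7.53) yields
$$M = \frac{n p_{\mathrm{m}}^2 B}{3kT}.$$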
Finally, dividing 𝑀 by 𝐻 and assuming that 𝐵/𝐻 = 𝜇0 (for a paramagnetic 𝜇 is
virtually equal to unity), we find the susceptibility
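$$\chi_{\mathrm{m}} = \frac{\mu_0 n p_{\mathrm{m}}^2}{3kT}. \tag{7.57}$$
Substituting the Avogadro constant 𝑁A for 𝑛, we get an expression for the molar
susceptibility:
$$\chi_{\mathrm{m,mol}} = \frac{\mu_0 N_{\mathrm{A}} p_{\mathrm{m}}^2}{3kT}. \tag{7.58}$$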
We have arrived at Curie’s law. A comparison of Eqs. (7.52) and (7.58) gives the
following expression for the Curie constant:
$$C = \frac{\mu_0 N_{\mathrm{A}} p_{\mathrm{m}}^2}{3k}. \tag{7.59}$$
It must be remembered that Eq. (7.58) has been obtained assuming that 𝑝m 𝐵 ≪
𝑘𝑇. In very strong fields and at low temperatures, deviations are observed from
proportionality between the magnetization of a paramagnetic 𝑀 and the field
strength 𝐻. In particular, a state of magnetic saturation may set in when all the
𝒑m ’s are lined up along the field, and a further increase in 𝐻 does not result in a
growth in 𝑀.
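The weak-field result and the approach to saturation can both be seen by averaging cos 𝜃 numerically with the full Boltzmann weight exp(𝑎 cos 𝜃) sin 𝜃 instead of its linearized form. The sketch below is only an illustration of the classical model described above; the function name, the integration scheme and the sample values of 𝑎 are assumptions of the example. The magnetization is then 𝑀 = 𝑛𝑝m⟨cos 𝜃⟩.

```python
import math

# Minimal sketch: <cos(theta)> for the classical distribution exp(a*cos(theta))*sin(theta).
# For a << 1 it should approach a/3 (Curie's law); for a >> 1 it should approach 1.


def mean_cos_theta(a, steps=20000):
    """Boltzmann average of cos(theta), computed with the midpoint rule."""
    d_theta = math.pi / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        weight = math.exp(a * math.cos(theta)) * math.sin(theta)
        numerator += math.cos(theta) * weight
        denominator += weight
    return numerator / denominator


for a in (0.01, 0.1, 1.0, 10.0, 50.0):
    print(f"a = {a:5.2f}: <cos> = {mean_cos_theta(a):.5f}, linear estimate a/3 = {a / 3:.5f}")
```

For small 𝑎 the two columns agree, while for large 𝑎 the average levels off below unity, which is the saturation mentioned in the text.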
The values of 𝜒m,mol calculated by Eq. (7.58) in a number of cases agree quite
well with the values obtained experimentally.
The quantum theory of paramagnetism takes account of the fact that only
discrete orientations of the magnetic moment of an atom relative to a field are
possible. It arrives at an expression for 𝜒m,mol similar to Eq. (7.58).

7.9. Ferromagnetism

Substances capable of having magnetization in the absence of an external magnetic
field form a special class of magnetics. According to the name of their most
widespread representative—ferrum (iron)—they have been called ferromagnet-
ics. In addition to iron, they include nickel, cobalt, gadolinium, their alloys and
compounds, and also certain alloys and compounds of manganese and chromium
with non-ferromagnetic elements. All these substances display ferromagnetism
only in the crystalline state.
Ferromagnetics are strongly magnetic substances. Their magnetization exceeds
that of diamagnetics and paramagnetics, which belong to the category of weakly
magnetized substances, an enormous number of times (up to 10¹⁰).

Fig. 7.17    Fig. 7.18

The magnetization of weakly magnetized substances varies linearly with the
field strength. The magnetization of ferromagnetics depends on 𝐻 in an intricate
way. Figure 7.17 shows the magnetization curve for a ferromagnetic whose magnetic
moment was initially zero (it is called the initial or zero magnetization curve).
Already in fields of the order of several oersteds (about 100 A m⁻¹), the magneti-
zation 𝑀 reaches saturation. The initial magnetization curve in a 𝐵-𝐻 diagram is
shown in Fig. 7.18 (curve 0-1). We remind our reader that 𝐵 = 𝜇0 (𝐻 + 𝑀). Therefore,
when saturation is reached, 𝐵 continues to grow with increasing 𝐻 according to a
linear law: 𝐵 = 𝜇0 𝐻 + constant, where constant = 𝜇0 𝑀sat .
A magnetization curve for iron was first obtained and investigated in detail
by the Russian scientist Aleksandr Stoletov (1839-1896). The ballistic method of
measuring the magnetic induction which he developed has been finding wide
application (see Sec. 8.3).
Apart from the non-linear relation between 𝐻 and 𝑀 (or between 𝐻 and 𝐵),
ferromagnetics are characterized by the presence of hysteresis. If we bring magne-
tization up to saturation (point 1 in Fig. 7.18) and then diminish the magnetic field
strength, the induction 𝐵 will no longer follow the initial curve 0-1, but will change
in accordance with curve 1-2. As a result, when the strength of the external field
vanishes (point 2), the magnetization does not vanish and is characterized by the
quantity 𝐵r called the residual induction. The magnetization for this point has
the value 𝑀r called the retentivity or remanence.
The magnetization vanishes only under the action of the field 𝐻c directed
oppositely to the field that produced the magnetization. The field strength 𝐻c is
called the coercive force.
The existence of remanence makes it possible to manufacture permanent magnets,
i.e., bodies that have a magnetic moment and produce a magnetic field in the
space surrounding them without the expenditure of energy for maintaining the
macroscopic currents. A permanent magnet retains its properties better when the
coercive force of the material it is made of is higher.
When an alternating magnetic field acts on a ferromagnetic, the induction
changes in accordance with curve 1-2-3-4-5-1 (Fig. 7.18) called a hysteresis loop
(a similar curve is obtained in an 𝑀-𝐻 diagram). If the maximum values of 𝐻 are
such that the magnetization reaches saturation, we get the so-called maximum
hysteresis loop (the solid loop in Fig. 7.18). If saturation is not reached at the
amplitude values of 𝐻, we get a loop called a partial cycle (the dashed line in the
figure). The number of such partial cycles is infinite, and all of them are within the
maximum hysteresis loop.
Hysteresis results in the fact that the magnetization of a ferromagnetic is not a
unique function of 𝐻. It depends very greatly on the previous history of a specimen—
on the fields which it was in previously. For example, in a field of strength 𝐻1
(Fig. 7.18), the induction may have any value ranging from 𝐵₁′ to 𝐵₁″.
It follows from everything said above about ferromagnetics that they are very
similar in their properties to ferroelectrics (see Sec. 2.9). In connection with the
ambiguity of the dependence of 𝐵 on 𝐻, the concept of permeability is applied
only to the initial magnetization curve. The permeability of ferromagnetics 𝜇 (and,
consequently, their magnetic susceptibility 𝜒m ) is a function of the field strength.
Figure 7.19a shows an initial magnetization curve. Let us draw from the origin
of coordinates a straight line that passes through an arbitrary point on the curve.
The slope of this line is proportional to the ratio 𝐵/𝐻, i.e., to the permeability 𝜇
for the relevant value of the field strength. When 𝐻 grows from zero, the slope
(and, consequently, 𝜇) first grows. At point 2 it reaches a maximum (straight line
0-2 is a tangent to the curve) and then diminishes. Figure 7.19b shows how
𝜇 depends on 𝐻. A glance at the figure shows that the maximum value of the
permeability is reached somewhat earlier than saturation. Upon an unlimited
increase in 𝐻, the permeability approaches unity asymptotically. This can be seen
from the circumstance that 𝑀 in the expression 𝜇 = 1 + 𝑀/𝐻 cannot exceed the
value 𝑀sat .
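The asymptotic behaviour of the permeability can be illustrated with the expression 𝜇 = 1 + 𝑀/𝐻 once 𝑀 has reached its saturation value. In the sketch below the saturation magnetization of 1.7 × 10⁶ A m⁻¹ is an assumed, roughly iron-like figure used only for illustration, and the field values are arbitrary.

```python
# Minimal sketch: mu = 1 + M/H beyond saturation, where M stays equal to M_sat.
# M_SAT is an assumed illustrative value, not a figure taken from the text.

M_SAT = 1.7e6  # saturation magnetization, A/m (assumption)


def relative_permeability(h, m):
    """Return mu = 1 + M/H for field strength H and magnetization M (both in A/m)."""
    return 1.0 + m / h


for h in (1e3, 1e5, 1e7, 1e9):
    print(f"H = {h:.0e} A/m: mu = {relative_permeability(h, M_SAT):.4f}")
```

Since 𝑀 cannot exceed 𝑀sat, the ratio 𝑀/𝐻 falls off as 1/𝐻 and 𝜇 tends to unity.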
The quantities 𝐵r (or 𝑀r ), 𝐻c and 𝜇max are the basic characteristics of a fer-
romagnetic. If the coercive force 𝐻c is great, the ferromagnetic is called hard. It
is characterized by a broad hysteresis loop. A ferromagnetic with a low 𝐻c (and
accordingly with a narrow hysteresis loop) is called soft. The characteristic of a
ferromagnetic is chosen depending on the use it is to be put to. Thus, hard ferromag-
netics are used for permanent magnets, and soft ones for the cores of transformers.

Fig. 7.19

Table 7.1 gives the characteristics of several typical ferromagnetics.


The fundamentals of the theory of ferromagnetism were presented by the Soviet
physicist Yakov Frenkel (1894-1952) and the German physicist Werner Heisenberg
(1901-1976) in 1928. It follows from experiments involving the studying of gyro-
magnetic phenomena (see Sec. 7.6) that the intrinsic (spin) magnetic moments of
electrons are responsible for the magnetic properties of ferromagnetics. In definite
conditions, forces⁷ may appear in crystals that make the magnetic moments of
the electrons become lined up parallel to one another. The result is the setting
up of regions of spontaneous magnetization, also called domains. Within the
confines of each domain, a ferromagnetic is spontaneously magnetized to saturation
and has a definite magnetic moment. The directions of these moments are different
for different domains (Fig. 7.20), so that in the absence of an external field the total
moment of an entire body is zero. Domains have dimensions of the order of 1 µm
to 10 µm.
⁷These forces are called exchange forces. Their explanation is given only by quantum mechanics.

Table 7.1

Substance     Composition               𝜇max      𝐵r, T    𝐻c, A m⁻¹
Iron          99.9% Fe                  5000      —        80
Supermalloy   79% Ni, 5% Mo, 16% Fe     800000    —        0.3
Alnico        10% Al, 19% Ni, 18% Co    —         0.9      52000

Fig. 7.20

The action of a field on domains at different stages of the magnetization process
is different. First, with weak fields, displacement of the domain boundaries is
observed. As a result, the domains whose moments make a smaller angle with 𝑯
grow at the expense of the domains for which the angle 𝜃 between the vectors 𝒑m
and 𝑯 is greater. For example, domains 1 and 3 in Fig. 7.20 grow at the expense
of domains 2 and 4. With an increase in the field strength, this process goes on
further and further until the domains with a smaller 𝜃 (which have a smaller energy
in a magnetic field) completely absorb the domains that are less advantageous from
the energy viewpoint. In the next stage, the magnetic moments of the domains
turn in the direction of the field. The moments of the electrons within the confines
of a domain turn simultaneously without violating their strict parallelism to one
another. These processes (excluding slight displacements of the boundaries between the
domains in very weak fields) are irreversible, and this is exactly what causes hysteresis.
There is a definite temperature 𝑇C for every ferromagnetic at which the re-
gions of spontaneous magnetization (domains) break up and the substance loses
its ferromagnetic properties. This temperature is called the Curie point. It is
768 °C for iron and 365 °C for nickel. At a temperature above the Curie point, a
ferromagnetic becomes an ordinary paramagnetic whose magnetic susceptibility
obeys the Curie-Weiss law
$$\chi_{\mathrm{m,mol}} = \frac{C}{T - T_{\mathrm{C}}} \tag{7.60}$$
[compare with Eq. (7.52)]. When a ferromagnetic is cooled to below its Curie point,
domains once more appear in it.
Exchange forces sometimes result in the appearance of so-called antiferro-
magnetics (chromium, manganese, etc.). The existence of antiferromagnetics was
predicted by the Soviet physicist Lev Landau (1908-1968) in 1933. In antiferromagnet-
ics, the intrinsic magnetic moments of the electrons are spontaneously oriented
antiparallel to one another. Such an orientation involves adjacent atoms in pairs.
The result is that antiferromagnetics have an extremely low magnetic susceptibility
and behave like very weak paramagnetics. There is also a temperature 𝑇N for an-
tiferromagnetics at which the antiparallel orientation of the spins vanishes. This
temperature is known as the antiferromagnetic Curie point or the Néel point.
Some antiferromagnetics (for example, erbium, dysprosium, alloys of manganese
and copper) have two such points (an upper and a lower Néel point), the antiferro-
magnetic properties being observed only at the intermediate temperatures. Above
the upper point, the substance behaves like a paramagnetic, and at temperatures
below the lower Néel point it becomes a ferromagnetic.

Chapter 8
ELECTROMAGNETIC
INDUCTION

8.1. The Phenomenon of Electromagnetic Induction

In 1831, the British physicist and chemist Michael Faraday (1791-1867) discovered
that an electric current is produced in a closed conducting loop when the flux
of magnetic induction through the surface enclosed by this loop changes. This
phenomenon is called electromagnetic induction, and the current produced an
induced current.
The phenomenon of electromagnetic induction shows that when the magnetic
flux in a loop changes, an induced electromotive force ℰᵢ is set up. The value of ℰᵢ
does not depend on how the magnetic flux 𝛷 is changed and is determined only by
the rate of change of 𝛷, i.e., by the value of d𝛷/d𝑡. A change in the sign of d𝛷/d𝑡 is
attended by a change in the direction of ℰᵢ.
Let us consider the following example. Figure 8.1 shows loop 1 whose current 𝐼1
can be varied by means of a rheostat. This current sets up a magnetic field through
loop 2. If we increase the current 𝐼1 , the magnetic induction flux 𝛷 through loop
2 will grow. This will lead to the appearance in loop 2 of the induced current 𝐼2
registered by a galvanometer. Diminishing of the current 𝐼1 will cause the magnetic
flux through the second loop to decrease. This will result in the appearance in it
of an induced current of a direction opposite to that in the first case. An induced
current 𝐼2 can also be set up by bringing loop 2 closer to loop 1 or moving it away
from it. In these two cases, the directions of the induced current are opposite.
Finally, electromagnetic induction can be produced without translational motion
of loop 2, but by turning it so as to change the angle between a normal to the loop
and the direction of the field.
Fig. 8.1

E. Lenz established a rule permitting us to find the direction of an induced
current. Lenz’s rule states that an induced current is always directed so as to oppose
the cause producing it. If, for example, a change in 𝛷 is due to motion of loop 2,
then an induced current is set up of a direction such that the force of interaction
with the first loop opposes the motion of the loop. When loop 2 approaches loop 1
(see Fig. 8.1), a current 𝐼₂′ is set up whose magnetic moment is directed oppositely to
the field of the current 𝐼1 (the angle 𝛼 between the vectors 𝒑′ₘ and 𝑩 is 𝜋). Hence,
loop 2 will experience a force repelling it from loop 1 [see Eq. (6.77)]. When loop 2 is
moved away from loop 1, the current 𝐼₂″ is produced whose moment 𝒑″ₘ coincides
in direction with the field of the current 𝐼1 (𝛼 = 0) so that the force exerted on loop
2 is directed toward loop 1.
Assume that both loops are stationary and the current in loop 2 is induced
by changing the current 𝐼1 in loop 1. Now a current 𝐼2 is induced of a direction
such that the intrinsic magnetic flux it produces tends to weaken the change in the
external flux leading to the setting up of the induced current. When 𝐼1 grows, i.e.,
the external magnetic flux directed to the right is increased, a current 𝐼₂′ is induced
that sets up a flux directed to the left. When 𝐼1 diminishes, the current 𝐼₂″ is set
up whose intrinsic magnetic flux has the same direction as the external flux and,
consequently, tends to keep the external flux unchanged.

8.2. Induced E.M.F.

We have established in the preceding section that changes in the magnetic flux 𝛷
through a loop set up an induced e.m.f. ℰᵢ in it. To find the relation between ℰᵢ and
the rate of change of 𝛷, we shall consider the following example.

Fig. 8.2

Let us take a loop with a movable rod of length 𝑙 (Fig. 8.2a). We shall put it in a
homogeneous magnetic field at right angles to the plane of the loop and directed
beyond the drawing. Let us bring the rod into motion with the velocity 𝒗. The
current carriers in the rod—electrons—will also begin to move relative to the
field with the same velocity. As a result, each electron will begin to experience the
magnetic force
$$\mathbf{F}_{\parallel} = -e(\mathbf{v} \times \mathbf{B}), \tag{8.1}$$
directed along the rod [see Eq. (6.33); the charge of an electron is −𝑒]. The action of
this force is equivalent to the action on an electron of an electric field of strength
𝑬 = 𝒗 × 𝑩.
This field is of a non-electrostatic origin. Its circulation around a loop gives the
value of the e.m.f. induced in the loop:
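$$\mathcal{E}_i = \oint \mathbf{E} \cdot \mathrm{d}\mathbf{l} = \oint (\mathbf{v} \times \mathbf{B}) \cdot \mathrm{d}\mathbf{l} = \int_1^2 (\mathbf{v} \times \mathbf{B}) \cdot \mathrm{d}\mathbf{l} \tag{8.2}$$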
(the integrand differs from zero only on section 1-2 formed by the rod).
To be able to judge the direction in which the e.m.f. acts from the
sign of ℰᵢ, we shall consider ℰᵢ positive when its direction forms a right-handed
system with the direction of a normal to the loop.
Let us choose the normal as shown in Fig. 8.2. Hence, when calculating the
circulation, we must circumvent the loop clockwise and choose the direction of the
vectors d𝒍 accordingly. If we put the constant vector 𝒗 × 𝑩 in Eq. (8.2) outside the
integral, we get
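$$\mathcal{E}_i = (\mathbf{v} \times \mathbf{B}) \cdot \int_1^2 \mathrm{d}\mathbf{l} = (\mathbf{v} \times \mathbf{B}) \cdot \mathbf{l},$$
where 𝒍 is the vector depicted in Fig. 8.2b. Let us perform a cyclic rearrangement of
the multipliers in the expression obtained, after which we shall multiply and divide
it by d𝑡:
$$\mathcal{E}_i = \mathbf{B} \cdot (\mathbf{l} \times \mathbf{v}) = \frac{\mathbf{B} \cdot (\mathbf{l} \times \mathbf{v}\,\mathrm{d}t)}{\mathrm{d}t}. \tag{8.3}$$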
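A glance at Fig. 8.2b shows that
$$\mathbf{l} \times \mathbf{v}\,\mathrm{d}t = -\hat{\mathbf{n}}\,\mathrm{d}S,$$
where d𝑆 is the increment of the loop area during the time d𝑡. By the definition of a
flux, 𝑩 · d𝑺 = 𝑩 · 𝒏̂ d𝑆 is the flux through the area d𝑆, i.e., the increment of the flux
d𝛷 through the loop. Thus,
$$\mathbf{B} \cdot (\mathbf{l} \times \mathbf{v}\,\mathrm{d}t) = -\mathbf{B} \cdot \hat{\mathbf{n}}\,\mathrm{d}S = -\mathrm{d}\Phi.$$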
With a view to this expression, Eq. (8.3) can be written as
$$\mathcal{E}_i = -\frac{\mathrm{d}\Phi}{\mathrm{d}t}. \tag{8.4}$$
We have found that d𝛷/d𝑡 and ℰᵢ have opposite signs. The sign of the flux and
that of ℰᵢ are associated with the choice of the direction of a normal to the plane of a
loop. With our selection of the normal (see Fig. 8.2), the sign of d𝛷/d𝑡 is positive, and
that of ℰᵢ is negative. If we had chosen a normal directed not beyond the drawing,
but toward us, the sign of d𝛷/d𝑡 would be negative and that of ℰᵢ positive.
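Equation (8.4) can be checked for the sliding-rod loop by differentiating the flux numerically. In the sketch below the field, rod length, initial position and speed are arbitrary illustrative values, and the normal is taken along the field so that the flux and d𝛷/d𝑡 are positive, as in the choice discussed above; all names are assumptions of the example.

```python
# Minimal sketch: induced e.m.f. of a loop with a sliding rod, E_i = -dPhi/dt (Eq. 8.4).
# All numerical values are illustrative assumptions.


def flux(t, b=0.5, length=0.2, x0=0.1, v=3.0):
    """Flux (Wb) through the loop at time t: Phi = B * l * (x0 + v*t)."""
    return b * length * (x0 + v * t)


def induced_emf(flux_fn, t, dt=1e-6):
    """Return -dPhi/dt (V), estimated with a central difference."""
    return -(flux_fn(t + dt) - flux_fn(t - dt)) / (2.0 * dt)


emf = induced_emf(flux, t=1.0)
print(f"E_i = {emf:.4f} V, expected -B*l*v = {-0.5 * 0.2 * 3.0:.4f} V")
```

The magnitude equals 𝐵𝑙𝑣, and the negative sign corresponds to a growing flux for this choice of normal.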
The SI unit of magnetic induction flux is the weber (Wb), which is the flux
through a surface of 1 m² intersected by magnetic field lines normal to it with
𝐵 = 1 T. At a rate of change of the flux equal to 1 Wb s⁻¹, an e.m.f. of 1 V is induced
in the loop. In the Gaussian system of units, Eq. (8.4) has the form
$$\mathcal{E}_i = -\frac{1}{c}\,\frac{\mathrm{d}\Phi}{\mathrm{d}t}. \tag{8.5}$$
The unit of 𝛷 in this system is the maxwell (Mx) equal to the flux through a surface
of 1 cm² at 𝐵 = 1 Gs. Equation (8.5) gives ℰᵢ in cgse units of potential. To find it in volts, we must
multiply the result obtained by 300. Since 300/𝑐 = 10⁻⁸ (with 𝑐 = 3 × 10¹⁰ cm s⁻¹), we have
$$\mathcal{E}_i\,(\mathrm{V}) = -10^{-8}\,\frac{\mathrm{d}\Phi}{\mathrm{d}t}\,(\mathrm{Mx\,s^{-1}}). \tag{8.6}$$
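The conversion in Eq. (8.6) is consistent with the relation 1 Wb = 10⁸ Mx, which follows from 1 T = 10⁴ Gs and 1 m² = 10⁴ cm². The small sketch below, with hypothetical function names, checks that a flux changing at 10⁸ Mx s⁻¹ gives the same e.m.f. as a flux changing at 1 Wb s⁻¹.

```python
# Minimal sketch: Eq. (8.6) versus Eq. (8.4) for the same physical rate of change of flux.

MAXWELL_PER_WEBER = 1.0e8  # 1 Wb = 1e4 Gs * 1e4 cm^2 = 1e8 Mx


def emf_volts_from_maxwell_rate(dphi_dt_mx_per_s):
    """Eq. (8.6): e.m.f. in volts from dPhi/dt given in Mx/s."""
    return -1.0e-8 * dphi_dt_mx_per_s


def emf_volts_from_weber_rate(dphi_dt_wb_per_s):
    """Eq. (8.4): e.m.f. in volts from dPhi/dt given in Wb/s."""
    return -dphi_dt_wb_per_s


rate_wb = 1.0                           # Wb/s
rate_mx = rate_wb * MAXWELL_PER_WEBER   # the same rate expressed in Mx/s
print(emf_volts_from_weber_rate(rate_wb), emf_volts_from_maxwell_rate(rate_mx))
```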
In the reasoning that led us to Eq. (8.4), the part of the extraneous forces main-
taining a current in a loop was played by magnetic forces. The work of these forces
on a unit positive charge, equal by definition to the e.m.f., is other than zero. This
circumstance apparently contradicts the statement made in Sec. 6.5 that a magnetic
force can do no work on a charge. This contradiction is eliminated if we take into
account that the force (8.1) is not the total magnetic force exerted on an electron,
but only the component of this force parallel to the conductor and due to the
velocity 𝒗 (see the force 𝑭∥ in Fig. 8.3). This component causes the electron to start
moving along the conductor with the velocity 𝒖, as a result of which a magnetic
force perpendicular to the wire is set up equal to
$$\mathbf{F}_{\perp} = -e(\mathbf{u} \times \mathbf{B})$$
(this component makes no contribution to the circulation because it is perpendicular
to d𝒍).

Fig. 8.3

The total magnetic force exerted on an electron is
$$\mathbf{F} = \mathbf{F}_{\parallel} + \mathbf{F}_{\perp},$$
and the work of this force on an electron during the time d𝑡 is
$$\mathrm{d}A = \mathbf{F}_{\parallel} \cdot \mathbf{u}\,\mathrm{d}t + \mathbf{F}_{\perp} \cdot \mathbf{v}\,\mathrm{d}t = F_{\parallel} u\,\mathrm{d}t - F_{\perp} v\,\mathrm{d}t$$
(the directions of the vectors 𝑭∥ and 𝒖 are the same, and of the vectors 𝑭⊥ and 𝒗
are opposite; see Fig. 8.3). Substituting for the magnitudes of the forces their values
𝐹∥ = 𝑒𝑣𝐵 and 𝐹⊥ = 𝑒𝑢𝐵, we find that the work of the total magnetic force equals
zero.
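The cancellation of the two contributions can be verified with explicit vectors. In the sketch below the field, the rod velocity and the drift speed are arbitrary illustrative numbers; the drift velocity 𝒖 is chosen along the force on the electron due to the rod's motion, as the text describes. The helper functions `cross` and `dot` are written out for the example.

```python
# Minimal sketch: the power of the total magnetic force on an electron is zero,
# although its two components separately do non-zero work.

E_CHARGE = 1.602176634e-19  # elementary charge, C (the electron charge is -e)


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def magnetic_force(velocity, b_field):
    """Magnetic force on an electron: F = -e (velocity x B)."""
    return tuple(-E_CHARGE * c for c in cross(velocity, b_field))


B = (0.0, 0.0, -0.5)   # field directed into the drawing, T (assumed)
v = (2.0, 0.0, 0.0)    # velocity of the rod, m/s (assumed)
u = (0.0, -1.0e-4, 0.0)  # drift velocity along the rod, m/s (assumed)

f_par = magnetic_force(v, B)    # component along the rod, due to v
f_perp = magnetic_force(u, B)   # component across the rod, due to u

power_par = dot(f_par, u)       # positive: this component drives the current
power_perp = dot(f_perp, v)     # negative: this component retards the rod
print(power_par, power_perp, power_par + power_perp)
```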
The force 𝑭 ⊥ is directed oppositely to the velocity of the rod 𝒗. Therefore, for
the rod to move with the constant velocity 𝒗, the external force 𝑭 ext must be applied
to it that balances the sum of the forces 𝑭 ⊥ applied to all the electrons contained
in the rod. It is exactly at the expense of the work of this force that the energy
liberated in the loop by the induced current will be produced.
Our explanation of the appearance of an induced e.m.f. relates to the case when
the magnetic field is constant, while the geometry of the loop changes. The magnetic
flux through the loop can also be changed, however, by changing 𝑩. In this case, the
explanation of the appearance of an e.m.f. will differ in principle. The time-varying
magnetic field sets up a vortex electric field 𝑬 (this is treated in detail in Sec. 9.1). The
action of the field 𝑬 causes the current carriers in a conductor to start moving—an
induced current is set up. The relation between the induced e.m.f. and the changes
in the magnetic flux in this case too is described by Eq. (8.4).
Assume that the loop in which an e.m.f. is induced consists of 𝑁 turns instead
of one, i.e., it is a solenoid, for example. Since the turns are connected in series, ℰᵢ
will equal the sum of the e.m.f.’s induced in each of the turns separately:
$$\mathcal{E}_i = -\sum \frac{\mathrm{d}\Phi}{\mathrm{d}t} = -\frac{\mathrm{d}}{\mathrm{d}t}\left(\sum \Phi\right).$$
The quantity
$$\Psi = \sum \Phi \tag{8.7}$$
is called the flux linkage or the total magnetic flux. It is measured in the same
units as 𝛷. If the flux through each of the turns is the same, then
𝛹 = 𝑁𝛷. (8.8)
The e.m.f. induced in an intricate loop is determined by the formula
$$\mathcal{E}_i = -\frac{\mathrm{d}\Psi}{\mathrm{d}t}. \tag{8.9}$$

8.3. Ways of Measuring the Magnetic Induction

Assume that the total magnetic flux linked to a loop changes from 𝛹1 to 𝛹2 . Let us
find the charge 𝑞 that flows through each section of the loop. The instantaneous
value of the current in the loop is
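$$I = \frac{\mathcal{E}_i}{R} = -\frac{1}{R}\,\frac{\mathrm{d}\Psi}{\mathrm{d}t}.$$
Hence,
$$\mathrm{d}q = I\,\mathrm{d}t = -\frac{1}{R}\,\frac{\mathrm{d}\Psi}{\mathrm{d}t}\,\mathrm{d}t = -\frac{1}{R}\,\mathrm{d}\Psi.$$
Integration of this expression yields the total charge:
$$q = \int \mathrm{d}q = -\frac{1}{R}\int_1^2 \mathrm{d}\Psi = \frac{1}{R}(\Psi_1 - \Psi_2). \tag{8.10}$$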
Equation (8.10) underlies the ballistic method of measuring the magnetic in-
duction developed by A. Stoletov. It consists in the following. A small coil with 𝑁
turns is placed in the field being studied. The coil is arranged so that the vector 𝑩 is
perpendicular to the plane of the turns (Fig. 8.4a). Hence, the total magnetic flux
linked with the coil will be
𝛹1 = 𝑁 𝐵𝑆,
where 𝑆 is the area of one turn, which must be so small that the field within its
limits may be considered homogeneous.
When the coil is turned through 180 degrees (Fig. 8.4b), the flux linkage becomes
equal to 𝛹2 = −𝑁 𝐵𝑆 (𝒏̂ and 𝑩 are directed oppositely). Hence, the change in the
total flux linkage when the coil is turned is 𝛹1 − 𝛹2 = 2𝑁 𝐵𝑆. If the coil is turned
sufficiently quickly, a short current pulse is produced in the loop upon which the
charge
$$q = \frac{1}{R}\,2NBS \tag{8.11}$$
flows [see Eq. (8.10)].

Fig. 8.4

The charge flowing in the circuit during the short current pulse can be measured
with the aid of a so-called ballistic galvanometer. The latter is a galvanometer
with a great period of natural oscillations. Having measured 𝑞 and knowing 𝑅, 𝑁,
and 𝑆, we can find 𝐵 by Eq. (8.11). By 𝑅, here, is meant the resistance of the entire
circuit including the coil, the connecting wires, and the galvanometer.
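As a worked illustration of the ballistic method, Eq. (8.11) can be inverted to give 𝐵 = 𝑞𝑅/(2𝑁𝑆). The coil parameters and the field in the sketch below are hypothetical values chosen for the example, and the function names are not from the text.

```python
# Minimal sketch: ballistic measurement of B using Eqs. (8.10) and (8.11).
# The coil data and the field value are hypothetical.


def charge_from_flux_linkage(psi_1, psi_2, resistance):
    """Eq. (8.10): charge (C) passed when the flux linkage changes from psi_1 to psi_2."""
    return (psi_1 - psi_2) / resistance


def induction_from_ballistic_charge(charge, resistance, turns, area):
    """Eq. (8.11) solved for B: B = q*R / (2*N*S), for a coil turned through 180 degrees."""
    return charge * resistance / (2.0 * turns * area)


N_TURNS = 100      # number of turns (assumed)
AREA = 1.0e-4      # area of one turn, m^2 (assumed)
RESISTANCE = 50.0  # resistance of the whole circuit, ohm (assumed)
B_TRUE = 0.2       # field to be "measured", T (assumed)

psi_1 = N_TURNS * B_TRUE * AREA   # flux linkage before the turn
psi_2 = -N_TURNS * B_TRUE * AREA  # flux linkage after the turn
q = charge_from_flux_linkage(psi_1, psi_2, RESISTANCE)
b_measured = induction_from_ballistic_charge(q, RESISTANCE, N_TURNS, AREA)
print(f"q = {q:.2e} C, recovered B = {b_measured:.3f} T")
```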
Instead of turning the coil, we may switch on (or off) the magnetic field being
studied, or reverse its direction.
To measure 𝐵, the circumstance is also used that the electric resistance of
bismuth grows greatly under the action of a magnetic field—by about five per cent
per tenth of a tesla (per 1000 Gs). Consequently, we can determine the magnetic
induction of a magnetic field by placing a preliminarily graduated bismuth coil
(Fig. 8.5) into the field and measuring the relative change in its resistance.
We must note that the electric resistance of other metals also grows in a magnetic
field, but to a much smaller extent. For copper, for example, the increase in the
resistance is about one ten-thousandth of that for bismuth.

8.4. Eddy Currents

Induced currents can also be produced in solid massive conductors. In this case,
they are known as eddy currents. The electric resistance of a massive conductor
is small; therefore, the eddy currents may reach a very high value.
In accordance with Lenz’s rule, eddy currents choose paths and directions in a
conductor such as to resist by their action the reason setting them up as much as
possible. This is why good conductors moving in a strong magnetic field experience
great retardation due to the interaction of the eddy currents with the magnetic
field. This is taken advantage of for damping the movable parts of galvanometers,
seismographs, and other instruments. A conducting (for example, aluminium) plate
in the form of a sector is fastened to the movable part of an instrument (Fig. 8.6)
and is introduced into the gap between the poles of a strong permanent magnet.
Movement of the plate causes eddy currents to be produced in it that brake the
system. The advantage of such a device is that the braking action appears only
when the plate moves and vanishes when the plate is stationary. Therefore, the
electromagnetic damper is absolutely no hindrance to the instrument accurately
arriving at its equilibrium position.

Fig. 8.5    Fig. 8.6

The thermal action of eddy currents is used in induction furnaces. Such a
furnace is a coil supplied with a high-frequency current of a high value. If we place
a conducting body inside the coil, intensive eddy currents will be produced in it
that can heat the body up to its melting point. This method is used to melt metals
in vacuum. The resulting materials have an exceedingly high purity.
Eddy currents are also used to heat the internal metal components of vacuum
installations in order to degas them.
Eddy currents are quite often undesirable, and special measures must be taken
to eliminate them. For example, to prevent the losses of energy for heating trans-
former cores by eddy currents, the cores are assembled of thin insulated sheets.
The latter are arranged so that the possible directions of the eddy currents will
be perpendicular to them. The appearance of ferrites (semiconductor magnetic
materials with a high electric resistance) made it possible to manufacture solid
cores.
The eddy currents set up in conductors carrying alternating currents are di-
rected so as to weaken the current inside a conductor and increase it near the
surface. As a result, the fast-varying current is distributed unevenly over the cross
section of the conductor—it is forced out, as it were, to the surface of the conductor.
This phenomenon is called the skin effect. Owing to this effect, the internal part
of conductors in high-frequency circuits is useless. This is why the conductors used
for such circuits have the form of tubes.

8.5. Self-Induction

An electric current flowing in any loop produces the magnetic flux 𝛹 through this
loop. When 𝐼 changes, 𝛹 also changes, and the result is the induction of an e.m.f.
in the loop. This phenomenon is called self-induction. In accordance with the
Biot-Savart law, the magnetic induction 𝐵 is proportional to the current setting up
the field. Hence, it follows that the current 𝐼 in a loop and the total magnetic flux 𝛹
through the loop it produces are proportional to each other:
𝛹 = 𝐿𝐼. (8.12)
The constant of proportionality 𝐿 between the current and the total magnetic flux
is called the inductance of a loop.
A linear dependence of 𝛹 on 𝐼 is observed only if the permeability 𝜇 of the
medium surrounding the loop does not depend on the field strength 𝐻, i.e., in the
absence of ferromagnetics. Otherwise, 𝜇 is an intricate function of 𝐼 (through 𝐻,
see Fig. 7.19b), and, since 𝐵 = 𝜇0 𝜇𝐻, the dependence of 𝛹 on 𝐼 will also be quite
intricate. Equation (8.12), however, is also extended to this case, and the inductance
𝐿 is considered as a function of 𝐼. With a constant current 𝐼, the total flux 𝛹 can
change as a result of changes in the shape and dimensions of a loop.
It can be seen from the above that the inductance 𝐿 depends on the geometry
of a loop (i.e., on its shape and dimensions), and also on the magnetic properties
(on 𝜇) of the medium surrounding the loop. If the loop is rigid and there are no
ferromagnetics near it, the inductance 𝐿 is a constant quantity. The SI unit of
inductance is the inductance of a conductor in which a total flux 𝛹 of 1 Wb linked
with it is set up at a current of 1 A in the conductor. This unit is called the henry
(H).
In the Gaussian system of units, the inductance has the dimension of length.
Accordingly, the unit of inductance in this system is called the centimetre. A loop
with which a flux of 1 Mx (10⁻⁸ Wb) is linked at a current of 1 cgsm unit of current
(i.e., 10 A) has an inductance of 1 cm.
Let us calculate the inductance of a solenoid. We shall take a solenoid so long that
it can virtually be considered infinite. When a current 𝐼 flows in it, a homogeneous
field is produced inside the solenoid whose induction is 𝐵 = 𝜇₀𝜇𝑛𝐼 [see Eqs. (6.108)
and (7.26)]. The flux through each of the turns is 𝛷 = 𝐵𝑆, and the total magnetic flux
linked with the solenoid is
$$\Psi = N\Phi = nlBS = \mu_0 \mu n^2 l S I, \tag{8.13}$$
where 𝑙 is the length of the solenoid (which is assumed to be very great), 𝑆 is the
cross-sectional area, and 𝑛 the number of turns per unit length (the product 𝑛𝑙 gives
the total number of turns 𝑁).
A comparison of Eqs. (8.12) and (8.13) gives the following expression for the
inductance of a very long solenoid:
$$L = \mu_0 \mu n^2 l S = \mu_0 \mu n^2 V, \tag{8.14}$$
where 𝑉 = 𝑙𝑆 is the volume of the solenoid.
It follows from Eq. (8.14) that the dimension of 𝜇0 equals that of inductance
divided by the dimension of length. Accordingly, 𝜇0 is measured in henry per metre
[see Eq. (6.3)].
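As a rough numerical illustration of Eq. (8.14), the following Python sketch evaluates the inductance of a long solenoid. The function name `solenoid_inductance`, the sample dimensions, and the classical value 4π × 10⁻⁷ H/m used for 𝜇₀ are assumptions made for the example and are not taken from the text.

```python
import math

# Magnetic constant in H/m (classical SI value, assumed here for illustration).
MU_0 = 4 * math.pi * 1e-7


def solenoid_inductance(turns_per_metre, length, area, mu=1.0):
    """Inductance of a very long solenoid, Eq. (8.14): L = mu0 * mu * n^2 * l * S."""
    return MU_0 * mu * turns_per_metre**2 * length * area


# Hypothetical air-core solenoid: 1000 turns per metre, 0.5 m long, 10 cm^2 cross section.
L_air = solenoid_inductance(1000.0, 0.5, 1e-3)
print(f"L = {L_air:.3e} H")
```

Doubling the winding density 𝑛 in this sketch quadruples the result, in line with the 𝑛² dependence of Eq. (8.14).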
When the current in a loop changes, a self-induced e.m.f. Es is set up that equals
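$$\mathcal{E}_s = -\frac{\mathrm{d}\Psi}{\mathrm{d}t} = -\frac{\mathrm{d}(LI)}{\mathrm{d}t} = -\left( L\frac{\mathrm{d}I}{\mathrm{d}t} + I\frac{\mathrm{d}L}{\mathrm{d}t} \right). \tag{8.15}$$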
If the inductance remains constant when the current changes (which is possible
only in the absence of ferromagnetics), the expression for the self-induced e.m.f.
becomes
$$\mathcal{E}_s = -L\frac{\mathrm{d}I}{\mathrm{d}t}. \tag{8.16}$$
The minus sign in Eq. (8.16) is due to Lenz’s rule according to which an induced
current is directed so as to oppose the cause producing it. In the case being con-
sidered, what sets up Es is the change of the current in the circuit. Let us assume
clockwise circumvention to be the positive direction. In these conditions, the cur-
rent will be greater than zero if it flows clockwise in the circuit and less than zero
if it flows counterclockwise. Similarly, Es will be greater than zero if it is exerted in
a clockwise direction, and less than zero if it is exerted in a counterclockwise one.
The derivative d𝐼/d𝑡 is positive in two cases—either upon a growth in a positive
current or upon a decrease in the absolute value of a negative current. Inspection
of Eq. (8.16) shows that in these cases Es < 0. This signifies that the self-induced
e.m.f. is directed counterclockwise and, therefore, is opposed to the above current
changes (a growth in a positive or a decrease in a negative current).
The derivative d𝐼/d𝑡 is negative also in two cases—either when a positive
current diminishes, or when the magnitude of a negative current grows. In these
cases, Es > 0 and, consequently, opposes changes in the current (a decrease in a
positive or a growth in the magnitude of a negative current).
Equation (8.16) makes it possible to define the inductance as a constant of
proportionality between the rate of change of the current in a loop and the resulting
self-induced e.m.f. Such a definition is valid, however, only when 𝐿 = constant.
In the presence of ferromagnetics, 𝐿 of an undeforming loop will be a function
of 𝐼 (through 𝐻). Hence, d𝐿/d𝑡 can be written as (d𝐿/d𝐼)(d𝐼/d𝑡). Making such a
substitution in Eq. (8.15), we get
$$\mathcal{E}_s = -\left( L + I\frac{\mathrm{d}L}{\mathrm{d}I} \right) \frac{\mathrm{d}I}{\mathrm{d}t}. \tag{8.17}$$
We can, thus, see that in the presence of ferromagnetics the constant of proportionality
between d𝐼/d𝑡 and Es does not at all equal 𝐿.

Fig. 8.7

8.6. Current When a Circuit Is Opened or Closed

According to Lenz’s rule, the additional currents set up owing to self-induction are
always directed so as to prevent any changes in the current in a circuit. The result
is that a current grows to its steady value when a circuit is closed or drops to zero
when the circuit is opened not instantaneously, but gradually.
Let us first find how a current changes when the switch of a circuit is opened.
Assume that a current source of e.m.f. E is connected in a circuit with an inductance
𝐿 not depending on 𝐼 and a resistance 𝑅 (Fig. 8.7). The steady current flowing in
the circuit will be
$$I_0 = \frac{\mathcal{E}}{R} \tag{8.18}$$
(we consider the resistance of the current source to be negligibly small).
At the moment 𝑡 = 0, let us switch off the current source and simultaneously
short the circuit by means of switch SW. As soon as the current in the circuit begins
to diminish, a self-induced e.m.f. opposing this decrease appears. The current in
the circuit will comply with the equation
$$IR = \mathcal{E}_s = -L\frac{\mathrm{d}I}{\mathrm{d}t},$$
or
$$\frac{\mathrm{d}I}{\mathrm{d}t} + \frac{R}{L} I = 0. \tag{8.19}$$
Equation (8.19) is a linear homogeneous differential equation of the first order.
Separating variables, we get
$$\frac{\mathrm{d}I}{I} = -\frac{R}{L}\,\mathrm{d}t,$$
whence
$$\ln I = -\frac{R}{L} t + \ln(\text{constant})$$
(with a view to further transformations, we have written the integration constant
in the form “ln(constant)”). Exponentiating this relation yields
$$I = \text{constant} \times \exp\left( -\frac{R}{L} t \right). \tag{8.20}$$
Equation (8.20) is a general solution of Eq. (8.19). We shall find the value of the
constant from the initial conditions. When 𝑡 = 0, the current had the value given
by Eq. (8.18). Hence, constant = 𝐼0 . Introducing this value into Eq. (8.20), we arrive
at the expression
$$I = I_0 \exp\left( -\frac{R}{L} t \right). \tag{8.21}$$
Thus, after the e.m.f. source had been switched off, the current in the circuit did
not vanish instantaneously, but diminished according to the exponential law (8.21).
A plot of the diminishing of 𝐼 is given in Fig. 8.8 (curve 1). The rate of diminishing
is determined by the quantity
$$\tau = \frac{L}{R}, \tag{8.22}$$
having the dimension of time and called the time constant of the circuit. Substi-
tuting 1/𝜏 for 𝑅/𝐿 in Eq. (8.21), we get
$$I = I_0 \exp\left( -\frac{t}{\tau} \right). \tag{8.23}$$
According to this equation, 𝜏 is the time during which the current diminishes to
1/𝑒-th of its initial value. A glance at Eq. (8.22) shows that the time constant 𝜏
grows and the current in the circuit diminishes at a slower rate with an increasing
inductance 𝐿 and a decreasing resistance 𝑅 of the circuit.
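The decay law (8.23) and the meaning of the time constant (8.22) can be checked with a few lines of Python. This is only a sketch: the function names and the sample values of 𝐿, 𝑅 and 𝐼₀ are hypothetical and chosen for convenience.

```python
import math


def time_constant(inductance, resistance):
    """Time constant of the circuit, Eq. (8.22): tau = L / R."""
    return inductance / resistance


def decaying_current(t, i0, inductance, resistance):
    """Current after the source is switched off, Eq. (8.23): I = I0 * exp(-t / tau)."""
    return i0 * math.exp(-t / time_constant(inductance, resistance))


# Hypothetical circuit: L = 0.5 H, R = 10 ohm, initial current 2 A.
L_H, R_OHM, I0_A = 0.5, 10.0, 2.0
tau = time_constant(L_H, R_OHM)
ratio_at_tau = decaying_current(tau, I0_A, L_H, R_OHM) / I0_A
print(f"tau = {tau} s, I(tau)/I0 = {ratio_at_tau:.4f}")
```

The printed ratio is 1/𝑒, which is the property of 𝜏 stated above; increasing 𝐿 or decreasing 𝑅 in the sketch lengthens the decay.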
To simplify our calculations, we considered that the circuit is shorted when the
current source is switched off. If we simply break a circuit with a high inductance,
the high induced voltage set up produces a spark or an arc at the place of breaking
of the circuit.
Fig. 8.8

Now let us consider the closing of a circuit. After the e.m.f. source is switched
on, a self-induced e.m.f. will act in the circuit apart from the e.m.f. E until the
current reaches its steady value given by Eq. (8.18). Hence, in accordance with Ohm’s
law
$$IR = \mathcal{E} + \mathcal{E}_s = \mathcal{E} - L\frac{\mathrm{d}I}{\mathrm{d}t},$$
or
$$\frac{\mathrm{d}I}{\mathrm{d}t} + \frac{R}{L} I = \frac{\mathcal{E}}{L}. \tag{8.24}$$
We have arrived at a linear inhomogeneous differential equation that differs
from Eq. (8.19) only in that the right-hand side contains the constant quantity E/𝐿
instead of zero. It is known from the theory of differential equations that the general
solution of a linear inhomogeneous equation can be obtained by adding any particular
solution of it to the general solution of the corresponding homogeneous equation
(see Sec. 7.4 of Vol. I). The general solution of our homogeneous equation has the
form of Eq. (8.20). It is easy to see that 𝐼 = E/𝑅 = 𝐼₀ is a particular solution of Eq. (8.24).
Hence, the function
$$I = I_0 + \text{constant} \times \exp\left( -\frac{R}{L} t \right)$$
will be the general solution of Eq. (8.24). At the initial moment, the current is zero.
Thus, constant = −𝐼₀, and
$$I = I_0 \left[ 1 - \exp\left( -\frac{R}{L} t \right) \right]. \tag{8.25}$$
This function describes the growth of the current in a circuit after a source of an
e.m.f. has been switched on in it. A plot of function (8.25) is shown in Fig. 8.8 (curve
2).
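One way to gain confidence in Eq. (8.25) is to integrate Eq. (8.24) numerically and compare the result with the closed-form expression. The sketch below uses the simple Euler method, which is an illustrative choice and not part of the text; the circuit values are hypothetical.

```python
import math


def growth_current(t, emf, inductance, resistance):
    """Closed-form growth of the current, Eq. (8.25)."""
    i0 = emf / resistance
    return i0 * (1.0 - math.exp(-resistance * t / inductance))


def integrate_rl(emf, inductance, resistance, t_end, steps):
    """Euler integration of Eq. (8.24), dI/dt = (emf - I*R) / L, starting from I(0) = 0."""
    dt = t_end / steps
    current = 0.0
    for _ in range(steps):
        current += dt * (emf - current * resistance) / inductance
    return current


# Hypothetical circuit: emf 12 V, L = 0.5 H, R = 10 ohm, observed after two time constants.
EMF_V, L_H, R_OHM = 12.0, 0.5, 10.0
t_end = 0.1
numeric = integrate_rl(EMF_V, L_H, R_OHM, t_end, 100_000)
exact = growth_current(t_end, EMF_V, L_H, R_OHM)
print(f"numeric = {numeric:.6f} A, exact = {exact:.6f} A")
```

With a small enough step the two values agree closely, and both approach the steady value E/𝑅 as 𝑡 grows.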
Fig. 8.9

8.7. Mutual Induction

Let us take two loops 1 and 2 close to each other (Fig. 8.9). If the current 𝐼1 flows in
loop 1, it sets up through loop 2 a total magnetic flux proportional to 𝐼1 , i.e.,
$$\Psi_2 = L_{21} I_1 \tag{8.26}$$
(the field producing this flux is depicted in the figure by solid lines). When the
current 𝐼₁ changes, the e.m.f.
$$\mathcal{E}_{i,2} = -L_{21}\frac{\mathrm{d}I_1}{\mathrm{d}t} \tag{8.27}$$
is induced in loop 2 (we assume that there are no ferromagnetics near the loops).
Similarly, when the current 𝐼2 flows in loop 2, the following flux linked with
loop 1 appears:
$$\Psi_1 = L_{12} I_2 \tag{8.28}$$
(the field producing this flux is depicted in the figure by dashed lines). When the
current 𝐼₂ changes, the e.m.f.
$$\mathcal{E}_{i,1} = -L_{12}\frac{\mathrm{d}I_2}{\mathrm{d}t} \tag{8.29}$$
is induced in loop 1.
Loops 1 and 2 are called coupled, while the phenomenon of the setting up of an
e.m.f. in one of the loops upon changes in the current in the other is called mutual
induction.
The coefficients of proportionality 𝐿12 and 𝐿21 are called the mutual induc-
tances of the loops. The relevant calculations show that in the absence of ferro-
magnetics these coefficients are always equal to each other:
$$L_{12} = L_{21}. \tag{8.30}$$
Their magnitude depends on the shape, dimensions, and mutual arrangement of
the loops, and also on the permeability of the medium surrounding the loops. The
quantity 𝐿12 is measured in the same units as the inductance 𝐿.
Fig. 8.10

Let us find the mutual inductance of two coils wound onto a common toroidal
iron core (Fig. 8.10). The magnetic induction lines are concentrated inside the core
[see the text following Eq. (7.31)]. We can, therefore, consider that the magnetic field
set up by any of the windings will have the same strength throughout the core. If
the first winding has 𝑁1 turns and the current 𝐼1 flows through it, then according
to the theorem on circulation [see Eq. (7.11)], we have
$$Hl = N_1 I_1 \tag{8.31}$$
(here, 𝑙 is the length of the core).
The magnetic flux through the cross section of the core is 𝛷 = 𝐵𝑆 = 𝜇₀𝜇𝐻𝑆,
where 𝑆 is the cross-sectional area of the core. Introducing the value of 𝐻 from
Eq. (8.31) and multiplying the expression obtained by 𝑁₂, we get the total flux linked
with the second winding:
$$\Psi_2 = \frac{S}{l} \mu_0 \mu N_1 N_2 I_1.$$
A comparison of this equation with Eq. (8.26) shows that
$$L_{21} = \frac{S}{l} \mu_0 \mu N_1 N_2. \tag{8.32}$$
Calculation of the flux 𝛹₁ linked with the first winding when the current 𝐼₂
flows through the second winding yields the equation
$$L_{12} = \frac{S}{l} \mu_0 \mu N_1 N_2, \tag{8.33}$$
which coincides in form with 𝐿₂₁ [see Eq. (8.32)]. In the given case, however, we
cannot assert that 𝐿12 = 𝐿21 . The factor 𝜇 in the expressions for these coefficients
depends on the field strength 𝐻 in the core. If 𝑁1 ≠ 𝑁2 , then the same current
passed once through the first winding and another time through the second one
will set up a field of different strength 𝐻 in the core. Accordingly, the values of 𝜇 in
both cases will be different so that when 𝐼₁ = 𝐼₂ the numerical values of 𝐿₁₂ and
𝐿₂₁ do not coincide.

Fig. 8.11
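The argument for the toroidal core can be made concrete with a small Python sketch. It evaluates Eqs. (8.31)-(8.33) once with a field-dependent permeability and once with a constant one. The function `toy_mu` is a purely illustrative dependence of 𝜇 on 𝐻 invented for this example (it does not describe any real material), and the core dimensions and numbers of turns are likewise hypothetical.

```python
import math

# Magnetic constant in H/m (classical SI value, assumed here for illustration).
MU_0 = 4 * math.pi * 1e-7


def toy_mu(h):
    """Made-up permeability that falls off with field strength; not a real material."""
    return 1.0 + 4999.0 / (1.0 + (h / 200.0) ** 2)


def mutual_inductance(n_source, n_other, current, core_length, core_area, mu_of_h=toy_mu):
    """Eq. (8.31) gives H = N_source * I / l; Eqs. (8.32)-(8.33) then give the inductance."""
    h = n_source * current / core_length
    return MU_0 * mu_of_h(h) * n_source * n_other * core_area / core_length


# Hypothetical core and windings; the same current is passed through each winding in turn.
N1, N2, CURRENT, LENGTH, AREA = 100, 1000, 0.5, 0.3, 1e-4
L21 = mutual_inductance(N1, N2, CURRENT, LENGTH, AREA)  # current flows in winding 1
L12 = mutual_inductance(N2, N1, CURRENT, LENGTH, AREA)  # current flows in winding 2


def constant_mu(h):
    """Field-independent permeability, i.e. no ferromagnetic behaviour."""
    return 1000.0


L21_linear = mutual_inductance(N1, N2, CURRENT, LENGTH, AREA, mu_of_h=constant_mu)
L12_linear = mutual_inductance(N2, N1, CURRENT, LENGTH, AREA, mu_of_h=constant_mu)
print(L21, L12, L21_linear, L12_linear)
```

With the field-dependent `toy_mu` the two coefficients differ because 𝑁₁ ≠ 𝑁₂ leads to different 𝐻 in the core, whereas with a constant 𝜇 they coincide, as Eq. (8.30) requires.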

8.8. Energy of a Magnetic Field

Let us consider the circuit shown in Fig. 8.11. When the switch is closed, the current
𝐼 will be set up in the solenoid. It will produce a magnetic field linked with the
solenoid turns. If the switch is opened, a gradually diminishing current will flow
for a certain time through resistor 𝑅. This current is maintained by the self-induced
e.m.f. produced in the solenoid. The work done by the current during the time d𝑡
is
$$\mathrm{d}A = \mathcal{E}_s I\,\mathrm{d}t = -\frac{\mathrm{d}\Psi}{\mathrm{d}t} I\,\mathrm{d}t = -I\,\mathrm{d}\Psi. \tag{8.34}$$
If the inductance of the solenoid does not depend on 𝐼 (𝐿 = constant), then
d𝛹 = 𝐿 d𝐼, and Eq. (8.34) becomes
d𝐴 = −𝐿𝐼 d𝐼. (8.35)
Integrating this expression with respect to 𝐼 within the limits from the initial value
of 𝐼 to zero, we get the work done in the circuit during the entire time needed for
vanishing of the magnetic field:
$$A = -\int_{I}^{0} LI\,\mathrm{d}I = \frac{LI^2}{2}. \tag{8.36}$$
The work (8.36) is spent on an increment of the internal energy of the resistor 𝑅,
the solenoid, and the connecting wires (i.e., on heating them). This work is attended
by vanishing of the magnetic field that initially existed in the space surrounding
the solenoid. Since no other changes occur in the bodies surrounding the circuit,
it remains for us to conclude that the magnetic field is a carrier of energy, and it
is exactly at the expense of the latter that the work given by Eq. (8.36) is done. We,
thus, arrive at the conclusion that a conductor of inductance 𝐿 carrying the current
𝐼 has the energy
$$W = \frac{LI^2}{2}, \tag{8.37}$$
that is localized in the magnetic field set up by the current [compare this equation
with the expression 𝐶𝑈²/2 for the energy of a charged capacitor; see Eq. (4.5)].
Equation (8.36) can be interpreted as the work that must be done against the
self-induced e.m.f. when the current grows from 0 to 𝐼, and that is used to set up a
magnetic field having the energy given by Eq. (8.37). Indeed, the work done against
the self-induced e.m.f. is
$$A' = \int_{0}^{I} (-\mathcal{E}_s) I\,\mathrm{d}t.$$
Performing transformations similar to those which led us to Eq. (8.35), we get
$$A' = \int_{0}^{I} LI\,\mathrm{d}I = \frac{LI^2}{2}, \tag{8.38}$$
that coincides with Eq. (8.36). The work according to Eq. (8.38) is done when the
current sets in at the expense of the e.m.f. source. It is used completely for producing
a magnetic field linked with the solenoid turns. Equation (8.38) takes no account of
the work spent by the e.m.f. source for heating the conductors during the time the
current reaches its steady value.
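The statement that the work (8.36) reappears as heat can be checked numerically: integrating the Joule power 𝐼²𝑅 over the decay law (8.21) should give 𝐿𝐼²/2. The sketch below does this with the midpoint rule; the integration scheme, the cut-off at 30 time constants, and the circuit values are assumptions made for illustration.

```python
import math


def dissipated_heat(i0, inductance, resistance, steps=200_000, n_tau=30.0):
    """Midpoint-rule integral of I(t)^2 * R dt, with I(t) taken from the decay law (8.21)."""
    tau = inductance / resistance
    dt = n_tau * tau / steps
    heat = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        current = i0 * math.exp(-t_mid / tau)
        heat += current**2 * resistance * dt
    return heat


# Hypothetical circuit: L = 0.5 H, R = 10 ohm, initial current 2 A.
L_H, R_OHM, I0_A = 0.5, 10.0, 2.0
heat = dissipated_heat(I0_A, L_H, R_OHM)
stored = L_H * I0_A**2 / 2  # Eq. (8.37)
print(f"heat = {heat:.6f} J, L*I^2/2 = {stored:.6f} J")
```

The result does not depend on the value of 𝑅 chosen: a larger resistance dissipates the same energy in a shorter time.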
Let us express the energy of a magnetic field given by Eq. (8.37) through quantities
characterizing the field itself. For a long (virtually infinite) solenoid
$$L = \mu_0 \mu n^2 V, \qquad H = nI, \quad \text{or} \quad I = \frac{H}{n}$$
[see Eqs. (7.29) and (8.14)]. Using these values of 𝐿 and 𝐼 in Eq. (8.37) and performing
the relevant transformations, we obtain
$$W = \frac{\mu_0 \mu H^2}{2} V. \tag{8.39}$$
It was shown in Sec. 6.12 that the magnetic field of an infinitely long solenoid
is homogeneous and differs from zero only inside the solenoid. Hence, the energy
according to Eq. (8.39) is localized inside the solenoid and is distributed over its
volume with a constant density 𝑤 that can be found by dividing 𝑊 by 𝑉 . This
division yields
$$w = \frac{\mu_0 \mu H^2}{2}. \tag{8.40}$$
Using Eq. (7.17), we can write the equation for the energy density of a magnetic field
as follows:
$$w = \frac{\mu_0 \mu H^2}{2} = \frac{HB}{2} = \frac{B^2}{2\mu_0 \mu}. \tag{8.41}$$
The expressions we have obtained for the energy density of a magnetic field
differ from Eqs. (4.11) for the energy density of an electric field only in that the
electrical quantities in them have been replaced with the relevant magnetic ones.
Knowing the density of the field energy at every point, we can find the energy of
the field enclosed in any volume 𝑉 . For this purpose, we must calculate the integral
$$W = \int_V w\,\mathrm{d}V = \int_V \frac{\mu_0 \mu H^2}{2}\,\mathrm{d}V. \tag{8.42}$$
It can be shown that for coupled loops (in the absence of ferromagnetics) the
field energy is determined by the equation
$$W = \frac{L_1 I_1^2}{2} + \frac{L_2 I_2^2}{2} + \frac{L_{12} I_1 I_2}{2} + \frac{L_{21} I_2 I_1}{2}. \tag{8.43}$$
A similar expression is obtained for the energy of 𝑁 loops coupled to one another:

$$W = \frac{1}{2} \sum_{i,k=1}^{N} L_{i,k} I_i I_k, \tag{8.44}$$
where 𝐿𝑖,𝑘 = 𝐿𝑘,𝑖 is the mutual inductance of the 𝑖-th and 𝑘-th loops, and 𝐿𝑖,𝑖 = 𝐿𝑖
is the inductance of the 𝑖-th loop.
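To see that Eq. (8.44) reduces to Eq. (8.43) for two loops, one can evaluate the double sum directly. The following Python sketch does so for a hypothetical pair of loops; the function name `field_energy` and all numerical values are assumptions for illustration.

```python
def field_energy(inductances, currents):
    """Eq. (8.44): W = (1/2) * sum over i, k of L[i][k] * I[i] * I[k]."""
    n = len(currents)
    return 0.5 * sum(
        inductances[i][k] * currents[i] * currents[k]
        for i in range(n)
        for k in range(n)
    )


# Hypothetical inductances (H) and currents (A); the diagonal holds L1 and L2,
# the off-diagonal entries hold the mutual inductance L12 = L21.
L1, L2, L12 = 0.2, 0.05, 0.03
I1, I2 = 1.5, -2.0
matrix = [[L1, L12], [L12, L2]]

w_general = field_energy(matrix, [I1, I2])
w_two_loops = L1 * I1**2 / 2 + L2 * I2**2 / 2 + L12 * I1 * I2 / 2 + L12 * I2 * I1 / 2
print(w_general, w_two_loops)
```

Because the currents here have opposite signs, the mutual terms reduce the total energy; reversing one current would increase it by the same amount.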

8.9. Work in Magnetic Reversal of a Ferromagnetic

Changes in a current in a circuit are attended by the performance of work against
the self-induced e.m.f.:
$$\mathrm{d}'A = (-\mathcal{E}_s) I\,\mathrm{d}t = \frac{\mathrm{d}\Psi}{\mathrm{d}t} I\,\mathrm{d}t = I\,\mathrm{d}\Psi. \tag{8.45}$$
If the inductance of the circuit 𝐿 remains constant (which is possible only in the
absence of ferromagnetics), this work is used completely for producing the energy
of a magnetic field: d′𝐴 = d𝑊. We shall now see that matters are different when
ferromagnetics are present.
For a very long (“infinite”) solenoid, 𝐻 = 𝑛𝐼, 𝛹 = 𝑛𝑙𝐵𝑆. Hence,
$$I = \frac{H}{n}, \qquad \mathrm{d}\Psi = nlS\,\mathrm{d}B.$$
Introducing these expressions into Eq. (8.45), we get
$$\mathrm{d}'A = H\,\mathrm{d}B \cdot V, \tag{8.46}$$
where 𝑉 = 𝑙𝑆 is the volume of the solenoid, i.e., the volume in which a homogeneous
magnetic field has been produced.
Fig. 8.12

Let us see whether we can identify Eq. (8.46) with the increment of the energy of
a magnetic field. We remind our reader that energy is a function of state. Therefore,
the sum of its increments for a cyclic process is zero:
$$\oint \mathrm{d}W = 0.$$
If we fill a solenoid with a ferromagnetic, then the relation between 𝐵 and 𝐻 is
depicted by the curve shown in Fig. 8.12. The expression 𝐻 d𝐵 gives the area of the
shaded strip. Consequently, the integral ∮ 𝐻 d𝐵 calculated along the hysteresis loop
equals the area 𝑆_𝑙 enclosed by the loop. Thus, the integral of expression (8.46), i.e.,
∮ d′𝐴, differs from zero. It, therefore, follows that in the presence of ferromagnetics,
the work given by Eq. (8.46) cannot be equated to the increment of the energy of a
magnetic field. Upon completion of the cycle of magnetic reversal, 𝐻 and 𝐵 and,
therefore, the magnetic energy will have their initial values. Hence, the work ∮ d′𝐴
is not used to produce the energy of a magnetic field. Experiments show that it is
used to increase the internal energy of the ferromagnetic, i.e., to heat it.
Thus, the completion of one cycle of magnetic reversal of a ferromagnetic is
attended by the expenditure of work per unit volume numerically equal to the area
of the hysteresis loop:

$$A_{\text{u.vol}} = \oint H\,\mathrm{d}B = S_l. \tag{8.47}$$
This work goes to heat the ferromagnetic.
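When a hysteresis loop is available as a list of measured (𝐻, 𝐵) points, the area in Eq. (8.47) can be estimated by treating the loop as a closed polygon. The sketch below uses the shoelace formula for this purpose. The parallelogram-shaped loop is an invented, idealized shape and not data for any real ferromagnetic, and the function name `loop_area` is hypothetical.

```python
def loop_area(points):
    """Shoelace formula: area enclosed by a closed polygon given as (H, B) vertices."""
    twice_area = 0.0
    for (h0, b0), (h1, b1) in zip(points, points[1:] + points[:1]):
        twice_area += h0 * b1 - h1 * b0
    return abs(twice_area) / 2.0


# Idealized loop (made-up numbers): H in A/m, B in T, vertices listed in order
# around the loop. The loop is 100 A/m wide and spans B from -1 T to +1 T.
loop = [(-150.0, -1.0), (-50.0, -1.0), (150.0, 1.0), (50.0, 1.0)]

work_per_cycle = loop_area(loop)  # J/m^3 per cycle of magnetic reversal, Eq. (8.47)
print(f"work per unit volume per cycle = {work_per_cycle} J/m^3")
```

For a core of volume 𝑉 the heat released per cycle would be this value multiplied by 𝑉, in accordance with Eq. (8.46).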
In the absence of ferromagnetics, 𝐵 is a single-valued function of 𝐻 (𝐵 =
𝜇₀𝜇𝐻, where 𝜇 = constant). Therefore, the expression 𝐻 d𝐵 = 𝜇₀𝜇𝐻 d𝐻 is a total
differential
$$\mathrm{d}w = H\,\mathrm{d}B, \tag{8.48}$$
determining the increment of the energy of a magnetic field. Integration of Eq. (8.48)
within the limits from 0 to 𝐻 leads to Eq. (8.40) for the density of the field energy
(before performing integration, 𝐻 d𝐵 must be transformed by substituting 𝜇₀𝜇 d𝐻
for d𝐵).

Chapter 9
MAXWELL’S EQUATIONS

9.1. Vortex Electric Field

Let us consider electromagnetic induction when a wire loop in which a current is
induced is stationary, and the changes in the magnetic flux are due to changes in the
magnetic field. The setting up of an induced current signifies that the changes in the
magnetic field produce extraneous forces in the loop that are exerted on the current
carriers. These extraneous forces are associated neither with chemical nor with
thermal processes in the wire. They also cannot be magnetic forces because such
forces do not work on charges. It remains to conclude that the induced current is
due to the electric field set up in the wire. Let us use the symbol 𝑬 𝐵 to denote the
strength of this field (this symbol, like the one 𝑬 𝑞 used below, is an auxiliary one;
we shall omit the subscripts 𝐵 and 𝑞 later on). The e.m.f. equals the circulation of
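the vector 𝑬 𝐵 around the given loop:
$$\mathcal{E}_i = \oint \mathbf{E}_B \cdot \mathrm{d}\mathbf{l}. \tag{9.1}$$
Substituting Ei = −d𝛷/d𝑡 into Eq. (9.1) and the expression ∫ 𝑩 · d𝑺 for 𝛷,
we arrive at the equation
$$\oint \mathbf{E}_B \cdot \mathrm{d}\mathbf{l} = -\frac{\mathrm{d}}{\mathrm{d}t} \int_S \mathbf{B} \cdot \mathrm{d}\mathbf{S}$$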
(the integral in the right-hand side of the equation is taken over an arbitrary surface
resting on the loop). Since the loop and the surface are stationary, the operations of
time differentiation and integration over the surface can have their places exchanged:
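$$\oint \mathbf{E}_B \cdot \mathrm{d}\mathbf{l} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot \mathrm{d}\mathbf{S}. \tag{9.2}$$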
In connection with the fact that the vector 𝑩 depends, generally speaking, both
on the time and on the coordinates, we have put the symbol of the partial time
derivative inside the integral (the integral ∫ 𝑩 · d𝑺 is a function only of time).
Let us transform the left-hand side of Eq. (9.2) in accordance with Stokes’s
theorem. The result is
$$\int_S (\nabla \times \mathbf{E}_B) \cdot \mathrm{d}\mathbf{S} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot \mathrm{d}\mathbf{S}.$$
Owing to the arbitrary nature of choosing the integration surface, the following
equation must be obeyed:
$$\nabla \times \mathbf{E}_B = -\frac{\partial \mathbf{B}}{\partial t}. \tag{9.3}$$
The curl of the field 𝑬 𝐵 at each point of space equals the time derivative of the
vector 𝑩 taken with the opposite sign.
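A simple special case helps to visualize Eqs. (9.2) and (9.3). Assume, for illustration only, a homogeneous field 𝑩 directed along the 𝑧-axis and changing at a constant rate, with the vortex field having axial symmetry about the origin. Under these assumptions the field 𝑬 𝐵 = (Ḃ𝑦/2, −Ḃ𝑥/2, 0) has the curl −Ḃ along 𝑧, and its circulation around a circle of radius 𝑟 should equal −Ḃπ𝑟². The Python sketch below checks this by summing 𝑬 𝐵 · d𝒍 over short chords; the function names and numbers are hypothetical.

```python
import math


def vortex_field(x, y, db_dt):
    """Assumed axially symmetric E_B for a uniform B along z changing at the rate db_dt."""
    return (0.5 * db_dt * y, -0.5 * db_dt * x)


def circulation(radius, db_dt, segments=10_000):
    """Sum of E_B . dl over short chords of a circle traversed counterclockwise."""
    total = 0.0
    for k in range(segments):
        a0 = 2 * math.pi * k / segments
        a1 = 2 * math.pi * (k + 1) / segments
        x0, y0 = radius * math.cos(a0), radius * math.sin(a0)
        x1, y1 = radius * math.cos(a1), radius * math.sin(a1)
        # Evaluate the field at the midpoint of each chord.
        ex, ey = vortex_field(0.5 * (x0 + x1), 0.5 * (y0 + y1), db_dt)
        total += ex * (x1 - x0) + ey * (y1 - y0)
    return total


# Hypothetical values: loop radius 0.1 m, field growing at 2 T/s.
RADIUS_M, DB_DT = 0.1, 2.0
emf = circulation(RADIUS_M, DB_DT)
minus_flux_rate = -DB_DT * math.pi * RADIUS_M**2  # right-hand side of Eq. (9.2)
print(emf, minus_flux_rate)
```

The two printed numbers agree to within the discretization error of the polygonal path, and the result holds whether or not a wire actually occupies the circle.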
The British physicist James Maxwell (1831-1879) assumed that a time-varying
magnetic field causes the field 𝑬 𝐵 to appear in space regardless of whether or not
there is a wire loop in this space. The presence of a loop only makes it possible to
detect the existence of an electric field at the corresponding points of space as a
result of a current being induced in the loop.
Thus, according to Maxwell’s idea, a time-varying magnetic field gives birth to an
electric field. This field 𝑬 𝐵 differs appreciably from the electrostatic field 𝑬 𝑞 set up
by fixed charges. An electrostatic field is a potential one; its strength lines begin and
terminate at charges. The curl of the vector 𝑬 𝑞 is zero at any point:
$$\nabla \times \mathbf{E}_q = 0 \tag{9.4}$$
[see Eq. (1.112)]. According to Eq. (9.3), the curl of the vector 𝑬 𝐵 differs from zero.
Hence, the field 𝑬 𝐵 , like a magnetic field, is a vortex one. The strength lines of the
field 𝑬 𝐵 are closed.
Thus, an electric field may be either a potential (𝑬 𝑞 ) or a vortex (𝑬 𝐵 ) one. In the
general case, an electric field can consist of the field 𝑬 𝑞 produced by charges and the
field 𝑬 𝐵 set up by a time-varying magnetic field. Adding Eqs. (9.3) and (9.4), we get
the following equation for the curl of the strength of the total field 𝑬 = 𝑬 𝑞 + 𝑬 𝐵 :
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}. \tag{9.5}$$
This equation is one of the fundamental ones in Maxwell’s electromagnetic theory.
The existence of a relationship between electric and magnetic fields [expressed
in particular by Eq. (9.5)] is a reason why the separate treatment of these fields has
only a relative meaning. Indeed, an electric field is set up by a system of fixed charges.
If the charges are fixed relative to a certain inertial reference frame, however, they
are moving relative to other inertial frames and, consequently, set up not only an
electric, but also a magnetic field. A stationary wire carrying a steady current sets up
a constant magnetic field at every point of space. This wire is in motion, however,
relative to other inertial frames. Consequently, the magnetic field it sets up at any
point with the given coordinates 𝑥, 𝑦, 𝑧 will change and, therefore, give birth to a
vortex electric field. Thus, a field which is “purely” electric or “purely” magnetic
relative to a certain reference frame will be a combination of an electric and a
magnetic field forming a single electromagnetic field relative to other reference
frames.

Fig. 9.1

9.2. Displacement Current

For a stationary (i.e., not varying with time) electromagnetic field, the curl of the
vector 𝑯 by Eq. (7.9) equals the density of the conduction current at each point:
∇ × 𝑯 = 𝒋.
The vector 𝒋 is associated with the charge density at the same point by continuity
equation (5.11):
$$\nabla \cdot \mathbf{j} = -\frac{\partial \rho}{\partial t}.$$
An electromagnetic field can be stationary only provided that the charge density
𝜌 and the current density 𝒋 do not depend on the time. In this case, according to
Eq. (5.11), the divergence of 𝒋 equals zero. Therefore, the current lines (lines of the
vector 𝒋) have no sources and are closed.
Let us see whether Eq. (7.9) holds for time-varying fields. We shall consider the
current flowing when a capacitor is charged from a source of constant voltage 𝑈.
This current varies with time (the current stops flowing when the voltage across the
capacitor becomes equal to 𝑈). The lines of the conduction current are interrupted
in the space between the capacitor plates (Fig. 9.1; the current lines inside the plates
are shown by dash lines).
Let us take a circular loop 𝛤 enclosing the wire in which the current flows
toward the capacitor and integrate Eq. (7.9) over surface 𝑆1 intersecting the wire
and enclosed by the loop:
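$$\int_{S_1} (\nabla \times \mathbf{H}) \cdot \mathrm{d}\mathbf{S} = \int_{S_1} \mathbf{j} \cdot \mathrm{d}\mathbf{S}.$$
Transforming the left-hand side according to Stokes’s theorem, we get the circulation
of the vector 𝑯 over loop 𝛤:
$$\oint_{\Gamma} \mathbf{H} \cdot \mathrm{d}\mathbf{l} = \int_{S_1} \mathbf{j} \cdot \mathrm{d}\mathbf{S} = I \tag{9.6}$$
(𝐼 is the current charging the capacitor). After performing similar calculations for
surface 𝑆₂ that does not intersect the current-carrying wire (see Fig. 9.1), we arrive
at the obviously incorrect relation
$$\oint_{\Gamma} \mathbf{H} \cdot \mathrm{d}\mathbf{l} = \int_{S_2} \mathbf{j} \cdot \mathrm{d}\mathbf{S} = 0. \tag{9.7}$$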
The result we have obtained indicates that for time-varying fields Eq. (7.9) stops
being correct. The conclusion suggests itself that this equation lacks an addend
depending on the time derivatives of the fields. For stationary fields, this addend
vanishes.
That Eq. (7.9) is not correct for non-stationary fields is also indicated by the
following reasoning. Let us take the divergence of both sides of Eq. (7.9):
∇ · (∇ × 𝑯) = ∇ · 𝒋.
The divergence of a curl must equal zero [see Eq. (1.106)]. We, thus, arrive at the
conclusion that the divergence of the vector 𝒋 must also always equal zero. But
this conclusion contradicts the continuity equation (5.11). Indeed, in non-stationary
processes, 𝜌 may change with time (this, in particular, is what happens with the
charge density on the plates of a capacitor being charged). In this case in accordance
with Eq. (5.11), the divergence of 𝒋 differs from zero.
To bring Eqs. (5.11) and (7.9) into agreement, Maxwell introduced an additional
addend into the right-hand side of Eq. (7.9). It is quite natural that this addend
should have the dimension of current density. Maxwell called it the density of the
displacement current. Thus, according to Maxwell, Eq. (7.9) should have the form
∇ × 𝑯 = 𝒋 + 𝒋d . (9.8)
The sum of the conduction current and the displacement current is usually
called the total current. The density of the total current is
𝒋tot = 𝒋 + 𝒋d . (9.9)
If we assume that the divergence of the displacement current equals that of the
conduction current taken with the opposite sign:
$$\nabla \cdot \mathbf{j}_d = -\nabla \cdot \mathbf{j}, \tag{9.10}$$
then the divergence of the right-hand side of Eq. (9.8), like that of the left-hand side,
will always be zero.
Substituting −∂𝜌/∂𝑡 for ∇ · 𝒋 in Eq. (9.10) in accordance with Eq. (5.11), we get the
following expression for the divergence of the displacement current:
$$\nabla \cdot \mathbf{j}_d = \frac{\partial \rho}{\partial t}. \tag{9.11}$$
To associate the displacement current with quantities characterizing the change in
an electric field with time, let us use Eq. (2.23) according to which the divergence of
the electric displacement vector equals the density of the extraneous charges:
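$$\nabla \cdot \mathbf{D} = \rho.$$
Time differentiation of this equation yields
$$\frac{\partial}{\partial t} (\nabla \cdot \mathbf{D}) = \frac{\partial \rho}{\partial t}.$$
Now, let us change the sequence of differentiation with respect to time and to the
coordinates in the left-hand side. As a result, we get the following expression for
the derivative of 𝜌 with respect to 𝑡:
$$\frac{\partial \rho}{\partial t} = \nabla \cdot \left( \frac{\partial \mathbf{D}}{\partial t} \right).$$
Introduction of this expression into Eq. (9.11) yields
$$\nabla \cdot \mathbf{j}_d = \nabla \cdot \left( \frac{\partial \mathbf{D}}{\partial t} \right).$$
Hence,
$$\mathbf{j}_d = \frac{\partial \mathbf{D}}{\partial t}. \tag{9.12}$$
Using Eq. (9.12) in Eq. (9.8), we arrive at the equation
$$\nabla \times \mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t}, \tag{9.13}$$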
which, like Eq. (9.5), is one of the fundamental equations in Maxwell’s theory.
We must underline the fact that the term “displacement current” is purely
conventional. In essence, the displacement current is a time-varying electric field.
The only reason for calling the quantity given by Eq. (9.12) a “current” is that the
dimension of this quantity coincides with that of current density. Of all the physical
properties belonging to a real current, a displacement current has only one—the
ability of producing a magnetic field.
The introduction of the displacement current determined by Eq. (9.12) has
“given equal rights” to an electric field and a magnetic field. It can be seen from the
phenomenon of electromagnetic induction that a varying magnetic field sets up an
electric field. It follows from Eq. (9.13) that a varying electric field sets up a magnetic
field.
There is a displacement current wherever there is a time-varying electric field.
In particular, it also exists inside conductors carrying an alternating electric current.
The displacement current inside conductors, however, is usually negligibly small in
comparison with the conduction current.
We must note that Eq. (9.6) is approximate. For it to become quite strict, we
must add a term to its right-hand side that takes account of the displacement current
due to the weak dispersed electric field in the vicinity of surface 𝑆1 .
Let us convince ourselves that the surface integral of the right-hand side of
Eq. (9.8) has the same value for surfaces 𝑆1 and 𝑆2 (see Fig. 9.1). Both the conduction
current and the displacement current due to the electric field outside the capacitor
“flow” through surface 𝑆1 . Hence, for the first surface, we have
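$$\mathrm{Int}_1 = \int_{S_1} \mathbf{j} \cdot \mathrm{d}\mathbf{S} + \frac{\mathrm{d}}{\mathrm{d}t} \int_{S_1} \mathbf{D} \cdot \mathrm{d}\mathbf{S} = I + \frac{\mathrm{d}}{\mathrm{d}t} \Phi_{1,\text{in}}$$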
(we have changed the sequence of the operations of differentiation with respect
to time and integration over the coordinates in the second addend). The quantity
designated by the letter 𝐼 is the current flowing in the conductor to the left-hand
plate of the capacitor, 𝛷1,in is the flux of the vector 𝑫 flowing into the volume 𝑉
bounded by surfaces 𝑆1 and 𝑆2 (see Fig. 9.1).
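For the second surface, 𝒋 = 0; consequently,
$$\mathrm{Int}_2 = \frac{\mathrm{d}}{\mathrm{d}t} \int_{S_2} \mathbf{D} \cdot \mathrm{d}\mathbf{S} = \frac{\mathrm{d}}{\mathrm{d}t} \Phi_{2,\text{out}},$$
where 𝛷2,out is the flux of the vector 𝑫 flowing out of volume 𝑉 through surface 𝑆₂.
The difference between the integrals is
$$\mathrm{Int}_2 - \mathrm{Int}_1 = \frac{\mathrm{d}}{\mathrm{d}t} \Phi_{2,\text{out}} - \frac{\mathrm{d}}{\mathrm{d}t} \Phi_{1,\text{in}} - I.$$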
The current 𝐼 can be represented as d𝑞/d𝑡, where 𝑞 is the charge on a capacitor
plate. The flux passing inward through surface 𝑆1 equals the flux passing outward
through the same surface taken with the opposite sign. Substituting −𝛷1,out for 𝛷1,in
and d𝑞/d𝑡 for 𝐼, we get
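$$\mathrm{Int}_2 - \mathrm{Int}_1 = \frac{\mathrm{d}}{\mathrm{d}t} \left( \Phi_{2,\text{out}} + \Phi_{1,\text{out}} \right) - \frac{\mathrm{d}q}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t} (\Phi_D - q), \tag{9.14}$$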
where 𝛷𝐷 is the flux of the vector 𝑫 through the closed surface formed by surfaces
𝑆1 and 𝑆2 . According to Eq. (2.25), this flux must equal the charge enclosed by the
surface. In the given case, it is the charge 𝑞 on a capacitor plate. Thus, the right-hand
side of Eq. (9.14) equals zero. It follows that the magnitude of the surface integral of
the total current density vector does not depend on the choice of the surface over
which the integral is being calculated.
We can construct current lines for the displacement current like those for the
conduction current. According to Eq. (2.35), the electric displacement in the space
between the capacitor plates equals the surface charge density on a plate: 𝐷 = 𝜎.
Hence,
$$\dot{D} = \dot{\sigma}.$$
The left-hand side gives the density of the displacement current in the space between
the plates, and the right-hand side gives the density of the conduction current inside
the plates. Thus, the lines of the conduction current uninterruptedly transform into
lines of the displacement current at the boundary of the plates. Consequently, the
lines of the total current are closed.

9.3. Maxwell’s Equations

The discovery of the displacement current permitted Maxwell to present a single
general theory of electrical and magnetic phenomena. This theory explained all the
experimental facts known at that time and predicted a number of new phenomena
whose existence was confirmed later on. The main corollary of Maxwell’s theory
was the conclusion on the existence of electromagnetic waves propagating with
the speed of light. Theoretical investigation of the properties of these waves led
Maxwell to the electromagnetic theory of light.
The theory is based on Maxwell’s equations. These equations play the same
part in the science of electromagnetism as Newton’s laws do in mechanics, or the
fundamental laws in thermodynamics.
The first pair of Maxwell’s equations is formed by Eqs. (9.5) and (7.3):
\[ \nabla\times\boldsymbol{E} = -\frac{\partial\boldsymbol{B}}{\partial t}, \tag{9.5} \]
\[ \nabla\cdot\boldsymbol{B} = 0. \tag{7.3} \]
The first of them relates the values of 𝑬 to changes in the vector 𝑩 in time and is
in essence an expression of the law of electromagnetic induction. The second one
points to the absence of sources of a magnetic field, i.e., magnetic charges.
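The second pair of Maxwell’s equations is formed by Eqs. (9.13) and (2.23):
\[ \nabla\times\boldsymbol{H} = \boldsymbol{j} + \frac{\partial\boldsymbol{D}}{\partial t}, \tag{9.13} \]
\[ \nabla\cdot\boldsymbol{D} = \rho. \tag{2.23} \]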
The first of them establishes a relation between the conduction and displacement
currents and the magnetic field they produce. The second one shows that extraneous
charges are the sources of the vector 𝑫.
Equations (9.5), (7.3), (9.13) and (2.23) are Maxwell’s equations in the differential
form. We must note that the first pair of equations includes only the basic char-
acteristics of a field, namely, 𝑬 and 𝑩. The second pair includes only the auxiliary
quantities 𝑫 and 𝑯.
Each of the vector equations (9.5) and (9.13) is equivalent to three scalar equations
relating the components of the vectors in the left-hand and right-hand sides of the
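equations. Using Eqs. (1.81) and (1.92)-(1.91), let us present Maxwell’s equations in the
scalar form:
\[ \left\{ \begin{aligned}
\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} &= -\frac{\partial B_x}{\partial t} \\
\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} &= -\frac{\partial B_y}{\partial t} \\
\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} &= -\frac{\partial B_z}{\partial t}
\end{aligned} \right. \tag{9.15} \]
\[ \frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = 0, \tag{9.16} \]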
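(the first pair of equations),
\[ \left\{ \begin{aligned}
\frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} &= j_x + \frac{\partial D_x}{\partial t} \\
\frac{\partial H_x}{\partial z} - \frac{\partial H_z}{\partial x} &= j_y + \frac{\partial D_y}{\partial t} \\
\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} &= j_z + \frac{\partial D_z}{\partial t}
\end{aligned} \right. \tag{9.17} \]
\[ \frac{\partial D_x}{\partial x} + \frac{\partial D_y}{\partial y} + \frac{\partial D_z}{\partial z} = \rho, \tag{9.18} \]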
(the second pair of equations).
We get a total of 8 equations including 12 functions (three components each
of the vectors 𝑬, 𝑩, 𝑫, 𝑯). Since the number of equations is less than the number
of unknown functions, (9.5), (7.3), (9.13) and (2.23) are not sufficient for finding the
fields according to the given distribution of the charges and currents. To calculate
the fields, we must supplement these equations with relations connecting 𝑫 and 𝒋
with 𝑬, and also 𝑯 with 𝑩. They have the form
\[ \boldsymbol{D} = \varepsilon_0 \varepsilon \boldsymbol{E}, \tag{2.21} \]
\[ \boldsymbol{B} = \mu_0 \mu \boldsymbol{H}, \tag{7.17} \]
\[ \boldsymbol{j} = \sigma \boldsymbol{E}. \tag{5.22} \]
The collection of equations (9.5), (7.3), (9.13) and (2.23), and (2.21), (7.17), (5.22) forms
the foundation of the electrodynamics of media at rest.
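The equations
\[ \oint_\Gamma \boldsymbol{E}\cdot d\boldsymbol{l} = -\frac{d}{dt}\int_S \boldsymbol{B}\cdot d\boldsymbol{S}, \tag{9.19} \]
\[ \oint_S \boldsymbol{B}\cdot d\boldsymbol{S} = 0, \tag{9.20} \]
(the first pair) and
\[ \oint_\Gamma \boldsymbol{H}\cdot d\boldsymbol{l} = \int_S \boldsymbol{j}\cdot d\boldsymbol{S} + \frac{d}{dt}\int_S \boldsymbol{D}\cdot d\boldsymbol{S}, \tag{9.21} \]
\[ \oint_S \boldsymbol{D}\cdot d\boldsymbol{S} = \int_V \rho\, dV, \tag{9.22} \]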
(the second pair) are Maxwell’s equations in the integral form.
Equation (9.19) is obtained by integrating Eq. (9.5) over an arbitrary surface 𝑆 and
then transforming the left-hand side according to Stokes’s theorem
into an integral over the loop 𝛤 bounding surface 𝑆. Equation (9.21) is obtained in the
same way from Eq. (9.13). Equations (9.20) and (9.22) are obtained from Eqs. (7.3) and
(2.23) by integrating over an arbitrary volume 𝑉 and then transforming
the left-hand side according to the Ostrogradsky-Gauss theorem into an integral
over the closed surface 𝑆 enclosing volume 𝑉.

Chapter 10
MOTION OF CHARGED
PARTICLES IN ELECTRIC
AND MAGNETIC FIELDS

10.1. Motion of a Charged Particle in a Homogeneous Magnetic Field

Imagine a charge 𝑒0 moving in a homogeneous magnetic field with the velocity
𝒗 perpendicular to 𝑩. The magnetic force imparts to the charge an acceleration
perpendicular to the velocity
\[ a_n = \frac{F}{m} = \frac{e_0}{m} v B \tag{10.1} \]
[see Eq. (6.33); the angle between 𝒗 and 𝑩 is a right one]. This acceleration changes
only the direction of the velocity, while the magnitude of the latter remains un-
changed. Hence, the acceleration given by Eq. (10.1) will be constant in magnitude
too. In these conditions, the charged particle moves uniformly around a circle
whose radius is determined by means of the equation 𝑎n = 𝑣²/𝑅. Substituting for 𝑎n
in this equation its value from Eq. (10.1) and solving the resulting equation relative
to 𝑅, we get
\[ R = \frac{m}{e_0}\frac{v}{B}. \tag{10.2} \]
Thus, when a charged particle moves in a homogeneous magnetic field per-
pendicular to the plane in which the motion is taking place, the trajectory of the
particle is a circle. The radius of the circle depends on the velocity of the particle,
the magnetic induction of the field, and the ratio of the charge of the particle 𝑒0 to
its mass 𝑚. The ratio 𝑒0/𝑚 is called the specific charge.
Let us find the time 𝑇 needed for the particle to complete one revolution. For
this purpose, we shall divide the length of the circumference 2𝜋𝑅 by the velocity of
the particle 𝑣. The result is
\[ T = 2\pi \frac{m}{e_0}\frac{1}{B}. \tag{10.3} \]

Fig. 10.1 Fig. 10.2

Inspection of Eq. (10.3) shows that the period of revolution of the particle does not
depend on its velocity. It is determined only by the specific charge of the particle
and the magnetic induction of the field.
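
Equations (10.2) and (10.3) are easy to evaluate numerically. The following is only a minimal sketch: the function names and the field and speed values are assumptions chosen for illustration, while the charge and rest mass of the electron are the rounded values quoted in this chapter.

```python
import math

E_CHARGE = 1.60e-19  # C, elementary charge as quoted in Eq. (10.23)
E_MASS = 0.91e-30    # kg, electron rest mass as quoted in Eq. (10.24)


def orbit_radius(mass, charge, speed, b_field):
    """Eq. (10.2): R = (m / e0) * v / B for a velocity perpendicular to B."""
    return mass * speed / (abs(charge) * b_field)


def revolution_period(mass, charge, b_field):
    """Eq. (10.3): T = 2*pi * (m / e0) / B; the speed does not enter."""
    return 2.0 * math.pi * mass / (abs(charge) * b_field)


if __name__ == "__main__":
    b_field = 1.0e-3  # T, assumed illustrative value
    for speed in (1.0e6, 2.0e6):  # m/s, assumed illustrative values
        r = orbit_radius(E_MASS, E_CHARGE, speed, b_field)
        t = revolution_period(E_MASS, E_CHARGE, b_field)
        print(f"v = {speed:.1e} m/s: R = {r:.3e} m, T = {t:.3e} s")
```

Doubling the speed doubles the radius, while the period printed for both speeds is the same.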
Let us determine the nature of motion of a charged particle when its velocity
makes the angle 𝛼 with the direction of a homogeneous magnetic field, and 𝛼 is not
a right angle. We shall resolve the vector 𝒗 into two components: 𝒗⊥ perpendicular
to 𝑩, and 𝒗∥ parallel to 𝑩 (Fig. 10.1). The magnitudes of these components are
\[ v_\perp = v\sin\alpha, \qquad v_\parallel = v\cos\alpha. \]
The magnetic force has the magnitude
\[ F = e_0 v B \sin\alpha = e_0 v_\perp B, \]
and is in a plane at right angles to 𝑩. The acceleration produced by this force
is normal for the component 𝒗⊥. The component of the magnetic force in the
direction of 𝑩 is zero. Hence, this force cannot affect the magnitude of 𝒗∥. The
motion of the particle can, thus, be considered as the superposition of two motions:
(1) translation along the direction of 𝑩 with a constant velocity 𝑣∥ = 𝑣 cos 𝛼, and (2)
uniform circular motion in a plane at right angles to the vector 𝑩.
The radius of the circle is determined by Eq. (10.2) with 𝑣⊥ = 𝑣 sin 𝛼 substituted
for 𝑣. The trajectory of motion is a helix (spiral) whose axis coincides with the
direction of 𝑩 (Fig. 10.2). The pitch of the helix 𝑙 can be found by multiplying 𝑣∥ by
the period of revolution 𝑇 determined by Eq. (10.3):
\[ l = v_\parallel T = 2\pi \frac{m}{e_0}\frac{1}{B}\, v\cos\alpha. \tag{10.4} \]
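
The radius and pitch of the helix can be computed together. The sketch below is illustrative only: the function name and the numerical values are assumptions, and it simply combines Eq. (10.2), with 𝑣 sin 𝛼 in place of 𝑣, and Eqs. (10.3) and (10.4).

```python
import math


def helix_parameters(mass, charge, speed, b_field, alpha):
    """Return (radius, period, pitch) of the helical trajectory.

    alpha is the angle between the velocity and B, in radians.
    """
    v_perp = speed * math.sin(alpha)  # component perpendicular to B
    v_par = speed * math.cos(alpha)   # component parallel to B
    radius = mass * v_perp / (abs(charge) * b_field)          # Eq. (10.2)
    period = 2.0 * math.pi * mass / (abs(charge) * b_field)   # Eq. (10.3)
    pitch = v_par * period                                    # Eq. (10.4)
    return radius, period, pitch


if __name__ == "__main__":
    # Assumed illustrative values: an electron at 1e6 m/s in a 1 mT field.
    r, t, l = helix_parameters(0.91e-30, 1.60e-19, 1.0e6, 1.0e-3, math.radians(60))
    print(f"R = {r:.3e} m, T = {t:.3e} s, pitch = {l:.3e} m")
```

The ratio of the pitch to the radius depends only on the angle: it equals 2𝜋/tan 𝛼.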
The direction in which the helix curls depends on the sign of the particle’s
charge. If the latter is positive, the helix curls counterclockwise. A helix along
which a negatively charged particle is moving curls clockwise (it is assumed that
we are looking at the helix along the direction of 𝑩; the particle flies away from us
if 𝛼 < 𝜋/2, and toward us if 𝛼 > 𝜋/2).

Fig. 10.3

10.2. Deflection of Moving Charged Particles by an Electric and a Magnetic Field

Let us consider a narrow beam of identically charged particles (for example, elec-
trons) that in the absence of fields falls on a screen perpendicular to it at point
0 (Fig. 10.3). Let us find the displacement of the trace of the beam produced by
a homogeneous electric field perpendicular to the beam and acting on a path of
length 𝑙1 . Let the initial velocity of the particles be 𝒗0 . Upon entering the region
of the field, each particle will move with an acceleration 𝑎⊥ = (𝑒0/𝑚)𝐸 constant
in magnitude and in direction and perpendicular to 𝒗0 (here, 𝑒0/𝑚 is the specific
charge of a particle). Motion under the action of the field continues during the time
𝑡 = 𝑙1 /𝑣0 . During this time, the particles will be displaced over the distance
\[ y_1 = \frac{1}{2} a_\perp t^2 = \frac{1}{2}\frac{e_0}{m} E \frac{l_1^2}{v_0^2}, \tag{10.5} \]
and will acquire the following velocity component perpendicular to 𝒗0:
\[ v_\perp = a_\perp t = \frac{e_0}{m} E \frac{l_1}{v_0}. \]
The particles now fly in a straight line in a direction that makes with the vector
𝒗0 the angle 𝛼 determined by the expression
\[ \tan\alpha = \frac{v_\perp}{v_0} = \frac{e_0}{m} E \frac{l_1}{v_0^2}. \tag{10.6} \]
As a result, in addition to the displacement given by Eq. (10.5), the beam receives the
displacement
\[ y_2 = l_2 \tan\alpha = \frac{e_0}{m} E \frac{l_1 l_2}{v_0^2}, \]
where 𝑙2 is the distance to the screen from the boundary of the region occupied by
the field.
Fig. 10.4

The displacement of the trace of the beam relative to point 0 is thus
\[ y = y_1 + y_2 = \frac{e_0}{m} E \frac{l_1}{v_0^2}\left(\frac{1}{2} l_1 + l_2\right). \tag{10.7} \]
Taking into account Eq. (10.6), the expression for the displacement can be written
in the form
\[ y = \left(\frac{1}{2} l_1 + l_2\right)\tan\alpha. \]
It thus follows that the particles leaving the field fly as if they were leaving the centre
of the capacitor setting up the field at the angle 𝛼 determined by means of Eq. (10.6).
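
The chain of Eqs. (10.5)-(10.7) can be followed step by step in a short computation. This is a sketch under assumed values: the function name, field strength, speed, and lengths are illustrative and are not taken from the text.

```python
import math


def electric_deflection(q_over_m, e_field, v0, l1, l2):
    """Return (total displacement y, tan(alpha)) for a transverse electric field.

    q_over_m is the specific charge; l1 is the length of the field region
    and l2 the distance from its boundary to the screen.
    """
    a_perp = q_over_m * e_field   # transverse acceleration
    t = l1 / v0                   # time spent in the field
    y1 = 0.5 * a_perp * t ** 2    # Eq. (10.5)
    v_perp = a_perp * t           # transverse velocity on leaving the field
    tan_alpha = v_perp / v0       # Eq. (10.6)
    y2 = l2 * tan_alpha           # straight flight to the screen
    return y1 + y2, tan_alpha


if __name__ == "__main__":
    # Assumed values: electrons (e/m = 1.76e11 C/kg) at 2e7 m/s,
    # E = 1 kV/m, l1 = 5 cm, l2 = 30 cm.
    y, tan_alpha = electric_deflection(1.76e11, 1.0e3, 2.0e7, 0.05, 0.30)
    print(f"y = {y * 1e3:.2f} mm, alpha = {math.degrees(math.atan(tan_alpha)):.2f} deg")
```

The result agrees with (𝑙1/2 + 𝑙2) tan 𝛼, which is the statement that the particles appear to fly out of the centre of the capacitor.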
Now let us assume that on a particle path of 𝑙1 a homogeneous magnetic field is
switched on perpendicular to the velocity 𝒗0 of the particles (Fig. 10.4; the field is
perpendicular to the plane of the drawing, the region of the field is surrounded by a
dashed circle). Under the action of the field, each particle receives the acceleration
𝑎⊥ = (𝑒0/𝑚)𝑣0 𝐵 constant in magnitude. Limiting ourselves to the case when the
deflection of the beam by the field is not great, we can consider that the acceleration
𝑎⊥ is constant in magnitude and perpendicular to 𝑣0 . Hence, we can use the equa-
tions we have obtained for calculating the displacement, replacing the acceleration
𝑎⊥ = (𝑒0/𝑚)𝐸 in them with the value 𝑎⊥ = (𝑒0/𝑚)𝑣0 𝐵. As a result, we get the
following expression for the displacement, which we shall now denote by 𝑥:
\[ x = \frac{e_0}{m} B \frac{l_1}{v_0}\left(\frac{1}{2} l_1 + l_2\right). \tag{10.8} \]
The angle through which the beam is deflected by the magnetic field is determined
by the expression
\[ \tan\beta = \frac{e_0}{m} B \frac{l_1}{v_0}. \tag{10.9} \]
With a view to Eq. (10.9), we can write Eq. (10.8) in the form
\[ x = \left(\frac{1}{2} l_1 + l_2\right)\tan\beta. \]

Fig. 10.5

Consequently, upon small deflections, the particles after leaving the magnetic field
fly as if they had left the centre of the region containing the deflecting field at the
angle 𝛽 whose magnitude is determined by Eq. (10.9).
Inspection of Eqs. (10.7) and (10.8) shows that both the deflection by an electric
field and the deflection by a magnetic one are proportional to the specific charge of
the particles.
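
How good the small-deflection approximation behind Eq. (10.8) is can be judged by comparing it with the exact circular-arc trajectory. The sketch below rests on an assumption that is not in the text: the field is taken to fill a slab of thickness 𝑙1 with sharp plane boundaries (the figure shows a circular region instead). The function names and numerical values are likewise illustrative.

```python
import math


def small_angle_deflection(q_over_m, b_field, v0, l1, l2):
    """Eq. (10.8): displacement on the screen for a small deflection."""
    return q_over_m * b_field * l1 / v0 * (0.5 * l1 + l2)


def exact_deflection(q_over_m, b_field, v0, l1, l2):
    """Displacement for an exact circular arc inside a slab of thickness l1.

    Assumes sharp plane field boundaries perpendicular to the initial velocity.
    """
    radius = v0 / (q_over_m * b_field)  # Eq. (10.2)
    if l1 >= radius:
        raise ValueError("the particle would not leave the slab")
    sin_b = l1 / radius                 # exit angle beta: sin(beta) = l1 / R
    cos_b = math.sqrt(1.0 - sin_b ** 2)
    inside = radius * (1.0 - cos_b)     # transverse shift inside the field
    return inside + l2 * sin_b / cos_b  # plus straight flight to the screen


if __name__ == "__main__":
    # Assumed values giving an orbit radius of 1 m: electrons at 1.76e7 m/s, B = 0.1 mT.
    args = (1.76e11, 1.0e-4, 1.76e7, 0.05, 0.20)
    print(small_angle_deflection(*args), exact_deflection(*args))
```

For these values the two results differ by roughly a tenth of a percent; the difference grows as 𝑙1 approaches the orbit radius.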
The deflection of a beam of electrons by an electric or magnetic field is used
in cathode-ray tubes. A tube with electrical deflection (Fig. 10.5), apart from the
so-called electron gun producing a narrow beam of fast electrons (an electron beam),
contains two pairs of mutually perpendicular deflecting plates. By feeding a voltage
to any pair of plates, we can produce a proportional displacement of the electron
beam in a direction normal to the given plates. The screen of the tube is coated
with a fluorescent composition. Therefore, a brightly luminescent spot appears on
the screen where the electron beam falls on it.
Cathode-ray tubes are used in oscillographs—instruments making it possible to
study rapid processes. A voltage changing linearly with time (the scanning voltage)
is fed to one pair of deflecting plates, and the voltage being studied to the other.
Owing to the negligibly small inertia of an electron beam, its deflection without
virtually any delay follows the changes in the voltages across both pairs of deflecting
plates, and the beam draws on the oscillograph screen a plot of time dependence of
the voltage being studied. Many nonelectrical quantities can be transformed into
electric voltages with the aid of the relevant devices (transducers). Consequently,
oscillographs are used to study the most diverse processes.
A cathode-ray tube is an integral part of television equipment. In television,
tubes with magnetic control of the electron beam are used most frequently. In these
tubes, the deflecting plates are replaced with two external mutually perpendicular
systems of coils each of which sets up a magnetic field perpendicular to the beam.
Changing of the current in the coils produces motion of the light spot created by
the electron beam on the screen.

Fig. 10.6

10.3. Determination of the Charge and Mass of an Electron

The specific charge of an electron (i.e., the ratio 𝑒/𝑚) was first measured by the
British physicist Joseph J. Thomson (1856-1940) in 1897 with the aid of a discharge
tube like the one shown in Fig. 10.6. The electron beam (cathode rays; see Sec. 12.6)
emerging from the opening in anode A passed between the plates of a parallel-plate
capacitor and impinged on a fluorescent screen producing a light spot on it. By
feeding a voltage to the capacitor plates, it was possible to act on the beam with a
virtually homogeneous electric field.
The tube was placed between the poles of an electromagnet, which could pro-
duce a homogeneous magnetic field perpendicular to the electric one on the same
portion of the path of the electrons (the region of the magnetic field is shown in
Fig. 10.6 by the dashed circle). When the fields were switched off, the beam impinged
on the screen at point 0. Each of the fields separately caused deflection of the beam
in a vertical direction. The magnitudes of the displacements were determined with
the aid of Eqs. (10.7) and (10.8) obtained in the preceding section.
After switching on the magnetic field and measuring the displacement of the
beam trace
\[ x = \frac{e}{m} B \frac{l_1}{v_0}\left(\frac{1}{2} l_1 + l_2\right), \tag{10.10} \]
which it produced, Thomson also switched on the electric field and selected its value
so that the beam would again reach point 0. In this case, the electric and magnetic
fields acted on the electrons of the beam simultaneously with forces identical in
value, but opposite in direction. The condition was observed that
\[ eE = e v_0 B. \tag{10.11} \]
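
Equations (10.10) and (10.11) form two equations in the two unknowns 𝑒/𝑚 and 𝑣0. A possible way of solving them is sketched below; the function name and the numerical values are assumptions made for illustration and are not data from Thomson's experiment.

```python
import math


def thomson_specific_charge(x, e_field, b_field, l1, l2):
    """Solve Eqs. (10.10) and (10.11) for (e/m, v0).

    x is the displacement measured with the magnetic field alone;
    e_field is the electric field that returns the beam to the origin.
    """
    v0 = e_field / b_field                             # from Eq. (10.11)
    q_over_m = x * v0 / (b_field * l1 * (0.5 * l1 + l2))  # from Eq. (10.10)
    return q_over_m, v0


if __name__ == "__main__":
    # Assumed illustrative readings.
    q_over_m, v0 = thomson_specific_charge(x=0.0143, e_field=2.0e3,
                                           b_field=1.0e-4, l1=0.05, l2=0.30)
    print(f"e/m = {q_over_m:.3e} C/kg, v0 = {v0:.3e} m/s")
```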
By solving the simultaneous equations (10.10) and (10.11), Thomson calculated 𝑒/𝑚
and 𝑣0.

Fig. 10.7

H. Busch used the method of magnetic focussing to determine the specific
charge of electrons. The essence of this method consists in the following. Assume
that a slightly diverging beam of electrons having a velocity 𝑣 identical in magnitude
flies out from a certain point of a homogeneous magnetic field. The beam is symmetrical
relative to the direction of the field. The directions in which the electrons
fly out form small angles 𝛼 with the direction of 𝑩. It was shown in Sec. 10.1 that
the electrons in this case travel along helical trajectories, performing during the
identical time
\[ T = 2\pi \frac{m}{e}\frac{1}{B}, \]
a complete revolution and being displaced along the direction of the field over the
distance 𝑙 equal to
\[ l = v\cos\alpha \times T. \tag{10.12} \]
Owing to the smallness of the angles 𝛼, the distances (10.12) for different electrons
are virtually the same and equal 𝑣𝑇 (for small angles cos 𝛼 ≈ 1). Consequently, the
slightly diverging beam is focussed at a point that is at the distance
\[ l = vT = 2\pi \frac{m}{e}\frac{v}{B} \tag{10.13} \]
from the point of emergence of the electrons.
In Busch’s experiment, the electrons emitted by hot cathode C (Fig. 10.7) are
accelerated when passing through the potential difference 𝑈 applied between the
cathode and anode A. As a result, they acquire the velocity 𝑣 whose value can be
found from the relation
\[ eU = \frac{m v^2}{2}. \tag{10.14} \]
After next flying out through an opening in the anode, the electrons form a narrow
beam directed along the axis of the evacuated tube inserted into a solenoid. A
capacitor fed with a varying voltage is placed at the inlet of the solenoid. The
field set up by the capacitor deflects the electrons of the beam from the axis of
the instrument through small angles 𝛼 changing with time. This leads to “eddying”
of the beam—the electrons begin to move along different helical trajectories. A
fluorescent screen is placed at the outlet from the solenoid. If the magnetic induction
𝐵 is selected so that the distance 𝑙0 from the capacitor to the screen complies with
the condition
\[ l_0 = n l \tag{10.15} \]
(𝑙 is the pitch of the helix, and 𝑛 is an integer), then the point of intersection of the
trajectories of the electrons gets onto the screen: the electron beam is focussed at
this point and produces a sharp luminescent spot on the screen. If condition (10.15)
is not observed, the luminescent spot on the screen will be blurred. We can find
𝑒/𝑚 and 𝑣 by solving the system of equations (10.13), (10.14), and (10.15).

Fig. 10.8
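
Eliminating 𝑣 and 𝑙 from Eqs. (10.13), (10.14), and (10.15) gives 𝑒/𝑚 = 8𝜋²𝑛²𝑈/(𝐵𝑙0)², after which 𝑣 follows from Eq. (10.14). The sketch below evaluates this result; the function name and the numerical values are assumptions made for illustration.

```python
import math


def busch_specific_charge(voltage, b_field, l_screen, n=1):
    """Solve Eqs. (10.13)-(10.15) for (e/m, v).

    l_screen is the capacitor-to-screen distance and n the integer number
    of helix pitches that fit into it when the spot is sharp.
    """
    pitch = l_screen / n                                          # Eq. (10.15)
    q_over_m = 8.0 * math.pi ** 2 * voltage / (b_field * pitch) ** 2
    speed = math.sqrt(2.0 * q_over_m * voltage)                   # Eq. (10.14)
    return q_over_m, speed


if __name__ == "__main__":
    # Assumed readings: U = 1 kV, B = 2 mT, sharp spot at 0.335 m with n = 1.
    q_over_m, speed = busch_specific_charge(1.0e3, 2.0e-3, 0.335)
    print(f"e/m = {q_over_m:.3e} C/kg, v = {speed:.3e} m/s")
```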
The most accurate value of the specific charge of an electron, established with
account taken of the results obtained by different methods, is
\[ \frac{e}{m} = 1.76\times10^{11}\ \mathrm{C\,kg^{-1}} = 5.27\times10^{17}\ \mathrm{cgse}_q\,\mathrm{g^{-1}}. \tag{10.16} \]
Equation (10.16) gives the ratio of the charge of an electron to its rest mass 𝑚. In
the experiments conducted by Thomson, Busch, and in other similar experiments,
the ratio of the charge to the relativistic mass
\[ m_r = \frac{m}{\sqrt{1 - (v^2/c^2)}}, \tag{10.17} \]
was determined. In Thomson’s experiments, the speed of the electrons was about
0.1𝑐. At such a speed, the relativistic mass exceeds the rest mass by 0.5%. In subse-
quent experiments, the speed of the electrons reached very high values. In all cases,
the experimenters discovered a reduction in the measured values of 𝑒/𝑚 with a
growth in 𝑣, which occurred in complete accordance with Eq. (10.17).
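
The size of the relativistic correction follows directly from Eq. (10.17). A minimal sketch, with assumed function names, that reproduces the 0.5% figure for 𝑣 = 0.1𝑐 and shows how the measured ratio falls at higher speeds:

```python
import math


def mass_ratio(beta):
    """m_r / m from Eq. (10.17), where beta = v / c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def apparent_specific_charge(rest_value, beta):
    """Ratio of the charge to the relativistic mass at speed beta * c."""
    return rest_value / mass_ratio(beta)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        excess = (mass_ratio(beta) - 1.0) * 100.0
        print(f"v = {beta}c: mass excess {excess:.2f}%, "
              f"e/m_r = {apparent_specific_charge(1.76e11, beta):.3e} C/kg")
```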
The charge of an electron was determined with high accuracy by the American
scientist Robert Millikan (1868-1953) in 1909. He introduced very minute oil droplets
into the closed space between horizontally arranged capacitor plates (Fig. 10.8).
When atomized, the droplets became electrified, and they could be suspended in
mid air by properly choosing the magnitude and the sign of the voltage across the
capacitor. Equilibrium set in when the following condition was observed:
\[ P_0 = e_0 E. \tag{10.18} \]
Here, 𝑒0 is the charge of a droplet, and 𝑃0 is the resultant of the force of gravity and
the buoyant force equal to
\[ P_0 = \frac{4}{3}\pi r^3 (\rho - \rho_0)\, g \tag{10.19} \]
(𝜌 is the density of a droplet, 𝑟 is its radius, and 𝜌0 is the density of air).

Equations (10.18) and (10.19) can be used to find 𝑒 if we know 𝑟. To determine the
radius, the speed 𝑣0 of uniform falling of a droplet was measured in the absence of a
field. Uniform motion of a droplet sets in provided that the force 𝑃 0 is balanced by
the force of resistance 𝐹 = 6𝜋𝜂𝑟𝑣 [see Eq. (9.24) of Vol. I; 𝜂 is the viscosity of air]:
\[ P_0 = 6\pi\eta r v_0. \tag{10.20} \]
The motion of a droplet was observed with the aid of a microscope. To measure 𝑣0 ,
the time was determined during which a droplet covered the distance between two
threads that could be seen in the field of vision of the microscope.
It is very difficult to accurately suspend a droplet in equilibrium. Therefore,
instead of a field complying with condition (10.18), such a field was switched on
under whose action a droplet began to move upward with a small speed. The steady
speed of rising 𝑣𝐸 is determined from the condition that the force 𝑃0 and the force
6𝜋𝜂𝑟𝑣 together balance the force 𝑒0𝐸:
\[ P_0 + 6\pi\eta r v_E = e_0 E. \tag{10.21} \]
Eliminating 𝑃0 and 𝑟 from Eqs. (10.19), (10.20), and (10.21), we get an expression for
𝑒0:
\[ e_0 = 9\pi \left(\frac{2\eta^3 v_0}{(\rho - \rho_0)\, g}\right)^{1/2} \frac{v_0 + v_E}{E} \tag{10.22} \]
(Millikan introduced a correction into this equation taking into account that the
dimensions of the droplets were comparable with the free path of air molecules).
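
Equation (10.22) can be checked by a round trip: the two speeds are computed for a droplet of known radius and charge from Eqs. (10.19)-(10.21), and the charge is then recovered from the speeds. The sketch below does this without Millikan's correction; the function names and all the numerical values (viscosity, densities, radius, field) are assumptions made for illustration.

```python
import math

G = 9.81  # m/s^2, acceleration of free fall (assumed value)


def droplet_speeds(radius, charge, e_field, eta, rho, rho_air):
    """Forward model: return (v0, vE) for a droplet of given radius and charge."""
    weight = 4.0 / 3.0 * math.pi * radius ** 3 * (rho - rho_air) * G  # Eq. (10.19)
    drag_coeff = 6.0 * math.pi * eta * radius
    v_fall = weight / drag_coeff                       # Eq. (10.20)
    v_rise = (charge * e_field - weight) / drag_coeff  # Eq. (10.21)
    return v_fall, v_rise


def droplet_charge(v_fall, v_rise, e_field, eta, rho, rho_air):
    """Eq. (10.22): droplet charge from the measured fall and rise speeds."""
    root = math.sqrt(2.0 * eta ** 3 * v_fall / ((rho - rho_air) * G))
    return 9.0 * math.pi * root * (v_fall + v_rise) / e_field


if __name__ == "__main__":
    # Assumed droplet: radius 1 micrometre carrying five elementary charges.
    params = dict(e_field=3.0e5, eta=1.8e-5, rho=900.0, rho_air=1.2)
    v0, v_e = droplet_speeds(1.0e-6, 5 * 1.60e-19, **params)
    q = droplet_charge(v0, v_e, **params)
    print(f"v0 = {v0:.3e} m/s, vE = {v_e:.3e} m/s, charge = {q / 1.60e-19:.3f} e")
```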
Thus, by measuring the speed of free fall of a droplet 𝑣0 and the speed of its rise
𝑣𝐸 in a known electric field 𝐸, one could find the charge of a droplet 𝑒0. In measuring
the speed 𝑣𝐸 at a certain value of the charge 𝑒0, Millikan ionized the air by radiating
X-rays through the space between the plates. Separate ions adhered to a droplet and
changed its charge. As a result, the speed 𝑣𝐸 also changed. After measuring the new
value of the speed, the space between the plates was again irradiated, and so on.
The changes in the charge of a droplet 𝛥𝑒0 and the charge 𝑒0 itself measured by
Millikan were each time found to be integral multiples of the same quantity 𝑒. This
was an experimental proof of the discrete nature of an electric charge, i.e., of the
fact that any charge consists of elementary charges of the same magnitude.
The value of the elementary charge established with a view to Millikan’s measurements
and to the data obtained in other ways is
\[ e = 1.60\times10^{-19}\ \mathrm{C} = 4.80\times10^{-10}\ \mathrm{cgse}_q. \tag{10.23} \]
The charge of an electron has the same value.
The rest mass of an electron obtained from Eqs. (10.16) and (10.23) is
\[ m = 0.91\times10^{-30}\ \mathrm{kg} = 0.91\times10^{-27}\ \mathrm{g}. \tag{10.24} \]
It is about 1/1840 of the mass of the lightest of all atoms, the hydrogen atom.
The laws of electrolysis established experimentally by Michael Faraday in 1833-1834
played a great part in discovering the discrete nature of electricity. According to
these laws, the mass 𝑚 of a substance liberated when a current passes through an
electrolyte¹ is proportional to the charge 𝑞 carried by the current:
\[ m = \frac{1}{F}\frac{M}{z}\, q. \tag{10.25} \]
Here, 𝑀 is the mass of one mole of the liberated substance, 𝑧 is the valence of the
substance, and 𝐹 is Faraday’s constant (Faraday’s number) equal to
\[ F = 96.5\times10^{3}\ \mathrm{C\,mol^{-1}}. \tag{10.26} \]
Dividing both sides of Eq. (10.25) by the mass of an ion, we get
\[ N = \frac{1}{F}\frac{N_A}{z}\, q, \]
where 𝑁A is Avogadro’s constant and 𝑁 is the number of ions contained in the
mass 𝑚.
Hence, for the charge of one ion, we have
\[ e_0 = \frac{q}{N} = \frac{F}{N_A}\, z. \]
Consequently, the charge of an ion is an integral multiple of the quantity
\[ e = \frac{F}{N_A}, \tag{10.27} \]
which is the elementary charge.
Thus, the discrete nature of the charges which ions in electrolytes can have
follows from an analysis of the laws of electrolysis.
Substituting for 𝐹 in Eq. (10.27) its value from Eq. (10.26) and for 𝑁A its value
found from J. Perrin’s experiments (see Sec. 11.9 of Vol. I), we get a value for 𝑒 that
agrees quite well with that found by Millikan.
Since the accuracy with which Faraday’s constant is determined and the accu-
racy of the value of 𝑒 obtained by Millikan are greatly superior to the accuracy of
Perrin’s experiments for determining 𝑁A , Eq. (10.27) was used to determine Avo-
gadro’s constant. Here, the value of 𝐹 found from experiments in electrolysis and
the value of 𝑒 obtained by Millikan were used.
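
With the rounded values quoted in this section, Eq. (10.27) and the ratio of Eqs. (10.23) and (10.16) can be evaluated in a few lines. The sketch below only repeats this arithmetic; the constant and function names are assumptions.

```python
FARADAY = 96.5e3           # C/mol, Eq. (10.26)
E_CHARGE = 1.60e-19        # C, Eq. (10.23)
SPECIFIC_CHARGE = 1.76e11  # C/kg, Eq. (10.16)


def avogadro_constant(faraday, elementary_charge):
    """Eq. (10.27) solved for N_A: N_A = F / e."""
    return faraday / elementary_charge


def electron_rest_mass(elementary_charge, specific_charge):
    """m = e / (e/m), as used to obtain Eq. (10.24)."""
    return elementary_charge / specific_charge


if __name__ == "__main__":
    print(f"N_A = {avogadro_constant(FARADAY, E_CHARGE):.3e} 1/mol")
    print(f"m   = {electron_rest_mass(E_CHARGE, SPECIFIC_CHARGE):.3e} kg")
```

Because the inputs are rounded to three significant figures, the results are reliable only to about the same precision.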

¹Electrolytes are solutions of salts, alkalies or acids in water and some other liquids, and also
molten salts that are ionic crystals in the solid state. Chemical transformations occur in electrolytes
when a current passes through them. Such substances are called electrolytic conductors (conductors
of the second kind) to distinguish them from electronic conductors (conductors of the first kind) in
which the passage of a current is not attended by chemical transformations.

Fig. 10.9

10.4. Determination of the Specific Charge of Ions. Mass Spectrographs

The methods of determining the specific charge described in the preceding section
are suitable when all the particles in a beam have the same velocity. All the electrons
forming a beam are accelerated by the same potential difference applied between
the cathode from which they fly out and the anode. Therefore, the scattering of
the values of the velocities of the electrons in a beam is very small. If matters were
different, an electron beam would produce a greatly blurred spot on the screen, and
measurements would be impossible.
Ions are formed as a result of ionization of molecules of a gas that takes place in
a volume having an appreciable length. Appearing in different places of this volume,
the ions then pass through different potential differences, and, consequently, their
velocities are different. Thus, the methods used to determine the specific charge of
electrons cannot be applied to ions. In 1907, J. J. Thomson developed the “method
of parabolas”, which made it possible to circumvent the difficulty noted above.
In Thomson’s experiment, a narrow beam of positive ions passed through a
region in which it simultaneously experienced the action of parallel electric and
magnetic fields (Fig. 10.9). Both fields were virtually homogeneous and made a right
angle with the initial direction of the beam. They produced deflections of the ions:
the magnetic field deflected them in the direction of the 𝑥-axis, the electric one
along the 𝑦-axis. According to Eqs. (10.8) and (10.7), these deflections are
\[ \begin{aligned}
x &= \frac{e_0}{m} B \frac{l_1}{v}\left(\frac{1}{2} l_1 + l_2\right), \\
y &= \frac{e_0}{m} E \frac{l_1}{v^2}\left(\frac{1}{2} l_1 + l_2\right),
\end{aligned} \tag{10.28} \]
where 𝑣 is the velocity of a given ion with the specific charge 𝑒0/𝑚, 𝑙1 is the length
of the region in which the field acts on the beam, and 𝑙2 is the distance from the
boundary of this region to the photographic plate registering the ions impinging
on it.

Fig. 10.10

Equations (10.28) are the coordinates of the point at which an ion having the
given values of 𝑒0/𝑚 and the velocity 𝑣 impinges on the plate. Ions having the
same specific charge, but different velocities, reached different points of the plate.
Eliminating the velocity 𝑣 from Eqs. (10.28), we get the equation of a curve along
which the traces of ions having the same value of 𝑒0/𝑚 are arranged:
\[ y = \frac{E}{B^2 l_1 (0.5\, l_1 + l_2)}\,\frac{m}{e_0}\, x^2. \tag{10.29} \]
Inspection of Eq. (10.29) shows that ions having identical values of 𝑒0/𝑚 and
different values of 𝑣 left a trace in the form of a parabola on the plate. Ions having
different values of 𝑒0/𝑚 occupied different parabolas. Equation (10.29) can be used to
find the specific charge of the ions corresponding to each parabola if the parameters
of the instrument are known (i.e., 𝐸, 𝐵, 𝑙1 , and 𝑙2 ), and the displacements 𝑥 and 𝑦
are measured. When the direction of one of the fields was reversed, the relevant
coordinate reversed its sign, and parabolas symmetrical to the initial ones were
obtained. Dividing the distance between similar points of symmetrical parabolas in
half made it possible to find 𝑥 and 𝑦. The trace left on the plate by the beam with
the fields switched off gave the origin of coordinates. Figure 10.10 shows the first
parabolas obtained by Thomson.
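
That ions of one specific charge but different velocities fall on the same parabola can be verified numerically. In the sketch below the function names, the field values, and the instrument dimensions are assumptions; the specific charge is roughly that of a singly charged ion of mass number 20.

```python
import math


def impact_point(q_over_m, speed, e_field, b_field, l1, l2):
    """Eqs. (10.28): coordinates (x, y) of the trace of an ion on the plate."""
    k = q_over_m * l1 * (0.5 * l1 + l2)
    return k * b_field / speed, k * e_field / speed ** 2


def parabola_y(x, q_over_m, e_field, b_field, l1, l2):
    """Eq. (10.29): y as a function of x for a fixed specific charge."""
    return e_field / (b_field ** 2 * l1 * (0.5 * l1 + l2)) / q_over_m * x ** 2


if __name__ == "__main__":
    q_over_m = 1.60e-19 / (20 * 1.66e-27)  # C/kg, assumed ion
    setup = dict(e_field=1.0e4, b_field=0.05, l1=0.05, l2=0.20)
    for speed in (1.0e5, 2.0e5, 4.0e5):
        x, y = impact_point(q_over_m, speed, **setup)
        print(f"v = {speed:.0e} m/s: x = {x:.4e} m, y = {y:.4e} m, "
              f"parabola gives {parabola_y(x, q_over_m, **setup):.4e} m")
```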
Fig. 10.11

Fig. 10.12

When performing experiments with chemically pure neon, Thomson discovered
that this gas produced two parabolas corresponding to relative atomic masses of
20 and 22. This result gave rise to the assumption that there are two chemically
indistinguishable varieties of the neon atoms (today we call them isotopes of neon).
This assumption was proved by the British scientist Francis Aston (1877-1945), who
improved the method of determining the specific charge of ions.
Aston’s instrument, which he called a mass spectrograph, was designed as
follows (Fig. 10.11). A beam of ions separated by a system of slits was consecutively
passed through an electric field and a magnetic field. These fields were directed so
that they caused the ions to travel to opposite sides. When they passed through
the electric field, ions with a given value of 𝑒0/𝑚 were deflected more when their
velocity was lower. Consequently, the ions left the electric field in the form of a
diverging beam. The trajectories of the ions were also curved more in the magnetic
field when their velocity was lower. Since the ions were deflected to opposite sides
by the two fields, after leaving the magnetic field they formed a beam converging at
one point.
Ions with other values of the specific charge were focussed at other points
(the trajectories of the ions for only one value of 𝑒0/𝑚 are shown in Fig. 10.11).
The relevant calculations show that points at which beams formed by ions having
different values of 𝑒0/𝑚 converge are approximately on a single straight line (shown
by a dashed line in the figure). Putting a photographic plate along this line, Aston
obtained a number of short lines on it, each of which corresponded to a definite
value of 𝑒0/𝑚. The similarity of the image obtained on the plate to a photograph of
an optical line spectrum was the reason why Aston called it a mass spectrogram, and
the instrument itself—a mass spectrograph. Figure 10.12 shows mass spectrograms
obtained by Aston (the mass numbers of the relevant ions are indicated opposite
the lines).

Fig. 10.13

K. Bainbridge designed an instrument of a different kind. In the Bainbridge mass
spectrograph (Fig. 10.13), a beam of ions first passes through the so-called velocity
selector that separates ions having a definite velocity from the beam. In the selector,
the ions experience the action of mutually perpendicular electric and magnetic fields
that deflect the ions to opposite sides. Only those ions pass through the selector
slit for which the actions of the electric and magnetic fields compensate each other.
This occurs when 𝑒0 𝐸 = 𝑒0 𝑣𝐵. Hence, the velocities of the ions leaving the selector
regardless of their mass and charge have identical values equal to 𝑣 = 𝐸/𝐵.
After leaving the selector, the ions get into the region of a homogeneous magnetic
field of induction 𝐵0 at right angles to their velocity. In this field, they move
along circles whose radii depend on 𝑒0/𝑚:
\[ R = \frac{m}{e_0}\frac{v}{B_0} \]
[see Eq. (10.2)].
After completing a semi-circle, the ions strike a photographic plate at distances
of 2𝑅 from the slit. Hence, the ions of each species (determined by the value of
𝑒0/𝑚) leave a trace on the plate in the form of a narrow strip. The specific charges
of the ions can be calculated if the parameters of the instrument are known. Since
the charges of the ions are integral multiples of the elementary charge 𝑒, the masses
of the ions can be calculated from the found values of 𝑒0/𝑚.
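
The working of the Bainbridge instrument reduces to two formulas: 𝑣 = 𝐸/𝐵 for the selector and the distance 2𝑅 for the trace on the plate. The sketch below applies them to singly charged ions with mass numbers 20 and 22. The function names, the field values, and the value of the atomic mass unit are assumptions that are not taken from the text.

```python
import math

AMU = 1.66e-27       # kg, atomic mass unit (assumed rounded value)
E_CHARGE = 1.60e-19  # C, Eq. (10.23)


def selector_velocity(e_field, b_field):
    """Velocity passed by the selector: e0*E = e0*v*B gives v = E / B."""
    return e_field / b_field


def impact_distance(mass, charge, e_field, b_field, b_deflect):
    """Distance 2R from the slit at which an ion strikes the plate."""
    speed = selector_velocity(e_field, b_field)
    return 2.0 * mass * speed / (abs(charge) * b_deflect)


if __name__ == "__main__":
    # Assumed fields: E = 20 kV/m and B = 0.1 T in the selector, 0.2 T beyond it.
    for mass_number in (20, 22):
        d = impact_distance(mass_number * AMU, E_CHARGE, 2.0e4, 0.1, 0.2)
        print(f"A = {mass_number}: trace at {d * 100:.2f} cm from the slit")
```

For a fixed charge, the distance from the slit is proportional to the mass of the ion, so the two traces are separated by 10% of the smaller distance.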
Numerous kinds of mass spectrographs are in use at present. Instruments have
also been designed in which the ions are registered by means of an electrical device
instead of by a photographic plate. They are called mass spectrometers.

Fig. 10.14

10.5. Charged Particle Accelerators

Experiments using beams of high-energy charged particles play a great part in the
physics of atomic nuclei and elementary particles. The devices used for obtaining
such beams are called charged particle accelerators. There are many types of
such devices. We shall acquaint ourselves with the operating principles of some of
them.
The Van De Graaff Generator. In 1929, R. van de Graaff proposed an electro-
static generator based on the fact that surplus charges take up a position on the
external surface of a conductor. A schematic view of the generator is shown in
Fig. 10.14. A hollow metal sphere called a conductor is mounted on an insulating
column. An endless moving belt of silk or rubberized fabric mounted on shafts
is introduced into the sphere. A comb of sharp points is installed at the base of
the column near the belt. The charge produced by a voltage generator (VG) for
several scores of kilovolts flows onto the belt from the comb points. The conductor
contains a second comb onto whose points the charge flows from the belt. This
comb is connected to the conductor so that the charge taken off the belt
immediately passes over to its external surface. As charges accumulate on the con-
ductor, its potential grows until the charge that leaks away becomes equal to the
newly supplied charge. The leakage is mainly due to ionization of the gas near the
surface of the conductor. The resulting passage of a current through the gas is called
a corona discharge (see Sec. 12.8). The surface of the conductor is carefully polished
226 MOTION OF CHARGED PARTICLES

Fig. 10.15

to reduce the corona discharge.


The potential up to which the conductor can be charged is limited by the
circumstance that at a field strength of about 3 MV m⁻¹ (30 kV cm⁻¹) a discharge
appears in the air at atmospheric pressure. For a sphere, 𝐸 = 𝜑/𝑟. Therefore, to
obtain higher potential differences, the size of the conductor has to be increased
(up to 10 m in diameter). The maximum potential difference that can be obtained
in practice with the aid of a van de Graaff generator is about 10 MV (10⁷ V).
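The limit set by the breakdown of air can be illustrated with a short numerical sketch. It only evaluates 𝜑 = 𝐸𝑟 for a sphere at the breakdown field of 30 kV cm⁻¹; the function name and the list of diameters are illustrative assumptions, not part of the original text.

```python
# Breakdown field strength of air at atmospheric pressure: 30 kV/cm, written in V/m.
E_BREAKDOWN_AIR = 3.0e6

def max_sphere_potential(radius_m, e_max=E_BREAKDOWN_AIR):
    """Potential (V) at which the surface field E = phi / r of a sphere reaches e_max."""
    return e_max * radius_m

# Hypothetical diameters; 10 m is the upper size mentioned in the text.
for diameter_m in (1.0, 4.0, 10.0):
    phi = max_sphere_potential(diameter_m / 2.0)
    print(f"diameter {diameter_m:4.1f} m: phi_max = {phi / 1e6:.1f} MV")
```

For the 10 m sphere this idealized estimate gives 15 MV, the same order as the practical limit of about 10 MV.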
Particles are accelerated in a discharge tube (DT) to whose electrodes the po-
tential difference obtained in the generator is applied. A van de Graaff generator is
sometimes designed in the form of two identical columns near each other whose
conductors are charged oppositely. In this case, the discharge tube is connected
between the conductors.
It must be noted that the generator belt, conductor, discharge tube, and the
earth form a closed direct current circuit. Inside the tube, the charges move under
the action of the electrostatic field. Charges are carried to the conductor from the
earth by extraneous forces whose part is played by the mechanical forces bringing
the generator belt into motion.
Betatron. This is the name given to an induction accelerator of electrons using
a vortex electric field. It consists of a toroidal evacuated chamber (a doughnut)
placed between the poles of an electromagnet of a special shape (Fig. 10.15). The
winding of the magnet is supplied with alternating current having a frequency of
about 100 Hz. The varying magnetic field produced performs two functions: first,
it sets up a vortex electric field accelerating the electrons, and, second, it retains the
electrons in an orbit coinciding with the axis of the doughnut.
To keep an electron in an orbit of constant radius, the magnetic induction of the
field must be increased as its velocity grows [according to Eq. (10.2), the radius of the
orbit is proportional to 𝑣/𝐵]. Consequently, only the second and fourth quarters
of the current period can be used for acceleration because at their beginning the
current in the magnet winding is zero. A betatron thus operates in pulse conditions.
At the beginning of the pulse, an electron gun feeds a beam of electrons into the
doughnut. The beam is caught up by the vortex electric field and begins to travel
in a circular orbit with a constantly growing velocity. During the growth of the
magnetic field (about 10⁻³ s), the electrons are able to complete up to a million
revolutions and acquire an energy that may reach several hundred MeV. With such
an energy, the speed of the electrons almost equals the speed of light 𝑐.
For an electron being accelerated to travel in a circular orbit of radius $r_0$, a
simple relation, which we shall now proceed to derive, must be observed between
the magnetic induction of the field in the orbit and inside it. The vortex electric field
is directed along a tangent to the orbit along which the electron is travelling. Hence,
the circulation of the vector $\mathbf{E}$ along this orbit is $2\pi r_0 E$. At the same time, according
to Eq. (9.19), the circulation of the vector $\mathbf{E}$ is $-\mathrm{d}\Phi/\mathrm{d}t$, where $\Phi$ is the magnetic flux
through the surface enclosed by the orbit. The minus sign indicates the direction of
$\mathbf{E}$. We shall be interested only in the magnitude of the field strength, therefore, we
shall omit the minus sign. Equating the two expressions for the circulation, we find
that
\[ E = \frac{1}{2\pi r_0}\,\frac{\mathrm{d}\Phi}{\mathrm{d}t}. \]
The magnetic field is perpendicular to the plane of the orbit. We can, therefore,
assume that $\Phi = \pi r_0^2 \langle B \rangle$, where $\langle B \rangle$ is the average value of the magnetic induction
over the area of the orbit. Hence,
\[ E = \frac{1}{2\pi r_0}\,\frac{\mathrm{d}}{\mathrm{d}t}\left(\pi r_0^2 \langle B \rangle\right) = \frac{r_0}{2}\,\frac{\mathrm{d}\langle B \rangle}{\mathrm{d}t}. \tag{10.30} \]
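Let us write the relativistic equation of motion of an electron in orbit:
\[ \frac{\mathrm{d}}{\mathrm{d}t}\left[\frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}}\right] = e\mathbf{E} + e\,\mathbf{v} \times \mathbf{B}_{\mathrm{orb}} \tag{10.31} \]
($\mathbf{B}_{\mathrm{orb}}$ is the magnetic induction of the field in the orbit).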


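The velocity of an electron moving along a circle of radius $r_0$ can be written in
the form $\mathbf{v} = \omega r_0 \hat{\boldsymbol{\tau}}$, where $\omega$ is the angular velocity of the electron, and $\hat{\boldsymbol{\tau}}$ is the unit
vector of a tangent to the orbit. The vector $\mathbf{E}$ can be represented in the form
\[ \mathbf{E} = E\hat{\boldsymbol{\tau}} = \frac{r_0}{2}\,\frac{\mathrm{d}\langle B \rangle}{\mathrm{d}t}\,\hat{\boldsymbol{\tau}} \]
[see Eq. (10.30)]. Finally, the product $\mathbf{v} \times \mathbf{B}_{\mathrm{orb}}$ can be written in the form $vB_{\mathrm{orb}}\hat{\mathbf{n}} = \omega r_0 B_{\mathrm{orb}}\hat{\mathbf{n}}$,
where $\hat{\mathbf{n}}$ is a unit vector of a normal to the orbit. In view of what has been said
above, let us write Eq. (10.31) as follows:
\[ \frac{\mathrm{d}}{\mathrm{d}t}\left[\frac{m\omega r_0 \hat{\boldsymbol{\tau}}}{\sqrt{1 - \omega^2 r_0^2/c^2}}\right] = \frac{e r_0}{2}\,\frac{\mathrm{d}\langle B \rangle}{\mathrm{d}t}\,\hat{\boldsymbol{\tau}} + e\omega r_0 B_{\mathrm{orb}}\hat{\mathbf{n}}. \tag{10.32} \]
The time derivative of the unit vector $\hat{\boldsymbol{\tau}}$ is $\mathrm{d}\hat{\boldsymbol{\tau}}/\mathrm{d}t = \omega\hat{\mathbf{n}}$ [see Eq. (1.56) of Vol. I; the
angular velocity of rotation of the unit vector $\hat{\boldsymbol{\tau}}$ coincides with the angular velocity
of an electron]. Consequently, performing differentiation in the left-hand side of
Eq. (10.32), we arrive at the equation
\[ \frac{\mathrm{d}}{\mathrm{d}t}\left[\frac{m\omega r_0}{\sqrt{1 - \omega^2 r_0^2/c^2}}\right]\hat{\boldsymbol{\tau}} + \left[\frac{m\omega r_0}{\sqrt{1 - \omega^2 r_0^2/c^2}}\right]\omega\hat{\mathbf{n}} = \frac{e r_0}{2}\,\frac{\mathrm{d}\langle B \rangle}{\mathrm{d}t}\,\hat{\boldsymbol{\tau}} + e\omega r_0 B_{\mathrm{orb}}\hat{\mathbf{n}}. \]
Equating the factors of similar unit vectors in the left-hand and right-hand sides of
the equation, we get
\[ \frac{\mathrm{d}}{\mathrm{d}t}\left[\frac{m\omega r_0}{\sqrt{1 - \omega^2 r_0^2/c^2}}\right] = \frac{e r_0}{2}\,\frac{\mathrm{d}\langle B \rangle}{\mathrm{d}t}, \tag{10.33} \]
\[ \frac{m\omega r_0}{\sqrt{1 - \omega^2 r_0^2/c^2}} = e r_0 B_{\mathrm{orb}}. \tag{10.34} \]
It follows from Eq. (10.33), upon integration with respect to time, that
\[ \frac{m\omega r_0}{\sqrt{1 - \omega^2 r_0^2/c^2}} = \frac{e r_0}{2}\,\langle B \rangle \tag{10.35} \]
($\omega$ and $\langle B \rangle$ at the beginning of a pulse equal zero).
A comparison of Eqs. (10.34) and (10.35) yields:
\[ B_{\mathrm{orb}} = \frac{1}{2}\,\langle B \rangle. \]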
Thus, for an electron to travel constantly in a circular orbit, the magnetic induction
in the orbit must be half of the average value of the magnetic induction inside the
orbit. This is achieved by making the pole shoes in the form of truncated cones (see
Fig. 10.15).
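Equation (10.34) fixes the relativistic momentum of the electron on the orbit through the product of the orbit radius and the orbit field. The following sketch, given only as an illustration, turns this into a kinetic energy; the radius and field values are hypothetical, and the function names are not from the text.

```python
import math

C = 2.998e8                 # speed of light, m/s
ELECTRON_REST_EV = 0.511e6  # electron rest energy m c^2, eV

def betatron_kinetic_energy_ev(r0_m, b_orb_t):
    """Kinetic energy (eV) of an electron held on radius r0 by the orbit field B_orb.

    Eq. (10.34) gives the relativistic momentum p = e * r0 * B_orb, so that
    p*c expressed in electron-volts is numerically r0 * B_orb * c.
    """
    pc_ev = r0_m * b_orb_t * C
    total_ev = math.hypot(pc_ev, ELECTRON_REST_EV)  # sqrt((pc)^2 + (mc^2)^2)
    return total_ev - ELECTRON_REST_EV

def required_average_field(b_orb_t):
    """Average induction inside the orbit demanded by the condition B_orb = <B>/2."""
    return 2.0 * b_orb_t

R0 = 1.0     # orbit radius, m (hypothetical)
B_ORB = 1.0  # final induction in the orbit, T (hypothetical)

energy_mev = betatron_kinetic_energy_ev(R0, B_ORB) / 1e6
print(f"final kinetic energy: about {energy_mev:.0f} MeV")
print(f"required average induction inside the orbit: {required_average_field(B_ORB):.1f} T")
```

With these assumed values the energy comes out at roughly 300 MeV, in line with the several hundred MeV mentioned for large betatrons.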
At the end of an acceleration cycle, an additional magnetic field is switched on
that deflects the accelerated electrons from their stationary orbit and directs them
onto a special target inside the doughnut. Upon striking the target, the electrons
emit hard electromagnetic radiation (gamma rays, X-rays).
Betatrons are mainly used in nuclear investigations. Small accelerators for an
energy up to 50 MeV have found use in industry as sources of very hard X-rays
employed for flaw detection in massive articles.
Cyclotron. The accelerator bearing this name is based on the period of revo-
lution of a charged particle in a homogeneous magnetic field being independent
of its velocity [see Eq. (10.3)]. This apparatus consists of two electrodes in the form
Fig. 10.16

of halves of a low round box (Fig. 10.16²) called dees. The latter are confined in an
evacuated housing placed between the poles of a large electromagnet. The field
produced by the magnet is homogeneous and perpendicular to the plane of the dees.
The dees are supplied with an alternating voltage produced by a high-frequency
generator.
Let us introduce a charged particle into the slit between the dees at the moment
when the voltage reaches its maximum value. The particle will be caught up by the
electric field and pulled into one of the dees. The space inside the dee is equipotential,
therefore, the particle in it will be under the action of only a magnetic field. In this
case, the particle travels along a circle whose radius is proportional to the velocity of
the particle [see Eq. (10.2)]. Let us choose the frequency of the change in the voltage
between the dees so that by the moment when the particle, after covering half of
the circle, approaches the slit between the dees, the potential difference between
them will change its sign and reach its amplitude value. The particle will now be
accelerated again and fly into the second dee with an energy double that with which
it travelled in the first dee. Having a greater velocity, the particle will travel in the
second dee along a circle of a greater radius (𝑅 is proportional to 𝑣), but the time
during which it covers half the circle remains the same as previously. Therefore,
by the moment when the particle flies into the slit between the dees, the voltage
between them will again change its sign and take on the amplitude value.
Thus, the particle travels along a curve close to a spiral, and each time it passes
through the slit between the dees it receives an additional portion of energy equal to
𝑒′𝑈ₘ (𝑒′ is the charge of the particle, and 𝑈ₘ is the amplitude of the voltage produced
by the generator). Having a source of alternating voltage of a comparatively small
value (𝑈ₘ is about 10⁵ V) at our disposal, we can use a cyclotron to accelerate
protons up to energies of about 25 MeV. At higher energies, the dependence of the
mass of the protons on the velocity begins to tell: the period of revolution increases
[according to Eq. (10.3) it is proportional to 𝑚], and the synchronism between the
motion of the particles and the changes in the accelerating field is violated.

²This figure was taken from https://commons.wikimedia.org/wiki/File:Zyclotron.svg.

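The numbers quoted for the cyclotron can be checked with a small sketch. It assumes that Eq. (10.3) has the usual form 𝑇 = 2𝜋𝑚/(𝑒′𝐵) with 𝑚 the relativistic mass; the induction of 1.5 T and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602e-19      # proton charge, C
PROTON_MASS = 1.6726e-27  # proton rest mass, kg
PROTON_REST_MEV = 938.27  # proton rest energy, MeV

def gap_crossings(final_energy_ev, u_m_volts):
    """Number of passages through the slit if each one adds e' * U_m."""
    return final_energy_ev / u_m_volts

def revolution_period(b_tesla, kinetic_mev=0.0):
    """Period T = 2*pi*m/(e'*B), with the mass increased by the relativistic factor."""
    gamma = 1.0 + kinetic_mev / PROTON_REST_MEV
    return 2.0 * math.pi * gamma * PROTON_MASS / (E_CHARGE * b_tesla)

B = 1.5  # magnetic induction, T (hypothetical)
crossings = gap_crossings(25e6, 1e5)
ratio = revolution_period(B, 25.0) / revolution_period(B, 0.0)
print(f"slit passages needed for 25 MeV: {crossings:.0f}")
print(f"period at 25 MeV relative to the initial period: {ratio:.4f}")
```

The period grows by nearly 3% at 25 MeV, which is enough to upset the synchronism over a few hundred passages.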
To prevent this violation of synchronism and to obtain particles having higher
energies, either the frequency of the voltage fed to the dees or the magnetic field
induction is made to vary. An apparatus in which in the course of accelerating
each portion of particles the frequency of the accelerating voltage is diminished as
required is called a phasotron (or a synchrocyclotron). An accelerator in which
the frequency remains constant, while the magnetic field induction is changed so
that the ratio 𝑚/𝐵 remains constant is called a synchrotron (equipment of this
type is used only to accelerate electrons).
In the accelerator called a synchrophasotron or a proton synchrotron, both
the frequency of the accelerating voltage and the magnetic field induction are
changed. The particles being accelerated travel in this machine along a circular
path instead of a spiral. An increase in the velocity and mass of the particles is
attended by a growth in the magnetic field induction so that the radius determined
by Eq. (10.2) remains constant. The period of revolution of the particles changes
both owing to the growth in their mass and to the growth in 𝐵. For the accelerating
voltage to be synchronous with the motion of the particles, the frequency of this
voltage is made to change according to the relevant law. A synchrophasotron has
no dees, and the particles are accelerated on separate sections of the path by the
electric field produced by the varying frequency voltage generator.
The most powerful accelerator at present (in 1979), a proton synchrotron, was
started in 1974 at the Fermi National Accelerator Laboratory at Batavia, Illinois,
in the USA. It accelerates protons up to an energy of 400 GeV (4 × 10¹¹ eV). The
speed of protons having such an energy differs from that of light in a vacuum by
less than 0.0003% (𝑣 = 0.9999972𝑐).
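The quoted speed can be reproduced from the relativistic relation between energy and velocity. In this sketch the 400 GeV is treated as kinetic energy, which is an assumption; treating it as total energy changes the result only in the eighth decimal place.

```python
import math

PROTON_REST_GEV = 0.93827  # proton rest energy, GeV

def speed_deficit(kinetic_gev):
    """Return 1 - v/c for a proton of the given kinetic energy."""
    gamma = 1.0 + kinetic_gev / PROTON_REST_GEV
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    # 1 - beta = (1 - beta^2) / (1 + beta); this avoids subtracting nearly equal numbers.
    return (1.0 / gamma**2) / (1.0 + beta)

deficit = speed_deficit(400.0)
print(f"1 - v/c = {deficit:.3e}  ({deficit * 100:.5f} %)")
print(f"v/c = {1.0 - deficit:.7f}")
```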

Chapter 11
THE CLASSICAL THEORY OF
ELECTRICAL CONDUCTANCE OF
METALS

11.1. The Nature of Current Carriers in Metals

A number of experiments were run to reveal the nature of the current carriers in
metals. Let us first of all note the experiment conducted in 1901 by the German
physicist Carl Riecke (1845-1915). He took three cylinders—two of copper and one
of aluminium—with thoroughly polished ends. After being weighed, the cylinders
were put end to end in the sequence copper-aluminium-copper. A current was
passed in one direction through this composite conductor for a year. During this
time, a total charge of 3.5 × 10⁶ C passed through the cylinders. Weighing showed
that the passage of a current had no effect on the weight of the cylinders. When the
ends that had been in contact were studied under a microscope, no penetration of
one metal into another was detected. The results of the experiment indicate that a
charge is carried in metals not by atoms, but by particles encountered in all metals.
The electrons discovered by J. J. Thomson in 1897 could be such particles.
To identify the current carriers in metals with electrons, it was necessary to
determine the sign and numerical value of the specific charge of the carriers. Exper-
iments run for this purpose were based on the following considerations. If metals
contain charged particles capable of moving, then upon the deceleration (braking)
of a metal conductor these particles should continue to move by inertia for a certain
time, as a result of which a current pulse will appear in the conductor, and a certain
charge will be carried in it.
Assume that a conductor initially moves with the velocity 𝒗0 (Fig. 11.1). We shall
begin to decelerate it with the acceleration 𝒂. Continuing to move by inertia, the
Fig. 11.1

current carriers will acquire the acceleration $-\mathbf{a}$ relative to the conductor. The
same acceleration can be imparted to the carriers in a stationary conductor if an
electric field of strength $\mathbf{E} = -m\mathbf{a}/e'$ is set up in it, i.e., if the potential difference
\[ \varphi_1 - \varphi_2 = \int_1^2 \mathbf{E} \cdot \mathrm{d}\mathbf{l} = -\int_1^2 \frac{m\mathbf{a}}{e'} \cdot \mathrm{d}\mathbf{l} = -\frac{mal}{e'} \]
is applied to the ends of the conductor ($m$ and $e'$ are the mass and the charge of
a current carrier, $l$ is the length of the conductor). In this case, the current
$I = (\varphi_1 - \varphi_2)/R$, where $R$ is the resistance of the conductor, will flow through it ($I$
is considered to be positive if the current flows in the direction of motion of the
conductor). Hence, the following charge will pass through each cross section of the
conductor during the time $\mathrm{d}t$:
\[ \mathrm{d}q = I\,\mathrm{d}t = -\frac{mal}{e'R}\,\mathrm{d}t = -\frac{ml}{e'R}\,\mathrm{d}v. \]
The charge passing during the entire time of deceleration is
\[ q = \int \mathrm{d}q = -\int_{v_0}^{0} \frac{ml}{e'R}\,\mathrm{d}v = \frac{m}{e'}\,\frac{l v_0}{R} \tag{11.1} \]
(the charge is positive if it is carried in the direction of motion of the conductor).
Thus, by measuring 𝑙, 𝑣0 , and 𝑅, and also the charge 𝑞 flowing through the
circuit when the conductor is decelerated, we can find the specific charge of the
carriers. The direction of the current pulse will indicate the sign of the carriers.
The first experiment with conductors moving with acceleration was run in
1913 by the Soviet physicists Leonid Mandelshtam (1879-1944) and Nikolai Papaleksi
(1880-1947). They made a wire coil perform rapid torsional oscillations about its
axis. A telephone was connected to the ends of the coil, and the sound due to the
current pulses was heard in it.
A quantitative result was obtained by the American physicists R. Tolman and
T. Stewart in 1916. A coil of wire 500 m long was made to rotate with a linear
velocity of the turns of 300 m s⁻¹. The coil was then sharply braked, and a ballistic
galvanometer was used to measure the charge flowing in the circuit during the
braking time. The value of the specific charge of the carriers calculated by Eq. (11.1)
was found to be very close to 𝑒/𝑚 for electrons. It was, thus, proved experimentally
that electrons are the current carriers in metals.
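To get a feeling for the size of the effect, Eq. (11.1) can be evaluated for the wire length and velocity quoted above. The circuit resistance is not given in the text, so the 40 Ω used below is purely hypothetical, as are the function names.

```python
ELECTRON_MASS = 9.11e-31     # kg
ELECTRON_CHARGE = 1.602e-19  # magnitude of the electron charge, C

def braking_charge(length_m, v0, resistance_ohm, specific_charge):
    """Eq. (11.1): q = (m/e') * l * v0 / R, with specific_charge = e'/m."""
    return length_m * v0 / (specific_charge * resistance_ohm)

def specific_charge_from_measurement(q, length_m, v0, resistance_ohm):
    """Eq. (11.1) solved for e'/m from a measured charge q."""
    return length_m * v0 / (q * resistance_ohm)

L_WIRE = 500.0  # m, from the text
V0 = 300.0      # m/s, from the text
R = 40.0        # ohm, hypothetical circuit resistance

e_over_m = ELECTRON_CHARGE / ELECTRON_MASS
q = braking_charge(L_WIRE, V0, R, e_over_m)
print(f"expected charge: {q:.2e} C")
print(f"recovered e/m:   {specific_charge_from_measurement(q, L_WIRE, V0, R):.3e} C/kg")
```

The charge is of the order of 10⁻⁸ C under these assumptions, which is why a sensitive ballistic galvanometer was needed.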
A current can be produced in metals by an extremely small potential difference.
This gives us grounds to consider that the current carriers, electrons, move
virtually without hindrance in a metal. The results of Tolman’s and Stewart’s
experiment lead to the same conclusion.
The existence of free electrons in metals can be explained by the fact that
when a crystal lattice is formed, the most weakly bound (valence) electrons detach
themselves from the atoms of the metal. They become the “collective” property
of the entire piece of metal. If one electron becomes detached from every atom,
then the concentration of the free electrons (i.e., their number 𝑛 in a unit volume)
will equal the number of atoms in a unit volume. The latter number is $(\delta/M)N_{\mathrm{A}}$,
where $\delta$ is the density of the metal, $M$ is the mass of a mole, and $N_{\mathrm{A}}$ is Avogadro’s
constant. The values of $\delta/M$ for metals range from 2 × 10⁴ mol m⁻³ (for potassium)
to 2 × 10⁵ mol m⁻³ (for beryllium). Hence, we get values of the following order for
the concentration of the free electrons (or conduction electrons, as they are also
called):
\[ n = 10^{28}\ \text{to}\ 10^{29}\ \mathrm{m^{-3}} \quad (10^{22}\ \text{to}\ 10^{23}\ \mathrm{cm^{-3}}). \tag{11.2} \]
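The estimate (11.2) follows directly from the quoted molar concentrations. A minimal sketch, assuming one free electron per atom as in the text (the function name and the optional parameter are illustrative):

```python
AVOGADRO = 6.022e23  # 1/mol

def free_electron_concentration(moles_per_m3, electrons_per_atom=1):
    """n = (delta / M) * N_A, multiplied by the number of electrons detached per atom."""
    return moles_per_m3 * electrons_per_atom * AVOGADRO

for metal, molar_concentration in (("potassium", 2e4), ("beryllium", 2e5)):
    n = free_electron_concentration(molar_concentration)
    print(f"{metal:10s}: n = {n:.1e} m^-3 = {n * 1e-6:.1e} cm^-3")
```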


11.2. The Elementary Classical Theory of Metals

Proceeding from the notions of free electrons, the German physicist Paul Drude
(1863-1906) created the classical theory of metals that was later improved by H.
Lorentz. Drude assumed that the conduction electrons in a metal behave like the
molecules of an ideal gas. In the intervals between collisions, they move absolutely
freely, covering on an average a certain path 𝑙. True, unlike the molecules of a gas
whose free path is determined by collisions of the molecules with one another,
the electrons collide chiefly not with one another, but with the ions forming the
crystal lattice of the metal. These collisions result in the establishment of thermal
equilibrium between the electron gas and the crystal lattice.
Assuming that the results of the kinetic theory of gases may be extended to
an electron gas, we can use the following formula to assess the average velocity of
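thermal motion of the electrons:
\[ \langle v \rangle = \left(\frac{8kT}{\pi m}\right)^{1/2} \tag{11.3} \]
[see Eq. (11.65) of Vol. I]. Calculations by this equation for room temperature (about
300 K) give the following result:
\[ \langle v \rangle = \left(\frac{8 \times 1.38 \times 10^{-23} \times 300}{3.14 \times 0.91 \times 10^{-30}}\right)^{1/2} \approx 10^{5}\ \mathrm{m\,s^{-1}}. \]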
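The arithmetic of Eq. (11.3) is easy to repeat. The sketch below uses the same rounded constants as the text; the function name is illustrative.

```python
import math

BOLTZMANN = 1.38e-23      # J/K
ELECTRON_MASS = 0.91e-30  # kg

def mean_thermal_speed(temperature_k, mass_kg=ELECTRON_MASS):
    """Eq. (11.3): <v> = sqrt(8 k T / (pi m))."""
    return math.sqrt(8.0 * BOLTZMANN * temperature_k / (math.pi * mass_kg))

v = mean_thermal_speed(300.0)
print(f"<v> at 300 K = {v:.2e} m/s")
```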

When a field is switched on, the ordered motion of the electrons with a certain
average velocity $\langle u \rangle$ is superposed onto the chaotic thermal motion occurring with
the velocity $\langle v \rangle$. It is simple to assess the value of $\langle u \rangle$ by the equation
\[ j = ne\langle u \rangle \tag{11.4} \]
[see Eq. (5.23)]. The maximum current density for copper wires allowed by the
relevant specifications is about 10⁷ A m⁻² (10 A mm⁻²). Taking the value of 10²⁹ m⁻³
for $n$, we get
\[ \langle u \rangle = \frac{j}{en} \approx \frac{10^{7}}{1.6 \times 10^{-19} \times 10^{29}} \approx 10^{-3}\ \mathrm{m\,s^{-1}}. \]
Thus, even at very high current densities, the average velocity of ordered motion
of the charges $\langle u \rangle$ is about $1/10^{8}$ of the average velocity of thermal motion $\langle v \rangle$.
Therefore, in calculations, the magnitude of the resultant velocity $|\mathbf{v} + \mathbf{u}|$ may be
replaced with that of the velocity of thermal motion $|\mathbf{v}|$.
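The comparison of the two velocities can be sketched numerically. The inputs are the values quoted in the text (current density, concentration, and the order-of-magnitude thermal velocity); the names are illustrative.

```python
E_CHARGE = 1.6e-19  # elementary charge, C

def drift_velocity(current_density, concentration):
    """Eq. (11.4) solved for <u>: <u> = j / (e n)."""
    return current_density / (E_CHARGE * concentration)

J_MAX = 1e7       # A/m^2, maximum current density for copper wires
N = 1e29          # m^-3, concentration of conduction electrons
V_THERMAL = 1e5   # m/s, order of magnitude of <v> at room temperature

u = drift_velocity(J_MAX, N)
print(f"<u> = {u:.2e} m/s")
print(f"<u>/<v> = {u / V_THERMAL:.1e}")
```

The ratio comes out at several parts in 10⁹, i.e. of the order of 1/10⁸.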
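Let us find the change in the mean value of the kinetic energy of the electrons
produced by a field. The mean square of the resultant velocity is
\[ \langle (\mathbf{v} + \mathbf{u})^2 \rangle = \langle \mathbf{v}^2 + 2\mathbf{v} \cdot \mathbf{u} + \mathbf{u}^2 \rangle = \langle \mathbf{v}^2 \rangle + 2\langle \mathbf{v} \cdot \mathbf{u} \rangle + \langle \mathbf{u}^2 \rangle. \tag{11.5} \]
The two events consisting in that the velocity of thermal motion of the electrons
takes on the value $\mathbf{v}$, while the velocity of ordered motion takes on the value $\mathbf{u}$, are
statistically independent. Therefore, according to the theorem on the multiplication
of probabilities [see Eq. (11.4) of Vol. I], we have $\langle \mathbf{v} \cdot \mathbf{u} \rangle = \langle \mathbf{v} \rangle \cdot \langle \mathbf{u} \rangle$. But $\langle \mathbf{v} \rangle$ equals
zero, so that the second addend in Eq. (11.5) vanishes, and it acquires the form
\[ \langle (\mathbf{v} + \mathbf{u})^2 \rangle = \langle \mathbf{v}^2 \rangle + \langle \mathbf{u}^2 \rangle. \]
Hence, it follows that the ordered motion increases the kinetic energy of the
electrons on an average by
\[ \langle \Delta\varepsilon_{\mathrm{k}} \rangle = \frac{m\langle u^2 \rangle}{2}. \tag{11.6} \]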
Ohm’s Law. Drude considered that when an electron collides with an ion of
the crystal lattice, the additional energy (11.6) acquired by the electron is transmitted
to the ion and, consequently, the velocity $u$ as a result of the collision vanishes. Let
us assume that the field accelerating the electrons is homogeneous. Hence, under
the action of the field, the electron receives a constant acceleration equal to $eE/m$,
and toward the end of its path the velocity of ordered motion will reach, on an
average, the value
\[ u_{\max} = \frac{eE}{m}\,\tau, \tag{11.7} \]
where $\tau$ is the average time elapsing between two consecutive collisions of the
electron with ions of the lattice.
Drude did not take into consideration the distribution of the electrons by
velocities and ascribed the same value of the velocity $v$ to all the electrons. In this
approximation
\[ \tau = \frac{l}{v} \]
(we remind our reader that $|\mathbf{v} + \mathbf{u}|$ virtually equals $|\mathbf{v}|$). Using this value of $\tau$ in
Eq. (11.7), we get
\[ u_{\max} = \frac{eEl}{mv}. \tag{11.8} \]
The velocity $u$ changes linearly during the time it takes to cover the path $l$. Therefore,
its average value over the path equals half the maximum value:
\[ \langle u \rangle = \frac{1}{2}\,u_{\max} = \frac{eEl}{2mv}. \]
Introducing this equation into Eq. (11.4), we get
\[ j = \frac{ne^2 l}{2mv}\,E. \]
The current density is found to be proportional to the field strength. We have,
thus, arrived at Ohm’s law. According to Eq. (5.22), the constant of proportionality
between $j$ and $E$ is the conductivity
\[ \sigma = \frac{ne^2 l}{2mv}. \tag{11.9} \]
If the electrons did not collide with the ions of the lattice, their free path and,
consequently, the conductivity of the metal would be infinitely great. Thus, according
to the classical notions, the electrical resistance of metals is due to the collisions of their
free electrons with the ions at the crystal lattice points of the metal.
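Equation (11.9) can be turned around to estimate the free path that the classical theory needs in order to reproduce a measured conductivity. In the sketch below the conductivity of copper, about 6 × 10⁷ S m⁻¹, is an assumed handbook-order value that does not appear in the text; 𝑛 and 𝑣 are the orders of magnitude quoted earlier in the chapter.

```python
E_CHARGE = 1.6e-19        # elementary charge, C
ELECTRON_MASS = 0.91e-30  # kg

def drude_conductivity(n, free_path, v):
    """Eq. (11.9): sigma = n e^2 l / (2 m v)."""
    return n * E_CHARGE**2 * free_path / (2.0 * ELECTRON_MASS * v)

def free_path_from_conductivity(sigma, n, v):
    """Eq. (11.9) solved for the free path l."""
    return 2.0 * ELECTRON_MASS * v * sigma / (n * E_CHARGE**2)

SIGMA_COPPER = 6e7  # S/m, assumed value for copper at room temperature
N = 1e29            # m^-3
V_THERMAL = 1e5     # m/s

l = free_path_from_conductivity(SIGMA_COPPER, N, V_THERMAL)
print(f"free path l = {l:.1e} m")
```

Under these assumptions the free path is a few nanometres, i.e. roughly ten interatomic distances.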
The Joule-Lenz Law. By the end of its free path, an electron acquires additional
kinetic energy whose average value is
\[ \langle \Delta\varepsilon_{\mathrm{k}} \rangle = \frac{m u_{\max}^2}{2} = \frac{e^2 l^2}{2mv^2}\,E^2 \tag{11.10} \]
[see Eqs. (11.6) and (11.8)]. Upon colliding with an ion, the electron, according to the
assumption, completely transfers the additional energy it has acquired to the crystal
lattice. The energy given up to the lattice goes to increase the internal energy of the
metal, which manifests itself in its becoming heated.
Every electron experiences on an average $1/\tau = v/l$ collisions a second,
communicating each time the energy expressed by Eq. (11.10) to the lattice. Hence, the
following amount of heat should be liberated in unit volume per unit time:
\[ Q_{\mathrm{u}} = n\,\frac{1}{\tau}\,\langle \Delta\varepsilon_{\mathrm{k}} \rangle = \frac{ne^2 l}{2mv}\,E^2 \]
($n$ is the number of conduction electrons per unit volume).
The quantity $Q_{\mathrm{u}}$ is the unit thermal power of a current (see Sec. 5.8). The factor
of $E^2$ coincides with the value given by Eq. (11.9) for $\sigma$. Passing over in the expression
$\sigma E^2$ from $\sigma$ and $E$ to $\rho$ and $j$, we arrive at the formula $Q_{\mathrm{u}} = \rho j^2$ expressing the
Joule-Lenz law [see Eq. (5.39)].
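The equality of the heat released in collisions and the Joule heat can be verified numerically. This is only a consistency check of Eqs. (11.8) to (11.10); all input values are hypothetical.

```python
import math

E_CHARGE = 1.6e-19        # elementary charge, C
ELECTRON_MASS = 0.91e-30  # kg

def conductivity(n, free_path, v):
    """Eq. (11.9): sigma = n e^2 l / (2 m v)."""
    return n * E_CHARGE**2 * free_path / (2.0 * ELECTRON_MASS * v)

def heat_from_collisions(n, free_path, v, field):
    """Heat per unit volume per unit time: n * (1/tau) * <delta eps_k>, tau = l/v."""
    u_max = E_CHARGE * field * free_path / (ELECTRON_MASS * v)  # Eq. (11.8)
    delta_eps = ELECTRON_MASS * u_max**2 / 2.0                  # Eq. (11.10)
    return n * (v / free_path) * delta_eps

# Hypothetical inputs of a realistic order of magnitude.
N, FREE_PATH, V_THERMAL, FIELD = 1e29, 4e-9, 1e5, 0.1

sigma = conductivity(N, FREE_PATH, V_THERMAL)
j = sigma * FIELD                       # Ohm's law
q_collisions = heat_from_collisions(N, FREE_PATH, V_THERMAL, FIELD)
q_joule = j**2 / sigma                  # rho * j^2 with rho = 1/sigma
print(f"from collisions: {q_collisions:.3e} W/m^3")
print(f"rho * j^2:       {q_joule:.3e} W/m^3")
```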
The Wiedemann-Franz Law. It is known from experiments that in addition
to their high electrical conductivity, metals are distinguished by a high thermal
conductivity. The German physicists G. Wiedemann and R. Franz discovered an
empirical law according to which the ratio of the thermal conductivity 𝜘 to the
electrical conductivity 𝜎 is about the same for all metals and changes in proportion
to the absolute temperature. For example, for aluminium at room temperature, this
ratio is 5.8 × 10⁻⁶ J Ω s⁻¹ K⁻¹, for copper it is 6.4 × 10⁻⁶ J Ω s⁻¹ K⁻¹, and for lead
it is 7.0 × 10⁻⁶ J Ω s⁻¹ K⁻¹.
Non-metallic crystals are also capable of conducting heat. The thermal
conductivity of metals, however, considerably exceeds that of dielectrics. It thus follows
that the free electrons rather than the crystal lattice are responsible for the transfer
of heat in metals. Considering these electrons as a monatomic gas, we can adopt an
expression from the kinetic theory of gases for the thermal conductivity:
\[ \varkappa = \frac{1}{3}\,nmvlc_V \tag{11.11} \]
[see Eq. (16.26) of Vol. I; the density $\rho$ has been replaced with the product $nm$, and $\langle v \rangle$
with $v$]. The specific heat capacity of a monatomic gas is $c_V = 3R/(2M) = 3k/(2m)$.
Using this value in Eq. (11.11), we obtain
\[ \varkappa = \frac{1}{2}\,nkvl. \]
Dividing $\varkappa$ by Eq. (11.9) for $\sigma$ and then substituting $(3/2)kT$ for $mv^2/2$, we arrive
at the expression
\[ \frac{\varkappa}{\sigma} = \frac{kmv^2}{e^2} = 3\left(\frac{k}{e}\right)^2 T \tag{11.12} \]
that expresses the Wiedemann-Franz law.
Introduction of the numerical values of $k$ and $e$ into Eq. (11.12) yields
\[ \frac{\varkappa}{\sigma} = 2.23 \times 10^{-8}\,T. \]
When 𝑇 = 300 K, we get the value 6.7 × 10⁻⁶ J Ω s⁻¹ K⁻¹ for 𝜘/𝜎, which agrees
quite well with experimental data (see the values of 𝜘/𝜎 given above for aluminium,
copper, and lead). It was later established, however, that such a good coincidence is
accidental, because when H. Lorentz performed the calculations more accurately,
taking into account the distribution of the electrons by velocities, the value of
2(𝑘/𝑒)²𝑇 was obtained for the ratio 𝜘/𝜎, and it does not agree so well with the data
of experiments.

Fig. 11.2 Fig. 11.3
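The two theoretical values of the ratio 𝜘/𝜎 can be compared with the experimental figures by a few lines of arithmetic. The sketch uses the rounded constants of the text; the function names are illustrative.

```python
BOLTZMANN = 1.38e-23  # J/K
E_CHARGE = 1.6e-19    # elementary charge, C

def drude_ratio(temperature_k):
    """Eq. (11.12): kappa / sigma = 3 (k/e)^2 T."""
    return 3.0 * (BOLTZMANN / E_CHARGE) ** 2 * temperature_k

def lorentz_ratio(temperature_k):
    """Lorentz's refined result: kappa / sigma = 2 (k/e)^2 T."""
    return 2.0 * (BOLTZMANN / E_CHARGE) ** 2 * temperature_k

T = 300.0
print(f"Drude:   {drude_ratio(T):.2e} J ohm s^-1 K^-1")
print(f"Lorentz: {lorentz_ratio(T):.2e} J ohm s^-1 K^-1")
# Experimental values quoted in the text for room temperature.
for metal, value in (("aluminium", 5.8e-6), ("copper", 6.4e-6), ("lead", 7.0e-6)):
    print(f"{metal:10s} {value:.1e}")
```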
Thus, the classical theory was able to explain Ohm’s and the Joule-Lenz laws,
and also gave a qualitative explanation of the Wiedemann-Franz law. At the same
time, this theory encountered quite appreciable difficulties. They include two basic
ones. It can be seen from Eq. (11.9) that the resistance of metals (i.e., the quantity
that is the reciprocal of 𝜎 ) must increase as the square root of 𝑇. Indeed, we have
no grounds to assume that the quantities 𝑛 and 𝑙 depend on the temperature. The
velocity of thermal motion, on the other hand, is proportional to the square root of
𝑇. This theoretical conclusion contradicts experimental data according to which
the electrical resistance of metals grows in proportion to the first power of 𝑇, i.e.,
more rapidly than $T^{1/2}$ [see expression (5.24)].
The second difficulty of the classical theory is that an electron gas must have a
molar heat capacity equal to (3/2)𝑅. Adding this quantity to the heat capacity of the
lattice, which is 3𝑅 [see Eq. (13.1) of Vol. I], we get the value of (9/2)𝑅 for the molar
heat capacity of a metal. Thus, in accordance with the classical electron theory, the
molar heat capacity of metals ought to be 1.5 times higher than that of dielectrics.
Actually, however, the heat capacity of metals does not differ appreciably from that
of non-metallic crystals. Only the quantum theory of metals was able to explain
this discrepancy.

11.3. The Hall Effect

If a metal plate through which a steady electric current is flowing is placed in a
magnetic field perpendicular to it, then a potential difference $U_{\mathrm{H}} = \varphi_1 - \varphi_2$
(Fig. 11.2) is set up between the plate faces parallel to the directions of the current
and field. This phenomenon was discovered by the American physicist E. Hall in
1879 and is called the Hall effect or the galvanomagnetic effect.
The Hall potential difference is determined by the expression
\[ U_{\mathrm{H}} = R_{\mathrm{H}}\,bjB. \tag{11.13} \]
Here, $b$ is the width of the plate, $j$ the current density, $B$ the magnetic induction of
the field, and $R_{\mathrm{H}}$ is a constant of proportionality known as the Hall coefficient.
The Hall effect is easily explained by the electron theory. In the absence of a
magnetic field, the current in the plate is due to the electric field $\mathbf{E}_0$ (Fig. 11.3). The
equipotential surfaces of this field form a system of planes perpendicular to the
vector $\mathbf{E}_0$. Two of them are shown in the figure by solid straight lines. The potential
at all the points of each surface and, consequently, at points 1 and 2 too is the same.
The current carriers, electrons, have a negative charge, therefore, the velocity of
their ordered motion $\mathbf{u}$ is directed oppositely to the current density vector $\mathbf{j}$.
When the magnetic field is switched on, each carrier experiences the magnetic
force $\mathbf{F}$ directed along side $b$ of the plate and having a magnitude of
\[ F = euB. \tag{11.14} \]
As a result, the electrons acquire a velocity component directed toward the upper
(in the figure) face of the plate. A surplus of negative charges is formed at this face
and, accordingly, a surplus of positive charges at the lower face. Consequently,
an additional transverse electric field $\mathbf{E}_B$ is produced. When the strength of this
field reaches a value such that its action on the charges balances the force given
by Eq. (11.14), a stationary distribution of the charges in a transverse direction will
set in. The corresponding value of $E_B$ is determined by the condition $eE_B = euB$.
Hence,
\[ E_B = uB. \]
The field $\mathbf{E}_B$ adds to the field $\mathbf{E}_0$ to form the resultant field $\mathbf{E}$. The equipotential
surfaces are perpendicular to the field strength vector. Consequently, they will turn
and occupy the position shown by the dashed line in Fig. 11.3. Points 1 and 2 which
were formerly on the same equipotential surface now have different potentials. To
find the voltage appearing between these points, the distance $b$ between them must
be multiplied by the strength $E_B$:
\[ U_{\mathrm{H}} = bE_B = buB. \]
Let us express $u$ through $j$, $n$, and $e$ in accordance with the equation $j = neu$. The
result is
\[ U_{\mathrm{H}} = \frac{1}{ne}\,bjB. \tag{11.15} \]
Equations (11.13) and (11.15) coincide if we assume that
\[ R_{\mathrm{H}} = \frac{1}{ne}. \tag{11.16} \]
Inspection of Eq. (11.16) shows that by measuring the Hall coefficient, we can
find the concentration of the current carriers in a given metal (i.e., the number of
carriers per unit volume).
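A short sketch shows how Eqs. (11.13) and (11.16) are used in practice. The concentration and current density are the orders of magnitude quoted earlier in the chapter; the plate width and the induction are hypothetical, as are the function names.

```python
E_CHARGE = 1.6e-19  # elementary charge, C

def hall_coefficient(concentration):
    """Eq. (11.16): R_H = 1 / (n e)."""
    return 1.0 / (concentration * E_CHARGE)

def hall_voltage(r_h, width_m, current_density, induction_t):
    """Eq. (11.13): U_H = R_H * b * j * B."""
    return r_h * width_m * current_density * induction_t

def concentration_from_hall(r_h):
    """Eq. (11.16) solved for n."""
    return 1.0 / (r_h * E_CHARGE)

N = 1e29   # m^-3
B_W = 0.01 # plate width b, m (hypothetical)
J = 1e7    # A/m^2
B = 1.0    # T (hypothetical)

r_h = hall_coefficient(N)
print(f"R_H = {r_h:.2e} m^3/C")
print(f"U_H = {hall_voltage(r_h, B_W, J, B):.2e} V")
print(f"n recovered from R_H = {concentration_from_hall(r_h):.1e} m^-3")
```

Even at this very high current density the Hall voltage in a good metal is only a few microvolts, which is why the effect is easier to observe in semiconductors, where 𝑛 is much smaller.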
Fig. 11.4

An important characteristic of a substance is the mobility of the current carriers
in it. By the mobility of the current carriers is meant the average velocity acquired
by the carriers at unit electric field strength. If the carriers acquire the velocity $u$ in
a field of strength $E$, then their mobility $u_0$ is
\[ u_0 = \frac{u}{E}. \tag{11.17} \]
The mobility can be related to the conductivity $\sigma$ and to the carrier concentration $n$.
For this purpose, let us divide the equation $j = neu$ by the field strength $E$. Taking
into account that $j/E = \sigma$ and $u/E = u_0$, we get
\[ \sigma = neu_0. \tag{11.18} \]
Having measured the Hall coefficient 𝑅H and the conductivity 𝜎 , we can use
Eqs. (11.16) and (11.18) to find the concentration and mobility of the current carriers
in the relevant specimen.
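Combining Eqs. (11.16) and (11.18) gives 𝑢₀ = 𝜎𝑅_H, so that both quantities follow from two measurements. In the sketch below the "measured" values are assumptions chosen to be of the order expected for copper; they are not taken from the text.

```python
E_CHARGE = 1.6e-19  # elementary charge, C

def carrier_concentration(r_h):
    """Eq. (11.16): n = 1 / (R_H e)."""
    return 1.0 / (r_h * E_CHARGE)

def carrier_mobility(sigma, r_h):
    """Eqs. (11.16) and (11.18) combined: u0 = sigma / (n e) = sigma * R_H."""
    return sigma * r_h

R_H = 6.25e-11  # m^3/C, assumed measured Hall coefficient
SIGMA = 6e7     # S/m, assumed measured conductivity

n = carrier_concentration(R_H)
u0 = carrier_mobility(SIGMA, R_H)
print(f"n  = {n:.1e} m^-3")
print(f"u0 = {u0:.2e} m^2 V^-1 s^-1")
```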
The Hall effect is observed not only in metals, but also in semiconductors. The
sign of the effect can be used to see whether a semiconductor belongs to the n-
or p-type¹. Figure 11.4 compares the Hall effect for specimens with positive and
negative carriers. The direction of the magnetic force is reversed both when the
direction of motion of the charge changes and when its sign is reversed. Hence,
for given directions of the current and the field, the magnetic force exerted
on positive and negative carriers has the same direction. Therefore, with positive
carriers, the potential of the upper (in the figure) face is higher than that of the lower
one, and with negative carriers the potential is lower. We can thus establish the sign
of the current carriers after determining that of the Hall potential difference.
It is of interest to note that in some metals the sign of 𝑈H corresponds to positive
current carriers. This anomaly is explained by the quantum theory.

¹In n-type semiconductors, the current carriers are negative, and in p-type ones they are positive
(see Vol. III).

Chapter 12
ELECTRIC CURRENT IN GASES

12.1. Semi-Self-Sustained and Self-Sustained Conduction

The passage of an electric current through gases is called a gas discharge. Gases
in their normal state are insulators, and current carriers are absent in them. Only
when special conditions are created in gases can current carriers appear in them
(ions, electrons) and an electric discharge be produced.
Current carriers may appear in gases as a result of external action not associated
with the presence of an electric field. In this case, the gas is said to have semi-self-
sustained conduction. Semi-self-sustained discharge may be due to heating of
a gas (thermal ionization), the action of ultraviolet rays or X-rays, and also to the
action of radiation of radioactive substances.
If the current carriers appear as a result of processes due to an electric field
being produced in a gas, the conduction is called self-sustained. The nature of
a gas discharge depends on many factors: on the chemical nature of the gas and
electrodes, on the temperature and pressure of the gas, on the shape, dimensions,
and mutual arrangement of the electrodes, on the voltage applied to them, on
the density and power of the current, etc. This is why a gas discharge may have
very diverse forms. Some kinds of discharge are attended by a glow and sound
effects—hissing, rustling, or crackling.

12.2. Semi-Self-Sustained Gas Discharge

Assume that a gas between electrodes (Fig. 12.1) is continuously exposed to an
ionizing agent of constant intensity (for example, X-rays). The action of the
ionizer results in one or more electrons being detached from some of the gas molecules.
The latter, thus, become positively charged ions. At not very low pressures, the
detached electrons are usually captured by neutral molecules, which, thus, become
negatively charged ions. Let $\Delta n_{\mathrm{i}}$ stand for the number of pairs of ions appearing
under the action of the ionizer in unit volume per second.

Fig. 12.1

The process of ionization in a gas is attended by recombination of the ions, i.e.,
neutralization of unlike ions when they meet or the formation of a neutral molecule
by a positive ion and an electron.
The probability of two ions of opposite signs meeting each other is proportional
to the number of both positive and negative ions. Hence, the number of pairs of
ions 𝛥𝑛r recombining in unit volume per second is proportional to the square of
the number of pairs of ions 𝑛 per unit volume:
$$\Delta n_{\mathrm{r}} = r n^{2} \qquad (12.1)$$
(𝑟 is a constant of proportionality).
In a state of equilibrium, the number of appearing ions equals the number of
recombining ones, hence,
$$\Delta n_{\mathrm{i}} = r n^{2}. \qquad (12.2)$$
We, thus, get the following expression for the equilibrium concentration of ions
(the number of pairs of ions in unit volume):
$$n = \left( \frac{\Delta n_{\mathrm{i}}}{r} \right)^{1/2}. \qquad (12.3)$$
Several pairs of ions appear per second in 1 cm³ of atmospheric air under the action
of cosmic radiation and traces of radioactive substances in the Earth’s crust. The
constant 𝑟 for air is 1.6 × 10⁻⁶ cm³ s⁻¹. Introduction of these values into Eq. (12.3)
gives a value of about 10³ cm⁻³ for the equilibrium concentration of ions in the air.
This concentration is not adequate for the conduction to be noticeable. Pure dry
air is a very good insulator.
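
As a rough numerical check of Eq. (12.3), the following Python sketch evaluates the
equilibrium concentration for a few ionization rates. The function name and the sample
rates are assumptions made for illustration (the text only says "several pairs" per
second); only the recombination constant for air is taken from the text.

```python
import math

def equilibrium_ion_concentration(ionization_rate, recombination_coeff):
    """Eq. (12.3): n = sqrt(delta_n_i / r), in the units of the inputs."""
    return math.sqrt(ionization_rate / recombination_coeff)

R_AIR = 1.6e-6  # cm^3 s^-1, recombination constant for air quoted in the text

# Assumed sample rates of ion-pair production, in cm^-3 s^-1.
for rate in (2.0, 5.0, 10.0):
    n = equilibrium_ion_concentration(rate, R_AIR)
    print(f"rate = {rate:4.1f} cm^-3 s^-1  ->  n = {n:.2e} cm^-3")
```

For every rate in this range the result is of the order of 10³ cm⁻³, in line with the
estimate in the text.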
If we feed a voltage to electrodes, the ions will decrease in number not only
because of recombination, but also because of the ions being drawn off by the field
to the electrodes. Assume that 𝛥𝑛j pairs of ions are drawn off from unit volume
every second. If the charge of each ion is 𝑒′, then the neutralization of one pair of
ions on the electrodes is attended by the transfer of the charge 𝑒′ along the circuit.
Every second, 𝛥𝑛j 𝑆𝑙 pairs of ions reach the electrodes (here, 𝑆 is the area of the
electrodes, 𝑙 is the distance between them; the product 𝑆𝑙 equals the volume of the
space between the electrodes). Consequently, the current in the circuit is
$$I = e' \Delta n_{\mathrm{j}} S l,$$
whence
$$\Delta n_{\mathrm{j}} = \frac{I}{e' l S} = \frac{j}{e' l}, \qquad (12.4)$$
where 𝑗 is the current density.
When a current is present, the condition of equilibrium is as follows:
$$\Delta n_{\mathrm{i}} = \Delta n_{\mathrm{r}} + \Delta n_{\mathrm{j}}.$$
Substituting for 𝛥𝑛r and 𝛥𝑛j their values from Eqs. (12.1) and (12.4), we arrive at the
equation
$$\Delta n_{\mathrm{i}} = r n^{2} + \frac{j}{e' l}. \qquad (12.5)$$
The current density is determined by the expression
$$j = e' n \left( u_0^{+} + u_0^{-} \right) E, \qquad (12.6)$$
where 𝑢₀⁺ and 𝑢₀⁻ are the mobilities of the positive and negative ions, respectively
[see Eq. (11.17)].
Let us consider two extreme cases—weak and strong fields.
With weak fields, the current density will be very small, and the term 𝑗/(𝑒′𝑙)
in Eq. (12.5) may be disregarded in comparison with 𝑟𝑛² (this signifies that the
ions leave the space between the electrodes mainly as a result of recombination).
Equation (12.5) thus transforms into Eq. (12.2), and we get Eq. (12.3) for the equilibrium
concentration of the ions. Using this value of 𝑛 in Eq. (12.6), we get
$$j = e' \left( \frac{\Delta n_{\mathrm{i}}}{r} \right)^{1/2} \left( u_0^{+} + u_0^{-} \right) E. \qquad (12.7)$$
The multiplier of 𝐸 in Eq. (12.7) does not depend on the field strength. Hence, with
weak fields, a semi-self-sustained gas discharge obeys Ohm’s law.

Fig. 12.2

The mobility of ions in gases has a value of the order of 10⁻⁴ m² V⁻¹ s⁻¹ [or
1 cm² V⁻¹ s⁻¹]. Hence, at the equilibrium concentration 𝑛 = 10³ cm⁻³ = 10⁹ m⁻³ and the
field strength 𝐸 = 1 V m⁻¹, the current density will be
$$j = 1.6 \times 10^{-19} \times 10^{9} \left( 10^{-4} + 10^{-4} \right) \times 1 \sim 10^{-14}\ \mathrm{A\,m^{-2}} = 10^{-18}\ \mathrm{A\,cm^{-2}}$$
[see Eq. (12.6); the ions are assumed to be singly charged].

With strong fields, we may disregard the term 𝑟𝑛² in Eq. (12.5) in comparison
with 𝑗/(𝑒′𝑙). This signifies that virtually all the appearing ions will reach the electrodes
without having time to recombine. In these conditions, Eq. (12.5) becomes
$$\Delta n_{\mathrm{i}} = \frac{j}{e' l},$$
whence
$$j = e' \Delta n_{\mathrm{i}} l. \qquad (12.8)$$
This current density is produced by all the ions originated by the ionizer in a column
of the gas with unit cross-sectional area between the electrodes. Consequently, this
current density is the greatest at the given intensity of the ionizer and the given
distance 𝑙 between the electrodes. It is called the saturation current density 𝑗sat .
Let us calculate 𝑗sat for the following conditions: 𝛥𝑛i = 10 cm⁻³ s⁻¹ = 10⁷ m⁻³ s⁻¹ (this
is approximately the rate of ion formation in the atmospheric air in ordinary conditions),
𝑙 = 0.1 m. The introduction of these data into Eq. (12.8) yields
$$j_{\mathrm{sat}} = 1.6 \times 10^{-19} \times 10^{7} \times 10^{-1} \sim 10^{-13}\ \mathrm{A\,m^{-2}} = 10^{-17}\ \mathrm{A\,cm^{-2}}.$$
These calculations show that the conduction of air in ordinary conditions is negligibly
small.
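
The two limiting cases can be tied together numerically. Eliminating 𝑗 between Eqs. (12.5)
and (12.6) gives a quadratic equation for the concentration 𝑛, and the Python sketch
below solves it for any field strength. It is only an illustration under stated
assumptions: the function name is hypothetical, the ions are taken to be singly charged,
the rate 10⁷ m⁻³ s⁻¹ is the factor used in the saturation arithmetic above, and the
recombination constant is the value quoted for air converted to SI units.

```python
import math

E_CHARGE = 1.6e-19  # C, charge of a singly charged ion (assumption: e' = e)

def current_density(E, ionization_rate, r, mobility_sum, gap):
    """Current density j (A/m^2) from Eqs. (12.5) and (12.6), SI units.

    Eliminating j gives r*n**2 + a*n - ionization_rate = 0 with
    a = mobility_sum * E / gap. The positive root is written in a form
    that avoids cancellation when a is large.
    """
    a = mobility_sum * E / gap
    n = 2.0 * ionization_rate / (a + math.sqrt(a * a + 4.0 * r * ionization_rate))
    return E_CHARGE * n * mobility_sum * E

RATE = 1e7          # m^-3 s^-1, ion pairs produced per unit volume per second
R_AIR = 1.6e-12     # m^3 s^-1 (1.6e-6 cm^3 s^-1 converted to SI)
MOBILITY_SUM = 2e-4 # m^2 V^-1 s^-1, sum of the two ion mobilities
GAP = 0.1           # m, distance between the electrodes

j_sat = E_CHARGE * RATE * GAP  # Eq. (12.8)
for E in (1e-3, 1e-1, 1e1, 1e3):
    j = current_density(E, RATE, R_AIR, MOBILITY_SUM, GAP)
    print(f"E = {E:8.1e} V/m   j = {j:.3e} A/m^2   j/j_sat = {j / j_sat:.3f}")
```

At small 𝐸 the computed 𝑗 grows in proportion to 𝐸, as in Eq. (12.7); at large 𝐸 it
levels off at the saturation value of Eq. (12.8).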
At intermediate values of 𝐸, there is a smooth transition from a linear dependence
of 𝑗 on 𝐸 to saturation; when the latter is reached, 𝑗 stops depending on 𝐸
(see the solid curve in Fig. 12.2). The region of saturation is followed by a region of
sharp growth in the current (see the dashed portion of the curve). The explanation of
this growth is that beginning from a certain value of 𝐸, the electrons¹ released by the
external ionizer manage to acquire considerable energy over their free path. This energy
is sufficient to ionize the molecules they collide with. The free electrons produced in
this ionization, after gaining speed, cause ionization in their turn. Thus, an
avalanche-like multiplication of the primary ions produced by the external ionizer occurs,
and the discharge current is amplified. The process does not lose its nature of a
semi-self-sustained discharge, however, because after the action of the external ionizer
stops, the discharge continues only until all the electrons (primary and secondary) reach
the anode (the rear boundary of the space containing the ionizing particles, i.e. the
electrons, moves toward the anode). For a discharge to become self-sustained, two
avalanches of ions moving toward each other are needed. This is possible only if
ionization by collision is capable of giving birth to carriers of both signs.

Fig. 12.3 (a, b; the outputs are labelled "To amplifier and counter")

It is very important that the semi-self-sustained discharge currents amplified as
a result of reproduction of the carriers are proportional to the number of primary
ions produced by the external ionizer. This property of a discharge is used in
proportional counters (see the following section).

12.3. Ionization Chambers and Counters

Ionization chambers and counters are employed for detecting and counting ele-
mentary particles, and also for measuring the intensity of X-rays and gamma rays.
The functioning of these instruments is based on the use of a semi-self-sustained
gas discharge.
The schematic diagram of an ionization chamber and a counter is the same
(Fig. 12.3). They differ only in their operating conditions and structural features.
A counter (Fig. 12.3b) consists of a cylindrical body along whose axis a thin wire
(anode) fastened on insulators is stretched. The body of the counter is the cathode.

¹Owing to the greater length of their free path, electrons acquire the ability to produce ionization
by a collision earlier than gas ions do.

Fig. 12.4 (regions labelled I to V)


A window of mica or aluminium foil is made in the end of the counter to admit
the ionizing particles. Some particles, and also X-rays and gamma rays penetrate
into a counter or an ionization chamber directly through their walls. An ionization
chamber (Fig. 12.3a) can have electrodes of various shapes. In particular, they may
be the same as in a counter, have the shape of plane parallel plates, etc.
Assume that a high-speed charged particle producing 𝑁0 pairs of primary ions
(electrons and positive ions) flies into the space between the electrodes. The ions
produced are carried along by the field toward the electrodes, and as a result a certain
charge 𝑞, which we shall call a current pulse, passes through resistor 𝑅. Figure 12.4
shows how the current pulse 𝑞 depends on the voltage 𝑈 between the electrodes for
two different numbers of primary ions 𝑁₀ differing by a factor of three (𝑁₀₂ = 3𝑁₀₁).
Six regions can be distinguished on the graph. Regions I and II were considered in the
preceding section. In particular, region II is the region of the saturation current—all
the ions produced by an ionizing particle reach the electrodes without having time
to recombine. It is quite natural that the current pulse does not depend on the
voltage in these conditions.
Beginning from the value 𝑈p , the field strength becomes sufficient for the
electrons to be able to ionize the molecules by a collision. Therefore, the number of
electrons and positive ions grows like an avalanche. As a result, 𝐴𝑁0 ions reach each
of the electrodes. The quantity 𝐴 is called the gas amplification factor. In region
III, this factor does not depend on the number of primary ions (but does depend on
the voltage). Therefore, if we keep the voltage constant, the current pulse will be
proportional to the number of primary ions. Region III is called the proportional
region, and the voltage 𝑈p the threshold of the proportional region. The gas
amplification factor changes in this region from 1 at its beginning to 10³-10⁴ at its
end (Fig. 12.4 is not drawn to scale along the 𝑞-axis; only the ratio of 1:3 between
the ordinates in regions II and III has been preserved).

Fig. 12.5

In region IV, called the region of partial proportionality, the gas amplification
factor 𝐴 depends to a greater and greater extent on 𝑁₀. As a result, the difference
between the current pulses produced by different numbers of primary ions is smoothed
out more and more.
At voltages corresponding to region V (it is known as the Geiger region, and
the voltage 𝑈g as the threshold of this region), the process acquires the nature
of a self-sustained discharge. The primary ions only produce an impetus for its
appearance. The current pulse in this region is absolutely independent of the
number of primary ions.
In region VI, the voltage is so high that a discharge, after once being set up, does
not stop. It is, therefore, called the region of continuous discharge.
Ionization Chambers. An ionization chamber is an instrument operating
without gas amplification, i.e., at voltages corresponding to region II. There are
two kinds of ionization chambers. Chambers of one kind are used for registering
the pulses initiated by individual particles (pulse chambers). A particle flying into
the chamber produces a certain number of ions in it, and as a result the current 𝐼
begins to flow through resistor 𝑅. The result is that the potential of point 1 (see
Fig. 12.3a) rises and becomes equal to 𝐼 𝑅 (the initial potential of this point was the
same as that of earthed point 2). This potential is fed to an amplifier, and after being
amplified operates a counting device. After all the charges that have reached the
inner electrode pass through resistor 𝑅, the current stops and the potential of point
1 again becomes equal to zero. The nature of operation of the chamber depends on
the duration of the current pulse set up by one ionizing particle.
To determine what the duration of a pulse depends on, let us consider a circuit
consisting of capacitor 𝐶 and resistor 𝑅 (Fig. 12.5). If we impart the opposite charges
+𝑞 and −𝑞 to the capacitor plates, a current will flow through resistor 𝑅, and the
charges on the plates will diminish. The instantaneous value of the voltage applied
across the resistor is 𝑈 = 𝑞/𝐶. Hence, we get the following expression for the current:
$$I = \frac{U}{R} = \frac{q}{RC}. \qquad (12.9)$$
Let us substitute −d𝑞/d𝑡 for the current, where −d𝑞 is the decrement of the charge
on the plates during the time d𝑡. As a result, we get the differential equation
$$-\frac{dq}{dt} = \frac{q}{RC} \quad \text{or} \quad \frac{dq}{q} = -\frac{1}{RC}\,dt.$$
According to Eq. (12.9), d𝑞/𝑞 = d𝐼/𝐼. We can, therefore, write
$$\frac{dI}{I} = -\frac{1}{RC}\,dt.$$
Integration of this equation yields
$$\ln I = -\frac{1}{RC}\,t + \ln I_0$$
(ln 𝐼₀ is the integration constant). Finally, exponentiating the expression obtained,
we arrive at the equation
$$I = I_0 \exp\left( -\frac{t}{RC} \right). \qquad (12.10)$$
It is easy to see that 𝐼0 is the initial value of the current.
It follows from Eq. (12.10) that during the time
𝜏 = 𝑅𝐶, (12.11)
the current diminishes to 1/𝑒 of its original value. Accordingly, the quantity 𝜏 is
called the time constant of a circuit. The greater this quantity, the slower is the
rate of diminishing of the current in a circuit.
The diagram of an ionization chamber (see Fig. 12.3a) is similar to that shown
in Fig. 12.5. The part of 𝐶 is played by the interelectrode capacitance shown by
a dash line on the diagram of the chamber. An increase in the resistance of 𝑅 is
attended by a growth in the voltage across points 1 and 2 at a given current, and this,
consequently, facilitates the registration of the pulses. This circumstance induces
designers to use the highest possible resistance of 𝑅. At the same time, for the
chamber to be able to register separately the current pulses set up by particles
rapidly following one another, the time constant must not be great. Therefore,
designers have to make a compromise when choosing the resistance of 𝑅 for pulse
chambers. It is usually taken to be of the order of 10⁸ Ω. Hence, at 𝐶 ∼ 10⁻¹¹ F, the
time constant is 10⁻³ s.
Another kind of ionization chamber is the so-called integrating chamber. The
resistance 𝑅 in such chambers is of the order of 10¹⁵ Ω. At 𝐶 ∼ 10⁻¹¹ F, the time
constant is 10⁴ s. In this case, the current pulses produced by separate ionizing
particles merge and a steady current flows through the resistor. Its magnitude
characterizes the total charge of the ions produced in the chamber in unit time. Thus,
the ionization chambers of these two kinds differ only in the value of the time constant 𝑅𝐶.
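
The difference between the two kinds of chamber can be made concrete with a short Python
sketch of Eqs. (12.10) and (12.11). The function names and the 1 ms observation time are
assumptions chosen for illustration; the resistances and the capacitance are the
order-of-magnitude values quoted in the text.

```python
import math

def time_constant(resistance, capacitance):
    """Eq. (12.11): tau = R * C, in seconds for ohms and farads."""
    return resistance * capacitance

def current_fraction(t, tau):
    """Eq. (12.10) divided by I0: fraction of the initial current left at time t."""
    return math.exp(-t / tau)

C = 1e-11                              # F, interelectrode capacitance
tau_pulse = time_constant(1e8, C)      # pulse chamber
tau_integrating = time_constant(1e15, C)  # integrating chamber

t = 1e-3  # s, assumed observation time
for name, tau in (("pulse", tau_pulse), ("integrating", tau_integrating)):
    print(f"{name:12s} tau = {tau:.1e} s, I/I0 after 1 ms = {current_fraction(t, tau):.6f}")
```

In the pulse chamber the current has already fallen to 1/𝑒 of its initial value after
1 ms, whereas in the integrating chamber it has hardly changed, so successive pulses merge.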
Proportional Counters. The pulses set up by separate particles can be amplified
quite considerably (up to 10³-10⁴ times) if the voltage between the electrodes
is in region III (see Fig. 12.4). An instrument operating in such conditions is called a
proportional counter. The anode of the counter is made in the form of a wire of
several hundredths of a millimetre in diameter. The field strength near the wire
is especially high. With a sufficiently great voltage between the electrodes, the
electrons produced near the wire acquire an energy under the action of the field
that is adequate for producing ionization of the molecules by a collision. The result
is reproduction of the ions. The dimensions of the space in which reproduction
occurs increase with the voltage. The gas amplification factor grows accordingly.
The number of primary ions depends on the nature and energy of the particles
producing the pulse. Therefore, the magnitude of the pulses at the output of a
proportional counter makes it possible to distinguish various particles, and also to
sort particles of the same nature by their energies.
Geiger-Müller Counters. A still greater amplification of the pulse (up to 10⁸)
can be attained by making a counter function in the Geiger region (region
V in Fig. 12.4). A counter operating in these conditions is called a Geiger-Müller
counter (or more briefly a Geiger counter). A discharge in the Geiger region, being
“launched” by an ionizing particle, subsequently transforms into a self-sustained
one. Hence, the magnitude of the pulse does not depend on the initial ionization.
To obtain separate pulses from individual particles, the discharge produced must
be rapidly interrupted (quenched). This is achieved either with the aid of an external
resistance 𝑅 (in non-self-quenching counters), or by means of processes occurring in
the counter itself. In the latter case, the counter is called self-quenching.
The quenching of a discharge with the aid of an external resistance is due to
the fact that when a discharge current flows in the resistance, a great voltage drop
is set up across it. Consequently, only part of the applied voltage falls across the
interelectrode space, and it is insufficient for maintaining the discharge.
Stopping of a discharge in self-quenching counters is due to the following
reasons. Electrons have a mobility that is about 1000 times greater than the mobility
of positive ions. Therefore, during the time it takes the electrons to reach the wire,
the positive ions virtually do not move from their places. These ions produce a
positive space charge that weakens the field near the wire, and the discharge stops.
Quenching of the discharge in this case is prevented by additional processes which
we shall not consider. To suppress them, an admixture of a polyatomic organic
gas (for example, alcohol vapour) is added to the gas filling the counter (usually
argon). Such a counter separates pulses from particles following one another with
an interval of the order of 10⁻⁴ s.

12.4. Processes Leading to the Appearance of Current Carriers in a Self-Sustained Discharge

Before commencing to describe the various kinds of self-sustained gas discharge,
we shall consider the basic processes leading to the production of current carriers
(electrons and ions) in such discharges.
Collisions of Electrons with Molecules. The collisions of electrons (and
also ions) with molecules can have an elastic or inelastic nature. The energy of a
molecule (like that of an atom) is quantized. This signifies that it can have only
discrete (i.e., separated by finite intervals) values called energy levels. The state
with the smallest energy is called the ground one. To transfer a molecule from its
ground state to various excited ones, definite values of the energy 𝑊₁, 𝑊₂, etc., are
needed. A molecule can be ionized by imparting to it a sufficiently great energy 𝑊i.
Upon transition to an excited state, a molecule usually stays in it only ∼ 10⁻⁸ s,
after which it passes back to its ground state, emitting its surplus energy in the form
of a quantum of light, a photon. Molecules can spend a considerably greater time
(about 10⁻³ s) in certain excited states called metastable.
The laws of energy and momentum conservation must be obeyed when particles
collide. Therefore, definite limitations are imposed on the transfer of energy in a
collision—not all the energy which a colliding particle has can be transferred to
another particle.
If in a collision, an energy sufficient for exciting a molecule cannot be imparted
to it, the total kinetic energy of the particles remains unchanged, and the collision
will be elastic. Let us find the energy imparted to the particle that is struck in an
elastic collision. Assume that a particle of mass 𝑚1 having the velocity 𝑣10 collides
with a stationary (𝑣20 = 0) particle of mass 𝑚2 . The following conditions must be
observed in a central collision:
$$\frac{m_1 v_{10}^2}{2} = \frac{m_1 v_1^2}{2} + \frac{m_2 v_2^2}{2},$$
$$m_1 v_{10} = m_1 v_1 + m_2 v_2,$$
where 𝑣₁ and 𝑣₂ are the velocities of the particles after the collision. The velocity of
the second particle from these equations will be
$$v_2 = v_{10} \frac{2 m_1}{m_1 + m_2}$$
(see Sec. 3.11 of Vol. I).
The energy transmitted to the second particle in an elastic collision is determined
by the expression
$$\Delta W_{\mathrm{el}} = \frac{m_2 v_2^2}{2} = \frac{m_1 v_{10}^2}{2} \, \frac{4 m_1 m_2}{(m_1 + m_2)^2}.$$
If 𝑚₁ ≪ 𝑚₂, this equation is simplified as follows:
$$\Delta W_{\mathrm{el}} = \frac{m_1 v_{10}^2}{2} \, \frac{4 m_1}{m_2} = W_{10} \frac{4 m_1}{m_2}, \qquad (12.12)$$
where 𝑊₁₀ is the initial energy of the incident particle.
It can be seen from Eq. (12.12) that a light particle (electron) in an elastic collision
with a heavy particle (molecule) gives up to it only a small fraction of its stock of
energy. The light particle “rebounds” from the heavy one like a ball from a wall, and
its velocity remains virtually unchanged in magnitude. The relevant calculations
show that in a non-central collision the fraction of the energy transferred is still
smaller.
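
To get a feeling for how small this fraction is, the Python sketch below compares the
exact expression for a central elastic collision with the approximation of Eq. (12.12).
The choice of argon as the struck atom and the rounded values of the physical constants
are assumptions made for illustration.

```python
def elastic_fraction(m1, m2):
    """Fraction of the incident energy handed over in a central elastic
    collision with a particle at rest: 4*m1*m2 / (m1 + m2)**2."""
    return 4.0 * m1 * m2 / (m1 + m2) ** 2

M_ELECTRON = 9.109e-31         # kg
ATOMIC_MASS_UNIT = 1.6605e-27  # kg
m_argon = 39.95 * ATOMIC_MASS_UNIT  # argon atom, chosen only as an example

exact = elastic_fraction(M_ELECTRON, m_argon)
approx = 4.0 * M_ELECTRON / m_argon  # Eq. (12.12), valid for m1 << m2

print(f"exact fraction    = {exact:.3e}")
print(f"approx (4 m1/m2)  = {approx:.3e}")
print(f"equal masses      = {elastic_fraction(1.0, 1.0):.1f}")
```

For an electron striking an argon atom the fraction is of the order of 10⁻⁵ to 10⁻⁴,
whereas for equal masses the whole energy can be transferred.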
With a sufficiently high energy of the incident particle (electron or ion), a
molecule may be excited or ionized. In this case, the total kinetic energy of the
particles is not conserved—part of the energy goes for excitation or ionization, i.e.,
for increasing the internal energy of the colliding particles or for splitting one of
the particles into two fragments.
Collisions attended by the excitation of particles are called inelastic collisions
of the first kind. A molecule in an excited state upon colliding with another
particle (electron, ion, or neutral molecule) can pass over to the ground state without
emitting its surplus energy, but transferring it to this particle. As a result, the total
kinetic energy of the particles after the collision will be greater than before it. Such
collisions are known as inelastic collisions of the second kind. Molecules pass
over from a metastable state to the ground one as a result of collisions of the second
kind.
In an inelastic collision of the first kind, the equations of energy and momentum
conservation have the form
$$\frac{m_1 v_{10}^2}{2} = \frac{m_1 v_1^2}{2} + \frac{m_2 v_2^2}{2} + \Delta W_{\mathrm{int}}, \qquad (12.13)$$
$$m_1 v_{10} = m_1 v_1 + m_2 v_2, \qquad (12.14)$$
where 𝛥𝑊int is the increment of the internal energy of a molecule corresponding
to its transition to an excited state. Eliminating 𝑣₁ from these equations, we get
$$\Delta W_{\mathrm{int}} = m_2 v_{10} v_2 - \frac{m_1 + m_2}{m_1} \, \frac{m_2 v_2^2}{2}. \qquad (12.15)$$
At a given velocity of the striking particle (𝑣₁₀), the increment of the internal
energy 𝛥𝑊int depends on the velocity 𝑣₂ with which the molecule travels after
the collision. Let us find the greatest possible value of 𝛥𝑊int. To do this, we shall
differentiate function (12.15) with respect to 𝑣₂ and equate the derivative to zero:
$$\frac{d(\Delta W_{\mathrm{int}})}{dv_2} = m_2 v_{10} - \frac{m_1 + m_2}{m_1} \, m_2 v_2 = 0.$$
Hence, 𝑣₂ = 𝑚₁𝑣₁₀/(𝑚₁ + 𝑚₂). Substitution of this value for 𝑣₂ in Eq. (12.15) yields
$$\Delta W_{\mathrm{int,max}} = \frac{m_2}{m_1 + m_2} \, \frac{m_1 v_{10}^2}{2}. \qquad (12.16)$$

Fig. 12.6 (probability of process versus energy of electron)

If the incident particle is considerably lighter than the struck one (𝑚₁ ≪ 𝑚₂),
the factor 𝑚₂/(𝑚₁ + 𝑚₂) in Eq. (12.16) is close to unity. Thus, when a light particle
(electron) strikes a heavy one (molecule), almost all the energy of the incident particle
can be used to excite or ionize the molecule².
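
The maximization leading to Eq. (12.16) can be checked numerically. The Python sketch
below evaluates Eq. (12.15) at the stationary velocity and at two neighbouring
velocities; the function names and the dimensionless sample masses are assumptions made
for illustration only.

```python
def delta_w_int(v2, m1, m2, v10):
    """Eq. (12.15): internal-energy increment as a function of v2."""
    return m2 * v10 * v2 - (m1 + m2) / m1 * m2 * v2 ** 2 / 2.0

def delta_w_int_max(m1, m2, v10):
    """Eq. (12.16): greatest possible internal-energy increment."""
    return m2 / (m1 + m2) * m1 * v10 ** 2 / 2.0

# Assumed sample values in arbitrary consistent units.
m1, m2, v10 = 1.0, 3.0, 2.0
v2_star = m1 * v10 / (m1 + m2)  # velocity at which the derivative vanishes

print("v2 at maximum:         ", v2_star)
print("Eq. (12.15) at maximum:", delta_w_int(v2_star, m1, m2, v10))
print("Eq. (12.16):           ", delta_w_int_max(m1, m2, v10))
print("fraction of W10:       ", delta_w_int_max(m1, m2, v10) / (m1 * v10 ** 2 / 2.0))
```

The last line prints the factor 𝑚₂/(𝑚₁ + 𝑚₂), which approaches unity as the struck
particle becomes much heavier than the incident one.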
Even if the energy of the incident particle (electron) is sufficiently great, however,
a collision does not necessarily result in the excitation or ionization of a molecule.
These processes have definite probabilities depending on the energy (and, therefore,
on the velocity) of the electron. Figure 12.6 shows the approximate behaviour of
these probabilities. The higher the velocity of the electron, the shorter is the duration
of its interaction with the molecule near which it flies. Hence, both probabilities
rapidly reach a maximum, and then diminish with an increase in the energy of
the electron. Inspection of the figure shows that an electron having, for example,
the energy 𝑊′ will cause ionization of a molecule with greater probability than its
excitation.
Photoionization. Electromagnetic radiation consists of elementary particles
called photons. The energy of a photon is ℏ𝜔, where ℏ is Planck’s constant divided
by 2𝜋 [see Eq. (7.43)], and 𝜔 is the cyclic frequency of the radiation. A photon can
be absorbed by a molecule, and its energy goes to excite or ionize the molecule.

²When ionization occurs, Eqs. (12.13) and (12.14) become more complicated because there
will be three particles instead of two after a collision. The conclusion on the possibility
of spending almost all of the electron’s energy for ionization is correct, however.

In this case, the ionization of the molecule is called photoionization. Ultraviolet
radiation is capable of producing direct photoionization. The energy of a photon
of visible light is insufficient to detach an electron from a molecule. Hence, visible
radiation is not capable of producing direct photoionization. It may be the cause,
however, of so-called stepped photoionization. This process is carried out in two
steps. In the first one, a photon transfers the molecule to an excited state. In the
second step, the excited molecule is ionized as a result of its colliding with another
molecule.
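
The contrast between visible and short-wave radiation can be illustrated with a few
numbers. The Python sketch below computes the photon energy ℏ𝜔 = ℎ𝑐/𝜆 for several
wavelengths and compares it with an ionization energy. The function names, the chosen
wavelengths and the use of argon (ionization energy about 15.76 eV, a reference value not
given in the text) are assumptions made for illustration.

```python
H_PLANCK = 6.62607e-34  # J s
C_LIGHT = 2.99792e8     # m/s
EV = 1.602177e-19       # J per electron-volt

def photon_energy_ev(wavelength_m):
    """Photon energy h*c/lambda (equal to hbar*omega), in electron-volts."""
    return H_PLANCK * C_LIGHT / wavelength_m / EV

def can_photoionize_directly(wavelength_m, ionization_energy_ev):
    """True if a single photon carries at least the ionization energy."""
    return photon_energy_ev(wavelength_m) >= ionization_energy_ev

ARGON_IONIZATION_EV = 15.76  # assumed reference value for argon

samples = (("red light", 700e-9), ("violet light", 400e-9), ("short-wave UV", 50e-9))
for name, wavelength in samples:
    energy = photon_energy_ev(wavelength)
    direct = can_photoionize_directly(wavelength, ARGON_IONIZATION_EV)
    print(f"{name:14s} {energy:6.2f} eV  direct photoionization: {direct}")
```

Photons of visible light carry only about 2 to 3 eV, far below the ionization energy, so
they can contribute only through the stepped process described above.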
Short-wave radiation capable of producing direct photoionization may appear in a gas
discharge. A sufficiently fast electron may not only ionize a molecule
when it collides with it, but also transfer the ion formed into an excited state. The
transition of an ion to the ground state is attended by the emission of radiation
having a higher frequency than that of a neutral molecule. The energy of a photon
of such radiation is sufficient for direct photoionization.
Emission of Electrons by the Surface of Electrodes. Electrons may be
supplied to a gas-discharge space as a result of their emission by the surface of
the electrodes. Such kinds of emission as thermionic (thermoelectron), secondary
electron, and autoelectronic emission play the main part in some kinds of discharge.
Thermionic emission is the name given to the emission of electrons by heated
solid or liquid bodies. Owing to the free electrons in a metal having a variety of
velocities in accordance with a distribution law, there is always a certain number of
them whose energy is sufficient for them to overcome the potential barrier and leave
the metal. The number of such electrons at room temperature is negligibly small.
With elevation of the temperature, however, the number of electrons capable of
leaving the metal grows very rapidly and becomes quite noticeable at a temperature
of the order of 1000 K.
By secondary electron emission is meant the emission of electrons by the
surface of a solid or a liquid body when it is bombarded with electrons or ions. The
ratio of the number of emitted (secondary) electrons to the number of particles
producing the emission is called the secondary electron emission coefficient. When
electrons are used to bombard the surface of a metal, the values of this coefficient
vary from 0.5 (for beryllium) to 1.8 (for platinum).
Autoelectronic (or cold) emission is the emission of electrons by the surface
of a metal occurring when an electric field of a very high strength (∼ 10⁸ V m⁻¹) is
set up near the surface. This phenomenon is also sometimes called field-induced
electron emission.

Fig. 12.7

12.5. Gas-Discharge Plasma

Some kinds of self-sustained discharge are characterized by a very high degree of
ionization. A highly ionized gas, provided that the total charge of the electrons and
ions in each elementary volume equals (or almost equals) zero, is called a plasma.
A plasma is a special state of a substance. The matter in the interior of the Sun
and other stars having a temperature of scores of millions of kelvins is in this state.
A plasma produced owing to the high temperature of a substance is called high-
temperature (or isothermal). A gas-discharge plasma, as its name implies, is one
produced in a gas discharge.
For a plasma to be in a stationary state, processes are needed that replenish
the stock of ions diminishing as a result of recombination. In high-temperature
plasma, this is achieved as a result of thermal ionization, in gas-discharge plasma,
as a result of collision ionization by electrons accelerated by an electric field. The
ionosphere (one of the layers of the atmosphere) is a special variety of plasma. The
high degree of ionization of the molecules (∼ 1%) is maintained in the ionosphere
by photoionization due to the Sun’s shortwave radiation.
The electrons in a gas-discharge plasma participate in two motions: chaotic motion
with a certain average velocity ⟨𝑣⟩ and ordered motion in a direction opposite to 𝑬
with the average velocity ⟨𝑢⟩ much smaller than ⟨𝑣⟩.
We shall prove that an electric field not only leads to ordered motion of the
electrons of a plasma, but also increases the velocity ⟨𝑣⟩ of their chaotic motion.
Assume that at the moment when the field is switched on the gas contains a certain
number of electrons whose average velocity corresponds to the gas temperature
𝑇g (𝑚⟨𝑣⟩²/2 = 3𝑘𝑇g/2). In the interval between two successive collisions with
molecules, an electron covers on an average the path 𝑙 (Fig. 12.7; the trajectory of
the electron is curved slightly under the action of the force −𝑒𝑬). The work done
by the field on the electron is
$$A = e E l_F, \qquad (12.17)$$
where 𝑙_F is the projection of the electron’s path onto the direction of the force
exerted on it. Owing to collisions with molecules, the direction of motion of
the electron constantly changes chaotically. The magnitude and sign of 𝑙_F change
accordingly. This is why the work given by Eq. (12.17) for separate portions of
the path varies in magnitude and changes in its sign. On some sections, the field
increases the energy of the electron, on others it diminishes it. If ordered motion of
the electrons were absent, the average value of 𝑙_F and, consequently, the work given
by Eq. (12.17) would be zero. The presence of ordered motion, however, leads to the
average value of the work 𝐴 differing from zero; it is positive and equals
$$\langle A \rangle = e E \langle u \rangle \tau = e E \langle u \rangle \frac{l}{\langle v \rangle}, \qquad (12.18)$$
where 𝜏 is the average time needed by the electrons to cover their free path
(⟨𝑢⟩ ≪ ⟨𝑣⟩).
Thus, a field on an average increases the energy of the electrons. True, an
electron upon colliding with a molecule gives up part of its energy to it. But, as we
have seen in the preceding section, the fraction 𝛿 of the energy transferred in an
elastic collision is very small—it averages³ 𝛿 = 2(𝑚/𝑀) (here 𝑚 is the mass of an
electron, and 𝑀 that of a molecule).
In a rarefied gas (in which 𝑙 is greater) and with a sufficiently great field strength
𝐸, the work ⟨𝐴⟩ [Eq. (12.18)] may exceed the energy ⟨𝛿⟩𝑚⟨𝑣²⟩/2 transferred on an
average to a molecule in each collision. The result will be a growth in the energy of
chaotic motion of the electrons. It ultimately reaches values sufficient to excite or
ionize a molecule. Beginning from this moment, part of the collisions stop being
elastic and are attended by a large loss of energy. Therefore, the average fraction
⟨𝛿⟩ of energy transferred increases.
Thus, the electrons acquire the energy needed for ionization not during one
interval between collisions, but gradually in the course of a number of them. Ion-
ization leads to the appearance of a large number of electrons and positive ions—a
plasma is produced.
The energy of the electrons of a plasma is determined by the condition that
the average value of the work done by the field on an electron during one interval
between collisions equals the average value of the energy given up by the electron
upon colliding with a molecule:
$$e E \langle u \rangle \frac{l}{\langle v \rangle} = \langle \delta \rangle \frac{m \langle v^2 \rangle}{2}.$$
Here, ⟨𝛿⟩ is an intricate function of ⟨𝑣⟩.

³According to Eq. (12.12), in a central collision 𝛿 = 4(𝑚/𝑀). When the electron and the molecule
only slightly touch each other, we have 𝛿 ≈ 0.

Experiments show that the Maxwell distribution by velocities holds for the
electrons in a gas-discharge plasma. Owing to the weak interaction of the electrons
with the molecules (in an elastic collision 𝛿 is very small, while the relative number
of inelastic collisions is negligible), the average velocity of chaotic motion of the
electrons is many times greater than the velocity corresponding to the temperature
𝑇g of the gas. If we introduce the temperature of the electrons 𝑇e, determining it
from the equation 𝑚⟨𝑣²⟩/2 = 3𝑘𝑇e/2, then we get a value of the order of several tens
of thousands of kelvins for 𝑇e. The failure of the temperatures 𝑇g and 𝑇e to coincide
indicates that there is no thermodynamic equilibrium between the electrons and
molecules in a gas-discharge plasma⁴. The concentration of the current carriers in
a plasma is very high. Therefore, a plasma is an excellent conductor. The mobility
of the electrons is about three orders of magnitude greater than that of the ions.
Hence, the current in a plasma is mainly set up by its electrons.
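
To give a feel for the electron temperature defined through 𝑚⟨𝑣²⟩/2 = 3𝑘𝑇e/2, the following minimal sketch converts the mean energy 3𝑘𝑇/2 into electronvolts. The function names and the sample temperatures (300 K for the gas, 30 000 K for the electrons) are assumptions chosen for illustration; the text only states the order of magnitude of 𝑇e.

```python
K_B = 1.380649e-23        # Boltzmann constant, J/K
EV = 1.602176634e-19      # one electronvolt in joules


def mean_energy_ev(temperature_k):
    """Mean kinetic energy 3kT/2 of a particle, expressed in electronvolts."""
    return 1.5 * K_B * temperature_k / EV


def temperature_from_energy_ev(energy_ev):
    """Temperature T for which 3kT/2 equals the given mean energy (in eV)."""
    return energy_ev * EV / (1.5 * K_B)


gas_ev = mean_energy_ev(300.0)          # assumed room-temperature gas
electron_ev = mean_energy_ev(30000.0)   # assumed electron temperature
print(f"gas: {gas_ev:.3f} eV, electrons: {electron_ev:.2f} eV")
```

With these assumed temperatures the mean electron energy comes out at a few electronvolts, a hundred times that of the gas molecules.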

12.6. Glow Discharge

A glow discharge appears at low pressures. It can be observed in a glass tube about
0.5 m long with flat metal electrodes soldered into its ends (Fig. 12.8). A voltage of
∼ 1000 V is supplied to the electrodes. There is virtually no current in the tube at
atmospheric pressure. If the pressure is lowered, then approximately at 50 mmHg a
discharge appears in the form of a glowing sinuous thin cord connecting the anode
and the cathode. Lowering of the pressure is attended by thickening of the cord,
and at about 5 mmHg the cord fills the entire cross section of the tube—a glow
discharge sets in. Its principal parts are shown in Fig. 12.8. Near the cathode is a
thin luminous layer called the cathode luminous film. Between the cathode and
the luminous film is the Aston dark space. At the other side of the luminous film
is a weakly luminous layer which by contrast appears to be dark and is accordingly
known as the cathode (or Crookes) dark space. This layer bounds on a luminous
region called the negative glow. All the above layers form the cathode part of the
glow discharge.
The negative glow is followed by the Faraday dark space. The boundary
between them is blurred. The remaining part of the tube is filled with a luminous
gas; it is called the positive column. At a lower pressure, the cathode part of the
discharge and the Faraday dark space become wider, while the positive column
becomes shorter. At a pressure of the order of 1 mmHg, the positive column breaks
up into a number of alternating dark and light bent layers called strata.

⁴The average energy of the molecules, electrons, and ions in a high-temperature plasma is the
same. This explains its other name, isothermal plasma.

[Fig. 12.8: parts of a glow discharge between the cathode and the anode. Dark spaces: Aston, cathode, Faraday. Luminous regions: cathode film, negative glow, positive glow. The graph shows the potential along the tube, with the cathode potential drop marked.]

Measurements made with the aid of probes (thin wires soldered in at different
points along the tube) and by other means have shown that the potential changes
non-uniformly along a tube (see the graph in Fig. 12.8). Virtually the entire potential
drop falls to the share of the first three parts of the discharge up to the cathode dark
space inclusively. This portion of the voltage applied to a tube is called the cathode
potential drop. The potential remains unchanged in the region of the negative
glow—here the field strength is zero. Finally, the potential gradually grows in the
Faraday dark space and in the positive column. Such a distribution of the potential
is due to the formation in the cathode dark space of a positive space charge because
of the increased concentration of the positive ions.
The main processes needed to maintain a glow discharge occur in its cathode
part. The other parts of the discharge are not significant; they may even be absent
(with a small spacing of the electrodes or at a low pressure). There are two
main processes—secondary electron emission from the cathode produced by its
bombardment with positive ions, and collision ionization of the gas molecules by
electrons.
The positive ions accelerated by the cathode potential drop bombard the cathode
and knock electrons out of it. These electrons are accelerated by the electric field
in the Aston dark space. Acquiring sufficient energy, they begin to excite the gas
molecules, owing to which the cathode luminous film appears. The electrons that fly
without any collisions into the region of the cathode dark space have a high energy,
and as a result they ionize the molecules more frequently than they excite them
(see the graphs in Fig. 12.6). Thus, the intensity of glowing of the gas diminishes, but
in return many electrons and positive ions appear. The ions produced first have
a very low velocity. As a result, a positive space charge is formed in the cathode
dark space. This leads to redistribution of the potential along the tube and to the
appearance of the cathode potential drop.
The electrons appearing in the cathode dark space penetrate into the negative
glow region that is characterized by a high concentration of electrons and positive
ions and by a total space charge close to zero (a plasma). Therefore, the field strength
here is very low. Owing to the high concentration of electrons and ions, an intensive
recombination process goes on in the negative glow region. It is attended by the
emission of the energy liberated during this process. Thus, the negative glow is
mainly a glow of recombination.
The electrons and ions penetrate from the negative glow region into the Faraday
dark space because of diffusion (there is no field on the boundary between these
regions, but in return there is a high gradient of electron and ion concentration).
The lower concentration of the charged particles greatly diminishes the probability
of recombination in the Faraday dark space. This is why the latter space seems to
be dark.
A field is already present in the Faraday dark space. The electrons carried
away by this field gradually accumulate energy so that the conditions needed for
the existence of a plasma finally appear. The positive column is a gas-discharge
plasma. It plays the part of a conductor joining the anode to the cathode parts of the
discharge. The glow of the positive column is mainly due to transitions of excited
molecules to their ground state. Molecules of different gases emit radiation of
different wavelengths in such transitions. Therefore, the glow of the positive column
has a characteristic colour for each gas. This circumstance is taken advantage of in
glow tubes for manufacturing luminous inscriptions and advertisements. These
inscriptions are the positive column of a glow discharge. Neon gas-discharge tubes
produce a red glow, argon ones a bluish-green glow, etc.
If the electrode spacing is gradually diminished, the cathode part of the discharge
remains unchanged whereas the length of the positive column diminishes until this
column disappears completely. Next, the Faraday dark space disappears, and the
length of the negative glow begins to decrease, the position of the boundary of this
glow with the cathode dark space remaining unchanged. When the distance from
the anode to this boundary becomes very small, the discharge stops.
If the pressure is gradually lowered, the cathode part of the discharge extends
over a greater and greater part of the interelectrode space, and finally the cathode
dark space extends over almost the entire tube. The glow of the gas in this case
stops being noticeable but in return the tube walls begin to glow with a greenish
colour. The majority of the electrons knocked out of the cathode and accelerated
by the cathode potential drop reach the tube walls without colliding with molecules
of the gas and cause the walls to glow upon striking them. For historical reasons,
the stream of electrons emitted by the cathode of a gas-discharge tube at very low
pressures was called cathode rays. The glow produced by bombardment with fast
electrons is called cathodoluminescence.
If a narrow canal is made in the cathode of a gas-discharge tube, part of the
positive ions penetrate into the space beyond the cathode and form a sharply
bounded beam of ions called canal (or positive) rays. Beams of positive ions were
first obtained in exactly this way.

[Fig. 12.9: dropping volt-ampere characteristic of an arc discharge.]

12.7. Arc Discharge

In 1802, the Russian physicist Vasili Petrov (1761-1834) discovered that when con-
tacting carbon electrodes connected to a large galvanic battery are moved apart,
a concentrated light flares up between the electrodes. When the electrodes are
horizontal, the heated luminescent gas bends in the shape of an arc. This is why the
phenomenon discovered by Petrov was called an electric arc. The current in the
arc may reach enormous values (from 10³ A to 10⁴ A) at a voltage of several tens
of volts.
An arc discharge can proceed at both a low (of the order of several millimetres
of mercury) and a high (up to 1000 atmospheres) pressure. The main processes
maintaining the discharge are thermionic emission from the heated cathode surface
and thermal ionization of the molecules due to the high temperature of the gas in
the space between the electrodes. Almost the entire interelectrode space is filled
with a high-temperature plasma. It is the conductor through which the electrons
emitted by the cathode reach the anode. The temperature of the plasma is about
6000 K. In a superhigh-pressure arc, the temperature of the plasma may reach
10000 K (we remind our reader that the temperature of the Sun’s surface is 5800 K).
Owing to bombardment by positive ions, the cathode is heated to about 3500 K.
The anode, bombarded by a powerful stream of electrons, is heated still more. As a
result, the anode intensively evaporates, and a depression—a crater—is formed on
its surface. The crater is the brightest place in an arc.
An arc discharge has a dropping volt-ampere characteristic (Fig. 12.9). The
explanation is that a current increase is attended by a growth in the thermionic
emission from the cathode and in the degree of ionization of the gas-discharge
space. As a result, the resistance of this space diminishes at a greater rate than that
of the current increase.
Apart from the thermionic arc described above (i.e., a discharge due to thermionic
emission from the heated surface of the cathode) an arc with a cold cathode is
also encountered. Usually liquid mercury poured into a cylinder from which the
air has been evacuated is the cathode of such an arc. The discharge occurs in the
mercury vapour. The electrons fly out of the cathode as a result of autoelectronic
emission. The strong field at the cathode surface needed for this to occur is set up
by the positive space charge formed by the ions. The electrons are emitted not by
the entire surface of the cathode, but by a small luminous and continuously moving
cathode spot. The temperature of the gas in this case is not high. The molecules
in the plasma are ionized, as in a glow discharge, as a result of collisions with the
electrons.

12.8. Spark and Corona Discharges

A spark discharge is produced when the electric field strength reaches the breakdown
value 𝐸br for the given gas. The value of 𝐸br depends on the gas pressure; for air at
atmospheric pressure it is about 3 MV m⁻¹ (30 kV cm⁻¹). According
to the experimentally established Paschen law, the ratio of the breakdown field
strength to the pressure is approximately constant:
$$\frac{E_{br}}{p} \approx \text{constant}.$$
A spark discharge is attended by the formation of a brightly luminous tortuous
branched canal along which a short-time strong current pulse flows. An example is
lightning; its length may be up to 10 km, the diameter of the canal up to 40 cm, the
current may reach 100 000 A or more, and the duration of the pulse is about
10⁻⁴ s. Every stroke of lightning consists of several (up to 50) pulses flowing along
the same canal; their total duration (together with the intervals between the pulses)
may reach several seconds. The temperature of the gas in the spark canal is up to
10000 K. The rapid strong heating of the gas leads to a sharp growth in the pressure
and the production of shock and sound waves. This is why a spark discharge is
attended by sound phenomena—from a weak crackling for a low-power spark to
peals of thunder accompanying a stroke of lightning.
The appearance of a spark is preceded by the formation in the gas of a greatly
ionized canal known as a streamer. The latter is obtained by overlapping of the
separate electron avalanches appearing along the path of the spark. The forefather
of each avalanche is an electron released by photoionization. How a streamer
develops is shown in Fig. 12.10. Assume that the field strength has a value such that an
electron flying out of the cathode as a result of some process or other acquires an
energy sufficient for ionization along its free path. This causes multiplication of
the electrons to occur, and an avalanche is formed (the positive ions appearing during
this process do not play a noticeable part owing to their much smaller mobility;
they only set up the space charge resulting in redistribution of the potential). The
short-wave radiation emitted by an atom that lost one of its inner electrons when
ionized (this radiation is shown by wavy lines in the figure) produces
photoionization of the molecules, the detached electrons giving birth to more and more
new avalanches. After overlapping of the avalanches, a well-conducting canal, the
streamer, is formed along which a powerful stream of electrons flows from the
cathode to the anode: breakdown occurs.

[Fig. 12.10: development of a streamer between the cathode and the anode; wavy lines show the short-wave radiation.]

If the electrodes have a shape at which the field in the space between them
is approximately homogeneous (for example, they are spheres of a sufficiently
great diameter), then breakdown occurs at a quite definite voltage 𝑈br whose value
depends on the distance between the spheres 𝑙 (𝑈br = 𝐸br 𝑙). This underlies the
design of a spark voltmeter used to measure high voltages (from 10³ V to 10⁵ V).
During such measurements, the maximum distance 𝑙max is determined at which a
spark appears. Next multiplying 𝐸br by 𝑙max , we get the value of the voltage being
measured.
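
The spark-voltmeter relation 𝑈br = 𝐸br𝑙 and the approximate constancy of 𝐸br/𝑝 can be combined in a short numerical sketch. It is only an illustration: the function names are hypothetical, the value 3 MV m⁻¹ for air is the one quoted in the text, and taking it to refer to a pressure of 101 325 Pa is an assumption.

```python
E_BR_AIR = 3.0e6     # V/m, breakdown field strength of air quoted in the text
P_REF = 101325.0     # Pa, assumed reference (atmospheric) pressure


def breakdown_field(pressure_pa, e_ref=E_BR_AIR, p_ref=P_REF):
    """E_br at another pressure, assuming the ratio E_br / p stays constant."""
    return e_ref * pressure_pa / p_ref


def spark_voltage(gap_m, pressure_pa=P_REF):
    """U_br = E_br * l for an approximately homogeneous field."""
    return breakdown_field(pressure_pa) * gap_m


# Voltage indicated by a spark appearing at a maximum gap of 1 cm
print(spark_voltage(0.01))
# The same gap at half the reference pressure
print(spark_voltage(0.01, P_REF / 2))
```

The sketch treats the field as homogeneous, which the text requires of the electrodes (spheres of a sufficiently great diameter).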
If one of the electrodes (or both) has a very great curvature (for example, the
electrode is a thin wire or a sharp point), then when the voltage is not too high, a
so-called corona discharge is produced. When the voltage grows, this discharge
transforms into a spark or an arc discharge.
In a corona discharge, the ionization and excitation of the molecules occur
not in the entire interelectrode space, but only near an electrode having a small
radius of curvature, where the field strength reaches values equal to or greater
than 𝐸br . The gas glows in this part of the discharge. The glow has the form of a
corona surrounding the electrode, and this explains the name given to this kind of
discharge. A corona discharge from a point has the form of a luminous brush, and
for this reason it is sometimes known as a brush discharge. Positive and negative
coronas are distinguished depending on the sign of the corona electrode. The
external corona region is between the corona layer and the non-corona electrode.
Breakdown conditions (𝐸 ≥ 𝐸br) exist only within the limits of the corona layer.
We can, therefore, say that a corona discharge is incomplete breakdown of the gas
space.
With a negative corona, the phenomena at the cathode are similar to those at
the cathode of a glow discharge. The positive ions accelerated by the field knock
electrons out of the cathode. These electrons produce ionization and excitation of
the molecules in the corona layer. In the external region of the corona, the field
is not sufficient to impart the energy needed for ionization or excitation of the
molecules to the electrons. For this reason, the electrons that penetrate into this
region drift toward the anode under the action of the field. Part of the electrons are
captured by the molecules, the result being the formation of negative ions. Thus,
the current in the external region is due only to negative carriers: electrons and
negative ions. The discharge in this region is of a semi-self-sustained nature.
In a positive corona, the electron avalanches are conceived at the outer boundary
of the corona and fly toward the corona electrode, the anode. The appearance of
electrons giving birth to avalanches is due to photoionization produced by the
radiation of the corona layer. The current carriers in the external region of the
corona are the positive ions that drift to the cathode under the action of the field.
If both electrodes have a great curvature (two corona electrodes), processes
occur near each of them that are characteristic of a corona electrode of the given
sign. Both corona layers are separated by an external region in which opposite
streams of positive and negative current carriers travel. Such a corona is called a
bipolar one.
The self-sustained gas discharge mentioned in Sec. 12.5 when treating counters
is a corona discharge.
The thickness of the corona layer and the discharge current grow with an
increasing voltage. At a low voltage, the size of the corona is small, and its glow is
hard to notice. Such a microscopic corona is produced near a sharp point off which
an electric wind flows (see Sec. 3.1).
The bluish electrical glow caused by corona discharge on masts and other high
parts of a ship at sea before and after electrical storms was called St. Elmo’s fire in
olden days.
In high-voltage facilities, for example, in high-tension transmission lines, a
corona discharge leads to the harmful leakage of current. Measures therefore have
to be taken to prevent it. For this purpose, for instance, the wires of high-tension
lines are made with a sufficiently large diameter, which must be the greater, the higher the
voltage of the line.
The corona discharge has found a useful application in engineering in electrical
filters. The gas being purified flows through a tube along whose axis a negative
corona electrode is arranged. The negative ions present in a great number in the
external region of the corona settle on the particles or droplets polluting the gas and
are carried along with them to the external non-corona electrode. Upon reaching
the latter, the particles become neutralized and settle on it. Later, blows are struck
at the tube and the sediment formed by the precipitated particles drops into a
collector.

Chapter 13
ELECTRICAL OSCILLATIONS

13.1. Quasistationary Currents

When considering electrical oscillations, we deal with time-varying currents.
Ohm’s law and Kirchhoff’s rules following from it were established for a steady
current. They also hold, however, for the instantaneous values of a varying current
and voltage if the changes are not too fast. Electromagnetic disturbances propagate
along a circuit with a tremendous speed equal to the speed of light 𝑐. Assume that
the length of a circuit is 𝑙. If during the time 𝜏 = 𝑙/𝑐 needed for the transmission of a
disturbance to the farthest point of a circuit, the current changes insignificantly, then
the instantaneous values of the current in all the cross sections of the circuit will
be virtually identical. Currents obeying this condition are called quasistationary.
For periodically varying currents, the condition for a quasistationary state is
$$\tau = \frac{l}{c} \ll T,$$
where 𝑇 is the period of the changes.
The delay for a circuit 3 m long is 𝜏 = 10⁻⁸ s. Thus, up to values of 𝑇 of the
order of 10⁻⁶ s (which corresponds to a frequency of 10⁶ Hz), the currents in such a
circuit may be considered quasistationary. A current of industrial frequency (𝜈 = 50
or 60 Hz) is quasistationary for circuits up to about 100 km long.
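
The estimates above can be reproduced with a few lines of code. This is a sketch under stated assumptions: the helper names are hypothetical, and the numerical threshold `margin` is an assumption, since the text only requires 𝜏 ≪ 𝑇 without fixing how much smaller 𝜏 must be.

```python
C_LIGHT = 299792458.0   # speed of light, m/s


def delay(length_m):
    """Time tau = l / c for a disturbance to reach the far end of the circuit."""
    return length_m / C_LIGHT


def is_quasistationary(length_m, frequency_hz, margin=50.0):
    """True if tau is at least `margin` times smaller than the period T = 1/f.

    The factor `margin` is an assumed stand-in for the condition tau << T.
    """
    period = 1.0 / frequency_hz
    return delay(length_m) * margin <= period


print(delay(3.0))                          # about 1e-8 s for a 3 m circuit
print(is_quasistationary(3.0, 1.0e6))      # 3 m circuit at 10^6 Hz
print(is_quasistationary(100.0e3, 50.0))   # 100 km line at 50 Hz
print(is_quasistationary(100.0e3, 1.0e6))  # 100 km line at 10^6 Hz
```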
The instantaneous values of quasistationary currents obey Ohm’s law. Hence,
Kirchhoff’s rules also hold for them.
In the following when studying electrical oscillations, we shall always assume
that the currents we are dealing with are quasistationary.
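[Fig. 13.1: (a) consecutive stages of the oscillatory process in a circuit without a resistance; (b) the corresponding stages of the oscillations of a spring pendulum.]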

13.2. Free Oscillations in a Circuit Without a Resistance

Electrical oscillations may appear in a circuit containing an inductance and a
capacitance. Such a circuit is therefore called an oscillatory circuit. Figure 13.1a shows
the consecutive stages of an oscillatory process in an idealized circuit containing
no resistance.
Oscillations can be set up in the circuit either by supplying a certain initial
charge to the capacitor plates or by producing a current in the inductance (for
example, by switching off the external magnetic field passing through the coil turns).
Let us use the first method. We shall connect the capacitor to a source of voltage
after disconnecting it from the inductance. The result will be the appearance of
unlike charges +𝑞 and −𝑞 on the plates (stage 1). An electric field will be set up
between the plates, and its energy will be 𝑞²/(2𝐶) [see Eq. (4.5)]. If we next switch
off the voltage source and connect the capacitor to the inductance, it will begin to
discharge, and a current will flow through the circuit. The energy of the electric
field will diminish as a result, but in return a constantly growing energy of the
magnetic field set up by the current flowing through the inductance will appear.
This energy is 𝐿𝐼²/2 [see Eq. (8.37)].
Since the resistance of the circuit is zero, the total energy consisting of the
energies of the electric and magnetic fields is not used for heating the wires and will
remain constant¹. Therefore, at the moment when the voltage across the capacitor
and, consequently, the energy of the electric field, vanish, the energy of the magnetic
field and, consequently, the current reach their maximum value (stage 2; beginning
from this moment, the current flows at the expense of the self-induced e.m.f.). After
this, the current diminishes, and, when the charges on the plates reach their initial
value 𝑞, the current will vanish (stage 3). Next, the same processes occur in the
opposite direction (stages 4 and 5). After them, the system returns to its initial state
(stage 5), and the entire cycle repeats again and again. The charge on the plates, the
voltage across the capacitor, and the current flowing in the inductance periodically
change (i.e., oscillate) during the process. The oscillations are attended by mutual
transformations of the electric and magnetic field energies.

¹Strictly speaking, in such an idealized circuit, energy would be lost on the radiation of
electromagnetic waves. This loss grows with an increasing frequency of oscillations and when the circuit is
more “open”.

Figure 13.1b compares the oscillations of a spring pendulum with those in the
circuit. The supply of charges to the capacitor plates corresponds to bringing the
pendulum out of its equilibrium position by exerting an external force on it and
imparting the initial deviation 𝑥 to it. The potential energy of elastic deformation
of the spring equal to 𝑘𝑥²/2 is produced. Stage 2 corresponds to passing of the
pendulum through its equilibrium position. At this moment, the quasi-elastic force
vanishes, and the pendulum continues its motion by inertia. By this time, the energy
of the pendulum completely transforms into kinetic energy and is determined by
the expression 𝑚ẋ²/2. We shall let our reader compare the further stages.
It can be seen from a comparison of electrical and mechanical oscillations that
the energy of an electric field 𝑞²/(2𝐶) is similar to the potential energy of elastic
deformation, and the energy of a magnetic field 𝐿𝐼²/2 is similar to the kinetic
energy. The inductance 𝐿 plays the part of the mass 𝑚, and the reciprocal of the
capacitance (1/𝐶) the part of the spring constant 𝑘. Finally, the displacement 𝑥 of
the pendulum from its equilibrium position corresponds to the charge 𝑞, and the
speed ẋ to the current 𝐼 = q̇. We shall see below that the analogy between electrical
and mechanical oscillations also extends to the mathematical equations describing
them.
Let us find an equation for the oscillations in a circuit without a resistance (an
𝐿𝐶 circuit). We shall consider the current charging the capacitor to be positive²
(Fig. 13.2). Hence, by Eq. (5.1),
$$I = \frac{dq}{dt} = \dot q.$$
Equation (5.27) of Ohm’s law for circuit 1-3-2 is
$$IR = \varphi_1 - \varphi_2 + \mathcal{E}_{12}.$$
In our case, 𝑅 = 0, 𝜑₁ − 𝜑₂ = −𝑞/𝐶, and ℰ₁₂ = ℰs = −𝐿(d𝐼/d𝑡). Introducing these
values into Eq. (5.27), we get
$$0 = -\frac{q}{C} - L\frac{dI}{dt}. \tag{13.1}$$

²With such a choice of the direction of the current, the analogy between electrical and mechanical
oscillations is more complete: q̇ corresponds to the speed ẋ (with a different choice, −q̇ corresponds
to the speed ẋ).

[Fig. 13.2: 𝐿𝐶 circuit 1-3-2 with the direction of the current taken as positive.]

Finally, replacing d𝐼/d𝑡 with q̈ [see Eq. (5.1)], we get
$$\ddot q + \frac{1}{LC}\,q = 0. \tag{13.2}$$
If we introduce the symbol
$$\omega_0 = \frac{1}{\sqrt{LC}}, \tag{13.3}$$
Eq. (13.2) becomes
$$\ddot q + \omega_0^2 q = 0, \tag{13.4}$$
which is familiar from the theory of mechanical oscillations [see Eq.
(7.7) of Vol. I]. The following function is a solution of this equation:
$$q = q_m\cos(\omega_0 t + \alpha) \tag{13.5}$$
(the subscript “m” stands for maximum).
Thus, the charge on the capacitor plates changes according to a harmonic law
with a frequency determined by Eq. (13.3). This frequency is called the natural
frequency of the circuit (it corresponds to the natural frequency of a harmonic
oscillator). Since 𝑇 = 2𝜋/𝜔₀, we get the so-called Thomson formula for the period of the
oscillations:
$$T = 2\pi\sqrt{LC}. \tag{13.6}$$
The voltage across the capacitor differs from the charge by the factor 1/𝐶:
$$U = \frac{q_m}{C}\cos(\omega_0 t + \alpha) = U_m\cos(\omega_0 t + \alpha). \tag{13.7}$$
Time differentiation of Eq. (13.5) yields an expression for the current:
$$I = -\omega_0 q_m\sin(\omega_0 t + \alpha) = I_m\cos\left(\omega_0 t + \alpha + \frac{\pi}{2}\right). \tag{13.8}$$

[Fig. 13.3: oscillatory circuit 1-3-2 containing a resistance 𝑅.]

Thus, the current leads the voltage across the capacitor in phase by 𝜋/2.
A comparison of Eqs. (13.5) and (13.7) with Eq. (13.8) shows that at the moment
when the current reaches its maximum value, the charge and the voltage vanish,
and vice versa. We have already established this relation between the charge and
the current on the basis of energy considerations.
Examination of Eqs. (13.7) and (13.8) shows that
$$U_m = \frac{q_m}{C}, \qquad I_m = \omega_0 q_m.$$
Taking the ratio of these amplitudes and substituting for 𝜔₀ its value from Eq. (13.3),
we get
$$U_m = \sqrt{\frac{L}{C}}\,I_m. \tag{13.9}$$
We can also obtain this equation if we proceed from the fact that the maximum
value of the energy of the electric field 𝐶𝑈m²/2 must equal the maximum value of
the energy of the magnetic field 𝐿𝐼m²/2.
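
The relations of this section lend themselves to a quick numerical check. The following sketch evaluates the natural frequency, the Thomson period, and the amplitudes, and confirms that the total energy 𝑞²/(2𝐶) + 𝐿𝐼²/2 stays constant along the solution. The function names and the sample values of 𝐿, 𝐶, and 𝑞m are assumptions made for illustration.

```python
import math


def natural_frequency(inductance, capacitance):
    """omega_0 = 1 / sqrt(LC)."""
    return 1.0 / math.sqrt(inductance * capacitance)


def thomson_period(inductance, capacitance):
    """T = 2 * pi * sqrt(LC)."""
    return 2.0 * math.pi * math.sqrt(inductance * capacitance)


def charge_and_current(t, q_m, omega0, alpha=0.0):
    """Charge q = q_m cos(w0 t + a) and current I = dq/dt at time t."""
    q = q_m * math.cos(omega0 * t + alpha)
    i = -omega0 * q_m * math.sin(omega0 * t + alpha)
    return q, i


def total_energy(q, i, inductance, capacitance):
    """Sum of the electric-field and magnetic-field energies."""
    return q * q / (2.0 * capacitance) + inductance * i * i / 2.0


L_H, C_F, Q_M = 1.0e-3, 1.0e-6, 1.0e-6   # assumed sample values (H, F, C)
omega0 = natural_frequency(L_H, C_F)
u_m = Q_M / C_F                          # voltage amplitude
i_m = omega0 * Q_M                       # current amplitude
print(omega0, thomson_period(L_H, C_F), u_m, i_m)
```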

13.3. Free Damped Oscillations

Any real circuit has a resistance. The energy stored in the circuit is gradually spent
in this resistance for heating, owing to which the free oscillations become damped.
Equation (5.27) written for circuit 1-3-2 shown in Fig. 13.3 has the form
$$IR = -\frac{q}{C} - L\frac{dI}{dt} \tag{13.10}$$
[compare with Eq. (13.1)]. Dividing this equation by 𝐿 and substituting q̇ for 𝐼
and q̈ for d𝐼/d𝑡, we obtain
$$\ddot q + \frac{R}{L}\,\dot q + \frac{1}{LC}\,q = 0. \tag{13.11}$$
Taking into account that the reciprocal of 𝐿𝐶 equals the square of the natural
frequency of the circuit 𝜔₀ [see Eq. (13.3)], and introducing the symbol
$$\beta = \frac{R}{2L}, \tag{13.12}$$
we can write Eq. (13.11) in the form
$$\ddot q + 2\beta\dot q + \omega_0^2 q = 0. \tag{13.13}$$
This equation coincides with the differential equation of damped mechanical
oscillations [see Eq. (7.11) of Vol. I].
When 𝛽² < 𝜔₀², i.e., 𝑅²/(4𝐿²) < 1/(𝐿𝐶), the solution of Eq. (13.13) has the form
$$q = q_{m,0}\,e^{-\beta t}\cos(\omega t + \alpha), \tag{13.14}$$
where 𝜔 = √(𝜔₀² − 𝛽²). Substituting for 𝜔₀ its value from Eq. (13.3) and for 𝛽 its value
from Eq. (13.12), we find that
$$\omega = \sqrt{\frac{1}{LC} - \frac{R^2}{4L^2}}. \tag{13.15}$$
Thus, the frequency of damped oscillations 𝜔 is smaller than the natural frequency
𝜔₀. When 𝑅 = 0, Eq. (13.15) transforms into Eq. (13.3).
Dividing Eq. (13.14) by the capacitance 𝐶, we get the voltage across the capacitor:
$$U = \frac{1}{C}\,q_{m,0}\,e^{-\beta t}\cos(\omega t + \alpha) = U_{m,0}\,e^{-\beta t}\cos(\omega t + \alpha). \tag{13.16}$$
To find the current, we shall differentiate Eq. (13.14) with respect to time:
$$I = \dot q = q_{m,0}\,e^{-\beta t}\left[-\beta\cos(\omega t + \alpha) - \omega\sin(\omega t + \alpha)\right].$$
Multiplying the right-hand side of this equation by the expression
$$\frac{\omega_0}{\sqrt{\omega^2 + \beta^2}},$$
which equals unity because 𝜔² + 𝛽² = 𝜔₀², we get
$$I = \omega_0 q_{m,0}\,e^{-\beta t}\left[-\frac{\beta}{\sqrt{\omega^2 + \beta^2}}\cos(\omega t + \alpha) - \frac{\omega}{\sqrt{\omega^2 + \beta^2}}\sin(\omega t + \alpha)\right].$$
Introducing the angle 𝜓 determined by the conditions
$$\cos\psi = -\frac{\beta}{\sqrt{\omega^2 + \beta^2}} = -\frac{\beta}{\omega_0}, \qquad \sin\psi = \frac{\omega}{\sqrt{\omega^2 + \beta^2}} = \frac{\omega}{\omega_0},$$
we can write
$$I = \omega_0 q_{m,0}\,e^{-\beta t}\cos(\omega t + \alpha + \psi). \tag{13.17}$$
Since cos 𝜓 < 0 and sin 𝜓 > 0, the value of 𝜓 is within the limits from 𝜋/2 to 𝜋
(i.e., 𝜋/2 < 𝜓 < 𝜋). Thus, when a circuit contains a resistance, the current leads the
voltage across the capacitor in phase by more than 𝜋/2 (when 𝑅 = 0, the advance
in phase is 𝜋/2).

[Fig. 13.4: plot of function (13.14), the damped oscillations of the charge 𝑞 with time.]

A plot of function (13.14) is depicted in Fig. 13.4. Plots of the voltage and current
are similar to it.
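
As a cross-check of Eqs. (13.14) and (13.17), the sketch below computes 𝛽, 𝜔₀, 𝜔, and 𝜓 for a sample circuit and compares the closed-form current with a numerical derivative of the charge. The function names and the component values are assumptions; the angle 𝜓 is recovered from cos 𝜓 = −𝛽/𝜔₀ and sin 𝜓 = 𝜔/𝜔₀.

```python
import math


def damped_params(resistance, inductance, capacitance):
    """Return (beta, omega0, omega, psi) for an oscillatory (underdamped) circuit."""
    beta = resistance / (2.0 * inductance)
    omega0 = 1.0 / math.sqrt(inductance * capacitance)
    if beta >= omega0:
        raise ValueError("no oscillations: beta must be smaller than omega0")
    omega = math.sqrt(omega0 ** 2 - beta ** 2)
    # atan2(sin psi, cos psi) places psi between pi/2 and pi
    psi = math.atan2(omega / omega0, -beta / omega0)
    return beta, omega0, omega, psi


def charge(t, q0, beta, omega, alpha=0.0):
    """q = q0 exp(-beta t) cos(omega t + alpha)."""
    return q0 * math.exp(-beta * t) * math.cos(omega * t + alpha)


def current(t, q0, beta, omega0, omega, psi, alpha=0.0):
    """I = omega0 q0 exp(-beta t) cos(omega t + alpha + psi)."""
    return omega0 * q0 * math.exp(-beta * t) * math.cos(omega * t + alpha + psi)


beta, w0, w, psi = damped_params(10.0, 1.0e-3, 1.0e-6)   # assumed sample values
t, h, q0 = 2.0e-5, 1.0e-9, 1.0e-6
# Central difference approximation of dq/dt
numeric = (charge(t + h, q0, beta, w) - charge(t - h, q0, beta, w)) / (2.0 * h)
print(numeric, current(t, q0, beta, w0, w, psi))
```

The two printed numbers should agree to within the accuracy of the finite-difference estimate.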
It is customary practice to characterize the damping of oscillations by the
logarithmic decrement
$$\lambda = \ln\frac{A(t)}{A(t+T)} = \beta T \tag{13.18}$$
[see Eq. (7.104) of Vol. I]. Here 𝐴(𝑡) is the amplitude of the relevant quantity (𝑞, 𝑈,
or 𝐼). We remind our reader that the logarithmic decrement is the reciprocal of the
number of oscillations 𝑁e performed during the time needed for the amplitude to
decrease to 1/𝑒 of its initial value:
$$\lambda = \frac{1}{N_e}.$$
Using in Eq. (13.18) the value of 𝛽 from Eq. (13.12) and substituting 2𝜋/𝜔 for 𝑇,
we get the following expression for 𝜆:
$$\lambda = \frac{R}{2L}\,\frac{2\pi}{\omega} = \frac{\pi R}{L\omega}. \tag{13.19}$$
The frequency 𝜔, and, therefore, also 𝜆 are determined by the parameters of a circuit
𝐿, 𝐶, and 𝑅. Thus, the logarithmic decrement is a characteristic of a circuit.
If the damping is not great (𝛽² ≪ 𝜔₀²), we can assume in Eq. (13.19) that
𝜔 ≈ 𝜔₀ = 1/√(𝐿𝐶). Hence,
$$\lambda \approx \frac{\pi R\sqrt{LC}}{L} = \pi R\sqrt{\frac{C}{L}}. \tag{13.20}$$
An oscillatory circuit is often characterized by its quality, or simply 𝑄,
determined as a quantity that is inversely proportional to the logarithmic decrement:
$$Q = \frac{\pi}{\lambda} = \pi N_e. \tag{13.21}$$
It follows from Eq. (13.21) that the quality of a circuit is the higher, the greater is
the number of oscillations completed before the amplitude diminishes to 1/𝑒 of its
initial value.
For weak damping, we have
$$Q = \frac{1}{R}\sqrt{\frac{L}{C}} \tag{13.22}$$
[see Eq. (13.20)].
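
A short sketch can show how close the weak-damping formulas (13.20) and (13.22) come to the exact expression (13.19). The function names and the circuit parameters below are assumptions picked so that 𝛽² ≪ 𝜔₀².

```python
import math


def log_decrement(resistance, inductance, capacitance):
    """Exact lambda = pi R / (L omega), with omega from Eq. (13.15)."""
    beta = resistance / (2.0 * inductance)
    omega = math.sqrt(1.0 / (inductance * capacitance) - beta ** 2)
    return math.pi * resistance / (inductance * omega)


def log_decrement_weak(resistance, inductance, capacitance):
    """Weak-damping approximation lambda = pi R sqrt(C / L)."""
    return math.pi * resistance * math.sqrt(capacitance / inductance)


def quality(decrement):
    """Q = pi / lambda."""
    return math.pi / decrement


R_OHM, L_H, C_F = 1.0, 1.0e-3, 1.0e-6    # assumed sample values
lam = log_decrement(R_OHM, L_H, C_F)
lam_weak = log_decrement_weak(R_OHM, L_H, C_F)
print(lam, lam_weak, quality(lam), 1.0 / lam)   # 1/lambda is N_e
```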
In Sec. 7.10 of Vol. I, we showed that when the damping is weak, the quality of a
mechanical oscillatory system equals the ratio of the energy stored in the system at
a given moment to the decrement of this energy during one period of oscillations
with an accuracy to the factor 2𝜋. We shall show that this also holds for electrical
oscillations. The amplitude of the current in a circuit diminishes according to
the law 𝑒^(−𝛽𝑡). The energy 𝑊 stored in the circuit is proportional to the square of
the current amplitude (or to the square of the amplitude of the voltage across the
capacitor). Hence, 𝑊 diminishes according to the law 𝑒^(−2𝛽𝑡). The relative reduction
in the energy during a period is
$$\frac{\Delta W}{W} = \frac{W(t) - W(t+T)}{W(t)} = \frac{1 - e^{-2\beta T}}{1} = 1 - e^{-2\lambda}.$$
With insignificant damping (i.e., when 𝜆 ≪ 1), we may assume that 𝑒^(−2𝜆) is
approximately equal to 1 − 2𝜆:
$$\frac{\Delta W}{W} = 1 - (1 - 2\lambda) = 2\lambda.$$
Finally, substituting the quality 𝑄 of the circuit for 𝜆 in this expression in accordance
with Eq. (13.21) and solving the equation obtained relative to 𝑄, we get
$$Q = 2\pi\,\frac{W}{\Delta W}. \tag{13.23}$$
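
How good the approximation 𝑒^(−2𝜆) ≈ 1 − 2𝜆 is can be seen by comparing the quality computed from the exact relative energy loss, 2𝜋𝑊/𝛥𝑊, with 𝜋/𝜆. The sketch below is illustrative only; the function names and the sample decrements are assumptions.

```python
import math


def relative_energy_loss(decrement):
    """Exact fraction of the stored energy lost in one period: 1 - exp(-2 lambda)."""
    return -math.expm1(-2.0 * decrement)


def quality_from_energy(decrement):
    """2 * pi * W / (Delta W), using the exact relative loss."""
    return 2.0 * math.pi / relative_energy_loss(decrement)


def quality_from_decrement(decrement):
    """Q = pi / lambda."""
    return math.pi / decrement


for lam in (0.1, 0.01):   # assumed sample logarithmic decrements
    q_energy = quality_from_energy(lam)
    q_def = quality_from_decrement(lam)
    print(lam, q_energy, q_def, q_energy / q_def - 1.0)
```

The relative difference between the two values is close to 𝜆 itself, so the energy interpretation of 𝑄 becomes accurate as the damping weakens.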
We shall note in conclusion that when 𝑅²/(4𝐿²) > 1/(𝐿𝐶), i.e., when 𝛽² > 𝜔₀²,
an aperiodic discharge of the capacitor occurs instead of oscillations. The resistance
of a circuit at which an oscillatory process transforms into an aperiodic one is called
critical. The value of the critical resistance 𝑅cr is determined by the condition
𝑅cr²/(4𝐿²) = 1/(𝐿𝐶), whence
$$R_{cr} = 2\sqrt{\frac{L}{C}}. \tag{13.24}$$
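
The condition separating oscillatory from aperiodic behaviour is easy to apply numerically. In this sketch the function names, the regime labels, and the sample component values are assumptions for illustration.

```python
import math


def critical_resistance(inductance, capacitance):
    """R_cr = 2 * sqrt(L / C)."""
    return 2.0 * math.sqrt(inductance / capacitance)


def regime(resistance, inductance, capacitance):
    """Classify the free process in the circuit by comparing R with R_cr."""
    r_cr = critical_resistance(inductance, capacitance)
    if resistance < r_cr:
        return "oscillatory"
    if resistance > r_cr:
        return "aperiodic"
    return "critical"


L_H, C_F = 1.0e-3, 1.0e-6                 # assumed sample values
print(critical_resistance(L_H, C_F))      # about 63 ohms for these values
print(regime(10.0, L_H, C_F), regime(100.0, L_H, C_F))
```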

[Fig. 13.5: circuit with 𝑅, 𝐿, and 𝐶 in series, fed with an alternating voltage from an external source.]

13.4. Forced Electrical Oscillations

To produce forced oscillations of a system, an external periodically changing action
must be exerted on it. This can be achieved for electrical oscillations if we connect
a varying e.m.f. in series with the circuit elements or if, after breaking the circuit,
we feed an alternating voltage to the contacts formed, i.e., the voltage
$$U = U_m\cos\omega t \tag{13.25}$$
(Fig. 13.5). This voltage must be added to the self-induced e.m.f. As a result, Eq. (13.10)
acquires the form
$$IR = -\frac{q}{C} - L\frac{dI}{dt} + U_m\cos\omega t. \tag{13.26}$$
After transformations, we get the equation
$$\ddot q + 2\beta\dot q + \omega_0^2 q = \frac{U_m}{L}\cos\omega t. \tag{13.27}$$
Here, 𝜔20 and 𝛽 are determined by Eqs. (13.3) and (13.12).
Equation (13.27) coincides with the differential equation of forced mechanical
oscillations [see Eq. (7.111) of Vol. I]. A partial solution of this equation has the form
𝑞 = 𝑞m cos(𝜔𝑡 − 𝜓), (13.28)
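where
$$ q_m = \frac{U_m/L}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\beta^2\omega^2}}, \qquad \tan\psi = \frac{2\beta\omega}{\omega_0^2 - \omega^2} $$
[see Eq. (7.119) of Vol. I]. Substitution of their values for $\omega_0$ and $\beta$ gives
$$ q_m = \frac{U_m}{\omega\sqrt{R^2 + [\omega L - 1/(\omega C)]^2}}, \tag{13.29} $$
$$ \tan\psi = \frac{R}{1/(\omega C) - \omega L}. \tag{13.30} $$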
A general solution is obtained if we add the general solution of the relevant
homogeneous equation to partial solution (13.28). This solution was obtained in
the preceding section [see Eq. (13.14)]. It contains the exponential factor $e^{-\beta t}$;
therefore, after sufficient time elapses, it becomes very small and may be disregarded.

Consequently, stationary forced oscillations are described by the function (13.28).


Time differentiation of Eq. (13.28) gives the current in a circuit with stationary
oscillations:
$$ I = -\omega q_m \sin(\omega t - \psi) = I_m \cos\left(\omega t - \psi + \frac{\pi}{2}\right) $$
($I_m = \omega q_m$). Let us write this expression in the form³
𝐼 = 𝐼m cos(𝜔𝑡 − 𝜑), (13.31)
where 𝜑 = 𝜓 − 𝜋/2 is the shift in phase between the current and the applied voltage
[see Eq. (13.25)]. In accordance with Eq. (13.30):
$$ \tan\varphi = \tan\left(\psi - \frac{\pi}{2}\right) = -\frac{1}{\tan\psi} = \frac{\omega L - 1/(\omega C)}{R}. \tag{13.32} $$
Inspection of this equation shows that the current lags in phase behind the voltage
(𝜑 > 0) when 𝜔𝐿 > 1/(𝜔𝐶), and leads the voltage (𝜑 < 0) when 𝜔𝐿 < 1/(𝜔𝐶).
According to Eq. (13.29):
$$ I_m = \omega q_m = \frac{U_m}{\sqrt{R^2 + [\omega L - 1/(\omega C)]^2}}. \tag{13.33} $$
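As a rough numerical illustration of Eqs. (13.32) and (13.33), the sketch below evaluates the current amplitude and the phase shift for a series circuit. The function name `forced_current` and the component values are assumptions made for this example only.

```python
import math

def forced_current(U_m, R, L, C, omega):
    """Amplitude I_m and phase shift phi of the stationary forced current."""
    X = omega * L - 1.0 / (omega * C)   # omega*L - 1/(omega*C)
    I_m = U_m / math.hypot(R, X)        # Eq. (13.33)
    phi = math.atan2(X, R)              # Eq. (13.32); R > 0 keeps |phi| < pi/2
    return I_m, phi

# Assumed illustrative values: U_m = 10 V, R = 50 ohm, L = 10 mH, C = 1 uF
U_m, R, L, C = 10.0, 50.0, 10e-3, 1e-6
omega0 = 1.0 / math.sqrt(L * C)
for omega in (0.5 * omega0, omega0, 2.0 * omega0):
    I_m, phi = forced_current(U_m, R, L, C, omega)
    print(f"omega/omega0 = {omega / omega0:.1f}: I_m = {I_m:.4f} A, phi = {phi:+.3f} rad")
```

Below the natural frequency the phase shift comes out negative (the current leads), above it positive (the current lags), in line with the sign rule stated for Eq. (13.32).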

Let us write Eq. (13.26) in the form
$$ IR + \frac{q}{C} + L\frac{dI}{dt} = U_m \cos(\omega t). \tag{13.34} $$
The product 𝐼 𝑅 equals the voltage 𝑈 𝑅 across the resistance, 𝑞/𝐶 is the voltage across
the capacitor 𝑈𝐶 , and the expression 𝐿 (d𝐼/d𝑡) determines the voltage across the
inductance 𝑈 𝐿 . Taking this into account, we can write
𝑈 𝑅 + 𝑈𝐶 + 𝑈 𝐿 = 𝑈m cos(𝜔𝑡). (13.35)
Thus, the sum of the voltages across the separate elements of a circuit at each
moment of time equals the voltage applied from an external source (see Fig. 13.5).
According to Eq. (13.31)
𝑈 𝑅 = 𝑅𝐼m cos(𝜔𝑡 − 𝜑). (13.36)
Dividing Eq. (13.28) by the capacitance, we get the voltage across the capacitor
$$ U_C = \frac{q_m}{C}\cos(\omega t - \psi) = U_{C,m}\cos\left(\omega t - \varphi - \frac{\pi}{2}\right). \tag{13.37} $$
Here,
$$ U_{C,m} = \frac{q_m}{C} = \frac{U_m}{\omega C\sqrt{R^2 + [\omega L - 1/(\omega C)]^2}} = \frac{I_m}{\omega C} \tag{13.38} $$
[see Eq. (13.31)]. Multiplying the derivative of function (13.31) by $L$, we get the voltage

³We shall not encounter the concept of potential any more up to the end of this chapter. Therefore,
no misunderstandings will appear if we use the symbol 𝜑 for the phase angle.

Fig. 13.6

across the inductance:
$$ U_L = L\frac{dI}{dt} = -\omega L I_m \sin(\omega t - \varphi) = U_{L,m}\cos\left(\omega t - \varphi + \frac{\pi}{2}\right). \tag{13.39} $$
Here,
$$ U_{L,m} = \omega L I_m. \tag{13.40} $$
A comparison of Eqs. (13.31), (13.36), (13.37), and (13.39) shows that the voltage
across the capacitor lags in phase behind the current by 𝜋/2, while the voltage
across the inductance leads the current by 𝜋/2. The voltage across the resistance
changes in phase with the current. The phase relations can be shown very clearly
with the aid of a vector diagram (see Sec. 7.7 of Vol. I). We remind our reader that
a harmonic oscillation (or a harmonic function) can be shown with the aid of a
vector whose length equals the amplitude of the oscillation, while the direction of
the vector makes an angle equal to the initial phase of the oscillation with a certain
axis. Let us take the axis of currents as the straight line from which the initial phase
is counted. This gives us the diagram shown in Fig. 13.6.
According to Eq. (13.35), the sum of the three functions 𝑈 𝑅 , 𝑈𝐶 , and 𝑈 𝐿 must equal
the applied voltage 𝑈. The voltage 𝑈 is accordingly shown in the diagram by a
vector equal to the sum of the vectors 𝑈 𝑅 , 𝑈𝐶 , and 𝑈 𝐿 . We must note that Eq. (13.33) is
easily obtained from the right triangle formed in the vector diagram by the vectors
𝑈, 𝑈 𝑅 , and the difference 𝑈 𝐿 − 𝑈𝐶 .
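The vector diagram can be checked numerically by representing each vector as a complex number measured from the axis of currents. Using complex arithmetic here is a convenience assumed for the sketch, not the method of the text, and the component values are illustrative.

```python
import cmath
import math

# Assumed illustrative values
U_m, R, L, C, omega = 10.0, 50.0, 10e-3, 1e-6, 1.5e4

X = omega * L - 1.0 / (omega * C)
I_m = U_m / math.hypot(R, X)        # Eq. (13.33)
phi = math.atan2(X, R)              # Eq. (13.32)

# Vectors measured from the axis of currents (current along the real axis)
I_vec = complex(I_m, 0.0)
U_R = R * I_vec                     # in phase with the current
U_L = 1j * omega * L * I_vec        # leads the current by pi/2
U_C = -1j / (omega * C) * I_vec     # lags behind the current by pi/2

U_sum = U_R + U_L + U_C
print("length of the sum:", abs(U_sum), "applied amplitude:", U_m)
print("angle of the sum :", cmath.phase(U_sum), "phi:", phi)
```

The length of the summed vector reproduces the applied amplitude and its angle to the axis of currents reproduces the phase shift, which is the right-triangle relation described above.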
The resonance frequency for the charge $q$ and the voltage $U_C$ across the capacitor
is
$$ \omega_{q,res} = \omega_{U,res} = \left(\omega_0^2 - 2\beta^2\right)^{1/2} = \left(\frac{1}{LC} - \frac{R^2}{2L^2}\right)^{1/2} \leq \omega_0 \tag{13.41} $$
[see Eq. (7.127) of Vol. I].
Resonance curves for 𝑈𝐶 are shown in Fig. 13.7 (resonance curves for 𝑞 have
the same form). They are similar to the resonance curves obtained for mechanical
oscillations (see Fig. 7.24 of Vol. I). When 𝜔 → 0, the resonance curves converge
at one point having the ordinate 𝑈𝐶,m = 𝑈m , i.e., the voltage appearing across the

Fig. 13.7 Fig. 13.8

capacitor when it is connected to a source of steady voltage 𝑈m . The maximum in


resonance will be the higher and the sharper, the smaller is 𝛽 = 𝑅/(2𝐿), i.e., the
smaller is the resistance and the greater the inductance of the circuit.
Resonance curves for the current are shown in Fig. 13.8. They correspond to the
resonance curves for the velocity in mechanical oscillations. The amplitude of the
current has a maximum value at $\omega L - 1/(\omega C) = 0$ [see Eq. (13.33)]. Consequently, the
resonance frequency for the current coincides with the natural frequency of the
circuit $\omega_0$:
$$ \omega_{I,res} = \omega_0 = \frac{1}{\sqrt{LC}}. \tag{13.42} $$
The intercept formed by the resonance curves on the 𝐼m -axis is zero—at a
constant voltage, a steady current cannot flow in a circuit containing a capacitor.
At small damping (when $\beta^2 \ll \omega_0^2$), the resonance frequency for the voltage
can be taken equal to $\omega_0$ [see Eq. (13.41)]. Accordingly, we may consider that
$\omega_{res}L - 1/(\omega_{res}C) \approx 0$. By Eq. (13.38), the ratio of the amplitude of the voltage across the capacitor
in resonance $U_{C,m,res}$ to the amplitude of the external voltage $U_m$ will in this case
be
$$ \frac{U_{C,m,res}}{U_m} = \frac{1}{\omega_0 C R} = \frac{\sqrt{LC}}{CR} = \frac{1}{R}\left(\frac{L}{C}\right)^{1/2} = Q \tag{13.43} $$
[we have assumed in Eq. (13.38) that $\omega = \omega_{U,res} = \omega_0$]. Here, $Q$ is the quality of the
circuit [see Eq. (13.22)]. Thus, the quality of a circuit shows how many times the
voltage across a capacitor can exceed the applied voltage.
The quality of a circuit also determines the sharpness of the resonance curves.
Figure 13.9 shows a resonance curve for the current in a circuit. Instead of laying
off the values of 𝐼m corresponding to a given frequency along the axis of ordinates,
we have laid off the ratio of 𝐼m to 𝐼m,res (i.e., to 𝐼m in resonance). Let us consider
the width of the curve $\Delta\omega$ taken at the height 0.7 (a power ratio of $0.7^2 \approx 0.5$

Fig. 13.9

corresponds to a ratio of the current amplitudes equal to 0.7). We can show that the
ratio of this width to the resonance frequency equals a quantity that is the reciprocal
of the quality of a circuit:
$$ \frac{\Delta\omega}{\omega_0} = \frac{1}{Q}. \tag{13.44} $$
We remind our reader that Eqs. (13.43) and (13.44) hold only for large values of
𝑄, i.e., when the damping of the free oscillations in the circuit is small.
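The relation between the width of the resonance curve and the quality can be checked numerically. The sketch below locates the two frequencies at which the current ratio falls to $1/\sqrt{2}$ (close to the height 0.7 used above) by bisection; the helper names and component values are assumptions made for illustration.

```python
import math

def current_ratio(omega, R, L, C):
    """I_m / I_m,res for a series circuit at frequency omega."""
    X = omega * L - 1.0 / (omega * C)
    return R / math.hypot(R, X)

def bisect(f, lo, hi, tol=1e-9):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed illustrative values: R = 10 ohm, L = 10 mH, C = 1 uF
R, L, C = 10.0, 10e-3, 1e-6
omega0 = 1.0 / math.sqrt(L * C)
Q = math.sqrt(L / C) / R                      # quality, as in Eq. (13.43)
level = 1.0 / math.sqrt(2.0)                  # half-power level

g = lambda w: current_ratio(w, R, L, C) - level
w_lo = bisect(g, 0.5 * omega0, omega0)        # lower half-power frequency
w_hi = bisect(g, omega0, 2.0 * omega0)        # upper half-power frequency
rel_width = (w_hi - w_lo) / omega0
print("relative width:", rel_width, " 1/Q:", 1.0 / Q)
```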
The phenomenon of resonance is used to separate the required component
from a complex voltage. Assume that the voltage applied to a circuit is
𝑈 = 𝑈m,1 cos(𝜔1 𝑡 + 𝛼1 ) + 𝑈m,2 cos(𝜔2 𝑡 + 𝛼2 ) + . . . .
By tuning the circuit to one of the frequencies 𝜔1 , 𝜔2 , etc. (i.e., by correspondingly
choosing its parameters 𝐶 and 𝐿), we can obtain a voltage across the capacitor that
exceeds the value of the given component 𝑄 times, whereas the voltage produced
across the capacitor by the other components will be weak. Such a process is carried
out, for example, when tuning a radio receiver to the required wavelength.

13.5. Alternating Current

The stationary forced oscillations described in the preceding section can be con-
sidered as the flow of an alternating current produced by the alternating voltage
𝑈 = 𝑈m cos(𝜔𝑡) (13.45)
in a circuit including a capacitance, an inductance, and a resistance. According to
Eqs. (13.31), (13.32), and (13.33), this current varies according to the law
𝐼 = 𝐼m cos(𝜔𝑡 − 𝜑). (13.46)
The amplitude of the current is determined by the amplitude of the voltage $U_m$, the
circuit parameters $C$, $L$, $R$, and the frequency $\omega$:
$$ I_m = \frac{U_m}{\sqrt{R^2 + [\omega L - 1/(\omega C)]^2}}. \tag{13.47} $$

The current lags in phase behind the voltage by the angle $\varphi$ that depends on the
parameters of the circuit and on the frequency:
$$ \tan\varphi = \frac{\omega L - 1/(\omega C)}{R}. \tag{13.48} $$
When 𝜑 < 0, the current actually leads the voltage.
The expression
$$ Z = \left[R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2\right]^{1/2} \tag{13.49} $$
in the denominator of Eq. (13.47) is called the impedance.
If a circuit consists only of a resistance 𝑅, the equation of Ohm’s law has the
form
𝐼 𝑅 = 𝑈m cos(𝜔𝑡).
Hence, it follows that the current in this case varies in phase with the voltage, while
the amplitude of the current is
$$ I_m = \frac{U_m}{R}. $$
A comparison of this expression with Eq. (13.47) shows that the replacement of a
capacitor with a shorted circuit section signifies a transition to 𝐶 → ∞ instead of
to 𝐶 = 0.
Any real circuit has finite values of 𝑅, 𝐿, and 𝐶. It may happen that some of
these parameters are such that their influence on the current may be disregarded.
Suppose that 𝑅 of a circuit may be assumed equal to zero, and 𝐶 equal to infinity.
Now, we can see from Eqs. (13.47) and (13.48) that
$$ I_m = \frac{U_m}{\omega L} \tag{13.50} $$
and that $\tan\varphi = \infty$ (accordingly, $\varphi = \pi/2$). The quantity
$$ X_L = \omega L \tag{13.51} $$
is called the inductive reactance. If $L$ is expressed in henries, and $\omega$ in rad s$^{-1}$, then
$X_L$ will be expressed in ohms. Examination of Eq. (13.51) shows that the inductive
reactance grows with the frequency $\omega$. An inductance does not react to a steady
current ($\omega = 0$), i.e., $X_L = 0$.
The current in an inductance lags behind the voltage by 𝜋/2. Accordingly, the
voltage across the inductance leads the current by 𝜋/2 (see Fig. 13.6).

Now, let us assume that $R$ and $L$ both equal zero. Hence, according to Eqs.
(13.47) and (13.48), we have
$$ I_m = \frac{U_m}{1/(\omega C)} \tag{13.52} $$
and $\tan\varphi = -\infty$ (i.e., $\varphi = -\pi/2$). The quantity
$$ X_C = \frac{1}{\omega C} \tag{13.53} $$
is called the capacitive reactance. If $C$ is expressed in farads, and $\omega$ in rad s$^{-1}$, then
$X_C$ will be expressed in ohms. It follows from Eq. (13.53) that the capacitive reactance
diminishes with increasing frequency. For a steady current, 𝑋𝐶 = ∞—a steady
current cannot flow through a capacitor. Since 𝜑 = −𝜋/2, the current flowing
through a capacitor leads the voltage by 𝜋/2. Accordingly, the voltage across a
capacitor lags behind the current by 𝜋/2 (see Fig. 13.6).
Finally, suppose that we may assume $R$ to equal zero. In this case, Eq. (13.47)
becomes
$$ I_m = \frac{U_m}{|\omega L - 1/(\omega C)|}. \tag{13.54} $$
The quantity
$$ X = \omega L - \frac{1}{\omega C} = X_L - X_C \tag{13.55} $$
is called the reactance.
Equations (13.48) and (13.49) can be written in the form
$$ \tan\varphi = \frac{X}{R}, \qquad Z = \sqrt{R^2 + X^2}. $$
Thus, if the values of the resistance 𝑅 and the reactance 𝑋 are laid off along the legs
of a right triangle, then the length of the hypotenuse will numerically equal 𝑍 (see
Fig. 13.6).
Let us find the power liberated in an alternating current circuit. The instan-
taneous value of the power equals the product of the instantaneous values of the
voltage and current:
𝑃 (𝑡) = 𝑈 (𝑡)𝐼 (𝑡) = 𝑈m cos(𝜔𝑡) × 𝐼m cos(𝜔𝑡 − 𝜑). (13.56)
Taking advantage of the formula
$$ \cos\alpha\cos\beta = \frac{1}{2}\cos(\alpha - \beta) + \frac{1}{2}\cos(\alpha + \beta), $$
we can write Eq. (13.56) in the form
$$ P(t) = \frac{1}{2}U_m I_m \cos\varphi + \frac{1}{2}U_m I_m \cos(2\omega t - \varphi). \tag{13.57} $$
Fig. 13.10

Of practical interest is the time-average value of $P(t)$, which we shall denote
simply by $P$. Since the average value of $\cos(2\omega t - \varphi)$ is zero, we have
$$ P = \frac{U_m I_m}{2}\cos\varphi. \tag{13.58} $$
Inspection of Eq. (13.57) shows that the instantaneous power fluctuates about the
average value with a frequency double that of the current (Fig. 13.10).
In accordance with Eq. (13.48),
$$ \cos\varphi = \frac{R}{\sqrt{R^2 + [\omega L - 1/(\omega C)]^2}} = \frac{R}{Z}. \tag{13.59} $$
Using this value of $\cos\varphi$ in Eq. (13.58) and taking into account that $U_m/Z = I_m$, we
get
$$ P = \frac{R I_m^2}{2}. \tag{13.60} $$
The same power is developed by a direct current whose strength is
$$ I = \frac{I_m}{\sqrt{2}}. \tag{13.61} $$
Quantity (13.61) is known as the effective value of the current. Similarly, the
quantity
$$ U = \frac{U_m}{\sqrt{2}} \tag{13.62} $$
is called the effective voltage.
Expressing the average power through the effective current and voltage, we get
𝑃 = 𝑈 𝐼 cos 𝜑. (13.63)
The factor cos 𝜑 in this expression is called the power factor. Engineers try to
make cos 𝜑 as high as possible. At a low value of cos 𝜑, a large current must be
passed through a circuit to obtain the required power, and this results in greater
losses in the feeder lines.
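The averaging that leads from Eq. (13.57) to Eqs. (13.58) and (13.63) can be checked with a short numerical sketch. The function name `average_power_numeric` and the numerical values are assumptions used only for illustration.

```python
import math

def average_power_numeric(U_m, I_m, phi, omega, n=10000):
    """Average of U(t)*I(t) over one period, by the midpoint rule."""
    T = 2.0 * math.pi / omega
    dt = T / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * dt
        total += U_m * math.cos(omega * t) * I_m * math.cos(omega * t - phi)
    return total * dt / T

# Assumed illustrative values
U_m, I_m, phi, omega = 310.0, 2.0, 0.6, 2.0 * math.pi * 50.0

P_numeric = average_power_numeric(U_m, I_m, phi, omega)
P_formula = 0.5 * U_m * I_m * math.cos(phi)                            # Eq. (13.58)
P_effective = (U_m / math.sqrt(2)) * (I_m / math.sqrt(2)) * math.cos(phi)  # Eq. (13.63)
print(P_numeric, P_formula, P_effective)
```

All three numbers should agree to within rounding error, since the term oscillating at the doubled frequency averages to zero over a period.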
PART II

WAVES

Chapter 14
ELASTIC WAVES

14.1. Propagation of Waves in an Elastic Medium

If at any place of an elastic (solid or fluid) medium its particles are made to oscillate,
then owing to interaction between the particles, this oscillation will propagate in
the medium from particle to particle with a certain velocity 𝑣. The process of the
propagation of oscillations in space is called a wave.
The particles of a medium in which a wave is propagating are not made to
perform translational motion by the wave; they only oscillate about their equilibrium
positions. Depending on the direction of oscillations of particles relative
to the direction of propagation of the wave, longitudinal and transverse waves
are distinguished. In the former, the particles of the medium oscillate along the
direction of propagation of the wave. In transverse waves, the particles of the
medium oscillate in directions at right angles to the direction of wave propagation.
Elastic transverse waves can appear only in a medium having a resistance to shear.
Therefore, only longitudinal waves can appear in fluids. Both longitudinal and
transverse waves can appear in a solid.
Figure 14.1 shows the motion of the particles when a transverse wave propagates
in a medium. The numbers 1, 2, etc. designate particles spaced at a distance of 𝑣𝑇/4,
i.e., at the distance travelled by the wave during one-fourth of the period of the
oscillations performed by the particles. At the moment of time taken as zero, the
wave propagating along the axis from left to right reached particle 1. As a result, the
particle began to move upward from its equilibrium position, carrying the following
particles along. After one-fourth of a period, particle 1 reaches its extreme top
position; simultaneously, particle 2 begins to move from its equilibrium position.
After another fourth of a period elapses, the first particle will pass its equilibrium
position moving downward, the second particle will reach its extreme top position,

Fig. 14.1

Fig. 14.2

and the third particle will begin to move upward from its equilibrium position.
At the moment 𝑇, the first particle will complete a cycle of oscillation and will be
in the same state of motion as at the initial moment. The wave by the moment 𝑇,
having covered the path 𝑣𝑇, will reach particle 5.
Figure 14.2 shows how the particles move when a longitudinal wave propagates
in a medium. All the reasoning relating to the behaviour of particles in a transverse
wave can also be related to the given case with displacements to the right and left
substituted for the upward and downward ones. A glance at the figure shows that
the propagation of a longitudinal wave in a medium is attended by alternating
compressions and dilatations of the particles (the places of compression of the
particles are surrounded by a dashed line in the figure). They move in the direction of
wave propagation with the velocity 𝑣.
Figures 14.1 and 14.2 show oscillations of particles whose equilibrium positions
are on the 𝑥-axis. Actually, not only the particles along the 𝑥-axis, but the entire

Fig. 14.3

collection of particles contained in a certain volume oscillate. Spreading from the


source of oscillations, the wave process involves new and new parts of space. The
locus of the points reached by the oscillations at the moment of time 𝑡 is called the
wavefront. The latter is the surface separating the part of space already involved
in the wave process from the region in which oscillations have not yet appeared.
The locus of the points oscillating in the same phase is known as a wave surface.
A wave surface can be drawn through any point of the space involved in a wave
process. Hence, there is an infinitely great number of wave surfaces, whereas there
is only one wavefront at each moment of time. Wave surfaces remain stationary
(they pass through the equilibrium positions of particles oscillating in the same
phase). A wavefront is in constant motion.
Wave surfaces can have any shape. In the simplest cases, they are planes or
spheres. The wave in these cases is called plane or spherical, accordingly. In a plane
wave, the wave surfaces are a multitude of parallel planes, in a spherical wave they
are a multitude of concentric spheres.
Assume that a plane wave is propagating along the 𝑥-axis. Hence, all the points
of the medium whose equilibrium positions have an identical coordinate 𝑥 (but
different values of 𝑦 and 𝑧) oscillate in the same phase. Figure 14.3 shows a curve
that gives the displacement 𝜉 of points having different 𝑥’s at a certain moment
of time from their equilibrium position. This figure must not be understood as a
visible image of a wave. It shows a graph of the function 𝜉 (𝑥, 𝑡) for a certain fixed
moment of time 𝑡. Such a graph can be constructed for both a longitudinal and a
transverse wave.
The distance 𝜆 covered by a wave during the time equal to the period of os-
cillations of the particles of a medium is called the wavelength. It is obvious that
𝜆 = 𝑣𝑇, (14.1)
where 𝑣 is the velocity of the wave and 𝑇 is the period of oscillations.
The wavelength can also be defined as the distance between the closest points
of a medium that oscillate with a phase difference of 2𝜋 (see Fig. 14.3).
Substituting 1/𝜈 (𝜈 is the frequency of oscillations) for 𝑇 in Eq. (14.1), we get
𝜆𝜈 = 𝑣. (14.2)

Fig. 14.4

We can also arrive at this equation from the following considerations. In one second,
a wave source completes 𝜈 oscillations, producing during each oscillation one “crest”
and one “trough” in the medium. By the moment when the source completes its
𝜈-th oscillation, the first crest will have covered the path 𝑣. Consequently, the path 𝑣 must
contain 𝜈 crests and troughs of the wave.

14.2. Equations of a Plane and a Spherical Wave

A wave equation is an expression that gives the displacement of an oscillating


particle as a function of its coordinates 𝑥, 𝑦, 𝑧, and the time 𝑡:
𝜉 = 𝜉 (𝑥, 𝑦, 𝑧, 𝑡) (14.3)
(we have in mind the coordinates of the equilibrium position of the particle). This
function must be periodical both relative to the time t and to the coordinates 𝑥, 𝑦,
𝑧. Its periodicity in time follows from the fact that 𝜉 describes the oscillations of a
particle having the coordinates 𝑥, 𝑦, 𝑧. Its periodicity with respect to the coordinates
follows from the fact that points at a distance 𝜆 from one another oscillate in the
same way.
Let us find the form of the function 𝜉 for a plane wave assuming that the
oscillations are harmonic. For simplicity, we shall direct the coordinate axes so
that the 𝑥-axis coincides with the direction of propagation of the wave. The wave
surfaces will therefore be perpendicular to the 𝑥-axis and, since all the points of the
wave surface oscillate identically, the displacement 𝜉 will depend only on 𝑥 and on
𝑡, i.e., 𝜉 = 𝜉 (𝑥, 𝑡). Let the oscillations of the points in the plane 𝑥 = 0 (Fig. 14.4) have
the form
𝜉 (0, 𝑡) = 𝐴 cos(𝜔𝑡 + 𝛼).
Let us find the form of the oscillations of the points in the plane corresponding
to an arbitrary value of 𝑥. To travel the path from the plane 𝑥 = 0 to this plane,
the wave needs the time $\tau = x/v$ (here, $v$ is the velocity of wave propagation).
Consequently, the oscillations of the particles in the plane $x$ will lag in time by $\tau$
behind the oscillations of the particles in the plane $x = 0$, i.e., they will have the
form
$$ \xi(x, t) = A\cos[\omega(t - \tau) + \alpha] = A\cos\left[\omega\left(t - \frac{x}{v}\right) + \alpha\right]. $$
Thus, the equation of a plane wave (both a longitudinal and a transverse one)
propagating in the direction of the $x$-axis has the following form:
$$ \xi = A\cos\left[\omega\left(t - \frac{x}{v}\right) + \alpha\right]. \tag{14.4} $$
The quantity 𝐴 is the amplitude of a wave. The initial phase of the wave 𝛼 is
determined by our choice of the beginning of counting 𝑥 and 𝑡. When considering
one wave, the initial time and the coordinates are usually selected so that 𝛼 is zero.
This cannot be done, as a rule, when considering several waves jointly.
Let us fix a value of the phase in Eq. (14.4) by assuming that
$$ \omega\left(t - \frac{x}{v}\right) + \alpha = \text{constant}. \tag{14.5} $$
This expression determines the relation between the time $t$ and the place $x$ where
the phase has a fixed value. The value of $dx/dt$ ensuing from it gives the velocity
with which the given value of the phase propagates. Differentiation of Eq. (14.5)
yields
$$ dt - \frac{1}{v}\,dx = 0, $$
whence
$$ \frac{dx}{dt} = v. \tag{14.6} $$
Thus, the velocity of wave propagation 𝑣 in Eq. (14.4) is the velocity of phase propa-
gation, and in this connection it is called the phase velocity.
According to Eq. (14.6), we have d𝑥/d𝑡 > 0. Hence, Eq. (14.4) describes a wave
propagating in the direction of growing 𝑥. A wave propagating in the opposite
direction is described by the equation
$$ \xi = A\cos\left[\omega\left(t + \frac{x}{v}\right) + \alpha\right]. \tag{14.7} $$
Indeed, equating the phase of wave (14.7) to a constant and differentiating the
equation obtained, we arrive at the expression
$$ \frac{dx}{dt} = -v, $$
from which it follows that the wave given by Eq. (14.7) propagates in the direction
of diminishing $x$.
The equation of a plane wave can be given a symmetrical form relative to $x$ and
$t$. For this purpose, let us introduce the quantity
$$ k = \frac{2\pi}{\lambda}, \tag{14.8} $$
known as the wave number. Multiplying the numerator and the denominator of
Eq. (14.8) by the frequency $\nu$, we can represent the wave number in the form
$$ k = \frac{\omega}{v} \tag{14.9} $$
[see Eq. (14.2)]. Opening the parentheses in Eq. (14.4) and taking Eq. (14.9) into account,
we arrive at the following equation for a plane wave propagating along the 𝑥-axis:
𝜉 = 𝐴 cos(𝜔𝑡 − 𝑘𝑥 + 𝛼). (14.10)
The equation of a wave propagating in the direction of diminishing 𝑥 differs from
Eq. (14.10) only in the sign of the term 𝑘𝑥.
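A small numerical sketch can illustrate what Eq. (14.10) expresses: the pattern of displacements travels along the $x$-axis with the phase velocity $v$ and repeats after one wavelength. The function name `plane_wave` and the numerical values are assumptions made for this example.

```python
import math

def plane_wave(x, t, A=1.0, omega=2.0 * math.pi * 5.0, v=340.0, alpha=0.0):
    """Displacement of Eq. (14.10) with k = omega / v."""
    k = omega / v
    return A * math.cos(omega * t - k * x + alpha)

v = 340.0                          # assumed phase velocity, m/s
omega = 2.0 * math.pi * 5.0        # assumed cyclic frequency, rad/s
wavelength = 2.0 * math.pi * v / omega   # lambda = 2*pi/k

x0, t0, dt = 12.0, 0.03, 0.01
xi_here = plane_wave(x0, t0)
xi_moved = plane_wave(x0 + v * dt, t0 + dt)     # same phase, carried along x
xi_shifted = plane_wave(x0 + wavelength, t0)    # one wavelength further on
print(xi_here, xi_moved, xi_shifted)
```

The three printed displacements should coincide to within rounding error.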
In deriving Eq. (14.10), we assumed that the amplitude of the oscillations does not
depend on 𝑥. This is observed for a plane wave when the energy of the wave is not
absorbed by the medium. When a wave propagates in a medium absorbing energy,
the intensity of the wave gradually diminishes with an increasing distance from the
source of oscillations—damping of the wave is observed. Experiments show that
in a homogeneous medium such damping occurs according to an exponential law:
$A = A_0 e^{-\gamma x}$ [compare with the diminishing of the amplitude of damped oscillations
with time; see Eq. (7.102) of Vol. I]. Accordingly, the equation of a plane wave has
the following form:
$$ \xi = A_0 e^{-\gamma x}\cos(\omega t - kx + \alpha) \tag{14.11} $$
(𝐴0 is the amplitude at points in the plane 𝑥 = 0, and 𝛾 is the attenuation coefficient).
Now, let us find the equation of a spherical wave. Any real source of waves has
a certain extent. But if we limit ourselves to considering a wave at distances from
its source appreciably exceeding the dimensions of the source, then the latter may
be treated as a point one. A wave emitted by a point source in an isotropic and
homogeneous medium will be spherical. Assume that the phase of oscillations of
the source is (𝜔𝑡 + 𝛼). Hence, points on a wave surface of radius 𝑟 will oscillate with
the phase $\omega(t - r/v) + \alpha = \omega t - kr + \alpha$ (the wave needs the time $\tau = r/v$ to travel
the path 𝑟). The amplitude of the oscillations in this case, even if the energy of the
wave is not absorbed by the medium, does not remain constant—it diminishes with
the distance from the source as $1/r$ (see Sec. 14.6). Consequently, the equation of a
spherical wave has the form
$$ \xi = \frac{A}{r}\cos(\omega t - kr + \alpha), \tag{14.12} $$
where $A$ is a constant quantity numerically equal to the amplitude at a distance
of unity from the source. The dimension of $A$ equals that of the oscillating quantity
multiplied by the dimension of length. For an absorbing medium, the right-hand side of
Eq. (14.12) must be multiplied by the factor $e^{-\gamma r}$.

Fig. 14.5

We remind our reader that owing to the assumptions we have made, Eq. (14.12)
holds only when 𝑟 appreciably exceeds the dimensions of the source. When 𝑟 tends
to zero, the expression for the amplitude tends to infinity. The explanation of this
absurd result is that the equation cannot be used for small 𝑟’s.

14.3. Equation of a Plane Wave Propagating in an Arbitrary Direction

Let us find the equation of a plane wave propagating in a direction making the angles
𝛼, 𝛽, 𝛾 (not to be confused with the attenuation coefficient) with the coordinate axes
𝑥, 𝑦, 𝑧. We shall assume that the oscillations in a plane passing through the origin
of coordinates (Fig. 14.5) have the form
$$ \xi_0 = A\cos(\omega t + \alpha). \tag{14.13} $$
Let us take a wave surface (plane) at the distance $l$ from the origin of coordinates.
The oscillations in this plane will lag behind those expressed by Eq. (14.13) by the
time $\tau = l/v$:
$$ \xi = A\cos\left[\omega\left(t - \frac{l}{v}\right) + \alpha\right] = A\cos(\omega t - kl + \alpha) \tag{14.14} $$
[$k = \omega/v$; see Eq. (14.9)].
Let us express $l$ through the position vector of points on the surface being
considered. For this purpose, we shall introduce the unit vector $\hat{\boldsymbol{n}}$ of a normal to
the wave surface. A glance at Fig. 14.5 shows that the scalar product of $\hat{\boldsymbol{n}}$ and the
position vector $\boldsymbol{r}$ of any point on the surface is $l$:
$$ \hat{\boldsymbol{n}}\cdot\boldsymbol{r} = r\cos\varphi = l. $$
Substitution of $\hat{\boldsymbol{n}}\cdot\boldsymbol{r}$ for $l$ in Eq. (14.14) yields
$$ \xi = A\cos[\omega t - k(\hat{\boldsymbol{n}}\cdot\boldsymbol{r}) + \alpha]. \tag{14.15} $$
The vector
$$ \boldsymbol{k} = k\hat{\boldsymbol{n}}, \tag{14.16} $$
equal in magnitude to the wave number $k = 2\pi/\lambda$ and directed along a normal to
the wave surface is called the wave vector. Thus, Eq. (14.15) can be written in the
form
$$ \xi(\boldsymbol{r}, t) = A\cos(\omega t - \boldsymbol{k}\cdot\boldsymbol{r} + \alpha). \tag{14.17} $$
We have obtained the equation for a plane undamped wave propagating in the
direction determined by the wave vector $\boldsymbol{k}$. For a damped wave, the factor
$e^{-\gamma l} = e^{-\gamma(\hat{\boldsymbol{n}}\cdot\boldsymbol{r})}$ must be introduced into the equation.
Function (14.17) gives the deviation of a point having the position vector 𝒓 from
its equilibrium position at the moment of time 𝑡 (we remind our reader that 𝒓
determines the equilibrium position of the point). To pass over from the position
vector of a point to its coordinates 𝑥, 𝑦, 𝑧, let us express the scalar product 𝒌 · 𝒓
through the components of the vectors along the coordinate axes:
𝒌 · 𝒓 = 𝑘 𝑥 𝑥 + 𝑘 𝑦 𝑦 + 𝑘 𝑧 𝑧.
The equation of a plane wave, therefore, becomes
$$ \xi(x, y, z; t) = A\cos(\omega t - k_x x - k_y y - k_z z + \alpha). \tag{14.18} $$
Here,
$$ k_x = \frac{2\pi}{\lambda}\cos\alpha, \qquad k_y = \frac{2\pi}{\lambda}\cos\beta, \qquad k_z = \frac{2\pi}{\lambda}\cos\gamma. \tag{14.19} $$
Function (14.18) gives the deviation of a point having the coordinates $x$, $y$, $z$ at the
moment of time $t$. When $\hat{\boldsymbol{n}}$ coincides with $\hat{\boldsymbol{e}}_x$, we have $k_x = k$, $k_y = k_z = 0$, and
Eq. (14.18) transforms into Eq. (14.10). It is very convenient to write the equation of a
plane wave in the form
$$ \xi = \operatorname{Re}\left[A e^{i(\omega t - \boldsymbol{k}\cdot\boldsymbol{r} + \alpha)}\right]. \tag{14.20} $$
The symbol Re is usually omitted, having in mind that only the real part of the
relevant expression is taken. In addition, the complex number
$$ \hat{A} = A e^{i\alpha}, \tag{14.21} $$
called the complex amplitude is introduced. The magnitude of this number gives
the amplitude, and the argument, the initial phase of the wave.
Thus, the equation of a plane undamped wave can be written in the form
$$ \xi = \hat{A} e^{i(\omega t - \boldsymbol{k}\cdot\boldsymbol{r})}. \tag{14.22} $$
The advantages of writing the equation in this form will come to light later.
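A brief sketch shows that the complex form reproduces the real plane wave once the real part is taken. All numerical values below are arbitrary assumptions chosen for illustration.

```python
import cmath
import math

# Arbitrary assumed values
A, alpha = 2.0, 0.3            # amplitude and initial phase
omega, t = 3.0, 0.7            # cyclic frequency and moment of time
k_vec = (1.0, 2.0, 0.5)        # components of the wave vector
r_vec = (0.4, -1.2, 2.5)       # components of the position vector

A_hat = A * cmath.exp(1j * alpha)                       # complex amplitude
k_dot_r = sum(ki * ri for ki, ri in zip(k_vec, r_vec))  # scalar product k.r

xi_complex = (A_hat * cmath.exp(1j * (omega * t - k_dot_r))).real
xi_real = A * math.cos(omega * t - k_dot_r + alpha)

print(xi_complex, xi_real)
print(abs(A_hat), cmath.phase(A_hat))   # amplitude and initial phase recovered
```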

14.4. The Wave Equation

The equation of any wave is the solution of a differential equation called the wave
equation. To establish the form of the wave equation, let us compare the second
partial derivatives with respect to the coordinates and time of function (14.18) de-
scribing a plane wave. Differentiating this function twice with respect to each of
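the variables, we get
$$ \frac{\partial^2\xi}{\partial t^2} = -\omega^2 A\cos(\omega t - \boldsymbol{k}\cdot\boldsymbol{r} + \alpha) = -\omega^2\xi, $$
$$ \frac{\partial^2\xi}{\partial x^2} = -k_x^2 A\cos(\omega t - \boldsymbol{k}\cdot\boldsymbol{r} + \alpha) = -k_x^2\xi, $$
$$ \frac{\partial^2\xi}{\partial y^2} = -k_y^2 A\cos(\omega t - \boldsymbol{k}\cdot\boldsymbol{r} + \alpha) = -k_y^2\xi, $$
$$ \frac{\partial^2\xi}{\partial z^2} = -k_z^2 A\cos(\omega t - \boldsymbol{k}\cdot\boldsymbol{r} + \alpha) = -k_z^2\xi. $$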
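Summation of the derivatives with respect to the coordinates yields
$$ \frac{\partial^2\xi}{\partial x^2} + \frac{\partial^2\xi}{\partial y^2} + \frac{\partial^2\xi}{\partial z^2} = -(k_x^2 + k_y^2 + k_z^2)\xi = -k^2\xi. \tag{14.23} $$
Comparing this sum with the time derivative and substituting $1/v^2$ for $k^2/\omega^2$ [see
Eq. (14.9)], we get the equation
$$ \frac{\partial^2\xi}{\partial x^2} + \frac{\partial^2\xi}{\partial y^2} + \frac{\partial^2\xi}{\partial z^2} = \frac{1}{v^2}\frac{\partial^2\xi}{\partial t^2}. \tag{14.24} $$
This is exactly the wave equation. It can be written in the form
$$ \Delta\xi = \frac{1}{v^2}\frac{\partial^2\xi}{\partial t^2}, \tag{14.25} $$
where $\Delta$ is the Laplacian operator [see Eq. (1.104)].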
It is easy to convince ourselves that the wave equation is satisfied not only by
function (14.18), but also by any function of the form
𝑓 (𝑥, 𝑦, 𝑧; 𝑡) = 𝑓 (𝜔𝑡 − 𝑘 𝑥 𝑥 − 𝑘 𝑦 𝑦 − 𝑘 𝑧 𝑧 + 𝛼). (14.26)
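Indeed, denoting the expression in parentheses in the right-hand side of Eq. (14.26)
by $\zeta$, we have
$$ \frac{\partial f}{\partial t} = \frac{\partial f}{\partial\zeta}\frac{\partial\zeta}{\partial t} = f'\omega, \qquad \frac{\partial^2 f}{\partial t^2} = \omega\frac{\partial f'}{\partial\zeta}\frac{\partial\zeta}{\partial t} = \omega^2 f''. \tag{14.27} $$
Similarly,
$$ \frac{\partial^2 f}{\partial x^2} = k_x^2 f'', \qquad \frac{\partial^2 f}{\partial y^2} = k_y^2 f'', \qquad \frac{\partial^2 f}{\partial z^2} = k_z^2 f''. \tag{14.28} $$
Introducing Eqs. (14.27) and (14.28) into Eq. (14.24), we arrive at the conclusion that
function (14.26) satisfies the wave equation if we assume that $v = \omega/k$.

Fig. 14.6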


Any function satisfying an equation of the form of Eq. (14.24) describes a wave;
the square root of the quantity that is the reciprocal of the coefficient of ∂2 𝜉/∂𝑡2
gives the phase velocity of this wave.
We must note that for a plane wave propagating along the $x$-axis, the wave
equation has the form
$$ \frac{\partial^2\xi}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2\xi}{\partial t^2}. \tag{14.29} $$
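The statement that any function of the argument $\omega t - kx + \alpha$ satisfies the one-dimensional wave equation can be checked numerically with central differences. The pulse shape $e^{-\zeta^2}$ and the numerical values are assumptions chosen only for illustration.

```python
import math

omega, k = 6.0, 2.0          # assumed values
v = omega / k                # phase velocity, v = omega/k

def xi(x, t):
    """A non-harmonic profile f(zeta) = exp(-zeta**2), zeta = omega*t - k*x."""
    zeta = omega * t - k * x
    return math.exp(-zeta * zeta)

def second_derivative(g, s, h=1e-4):
    """Central-difference estimate of g''(s)."""
    return (g(s + h) - 2.0 * g(s) + g(s - h)) / (h * h)

x0, t0 = 0.3, 0.2
d2_dx2 = second_derivative(lambda x: xi(x, t0), x0)
d2_dt2 = second_derivative(lambda t: xi(x0, t), t0)

# The two sides of the one-dimensional wave equation
print(d2_dx2, d2_dt2 / v**2)
```

The two printed numbers should agree to within the accuracy of the finite-difference estimate.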

14.5. Velocity of Elastic Waves in a Solid Medium

Assume that a longitudinal plane wave propagates in the direction of the 𝑥-axis. Let
us separate in the medium a cylindrical volume with a base area of 𝑆 and a height
of 𝛥𝑥 (Fig. 14.6). The displacements 𝜉 of particles with different 𝑥’s are different at
each moment of time (see Fig. 14.3 showing 𝜉 against 𝑥). If the base of the cylinder
with the coordinate 𝑥 has at a certain moment of time the displacement 𝜉, then the
displacement of a base with the coordinate 𝑥 + 𝛥𝑥 will be 𝜉 + 𝛥𝜉. Therefore, the
volume being considered will be deformed—it receives the elongation 𝛥𝜉 (𝛥𝜉 is
an algebraic quantity, 𝛥𝜉 < 0 corresponds to compression of the cylinder) or the
relative elongation 𝛥𝜉/𝛥𝑥. The quantity 𝛥𝜉/𝛥𝑥 gives the average deformation of
the cylinder. Since 𝜉 varies with 𝑥 according to a non-linear law, the true deforma-
tion in different cross sections of the cylinder will differ. To obtain the deformation
(strain) in the cross section $x$, we must make $\Delta x$ tend to zero. Thus,
$$ \varepsilon = \frac{\partial\xi}{\partial x} \tag{14.30} $$
(we have used the symbol of the partial derivative because $\xi$ depends not only on $x$,
but also on $t$).

Fig. 14.7

The presence of tensile strain points to the existence of the normal stress 𝜎
which at small strains is proportional to the strain. According to Eq. (2.30) of Vol. I,
$$ \sigma = E\varepsilon = E\frac{\partial\xi}{\partial x} \tag{14.31} $$
(𝐸 is Young’s modulus of the medium). We must note that the unit strain ∂𝜉/∂𝑥
and, consequently, the stress 𝜎 at a fixed moment of time depend on 𝑥 (Fig. 14.7).
Where the deviations of the particles from their equilibrium position are maximum,
the strain and the stress are zero. Where the particles are passing through their
equilibrium position, the strain and stress reach their maximum values, the positive
and negative strains (i.e., tensions and compressions) alternating. Accordingly,
as we have already noted in Sec. 14.1, a longitudinal wave consists of alternating
compressions and dilatations of the medium.
Let us revert to the cylindrical volume depicted in Fig. 14.6 and write an equation
of motion for it. Assuming that $\Delta x$ is very small, we can consider that the projection
of the acceleration onto the $x$-axis is the same for all points of the cylinder and is
$\partial^2\xi/\partial t^2$. The mass of the cylinder is $\rho S\,\Delta x$, where $\rho$ is the density of the undeformed
medium. The projection onto the $x$-axis of the force acting on the cylinder equals
the product of the area $S$ of the cylinder base and the difference between the normal
stresses in the cross sections $(x + \Delta x + \xi + \Delta\xi)$ and $(x + \xi)$:
$$ F_x = SE\left[\left(\frac{\partial\xi}{\partial x}\right)_{x+\Delta x+\xi+\Delta\xi} - \left(\frac{\partial\xi}{\partial x}\right)_{x+\xi}\right]. \tag{14.32} $$
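The value of the derivative ∂𝜉/∂𝑥 in the section 𝑥 + 𝛿 can be written with great
accuracy for small values of 𝛿 in the form
$$\left(\frac{\partial \xi}{\partial x}\right)_{x+\delta} = \left(\frac{\partial \xi}{\partial x}\right)_{x} + \left[\frac{\partial}{\partial x}\left(\frac{\partial \xi}{\partial x}\right)\right]_{x}\delta = \left(\frac{\partial \xi}{\partial x}\right)_{x} + \frac{\partial^2 \xi}{\partial x^2}\,\delta, \tag{14.33}$$
where by ∂²𝜉/∂𝑥² is meant the value of the second partial derivative of 𝜉 with respect
to 𝑥 in the cross section 𝑥.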
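Owing to the smallness of the quantities 𝛥𝑥, 𝜉, and 𝛥𝜉, we can perform
transformation (14.33) in Eq. (14.32):
$$\begin{aligned} F_x &= SE\left\{\left[\left(\frac{\partial \xi}{\partial x}\right)_{x} + \frac{\partial^2 \xi}{\partial x^2}(\Delta x + \xi + \Delta\xi)\right] - \left[\left(\frac{\partial \xi}{\partial x}\right)_{x} + \frac{\partial^2 \xi}{\partial x^2}\,\xi\right]\right\} \\ &= SE\,\frac{\partial^2 \xi}{\partial x^2}(\Delta x + \Delta\xi) \approx SE\,\frac{\partial^2 \xi}{\partial x^2}\,\Delta x \end{aligned} \tag{14.34}$$
[the relative elongation ∂𝜉/∂𝑥 in elastic deformations is much smaller than unity.
Consequently, 𝛥𝜉 ≪ 𝛥𝑥 so that the addend 𝛥𝜉 in the sum (𝛥𝑥 + 𝛥𝜉) may be
disregarded].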
Introducing the found values of the mass, acceleration, and force into the
equation of Newton’s second law, we get
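$$\rho S\,\Delta x\,\frac{\partial^2 \xi}{\partial t^2} = SE\,\frac{\partial^2 \xi}{\partial x^2}\,\Delta x.$$
Finally, cancelling 𝑆 𝛥𝑥, we arrive at the equation
$$\frac{\partial^2 \xi}{\partial x^2} = \frac{\rho}{E}\,\frac{\partial^2 \xi}{\partial t^2}, \tag{14.35}$$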
which is the wave equation for the case when 𝜉 is independent of 𝑦 and 𝑧. A
comparison of Eqs. (14.29) and (14.35) shows that
$$v = \left(\frac{E}{\rho}\right)^{1/2}. \tag{14.36}$$
Thus, the phase velocity of longitudinal elastic waves equals the square root of
Young’s modulus divided by the density of the medium.
Similar calculations for transverse waves lead to the expression
$$v = \left(\frac{G}{\rho}\right)^{1/2}, \tag{14.37}$$
where 𝐺 is the shear modulus.
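
To get a feeling for the magnitudes given by Eqs. (14.36) and (14.37), we can evaluate them numerically. The sketch below is only an illustration: the values of Young's modulus, the shear modulus, and the density are rounded figures for steel that we assume here (they are not taken from the text), and the function name `phase_velocity` is our own.

```python
import math


def phase_velocity(modulus_pa: float, density_kg_m3: float) -> float:
    """Phase velocity v = (modulus / density)**(1/2), Eqs. (14.36) and (14.37)."""
    return math.sqrt(modulus_pa / density_kg_m3)


# Assumed, rounded values for steel (illustrative only, not from the text).
E_STEEL = 2.0e11   # Young's modulus, Pa
G_STEEL = 8.0e10   # shear modulus, Pa
RHO_STEEL = 7.8e3  # density, kg/m^3

v_long = phase_velocity(E_STEEL, RHO_STEEL)   # longitudinal wave, Eq. (14.36)
v_trans = phase_velocity(G_STEEL, RHO_STEEL)  # transverse wave, Eq. (14.37)
print(f"longitudinal: {v_long:.0f} m/s, transverse: {v_trans:.0f} m/s")
```

With these assumed values the longitudinal velocity comes out at roughly 5 km/s and the transverse one at roughly 3 km/s; since 𝐺 is smaller than 𝐸, transverse waves are the slower ones.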

14.6. Energy of an Elastic Wave

Assume that the plane longitudinal wave,


𝜉 = 𝐴 cos(𝜔𝑡 − 𝑘𝑥 + 𝛼)
[see Eq. (14.10)], is propagating in the direction of the 𝑥-axis in a certain medium.
Let us separate in this medium an elementary volume 𝛥𝑉 so small that the velocity
and the strain at all the points of this volume may be considered the same and equal,
respectively, to ∂𝜉/∂𝑡 and ∂𝜉/∂𝑥.

The volume we have separated has the kinetic energy
$$\Delta W_{\mathrm{k}} = \frac{\rho}{2}\left(\frac{\partial \xi}{\partial t}\right)^{2}\Delta V \tag{14.38}$$
(𝜌𝛥𝑉 is the mass of the volume, and ∂𝜉/∂𝑡 is its velocity).
According to Eq. (3.81) of Vol. I, the volume being considered also has the
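potential energy of elastic deformation
$$\Delta W_{\mathrm{p}} = \frac{E\varepsilon^{2}}{2}\,\Delta V = \frac{E}{2}\left(\frac{\partial \xi}{\partial x}\right)^{2}\Delta V$$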
(𝜀 = ∂𝜉/∂𝑥 is the relative elongation of the cylinder, 𝐸 is Young’s modulus of the
medium). Let us use Eq. (14.36) to substitute 𝜌𝑣² for Young’s modulus (𝜌 is the density
of the medium, and 𝑣 is the phase velocity of the wave). Hence, the expression for
the potential energy of the volume 𝛥𝑉 acquires the form
$$\Delta W_{\mathrm{p}} = \frac{\rho v^{2}}{2}\left(\frac{\partial \xi}{\partial x}\right)^{2}\Delta V. \tag{14.39}$$
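The sum of Eqs. (14.38) and (14.39) gives the total energy
$$\Delta W = \Delta W_{\mathrm{k}} + \Delta W_{\mathrm{p}} = \frac{1}{2}\,\rho\left[\left(\frac{\partial \xi}{\partial t}\right)^{2} + v^{2}\left(\frac{\partial \xi}{\partial x}\right)^{2}\right]\Delta V. \tag{14.40}$$
Dividing this energy by the volume 𝛥𝑉 in which it is contained, we get the energy
density
$$w = \frac{1}{2}\,\rho\left[\left(\frac{\partial \xi}{\partial t}\right)^{2} + v^{2}\left(\frac{\partial \xi}{\partial x}\right)^{2}\right]. \tag{14.41}$$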
Differentiation of Eq. (14.10) once with respect to 𝑡 and another time with respect
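to 𝑥 yields:
$$\frac{\partial \xi}{\partial t} = -A\omega\sin(\omega t - kx + \alpha), \qquad \frac{\partial \xi}{\partial x} = kA\sin(\omega t - kx + \alpha).$$
Introducing these equations into Eq. (14.41) and taking into account that 𝑘²𝑣² = 𝜔²,
we get
$$w = \rho A^{2}\omega^{2}\sin^{2}(\omega t - kx + \alpha). \tag{14.42}$$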
A similar expression for the energy density is obtained for a transverse wave.
It can be seen from Eq. (14.42) that the energy density at each moment of time is
different at different points of space. At the same point, the energy density varies
with time as the square of the sine. The average value of the sine squared is one-half.
Accordingly, the time-averaged value of the energy density at each point of a medium
is
$$\langle w\rangle = \frac{1}{2}\,\rho A^{2}\omega^{2}. \tag{14.43}$$

Fig. 14.8

The energy density given by Eq. (14.42) and its average value [Eq. (14.43)] are propor-
tional to the density of the medium 𝜌, the square of the frequency 𝜔, and the square
of the wave amplitude 𝐴. Such a relation holds not only for an undamped plane
wave, but also for other kinds of waves (a plane damped wave, a spherical wave,
etc.).
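
The passage from Eq. (14.41) to Eqs. (14.42) and (14.43) can be checked numerically. The following sketch is only an illustration under assumed parameter values (the density, amplitude, frequency, velocity, and initial phase below are arbitrary choices of ours, not data from the text). It evaluates the energy density from the derivatives of 𝜉, compares it with the closed form, and averages it over one period at a fixed point.

```python
import math

# Assumed illustrative parameters (not from the text).
rho = 1.2                     # density of the medium, kg/m^3
A = 1.0e-6                    # displacement amplitude, m
omega = 2 * math.pi * 440.0   # cyclic frequency, rad/s
v = 340.0                     # phase velocity, m/s
alpha = 0.3                   # initial phase, rad
k = omega / v                 # wave number, so that k*v = omega


def energy_density(x: float, t: float) -> float:
    """w from Eq. (14.41), using the derivatives of xi = A cos(omega t - k x + alpha)."""
    phase = omega * t - k * x + alpha
    dxi_dt = -A * omega * math.sin(phase)
    dxi_dx = k * A * math.sin(phase)
    return 0.5 * rho * (dxi_dt**2 + v**2 * dxi_dx**2)


def closed_form(x: float, t: float) -> float:
    """w from Eq. (14.42)."""
    return rho * A**2 * omega**2 * math.sin(omega * t - k * x + alpha) ** 2


# Time average over one period at the fixed point x0, to compare with Eq. (14.43).
period = 2 * math.pi / omega
n = 10_000
x0 = 0.25
mean_w = sum(energy_density(x0, i * period / n) for i in range(n)) / n
expected = 0.5 * rho * A**2 * omega**2
print(mean_w, expected)
```

The kinetic and potential contributions are equal at every instant, which is why the factor 1/2 of Eq. (14.41) disappears in Eq. (14.42) and reappears only after time averaging.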
Thus, a medium in which a wave is propagating has an additional store of
energy. The latter is supplied to the different points of the medium from the source
of oscillations by the wave itself; consequently, a wave carries energy with it. The
amount of energy carried by a wave through a surface in unit time is called the
energy flux through this surface. If the energy d𝑊 is carried through a given
surface during the time d𝑡, then the energy flux 𝛷 is
$$\Phi = \frac{\mathrm{d}W}{\mathrm{d}t}. \tag{14.44}$$
The energy flux is a scalar quantity whose dimension equals that of energy divided
by the dimension of time, i.e., coincides with the dimension of power. Accordingly,
𝛷 is measured in watts, erg s⁻¹, etc.
The energy flux at different points of a medium can have a different intensity. To
characterize the flow of energy at different points of space, a vector quantity called
the density of the energy flux is introduced. It numerically equals the energy
flux through a unit area placed at the given point perpendicular to the direction
in which the energy is being transferred. The direction of the vector of the energy
flux density coincides with that of energy transfer.
Assume that the energy 𝛥𝑊 is transferred during the time 𝛥𝑡 through the area
𝛥𝑆⊥ perpendicular to the direction of propagation of a wave. The energy flux
density will therefore be
$$j = \frac{\Delta\Phi}{\Delta S_{\perp}} = \frac{\Delta W}{\Delta S_{\perp}\,\Delta t} \tag{14.45}$$
[see Eq. (14.44)]. The energy 𝛥𝑊 confined in a cylinder with the base 𝛥𝑆⊥ and the
altitude 𝑣 𝛥𝑡 (𝑣 is the phase velocity of the wave) will be transferred through the area
𝛥𝑆⊥ (Fig. 14.8) during the time 𝛥𝑡. If the dimensions of the cylinder are sufficiently
small (as a result of the smallness of 𝛥𝑆⊥ and 𝛥𝑡) to consider that the energy density
at all points of the cylinder is the same, then 𝛥𝑊 can be found as the product of the
energy density 𝑤 and the volume of the cylinder equal to 𝛥𝑆⊥ 𝑣 𝛥𝑡:
𝛥𝑊 = 𝑤 𝛥𝑆⊥ 𝑣 𝛥𝑡.
Using this expression in Eq. (14.45), we get the following equation for the density of
the energy flux:
𝑗 = 𝑤𝑣. (14.46)
Finally, introducing the vector 𝒗 whose magnitude equals the phase velocity of the
wave and whose direction coincides with that of wave propagation (and energy
transfer), we can write
𝒋 = 𝑤𝒗. (14.47)
We have obtained an expression for the vector of the energy flux density. This
vector was first introduced by the outstanding Russian physicist Nikolai Umov
(1846-1915) and is called Umov’s vector.
The vector given by Eq. (14.47), like the energy density 𝑤, is different at different
points of space. At a given point, it varies in time according to a sine square law. Its
average value is
$$\langle \boldsymbol{j}\rangle = \langle w\rangle\,\boldsymbol{v} = \frac{1}{2}\,\rho A^{2}\omega^{2}\boldsymbol{v} \tag{14.48}$$
[see Eq. (14.43)]. Equation (14.48), like Eq. (14.43), holds for a wave of any kind (spheri-
cal, damped, etc.). We shall note that when we speak of the intensity of a wave at a
given point, we have in mind the time-averaged value of the density of the energy
flux transferred by the wave.
Knowing 𝒋 for all the points of an arbitrary surface 𝑆, we can calculate the
energy flux through this surface. For this purpose, let us divide the surface into
elementary areas d𝑆. During the time d𝑡, the energy d𝑊 confined in the oblique
cylinder shown in Fig. 14.9 will pass through area d𝑆. The volume of this cylinder
is d𝑉 = 𝑣 d𝑡 d𝑆 cos 𝜑. It contains the energy d𝑊 = 𝑤 d𝑉 = 𝑤𝑣 d𝑡 d𝑆 cos 𝜑 (here,
𝑤 is the instantaneous value of the energy density where area d𝑆 is). Taking into
account that
𝑤𝑣 d𝑆 cos 𝜑 = 𝑗 d𝑆 cos 𝜑 = 𝒋 · d𝑺
(d𝑺 = 𝒏̂ d𝑆; see Fig. 14.9), we can write: d𝑊 = 𝒋 · d𝑺 d𝑡. Hence, we obtain the
following equation for the energy flux d𝛷 through area d𝑆:
$$\mathrm{d}\Phi = \frac{\mathrm{d}W}{\mathrm{d}t} = \boldsymbol{j}\cdot\mathrm{d}\boldsymbol{S} \tag{14.49}$$
[compare with Eq. (1.72)]. The total energy flux through a surface equals the sum of
the elementary fluxes given by Eq. (14.49):
$$\Phi = \int_{S}\boldsymbol{j}\cdot\mathrm{d}\boldsymbol{S}. \tag{14.50}$$

Fig. 14.9

We can say in accordance with Eq. (1.74) that the energy flux equals the flux of the
vector 𝒋 through surface 𝑆.
Substituting for the vector 𝒋 in Eq. (14.50) its time-averaged value, we get the
average value of 𝛷:
$$\langle\Phi\rangle = \int_{S}\langle\boldsymbol{j}\rangle\cdot\mathrm{d}\boldsymbol{S}. \tag{14.51}$$
Let us calculate the mean value of the energy flux through an arbitrary wave
surface of an undamped spherical wave. At each point of this surface, the vectors
𝒋 and d𝑺 coincide in direction. In addition, the magnitude of the vector 𝒋 for all
points of the surface is identical. Hence,
$$\langle\Phi\rangle = \int_{S}\langle j\rangle\,\mathrm{d}S = \langle j\rangle S = \langle j\rangle\,4\pi r^{2}$$
(𝑟 is the radius of the wave surface). According to Eq. (14.48), we have ⟨𝑗⟩ = 𝜌𝐴ᵣ²𝜔²𝑣/2.
Thus,
$$\langle\Phi\rangle = 2\pi\rho\,\omega^{2} v A_{r}^{2} r^{2}$$
(𝐴ᵣ is the amplitude of the wave at a distance 𝑟 from its source). Since the energy of
the wave is not absorbed by the medium, the average energy flux through a sphere
of any radius must have the same value, i.e., the condition
$$A_{r}^{2} r^{2} = \text{constant}$$
must be observed. It follows that the amplitude 𝐴ᵣ of an undamped spherical wave
is inversely proportional to the distance 𝑟 from the wave source [see Eq. (14.12)].
Accordingly, the mean density of the energy flux ⟨𝑗⟩ is inversely proportional to
the square of the distance from the source.
For a plane damped wave, the amplitude diminishes with the distance according
to the law $A = A_0 e^{-\gamma x}$ [see Eq. (14.11)]. The average density of the energy flux (i.e.,
the wave intensity) correspondingly diminishes according to the law
$$j = j_{0}\,e^{-\varkappa x}. \tag{14.52}$$
Here, 𝜘 = 2𝛾 is a quantity called the wave absorption coefficient. Its dimension
is the reciprocal of that of length. It is easy to see that the reciprocal of 𝜘 equals the
distance over which the intensity of a wave diminishes to 1/𝑒 of its initial value.
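
The statement about the reciprocal of 𝜘 is easy to verify numerically. In the sketch below, the value of the damping factor 𝛾 and the function name `intensity` are assumptions introduced purely for illustration.

```python
import math


def intensity(j0: float, gamma: float, x: float) -> float:
    """Wave intensity after a path x, Eq. (14.52), with absorption coefficient 2*gamma."""
    kappa = 2.0 * gamma
    return j0 * math.exp(-kappa * x)


gamma = 0.05               # assumed amplitude damping factor, 1/m (illustrative)
j0 = 1.0                   # initial intensity in arbitrary units
x_e = 1.0 / (2.0 * gamma)  # the reciprocal of the absorption coefficient, m

ratio = intensity(j0, gamma, x_e) / j0         # intensity ratio at x = 1/kappa
amplitude_ratio = math.exp(-gamma * x_e)       # amplitude ratio over the same path
print(ratio, amplitude_ratio)
```

Over the distance 1/𝜘 the intensity drops to 1/𝑒 of its initial value, whereas the amplitude drops only to 𝑒^(−1/2) of its initial value, because the intensity is proportional to the square of the amplitude.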

14.7. Standing Waves

If several waves propagate in a medium simultaneously, then the oscillations of
the particles of the medium will be the geometrical sum of the oscillations which
the particles would perform if each of the waves propagated separately. Hence, the
waves are simply superposed onto one another without disturbing one another. This
statement following from experiments is called the principle of superposition
of waves.
When the oscillations due to separate waves at each point of a medium have a
constant phase difference, the waves are called coherent. (A stricter definition of
coherence will be given in Sec. 17.2). The summation of coherent waves gives rise to
the phenomenon of interference, consisting in that the oscillations at some points
amplify, and at other points weaken one another.
A very important case of interference is observed in the superposition of two
plane waves having the same amplitude and approaching each other from opposite
directions. The resulting oscillatory process is called a standing wave. Standing
waves are produced when waves are reflected from obstacles. The wave striking an
obstacle and the reflected wave travelling toward it in the opposite direction as a
result of superposition produce a standing wave.
Let us write the equations of two plane waves propagating along the 𝑥-axis in
opposite directions:
$$\xi_{1} = A\cos(\omega t - kx + \alpha_{1}), \qquad \xi_{2} = A\cos(\omega t + kx + \alpha_{2}).$$
Adding these two equations and transforming the result according to the formula
for the sum of cosines, we get
$$\xi = \xi_{1} + \xi_{2} = 2A\cos\left(kx + \frac{\alpha_{2}-\alpha_{1}}{2}\right)\cos\left(\omega t + \frac{\alpha_{1}+\alpha_{2}}{2}\right). \tag{14.53}$$
Equation (14.53) is the equation of a standing wave. To simplify it, let us choose the
origin of 𝑥 so that the difference 𝛼₂ − 𝛼₁ vanishes, and the origin of 𝑡 so that the
sum 𝛼₁ + 𝛼₂ vanishes. We shall also substitute for the wave number 𝑘 its value 2𝜋/𝜆.
Equation (14.53) now becomes
$$\xi = \left[2A\cos\left(\frac{2\pi x}{\lambda}\right)\right]\cos(\omega t). \tag{14.54}$$
A glance at Eq. (14.54) shows that at every point of a standing wave the oscillations
have the same frequency as those of the opposite waves, the amplitude depending
on 𝑥:
$$\text{amplitude} = \left|2A\cos\left(\frac{2\pi x}{\lambda}\right)\right|.$$
At the points whose coordinates comply with the condition
$$\frac{2\pi x}{\lambda} = \pm n\pi \quad (n = 0, 1, 2, \ldots), \tag{14.55}$$
the amplitude of the oscillations reaches its maximum value. These points are
known as antinodes of the standing wave. We obtain the values of the antinode
coordinates from Eq. (14.55):
$$x_{\text{anti}} = \pm n\,\frac{\lambda}{2} \quad (n = 0, 1, 2, \ldots). \tag{14.56}$$
It must be borne in mind that an antinode is not a single point, but a plane
whose points have the value of the coordinate 𝑥 determined by Eq. (14.56).
At the points whose coordinates comply with the condition
$$\frac{2\pi x}{\lambda} = \pm\left(n + \frac{1}{2}\right)\pi \quad (n = 0, 1, 2, \ldots),$$
the amplitude of the oscillations vanishes. These points are called the nodes of
the standing wave. The points of the medium at the nodes do not oscillate. The
coordinates of the nodes have the values
$$x_{\text{node}} = \pm\left(n + \frac{1}{2}\right)\frac{\lambda}{2} \quad (n = 0, 1, 2, \ldots). \tag{14.57}$$
A node, like an antinode, is not a single point, but a plane whose points have values
of the coordinate 𝑥 determined by Eq. (14.57).
Examination of Eqs. (14.56) and (14.57) shows that the distance between adjacent
antinodes, like that between adjacent nodes, is 𝜆/2. The antinodes and nodes are
displaced relative to one another by a quarter of a wavelength.
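
The relations above lend themselves to a simple numerical check. The sketch below, written under assumed values of the amplitude, wavelength, and frequency (all chosen by us for illustration), adds two opposite travelling waves with zero initial phases, compares the sum with Eq. (14.54), and lists the first few antinode and node coordinates from Eqs. (14.56) and (14.57).

```python
import math

# Assumed illustrative values (not from the text).
A = 1.0               # amplitude of each travelling wave
lam = 2.0             # wavelength
omega = 2 * math.pi   # cyclic frequency
k = 2 * math.pi / lam


def travelling_sum(x: float, t: float) -> float:
    """Sum of two opposite plane waves with zero initial phases."""
    return A * math.cos(omega * t - k * x) + A * math.cos(omega * t + k * x)


def standing(x: float, t: float) -> float:
    """Standing wave in the form of Eq. (14.54)."""
    return 2 * A * math.cos(k * x) * math.cos(omega * t)


def amplitude(x: float) -> float:
    """Amplitude of the standing wave at the coordinate x."""
    return abs(2 * A * math.cos(k * x))


antinodes = [n * lam / 2 for n in range(4)]     # Eq. (14.56), non-negative branch
nodes = [(n + 0.5) * lam / 2 for n in range(4)]  # Eq. (14.57), non-negative branch
print("antinodes:", antinodes)
print("nodes:", nodes)
```

The amplitude equals 2𝐴 at the listed antinodes and vanishes (to within rounding errors) at the listed nodes, and the first node is a quarter of a wavelength away from the first antinode.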
Let us revert to Eq. (14.54). The factor 2𝐴 cos(2𝜋 𝑥/𝜆) changes its sign after
passing through its zero value. Accordingly, the phase of the oscillations at different
sides of a node differs by 𝜋. This signifies that points at different sides of a node
oscillate in counterphase. All the points between two adjacent nodes oscillate in phase.
Figure 14.10 contains a number of “instantaneous photographs” of the deviations of
the points from their equilibrium position. The first of them corresponds to the
moment when the deviations reach their greatest absolute value. The following
“photographs” have been made at intervals of one-fourth of a period. The arrows
show the velocities of the particles.

Fig. 14.10

Fig. 14.11

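Differentiating Eq. (14.54) once with respect to 𝑡 and once with respect to 𝑥,
we find expressions for the velocity of the particles 𝜉̇ and the deformation of the
medium 𝜀:
$$\dot{\xi} = \frac{\partial \xi}{\partial t} = -2\omega A\cos\left(\frac{2\pi x}{\lambda}\right)\sin(\omega t), \tag{14.58}$$
$$\varepsilon = \frac{\partial \xi}{\partial x} = -2\,\frac{2\pi}{\lambda}\,A\sin\left(\frac{2\pi x}{\lambda}\right)\cos(\omega t). \tag{14.59}$$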
Equation (14.58) describes a standing wave of velocity, and Eq. (14.59) one of defor-
mation.
Figure 14.11 compares “instantaneous photographs” of the displacement, velocity,
and deformation for the time moments 0 and 𝑇/4. Inspection of the graphs shows
that the nodes and antinodes of the velocity coincide with their displacement
counterparts; the nodes and antinodes of the deformation, however, coincide with
the antinodes and nodes of the displacement, respectively. When 𝜉 and 𝜀 reach their
maximum values, 𝜉̇ becomes equal to zero, and vice versa. Accordingly, the energy
of a standing wave transforms twice during a period, once completely into potential
energy mainly concentrated near the nodes of the wave (where the deformation
antinodes are), and once completely into kinetic energy mainly concentrated near
the antinodes of the wave (where the antinodes of the velocity are). The result is
the transition of energy from each node to its adjacent antinodes and back. The
time-averaged energy flux in any cross section of the wave is zero.

Fig. 14.12

14.8. Oscillations of a String

When transverse oscillations are produced in a stretched string fastened at both
ends, standing waves are set up in it, and there must be nodes at the places where the
string is fastened. Hence, only such oscillations are produced with an appreciable
intensity in a string when the length of the latter is an integer multiple of half their
wavelength (Fig. 14.12). This gives the condition
$$l = n\,\frac{\lambda}{2} \quad\text{or}\quad \lambda_{n} = \frac{2l}{n} \quad (n = 1, 2, 3, \ldots) \tag{14.60}$$
(𝑙 is the length of the string). The following frequencies correspond to the
wavelengths given by Eq. (14.60):
$$\nu_{n} = \frac{v}{\lambda_{n}} = \frac{v}{2l}\,n \quad (n = 1, 2, 3, \ldots) \tag{14.61}$$
(𝑣 is the phase velocity of the wave determined by the string tension and the mass
per unit length, i.e., the linear density of the string).
The frequencies 𝜈n are called the natural frequencies of the string. The natural
frequencies are integral multiples of the frequency
$$\nu_{1} = \frac{v}{2l},$$
called the fundamental frequency.
Harmonic oscillations with frequencies according to Eq. (14.61) are called natu-
ral or normal oscillations. They are also known as harmonics. In the general
case, the oscillation of a string is a superposition of various harmonics.
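
As a brief illustration of Eq. (14.61), the natural frequencies of a string can be tabulated once the phase velocity and the length are known. In the sketch below both numbers are assumed values chosen by us so that the fundamental frequency is close to 220 Hz; the function name `natural_frequencies` is likewise our own.

```python
def natural_frequencies(v: float, length: float, count: int) -> list[float]:
    """First `count` natural frequencies nu_n = n * v / (2 * l), Eq. (14.61)."""
    return [n * v / (2.0 * length) for n in range(1, count + 1)]


# Assumed illustrative values: phase velocity 286 m/s, string length 0.65 m.
freqs = natural_frequencies(v=286.0, length=0.65, count=4)
for n, nu in enumerate(freqs, start=1):
    print(f"n = {n}: {nu:.1f} Hz")
```

Every harmonic is an integral multiple of the fundamental frequency 𝑣/(2𝑙), so that with these assumed values the list is close to 220, 440, 660, and 880 Hz.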
The oscillations of a string are remarkable in the respect that according to
classical notions, we get discrete values of one of the quantities characterizing the
oscillations (their frequency). Such a discrete nature is an exception for classical
physics. For quantum processes, it is the rule rather than an exception.

14.9. Sound

If elastic waves propagating in air have a frequency ranging from 16 Hz to 20000 Hz,
then upon reaching the human ear, they cause a sound to be perceived. Accordingly,
elastic waves in any medium having a frequency confined within the above limits
are called sound waves or simply sound. Elastic waves with frequencies below
16 Hz are called infrasound, and those with frequencies above 20000 Hz are called
ultrasound. The human ear does not hear infra- and ultrasounds.
People distinguish sounds they hear by pitch, timbre (quality), and loudness.
A definite physical characteristic of a sound wave corresponds to each of these
subjective appraisals.
Any real sound is not a simple harmonic oscillation, but is the superposition of
harmonic oscillations with a definite set of frequencies. The collection of frequencies
of the oscillations present in a given sound is called its acoustic spectrum. If a
sound contains oscillations of all the frequencies within an interval from 𝜈′ to 𝜈″,
then, the spectrum is called continuous. If a sound consists of oscillations having
the discrete frequencies 𝜈1 , 𝜈2 , 𝜈3 , etc., then, the spectrum is known as a line one.
Noises have a continuous acoustic spectrum. Oscillations with a line spectrum
produce the sensation of a sound with a more or less definite pitch. Such a sound is
called a tone sound, or simply a tone.
The pitch of a tone is determined by its fundamental (lowest) frequency. The
relative intensity of the overtones (i.e., of the oscillations of the frequencies 𝜈2 ,
𝜈3 , etc.) determines the timbre, or quality, of the sound. The different spectral
composition of sounds produced by various musical instruments makes it possible
to distinguish by ear, for example, a flute from a violin or a piano.
By the intensity of a sound is meant the time-averaged value of the density of
the energy flux carried by a sound wave. To be audible, a wave must have a certain
minimum intensity known as the threshold of hearing. This threshold differs
somewhat for different persons and depends quite greatly on the frequency of the
sound. The human ear is most sensitive to frequencies from 1000 Hz to 4000 Hz.
In this region of frequencies, the threshold of hearing averages about 10⁻¹² W m⁻².
At other frequencies, it is higher (see the bottom curve in Fig. 14.13).
At intensities of the order of 1 W m⁻² to 10 W m⁻², a wave stops being perceived
as a sound and produces only a feeling of pain and pressure in the ear. The value
of the intensity at which this occurs is known as the threshold of pain (or the
threshold of feeling). The pain threshold, like the hearing one, depends on the
frequency (see the top curve in Fig. 14.13; the data given in this figure relate to
average normal hearing).

Fig. 14.13 (curves: threshold of hearing, threshold of pain)

The subjectively estimated loudness of a sound grows much more slowly than
the intensity of the sound waves. When the intensity grows in a geometric progres-
sion, the loudness grows approximately in an arithmetical progression, i.e., linearly.
On these grounds, the loudness level 𝐿 is determined as the logarithm of the ratio
between the intensity of the given sound 𝐼 and the intensity 𝐼₀ taken as the initial
one:
$$L = \log\left(\frac{I}{I_{0}}\right). \tag{14.62}$$
The initial intensity 𝐼₀ is taken equal to 10⁻¹² W m⁻² so that the hearing threshold
at a frequency of the order of 1000 Hz is at the zero level (𝐿 = 0).
The unit of loudness level 𝐿 determined by Eq. (14.62) is called the bel (B).
Generally the decibel (dB), which is one-tenth of a bel, is preferred. The value of 𝐿
in decibels is determined by the equation
$$L = 10\log\left(\frac{I}{I_{0}}\right). \tag{14.63}$$
The ratio of two intensities 𝐼₁ and 𝐼₂ can also be expressed in decibels:
$$L_{12} = 10\log\left(\frac{I_{1}}{I_{2}}\right). \tag{14.64}$$
This equation can be used to express the reduction in the intensity (the damping)
of a wave over a certain path in decibels. Thus, for example, a damping of 20 dB
signifies that the intensity has dropped to one-hundredth of its initial value.
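
Equations (14.63) and (14.64) are easily put into a computable form. The sketch below is a minimal illustration; the function names are our own, and the logarithm is taken to the base ten, as is implied by the statement that 20 dB corresponds to a hundredfold drop in intensity.

```python
import math

I0 = 1e-12  # initial intensity, W/m^2, as adopted in the text


def loudness_level_db(intensity_w_m2: float) -> float:
    """Loudness level in decibels, Eq. (14.63)."""
    return 10.0 * math.log10(intensity_w_m2 / I0)


def intensity_ratio(damping_db: float) -> float:
    """Ratio of the final to the initial intensity for a damping given in decibels."""
    return 10.0 ** (-damping_db / 10.0)


print(loudness_level_db(1e-12))  # hearing threshold near 1000 Hz
print(loudness_level_db(10.0))   # upper end of the range of audible intensities
print(intensity_ratio(20.0))     # damping of 20 dB
```

The first two values correspond to the limits of 0 dB and 130 dB, and the last one reproduces the drop to one-hundredth of the initial intensity.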
The entire range of intensities at which a wave produces a feeling of sound
in the human ear (from 10⁻¹² W m⁻² to 10 W m⁻²) corresponds to values of the
loudness level from 0 dB to 130 dB. Table 14.1 gives approximate values of the
loudness level for selected sounds.
The energy which sound waves convey with them is extremely small. If we
assume, for example, that a glass of water completely absorbs the entire energy of
a sound wave with a loudness level of 70 dB falling on it (in this case the amount
of energy absorbed per second will be about 2 × 10⁻⁷ W), then, to heat the water
from room temperature to boiling about ten thousand years will be needed.
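
The order of magnitude of this estimate can be reproduced as follows. The absorbed power of 2 × 10⁻⁷ W is the value quoted above; the mass of water in the glass (0.2 kg), the temperature rise (80 K), and the specific heat of water (4186 J kg⁻¹ K⁻¹) are assumptions of ours introduced only for this sketch.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def heating_time_years(
    power_w: float,
    mass_kg: float = 0.2,      # assumed mass of water in the glass
    delta_t_k: float = 80.0,   # assumed rise from about 20 C to 100 C
    c_water: float = 4186.0,   # specific heat of water, J/(kg K)
) -> float:
    """Time needed to supply the heat m*c*dT at a constant absorbed power."""
    energy_j = mass_kg * c_water * delta_t_k
    return energy_j / power_w / SECONDS_PER_YEAR


years = heating_time_years(2e-7)
print(f"about {years:.0f} years")
```

With these assumptions the result is of the order of ten thousand years, in agreement with the figure given in the text.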
Ultrasonic waves can be produced in the form of directed beams like beams of
light. Directed ultrasonic beams have found a widespread application for locating
objects and determining the distance to them in water. The first to put forward the
idea of ultrasonic location was the outstanding French physicist Paul Langevin. He
implemented this idea during the First World War for detecting submarines.
At present, ultrasonic locators are used for detecting icebergs, fish shoals, and
the like.
It is general knowledge that by shouting and determining the time that elapses
until the echo arrives, i.e., the sound reflected by an obstacle—a mountain, forest,
the surface of the water in a well, etc.—we can find the distance to the obstacle
by multiplying half of this time by the speed of sound. This principle underlies
the locator (sonar) mentioned above, and also the ultrasonic echo sounder used to
measure the depth and determine the relief of the sea bottom.
Ultrasonic location permits bats to orient themselves very well when flying in
the dark. A bat periodically emits pulses of an ultrasonic frequency and according
to the reflected signals received by its ears assesses the distances to surrounding
objects with a high accuracy.

Table 14.1

Sound                                   Loudness level, dB

Ticking of a clock                              20
Whisper at a distance of 1 m                    30
Quiet conversation                              40
Speech of a moderate loudness                   60
Loud speech                                     70
Shout                                           80
Noise of an aircraft engine:
    at a distance of 5 m                       120
    at a distance of 3 m                       130

Fig. 14.14

14.10. The Velocity of Sound in Gases

A sound wave in a gas is a sequence of alternating regions of compression and
rarefaction of the gas propagating in space. Hence, the pressure at every point of
space experiences a periodically changing deviation 𝛥𝑝 from its average value 𝑝
coinciding with the pressure existing in the gas when waves are absent. Thus, the
instantaneous value of the pressure at a point of space can be written in the form
$$p' = p + \Delta p.$$
Assume that a wave is propagating along the 𝑥-axis. Let us consider the volume
of a gas in the form of a cylinder with a base area of 𝑆 and an altitude of 𝛥𝑥 (Fig. 14.14),
as we did in Sec. 14.5 when finding the velocity of elastic waves in a solid medium.
The mass of the gas confined in this volume is 𝜌𝑆 𝛥𝑥, where 𝜌 is the density of the
gas undisturbed by the wave. Owing to the smallness of 𝛥𝑥, the projection of the
acceleration onto the 𝑥-axis for all the points of the cylinder may be considered the
same and equal to ∂²𝜉/∂𝑡².
To find the projection onto the 𝑥-axis of the force exerted on the volume being
considered, we must take the product of the cylinder base area 𝑆 and the difference
between the pressures in the cross sections (𝑥 + 𝜉) and (𝑥 + 𝛥𝑥 + 𝜉 + 𝛥𝜉). Repeating
the reasoning that led us to Eq. (14.34), we get
$$F_x = -\frac{\partial p'}{\partial x}\,S\,\Delta x$$
[we remind our reader that when deriving Eq. (14.34) we took advantage of the
assumption 𝛥𝜉 ≪ 𝛥𝑥].
Thus, we have found the mass of the separated volume of gas, its acceleration,
and the force exerted on it. Now let us write the equation of Newton’s second law
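for this volume of gas:
$$(\rho S\,\Delta x)\,\frac{\partial^2 \xi}{\partial t^2} = -\frac{\partial p'}{\partial x}\,S\,\Delta x.$$
After cancelling 𝑆 𝛥𝑥, we get
$$\rho\,\frac{\partial^2 \xi}{\partial t^2} = -\frac{\partial p'}{\partial x}. \tag{14.65}$$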
The differential equation we have obtained contains two unknown functions,
namely, 𝜉 and 𝑝′. Let us express one of them through the other. To do this, we
shall find the relation between the pressure of a gas and the relative change in its
volume ∂𝜉/∂𝑥. This relation depends on the nature of the process of compression
(or rarefaction) of the gas. The compressions and rarefactions of a gas in a sound
wave follow one another so frequently that adjacent portions of the medium do not
manage to exchange heat, and the process can be considered as an adiabatic one. In
an adiabatic process, the pressure and volume of a given mass of a gas are related
by the equation
$$pV^{\gamma} = \text{constant}, \tag{14.66}$$
where 𝛾 is the ratio between the heat capacities of the gas at constant pressure and
at constant volume [see Eq. (10.42) of Vol. I].
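In accordance with Eq. (14.66):
$$p(S\,\Delta x)^{\gamma} = p'\left[S(\Delta x + \Delta\xi)\right]^{\gamma} = p'\left[S\left(\Delta x + \frac{\partial \xi}{\partial x}\,\Delta x\right)\right]^{\gamma} = p'(S\,\Delta x)^{\gamma}\left(1 + \frac{\partial \xi}{\partial x}\right)^{\gamma}.$$
Cancelling (𝑆 𝛥𝑥)^𝛾 yields:
$$p = p'\left(1 + \frac{\partial \xi}{\partial x}\right)^{\gamma}.$$
Taking advantage of the assumption ∂𝜉/∂𝑥 ≪ 1, let us expand the expression
(1 + ∂𝜉/∂𝑥)^𝛾 into a series by powers of ∂𝜉/∂𝑥 and disregard the terms of the higher
orders of smallness. The result is
$$p = p'\left(1 + \gamma\,\frac{\partial \xi}{\partial x}\right).$$
Let us solve this equation with respect to 𝑝′:
$$p' = \frac{p}{1 + \gamma\,\dfrac{\partial \xi}{\partial x}} \approx p\left(1 - \gamma\,\frac{\partial \xi}{\partial x}\right) \tag{14.67}$$
[we have used the formula 1/(1 + 𝑥) ≈ 1 − 𝑥 holding for 𝑥 ≪ 1]. It is a simple
matter to obtain an expression for 𝛥𝑝 from the relation we have found:
$$\Delta p = p' - p = -\gamma p\,\frac{\partial \xi}{\partial x}. \tag{14.68}$$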
Since the order of magnitude of 𝛾 is near unity, it follows from Eq. (14.68) that
|∂𝜉/∂𝑥| ≈ |𝛥𝑝/𝑝|. Thus, the condition ∂𝜉/∂𝑥 ≪ 1 signifies that the deviation of
the pressure from its average value is much smaller than the pressure itself. This is
indeed true: for the loudest sounds, the amplitude of oscillations of the air pressure
does not exceed 1 mmHg, whereas the atmospheric pressure 𝑝 has a value of the
order of 10³ mmHg.
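Differentiating Eq. (14.67) with respect to 𝑥, we find that
$$\frac{\partial p'}{\partial x} = -\gamma p\,\frac{\partial^2 \xi}{\partial x^2}.$$
Finally, using this value of ∂𝑝′/∂𝑥 in Eq. (14.65), we get the differential equation
$$\frac{\partial^2 \xi}{\partial x^2} = \frac{\rho}{\gamma p}\,\frac{\partial^2 \xi}{\partial t^2}.$$
Comparing it with wave equation (14.29), we get the following expression for the
velocity of sound waves in a gas:
$$v = \left(\gamma\,\frac{p}{\rho}\right)^{1/2} \tag{14.69}$$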
(we remind our reader that 𝑝 and 𝜌 are the pressure and the density of the gas
undisturbed by a wave).
At atmospheric pressure and conventional temperatures, most gases are close in
their properties to an ideal gas. Therefore, we can assume that the ratio 𝑝/𝜌 for them
equals 𝑅𝑇/𝑀, where 𝑅 is the molar gas constant, 𝑇 is the absolute temperature,
and 𝑀 is the mass of a mole of a gas [see Eq. (10.22) of Vol. I]. Introducing this value
into Eq. (14.69), we get the following equation for the velocity of sound in a gas:
$$v = \left(\frac{\gamma RT}{M}\right)^{1/2}. \tag{14.70}$$
Examination of this equation shows that the velocity of sound is proportional to
the square root of the temperature and does not depend on the pressure.
The average velocity of thermal motion of gas molecules is determined by the
formula
$$\langle v_{\text{mol}}\rangle = \left(\frac{8RT}{\pi M}\right)^{1/2}$$
[see Eq. (11.70) of Vol. I]. A comparison of this equation with Eq. (14.70) shows that
the velocity of sound in a gas is related to the average velocity of thermal motion of
its molecules by the expression
$$v = \langle v_{\text{mol}}\rangle\left(\frac{\gamma\pi}{8}\right)^{1/2}. \tag{14.71}$$
Substitution for 𝛾 of its value for air equal to 1.4 yields the expression 𝑣 ≈ 3⟨𝑣mol⟩/4.
The maximum possible value of 𝛾 is 5/3. In this case, 𝑣 ≈ 4⟨𝑣mol⟩/5. Thus, the
velocity of sound in a gas is of the same order of magnitude as the average velocity
of thermal motion of the molecules, but is always somewhat lower than ⟨𝑣mol⟩.
Let us calculate the value of the velocity of sound in air at a temperature of
290 K (room temperature). For air, we have 𝛾 = 1.40, and 𝑀 = 29 × 10⁻³ kg mol⁻¹.
The molar gas constant is 𝑅 = 8.31 J mol⁻¹ K⁻¹. Introducing these values into
Eq. (14.70), we get
$$v = \left(\frac{\gamma RT}{M}\right)^{1/2} = \left(\frac{1.4 \times 8.31 \times 290}{29 \times 10^{-3}}\right)^{1/2} = 340\ \text{m s}^{-1}.$$

The value of the sound velocity in air which we have found agrees quite well with
the value found experimentally.
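
The calculation above, together with the ratio given by Eq. (14.71), can be repeated numerically. The sketch below uses only the values quoted in the text (𝛾 = 1.40, 𝑀 = 29 × 10⁻³ kg mol⁻¹, 𝑅 = 8.31 J mol⁻¹ K⁻¹, 𝑇 = 290 K); the function names are our own.

```python
import math

R = 8.31  # molar gas constant, J/(mol K), as quoted in the text


def sound_velocity(gamma: float, temperature_k: float, molar_mass: float) -> float:
    """Velocity of sound in an ideal gas, Eq. (14.70)."""
    return math.sqrt(gamma * R * temperature_k / molar_mass)


def mean_thermal_velocity(temperature_k: float, molar_mass: float) -> float:
    """Average velocity of thermal motion of the molecules, (8RT / (pi M))**(1/2)."""
    return math.sqrt(8.0 * R * temperature_k / (math.pi * molar_mass))


v_air = sound_velocity(1.40, 290.0, 29e-3)
ratio = v_air / mean_thermal_velocity(290.0, 29e-3)  # compare with Eq. (14.71)
print(f"v = {v_air:.0f} m/s, v / <v_mol> = {ratio:.3f}")
```

The velocity comes out close to 340 m s⁻¹, and the ratio is close to 3/4; the ratio does not depend on the temperature or on the molar mass, only on 𝛾.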
Let us find the relation between the intensity of a sound wave 𝐼 and the am-
plitude of the pressure oscillations ( 𝛥𝑝)m . We mentioned in Sec. 14.9 that by the
intensity of sound is meant the average value of the density of the energy flux.
Hence,
$$I = \frac{1}{2}\,\rho A^{2}\omega^{2} v \tag{14.72}$$
[see Eq. (14.48)]. Here, 𝜌 is the density of the undisturbed gas, 𝐴 is the amplitude of
oscillations of the particles of the medium, i.e., the amplitude of the oscillations of
the displacement 𝜉, 𝜔 is the frequency, and 𝑣 the phase velocity of the wave. We
must note that in the given case the particles of the medium are understood to
be macroscopic (i.e., including a great number of molecules) volumes, and not
molecules; the linear dimensions of these volumes are much smaller than the
wavelength.
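Assume that 𝜉 changes according to the law 𝜉 = 𝐴 cos(𝜔𝑡 − 𝑘𝑥 + 𝛼). Hence,
$$\frac{\partial \xi}{\partial x} = Ak\sin(\omega t - kx + \alpha) = A\,\frac{\omega}{v}\,\sin(\omega t - kx + \alpha).$$
Introducing this value into Eq. (14.68), we obtain
$$\Delta p = -\gamma p A\,\frac{\omega}{v}\,\sin(\omega t - kx + \alpha) = -(\Delta p)_{\mathrm{m}}\sin(\omega t - kx + \alpha),$$
whence
$$A = \frac{(\Delta p)_{\mathrm{m}}\,v}{\gamma p\,\omega}. \tag{14.73}$$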
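Using this expression in Eq. (14.72), we get
$$I = \frac{1}{2}\,\rho\,\frac{(\Delta p)_{\mathrm{m}}^{2}\,v^{2}}{\gamma^{2}p^{2}\omega^{2}}\,\omega^{2} v = \frac{(\Delta p)_{\mathrm{m}}^{2}}{2\gamma^{2}\rho v}\left(\frac{\rho}{p}\right)^{2} v^{4}.$$
Taking into account that 𝑣⁴ = (𝛾𝑅𝑇/𝑀)², and (𝑝/𝜌)² = (𝑅𝑇/𝑀)² [see Eq. (14.70)
and the text preceding it], we can write that
$$I = \frac{(\Delta p)_{\mathrm{m}}^{2}}{2\rho v}. \tag{14.74}$$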
We can use this equation to calculate the approximate values of the amplitude of
air pressure oscillations corresponding to the range of loudness levels from 0 dB
to 130 dB. These values range from 3 × 10⁻⁵ Pa (2 × 10⁻⁷ mmHg) to 100 Pa (about
1 mmHg).
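
The limits quoted above follow from Eq. (14.74) solved for the pressure amplitude. In the sketch below the density of air, 1.29 kg m⁻³, is an assumed value of ours (it is not given in the text), while the velocity of sound and the initial intensity are taken from the text; the function name is hypothetical.

```python
import math


def pressure_amplitude(
    level_db: float,
    rho: float = 1.29,   # assumed density of air, kg/m^3
    v: float = 340.0,    # velocity of sound in air, m/s
    i0: float = 1e-12,   # initial intensity, W/m^2
) -> float:
    """Pressure amplitude from Eq. (14.74): (dp)_m = (2 * rho * v * I)**(1/2)."""
    intensity = i0 * 10.0 ** (level_db / 10.0)  # inverse of Eq. (14.63)
    return math.sqrt(2.0 * rho * v * intensity)


dp_threshold = pressure_amplitude(0.0)    # hearing threshold, 0 dB
dp_pain = pressure_amplitude(130.0)       # upper limit, 130 dB
print(f"{dp_threshold:.1e} Pa, {dp_pain:.0f} Pa")
```

Under these assumptions the amplitude is about 3 × 10⁻⁵ Pa at 0 dB and somewhat below 100 Pa at 130 dB, which agrees with the rounded figures in the text.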
Let us assess the amplitude of oscillations of the particles 𝐴 and that of the
velocity of the particles (𝜉̇)ₘ. We shall begin with an assessment of the quantity
𝐴 determined by Eq. (14.73). Taking into account that 𝑣/𝜔 = 𝜆/(2𝜋), we get the
expression
$$\frac{A}{\lambda} = \frac{1}{2\pi\gamma}\,\frac{(\Delta p)_{\mathrm{m}}}{p} \approx 0.1\,\frac{(\Delta p)_{\mathrm{m}}}{p} \tag{14.75}$$
(𝛾 ≈ 1.5, consequently, 2𝜋𝛾 ≈ 10). At a loudness of 130 dB, the ratio (𝛥𝑝)ₘ/𝑝 has a
value of the order of 10⁻³, while at a loudness of 60 dB this ratio is about 2 × 10⁻⁷.
The lengths of sound waves in air range from 21 m (at 𝜈 = 16 Hz) to 17 mm (at
𝜈 = 20000 Hz). Inserting these values into Eq. (14.75), we find that at a loudness of
60 dB the amplitude of oscillations of the particles is about 4 × 10⁻⁴ mm for the
longest waves and about 3 × 10⁻⁷ mm for the shortest ones. At a loudness of 130 dB,
the amplitude of oscillations for the longest waves is about 2 mm.
For harmonic oscillations, the amplitude of the velocity (𝜉̇)ₘ equals that of
the displacement 𝐴 multiplied by the cyclic frequency 𝜔: (𝜉̇)ₘ = 𝐴𝜔. Multiplying
Eq. (14.75) by 𝜔 and taking into account that 𝜆𝜔 = 2𝜋𝑣, we get
$$\frac{(\dot{\xi})_{\mathrm{m}}}{v} = \frac{1}{\gamma}\,\frac{(\Delta p)_{\mathrm{m}}}{p} \approx \frac{(\Delta p)_{\mathrm{m}}}{p}. \tag{14.76}$$
Consequently, at a loudness of 130 dB, the amplitude of the velocity is about
340 m s⁻¹ × 10⁻³ = 0.34 m s⁻¹. At a loudness of 60 dB, the amplitude of the velocity
will be of the order of 0.1 mm s⁻¹. We must note that unlike the displacement
amplitude, the velocity amplitude does not depend on the wavelength.
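
The estimates of this section can be repeated without the rounding 2𝜋𝛾 ≈ 10 and 1/𝛾 ≈ 1. The sketch below is an illustration only: it takes 𝛾 = 1.4 for air and the ratio (𝛥𝑝)ₘ/𝑝 ≈ 10⁻³ quoted in the text for 130 dB, and the function names are our own.

```python
import math


def displacement_amplitude(dp_over_p: float, wavelength: float, gamma: float = 1.4) -> float:
    """Displacement amplitude A = lambda * (dp)_m / (2 * pi * gamma * p), Eq. (14.75)."""
    return dp_over_p * wavelength / (2.0 * math.pi * gamma)


def velocity_amplitude(dp_over_p: float, v: float = 340.0, gamma: float = 1.4) -> float:
    """Velocity amplitude v * (dp)_m / (gamma * p), Eq. (14.76)."""
    return dp_over_p * v / gamma


# Values quoted in the text for a loudness of 130 dB and the longest sound waves.
a_long = displacement_amplitude(1e-3, 21.0)   # metres
u_max = velocity_amplitude(1e-3)              # metres per second
print(f"A = {a_long * 1e3:.1f} mm, velocity amplitude = {u_max:.2f} m/s")
```

Without the rounding, the displacement amplitude is still about 2 mm, while the velocity amplitude is somewhat smaller than 0.34 m s⁻¹ because the factor 1/𝛾 is retained; the order of magnitude is the same.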

Fig. 14.15

14.11. The Doppler Effect for Sound Waves

Assume that a device sensing the oscillations of the medium, which we shall call a
receiver, is placed in a fluid at a certain distance from the wave source. If the source
and the receiver of the waves are stationary relative to the medium in which the
wave is propagating, then the frequency of the oscillations picked up by the receiver
will equal the frequency 𝜈0 of the oscillations of the source. If the source or the
receiver or both are moving relative to the medium, then the frequency 𝜈 picked up
by the receiver may differ from 𝜈0 . This phenomenon is called the Doppler effect.
[It is named after the Austrian scientist Christian Doppler (1803-1853) who described
the effect for light waves.]
Let us assume that the source and the receiver move along the straight line
joining them. We shall assume the velocity of the source 𝑣s to be positive if it moves
toward the receiver and negative if it moves away from the receiver. Similarly, we
shall assume the velocity of the receiver 𝑣r to be positive if the latter moves toward
the source and negative if it moves away from the source.
If the source is stationary and oscillates with the frequency 𝜈0, then by the
moment the source completes its 𝜈0-th oscillation, the “crest” of the wave
produced by the first oscillation will have travelled the path 𝑣 in the medium (𝑣 is
the velocity of propagation of the wave relative to the medium). Hence, the 𝜈0
“crests” and “troughs” of the wave produced by the source in one second will cover
the length 𝑣. If the source is moving relative to the medium with the velocity 𝑣s,
then at the moment when the source completes its 𝜈0-th oscillation, the crest
produced by the first oscillation will be at a distance of 𝑣 − 𝑣s from the source
(Fig. 14.15). Hence, the length 𝑣 − 𝑣s will contain 𝜈0 crests and troughs of the wave,
so that the wavelength will be
$$\lambda = \frac{v - v_\mathrm{s}}{\nu_0}. \tag{14.77}$$
The stationary receiver will be passed in one second by the crests and troughs
accommodated on the length 𝑣. If the receiver is moving with the velocity 𝑣r , then
at the end of a time interval of one second it will pick up the trough which at
the beginning of this interval was at a distance numerically equal to 𝑣 from its
present position. Thus, in one second, the receiver will pick up the oscillations

Fig. 14.16

corresponding to the crests and troughs accommodated on a length numerically


equal to 𝑣 + 𝑣r (Fig. 14.16) and will oscillate with the frequency
$$\nu = \frac{v + v_\mathrm{r}}{\lambda}.$$
Substituting for 𝜆 its value from Eq. (14.77), we get
$$\nu = \nu_0\,\frac{v + v_\mathrm{r}}{v - v_\mathrm{s}}. \tag{14.78}$$
It follows from Eq. (14.78) that when the source and the receiver move so that
the distance between them diminishes, the frequency 𝜈 picked up by the
receiver will be greater than that of the source 𝜈0. If the distance between the source
and the receiver increases, 𝜈 will be less than 𝜈0.
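
Equation (14.78) and the sign convention adopted above can be illustrated by a short Python sketch. The function name, its parameter names, and the numerical values are assumptions introduced only for this illustration.

```python
def doppler_frequency(nu0, v, v_source=0.0, v_receiver=0.0):
    """Eq. (14.78): frequency picked up by the receiver.

    v          -- wave speed relative to the medium
    v_source   -- source speed, positive toward the receiver
    v_receiver -- receiver speed, positive toward the source
    """
    if v_source >= v:
        raise ValueError("Eq. (14.78) assumes the source is slower than the wave")
    return nu0 * (v + v_receiver) / (v - v_source)

# Source of 1000 Hz approaching a stationary receiver at 34 m/s (sound speed 340 m/s)
approaching = doppler_frequency(1000.0, 340.0, v_source=34.0)
# The same source moving away from the receiver at 34 m/s
receding = doppler_frequency(1000.0, 340.0, v_source=-34.0)
# Receiver approaching a stationary source at 34 m/s
receiver_moving = doppler_frequency(1000.0, 340.0, v_receiver=34.0)
```

The first and the third cases give different frequencies although the relative velocity is the same, which reflects the role of the medium in the acoustic Doppler effect.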
If the directions of the velocities 𝒗s and 𝒗r do not coincide with the straight line
passing through the source and the receiver, then, the projections of the vectors 𝒗s
and 𝒗r onto the direction of this straight line must be substituted for 𝑣s and 𝑣r in
Eq. (14.78).
Inspection of Eq. (14.78) shows that the Doppler effect for sound waves is
determined by the velocities of the source and the receiver relative to the medium in
which the sound propagates. The Doppler effect is also observed for light waves,
but the equation for the change in the frequency differs from Eq. (14.78). This is
due to the fact that no material medium exists whose oscillations would be
“light”. Therefore, the velocities of the source and the receiver of light
relative to the “medium” have no meaning. For light, we can speak only
of the relative velocity of the receiver and the source. The Doppler effect for light
waves depends on the magnitude and direction of this velocity. This effect will be
considered for light waves in Sec. 21.4.

Chapter 15
ELECTROMAGNETIC WAVES

15.1. The Wave Equation for an Electromagnetic Field

We established in Chapter 9 that a varying electric field sets up a magnetic one


which, generally speaking, is also varying. This varying magnetic field sets up an
electric field, and so on. Thus, if we use oscillating charges to produce a varying
(alternating) electromagnetic field, then, in the space surrounding the charges a
sequence of mutual transformations of an electric and a magnetic field propagating
from point to point will appear. This process will be periodic in both time and
space and, consequently, will be a wave.
We shall show that the existence of electromagnetic waves follows from Maxwell’s
equations. For a homogeneous, neutral (𝜌 = 0), non-conducting (𝒋 = 0) medium
with a constant permittivity 𝜀 and a constant permeability 𝜇, we have
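$$\frac{\partial \mathbf{B}}{\partial t} = \mu\mu_0 \frac{\partial \mathbf{H}}{\partial t}, \qquad \frac{\partial \mathbf{D}}{\partial t} = \varepsilon\varepsilon_0 \frac{\partial \mathbf{E}}{\partial t},$$
$$\nabla \cdot \mathbf{B} = \mu\mu_0 (\nabla \cdot \mathbf{H}), \qquad \nabla \cdot \mathbf{D} = \varepsilon\varepsilon_0 (\nabla \cdot \mathbf{E}).$$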
Consequently, Eqs. (9.5), (7.3), (9.13), and (2.23) can be written as follows:
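$$\nabla \times \mathbf{E} = -\mu\mu_0 \frac{\partial \mathbf{H}}{\partial t}, \tag{15.1}$$
$$\nabla \cdot \mathbf{H} = 0, \tag{15.2}$$
$$\nabla \times \mathbf{H} = \varepsilon\varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}, \tag{15.3}$$
$$\nabla \cdot \mathbf{E} = 0. \tag{15.4}$$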
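Let us take the curl of both sides of Eq. (15.1):
$$\nabla \times (\nabla \times \mathbf{E}) = -\mu\mu_0\, \nabla \times \left(\frac{\partial \mathbf{H}}{\partial t}\right). \tag{15.5}$$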

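The symbol ∇ denotes differentiation with respect to the coordinates. Changing the
order of differentiation with respect to the coordinates and time leads to the equation
$$\nabla \times \left(\frac{\partial \mathbf{H}}{\partial t}\right) = \frac{\partial}{\partial t}(\nabla \times \mathbf{H}).$$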
Making such a substitution in Eq. (15.5) and introducing the value given by Eq. (15.3)
for the curl of 𝑯 into the equation obtained, we have
$$\nabla \times (\nabla \times \mathbf{E}) = -\varepsilon\varepsilon_0\mu\mu_0 \frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{15.6}$$
According to Eq. (1.107), ∇ × (∇ × 𝑬) = ∇(∇ · 𝑬) − Δ𝑬. Because of Eq. (15.4),
the first term of this expression is zero. Consequently, the left-hand side of Eq. (15.6)
is −Δ𝑬. Thus, omitting the minus signs at both sides of the equation, we obtain
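$$\Delta \mathbf{E} = \varepsilon\varepsilon_0\mu\mu_0 \frac{\partial^2 \mathbf{E}}{\partial t^2}.$$
According to Eq. (6.15), we have 𝜀0𝜇0 = 1/𝑐². The equation can, therefore, be written
in the form
$$\Delta \mathbf{E} = \frac{\varepsilon\mu}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{15.7}$$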
Expanding the Laplacian operator, we get
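$$\frac{\partial^2 \mathbf{E}}{\partial x^2} + \frac{\partial^2 \mathbf{E}}{\partial y^2} + \frac{\partial^2 \mathbf{E}}{\partial z^2} = \frac{\varepsilon\mu}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2}. \tag{15.8}$$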
Taking a curl of both sides of Eq. (15.3) and performing similar transformations,
we arrive at the equation
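$$\frac{\partial^2 \mathbf{H}}{\partial x^2} + \frac{\partial^2 \mathbf{H}}{\partial y^2} + \frac{\partial^2 \mathbf{H}}{\partial z^2} = \frac{\varepsilon\mu}{c^2} \frac{\partial^2 \mathbf{H}}{\partial t^2}. \tag{15.9}$$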
Equations (15.8) and (15.9) are inseparably related to each other because they have
been obtained from Eqs. (15.1) and (15.3) each of which contains both 𝑬 and 𝑯.
Equations (15.8) and (15.9) are typical wave equations [see Eq. (14.24)]. Any function
satisfying such an equation describes a wave. The square root of the quantity that
is the reciprocal of the coefficient of the time derivative gives the phase velocity of
this wave. Hence, Eqs. (15.8) and (15.9) point to the fact that electromagnetic fields
can exist in the form of electromagnetic waves whose phase velocity is
$$v = \frac{c}{\sqrt{\varepsilon\mu}}. \tag{15.10}$$
In a vacuum (i.e., when 𝜀 = 𝜇 = 1), the velocity of electromagnetic waves coincides
with that of light in free space 𝑐.
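
As a numerical illustration of Eq. (15.10), the following Python sketch computes 𝑐 from 𝜀0 and 𝜇0 and then the phase velocity in a medium. The rounded value 1/𝜀0 = 4𝜋 × 9 × 10⁹ and the sample permittivity 𝜀 = 4 are assumptions made for the example.

```python
import math

MU_0 = 4 * math.pi * 1e-7          # magnetic constant, H/m
EPS_0 = 1 / (4 * math.pi * 9e9)    # electric constant, F/m (rounded value)

def phase_velocity(eps, mu):
    """Eq. (15.10): v = c / sqrt(eps * mu), with c = 1 / sqrt(eps0 * mu0)."""
    c = 1 / math.sqrt(EPS_0 * MU_0)
    return c / math.sqrt(eps * mu)

c_vacuum = phase_velocity(1.0, 1.0)   # close to 3e8 m/s
v_medium = phase_velocity(4.0, 1.0)   # half of c_vacuum
```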

15.2. Plane Electromagnetic Wave

Let us investigate a plane electromagnetic wave propagating in a neutral non-


conducting medium with a constant permittivity 𝜀 and permeability 𝜇 (𝜌 = 0, 𝒋 = 0,
𝜀 = constant, 𝜇 = constant). We shall direct the 𝑥-axis at right angles to the wave
surfaces. Hence, 𝑬 and 𝑯, and, consequently, their components along the coordinate
axes will not depend on the coordinates 𝑦 and 𝑧. For this reason, Eqs. (9.15)-(9.18)
can be simplified as follows:
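$$0 = \mu\mu_0 \frac{\partial H_x}{\partial t}, \qquad \frac{\partial E_z}{\partial x} = \mu\mu_0 \frac{\partial H_y}{\partial t}, \qquad \frac{\partial E_y}{\partial x} = -\mu\mu_0 \frac{\partial H_z}{\partial t}, \tag{15.11}$$
$$\frac{\partial B_x}{\partial x} = \mu\mu_0 \frac{\partial H_x}{\partial x} = 0, \tag{15.12}$$
$$0 = \varepsilon\varepsilon_0 \frac{\partial E_x}{\partial t}, \qquad \frac{\partial H_z}{\partial x} = -\varepsilon\varepsilon_0 \frac{\partial E_y}{\partial t}, \qquad \frac{\partial H_y}{\partial x} = \varepsilon\varepsilon_0 \frac{\partial E_z}{\partial t}, \tag{15.13}$$
$$\frac{\partial D_x}{\partial x} = \varepsilon\varepsilon_0 \frac{\partial E_x}{\partial x} = 0. \tag{15.14}$$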
Equation (15.14) and the first of Eqs. (15.13) show that 𝐸 𝑥 can depend neither on 𝑥 nor
on 𝑡. Equation (15.12) and the first of Eqs. (15.11) give the same result for 𝐻𝑥 . Hence,
𝐸 𝑥 and 𝐻𝑥 differing from zero can be due only to constant homogeneous fields
superposed onto the electromagnetic field of a wave. The wave field itself cannot
have components along the 𝑥-axis. It thus follows that the vectors 𝑬 and 𝑯 are
perpendicular to the direction of propagation of the wave, i.e., that electromagnetic
waves are transverse. We shall assume in the following that the constant fields are
absent and that 𝐸 𝑥 = 𝐻𝑥 = 0.
The last two equations (15.11) and the last two equations (15.13) can be combined
into two independent groups
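$$\frac{\partial E_y}{\partial x} = -\mu\mu_0 \frac{\partial H_z}{\partial t}, \qquad \frac{\partial H_z}{\partial x} = -\varepsilon\varepsilon_0 \frac{\partial E_y}{\partial t}, \tag{15.15}$$
$$\frac{\partial E_z}{\partial x} = \mu\mu_0 \frac{\partial H_y}{\partial t}, \qquad \frac{\partial H_y}{\partial x} = \varepsilon\varepsilon_0 \frac{\partial E_z}{\partial t}. \tag{15.16}$$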
The first group of equations relates the components 𝐸 𝑦 and 𝐻𝑧 , and the second
group, the components 𝐸 𝑧 and 𝐻 𝑦 . Assume that there was initially set up a varying
electric field 𝐸 𝑦 directed along the 𝑦-axis. According to the second of Eqs. (15.15),
this field produces the magnetic field 𝐻𝑧 directed along the 𝑧-axis. In accordance
with the first of Eqs. (15.15), the field 𝐻𝑧 produces the electric field 𝐸 𝑦 , and so on.
Neither the field 𝐸 𝑧 nor the field 𝐻 𝑦 is produced. Similarly, if the field 𝐸 𝑧 was
produced initially, then according to Eqs. (15.16) the field 𝐻 𝑦 will appear that will
set up the field 𝐸 𝑧 , etc. In this case, the fields 𝐸 𝑦 and 𝐻𝑧 are not produced. Thus, to

describe a plane electromagnetic wave, it is sufficient to take one of the systems of


equations (15.15) or (15.16) and to assume that the components in the other system
equal zero.
Let us take Eqs. (15.15) to describe a wave, assuming that 𝐸 𝑧 = 𝐻 𝑦 = 0. We
shall differentiate the first equation with respect to 𝑥 and make the substitution
(∂/∂𝑥) (∂𝐻𝑧 /∂𝑡) = (∂/∂𝑡) (∂𝐻𝑧 /∂𝑥). Next introducing ∂𝐻𝑧 /∂𝑥 from the second
equation, we get a wave equation for 𝐸 𝑦 :
$$\frac{\partial^2 E_y}{\partial x^2} = \frac{\varepsilon\mu}{c^2} \frac{\partial^2 E_y}{\partial t^2} \tag{15.17}$$
(we have substituted 1/𝑐² for 𝜀0𝜇0). Differentiating the second of Eqs. (15.15) with
respect to 𝑥, we find a wave equation for 𝐻𝑧 after similar transformations:
$$\frac{\partial^2 H_z}{\partial x^2} = \frac{\varepsilon\mu}{c^2} \frac{\partial^2 H_z}{\partial t^2}. \tag{15.18}$$
The equations obtained are a particular case of Eqs. (15.8) and (15.9).
We remind our reader that 𝐸 𝑥 = 𝐸 𝑧 = 0 and 𝐻𝑥 = 𝐻 𝑦 = 0, so that 𝐸 𝑦 = 𝐸
and 𝐻𝑧 = 𝐻. We have retained the subscripts 𝑦 and 𝑧 of 𝐸 and 𝐻 to stress the
circumstance that the vectors 𝑬 and 𝑯 are directed along mutually perpendicular
axes 𝑦 and 𝑧.
The simplest solution of Eq. (15.17) is the function
𝐸 𝑦 = 𝐸m cos(𝜔𝑡 − 𝑘𝑥 + 𝛼1 ). (15.19)
The solution of Eq. (15.18) is similar:
𝐻𝑧 = 𝐻m cos(𝜔𝑡 − 𝑘𝑥 + 𝛼2 ). (15.20)
In these equations, 𝜔 is the frequency of the wave, 𝑘 is the wave number equal
to 𝜔/𝑣, and 𝛼1 and 𝛼2 are the initial phases of the oscillations at points with the
coordinate 𝑥 = 0.
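
That function (15.19) satisfies Eq. (15.17) when 𝑘 = 𝜔/𝑣 can be checked numerically. The Python sketch below does this with central finite differences for a wave in a vacuum; the frequency, amplitude, initial phase, sample point, and step sizes are arbitrary assumptions chosen only for the check.

```python
import math

def second_derivative(f, x, h):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

# Illustrative values (assumptions): vacuum (eps = mu = 1), 100 MHz wave
c = 3e8
omega = 2 * math.pi * 1e8
k = omega / c                 # wave number k = omega / v
E_m, alpha = 1.0, 0.3

def E_y(x, t):
    """Function (15.19)."""
    return E_m * math.cos(omega * t - k * x + alpha)

x0, t0 = 0.7, 2e-9
d2E_dx2 = second_derivative(lambda x: E_y(x, t0), x0, 1e-3)
d2E_dt2 = second_derivative(lambda t: E_y(x0, t), t0, 1e-12)

# Eq. (15.17) with eps = mu = 1: the two sides should agree
# to within the finite-difference error
lhs = d2E_dx2
rhs = d2E_dt2 / c**2
```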
Introducing functions (15.19) and (15.20) into Eqs. (15.15), we get
𝑘𝐸m sin(𝜔𝑡 − 𝑘𝑥 + 𝛼1 ) = 𝜇𝜇0 𝜔𝐻m sin(𝜔𝑡 − 𝑘𝑥 + 𝛼2 ),
𝑘𝐻m sin(𝜔𝑡 − 𝑘𝑥 + 𝛼2 ) = 𝜀𝜀0 𝜔𝐸m sin(𝜔𝑡 − 𝑘𝑥 + 𝛼1 ).
For these equations to be satisfied, equality of the initial phases 𝛼1 and 𝛼2 is needed.
In addition, the following relations must be observed
𝑘𝐸m = 𝜇𝜇0 𝜔𝐻m ,
𝑘𝐻m = 𝜀𝜀0 𝜔𝐸m .
Multiplying these two equations, we find that
$$\varepsilon\varepsilon_0 E_\mathrm{m}^2 = \mu\mu_0 H_\mathrm{m}^2. \tag{15.21}$$
Thus, the oscillations of the electric and magnetic vectors in an electromagnetic

Fig. 15.1

wave occur with the same phase (𝛼1 = 𝛼2 ), while the amplitudes of these vectors
are related by the expression
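$$E_\mathrm{m}\sqrt{\varepsilon\varepsilon_0} = H_\mathrm{m}\sqrt{\mu\mu_0}. \tag{15.22}$$
For a wave propagating in a vacuum, we have
$$\frac{E_\mathrm{m}}{H_\mathrm{m}} = \left(\frac{\mu_0}{\varepsilon_0}\right)^{1/2} = \sqrt{4\pi \times 10^{-7} \times 4\pi \times 9 \times 10^{9}} = 120\pi \approx 377. \tag{15.23}$$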
In the Gaussian system of units, Eq. (15.22) becomes
$$E_\mathrm{m}\sqrt{\varepsilon} = H_\mathrm{m}\sqrt{\mu}. \tag{15.24}$$
Consequently, for a vacuum, we have 𝐸m = 𝐻m (𝐸m is measured in cgse units,
and 𝐻m in cgsm ones).
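
The ratio of the amplitudes given by Eqs. (15.22) and (15.23) in SI units is easily evaluated. The following Python sketch is only an illustration; the function name and the sample amplitude of 1 V/m are assumptions.

```python
import math

MU_0 = 4 * math.pi * 1e-7          # H/m
EPS_0 = 1 / (4 * math.pi * 9e9)    # F/m, rounded value used in Eq. (15.23)

def amplitude_ratio(eps=1.0, mu=1.0):
    """Eq. (15.22): E_m / H_m = sqrt(mu * mu0 / (eps * eps0)), in ohms."""
    return math.sqrt(mu * MU_0 / (eps * EPS_0))

ratio_vacuum = amplitude_ratio()   # 120 * pi, about 377 ohms
H_m = 1.0 / ratio_vacuum           # amplitude of H (A/m) for E_m = 1 V/m
```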
Multiplying Eq. (15.19) by the unit vector 𝒆ˆ 𝑦 of the 𝑦-axis (𝐸 𝑦 𝒆ˆ 𝑦 = 𝑬), and
Eq. (15.20) by the unit vector 𝒆ˆ 𝑧 of the 𝑧-axis (𝐻𝑧 𝒆ˆ 𝑧 = 𝑯), we get equations for a
plane electromagnetic wave in the vector form
$$\mathbf{E} = \mathbf{E}_\mathrm{m} \cos(\omega t - kx), \qquad \mathbf{H} = \mathbf{H}_\mathrm{m} \cos(\omega t - kx) \tag{15.25}$$
(we have assumed that 𝛼1 = 𝛼2 = 0).
Figure 15.1 shows an “instantaneous photograph” of a plane electromagnetic wave.
A glance at the figure shows that the vectors 𝑬 and 𝑯 form a right-handed system
with the direction of propagation of the wave. At a fixed point of space, the vectors 𝑬
and 𝑯 vary with time according to a harmonic law. They simultaneously grow from
zero, and next reach their maximum value in one-fourth of a period; if 𝑬 is directed
upward, then, 𝑯 is directed to the right (we look along the direction of propagation
of the wave). In another one-fourth of a period, both vectors simultaneously vanish.
Next, they again reach their maximum value, but this time, 𝑬 is directed downward,
and 𝑯 to the left. And, finally, upon completion of a period of oscillation, the vectors
again vanish. Such changes in the vectors 𝑬 and 𝑯 occur at all points of space, but
with a shift in phase determined by the distance between the points measured along
the 𝑥-axis.
Fig. 15.2 (labels in the figure: Inductor; Ch, choke)

15.3. Experimental Investigation of Electromagnetic Waves

The first experiments with non-optical electromagnetic waves were conducted in


1888 by the German physicist Heinrich Hertz (1857-1894). Hertz produced waves
with the aid of a vibrator which he had invented. The vibrator consisted of two
rods separated by a spark gap. When a high voltage was fed to the vibrator from an
induction coil, a spark jumped through the gap. It shorted the latter, and damped
electrical oscillations were set up in the vibrator (Fig. 15.2; the chokes shown in
the figure were intended to prevent the high-frequency current from branching
off into the inductor winding). During the time the spark burned, a great number
of oscillations were completed. They produced a train of electromagnetic waves
whose length was approximately twice that of the vibrator. By placing vibrators of
various length at the focus of a concave parabolic mirror, Hertz obtained directed
plane waves whose length ranged from 0.6 m to 10 m.
Hertz also studied the emitted wave with the aid of a half-wave vibrator having
a small spark gap at its middle. When such a vibrator was placed parallel to the
electric field strength vector of the wave, oscillations of the current and voltage were
produced in it. Since the length of the vibrator was equal to 𝜆/2, the oscillations in
it owing to resonance reached such an intensity that they caused small sparks to
jump across the spark gap.
Hertz reflected and refracted electromagnetic waves with the aid of large metal
mirrors and an asphalt prism (over 1 m in size and with a mass of 1200 kg). He
discovered that both these phenomena obey the laws established in optics for light
waves. By reflecting a running plane wave with the aid of a metal mirror to the
opposite direction, Hertz obtained a standing wave. The distance between the nodes
and antinodes of the wave made it possible to find its length 𝜆. By multiplying 𝜆 by
the frequency of oscillations 𝜈 of the vibrator, the velocity of the electromagnetic
waves was determined, and it was found to be close to 𝑐. By placing a grate of

parallel copper wires in the path of waves, Hertz discovered that the intensity of
the waves passing through the grate changes very greatly when the grate is rotated
about the beam. When the wires forming the grate were perpendicular to the vector
𝑬, the wave passed through the grate without any hindrance. When the wires were
arranged parallel to 𝑬, the wave did not pass through the grate. Thus, the transverse
nature of electromagnetic waves was proved.
Hertz’s experiments were continued by the Russian physicist Pyotr Lebedev
(1866-1912), who in 1894 obtained electromagnetic waves 6 mm long and studied how
they travel in crystals. He detected double refraction of the waves (see Sec. 19.3).
In 1896, the Russian inventor Aleksandr Popov (1859-1905) for the first time
in history transmitted a message over a distance of about 250 m with the aid of
electromagnetic waves (the words “Heinrich Hertz” were transmitted). This laid the
foundation of radio engineering.

15.4. Energy of Electromagnetic Waves

Electromagnetic waves transfer energy. According to Eq. (14.46), the density of the
energy flux can be obtained by multiplying the energy density by the wave velocity.
The density of the energy of an electromagnetic field 𝑤 consists of the density
of the energy of the electric field [determined by Eq. (4.10)] and that of the energy of
the magnetic field [determined by Eq. (8.40)]:
$$w = w_E + w_H = \frac{\varepsilon\varepsilon_0 E^2}{2} + \frac{\mu\mu_0 H^2}{2}. \tag{15.26}$$
The vectors 𝑬 and 𝑯 at a given point of space vary in the same phase¹. Therefore,
Eq. (15.22) giving the relation between the amplitude values of 𝐸 and 𝐻 also holds
for their instantaneous values. It thus follows that the densities of the energy of
the electric and magnetic fields of a wave are identical at each moment of time:
𝑤𝐸 = 𝑤𝐻 . We can, therefore, write that
$$w = 2w_E = \varepsilon\varepsilon_0 E^2. \tag{15.27}$$
Taking advantage of the fact that 𝐸√(𝜀𝜀0) = 𝐻√(𝜇𝜇0), we can write Eq. (15.27) in
the form
$$w = \sqrt{\varepsilon\varepsilon_0\mu\mu_0}\, EH = \frac{1}{v} EH,$$
where 𝑣 is the velocity of an electromagnetic wave [see Eq. (15.10)].
Multiplying the expression found for 𝑤 by the wave velocity 𝑣, we get the

¹This holds only for a non-conducting medium. The phases of 𝑬 and 𝑩 do not coincide in a
conducting medium.

magnitude of the energy flux density vector


𝑆 = 𝑤𝑣 = 𝐸𝐻. (15.28)
The vectors 𝑬 and 𝑯 are mutually perpendicular and form a right-handed system
with the direction of propagation of the wave. For this reason, the direction of
the vector 𝑬 × 𝑯 coincides with that of energy transfer, and the magnitude of this
vector is 𝐸𝐻. Hence, the vector of the density of the electromagnetic energy flux
can be written as the vector product of 𝑬 and 𝑯:
𝑺 = 𝑬 × 𝑯. (15.29)
The vector 𝑺 is known as the Poynting vector.
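
Equation (15.29) can be illustrated for the plane wave of Sec. 15.2, in which 𝑬 is directed along the 𝑦-axis and 𝑯 along the 𝑧-axis. The Python sketch below uses a hand-written vector product; the helper name `cross` and the field magnitudes are assumptions for the example.

```python
def cross(a, b):
    """Vector product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# E along y, H along z (illustrative instantaneous values)
E = (0.0, 3.0, 0.0)      # V/m
H = (0.0, 0.0, 0.008)    # A/m
S = cross(E, H)          # Eq. (15.29): directed along +x, magnitude E * H
```

The only non-zero component of `S` is the one along the 𝑥-axis, i.e., along the direction of propagation, and it equals the product 𝐸𝐻 in accordance with Eq. (15.28).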
By analogy with Eq. (14.50), the flux 𝛷 of electromagnetic energy through a surface
𝐴s can be found by integration:
$$\Phi = \int_{A_\mathrm{s}} \mathbf{S} \cdot d\mathbf{A}_\mathrm{s} \tag{15.30}$$
[in Eq. (14.50) the surface area was designated by the symbol 𝑆; since this symbol is
used to designate the Poynting vector, we were forced to introduce the symbol 𝐴s
for the surface area].
Let us consider a portion of a homogeneous cylindrical conductor through
which a steady current is flowing (Fig. 15.3) as an example of applying Eqs. (15.29)
and (15.30). We shall first consider that extraneous forces are absent on this portion
of the conductor. Hence, according to Eq. (5.22), the following relation is observed
at each point of the conductor:
$$\mathbf{j} = \sigma \mathbf{E} = \frac{1}{\rho} \mathbf{E}.$$
The steady current is distributed over the cross section of the conductor with
an identical density 𝒋. Hence, the electric field within the limits of the portion of
the conductor shown in Fig. 15.3 will be homogeneous. Let us mentally separate a
cylindrical volume of radius 𝑟 and length 𝑙 inside the conductor. At each point on
the side surface of this cylinder, the vector 𝑯 is perpendicular to the vector 𝑬 and
is directed tangentially to the surface. The magnitude of 𝑯 is 𝑗𝑟/2 [according to
Eq. (7.10), we have 2𝜋𝑟𝐻 = 𝑗𝜋𝑟²]. Thus, the Poynting vector given by Eq. (15.29) is
directed toward the axis of the conductor at each point on the surface and has the
magnitude 𝑆 = 𝐸𝐻 = 𝐸𝑗𝑟/2. Multiplying 𝑆 by the side surface area of the cylinder
𝐴s equal to 2𝜋𝑟𝑙, we find that the following flux of electromagnetic energy enters
the volume we are considering:
$$\Phi = S A_\mathrm{s} = \frac{1}{2} E j r \times 2\pi r l = E j \times \pi r^2 l = E j \times V, \tag{15.31}$$
where 𝑉 is the volume of the cylinder.

Fig. 15.3

According to Eq. (6.4), 𝐸𝑗 = 𝜌𝑗² is the amount of heat liberated in unit time per
unit volume of the conductor. Consequently, Eq. (15.31) indicates that the energy
liberated in the form of Lenz-Joule heat is supplied to the conductor through its side
surface in the form of energy of an electromagnetic field. The energy flux gradually
weakens with deeper penetration into the conductor (both the Poynting vector and
the surface through which the flux passes diminish) as a result of absorption of
energy and its conversion into heat.
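
The balance expressed by Eq. (15.31) can be verified numerically. The Python sketch below compares the flux of the Poynting vector through the side surface of the cylinder with the heat liberated in it in unit time; the resistivity, current density, and dimensions are illustrative assumptions.

```python
import math

def poynting_inflow(E, j, r, l):
    """Flux of S through the side surface of a cylinder of radius r and length l.

    Uses H = j*r/2 on the surface, S = E*H, and the side area 2*pi*r*l.
    """
    H = j * r / 2
    S = E * H
    return S * 2 * math.pi * r * l

def joule_heat_rate(E, j, r, l):
    """Heat liberated in unit time in the same cylinder: E*j times its volume."""
    return E * j * math.pi * r**2 * l

# Illustrative values (assumptions)
rho = 1.7e-8      # resistivity, ohm * m
j = 1.0e6         # current density, A/m^2
E = rho * j       # field strength from j = E / rho
flux = poynting_inflow(E, j, r=1e-3, l=1.0)
heat = joule_heat_rate(E, j, r=1e-3, l=1.0)
```

Taking a smaller radius `r` inside the same conductor reduces both quantities in proportion to 𝑟², which corresponds to the weakening of the energy flux with deeper penetration into the conductor.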
Now, let us assume that extraneous forces whose field is homogeneous are
exerted within the limits of the portion of the conductor we are considering (𝑬 ∗ =
constant). In this case according to Eq. (5.25), at each point of the conductor we have
$$\mathbf{j} = \sigma (\mathbf{E} + \mathbf{E}^*) = \frac{1}{\rho} (\mathbf{E} + \mathbf{E}^*),$$
whence
𝑬 = 𝜌𝒋 − 𝑬 ∗ . (15.32)
We shall consider that the extraneous forces on the portion of the circuit being
considered do not hamper the flow of the current, but facilitate it. This signifies
that the direction of 𝑬 ∗ coincides with that of 𝒋. Let us assume that the relation
𝜌𝑗 = 𝐸∗ is observed. Hence, according to Eq. (15.32), the electrostatic field strength
𝑬 at each point vanishes, and there is no flux of electromagnetic energy through
the side surface. In this case, heat is liberated at the expense of the work of the
extraneous forces.
If the relation 𝐸∗ > 𝜌𝑗 holds, then, as can be seen from Eq. (15.32), the vector

Fig. 15.4

𝑬 will be directed oppositely to the vector 𝒋. In this case, the vectors 𝑬 and 𝑺 will
have directions opposite to those shown in Fig. 15.3. Hence, instead of flowing in,
electromagnetic energy flows out through the side surface of the conductor into
the space surrounding it.
In summarizing, we can say that in the closed circuit of a steady current, the
energy from the sections where extraneous forces act is transmitted to other sections
of the circuit not along the conductors, but through the space surrounding the
conductors in the form of a flux of electromagnetic energy characterized by the
vector 𝑺.

15.5. Momentum of Electromagnetic Field

An electromagnetic wave absorbed in a body imparts a momentum to the body, i.e.,


exerts a pressure on it. This can be shown by the following example. Assume that a
plane wave impinges normally onto a flat surface of a weakly conducting body with
𝜀 and 𝜇 equal to unity (Fig. 15.4). The electric field of the wave produces a current of
density 𝒋 = 𝜎 𝑬 in the body. The magnetic field of the wave will act on the current
with a force whose value per unit volume of the body can be found by Eq. (6.43):
$$\mathbf{F}_\mathrm{u.v} = \mathbf{j} \times \mathbf{B} = \mu_0 (\mathbf{j} \times \mathbf{H}).$$
The direction of this force, as can be seen from Fig. 15.4, coincides with the direction
of propagation of the wave.
The momentum
d𝐾 = 𝐹u.v d𝑙 = 𝜇0 𝑗𝐻 d𝑙 (15.33)
is imparted to a surface layer having a unit area and a thickness of d𝑙 in unit time
(the vectors 𝒋 and 𝑯 are mutually perpendicular). The energy absorbed in this layer
in unit time is
d𝑊 = 𝑗𝐸 d𝑙. (15.34)

It is liberated in the form of heat.


The momentum given by Eq. (15.33) and the energy [Eq. (15.34)] are imparted to
the layer by the wave. Let us take their ratio, omitting the differential symbols as
superfluous:
$$\frac{K}{W} = \mu_0 \frac{H}{E}.$$
Taking into account that 𝜇0𝐻² = 𝜀0𝐸², we get
$$\frac{K}{W} = \sqrt{\varepsilon_0\mu_0} = \frac{1}{c}.$$
It thus follows that an electromagnetic wave carrying the energy 𝑊 has the mo-
mentum
$$K = \frac{1}{c} W. \tag{15.35}$$
The same relation between the energy and the momentum holds for particles with
a zero rest mass [see Eq. (8.57) of Vol. I]. This is not surprising because according to
quantum notions, an electromagnetic wave is equivalent to a flux of photons, i.e.,
particles whose mass (we have in mind their rest mass) is zero.
Examination of Eq. (15.35) shows that the density of the momentum (i.e., the
momentum of unit volume) of an electromagnetic field is
$$K_\mathrm{u.v} = \frac{1}{c} w. \tag{15.36}$$
The energy density is related to the magnitude of the Poynting vector by the ex-
pression 𝑆 = 𝑤𝑐. Substituting 𝑆/𝑐 for 𝑤 in Eq. (15.36) and taking into account that
the directions of the vectors 𝑲 and 𝑺 coincide, we can write that
$$\mathbf{K}_\mathrm{u.v} = \frac{1}{c^2} \mathbf{S} = \frac{1}{c^2} (\mathbf{E} \times \mathbf{H}). \tag{15.37}$$
We shall note that when energy of any kind is transferred, the density of the
energy flux equals the density of the momentum multiplied by 𝑐². Let us consider,
for example, a collection of particles distributed in space with the density 𝑛 and
flying with a velocity 𝑣 identical in magnitude and direction. In this case, the density
of the momentum is
$$\mathbf{K}_\mathrm{u.v} = n \frac{m\mathbf{v}}{\sqrt{1 - (v^2/c^2)}}. \tag{15.38}$$
The particles carry along energy whose flux density 𝒋𝑊 equals the density of the
particle flux multiplied by the energy of one particle:
$$\mathbf{j}_W = n\mathbf{v} \frac{mc^2}{\sqrt{1 - (v^2/c^2)}}. \tag{15.39}$$

It follows from Eqs. (15.38) and (15.39) that


$$\mathbf{K}_\mathrm{u.v} = \frac{1}{c^2} \mathbf{j}_W. \tag{15.40}$$
Assume that an electromagnetic wave falling normally on a body is completely
absorbed by the body. Hence, a unit of surface area of the body receives in unit
time the momentum of the wave enclosed in a cylinder with a base area of unity
and an altitude of 𝑐. According to Eq. (15.36), this momentum is (𝑤/𝑐)𝑐 = 𝑤. At
the same time, the momentum imparted to a unit surface area in unit time equals
the pressure 𝑝 on the surface. Hence, for an absorbing surface, we have 𝑝 = 𝑤.
This quantity pulsates with a very high frequency. In practice, therefore, only its
time-averaged value can be measured. Thus,
$$p = \langle w \rangle. \tag{15.41}$$
For an ideally reflecting surface, the pressure will be double this value.
The value of the pressure calculated by Eq. (15.41) is very small. For example, at
a distance of 1 m from a source of light having an intensity of a million candelas,
the pressure is only about 10⁻⁷ Pa (about 10⁻⁹ gf cm⁻²). Pyotr Lebedev succeeded
in measuring the pressure of light. By carrying out experiments requiring great
inventiveness and skill, Lebedev measured the pressure of light on solids in 1900,
and on gases in 1910. The results of the measurements completely agreed with
Maxwell’s theory.
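
The smallness of the pressure given by Eq. (15.41) is easy to see numerically. The Python sketch below uses the relations 𝑝 = ⟨𝑤⟩ and 𝑆 = 𝑤𝑐 for a vacuum; the function name and the flux density of 30 W/m² are assumptions, the latter chosen only because it leads to a pressure of the order quoted above.

```python
C = 3.0e8  # speed of light in a vacuum, m/s (rounded)

def radiation_pressure(mean_flux_density, reflecting=False):
    """Pressure of a normally incident wave, from p = <w> and <S> = <w> * c.

    mean_flux_density -- time-averaged magnitude of the Poynting vector, W/m^2
    reflecting        -- True for an ideally reflecting surface (pressure doubles)
    """
    mean_energy_density = mean_flux_density / C   # <w> = <S> / c
    factor = 2.0 if reflecting else 1.0
    return factor * mean_energy_density

p_absorbed = radiation_pressure(30.0)                     # about 1e-7 Pa
p_reflected = radiation_pressure(30.0, reflecting=True)   # double that value
```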

15.6. Dipole Emission

An oscillating electric dipole is the simplest system emitting electromagnetic waves.


An example of such a dipole is the system formed by a fixed point charge +𝑞 and
a point charge −𝑞 oscillating near it (Fig. 15.5). The dipole electric moment of this
system varies in time according to the law
$$\mathbf{p} = -q\mathbf{r} = -ql\hat{\mathbf{e}} \cos(\omega t) = \mathbf{p}_\mathrm{m} \cos(\omega t), \tag{15.42}$$
where 𝒓 is the position vector of the charge −𝑞, 𝑙 is the amplitude of oscillations, 𝒆̂ is
the unit vector directed along the dipole axis, and 𝒑m = −𝑞𝑙𝒆̂.
Acquaintance with such an emitting system is especially important in con-
nection with the fact that many questions of the interaction of radiation with a
substance can be explained classically proceeding from the notion of atoms as of
systems of charges containing electrons that are capable of performing harmonic
oscillations about their equilibrium position.
Let us consider the radiation of a dipole whose dimensions are small in comparison
with the wavelength (𝑙 ≪ 𝜆). Such a dipole is called elementary. The pattern
of the electromagnetic field in direct proximity to the dipole is very complicated.

Fig. 15.5 Fig. 15.6

It becomes simplified quite greatly in the so-called wave zone of the dipole that
begins at distances 𝑟 considerably exceeding the wavelength (𝑟 ≫ 𝜆). If a wave is
propagating in a homogeneous isotropic medium, then its wavefront in the wave
zone will be spherical (Fig. 15.6). The vectors 𝑬 and 𝑯 at each point are mutually
perpendicular and are perpendicular to the ray, i.e., to the position vector drawn to
the given point from the centre of the dipole.
Let us call sections of the wavefront by planes passing through the dipole axis
meridians, and by planes perpendicular to the dipole axis parallels. We can now
say that the vector 𝑬 at each point of a wave zone is directed along a tangent to
the meridian, and the vector 𝑯 along a tangent to the parallel. If we look along
the ray 𝑟, then the instantaneous pattern of the wave will be the same as shown
in Fig. 15.1, the only difference being that the amplitude in motion along the ray
gradually diminishes.
At each point, the vectors 𝑬 and 𝑯 oscillate according to the law cos(𝜔𝑡 − 𝑘𝑟).
The amplitudes 𝑬 m and 𝑯 m depend on the distance 𝑟 to the emitter and on the angle
𝜃 between the direction of the position vector 𝒓 and the dipole axis (see Fig. 15.6).
This dependence has the following form for a vacuum:
$$E_\mathrm{m} \propto H_\mathrm{m} \propto \frac{1}{r} \sin\theta.$$
The average value of the density of the energy flux ⟨𝑆⟩ is proportional to the product
𝐸m𝐻m, consequently,
$$\langle S \rangle \propto \frac{1}{r^2} \sin^2\theta. \tag{15.43}$$
A glance at this expression shows that the wave intensity changes along the ray (at
𝜃 = constant) in inverse proportion to the square of the distance from the emitter.
In addition, it depends on the angle 𝜃. The emission of a dipole is the greatest in
directions at right angles to its axis (𝜃 = 𝜋/2). There is no emission in the directions

Fig. 15.7

coinciding with the axis (𝜃 = 0 and 𝜋). How the intensity depends on the angle 𝜃 is
shown very illustratively with the aid of a dipole directional diagram (Fig. 15.7).
This diagram is constructed so that the length of the segment it intercepts on a ray
drawn from the centre of the dipole gives the intensity of emission at the angle
𝜃.
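
The dependence (15.43) can be tabulated with a few lines of Python. In the sketch below the constant of proportionality is set equal to unity, which is an assumption made only to obtain relative values; the function name is likewise illustrative.

```python
import math

def relative_intensity(theta, r=1.0):
    """Expression (15.43) up to a constant factor: sin^2(theta) / r^2."""
    return math.sin(theta) ** 2 / r ** 2

broadside = relative_intensity(math.pi / 2)             # maximum, at right angles to the axis
along_axis = relative_intensity(0.0)                    # no emission along the dipole axis
at_45_deg = relative_intensity(math.pi / 4)             # half of the maximum
twice_as_far = relative_intensity(math.pi / 2, r=2.0)   # one-fourth of the maximum
```

Plotting `relative_intensity(theta)` as the length of a segment laid off at the angle `theta` to the axis would give a directional diagram of the kind described above.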
The corresponding calculations show that the radiant power 𝑃 of a dipole (i.e.,
the energy emitted in all directions in unit time) is proportional to the square of
the second time derivative of the dipole moment:
$$P \propto \ddot{\mathbf{p}}^2. \tag{15.44}$$
According to Eq. (15.42), 𝒑̈² = 𝑝m² 𝜔⁴ cos²(𝜔𝑡). Introduction of this value into
expression (15.44) yields
$$P \propto p_\mathrm{m}^2 \omega^4 \cos^2(\omega t). \tag{15.45}$$
Time averaging of this expression gives
$$\langle P \rangle \propto p_\mathrm{m}^2 \omega^4. \tag{15.46}$$
Thus, the average radiant power of a dipole is proportional to the square of the
amplitude of the electric dipole moment and to the fourth power of the frequency.
Therefore, at a low frequency, the emission of electrical systems (for instance,
industrial frequency alternating current transmission lines) is insignificant.
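
How strongly the emission depends on the frequency is seen from a simple comparison based on expression (15.46). In the Python sketch below, the function name and the two frequencies (50 Hz and 1 MHz) are assumptions chosen for the illustration; the amplitudes of the dipole moment are taken to be equal.

```python
def mean_power_ratio(freq_a, freq_b, p_m_a=1.0, p_m_b=1.0):
    """Ratio <P_a> / <P_b> following from expression (15.46)."""
    return (p_m_a / p_m_b) ** 2 * (freq_a / freq_b) ** 4

# The same amplitude of the dipole moment at 50 Hz and at 1 MHz
ratio = mean_power_ratio(50.0, 1.0e6)   # (5e-5)**4, i.e. 6.25e-18
```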
According to Eq. (15.42), we have 𝒑̈ = −𝑞𝒓̈ = −𝑞𝒂, where 𝒂 is the acceleration of
an oscillating charge. Substitution of this expression for 𝒑̈ in Eq. (15.44) yields²:
$$P \propto q^2 a^2. \tag{15.47}$$
Expression (15.47) determines the radiant power not only for oscillations, but also
for arbitrary motion of a charge. A charge travelling with acceleration produces
electromagnetic waves, and the radiated power is proportional to the square of the
charge and the square of the acceleration. For example, the electrons accelerated in
a betatron (see Sec. 10.5) lose energy as a result of radiation mainly due to centripetal
acceleration 𝑎n = 𝑣2 /𝑟. According to expression (15.47), the amount of energy

²The constant of proportionality when SI units are used is √(𝜇0/𝜀0)/(6𝜋𝑐²), and when units of the
Gaussian system are used is 2/(3𝑐³).



lost grows greatly with an increasing velocity of the electrons in the betatron (in
proportion to 𝑣⁴). Hence, the possible acceleration of electrons in a betatron is
limited to about 500 MeV (at a velocity corresponding to this value, the losses due
to radiation become equal to the energy imparted to the electrons by the vortex
electric field).
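
The growth of the losses in proportion to 𝑣⁴ follows from expression (15.47) with 𝑎n = 𝑣²/𝑟. The Python sketch below uses the SI constant of proportionality given in the footnote. It is a non-relativistic illustration only: the orbit radius and the velocities are assumptions, and the electrons in a real betatron are relativistic, which this sketch does not take into account.

```python
import math

MU_0 = 4 * math.pi * 1e-7          # H/m
EPS_0 = 1 / (4 * math.pi * 9e9)    # F/m (rounded value)
C = 3.0e8                          # m/s (rounded)
Q_E = 1.6e-19                      # magnitude of the electron charge, C (rounded)

def radiated_power(q, a):
    """Expression (15.47) with the SI constant sqrt(mu0/eps0) / (6*pi*c^2)."""
    return math.sqrt(MU_0 / EPS_0) / (6 * math.pi * C**2) * q**2 * a**2

def centripetal_acceleration(v, r):
    """a_n = v^2 / r for motion along a circle of radius r."""
    return v**2 / r

# Assumed orbit radius and two assumed (non-relativistic) velocities
r_orbit = 1.0
p_slow = radiated_power(Q_E, centripetal_acceleration(1.0e6, r_orbit))
p_fast = radiated_power(Q_E, centripetal_acceleration(2.0e6, r_orbit))
# Doubling the velocity at a fixed radius increases the radiated power 16 times
```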
A charge performing harmonic oscillations emits a monochromatic wave with a
frequency equal to that of the charge oscillations. If the acceleration 𝒂 of the charge
does not change according to a harmonic law, then the radiation consists of a set of
waves of different frequencies.
According to expression (15.47), the intensity vanishes when 𝒂 = 0. Conse-
quently, an electron travelling with a constant velocity does not emit electromagnetic
waves. This holds, however, only for the case when the velocity of an electron 𝑣el
does not exceed the speed of light 𝑣l = 𝑐/√(𝜀𝜇) in the medium in which the electron
is travelling. When 𝑣el > 𝑣l, radiation is observed. It was discovered in 1934 by the
Soviet physicists Sergei Vavilov (1891-1951) and Pavel Cerenkov (born 1904).
PART III

OPTICS

Chapter 16
OPTICS

16.1. The Light Wave

Light is a complicated phenomenon: in some cases it behaves like an electromagnetic


wave, in others like a stream of special particles (photons). In the present volume,
we shall treat wave optics, i.e., the range of phenomena based on the wave nature of
light. The collection of phenomena due to the corpuscular (particulate) nature of
light will be dealt with in Volume III.
What oscillates in an electromagnetic wave are the vectors 𝑬 and 𝑯. Experi-
ments show that the physiological, photochemical, photoelectrical and other actions
of light are due to the oscillations of the electric vector. Accordingly, we shall speak
in the following of the light vector, having in mind the electric field strength vector.
We shall meanwhile make no mention of the magnetic vector of a light wave.
We shall designate the magnitude of the light vector amplitude, as a rule, by the
letter 𝐴 (sometimes by the symbol 𝐸m ). Hence, the change in space and time of the
projection of the light vector onto the direction along which it oscillates will be
described by the equation
𝐸 = 𝐴 cos(𝜔𝑡 − 𝑘𝑟 + 𝛼), (16.1)
where 𝑘 is the wave number and 𝑟 is the distance measured along the direction of
propagation of the light wave.
For a plane wave propagating in a non-absorbing medium, 𝐴 = constant; for a
spherical wave, 𝐴 diminishes in proportion to 1/𝑟, and so on.
The ratio of the speed of a light wave in a vacuum to the phase velocity 𝑣
in a medium is known as the absolute refractive index of the medium and is
designated by the letter 𝑛. Thus,
$$n = \frac{c}{v}. \tag{16.2}$$
A comparison with Eq. (15.10) shows that $n = \sqrt{\varepsilon\mu}$. For the overwhelming majority
of transparent substances, 𝜇 practically does not differ from unity. We can therefore
consider that
$$n = \sqrt{\varepsilon}. \tag{16.3}$$
Equation (16.3) relates the optical and the electrical properties of a substance. It
may seem on the face of it that this equation is wrong. For example, for water 𝜀 = 81,
whereas 𝑛 = 1.33. It must be borne in mind, however, that the value 𝜀 = 81 has
been obtained from electrostatic measurements. A different value of 𝜀 is obtained
for fast-varying electric fields, and it depends on the frequency of oscillations of the
field. This explains the dispersion of light, i.e., the dependence of the refractive
index (or speed of light) on the frequency (or wavelength). Using the value of 𝜀
obtained for the relevant frequency in Eq. (16.3) leads to the correct value of 𝑛.
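The discrepancy for water can be made concrete with a short calculation. This is only an illustrative sketch: the function name is hypothetical, and the "optical" permittivity is simply inferred backwards from the measured refractive index rather than taken from tabulated data.

```python
import math


def refractive_index(eps, mu=1.0):
    """Refractive index from n = sqrt(eps * mu); mu = 1 gives Eq. (16.3)."""
    return math.sqrt(eps * mu)


eps_static_water = 81.0    # electrostatic value quoted in the text
n_observed_water = 1.33    # optical refractive index quoted in the text

# Using the static permittivity gives a badly wrong refractive index.
n_from_static = refractive_index(eps_static_water)

# Permittivity that Eq. (16.3) requires at optical frequencies (inferred).
eps_optical_water = n_observed_water ** 2

print(n_from_static, eps_optical_water)
```

The static value would give 𝑛 = 9, whereas the observed index corresponds to a permittivity of about 1.77 at optical frequencies.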
The values of the refractive index characterize the optical density of the
medium. A medium with a greater 𝑛 is called optically denser than one with a
smaller 𝑛. Conversely, a medium with a lower 𝑛 is called optically less dense than
one with a greater 𝑛.
The wavelengths of visible light are within the following limits:
$$\lambda_0 = 0.40\ \text{µm to } 0.76\ \text{µm} \quad (4000\ \text{Å to } 7600\ \text{Å}). \tag{16.4}$$
These values relate to light waves in a vacuum. The lengths of light waves in
substances will have other values. For oscillations of frequency 𝜈, the wavelength
in a vacuum is 𝜆0 = 𝑐/𝜈. In a medium in which the phase velocity of a light wave is
𝑣 = 𝑐/𝑛, the wavelength has the value 𝜆 = 𝑣/𝜈 = 𝑐/(𝜈𝑛) = 𝜆0 /𝑛. Thus, the length
of a light wave in a medium with the refractive index 𝑛 is related to the wavelength
in a vacuum by the expression
$$\lambda = \frac{\lambda_0}{n}. \tag{16.5}$$
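As a quick numerical illustration of the relations 𝜆0 = 𝑐/𝜈 and 𝜆 = 𝜆0/𝑛, the sketch below converts the limits of the visible range into frequencies and finds the wavelength of green light in water. The function names and the rounded value of 𝑐 are assumptions made for the example.

```python
C = 2.998e8  # speed of light in a vacuum, m/s (rounded)


def frequency(lambda0):
    """Frequency of a wave whose vacuum wavelength is lambda0 (metres)."""
    return C / lambda0


def wavelength_in_medium(lambda0, n):
    """Wavelength in a medium of refractive index n, Eq. (16.5)."""
    return lambda0 / n


nu_red = frequency(0.76e-6)      # long-wave limit of the visible range
nu_violet = frequency(0.40e-6)   # short-wave limit of the visible range
lam_water = wavelength_in_medium(0.555e-6, 1.33)  # green light in water

print(nu_red, nu_violet, lam_water)
```

The two frequencies reproduce the limits of the visible range to two significant figures, and the green wavelength in water shortens to about 0.417 µm while its frequency stays the same.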
The frequencies of visible light waves are within the limits
$$\nu = 0.39 \times 10^{15}\ \text{Hz to } 0.75 \times 10^{15}\ \text{Hz}. \tag{16.6}$$
The frequency of the changes in the vector of the energy flux density carried by a
wave will be still greater (it equals 2𝜈). Neither our eye nor any other receiver of
luminous energy can track such frequent changes in the energy flux, hence, they
register the time-averaged flux. The magnitude of the time-averaged energy flux
density carried by a light wave is called the light intensity 𝐼 at the given point
of space. The density of the flux of electromagnetic energy is determined by the
Poynting vector 𝑺. Hence,
$$I = |\langle \boldsymbol{S} \rangle| = |\langle \boldsymbol{E} \times \boldsymbol{H} \rangle|. \tag{16.7}$$
Averaging is performed over the time of “operation” of the instrument, which, as
we have already noted, is much greater than the period of oscillations of the wave.
The intensity is measured either in energy units (for example, in W m⁻²), or in light
units named “lumen per square metre” (see Sec. 16.5).
According to Eq. (15.22), the magnitudes of the amplitudes of the vectors 𝑬 and
𝑯 in an electromagnetic wave are related by the expression
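$$E_{\mathrm{m}} \sqrt{\varepsilon\varepsilon_0} = H_{\mathrm{m}} \sqrt{\mu\mu_0} = H_{\mathrm{m}} \sqrt{\mu_0}$$
(we have assumed that 𝜇 = 1). It thus follows that
$$H_{\mathrm{m}} = \sqrt{\varepsilon} \left(\frac{\varepsilon_0}{\mu_0}\right)^{1/2} E_{\mathrm{m}} = n \left(\frac{\varepsilon_0}{\mu_0}\right)^{1/2} E_{\mathrm{m}},$$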
where 𝑛 is the refractive index of the medium in which the wave propagates. Thus,
𝐻m is proportional to 𝐸m and 𝑛:
𝐻m ∝ 𝑛𝐸m . (16.8)
The magnitude of the average value of the Poynting vector is proportional to 𝐻m 𝐸m .
We can therefore write that
$$I \propto n E_{\mathrm{m}}^2 = n A^2 \tag{16.9}$$
(for the time-averaged flux the constant of proportionality is $\tfrac{1}{2}\sqrt{\varepsilon_0/\mu_0}$). Hence, the light intensity is proportional
to the refractive index of the medium and the square of the light wave amplitude.
We must note that when considering the propagation of light in a homogeneous
medium, we may assume that the intensity is proportional to the square of the light
wave amplitude
$$I \propto A^2. \tag{16.10}$$
For light passing through the interface between media, however, the expression
for the intensity, which does not take the factor 𝑛 into account, leads to non-
conservation of the light flux.
The lines along which light energy propagates are called rays. The averaged
Poynting vector ⟨𝑺⟩ is directed at each point along a tangent to a ray. The direction
of ⟨𝑺⟩ in isotropic media coincides with a normal to the wave surface, i.e., with
the direction of the wave vector 𝒌. Hence, the rays are perpendicular to the wave
surfaces. In anisotropic media, a normal to the wave surface generally does not
coincide with the direction of the Poynting vector so that the rays are not orthogonal
to the wave surfaces.
Although light waves are transverse, they usually do not display asymmetry
relative to a ray. The explanation is that in natural light (i.e., in light emitted by
conventional sources) there are oscillations that occur in the most diverse directions
perpendicular to a ray (Fig. 16.1). The radiation of a luminous body consists of the
waves emitted by its atoms. The process of radiation in an individual atom continues
for about $10^{-8}$ s. During this time, a sequence of crests and troughs (or, as is said, a
wave train) of about three metres in length is formed. The atom “dies out”, and
then “flares up” again after a certain time elapses. Many atoms “flare up” at the same
time. The wave trains they emit are superposed on one another and form the light
wave emitted by the relevant body. The plane of oscillations is oriented randomly
for each wave train. Therefore, the resultant wave contains oscillations of different
directions with an equal probability.

Fig. 16.1

In natural light, the oscillations in different directions follow one another rapidly
and without any order. Light in which the direction of the oscillations has been
brought into order in some way or other is called polarized. If the oscillations of
the light vector occur only in a single plane passing through a ray, the light is called
plane (or linearly) polarized. The order may consist in that the vector 𝑬 rotates
about a ray while simultaneously pulsating in magnitude. The result is that the tip
of the vector 𝑬 describes an ellipse. Such light is called elliptically polarized. If
the tip of the vector 𝑬 describes a circle, the light is called circularly polarized.
We shall deal with natural light in Chapters 17 and 18. For this reason, we shall
not be concerned there with the direction of the light vector oscillations. The ways of
obtaining polarized light and its properties are considered in Chapter 19.

16.2. Representation of Harmonic Functions Using Exponents

Let us form the sum of two complex numbers 𝑧1 = 𝑥1 + 𝑖 𝑦1 and 𝑧2 = 𝑥2 + 𝑖 𝑦2 :


𝑧 = 𝑧1 + 𝑧2 = (𝑥1 + 𝑖 𝑦1 ) + (𝑥2 + 𝑖 𝑦2 ) = (𝑥1 + 𝑥2 ) + 𝑖( 𝑦1 + 𝑦2 ). (16.11)
It can be seen from Eq. (16.11) that the real part of the sum of complex numbers
equals the sum of the real parts of the addends:
$$\operatorname{Re}\{z_1 + z_2\} = \operatorname{Re}\{z_1\} + \operatorname{Re}\{z_2\}. \tag{16.12}$$
Let us assume that a complex number is a function of a certain parameter, for
example, of the time 𝑡:
𝑧(𝑡) = 𝑥(𝑡) + 𝑖 𝑦(𝑡).
Differentiating this function with respect to 𝑡, we get
$$\frac{\mathrm{d}z}{\mathrm{d}t} = \frac{\mathrm{d}x}{\mathrm{d}t} + i \frac{\mathrm{d}y}{\mathrm{d}t}.$$
It thus follows that the real part of the derivative of 𝑧 with respect to 𝑡 equals the
derivative of the real part of 𝑧 with respect to 𝑡:
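$$\operatorname{Re}\left\{\frac{\mathrm{d}z}{\mathrm{d}t}\right\} = \frac{\mathrm{d}}{\mathrm{d}t} \operatorname{Re}\{z\}. \tag{16.13}$$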
A similar relation holds upon integration of a complex function. Indeed,
$$\int z(t)\,\mathrm{d}t = \int x(t)\,\mathrm{d}t + i \int y(t)\,\mathrm{d}t,$$
whence it can be seen that the real part of the integral of 𝑧(𝑡) equals the integral of
the real part of 𝑧(𝑡):
$$\operatorname{Re}\left\{\int z(t)\,\mathrm{d}t\right\} = \int \operatorname{Re}\{z(t)\}\,\mathrm{d}t. \tag{16.14}$$
It is evident that relations similar to Eqs. (16.12), (16.13), and (16.14) also hold for
the imaginary parts of complex functions.
It follows from the above that when the operations of addition, differentiation,
and integration are performed with complex functions, and also linear combina-
tions of these operations, the real (imaginary) part of the result coincides with the
result that would be obtained when similar operations are performed with the real
(imaginary) parts of the same functions¹. Using the symbol $\tilde{L}$ to denote a linear
combination of the operations listed above, we can write:
$$\operatorname{Re}\{\tilde{L}(z_1, z_2, \ldots)\} = \tilde{L}(\operatorname{Re}\{z_1\}, \operatorname{Re}\{z_2\}, \ldots). \tag{16.15}$$
The property of linear operations we have established makes it possible to use
the following procedure in calculations: when performing linear operations with
harmonic functions of the form $A\cos(\omega t - k_x x - k_y y - k_z z + \alpha)$, we can replace
these functions with the exponents
$$A \exp[i(\omega t - k_x x - k_y y - k_z z + \alpha)] = \hat{A} \exp[i(\omega t - k_x x - k_y y - k_z z)], \tag{16.16}$$
where $\hat{A} = A e^{i\alpha}$ is a complex number called the complex amplitude. With such
representation, we can add functions, differentiate them with respect to the variables
𝑡, 𝑥, 𝑦, 𝑧, and also integrate over these variables. Having performed the calculations, we
must take the real part of the result obtained. The expediency of this procedure
is explained by the fact that calculations with exponents are considerably simpler
than calculations performed with trigonometric functions.
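The procedure can be tried out on the addition of two oscillations of the same frequency. The sketch below is only an illustration under assumed values (unit amplitudes, a phase difference of 𝜋/2, arbitrary 𝜔 and 𝑡); the helper names are hypothetical.

```python
import cmath
import math


def complex_amplitude(A, alpha):
    """Complex amplitude A * exp(i * alpha) of the oscillation A cos(wt + alpha)."""
    return A * cmath.exp(1j * alpha)


def add_oscillations(A1, alpha1, A2, alpha2):
    """Amplitude and initial phase of the sum of two equal-frequency oscillations."""
    total = complex_amplitude(A1, alpha1) + complex_amplitude(A2, alpha2)
    return abs(total), cmath.phase(total)


# Two oscillations of unit amplitude shifted in phase by pi/2.
A, alpha = add_oscillations(1.0, 0.0, 1.0, math.pi / 2)

# Check at an arbitrary instant: the real part of the complex sum
# must coincide with the directly added cosines.
omega, t = 2.0, 0.37
direct = math.cos(omega * t) + math.cos(omega * t + math.pi / 2)
via_exponents = (complex_amplitude(A, alpha) * cmath.exp(1j * omega * t)).real

print(A, alpha, direct, via_exponents)
```

The resultant amplitude is √2 and the resultant initial phase is 𝜋/4, obtained without any trigonometric identities.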

¹We must note that this rule cannot be applied to non-linear operations, for example, to the
multiplication of functions and squaring them.

Passing over to representation (16.16), we in essence add to all functions of the
kind $A\cos(\omega t - k_x x - k_y y - k_z z + \alpha)$ the addends $iA\sin(\omega t - k_x x - k_y y - k_z z + \alpha)$.
We remind our reader that we have used a similar procedure when studying forced
oscillations (see Sec. 7.12 of Vol. I).
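The restriction to linear operations stated in the footnote can be seen on a small numerical example. This is an illustrative sketch with arbitrarily chosen amplitudes, phases, frequency, and instant of time.

```python
import cmath

omega, t = 2.0, 0.37  # arbitrary frequency and instant

# Two oscillations in exponential form (amplitudes and phases are arbitrary).
z1 = 3.0 * cmath.exp(1j * (omega * t + 0.4))
z2 = 2.0 * cmath.exp(1j * (omega * t - 1.1))

# Addition is linear: the real part of the sum equals the sum of real parts.
sum_mismatch = abs((z1 + z2).real - (z1.real + z2.real))

# Multiplication is not linear: the two results below differ.
product_of_real_parts = z1.real * z2.real
real_part_of_product = (z1 * z2).real

print(sum_mismatch, product_of_real_parts, real_part_of_product)
```

The two products differ by exactly the product of the imaginary parts, which is why quantities such as the energy flux must be formed from the real parts themselves.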

16.3. Reflection and Refraction of a Plane Wave at the Interface Between Two
Dielectrics

Assume that a plane electromagnetic wave falls on the plane interface between two
homogeneous and isotropic dielectrics. The dielectric in which the incident wave
is propagating is characterized by the permittivity 𝜀1 , and the second dielectric by
the permittivity 𝜀2 . We assume that the permeabilities are unity. Experiments show
that in this case, apart from the plane refracted wave propagating in the second
dielectric, a plane reflected wave propagating in the first dielectric is produced.
Let us determine the direction of propagation of the incident wave with the aid
of the wave vector 𝒌, of the reflected wave with the aid of the vector 𝒌′ and, finally,
of the refracted wave with the aid of the vector 𝒌″. We shall find how the directions
of 𝒌′ and 𝒌″ are related to the direction of 𝒌. We can do this by taking advantage of
the fact that the following condition must be observed at the interface between the
two dielectrics:
𝐸1,𝜏 = 𝐸2,𝜏 . (16.17)
Here 𝐸1,𝜏 and 𝐸2,𝜏 are the tangential components of the electric field strength in
the first and second medium, respectively.
In Sec. 2.7, we proved Eq. (16.17) for electrostatic fields [see Eq. (2.44)]. It can easily
be extended, however, to time-varying fields. According to Eq. (9.5), the circulation
of 𝑬 determined by Eq. (2.42) for varying fields must be not zero, but equal to the
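integral $\int (-\dot{\boldsymbol{B}}) \cdot \mathrm{d}\boldsymbol{S}$ taken over the area of the loop shown in Fig. 2.9:
$$\oint E_l \,\mathrm{d}l = E_{1,x} a - E_{2,x} a + \langle E_b \rangle\, 2b = -\int_{S=ab} \dot{\boldsymbol{B}} \cdot \mathrm{d}\boldsymbol{S}.$$
Since $\dot{\boldsymbol{B}}$ is finite, in the limit transition 𝑏 → 0 the integral in the right-hand side
vanishes, and we arrive at condition (2.43), from which follows Eq. (2.44).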
Assume that the vector 𝒌 determining the direction of propagation of the
incident wave is in the plane of the drawing (Fig. 16.2). The direction of a normal to
the interface will be characterized by the vector $\hat{\boldsymbol{n}}$. The plane in which the vectors
𝒌 and $\hat{\boldsymbol{n}}$ lie is called the plane of incidence of the wave. Let us take the line of
intersection of the plane of incidence with the interface between the dielectrics as
the 𝑥-axis. We shall direct the 𝑦-axis at right angles to the plane of the dielectric
interface. The 𝑧-axis will, therefore, be perpendicular to the plane of incidence,
while the vector $\hat{\boldsymbol{\tau}}$ will be directed along the 𝑥-axis (see Fig. 16.2).

Fig. 16.2

It is obvious from considerations of symmetry that the vectors 𝒌′ and 𝒌″ can
only be in the plane of incidence (the media are homogeneous and isotropic). Indeed,
assume that the vector 𝒌′ has deflected from this plane “toward us”. There are no
grounds, however, to give such a deflection priority over an equal deflection “away
from us”. Consequently, the only possible direction of 𝒌′ is that in the plane of
incidence. Similar reasoning also holds for the vector 𝒌″.
Let us separate from a natural incident ray a plane-polarized component in
which the direction of oscillations of the vector 𝑬 makes an arbitrary angle with
the plane of incidence. The oscillations of the vector 𝑬 in the plane electromagnetic
wave propagating in the direction of the vector 𝒌 are described by the function²
$$\boldsymbol{E} = \boldsymbol{E}_{\mathrm{m}} \exp[i(\omega t - \boldsymbol{k} \cdot \boldsymbol{r})] = \boldsymbol{E}_{\mathrm{m}} \exp[i(\omega t - k_x x - k_y y)]$$
(with our choice of the coordinate axes, the projection of the vector 𝒌 onto the 𝑧-axis
is zero, therefore, the addend $-k_z z$ is absent in the exponent). By correspondingly
choosing the origin of the time 𝑡, we have made the initial phase of the wave
equal to zero.
The field strengths in the reflected and refracted waves are determined by
similar expressions
$$\boldsymbol{E}' = \boldsymbol{E}'_{\mathrm{m}} \exp[i(\omega' t - k'_x x - k'_y y + \alpha')]$$
$$\boldsymbol{E}'' = \boldsymbol{E}''_{\mathrm{m}} \exp[i(\omega'' t - k''_x x - k''_y y + \alpha'')],$$
where $\omega'$ and $\omega''$ are the frequencies, and $\alpha'$ and $\alpha''$ the initial phases of the relevant waves.

²More exactly, the real part of this function, but we shall say simply function for brevity’s sake.

The resultant field in the first medium is
$$\boldsymbol{E}_1 = \boldsymbol{E} + \boldsymbol{E}' = \boldsymbol{E}_{\mathrm{m}} \exp[i(\omega t - k_x x - k_y y)] + \boldsymbol{E}'_{\mathrm{m}} \exp[i(\omega' t - k'_x x - k'_y y + \alpha')]. \tag{16.18}$$
In the second medium
$$\boldsymbol{E}_2 = \boldsymbol{E}'' = \boldsymbol{E}''_{\mathrm{m}} \exp[i(\omega'' t - k''_x x - k''_y y + \alpha'')]. \tag{16.19}$$
According to Eq. (16.17), the tangential components of Eqs. (16.18) and (16.19) must be
the same at the interface, i.e., when 𝑦 = 0. We thus arrive at the expression
$$E_{\mathrm{m},\tau} \exp[i(\omega t - k_x x)] + E'_{\mathrm{m},\tau} \exp[i(\omega' t - k'_x x + \alpha')] = E''_{\mathrm{m},\tau} \exp[i(\omega'' t - k''_x x + \alpha'')]. \tag{16.20}$$
For condition (16.20) to be observed at any 𝑡, all the frequencies must be the
same:
$$\omega = \omega' = \omega''. \tag{16.21}$$
To convince ourselves that this is true, let us write Eq. (16.20) in the form
$$a \exp(i\omega t) + b \exp(i\omega' t) = c \exp(i\omega'' t),$$
where the coefficients 𝑎, 𝑏, and 𝑐 are independent of 𝑡. The equation which we have
written is equivalent to the following two:
$$a \cos(\omega t) + b \cos(\omega' t) = c \cos(\omega'' t)$$
$$a \sin(\omega t) + b \sin(\omega' t) = c \sin(\omega'' t).$$
The sum of two harmonic functions will also be a harmonic function only if the
functions being added have the same frequencies. The harmonic function obtained
as a result of addition will have the same frequency as the functions being added.
Hence follows Eq. (16.21). We have, thus, arrived at the conclusion that the fre-
quencies of the reflected and refracted waves coincide with that of the incident
wave.
For condition (16.20) to be observed at any 𝑥, the projections of the wave vectors
onto the 𝑥-axis must be equal:
$$k_x = k'_x = k''_x. \tag{16.22}$$
The angles 𝜃, 𝜃′, and 𝜃″ shown in Fig. 16.2 are called the angle of incidence, the
angle of reflection, and the angle of refraction. A glance at the figure shows
that $k_x = k \sin\theta$, $k'_x = k' \sin\theta'$, $k''_x = k'' \sin\theta''$. Equation (16.22) can therefore be
written in the form
$$k \sin\theta = k' \sin\theta' = k'' \sin\theta''.$$
The vectors 𝒌 and 𝒌′ have the same magnitude equal to $\omega/v_1$; the magnitude of the
vector 𝒌″ equals $\omega/v_2$. Hence,
$$\frac{\omega}{v_1} \sin\theta = \frac{\omega}{v_1} \sin\theta' = \frac{\omega}{v_2} \sin\theta''.$$
It thus follows that
$$\theta' = \theta, \tag{16.23}$$
$$\frac{\sin\theta}{\sin\theta''} = \frac{v_1}{v_2} = n_{12}. \tag{16.24}$$
The relations we have obtained are obeyed for any plane-polarized component
of a natural ray. Hence, they also hold for a natural ray as a whole.
Equation (16.23) expresses the law of reflection of light, according to which
the reflected ray lies in one plane with the incident ray and the normal to the point of
incidence; the angle of reflection equals the angle of incidence.
Equation (16.24) expresses the law of refraction of light, according to which
the refracted ray lies in one plane with the incident ray and the normal to the point
of incidence; the ratio of the sine of the angle of incidence to the sine of the angle of
refraction is constant for given substances.
The quantity 𝑛12 in Eq. (16.24) is known as the relative refractive index of
the second substance with respect to the first one. Let us write this quantity in the
form
$$n_{12} = \frac{v_1}{v_2} = \frac{c}{v_2} \cdot \frac{v_1}{c} = \frac{c/v_2}{c/v_1} = \frac{n_2}{n_1}. \tag{16.25}$$
Thus, the relative refractive index of two substances equals the ratio of their absolute
refractive indices.
Substituting the ratio 𝑛2 /𝑛1 for 𝑛12 in Eq. (16.24), we can write the law of refrac-
tion in the form
$$n_1 \sin\theta = n_2 \sin\theta''. \tag{16.26}$$
Inspection of this equation shows that when light passes from an optically denser
medium to an optically less dense one, the rays move away from a normal to the
interface of the media. An increase in the angle of incidence 𝜃 is attended by a more
rapid growth in the angle of refraction 𝜃″, and when the angle 𝜃 reaches the value
$$\theta_{\mathrm{cr}} = \arcsin n_{12}, \tag{16.27}$$
the angle 𝜃″ becomes equal to 𝜋/2. The angle determined by Eq. (16.27) is called the
critical angle.
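The law of refraction and the critical angle lend themselves to a short numerical sketch. The function names below are illustrative, and the refractive indices 1.0 (air) and 1.5 (glass) are assumed round values.

```python
import math


def refraction_angle(theta, n1, n2):
    """Angle of refraction from n1 sin(theta) = n2 sin(theta''), Eq. (16.26).

    Returns None when no refracted ray exists (total internal reflection).
    """
    s = n1 * math.sin(theta) / n2
    if s > 1.0:
        return None
    return math.asin(s)


def critical_angle(n1, n2):
    """Critical angle arcsin(n2 / n1), Eq. (16.27); None if n2 >= n1."""
    if n2 >= n1:
        return None
    return math.asin(n2 / n1)


# Assumed round values: glass n = 1.5, air n = 1.0.
theta_cr = critical_angle(1.5, 1.0)
air_to_glass = refraction_angle(math.radians(30.0), 1.0, 1.5)
glass_to_air = refraction_angle(math.radians(45.0), 1.5, 1.0)

print(math.degrees(theta_cr), math.degrees(air_to_glass), glass_to_air)
```

For these values the critical angle is about 41.8°, so a ray striking the glass-air interface at 45° has no refracted counterpart.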
The energy carried by an incident ray is distributed between the reflected and
the refracted rays. As the angle of incidence grows, the intensity of the reflected ray
increases, while that of the refracted ray diminishes and vanishes at the critical angle.
At angles of incidence within the limits from $\theta_{\mathrm{cr}}$ to 𝜋/2, the light wave penetrates
into the second medium to a distance of the order of a wavelength 𝜆 and then
returns to the first medium. This phenomenon is called total internal reflection.
Let us find the relations between the amplitudes and phases of the incident,
reflected, and refracted waves. For simplicity, we shall limit ourselves to the normal
incidence of a wave onto the interface between dielectrics (we remind our reader
that the dielectrics are assumed to be homogeneous and isotropic). Assume that
the oscillations of the vector 𝑬 in the incident wave occur along the direction which
we shall take as the 𝑥-axis. It follows from considerations of symmetry that the
oscillations of the vectors 𝑬′ and 𝑬″ also occur along the 𝑥-axis. In the given case,
the unit vector $\hat{\boldsymbol{\tau}}$ coincides with the unit vector $\hat{\boldsymbol{e}}_x$. Therefore, the condition of
continuity of the tangential component of the electric field strength has the form
$$E_x + E'_x = E''_x. \tag{16.28}$$
Expression (16.8) obtained for the amplitude values of 𝐸 and 𝐻 also holds for
their instantaneous values: 𝐻 ∝ 𝑛𝐸. It thus follows that the instantaneous value of
the energy flux density is proportional to $nE^2$. Thus, the law of energy conservation
leads to the equation
$$n_1 E_x^{2} = n_1 E_x'^{2} + n_2 E_x''^{2}. \tag{16.29}$$
We must note that the quantities $E_x$, $E'_x$, and $E''_x$ in Eqs. (16.28) and (16.29) are the
instantaneous values of the projections.
Introducing $E''_x - E_x$ into Eq. (16.29) instead of $E'_x$ [see Eq. (16.28)], it is easy to
see that
$$E''_x = \frac{2n_1}{n_1 + n_2} E_x. \tag{16.30}$$
Using this value of $E''_x$ in Eq. (16.28), we find that
$$E'_x = \frac{n_1 - n_2}{n_1 + n_2} E_x. \tag{16.31}$$
Examination of Eq. (16.30) shows that the projections of the vectors 𝑬 and 𝑬″
have identical signs at each moment of time. Hence, we conclude that the oscillations
in the incident wave and in the one passing into the second medium occur at the
interface in the same phase: when a wave passes through the interface there is no
jump in the phase.
It can be seen from Eq. (16.31) that when $n_2 < n_1$, the sign of $E'_x$ coincides with
that of $E_x$. This signifies that the oscillations in the incident and reflected waves
occur at the interface in the same phase: the phase of a wave does not change upon
reflection. If $n_2 > n_1$, then the sign of $E'_x$ is opposite to that of $E_x$, and the oscillations in
the incident and reflected waves occur at the interface in counterphase: the phase
of the wave changes in a jump by 𝜋 upon reflection. The result obtained also holds
upon the oblique incidence of a wave on the interface between two transparent media.
Thus, when a light wave is reflected from an interface between an optically less
dense medium and an optically denser one (when $n_1 < n_2$), the phase of oscillations
of the light vector changes by 𝜋. Such a phase change does not occur upon reflection
from an interface between an optically denser medium and an optically less dense
one (when $n_1 > n_2$).
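The sign rules just stated can be verified directly from Eqs. (16.30) and (16.31). The sketch below assumes normal incidence and the round values 𝑛 = 1.0 for air and 𝑛 = 1.5 for glass; the function name is hypothetical.

```python
def normal_incidence_fields(E, n1, n2):
    """Reflected and refracted field projections, Eqs. (16.31) and (16.30)."""
    E_refl = (n1 - n2) / (n1 + n2) * E
    E_trans = 2 * n1 / (n1 + n2) * E
    return E_refl, E_trans


# Air to glass (n1 < n2): the reflected field is reversed (phase jump of pi).
E_refl, E_trans = normal_incidence_fields(1.0, 1.0, 1.5)

# Glass to air (n1 > n2): the reflected field keeps its sign.
E_refl_back, E_trans_back = normal_incidence_fields(1.0, 1.5, 1.0)

print(E_refl, E_trans, E_refl_back, E_trans_back)
```

In both cases the values satisfy the continuity condition (16.28) and the energy balance (16.29), and the refracted field always has the sign of the incident one.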
Equations (16.30) and (16.31) have been obtained for the instantaneous values of
the projections of the light vectors. Similar relations also hold for the amplitudes of
the light vectors:
$$E''_{\mathrm{m}} = \frac{2n_1}{n_1 + n_2} E_{\mathrm{m}}, \qquad E'_{\mathrm{m}} = \frac{n_1 - n_2}{n_1 + n_2} E_{\mathrm{m}}. \tag{16.32}$$
These relations make it possible to find the reflection coefficient 𝜌 and the transmis-
sion coefficient 𝜏 of a light wave (for normal incidence at the interface between two
transparent media). Indeed, by definition
$$\rho = \frac{I'}{I} = \frac{n_1 E_{\mathrm{m}}'^{2}}{n_1 E_{\mathrm{m}}^{2}},$$
where $I'$ is the intensity of the reflected wave, and 𝐼 is the intensity of the incident
one. Using in this equation the ratio $E'_{\mathrm{m}}/E_{\mathrm{m}}$ obtained from Eq. (16.32), we arrive at
the formula
$$\rho = \left(\frac{n_{12} - 1}{n_{12} + 1}\right)^2. \tag{16.33}$$
Here, $n_{12} = n_2/n_1$ is the refractive index of the second medium relative to the first
one.
We get the following expression for the transmission coefficient:
$$\tau = \frac{I''}{I} = \frac{n_2 E_{\mathrm{m}}''^{2}}{n_1 E_{\mathrm{m}}^{2}} = n_{12} \left(\frac{2}{n_{12} + 1}\right)^2. \tag{16.34}$$
We must note that the substitution for 𝑛12 in Eq. (16.33) of its reciprocal 𝑛21 =
1/𝑛12 does not change the value of 𝜌. Hence, the coefficient of reflection of the inter-
face between two given media has the same value for both directions of propagation
of light.
The index of refraction for glass is close to 1.5. Introducing 𝑛12 = 1.5 into
Eq. (16.33), we get 𝜌 = 0.04. Thus, each surface of a glass plate reflects (with incidence
close to normal) about four per cent of the luminous energy falling on it.
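The figures quoted for glass are easy to reproduce from Eqs. (16.33) and (16.34). The following sketch uses hypothetical function names and the same round value 1.5 for the relative refractive index.

```python
def reflection_coefficient(n12):
    """Reflection coefficient at normal incidence, Eq. (16.33)."""
    return ((n12 - 1) / (n12 + 1)) ** 2


def transmission_coefficient(n12):
    """Transmission coefficient at normal incidence, Eq. (16.34)."""
    return n12 * (2 / (n12 + 1)) ** 2


rho_glass = reflection_coefficient(1.5)          # air to glass
tau_glass = transmission_coefficient(1.5)
rho_reverse = reflection_coefficient(1 / 1.5)    # glass to air

print(rho_glass, tau_glass, rho_reverse)
```

The two coefficients add up to unity, as energy conservation requires, and the reflection coefficient is the same for both directions of propagation.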

16.4. Luminous Flux

A real light wave is a superposition of waves with lengths confined within the
interval 𝛥𝜆. The latter is finite even for monochromatic (single-coloured) light. In
white light, 𝛥𝜆 covers the entire range of electromagnetic waves perceived by the
eye, i.e., it ranges from 0.40 µm to 0.76 µm.
The distribution of the energy flux by wavelengths can be characterized with
the aid of the distribution function
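$$\varphi(\lambda) = \frac{\mathrm{d}\Phi_{\mathrm{en}}}{\mathrm{d}\lambda}, \tag{16.35}$$
where $\mathrm{d}\Phi_{\mathrm{en}}$ is the energy flux falling within the wavelengths from 𝜆 to 𝜆 + d𝜆. Knowing
the form of function (16.35), we can calculate the energy flux transferred by waves
whose lengths are within the finite interval from $\lambda_1$ to $\lambda_2$:
$$\Phi_{\mathrm{en}} = \int_{\lambda_1}^{\lambda_2} \varphi(\lambda)\,\mathrm{d}\lambda. \tag{16.36}$$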
The action of light on the eye (the perception of light) depends quite greatly on the
wavelength. This is easy to understand if we take into account that electromag-
netic waves with 𝜆 below 0.40 µm and above 0.76 µm are not perceived at all by
the human eye. The sensitivity of an average normal human eye to radiation of
various wavelengths can be depicted graphically by a curve of relative spectral
sensitivity (Fig. 16.3). The wavelength 𝜆 is laid off along the horizontal axis, and the
relative spectral sensitivity 𝑉 (𝜆) along the vertical one. The eye is most sensitive
to radiation of the wavelength 0.555 µm³ (the green part of the spectrum). The
function 𝑉 (𝜆) for this wavelength is taken equal to unity. The luminous intensity
estimated visually for other wavelengths is lower, although the energy flux is the
same. Accordingly, 𝑉 (𝜆) for these wavelengths is also less than unity. The values
of the function 𝑉 (𝜆) are inversely proportional to the values of the energy fluxes
producing a visual sensation identical in intensity:
$$\frac{V(\lambda_1)}{V(\lambda_2)} = \frac{(\mathrm{d}\Phi_{\mathrm{en}})_2}{(\mathrm{d}\Phi_{\mathrm{en}})_1}.$$
For example, 𝑉 (𝜆) = 0.5 signifies that for obtaining a visual sensation of the same
intensity, light of the given wavelength must have a density of the energy flux twice
that of light for which 𝑉 (𝜆) = 1. Outside of the interval of visible wavelengths, the
function 𝑉 (𝜆) is zero.
The quantity 𝛷 called the luminous flux is introduced to characterize the
luminous intensity with account of its ability to produce a visual sensation. For the
interval d𝜆, the luminous flux is determined as the product of the energy flux and
the corresponding value of the function 𝑉(𝜆):
$$\mathrm{d}\Phi = V(\lambda)\,\mathrm{d}\Phi_{\mathrm{en}}. \tag{16.37}$$

Fig. 16.3 (horizontal axis: 𝜆 from 0.40 µm to 0.75 µm; vertical axis: 𝑉(𝜆))

³It is interesting to note that this wavelength is represented with the greatest intensity in solar
radiation.

Expressing the energy flux through the function of energy distribution by wave-
lengths [see Eq. (16.35)], we get
d𝛷 = 𝑉 (𝜆)𝜑(𝜆) d𝜆. (16.38)
The total luminous flux is
$$\Phi = \int_0^{\infty} V(\lambda)\varphi(\lambda)\,\mathrm{d}\lambda. \tag{16.39}$$
The function 𝑉 (𝜆) is a dimensionless quantity. Consequently, the dimension
of luminous flux coincides with that of energy flux. This makes it possible to define
the luminous flux as the flux of luminous energy assessed according to its visual
sensation.
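Integral (16.39) is usually evaluated numerically. The sketch below shows only the mechanics: the Gaussian used for 𝑉(𝜆) is a crude stand-in peaked at 0.555 µm with a guessed width, not the tabulated sensitivity curve, and the flat spectrum 𝜑(𝜆) is hypothetical.

```python
import math


def V(lam_um):
    """Stand-in for V(lambda): Gaussian peaked at 0.555 um (NOT tabulated data)."""
    if lam_um < 0.40 or lam_um > 0.76:
        return 0.0
    return math.exp(-((lam_um - 0.555) / 0.06) ** 2)  # width 0.06 um is a guess


def phi(lam_um):
    """Hypothetical flat spectrum: 1 W per um between 0.40 um and 0.76 um."""
    return 1.0 if 0.40 <= lam_um <= 0.76 else 0.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule integral of f from a to b."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


energy_flux = integrate(phi, 0.40, 0.76)                             # Eq. (16.36)
luminous_flux = integrate(lambda lam: V(lam) * phi(lam), 0.40, 0.76)  # Eq. (16.39)

print(energy_flux, luminous_flux)
```

Because 𝑉(𝜆) never exceeds unity, the luminous flux obtained this way is always smaller than the energy flux of the same radiation, both being expressed here in the same units.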

16.5. Photometric Quantities and Units

Photometry is the branch of optics concerned with measuring luminous fluxes and
quantities related to them.
Luminous Intensity. A source of light whose dimensions may be disregarded
in comparison with the distance from the place of observation to the source is
called a point source. In a homogeneous and isotropic medium, the wave emitted
by a point source will be spherical. Point sources of light are characterized by the
luminous intensity 𝐼 determined as the luminous flux emitted by a source per unit
solid angle:
$$I = \frac{\mathrm{d}\Phi}{\mathrm{d}\Omega} \tag{16.40}$$
(d𝛷 is the luminous flux emitted by a source within the limits of the solid angle d𝛺).
In the general case, the luminous intensity depends on the direction: 𝐼 = 𝐼(𝜃, 𝜑)
(here 𝜃 and 𝜑 are the polar and the azimuth angles in a spherical system
of coordinates). If 𝐼 does not depend on the direction, the light source is called
isotropic. For an isotropic source
$$I = \frac{\Phi}{4\pi}, \tag{16.41}$$
where 𝛷 is the total luminous flux emitted by the source in all directions.
When dealing with an extended source, we can speak of the luminous intensity
of an element of its surface d𝑆. Now by d𝛷 in Eq. (16.40) we must understand the
luminous flux emitted by the surface element d𝑆 within the limits of the solid angle
d𝛺.
The unit of luminous intensity, the candela (cd), is one of the basic SI units.
It is defined as the luminous intensity, in the perpendicular direction, of a surface
of 1/600000 square metre of a complete radiator at the temperature of freezing
platinum under a pressure of 101325 pascals. By a complete radiator is meant a
device having the properties of a blackbody (see Vol. III).
Luminous Flux. The unit of luminous flux is the lumen (lm). It equals the
luminous flux emitted by an isotropic source with a luminous intensity of 1 candela
within a solid angle of one steradian:
1 lm = 1 cd · 1 sr. (16.42)
It has been established experimentally that an energy flux of 0.0016 W cor-
responds to a luminous flux of 1 lm formed by radiation having a wavelength of
𝜆 = 0.555 µm. The energy flux
$$\Phi_{\mathrm{en}} = \frac{0.0016}{V(\lambda)}\ \text{W}, \tag{16.43}$$
corresponds to a luminous flux of 1 lm formed by radiation of a different wavelength.
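Equation (16.43) amounts to a simple conversion, sketched below. The function name is hypothetical, and the constant 0.0016 W per lumen is the value quoted in the text for 𝜆 = 0.555 µm.

```python
WATTS_PER_LUMEN_AT_PEAK = 0.0016  # value quoted in the text for 0.555 um


def energy_flux_for_luminous_flux(lumens, V):
    """Energy flux (W) of monochromatic light producing the given luminous flux.

    V is the relative spectral sensitivity at the wavelength in question,
    0 < V <= 1, see Eq. (16.43).
    """
    if not 0.0 < V <= 1.0:
        raise ValueError("V must lie in the interval (0, 1]")
    return lumens * WATTS_PER_LUMEN_AT_PEAK / V


at_peak = energy_flux_for_luminous_flux(1.0, 1.0)
at_half = energy_flux_for_luminous_flux(1.0, 0.5)

print(at_peak, at_half)
```

At a wavelength where 𝑉(𝜆) = 0.5, twice the energy flux is needed to produce the same luminous flux of 1 lm.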
Illuminance. The degree of illumination of a surface by the light falling on it
is characterized by the quantity
$$E = \frac{\mathrm{d}\Phi_{\mathrm{inc}}}{\mathrm{d}S}, \tag{16.44}$$
known as the illuminance or illumination (d𝛷inc is the luminous flux incident
on the surface element d𝑆).
The unit of illuminance is the lux (lx) equal to the illuminance produced by a
flux of 1 lm uniformly distributed over a surface having an area of 1 m²:
$$1\ \text{lx} = 1\ \text{lm} / 1\ \text{m}^2. \tag{16.45}$$
The illuminance 𝐸 produced by a point source can be expressed through the
luminous intensity 𝐼, the distance 𝑟 from the surface to the source, and the angle 𝛼
between a normal to the surface $\hat{\boldsymbol{n}}$ and the direction to the source. The flux incident
on the area d𝑆 (Fig. 16.4) is $\mathrm{d}\Phi_{\mathrm{inc}} = I\,\mathrm{d}\Omega$ and it is confined within the solid angle
d𝛺 subtended by d𝑆. The angle d𝛺 is $\mathrm{d}S \cos\alpha / r^2$. Hence, $\mathrm{d}\Phi_{\mathrm{inc}} = I\,\mathrm{d}S \cos\alpha / r^2$.
Dividing this flux by d𝑆, we get
$$E = \frac{I \cos\alpha}{r^2}. \tag{16.46}$$

Fig. 16.4

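Equations (16.41) and (16.46) can be combined in a brief numerical sketch. The source flux of 1000 lm and the distance of 2 m are hypothetical values chosen only for illustration, as are the function names.

```python
import math


def intensity_of_isotropic_source(total_flux_lm):
    """Luminous intensity (cd) of an isotropic point source, Eq. (16.41)."""
    return total_flux_lm / (4 * math.pi)


def illuminance(I_cd, r_m, alpha_rad):
    """Illuminance (lx) produced by a point source, Eq. (16.46)."""
    return I_cd * math.cos(alpha_rad) / r_m ** 2


# Hypothetical isotropic source emitting a total flux of 1000 lm.
I = intensity_of_isotropic_source(1000.0)
E_normal = illuminance(I, 2.0, 0.0)                 # light falls along the normal
E_tilted = illuminance(I, 2.0, math.radians(60.0))  # surface tilted by 60 degrees

print(I, E_normal, E_tilted)
```

Tilting the surface so that the light arrives at 60° to the normal halves the illuminance, and doubling the distance would reduce it fourfold.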
Luminous Emittance. An extended source of light can be characterized by
the luminous emittance 𝑀 of its various sections, by which is meant the luminous
flux emitted outward by unit area in all directions (within the limits of values of 𝜃
from 0 to 𝜋/2, where 𝜃 is the angle made by the given direction with an external
normal to the surface):
$$M = \frac{\mathrm{d}\Phi_{\mathrm{em}}}{\mathrm{d}S} \tag{16.47}$$
($\mathrm{d}\Phi_{\mathrm{em}}$ is the flux emitted outward in all directions by the surface element d𝑆 of the
source).
Luminous emittance may appear as a result of a surface reflecting the light
falling on it. Here, by d𝛷em in Eq. (16.47), we must understand the flux reflected by
the surface element d𝑆 in all directions.
The unit of luminous emittance is the lumen per square metre (lm m⁻²).
Luminance. Luminous emittance characterizes radiation (or reflection) of
light by a given place of a surface in all directions. The radiation (reflection) of light
in a given direction is characterized by the luminance 𝐿. The direction can be given
by the polar angle 𝜃 (measured from the outward normal $\hat{\boldsymbol{n}}$ to the emitting surface
area 𝛥𝑆) and the azimuth angle 𝜑. Luminance is defined as the ratio of the luminous
intensity of an elementary surface area 𝛥𝑆 in a given direction to the projection of
the area 𝛥𝑆 onto a plane perpendicular to the chosen direction.
Let us consider the elementary solid angle d𝛺 subtended by the luminous area
𝛥𝑆 and oriented in the direction (𝜃, 𝜑) (Fig. 16.5). The luminous intensity of area
𝛥𝑆 in the given direction, according to Eq. (16.40), is 𝐼 = d𝛷/d𝛺, where d𝛷 is the
luminous flux propagating within the limits of the angle d𝛺. The projection of
𝛥𝑆 onto a plane normal to the direction (𝜃, 𝜑) (in Fig. 16.5 the trace of this plane is
depicted by a dashed line) is 𝛥𝑆 cos 𝜃. Hence, the luminance is
$$L = \frac{\mathrm{d}\Phi}{\mathrm{d}\Omega\,\Delta S \cos\theta}. \tag{16.48}$$

Fig. 16.5

In the general case, the luminance differs for different directions: 𝐿 = 𝐿(𝜃, 𝜑).
Like the luminous emittance, the luminance can be used to characterize a surface
that reflects the light falling on it.
In accordance with Eq. (16.48), the flux emitted by the area 𝛥𝑆 within the limits
of the solid angle d𝛺 in the direction determined by 𝜃 and 𝜑 is
d𝛷 = 𝐿(𝜃, 𝜑) d𝛺 𝛥𝑆 cos 𝜃. (16.49)
A source whose luminance is identical in all directions (𝐿 = constant) is called
a Lambertian source (obeying Lambert’s law) or a cosine source (the flux emitted
by a surface element of such a source is proportional to cos 𝜃). Only a blackbody
strictly observes Lambert’s law.
The luminous emittance 𝑀 and luminance 𝐿 of a Lambertian source are related
by a simple expression. To find it, let us introduce d𝛺 = sin 𝜃 d𝜃 d𝜑 into Eq. (16.49)
and integrate the expression obtained with respect to 𝜑 within the limits from 0
to 2𝜋 and with respect to 𝜃 from 0 to 𝜋/2, taking into account that 𝐿 = constant.
As a result, we shall find the total light flux emitted by surface element 𝛥𝑆 of a
Lambertian source outward in all directions:
$$\Delta\Phi_{\mathrm{em}} = L\,\Delta S \int_0^{2\pi} \mathrm{d}\varphi \int_0^{\pi/2} \sin\theta \cos\theta\,\mathrm{d}\theta = \pi L\,\Delta S.$$
We get the luminous emittance by dividing this flux by 𝛥𝑆. Thus, for a Lambertian
source, we have
𝑀 = 𝜋 𝐿. (16.50)
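The factor 𝜋 in Eq. (16.50) can be confirmed by carrying out the integration over the hemisphere numerically. This is a minimal sketch with a hypothetical function name; the luminance is taken as unity and the integral over 𝜑 is replaced by the factor 2𝜋, since the integrand does not depend on 𝜑.

```python
import math


def lambertian_emittance(L, steps=10_000):
    """Luminous emittance of a Lambertian source of luminance L.

    Integrates L cos(theta) sin(theta) over the outward hemisphere
    with the midpoint rule in theta.
    """
    h = (math.pi / 2) / steps
    theta_integral = sum(
        math.sin((i + 0.5) * h) * math.cos((i + 0.5) * h) for i in range(steps)
    ) * h
    return L * 2 * math.pi * theta_integral


M = lambertian_emittance(1.0)
print(M, math.pi)
```

The numerical result agrees with 𝜋𝐿 to within the accuracy of the integration rule.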
The unit of luminance is the candela per square metre (cd m⁻²). A uniformly
luminous plane surface has a luminance of 1 cd m⁻² in a direction normal to it if in
this direction the luminous intensity of one square metre of surface is one candela.

16.6. Geometrical Optics

The lengths of light waves perceived by the human eye are very small (of the order of
$10^{-7}$ m). For this reason, the propagation of visible light in a first approximation can
be considered without giving attention to its wave nature and assuming that light
propagates along lines called rays. In the limiting case corresponding to 𝜆 → 0,
the laws of optics can be formulated using the language of geometry.
Accordingly, the branch of optics in which the finiteness of the wavelengths is
disregarded is known as geometrical optics. Another name for it is ray optics.
Geometrical optics is based on four laws: (1) the law of propagation of light
along a straight line; (2) the law of independence of light rays; (3) the law of light
reflection; and (4) the law of refraction.
The law of straight-line propagation states that in a homogeneous medium
light propagates in a straight line. This law is approximate—when light passes through
very small openings, deviations from a straight line are observed that increase with
a diminishing size of the opening.
The law of independence of light rays states that rays do not disturb one
another when they intersect. The intersection of rays does not hinder each of them
from propagating independently of the others. This law holds only at not too great
luminous intensities. At intensities reached with the aid of lasers, the independence
of light rays stops being observed.
The laws of reflection and refraction of light were formulated in Sec. 16.3 [see
Eqs. (16.23) and (16.24) and the text following them].
Geometrical optics can be based on the principle established by the French
mathematician Pierre de Fermat (1601-1665). It underlies the laws of straight-line
propagation, reflection, and refraction of light. As formulated by Fermat himself,
this principle states that any light ray will travel between two end points along a line
requiring the minimum transit time.
Light needs the time d𝑡 = d𝑠/𝑣, where 𝑣 is the speed of light at the given point of
the medium, to cover the distance d𝑠 (Fig. 16.6). Replacing 𝑣 with 𝑐/𝑛 [see Eq. (16.2)],
we find that d𝑡 = (1/𝑐)𝑛 d𝑠. Consequently, the time 𝜏 spent by light in covering the
distance from point 1 to point 2 is
$$\tau = \frac{1}{c}\int_1^2 n\,\mathrm{d}s. \tag{16.51}$$
The quantity
$$L = \int_1^2 n\,\mathrm{d}s \tag{16.52}$$
having the dimension of length is called the optical path. In a homogeneous
Fig. 16.6 Fig. 16.7

medium, the optical path equals the product of the geometrical path 𝑠 and the index
of refraction 𝑛 of the medium:
𝐿 = 𝑛𝑠. (16.53)
According to Eqs. (16.51) and (16.52), we have
$$\tau = \frac{L}{c}. \tag{16.54}$$
The proportionality of the time 𝜏 of covering a path to the optical path 𝐿 makes
it possible to word Fermat’s principle as follows: light travels along a path whose
optical length is minimum. More exactly, the optical path must be extremal, i.e., either
minimum or maximum, or stationary—identical for all possible paths. In the last
case, all the paths of light between two points are tautochronous (requiring the
same time for covering them).
The reversibility of light rays ensues from Fermat’s principle. Indeed, the optical
path that is minimum when light travels from point 1 to point 2 is also minimum
when light travels in the opposite direction. Consequently, a ray emitted toward
another one that has travelled from point 1 to point 2 will cover the same path, but
in the opposite direction.
Let us use Fermat’s principle to obtain the laws of reflection and refraction of
light. Assume that a light ray reaches point B from point A after being reflected
from surface MN (Fig. 16.7, the straight path from A to B is blocked by opaque
screen Sc). The medium in which the ray travels is homogeneous. Therefore, the
minimality of the optical length consists in the minimality of its geometrical length.
The geometrical length of an arbitrarily taken path is AO′B = A′O′B (auxiliary point
A′ is a mirror image of point A). A glance at the figure shows that the path of the ray
reflected at point O will be the shortest. At this point the angle of reflection equals
the angle of incidence. We must note that when point O′ moves away from point O,
Fig. 16.8 Fig. 16.9

the geometrical path grows unlimitedly so that in the given case we have only one
extreme—a minimum.
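The minimum property of the reflected path can be illustrated numerically. The sketch below is ours and rests on assumptions not made in the text: the mirror MN is taken along the line y = 0, point A is placed at (0, h_a), point B at (d, h_b), and the minimum is located by a ternary search; all names are hypothetical.

```python
import math


def reflected_path(x, h_a, h_b, d):
    """Geometrical length of the path A -> (x, 0) -> B for a mirror along y = 0."""
    return math.hypot(x, h_a) + math.hypot(d - x, h_b)


def shortest_reflection_point(h_a, h_b, d, iterations=200):
    """Locate the reflection point of minimum path length by ternary search."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if reflected_path(m1, h_a, h_b, d) < reflected_path(m2, h_a, h_b, d):
            hi = m2  # the minimum cannot lie to the right of m2
        else:
            lo = m1  # the minimum cannot lie to the left of m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    h_a, h_b, d = 1.0, 2.0, 3.0
    x = shortest_reflection_point(h_a, h_b, d)
    angle_of_incidence = math.atan2(x, h_a)
    angle_of_reflection = math.atan2(d - x, h_b)
    print(x, angle_of_incidence, angle_of_reflection)
```

For the shortest path the two printed angles coincide to within the accuracy of the search, which is the law of reflection.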
Now let us find the point at which a ray travelling from A to B must be refracted
for the optical path to be extremal (Fig. 16.8). The optical path for an arbitrary ray is
$$L = n_1 s_1 + n_2 s_2 = n_1\sqrt{a_1^2 + x^2} + n_2\sqrt{a_2^2 + (b - x)^2}.$$
To find the extreme value, let us differentiate 𝐿 with respect to 𝑥 and equate the
derivative to zero:
$$\frac{\mathrm{d}L}{\mathrm{d}x} = \frac{n_1 x}{\sqrt{a_1^2 + x^2}} - \frac{n_2 (b - x)}{\sqrt{a_2^2 + (b - x)^2}} = n_1\frac{x}{s_1} - n_2\frac{b - x}{s_2} = 0.$$
The factors of 𝑛1 and 𝑛2 equal sin 𝜃 and sin 𝜃″, respectively. We thus get the
relation
𝑛1 sin 𝜃 = 𝑛2 sin 𝜃″,
expressing the law of refraction [see Eq. (16.26)].
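The condition d𝐿/d𝑥 = 0 can also be solved numerically and the law of refraction checked directly. The following is a hedged sketch, not the method of the text: the function names are hypothetical, and bisection is our choice, justified by the fact that d𝐿/d𝑥 grows monotonically with 𝑥 and changes sign between 𝑥 = 0 and 𝑥 = 𝑏 (for positive 𝑎1, 𝑎2 and 𝑏).

```python
import math


def refraction_point(n1, n2, a1, a2, b, iterations=100):
    """Solve dL/dx = 0 for L = n1*sqrt(a1^2 + x^2) + n2*sqrt(a2^2 + (b - x)^2).

    dL/dx is negative at x = 0 and positive at x = b and increases
    monotonically in between, so bisection on [0, b] finds the single root.
    """
    def dL_dx(x):
        return n1 * x / math.hypot(a1, x) - n2 * (b - x) / math.hypot(a2, b - x)

    lo, hi = 0.0, b
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dL_dx(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sines(x, a1, a2, b):
    """Return (sin theta, sin theta'') for a ray refracted at the point x."""
    return x / math.hypot(a1, x), (b - x) / math.hypot(a2, b - x)


if __name__ == "__main__":
    n1, n2 = 1.0, 1.5          # illustrative values: air and glass
    a1, a2, b = 1.0, 1.0, 1.0  # geometry in arbitrary length units
    x = refraction_point(n1, n2, a1, a2, b)
    sin_in, sin_out = sines(x, a1, a2, b)
    print(n1 * sin_in, n2 * sin_out)  # the two products should agree
```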
Let us consider reflection from the inner surface of an ellipsoid of revolution
(Fig. 16.9; 𝐹1 and 𝐹2 are the foci of the ellipsoid). According to the definition of an
ellipse, the paths 𝐹1O𝐹2, 𝐹1O′𝐹2, 𝐹1O″𝐹2, etc. are identical in length. Hence, all the
rays leaving focus 𝐹1 and arriving after reflection at focus 𝐹2 are tautochronous.
In this case, the optical path is stationary. If we replace the surface of the ellipsoid
with surface MM having a smaller curvature and oriented so that a ray leaving
point 𝐹1 arrives at point 𝐹2 after being reflected from MM, then path 𝐹1O𝐹2 will
be minimum. For surface NN whose curvature is greater than that of the ellipsoid,
path 𝐹1O𝐹2 will be maximum.
The optical paths are also stationary when the rays pass through a lens (Fig. 16.10).
Ray POP′ has the shortest path in air (where the index of refraction 𝑛 is virtually
equal to unity) and the longest path in glass (𝑛 ≈ 1.5). Ray PQQ′P′ has the longest
path in air, but a shorter one in glass. As a result, the optical paths will be the
Fig. 16.10 Fig. 16.11

same for all the rays. Hence, the latter are tautochronous, and the optical path is
stationary.
Let us consider a wave propagating in a non-homogeneous isotropic medium
along rays 1, 2, 3, etc. (Fig. 16.11). We shall consider that the non-homogeneity is
sufficiently small for us to assume the index of refraction to be constant on sections
of the rays of length 𝜆. We shall construct wave surfaces S1 , S2 , S3 , etc., so that
the oscillations at the points of each following surface lag in phase by 2𝜋 behind
the oscillations at the points on the preceding surface. The oscillations at points
on the same ray are described by the equation 𝜉 = 𝐴 cos(𝜔𝑡 − 𝑘𝑟 + 𝛼) (here, 𝑟
is the distance measured along the ray). The lag in phase is determined by the
expression 𝑘 𝛥𝑟, where 𝛥𝑟 is the distance between adjacent surfaces. From the
condition 𝑘 𝛥𝑟 = 2𝜋, we find that 𝛥𝑟 = 2𝜋/𝑘 = 𝜆. The optical length of each of the
paths of geometrical length 𝜆 is 𝑛𝜆 = 𝜆0 [see Eq. (16.5)]. According to Eq. (16.54), the
time 𝜏 during which light covers a path is proportional to the optical length of the
path. Consequently, the equality of the optical paths signifies equality of the times
needed for light to travel the relevant paths. We, thus, arrive at the conclusion that
sections of rays confined between two wave surfaces have identical optical paths
and are tautochronous. In particular, the sections of the rays between wave surfaces
MM and NN depicted by dash lines in Fig. 16.10 are tautochronous.
It can be seen from our treatment that the lag in phase 𝛿 appearing on the
optical path 𝐿 is determined by the expression
$$\delta = 2\pi\frac{L}{\lambda_0} \tag{16.55}$$
(𝜆0 is the length of a wave in a vacuum).
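Equations (16.53)-(16.55) can be collected into a short numerical sketch. It is only an illustration: the function names, the representation of a path as a list of (𝑛, 𝑠) pairs for homogeneous sections, and the sample values are assumptions of ours.

```python
import math


def optical_path(segments):
    """Sum n*s over homogeneous sections given as (n, s) pairs, Eq. (16.53)."""
    return sum(n * s for n, s in segments)


def transit_time(path, c=299_792_458.0):
    """Transit time tau = L/c, Eq. (16.54); c in metres per second."""
    return path / c


def phase_lag(path, wavelength_vacuum):
    """Phase lag delta = 2*pi*L/lambda_0, Eq. (16.55)."""
    return 2 * math.pi * path / wavelength_vacuum


if __name__ == "__main__":
    # Illustrative path: 2 mm of air (n taken as 1.0) followed by 1 mm of glass.
    L = optical_path([(1.0, 2e-3), (1.5, 1e-3)])
    print(L, transit_time(L), phase_lag(L, 500e-9))
```

For instance, a layer of glass (𝑛 = 1.5) one micrometre thick has an optical path of 1.5 μm, so that at 𝜆0 = 500 nm the phase lag is 2𝜋 × 3 = 6𝜋.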

Fig. 16.12

16.7. Centered Optical System

A collection of rays forms a beam. If rays when continued intersect at one point,
the beam is called homocentric. A spherical wave surface corresponds to a homo-
centric beam of rays. Figure 16.12a shows a converging, and Fig. 16.12b a diverging
homocentric beam. A particular case of a homocentric beam is a beam of parallel
rays; a plane light wave corresponds to it.
Any optical system transforms light beams. If the system does not violate the
homocentricity of the beams, then the rays emerging from point P intersect at
one point P′. This point is the optical image of point P. If a point of an object is
depicted in the form of a point, the image is called a point or a stigmatic one.
An image is called real if the light rays actually intersect at point P′ (see
Fig. 16.12a), and virtual if the continuations of the rays in a direction opposite
to the direction of propagation of the light intersect at P′ (see Fig. 16.12b).
Owing to the reversibility of light rays, light source P and image P′ may exchange
roles: a point source placed at P′ will have its image at P. For this reason, P and P′
are called conjugate points.
An optical system that produces a stigmatic image which is geometrically similar
to the object it depicts is called ideal. With the aid of such a system, a spatial
continuum of points P is depicted in the form of a spatial continuum of points P′. The
first continuum of points is known as the object space, and the second one as the
image space. In both spaces, points, straight lines, and planes uniquely correspond
to one another. Such a relation of two spaces is called collinear correspondence
in geometry.
An optical system is a collection of reflecting and refracting surfaces separat-
ing optically homogeneous media from one another. These surfaces are usually
spherical or plane (a plane can be considered as a sphere of infinite radius). More
complicated surfaces such as an ellipsoid, hyperboloid or paraboloid of revolution
are used much less frequently.
An optical system formed by spherical (in particular, by plane) surfaces is called

Fig. 16.13

centered if the centres of all the surfaces are on a single straight line. This line is
called the optical axis of the system. To each point P or plane S in object space
there corresponds its conjugate point P′ or plane S′ in image space. The infinite
multitude of conjugate points and conjugate planes includes points and planes
having special properties. Such points and planes are called cardinal ones. Among
them are the focal, principal, and nodal points and planes. Setting of the cardinal
points or planes completely determines the properties of an ideal centered optical
system.
Focal Planes and Focal Points of an Optical System. Figure 16.13 shows the
external refracting surfaces and the optical axis of an ideal centered optical system.
Let us take plane S perpendicular to the optical axis in the object space of this
system. It follows from considerations of symmetry that plane S′ conjugate to S
is also perpendicular to the optical axis. Displacement of plane S relative to the
system will produce a corresponding displacement of plane S′. When plane S is
very far, a further increase in its distance from the system will produce virtually no
change in the position of plane S′. This signifies that as a result of removing plane S
to infinity, plane S′ will be in a definite extreme position F′. Plane F′ coinciding
with the extreme position of plane S′ is called the second (or back) focal plane of
the optical system. We can say briefly that the second focal plane F′ is defined as a
plane conjugate to plane S∞ perpendicular to the axis of the system and at infinity
in the object space.
The point of intersection of the second focal plane with the optical axis is known
as the second (or back) focal point (focus) of the system. It is also designated by
the letter F′. This point is conjugate to point P∞ on the axis of the system at infinity.
Rays emerging from P∞ form a beam parallel to the axis (see Fig. 16.13). When they
leave the system, these rays form a beam converging at focal point F′. A parallel
beam impinging on the system may leave it not in the form of a converging beam (as
in Fig. 16.13), but in the form of a diverging one. Then what intersects at point F′
will be not the actual rays that emerge, but their extensions in the reverse direction.
Accordingly, the second focal plane will be in front (in the direction of the rays) of

Fig. 16.14

the system or inside it.


The rays emanating from an infinitely remote point Q∞ not lying on the axis
of the system form a parallel beam directed at an angle to the axis of the system.
Upon emerging from the system, these rays form a beam converging at point Q′
belonging to the second focal plane, but not coinciding with focal point F′ (see point
Q′ in Fig. 16.13). It follows from the above that the image of an infinitely remote
object will be in the focal plane.
If we remove plane S′ perpendicular to the axis to infinity (Fig. 16.14), its conjugate
plane S will advance to its extreme position F called the first (or front) focal
plane of the system. We can say for short that the first focal plane F is a plane
conjugate to plane S′∞ perpendicular to the axis of the system and at infinity in the
image space.
The point of intersection of first focal plane F with the optical axis is called the
first (or front) focal point (focus) of the system. This point is also designated by
the symbol F. The rays emerging from focal point F form a beam of rays parallel
to the axis after leaving the system. The rays emerging from point Q belonging to
focal plane F (see Fig. 16.14) form a parallel beam directed at an angle to the axis
of the system after passing through the latter. It may happen that a beam which is
parallel upon leaving a system is obtained when a converging beam of light falls
on the system instead of a diverging one (as in Fig. 16.14). In this case, the first focal
point is either beyond the system or inside it.
Principal Planes and Points. Let us consider two conjugate planes at right
angles to the optical axis of the system. Arrow 𝑦 (Fig. 16.15) in one of these planes
will have as its image arrow 𝑦′ in the other plane. It follows from axial symmetry
of the system that arrows 𝑦 and 𝑦′ must be in the same plane passing through
the optical axis (in the plane of the drawing). The image 𝑦′ may be in the same
direction as object 𝑦 (see Fig. 16.15a), or in the opposite direction (see Fig. 16.15b). In
the first case, the image is called erect, in the second—inverted. Segments laid
off upward from an optical axis are considered to be positive, and those laid off

Fig. 16.15

downward are considered negative. The actual lengths of the segments are shown in
drawings, i.e., the positive quantities (−𝑦) and (−𝑦′) for negative segments.
The ratio of the linear dimensions of an image and an object is called the linear
or the lateral (transverse) magnification. Designating it by the symbol 𝑀, we
can write
$$M = \frac{y'}{y}. \tag{16.56}$$
The linear magnification is an algebraic quantity. It is positive if the image is erect
(the signs of 𝑦 and 𝑦′ are the same) and negative if the image is inverted (the signs
of 𝑦 and 𝑦′ are opposite).
We can prove that there are two conjugate planes which are imaged onto each other with
a linear magnification of 𝑀 = +1. These planes are known as the principal ones.
The plane belonging to the object space is called the first (or front) principal plane
of a system. It is designated by the symbol H. The plane belonging to the image
space is called the second (or back) principal plane. Its symbol is H′. The points
of intersection of the principal planes with the optical axis are called the principal
points of the system (first and second, respectively). They are designated by the
same symbols H and H′. Depending on the design of a system, its principal planes
and points may be either outside or inside the system. One of the planes may be
outside and the other inside a system. Finally, both planes may be outside a system
at the same side of it.
It can be seen from the definition of the principal planes that ray 1 intersecting
(actually—Fig. 16.16a, or when virtually continued inside the system—Fig. 16.16b)
the first principal plane H at point Q has as its conjugate ray 1′ that intersects
(directly or upon virtual continuation) principal plane H′ at point Q′. The latter
is in the same direction and at the same distance from the axis as point Q. This is
easy to understand if we remember that Q and Q′ are conjugate points, and take
into account that any ray passing through point Q must have as its conjugate a ray
passing through point Q′.
Nodal Planes and Nodal Points. Conjugate points N and N′ lying on the
optical axis and having the property that the conjugate rays passing through them
(actually or when imaginarily continued inside the system) are parallel to each

Fig. 16.16

Fig. 16.17

other are called nodal points or nodes (see rays 1-1′ and 2-2′ in Fig. 16.17). Planes
perpendicular to the axis and passing through the nodal points are called nodal
planes (first and second).
The distance between the nodal points always equals that between the principal
points. When the optical properties of the media at both sides of the system are the
same (i.e., 𝑛 = 𝑛′), the nodal and principal points coincide.
Focal Lengths and Optical Power of a System. The distance from first
principal point H to first focal point F is called the first focal length 𝑓 of the
system. The distance from H′ to F′ is known as the second focal length 𝑓′. The
focal lengths 𝑓 and 𝑓′ are algebraic quantities. They are positive if a given focal
point is at the right of the relevant principal point, and negative in the opposite
case. For example, for the system shown in Fig. 16.18 (see below), the second focal
length 𝑓′ is positive, and the first focal length 𝑓 is negative. The figure depicts the
true length of HF, i.e., the positive quantity (−𝑓) equal to the absolute value of 𝑓.
We can show that the following relation holds between the focal lengths 𝑓 and
𝑓′ of a centered optical system formed by spherical refracting surfaces:
$$\frac{f}{f'} = -\frac{n}{n'}, \tag{16.57}$$
where 𝑛 is the refractive index of the medium in front of the optical system, and 𝑛′
is the refractive index of the medium behind the system. Examination of Eq. (16.57)

Fig. 16.18

shows that when the refractive indices of the media at both sides of an optical
system are the same, the focal lengths differ only in their sign:
𝑓′ = −𝑓. (16.58)
The quantity
$$P = \frac{n'}{f'} = -\frac{n}{f} \tag{16.59}$$
is known as the optical power of a system. When 𝑃 grows, the focal length 𝑓′
diminishes, and, consequently, the rays are refracted by the optical system to a
greater extent. The optical power is measured in dioptres (D). To obtain 𝑃 in
dioptres, the focal length in Eq. (16.59) must be taken in metres. When 𝑃 is positive,
the second focal length 𝑓′ is also positive; hence, the system produces a real image of
an infinitely remote point—a parallel beam of rays is transformed into a converging
one. In this case, the system is called converging. When 𝑃 is negative, the image of
an infinitely remote point will be virtual—a parallel beam of rays is transformed by
the system into a diverging one. Such a system is called diverging.
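Equations (16.57) and (16.59) lend themselves to a short numerical illustration. The sketch below is not part of the text: the function names are hypothetical, and the label "afocal" for the limiting case 𝑃 = 0 is our own addition, since the text considers only positive and negative 𝑃.

```python
def optical_power(n_image, f_second):
    """Optical power P = n'/f' in dioptres, Eq. (16.59); f' in metres."""
    return n_image / f_second


def first_focal_length(n_object, n_image, f_second):
    """First focal length from Eq. (16.57): f/f' = -n/n'."""
    return -f_second * n_object / n_image


def classify(power):
    """Converging for P > 0, diverging for P < 0 (P = 0 is our own extra case)."""
    if power > 0:
        return "converging"
    if power < 0:
        return "diverging"
    return "afocal"


if __name__ == "__main__":
    # Illustrative system with air in front (n = 1.0) and water behind (n' = 1.33).
    n, n_prime, f_prime = 1.0, 1.33, 0.2
    f = first_focal_length(n, n_prime, f_prime)
    P = optical_power(n_prime, f_prime)
    print(f, P, -n / f, classify(P))  # P and -n/f should agree
```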
Formula of a System. We completely determine the properties of an optical
system by setting its cardinal planes or points. In particular, knowing the position
of the cardinal planes, we can construct the optical image produced by a system. Let
us take segment OP perpendicular to the optical axis in the object space (Fig. 16.18,
the nodal points are not shown in the figure). The position of this segment can be
set either by the distance 𝑥 measured from point F to point O, or by the distance 𝑠
from H to O. The quantities 𝑥 and 𝑠, like the focal lengths 𝑓 and 𝑓′, are algebraic
ones (their magnitudes are shown in figures).
Let us draw ray 1 parallel to the optical axis from point P. It will intersect plane
H at point A. In accordance with the properties of principal planes, ray 1′ conjugate
to ray 1 must pass through point A′ of plane H′ conjugate to point A. Since ray 1
is parallel to the optical axis, ray 1′ conjugate to it will pass through second
focal point F′. Now let us draw ray 2 passing through the first focal point F from
point P. It will intersect plane H at point B. Ray 2′ conjugate to it will pass through
point B′ of plane H′ conjugate to B and will be parallel to the optical axis. Point P′
of intersection of rays 1′ and 2′ is the image of point P. Image O′P′, like object OP,
is perpendicular to the optical axis.
The position of image O′P′ can be characterized either by the distance 𝑥′ from
point F′ to point O′ or by the distance 𝑠′ from H′ to O′. The quantities 𝑥′ and 𝑠′ are
algebraic ones. For the case shown in Fig. 16.18, they are positive.
The quantity 𝑥′ determining the position of the image is related to the quantity
𝑥 determining the position of the object and to the focal lengths 𝑓 and 𝑓′. For the
right triangles with a common apex at point F (see Fig. 16.18), we can write the
relation
$$\frac{\mathrm{OP}}{\mathrm{HB}} = \frac{y}{-y'} = \frac{-x}{-f}. \tag{16.60}$$
Similarly, for the triangles with their common apex at point F′, we have
$$\frac{\mathrm{H'A'}}{\mathrm{O'P'}} = \frac{y}{-y'} = \frac{f'}{x'}. \tag{16.61}$$
Combining both relations, we find that (−𝑥)/(−𝑓) = 𝑓′/𝑥′, whence
𝑥𝑥′ = 𝑓𝑓′. (16.62)
This equation is known as Newton’s formula. For the condition that 𝑛 = 𝑛′,
Newton’s formula has the form
𝑥𝑥′ = −𝑓² (16.63)
[see Eq. (16.57)].
It is easy to pass from the formula relating the distances 𝑥 and 𝑥′ of the
object and of the image from the focal points of a system to a formula establishing
the relation between the distances 𝑠 and 𝑠′ from the principal points. A glance
at Fig. 16.18 shows that (−𝑥) = (−𝑠) − (−𝑓) (i.e., 𝑥 = 𝑠 − 𝑓), and 𝑥′ = 𝑠′ − 𝑓′.
Introducing these expressions for 𝑥 and 𝑥′ into Eq. (16.62) and making the relevant
transformations, we get
$$\frac{f}{s} + \frac{f'}{s'} = 1. \tag{16.64}$$
When the condition is observed that 𝑓′ = −𝑓 [see Eq. (16.58)], Eq. (16.64) is simplified
as follows:
$$\frac{1}{s} - \frac{1}{s'} = \frac{1}{f}. \tag{16.65}$$
Equations (16.62)-(16.65) are equations of a centered optical system.
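The equations of a centered optical system can be tried out on numbers. The following is a minimal sketch under stated assumptions: the function names are ours, the image distance is obtained by solving Eq. (16.64) for 𝑠′, and the magnification is written as 𝑀 = 𝑦′/𝑦 = −𝑓/𝑥, which follows from Eq. (16.60). All distances obey the sign rule of the text (positive to the right of the reference point).

```python
def image_distance(s, f, f_second):
    """Solve f/s + f'/s' = 1, Eq. (16.64), for the image distance s'."""
    if s == f:
        raise ValueError("object in the first focal plane: image at infinity")
    return f_second * s / (s - f)


def lateral_magnification(s, f):
    """M = y'/y = -f/x with x = s - f, which follows from Eq. (16.60)."""
    return -f / (s - f)


if __name__ == "__main__":
    # Illustrative converging system in air: f' = 0.1 m, f = -0.1 m,
    # object 0.3 m to the left of the first principal plane (s = -0.3 m).
    s, f, f_second = -0.3, -0.1, 0.1
    s_image = image_distance(s, f, f_second)
    x, x_image = s - f, s_image - f_second
    print(s_image)                       # image distance from H'
    print(x * x_image, f * f_second)     # both sides of Newton's formula
    print(lateral_magnification(s, f))   # negative: the image is inverted
```

In this example the image is real (𝑠′ is positive, about 0.15 m), inverted, and half the size of the object; an object placed in the first principal plane (𝑠 = 0) gives 𝑀 = +1, as the definition of the principal planes requires.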

Fig. 16.19

16.8. Thin Lenses

A lens is a very simple centered optical system. It is a transparent (usually glass)
body bounded by two spherical surfaces⁴ (in a particular case one of the surfaces
can be plane). The points of intersection of the surfaces with the optical axis of
a lens are called the apices of the refracting surfaces. The distance between the
apices is named the thickness of the lens. If the lens thickness may be ignored in
comparison with the smaller of the radii of curvature of the surfaces bounding a
lens, the latter is called thin.
Calculations which we do not give here show that for a thin lens the principal
planes H and H′ may be considered to coincide and pass through the centre O of
the lens (Fig. 16.19). The following expression is obtained for the focal lengths of a
thin lens:
$$f' = -f = \frac{n_0}{n - n_0}\,\frac{R_1 R_2}{R_2 - R_1}, \tag{16.66}$$
where 𝑛 is the refractive index of the lens, 𝑛0 is the refractive index of the medium
surrounding the lens, 𝑅1 and 𝑅2 are the radii of curvature of the lens surfaces.
The radii of curvature must be treated as algebraic quantities: for a convex
surface (i.e., when the centre of curvature is to the right of the apex), the radius of
curvature must be considered positive, and for a concave surface (i.e., when the
centre of curvature is to the left of the apex) the radius must be considered negative.
The magnitude of the radius of curvature is shown in drawings, i.e., −𝑅 if 𝑅 < 0.
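The sign rule for the radii is conveniently illustrated by a few numbers. The sketch below is ours; it evaluates Eq. (16.66) through the curvatures 1/𝑅1 and 1/𝑅2, which is algebraically the same expression and has the practical advantage (an implementation choice, not something stated in the text) that a plane surface can be passed as an infinite radius.

```python
import math


def thin_lens_second_focal_length(n_lens, n_medium, r1, r2):
    """Second focal length f' of a thin lens, Eq. (16.66).

    Computed as 1/f' = ((n - n0)/n0) * (1/R1 - 1/R2), which is Eq. (16.66)
    rearranged; a plane surface is passed as R = math.inf.
    """
    inverse_f = (n_lens - n_medium) / n_medium * (1.0 / r1 - 1.0 / r2)
    if inverse_f == 0:
        return math.inf  # no focusing action
    return 1.0 / inverse_f


if __name__ == "__main__":
    n_glass = 1.5  # illustrative refractive index of the lens
    # Biconvex lens: first surface convex (R1 > 0), second concave (R2 < 0).
    print(thin_lens_second_focal_length(n_glass, 1.0, 0.1, -0.1))
    # Plano-convex lens: plane first surface.
    print(thin_lens_second_focal_length(n_glass, 1.0, math.inf, -0.1))
    # Biconcave lens: the signs of both radii are reversed.
    print(thin_lens_second_focal_length(n_glass, 1.0, -0.1, 0.1))
    # The same biconvex lens immersed in water (n0 = 1.33).
    print(thin_lens_second_focal_length(n_glass, 1.33, 0.1, -0.1))
```

With these illustrative values the biconvex lens in air has 𝑓′ = 0.1 m, the plano-convex one 𝑓′ = 0.2 m, and the biconcave one 𝑓′ = −0.1 m; in water the focal length of the biconvex lens becomes considerably longer.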
If the refractive indices of the media at both sides of a thin lens are the same,
then the nodal points N and N′ coincide with the principal points, i.e., are at the
centre O of the lens. Hence, in this case, any ray passing through the centre of the
lens does not change its direction. If the refractive indices of the media before and
after a lens are different, then the nodal points do not coincide with the principal
points and a ray passing through the centre of the lens changes its direction.

⁴There are also lenses with surfaces having a more intricate shape.

Fig. 16.20

A parallel beam of rays after passing through a lens converges at a point on the
focal plane (see point Q′ in Fig. 16.20). To determine the position of this point, we
must continue the ray passing through the centre of the lens up to its intersection
with the focal plane (see ray OQ′ shown by a dash line). The other rays will gather at
the point of intersection too. Such a method is suitable when the optical properties
of the medium at each side of a lens are identical (𝑛 = 𝑛′). Otherwise, a ray passing
through the centre will change its direction. To find point Q′ in this case, we must
know the position of the nodal points of the lens.
We must note that the optical paths laid off along the rays, beginning at wave
surface SS (see Fig. 16.20) and terminating at point Q′, are identical and are
tautochronous (see the end of Sec. 16.6).
In concluding, we must say that a lens is far from an ideal optical system. The
images of objects it produces have a number of errors. But a consideration of them
is beyond the scope of the present book.

16.9. Huygens’ Principle

In the following two chapters, we shall have to do with processes taking place
behind an opaque barrier with apertures when a light wave impinges on the barrier.
In the approximation of geometrical optics, no light ought to penetrate beyond
the barrier into the region of the geometrical shadow. Actually, however, a light
wave in principle propagates throughout the entire space behind the barrier and
penetrates into the region of the geometrical shadow, this penetration being the
more noticeable, the smaller are the dimensions of the apertures. With a diameter
of the apertures or a width of slits comparable with the length of a light wave, the
approximation of geometrical optics is absolutely illegitimate.
The behaviour of light behind a barrier with an aperture can be explained

Fig. 16.21 Fig. 16.22

qualitatively with the aid of Huygens’ principle, named in honour of the Dutch
physicist Christian Huygens (1629-1696) who discovered it. This principle establishes
the way of constructing a wavefront at the moment of time 𝑡 + 𝛥𝑡 according to the
known position of the wavefront at the moment 𝑡. According to Huygens’ principle,
every point on an advancing wavefront can be considered as a source of secondary
wavelets, and the envelope of these wavelets defines a new wavefront (Fig. 16.21; the
medium is assumed to be non-homogeneous—the velocity of the wave in the lower
part of the figure is greater than in the upper one).
Assume that a plane barrier with an aperture is struck by a wavefront parallel
to it (Fig. 16.22). According to Huygens, every point on the portion of the wavefront
bordering on the aperture is a centre of secondary wavelets which will be spherical
in a homogeneous and isotropic medium. Constructing the envelope of these
wavelets, we shall see that the wave penetrates beyond the aperture into the region
of the geometrical shadow (these regions are shown by dash lines in the figure),
bending around the edges of the barrier.
Huygens’ principle gives no information on the intensity of waves propagating
in various directions. This shortcoming was eliminated by the French physicist
Augustin Fresnel (1788-1827). The improved Huygens-Fresnel principle is treated in
Sec. 18.1, where a physical substantiation of the principle is also given.

Chapter 17
INTERFERENCE OF LIGHT

17.1. Interference of Light Waves

Let us assume that two waves of the same frequency, being superposed on each
other, produce oscillations of the same direction, namely,
𝐴1 cos(𝜔𝑡 + 𝛼1 ), 𝐴2 cos(𝜔𝑡 + 𝛼2 ),
at a certain point in space. The amplitude of the resultant oscillation at the given
point is determined by the expression
$$A^2 = A_1^2 + A_2^2 + 2A_1A_2\cos\delta,$$
where 𝛿 = 𝛼2 − 𝛼1 [see Eq. (7.84) of Vol. I].
If the phase difference 𝛿 of the oscillations set up by the waves remains constant
in time, then the waves are called coherent¹.
The phase difference 𝛿 for incoherent waves varies continuously and takes on
any values with an equal probability. Hence, the time-averaged value of cos 𝛿 equals
zero. Therefore,

$$\langle A^2\rangle = \langle A_1^2\rangle + \langle A_2^2\rangle.$$
Taking into account Eq. (16.10), we thus conclude that the intensity observed upon
the superposition of incoherent waves equals the sum of the intensities produced
by each of the waves individually:
𝐼 = 𝐼1 + 𝐼2 . (17.1)
For coherent waves, cos 𝛿 has a time-constant value (but a different one for
each point of space), so that
$$I = I_1 + I_2 + 2\sqrt{I_1 I_2}\cos\delta. \tag{17.2}$$

At the points of space for which cos 𝛿 > 0, the intensity 𝐼 will exceed 𝐼1 + 𝐼2 ; at the
¹We shall discuss the concept of coherence in greater detail in the following.

Fig. 17.1

points for which cos 𝛿 < 0, it will be smaller than 𝐼1 + 𝐼2 . Thus, the superposition
of coherent light waves is attended by redistribution of the light flux in space. As
a result, maxima of the intensity will appear at some spots and minima at others.
This phenomenon is called the interference of waves. Interference manifests itself
especially clearly when the intensity of both interfering waves is the same: 𝐼1 = 𝐼2 .
Hence, according to Eq. (17.2), at the maxima 𝐼 = 4𝐼1 , while at the minima 𝐼 = 0.
For incoherent waves in the same condition, we get the same intensity 𝐼 = 2𝐼1
everywhere [see Eq. (17.1)].
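The difference between the coherent and the incoherent cases can be seen on a simple numerical example. The sketch below is only illustrative: the function names are ours, and the incoherent case is modelled by averaging Eq. (17.2) over phase differences spread evenly from 0 to 2𝜋, in keeping with the statement that all values of 𝛿 are equally probable.

```python
import math


def resultant_intensity(i1, i2, delta):
    """Intensity of two superposed coherent waves, Eq. (17.2)."""
    return i1 + i2 + 2 * math.sqrt(i1 * i2) * math.cos(delta)


def incoherent_average(i1, i2, samples=3600):
    """Average Eq. (17.2) over phase differences spread evenly over 0..2*pi.

    This models the incoherent case, in which every value of delta is
    equally probable, and reproduces Eq. (17.1).
    """
    total = 0.0
    for k in range(samples):
        total += resultant_intensity(i1, i2, 2 * math.pi * k / samples)
    return total / samples


if __name__ == "__main__":
    i1 = i2 = 1.0
    print(resultant_intensity(i1, i2, 0.0))      # maximum, close to 4*I1
    print(resultant_intensity(i1, i2, math.pi))  # minimum, close to 0
    print(incoherent_average(i1, i2))            # close to I1 + I2 = 2*I1
```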
It follows from what has been said above that when a surface is illuminated by
several sources of light (for example, by two lamps), an interference pattern ought
to be observed with a characteristic alternation of maxima and minima of intensity.
We know from our everyday experience, however, that in this case the illumination
of the surface diminishes monotonically with an increasing distance from the light
sources, and no interference pattern is observed. The explanation is that natural
light sources are not coherent.
The incoherence of natural light sources is due to the fact that the radiation of a
luminous body consists of the waves emitted by many atoms. The individual atoms
emit wave trains with a duration of about 10⁻⁸ s and a length of about 3 m (see
Sec. 16.1). The phase of a new train is not related in any way to that of the preceding
one. In the light wave emitted by a body, the radiation of one group of atoms after
about 10⁻⁸ s is replaced by the radiation of another group, and the phase of the
resultant wave undergoes random changes.
Coherent light waves can be obtained by splitting (by means of reflections or
refractions) the wave emitted by a single source into two parts. If these waves are
made to cover different optical paths and are then superposed onto each other,
interference is observed. The difference between the optical paths covered by the
interfering waves must not be very great because the oscillations being added must
belong to the same resultant wave train. If this difference is of the order of
one metre, oscillations corresponding to different trains will be superposed, and
the phase difference between them will continuously change in a chaotic way.
Assume that the splitting into two coherent waves occurs at point O (Fig. 17.1).

Up to point P, the first wave travels the path 𝑠1 in a medium of refractive index 𝑛1 ,
and the second wave travels the path 𝑠2 in a medium of refractive index 𝑛2. If the
phase of oscillations at point O is 𝜔𝑡, then the first wave will produce the oscillation
𝐴1 cos 𝜔(𝑡 − 𝑠1/𝑣1) at point P, and the second wave, the oscillation 𝐴2 cos 𝜔(𝑡 − 𝑠2/𝑣2)
at this point; 𝑣1 = 𝑐/𝑛1 and 𝑣2 = 𝑐/𝑛2 are the phase velocities of the waves. Hence,
the difference between the phases of the oscillations produced by the waves at point
P will be
$$\delta = \omega\left(\frac{s_2}{v_2} - \frac{s_1}{v_1}\right) = \frac{\omega}{c}(n_2 s_2 - n_1 s_1).$$
Replacing 𝜔/𝑐 with 2𝜋 𝜈/𝑐 = 2𝜋/𝜆0 (where 𝜆0 is the wavelength in a vacuum), the
expression for the phase difference can be written in the form
$$\delta = \frac{2\pi}{\lambda_0}\Delta, \tag{17.3}$$
where
$$\Delta = n_2 s_2 - n_1 s_1 = L_2 - L_1 \tag{17.4}$$
is a quantity equal to the difference between the optical paths travelled by the waves
and is called the difference in optical path [compare with Eq. (16.55)].
A glance at Eq. (17.3) shows that if the difference in the optical path equals an
integral number of wavelengths in a vacuum:
𝛥 = ±𝑚𝜆0 (𝑚 = 0, 1, 2, . . .), (17.5)
then the phase difference 𝛿 is a multiple of 2𝜋, and the oscillations produced at point
P by both waves will occur with the same phase. Thus, Eq. (17.5) is the condition for
an interference maximum, i.e., for constructive interference.
If 𝛥 equals a half-integral number of wavelengths in a vacuum:
$$\Delta = \pm\left(m + \tfrac{1}{2}\right)\lambda_0 \qquad (m = 0, 1, 2, \ldots), \tag{17.6}$$
then, 𝛿 = ±(2𝑚 + 1)𝜋, so that the oscillations at point P are in counterphase.
Thus, Eq. (17.6) is the condition for an interference minimum, i.e., for destructive
interference.
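Conditions (17.5) and (17.6) are easy to check numerically. The following is a minimal sketch only: the function names `phase_difference` and `classify` and the tolerance `tol` are introduced here for illustration and do not appear in the text.

```python
import math

def phase_difference(n1, s1, n2, s2, lambda0):
    """Phase difference delta = 2*pi*Delta/lambda0, with Delta = n2*s2 - n1*s1
    (Eqs. 17.3 and 17.4). All lengths must be in the same units."""
    delta_path = n2 * s2 - n1 * s1
    return 2.0 * math.pi * delta_path / lambda0

def classify(delta_path, lambda0, tol=1e-6):
    """Classify an optical path difference as an interference maximum,
    a minimum, or neither (tol is an assumed numerical tolerance)."""
    frac = (abs(delta_path) / lambda0) % 1.0   # fractional part of |Delta|/lambda0
    if min(frac, 1.0 - frac) < tol:
        return "maximum"        # Delta = +-m*lambda0, Eq. (17.5)
    if abs(frac - 0.5) < tol:
        return "minimum"        # Delta = +-(m + 1/2)*lambda0, Eq. (17.6)
    return "intermediate"
```

For example, a path difference of three vacuum wavelengths is classified as a maximum, and one of two and a half wavelengths as a minimum.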
Let us consider two cylindrical coherent light waves emerging from sources
S1 and S2 having the form of parallel thin luminous filaments or narrow slits
(Fig. 17.2). The region in which these waves overlap is called the interference field.
Within this entire region, there are observed alternating places with maximum and
minimum intensity of light. If we introduce a screen into the interference field,
we shall see on it an interference pattern having the form of alternating light and
dark fringes. Let us calculate the width of these fringes, assuming that the screen
is parallel to a plane passing through sources S1 and S2 . We shall characterize the
position of a point on the screen by the coordinate 𝑥 measured in a direction at
right angles to lines S1 and S2 . We shall choose the origin of this coordinate at
point O, relative to which S1 and S2 are arranged symmetrically. We shall consider
that the sources oscillate in the same phase.

Fig. 17.2

Examination of Fig. 17.2 shows that
$$s_1^2 = l^2 + \left(x - \frac{d}{2}\right)^2, \qquad s_2^2 = l^2 + \left(x + \frac{d}{2}\right)^2.$$
Hence,
$$s_2^2 - s_1^2 = (s_2 + s_1)(s_2 - s_1) = 2xd.$$
It will be established somewhat later that to obtain a distinguishable interference
pattern, the distance between the sources 𝑑 must be considerably smaller than the
distance to the screen 𝑙. The distance 𝑥 within whose limits interference fringes are
formed is also considerably smaller than 𝑙. In these conditions, we can assume that
𝑠2 + 𝑠1 ≈ 2𝑙. Thus, 𝑠2 − 𝑠1 = 𝑥𝑑/𝑙. Multiplying 𝑠2 − 𝑠1 by the refractive index of the
medium 𝑛, we get the difference in the optical path
$$\Delta = n\,\frac{xd}{l}. \tag{17.7}$$
The introduction of this value of 𝛥 into condition (17.5) shows that intensity maxima
will be observed at values of 𝑥 equal to
$$x_{\max} = \pm m\,\frac{l}{d}\,\lambda \qquad (m = 0, 1, 2, \ldots). \tag{17.8}$$
Here 𝜆 = 𝜆0 /𝑛 is the wavelength in the medium filling the space between the sources
and the screen.
Using the value of 𝛥 given by Eq. (17.7) in condition (17.6), we get the coordinates
of the intensity minima:
$$x_{\min} = \pm\left(m + \tfrac{1}{2}\right)\frac{l}{d}\,\lambda \qquad (m = 0, 1, 2, \ldots). \tag{17.9}$$
Let us call the distance between two adjacent intensity maxima the distance
between interference fringes, and the distance between adjacent intensity min-
ima the width of an interference fringe. It can be seen from Eqs. (17.8) and (17.9)
that the distance between fringes and the width of a fringe have the same value
equal to
$$\Delta x = \frac{l}{d}\,\lambda. \tag{17.10}$$
According to Eq. (17.10), the distance between the fringes grows with a decreasing
distance 𝑑 between the sources. If 𝑑 were comparable with 𝑙, the distance
between the fringes would be of the same order as 𝜆, i.e., would be several tenths of
a micrometre. In this case, the separate fringes would be absolutely indistinguishable.
For an interference pattern to become distinct, the above-mentioned condition
𝑑 ≪ 𝑙 must be observed.
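Equations (17.8) to (17.10) can be tabulated for a concrete layout. The sketch below is illustrative only; the function names and the numerical values (𝑙 = 1 m, 𝑑 = 0.5 mm, 𝜆 = 0.5 µm) are assumptions chosen for the example, not data from the text.

```python
def fringe_spacing(l, d, lam):
    """Distance between adjacent maxima (or minima), Eq. (17.10)."""
    return l * lam / d

def maxima(l, d, lam, m_max):
    """Coordinates of the maxima of orders -m_max..+m_max, Eq. (17.8)."""
    return [m * l * lam / d for m in range(-m_max, m_max + 1)]

def minima(l, d, lam, m_max):
    """Coordinates of the minima for m = 0..m_max on both sides of the
    centre, Eq. (17.9), sorted in increasing x."""
    positive = [(m + 0.5) * l * lam / d for m in range(m_max + 1)]
    return sorted([-x for x in positive] + positive)

# Assumed example: l = 1 m, d = 0.5 mm, lambda = 0.5 um
dx = fringe_spacing(1.0, 0.5e-3, 0.5e-6)      # about 1e-3 m: easily resolved
# With d comparable to l the spacing shrinks to the order of lambda itself
dx_wide = fringe_spacing(1.0, 1.0, 0.5e-6)    # about 0.5e-6 m
```

The two calls contrast the case 𝑑 ≪ 𝑙, in which the fringes are about a millimetre apart, with the case of 𝑑 comparable with 𝑙.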
If the intensity of the interfering waves is the same (𝐼1 = 𝐼2 = 𝐼0 ), then according
to Eq. (17.2) the resultant intensity at the points for which the phase difference is 𝛿
is determined by the expression
$$I = 2I_0(1 + \cos\delta) = 4I_0\cos^2\frac{\delta}{2}.$$
Since 𝛿 is proportional to 𝛥 [see Eq. (17.3)], then, in accordance with Eq. (17.7), 𝛿 grows
proportionally to 𝑥. Hence, the intensity varies along the screen in accordance with
a cosine-squared law. The right-hand part of Fig. 17.2 shows the dependence of 𝐼
on 𝑥 obtained in monochromatic light.
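The cosine-squared dependence can be written out explicitly by combining Eqs. (17.3) and (17.7), which give 𝛿 = 2𝜋𝑥𝑑/(𝑙𝜆). The function below is a sketch under that substitution; the name `intensity` and the default 𝐼0 = 1 are assumptions made for illustration.

```python
import math

def intensity(x, l, d, lam, i0=1.0):
    """Two-beam intensity on the screen for equal beam intensities i0:
    I = 4*i0*cos^2(delta/2), where delta = 2*pi*x*d/(l*lam)
    follows from Eqs. (17.3) and (17.7) with lam = lambda0/n."""
    delta = 2.0 * math.pi * x * d / (l * lam)
    return 4.0 * i0 * math.cos(delta / 2.0) ** 2
```

At 𝑥 = 0 the function returns 4𝐼0, at half the fringe spacing 𝑙𝜆/(2𝑑) it vanishes, and at 𝑥 = 𝑙𝜆/𝑑 it returns to 4𝐼0.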
The width of the interference fringes and their spacing depend on the wave-
length 𝜆. The maxima of all wavelengths will coincide only at the centre of a pattern
when 𝑥 = 0. With an increasing distance from the centre of the pattern, the maxima
of different colours become displaced from one another more and more. The result
is blurring of the interference pattern when it is observed in white light. The num-
ber of distinguishable interference fringes appreciably grows in monochromatic
light.
Having measured the distance between the fringes 𝛥𝑥 and knowing 𝑙 and 𝑑, we
can use Eq. (17.10) to find 𝜆. It is exactly from experiments involving the interference
of light that the wavelengths for light rays of various colours were determined for
the first time.
We have considered the interference of two cylindrical waves. Let us see what
happens when two plane waves are superposed. Assume that the amplitudes of
these waves are the same, and the directions of their propagation make the angle
2𝜑 (Fig. 17.3). We shall consider that the directions of oscillations of the light vector
are perpendicular to the plane of the drawing.

Fig. 17.3

The wave vectors 𝒌1 and 𝒌2 are in
the plane of the drawing and have the same magnitude equal to 𝑘 = 2𝜋/𝜆. Let us
write the equations of these waves:
𝐴 cos(𝜔𝑡 − 𝒌1 · 𝒓) = 𝐴 cos(𝜔𝑡 − 𝑘 sin 𝜑 × 𝑥 − 𝑘 cos 𝜑 × 𝑦),
𝐴 cos(𝜔𝑡 − 𝒌2 · 𝒓) = 𝐴 cos(𝜔𝑡 + 𝑘 sin 𝜑 × 𝑥 − 𝑘 cos 𝜑 × 𝑦).
The resultant oscillation at points with the coordinates 𝑥 and 𝑦 has the form
𝐴 cos(𝜔𝑡 − 𝑘 sin 𝜑 × 𝑥 − 𝑘 cos 𝜑 × 𝑦) + 𝐴 cos(𝜔𝑡 + 𝑘 sin 𝜑 × 𝑥 − 𝑘 cos 𝜑 × 𝑦)
= 2𝐴 cos(𝑘 sin 𝜑 × 𝑥) cos(𝜔𝑡 − 𝑘 cos 𝜑 × 𝑦). (17.11)
It follows from this equation that at points where 𝑘 sin 𝜑 × 𝑥 = ±𝑚𝜋 (𝑚 =
0, 1, 2, . . .), the amplitude of the oscillations is 2𝐴; where 𝑘 sin 𝜑 × 𝑥 = ±(𝑚 + 1/2)𝜋,
the amplitude of the oscillations is zero. No matter where we place screen Sc, which
is perpendicular to the 𝑦-axis, we shall observe on it a system of alternating light
and dark fringes parallel to the 𝑧-axis (this axis is perpendicular to the plane of the
drawing). The coordinates of the intensity maxima will be
$$x_{\max} = \pm\frac{m\pi}{k\sin\varphi} = \pm\frac{m\lambda}{2\sin\varphi}. \tag{17.12}$$
Only the phase of the oscillations depends on the position of the screen (on the
coordinate 𝑦) [see Eq. (17.11)].
We have assumed for simplicity that the initial phases of the interfering waves are
zero. If the difference between these phases is other than zero, a constant term
will appear in Eq. (17.12), i.e., the fringe pattern will be displaced along the screen.
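The trigonometric identity behind Eq. (17.11) and the fringe coordinates (17.12) can be verified numerically. This is a sketch only; the function names and argument order are assumptions introduced for the example.

```python
import math

def plane_wave_sum(x, y, t, A, k, phi, omega):
    """Sum of the two plane waves written out before Eq. (17.11)."""
    w1 = A * math.cos(omega * t - k * math.sin(phi) * x - k * math.cos(phi) * y)
    w2 = A * math.cos(omega * t + k * math.sin(phi) * x - k * math.cos(phi) * y)
    return w1 + w2

def product_form(x, y, t, A, k, phi, omega):
    """Right-hand side of Eq. (17.11)."""
    return (2.0 * A * math.cos(k * math.sin(phi) * x)
            * math.cos(omega * t - k * math.cos(phi) * y))

def x_max(m, lam, phi):
    """Coordinate of the maximum of order m, Eq. (17.12)."""
    return m * lam / (2.0 * math.sin(phi))
```

Evaluating both forms at an arbitrary point gives the same value, and at 𝑥 = `x_max(m, lam, phi)` the factor cos(𝑘 sin 𝜑 × 𝑥) has unit magnitude, so that the amplitude is 2𝐴.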

17.2. Coherence

By coherence is meant the coordinated course of several oscillatory or wave
processes. The degree of coordination may vary. We can accordingly introduce the
concept of the degree of coherence of two waves.
Temporal and spatial coherence are distinguished. We shall begin with a
discussion of temporal coherence.


Temporal Coherence. The process of interference described in the preceding
section is idealized. This process is actually much more complicated. The reason is
that a monochromatic wave described by the expression
𝐴 cos(𝜔𝑡 − 𝑘𝑟 + 𝛼),
where 𝐴, 𝜔, and 𝛼 are constants, is an abstraction. A real light wave is formed by the
superposition of oscillations of all possible frequencies (or wavelengths) confined
within a more or less narrow but finite range of frequencies 𝛥𝜔 (or the correspond-
ing range of wavelengths 𝛥𝜆). Even for light considered to be monochromatic
(single-coloured), the frequency interval 𝛥𝜔 is finite². In addition, the amplitude
of the wave 𝐴 and the phase 𝛼 undergo continuous random (chaotic) changes with
time. Hence, the oscillations produced at a certain point of space by two superposed
light waves have the form
𝐴1 (𝑡) cos[𝜔1 (𝑡)𝑡 + 𝛼1 (𝑡)], 𝐴2 (𝑡) cos[𝜔2 (𝑡)𝑡 + 𝛼2 (𝑡)], (17.13)
the chaotic changes in the functions 𝐴1 (𝑡), 𝜔1 (𝑡), 𝛼1 (𝑡), 𝐴2 (𝑡), 𝜔2 (𝑡), and 𝛼2 (𝑡)
being absolutely independent.
We shall assume for simplicity’s sake that the amplitudes 𝐴1 and 𝐴2 are constant.
Changes in the frequency and phase can be reduced either to a change only in the
phase, or to a change only in the frequency. Let us write the function
𝑓 (𝑡) = 𝐴 cos[𝜔(𝑡)𝑡 + 𝛼(𝑡)], (17.14)
in the form
𝑓 (𝑡) = 𝐴 cos{𝜔0 𝑡 + [𝜔(𝑡) − 𝜔0 ]𝑡 + 𝛼(𝑡)},
where 𝜔0 is a certain average value of the frequency, and introduce the notation
[𝜔(𝑡) − 𝜔0 ]𝑡 + 𝛼(𝑡) = 𝛼′(𝑡). Equation (17.14) will, thus, become
𝑓 (𝑡) = 𝐴 cos[𝜔0 𝑡 + 𝛼′(𝑡)]. (17.15)
We have obtained a function in which only the phase of the oscillation changes
chaotically.
On the other hand, it is proved in mathematics that a non-harmonic function,
for example, function (17.14), can be represented in the form of the sum of harmonic
functions with frequencies confined within a certain interval 𝛥𝜔 [see Eq. (17.16)].
Thus, when considering the matter of coherence, two approaches are possible:
a “phase” one and a “frequency” one. Let us begin with the phase approach. Assume
that the frequencies 𝜔1 and 𝜔2 in Eqs. (17.13) satisfy the condition 𝜔1 = 𝜔2 = constant.
Now let us find the influence of a change in the phases 𝛼1 and 𝛼2 . According to
Eq. (17.2), with our assumptions, the intensity of light at a given point is determined
by the expression
$$I = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos[\delta(t)],$$
where 𝛿 (𝑡) = 𝛼2 (𝑡) − 𝛼1 (𝑡). The last addend in this equation is called the interference
term.

²The spectral lines emitted by atoms have a “natural” width of the order of 10⁸ rad s⁻¹ (𝛥𝜆 ∼
10⁻⁴ Å).
An instrument that can be used to observe an interference pattern (the eye³, a
photographic plate, etc.) has a certain inertia. In this connection, it registers a pattern
averaged over the time interval 𝑡instr needed for “operation” of the instrument. If
during the time 𝑡instr the factor cos[𝛿 (𝑡)] takes on all the values from −1 to +1,
the average value of the interference term will be zero. Therefore, the intensity
registered by the instrument will equal the sum of the intensities produced at a
given point by each of the waves separately—interference is absent, and we are
forced to acknowledge that the waves are incoherent.
If during the time 𝑡instr , however, the value of cos[𝛿 (𝑡)] remains virtually con-
stant⁴, the instrument will detect interference, and the waves must be acknowledged
as coherent.
It follows from the above that the concept of coherence is relative: two waves
can behave like coherent ones when observed using one instrument (having a low
inertia), and like incoherent ones when observed using another instrument (having
a high inertia). The coherent properties of waves are characterized by introducing
the coherence time 𝑡coh . It is defined as the time during which a chance change
in the wave phase 𝛼(𝑡) reaches a value of the order of 𝜋. During the time 𝑡coh , an
oscillation, as it were, forgets its initial phase and becomes incoherent with respect
to itself.
Using the concept of the coherence time, we can say that when the instrument
time is much greater than the coherence time of the superposed waves (𝑡instr ≫ 𝑡coh ),
the instrument does not register interference. When 𝑡instr ≪ 𝑡coh , the instrument
will detect a sharp interference pattern. At intermediate values of 𝑡instr , the sharpness
of the pattern will diminish as 𝑡instr grows from values smaller than 𝑡coh to values
greater than it.
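The dependence of the sharpness of the pattern on the ratio 𝑡instr /𝑡coh can be illustrated with a toy model. The sketch below rests on assumptions that are not in the text: the phase difference 𝛿 is taken to be constant for a time 𝑡coh and then to jump to an independent random value, the two beams have equal intensity, and the sharpness is measured by the magnitude of the time average of exp(𝑖𝛿), which equals the fringe visibility in this model. The name `fringe_visibility` is hypothetical.

```python
import cmath
import math
import random

def fringe_visibility(t_instr, t_coh, samples=20000, seed=0):
    """Magnitude of the time average of exp(i*delta) over the instrument
    time t_instr. Toy model (an assumption): delta stays constant during
    each interval of length t_coh and then takes a new random value."""
    rng = random.Random(seed)
    total = 0j
    current_train = -1
    delta = 0.0
    for i in range(samples):
        t = (i + 0.5) * t_instr / samples
        train = int(t // t_coh)
        if train != current_train:      # a new train: the old phase is forgotten
            current_train = train
            delta = rng.uniform(0.0, 2.0 * math.pi)
        total += cmath.exp(1j * delta)
    return abs(total / samples)
```

In this model `fringe_visibility(0.01, 1.0)` is practically unity, because the whole observation falls within one train, whereas `fringe_visibility(1000.0, 1.0)` is close to zero, because the interference term averages out over about a thousand independent trains.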
The distance 𝑙coh = 𝑐𝑡coh over which a wave travels during the time 𝑡coh is called
the coherence length (or the train length). The coherence length is the distance
over which a chance change in the phase reaches a value of about 𝜋. To obtain an
interference pattern by splitting a natural wave into two parts, it is essential that the
optical path difference 𝛥 be smaller than the coherence length. This requirement
limits the number of visible interference fringes observed when using the layout
shown in Fig. 17.2. An increase in the fringe number 𝑚 is attended by a growth in
the path difference. As a result, the sharpness of the fringes becomes poorer and
poorer.

³We remind our reader that the showing of motion picture films is based on the inertia of visual
perception, which is about 0.1 s.
⁴The phase difference 𝛿 (𝑡) varies for different points of space. The influence of the interference
term manifests itself at the points where it differs from zero.

Fig. 17.4

Let us pass over to a consideration of the role of the non-monochromatic
nature of light waves. Assume that light consists of a sequence of identical trains
of frequency 𝜔0 and duration 𝜏. When one train is replaced with another one,
the phase experiences disordered changes. As a result, the trains are mutually
incoherent. With these assumptions, the duration of a train 𝜏 virtually coincides
with the coherence time 𝑡coh .
In mathematics, the Fourier theorem is proved, according to which any finite
and integrable function 𝐹 (𝑡) can be represented in the form of the sum of an infinite
number of harmonic components with a continuously changing frequency:
$$F(t) = \int_{-\infty}^{+\infty} A(\omega)\, e^{i\omega t}\, d\omega. \tag{17.16}$$
Expression (17.16) is known as the Fourier integral. The function 𝐴(𝜔) inside
the integral is the amplitude of the relevant monochromatic component. Accord-
ing to the theory of Fourier integrals, the analytical form of the function 𝐴(𝜔) is
determined by the expression
$$A(\omega) = 2\pi \int_{-\infty}^{+\infty} F(\xi)\, e^{-i\omega\xi}\, d\xi, \tag{17.17}$$
where 𝜉 is an auxiliary integration variable.
Assume that the function 𝐹 (𝑡) describes a light disturbance at a certain point
at the moment of time 𝑡 due to a single wave train. Hence, it is determined by the
conditions
$$F(t) = A_0 \exp(i\omega_0 t) \quad \text{at } |t| \le \frac{\tau}{2}, \qquad F(t) = 0 \quad \text{at } |t| > \frac{\tau}{2}.$$
A graph of the real part of this function is given in Fig. 17.4.
Fig. 17.5

Outside the interval from −𝜏/2 to +𝜏/2, the function 𝐹 (𝑡) is zero. Therefore,
expression (17.17) determining the amplitude of the harmonic components has the
form
$$A(\omega) = 2\pi \int_{-\tau/2}^{+\tau/2} [A_0 \exp(i\omega_0\xi)] \exp(-i\omega\xi)\, d\xi = 2\pi A_0 \int_{-\tau/2}^{+\tau/2} \exp[i(\omega_0 - \omega)\xi]\, d\xi = 2\pi A_0 \left.\frac{\exp[i(\omega_0 - \omega)\xi]}{i(\omega_0 - \omega)}\right|_{-\tau/2}^{+\tau/2}.$$
After introducing the integration limits and simple transformations, we arrive at
the equation
$$A(\omega) = 2\pi A_0 \tau\, \frac{\sin[(\omega - \omega_0)\tau/2]}{(\omega - \omega_0)\tau/2}.$$
The intensity 𝐼 (𝜔) of a harmonic wave component is proportional to the square
of the amplitude, i.e., to the expression
$$f(\omega) = \frac{\sin^2[(\omega - \omega_0)\tau/2]}{[(\omega - \omega_0)\tau/2]^2}. \tag{17.18}$$
A graph of function (17.18) is shown in Fig. 17.5. A glance at the figure shows that
the intensity of the components whose frequencies are within the interval of width
𝛥𝜔 = 2𝜋/𝜏 considerably exceeds the intensity of the remaining components. This
circumstance allows us to relate the duration of a train 𝜏 to the effective frequency
range 𝛥𝜔 of a Fourier spectrum:
$$\tau = \frac{2\pi}{\Delta\omega} = \frac{1}{\Delta\nu}.$$
Identifying 𝜏 with the coherence time, we arrive at the relation
$$t_{\text{coh}} \sim \frac{1}{\Delta\nu} \tag{17.19}$$
(the sign ∼ stands for “equal in order of magnitude to”).
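The spectrum of a single train can be checked by direct numerical integration. The sketch below evaluates the integral of exp[𝑖(𝜔0 − 𝜔)𝜉] over the train without its constant prefactor and compares it with the closed form 𝜏 sin 𝑢/𝑢, where 𝑢 = (𝜔 − 𝜔0 )𝜏/2. The function names, the midpoint rule and the number of steps are assumptions made for the illustration.

```python
import cmath
import math

def train_integral_numeric(omega, omega0, tau, steps=20000):
    """Midpoint-rule value of the integral of exp(i*(omega0 - omega)*xi)
    over -tau/2 <= xi <= tau/2 (the constant prefactor is omitted)."""
    h = tau / steps
    total = 0j
    for j in range(steps):
        xi = -tau / 2.0 + (j + 0.5) * h
        total += cmath.exp(1j * (omega0 - omega) * xi)
    return (total * h).real     # the imaginary part cancels by symmetry

def train_integral_closed(omega, omega0, tau):
    """Closed form tau*sin(u)/u with u = (omega - omega0)*tau/2."""
    u = (omega - omega0) * tau / 2.0
    return tau if u == 0.0 else tau * math.sin(u) / u

def f(omega, omega0, tau):
    """Normalized intensity profile of Eq. (17.18)."""
    return (train_integral_closed(omega, omega0, tau) / tau) ** 2
```

The profile `f` equals unity at 𝜔 = 𝜔0 and first vanishes at 𝜔 − 𝜔0 = ±2𝜋/𝜏, which is the origin of the estimate 𝛥𝜔 = 2𝜋/𝜏 for the effective width of the spectrum.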
It can be seen from expression (17.19) that the broader the interval of frequencies
present in a given light wave, the smaller is the coherence time of this wave.
The frequency is related to the wavelength in a vacuum by the expression
𝜈 = 𝑐/𝜆0 . Differentiation of this expression yields 𝛥𝜈 = 𝑐 𝛥𝜆0 /𝜆0² ≈ 𝑐 𝛥𝜆/𝜆² (we
have omitted the minus sign obtained in differentiation and also assumed that
𝜆0 ≈ 𝜆). Substituting for 𝛥𝜈 in Eq. (17.19) its expression through 𝜆 and 𝛥𝜆, we obtain
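the following expression for the coherence time:
$$t_{\text{coh}} \sim \frac{\lambda^2}{c\,\Delta\lambda}. \tag{17.20}$$
Hence, we get the following value for the coherence length:
$$l_{\text{coh}} = c\,t_{\text{coh}} \sim \frac{\lambda^2}{\Delta\lambda}. \tag{17.21}$$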
Examination of Eq. (17.5) shows that the path difference at which a maximum of
the 𝑚-th order is obtained is determined by the relation
𝛥𝑚 = ±𝑚𝜆0 ≈ ±𝑚𝜆.
When this path difference reaches values of the order of the coherence length, the
fringes become indistinguishable. Consequently, the extreme interference order
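observed is determined by the condition
$$m_{\text{extr}}\,\lambda \sim l_{\text{coh}} \sim \frac{\lambda^2}{\Delta\lambda},$$
whence
$$m_{\text{extr}} \sim \frac{\lambda}{\Delta\lambda}. \tag{17.22}$$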
It follows from Eq. (17.22) that the number of interference fringes observed according
to the layout shown in Fig. 17.2 grows when the wavelength interval in the light
used diminishes.
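Relations (17.20) to (17.22) are order-of-magnitude estimates, but it is instructive to put numbers into them. The sketch below uses 𝜆 = 5000 Å and 𝛥𝜆 = 20 Å, the values adopted later in Sec. 17.4; the function names and the rounded value of 𝑐 are assumptions made for the example.

```python
C = 2.998e8  # speed of light in a vacuum, m/s (rounded)

def coherence_time(lam, dlam):
    """Order-of-magnitude coherence time, Eq. (17.20)."""
    return lam ** 2 / (C * dlam)

def coherence_length(lam, dlam):
    """Order-of-magnitude coherence length, Eq. (17.21)."""
    return lam ** 2 / dlam

def extreme_order(lam, dlam):
    """Highest observable interference order, Eq. (17.22)."""
    return lam / dlam

lam, dlam = 5000e-10, 20e-10          # 5000 A and 20 A expressed in metres
l_coh = coherence_length(lam, dlam)   # about 1.25e-4 m, i.e. roughly 0.1 mm
m_extr = extreme_order(lam, dlam)     # about 250 fringes
t_coh = coherence_time(lam, dlam)     # about 4e-13 s
```

Halving 𝛥𝜆 doubles all three quantities, in agreement with the conclusion drawn from Eq. (17.22).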
Spatial Coherence. According to the equation 𝑘 = 𝜔/𝑣 = 𝑛𝜔/𝑐, a spread
of the frequencies 𝛥𝜔 results in a spread of the values of 𝑘. We have established
that the temporal coherence is determined by the value of 𝛥𝜔. Consequently, the
temporal coherence is associated with a spread of the values of the magnitude of
the wave vector 𝒌. Spatial coherence is associated with a spread of the directions
of the vector 𝒌 that is characterized by the quantity $\Delta\hat{\mathbf{e}}_k$.
The setting up at a certain point of space of oscillations produced by waves
with different values of $\hat{\mathbf{e}}_k$ is possible if these waves are emitted by different sections
of an extended (not a point) light source. Let us assume for simplicity’s sake that
the source has the form of a disk visible from a given point at the angle 𝜑. It can
be seen from Fig. 17.6 that the angle 𝜑 characterizes the interval confining the unit
vectors $\hat{\mathbf{e}}_k$. We shall consider that this angle is small.
Fig. 17.6

Assume that the light from the source falls on two narrow slits behind which
there is a screen (Fig. 17.7). We shall consider that the interval of frequencies emitted
by the source is very small. This is needed for the degree of temporal coherence to
be sufficient for obtaining a sharp interference pattern. The wave arriving from the
section of the surface designated in Fig. 17.7 by O produces a zero-order maximum
M at the middle of the screen. The zero-order maximum M′ produced by the wave
arriving from section O′ will be displaced from the middle of the screen by the
distance 𝑥′. Owing to the smallness of the angle 𝜑 and of the ratio 𝑑/𝑙, we can
consider that 𝑥′ = 𝑙𝜑/2. The zero-order maximum M″ produced by the wave
arriving from section O″ is displaced in the opposite direction from the middle of
the screen over the distance 𝑥″ equal to 𝑥′. The zero-order maxima from the other
sections of the source will be between the maxima M′ and M″.
The separate sections of the light source produce waves whose phases are in
no way related to one another. For this reason, the interference pattern appearing
on the screen will be a superposition of the patterns produced by each section
separately. If the displacement 𝑥′ is much smaller than the width of an interference
fringe 𝛥𝑥 = 𝑙𝜆/𝑑 [see Eq. (17.10)], then the maxima from different sections of the
source will practically be superposed on one another, and the pattern will be like
the one produced by a point source. When 𝑥′ ≈ 𝛥𝑥, the maxima from some
sections will coincide with the minima from others, and no interference pattern
will be observed. Thus, an interference pattern will be distinguishable provided
that 𝑥′ < 𝛥𝑥, i.e.,
$$\frac{l\varphi}{2} < \frac{l\lambda}{d}, \tag{17.23}$$
or
$$\varphi < \frac{\lambda}{d}. \tag{17.24}$$
We have omitted the factor 2 when passing over from expression (17.23) to (17.24).
Formula (17.24) determines the angular dimensions of a source at which interfer-
ence is observed. We can also use this formula to find the greatest distance between
the slits at which interference from a source with the angular dimension 𝜑 can still
be observed. Multiplying inequality (17.24) by 𝑑/𝜑, we arrive at the condition
$$d < \frac{\lambda}{\varphi}. \tag{17.25}$$
Fig. 17.7

A collection of waves with different values of $\hat{\mathbf{e}}_k$ can be replaced with the
resultant wave falling on a screen with slits. The absence of an interference pattern
signifies that the oscillations produced by this wave at the places where the first
and second slits are situated are incoherent. Consequently, the oscillations in the
wave itself at points at a distance 𝑑 apart are incoherent too. If the source were
ideally monochromatic (this means that 𝛥𝜈 = 0 and 𝑡coh = ∞), the surface passing
through the slits would be a wave surface, and the oscillations at all the points of this
surface would occur in the same phase. We have established that when 𝛥𝜈 ≠ 0 and
the source has finite dimensions (𝜑 ≠ 0), the oscillations at points of a surface at a
distance of 𝑑 > 𝜆/𝜑 are incoherent.
We shall call a surface, which would be a wave one if the source were monochro-
matic, a pseudowave surface⁵ for brevity. We could satisfy condition (17.24) by
reducing the distance 𝑑 between the slits, i.e., by taking closer points of the pseu-
dowave surface. Consequently, oscillations produced by a wave at adequately close
points of a pseudowave surface are coherent. Such coherence is called spatial. Thus,
the phase of an oscillation changes chaotically when passing from one point of a
pseudowave surface to another. Let us introduce the distance 𝜌coh , upon displace-
ment by which along a pseudowave surface a random change in the phase reaches a
value of about 𝜋. Oscillations at two points of a pseudowave surface spaced apart
at a distance less than 𝜌coh will be approximately coherent. The distance 𝜌coh is
called the spatial coherence length or the coherence radius. It can be seen from
expression (17.25) that
𝜆
𝜌coh ∼ . (17.26)
𝜑
The angular dimension of the Sun is about 0.01 radians, and the length of its
light waves is about 0.5 µm. Hence, the coherence radius of the light waves arriving
from the Sun has a value of the order of
$$\rho_{\text{coh}} = \frac{0.5}{0.01} = 50\ \mu\text{m} = 0.05\ \text{mm}. \tag{17.27}$$

⁵It must be borne in mind that this term is not used in scientific publications. The author has
coined it for conditional use only to make the treatment more illustrative.

The entire space occupied by a wave can be divided into parts in each of which
the wave approximately retains coherence. The volume of such a part of space,
called the coherence volume, in its order of magnitude equals the product of the
temporal coherence length and the area of a circle of radius 𝜌coh .
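The estimates of the coherence radius and of the coherence volume reduce to two one-line formulas. In the sketch below the function names are hypothetical, and the temporal coherence length of 1 µm used for the volume is an assumed illustrative value, not a figure from the text.

```python
import math

def coherence_radius(lam, phi):
    """Order-of-magnitude coherence radius, Eq. (17.26)."""
    return lam / phi

def coherence_volume(l_coh, rho_coh):
    """Temporal coherence length times the area of a circle of radius rho_coh."""
    return l_coh * math.pi * rho_coh ** 2

rho_sun = coherence_radius(0.5e-6, 0.01)   # about 5e-5 m, cf. Eq. (17.27)
# Assumed temporal coherence length of 1 um, purely for illustration
v_coh = coherence_volume(1.0e-6, rho_sun)  # about 8e-15 m^3
```

Since the radius is inversely proportional to the angular dimension 𝜑, a source seen at one tenth of the angle gives a coherence radius ten times greater and a coherence volume a hundred times greater.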
The spatial coherence of a light wave near the surface of the heated body
emitting it is restricted by a value of 𝜌coh of only a few wavelengths. With an
increasing distance from the source, the degree of spatial coherence grows. The
radiation of a laser⁶ has an enormous temporal and spatial coherence. At the outlet
opening of a laser, spatial coherence is observed throughout the entire cross section
of the light beam.
It would seem possible to observe interference by passing light propagating
from an arbitrary source through two slits in an opaque screen. With a small spatial
coherence of the wave falling on the slits, however, the beams of light passing
through them will be incoherent, and an interference pattern will be absent. The
English scientist Thomas Young (1773-1829) in 1802 obtained interference from two
slits by increasing the spatial coherence of the light falling on the slits. Young
achieved such an increase by first passing the light through a small aperture in an
opaque screen. This light was used to illuminate the slits in a second opaque screen.
Thus, for the first time in history, Young observed the interference of light waves
and determined the lengths of these waves.

17.3. Ways of Observing the Interference of Light

Let us consider two concrete interference layouts of which one uses reflection for
splitting a light wave into two parts, and the other refraction of light.
Fresnel’s Double Mirror. Two plane contacting mirrors OM and ON are
arranged so that their reflecting surfaces form an obtuse angle close to 𝜋 (Fig. 17.8).
Hence, the angle 𝜑 in the figure is very small. A straight light source S (for example,
a narrow luminous slit) is placed parallel to the line of intersection of the mirrors
O (perpendicular to the plane of the drawing) at a distance 𝑟 from it. The mirrors
reflect two cylindrical coherent waves onto screen Sc. They propagate as if they
were emitted by virtual sources S1 and S2 . Opaque screen Sc1 prevents the direct
propagation of the light from source S to screen Sc.
⁶Lasers will be treated in Vol. III of our course.

Fig. 17.8

Ray OQ is the reflection of ray SO from mirror OM, and ray OP is the reflection
of ray SO from mirror ON. It is easy to see that the angle between rays OP and OQ
is 2𝜑. Since S and S1 are symmetrical relative to OM, the length of segment OS1
equals OS, i.e., 𝑟. Similar reasoning leads to the same result for segment OS2 . Thus,
the distance between sources S1 and S2 is
𝑑 = 2𝑟 sin 𝜑 ≈ 2𝑟𝜑.
Inspection of Fig. 17.8 shows that 𝑎 = 𝑟 cos 𝜑 ≈ 𝑟. Hence,
𝑙 = 𝑟 + 𝑏,
where 𝑏 is the distance from the line of intersection of the mirrors O to screen Sc.
Using the values of 𝑑 and 𝑙 we have found in Eq. (17.10), we obtain the width of
an interference fringe:
$$\Delta x = \frac{r + b}{2r\varphi}\,\lambda. \tag{17.28}$$
The region of wave overlapping PQ has a length of 2𝑏 tan 𝜑 ≈ 2𝑏𝜑. Dividing this
length by the width of a fringe 𝛥𝑥, we find the maximum number of interference
fringes that can be observed with the aid of Fresnel’s double mirror at the given
parameters of a layout:
$$N = \frac{4br\varphi^2}{\lambda(r + b)}. \tag{17.29}$$
For all these fringes to be visible indeed, it is essential that 𝑁/2 be not greater than
𝑚extr determined by expression (17.22).
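For a given layout, Eqs. (17.28) and (17.29) together with condition (17.22) can be evaluated as follows. The sketch is illustrative: the function names and the parameter values (𝑟 = 0.1 m, 𝑏 = 1 m, 𝜑 = 0.005 rad, 𝜆 = 0.5 µm, 𝛥𝜆 = 20 Å) are assumptions, not data from the text.

```python
def mirror_fringe_width(r, b, phi, lam):
    """Width of an interference fringe for Fresnel's double mirror, Eq. (17.28)."""
    return (r + b) * lam / (2.0 * r * phi)

def mirror_fringe_count(r, b, phi, lam):
    """Maximum number of fringes in the overlap region, Eq. (17.29)."""
    return 4.0 * b * r * phi ** 2 / (lam * (r + b))

def all_fringes_visible(n_fringes, lam, dlam):
    """True if N/2 does not exceed m_extr ~ lam/dlam, Eq. (17.22)."""
    return n_fringes / 2.0 <= lam / dlam

# Assumed layout: r = 0.1 m, b = 1 m, phi = 0.005 rad, lambda = 0.5 um
width = mirror_fringe_width(0.1, 1.0, 0.005, 0.5e-6)   # about 0.55 mm
count = mirror_fringe_count(0.1, 1.0, 0.005, 0.5e-6)   # about 18 fringes
visible = all_fringes_visible(count, 0.5e-6, 20e-10)   # N/2 ~ 9 against m_extr ~ 250
```

With these assumed values the number of fringes is limited by the size of the overlap region and not by the temporal coherence of the light.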
Fresnel’s Biprism. Two prisms with a small refracting angle 𝜃 made from a
single piece of glass have one common face (Fig. 17.9). A straight light source S is
arranged parallel to this face at a distance a from it.

Fig. 17.9

It can be shown that when the refracting angle 𝜃 of the prism is very small and
the angles of incidence of the rays on the face of the prism are not very great, all
the rays are deflected by the prism through a practically identical angle equal to
𝜑 = (𝑛 − 1) 𝜃
(𝑛 is the refractive index of the prism). The angle of incidence of the rays on the
biprism is not great. Therefore, all the rays are deflected by each half of the biprism
through the same angle. As a result, two coherent cylindrical waves are formed
emerging from virtual sources S1 and S2 in the same plane as S. The distance
between the sources is
𝑑 = 2𝑎 sin 𝜑 ≈ 2𝑎𝜑 = 2𝑎(𝑛 − 1)𝜃.
The distance from the sources to the screen is
𝑙 = 𝑎 + 𝑏.
We find the width of an interference fringe by Eq. (17.10):
$$\Delta x = \frac{a + b}{2a(n - 1)\theta}\,\lambda. \tag{17.30}$$
The region of overlapping of the waves PQ has the length
2𝑏 tan 𝜑 ≈ 2𝑏𝜑 = 2𝑏(𝑛 − 1) 𝜃.
The maximum number of fringes observed is
$$N = \frac{4ab(n - 1)^2\theta^2}{\lambda(a + b)}. \tag{17.31}$$
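The chain of relations for the biprism, from the deflection angle to the number of fringes, can be collected in one function. This is a sketch under assumed values (𝑎 = 0.2 m, 𝑏 = 0.8 m, 𝑛 = 1.5, 𝜃 = 0.005 rad, 𝜆 = 0.5 µm); the function name `biprism_layout` and the returned dictionary keys are hypothetical.

```python
def biprism_layout(a, b, n, theta, lam):
    """Quantities of Fresnel's biprism in the small-angle approximation."""
    phi = (n - 1.0) * theta            # deflection by each half of the biprism
    d = 2.0 * a * phi                  # distance between the virtual sources
    l = a + b                          # distance from the sources to the screen
    width = l * lam / d                # fringe width, Eq. (17.30)
    overlap = 2.0 * b * phi            # length of the overlap region PQ
    return {"phi": phi, "d": d, "l": l, "width": width,
            "N": overlap / width}      # number of fringes, Eq. (17.31)

# Assumed layout: a = 0.2 m, b = 0.8 m, n = 1.5, theta = 0.005 rad, lambda = 0.5 um
layout = biprism_layout(0.2, 0.8, 1.5, 0.005, 0.5e-6)
```

For these assumed values the virtual sources are 1 mm apart, the fringe width is 0.5 mm, and about eight fringes fit into the overlap region, in agreement with the closed form (17.31).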

17.4. Interference of Light Reflected from Thin Plates

When a light wave falls on a thin transparent plate (or film), reflection occurs from
both surfaces of the plate. The result is the production of two light waves that
under certain conditions can interfere.

Fig. 17.10

Assume that a plane light wave which can be considered as a parallel beam of
rays falls on a transparent plane-parallel plate (Fig. 17.10). The plate reflects upward
two parallel beams of light. One of them was formed as a result of reflection from
the top surface of the plate and the second as a result of reflection from its bottom
surface (in Fig. 17.10 each of these beams is represented by only one ray). The second
beam is refracted when it enters the plate and leaves it. In addition to these two
beams, the plate throws upward beams produced as a result of three-, five-fold, etc.
reflection from the plate surfaces. Owing to their small intensity, however, we shall
take no account of these beams⁷. We shall also display no interest in the beams
passing through the plate.
The path difference acquired by rays 1 and 2 before they meet at point C is
𝛥 = 𝑛𝑠2 − 𝑠1 , (17.32)
where 𝑠1 is the length of segment BC, 𝑠2 is the total length of segments AO and OC
and 𝑛 the refractive index of the plate.
We assume that the refractive index of the medium surrounding the plate is
unity. A glance at Fig. 17.10 shows that 𝑠1 = 2𝑏 tan 𝜃 2 × sin 𝜃 1 , and 𝑠2 = 2𝑏/cos 𝜃 2
(here, 𝑏 is the thickness of the plate).

⁷At 𝑛 = 1.5, about 5% of the incident luminous flux is reflected from the surface of the plate (see
the last paragraph of Sec. 16.3). After two reflections, the intensity will be 0.05 × 0.05 or 0.25% of
the intensity of the initial beam. After three reflections, the relevant figure is 0.05 × 0.05 × 0.05, or
0.0125%, which is 1/400 of the intensity of the singly reflected beam.
Using these values in Eq. (17.32), we get
$$\Delta = \frac{2bn}{\cos\theta_2} - 2b\tan\theta_2\sin\theta_1 = 2b\,\frac{n^2 - n\sin\theta_2\sin\theta_1}{n\cos\theta_2}.$$
Substituting sin 𝜃1 for 𝑛 sin 𝜃2 and taking into account that
$$n\cos\theta_2 = \sqrt{n^2 - n^2\sin^2\theta_2} = \sqrt{n^2 - \sin^2\theta_1},$$
it is easy to give the equation for 𝛥 the form
$$\Delta = 2b\sqrt{n^2 - \sin^2\theta_1}. \tag{17.33}$$
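The passage from Eq. (17.32) to Eq. (17.33) can be checked numerically by computing the path difference both from the geometry of the rays and from the closed form. The sketch below assumes, as in the text, that the surrounding medium has a refractive index of unity; the function names are hypothetical.

```python
import math

def path_difference_geometric(b, n, theta1):
    """Delta = n*s2 - s1 of Eq. (17.32), with s1 and s2 taken from the
    ray geometry and the refraction angle from the law of refraction."""
    theta2 = math.asin(math.sin(theta1) / n)
    s1 = 2.0 * b * math.tan(theta2) * math.sin(theta1)
    s2 = 2.0 * b / math.cos(theta2)
    return n * s2 - s1

def path_difference_closed(b, n, theta1):
    """Closed form of Eq. (17.33)."""
    return 2.0 * b * math.sqrt(n ** 2 - math.sin(theta1) ** 2)
```

The two functions agree for any angle of incidence; at normal incidence both reduce to 2𝑏𝑛.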
When calculating the phase difference 𝛿 between the oscillations in rays 1 and
2, it is necessary, in addition to the optical path difference 𝛥, to take into account
the possibility of a change in the phase of the wave upon reflection (see Sec. 16.3). At
point A (see Fig. 17.10), reflection occurs from the interface between the optically
less dense medium and the optically denser one. Consequently, the wave phase
experiences a change by 𝜋. At point O, reflection occurs from the interface between
the optically denser medium and the optically less dense one, so that there is no
jump in the phase. Hence, an additional phase difference equal to 𝜋 is produced
between rays 1 and 2. It can be taken into account by adding to 𝛥 (or subtracting
from it) half a wavelength in a vacuum. The result is
$$\Delta = 2b\sqrt{n^2 - \sin^2\theta_1} - \frac{\lambda_0}{2}. \tag{17.34}$$
Thus, when a plane wave falls on the plate, two reflected waves are formed, and
their path difference is determined by Eq. (17.34). Let us determine the conditions in
which these waves will be coherent and can interfere. We shall consider two cases.
1. A Plane-Parallel Plate. Both plane reflected waves propagate in one direc-
tion making an angle equal to the angle of incidence 𝜃 1 with a normal to the plate.
These waves can interfere if conditions of both temporal and spatial coherence are
observed.
For temporal coherence to take place, the path difference given by Eq. (17.34)
must not exceed the coherence length equal to 𝜆²/𝛥𝜆 ≈ 𝜆0²/𝛥𝜆0 [see expression
(17.21)]. Consequently, the condition
$$2b\sqrt{n^2 - \sin^2\theta_1} - \frac{\lambda_0}{2} < \frac{\lambda_0^2}{\Delta\lambda_0},$$
or
$$b < \frac{\lambda_0(\lambda_0/\Delta\lambda_0 + 1/2)}{2\sqrt{n^2 - \sin^2\theta_1}},$$
must be observed. In the obtained relation, we may disregard 1/2 in comparison
with 𝜆0 /𝛥𝜆0 . The expression $\sqrt{n^2 - \sin^2\theta_1}$ has a magnitude of the order of unity⁸.
We can therefore write
$$b < \frac{\lambda_0^2}{2\Delta\lambda_0} \tag{17.35}$$
(the double plate thickness must be less than the coherence length).

Fig. 17.11

Thus, the reflected waves will be coherent only if the plate thickness 𝑏 does not
exceed the value determined by expression (17.35). Assuming that 𝜆0 = 5000 Å and
𝛥𝜆0 = 20 Å, we get the extreme value of the thickness equal to
$$\frac{5000^2}{2 \times 20} \approx 6 \times 10^5\ \text{Å} = 0.06\ \text{mm}. \tag{17.36}$$
Now, let us consider the conditions for observance of spatial coherence. Let us
place screen Sc in the path of the reflected beams (Fig. 17.11). Rays 1′ and 2′ arriving
at point P′ will be at a distance 𝜌′ apart in the incident beam. If this distance does
not exceed the coherence radius 𝜌coh of the incident wave, rays 1′ and 2′ will be
coherent and will produce at point P′ an illumination determined by the value of
the path difference 𝛥 corresponding to the angle of incidence 𝜃1 . The other pairs of
rays travelling at the same angle 𝜃1 will produce the same illumination at the other
points of the screen. The screen will thus be uniformly illuminated (in the particular
case when 𝛥 = (𝑚 + 1/2)𝜆0 , the screen will be dark). When the inclination of the
beam is changed (i.e., when the angle 𝜃1 is changed), the illumination of the screen
will change too.
A glance at Fig. 17.10 shows that the distance between the incident rays 1 and 2
is
$$\rho = 2b\tan\theta_2\cos\theta_1 = \frac{b\sin(2\theta_1)}{\sqrt{n^2 - \sin^2\theta_1}}. \tag{17.37}$$

⁸For 𝑛 = 1.5, the magnitude of this expression varies within the limits from 1.12 (at 𝜃1 = 𝜋/2) to
1.5 (at 𝜃1 = 0).

Fig. 17.12

If we assume that $n = 1.5$, then, for $\theta_1 = 45^\circ$ we get $\rho \approx 0.8b$, and for $\theta_1 = 10^\circ$ we
get $\rho \approx 0.2b$. For normal incidence ($\theta_1 = 0$), we have $\rho = 0$ at any $n$.
The coherence radius of sunlight has a value of the order of 0.05 mm [see
Eq. (17.27)]. At an angle of incidence of 45°, we may assume that $\rho \approx b$. Hence, for
interference to occur in these conditions, the relation
$$b < 0.05\ \text{mm} \tag{17.38}$$
must be observed [compare with Eq. (17.36)]. For an angle of incidence of about 10°,
spatial coherence will be retained at a plate thickness not exceeding about 0.2 mm. We
thus arrive at the conclusion that owing to the restrictions imposed by temporal and
spatial coherence, interference is observed when a plate is illuminated by sunlight
only if the thickness of the plate does not exceed a few hundredths of a millimetre.
Upon illumination with light having a greater degree of coherence, interference is
also observed in reflection from thicker plates or films.
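
The dependence of the ray separation on the angle of incidence can be tabulated directly from the right-hand side of Eq. (17.37). The sketch below is an illustration only: the function names are ours, and the coherence radius of 0.05 mm is the order-of-magnitude value quoted above for sunlight.

```python
import math

def ray_separation_over_b(theta1_deg, n):
    """rho / b from Eq. (17.37): sin(2*theta1) / sqrt(n**2 - sin(theta1)**2)."""
    t = math.radians(theta1_deg)
    return math.sin(2 * t) / math.sqrt(n**2 - math.sin(t) ** 2)

def max_thickness_spatial(theta1_deg, n, rho_coh):
    """Largest b for which rho stays below rho_coh (unbounded at normal incidence)."""
    ratio = ray_separation_over_b(theta1_deg, n)
    return math.inf if ratio == 0 else rho_coh / ratio

for angle in (0, 10, 45):
    ratio = ray_separation_over_b(angle, 1.5)
    limit = max_thickness_spatial(angle, 1.5, rho_coh=0.05)  # in mm
    print(f"theta1 = {angle:2d} deg: rho/b = {ratio:.3f}, b_max = {limit:.3g} mm")
```

At normal incidence the spatial restriction disappears altogether, and only the temporal condition (17.35) limits the thickness.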
Interference from a plane-parallel plate is observed in practice by placing in
the path of the reflected beams a lens that gathers the rays at one of the points of
the screen in the focal plane of the lens (Fig. 17.12). The illumination at this point
depends on the value of quantity (17.34). When 𝛥 = 𝑚𝜆0 , we get maxima, and when
𝛥 = (𝑚 + 1/2)𝜆0 —minima of the intensity (𝑚 is an integer). The condition for the
maximum intensity has the form
$$2b\sqrt{n^2 - \sin^2\theta_1} = \left(m + \frac{1}{2}\right)\lambda_0. \tag{17.39}$$
Assume that a thin plane-parallel plate is illuminated by diffuse monochromatic
light (see Fig. 17.12). Let us arrange a lens parallel to the plate and put a screen in the
focal plane of the lens. Diffuse light contains rays of the most diverse directions.
The rays parallel to the plane of the drawing and falling on the plate at the angle
$\theta_1'$ after reflection from both surfaces of the plate will be gathered by the lens at
point P′ and will set up at this point an illumination determined by the value of the
optical path difference. Rays propagating in other planes but falling on the plate at
the same angle $\theta_1'$ will be gathered by the lens at other points at the same distance
as point P′ from centre O of the screen. The illumination at all these points will
be the same. Thus, the rays falling on the plate at the same angle $\theta_1'$ will produce
on the screen a collection of identically illuminated points arranged along a circle
with its centre at O. Similarly, the rays falling at a different angle $\theta_1''$ will produce on
the screen a collection of identically illuminated points (of a different illumination
because $\Delta$ is different) arranged along a circle of another radius. The result will be
the appearance on the screen of a system of alternating bright and dark circular
fringes with a common centre at point O. Each fringe is formed by the rays falling
on the plate at the same angle 𝜃 1 . This is why interference fringes produced in such
conditions are known as fringes of equal inclination. When the lens is arranged
differently relative to the plate (the screen must coincide with the focal plane of the
lens in all cases), the fringes of equal inclination will have another shape.
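
The positions of the bright fringes of equal inclination follow from Eq. (17.39). The sketch below assumes the arrangement described above (lens parallel to the plate), so that rays inclined at the angle $\theta_1$ are gathered at the distance $r = f\tan\theta_1$ from the centre of the screen. The focal length, the plate thickness and the function name are hypothetical values chosen only for illustration.

```python
import math

def bright_ring_radii(b, n, wavelength, focal_length):
    """Orders m and radii r = f*tan(theta1) of the bright rings given by Eq. (17.39)."""
    radii = []
    # The left side of Eq. (17.39) runs from 2b*sqrt(n^2 - 1) (grazing) to 2b*n (normal incidence).
    m_min = math.ceil(2 * b * math.sqrt(n**2 - 1) / wavelength - 0.5)
    m_max = math.floor(2 * b * n / wavelength - 0.5)
    for m in range(m_max, m_min - 1, -1):            # the highest order lies nearest the centre
        s = ((m + 0.5) * wavelength / (2 * b)) ** 2  # equals n^2 - sin^2(theta1)
        sin2 = n**2 - s
        if not 0.0 <= sin2 < 1.0:                    # guard against rounding at the limits
            continue
        theta1 = math.asin(math.sqrt(sin2))
        radii.append((m, focal_length * math.tan(theta1)))
    return radii

# Hypothetical values: 0.02 mm plate, n = 1.5, 5000 Å light, 100 mm lens (all lengths in mm).
rings = bright_ring_radii(b=0.02, n=1.5, wavelength=5e-4, focal_length=100.0)
for m, r in rings[:5]:
    print(f"m = {m}: r = {r:.2f} mm")
```

The order of interference diminishes from the centre of the pattern outward, since the left-hand side of Eq. (17.39) is greatest at normal incidence.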
Every point of an interference pattern is due to rays which formed a parallel
beam before passing through the lens. Hence, in observing fringes of equal incli-
nation, the screen must be placed in the focal plane of the lens, i.e., in the same
way in which it is arranged to produce an image of infinitely remote objects on it.
Accordingly, fringes of equal inclination are said to be localized at infinity. The part
of the lens can be played by the crystalline lens, and that of the screen by the retina
of the eye. In this case for observing fringes of equal inclination, the eye must be
accommodated as when looking at very remote objects.
According to Eq. (17.39), the position of the maxima depends on the wavelength
𝜆0 . Therefore, in white light, we get a collection of fringes displaced relative to one
another and formed by rays of different colours; the interference pattern acquires
the colouring of a rainbow. The possibility of observing an interference pattern
in white light is determined by the ability of the eye to distinguish light tints of
close wavelengths. The average human eye perceives rays differing in wavelength
by less than 20 Å as having the same colour. Therefore, to assess the conditions in
which interference from plates can be observed in white light, we must assume that
𝛥𝜆0 equals 20 Å. We took exactly this value in assessing the thickness of a plate [see
Eq. (17.36)].
2. Plate of Varying Thickness. Let us take a plate in the form of a wedge with
an apex angle of $\varphi$ (Fig. 17.13). Assume that a parallel beam of rays falls on it. Now
the rays reflected from different surfaces of the plate will not be parallel. Two rays
that practically merge before falling on the plate (in Fig. 17.13 they are depicted in the
form of a single straight line designated by the figure 1′) intersect after reflection at
point Q′. The two rays 1″ practically merging intersect at point Q″ after reflection.
It can be shown that points Q′, Q″ and other points similar to them lie in one plane
passing through apex O of the wedge. Ray 1′ reflected from the bottom surface of
the wedge and ray 2′ reflected from its top surface will intersect at point R′ that is
closer to the wedge than Q′. Similar rays 1′ and 3′ will intersect at point P′ that is
farther from the wedge surface than Q′.

Fig. 17.13

The directions of propagation of the waves reflected from the top and bottom
surfaces of the wedge do not coincide. Temporal coherence will be observed only
for the parts of the waves reflected from places of the wedge for which the thickness
satisfies condition (17.35). Assume that this condition is observed for the entire wedge.
In addition, assume that the coherence radius is much greater than the wedge length.
Hence, the reflected waves will be coherent in the entire space over the wedge, and
no matter at what distance from the wedge the screen is, an interference pattern
will be observed on it in the form of fringes parallel to the wedge apex O (see the last
three paragraphs of Sec. 17.1). This, particularly, is how matters are when a wedge is
illuminated by light emitted by a laser.
With restricted spatial coherence, the region of localization of the interference
pattern (i.e., the region of space in which an interference pattern can be seen on a
screen placed in it) will be restricted too. If we arrange a screen so that it passes
through points Q′, Q″, . . . (see screen Sc in Fig. 17.13), an interference pattern will
appear on it even if the spatial coherence of the falling wave is extremely small
(rays that coincided before falling on the wedge will intersect at points on the
screen). At a small wedge angle $\varphi$, the path difference of the rays can be calculated
with sufficient accuracy by Eq. (17.34) taking as $b$ the thickness of the plate at the
place where the rays fall on it. Since the path difference for the rays reflected from
different sections of the wedge is now different, the illumination of the screen will
be non-uniform: bright and dark fringes will appear on it (see the dashed curve
showing the illumination of screen Sc in Fig. 17.13). Each of these fringes is produced
as a result of reflection from sections of the wedge having the same thickness. This
is why they are known as fringes of equal thickness.

Fig. 17.14

Upon displacement of the screen from position Sc in a direction away from
the wedge or toward it, the degree of spatial coherence of the incident wave begins
to tell. If in the position of the screen denoted in Fig. 17.13 by Sc′, the distance $\rho'$
between the incident rays 1′ and 2′ becomes of the order of the coherence radius, no
interference pattern will be observed on screen Sc′. Similarly, the pattern vanishes
when the screen is at position Sc″.
Thus, the interference pattern produced when a plane wave is reflected from a
wedge is localized in a certain region near the surface of the wedge. This region
becomes narrower when the degree of spatial coherence of the incident wave
diminishes. Inspection of Fig. 17.13 shows that the conditions for both temporal
and spatial coherence become more favourable nearer to the apex of the wedge.
Therefore, the distinctness of the interference pattern diminishes when moving
from the apex of the wedge to its base. A pattern may be observed only for the
thinner part of the wedge. For its remaining part, the screen will be uniformly
illuminated.
Practically, fringes of equal thickness are observed by placing a lens near a
wedge, and a screen behind the lens (Fig. 17.14). The part of the lens can be played by
the crystalline lens, and of the screen by the retina of the eye. If the screen behind
the lens is in a plane conjugated with the plane designated by Sc in Fig. 17.13 (the
eye is accordingly accommodated to this plane), the pattern will be most distinct.
When the screen onto which the image is projected is moved (or when the lens is
moved), the pattern will become less distinct and will vanish completely if the plane
conjugated with the screen passes beyond the limits of the region of localization of
the interference pattern observed without a lens.
When observed in white light, the fringes will be coloured, so that the surface
of a plate or film will have rainbow colouring. For example, thin films of oil on the
surface of water and soap films have such colouring. The temper colours appearing
on the surface of steel articles when they are hardened are also due to interference
from a film of transparent oxides.
Let us compare the two cases of interference upon reflection from thin films
which we have considered. Fringes of equal inclination are obtained when a plate
of constant thickness (𝑏 = constant) is illuminated by diffuse light containing rays
of various directions (𝜃 1 is varied within more or less broad limits). Fringes of
equal inclination are localized at infinity. Fringes of equal thickness are observed
when a plate of varying thickness (𝑏 varies) is illuminated by a parallel beam of
light (𝜃 1 = constant). Fringes of equal thickness are localized near the plate. In real
conditions, for example, when observing rainbow colours on a soap or oil film, both
the angle of incidence of the rays and the thickness of the film are varied. In this
case, fringes of a mixed type are observed.
We must note that interference from thin films can be observed not only in
reflected, but also in transmitted light.
Newton’s Rings. Newton’s rings are a classical example of fringes of equal thickness.
They are observed when light is reflected from a thick plane-parallel
glass plate in contact with a plano-convex lens having a large radius of curvature
(Fig. 17.15). The part of a thin film from whose surfaces coherent waves are reflected
is played by the air gap between the plate and the lens (owing to the great thickness
of the plate and the lens, no interference fringes appear as a result of reflections
from other surfaces). With normal incidence of the light, fringes of equal thickness
have the form of concentric rings, and with inclined incidence, of ellipses. Let us
find the radii of Newton’s rings produced when light falls along a normal to the
plate. In this case, sin 𝜃 1 = 0, and the optical path difference equals the double
thickness of the gap [see Eq. (17.33), it is assumed that 𝑛 = 1 in the gap]. It follows
from Fig. 17.15 that
$$R^2 = (R - b)^2 + r^2 \approx R^2 - 2Rb + r^2, \tag{17.40}$$
where 𝑅 is the radius of curvature of the lens, 𝑟 is the radius of a circle with the
identical gap 𝑏 corresponding to all of its points.

Fig. 17.15

Owing to the smallness of $b$, in expression (17.40) we have disregarded the
quantity $b^2$ in comparison with $2Rb$. In accordance with expression (17.40), $b = r^2/(2R)$.
To take account of the change in the phase by $\pi$ occurring upon reflection
from the plate, we must add $\lambda_0/2$ to $2b = r^2/R$. The result is
$$\Delta = \frac{r^2}{R} + \frac{\lambda_0}{2}. \tag{17.41}$$
At points for which $\Delta = m'\lambda_0 = 2m'(\lambda_0/2)$, maxima appear, and at points for
which $\Delta = (m' + 1/2)\lambda_0 = (2m' + 1)(\lambda_0/2)$, minima of the intensity appear. Both
conditions can be combined into the single one
$$\Delta = m\frac{\lambda_0}{2},$$
maxima corresponding to even values of $m$, and minima of the intensity, to odd
values. Introducing into this expression Eq. (17.41) for $\Delta$ and solving the resulting
equation relative to $r$, we find the radii of bright and dark Newton’s rings:
$$r = \left[\frac{R\lambda_0(m - 1)}{2}\right]^{1/2} \quad (m = 1, 2, 3, \ldots). \tag{17.42}$$
Radii of bright rings correspond to even 𝑚’s, and radii of dark rings to odd ones.
The value 𝑟 = 0 corresponds to 𝑚 = 1, i.e., to the point at the place of contact of the
plate and the lens. A minimum of intensity is observed at this point. It is due to the
change in the phase by 𝜋 when a light wave is reflected from the plate.
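
Equation (17.42) is easily evaluated for a concrete lens. The sketch below is an illustration only; the radius of curvature of 1 m and the wavelength of 5000 Å are hypothetical values, and the function name is ours.

```python
import math

def newton_ring_radius(m, R, wavelength):
    """Radius of the ring with number m from Eq. (17.42); even m is bright, odd m is dark."""
    return math.sqrt(R * wavelength * (m - 1) / 2.0)

# Hypothetical values: radius of curvature 1 m, wavelength 5000 Å (both in metres).
R, lam = 1.0, 5.0e-7
for m in range(1, 7):
    kind = "bright" if m % 2 == 0 else "dark"
    print(f"m = {m} ({kind}): r = {newton_ring_radius(m, R, lam) * 1e3:.3f} mm")
```

Since $r$ grows as the square root of $m - 1$, the rings crowd together with increasing distance from the centre.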
Coating of Lenses. The coating of lenses is based on the interference of light
when reflected from thin films. The transmission of light through each refracting
surface of a lens is attended by the reflection of about four per cent of the incident
light. In multicomponent lenses, such reflections occur many times, and the total
loss of the light flux reaches an appreciable value. In addition, the reflections from
the lens surfaces result in the appearance of highlights. The reflection of light is
eliminated by applying a thin film of a substance having a refractive index other
than that of the lens to each free surface of the latter. The components obtained in
this way are called coated lenses. The thickness of the coating is chosen so that the
waves reflected from both its surfaces interfere destructively. An especially good
result is obtained if the refractive index of the film equals the square root of the
refractive index of the lens. When this condition is satisfied, the intensity of both
waves reflected from the film surfaces is the same.

Fig. 17.16
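
The statement about the square root of the refractive index can be checked with the formula for the reflectance of a boundary at normal incidence, $[(n_a - n_b)/(n_a + n_b)]^2$. This formula is not derived in the present section and is taken here as known. The quarter-wave thickness used below is likewise an assumption (the simplest thickness giving destructive interference at normal incidence), and the wavelength of 550 nm is a hypothetical design value.

```python
import math

def interface_reflectance(n_a, n_b):
    """Normal-incidence intensity reflectance of a single boundary between indices n_a and n_b."""
    return ((n_a - n_b) / (n_a + n_b)) ** 2

def coating_for(n_lens, wavelength, n_outside=1.0):
    """Film index sqrt(n_outside * n_lens) and an assumed quarter-wave film thickness."""
    n_film = math.sqrt(n_outside * n_lens)
    thickness = wavelength / (4.0 * n_film)  # optical path 2 * n_film * d = wavelength / 2
    return n_film, thickness

n_film, d = coating_for(n_lens=1.5, wavelength=550e-9)
r_top = interface_reflectance(1.0, n_film)     # air to film
r_bottom = interface_reflectance(n_film, 1.5)  # film to glass
print(f"n_film = {n_film:.3f}, d = {d * 1e9:.1f} nm")
print(f"R(air/film) = {r_top:.4f}, R(film/glass) = {r_bottom:.4f}")
print(f"uncoated glass: R = {interface_reflectance(1.0, 1.5):.3f}")
```

The two reflectances coincide, so that the two reflected waves can cancel completely, while the uncoated surface reflects about four per cent of the light, as stated above.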

17.5. The Michelson Interferometer

Many varieties of interference instruments called interferometers are in use.


Figure 17.16 is a schematic view of a Michelson interferometer⁹. A light beam
from source S falls on semitransparent plate P1 coated with a thin layer of silver
(this layer is depicted by dots in the figure). Half of the incident light flux is reflected
by plate P1 in the direction of ray 1 and half passes through the plate and propagates
in the direction of ray 2. Beam 1 is reflected from mirror M1 and returns to P1,
where it is split into two beams of equal intensity. One of them passes through the
plate and forms beam 1′, and the second one is reflected in the direction of S. The
latter beam will no longer interest us. Beam 2 after being reflected by mirror M2
also returns to plate P1 where it is divided into two parts: beam 2′ reflected from
the semitransparent layer, and the beam transmitted through the layer, which will
also no longer interest us. Light beams 1′ and 2′ have the same intensity.
If conditions of temporal and spatial coherence are observed, beams 1′ and 2′
will interfere. The result of this interference depends on the optical path difference
from plate P1 to mirrors M1 and M2, and back. Ray 2 passes through the plate three
times, and ray 1 only once. To compensate the resulting change in the optical path
difference (owing to dispersion) for waves of different lengths, plate P2 is placed
in the path of ray 1. Plates P1 and P2 are identical, except for the silver coating on
the former. This arrangement makes the paths of rays 1 and 2 in glass equal. The
interference pattern is observed with the aid of telescope T.

⁹Named after its inventor, the American physicist Albert Michelson (1852-1931).

Let us mentally replace mirror M2 with its virtual image M2′ in semitransparent
plate P1. Beams 1′ and 2′ can thus be considered as due to reflection from a
transparent plate contained between planes M1 and M2′. We can use adjusting
screws W1 to change the angle between these planes; in particular, they can be
arranged strictly parallel to each other. By rotating micrometric screw W2, we can
smoothly move mirror M1 without changing its inclination. We can thus change
the thickness of the “plate”; in particular, we can make planes M1 and M2′ intersect
(Fig. 17.16b).
The nature of the interference pattern depends on the adjustment of the mirrors
and on the divergence of the beam of light falling on the instrument. If the beam is
parallel, and planes M1 and M2′ make an angle other than zero, then straight fringes
of equal thickness parallel to the line of intersection of planes M1 and M2′ will
be observed in the field of vision of the telescope. In white light, all the fringes
except the one coinciding with the line of intersection of the planes (the zero-order
fringe) will be coloured. The zero-order fringe will be black because beam 1 is reflected
from plate P1 from the outside, and beam 2 from the inside. As a result, a phase
difference equal to $\pi$ is produced between them. In white light, fringes are observed
only with a small thickness of “plate” M1M2′ [see Eq. (17.36)]. In monochromatic
light corresponding to the red line of cadmium, Michelson observed a distinct
interference pattern at a path difference of the order of 500000 wavelengths (the
distance between M1 and M2′ in this case is about 150 mm).
With a slightly diverging beam of light and a strictly parallel arrangement of
planes M1 and M2′, fringes of equal inclination are obtained that have the form of
concentric rings. When micrometric screw W2 is rotated, the diameter of the rings
grows or diminishes. Either new rings appear at the centre of the pattern, or the
diminishing rings shrink to a point and then vanish. Displacement of the pattern
by one fringe corresponds to movement of mirror M1 through half a wavelength.
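
The relation between the travel of the mirror and the number of fringes passing through the field of vision lends itself to a simple calculation. The sketch below is an illustration only; the function names are ours, and the wavelength of the red cadmium line (643.8 nm) is a value assumed here, since it is not quoted in the text.

```python
def mirror_displacement(fringe_count, wavelength):
    """Mirror travel that shifts the pattern by fringe_count fringes (half a wavelength each)."""
    return fringe_count * wavelength / 2.0

def wavelength_from_count(displacement, fringe_count):
    """Inverse use: the wavelength found from a measured travel and a fringe count."""
    return 2.0 * displacement / fringe_count

# Red line of cadmium; 643.8 nm is an assumed value, not one given in the text.
lam_cd = 643.8e-9
d = mirror_displacement(500_000, lam_cd)  # path difference of 500000 wavelengths
print(f"distance between the mirror planes = {d * 1e3:.0f} mm")
```

The result is of the same order as the distance of about 150 mm mentioned above for a path difference of 500000 wavelengths.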
Michelson used the instrument described above to carry out several experiments
that entered the annals of physics. The most famous of them, performed
together with the American chemist Edward Morley (1838-1923) in 1887, had the aim
of detecting motion of the Earth relative to the hypothetical ether (we shall treat this
experiment in Sec. 21.3). In 1890-1895, Michelson used the interferometer he had
invented to make the first comparison of the wavelength of the red line of cadmium
with the length of the standard metre.

Fig. 17.17

In 1920, Michelson constructed a stellar interferometer which he used to
measure the angular dimensions of stars. This instrument was mounted on a
telescope. A screen with two slits was installed in front of the objective of the
telescope (Fig. 17.17). The light from a star was reflected from a symmetrical system
of mirrors M1 , M2 , M3 and M4 , installed on a rigid frame fastened on a carriage.
The inner mirrors M3 and M4 , were fixed, and the outer ones M1 and M2 , could
move symmetrically away from or toward mirrors M3 and M4 . The path of the rays
is clear from the figure. Interference fringes were produced in the focal plane of the
telescope objective. Their visibility¹⁰ depended on the distance between the outer
mirrors. By moving these mirrors, Michelson determined the distance 𝑙 between
them at which the visibility of the fringes vanishes. This distance must be of the
order of the coherence radius of a light wave arriving from a star. According to
expression (17.26), the coherence radius is $\rho_{\text{coh}} \sim \lambda/\varphi$. The condition $l = \lambda/\varphi$ gives the
angular diameter of a star
$$\varphi = \frac{\lambda}{l}.$$
Accurate calculations give the formula
$$\varphi = A\frac{\lambda}{l},$$
where $A = 1.22$ for a source in the form of a uniformly illuminated disk. If the
disk is darker at its edges than at the centre, the coefficient exceeds 1.22, its value
depending on the rate of diminishing of the illumination in the direction from the
centre toward the edge. In addition, accurate calculations show that after vanishing
at a certain value of $l$, the visibility upon a further increase in $l$ again becomes other
than zero; however, the values it reaches are not great.

¹⁰The visibility of a fringe is defined as the quantity $V = (I_{\max} - I_{\min})/(I_{\max} + I_{\min})$, where $I_{\max}$
and $I_{\min}$ are the maximum and minimum intensities of the light in the vicinity of the given fringe,
respectively.

The maximum distance between the outer mirrors in the stellar interferometer
constructed by Michelson was 6.1 m (the diameter of the telescope was 2.5 m). A
minimum measurable angular diameter of about 0.02″ corresponded to this distance.
The first star whose angular diameter was measured was Betelgeuse (Alpha Orionis).
The value of $\varphi$ obtained for it was 0.047″.
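
The figures quoted for the stellar interferometer can be checked with the formula $\varphi = A\lambda/l$. The sketch below is only an estimate: the mean wavelength of 550 nm is our assumption (the text does not give the wavelength used by Michelson), and the function names are introduced for illustration.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def angular_diameter_arcsec(wavelength, mirror_separation, A=1.22):
    """phi = A * lambda / l, converted from radians to seconds of arc."""
    return A * wavelength / mirror_separation * ARCSEC_PER_RAD

def separation_for(phi_arcsec, wavelength, A=1.22):
    """Mirror separation l at which the fringes from a disk of angular diameter phi vanish."""
    return A * wavelength / (phi_arcsec / ARCSEC_PER_RAD)

lam = 550e-9  # assumed mean wavelength of the starlight, in metres
print(f"{angular_diameter_arcsec(lam, 6.1):.3f} arcsec at l = 6.1 m")
print(f"l = {separation_for(0.047, lam):.2f} m for a disk of 0.047 arcsec")
```

With these assumptions, the greatest separation of 6.1 m corresponds to an angular diameter of the order of 0.02″, in agreement with the value given above.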

17.6. Multibeam Interference

Up to now, we have dealt with two-beam interference. Now let us investigate the
interference of many light rays.
Assume that 𝑁 rays of the same intensity arrive at a given point of a screen, the
phase of each following ray being shifted relative to that of the preceding one by
the same value $\delta$. Let us represent the oscillations set up by the rays in
exponential form:
$$E_1 = ae^{i\omega t},\quad E_2 = ae^{i(\omega t+\delta)},\ \ldots,\ E_m = ae^{i[\omega t+(m-1)\delta]},\ \ldots,\ E_N = ae^{i[\omega t+(N-1)\delta]},$$
where $a$ is the amplitude of an oscillation. The resultant oscillation is determined
by the formula
$$E = \sum_{m=1}^{N} E_m = ae^{i\omega t}\sum_{m=1}^{N} e^{i(m-1)\delta}.$$
The expression obtained is the sum of $N$ terms of a geometrical progression with
its first term equal to unity and its common ratio equal to $e^{i\delta}$. Hence,
$$E = ae^{i\omega t}\,\frac{1 - e^{iN\delta}}{1 - e^{i\delta}} = \hat{A}e^{i\omega t},$$
where
$$\hat{A} = a\,\frac{1 - e^{iN\delta}}{1 - e^{i\delta}} \tag{17.43}$$
is the complex amplitude that can be represented in the form
$$\hat{A} = Ae^{i\alpha} \tag{17.44}$$
($A$ is the usual amplitude of the resultant oscillation, and $\alpha$ is its initial phase).
The product of quantity (17.44) and its complex conjugate gives the square of
the amplitude of the resultant oscillation:
$$\hat{A}\hat{A}^* = Ae^{i\alpha}Ae^{-i\alpha} = A^2. \tag{17.45}$$
Substituting for $\hat{A}$ in Eq. (17.45) its value from Eq. (17.43), we get the following
expression for the square of the amplitude:
$$\begin{aligned}
A^2 = \hat{A}\hat{A}^* &= a^2\,\frac{(1 - e^{iN\delta})(1 - e^{-iN\delta})}{(1 - e^{i\delta})(1 - e^{-i\delta})} = a^2\,\frac{2 - e^{iN\delta} - e^{-iN\delta}}{2 - e^{i\delta} - e^{-i\delta}} \\
&= a^2\,\frac{1 - \cos(N\delta)}{1 - \cos\delta} = a^2\,\frac{\sin^2(N\delta/2)}{\sin^2(\delta/2)}.
\end{aligned} \tag{17.46}$$
The intensity is proportional to the square of the amplitude. Hence, the intensity
produced upon the interference of the $N$ rays being considered is determined by
the expression
$$I(\delta) = Ka^2\,\frac{\sin^2(N\delta/2)}{\sin^2(\delta/2)} = I_0\,\frac{\sin^2(N\delta/2)}{\sin^2(\delta/2)} \tag{17.47}$$
(𝐾 is a constant of proportionality, 𝐼0 = 𝐾𝑎2 is the intensity produced by each of
the rays separately).
At the values
$$\delta = 2\pi m \quad (m = 0, \pm 1, \pm 2, \ldots), \tag{17.48}$$
Eq. (17.47) becomes indeterminate. For this reason, we apply L’Hospital’s rule:
$$\lim_{\delta\to 2\pi m}\frac{\sin^2(N\delta/2)}{\sin^2(\delta/2)} = \lim_{\delta\to 2\pi m}\frac{2\sin(N\delta/2)\cos(N\delta/2)\,(N/2)}{2\sin(\delta/2)\cos(\delta/2)\,(1/2)} = \lim_{\delta\to 2\pi m} N\,\frac{\sin(N\delta)}{\sin\delta}.$$
The expression obtained is also indeterminate. For this reason, we apply L’Hospital’s
rule again:
$$\lim_{\delta\to 2\pi m}\frac{\sin^2(N\delta/2)}{\sin^2(\delta/2)} = \lim_{\delta\to 2\pi m} N\,\frac{\sin(N\delta)}{\sin\delta} = \lim_{\delta\to 2\pi m} N\,\frac{N\cos(N\delta)}{\cos\delta} = N^2.$$
Thus, when $\delta = 2\pi m$ (or when the path differences $\Delta = m\lambda_0$), the resultant
intensity is
$$I = I_0N^2. \tag{17.49}$$
This result could have been predicted. Indeed, all the oscillations arrive at points for
which 𝛿 = 2𝜋𝑚 in the same phase. Hence, the resultant amplitude is 𝑁 times the
amplitude of a separate oscillation, and the intensity is 𝑁 2 times that of a separate
oscillation.
Let us call the spots where the intensity determined by Eq. (17.49) is observed the
principal maxima. Their position is determined by condition (17.48). The number
𝑚 is called the order of the principal maximum. It can be seen from Eq. (17.47) that
the space between two adjacent principal maxima accommodates 𝑁 − 1 minima
of the intensity. To verify this statement, let us consider, for example, the interval
between the maxima of the zero ($m = 0$) and of the first ($m = 1$) order. In this
interval, $\delta$ changes from zero to $2\pi$, and $\delta/2$ from zero to $\pi$. The denominator
of Eq. (17.47) is other than zero everywhere except for the ends of the interval. It
reaches its maximum value equal to unity at the middle of the interval. The quantity
$N\delta/2$ takes on all the values from zero to $N\pi$ within the interval being considered.
At values of $\pi, 2\pi, \ldots, (N - 1)\pi$, the numerator of Eq. (17.47) becomes equal to zero.
Here, we have minima of the intensity. Their positions correspond to values of $\delta$
equal to
$$\delta = \frac{k'}{N}\,2\pi \quad (k' = 1, 2, \ldots, N - 1). \tag{17.50}$$

Fig. 17.18

There are $N - 2$ secondary maxima in the intervals between the $N - 1$ minima.
The secondary maxima closest to the principal maxima have the greatest intensity.
The secondary maximum closest to the principal zero-order maximum is between
the first ($k' = 1$) and second ($k' = 2$) minima. Values of $\delta$ equal to $2\pi/N$ and $4\pi/N$
correspond to these minima. Hence, $\delta = 3\pi/N$ corresponds to the secondary
maximum being considered. Introduction of this value into Eq. (17.47) yields
$$I(3\pi/N) = Ka^2\,\frac{\sin^2(3\pi/2)}{\sin^2(3\pi/2N)}.$$
The numerator equals unity. At a great value of $N$, we may assume that the sine in
the denominator equals its argument [$\sin(3\pi/2N) \approx 3\pi/2N$]. Hence,
$$I(3\pi/N) \approx Ka^2\,\frac{1}{(3\pi/2N)^2} = \frac{Ka^2N^2}{(3\pi/2)^2}.$$
The quantity in the numerator is the intensity of the principal maximum [see
Eq. (17.49)]. Thus, at a great value of $N$, the secondary maximum closest to the
principal maximum has an intensity that is $1/(3\pi/2)^2 \approx 1/22$ of the intensity of
the principal maximum. The other secondary maxima are still weaker.
Figure 17.18 shows a plot of the function $I(\delta)$ for $N = 10$. For comparison, a plot
of the intensity for $N = 2$ [two-beam interference; see the curve $I(x)$ in Fig. 17.2]
is shown by a dashed line. Inspection of the figure shows that the principal maxima
become narrower and narrower with an increase in the number of interfering rays.
The secondary maxima are so weak that the interference pattern practically has the
form of narrow bright lines on a dark background.
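
The properties of function (17.47) established above can be verified numerically. The sketch below is an illustration only; the function names are ours, and the amplitude $a$ and the constant $K$ are set equal to unity. The closed expression is compared with the direct summation of the $N$ oscillations.

```python
import cmath
import math

def multibeam_intensity(delta, N, I0=1.0):
    """Eq. (17.47); at delta = 2*pi*m the limit I0 * N**2 of Eq. (17.49) is returned."""
    s = math.sin(delta / 2.0)
    if abs(s) < 1e-12:
        return I0 * N**2
    return I0 * (math.sin(N * delta / 2.0) / s) ** 2

def direct_sum_intensity(delta, N, a=1.0):
    """Square of the modulus of the sum of N oscillations with successive phase shifts delta."""
    return abs(sum(a * cmath.exp(1j * k * delta) for k in range(N))) ** 2

def minima_positions(N):
    """Phase shifts of the N - 1 minima between the orders m = 0 and m = 1, Eq. (17.50)."""
    return [2.0 * math.pi * k / N for k in range(1, N)]

N = 10
print(multibeam_intensity(0.0, N))  # principal maximum
print(max(multibeam_intensity(d, N) for d in minima_positions(N)))  # practically zero
ratio = multibeam_intensity(3 * math.pi / N, N) / multibeam_intensity(0.0, N)
print(f"first secondary maximum / principal maximum = {ratio:.4f}")
print(multibeam_intensity(1.0, N), direct_sum_intensity(1.0, N))
```

For $N = 10$, the ratio is somewhat greater than the limiting value of 1/22, which relates to great values of $N$.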
Now, let us consider the interference of a very great number of rays whose
intensity diminishes in a geometrical progression. The oscillations being added
have the form
$$E_1 = ae^{i\omega t},\quad E_2 = a\rho e^{i(\omega t+\delta)},\ \ldots,\ E_m = a\rho^{m-1}e^{i[\omega t+(m-1)\delta]},\ \ldots \tag{17.51}$$
($\rho$ is a constant quantity less than unity). The resultant oscillation is described by
the equation
$$E = \sum_{m=1}^{N} E_m = ae^{i\omega t}\sum_{m=1}^{N}\rho^{m-1}e^{i(m-1)\delta}.$$
Using the expression for the sum of the terms of a geometrical progression, we get
$$E = ae^{i\omega t}\,\frac{1 - \rho^N e^{iN\delta}}{1 - \rho e^{i\delta}} = \hat{A}e^{i\omega t}.$$
Thus, the complex amplitude is
$$\hat{A} = a\,\frac{1 - \rho^N e^{iN\delta}}{1 - \rho e^{i\delta}}. \tag{17.52}$$
If $N$ is very great, the complex number $\rho^N e^{iN\delta}$ may be disregarded in comparison
with unity (we shall indicate as an example that $0.9^{100} \approx 3 \times 10^{-5}$). Equation
(17.52) is thus simplified as follows:
$$\hat{A} = \frac{a}{1 - \rho e^{i\delta}}.$$
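Multiplying this equation by its complex conjugate, we get the square of the ordinary
amplitude of the resultant oscillation:
$$\begin{aligned}
A^2 = \hat{A}\hat{A}^* &= \frac{a^2}{(1 - \rho e^{i\delta})(1 - \rho e^{-i\delta})} = \frac{a^2}{1 + \rho^2 - \rho(e^{i\delta} + e^{-i\delta})} \\
&= \frac{a^2}{1 + \rho^2 - 2\rho\cos\delta} = \frac{a^2}{(1 - \rho)^2 + 2\rho(1 - \cos\delta)} \\
&= \frac{a^2}{(1 - \rho)^2 + 4\rho\sin^2(\delta/2)}.
\end{aligned}$$
Hence,
$$I(\delta) = \frac{Ka^2}{(1 - \rho)^2 + 4\rho\sin^2(\delta/2)} = \frac{I_1}{(1 - \rho)^2 + 4\rho\sin^2(\delta/2)}, \tag{17.53}$$
where $I_1 = Ka^2$ is the intensity of the first (most intense) ray.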


Fig. 17.19

At values of
$$\delta = 2\pi m \quad (m = 0, \pm 1, \pm 2, \ldots), \tag{17.54}$$
Eq. (17.53) has maxima equal to
$$I_{\max} = \frac{I_1}{(1 - \rho)^2}. \tag{17.55}$$
In the intervals between maxima, the function changes monotonically on either side of
the middle of the interval, where it reaches its minimum value equal to
$$I_{\min} = \frac{I_1}{(1 - \rho)^2 + 4\rho} = \frac{I_1}{(1 + \rho)^2}. \tag{17.56}$$
Thus, the ratio of the intensity at a maximum to that
at a minimum
$$\frac{I_{\max}}{I_{\min}} = \left(\frac{1 + \rho}{1 - \rho}\right)^2 \tag{17.57}$$
is the greater, the closer 𝜌 is to unity, i.e., the slower is the rate of diminishing of the
intensity of the interfering rays. Figure 17.19 shows a graph of function (17.53) for
𝜌 = 0.8. It can be seen from the figure that the interference pattern has the form
of narrow sharp lines on a virtually dark background. Unlike Fig. 17.18, secondary
maxima are absent.
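
Expression (17.53) and the ratio (17.57) can be checked against the direct summation of oscillations (17.51). The sketch below is an illustration only; the function names are ours, the constant $K$ and the amplitude $a$ are set equal to unity, and the number of terms (200) is an arbitrary choice large enough for $\rho^N$ to be negligible.

```python
import cmath
import math

def decaying_beams_intensity(delta, rho, I1=1.0):
    """Closed form of Eq. (17.53) for amplitudes falling off as rho**(m - 1)."""
    return I1 / ((1.0 - rho) ** 2 + 4.0 * rho * math.sin(delta / 2.0) ** 2)

def summed_intensity(delta, rho, n_terms, a=1.0):
    """Direct summation of the first n_terms oscillations of Eq. (17.51), with K = 1."""
    total = sum(a * rho**k * cmath.exp(1j * k * delta) for k in range(n_terms))
    return abs(total) ** 2

rho = 0.8
i_max = decaying_beams_intensity(0.0, rho)      # Eq. (17.55)
i_min = decaying_beams_intensity(math.pi, rho)  # Eq. (17.56)
print(f"I_max = {i_max:.3f}, I_min = {i_min:.4f}, ratio = {i_max / i_min:.1f}")
print(summed_intensity(1.0, rho, 200), decaying_beams_intensity(1.0, rho))
```

For $\rho = 0.8$, the ratio (17.57) equals $(1.8/0.2)^2 = 81$, which is why the background in Fig. 17.19 appears virtually dark.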
A practical case of a great number of rays with a diminishing intensity is encoun-
tered in the Fabry-Perot interferometer. This instrument consists of two glass or
quartz plates separated by an air gap (Fig. 17.20). The internal surfaces of the plates
are thoroughly polished so that the irregularities on them do not exceed several
hundredths of the length of a light wave. Next, partly transparent metal layers or
dielectric films¹¹ are applied to these surfaces. The outer surfaces of the plates are
at a slight angle relative to the inner ones to eliminate the highlights due to the
reflection of light from these surfaces. In the original design of the interferometer,
one of the plates could be moved relative to the other stationary one with the aid
of a micrometric screw. The unreliability of this design, however, led to its
falling out of use. In modern designs, the plates are secured rigidly. The parallelism
of the internal working planes is achieved by installing an invar or quartz ring¹²
between the plates. This ring has three projections with thoroughly polished edges
at each side. The plates are pressed against the ring by springs. This design reliably
ensures strict parallelism of the internal planes of the plates and constancy of the
distance between them. Such an interferometer with a fixed distance between its
plates is known as a Fabry-Perot etalon.

¹¹Metal layers have the shortcoming that they absorb light rays to a great extent. This is why recent
years have seen their replacement with multilayer dielectric coatings having a high reflectivity.

Fig. 17.20

Fig. 17.21

Let us see what happens to a ray entering the gap between the plates (Fig. 17.21).
Assume that the intensity of the entering ray is $I_0$. At point A1, this ray is divided
into ray 1 emerging outward and reflected ray 1′. If the coefficient of reflection
from the surface of the plate is $\rho$, then the intensity of ray 1 will be $I_1 = (1 - \rho)I_0$,
and the intensity of the reflected ray will be $I_1' = \rho I_0$¹³. At point B1, ray 1′ is
divided into two. The transmitted ray, shown by a dashed line, will drop out of consideration,
while reflected ray 1″ will have an intensity of $I_1'' = \rho I_1' = \rho^2 I_0$. At point A2, ray
1″ will be divided into two rays: ray 2 emerging outward having an intensity of
$I_2 = (1 - \rho)I_1'' = (1 - \rho)\rho^2 I_0$, and reflected ray 2′, and so on. Thus, the following
relation holds for the intensities of rays 1, 2, 3, etc. emerging from the instrument:
$$I_1 : I_2 : I_3 : \ldots = 1 : \rho^2 : \rho^4 : \ldots.$$
Accordingly, for the amplitudes of the oscillations we have
$$A_1 : A_2 : A_3 : \ldots = 1 : \rho : \rho^2 : \ldots$$
[compare with Eq. (17.51)].

¹²Both these materials are distinguished by their extremely low temperature coefficient of expansion.
¹³We disregard the absorption of light in the reflecting layers and inside the plates.

Fig. 17.22

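The oscillation in each of the rays 2, 3, 4, . . ., lags in phase behind the oscillation
in the preceding ray by the same amount $\delta$ determined by the optical path difference
$\Delta$ appearing on the path A1-B1-A2 or A2-B2-A3, etc. (see Fig. 17.21). A glance at the
figure shows that the length of this path is $2l/\cos\varphi$, where $l$ is the width of the gap and
$\varphi$ is the angle of incidence of the rays on the reflecting layers. While it is being
covered, the preceding ray travels the distance $2l\tan\varphi\sin\varphi$ beyond the plate, so that
$\Delta = 2l/\cos\varphi - 2l\tan\varphi\sin\varphi = 2l\cos\varphi$ [compare with the first term of Eq. (17.34) at $n = 1$].
If we gather rays 1, 2, 3, . . ., with the aid of a lens at point P of its focal plane
(see Fig. 17.20), then the oscillations produced by these rays will have the form given
by Eq. (17.51). Hence, the intensity at point P is determined by Eq. (17.53), in which $\rho$
has the meaning of the coefficient of reflection, and
$$\delta = \frac{2\pi}{\lambda}\,2l\cos\varphi.$$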
When a diverging beam of light is passed through the instrument, fringes of
equal inclination having the form of sharp rings (Fig. 17.22) will be produced in the
focal plane of the lens.
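The sharpness of the rings can be illustrated numerically. The following is a minimal sketch, assuming that ray number k + 1 has the relative amplitude 𝜌^k and lags in phase by k𝛿, in accordance with the proportions given above. The function names, the value 𝜌 = 0.9, and the number of rays summed are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def multibeam_intensity(rho, delta, n_rays=2000):
    """Relative intensity obtained by adding rays with amplitudes 1, rho, rho^2, ...,
    each lagging behind the preceding one in phase by delta."""
    total = 0j
    for k in range(n_rays):
        total += rho**k * cmath.exp(-1j * k * delta)
    return abs(total) ** 2

def closed_form(rho, delta):
    """Squared modulus of the sum of the infinite geometric series."""
    return 1.0 / (1.0 + rho**2 - 2.0 * rho * math.cos(delta))

rho = 0.9                                   # assumed sample value
peak = multibeam_intensity(rho, 0.0)        # rays in phase
off_peak = multibeam_intensity(rho, 0.2)    # phase lag of 0.2 rad between rays
print(peak, off_peak, closed_form(rho, 0.2))
```

With these sample values, a phase lag of only 0.2 rad between adjacent rays already reduces the intensity to less than a quarter of its peak value, which is why the maxima are narrow.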
The Fabry-Perot interferometer is used in spectroscopy to study the fine struc-
ture of spectral lines. It has also come into great favour in metrology for comparing
the length of the standard metre with the wavelengths of individual spectral lines.

Chapter 18
DIFFRACTION OF LIGHT

18.1. Introduction

By diffraction is meant the combination of phenomena observed when light propagates
in a medium with sharp heterogeneities¹ and associated with deviations from
the laws of geometrical optics. Diffraction, in particular, leads to light waves bend-
ing around obstacles and to the penetration of light into the region of a geometrical
shadow. The bending of sound waves around obstacles (i.e., the diffraction of sound
waves) is constantly observed in our everyday life. To observe the diffraction of light
waves, special conditions must be set up. This is due to the smallness of the lengths
of light waves. We know that in the limit, when 𝜆 → 0, the laws of wave optics
transform into those of geometrical optics. Hence, other conditions being equal,
the deviations from the laws of geometrical optics decrease with a diminishing
wavelength.
There is no appreciable physical difference between interference and diffrac-
tion. Both phenomena consist in the redistribution of the light flux as a result of
superposition of the waves. For historical reasons, the redistribution of the intensity
produced as a result of the superposition of waves emitted by a finite number of
discrete coherent sources has been called the interference of waves. The redistribu-
tion of the intensity produced as a result of the superposition of waves emitted by
coherent sources arranged continuously has been called the diffraction of waves.
We, therefore, speak about the interference pattern from two narrow slits and about
the diffraction pattern from one slit.
Diffraction is usually observed by means of the following set-up. An opaque
barrier closing part of the wave surface of the light wave is placed in the path of a
light wave propagating from a certain source. A screen on which the diffraction

¹For example, near the boundaries of opaque or transparent bodies, through small holes, etc.

Fig. 18.1: Fraunhofer diffraction observed by placing a lens after light source S and another
one in front of point of observation P. Points S and P lie in the focal plane of each lens.

pattern appears is placed after the barrier.


Two kinds of diffraction are distinguished. If the light source S and the point of
observation P are so far from a barrier that the rays falling on the barrier and those
travelling to point P form virtually parallel beams, we are dealing with diffraction
in parallel rays, or Fraunhofer diffraction. Otherwise, we are dealing with
Fresnel diffraction. Fraunhofer diffraction can be observed by placing a lens after
light source S and another one in front of the point of observation P, so that points S
and P lie in the focal plane of the relevant lens (Fig. 18.1).
The criterion allowing us to determine the kind of diffraction we are dealing
with—Fresnel or Fraunhofer—in each specific case will be given in Sec. 18.5.

18.2. Huygens-Fresnel Principle

The penetration of light waves into the region of a geometrical shadow can be
explained with the aid of Huygens’ principle (see Sec. 16.9). This principle, however,
gives no information on the amplitude and, consequently, on the intensity of waves
propagating in different directions. The French physicist Augustin Fresnel (1788-
1827) supplemented Huygens’ principle with the concept of the interference of
secondary waves. Taking into account the amplitudes and phases of the secondary
waves makes it possible to find the amplitude of the resultant wave for any point of
space. Huygens’ principle developed in this way was named the Huygens-Fresnel
principle.
According to the Huygens-Fresnel principle, every element of wave surface 𝑆
(Fig. 18.2) is the source of a secondary spherical wave whose amplitude is propor-
tional to the size of element d𝑆. The amplitude of a spherical wave diminishes with
the distance 𝑟 from its source according to the law 1/𝑟 [see Eq. (14.12)]. Consequently,
the oscillation
$$dE = K\,\frac{a_0\,dS}{r}\,\cos(\omega t - kr + \alpha_0), \tag{18.1}$$

Fig. 18.2: Huygens-Fresnel principle: every element of wave surface 𝑆 is the source of a
secondary spherical wave whose amplitude is proportional to the size of element d𝑆.

arrives from each section d𝑆 of a wave surface at point P in front of this surface. In
Eq. (18.1), (𝜔𝑡 + 𝛼0) is the phase of the oscillation at the location of wave surface 𝑆, 𝑘 is the
wave number, and 𝑟 is the distance from surface element d𝑆 to point P. The factor 𝑎0
is determined by the amplitude of the light oscillation at the location of d𝑆. The
coefficient 𝐾 depends on the angle 𝜑 between a normal 𝒏̂ to area d𝑆 and the direction
from d𝑆 to point P. When 𝜑 = 0, this coefficient is maximum; when 𝜑 = 𝜋/2, it
vanishes.
The resultant oscillation at point P is the superposition of the oscillations given
by Eq. (18.1) taken for the entire wave surface 𝑆:
$$E = \int_S K(\varphi)\,\frac{a_0}{r}\,\cos(\omega t - kr + \alpha_0)\,dS. \tag{18.2}$$
This equation is an analytical expression of the Huygens-Fresnel principle.
The Huygens-Fresnel principle can be substantiated by the following reasoning.
Assume that thin opaque screen Sc (Fig. 18.3) is placed in the path of a light wave (we
shall consider it plane for simplicity’s sake). The intensity of the light everywhere
after the screen will be zero. The reason is that the light wave falling on the screen
produces oscillations of the electrons in the material of the screen. The oscillating
electrons emit electromagnetic waves. The field after the screen is a superposition of
the primary wave (falling on the screen) and all the secondary waves. The amplitudes
and phases of the secondary waves are such that upon superposition of these waves
with the primary one, a zero amplitude is obtained at any point P after the screen.
Consequently, if the primary wave produces the oscillation
𝐴prim cos(𝜔𝑡 + 𝛼)
at point P, then the resultant oscillation produced by the secondary waves at the
same point has the form
𝐴sec cos(𝜔𝑡 + 𝛼 − 𝜋).
Here, 𝐴sec = 𝐴prim .

Fig. 18.3: A light wave (primary wave) falling on a thin opaque screen Sc produces oscillations
of the electrons in the material of the screen. The oscillating electrons emit electromagnetic
waves (secondary wave). The amplitudes and phases of the secondary waves are such that,
upon superposition of these waves with the primary one, a zero amplitude is obtained at
any point P after the screen.

What has been said above signifies that when calculating the amplitude of an
oscillation set up at point P by a light wave propagating from a real source, we can
replace this source with a collection of secondary sources arranged along the wave
surface. This is exactly the essence of the Huygens-Fresnel principle.
Let us divide the opaque barrier into two parts. One of them, which we shall
call a stopper, has finite dimensions and an arbitrary shape (a circle, rectangle, etc.).
The other part includes the entire remaining surface of the infinite barrier. As long
as the stopper is in place, the resultant oscillation at point P after the barrier is zero.
It can be represented as the sum of the oscillations set up by the primary wave, the
wave produced by the stopper, and the wave produced by the remaining part of the
barrier:
$$A_{\text{prim}}\cos(\omega t + \alpha) + A_{\text{stop}}\cos(\omega t + \alpha') + A_{\text{bar}}\cos(\omega t + \alpha'') = 0. \tag{18.3}$$
If the stopper is removed, i.e., the wave is transmitted through the aperture in
the opaque barrier, then the oscillation at point P will have the form
$$E_P = A_{\text{prim}}\cos(\omega t + \alpha) + A_{\text{bar}}\cos(\omega t + \alpha'') = -A_{\text{stop}}\cos(\omega t + \alpha') = A_{\text{stop}}\cos(\omega t + \alpha' - \pi).$$
We have used condition (18.3) and assumed that removal of the stopper does not
change the nature of the oscillations of the electrons in the remaining part of the
barrier.
We can, thus, consider that the oscillations at point P are produced by a col-
lection of sources of secondary waves on the surface of the aperture formed after
removal of the stopper.
Fig. 18.4: Fresnel zones obtained by division of a wave surface, travelling from S to P, into
annular zones constructed so that the distances from the edges of each zone to point P
differ by 𝜆/2, where 𝜆 is the length of the wave in the medium of propagation.

18.3. Fresnel Zones

The performance of calculations by Eq. (18.2) is a very difficult task in the general
case. As Fresnel showed, however, the amplitude of the resultant oscillation can
be found by simple algebraic or geometrical summation in cases distinguished by
symmetry.
To understand the essence of the method developed by Fresnel, let us determine
the amplitude of the light oscillation set up at point P by a spherical wave propagating
in an isotropic homogeneous medium from point source S (Fig. 18.4). The wave
surfaces of such waves are symmetrical relative to straight line SP. Taking advantage
of this circumstance, let us divide the wave surface shown in the figure into annular
zones constructed so that the distances from the edges of each zone to point P differ
by 𝜆/2 (𝜆 is the length of the wave in the medium in which it is propagating). Zones
having this property are known as Fresnel zones.
A glance at Fig. 18.4 shows that the distance 𝑏𝑚 from the outer edge of the 𝑚-th
zone to point P is
$$b_m = b + m\,\frac{\lambda}{2} \tag{18.4}$$
(𝑏 is the distance from the crest 0 of the wave surface to point P).
The oscillations arriving at point P from similar points of two adjacent zones
(i.e., from points at the middle of the zones, or at the outer edges of the zones, etc.)
are in counterphase. Therefore, the resultant oscillations produced by each of the
zones as a whole will differ in phase for adjacent zones by 𝜋 too.
Let us calculate the areas of the zones. The outer boundary of the 𝑚-th zone
separates a spherical segment of height ℎ𝑚 on the wave surface (Fig. 18.5). Let the

Fig. 18.5: Area of a Fresnel zone. The outer boundary of the 𝑚-th zone separates a spherical
segment of height ℎ𝑚 on the wave surface. The area of this segment is 𝑆𝑚.

area of this segment be 𝑆𝑚 . Hence, the area of the 𝑚-th zone can be written as
𝛥𝑆𝑚 = 𝑆𝑚 − 𝑆𝑚−1 ,
where 𝑆𝑚−1 is the area of the spherical segment separated by the outer boundary of
the (𝑚 − 1)-th zone.
It can be seen from Fig. 18.5 that
$$r_m^2 = a^2 - (a - h_m)^2 = \left(b + m\,\frac{\lambda}{2}\right)^2 - (b + h_m)^2,$$
where 𝑎 is the radius of the wave surface and 𝑟𝑚 is the radius of the outer boundary
of the 𝑚-th zone.
Squaring the terms in parentheses, we get
$$r_m^2 = 2ah_m - h_m^2 = bm\lambda + m^2\left(\frac{\lambda}{2}\right)^2 - 2bh_m - h_m^2, \tag{18.5}$$
whence,
$$h_m = \frac{bm\lambda + m^2(\lambda/2)^2}{2(a + b)}. \tag{18.6}$$
Restricting ourselves to a consideration of not too great 𝑚’s, we may disregard the
addend containing 𝜆2 owing to the smallness of 𝜆. In this approximation
$$h_m = \frac{bm\lambda}{2(a + b)}. \tag{18.7}$$
The area of a spherical segment is 𝑆 = 2𝜋𝑅ℎ (here, 𝑅 is the radius of the sphere
and ℎ is the height of the segment). Hence,
$$S_m = 2\pi a h_m = \frac{\pi ab}{a + b}\,m\lambda,$$
and the area of the 𝑚-th zone is
$$\Delta S_m = S_m - S_{m-1} = \frac{\pi ab\lambda}{a + b}.$$
The expression we have obtained does not depend on 𝑚. This signifies that when 𝑚
is not too great, the areas of the Fresnel zones are approximately identical.
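How good the approximation is can be checked by keeping the term in 𝜆² that was discarded. The sketch below, assuming the sample values 𝑎 = 𝑏 = 1 m and 𝜆 = 0.5 µm, computes the zone areas from the exact segment height of Eq. (18.6) and compares them with the approximate value 𝜋𝑎𝑏𝜆/(𝑎 + 𝑏). The function names are illustrative.

```python
import math

def segment_height(m, a, b, lam):
    """Height h_m of the spherical segment, Eq. (18.6), with the lambda^2 term kept."""
    return (b * m * lam + m**2 * (lam / 2) ** 2) / (2 * (a + b))

def zone_area(m, a, b, lam):
    """Area of the m-th zone as the difference of two segment areas 2*pi*a*h."""
    return 2 * math.pi * a * (segment_height(m, a, b, lam)
                              - segment_height(m - 1, a, b, lam))

a = b = 1.0       # metres (assumed sample values)
lam = 0.5e-6      # metres
approx = math.pi * a * b * lam / (a + b)
for m in (1, 10, 100):
    # Ratio of the exact zone area to the m-independent approximation
    print(m, zone_area(m, a, b, lam) / approx)
```

For these values the ratio stays within a few parts in 10⁵ of unity even for the hundredth zone, while growing slowly with 𝑚.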
We can find the radii of the zones from Eq. (18.5). When 𝑚 is not too great,
the height of a segment ℎ𝑚 ≪ 𝑎, and we can, therefore, consider that 𝑟𝑚² = 2𝑎ℎ𝑚.
Substituting for ℎ𝑚 its value from Eq. (18.7), we get the following expression for the
radius of the outer boundary of the 𝑚-th zone:
$$r_m = \sqrt{\frac{ab}{a + b}\,m\lambda}. \tag{18.8}$$
If we assume that 𝑎 = 𝑏 = 1 m and 𝜆 = 0.5 µm, then we get a value of 𝑟1 = 0.5 mm
for the radius of the first (central) zone. The radii of the following zones grow as √𝑚.
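The numerical example can be reproduced with a few lines of code. This is a minimal sketch of Eq. (18.8); the function name is an illustrative assumption.

```python
import math

def zone_radius(m, a, b, lam):
    """Radius of the outer boundary of the m-th Fresnel zone, Eq. (18.8)."""
    return math.sqrt(a * b / (a + b) * m * lam)

a = b = 1.0      # metres, as in the example in the text
lam = 0.5e-6     # metres
for m in range(1, 5):
    print(m, zone_radius(m, a, b, lam) * 1e3, "mm")
```

The radius of the fourth zone comes out twice that of the first, in line with the √𝑚 growth.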
Thus, the areas of the Fresnel zones are approximately the same. The distance
𝑏𝑚 from a zone to point P slowly increases with the zone number 𝑚. The angle 𝜑
between a normal to the zone elements and the direction toward point P also grows
with 𝑚. All this leads to the fact that the amplitude 𝐴𝑚 of the oscillation produced
by the 𝑚-th zone at point P diminishes monotonically with increasing 𝑚. Even at
very high values of 𝑚, when the area of a zone begins to grow appreciably with 𝑚
[see Eq. (18.6)], the decrease in the factor 𝐾(𝜑) overbalances the increase in 𝛥𝑆𝑚, so
that 𝐴𝑚 continues to diminish. Thus, the amplitudes of the oscillations produced at
point P by Fresnel zones form a monotonically diminishing sequence:
𝐴1 > 𝐴2 > 𝐴3 > . . . > 𝐴𝑚−1 > 𝐴𝑚 > . . . .
The phases of the oscillations produced by adjacent zones differ by 𝜋. Therefore,
the amplitude 𝐴 of the resultant oscillation at point P can be represented in the
form
𝐴 = 𝐴1 − 𝐴2 + 𝐴3 − 𝐴4 + . . . . (18.9)
This expression includes all the amplitudes from odd zones with one sign and
from even zones with the opposite one.
Let us write Eq. (18.9) in the form
$$A = \frac{A_1}{2} + \left(\frac{A_1}{2} - A_2 + \frac{A_3}{2}\right) + \left(\frac{A_3}{2} - A_4 + \frac{A_5}{2}\right) + \ldots. \tag{18.10}$$
Owing to the monotonic decrease of 𝐴𝑚, we can approximately assume that
$$A_m = \frac{A_{m-1} + A_{m+1}}{2}.$$
The expressions in parentheses will therefore vanish, and Eq. (18.10) will be simplified

Fig. 18.6: Vector diagram obtained when the oscillations produced by the separate zones
are added. The vectors form a broken spiral-shaped line instead of a closed figure.

Fig. 18.7: The phases of the oscillations at points 0 and 1 differ by 𝜋 (the infinitely small
vectors forming the spiral have opposite directions at these points).

as follows:
$$A = \frac{A_1}{2}. \tag{18.11}$$
According to Eq. (18.11), the amplitude produced at a point P by an entire spherical
wave surface equals half the amplitude produced by the central zone alone. If we
put in the path of a wave an opaque screen having an aperture that leaves only
the central Fresnel zone open, the amplitude at point P will equal 𝐴1 , i.e., it will
be double the amplitude given by Eq. (18.11). Accordingly, the intensity of the light
at point P will in this case be four times greater than when there are no barriers
between points S and P.
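The result 𝐴 = 𝐴1/2 can be checked on a model sequence of amplitudes. The sketch below assumes, purely for illustration, that the amplitudes fall off linearly from 𝐴1 = 1 to almost zero over an even number of zones; the actual law of decrease is not specified in the text, and the names used are hypothetical.

```python
def resultant_amplitude(amplitudes):
    """Alternating sum A1 - A2 + A3 - ... of Eq. (18.9)."""
    return sum(a if i % 2 == 0 else -a for i, a in enumerate(amplitudes))

n_zones = 10_000    # assumed (even) number of zones on the wave surface
# Assumed model: amplitudes decrease linearly from A1 = 1 to 1/n_zones
amplitudes = [1.0 - m / n_zones for m in range(n_zones)]

whole_surface = resultant_amplitude(amplitudes)   # no barrier
first_zone_only = amplitudes[0]                   # aperture opening zone 1 only
print(whole_surface, first_zone_only, (first_zone_only / whole_surface) ** 2)
```

In this model the whole wave surface gives half the amplitude of the first zone, so opening only the central zone raises the intensity about fourfold, as stated above.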
Now, let us solve the problem on the propagation of light from source S to
point P by the method of graphical addition of amplitudes. We shall divide the
wave surface into annular zones similar to Fresnel zones, but much smaller in
width (the path difference from the edges of a zone to point P is a small fraction
of 𝜆, the same for all zones). We shall depict the oscillation produced at point P
by each of the zones in the form of a vector whose length equals the amplitude of
the oscillation, while the angle made by the vector with the direction taken as the
beginning of measurement gives the initial phase of the oscillation (see Sec. 7.7 of
Vol. I). The amplitude of the oscillations produced by such zones at point P slowly
diminishes from zone to zone. Each following oscillation lags behind the preceding
one in phase by the same magnitude. Hence, the vector diagram obtained when
the oscillations produced by the separate zones are added has the form shown in
Fig. 18.6.
If the amplitudes produced by the individual zones were the same, the tip of the
last of the vectors shown in Fig. 18.6 would coincide with the tail of the first vector.
Actually, the value of the amplitude diminishes, although very slightly. Hence, the
vectors form a broken spiral-shaped line instead of a closed figure.

Fig. 18.8: (a, b) First and second Fresnel zones. (c) The oscillation produced at point P by the
entire wave surface is depicted by vector 0C. The amplitude here equals half the amplitude
produced by the first zone. (d) The oscillation produced by the inner half of the first Fresnel
zone is depicted by vector 0B, which is √2 times greater than vector 0C. Thus, the intensity
of light produced by the inner half of the first zone is twice that produced by the entire wave surface.

In the limit when the widths of the annular zones tend to zero (their number
will grow unlimitedly), the vector diagram has the form of a spiral winding toward
point C (Fig. 18.7). The phases of the oscillations at points 0 and 1 differ by 𝜋 (the
infinitely small vectors forming the spiral have opposite directions at these points).
Consequently, part 0-1 of the spiral corresponds to the first Fresnel zone. The
vector drawn from point 0 to point 1 (Fig. 18.8a) depicts the oscillation produced at
point P by this zone. Similarly, the vector drawn from point 1 to point 2 (Fig. 18.8b)
depicts the oscillation produced by the second Fresnel zone. The oscillations from
the first and second zones are in counterphase; accordingly, vectors 01 and 12 have
opposite directions.
The oscillation produced at point P by the entire wave surface is depicted by
vector 0C (Fig. 18.8c). Inspection of the figure shows that the amplitude in this case
equals half the amplitude produced by the first zone. We have obtained this result
algebraically earlier [see Eq. (18.11)]. We shall note that the oscillation produced by
the inner half of the first Fresnel zone is depicted by vector 0B (Fig. 18.8d). Thus, the
action of the inner half of the first Fresnel zone is not equivalent to half the action
of the first zone. Vector 0B is √2 times greater than vector 0C. Consequently, the
intensity of the light produced by the inner half of the first Fresnel zone is double
the intensity produced by the entire wave surface.
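The relations between vectors 01, 0B and 0C can be verified by adding the elementary vectors numerically. The following is a rough sketch under stated assumptions: each Fresnel zone is split into 100 narrow sub-zones, each sub-zone lags behind the preceding one by 𝜋/100, and the amplitude falls by an assumed factor exp(−10⁻⁴) from one sub-zone to the next. None of these numbers comes from the text.

```python
import cmath
import math

SUB = 100       # sub-zones per Fresnel zone (assumption)
ZONES = 2000    # number of Fresnel zones summed for the "entire" surface (assumption)
DECAY = 1e-4    # assumed fractional loss of amplitude per sub-zone

def partial_sum(n_terms):
    """Vector sum of the first n_terms elementary oscillations."""
    total = 0j
    for k in range(n_terms):
        amplitude = math.exp(-DECAY * k)
        phase = -math.pi * k / SUB       # each sub-zone lags by pi/SUB
        total += amplitude * cmath.exp(1j * phase)
    return total

first_zone = abs(partial_sum(SUB))            # length of vector 01
inner_half = abs(partial_sum(SUB // 2))       # length of vector 0B
whole_wave = abs(partial_sum(SUB * ZONES))    # length of vector 0C

print(whole_wave / first_zone)    # close to 1/2
print(inner_half / whole_wave)    # close to sqrt(2)
```

The small departures from 1/2 and √2 are due to the assumed decrease in amplitude across the first zone.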
The oscillations from the even and odd Fresnel zones are in counterphase
and, therefore, mutually weaken one another. If we place in the path of the
light wave a plate that covers all the even or all the odd zones, the intensity of the
light at point P grows sharply. Such a plate, known as a zone plate, functions
like a converging lens. Figure 18.9 shows a plate covering the even zones. A still
greater effect can be achieved by changing the phase of the even (or odd) zone

Fig. 18.9: Plate covering the even Fresnel zones. The oscillations from the even and odd
Fresnel zones are in counterphase and, therefore, mutually weaken one another.

oscillations by 𝜋 instead of covering these zones. This can be done with the aid
of a transparent plate whose thickness at the places corresponding to the even or
odd zones differs by a properly selected value. Such a plate is called a phase zone
plate. In comparison with the amplitude zone plate covering zones, a phase plate
produces an additional two-fold increase in the amplitude, and a four-fold increase
in the light intensity.
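The comparison of the two kinds of zone plate reduces to simple bookkeeping of signs. The sketch below assumes 20 zones of equal amplitude (an idealization; the real amplitudes decrease slowly), and the function and argument names are illustrative.

```python
def amplitude(zone_amplitudes, treatment):
    """Resultant amplitude at P for a given treatment of the even zones.

    treatment: "open"    - all zones open, adjacent zones in counterphase
               "blocked" - even zones covered (amplitude zone plate)
               "shifted" - even zones shifted in phase by pi (phase zone plate)
    """
    total = 0.0
    for m, a in enumerate(zone_amplitudes, start=1):
        if m % 2 == 1:
            total += a            # odd zones always contribute with a plus sign
        elif treatment == "open":
            total -= a            # even zones oppose the odd ones
        elif treatment == "shifted":
            total += a            # phase shift by pi reverses the sign
        # "blocked": even zones contribute nothing
    return total

zones = [1.0] * 20                # assumed: 20 zones of equal amplitude
blocked = amplitude(zones, "blocked")
shifted = amplitude(zones, "shifted")
print(amplitude(zones, "open"), blocked, shifted, (shifted / blocked) ** 2)
```

In this idealized model the phase plate doubles the amplitude given by the amplitude plate and quadruples the intensity.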

18.4. Fresnel Diffraction from Simple Barriers

The methods of algebraic and graphical addition of amplitudes treated in the preced-
ing section make it possible to solve a number of problems involving the diffraction
of light.
Diffraction from a Round Aperture. Let us put an opaque screen with a
round aperture of radius 𝑟0 cut out in it in the path of a spherical light wave. We
shall arrange the screen so that a perpendicular dropped from light source S passes
through the centre of the aperture (Fig. 18.10). Let us take point P on the continuation
of this perpendicular. At an aperture radius 𝑟0 considerably smaller than the lengths
𝑎 and 𝑏 shown in the figure, the length 𝑎 can be considered equal to the distance
from source S to the barrier, and the length 𝑏, to the distance from the barrier to
point P. If the distances 𝑎 and 𝑏 satisfy the relation
$$r_0 = \sqrt{\frac{ab}{a + b}\,m\lambda}, \tag{18.12}$$
where 𝑚 is an integer, then the aperture will leave open exactly the first 𝑚 Fresnel zones
constructed for point P [see Eq. (18.8)]. Hence, the number of open Fresnel zones is

Fig. 18.10: Opaque screen with a round aperture of radius 𝑟0 cut out in it in the path of a
spherical light wave. The screen is arranged so that a perpendicular dropped from light
source S passes through the centre of the aperture. The diffraction patterns produced by
the round aperture are shown for when 𝑚 is odd (b) and when 𝑚 is even (c).

determined by the expression
$$m = \frac{r_0^2}{\lambda}\left(\frac{1}{a} + \frac{1}{b}\right). \tag{18.13}$$
According to Eq. (18.9), the amplitude at point P will be
𝐴 = 𝐴1 − 𝐴2 + 𝐴3 − 𝐴4 + . . . ± 𝐴𝑚 . (18.14)
The amplitude 𝐴𝑚 is taken with a plus sign if 𝑚 is odd and with a minus sign if
𝑚 is even. Writing Eq. (18.14) in a form similar to Eq. (18.10) and assuming that the
expressions in parentheses equal zero, we arrive at the equations
$$A = \frac{A_1}{2} + \frac{A_m}{2} \quad (m \text{ is odd}),$$
$$A = \frac{A_1}{2} + \frac{A_{m-1}}{2} - A_m \quad (m \text{ is even}).$$
The amplitudes from two adjacent zones are virtually the same. We may therefore
replace ( 𝐴𝑚−1 /2) − 𝐴𝑚 with −𝐴𝑚 /2. The result is
$$A = \frac{A_1}{2} \pm \frac{A_m}{2}, \tag{18.15}$$
where the plus sign is taken for odd and the minus sign for even 𝑚’s.
The amplitude 𝐴𝑚 differs only slightly from 𝐴1 for small 𝑚’s. Hence, with odd
𝑚’s, the amplitude at point P will approximately equal 𝐴1 , and at even 𝑚’s, zero. It
is easy to obtain this result with the aid of the vector diagram shown in Fig. 18.7.
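As an illustration, Eqs. (18.13) and (18.15) can be evaluated for a concrete arrangement. The sketch below assumes 𝑎 = 𝑏 = 1 m and 𝜆 = 0.5 µm and takes 𝐴𝑚 equal to 𝐴1, which is reasonable only for small 𝑚; the function names are hypothetical.

```python
def open_zones(r0, a, b, lam):
    """Number of Fresnel zones left open by an aperture of radius r0, Eq. (18.13)."""
    return r0**2 / lam * (1.0 / a + 1.0 / b)

def central_amplitude(m, a1=1.0, am=1.0):
    """Amplitude at P from Eq. (18.15): plus sign for odd m, minus sign for even m."""
    sign = 1.0 if m % 2 == 1 else -1.0
    return a1 / 2 + sign * am / 2

lam = 0.5e-6      # metres (assumed sample values)
a = b = 1.0       # metres
for r0 in (0.5e-3, 1.0e-3):
    m = round(open_zones(r0, a, b, lam))
    print(r0, m, central_amplitude(m))
```

An aperture of radius 0.5 mm opens one zone and gives the amplitude 𝐴1 at point P, whereas an aperture of radius 1 mm opens four zones and gives nearly zero.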
If we remove the barrier, the amplitude at point P will become equal to 𝐴1 /2
[see Eq. (18.11)]. Thus, a barrier with an aperture opening a small odd number of
zones not only does not weaken the illumination at point P but, on the contrary,
almost doubles the amplitude and increases the intensity almost four

Fig. 18.11: (a, b, c) Patterns of the Fresnel zones for
points P, P′, and P″ of Fig. 18.10a, respectively. The patterns in (b) and (c) are limited by the
edges of the aperture.

times.
Let us determine the nature of the diffraction pattern that will be observed
on a screen placed after the barrier (see Fig. 18.10). Owing to the symmetrical
arrangement of the aperture relative to straight line SP, the illumination at various
points of the screen will depend only on the distance 𝑟 from point P. At this point
itself, the intensity will reach a maximum or a minimum depending on whether the
number of open (effective) Fresnel zones is odd or even. Assume, for example, that
this number is three. In this case, we get a maximum of intensity at the centre of the
diffraction pattern. A pattern of the Fresnel zones for point P is given in Fig. 18.11a.
Now let us move along the screen to point P′. The pattern of the Fresnel zones
for point P′, limited by the edges of the aperture, has the form shown in Fig. 18.11b. The
edges of the aperture will obstruct a part of the third zone, and simultaneously the
fourth zone will be partly opened. As a result, the intensity of the light diminishes,
and reaches a minimum at a certain position of point P′. If we move along the
screen to point P″, the edges of the aperture will partly obstruct not only the third,
but also the second Fresnel zone, simultaneously partly opening the fifth zone
(Fig. 18.11c). The result will be that the action of the open sections of the odd zones
will overbalance the action of the open sections of the even zones and the intensity
will reach a maximum. True, this maximum will be weaker than that observed at
point P.
Thus, the diffraction pattern produced by a round aperture has the form of
alternating bright and dark concentric rings. There will be either a bright (𝑚 is
odd) or a dark (𝑚 is even) spot at the centre of the pattern (Fig. 18.12). The variation
in the intensity 𝐼 with the distance 𝑟 from the centre of the pattern is shown in
Fig. 18.10b (for an odd 𝑚) and in Fig. 18.10c (for an even 𝑚). When the screen is moved
parallel to itself along straight line SP, the patterns shown in Fig. 18.12 will replace
one another [according to Eq. (18.13), when 𝑏 changes, the value of 𝑚 becomes odd
and even alternately].

Fig. 18.12: Diffraction patterns produced by a round aperture: alternating bright and dark
concentric rings. At the centre of the pattern there is a bright spot for an odd 𝑚 and
a dark spot for an even 𝑚.

If the aperture opens only a part of the central Fresnel zone, a blurred bright
spot is obtained on the screen; there is no alternation of bright and dark rings in
this case. If the aperture opens a great number of zones, the alternation of the bright
and dark rings is observed only in a very narrow region on the boundary of the
geometrical shadow; inside this region the illumination is virtually constant.
Diffraction from a Disk. Let us place an opaque disk of radius 𝑟0 between
light source S and observation point P (Fig. 18.13). If the disk covers the first 𝑚 Fresnel
zones, the amplitude at point P will be
$$A = A_{m+1} - A_{m+2} + A_{m+3} - \ldots = \frac{A_{m+1}}{2} + \left(\frac{A_{m+1}}{2} - A_{m+2} + \frac{A_{m+3}}{2}\right) + \ldots.$$
The expressions in parentheses can be assumed to equal zero; consequently,
$$A = \frac{A_{m+1}}{2}. \tag{18.16}$$
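Equation (18.16) can be tested on a model sequence. The sketch below assumes, for illustration only, that the zone amplitudes form a slowly decreasing geometric sequence with ratio 0.999 and 𝐴1 = 1; the text does not specify the law of decrease, and the names used are hypothetical.

```python
def amplitude_behind_disk(covered, ratio=0.999, n_terms=20_000):
    """Alternating sum A_(m+1) - A_(m+2) + ... for a disk covering `covered` zones.

    Assumed model: A_k = ratio**(k - 1), a slowly decreasing sequence with A_1 = 1.
    """
    total = 0.0
    sign = 1.0
    for k in range(covered + 1, covered + 1 + n_terms):
        total += sign * ratio ** (k - 1)
        sign = -sign
    return total

for covered in (0, 3, 10):
    first_open = 0.999 ** covered      # A_(m+1) in the assumed model
    print(covered, amplitude_behind_disk(covered), first_open / 2)
```

In this model the summed amplitude agrees with 𝐴𝑚+1/2 to within a fraction of a per cent, whatever the number of covered zones.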
Let us determine the nature of the pattern obtained on the screen (see Fig. 18.13).
It is obvious that the illumination can depend only on the distance 𝑟 from point P.
With a small number of covered zones, the amplitude 𝐴𝑚+1 differs slightly from
𝐴1 . The intensity at point P will therefore be almost the same as in the absence of a
barrier between source S and point P [see Eq. (18.11)]. For point P′, displaced relative
to point P in any radial direction, the disk will cover a part of the (𝑚 + 1)-th Fresnel
zone and part of the 𝑚-th zone will be opened simultaneously. This will cause the
intensity to diminish. At a certain position of point P′, the intensity will reach its
minimum. If the distance from the centre of the pattern is still greater, the disk
will cover additionally a part of the (𝑚 + 2)-th zone, and a part of the (𝑚 − 1)-th
zone will be opened simultaneously. As a result, the intensity grows and reaches a
maximum at point P″.
Thus, the diffraction pattern for an opaque disk has the form of alternating
bright and dark concentric rings. The centre of the pattern contains a bright spot
(Fig. 18.14). The light intensity 𝐼 varies with the distance 𝑟 from point P as shown in

Fig. 18.13: (a) Opaque disk of radius 𝑟0 between light source S and observation point P.
(b) Variation of the light intensity with the distance 𝑟 from point P.

Fig. 18.14: Diffraction pattern for an opaque disk: alternating bright and dark concentric
rings.

Fig. 18.13b.
If the disk covers only a small part of the central Fresnel zone, it does
not form a shadow at all: the illumination of the screen everywhere is the same as
in the absence of barriers. If the disk covers many Fresnel zones, alternation of the
bright and dark rings is observed only in a narrow region on the boundary of the
geometrical shadow. In this case, 𝐴𝑚+1 ≪ 𝐴1, so that the bright spot at the centre
is absent, and the illumination in the region of the geometrical shadow equals zero
practically everywhere.
The bright spot at the centre of the shadow formed by a disk was the cause of
an incident between Siméon Poisson and Augustin Fresnel. The Paris Academy
of Sciences announced the diffraction of light as the topic for its 1818 prize. The
organizers of the competition were advocates of the corpuscular theory of light and
were sure that the papers submitted to the competition would bring a final victory to
their theory. Fresnel submitted a paper, however, in which all the optical phenomena
known at that time were explained from the wave viewpoint. In considering this
paper, Poisson, who was a member of the competition committee, drew attention
to the fact that an “absurd” conclusion follows from Fresnel’s theory: there must
be a bright spot at the centre of the shadow formed by a small disk. D. Arago
immediately conducted an experiment and found that such a spot does indeed exist.
This brought victory and all-round recognition to the wave theory of light.
Diffraction from the Straight Edge of a Half-Plane. Let us put an opaque
half-plane with a straight edge in the path of a light wave (which we shall consider
to be plane for simplicity). We shall arrange this half-plane so that it coincides
with one of the wave surfaces. We shall place a screen parallel to the half plane at
a distance 𝑏 behind it and take point P on the screen (Fig. 18.15). Let us divide the
open part of the wave surface into zones having the form of very narrow straight

Fig. 18.15: Opaque half-plane with a straight edge in the path of a light wave, arranged to
coincide with one of the wave surfaces. A screen parallel to the half-plane is at a distance 𝑏
behind it.

Fig. 18.16: Picture that helps to establish the dependence of the amplitude on the zone
number 𝑚.

strips parallel to the edge of the half-plane. We shall choose the width of the zones
so that the distances from point P to the edges of any zone measured in the plane
of the drawing differ by the same amount 𝛥. When this condition is observed, the
oscillations set up at point P by the adjacent zones will differ in phase by a constant
value.
We shall assign the numbers 1, 2, 3, etc. to the zones at the right of point P, and
the numbers 1′, 2′, 3′, etc. to those at the left of this point. The zones numbered 𝑚
and 𝑚′ have an identical width and are symmetrical relative to point P. Therefore,
the oscillations produced by them at P coincide in amplitude and in phase.
To establish the dependence of the amplitude on the zone number 𝑚, let us
assess the areas of the zones. A glance at Fig. 18.16 shows that the total width of the
first 𝑚 zones is

$$d_1 + d_2 + \ldots + d_m = \sqrt{(b + m\Delta)^2 - b^2} = \sqrt{2bm\Delta + m^2\Delta^2}.$$
Since the zones are narrow, we have 𝛥 ≪ 𝑏. Consequently, when 𝑚 is not very
great, we may ignore the quadratic term in the radicand. This yields
$$d_1 + d_2 + \ldots + d_m = \sqrt{2bm\Delta}.$$
Assuming in this equation that 𝑚 = 1, we find that 𝑑1 = √(2𝑏𝛥). Hence, we can write
the expression for the total width of the first 𝑚 zones as follows:
$$d_1 + d_2 + \ldots + d_m = d_1\sqrt{m},$$
whence
$$d_m = d_1\left(\sqrt{m} - \sqrt{m - 1}\right). \tag{18.17}$$
Calculations by Eq. (18.17) show that
𝑑1 : 𝑑2 : 𝑑3 : 𝑑4 : . . . = 1 : 0.41 : 0.32 : 0.27 : . . . . (18.18)
The areas of the zones are in the same proportion. Examination of Eq. (18.18) shows

Fig. 18.17: Approximate vector diagrams showing the graphical addition of the oscillations
produced by the zones. The amplitudes are constant in (a) and vary in accordance with
Eq. (18.18) in (b).

Fig. 18.18: Complete diagram: the vectors depicting the oscillations corresponding to zones
𝑚 and 𝑚′ are arranged symmetrically relative to the origin of coordinates 0.

that the amplitude of the oscillations set up at point P by the individual zones
initially (for the first zones) diminishes very rapidly, and then this diminishing
becomes slower. For this reason, the broken line obtained in the graphical addition
of the oscillations produced by the straight zones first has a gentler slope than that
for annular zones (the areas of which in a similar construction are approximately
equal). Both vector diagrams are compared in Fig. 18.17. In both cases, the lag
in phase of each following oscillation has been taken the same. The value of the
amplitude for the annular zones (Fig. 18.17a) has been taken constant, and for the
straight zones (Fig. 18.17b), diminishing in accordance with proportion (18.18).
The graphs in Fig. 18.17 are approximate. In an exact construction of these graphs,
account must be taken of the dependence of the amplitude on 𝑟 and 𝜑 [see Eq. (18.1)].
This does not affect the general nature of the diagrams, however.
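The figures in proportion (18.18) follow directly from Eq. (18.17), as the short sketch below shows. The function name is an illustrative assumption.

```python
import math

def relative_width(m):
    """Width of the m-th straight zone in units of d_1, Eq. (18.17)."""
    return math.sqrt(m) - math.sqrt(m - 1)

# Widths of the first four zones relative to the first one
print([round(relative_width(m), 2) for m in range(1, 5)])
```

The rounded values reproduce the sequence 1 : 0.41 : 0.32 : 0.27 quoted in (18.18).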
Figure 18.17b shows only the oscillations produced by the zones to the
right of point P. The zones numbered 𝑚 and 𝑚′ are symmetrical relative to P. It is
therefore natural, when constructing the diagram, to arrange the vectors depicting
the oscillations corresponding to these zones symmetrically relative to the origin
of coordinates 0 (Fig. 18.18). If the width of the zones is made to tend to zero, the
broken line shown in Fig. 18.18 will transform into a smooth curve (Fig. 18.19) called
a Cornu spiral.
The equation of a Cornu spiral in the parametric form is
$$
\xi = \int_0^v \cos\frac{\pi u^2}{2}\,\mathrm{d}u, \qquad \eta = \int_0^v \sin\frac{\pi u^2}{2}\,\mathrm{d}u. \tag{18.19}
$$
These integrals are known as Fresnel integrals. They can be evaluated only numerically,
and tables are available that can be used to find the values of the integrals for various
𝑣’s. The meaning of the parameter 𝑣 is that |𝑣| gives the length of the arc of the
Cornu spiral measured from the origin of coordinates.

Fig. 18.19: Cornu spiral. When the width of the zones is made to tend to zero, the broken line shown in Fig. 18.18 transforms into a smooth curve. This makes it possible to find the amplitude of a light oscillation at any point on a screen.

The figures along the curve in Fig. 18.19 give the values of the parameter 𝑣. Points
F1 and F2 , which are asymptotically approached by the curve when 𝑣 tends to +∞
and −∞, are called the focal points or poles of the Cornu spiral. Their coordinates
are
$$
\xi = +\frac{1}{2}, \quad \eta = +\frac{1}{2} \quad \text{for point } F_1, \qquad
\xi = -\frac{1}{2}, \quad \eta = -\frac{1}{2} \quad \text{for point } F_2.
$$
The right-hand curl of the spiral (section 0F1 ) corresponds to zones to the right of
point P, and the left-hand curl (section 0F2 ) to zones to the left of this point.
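
The statement that the Fresnel integrals must be evaluated numerically can be illustrated by a minimal sketch. It integrates Eq. (18.19) with the composite Simpson rule; the function name `fresnel` and the step density `steps_per_unit` are our own illustrative choices and are not part of the text. For large 𝑣 the computed point should approach the pole F1 at (1/2, 1/2).

```python
import math

def fresnel(v, steps_per_unit=400):
    """Return (xi, eta) of Eq. (18.19) using the composite Simpson rule.

    steps_per_unit is an assumed resolution; it must grow with |v| because
    the integrand oscillates faster and faster along the spiral.
    """
    if v == 0:
        return 0.0, 0.0
    n = max(2, 2 * math.ceil(abs(v) * steps_per_unit / 2))  # even panel count
    h = v / n
    xi = eta = 0.0
    for i in range(n + 1):
        u = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        phase = math.pi * u * u / 2
        xi += weight * math.cos(phase)
        eta += weight * math.sin(phase)
    return xi * h / 3, eta * h / 3

for v in (0.5, 1.0, 2.0, 5.0, 20.0):
    xi, eta = fresnel(v)
    print(f"v = {v:5.1f}   xi = {xi:.4f}   eta = {eta:.4f}")
```

Because both integrands are even functions of 𝑢, the curve is symmetrical relative to the origin, so `fresnel(-v)` returns the same point with both signs reversed.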
Let us find the derivative d𝜂/d𝜉 for the point of the curve corresponding to a
given value of the parameter 𝑣. According to Eq. (18.19), the values
$$
\mathrm{d}\xi = \cos\frac{\pi v^2}{2}\,\mathrm{d}v, \qquad \mathrm{d}\eta = \sin\frac{\pi v^2}{2}\,\mathrm{d}v
$$
correspond to the increment d𝑣 of 𝑣. Consequently, d𝜂/d𝜉 = tan(𝜋𝑣²/2). At the
same time, d𝜂/d𝜉 = tan 𝜃, where 𝜃 is the angle of inclination of a tangent to the
curve at the given point. Thus,
$$
\theta = \frac{\pi}{2} v^2. \tag{18.20}
$$

Fig. 18.20: (a) The right-hand curl of the spiral corresponds to oscillations from the unhatched
zones depicted by a vector whose origin is at point 0 and whose end is at point F1 . (b)
When point P is displaced to the region of the geometrical shadow, the half-plane covers
a greater and greater number of unhatched zones. The beginning of the resultant vector
moves along the right-hand curl in the direction of pole F1 . If point P is displaced from the
boundary of the geometrical shadow to the right, the tip of the resultant vector slides along
the left-hand curl of the spiral in a direction to pole F2 . The amplitude passes through a
number of maxima [the first one equals the length of the segment MF1 (c)] and minima [the
first one equals the length of segment NF1 (d)].

It thus follows that at the point corresponding to 𝑣 = 1 a tangent to the Cornu
spiral is perpendicular to the 𝜉-axis. When 𝑣 = 2, the angle 𝜃 is 2𝜋, so that a tangent
is parallel to the 𝜉-axis. When 𝑣 = 3, the angle 𝜃 is 9𝜋/2, so that a tangent is again
perpendicular to the 𝜉-axis, and so on.
The Cornu spiral makes it possible to find the amplitude of a light oscillation
at any point on a screen. We shall characterize the position of the point by the
coordinate 𝑥 measured from the boundary of the geometrical shadow (see Fig. 18.15).
All the hatched zones will be covered for point P on the boundary of the geometrical
shadow (𝑥 = 0). The right-hand curl of the spiral corresponds to oscillations from
the unhatched zones. Hence, the resultant oscillation will be depicted by a vector
whose origin is at point 0 and whose end is at point F1 (Fig. 18.20a). When point P is
displaced to the region of the geometrical shadow, the half-plane covers a greater
and greater number of unhatched zones. Therefore, the beginning of the resultant
vector moves along the right-hand curl in the direction of pole F1 (Fig. 18.20b). As a
result, the amplitude of the oscillation monotonically tends to zero.

Fig. 18.21: Dependence of the light intensity on the coordinate 𝑥. Upon a transition to the region of the geometrical shadow, the intensity gradually tends to zero instead of changing in a jump. A number of alternating maxima and minima of the intensity are to the right of the boundary of the geometrical shadow.

Fig. 18.22: Photograph of the diffraction pattern produced by the edge of a half-plane.

If point P is displaced from the boundary of the geometrical shadow to the
right, in addition to the unhatched zones a constantly growing number of hatched
ones will be uncovered. Therefore, the tip of the resultant vector slides along the
left-hand curl of the spiral in a direction to pole F2 . The amplitude passes through a
number of maxima (the first of them equals the length of segment MF1 in Fig. 18.20c)
and minima (the first of them equals the length of segment NF1 in Fig. 18.20d).
When the wave surface is completely uncovered, the amplitude equals the length
of F2 F1 (Fig. 18.20e), i.e., is exactly double the amplitude on the boundary of the
geometrical shadow (see Fig. 18.20a). Accordingly, the intensity on the boundary of
the geometrical shadow is one-fourth of the intensity 𝐼0 obtained on the screen in
the absence of barriers.
The dependence of light intensity 𝐼 on the coordinate 𝑥 is shown in Fig. 18.21.
Inspection of the figure shows that upon a transition to the region of the geometrical
shadow, the intensity gradually tends to zero instead of changing in a jump. A
number of alternating maxima and minima of the intensity are to the right of
the boundary of the geometrical shadow. Calculations show that at 𝑏 = 1 m and
𝜆 = 0.5 µm the coordinates of the maxima (see Fig. 18.21) have the following values:
𝑥1 = 0.61 mm, 𝑥2 = 1.17 mm, 𝑥3 = 1.54 mm, 𝑥4 = 1.85 mm, etc. With a change in
the distance 𝑏 from the half-plane to the screen, the values of the coordinates of
the maxima and minima vary as √𝑏. It can be seen from the above data that the
maxima are quite dense. The Cornu curve can also be used to find the relative value
of the intensity at the maxima and minima. We get the value of 1.37𝐼0 for the first
maximum and 0.78𝐼0 for the first minimum.
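
The figures quoted above can be checked with a short numerical sketch of the construction on the Cornu spiral. It assumes (this relation is not written out in the present passage) that for a plane incident wave the coordinate on the screen is related to the spiral parameter by 𝑥 = 𝑣√(𝑏𝜆/2), and it takes the relative intensity as half the squared length of the vector drawn from the point (−𝜉, −𝜂) on the left-hand curl to pole F1. The names `edge_curve` and `first_extrema` are hypothetical helpers.

```python
import math

def edge_curve(v_max=4.0, dv=0.001):
    """March along the Cornu spiral and return [(v, I/I0)] behind a straight edge.

    The integrals of Eq. (18.19) are accumulated with Simpson's rule on each
    step; v > 0 is taken to be the illuminated side of the shadow boundary.
    """
    def phase(u):
        return math.pi * u * u / 2

    c = s = 0.0
    curve = [(0.0, 0.25)]  # on the shadow boundary the vector is 0F1
    steps = int(round(v_max / dv))
    for i in range(1, steps + 1):
        a, b = (i - 1) * dv, i * dv
        mid = 0.5 * (a + b)
        c += dv / 6 * (math.cos(phase(a)) + 4 * math.cos(phase(mid)) + math.cos(phase(b)))
        s += dv / 6 * (math.sin(phase(a)) + 4 * math.sin(phase(mid)) + math.sin(phase(b)))
        # Vector from (-c, -s) on the left-hand curl to pole F1 at (1/2, 1/2);
        # the fully open wave surface gives the vector F2F1 of squared length 2.
        curve.append((b, 0.5 * ((c + 0.5) ** 2 + (s + 0.5) ** 2)))
    return curve

def first_extrema(curve):
    """Return the first local maximum and the first local minimum after it."""
    first_max = first_min = None
    for (_, i_prev), (v_mid, i_mid), (_, i_next) in zip(curve, curve[1:], curve[2:]):
        if first_max is None and i_mid > i_prev and i_mid >= i_next:
            first_max = (v_mid, i_mid)
        elif first_max is not None and i_mid < i_prev and i_mid <= i_next:
            first_min = (v_mid, i_mid)
            break
    return first_max, first_min

(v_hi, i_hi), (v_lo, i_lo) = first_extrema(edge_curve())
scale_mm = math.sqrt(1.0 * 0.5e-6 / 2) * 1e3  # assumed x = v*sqrt(b*lambda/2), b = 1 m, lambda = 0.5 um
print(f"first maximum: x = {v_hi * scale_mm:.2f} mm, I = {i_hi:.2f} I0")
print(f"first minimum: x = {v_lo * scale_mm:.2f} mm, I = {i_lo:.2f} I0")
```

Under these assumptions the first maximum falls near 𝑥 = 0.61 mm with an intensity close to 1.37𝐼0, and the first minimum has an intensity close to 0.78𝐼0, in agreement with the values given in the text.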
Figure 18.22 contains a photograph of the diffraction pattern produced by the
edge of a half-plane.

Fig. 18.23: Infinitely long slit formed by placing two half-planes facing opposite directions next to each other.

Fig. 18.24: Cornu spiral with the vectors corresponding to the infinitely long slit formed from two half-planes as in Fig. 18.23.

Diffraction from a Slit. An infinitely long slit can be formed by placing two
half-planes facing opposite directions next to each other. Therefore, the problem
on the Fresnel diffraction from a slit can be solved with the aid of a Cornu spiral.
We shall consider that the wave surface of the incident light, the plane of the slit,
and the screen on which a diffraction pattern is observed are parallel to one another
(Fig. 18.23).
For point P opposite the middle of the slit, the tip and the tail of the resultant
vector are at points on the spiral that are symmetrical relative to the origin of
coordinates (Fig. 18.24). If we move to point P’ opposite an edge of the slit, the tip of
the resultant vector will move to the middle of the spiral 0. The tail of the vector
will move along the spiral in the direction of pole F1 . Upon motion into the region
of the geometrical shadow, the tip and the tail of the resultant vector will slide along
the spiral and in the long run will be at the smallest distance apart (see the vector in
Fig. 18.24 corresponding to point P″). The intensity of the light reaches a minimum
here. Upon further sliding along the spiral, the tip and tail of the vector will move
apart again, and the intensity will grow. The same will occur when we move from
point P in the opposite direction because the diffraction pattern is symmetrical
relative to the middle of the slit.
If we change the width of the slit by moving the half-planes in opposite di-
rections, the intensity at middle point P will pulsate, alternately passing through
maxima (Fig. 18.25a) and minima that differ from zero (Fig. 18.25b).
Thus, a Fresnel diffraction pattern from a slit is either a bright (for the case
shown in Fig. 18.25a) or a relatively dark (for the case shown in Fig. 18.25b) central
fringe at both sides of which there are alternating dark and bright fringes
symmetrical relative to it.

Fig. 18.25: Fresnel diffraction pattern from a slit: either a bright (a) or a relatively dark (b) central fringe, at both sides of which there are alternating dark and bright fringes symmetrical relative to it.

With a great width of the slit, the tip and tail of the resultant vector for point P
are on the internal turns of the spiral near poles F1 and F2 . Therefore, the intensity
of the light at the points opposite the slit will be virtually constant. A system of
closely spaced narrow bright and dark fringes is formed only on the boundaries of
the geometrical shadow.
We must note that all the results obtained in the present section hold provided
that the coherence radius of the light wave falling on the barrier greatly exceeds the
characteristic dimension of the barrier (the diameter of the aperture or disk, the
width of the slit, etc.).

18.5. Fraunhofer Diffraction from a Slit

Assume that a plane light wave falls on an infinitely long² slit (Fig. 18.26). Let us place
a converging lens behind the slit and a screen in the focal plane of the lens. The
wave surface of the incident wave, the plane of the slit, and the screen are parallel
to one another. Since the slit is infinite, the pattern observed in any plane at right
angles to the slit will be the same. It is therefore sufficient to investigate the nature
of the pattern in one such plane, for example, in that of Fig. 18.26. All the quantities
introduced in the following, in particular the angle 𝜑 made by a ray with the optical
axis of the lens, relate to this plane.
²In practice, it is sufficient that the length of the slit be many times its width.

Fig. 18.26: A plane light wave falling on an infinitely long slit. A converging lens is placed behind the slit and a screen in the focal plane of the lens. The wave surface of the incident wave, the plane of the slit, and the screen are parallel to one another.

Let us divide the open part of the wave surface into elementary zones of width
d𝑥 parallel to the edges of the slit. The secondary waves emitted by the zones in
the direction determined by the angle 𝜑 will gather at point P of the screen. Each
elementary zone will produce the oscillation d𝐸 at point P. The lens will gather
plane (and not spherical) waves in the focal plane. Therefore, the factor 1/𝑟 in
Eq. (18.1) for d𝐸 will be absent for Fraunhofer diffraction. Limiting ourselves to a
consideration of not too great angles 𝜑, we can assume that the coefficient 𝐾 in
Eq. (18.1) is constant. Hence, the amplitude of the oscillation produced by a zone
at any point of the screen will depend only on the area of the zone. The area is
proportional to the width d𝑥 of a zone. Consequently, the amplitude d𝐴 of the
oscillation d𝐸 produced by a zone of width d𝑥 at any point of the screen will have
the form
d𝐴 = 𝐶 d𝑥,
where 𝐶 is a constant.
Let 𝐴0 stand for the algebraic sum of the amplitudes of the oscillations produced
by all the zones at a point of the screen. We can find 𝐴0 by integrating d𝐴 over the
entire width of the slit 𝑏:
$$
A_0 = \int \mathrm{d}A = C \int_{-b/2}^{+b/2} \mathrm{d}x = Cb.
$$
Hence, 𝐶 = 𝐴0/𝑏 and, therefore,
$$
\mathrm{d}A = \frac{A_0}{b}\,\mathrm{d}x.
$$
Now let us find the phase relations between the oscillations d𝐸. We shall
compare the phases of the oscillations produced at point P by the elementary
zones having the coordinates 0 and 𝑥 (Fig. 18.26). The optical paths 0P and QP
are tautochronous (see Fig. 16.20). Therefore, the phase difference between the
oscillations being considered is formed on the path 𝛥 equal to 𝑥 sin 𝜑. If the initial
phase of the oscillation produced at point P by the elementary zone at the middle
of the slit (𝑥 = 0) is assumed to equal zero, then the initial phase of the oscillation
produced by the zone with the coordinate 𝑥 will be
$$
-2\pi\frac{\Delta}{\lambda} = -\frac{2\pi}{\lambda}\,x \sin\varphi,
$$
where 𝜆 is the wavelength in the given medium.
Thus, the oscillation produced by the elementary zone with the coordinate 𝑥 at
point P (whose position is determined by the angle 𝜑) can be written in the form
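$$
\mathrm{d}E_\varphi = \frac{A_0}{b}\,\mathrm{d}x\,\exp\left[ i\left( \omega t - \frac{2\pi}{\lambda}\,x \sin\varphi \right) \right] \tag{18.21}
$$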
(we have in mind the real part of this expression).
Integrating Eq. (18.21) over the entire width of the slit, we shall find the resultant
oscillation produced at point P by the part of the wave surface uncovered by the
slit:
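$$
E_\varphi = \int_{-b/2}^{+b/2} \frac{A_0}{b} \exp\left[ i\left( \omega t - \frac{2\pi}{\lambda}\,x \sin\varphi \right) \right] \mathrm{d}x.
$$
Let us put the multipliers not depending on 𝑥 outside the integral. In addition, we
shall introduce the symbol
$$
\gamma = \frac{\pi}{\lambda} \sin\varphi. \tag{18.22}
$$
As a result, we get
$$
\begin{aligned}
E_\varphi &= \frac{A_0}{b}\,e^{i\omega t} \int_{-b/2}^{+b/2} e^{-2i\gamma x}\,\mathrm{d}x
= \frac{A_0}{b}\,e^{i\omega t} \left[ -\frac{1}{2i\gamma}\,e^{-2i\gamma x} \right]_{-b/2}^{+b/2} \\
&= e^{i\omega t} \left\{ -\frac{A_0}{\gamma b}\,\frac{1}{2i} \left( e^{-i\gamma b} - e^{i\gamma b} \right) \right\}
= e^{i\omega t} \left\{ \frac{A_0}{\gamma b}\,\frac{1}{2i} \left( e^{i\gamma b} - e^{-i\gamma b} \right) \right\}.
\end{aligned}
$$
The expression in braces determines the complex amplitude Â𝜑 of the resultant
oscillation. Taking into account that the difference between the exponentials
divided by 2𝑖 is sin(𝛾𝑏) (see Sec. 7.3 of Vol. I), we can write
$$
\hat{A}_\varphi = A_0\,\frac{\sin(\gamma b)}{\gamma b} = A_0\,\frac{\sin[(\pi b \sin\varphi)/\lambda]}{(\pi b \sin\varphi)/\lambda} \tag{18.23}
$$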
[we have introduced the value of 𝛾 from Eq. (18.22)].
Expression (18.23) is real. Its magnitude is the usual amplitude of the resultant
oscillation:
$$
A_\varphi = \left| A_0\,\frac{\sin[(\pi b \sin\varphi)/\lambda]}{(\pi b \sin\varphi)/\lambda} \right|. \tag{18.24}
$$

For a point opposite the centre of the lens, 𝜑 = 0. Introduction of this value
into Eq. (18.24) gives the value 𝐴0 for the amplitude³. This result can be obtained in
a simpler way. When 𝜑 = 0, the oscillations from all the elementary zones arrive
at point P in the same phase. Therefore, the amplitude of the resultant oscillation
equals the algebraic sum of the amplitudes of the oscillations being added.
At values of 𝜑 satisfying the condition (𝜋𝑏 sin 𝜑)/𝜆 = ±𝑘𝜋, i.e., when
𝑏 sin 𝜑 = ±𝑘𝜆 (𝑘 = 1, 2, 3, . . .), (18.25)
the amplitude 𝐴𝜑 vanishes. Thus, condition (18.25) determines the positions of the
minima of intensity. We must note that 𝑏 sin 𝜑 is the path difference 𝛥 of the rays
travelling to point P from the edges of the slit (see Fig. 18.26).
It is easy to obtain condition (18.25) from the following considerations. If the
path difference 𝛥 from the edges of the slit is ±𝑘𝜆, the uncovered part of the wave
surface can be divided into 2𝑘 zones equal in width, and the path difference from the
edges of each zone will be 𝜆/2 (see Fig. 18.27 for 𝑘 = 2). The oscillations from each
pair of adjacent zones mutually destroy each other, so that the resultant amplitude
vanishes. If the path difference 𝛥 for point P is +(𝑘 + 1/2)𝜆, the number of zones
will be odd, the action of one of them will not be compensated, and a maximum of
intensity is observed.
The intensity of light is proportional to the square of the amplitude. Hence, in
accordance with Eq. (18.24),
$$
I_\varphi = I_0\,\frac{\sin^2[(\pi b \sin\varphi)/\lambda]}{[(\pi b \sin\varphi)/\lambda]^2}, \tag{18.26}
$$
where 𝐼0 is the intensity at the middle of the diffraction pattern (opposite the centre
of the lens), and 𝐼 𝜑 is the intensity at the point whose position is determined by the
given value of 𝜑.
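
Equation (18.26) and condition (18.25) lend themselves to a direct numerical check. The following is a minimal sketch; the function name `slit_intensity` and the sample values 𝑏 = 5 µm, 𝜆 = 0.5 µm are assumptions made only for illustration.

```python
import math

def slit_intensity(sin_phi, b, wavelength, i0=1.0):
    """Relative intensity of Eq. (18.26) for a slit of width b."""
    u = math.pi * b * sin_phi / wavelength
    if abs(u) < 1e-12:
        return i0  # limit sin(u)/u -> 1 opposite the centre of the lens
    return i0 * (math.sin(u) / u) ** 2

b, wavelength = 5e-6, 0.5e-6  # assumed sample values, in metres
for k in range(0, 4):
    sin_phi = k * wavelength / b           # condition (18.25) for k >= 1
    print(k, f"{slit_intensity(sin_phi, b, wavelength):.3e}")
```

For 𝑘 = 0 the sketch returns 𝐼0, and for 𝑘 = 1, 2, 3 it returns values that vanish to within rounding error, as condition (18.25) requires.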
We find from Eq. (18.26) that 𝐼−𝜑 = 𝐼 𝜑 . This signifies that the diffraction pattern
is symmetrical relative to the centre of the lens. We must note that when the slit is
displaced parallel to the screen (along the 𝑥-axis in Fig. 18.26), the diffraction pattern
observed on the screen remains stationary (its middle is opposite the centre of the
lens). Conversely, displacement of the lens with the slit stationary is attended by
the same displacement of the pattern on the screen.
A graph of function (18.26) is depicted in Fig. 18.28. The values of sin 𝜑 are laid
off along the axis of abscissas, and the intensity 𝐼 𝜑 along the axis of ordinates. The
number of intensity minima is determined by the ratio of the width of a slit 𝑏 to
the wavelength 𝜆. It can be seen from condition (18.25) that sin 𝜑 = ±𝑘𝜆/𝑏. The
magnitude of sin 𝜑 cannot exceed unity. Hence, 𝑘𝜆/𝑏 ≤ 1, whence
$$
k \le \frac{b}{\lambda}. \tag{18.27}
$$

³We remind our reader that lim𝑢→0 (sin 𝑢)/𝑢 = 1 (at small values of 𝑢 we may assume that sin 𝑢 ≈ 𝑢).

Fig. 18.27: Destruction and construction of oscillations for a path difference 𝛥 from the edges of the slit equal to ±𝑘𝜆 and to +(𝑘 + 1/2)𝜆, respectively, with 𝑘 = 2.

Fig. 18.28: Graph of function (18.26). The number of intensity minima is determined by the ratio of the width of a slit 𝑏 to the wavelength 𝜆.

At a slit width less than a wavelength, minima do not appear at all. In this case,
the intensity of the light monotonically diminishes from the middle of the pattern
toward its edges.
The values of the angle 𝜑 obtained from the condition 𝑏 sin 𝜑 = ±𝜆 correspond
to the edges of the central maximum. These values are ± arcsin(𝜆/𝑏). Consequently,
the angular width of the central maximum is
$$
\delta\varphi = 2 \arcsin\frac{\lambda}{b}. \tag{18.28}
$$
When 𝑏 ≫ 𝜆, the value of arcsin(𝜆/𝑏) can be assumed equal to 𝜆/𝑏. The equation for
the angular width of the central maximum is thus simplified as follows:
$$
\delta\varphi = \frac{2\lambda}{b}. \tag{18.29}
$$
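
Conditions (18.27) to (18.29) can be illustrated by a short sketch. To sidestep rounding problems, the slit width is passed as the ratio 𝑏/𝜆; the function names and the sample ratios are our own illustrative assumptions.

```python
import math

def minima_per_side(b_over_lambda):
    """Largest k allowed by Eq. (18.27): the number of minima on each side."""
    return math.floor(b_over_lambda)

def central_width(b_over_lambda):
    """Return (exact, approximate) angular widths, Eqs. (18.28) and (18.29).

    Returns None when the slit is narrower than a wavelength, since the
    central maximum then has no bounding minima.
    """
    if b_over_lambda < 1:
        return None
    ratio = 1.0 / b_over_lambda
    return 2 * math.asin(ratio), 2 * ratio

for ratio in (0.8, 2.5, 10.5, 100.0):
    print(ratio, minima_per_side(ratio), central_width(ratio))
```

For 𝑏/𝜆 = 100 the exact and the approximate widths differ by less than one part in ten thousand, whereas for 𝑏/𝜆 = 2.5 the small-angle formula already underestimates the width noticeably.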
Let us solve the problem on the Fraunhofer diffraction from a slit by the method
of graphical summation of the amplitudes. We divide the open part of the wave
surface into very narrow zones of an identical width. The oscillation produced
by each of these zones has the same amplitude 𝛥𝐴 and lags in phase behind the
preceding oscillation by the same value 𝛿 that depends on the angle 𝜑 determining
the direction to the point of observation P. When 𝜑 = 0, the phase difference 𝛿
vanishes, and the vector diagram has the form shown in Fig. 18.29a. The amplitude
of the resultant oscillation 𝐴0 equals the algebraic sum of the amplitudes of the
oscillations being added. When 𝛥 = 𝑏 sin 𝜑 = 𝜆/2, the oscillations from the edges of
the slit are in counterphase. Accordingly, the vectors 𝛥𝑨 arrange themselves along a
semicircle of length 𝐴0 (Fig. 18.29b). Hence, the resultant amplitude is 2𝐴0/𝜋.

Fig. 18.29: Solution of the problem on the Fraunhofer diffraction from a slit by the method of graphical summation of the amplitudes. The open part of the wave surface is divided into very narrow zones of an identical width. The oscillation produced by each of these zones has the same amplitude 𝛥𝐴 and lags in phase behind the preceding oscillation by the same value 𝛿 that depends on the angle 𝜑 determining the direction to the point of observation P. (a) Vector diagram with 𝜑 = 0, 𝛿 = 0. (b) When 𝛥 = 𝑏 sin 𝜑 = 𝜆/2, the vectors 𝛥𝑨 form a semicircle of length 𝐴0. (c) When 𝛥 = 𝑏 sin 𝜑 = 𝜆, the vectors 𝛥𝑨 arrange themselves along a circle of length 𝐴0, the phase difference between the edges being equal to 2𝜋. (d) Constructing sequentially the vectors 𝛥𝑨 when 𝛥 = 𝑏 sin 𝜑 = 3𝜆/2, we travel one and a half times around a circle of diameter 𝐴1 = 2𝐴0/3𝜋, which is the amplitude of the first maximum.

When 𝛥 = 𝑏 sin 𝜑 = 𝜆, the oscillations from the edges of the slit differ in phase by 2𝜋.
The corresponding vector diagram is shown in Fig. 18.29c. The vectors 𝛥𝑨 arrange
themselves along a circle of length 𝐴0 . The resultant amplitude is zero—the first
minimum is obtained. The first maximum is obtained at 𝛥 = 𝑏 sin 𝜑 = 3𝜆/2. In this
case, the oscillations from the edges of the slit differ in phase by 3𝜋. Constructing
sequentially the vectors 𝛥𝑨, we travel one and a half times around a circle of
diameter 𝐴1 = 2𝐴0 /3𝜋 (Fig. 18.29d). It is exactly the diameter of this circle that is
the amplitude of the first maximum. Thus, the intensity of the first maximum is
𝐼1 = (2/3𝜋)² 𝐼0 ≈ 0.045𝐼0. We can find the relative intensity of the other maxima
in a similar way. As a result, we get the following proportion:
$$
I_0 : I_1 : I_2 : I_3 : \ldots = 1 : \left( \frac{2}{3\pi} \right)^2 : \left( \frac{2}{5\pi} \right)^2 : \left( \frac{2}{7\pi} \right)^2 : \ldots \tag{18.30}
$$
Thus, the central maximum considerably exceeds the remaining maxima in intensity;
the main fraction of the light flux passing through the slit is concentrated in it.
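
The proportion (18.30) rests on the assumption that the 𝑘-th secondary maximum lies exactly at 𝑏 sin 𝜑 = (𝑘 + 1/2)𝜆. A brief sketch compares these graphical estimates with the true maxima of function (18.26), located here by a simple scan between adjacent minima. The function names and the number of samples are illustrative assumptions.

```python
import math

def graphical_estimate(k):
    """Relative intensity of the k-th secondary maximum from Eq. (18.30)."""
    return (2 / ((2 * k + 1) * math.pi)) ** 2

def scanned_maximum(k, samples=20000):
    """Largest value of sin^2(u)/u^2 between the k-th and (k+1)-th minima."""
    best = 0.0
    for i in range(1, samples):
        u = math.pi * (k + i / samples)
        best = max(best, (math.sin(u) / u) ** 2)
    return best

for k in (1, 2, 3):
    print(k, f"{graphical_estimate(k):.4f}", f"{scanned_maximum(k):.4f}")
```

The graphical estimates come out at about 0.045, 0.016 and 0.008 of 𝐼0; the scanned values are slightly larger (about 0.047 for the first maximum) because the true maxima are displaced a little toward the centre of the pattern.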

Fig. 18.30: Path difference of the rays from the edges of the slit to point P, to determine the
kind of diffraction that will occur in each particular case.

When the width of the slit is very small in comparison with the distance from it
to the screen, the rays travelling to point P from the edges of the slit will be virtually
parallel even in the absence of a lens between the slit and the screen. Consequently,
when a plane wave falls on a slit, Fraunhofer diffraction will be observed. All the
equations obtained above will hold; by 𝜑 in them one should understand the angle
between the direction from any edge of the slit to point P and a normal to the plane
of the slit.
Let us establish a quantitative criterion permitting us to determine the kind of
diffraction that will occur in each particular case. We shall find the path difference
of the rays from the edges of the slit to point P (Fig. 18.30). We apply the cosine law
to the triangle with the sides 𝑟, 𝑟 + 𝛥, and 𝑏:
$$
(r + \Delta)^2 = r^2 + b^2 - 2rb \cos\left( \frac{\pi}{2} + \varphi \right).
$$
Simple transformations yield
$$
2r\Delta + \Delta^2 = b^2 + 2rb \sin\varphi. \tag{18.31}
$$
We are interested in the case when the rays travelling from the edges of the slit to
point P are almost parallel. When this condition is observed, 𝛥² ≪ 𝑟𝛥, and we can
therefore ignore the addend 𝛥² in Eq. (18.31). In this approximation
$$
\Delta = \frac{b^2}{2r} + b \sin\varphi. \tag{18.32}
$$
In the limit at 𝑟 → ∞, we get a value of the path difference 𝛥∞ = 𝑏 sin 𝜑 that
coincides with the expression in Eq. (18.25).
At finite 𝑟’s, the nature of the diffraction pattern will be determined by the
relation between the difference 𝛥 − 𝛥∞ and the wavelength 𝜆. If
$$
\Delta - \Delta_\infty \ll \lambda, \tag{18.33}
$$
the diffraction pattern will be virtually the same as in Fraunhofer diffraction. At
𝛥 − 𝛥∞ comparable with 𝜆 (i.e., 𝛥 − 𝛥∞ ∼ 𝜆), Fresnel diffraction will take place. It
follows from Eq. (18.32) that
$$
\Delta - \Delta_\infty = \frac{b^2}{2r} \sim \frac{b^2}{l}
$$
(here, 𝑙 is the distance from the slit to the screen). Introduction of this expression
into inequality (18.33) gives the condition (𝑏²/𝑙) ≪ 𝜆 or
$$
\frac{b^2}{l\lambda} \ll 1. \tag{18.34}
$$

Fig. 18.31: Diagram giving a visual interpretation of the parameter (18.35). Here 𝑚 is the number of Fresnel zones, 𝜆 is the wavelength, 𝑏 is the width of the slit, and 𝑙 is the distance from the middle of the slit to point P.

Thus, the nature of diffraction depends on the value of the dimensionless parameter
$$
\frac{b^2}{l\lambda}. \tag{18.35}
$$
If this parameter is much smaller than unity, Fraunhofer diffraction is observed,
if it is of the order of unity, Fresnel diffraction is observed, and, finally, if this
parameter is much greater than unity, the approximation of geometrical optics is
applicable. For convenience of comparison, let us write what has been said above
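in the following form:
$$
\frac{b^2}{l\lambda}
\begin{cases}
\ll 1 & \Rightarrow \text{Fraunhofer diffraction}, \\
\sim 1 & \Rightarrow \text{Fresnel diffraction}, \\
\gg 1 & \Rightarrow \text{geometrical optics}.
\end{cases} \tag{18.36}
$$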


Parameter (18.35) can be given a visual interpretation. Let us take point P opposite
the middle of a slit (Fig. 18.31). For this point, the number 𝑚 of Fresnel zones opened
by the slit is determined by the expression
$$
\left( l + m\frac{\lambda}{2} \right)^2 = l^2 + \left( \frac{b}{2} \right)^2.
$$
Opening the parentheses and discarding the addend proportional to 𝜆², we get⁴
$$
m = \frac{b^2}{4l\lambda} \sim \frac{b^2}{l\lambda}. \tag{18.37}
$$
Thus, parameter (18.35) is directly associated with the number of uncovered Fresnel
zones (for a point opposite the middle of the slit).
If a slit opens a small fraction of the central Fresnel zone (𝑚 ≪ 1), Fraunhofer
diffraction is observed. The distribution of the intensity in this case is shown by
the curve depicted in Fig. 18.28. If a slit uncovers a small number of Fresnel zones
(𝑚 ∼ 1), an image of the slit surrounded along its edges by clearly visible bright
and dark fringes will be obtained on the screen. Finally, when a slit opens a large
number of Fresnel zones (𝑚 ≫ 1), a uniformly illuminated image of the slit is
obtained on the screen. Only at the boundaries of the geometrical shadow are there
very narrow alternating brighter and darker fringes virtually indistinguishable by
the eye.
Let us see how the pattern changes when the screen is moved away from the
slit. When the screen is near the slit (𝑚 ≫ 1), the image corresponds to the laws of
geometrical optics. Upon increasing the distance, we first obtain a Fresnel diffraction
pattern which then transforms into a Fraunhofer pattern. The same sequence of
changes is observed when we reduce the width of the slit 𝑏 without changing the
distance 𝑙.
It is clear from what has been said above that the value of the parameter (18.35)
is the criterion of the applicability of geometrical optics (it must be much greater
than unity) instead of the smallness of the wavelength in comparison with the
characteristic dimension of the barrier (for example, the width of the slit). Assume,
for instance, that both ratios 𝑏/𝜆 and 𝑙/𝑏 equal 100. In this case, 𝜆 ≪ 𝑏, but
𝑏2 /(𝑙𝜆) = 1, and, therefore, a distinctly expressed Fresnel diffraction pattern will
be observed.
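
Criterion (18.36) can be turned into a small helper for quick estimates. The sketch below is only illustrative: the text speaks of "much smaller", "of the order of" and "much greater than" unity without naming numerical limits, so the thresholds 0.1 and 10 used here are our own arbitrary assumptions, as are the function names and the sample dimensions.

```python
def diffraction_parameter(b, l, wavelength):
    """Dimensionless parameter (18.35) for slit width b and distance l."""
    return b * b / (l * wavelength)

def regime(parameter, low=0.1, high=10.0):
    """Classify according to (18.36); low and high are assumed thresholds."""
    if parameter < low:
        return "Fraunhofer diffraction"
    if parameter > high:
        return "geometrical optics"
    return "Fresnel diffraction"

wavelength = 0.5e-6                       # assumed value, metres
cases = {
    "b = 100*lambda, l = 100*b": (100 * wavelength, 100 * 100 * wavelength),
    "b = 0.1 mm, l = 10 m": (1e-4, 10.0),
    "b = 1 cm, l = 1 m": (1e-2, 1.0),
}
for label, (b, l) in cases.items():
    p = diffraction_parameter(b, l, wavelength)
    print(f"{label}: b^2/(l*lambda) = {p:.3g} -> {regime(p)}")
```

The first case reproduces the example above: both ratios 𝑏/𝜆 and 𝑙/𝑏 equal 100, the parameter equals unity, and the pattern is classed as Fresnel diffraction even though 𝜆 ≪ 𝑏.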

⁴We must note that the number of open zones will be larger for points greatly displaced to the
region of the geometrical shadow.

18.6. Diffraction Grating

A diffraction grating is a collection of a large number of identical equispaced slits
(Fig. 18.32). The distance 𝑑 between the centres of adjacent slits is called the period
of the grating.

Fig. 18.32: Scheme of a diffraction grating with period 𝑑 and slit width 𝑏. A converging lens placed parallel to the grating focuses the normally incident light, and a screen in the focal plane of the lens serves to observe the diffraction pattern, similar to that of Fig. 18.28.

Let us place a converging lens parallel to a grating and put a screen in the focal
plane of the lens. We shall determine the nature of the diffraction pattern obtained
on the screen when a plane light wave falls on the grating (we shall consider for
simplicity’s sake that the wave falls normally on the grating). Each slit produces
a pattern on the screen that is described by the curve depicted in Fig. 18.28. The
patterns from all the slits will be at the same place on the screen (regardless of the
position of the slit, the central maximum is opposite the centre of the lens). If the
oscillations arriving at point P from different slits were incoherent, the resultant
pattern produced by 𝑁 slits would differ from the pattern produced by a single slit
only in that all the intensities would grow 𝑁 times. The oscillations from different
slits are coherent to a greater or smaller extent, however. The resultant intensity will
therefore differ from 𝑁 𝐼 𝜑 [𝐼 𝜑 is the intensity produced by one slit; see Eq. (18.26)].
We shall assume in the following that the coherence radius of the incident wave
is much greater than the length of the grating so that the oscillations from all the
slits can be considered coherent relative to one another. In this case, the resultant
oscillation at point P whose position is determined by the angle 𝜑 is the sum of 𝑁
oscillations having the same amplitude 𝐴𝜑 shifted relative to one another in phase
by the same amount 𝛿. According to Eq. (17.47), the intensity in these conditions is
$$
I_{\text{gr}} = I_\varphi\,\frac{\sin^2(N\delta/2)}{\sin^2(\delta/2)} \tag{18.38}
$$
(here 𝐼 𝜑 plays the part of 𝐼0 ).
A glance at Fig. 18.32 shows that the path difference from adjacent slits is 𝛥 =
𝑑 sin 𝜑. Hence, the phase difference is
$$
\delta = 2\pi\frac{\Delta}{\lambda} = \frac{2\pi}{\lambda}\,d \sin\varphi, \tag{18.39}
$$
where 𝜆 is the wavelength in the given medium.
Introducing into Eq. (18.38) Eqs. (18.26) and (18.39) for 𝐼 𝜑 and 𝛿, respectively, we
get
$$
I_{\text{gr}} = I_0\,\frac{\sin^2(\pi b \sin\varphi/\lambda)}{(\pi b \sin\varphi/\lambda)^2}\,\frac{\sin^2(N\pi d \sin\varphi/\lambda)}{\sin^2(\pi d \sin\varphi/\lambda)} \tag{18.40}
$$
(𝐼0 is the intensity produced by one slit opposite the centre of the lens).
The first multiplier of 𝐼0 in Eq. (18.40) vanishes when condition (18.25) is observed, i.e.,
𝑏 sin 𝜑 = ±𝑘𝜆 (𝑘 = 1, 2, 3, . . .).
At these points, the intensity produced by each slit individually equals zero.
The second multiplier of 𝐼0 in Eq. (18.40) acquires the value 𝑁² for points
satisfying the condition
𝑑 sin 𝜑 = ±𝑚𝜆 (𝑚 = 0, 1, 2, . . .) (18.41)
[see Eq. (17.49)]. For the directions determined by this condition, the oscillations
from individual slits mutually amplify one another. As a result, the amplitude of
the oscillations at the corresponding point of the screen is
𝐴max = 𝑁 𝐴𝜑 (18.42)
(𝐴𝜑 is the amplitude of the oscillation emitted by one slit at the angle 𝜑).
Condition (18.41) determines the positions of the intensity maxima called the
principal ones. The number 𝑚 gives the order of the principal maximum. There is
only one zero-order maximum, and there are two each of the maxima of the 1st,
2nd, etc. orders.
Squaring Eq. (18.42), we find that the intensity of the principal maxima 𝐼max is
𝑁² times greater than the intensity 𝐼𝜑 produced in the direction 𝜑 by a single slit:
$$
I_{\max} = N^2 I_\varphi. \tag{18.43}
$$
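
Equations (18.40) to (18.43) can be verified numerically with a short sketch. All lengths are expressed in units of the wavelength; the function names and the sample grating (𝑁 = 4, 𝑑 = 6𝜆, 𝑏 = 2𝜆, so that 𝑑/𝑏 = 3) are assumptions made only for illustration. Where the denominator sin²(𝛿/2) vanishes, the interference factor is replaced by its limiting value 𝑁².

```python
import math

def single_slit_factor(u):
    """sin^2(u)/u^2 with the limit 1 at u = 0, cf. Eq. (18.26)."""
    return 1.0 if abs(u) < 1e-12 else (math.sin(u) / u) ** 2

def interference_factor(n_slits, half_delta):
    """sin^2(N*delta/2)/sin^2(delta/2), cf. Eq. (18.38)."""
    denominator = math.sin(half_delta)
    if abs(denominator) < 1e-9:
        return float(n_slits ** 2)  # limiting value at a principal maximum
    return (math.sin(n_slits * half_delta) / denominator) ** 2

def grating_intensity(sin_phi, n_slits, d, b, wavelength=1.0, i0=1.0):
    """Intensity of Eq. (18.40) for N slits of width b and period d."""
    u = math.pi * b * sin_phi / wavelength
    half_delta = math.pi * d * sin_phi / wavelength
    return i0 * single_slit_factor(u) * interference_factor(n_slits, half_delta)

n_slits, d, b = 4, 6.0, 2.0   # assumed sample grating, lengths in wavelengths
for m in range(0, 4):
    sin_phi = m / d           # condition (18.41)
    print(m, f"{grating_intensity(sin_phi, n_slits, d, b):.4f}")
```

At the centre the sketch gives 𝑁²𝐼0 = 16𝐼0; the first-order and second-order maxima equal 𝑁² times the single-slit intensity in the same direction, and the third-order maximum is suppressed because it coincides with the first minimum of a single slit.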
Apart from the minima determined by condition (18.25), there are 𝑁 − 1 addi-
tional minima in each interval between adjacent principal maxima. These minima
appear in the directions for which the oscillations from individual slits mutually
destroy one another. In accordance with Eq. (17.50), the directions of the additional
minima are determined by the condition
$$
d \sin\varphi = \pm\frac{k'}{N}\,\lambda \quad (k' = 1, 2, \ldots, N - 1, N + 1, \ldots, 2N - 1, 2N + 1, \ldots). \tag{18.44}
$$
In Eq. (18.44), 𝑘′ takes on all integral values except for 0, 𝑁, 2𝑁, . . ., i.e., except for
those at which Eq. (18.44) transforms into Eq. (18.41).
It is easy to obtain condition (18.44) by the method of graphical addition of
oscillations. The oscillations from the individual slits are depicted by vectors of
the same length. According to Eq. (18.44), each of the following vectors is turned
relative to the preceding one by the same angle
$$
\delta = \frac{2\pi}{\lambda}\,d \sin\varphi = \frac{2\pi}{N}\,k'.
$$
Therefore, when 𝑘′ is not an integral multiple of 𝑁, we put the tip of the following
vector against the tail of the preceding one and obtain a closed broken line that
completes 𝑘′ (when 𝑘′ < 𝑁/2) or 𝑁 − 𝑘′ (when 𝑘′ > 𝑁/2) revolutions before the
tail of the 𝑁-th vector contacts the tip of the first one. The resultant amplitude
accordingly equals zero. The above is explained in Fig. 18.33 that shows the sum of
the vectors for 𝑁 = 9 and for the values of 𝑘′ equal to 1, 2, and 𝑁 − 1 = 8.

Fig. 18.33: Method of graphical addition of oscillations to arrive at Eq. (18.44). Sum of the vectors for 𝑁 = 9 and for the values of 𝑘′ equal to 1, 2, and 𝑁 − 1 = 8.

Fig. 18.34: Graph of Eq. (18.40) for 𝑁 = 4 and 𝑑/𝑏 = 3. The dash curve shows the intensity produced by one slit multiplied by 𝑁² [Eq. (18.43)].

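
The graphical argument can be repeated numerically by adding 𝑁 unit vectors in the complex plane, each turned relative to the preceding one by 𝛿 = 2𝜋𝑘′/𝑁. This is a minimal sketch; the function name `phasor_sum` is an illustrative assumption, and the sign of the rotation is chosen to represent a lag in phase.

```python
import cmath
import math

def phasor_sum(n_slits, k_prime):
    """Sum of N unit vectors, each lagging the preceding one by 2*pi*k'/N."""
    delta = 2 * math.pi * k_prime / n_slits
    return sum(cmath.exp(-1j * j * delta) for j in range(n_slits))

for k_prime in (1, 2, 8, 9):
    print(k_prime, f"{abs(phasor_sum(9, k_prime)):.3e}")
```

For 𝑁 = 9 the resultant vanishes (to within rounding error) when 𝑘′ equals 1, 2 or 8, whereas 𝑘′ = 9 = 𝑁 gives a resultant of length 𝑁, i.e. a principal maximum.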
Between the additional minima, there are weak secondary maxima. The number
of such maxima falling to an interval between adjacent principal maxima is 𝑁 − 2.
We showed in Sec. 17.6 that the intensity of the secondary maxima does not exceed
1/22 of that of the closest principal maximum.
Figure 18.34 shows a graph of function (18.40) for 𝑁 = 4 and 𝑑/𝑏 = 3. The
dash curve passing through the peaks of the principal maxima shows the intensity
produced by one slit multiplied by 𝑁 2 [see Eq. (18.43)]. At the ratio of the grating
period to the slit width used in the figure (𝑑/𝑏 = 3), the principal maxima of the
third, sixth, etc. orders fall to the minima of intensity from one slit, owing to which
these maxima vanish. In general, it can be seen from Eqs. (18.25) and (18.41) that the
principal maximum of the 𝑚-th order falls to the 𝑘-th minimum from one slit if the
equation 𝑚/𝑑 = 𝑘/𝑏 or 𝑚/𝑘 = 𝑑/𝑏 is satisfied. This is possible if 𝑑/𝑏 equals the ratio
of two integers 𝑟 and 𝑠 (the case when these integers are not great is of practical
interest). Here, the principal maximum of the 𝑟-th order will be superposed on the
𝑠-th minimum from one slit, the maximum of the 2𝑟-th order will be superposed
on the 2𝑠-th minimum, etc. As a result, the maxima of orders 𝑟, 2𝑟, 3𝑟, etc. will be
absent.
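
The rule for the absent maxima can be sketched in a few lines. The ratio 𝑑/𝑏 is represented as an exact fraction 𝑟/𝑠 in lowest terms, and the orders 𝑟, 2𝑟, 3𝑟, . . . are listed up to a chosen limit. The function name `missing_orders` and the limit `m_max` are illustrative assumptions.

```python
from fractions import Fraction

def missing_orders(d_over_b, m_max):
    """Orders m <= m_max that fall on single-slit minima when d/b = r/s.

    d_over_b should be an int or a Fraction so that r and s are exact.
    """
    ratio = Fraction(d_over_b)          # reduced automatically to r/s
    r = ratio.numerator
    return list(range(r, m_max + 1, r))

print(missing_orders(Fraction(3, 1), 10))   # d/b = 3, as in Fig. 18.34
print(missing_orders(Fraction(5, 2), 10))   # d/b = 5/2
```

With 𝑑/𝑏 = 3 the orders 3, 6 and 9 are absent; with 𝑑/𝑏 = 5/2 the fifth order coincides with the second minimum of a single slit, so the orders 5 and 10 are absent.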
The number of principal maxima observed is determined by the ratio of the
period of the grating 𝑑 to the wavelength 𝜆. The magnitude of sin 𝜑 cannot exceed
unity. It therefore follows from Eq. (18.41) that
$$ m \le \frac{d}{\lambda}. \tag{18.45} $$
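As a rough numerical illustration of this bound, the sketch below counts the principal maxima for normal incidence. The function names and the values of 𝑑 and 𝜆 are assumptions chosen only for the example.

```python
import math

def max_order(d, lam):
    """Largest m allowed by Eq. (18.45), m <= d / lam (same units for d and lam)."""
    return math.floor(d / lam)

def count_principal_maxima(d, lam):
    """Orders -m_max ... +m_max, including the central (zero-order) maximum."""
    return 2 * max_order(d, lam) + 1

# Assumed values: d = 2.0 um, lam = 0.6 um
print(max_order(2.0, 0.6), count_principal_maxima(2.0, 0.6))
```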
Let us determine the angular width of the central (zero) maximum. The position
of the additional minima closest to it is determined by the condition 𝑑 sin 𝜑 = ±𝜆/𝑁
[see Eq. (18.44)]. Hence, values of 𝜑 equal to ± arcsin(𝜆/𝑁𝑑) correspond to these
minima. We thus obtain the following expression for the angular width of the
central maximum:
$$ \delta\varphi_0 = 2\arcsin\frac{\lambda}{Nd} \approx \frac{2\lambda}{Nd} \tag{18.46} $$
(we have taken advantage of the circumstance that 𝜆/𝑁𝑑 ≪ 1).
The position of the additional minima closest to the principal maximum of the
𝑚-th order is determined by the condition 𝑑 sin 𝜑 = (𝑚 ± 1/𝑁)𝜆. Hence, for the
angular width of the 𝑚-th maximum, we get the expression
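$$ \delta\varphi_m = \arcsin\left[\left(m + \frac{1}{N}\right)\frac{\lambda}{d}\right] - \arcsin\left[\left(m - \frac{1}{N}\right)\frac{\lambda}{d}\right]. $$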
Introducing the notation 𝑚𝜆/𝑑 = 𝑥 and 𝜆/𝑁𝑑 = 𝛥𝑥, we can write this equation in
the form
𝛿 𝜑𝑚 = arcsin(𝑥 + 𝛥𝑥) − arcsin(𝑥 − 𝛥𝑥). (18.47)
With a great number of slits, the value of 𝛥𝑥 = 𝜆/𝑁𝑑 will be very small. We can
therefore assume that arcsin(𝑥 ± 𝛥𝑥) ≈ arcsin 𝑥 ± (arcsin 𝑥)′ 𝛥𝑥. The introduction
of these values into Eq. (18.47) leads to the approximate expression
$$ \delta\varphi_m = 2(\arcsin x)'\,\Delta x = \frac{2\,\Delta x}{\sqrt{1 - x^2}} = \frac{1}{\sqrt{1 - m^2(\lambda/d)^2}}\,\frac{2\lambda}{Nd}. \tag{18.48} $$
When 𝑚 = 0, this expression transforms into Eq. (18.46).
The product 𝑁𝑑 gives the length of the diffraction grating. Consequently, the
angular width of the principal maxima is inversely proportional to the length of
the grating. The width 𝛿 𝜑𝑚 grows with an increase in the order 𝑚 of a maximum.
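The quality of approximation (18.48) can be judged by comparing it with the exact difference (18.47). The following is only a sketch: the function names and the values 𝜆 = 0.5 µm, 𝑑 = 2 µm, 𝑁 = 1000 are assumed for illustration.

```python
import math

def width_exact(m, N, lam, d):
    """Eq. (18.47): arcsin(x + dx) - arcsin(x - dx), with x = m*lam/d, dx = lam/(N*d)."""
    x, dx = m * lam / d, lam / (N * d)
    return math.asin(x + dx) - math.asin(x - dx)

def width_approx(m, N, lam, d):
    """Eq. (18.48): 2*dx / sqrt(1 - x**2)."""
    x, dx = m * lam / d, lam / (N * d)
    return 2 * dx / math.sqrt(1 - x * x)

# Assumed values: lam = 0.5 um, d = 2 um, N = 1000 slits; orders 0 to 3 exist
for m in range(4):
    print(m, width_exact(m, 1000, 0.5, 2.0), width_approx(m, 1000, 0.5, 2.0))
```

With these numbers the two expressions agree closely for every order, and the width is seen to grow with 𝑚.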
The position of the principal maxima depends on the wavelength 𝜆. Therefore,
when white light is passed through a grating, all the maxima except for the central
one will expand into a spectrum whose violet end faces the centre of the diffraction
pattern, and whose red end faces outward. Thus, a diffraction grating is a spectral
instrument. We must note that whereas a glass prism deflects violet rays most strongly,
a diffraction grating, on the contrary, deflects red rays to a greater extent.


Figure 18.35 shows schematically the spectra of different orders produced by a
grating when white light is passed through it. At the centre is a narrow zero-order
maximum; only its edges are coloured [according to expression (18.46), 𝛿 𝜑0 depends
on 𝜆]. At both sides of the central maximum are two first-order spectra, then two
second-order spectra, etc. The positions of the red end of the 𝑚-th order spectrum
and the violet end of the (𝑚 + 1)-th order one are determined by the relations
$$ \sin\varphi_{\mathrm{r}} = m\,\frac{0.76}{d}, \qquad \sin\varphi_{\mathrm{v}} = (m + 1)\,\frac{0.40}{d}, $$
where 𝑑 has been taken in micrometres. When the condition is observed that
0.76𝑚 > 0.40(𝑚 + 1),
the spectra of the 𝑚-th and (𝑚 + 1)-th orders partly overlap. The inequality gives
𝑚 > 10/9. Hence, partial overlapping begins from the spectra of the second and
third orders (see Fig. 18.35, in which for illustration the spectra of different orders
are displaced relative to one another vertically).
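The overlap condition is easy to verify numerically. In the sketch below the function name is our own; the boundary wavelengths 0.76 µm and 0.40 µm are those used in the text.

```python
def orders_overlap(m, lam_red=0.76, lam_violet=0.40):
    """True if the red end of order m lies beyond the violet end of order m + 1."""
    return m * lam_red > (m + 1) * lam_violet

# Smallest order whose spectrum overlaps that of the next order
first = next(m for m in range(1, 100) if orders_overlap(m))
print(first)
```

The first and second orders do not overlap, while the second and third do, as follows from 𝑚 > 10/9.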
The main characteristics of a spectral instrument are its dispersion and re-
solving power. The dispersion determines the angular or linear distance between
two spectral lines differing in wavelength by one unit (for example by 1 Å). The
resolving power determines the minimum difference between wavelengths 𝛿 𝜆 at
which the two lines corresponding to them are perceived separately in the spectrum.
The angular dispersion is defined as the quantity
$$ D = \frac{\delta\varphi}{\delta\lambda}, \tag{18.49} $$
where 𝛿 𝜑 is the angular distance between spectral lines differing in wavelength by
𝛿 𝜆.
To find the angular dispersion of a diffraction grating, let us differentiate condi-
tion (18.41) for the principal maximum at the left with respect to 𝜑 and at the right
with respect to 𝜆. Omitting the minus sign, we get
(𝑑 cos 𝜑)𝛿 𝜑 = 𝑚 𝛿 𝜆,
whence
$$ D = \frac{\delta\varphi}{\delta\lambda} = \frac{m}{d\cos\varphi}. \tag{18.50} $$
Within the range of small angles, cos 𝜑 ≈ 1. We can therefore assume that
$$ D \approx \frac{m}{d}. \tag{18.51} $$
It can be seen from expression (18.51) that the angular dispersion is inversely pro-
portional to the grating period 𝑑. The higher the order 𝑚 of a spectrum, the greater
is the dispersion.
Fig. 18.35: Scheme of the spectra of different orders produced by a grating when white light
is passed through it. At the centre there is a narrow zero-order maximum and only its edges
are coloured [Eq. (18.46)]. At both sides of the central maximum are two first-order spectra,
then two second-order spectra, etc.

Linear dispersion is defined as the quantity


$$ D_{\mathrm{lin}} = \frac{\delta l}{\delta\lambda}, \tag{18.52} $$
where 𝛿𝑙 is the linear distance on a screen or photographic plate between spectral
lines differing in wavelength by 𝛿𝜆. A glance at Fig. 18.36 shows that for small values
of the angle 𝜑 we can assume that 𝛿𝑙 ≈ 𝑓′𝛿𝜑, where 𝑓′ is the focal length of the lens
gathering the diffracted rays on a screen. Consequently, the linear dispersion is
associated with the angular dispersion 𝐷 by the relation
$$ D_{\mathrm{lin}} = f' D. $$
Taking expression (18.51) into consideration, we get the following equation for the
linear dispersion of a diffraction grating (with small 𝜑’s):
$$ D_{\mathrm{lin}} = f'\,\frac{m}{d}. \tag{18.53} $$
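To get a feeling for the magnitudes involved, the dispersion formulas can be evaluated for a sample grating. This is a sketch under assumed values (600 lines per mm, first order, focal length 500 mm); the function names are ours.

```python
import math

def angular_dispersion(m, d, phi=0.0):
    """Eq. (18.50): D = m / (d cos(phi)); phi = 0 reproduces Eq. (18.51)."""
    return m / (d * math.cos(phi))

def linear_dispersion(m, d, f, phi=0.0):
    """D_lin = f' * D; this is Eq. (18.53) when phi = 0."""
    return f * angular_dispersion(m, d, phi)

# Assumed values: 600 lines/mm (d = 1/600 mm), first order, f' = 500 mm
d_mm = 1 / 600
D = angular_dispersion(1, d_mm)          # rad per mm of wavelength
D_lin = linear_dispersion(1, d_mm, 500)  # mm on the screen per mm of wavelength
print(D * 1e-6, "rad/nm")                # 1 nm = 1e-6 mm
print(D_lin * 1e-6, "mm/nm")
```

Passing a nonzero `phi` shows how the dispersion exceeds the small-angle estimate at larger diffraction angles.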
The resolving power of a spectral instrument is defined as the dimensionless
quantity
$$ R = \frac{\lambda}{\delta\lambda}, \tag{18.54} $$
where 𝛿 𝜆 is the minimum difference between the wavelengths of two spectral lines
at which these lines are perceived separately.
The possibility of resolving (i.e., perceiving separately) two close spectral lines
depends not only on the distance between them (that is determined by the dispersion
of the instrument), but also on the width of the spectral maximum. Figure 18.37
shows the resultant intensity (solid curves) observed in the superposition of two
close maxima (the dashed curves). In case (a), both maxima are perceived as a single
one. In case (b), there is a minimum between the maxima. Two close maxima are
perceived by the eye separately if the intensity in the interval between them is not over
80% of the intensity of a maximum. According to the criterion proposed by the
British physicist Lord Rayleigh (John William Strutt, 1842-1919), such a ratio of the
intensities occurs if
the middle of one maximum coincides with the edge of another one (Fig. 18.37b).
Such a mutual arrangement of the maxima is obtained at a definite (for the given
instrument) value of 𝛿 𝜆.
Fig. 18.36: For small values of the angle 𝜑 we can assume that 𝛿𝑙 ≈ 𝑓′𝛿𝜑, where 𝑓′ is
the focal length of the lens gathering the diffracted rays on a screen.

Fig. 18.37: Resultant intensity (solid curves) observed in the superposition of two close
maxima (the dashed curves). (a) Both maxima are perceived as a single one. (b) There is a
minimum between the maxima.

Let us find the resolving power of a diffraction grating. The position of the
middle of the 𝑚-th maximum for the wavelength 𝜆 + 𝛿𝜆 is determined by the
condition
$$ d\sin\varphi_{\max} = m(\lambda + \delta\lambda). $$
The edges of the 𝑚-th maximum for the wavelength 𝜆 are at angles complying with
the condition
$$ d\sin\varphi_{\min} = \left(m \pm \frac{1}{N}\right)\lambda. $$
The middle of the maximum for the wavelength 𝜆 + 𝛿𝜆 coincides with the edge of
the maximum for the wavelength 𝜆 if
$$ m(\lambda + \delta\lambda) = \left(m \pm \frac{1}{N}\right)\lambda, $$
whence, in magnitude,
$$ m\,\delta\lambda = \frac{\lambda}{N}. $$
Solving this equation for 𝜆/𝛿𝜆, we get an expression for the resolving power:
$$ R = mN. \tag{18.55} $$
Thus, the resolving power of a diffraction grating is proportional to the order 𝑚 of
the spectrum and the number of slits 𝑁.
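As an illustration of Eqs. (18.54) and (18.55), we can estimate how many slits are needed to separate two close lines. The sketch below uses the yellow sodium doublet, taken here as approximately 589.0 nm and 589.6 nm; these wavelengths and the function names are assumptions introduced for the example.

```python
import math

def resolving_power(m, N):
    """Eq. (18.55): R = m * N."""
    return m * N

def slits_needed(lam, dlam, m):
    """Smallest N for which R = m * N reaches lam / dlam, Eq. (18.54)."""
    return math.ceil(lam / (dlam * m))

# Assumed illustration: mean wavelength 589.3 nm, line separation 0.6 nm
lam, dlam = 589.3, 0.6
print(slits_needed(lam, dlam, 1), slits_needed(lam, dlam, 2))
```

Roughly a thousand slits suffice in the first order, and about half as many in the second.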
Figure 18.38 compares the diffraction patterns obtained for two spectral lines
with the aid of gratings differing in the values of the dispersion 𝐷 and the resolving
power 𝑅. Gratings I and II have the same resolving power (they have the same
number of slits 𝑁), but a different dispersion (in grating I, the period 𝑑 is double
and the dispersion 𝐷 is half of the respective quantities of grating II). Gratings II

Fig. 18.38: Comparison of diffraction patterns obtained for two spectral lines with the aid of
gratings differing in the values of the dispersion 𝐷 and the resolving power 𝑅. Gratings I
and II have the same resolving power, but in grating I the period is double and dispersion
𝐷 is half of that in grating II. Gratings II and III have the same dispersion, but the resolving
power of grating II is double that of grating III.

and III have the same dispersion (they have the same 𝑑’s), but a different resolving
power (the number of slits 𝑁 and the resolving power 𝑅 of grating II are double
the respective quantities of grating III).
Transmission and reflecting diffraction gratings are in use. Transmission grat-
ings are made from glass or quartz plates on whose surface a special machine using
a diamond cutter makes a number of parallel lines. The spaces between these lines
are the slits.
Reflecting gratings are ruled with the aid of a diamond cutter on the surface
of a metal mirror. Light falls on a reflecting grating at an acute angle. A grating of
period 𝑑 functions in the same way as a transmission grating with the period 𝑑 cos 𝜃,
where 𝜃 is the angle of incidence of the light, would function with the light falling
normally. This makes it possible to observe a spectrum when light is reflected, for
example, from a gramophone record having only a few lines (grooves) per millimetre
if it is placed so that the angle of incidence is close to 𝜋/2. The American physicist
Henry Rowland (1848-1901) invented a concave reflecting grating which focuses the
diffraction spectra by itself (without a lens).
The best gratings have up to 1200 lines per mm (𝑑 ≈ 0.8 µm). It can be seen
from Eq. (18.45) that no second-order spectra are observed in visible light with such
a period. The total number of lines in such gratings reaches 200000 (they are about
200 mm long). With a focal length of the instrument 𝑓′ = 2 m, the length of the
visible first-order spectrum in this case is over 700 mm.

18.7. Diffraction of X-Rays

Let us place two diffraction gratings one after the other so that their lines are
mutually perpendicular. The first grating (whose lines, say, are vertical) will produce
a number of maxima in the horizontal direction. Their positions are determined by
the condition
$$ d_1\sin\varphi_1 = \pm m_1\lambda \qquad (m_1 = 0, 1, 2, \ldots). \tag{18.56} $$
The second grating (with horizontal lines) will divide each of the beams formed in
this way into vertically arranged maxima whose positions are determined by the
condition
$$ d_2\sin\varphi_2 = \pm m_2\lambda \qquad (m_2 = 0, 1, 2, \ldots). \tag{18.57} $$
As a result, the diffraction pattern will have the form of regularly arranged spots,
with two integral indices 𝑚₁ and 𝑚₂ corresponding to each of them (Fig. 18.39).
An identical diffraction pattern is obtained if instead of two separate gratings
we take one transparent plate with two systems of mutually perpendicular lines
applied on it. Such a plate is a two-dimensional periodic structure (a conventional
grating is a one-dimensional structure). Having measured the angles 𝜑₁ and 𝜑₂
determining the positions of the maxima and knowing the wavelength 𝜆, we can use
Eqs. (18.56) and (18.57) to find the periods of the structure 𝑑₁ and 𝑑₂. If the directions
in which a structure is periodic (for example, directions at right angles to the grating
lines) make an angle 𝛼 differing from 𝜋/2, the diffraction maxima will be at the
apices of parallelograms instead of at the apices of rectangles (as in Fig. 18.39). In
this case, the diffraction pattern can be used to determine not only the periods 𝑑₁
and 𝑑₂, but also the angle 𝛼.
Any two-dimensional periodic structure, such as a system of small apertures
or of tiny opaque spheres, produces a diffraction pattern similar to that shown
in Fig. 18.39.
For diffraction maxima to appear, it is essential that the period of the structure
𝑑 be greater than 𝜆. Otherwise, conditions (18.56) and (18.57) can be satisfied only at
values of 𝑚₁ and 𝑚₂ equal to zero (the magnitude of sin 𝜑 cannot exceed unity).
Diffraction is also observed in three-dimensional structures, i.e., spatial forma-
tions displaying periodicity along three directions not in one plane. All crystalline
bodies are such structures. Their period (∼ 10⁻¹⁰ m), however, is too small for the

Fig. 18.39: Diffraction pattern with two integral indices 𝑚₁ and 𝑚₂ corresponding to
two diffraction gratings placed one after the other so that their lines are mutually
perpendicular.

Fig. 18.40: Formation of diffraction maxima from a three-dimensional structure. The
coordinate axes 𝑥, 𝑦 and 𝑧 are positioned in the directions along which the properties of
the structure display periodicity.

observation of diffraction in visible light. The condition 𝑑 > 𝜆 is observed for crystals
only for X-rays. The diffraction of X-rays from crystals was first observed in 1912
in an experiment conducted by the German physicists Max von Laue (1879-1960),
Walter Friedrich (1883-1968), and Paul Knipping (1883-1935). (The idea belonged to
von Laue, while the other two authors ran the experiment.)
Let us find the conditions for the formation of diffraction maxima from a three-
dimensional structure. We position the coordinate axes 𝑥, 𝑦, and 𝑧 in the directions
along which the properties of the structure display periodicity (Fig. 18.40). The
structure can be represented as a collection of equally spaced parallel trains of
structural elements arranged along one of the coordinate axes. We shall consider
the action of an individual linear train parallel, for instance, to the 𝑥-axis (Fig. 18.41).
Assume that a beam of parallel rays making the angle 𝛼₀ with the 𝑥-axis falls on
the train. Every structural element is a source of secondary wavelets. An incident
wave arrives at adjacent sources with a phase difference of 𝛿₀ = 2𝜋𝛥₀/𝜆, where
𝛥₀ = 𝑑₁ cos 𝛼₀ (here, 𝑑₁ is the period of the structure along the 𝑥-axis). Apart
from this, the additional path difference 𝛥 = 𝑑₁ cos 𝛼 is produced between the
secondary wavelets propagating in directions that make the angle 𝛼 with the 𝑥-axis
(all such directions lie along the generatrices of a cone whose axis is the 𝑥-axis). The
oscillations from different structural elements will be mutually amplified for the
directions for which
$$ d_1(\cos\alpha - \cos\alpha_0) = \pm m_1\lambda \qquad (m_1 = 0, 1, 2, \ldots). \tag{18.58} $$
There is a separate cone of directions for each value of 𝑚₁, and along these
directions we get maxima of the intensity from one individually taken train parallel
to the 𝑥-axis. The axis of this cone coincides with the 𝑥-axis.
Fig. 18.41: Scheme of a collection of equally spaced parallel trains of structural elements
arranged along the 𝑥-axis.

The condition of the maximum for a train parallel to the 𝑦-axis has the form
$$ d_2(\cos\beta - \cos\beta_0) = \pm m_2\lambda \qquad (m_2 = 0, 1, 2, \ldots), \tag{18.59} $$
where 𝑑₂ is the period of the structure in the direction of the 𝑦-axis, 𝛽₀ is the angle
between the incident beam and the 𝑦-axis, and 𝛽 is the angle between the 𝑦-axis
and the directions along which diffraction maxima are obtained.
A cone of directions whose axis coincides with the 𝑦-axis corresponds to each
value of 𝑚₂.
In directions satisfying conditions (18.58) and (18.59) simultaneously, mutual
amplification of the oscillations from sources in the same plane perpendicular to the
𝑧-axis occurs (these sources form a two-dimensional structure). The directions of the
intensity maxima produced lie along the lines of intersection of the direction cones,
of which one is determined by condition (18.58), and the second one by condition
(18.59).
Finally, for the train parallel to the 𝑧-axis, the directions of the maxima are
determined by the condition
$$ d_3(\cos\gamma - \cos\gamma_0) = \pm m_3\lambda \qquad (m_3 = 0, 1, 2, \ldots), \tag{18.60} $$
where 𝑑₃ is the period of the structure in the direction of the 𝑧-axis, 𝛾₀ is the angle
between the incident beam and the 𝑧-axis, and 𝛾 is the angle between the 𝑧-axis and
the directions along which diffraction maxima are obtained.
As in the preceding cases, a cone of directions whose axis coincides with the
𝑧-axis corresponds to each value of 𝑚₃.
In the directions satisfying conditions (18.58), (18.59), and (18.60) simultaneously,
mutual amplification of the oscillations from all the elements forming the three-
dimensional structure occurs. As a result, diffraction maxima are produced by the
three-dimensional structure. The directions of these maxima are on the lines of
intersection of three cones whose axes are parallel to the coordinate axes.
The conditions
$$
\begin{aligned}
d_1(\cos\alpha - \cos\alpha_0) &= \pm m_1\lambda, \\
d_2(\cos\beta - \cos\beta_0) &= \pm m_2\lambda, \\
d_3(\cos\gamma - \cos\gamma_0) &= \pm m_3\lambda,
\end{aligned}
\qquad (m_i = 0, 1, 2, \ldots) \tag{18.61}
$$
which we have found are called Laue’s formulas. Three integral numbers 𝑚₁, 𝑚₂,
and 𝑚₃ correspond to each direction (𝛼, 𝛽, 𝛾) determined by these formulas. The
greatest value of the magnitude of the difference between cosines is two. Hence,
conditions (18.61) can be obeyed with values of the numbers 𝑚 other than zero only
provided that 𝜆 does not exceed 2𝑑.
The angles 𝛼, 𝛽 and 𝛾 are not independent. For example, when a Cartesian
system of coordinates is used, they are related by the expression
$$ \cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1. \tag{18.62} $$
Thus, when 𝛼₀, 𝛽₀ and 𝛾₀ are given, the angles 𝛼, 𝛽 and 𝛾 determining the directions
of the maxima can be found by solving a system of four equations. If the number of
equations exceeds the number of unknowns, a system of equations can be solved
only when definite conditions are observed (only when these conditions are satisfied
can the three cones intersect one another along a single line).
The system of Eqs. (18.61) and (18.62) can be solved only for certain quite definite
wavelengths (𝜆 can be considered as a fourth unknown whose values obtained
from the solution of the system of equations are exactly the wavelengths for which
maxima are observed). Generally speaking, only one maximum corresponds to each
such value of 𝜆. Several symmetrically arranged maxima may be obtained, however.
If the wavelength is fixed (monochromatic radiation), the system of equations
can be made consistent by varying the values of 𝛼₀, 𝛽₀ and 𝛾₀, i.e., by turning the
three-dimensional structure relative to the direction of the incident beam.
We have not treated the question of how rays travelling from different structural
elements are made to converge to one point on a screen. A lens does this for visible
light. A lens cannot be used for X-rays because the refractive index of these rays
in all substances is virtually equal to unity. For this reason, the interference of the
secondary wavelets is achieved by using very narrow beams of rays producing spots
of a very small size on a screen (or a photographic plate) even without a lens.
The Russian scientist Yuri Vulf (1863-1925) and the British physicists William
Henry Bragg (1862-1942) and his son William Lawrence Bragg (1890-1971) showed
independently of each other that the diffraction pattern from a crystal lattice can
be calculated in the following simple way. Let us draw parallel equispaced planes
through the points of a crystal lattice (Fig. 18.42). We shall call these planes atomic
layers. If the wave falling on the crystal is plane, the envelope of the secondary
Fig. 18.42: Reflection of a plane wave upon parallel equispaced planes through the
points of a crystal lattice (atomic planes). 𝑑 is the period of identity of the crystal and
𝜃 is the angle complementary to the angle of incidence, called the glancing angle of
the incident rays.

Fig. 18.43: The difference between the paths of two waves reflected from adjacent atomic
layers is 2𝑑 sin 𝜃. (𝑑 and 𝜃 are described in the caption of Fig. 18.42).

waves set up by the atoms in such a layer will also be a plane. Thus, the net
action of the atoms in one layer can be represented in the form of a plane wave
reflected from an atom-covered surface according to the usual law of reflection.
The plane secondary wavelets reflected from different atomic layers are coherent
and will interfere with one another like the waves emitted in the given direction
by different slits of a diffraction grating. As in the case of a grating, the secondary
wavelets will virtually destroy one another in all directions except those for which
the path difference between adjacent wavelets is a multiple of 𝜆. Inspection of
Fig. 18.42 shows that the difference between the paths of two waves reflected from
adjacent atomic layers is 2𝑑 sin 𝜃, where 𝑑 is the period of identity of the crystal
in a direction at right angles to the layers being considered, and 𝜃 is the angle
complementary to the angle of incidence, called the glancing angle of the incident
rays. Consequently, the directions in which diffraction maxima are obtained are
determined by the condition
2𝑑 sin 𝜃 = ±𝑚𝜆 (𝑚 = 0, 1, 2, . . .). (18.63)
This expression is known as the Bragg-Vulf formula.
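The Bragg-Vulf formula lends itself to a simple numerical sketch that lists the glancing angles of all the reflections possible for a given spacing and wavelength. The function name is ours, and the values 𝑑 = 0.282 nm and 𝜆 = 0.154 nm are assumed purely for illustration (they are of the order typical for crystals and X-rays).

```python
import math

def bragg_angles(d, lam):
    """Glancing angles in degrees satisfying 2 d sin(theta) = m lam, m = 1, 2, ..."""
    angles = []
    m = 1
    while m * lam <= 2 * d:  # sin(theta) cannot exceed unity
        angles.append(math.degrees(math.asin(m * lam / (2 * d))))
        m += 1
    return angles

# Assumed illustration: d = 0.282 nm, lam = 0.154 nm
print([round(t, 1) for t in bragg_angles(0.282, 0.154)])
```

The loop stops as soon as 𝑚𝜆 exceeds 2𝑑, which is the same restriction that follows from Laue's formulas: no reflection of nonzero order exists when 𝜆 is greater than 2𝑑.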
The atomic layers in a crystal can be drawn in a multitude of ways (Fig. 18.43).
Each system of layers can produce a diffraction maximum if condition (18.63) is
observed for it. Only those maxima have an appreciable intensity, however, that
are obtained as a result of reflections from layers sufficiently densely populated by
atoms (for instance, from layers I and II in Fig. 18.43).
We must note that calculations by the Bragg-Vulf formula and by Laue’s formulas
[see Eqs. (18.61)] lead to coinciding results.
Fig. 18.44: Laue diffraction pattern of beryl (a mineral of the silicate group).

The diffraction of X-rays from crystals has two principal applications. It is used
to investigate the spectral composition of X-radiation (X-ray spectroscopy) and
to study the structure of crystals (X-ray structure analysis).
By determining the directions of the maxima obtained in the diffraction of the
X-radiation being studied from crystals with a known structure, we can calculate
the wavelengths. Originally, crystals of the cubic system were used to determine
wavelengths, the spacing of the planes being determined from the density and
relative molecular mass of the crystal.
In the method of structural analysis proposed by von Laue, a beam of X-rays
is directed onto a stationary monocrystal. The radiation contains a wavelength
at which condition (18.63) is satisfied for each system of layers sufficiently densely
populated by atoms. Consequently, we obtain a collection of black spots on a
photographic plate placed behind the crystal (after development). The mutual
arrangement of the spots reflects the symmetry of the crystal. The distances between
the spots and their intensities allow us to find the arrangement of the atoms in a
crystal and their spacing. Figure 18.44 shows a Laue diffraction pattern of beryl (a
mineral of the silicate group).
The method of structural analysis developed by the Dutch physicist Peter Debye
and the Swiss physicist Paul Scherrer uses monochromatic X-radiation and
polycrystalline specimens. The substance being studied is ground into a powder,
and the latter is pressed into a wire-shaped specimen. The specimen is put along the
axis of a cylindrical chamber on whose side surface a photographic film is placed
(Fig. 18.45). Among the enormous number of chaotically oriented minute crystals,
there will always be many for which condition (18.63) is
observed, the diffracted rays lying in the most diverse planes for different crystals.
As a result, for each system of atomic layers and each value of 𝑚, we get not one
direction of a maximum, but a cone of directions whose axis coincides with the
direction of the incident beam (see Fig. 18.45). The pattern obtained on the film (a
Debye powder pattern) has the form shown in Fig. 18.46. Each pair of symmetrically
arranged lines corresponds to one of the diffraction maxima satisfying condition
Fig. 18.45: Scheme of the instrument developed by Debye and Scherrer for structural analysis.
It uses monochromatic X-radiation and polycrystalline specimens put along the axis of a
cylindrical chamber on whose side surface a photographic film is placed.

Fig. 18.46: Pattern obtained on the film (a Debye powder pattern). Each pair of symmetrically
arranged lines corresponds to one of the diffraction maxima satisfying condition (18.63) at a
certain value of 𝑚.

(18.63) at a certain value of 𝑚. The structure of the crystal can be determined by
decoding the X-ray pattern.

18.8. Resolving Power of an Objective

Assume that a plane light wave falls on an opaque screen with a round aperture
of radius 𝑏 cut out of it. The number of Fresnel zones opened by the aperture for
point P opposite the centre of the aperture at the distance 𝑙 from it can be found by
Eq. (18.13) assuming that 𝑎 = ∞, 𝑟₀ = 𝑏, and 𝑏 = 𝑙. The result is
$$ m = \frac{b^2}{l\lambda} \tag{18.64} $$
[compare with expression (18.37)].
In the same way as for a slit, depending on the value of parameter (18.64), we are
dealing either with the approximation of geometrical optics, or with Fresnel diffraction,
or, finally, with Fraunhofer diffraction [see expressions (18.36)].
We can observe a Fraunhofer diffraction pattern from a round aperture on a
screen in the focal plane of a lens placed behind the aperture by directing a plane
light wave onto the aperture. This pattern has the form of a central bright spot
Fig. 18.47: Fraunhofer diffraction pattern from a round aperture on a screen in the focal
plane of a lens placed behind the aperture by directing a plane light wave onto the aperture.
This pattern has the form of a central bright spot surrounded by alternating dark and bright
rings.

surrounded by alternating dark and bright rings (Fig. 18.47). The corresponding
calculations show that the first minimum is at the angular distance from the centre
of the diffraction pattern of
$$ \varphi_{\min} = \arcsin\left(1.22\,\frac{\lambda}{D}\right), \tag{18.65} $$
where 𝐷 is the diameter of the aperture [compare with Eq. (18.28)]. If 𝐷 ≫ 𝜆, we
may consider that
$$ \varphi_{\min} = 1.22\,\frac{\lambda}{D}. \tag{18.66} $$
The major part (about 84%) of the light flux passing through the aperture gets
into the region of the central bright spot. The intensity of the first bright ring is only
1.74%, and of the second, 0.41% of the intensity of the central spot. The intensity of
the other bright rings is still smaller. For this reason, in a first approximation, we
may consider that the diffraction pattern consists of only a single bright spot with
an angular radius determined by Eq. (18.65). This spot is in essence the image of an
infinitely remote point source of light (a plane light wave falls on the aperture).
The diffraction pattern does not depend on the distance between the aperture
and the lens. In particular, it will be the same when the edges of the aperture are
made to coincide with the edges of the lens. It thus follows that even a perfect lens
cannot produce an ideal optical image. Owing to the wave nature of light, the image
of a point produced by the lens has the form of a spot that is the central maximum
of a diffraction pattern. The angular dimension of this spot diminishes with an
increasing diameter of the lens mount 𝐷.
With a very small angular distance between two points, their images obtained
Fig. 18.48: Rayleigh criterion: two close points will still be resolved if the middle of the central
diffraction maximum for one of them coincides with the edge of the central maximum
for the second one. This occurs if the angular distance between the points 𝛿𝜓 equals the
angular radius given by Eq. (18.65).

with the aid of an optical instrument will be superposed and will produce a single
luminous spot. Hence, two very close points will not be perceived by the instrument
separately or, as we say, will not be resolved by the instrument. Consequently, no
matter how great the image is in size, the corresponding details will not be seen on
it.
Let 𝛿𝜓 stand for the smallest angular distance between two points at which
they can still be resolved by an optical instrument. The reciprocal of 𝛿𝜓 is called
the resolving power of the instrument:
$$ R = \frac{1}{\delta\psi}. \tag{18.67} $$
Let us find the resolving power of the objective of a telescope or camera when
very remote objects are being looked at or photographed. In this condition, the
rays travelling into the objective from each point of the object may be considered
parallel, and we can use formula (18.65). According to the Rayleigh criterion, two
close points will still be resolved if the middle of the central diffraction maximum
for one of them coincides with the edge of the central maximum (i.e., with the first
minimum) for the second one. A glance at Fig. 18.48 shows that this will occur if
the angular distance between the points 𝛿𝜓 equals the angular radius given
by Eq. (18.65). The diameter of the objective mount 𝐷 is much greater than the
wavelength 𝜆. We may therefore consider that
$$ \delta\psi = 1.22\,\frac{\lambda}{D}. $$
Hence,
$$ R = \frac{D}{1.22\lambda}. \tag{18.68} $$
It can be seen from this formula that the resolving power of an objective grows
with its diameter.
The diameter of the pupil of an eye at normal illumination is about 2 mm. Using
this value in Eq. (18.68) and taking 𝜆 = 0.5 × 10⁻³ mm, we get
$$ \delta\psi = 1.22 \times \frac{0.5 \times 10^{-3}}{2} = 0.305 \times 10^{-3}\ \text{rad} \approx 1'. $$
Thus, the minimum angular distance between points at which the human eye still
perceives them separately, equals one angular minute. It is interesting to note that
the distance between adjacent light sensitive elements of the retina corresponds to
this angular distance.
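The estimate for the eye can be reproduced in a few lines, and the same function applied to an objective of larger diameter. This is only a sketch: the function name and the 100-mm objective are hypothetical, while the values for the eye are those used in the text.

```python
import math

def min_angle(lam, D):
    """Rayleigh limit for a round aperture, small-angle form: 1.22 * lam / D (radians)."""
    return 1.22 * lam / D

eye = min_angle(0.5e-3, 2.0)            # lam and D in mm, as in the text
print(eye, math.degrees(eye) * 60)      # radians, then minutes of arc

# Hypothetical 100-mm objective at the same wavelength
tele = min_angle(0.5e-3, 100.0)
print(math.degrees(tele) * 3600)        # seconds of arc
```

The result for the eye is close to one angular minute, whereas the larger aperture resolves angles of the order of one angular second.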

18.9. Holography

Holography (i.e., “complete recording”, from the Greek “holos” meaning “the whole”
and “grapho” meaning “write”) is a special way of recording the structure of the light wave
reflected by an object on a photographic plate. When this plate (a hologram) is illu-
minated with a beam of light, the wave recorded on it is reconstructed in practically
its original form, so that when the eye perceives the reconstructed wave, the visual
sensation is virtually the same as it would be if the object itself were observed.
Holography was invented in 1947 by the British physicist Dennis Gabor. The
complete embodiment of Gabor’s idea became possible, however, only after the
appearance in 1960 of light sources having a high degree of coherence—lasers.
Gabor’s initial arrangement was improved by the American physicists Emmet Leith
and Juris Upatnieks, who obtained the first laser holograms in 1963. The Soviet
scientist Yuri Denisyuk in 1962 proposed an original method of recording holograms
on a thick-layer emulsion. This method, unlike holograms on a thin-layer emulsion,
produces a coloured image of the object.
We shall limit ourselves to an elementary consideration of the method of record-
ing holograms on a thin-layer emulsion. Figure 18.49a contains a schematic view
of an arrangement for recording holograms, and Fig. 18.49b a schematic view of
reconstruction of the image. The light beam emitted by the laser, expanded by a
system of lenses, is split into two parts. One part is reflected by the mirror to the
photographic plate forming the so-called reference wave 1. The second part reaches
the plate after being reflected from the object; it forms object beam 2. Both beams
must be coherent. This requirement is satisfied because laser radiation has a high
degree of spatial coherence (the light oscillations are coherent over the entire cross
section of a laser beam). The reference and object beams superpose and form an
interference pattern that is recorded by the photographic plate. A plate exposed
in this way and developed is a hologram. Two beams of light participate in form-
ing the hologram. In this connection, the arrangement described above is called

Fig. 18.49: Holograms on a thin-layer emulsion. (a) Schematic view of an arrangement for
recording holograms. (b) Schematic view of reconstruction of the image.

two-beam or split-beam holography.


To reconstruct the image, the developed photographic plate is put in the same
place where it was in recording the hologram, and is illuminated with the reference
beam of light (the part of the laser beam that illuminated the object in recording the
hologram is now stopped). The reference beam diffracts on the hologram, and as
a result a wave is produced having exactly the same structure as the one reflected
by the object. This wave produces a virtual image of the object that is seen by
the observer. In addition to the wave forming the virtual image, another wave is
produced that gives a real image of the object. This image is pseudoscopic; this
means that it has a relief which is the opposite of the relief of the object: the
convex spots are replaced by concave ones, and vice versa.
Let us consider the nature of a hologram and the process of image reconstruc-
tion. Assume that two coherent parallel beams of light rays fall on the photographic
plate, with the angle 𝜓 between the beams (Fig. 18.50). Beam 1 is the reference one,
and beam 2, the object one (the object in the given case is an infinitely remote point).
We shall assume for simplicity that beam 1 is normal to the plate. All the results
obtained below also hold when the reference beam falls at an angle, but the formulas
will be more cumbersome.
Owing to the interference of the reference and object beams, a system of al-
ternating straight maxima and minima of the intensity is formed on the plate. Let
points A and B correspond to the middles of adjacent interference maxima, and let 𝑑
be the distance between them. Then the path difference 𝛥′ equals 𝜆. Examination of
Fig. 18.50 shows that 𝛥′ = 𝑑 sin 𝜓; hence,
𝑑 sin 𝜓 = 𝜆. (18.69)
Having recorded the interference pattern on the plate (by exposure and devel-
oping), we direct reference beam 1 at it. For this beam, the plate plays the part of
a diffraction grating whose period 𝑑 is determined by Eq. (18.69). A feature of this
grating is the circumstance that its transmittance changes in a direction perpendicular
to the “lines” according to a cosine law (in the gratings treated in Sec. 18.6 it
changed in a jump: gap-dark-gap-dark, etc.). The result of this feature is that the
intensity of all the diffraction maxima of orders higher than the first one virtually
equals zero.

Fig. 18.50: Two coherent parallel beams of light rays fall on the photographic plate,
with the angle 𝜓 between the beams. Beam 1 is the reference one, and beam 2, the object
one (the object is considered an infinitely remote point). We shall assume for simplicity
that beam 1 is normal to the plate.

Fig. 18.51: A plate illuminated with a reference beam produces a diffraction pattern whose
maxima form the angles 𝜑 with a normal to the plate. The maximum corresponding to
𝑚 = 0 is on the continuation of the reference beam. The maximum corresponding
to 𝑚 = +1 has the same direction as object beam 2 did during the exposure. In addition,
a maximum corresponding to 𝑚 = −1 appears.

When the plate is illuminated with the reference beam (Fig. 18.51), a diffraction
pattern appears whose maxima form the angles 𝜑 with a normal to the plate. These
angles are determined by the condition
𝑑 sin 𝜑 = 𝑚𝜆 (𝑚 = 0, ±1) (18.70)
[compare with formula (18.41)]. The maximum corresponding to 𝑚 = 0 is on the
continuation of the reference beam. The maximum corresponding to 𝑚 = +1 has
the same direction as object beam 2 did during the exposure [compare Eqs. (18.69)
and (18.70)]. In addition, a maximum corresponding to 𝑚 = −1 appears.
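The agreement between Eqs. (18.69) and (18.70) can be checked numerically. The following is only a minimal sketch: the wavelength (632.8 nm) and the angle 𝜓 = 30° are illustrative assumptions that do not come from the text, and the function names are our own.

```python
import math

def grating_period(wavelength, psi):
    """Period d of the recorded fringes from d*sin(psi) = lambda, Eq. (18.69)."""
    return wavelength / math.sin(psi)

def reconstruction_angles(wavelength, d):
    """Angles phi (radians) for m = 0, +1, -1 from d*sin(phi) = m*lambda, Eq. (18.70)."""
    return {m: math.asin(m * wavelength / d) for m in (0, 1, -1)}

wavelength = 632.8e-9        # assumed laser wavelength, metres (illustrative)
psi = math.radians(30.0)     # assumed angle between beams 1 and 2 (illustrative)

d = grating_period(wavelength, psi)
angles = reconstruction_angles(wavelength, d)

for m, phi in angles.items():
    print(f"m = {m:+d}: phi = {math.degrees(phi):.3f} deg")
```

With these assumptions, the maximum of order 𝑚 = +1 leaves the plate at the angle 𝜓 at which object beam 2 arrived, and the maximum of order 𝑚 = −1 is its mirror image with respect to the normal.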
It can be shown that the result we have obtained also holds when object beam 2
consists of diverging rays instead of parallel ones. The maximum corresponding
to 𝑚 = +1 has the nature of a diverging beam of rays 2′ (it produces a virtual image
of the point from which rays 2 emerged during the exposure); the maximum
corresponding to 𝑚 = −1, on the other hand, has the nature of a converging beam of
rays 2″ (it forms a real image of the point which rays 2 emerged from during the
exposure).
In recording the hologram, the plate is illuminated by reference beam 1 and
numerous diverging beams 2 reflected by different points of the object. An intricate
interference pattern is formed on the plate as a result of superposition of the patterns
produced by each of the beams 2 separately. When the hologram is illuminated
with reference beam 1, all beams 2 are reconstructed, i.e., the complete light wave
reflected by the object (𝑚 = +1 corresponds to it). Two other waves appear in
addition to it (corresponding to 𝑚 = 0 and 𝑚 = −1). But these waves propagate in
other directions and do not hinder the perception of the wave producing a virtual
image of the object (see Fig. 18.49).
The image of an object produced by a hologram is three-dimensional. It can be
viewed from different positions. If in recording a hologram close objects concealed
more remote ones, then by moving to a side we can look behind the closer object
(more exactly, behind its image) and see the objects that had been concealed previ-
ously. The explanation is that when moving to a side we see the image reconstructed
from the peripheral part of the hologram onto which the rays reflected from the
concealed objects also fell during the exposure. When looking at the images of close
and far objects, we have to accommodate our eyes as when looking at the objects
themselves.
If a hologram is broken into several pieces, then each of them when illuminated
will produce the same picture as the original hologram. But the smaller the part of
the hologram used to reconstruct the image, the lower is its sharpness. This is easy
to understand by taking into account that when the number of lines of a diffraction
grating is reduced, its resolving power diminishes [see Eq. (18.55)].
The possible applications of holography are very diverse. A far from complete
list of them includes holographic motion pictures and television, holographic mi-
croscopes, and control of the quality of processing articles. The statement can be
encountered in publications on the subject that holography can be compared as
regards its consequences with the setting up of radio communication.

Chapter 19
POLARIZATION OF LIGHT

19.1. Natural and Polarized Light

We remind our reader that light is called polarized if the directions of oscillations
of the light vector in it are brought into order in some way or other (see Sec. 16.1).
In natural light, oscillations in various directions rapidly and chaotically replace
one another.
Let us consider two mutually perpendicular electrical oscillations occurring
along the axes 𝑥 and 𝑦 and differing in phase by 𝛿:
𝐸 𝑥 = 𝐴1 cos(𝜔𝑡), 𝐸 𝑦 = 𝐴2 cos(𝜔𝑡 + 𝛿). (19.1)
The resultant field strength 𝑬 is the vector sum of the strengths 𝑬 𝑥 and 𝑬 𝑦
(Fig. 19.1). The angle 𝜑 between the directions of the vectors 𝑬 and 𝑬 𝑥 is determined
by the expression
$$\tan\varphi = \frac{E_y}{E_x} = \frac{A_2\cos(\omega t + \delta)}{A_1\cos(\omega t)}. \tag{19.2}$$
If the phase difference 𝛿 undergoes random chaotic changes, then the angle
𝜑, i.e., the direction of the light vector 𝑬, will experience intermittent disordered
changes too. Accordingly, natural light can be represented as the superposition
of two incoherent electromagnetic waves polarized in mutually perpendicular
planes and having the same intensity. Such a representation greatly simplifies the
consideration of the transmission of natural light through polarizing devices.
Assume that the light waves 𝐸 𝑥 and 𝐸 𝑦 are coherent, with 𝛿 equal to zero or 𝜋.
Then, according to Eq. (19.2),
$$\tan\varphi = \pm\frac{A_2}{A_1} = \text{constant}.$$
Consequently, the resultant oscillation occurs in a fixed direction—the wave is
plane-polarized.

Fig. 19.1: The resultant field strength 𝑬 is the vector sum of the strengths 𝑬𝑥 and 𝑬𝑦.

Fig. 19.2: We consider that quantities (19.1) are the coordinates of the tip of the resultant
vector 𝑬.

When 𝐴1 = 𝐴2 and 𝛿 = ±𝜋/2, we have
$$\tan\varphi = \mp\tan(\omega t)$$
[since cos(𝜔𝑡 ± 𝜋/2) = ∓ sin(𝜔𝑡)]. It thus follows that the plane of oscillations rotates
about the direction of the ray with an angular velocity equal to the frequency of
oscillation 𝜔. The light in this case will be circularly polarized.
To find the nature of the resultant oscillation with an arbitrary constant value
of 𝛿, let us take into account that quantities (19.1) are the coordinates of the tip of
the resultant vector 𝑬 (Fig. 19.2). We know from our treatment of oscillations (see
Sec. 7.9 of Vol. I) that two mutually perpendicular harmonic oscillations of the
same frequency produce motion along an ellipse when added (in particular,
motion along a straight line or a circle may be obtained). Similarly, a point with
the coordinates determined by Eqs. (19.1), i.e., the tip of vector 𝑬, travels along
an ellipse. Consequently, two coherent plane-polarized light waves whose planes
of oscillations are mutually perpendicular produce an elliptically polarized light
wave when superposed on each other. At a phase difference of zero or 𝜋, the ellipse
degenerates into a straight line, and plane-polarized light is obtained. At 𝛿 = ±𝜋/2
and equal amplitudes of the waves being added, the ellipse transforms into
a circle, and circularly polarized light is obtained.
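The three cases just listed can be summarized in a short Python sketch. It is only an illustration of Eqs. (19.1); the function names and the tolerance are our own assumptions and are not part of the text.

```python
import math

def field_components(a1, a2, delta, phase):
    """Components (E_x, E_y) of Eqs. (19.1) at the phase omega*t = `phase`."""
    return a1 * math.cos(phase), a2 * math.cos(phase + delta)

def polarization_state(a1, a2, delta, tol=1e-9):
    """Name the state produced by amplitudes a1, a2 and a constant phase difference delta."""
    s = math.sin(delta)
    if a1 < tol or a2 < tol or abs(s) < tol:
        return "plane"        # delta = 0 or pi: the ellipse degenerates into a straight line
    if abs(a1 - a2) < tol and abs(abs(s) - 1.0) < tol:
        return "circular"     # equal amplitudes and delta = +pi/2 or -pi/2
    return "elliptical"       # every other constant delta

print(polarization_state(1.0, 1.0, 0.0))            # plane
print(polarization_state(1.0, 1.0, math.pi / 2))    # circular
print(polarization_state(1.0, 2.0, math.pi / 2))    # elliptical
```

For the circular case the length of 𝑬 does not change with time, which can be verified by evaluating `field_components` at several phases.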
Depending on the direction of rotation of the vector 𝑬, right and left elliptical
and circular polarizations are distinguished. If the vector 𝑬 rotates clockwise for an
observer looking in the direction opposite that of the ray (i.e., toward the oncoming
light), the polarization is called right, and in the opposite case it is left.
The plane in which the light vector oscillates in a plane-polarized wave will
be called the plane of oscillations. For historical reasons, the term plane of
polarization was applied not to the plane in which the vector 𝑬 oscillates, but to
the plane perpendicular to it.
Plane-polarized light can be obtained from natural light with the aid of devices
called polarizers. These devices freely transmit oscillations parallel to the plane
which we shall call the polarizer plane and completely or partly block the
oscillations perpendicular to this plane. We shall apply the adjective imperfect to a
polarizer that only partly blocks oscillations perpendicular to its plane. We shall
apply the term “polarizer” for brevity to a perfect polarizer that completely blocks
the oscillations perpendicular to its plane and does not weaken the oscillations
parallel to its plane.
In the light leaving an imperfect polarizer, the oscillations in one direction
predominate over the oscillations in other directions. Such
light is called partly polarized. It can be considered as a mixture of natural and
plane-polarized light. Partly polarized light, like natural light, can be represented in
the form of a superposition of two incoherent plane-polarized waves with mutually
perpendicular planes of oscillations. The difference is that for natural light the
intensity of these waves is the same, and for partly polarized light it is different.
If we pass partly polarized light through a polarizer, then when the device
rotates about the direction of the ray, the intensity of the transmitted light will
change within the limits from 𝐼max to 𝐼min . The transition from one of these values
to the other one will occur upon rotation through an angle of 𝜋/2 (during one
complete revolution both the maximum and the minimum intensity will be reached
twice). The expression
$$P = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} \tag{19.3}$$
is known as the degree of polarization. For plane-polarized light, 𝐼min = 0, and
𝑃 = 1. For natural light, 𝐼min = 𝐼max and 𝑃 = 0.
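As a small illustration of Eq. (19.3), the degree of polarization can be computed from the two extreme intensities measured while the polarizer is rotated. The function below is a sketch with names of our own choosing; the sample intensities are arbitrary.

```python
def degree_of_polarization(i_max, i_min):
    """Degree of polarization P = (I_max - I_min) / (I_max + I_min), Eq. (19.3)."""
    if i_min < 0 or i_max < i_min:
        raise ValueError("expected 0 <= i_min <= i_max")
    total = i_max + i_min
    if total == 0:
        raise ValueError("both intensities are zero; P is undefined")
    return (i_max - i_min) / total

print(degree_of_polarization(1.0, 0.0))   # plane-polarized light: P = 1
print(degree_of_polarization(2.0, 2.0))   # natural light: P = 0
print(degree_of_polarization(3.0, 1.0))   # partly polarized light: P = 0.5
```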
The degree of polarization defined by Eq. (19.3) cannot be applied to elliptically
polarized light (in such light the oscillations are completely ordered, so that the
degree of polarization always equals unity).
An oscillation of amplitude 𝐴 occurring in a plane making the angle 𝜑 with
the polarizer plane can be resolved into two oscillations having the amplitudes
𝐴∥ = 𝐴 cos 𝜑 and 𝐴⊥ = 𝐴 sin 𝜑 (Fig. 19.3; the ray is perpendicular to the plane
of the drawing). The first oscillation will pass through the device, the second
will be blocked. The intensity of the transmitted wave is proportional to 𝐴∥² =
𝐴² cos² 𝜑, i.e., is 𝐼 cos² 𝜑, where 𝐼 is the intensity of the oscillation of amplitude 𝐴.
Consequently, an oscillation parallel to the plane of the polarizer carries along a
fraction of the intensity equal to cos² 𝜑. In natural light, all the values of 𝜑 are equally
probable. Therefore, the fraction of the light transmitted through the polarizer
will equal the average value of cos² 𝜑, i.e., one-half. When the polarizer is rotated
about the direction of a natural ray, the intensity of the transmitted light remains
the same. What changes is only the orientation of the plane of oscillations of the
light leaving the device.

Fig. 19.3: An oscillation of amplitude 𝐴 occurring in a plane making the angle 𝜑 with
the polarizer plane can be resolved into two oscillations having the amplitudes parallel
and perpendicular to this plane: 𝐴∥ = 𝐴 cos 𝜑 and 𝐴⊥ = 𝐴 sin 𝜑.

Fig. 19.4: Plane-polarized light of amplitude 𝐴0 and intensity 𝐼0 falling on a polarizer. 𝜑
is the angle between the plane of oscillations of the incident light and the plane of the
polarizer. The component of the oscillation having the amplitude 𝐴 = 𝐴0 cos 𝜑 will pass
through the device.

Assume that plane-polarized light of amplitude 𝐴0 and intensity 𝐼0 falls on
a polarizer (Fig. 19.4). The component of the oscillation having the amplitude
𝐴 = 𝐴0 cos 𝜑, where 𝜑 is the angle between the plane of oscillations of the incident
light and the plane of the polarizer, will pass through the device. Hence, the intensity
of the transmitted light 𝐼 is determined by the expression
𝐼 = 𝐼0 cos² 𝜑. (19.4)
Relation (19.4) is known as Malus’s law. It was first formulated by the French
physicist Étienne Malus (1775-1812).
Let us put two polarizers whose planes make the angle 𝜑 in the path of a
natural ray. Plane-polarized light whose intensity 𝐼0 is half that of natural light
will emerge from the first polarizer. According to Malus’s law, light having an
intensity of 𝐼0 cos² 𝜑 will emerge from the second polarizer. The intensity of the
light transmitted through both polarizers is
$$I = \frac{1}{2}\, I_{\text{nat}} \cos^2\varphi. \tag{19.5}$$
The maximum intensity equal to 𝐼nat/2 is obtained at 𝜑 = 0 (the polarizers are
parallel). At 𝜑 = 𝜋/2, the intensity is zero: crossed polarizers transmit no light.
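Relations (19.4) and (19.5), together with the statement that the average value of cos² 𝜑 is one-half, lend themselves to a quick numerical check. The Python sketch below is an illustration only; the function names and the number of sample angles are our own assumptions.

```python
import math

def malus(i0, phi):
    """Intensity passed by a perfect polarizer for plane-polarized light, Eq. (19.4)."""
    return i0 * math.cos(phi) ** 2

def two_polarizers(i_nat, phi):
    """Intensity of natural light after two perfect polarizers at the angle phi, Eq. (19.5)."""
    return malus(0.5 * i_nat, phi)

def mean_cos2(samples=3600):
    """Average of cos^2(phi) over equally probable angles phi in [0, 2*pi)."""
    step = 2.0 * math.pi / samples
    return sum(math.cos(k * step) ** 2 for k in range(samples)) / samples

print(mean_cos2())                           # close to 0.5
print(two_polarizers(1.0, 0.0))              # parallel polarizers: 0.5
print(two_polarizers(1.0, math.pi / 4))      # close to 0.25
print(two_polarizers(1.0, math.pi / 2))      # crossed polarizers: practically 0
```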
Assume that elliptically polarized light falls on a polarizer. The device transmits
the component 𝑬∥ of the vector 𝑬 in the direction of the plane of the polarizer
(Fig. 19.5). The maximum value of this component is reached at points 1 and 2. Hence,
the amplitude of the plane-polarized light leaving the device equals the length of the
segment 01′. Rotating the polarizer around the direction of the ray, we shall observe
changes in the intensity ranging from 𝐼max (obtained when the plane of the polarizer
coincides with the semimajor axis of the ellipse) to 𝐼min (obtained when the plane of the
polarizer coincides with the semiminor axis of the ellipse). The intensity of light for
partly polarized light will change in the same way upon rotation of the polarizer.
For circularly polarized light, rotation of the polarizer is not attended (as for natural
light) by a change in the intensity of the light transmitted through the device.

Fig. 19.5: For elliptically polarized light falling on a polarizer, the device transmits the
component 𝑬∥ of the vector 𝑬 in the direction of the plane of the polarizer. The maximum
value of this component is reached at points 1 and 2.

19.2. Polarization in Reflection and Refraction

If the angle of incidence of light on the interface between two dielectrics (for
example, on the surface of a glass plate) differs from zero, the reflected and refracted
rays will be partly polarized¹. Oscillations perpendicular to the plane of incidence
predominate in the reflected ray (in Fig. 19.6 these oscillations are denoted by points),
and oscillations parallel to the plane of incidence predominate in the refracted ray
(they are depicted in the figure by double-headed arrows). The degree of polarization
depends on the angle of incidence. Let 𝜃Br stand for the angle satisfying the condition
tan 𝜃Br = 𝑛12 (19.6)
(𝑛12 is the refractive index of the second medium relative to the first one). At an
angle of incidence 𝜃1 equal to 𝜃Br, the reflected ray is completely polarized
(it contains only oscillations perpendicular to the plane of incidence). The degree
of polarization of the refracted ray at an angle of incidence equal to 𝜃Br reaches its
maximum value, but this ray remains polarized only partly.

¹Elliptically polarized light is obtained upon reflection from a conducting surface (for example,
from the surface of a metal).

Fig. 19.6: Polarization in reflection and refraction. Oscillations perpendicular to the plane
of incidence predominate in the reflected ray (dots) and oscillations parallel to the plane of
incidence predominate in the refracted ray (double-headed arrows).

Relation (19.6) is known as Brewster’s law, in honour of its discoverer, the
British physicist David Brewster (1781-1868), and the angle 𝜃 Br is called Brewster’s
angle. It is easy to see that when light falls at Brewster’s angle, the reflected and
refracted rays are mutually perpendicular. The degree of polarization of the re-
flected and refracted rays for different angles of incidence can be obtained with
the aid of Fresnel’s formulas. The latter follow from the conditions imposed on an
electromagnetic field at the interface between two dielectrics². These conditions
include the equality of the tangential components of the vectors 𝑬 and 𝑯, and also
the equality of the normal components of the vectors 𝑫 and 𝑩 at both sides of the
interface (for one side the sum of the relevant vectors for the incident and reflected
waves must be taken, and for the other, the vector for the refracted wave).
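The statement that the reflected and refracted rays are mutually perpendicular at Brewster’s angle can be verified with a few lines of Python. This is a sketch under the assumption 𝑛12 = 1.5 (a value typical of glass in air, chosen by us for illustration); the function names are our own.

```python
import math

def brewster_angle(n12):
    """Brewster's angle (radians) from tan(theta_Br) = n12, Eq. (19.6)."""
    return math.atan(n12)

def refraction_angle(theta1, n12):
    """Angle of refraction (radians) from the law of refraction sin(theta1)/sin(theta2) = n12."""
    return math.asin(math.sin(theta1) / n12)

n12 = 1.5                                   # assumed relative refractive index (illustrative)
theta_br = brewster_angle(n12)
theta2 = refraction_angle(theta_br, n12)

print(f"Brewster's angle: {math.degrees(theta_br):.2f} deg")
print(f"theta1 + theta2:  {math.degrees(theta_br + theta2):.2f} deg")
```

The reflected ray makes the angle 𝜃1 with the normal and the refracted ray the angle 𝜃2, so that 𝜃1 + 𝜃2 = 𝜋/2 means that the two rays are at right angles to each other.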
Fresnel’s formulas establish the relations between the complex amplitudes of
the incident, reflected, and refracted waves. We remind our reader that by the
complex amplitude Â is meant the expression 𝐴𝑒^{𝑖𝛼}, where 𝐴 is the conventional
amplitude, and 𝛼 is the initial phase of the oscillations. Hence, the equality of two
complex amplitudes signifies the equality of both the conventional amplitudes and
the initial phases of the two oscillations:
$$\hat{A}_1 = \hat{A}_2 \;\Rightarrow\; A_1 = A_2 \ \text{and}\ \alpha_1 = \alpha_2. \tag{19.7}$$
When the complex amplitudes differ in sign, the conventional ones are the same,
while the initial phases differ by 𝜋 (𝑒^{𝑖𝜋} = −1):
$$\hat{A}_1 = -\hat{A}_2 \;\Rightarrow\; A_1 = A_2 \ \text{and}\ \alpha_1 = \alpha_2 + \pi. \tag{19.8}$$

²Fresnel obtained these formulas on the basis of the notion of light as elastic waves propagating
in ether.

Let us represent the incident wave in the form of a superposition of two
incoherent waves in one of which the oscillations occur in the plane of incidence, and
in the other, are perpendicular to this plane. Let us denote the complex amplitude
of the first wave by Â∥, and of the second by Â⊥. We shall proceed similarly with
the reflected and refracted waves. We shall use the same symbols for the amplitudes
of the reflected waves, adding one prime, and the same symbols for the amplitudes
of the refracted waves, adding two primes. Thus,
• Â∥ and Â⊥ = amplitudes of the incident waves,
• Â′∥ and Â′⊥ = amplitudes of the reflected waves,
• Â″∥ and Â″⊥ = amplitudes of the refracted waves.

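Fresnel’s formulas have the following form³:
$$\begin{aligned}
\hat{A}'_{\parallel} &= \hat{A}_{\parallel}\,\frac{\tan(\theta_1 - \theta_2)}{\tan(\theta_1 + \theta_2)}, &
\hat{A}'_{\perp} &= -\hat{A}_{\perp}\,\frac{\sin(\theta_1 - \theta_2)}{\sin(\theta_1 + \theta_2)}, \\
\hat{A}''_{\parallel} &= \hat{A}_{\parallel}\,\frac{2\sin\theta_2\cos\theta_1}{\sin(\theta_1 + \theta_2)\cos(\theta_1 - \theta_2)}, &
\hat{A}''_{\perp} &= \hat{A}_{\perp}\,\frac{2\sin\theta_2\cos\theta_1}{\sin(\theta_1 + \theta_2)}
\end{aligned} \tag{19.9}$$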
(𝜃 1 is the angle of incidence, and 𝜃 2 is the angle of refraction of the light wave).
We must underline the fact that formulas (19.9) establish the relations between the
complex amplitudes at the interface between dielectrics, i.e., at the point of incidence
of a ray on this interface.
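Formulas (19.9) are easy to evaluate numerically. The Python sketch below is an illustration, not a definitive implementation: the function and key names are our own, the value 𝑛12 = 1.5 is an assumption, and the minus sign of the perpendicular reflected component is taken so as to agree with formulas (19.10) and Table 19.1.

```python
import math

def fresnel_ratios(theta1, n12):
    """Ratios of reflected and refracted amplitudes to the incident ones, Eqs. (19.9).

    theta1 is the angle of incidence in radians (0 < theta1 < pi/2) and n12 the
    relative refractive index. For n12 < 1 a ValueError is raised beyond the
    critical angle, where no refracted ray exists.
    """
    theta2 = math.asin(math.sin(theta1) / n12)      # law of refraction
    plus, minus = theta1 + theta2, theta1 - theta2
    refr_perp = 2.0 * math.sin(theta2) * math.cos(theta1) / math.sin(plus)
    return {
        "refl_par": math.tan(minus) / math.tan(plus),
        "refl_perp": -math.sin(minus) / math.sin(plus),
        "refr_par": refr_perp / math.cos(minus),
        "refr_perp": refr_perp,
    }

n12 = 1.5                                           # assumed value (illustrative)
at_brewster = fresnel_ratios(math.atan(n12), n12)   # incidence at Brewster's angle
near_normal = fresnel_ratios(1e-4, n12)             # almost normal incidence

print(at_brewster["refl_par"])    # practically zero: the reflected ray is completely polarized
print(near_normal["refl_par"], near_normal["refl_perp"])
```

At Brewster’s angle the parallel reflected component vanishes, while near normal incidence the two reflected ratios tend to ±(𝑛12 − 1)/(𝑛12 + 1).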
It can be seen from the last two of formulas (19.9) that the signs of the complex
amplitudes of the incident and refracted waves at any values of the angles 𝜃1 and
𝜃2 are the same (the sum of 𝜃1 and 𝜃2 cannot exceed 𝜋). This signifies that when
penetrating into the second medium, the phase of the wave does not undergo a
jump. In dealing with the phase relations between an incident and a reflected wave,
we must take into account that for a wave polarized perpendicularly to the plane of
incidence, the coincidence of the signs of Â⊥ and Â′⊥ corresponds to the absence
of a jump in the phase in reflection (Fig. 19.7a). For a wave that is polarized in the
plane of incidence, on the other hand, a jump in the phase is absent when the signs
of Â∥ and Â′∥ are opposite (Fig. 19.7b).
The phase relations between the reflected and incident waves depend on the
relation between the refractive indices 𝑛1 and 𝑛2 of the first and second media, and
also on the relation between the angle of incidence 𝜃1 and Brewster’s angle 𝜃Br (we
remind our reader that when 𝜃1 = 𝜃Br, the sum of the angles 𝜃1 and 𝜃2 is 𝜋/2). Table
19.1 gives the results following from the first two of formulas (19.9) in four possible
cases. It follows from the table that for incidence at an angle less than Brewster’s
angle, reflection from an optically denser medium is attended by a jump in phase of
𝜋; reflection from an optically less dense medium occurs without a change in phase.
This result for 𝜃1 = 0 was obtained in Sec. 16.3. When 𝜃1 > 𝜃Br, the phase relations
for both wave components are different.

³Fresnel’s formulas are customarily written without “caps” over the amplitudes. To underline
the fact that we are dealing with complex amplitudes, however, we found it helpful to write the
amplitudes with the “caps”.

Fig. 19.7: Dealing with phase relations. (a) For a wave polarized perpendicularly to the plane
of incidence, the coincidence of the signs of Â⊥ and Â′⊥ corresponds to the absence of a
jump in the phase in reflection. (b) For a wave that is polarized in the plane of incidence, on
the other hand, a jump in the phase is absent when the signs of Â∥ and Â′∥ are opposite.

We obtain from the first of formulas (19.9) that when 𝜃1 + 𝜃2 = 𝜋/2, i.e., at
𝜃1 = 𝜃Br, the amplitude Â′∥ vanishes. Consequently, only oscillations perpendicular
to the plane of incidence are present in the reflected wave, so that the latter is completely
polarized. Thus, Brewster’s law directly follows from Fresnel’s formulas.
At small angles of incidence, the sines and tangents in formulas (19.9) may be
replaced by the angles themselves, and the cosines may be assumed equal to unity.

Table 19.1

Media               Amplitudes    𝜃1 < 𝜃Br (𝜃1 + 𝜃2 < 𝜋/2)             𝜃1 > 𝜃Br (𝜃1 + 𝜃2 > 𝜋/2)

𝑛2 > 𝑛1, 𝜃1 > 𝜃2    Â′∥ and Â∥    same sign (a phase jump by 𝜋)        opposite signs (no phase jump)
𝑛2 > 𝑛1, 𝜃1 > 𝜃2    Â′⊥ and Â⊥    opposite signs (a phase jump by 𝜋)   opposite signs (a phase jump by 𝜋)
𝑛2 < 𝑛1, 𝜃1 < 𝜃2    Â′∥ and Â∥    opposite signs (no phase jump)       same sign (a phase jump by 𝜋)
𝑛2 < 𝑛1, 𝜃1 < 𝜃2    Â′⊥ and Â⊥    same sign (no phase jump)            same sign (no phase jump)


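In addition, in this case we may consider that 𝜃1 = 𝑛12 𝜃2 (this follows from the
law of refraction after the sines are replaced with the relevant angles). As a result,
Fresnel’s formulas for small angles of incidence acquire the form
$$\text{small } \theta_1 \;\Rightarrow\;
\begin{cases}
\hat{A}'_{\parallel} = \hat{A}_{\parallel}\,\dfrac{\theta_1 - \theta_2}{\theta_1 + \theta_2} = \hat{A}_{\parallel}\,\dfrac{n_{12} - 1}{n_{12} + 1}, \\[6pt]
\hat{A}'_{\perp} = -\hat{A}_{\perp}\,\dfrac{\theta_1 - \theta_2}{\theta_1 + \theta_2} = -\hat{A}_{\perp}\,\dfrac{n_{12} - 1}{n_{12} + 1}, \\[6pt]
\hat{A}''_{\parallel} = \hat{A}_{\parallel}\,\dfrac{2\theta_2}{\theta_1 + \theta_2} = \hat{A}_{\parallel}\,\dfrac{2}{n_{12} + 1}, \\[6pt]
\hat{A}''_{\perp} = \hat{A}_{\perp}\,\dfrac{2\theta_2}{\theta_1 + \theta_2} = \hat{A}_{\perp}\,\dfrac{2}{n_{12} + 1}.
\end{cases} \tag{19.10}$$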
Squaring Eqs. (19.10) and multiplying the expressions obtained by the refractive
index of the relevant medium, we get relations between the intensities of the
incident, reflected, and refracted rays for small angles of incidence [see expression (19.6)].
Here, for example, the intensity of the reflected light 𝐼′ can be calculated as the sum
of the intensities of both components 𝐼′∥ and 𝐼′⊥ because these components are not
coherent in natural light [the intensities instead of the amplitudes are added
for incoherent waves, see Eq. (17.1)]. As a result, we get
$$I' = I\left(\frac{n_{12} - 1}{n_{12} + 1}\right)^2, \qquad I'' = I\, n_{12}\left(\frac{2}{n_{12} + 1}\right)^2.$$
From these formulas, we get Eqs. (16.33) and (16.34) for 𝜌 and 𝜏.
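For a numerical feeling of these results, the reflected and transmitted fractions of the intensity at small angles of incidence can be computed as follows. The sketch squares the amplitude ratios of formulas (19.10) and, for the refracted wave, multiplies by the relative refractive index, as the text describes; the function names and the sample value 𝑛12 = 1.5 are our own assumptions.

```python
def reflected_fraction(n12):
    """Fraction of the intensity reflected at small angles: ((n12 - 1) / (n12 + 1))**2."""
    return ((n12 - 1.0) / (n12 + 1.0)) ** 2

def transmitted_fraction(n12):
    """Fraction of the intensity refracted at small angles: n12 * (2 / (n12 + 1))**2."""
    return n12 * (2.0 / (n12 + 1.0)) ** 2

n12 = 1.5                                  # assumed relative refractive index (illustrative)
rho = reflected_fraction(n12)
tau = transmitted_fraction(n12)
print(f"reflected: {rho:.3f}, transmitted: {tau:.3f}, sum: {rho + tau:.3f}")
```

The two fractions add up to unity for any 𝑛12, as the conservation of energy requires; for 𝑛12 = 1.5 about 4 per cent of the light is reflected.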

19.3. Polarization in Double Refraction

When light passes through all transparent crystals except for those belonging to
the cubic system, a phenomenon is observed called double refraction⁴. It consists
in that a ray falling on a crystal is split inside the latter into two rays propagating,
generally speaking, with different velocities and in different directions.
Doubly refracting (or birefringent) crystals are divided into uniaxial and biax-
ial ones. In uniaxial crystals, one of the refracted rays obeys the conventional law
of refraction, in particular, it is in the same plane as the incident ray and a normal
to the refracting surface. This ray is called an ordinary ray and is designated by
the symbol o. For the other ray, called an extraordinary ray (designated by e),
the ratio of the sines of the angle of incidence and the angle of refraction does not
remain constant when the angle of incidence varies. Even upon normal incidence
of light on a crystal, an extraordinary ray, generally speaking, deviates from
a normal (Fig. 19.8). In addition, an extraordinary ray does not lie, as a rule, in the
same plane as the incident ray and a normal to the refracting surface. Examples of
uniaxial crystals are Iceland spar, quartz, and tourmaline. In biaxial crystals (mica,
gypsum), both rays are extraordinary: the refractive indices for them depend on
the direction in the crystal. In the following, we shall be concerned only with
uniaxial crystals.

⁴Double refraction was first observed in 1669 by the Danish scientist Erasmus Bartholin (1625-1698)
for Iceland spar (a variety of calcium carbonate, CaCO3, forming crystals of the hexagonal system).

Fig. 19.8: Double refraction or birefringence. Doubly refracting (or birefringent) crystals
are divided into uniaxial and biaxial ones. In uniaxial crystals, one of the refracted rays
(the ordinary ray, “o”) obeys the conventional law of refraction, in particular it is in the same
plane as the incident ray and a normal to the refracting surface, while the extraordinary ray
(“e”) deviates from a normal.

Uniaxial crystals have a direction along which ordinary and extraordinary rays
propagate without separation and with the same velocity⁵. This direction is known
as the optical axis of the crystal. It must be borne in mind that an optical axis is
not a straight line passing through a point of a crystal, but a definite direction in
the crystal. Any straight line parallel to the given direction is an optical axis of the
crystal.
A plane passing through an optical axis is called a principal section or a
principal plane of the crystal. Customarily, the principal section passing through
the light ray is used.
Investigation of the ordinary and extraordinary rays shows that they are both
completely polarized in mutually perpendicular directions (see Fig. 19.8). The plane
of oscillations of the ordinary ray is perpendicular to a principal section of the
crystal. In the extraordinary ray, the oscillations of the light vector occur in a plane
coinciding with a principal section. When they emerge from the crystal, the two
rays differ from each other only in the direction of polarization so that the terms
“ordinary” and “extraordinary” have a meaning only inside the crystal.
In some crystals, one of the rays is absorbed to a greater extent than the other.
This phenomenon is called dichroism. A crystal of tourmaline (a mineral of a
complex composition) displays very great dichroism in visible rays. An ordinary
ray is virtually completely absorbed in it over a distance of 1 mm. In crystals of
iodoquinine sulphate, one of the rays is absorbed over a path of about 0.1 mm. This
circumstance has been taken advantage of for manufacturing a polarizing device
called a polaroid. It is a celluloid film into which a great number of identically
oriented minute crystals of iodoquinine sulphate have been introduced.

⁵Biaxial crystals have two such directions.

Fig. 19.9: In an ordinary ray, the oscillations of the light vector occur in a direction
perpendicular to a principal section of the crystal (depicted by dots on the relevant ray). The
oscillations in an extraordinary ray take place in a principal section. The directions of
oscillations of the vector 𝑬 are depicted by double-headed arrows, making different angles
𝛼 with an optical axis.

Double refraction is explained by the anisotropy of crystals. In crystals of the
noncubic system, the permittivity 𝜀 depends on the direction. In uniaxial crystals, 𝜀
in the direction of an optical axis and in directions perpendicular to it has different
values 𝜀∥ and 𝜀⊥. In other directions, 𝜀 has intermediate values. According to
Eq. (16.3), 𝑛 = √𝜀. It thus follows from the anisotropy of 𝜀 that different values of the
refractive index 𝑛 correspond to electromagnetic waves with different directions of the
oscillations of the vector 𝑬. Therefore, the velocity of the light waves depends on the
direction of oscillations of the light vector 𝑬.
In an ordinary ray, the oscillations of the light vector occur in a direction
perpendicular to a principal section of the crystal (in Fig. 19.9 these oscillations are
depicted by dots on the relevant ray). Therefore, with any direction of an ordinary
ray (three directions 1, 2, and 3 are shown in the figure), the vector 𝑬 makes a right
angle with an optical axis of the crystal, and the velocity of the light wave will be the
same, equal to 𝑣o = 𝑐/√𝜀⊥. Depicting the velocity of an ordinary ray in the form
of lengths laid off in different directions, we shall get a spherical surface. Figure
19.9 shows the intersection of this surface with the plane of the drawing. A picture
such as that in Fig. 19.9 is observed in any principal section, i.e., in any plane passing
through an optical axis. Let us imagine that a point source of light is placed at point
0 inside a crystal. Then the sphere which we have constructed will be the wave
surface of ordinary rays.

Fig. 19.10: Depending on which of the velocities, 𝑣o or 𝑣e, is greater, positive and negative
uniaxial crystals are distinguished. Positive crystals, 𝑣e < 𝑣o (𝑛e > 𝑛o); negative crystals,
𝑣e > 𝑣o (𝑛e < 𝑛o).

Fig. 19.11: Wave surfaces of the ordinary and extraordinary rays with their centre at point 2
on the surface of the crystal.

The oscillations in an extraordinary ray take place in a principal section.
Therefore, for different rays, the directions of oscillations of the vector 𝑬 (in Fig. 19.9 these
directions are depicted by double-headed arrows) make different angles 𝛼 with an
optical axis. For ray 1, the angle 𝛼 is 𝜋/2, owing to which the velocity is 𝑣o = 𝑐/√𝜀⊥;
for ray 2, the angle 𝛼 = 0, and the velocity is 𝑣e = 𝑐/√𝜀∥. For ray 3, the velocity has
an intermediate value. We can show that the wave surface of extraordinary rays
is an ellipsoid of revolution. At places of intersection with an optical axis of the
crystal, this ellipsoid and the sphere constructed for the ordinary rays come into
contact.
Uniaxial crystals are characterized by a refractive index of an ordinary ray
equal to 𝑛o = 𝑐/𝑣o , and a refractive index of an extraordinary ray perpendicular to
an optical axis equal to 𝑛e = 𝑐/𝑣e . The latter quantity is called simply the refractive
index of an extraordinary ray.
Depending on which of the velocities, 𝑣o or 𝑣e , is greater, positive and negative
uniaxial crystals are distinguished (Fig. 19.10). For positive crystals, 𝑣e < 𝑣o (this
means that 𝑛e > 𝑛o ). For negative crystals, 𝑣e > 𝑣o (𝑛e < 𝑛o ). It is simple to
remember what crystals are called positive and what negative. For positive crystals,
the ellipsoid of velocities is extended along an optical axis reminding one of the
vertical line in the sign “+”; for negative crystals, the ellipsoid of velocities is extended
in a direction perpendicular to an optical axis, reminding one of the sign “-”.
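
The ellipsoid of velocities described above can be sketched numerically. The block below is a minimal illustration, not part of the original text: it assumes the standard equation of an ellipsoid of revolution with semi-axis 𝑣o along the optical axis and 𝑣e perpendicular to it, and it measures the angle `theta` between the direction of the ray and the optical axis (not the angle 𝛼 between 𝑬 and the axis used in the text). The function names and the refractive indices (approximate handbook values for calcite and quartz in yellow light) are assumptions introduced for the example.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum, m/s


def ray_velocity(n_o, n_e, theta):
    """Speed of the extraordinary ray travelling at angle theta (rad) to the optical axis.

    Assumes the ellipsoid of revolution with semi-axis v_o along the axis and
    v_e perpendicular to it: 1/v^2 = cos^2(theta)/v_o^2 + sin^2(theta)/v_e^2.
    """
    v_o, v_e = C / n_o, C / n_e
    inv_v2 = (math.cos(theta) / v_o) ** 2 + (math.sin(theta) / v_e) ** 2
    return 1.0 / math.sqrt(inv_v2)


def crystal_sign(n_o, n_e):
    """'positive' if v_e < v_o (n_e > n_o), 'negative' if v_e > v_o (n_e < n_o)."""
    if n_e > n_o:
        return "positive"
    if n_e < n_o:
        return "negative"
    return "isotropic"


# Approximate indices (assumed illustrative values, not taken from the text).
calcite = (1.658, 1.486)
quartz = (1.544, 1.553)
print(crystal_sign(*calcite), crystal_sign(*quartz))
for deg in (0, 45, 90):
    print(deg, ray_velocity(*calcite, math.radians(deg)))
```

Along the axis the function returns 𝑣o, and perpendicular to it 𝑣e, so the ellipsoid touches the sphere of the ordinary rays at its intersections with the optical axis, as stated above.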
The path of an ordinary and an extraordinary ray in a crystal can be determined
with the aid of the Huygens principle. Figure 19.11 depicts the wave surfaces of an
ordinary and an extraordinary ray with their centre at point 2 on the surface of the
crystal. The construction is for the moment of time when the wavefront of the
incident wave reaches point 1. The envelopes of all the secondary wavelets (the
wavelets whose centres are in the interval between points 1 and 2 are not shown in the
figure) for the ordinary and extraordinary rays are evidently planes. The refracted
ray o or e emerging from point 2 passes through the point of contact of the envelope
with the relevant wave surface.

Fig. 19.12: Three cases of the normal incidence of light on the surface of a crystal differing in the direction of the optical axis. (a) The rays o and e propagate along an optical axis without separating. (b) An extraordinary ray may deviate from a normal to this surface. (c) The ordinary and extraordinary rays travel in the same direction, but propagate with different velocities.

We remind our reader that rays are defined as lines along which the energy of a
light wave propagates (see Sec. 16.1). A glance at Fig. 19.11 shows that the ordinary
ray o coincides with a normal to the relevant wave surface. The extraordinary ray
e, on the other hand, appreciably deviates from a normal to the wave surface.
Figure 19.12 shows three cases of the normal incidence of light on the surface
of a crystal differing in the direction of the optical axis. In case (a), the rays o
and e propagate along an optical axis and therefore travel without separating.
Inspection of Fig. 19.12b shows that even upon normal incidence of light on a
refracting surface, an extraordinary ray may deviate from a normal to this surface
(compare with Fig. 19.8). In Fig. 19.12c, the optical axis of the crystal is parallel to
the refracting surface. In this case with normal incidence of the light, the ordinary
and extraordinary rays travel in the same direction, but propagate with different
velocities. The result is a constantly growing phase difference between them. The
nature of polarization of the ordinary and extraordinary rays in Fig. 19.12 is not
indicated. It is the same as for the rays depicted in Fig. 19.11.

19.4. Interference of Polarized Rays

When two coherent rays polarized in mutually perpendicular directions are
superposed, no interference pattern with the characteristic alternation of maxima
and minima of the intensity can be obtained. Interference occurs only when the
oscillations in the interacting rays occur along the same direction. The oscillations
in two rays initially polarized in mutually perpendicular directions can be brought
into one plane by passing these rays through a polarizer installed so that its plane
does not coincide with the plane of oscillations of either of the rays.

Fig. 19.13: Superposition of an ordinary and an extraordinary ray emerging from a crystal plate. (a) Light passes through a crystal plate cut out parallel to the optical axis; rays 1 and 2, polarized in mutually perpendicular planes, emerge from the plate. (b) With a polarizer in the path of these rays, both rays oscillate in one plane (the polarizer plane) after passing through the polarizer.

Let us see what happens when an ordinary and an extraordinary ray emerging
from a crystal plate are superposed. Assume that the plate has been cut out parallel
to an optical axis (Fig. 19.13). With normal incidence of the light on the plate, the or-
dinary and extraordinary rays will propagate without separating, but with different
velocities (see Fig. 19.12c). The following path difference appears between the rays
while they pass through the plate:
𝛥 = (𝑛o − 𝑛e )𝑑, (19.11)
or the following phase difference:
$$\delta = \frac{(n_o - n_e)d}{\lambda_0}\,2\pi \tag{19.12}$$
(𝑑 is the plate thickness, and 𝜆0 the wavelength in a vacuum).
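
As a quick numerical illustration of Eqs. (19.11) and (19.12), the following sketch evaluates the path difference and the phase difference for a plate of given thickness. The function names and the sample numbers (approximate indices of calcite, a 1-µm plate, yellow light) are assumptions chosen for the example, not data from the text.

```python
import math


def path_difference(n_o, n_e, d):
    """Optical path difference of Eq. (19.11): (n_o - n_e) * d, in the units of d."""
    return (n_o - n_e) * d


def phase_difference(n_o, n_e, d, wavelength_vacuum):
    """Phase difference of Eq. (19.12): 2*pi * (n_o - n_e) * d / lambda_0, in radians."""
    return 2.0 * math.pi * path_difference(n_o, n_e, d) / wavelength_vacuum


# Assumed illustrative values: n_o = 1.658, n_e = 1.486, d = 1 micrometre, lambda_0 = 589 nm.
delta = phase_difference(1.658, 1.486, 1.0e-6, 589e-9)
print(f"phase difference: {delta:.3f} rad")
```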
Thus, if we pass natural light through a crystal plate cut out parallel to the optical
axis (Fig. 19.13a), two rays 1 and 2 that are polarized in mutually perpendicular planes
will emerge from the plate⁶, and between them there will be a phase difference
determined by Eq. (19.12). Let us place a polarizer in the path of these rays. Both rays
after passing through the polarizer will oscillate in one plane. Their amplitudes
will equal the components of the amplitudes of rays 1 and 2 in the direction of the
plane of the polarizer (Fig. 19.13b).

⁶In the crystal, ray 1 was extraordinary and could be designated by the symbol e, and ray 2 was
ordinary (o). Upon emerging from the crystal, these rays lost their right to be called ordinary and
extraordinary.

The rays emerging from the polarizer are produced as a result of division of the
light obtained from a single source. Therefore, they ought to interfere. If rays 1 and
2 are produced as a result of natural light passing through the plate, however, they
do not interfere. The explanation is very simple. Although the ordinary and extraor-
dinary rays are produced by the same light source, they contain mainly oscillations
belonging to different wave trains emitted by individual atoms. The oscillations in
the ordinary ray are predominantly due to the trains whose oscillation planes are
close to one direction in space, whereas those in the extraordinary ray are due to
trains whose oscillation planes are close to another direction perpendicular to the
first one. Since the individual trains are incoherent, the ordinary and extraordinary
rays produced from natural light, and, consequently, rays 1 and 2 too, are also
incoherent.
Matters are different if plane-polarized light falls on a crystal plate. In this case,
the oscillations of each train are divided between the ordinary and extraordinary
rays in the same proportion (depending on the orientation of an optical axis of the
plate relative to the plane of oscillations in the incident ray). Consequently, rays o
and e, and therefore rays 1 and 2 too, will be coherent and will interfere.

19.5. Passing of Plane-Polarized Light Through a Crystal Plate

Let us consider a crystal plate cut out parallel to an optical axis. We saw in the
preceding section that when plane-polarized light falls on such a plate, the ordi-
nary and extraordinary rays are coherent. At the entrance to the plate, the phase
difference 𝛿 of these rays is zero, and at the exit from the plate
$$\delta = \frac{\Delta}{\lambda_0}\,2\pi = \frac{(n_o - n_e)d}{\lambda_0}\,2\pi \tag{19.13}$$
[see Eqs. (19.11) and (19.12); we assume that the light falls on the plate normally].
A plate cut out parallel to an optical axis for which
$$(n_o - n_e)d = m\lambda_0 + \frac{\lambda_0}{4}$$
(𝑚 is any integer or zero) is called a quarter-wave plate. An ordinary and an
extraordinary ray passing through such a plate acquire a phase difference equal to
𝜋/2 (we remind our reader that the phase difference is determined to within
2𝜋𝑚). A plate for which
$$(n_o - n_e)d = m\lambda_0 + \frac{\lambda_0}{2}$$
is called a half-wave plate, etc.

Fig. 19.14: A half-wave plate turns the plane of oscillations of the light passing through it through the angle 2𝜑 (𝜑 is the angle between the plane of oscillations in the incident ray and the axis of the plate). In passing through the plate, the phase difference between the oscillations of 𝑬o and 𝑬e changes by 𝜋.

Fig. 19.15: When plane-polarized light passes through a quarter-wave plate at 𝜑 = 45°, the amplitudes of both rays emerging from the plate are the same (no dichroism), the phase shift between them is 𝜋/2, and the light is circularly polarized. At a different value of 𝜑, the amplitudes of the rays emerging from the plate are different, and the rays when superposed form elliptically polarized light.
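
The quarter-wave and half-wave conditions can be turned into a small thickness calculator. This is only a sketch under stated assumptions: the helper name `plate_thickness` is hypothetical, the absolute value of 𝑛o − 𝑛e is used so that the same function serves positive and negative crystals, and the quartz indices are approximate illustrative values rather than data from the text.

```python
def plate_thickness(n_o, n_e, wavelength_vacuum, fraction, m=0):
    """Thickness d for which |n_o - n_e| * d = (m + fraction) * lambda_0.

    fraction = 0.25 gives a quarter-wave plate, fraction = 0.5 a half-wave plate;
    m is the integer order (m = 0 is the thinnest plate).
    """
    return (m + fraction) * wavelength_vacuum / abs(n_o - n_e)


# Assumed illustrative values for quartz in yellow light (589 nm).
n_o, n_e, lam = 1.544, 1.553, 589e-9
quarter = plate_thickness(n_o, n_e, lam, 0.25)
half = plate_thickness(n_o, n_e, lam, 0.5)
print(f"quarter-wave: {quarter * 1e6:.1f} um, half-wave: {half * 1e6:.1f} um")
```

Because the zero-order thicknesses come out at tens of micrometres for a weakly birefringent crystal, a higher order 𝑚 can be chosen when a mechanically sturdier plate is wanted; the phase difference is the same to within 2𝜋𝑚.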


Let us see how plane-polarized light passes through a half-wave plate. The
oscillation of 𝑬 in the incident ray occurring in plane P produces the oscillation
of 𝑬 o of the ordinary ray and the oscillation of 𝑬 e of the extraordinary ray when
entering the crystal (Fig. 19.14). During the time spent in passing through the plate,
the phase difference between the oscillations of 𝑬 o and 𝑬 e changes by 𝜋. Therefore,
at the exit from the plate, the phase relation between the ordinary and extraordinary
rays will correspond to the mutual arrangement of the vectors 𝑬e and 𝑬o′ (at the
entrance to the plate it corresponded to the mutual arrangement of the vectors 𝑬e
and 𝑬o). Consequently, the light emerging from the plate will be polarized in plane
P′. Planes P and P′ are symmetrical relative to optical axis O of the plate. Thus, a
half-wave plate turns the plane of oscillations of the light passing through it through
the angle 2𝜑 (𝜑 is the angle between the plane of oscillations in the incident ray and
the axis of the plate).
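
The 2𝜑 rotation can be checked with a few lines of arithmetic on the two components. The sketch below is an illustration under simple assumptions (no absorption, normal incidence); the function names are hypothetical. It resolves the incident oscillation into a component along the plate axis and one perpendicular to it, shifts the phase of the perpendicular component by 𝛿, and reads off the plane of the emerging oscillation for 𝛿 = 𝜋.

```python
import cmath
import math


def through_plate(phi, delta, amplitude=1.0):
    """Complex amplitudes (E_e, E_o) at the plate exit for plane-polarized input.

    phi   -- angle between the plane of oscillations and the plate axis (rad)
    delta -- phase difference acquired between the two components (rad)
    """
    e_e = complex(amplitude * math.cos(phi))                  # along the axis
    e_o = amplitude * math.sin(phi) * cmath.exp(1j * delta)   # perpendicular to the axis
    return e_e, e_o


def exit_angle_half_wave(phi):
    """Angle of the exit plane of oscillations relative to the plate axis for delta = pi."""
    e_e, e_o = through_plate(phi, math.pi)
    # For delta = pi both components are real (up to rounding), so the light stays plane-polarized.
    return math.atan2(e_o.real, e_e.real)


phi = math.radians(30.0)
turn = phi - exit_angle_half_wave(phi)
print(math.degrees(turn))  # the plane of oscillations turns through 2*phi
```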
Now let us pass plane-polarized light through a quarter-wave plate (Fig. 19.15).
If we arrange the plate so that the angle 𝜑 between plane of oscillations P in the
incident ray and plate axis O is 45°, the amplitudes of both rays emerging from the
plate will be the same (dichroism is assumed to be absent). The phase shift between
the oscillations in these rays will be 𝜋/2. Hence, the light emerging from the plate
will be circularly polarized. At a different value of the angle 𝜑, the amplitudes of
the rays emerging from the plate will be different. Consequently, these rays when
superposed form elliptically polarized light; one of the axes of the ellipse coincides
with plate axis O.
When plane-polarized light is passed through a plate with a fractional number
of waves not coinciding with 𝑚 + 1/4 or 𝑚 + 1/2, two coherent light waves polarized
in mutually perpendicular planes will emerge from the plate. Their phase difference
is other than 𝜋/2 and other than 𝜋. Hence, with any relation between the amplitudes
of these waves depending on the angle 𝜑 (see Fig. 19.15), elliptically polarized light
will be produced at the exit from the plate, and none of the axes of the ellipse will
coincide with plate axis O. The orientation of the ellipse axes relative to axis O is
determined by the phase difference 𝛿, and also by the ratio of the amplitudes, i.e., by
the angle 𝜑 between the plane of oscillations in the incident wave and plate axis O.
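
How the ellipse depends on 𝛿 and 𝜑 can be made concrete with the standard formulas for the superposition of two perpendicular oscillations. These formulas are not derived in the text and are brought in here as an assumption: for amplitudes 𝑎 (along the plate axis) and 𝑏 (perpendicular to it) with phase difference 𝛿, the major axis makes the angle 𝜓 with the plate axis, where tan 2𝜓 = 2𝑎𝑏 cos 𝛿 / (𝑎² − 𝑏²). The function name is hypothetical.

```python
import math


def polarization_ellipse(a, b, delta):
    """Return (psi, major, minor) for E_x = a*cos(wt), E_y = b*cos(wt + delta).

    psi   -- angle between the major axis of the ellipse and the plate axis (rad)
    major -- semi-major axis, minor -- semi-minor axis
    """
    s = a * a + b * b
    cross = 2.0 * a * b * math.cos(delta)
    root = math.sqrt((a * a - b * b) ** 2 + cross ** 2)
    major = math.sqrt((s + root) / 2.0)
    minor = math.sqrt(max(s - root, 0.0) / 2.0)
    psi = 0.5 * math.atan2(cross, a * a - b * b)
    return psi, major, minor


phi = math.radians(30.0)
a, b = math.cos(phi), math.sin(phi)
for name, delta in (("quarter-wave", math.pi / 2), ("delta = pi/3", math.pi / 3)):
    psi, major, minor = polarization_ellipse(a, b, delta)
    print(name, round(math.degrees(psi), 3), round(major, 4), round(minor, 4))
```

For 𝛿 = 𝜋/2 the computed 𝜓 is zero, so an axis of the ellipse lies along the plate axis; for other values of 𝛿 the ellipse is tilted, in agreement with the statement above.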
We must note that regardless of the plate thickness, when 𝜑 is zero or 𝜋/2,
only one ray will propagate in the plate (in the first case an extraordinary ray, in
the second case an ordinary one) so that at the plate exit the light remains plane-
polarized with its plane of oscillations coinciding with P.
If we place a quarter-wave plate in the path of elliptically polarized light and
arrange its optical axis along one of the ellipse axes, then the plate will introduce an
additional phase difference equal to 𝜋/2. As a result, the phase difference between
two plane-polarized waves whose sum is an elliptically polarized wave becomes
equal to zero or 𝜋, so that the superposition of these waves produces a plane-
polarized wave. Hence, a properly turned quarter-wave plate transforms elliptically
polarized light into plane-polarized light. This underlies a method by means of
which we can distinguish elliptically polarized light from partly polarized light,
or circularly polarized light from natural light. The light being studied is passed
through a quarter-wave plate and a polarizer placed after it. If the ray being studied
is elliptically polarized (or circularly polarized), then by rotating the plate and the
polarizer around the direction of the ray, we can achieve complete darkening of the
field of vision. If the light is partly polarized (or natural), it is impossible to achieve
extinction of the ray being studied with any position of the plate and polarizer.

19.6. A Crystal Plate Between Two Polarizers

Let us place a plate made from a uniaxial crystal cut out parallel to optical axis O
between polarizers⁷ P and P′ (Fig. 19.16). Plane-polarized light of intensity 𝐼 will
emerge from polarizer P. In passing through the plate, the light in the general case
will become elliptically polarized. When it emerges from polarizer P′, the light
will again be plane-polarized. Its intensity 𝐼′ depends on the mutual orientation of
the planes of polarizers P and P′ and an optical axis of the plate, and also on the
phase difference 𝛿 acquired by the ordinary and extraordinary rays when they pass
through the plate.

⁷The second polarizer P′ in the direction of ray propagation is also called an analyzer.

Fig. 19.16: A plate made from a uniaxial crystal cut out parallel to optical axis O, placed between polarizers P and P′. Plane-polarized light of intensity 𝐼 emerges from polarizer P. In passing through the plate, the light in the general case becomes elliptically polarized.

Assume that the angle 𝜑 between the plane of polarizer P and plate axis O is 𝜋/4.
Let us consider two particular cases: the polarizers are parallel (Fig. 19.17a), and they
are crossed (Fig. 19.17b). The light oscillation leaving polarizer P will be depicted
by the vector 𝑬 in plane P. At the entrance to the plate, the oscillation of 𝑬 will
produce two oscillations—the oscillation of 𝑬 o (ordinary ray) perpendicular to the
optical axis, and the oscillation of 𝑬 e (extraordinary ray) parallel to the axis. These
oscillations will be coherent; in passing through the plate, they acquire the phase
difference 𝛿 that is determined by the plate thickness and the difference between
the refractive indices of the ordinary and extraordinary rays. The amplitudes of
these oscillations are the same and equal
$$E_o = E_e = E\cos\frac{\pi}{4} = \frac{E}{\sqrt{2}}, \tag{19.14}$$
where 𝐸 is the amplitude of the wave emerging from the first polarizer.
The components of the oscillations of 𝑬o and 𝑬e will pass through the second
polarizer in the direction of plane P′. The amplitudes of these components in both
cases equal those given by Eq. (19.14) multiplied by cos(𝜋/4), i.e.,
$$E_o' = E_e' = \frac{E}{2}. \tag{19.15}$$
For parallel polarizers (Fig. 19.17a), the phase difference of the waves emerging
from polarizer P′ is 𝛿, i.e., the phase difference acquired when passing through the
plate. For crossed polarizers (Fig. 19.17b), the projections of the vectors 𝑬o and 𝑬e
onto the direction of P′ have different signs. This signifies that an additional phase
difference equal to 𝜋 appears apart from the phase difference 𝛿.

Fig. 19.17: Two particular cases: the polarizers are parallel (a) and the polarizers are crossed (b). The light oscillation leaving polarizer P is depicted by the vector 𝑬 in plane P.


The waves leaving the second polarizer will interfere. The amplitude 𝐸∥ of the
resultant wave for parallel polarizers is determined by the relation
$$E_\parallel^2 = E_o'^2 + E_e'^2 + 2E_o'E_e'\cos\delta,$$
and for crossed polarizers by the relation
$$E_\perp^2 = E_o'^2 + E_e'^2 + 2E_o'E_e'\cos(\delta + \pi).$$
Taking Eq. (19.15) into consideration, we can write that
$$E_\parallel^2 = \frac{1}{4}E^2 + \frac{1}{4}E^2 + 2\cdot\frac{1}{4}E^2\cos\delta = \frac{1}{2}E^2(1 + \cos\delta) = E^2\cos^2\frac{\delta}{2},$$
$$E_\perp^2 = \frac{1}{4}E^2 + \frac{1}{4}E^2 + 2\cdot\frac{1}{4}E^2\cos(\delta + \pi) = \frac{1}{2}E^2(1 - \cos\delta) = E^2\sin^2\frac{\delta}{2}.$$
The intensity is proportional to the square of the amplitude. Hence,
$$I_\parallel' = I\cos^2\frac{\delta}{2}, \qquad I_\perp' = I\sin^2\frac{\delta}{2}, \tag{19.16}$$
where 𝐼∥′ is the intensity of the light emerging from the second polarizer when the
polarizers are parallel, 𝐼⊥′ is the same intensity when the polarizers are crossed, and
𝐼 is the intensity of the light that has passed through the first polarizer.
It follows from formulas (19.16) that the intensities 𝐼∥′ and 𝐼⊥′ are “complementary”:
their sum gives the intensity 𝐼. In particular, when
$$\delta = 2m\pi \quad (m = 1, 2, \ldots), \tag{19.17}$$
the intensity 𝐼∥′ will equal 𝐼, while the intensity 𝐼⊥′ will vanish. At values of
$$\delta = (2m + 1)\pi \quad (m = 0, 1, 2, \ldots), \tag{19.18}$$
on the other hand, the intensity 𝐼∥′ will vanish, while the intensity 𝐼⊥′ reaches the
value 𝐼.
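
Formulas (19.16) to (19.18) are easy to verify numerically. The sketch below (function names are assumptions made for the example) computes the squared amplitude from the interference relation with the amplitudes of Eq. (19.15), and compares it with the closed forms of Eq. (19.16) for a plate oriented at 𝜑 = 𝜋/4.

```python
import math


def amplitude_squared(E, delta, crossed):
    """Squared resultant amplitude behind the second polarizer (phi = pi/4 assumed)."""
    e_o = e_e = E / 2.0                      # Eq. (19.15)
    extra = math.pi if crossed else 0.0      # additional phase difference for crossed polarizers
    return e_o ** 2 + e_e ** 2 + 2.0 * e_o * e_e * math.cos(delta + extra)


def transmitted_intensities(intensity, delta):
    """(I_parallel, I_crossed) according to Eq. (19.16)."""
    return (intensity * math.cos(delta / 2.0) ** 2,
            intensity * math.sin(delta / 2.0) ** 2)


for delta in (0.0, math.pi / 2, math.pi, 2 * math.pi):
    i_par, i_perp = transmitted_intensities(1.0, delta)
    print(f"delta = {delta:.3f}: parallel {i_par:.3f}, crossed {i_perp:.3f}, sum {i_par + i_perp:.3f}")
```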
Fig. 19.18: (a) A plate placed between polarizers. The bottom half of the plate is thicker than the top one. The light passing through the plate contains radiation of only two wavelengths 𝜆1 and 𝜆2. (b) View from the side of polarizer P′. The light components will be elliptically polarized.

The difference between the refractive indices 𝑛o − 𝑛e depends on the wavelength
of the light 𝜆0. In addition, 𝜆0 directly enters expression (19.13) for 𝛿. Assume that the
light falling on polarizer P consists of radiation of two wavelengths 𝜆1 and 𝜆2 such
that 𝛿 for 𝜆1 satisfies condition (19.17), and for 𝜆2 condition (19.18). In this case with
parallel polarizers, light of wavelength 𝜆1 will pass without hindrance through the
system depicted in Fig. 19.16, whereas light of wavelength 𝜆2 will be completely
extinguished. With crossed polarizers, light of wavelength 𝜆2 will pass without hindrance,
and light of wavelength 𝜆1 will be completely extinguished. Consequently, with
one arrangement of the polarizers, the colour of the light transmitted through the
system will correspond to the wavelength 𝜆1 , and with the other arrangement, to
the wavelength 𝜆2 . Such two colours are called complementary. When one of the
polarizers is rotated, the colour continuously changes, varying during each quarter
of a revolution from one complementary colour to the other. A change in colour
is also observed at 𝜑 differing from 𝜋/4 (but not equal to zero or 𝜋/2), the colours
being less saturated, however.
The phase difference 𝛿 depends on the plate thickness. Hence, if a doubly
refracting transparent plate placed between polarizers has a different thickness at
different places, the latter when observed from the side of polarizer P′ will seem to
be coloured differently. When polarizer P′ is rotated, these colours change, each
of them transforming into its complementary colour. Let us explain this by the
following example. Figure 19.18a shows a plate placed between polarizers. The
bottom half of the plate is thicker than the top one. Assume that the light passing
through the plate contains radiation of only two wavelengths 𝜆1 and 𝜆2. Figure
19.18b gives a “view” from the side of polarizer P′. At the exit from the crystal plate,
each of the light components will, generally speaking, be elliptically polarized. The
orientation and the eccentricity of the ellipses for the wavelengths 𝜆1 and 𝜆2 , and
also for different halves of the plate, will be different. When the plane of polarizer
P′ is placed in position P₁′, in the light transmitted through P′ the wavelength 𝜆1
will predominate in the top half of the plate and the wavelength 𝜆2 in the bottom
half. Therefore, the two halves will be coloured differently. When polarizer P′ is
placed in position P₂′, the colour of the top half will be determined by the light of
wavelength 𝜆2, and of the bottom half by the light of wavelength 𝜆1. Thus, when
polarizer P′ is turned through 90°, the two halves of the plate exchange colours, as it
were. It is quite natural that this will occur only at a definite ratio of the thicknesses
of the two halves of the plate.

19.7. Artificial Double Refraction

External action may cause double refraction to appear in transparent amorphous
bodies, and also in crystals of the cubic system. This occurs, in particular, upon the
mechanical deformation of bodies. The difference between the refractive indices
of an ordinary and an extraordinary ray is a measure of the optical anisotropy
that appears. Experiments show that this difference is proportional to the stress 𝜎 at
a given point of a body (i.e., to the force per unit area; see Sec. 2.9 of Vol. I):
𝑛o − 𝑛e = 𝑘𝜎 (19.19)
(𝑘 is a proportionality constant depending on the properties of the substance).
Let us place glass plate Q between crossed polarizers P and P′ (Fig. 19.19). As long
as the glass is not deformed, such a system transmits no light. If the plate is subjected
to compression, light begins to pass through the system, the pattern observed in
the transmitted rays being speckled with coloured fringes. Each fringe corresponds
to identically deformed spots on the plate. Consequently, the distribution of the
fringes makes it possible to assess the distribution of the stresses inside the plate.

Fig. 19.19: Glass plate Q between crossed polarizers P and P′. As long as the glass is not deformed, the system transmits no light. When the plate is compressed, light begins to pass through the system, and the pattern observed in the transmitted rays is speckled with coloured fringes. Each fringe corresponds to identically deformed spots on the plate.
Fig. 19.20: The Kerr effect in liquids. A Kerr cell is placed between crossed polarizers P and P′. The Kerr cell contains a liquid into which capacitor plates have been introduced. When a voltage is applied across the plates, a virtually homogeneous electric field is set up between them, and the liquid acquires the properties of a uniaxial crystal with an optical axis oriented along the field.

This underlies the optical method of studying stresses. A model of a component
or structural member made from a transparent isotropic material (for example,
from Plexiglas) is placed between crossed polarizers. The model is subjected to the
action of loads similar to those which the article itself will experience. The pattern
observed in transmitted white light makes it possible to determine the distribution
of the stresses and also to estimate their magnitude.
The appearance of double refraction in liquids and amorphous solids under
the action of an electric field was discovered by the Scottish physicist John Kerr
(1824-1907) in 1875. This effect was named the Kerr effect after its discoverer. In
1930, it was also observed in gases. An arrangement for studying the Kerr effect in
liquids is shown schematically in Fig. 19.20. It consists of a Kerr cell placed between
crossed polarizers P and P′. A Kerr cell is a sealed vessel containing a liquid into
which capacitor plates have been introduced. When a voltage is applied across the
plates, a virtually homogeneous electric field is set up between them. Under its
action, the liquid acquires the properties of a uniaxial crystal with an optical axis
oriented along the field.
The resulting difference between the refractive indices 𝑛o and 𝑛e is proportional
to the square of the field strength 𝐸:
$$n_o - n_e = kE^2. \tag{19.20}$$
The path difference
$$\Delta = (n_o - n_e)l = klE^2$$
appears between the ordinary and extraordinary rays along the path 𝑙. The
corresponding phase difference is
$$\delta = \frac{\Delta}{\lambda_0}\,2\pi = 2\pi\frac{k}{\lambda_0}lE^2.$$
The latter expression is conventionally written in the form
$$\delta = 2\pi BlE^2, \tag{19.21}$$
where 𝐵 = 𝑘/𝜆0 is a quantity characteristic of a given substance and known as the Kerr
constant.
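
Equation (19.21) can be used to estimate the field at which a Kerr cell between crossed polarizers transmits the most light. The sketch below assumes the cell axis is at 45° to the polarizer plane, so that the transmitted fraction is sin²(𝛿/2) as in Eq. (19.16). The function names, the cell length, and the numerical Kerr constant are illustrative assumptions (the constant is only an order-of-magnitude placeholder for a strongly active liquid, not a value quoted in this section).

```python
import math


def kerr_phase(B, length, field):
    """Phase difference of Eq. (19.21): delta = 2*pi*B*l*E^2 (SI units: B in m/V^2)."""
    return 2.0 * math.pi * B * length * field ** 2


def half_wave_field(B, length):
    """Field strength at which delta = pi, i.e. 2*pi*B*l*E^2 = pi."""
    return 1.0 / math.sqrt(2.0 * B * length)


def shutter_transmission(delta):
    """Fraction of the light leaving the first polarizer that passes crossed polarizers."""
    return math.sin(delta / 2.0) ** 2


B_ASSUMED = 2.2e-12   # m/V^2, assumed illustrative Kerr constant
CELL_LENGTH = 0.05    # m, assumed path length in the liquid

E_half = half_wave_field(B_ASSUMED, CELL_LENGTH)
print(f"field for full opening: {E_half:.3e} V/m")
print(shutter_transmission(kerr_phase(B_ASSUMED, CELL_LENGTH, 0.0)))  # no voltage: shutter closed
```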
The Kerr constant depends on the temperature of a substance and on the
wavelength of the light. Among known liquids, nitrobenzene (C₆H₅NO₂) has the
highest Kerr constant.
The Kerr effect is explained by the different polarization of molecules in various
directions. In the absence of a field, the molecules are oriented chaotically, therefore
a liquid as a whole displays no anisotropy. Under the action of a field, the molecules
turn so that either their electric dipole moments (in polar molecules) or their
directions of maximum polarization (in non-polar molecules) are oriented in the
direction of the field. As a result, the liquid becomes optically anisotropic. The
thermal motion of the molecules counteracts the orienting action of the field. This
explains the reduction in the Kerr constant with elevation of the temperature.
The time during which the prevailing orientation of the molecules sets in (when
the field is switched on) or vanishes (when the field is switched off) is about 10⁻¹⁰ s.
Therefore, a Kerr cell placed between crossed polarizers can be used as a virtually
inertialess light shutter. In the absence of a voltage across the capacitor plates, the
shutter will be closed. When the voltage is switched on, the shutter transmits a
considerable part of the light falling on the first polarizer.

19.8. Rotation of Polarization Plane

Natural Rotation. Some substances known as optically active ones have the ability
of causing rotation of the plane of polarization of plane-polarized light passing
through them. Such substances include crystalline bodies (for example, quartz,
cinnabar), pure liquids (turpentine, nicotine), and solutions of optically active sub-
stances in inactive solvents (aqueous solutions of sugar, tartaric acid, etc.).
Crystalline substances rotate the plane of polarization to the greatest extent
when the light propagates along the optical axis of the crystal. The angle of rotation
𝜑 is proportional to the path 𝑙 travelled by a ray in the crystal:
𝜑 = 𝛼𝑙. (19.22)
The coefficient 𝛼 is called the rotational constant. It depends on the wavelength
(dispersion of the ability to rotate).
Fig. 19.21: Optically active substances exist in two varieties: right-hand and left-hand. The molecules or crystals of a right-hand substance are a mirror image of the molecules or crystals of the left-hand one. The symbols C, X, Y, Z, and T stand for atoms or groups of atoms (radicals) differing from one another. Molecule (b) is a mirror image of molecule (a).

In solutions, the angle of rotation of the plane of polarization is proportional to
the path of the light in the solution and to the concentration of the active substance,
𝑐:
𝜑 = [𝛼]𝑐𝑙. (19.23)
Here [𝛼] is a quantity called the specific rotational constant.
Depending on the direction of rotation of the polarization plane, optically
active substances are divided into right-hand and left-hand ones. The direction of
rotation (relative to a ray) does not depend on the direction of the ray. Consequently,
if a ray that has passed through an optically active crystal along its optical axis is
reflected by a mirror and made to pass through the crystal again in the opposite
direction, then the initial position of the polarization plane is restored.
All optically active substances exist in two varieties—right-hand and left-hand.
There exist right-hand and left-hand quartz, right-hand and left-hand sugar, etc.
The molecules or crystals of one variety are a mirror image of the molecules or
crystals of the other one (Fig. 19.21). The symbols C, X, Y, Z, and T stand for atoms
or groups of atoms (radicals) differing from one another. Molecule (b) is a mirror
image of molecule (a). If we look at the tetrahedron depicted in Fig. 19.21 along the
direction CX, then in clockwise circumvention we shall encounter the sequence
ZYTZ for molecule (a) and ZTYZ for molecule (b). The same is observed for any of
the directions CY, CZ, and CT. The alternation of the radicals X, Y, Z, T in molecule
(b) is the opposite of their alternation in molecule (a). Consequently, if, for example,
a substance formed of molecules (a) is right-hand, then one formed of molecules (b)
is left-hand.
If we place an optically active substance (a crystal of quartz, a transparent
tray with a sugar solution, etc.) between two crossed polarizers, then the field
of vision becomes bright. To get darkness again, one of the polarizers has to be
rotated through the angle 𝜑 determined by expression (19.22) or (19.23). When a
solution is used, we can determine its concentration 𝑐 by Eq. (19.23) if we know the
specific rotational constant [𝛼] of the given substance and the length 𝑙 and have
measured the angle of rotation 𝜑. This way of determining the concentration is
used in the production of various substances, in particular in the sugar industry
(the corresponding instrument is called a saccharimeter).
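
The saccharimeter calculation amounts to inverting Eq. (19.23). The following sketch is illustrative only: the function name is hypothetical, the units (degrees, decimetres, grams per cubic centimetre) follow a common polarimetry convention that is not stated in the text, and the specific rotation of sucrose used in the example is the commonly quoted figure for yellow sodium light at room temperature, taken here as an assumption.

```python
def concentration_from_rotation(phi_deg, specific_rotation, length_dm):
    """Invert Eq. (19.23), phi = [alpha] * c * l, for the concentration c.

    phi_deg           -- measured angle of rotation, degrees
    specific_rotation -- [alpha], degrees * cm^3 / (g * dm)
    length_dm         -- path of the light in the solution, decimetres
    Returns c in g/cm^3.
    """
    return phi_deg / (specific_rotation * length_dm)


SUCROSE_SPECIFIC_ROTATION = 66.5  # assumed value, deg * cm^3 / (g * dm)

c = concentration_from_rotation(13.3, SUCROSE_SPECIFIC_ROTATION, 2.0)
print(f"concentration: {c:.3f} g/cm^3")
```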
Magnetic Rotation of the Polarization Plane. Optically inactive substances
acquire the ability of rotating the plane of polarization under the action of a mag-
netic field. This phenomenon was discovered by Michael Faraday and is therefore
sometimes called the Faraday effect. It is observed only when light propagates
along the direction of magnetization. Therefore, to observe the Faraday effect, holes
are drilled in the pole shoes of an electromagnet, and a light ray is passed through
them. The substance being studied is placed between the poles of the electromagnet.
The angle of rotation of the polarization plane 𝜑 is proportional to the distance
𝑙 travelled by the light in the substance and to the magnetization of the latter. The
magnetization, in turn, is proportional to the magnetic field strength 𝐻 [see Eq. (7.14)].
We can therefore write that
𝜑 = 𝑉 𝑙𝐻. (19.24)
The coefficient 𝑉 is known as the Verdet constant or the specific magnetic
rotation. The constant 𝑉 , like the rotational constant 𝛼, depends on the wavelength.
The direction of rotation is determined by the direction of the magnetic field.
The sign of rotation does not depend on the direction of the ray. Therefore, if we
reflect the ray from a mirror and make it pass through the magnetized substance
again in the opposite direction, the rotation of the plane of polarization will double.
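
The contrast between natural and magnetic rotation on a round trip can be written out as simple bookkeeping. In this sketch (names and numbers are assumptions for illustration) all angles are measured in a fixed laboratory frame, looking along the forward direction of the ray. For natural rotation the sense is fixed relative to the ray, so the return pass undoes the forward one; for the Faraday effect the sense is fixed by the field, so the two passes add.

```python
def natural_round_trip(alpha, length):
    """Net rotation after a forward and a reflected pass through an optically active crystal."""
    forward = alpha * length
    backward = -alpha * length   # same sense relative to the ray, opposite in the lab frame
    return forward + backward


def faraday_round_trip(verdet, length, field_strength):
    """Net rotation after a forward and a reflected pass through a magnetized substance."""
    forward = verdet * length * field_strength    # Eq. (19.24)
    backward = verdet * length * field_strength   # sense set by the field, not by the ray
    return forward + backward


print(natural_round_trip(alpha=21.7, length=1.0))                      # rotation is cancelled
print(faraday_round_trip(verdet=0.01, length=0.1, field_strength=1e4))  # rotation is doubled
```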
The magnetic rotation of the polarization plane is due to the precession of the
electron orbits (see Sec. 7.7) produced under the action of the magnetic field.
Optically active substances when acted upon by a magnetic field acquire an
additional ability of rotating the plane of polarization that is added to their natural
ability.
Chapter 20
INTERACTION OF ELECTROMAGNETIC WAVES WITH A SUBSTANCE

20.1. Dispersion of Light

By the dispersion of light are meant phenomena due to the dependence of the
refractive index of a substance on the length of the light wave. This dependence
can be characterized by the function
𝑛 = 𝑓 (𝜆0 ), (20.1)
where 𝜆0 is the length of a light wave in a vacuum.
The derivative of 𝑛 with respect to 𝜆0 is called the dispersion of a substance.
Function (20.1) for all transparent colourless substances in the visible part of
the spectrum has the nature shown in Fig. 20.1. Diminishing of the wavelength is
attended by an increase in the refractive index at a constantly growing rate. Hence,
the dispersion of a substance d𝑛/d𝜆0 is negative. Its absolute value increases when
𝜆0 decreases.
If a substance absorbs part of the rays, then the course of dispersion displays an
anomaly in the region of absorption and near it (see Fig. 20.6). On a certain section,
the dispersion of the substance d𝑛/d𝜆0 will be positive. Such a variation of 𝑛 with
𝜆0 is called anomalous dispersion.
Media having the property of dispersion are known as dispersing ones. In
these media, the speed of light waves depends on the wavelength 𝜆0 or the frequency
𝜔.
Fig. 20.1: Dispersion curve typical of all transparent colourless substances in the visible part of the spectrum.

20.2. Group Velocity

Strictly monochromatic light of the kind
𝐸 = 𝐴 cos(𝜔𝑡 − 𝑘𝑥 + 𝛼) (20.2)
is an infinite sequence in time and space of “crests” and “valleys” propagating along
the 𝑥-axis with the phase velocity
$$v = \frac{\omega}{k} \qquad (20.3)$$
[see Eq. (19.4)]. We cannot use such a wave to transmit a signal because each following
crest differs in no way from the preceding one. To transmit a signal, we must put a
“mark” on the wave, say, interrupt it for a certain time 𝛥𝑡. In this case, however, the
wave will no longer be described by Eq. (20.2).
It is simplest to transmit a signal with the aid of a light pulse (Fig. 20.2).
According to the Fourier theorem, such a pulse can be represented as the superpo-
sition of waves of the kind given by Eq. (20.2) having frequencies confined within
a certain interval 𝛥𝜔. A superposition of waves differing only slightly from one
another in frequency is called a wave packet or a wave group. The analytical
expression for a wave packet has the form
$$E(x, t) = \int_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} A_\omega \cos(\omega t - k_\omega x + \alpha_\omega)\, d\omega \qquad (20.4)$$
(the subscript 𝜔 used with 𝐴, 𝑘, and 𝛼 indicates that these quantities differ for
different frequencies). With a fixed value of 𝑡, a plot of function (20.4) has the form
shown in Fig. 20.2. When 𝑡 changes, the graph becomes displaced along the 𝑥-axis.
Within the limits of a packet, plane waves amplify one another to a greater or smaller
extent. Outside these limits, they virtually completely annihilate one another.
The relevant calculations show that the smaller the width of a packet 𝛥𝑥, the
greater is the interval of frequencies 𝛥𝜔 or, accordingly, the greater is the interval
of wave numbers 𝛥𝑘 needed to describe a packet with the aid of Eq. (20.4). The
following relation holds:
𝛥𝑘 𝛥𝑥 ≈ 2𝜋. (20.5)
We must stress the fact that for the superposition of waves described by Eq. (20.4)
to be considered a wave packet, the condition 𝛥𝜔 ≪ 𝜔0 must be obeyed.

Fig. 20.2: Light pulse (wave packet) described by Eq. (20.4) at a fixed value of 𝑡. When 𝑡
changes, the graph becomes displaced along the 𝑥-axis.

In a non-dispersing medium, all the plane waves forming a packet propagate
with the same phase velocity 𝑣. It is evident that in this case the velocity of the
packet coincides with 𝑣, and the shape of the packet does not change with time. It
can be shown that a packet spreads in a dispersing medium with time: its width
grows. If the dispersion is not great, spreading of the packet is not too fast. In this
case, we can say that the packet travels with the velocity 𝑢, by which we mean the
velocity of the centre of the packet, i.e., of the point with the maximum value of
𝐸. This velocity is called the group velocity. In a dispersing medium, the group
velocity 𝑢 differs from the phase velocity 𝑣 (here we mean the phase velocity of
the harmonic component with the maximum amplitude, in other words, the phase
velocity for the dominating frequency). We shall show below that when d𝑛/d𝜆0 < 0,
the group velocity is smaller than the phase one (𝑢 < 𝑣); when d𝑛/d𝜆0 > 0, the
group velocity is greater than the phase one (𝑢 > 𝑣).
Figure 20.3 shows “photographs” of a wave packet for three consecutive mo-
ments 𝑡1 , 𝑡2 , and 𝑡3 . The figure is for the case when 𝑢 < 𝑣. Inspection of the figure
shows that motion of the packet is attended by motion of the crests and valleys
“inside” it. New crests constantly appear at the left-hand boundary of the packet.
After travelling along the packet, they vanish at its right-hand boundary. Hence,
whereas the packet as a whole travels with the velocity 𝑢, the individual crests and
valleys travel with the velocity 𝑣.
When 𝑢 > 𝑣, the crests move backward relative to the packet: new crests appear
at its right-hand boundary and vanish at its left-hand one.

Fig. 20.3: “Photographs” of a wave packet for three consecutive moments 𝑡1, 𝑡2, and 𝑡3, for
𝑢 < 𝑣. Whereas the packet as a whole travels with the velocity 𝑢, the individual crests and
valleys travel with the velocity 𝑣.

Let us explain what has been said above using the example of the superposition
of two plane waves of the same amplitude and of different wavelengths 𝜆. Figure
20.4 gives an “instant photograph” of the waves. One of them is shown by a solid
line, and the other by a dash line. The intensity is the greatest at point A where
the phases of the two waves coincide at the given moment. At points B and C, the
two waves are in counterphase, owing to which the intensity of the resultant wave
is zero. Assume that both waves are propagating from left to right, the velocity
of the “solid” wave being lower than that of the “dash” one (here d𝑣/d𝜆 > 0 and,
consequently, d𝑛/d𝜆 < 0).
Thus, the place at which the waves amplify each other will move to the left with
time relative to the waves. As a result, the group velocity will be lower than the
phase one. If the velocity of the “solid” wave is greater than that of the “dash” one
(i.e., d𝑣/d𝜆 < 0 and, consequently, d𝑛/d𝜆 > 0), the place at which amplification of
the waves occurs will move to the right so that the group velocity will be greater
than the phase one.
Let us write the equations of the waves, assuming for simplicity that the initial
phases equal zero:
𝐸1 = 𝐴 cos(𝜔𝑡 − 𝑘𝑥)
𝐸2 = 𝐴 cos[(𝜔 + 𝛥𝜔)𝑡 − (𝑘 + 𝛥𝑘)𝑥].
Here 𝑘 = 𝜔/𝑣1, and (𝑘 + 𝛥𝑘) = (𝜔 + 𝛥𝜔)/𝑣2. Assume that 𝛥𝜔 ≪ 𝜔, hence, 𝛥𝑘 ≪ 𝑘.
Now, summing the oscillations and performing transformations according to
the formula for the sum of cosines, we get
$$E = E_1 + E_2 = 2A \cos\left(\frac{\Delta\omega}{2} t - \frac{\Delta k}{2} x\right) \cos(\omega t - kx) \qquad (20.6)$$
(in the second multiplier, we have disregarded 𝛥𝜔 in comparison with 𝜔 and 𝛥𝑘 in
comparison with 𝑘).

Fig. 20.4: “Instant photograph” of two waves. The intensity is the greatest at point A where
the phases of the two waves coincide at the given moment. At points B and C, the two
waves are in counterphase, owing to which the intensity of the resultant wave is zero.

The first multiplier varies much more slowly with 𝑥 and 𝑡 than the second
one. We can therefore consider expression (20.6) as the equation of a plane
wave whose amplitude varies according to the law¹
$$\text{Amplitude} = \left| 2A \cos\left(\frac{\Delta\omega}{2} t - \frac{\Delta k}{2} x\right) \right|.$$
In the given case, there are a number of identical amplitude maxima determined by
the condition
$$\frac{\Delta\omega}{2} t - \frac{\Delta k}{2} x_{\max} = \pm m\pi \quad (m = 0, 1, 2, \ldots). \qquad (20.7)$$
Each of these maxima can be considered as the centre of the relevant wave packet.
Solving Eq. (20.7) relative to 𝑥max we get
$$x_{\max} = \frac{\Delta\omega}{\Delta k} t + \text{constant}.$$
It thus follows that the maxima travel with the velocity
$$u = \frac{\Delta\omega}{\Delta k}. \qquad (20.8)$$
The expression obtained is the group velocity for a packet formed by two compo-
nents.
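
Result (20.8) can be checked numerically. The following Python sketch is only an
illustration: the values of 𝜔, 𝑘, 𝛥𝜔, and 𝛥𝑘 are arbitrary assumptions in arbitrary units,
and the helper names are hypothetical. It takes the modulus of the complex sum of the
two waves as the amplitude, follows one amplitude maximum between two moments of
time, and compares its velocity with 𝛥𝜔/𝛥𝑘.

```python
import cmath

# Assumed illustrative values (arbitrary units), not taken from the text.
OMEGA, K = 10.0, 5.0          # first wave; its phase velocity is OMEGA / K
D_OMEGA, D_K = 0.2, 0.125     # small offsets of the second wave


def envelope(x, t):
    """Modulus of the complex sum of the two waves (unit amplitudes)."""
    e1 = cmath.exp(1j * (OMEGA * t - K * x))
    e2 = cmath.exp(1j * ((OMEGA + D_OMEGA) * t - (K + D_K) * x))
    return abs(e1 + e2)


def peak_position(t, x_min=-10.0, x_max=30.0, step=1e-3):
    """Grid search for the x at which the envelope is largest at time t.

    The window is narrower than the spacing 2*pi/D_K between neighbouring
    maxima, so it holds a single maximum at both moments used below.
    """
    n = int(round((x_max - x_min) / step))
    grid = (x_min + i * step for i in range(n + 1))
    return max(grid, key=lambda x: envelope(x, t))


t1, t2 = 0.0, 5.0
u_measured = (peak_position(t2) - peak_position(t1)) / (t2 - t1)
u_formula = D_OMEGA / D_K     # Eq. (20.8)
v_phase = OMEGA / K           # Eq. (20.3)

print(f"measured u = {u_measured:.3f}")
print(f"formula  u = {u_formula:.3f}")
print(f"phase    v = {v_phase:.3f}")
```

With the values assumed here the amplitude maximum lags behind the crests, which
corresponds to the case 𝑢 < 𝑣.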
Let us find the velocity with which the centre of a wave packet described by
expression (20.4) travels. Passing over from cosines to exponents, we get
$$E(x, t) = \int_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} \hat{A}_\omega \exp[i(\omega t - k_\omega x)]\, d\omega \qquad (20.9)$$
[$\hat{A}_\omega = A_\omega \exp(i\alpha_\omega)$ is the complex amplitude].
Let us expand the function 𝑘𝜔 = 𝑘(𝜔) into a series in the vicinity of 𝜔0:
$$k_\omega = k_0 + \left(\frac{dk}{d\omega}\right)_0 (\omega - \omega_0) + \ldots \qquad (20.10)$$

¹Compare with Eqs. (7.86) and (7.87) of Vol. I. The dependence of function (20.6) on 𝑥 at a fixed
value of 𝑡 is depicted by a curve similar to the one in Fig. 7.11a of Vol. 1.

Here, 𝑘0 = 𝑘(𝜔0), and (d𝑘/d𝜔)0 is the value of the derivative at point 𝜔0.
We shall introduce the variable 𝜉 = 𝜔 − 𝜔0. Hence, 𝜔 = 𝜔0 + 𝜉 and d𝜔 = d𝜉.
Performing such a substitution in Eq. (20.9) and introducing the value of 𝑘𝜔 from
Eq. (20.10), we can write
$$E(x, t) = \exp[i(\omega_0 t - k_0 x)] \int_{-\Delta\omega/2}^{+\Delta\omega/2} \hat{A}_\xi \exp\left\{ i\left[ t - \left(\frac{dk}{d\omega}\right)_0 x \right] \xi \right\} d\xi. \qquad (20.11)$$
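We have arrived at an equation of a plane wave of frequency 𝜔0, wave number 𝑘0,
and complex amplitude
$$\hat{A}(x, t) = \int_{-\Delta\omega/2}^{+\Delta\omega/2} \hat{A}_\xi \exp\left\{ i\left[ t - \left(\frac{dk}{d\omega}\right)_0 x \right] \xi \right\} d\xi. \qquad (20.12)$$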
It can be seen from Eq. (20.12) that the equation
$$t - \left(\frac{dk}{d\omega}\right)_0 x = \text{constant}, \qquad (20.13)$$
relates the time 𝑡 and the coordinate 𝑥 of the plane in which the complex amplitude
has a given fixed value, in particular including a value such that the magnitude of the
complex amplitude, i.e., the conventional amplitude 𝐴(𝑥, 𝑡), reaches a maximum.
Taking into account that 1/(d𝑘/d𝜔)0 = (d𝜔/d𝑘)0, we can write Eq. (20.13) in
the form
$$x_{\max} = \left(\frac{d\omega}{dk}\right)_0 t - \text{constant}' \qquad \left[\text{constant}' = \frac{\text{constant}}{(dk/d\omega)_0}\right]. \qquad (20.14)$$
It follows from Eq. (20.14) that the place where the amplitude of a wave packet
is maximum travels with the velocity (d𝜔/d𝑘)0 . We thus arrive at the following
expression for the group velocity:
d𝜔
𝑢= (20.15)
d𝑘
(the subscript 0 is no longer needed and has been omitted). We previously obtained
a similar expression for a packet of two waves [see Eq. (20.8)]. We remind our reader
that we have disregarded the terms of higher orders of smallness in expansion
(20.10). In this approximation, the shape of the wave packet does not change with
time. If we take into account the following terms of the expansion, then we get an
expression for the amplitude from which it follows that the width of a packet grows
with time—a wave packet broadens.
We can give a different form to the expression for the group velocity. Substituting
𝑣𝑘 for 𝜔 [see Eq. (20.3)], we can write Eq. (20.15) as follows:
$$u = \frac{d(vk)}{dk} = v + k \frac{dv}{dk}. \qquad (20.16)$$

We shall further write
$$\frac{dv}{dk} = \frac{dv}{d\lambda} \frac{d\lambda}{dk}.$$
We find from the relation 𝜆 = 2𝜋/𝑘 that d𝜆/d𝑘 = −2𝜋/𝑘² = −𝜆/𝑘. Accordingly,
d𝑣/d𝑘 = −(d𝑣/d𝜆)(𝜆/𝑘). Using this value in Eq. (20.16), we get
$$u = v - \lambda \frac{dv}{d\lambda}. \qquad (20.17)$$
A glance at this formula shows that the group velocity 𝑢 can be either smaller or
greater than the phase velocity 𝑣, depending on the sign of d𝑣/d𝜆. In the absence of
dispersion, d𝑣/d𝜆 = 0, and the group velocity coincides with the phase one.
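
That Eqs. (20.15) and (20.17) give the same group velocity can be verified numerically
for any smooth dispersion law. The sketch below is an illustration only: the law
𝜔(𝑘) = √(𝜔p² + 𝑐²𝑘²), the values 𝑐 = 1 and 𝜔p = 1, and all function names are
assumptions chosen for convenience and do not come from the text. For this law the
phase velocity exceeds 𝑐 while the group velocity is smaller than 𝑐.

```python
import math

C = 1.0          # speed of light in the units assumed for this sketch
OMEGA_P = 1.0    # assumed parameter of the model dispersion law


def omega(k):
    """Assumed model dispersion law omega(k) = sqrt(OMEGA_P^2 + (C k)^2)."""
    return math.sqrt(OMEGA_P ** 2 + (C * k) ** 2)


def phase_velocity(wavelength):
    """v = omega / k, Eq. (20.3), written as a function of the wavelength."""
    k = 2.0 * math.pi / wavelength
    return omega(k) / k


def group_velocity_from_omega(k, h=1e-6):
    """u = d(omega)/dk, Eq. (20.15), by a central finite difference."""
    return (omega(k + h) - omega(k - h)) / (2.0 * h)


def group_velocity_eq_20_17(k, h=1e-6):
    """u = v - lambda dv/d(lambda), Eq. (20.17), by a central difference."""
    wavelength = 2.0 * math.pi / k
    dv_dlambda = (phase_velocity(wavelength + h)
                  - phase_velocity(wavelength - h)) / (2.0 * h)
    return phase_velocity(wavelength) - wavelength * dv_dlambda


k = 2.0
v = omega(k) / k
u_a = group_velocity_from_omega(k)
u_b = group_velocity_eq_20_17(k)
print(f"v = {v:.6f}, u from (20.15) = {u_a:.6f}, u from (20.17) = {u_b:.6f}")
```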
The maximum of the intensity falls at the centre of a wave packet. Therefore,
when the concept of group velocity has a meaning, the velocity of energy transfer
by a wave equals the group velocity.
The concept of group velocity may be applied only provided that the absorption of the
wave energy in the given medium is not great. With considerable attenuation of the
waves, the concept of group velocity loses its meaning. This occurs in the region of
anomalous dispersion. In this region, the absorption is very great, and the concept
of group velocity cannot be applied.

20.3. Elementary Theory of Dispersion

The dispersion of light can be explained on the basis of the electromagnetic the-
ory and the electron theory of a substance. For this purpose, we must consider
the process of interaction of light with a substance. The motion of the electrons
in an atom obeys the laws of quantum mechanics. In particular, the concept of
the trajectory of an electron in an atom loses all meaning. As Lorentz showed,
however, it is sufficient to restrict ourselves to the hypothesis on the existence of
electrons bound quasi-elastically within atoms for a qualitative understanding of
many optical phenomena. When brought out of their equilibrium position, such
electrons will begin to oscillate, gradually losing the energy of oscillation on the
emission of electromagnetic waves. As a result, the oscillations will be damped.
The attenuation can be taken into account by introducing the “force of friction of
emission” proportional to the velocity.
When an electromagnetic wave passes through a substance, every electron
experiences the action of the Lorentz force
𝑭 = −𝑒𝑬 − 𝑒(𝒗 × 𝑩) = −𝑒𝑬 − 𝑒𝜇0 (𝒗 × 𝑯) (20.18)
[see Eq. (6.35); the charge of an electron is −𝑒]. According to Eq. (15.23), the ratio of
the magnetic and electric field strengths in a wave is 𝐻/𝐸 = √(𝜀0/𝜇0). Hence, from
Eq. (20.18), we get the following value for the ratio of the magnetic and electric
forces exerted on an electron:
$$\frac{\mu_0 v H}{E} = \mu_0 v \left(\frac{\varepsilon_0}{\mu_0}\right)^{1/2} = v \sqrt{\varepsilon_0 \mu_0} = \frac{v}{c}.$$
Even if the amplitude 𝑎 of electron oscillations reached a value of the order of 1 Å
(10⁻¹⁰ m), i.e., of the order of an atom’s dimensions, the amplitude of the velocity
of an electron 𝑎𝜔 would be about 10⁻¹⁰ × 3 × 10¹⁵ = 3 × 10⁵ m s⁻¹ [according to
Eq. (16.6), 𝜔 = 2𝜋𝜈 equals about 3 × 10¹⁵ rad s⁻¹]. Thus, the ratio 𝑣/𝑐 is at most
of the order of 10⁻³, so that we may disregard the second addend in Eq. (20.18).
We can thus consider that when an electromagnetic wave passes through a
substance, every electron experiences the force
𝐹 = −𝑒𝐸0 cos(𝜔𝑡 + 𝛼)
(𝛼 is a quantity determined by the coordinates of a given electron, and 𝐸0 is the
amplitude of the electric field strength of the wave).
To simplify our calculations, we shall first disregard the attenuation due to
emission. We shall subsequently take the attenuation into account by introducing
the relevant corrections into the formulas obtained. The equation of motion of an
electron in this case has the form
$$\ddot{r} + \omega_0^2 r = -\frac{e}{m} E_0 \cos(\omega t + \alpha)$$
[see Eq. (7.13) of Vol. I; 𝜔0 is the natural frequency of oscillations of an electron].
Let us add −𝑖(𝑒/𝑚)𝐸0 sin(𝜔𝑡 + 𝛼) to the right-hand side of this equation and thus
pass over to the complex functions $\hat{E}$ and $\hat{r}$:
$$\frac{d^2 \hat{r}}{dt^2} + \omega_0^2 \hat{r} = -\frac{e}{m} \hat{E}_0 \exp(i\omega t). \qquad (20.19)$$
Here, $\hat{E}_0 = E_0 \exp(i\alpha)$ is the complex amplitude of the electric field of a wave.
We shall seek a solution of the equation in the form $\hat{r} = \hat{r}_0 \exp(i\omega t)$, where $\hat{r}_0$
is the complex amplitude of oscillations of an electron. Accordingly,
$d^2\hat{r}/dt^2 = -\omega^2 \hat{r}_0 \exp(i\omega t)$. Introducing these expressions into Eq. (20.19) and
cancelling out the common factor exp(𝑖𝜔𝑡), we arrive at the expression
$$-\omega^2 \hat{r}_0 + \omega_0^2 \hat{r}_0 = -\frac{e}{m} \hat{E}_0,$$
whence
$$\hat{r}_0 = \frac{-(e/m) \hat{E}_0}{\omega_0^2 - \omega^2}.$$

Multiplying the equation obtained by exp(𝑖𝜔𝑡), we obtain
$$\hat{r}(t) = \frac{-(e/m) \hat{E}(t)}{\omega_0^2 - \omega^2}.$$
Finally, taking the real parts of the complex functions $\hat{r}$ and $\hat{E}$, we find 𝑟 as a
function of 𝑡:
$$r(t) = \frac{-(e/m) E(t)}{\omega_0^2 - \omega^2}. \qquad (20.20)$$
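
It is easy to convince ourselves that Eq. (20.20) does satisfy the equation of motion
written above. The following Python sketch substitutes 𝑟(𝑡) into the equation and
evaluates the residual, estimating the second derivative by a finite difference. The
dimensionless values of 𝑒/𝑚, 𝜔0, 𝜔, and 𝐸0 and the function names are assumptions made
only for this illustration; the initial phase 𝛼 is taken equal to zero.

```python
import math

# Assumed dimensionless illustration values, not taken from the text.
E_OVER_M = 1.0      # ratio e/m
OMEGA_0 = 1.0       # natural frequency of the electron
OMEGA = 0.3         # frequency of the driving wave (below resonance)
E_AMPLITUDE = 1.0   # amplitude E0 of the field


def field(t):
    """Electric field of the wave at the electron, E(t) = E0 cos(omega t)."""
    return E_AMPLITUDE * math.cos(OMEGA * t)


def displacement(t):
    """Steady-state displacement according to Eq. (20.20)."""
    return -E_OVER_M * field(t) / (OMEGA_0 ** 2 - OMEGA ** 2)


def residual(t, h=1e-3):
    """r'' + omega0^2 r + (e/m) E(t); zero for an exact solution."""
    r_ddot = (displacement(t + h) - 2.0 * displacement(t)
              + displacement(t - h)) / h ** 2
    return r_ddot + OMEGA_0 ** 2 * displacement(t) + E_OVER_M * field(t)


worst = max(abs(residual(0.5 * i)) for i in range(40))
print(f"largest residual on the sampled moments: {worst:.2e}")
```

The residual differs from zero only because of the error of the finite difference.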
To simplify the problem, we shall consider that the molecules are non-polar.
In addition, since the masses of nuclei are great in comparison with the mass of
an electron, we shall ignore the displacements of the nuclei from the equilibrium
positions under the action of the wave field. In this approximation, the dipole
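electric moment of a molecule can be represented in the form
$$\begin{aligned}
\mathbf{p}(t) &= \sum_l q_l \mathbf{R}_{0,l} + \sum_k e_k [\mathbf{r}_{0,k} + \mathbf{r}_k(t)] \\
&= \left\{ \sum_l q_l \mathbf{R}_{0,l} + \sum_k e_k \mathbf{r}_{0,k} \right\} + \sum_k e_k \mathbf{r}_k(t) \\
&= \mathbf{p}_0 + \sum_k e_k \mathbf{r}_k(t) = \sum_k e_k \mathbf{r}_k(t),
\end{aligned}$$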
where 𝑞𝑙 and 𝑅0,𝑙 are the charges and position vectors of the equilibrium positions
of the nuclei, 𝑒𝑘 and 𝑟0,𝑘 are the charge and position vector of the equilibrium
position of the 𝑘-th electron, 𝒓 𝑘 (𝑡) is the displacement of the 𝑘-th electron from
its equilibrium position under the action of the wave field, and 𝒑0 is the dipole
moment of a molecule in the absence of a field, which is assumed to equal zero.
All the 𝒓𝑘(𝑡)’s are collinear with 𝑬(𝑡). We therefore obtain the following
expression for the projection of 𝒑(𝑡) onto the direction of 𝑬(𝑡):
$$p(t) = \sum_k e_k r_k(t) = \sum_k (-e) r_k(t)$$
(we have taken into account that 𝑒𝑘 for all electrons is identical and equals −𝑒).
Let us introduce into this equation the value of 𝑟(𝑡) from Eq. (20.20), taking into
consideration that the electrons in a molecule have different natural frequencies
𝜔0,𝑘. As a result, we get
$$p(t) = \sum_k \frac{e^2/m}{\omega_{0,k}^2 - \omega^2} E(t). \qquad (20.21)$$
Let us denote the number of molecules in unit volume by the symbol 𝑁. The
product 𝑁𝑝(𝑡) gives the polarization 𝑃(𝑡) of a substance. According to Eqs. (2.5)
and (2.20), the permittivity is
$$\varepsilon = 1 + \chi = 1 + \frac{P(t)}{\varepsilon_0 E(t)} = 1 + \frac{N p(t)}{\varepsilon_0 E(t)}.$$
Using in this expression the ratio 𝑝(𝑡)/𝐸(𝑡) obtained from Eq. (20.21) and substituting
𝑛² for 𝜀 [see Eq. (16.3)], we arrive at the formula
$$n^2 = 1 + \frac{N}{\varepsilon_0} \sum_k \frac{e^2/m}{\omega_{0,k}^2 - \omega^2}. \qquad (20.22)$$

Fig. 20.5: Behaviour of function (20.22) when the friction of emission is disregarded
(dash line) and when it is considered (solid line).

Fig. 20.6: Square root of the function in Fig. 20.5 plotted against 𝜆0. The dash curve
shows how the coefficient of absorption of light by a substance changes.

At frequencies 𝜔 appreciably differing from all the natural frequencies 𝜔0,𝑘, the
sum in Eq. (20.22) will be small in comparison with unity, so that 𝑛² ≈ 1. Near
each of the natural frequencies, function (20.22) is discontinuous: when 𝜔 tends to
𝜔0,𝑘 from the left, the function tends to +∞, and when 𝜔 tends to 𝜔0,𝑘 from the right,
the function tends to −∞ (see the dash curves in Fig. 20.5). Such a behaviour of
function (20.22) is due to the fact that we have disregarded the friction of emission
[we remind our reader that when friction is disregarded, the amplitude of the forced
oscillations in resonance becomes equal to infinity; see Eq. (7.128) of Vol. I]. When
the friction of emission is taken into consideration, we get the dependence of 𝑛² on
𝜔 depicted in Fig. 20.5 by the solid curve.
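
The behaviour of function (20.22) with the friction of emission disregarded can be
illustrated by a short calculation. In the Python sketch below, the number density
𝑁 = 10²⁵ m⁻³ and the single natural frequency 𝜔0 = 3 × 10¹⁵ rad s⁻¹ are assumed values
chosen only for illustration, and the function name is hypothetical; the physical
constants are rounded.

```python
# Rounded physical constants (SI units).
E_CHARGE = 1.602e-19   # elementary charge, C
E_MASS = 9.109e-31     # electron mass, kg
EPS0 = 8.854e-12       # electric constant, F/m


def n_squared(omega, natural_frequencies, number_density):
    """n^2 according to Eq. (20.22), friction of emission disregarded."""
    total = sum((E_CHARGE ** 2 / E_MASS) / (w0 ** 2 - omega ** 2)
                for w0 in natural_frequencies)
    return 1.0 + number_density * total / EPS0


# Assumed illustration values: one natural frequency, gas-like density.
OMEGA_0 = 3.0e15       # rad/s
N = 1.0e25             # molecules per cubic metre

far_below = n_squared(1.0e15, [OMEGA_0], N)
just_below = n_squared(0.999 * OMEGA_0, [OMEGA_0], N)
just_above = n_squared(1.001 * OMEGA_0, [OMEGA_0], N)

print(f"far from resonance:   n^2 = {far_below:.4f}")
print(f"just below resonance: n^2 = {just_below:.4f}")
print(f"just above resonance: n^2 = {just_above:.4f}")
```

Far from the natural frequency the result is close to unity, whereas on the two
sides of 𝜔0 the function takes large values of opposite signs, as shown by the dash
curves in Fig. 20.5.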
Passing over from 𝑛² to 𝑛 and from 𝜔 to 𝜆0, we get the curve shown in Fig. 20.6
(the figure gives only a portion of the curve in the region of one of the resonance
wavelengths). The dash curve in this figure shows how the coefficient of absorption
of light by a substance changes (see the following section). Segment 3-4 is similar to
the curve shown in Fig. 20.1. Segments 1-2 and 3-4 correspond to normal dispersion
(d𝑛/d𝜆0 < 0). On segment 2-3, the dispersion is anomalous (d𝑛/d𝜆0 > 0). In region
1-2, the refractive index is less than unity, hence, the phase velocity of the wave
exceeds 𝑐. This circumstance does not contradict the theory of relativity, which is
based on the statement that the velocity of transmitting a signal cannot exceed 𝑐. In
the preceding section, we found that it is impossible to transmit a signal with the
aid of an ideally monochromatic wave. Energy (i.e., a signal) is transmitted with
the aid of a not completely monochromatic wave (wave packet), however, with
a velocity equal to the group velocity determined by Eq. (20.17). In the region of
normal dispersion, d𝑣/d𝜆 > 0 (d𝑣 and d𝑛 have different signs, while d𝑛/d𝜆 < 0),
so that although 𝑣 > 𝑐, the group velocity is less than 𝑐. In the region of anomalous
dispersion, the concept of group velocity loses its meaning (the absorption is very
great). Therefore, the value of 𝑢 calculated by Eq. (20.17) will not characterize the
rate of energy transmission. The relevant calculations give a value less than 𝑐 for
the velocity of energy transmission in this case too.

20.4. Absorption of Light

When a light wave passes through a substance, part of the wave energy is spent
for producing oscillations of the electrons. This energy is partly returned to the
radiation in the form of the secondary wavelets set up by the electrons; it is partly
transformed, however, into the energy of motion of the atoms, i.e., into the internal
energy of the substance. This is the reason why the intensity of light transmitted
through a substance diminishes—light is absorbed in the substance. The forced
oscillations of the electrons and therefore the absorption of light become especially
intensive at the resonance frequency (see the dash absorption curve in Fig. 20.6).
Experiments show that the intensity of light when it passes through a substance
diminishes according to the exponential law
$$I = I_0 e^{-\varkappa l}. \qquad (20.23)$$
Here, 𝐼0 is the intensity of light at the entrance to the absorbing layer (on its boundary
or at a certain place inside the substance), 𝑙 is the thickness of the layer, and 𝜘 is
a constant depending on the properties of the absorbing substance and called the
absorption coefficient.
Equation (20.23) is known as Bouguer’s law² [in honour of the French scientist
Pierre Bouguer (1698-1758)].
Differentiation of Eq. (20.23) yields
$$dI = -\varkappa I_0 e^{-\varkappa l}\, dl = -\varkappa I\, dl. \qquad (20.24)$$
It follows from this expression that the decrement of the intensity along the path d𝑙
is proportional to the length of this path and to the value of the intensity itself. The
absorption coefficient is the constant of proportionality.
Inspection of Eq. (20.23) shows that when 𝑙 = 1/𝜘, the intensity 𝐼 is 1/𝑒-th of 𝐼0 .

²Also known as the Beer-Lambert law or Beer-Lambert-Bouguer law.


Thus, the absorption coefficient is the reciprocal of the thickness of the layer
that reduces the intensity of light passing through it to 1/𝑒-th of its initial value.

Fig. 20.7: Absorption coefficient of a substance whose atoms or molecules virtually do
not act on one another (gases and metal vapours at a low pressure). It is close to zero
for most wavelengths and displays sharp maxima only in very narrow spectral regions
(having a width of several hundredths of an angstrom).

Fig. 20.8: Broad absorption bands of gases at high pressures (and also of liquids and
solids).

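Bouguer's law (20.23) lends itself to a simple numerical illustration. The sketch
below uses the orders of magnitude quoted in this section for glass (𝜘 ≈ 1 m⁻¹) and
for metals (𝜘 of the order of 10⁶ m⁻¹); the layer thicknesses and the function names
are assumptions introduced only for the example.

```python
import math


def transmitted_intensity(i0, kappa, thickness):
    """Intensity behind a layer according to Bouguer's law, Eq. (20.23)."""
    return i0 * math.exp(-kappa * thickness)


def one_over_e_thickness(kappa):
    """Thickness of the layer that reduces the intensity to 1/e of I0."""
    return 1.0 / kappa


KAPPA_GLASS = 1.0      # 1/m, order of magnitude quoted in the text
KAPPA_METAL = 1.0e6    # 1/m, order of magnitude quoted in the text

# Assumed example thicknesses: 1 cm of glass and 10 micrometres of metal.
glass_fraction = transmitted_intensity(1.0, KAPPA_GLASS, 1.0e-2)
metal_fraction = transmitted_intensity(1.0, KAPPA_METAL, 1.0e-5)

print(f"1/e thickness of glass: {one_over_e_thickness(KAPPA_GLASS):.3g} m")
print(f"1/e thickness of metal: {one_over_e_thickness(KAPPA_METAL):.3g} m")
print(f"fraction transmitted by 1 cm of glass:  {glass_fraction:.4f}")
print(f"fraction transmitted by 10 um of metal: {metal_fraction:.2e}")
```
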
The absorption coefficient depends on the wavelength 𝜆 (or the frequency 𝜔).
The absorption coefficient of a substance whose atoms or molecules virtually do not
act on one another (gases and metal vapours at a low pressure) is close to zero for
most wavelengths. It displays sharp maxima (Fig. 20.7) only for very narrow spectral
regions (having a width of several hundredths of an angstrom). These maxima
correspond to the resonance frequencies of oscillations of the electrons inside the
atoms. For polyatomic molecules, frequencies corresponding to the oscillations
of the atoms inside the molecules are also detected. Since the masses of atoms
are tens of thousands of times greater than the mass of an electron, the molecular
frequencies are much smaller than the atomic ones—they are in the infrared region
of the spectrum.
Gases at high pressures, and also liquids and solids produce broad absorption
bands (Fig. 20.8). As the pressure of gases is increased, the absorption maxima, which
are initially very narrow (see Fig. 20.7), expand more and more, and at high pressures
the absorption spectrum of gases approaches those of liquids. This fact indicates
that the expansion of the absorption bands is the result of the atoms interacting
with one another.
Metals are virtually opaque for light (𝜘 for them has a value of the order of
10⁶ m⁻¹; for comparison we shall point out that for glass 𝜘 ≈ 1 m⁻¹). This is due to
the presence of free electrons in metals. The action of the electric field of a light
wave causes the free electrons to come into motion: fast-varying currents attended
by the liberation of Lenz-Joule heat are produced in the metal. As a result, the
energy of the light wave rapidly diminishes and transforms into the internal energy
of the metal.

20.5. Scattering of Light

From the classical viewpoint, the process of scattering of light consists in that light
passing through a substance causes the electrons in the atoms to oscillate. The
oscillating electrons produce secondary wavelets that propagate in all directions.
It would seem that this phenomenon should result in the scattering of light in all conditions.
The secondary wavelets, however, are coherent, so that their mutual interference
must be taken into consideration.
The relevant calculations show that in a homogeneous medium the secondary
wavelets completely destroy one another in all directions except for that of propa-
gation of the primary wave. Therefore, no redistribution of the light by directions,
i.e., scattering of the light, occurs.
The secondary wavelets do not destroy one another in side directions only
when light propagates in a non-homogeneous medium. The light waves become
diffracted on the non-homogeneities of the medium and produce a diffraction
pattern characterized by a quite uniform distribution of the intensity between all
directions. Such diffraction on fine non-homogeneities is called the scattering of
light.
Media having a clearly expressed optical non-homogeneity are known as turbid
media. They include (1) smoke, i.e., a suspension of very minute solid particles
in a gas, (2) fogs and mists, i.e., suspensions of very minute liquid droplets in gases,
(3) suspensions formed by solid particles in the bulk of a liquid, (4) emulsions, i.e.,
suspensions of very minute droplets of one liquid in another one that does not
dissolve the first liquid (an example of an emulsion is milk, which is a suspension
of droplets of fat in water), and (5) solids such as mother-of-pearl, opals, and milk
glass.
Light scattered on particles whose size is considerably smaller than the length
of a light wave becomes partly polarized. The explanation is that the oscillations of
the electrons excited by the beam being scattered occur in a plane at right angles
to the beam (Fig. 20.9). The oscillations of the vector 𝑬 in a secondary wavelet occur
in a plane passing through the direction of oscillations of the charges (see Fig. 15.6).
Therefore, the light scattered by the particles in directions normal to the beam will
be completely polarized. The scattered light is polarized only partly in directions
that make an angle other than a right one with the beam.

Fig. 20.9: Polarization of scattered light (figure labels: scattered beam, oscillations of
vector 𝑬, direction of observation). The light scattered by the particles in directions
normal to the beam will be completely polarized. The scattered light is polarized only
partly in directions that make an angle other than a right one with the beam.

As a result of scattering of the light in side directions, the intensity in the
direction of its propagation diminishes more rapidly than when only absorption
occurs. Consequently, for a turbid substance, Eq. (20.23) must contain the coefficient
𝜘′ due to scattering in addition to the absorption coefficient 𝜘:
$$I = I_0 e^{-(\varkappa + \varkappa') l}. \qquad (20.25)$$
The constant 𝜘′ is called the extinction coefficient.
If the dimensions of the non-homogeneities are small in comparison with the
length of a light wave (not over ∼0.1𝜆), then the intensity of the scattered light 𝐼 is
proportional to the fourth power of the frequency or is inversely proportional to
the fourth power of the wavelength:
$$I \propto \omega^4 \propto \frac{1}{\lambda^4}. \qquad (20.26)$$
This relation is known as Rayleigh’s law after the British physicist John Rayleigh
(1842-1919). It is easy to understand its origin if we take into account that the radiant
power of an oscillating charge is proportional to the fourth power of the frequency
and, consequently, is inversely proportional to the fourth power of the wavelength
[see expression (15.46)].
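
To get a feeling for how strong the dependence (20.26) is, we can compare the
scattering of two wavelengths. In the following Python sketch, the wavelengths of
450 nm for blue light and 700 nm for red light are assumed representative values,
and the function name is hypothetical.

```python
def rayleigh_ratio(lambda_short, lambda_long):
    """Ratio of scattered intensities I(short) / I(long) by law (20.26).

    Both wavelengths must be given in the same units.
    """
    return (lambda_long / lambda_short) ** 4


# Assumed representative wavelengths, in nanometres.
BLUE_NM = 450.0
RED_NM = 700.0

ratio = rayleigh_ratio(BLUE_NM, RED_NM)
print(f"blue light is scattered about {ratio:.1f} times more strongly than red")
```

With these assumed wavelengths the short-wave light is scattered almost six times
more strongly than the long-wave light.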
If the dimensions of the non-homogeneities are comparable with the length
of a wave, then the electrons at different spots on the non-homogeneities oscillate
with an appreciable phase shift. This circumstance makes the phenomenon more
complicated and leads to other regularities—the intensity of the scattered light
becomes proportional to only the square of the frequency (inversely proportional
to the square of the wavelength).

Fig. 20.10: A beam of white light passing through a vessel with a turbid liquid; Sc is the
screen and P is the polarizer.

It is simple to observe the manifestation of law (20.26) by passing a beam of
white light through a vessel with a turbid liquid (Fig. 20.10). Owing to scattering, the
trace of the beam in the liquid is seen very well from a side. Since short light waves
are scattered to a much greater extent than the long ones, the trace seems to be
bluish. The beam passing through the liquid is enriched with long-wave radiation
and forms a reddish-yellow spot on screen Sc instead of a white one. If we put
polarizer P at the entrance of the beam to the vessel, we shall find that the intensity
of the scattered light in different directions perpendicular to the initial beam is
not the same. The directivity of dipole emission (see Fig. 15.7) results in the fact
that in the directions coinciding with the plane of oscillations of the primary beam,
the intensity of the scattered light virtually equals zero, while in the directions
perpendicular to the plane of the oscillations, the intensity of the scattered light is
maximum. By turning the polarizer around the direction of the primary beam, we
shall observe alternate amplification and attenuation of the light scattered in the
given direction.
Even liquids and gases carefully purified of foreign admixtures and impurities
scatter light to some extent. The Soviet physicist Leonid Mandelshtam (1879-1944)
and the Polish physicist Marian Smoluchowski (1872-1917) established that the ap-
pearance of the optical non-homogeneities is due in this case to fluctuation of the
density (i.e., deviations of the density from its mean value observed within the
confines of small volumes). These fluctuations are produced by chaotic motion
of the molecules of the substance; therefore, the scattering of light due to them is
called molecular.
Molecular scattering explains the light blue colour of the sky. The places of
compression and rarefaction of the air continuously appearing in the atmosphere
owing to the random motion of its molecules scatter sunlight. According to law
(20.26), the light blue and blue rays are scattered to a greater extent than the yellow
and red ones, the result being the light blue colour of the sky. When the Sun is low
above the horizon, the rays propagating directly from it pass through a scattering
medium of great thickness, and as a result they are enriched with waves of greater
lengths. This is why the sky at sunrise and sunset has red tints.

There are especially favourable conditions for the appearance of considerable
density fluctuations near the critical state of a substance (at the critical point
d𝑝/d𝑉 = 0; see Sec. 15.4 of Vol. I). These fluctuations result in intensive scattering
of light such that a glass ampule with the substance seems to be absolutely black
when looked through. This phenomenon is known as critical opalescence.

20.6. The Vavilov-Cerenkov Effect

In 1934, the Soviet physicist Pavel Cerenkov (born 1904), working under the supervi-
sion of Sergei Vavilov (1891-1951), discovered a special kind of glow of liquids under
the action of radium gamma-rays. Vavilov advanced the correct assumption that
the fast electrons produced by the gamma-rays are the source of the radiation. This
phenomenon was named the Vavilov-Cerenkov effect. Its complete theoretical
explanation was given in 1937 by the Soviet physicists Igor Tamm (1895-1971) and Ilya
Frank (born 1908)³.
According to the electromagnetic theory, a charge moving uniformly emits no
electromagnetic waves (see Sec. 15.6). As Tamm and Frank showed, however, this
holds only if the velocity 𝑣 of a charged particle does not exceed the phase velocity
𝑐/𝑛 of electromagnetic waves in the medium in which the particle is moving. A
particle emits electromagnetic waves even when travelling uniformly provided that
𝑣 > 𝑐/𝑛. The particle actually loses energy on radiation owing to which it travels
with a negative acceleration. This acceleration is not the cause (as when 𝑣 < 𝑐/𝑛),
but a consequence of radiation. If the loss of energy at the expense of radiation
were replenished in some way or other, a particle travelling uniformly with the
velocity 𝑣 > 𝑐/𝑛 would nevertheless be a source of radiation.
The Vavilov-Cerenkov effect was observed experimentally for electrons, pro-
tons, and mesons travelling in liquid and solid media.
Vavilov-Cerenkov radiation has a light blue colour because short waves pre-
dominate in it. The most characteristic feature of this radiation is the fact that it is
emitted not in all directions, but only along the generatrices of a cone whose axis
coincides with the direction of velocity of the relevant particle (Fig. 20.11). The angle
𝜃 between the directions of propagation of the radiation and the velocity vector of
a particle is determined by the equation
$$\cos\theta = \frac{c/n}{v} = \frac{c}{nv}. \tag{20.27}$$
The Vavilov-Cerenkov effect finds widespread application in experimental
equipment. In the so-called Cerenkov counters, a light pulse produced by a fast

³In 1958, Cerenkov, Tamm, and Frank were awarded a Nobel prize for their work.
Fig. 20.11: Vavilov-Cerenkov radiation is emitted not in all directions, but only along
the generatrices of a cone whose axis coincides with the direction of the velocity of the
particle. The angle 𝜃 is the angle between the direction of propagation of the radiation
and the velocity vector of the particle.

charged particle is transformed with the aid of a photomultiplier⁴ into a current
pulse. To make such a counter function, the energy of a particle must exceed the
threshold value determined by the condition 𝑣 = 𝑐/𝑛. Therefore, Cerenkov counters
make it possible not only to register particles, but also to assess their energy. It
is even possible to determine the angle 𝜃 between the direction of a flash and the
velocity of the particle. This allows us to use Eq. (20.27) to calculate the velocity
(and, consequently, also the energy) of a particle.
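
As a rough numerical illustration of Eq. (20.27) and of the threshold condition 𝑣 = 𝑐/𝑛, the following Python sketch evaluates the emission angle and the threshold kinetic energy of an electron. The refractive index 1.33 (water), the rounded electron rest energy 0.511 MeV, the sample velocity 0.99𝑐, and all function names are assumptions introduced here for illustration; the kinetic energy is taken from the relativistic relation 𝑇 = (𝛾 − 1)𝑚𝑐².

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # rounded electron rest energy (assumed value)

def cherenkov_angle(beta, n):
    """Return theta in radians from cos(theta) = c/(n*v) = 1/(n*beta).

    Returns None when v <= c/n, i.e. below the radiation threshold.
    """
    cos_theta = 1.0 / (n * beta)
    if cos_theta >= 1.0:
        return None
    return math.acos(cos_theta)

def threshold_kinetic_energy(rest_energy, n):
    """Kinetic energy at which v = c/n, from T = (gamma - 1) * m * c**2."""
    beta_threshold = 1.0 / n
    gamma_threshold = 1.0 / math.sqrt(1.0 - beta_threshold**2)
    return (gamma_threshold - 1.0) * rest_energy

n_water = 1.33  # assumed refractive index of water
t_threshold = threshold_kinetic_energy(ELECTRON_REST_ENERGY_MEV, n_water)
theta = cherenkov_angle(0.99, n_water)  # electron with v = 0.99 c (illustrative)

print(f"threshold kinetic energy: {t_threshold:.3f} MeV")
print(f"emission angle at v = 0.99c: {math.degrees(theta):.1f} degrees")
```

Inverting the same relation, a measured angle 𝜃 gives 𝑣 = 𝑐/(𝑛 cos 𝜃), which is how a counter of this kind can be used to estimate the velocity of a particle.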

⁴By a photomultiplier is meant an electronic multiplier whose first electrode (a photocathode) is
capable of emitting electrons under the action of light.

Chapter 21
MOVING-MEDIA OPTICS

21.1. The Speed of Light

The speed of light in a vacuum is one of the fundamental physical quantities. The
establishment of the finite nature of the speed of light had a tremendous significance
of principle. The finite nature of the speed of transmitting signals and of transmitting
interactions underlies the theory of relativity.
In view of the fact that the numerical value of the speed of light is very high,
the experimental determination of this speed is a very complicated task. The speed
of light was first determined on the basis of astronomical observations. In 1676, the
Danish astronomer Olaus Romer (1644-1710) determined the speed of light from
observations of eclipses of Jupiter’s satellites. He obtained a value of 215000 km s⁻¹.
The Earth’s motion in orbit results in the visible position of stars on the celestial
sphere changing. This phenomenon, called the aberration of light, was used in
1727 by the British astronomer James Bradley (1693-1762) to determine the speed of
light.
Assume that the direction to a star seen in a telescope is perpendicular to the
plane of the Earth’s orbit. Hence, the angle between the direction toward the star
and the vector of the Earth’s velocity 𝑣 will be 𝜋/2 during the entire year (Fig. 21.1).
Let us point the axis of the telescope directly at the star. During the time 𝜏 needed
for the light to cover the distance from the objective to the eyepiece, the telescope
will move together with the Earth over the distance 𝑣𝜏 in a direction at right angles
to the light ray. As a result, the image of the star will be displaced from the centre
of the eyepiece. For the image to be exactly at the centre of the eyepiece, the axis
of the telescope must be turned in the direction of the vector 𝒗 through the angle
whose tangent is determined by the relation
$$\tan\alpha = \frac{v}{c} \tag{21.1}$$

Fig. 21.1: Bradley's scheme for determining the speed of light from the aberration of
starlight (the figure labels the direction to the star, the objective, and the eyepiece).
The direction to a star seen in a telescope is perpendicular to the plane of the Earth’s
orbit, so that the angle between the direction toward the star and the vector of the
Earth’s velocity 𝒗 is 𝜋/2 during the entire year.

(see Fig. 21.1). In exactly the same way, raindrops falling vertically will fly through
a long tube placed on a moving cart only if the axis of the tube is inclined in the
direction of motion of the cart.
Thus, the visible position of a star is displaced relative to the true one through
the angle 𝛼. The Earth’s velocity vector constantly turns in the plane of the orbit.
Therefore, the telescope axis also turns, describing a cone about the true direction
toward the star. Accordingly, the visible position of the star on the celestial sphere
describes a circle whose angular diameter is 2𝛼. If the direction toward the star
makes an angle other than a right one with the plane of the Earth’s orbit, the visible
position of the star describes an ellipse whose major axis has the angular dimension
2𝛼. For a star in the plane of the orbit, the ellipse degenerates into a straight line.
Bradley found from astronomical observations that 2𝛼 = 40.9″ (seconds of arc). The
corresponding value of 𝑐 obtained by Eq. (21.1) is 303000 km s⁻¹.
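
The arithmetic behind this estimate can be checked with a short Python sketch of Eq. (21.1). It assumes the Earth's orbital velocity of 30 km s⁻¹ quoted later in this chapter and takes 2𝛼 as 40.9 seconds of arc; the function name is hypothetical.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)  # one second of arc in radians

def speed_of_light_from_aberration(v_earth, two_alpha_arcsec):
    """Solve tan(alpha) = v/c for c, given the angular diameter 2*alpha."""
    alpha = 0.5 * two_alpha_arcsec * ARCSEC_IN_RAD
    return v_earth / math.tan(alpha)

# Assumed inputs: v = 30 km/s, 2*alpha = 40.9 seconds of arc.
c_estimate = speed_of_light_from_aberration(30.0, 40.9)
print(f"c is about {c_estimate:.0f} km/s")
```

With these rounded inputs the result comes out close to the quoted 303000 km s⁻¹; the last digits depend on the value adopted for the orbital velocity.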
In terrestrial conditions, the speed of light was first measured by the French
scientist Armand Fizeau (1819-1896) in 1849. The layout of his experiment is shown
in Fig. 21.2. Light from source S fell on a half-silvered mirror. The light reflected
from the mirror got onto the edge of a rapidly rotating toothed disk. Every time a
space between the teeth was opposite the light beam, a light pulse was produced
that reached mirror M and was reflected back. If at the moment when the light

Fig. 21.2: Fizeau's experimental setup for measuring the speed of light. Light from source S
falls on a half-silvered mirror. The light reflected from the mirror hits the edge of a
rapidly rotating toothed disk. Every time a space between the teeth is opposite the light
beam, a light pulse is produced that reaches mirror M and is reflected back. If a space is
opposite the beam at the moment when the light returns to the disk, the reflected pulse
passes partly through the half-silvered mirror and reaches the observer’s eye. If a tooth of
the disk is in the path of the reflected pulse, the observer sees no light.

returned to the disk a space was opposite the beam, the reflected pulse passed partly
through the half-silvered mirror and reached the observer’s eye. If a tooth of the
disk was in the path of the reflected pulse, the observer saw no light.
During the time 𝜏 = 2𝑙/𝑐 needed for the light to cover the distance to mirror
M and back, the disk managed to turn through the angle 𝛥𝜔 = 𝜔𝜏 = 2𝑙𝜔/𝑐, where
𝜔 is the angular velocity of the disk. Assume that the number of disk teeth is 𝑁.
Therefore, the angle between the centres of adjacent teeth is 𝛼 = 2𝜋/𝑁. The light
did not return to the observer’s eye at such disk velocities at which the disk in the
time 𝜏 managed to turn through the angles 𝛼/2, 3𝛼/2, . . . , (𝑚 − 1/2)𝛼, etc. Hence,
the condition for the 𝑚-th blackout has the form
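$$\Delta\omega = \left(m - \frac{1}{2}\right)\alpha \quad \text{or} \quad \frac{2l\omega}{c} = \left(m - \frac{1}{2}\right)\frac{2\pi}{N}.$$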
According to this formula, knowing 𝑙, 𝑁, and the angular velocity 𝜔𝑚 at which the
𝑚-th blackout is obtained, we can find 𝑐. In Fizeau’s experiment, 𝑙 was about 8.6 km.
The value of 313000 km s⁻¹ was obtained for 𝑐.
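
The blackout condition can be solved for 𝑐 numerically, as in the Python sketch below. Only 𝑙 ≈ 8.6 km comes from the text; the number of teeth (720) and the rotation rate at the first blackout (12.6 revolutions per second) are illustrative values assumed here, and the function name is hypothetical.

```python
import math

def speed_of_light_from_blackout(l, n_teeth, omega_m, m=1):
    """Solve 2*l*omega_m/c = (m - 1/2) * 2*pi/N for c."""
    return l * omega_m * n_teeth / (math.pi * (m - 0.5))

l = 8.6e3                     # distance to mirror M in metres (from the text)
n_teeth = 720                 # assumed number of teeth
omega_1 = 2 * math.pi * 12.6  # assumed angular velocity at the first blackout, rad/s

c_estimate = speed_of_light_from_blackout(l, n_teeth, omega_1, m=1)
print(f"c is about {c_estimate / 1e3:.0f} km/s")
```

For 𝑚 = 1 the expression reduces to 𝑐 = 4𝑙𝑁𝑓, where 𝑓 is the number of revolutions per second, so the accuracy of the result is limited mainly by how sharply the blackout speed can be judged.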
In 1928, Kerr cells (see Sec. 19.7) were used to measure the speed of light. They
made it possible to interrupt a light beam with a much higher frequency (about
10⁷ s⁻¹) than when a rotating toothed disk was used. This made measurements of 𝑐
possible with 𝑙 of the order of several metres.
Albert Michelson performed several measurements of the speed of light using
the method of a rotating prism. In Michelson’s experiment conducted in 1932, light
propagated in a tube 1.6 km long from which the air was evacuated.
At present, the speed of light in a vacuum is taken equal to
$$c = (299792.5 \pm 0.1)\ \text{km s}^{-1}. \tag{21.2}$$
We must note that in all the experiments in which light was interrupted, the group
velocity of the light waves was determined, and not the phase velocity. In air, these
two velocities virtually coincide.

21.2. Fizeau’s Experiment

Up to now, we assumed that the sources, receivers, and other bodies relative to
which the propagation of light was considered are stationary. It is quite natural to
be interested in how motion of a source of light waves affects the propagation of
light. Here, it becomes necessary to indicate relative to what the motion takes place.
We established in Sec. 14.11 that the motion of a source or a receiver of sound waves
relative to the medium in which these waves are propagating affects the proceeding
of acoustic phenomena (the Doppler effect), and, consequently, can be detected.
The wave theory initially treated light as elastic waves propagating in a hypothetical
medium called universal ether. After Maxwell advanced his theory, elastic
ether was replaced by an ether that was a carrier of electromagnetic waves and
fields. By this ether was meant a special medium filling, like its elastic ether prede-
cessor, the entire space of the universe and penetrating all bodies. Since ether was a
certain medium, it would be possible to count on detecting the motion of bodies,
for example light sources or receivers, with respect to this medium. In particular,
the existence of an “ether wind” blowing around the Earth in its motion about the
Sun ought to be expected.
Galileo’s principle of relativity was established in mechanics. According to it,
all inertial reference frames are equivalent in a mechanical respect. The detection
of ether would make it possible to separate (with the aid of optical phenomena) a
special (related to ether) predominant, absolute reference frame. Therefore, motion
of the other frames could be considered relative to this absolute frame.
Thus, establishing how universal ether interacts with moving bodies
was a matter of principle. Three possibilities could be assumed: (1) ether is absolutely
not disturbed by moving bodies, (2) ether is partly carried along by moving bodies,
acquiring a velocity of 𝛼𝑣, where 𝑣 is the velocity of a body relative to the absolute
reference frame, and 𝛼 is a drag coefficient less than unity, and (3) ether is completely
carried along by moving bodies, for example by the Earth, in the same way as a body
in its motion carries along the layers of gas adjoining its surface. The last possibility,
however, is disproved by the existence of the phenomenon of light aberration. We
established in the preceding section that the change in the visible position of stars
can be explained by the motion of the telescope relative to the reference frame
(medium) in which the light wave is propagating.
To find out whether ether is carried along by moving bodies, Fizeau conducted
the following experiment in 1851. A parallel beam of light from source S was split by

Fig. 21.3: Fizeau’s interferometer experiment to find out whether ether is carried along by
bodies moving in it (the figure labels mirrors M1, M2, and M3). A parallel beam of light
from source S was split by half-silvered plate P into two beams 1 and 2.

half-silvered plate P into two beams 1 and 2 (Fig. 21.3). As a result of reflection from
mirrors M1, M2 and M3, the beams, after completing the same total path 𝐿, again
reached plate P. Beam 1 partly passed through P, while beam 2 was partly reflected.
As a result, two coherent beams 1′ and 2′ were set up. They produced an interference
pattern in the form of fringes in the focal plane of a telescope. Two tubes along
which water could be passed with the velocity 𝑢 in the directions indicated by the
arrows were installed in the paths of beams 1 and 2. Ray 2 propagated in both tubes
opposite to the flow of the water, and ray 1 with the flow.
When the water was stationary, beams 1 and 2 covered the path 𝐿 in the same
time. If water in its motion even partly carries along ether, then when the flow of
the water was switched on, ray 2, which propagates opposite to the flow, would
spend more time to cover the path 𝐿 than ray 1 travelling in the direction of flow. As
a result, a certain path difference will appear between the rays, and the interference
pattern will be displaced.
The path difference we are interested in appears only in the path of the rays in
the water. This path has the length 2𝑙. Let the velocity of light in the water relative
to the ether be 𝑣. When ether is not carried along by the water, the speed of light
relative to the arrangement will coincide with 𝑣. Let us assume that the water in its
motion partly carries along the ether, imparting to it the velocity 𝛼𝑢 relative to the
arrangement (𝑢 is the velocity of the water, and 𝛼 is the drag coefficient). Hence, the
velocity of light relative to the arrangement will be 𝑣 + 𝛼𝑢 for ray 1 and 𝑣 − 𝛼𝑢 for
ray 2. Ray 1 covers the path 2𝑙 during the time 𝑡1 = 2𝑙/(𝑣 + 𝛼𝑢), and ray 2 during
the time 𝑡2 = 2𝑙/(𝑣 − 𝛼𝑢). It can be seen from Eq. (16.54) that the optical length of a
path to cover which the time 𝑡 is required equals 𝑐𝑡. Hence, the path difference of
rays 1 and 2 is 𝛥 = 𝑐(𝑡₂ − 𝑡₁). Dividing 𝛥 by 𝜆₀, we get the number of fringes by
which the interference pattern will be displaced when the flow of water is switched
on:
$$\Delta N = \frac{c(t_2 - t_1)}{\lambda_0} = \frac{c}{\lambda_0}\left(\frac{2l}{v - \alpha u} - \frac{2l}{v + \alpha u}\right) = \frac{4cl\alpha u}{\lambda_0 (v^2 - \alpha^2 u^2)}.$$
Fizeau discovered that the interference fringes are indeed displaced. The value
of the drag coefficient corresponding to this displacement was
$$\alpha = 1 - \frac{1}{n^2}, \tag{21.3}$$
where 𝑛 is the refractive index of water. Thus, Fizeau’s experiment showed that
ether (if it exists) is carried along by moving water only partly.
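
To get a feeling for the size of the displacement, the fringe formula can be evaluated together with Eq. (21.3) in a short Python sketch. The tube length, flow velocity, wavelength, and refractive index below are hypothetical round numbers chosen only for illustration; they are not the parameters of Fizeau's actual apparatus, and the function names are introduced here.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def drag_coefficient(n):
    """Drag coefficient of Eq. (21.3)."""
    return 1.0 - 1.0 / n**2

def fringe_shift(l, u, n, lambda0, alpha):
    """Delta N = 4*c*l*alpha*u / (lambda0 * (v**2 - alpha**2 * u**2)), with v = c/n."""
    v = C / n
    return 4.0 * C * l * alpha * u / (lambda0 * (v**2 - (alpha * u) ** 2))

# Hypothetical illustrative values (assumptions, not from the text):
l = 1.5            # length of one tube, m (path in water is 2*l)
u = 7.0            # velocity of the water, m/s
n = 1.333          # refractive index of water
lambda0 = 5.3e-7   # wavelength in vacuum, m

alpha = drag_coefficient(n)
shift = fringe_shift(l, u, n, lambda0, alpha)
print(f"alpha = {alpha:.3f}, expected shift = {shift:.3f} fringe")
```

Setting `alpha` to zero in the same function gives no displacement at all, which corresponds to ether that is not disturbed by the moving water.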
It is easy to see that the result of Fizeau’s experiment is explained by the relativistic
law of velocity addition. According to the first of equations (8.27) in Vol. I,
the velocities 𝑣ₓ and 𝑣′ₓ of a body in frames K and K′ are related by the expression
$$v_x = \frac{v'_x + v_0}{1 + v_0 v'_x / c^2} \tag{21.4}$$
(𝑣₀ is the velocity of the frame K′ relative to the frame K).
Let us relate the reference frame K to Fizeau’s instrument, and the frame K′ to
the moving water. Now, the part of 𝑣₀ will be played by the velocity of the water 𝑢,
that of 𝑣′ₓ by the velocity of the light relative to the water equal to 𝑐/𝑛, and, finally,
the part of 𝑣ₓ will be played by the velocity of the light relative to the instrument
𝑣inst. Introduction of these values into Eq. (21.4) yields
$$v_{\text{inst}} = \frac{c/n + u}{1 + u(c/n)/c^2} = \frac{(c/n) + u}{1 + u/(cn)}.$$
The velocity of the water 𝑢 is much smaller than 𝑐. The expression obtained can
therefore be simplified as follows:
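$$v_{\text{inst}} = \frac{(c/n) + u}{1 + u/(cn)} \approx \left(\frac{c}{n} + u\right)\left(1 - \frac{u}{cn}\right) \approx \frac{c}{n} + u\left(1 - \frac{1}{n^2}\right) \tag{21.5}$$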
[we have disregarded the term 𝑢2 /(𝑐𝑛)].
According to classical notions, the velocity of light relative to the instrument
𝑣inst equals the sum of the velocity of light relative to ether, i.e., 𝑐/𝑛, and of the
velocity of ether relative to the instrument, i.e., 𝛼𝑢:
$$v_{\text{inst}} = \frac{c}{n} + \alpha u.$$
A comparison with Eq. (21.5) gives the value obtained by Fizeau for the drag coeffi-
cient 𝛼 [see Eq. (21.3)].
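
How closely the approximate expression follows the exact relativistic one can be checked numerically. The Python sketch below compares the two for water; the refractive index 1.333 and the flow velocity 7 m s⁻¹ are assumed illustrative values, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def v_inst_relativistic(n, u):
    """Exact velocity addition: (c/n + u) / (1 + u/(c*n))."""
    return (C / n + u) / (1.0 + u / (C * n))

def v_inst_partial_drag(n, u):
    """First-order expression: c/n + (1 - 1/n**2) * u."""
    return C / n + (1.0 - 1.0 / n**2) * u

n, u = 1.333, 7.0  # assumed values for water
exact = v_inst_relativistic(n, u)
approx = v_inst_partial_drag(n, u)

# Effective drag coefficient implied by the exact formula
alpha_effective = (exact - C / n) / u
print(f"difference between the two expressions: {exact - approx:.2e} m/s")
print(f"effective drag coefficient: {alpha_effective:.4f}")
```

For laboratory flow velocities the two expressions are indistinguishable in practice, since the neglected terms are of the order of 𝑢²/𝑐.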
It must be borne in mind that only the velocity of light in a vacuum is the same
in all reference frames. Its velocity in a substance differs in different reference

Fig. 21.4: Michelson and Morley experiment. A brick foundation supported an annular iron
trough with mercury. A wooden float having the shape of the bottom half of a longitudinally
cut doughnut floated on the mercury. The float carried a massive square stone slab. This
design made it possible to smoothly turn the slab about the vertical axis of the arrangement.

frames. It has the value 𝑐/𝑛 in the frame associated with the medium in which the
light is propagating.

21.3. Michelson’s Experiment

In 1881, Michelson carried out his famous experiment by means of which he counted
on detecting the motion of the Earth relative to ether (the ether wind). In 1887, he
repeated his experiment together with Morley on an improved instrument. The
arrangement used by Michelson and Morley is shown in Fig. 21.4. A brick foundation
supported an annular iron trough with mercury. A wooden float having the shape of
the bottom half of a longitudinally cut doughnut floated on the mercury. The float
carried a massive square stone slab. This design made it possible to smoothly turn
the slab about the vertical axis of the arrangement. A Michelson interferometer (see
Fig. 17.6) was installed on the slab. The interferometer was modified so that both
rays before returning to the half-silvered plate cover a distance coinciding with the
diagonal of the slab several times. A diagram of the path of the rays is shown in
Fig. 21.5. The symbols in this figure correspond to those used in Fig. 17.16.
The experiment was based on the following reasoning. Let us assume that
interferometer arm PM2 (Fig. 21.6) coincides with the direction of motion of the
Earth relative to ether. Consequently, the time needed for ray 1 to cover the path to

Fig. 21.5: Modified Michelson interferometer (the figure labels the light source, the lens,
plates P1 and P2, mirrors M1 and M2, the movable mirror, and the telescope). The
interferometer was modified so that both rays, before returning to the half-silvered plate,
cover a distance coinciding with the diagonal of the slab several times. The symbols in this
figure correspond to those used in Fig. 17.16.

mirror M1 and back will differ from the time needed for ray 2 to cover path PM2P.
As a result, even when the lengths of both arms are equal, rays 1 and 2 will acquire
a certain path difference. If we turn the arrangement through 90°, the arms will
exchange places, and the path difference will change its sign. This should result in
displacement of the interference pattern whose magnitude, as shown by calculations
performed by Michelson, could be detected quite readily.
To calculate the expected displacement of the interference pattern, let us find
the time spent by rays 1 and 2 to cover the relevant paths. Assume that the Earth’s
velocity relative to the ether is 𝑣. If the ether is not carried along by the Earth and
the velocity of light relative to the ether is 𝑐 (the refractive index of air is practically
equal to unity), then the velocity of light relative to the instrument will be 𝑐 − 𝑣
for direction PM2 and 𝑐 + 𝑣 for direction M2P. Hence, the time needed for ray 2 is
determined by the expression
$$t_2 = \frac{l}{c - v} + \frac{l}{c + v} = \frac{2lc}{c^2 - v^2} = \frac{2l}{c}\,\frac{1}{1 - v^2/c^2} \approx \frac{2l}{c}\left(1 + \frac{v^2}{c^2}\right) \tag{21.6}$$
(the Earth’s velocity along its orbit is 30 km s⁻¹; therefore, 𝑣²/𝑐² = 10⁻⁸ ≪ 1).
Fig. 21.6: Reasoning behind the Michelson-Morley experiment (the figure labels mirrors M1
and M2). It is assumed that the interferometer arm PM2 coincides with the direction of
motion of the Earth relative to ether. Then the time needed for ray 1 to cover the path to
mirror M1 and back will differ from the time needed for ray 2 to cover path PM2P. As a
result, even when the lengths of both arms are equal, rays 1 and 2 will acquire a certain
path difference. When the arrangement is turned through 90°, the arms exchange places, and
the path difference changes its sign.

Fig. 21.7: Considerations for calculating the time 𝑡₁. Suppose that a launch developing the
velocity 𝑐 relative to water has to cross a river with a current velocity of 𝑣 in a
direction strictly perpendicular to its banks. For the launch to travel in the required
direction, its velocity 𝒄 relative to the water must be directed as shown here.

Before commencing to calculate the time 𝑡₁, let us consider the following example
from mechanics. Suppose that a launch developing the velocity 𝑐 relative to
water has to cross a river with a current velocity of 𝑣 in a direction strictly
perpendicular to its banks (Fig. 21.7). For the launch to travel in the required direction, its
velocity 𝒄 relative to the water must be directed as shown in the figure. Therefore,
the velocity of the launch relative to the banks will be |𝒄 + 𝒗| = √(𝑐² − 𝑣²). The
velocity of ray 1 relative to the arrangement (as assumed by Michelson) will be the
same. Consequently, the time taken by ray 1 is¹
$$t_1 = \frac{2l}{\sqrt{c^2 - v^2}} = \frac{2l}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}} \approx \frac{2l}{c}\left(1 + \frac{1}{2}\frac{v^2}{c^2}\right). \tag{21.7}$$
Substituting for 𝑡2 and 𝑡1 in the expression 𝛥 = 𝑐(𝑡2 − 𝑡1 ) their values from
expressions (21.6) and (21.7), we get the path difference for rays 1 and 2:
$$\Delta = 2l\left[\left(1 + \frac{v^2}{c^2}\right) - \left(1 + \frac{1}{2}\frac{v^2}{c^2}\right)\right] = l\,\frac{v^2}{c^2}.$$
When the arrangement is turned through 90°, the path difference changes its sign.
Consequently, the number of fringes by which the interference pattern will be
displaced is
$$\Delta N = \frac{2\Delta}{\lambda_0} = 2\,\frac{l}{\lambda_0}\,\frac{v^2}{c^2}. \tag{21.8}$$

¹We have used the formulas √(1 − 𝑥) ≈ 1 − 𝑥/2 and 1/(1 − 𝑥) ≈ 1 + 𝑥, for small values of 𝑥.

The arm length 𝑙 (taking into account multifold reflections) was 11 m. The
wavelength of the light used by Michelson and Morley was 0.59 µm. The use of
these values in Eq. (21.8) gives
$$\Delta N = \frac{2 \times 11}{0.59 \times 10^{-6}} \times 10^{-8} = 0.37 \approx 0.4\ \text{fringe}.$$
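
The same estimate can be reproduced with a few lines of Python, which makes it easy to see how the expected shift depends on the arm length and on the assumed velocity. The function name is hypothetical; the numbers are those given in the text, with 𝑐 rounded to 3 × 10⁸ m s⁻¹.

```python
def expected_fringe_shift(l, lambda0, v, c):
    """Fringe displacement of Eq. (21.8): 2 * (l / lambda0) * (v / c)**2."""
    return 2.0 * (l / lambda0) * (v / c) ** 2

l = 11.0            # effective arm length, m
lambda0 = 0.59e-6   # wavelength, m
v = 30e3            # Earth's orbital velocity, m/s
c = 3.0e8           # speed of light, m/s (rounded)

shift = expected_fringe_shift(l, lambda0, v, c)
print(f"expected displacement: {shift:.2f} fringe")
```

Because the shift is proportional to 𝑙, lengthening the optical path by multiple reflections was the natural way of raising the sensitivity of the instrument.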
The arrangement made it possible to detect a displacement of the order of 0.01 fringe.
But no displacement of the interference pattern was detected. The experiment was
repeated during different times of the day to exclude the possibility of the horizon
plane being perpendicular to the vector of the Earth’s orbital velocity at the moment
of measurements. Subsequently, the experiment was repeated many times during
different seasons of the year (during a year, the vector of the Earth’s orbital velocity
turns in space through 360°), and negative results were constantly obtained. The
attempt to detect an ether wind was not successful. Universal ether remained elusive.
Several attempts were made to explain the negative result of Michelson’s ex-
periment without refuting the hypothesis of the existence of universal ether. But
all these attempts were groundless. An exhaustive non-contradictory explanation
of all the experimental facts including the results of Michelson’s experiment was
given by Albert Einstein in 1905. He arrived at the conclusion that universal ether,
i.e., a special medium that could serve as an absolute reference frame, does not exist.
Accordingly, Einstein extended the mechanical principle of relativity to all physical
phenomena without any exception. He further postulated in accordance with exper-
imental data that the speed of light in a vacuum is the same in all inertial reference
frames and does not depend on the motion of the light sources and receivers.
The principle of relativity and the principle of the constancy of the speed of
light form the foundation of the special theory of relativity developed by Einstein
(see Chapter 8 of Vol. I).

21.4. The Doppler Effect

In acoustics, the change in frequency due to the Doppler effect is determined by the
velocities of the source and the receiver relative to the medium that is the carrier of
the sound waves [see Eq. (14.78)]. The Doppler effect also exists for light waves. But
there is no special medium that would serve as the carrier of electromagnetic waves.
Therefore, the Doppler displacement of the frequency of light waves is determined
only by the relative velocity of the source and the receiver.
Let us associate the origin of coordinates of the frame K with a light source and
the origin of coordinates of the frame K′ with a receiver (Fig. 21.8). We shall direct
the axes 𝑥 and 𝑥′, as usual, along the velocity vector 𝒗 with which the frame K′ (i.e.,
the receiver) is moving relative to the frame K (i.e., the source). The equation of a

Fig. 21.8: The Doppler effect for light waves (the figure labels the source and the
receiver). The origin of coordinates of the frame K is associated with a light source, and
the origin of coordinates of the frame K′ with a receiver. The axes 𝑥 and 𝑥′ are directed
along the velocity vector 𝒗 with which the frame K′ (the receiver) is moving relative to the
frame K (the source).

plane light wave emitted by the source in the direction of the receiver will have the
following form in the frame K:
$$E(x, t) = A \cos\left[\omega\left(t - \frac{x}{c}\right) + \alpha\right]. \tag{21.9}$$
Here, 𝜔 is the frequency of a wave registered in the reference frame associated with
the source, i.e., the frequency of oscillations of the source. We assume that the light
wave is propagating in a vacuum; therefore, the phase velocity is 𝑐.
According to the principle of relativity, the laws of nature have the same form
in all inertial reference frames. Hence, in the frame K′, the wave given by Eq. (21.9)
will be described by the equation
$$E(x', t') = A' \cos\left[\omega'\left(t' - \frac{x'}{c}\right) + \alpha'\right], \tag{21.10}$$
where 𝜔′ is the frequency registered in the reference frame K′, i.e., the frequency
picked up by the receiver. We have provided all the quantities except 𝑐, which is the
same in all reference frames, with primes.
We can obtain an equation of a wave in the frame K′ from an equation in the
frame K by passing over from 𝑥 and 𝑡 to 𝑥′ and 𝑡′ with the aid of the Lorentz
transformations. Introducing instead of 𝑥 and 𝑡 in Eq. (21.9) their values in accordance
with Eqs. (8.17) of Vol. I, we get
$$E(x', t') = A \cos\left[\omega\left(\frac{t' + (v/c^2)x'}{\sqrt{1 - v^2/c^2}} - \frac{x' + vt'}{c\sqrt{1 - v^2/c^2}}\right) + \alpha\right]$$
(the part of 𝑣₀ is played by 𝑣). The latter expression is easily transformed into the
following one:
$$E(x', t') = A \cos\left[\omega\,\frac{1 - v/c}{\sqrt{1 - v^2/c^2}}\left(t' - \frac{x'}{c}\right) + \alpha\right]. \tag{21.11}$$
Equation (21.11) describes the same wave in the frame K′ as Eq. (21.10). Therefore,
the following relation must be observed:
$$\omega' = \omega\,\frac{1 - v/c}{\sqrt{1 - v^2/c^2}} = \omega\left(\frac{1 - v/c}{1 + v/c}\right)^{1/2}.$$
Let us change our notation: we shall denote the frequency 𝜔 of the source by 𝜔₀,
and the frequency 𝜔′ of the receiver by 𝜔. The preceding equation will thus become
$$\omega = \omega_0\left(\frac{1 - v/c}{1 + v/c}\right)^{1/2}. \tag{21.12}$$
Passing over from the angular frequency to the ordinary one, we have
$$\nu = \nu_0\left(\frac{1 - v/c}{1 + v/c}\right)^{1/2}. \tag{21.13}$$
The velocity 𝑣 of the receiver relative to the source in Eqs. (21.12) and (21.13) is an
algebraic quantity. When the receiver moves away from the source, 𝑣 > 0, and by
Eq. (21.12), 𝜔 < 𝜔0 ; when the receiver approaches the source, 𝑣 < 0, so that 𝜔 > 𝜔0 .
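When 𝑣 ≪ 𝑐, Eq. (21.12) can be written approximately as follows:
$$\omega \approx \omega_0\,\frac{1 - (1/2)(v/c)}{1 + (1/2)(v/c)} \approx \omega_0\left(1 - \frac{1}{2}\frac{v}{c}\right)\left(1 - \frac{1}{2}\frac{v}{c}\right).$$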
Hence, after Taylor expansion around 𝑣 = 0 and limiting ourselves to terms of the
order of 𝑣/𝑐, we get
$$\omega = \omega_0\left(1 - \frac{v}{c}\right). \tag{21.14}$$
From this formula, we can find the relative change in the frequency:
$$\frac{\Delta\omega}{\omega} = -\frac{v}{c} \tag{21.15}$$
(𝛥𝜔 stands for 𝜔 − 𝜔₀).
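
The quality of the first-order approximation can be judged by comparing it numerically with the exact relation (21.12). The Python sketch below does this for a few values of 𝑣/𝑐; the function names and the sample values are assumptions made for illustration.

```python
import math

def doppler_exact(omega0, beta):
    """Eq. (21.12): omega = omega0 * sqrt((1 - v/c) / (1 + v/c)), with beta = v/c."""
    return omega0 * math.sqrt((1.0 - beta) / (1.0 + beta))

def doppler_first_order(omega0, beta):
    """Eq. (21.14): omega = omega0 * (1 - v/c)."""
    return omega0 * (1.0 - beta)

for beta in (1e-4, 1e-2, 0.5):  # illustrative values of v/c
    exact = doppler_exact(1.0, beta)
    approx = doppler_first_order(1.0, beta)
    print(f"v/c = {beta:g}: exact {exact:.8f}, first order {approx:.8f}")
```

The discrepancy between the two expressions is of the order of (𝑣/𝑐)²/2, so it is negligible for the velocities of stars and molecules but not for particles moving at a considerable fraction of 𝑐. A negative `beta` (receiver approaching the source) gives a frequency above 𝜔₀ in both functions.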
We can show that a transverse Doppler effect exists for light waves in addition
to the longitudinal effect we have considered. It consists in a reduction in the
frequency picked up by the receiver observed when the vector of the relative velocity
is directed at right angles to the straight line passing through the receiver and the
source² (when, for example, the source travels along a circle at whose centre the
receiver is). In this case, the frequency 𝜔0 in the frame of the source is associated
with the frequency 𝜔 in the frame of the receiver by the relation
$$\omega = \omega_0\left(1 - \frac{v^2}{c^2}\right)^{1/2} \approx \omega_0\left(1 - \frac{1}{2}\frac{v^2}{c^2}\right). \tag{21.16}$$

²We remind our reader that the transverse Doppler effect does not exist for sound waves.

The relative change in frequency in the transverse Doppler effect
$$\frac{\Delta\omega}{\omega} = -\frac{1}{2}\frac{v^2}{c^2}, \tag{21.17}$$
is proportional to the square of the ratio 𝑣/𝑐 and, consequently, is considerably
smaller than in the longitudinal effect for which the relative change in the frequency
is proportional to the first power of 𝑣/𝑐.
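
A short Python sketch shows how small the transverse effect is in comparison with the longitudinal one. The value 𝑣/𝑐 = 0.01 is an assumed illustrative figure, the shifts are taken relative to 𝜔₀, and the function names are hypothetical.

```python
import math

def transverse_shift_exact(beta):
    """Relative shift (omega - omega0)/omega0 from Eq. (21.16), beta = v/c."""
    return math.sqrt(1.0 - beta**2) - 1.0

def transverse_shift_approx(beta):
    """Second-order estimate of Eq. (21.17): -(1/2) * (v/c)**2."""
    return -0.5 * beta**2

def longitudinal_shift_first_order(beta):
    """First-order longitudinal estimate of Eq. (21.15): -v/c."""
    return -beta

beta = 0.01  # assumed illustrative value of v/c
ratio = transverse_shift_approx(beta) / longitudinal_shift_first_order(beta)
print(f"transverse shift:   {transverse_shift_exact(beta):.3e}")
print(f"longitudinal shift: {longitudinal_shift_first_order(beta):.3e}")
print(f"ratio transverse/longitudinal: {ratio:.3e}")
```

The ratio of the two shifts equals 𝑣/(2𝑐), which is why detecting the transverse effect requires both fast emitters and a way of excluding the much larger longitudinal contribution.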
The existence of the transverse Doppler effect was proved experimentally by
the American physicist Herbert Ives (1882-1953) in 1938. He determined the change in
the frequency of emission of hydrogen atoms in canal rays (see the last paragraph
of Sec. 12.6).
The velocity of the atoms was about 2 × 10⁶ m s⁻¹. These experiments were a
direct experimental confirmation of the correctness of Lorentz transformations.
In the general case, the vector of the relative velocity can be resolved into two
components of which one is directed along the ray, and the other at right angles to it.
The first component gives rise to the longitudinal, and the second to the transverse
Doppler effect.
The longitudinal Doppler effect is used to determine the radial velocity of
stars. By measuring the relative shift of the lines in the spectra of stars, we can use
Eq. (21.12) to determine 𝑣.
The thermal motion of the molecules of a luminous gas, owing to the Doppler
effect, leads to broadening of the spectral lines. As a result of the chaotic nature
of the thermal motion, all the directions of the molecules’ velocities relative to
a spectrograph are equally probable. Therefore, the radiation registered by the
instrument contains all the frequencies in the interval from 𝜔₀(1 − 𝑣/𝑐) to 𝜔₀(1 + 𝑣/𝑐),
where 𝜔0 is the frequency emitted by the molecules, and 𝑣 is the velocity of thermal
motion [see Eq. (21.14)]. The registered width of a spectral line is thus 2𝜔0 𝑣/𝑐. The
quantity
$$\delta\omega_{\text{D}} = 2\omega_0\,\frac{v}{c}, \tag{21.18}$$
is called the Doppler width of a spectral line (𝑣 stands for the most probable
velocity of the molecules). The magnitude of the Doppler broadening of spectral
lines makes it possible to assess the velocity of thermal motion of the molecules
and, consequently, the temperature of a luminous gas.
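
The connection between the Doppler width and the temperature can be sketched in Python as follows. The sketch assumes the Maxwell expression for the most probable velocity, 𝑣 = √(2𝑘𝑇/𝑚), and uses atomic hydrogen at 1000 K purely as an illustration; the constants are rounded standard values and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
C = 299_792_458.0      # speed of light in vacuum, m/s
AMU = 1.66053907e-27   # atomic mass unit, kg

def doppler_relative_width(temperature, mass):
    """delta_omega_D / omega0 = 2*v/c with v = sqrt(2*k*T/m) (most probable velocity)."""
    v = math.sqrt(2.0 * K_B * temperature / mass)
    return 2.0 * v / C

def temperature_from_width(relative_width, mass):
    """Invert Eq. (21.18): v = c * relative_width / 2, then T = m * v**2 / (2*k)."""
    v = 0.5 * relative_width * C
    return mass * v**2 / (2.0 * K_B)

m_hydrogen = 1.008 * AMU  # mass of a hydrogen atom (assumed example)
width = doppler_relative_width(1000.0, m_hydrogen)
recovered = temperature_from_width(width, m_hydrogen)
print(f"relative Doppler width at 1000 K: {width:.2e}")
print(f"temperature recovered from the width: {recovered:.1f} K")
```

Since the width grows only as the square root of 𝑇/𝑚, lines of light atoms in hot gases are broadened most noticeably.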

APPENDICES

A.1. List of Symbols

𝐴 amplitude; gas amplification; work
𝑨 vector potential of magnetic field†
𝑎 amplitude
𝒂 acceleration; vector
𝐵 Kerr constant
𝑩 magnetic induction
𝐶 capacitance; circulation of vector; Curie constant
𝑐 electromagnetic constant; speed of light
𝐷 angular dispersion
𝑫 electric displacement
𝑑 period of diffraction grating; separation distance of capacitor plates
div divergence
𝐸 illuminance; Young’s modulus
𝑬 electric field strength
𝑬∗ strength of extraneous force field
E electromotive force (e.m.f.)
𝑒 base of natural logarithms; positive elementary charge
𝒆̂ unit vector
𝐹 Faraday constant
𝑭 force
𝑓 focal length
𝐺 shear modulus
† The magnitude of a vector is denoted by the same symbol as the vector itself,
but in ordinary italic (sloping) type.
𝑯 magnetic field strength
ℏ Planck’s constant ℎ divided by 2𝜋
𝐼 current; luminous intensity; sound intensity
𝑖 imaginary unit (𝑖 = √−1)
𝒋 current density; density of energy flux
𝑲 momentum
𝑘 constant of proportionality; wavenumber
𝒌 wavevector
𝐿 inductance; loudness level; luminance; optical path
𝑳 angular momentum
𝑙 length; mean free path
𝒍 displacement
𝑀 luminous emittance; magnification; mass of mole
𝑴 magnetization; moment of force
𝑚 mass
𝑁 demagnetization factor; number
𝑁A Avogadro constant
𝑛 number; refractive index
𝑃 degree of polarization; optical power; power; probability; radiated power
𝑷 force of gravity; polarization
𝑝 pressure
𝒑 dipole moment; electric moment
𝑄 amount of heat; quality of oscillator circuit
𝑞 charge
𝑅 molar gas constant; radius; resistance; resolving power
𝑅H Hall coefficient
ℜ real number
𝑟 distance; radius
𝒓 position vector
𝑆 area
𝑺 Poynting vector
𝑇 absolute temperature; period
𝑻 torque
𝑡 time
𝑈 voltage
𝑢 mobility of ion
𝒖 velocity
𝑉 Verdet constant; visibility
𝒗 velocity
𝑊 energy
𝑤 energy density
𝑋 reactance
𝑥 coordinate
𝑦 coordinate
𝑍 atomic number of element; impedance
𝑧 coordinate; valence
𝛼 angle; drag coefficient; initial phase of oscillations; rotational constant
𝛽 angle; polarizability of molecule; relative velocity
𝛾 angle; attenuation coefficient
𝛥 difference in optical path; increment; Laplacian operator
𝛿 density of metal; fraction of energy; phase difference
𝜀 relative permittivity; strain
𝜀0 electric constant
𝜃 angle; polar angle; polar coordinate
𝜘 thermal conductivity; wave absorption coefficient
𝜘0 extinction coefficient
𝜆 linear charge density; logarithmic decrement; wavelength
𝜇 permeability
𝜇B Bohr magneton
𝜇0 magnetic constant
𝜈 frequency
𝜉 displacement of wave point
𝜋 ratio of circumference to diameter
𝜌 coherence radius; density; reflection coefficient; resistivity; volume density of charge
𝜎 conductivity; cross-sectional area; stress; surface charge density
𝜏 retardation time; time; time constant of a circuit; transmission coefficient
𝛷 flux
𝜑 angle; azimuthal angle; potential
𝜒 electric susceptibility
𝜒m magnetic susceptibility
𝛹 flux linkage; total magnetic flux
𝜓 angle
𝛺 solid angle
𝜔 angular frequency
𝝎 angular velocity
∇ del (Hamiltonian) operator
A.2. Units of Electrical and Magnetic Quantities in the International System (SI) and in the Gaussian System

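The electric constant
\[ \varepsilon_0 = \frac{1}{4\pi (2.997925)^2 \times 10^{9}}\ \mathrm{F\,m^{-1}} \approx \frac{1}{4\pi \times 9 \times 10^{9}}\ \mathrm{F\,m^{-1}}. \]
The magnetic constant
\[ \mu_0 = 4\pi \times 10^{-7}\ \mathrm{H\,m^{-1}}. \]
The electromagnetic constant
\[ c = \frac{1}{\sqrt{\varepsilon_0 \mu_0}} = 2.997925 \times 10^{8}\ \mathrm{m\,s^{-1}} \approx 3 \times 10^{8}\ \mathrm{m\,s^{-1}}. \]
The relations between the units are given approximately. To obtain the exact
values, substitute 2.997925 for 3 and $(2.997925)^2$ for 9.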
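
The three constants above are linked by c = 1/√(ε0 μ0), so ε0 can be recomputed from μ0 and c. The short Python sketch below does this and compares the result with the rounded form that uses 9 in place of (2.997925)². The names `C_LIGHT`, `MU_0` and `electric_constant` are illustrative and do not come from the text.

```python
import math

C_LIGHT = 2.997925e8            # m/s, the value of c quoted in the text
MU_0 = 4 * math.pi * 1e-7       # H/m, the magnetic constant


def electric_constant(c=C_LIGHT, mu0=MU_0):
    """Return eps0 = 1 / (mu0 * c^2) in F/m, from c = 1/sqrt(eps0 * mu0)."""
    return 1.0 / (mu0 * c ** 2)


eps0_exact = electric_constant()
# Rounded form: 9 replaces (2.997925)^2 in the denominator.
eps0_rounded = 1.0 / (4 * math.pi * 9e9)
relative_error = abs(eps0_exact - eps0_rounded) / eps0_exact

print(f"eps0 (from mu0 and c) = {eps0_exact:.6e} F/m")
print(f"eps0 (rounded)        = {eps0_rounded:.6e} F/m")
print(f"relative error        = {relative_error:.2e}")
```

The relative error is of the order of 0.1 per cent, which is the accuracy to expect from any relation that uses 3 and 9 instead of 2.997925 and (2.997925)².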
Table A.1

| Quantity and symbol | SI unit and symbol | Gaussian unit and symbol | Relation |
|---|---|---|---|
| Force 𝐹 | newton (N) | dyne (dyn) | 1 N = 10⁵ dyn |
| Work 𝐴 and energy 𝑊 | joule (J) | erg (erg) | 1 J = 10⁷ erg |
| Charge 𝑞 | coulomb (C) | cgse_q | 1 C = 3 × 10⁹ cgse_q |
| Electric field strength 𝐸 | volt per metre (V m⁻¹) | cgse_E | 1 cgse_E = 3 × 10⁴ V m⁻¹ |
| Potential 𝜑, voltage 𝑈, e.m.f. ℰ | volt (V) | cgse_φ,U,ℰ | 1 cgse_φ,U,ℰ = 300 V |
| Electric moment of dipole 𝑝 | coulomb-metre (C m) | cgse_p | 1 C m = 3 × 10¹¹ cgse_p |
| Polarization 𝑃 | coulomb per metre squared (C m⁻²) | cgse_P | 1 C m⁻² = 3 × 10⁵ cgse_P |
| Electric susceptibility 𝜒 | SI_χ | cgse_χ | 1 cgse_χ = 4𝜋 SI_χ |
| Electric displacement 𝐷 | coulomb per metre squared (C m⁻²) | cgse_D | 1 C m⁻² = 4𝜋 × 3 × 10⁵ cgse_D |
| Flux of electric displacement 𝛷 | coulomb (C) | cgse_Φ | 1 C = 4𝜋 × 3 × 10⁹ cgse_Φ |
| Capacitance 𝐶 | farad (F) | centimetre (cm) | 1 F = 9 × 10¹¹ cm |
| Current 𝐼 | ampere (A) | cgse_I | 1 A = 3 × 10⁹ cgse_I |
| Current density 𝑗 | ampere per metre squared (A m⁻²) | cgse_j | 1 A m⁻² = 3 × 10⁵ cgse_j |
| Resistance 𝑅 | ohm (Ω) | cgse_R | 1 cgse_R = 9 × 10¹¹ Ω |
| Resistivity 𝜌 | ohm-metre (Ω m) | cgse_ρ | 1 cgse_ρ = 9 × 10⁹ Ω m |
| Conductivity 𝜎 | siemens per metre (S m⁻¹) | cgse_σ | 1 S m⁻¹ = 9 × 10⁹ cgse_σ |
| Magnetic induction 𝐵 | tesla (T) | gauss (Gs) | 1 T = 10⁴ Gs |
| Flux of magnetic induction 𝛷 | weber (Wb) | maxwell (Mx) | 1 Wb = 10⁸ Mx |
| Flux linkage 𝛹 | weber (Wb) | maxwell (Mx) | 1 Wb = 10⁸ Mx |
| Magnetic moment 𝑝m | ampere-metre squared (A m²) | cgsm_pm | 1 A m² = 10³ cgsm_pm |
| Magnetization 𝑀 | ampere per metre (A m⁻¹) | cgsm_M | 1 cgsm_M = 10³ A m⁻¹ |
| Magnetic field strength 𝐻 | ampere per metre (A m⁻¹) | oersted (Oe) | 1 A m⁻¹ = 4𝜋 × 10⁻³ Oe |
| Magnetic susceptibility 𝜒m | SI_χm | cgsm_χm | 1 cgsm_χm = 4𝜋 SI_χm |
| Inductance 𝐿 | henry (H) | centimetre (cm) | 1 H = 10⁹ cm |
| Mutual inductance 𝐿12 | henry (H) | centimetre (cm) | 1 H = 10⁹ cm |
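
The relations in Table A.1 can be used as plain multiplicative factors. The Python sketch below stores a few of them, using the rounded factor 3 for 2.997925, and converts values in either direction. The dictionary keys and the function names `si_to_gaussian` and `gaussian_to_si` are hypothetical labels chosen for this illustration. The capacitance factor is taken as 9 × 10¹¹ cm per farad, which follows from C = q/U with 1 C = 3 × 10⁹ cgse_q and 1 cgse_U = 300 V.

```python
import math

# Number of Gaussian units contained in one SI unit (rounded factors).
# Rows that the table states as "1 Gaussian unit = x SI units" are
# stored as 1/x so that every entry has the same meaning.
GAUSSIAN_PER_SI = {
    "force": 1e5,                                    # 1 N = 10^5 dyn
    "energy": 1e7,                                   # 1 J = 10^7 erg
    "charge": 3e9,                                   # 1 C = 3 x 10^9 cgse_q
    "potential": 1 / 300,                            # 1 cgse_U = 300 V
    "capacitance": 9e11,                             # 1 F = 9 x 10^11 cm
    "current": 3e9,                                  # 1 A = 3 x 10^9 cgse_I
    "resistance": 1 / 9e11,                          # 1 cgse_R = 9 x 10^11 ohm
    "magnetic_induction": 1e4,                       # 1 T = 10^4 Gs
    "magnetic_flux": 1e8,                            # 1 Wb = 10^8 Mx
    "magnetic_field_strength": 4 * math.pi * 1e-3,   # 1 A/m = 4 pi x 10^-3 Oe
    "inductance": 1e9,                               # 1 H = 10^9 cm
}


def si_to_gaussian(quantity, value):
    """Convert a value expressed in SI units to Gaussian units."""
    return value * GAUSSIAN_PER_SI[quantity]


def gaussian_to_si(quantity, value):
    """Convert a value expressed in Gaussian units to SI units."""
    return value / GAUSSIAN_PER_SI[quantity]


# Consistency check: C = q / U must hold in both systems.
farad_in_cm = si_to_gaussian("charge", 1.0) / si_to_gaussian("potential", 1.0)
print(farad_in_cm, si_to_gaussian("capacitance", 1.0))
```

Storing every factor in one direction keeps the two conversion functions symmetric, and derived rows such as capacitance or resistance can be cross-checked against the rows for charge, potential and current.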
A.3. Basic Formulas of Electricity and Magnetism in the SI and in the Gaussian System

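1. Coulomb’s law:
\[ F = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r^2}\ \text{(SI)} \qquad F = \frac{q_1 q_2}{r^2}\ \text{(GS)}. \]
2. Electric field strength (definition):
\[ \mathbf{E} = \frac{\mathbf{F}}{q}. \]
3. Field strength of point charge:
\[ E = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{\varepsilon r^2}\ \text{(SI)} \qquad E = \frac{q}{\varepsilon r^2}\ \text{(GS)}. \]
4. Field strength between charged planes and near surface of charged conductor:
\[ E = \frac{\sigma}{\varepsilon_0 \varepsilon}\ \text{(SI)} \qquad E = \frac{4\pi\sigma}{\varepsilon}\ \text{(GS)}. \]
5. Potential (definition):
\[ \varphi = \frac{W_\mathrm{p}}{q}. \]
6. Potential of field of point charge:
\[ \varphi = \frac{1}{4\pi\varepsilon_0}\,\frac{q}{\varepsilon r}\ \text{(SI)} \qquad \varphi = \frac{q}{\varepsilon r}\ \text{(GS)}. \]
7. Work of field forces on charge:
\[ A = q(\varphi_1 - \varphi_2). \]
8. Relation between 𝑬 and 𝜑:
\[ \mathbf{E} = -\nabla\varphi. \]
9. Relation between 𝜑 and 𝑬:
\[ \varphi_1 - \varphi_2 = \int_1^2 \mathbf{E} \cdot \mathrm{d}\mathbf{l}. \]
10. Curl of vector 𝑬 for electrostatic field:
\[ \nabla \times \mathbf{E} = 0. \]
11. Circulation of vector 𝑬 for electrostatic field:
\[ \oint \mathbf{E} \cdot \mathrm{d}\mathbf{l} = 0. \]
12. Electric moment of dipole:
\[ p = ql. \]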
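13. Torque acting on dipole in electric field:
\[ \mathbf{T} = \mathbf{p} \times \mathbf{E}. \]
14. Energy of dipole in field:
\[ W = -\mathbf{p} \cdot \mathbf{E}. \]
15. Dipole moment of “elastic” molecule:
\[ \mathbf{p} = \beta\varepsilon_0 \mathbf{E}\ \text{(SI)} \qquad \mathbf{p} = \beta\mathbf{E}\ \text{(GS)}. \]
16. Polarization (definition):
\[ \mathbf{P} = \frac{1}{\Delta V} \sum \mathbf{p}. \]
17. Relation between 𝑷 and 𝑬:
\[ \mathbf{P} = \chi\varepsilon_0 \mathbf{E}\ \text{(SI)} \qquad \mathbf{P} = \chi\mathbf{E}\ \text{(GS)}. \]
18. Relation between 𝑷 and volume density of bound charges:
\[ \rho' = -\nabla \cdot \mathbf{P}. \]
19. Relation between 𝑷 and surface density of bound charges:
\[ \sigma' = P_\mathrm{n}. \]
20. Electric displacement (definition):
\[ \mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}\ \text{(SI)} \qquad \mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}\ \text{(GS)}. \]
21. Divergence of vector 𝑫:
\[ \nabla \cdot \mathbf{D} = \rho\ \text{(SI)} \qquad \nabla \cdot \mathbf{D} = 4\pi\rho\ \text{(GS)}. \]
22. Gauss’s theorem for 𝑫:
\[ \oint \mathbf{D} \cdot \mathrm{d}\mathbf{S} = \sum q\ \text{(SI)} \qquad \oint \mathbf{D} \cdot \mathrm{d}\mathbf{S} = 4\pi \sum q\ \text{(GS)}. \]
23. Relation between permittivity 𝜀 and electric susceptibility 𝜒:
\[ \varepsilon = 1 + \chi\ \text{(SI)} \qquad \varepsilon = 1 + 4\pi\chi\ \text{(GS)}. \]
24. Relation between values of 𝜒 in the SI and in the Gaussian system:
\[ \chi_\mathrm{SI} = 4\pi\chi_\mathrm{GS}. \]
25. Relation between 𝑫 and 𝑬:
\[ \mathbf{D} = \varepsilon\varepsilon_0 \mathbf{E}\ \text{(SI)} \qquad \mathbf{D} = \varepsilon\mathbf{E}\ \text{(GS)}. \]
26. Relation between 𝑫 and 𝑬 in a vacuum:
\[ \mathbf{D} = \varepsilon_0 \mathbf{E}\ \text{(SI)} \qquad \mathbf{D} = \mathbf{E}\ \text{(GS)}. \]
27. 𝑫 of point charge field:
\[ D = \frac{1}{4\pi}\,\frac{q}{r^2}\ \text{(SI)} \qquad D = \frac{q}{r^2}\ \text{(GS)}. \]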
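28. Capacitance of capacitor (definition):
\[ C = \frac{q}{U}. \]
29. Capacitance of plane capacitor:
\[ C = \frac{\varepsilon_0 \varepsilon S}{d}\ \text{(SI)} \qquad C = \frac{\varepsilon S}{4\pi d}\ \text{(GS)}. \]
30. Energy of system of charges:
\[ W = \frac{1}{2} \sum q\varphi. \]
31. Energy of charged capacitor:
\[ W = \frac{CU^2}{2}. \]
32. Density of electric field energy:
\[ w = \frac{\varepsilon_0 \varepsilon E^2}{2}\ \text{(SI)} \qquad w = \frac{\varepsilon E^2}{8\pi}\ \text{(GS)}. \]
33. Current (definition):
\[ I = \frac{\mathrm{d}q}{\mathrm{d}t}. \]
34. Current density (definition):
\[ j = \frac{\mathrm{d}I}{\mathrm{d}S_\perp}. \]
35. Continuity equation:
\[ \nabla \cdot \mathbf{j} = -\frac{\partial\rho}{\partial t}. \]
36. Voltage (definition):
\[ U = \varphi_1 - \varphi_2 + \mathcal{E}_{12}. \]
37. Ohm’s law:
\[ I = \frac{U}{R}. \]
38. Ohm’s law in differential form:
\[ \mathbf{j} = \frac{1}{\rho}\,\mathbf{E} = \sigma\mathbf{E}. \]
39. Joule-Lenz law:
\[ Q = \int_0^t R I^2\,\mathrm{d}t. \]
40. Joule-Lenz law in differential form:
\[ w = \rho j^2. \]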
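41. Force of interaction of two parallel currents in a vacuum (per unit length):
\[ F = \frac{\mu_0}{4\pi}\,\frac{2 I_1 I_2}{b}\ \text{(SI)} \qquad F = \frac{1}{c^2}\,\frac{2 I_1 I_2}{b}\ \text{(GS)}. \]
42. Field of freely moving charge:
\[ \mathbf{B} = \frac{\mu_0}{4\pi}\,\frac{q(\mathbf{v} \times \mathbf{r})}{r^3}\ \text{(SI)} \qquad \mathbf{B} = \frac{1}{c}\,\frac{q(\mathbf{v} \times \mathbf{r})}{r^3}\ \text{(GS)}. \]
43. Biot-Savart law:
\[ \mathrm{d}\mathbf{B} = \frac{\mu_0}{4\pi}\,\frac{I(\mathrm{d}\mathbf{l} \times \mathbf{r})}{r^3}\ \text{(SI)} \qquad \mathrm{d}\mathbf{B} = \frac{1}{c}\,\frac{I(\mathrm{d}\mathbf{l} \times \mathbf{r})}{r^3}\ \text{(GS)}. \]
44. Lorentz force:
\[ \mathbf{F} = q\mathbf{E} + q(\mathbf{v} \times \mathbf{B})\ \text{(SI)} \qquad \mathbf{F} = q\mathbf{E} + \frac{q}{c}(\mathbf{v} \times \mathbf{B})\ \text{(GS)}. \]
45. Ampere’s law:
\[ \mathrm{d}\mathbf{F} = I(\mathrm{d}\mathbf{l} \times \mathbf{B})\ \text{(SI)} \qquad \mathrm{d}\mathbf{F} = \frac{1}{c}\,I(\mathrm{d}\mathbf{l} \times \mathbf{B})\ \text{(GS)}. \]
46. Magnetic moment of loop with current:
\[ p_\mathrm{m} = IS\ \text{(SI)} \qquad p_\mathrm{m} = \frac{1}{c}\,IS\ \text{(GS)}. \]
47. Torque acting on magnetic moment in a magnetic field:
\[ \mathbf{T} = \mathbf{p}_\mathrm{m} \times \mathbf{B}. \]
48. “Mechanical” energy of magnetic moment in a magnetic field:
\[ W = -\mathbf{p}_\mathrm{m} \cdot \mathbf{B}. \]
49. Divergence of vector 𝑩:
\[ \nabla \cdot \mathbf{B} = 0. \]
50. Gauss’s theorem for 𝑩:
\[ \oint \mathbf{B} \cdot \mathrm{d}\mathbf{S} = 0. \]
51. Magnetization (definition):
\[ \mathbf{M} = \frac{1}{\Delta V} \sum \mathbf{p}_\mathrm{m}. \]
52. Magnetic field strength (definition):
\[ \mathbf{H} = \frac{1}{\mu_0}\,\mathbf{B} - \mathbf{M}\ \text{(SI)} \qquad \mathbf{H} = \mathbf{B} - 4\pi\mathbf{M}\ \text{(GS)}. \]
53. Relation between 𝑴 and 𝑯:
\[ \mathbf{M} = \chi_\mathrm{m} \mathbf{H}. \]
54. Relation between permeability 𝜇 and magnetic susceptibility 𝜒m:
\[ \mu = 1 + \chi_\mathrm{m}\ \text{(SI)} \qquad \mu = 1 + 4\pi\chi_\mathrm{m}\ \text{(GS)}. \]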
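55. Relation between values of 𝜒m in the SI and in the Gaussian system:
\[ \chi_\mathrm{m,SI} = 4\pi\chi_\mathrm{m,GS}. \]
56. Relation between 𝑩 and 𝑯:
\[ \mathbf{B} = \mu\mu_0 \mathbf{H}\ \text{(SI)} \qquad \mathbf{B} = \mu\mathbf{H}\ \text{(GS)}. \]
57. Relation between 𝑩 and 𝑯 in a vacuum:
\[ \mathbf{B} = \mu_0 \mathbf{H}\ \text{(SI)} \qquad \mathbf{B} = \mathbf{H}\ \text{(GS)}. \]
58. Curl of vector 𝑯 for a stationary field:
\[ \nabla \times \mathbf{H} = \mathbf{j}\ \text{(SI)} \qquad \nabla \times \mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}\ \text{(GS)}. \]
59. Circulation of vector 𝑯 for a stationary field:
\[ \oint \mathbf{H} \cdot \mathrm{d}\mathbf{l} = \sum I\ \text{(SI)} \qquad \oint \mathbf{H} \cdot \mathrm{d}\mathbf{l} = \frac{4\pi}{c} \sum I\ \text{(GS)}. \]
60. Magnetic field strength of straight current:
\[ H = \frac{1}{4\pi}\,\frac{2I}{b}\ \text{(SI)} \qquad H = \frac{1}{c}\,\frac{2I}{b}\ \text{(GS)}. \]
61. Magnetic field strength at centre of ring current:
\[ H = \frac{I}{2R}\ \text{(SI)} \qquad H = \frac{1}{c}\,\frac{2\pi I}{R}\ \text{(GS)}. \]
62. Field strength of solenoid:
\[ H = nI\ \text{(SI)} \qquad H = \frac{4\pi}{c}\,nI\ \text{(GS)}. \]
63. Flux of magnetic induction (definition):
\[ \Phi = \int_S \mathbf{B} \cdot \mathrm{d}\mathbf{S}. \]
64. Work done on loop with current when it is moved in a magnetic field:
\[ A = I\,\Delta\Phi\ \text{(SI)} \qquad A = \frac{1}{c}\,I\,\Delta\Phi\ \text{(GS)}. \]
65. Flux linkage or total magnetic flux (definition):
\[ \Psi = \sum \Phi. \]
66. Induced e.m.f.:
\[ \mathcal{E}_\mathrm{i} = -\frac{\mathrm{d}\Psi}{\mathrm{d}t}\ \text{(SI)} \qquad \mathcal{E}_\mathrm{i} = -\frac{1}{c}\,\frac{\mathrm{d}\Psi}{\mathrm{d}t}\ \text{(GS)}. \]
67. Inductance (definition):
\[ L = \frac{\Psi}{I}\ \text{(SI)} \qquad L = c\,\frac{\Psi}{I}\ \text{(GS)}. \]
68. Inductance of solenoid:
\[ L = \mu_0 \mu n^2 l S\ \text{(SI)} \qquad L = 4\pi\mu n^2 l S\ \text{(GS)}. \]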
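69. E.m.f. of self-induction (in absence of ferromagnetics):
\[ \mathcal{E}_\mathrm{s} = -L\,\frac{\mathrm{d}I}{\mathrm{d}t}\ \text{(SI)} \qquad \mathcal{E}_\mathrm{s} = -\frac{1}{c^2}\,L\,\frac{\mathrm{d}I}{\mathrm{d}t}\ \text{(GS)}. \]
70. Energy of magnetic field of current:
\[ W = \frac{LI^2}{2}\ \text{(SI)} \qquad W = \frac{1}{c^2}\,\frac{LI^2}{2}\ \text{(GS)}. \]
71. Density of energy of magnetic field:
\[ w = \frac{\mu_0 \mu H^2}{2}\ \text{(SI)} \qquad w = \frac{\mu H^2}{8\pi}\ \text{(GS)}. \]
72. Energy of linked loops with current:
\[ W = \frac{1}{2} \sum_{i,k} L_{ik} I_i I_k\ \text{(SI)} \qquad W = \frac{1}{2c^2} \sum_{i,k} L_{ik} I_i I_k\ \text{(GS)}. \]
73. Density of displacement current:
\[ \mathbf{j}_\mathrm{dis} = \dot{\mathbf{D}}\ \text{(SI)} \qquad \mathbf{j}_\mathrm{dis} = \frac{1}{4\pi}\,\dot{\mathbf{D}}\ \text{(GS)}. \]
74. Maxwell’s equations in differential form:
\[ \nabla \times \mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}\ \text{(SI)} \qquad \nabla \times \mathbf{E} = -\frac{1}{c}\,\frac{\partial\mathbf{B}}{\partial t}\ \text{(GS)} \]
\[ \nabla \cdot \mathbf{B} = 0\ \text{(SI)} \qquad \nabla \cdot \mathbf{B} = 0\ \text{(GS)} \]
\[ \nabla \times \mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}\ \text{(SI)} \qquad \nabla \times \mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\,\frac{\partial\mathbf{D}}{\partial t}\ \text{(GS)} \]
\[ \nabla \cdot \mathbf{D} = \rho\ \text{(SI)} \qquad \nabla \cdot \mathbf{D} = 4\pi\rho\ \text{(GS)}. \]
75. Maxwell’s equations in integral form:
\[ \oint_\Gamma \mathbf{E} \cdot \mathrm{d}\mathbf{l} = -\int_S \frac{\partial\mathbf{B}}{\partial t} \cdot \mathrm{d}\mathbf{S}\ \text{(SI)} \qquad \oint_\Gamma \mathbf{E} \cdot \mathrm{d}\mathbf{l} = -\frac{1}{c} \int_S \frac{\partial\mathbf{B}}{\partial t} \cdot \mathrm{d}\mathbf{S}\ \text{(GS)} \]
\[ \oint_S \mathbf{B} \cdot \mathrm{d}\mathbf{S} = 0\ \text{(SI)} \qquad \oint_S \mathbf{B} \cdot \mathrm{d}\mathbf{S} = 0\ \text{(GS)} \]
\[ \oint_\Gamma \mathbf{H} \cdot \mathrm{d}\mathbf{l} = \int_S \mathbf{j} \cdot \mathrm{d}\mathbf{S} + \int_S \frac{\partial\mathbf{D}}{\partial t} \cdot \mathrm{d}\mathbf{S}\ \text{(SI)} \]
\[ \oint_\Gamma \mathbf{H} \cdot \mathrm{d}\mathbf{l} = \frac{4\pi}{c} \int_S \mathbf{j} \cdot \mathrm{d}\mathbf{S} + \frac{1}{c} \int_S \frac{\partial\mathbf{D}}{\partial t} \cdot \mathrm{d}\mathbf{S}\ \text{(GS)} \]
\[ \oint_S \mathbf{D} \cdot \mathrm{d}\mathbf{S} = \int_V \rho\,\mathrm{d}V\ \text{(SI)} \qquad \oint_S \mathbf{D} \cdot \mathrm{d}\mathbf{S} = 4\pi \int_V \rho\,\mathrm{d}V\ \text{(GS)}. \]
76. Velocity of electromagnetic waves:
\[ v = \frac{c}{\sqrt{\varepsilon\mu}}. \]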
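77. Relation between amplitudes of vectors 𝑬 and 𝑯 in an electromagnetic wave:
\[ E_\mathrm{m} \sqrt{\varepsilon_0 \varepsilon} = H_\mathrm{m} \sqrt{\mu_0 \mu}\ \text{(SI)} \qquad E_\mathrm{m} \sqrt{\varepsilon} = H_\mathrm{m} \sqrt{\mu}\ \text{(GS)}. \]
78. Poynting vector:
\[ \mathbf{S} = \mathbf{E} \times \mathbf{H}\ \text{(SI)} \qquad \mathbf{S} = \frac{c}{4\pi}\,\mathbf{E} \times \mathbf{H}\ \text{(GS)}. \]
79. Density of electromagnetic field momentum:
\[ \mathbf{K} = \frac{1}{c^2}\,\mathbf{E} \times \mathbf{H}\ \text{(SI)} \qquad \mathbf{K} = \frac{1}{4\pi c}\,\mathbf{E} \times \mathbf{H}\ \text{(GS)}. \]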
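
Each formula pair in this list should describe the same physics once the units are converted. As a minimal sketch, the Python example below evaluates Coulomb’s law (formula 1) for two point charges in a vacuum in both systems and compares the results. The function names, the constant `CGSE_PER_COULOMB` and the sample values (two charges of 1 μC separated by 0.1 m) are assumptions made for this illustration. The unrounded factor 2.997925 is used so that the two results agree to rounding error.

```python
import math

CGSE_PER_COULOMB = 2.997925e9                        # cgse_q in one coulomb
EPS_0 = 1.0 / (4 * math.pi * 2.997925 ** 2 * 1e9)    # electric constant, F/m
DYN_PER_NEWTON = 1e5                                 # 1 N = 10^5 dyn
CM_PER_METRE = 100.0


def coulomb_force_si(q1, q2, r):
    """Force in newtons; charges in coulombs, distance in metres."""
    return q1 * q2 / (4 * math.pi * EPS_0 * r ** 2)


def coulomb_force_gs(q1, q2, r):
    """Force in dynes; charges in cgse_q, distance in centimetres."""
    return q1 * q2 / r ** 2


# Sample values (hypothetical): two 1 microcoulomb charges, 0.1 m apart.
q_si, r_si = 1e-6, 0.1
force_si = coulomb_force_si(q_si, q_si, r_si)

# The same configuration expressed in Gaussian units.
q_gs = q_si * CGSE_PER_COULOMB
r_gs = r_si * CM_PER_METRE
force_gs = coulomb_force_gs(q_gs, q_gs, r_gs)

# Convert the Gaussian result back to newtons for comparison.
force_gs_in_newtons = force_gs / DYN_PER_NEWTON

print(f"SI:       {force_si:.6f} N")
print(f"Gaussian: {force_gs:.2f} dyn = {force_gs_in_newtons:.6f} N")
```

The same pattern (evaluate in SI, convert the inputs, evaluate in Gaussian units, convert the result back) can be applied to any other pair in the list as a check on the conversion factors.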
