WAVE MECHANICS
ADVANCED GENERAL THEORY
J. FRENKEL
PROFESSOR AT THE PHYSICO-TECHNICAL
INSTITUTE, LENINGRAD
DOVER PUBLICATIONS, INC.
1950
FIRST AMERICAN PRINTING
BY SPECIAL ARRANGEMENT WITH
OXFORD UNIVERSITY PRESS
First Edition 1934
PRINTED AND BOUND IN THE UNITED STATES
OF AMERICA
PREFACE
The present volume forming the second Part of my Wave Mechanics
is devoted (as foreshadowed in the Preface to Part I) to the
mathematical development of the general ideas underlying the new mechanics,
connecting it with classical mechanics and constituting it a complete
self-supporting theory. In building up the mathematical framework of
this theory I have limited myself to what I consider its most essential
elements, leaving aside a number of questions which have a
methodological value only (such as the group theory) or which are met with
in the solution of special problems.
It is my intention to consider some of these questions later on in
connexion with the special problems which will be discussed in Part III
(‘Advanced Special Theory’); I have carefully avoided complicating the
general scheme of the theory by such special questions, with a few
exceptions inserted for illustration (the relativistic theory of the
hydrogen-like atom, for example).
To make the general scheme more comprehensible I have not spared
space, dealing with especially important general questions (such as the
transformation and the perturbation theory, or the relativistic theory
of the electron) at much greater length than would be necessary from
the point of view of an adequate presentation to a sophisticated reader.
I must cordially thank the editors for their readiness to meet my
demands on space, which have resulted in a book larger than was originally
contemplated. I must also thank M. L. Urquhart and Miss B. Swirles
for help in correcting the English and the proofs.
The present book, like Part I, is complete in itself, and can be read
without acquaintance with Part I, provided the reader is familiar with
some elementary account of wave mechanics, and is ready to explore
its mathematical depths to obtain a profounder insight into the theory
and to prepare himself for applying it to various special problems.
The earlier portions of this book were written in 1931 while I was in
America; it was completed in Leningrad nearly two years later. Some
of the shortcomings of the book are due to this interruption and the
impossibility of revising it in 1933 from the very beginning.
A list of the more important references for each section is given at the
end of the book; it is followed by a short index which should enable the
reader to locate easily all the more important subjects treated.
LENINGRAD    J. F.
Nov. 1933
CONTENTS
I. CLASSICAL MECHANICS AS THE LIMITING FORM OF WAVE MECHANICS
1. Motion in One Dimension; Partial Reflection and Uncertainty in the Sign of the Velocity
2. Comparison between the Schrödinger and the Classical Equation of Motion in One Dimension; Average Velocity and Current Density
3. Generalization for Non-stationary Motion in Three Dimensions; The Hamilton-Jacobi Equation
4. Comparison of the Approximate Solutions of Schrödinger’s Equation; Comparison of Classical and Wave-mechanical Average Values
5. Motion in a Limited Region; Quantum Conditions and Average Values
II. OPERATORS
6. Operational Form of Schrödinger’s Equation, and Operational Representation of Physical Quantities
7. Characteristic Functions and Values of Operators; Operational Equations; Constants of the Motion
8. Probable Values of Physical Quantities and their Change with the Time
9. The Variational Form of the Schrödinger Equation and its Application to the Perturbation Theory
10. Orthogonality and Normalization of Characteristic Functions for Discrete and Continuous Spectra
III. MATRICES
11. Matrix Representation of Physical Quantities and Matrix Form of the Equations of Motion
12. The Correspondence between Matrix and Classical Mechanics
13. Application of the Matrix Method to Oscillatory and Rotational Motion
14. Matrix Representation in the Case of a Continuous Spectrum
IV. TRANSFORMATION THEORY
15. Restricted Transformation Theory; Matrices defined from different ‘Points of View’
16. Transformation of Matrices
17. Transformation Theory of Matrices as a Generalization of Wave Mechanics; Transformation of Basic Quantities
18. Geometrical Representation of the Transformation Theory
V. PERTURBATION THEORY
19. Perturbation Theory not involving the Time (Method of Stationary States)
20. Extension of the Preceding Theory to the Case of ‘Relative Degeneracy’ and Continuous Spectra; Effect of Perturbation on Various Physical Quantities
21. Perturbation Theory involving the Time; General Processes; Theory of Transitions
22. First Approximation; Theory of Simple Transitions
23. Second Approximation; Theory of Combined Transitions
24. Theory of Transitions for an Undefined Initial State
VI. RELATIVISTIC REMODELLING AND MAGNETIC GENERALIZATION OF THE WAVE MECHANICS OF A SINGLE ELECTRON
25. Simplest Form of Relativistic Wave Mechanics
26. Magnetic Forces in the Approximate Non-Relativistic Wave Mechanics
27. Relativistic Wave Mechanics as a Formal Generalization of Maxwell’s Electromagnetic Theory of Light
28. Alternative Form of the Wave Equations; Duplicity and Quadruplicity Phenomenon
29. Pauli’s Approximate Theory in the Two-dimensional Matrix Form; the Electron’s Magnetic Moment and Angular Momentum
30. More Exact Form of the Two-dimensional Matrix Theory; the Electron’s Electric Moment
31. The Exact Four-dimensional Matrix Theory of Dirac
32. General Treatment of the Spin Effect; Angular Momentum and Magnetic Moment
33. The Motion of an Electron in a Central Field of Force; Fine Structure and Zeeman Effect
34. Negative Energy States; Positive Electrons and Neutrons
35. The Invariance of the Dirac Equation with regard to Coordinate Transformations
36. Transformation of the Dirac Equation to Curvilinear Coordinates
VII. THE PROBLEM OF MANY PARTICLES
37. General Results. Virial Theorem, Linear and Angular Momentum
38. Magnetic Forces and Spin Effects
39. Complex Particles treated as Material Points with Inner Coordinates; Theory of Incomplete Systems
40. Identical Particles (Electrons) and the Exclusion Principle
VIII. REDUCTION OF THE PROBLEM OF A SYSTEM OF IDENTICAL PARTICLES TO THAT OF A SINGLE PARTICLE
41. Perturbation Theory of a System of Spinless Electrons and the Exchange Degeneracy
42. Introduction of the Spin Coordinates and Solution of the Perturbation Problem with Antisymmetrical Wave Functions
43. The Method of the Self-consistent Field with Factorized Wave Functions
44. The Method of the Self-consistent Field with Antisymmetrical Functions and Dirac’s Density Matrix
45. Approximate Solutions (Thomas-Fermi-Dirac Equation)
IX. SECOND (INTENSITY) QUANTIZATION AND QUANTUM ELECTRODYNAMICS
46. Second Quantization with respect to Electrons
47. Intensity Quantization of Particles described in the Configuration Space by a Symmetrical Wave Function (Einstein-Bose Statistics)
48. Interaction between a ‘Doubly Quantized’ System and an Ordinary System: Application to Photons
49. Electromagnetic Waves with Quantized Amplitudes; Theory of Spontaneous Transitions and of Radiation Damping
50. Application of Quantized Electron Waves to the Emission and Scattering of Radiation
51. Connexion between Quantized Mechanical (Electron) Waves and Electromagnetic Waves
52. The Quantum Electrodynamics of Heisenberg, Pauli, and Dirac
53. Breit’s Formula. Concluding Remarks
REFERENCES
INDEX TO PART I
INDEX TO PART II
ADVANCED GENERAL THEORY
I
CLASSICAL MECHANICS AS THE LIMITING FORM
OF WAVE MECHANICS
1. Motion in One Dimension; Partial Reflection and Uncertainty
in the Sign of the Velocity
In the first part of this book we have given a general outline of the
development and present state of wave mechanics, emphasizing the
physical meaning of the new conceptions and avoiding, as far as possible,
formal questions connected with the mathematical expression of
these new conceptions. We have thus been led astray from the old
conceptions based on classical corpuscular mechanics, deepening, as it
were, the abyss separating the old from the new mechanics.
A systematic study of the formal questions referred to above reveals
the wonderful fact that in spite of the fundamental physical difference
between the new and the old mechanics, they are extremely similar
from the mathematical point of view, i.e. from the point of view of
the mathematical expression of the various physical quantities and the
mathematical equations connecting them. This formal similarity forms
a bridge over the abyss between the old and the new mechanics,
enabling one to consider the latter as an extension or rather a refinement
of the former and to establish a one-to-one correspondence
between the old ‘classical’ and the new ‘quantum’ conceptions, quantities,
and equations, a correspondence which often looks like an
identity.
The existence of such a correspondence is a very instructive example
of the fact—many times already illustrated by the development of
physics—that a drastic revision of our physical conceptions can be
associated with a simple improvement in the underlying mathematical
scheme.
We shall start by considering the simplest case of the wave-mechanical
equation, i.e. the equation describing the stationary motion of a particle
in one dimension:
$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}(W - U)\psi = 0, \tag{1}$$
the potential energy U being supposed to depend on x only (and not
upon t, otherwise the total energy W would not be constant).
If U were constant, then this equation would have a solution of the
form†
$$\psi = A e^{i\alpha x} \tag{1a}$$
representing a sine wave travelling in the direction of the positive x-axis,
$\alpha$ being the positive square root of the expression $8\pi^2 m(W - U)/h^2$
(supposed to be positive). It must be borne in mind, however, that (1a) is
only a particular solution of (1), the general solution being
$$\psi = A' e^{i\alpha x} + A'' e^{-i\alpha x}, \tag{1b}$$
which represents the superposition of two sine waves of the same length
travelling in opposite directions. The fact that (1) has two independent
particular solutions, representing, under the condition U = const.,
waves travelling in opposite directions, and that its general solution is
equal to the sum of these two, is a consequence of the fact that (1) is
a linear equation of the second order.
In the general case, either for a constant or a variable U(x), the
function $\psi$, which is a complex quantity, can be written in the form
$$\psi = A e^{i\phi}, \tag{2}$$
where $A = |\psi|$ is its modulus and $\phi$ is its argument (both of them of
course being real). This representation of $\psi$ suggests that it may be
possible to interpret the process described by it in a way similar to
that corresponding to expression (1a), namely, as a propagation of a
wave with a (variable) amplitude A(x) in a definite direction specified
by the phase $\phi(x)$ (positive if $d\phi/dx > 0$ and negative if $d\phi/dx < 0$).
Such an interpretation is, however, in general wrong, as is clearly
shown by taking for $\psi$ the expression (1b) corresponding to U = const.
Assuming A' and A'' to be real, we get in this case
$$A\cos\phi = (A' + A'')\cos\alpha x, \qquad A\sin\phi = (A' - A'')\sin\alpha x,$$
and consequently
$$A^2 = A'^2 + A''^2 + 2A'A''\cos 2\alpha x, \tag{2a}$$
$$\tan\phi = \frac{A' - A''}{A' + A''}\tan\alpha x. \tag{2b}$$
The functions A and $\phi$ can, of course, be interpreted as the resulting
amplitude and phase at the various points, but they will not refer to
oscillations propagated in one definite direction. It will be noticed that
A, instead of being constant, may oscillate with x twice as rapidly as
the phase of each of the two component waves, and that the resulting
phase $\phi$ may alternately increase and decrease with increase of x.
† We shall drop in future the time factor $e^{-i2\pi Wt/h}$, the oscillatory
character of $\psi$ as a function of the time being understood.
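The relations (2a) and (2b) can be checked numerically. The following is a minimal sketch and not part of the original text: the values chosen for A', A'' and $\alpha$, and the helper names, are arbitrary assumptions made for illustration.

```python
import cmath
import math

# Illustrative values (assumptions, not taken from the text).
A1, A2, ALPHA = 1.0, 0.6, 2.0   # A', A'' and alpha


def superposition(x, a1, a2, alpha):
    """psi = A' e^{i alpha x} + A'' e^{-i alpha x}, equation (1b)."""
    return a1 * cmath.exp(1j * alpha * x) + a2 * cmath.exp(-1j * alpha * x)


def modulus_squared_2a(x):
    """Right-hand side of (2a)."""
    return A1**2 + A2**2 + 2 * A1 * A2 * math.cos(2 * ALPHA * x)


def tan_phase_2b(x):
    """Right-hand side of (2b)."""
    return (A1 - A2) / (A1 + A2) * math.tan(ALPHA * x)


for x in (0.1, 0.4, 1.3):
    psi = superposition(x, A1, A2, ALPHA)
    # Columns: x, |psi|^2 computed directly, (2a), tan(phase) directly, (2b).
    print(x, abs(psi) ** 2, modulus_squared_2a(x),
          psi.imag / psi.real, tan_phase_2b(x))
```

The squared amplitude repeats after a distance $\pi/\alpha$, half the wave-length $2\pi/\alpha$ of either component wave, which is the doubled rate of oscillation mentioned above.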
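Substituting (2) in (1) and taking into account the relations
$$\frac{d\psi}{dx} = \left(\frac{dA}{dx} + iA\frac{d\phi}{dx}\right)e^{i\phi},$$
$$\frac{d^2\psi}{dx^2} = \left[\frac{d^2A}{dx^2} - A\left(\frac{d\phi}{dx}\right)^2 + i\left(2\frac{dA}{dx}\frac{d\phi}{dx} + A\frac{d^2\phi}{dx^2}\right)\right]e^{i\phi},$$
we get, after cancelling the common factor $e^{i\phi}$,
$$\frac{d^2A}{dx^2} + \left[\alpha^2 - \left(\frac{d\phi}{dx}\right)^2\right]A + i\left(2\frac{dA}{dx}\frac{d\phi}{dx} + A\frac{d^2\phi}{dx^2}\right) = 0.$$
Because A, $\phi$, and the parameter
$$\alpha^2 = \frac{8\pi^2 m}{h^2}(W - U) \tag{3}$$
are all real quantities, this equation can be split up into two equations:
$$\frac{d^2A}{dx^2} + \left[\alpha^2 - \left(\frac{d\phi}{dx}\right)^2\right]A = 0, \tag{3a}$$
$$2\frac{dA}{dx}\frac{d\phi}{dx} + A\frac{d^2\phi}{dx^2} = 0. \tag{3b}$$
If the latter equation be divided by $A\,d\phi/dx$, it can be immediately
integrated to give
$$2\log A + \log\frac{d\phi}{dx} = \text{const.}$$
or
$$A^2\frac{d\phi}{dx} = C \ (= \text{const.}). \tag{4}$$
Putting $d\phi/dx = C/A^2$ in (3a), we get
$$\frac{d^2A}{dx^2} + (\alpha^2 - C^2A^{-4})A = 0. \tag{4a}$$
This equation for $A = |\psi|$ is equivalent to the Schrödinger equation
(1) for $\psi$, but differs from it formally by the fact that it is not linear.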
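As a check on (4) and (4a), one may take the superposition (1b) with constant U and verify numerically that $A^2\,d\phi/dx$ is constant and that the amplitude satisfies the non-linear equation. The sketch below is an illustration only; the numerical values, the finite-difference step, and the function names are assumptions. It uses the identity $A^2\,d\phi/dx = \mathrm{Im}(\psi^*\,d\psi/dx)$, which follows from (2).

```python
import cmath
import math

A1, A2, ALPHA = 1.0, 0.6, 2.0   # A', A'' and alpha (illustrative assumptions)


def psi(x):
    """The superposition (1b) for constant U."""
    return A1 * cmath.exp(1j * ALPHA * x) + A2 * cmath.exp(-1j * ALPHA * x)


def dpsi(x):
    """Analytic derivative of psi with respect to x."""
    return 1j * ALPHA * (A1 * cmath.exp(1j * ALPHA * x)
                         - A2 * cmath.exp(-1j * ALPHA * x))


def amplitude(x):
    """A = |psi|."""
    return abs(psi(x))


def flux_constant(x):
    """A^2 dphi/dx of relation (4), evaluated as Im(psi* dpsi/dx)."""
    return (psi(x).conjugate() * dpsi(x)).imag


def residual_4a(x, step=1e-4):
    """Left-hand side of (4a), with d^2A/dx^2 from a central difference."""
    d2a = (amplitude(x + step) - 2 * amplitude(x) + amplitude(x - step)) / step**2
    c = flux_constant(x)
    a = amplitude(x)
    return d2a + (ALPHA**2 - c**2 / a**4) * a


for x in (0.0, 0.37, 1.9):
    print(x, flux_constant(x), residual_4a(x))
```

For this particular $\psi$ the constant works out to $C = \alpha(A'^2 - A''^2)$, so that C vanishes when the two opposite waves have equal amplitudes.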
Let us assume for a moment that Schrödinger’s equation, in the case
of a variable U, admits of a particular solution of the type (1a),
i.e. a solution representing waves travelling in one definite direction,
e.g. in the positive direction. We could then obviously identify A in
(2) with the amplitude and $\phi$ with the phase of these particular waves.
According to the definition of phase, the change of phase corresponding
to an increase of x by dx (the time being fixed) would be given in this
case by
$$d\phi = \frac{2\pi}{\lambda}\,dx = \alpha\,dx$$
($\lambda$ denoting the wave-length at the point considered).
We should thus have the equation
$$\frac{d\phi}{dx} = \alpha, \tag{5}$$
which is inconsistent with (3a) unless $d^2A/dx^2 = 0$. This condition, giving
$A = ax + b$, is, however, in general inconsistent with the relation (4a),
i.e. $A^2\alpha = C$, unless $\alpha = C/(ax + b)^2$, which means a very special
assumption for the potential-energy function U (the preceding relation is
fulfilled in particular if U = const., a being equal to zero in this case).
We thus see that a one-sided wave propagation, corresponding to the
motion of a particle in one definite direction, is in general impossible.
From the point of view of the wave conception this result is very
easily explained. Thus every field of force, i.e. every inhomogeneity in
the potential energy U or the parameter $\alpha$, leads to a partial reflection
of a wave impinging on it. If the inhomogeneity is due to a discontinuous
jump of $\alpha$, the reflection is produced at the point (or plane) of
discontinuity. If $\alpha$ varies continuously, the reflection is produced
gradually (the reflected waves giving rise to reflected waves of the
second order travelling in the initial direction, and so on).
From the corpuscular point of view this means that a particle moving
along the axis of x in a field of force parallel to x may have its velocity
reversed at every instant, so that while the magnitude of the velocity is
a given function of x, its direction or sign remains uncertain.
This uncertainty constitutes the fundamental difference between the
new and the old mechanics. In the old mechanics, if the direction of
the velocity is fixed at some initial instant, then it should remain the
same so long as the kinetic energy $W - U$ remains positive ($\alpha^2 > 0$).
Such a determinateness does not actually exist in the phenomena of
motion. When these phenomena are described by wave mechanics, we
find Nature in a position very similar to that of a theoretical physicist
who, in performing complicated (and even simple!) calculations, often
feels a strong uncertainty about the sign (+ or —) which must be
assigned to the quantities under consideration.
This uncertainty of sign or of direction of velocity for a given magnitude
of the latter and a given position can be regarded as an ‘uncertainty
principle’ characteristic of wave mechanics and not related directly to
the uncertainty principle of Heisenberg. The difference between them is
that in the latter the localization of the particle is imagined to be effected
by means of a ‘wave packet’ involving an uncertainty not so much
in the direction of the velocity as in its magnitude, whereas in the
present case there is no need for constructing such a packet, the fact
asserted being not a definite position of the particle, but the connexion
between position, which may be arbitrary (that is, specifiable in terms
of probability only) and the magnitude of the velocity. As we have
just seen, the uncertainty in the direction of this velocity is connected
with the possibility of both transmission and reflection of the particle
in every region where it is acted on by some force. At the very beginning
of this book we came upon this possibility when attempting to interpret,
from the corpuscular point of view, the phenomena of partial reflection
and partial transmission of light at the boundary between two homogeneous
bodies. Later we studied it in more detail when investigating
the motion of material particles in a field of force according to wave
mechanics. We can sum up the results arrived at by saying that the
indeterminateness which constitutes the characteristic distinction between
wave mechanics and classical mechanics is due primarily to this
ambiguity in the result produced by a force acting on the particle.
Whereas in classical mechanics such a force must either accelerate or
retard the particle, reversing the direction of its motion only when the
increase of potential energy would exceed the total energy, in wave
mechanics a force can reverse the direction of motion, leaving the
magnitude of the velocity unchanged, even when this force is acting
in the direction of the motion, i.e. even when, according to classical
mechanics, the particle should be accelerated without change of
direction.
So far as the relation between the wave-mechanical and the classical
equations of motion is concerned, this uncertainty in the direction or
in the ‘sign’ of the velocity, when its magnitude and the position of
the particle are simultaneously fixed, is much more useful than
Heisenberg’s uncertainty principle (which is another aspect of the
fundamental ambiguity inherent in wave mechanics). It leads us to
expect that the results predicted by wave mechanics will approach those
predicted by classical mechanics as the reflection coefficient tends to zero,
i.e. when the ambiguity due to the possibility of reflection as well as
transmission vanishes. In this case, transmission, i.e. motion in the
same direction, is the only issue that comes into consideration.
It is easy to see that a decrease in the reflection coefficient is brought
about by a decrease in the wave-length. When the wave-length becomes
very small compared with the length over which the potential energy
changes by an appreciable amount, the reflection produced by this
change of potential energy also becomes very small and vanishes in the
limiting case $\lambda = 0$.
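This dependence on the ratio of wave-length to the width of the transition region can be illustrated numerically. The sketch below is not taken from the text: it assumes a potential rising linearly over a given thickness, replaces it by many thin slabs of constant potential, and works in units in which $8\pi^2 m/h^2 = 1$, so that $\alpha^2 = W - U$. The function name, the number of slabs, and the numerical values are all illustrative assumptions.

```python
import math


def reflection_coefficient(total_energy, u_left, u_right, thickness, layers=2000):
    """Reflection coefficient for a potential rising linearly from u_left to
    u_right over `thickness`, approximated by `layers` constant slabs.
    Units are chosen so that alpha^2 = W - U (an assumption of this sketch)."""
    k_left = math.sqrt(total_energy - u_left)
    k_right = math.sqrt(total_energy - u_right)
    # Start on the right with a purely transmitted wave: psi = 1, psi' = i k.
    psi, dpsi = 1.0 + 0j, 1j * k_right
    width = thickness / layers
    for j in reversed(range(layers)):
        u_mid = u_left + (u_right - u_left) * (j + 0.5) / layers
        k = math.sqrt(total_energy - u_mid)
        c, s = math.cos(k * width), math.sin(k * width)
        # Carry the exact solution of the slab leftwards by one slab width.
        psi, dpsi = psi * c - dpsi * s / k, psi * k * s + dpsi * c
    # Decompose into incident and reflected waves in the left-hand region.
    incident = 0.5 * (psi + dpsi / (1j * k_left))
    reflected = 0.5 * (psi - dpsi / (1j * k_left))
    return abs(reflected / incident) ** 2


W, U1, U2 = 2.0, 0.0, 1.0
wavelength = 2 * math.pi / math.sqrt(W - U1)
for thickness in (1e-9, 0.1 * wavelength, wavelength, 10 * wavelength):
    print(thickness / wavelength, reflection_coefficient(W, U1, U2, thickness))
```

With a thickness far below the wave-length the result approaches that of an abrupt jump, while a thickness of many wave-lengths gives a much smaller reflection for the same total change of potential energy.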
This result can be illustrated by the fact, pointed out in Part I, § 12,
that cathode rays pass without appreciable reflection through an
electric condenser whose thickness is very large compared with the
wave-length, while they are appreciably reflected if this thickness is
reduced to zero, the potential energy change remaining the same. In
the latter case the reflection and transmission coefficients are given
by the well-known formulae
$$R = \frac{(\alpha' - \alpha'')^2}{(\alpha' + \alpha'')^2}, \qquad D = \frac{4\alpha'\alpha''}{(\alpha' + \alpha'')^2},$$
where $\alpha'$ and $\alpha''$ are the values of the parameter $\alpha$ on both sides of the
discontinuity. It may be recalled that this parameter is proportional
to the momentum g = mv, i.e. to the velocity of the electron. When
the velocity of the impinging electrons, that is $\alpha'$, increases, the jump
$\Delta U$ of the potential energy remaining constant, $\alpha''$ also increases, while
the difference $\alpha' - \alpha''$ decreases. We have in fact, according to (3),
$$\Delta U = U'' - U' = \frac{h^2}{8\pi^2 m}(\alpha'^2 - \alpha''^2),$$
whence
$$\alpha' - \alpha'' = \frac{8\pi^2 m}{h^2}\,\frac{\Delta U}{\alpha' + \alpha''},$$
or approximately
$$\frac{\alpha' - \alpha''}{\alpha' + \alpha''} \approx \frac{8\pi^2 m}{h^2}\,\frac{\Delta U}{4\alpha^2} = \frac{\Delta U}{4(W - U)},$$
that is,
$$R \approx \frac{1}{16}\left(\frac{\Delta U}{W - U}\right)^2. \tag{5a}$$
Here $W - U$ is the average kinetic energy $\tfrac{1}{2}mv^2$ of the electron on both
sides of the discontinuity, while $\Delta U$ is equal to the change of this
kinetic energy, i.e. approximately $mv\,\Delta v$.
We thus get
$$R = \frac{1}{4}\left(\frac{\Delta\lambda}{\lambda}\right)^2, \tag{5b}$$
where $\lambda = h/mv$.
Formula (5a) shows that the reflection coefficient tends to zero when
the velocity of the electron is increased, i.e. when the wave-length $\lambda$
tends to zero, the jump of potential energy $\Delta U$ remaining constant
($\Delta\lambda$ is an infinitely small quantity of a higher order than $\lambda$ itself).
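The quality of the approximation (5a) and the decrease of R with increasing energy can be seen from a short calculation. The following sketch is illustrative only: it assumes units in which $8\pi^2 m/h^2 = 1$ (so that $\alpha^2 = W - U$), takes U in (5a) to be the mean of the two potential levels, and uses arbitrary numerical values.

```python
import math


def step_coefficients(total_energy, u_left, u_right):
    """Exact R and D for a potential jump, in units where alpha^2 = W - U."""
    a1 = math.sqrt(total_energy - u_left)
    a2 = math.sqrt(total_energy - u_right)
    r = ((a1 - a2) / (a1 + a2)) ** 2
    d = 4 * a1 * a2 / (a1 + a2) ** 2
    return r, d


def approx_reflection_5a(total_energy, u_left, u_right):
    """Approximation (5a): R ~ (1/16) (dU / (W - U))^2, U being the mean level."""
    mean_kinetic = total_energy - 0.5 * (u_left + u_right)
    return ((u_right - u_left) / mean_kinetic) ** 2 / 16.0


for W in (2.0, 10.0, 100.0, 1000.0):
    r, d = step_coefficients(W, 0.0, 1.0)
    # Columns: W, exact R, exact D, approximate R from (5a).
    print(W, r, d, approx_reflection_5a(W, 0.0, 1.0))
```

The exact coefficients always satisfy R + D = 1, and the approximate value agrees more and more closely with the exact one as W grows relative to the jump.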
This result holds, of course, not only for electrons but also for any
other particles: their behaviour conforms more and more to the fundamental
principle of classical mechanics, the principle of determinism,
which can be stated in the form
$$R = 0, \qquad D = 1,$$
as their velocity increases.
It should be noted that, for a given value of $\Delta U$, the magnitude of
the velocity for which R becomes inappreciable is the smaller the larger
the mass m, since, according to (5a), it is not the velocity itself but the
kinetic energy $\tfrac{1}{2}mv^2$ whose ratio to $\Delta U$ determines R.
2. Comparison between the Schrödinger and the Classical
Equation of Motion in One Dimension; Average Velocity and
Current Density
Discontinuities in the potential-energy function U(x) do not, of course,
occur in Nature. When U(x) is a continuous function of x, i.e. when
the force has a finite value, it is possible to give another important and
interesting formulation of the condition under which the fundamental
ambiguity of wave mechanics disappears (i.e. the reflection coefficient
vanishes), the wave mechanics thus reducing to classical mechanics.
According to de Broglie’s relation $\lambda = h/mv$, the wave-length of the
waves associated with the motion of a particle is, other things being
equal, the smaller, the smaller the value of the constant h. In reality,
of course, the latter cannot be changed. If, however, it were not a
universal constant, but could have any value whatsoever, then it would
be possible to say that wave mechanics would reduce to classical
mechanics in the limiting case h = 0; for this would mean that the
wave-length would vanish for all values of the velocity. Consequently
the relative change of the potential energy in a distance of the order
of magnitude of the wave-length would also vanish, and with it the
partial reflection which is the fundamental cause of the ambiguity
characteristic of wave mechanics.
This result can be proved in a general way as follows:
Let us put $\alpha = 2\pi g/h$ in equation (3a), where g (= mv) is the
magnitude of the momentum of the particle, and also
$$\phi = \frac{2\pi}{h}s. \tag{6}$$
Multiplying (3a) by $(h/2\pi)^2$, we get
$$\left(\frac{h}{2\pi}\right)^2\frac{d^2A}{dx^2} + \left[g^2 - \left(\frac{ds}{dx}\right)^2\right]A = 0, \tag{6a}$$
where
$$g^2 = 2m(W - U). \tag{6b}$$
It follows from this equation that in the limiting case h = 0 the function
s remains finite and is determined by the differential equation
$$\frac{1}{2m}\left(\frac{ds}{dx}\right)^2 + U = W. \tag{7}$$
The momentum g can be determined by this function unambiguously,
i.e. both with respect to magnitude and sign, by the equation
$$g = \frac{ds}{dx}, \tag{7a}$$
which is equivalent to equation (5), corresponding to the one-sided wave
propagation, i.e. to the motion of a particle in a definite direction.
This direction remains arbitrary, since (7) has two solutions, namely
$ds/dx = +\sqrt{2m(W - U)}$ and $ds/dx = -\sqrt{2m(W - U)}$. But once it is
chosen for some initial instant it will remain constant so long as s is a
continuous function of x without maxima or minima, where, of course,
g will change its sign after passing through the value g = 0. This
change of sign through a continuous variation corresponds to total
reflection and has nothing to do with the discontinuous reversal of the
sign of g which is allowed by the exact theory embodied in the wave
equation (1) (with h > 0) and which corresponds to partial reflection.
The difference between the exact equation (1) and the approximate
equation (7), so far as the ambiguity in the sign, i.e. in the direction
of the velocity, is concerned, consists in the fact that the former, being
a linear equation of the second order, admits both signs simultaneously
(superposition of waves travelling in opposite directions), while the
latter, being a quadratic equation of the first order, admits either one
sign or the other. It should be remembered that the exact equation
which is satisfied by the function s is much more complicated than (7).
This exact equation can be obtained by eliminating A from equations
(3a) and (3b) with $\phi = 2\pi s/h$.
It is often convenient to use, instead of the function s defined in this
way, another function S defined by the equation
$$\psi = e^{i2\pi S/h}, \tag{8}$$
or
$$S = \frac{h}{2\pi i}\log\psi. \tag{8a}$$
This S is connected with s (i.e. the ‘phase’ $\phi$) and the ‘amplitude’ A by
the relation
$$S = s + \frac{h}{2\pi i}\log A.$$
It is a complex quantity which represents both $\phi$ and A and is
equivalent to $\psi$.
Substituting the expression (8) in Schrödinger’s equation (1) and
using the relations
$$\frac{d\psi}{dx} = \frac{i2\pi}{h}\frac{dS}{dx}e^{i2\pi S/h}, \qquad \frac{d^2\psi}{dx^2} = \left(\frac{i2\pi}{h}\right)^2\left(\frac{dS}{dx}\right)^2 e^{i2\pi S/h} + \frac{i2\pi}{h}\frac{d^2S}{dx^2}e^{i2\pi S/h},$$
we get
$$\frac{1}{2m}\left(\frac{dS}{dx}\right)^2 + U - W = \frac{ih}{4\pi m}\frac{d^2S}{dx^2}. \tag{8b}$$
If we put here h = 0, this equation reduces to (7), so that when h = 0
the two functions s and S become identical. We must now investigate
the meaning of the approximate equation (7) which they both satisfy
in this limiting case.
In a certain sense it merely expresses the law of the conservation of
energy, since ds/dx is, by definition, the momentum g of the particle
and $\frac{1}{2m}\left(\frac{ds}{dx}\right)^2$ is its kinetic energy.
The equation is unusual, however, in that the momentum of the
particle, and consequently its velocity, is determined as a function of
the coordinate x, whereas in the classical description of motion the
velocity, as well as the coordinate itself, usually appear as functions of
the time t. Such a description of motion is impossible in wave mechanics
because of the uncertainty in the direction of the velocity. If it is
true, however, that in the case h = 0 the wave-mechanical equation
of motion (8b) must reduce to the classical equation, then equation
(7) must be equivalent to Newton’s equation of motion
$$m\frac{d^2x}{dt^2} = -\frac{dU}{dx}, \tag{9}$$
defining x and v = dx/dt as functions of the time. This equivalence is
readily recognized as soon as we realize what is meant by defining the
velocity (or momentum) of a particle as a function of its coordinate.
Let us suppose that equation (9) has been integrated, and that x and
v have been determined as functions of the time t. Then, eliminating
the time t between them, we can express one of them, e.g. v, as a function
v(x) of the other. The acceleration $d^2x/dt^2$ can then be calculated
by means of the formula
$$\frac{d^2x}{dt^2} = \frac{dv}{dt} = \frac{dv}{dx}\frac{dx}{dt} = v\frac{dv}{dx} = \frac{d}{dx}\left(\frac{v^2}{2}\right),$$
so that equation (9) can be written in the form
$$\frac{d}{dx}\left(\frac{mv^2}{2}\right) = -\frac{dU}{dx}$$
or
$$\frac{mv^2}{2} + U = \text{const.}$$
If mv = g is replaced by ds/dx and the constant is denoted by W, we
get equation (7).
We thus see that this equation expresses not only the law of conservation
of energy, but at the same time the classical law of motion.
It should be mentioned that both laws are equivalent to one another
only in the special case which we are considering here of motion in one
dimension (see below).
Another way of interpreting equation (7), or rather the fact implied
in it that the velocity $v = \frac{1}{m}\frac{ds}{dx}$ of the particle is determined not as
a function of the time but as a function of the coordinate x, is to
replace the single particle under consideration by an infinite number
of copies of this particle, filling space (or the line x) in a continuous
way, so that at any instant t a copy is to be found situated at, or rather
passing through, any point x. This method is similar to one used in
hydrodynamics except that, in the hydrodynamical case, the copies of
a particle are replaced by actual particles (supposed to be identical),
moving under the combined influence of external forces and forces of
mutual action (represented by the hydrostatic pressure). Provided we
are not interested in the individuality of the particles, i.e. in the question
which particle is to be found at a given point, the motion of the particles
can be specified by defining the velocity of the particle passing through
each fixed point as a function of the coordinates of this point and, in
general, of the time. If the velocity does not depend upon the time
(it should be remembered that the velocity we are speaking of refers
not to a definite particle but to a definite point) the motion is called
stationary or steady.
Thus the picture which can be associated with equation (7) is that
of an assembly of copies of the particle under consideration, streaming
steadily and filling space in a continuous way. If we select from this
assembly a definite copy which at the time t was passing through the
point x, then, knowing the dependence of the velocity v upon x, we
can follow its motion and determine both the velocity and position of
this particular copy as functions of the time. For instance, at the
moment t + dt the copy in question will be situated at the point x + v dt,
and will have the velocity $v(x + dx) = v(x + v\,dt) = v(x) + \frac{dv}{dx}v\,dt$, which
means that its acceleration is equal to v dv/dx, as was obtained above.
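The procedure of following a single copy through the velocity field v(x) can be sketched numerically. The example below is an illustration under stated assumptions, not part of the original text: it takes a uniform force (U = -Fx), arbitrary values of m, F and W, and integrates dx/dt = v(x) by a standard fourth-order Runge-Kutta rule, comparing the result with the closed-form solution of Newton's equation (9).

```python
import math

M, FORCE, W = 1.0, 1.0, 0.0   # illustrative assumptions; U(x) = -FORCE * x


def potential(x):
    """Potential energy of a uniform field of force."""
    return -FORCE * x


def velocity_field(x):
    """v(x) = sqrt(2 (W - U) / m), the positive root of equation (7)."""
    return math.sqrt(2.0 * (W - potential(x)) / M)


def follow_copy(x0, duration, steps=1000):
    """Integrate dx/dt = v(x) with the classical fourth-order Runge-Kutta rule."""
    dt = duration / steps
    x = x0
    for _ in range(steps):
        k1 = velocity_field(x)
        k2 = velocity_field(x + 0.5 * dt * k1)
        k3 = velocity_field(x + 0.5 * dt * k2)
        k4 = velocity_field(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


def newton_position(x0, duration):
    """Closed-form solution of m x'' = FORCE, started with velocity v(x0)."""
    v0 = velocity_field(x0)
    return x0 + v0 * duration + 0.5 * (FORCE / M) * duration**2


print(follow_copy(1.0, 2.0), newton_position(1.0, 2.0))
```

The two positions agree to within the integration error, which illustrates that specifying the velocity as a function of the coordinate carries the same information as Newton's equation once the initial position and the sign of the velocity are chosen.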
We have thus shown that the wave-mechanical equation of motion
actually reduces to the classical equation in the limiting case when the
wave-length associated with the motion of a particle tends to zero,
either owing to increase in velocity (which is a thing that can actually
happen) or to decrease in the constant h (which is an artifice). The
fundamental reason for this lies in the elimination of partial reflection,
i.e. of a reversal in the direction of the velocity or, in other words, the
elimination of the uncertainty in its sign.
Strictly speaking, however, this uncertainty cannot be eliminated.
It is impossible to describe the motion of a particle in the classical way,
i.e. as a determinate change of position and velocity with the time.
The only way of describing it is to ascertain the probability of finding
the particle at a given place and the probability that, being at this
place, it is moving in the one or the other direction (the magnitude of
the velocity being fixed). This intrusion of the probability conception
into the description of the motion is necessary because of the ambiguity
arising from the alternative: partial reflection or partial transmission.
One could say that this ambiguity, wholly alien to classical mechanics,
forms the gate through which the concept of probability penetrates
into the realm of physics.
The probability of position is measured, as we know, by the product
$\psi\psi^*$, so that $\psi(x)\psi^*(x)\,dx$ measures the probability that the particle is
situated in the region between x and x + dx. Using the picture of an
assembly of copies of the particle in question filling space (or the x-axis)
in a continuous way, we can interpret $\psi\psi^*\,dx$ as the relative number
of copies situated within the interval dx (this number is independent of
the time so long as $\psi = \psi^0 e^{-i2\pi\nu t}$, corresponding to a motion with a
definite total energy $W = h\nu$). If the integral $\int_{-\infty}^{+\infty}\psi\psi^*\,dx$ converges,
$\psi$ can be normalized in such a way that this integral is equal to 1, in
agreement with the usual normalization of probability. Otherwise we
need not worry about this normalization, since after all only relative
values of $\psi\psi^*$ for different points come into account.
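The normalization just described can be illustrated by a short numerical sketch. Everything specific in it is an assumption made for illustration: the trial function is an arbitrary real Gaussian (so that $\psi\psi^* = \psi^2$), the infinite range is cut off at ±10, and the integral is evaluated by the trapezoidal rule.

```python
import math


def psi(x):
    """Unnormalized trial function (an illustrative real Gaussian)."""
    return 3.0 * math.exp(-0.5 * x * x)


def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n intervals."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step


# Integral of psi psi* over the (truncated) x-axis, and the scale factor
# that makes the integral equal to 1.
norm = integrate(lambda x: psi(x) ** 2, -10.0, 10.0)
scale = 1.0 / math.sqrt(norm)


def psi_normalized(x):
    """Trial function rescaled so that its squared modulus integrates to 1."""
    return scale * psi(x)


def probability(lo, hi):
    """Probability of finding the particle between lo and hi."""
    return integrate(lambda x: psi_normalized(x) ** 2, lo, hi)


print(norm, probability(-10.0, 10.0), probability(0.0, 10.0))
```

Only the ratio of the values of $\psi\psi^*$ at different points is affected by nothing in this rescaling, in keeping with the remark that relative values alone matter.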
It should be noticed that in the classical description of the motion
we can also use a continuous assembly of copies instead of an individual
particle, as is actually done when the equation of motion is written in
the form (7) corresponding to the determination of the velocity as a
function of the coordinate and not of the time. From the point of
view of this description the difference between the old and the new
theory can be summed up as follows. In the old theory it is always
possible to ‘individualize’ a certain copy by following its motion, i.e. by
determining its coordinate and velocity as definite functions of time,
whereas in the new theory such ‘individualization’ is impossible, the
direction of motion being uncertain. It thus becomes necessary to consider
the assembly as a whole without attempting to disentangle it,
i.e. to trace the motion of a particular copy in time. This being so, the
density of the assembly, i.e. the relative number of copies per unit
range, or, in other words, the probability of finding the particle
represented by these copies in a given range, becomes the primary thing
that can and must be determined, whereas in classical mechanics it
remains irrelevant and therefore arbitrary. Of course the determination
of $\psi\psi^*$ in wave mechanics is also connected with some arbitrariness,
which can only be removed by specifying the boundary conditions or
the conditions at infinity for the function $\psi$.
Knowing the function $\psi$, one can determine many other things besides
the probability of position. Thus by means of it we can determine the
probability of the two opposite directions of motion, that is, of the two
opposite signs of the velocity, if the magnitude of the velocity is
assumed to be fixed for a given position by the classical relation
$v = \sqrt{2(W-U)/m}$ or by de Broglie’s relation $v = h/(m\lambda)$. If
$p'$ is the probability of the positive direction and $p''$ that of the
negative direction, then the average or probable value of the velocity at
a given point is given by the formula
$$\bar v = (p' - p'')\,|v| \qquad (10)$$
with the condition $p' + p'' = 1$.
This probable velocity, or the probabilities $p$, can be determined
quite generally with the help of the relation (4), as soon as the physical
meaning of this relation is recognized. We shall first see what the
expression $A^2\,d\phi/dx$ means in the simple case of a wave travelling in
one direction in a force-free space, that is, a wave representing the free
motion of a particle in one direction. We have, in this case, according
to (1 a), $\phi = \alpha x$ and consequently
$$A^2\frac{d\phi}{dx} = A^2\alpha = |\psi|^2\,\frac{2\pi m v}{h} = \frac{2\pi m}{h}\,|\psi|^2 v.$$
If $|\psi|^2$ is interpreted as the (relative) density of the copies of the
particle, then the product $|\psi|^2 v = j$ must obviously be defined as the
corresponding current density, i.e. the (relative) number of copies passing
through the given point or plane $x = \text{const.}$ in the direction of $v$
in unit time. If $|\psi|^2$ is interpreted as the probability density, then
$j$ can be defined as the probability current density, i.e. the probability
that the particle will cross the plane $x = \text{const.}$ in unit time. The
ratio $j/|\psi|^2$ is nothing else than the actual velocity of motion, which,
in view of the fact that the direction of the motion is perfectly definite,
coincides with the probable velocity $\bar v$ ($p'$ or $p'' = 1$).
It is natural to extend the above interpretation of the expression
$A^2\,d\phi/dx$ as a measure of the current density to any type of wave
function $\psi$, for from this point of view the fact that $A^2\,d\phi/dx$
is constant (i.e. independent of $x$) simply means that the number of copies
passing through different planes $x = x_1$ and $x = x_2$, say, is the same,
just as if they were actual indestructible particles. The law expressed by
the relation (4) would thus be the law of conservation of the number of
copies or of the conservation of probability (see below). If this
interpretation is correct, then it must obviously be possible to write $j$
in the form
$$j = \psi\psi^*\,\bar v, \qquad (10a)$$
where $\bar v$ denotes the probable velocity of the copies at the point in
question. Now this is actually the case if $j$ is defined as
$$j = \frac{h}{2\pi m}A^2\frac{d\phi}{dx}$$
(the coefficient $h/2\pi m$ is the same as in the special case considered
above), which gives the following expression for the probable velocity
$$\bar v = \frac{h}{2\pi m}\frac{d\phi}{dx}. \qquad (10b)$$
The ‘phase’ $\phi$ can be expressed in terms of the function
$\psi = Ae^{i\phi}$ and its conjugate complex $\psi^* = Ae^{-i\phi}$ by
means of the formula
$$\phi = \frac{1}{2i}\log\frac{\psi}{\psi^*},$$
whence it follows that
$$\bar v = \frac{h}{4\pi i m}\left(\frac{1}{\psi}\frac{d\psi}{dx} - \frac{1}{\psi^*}\frac{d\psi^*}{dx}\right) = \frac{h}{2\pi m}\,R\!\left(\frac{1}{i\psi}\frac{d\psi}{dx}\right) \qquad (10c)$$
or, according to (8 a),
$$\bar v = \frac{1}{m}\,R\!\left(\frac{dS}{dx}\right),$$
$R(f)$ denoting the real part of $f$. In the classical theory this equation
reduces to $\bar v = v$, in accordance with the fact that the motion proceeds
in a perfectly definite direction, the probabilities $p'$ and $p''$ being
equal respectively to 1 and 0. In the wave-mechanical theory $|\bar v|$ is,
in general,
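different from $|v|$, the values of the probabilities $p'$ and $p''$ being
different from both 1 and 0. They can be determined from $v$ and $\bar v$ by
means of the formula
$$p' = \frac{1}{2}\left(1 + \frac{\bar v}{|v|}\right), \qquad p'' = \frac{1}{2}\left(1 - \frac{\bar v}{|v|}\right),$$
which follows from (10) together with the condition $p' + p'' = 1$.
Substituting (10 c) in (10 a), we get the following expression for the
current density:
$$j = \frac{h}{4\pi i m}\left(\psi^*\frac{d\psi}{dx} - \psi\frac{d\psi^*}{dx}\right) = \frac{h}{2\pi m}\,R\!\left(\psi^*\frac{1}{i}\frac{d\psi}{dx}\right). \qquad (11)$$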
We shall now check these results by applying them to two simple cases.
We shall put first
$$\psi = A'e^{i\alpha x} + A''e^{-i\alpha x},$$
which corresponds to the free motion of a particle along the $x$-axis in
an unspecified direction.
Assuming the coefficients $A$ for the sake of simplicity to be real (this
condition does not involve any loss of generality, for it can always be
satisfied by a suitable choice of the origin $x = 0$), we have
$$\psi^* = A'e^{-i\alpha x} + A''e^{i\alpha x},$$
whence
$$\psi^*\frac{1}{i}\frac{d\psi}{dx} = \alpha(A'^2 - A''^2) + i\,2\alpha A'A''\sin 2\alpha x,$$
so that $j$ reduces to the constant value
$$j = \frac{h\alpha}{2\pi m}(A'^2 - A''^2)$$
or
$$j = |v|\,(A'^2 - A''^2). \qquad (11a)$$
Unlike $j$, the probable velocity
$$\bar v = \frac{j}{\psi\psi^*} = |v|\,\frac{A'^2 - A''^2}{A'^2 + A''^2 + 2A'A''\cos 2\alpha x}$$
is a function of $x$, varying periodically between the values
$$\bar v_{\max} = |v|\,\frac{A' + A''}{A' - A''} \quad\text{and}\quad \bar v_{\min} = |v|\,\frac{A' - A''}{A' + A''}.$$
The fact that the maximum value of the probable velocity $\bar v$ turns
out to be larger than the magnitude of the classical velocity $|v|$
invalidates the idea considered above of taking the latter over into the
wave-mechanical theory as the magnitude of the ‘actual’ velocity. With
$|\bar v/v| > 1$ formula (10) leads to values of the probabilities $p$ which
are devoid of physical meaning, one of them being larger than 1 and the
other smaller than 0. Although the classical velocity can be determined
wave-mechanically from the wave-length $\lambda$ (by means of the formula
$|v| = h/(m\lambda)$), yet it is the probable velocity $\bar v$ only which
has a direct physical significance.
This is also clearly seen if we take as a second example the case
$$\psi = A'e^{\beta x} + A''e^{-\beta x},$$
corresponding to a region of total reflection where the kinetic energy is
negative and the velocity $v$ is imaginary. We have in this case
$\psi^* = \psi$, $j = 0$, and $\bar v = 0$, as might be expected.
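The first example above can be checked numerically. The following is a minimal sketch, not part of the original text: the amplitudes, the wave number, and the choice of units in which $h/2\pi m = 1$ are illustrative assumptions, and the function names are hypothetical. It evaluates the current density (11) and the probable velocity $j/\psi\psi^*$ on a grid, and then the probabilities $p'$ that formula (10) would assign.

```python
import cmath

def psi(x, a1, a2, alpha):
    """Wave function A' exp(i alpha x) + A'' exp(-i alpha x), real amplitudes."""
    return a1 * cmath.exp(1j * alpha * x) + a2 * cmath.exp(-1j * alpha * x)

def dpsi_dx(x, a1, a2, alpha):
    """Analytic derivative of psi with respect to x."""
    return 1j * alpha * (a1 * cmath.exp(1j * alpha * x)
                         - a2 * cmath.exp(-1j * alpha * x))

def current(x, a1, a2, alpha, h_over_2pi_m=1.0):
    """Current density (11): j = (h / 2 pi m) * Re(psi* (1/i) dpsi/dx)."""
    p = psi(x, a1, a2, alpha)
    dp = dpsi_dx(x, a1, a2, alpha)
    return h_over_2pi_m * (p.conjugate() * dp / 1j).real

def probable_velocity(x, a1, a2, alpha, h_over_2pi_m=1.0):
    """Probable velocity: current density divided by psi psi*."""
    p = psi(x, a1, a2, alpha)
    return current(x, a1, a2, alpha, h_over_2pi_m) / abs(p) ** 2

# Illustrative values (assumptions, not taken from the text).
a1, a2, alpha = 1.0, 0.5, 2.0
v_classical = 1.0 * alpha          # |v| = (h / 2 pi m) * alpha in these units

xs = [k * 0.01 for k in range(1000)]
js = [current(x, a1, a2, alpha) for x in xs]
vs = [probable_velocity(x, a1, a2, alpha) for x in xs]
p_forward = [0.5 * (1.0 + v / v_classical) for v in vs]   # p' from formula (10)

print("j      min/max:", min(js), max(js))
print("v_bar  min/max:", min(vs), max(vs))
print("p'     max    :", max(p_forward))
```

With these values the current stays at $|v|(A'^2 - A''^2)$ for every $x$, while the probable velocity oscillates and exceeds $|v|$ near the nodes of $\psi\psi^*$, so that the largest $p'$ comes out greater than 1, which is the difficulty discussed in the text.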
3. Generalization for Non-stationary Motion in Three Dimensions;
The Hamilton-Jacobi Equation
We shall now generalize the results of the preceding section to the
motion of a particle in three dimensions under the action of forces
derived from a potential-energy function $U$ which may depend not only
upon the coordinates $x, y, z$, but also upon the time $t$.
The wave-mechanical description of such a motion is given by the
generalized equation of Schrödinger
$$\nabla^2\psi - \frac{8\pi^2 m}{h^2}\,U\psi + \frac{4\pi i m}{h}\frac{\partial\psi}{\partial t} = 0. \qquad (12)$$
Our main object will be to trace the relation of this equation to the
corresponding classical equations of motion,
$$m\frac{d^2x}{dt^2} = -\frac{\partial U}{\partial x}, \qquad m\frac{d^2y}{dt^2} = -\frac{\partial U}{\partial y}, \qquad m\frac{d^2z}{dt^2} = -\frac{\partial U}{\partial z}. \qquad (12a)$$
The general character of this relation can be described in a way similar
to that used for the one-dimensional motion discussed above. The
fundamental characteristics of the wave-mechanical theory can thus be
partially reduced, as before, to the ambiguity arising from the
phenomenon of partial reflection and partial transmission, a phenomenon
which implies a sudden change in the direction of the velocity, its
magnitude being assumed to be the same function of the coordinates
as in the classical theory.
The uncertainty in the direction of the velocity, which in the case
of one-dimensional motion was equivalent to an ambiguity of sign, is
now, in the case of motion in space, of a still more distressing
character. However, we may still expect this uncertainty, as well as
partial reflection, to vanish in the limiting case of motion corresponding
to infinitely short wave-lengths (which can be realized by an increase
of velocity or of mass, or by a fictitious decrease of the constant $h$).
Thus in this limiting case equation (12) must become equivalent to
equations (12 a) in the sense of admitting particular solutions
corresponding to a perfectly definite type of classical motion.
To demonstrate this equivalence we shall replace the particle under
consideration by an assembly of copies distributed and moving in space
like the particles of some continuous fluid (without interaction of
course!). The velocity vector $\mathbf{v}$ of each copy can then be
defined, according to the classical theory, as a function of the
coordinates $x, y, z$ of the (fixed) point through which this copy is
passing, and of the time, the motion being not necessarily a steady one.
It should be noticed that the partial derivative
$\partial\mathbf{v}/\partial t$ of $\mathbf{v}$ with regard to the time
does not define the acceleration of a given copy, for it refers to different
copies passing through the same point at different instants of time $t$
and $t + dt$. This acceleration can be defined by the total derivative
$d\mathbf{v}/dt$, its $x$-component being thus given by
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\frac{dx}{dt} + \frac{\partial v_x}{\partial y}\frac{dy}{dt} + \frac{\partial v_x}{\partial z}\frac{dz}{dt}$$
or
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_x}{\partial y} + v_z\frac{\partial v_x}{\partial z}. \qquad (13)$$
We shall now assume the motion of the fluid formed by our assembly
of copies to be irrotational, which means that the velocity vector can
be represented as the gradient of a scalar function, the so-called ‘velocity
potential’. We shall denote this function by $s/m$ and put accordingly
$$m\mathbf{v} = \nabla s, \qquad (13a)$$
that is
$$v_x = \frac{1}{m}\frac{\partial s}{\partial x}, \qquad v_y = \frac{1}{m}\frac{\partial s}{\partial y}, \qquad v_z = \frac{1}{m}\frac{\partial s}{\partial z}.$$
We make this assumption (which is by no means necessary) not only
because we desire to simplify the formulation of the classical theory as
applied to the copy assembly, but also because we wish to establish the
connexion between this theory and the wave-mechanical theory. We
have in fact, for a wave propagated in one definite direction, a relation
exactly similar to (13 a) between the phase $\phi$ and the vector
$\boldsymbol{\alpha}$ whose direction is the direction of propagation and
whose length is $2\pi/\lambda$, where $\lambda$ is the value of the
wave-length at the corresponding point:
$$\boldsymbol{\alpha} = \nabla\phi. \qquad (14)$$
If we put
$$\boldsymbol{\alpha} = \frac{2\pi}{h}\,m\mathbf{v} \qquad (14a)$$
according to de Broglie’s relation, we get
$$\phi = \frac{2\pi}{h}\,s \qquad (14b)$$
as before [cf. (0), § 1]. Thus, by assuming irrotational motion of the
assembly of copies, it becomes possible to establish a connexion between
the motion of a particle and the propagation of waves in the limiting
case of infinitely short waves, i.e. when partial reflection is excluded
and the motion of every copy of the particle proceeds along a perfectly
definite path; this path can be considered as the ‘ray’ passing through
the point at which the copy in question was initially situated. If partial
reflection does take place the idea of rays loses all meaning, each ray
branching into two at every point. Only by neglecting reflection can
one speak of rays as lines along which the waves, i.e. the surfaces of
constant phase, are propagated.
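Returning to the expression (13) for the $x$-component of the acceleration
of the copy passing through the point $x, y, z$ at the instant $t$ we
can, because of (13 a), rewrite it in the form
$$\frac{dv_x}{dt} = \frac{1}{m}\frac{\partial^2 s}{\partial x\,\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_y}{\partial x} + v_z\frac{\partial v_z}{\partial x},$$
since
$$\frac{\partial v_x}{\partial y} = \frac{1}{m}\frac{\partial^2 s}{\partial x\,\partial y} = \frac{\partial v_y}{\partial x}, \quad\text{etc.},$$
or
$$\frac{dv_x}{dt} = \frac{1}{m}\frac{\partial}{\partial x}\left[\frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2\right].$$
The equation $m\,dv_x/dt = -\partial U/\partial x$, which is the first of
the equations (12 a), is thus equivalent to
$$\frac{\partial}{\partial x}\left[\frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2 + U\right] = 0.$$
Similar results are obtained for the second and the third equations, and
so all three of them can be replaced by the single equation
$$\frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2 + U = F(t),$$
where $F(t)$ is an arbitrary function of the time alone. This function,
without loss of generality, can be put equal to zero, for it corresponds
to an additive term $\int F(t)\,dt$ in $s$ which is irrelevant for the
determination of the velocity according to (13 a). The function $s$ can
thus be defined by the equation
$$\frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2 + U = 0. \qquad (15)$$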
This equation was established by Hamilton and Jacobi and bears
their name. In the special case when $U$ does not depend upon the time
explicitly (constant field of force), the function $s$, usually called the
(mechanical) ‘action’, reduces to
$$s = s_0(x, y, z) - Wt, \qquad (15a)$$
where $s_0$ is determined by the equation
$$\frac{1}{2m}(\nabla s_0)^2 + U = W. \qquad (15b)$$
Here $W$ is a constant which can obviously be defined as the energy.
Thus, in a sense, equation (15 b), in conjunction with the relation
(13 a), expresses the law of the conservation of energy. However, as we
have just seen, it expresses much more than that,† since, in conjunction
with (13 a), it is equivalent to the three classical equations of motion
(12 a) for the special case of an invariable field of force and of a fixed
value of the total energy. The equations (12 a) and (15 b), or more
generally (15), are formally different because the former refer to an
individual particle, while the latter refer to a continuous assembly of
copies of this particle. If we select a definite copy and follow its motion
we come back to equations (12 a).
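The statement that (15 b) together with (13 a) reproduces the equation of motion can be illustrated in one dimension. The sketch below is only an illustration under stated assumptions: the harmonic potential $U = \tfrac12 kx^2$, the numerical values of $m$, $k$, $W$, and all function names are chosen here and do not come from the text. It takes the positive branch $ds_0/dx = \sqrt{2m(W-U)}$, forms the velocity field $v = (1/m)\,ds_0/dx$, and compares $m\,v\,dv/dx$ (the steady-flow form of (13)) with the force $-dU/dx$ by central differences.

```python
import math

# Illustrative mass, spring constant and energy (assumptions, not from the text).
m, k, W = 2.0, 3.0, 5.0

def U(x):
    """Assumed potential energy: harmonic well."""
    return 0.5 * k * x * x

def ds0_dx(x):
    """Positive-branch solution of (15 b) in one dimension."""
    return math.sqrt(2.0 * m * (W - U(x)))

def velocity(x):
    """Velocity field of the copy assembly from (13 a): v = (1/m) ds0/dx."""
    return ds0_dx(x) / m

def acceleration(x, eps=1e-6):
    """Acceleration of the copy passing x for steady flow: v dv/dx, cf. (13)."""
    dv_dx = (velocity(x + eps) - velocity(x - eps)) / (2.0 * eps)
    return velocity(x) * dv_dx

def force(x, eps=1e-6):
    """Force -dU/dx by central difference."""
    return -(U(x + eps) - U(x - eps)) / (2.0 * eps)

# Sample points inside the classically allowed region |x| < sqrt(2W/k).
sample_points = (-1.0, -0.3, 0.0, 0.7, 1.2)
for x in sample_points:
    print(f"x = {x:5.2f}   m*a = {m * acceleration(x): .6f}   force = {force(x): .6f}")
```

The two columns agree to within the finite-difference error, as expected from $m\,v\,dv/dx = d(W-U)/dx$.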
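† Except in the one-dimensional case.

It can now easily be shown that in the limiting case of infinitely
small wave-length the wave equation (12) admits particular solutions
of the form $\psi = Ae^{i\phi}$, representing a one-sided propagation of
waves which can be associated, by means of the relations (14), (14 a), and
(14 b), with the motion of the particle in question according to the
classical theory, the different ‘rays’ coinciding with the paths of the
different copies of this particle.
Putting $\psi = Ae^{i\phi}$, we get in the same way as in § 1
$$\nabla^2\psi = \left[\nabla^2 A + 2i\,\nabla A\cdot\nabla\phi + iA\,\nabla^2\phi - A(\nabla\phi)^2\right]e^{i\phi}.$$
We have further
$$\frac{\partial\psi}{\partial t} = \left(\frac{\partial A}{\partial t} + iA\frac{\partial\phi}{\partial t}\right)e^{i\phi}.$$
Substituting these expressions in equation (12), cancelling the common
factor $e^{i\phi}$, and separating the real and imaginary parts, we obtain
the two equations:
$$\nabla^2 A - A(\nabla\phi)^2 - \frac{8\pi^2 m}{h^2}\,UA - \frac{4\pi m}{h}\,A\frac{\partial\phi}{\partial t} = 0$$
and
$$\frac{4\pi m}{h}\frac{\partial A}{\partial t} + 2\,\nabla A\cdot\nabla\phi + A\,\nabla^2\phi = 0.$$
If $\phi$ is replaced by $2\pi s/h$, in accordance with (14 b), these
equations become
$$-\frac{h^2}{8\pi^2 m}\frac{\nabla^2 A}{A} + \frac{\partial s}{\partial t} + \frac{1}{2m}(\nabla s)^2 + U = 0, \qquad (16)$$
$$2m\frac{\partial A}{\partial t} + 2\,\nabla A\cdot\nabla s + A\,\nabla^2 s = 0. \qquad (16a)$$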
Putting $h = 0$ we see that the first of these equations reduces to the
Hamilton-Jacobi equation (15). The same result is obtained if
$\nabla^2 A = 0$, which must obviously express the general condition for
one-sided propagation of waves of finite length. In both cases the
wave-mechanical theory becomes completely equivalent to the classical
theory. Both cases are, of course, fictitious, $h$ being a constant and the
equation $\nabla^2 A = 0$ being satisfied only under very special
conditions, in particular for force-free motion. The equation (16) can,
however, reduce approximately to (15) in the case of a nearly one-sided wave
propagation with a very weak partial reflection, so weak that the
reflected (or scattered) waves can be neglected. This condition is more
nearly approached the larger the mass $m$ of the particle for a given
velocity or the larger the velocity for a given mass, i.e. the smaller the
wave-length, if we are treating motion corresponding to a constant
value of the energy $W$. In the latter case the wave-length becomes
a definite function of the coordinates. In the general case the idea of
wave-length has no precise meaning and can be introduced only by
representing the wave function $\psi$ as a superposition of waves with
different frequencies, corresponding to motions with different energies.
If $U$ does not contain the time explicitly, equations (16) and (16 a)
admit particular solutions of the type $s = s_0(x, y, z) - Wt$ and
$A = A(x, y, z)$, i.e. $\partial s/\partial t = -W$ and
$\partial A/\partial t = 0$. They therefore reduce to
$$-\frac{h^2}{8\pi^2 m}\frac{\nabla^2 A}{A} + \frac{1}{2m}(\nabla s_0)^2 + U = W \qquad (17)$$
and
$$2\,\nabla A\cdot\nabla s_0 + A\,\nabla^2 s_0 = 0. \qquad (17a)$$
In the limiting case $h = 0$ the first of these becomes equivalent to the
classical equation (15 b).
This equivalence, as well as the approximate equivalence which can
be obtained in the case of large values of $W$ or $m$, must not be
misunderstood. It refers to particular solutions of equations (17) and
(17 a), or of the corresponding Schrödinger equation
$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}(W - U)\,\psi = 0 \qquad (17b)$$
with
$$\psi = Ae^{i2\pi(s_0 - Wt)/h} = \psi^0(x, y, z)\,e^{-i2\pi Wt/h}, \qquad (17c)$$
that is, to solutions which represent, approximately, waves travelling in
a definite direction (the direction may, of course, vary from point to
point, being defined by the direction of the ‘rays’ passing through these
points). Now the general solution of (17 b) in the case of short waves
can be represented as a superposition of a number of such particular
solutions corresponding to waves travelling in different directions, under
the limitations imposed by boundary conditions (in the case of long
waves this is possible for force-free motion only). The classical equation
(15 b), on the other hand, does not admit of such superposition for the
function $\psi$ defined as $Ae^{i2\pi s/h}$. This can clearly be seen in the
simple case of one-dimensional motion where $A$ is connected with $s$ by the
relation $A^2\,ds/dx = C$ [cf. (4), § 1], so that
$$\psi = \sqrt{\frac{C}{ds/dx}}\;e^{i2\pi s/h}.$$
The physical reason for this is that ‘superposition’ of two different types
of motion would mean, according to classical mechanics, their ‘simultaneous
realization’, an obviously impossible thing if they are alternative. In
wave mechanics, on the contrary, it is just this alternative character
which is expressed by superposition, the latter corresponding to the
addition law of the classical probability theory. Similar results apply
to the general equations (12) and (15), the former allowing the
superposition of processes with different energies if $U$ does not depend
upon the time, while the latter reduces in this case to equation (15 b)
corresponding to one definite value of the energy $W$.
The non-validity of the superposition principle in classical mechanics
can easily be demonstrated with the help of the function
$S = \frac{h}{2\pi i}\log\psi$ introduced in § 2 [eq. (8)]. This function
satisfies the differential equation
$$\frac{h}{4\pi i m}\nabla^2 S + \frac{\partial S}{\partial t} + \frac{1}{2m}(\nabla S)^2 + U = 0, \qquad (18)$$
which is obtained from Schrödinger’s equation (12) by the substitution
$\psi = e^{i2\pi S/h}$ and which reduces to the Hamilton-Jacobi equation
(15) if $h$ is put equal to zero. The function $S$ thus coincides in this
case with the function $s$, which means that the amplitude $A$ can be
considered as practically constant.
Now if in the Hamilton-Jacobi equation (15) we put
$s = S = \frac{h}{2\pi i}\log\psi$, we get the following ‘approximate’
equation for $\psi$:
$$\frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2(\nabla\psi)^2 + \frac{h}{2\pi i}\,\psi\frac{\partial\psi}{\partial t} + U\psi^2 = 0$$
or
$$(\nabla\psi)^2 - \frac{8\pi^2 m}{h^2}\left(U\psi + \frac{h}{2\pi i}\frac{\partial\psi}{\partial t}\right)\psi = 0, \qquad (18a)$$
which is quadratic and of the first order (like the equation for $S$) instead
of being linear and of the second order like the exact equation of
Schrödinger. If $\psi_1$ and $\psi_2$ are two particular solutions of
(18 a), the function $\psi = \psi_1 + \psi_2$ will not in general represent
a solution of this equation.
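The failure of superposition for the quadratic equation can be seen in the simplest stationary, force-free case. For $\psi \propto e^{-i2\pi Wt/h}$ and $U = 0$ the quadratic equation (18 a) becomes $(d\psi/dx)^2 + k^2\psi^2 = 0$, while the Schrödinger equation becomes $d^2\psi/dx^2 + k^2\psi = 0$, with $k^2 = 8\pi^2 mW/h^2$. The sketch below is an illustration only; the value of $k$, the sample point, and the function names are assumptions made here. It evaluates both left-hand sides for the two travelling waves $e^{\pm ikx}$ and for their sum.

```python
import cmath

k = 1.7   # illustrative wave number (assumption); k^2 = 8 pi^2 m W / h^2, U = 0

def wave(x, sign):
    """Travelling wave exp(sign * i k x) with its first and second derivatives."""
    p = cmath.exp(sign * 1j * k * x)
    return p, sign * 1j * k * p, -k * k * p

def quadratic_residual(p, dp):
    """Left side of the stationary, force-free form of (18 a)."""
    return dp * dp + k * k * p * p

def linear_residual(p, d2p):
    """Left side of the stationary, force-free Schrodinger equation."""
    return d2p + k * k * p

x = 0.4                                  # arbitrary sample point
p1, dp1, d2p1 = wave(x, +1)
p2, dp2, d2p2 = wave(x, -1)
ps, dps, d2ps = p1 + p2, dp1 + dp2, d2p1 + d2p2   # superposition

print("quadratic, psi_1      :", abs(quadratic_residual(p1, dp1)))
print("quadratic, psi_2      :", abs(quadratic_residual(p2, dp2)))
print("quadratic, psi_1+psi_2:", abs(quadratic_residual(ps, dps)))
print("linear,    psi_1+psi_2:", abs(linear_residual(ps, d2ps)))
```

Each travelling wave satisfies both equations, but for the sum $2\cos kx$ the quadratic left-hand side equals $4k^2$ at every point, whereas the linear one still vanishes.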
Returning to the representation of the exact wave function in the
form $\psi = Ae^{i2\pi s/h}$, and considering equation (16 a) connecting $A$
and $s$, which has been disregarded hitherto, we see that this equation can
be simplified if multiplied by $A$. We have in fact
$2A\,\partial A/\partial t = \partial A^2/\partial t$ and
$$2A\,\nabla A\cdot\nabla s + A^2\nabla^2 s = \nabla(A^2)\cdot\nabla s + A^2\nabla^2 s = \operatorname{div}(A^2\nabla s);$$
so that
$$\frac{\partial A^2}{\partial t} + \operatorname{div}\!\left(A^2\frac{\nabla s}{m}\right) = 0. \qquad (19)$$
This equation is of the same form as the equation of continuity, i.e. the
equation of the conservation of mass in hydrodynamics or of the
conservation of electricity in electrodynamics,
$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0,$$
where $\rho$ is the density of mass or electrical charge and $\mathbf{j}$
the corresponding current density. In the present case we can interpret the
quantity
$$A^2 = \psi\psi^* = \rho$$
as the density of the copy assembly (i.e. the relative number of copies
of the given particle in unit volume) or the density of probability. If,
further, we define the corresponding current density by the formula
$$\mathbf{j} = \frac{1}{m}A^2\,\nabla s, \qquad (19a)$$
then equation (19) will express the law of the conservation of the copies
or of the probability. In the classical theory the vector $\nabla s/m$
reduces to the actual velocity $\mathbf{v}$ of the particle (or more exactly
of its copies at the given point), so that $\mathbf{j}$ assumes the usual
form of the product of $\rho$ with $\mathbf{v}$. In the exact
wave-mechanical theory it can also be written in the form
$$\mathbf{j} = \rho\,\bar{\mathbf{v}},$$
where the vector
$$\bar{\mathbf{v}} = \frac{1}{m}\nabla s \qquad (19b)$$
must obviously be interpreted as the probable velocity. The classical
velocity can be computed as usual by means of the formula
$$v = \sqrt{\frac{2}{m}(W - U)},$$
its direction being, however, uncertain. According to the definition of
$A$ and $s$, we have $\psi = Ae^{i2\pi s/h}$, $\psi^* = Ae^{-i2\pi s/h}$,
whence
$$s = \frac{h}{4\pi i}\log\frac{\psi}{\psi^*}$$
and consequently
$$\mathbf{j} = \frac{h}{4\pi i m}(\psi^*\nabla\psi - \psi\nabla\psi^*) = \frac{h}{2\pi m}\,R\!\left(\psi^*\frac{1}{i}\nabla\psi\right). \qquad (20)$$
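Introducing the function $S = \frac{h}{2\pi i}\log\psi$, we get
$\nabla S = \frac{h}{2\pi i}\frac{1}{\psi}\nabla\psi$ and
$\frac{1}{i}\psi^*\nabla\psi = \frac{2\pi}{h}\psi\psi^*\nabla S$, so that
$$\mathbf{j} = \frac{1}{m}\psi\psi^*\,R(\nabla S) \qquad (20a)$$
and
$$\bar{\mathbf{v}} = \frac{1}{m}\,R(\nabla S). \qquad (20b)$$
Comparing this with (19 b), we see that the function $s$ is equal to the
real part of $S$, in accordance with the relation
$S = s + \frac{h}{2\pi i}\log A$ which results from comparing the two
expressions $e^{i2\pi S/h}$ and $Ae^{i2\pi s/h}$ for $\psi$.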
The probable velocity (20 b) could be represented in the form
$$\bar{\mathbf{v}} = |v|\int \mathbf{n}\,p(\mathbf{n})\,d\omega,$$
where $\mathbf{n}$ is the unit vector which defines the direction of the
classical velocity and $p(\mathbf{n})\,d\omega$ is the probability that this
unit vector lies in the infinitely small solid angle $d\omega$. An
unambiguous determination of this probability appears, however, to be
impossible, except for one-dimensional motion considered in the preceding
section. This is quite natural if we remember that the notion of classical
velocity, as measured by the time derivative of the coordinates, cannot be
taken over into wave mechanics.
It should be mentioned in conclusion that the relation between wave
mechanics and classical mechanics is usually compared with the relation
between wave optics and the so-called geometrical optics, the latter
being defined as the limiting case of wave optics for very small
wave-lengths. This statement would, however, be misleading unless we add
to it that in geometrical optics partial reflection of light (which actually
decreases with decrease of wave-length) should be wholly left out of
account, even in its simplest form on the boundary surface between
two homogeneous media. In this case, and only in this case, is it possible
to introduce the idea of rays as lines along which the propagation
of light takes place (this is why geometrical optics is often called ‘ray
optics’ in contradistinction to wave optics, where the idea of ‘rays’ has
in general no meaning). It was the merit of Hamilton to show, one
hundred years ago, that in this limiting case the wave conception of
light can be replaced by the corpuscular conception, and that the rays
can be described as the paths of light particles moving, according to
Newton’s classical law, in a certain field of force. The potential energy
$U$ of this field of force is determined by the refractive index $\mu$
according to the relation
$$\mu^2 = \gamma\,(W - U),$$
where $\gamma$ is a constant depending upon the definition of the mass of
a light particle.† But perhaps the main merit of Hamilton’s work was
that he applied the same considerations to the motion of particles of
ordinary matter, thus for the first time associating such motion with the
propagation of (infinitely short) waves and describing it by equation (15).
This association of particles with waves, which in Hamilton’s theory
was achieved by interpreting the ‘mechanical action’ $s$ as a measure
of the phase function $\phi$, was, however, completely forgotten for
a hundred years, until de Broglie rediscovered it in the way described
in Part I, and Schrödinger introduced his wave equation, whose relation
to the Hamilton-Jacobi equation has been discussed above.

† This relation is obtained in the simplest way by comparing de Broglie’s
formula for the wave-length, $1/\lambda = \sqrt{2m(W - U)}/h$, with the
formula $\lambda_0/\lambda = \mu$, which can be considered as the definition
of the refractive index, $\lambda_0$ being the value of $\lambda$ in vacuo,
i.e. for a place where $\mu = 1$.
This mutual reaction of optics and mechanics must not be misinterpreted
as an indication of a true analogy between them, in the sense
of a wave-corpuscular duality of light. We must not be led by it to
infer the real existence of photons, moving in material bodies according
to the laws of wave mechanics. For we could replace optics by acoustics,
i.e. light vibrations by mechanical vibrations propagated in the form
of waves in elastic media according to an equation of exactly the same
kind as the differential equation for the light waves. In the limiting
case of infinitely short acoustical waves we could therefore obtain
exactly the same results as in optics, i.e. a kind of ‘ray acoustics’
instead of a ‘wave acoustics’. This would enable one to formulate a
corpuscular theory of sound and describe the propagation of sound as
the motion, according to wave mechanics, of certain particles, e.g.
‘phonons’. I do not think, however, that anybody would believe in the
reality of such ‘phonons’. This does not mean, of course, that the
photons are equally unreal, for the analogy between acoustics and optics
is just as superficial as that between optics and mechanics (or acoustics
and the mechanics of single particles). I am inclined, however, to
think that photons have no more reality than ‘phonons’, and that they
are created by a ‘reflection’, as it were, of the wave-corpuscular duality
of matter in the phenomena of light (cf. Part I).
4. Comparison of the Approximate Solutions of Schrödinger’s
Equation; Comparison of Classical and Wave-mechanical
Average Values
Although in the case $h = 0$ the functions $s$ and $S$ satisfy the same
equation, namely that of Hamilton and Jacobi, yet the approximate
expressions for $\psi$ obtained therefrom, according to the formulae
$\psi = Ae^{i2\pi s/h}$ and $\psi = e^{i2\pi S/h}$, turn out to be somewhat
different, for the ‘amplitude’ $A$ obtained by means of equation (16 a) is
in general a certain function of the coordinates (and the time), varying
very slowly compared with the ‘phase factor’ $2\pi s/h$.
The discrepancy between the two approximate solutions is due to
the fact that the error introduced by putting $h = 0$ is larger in the
case of equation (18), which contains $h$ in the first power, than in
the case of equations (16) and (16 a), where $h$ appears in the second
power. In the latter case we thus drop a small term of the second order,
while in the former case we drop a much larger term of the first order.
In order to remove this discrepancy we must put
$$S = S^0 + \frac{h}{2\pi i}S', \qquad (21)$$
and after substituting this expression in equation (18) drop terms which
are quadratic in $h$ but keep those which are linear in $h$ ($S^0$ and $S'$
being independent of $h$ and therefore of the same order of magnitude). We
thus get the approximate equation
$$\frac{h}{4\pi i m}\nabla^2 S^0 + \frac{\partial S^0}{\partial t} + \frac{h}{2\pi i}\frac{\partial S'}{\partial t} + \frac{1}{2m}(\nabla S^0)^2 + \frac{h}{2\pi i m}\nabla S^0\cdot\nabla S' + U = 0. \qquad (21a)$$
Here $S^0$ must be regarded as the zero approximation, corresponding
to $h = 0$, i.e. as the solution of the Hamilton-Jacobi equation
$$\frac{\partial S^0}{\partial t} + \frac{1}{2m}(\nabla S^0)^2 + U = 0.$$
It can obviously be identified with the (approximate) function $s$.
The function $S'$ must therefore satisfy the equation
$$\frac{1}{2m}\nabla^2 S^0 + \frac{\partial S'}{\partial t} + \frac{1}{m}\nabla S^0\cdot\nabla S' = 0, \qquad (21b)$$
whence it follows that $S'$ is a real quantity. Now according to (21) we
have $\psi = e^{i2\pi S/h} = e^{S'}e^{i2\pi S^0/h}$, so that, since
$S^0 = s$, $e^{S'}$ must be equal to $A$. Substituting in (21 b)
$$S' = \log A \qquad (21c)$$
we do indeed get equation (16 a). It may seem that by developing the
function $S$ in a series of powers of the parameter $h/(2\pi i)$
$$S = S^0 + \frac{h}{2\pi i}S' + \left(\frac{h}{2\pi i}\right)^2 S'' + \dots$$
and solving the equation (18) by successive approximations, one can
obtain as good an approximation for $S$ as may be desired. This assumption
is, however, incorrect, for it can be shown that the preceding series
is divergent or rather semi-convergent, which explains why one gets
a closer approximation by keeping the first-order term, as has been
done above. In fact the general solution of a differential equation
of the second order cannot be approximated to by starting
with the solution of the equation of the first order obtained by
dropping the second-order terms, however small the parameter by
which they are multiplied may be, just as a quadratic equation cannot
be approximated to by the linear one obtained by dropping the quadratic
term. If, however, the latter is multiplied by a small parameter, then
one of the two solutions of the quadratic equation can be approximated
to by the solution of the linear one. A similar relationship exists
between the function $\psi = e^{i2\pi S^0/h + S'}$ and one of the particular
solutions of Schrödinger’s equation, representing approximately waves
travelling in one direction. It should be mentioned that this direction
need not remain constant; it can be changed by total reflection, which,
in contradistinction to partial reflection, is a phenomenon perfectly
compatible with classical mechanics since it does not involve any
ambiguity and therefore does not challenge a deterministic description
of the motion. The difference between classical mechanics and wave
mechanics in the approximate form given above, in so far as total
reflection is concerned, consists only in the fact that, according to the
latter, the particle can penetrate into those regions of the field of force
where its ‘classical’ velocity becomes imaginary.
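The remark about the quadratic equation made in the preceding paragraph can be made concrete with a small numerical sketch. Everything in it is an illustrative assumption: the equation $\epsilon x^2 + bx + c = 0$ with $b = 1$, $c = -1$, the values of the small parameter $\epsilon$, and the helper name `roots_of_quadratic`. One root tends to the solution $x = -c/b$ of the linear equation as $\epsilon$ decreases, while the other grows like $-b/\epsilon$ and has no counterpart in the linear equation.

```python
import math

def roots_of_quadratic(eps, b, c):
    """Both roots of eps*x**2 + b*x + c = 0, assuming b > 0 and real roots.

    Returns (regular_root, singular_root); the form used avoids
    cancellation when eps is small.
    """
    q = -0.5 * (b + math.sqrt(b * b - 4.0 * eps * c))
    return c / q, q / eps

b, c = 1.0, -1.0                 # illustrative coefficients (assumption)
linear_root = -c / b             # solution of the linear equation b*x + c = 0

for eps in (1e-1, 1e-2, 1e-3):
    regular, singular = roots_of_quadratic(eps, b, c)
    print(f"eps = {eps:7.0e}   regular - linear = {regular - linear_root: .6f}"
          f"   singular = {singular: .3f}")
```

The regular root differs from the linear solution by a quantity of order $\epsilon$; the singular root is lost entirely when the quadratic term is dropped, in the same way as the second particular solution of the second-order differential equation.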
According to the relation $\mathbf{v} = \nabla s/m = \nabla S^0/m$, it
should follow that the functions $s$ and $S^0$ must also become imaginary.
So far as $S^0$ is concerned this is perfectly true. The function $s$,
however, according to its definition, must remain real. It will therefore
be different from $S^0$ for those regions where $v$ is imaginary and will
satisfy an equation different from that of Hamilton and Jacobi. We must
remember that equations (16) and (16 a) were obtained on the assumption
that both $s$ and $A$ were real. The assumption that $s$ satisfies
approximately the Hamilton-Jacobi equation, even when the latter gives
imaginary values for it, would thus imply a contradiction.
This means that, in the case under consideration, $\nabla^2 A$ must be very
large and of the order of magnitude of $1/h^2$, so that the first term in
equation (16) or (17), which when omitted reduces (16) or (17) to the
Hamilton-Jacobi equation, cannot be dropped. We shall not consider
the approximate solution of equations (16) or (16 a) [or (17) and (17 a)]
for this case. It is simpler to use instead the alternative representation
of $\psi$ by means of the function $S = S^0 + S'h/(2\pi i)$ since we do not
have to worry about the reality of $S^0$. An imaginary value of $S^0$ leads,
according to (21 b), to an imaginary value of $S'$. The role of the
functions $S^0$ and $S'$ as determining the phase and the amplitude
respectively will thus be reversed for classically forbidden regions, so
that, using the expression $Ae^{i2\pi s/h}$ for $\psi$, we can put
$$A = e^{i2\pi S^0/h} = e^{\pm 2\pi|S^0|/h} \qquad (22)$$
and
$$s = \frac{h}{2\pi i}S' = \pm\frac{h}{2\pi}|S'|. \qquad (22a)$$
The sign ($+$ or $-$) is determined by the condition that $A$ (i.e. $\psi$)
must decrease with increased penetration into the forbidden region. It can
easily be proved directly that the expressions (22) and (22 a) constitute
an approximate solution of the equations (16) and (16 a) for the case
in question if the functions $S^0$ and $S'$ are determined respectively by
the Hamilton-Jacobi equation and by equation (21 b).
Returning to the case when $S^0$ is real (and equal to $s$), corresponding
to the motion in the classically allowed region of the field of force, let
us examine the approximate values which are obtained for the amplitude
$A = e^{S'}$.
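We shall first consider the simplest case of a one-dimensional motion
with constant energy. We have in this case, according to (4),
$$A^2\frac{ds}{dx} = \text{const.},$$
that is, since $ds/dx = mv$,
$$A^2 = \frac{C^2}{v}, \qquad (23)$$
where $C^2$ denotes a positive number. We thus get approximately
$$\psi = \frac{C}{\sqrt{v}}\,e^{i2\pi(s_0 - Wt)/h}, \qquad (23a)$$
$s_0(x)$ being a solution of the equation
$$\frac{1}{2m}\left(\frac{ds_0}{dx}\right)^2 + U = W.$$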
Formula (23) has a very simple physical meaning. It shows that the
probability of finding the particle within a certain region between $x$ and
$x + dx$ is inversely proportional to its velocity in this region. This is
just what we should expect if this probability were defined as proportional
to the time $dt = dx/v$ which the particle spends in the region
in question. We thus see that the interpretation of the quantity
$\psi\psi^*\,dx = A^2\,dx$ as the relative probability of finding the particle in
the region $dx$ is in agreement, so far as the approximate expression for
$\psi$ is used, with the classical definition of probability in terms of
duration.
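The proportionality of the probability to $1/v$ can be checked numerically. The following is only an illustrative sketch, not part of the derivation: it assumes a harmonic oscillator $x = -a\cos\omega t$, for which $v = \omega\sqrt{a^2 - x^2}$, and the function names and parameter values are hypothetical choices. It compares the average of $f(x)$ taken with respect to the time over a one-way trip with the average taken with the weight $dx/v$.

```python
import math

def time_average(f, a=1.0, omega=1.0, n=100_000):
    """Average f(x(t)) over a one-way trip from x = -a to x = +a."""
    half_period = math.pi / omega          # duration of the one-way trip
    dt = half_period / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt                 # midpoint of the i-th time step
        total += f(-a * math.cos(omega * t))
    return total / n

def velocity_weighted_average(f, a=1.0, omega=1.0, n=200_000):
    """Average f(x) with the weight dx / v(x), v = omega * sqrt(a^2 - x^2)."""
    dx = 2.0 * a / n
    numerator = 0.0
    duration = 0.0                         # integral of dx / v, the travel time
    for i in range(n):
        x = -a + (i + 0.5) * dx            # midpoints avoid v = 0 at the turning points
        v = omega * math.sqrt(a * a - x * x)
        numerator += f(x) * dx / v
        duration += dx / v
    return numerator / duration

if __name__ == "__main__":
    square = lambda x: x * x
    print(time_average(square))
    print(velocity_weighted_average(square))
```

Both numbers approach $a^2/2$; the second converges slowly because the weight $1/v$ has an integrable singularity at the turning points.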
If $f(x)$ is some quantity depending upon the position of the particle,
and if the motion of the latter is confined to a limited region of the
x-axis, e.g. between $x_1$ and $x_2$, then the average value of this quantity
in the sense of classical mechanics, i.e. with respect to the time, can
be defined by the expression
$$\bar f = \frac{1}{T}\int_0^T f(x)\,dt \tag{24}$$
taken for a ‘round trip’ of the particle, $T$ representing the duration of
this round trip. The round trip can obviously be replaced by a one-way
trip, since the motion must proceed in the same manner on the two
halves of a round trip, with the sign of the velocity reversed. We can
thus put
$$\bar f = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} f(x)\,dt,$$
where $t_1$ and $t_2$ denote the time of starting from the point $x_1$ and
arriving at the point $x_2$ respectively. Replacing $dt$ by $dx/v$, where $v$ is
a function of $x$ determined by the equation $v^2 = (2/m)(W - U)$, we
get
$$\bar f = \frac{1}{t_2 - t_1}\int_{x_1}^{x_2} f(x)\,\frac{dx}{v}, \tag{24 a}$$
or, if a ‘round trip’ is taken instead of a ‘one-way’ trip,
$$\bar f = \frac{1}{T}\oint f(x)\,\frac{dx}{v},$$
the velocity $v$ being taken with the same sign as $dx$ (i.e. + when $x$ is
increasing from $x_1$ to $x_2$, and − when it is decreasing from $x_2$ to $x_1$).
Now the expression (24 a) for $\bar f$ is identical with that obtained by
means of the wave-mechanical definition of the average value of $f(x)$
according to the formula
$$\bar f = \int_{x_1}^{x_2} f(x)\,\psi\psi^*\,dx, \tag{24 b}$$
if the function $\psi$ is assumed to vanish outside the region $(x_1, x_2)$
and is replaced by its approximate expression (23 a) for this region.
The normalization constant $C$ must be determined by the condition
$\int_{x_1}^{x_2}\psi\psi^*\,dx = 1$, that is,
$$C^2\int_{x_1}^{x_2}\frac{dx}{v} = C^2\int_{t_1}^{t_2} dt = C^2(t_2 - t_1) = 1.$$
This agreement of the classical theory with the wave-mechanical theory
must not be overestimated. As a matter of fact the function $\psi$ does
not in general vanish outside the classically allowed region, but, as we
have just seen, decreases there approximately as $e^{-2\pi|S^0|/h}$. According
to the relation $v = \frac{1}{m}\frac{dS^0}{dx}$, we can put (dropping the term containing
the time)
$$S^0 = m\int v\,dx = \int\sqrt{2m(W - U)}\,dx. \tag{25}$$
This formula applies just as well, i.e. with the same degree of approximation,
to the points inside and outside the region $(x_1, x_2)$. In the latter
case, for a point $x > x_2$, we can put
$$|S^0(x)| = \int_{x_2}^{x}\sqrt{2m(U - W)}\,dx, \tag{25 a}$$
and consequently
$$|\psi| = C\exp\Bigl[-\frac{2\pi}{h}\int_{x_2}^{x}\sqrt{2m(U - W)}\,dx\Bigr]. \tag{25 b}$$
Thus, to the degree of approximation used, we should define the wave-mechanical
average of $f(x)$ by the equation
$$\bar f = \int_{-\infty}^{+\infty} f(x)\,|\psi|^2\,dx$$
with
$$|\psi|^2 = A^2 = \frac{C^2}{\sqrt{(2/m)(W - U)}}$$
for $W \geq U$, i.e. for $x_1 \leq x \leq x_2$, and
$$|\psi|^2 = C^2\exp\Bigl[-\frac{4\pi}{h}\int_{x_2}^{x}\sqrt{2m(U - W)}\,dx\Bigr]$$
for $x > x_2$, and a similar expression for $x < x_1$. The constant $C$ must
be determined from the equation $\int_{-\infty}^{+\infty}|\psi|^2\,dx = 1$.
The difference between the classical and the wave-mechanical averages
becomes particularly important when there are two or more
classically allowed regions separated from one another by regions for
which $W < U$. The latter, being permeable to the particle from the
wave-mechanical point of view, do not actually separate but, on the
contrary, connect the former regions.
The comparison of the classical ‘time-average’ with the wave-mechanical
‘probable value’ for the case of a three-dimensional motion
is much more complicated than in the one-dimensional case and will be
considered in the next section in connexion with the wave-mechanical
interpretation of the quantum conditions. It must be remarked here
that such averages or probable values have a meaning only when the
motion is confined to a classically limited region, and that these limits
can be assigned a priori only in the case of a conservative motion,
i.e. a motion with a given (constant) value of the energy $W$. Within
the allowed region, limited by the surface $W - U = 0$, the amplitude
function $A$ must satisfy the equation
$$\operatorname{div}(A^2\,\nabla s_0) = 0,$$
which can be solved after the function $s_0$ has been determined from
the Hamilton-Jacobi equation (17). It should be remembered that this
equation, which represents another form of equation (17 a), expresses
the law of the conservation of the copies of the particle, or of the
probability of its location [cf. (19)].
Although there is in general no exact equivalence between the
classical and the wave-mechanical average values, yet there are special
cases when this equivalence turns out to be exact. An interesting case
of this sort is provided by the so-called ‘virial’, i.e. by the quantity
$$V = \frac{\partial U}{\partial x}\,x + \frac{\partial U}{\partial y}\,y + \frac{\partial U}{\partial z}\,z,$$
which was introduced by Clausius in the kinetic theory of gases.
For a motion restricted to a limited region, the time average of this
quantity $V$ is connected with the time average of the kinetic energy
by the relation
$$2\bar T = \bar V. \tag{26}$$
This is called the ‘virial theorem’. It can be derived as follows: We
multiply Newton’s equations of motion
$$m_k\frac{d^2x_k}{dt^2} = -\frac{\partial U}{\partial x_k}, \quad \text{etc.},$$
by the corresponding coordinates and write
$$x_k\frac{d^2x_k}{dt^2} = \frac{d}{dt}\Bigl(x_k\frac{dx_k}{dt}\Bigr) - \Bigl(\frac{dx_k}{dt}\Bigr)^2.$$
Adding these transformed equations, we get
$$\frac{d}{dt}\sum_k m_k x_k\frac{dx_k}{dt} = 2T - V.$$
Formula (26) is then obtained by averaging with respect to the time
and taking account of the fact that the mean value of
$\frac{d}{dt}\sum_k m_k x_k\,\frac{dx_k}{dt}$
vanishes. If we replace the kinetic energy $T$ by the difference $W - U$
and assume that the potential energy is a homogeneous function of the
$n$th degree in the coordinates, formula (26) reduces to the form
$$2(W - \bar U) = n\bar U \quad\text{or}\quad \bar U = \frac{2W}{n + 2}. \tag{26 a}$$
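Formula (26 a) can be verified numerically for a simple one-dimensional case. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $U = x^n$ with even $n$ and $W = 1$ (so that the turning points are $x = \pm 1$), and evaluates the time average with the weight $dt = dx/v$ after the substitution $x = \sin\theta$, which removes the singularity at the turning points. The function name is a hypothetical choice.

```python
import math

def mean_potential(n, samples=2000):
    """Time average of U = x**n (n even) for one-dimensional motion with W = 1."""
    half = n // 2
    weighted_sum = 0.0
    weight_sum = 0.0
    for i in range(samples):
        theta = -math.pi / 2 + (i + 0.5) * math.pi / samples
        s2 = math.sin(theta) ** 2          # x**2 with x = sin(theta)
        # 1 - x**n = (1 - x**2) * (1 + x**2 + ... + x**(n-2)), so that
        # dt = dx / v is proportional to d(theta) / sqrt(1 + x**2 + ...).
        weight = 1.0 / math.sqrt(sum(s2 ** j for j in range(half)))
        weighted_sum += (s2 ** half) * weight
        weight_sum += weight
    return weighted_sum / weight_sum

if __name__ == "__main__":
    for degree in (2, 4, 6):
        print(degree, mean_potential(degree), 2.0 / (degree + 2))
```

For each degree the computed mean potential energy should agree with $2W/(n+2)$ to within rounding error.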
It can easily be shown that this relation remains exactly valid in wave
mechanics if $\bar U$ is defined as the integral $\int U\psi\psi^*\,dV$ and $\psi$ is defined
as the exact solution of the corresponding Schrödinger equation. As
an example we shall consider the simplest case of a one-dimensional
wave-mechanical problem which is described by the equation
$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}(W - U)\psi = 0.$$
If we multiply this equation by $x\,d\psi^*/dx$ and the conjugate equation
$\frac{d^2\psi^*}{dx^2} + \frac{8\pi^2 m}{h^2}(W - U)\psi^* = 0$ by $x\,d\psi/dx$ and add, we obtain
$$x\frac{d}{dx}\Bigl(\frac{d\psi}{dx}\frac{d\psi^*}{dx}\Bigr) + \frac{8\pi^2 m}{h^2}(W - U)\,x\,\frac{d(\psi\psi^*)}{dx} = 0.$$
By partial integration with respect to $x$, taking into account the
boundary conditions ($\psi = 0$ and $d\psi/dx = 0$ for $x = \pm\infty$), we get
$$-\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx - \frac{8\pi^2 m}{h^2}\,W\int_{-\infty}^{+\infty}\psi\psi^*\,dx + \frac{8\pi^2 m}{h^2}\int_{-\infty}^{+\infty}\frac{d(Ux)}{dx}\,\psi\psi^*\,dx = 0,$$
or, since $\int\psi\psi^*\,dx = 1$ and $\int f\psi\psi^*\,dx = \bar f$,
$$\int\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx + \frac{8\pi^2 m}{h^2}\Bigl[W - \overline{\frac{d(Ux)}{dx}}\Bigr] = 0.$$
Further, by multiplying the Schrödinger equation by $\psi^*$ and integrating, we obtain
$$\int\psi^*\frac{d^2\psi}{dx^2}\,dx + \frac{8\pi^2 m}{h^2}\int(W - U)\,\psi\psi^*\,dx = 0,$$
or, transforming the first term by partial integration,
$$-\int\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx + \frac{8\pi^2 m}{h^2}(W - \bar U) = 0.$$
We have therefore
$$W - \bar U + W - \overline{\frac{d(Ux)}{dx}} = 0$$
or
$$2(W - \bar U) = \overline{x\frac{dU}{dx}}.$$
This is exactly formula (26) for the special case that we have considered.†
Another illustration of the connexion between the wave-mechanical
and the classical theory is given by the similarity of the classical equations
of motion,
$$m\frac{d^2x}{dt^2} = -\frac{\partial U}{\partial x}, \quad \text{etc.},$$
and the wave-mechanical relations
$$m\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}}, \quad \text{etc.}, \tag{27}$$
between the corresponding average (or probable) values of the quantities
involved.
The relations (27) were found by P. Ehrenfest. They are usually
referred to, in connexion with the propagation of a wave packet, as the
equations of motion of the ‘centre’ or ‘centroid’ of the latter, that is,
of the point with the coordinates
$$\bar x = \int x\,\psi\psi^*\,dV, \quad \bar y = \int y\,\psi\psi^*\,dV, \quad \bar z = \int z\,\psi\psi^*\,dV. \tag{27 a}$$
If the wave function $\psi$ represents a wave packet formed by superposing
waves with slightly different frequencies (i.e. motions with slightly
different energies), the coordinates $\bar x$, $\bar y$, $\bar z$ are certain functions of the
time (in the case of a stationary state where the dependence of $\psi$ upon
the time is specified by the factor $e^{-i2\pi\nu t}$ they reduce to constants), so
that we can differentiate them with regard to the time. The corresponding
quantities can be defined as the average values of the components
of the velocity of the particle or its acceleration, etc.
We shall prove the relations (27) for the simplest case of a motion
parallel to the x-axis (the proof can easily be extended to the case of
three-dimensional motion). We have, by the definition of $\bar x$,
$$\frac{d\bar x}{dt} = \int x\,\frac{\partial(\psi\psi^*)}{\partial t}\,dx = \int x\Bigl(\psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t}\Bigr)dx,$$
since $x$ and $t$ are independent variables.
† The proof given is due to B. Finkelstein.
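Now $\psi$ and $\psi^*$ satisfy the equations
$$\frac{\partial\psi}{\partial t} = \frac{ih}{4\pi m}\Bigl(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\Bigr), \qquad \frac{\partial\psi^*}{\partial t} = -\frac{ih}{4\pi m}\Bigl(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\Bigr),$$
where $\mu = 8\pi^2 m/h^2$. Hence
$$\frac{d\bar x}{dt} = \frac{ih}{4\pi m}\int x\Bigl(\psi^*\frac{\partial^2\psi}{\partial x^2} - \psi\frac{\partial^2\psi^*}{\partial x^2}\Bigr)dx.$$
By partial integration, in conjunction with the fact that
$$\int_{-\infty}^{+\infty}\frac{df}{dx}\,dx = f(+\infty) - f(-\infty)$$
vanishes if the function $f$ contains $\psi$ or $\partial\psi/\partial x$ as a factor (since $\int_{-\infty}^{+\infty}\psi\psi^*\,dx$
must be finite and equal to 1), we obtain
$$\frac{d\bar x}{dt} = \frac{h}{4\pi m i}\int_{-\infty}^{+\infty}\Bigl(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\Bigr)dx. \tag{27 b}$$
This expression could be obtained directly from the relation
$\frac{\partial(\psi\psi^*)}{\partial t} + \frac{\partial j}{\partial x} = 0$ (which is a special case of (19)) and the formula
$j = \frac{h}{4\pi i m}\Bigl(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\Bigr)$ for the current density. Putting $j = \psi\psi^*\,\bar v(x)$,
where $\bar v(x)$ is the average velocity at the point $x$, we can rewrite the
preceding equation in the form
$$\frac{d\bar x}{dt} = \int_{-\infty}^{+\infty}\bar v(x)\,\psi\psi^*\,dx,$$
which agrees with the definition of $d\bar x/dt$ as the average value of the
velocity of the particle irrespective of its position.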
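By differentiating (27 b) with respect to the time, we obtain
$$\frac{d^2\bar x}{dt^2} = -\frac{h^2}{8\pi^2 m^2}\int_{-\infty}^{+\infty}\Bigl[\frac{\partial\psi^*}{\partial x}\Bigl(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\Bigr) + \frac{\partial\psi}{\partial x}\Bigl(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\Bigr)\Bigr]dx$$
$$= -\frac{h^2}{8\pi^2 m^2}\int_{-\infty}^{+\infty}\Bigl[\frac{\partial}{\partial x}\Bigl(\frac{\partial\psi}{\partial x}\frac{\partial\psi^*}{\partial x}\Bigr) - \mu U\frac{\partial(\psi\psi^*)}{\partial x}\Bigr]dx$$
$$= -\frac{h^2\mu}{8\pi^2 m^2}\int_{-\infty}^{+\infty}\frac{\partial U}{\partial x}\,\psi\psi^*\,dx,$$
i.e.
$$m\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}},$$
where
$$\overline{\frac{\partial U}{\partial x}} = \int_{-\infty}^{+\infty}\frac{\partial U}{\partial x}\,\psi\psi^*\,dx$$
is the average (or probable) value of the force acting on the particle.
It must be emphasized that this value refers not to the average
(or probable) position of the particle, determined by the centre of the
packet (otherwise this centre would move exactly according to the
classical mechanics), but to all possible positions.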
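The distinction between the average force and the force at the average position can be made concrete with a small numerical sketch. It is an illustration only: the Gaussian form of $\psi\psi^*$, the potential $U = x^4/4$, and all names and parameter values below are assumptions chosen for the example and do not come from the text.

```python
import math

def packet_average(func, centre, width, n=20_000, span=10.0):
    """Average of func(x) over a Gaussian density |psi|^2 (midpoint rule)."""
    lo = centre - span * width
    dx = 2.0 * span * width / n
    norm = width * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        density = math.exp(-((x - centre) ** 2) / (2.0 * width ** 2)) / norm
        total += func(x) * density * dx
    return total

def force(x):
    """Force -dU/dx for the illustrative potential U = x**4 / 4."""
    return -x ** 3

if __name__ == "__main__":
    for width in (0.5, 0.01):
        print(width, packet_average(force, 1.0, width), force(1.0))
```

For a broad packet the average force differs appreciably from the force at the centre of the packet, while for a narrow packet the two nearly coincide, in line with the remark that the centre follows the classical motion only so long as the packet remains small.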
If the dimensions of the packet are very small (which means that
the uncertainty in the estimation of the particle’s velocity is very large)
the motion of its centre closely follows classical motion. This, however,
persists only for a very short time, for the packet will spread, the rate
of this spreading being the larger the smaller its original dimensions
(i.e. the larger the original uncertainty in the velocity).
5. Motion in a Limited Region; Quantum Conditions and Average Values
We shall now investigate the case of a (three-dimensional) motion
restricted classically to a finite region of space (where $W - U > 0$), and
derive the ‘quantization rules’ characteristic of such a motion with the
help of the approximate wave-mechanical theory based on the classical
determination of the phase or action function $s\ (= S^0)$ by means of the
Hamilton-Jacobi equation. A motion of this kind must obviously have
a periodic or quasi-periodic character, so that the path described by
the particle may fill up the whole region or pass many times in various
directions through the same or nearly the same point (as, for instance,
in the simple case of the oscillatory motion of a particle along a straight
line). If the particle is replaced by a continuous assembly of its copies,
a rather complicated picture results, different copies passing simultaneously
through the same point with velocities which are in general
different both in regard to direction and (if the field of force varies
with the time) in regard to magnitude. The latter must, of course,
remain a single-valued function of the coordinates in the case of motion
with a given (constant) value of the total energy $W$. The function
$\phi = s/m$, which can be defined as the velocity potential, must, however,
in this case (as well as in the general case of non-conservative motion)
be a multiple-valued function of the coordinates. Considering the copy
assembly as a kind of fluid, we can illustrate the case in question by
the familiar type of fluid motion with closed stream-lines, each stream-line
representing the path of all the particles situated on it. In the
associated wave picture these closed paths of the separate particles or
copies must be interpreted as closed rays.
Now a fluid motion of this type can be irrotational if, for instance,
the fluid is flowing in a closed tube or around some closed tube. The
velocity $\mathbf{v}$ of the particles, as a function of their coordinates, can then
be represented as the gradient of a potential $\phi$, provided the latter is
defined as a multiple-valued function of the coordinates. In fact, taking
the integral of the velocity along a line $\sigma$ connecting two points $P_1$ and
$P_2$, then, since the projection $v_\sigma$ of $\mathbf{v}$ on the line element $d\sigma$ is, by
definition, equal to $\partial\phi/\partial\sigma$, we get
$$\int_{P_1}^{P_2} v_\sigma\,d\sigma = \phi(P_2) - \phi(P_1).$$
If the line is closed, i.e. if the points $P_1$ and $P_2$ coincide, this integral
should be equal to zero, irrespective of the shape of the line, unless we
assume that for closed lines of certain type the potential $\phi$ may change
after a ‘round trip’ by an amount $\Delta\phi$ equal to the value of the integral
$\oint v_\sigma\,d\sigma$ taken along the corresponding closed line. If the latter coincides
with a stream-line, the integral will certainly be different from zero,
since along this line we must have $v_\sigma = |\mathbf{v}|$.
Now it can easily be proved that in the case of irrotational motion
the integral $\oint v_\sigma\,d\sigma$, which is called the ‘circulation’, will have the same
value for all closed lines of the same family, i.e. of the same general type.
In the case of a fluid flowing around a closed tube along closed stream-lines
(Fig. 1), we must distinguish closed lines of two families: those
which do not surround the tube, and those which do. For the former
the circulation will be equal to zero, while for the latter it will have
a certain value different from zero. This result follows from the transformation
of the line integral $\oint v_\sigma\,d\sigma$, by means of Stokes’s formula,
into the integral $\int(\operatorname{curl}\mathbf{v})_n\,dS$ over any surface $S$ limited by the line $\sigma$.
In the case of the lines of the first family the surface $S$ will be situated
entirely within the fluid, so that the integral will vanish, since the
motion is supposed to be irrotational ($\operatorname{curl}\mathbf{v} = 0$). In the case of the
lines of the second family the surface $S$ will cut the tube around which
the fluid is flowing. Since for points inside the tube the idea of velocity
has no meaning, we can replace the surface $S$ by another surface $S'$
bounded by two closed lines of the second family. Stokes’s formula
applied to this surface which lies wholly within the fluid, and for which
therefore the integral $\int(\operatorname{curl}\mathbf{v})_n\,dS$ vanishes, leads to the result that
the integral $\oint v_\sigma\,d\sigma$ taken over the double boundary of $S'$ must vanish
if the ‘round trip’ is made in opposite directions along the two constituent
lines, whence it follows that the circulation will have the same
value for both lines if the round trip is made in the same direction.
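The two families of closed lines can be illustrated numerically. The sketch below rests on assumptions that are not in the text: it takes the plane irrotational flow $\mathbf{v} = (-y, x)/(x^2 + y^2)$ around an infinitely thin ‘tube’ along the z-axis, and uses circles as the closed lines; the function name and the sample circles are hypothetical.

```python
import math

def circulation(cx, cy, radius, n=2000):
    """Line integral of v along a circle, for the irrotational flow
    v = (-y, x) / (x**2 + y**2) around a thin 'tube' on the z-axis."""
    d_alpha = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * d_alpha
        x = cx + radius * math.cos(alpha)
        y = cy + radius * math.sin(alpha)
        r2 = x * x + y * y
        vx, vy = -y / r2, x / r2
        # Line element along the circle, traversed counter-clockwise.
        dx = -radius * math.sin(alpha) * d_alpha
        dy = radius * math.cos(alpha) * d_alpha
        total += vx * dx + vy * dy
    return total

if __name__ == "__main__":
    print(circulation(0.0, 0.0, 1.0))    # surrounds the tube
    print(circulation(0.3, -0.2, 2.5))   # surrounds the tube, different shape
    print(circulation(3.0, 0.0, 1.0))    # does not surround the tube
```

Circles that surround the origin should all give the same circulation ($2\pi$ for this flow), while a circle that does not surround it should give zero.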
It may be mentioned that exactly similar results are met with in the
theory of the magnetic field generated by a linear electric current. This
field (outside the wire along which the current is flowing) is also
irrotational, so that the magnetic field strength can be defined as the
gradient of a certain magnetic potential. With every trip around the
wire along any closed line (encircling this wire only once) this potential
must change by a definite value, namely $4\pi i$, where $i$ is the strength
of the current.
The preceding results can be applied without substantial modification
to the flow of the fictitious fluid represented by the copy assembly of
a particle moving in a limited region. In the copy assembly, however,
we must remember that different copies may be imagined to pass
simultaneously through the same point in different directions. This is,
of course, impossible in the case of real particles. In particular, closed
stream-lines may degenerate into ‘double lines’, i.e. unclosed lines along
which the copies move first in one and then in the opposite direction
(oscillatory motion).† The ‘circulation’ $\oint v_\sigma\,d\sigma$ for such a double line
will not be equal to zero, but, on the contrary, will be equal to double
the value of the integral $\int v_\sigma\,d\sigma$ for a one-way trip. As a result the
velocity potential $\phi = s/m$, in addition to the multiplicity considered
above, may acquire a duplicity of an entirely different character, corresponding
to the possible presence at each point of two copies moving
in opposite or, in general, in different directions.
Leaving aside this duplicity we see that, in the case of a particle
confined to a finite region of space, the function $s$ representing the
mechanical action or the momentum-potential of the copies of this
particle must (so long as the motion of these copies is supposed to be
irrotational) be a multiple-valued function of the coordinates, i.e. it
must change by a certain amount $\Delta s$ for all closed lines (including
double lines) of a certain family. It should be mentioned that ‘round
trips’ along any of these lines have nothing to do with the actual
motion, being performed not by definite copies (the latter need not
move in closed lines), but by the process of linear integration referring
to a definite instant of time. The change $\Delta s$ of the function $s$ for any
such round trip is called a ‘periodicity modulus’ of $s$. From the point
of view of the wave picture associated with the motion of the copy
assembly of the particle these ‘periodicity moduli’ divided by the constant
$h$ represent the number of wave-lengths contained in the corresponding
closed lines. In fact $\partial s/\partial\sigma = g_\sigma$ is the component of the
momentum of the particle along the line-element $d\sigma$ and according to
de Broglie’s relation $\partial(s/h)/\partial\sigma = g_\sigma/h = k_\sigma$ must be equal to the corresponding
component of the ‘wave-number vector’ $\mathbf{k} = \mathbf{g}/h$ of the
associated waves. The integral $\oint k_\sigma\,d\sigma = \Delta s/h$ may therefore be defined
as the number of wave-lengths contained in the line $\sigma$, or, more exactly,
as the number of wave-crests cut by this line, or still more exactly, as
the difference between the number of waves cut by $\sigma$ in the positive
and in the negative direction (i.e. in the direction of propagation and in
the opposite direction).
Now it is clear that in the case of motion corresponding to a definite
energy, the wave system associated with it must be such that the
number of waves cut by any closed line should be integral, corresponding
to a change of the phase $\phi = 2\pi s/h$ by an integral multiple of $2\pi$,
a change which is irrelevant for the value of the wave function $\psi = Ae^{i\phi}$.
In the contrary case the latter would also be a multiple-valued function
of the coordinates, and would not represent a stationary system of
standing waves (each standing wave being produced by the superposition
of waves travelling in different directions), determined by the
condition that the wave function $\psi$ should vanish at or near the
boundary of the region where the particle is supposed to move.
It thus follows from the condition of single-valuedness for the wave
function $\psi$ that the ‘periodicity moduli’ of the ‘action function’ $s$ must
be integral multiples of $h$.
† The tube around which the fluid is supposed to flow degenerating into a ribbon
with zero thickness.
This condition, which—it should be remembered—refers to the case
of motion confined to a (classically) limited region, can easily be shown
to be equivalent to the quantum conditions of the old quantum theory
discovered by Bohr and by Sommerfeld.
For the general formulation of these quantum conditions, it is
necessary, instead of the original rectangular coordinates $x, y, z$, to
introduce new variables (generalized coordinates) $q_1, q_2, q_3$. If we succeed
in so choosing these new variables that $s$ assumes the form
$$s = \sum_{\alpha=1}^{3} s_\alpha(q_\alpha) \tag{28}$$
(‘separation variables’), then the quantum conditions run as follows:
$$\oint p_\alpha\,dq_\alpha = (\Delta s)_\alpha = n_\alpha h \quad (n_\alpha \text{ an integer}). \tag{28 a}$$
Here the various $p_\alpha\ (= \partial s/\partial q_\alpha)$ are the ‘generalized momenta’ and
$(\Delta s)_\alpha$ are the ‘principal moduli of periodicity’ of the function $s$, i.e. those
alterations of this function which correspond to a ‘cyclic’ change of
one of the separation coordinates when the remaining two are kept
fixed. By a ‘cyclic’ change of the coordinate $q_\alpha$ we mean an alteration
such that the given particle returns to its original position and
therefore the rectangular coordinates assume their original values. If
the coordinate $q_\alpha$ has the character of an angle so that the rectangular
coordinates are periodic functions of it, then the ‘cyclic change’ of $q_\alpha$
is simply the increase by the corresponding period $\Delta q_\alpha$ (for example,
$2\pi$). Otherwise it is an oscillation of $q_\alpha$ within certain limits determined
by the nature of the field of force. The cyclic alterations of the individual
separation coordinates in the actual motion of the system
take place in periods of time $\Delta t_\alpha$ which are in general different from
one another, so that the motion with regard to the time appears to be
non-periodic or conditionally periodic. This dependence of the variables
$q_\alpha$ on the time plays no part in the ‘quantizing’ defined by formula (28 a).
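As a simple illustration of condition (28 a), consider the one-dimensional harmonic oscillator, for which the closed-path integral $\oint p\,dx$ is the area of an ellipse in the $(x, p)$ plane and equals $2\pi W/\omega = W/\nu$. The sketch below is an assumption-laden example rather than anything given in the text: the oscillator potential, the units ($h = 1$ by default), and the function names are all hypothetical choices.

```python
import math

def action_integral(energy, mass=1.0, omega=1.0, n=100_000):
    """Closed-path integral of p dx for U = mass * omega**2 * x**2 / 2."""
    amplitude = math.sqrt(2.0 * energy / (mass * omega ** 2))
    dx = 2.0 * amplitude / n
    one_way = 0.0
    for i in range(n):
        x = -amplitude + (i + 0.5) * dx
        p_squared = 2.0 * mass * (energy - 0.5 * mass * omega ** 2 * x * x)
        one_way += math.sqrt(max(p_squared, 0.0)) * dx
    return 2.0 * one_way                   # forward and return trip

def quantized_energy(n_quantum, h=1.0, omega=1.0):
    """Energy for which the action integral equals n_quantum * h."""
    return n_quantum * h * omega / (2.0 * math.pi)

if __name__ == "__main__":
    for n_quantum in (1, 2, 3):
        print(n_quantum, action_integral(quantized_energy(n_quantum)))
```

With these assumptions the printed action integrals should be close to $1h, 2h, 3h$, i.e. the quantized energies are $W = n h\nu$ with $\nu = \omega/2\pi$.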
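The generalized momenta appearing in (28 a) can be defined, and
indeed are usually defined, in a different way, namely as the partial
derivatives of the kinetic energy $T$, expressed as a function of the
generalized coordinates and of the corresponding ‘velocities’ $dq_\alpha/dt = \dot q_\alpha$,
with respect to the latter. The equivalence of both definitions is obvious
in the case of rectangular coordinates, since $T = \tfrac{1}{2}m(v_x^2 + v_y^2 + v_z^2)$ and
$g_x = \partial s/\partial x = \partial T/\partial v_x$, etc. If the coordinates are replaced by new
(generalized) coordinates $q_\alpha(x, y, z)$, we have
$$\dot q_\alpha = \frac{\partial q_\alpha}{\partial x}v_x + \frac{\partial q_\alpha}{\partial y}v_y + \frac{\partial q_\alpha}{\partial z}v_z,$$
whence $\partial\dot q_\alpha/\partial v_x = \partial q_\alpha/\partial x$, etc. We thus get
$$\frac{\partial s}{\partial x} = \sum_\alpha\frac{\partial s}{\partial q_\alpha}\frac{\partial q_\alpha}{\partial x}, \qquad \frac{\partial T}{\partial v_x} = \sum_\alpha\frac{\partial T}{\partial\dot q_\alpha}\frac{\partial\dot q_\alpha}{\partial v_x} = \sum_\alpha\frac{\partial T}{\partial\dot q_\alpha}\frac{\partial q_\alpha}{\partial x},$$
and consequently,
$$\frac{\partial s}{\partial q_\alpha} = \frac{\partial T}{\partial\dot q_\alpha} = p_\alpha.$$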
The formulation of the quantum conditions in the form (28 a) is some
times possible in two or more different ways—if there exist several
sets of ‘separable’ coordinates. Theoretically it is possible—in a single
way at least—for any type of motion (restricted to a finite region).
Practically, however, the ‘separation coordinates’ can be found only
for simple types of motion (i.e. of the field of force). If the separation
coordinates cannot be found, then the quantum conditions—in the sense
of Bohr’s theory—must be stated in the more general form indicated
above, namely, that the moduli of periodicity of $s$ with respect to any
closed curve should be equal to an integral multiple of h (or to zero).
We shall now turn to the question of the relation between the wave-
mechanical average or probable value of any function of the coordinates
of the particle for a given quantized state of motion and the corre
sponding classical ‘time average’ of this function. The solution of this
question depends upon the introduction of new coordinates of a still
more general kind than those considered above in connexion with the
formulation of the quantum conditions. These still more general co
ordinates are not directly expressible in terms of the original ones, but
in terms of the original coordinates and the corresponding momenta,
the new momenta being also functions of the old momenta and of the
old coordinates.
Coordinate or rather coordinate-momenta transformations of this
type were introduced by Hamilton and are called contact or canonical
transformations (the transformation considered above being a particular
case of these transformations).
The theory of canonical transformations is based upon the preservation
of the so-called ‘canonical form’ of the classical equations of
motion. In the case of rectangular coordinates these canonical equations
can be obtained directly from the usual equations of motion
$m\,d^2x/dt^2 = -\partial U/\partial x$, etc., and have the form
$$\frac{dg_x}{dt} = -\frac{\partial H}{\partial x}, \qquad \frac{dx}{dt} = \frac{\partial H}{\partial g_x}, \tag{29}$$
where
$$H = \frac{1}{2m}(g_x^2 + g_y^2 + g_z^2) + U \tag{29 a}$$
is the total energy expressed as a function of the coordinates and
momenta, and is usually denoted as the ‘Hamiltonian function’. The
equations (29) can be interpreted as referring to a particle moving not
in ordinary space with the three coordinates $x, y, z$ but in the six-dimensional
phase-space (Part I, Chap. V) with the ‘coordinates’ $x, y, z,$
$g_x, g_y, g_z$, the time derivatives of these coordinates representing the six
components of the ‘velocity’ in phase-space and $H$ being a function of
the ‘position’ of the particle in the phase-space.†
† Instead of one particle one can consider a continuous assembly of its copies,
distributed not in the ordinary space as before, but in the phase-space with a density
depending in general upon the time.
For the sake of uniformity in notation we shall, in the following,
instead of $x, y, z$ write $Q_1, Q_2, Q_3$ and instead of $g_x, g_y, g_z$ write $P_1, P_2, P_3$.
The equations (29) then become
$$\frac{dP_\alpha}{dt} = -\frac{\partial H}{\partial Q_\alpha}, \qquad \frac{dQ_\alpha}{dt} = \frac{\partial H}{\partial P_\alpha}. \tag{29 b}$$
We now introduce new coordinates $Q'_1, Q'_2, Q'_3$ determined by three
equations of the form
$$Q'_\beta = Q'_\beta(Q_1, Q_2, Q_3) \quad\text{or}\quad Q_\alpha = Q_\alpha(Q'_1, Q'_2, Q'_3) \quad (\alpha, \beta = 1, 2, 3). \tag{30}$$
We then define the new momenta $P'_1, P'_2, P'_3$ by the formulae
$$P'_\beta = \frac{\partial s}{\partial Q'_\beta} = \sum_\alpha\frac{\partial s}{\partial Q_\alpha}\frac{\partial Q_\alpha}{\partial Q'_\beta} = \sum_\alpha P_\alpha\frac{\partial Q_\alpha}{\partial Q'_\beta} \quad\text{or}\quad P_\alpha = \sum_\beta P'_\beta\frac{\partial Q'_\beta}{\partial Q_\alpha}, \tag{30 a}$$
which obviously do not assume a knowledge of the action function $s$.
It can then easily be shown that these new coordinates and momenta
satisfy a system of equations of the same form as (29 b),
$$\frac{dP'_\beta}{dt} = -\frac{\partial H'}{\partial Q'_\beta}, \qquad \frac{dQ'_\beta}{dt} = \frac{\partial H'}{\partial P'_\beta} \quad (\beta = 1, 2, 3), \tag{31}$$
where $H'$ is the new Hamiltonian function which is obtained by replacing
in the original function $H(Q, P)$ the old coordinates and
momenta by the new, according to the formulae (30) and (30 a). The
transformation defined by these formulae is called a ‘point transformation’.
As already mentioned, it is a special case of the canonical
transformations. A canonical transformation (of the coordinates and
momenta) is defined by the formulae
$$P_\alpha = \frac{\partial\Phi}{\partial Q_\alpha}, \qquad Q'_\beta = \frac{\partial\Phi}{\partial P'_\beta}, \tag{31 a}$$
where $\Phi(Q, P')$ is a completely arbitrary function of the original coordinates
and the new momenta. If, in particular, we put
$$\Phi = \sum_{\beta=1}^{3} P'_\beta\,f_\beta(Q_1, Q_2, Q_3),$$
we obtain, by (31 a),
$$Q'_\beta = f_\beta(Q_1, Q_2, Q_3), \qquad P_\alpha = \sum_\beta P'_\beta\frac{\partial f_\beta}{\partial Q_\alpha},$$
which corresponds to the point transformation (30), (30 a).
The fact that the original canonical equations (29) are transformed
by (31 a) into equations of the same canonical form (31) can be shown
as follows:
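We form the complete differential or rather the variation of the
function $\Phi$, corresponding to a virtual variation (completely independent
of the actual motion) of the variables $Q$, $P'$:
$$\delta\Phi = \sum_\alpha\bigl(P_\alpha\,\delta Q_\alpha + Q'_\alpha\,\delta P'_\alpha\bigr),$$
and differentiate this expression with regard to the time. We also take
the time derivative of $\Phi$,
$$\frac{d\Phi}{dt} = \sum_\alpha\Bigl(P_\alpha\frac{dQ_\alpha}{dt} + Q'_\alpha\frac{dP'_\alpha}{dt}\Bigr),$$
and form its variation. By subtracting the expressions thus obtained,
we get, remembering that $\delta$ and $d/dt$ are commutative,
$$\sum_\alpha\Bigl(\frac{dP_\alpha}{dt}\,\delta Q_\alpha - \frac{dQ_\alpha}{dt}\,\delta P_\alpha\Bigr) + \sum_\beta\Bigl(\frac{dQ'_\beta}{dt}\,\delta P'_\beta - \frac{dP'_\beta}{dt}\,\delta Q'_\beta\Bigr) = 0.$$
Now by (29 b) we have
$$\sum_\alpha\Bigl(\frac{dP_\alpha}{dt}\,\delta Q_\alpha - \frac{dQ_\alpha}{dt}\,\delta P_\alpha\Bigr) = -\sum_\alpha\Bigl(\frac{\partial H}{\partial Q_\alpha}\,\delta Q_\alpha + \frac{\partial H}{\partial P_\alpha}\,\delta P_\alpha\Bigr) = -\delta H.$$
Hence, in virtue of $H(P, Q) = H'(P', Q')$,
we obtain
$$\sum_\beta\Bigl(\frac{dQ'_\beta}{dt}\,\delta P'_\beta - \frac{dP'_\beta}{dt}\,\delta Q'_\beta\Bigr) = \delta H' = \sum_\beta\Bigl(\frac{\partial H'}{\partial Q'_\beta}\,\delta Q'_\beta + \frac{\partial H'}{\partial P'_\beta}\,\delta P'_\beta\Bigr).$$
Since the variations $\delta Q'_\beta$ and $\delta P'_\beta$ are arbitrary, we can equate their
coefficients. In this way we get equations (31).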
Those canonical transformations, in which the transformed Hamiltonian
$H'$ depends only on the momenta $P'$ and not on the coordinates
$Q'$, play a special role. Such coordinates are usually called cyclic. The
equations (31) reduce in this case to
$$P'_\beta = \text{const.}, \qquad \frac{dQ'_\beta}{dt} = \frac{\partial H'}{\partial P'_\beta} = \omega_\beta = \text{const.},$$
i.e.
$$Q'_\beta = \omega_\beta t + \phi_\beta.$$
If the transformation function $\Phi$ leading to cyclic coordinates is
known, the mechanical problem can be regarded as solved, for the
original coordinates and momenta are then expressed according to the
equations (31 a) as functions of the time which, besides $t$, only contain
constants $P'_\beta$, $\omega_\beta$, and $\phi_\beta$.
Now it follows from (31 a) that this special transformation function
is just the action function $s$ regarded as a function of $Q_1, Q_2, Q_3$ and of
three arbitrary constants $P'_1, P'_2, P'_3$ which necessarily appear on solving
the Hamilton-Jacobi equation (16) or (17) by which this function is
defined. These constants of integration can be expressed in terms
of the three principal moduli of periodicity of the action function
$J_\alpha = (\Delta s)_\alpha$ with regard to a system of separable coordinates $q_1, q_2, q_3$ (which
we need neither actually know nor consider in detail here). Replacing
the original constants $P'_\alpha$ by their expressions in terms of $J_1, J_2, J_3$ we
can write the transformation function $\Phi$ in the form $s(x, y, z; J_1, J_2, J_3)$
and define the constants $J_\alpha$ as the new momenta ($P'_\alpha = J_\alpha$). Considered
from this point of view these constants are called the ‘action variables’
of the problem. The corresponding cyclic coordinates are called the
‘angle variables’. We shall denote them by $w_\beta\ (= Q'_\beta)$.
We have therefore
$$w_\beta = \omega_\beta t + \phi_\beta, \tag{32}$$
where according to the transformed canonical equations (31)
$$\omega_\beta = \frac{\partial H'}{\partial J_\beta} \tag{32 a}$$
and
$$w_\beta = \frac{\partial s}{\partial J_\beta}. \tag{32 b}$$
To ascertain the dependence of the old coordinates $Q_\alpha$ on the new
coordinates $w_\beta$, we shall introduce for a moment as an intermediate
link between them the separation coordinates $q_1, q_2, q_3$. Expressed as
a function of the latter, the function $s$ assumes the form
$$s = \sum_{\alpha=1}^{3} s_\alpha(q_\alpha).$$
To a cyclic alteration of the coordinate $q_\alpha$ there corresponds by (32 b)
an alteration of the coordinate $w_\beta$ by $\Delta_\alpha w_\beta = \Delta_\alpha\,\partial s/\partial J_\beta$. We have
therefore, because $\Delta_\alpha s_\beta = J_\alpha$ if $\alpha = \beta$, and $= 0$ if $\alpha \neq \beta$,
$$\Delta_\alpha w_\beta = \frac{\partial J_\alpha}{\partial J_\beta} = \begin{cases} 1 & (\alpha = \beta), \\ 0 & (\alpha \neq \beta). \end{cases}$$
These formulae show that when any angle variable $w_\beta$ is increased by
1 and the remaining $w$’s are maintained constant, which corresponds
to the cyclic alteration of the separation coordinate $q_\beta$, i.e. to the
return of the particle to the original position along a ‘$q_\beta$-curve’, then
the action function $s$ increases exactly by $J_\beta$.
From this it follows that the coordinates $Q_\beta$, and consequently the
momenta $P_\beta$, are periodic functions of the angle coordinates with periods
equal to 1. Each of them, as well as any function $f(Q_1, Q_2, Q_3)$ (or still
more generally $f(Q, P)$), can be expressed in the form of a triple Fourier
series
$$f = \sum_{k_1, k_2, k_3} f_{k_1, k_2, k_3}\,e^{i2\pi(k_1 w_1 + k_2 w_2 + k_3 w_3)}, \tag{33}$$
where $k_1, k_2, k_3$ are integers which can assume all values from $-\infty$ to
$+\infty$, and $f_{k_1, k_2, k_3}$ are certain expansion coefficients characteristic of
the function $f$. If instead of the $w_\beta$ we put their values obtained from
(32), we get
$$f = \sum_{k_1, k_2, k_3} C_{k_1, k_2, k_3}\,e^{i2\pi\omega t}, \tag{33 a}$$
where the $C_k$ are new expansion coefficients which we can regard as
the amplitudes of various harmonic vibrations, while
$$\omega = k_1\omega_1 + k_2\omega_2 + k_3\omega_3 \tag{33 b}$$
are the frequencies of these vibrations. The quantities $\omega_\beta$, i.e. the
velocities corresponding to the angle coordinates, represent therefore
the fundamental frequencies of the motion.
We can now return to the problem of determining the time mean
value of /. This problem can be solved at once by means of formula
(33 a). Indeed, the required time mean value must obviously be equal
to th at amplitude coefficient in (33 a) for which the vibration frequency
o) vanishes—or the sum of such coefficients if the equation w — 0 is
satisfied by several different combinations of the numbers kv k2, k3.
44 CLASSICAL MECHANICS AS LIMITING FORM §5
This mean value can be represented on the one hand by the general
1 T
formula / = lim — | fd t. On the other hand it can be represented
just as well by the formula
111
/ = JJJ / dwxdw2dw3
0 00
(34)
which does not contain the time explicitly, the triple integration being
extended over the ‘period cube’ in the coordinate space of the angle
variables; $f$ is given as a function of the angle variables by formula (33).
The expression (34) has the form of a ‘statistical’ mean value corresponding
to an averaging over the various copies of the given particle
distributed with a constant density in the space of the angle coordinates
$w_1, w_2, w_3$. Its numerical agreement with the time mean value of $f$ for
a definite copy means that the curve described by the motion of such a
copy fills up this space uniformly.†
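The equality of the two mean values can be tried numerically. The following is a minimal sketch, not part of the original argument: it assumes only two angle variables, a sample function `f` chosen for illustration, and a straight-line motion $w_p = \omega_p t$. It compares the time mean with the mean over the period square for a pair of incommensurable frequencies and for a commensurable (degenerate) pair.

```python
import math

def f(w1, w2):
    # Illustrative function of two angle variables, with period 1 in each.
    return 1.0 + math.cos(2 * math.pi * w1) + 0.5 * math.cos(2 * math.pi * (w1 - w2))

def time_mean(omega1, omega2, total_time=2000.0, steps=200_000):
    # Midpoint-rule estimate of (1/T) * integral of f dt along w_p = omega_p * t.
    dt = total_time / steps
    acc = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        acc += f(omega1 * t, omega2 * t)
    return acc / steps

def square_mean(points=64):
    # Midpoint-rule average of f over the period square 0 <= w1, w2 < 1.
    acc = 0.0
    for i in range(points):
        for j in range(points):
            acc += f((i + 0.5) / points, (j + 0.5) / points)
    return acc / points**2

space_avg = square_mean()
incommensurable = time_mean(1.0, math.sqrt(2.0))   # frequencies 1 and sqrt(2)
commensurable = time_mean(1.0, 1.0)                # degenerate case, equal frequencies

print(space_avg, incommensurable, commensurable)
```

With incommensurable frequencies the time mean approaches the mean over the period square; with equal frequencies the term depending on $w_1 - w_2$ stays constant along the path, so the two means differ.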
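We can now return from the angle coordinates to our original rectangular
coordinates $Q_1 = x$, $Q_2 = y$, $Q_3 = z$. In view of the fact that
the new momenta are constants, the old coordinates may be considered
practically as functions of the new coordinates alone, and vice versa.
We can thus transform the volume integral (34) according to the
well-known theorem of Jacobi, and put
$$\bar f = \int f D\, dV, \qquad (34 a)$$
where $dV = dx\,dy\,dz$ and
$$D = \begin{vmatrix}
\dfrac{\partial w_1}{\partial x} & \dfrac{\partial w_1}{\partial y} & \dfrac{\partial w_1}{\partial z} \\
\dfrac{\partial w_2}{\partial x} & \dfrac{\partial w_2}{\partial y} & \dfrac{\partial w_2}{\partial z} \\
\dfrac{\partial w_3}{\partial x} & \dfrac{\partial w_3}{\partial y} & \dfrac{\partial w_3}{\partial z}
\end{vmatrix}.$$
By (32 b) this functional determinant can be written in the form
$$D = \begin{vmatrix}
\dfrac{\partial^2 s}{\partial J_1\partial x} & \dfrac{\partial^2 s}{\partial J_1\partial y} & \dfrac{\partial^2 s}{\partial J_1\partial z} \\
\dfrac{\partial^2 s}{\partial J_2\partial x} & \dfrac{\partial^2 s}{\partial J_2\partial y} & \dfrac{\partial^2 s}{\partial J_2\partial z} \\
\dfrac{\partial^2 s}{\partial J_3\partial x} & \dfrac{\partial^2 s}{\partial J_3\partial y} & \dfrac{\partial^2 s}{\partial J_3\partial z}
\end{vmatrix}. \qquad (34 b)$$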
† This condition is satisfied for non-degenerate motion, that is, motion for which the
three fundamental frequencies $\omega_1, \omega_2, \omega_3$ are not commensurable with each other.
§5 MOTION IN A L IM IT E D R E G IO N 45
The volume integration in (34 a) must be extended over the whole
region for which $W - U \geq 0$. We are thus brought to the conclusion
that the relative probability that the particle will be found in the
volume-element $dV$, as measured by the relative duration of its presence
in this volume-element, is equal to $D\,dV$ (with $\int D\, dV = 1$). Comparing this
result with the wave-mechanical average
$$\bar f = \int f\,\psi\psi^*\, dV,$$
we see that it will agree approximately with (34 a) if $\psi\psi^* = D$. Now
in the region $W - U \geq 0$ the function $s = S^0$ is real, so that the modulus
of the function $\psi = A e^{i2\pi s/h} = e^{i2\pi S^0/h + S'}$ must reduce to $A = e^{S'}$. It
follows therefore that
$$A^2 = D.$$
It should be remembered that an exact agreement between the classical
and the wave-mechanical mean value is out of the question, not only
because of the approximative character of the preceding expression for
$\psi$ (with $s$ determined from the Hamilton-Jacobi equation), but also
because in the wave-mechanical case the integration must be extended
over all space including the classically forbidden region. However, this
region, although infinite, contributes in general only a finite and usually
a small amount to the integral $\int f\,\psi\psi^*\, dV$ because of a very rapid
decrease of the function $|\psi|^2$.
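The relation $A^2 = D$ can of course be derived in a straightforward
way by integrating the equation
$$\operatorname{div}(A^2\,\nabla s) = 0$$
[cf. (17 a)], or the equation
$$\nabla^2 S^0 + 2\,\nabla S^0\cdot\nabla S' = 0$$
to which (21 b) is reduced in the case of conservative motion. This
integration has been carried out (in the case of the second equation)
by Van Vleck, who showed that $A^2$ must be proportional to the determinant
$$\begin{vmatrix}
\dfrac{\partial^2 s}{\partial x\,\partial\alpha} & \dfrac{\partial^2 s}{\partial y\,\partial\alpha} & \dfrac{\partial^2 s}{\partial z\,\partial\alpha} \\
\dfrac{\partial^2 s}{\partial x\,\partial\beta} & \dfrac{\partial^2 s}{\partial y\,\partial\beta} & \dfrac{\partial^2 s}{\partial z\,\partial\beta} \\
\dfrac{\partial^2 s}{\partial x\,\partial\gamma} & \dfrac{\partial^2 s}{\partial y\,\partial\gamma} & \dfrac{\partial^2 s}{\partial z\,\partial\gamma}
\end{vmatrix},$$
where $\alpha, \beta, \gamma$ are any three integration constants occurring in the
expression of the function $s(x, y, z; \alpha, \beta, \gamma)$. This determinant is equal
to the product of $D$ with the determinant $\partial(J_1, J_2, J_3)/\partial(\alpha, \beta, \gamma)$, which is a constant
factor playing the role of a normalization constant.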
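In the special case of uni-dimensional motion the determinant (34 b)
reduces to $\partial^2 s/\partial x\,\partial J$, whereas by direct integration we obtained, in this
case, $A^2 = \dfrac{C^2}{v} = \dfrac{mC^2}{\partial s/\partial x}$. Thus we must have
$$\frac{\partial^2 s}{\partial x\,\partial J} = \frac{mC^2}{\partial s/\partial x},
\quad\text{that is,}\quad
\frac{\partial s}{\partial x}\,\frac{\partial^2 s}{\partial x\,\partial J} = \frac{1}{2}\frac{\partial}{\partial J}\left(\frac{\partial s}{\partial x}\right)^2 = mC^2,$$
or since $\dfrac{1}{2m}\left(\dfrac{\partial s}{\partial x}\right)^2 = W - U$, we get $\dfrac{\partial}{\partial J}(W - U) = C^2$. This condition
is actually fulfilled, for $\partial U/\partial J = 0$ and $\partial W/\partial J = \omega = 1/T$, where $T$ is
the period of motion [according to (32 a)]. Hence we
get $C^2 = 1/T$ in accordance with the simple theory developed in the
preceding section.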
II
OPERATORS
6. Operational Form of Schrödinger’s Equation, and Operational
Representation of Physical Quantities
The formal relation between classical mechanics and wave mechanics
can be presented in another way which not only leads us to a deeper
understanding of the theory but also to various important generaliza
tions.
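We can arrive at this relation by examining Schrödinger’s equation
(12) written in the form
$$D\psi = 0,$$
where $D$ denotes the operator
$$D = \frac{1}{2m}\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial y}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial z}\right)^2\right] + \frac{h}{2\pi i}\frac{\partial}{\partial t} + U.$$
This can be expressed in terms of the elementary differential operators
$$\frac{h}{2\pi i}\frac{\partial}{\partial x} = p_x, \quad \frac{h}{2\pi i}\frac{\partial}{\partial y} = p_y, \quad \frac{h}{2\pi i}\frac{\partial}{\partial z} = p_z, \quad \frac{h}{2\pi i}\frac{\partial}{\partial t} = p_t \qquad (35)$$
by the formula
$$D = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + p_t + U. \qquad (35 a)$$
The equation $D\psi = 0$ thus reduces to the classical equation
$$T + U - W = 0$$
if we replace the operators $p_x, p_y, p_z$ by the components of the momentum,
and $-p_t$ by the total energy, i.e. if instead of (35) we put
$$p_x = g_x, \quad p_y = g_y, \quad p_z = g_z, \quad p_t = -W \qquad (36)$$
and cancel the function $\psi$ (considering it as a factor). Therefore the
transition from classical mechanics to wave mechanics can formally be
carried out as follows. In the ‘classical’ equation
$$\frac{1}{2m}\left(g_x^2 + g_y^2 + g_z^2\right) + U - W = 0, \qquad (36 a)$$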
which relates the components of the momentum and the total energy of
a particle, we must replace these quantities by the elementary operators
(35) and then multiply the Schrödinger operator $D$ thus obtained by
the wave function $\psi$ on the right, where ‘right multiplication’ simply
means applying the operator to the expression standing on its right.
The replacement of the energy $W$ by the operator $-p_t = -\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$ has
been made before, although in a somewhat different connexion, namely,
in the transition from the wave equation
$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}(W - U)\psi = 0$$
for a conservative motion to the general equation
$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}\left(-\frac{h}{2\pi i}\frac{\partial}{\partial t} - U\right)\psi = 0,$$
which applies to a motion of any kind. In the former case, since
$\psi = \psi^0(x, y, z)\,e^{-i2\pi Wt/h}$, the operator $p_t$ is actually equivalent to the
energy in that it satisfies the equation $p_t\psi = -W\psi$, which we could write
symbolically (dropping the function operated upon) in the form $p_t = -W$.
A similar equivalence exists between the operators $p_x, p_y, p_z$ and the
components of the momentum $g_x, g_y, g_z$ with respect to the wave
function
$$\psi = \text{const.}\; e^{i2\pi(g_x x + g_y y + g_z z - Wt)/h}$$
representing the free motion of a particle with a velocity of specified
magnitude and direction. As we know, the latter can be specified only
in this particular case. In the general case the functions $p_x\psi$, $p_y\psi$, $p_z\psi$,
$-p_t\psi$ are not equal to the products of the function $\psi$ by constant
numbers.
It is natural to associate this result with the fact that, in the general
case, the components $g_x, g_y, g_z$ of the momentum, as well as the energy
$W$, cannot be defined as certain numbers since they do not have
definite values, and to assume further that the operators $p_x, p_y, p_z$,
$-p_t$ by which they are replaced in the transition from classical to wave
mechanics must replace them in all wave-mechanical questions.
This principle is corroborated by the following considerations.
(1) If the wave function $\psi$ can be approximated to by the expression
$e^{i2\pi S/h}$ where $S$ is the classical ‘action’, i.e. the momentum-potential
determined by the Hamilton-Jacobi equation, then we have
$$p_x\psi = \frac{\partial S}{\partial x}\,e^{i2\pi S/h} = g_x\psi,$$
etc., so that in this approximation the operators $p_x, p_y, p_z$ are actually
equivalent to the components of the momentum $g_x, g_y, g_z$. This result
still holds approximately if $\psi$ is represented in the form $Ae^{i2\pi S/h}$ where
$S$ is the classical momentum-potential, for the partial derivatives of the
amplitude $A$ with regard to $x, y, z$ (so far as the above approximation
can be applied) are very small compared with the partial derivatives
of $S/h$, i.e. the components of the wave number (the wave-length being
supposed to be very small).
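(2) If the function $\psi$ is ‘quadratically integrable’, i.e. if it can be
normalized in such a way that the integral $\int\psi\psi^*\, dV$ is equal to 1, then
the integrals
$$\int\psi^* p_x\psi\, dV, \quad \int\psi^* p_y\psi\, dV, \quad \int\psi^* p_z\psi\, dV$$
coincide with the average values of the components of the momentum
as defined by the integrals
$$m\int j_x\, dV, \quad m\int j_y\, dV, \quad m\int j_z\, dV,$$
where $\mathbf{j} = \psi\psi^*\mathbf{v}$ is the probability current density and $\mathbf{v}$ is the average
velocity introduced in the preceding chapter, §§ 2 and 3. We have in
fact, according to the definition of $j_x$,
$$m\int j_x\, dV = \frac{h}{4\pi i}\int\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right) dV.$$
Now by partial integration we get
$$\int\psi\frac{\partial\psi^*}{\partial x}\, dV = \iint\Big[\psi\psi^*\Big]_{x=-\infty}^{x=+\infty} dy\,dz - \int\psi^*\frac{\partial\psi}{\partial x}\, dV = -\int\psi^*\frac{\partial\psi}{\partial x}\, dV,$$
since in order that $\int\psi\psi^*\, dV$ should have a finite value the function $\psi\psi^*$
must vanish at infinity rapidly enough to make the integral
$$\iint\Big[\psi\psi^*\Big]_{x=-\infty}^{x=+\infty} dy\,dz$$
vanish too. Therefore
$$m\int j_x\, dV = \frac{h}{2\pi i}\int\psi^*\frac{\partial\psi}{\partial x}\, dV = \int\psi^* p_x\psi\, dV.$$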
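The agreement between the two definitions of the mean momentum can be illustrated numerically. The sketch below is an illustration only: it assumes a one-dimensional Gaussian wave packet, units in which $h/2\pi = 1$ and $m = 1$, and the usual expression $j = (h/4\pi i m)(\psi^*\,\partial\psi/\partial x - \psi\,\partial\psi^*/\partial x)$ for the current; the names `K0`, `SIGMA`, `psi` are chosen here and do not come from the text.

```python
import cmath
import math

HBAR = 1.0    # assumed units: h / (2*pi) = 1
MASS = 1.0    # assumed unit mass
K0 = 1.5      # mean wave number of the illustrative packet
SIGMA = 1.0   # width of the packet

def psi(x):
    # Normalised Gaussian wave packet carrying the mean momentum HBAR * K0.
    norm = (math.pi * SIGMA**2) ** -0.25
    return norm * cmath.exp(-x * x / (2 * SIGMA**2) + 1j * K0 * x)

def dpsi(x, eps=1e-4):
    # Central-difference approximation to d(psi)/dx.
    return (psi(x + eps) - psi(x - eps)) / (2 * eps)

def integrate(func, a=-10.0, b=10.0, n=4000):
    # Midpoint rule; the packet is negligible outside [a, b].
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx) for i in range(n)) * dx

def current(x):
    # Probability current density j(x) for the packet.
    d = dpsi(x)
    return (HBAR / (2j * MASS)) * (psi(x).conjugate() * d - psi(x) * d.conjugate())

norm_check = integrate(lambda x: abs(psi(x)) ** 2)
p_operator = integrate(lambda x: psi(x).conjugate() * (HBAR / 1j) * dpsi(x))
p_current = MASS * integrate(current)

print(norm_check, p_operator, p_current)
```

Both integrals return the same value, `HBAR * K0`, to within the accuracy of the finite differences.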
The preceding results can be extended to the more complicated
operators, by which different classical quantities represented as certain
functions of the coordinates and momenta $F(x, y, z, g_x, g_y, g_z)$ must be
replaced, when $g_x, g_y, g_z$ are replaced by the operators $p_x, p_y, p_z$. The
simplest example of such a complicated operator is the operator
$T = (p_x^2 + p_y^2 + p_z^2)/(2m)$ representing the kinetic energy. If the function
$\psi$ describes a motion with a given constant value of the total
energy, i.e. if it satisfies the Schrödinger equation $(T + U - W)\psi = 0$,
then we have $T\psi = (W - U)\psi$, where the ‘operator’ $(W - U)$ is a simple
factor. The preceding equation expresses the fact that the kinetic energy
(i.e. the magnitude of the classical velocity) is a definite function of the
coordinates. The sum of the operator $T$ and the potential energy $U$
represents the total energy of the particle and is usually called the
energy operator, or the Hamiltonian operator, or simply the ‘Hamiltonian’.
Denoting this operator by $H$, we can write the preceding
equation in the form $H\psi = W\psi$. It expresses the fact that the energy
of the particle in the motion described by the function $\psi$ has a definite
value, namely, $W$. The general equation referring to a non-conservative
motion can be written in the form
$$(H + p_t)\psi = 0. \qquad (37)$$
It implies a certain relation between the two operators $H$ and $-p_t$,
both of which represent the energy $W$ (when it exists): the former in
a specific way, including the properties of the particle (mass) and the
character of the field of force in which it moves, and the latter in a
perfectly general way independent of these characteristics.
Independently of the form of the operator $F(x, y, z; p_x, p_y, p_z)$, it can
easily be shown that the result of applying it to the function $\psi$ expressed
in the approximate form $e^{i2\pi S/h}$ (or $Ae^{i2\pi S/h}$) is equal approximately
to the product $F(x, y, z; g_x, g_y, g_z)\psi$. The same is true in the
more general case of an operator containing the time $t$ and the time
derivative operator $p_t$. We have namely
$$F(x, y, z, t; p_x, p_y, p_z, p_t)\psi = F(x, y, z, t; g_x, g_y, g_z, -W)\psi,$$
if the energy $W$ is defined as $-\partial S/\partial t$, in accordance with the Hamilton-Jacobi
equation which gives $-\partial S/\partial t = (\nabla S)^2/2m + U = T + U$. The
function $F\psi$ resulting from the application of the operator $F$ to the
exact wave function $\psi$ can be represented as the product of the latter
with a certain function $F_c$ of the coordinates alone (and possibly of
the time). The function $F_c = (F\psi)/\psi$ can be defined as the value of the
quantity represented by the operator $F$ at the corresponding point (and
instant of time). This is precisely the way in which we have defined
above the value of the kinetic energy in the case of a conservative
motion. If, in particular, the ratio $(F\psi)/\psi$ is equal to a constant $C$,
then the quantity represented by $F$ is said to be a constant of the motion,
its value $C$ being independent of the position of the particle (and of
the time). This case can be illustrated by applying the energy operator
$H$ to a function $\psi$ which describes a conservative motion, or by applying
any one of the operators $p_x, p_y, p_z$ to the function $\psi$ which describes
a uniform rectilinear motion.
If the ratio $F_c = (F\psi)/\psi$ is not equal to a constant, then we can
define the average or probable value of the quantity represented by
the operator $F$ by means of the formula
$$\bar F = \int F_c\,\psi\psi^*\, dV \quad\text{or}\quad \bar F = \int\psi^* F\psi\, dV, \qquad (38)$$
with the condition that
$$\int\psi\psi^*\, dV = 1. \qquad (38 a)$$
This definition of an average value is a generalization of that already
considered in the preceding chapter in connexion with quantities depending
on the coordinates alone (such as the potential energy). Its
physical significance has been tested above in the case of the fundamental
operators $p_x, p_y, p_z$.
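As a further illustration of the operational representation of physical
quantities we shall consider the angular momentum of a particle, for
instance, the angular momentum of an electron moving about a fixed
nucleus (cf. Part I, § 14). In classical mechanics this quantity is defined
as a vector with the components
$$y g_z - z g_y, \quad z g_x - x g_z, \quad x g_y - y g_x.$$
We shall define it accordingly as a vector-operator $\mathbf{M}$ with the components
$$M_x = y p_z - z p_y, \quad M_y = z p_x - x p_z, \quad M_z = x p_y - y p_x. \qquad (39)$$
Transforming from rectangular coordinates to spherical coordinates by
means of the formulae
$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,$$
we have
$$\frac{\partial\psi}{\partial r} = \frac{\partial\psi}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial\psi}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial\psi}{\partial z}\frac{\partial z}{\partial r},$$
i.e.
$$r\frac{\partial}{\partial r} = r\sin\theta\cos\phi\,\frac{\partial}{\partial x} + r\sin\theta\sin\phi\,\frac{\partial}{\partial y} + r\cos\theta\,\frac{\partial}{\partial z} = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z},$$
and likewise
$$\frac{\partial}{\partial\phi} = -r\sin\theta\sin\phi\,\frac{\partial}{\partial x} + r\sin\theta\cos\phi\,\frac{\partial}{\partial y} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}.$$
We have therefore
$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}. \qquad (39 a)$$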
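Further, from (39) we get
$$M^2 = M_x^2 + M_y^2 + M_z^2
= -\frac{h^2}{4\pi^2}\left\{(y^2 + z^2)\frac{\partial^2}{\partial x^2} - 2yz\frac{\partial^2}{\partial y\,\partial z} - 2x\frac{\partial}{\partial x} + \dots\right\}$$
$$= -\frac{h^2}{4\pi^2}\left\{(r^2 - x^2)\frac{\partial^2}{\partial x^2} - 2yz\frac{\partial^2}{\partial y\,\partial z} - 2x\frac{\partial}{\partial x} + \dots\right\},$$
where the terms denoted by ... are obtained from the given terms by
cyclic permutation of the coordinates $x, y, z$. Because of the identity
$$\left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}\right)^2 = x^2\frac{\partial^2}{\partial x^2} + \dots + x\frac{\partial}{\partial x} + \dots + 2yz\frac{\partial^2}{\partial y\,\partial z} + \dots,$$
or
$$x^2\frac{\partial^2}{\partial x^2} + \dots + 2yz\frac{\partial^2}{\partial y\,\partial z} + \dots = \left(r\frac{\partial}{\partial r}\right)^2 - r\frac{\partial}{\partial r},$$
we can write the previous expression in the form
$$M^2 = -\frac{h^2}{4\pi^2}\left[r^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) - \left(r\frac{\partial}{\partial r}\right)^2 - r\frac{\partial}{\partial r}\right]
= -\frac{h^2}{4\pi^2}\left[r^2\nabla^2 - r^2\frac{\partial^2}{\partial r^2} - 2r\frac{\partial}{\partial r}\right].$$
Hence
$$\nabla^2 = -\frac{4\pi^2}{h^2}\frac{M^2}{r^2} + \frac{1}{r^2}\left(r\frac{\partial}{\partial r}\right)^2 + \frac{1}{r}\frac{\partial}{\partial r}
= -\frac{4\pi^2}{h^2}\frac{M^2}{r^2} + \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r},$$
or putting
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Omega^2,$$
where
$$\Omega^2 = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$
denotes the angular part of $\nabla^2$, we get
$$M^2 = -\frac{h^2}{4\pi^2}\Omega^2. \qquad (39 b)$$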
By applying this operator and the operator (39 a) to the functions
$\psi_{nlm} = F_{nl}(r)\,Y_{lm}(\theta, \phi)$, which specify the stationary states of a hydrogen-like
atom, we get
$$M^2\psi_{nlm} = F_{nl}(r)\,M^2 Y_{lm} = -\frac{h^2}{4\pi^2}F_{nl}\,\Omega^2 Y_{lm},$$
and by the equation $\Omega^2 Y_{lm} + l(l+1)Y_{lm} = 0$ we get
$$M^2\psi_{nlm} = \frac{h^2}{4\pi^2}\,l(l+1)\,\psi_{nlm}.$$
Since, further, the dependence of $Y_{lm}(\theta, \phi)$ upon $\phi$ is expressed by the
factor $e^{im\phi}$,
$$M_z\psi_{nlm} = \frac{mh}{2\pi}\,\psi_{nlm}. \qquad (40 a)$$
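These two characteristic-value relations can be checked numerically for a particular spherical harmonic. The sketch below is a hedged illustration: it assumes the unnormalised harmonic $\sin\theta\cos\theta\,e^{i\phi}$ (that is, $l = 2$, $m = 1$), units in which $h/2\pi = 1$, and simple finite differences for the angular operator $\Omega^2$ and for $\partial/\partial\phi$; the function names are chosen here for the example.

```python
import cmath
import math

def Y21(theta, phi):
    # Unnormalised spherical harmonic with l = 2, m = 1.
    return math.sin(theta) * math.cos(theta) * cmath.exp(1j * phi)

def omega2(f, theta, phi, e=1e-3):
    # Finite-difference form of
    # (1/sin) d/dtheta (sin d/dtheta) + (1/sin^2) d^2/dphi^2.
    def flux(t):
        # sin(t) * df/dtheta evaluated at t with a half-step central difference.
        return math.sin(t) * (f(t + e / 2, phi) - f(t - e / 2, phi)) / e
    polar = (flux(theta + e / 2) - flux(theta - e / 2)) / (e * math.sin(theta))
    azimuthal = (f(theta, phi + e) - 2 * f(theta, phi) + f(theta, phi - e)) / (e * math.sin(theta)) ** 2
    return polar + azimuthal

def d_phi(f, theta, phi, e=1e-3):
    # Central-difference approximation to df/dphi.
    return (f(theta, phi + e) - f(theta, phi - e)) / (2 * e)

theta0, phi0 = 0.7, 0.3                 # an arbitrary point away from the poles
value = Y21(theta0, phi0)

m2_value = -omega2(Y21, theta0, phi0) / value     # M^2 in units of (h/2pi)^2
mz_value = d_phi(Y21, theta0, phi0) / (1j * value)  # M_z in units of h/2pi

print(m2_value, mz_value)
```

The ratios come out close to $l(l+1) = 6$ and $m = 1$, independently of the point chosen.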
These relations show that the magnitude of the angular momentum as
well as its direction are constants of the motion, just as in the classical
theory of a particle moving in a central field of force. It should be
mentioned that the character of the central field affects only the radial
factor $F_{nl}(r)$ in the wave function $\psi_{nlm}$, the angular factor $Y_{lm}(\theta, \phi)$
being in all cases a spherical harmonic function. Therefore the above
relations hold for the motion of a particle not only in a Coulomb field
but in any central field of force. They show further that the quantum
numbers $l$ and $m$ which have been introduced in Part I, § 14, as
nodal numbers, characterizing the wave function from a purely
geometrical point of view, have also a dynamical meaning, one of them
($l$) determining the total magnitude of the angular momentum according
to the relation $M^2 = l(l+1)h^2/4\pi^2$, and the other ($m$) determining the
projection of the angular momentum upon the $z$-axis according to
$M_z = mh/2\pi$. For this reason the numbers $l$ and $m$ will be called respectively
the angular and the axial quantum numbers.† The constancy
of the direction of the angular momentum is only proved indirectly by
the relation (40 a) because the direction of the $z$-axis can be chosen
arbitrarily, the functions $\psi_{nlm}$ being so defined that the $z$-axis is the
axis of the spherical harmonic functions $Y_{lm}(\theta, \phi) = P_{lm}(\theta)e^{im\phi}$. If we
apply the operators $M_x$ and $M_y$ to these functions the result will not
be similar to that obtained by applying the operator $M_z$ because the
functions $M_x\psi_{nlm}$ and $M_y\psi_{nlm}$ are not equal to multiples of $\psi_{nlm}$. Since
we know that $M_x$ and $M_y$ also represent constants of the motion, we
see that the condition $F\psi = \text{const.}\,\psi$ cannot be regarded as the general
criterion for the constancy of the quantity represented by the operator
$F$. It can easily be shown that the above failure of this equation to
express the general condition of dynamical constancy is connected with
degeneracy, i.e. with the fact that the functions $\psi_{nlm}$ are not determined
by the value of the energy $W_n$ which, in fact, depends only on the
‘principal’ quantum number ($n$). Any linear combination of the $n^2$
functions $\psi_{nlm}$ which differ from one another by the values assigned
to the numbers $l$ and $m$, will also represent a stationary state belonging
to the same value of the energy. This linear combination, i.e. the
coefficients $c_{lm}$ in the sum $\sum_l\sum_m c_{lm}\psi_{nlm}$, can be so chosen that the
resulting function $\psi'_n$ will represent the same thing with respect to the
$x$-axis as $\psi_{nl'm'}$ with respect to the $z$-axis. Applied to this function the
operator $M_x$ would be equivalent to multiplication by $m'h/2\pi$ according
to the equation $M_x\psi'_n = (hm'/2\pi)\psi'_n$ which could be considered
as a direct expression of the constancy of $M_x$. The function obtained
by applying $M_x$ to $\psi_{nlm}$ can easily be shown to reduce to a linear combination
$\sum_{m'=-l}^{+l} c_{mm'}\psi_{nlm'}$ of the $2l+1$ functions $\psi_{nlm'}$ associated with the
$z$-axis.
† This seems preferable to the traditional denomination where $l$ is referred to as the
‘azimuthal’ quantum number and $m$ as the ‘magnetic’ quantum number.
7. Characteristic Functions and Values of Operators; Operational
Equations; Constants of the Motion
In general the equation $F\psi = \text{const.}\,\psi$ can only be satisfied by functions
$\psi$ of a special type which depend upon the nature of the operator $F$
and are therefore called the characteristic functions of this operator
(‘Eigenfunktionen’ of the German authors, often translated into
English as ‘proper functions’). The corresponding values of the constant
factor are called the characteristic values of $F$. As an example we may
take Schrödinger’s equation $H\psi = W\psi$. In this equation the wave
functions describing the stationary states of motion are the characteristic
functions of the energy operator $H$, and the energy-levels
$W$ are its characteristic values. In the case of $H$, as well as in the
case of any other operator, these values and the functions associated
with them can form both a discrete and a continuous set. The
characteristic functions are fully determined by an operator $F$ for a
one-dimensional problem, involving one coordinate only. In three-dimensional
problems there remains in general a certain ambiguity
in the choice of the functions $\psi$, as determined by a single equation
of the type $F\psi = \text{const.}\,\psi$, an ambiguity which is known as ‘degeneracy’
if $F$ is the energy operator $H$. Thus, for example, the operator
$M_z = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial\phi}$ specifies the corresponding characteristic functions only
with regard to their dependence upon $\phi$, defining them as $\psi = f(r, \theta)e^{im\phi}$
where $f(r, \theta)$ is an arbitrary function of $r$ and $\theta$. The operator $M^2$ likewise
determines the dependence of the characteristic functions on the
angles $\theta, \phi$ only, the equation $M^2\psi = \text{const.}\,\psi$ being satisfied by
$\psi = f(r)\,Y_l(\theta, \phi)$ where $f(r)$ is an arbitrary function of $r$, and $Y_l(\theta, \phi)$ is an
arbitrary spherical harmonic of order $l$, which can be expressed as a sum
of $2l+1$ functions of the type $P_{lm}(\theta)e^{im\phi}$ with arbitrary coefficients.
Now we have also seen that Schrödinger’s equation $H\psi = \text{const.}\,\psi$ in
the case of a hydrogen-like atom has for each characteristic value of
$H = W_n$ a solution of the form $\psi_n = f_n(r)\,Y(\theta, \phi)$, where $Y(\theta, \phi)$ is a sum
of $n^2$ spherical harmonic functions of the type $P_{lm}(\theta)e^{im\phi}$ with arbitrary
coefficients ($l = 0, 1, \dots, n-1$; $m = -l, \dots, +l$). We cannot therefore
completely specify the functions $\psi_{nlm}$ describing the stationary states
of a hydrogen atom by taking one of the three equations
$$H\psi = \text{const.}\,\psi, \quad M^2\psi = \text{const.}\,\psi, \quad M_z\psi = \text{const.}\,\psi, \qquad (41)$$
but only by taking all three equations together. The functions $\psi_{nlm}$
then appear as the ‘simultaneous characteristic functions’ of the
operators $H$, $M^2$, and $M_z$, each of these functions belonging to a ‘triplet’
of characteristic values $W_n$, $(M^2)_l = l(l+1)h^2/4\pi^2$, and $(M_z)_m = mh/2\pi$.
Another simple example of this relationship is provided by the
operators $p_x, p_y, p_z$. The characteristic functions of these operators are
obviously $f_1(y, z)e^{i2\pi g_x x/h}$, $f_2(z, x)e^{i2\pi g_y y/h}$, $f_3(x, y)e^{i2\pi g_z z/h}$, with $f_1, f_2, f_3$ being
arbitrary functions of the corresponding arguments. Taken together
the three equations
$$p_x\psi = g_x\psi, \quad p_y\psi = g_y\psi, \quad p_z\psi = g_z\psi, \qquad (41 a)$$
where $g_x, g_y, g_z$ are constants, specify unambiguously the function
$$\psi = \text{const.}\; e^{i2\pi(g_x x + g_y y + g_z z)/h}, \qquad (41 b)$$
which describes the uniform rectilinear motion of a particle with the
momentum components $g_x, g_y, g_z$, and which is a particular solution of
Schrödinger’s equation $H\psi = W\psi$ with $H = (p_x^2 + p_y^2 + p_z^2)/2m$, i.e. with
$U = 0$, corresponding to free motion.
It should be mentioned that the expression (41 b) for $\psi$ is still incomplete
(as well as the expression $\psi = f_n(r)\,Y_{lm}(\theta, \phi)$ for the hydrogen-like
atom functions) inasmuch as it does not contain the time. The latter
can be introduced by the additional relation
$$-p_t\psi = W\psi,$$
giving $\psi \sim e^{-i2\pi Wt/h}$. The constant $W$ is, however, not independent, but
is connected with $g_x, g_y, g_z$ by the relation $W = (g_x^2 + g_y^2 + g_z^2)/2m$.
If $F$ is an ordinary function of the coordinates (or of the time too)
which does not contain the elementary differential operators $p_x, p_y, p_z$,
then the equation $F\psi = \text{const.}\,\psi$ has no solutions of the ordinary continuous
type. The only possible solutions (except the trivial one $\psi = 0$)
are those for which the function $\psi$ is different from zero on the surface
$F = \text{const.}$ and vanishes outside this surface (which can be displaced
by varying arbitrarily the value of the constant).
Another interesting case is provided by operators which satisfy
the equation $F\psi = C\psi$ identically, i.e. irrespective of the choice of
the function $\psi$, and therefore do not determine this function at all.
$F = p_x x - x p_x$ is the simplest example of such an operator. Applying
it to some function $\psi$, we get
$$F\psi = \frac{h}{2\pi i}\left[\frac{\partial}{\partial x}(x\psi) - x\frac{\partial\psi}{\partial x}\right] = \frac{h}{2\pi i}\psi.$$
Thus we see that this operator has one single characteristic value
$C = h/2\pi i$ with which any function can be associated as a ‘characteristic
function’. The preceding equation can be written symbolically
in the form
$$p_x x - x p_x = \frac{h}{2\pi i}, \qquad (42)$$
which is obtained by omitting the arbitrary function $\psi$ to which the
left- and right-hand sides of this equation must be applied. We have,
of course, similar equations for the two other coordinates and the corresponding
components of the momentum-operator: $p_y y - y p_y = h/2\pi i$
and $p_z z - z p_z = h/2\pi i$. In addition we have the ‘operational’ equations
$p_x y - y p_x = 0$ or $p_x y = y p_x$, etc., which express the fact that the order
in which the operators $p_x$ and $y$ are applied to any function $\psi(x, y, z)$
is immaterial (since $x$ and $y$ are independent variables). The equations
$p_x p_y - p_y p_x = 0$ are quite similar to the equations $xy - yx = 0$ expressing
the commutative law of ordinary multiplication. Two operators
$F$ and $G$ which, when applied successively in the order $F, G$ to any
function $\psi$ give the same result as when applied in the opposite order
$G, F$ are said to be commutable. This property is expressed symbolically
by the operational equation
$$FG = GF, \qquad (42 a)$$
which means that the ordinary equation
$$FG\psi = GF\psi$$
is satisfied identically, i.e. for any function $\psi$.
In general, the fact that the equation $A\psi = B\psi$ is satisfied identically
with respect to the function $\psi$, $A$ and $B$ being two outwardly different
operators, is expressed symbolically by the equation $A = B$. We shall
now give a few examples of such operational equations.
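Let us consider first of all the operator $F = p_x f - f p_x$ where $f(x, y, z)$
is an arbitrary (continuous) function of the coordinates. Applying it to
an arbitrary function $\psi$, we get
$$F\psi = \frac{h}{2\pi i}\left[\frac{\partial}{\partial x}(f\psi) - f\frac{\partial\psi}{\partial x}\right] = \frac{h}{2\pi i}\frac{\partial f}{\partial x}\psi,$$
so that
$$p_x f - f p_x = \frac{h}{2\pi i}\frac{\partial f}{\partial x}, \qquad (43)$$
which means that the operator $p_x f - f p_x$ is equivalent to the multiplier
$\dfrac{h}{2\pi i}\dfrac{\partial f}{\partial x}$.
The preceding equation is often written in the form
$$\frac{\partial f}{\partial x} = [p_x, f], \qquad (43 a)$$
where the bracket expression on the right side is defined by
$$[p_x, f] = \frac{2\pi i}{h}(p_x f - f p_x). \qquad (43 b)$$
If, in the above definition of $F$, we replace $f$ by $x$ and $p_x$ by $p_x^n$ [which
means differentiation of the $n$th order with regard to $x$, combined with
a multiplication by $(h/2\pi i)^n$], we get
$$F\psi = \left(\frac{h}{2\pi i}\right)^n\left[\frac{\partial^n}{\partial x^n}(x\psi) - x\frac{\partial^n\psi}{\partial x^n}\right] = \left(\frac{h}{2\pi i}\right)^n n\,\frac{\partial^{n-1}\psi}{\partial x^{n-1}},$$
so that
$$p_x^n x - x p_x^n = \frac{h}{2\pi i}\,n\,p_x^{n-1}, \qquad (44)$$
which can be rewritten symbolically in the form
$$x p_x^n - p_x^n x = -\frac{h}{2\pi i}\frac{\partial p_x^n}{\partial p_x}.$$
This formula can easily be generalized for any operator expressible as
the sum of terms $a_n p_x^n$ with coefficients $a_n$ which do not depend upon
the coordinate $x$. Denoting this operator by $f(p_x, p_y, p_z; y, z)$, we get
$$xf - fx = -\frac{h}{2\pi i}\frac{\partial f}{\partial p_x}, \qquad (44 a)$$
an equation very similar to (43) with $x$ playing the role of $-p_x$, and
$p_x$ the role of $x$. Putting
$$[x, f] = \frac{2\pi i}{h}(xf - fx), \qquad (44 b)$$
we can consider the equation
$$\frac{\partial f}{\partial p_x} = -[x, f]$$
as the general definition of the operator $\partial/\partial p_x$. We shall write in general
$$[F, G] = \frac{2\pi i}{h}(FG - GF), \qquad (45)$$
this ‘bracket expression’ introduced by Dirac as the quantum analogue
of the Poisson brackets vanishing if the operators $F$ and $G$ commute
with one another.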
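The relation (44) can be verified exactly on polynomials, for which differentiation involves no approximation. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists (an arbitrary choice made here), and since $p_x^n = (h/2\pi i)^n\,\partial^n/\partial x^n$, the common factor $(h/2\pi i)^n$ is dropped, so that the statement tested is $D^n X - X D^n = n D^{n-1}$ with $D = d/dx$ and $X$ denoting multiplication by $x$.

```python
def derivative(poly):
    # poly[k] is the coefficient of x**k; returns the coefficients of d(poly)/dx.
    return [k * c for k, c in enumerate(poly)][1:] or [0]

def times_x(poly):
    # Multiplication by x shifts every coefficient up by one power.
    return [0] + list(poly)

def nth_derivative(poly, n):
    for _ in range(n):
        poly = derivative(poly)
    return poly

def subtract(a, b):
    size = max(len(a), len(b))
    a = list(a) + [0] * (size - len(a))
    b = list(b) + [0] * (size - len(b))
    return [u - v for u, v in zip(a, b)]

def trim(poly):
    # Remove trailing zero coefficients so that equal polynomials compare equal.
    poly = list(poly)
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()
    return poly

def commutator_matches(poly, n):
    # Compare (D^n X - X D^n) poly with n * D^(n-1) poly.
    left = subtract(nth_derivative(times_x(poly), n), times_x(nth_derivative(poly, n)))
    right = [n * c for c in nth_derivative(poly, n - 1)]
    return trim(left) == trim(right)

sample = [3, -1, 4, 1, -5, 9]   # 3 - x + 4x^2 + x^3 - 5x^4 + 9x^5
results = [commutator_matches(sample, n) for n in range(1, 6)]
print(results)
```

Integer coefficients keep the comparison exact; multiplying both sides by $(h/2\pi i)^n$ restores the form (44).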
It should be noticed that an operational equation $A = B$ expresses
the identity of the physical quantities represented by the operators
$A$ and $B$; the existence of such equations indicates that the same
physical quantity can be represented in wave mechanics in a number
of apparently different ways.
Another interesting and important illustration of operational equations
is provided by the representation of the angular momentum of
a particle.
From the definition (39) it follows that
$$M_x^2 = (y p_z - z p_y)^2 = (y p_z)^2 - (y p_z)(z p_y) - (z p_y)(y p_z) + (z p_y)^2
= y^2 p_z^2 + z^2 p_y^2 - y p_y p_z z - z p_z p_y y,$$
since $p_y$ commutes with $z$ and $p_z$, and $p_z$ commutes with $y$ and $p_y$. Taking
into account the relations $p_z z = z p_z + h/2\pi i$ and $p_y y = y p_y + h/2\pi i$
we get
$$M_x^2 = y^2 p_z^2 + z^2 p_y^2 - 2yz\,p_y p_z - \frac{h}{2\pi i}(y p_y + z p_z),$$
whence the formula (39 b) can easily be obtained. We have in addition
$$M_x M_y = (y p_z - z p_y)(z p_x - x p_z) = y p_z z p_x - z p_y z p_x - y p_z x p_z + z p_y x p_z
= y p_x p_z z - z^2 p_y p_x - yx\,p_z^2 + zx\,p_y p_z,$$
whence
$$M_x M_y - M_y M_x = y p_x p_z z + zx\,p_y p_z - x p_y p_z z - zy\,p_x p_z
= (y p_x - x p_y)(p_z z - z p_z) = \frac{h}{2\pi i}(y p_x - x p_y) = -\frac{h}{2\pi i}M_z.$$
Thus, according to (45),
$$[M_x, M_y] = -M_z. \qquad (45 a)$$
In a similar way we can derive the relations $[M_y, M_z] = -M_x$ and
$[M_z, M_x] = -M_y$, which can also be obtained from (45 a) by a cyclic
permutation of the indices $x, y, z$. These three relations can be replaced
by the symbolic vector equation
$$\mathbf{M}\times\mathbf{M} = -\frac{h}{2\pi i}\mathbf{M}, \qquad (45 b)$$
where $\mathbf{A}\times\mathbf{B}$ is defined in the usual way as the vector product of $\mathbf{A}$
and $\mathbf{B}$.
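The bracket relations for the components of the angular momentum can be checked on a finite matrix representation. The sketch below is a hedged illustration: it assumes the standard $3\times 3$ matrices of $M_x, M_y, M_z$ for $l = 1$ in the basis $m = +1, 0, -1$ (these matrices are not given in the text), units with $h/2\pi = 1$, and the bracket convention $[F, G] = (2\pi i/h)(FG - GF)$ used above.

```python
import math

S = 1 / math.sqrt(2)

# Assumed l = 1 matrices in units of h/(2*pi), basis ordered m = +1, 0, -1.
MX = [[0, S, 0], [S, 0, S], [0, S, 0]]
MY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
MZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def bracket(f, g):
    # [F, G] = (2*pi*i/h)(FG - GF); with h/(2*pi) = 1 the prefactor is i.
    fg, gf = matmul(f, g), matmul(g, f)
    return [[1j * (fg[i][j] - gf[i][j]) for j in range(3)] for i in range(3)]

def scaled(a, factor):
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

relations_hold = (
    close(bracket(MX, MY), scaled(MZ, -1))
    and close(bracket(MY, MZ), scaled(MX, -1))
    and close(bracket(MZ, MX), scaled(MY, -1))
)

# M^2 = Mx^2 + My^2 + Mz^2 should be l(l+1) = 2 times the unit matrix.
squares = [matmul(m, m) for m in (MX, MY, MZ)]
m_squared = [[sum(sq[i][j] for sq in squares) for j in range(3)] for i in range(3)]

print(relations_hold, m_squared)
```

The three brackets reproduce $-M_z$, $-M_x$, $-M_y$ in turn, and the sum of squares is the multiple $l(l+1)$ of the unit matrix.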
Interesting results are obtained by calculating the bracket expressions
for the components of the vector $\mathbf{M}$ on the one hand, and the
components of the vector $\mathbf{r}(x, y, z)$ or $\mathbf{p}(p_x, p_y, p_z)$ on the other. We
shall not go into these calculations (which can easily be carried out by
the reader) but shall merely notice the following results:
$$[p^2, \mathbf{M}] = 0, \quad [p^2, M^2] = 0, \qquad (46)$$
where $p^2 = p_x^2 + p_y^2 + p_z^2$, the first of these equations being equivalent
to the three equations $[p^2, M_x] = 0$, $[p^2, M_y] = 0$, $[p^2, M_z] = 0$. These
equations express the fact that the angular momentum of a particle
commutes with its kinetic energy $T = p^2/2m$ (more exactly we should
speak of the operators representing the angular momentum and the
kinetic energy). If the potential energy $U$ is a function of the distance
$r = \sqrt{x^2 + y^2 + z^2}$ alone (which corresponds to a central field of force),
then we also have
$$[U, \mathbf{M}] = 0, \quad [U, M^2] = 0, \qquad (46 a)$$
and consequently
$$[H, \mathbf{M}] = 0, \quad [H, M^2] = 0, \qquad (46 b)$$
where $H = p^2/2m + U$ is the Hamiltonian operator representing the
total energy of the particle.
The relations (46 b) can be obtained very simply by using polar
coordinates to represent $H$ and $\mathbf{M}$. Then
$$H = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Omega^2\right) + U(r),$$
$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}, \qquad M^2 = -\frac{h^2}{4\pi^2}\Omega^2,$$
and so
$$[H, M_z] = 0, \qquad [H, M^2] = 0,$$
both bracket expressions $[\Omega^2, \partial/\partial\phi]$ and $[\Omega^2, \Omega^2]$ obviously vanishing.†
† In order to obtain (46 a) without the use of polar coordinates we need only notice
that $[U, M_x] = [U, y p_z - z p_y] = y[U, p_z] - z[U, p_y] = -y\,\partial U/\partial z + z\,\partial U/\partial y = 0$ according to (43 a).
The equations (46 b) must be naturally related to the fact that $\mathbf{M}$
and $M^2$ represent quantities which are constants of the motion (in the
case of a radially symmetrical field of force). An equation of the type
$$[H, F] = 0, \qquad (47)$$
i.e. the commutability of an operator $F$ with the energy operator $H$,
can actually be considered as the most general expression of the fact that
$F$ represents a constant of the motion determined by the operator $H$,
i.e. by Schrödinger’s equation $H\psi = W\psi$.
In fact, applying the operator $F$ to both sides of this equation, we
have $FH\psi = WF\psi$ or, if $HF = FH$, we obtain $H(F\psi) = W(F\psi)$. This
shows that the function $F\psi$ satisfies the same equation as the function
$\psi$ with the same characteristic value of the energy operator $H$. If there
is no degeneracy, i.e. if there is but one function $\psi$ associated with the
characteristic value $W$, then $F\psi$ can differ from $\psi$ by a constant factor
only (which is immaterial so far as the equation $H\psi = W\psi$ is concerned).
Thus in this case we get $F\psi = \text{const.}\,\psi$, which is the original
condition for the constancy of the quantity represented by $F$ in the
motion described by $\psi$. In the general case, i.e. when there is degeneracy,
the function $F\psi$ must obviously be equal to a linear combination
of all the functions $\psi_1, \psi_2, \dots, \psi_r$ associated with the same
characteristic value of $H$, i.e. satisfying the equation $H\psi_k = W\psi_k$
($k = 1, 2, \dots, r$), with the same value of the energy. Applying $F$ to one
of these functions we thus get, if $FH = HF$,
$$F\psi_k = \sum_{l=1}^{r} c_{kl}\psi_l, \qquad (47 a)$$
where $c_{kl}$ are constant numbers, the matrix
$$\begin{pmatrix}
c_{11} & c_{12} & \dots & c_{1r} \\
c_{21} & c_{22} & \dots & c_{2r} \\
\dots & \dots & \dots & \dots \\
c_{r1} & c_{r2} & \dots & c_{rr}
\end{pmatrix}$$
replacing the single constant $C$ of the non-degenerate case.
The fact that the equations (47 a) actually express the constancy of
$F$ can be proved by reducing them to a system of the standard form
$$F\psi'_n = c'_n\psi'_n, \qquad (47 b)$$
where $\psi'_n$ ($n = 1, 2, \dots, r$) are a set of $r$ new characteristic functions of
$H$ belonging to the same energy-level $W$ as the original functions
$\psi_1, \dots, \psi_r$ and therefore equal to certain linear combinations of the latter.
In order to determine them, we shall first consider the inverse transformation,
i.e. we shall express the original functions as linear combinations
of the new ones by means of the formulae
$$\psi_k = \sum_{n=1}^{r} a_{kn}\psi'_n. \qquad (48)$$
If these expressions are substituted in equations (47 a), then, in conjunction
with (47 b), we get
$$\sum_n a_{kn}c'_n\psi'_n = \sum_l\sum_n c_{kl}a_{ln}\psi'_n.$$
Equating the coefficients of the same $\psi'_n$ and dropping the index $n$,
we get
$$\sum_{l=1}^{r} c_{kl}a_l = c'a_k \quad (k = 1, 2, \dots, r). \qquad (48 a)$$
This is a system of $r$ linear homogeneous equations for the determination
both of the transformation coefficients $a$ and of the characteristic
values $c'$. The compatibility condition for equations (48 a)
$$\begin{vmatrix}
c_{11} - c' & c_{12} & \dots & c_{1r} \\
c_{21} & c_{22} - c' & \dots & c_{2r} \\
\dots & \dots & \dots & \dots \\
c_{r1} & c_{r2} & \dots & c_{rr} - c'
\end{vmatrix} = 0 \qquad (48 b)$$
gives $r$ (in general different) values for the unknown $c'$, and to each
of these values $c'_n$ there belongs a definite set of coefficients $a_k$, namely,
$a_{1n}, a_{2n}, \dots, a_{rn}$. By solving equations (48) with respect to the $\psi'_n$, we
can obtain the explicit expressions for the new functions in terms of
the original ones.
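The procedure can be followed through numerically in a small case. The sketch below is an illustration only: it assumes for $c_{kl}$ the $3\times 3$ matrix which represents $M_x$ acting on the three functions with $l = 1$ (in units $h/2\pi = 1$; this matrix is an assumption and is not written out in the text), locates the roots $c'$ of the compatibility condition (48 b) by bisection, and then obtains for each root a set of coefficients $a_k$ satisfying (48 a).

```python
import math

S = 1 / math.sqrt(2)

# Assumed example: coefficients c_kl of M_x on the three l = 1 functions.
C = [[0, S, 0], [S, 0, S], [0, S, 0]]

def shifted(c, value):
    # The matrix c_kl - c' * delta_kl appearing in (48 b).
    return [[c[i][j] - (value if i == j else 0) for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def secular(value):
    return det3(shifted(C, value))

def find_roots(lo=-2.1, step=0.2, count=21):
    # Scan for sign changes of the determinant, then refine each by bisection.
    roots = []
    for n in range(count):
        a, b = lo + n * step, lo + (n + 1) * step
        if secular(a) * secular(b) < 0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                if secular(a) * secular(mid) <= 0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
    return roots

def coefficients(value):
    # For a simple root the solution of (48 a) is orthogonal to the first two
    # rows of the shifted matrix, hence proportional to their cross product.
    r0, r1 = shifted(C, value)[:2]
    v = [r0[1] * r1[2] - r0[2] * r1[1],
         r0[2] * r1[0] - r0[0] * r1[2],
         r0[0] * r1[1] - r0[1] * r1[0]]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def residual(value, a):
    # Largest violation of sum_l c_kl a_l = c' a_k over k.
    return max(abs(sum(C[k][l] * a[l] for l in range(3)) - value * a[k]) for k in range(3))

roots = find_roots()
vectors = [coefficients(c) for c in roots]
residuals = [residual(c, a) for c, a in zip(roots, vectors)]

print(roots, residuals)
```

For this matrix the three characteristic values come out as $-1, 0, +1$, i.e. the same set of values $m'h/2\pi$ that $M_z$ possesses, in agreement with the equivalence of the $x$- and $z$-axes.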
Summing up the preceding results, we can say that the condition
[H, F] = 0 expresses the constancy of F with respect to all such types
of motion as are described by functions $\psi$ satisfying simultaneously
the equations $H\psi = \text{const.}\,\psi$ and $F\psi = \text{const.}\,\psi$. The functions $\psi$ are
thus simultaneously the characteristic functions of both H and F.
So far we have regarded the energy as the queen of all the operators,
but the above considerations seem to banish the energy from this
supreme position and to reduce the Schrödinger equation $H\psi = \text{const.}\,\psi$
to the same humble role as that of any other equation $F\psi = \text{const.}\,\psi$
for the characteristic functions and values of any other operator F.
Provided the operator F has a dynamical meaning, its characteristic
functions will describe the motion just as well as the Schrödinger wave
functions, although perhaps less completely and from a different point
of view. The product $\psi\psi^*$ will represent the probability of finding the
particle in the volume-element dV even if $\psi$ is a characteristic function
of some operator F different from the energy without being
simultaneously a characteristic function of the latter. The above-mentioned
difference in the point of view is obviously as follows: if $\psi$ is the
characteristic function of Schrödinger's wave equation, then $\psi\psi^*\,dV$ measures
the probability of finding the particle in the volume-element dV with
a specified energy W (the characteristic value of H associated with $\psi$);
if $\psi$ is the characteristic function of some other operator F, then $\psi\psi^*\,dV$
measures the probability of finding the particle in the volume-element
dV with a specified value of the quantity represented by F.
The fact that the probability determined by some ‘wave function’
$\psi$ has a conditional character only, dependent upon the assumption of
a certain specified value for the quantity or quantities by which (or
rather by whose operators) the function $\psi$ is characterized, is of
fundamental importance for a deeper understanding and further development
of wave-mechanical theory. We shall not stress this further here, but
shall limit ourselves to the following remarks.
(1) In the case of a one-dimensional motion the Schrödinger wave
functions are completely determined by one operator only, namely, the
energy operator H. This means that the energy is the only independent
constant of the motion, i.e. that any other operator F commuting with
H represents simply a function of H. A function of this kind can be
defined by the fact that its characteristic values are a definite function
of the characteristic values of H. If, for instance, $H\psi = W\psi$, then
$$H^2\psi = H(H\psi) = HW\psi = WH\psi = W^2\psi, \quad H^n\psi = W^n\psi,$$
and in general
$$F(H)\psi = F(W)\psi, \tag{49}$$
a result which can be proved directly if F is represented by a power
series in H with constant coefficients and which can be used as a
definition of F(H) in the general case. The wave functions describing the
motion of a particle in three dimensions are completely determined not
by the energy operator alone, but by three independent mutually
commuting operators which represent three constants of the motion (if one
of them is the energy, or if they indirectly involve the energy, all the
three commuting with the latter) such that their common characteristic
functions are at the same time solutions of the Schrödinger equation
$H\psi = W\psi$.
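Relation (49) can be checked on a small matrix model. This is a minimal sketch, assuming a hypothetical real symmetric matrix `H` in place of the energy operator and a truncated power series `F` chosen arbitrarily for illustration.

```python
import numpy as np

# Hypothetical real symmetric matrix standing in for the energy operator H.
H = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 3.0]])

# Characteristic values W[n] and functions psi[:, n]: H psi = W psi.
W, psi = np.linalg.eigh(H)

def F(X):
    """Illustrative power series F(X) = X^3 - 2 X + 1 for a number or a square matrix."""
    if np.ndim(X) == 0:
        return X**3 - 2.0 * X + 1.0
    identity = np.eye(X.shape[0])
    return X @ X @ X - 2.0 * X + identity

FH = F(H)
for n in range(3):
    # F(H) psi = F(W) psi, equation (49)
    assert np.allclose(FH @ psi[:, n], F(W[n]) * psi[:, n])

# A function of H commutes with H.
assert np.allclose(FH @ H, H @ FH)
```

The same check goes through for any polynomial in `H`, which is the sense in which (49) "can be proved directly" for a power series.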
(2) If the function $\psi$ does not satisfy this equation, then it does not
describe the motion, and the operator or operators by which it is defined
(according to the equations $F\psi = \text{const.}\,\psi$) can be said to have specified
values, but not constant values, i.e. values which are not permanent in time.
Thus time appears as the correlate of energy, a fact which is obvious
in view of the possibility of representing the energy not only by the
Hamiltonian operator H, but also by the time derivative operator
$-p_t = -\frac{h}{2\pi i}\frac{\partial}{\partial t}$, the general form of the Schrödinger equation $(H + p_t)\psi = 0$
merely expressing the equivalence of the two representations with
respect to a certain set of functions.
8. Probable Values of Physical Quantities and their Change with
the Time
In classical mechanics time enjoys a supreme role entirely different
from all the other variables, being actually the only independent
variable. The main problem of mechanics is to determine how all the
other variables—in particular the coordinates—change with the time.
In wave mechanics the time seems, at first sight, to be reduced to
a humbler role, since the spatial coordinates no longer depend on the
time but are treated (so far as the wave-mechanical ‘equation of
motion’ is concerned) as independent variables, that is, they appear
on the same footing as the time itself.
This equivalence between the spatial coordinates and the time is
restricted, however, as we know, to the wave equation $(H + p_t)\psi = 0$
and does not extend to the boundary conditions under which it has to
be solved nor to the interpretation of its solutions. Thus a function
$\psi(x, y, z, t)$ which satisfies the preceding equation is interpreted as the
measure of the probability of finding the particle under consideration
in a volume-element $dV = dx\,dy\,dz$ at a definite instant of time t, the
probability in question being defined as equal or proportional to $\psi\psi^*\,dV$.
If time played the same role as the coordinates, we should not be able
to refer the probability to a definite instant of time but should instead
refer it to an interval of time dt, and define it as proportional to $\psi\psi^*\,dV\,dt$.
There is, however, actually no reason why we should not be able to
refer the probability of location to a given instant of time, for the
particle must be somewhere at any moment. The exceptional role of
the time becomes particularly clear if we restrict ourselves to solutions
of the Schrödinger equation which vanish at infinite distance (they
cannot vanish for $t = \pm\infty$ except in separate places!) in such a way
as to ensure the convergence of the integral $\int\psi\psi^*\,dV$ extended over
all space. Taking the time derivative of this integral and replacing
$\partial(\psi\psi^*)/\partial t$ by $-\operatorname{div}\mathbf{j}$, where $\mathbf{j} = \frac{h}{4\pi im}(\psi^*\nabla\psi - \psi\nabla\psi^*)$ is the probability
current density, then, if the integration is first extended over a finite
volume limited by a closed surface S, we get
$$\frac{d}{dt}\int\psi\psi^*\,dV = -\oint j_n\,dS, \tag{50}$$
where $j_n$ is the normal component of $\mathbf{j}$. When the surface S is removed
to infinity the latter integral tends to zero (so long as $\psi$ is supposed to
be quadratically integrable), so that in the limit we get
$$\int_\infty\psi\psi^*\,dV = \text{const.},$$
which enables one to normalize $\psi$ to 1 by the condition
$$\int\psi\psi^*\,dV = 1. \tag{50 a}$$
It should be remarked that this result holds for the motion of the
particle not only in a constant field of force (this case has been
considered in § 17, Part I), but also in a variable field of force.
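The constancy of the normalization integral in a variable field can be illustrated numerically. The sketch below is not part of the original argument: it assumes one dimension, units in which $h/2\pi = m = 1$, a hypothetical time-dependent potential `U(t)`, a finite-difference Hamiltonian on a grid, and a Crank-Nicolson (Cayley) time step, which preserves the discrete sum $\sum|\psi|^2\,dx$ exactly for a Hermitian matrix.

```python
import numpy as np

# Units with h/(2*pi) = 1 and m = 1 (an assumption made for convenience).
N, L = 200, 20.0
x = np.linspace(-L / 2, L / 2, N)
dx = x[1] - x[0]

# Finite-difference kinetic energy -(1/2) d^2/dx^2, with psi = 0 outside the grid.
lap = (np.diag(np.full(N - 1, 1.0), -1) - 2.0 * np.eye(N)
       + np.diag(np.full(N - 1, 1.0), 1)) / dx**2
T = -0.5 * lap

def U(t):
    """Hypothetical variable field: an oscillator potential plus a shaking term."""
    return 0.5 * x**2 + 0.3 * np.sin(2.0 * t) * x

# Initial wave packet, normalized so that sum |psi|^2 dx = 1, cf. (50 a).
psi = np.exp(-(x - 1.0)**2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

dt, steps = 0.01, 100
identity = np.eye(N)
norms = []
for k in range(steps):
    H = T + np.diag(U((k + 0.5) * dt))   # Hamiltonian at the mid-point of the step
    # Crank-Nicolson step: (1 + i H dt/2) psi_new = (1 - i H dt/2) psi_old
    psi = np.linalg.solve(identity + 0.5j * dt * H,
                          (identity - 0.5j * dt * H) @ psi)
    norms.append(np.sum(np.abs(psi)**2) * dx)

norms = np.array(norms)
assert np.allclose(norms, 1.0, atol=1e-9)
```

The wave packet moves and changes shape under the variable field, but the integral of $\psi\psi^*$ keeps the value fixed by (50 a).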
Now if $\int\psi\psi^*\,dV$ is constant, it is futile to consider the integral
$\iint\psi\psi^*\,dV\,dt$ with a view to normalizing the function $\psi$ in such a way
that the time would appear on the same footing as the coordinates.
The Hamiltonian operator H, which, as we have seen, is intimately
connected with the time, must therefore play an exceptional role in
determining the permanence or non-permanence in time of different
quantities connected with the motion.
As has been shown before, this permanence is determined by the
condition HF − FH = 0, where F is the operator representing the quantity
in question. We are now going to generalize this result for quantities
which are not constants of the motion, i.e. quantities for which the
condition HF − FH = 0 is not fulfilled.
In classical mechanics such quantities can be determined as functions
of the time. In wave mechanics such a determination is only possible
for their probable values, as defined by
$$\bar F = \int\psi^*F\psi\,dV,$$
under the condition (50 a) (which is fulfilled for a motion restricted to
a finite region or represented by a wave packet).
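Differentiating $\bar F$ with regard to the time, and taking into account
the equations $\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$, $\left(H - \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^* = 0$, we get
$$\frac{d\bar F}{dt} = \frac{2\pi i}{h}\int[(H\psi^*)(F\psi) - \psi^*F(H\psi)]\,dV.$$
Now it can easily be proved that
$$\int(H\psi^*)(F\psi)\,dV = \int\psi^*H(F\psi)\,dV.$$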
In fact, putting $F\psi = f_1$, $\psi^* = f_2$, and writing the operator H in the
form
$$H = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + U,$$
we find
$$f_1Hf_2 - f_2Hf_1 = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2(f_1\nabla^2f_2 - f_2\nabla^2f_1) = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\operatorname{div}\mathbf{f}_{12},$$
where $\mathbf{f}_{12} = f_1\nabla f_2 - f_2\nabla f_1$.
If the integral $\int f_1f_2\,dV$
is convergent, then the integral $\int\operatorname{div}\mathbf{f}_{12}\,dV = \oint f_{12n}\,dS$ must vanish
when the integration is extended over all space (the surface S receding
to infinity), so that we get
$$\int f_1Hf_2\,dV = \int f_2Hf_1\,dV. \tag{51}$$
It should be mentioned that all operators having the property expressed
by this equation are called ‘self-adjoint’. Strictly speaking, the
self-adjointness of an operator H is expressed by the fact that the
difference $f_1Hf_2 - f_2Hf_1$ is equal to the divergence of some vector; this
condition leads to (51) when combined with the condition
$$\int f_1f_2\,dV = \text{finite}. \tag{51 a}$$
The latter condition is certainly fulfilled for $f_1 = F\psi$ and $f_2 = \psi^*$ so
long as (50 a) is fulfilled.
We thus can rewrite the above expression for $d\bar F/dt$ in the form
$$\frac{d\bar F}{dt} = \frac{2\pi i}{h}\int[\psi^*H(F\psi) - \psi^*F(H\psi)]\,dV,$$
or
$$\frac{d\bar F}{dt} = \frac{2\pi i}{h}\int\psi^*(HF - FH)\psi\,dV. \tag{52}$$
It follows from this formula that $d\bar F/dt = 0$, which means that F is
a constant of the motion, if HF = FH. This agrees with the result
found before. According to the general definition of the probable value
of a quantity represented by some operator F, we can define the
right-hand side of (52) as the average value of the operator
$$\frac{2\pi i}{h}(HF - FH) = [H, F].$$
Therefore
$$\frac{d\bar F}{dt} = \overline{[H, F]},$$
or
$$\frac{dF}{dt} = [H, F], \tag{52 a}$$
if dF/dt is regarded as an operator defined by equation (52 a) and
satisfying the condition
$$\overline{\frac{dF}{dt}} = \frac{d\bar F}{dt}.$$
In the derivation of (52 a) we have tacitly assumed that F did not
contain the time explicitly. If it does contain the time, then equation
(52 a) must be replaced by
$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + [H, F]. \tag{52 b}$$
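Equation (52) lends itself to a direct numerical check on a finite matrix model. The sketch below is an illustration only: `H` and `F` are hypothetical random Hermitian matrices, F does not contain the time explicitly, units with $h/2\pi = 1$ are assumed so that $[H, F] = i(HF - FH)$, and the time derivative of the probable value is estimated by a central difference.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Hypothetical Hermitian matrix used as a stand-in for an operator."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

n = 4
H = random_hermitian(n)   # stand-in for the Hamiltonian
F = random_hermitian(n)   # stand-in for a quantity not containing t explicitly

W, V = np.linalg.eigh(H)

def psi_t(psi0, t):
    """Solution of (H + p_t) psi = 0 in units with h/(2 pi) = 1."""
    return V @ (np.exp(-1j * W * t) * (V.conj().T @ psi0))

def mean(A, psi):
    """Probable value: integral of psi* A psi, here a finite sum."""
    return np.vdot(psi, A @ psi).real

psi0 = rng.normal(size=n) + 1j * rng.normal(size=n)
psi0 /= np.linalg.norm(psi0)

t, eps = 0.7, 1e-5
# Left-hand side of (52): time derivative of the probable value of F.
lhs = (mean(F, psi_t(psi0, t + eps)) - mean(F, psi_t(psi0, t - eps))) / (2 * eps)
# Right-hand side of (52): probable value of the bracket [H, F].
bracket = 1j * (H @ F - F @ H)
rhs = mean(bracket, psi_t(psi0, t))

assert abs(lhs - rhs) < 1e-6
```

If `F` is replaced by any function of `H`, the bracket vanishes and the probable value stays constant, in agreement with the condition HF = FH.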
For example, let us put F = x. The time derivative of x as a quantity
is equal to zero, since x is independent of t. Regarding x, or rather
dx/dt, as an operator, however, we have
$$\frac{dx}{dt} = [H, x] = -[x, H],$$
or, according to (44 c),
$$\frac{dx}{dt} = \frac{\partial H}{\partial p_x}, \tag{53}$$
which, with $H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U(x, y, z)$, gives
$$\frac{dx}{dt} = \frac{1}{m}p_x. \tag{53 a}$$
This equation coincides superficially with the classical relation between
velocity and momentum, considered as definite quantities. In wave
mechanics, however, they are indefinite quantities represented by the
operators dr/dt and p = m dr/dt. Putting $F = p_x$, we have
$$\frac{dp_x}{dt} = [H, p_x] = [U, p_x] = -[p_x, U],$$
or, according to (43 a),
$$\frac{dp_x}{dt} = -\frac{\partial U}{\partial x} = -\frac{\partial H}{\partial x}. \tag{53 b}$$
Equations (53) and (53 b), together with the corresponding equations
for the y and z components, are formally identical with the classical
equations of motion in the ‘canonical’ form (see preceding chapter, § 5).
If the classical quantity represented by the operator F is defined as a
function of the time and of the (classical) variables $x, p_x; y, p_y; z, p_z$,
we have
$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + \sum_{x,y,z}\left(\frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial p_x}\frac{dp_x}{dt}\right) = \frac{\partial F}{\partial t} + \sum_{x,y,z}\left(\frac{\partial H}{\partial p_x}\frac{\partial F}{\partial x} - \frac{\partial H}{\partial x}\frac{\partial F}{\partial p_x}\right),$$
according to (53) and (53 b). Comparing this with (52 b) we see that
the classical analogue of the quantum bracket expression [H, F] is the
sum $\sum_{x,y,z}\left(\frac{\partial H}{\partial p_x}\frac{\partial F}{\partial x} - \frac{\partial H}{\partial x}\frac{\partial F}{\partial p_x}\right)$, which is the classical Poisson bracket
expression.
Equation (52 a) looks very similar to equation (43) and the equations
corresponding to the other two coordinates, namely,
$$\frac{\partial f}{\partial x} = [p_x, f], \quad \frac{\partial f}{\partial y} = [p_y, f], \quad \frac{\partial f}{\partial z} = [p_z, f], \tag{54}$$
the time t being related to the energy operator H in the same way as the
coordinates x, y, z are related to the operators $p_x, p_y, p_z$ representing
the components of momentum. This relationship seems very natural
from the point of view of the relativity theory and seems to indicate
that time and energy must be treated on the same footing as the spatial
coordinates and the components of the momentum. The similarity
between the relations dF/dt = [H, F] and $\partial f/\partial x = [p_x, f]$ is, however,
only apparent, for in the latter case f denotes a function or operator
depending explicitly upon x, and $\partial/\partial x$ denotes partial differentiation
with regard to x, while in the former case F is a function or operator
which does not contain t explicitly. The time equivalent of equations
(54) is easily seen to be
$$\frac{\partial F}{\partial t} = [p_t, F]. \tag{54 a}$$
This equation follows immediately from the definition of the operator
$p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}$. Replacing $\partial F/\partial t$ in (52 b) by $[p_t, F]$, we get
$$\frac{dF}{dt} = [(H + p_t), F]. \tag{54 b}$$
It should be noticed that the operator $H + p_t$ does not vanish identically,
as might appear from the equation $(H + p_t)\psi = 0$, but only with respect
to the functions defined by this equation and describing the general
type of motion determined by the Hamiltonian H. The fact that there
are actually two different operators H and $-p_t$ representing the same
quantity, i.e. the energy, and equivalent to one another with respect
to the wave functions describing the motion of the particle, suggests
the possibility of restoring the symmetry between time and space which
is required by the relativity theory by introducing certain operators
$G_x, G_y, G_z$ which, though entirely different from $p_x, p_y, p_z$, would
represent the same thing as the latter, i.e. the components of the momentum.
The operators G would have to be defined so as to be equivalent to
the corresponding p with respect to the same wave functions as the
operators H and $-p_t$. If this were possible, we could replace the time
in its exceptional role by any one of the three coordinates x, y, z,
e.g. we could define the wave functions by an equation of the type
$(G_x - p_x)\psi = 0$, and interpret $\psi\psi^*\,dy\,dz\,dt$ as the probability of finding
the particle in the region specified by dy, dz, and dt for a definite value
of its x-coordinate. We could further define the average or probable
value of an operator by the formula $\bar F = \iiint\psi^*F\psi\,dy\,dz\,dt$ as a definite
function of x and obtain for its derivative with respect to x an
expression similar to (52) or (52 b),
provided the operator $G_x$ were self-adjoint, in the same sense as H.
This relativistic symmetry between space and time, as expressed by
the equal eligibility of any one of the four quantities x, y, z, t, and the
associated quantities $G_x, G_y, G_z, H$ to the presidential role which has
hitherto been enjoyed only by t and H, cannot, however, be attained
if we retain the definition of the Hamiltonian operator
$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U,$$
which has so far been used and which corresponds to pre-relativistic
classical mechanics. This follows from the unsymmetrical way in
which the operators $p_x, p_y, p_z$, and $p_t$ are involved in the equation
$(H + p_t)\psi = 0$.
It is possible, however, to modify the Schrödinger equation so as to
secure the desired symmetry enabling one to formulate it in any of
the four equivalent ways $(G_x - p_x)\psi = 0$, $(G_y - p_y)\psi = 0$, $(G_z - p_z)\psi = 0$,
$(H + p_t)\psi = 0$ in agreement with the relativity theory. This
modification (due to Dirac) will be considered later (Chap. VI).
9. The Variational Form of the Schrödinger Equation and its
Application to the Perturbation Theory
If the potential energy U does not involve the time explicitly, then
the equation $(H + p_t)\psi = 0$ has, as we know, particular solutions of the type
$\psi = \psi^0(x, y, z)e^{-i2\pi Wt/h}$, where the ‘amplitude’ function $\psi^0(x, y, z)$ satisfies
the equation $H\psi^0 = W\psi^0$ (which has been written before in the
equivalent form $H\psi = W\psi$). Multiplying it by $\psi^{0*}$ and integrating over the
whole space, then if, as we shall assume in future, $\int\psi^{0*}\psi^0\,dV = 1$,
we get
$$\int\psi^{0*}H\psi^0\,dV = W. \tag{55}$$
This is just what we should expect, since, according to the general
definition of probable (average) values, the integral
$$\int\psi^{0*}H\psi^0\,dV = \int\psi^*H\psi\,dV = \bar H$$
is the probable value W of the energy which is a constant of the motion.
We shall now show that the function $\psi^0$, which may be called the
characteristic function of the operator H (the time factor being
irrelevant so far as the equation $H\psi = W\psi$ is concerned), can be
determined from the variational principle
$$\delta\bar H = \delta\int\psi^{0*}H\psi^0\,dV = 0, \tag{55 a}$$
in conjunction with the normalization condition
$$\int\psi^0\psi^{0*}\,dV = 1. \tag{55 b}$$
We have in fact
$$\delta\bar H = \int\delta\psi^{0*}H\psi^0\,dV + \int\psi^{0*}H\delta\psi^0\,dV,$$
or, according to (51), i.e. because of the self-adjointness of H and
because of the convergence of the integral $\int\psi^{0*}\delta\psi^0\,dV$,
$$\delta\bar H = \int\delta\psi^{0*}H\psi^0\,dV + \int\delta\psi^0H\psi^{0*}\,dV. \tag{56}$$
Further, (55 b) gives
$$\int\delta\psi^{0*}\psi^0\,dV + \int\delta\psi^0\psi^{0*}\,dV = 0. \tag{56 a}$$
So long as the function $\psi^0$ is looked for as a complex quantity, it is
equivalent to two real functions. We could therefore consider $\psi^0$ and
$\psi^{0*}$ as two independent unknown functions, and treat their variations
as arbitrary independent infinitesimal quantities, were it not for the
condition (56 a). According to the Lagrange ‘method of multipliers’,
this dependence can be removed by multiplying (56 a) by some constant
factor C and subtracting the result from (56). This gives
$$\int\delta\psi^{0*}(H\psi^0 - C\psi^0)\,dV + \int\delta\psi^0(H\psi^{0*} - C\psi^{0*})\,dV = 0,$$
and since $\delta\psi^0$ and $\delta\psi^{0*}$ can now be regarded as completely arbitrary,
we must have $H\psi^0 = C\psi^0$ and $H\psi^{0*} = C\psi^{0*}$.
Thus from (55 a) and (55 b) we have obtained the Schrödinger equation
for the function $\psi^0$ and its conjugate complex function. The energy W
appears in the variational method as the value of Lagrange's multiplier
associated with the function $\psi^0$, and the Schrödinger equation appears
as the variational equation of Euler and Lagrange corresponding to the
‘conditional extremum’ of the integral $\bar H = \int\psi^{0*}H\psi^0\,dV$. This integral
can be written in a somewhat different form, a form which contains
only the first derivatives of the functions $\psi^0$ and $\psi^{0*}$ (as it must do if
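the variational equation is of the second order). We have in fact
$$\psi^{0*}\frac{\partial^2\psi^0}{\partial x^2} = \frac{\partial}{\partial x}\left(\psi^{0*}\frac{\partial\psi^0}{\partial x}\right) - \frac{\partial\psi^{0*}}{\partial x}\frac{\partial\psi^0}{\partial x},$$
and consequently
$$\int\psi^{0*}H\psi^0\,dV = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left[\int\operatorname{div}(\psi^{0*}\nabla\psi^0)\,dV - \int\nabla\psi^{0*}\cdot\nabla\psi^0\,dV\right] + \int U\psi^{0*}\psi^0\,dV,$$
or, since the first integral in the square brackets vanishes,
$$\bar H = \int\left\{\frac{h^2}{8\pi^2m}\nabla\psi^{0*}\cdot\nabla\psi^0 + U\psi^{0*}\psi^0\right\}dV. \tag{57}$$
Putting $\mathbf{p} = \frac{h}{2\pi i}\nabla$, we can rewrite this expression in the form
$$\bar H = \int\left(\frac{1}{2m}|\mathbf{p}\psi^0|^2 + U|\psi^0|^2\right)dV, \tag{57 a}$$
where $|\mathbf{p}\psi^0|^2$ is the scalar product of the vector $\mathbf{p}\psi^0$ and the conjugate
complex vector $\mathbf{p}^*\psi^{0*} = -\frac{h}{2\pi i}\nabla\psi^{0*}$. If, in addition, we introduce the
function $S = \frac{h}{2\pi i}\log\psi^0$, and so replace $\mathbf{p}\psi^0$ by $\psi^0\nabla S$, we get
$$\bar H = \int\left(\frac{1}{2m}|\nabla S|^2 + U\right)|\psi^0|^2\,dV. \tag{57 b}$$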
The integrand of this expression looks exactly like the classical
expression for the total energy (S being the Hamilton-Jacobi action function)
multiplied by $|\psi^0|^2$. It is worthy of remark that Schrödinger first
obtained his wave equation by applying the variation principle to the
integral (57 b), without fully realizing at that time (beginning of 1926)
its physical meaning.
The variational equation $\delta\bar H = 0$ does not mean that the values of
$\bar H = W$ obtained from it (with the condition $\int\psi^0\psi^{0*}\,dV = 1$) are
minimum or maximum values compared with those corresponding to
slightly varied functions $\psi^0$. In order to find out whether we actually
have an extremum or only a stationary value, we must calculate the
variation of $\bar H$ to the second approximation, i.e. to the second order of
the small quantities $\delta\psi^0$ and $\delta\psi^{0*}$.
We thus get
$$\Delta\bar H = \int(\psi^{0*} + \delta\psi^{0*})H(\psi^0 + \delta\psi^0)\,dV - \int\psi^{0*}H\psi^0\,dV$$
$$= \int\delta\psi^{0*}H\psi^0\,dV + \int\psi^{0*}H\delta\psi^0\,dV + \int\delta\psi^{0*}H\delta\psi^0\,dV.$$
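On the other hand, we must have
$$\int(\psi^{0*} + \delta\psi^{0*})(\psi^0 + \delta\psi^0)\,dV - \int\psi^{0*}\psi^0\,dV$$
$$= \int\delta\psi^{0*}\psi^0\,dV + \int\psi^{0*}\delta\psi^0\,dV + \int\delta\psi^{0*}\delta\psi^0\,dV = 0.$$
Multiplying this equation by the value of W corresponding to the
function $\psi^0$ and subtracting it from the first, we get, since $\psi^0$ and $\psi^{0*}$
satisfy the equations $H\psi^0 = W\psi^0$, $H\psi^{0*} = W\psi^{0*}$,
$$\Delta\bar H = \int\delta\psi^{0*}(H - W)\delta\psi^0\,dV, \tag{58}$$
which can also be written in the form
$$\Delta\bar H = \int\left[\frac{1}{2m}|\mathbf{p}\,\delta\psi^0|^2 + (U - W)|\delta\psi^0|^2\right]dV. \tag{58 a}$$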
This expression can be considered as the second variation of $\bar H$, since
it is a small quantity of the second order. Its sign is, in general,
uncertain: it may be positive for some variations $\delta\psi$ and negative for
others. The values $\bar H = W$ given by the variational principle $\delta\bar H = 0$
must therefore be regarded as stationary and not as minimum or
maximum values. The preceding results are simplified if we assume (as
we are usually entitled to do when we are dealing with stationary
states with no magnetic field present) that the wave function $\psi^0$ is real;
we need hardly, however, restate them in this simplified form.
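The indefinite sign of the second variation can be seen in a very small matrix model. This is a sketch under stated assumptions: the diagonal matrix `H` with levels 1, 2, 3 is hypothetical, and the probable value is divided by the normalization integral so that the varied function need not be renormalized by hand.

```python
import numpy as np

# Hypothetical real symmetric 'energy operator' with levels 1, 2, 3.
H = np.diag([1.0, 2.0, 3.0])

def mean_energy(psi):
    """Probable value of H for a real, not necessarily normalized, psi."""
    return psi @ H @ psi / (psi @ psi)

psi0 = np.array([0.0, 1.0, 0.0])   # characteristic function of the middle level
W = mean_energy(psi0)               # the stationary value, here the middle level

eps = 1e-3
# Variation that admixes the lower level, and one that admixes the upper level.
down = mean_energy(psi0 + eps * np.array([1.0, 0.0, 0.0])) - W
up = mean_energy(psi0 + eps * np.array([0.0, 0.0, 1.0])) - W

# Both changes are of the second order in eps, but of opposite sign.
assert down < 0 < up
assert abs(down) < 2 * eps**2 and abs(up) < 2 * eps**2
```

Only for the lowest level are all such second-order changes positive, which is why that level alone is a true minimum in this model.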
The variational principle provides us with a very simple and important
method for obtaining approximate solutions of Schrödinger's equation
and determining the corresponding energy values, or rather for
improving such approximate solutions and energy values after they have been
obtained by some other method.† Thus the variational method is useful
in determining the motion due to a field of force which is slightly different
from some simpler field of force for which the motion is supposed to be
known. The solution of this question is one of the two main problems of the
perturbation theory, the other problem being the determination of
transition probabilities which has already been considered briefly in Part I.
We shall give a detailed treatment of the perturbation theory in a later
chapter. At present we shall briefly indicate those of its results which
can be obtained, in a straightforward way, by the variational method.
† The method of reducing the solution of a differential equation of the type
$H\psi^0 = W\psi^0$ to a variational problem has been worked out by Lord Rayleigh and much
later by W. Ritz in connexion with the problems of the vibration of elastic bodies, which
are formally very similar to the problem of the motion of a particle in wave mechanics.
Let us suppose that, somehow or other, we have obtained a function
$\phi^0(x, y, z; a)$ which we know to be capable of approximately representing
one of the characteristic functions of the operator H provided the
undetermined parameter a, contained in it, is suitably chosen. Then
this particular value of a can be determined from the equation
$$\frac{\partial\bar H(a)}{\partial a} = W\frac{\partial E(a)}{\partial a}, \tag{59}$$
where
$$\bar H(a) = \int\phi^{0*}(x, y, z; a)H\phi^0(x, y, z; a)\,dV, \tag{59 a}$$
and
$$E(a) = \int\phi^{0*}\phi^0\,dV, \tag{59 b}$$
in conjunction with the relation $\bar H(a) = W$, which gives the
corresponding value of the energy. If the function is normalized to 1
(according to E = 1) for every value of a, equation (59) can be replaced by
$\partial\bar H(a)/\partial a = 0$.
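A one-parameter example may make the method concrete. The sketch below rests on assumptions not made in the text: one dimension, units with $h/2\pi = m = 1$, the oscillator potential $U = x^2/2$, and the trial function $\phi^0 = e^{-ax^2}$. Since this trial function is not normalized for every a, the ratio $\bar H(a)/E(a)$ is made stationary, with $\bar H(a)$ evaluated in the first-derivative form (57).

```python
import numpy as np

# Units with h/(2*pi) = m = 1 and U = x**2 / 2, chosen purely for illustration.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
U = 0.5 * x**2

def energy(a):
    """Ratio H(a)/E(a) for the trial function phi = exp(-a x^2)."""
    phi = np.exp(-a * x**2)
    dphi = -2.0 * a * x * phi                          # derivative of phi
    H_bar = np.sum(0.5 * dphi**2 + U * phi**2) * dx    # cf. (57) and (59 a)
    E = np.sum(phi**2) * dx                            # cf. (59 b)
    return H_bar / E

# Scan the parameter a and pick the stationary (here minimum) value.
a_grid = np.linspace(0.1, 2.0, 1901)
values = np.array([energy(a) for a in a_grid])
best = np.argmin(values)
a_best, W_best = a_grid[best], values[best]

assert abs(a_best - 0.5) < 2e-3
assert abs(W_best - 0.5) < 1e-6
```

For this family the ratio equals $a/2 + 1/(8a)$, so the stationary point lies at a = 1/2 with energy 1/2 in the assumed units; here the trial family happens to contain the exact lowest characteristic function.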
This method, which is often used in practice, can be generalized to
include the case when the function $\phi^0$ contains many unknown
parameters $a_1, a_2, ..., a_r$, the closeness of the approximation in general
increasing with the number r of these parameters. We come upon a
particularly simple and interesting case of such an approximation in
the perturbation theory of a degenerate motion, where we have, in
the absence of the perturbation, a set of wave functions $\psi^0_1(x, y, z)$,
$\psi^0_2(x, y, z)$, ..., $\psi^0_r(x, y, z)$ representing different states of motion with the
same energy W. Let us assume that the potential energy U has been
replaced by U', the difference U' − U corresponding to a small
perturbing field of force (for example, an external electric field of force).
The energy operator $H = p^2/2m + U$ must then be replaced by the
operator $H' = p^2/2m + U' = H + U' - U$, and the functions $\psi^0_k$
must be replaced by a set of r functions $\psi^{0\prime}_{k'}$ referring to r
states of motion with nearly the same energy, i.e. belonging to r energy
values $W'_1, ..., W'_r$ which are slightly different from one another and
from the approximate value W corresponding to the absence of
perturbing forces (the latter are, of course, supposed to be independent of
the time). Now the functions $\psi^{0\prime}_{k'}$ can be represented approximately as
linear combinations of the functions $\psi^0_k$ with unknown coefficients.
Thus we may write
$$\psi^{0\prime}_{k'} = \sum_{k=1}^{r} a_{kk'}\psi^0_k, \tag{60}$$
the r coefficients $a_{1k'}, a_{2k'}, ..., a_{rk'}$ appearing in the expression of each
function $\psi^{0\prime}_{k'}$ playing the role of the r parameters mentioned above.
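Dropping the index k' and substituting the expression $\psi^{0\prime} = \sum_k a_k\psi^0_k$ in
the integrals
$$\bar H' = \int\psi^{0\prime*}H'\psi^{0\prime}\,dV \quad \text{and} \quad E' = \int\psi^{0\prime*}\psi^{0\prime}\,dV,$$
we get
$$\bar H' = \sum_{k=1}^{r}\sum_{l=1}^{r} H'_{kl}a_k^*a_l, \tag{60 a}$$
$$E' = \sum_{k=1}^{r}\sum_{l=1}^{r} E_{kl}a_k^*a_l, \tag{60 b}$$
where
$$H'_{kl} = \int\psi^{0*}_kH'\psi^0_l\,dV, \tag{60 c}$$
$$E_{kl} = \int\psi^{0*}_k\psi^0_l\,dV. \tag{60 d}$$
The expressions (60 c) are the matrix elements of the energy operator
H' of the ‘perturbed’ motion with regard to the characteristic functions
describing the unperturbed types of motion associated with the same
energy W. Since these functions need not be orthogonal, the
expressions $E_{kl}$ may be different from zero for $k \neq l$.
The variational principle $\delta\bar H' = 0$, together with the condition
E' = 1, gives the following equations:
$$\frac{\partial\bar H'}{\partial a_k^*} = W'\frac{\partial E'}{\partial a_k^*}, \quad \frac{\partial\bar H'}{\partial a_l} = W'\frac{\partial E'}{\partial a_l},$$
i.e.
$$\sum_{l=1}^{r}(H'_{kl} - W'E_{kl})a_l = 0 \quad (k = 1, 2, ..., r), \tag{61}$$
$$\sum_{k=1}^{r}(H'_{kl} - W'E_{kl})a_k^* = 0 \quad (l = 1, 2, ..., r). \tag{61 a}$$
The second group can be obtained from the first by a change to
conjugate complex quantities in conjunction with the ‘Hermitian’ relations
(Part I, § 17) $H'^*_{kl} = H'_{lk}$ and $E^*_{kl} = E_{lk}$,
and therefore need not be considered separately. The compatibility
condition for the r linear homogeneous equations (61) runs
$$\begin{vmatrix} H'_{11}-W'E_{11} & H'_{12}-W'E_{12} & \cdots & H'_{1r}-W'E_{1r} \\ H'_{21}-W'E_{21} & H'_{22}-W'E_{22} & \cdots & H'_{2r}-W'E_{2r} \\ \vdots & \vdots & & \vdots \\ H'_{r1}-W'E_{r1} & H'_{r2}-W'E_{r2} & \cdots & H'_{rr}-W'E_{rr} \end{vmatrix} = 0. \tag{61 b}$$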
This is an equation of the rth degree for W'; its roots $W'_1, W'_2, ..., W'_r$
are the required (approximate) values of the energy. The coefficients
$a_{1k'}, a_{2k'}, ..., a_{rk'}$
corresponding to $W' = W'_{k'}$, according to (61), specify, by means of
equation (60), that type of perturbed motion which has the energy $W'_{k'}$.
We thus see that the r types of unperturbed motion which have the
same energy W and which are described by the functions $\psi^0_1, ..., \psi^0_r$
actually give rise, under the influence of the perturbation, to the same
number of different types of motion, but these, in general, now have
different energies $W'_1, ..., W'_r$. This phenomenon is denoted as the
‘splitting up’ of a multiple energy-level, by the influence of perturbing
forces, into a number of ‘sub-levels’. The Zeeman and Stark effects, i.e.
the splitting of the spectrum lines under the influence of a magnetic or
electric field, are examples of this.
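Equations (61) and (61 b) form what is now usually called a generalized characteristic-value problem, and a small numerical sketch may help. Everything numerical here is hypothetical: two degenerate, non-orthogonal unperturbed functions with overlap matrix `E`, an unperturbed energy `W0`, and a matrix `V` of the perturbation U' − U. The reduction through the Cholesky factor of `E` is one standard device, not a step taken in the text.

```python
import numpy as np

# Two degenerate unperturbed functions that are NOT orthogonal (hypothetical numbers).
E = np.array([[1.0, 0.4],
              [0.4, 1.0]])           # overlap integrals E_kl, cf. (60 d)
W0 = 1.0                             # common unperturbed energy W
V = np.array([[0.10, 0.05],
              [0.05, 0.02]])         # matrix elements of the perturbation U' - U
Hp = W0 * E + V                      # H'_kl, cf. (60 c), using H psi_l = W psi_l

# Reduce (H' - W' E) a = 0 to an ordinary characteristic-value problem
# by means of the Cholesky factor E = L L^T.
L = np.linalg.cholesky(E)
Linv = np.linalg.inv(L)
W_prime, b = np.linalg.eigh(Linv @ Hp @ Linv.T)
a = Linv.T @ b                       # column a[:, n] belongs to the root W_prime[n]

for n in range(2):
    assert abs(np.linalg.det(Hp - W_prime[n] * E)) < 1e-12       # (61 b)
    assert np.allclose(Hp @ a[:, n], W_prime[n] * E @ a[:, n])    # (61)

# The single level W0 has split into two distinct sub-levels.
assert W_prime[0] < W_prime[1]
```

With this construction each column of `a` also satisfies the normalization E' = 1 of (60 b).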
It should be mentioned that if the functions $\psi^0_k$ are orthogonal and
normalized to 1, i.e. if $E_{kl}$ is equal to 0 for $k \neq l$ and to 1 for k = l,
equations (61) assume the form
$$\sum_{l=1}^{r} H'_{kl}a_l = W'a_k \quad (k = 1, 2, ..., r), \tag{62}$$
and the compatibility equation for determining the energy values
reduces to
$$\begin{vmatrix} H'_{11}-W' & H'_{12} & \cdots & H'_{1r} \\ H'_{21} & H'_{22}-W' & \cdots & H'_{2r} \\ \vdots & \vdots & & \vdots \\ H'_{r1} & H'_{r2} & \cdots & H'_{rr}-W' \end{vmatrix} = 0. \tag{62 a}$$
Equations (60), (62), and (62 a) closely resemble equations (48), (48 a),
and (48 b) derived in § 7 for the determination of the characteristic
values of an operator F which is a constant of a motion involving
degeneracy. Actually they are identical, but this is slightly masked by
a difference in notation. If we replace F by H', reverse the role of the
‘old’ and ‘new’ functions $\psi$ and $\psi'$, replacing the $\psi$ by $\psi^{0\prime}$ and the $\psi'$
by $\psi^0$, and in addition write $H'_{kl}$ instead of $c_{kl}$ and W' instead of c',
then equations (48), (48 a), and (48 b) assume the form of (60), (62),
and (62 a) respectively. This coincidence shows that the operators H
and H' must commute with one another, i.e. that, to the degree of
approximation obtained by the perturbation theory sketched above,
the perturbation energy H' − H is to be considered as a constant of the
unperturbed motion specified by H.
This perturbation theory can easily be improved and generalized in
such a way as to become what is called a transformation theory, the
primary object of which is to derive exactly the characteristic functions
and values of a certain operator H' from the characteristic functions and
values of some other operator H. The solution of this problem is given
by the preceding equations if, in the first place, we drop the assumption
that the original (amplitude) functions $\psi^0_k$ belong to the same
energy-level, and if, in addition, we increase r to infinity, so as to use
the complete set of functions and energy-levels belonging to the operator
H. Equations (60) and (61) or (62), in conjunction with (61 b) or (62 a),
will then determine the complete set of functions and energy values
characteristic of the operator H'. Further generalizations of this
transformation theory involving operators different from the energy and
variables different from the coordinates will be examined later (Chap. IV).
It should be mentioned here that the reduction of an equation of the
form $F\psi = C\psi$ to a variational principle of the form
$$\delta\bar F = \delta\int\psi^*F\psi\,dV = 0$$
(with the condition $\int\psi\psi^*\,dV = 1$) is possible not only when F is the
energy operator H, but in the case of all operators which are
‘self-adjoint’, i.e. for which $f_1Ff_2 - f_2Ff_1$ = the divergence of some vector.
Actually it is not necessary for the integral $\int\psi\psi^*\,dV$ to converge. The
only assumption which it is necessary to make in order to obtain the
differential equation $F\psi = C\psi$ from the variational equation $\delta\bar F = 0$
is that $E = \int\psi\psi^*\,dV$ should be constant ($\delta E = 0$).
10. Orthogonality and Normalization of Characteristic Functions
for Discrete and Continuous Spectra
The characteristic functions $\psi^0$ obtained by the variation principle,
under the condition $\int\psi^0\psi^{0*}\,dV = \text{const.}$, or by the direct solution of
the equation $H\psi^0 = W\psi^0$, can form both a discrete and a continuous
set corresponding to a discrete or a continuous set of energy values W.
The energy values are therefore said to form a discrete or a continuous
spectrum of the energy operator H. As we know from the general
discussion of § 15, Part I, and from the examples of the oscillator and
the hydrogen atom, a discrete spectrum is associated with characteristic
functions which, because of ‘total reflection’, vanish at infinity so
rapidly that the integral $\int\psi^0\psi^{0*}\,dV$ converges. This makes it possible
to normalize them to 1 by means of the equation $\int\psi^0\psi^{0*}\,dV = 1$. The
characteristic functions corresponding to a continuous W-spectrum may
also (although not necessarily) vanish at infinity, but not rapidly
enough (because of the lack of total reflection) to ensure the convergence
of the integral $\int\psi^0\psi^{0*}\,dV$, so that their normalization to 1, or to any
other finite value, is in this case impossible.
This relationship between the convergence or non-convergence of the
integral $\int\psi^0\psi^{0*}\,dV$ (which is a measure of the probability of finding
the particle somewhere in the whole of space) and the discrete or
continuous character of the energy spectrum is intimately connected with
the relationship between the characteristic functions $\psi^0_n$ and $\psi^0_m$ which
are associated with or ‘belong to’ different values of the energy $W_n$
and $W_m$.
If the equation $H\psi^0_n = W_n\psi^0_n$ which is satisfied by $\psi^0_n$ is multiplied
by $\psi^{0*}_m$ and subtracted from the equation $H\psi^{0*}_m = W_m\psi^{0*}_m$ multiplied by
$\psi^0_n$, we get
$$\psi^0_nH\psi^{0*}_m - \psi^{0*}_mH\psi^0_n = (W_m - W_n)\psi^{0*}_m\psi^0_n.$$
Integrating over the whole space, and assuming the integrals $\int|\psi^0_n|^2\,dV$
and $\int|\psi^0_m|^2\,dV$ to be convergent, we get, because of the self-adjointness
of the energy operator according to (51),
$$(W_m - W_n)\int\psi^{0*}_m\psi^0_n\,dV = 0,$$
and since $W_m \neq W_n$,
$$\int\psi^{0*}_m\psi^0_n\,dV = 0. \tag{63}$$
This is the ‘orthogonality property’ which has already been deduced
for one-dimensional motion in § 17, Part I. As shown there, this
property can still be retained even when the states are degenerate, i.e.
when different functions $\psi^0_n$ and $\psi^0_m$ belong to the same energy-level,
provided these functions are suitably chosen as linear combinations of
the original ones (if the latter do not already satisfy the orthogonality
condition). If the energy values corresponding to different functions
are distinguished by different indices, irrespective of whether these
values are actually different or identical, the orthogonality relation
(63) and the normalization condition $\int|\psi^0_n|^2\,dV = 1$ can be fused into a
single equation
$$\int\psi^{0*}_m\psi^0_n\,dV = \delta_{mn}, \tag{63 a}$$
where $\delta_{mn} = 1$ if m = n and $\delta_{mn} = 0$ if $m \neq n$.
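The remark about degenerate states can be illustrated with vectors in place of functions. The following is a sketch under stated assumptions: `H` is a hypothetical diagonal matrix with a doubly degenerate level, the two starting functions are chosen non-orthogonal on purpose, and the suitable linear combinations are obtained by a QR factorization, which amounts to Gram-Schmidt orthogonalization.

```python
import numpy as np

# Hypothetical operator with a doubly degenerate level W = 1 and a simple level W = 2.
H = np.diag([1.0, 1.0, 2.0])

# Two linearly independent but non-orthogonal functions of the level W = 1
# (the columns of f are the original functions).
f = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])
assert abs(f[:, 0] @ f[:, 1]) > 0     # they are not orthogonal

# QR factorization: the columns of g are linear combinations of the columns of f.
g, _ = np.linalg.qr(f)

# The new functions satisfy (63 a) ...
assert np.allclose(g.T @ g, np.eye(2))
# ... and still belong to the same energy-level W = 1.
assert np.allclose(H @ g, 1.0 * g)
```

Any other orthogonal pair obtained by rotating the columns of `g` would serve equally well, which reflects the arbitrariness inherent in a degenerate level.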
It should be mentioned that the existence of degeneracy must be
regarded not as a general rule, but rather as an exceptional occurrence.
It only arises in a few cases in which the particle is moving in an
exceptionally simple field of force. Nevertheless, the simple types of
the potential-energy function U corresponding to these simple fields
of force are of great practical importance.
As shown in Part I when discussing examples of motion in three
dimensions, the different characteristic functions are specified by the
values of three quantum numbers $n_1, n_2, n_3$, which, from the geometrical
point of view, give the number of nodal surfaces of the different kinds
and which, from the dynamical point of view, specify the characteristic
values of three operators $F_1, F_2, F_3$, representing three independent
constants of the motion which is described by the corresponding
characteristic function. The energy operator H can be defined as a certain
function of the operators $F_1, F_2, F_3$, its characteristic values being
equal to the same function of the characteristic values C', C'', C'''
of these three operators. The existence of such operators is connected
with the existence of ‘separable coordinates’ $q_1, q_2, q_3$, these coordinates
being such that each characteristic function of H can be represented as
the product of three functions $\psi'_{n_1}(q_1)$, $\psi''_{n_2}(q_2)$, $\psi'''_{n_3}(q_3)$
satisfying the equations
2*3)' (04)
Since (04 a)
these become
with (04 b)
ii(F 19F 29FsypUtn^ --
where \V(C'y6"', O'") is the same function of the numbers C \ C", Cmas
II is of the operators Fv F2, F.v
In the approximate quasi-classical determination of the function $\psi^0$ in
the form $e^{i2\pi S_0/h}$, where $S_0$ is the action function of the Hamilton-Jacobi
theory, the product relation (64 a) corresponds to the additive relation
$$S_0(x, y, z) = S'(q_1) + S''(q_2) + S'''(q_3), \tag{64 c}$$
which serves to define the separable coordinates in the classical sense.
The quantum numbers $n_1$, $n_2$, $n_3$ are introduced by the condition that
the periodicity moduli of $S^{(k)}(q_k)$ must be integral multiples $n_k$ of $h$.
The energy $W(C', C'', C''')$ can be written as a function of the quantum
numbers in the form $W_{n_1n_2n_3}$. We have degeneracy when the energy
actually depends on only two or one of these numbers, or upon their
sum, as in the case of a hydrogen-like atom, where we may assume that
$n_1$ denotes the radial quantum number, $n_2 = l$ the angular quantum
number, and $n_3 = m$ the axial quantum number, $F_2$ being the operator
$M^2$ and $F_3$ the operator $M_z$, and hence
$$\psi''_{n_2} = P_{lm}(\theta), \qquad \psi'''_{n_3} = e^{im\phi}.$$
It is always possible to arrange the triplets of numbers $n_1$, $n_2$, $n_3$ in
a single row and to specify the functions $\psi^0$ and the energy-levels $W$
by a single index $n$ indicating the position of the corresponding triplet
in the row. The indices $n$ ($\psi_n^0$, $W_n$) so obtained will, of course, have no
connexion with the quantum numbers. One can also use a kind of vector
notation, writing $n$ as an abbreviation for the three indices $n_1$, $n_2$, $n_3$.
This is the notation used in § 17 of Part I, and we shall use it in future
when dealing with states of motion belonging to a discrete spectrum.
A continuous spectrum of the energy operator $H$ arises when at
least one of the three operators $F$, corresponding to the separation
coordinates, has a continuous spectrum of characteristic values, the
spectra of the other two operators remaining discrete (although of
course they may be continuous too). This case occurs with hydrogen-like
atoms in the region of positive energy values, i.e. in the region
corresponding to the non-periodic (hyperbolic) motions of the classical
theory. The wave functions can still, in this case, be written in
the form of a product (64 a), the radial quantum number ($n_1$) being
replaced by a continuously variable parameter. We may take as this
parameter the characteristic values $C'$ of the operator $F_1$ itself, or
the values of the energy which it determines in conjunction with the
quantized parameters $C''$ and $C'''$. It will be convenient to use for the
characteristic functions belonging to a continuous energy spectrum a
notation similar to that corresponding to the discrete case, replacing
the quantum numbers as indices by the characteristic values of the
operators $F$ and writing $C$ as an abbreviation for the triplet $C'$, $C''$, $C'''$,
so that the characteristic functions and energies are written $\psi_C^0(x, y, z)$
and $W_C$ respectively. If this abbreviation is not desired, it may be
preferable to use a mixed notation involving continuously variable
parameters as well as quantum numbers (e.g. the characteristic functions
of the hydrogen-like atom can be written in the form $\psi^0_{Wlm}$, where the
energy $W$ stands for the continuously variable parameter $C'$).
It should be mentioned that a continuous spectrum corresponds to
non-quantizable or partially quantizable motions that can be described
quasi-classically, i.e. with an approximately determined action
function $S_0$, which is either single-valued, or has a many-valuedness of
a kind restricted to one or two of the parts into which it is separated
according to (64 c). The wave functions $\psi_C^0$ belonging to a continuous
spectrum $W_C$ do not possess the orthogonality property which is
characteristic of the functions $\psi_n^0$ belonging to the discrete spectrum,
since, as we saw when deriving the orthogonality relation (63), this
relation depends not only upon the self-adjointness of the operator $H$,
but also on the convergence of the integrals $\int|\psi^0|^2\,dV$. These integrals
converge for $\psi^0 = \psi_n^0$ but do not converge for $\psi^0 = \psi_C^0$.
The connexion between the lack of orthogonality and the continuous
character of the energy spectrum can be illustrated by the following
argument. Let us suppose that $\psi^0_{C_1}$ and $\psi^0_{C_2}$ are two functions belonging
to two different energy-levels $W_{C_1}$ and $W_{C_2}$. Since the latter form a
continuous series, their difference can be made arbitrarily small. Now
if the orthogonality relation (63) applies to the continuous case, then
the integral $\int\psi^{0*}_{C_1}\psi^0_{C_2}\,dV$ would jump discontinuously from zero to
infinity as we go from nearly equal values of $C_1$ and $C_2$ (corresponding
to nearly equal values of the energy) to the limiting case $C_1 = C_2$.
It should also be mentioned that, with the exception of a motion
with one degree of freedom (i.e. specified by one coordinate only), the
continuous spectrum possesses a degeneracy of an infinitely high degree,
in the sense that each energy value can be associated with an infinite
number of different states of motion, represented by different functions
$\psi_C^0$.
In the case of a continuous energy spectrum it is possible, and
indeed is often necessary, to consider not merely exactly defined states
of motion corresponding to perfectly definite values of the continuously
variable parameters $C$, but rather states of motion represented by a
superposition of exactly defined states corresponding to a very small
range $\Delta C$ of these parameters, i.e. by wave functions of the type
$$\int_{\Delta C}\psi_C\,dC = \psi_{\Delta C}, \tag{65}$$
where the integration is extended over the range $\Delta C$. The wave
functions obtained in this way obviously represent a generalization of those
functions which have been used in Part I to represent ‘wave groups’
or ‘wave packets’. In defining these generalized ‘wave-packet’
functions, we must take into account the time factor in the expression
$\psi_C = \psi_C^0\,e^{-i2\pi W_Ct/h}$, since the energy $W_C$ is also a function of $C$. So long,
however, as the region $\Delta C$ is very small, the function (65) can be
written in the form
$$\psi_{\Delta C} = \varphi_{\Delta C}\,e^{-i2\pi W_{C_0}t/h}, \tag{65 a}$$
where $C_0$ denotes some arbitrarily chosen ‘point’ contained in $\Delta C$, and
$\varphi_{\Delta C}$ is a certain function not only of the coordinates, but also of the
time, representing the propagation of the wave packet.
For various reasons, it is usually more convenient to consider the
functions $\varphi^0_{\Delta C}$ at a particular instant $t = 0$, in which case they can be
defined by the integral
$$\varphi^0_{\Delta C} = \int_{\Delta C}\psi_C^0\,dC, \tag{65 b}$$
and to represent the inexactly defined states of motion for any time by
the product of (65 b) by $e^{-i2\pi Wt/h}$.
Let us imagine that the whole region formed by the variable
parameters $C$ (it may be a ‘line’, a ‘surface’, or a ‘space’, depending upon
the number of continuously variable parameters in the triplet denoted
by $C$) is divided into very small elements $\Delta C_1$, $\Delta C_2$,..., $\Delta C_n$ which do
not overlap, and let us consider instead of the exact states the
inaccurately determined states which are represented by the amplitude
functions $\int_{\Delta C_n}\psi_C^0\,dC$ ($n$ = 1, 2, 3,...). These states can be associated with
a discrete set of energy values $W_n$ referring to certain (arbitrarily
chosen) points of the corresponding elementary regions $\Delta C_n$.
It can be shown that in the limiting case when the size of each region
is decreased to zero (their number increasing to infinity) the functions
$$\varphi_n^0 = \frac{1}{\sqrt{\Delta C_n}}\int_{\Delta C_n}\psi_C^0\,dC \tag{66}$$
behave in the same way as the ordinary amplitude functions belonging
to a discrete spectrum, i.e. in such a way that the integrals $\int\varphi_n^{0*}\varphi_n^0\,dV$
are convergent. This result follows from the oscillatory character of
the functions $\psi_C^0$ at large distances (see below). Since the functions
(66) satisfy in the limit the same equation as the corresponding exact
functions (for $W = W_{C_n}$), it follows that they must be mutually
orthogonal and further that they can be normalized to 1, so that we can put
$$\int\varphi_m^{0*}\varphi_n^0\,dV = \delta_{mn}. \tag{66 a}$$
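Let us consider, for example, the functions
$$\psi_k^0 = A(k)\,e^{i2\pi kx},$$
which describe a force-free one-dimensional motion with a momentum
$g = hk$ and a kinetic energy $W = k^2h^2/2m$.
If we regard $A$ as a slowly varying function of $k$, we get
$$\int_{\Delta k_1}\psi_k^0\,dk = A(k_1)\int_{k_1-\frac12\Delta k_1}^{k_1+\frac12\Delta k_1} e^{i2\pi kx}\,dk = A(k_1)\,e^{i2\pi k_1x}\,\frac{\sin\pi\Delta k_1x}{\pi x}.$$
We thus obtain for the function $\varphi_1^0$ defined by (66), replacing the volume
integration by an integration along the $x$-axis,
$$\int_{-\infty}^{+\infty}\varphi_1^{0*}\varphi_1^0\,dx = \frac{1}{\Delta k_1}\int_{-\infty}^{+\infty}\Bigl|\int_{\Delta k_1}\psi_k^0\,dk\Bigr|^2dx = |A(k_1)|^2\,\frac1\pi\int_{-\infty}^{+\infty}\Bigl(\frac{\sin\xi}{\xi}\Bigr)^2d\xi = |A(k_1)|^2 \qquad (\xi = \pi\Delta k_1x),$$
i.e. by (66 a), $|A(k_1)|^2 = 1$.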
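It should be noticed that the normalizing condition only determines
the modulus of the coefficient $A(k)$. We can still multiply it by an
arbitrary factor of the form $e^{i\alpha(k)}$.
Likewise we find for two intervals $\Delta k_1$ and $\Delta k_2$ about the different
mean values $k_1$ and $k_2$:
$$\Bigl(\int_{\Delta k_1}\psi_k^0\,dk\Bigr)^*\int_{\Delta k_2}\psi_k^0\,dk = A_1^*A_2\,e^{i2\pi(k_2-k_1)x}\,\frac{\sin\pi\Delta k_1x}{\pi x}\,\frac{\sin\pi\Delta k_2x}{\pi x},$$
where $A_1 = A(k_1)$ and $A_2 = A(k_2)$.
If, for simplicity, we put $\Delta k_2 = \Delta k_1 = \Delta k$ ($k_2 \neq k_1$), then the integral
$\int\varphi_1^{0*}\varphi_2^0\,dx$ of the functions (66) assumes the form
$$A_1^*A_2\,\frac1\pi\int_{-\infty}^{+\infty} e^{i2\xi(k_2-k_1)/\Delta k}\Bigl(\frac{\sin\xi}{\xi}\Bigr)^2d\xi \qquad (\xi = \pi\Delta k\,x).$$
When $\Delta k \to 0$ the quantity $(k_2-k_1)/\Delta k$ becomes infinite and therefore
this integral must in the limit be zero. These results can easily be
generalized so as to apply to free motion in three dimensions,
represented by a wave function of the form
$$\psi^0_{\mathbf k} = A(\mathbf k)\,e^{i2\pi\mathbf k\cdot\mathbf r} = A\,\psi^0_{k_x}\psi^0_{k_y}\psi^0_{k_z},$$
since this function is equal to the product of three functions
representing one-dimensional motions parallel to the three coordinate axes
respectively, the integrals both with respect to $k_x$, $k_y$, $k_z$ as well as with
respect to $x$, $y$, $z$ thus reducing to products of integrals for the separate
components. (It should be remarked that $\Delta C$ must be defined in this
case as the product $\Delta k_x\Delta k_y\Delta k_z$.)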
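The two integrals over $\xi$ met with in this example can be estimated numerically. The following is a rough sketch, assuming a simple midpoint rule on a truncated range (both the cut-off `half_width` and the helper name `packet_overlap` are choices made here, not part of the text); the ratio $(k_2-k_1)/\Delta k$ is denoted by `r`.

```python
import math

def packet_overlap(r, half_width=500.0, step=0.02):
    """(1/pi) * integral of cos(2*r*xi) * (sin(xi)/xi)**2 over |xi| < half_width.

    r stands for (k2 - k1)/dk. The sine part of exp(i*2*r*xi) is odd in xi
    and integrates to zero, so only the cosine part is kept. The midpoint
    rule never evaluates the integrand at xi = 0 exactly.
    """
    count = int(round(2.0 * half_width / step))
    total = 0.0
    for j in range(count):
        xi = -half_width + (j + 0.5) * step
        total += math.cos(2.0 * r * xi) * (math.sin(xi) / xi) ** 2
    return total * step / math.pi

same_packet = packet_overlap(0.0)      # normalization integral, expected near 1
distant_packets = packet_overlap(3.0)  # k2 - k1 = 3*dk, expected near 0
print(same_packet, distant_packets)
```

Because the range of integration is cut off, the first value falls slightly short of unity; the deficit shrinks as `half_width` is increased.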
The general proof of the quadratic integrability of the functions
(66) can be derived from a very simple physical consideration, namely,
from the fact that, at very large distances, the motion represented by
any function $\psi_C$ must approximate to a force-free motion, at least in
all problems of practical interest for which the field of force determining
the motion of the particle is supposed to vanish at infinity.
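Taking again the function $\psi_k^0 = e^{i2\pi kx}$ as a typical representative of
wave functions belonging to a continuous spectrum (for the case of
one-dimensional motion), let us consider the double integral
$$J = \iint\psi^{0*}_{k_1}\psi^0_{k_2}\,dx\,dk_2 = \iint e^{i2\pi(k_2-k_1)x}\,dx\,dk_2,$$
extended from $-\infty$ to $+\infty$ both with regard to $k_2$ and $x$. Since each
of the simple integrals over $k_2$ and over $x$ taken separately between
these limits does not have a definite value, let us define the value of
$J$ as the limit of
$$J_k = \int_{-\infty}^{+\infty}dx\int_{k_1-\frac12k}^{k_1+\frac12k} e^{i2\pi(k_2-k_1)x}\,dk_2$$
for $k \to \infty$, or the limit of
$$J_\xi = \int_{-\infty}^{+\infty}dk_2\int_{-\xi}^{+\xi} e^{i2\pi(k_2-k_1)x}\,dx$$
for $\xi \to \infty$.
In the former case we have
$$\int_{k_1-\frac12k}^{k_1+\frac12k} e^{i2\pi(k_2-k_1)x}\,dk_2 = \frac{\sin\pi kx}{\pi x},$$
and
$$J_k = \frac1\pi\int_{-\infty}^{+\infty}\frac{\sin\pi kx}{x}\,dx = 1,$$
independently of $k$, and therefore in particular for $k = \infty$, which gives
$J = 1$. In the latter case we get similarly
$$\int_{-\xi}^{+\xi} e^{i2\pi(k_2-k_1)x}\,dx = \frac{\sin 2\pi(k_2-k_1)\xi}{\pi(k_2-k_1)},$$
and
$$J_\xi = \int_{-\infty}^{+\infty}\frac{\sin 2\pi(k_2-k_1)\xi}{\pi(k_2-k_1)}\,dk_2 = \frac1\pi\int_{-\infty}^{+\infty}\frac{\sin p}{p}\,dp = 1,$$
independently of $\xi$, and in particular for $\xi = \infty$. The two definitions of
$J$ thus lead to the same result, namely, $J = 1$.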
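Let us now assume that $\psi_k^0 = A(k)\,e^{i2\pi kx}$, where $A(k)$ is some relatively
slowly varying (non-oscillatory) function of $k$, and let us define the
double integral
$$J = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\psi^{0*}_{k_1}\psi^0_{k_2}\,dk_2\,dx$$
as the limit of
$$J_\xi = \int_{-\infty}^{+\infty}dk_2\int_{-\xi}^{+\xi}\psi^{0*}_{k_1}\psi^0_{k_2}\,dx$$
for $\xi = \infty$. Then since, in this limit,
$$\int_{-\infty}^{+\infty}A(k_2)\,\frac{\sin 2\pi(k_2-k_1)\xi}{\pi(k_2-k_1)}\,dk_2 = A(k_1)\,\frac1\pi\int_{-\infty}^{+\infty}\frac{\sin p}{p}\,dp = A(k_1),$$
we get $J = A^*(k_1)A(k_1) = |A(k_1)|^2$.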
Hence it follows that the ‘normalization’ $|A(k_1)|^2 = 1$ which has been
derived above for the function $\psi_k^0 = A(k)\,e^{i2\pi kx}$ with the help of (66)
and (66 a) (with $n = m = k$) can be obtained just as well from the
condition $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\psi^{0*}_{k_1}\psi^0_{k_2}\,dk_2\,dx = 1$. This result can easily be generalized
for any functions $\psi_C^0$ belonging to a continuous energy spectrum, the
normalization condition of the usual type for the quasi-discrete functions
$$\varphi_n^0 = \frac{1}{\sqrt{\Delta C_n}}\int_{\Delta C_n}\psi_C^0\,dC,$$
namely, $\int|\varphi_n^0|^2\,dV = 1$,
being equivalent to the condition
$$\iint\psi^{0*}_{C_1}\psi^0_{C_2}\,dC_2\,dV = 1. \tag{67}$$
The latter is similar to the equation
$$\sum_n\int\psi_m^{0*}\psi_n^0\,dV = 1$$
for functions belonging to a discrete spectrum. This equation is an
immediate consequence of the normalization and orthogonality relations
$$\int\psi_m^{0*}\psi_n^0\,dV = \delta_{mn}.$$
It is possible to treat equation (67) in a similar way, i.e. to consider it as
a corollary following from an orthogonality and normalization relation
for the functions $\psi_C^0$ which, according to Dirac, can be written in
the form
$$\int\psi^{0*}_{C_1}\psi^0_{C_2}\,dV = \delta(C_2 - C_1), \tag{67 a}$$
where $\delta(C)$ denotes a somewhat unusual type of function, rather defined
by the left side of this equation (together with the condition (67)) than
defining it. As a matter of fact, this function does not depend upon
the particular type of the function $\psi_C^0$ so long as $\psi_C^0$ satisfies the
condition (67) which reduces to
$$\int\delta(C_2 - C_1)\,dC_2 = 1,$$
or
$$\int\delta(C)\,dC = 1, \tag{67 b}$$
the integration being extended over all values of the continuously
variable parameter (or parameters) $C$.
It is obvious that for $C = 0$ (i.e. $C_2 = C_1$), the function $\delta(C)$ becomes
infinite. It seems, however, impossible to assign to it a definite value
for $C \neq 0$. Take, for example, the normalized function $\psi_k^0 = e^{i2\pi kx}$
(with $C = k$). According to the definition (67 a), we have
$$\delta(k_2 - k_1) = \int_{-\infty}^{+\infty} e^{i2\pi(k_2-k_1)x}\,dx,$$
i.e.
$$\delta(k) = \int_{-\infty}^{+\infty} e^{i2\pi kx}\,dx. \tag{68}$$
This expression has no definite value. We can, however, replace it, as
we have actually done above in the evaluation of the integral $J$, by
$$\delta_\xi(k) = \int_{-\xi}^{+\xi} e^{i2\pi kx}\,dx, \tag{68 a}$$
and pass to the limit $\xi \to \infty$ after the completion of all the calculations in
which the function $\delta_\xi(k)$ enters, and in particular after integration over
$k$ (which always forms a part of these calculations). The result will
have a perfectly definite value, and indeed the same value as that which
would be obtained by putting from the very beginning
$$\delta(k) = 0 \quad\text{for } k \neq 0$$
and
$$\int_{-\infty}^{+\infty}\delta(k)\,dk = 1.$$
The above calculation of the integral $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\psi^{0*}_{k_1}\psi^0_{k_2}\,dk_2\,dx$ for a function
of the type $\psi_k^0 = A(k)\,e^{i2\pi kx}$, subject to the normalizing condition
$J = 1$, serves to illustrate these relations.
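The limiting process described here can be followed numerically. Carrying out the integration in (68 a) gives $\delta_\xi(k) = \sin(2\pi k\xi)/\pi k$. The sketch below assumes a Gaussian amplitude $A(k) = e^{-k^2}$, chosen only for illustration, and hypothetical helper names; it integrates $A(k_2)\,\delta_\xi(k_2-k_1)$ over $k_2$ for increasing $\xi$ and compares the result with $A(k_1)$.

```python
import math

def delta_xi(k, xi):
    """Integral of exp(i*2*pi*k*x) over -xi < x < xi, i.e. sin(2*pi*k*xi)/(pi*k)."""
    if k == 0.0:
        return 2.0 * xi  # limiting value at k = 0
    return math.sin(2.0 * math.pi * k * xi) / (math.pi * k)

def amplitude(k):
    """Assumed slowly varying amplitude A(k); a Gaussian chosen for illustration."""
    return math.exp(-k * k)

def sifted(k1, xi, half_range=8.0, step=0.001):
    """Midpoint-rule value of the integral of A(k2) * delta_xi(k2 - k1) over k2."""
    count = int(round(2.0 * half_range / step))
    total = 0.0
    for j in range(count):
        k2 = -half_range + (j + 0.5) * step
        total += amplitude(k2) * delta_xi(k2 - k1, xi)
    return total * step

k1 = 0.5
for xi in (0.1, 1.0, 5.0):
    print(xi, sifted(k1, xi), amplitude(k1))
```

For small $\xi$ the integral differs markedly from $A(k_1)$; as $\xi$ grows it approaches $A(k_1)$, which is the behaviour ascribed to $\delta(k)$ in the text.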
We may thus say that the functions $\psi_C^0$ belonging to a continuous
spectrum, though not orthogonal to one another in the strict sense of
the term, can be treated as if they were orthogonal to one another and
can be normalized according to the conditions (67 a) and (67 b) with
$\delta(C) = 0$ for $C \neq 0$.
The usual normalization $\int|\psi_n|^2\,dV = 1$ for a function belonging to
a discrete spectrum is equivalent to putting the total probability of
finding the particle under consideration somewhere in the whole of
space equal to 1. The normalization (67) or (67 a) can be interpreted
as expressing the fact that the relative probability of finding the
particle within a finite region of space containing the field of force
in which it is moving is infinitely small compared with the
probability of finding it at infinity (where it moves practically as a free
particle). Under these circumstances it is more convenient to normalize
the total probability to infinity rather than to unity. This normalizing
to infinity, corresponding to the relation (67) or (67 a), is equivalent
to the usual type of normalization for the quasi-discrete functions
$\frac{1}{\sqrt{\Delta C}}\int_{\Delta C}\psi_C\,dC$, each of which represents a kind of ‘frozen’ wave
packet.
III
MATRICES
11. Matrix Representation of Physical Quantities and Matrix
Form of the Equations of Motion
If a particle is moving in a constant field of force, defined by a potential
energy $U(x, y, z)$ which does not depend upon the time, its total energy
$W$ remains constant. A ‘conservative motion’ of this kind is described,
in wave mechanics, by a particular solution of the equation $(H + p_t)\psi = 0$
of the type $\psi = \psi^0(x, y, z)\,e^{-i2\pi Wt/h}$, where the amplitude function $\psi^0$ and
the associated energy constant $W$ satisfy the equation $H\psi^0 = W\psi^0$. If the
particular solutions of the equation $(H + p_t)\psi = 0$, where the
Hamiltonian $H$ does not contain the time explicitly, form a discrete set
corresponding to a discrete spectrum of $H$, then the general solution
can be represented as a sum of these particular solutions with arbitrary
constant coefficients. Thus we may write
$$\psi = \sum_n a_n\psi_n = \sum_n a_n\psi_n^0\,e^{-i2\pi W_nt/h}, \tag{69}$$
the functions $\psi_n^0$ being supposed to be so normalized that they satisfy
the condition $\int|\psi_n^0|^2\,dV = 1$.
If the functions $\psi$ form a continuous set, the summation must be
replaced by an integration giving
$$\psi = \int a(C)\,\psi_C\,dC = \int a(C)\,\psi_C^0\,e^{-i2\pi W_Ct/h}\,dC, \tag{69 a}$$
where $C$ represents the continuously variable parameters. If some of
the three parameters are quantized while the others are continuously
variable, the summation must be replaced by a combined summation
and integration. Thus, for example, we may have
$$\psi = \sum_{n_2}\sum_{n_3}\int a_{C_1n_2n_3}\,\psi_{C_1n_2n_3}\,dC_1, \tag{69 b}$$
the functions $\psi_C$ or $\psi_{C_1n_2n_3}$ being so normalized that they satisfy the
condition (67), and $a(C) \equiv a_C$ being arbitrary functions of the
continuously variable parameters $C$.
If, as is generally the case, the energy spectrum consists of a
discrete part $W_n$ and a continuous part $W_C$, the general solution of the
equation $(H + p_t)\psi = 0$ is represented by a sum of (69) and (69 a) or
(69 b), so that
$$\psi = \sum_n a_n\psi_n + \int a_C\,\psi_C\,dC, \tag{69 c}$$
or
$$\psi = \sum_{n_1}\sum_{n_2}\sum_{n_3} a_{n_1n_2n_3}\,\psi_{n_1n_2n_3} + \sum_{n_2}\sum_{n_3}\int a_{C_1n_2n_3}\,\psi_{C_1n_2n_3}\,dC_1. \tag{69 d}$$
We shall first examine the simplest case, i.e. the representation (69)
corresponding to a discrete spectrum. As already explained in Part I,
§ 17, the summation, from the point of view of the probability theory,
expresses the alternative character of the motions represented by the
different functions $\psi_n$ or $\psi_n^0$. The resulting function $\psi$ can be normalized
to unity in the same way as the separate functions $\psi_n$, i.e. it can be
made to satisfy the condition
$$\int\psi\psi^*\,dV = 1. \tag{70}$$
According to (69), in conjunction with the orthogonality and normalizing
relations $\int\psi_m^*\psi_n\,dV = \delta_{mn}$, it then follows that
$$\sum_n a_na_n^* = 1. \tag{70 a}$$
The quantities $a_na_n^* = |a_n|^2$ can be interpreted, subject to this
condition, as the probabilities of finding the particle in a state of motion
specified by the function $\psi_n$, irrespective of its position in space.
The probable (or average) value of any quantity represented by an
operator $F$ is determined by the general formula
$$\bar F = \int\psi^*F\psi\,dV.$$
Putting $\psi = \sum a_n\psi_n$, we get
$$\bar F = \sum_m\sum_n a_m^*a_nF_{mn}, \tag{71}$$
where
$$F_{mn} = \int\psi_m^*F\psi_n\,dV. \tag{71 a}$$
The $F_{mn}$ are the ‘matrix elements’ of the quantity $F$ with respect to
the states of motion $\psi_m$ and $\psi_n$. Putting
$$\psi_n = \psi_n^0\,e^{-i2\pi\nu_nt} \qquad (\nu_n = W_n/h),$$
we get
$$F_{mn} = F_{mn}^0\,e^{i2\pi\nu_{mn}t}, \tag{71 b}$$
with
$$F_{mn}^0 = \int\psi_m^{0*}F\psi_n^0\,dV \tag{71 c}$$
and
$$\nu_{mn} = \nu_m - \nu_n = \frac{W_m - W_n}{h}$$
(cf. Part I, §§ 17 and 18).
So long as the operator $F$ represents a real quantity, the matrix
elements $F_{mn}$, as well as their amplitudes, are Hermitian, i.e. they
satisfy the relations
$$F_{mn} = F_{nm}^*, \qquad F_{mn}^0 = F_{nm}^{0*}. \tag{72}$$
These relations are directly evident if $F$ is a (real) function of the
coordinates alone. To establish them for the general case, let us first
put $F = p_x = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$. We then have
$$(p_x)_{mn} = \frac{h}{2\pi i}\int\psi_m^*\frac{\partial\psi_n}{\partial x}\,dV$$
and consequently
$$(p_x)_{nm}^* = -\frac{h}{2\pi i}\int\psi_n\frac{\partial\psi_m^*}{\partial x}\,dV.$$
Now
$$\int\psi_m^*\frac{\partial\psi_n}{\partial x}\,dV = \int\frac{\partial}{\partial x}(\psi_m^*\psi_n)\,dV - \int\psi_n\frac{\partial\psi_m^*}{\partial x}\,dV,$$
and since the first integral on the right vanishes, it follows that
$$(p_x)_{mn} = (p_x)_{nm}^*,$$
and so we get (72). The proof can easily be extended to any function
$F$ of the operators $p_x$, $p_y$, $p_z$ (and of the coordinates) not involving
complex quantities (with the exception of the $i$ in the expressions for
$p_x$ which is necessary to make these operators correspond to real
quantities).
The relations (72) should not be confused with the self-adjointness
relation (51) which, in the case of the integral (71 a), runs
$$\int\psi_m^*F\psi_n\,dV = \int\psi_nF\psi_m^*\,dV. \tag{72 a}$$
It is equivalent to (72) only when
$$F = F^*, \tag{72 b}$$
i.e. when $F$ is a function of the coordinates alone, not involving the
operators $p_x$, $p_y$, $p_z$ or involving them in even powers only. In the latter
case, which is met with, for example, when $F$ is the energy operator
$$H = (p_x^2 + p_y^2 + p_z^2)/(2m) + U(x, y, z),$$
the Hermitian relations (72) actually reduce to the relation (72 a)
expressing the self-adjoint character of $F$. Putting $F = H$, we have,
since $H\psi_n = W_n\psi_n$,
$$H_{mn} = W_n\int\psi_m^*\psi_n\,dV.$$
Taking into account the orthogonality and normalizing relations for
the functions $\psi_n$, this reduces to
$$H_{mn} = H_{mn}^0 = W_n\delta_{mn}. \tag{73}$$
We thus get by (71)
$$\bar H = \sum_n a_na_n^*W_n = \sum_n|a_n|^2W_n. \tag{73 a}$$
This equation shows that if $\bar H$ is to be interpreted as the probable
value of the energy, then the number $|a_n|^2$ must actually be considered
as the probability of finding the particle in the state of motion
represented by the function $\psi_n$ and associated with the exactly known value
of the energy $W_n$.
Similar results hold for any operator $F$ which represents a constant
of the motion, i.e. which commutes with the energy operator. If
there is no degeneracy, i.e. if the values of the energy $W$ corresponding
to different functions $\psi_n$ are all different, then, as already shown in
§ 7, it follows from the relation $HF = FH$ that $F\psi_n = F_n\psi_n$, where
$F_n$ is a constant, namely, the value of the quantity represented by $F$
for the state in question. We thus get, in the same way as before,
$$F_{mn} = \delta_{mn}F_n,$$
and
$$\bar F = \sum_n|a_n|^2F_n.$$
These relations can still be retained when there is degeneracy provided
the functions $\psi_1$, $\psi_2$,..., $\psi_r$ forming a degenerate set, i.e. belonging to
the same value of the energy, are so defined that they satisfy the
relations $F\psi_n = F_n\psi_n$ (this can always be done, as already shown in § 7).
If they do not satisfy these relations, we have
$$F\psi_k = \sum_{l=1}^r C_{kl}\,\psi_l$$
[cf. eq. (47 b), § 7]. Multiplying this equation by $\psi_m^*$, where $\psi_m$ is some
function of the same degenerate set, and integrating, we get
$$\int\psi_m^*F\psi_k\,dV = \sum_{l=1}^r C_{kl}\int\psi_m^*\psi_l\,dV = C_{km},$$
since we can always suppose the functions to be orthogonal to one
another, irrespective of the degeneracy. We thus get $C_{km} = F_{mk}$ or
$$F\psi_k = \sum_{l=1}^r F_{lk}\,\psi_l. \tag{74}$$
If $\psi_n$ is some function not belonging to the degenerate set $\psi_1$, $\psi_2$,..., $\psi_r$,
it follows that
$$F_{nk} = \int\psi_n^*F\psi_k\,dV = \sum_{l=1}^r F_{lk}\int\psi_n^*\psi_l\,dV = 0.$$
The general expression (71) thus reduces to the sum of the expressions
$$\sum_{k=1}^r\sum_{l=1}^r a_k^*a_lF_{kl} = \sum_{k=1}^r\sum_{l=1}^r a_k^*a_lF_{kl}^0 \tag{74 a}$$
taken for different values of the energy $W$. The relation $F_{kl} = F_{kl}^0$
follows from $W_k = W_l$. Thus, irrespective of the degeneracy, the
probable value $\bar F$ of the operator $F$ representing a constant of the motion
is independent of the time. This independence of $\bar F$ of the time is
therefore the general criterion of the fact that $F$ is a constant of the motion
and commutes with $H$. If there is no degeneracy, it means that all the
matrix elements of $F$ must vanish with the exception of the ‘diagonal’
elements (i.e. those with two identical indices). In the presence of
degeneracy this restriction is too narrow, the constancy of $\bar F$ being
consistent with non-vanishing values of the matrix elements of $F$ for
all those states for which the energy difference vanishes.
The relation (74) is a particular case of the general equation
$$F\psi_k = \sum_l F_{lk}\,\psi_l, \tag{75}$$
where the summation is extended over all the characteristic functions
of $H$, irrespective of whether they belong to the same energy or not.
This relation (75) holds for any operator $F$, and reduces to (74) when
$F$ is a constant of the motion. Equation (75) is derived in the same
way as (74) by assuming that the function $F\psi_k$ can be expanded in
a series of the type $\sum C_{kl}\psi_l$ with coefficients $C_{kl}$ which may be functions
of the time but do not depend upon the coordinates.† This is equivalent
to assuming that $F\psi_k^0$ can be expanded in a series of the type $\sum C_{kl}^0\psi_l^0$
with constant coefficients $C_{kl}^0$. In the latter case we obtain, by
multiplication by $\psi_m^{0*}$ and integration over the coordinates,
$$\int\psi_m^{0*}F\psi_k^0\,dV = \sum_l C_{kl}^0\int\psi_m^{0*}\psi_l^0\,dV = C_{km}^0,$$
i.e. $C_{km}^0 = F_{mk}^0$,
and
$$F\psi_k^0 = \sum_l F_{lk}^0\,\psi_l^0. \tag{75 a}$$
From this equation it is possible to derive (75) (provided $F$ does not
contain the operator $p_t$) with the help of the relations $\psi_k^0 = \psi_k\,e^{+i2\pi\nu_kt}$
and $F_{lk}^0 = F_{lk}\,e^{-i2\pi\nu_{lk}t}$, where $\nu_{lk} = \nu_l - \nu_k$.
If $F$ is not a constant of the motion, the expression (71) for its
probable value contains terms which represent harmonic oscillations
with the ‘transition’ frequencies $\nu_{mn} = (W_m - W_n)/h$. (The meaning of
this fact for the emission of light has been discussed in Part I, § 17.)
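The time dependence of the probable value given by (71) and (71 b) can be illustrated with a small numerical sketch. The two-level data below (frequencies, coefficients, and amplitude matrices) are hypothetical values chosen only for illustration, and `mean_value` is a helper name introduced here.

```python
import cmath
import math

def mean_value(a, F0, nu, t):
    """Sum over m, n of conj(a_m) * a_n * F0[m][n] * exp(i*2*pi*(nu_m - nu_n)*t)."""
    total = 0j
    for m in range(len(a)):
        for n in range(len(a)):
            phase = cmath.exp(2j * math.pi * (nu[m] - nu[n]) * t)
            total += a[m].conjugate() * a[n] * F0[m][n] * phase
    return total

# Hypothetical two-level data (not from the text): frequencies nu_n = W_n / h,
# normalized coefficients a_n, and a Hermitian amplitude matrix F0.
nu = [1.0, 3.0]
a = [complex(0.6, 0.0), complex(0.0, 0.8)]
F0 = [[1.0 + 0j, 0.5 - 0.2j],
      [0.5 + 0.2j, 2.0 + 0j]]
# A diagonal matrix models a constant of the motion in the absence of degeneracy.
F0_diagonal = [[1.0 + 0j, 0j], [0j, 2.0 + 0j]]

for t in (0.0, 0.1, 0.25):
    print(t, mean_value(a, F0, nu, t), mean_value(a, F0_diagonal, nu, t))
```

With the Hermitian matrix the result is real and oscillates with the transition frequency; with the diagonal matrix it does not depend on the time.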
Taking the derivative of $\bar F$ with respect to the time, we get, according
to (71 b),
$$\frac{d\bar F}{dt} = i2\pi\sum_m\sum_n a_m^*a_n\,\nu_{mn}F_{mn},$$
or
$$\frac{d\bar F}{dt} = \frac{2\pi i}{h}\sum_m\sum_n a_m^*a_n(W_m - W_n)F_{mn}. \tag{75 b}$$
† This assumption can be justified for a very wide class of operators satisfying certain
conditions which we shall not consider here and which are always fulfilled in practice.
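It can easily be shown that the right side of this expression is equal
to the probable value of $[H, F]$, i.e. to $2\pi i(HF - FH)/h$. We have in fact
$$FH\psi_n = FW_n\psi_n = W_nF\psi_n,$$
and, according to (75),
$$HF\psi_n = H\sum_k F_{kn}\psi_k = \sum_k F_{kn}W_k\psi_k,$$
so that
$$(HF - FH)_{mn} = \int\psi_m^*(HF - FH)\psi_n\,dV = \sum_k F_{kn}W_k\int\psi_m^*\psi_k\,dV - W_n\int\psi_m^*F\psi_n\,dV = (W_m - W_n)F_{mn}.$$
We may thus define the operator $dF/dt$ by the matrix equation
$$\Bigl(\frac{dF}{dt}\Bigr)_{mn} = \frac{2\pi i}{h}(W_m - W_n)F_{mn} = [H, F]_{mn}. \tag{75 c}$$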
If, in the preceding equations, we replace $H$ by some other operator
$G$, we get, by a twofold application of (75),
$$(FG)\psi_n = F\sum_k G_{kn}\psi_k = \sum_k G_{kn}F\psi_k = \sum_k G_{kn}\sum_m F_{mk}\psi_m = \sum_m\Bigl(\sum_k F_{mk}G_{kn}\Bigr)\psi_m.$$
On the other hand, according to the same formula (75), we have
$$(FG)\psi_n = \sum_m (FG)_{mn}\psi_m,$$
where $(FG)_{mn}$ are the matrix elements of the compound operator $FG$.
Therefore it follows that
$$(FG)_{mn} = \sum_k F_{mk}G_{kn}. \tag{76}$$
If we put $F_{mk} = F_{mk}^0\,e^{i2\pi\nu_{mk}t}$, $G_{kn} = G_{kn}^0\,e^{i2\pi\nu_{kn}t}$,
and take into account the relation
$$\frac{W_m - W_k}{h} + \frac{W_k - W_n}{h} = \frac{W_m - W_n}{h} = \nu_{mn}, \tag{76 a}$$
we get $(FG)_{mn} = (FG)_{mn}^0\,e^{i2\pi\nu_{mn}t}$, with
$$(FG)_{mn}^0 = \sum_k F_{mk}^0G_{kn}^0. \tag{76 b}$$
This relation can be obtained directly by applying the operator $FG$ to
$\psi_n^0$ instead of $\psi_n$ and using (75 a) instead of (75).
It should be noticed that equations (76) or (76 b) coincide with
equations of § 18, Part I, which were derived by combining the
multiplication and addition laws for the ‘probability amplitudes’ for
transitions from a certain state $m$ to another state $n$ through some
intermediate state $k$. The matrix elements $F_{mk}$ and $G_{kn}$ were
interpreted there as the ‘probability amplitudes’ for the simple transitions
$m \to k$ and $k \to n$ under the influence of perturbing forces characterized
by $F$ and $G$ respectively, and the matrix element $(FG)_{mn}$ as the
probability amplitude of a transition which is a combination of the
preceding two with the intermediate state $k$ remaining unspecified.
We shall return to this interpretation in a later section.
Equations (76) or (76 b) express, from a purely formal point of view,
the multiplication law of matrices. This matrix multiplication law (i.e.
combination of the rows of the first matrix with the columns of the second)
is quite similar to the multiplication law of determinants, which can be
associated with the corresponding matrices. Hence the matrix of the
operator $FG$ is called the product of the matrices of $F$ and $G$.
Matrix multiplication is, in general, non-commutative, just like
multiplication (i.e. successive application) of the corresponding operators.
It must be mentioned further that the products of two Hermitian
matrices $FG$ and $GF$ are in general not Hermitian, the conjugate
complex of $(FG)_{mn}$ being equal to $(GF)_{nm}$. The two products are therefore
Hermitian matrices only if they are identical, i.e. if $F$ and $G$ commute
with each other.
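The multiplication law (76) and the remark on Hermitian products can be tried out on a small example. The sketch below uses two hypothetical 2×2 Hermitian matrices chosen only for illustration; `matmul` and `dagger` are helper names introduced here.

```python
def matmul(F, G):
    """Matrix product (FG)_mn = sum over k of F_mk * G_kn, as in (76)."""
    size = len(F)
    return [[sum(F[m][k] * G[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def dagger(F):
    """Conjugate transpose: element (m, n) is the complex conjugate of F_nm."""
    size = len(F)
    return [[F[n][m].conjugate() for n in range(size)] for m in range(size)]

# Two hypothetical 2x2 Hermitian matrices, chosen only for illustration.
F = [[1 + 0j, 2 - 1j], [2 + 1j, 3 + 0j]]
G = [[0 + 0j, 1j], [-1j, 1 + 0j]]

FG = matmul(F, G)
GF = matmul(G, F)
print("FG =", FG)
print("GF =", GF)
print("conjugate transpose of FG equals GF:", dagger(FG) == GF)
print("FG is Hermitian:", dagger(FG) == FG)
```

Since these two matrices do not commute, the product $FG$ is not Hermitian, although its conjugate transpose coincides with $GF$.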
If, instead of the product of two operators, we consider their sum
$F + G$, which is obviously commutative in the sense that
$$(F + G)\psi = (G + F)\psi,$$
and form the matrix of this sum, we obtain the relation
$$(F + G)_{mn} = F_{mn} + G_{mn} = (G + F)_{mn}, \tag{76 c}$$
which expresses the addition law of matrices, this matrix addition
satisfying the commutative law.
It can easily be shown that, for three or more factors, the associative
law is satisfied both for operators and for the corresponding matrices,
just as for ordinary numbers, so that, for example,
$$(EF)G = E(FG),$$
and therefore
$$[(EF)G]_{mn} = \sum_k (EF)_{mk}G_{kn} = \sum_k\sum_l E_{ml}F_{lk}G_{kn} = \sum_l E_{ml}(FG)_{ln} = [E(FG)]_{mn}.$$
We thus see that there exists a one-to-one correspondence between different
operators and the associated matrices, both with respect to addition and
multiplication. This correspondence enables us to replace the operator
representation of physical quantities, which we introduced in the
preceding chapter, by a matrix representation, each physical quantity,
whether numerically expressible, i.e. having a definite value, or not,
being represented by an array of matrix elements
$$\begin{Vmatrix} F_{11} & F_{12} & F_{13} & \cdots\\ F_{21} & F_{22} & F_{23} & \cdots\\ F_{31} & F_{32} & F_{33} & \cdots\\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix} \tag{77}$$
or
$$\begin{Vmatrix} F_{11}^0 & F_{12}^0 & F_{13}^0 & \cdots\\ F_{21}^0 & F_{22}^0 & F_{23}^0 & \cdots\\ F_{31}^0 & F_{32}^0 & F_{33}^0 & \cdots\\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}. \tag{77 a}$$
These will be denoted in future by single letters $F$ and $F^0$ respectively,
and will be used in exactly the same way as the operator representing
the physical quantity in question, without direct reference to
characteristic functions of any kind.
It should, however, be kept in mind that such functions are indirectly
implied in the very definition of the matrices $F$ or $F^0$, being the
characteristic functions of the energy operator $H$. Referred to these particular
functions, the energy is represented by a diagonal matrix
$$H = \begin{Vmatrix} W_1 & 0 & 0 & \cdots\\ 0 & W_2 & 0 & \cdots\\ 0 & 0 & W_3 & \cdots\\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}, \tag{77 b}$$
i.e. $H_{mn} = \delta_{mn}W_n$, where
$$\delta = \begin{Vmatrix} 1 & 0 & 0 & \cdots\\ 0 & 1 & 0 & \cdots\\ 0 & 0 & 1 & \cdots\\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix}$$
is the so-called ‘unit-matrix’, which in future will sometimes be denoted
by 1 ($\delta_{mn} = 1_{mn}$).
The matrix elements of (77 b), i.e. the energy-levels $W_n$, appear in
the relations
$$F_{mn} = F_{mn}^0\,e^{i2\pi(W_m - W_n)t/h} \tag{77 c}$$
between the elements of (77) and (77 a), the latter being simple
numbers. The absolute values of the energy cannot, however, be derived
from these relations, which contain their differences only.
To distinguish the quantities $F_{mn}$ and $F_{mn}^0$, we shall call the $F_{mn}$ the
matrix components and the $F_{mn}^0$ the matrix elements of the quantity $F$.
For the energy as well as for any other constant of the motion, the
matrix components coincide with the corresponding elements, so that
we can then put $F = F^0$.
The representation of physical quantities by means of operators
(including functions of the coordinates alone) differs from the
representation by means of matrices in that the representation by operators
is absolute, while the representation by matrices is relative. By relative
we mean that the matrix elements of a quantity are defined with
respect to a particular set of stationary states which are specified by
the characteristic functions of a particular operator, or a system of
commutable operators (like $H$, $M_z$, and $M^2$). We shall see later that
this distinction is not so fundamental as it seems. The operator
representation given above is based upon the use of the coordinates (and
the time) as the directly observable quantities. But this is not
necessary. Certain other quantities, e.g. the momentum components, can
assume the role of directly observable quantities. The coordinates then
become represented as operators in terms of these new quantities.
Leaving this aside, and retaining the variables $x$, $y$, $z$, $t$ as the primary
and directly observed quantities, we can maintain the above distinction
as a fundamental one.
Now it can easily be shown that the determination of the matrix
elements of any operator $F$ with respect to the characteristic functions
of some other operator $H$ (or of a system of three commutable operators)
does not necessarily require an actual knowledge of these functions. It
is in fact sufficient to know that they are such as to make the matrix
of $H$ diagonal. If, moreover, both $H$ and $F$ are explicitly defined as
functions of the coordinates $x$, $y$, $z$ and of the elementary operators
$p_x$, $p_y$, $p_z$, then, taking into account the commutation relations
$$p_xx - xp_x = \frac{h}{2\pi i}\,1, \tag{78}$$
$$p_xy - yp_x = 0, \text{ etc.}, \tag{78 a}$$
$$xy - yx = 0, \qquad p_xp_y - p_yp_x = 0, \text{ etc.}, \tag{78 b}$$
(in the matrix representation) we can calculate, with the help of the
matrix addition and multiplication laws together with the condition that
$x$, $y$, $z$, $p_x$, $p_y$, $p_z$ shall all be Hermitian matrices, the matrix elements
both of $H$ and of any other non-diagonal matrix $F$. After the matrix
elements of $H$ and $F$ have been determined, we can then calculate the
matrix components of $F$ (those of $H$ coinciding with the elements).
So far, therefore, as the determination of the matrix elements or
components of any physical quantity with respect to the stationary
states defined by some energy operator $H$ is concerned, we can replace
the solution of Schrödinger’s equation $H\psi^0 = W\psi^0$ and the subsequent
integration $F^0_{mn} = \int \psi^{0*}_m F \psi^0_n\,dV$ by the following problem:
(1) To determine the matrix elements of the quantities $x$, $y$, $z$,
$p_x$, $p_y$, $p_z$, subject to the commutation conditions (78), (78 a), (78 b), in
such a way that the matrix of the function $H(x,y,z;\,p_x,p_y,p_z)$ shall be
diagonal, i.e. that $H_{nm} = 0$ unless $n = m$.
(2) Knowing the matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, to calculate the matrix
elements (or components if the $H$-matrix is added to the list) of any given
function $F(x,y,z;\,p_x,p_y,p_z)$.
In this way the functions $\psi^0$, specifying the stationary states to
which the matrix elements refer, can be completely eliminated from
the matrix theory, and the latter built up as a closed and consistent
theory, in the air, as it were, by the logical attraction of its elements,
and not requiring the use of any ideas extraneous to it for its support.
It should be noticed that the two parts of the above problem are,
in a certain sense, reciprocal to one another, for in the first part
we are concerned with the solution of a system of matrix equations
for the unknown matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and in the second with
the calculation of an explicitly given function of these fundamental
matrices.
In problems with one degree of freedom (corresponding to the motion
of a particle in one dimension, such as the linear oscillator) the
condition ‘$H$ is a diagonal matrix’, together with the commutation
conditions (78), etc., provides the basis for a complete and physically
unambiguous determination of the fundamental matrices, e.g. $x$ and
$p_x$, and consequently of the matrices representing, ‘from the point of
view of $H$’ as it were, any other quantity $F(x, p_x)$. It should be noticed,
however, that there remains a certain ambiguity which is irrelevant
for the physical interpretation of the matrix elements, but which, as
we shall see later on, is very important for the correct understanding
of the relation between matrix theory and classical mechanics. If,
in fact, $x^0_{mn}$ and $(p_x)^0_{mn}$ are matrix elements which satisfy the
conditions of the problem (or rather of its first part), then any elements of
the type
$$x^0_{mn}\,e^{i(\alpha_m-\alpha_n)}, \qquad (p_x)^0_{mn}\,e^{i(\alpha_m-\alpha_n)},$$
where $\alpha_n$ are arbitrary real numbers, will also satisfy these
conditions, the elements of any other matrix $F^0_{mn}$ being replaced
accordingly by $F^0_{mn}\,e^{i(\alpha_m-\alpha_n)}$. This result can easily be proved directly, or
deduced from the original definition of the matrix elements in terms of
the characteristic functions $\psi^0_n$ if we use the fact that each of them can
be replaced by its product by $e^{-i\alpha_n}$ without any violation of the
orthogonality and normalizing relations. This amounts to the introduction
of an arbitrary ‘phase’ into $\psi_n$ (putting $\psi_n = \psi^0_n e^{-i\alpha_n}$) or ‘phase
difference’ into $F_{mn}$ (putting $F_{mn} = F^0_{mn}\,e^{i(\alpha_m-\alpha_n)}$).
The ‘phase’ constants $\alpha$ vanish in the diagonal elements $F_{nn}$ which,
as we know, determine the average or probable value of the quantity
represented by $F$ in a stationary state with the energy $W_n$. The phase
constants also vanish in the products $F_{mn}F_{nm}$, i.e. in the squares of
the moduli of the matrix elements referring to different stationary
states ($W_n \neq W_m$). These products determine the probability of a
transition between the two states under the influence of a perturbation
proportional to $F$.
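The irrelevance of the phase constants can be checked numerically. The following Python sketch is an illustration only: the two Hermitian matrices `F` and `G` and the phases `alpha` are arbitrary assumed values, not taken from the text. It verifies that the replacement $F_{mn} \to F_{mn}e^{i(\alpha_m-\alpha_n)}$ leaves the diagonal elements and the squared moduli unchanged, and that it is compatible with matrix multiplication, so that any algebraic relation between matrices (such as the commutation relations) survives the change of phases.

```python
import cmath

def rephase(F, alpha):
    """Return the matrix with elements F_mn * exp(i(alpha_m - alpha_n))."""
    n = len(F)
    return [[F[m][k] * cmath.exp(1j * (alpha[m] - alpha[k])) for k in range(n)]
            for m in range(n)]

def matmul(A, B):
    """Matrix multiplication law: (AB)_mn = sum_k A_mk B_kn."""
    n = len(A)
    return [[sum(A[m][k] * B[k][j] for k in range(n)) for j in range(n)]
            for m in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Assumed example data: two Hermitian 3x3 matrices and three real phases.
F = [[1.0, 2 + 1j, 0.5j],
     [2 - 1j, -3.0, 4.0],
     [-0.5j, 4.0, 0.25]]
G = [[0.0, 1j, 2.0],
     [-1j, 1.5, 1 - 1j],
     [2.0, 1 + 1j, -2.0]]
alpha = [0.3, -1.1, 2.7]

Fp, Gp = rephase(F, alpha), rephase(G, alpha)

# Diagonal elements (probable values) are unchanged.
diag_same = all(abs(Fp[n][n] - F[n][n]) < 1e-12 for n in range(3))
# Squared moduli (transition probabilities) are unchanged.
moduli_same = all(abs(abs(Fp[m][n]) ** 2 - abs(F[m][n]) ** 2) < 1e-12
                  for m in range(3) for n in range(3))
# The product of rephased matrices is the rephased product.
product_covariant = close(matmul(Fp, Gp), rephase(matmul(F, G), alpha))

print(diag_same, moduli_same, product_covariant)
```

The third check is the essential one: because the intermediate phases $\alpha_k$ cancel in every sum over $k$, a set of matrices satisfying (78) continues to satisfy it after rephasing.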
In the general case of motion in three dimensions, the condition that
the energy matrix should be diagonal (together with the commutation
relations (78), etc.) is not always sufficient for a physically unambiguous
determination of the matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and it has then to be
supplemented by a similar condition for one or two other matrices
representing quantities which are constants of the motion, for instance,
the $z$-component and the square of the angular momentum for motion
in a central field of force. Such additional conditions are necessary in
the case of degeneracy, the existence of which is revealed in the matrix
theory by the identity of several (diagonal) elements of the energy
matrix. The matrices representing constants of the motion must of
course, irrespective of the presence or absence of degeneracy, commute
with the energy matrix, i.e. satisfy the relation
$$(HF)_{mn} = (FH)_{mn},$$
which corresponds to the operator relation $HF = FH$. The multiplication
law (76), together with the condition that $H$ is a diagonal matrix
($H_{mn} = W_m\delta_{mn}$), gives
$$(HF)_{mn} = \sum_k H_{mk}F_{kn} = W_mF_{mn},$$
$$(FH)_{mn} = \sum_k F_{mk}H_{kn} = W_nF_{mn}.$$
The condition that $F$ is a constant of the motion therefore reduces to
$$(W_m - W_n)F_{mn} = 0,$$
which means that $F_{mn} = F^0_{mn}$,
i.e. that the matrix elements of $F$ vanish for all states except those
which correspond to the same value of the energy. Therefore, if there
is no degeneracy, the constants of the motion must be represented by
diagonal matrices. If there is degeneracy they may but need not
necessarily have a diagonal form.
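A small numerical illustration of this result may be helpful. In the Python sketch below the energy values `W` (with one doubly degenerate level) and the two trial matrices are assumed purely for the sake of the example. The sketch checks the identity $(HF-FH)_{mn} = (W_m-W_n)F_{mn}$ and shows that a matrix commuting with $H$ may still have non-diagonal elements, provided they connect states of equal energy.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

# Assumed energies: states 1 and 2 are degenerate.
W = [1.0, 2.0, 2.0]
H = [[W[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

# Non-diagonal only inside the degenerate block: a constant of the motion.
F_const = [[5.0, 0.0, 0.0],
           [0.0, 1.0, 3.0],
           [0.0, 3.0, -1.0]]
# Connects states of different energy: not a constant of the motion.
F_other = [[5.0, 1.0, 0.0],
           [1.0, 1.0, 3.0],
           [0.0, 3.0, -1.0]]

def satisfies_identity(F):
    """Check (HF - FH)_mn == (W_m - W_n) F_mn element by element."""
    C = commutator(H, F)
    return all(abs(C[m][n] - (W[m] - W[n]) * F[m][n]) < 1e-12
               for m in range(3) for n in range(3))

def is_constant_of_motion(F):
    C = commutator(H, F)
    return all(abs(C[m][n]) < 1e-12 for m in range(3) for n in range(3))

print(satisfies_identity(F_const), satisfies_identity(F_other))
print(is_constant_of_motion(F_const), is_constant_of_motion(F_other))
```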
The preceding result has already been obtained in a somewhat
different manner [cf. (77 d)]. It should be remarked that a function
$f(F)$ of a diagonal matrix is itself a diagonal matrix, the elements of
which are equal to the same function of the corresponding elements
of the argument matrix
$$[f(F)]_{nn} = f(F_{nn}).$$
This follows from the fact that the characteristic values of an operator
$f(F)$ must be equal to the same function of the characteristic values
of $F$. This result has already been stated when discussing the energy
operator (§ 7). It can be obtained directly from the matrix multiplication
law which gives, when $F$ is a diagonal matrix,
$$(F^2)_{mn} = \sum_k F_{mk}F_{kn} = F_{mm}F_{mn} = F_{nn}^2\,\delta_{mn},$$
$$(F^3)_{mn} = \sum_k (F^2)_{mk}F_{kn} = F_{nn}^3\,\delta_{mn}, \ \text{etc.},$$
so that, if $f(F)$ can be expanded in the form $\sum_k a_kF^k$ where $a_k$ are
numerical coefficients, we have $\bigl[\sum_k a_kF^k\bigr]_{mn} = \sum_k a_kF_{nn}^k\,\delta_{mn}$.
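The statement about functions of a diagonal matrix can be verified for a polynomial. The Python sketch below is an illustration under assumed values: the coefficients `a` and the diagonal entries `diag` are arbitrary. It builds $\sum_k a_kF^k$ by repeated matrix multiplication and compares the result with the function applied to each diagonal element.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(a, F):
    """Return sum_k a[k] * F**k using the matrix multiplication law."""
    n = len(F)
    result = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for coeff in a:
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * power[i][j]
        power = matmul(power, F)
    return result

# Assumed example: f(u) = 1 - 2u + 0.5u^2 and a diagonal matrix F.
a = [1.0, -2.0, 0.5]
diag = [3.0, -1.0, 4.0]
F = [[diag[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

def f(u):
    return sum(c * u ** k for k, c in enumerate(a))

fF = poly_of_matrix(a, F)
print([fF[n][n] for n in range(3)])
```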
As has been pointed out at the beginning of this section, matrices
representing real physical quantities must satisfy the Hermitian
condition. The products of two such matrices $F$ and $G$ (unless they
commute with each other) $FG$ and $GF$ cannot therefore represent a real
physical quantity. Representation of real physical quantities can be
obtained, however, by taking the sum of the two products, or their
difference multiplied by $i$. In the first case we get, on dividing by 2,
the ‘symmetrized’ representation $\tfrac12(FG + GF)$ of the classical product
of the corresponding quantities. In the second case we get, with the
additional factor $2\pi/h$, the bracket expression $[F, G]$ which has been
already considered in § 8 and which corresponds to the Poisson-bracket
expression of the classical theory.
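As a concrete check of these statements, the following Python sketch takes two Hermitian matrices which do not commute (assumed 2x2 examples, chosen only for illustration) and tests the Hermitian condition for $FG$, for $\tfrac12(FG+GF)$, and for $i(FG-GF)$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_hermitian(A, tol=1e-12):
    """Hermitian condition: A_mn equals the complex conjugate of A_nm."""
    n = len(A)
    return all(abs(A[i][j] - complex(A[j][i]).conjugate()) < tol
               for i in range(n) for j in range(n))

# Assumed example: two Hermitian matrices which do not commute.
F = [[1, 1j], [-1j, 2]]
G = [[0, 1], [1, 3]]

FG, GF = matmul(F, G), matmul(G, F)
sym = [[0.5 * (FG[i][j] + GF[i][j]) for j in range(2)] for i in range(2)]
anti = [[1j * (FG[i][j] - GF[i][j]) for j in range(2)] for i in range(2)]

print(is_hermitian(F), is_hermitian(G))   # both inputs are Hermitian
print(is_hermitian(FG))                   # the plain product is not
print(is_hermitian(sym), is_hermitian(anti))
```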
12. The Correspondence between Matrix and Classical Mechanics
The matrix representation of physical quantities was introduced by
W. Heisenberg towards the end of 1925. A few months later
Schrödinger’s wave-mechanical theory appeared, but nevertheless
Heisenberg, Born, and Jordan continued, for some time during 1926, to
develop their ‘matrix theory’, without seeing any connexion between
it and the ‘wave theory’. The connexion was finally discovered by
Schrödinger (and independently by Pauli) who found that the
Heisenberg-Born-Jordan matrix elements could be calculated from the wave
functions by means of the formula $F^0_{mn} = \int \psi^{0*}_m F \psi^0_n\,dV$. This little bit
of history serves to illustrate the fact that the matrix theory does not
need a wave-mechanical support, but can be made completely
‘self-supporting’. We shall see later that the connexion between the wave
theory and the matrix theory can actually be reversed in the sense that
the matrix theory, in a generalized form due to Dirac and Jordan,
contains the wave-mechanical theory as a particular case (§ 14).
In his formulation of the matrix theory, Heisenberg was guided by
Bohr’s ideas concerning the correspondence between the quantum and
the classical description of the phenomena of radiation. In ‘the good
old days’ before the coming of the quantum theory, atomic phenomena,
and in particular those connected with the emission or absorption of
radiation, were described in terms of a steady motion of the electrons.
To this idea of steady (or continuous) motion, Bohr added the idea of
transitions from one state of motion to another. In this way, between
the years 1913 and 1925, physicists gradually became accustomed to
considering two types of mechanical quantities—classical and quantum-
mechanical. On the one hand we had, for example, the classical
frequencies or amplitudes referring to the steady motion (analysed by
means of a Fourier series into a sum of harmonic vibrations), while
on the other hand we had the quantum frequencies or amplitudes
referring to the transitions.
By means of his ‘correspondence principle’, Bohr was able, in 1918,
to establish an approximate relationship between the classical and the
quantum-mechanical quantities. Advancing still further along the path
laid down by Bohr, Heisenberg rejected the classical quantities
altogether, as devoid of physical meaning, and devised the matrix scheme
(improved a little later by Born and Jordan) for the direct calculation
of the quantum-mechanical quantities.
The correspondence principle can be explained in the simplest way
for a one-dimensional motion, restricted classically to a finite region,
e.g. lying between $x'$ and $x''$, and therefore periodic. The coordinate $x$
of the particle can then be described classically as a periodic function
of the time and expanded in a Fourier series of the form
$$x(t) = \sum_{k=-\infty}^{+\infty} x^0(k)\,e^{i2\pi k\nu t}, \tag{79}$$
where $\nu = 1/\tau$ is the fundamental frequency of oscillation ($\tau$ is the
period of oscillation, i.e. the duration of the ‘round trip’ from $x'$ to $x''$
and back again to $x'$), and $x^0(k)$ is the amplitude of the $k$th harmonic
term having a frequency $k\nu$. The two complex terms with the
frequencies $+k\nu$ and $-k\nu$ must, of course, combine to form a real term
of the type
$$a_{|k|}\cos 2\pi|k|\nu t + b_{|k|}\sin 2\pi|k|\nu t;$$
it follows that the amplitudes $x^0(+k)$ and $x^0(-k)$ must be conjugate
complex quantities
$$x^0(-k) = x^0(+k)^*, \tag{79 a}$$
giving $a_{|k|} = x^0(k) + x^0(k)^*$, $b_{|k|} = i[x^0(k) - x^0(k)^*]$.
Bohr’s theory, in so far as it was concerned with steady motions,
restricted these motions by quantum conditions which, in the present
case, reduce to the single equation
$$J = \oint g\,dx = nh, \tag{80}$$
specifying the quantized values of the energy $W = W_n$ and hence
determining the fundamental frequencies $\nu = \nu_n$. Putting $g = \sqrt{2m(W-U)}$,
and differentiating the integral
$$J = \oint \sqrt{2m(W-U)}\,dx$$
with respect to $W$ (considered as a parameter), we get
$$\frac{dJ}{dW} = \oint \frac{dx}{\sqrt{2(W-U)/m}} = \oint \frac{m\,dx}{g} = \oint \frac{dx}{v} = \oint dt = \tau,$$
or
$$\nu = \frac{dW}{dJ}. \tag{80 a}$$
This relation is a special case of the general relations between the
energy, the fundamental frequencies $\nu_1$, $\nu_2$, $\nu_3$, and the fundamental
moduli of periodicity $J_1$, $J_2$, $J_3$ of the action function $S$ which were
deduced, in an earlier chapter, for motion in three dimensions, with
the help of the theory of canonical transformations (Chap. I, § 5).
Although the ‘classical’ frequency $\nu$ given by (80 a) refers to a steady
motion, nevertheless it is expressed as the ratio of the differences of
$W$ and $J$ for two different, though closely neighbouring, motions, as if
it were associated with a transition between them. In fact the relation
(80 a) bears a striking resemblance to Bohr’s frequency condition
$$\nu_{mn} = \frac{W_m - W_n}{h},$$
which gives the quantum frequency associated with a transition between
two more or less widely different quantized states $m$ and $n$.
Introducing the quantized values of the integral $J$, we can rewrite the
preceding equation in the form
$$\nu_{mn} = (m-n)\,\frac{W_m - W_n}{J_m - J_n} = (m-n)\,\frac{\Delta W}{\Delta J}. \tag{80 b}$$
If $W$ varies slowly with $J$, and if the quantum jump $m-n$ is not too
large compared with $m$ or $n$, then the difference ratio $\Delta W/\Delta J$ can be
replaced approximately by the differential coefficient $dW/dJ$. From
(80 a) we then get the following approximate relation between the
classical and the quantum frequencies:
$$\nu_{mn} \cong (m-n)\,\nu. \tag{80 c}$$
We may regard this relation as indicating an approximate coincidence
or a ‘correspondence’ between the quantum frequency associated with
a $k$-fold jump and the classical frequency of the harmonic oscillation
of the order $k$ ($k = m-n$).
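The quality of the approximation (80 c) can be illustrated numerically. The Python sketch below uses a particle moving freely between two reflecting walls a distance $L$ apart; this example is an assumption made here for illustration and is not treated in the text. For it the quantum condition (80) gives $2Lg = nh$, so that $W = J^2/(8mL^2)$ and, by (80 a), $\nu = J/(4mL^2)$. Units with $h = m = L = 1$ are assumed.

```python
def W(n):
    """Quantized energy W_n = n^2 h^2 / (8 m L^2), in units h = m = L = 1."""
    return n * n / 8.0

def classical_nu(n):
    """Classical frequency nu = dW/dJ evaluated at J = n h, i.e. n / 4."""
    return n / 4.0

def quantum_nu(m, n):
    """Bohr's frequency condition nu_mn = (W_m - W_n) / h with h = 1."""
    return W(m) - W(n)

def ratio(n, k=1):
    """Quantum frequency of a k-fold jump divided by k times the classical one."""
    return quantum_nu(n + k, n) / (k * classical_nu(n))

ratios = [ratio(n) for n in (1, 10, 1000)]
print(ratios)
```

For this system the ratio is $(m+n)/2n$, so it tends to unity when the jump $m-n$ is small compared with $n$, which is the condition stated above.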
This correspondence between the classical and the quantum
frequencies forms the nucleus of Bohr’s correspondence principle. The
principle is extended by asserting that, in addition to this
correspondence between the frequencies, there is also a correspondence between
the amplitudes.
Let us denote the functions $x(t)$ for the $n$th stationary state by $x_n(t)$
and the expansion coefficients $x^0(k)$ by $x^0_n(k)$. Formula (79) then
becomes
$$x_n(t) = \sum_{k=-\infty}^{+\infty} x^0_n(k)\,e^{i2\pi k\nu t}. \tag{81}$$
Writing $m-n$ instead of $k$ and putting
$$x^0_n(m-n) = x^0_{mn}, \tag{81 a}$$
formula (81) becomes
$$x_n(t) = \sum_{m=-\infty}^{+\infty} x^0_{mn}\,e^{i2\pi(m-n)\nu t}. \tag{81 b}$$
Now if the classical frequency $(m-n)\nu$ corresponds to the quantum
frequency $\nu_{mn}$ of the light emitted by the system under consideration
(linear oscillator) as a result of the transition $m \to n$ (if $W_m > W_n$),
then the classical amplitude associated with this frequency must,
according to Bohr, correspond to the quantum amplitude of the emitted
light, the correspondence being such that the intensity of the emitted
light must coincide approximately with the intensity calculated
classically on the assumption that the motion of the particle (which is
supposed to possess an electric charge without which there would be
no radiation) is represented by the simple harmonic term
$$x_{mn} = x^0_{mn}\,e^{i2\pi(m-n)\nu t}.$$
The approximation with regard to intensity must be the closer the
closer the approximation with regard to frequency.
The ability of the correspondence principle to predict intensities has
been verified in those cases where there is actually a close
approximation between the classical and quantum frequencies. For example, it
was able to predict successfully the relative intensities of the
neighbouring lines appearing in the Stark effect. Nevertheless the nature of
the correspondence established by Bohr remained mysterious, until
Heisenberg, towards the end of 1925, unveiled it in a way worthy of
admiration both for its simplicity and for its boldness. Basing his theory
upon the principle that only those things have a real existence which
can be observed, Heisenberg put forward the idea that classical
quantities do not exist at all, since they do not produce any directly observed
optical effects. In fact the position and intensity of the observed
spectrum lines can only be expressed in terms of quantum or transition
quantities.
From this point of view, the classical method of describing the motion
of the particle by determining its coordinates for a given stationary
state $n$ as a certain function of the time $x_n(t)$, which could be expanded
in a Fourier series (81 b), was to be considered as an approximation
to the description of the motion by means of a double array or matrix
of components of the form
$$x_{mn} = x^0_{mn}\,e^{i2\pi\nu_{mn}t},$$
‘corresponding’ to the totality of the classical harmonic terms for
different values of $m$ and $n$ in the same sense in which an
approximation corresponds to the truth.
At this point two different possibilities for reforming classical
mechanics seemed to be open. The one consisted in assuming that the
motion of the particle in a stationary state $n$ can be described as a
definite function of the time, namely, by the series
$$x_n(t) = \sum_{m=-\infty}^{+\infty} x^0_{mn}\,e^{i2\pi\nu_{mn}t},$$
which should replace the simple Fourier series (81 b), and that the
equations of motion should be so modified as to lead to solutions of
this new type instead of solutions of the type (81 b).
The second possibility was to assume that the classical description
of motion, establishing a definite dependence of the position of the
particle upon the time, had to be abandoned and replaced by a quantum
description in which the coordinate $x$ was to be determined as a matrix,
made up of components of the type $x^0_{mn}\,e^{i2\pi\nu_{mn}t}$. In this case the
external form of the classical equations of motion could be maintained
and only their physical meaning altered, the variables $x$, $p_x$, $H$, etc.,
being regarded and determined not as ordinary quantities but as
matrices.
With an unerring intuition Heisenberg chose the second way, thus
giving up the very idea of motion in the classical sense (as being
fundamentally unobservable and therefore devoid of physical meaning) and
laying the foundation of the new quantum or matrix mechanics. The
idea that the quantum description of motion amounts to the
determination of quantities relating only to transitions between different
states requires an important amendment, for besides such components
a matrix contains diagonal components or elements relating to definite
states taken separately. As we know, these diagonal elements are equal
to the average or probable values of the quantity represented by the
matrix for the corresponding states. This result, which has already
been discussed in Chap. I, § 5, follows also from the preceding
considerations connected with the correspondence principle. The time-average
value of some quantity, e.g. $x$, as represented by a Fourier series (81), is
obviously equal to that term of this series which does not depend upon
the time, for which therefore $k = 0$. We thus have
$$\overline{x_n(t)} = x^0_n(0),$$
or, using the notation (81 a),
$$\overline{x_n(t)} = x^0_{nn}.$$
Having defined every physical quantity as a matrix, Heisenberg
naturally enough replaced the usual multiplication law for ordinary
numbers by the matrix multiplication law. In this he was guided by
the necessity of securing the form
$$F_{mn} = F^0_{mn}\,e^{i2\pi\nu_{mn}t}$$
with the same transition frequencies $\nu_{mn}$ for the matrix representing
any function $F(x)$ as those which appear in the matrix (82 a) for the
coordinate $x$. Taking, for instance, $F(x) = x^2$ and using the matrix
multiplication law, we get
$$(x^2)_{mn} = \sum_k x_{mk}x_{kn} = \Bigl(\sum_k x^0_{mk}x^0_{kn}\Bigr)e^{i2\pi\nu_{mn}t}$$
as a consequence of the relations $\nu_{mk} = (W_m-W_k)/h$, $\nu_{kn} = (W_k-W_n)/h$,
$\nu_{mn} = (W_m-W_n)/h = \nu_{mk}+\nu_{kn}$; cf. (76) and (76 b).
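The way in which the matrix multiplication law preserves the transition frequencies can be made concrete. In the Python sketch below the energy levels `W`, the Hermitian matrix of elements `x0`, the instant `t`, and the unit $h = 1$ are all assumed values chosen for illustration. The sketch forms the time-dependent components, multiplies them as matrices, and compares the result with the product of the elements multiplied by the single factor $e^{i2\pi\nu_{mn}t}$.

```python
import cmath

h = 1.0                       # assumed unit
W = [0.5, 1.7, 2.2]           # assumed energy levels
x0 = [[0.0, 1.0 + 0.5j, 0.2],
      [1.0 - 0.5j, 0.3, -0.7j],
      [0.2, 0.7j, -1.0]]      # assumed Hermitian matrix elements

def nu(m, n):
    """Transition frequency nu_mn = (W_m - W_n) / h."""
    return (W[m] - W[n]) / h

def components(F0, t):
    """Matrix components F_mn = F0_mn * exp(i 2 pi nu_mn t)."""
    n = len(F0)
    return [[F0[a][b] * cmath.exp(2j * cmath.pi * nu(a, b) * t)
             for b in range(n)] for a in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[a][k] * B[k][b] for k in range(n)) for b in range(n)]
            for a in range(n)]

t = 0.37
lhs = matmul(components(x0, t), components(x0, t))   # (x^2)_mn from components
rhs = components(matmul(x0, x0), t)                  # elements of x^2 times the phase
max_diff = max(abs(lhs[a][b] - rhs[a][b]) for a in range(3) for b in range(3))
print(max_diff)
```

The agreement rests entirely on the combination rule $\nu_{mk} + \nu_{kn} = \nu_{mn}$; with ordinary element-by-element multiplication the frequencies would not combine in this way.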
Having introduced matrices to represent physical quantities and the
matrix multiplication law for the calculation of matrices representing
functions of such quantities, Heisenberg kept unaltered the form of the
equation of the motion
$$m\,\frac{d^2x}{dt^2} = f(x),$$
understanding by $x$ and $f(x)$ not the usual variables but the
corresponding matrices, and put Bohr’s quantum condition $\oint g\,dx = nh$ in
the form
$$(gx - xg)_{nn} = \frac{h}{2\pi i},$$
leaving the question of the non-diagonal elements of the matrix open.
The commutation condition
$$gx - xg = \frac{h}{2\pi i},$$
which also fixes the non-diagonal elements of this matrix (as equal to
zero) was established by way of a generalization somewhat later by
Born and Jordan, and still later was recognized (by Schrödinger and
Eckart) as giving the key for the transition from matrix mechanics to
wave mechanics, this transition consisting essentially in considering $x$
as an ordinary variable and $g$ as the operator $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, and further in
replacing matrix equations by operator equations with the wave
function to be operated upon.
The information obtained from the wave-mechanical treatment of
a problem is more complete than that obtained from the
matrix-mechanical treatment, for in addition to the matrix elements we obtain,
in the former case, the wave functions which serve to determine the
probable location of the particle, its probable velocity, and so on. In
the matrix mechanics the notion of probability with reference to
separate states appears only through the diagonal elements,
representing probable values, while the non-diagonal elements can be interpreted
under certain conditions as the probability amplitudes for transitions
between different states. In Heisenberg’s original theory, the matrix
components of the coordinate were looked for as quantities which
determine the intensity of radiation or, what amounts to the same
thing, the probability of transitions with emission of light, it being
assumed that the intensity of radiation associated with the matrix
component $x_{mn} = x^0_{mn}\,e^{i2\pi\nu_{mn}t}$ is the same as it would be on the classical
theory if $x_{mn}$ represented the actual motion of the particle as a harmonic
function of the time. The result of this assumption is the same as that
obtained in Part I in connexion with Schrödinger’s theory of radiation,
namely, that the probability of a spontaneous transition $m \to n$ with
emission of energy in the form of monochromatic light of the frequency
$\nu_{mn}$ is equal (per unit time) to
$$A_{mn} = \frac{64\pi^4 e^2\nu_{mn}^3}{3hc^3}\,|x^0_{mn}|^2,$$
where e is the electrical charge of the particle [Part I, eq. (93)].
In the preceding sketch of the development of Heisenberg’s matrix
theory from Bohr’s correspondence principle we did not attempt to give
a direct proof of the latter so far as it refers to the connexion between
the Fourier amplitudes and the matrix elements, having confined
ourselves to the frequencies with respect to which the correspondence
could be established by means of Bohr’s own theory. This gap can be
filled with the help of wave mechanics, or rather that approximate form
of it which has been discussed in Chap. I, § 5, and which corresponds
to the classical mechanics together with Bohr’s quantum conditions.
We have already used this approximate form of the theory for
comparing the classical time-averages (which are equal to the constant term
in the Fourier expansion of the corresponding quantity $F$ considered
as a function of the time) with its probable values, defined by the
integrals $\int \psi^*_n F \psi_n\,dx$, which are nothing else but the diagonal elements
$F_{nn} = F^0_{nn}$ of the matrix representing $F$. We have found that to the
approximation implied by the formula (23 a), § 4,
$$\psi_n = \frac{c_n}{\sqrt{v_n}}\,e^{i2\pi s_n(x,t)/h}, \tag{82}$$
where $v_n$ is the velocity of the particle (defined by the equation
$v_n = \sqrt{2(W_n-U)/m}$ as a function of its position $x$) and
$$s_n(x,t) = s^0_n(x) - W_nt$$
the classical action function for the state in question (with the energy
$W_n$), the classical time-average $\dfrac{1}{\tau}\displaystyle\int_0^\tau F(t)\,dt$ coincides with the probable
value $\displaystyle\int_{x'}^{x''} F\,\psi^*_n\psi_n\,dx$ provided $\psi_n$ is normalized to unity, that is, the
coefficients $c_n$ are set equal to $\sqrt{2/\tau}$:
$$\int_{x'}^{x''}|\psi_n|^2\,dx = |c_n|^2\int_{x'}^{x''}\frac{dx}{v_n} = \tfrac12|c_n|^2\,\tau = 1.$$
In a similar way it is possible to ascertain the approximate equality
between the Fourier coefficients in the expansion of $x(t)$, or any function
of $x$ supposed to be determined as a function of $t$ according to the
classical laws of motion, and the ‘corresponding’ matrix elements of
this function $F(x)$.
In order to determine the Fourier coefficient $x^0(n)$ in the expansion
(79) we multiply $x(t)$ by $e^{-i2\pi n\nu t}$ and notice that the constant term in
the resulting expansion is just $x^0(n)$. We thus get
$$x^0(n) = \frac{1}{\tau}\int_0^\tau x(t)\,e^{-i2\pi n\nu t}\,dt,$$
or, in the alternative notation corresponding to (81 b),
$$x^0_{mn} = \frac{1}{\tau}\int_0^\tau x(t)\,e^{-i2\pi(m-n)\nu t}\,dt.$$
The coordinate $x$ can be replaced here, as just mentioned, by any
function of $x$ (or of $x$ and $g$) giving
$$F^0_{mn} = \frac{1}{\tau}\int_0^\tau F(t)\,e^{-i2\pi(m-n)\nu t}\,dt. \tag{82 a}$$
On the other hand, we have by the definition of the matrix elements
$$F^0_{mn} = \int_{x'}^{x''}\psi^{0*}_m F\,\psi^0_n\,dx,$$
or, according to (82), with $s(x,t) = s^0(x) - Wt$, $\psi^0_n = \sqrt{2/(\tau v_n)}\;e^{i2\pi s^0_n(x)/h}$,
$$F^0_{mn} = \frac{2}{\tau}\int_{x'}^{x''}F(x)\,e^{i2\pi[s^0_n(x)-s^0_m(x)]/h}\,\frac{dx}{\sqrt{v_nv_m}}. \tag{82 b}$$
Now if the states $n$ and $m$ differ but little with respect to their
energy, we can replace $\sqrt{v_nv_m}$ by a certain mean value of the velocity
for an energy $W$ lying between $W_n$ and $W_m$, and put accordingly
$dx/\sqrt{v_nv_m} = dt$ just as in the case $n = m$. We have further under the
same condition
$$s^0_n(x) - s^0_m(x) = \frac{\partial s^0(x)}{\partial J}\,(J_n - J_m),$$
where $J$ is the action variable (80) (introduced in Chap. II, § 5, for the
general case of a three-dimensional motion), and $J_n = nh$, $J_m = mh$
its quantized values. In the case here considered of a one-dimensional
motion the function $s^0(x)$ can be readily determined, from the equation
$g = ds^0(x)/dx$ defining it, by the formula
$$s^0(x) = \int g\,dx = \int\sqrt{2m(W-U)}\,dx,$$
whence it follows [cf. the derivation of (80 a)] that
$$\frac{\partial s^0(x)}{\partial W} = \int\frac{dx}{\sqrt{2(W-U)/m}} = \int\frac{m\,dx}{g} = t + \text{const.},$$
and consequently (dropping the irrelevant constant)
$$\Bigl(\frac{\partial s^0}{\partial J}\Bigr)_{x=\text{const.}} = \frac{\partial s^0}{\partial W}\,\frac{dW}{dJ} = t\,\frac{dW}{dJ}.$$
We thus get with the above approximation
$$s^0_n(x) - s^0_m(x) = t\,\frac{dW}{dJ}\,(J_n - J_m),$$
or, since with the same approximation $(J_n - J_m)\,dW/dJ = W_n - W_m$,
$$s^0_n(x) - s^0_m(x) = (W_n - W_m)\,t. \tag{82 c}$$
This gives, on substitution in (82 b),
$$F^0_{mn} = \frac{2}{\tau}\int_0^{\tau/2}F(t)\,e^{-i2\pi(W_m-W_n)t/h}\,dt,$$
which coincides with (82 a) when we remember that
$$(m-n)\,\nu \cong (W_m - W_n)/h.$$
The preceding results can easily be extended to the general case of
the motion of a particle with three degrees of freedom in a limited
region of space. According to classical mechanics such a motion can
be described under certain very general assumptions as a ‘conditionally
periodic’ motion, which means that the coordinates, or any function $F$
of the latter, can be represented as a function of the time by a triple
Fourier series with three different (incommensurable) fundamental
frequencies $\nu_1$, $\nu_2$, $\nu_3$:
$$F_{n_1n_2n_3}(t) = \sum_{m_1m_2m_3} F^0_{m_1m_2m_3;\,n_1n_2n_3}\,e^{i2\pi[(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3]t},$$
the coefficients $F^0$ being determined by the formula
$$F^0_{m_1m_2m_3;\,n_1n_2n_3} = \lim_{T\to\infty}\frac{1}{T}\int_0^T F_{n_1n_2n_3}(t)\,e^{-i2\pi[(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3]t}\,dt.$$
According to wave mechanics, a series of this kind, as a whole, will
have no (or at least no exact) significance; the totality of the harmonic
terms in all such series, corresponding to all possible states $n_1$, $n_2$, $n_3$,
will, however, constitute an approximate expression of the matrix
representing the quantity $F$. The exact expression of its matrix
components can be obtained if we replace the classical frequencies
$(m_1-n_1)\nu_1 + (m_2-n_2)\nu_2 + (m_3-n_3)\nu_3$ by the transition frequencies
$(W_{m_1m_2m_3} - W_{n_1n_2n_3})/h$ and define the amplitudes $F^0_{m_1m_2m_3;\,n_1n_2n_3}$ by the
integrals $\int \psi^{0*}_{m_1m_2m_3} F\,\psi^0_{n_1n_2n_3}\,dV$. The approximate equivalence of this
definition to the classical one given above can be shown with the help
of equations (32), (32 a), and (32 b) of § 5 in exactly the same way
as before.
One might be tempted to think that it would be possible to give a
correct wave-mechanical definition of the quantity $F$ as a function of the
time by replacing the classical amplitudes and frequencies in the
preceding expression for $F_{n_1n_2n_3}(t)$ by the quantum ones, i.e. by putting
$$F_{n_1n_2n_3}(t) = \sum_{m_1m_2m_3} F^0_{m_1m_2m_3;\,n_1n_2n_3}\,e^{i2\pi(W_{m_1m_2m_3}-W_{n_1n_2n_3})t/h}.$$
The fact that no physical significance can be attached to this ‘modified’
Fourier series is, however, clearly illustrated by the possibility of
multiplying the functions $\psi^0_{n_1n_2n_3}$ by arbitrary phase factors $e^{-i\alpha_{n_1n_2n_3}}$,
resulting in the multiplication of the matrix elements by the phase
factors $e^{i(\alpha_{m_1m_2m_3}-\alpha_{n_1n_2n_3})}$, which are completely irrelevant from the point
of view of the wave-mechanical or the matrix theory, but profoundly
influence the ‘modified’ definition of the function $F_{n_1n_2n_3}(t)$.
13. Application of the Matrix Method to Oscillatory and Rotational Motion
The matrix mechanics of Heisenberg, Born, and Jordan can be
considered as a kind of ‘skeleton’ of Schrödinger’s wave mechanics,
complete in itself but nevertheless deprived of the flesh and blood of
the probability conception, which forms the vital element of wave
mechanics. In addition, the wave-mechanical theory has another
advantage over the matrix theory, for, as a rule, it is easier to solve
Schrödinger’s equation for the characteristic functions of the energy
operator and then to use these functions to calculate the matrix
elements of any other operator by means of integration, than to determine
these matrix elements from the condition that the matrix of the energy
is diagonal, together with the commutation relations for the coordinates
and momentum components, without knowing or using the
characteristic functions at all.
The practical application of the matrix theory to concrete problems
can, however, be made much easier and more convenient if instead of
carrying out the matrix representation directly with respect to the
fundamental operator relations $p_xx - xp_x = \dfrac{h}{2\pi i}\,1$, etc., together with
the condition that $H(x,y,z;\,p_x,p_y,p_z)$ is diagonal, it is carried out with
respect to some other operator relations between certain more
complicated functions $F$, $G$, etc., the choice of which depends upon the
character of the problem [i.e. on the potential-energy function $U(x,y,z)$]
if at least some of these functions commute with the energy, i.e.
represent constants of the motion. If $G$ is such a constant (it may, in
particular, coincide with the energy $H$), and if some other function
$F$ (for instance, the coordinate $x$) has been found which satisfies a
commutation relation of the form $GF - FG = \alpha F + \beta G$, where $\alpha$ and $\beta$
are constants, the matrix interpretation leads very simply to the
determination of the matrix elements both of $G$, which can be assumed to
be diagonal, and of $F$. Applying the matrix multiplication rule to the
left side of the preceding equation, we get
$$(GF - FG)_{mn} = (G_{mm} - G_{nn})F_{mn} = \alpha F_{mn} + \beta G_{nn}\delta_{mn},$$
whence it follows that all the matrix elements of $F$ vanish with the
exception of the diagonal elements, which are equal to
$$F_{nn} = -\frac{\beta}{\alpha}\,G_{nn},$$
and those for which
$$G_{mm} - G_{nn} = \alpha.$$
This equation leads very simply to the determination of the numbers
$G_{nn}$, especially when $n$ can be treated as a simple quantum number
(and not as a set of several quantum numbers $n_1$, $n_2$, $n_3$, all of them
different from the numbers $m_1$, $m_2$, $m_3$ represented by $m$). By a suitable
labelling of the states associated with given values of $G$, we can make
those states for which the values of $G$ differ by $\alpha$ successive, i.e. having
values of $n$ and $m$ differing by 1, so that the preceding equation will
reduce to $G_{n+1,n+1} - G_{nn} = \alpha$. The solution of this equation is obviously
of the form $G_{nn} = \alpha n + \gamma$, where $\gamma$ is a certain constant. We shall not
develop these general considerations but shall merely illustrate and
amplify them by means of two special problems of outstanding
simplicity and practical importance, namely, the problem of a linear
harmonic oscillator and the problem of the rotational part of the motion
of a particle in a central (radially symmetrical) field of force.
The energy of a linear harmonic oscillator is expressed by the operator
or matrix (as we please)
$$H = \frac{1}{2m}\,p^2 + \tfrac12 m(2\pi\nu_0)^2x^2, \tag{83}$$
where $\nu_0$ is the natural vibration frequency of the classical theory.
According to the matrix theory $H$ has to be ‘diagonalized’ subject to
the additional condition
$$px - xp = \frac{h}{2\pi i}\,1, \tag{83 a}$$
1 being the unit matrix.
We shall put, for the sake of brevity,
$$2\pi\nu_0mx = q, \qquad 2mH = K, \qquad h\nu_0m = \omega,$$
so that (83) and (83 a) can be written in the form
$$p^2 + q^2 = K, \qquad pq - qp = -i\omega, \tag{83 b}$$
it being understood that $\omega$ denotes the product of the factor $h\nu_0m$ and
the unit matrix.
We shall now introduce the matrices
$$r = p + iq \quad\text{and}\quad s = p - iq, \tag{84}$$
which are more convenient to deal with than $p$ and $q$ taken separately.
Taking their product in the order $rs$, we get
$$rs = pp + iqp - ipq + qq = p^2 + q^2 - i(pq - qp),$$
i.e.
$$rs = K - \omega. \tag{84 a}$$
Similarly we get
$$sr = K + \omega. \tag{84 b}$$
Hence, using the associative law,
$$rsr = (rs)r = (K - \omega)r,$$
$$rsr = r(sr) = r(K + \omega),$$
i.e. putting $K - \omega = L$,
$$Lr - rL = 2r\omega. \tag{85}$$
Now since $K$ and $\omega$, and consequently $L$, are diagonal matrices, we have
$$(Lr - rL)_{mn} = (L_{mm} - L_{nn})\,r_{mn}$$
and $(r\omega)_{mn} = r_{mn}\omega$, where $\omega$ denotes now not the matrix but simply
the number $h\nu_0m$, so that the preceding equation can be written in
the form
$$(L_{mm} - L_{nn} - 2\omega)\,r_{mn} = 0. \tag{85 a}$$
Thus either $r_{mn} = 0$, or $L_{mm} - L_{nn} = 2\omega$. In the same way we get
$$srs = (K + \omega)s = s(K - \omega),$$
$$Ls - sL = -2s\omega, \tag{85 b}$$
so that either $s_{mn} = 0$ or $L_{mm} - L_{nn} = -2\omega$. Now
$$L_{mm} - L_{nn} = K_{mm} - K_{nn} = 2m(H_{mm} - H_{nn}) = 2m(W_m - W_n)$$
is the difference of the energy-levels for the states $m$ and $n$ multiplied
by $2m$ ($m$ being here the mass and not the label number of the state!). We
thus see that the energy-levels must form an arithmetical progression
with the difference $2\omega/2m = h\nu_0$, so that we can put
$$W_n = nh\nu_0 + \text{const.} \tag{86}$$
With this labelling of the stationary states we must have
$$r_{mn} = 0, \ \text{unless } m = n+1; \qquad s_{mn} = 0, \ \text{unless } m = n-1. \tag{86 a}$$
The value of the constant in the expression for Wn can be obtained
from the condition that the lowest value of Lnn must be equal to zero.
This condition follows from the equation
$$(rs)_{nn} = r_{n,n-1}\,s_{n-1,n} = L_{nn}$$
in conjunction with the fact that $K_{nn}$ cannot assume negative values
because the matrix K represents an essentially positive or rather
non-negative quantity, namely $p^2 + q^2$ (with p and q both real). Hence
we conclude that the series of stationary states must terminate with
some state $n_{\min}$ which we can obviously label as n = 0. The matrix
elements $r_{n,n-1}$ and $s_{n-1,n}$ must obviously vanish for n = 0, since the
states n = -1 do not exist, whence it follows that $L_{00} = 0$, or $K_{00} = \omega$,
and consequently $H_{00} = W_0 = \tfrac{1}{2}h\nu_0$, that is,
$$W_n = h\nu_0\left(n + \tfrac{1}{2}\right) \tag{86b}$$
in agreement with the result obtained in Part I, § 13, by means of the
wave-mechanical treatment of the problem of the linear oscillator.
Further, for n > 0 we get
$$r_{n,n-1}\,s_{n-1,n} = L_{nn} = 2m\,h\nu_0\,n. \tag{87}$$
Now from the definition of r and s according to (84), or
$$r_{n,n-1} = p_{n,n-1} + i\,q_{n,n-1}, \qquad s_{n-1,n} = p_{n-1,n} - i\,q_{n-1,n},$$
110 M ATR IC ES §13
together with the Hermitian character of the matrices p and q (which
expresses the reality of the quantities represented by them), it follows
that
$$s_{n-1,n} = r^{*}_{n,n-1}. \tag{87a}$$
We thus have
$$|r_{n,n-1}| = |s_{n-1,n}| = \sqrt{2m\,h\nu_0\,n}. \tag{87b}$$
Coming back from r and s to p and q, we have $p = \tfrac{1}{2}(r + s)$,
$q = -\tfrac{i}{2}(r - s)$, and consequently
$$p_{n,n-1} = \tfrac{1}{2}\,r_{n,n-1}, \quad p_{n-1,n} = \tfrac{1}{2}\,s_{n-1,n}, \quad q_{n,n-1} = -\tfrac{i}{2}\,r_{n,n-1}, \quad q_{n-1,n} = \tfrac{i}{2}\,s_{n-1,n}, \tag{88}$$
all the other matrix elements $p_{mn}$ and $q_{mn}$ vanishing.
We thus get
$$|p_{n,n-1}| = |q_{n,n-1}| = \sqrt{\tfrac{1}{2}\,m\,n\,h\nu_0} \tag{88a}$$
and, returning to the original coordinate, $x = q/(2\pi\nu_0 m)$,
$$|x_{n,n-1}| = \sqrt{\frac{n h}{8\pi^2 \nu_0 m}} = \frac{|p_{n,n-1}|}{2\pi\nu_0 m}. \tag{88b}$$
The latter relation between x and p can be obtained directly from the
equation $p = m\,dx/dt$, which gives
$$p_{nk} = m\,2\pi i\,\nu_{nk}\,x_{nk},$$
i.e. since $\nu_{nk} = (W_n - W_k)/h = (n - k)\nu_0$,
$$p_{n,n-1} = 2\pi i\,m\nu_0\,x_{n,n-1}.$$
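The algebra of the preceding pages can be checked numerically on truncated matrices. What follows is only a minimal sketch in Python with NumPy, not part of the original treatment: the name `oscillator_matrices`, the choice of real positive phases for r, and the units in which $\omega = h\nu_0 m = 1$ are assumptions made for illustration. Since the infinite matrices are cut off at a finite size, the last diagonal element of every product is spoiled by the truncation and is left out of the comparison.

```python
import numpy as np

def oscillator_matrices(size, omega=1.0):
    """Truncated r, s, p, q of the text, with omega = h*nu0*m (hypothetical helper)."""
    n = np.arange(1, size)
    # Only r[n, n-1] differs from zero, with |r[n, n-1]| = sqrt(2*omega*n), cf. (87b).
    # Real positive phases are an assumption; the text leaves them open.
    r = np.diag(np.sqrt(2.0 * omega * n), k=-1).astype(complex)
    s = r.conj().T                      # (87a): s[n-1, n] = conj(r[n, n-1])
    p = 0.5 * (r + s)
    q = -0.5j * (r - s)
    return r, s, p, q

r, s, p, q = oscillator_matrices(8)     # omega = 1
K = p @ p + q @ q                       # should be diagonal, K_nn = 2*omega*(n + 1/2)
commutator = p @ q - q @ p              # should equal -i*omega on the diagonal
levels = np.diag(K).real[:-1] / 2.0     # W_n in units of h*nu0 (last element dropped)
```

With these conventions `levels` reproduces the values n + 1/2 of (86b), and the diagonal of `commutator` (again without its last element) reproduces (83b).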
The derivation of the formulae (88) and (88 a) by the purely
wave-mechanical method, i.e. through evaluation of the integrals
$$x_{mn} = \int_{-\infty}^{+\infty} \psi_m^{*}\,x\,\psi_n\,dx \quad\text{and}\quad p_{mn} = \int_{-\infty}^{+\infty} \psi_m^{*}\,\frac{h}{2\pi i}\frac{\partial\psi_n}{\partial x}\,dx,$$
where $\psi_m$ and $\psi_n$ are the normalized characteristic functions of the
harmonic oscillator, would require a much larger amount of more
complicated calculation.
In the case of the hydrogen-like atom, the wave-mechanical method,
on the contrary, proves much more simple and convenient than the
matrix method for the determination of the energy values and the
matrix components. The matrix method can, however, be applied with
advantage in this case, as well as in the general case of the motion
of a particle in any central field of force, for the determination of
quantities which wave-mechanically depend upon the angular part of
the wave functions only [i.e. on the spherical harmonic functions
$Y_{lm}(\theta, \phi)$].
Here belong in the first place the components of the angular momentum
$M_x$, $M_y$, $M_z$, or rather their matrix elements with regard to states
differing from each other by the values of the axial quantum number
m (or also of the angular quantum number l), including, of course,
their characteristic values.
The purely matrix determination of these quantities can be obtained
most simply if one starts from the commutation relation
$$M_x M_y - M_y M_x = -\frac{h}{2\pi i}\,M_z,$$
which has been deduced in the preceding chapter with the help of the
operator definition of the vector M.
We shall put, for the sake of brevity,
$$M_x = \frac{h}{2\pi}A, \qquad M_y = \frac{h}{2\pi}B, \qquad M_z = \frac{h}{2\pi}C,$$
so that the commutation relation above referred to assumes the form
$$AB - BA = iC, \qquad BC - CB = iA, \qquad CA - AC = iB, \tag{89}$$
A, B, and C being regarded here as matrices.
We shall introduce the matrix
$$N = A^2 + B^2 + C^2 \tag{89a}$$
which (multiplied by $h^2/4\pi^2$) represents the square of the total angular
momentum ($M^2$), and shall show that it commutes with each of the
matrices A, B, C (the proof is the same as if they were treated as
operators).
We have, namely,
$$CA^2 - A^2C = (CA - AC)A + A(CA - AC) = i(BA + AB),$$
and similarly
$$CB^2 - B^2C = (CB - BC)B + B(CB - BC) = -i(AB + BA).$$
Adding these equations to the equation $CC^2 - C^2C = 0$, we get
$$CN - NC = 0, \tag{89b}$$
and in the same way $AN - NA = 0$ and $BN - NB = 0$.
Since, moreover, we know that N commutes with the energy matrix
H, it must be a constant of the motion, and its characteristic values,
together with the characteristic values of H, i.e. the diagonal elements
of N and H in a matrix representation corresponding to characteristic
functions of both H and N, can be used to specify the stationary states.
We know, furthermore, that these characteristic functions can be chosen
in such a way [by putting $Y_{lm}(\theta, \phi) = P_{lm}(\theta)e^{im\phi}$] that one of the three
matrices A, B, C (C, say) shall also be diagonal (corresponding to
$C\psi = \text{const.}\,\psi$). Using the results obtained before by the
wave-mechanical method, we can thus define N and C as diagonal matrices
with the elements
$$N_{n,l,m;\,n,l,m} = l(l+1), \qquad C_{n,l,m;\,n,l,m} = m. \tag{89c}$$
These results can be obtained independently by the purely matrix
method, if we confine ourselves to matrix elements corresponding to
the same energy values and assume both N and C to be diagonal
matrices (which we obviously can do for the sake of simplicity, although
this is by no means necessary).
We shall consider first such matrix elements of A and B as correspond
to states with the same value of N and shall distinguish these states
accordingly by one index m only, specifying the characteristic values
(i.e. the diagonal elements) of C.
As in the case of the oscillator, we shall not consider A and B
separately but in the conjugate complex combinations
$$A + iB = R, \qquad A - iB = S. \tag{90}$$
Replacing the K of the oscillator theory by C, we have, according
to (89),
$$(A + iB)C - C(A + iB) = (AC - CA) + i(BC - CB) = -(iB + A),$$
i.e.
$$CR - RC = R, \tag{90a}$$
and similarly
$$CS - SC = -S. \tag{90b}$$
These equations are of exactly the same form as equation (85) for r and
the corresponding equation (85 b) for s, the constant $\omega$ being replaced by $\tfrac{1}{2}$.
We thus get, in the same way as before,
$$C_{mm} = m + \text{const.}, \tag{91}$$
the non-vanishing elements of R and S being
$$R_{m,m-1} \quad\text{and}\quad S_{m-1,m}$$
and having the same numerical value since
$$S_{m-1,m} = R^{*}_{m,m-1}. \tag{91a}$$
The latter, together with the value of the constant in (91), can be
derived from the equation
$$RS = (A + iB)(A - iB) = A^2 + B^2 + C = A^2 + B^2 + C^2 + \tfrac{1}{4} - \left(C^2 - C + \tfrac{1}{4}\right),$$
i.e.
$$RS = N + \tfrac{1}{4} - \left(C - \tfrac{1}{2}\right)^2. \tag{92}$$
Taking the diagonal elements of both sides, we get
$$(RS)_{mm} = R_{m,m-1}\,S_{m-1,m} = N + \tfrac{1}{4} - \left(C_{mm} - \tfrac{1}{2}\right)^2, \tag{92a}$$
where N now denotes not the matrix N but the diagonal element of
this matrix corresponding to the state in question (with no subscript
mm affixed to it because it does not depend upon m). In a similar way
we find
$$(SR)_{mm} = S_{m,m+1}\,R_{m+1,m} = N + \tfrac{1}{4} - \left(C_{mm} + \tfrac{1}{2}\right)^2. \tag{92b}$$
It should be remarked that the same expression can be written in the
form $(RS)_{m+1,m+1}$, so that we must have, according to (92 a),
$$\left(C_{mm} + \tfrac{1}{2}\right)^2 = \left(C_{m+1,m+1} - \tfrac{1}{2}\right)^2,$$
which is, of course, in agreement with (91).
Now since $A^2 + B^2 + C^2 = N$, the characteristic values of the operator
C or, what is the same thing, the diagonal elements of the matrix C
must lie within certain limits, the maximum value C' not exceeding
$+N^{1/2}$ and the minimum value C'' being not smaller than $-N^{1/2}$. Denoting
the corresponding limiting values of m by m' and m'' respectively, we
must have
$$R_{m'+1,m'} = 0, \qquad S_{m''-1,m''} = 0.$$
This gives, according to (92 b),
$$C_{m'm'} = -\tfrac{1}{2} + \sqrt{N + \tfrac{1}{4}},$$
and, according to (92 a),
$$C_{m''m''} = \tfrac{1}{2} - \sqrt{N + \tfrac{1}{4}} = -C_{m'm'}, \tag{93}$$
as would be expected from the fact that the relation $A^2 + B^2 + C^2 = N$
determines the square of C.
The difference $C_{m'm'} - C_{m''m''} = m' - m''$ is obviously an integral
number; increased by unity it gives an integer, I say, equal to the
number of states with different values of $C_{mm}$
which are possible for a given value of N. We thus obtain the following
condition for N:
$$2\sqrt{N + \tfrac{1}{4}} = \text{integer} = I;$$
that is,
$$N = \tfrac{1}{4}(I^2 - 1) = \tfrac{1}{4}(I + 1)(I - 1). \tag{93a}$$
This expression reduces to the usual form
$$N = l(l+1) \tag{94}$$
if we put I = 2l+1, i.e. define I as an odd integer, giving for the
limiting values of $C_{mm}$
$$C_{m'm'} = l, \qquad C_{m''m''} = -l, \tag{94a}$$
i.e. by (91) m' = +l, m'' = -l. We thus get
$$C_{mm} = m, \tag{94b}$$
and consequently
in accordance with our previous results. It is, however, important to
notice that the matrix theory admits another possibility corresponding
to I being an even integer, 2k say. We get, in this case,
$$N = \left(k + \tfrac{1}{2}\right)\left(k - \tfrac{1}{2}\right) \tag{95}$$
and
$$C_{m'm'} = k - \tfrac{1}{2}, \qquad C_{m''m''} = -\left(k - \tfrac{1}{2}\right), \tag{95a}$$
whence
$$C_{mm} = m + \tfrac{1}{2} \tag{95b}$$
with m'' = -k and m' = k-1, or
$$C_{mm} = m - \tfrac{1}{2}$$
with m'' = -(k-1) and m' = k. These results can be put in the same
form as the preceding results if we define l as a half-integral angular
quantum number,
$$l = k - \tfrac{1}{2},$$
and m as a half-integral axial quantum number, varying between the
limits +l and -l.
We shall then get, as before, $C_{mm} = m$. We thus see, by this example,
that the matrix theory is, in a certain respect, more general than the
wave-mechanical theory, at least in that form in which it has been
developed hitherto. We shall give in a later chapter a generalization
of it which provides an equivalent for the half-integral values of l and
m of the matrix theory of the angular momentum.
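The counting argument that leads from (93 a) to the integral and half-integral cases can be made concrete with a few lines of exact arithmetic. This is only an illustrative sketch; the function name `multiplet` is hypothetical and does not occur in the text. It takes the number of states I, forms N according to (93 a), and lists the I diagonal values of C between +l and -l.

```python
from fractions import Fraction

def multiplet(I):
    """Return (N, l, diagonal values of C) for a set of I states (hypothetical helper)."""
    N = Fraction(I * I - 1, 4)          # (93a): N = (I^2 - 1)/4
    l = Fraction(I - 1, 2)              # I = 2l + 1, with l integral or half-integral
    c_values = [l - k for k in range(I)]  # l, l-1, ..., -l
    return N, l, c_values

odd_case = multiplet(3)    # I odd: integral l
even_case = multiplet(2)   # I even: half-integral l
```

For I = 3 this gives N = 2 with C = 1, 0, -1, and for I = 2 it gives N = 3/4 with C = 1/2, -1/2; in both cases N = l(l+1).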
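The non-diagonal matrices of the x and y components of the latter
can easily be derived from (90), (91 a), and (92 a). We shall not,
however, examine the matrices $M_x$ and $M_y$ separately, but shall examine
their combinations
$$M_x + iM_y = \frac{h}{2\pi}R, \qquad M_x - iM_y = \frac{h}{2\pi}S,$$
for the non-vanishing elements of which the following expressions are
obtained:
$$(M_x + iM_y)_{m+1,m} = \frac{h}{2\pi}\,\alpha_m \sqrt{\left(l + \tfrac{1}{2}\right)^2 - \left(m + \tfrac{1}{2}\right)^2}, \tag{96}$$
$$(M_x - iM_y)_{m,m+1} = \frac{h}{2\pi}\,\alpha_m^{*} \sqrt{\left(l + \tfrac{1}{2}\right)^2 - \left(m + \tfrac{1}{2}\right)^2}, \tag{96a}$$
where $\alpha_m$ is an arbitrary phase factor.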
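As a check on (96) and (96 a), the matrices A, B, C can be built for a given l and the relations (89) and (89 a) verified directly. The sketch below is an illustration in Python with NumPy and not the author's procedure: the function name `angular_matrices`, the ordering of the states (m = l, l-1, ..., -l), and the choice $\alpha_m = 1$ for the arbitrary phase factor are assumptions.

```python
import numpy as np

def angular_matrices(l):
    """A, B, C (angular momentum in units of h/2pi) for integral or half-integral l."""
    dim = int(round(2 * l + 1))
    m = l - np.arange(dim)               # index 0 corresponds to m = l
    C = np.diag(m).astype(complex)
    lower = m[1:]                        # the lower m of each pair (m+1, m)
    # (96) with alpha_m = 1 (assumption): R[m+1, m] = sqrt((l+1/2)^2 - (m+1/2)^2)
    R = np.diag(np.sqrt((l + 0.5) ** 2 - (lower + 0.5) ** 2), k=1).astype(complex)
    S = R.conj().T                       # (91a)
    A = 0.5 * (R + S)                    # from (90): R = A + iB, S = A - iB
    B = -0.5j * (R - S)
    return A, B, C

A, B, C = angular_matrices(1.5)          # a half-integral example
N = A @ A + B @ B + C @ C                # should equal l(l+1) times the unit matrix
```

The same function works for integral l, for instance `angular_matrices(2)`; in either case `A @ B - B @ A` agrees with `1j * C`, which is the first of the relations (89).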
A derivation of these results by the usual wave-mechanical method,
i.e. by means of the integral expressions for the matrix elements, would
require a thorough knowledge of the spherical harmonic functions
$Y_{lm} = P_{lm}(\theta)e^{im\phi}$ and would be much more laborious than the preceding
calculations.
The preceding method can also be applied to the calculation of
the matrix elements of the coordinates x, y, z and momentum components
$p_x$, $p_y$, $p_z$, for such states at least as differ from each
other in the quantum numbers m and l only (and which in the case
of the hydrogen-like atom belong to the same energy-level). To do
this we shall examine first the expressions $M_z x - xM_z$, $M_z y - yM_z$, and
$M_z z - zM_z$. Since
$$\frac{2\pi i}{h}(M_z x - xM_z) = [M_z, x] = \frac{\partial M_z}{\partial p_x} \quad\text{and}\quad M_z = xp_y - yp_x,$$
we get $[M_z, x] = -y$, and in the same way $[M_z, y] = +x$, $[M_z, z] = 0$.
Putting
$$x + iy = \xi, \qquad x - iy = \eta, \tag{97}$$
we thus have
$$[M_z, \xi] = -y + ix = i\xi, \qquad [M_z, \eta] = -y - ix = -i(x - iy) = -i\eta,$$
or, with $M_z = hC/2\pi$,
$$C\xi - \xi C = \xi, \qquad C\eta - \eta C = -\eta, \tag{97a}$$
and
$$Cz - zC = 0. \tag{97b}$$
It follows immediately from these relations that, so far as the quantum
number m is concerned (l being left undetermined), z is a diagonal
matrix with non-vanishing elements $z_{mm}$, while $\xi$ and $\eta$ are matrices
with non-vanishing elements of the form
$$\xi_{m,m-1} \quad\text{and}\quad \eta_{m-1,m},$$
as in the case of the harmonic oscillator.
Let us consider now the commutation relations between the quantities
(operators, matrices) $\xi$, $\eta$ on the one hand and R, S on the other.
We have
$$[M_x + iM_y, \xi] = [M_x + iM_y, x + iy] = [M_x, x] + i[M_x, y] + i[M_y, x] - [M_y, y]$$
$$= i\left(\frac{\partial M_x}{\partial p_y} + \frac{\partial M_y}{\partial p_x}\right) = i(-z + z) = 0,$$
and similarly
$$[M_x - iM_y, \xi] = -2iz,$$
so that
$$R\xi - \xi R = 0, \tag{98}$$
$$S\xi - \xi S = -2z. \tag{98a}$$
From the first of these equations we get
$$R_{m+1,m}\,\xi_{m,m-1} - \xi_{m+1,m}\,R_{m,m-1} = 0,$$
i.e.
$$\frac{\xi_{m+1,m}}{R_{m+1,m}} = \frac{\xi_{m,m-1}}{R_{m,m-1}} = \text{const.} = a,$$
and likewise from (98 a)
$$2z_{mm} = \xi_{m,m-1}\,S_{m-1,m} - S_{m,m+1}\,\xi_{m+1,m} = a\left(R_{m,m-1}S_{m-1,m} - S_{m,m+1}R_{m+1,m}\right) = a\left[(RS)_{mm} - (SR)_{mm}\right].$$
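We thus see that the non-vanishing matrix elements of the coordinates
are determined, disregarding an irrelevant proportionality
factor, by the matrix elements of the angular momentum. Substituting,
in the preceding equations, the expressions for $R_{m,m-1}$ and $S_{m-1,m}$
derived before, we get
$$2z_{mm} = a\left\{\left[\left(l + \tfrac{1}{2}\right)^2 - \left(m - \tfrac{1}{2}\right)^2\right] - \left[\left(l + \tfrac{1}{2}\right)^2 - \left(m + \tfrac{1}{2}\right)^2\right]\right\},$$
i.e.
$$z_{mm} = am, \qquad |\xi_{m+1,m}| = |\eta_{m,m+1}| = a\sqrt{\left(l + \tfrac{1}{2}\right)^2 - \left(m + \tfrac{1}{2}\right)^2}. \tag{98b}$$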
In deriving these results it was tacitly assumed that the total momentum
remained invariant, i.e. that the angular quantum number l preserved
the same value in the different states to which the matrix
elements (98 b) refer. Affixing the index l, we should have written the
latter in the more complete form $z_{l,m;\,l,m}$, etc.
In order to find out the matrix elements which correspond to different
values of l, we must take into account certain commutation relations
containing the matrix of the total momentum, or its square N ($\times\, h^2/4\pi^2$).
Taking, for instance, the relation
$$NR - RN = 0$$
(which follows from $NA - AN = 0$ and $NB - BN = 0$), we have, since
N is a diagonal matrix with regard both to l and m (as a matter of fact
not depending upon m),
$$(NR - RN)_{l',m';\,l'',m''} = \sum_{l''',m'''}\left(N_{l',m';\,l''',m'''}\,R_{l''',m''';\,l'',m''} - R_{l',m';\,l''',m'''}\,N_{l''',m''';\,l'',m''}\right)$$
$$= (N_{l'} - N_{l''})\,R_{l',m';\,l'',m''} = 0.$$
We thus see that $R_{l',m';\,l'',m''}$ vanishes unless l' = l'' as was assumed
above. This assumption is therefore justified so far as the components
of the angular momentum are concerned (it can be proved in the same
way for S and C). It need not, however, hold for the coordinates,
i.e. for the matrices $\xi$, $\eta$, z.
Taking, for instance, the (l', m'; l'', m'')-element of (98), we have
$$\sum_{m'''}\left(R_{l',m';\,l',m'''}\,\xi_{l',m''';\,l'',m''} - \xi_{l',m';\,l'',m'''}\,R_{l'',m''';\,l'',m''}\right) = 0.$$
Now it can easily be seen that the results derived from (97 a) and (97 b)
as to the non-vanishing elements of $\xi$, $\eta$, and z, so far as they are
specified by the quantum number m, remain valid irrespective of the
equality or inequality of the numbers l' and l'' (since these results
depend solely upon the diagonal character of C with regard to m). The
preceding equation need therefore be examined only for the case when
m'' = m' - 2. Putting m' = m+1 and m'' = m-1, we get
$$R_{l',m+1;\,l',m}\,\xi_{l',m;\,l'',m-1} = \xi_{l',m+1;\,l'',m}\,R_{l'',m;\,l'',m-1}. \tag{99}$$
The angular quantum number l represents the maximum absolute value
of the axial quantum number m. This means that the matrix element
$R_{l',m+1;\,l',m}$ will vanish unless both $|m| \le l'$ and $|m+1| \le l'$; likewise
$\xi_{l',m;\,l'',m-1}$ will vanish unless $|m| \le l'$ and $|m-1| \le l''$; further
$\xi_{l',m+1;\,l'',m}$ will vanish unless $|m| \le l''$ and $|m+1| \le l'$, and finally
$R_{l'',m;\,l'',m-1}$ will vanish unless $|m| \le l''$ and $|m-1| \le l''$. Since equations (99) must
hold for all values of m, both sides vanishing simultaneously, we can
conclude that l' and l'' must be connected with each other in such a way
that the violation of one of the conditions
$$|m| \le l', \qquad |m+1| \le l', \qquad |m-1| \le l''$$
will entail the violation of one of the conditions
$$|m+1| \le l', \qquad |m| \le l'', \qquad |m-1| \le l''.$$
This will obviously be the case if l' = l'', or l' = l''+1, or l' = l''-1.
We thus see that only those matrix elements of $\xi$ will be different
from zero for which
$$l' - l'' = 0,\ +1,\ -1. \tag{99a}$$
For otherwise we could, by a suitable choice of m, make one side of
(99) vanish while the other would be different from zero.
The same applies, of course, to the matrix elements of $\eta$ and z, or, in
other words, to the matrix elements of all the three coordinates.
Putting in (99) l' = l and l'' = l-1, and replacing the matrix elements
of R by their expressions (96), we get
$$\sqrt{\left(l + \tfrac{1}{2}\right)^2 - \left(m + \tfrac{1}{2}\right)^2}\;\xi_{l,m;\,l-1,m-1} = \sqrt{\left(l - \tfrac{1}{2}\right)^2 - \left(m - \tfrac{1}{2}\right)^2}\;\xi_{l,m+1;\,l-1,m}$$
or
$$\sqrt{(l+m+1)(l-m)}\;\xi_{l,m;\,l-1,m-1} = \sqrt{(l+m-1)(l-m)}\;\xi_{l,m+1;\,l-1,m}.$$
Replacing here the common factor $\sqrt{l-m}$ by $\sqrt{l+m}$, and taking
into account that the expression (l+m+1)(l+m) is obtained from
(l+m-1)(l+m) by replacing m by m+1, we can put
$$\xi_{l,m+1;\,l-1,m} = b\sqrt{(l+m)(l+m+1)}, \tag{100}$$
where b is a proportionality coefficient which does not depend either
on l or on m.
Substituting this expression in the equation
$$-2z_{l,m;\,l-1,m} = S_{l,m;\,l,m+1}\,\xi_{l,m+1;\,l-1,m} - \xi_{l,m;\,l-1,m-1}\,S_{l-1,m-1;\,l-1,m}$$
which follows from (98 a), and putting
$$S_{l,m-1;\,l,m} = R_{l,m;\,l,m-1} = \sqrt{\left(l + \tfrac{1}{2}\right)^2 - \left(m - \tfrac{1}{2}\right)^2},$$
we get
$$z_{l,m;\,l-1,m} = -b\sqrt{l^2 - m^2}. \tag{100a}$$
In a similar way for the case l' = l-1, l'' = l we obtain
$$\xi_{l-1,m+1;\,l,m} = b'\sqrt{(l-m)(l-m-1)}, \tag{100b}$$
$$z_{l-1,m;\,l,m} = b'\sqrt{l^2 - m^2}, \tag{100c}$$
where b' is another coefficient of proportionality, which can be shown
to have the same numerical value as b.
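The step from (100) to (100 a) is a short piece of arithmetic that is easy to get wrong by a sign, and it can be confirmed numerically. The sketch below assumes real phases ($\alpha_m = 1$) in (96) and uses hypothetical helper names `R_el`, `xi_el` and `z_from_98a`; it evaluates the (l, m; l-1, m) element of (98 a) with the elements (100) inserted.

```python
import math

def R_el(l, m):
    """R_{l,m+1; l,m} from (96) with alpha_m = 1 (assumption)."""
    return math.sqrt((l + 0.5) ** 2 - (m + 0.5) ** 2)

def xi_el(l, m, b=1.0):
    """xi_{l,m+1; l-1,m} from (100)."""
    return b * math.sqrt((l + m) * (l + m + 1))

def z_from_98a(l, m, b=1.0):
    """z_{l,m; l-1,m} obtained from the (l,m; l-1,m) element of S*xi - xi*S = -2z."""
    S_upper = R_el(l, m)          # S_{l,m; l,m+1} = R_{l,m+1; l,m} for real phases
    S_lower = R_el(l - 1, m - 1)  # S_{l-1,m-1; l-1,m} = R_{l-1,m; l-1,m-1}
    return -0.5 * (S_upper * xi_el(l, m, b) - xi_el(l, m - 1, b) * S_lower)

values = {m: z_from_98a(3, m) for m in range(-2, 3)}
```

For l = 3 and b = 1 the entries of `values` agree with $-\sqrt{9 - m^2}$, as (100 a) requires.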
It is interesting to compare the preceding results† with the
wave-mechanical method for the determination of matrix elements of the
coordinates for a hydrogen-like atom.
We have, for instance,
$$z_{n,l,m;\,n',l',m'} = \int \psi^{*}_{nlm}\,z\,\psi_{n'l'm'}\,dV,$$
or, putting $\psi_{nlm} = f_{nl}(r)P_{lm}(\theta)e^{im\phi}$, $dV = r^2\,dr\,d\omega$, $d\omega = \sin\theta\,d\theta\,d\phi$, and
$z = r\cos\theta$,
$$z_{n,l,m;\,n',l',m'} = \int_0^\infty f_{nl}(r)f_{n'l'}(r)\,r^3\,dr \int_0^\pi P_{lm}(\theta)P_{l'm'}(\theta)\cos\theta\sin\theta\,d\theta \int_0^{2\pi} e^{i(m'-m)\phi}\,d\phi.$$
We see, first of all, that on account of the last factor this expression
vanishes unless m' = m. In addition it can be shown that the second
factor also vanishes unless $l' = l \pm 1$. The proof is based on the fact
that the product $\cos\theta\,P_{lm}(\theta)$ can be represented as the sum of two
functions $P_{l+1,m}(\theta)$ and $P_{l-1,m}(\theta)$ with suitably chosen coefficients, and
on the orthogonality of the functions $Y_l(\theta, \phi)$ corresponding to different
values of l [as characteristic functions of the angular part of the
Laplace operator with the characteristic values $-l(l+1)$].
Replacing z by $\xi = x + iy = r\sin\theta(\cos\phi + i\sin\phi) = r\sin\theta\,e^{i\phi}$, we
get, in a similar way,
$$\xi_{n,l,m;\,n',l',m'} = \int_0^\infty f_{nl}(r)f_{n'l'}(r)\,r^3\,dr \int_0^\pi P_{lm}(\theta)P_{l'm'}(\theta)\sin^2\theta\,d\theta \int_0^{2\pi} e^{i(m'-m+1)\phi}\,d\phi.$$
The examination of the last factor shows at once that this expression
vanishes unless m' = m-1; the second factor vanishes likewise if
$l' \ne l \pm 1$.
The conditions relative to m coincide with those obtained by the
matrix method for z and $\xi$; the condition $l' = l \pm 1$ is, however, more
restrictive, since it excludes the case l' = l.
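The vanishing of the angular factor unless $l' = l \pm 1$ can be illustrated numerically in the simplest case m = m' = 0, where $P_{l0}$ is the ordinary Legendre polynomial and the substitution $u = \cos\theta$ turns the second factor into $\int_{-1}^{+1} P_l(u)\,u\,P_{l'}(u)\,du$. The following sketch uses NumPy's Gauss-Legendre quadrature; the function name `theta_integral` and the quadrature order are assumptions chosen for illustration, and the unnormalized polynomials are used rather than normalized functions.

```python
import numpy as np
from numpy.polynomial import legendre

def theta_integral(l, l2, order=64):
    """Integral of P_l(u) * u * P_l2(u) over -1 <= u <= 1 (case m = m' = 0)."""
    u, w = legendre.leggauss(order)            # nodes and weights on [-1, 1]
    P_l = legendre.legval(u, [0] * l + [1])    # Legendre polynomial of degree l
    P_l2 = legendre.legval(u, [0] * l2 + [1])
    return float(np.sum(w * P_l * u * P_l2))

same_l = theta_integral(2, 2)       # l' = l
neighbour = theta_integral(2, 3)    # l' = l + 1
distant = theta_integral(1, 3)      # l' = l + 2
```

Only the case $l' = l \pm 1$ gives a value different from zero (within rounding error), in agreement with the selection rule stated above.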
We see that here again, as for the values of l (integral or half-integral),
the matrix method leads to results of higher generality than the
wave-mechanical method. It should not be inferred that the results obtained
by the latter are incorrect. On the contrary, it is the results obtained by
the matrix method which require some qualification. The reason for
this is that the properties of the matrices which represent the components
of the angular momentum of an electron are not completely
specific, but, as we shall see later, are shared by matrices representing
allied quantities of a more general character, which can be considered
as the resultant of the angular momentum due to rotation about a fixed
centre and the so-called ‘intrinsic angular momentum’ of the electron,
whose origin is usually ascribed to its spin motion.
† Derived in the above way by Born and Jordan.
It is possible to generalize the wave-mechanical theory in such a way
as to interpret this ‘spin effect’ and to incorporate the intrinsic momentum,
allowing for the resultant angular quantum number or, as it is
called, the ‘inner quantum number’ j both integral and half-integral
values and allowing transitions, i.e. non-vanishing matrix elements of
the coordinates, for which this number changes by $\pm 1$ or remains constant.
This does not, however, invalidate in the least the fact that the
angular quantum number l, representing the ‘orbital angular momentum’
of the particle, can assume integral values only and obeys the
restricted ‘selection rule’ $l' - l = \pm 1$.
The fact that we have obtained, by the matrix method, non-vanishing
expressions (98 b) for the matrix elements of the coordinates in the case
$l' - l = 0$ does not contradict the wave-mechanical theory, for these
expressions contain a proportionality factor a, which has not been
specified and which can easily be shown to be equal to zero in the
case considered (if l denotes the orbital and not the total angular
quantum number).
The matrix elements of the coordinates which we have calculated
have a direct and indeed very important physical significance. They
determine, according to the formula
$$A_{nn'} = \frac{64\pi^4 e^2 \nu_{nn'}^3}{3c^3 h}\,|x_{nn'}|^2,$$
where e denotes the electric charge of the particle, the probability of
a spontaneous transition with emission of light, i.e. they determine the
intensity of the different lines in the emission spectrum of the corresponding
system or the degree of their ‘blackness’ in the absorption
spectrum [see Part I, § 13]. Such pairs of states n, n' for which the
matrix elements $x_{nn'}$ vanish do not combine with each other, in the
sense that transitions between them connected with the emission or
absorption of light, corresponding to oscillations in the x-direction, that
is to say, ‘polarized’ in this direction, are impossible. The relations
between the quantum numbers which characterize the ‘allowed’ transitions
(corresponding to the non-vanishing matrix elements) are called
‘selection rules’. The latter, as we have just seen, can be different for
different coordinates. For instance, in the case of the z-coordinate
(i.e. of light polarized in the z-direction) they amount to $l' - l = \pm 1$
and m' = m, while in the case of the x, y-coordinates they are $l' - l = \pm 1$
and $m' = m \pm 1$.
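The selection rules just stated can be written down as a small predicate. This is merely a restatement of the rules of the text in code form, for the orbital quantum numbers l and m; the function name `allowed` and its argument order are hypothetical.

```python
def allowed(l, m, l2, m2, coordinate):
    """True if the transition (l, m) -> (l2, m2) is allowed for the given coordinate."""
    if abs(m) > l or abs(m2) > l2:
        return False                 # m cannot exceed l in absolute value
    if abs(l2 - l) != 1:
        return False                 # l' - l = +1 or -1 for every coordinate
    if coordinate == "z":
        return m2 == m               # light polarized in the z-direction
    if coordinate in ("x", "y"):
        return abs(m2 - m) == 1      # m' = m + 1 or m - 1
    raise ValueError("coordinate must be 'x', 'y' or 'z'")

examples = [allowed(1, 0, 2, 0, "z"), allowed(1, 0, 1, 0, "z"), allowed(1, 0, 2, 1, "x")]
```

The three examples correspond to an allowed z-transition, a transition excluded because l' = l, and an allowed x-transition.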
This distinction between the different coordinates is a purely formal
one in the case of a radially symmetrical field of force, because of the
degeneracy connected with such a field. This degeneracy (with respect
to the different values of m) can be eliminated, as will be shown later,
by the presence of a magnetic field parallel to the z-axis (Zeeman effect).
If the latter is weak enough, the preceding expressions for the matrix
elements of z and of $x \pm iy$ will remain approximately valid and will
determine the intensity of the spectrum lines linearly polarized in
the direction of the magnetic field or circularly polarized about this
direction.
14. Matrix Representation in the Case of a Continuous Spectrum
We have limited ourselves hitherto to the matrix representation of
physical quantities where the states concerned form a discrete set,
corresponding to a discrete spectrum of the energy operator H.
The case of a continuous spectrum corresponding to a continuous or
‘mixed’ set of states specified by functions of the type $\psi_C$ or $\psi_{C,n,\ldots}$ etc.
(§ 11), can be dealt with in a similar manner. The matrix elements
of any operator F are defined in this case in exactly the same way as
in the preceding case, i.e. by integrals of the form
$$F_{C'C''} = \int \psi^{*}_{C'}\,F\,\psi_{C''}\,dV \tag{101}$$
or
$$F_{C',n',\ldots;\,C'',n'',\ldots} = \int \psi^{*}_{C',n',\ldots}\,F\,\psi_{C'',n'',\ldots}\,dV, \tag{101a}$$
and so on.
These integrals as a rule do not converge, and are similar to the
Dirac function $\delta(C' - C'')$ which was introduced and discussed in § 10,
and to which the matrix elements of F actually reduce if F represents
the energy H or any other constant of the motion commuting
§ 14 M AT R IX R E P R E SE NT AT I O N W IT H C O NT INUO US SP E C T R UM 121
with H and satisfying the equation $F\psi_C = F_C\,\psi_C$. We then get, according
to (101),
$$F_{C'C''} = F_{C''} \int \psi^{*}_{C'}\,\psi_{C''}\,dV,$$
that is,
$$F_{C'C''} = F_{C'}\,\delta(C' - C''). \tag{101b}$$
This expression corresponds to a ‘diagonal matrix’ of the discrete case,
just as $\delta(C' - C'')$ corresponds to the unit matrix.
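To see concretely in what sense such a non-convergent integral behaves like $\delta(C' - C'')$, one may restrict the integration to a finite range. The sketch below takes plane waves $e^{i2\pi kx}$ in one dimension as a hypothetical example (so that the parameter C is the wave number k), integrates over $-L/2 < x < L/2$, and examines the resulting function of $k' - k''$. The names `delta_L`, `area` and `peak` and the chosen values of L are assumptions made for illustration only.

```python
import numpy as np

def delta_L(dk, L):
    """Integral of exp(i*2*pi*dk*x) over -L/2 < x < L/2, i.e. sin(pi*L*dk)/(pi*dk)."""
    return L * np.sinc(L * dk)       # np.sinc(u) = sin(pi*u)/(pi*u)

dk = np.linspace(-20.0, 20.0, 200001)
step = dk[1] - dk[0]
# Area under the curve and height of the central peak for two sizes of the region.
area = {L: float(np.sum(delta_L(dk, L)) * step) for L in (5.0, 50.0)}
peak = {L: float(delta_L(0.0, L)) for L in (5.0, 50.0)}
```

The height of the peak grows in proportion to L while the area under the curve stays close to unity, which is the behaviour expected of a function tending to the Dirac function as the region of integration is extended.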
The somewhat indefinite character of the matrix elements $F_{C'C''}$ can
be removed in the same way as in the simplest case F = 1 when $F_{C'C''}$
reduces to the function $\delta(C' - C'')$, namely, by extending the integration
in (101) over a finite volume, and passing to the limit $V \to \infty$ after
completing the integration over C' or C'' which always occurs in
problems of physical interest.† The simplest example of such a problem
is the calculation of the probable value of some quantity F for a motion
specified by a wave function of the type
$$\psi = \int a_C\,\psi_C\,dC, \tag{102}$$
which can be considered as the superposition of a large number of
‘wave packets’ corresponding to very small intervals of the parameter
C. Although the integrals $\int |\psi_C|^2\,dV$ diverge, the integral $\int |\psi|^2\,dV$
remains in general finite and can be normalized to 1, just as in the
discrete case when $\psi = \sum_n a_n\psi_n$.
We have in fact, reversing the order of integration with respect to
V and C,
$$\int_V |\psi|^2\,dV = \int a^{*}_{C'}\,dC' \int a_{C''}\,dC'' \int_V \psi^{*}_{C'}\,\psi_{C''}\,dV = \int a^{*}_{C'}\,dC' \int a_{C''}\,dC''\;\delta_V(C' - C'').$$
Instead of first performing the integration with regard to C' and C''
and then passing to the limit $V \to \infty$, we can in this case replace the
(perfectly definite) function $\delta_V(C' - C'')$ at once by the Dirac function
$\delta(C' - C'')$, which gives
$$\int |\psi|^2\,dV = \int a^{*}_{C'}\,a_{C'}\,dC'. \tag{102a}$$
We thus see that the first integral converges along with the integral
$\int |a_C|^2\,dC$. The convergence of the latter can, however, always be
secured by a reasonable choice of the function $a_C$. The normalization
condition thus reduces to the equation
$$\int |a_C|^2\,dC = 1, \tag{102b}$$
which replaces the equation $\sum |a_n|^2 = 1$ of the discrete case, and shows
that the product
$$|a_C|^2\,dC \tag{102c}$$
can be considered as the probability that the particle is in a state of
motion specified by the interval (C, C + dC).
† In some cases it is preferable to modify the definition of the wave functions $\psi$ so
as to make them vanish on a certain surface S beyond which the forces can be assumed
to vanish. The problem is thus reduced to one characterized by a discrete spectrum.
Such quantities as possess a direct physical interest are usually only slightly affected by
the value of the volume V enclosed by S, so long as it is sufficiently large. Their exact
values can be easily calculated by passing to the limit $V \to \infty$.
The expression (102c) is of the same form as the expression $|\psi|^2\,dV$
for the probability of a position specified by the volume element dV;
in both cases we have to deal with continuously variable parameters
(C or the coordinates x, y, z), and therefore in both cases it has a
meaning to talk of probability with reference not to a definite state or
position, but to a definite interval of states or positions, the probability
in question being proportional to the magnitude of the interval.
Subject to the condition (102b), the probable value of a quantity F
can be defined by the usual formula
$$\bar{F} = \int \psi^{*}F\psi\,dV, \tag{103}$$
which can be rewritten in the form
$$\bar{F} = \int a^{*}_{C'}\,dC' \int a_{C''}\,dC'' \int_V \psi^{*}_{C'}\,F\,\psi_{C''}\,dV,$$
i.e.
$$\bar{F} = \iint a^{*}_{C'}\,a_{C''}\,F_{C'C''}\,dC'\,dC''. \tag{103a}$$
In the simplest case, when F represents a constant of the motion, we
get, according to (101b),
$$\bar{F} = \int |a_C|^2\,F_C\,dC, \tag{103b}$$
in agreement with the above interpretation of the product $|a_C|^2\,dC$.
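The normalization condition (102 b) and the probable value (103 b) lend themselves to a simple numerical illustration on a grid of values of C. In the sketch below the Gaussian amplitude `a_C`, the grid, and the choice of characteristic values $F_C = C$ are all hypothetical; the integrals are replaced by sums over equal intervals dC.

```python
import numpy as np

C = np.linspace(-10.0, 10.0, 4001)                 # grid of the continuous parameter
dC = C[1] - C[0]
# Hypothetical amplitude: Gaussian centred at C = 1 with an arbitrary phase.
a_C = np.exp(-(C - 1.0) ** 2 / 2.0) * np.exp(0.3j * C)
a_C = a_C / np.sqrt(np.sum(np.abs(a_C) ** 2) * dC)  # enforce (102b)
probability = np.abs(a_C) ** 2 * dC                # (102c) for each interval dC
F_C = C                                            # assumed characteristic values F_C = C
F_mean = float(np.sum(probability * F_C))          # (103b)
total = float(np.sum(probability))
```

The phase of `a_C` drops out of both sums, and `F_mean` comes out equal to the centre of the Gaussian, as one would expect from the interpretation of $|a_C|^2\,dC$ as a probability.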
If, however, F is not a constant of the motion, the integral (103 a)
representing its probable value cannot be evaluated directly and we
must have recourse to the method indicated above (first integration
over finite volume, then over C'' or both C'' and C', and finally passage
to the limit $V \to \infty$).
If the ‘C-space’ is subdivided into infinitely small intervals $\Delta C'$, $\Delta C''$,
etc., and a wave packet is built up for each interval, according to the
formula
$$\bar{\psi}_{C'} = \lim \frac{1}{\sqrt{\Delta C'}} \int_{\Delta C'} \psi_C\,dC, \tag{104}$$
we can replace the matrix components of F with respect to the functions
$\psi_{C'}$ by matrix components with respect to the ‘quasi-discrete’
functions $\bar{\psi}_{C'}$ (normalized to unity):
$$\bar{F}_{C'C''} = \int \bar{\psi}^{*}_{C'}\,F\,\bar{\psi}_{C''}\,dV. \tag{104a}$$
The connexion between these matrix components and those discussed
above is given by the formula
$$\bar{F}_{C'C''} = \frac{1}{\sqrt{\Delta C'\,\Delta C''}} \int_{\Delta C'} \int_{\Delta C''} F_{C'C''}\,dC'\,dC'', \tag{104b}$$
whence it follows that the probable value of F can be written in the
form
$$\bar{F} = \lim \sum_{\Delta C'} \sum_{\Delta C''} \sqrt{\Delta C'\,\Delta C''}\;\bar{F}_{C'C''}\,a^{*}_{C'}\,a_{C''}. \tag{104c}$$
The matrix components (or elements) of a real quantity with respect
to states of a continuous set must, of course, satisfy the Hermitian
relations
$$F_{C'C''} = F^{*}_{C''C'}$$
just as in the case of a discrete spectrum.
‘Continuous matrices’ cannot be conveniently represented by a square
array of elements or components, such as are used for discrete matrices.
This, however, does not invalidate the analytical results which have
been established in § 11; the only amendment which they require consists
in the replacement of the unit matrix $\delta_{mn}$ by the Dirac function
$\delta(C' - C'')$ and of summation with respect to discretely variable indices
by an integration with respect to the continuously variable indices
wherever the latter occur in the place of the former.
This has already been illustrated by the preceding examples. In a
similar way we get instead of (75)
$$F\psi_{C'} = \int F_{C''C'}\,\psi_{C''}\,dC'' \tag{105}$$
and instead of (76)
$$(FG)_{C'C''} = \int F_{C'C}\,G_{CC''}\,dC \tag{105a}$$
(multiplication law for continuous matrices).
The seemingly unimportant formal difference between the continuous
(or mixed) and discrete case is connected, however, with a fundamental
difference in the physical meaning both of the wave functions and of
the matrix elements. The essence of this difference consists in the fact
that, while to states belonging to a discrete set there corresponds in
classical mechanics periodic or quasi-periodic motion in a limited region
of space, states belonging to a continuous set correspond to aperiodic
124 M AT R IC E S §14
motions of the classical theory, i.e. to types of motion for which the
kinetic energy remains positive at infinity and which approximate there
fore at infinite distance (so far as the forces vanish there) to free motion.
Motions of this type were not considered in the old quantum theory.
The latter did not encroach upon the holy laws of classical mechanics,
but merely added to them certain quantum restrictions when the motion
was confined to a limited region of space and accordingly displayed
certain periodicities corresponding to the many-valuedness of the action
function S. As already shown above, Bohr’s quantum conditions
amounted to the condition of single-valuedness for the function $e^{i2\pi S/h}$.
In the case of aperiodic motions, starting at infinity and ending at
infinity, the action function S remains single-valued, so that quantum
restrictions of any kind are unnecessary.
The coordinates of a particle describing such an aperiodic motion,
considered as functions of the time t, cannot, of course, be expanded
in a Fourier series. The latter can be replaced, however, in this case
by a Fourier integral. Limiting ourselves, for the sake of simplicity,
to motion in one dimension, e.g. parallel to the x-axis, we can write
instead of (79), § 12,
$$x(t) = \int_{-\infty}^{+\infty} x^{0}_{\nu}\,e^{-i2\pi\nu t}\,d\nu, \tag{106}$$
and instead of (81 b)
$$x_{\nu'}(t) = \int_{-\infty}^{+\infty} x^{0}_{\nu'\nu''}\,e^{i2\pi(\nu' - \nu'')t}\,d\nu'', \tag{106a}$$
where $x^{0}_{\nu'\nu''} = x^{0}_{\nu'}(\nu'' - \nu')$, the product $x^{0}_{\nu'\nu''}\,d\nu''$ replacing the amplitude
$x_{mn}$; $\nu' = W'/h$ is the frequency associated with the energy W = W',
which is supposed to be the energy of the motion represented by (106 a).
As to the frequency $\nu'' = \nu' + \nu$, it is natural to assume that it coincides
approximately with W''/h, where W'' denotes the energy of a state, a
transition from which to the state W' corresponds, with regard to
frequency and intensity of the emitted light, to the element $x^{0}_{\nu'\nu''}\,e^{i2\pi(\nu' - \nu'')t}\,d\nu''$
of the integral (106 a). The question of the degree of approximation
between $\nu''$ and W''/h (if $\nu' = W'/h$) has no definite meaning in the
present case with a continuously variable W, for equations (80), (80 a),
(80 b), and (80 c) cannot be applied to it, the integrals $\oint$ referring to
‘round trips’ only. We are therefore entitled to assume that $\nu''$ coincides
exactly with W''/h, i.e. that there is not only a ‘correspondence’ but
an actual identity between the classical frequencies occurring in (106)
and the quantum frequencies (W'' - W')/h. The responsibility for the
disagreement between the classical and the quantum theory can thus
be shifted entirely on to the amplitude coefficients $x^{0}_{\nu'\nu''}$ which can be
supposed to ‘correspond’, i.e. to be approximately equal to the matrix
elements of x with regard to the states W' and W'':
$$x_{\nu'\nu''} = \int_{-\infty}^{+\infty} x\,\psi^{*}_{\nu'}\,\psi_{\nu''}\,dx.$$
The correspondence with these elements can actually be established
with the help of the approximate expressions of the wave functions ip®
in a way similar to that used in § 12 for the case of a discrete spectrum.
We shall put accordingly
t/iv. = (1 0 7 )
VVv‘
where the coefficients Cv>must be determined by the condition
-foo
I $$$,■ dx = ( 1 0 7 a)
—ao
Taking into account the relation
«!'(*)-*!•(*) = ( K - K - ) t , (107b)
which can easily be shown to hold approximately (for two states not
far removed from each other) irrespective of the periodic or aperiodic
character of the motion,! we get in the case of neighbouring values of
v and v”:
J F (x)M? J
-f 00 + °o
F'l-,,. .-= dx S 4CV7C„.. F (t)e-i*«wr’- wrW' dt. (108)
—00 —ao
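On the other hand, the Fourier coefficients in the integral representing
a function $F_{\nu'}(t)$,
$$F_{\nu'}(t) = \int_{-\infty}^{+\infty} F_{\nu'\nu''}\,e^{i2\pi(\nu'-\nu'')t}\,d\nu'',$$
are determined by the formula
$$F_{\nu'\nu''} = \int_{-\infty}^{+\infty} F_{\nu'}(t)\,e^{-i2\pi(\nu'-\nu'')t}\,dt, \tag{108 a}$$
which coincides with the preceding expression for $F^0_{\nu'\nu''}$ if we put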
† Cf. § 12. Since in the present case the integral $J = \oint g\,dx$ is non-existent, we can
put directly
$$s^0_{\nu'}(x) - s^0_{\nu''}(x) \cong \frac{\partial s^0}{\partial W}\,(W'-W'').$$
We have further, from the definition $\partial s^0/\partial x = g = \sqrt{2m(W-U)}$,
$$\frac{\partial s^0}{\partial W} = \int \frac{\partial}{\partial W}\sqrt{2m(W-U)}\,dx = t,$$
in the same way as before. The relation (107 b) can be proved in a somewhat more
complicated manner for the general case of a (non-periodic) three-dimensional motion.
$\nu''-\nu' = (W''-W')/h$ and $C_{\nu'}^2 = 1$. The latter condition can easily be
shown to follow from (107 a). In fact the main contribution to the
integral (107 a) must be due to distant points where the functions $s^0(x)$
reduce to $gx$ with a constant value of $g$ (corresponding to a constant
potential energy). Replacing $g$ by $hk$, where $k$ is the wave number, we
get, according to (23 a), Chap. I,
$$\int_{-\infty}^{+\infty} \psi^{0*}_{\nu'}\,\psi^0_{\nu''}\,dx = \frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}}\int_{-\infty}^{+\infty} e^{i2\pi(k''-k')x}\,dx = \frac{C_{\nu'}^2}{v_{\nu'}}\,\delta(k'-k'') = \delta(\nu'-\nu''),$$
whence
$$\int_{-\infty}^{+\infty} \frac{C_{\nu'}^2}{v_{\nu'}}\,\delta(k'-k'')\,d\nu'' = 1,$$
or, since $\int \delta(k'-k'')\,dk'' = 1$,
$$\frac{C_{\nu'}^2}{v_{\nu'}}\,\frac{d\nu}{dk} = 1.$$
Taking into account the relation $\nu = hk^2/(2m)$, we get
$$\frac{d\nu}{dk} = \frac{hk}{m} = v_{\nu}$$
(group velocity = corpuscular velocity) and consequently
$$C_{\nu'}^2 = 1.$$
The integral (108) expressing the Fourier components of a function
$F_{\nu'}(t)$ converges and has a definite value only when this function
vanishes for $t = \pm\infty$. This condition is not satisfied for most of the
quantities referring to aperiodic motion. In the simplest case of uniform
motion we have, for instance, $x = vt$ and
$$x^0_{\nu'\nu''} = v\int_{-\infty}^{+\infty} t\,e^{-i2\pi(\nu'-\nu'')t}\,dt,$$
the integral obviously diverging. If, further, $F$ denotes a constant of the
motion, e.g. the energy $H$, we get
$$H^0_{\nu'\nu''} = W'\int_{-\infty}^{+\infty} e^{-i2\pi(\nu'-\nu'')t}\,dt = W'\,\delta(\nu'-\nu''),$$
in exact agreement with the result (101 b) obtained from the matrix
definition of $H_{\nu'\nu''}$.
These considerations give a new explanation of the fact, already
mentioned, that the matrix elements of various quantities in the
case of a continuous energy spectrum do not in general have definite
values, being expressed by non-converging integrals over oscillatory
functions of the $e^{i2\pi kx}$ type.
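The sense in which the integral of $e^{-i2\pi(\nu'-\nu'')t}$ over all $t$ stands for $\delta(\nu'-\nu'')$ can be made explicit by cutting the integration off at $t = \pm T$. The cut-off $T$ is an auxiliary symbol introduced here only for illustration; it does not occur in the text.

```latex
\int_{-T}^{+T} e^{-i2\pi(\nu'-\nu'')t}\,dt
  = \frac{\sin\bigl[2\pi(\nu'-\nu'')T\bigr]}{\pi(\nu'-\nu'')},
\qquad
\int_{-\infty}^{+\infty}
  \frac{\sin\bigl[2\pi(\nu'-\nu'')T\bigr]}{\pi(\nu'-\nu'')}\,d\nu'' = 1
  \quad (T > 0).
```

For $\nu'' = \nu'$ the truncated integral equals $2T$ and so grows without limit, while its integral over $\nu''$ stays equal to unity; in the limit $T \to \infty$ it therefore has a meaning only under an integral sign, which is the behaviour of the function $\delta(\nu'-\nu'')$.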
IV
TRANSFORMATION THEORY
15. Restricted Transformation Theory; Matrices defined from
different ‘Points of View’
Let us consider two operators $H$ and $K$ which we shall assume to
represent the energy of the same particle moving in different fields of
force with the potential-energy functions $U(x,y,z)$ and $V(x,y,z)$, both
being independent of the time and limiting its movement classically to
a finite region.
The characteristic values of $H$, which in this case will form a discrete
set, will be denoted by $H'$ or $H''$, etc. (the dashed letters referring not
to a particular characteristic value, but to any one of them). The
corresponding characteristic functions will be denoted by
$$\psi_{H'} = \psi^0_{H'}(x,y,z)\,e^{-i2\pi H't/h},$$
etc. A similar notation will be used for the characteristic values $K'$
and functions $\phi_{K'} = \phi^0_{K'}(x,y,z)\,e^{-i2\pi K't/h}$ of the operator $K$.
If there is no degeneracy, the functions $\psi^0_{H'}$ will be completely
specified by the attached value of the operator to which they belong. In
case of degeneracy we must add to the energy operator one or two
other operators, representing independent constants of the motion, for
example the $z$-component of the angular momentum $M_z$ and its square
$M^2$ if the potential energy $U$ depends upon the distance $r$ alone (central
field of force). To avoid unnecessary complication, we shall in such
cases understand by $H$ the set of all these three mutually commutable
operators $H_1, H_2, H_3$, and by $H'$ a set of their characteristic values
$H'_1, H'_2, H'_3$ corresponding to the same function $\psi^0_{H'}$ (in
the sense of the simultaneous validity of all the three equations
$H_1\psi^0_{H'} = H'_1\psi^0_{H'}$, $H_2\psi^0_{H'} = H'_2\psi^0_{H'}$, $H_3\psi^0_{H'} = H'_3\psi^0_{H'}$, which we shall write
as a single equation $H\psi^0_{H'} = H'\psi^0_{H'}$). The same remark applies to the
operator $K$, its characteristic values $K'$, and its characteristic
functions $\phi_{K'}$.
In addition, let us consider some quantity represented by an operator
$F$ and let us introduce its matrix representation with the help of the
functions $\psi_{H'}$ on the one hand and of the functions $\phi_{K'}$ on the other.
We shall thus get two different matrices which we shall denote by
$F_H$ and $F_K$ respectively and refer to as the matrix of $F$ ‘from the
point of view’ of $H$ and the matrix of $F$ from the point of view of $K$.
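The components (or elements) of these matrices will be denoted by
$F_{H'H''}$ and $F_{K'K''}$ ($F^0_{H'H''}$, $F^0_{K'K''}$). We shall thus have
$$\left.\begin{aligned}
F_{H'H''} &= \int \psi^*_{H'}\,F\,\psi_{H''}\,dV, & F^0_{H'H''} &= \int \psi^{0*}_{H'}\,F\,\psi^0_{H''}\,dV,\\
F_{K'K''} &= \int \phi^*_{K'}\,F\,\phi_{K''}\,dV, & F^0_{K'K''} &= \int \phi^{0*}_{K'}\,F\,\phi^0_{K''}\,dV,
\end{aligned}\right\} \tag{109}$$
with
$$F_{H'H''} = F^0_{H'H''}\,e^{i2\pi(H'-H'')t/h}, \qquad F_{K'K''} = F^0_{K'K''}\,e^{i2\pi(K'-K'')t/h}. \tag{109 a}$$
In particular we shall have
$$H_{H'H''} = H'\,\delta_{H'H''}, \qquad K_{K'K''} = K'\,\delta_{K'K''}, \tag{109 b}$$
since $H$ and $K$ are diagonal matrices from their own point of view,
the elements of these matrices being identical with the respective
characteristic values.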
The transformation theory in its simplest form consists in the
establishment of a certain connexion between the two ‘points of view’, i.e. of
certain relations between the functions $\psi_{H'}$ and the functions $\phi_{K'}$, as
well as between the matrices $F_H$ and $F_K$. With the help of equations
(109), the second part of this problem can be reduced to the first.
However, we shall see later that it can be solved independently without
the use of the functions $\psi$ and $\phi$, on the basis of the conditions (109 b).
The fundamental assumption of the transformation theory is that
the amplitude functions $\phi^0_{K'}(x,y,z)$ can be expressed as linear
combinations of the amplitude functions $\psi^0_{H'}(x,y,z)$ according to the equation
$$\phi^0_{K'} = \sum_{H'} a_{H'K'}\,\psi^0_{H'} \tag{110}$$
with constant coefficients $a_{H'K'}$. We shall not try to justify this
assumption on formal grounds for the general case of any operators $H$ and $K$
but shall be content with the following remarks.
(a) The assumption (110) leads to an unambiguous determination
of the expansion coefficients $a_{H'K'}$. Indeed, multiplying (110) by the
conjugate complex of one of the functions $\psi^0_{H'}$ and supposing the
different functions $\psi^0_{H'}$ to be normalized and orthogonal to each
other (which we can always do), we get upon integration
$$a_{H'K'} = \int \psi^{0*}_{H'}\,\phi^0_{K'}\,dV. \tag{110 a}$$
It is clear from this that equation (110) can hold only when the
summation is extended over all the values of $H'$, i.e. over all the stationary
states defined by the operator $H$ (and those representing other
independent constants of the motion, if there is degeneracy).
(b) For our assumption to be justified it is necessary and sufficient
that the series (110) with the coefficients determined according to
(110 a) should be convergent.
We shall argue in future as if this convergence condition were
satisfied. It can be shown to be actually satisfied in most cases of
practical importance corresponding to a small difference between $K$
and $H$ due to some weak ‘perturbing’ forces. In this particular case
the transformation theory we are developing reduces to the so-called
perturbation theory.
If the transformation (110) holds, then the reciprocal transformation
$$\psi^0_{H'} = \sum_{K'} a^{-1}_{K'H'}\,\phi^0_{K'} \tag{111}$$
must also hold with the coefficients
$$a^{-1}_{K'H'} = \int \phi^{0*}_{K'}\,\psi^0_{H'}\,dV. \tag{111 a}$$
Comparing this with (110 a), we get the relation
$$a^{-1}_{K'H'} = a^*_{H'K'}. \tag{112}$$
On substituting the expressions (111) in (110) or (110) in (111), we
get, in the first case,
$$\phi^0_{K'} = \sum_{H'} a_{H'K'} \sum_{K''} a^{-1}_{K''H'}\,\phi^0_{K''} = \sum_{K''} \Bigl(\sum_{H'} a^{-1}_{K''H'}\,a_{H'K'}\Bigr)\phi^0_{K''},$$
i.e.
$$\sum_{H'} a^{-1}_{K''H'}\,a_{H'K'} = \delta_{K'K''}, \tag{112 a}$$
and in the second case
$$\sum_{K'} a_{H'K'}\,a^{-1}_{K'H''} = \delta_{H'H''}. \tag{112 b}$$
Replacing $a^{-1}_{K'H'}$ by $a^*_{H'K'}$ according to (112), we obtain the relations
$$\sum_{H'} a_{H'K'}\,a^*_{H'K''} = \delta_{K'K''}, \tag{113}$$
$$\sum_{K'} a_{H''K'}\,a^*_{H'K'} = \delta_{H'H''}, \tag{113 a}$$
which express the orthogonality and normalization of the coefficients
$a_{H'K'}$ (or $a^{-1}_{K'H'}$).
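The relations (110 a), (111 a), (112), (113) and (113 a) can be checked numerically on a small model. The sketch below is only an illustration under stated assumptions: the two-component complex vectors `psi` and `phi`, the angles `theta` and `alpha`, and the helper `inner` (a finite sum standing in for the integral over dV) are hypothetical and are not taken from the text.

```python
import cmath
import math

def inner(u, v):
    """Finite-sum stand-in for the integral of u* v dV."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

# Hypothetical orthonormal sets: psi[i] plays the part of psi0_{H'},
# phi[j] the part of phi0_{K'}; theta and alpha are arbitrary.
theta, alpha = 0.7, 0.3
psi = [[1.0, 0.0], [0.0, 1.0]]
phi = [
    [math.cos(theta), cmath.exp(1j * alpha) * math.sin(theta)],
    [-cmath.exp(-1j * alpha) * math.sin(theta), math.cos(theta)],
]
n = len(psi)

# (110a): a[H][K] = integral of psi_H* phi_K.
a = [[inner(psi[i], phi[j]) for j in range(n)] for i in range(n)]
# (111a): a_inv[K][H] = integral of phi_K* psi_H.
a_inv = [[inner(phi[j], psi[i]) for i in range(n)] for j in range(n)]

def delta(i, j):
    return 1.0 if i == j else 0.0

# (112): a_inv[K][H] equals the conjugate complex of a[H][K].
err_112 = max(abs(a_inv[j][i] - a[i][j].conjugate())
              for i in range(n) for j in range(n))

# (113): sum over H of a[H][K] * conj(a[H][K2]) = delta_{K K2}.
err_113 = max(
    abs(sum(a[i][k] * a[i][k2].conjugate() for i in range(n)) - delta(k, k2))
    for k in range(n) for k2 in range(n))

# (113a): sum over K of a[H2][K] * conj(a[H][K]) = delta_{H H2}.
err_113a = max(
    abs(sum(a[h2][k] * a[h][k].conjugate() for k in range(n)) - delta(h, h2))
    for h in range(n) for h2 in range(n))

print(err_112, err_113, err_113a)
```

All three printed deviations should vanish to within rounding error. The same check would fail if the sum over H' in (113) were not extended over the complete set, which is the point of remark (a).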
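Another (equivalent) form of these relations is obtained by
multiplying $\phi^0_{K'}$ in (110) by its conjugate complex and summing over $K'$.
This gives
$$\sum_{K'} \phi^0_{K'}\,\phi^{0*}_{K'} = \sum_{K'}\sum_{H'}\sum_{H''} a_{H'K'}\,a^*_{H''K'}\,\psi^0_{H'}\,\psi^{0*}_{H''},$$
i.e. according to (113 a),
$$\sum_{K'} \phi^0_{K'}\,\phi^{0*}_{K'} = \sum_{H'} \psi^0_{H'}\,\psi^{0*}_{H'}. \tag{113 b}$$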
Before proceeding further in the formal development of the theory,
we shall examine the physical meaning of the assumption implied by
the transformation equations (110) and (111).
It should be noticed first of all that the latter have an external
resemblance to the representation of the general solution of the wave
equation $\left(H + \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\psi = 0$ in the form of a sum of its particular
solutions, i.e. to the equation
$$\psi = \sum_{H'} C_{H'}\,\psi_{H'} = \sum_{H'} C_{H'}\,\psi^0_{H'}\,e^{-i2\pi H't/h}. \tag{113 c}$$
The fundamental difference between the two cases is that the time $t$
enters as an essential factor in equation (113 c), while the transformation
equations (110) or (111) do not contain it at all. If, however, we put
in (113 c) $t = 0$ or $t = t_0$, i.e. consider the function $\psi$ at a definite instant
of time, we see that by a suitable choice of the amplitude coefficients
$C_{H'}$ it can be made to coincide with any one of the amplitude functions
$\phi^0_{K'}$, so far as the latter are actually expressible by a series of the type
(110). The physical meaning of the assumption implied in formula (110)
is that any stationary state defined by the operator $K$, according to
the equation $K\phi_{K'} = K'\phi_{K'}$, can be represented as a superposition of the
alternative states defined by the operator $H$ (according to $H\psi_{H'} = H'\psi_{H'}$)
at a certain instant of time. Such a coincidence, even if achieved at
a definite instant $t = t_0$, will, however, not persist unless the coefficients
$C_{H'}$ are allowed to vary with the time in an adequate manner. In this
case the function $\psi$ defined by (113 c) will no longer represent a general
solution of the equation $\left(H + \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\psi = 0$; it seems, however, natural
to suppose that, with a suitable definition of the functions $C_{H'}(t)$,
it will represent the general or a particular solution of the equation
$\left(K + \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\psi = 0$.
The latter assumption reduces to the equation
$$\phi_{K'} = \sum_{H'} C_{H'K'}(t)\,\psi_{H'}, \tag{113 d}$$
or
$$\phi^0_{K'}\,e^{-i2\pi K't/h} = \sum_{H'} C_{H'K'}(t)\,\psi^0_{H'}\,e^{-i2\pi H't/h},$$
which becomes identical with (110) if we put
$$C_{H'K'}(t) = a_{H'K'}\,e^{i2\pi(H'-K')t/h}. \tag{113 e}$$
In the same way we can replace the equations of the reciprocal
transformation (111) by
$$\psi_{H'} = \sum_{K'} C^{-1}_{K'H'}(t)\,\phi_{K'} \tag{114}$$
with
$$C^{-1}_{K'H'}(t) = a^{-1}_{K'H'}\,e^{i2\pi(K'-H')t/h}. \tag{114 a}$$
We thus see that our fundamental assumption as to the existence of
a linear relation (110) or (111) between the amplitude functions $\phi^0_{K'}$ and
$\psi^0_{H'}$ is equivalent to the assumption that the same motion, whether it
be determined by an energy operator $H$ or $K$, can be described from
the point of view of the other operator, in the sense that a stationary
state of the set determined by $K$ (or $H$) can be represented as a
superposition of stationary states determined by $H$ (or $K$) with variable
amplitude coefficients $C_{H'K'}$ (or $C^{-1}_{K'H'}$).
If the latter were constant, then (113 c) would represent some general
solution of the equation $\left(H + \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\psi = 0$ corresponding to the
possibility of finding the particle in one of the alternative (mutually
excluding) states of motion defined by the different functions $\psi_{H'}$.
The coefficients $C_{H'K'}$, provided they satisfy the normalizing relation
$\sum_{H'} |C_{H'K'}|^2 = 1$, would in this case represent the ‘probability
amplitudes’ of the different alternative states $\psi_{H'}$, the probability of these
states being equal to the square of the moduli of $C_{H'K'}$.
It is natural to preserve this interpretation in the present case when
the $C_{H'K'}$ are functions of the time defined by (113 e). This dependence
upon the time does not affect their moduli, which remain constant and
equal to the moduli of the transformation coefficients $a_{H'K'}$, the
normalization condition $\sum_{H'} |C_{H'K'}|^2 = 1$ being satisfied in virtue of the
relations (113) (with $K'' = K'$).
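The statement that the time-dependent coefficients a_{H'K'} exp[i2π(H'−K')t/h] reproduce the stationary states of K at every instant, while keeping constant moduli, can be tried out on a two-state model. This is a sketch under stated assumptions: the characteristic values `H_vals` and `K_vals`, the angle `theta`, the unit `h = 1`, and the two-point "space" are all hypothetical choices made for illustration.

```python
import cmath
import math

h = 1.0                  # Planck's constant in arbitrary units (assumption)
H_vals = [1.0, 2.5]      # hypothetical characteristic values H'
K_vals = [1.4, 2.1]      # hypothetical characteristic values K'
theta = 0.7

# Hypothetical unitary coefficient matrix a[H'][K'] (a real rotation).
a = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]

# Amplitude functions on a two-point "space": psi0[H'] are unit vectors,
# phi0[K'] = sum over H' of a[H'][K'] psi0[H'], as in (110).
psi0 = [[1.0, 0.0], [0.0, 1.0]]
phi0 = [[sum(a[i][k] * psi0[i][x] for i in range(2)) for x in range(2)]
        for k in range(2)]

def C(i, k, t):
    """Time-dependent coefficient: a[H'][K'] exp(i 2 pi (H' - K') t / h)."""
    return a[i][k] * cmath.exp(2j * math.pi * (H_vals[i] - K_vals[k]) * t / h)

def psi(i, x, t):
    return psi0[i][x] * cmath.exp(-2j * math.pi * H_vals[i] * t / h)

def phi(k, x, t):
    return phi0[k][x] * cmath.exp(-2j * math.pi * K_vals[k] * t / h)

max_expansion_error = 0.0
max_norm_error = 0.0
for t in (0.0, 0.37, 1.9):
    for k in range(2):
        # phi_K'(t) = sum over H' of C_{H'K'}(t) psi_H'(t) at every point x.
        for x in range(2):
            total = sum(C(i, k, t) * psi(i, x, t) for i in range(2))
            max_expansion_error = max(max_expansion_error,
                                      abs(total - phi(k, x, t)))
        # Normalization: sum over H' of |C_{H'K'}(t)|^2 = 1 at all times.
        norm = sum(abs(C(i, k, t)) ** 2 for i in range(2))
        max_norm_error = max(max_norm_error, abs(norm - 1.0))

print(max_expansion_error, max_norm_error)
```

Both deviations should be zero to rounding accuracy for any choice of the instants `t`, since the time enters the coefficients only through a phase factor of modulus one.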
In defining the quantities $|C_{H'K'}|^2$ or $|a_{H'K'}|^2$ as the probabilities of
the different states of the $H$-set, we must not forget that all these states
are associated with a definite $K$-state, as indicated by the second
subscript in $a_{H'K'}$. The quantity $|a_{H'K'}|^2$ is not to be regarded as the
probability of the state $H'$ per se irrespective of any accessory
conditions (for such unconditioned probability has no definite value) but
as the probability of the state $H'$ subject to the accessory condition
that the particle is actually in a state of motion specified by the value $K'$
of $K$ or by the function $\phi_{K'}$.
Instead of talking of the states as described by the wave functions
$\phi_{K'}$ or $\psi_{H'}$, it is often more convenient to speak of the values of certain
quantities $F$, $H$, $K$ associated with these states. The fact that a definite
state is actually realized can be expressed by saying that the probability
of this state is equal to unity. We can thus say that $|a_{H'K'}|^2$ is the
probability that the quantity $H$ has the value $H'$ if it is known (with
a probability amounting to certainty, i.e. equal to unity) that the
quantity $K$ has the value $K'$.
It is perfectly natural that the determination of the probability of
a certain value of some quantity, e.g. $H$, must imply an assumption
about the probability of a given value of some other quantity $K$, for
the probability theory does not create probabilities, but only correlates
them.
From the relations (112), it follows that $|a_{H'K'}|^2 = |a^{-1}_{K'H'}|^2$. This
equation can be interpreted from the probability point of view as the
expression of the ‘reciprocity law’, which means that the probability
of $H$ having the value $H'$ when $K$ is known to have the value $K'$ is
equal to the probability of $K$ having the value $K'$ when $H$ is known
to have the value $H'$.
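The reciprocity law can be read off from a table of conditional probabilities. The following sketch assumes a hypothetical 2×2 unitary matrix (the parameters `theta` and `alpha` are arbitrary and not from the text) and forms the probabilities of H' given K' and of K' given H' from the coefficients and from the elements of the reciprocal matrix respectively.

```python
import cmath
import math

# Hypothetical 2x2 unitary matrix a[H'][K']; theta and alpha are arbitrary.
theta, alpha = 0.7, 0.3
a = [[complex(math.cos(theta)), -cmath.exp(-1j * alpha) * math.sin(theta)],
     [cmath.exp(1j * alpha) * math.sin(theta), complex(math.cos(theta))]]
n = len(a)

# (112): elements of the reciprocal matrix, labelled in the order (K', H').
a_inv = [[a[i][k].conjugate() for i in range(n)] for k in range(n)]

# Probability of H' when K' is known, and of K' when H' is known.
p_H_given_K = [[abs(a[i][k]) ** 2 for k in range(n)] for i in range(n)]
p_K_given_H = [[abs(a_inv[k][i]) ** 2 for i in range(n)] for k in range(n)]

# Reciprocity law: the two conditional probabilities coincide.
reciprocity_gap = max(abs(p_H_given_K[i][k] - p_K_given_H[k][i])
                      for i in range(n) for k in range(n))

# Each conditional distribution is normalized to unity.
sums_over_H = [sum(p_H_given_K[i][k] for i in range(n)) for k in range(n)]
sums_over_K = [sum(p_K_given_H[k][i] for k in range(n)) for i in range(n)]

print(reciprocity_gap, sums_over_H, sums_over_K)
```

The normalization of the first table over H' follows from (113) with K'' = K', and that of the second over K' from (113 a) with H'' = H'.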
This feature of the coefficients $a_{H'K'}$ reveals a close similarity between
them and the amplitude functions $\psi^0_{H'}$ (or $\phi^0_{K'}$). As a matter of fact,
the latter also depend upon two arguments, or sets of arguments: one
of them, $x, y, z$, specifying the position and the other, $H'$ (or $H'_1, H'_2, H'_3$),
the energy and some quantities commuting with it (i.e. representing
constants of the motion defined by the energy operator $H$). Further,
the function $|\psi^0_{H'}(x,y,z)|^2$, or more exactly its product with the
volume-element $dV$, does not determine the probability of a position specified
by $dV$ irrespective of any other circumstances, but subject to the
explicitly stated condition that $H$ is known to have the value $H'$. To
give an adequate formal expression to this analogy between the
coefficients $a_{H'K'}$, $a^{-1}_{K'H'}$ on the one hand, and the functions $\psi^0_{H'}(x,y,z)$,
$\phi^0_{K'}(x,y,z)$ on the other, we shall introduce for the latter the following
notation:
$$\psi^0_{H'}(x',y',z') = \psi^0_{x'H'}, \qquad \phi^0_{K'}(x',y',z') = \phi^0_{x'K'}, \tag{115}$$
using $x'$ to represent a set of values of the three coordinates $x, y, z$ in
the same way as $H'$ or $K'$ is used to represent a set of values of the
three quantities $H_1, H_2, H_3$ or $K_1, K_2, K_3$.
The analogy between the functions $\psi^0_{x'H'}$, $\phi^0_{x'K'}$ and the coefficients $a_{H'K'}$
or $a^{-1}_{K'H'}$ seems to indicate that a set of values of the coordinates $x$ $(x,y,z)$
can specify a ‘state’ of the particle just as well as a set of characteristic
values of any other three mutually commuting operators $H_1, H_2, H_3$ or
$K_1, K_2, K_3$. We are thus led, in a very natural manner, to revise the
conception of a ‘state’ or ‘stationary state’ which we have been using
hitherto, in the sense that it is not determined by a function $\psi^0_{x'H'}$ or
$\phi^0_{x'K'}$ which refers to two states of two different sets like the
transformation coefficients (or probability amplitudes) $a^{-1}_{K'H'}$ and $a_{H'K'}$, but
simply by the values of three quantities (corresponding to the three
degrees of freedom) which are represented by three independent mutually
commuting operators such as the three spatial coordinates of the particle,
or its energy, $z$-component of the angular momentum, and square of the
latter (in the case of a motion in a central field of force), and so on.
A ‘state’ defined in this more general way must no longer be necessarily
associated with the idea of motion. As a matter of fact the idea of
motion, in the sense of a change of the position with the time, has no
meaning in wave mechanics, being replaced by the idea of the
probability of finding the particle in a given position when its energy and
two other quantities commuting with the energy have given values.
The functions $\psi^0_{x'H'}$ do not have to be associated with motion any more
than the coefficients $a_{H'K'}$. They are to be interpreted simply as the
probability amplitudes for a state defined by the position $x'$ (or
volume-element $dV'$) subject to the condition that $H = H'$, just as the
coefficients $a^{-1}_{K'H'}$ determine the probability of the value $K'$ of $K$ if $H$ is
known to have the value $H'$.
It should be remarked that in all these considerations the time does
not play any role whatever so long as it does not appear explicitly in
$H$ or in the other operators concerned.
We are thus driven by the inner logic of the ideas embodied in the
wave-mechanical theory to consider it as a special case of a general
physical theory (let us call it quantum mechanics) whose problem
consists in determining the probability of a certain value of some
quantity or of a set of quantities when a set of some other
quantities is assumed to have given values. This general problem reduces to
the usual wave-mechanical problem when the first three quantities
are the coordinates of the particle, and the second three are its energy
and some other two quantities which are represented by operators
commuting with the energy operator.
The condition that the three quantities of each set (those whose
values are supposed to be known or those for which the probability of
certain values is being determined) should be represented by mutually
commuting operators seems to be essential for the problem to have a
physical meaning. It is customary to express the possibility of fixing
simultaneously the value of two or more quantities by saying that they
can be simultaneously observed or measured; this can be regarded as
the experimental equivalent for the mathematical idea of ‘mutual
commutability’, connected with the operator or the matrix representation
of the quantity in question. I should like, however, to warn the reader
against the conclusion, often implied in the above expression, that in
discussing elementary phenomena, we must keep in mind the observer
or experimenter as an essential part of these phenomena, supposed to
be responsible through his interference with them for the
indeterminateness by which they are characterized, and which, as a matter of fact,
is only revealed and not produced by his observations.
This indeterminateness constitutes the characteristic feature of the
new quantum or wave mechanics, which distinguishes it from classical
mechanics. In the case of a particle moving in a given field of force
with three degrees of freedom, the classical mechanics assumed the
possibility of fixing simultaneously the values of six quantities, for
instance the three coordinates $x, y, z$ and the three components of the
momentum $g_x, g_y, g_z$ (or the energy $H$, the $z$-component of the angular
momentum $M_z$, and the square of the latter $M^2$), whereby the motion
was completely determined, while the wave or quantum mechanics is
less ambitious and restricts the number of quantities whose values can
be fixed (arbitrarily, or by observation) to three, making up for the
resulting incompleteness or indeterminateness in the description of the motion
by probability considerations as to some other set of three quantities.
Another distinction between classical and quantum mechanics which
must be borne in mind refers to the role played by the time. In the
former case this role seems to be much more fundamental and important
than in the second. As a matter of fact, the time seems to have been
completely eliminated from the scope of the quantum mechanics as it
has been specified above. This is, however, not quite true. First of all
the time enters implicitly in the definition of such quantities as the
components of velocity (or momentum) and various functions of them
(such as energy, etc.), although these quantities are represented by
operators which do not contain the time explicitly. And secondly we
have supposed from the very beginning of this section that the potential
energy of the field of force in which the particle is supposed to move
does not contain the time explicitly, i.e. it depends upon the coordinates
alone. It is only subject to this condition that the time can be practically
eliminated from the theory; it becomes, however, a vital element of
the latter when the potential energy is a function not only of the
coordinates but also of the time. In this case Schrödinger’s equation
$\left(H + \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\psi = 0$ does not have particular solutions of the form
$\psi = \psi^0_{H'}(x,y,z)\,e^{-i2\pi H't/h}$ with $\psi^0_{H'}(x,y,z)$ satisfying the equation
$H\psi^0_{H'} = H'\psi^0_{H'}$.
Characteristic values of the energy do not exist, or putting it in another
way, values of the energy, if it is not a constant of the motion, cannot
be measured, and the question of determining the probability of an
arbitrarily chosen position $x'$ $(x', y', z')$ for a given (supposedly known)
value of the energy becomes meaningless.
We shall now come back to our original assumption, that neither $H$ nor
$K$ contain the time explicitly and that they possess a discrete set of
characteristic values $H'$ $(H'_1, H'_2, H'_3)$ and $K'$ $(K'_1, K'_2, K'_3)$ which determine
two discrete sets of ‘states’. We have been led to the conclusion that the
coordinates of the particle can be used for the definition of a third set of
states, specified merely by the position of the particle in space. Since any
values of the coordinates $x'$ $(x', y', z')$ are possible, these values can be
regarded as constituting a ‘continuous spectrum’. This distinction between
$H$ and $K$ on the one hand, and $x$ on the other hand is reflected in the fact
that in determining the probabilities we must speak of definite values of $H$
and $K$ and of a definite range of the values of $x$, i.e. of a volume-element
$dV$ in which the particle is supposed to be situated. We thus have the
expressions: $|a_{H'K'}|^2$ for the probability of $H = H'$ if it is known that
$K = K'$, or of $K = K'$ if it is known that $H = H'$; $|\psi^0_{x'H'}|^2\,dV'$ for the
probability that $x$ is enclosed in the range $(x', x'+dx')$ if it is known
that $H = H'$ ($dV' = dx'\,dy'\,dz'$); $|\phi^0_{x'K'}|^2\,dV'$ for the probability that $x$ is
enclosed in the range $(x', x'+dx')$ if it is known that $K = K'$.
Generalizing the reciprocity law which has been established in the
case of $a_{H'K'}$, we can define $|\psi^0_{x'H'}|^2\,dV'$ and $|\phi^0_{x'K'}|^2\,dV'$ as the
probabilities of $H = H'$ or $K = K'$ when it is known that the particle is
located in the volume-element $dV'$.
The similarity between the functions $\psi^0_{x'H'}$ or $\phi^0_{x'K'}$ and the coefficients
$a^{-1}_{K'H'}$ or $a_{H'K'}$ is revealed also by the fact that they satisfy similar
orthogonality and normalizing relations, which in the former case are
expressed either by means of integrals (over $x'$) instead of sums (over
$H'$ or $K'$) or by functions $\delta(x'-x'')$ instead of $\delta_{H'H''}$ or $\delta_{K'K''}$,
corresponding to the fact that $H'$ and $K'$ form a discrete and $x'$ a continuous
set of values. We have, namely, the relation (113 a), which can be
written in the form
$$\sum_{K'} a^*_{H'K'}\,a_{H''K'} = \delta_{H'H''},$$
and to which there correspond the usual orthogonality and normalizing
relations for the ‘wave function’ $\psi$,
$$\int \psi^{0*}_{x'H'}\,\psi^0_{x'H''}\,dx' = \delta_{H'H''}. \tag{116}$$
Besides the preceding relation, the coefficients $a$ also satisfy the
‘reciprocal’ relation (113) or
$$\sum_{H'} a_{H'K'}\,a^*_{H'K''} = \delta_{K'K''},$$
to which an analogue is found in the relation
$$\sum_{H'} \psi^0_{x'H'}\,\psi^{0*}_{x''H'} = \delta(x'-x''), \tag{116 a}$$
where $\delta(x'-x'')$ is an abbreviation for the product of the three Dirac
functions $\delta(x'-x'')$, $\delta(y'-y'')$, $\delta(z'-z'')$ (just as $\delta_{K'K''}$ is actually an
abbreviation for the product of the three expressions of this type for
the three quantities implied in $K$).
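The pair of relations just discussed, orthonormality of the wave functions with respect to integration over x' and the reciprocal relation obtained by summing over H', can be illustrated on a finite grid, where the Dirac function is replaced by a Kronecker symbol. The sketch below is an assumption-laden illustration: the grid size `N` and the discrete sine functions used for the amplitudes are hypothetical choices, picked only because they form an exactly orthonormal and complete set on N points.

```python
import math

N = 6  # number of grid points x' = 1..N (hypothetical discretization)

def psi(x, Hn):
    """Real amplitude psi0_{x'H'} on the grid: discrete sine functions."""
    return math.sqrt(2.0 / (N + 1)) * math.sin(math.pi * Hn * x / (N + 1))

grid = range(1, N + 1)     # values of x'
states = range(1, N + 1)   # labels of the states H'

def delta(i, j):
    return 1.0 if i == j else 0.0

# Analogue of (116): sum over x' of psi*_{x'H'} psi_{x'H''} = delta_{H'H''}.
err_116 = max(
    abs(sum(psi(x, h1) * psi(x, h2) for x in grid) - delta(h1, h2))
    for h1 in states for h2 in states)

# Analogue of (116a): sum over H' of psi_{x'H'} psi*_{x''H'} = delta_{x'x''}.
err_116a = max(
    abs(sum(psi(x1, hn) * psi(x2, hn) for hn in states) - delta(x1, x2))
    for x1 in grid for x2 in grid)

print(err_116, err_116a)
```

The second relation holds only because the sum runs over all N states; dropping any one of them would leave a matrix that is no longer the unit matrix on the grid, which is the discrete counterpart of the completeness expressed by the δ-function.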
The proof of the relations (116 a) [i.e. of their equivalence to (116)]
is obtained by multiplying them by $\psi^0_{x''H''}$, where $H''$ is any fixed value
of $H$, and integrating over $x''$. This gives, in view of (116),
$$\int \sum_{H'} \psi^0_{x'H'}\,\psi^{0*}_{x''H'}\,\psi^0_{x''H''}\,dx'' = \sum_{H'} \psi^0_{x'H'} \int \psi^{0*}_{x''H'}\,\psi^0_{x''H''}\,dx'' = \sum_{H'} \psi^0_{x'H'}\,\delta_{H'H''} = \psi^0_{x'H''},$$
which, according to the definition of the function $\delta(x''-x')$, agrees with
$\int \psi^0_{x''H''}\,\delta(x''-x')\,dx''$. The remaining difference between the probability
amplitudes $a_{H'K'}$ and $\psi^0_{x'H'}$ vanishes if we abandon our initial
assumption as to the discreteness of the spectrum of $H$ and $K$ and suppose
that one of these quantities, e.g. $H$, has a continuous spectrum, being
in this respect equivalent to $x$ (the spectrum of $K$ will be assumed for
a while to remain discrete).
The transformation equations (110) which, with our new notation,
could be written in the form
$$\phi^0_{x'K'} = \sum_{H'} \psi^0_{x'H'}\,a_{H'K'},$$
must now be replaced by†
$$\phi^0_{x'K'} = \int \psi^0_{x'H'}\,a_{H'K'}\,dH'. \tag{117}$$
Multiplying this equation by $\psi^{0*}_{x'H''}$ and integrating over $x'$ $(x', y', z')$
($dx' = dV'$), we get
$$\int \psi^{0*}_{x'H''}\,\phi^0_{x'K'}\,dx' = \int a_{H'K'}\,dH' \int \psi^{0*}_{x'H''}\,\psi^0_{x'H'}\,dx' = \int a_{H'K'}\,\delta(H'-H'')\,dH',$$
that is
$$a_{H''K'} = \int \psi^{0*}_{x'H''}\,\phi^0_{x'K'}\,dx'$$
as before.‡ Since the form of the reciprocal transformation
$$\psi^0_{x'H'} = \sum_{K'} a^{-1}_{K'H'}\,\phi^0_{x'K'} \tag{117 a}$$
remains unchanged (so long as $K$ is supposed to have a discrete
spectrum), we get the previous relation between the coefficients $a$ and $a^{-1}$,
namely $a^{-1}_{K'H'} = a^*_{H'K'}$, leading to the reciprocity law $|a^{-1}_{K'H'}|^2 = |a_{H'K'}|^2$.
† This transition is quite similar to a transition from a Fourier series to a Fourier
integral, which as a matter of fact forms a special case of the transformation or
‘expansion’ (117) and (117 a).
‡ It should be noticed that the former coefficient $a_{H'K'}$ actually corresponds to the
product of the present coefficient with $dH'$, this difference being compensated for by
the difference between the previous and the present form of the orthogonality and
normalizing relation for $\psi^0_{x'H'}$.
Substituting the expression (117 a) in (117), we get
$$\phi^0_{x'K'} = \sum_{K''} \phi^0_{x'K''} \int a^{-1}_{K''H'}\,a_{H'K'}\,dH',$$
whence it follows that
$$\int a^{-1}_{K''H'}\,a_{H'K'}\,dH' = \delta_{K'K''},$$
or
$$\int a^*_{H'K''}\,a_{H'K'}\,dH' = \delta_{K'K''}. \tag{118}$$
This orthogonality-normalizing relation, which replaces (113), is
identical with the corresponding relation for the function $\psi^0$, $x'$ being
replaced by $K'$ [cf. (116)]. In a similar way (through substitution of
(117) in the reciprocal expansion (117 a)) we find the relation
$$\sum_{K'} a_{H'K'}\,a^*_{H''K'} = \delta(H'-H''), \tag{118 a}$$
which is the complete analogue of (116 a) with $x'$ replaced by $H'$ and
$H'$ by $K'$.
If both $H$ and $K$ have a continuous spectrum, the relations (118) and
(118 a), as well as (116) and (116 a), are replaced by relations of the form
$$\int a^*_{H'K''}\,a_{H'K'}\,dH' = \delta(K''-K'),$$
$$\int a_{H''K'}\,a^*_{H'K'}\,dK' = \delta(H''-H'),$$
etc., all the sums being replaced by integrals and all the $\delta_{K'K''}$-numbers
by $\delta(K'-K'')$-functions. All the transformation or expansion formulae
acquire in this case the same form (117 a).
From the complete analogy between $a_{H'K'}$ and $\psi^0_{x'H'}$ or $\phi^0_{x'K'}$, it follows
in particular that we must have, in addition to the equations
$$\phi^0_{x'K'} = \int \psi^0_{x'H'}\,a_{H'K'}\,dH', \qquad \psi^0_{x'H'} = \int \phi^0_{x'K'}\,a^{-1}_{K'H'}\,dK', \tag{119}$$
the equation
$$a_{H'K'} = \int \psi^{0*}_{x'H'}\,\phi^0_{x'K'}\,dx'. \tag{119 a}$$
In fact, equation (119 a) is nothing else but the
expression (110 a) for the coefficients $a_{H'K'}$. We can thus consider this
equation as a ‘transformation’ between the functions $a_{H'K'}$ and $\phi^0_{x'K'}$,
with $\psi^{0*}_{x'H'}$ playing the role of the transformation coefficients, or as a
transformation between the functions $a_{H'K'}$ and $\psi^{0*}_{x'H'}$, the role of the
transformation coefficients being played in this case by $\phi^0_{x'K'}$.
It should be mentioned that (119 a) still holds when $H$ and $K$ have
discrete spectra, equation (119) being replaced by (117) and its
reciprocal
$$\psi^0_{x'H'} = \sum_{K'} \phi^0_{x'K'}\,a^{-1}_{K'H'}. \tag{119 b}$$
After we have thus settled the physical meaning of the
‘transformation coefficients’ or ‘wave functions’ as the probability amplitudes for
the values of one of the quantities concerned when the value of the
other is supposed to be fixed, we obtain an extremely simple and
illuminating interpretation of the various ‘transformation equations’
connecting these probability amplitudes. All these equations can be
considered, namely, as the expression or rather the direct consequence of
the addition and multiplication law of the new probability theory (which
deals with the probability amplitudes in the same way as the old theory
dealt with the probabilities themselves).
Taking the last equation, for example, we see that the product
$\phi^0_{x'K'}\,a^{-1}_{K'H'}$ can be interpreted as the probability amplitude that $x$ will
be equal to $x'$ if $K = K'$ and that at the same time $K$ will have the
value $K'$ if $H$ is known to be equal to $H'$. Keeping the latter value
as well as that of $x$ fixed, and summing the products $\phi^0_{x'K'}\,a^{-1}_{K'H'}$ for
all possible values of $K'$, we must obviously obtain the probability
amplitude of $x = x'$ subject to the assumption that $H = H'$, in
agreement with (119 b).
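The addition and multiplication law for amplitudes can be tried out on a finite grid. The following is a minimal sketch under stated assumptions: the four-point grid, the normalized Hadamard vectors taken for the amplitudes of the H-states, and the discrete sine functions taken for those of the K-states are hypothetical stand-ins chosen only because both sets are orthonormal and complete on four points.

```python
import math

N = 4
had = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]

def psi(x, i):
    """Hypothetical psi0_{x'H'}: normalized Hadamard vectors."""
    return had[i][x] / 2.0

def phi(x, k):
    """Hypothetical phi0_{x'K'}: discrete sine functions."""
    return math.sqrt(2.0 / (N + 1)) * math.sin(
        math.pi * (k + 1) * (x + 1) / (N + 1))

# Reciprocal coefficients: a_inv[K'][H'] = sum over x' of phi*_{x'K'} psi_{x'H'}
# (all quantities are real here, so no conjugation is needed).
a_inv = [[sum(phi(x, k) * psi(x, i) for x in range(N)) for i in range(N)]
         for k in range(N)]

# Amplitude law: psi_{x'H'} = sum over K' of phi_{x'K'} a_inv[K'][H'].
max_err = max(
    abs(sum(phi(x, k) * a_inv[k][i] for k in range(N)) - psi(x, i))
    for x in range(N) for i in range(N))

# The amplitudes are added first and squared afterwards; summing the squared
# products instead gives, in general, a different number.
x, i = 0, 0
prob_from_amplitudes = psi(x, i) ** 2
sum_of_probabilities = sum((phi(x, k) * a_inv[k][i]) ** 2 for k in range(N))

print(max_err, prob_from_amplitudes, sum_of_probabilities)
```

The last two printed numbers are included only for comparison: the first is the probability obtained by adding amplitudes, the second what the old rule of adding probabilities would give for the same intermediate states.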
16. Transformation of Matrices
We shall now return to the beginning of the preceding section, i.e. we
shall again assume the values of $H$ and $K$ to be discrete, and we shall
examine the transformation equations for the matrices representing
different quantities $F$ from the point of view of $H$ and $K$. Before doing
this we must point out the fact that the transformation coefficients
$a_{H'K'}$ and $a^{-1}_{K'H'}$ can also be considered as the matrix elements of a
certain matrix $a$ and its reciprocal $a^{-1}$ respectively, in the same way as
$F_{H'H''}$ or $F_{K'K''}$ are the matrix elements of $F_H$ or $F_K$. The main
difference between them is that, in the latter case, the two indices
($H'$, $H''$ or $K'$, $K''$) refer to states of the same set, defined either by $H$ or
by $K$, whereas in the former case the first index refers to a state of the
one set and the second to a state of the other set.
Another difference (closely related to the preceding one) is that while
the matrix elements $F_{K'K''}$ or $F_{H'H''}$ are Hermitian, i.e. satisfy the
conditions $F_{K'K''} = F^*_{K''K'}$, $F_{H'H''} = F^*_{H''H'}$, the coefficients (or matrix
elements) $a_{H'K'}$ are not Hermitian, as shown by the relations (112).
The matrix which is obtained from $F$ (or $a$) by interchanging the
rows and the columns is called the transposed matrix of $F$ and is
denoted (usually) by $\tilde F$. A matrix which is obtained from the
transposed $\tilde F$ by taking the conjugate complex of its elements is called,
according to Jordan, the ‘adjoint’ matrix of $F$ (‘conjugate imaginary’
according to Dirac) and denoted by $F^\dagger$. Using this notation, we can
write the Hermitian condition in the form
$$F^\dagger = F, \tag{120}$$
while the condition (112) can be written in the form
$$a^\dagger = a^{-1}. \tag{120 a}$$
Matrices $a$ satisfying this condition are called ‘unitary’, because the
product of such a matrix with its adjoint matrix, which is the analogue
of the square of the modulus of an ordinary complex number, is equal
to unity (i.e. to the unit matrix).
It is self-evident that the multiplication of the matrices of the
type $a$ which do not correspond to a definite ‘point of view’ ($H$ or $K$)
but serve to connect two different points of view must be performed
according to the usual rule of matrix multiplication, i.e. by
combining the rows of the first factor with the columns of the second. This
means that the elements of the product of two matrices $a$ and $b$ must
have the form
$$(ab)_{mn} = \sum_{k} a_{mk}\,b_{kn},$$
i.e. that the second index of the elements of the first factor should
coincide with the first index of the elements of the second factor, this
common index being the index of summation.
From the point of view of this definition, the product of a ‘mixed’
matrix such as $a$ by itself or its conjugate complex $a^*$ would have
no meaning, since the two indices refer to states of different sets, and
therefore cannot be identified. We can, however, form the product of
$a$ with its transposed ($\tilde a$) or adjoint matrix ($a^\dagger$), since the first index
of the latter two refers to a state of the same set as the second index of
the former and vice versa. The expression $\sum_{K'} a_{H'K'}\,\tilde a_{K'H''}$ can thus be
considered as the $(H', H'')$ element of the product matrix $a\tilde a$ which is
of the same ‘pure’ type as the matrix $F_H$. The same refers to the
matrix $aa^\dagger$ or $aa^{-1}$, if the elements of the reciprocal matrix $a^{-1}$ are
labelled with the indices $H'$ and $K'$ in the order opposite to that which
refers to the matrix $a$ (as has actually been done in the preceding
section). It can easily be shown that the matrix $aa^\dagger$ is Hermitian (while
$a\tilde a$ is not). In fact, taking its adjoint matrix, which is obviously equal
to the product of the adjoint matrices of the two factors taken in the
reverse order, we get
$$(aa^\dagger)^\dagger = a^{\dagger\dagger}a^\dagger = aa^\dagger,$$
in agreement with (120).
It should be noticed that the two matrices $aa^\dagger$ and $a^\dagger a$ are, in general, entirely different, the former belonging to the same type as $F_H$ and the latter belonging to the same type as $F_K$.
In the particular case of a unitary matrix, satisfying the conditions (120a), we get
$$(a^\dagger a)_{K'K''} = \sum_{H'} a^\dagger_{K'H'} a_{H'K''} = \sum_{H'} a^*_{H'K'} a_{H'K''} = \delta_{K'K''},$$
$$(aa^\dagger)_{H'H''} = \sum_{K'} a_{H'K'} a^\dagger_{K'H''} = \sum_{K'} a_{H'K'} a^*_{H''K'} = \delta_{H'H''},$$
according to (112a)-(113a), or in matrix notation
$$aa^\dagger = \delta_H, \qquad a^\dagger a = \delta_K, \qquad (120b)$$
where $\delta_H$ and $\delta_K$ denote the ‘unit matrix’ as defined from the ‘point of view’ of H or K. Neglecting the physical meaning implied in this difference one often identifies the two unit matrices and writes
$$aa^\dagger = a^\dagger a = 1,$$
which occasionally can lead to misunderstandings.
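The relations (120a) and (120b) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the 2×2 matrix `a`, the angles `theta` and `phi`, the non-unitary matrix `b` and all helper names are hypothetical choices made here, not quantities taken from the text. It verifies that a unitary matrix multiplied by its adjoint gives the unit matrix in either order, and that for an arbitrary matrix the product with its adjoint is Hermitian while the product with its transposed is, in general, not.

```python
import cmath
import math

def dagger(m):
    """Adjoint matrix: transpose and take the conjugate complex of each element."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def transpose(m):
    """Transposed matrix (no complex conjugation)."""
    return [[m[r][c] for r in range(len(m))] for c in range(len(m[0]))]

def matmul(x, y):
    """Usual rule: rows of the first factor combined with columns of the second."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def close(x, y, tol=1e-12):
    """Element-wise comparison of two matrices of equal shape."""
    return all(abs(x[i][j] - y[i][j]) < tol
               for i in range(len(x)) for j in range(len(x[0])))

# Hypothetical unitary matrix built from two arbitrary angles.
theta, phi = 0.7, 0.3
c, s = math.cos(theta), math.sin(theta)
a = [[c, -cmath.exp(1j * phi) * s],
     [cmath.exp(-1j * phi) * s, c]]
unit = [[1, 0], [0, 1]]

print(close(matmul(a, dagger(a)), unit))   # a a† compared with the unit matrix
print(close(matmul(dagger(a), a), unit))   # a† a compared with the unit matrix

# Hypothetical non-unitary matrix: b b† is Hermitian, b b~ in general is not.
b = [[1, 2j], [3, 4]]
p = matmul(b, dagger(b))
q = matmul(b, transpose(b))
print(close(p, dagger(p)), close(q, dagger(q)))
```

In this finite example the two unit matrices $\delta_H$ and $\delta_K$ have the same size, so the distinction drawn in the text between them does not show up numerically.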
The possibility of treating the transformation coefficients as the
elements of a (mixed) matrix and of applying to the latter the usual
rule of matrix multiplication is substantiated by the results obtained
in two or more successive transformations. Let L be an operator (or set of three operators $L_1, L_2, L_3$) of the same kind as H or K, with the (discrete) characteristic values L' and characteristic functions $\chi_{L'}$. These functions can be ‘transformed’ to those of K by means of the equations $\chi_{L'} = \sum_{K'} b_{K'L'}\phi_{K'}$ and further to those of H by means of the equations $\phi_{K'} = \sum_{H'} a_{H'K'}\psi_{H'}$. Combining them together, we obtain a direct transformation from L to H,
$$\chi_{L'} = \sum_{H'} c_{H'L'}\psi_{H'},$$
with the coefficients $c_{H'L'} = \sum_{K'} a_{H'K'} b_{K'L'}$. The matrix of these coefficients is thus equal to the product of the matrices a and b taken in the order stated, and calculated according to the ordinary rule. Using the matrix representation for the transformation coefficients, we can thus define the matrix of two successive transformations as the product of the matrices of each of the separate transformations. This holds, in particular, for the case which has been considered above, where the second transformation is the reciprocal of the first one.
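The composition rule can be illustrated numerically. The following sketch assumes two hypothetical 2×2 unitary matrices `a` and `b` and a hypothetical Hermitian matrix `F_H`; none of these values come from the text. It transforms `F_H` in two steps (first with `a`, then with `b`) and compares the result with a single transformation by the product `ab`, using the rule $F \to u^\dagger F u$ for each step.

```python
import cmath
import math

def dagger(m):
    """Adjoint matrix: transpose and take the conjugate complex of each element."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def matmul(x, y):
    """Usual matrix multiplication rule."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def close(x, y, tol=1e-12):
    """Element-wise comparison of two matrices of equal shape."""
    return all(abs(x[i][j] - y[i][j]) < tol
               for i in range(len(x)) for j in range(len(x[0])))

def unitary(theta, phi):
    """Hypothetical two-parameter family of 2x2 unitary matrices."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -cmath.exp(1j * phi) * s],
            [cmath.exp(-1j * phi) * s, c]]

def transform(u, f):
    """Matrix of the same quantity after the transformation u: u† f u."""
    return matmul(matmul(dagger(u), f), u)

a = unitary(0.7, 0.3)     # assumed to connect the points of view of H and K
b = unitary(1.1, -0.4)    # assumed to connect the points of view of K and L
ab = matmul(a, b)         # matrix of the combined transformation

F_H = [[1.0, 2 - 1j], [2 + 1j, -0.5]]   # arbitrary Hermitian example

F_two_steps = transform(b, transform(a, F_H))
F_direct = transform(ab, F_H)
print(close(F_two_steps, F_direct))
```

The agreement rests only on the associative law and on $(ab)^\dagger = b^\dagger a^\dagger$, so the same check would go through for matrices of any size.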
We can now turn to the main object of this section: the transformation of the matrix representing the same quantity F in the transition from one ‘point of view’ specified by H to another, specified by K. Substituting (110) in the expression (109) for the elements of $F_K$, we get
$$F_{K'K''} = \sum_{H'}\sum_{H''} a^*_{H'K'}\, a_{H''K''}\, F_{H'H''}, \qquad (121)$$
which can be written in the form
$$F_{K'K''} = \sum_{H'}\sum_{H''} a^\dagger_{K'H'}\, F_{H'H''}\, a_{H''K''}.$$
This expression can be interpreted, according to the matrix multiplication law, as the (K', K'')-element of the product of the matrices $a^\dagger$, $F_H$ and a taken in the order stated. We can thus put
$$F_K = a^\dagger F_H a. \qquad (121a)$$
Substituting (111) in (109), we get in the same way
$$F_H = a F_K a^\dagger. \qquad (121b)$$
This equation can be obtained from the preceding equation if the latter is multiplied by a on the left and by $a^\dagger$ on the right side and if the relations $a^\dagger a = aa^\dagger = 1$ are taken into account.
If we restrict ourselves to multiplying (121a) by a on the left or by $a^\dagger$ on the right, we get
$$F_H a = a F_K \qquad (121c)$$
and
$$F_K a^\dagger = a^\dagger F_H.$$
The product matrices in these equations all have a mixed character, with elements of the type (H', K') in the case of the first and (K', H') in that of the second.
Written in matrix elements, these equations run
$$(F_H a)_{H'K'} = \sum_{H''} F_{H'H''}\, a_{H''K'} = \sum_{K''} a_{H'K''}\, F_{K''K'} = (aF_K)_{H'K'},$$
$$(F_K a^\dagger)_{K'H'} = \sum_{K''} F_{K'K''}\, a^\dagger_{K''H'} = \sum_{H''} a^\dagger_{K'H''}\, F_{H''H'} = (a^\dagger F_H)_{K'H'}.$$
If in (121c) we put, in particular, F = K or F = H, we get
$$K_H a = a K_K, \qquad a H_K = H_H a, \qquad (122)$$
and two similar equations with $a^\dagger$ instead of a.
Taking the element (H', K') of the first equation (122), we get, since $K_{K''K'} = K'\delta_{K''K'}$,
$$\sum_{H''} K_{H'H''}\, a_{H''K'} = K' a_{H'K'}. \qquad (122a)$$
In the same way we obtain from the second equation (122)
$$\sum_{K''} a_{H'K''}\, H_{K''K'} = a_{H'K'} H'. \qquad (122b)$$
The equations (122a) have exactly the same form for all values of K'. Dropping K' as second index in the coefficients a, we can rewrite them as a single system of linear homogeneous equations (corresponding to different values of H') for a set of variables $a_{H''}$,
$$\sum_{H''} K_{H'H''}\, a_{H''} = K' a_{H'} \qquad (123)$$
with a parameter K'.
This system of equations can serve for the direct determination both of the transformation coefficients $a_{H'K'}$ and of the values K' if the matrix elements of $K_H$ are known. We have, indeed, as the condition of the compatibility of equations (123) the vanishing of the determinant,
$$\begin{vmatrix} K_{H'H'}-K' & K_{H'H''} & K_{H'H'''} & \cdots \\ K_{H''H'} & K_{H''H''}-K' & K_{H''H'''} & \cdots \\ K_{H'''H'} & K_{H'''H''} & K_{H'''H'''}-K' & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{vmatrix} = 0, \qquad (123a)$$
which is an equation for the determination of the possible values of K' (K'', K''', etc.). To each of these values there corresponds a set of values of the variables $a_{H'}$ which we can identify, under certain conditions, with the transformation coefficients $a_{H'K'}$ ($a_{H'K''}$, etc.). These conditions amount to the relations $a^\dagger a = aa^\dagger = 1$, which can be shown to be verified if the solutions of (123) are normalized according to the equation
$$\sum_{H'} a_{H'} a^*_{H'} = 1 \qquad (123b)$$
for every value of K'.
Let us first of all make sure of the fact that the values K' obtained from (123a) are real. To show this we take the equations
$$\sum_{H''} K_{H'H''}\, a_{H''K'} = K' a_{H'K'},$$
$$\sum_{H''} K^*_{H'H''}\, a^*_{H''K'} = K'^* a^*_{H'K'}$$
(the first of which can be considered as an identity, resulting from (123) for a particular value of K', and the second as its conjugate complex), multiply them respectively by $a^*_{H'K'}$ and $a_{H'K'}$, sum over H', and finally subtract one from the other. This gives
$$(K'-K'^*)\sum_{H'} a_{H'K'}\, a^*_{H'K'} = \sum_{H'}\sum_{H''} K_{H'H''}\, a_{H''K'}\, a^*_{H'K'} - \sum_{H'}\sum_{H''} K^*_{H'H''}\, a^*_{H''K'}\, a_{H'K'}.$$
Taking into account the Hermitian condition $K^*_{H'H''} = K_{H''H'}$, we can rewrite the second double sum on the right side in the form $\sum_{H'}\sum_{H''} K_{H''H'}\, a_{H'K'}\, a^*_{H''K'}$, which becomes identical with the first double sum if we interchange the summation indices H' and H''. We thus get
$$(K'-K'^*)\sum_{H'} a_{H'K'}\, a^*_{H'K'} = 0,$$
or, since the sum $\sum_{H'} a_{H'K'}\, a^*_{H'K'} = \sum_{H'} |a_{H'K'}|^2$ is essentially positive,
$$K' - K'^* = 0.$$
This equation expresses the fact that K' is real.
If, in the preceding argument, we replace the second equation by an equation (identity)
$$\sum_{H''} K^*_{H'H''}\, a^*_{H''K''} = K'' a^*_{H'K''}$$
corresponding to some value of K'' different from K', multiply it by $a_{H'K'}$, sum over H', and subtract from the first equation multiplied by $a^*_{H'K''}$ and also summed over H', we get
$$(K'-K'')\sum_{H'} a_{H'K'}\, a^*_{H'K''} = \sum_{H'}\sum_{H''} K_{H'H''}\, a_{H''K'}\, a^*_{H'K''} - \sum_{H'}\sum_{H''} K^*_{H'H''}\, a^*_{H''K''}\, a_{H'K'}.$$
In view of $K^*_{H'H''} = K_{H''H'}$ and the interchangeability of the summation indices H', H'', the right side vanishes just as in the case K' = K'', and we get
$$(K'-K'')\sum_{H'} a_{H'K'}\, a^*_{H'K''} = 0,$$
which, since K' − K'' is assumed to be different from zero, reduces to
$$\sum_{H'} a_{H'K'}\, a^*_{H'K''} = 0$$
or
$$\sum_{H'} a^\dagger_{K''H'}\, a_{H'K'} = 0 \qquad (K'' \neq K').$$
This relation expresses the mutual ‘orthogonality’ of the different sets of solutions of the system of equations (123). Together with the normalizing condition (123b), it can be written in the form
$$a^\dagger a = \delta_K,$$
whereby the identity of the coefficients $a_{H'K'}$ obtained from equations (123), (123a), and (123b), with those defined at the outset with the help of the wave functions $\psi_{H'}$ and $\phi_{K'}$ by means of equations (110a) and (111a), is demonstrated.
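The procedure of equations (123), (123a) and (123b) can be followed step by step for a two-rowed matrix. The sketch below is a minimal illustration, assuming a hypothetical Hermitian matrix `K_H` chosen here for convenience (it is not taken from the text): the secular determinant gives a quadratic equation for K', each root gives a column of coefficients, and the normalized columns form the matrix `a`. The code then evaluates $a^\dagger a$ and $a^\dagger K_H a$ so that the orthogonality and the diagonal form can be inspected.

```python
import math

def dagger(m):
    """Adjoint matrix: transpose and take the conjugate complex of each element."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)] for c in range(cols)]

def matmul(x, y):
    """Usual matrix multiplication rule."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

# Hypothetical Hermitian matrix of K 'from the point of view of H'.
K_H = [[2, 1 - 1j], [1 + 1j, 3]]

# Secular equation (123a) for two rows: K'^2 - (trace) K' + (determinant) = 0.
trace = (K_H[0][0] + K_H[1][1]).real
det = (K_H[0][0] * K_H[1][1] - K_H[0][1] * K_H[1][0]).real
root = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = [(trace - root) / 2, (trace + root) / 2]

# For each K', solve the first equation of (123) and normalize as in (123b).
# This shortcut assumes the off-diagonal element K_H[0][1] is not zero.
columns = []
for lam in eigenvalues:
    v = [K_H[0][1], lam - K_H[0][0]]
    norm = math.sqrt(sum(abs(z) ** 2 for z in v))
    columns.append([z / norm for z in v])

# a[h][k]: first index labels the H-state, second index the K-state.
a = [[columns[k][h] for k in range(2)] for h in range(2)]

overlap = matmul(dagger(a), a)            # should approximate the unit matrix
K_K = matmul(matmul(dagger(a), K_H), a)   # should approximate a diagonal matrix

print(eigenvalues)
print(overlap)
print(K_K)
```

For matrices with more rows the secular equation is of higher degree and would in practice be handled by a numerical eigenvalue routine; the two-rowed case is used here only because it can be solved in closed form.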
At the same time we have demonstrated the possibility of effecting the transformation of the matrix $F_H$ representing an arbitrary physical quantity F ‘from the point of view of H’ (i.e. with regard to states defined by H) to the matrix $F_K$ representing the same quantity ‘from the point of view of K’ without the use of the wave functions characteristic of H and K, but by a purely matrix method, based upon the matrix representation of all quantities, including the key one K, ‘from the point of view of H’. The transition from this point of view to that of K can be effected by means of the equations (123), (123a), (123b), which determine the transformation matrix a, and further by means of equation (121a), giving the new matrix elements of any quantity F in terms of the old matrix elements.
In view of the relation $a^\dagger = a^{-1}$, this formula can also be written in the form
$$F_K = a^{-1} F_H a. \qquad (124)$$
The transformation matrix a can actually be defined by the condition
$$a^{-1} K_H a = K_K \quad \text{(a diagonal matrix)} \qquad (124a)$$
which leads, after a left-handed multiplication by a, to the equation $K_H a = aK_K$, i.e. to the system of equations (123); the unitary character of the matrix a, expressed by the relation $a^\dagger a = 1$, can be considered as a consequence of these equations.
A transformation of the type (124) is generally called a canonical matrix transformation. It has an interesting feature which does not depend upon a being a unitary matrix (i.e. satisfying the relation $a^\dagger = a^{-1}$), namely, that of leaving invariant all the functional relations between the original matrices, the same functional relations holding between the transformed matrices. This can be proved directly by putting in (124) F = E + G or F = EG. In the first case we get, since $F_H = E_H + G_H$,
$$F_K = a^{-1}(E_H + G_H)a = a^{-1}E_H a + a^{-1}G_H a = E_K + G_K;$$
in the second case we have, using $(EG)_H = E_H G_H$,
$$F_K = a^{-1} E_H G_H a.$$
Now we can insert between $E_H$ and $G_H$ the product $aa^{-1}$, since it is equal to the unit matrix $\delta$ whose product with any other matrix is identical with the latter (just as in the case of the multiplication of ordinary numbers by an ordinary unity). We thus get, by the associative law,
$$F_K = (a^{-1}E_H a)(a^{-1}G_H a) = E_K G_K.$$
This proof can easily be extended by induction to any function F of E and G, so that, putting (in the operator representation) F = f(E, G), we have
$$f(E_K, G_K) = a^{-1} f(E_H, G_H)\, a \qquad (124b)$$
or
$$a^{-1} f(E_H, G_H)\, a = f(a^{-1}E_H a,\; a^{-1}G_H a).$$
It follows from these equations that, in particular, the transformation (124) does not affect the validity of the commutation relations between the coordinates and the components of the momentum; the original relations $(p_x x - x p_x)_H = \frac{h}{2\pi i}\delta_H$ are transformed into $(p_x x - x p_x)_K = \frac{h}{2\pi i}\delta_K$.
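That the invariance of functional relations does not depend on a being unitary can be seen in a small exact computation. The sketch below assumes a hypothetical non-unitary integer matrix `a` with determinant 1 (so that its reciprocal `a_inv` is also an integer matrix) and two arbitrary integer matrices `E` and `G`; all values and helper names are illustrative choices made here. It compares the transform of a sum, of a product and of a commutator with the sum, product and commutator of the transforms.

```python
def matmul(x, y):
    """Usual matrix multiplication rule."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    """Element-wise sum of two matrices."""
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def sub(x, y):
    """Element-wise difference of two matrices."""
    return [[x[i][j] - y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

def commutator(x, y):
    """xy - yx."""
    return sub(matmul(x, y), matmul(y, x))

# Hypothetical transformation matrix: invertible but NOT unitary.
a = [[1, 2], [0, 1]]
a_inv = [[1, -2], [0, 1]]

def transform(f):
    """Transformation of the type (124): a^{-1} f a."""
    return matmul(matmul(a_inv, f), a)

# Arbitrary integer matrices standing for E_H and G_H.
E = [[1, 2], [3, 4]]
G = [[0, 1], [5, -2]]

sum_ok = transform(add(E, G)) == add(transform(E), transform(G))
product_ok = transform(matmul(E, G)) == matmul(transform(E), transform(G))
commutator_ok = transform(commutator(E, G)) == commutator(transform(E), transform(G))
print(sum_ok, product_ok, commutator_ok)
```

Integer entries are used so that the comparisons are exact; with floating-point entries the same identities would hold only up to rounding error.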
Canonical transformations of the above type should be distinguished from canonical transformations of the variables $x, y, z, p_x, p_y, p_z$ in the sense corresponding to the general definition of a canonical transformation in classical mechanics (see § 5). In the former case the canonically conjugate variables are supposed to remain unaltered, the transformation referring only to the matrices by which they are represented from the point of view of different energy operators (H or K). In the latter case, on the contrary, the variables $x, \ldots, p_z$ are themselves transformed into a new set of canonically conjugate variables $\xi, \eta, \zeta, \pi_\xi, \pi_\eta, \pi_\zeta$, the energy operator $H_{(x)} = H(x, \ldots, p_z)$ remaining essentially the same and only changing its external form because the old variables defining it are replaced by their expressions in terms of the new variables. We thus get for it a new function, say $H_{(\xi)}$, of the variables $\xi, \ldots, \pi_\zeta$, which is, however, numerically equal to $H_{(x)}$ for the corresponding values of the original variables. This numerical equality of the classical theory is replaced in quantum mechanics by the equality of the characteristic values of the operators $H_{(x)}$ and $H_{(\xi)}$. The condition expressing the canonical character of the transformation from the original variables to the new ones consists in the fact that the matrices representing the latter (from any point of view) should satisfy the same commutation relations $\pi_\xi \xi - \xi \pi_\xi = \frac{h}{2\pi i}\delta$, etc., as those representing the old variables. This means that the new matrices (of $\xi, \ldots, \pi_\zeta$) can be derived from the old ones (of $x, \ldots, p_z$) by a canonical transformation in the first sense, i.e. in the sense of the equation (124). The physical meaning of such a transformation will, however, be entirely different from the case to which (124) refers, the two kinds of transformation bearing but a formal resemblance to each other. We shall come back to the transformations of the second kind in the next section.
In the case of a degeneracy of the original energy matrix $H_H$, i.e. when some of its diagonal elements coincide, it is necessary to consider it simultaneously with one or two other matrices, which represent independent constants of the motion specified by H. We must therefore replace the operator H by the three operators $H_1, H_2, H_3$ and define the matrix representation of any quantity F from the ‘point of view’ of this ‘trio’, writing $F_{H_1H_2H_3}$ instead of $F_H$. The transformation matrix corresponding to a transition to the ‘point of view’ of some other trio, e.g. $K_1, K_2, K_3$, will then be unambiguously determined by the simultaneous equations
$$a^{-1}(K_1)_{H_1H_2H_3}\,a = (K_1)_{K_1K_2K_3},$$
$$a^{-1}(K_2)_{H_1H_2H_3}\,a = (K_2)_{K_1K_2K_3}, \qquad (124c)$$
$$a^{-1}(K_3)_{H_1H_2H_3}\,a = (K_3)_{K_1K_2K_3},$$
with the condition that all the three matrices on the right side should be diagonal (which can always be satisfied if the corresponding operators $K_1, K_2, K_3$ commute with each other). Each of the equations (124c), taken separately, will leave a certain amount of ambiguity in the shape of the matrix a, which can be removed by means of one or both of the others; if we do not desire a diagonal representation of the corresponding quantities we can remove this ambiguity in a perfectly arbitrary manner consistent with the condition $a^{-1} = a^\dagger$.
The preceding considerations can easily be generalized for the case
when either or both of the operators (or the operator trios) H and K
have a continuous spectrum. Let us assume, for instance, that the values of H form a continuous set, while those of K remain discrete. We then have, instead of (110) and (111), the transformation equations (117) and (117a) with a semi-continuous transformation matrix $a_{H'K'}$ satisfying the orthogonality and normalizing relations (118) and (118a). The latter can be put in the same form,
$$aa^\dagger = \delta_H, \qquad a^\dagger a = \delta_K,$$
as in the discrete case, if $\delta_H$ is considered as a continuous unit matrix, i.e. as a Dirac function
$$\delta_{H'H''} = \delta(H' - H''),$$
while $\delta_{K'K''}$ is the usual discrete unit matrix, and if, further, the matrix multiplication law is defined in the usual way corresponding to discrete matrices in the case of $aa^\dagger$:
$$(aa^\dagger)_{H'H''} = \sum_{K'} a_{H'K'}\, a^\dagger_{K'H''},$$
and in the way corresponding to continuous matrices in the case of $a^\dagger a$:
$$(a^\dagger a)_{K'K''} = \int a^\dagger_{K'H'}\, a_{H'K''}\, dH'$$
[cf. eq. (105a), § 14].
We get further, instead of (121),
$$F_{K'K''} = \iint a^*_{H'K'}\, a_{H''K''}\, F_{H'H''}\, dH'\, dH'',$$
or
$$F_{K'K''} = \iint a^\dagger_{K'H'}\, F_{H'H''}\, a_{H''K''}\, dH'\, dH'',$$
which, as in the discrete case, can be written in the matrix form
$$F_K = a^\dagger F_H a,$$
it being understood that the matrix multiplication must be carried out according to the rule for continuous matrices whenever the ‘summation’ indices are continuously variable. From this equation we can derive the equations (122), the second of which, when reduced to matrix elements, runs exactly as before [eq. (122b)], while the first assumes the form
$$\int K_{H'H''}\, a_{H''K'}\, dH'' = K' a_{H'K'},$$
instead of (122a). Dropping the index K' of the coefficients $a_{H'K'}$, we get
$$\int K_{H'H''}\, a_{H''}\, dH'' = K' a_{H'}, \qquad (125)$$
which can be considered as an integral equation for the determination of the functions $a_{H'}$ and the characteristic values K', replacing the system of algebraic equations (123). The result of the elimination of the functions $a_{H'}$ from (125) cannot be written in the form of a determinant (123a) unless we adopt a generalized definition of ‘continuous determinants’ corresponding to continuous matrices. Writing the right side of (125) in the form $\int K' a_{H''}\,\delta(H''-H')\, dH''$, we could then replace the compatibility equation (123a), which serves for the determination of the characteristic values of K ($K' = K_{K'K'}$), by a symbolic equation of the type
$$|K_{H'H''} - K'\,\delta(H'-H'')| = 0, \qquad (125a)$$
indicating the general element of the determinant. In the corresponding notation for the discrete case, equation (123a) would run as follows:
$$|K_{H'H''} - K'\,\delta_{H'H''}| = 0.$$
Of course (125a) cannot be used for the actual calculation of the values K'; but this is also true of equation (123a), since it refers to a determinant which consists of an infinite number of discrete elements. We shall indicate later the method which can be used for the approximate calculation of the admissible values of K' when K differs but little from H (as is the case in problems of the perturbation theory).
It should be remarked here that both for a discrete and a continuous spectrum of H the characteristic values of K may form a discrete as well as a continuous spectrum (contrary to the assumption which was made at the beginning about the discreteness of the K-spectrum).
It can easily be proved that if the functions $a_{H'}$ [‘characteristic functions’ of the integral equation (125)] corresponding to a particular value K' are labelled with this value as second index, they will form an orthogonal set (discrete or continuous, together with the set of values of K') normalizable to unity, i.e. satisfying the relations
$$\int a^*_{H'K'}\, a_{H'K''}\, dH' = \delta_{K'K''} \ \text{ or } \ \delta(K'-K'')$$
and
$$\sum_{K'} a_{H'K'}\, a^*_{H''K'} \ \text{ or } \ \int a_{H'K'}\, a^*_{H''K'}\, dK' = \delta(H'-H'')$$
as the case may be.
The proof is obtained in exactly the same way as in the case of a discrete H-spectrum dealt with above and therefore will not be reproduced here. It should be remarked incidentally that the results referring to the latter case must be amended to allow for the possibility of K having a continuous spectrum with $\delta_{K'K''} = \delta(K'-K'')$.
Summing up, we can say that both with a discrete and a continuous
spectrum of the ‘basic quantity’ (or basic trio) H, it is possible to calculate the matrix elements of any quantity F from the point of view of some other ‘basic quantity’ (or basic trio) K, without the knowledge of the characteristic functions of either H or K; the only thing which it is necessary to know in order to carry out the transformation from $F_H$ to $F_K$ is the matrix $K_H$. The transformation coefficients $a_{H'K'}$ can be found from the condition that $K_K$ is a diagonal matrix of the discrete or of the continuous type (which need not and cannot be specified beforehand).
17. Transformation Theory of Matrices as a Generalization of Wave Mechanics; Transformation of Basic Quantities
It thus appears that the matrix theory, so far as the transformation from one point of view to another is concerned, can be considered as a logically closed self-supporting structure, which does not need the wave-mechanical basis upon which we have built it up. We have already met with a similar situation in the preceding chapter, when we were discussing the question of the actual determination of the matrices corresponding to a given energy operator and found it possible to achieve this result by determining the fundamental Hermitian matrices of the coordinates and the momentum-components in such a way as to make the energy matrix diagonal subject to the commutation conditions $p_x x - x p_x = h/2\pi i$, etc.
In the light of the transformation theory developed in this chapter, it appears, first of all, that if the latter problem has been solved for some simple type of motion specified by the energy operator H, it can be solved for any other type of motion, specified by some more complicated energy operator K, by the method of the transformation theory, without getting back to fundamental matrices ($x$, $p_x$) and commutation conditions (which, as has been shown above, are invariant with respect to canonical transformations). It is just this method of solution which is used by the perturbation theory, when the difference between the operators K and H is sufficiently small.
Besides furnishing a simple and practically the only workable method for the solution of such perturbation problems, the transformation theory reveals a new connexion between the matrix and the wave-mechanical method, reducing the latter to a particular case of the former, as was pointed out in the preceding section. We have seen, namely, that the characteristic functions or probability amplitudes of the wave-mechanical theory $\psi^0_{x'H'}$ can be considered as the transformation coefficients from the point of view of the ‘energy-trio’ H to that of the ‘coordinate-trio’ x (provided that such a thing as the energy exists, i.e. that the energy operator H does not contain the time), in the same sense as the probability amplitudes $a_{H'K'}$ are the transformation coefficients from the point of view of the energy-trio H to that of the energy-trio K. This means that the wave-mechanical method can be completely replaced by the matrix method involving the transformation of the matrices $F_x$ to the matrices $F_H$ or vice versa.
The wave-mechanical theory, considered as a special case of the
matrix transformation theory, has to solve the following problem:
Suppose the matrices of all quantities, and in particular of the energy H, to be known from the point of view of the coordinates, we have to find the matrices representing them from the point of view of H. The solution of this problem reduces to the solution of the linear integral equation,
$$\int H_{x'x''}\, \psi^0_{x''}\, dx'' = H' \psi^0_{x'}, \qquad (126)$$
which is obtained from (125) if K is replaced by H, H by x, and $a_{H'}$ by $\psi^0_{x'}$, and which obviously must be equivalent to the Schrödinger equation†
$$H \psi^0_{x'} = H' \psi^0_{x'}. \qquad (126a)$$
† We mean here and in the sequel Schrödinger’s equation not involving the time (and serving to define the stationary states only). This circumstance is indicated by affixing to all the quantities connected, directly or indirectly, with the energy operator K the additional (upper) index 0.
The equivalence of these equations can be proved directly with the help of the general definition of the elements of a matrix $F_C$ by means of the integral
$$F_{C'C''} = \int \psi^*_{x'C'}\, F\, \psi_{x'C''}\, dx'. \qquad (127)$$
This definition has been used until now only in connexion with such ‘key’ or ‘basic’ quantities C, one of which at least could be regarded as the energy. This restriction does not seem, however, to be necessary, and the formula (127) can be applied to quantities C of any type (provided the operators by which they are represented commute with each other). We can, in particular, put C = x (i.e. $C_1 = x$, $C_2 = y$, $C_3 = z$), subject to the condition that the variables x' and C' in $\psi_{x'C'}$ should be considered as independent. This means that the two indices (or arguments) in the function $\psi_{x'x''}$ need not necessarily refer to the same point. We can thus in (127) put C' = x' and C'' = x'', or, denoting the integration variable by x''' instead of x', write
$$F_{x'x''} = \int \psi^*_{x'''x'}\, F\, \psi_{x'''x''}\, dx''', \qquad (127a)$$
where the operator F is understood to refer to the point x''', i.e. to be a function of x''' and of the elementary operators $p_{x'''} = \frac{h}{2\pi i}\frac{\partial}{\partial x'''}$, etc.
The functions $\psi_{x'''x'}$ must obviously represent the identical transformation (from the point of view of x''' to that of x'), or, in other words, the probability amplitudes that x should be equal to x''' when it is known that it has the value x'. Since one and the same particle cannot be simultaneously in two different places, this means that $\psi_{x'''x'}$ must vanish when $x''' \neq x'$ and become infinite when $x''' = x'$ (in view of the fact that x is a continuous variable). We can thus identify $\psi_{x'''x'}$ with the ‘unit matrix’ of the continuous case, i.e. put
$$\psi_{x'''x'} = \delta(x'''-x'). \qquad (128)$$
This expression can be derived from the general formula [...] [cf. (110), § 15] if we put C = x and accordingly $a_{H'K'}$ = [...], in conjunction with the orthogonality and normalizing relation $\int \psi^*_{x'H'}\, \psi_{x'''H'}\, dH' = \delta(x'''-x')$, the [...] being in this case obviously identical with [...].
It is easy to see that, defined in this way, the function $\psi_{x'''x'} = \delta_{x'''x'}$ also satisfies the usual orthogonality and normalizing relations:
$$\int \psi^*_{x'''C'}\, \psi_{x'''C''}\, dx''' = \delta(C'-C''), \qquad \int \psi_{x'C'}\, \psi^*_{x''C'}\, dC' = \delta(x''-x').$$
In fact, putting C' = x' and C'' = x'', we get, according to (128),
$$\int \psi^*_{x'''x'}\, \psi_{x'''x''}\, dx''' = \int \delta(x'-x''')\,\delta(x'''-x'')\, dx''' = \delta(x'-x'')$$
and, putting C' = x''',
$$\int \psi_{x'C'}\, \psi^*_{x''C'}\, dC' = \int \delta(x'''-x')\,\delta(x''-x''')\, dx''' = \delta(x''-x').$$
We thus see that the elements of a matrix $F_x$ can be defined according to (127a) and (128) by the integral
$$F_{x'x''} = \int \delta(x'-x''')\, F\, \delta(x'''-x'')\, dx''', \qquad (128a)$$
so that, in particular, we have
$$H_{x'x''} = \int \delta(x'-x''')\, H\, \delta(x'''-x'')\, dx''', \qquad (128b)$$
where H denotes the usual Hamiltonian function of the coordinates x and of the operators $p_x$, referred to the point x = x''' (dx''' indicates the volume-element enclosing this point).
It can now easily be shown that the integral equation (126), together with the expression (128b) for its ‘nucleus’, actually reduces to the differential equation (126a).
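Let us first take that part of H which depends upon the coordinates, that is, the potential energy U(x, y, z). We then get, according to (128a),
$$U_{x'x''} = \int U(x''')\,\delta(x'-x''')\,\delta(x'''-x'')\, dx''' = U(x'')\,\delta(x''-x'),$$
which, on substitution in (126), gives
$$\int U_{x'x''}\, \psi_{x''}\, dx'' = U(x')\,\psi_{x'}.$$
Putting, further, $F = \partial/\partial x$, we have
$$F_{x'x''} = \int \delta(x'-x''')\,\frac{\partial}{\partial x'''}\,\delta(x'''-x'')\, dx''' = -\frac{\partial}{\partial x''}\,\delta(x'-x''),$$
since, obviously,
$$\frac{\partial}{\partial x'''}\,\delta(x'''-x'') = -\frac{\partial}{\partial x''}\,\delta(x'''-x''),$$
and consequently,
$$\int F_{x'x''}\, \psi_{x''}\, dx'' = -\int \psi_{x''}\,\frac{\partial}{\partial x''}\,\delta(x'-x'')\, dx''.$$
Now integrating by parts, we have
$$\int \psi_{x''}\,\frac{\partial}{\partial x''}\,\delta(x'-x'')\, dx'' = -\int \delta(x'-x'')\,\frac{\partial \psi_{x''}}{\partial x''}\, dx'',$$
because the product $\psi_{x''}\,\delta(x'-x'')$ vanishes at the limits of integration (or at infinity). We thus get
$$\int \left(\frac{\partial}{\partial x}\right)_{x'x''} \psi_{x''}\, dx'' = \frac{\partial \psi_{x'}}{\partial x'}.$$
In the same way it can be shown that
$$\int F_{x'x''}\, \psi_{x''}\, dx'' = \frac{\partial^2 \psi_{x'}}{\partial x'^2}$$
if $F = (\partial/\partial x)^2$, and so on. Putting finally $F = H = -\frac{h^2}{8\pi^2 m}\nabla^2 + U$, we get $\int H_{x'x''}\, \psi_{x''}\, dx'' = H\psi_{x'}$. It should be mentioned that this formula holds identically, i.e. irrespective of the shape of the function $\psi_{x'}$. The latter is determined in fact as $\psi^0_{x'H'}$ by the condition that $H\psi_{x'}$ should be equal to the product $H'\psi_{x'}$.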
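A discrete analogue may make the reduction of the integral equation to a differential one more tangible. The sketch below is not the continuous argument itself but a finite-difference illustration under stated assumptions: the coordinate is replaced by a hypothetical uniform grid of spacing `step`, the integral over x'' by a sum over grid points, and the matrix of $\partial/\partial x$ by the central-difference matrix `D`, whose only non-vanishing elements lie next to the diagonal. Applying `D` to the sampled values of a trial function (here sin x, an arbitrary choice) should reproduce its derivative up to an error of order `step` squared.

```python
import math

n = 201          # number of grid points (assumed)
step = 0.01      # grid spacing (assumed)
xs = [i * step for i in range(n)]

# Central-difference matrix: discrete stand-in for the 'matrix of d/dx
# from the point of view of x'. The first and last rows are left zero.
D = [[0.0] * n for _ in range(n)]
for i in range(1, n - 1):
    D[i][i + 1] = 1.0 / (2 * step)
    D[i][i - 1] = -1.0 / (2 * step)

psi = [math.sin(x) for x in xs]   # arbitrary smooth trial function

# Sum over the second index: discrete analogue of the integral over x''.
d_psi = [sum(D[i][j] * psi[j] for j in range(n)) for i in range(n)]

# Compare with the exact derivative cos(x) at the interior points.
max_err = max(abs(d_psi[i] - math.cos(xs[i])) for i in range(1, n - 1))
print(max_err)
```

The matrix `D` is antisymmetric, which mirrors the sign change obtained in the text when the derivative is transferred from x''' to x''.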
The generalization of the matrix theory which has been considered
hitherto consisted, in the main, in admitting quantities other than the
energy and those commuting with the energy to the role of the ‘basic
quantities’ determining the matrix representation of all other quantities
and being themselves represented by diagonal matrices. In the case just considered, this role of basic quantities was switched over to the coordinates. The matrices representing the latter, $x_x$ (or $x_{xyz}$, $y_{xyz}$, $z_{xyz}$), are obviously defined by the equations [cf. (101b), § 14]
$$x_{x'x''} = x'\,\delta(x'-x''), \qquad (129)$$
or, written out in detail:
$$x_{x'y'z';\,x''y''z''} = x'\,\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z''),$$
$$y_{x'y'z';\,x''y''z''} = y'\,\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z''), \qquad (129a)$$
$$z_{x'y'z';\,x''y''z''} = z'\,\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z'').$$
The coordinates have, however, preserved at the same time another fundamental role in which they have been employed from the very beginning, namely, that of the arguments of the functions $\psi_{x'C'}$ (with C = H, x, or any other ‘basic trio’) which can serve for the direct determination of the elements of a matrix $F_C$ by means of equation (127). This second role of the coordinates is intimately connected with the initially adopted representation of physical quantities by means of operators, defined as functions of the (rectangular) coordinates x, y, z and of the elementary differential operators $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$, $p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}$, $p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}$, which replace the components of the momentum.
These functions were supposed to be known, being in fact identified with the functions representing the same quantities in the classical theory (on the ground that $F(x, p_x)\psi$ reduces to the product $F(x, g_x)\psi$ if $\psi$ is replaced by its approximate expression $\psi = e^{i2\pi S/h}$, where S is the action function of classical mechanics).
We must now consider a further generalization of the transformation theory, consisting in the replacement of the coordinates in this second role, connected with the usual operator representation, by some other quantities, e.g. Q, associated with operators which contain derivatives with regard to Q.
The possibility (and, more than that, the necessity) of such a generalization clearly follows from the fact that the functions $\psi_{x'C'}$, considered as transformation coefficients ‘from the point of view of C to that of x’, or as probability amplitudes for one of these two quantities having a given value when the value of the other is known, are practically symmetrical with regard to both quantities. Instead of, or rather together with, the functions $\psi_{x'C'}$ we must consider the functions $\psi_{C'x'}$ which are simply equal to the conjugate complex of the former and which correspond to the reciprocal transformation. In these functions, however, it is the quantities C which play the role of the coordinates, while the latter appear in the role of the ‘basic quantities’ instead of C.
Replacing the Schrödinger wave functions $\psi_{x'C'}$ by transformation coefficients or probability amplitudes of the most general type $a_{Q'C'}$, we can define the matrix elements of a certain quantity F with respect to C by the formulae
$$F_{C'C''} = \int a^*_{Q'C'}\, F\, a_{Q'C''}\, dQ'$$
or
$$F_{C'C''} = \sum_{Q'} a^*_{Q'C'}\, F\, a_{Q'C''}$$
according as Q has a continuous or a discrete spectrum.
This definition will, however, remain meaningless so long as F is not specified as an operator ‘from the point of view’ of Q, i.e. as a certain function of Q ($Q_1, Q_2, Q_3$) and the derivatives $\partial/\partial Q$. The operators which have been considered hitherto have always been specified from the point of view of the coordinates x, and obtained from the classical functions $F(x, g_x)$ by a simple substitution of $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$ for $g_x$. Adopting what can be denoted as the ‘principle of relativity’ with regard to the ‘basic quantities’ which specify the operator representation, we shall denote the operator representing a certain quantity F ‘from the point of view of Q’ by $F_{(Q)}$, where the brackets are introduced to distinguish this operator from the corresponding matrix $F_Q$. The operators defined in the usual way, i.e. from the point of view of the coordinates, should be denoted accordingly by $F_{(x)}$, and the general definition of the elements of the matrix $F_C$ by means of the operator $F_{(Q)}$ should run as follows:
$$F_{C'C''} = \int a^*_{Q'C'}\, F_{(Q')}\, a_{Q'C''}\, dQ' = \int a^\dagger_{C'Q'}\, F_{(Q')}\, a_{Q'C''}\, dQ' \qquad (130)$$
if the spectrum of Q is continuous, or
$$F_{C'C''} = \sum_{Q'} a^*_{Q'C'}\, F_{(Q')}\, a_{Q'C''} = \sum_{Q'} a^\dagger_{C'Q'}\, F_{(Q')}\, a_{Q'C''} \qquad (130a)$$
if it is discontinuous.
Another obvious condition for the operators $F_{(Q)}$ is that the matrix elements of $F_C$ defined by the preceding equations should not depend upon the choice of the quantities Q.
Equations (130) and (130a) bear a striking resemblance to the transformation equations
$$F_{C'C''} = \iint a^\dagger_{C'Q'}\, F_{Q'Q''}\, a_{Q''C''}\, dQ'\, dQ''$$
and
$$F_{C'C''} = \sum_{Q'}\sum_{Q''} a^\dagger_{C'Q'}\, F_{Q'Q''}\, a_{Q''C''},$$
or, in the abbreviated notation based on the matrix multiplication law,
$$F_C = a^\dagger F_Q a = a^{-1} F_Q a,$$
with a denoting the transformation matrix $a_{Q'C'}$.
The equations of both types actually become identical if the operators $F_{(Q)}$ satisfy the condition
$$F_{(Q')}\, a_{Q'C''} = \int F_{Q'Q''}\, a_{Q''C''}\, dQ'', \qquad (131)$$
or
$$F_{(Q')}\, a_{Q'C''} = \sum_{Q''} F_{Q'Q''}\, a_{Q''C''}. \qquad (131a)$$
These conditions are a generalization of the equation
$$\int H_{x'x''}\, \psi_{x''}\, dx'' = H\psi_{x'},$$
which has already been obtained in connexion with the proof of the equivalence of the Schrödinger equation (126a) with the integral equation (126). It should be observed that, according to the present notation, we must write $H_{(x)}$ for the energy operator, and $a_{x'H'}$ for the wave functions $\psi^0_{x'H'}$. Further, we easily get as a generalization of equation (128b) the following relation between the operator $F_{(Q)}$ and the matrix $F_Q$:
$$F_{Q'Q''} = \int \delta(Q'-Q''')\, F_{(Q''')}\, \delta(Q'''-Q'')\, dQ''', \qquad (131b)$$
where the functions $\delta(Q'-Q'')$ can be considered as the transformation coefficients $a_{Q'Q''}$ on the assumption that the spectrum of Q is continuous. The formula (131b) can be considered as the direct consequence of (130).
Putting F = C in (131) and taking into account that
$$\int C_{Q'Q''}\, a_{Q''C'}\, dQ'' = C' a_{Q'C'}$$
according to the definition of the transformation coefficients $a_{Q'C'}$ [cf. equations (125) and (126)], we get
$$C_{(Q')}\, a_{Q'C'} = C' a_{Q'C'}. \qquad (132)$$
This equation is the broadest generalization of SchrOdinger’s equation,
with C standing for //, Q for x, and the probability amplitudes a QC*
(which could also be denoted by ^ * c.) for the usual ‘wave functions’
h 3
ifPx i r . If th e for m of the oper ator CiQ) as a function of Q and of — —
is known, equation (132) can serve to determine the functions
a Q’(— cLq 'c -) and the characteristic values Cn of the operator C{Q). I t
should be remarked that these characteristic values do not depend upon
the choice of the basic quantities Q (i.e. are invariant with regard to the
transformation of the latter), being as a m atter of fact nothing else but
the characteristic values of the operator C{c), or, in other words, the
(diagonal) elements of the matrix Cc . This corresponds to the physical
meaning of the characteristic values of a quantity, as the values which
this quantity can possibly assume, irrespective of the values which can
be, or actually are, assumed by any other quantities.
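The invariance of the characteristic values can be checked numerically on a finite-dimensional stand-in. The following is only a sketch under stated assumptions: the 4 x 4 size, the random Hermitian matrix `C_Q`, and the helper names `random_unitary` and `change_basis` are illustrative choices, not part of the text.

```python
import numpy as np

def random_unitary(n, rng):
    """Return an n x n unitary matrix (Q factor of a random complex matrix)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def change_basis(F_Q, a):
    """Matrix of F from another point of view: F_C = a^dagger F_Q a."""
    return a.conj().T @ F_Q @ a

rng = np.random.default_rng(0)
m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
C_Q = (m + m.conj().T) / 2           # Hermitian matrix (a real quantity)
a = random_unitary(4, rng)           # hypothetical transformation matrix
C_other = change_basis(C_Q, a)

# Characteristic values computed from either point of view (sorted ascending).
values_before = np.linalg.eigvalsh(C_Q)
values_after = np.linalg.eigvalsh(C_other)
```

The two arrays of characteristic values agree to rounding error, whichever unitary matrix is used for the change of basis.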
In deriving equation (132), we have assumed that the characteristic
values of $Q$ constitute a continuous set. If they constitute a discrete
set, the differential operator representation of different quantities $F$ with
regard to $Q$ becomes impossible, for the application of the derivative
operators $d/dQ$ to functions of $Q$ becomes meaningless. Equation (131 a)
can hold accordingly only when the operator $F^{(Q)}$ reduces to a function of
$Q$. The same refers to the equation
$$F_{Q'Q''} = \sum_{Q} \delta_{QQ'}\, F^{(Q)}\, \delta_{QQ''},$$
which should replace equation (131 b) and which is meaningless, unless the
operator reduces to a function of $Q$ (not containing the derivatives
$d/dQ$), in which case it reduces to
$$F_{Q'Q''} = F(Q')\, \delta_{Q'Q''},$$
meaning that $F_Q$ is a diagonal matrix.
This example shows that the matrix theory, which we initially developed
on the basis of the operator theory, starting with the energy
operator $H^{(x)}$ and the wave functions defined by it according to
Schrödinger’s equation $H^{(x)}\psi^0_{H'} = H'\psi^0_{H'}$, is actually more general than the
operator theory even in its generalized form corresponding to the
replacement of the coordinates $x$ by some other trio of quantities with
continuously variable values.†
Another and perhaps logically more satisfactory procedure would be
to start (following Heisenberg, Jordan, and Dirac) from the other end,
i.e. with the matrix representation of physical quantities, deriving the
operator representation as an alternative form of it for the case when
the basic quantities admit continuously variable values, and using the
transformation theory for the definition of the probability amplitudes
$a_{Q'C'}$ and, in particular, of the wave functions $\psi^0_{x'H'}$ of the de Broglie-
Schrödinger wave-mechanical theory.
This purely deductive method has, however, from a didactic point
of view, the disadvantage of being too abstract and of starting with
ideas completely alien to customary or ‘classical’ conceptions. The
inductive method, which is adopted in this book, and which makes an
appeal not only to the logic but also to the intuition of the reader,
gradually leading him from the concrete customary conceptions to the
abstract new ideas, may prove more helpful for those who have to get
used to these new ideas and perform the logically simple but
psychologically difficult task of getting rid of the old conceptions.
To this it should be added that the matrix theory remains an empty
scheme so long as no concrete assumptions are made about the
commutation properties and the functional relationship of the matrices
concerned, the problem consisting in the actual determination of the
elements of these matrices from a certain ‘point of view’ (after which
a transition to some other point of view and the determination of the
corresponding probability amplitudes can be made with the help of
the transformation theory). These assumptions, however, involve
considerations which lie outside the logical realm of the matrix theory and
can hardly be understood without the fundamental idea of the
wave-mechanical theory, namely, that the motion of a particle in a given
field of force is determined in terms of probabilities by the propagation
of the associated waves.

† It would be possible to extend the operator theory to the discrete case if differential
coefficients were replaced by finite differences.
This refers in particular to the commutation relations between the
fundamental matrices $x$ and $p$,
$$p_x x - x p_x = \frac{h}{2\pi i}, \quad \text{etc.,} \tag{133}$$
in conjunction with the fact that the latter have to be defined as the
components of the momentum in the classical expression of the energy
$H$ (replaced by the matrix $H_x$).
After these relations, which correspond to the quantum conditions
of Bohr’s theory, have been established, the whole problem of the
wave-mechanical theory can be stated as the transformation of all the
matrices involved (and in the first place of $x$, $p$, and $H$) from the point
of view of $x$ to that of $H$, the transformation coefficients being
the probability amplitudes of finding the particle in a given position
when its energy is known or with a given energy if its position is known.
The actual solution of this problem is usually reduced to the solution
of Schrödinger’s equation involving the operator $H^{(x)}$.
As an illustration of the ‘principle of relativity’ with respect to the
basic quantities in the operator representation, we shall consider the
results which are obtained if the coordinates are replaced in this role
by the momenta $p$. The latter must be considered in this case as
ordinary quantities ($= Q$), while the coordinates, in order that the
‘quantum conditions’ (133) should be satisfied, must be defined as
differential operators according to the formulae
$$x = -\frac{h}{2\pi i}\frac{\partial}{\partial p_x}, \qquad y = -\frac{h}{2\pi i}\frac{\partial}{\partial p_y}, \qquad z = -\frac{h}{2\pi i}\frac{\partial}{\partial p_z}. \tag{133 a}$$
The energy operator $H^{(p)}$ can be determined accordingly as the operator
resulting from the substitution in the classical Hamiltonian function
$(p_x^2+p_y^2+p_z^2)/(2m) + U(x,y,z)$ of the elementary operators (133 a) for the
coordinates. The new wave functions $\psi^0_{p'H'}$ corresponding to this
definition of the energy operator are determined by the differential equation
[cf. (132)]:
$$H^{(p')} \psi^0_{p'H'} = H' \psi^0_{p'H'}, \tag{133 b}$$
which in general is entirely different from that of Schrödinger, since
the kinetic energy $(p_x^2+p_y^2+p_z^2)/(2m)$, which in the $x$-representation
reduced to the Laplacian differential operator of the wave theory
$\nabla^2$ [multiplied by $-h^2/(8\pi^2 m)$], in the $p$-representation remains an
ordinary quantity, or more exactly an ordinary factor which has to be
multiplied by the function $\psi^0_{p'H'}$, while the potential energy becomes a
differential operator acting on this function, the result of the operation
$H^{(p)}$ being equivalent to the multiplication of $\psi^0_{p'H'}$ by a constant factor
$H'$, one of the characteristic values of $H$. As stated above, these
characteristic values must be the same whether we start with the basic
quantities $x$ or $p$.
The probability amplitudes $\psi^0_{p'H'}$ are, however, in general, functions
of $p'$ entirely different from the ordinary wave functions $\psi^0_{x'H'}$ (with the
exception of the case of the harmonic oscillator, where the potential
energy is the same quadratic function of the coordinates as the kinetic
energy is of the momentum components). According to the fundamental
equation of the transformation theory [see, for instance, (119 b)]
they must be connected with each other by the relations
$$\psi^0_{x'H'} = \int a_{x'p'}\, \psi^0_{p'H'}\, dp', \qquad \psi^0_{p'H'} = \int a^{-1}_{p'x'}\, \psi^0_{x'H'}\, dx' = \int a^{*}_{x'p'}\, \psi^0_{x'H'}\, dx', \tag{134}$$
where the transformation coefficients $a_{x'p'}$ can be defined by the operator
equation
$$p^{(x')} a_{x'p'} = p'\, a_{x'p'}, \tag{134 a}$$
that is,
$$\frac{h}{2\pi i}\frac{\partial}{\partial x'} a_{x'p'} = p'_x\, a_{x'p'}, \qquad \frac{h}{2\pi i}\frac{\partial}{\partial y'} a_{x'p'} = p'_y\, a_{x'p'}, \qquad \frac{h}{2\pi i}\frac{\partial}{\partial z'} a_{x'p'} = p'_z\, a_{x'p'}.$$
This gives
$$a_{x'p'} = \frac{1}{\sqrt{h^3}}\, e^{i 2\pi\, p'\cdot x'/h}, \tag{134 b}$$
$p'\cdot x'$ denoting the scalar product of the vectors $p'$ and $r'$, i.e. the sum
$p'_x x' + p'_y y' + p'_z z'$. The coefficient $1/\sqrt{h^3}$ follows from the orthogonality
and normalizing relation
$$\int a_{x'p'}\, a^{*}_{x'p''}\, dx' = \delta(p'-p''), \quad \text{or} \quad \int a_{x'p'}\, a^{*}_{x''p'}\, dp' = \delta(x'-x'').$$
The same result is obtained if the functions $a_{x'p'}$, or rather $a^{-1}_{p'x'}$, are
defined by the operator equation
$$x^{(p')} a^{-1}_{p'x'} = x'\, a^{-1}_{p'x'},$$
which, because $x^{(p)} = -\frac{h}{2\pi i}\frac{\partial}{\partial p_x}$, gives
$$a^{-1}_{p'x'} = \frac{1}{\sqrt{h^3}}\, e^{-i 2\pi\, p'\cdot x'/h}, \tag{134 c}$$
in agreement with the relation $a^{-1}_{p'x'} = a^{*}_{x'p'}$.
Substituting these expressions in (134), we get
$$\psi^0_{x'H'} = \frac{1}{h^{3/2}} \int \psi^0_{p'H'}\, e^{i 2\pi\, p'\cdot x'/h}\, dp'_x\, dp'_y\, dp'_z, \qquad \psi^0_{p'H'} = \frac{1}{h^{3/2}} \int \psi^0_{x'H'}\, e^{-i 2\pi\, p'\cdot x'/h}\, dx'\, dy'\, dz'. \tag{135}$$
The first of these formulae can obviously be regarded as the expansion
of the function $\psi^0_{x'H'}$ in a Fourier integral with the amplitude coefficients
$h^{-3/2}\psi^0_{p'H'}$, while the second gives the explicit expression of these
coefficients. Remembering the wave-mechanical interpretation of the vector
$p'/h$ as equal to the reciprocal of the wave-length and pointing in the
direction of the propagation of the waves associated with the motion
of the particle, we can regard the transformation coefficients $a_{x'p'}$ as
plane sine waves (without the time factor, however!), and we can
interpret the transformation equation (135) as the representation of the
wave function $\psi^0_{x'H'}$ by means of a superposition of plane sine waves
with appropriate amplitudes and travelling in appropriate directions.
This physical interpretation is in complete harmony with the physical
meaning of the Fourier amplitudes $\psi^0_{p'H'}$ as the probability amplitudes
for the particle to have a definite momentum $p'$ (irrespective of its
position) for a given value $H'$ of its total energy to which the function
$\psi^0_{x'H'}$ refers.
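The Fourier pair (135) has a simple discrete analogue, which may help in seeing why the plane-wave coefficients form a unitary transformation matrix. The sketch below is an illustration under assumptions that are not in the text: one dimension instead of three, `N = 64` sample points, the discrete matrix `a[n, k] = exp(2*pi*i*n*k/N)/sqrt(N)` in place of $a_{x'p'}$, and a Gaussian test function.

```python
import numpy as np

N = 64
n = np.arange(N)

# Discrete stand-in for the plane waves a_{x'p'} (assumed normalisation 1/sqrt(N)).
a = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)

# A normalised test 'wave function' on the grid (hypothetical Gaussian).
x = n - N / 2
psi_x = np.exp(-x**2 / 18.0)
psi_x = psi_x / np.linalg.norm(psi_x)

# Analogue of the second formula of (135): amplitudes in the momentum picture.
psi_p = a.conj().T @ psi_x
# Analogue of the first formula of (135): superposition of plane waves.
psi_back = a @ psi_p
```

Since `a` is unitary, the total probability is the same in both pictures, and the superposition reproduces the original function.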
We shall not consider in further detail the generalized transformation
theory and its application to operators other than $x$, $p$, and $H$. There
is, however, one particular class of transformations which have been
alluded to at the end of § 16 as ‘canonical transformations of the second
kind’ and which deserve special notice. They consist in a transition
from the original trio of (rectangular) coordinates ($x$) and the associated
momentum operators $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$, etc., to a new set of mutually
commuting coordinates ($Q$) and mutually commuting momenta ($P$)
satisfying the commutation relation
$$PQ - QP = \frac{h}{2\pi i}\,\delta \tag{136}$$
for a given motion specified by a definite energy operator
$$H^{(x)} = H^{(x)}(x, y, z;\ p_x, p_y, p_z),$$
which is thereby transformed into $H^{(Q)} = H^{(Q)}(Q_1, Q_2, Q_3;\ P_1, P_2, P_3)$.
The quantities $P$ and $Q$ satisfying the above relations are said to be
‘canonically conjugate’ with each other. From the point of view of the
new coordinates ($Q$) the new momenta ($P$) are represented by the
operators $P^{(Q)} = \frac{h}{2\pi i}\frac{\partial}{\partial Q}$ (just as the $Q$’s are represented from the point
of view of the $P$’s by the operators $Q^{(P)} = -\frac{h}{2\pi i}\frac{\partial}{\partial P}$). An operator
representation of the $P$’s from the point of view of the original
coordinates ($x$) is, however, possible in the particular case only when the
$Q$’s are defined as certain functions of the $x$’s not involving the $p$’s or
the $P$’s. In this case, which corresponds to the ‘point transformation’
of the classical theory, the new momenta ($P$) can be expressed as certain
functions of the original ones $p_x$ (involving as parameters the
coordinates $x$ or $Q$). In the general case of a canonical transformation
corresponding to a ‘contact transformation’ of the classical theory such
a relationship between the new and the old variables does not exist and
some kind of matrix representation must be used for the definition of
the latter. The relationship between the new and the old variables can
be expressed with the help of a certain transformation matrix $\Phi$ according
to the equations
$$Q = \Phi^{-1} x\, \Phi, \qquad P = \Phi^{-1} p\, \Phi,$$
that is,
$$Q_i = \Phi^{-1} x_i\, \Phi, \qquad P_i = \Phi^{-1} p_i\, \Phi \qquad (i = 1, 2, 3). \tag{136 a}$$
These equations automatically secure the fulfilment of the commutation
relations which must exist between the new variables
$$Q_i Q_k - Q_k Q_i = 0, \qquad P_i P_k - P_k P_i = 0, \qquad P_i Q_k - Q_k P_i = \frac{h}{2\pi i}\,\delta_{ik} \tag{136 b}$$
as a consequence of those existing between the original ones.
In order that the new variables should be represented by Hermitian
matrices just as the original ones, the transformation matrix $\Phi$ must be
unitary, i.e. satisfy the relation $\Phi^{-1} = \Phi^{\dagger}$.
The equations (136 a) are thus formally quite similar to the equations
(124) of § 16. They have, however, an entirely different physical
meaning. While the transformation matrix $a$ in (124) has a mixed character
referring to two different sets of states, the elements of the matrix $\Phi$
refer to the same set of states specified by the characteristic values of
some basic quantity which serves for the definition of the matrices
$x$, $p_x$, $Q$, $P$, and $H$ (this basic quantity can in particular coincide with
the invariable energy $H$).
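That a similarity transformation with a unitary matrix carries over both the commutation relations and the Hermitian character can be illustrated with finite matrices. This is a sketch only: finite matrices cannot satisfy the relation (136) exactly (the trace of a commutator vanishes), so the random Hermitian matrices `x` and `p` below are hypothetical stand-ins, and the check is merely that whatever commutator they have is transformed along with them.

```python
import numpy as np

def commutator(A, B):
    """Return AB - BA."""
    return A @ B - B @ A

def random_hermitian(n, rng):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

rng = np.random.default_rng(1)
n = 5
x = random_hermitian(n, rng)         # stand-in for a coordinate matrix
p = random_hermitian(n, rng)         # stand-in for a momentum matrix

# A unitary matrix Phi (Q factor of a random complex matrix).
z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
Phi, _ = np.linalg.qr(z)
Phi_inv = Phi.conj().T               # Phi^{-1} equals the Hermitian conjugate

Q = Phi_inv @ x @ Phi                # new 'coordinate'
P = Phi_inv @ p @ Phi                # new 'momentum'

new_commutator = commutator(P, Q)
transformed_old_commutator = Phi_inv @ commutator(p, x) @ Phi
```

If the original commutator is a multiple of the unit matrix, as in the infinite-dimensional case, the transformed one is the same multiple, which is the content of the statement about (136 b).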
The equations (136 a) must be considered as corresponding to the
classical equations Q —^ [cf. (31 a), § 4] defining a contact
transformation with the help of an arbitrary function $\Phi$. In the quantum
theory the latter is replaced by the likewise arbitrary transformation
matrix $\Phi$.
In the classical theory a canonical transformation is characterized by
the fact that it does not alter the canonical form of the equations of
motion. The same criterion is easily seen to apply to the canonical
transformation (136 a) of the quantum theory.
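We have, in fact, differentiating $Q$ and $P$ with respect to the time $t$,
$$\frac{dQ}{dt} = [H, Q], \qquad \frac{dP}{dt} = [H, P],$$
which in virtue of (136) can be written in the form
$$\frac{dQ}{dt} = \frac{\partial H}{\partial P}, \qquad \frac{dP}{dt} = -\frac{\partial H}{\partial Q}$$
[cf. § 7, eqs. (43 a) and (44 c)].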
An equivalent form of the condition that the variables $P$ and $Q$
should be canonically conjugate (in the classical sense), i.e. that they
should satisfy the canonical equations of motion, is that the Poisson
bracket expression
$$[A, B] = \sum_i \left( \frac{\partial A}{\partial P_i}\frac{\partial B}{\partial Q_i} - \frac{\partial A}{\partial Q_i}\frac{\partial B}{\partial P_i} \right)$$
should be equal to 1 for $A = P_i$, $B = Q_i$ ($i = 1, 2, 3$) and to 0 for all
the other combinations of the variables $P$, $Q$. This condition
corresponds to the commutation conditions (136 b) which can be written in
the form $[Q_i, Q_k] = 0$, $[P_i, P_k] = 0$, $[P_i, Q_k] = \delta_{ik}$, the classical Poisson
bracket being the analogue of the quantum bracket expression
$[A, B] = \frac{2\pi i}{h}(AB - BA)$ (cf. § 8).
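The classical bracket condition can be tried out numerically for one degree of freedom. The sketch below makes several assumptions that are not in the text: the bracket is evaluated with respect to the original variables `x`, `p` using the same sign convention (so that the bracket of `p` with `x` is 1), derivatives are approximated by central differences, and the transformation chosen, a rotation of the phase plane through an angle `theta`, is just one convenient example of a canonical transformation.

```python
import math

def poisson_bracket(A, B, x, p, eps=1e-6):
    """[A, B] = dA/dp * dB/dx - dA/dx * dB/dp, by central differences."""
    dA_dx = (A(x + eps, p) - A(x - eps, p)) / (2 * eps)
    dA_dp = (A(x, p + eps) - A(x, p - eps)) / (2 * eps)
    dB_dx = (B(x + eps, p) - B(x - eps, p)) / (2 * eps)
    dB_dp = (B(x, p + eps) - B(x, p - eps)) / (2 * eps)
    return dA_dp * dB_dx - dA_dx * dB_dp

theta = 0.7  # hypothetical rotation angle in the (x, p) plane

def Q(x, p):
    return x * math.cos(theta) + p * math.sin(theta)

def P(x, p):
    return -x * math.sin(theta) + p * math.cos(theta)

bracket_PQ = poisson_bracket(P, Q, 0.3, -1.2)   # expected close to 1
bracket_QQ = poisson_bracket(Q, Q, 0.3, -1.2)   # expected close to 0
bracket_PP = poisson_bracket(P, P, 0.3, -1.2)   # expected close to 0
```

Analytically the first bracket is $\cos^2\theta + \sin^2\theta = 1$, so the rotated variables are canonically conjugate for every value of `theta`.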
18. Geometrical Representation of the Transformation Theory
The understanding of the generalized matrix theory, connected with
the ‘principle of relativity’ in the choice of the basic quantities and
with the transformation from one ‘basis’ to another, can be greatly
facilitated by the use of a geometrical picture, or rather of a geometrical
language, suggested by the formal similarity between the equations of
the transformation theory developed in the preceding sections and the
theory of linear orthogonal transformations of ordinary analytical
geometry. The nucleus of this analogy is that in both cases the
transformation equations are linear (and homogeneous) and that the
transformation coefficients satisfy similar orthogonality and normalizing
relations. (The mere idea of ‘orthogonality’ is suggestive of mutually
perpendicular axes.)
The choice of the basic quantities in the present theory corresponds
to the choice of the coordinate system in the geometrical theory, and
the relativity in the choice of these basic quantities corresponds to the
relativity in the choice of the coordinate system—or, in other words,
to the equivalence of all the directions in space.
It will be remembered that in analytical geometry a linear orthogonal
transformation means a set of linear homogeneous equations between
the coordinates $x = x_1$, $y = x_2$, $z = x_3$ of an arbitrarily chosen point
with respect to one system of axes, $S$, say, and the coordinates of the
same point $\xi = \xi_1$, $\eta = \xi_2$, $\zeta = \xi_3$ with respect to another system $\Sigma$,
both systems being orthogonal and having the same origin. These
equations can be written in the form
$$x_n = \sum_{\nu} a_{n\nu}\, \xi_\nu, \qquad \xi_\nu = \sum_{n} a^{-1}_{\nu n}\, x_n, \tag{137}$$
with
$$a_{n\nu} = a^{-1}_{\nu n} = \cos(x_n, \xi_\nu). \tag{137 a}$$
The relations $a_{n\nu} = a^{-1}_{\nu n}$, which are geometrically evident, can be
obtained analytically from the orthogonality condition
$$\sum_n x_n^2 = \sum_\nu \xi_\nu^2, \tag{137 b}$$
which gives, in conjunction with (137),
$$\sum_n a_{n\nu'}\, a_{n\nu''} = \delta_{\nu'\nu''}, \qquad \sum_\nu a^{-1}_{\nu n'}\, a^{-1}_{\nu n''} = \delta_{n'n''}. \tag{137 c}$$
On the other hand, substituting the expressions of the $\xi$’s in those of
the $x$’s and vice versa, we have
$$\sum_\nu a_{n'\nu}\, a^{-1}_{\nu n''} = \delta_{n'n''}, \qquad \sum_n a^{-1}_{\nu' n}\, a_{n\nu''} = \delta_{\nu'\nu''}. \tag{137 d}$$
The comparison of these equations with the preceding equations leads to
the relations (137 a), without, of course, the geometrical interpretation
with which we started.
The transformation theory which has been developed in the preceding
sections can be obtained from this elementary theory of linear ortho
gonal transformations by a twofold generalization.
Firstly, by making the number of coordinates specifying a point
infinite, i.e. by considering, instead of the ordinary three-dimensional
space, a fictitious space with infinitely many dimensions.
Secondly, by considering the coordinates of a point as complex
quantities and by defining the square of its distance from the origin,
not as the sum of the squares of the coordinates, but as the sum of
the squares of their moduli, thus replacing the orthogonality condition
(137 b) by the following condition:
$$\sum_n x_n x_n^{*} = \sum_\nu \xi_\nu \xi_\nu^{*}, \tag{138}$$
the summation being extended over all the coordinates. We get in this
case, instead of (137 c),
$$\sum_n a^{*}_{n\nu'}\, a_{n\nu''} = \delta_{\nu'\nu''}, \qquad \sum_\nu a^{-1*}_{\nu n'}\, a^{-1}_{\nu n''} = \delta_{n'n''},$$
and, since equations (137 d) are not altered,
$$a^{-1}_{\nu n} = a^{*}_{n\nu} \quad \text{or} \quad a_{n\nu} = a^{-1*}_{\nu n}, \tag{138 a}$$
that is, $a^{-1} = a^{\dagger}$.
In the special case of real coordinates $x$, $\xi$, this ‘unitary’ transformation
reduces to the usual orthogonal transformation (though with an
unlimited number of variables), and we get $a^{-1} = a^{\dagger} = \tilde{a}$ (transposed
matrix), that is, $a^{-1} = \tilde{a}$, which is another expression of the relations
(137 a). Although a geometrical interpretation cannot be associated
with an infinite number of complex variables $x$, $\xi$, connected with each
other by a unitary transformation, yet, since the number of variables
does not make any difference from the purely analytical point of view
(so long as it is larger than 1), we can preserve, if not a geometrical
picture, at least a geometrical language with respect to the variables
$x$, $\xi$ and the transformation coefficients $a_{n\nu}$. We can accordingly regard
(or rather denote) the former as the coordinates of a point in a space
of infinitely many dimensions with respect to two orthogonal systems
of coordinates $S$ and $\Sigma$, while the latter can still be regarded (or denoted)
as the cosines of the angles between the old and the new coordinate
axes. The variables $x_n$ and $\xi_\nu$ can be defined also as the projections
(or components) of a certain vector $r$ on these axes.
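The condition (138) and the relation $a^{-1} = a^{\dagger}$ can be checked on a small example. The following is a sketch with assumed ingredients: six dimensions instead of infinitely many, a unitary matrix `a` obtained from a QR factorisation, and random complex coordinates `xi`.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 6

# A unitary matrix a (Q factor of a random complex matrix).
z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
a, _ = np.linalg.qr(z)

# Complex coordinates of a point with respect to the system Sigma.
xi = rng.normal(size=dim) + 1j * rng.normal(size=dim)

# Coordinates of the same point in S: x_n = sum_nu a[n, nu] * xi[nu].
x = a @ xi

squared_length_S = np.sum(np.abs(x) ** 2)
squared_length_Sigma = np.sum(np.abs(xi) ** 2)
a_inverse = np.linalg.inv(a)
```

The sum of the squared moduli is the same in both systems, and the inverse matrix coincides with the conjugate transpose, as the unitarity relation requires.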
In the simplest matrix transformation problem which was considered
at the beginning of § 15, the role of the coordinates $x_n$ and $\xi_\nu$ is played
by the characteristic functions (or rather amplitudes) $\psi^0_{H'}$ and $\phi^0_{K'}$. This is
clearly seen from the fact that they are transformed according to
equations (110) and (111) which are the analogues of equations (137),
and that they satisfy the orthogonality relation (113) which is exactly
of the same type as (138). We can thus describe the matrix
transformation theory in a very suggestive geometrical language, according to the
following principles.
Each stationary state specified by a wave function $\psi^0_{H'}$ can be
represented geometrically by a certain direction or axis $H'$ in a space of
infinitely many dimensions, which we shall call the state-space. The
states specified by the different functions $\psi^0_{H'}$ are represented by axes
$H'$ which are perpendicular to each other, the complete set of states
defined by the operator $H$ forming a complete orthogonal system of
coordinate axes in the state-space, which we shall also denote by the
letter $H$. The ‘completeness’ of the system means that any ‘vector’ in
the state-space can be represented as the geometrical sum of its
components along the axes of $H$.
This applies in particular to vectors drawn in the directions of another
complete orthogonal system of axes $K'$, which represent geometrically
the stationary states defined by the operator $K$. The transformation
coefficients $a_{H'K'}$ can be regarded as the projections of a unit vector in
the direction of a definite axis $K'$ on the different axes $H'$ or, loosely
speaking, as the cosines of the angles between the axes $K'$ and $H'$. The
latter expression requires, however, a correction, inasmuch as the
coefficients $a^{-1}_{K'H'} = a^{*}_{H'K'}$ can also pretend to the same role, for they
represent the projection of a unit vector in the direction of a certain
axis $H'$ on the different axes $K'$. This interpretation of $a_{H'K'}$ and $a^{-1}_{K'H'}$
immediately follows from the comparison of the transformation equations
$\phi^0_{K'} = \sum_{H'} a_{H'K'}\, \psi^0_{H'}$, $\psi^0_{H'} = \sum_{K'} a^{-1}_{K'H'}\, \phi^0_{K'}$ with (137).
It should be remembered that the quantities $\psi^0_{H'}$ and $\phi^0_{K'}$ appearing
in these equations in the role of rectangular coordinates of a point in
the state-space are themselves functions of the ordinary spatial
coordinates $x$, $y$, $z$, and that, moreover, they refer to the same (arbitrarily
chosen) point.
So long as this point remains unspecified, $\psi^0_{H'}$ and $\phi^0_{K'}$ can be treated
as vectors, but as soon as we specify it, putting $x = x'$, we get numbers
$\psi^0_{x'H'}$ and $\phi^0_{x'K'}$ which, as we know, both with regard to their physical
meaning (as probability amplitudes) and analytical properties (as
transformation coefficients), are wholly similar to the numbers $a_{H'K'}$. We
can regard them accordingly as the components of the vectors $\psi^0_{H'}$ and
$\phi^0_{K'}$ along the axes of a third coordinate system $X$ in the state-space,
each axis $x'$ of this system specifying a definite position $x = x'$, $y = y'$,
$z = z'$ of the particle in the ordinary space. The axes of this new system
$X$ must be regarded as orthogonal (i.e. mutually perpendicular) in spite
of the fact that they correspond not to a discrete set of states, like the
axes of the system $H$ or $K$, but to a continuum of states.
Since the functions $\psi^0_{x'H'}$ and $\phi^0_{x'K'}$ are normalized to unity, both with
respect to $x$ and to $H$ or $K$, the vectors $\psi^0_{H'}$, $\phi^0_{K'}$, as well as $\psi^0_{x'}$, $\phi^0_{x'}$ (the
latter specifying a certain position in space irrespective of the values
of the energy $H$ or $K$) can be regarded as unit vectors (i.e. having the
length unity) and the numbers $\psi^0_{x'H'}$ and $\phi^0_{x'K'}$ interpreted geometrically
in the same way as the numbers $a_{H'K'}$, namely, as the cosines of the
angles between the axes $x'$ (not in the ordinary space of course, but
in the state-space!) on the one hand, and between the axes $H'$ or $K'$
on the other.
From this point of view the transformation equations
$$\phi^0_{x'K'} = \sum_{H'} \psi^0_{x'H'}\, a_{H'K'}, \qquad \psi^0_{x'H'} = \sum_{K'} \phi^0_{x'K'}\, a^{-1}_{K'H'} \tag{139}$$
acquire an extremely simple geometrical meaning: they become, namely,
the generalization of the well-known formula of analytical geometry for
the cosine of the angle between two directions, $x'$ and $K'$, say, expressed
in terms of the cosines of the angles between these directions and a
complete set of mutually perpendicular directions constituting a
coordinate system $H$.
In fact, if we write $\cos(x', K')$, $\cos(x', H')$, and $\cos(H', K')$ instead of
$\phi^0_{x'K'}$, $\psi^0_{x'H'}$, and $a_{H'K'}$ respectively, the first of equations (139) assumes
the familiar form
$$\cos(x', K') = \sum_{H'} \cos(x', H') \cos(H', K').$$
It becomes, however, necessary to distinguish two different cosines
between the same two directions (corresponding to the projection of the
first on the second or the second on the first), since $a^{-1}_{K'H'} = \cos(K', H')$
is not equal to $a_{H'K'} = \cos(H', K')$ but to its conjugate complex:
$\cos(K', H') = \cos^{*}(H', K')$. (The same refers, of course, to the functions
$\psi^0_{x'H'}$ and $\psi^{0*}_{x'H'}$ or $\phi^0_{x'K'}$ and $\phi^{0*}_{x'K'}$.)
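The composition law for the ‘cosines’ and the conjugate relation between the two cosines of the same pair of directions can be illustrated with three orthonormal systems of axes in a small complex space. This is a sketch under assumptions not in the text: four dimensions, and random unitary matrices `X`, `H`, `K` whose columns play the part of the unit vectors along the axes of the three systems.

```python
import numpy as np

def random_unitary(n, rng):
    """Columns form a complete orthonormal system of axes."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def amplitudes(A, B):
    """Matrix of 'cosines' between the axes of system A and those of B."""
    return A.conj().T @ B

rng = np.random.default_rng(3)
X, H, K = (random_unitary(4, rng) for _ in range(3))

# Cosines between x' and K' computed directly ...
xK_direct = amplitudes(X, K)
# ... and by summing over the complete set of directions H'.
xK_via_H = amplitudes(X, H) @ amplitudes(H, K)

# The two cosines of one pair of directions are complex conjugates.
HK = amplitudes(H, K)
KH = amplitudes(K, H)
```

The matrix product performs the summation over the intermediate axes $H'$, so the agreement of `xK_direct` and `xK_via_H` is the finite-dimensional form of the first of equations (139).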
Following Dirac, we shall often use in future the simplified notation
$(K'|H')$ and $(H'|K')$ for these two ‘cosines’ or transformation
coefficients; we shall write likewise
$$\psi^0_{x'H'} = (x'|H'), \qquad \psi^{0*}_{x'H'} = (H'|x'),$$
thus avoiding the unnecessary complications arising from the use of
different letters, $a$, $\psi^0$, $\phi^0$, etc. The unit vector (in the state-space)
defining a certain state $x'$, $H'$, or $K'$ per se, i.e. irrespective of the other
states with which it can be associated, will be denoted accordingly by
the symbols $(x'|)$, $(H'|)$, $(K'|)$ or $(|x')$, $(|H')$, $(|K')$. This notation has
the advantage of representing the same thing by the same symbol (or
two ‘conjugate’ symbols), while in our previous notation the same state
corresponding to a given position $x'$ was described by two different
symbols $\psi^0_{x'}$ or $\phi^0_{x'}$, depending upon the ‘coordinate system’ $H$ or $K$
which we had in mind.
With the new notation the transformation equations (139) can be
written in the form
$$(x'|K') = \sum_{H'} (x'|H')(H'|K'), \qquad (x'|H') = \sum_{K'} (x'|K')(K'|H'). \tag{139 a}$$
Since the three coordinate systems $H$, $K$, and $x$ are equivalent to each
other, we could write by analogy a third relation of the same form,
namely,
$$(H'|K') = \sum_{x'} (H'|x')(x'|K'),$$
if $x'$ were discretely variable, like $H'$ and $K'$. Since, however, $x'$ is
continuously variable, we must replace the sum by an integral over
$x'$, which gives
$$(H'|K') = \int (H'|x')(x'|K')\, dx', \tag{139 b}$$
or, in the previous notation,
$$a_{H'K'} = \int \psi^{0*}_{H'}\, \phi^0_{K'}\, dV,$$
which is nothing else but the formula (110 a) obtained at the beginning
of § 15, and again in the way just shown (but without the associated
geometrical interpretation) somewhat later.
The preceding equations (139 a) and (139 b) hold, of course, for any
three sets of states which may be specified by three basic ‘trios’. It
should be remembered that, from the physical point of view, they
express the addition and multiplication law for the probability
amplitudes. The geometrical interpretation of the probability amplitudes
$(Q'|C')$ as the cosines between the directions $Q = Q'$ and $C = C'$ in
the state-space is in perfect harmony with the initial interpretation
of the orthogonality between two functions representing two different
states as the expression of the alternative character of these states. All
those states which are represented by mutually perpendicular directions
in the state-space are alternative or mutually exclusive, in the sense
that the probability of finding the particle in one of them when it is
known to be in another is equal to zero. All such states may always
be referred to the same set.
Having elucidated the geometrical meaning of the probability
amplitudes (or transformation matrices), we shall now turn to the geometrical
interpretation of the ordinary matrices, which represent physical
quantities from one or the other point of view. This interpretation is
again determined by the transformation equations (121) which show
that Hermitian matrices can be considered as a generalization of the
so-called tensors, or more exactly symmetrical tensors, of the elementary
three-dimensional analytical geometry.
A tensor can be defined as a composite quantity with a number of
components, each of which refers to two axes of the same system of
coordinates, and behaves with respect to a transformation of the
coordinate system in the same way as the product of the components of
two vectors along the corresponding axes.
Let us consider again the two coordinate systems $S$ and $\Sigma$ and denote
the components of the same vector, $f$, say, along the axes of $S$ and $\Sigma$
by $f_n$ and $f_\nu$ respectively. If $g$ is some other vector, and if we form
the products of all the components of $f$ with all the components of $g$,
referred to the same system, we shall obtain a set of 9 quantities
$$T_{mn} = f_m g_n \quad \text{or} \quad T_{\mu\nu} = f_\mu g_\nu, \tag{140}$$
which can be considered as the components of the same tensor $T$
referred to, or represented from the point of view of, the coordinate
system $S$ or $\Sigma$. Taking into account the transformation equations,
$$f_\mu = \sum_m a_{m\mu} f_m, \qquad g_\nu = \sum_n a_{n\nu}\, g_n,$$
with the coefficients $a_{n\nu} = a^{-1}_{\nu n} = \cos(x_n, \xi_\nu)$ as before, we get
$$T_{\mu\nu} = \sum_m \sum_n a_{m\mu}\, a_{n\nu}\, T_{mn} = \sum_m \sum_n a^{-1}_{\mu m}\, T_{mn}\, a_{n\nu} \tag{140 a}$$
and
$$T_{mn} = \sum_\mu \sum_\nu a_{m\mu}\, a_{n\nu}\, T_{\mu\nu} = \sum_\mu \sum_\nu a_{m\mu}\, T_{\mu\nu}\, a^{-1}_{\nu n}. \tag{140 b}$$
These transformation equations can serve to define a tensor $T$ in the
general case, when its components cannot be put in the simple form
(140). These equations can obviously be written in the following matrix
form:
$$T_\Sigma = a^{-1}\, T_S\, a, \qquad T_S = a\, T_\Sigma\, a^{-1},$$
which makes it evident that a matrix $F_C$ representing some quantity
$F$ from the point of view of some other basic quantity $C$, can be
interpreted geometrically as a certain tensor $F$ in the state-space referred
to a system of coordinates whose axes represent the states specified by
the characteristic values of $C$.
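The equivalence of the product definition (140) and the matrix form of the transformation law can be verified in three dimensions. The sketch below assumes a random rotation-like matrix `a` (real orthogonal, from a QR factorisation) and random vectors `f_S`, `g_S`; these names and values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)

# Real orthogonal matrix: a[m, mu] plays the part of cos(x_m, xi_mu).
a, _ = np.linalg.qr(rng.normal(size=(3, 3)))

# Components of two vectors f and g in the system S.
f_S = rng.normal(size=3)
g_S = rng.normal(size=3)
T_S = np.outer(f_S, g_S)             # T_mn = f_m g_n

# Components of the same vectors in Sigma: f_mu = sum_m a[m, mu] f_m.
f_Sigma = a.T @ f_S
g_Sigma = a.T @ g_S

# Tensor in Sigma built from the transformed vectors ...
T_Sigma_from_vectors = np.outer(f_Sigma, g_Sigma)
# ... and from the matrix law T_Sigma = a^{-1} T_S a (a^{-1} = a.T here).
T_Sigma_from_law = a.T @ T_S @ a
```

Both routes give the same nine components, and applying the inverse law `a @ T_Sigma_from_law @ a.T` recovers `T_S`.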
The matrices $F_C$ representing real quantities are Hermitian, i.e.
satisfy the relation
$$F_{C''C'} = F^{*}_{C'C''},$$
which can be considered as the generalization of the condition
$$T_{nm} = T_{mn}, \qquad T_{\nu\mu} = T_{\mu\nu}$$
for the symmetrical tensors of ordinary analytical geometry.
Now such tensors admit of a very simple and suggestive geometrical
illustration, namely, that of a central quadric (ellipsoid, hyperboloid),
defined by the equation
$$\sum_m \sum_n T_{mn}\, x_m x_n = 1 \tag{140 c}$$
in the coordinate system $S$, or
$$\sum_\mu \sum_\nu T_{\mu\nu}\, \xi_\mu \xi_\nu = 1 \tag{140 d}$$
in the coordinate system $\Sigma$.
The fact that these two equations represent the same surface, i.e. that
the coefficients $T_{mn}$ and $T_{\mu\nu}$ are transformed into each other according
to equations (140 a) and (140 b), can be proved by substituting in
(140 d) the expressions $\xi_\mu = \sum_m a_{m\mu} x_m$, $\xi_\nu = \sum_n a_{n\nu} x_n$, which gives
$$\sum_\mu \sum_\nu \sum_m \sum_n T_{\mu\nu}\, a_{m\mu}\, a_{n\nu}\, x_m x_n = 1,$$
or, changing the order of summation with regard to the Greek and
Latin indices,
$$\sum_m \sum_n x_m x_n \Bigl( \sum_\mu \sum_\nu a_{m\mu}\, a_{n\nu}\, T_{\mu\nu} \Bigr) = 1,$$
which, in view of (140 b), coincides with (140 c).
The components of a symmetrical tensor referred to a system of
coordinates can thus be interpreted as the coefficients in the equation
of a certain central quadric referred to the same coordinate system;
this makes it possible to visualize a symmetrical tensor, without any
reference to a system of coordinates, as the quadric surface which it defines.
It should be mentioned that a quadric surface can be defined, according
to (140 c), by a non-symmetrical tensor just as well as by a
symmetrical one. But it will actually contain the sum of the components
$T_{mn} + T_{nm}$ referring to the coordinates $x_m$ and $x_n$ as the coefficient of
their product $x_m x_n$. The asymmetry of $T$, if any, will therefore not be
manifested in the shape of the surface, or, in other words, the latter
will define only the symmetrical part of $T$. Thus a tensor can be
completely specified by a quadric surface only when it is symmetrical.
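That the quadric form only sees the symmetrical part of a tensor is easy to confirm numerically. The following sketch assumes a random non-symmetrical 3 x 3 tensor `T` and a random point `x`; the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.normal(size=(3, 3))          # a non-symmetrical tensor
T_sym = (T + T.T) / 2                # its symmetrical part
T_anti = (T - T.T) / 2               # its antisymmetrical part

x = rng.normal(size=3)               # an arbitrary point

form_full = x @ T @ x                # sum_mn T_mn x_m x_n
form_sym = x @ T_sym @ x             # same form with the symmetrical part only
form_anti = x @ T_anti @ x           # contribution of the antisymmetrical part
```

The antisymmetrical part contributes nothing, so `T` and `T_sym` define the same surface.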
Every central surface of the second order has three mutually
perpendicular axes of symmetry, which can be defined by the condition
that, referred to a system of coordinates $\Sigma$ whose axes coincide with
its symmetry axes, the equation of the quadric reduces to the ‘canonical’
form
$$T_{11}\,\xi_1^2 + T_{22}\,\xi_2^2 + T_{33}\,\xi_3^2 = 1,$$
not containing products of different coordinates.
This can be expressed by saying that the matrix $T_\Sigma$ considered from
this point of view is diagonal. The possibility of reducing the equation
of a central quadric to the canonical form, i.e. the existence of symmetry
axes, is proved by a well-known method which at the same time leads
to the actual determination of the cosines between these axes and the
original axes $x_n$, i.e. of the coefficients of the orthogonal transformation
$a_{n\nu}$ and of the diagonal elements of the transformed matrix, or, in
other words, of the characteristic values of the tensor $T$, $T_{\nu\nu} = T'$.
This method consists in defining the vertices of the quadric (i.e. the
end-points of the symmetry axes) by either one of the following
conditions:
(1) The normals to the surface at the vertices coincide in direction
with the radii vectores from the centre. This condition leads to the
equations
$$\frac{\partial F}{\partial x_m} \ \text{proportional to} \ x_m,$$
where $F$ denotes the left side of equation (140 c), or, if the
proportionality factor is denoted by $2T'$:
$$\sum_n T_{mn}\, x_n = T' x_m. \tag{141}$$
So long as we are dealing with ordinary three-dimensional space, this
is a set of three linear equations which are compatible with each other
if their determinant vanishes. The latter condition gives a cubic
equation for $T'$, and to the three roots of it there correspond three sets of
$x_n$ values, $x_{nT'}$ say, which define three mutually perpendicular vectors,
and reduce to the cosines of the angles between the old axes and the
symmetry axes if normalized to unity. The three values of $T'$ turn out
to be the three non-vanishing diagonal elements of the transformed
matrix or tensor $T_\Sigma$.
(2) The distances of the vertices from the centre or their squares
$r^2 = \sum_n x_n^2$ have the largest or smallest possible values, consistent with
the equation $F = \sum_m \sum_n T_{mn}\, x_m x_n = 1$.
This gives, with the help of Lagrange’s method of undetermined
multipliers, a system of equations derived from
$$\delta r^2 + \lambda\, \delta F = 0 \tag{141 a}$$
by equating to zero the coefficients of the variations of the separate
coordinates with a properly chosen value of the coefficient $\lambda$. Putting
$\lambda = -1/T'$, we again get equations (141).
It should be mentioned that the variational equation (141 a) can be
interpreted as the condition that $F$ should have a maximum, minimum,
or stationary value while $r^2$ is kept constant, for instance equal to unity.
(3) Finally we could find the symmetry axes of $T$ by defining the
transformation coefficients $a_{n\nu}$ in equations (140 a) in such a way that
the three transformed non-diagonal components of $T$ vanish, or, in
other words, that the transformed matrix $T_\Sigma$ be diagonal. This again,
as can easily be shown, leads to equations (141) or, more exactly, to
$$\sum_n T_{mn}\, x_{nT'} = T' x_{mT'}.$$
These equations, as well as equations (141), are obviously of the same
type as equations (122b) or (123) of § 16 defining the transformation
of the matrix $K_H$ to the diagonal matrix $K_K$. They only differ in the
number of dimensions, this being equal to three in the case of ordinary
space and to infinity in the case of the state-space to which the latter
equations refer. Another difference between them and the corresponding
elementary equations is that the vectors and tensors with which
we have to do in the case of the state-space are complex, the symmetry
condition for the ordinary tensors being replaced by the Hermitian
condition for the tensors in the state-space.
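The equivalence of conditions (1) and (2) can be checked numerically in the simplest case of two dimensions. The sketch below is only an illustration: the tensor components are arbitrary numbers chosen here, the function names are hypothetical, and the tensor is taken real and symmetric with a non-vanishing off-diagonal component.

```python
import math

def principal_axes_2d(t11, t12, t22):
    """Return (values, axes) for the symmetric tensor [[t11, t12], [t12, t22]].

    values are the two roots T' of the determinant condition of (141);
    axes are the matching unit vectors (direction cosines).
    Assumes t12 != 0, so that the first row of (141) fixes the direction.
    """
    mean = 0.5 * (t11 + t22)
    half_gap = math.hypot(0.5 * (t11 - t22), t12)
    values = (mean - half_gap, mean + half_gap)
    axes = []
    for t_prime in values:
        # First row of (141): (t11 - T') x1 + t12 x2 = 0
        x1, x2 = t12, t_prime - t11
        norm = math.hypot(x1, x2)
        axes.append((x1 / norm, x2 / norm))
    return values, axes

# Illustrative components (an assumption, not taken from the text).
T11, T12, T22 = 2.0, 1.0, 3.0
values, axes = principal_axes_2d(T11, T12, T22)

def form(x1, x2):
    """Left side F of the quadric equation for this tensor."""
    return T11 * x1 * x1 + 2.0 * T12 * x1 * x2 + T22 * x2 * x2

# Condition (2) in its second reading: extreme values of F on r^2 = 1.
samples = [
    form(math.cos(2 * math.pi * k / 3600), math.sin(2 * math.pi * k / 3600))
    for k in range(3600)
]
scan_min, scan_max = min(samples), max(samples)

# Squared distance of each vertex from the centre: F = 1 along an axis
# direction gives r^2 = 1 / T'.
vertex_r2 = [1.0 / form(x1, x2) for (x1, x2) in axes]
```

The scanned extreme values of $F$ on the unit circle agree with the roots $T'$ to within the resolution of the scan, and the squared semi-axes come out as their reciprocals.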
With this amendment, which from the purely analytical point of view
is merely a trivial generalization of the ideas and relations of ordinary
analytical geometry, we can apply the tensor idea and the idea of a
quadric central surface in the state-space for the representation of
physical quantities which have hitherto been represented by Hermitian
matrices. The idea of a tensor, together with the ‘principle of relativity’
in the choice of the coordinate system, is actually equivalent to the
idea of a matrix in conjunction with the principle of relativity of the
basic quantities which determine the coordinate system.
The additional feature of the geometrical representation derived by
generalizing the ordinary geometrical theory is the possibility of
thinking of a quantity $F$ as pictured, as it were, by a central quadric surface
in the state-space, the axes of symmetry of this surface representing
the different states specified by the characteristic values of $F$, and these
characteristic values being inversely proportional to the squares of the
length of these axes drawn from the centre to the vertices (without
being prolonged to infinity). The latter relation follows from the fact
that in the canonical form of the equation of the quadric
$$\sum_\mu T_{\mu\mu}\,\xi_\mu^2 = 1$$
the coefficients $T_{\mu\mu}$, which are obviously the reciprocals of the squares
of the lengths of the axes (with positive or negative sign), represent at
the same time the characteristic values $T'$ (or $T'$, $T''$, $T'''$) of the
tensor $T$.
The equation of a quadric surface representing in the state-space
a certain quantity $F$ referred to the symmetry axes of the quadric
surface which represents some other quantity, $C$, say, can be written
in the form
$$\sum_{C'}\sum_{C''} F^0_{C'C''}\, a^*_{C'} a_{C''} = \text{const.}, \qquad (142)$$
if the values of $C$ form a discrete set, or in the form
$$\iint F^0_{C'C''}\, a^*_{C'} a_{C''}\, dC'\,dC'' = \text{const.}, \qquad (142a)$$
if they vary in a continuous manner, while the expression
$$E = \sum_{C'} a^*_{C'} a_{C'} \qquad (142b)$$
or
$$E = \int a^*_{C'} a_{C'}\, dC' \qquad (142c)$$
can be interpreted as the square of the distance from the common
centre of the two surfaces to some point with the coordinates $a_{C'}$.
The characteristic values of $F$ and the states specified by them can
be found by transforming the quadric (142) to the canonical form, i.e.
to the symmetry axes of $F$. This problem, as we know already, is
solved by the transformation equations
$$\sum_{C''} F^0_{C'C''}\, a_{C''} = F' a_{C'} \quad\text{or}\quad \int F^0_{C'C''}\, a_{C''}\, dC'' = F' a_{C'}, \qquad (143)$$
the resulting normalized $a_{C'} = a_{C'F'} = (C'|F')$ being the cosines of the
angles between the symmetry axes of $C$ and those of $F$, or, from the
physical point of view, the probability amplitudes for getting a certain
value for $C$ when that of $F$ is supposed to be known.
An important relationship between the two quantities is expressed
by the coincidence of the symmetry axes of the associated surfaces.
This means the coincidence of the states specified by the corresponding
characteristic values of $F$ and $C$ and is equivalent to the condition
that $F$ and $C$, defined as matrices or operators from any common point
of view ($Q$ say), commute with each other. To prove this we shall first
put $Q = C$. The matrices $F_C$ and $C_C$, being both diagonal, must
commute with each other, since their product is also a diagonal matrix,
independent of the order of the factors:
$$(FC)_{C'C''} = F_{C'C'}\, C_{C'C'}\, \delta_{C'C''} = (CF)_{C'C''}.$$
Now when $Q \neq C$ one can always define a (unitary) transformation
matrix $b$ which will transform $C$ into $Q$ according to the equation
$Q = bCb^{-1}$. According to the invariance property with regard to
canonical transformations of this form expressed by equation (124a),
we must have
$$F_Q C_Q - C_Q F_Q = b\,(F_C C_C - C_C F_C)\,b^{-1} = 0.$$
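The invariance argument can be illustrated with small matrices. The following is a minimal sketch under stated assumptions: three dimensions instead of infinitely many, real orthogonal $b$ (a special case of a unitary matrix), and arbitrary numerical entries; all helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

# F and C referred to the axes of C: both diagonal (common symmetry axes).
F_C = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]]
C_C = [[5.0, 0.0, 0.0], [0.0, 7.0, 0.0], [0.0, 0.0, 11.0]]

# A real orthogonal b built from two plane rotations; its inverse is its transpose.
c1, s1 = math.cos(0.7), math.sin(0.7)
c2, s2 = math.cos(0.3), math.sin(0.3)
rot_z = [[c1, -s1, 0.0], [s1, c1, 0.0], [0.0, 0.0, 1.0]]
rot_x = [[1.0, 0.0, 0.0], [0.0, c2, -s2], [0.0, s2, c2]]
b = matmul(rot_z, rot_x)
b_inv = transpose(b)

def to_Q(m):
    """Transform a matrix from the C point of view to the Q point of view."""
    return matmul(matmul(b, m), b_inv)

F_Q, C_Q = to_Q(F_C), to_Q(C_C)
residual = max_abs(commutator(F_Q, C_Q))      # vanishes up to rounding

# A tensor whose axes do not coincide with those of C does not commute with it.
G_C = [[1.0, 0.5, 0.0], [0.5, 2.0, 0.0], [0.0, 0.0, 4.0]]
nonzero = max_abs(commutator(to_Q(G_C), C_Q))
```

In the $Q$ representation neither $F_Q$ nor $C_Q$ is diagonal, yet their commutator still vanishes, while the tensor `G_C` with different symmetry axes gives a commutator of order unity.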
The transformation equations from $C$ to $F$ in the general case when
these quantities do not commute can be derived from a variational
principle of the same type as that which serves to determine the vertices
of a quadric in ordinary analytical geometry. We can put, namely,
$\delta E = 0$, subject to the condition (142) or (142a), giving
$$\delta F - F'\,\delta E = 0, \qquad (143a)$$
where $F$ denotes the left-hand side of (142) or (142a) and $E$ the
expression (142b) or (142c) respectively, while $F'$ is an undetermined
multiplier. This equation can also be interpreted as expressing the fact that
$\delta F = 0$ subject to the condition that $E = \text{const.}$ ($= 1$, say). The
variations of $a_{C'}$ and $a^*_{C'}$ must be considered as independent of each
other and their coefficients in (143a) set equal to zero, which leads to
the transformation equations (143) and their conjugate complex (i.e. the
equations of the reciprocal transformation).
The ‘conditioned’ variational equations $\delta E = 0$ with $F = \text{const.}$, or
$\delta F = 0$ with $E = \text{const.}$, can be replaced by the ‘unconditioned’
variational equation
$$\delta(F/E) = 0, \qquad (143b)$$
which automatically provides for the normalization of the functions
$a_{C'}$ so far as the value of $F$ is concerned. If, indeed, the $a_{C'}$ are not
normalized, then the functions $a_{C'}/\sqrt{E}$ can be considered as their
normalized values and $F/E$ as the value of $F$ subject to the appropriate
normalization conditions $\sum_{C'} a^*_{C'} a_{C'} = 1$ [or $\int a^*_{C'} a_{C'}\,dC' = 1$].
It is obvious from the comparison of (143b) with (143) that the
stationary values of $F/E$ are just equal to the characteristic values $F'$,
a fact which can be ascertained directly with the help of the
transformation equations. Taking, for instance,
$F = \sum_{C'}\sum_{C''} F^0_{C'C''}\, a^*_{C'} a_{C''}$,
then, since $\sum_{C''} F^0_{C'C''}\, a_{C''} = F' a_{C'}$, we get
$F/E = F' \sum_{C'} a^*_{C'} a_{C'} / E = F'$.
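The statement that $F/E$ is stationary, and equal to $F'$, at a solution of (143) can be tried out on a two-dimensional example. This is a rough sketch only: the components are taken real for simplicity (the text allows complex ones), the matrix entries are arbitrary, and the names `quotient`, `checks` are introduced here for illustration.

```python
import math

# Illustrative real symmetric matrix of F (an assumption, not from the text).
F = [[2.0, 1.0], [1.0, 3.0]]

def quotient(a):
    """F/E for real components a: sum F_ij a_i a_j divided by sum a_i^2."""
    num = sum(a[i] * F[i][j] * a[j] for i in range(2) for j in range(2))
    den = sum(x * x for x in a)
    return num / den

mean = 0.5 * (F[0][0] + F[1][1])
gap = math.hypot(0.5 * (F[0][0] - F[1][1]), F[0][1])

checks = []
for f_prime in (mean - gap, mean + gap):
    # Unnormalized solution of (143), from its first row.
    a = [F[0][1], f_prime - F[0][0]]
    scaled = [5.0 * x for x in a]          # normalization does not matter
    # Small displacement perpendicular to a.
    eps = 1e-4
    shifted = [a[0] - eps * a[1], a[1] + eps * a[0]]
    checks.append((f_prime, quotient(a), quotient(scaled), quotient(shifted)))
```

For each characteristic value the quotient reproduces $F'$ whatever the normalization, and a displacement of relative size $10^{-4}$ changes it only by a quantity of order $10^{-8}$, as expected for a stationary value.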
The variational principle which we have just considered is a
generalization of the variational principle for the energy, which was considered
in the preceding chapter under the form $\delta H = 0$, with
$H = \int \psi^{0*} H \psi^0\, dV$ and $E = \int \psi^{0*}\psi^0\, dV = 1$. It reduces
to the preceding form if $\psi^0$ is replaced by the sum
$\sum_{C'} a_{C'}\psi^0_{C'}$, $\psi^0_{C'}$ being the characteristic functions
of the operator $C(x)$ which may be supposed to represent a Hamiltonian
slightly different from that represented by the operator $H_C$, or, more
exactly, $H(x)$.
This leads to a problem of the perturbation theory, which, from the
geometrical point of view outlined in this section, can be regarded as
the problem of finding the symmetry axes of the quadric surface $H$,
whose equation is referred to the symmetry axes of a slightly different
quadric $C$.
More generally we can say that from this geometrical point of view
the quantum mechanics can be regarded as the analytical geometry of
central quadric surfaces in the state-space; the symmetry axes of each
such surface specify, by their length, the characteristic values of the
physical quantity represented by this surface, and, by their direction,
the associated states; while the cosines between the symmetry axes of
two different surfaces represent the probability amplitudes for a certain
value of one quantity (or set of three quantities) when the other
quantity (or set of three quantities) is known to have a given value.
In conclusion a few remarks should be added on the question of
notations. Dirac and following him many other authors denote the
elements of a matrix $F_C$ by the symbol $(C'|F|C'')$ which is equivalent
to the symbol $F^0_{C'C''}$ used in this chapter, and which has the advantage
of being closely connected with the symbol $(F'|C')$ for the probability
amplitudes $a_{F'C'}$. Using Dirac’s notation, we can write the
transformation equations connecting the matrices $F_H$ and $F_K$ in the following
form:
$$(K'|F|K'') = \sum_{H'}\sum_{H''} (K'|H')\,(H'|F|H'')\,(H''|K''),$$
if the spectrum of $H$ is discrete, or
$$(K'|F|K'') = \iint (K'|H')\,dH'\,(H'|F|H'')\,dH''\,(H''|K''),$$
if it is continuous.
The index 0 in our notation serves to indicate that the time, which
is supposed not to appear in the equations of this chapter, is ignored.
We shall take it into account in a later section.
Another remark refers to a type of vector notation applied by Dirac
to vectors and tensors in the state-space and quite similar to that used
in the ordinary three-dimensional vector and tensor analysis.
A state, in the quantum-mechanical sense, is specified by a vector,
$\psi$, say, of unit length and of a definite direction in the state-space. The
components of this vector with respect to a system of coordinates $C$
may be denoted by $\psi_{C'}$. The same state can, however, be specified by
the conjugate complex of $\psi$, which is a vector $\psi^*$ with the components
$\psi^*_{C'}$. The sum $\sum_{C'} \psi^*_{C'}\psi_{C'}$ or the integral
$\int \psi^*_{C'}\psi_{C'}\,dC'$, which is the measure
of the square of the common length of the vectors $\psi^*$ and $\psi$, will be
denoted as their ‘scalar product’ $\psi^*\psi$. In a similar way the scalar
product of two different vectors $\psi_1$ and $\psi_2$ referring to two different states
will be denoted by $\psi_1^*\psi_2$ or $\psi_2^*\psi_1$, which means, in the coordinate
representation, $\sum_{C'} \psi^*_{1C'}\psi_{2C'}$ or $\sum_{C'} \psi^*_{2C'}\psi_{1C'}$
(the sums being again replaced by
integrals in the case of a continuous $C$-spectrum).
These expressions (which are conjugate complex with regard to each
other) can be regarded, from the physical point of view, as the
probability amplitudes for the simultaneous occurrence of the two states
(a measure of the ‘mutual compatibility’ of the latter). If these states
are alternative (mutually exclusive), the vectors $\psi_1$ and $\psi_2$ are mutually
orthogonal, which means that $\psi_2^*\psi_1 = \psi_1^*\psi_2 = 0$.
Further, let $F$ denote a tensor representing not a state, such as $\psi$,
but a certain physical quantity (an ‘observable’ or ‘dynamical variable’
according to Dirac), with the components $F_{C'C''}$ along the axes (= states)
of $C$ (we are dropping for convenience the superscript zero). The sum
$\sum_{C''} F_{C'C''}\psi_{C''}$ (or integral $\int F_{C'C''}\psi_{C''}\,dC''$) can be
considered as the $C'$-component of another vector, $\phi$, say, specifying some
state, in general different from $\psi$. This vector will be called the product
of the tensor $F$ and the vector $\psi$ and denoted by $F\psi$ [so that
$(F\psi)_{C'} = \sum_{C''} F_{C'C''}\psi_{C''}$].
The conjugate complex of $\phi$ can be defined in a similar way as the
product of $F$ and $\psi^*$ taken in the inverse order, i.e. by the formula
$\phi^* = \psi^* F$, which means, in the coordinate representation,
$$\phi^*_{C'} = (\psi^* F)_{C'} = \sum_{C''} \psi^*_{C''} F_{C''C'} \quad \Big(\text{or } \int \psi^*_{C''} F_{C''C'}\,dC''\Big).$$
This gives
$$\psi^*\phi = \sum_{C'} \psi^*_{C'}\phi_{C'} = \sum_{C'} \psi^*_{C'} (F\psi)_{C'} = \sum_{C'}\sum_{C''} \psi^*_{C'} F_{C'C''} \psi_{C''}$$
$$\Big(\text{or } \iint \psi^*_{C'} F_{C'C''} \psi_{C''}\,dC'\,dC''\Big),$$
which will be denoted simply as $\psi^* F \psi$.
We get further (taking for the sake of simplicity the case of a discrete
$C$-spectrum)
$$\phi^*\phi = \sum_{C'} \phi^*_{C'}\phi_{C'} = \sum_{C'}\sum_{C''}\sum_{C'''} \psi^*_{C''} F_{C''C'} F_{C'C'''} \psi_{C'''},$$
or, since $\sum_{C'} F_{C''C'} F_{C'C'''} = (F^2)_{C''C'''}$,
we get $\phi^*\phi = \psi^* F^2 \psi$.
The preceding formula is the simplest example of a ‘tensor product’.
The product of two tensors $F$ and $G$ taken in the order stated is defined
as a tensor with the components
$$(FG)_{C'C'''} = \sum_{C''} F_{C'C''} G_{C''C'''} \quad\text{or}\quad \int F_{C'C''} G_{C''C'''}\,dC''.$$
This definition of tensor multiplication is identical with the definition
of matrix multiplication if $F$ and $G$ are considered not as tensors but
as matrices.
The matrix representation can also be applied to vectors such as ip
if we generalize the conception of a matrix by admitting matrices which
consist not of a square array of numbers (elements, components) but
of a rectangular array (with a different number of rows and columns)
and, in particular, of a linear array with one row or one column only.
If we wish to preserve the general multiplication law, i.e. that the
product of two matrices shall be a matrix obtained by combining the
rows of the first factor with the columns of the second, we must
represent the vector $\psi$ and its conjugate complex $\psi^*$ by linear matrices of
different kinds, the one, considered as the first factor, consisting of one
row only and the second of one column only.
Taking the components of $\psi$ and $\psi^*$ along the $C$-axes as the elements
of the matrices $\psi_C$ and $\psi^*_C$, we shall put accordingly
$$\psi^*_C = (\psi^*_{C'},\ \psi^*_{C''},\ \psi^*_{C'''},\ \dots) \quad\text{and}\quad \psi_C = \begin{pmatrix} \psi_{C'} \\ \psi_{C''} \\ \psi_{C'''} \\ \vdots \end{pmatrix},$$
which means that in multiplying two vectors or a vector and a tensor
we must always start with the conjugate complex ($\psi^*$, $\phi^*$) and finish
with the original ones. From the matrix point of view we should write
$\psi^\dagger$ (adjoint matrix) instead of $\psi^*$, for the matrix $\psi^*$ defined above is
obtained from the matrix $\psi$ not only by taking the conjugate complex
of its elements, but also by an interchange of the rows and columns (cf.
§ 16). With this convention the scalar product of two vector-matrices
$\psi$ and $\phi$ can be written in the form $\psi^\dagger\phi$ or $\phi^\dagger\psi$, while the symbols
$\psi\phi^\dagger$ or $\phi\psi^\dagger$ have no meaning. Taking the components of $\psi^\dagger\phi$ in the
usual way, we get
$$(\psi^\dagger\phi)_{mn} = \sum_k (\psi^\dagger)_{mk}\,\phi_{kn},$$
which is equal to zero unless $m = 1$ (first row of $\psi^\dagger$) and $n = 1$ (first
column of $\phi$).
The product of a vector $\psi$ and a tensor $F$ must be represented
accordingly in either of the two forms $F\psi$ or $\psi^\dagger F$, the former being
a matrix of the same form as $\psi$ and the latter a matrix of the same
form as $\psi^\dagger$. The two matrices are, of course, adjoint with regard to
each other, so that we can write
$$(\psi^\dagger F)^\dagger = F\psi,$$
which is quite natural since $F^\dagger = F$ (so long as $F$ is a Hermitian
matrix).
It should be mentioned finally that the linear matrices with the
elements $\psi_{C'}$ can be replaced by ‘square’ matrices with the elements
$\psi_{C'Q'}$ representing a set of vectors, which correspond to different
values of $Q'$, or, in other words, the cosines between the directions $Q'$
and $C'$. Such matrices are not Hermitian but unitary, i.e. satisfy the
relation $\psi^\dagger = \psi^{-1}$ ($\psi^\dagger_{Q'C'} = \psi^*_{C'Q'}$). The preceding formulae, relating to
the products of the type $\psi^\dagger\phi$ or $F\psi$, etc., remain valid with this
interpretation of the $\psi$, i.e. not as vectors specifying states, but as cosines
between two sets of axes specifying two sets of states and measuring
the probability amplitudes of their coexistence. The transformation
equations $\phi^0_{K'} = \sum_{H'} \psi^0_{H'} a_{H'K'}$ can be written accordingly in the form
$\phi_{x'K'} = \sum_{H'} \psi_{x'H'} a_{H'K'}$ or $\phi = \psi a$ (the order of the factors on the right
side being opposite to that which corresponds to the product of $\psi$
considered as a vector with a matrix representing a tensor).
PERTURBATION THEORY
19. Perturbation Theory not involving the Time (Method of
Stationary States)
The exact determination of the wave functions $\psi^0_{H'} = (x'|H')$ which
specify the motion of a particle in a complicated field of force is usually
impossible on account of analytical difficulties. But even if these
difficulties could be overcome, it would hardly be possible to use the results,
and especially to visualize them, on account of their complicated
character. Thus both for mathematical and physical reasons it is
desirable, in the case of a complicated field of force, to use an
approximative method of determining the functions $\psi^0$, starting with an exact
determination of the latter for the motion in a simplified field of force,
and introducing corrections to represent the effect of the ‘perturbing
forces’, i.e. those forces which have been left out of account at the
beginning.
The energy operator corresponding to the ‘unperturbed’, i.e.
simplified, motion will be denoted by $H$ ($= H(x)$) and its characteristic
functions by $\psi^0_{H'}$ ($= \psi_{x'H'}$). The energy operator corresponding to the actual
or ‘perturbed’ motion will be denoted by $K$ ($= K(x)$) and its
characteristic functions by $\phi^0_{K'}$ ($= \phi_{x'K'}$).
The difference $K - H = S$ will thus represent the additional or
‘perturbation’ energy; it is usually defined as the potential energy of the
perturbing forces.
This perturbation energy must, of course, be regarded as ‘small’.
The exact meaning of this condition will become apparent as we develop
the problem by the method of the perturbation theory.
As already mentioned, the perturbation theory (so far as $H$ and $K$
do not involve the time) amounts to a transformation of all physical
quantities, considered as matrices, from the point of view of $H$ to the
point of view of $K$, which is supposed to be but slightly different from
$H$, so that the actual calculations can be carried out by means of the
method of successive approximations.
The principle of this method consists in regarding all quantities
involving $S$, for instance the matrix elements $S^0_{H'H''}$, as small quantities
of the first order and splitting up the exact equations into a chain of
approximate equations containing small quantities of the same order.
We shall first assume that $H$ has a discrete spectrum and that the
unperturbed motion is not degenerate, the characteristic values of $H$
being thus sufficient for the complete specification of the corresponding
states.
The fundamental part of our problem will consist in the transformation
of the matrix $K_H$ to the diagonal form $K_K$ and in the determination
of the transformation matrix $a$, according to the general equation
$$K_K = a^\dagger K_H\, a \quad (a^\dagger = a^{-1}), \qquad (144)$$
or
$$K_H\, a = a\, K_K, \qquad (144a)$$
that is [cf. (123), § 16],
$$\sum_{H''} K^0_{H'H''}\, a_{H''K'''} = K'''\, a_{H'K'''}. \qquad (144b)$$
We must, first of all, fix the ‘zero approximation’ which corresponds
to $S = 0$, i.e. to the actual coincidence of $K$ and $H$. Assuming the
identical states to be labelled by the letters $K$ or $H$ with the same
number of dashes ($K' = H'$, $K'' = H''$, $K''' = H'''$, etc.), we can put, in
this case,
$$a = \delta, \quad\text{that is,}\quad a_{H''K'''} = \delta_{H''K'''}, \qquad (145)$$
where $\delta$ is the mixed unit matrix with the diagonal elements
$\delta_{H'K'} = \delta_{H''K''} = \dots = 1$ (all the others being equal to zero).
Equations (144b) reduce, in this case, to
$$\sum_{H''} K^0_{H'H''}\, \delta_{H''K'''} = K'''\, \delta_{H'K'''},$$
that is, to
$$K^0_{H'H'''} = K'''\, \delta_{H'K'''}, \qquad (145a)$$
which is the same thing as $K' = H'$, since $K^0_{H'H'} = H^0_{H'H'} = H'$.
We shall now consider the actual case in which $S \neq 0$, assuming that
there still exists in this case a one-to-one correspondence between the
unperturbed states $H'$, $H''$, $H'''$,... and the perturbed states $K'$, $K''$, $K'''$,...
in the sense that the states labelled by the letter $K$ or $H$ with the
same number of dashes coincide with each other when the perturbation
energy $S$ tends to zero.
We shall put accordingly
$$K_{K'K'} = K' = H' + \Delta H', \qquad (146)$$
where $\Delta H'$ denotes the change of the energy-levels due to the
perturbation, and
$$a = \delta + \Delta a, \quad\text{i.e.}\quad a_{H'K''} = \delta_{H'K''} + \Delta a_{H'K''}, \qquad (146a)$$
the corrections $\Delta a_{H'K''}$ being assumed to be small (compared with 1).
We have further
$$K_H = H_H + S_H, \quad\text{i.e.}\quad K^0_{H'H''} = H'\,\delta_{H'H''} + S^0_{H'H''}. \qquad (146b)$$
Substituting these expressions in equations (144b) and taking into
account that $K''' = H''' + \Delta H'''$, we get
$$\sum_{H''} (H'\,\delta_{H'H''} + S^0_{H'H''})(\delta_{H''K'''} + \Delta a_{H''K'''}) = (H''' + \Delta H''')(\delta_{H'K'''} + \Delta a_{H'K'''}).$$
Since $\delta_{H''K'''} = 0$ unless $H'' = H'''$, when it is equal to 1, and
$(H''' - H')\,\delta_{H'K'''} = 0$ both when $K''' = K'$ (because then $H''' = H'$) and
when $K''' \neq K'$, we get
$$S^0_{H'H'''} + \sum_{H''} S^0_{H'H''}\, \Delta a_{H''K'''} = \Delta H'''\,(\delta_{H'K'''} + \Delta a_{H'K'''}) + (H''' - H')\,\Delta a_{H'K'''}. \qquad (147)$$
These equations can be solved by successive approximations, if we
assume that the quantities $S^0_{H'H''}$ (i.e. the matrix elements of the
perturbation energy ‘from the point of view’ of the unperturbed energy)
are small quantities of the same (first) order of magnitude and expand
$\Delta H'$ and $\Delta a$ in series of the form
$$\Delta H' = \Delta_1 H' + \Delta_2 H' + \dots, \qquad \Delta a = \Delta_1 a + \Delta_2 a + \dots, \qquad (147a)$$
where $\Delta_n H'$ and $\Delta_n a$ are corrections of the $n$th order (that is, of the
same order of magnitude as the $n$th power of the elements of $S_H$).
Substituting (147a) in (147) and dropping terms of the second and
higher orders of magnitude, we obtain as a first approximation the
equations
$$S^0_{H'H'''} = \Delta_1 H'''\, \delta_{H'K'''} + (H''' - H')\,\Delta_1 a_{H'K'''}. \qquad (148)$$
Putting $K''' = K'$ (and consequently $H''' = H'$), we get
$$\Delta_1 H' = S^0_{H'H'}. \qquad (148a)$$
This formula determines, to the first approximation, the change of the
energy-levels produced by the perturbation.
If $K'''$ is different from $K'$ (and consequently $H'''$ is different from
$H'$), equation (148) reduces to
$$S^0_{H'H'''} = (H''' - H')\,\Delta_1 a_{H'K'''},$$
that is,
$$\Delta_1 a_{H'K'''} = \frac{S^0_{H'H'''}}{H''' - H'}, \qquad (148b)$$
giving the first-order expressions for the transformation coefficients
$a_{H'K'''}$.
If we preserve in (147) terms of the second order, dropping terms of
the third and higher orders, and take account of the first-order equations
(148), we get the second-order equations:
$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'''} = \Delta_2 H'''\, \delta_{H'K'''} + \Delta_1 H'''\, \Delta_1 a_{H'K'''} + (H''' - H')\,\Delta_2 a_{H'K'''}. \qquad (149)$$
It should be remarked that these equations, as well as the equations
of the succeeding orders, can be obtained from (147) by substituting
the expressions (147a) and dropping all terms with the exception of
those of the order in question.
Putting $K''' = K'$ (and $H''' = H'$) in (149), we get
$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'} = \Delta_2 H' + \Delta_1 H'\, \Delta_1 a_{H'K'},$$
or, on account of the relation (148a),
$$\Delta_2 H' = \sum_{H'' \neq H'} S^0_{H'H''}\, \Delta_1 a_{H''K'}.$$
Substituting the expressions (148b) with $K'''$ replaced by $K'$ and $H'$ by
$H''$, we get
$$\Delta_2 H' = -\sum_{H'' \neq H'} \frac{S^0_{H'H''} S^0_{H''H'}}{H'' - H'} = \sum_{H'' \neq H'} \frac{|S^0_{H'H''}|^2}{H' - H''}. \qquad (149a)$$
With $K'''$ different from $K'$, equation (149) reduces to
$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'''} = \Delta_1 H'''\, \Delta_1 a_{H'K'''} + (H''' - H')\,\Delta_2 a_{H'K'''},$$
giving, with the help of (148a) and (148b), the following expression for
the second-order correction in the coefficients $a$:
$$\Delta_2 a_{H'K'''} = \sum_{H'' \neq H',\, H'''} \frac{S^0_{H'H''} S^0_{H''H'''}}{(H''' - H')(H''' - H'')} + \frac{(S^0_{H'H'} - S^0_{H'''H'''})\, S^0_{H'H'''}}{(H' - H''')^2}. \qquad (149b)$$
In carrying out the summation over $H''$ we must drop the term $H'' = H'''$
(as well as $H'' = H'$) because the formula
$$\Delta_1 a_{H''K'''} = \frac{S^0_{H''H'''}}{H''' - H''}$$
holds for the case $H'' \neq H'''$ only, while for $H'' = H'''$ we have
$$\Delta_1 a_{H'''K'''} = 0. \qquad (150)$$
This equation can be obtained from the normalization condition which
must be satisfied by the matrix $a$, namely,
$$\sum_{H'} a^*_{H'K''}\, a_{H'K''} = 1.$$
Putting $a_{H'K''} = \delta_{H'K''} + \Delta a_{H'K''}$, then since $\delta_{H'K''} = 1$ when $H' = H''$
and 0 when $H' \neq H''$, we get
$$\Delta a_{H''K''} + \Delta a^*_{H''K''} + \sum_{H'} |\Delta a_{H'K''}|^2 = 0, \qquad (150a)$$
whence it follows that
$$\Delta_1 a_{H''K''} + \Delta_1 a^*_{H''K''} = 0.$$
Since the diagonal elements of the matrix $a$ must be real,
we have $\Delta_1 a_{H''K''} = 0$.
The formula (149) likewise leaves undetermined the diagonal elements
of $\Delta_2 a$. They can be determined, however, with the help of the
equation (150a) or rather the equation
$$2\,\Delta_2 a_{H'K'} + \sum_{H''} |\Delta_1 a_{H''K'}|^2 = 0,$$
which is obtained from it as a second approximation (dropping all
terms save those of the second order) and which, in conjunction with
(148b), gives
$$\Delta_2 a_{H'K'} = -\frac{1}{2} \sum_{H'' \neq H'} \frac{|S^0_{H'H''}|^2}{(H' - H'')^2}. \qquad (150b)$$
The formula (150) follows in a quite obvious manner from the
geometrical interpretation of the coefficients $a_{H'K''}$ as the cosines of the
angles between the symmetry axes of the quadric surfaces representing
(in the state-space) the energy $H$ and the energy $K$. Since, by
definition, $H$ and $K$ must differ very little from each other, the corresponding
axes $H'$ and $K'$ (or $H''$ and $K''$, etc.) must have approximately the same
direction, while the non-corresponding axes ($H'$ and $K''$) must be nearly
perpendicular to each other. Denoting the angle between $H'$ and $K'$
by $\alpha_{H'K'}$ and considering it as a small quantity of the first order, we get
$$a_{H'K'} = \cos\alpha_{H'K'} = 1 - \tfrac{1}{2}(\alpha_{H'K'})^2 + \dots,$$
which means that the first-order correction $\Delta_1 a_{H'K'}$ vanishes, while
$\Delta_2 a_{H'K'} = -\tfrac{1}{2}(\alpha_{H'K'})^2$. Comparing this with (150b), we can put
$$(\alpha_{H'K'})^2 = \sum_{H'' \neq H'} \frac{|S^0_{H'H''}|^2}{(H' - H'')^2}. \qquad (151)$$
This formula shows that the angles between the corresponding
symmetry axes of $H$ and $K$ are of the same order of magnitude as the
ratios of the matrix elements of the perturbation energy $S$ with respect
to different $H$-states to the difference between the characteristic values
of $H$ for these states.
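The formulae (148a), (149a) and (151) can be compared with an exact solution in the smallest possible case of two states. What follows is a minimal sketch with invented numbers: the levels `H`, the real symmetric matrix `S` and all variable names are assumptions made for the illustration, and the exact characteristic values come from the closed-form solution of the 2×2 determinant equation.

```python
import math

# Unperturbed levels H', H'' and a small real symmetric perturbation S
# (illustrative values, not taken from the text).
H = [1.0, 3.0]
S = [[0.02, 0.05], [0.05, -0.01]]

# Matrix of K = H + S from the point of view of H, cf. (146b).
K = [[(H[i] if i == j else 0.0) + S[i][j] for j in range(2)] for i in range(2)]

# Exact characteristic values of the 2x2 matrix K.
mean = 0.5 * (K[0][0] + K[1][1])
gap = math.hypot(0.5 * (K[0][0] - K[1][1]), K[0][1])
exact = [mean - gap, mean + gap]

# First order, (148a): Delta_1 H' = S_{H'H'}.
first = [H[n] + S[n][n] for n in range(2)]

# Second order, (149a): Delta_2 H' = sum over H'' != H' of |S|^2 / (H' - H'').
second = [
    first[n] + sum(S[n][m] ** 2 / (H[n] - H[m]) for m in range(2) if m != n)
    for n in range(2)
]

# Angle between the axes H' and K': exact value from the eigenvector of K,
# and the estimate (151).
v = (K[0][1], exact[0] - K[0][0])          # eigenvector for the lower level
exact_angle = abs(math.atan2(v[1], v[0]))
predicted_angle = math.sqrt(
    sum(S[0][m] ** 2 / (H[0] - H[m]) ** 2 for m in range(2) if m != 0)
)

err_first = [abs(first[n] - exact[n]) for n in range(2)]
err_second = [abs(second[n] - exact[n]) for n in range(2)]
```

With these numbers the first-order levels are off by about $10^{-3}$ and the second-order ones by about $2 \times 10^{-5}$, while the angle estimated from (151) is $0.025$ against an exact value close to $0.0254$.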
The same result, in a still simpler form, is obtained from a
consideration of the first approximation values of the coefficients
$a_{H'K''} = \Delta_1 a_{H'K''}$ ($K'' \neq K'$). Putting $a_{H'K''} = \cos\alpha_{H'K''}$ and
$\alpha_{H'K''} = \tfrac{1}{2}\pi + \Delta\alpha_{H'K''}$, where
$\Delta\alpha_{H'K''}$ denotes a small angle, we get
$$a_{H'K''} = -\sin\Delta\alpha_{H'K''} = -\Delta_1\alpha_{H'K''},$$
whence, according to (148b),
$$\Delta_1\alpha_{H'K''} = \frac{S^0_{H'H''}}{H' - H''}. \qquad (151a)$$
This angle should not be confused with the angle through which the
axis $H''$ has to be rotated in order to coincide with $K''$ and which is
equal to $\alpha_{H''K''} = \Delta\alpha_{H''K''}$. The comparison of equations (151) and
(151a) shows that the latter angle can be regarded as the (geometrical)
sum of mutually perpendicular angular displacements of the type
$\Delta\alpha_{H'K''}$ for different values of $H'$ ($\neq H''$). In other words, the angular
displacement $\Delta\alpha_{H'K''}$ can be considered as the component along the
$H'$-axis of the elementary rotation $\alpha_{H''K''}$. We thus obtain the law of
the vector composition of elementary rotations about different (mutually
perpendicular) axes, which is a generalization of the corresponding law
for ordinary three-dimensional space.
In the latter case, an infinitesimal rotation of the coordinate system
can be specified by a certain vector $\omega$, which determines the (apparent)
change of a fixed vector $r$ by means of the formula $\Delta r = -\omega \times r$. So
far as the first approximation is concerned, the components of $\omega$ and $\Delta r$
along the old and new axes can be identified with each other. Written in
components along the old axes, the preceding formula gives the
following equations:
$$\Delta x_1 = \xi_1 - x_1 = -\omega_2 x_3 + \omega_3 x_2,$$
$$\Delta x_2 = \xi_2 - x_2 = -\omega_3 x_1 + \omega_1 x_3,$$
$$\Delta x_3 = \xi_3 - x_3 = -\omega_1 x_2 + \omega_2 x_1,$$
which can be considered as a particular or rather as a limiting case of
an orthogonal transformation for the case when the two systems ($S$ and
$\Sigma$) differ very little from each other. Putting
$$\omega_1 = \alpha_{23} = -\alpha_{32}, \quad \omega_2 = \alpha_{31} = -\alpha_{13}, \quad \omega_3 = \alpha_{12} = -\alpha_{21},$$
we can rewrite the preceding equations in the form
$$\Delta x_{n'} = -\sum_{n''} \alpha_{n''n'}\, x_{n''}. \qquad (152)$$
Comparing equations (152) with the exact transformation equations
$$\xi_{\nu'} = \sum_{n''} a_{n''\nu'}\, x_{n''},$$
we see that they can be obtained from the latter if we put
$$a_{n''\nu'} = \delta_{n''n'} - \alpha_{n''n'},$$
where $\nu'$ and $n'$ denote corresponding axes of the new and old system,
i.e. such axes as were initially coincident. The angles $\alpha_{n'\nu'} = \alpha_{n'n'}$
must approximately vanish for the normalizing and orthogonality
relations to be satisfied.
We thus see that an infinitesimal orthogonal transformation in
ordinary space can be treated as an infinitesimal rotation of the original
coordinate systems, specified both with regard to the direction of the
rotation axis and the angle of rotation about it by the (infinitely small)
vector $\omega$ with the components $\omega_1$, $\omega_2$, $\omega_3$, or by the ‘antisymmetrical
tensor’ $\alpha$ with the components $\alpha_{n'n''} = -\alpha_{n''n'}$, referred to the original
axes.
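The correspondence between the vector $\omega$ and the antisymmetrical tensor $\alpha$ can be verified directly in components. The sketch below assumes arbitrary small numbers for $\omega$ and an arbitrary fixed vector $r$; indices 0, 1, 2 in the code stand for the axes 1, 2, 3 of the text, and the variable names are introduced here for illustration.

```python
def cross(u, v):
    """Ordinary vector product u x v in three dimensions."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]

omega = [1e-3, -2e-3, 3e-3]   # small rotation vector (illustrative)
r = [1.0, 2.0, 3.0]           # fixed vector (illustrative)

# omega_1 = alpha_23 = -alpha_32, omega_2 = alpha_31 = -alpha_13,
# omega_3 = alpha_12 = -alpha_21.
alpha = [[0.0] * 3 for _ in range(3)]
alpha[1][2], alpha[2][1] = omega[0], -omega[0]
alpha[2][0], alpha[0][2] = omega[1], -omega[1]
alpha[0][1], alpha[1][0] = omega[2], -omega[2]

# Apparent change of r from the vector formula: Delta r = -omega x r.
delta_from_vector = [-c for c in cross(omega, r)]

# The same change from (152): Delta x_n' = - sum over n'' of alpha_{n''n'} x_n''.
delta_from_tensor = [
    -sum(alpha[n2][n1] * r[n2] for n2 in range(3)) for n1 in range(3)
]

# a = delta - alpha is orthogonal up to terms of the second order in omega.
a = [[(1.0 if i == j else 0.0) - alpha[i][j] for j in range(3)] for i in range(3)]
orth_defect = max(
    abs(sum(a[k][i] * a[k][j] for k in range(3)) - (1.0 if i == j else 0.0))
    for i in range(3) for j in range(3)
)
```

The two expressions for the change of $r$ coincide component by component, and the departure of $a = \delta - \alpha$ from exact orthogonality is of the order of $\omega^2$.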
These results can easily be extended to the infinitesimal orthogonal
transformations in the ‘state-space’, corresponding to a transition from
the symmetry axes of the quadric surface representing the unperturbed
energy $H$, to the symmetry axes of the quadric representing the
perturbed energy $K = H + S$.
Leaving the perturbation energy $S$ unspecified, we can represent
the (apparent) change of the components of any vector $\psi$ due to the
small rotation of the coordinate axes by an equation wholly similar to
(152), namely,
$$\Delta\psi_{H'} = -\sum_{H''} \alpha_{H''H'}\, \psi_{H''}, \qquad (152a)$$
where $\alpha$ denotes an ‘anti-Hermitian’ tensor (which is a generalization
of the antisymmetric one) satisfying the condition
$$\alpha^*_{H'H''} = -\alpha_{H''H'}, \qquad (152b)$$
or $\alpha^\dagger = -\alpha$.
These results can be obtained in the same way as in the
three-dimensional case from the exact transformation equation,
$$\phi_{K'} = \sum_{H''} \psi_{H''}\, a_{H''K'},$$
by putting $\phi_{K'} = \psi_{H'} + \Delta\psi_{H'}$ and $a_{H''K'} = \delta_{H''H'} - \alpha_{H''H'}$, where the $\alpha$
denote small quantities of the first order. Substituting the latter
expressions in the orthogonality and normalizing conditions,
$$\sum_{K'} a^*_{H''K'}\, a_{H'''K'} = \delta_{H''H'''},$$
and neglecting second-order terms, we get, if the summation index $K'$
is replaced by $H'$,
$$\delta_{H''H'''} - \alpha^*_{H''H'''} - \alpha_{H'''H''} = \delta_{H''H'''},$$
that is, $\alpha^*_{H''H'''} + \alpha_{H'''H''} = 0$,
which is equivalent to (152b).
As a matter of fact, from (148b) and because $S^{0*}_{H'H''} = S^0_{H''H'}$,
we have
$$\alpha_{H'H''} = \frac{S^0_{H'H''}}{H' - H''}, \qquad (152c)$$
so that the condition (152b) is actually satisfied.
It should be mentioned that in the case of a generalized space with
more than three dimensions an antisymmetrical tensor is no longer
equivalent to a vector.† It is therefore impossible to represent the
rotation of the quadric surface $H$ into such a position that its axes
coincide (in direction but not in length!) with those of the quadric $K$
by means of a vector corresponding to $\omega$, or to specify the rotation
by its components along the different axes of $H$. Instead of using
the coordinate axes, we can, however, use for the same purpose the
coordinate planes (in the case of ordinary space the number of these
planes is equal to the number of axes, which explains the possibility
of representing the former by the latter). The quantities $\alpha_{H''H'}$ can be
interpreted as the projections of the rotation $H \to K$ on the planes
($H''$, $H'$). The angle through which $H''$ must be rotated to coincide with
$K''$ is given by the equation
$$(\alpha_{H''K''})^2 = \sum_{H'} |\alpha_{H''H'}|^2,$$
which is similar to the ordinary equation for the composition of
elementary rotations considered as vectors (for instance, $\omega^2 = \omega_1^2 + \omega_2^2 + \omega_3^2$)
because in the preceding equation one of the axes ($H''$) remains fixed
and the summation over the different planes passing through it is
equivalent to a summation over all the axes different from $H''$.
The expressions (152c) for the elementary rotations, as well as
the corresponding (first-order) corrections for the energy values
$\Delta H' = K' - H'$, can be obtained in a somewhat simpler way than before
by starting from the expressions (152a) and using the equations
$$H\psi_{H'} = H'\psi_{H'} \quad\text{and}\quad K\phi_{K'} = K'\phi_{K'}.$$
Putting in the latter equation $\phi_{K'} = \psi_{H'} + \Delta\psi_{H'}$, $K' = H' + \Delta H'$ and
$K = H + S$, we have
$$H\psi_{H'} + S\psi_{H'} + H\Delta\psi_{H'} + S\Delta\psi_{H'} = H'\psi_{H'} + \Delta H'\,\psi_{H'} + H'\Delta\psi_{H'} + \Delta H'\,\Delta\psi_{H'},$$
or, dropping terms of the second order of smallness (i.e. the products
$S\Delta\psi_{H'}$ and $\Delta H'\,\Delta\psi_{H'}$):
$$S\psi_{H'} + H\Delta\psi_{H'} = \Delta H'\,\psi_{H'} + H'\Delta\psi_{H'}. \qquad (153)$$
Now by the definition of matrix elements we have
$$S\psi_{H'} = \sum_{H''} S^0_{H''H'}\, \psi_{H''}.$$
On the other hand we get, according to (152a),
$$H\Delta\psi_{H'} = -\sum_{H''} \alpha_{H''H'}\, H\psi_{H''} = -\sum_{H''} H''\,\alpha_{H''H'}\, \psi_{H''}.$$
Thus (153) can be written in the form
$$\sum_{H''} (S^0_{H''H'} - H''\,\alpha_{H''H'})\,\psi_{H''} = \Delta H'\,\psi_{H'} - H' \sum_{H''} \alpha_{H''H'}\,\psi_{H''},$$
or
$$\sum_{H''} \{S^0_{H''H'} - (H'' - H')\,\alpha_{H''H'}\}\,\psi_{H''} = \Delta H'\,\psi_{H'}. \qquad (153a)$$
Equating the coefficients of $\psi_{H''}$ on both sides, we get
$$S^0_{H'H'} = \Delta H', \qquad S^0_{H''H'} = (H'' - H')\,\alpha_{H''H'}, \qquad (153b)$$
in agreement with the results previously found.
† If $n$ is the number of dimensions, then the number of different non-vanishing
components of an antisymmetric tensor is equal to $\tfrac{1}{2}n(n-1)$, which is equal to the
number ($n$) of components of a vector only when $n = 3$.
The fact that equation (153a) splits up into equations (153b) for
the coefficients of the separate $\psi_{H''}$ is due, as already pointed out
(Part I, § 18), to the mutual orthogonality of the functions $\psi_{H''}$ (as
functions of the coordinates $x$, $y$, $z$). If we have an equation of the type
$\sum_{H'} a_{H'}\psi_{H'} = \sum_{H'} b_{H'}\psi_{H'}$ which holds identically (i.e. for all values of
$x$, $y$, $z$), then multiplying it by $\psi^*_{H''}$ and integrating over $x$, $y$, $z$, we get
$a_{H''} = b_{H''}$, all the other terms vanishing.
We have assumed, hitherto, that the unperturbed problem was
‘non-degenerate’, i.e. that all the characteristic values of $H$ were different.
The essential character of this assumption is clearly seen from the fact
that the equations $\alpha_{H''H'} = \dfrac{S^0_{H''H'}}{H'' - H'}$ become meaningless (unless
$S^0_{H''H'}$ vanishes) when $H'' = H'$, while the two states $\psi_{H''}$ and $\psi_{H'}$ remain
different. It is, moreover, impossible to specify the different states, as
has been done so far, by the value of the energy alone. We shall
therefore add to it some other quantity $C$, which commutes with it (i.e.
represents a constant of the motion) and which can be supposed to have
different values for different states which have the same energy.
The alterations in the treatment of a perturbation problem which
are necessitated by the presence of degeneracy in the unperturbed
problem can best be understood with the help of the geometrical
interpretation. If the energy H is represented as a quadric surface in the
state-space, with symmetry axes whose lengths are inversely proportional
to the corresponding characteristic values of H, then degeneracy means
that a few of these axes have the same length, the corresponding section
of the surface, comprising all the equal axes, being ‘circle-like’. A
degeneracy of this sort is met with in ordinary analytical geometry in the
case of an ellipsoid with two or three equal axes, the ellipsoid degenerating
into a spheroid or into a sphere.
So long as the surface is not degenerate, the directions of its symmetry
axes are perfectly definite. Degeneracy involves an arbitrariness in the
choice of the symmetry axes within the ‘circle-like’ section, any orthogonal
system of axes being appropriate. It may be mentioned that this
corresponds to the physical indeterminateness of the corresponding
states and to the necessity of specifying them with the help of some
other quantity, C say, which can also be imagined to be represented
by a certain quadric surface. The commutability of H and C means,
as we know, that the symmetry axes of the corresponding surfaces have
the same directions; if one of them has a ‘circular’ section its axes
within this section can be identified with those of the other.
Let us assume that the surface representing the energy K of the
perturbed motion is non-degenerate. We shall then find two types of
relations between its symmetry axes and those of H. So long as the
latter are intrinsically determined (i.e. apart from the circular sections)
the axes K' must differ but very little from the corresponding axes
H', as has been supposed hitherto. So far, however, as a set of equal
H-axes is concerned, a set contained within a circular section and fixed
more or less arbitrarily, the angles between them and the set of K-axes
corresponding to this section need not be small. The process of successive
approximations, which was based on the assumption that all the angles
$\alpha_{H'K'}$ were small, must therefore, in general, lead to wrong results.
That it does lead to wrong results is clear from the formula (152c)
which gives an infinitely large value for $\alpha_{H'H''}$ if the difference $H''-H'$
(for two different states) vanishes, unless $S^0_{H'H''}$ also vanishes.
It is thus clear that before starting on the process of successive
approximations based upon the assumption of the smallness of the
angles, one must make them actually small by transforming the sets
of axes which refer to ‘circular’ sections in such a way that they
approximately coincide with the corresponding set of K-axes. This
‘preliminary’ or zero-order transformation can be carried out for each
circular section independently, i.e. by dropping from the general equation
of the K-quadric, or rather from the equations of the $K_H \to K_K$
transformation, all the terms which connect different circular sections
with each other (or with individual axes, if any). In fact the transformation
coefficients $a_{H'K'''}$ and $a_{H''K'''}$, where H' and H'' refer to one
circular section and K''' to another ‘nearly’ circular section, must be
very small of the first order (the two sections being ‘nearly’ perpendicular
to each other) and can therefore be neglected compared with the
coefficients $a_{H'K''}$ or $a_{H''K''}$, where K'' refers to the nearly circular section
of K which approximately coincides with the circular section of H
containing the axes H' and H''.
It will be convenient to alter our previous notation and to denote
the r' axes of a circular section corresponding to the value H = H' by
$C'_1, C'_2, \ldots, C'_{r'}$. The r' axes of the corresponding nearly circular section
of K will be denoted accordingly by $K'_1, K'_2, \ldots, K'_{r'}$. There is, in general,
no one-to-one correspondence between these r' K'-axes and the r'
C'-axes. They form two different orthogonal systems and the preliminary
transformation which we are looking for is precisely the
transformation $C' \to K'$ carried out for each circular section separately.
The exact equations of the transformation H → K are thus split up
into a set of ‘zero-order’ equations of the following form:
$$\sum_{n=1}^{r'} K^0_{C'_m C'_n}\, a_{C'_n K'} = K'\, a_{C'_m K'}, \qquad (154)$$
where m = 1, 2, 3, ..., r'.
For each of the ‘multiple’ values of H corresponding to r' different
states, we thus get a system of r' linear homogeneous equations involving
states of this set only. These equations are quite similar to the general
transformation equations for the case of no degeneracy,
$$\sum_{H''} K^0_{H'H''}\, a_{H''K'} = K'\, a_{H'K'},$$
differing from them solely by the fact that they refer to a finite number
of states, a fact which makes it possible to solve them exactly without
the use of the method of successive approximation (whose application
has to be postponed).
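Putting K = H + S and K' = H' + ΔH' in (154), then since
$H^0_{C'_m C'_n} = H'\,\delta_{mn}$, we get
$$\sum_{n=1}^{r'} S^0_{C'_m C'_n}\, a_{C'_n K'} = \Delta H'\, a_{C'_m K'}. \qquad (154\,\mathrm{a})$$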
For the sake of simplicity, we shall rewrite this equation, or rather
the set of r' equations, in the form
$$\sum_{n=1}^{r'} S^0_{mn}\, a_n = \Delta H'\, a_m, \qquad (154\,\mathrm{b})$$
where m is an abbreviation for $C'_m$ and the index K' is dropped. Their
compatibility condition
$$\begin{vmatrix} S^0_{11}-\Delta H' & S^0_{12} & \cdots & S^0_{1r'} \\ S^0_{21} & S^0_{22}-\Delta H' & \cdots & S^0_{2r'} \\ \vdots & \vdots & & \vdots \\ S^0_{r'1} & S^0_{r'2} & \cdots & S^0_{r'r'}-\Delta H' \end{vmatrix} = 0 \qquad (154\,\mathrm{c})$$
gives r' values for the ‘additional’ energy ΔH', which are, in general,
different from each other. This is expressed by saying that the perturbation
splits up each multiple energy-level H' into a number (r') of
different sub-levels $K' = H'+\Delta H'_1,\; H'+\Delta H'_2,\; \ldots,\; H'+\Delta H'_{r'}$.
To each value of ΔH', $\Delta H'_s$ say, there corresponds a set of values
of the r' coefficients $a_n$.
As in the general case, each of these sets must be normalized to 1, the
different sets being orthogonal to each other. We thus get for each
r'-fold value of the unperturbed energy H' a unitary transformation
matrix a of order r', which serves to transform the original r' functions
$\psi_{C'_1}, \ldots, \psi_{C'_{r'}}$ associated with the energy-level H' into new functions
$\psi_{K'_1}, \ldots, \psi_{K'_{r'}}$ associated with the different sub-levels into which
these levels are split up. Using the one-row matrix notation for the
two sets of functions, we can write the relation between them in the
form $\psi' = \psi a$ or $\psi'_n = \sum_m \psi_m a_{mn}$.
The preceding results are identical with those obtained in Chap. II, § 9,
by means of the variational method.
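
As an illustration of the compatibility condition (154 c), the following minimal sketch treats the simplest case r' = 2 with a real symmetric perturbation matrix. The function name `split_twofold_level` and the restriction to real matrix elements are assumptions made here for brevity; they are not part of the text.

```python
import math

def split_twofold_level(s11, s12, s22):
    """Solve det(S - dH*I) = 0 for a real symmetric 2x2 matrix S.

    Returns the two shifts dH (ascending) and, for each shift, a
    normalized coefficient pair (a1, a2) satisfying S a = dH a.
    """
    mean = 0.5 * (s11 + s22)
    half_gap = math.hypot(0.5 * (s11 - s22), s12)
    shifts = (mean - half_gap, mean + half_gap)
    vectors = []
    for dh in shifts:
        # Row 1 of (S - dH) a = 0 is solved by a ~ (s12, dH - s11),
        # row 2 by a ~ (dH - s22, s12); take the numerically larger one.
        v1 = (s12, dh - s11)
        v2 = (dh - s22, s12)
        v = v1 if math.hypot(*v1) >= math.hypot(*v2) else v2
        norm = math.hypot(*v)
        if norm == 0.0:
            # S is a multiple of the unit matrix: the section stays
            # exactly circular and any orthogonal pair of axes will do.
            v, norm = ((1.0, 0.0) if not vectors else (0.0, 1.0)), 1.0
        vectors.append((v[0] / norm, v[1] / norm))
    return shifts, vectors

if __name__ == "__main__":
    shifts, vectors = split_twofold_level(0.0, 1.0, 0.0)
    print(shifts, vectors)
```

The two coefficient pairs returned are normalized and mutually orthogonal, so that taken as columns they form the unitary matrix a of order 2 referred to above.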
It should be understood that the functions $\psi'$ do not represent a set
of K-states, but another degenerate set of H-states which only approximate
to the corresponding K-states. Starting with these functions, it
is possible, in the usual way, to obtain higher approximations. It is
important to note that the first approximation values for the energy
are determined, according to (154 c), in conjunction with the ‘zero
approximation’ for the characteristic functions.
It can easily be shown that the H-states specified by the new functions
$\psi'$ are such that the matrix of the perturbation energy S with
respect to them is diagonal. This follows from the fact that equations
(154 a) are of the same form as the equations for the transformation of
the matrix $K_H$ to the diagonal form $K_K$, K being replaced by S, K' by
ΔH', and the whole quadric K by its ‘nearly circular’ section. Denoting
the transformed matrix of the perturbation energy (for the r' states $\psi'$)
by S', we have $S' = a^{-1}Sa = a^{\dagger}Sa$.
The diagonal elements of S' are equal to the values of ΔH' for the
corresponding states, so that we can put
$$\Delta H'_n = S'_{nn},$$
which is exactly of the same form as equation (148 a), referring to the
case in which there is no degeneracy.
These equations have a very simple physical meaning, which can
be expressed by saying that the additional energy due to perturbing
forces is equal, in the first approximation, to the average value of the
perturbation energy S for the unperturbed motion.† When there is no
degeneracy, the latter is specified unambiguously by a function $\psi_{H'}$
referring to one definite state. In the presence of degeneracy these
unperturbed states have to be defined by means of the preliminary
transformation, and are, in general, different from the original states.
We are now in a position to formulate the conditions under which
a perturbation can be treated as weak. This weakness must obviously
correspond to the smallness of the angles between the symmetry axes
of the surfaces K and H and also to a smallness of the difference
between the lengths of these axes. The ‘circular’ sections of H corresponding
to degeneracy need not be taken into account, since the
directions of the axes lying within them remain arbitrary and can
always be adjusted to be close to those of the corresponding section
of K.
Now we have seen that, to a first approximation, the angles $\alpha_{H'K''}$ are
equal to $S^0_{H'H''}/(H''-H')$ and the differences $K^0_{K'K'}-H^0_{H'H'} = K'-H'$
are equal to $S^0_{H'H'}$. It follows from this that the perturbation can be
considered as weak if the matrix elements of the perturbation energy
S with respect to different values of H are small compared with the
difference between these values, and the diagonal elements are small
compared with the corresponding values of H.
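
The criterion just stated can be put in the form of a simple numerical test. The sketch below is only illustrative: the function name `is_weak` and the threshold `tol` (what counts as ‘small’) are assumptions, and any degenerate group of levels is supposed to have been dealt with already by the preliminary transformation, so that S is diagonal within it.

```python
def is_weak(levels, s, tol=0.1):
    """Check the weakness conditions for a perturbation matrix s.

    levels -- unperturbed energy values H'
    s      -- matrix elements of S in the unperturbed states
    tol    -- hypothetical threshold for 'small compared with'
    """
    n = len(levels)
    for i in range(n):
        # Diagonal elements must be small compared with H' itself.
        if abs(s[i][i]) > tol * abs(levels[i]):
            return False
        for j in range(n):
            # Off-diagonal elements must be small compared with H'' - H'.
            if i != j and abs(s[i][j]) > tol * abs(levels[i] - levels[j]):
                return False
    return True

if __name__ == "__main__":
    s = [[0.01, 0.05], [0.05, 0.02]]
    print(is_weak([1.0, 2.0], s), is_weak([1.0, 1.01], s))
```

The same matrix S may thus be weak for widely separated levels and not weak for neighbouring ones, whatever the magnitude of S as a function of the coordinates.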
The smallness of S in this sense does not exclude the possibility that
S, considered as a function of the coordinates of the particle (i.e. in
the classical sense), should become very large and even infinite at certain
points or regions. This makes the range of applicability of the wave-mechanical
perturbation theory infinitely broader than that of the
classical mechanics, which is restricted by the condition that S should
be small compared with H' at all points of the unperturbed path.
† It should be mentioned that the same result holds in the perturbation theory of
classical mechanics, the average value of S being defined here as the average value with
respect to the time.

20. Extension of the Preceding Theory to the Case of ‘Relative
Degeneracy’ and Continuous Spectra; Effect of Perturbation
on Various Physical Quantities.
In many non-degenerate problems we meet with the case of a perturbation
which cannot be described as weak (in the above sense) with
regard to pairs of (unperturbed) states belonging to certain sets, while
it remains weak with regard to pairs of states belonging to different
sets. This means that the matrix elements of S with respect to the
different states of the same set are large, or at least not small, compared
with the energy differences between these states, while the matrix
elements of S with respect to states belonging to any two different sets
are small compared with the corresponding energy differences. In the
limiting case when the energy differences between the states of the same
set vanish, we get back to the ‘degenerate’ problem considered before.
It is plain, however, that the same method can be applied approximately
when these energy-differences do not exactly vanish but are
small compared with the corresponding matrix elements of S, so that
without sensible error the (unperturbed) energies of the states in question
can be identified with each other.
This serves to show that the notion of ‘degeneracy’ can be visualized
as a relative one, from the point of view of the perturbation energy
S which we are interested in, the ‘absolute’ degeneracy which has been
considered hitherto forming but the limiting case of this relative
degeneracy. If, for instance, S contains a continuously variable parameter
(an electric or magnetic field, say), we can pass, by steadily
increasing it, from a practically non-degenerate problem to a practically
degenerate one, the degeneracy extending over certain sets of states
whose energy-differences become small, as S increases, with respect to
the corresponding matrix elements of S, while the matrix elements of
the same function remain small compared with the energy-differences
between states of different sets.†
We shall assume that such a subdivision of the various unperturbed
states into relatively narrow sets, which lie wide apart from each other
on the energy scale, is possible, and shall denote these sets as multiplets.
When the perturbation energy (defined by the value of its matrix
elements with respect to the corresponding states) is small compared
with the distance between the different multiplets and not small
(without necessarily being large) compared with the ‘widths’ of the
separate multiplets, the perturbation theory given in the preceding
section is no longer applicable, and must be replaced by a more general
method.
† A typical example of this condition is found in the transition from a weak to a strong
magnetic field in the theory of the Zeeman effect (or Paschen-Back effect).

This generalized perturbation method (which has been pointed out
by Lennard-Jones and by Jones) is extremely simple and consists in
splitting up the exact system of the transformation equations
$$\sum_{H''} K^0_{H'H''}\, a_{H''K'''} = K'''\, a_{H'K'''}$$
into a number of approximate systems, referring to the separate multiplets
and obtained from the above equations by confining the summation
over H'' for each value of H' to such states only as belong to the same
multiplet as H'.
This is exactly what we have done before in writing down the equations
(154) which refer to the limiting case of absolute degeneracy.
They are applicable, however, just as well to the more general case of
a relative degeneracy if the letters $C'_1, \ldots, C'_{r'}$ are used to denote the
states of the same ‘multiplet’, with energy-values $H'_1, \ldots, H'_{r'}$ lying
close to a certain value H' and far away from the energy values specifying
all the other unperturbed states. To prove this we need but
note the fact that the matrix elements of the total energy K with
respect to states of different sets are relatively small and can therefore
be neglected compared with those which refer to the same set
(multiplet).
In the geometrical representation of the unperturbed and the perturbed
states as the axes of the quadric surfaces H and K in the state-space,
a multiplet corresponds to a ‘nearly’ circular section of the
former. So long as each such section is nearly parallel to a certain also
nearly circular section of the K-surface, we have to deal with a perturbation
which can be considered as weak with regard to the different
multiplets. It can be, however, at the same time strong with regard to
the states of the same multiplet, if the symmetry axes of the corresponding
nearly circular sections of H and K have entirely different
directions. A one-to-one correspondence between the unperturbed
states of each multiplet and the perturbed ones cannot be traced in this
case, just as in the case of an absolute degeneracy. The difference
between the two cases lies only in the fact that in the former case the
unperturbed states are fixed unambiguously, while in the latter they
are represented by a perfectly arbitrary set of mutually perpendicular
axes in the corresponding exactly circular section of the quadric H.
As has just been mentioned, the equations (154) still hold for the
case of the ‘relative degeneracy’ if the letters $C'_1, \ldots, C'_{r'}$ serve to distinguish
the states of a multiplet belonging to neighbouring values of the energy
$H'_1, \ldots, H'_{r'}$. The equations (154a) or (154b) are, however, not applicable
to the general case, for we must take into account the differences
between the various ‘sub-levels’ $H'_n$ (n = 1, ..., r'). To do this we need
only replace ΔH' in (154 a) by $K'-H'_m$, which gives, in the
notation of (154 b),
$$\sum_{n=1}^{r'} S^0_{mn}\, a_n = (K'-H'_m)\, a_m \qquad (155)$$
or
$$\sum_{n=1}^{r'} S^0_{mn}\, a_n = (\Delta H'-\Delta H'_m)\, a_m, \qquad (155\,\mathrm{a})$$
where $\Delta H' = K'-H'$ and $\Delta H'_m = H'_m-H'$,
H' denoting some average of the r' values $H'_1, H'_2, \ldots, H'_{r'}$. The compatibility
condition of the equations (155 a)
$$\begin{vmatrix} S^0_{11}+\Delta H'_1-\Delta H' & S^0_{12} & \cdots & S^0_{1r'} \\ S^0_{21} & S^0_{22}+\Delta H'_2-\Delta H' & \cdots & S^0_{2r'} \\ \vdots & \vdots & & \vdots \\ S^0_{r'1} & S^0_{r'2} & \cdots & S^0_{r'r'}+\Delta H'_{r'}-\Delta H' \end{vmatrix} = 0 \qquad (155\,\mathrm{b})$$
differs from (154 c) by the additional terms $\Delta H'_m$ in the diagonal elements
of the determinant, and leads as before to r' (in general different)
values of the perturbed energy K' = H' + ΔH'. If the non-diagonal
terms of the determinant are sufficiently small it reduces to the product
of the diagonal terms, leading to the expressions $\Delta H' = S^0_{nn}+\Delta H'_n$ or
$K'-H'_n = S^0_{nn}$, which have been obtained in the preceding section for the
case of no degeneracy. If, on the contrary, the terms $\Delta H'_m$, or rather the
differences $H'_m-H'_n$, are small compared with $S^0_{mn}$, equation (155 b) practically
reduces to the equation (154c) for the case of complete (absolute)
degeneracy.
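
The two limiting cases of (155 b) can be checked numerically on a multiplet of two states. The sketch below is an illustration only; the function name `multiplet_levels` and the assumption of a real symmetric S are introduced here and do not occur in the text.

```python
import math

def multiplet_levels(h1, h2, s11, s12, s22):
    """Roots K' of (155 b) for a multiplet of two states.

    h1, h2        -- neighbouring unperturbed energies H'_1, H'_2
    s11, s12, s22 -- real matrix elements of the perturbation S
    """
    d1, d2 = h1 + s11, h2 + s22          # diagonal terms H'_n + S_nn
    mean = 0.5 * (d1 + d2)
    half_gap = math.hypot(0.5 * (d1 - d2), s12)
    return mean - half_gap, mean + half_gap

if __name__ == "__main__":
    # Widely separated levels: the roots stay close to H'_n + S_nn.
    print(multiplet_levels(0.0, 10.0, 0.1, 0.01, 0.2))
    # Coinciding levels: the splitting is fixed by S alone, as in (154 c).
    print(multiplet_levels(1.0, 1.0, 0.0, 0.5, 0.0))
```

For intermediate separations the roots interpolate continuously between these two limits, which is the sense in which degeneracy is a relative notion.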
We have hitherto assumed that the wave functions $\psi_{H'}$ specifying
the unperturbed states are orthogonal with respect to each other. The
above theory can easily be extended to the case when the orthogonality
condition is not fulfilled. We need not, however, consider this case in
detail here, for it has been dealt with already in § 9 of Chap. II by the
variational method. The results embodied in the equations (61) are
a generalization of the equations (154), which differ from the (specialized)
equations (62) in the notation only.
It should be mentioned that to the states defined by non-orthogonal
wave functions there corresponds in the state-space a system of non-orthogonal
axes to which the energy quadrics H and K are referred.
The non-orthogonality of these axes means physically that the corresponding
states are not mutually excluded, the integral $\int \psi^*_{H'}\psi_{H''}\,dV$
measuring in fact (through the square of its modulus) the probability
of one of them when the other is supposed to be realized.
So far we have dealt only with the case in which the unperturbed
motion has a discrete energy spectrum (which corresponds, classically, to
its being confined to a limited region of space). The case of a continuous
H-spectrum could be treated on similar lines. It is, however, meaningless
to determine the change ΔH of the energy-levels produced by the
perturbation, when these levels form a continuous series. Thus one of
the main problems of the perturbation theory relating to the case of
discrete H-spectra, together with the complications arising in connexion
with degeneracy, drops out. The other problem, that of the determination
of the change Δψ of the wave functions specifying the
stationary states, can be solved in the same way as before, i.e. by
determining the transformation coefficients $a_{H'K''}$. In the present case
the zero approximation is given by the formula
$$a^0_{H'K''} = \delta(H'-H''),$$
instead of $a^0_{H'K''} = \delta_{H'H''}$. Instead of equation (144 b), we have
$$\int K^0_{H'H''}\, a_{H''K'''}\, dH'' = K'''\, a_{H'K'''}.$$
Putting $a_{H''K'''} = \delta(H''-H''')+\Delta a_{H''K'''}$ and K = H + S, then since
$H^0_{H'H''} = H'\,\delta(H'-H'')$, we get
$$S^0_{H'H'''} + H'\bigl[\delta(H'-H''')+\Delta a_{H'K'''}\bigr] + \int S^0_{H'H''}\,\Delta a_{H''K'''}\,dH'' = K'''\bigl[\delta(H'-H''')+\Delta a_{H'K'''}\bigr],$$
which, with $K''' = H'''+\Delta H'''$, can be written in the form
$$S^0_{H'H'''} + \int S^0_{H'H''}\,\Delta a_{H''K'''}\,dH'' = \Delta H'''\bigl[\delta(H'-H''')+\Delta a_{H'K'''}\bigr] + (H'''-H')\,\Delta a_{H'K'''}.$$
This method can be conveniently applied only when the quantities
$\Delta a_{H'K''}$ are known to be small, a condition which is, in general, not
satisfied.
An alternative method consists in the direct determination of the
change $\Delta\psi_{H'}$ of the functions $\psi_{H'}$ which is produced by the perturbation,
without the use of the integral representation
$$\Delta\psi_{H'} = \int \Delta a_{H''K'}\,\psi_{H''}\,dH''$$
(where $\Delta a_{H''K'} = a_{H''K'}-\delta(H''-H')$). This can be done with the help of the
equation
$$(H+S-K')(\psi_{H'}+\Delta\psi_{H'}) = 0,$$
which can be written in the form
$$(H-K')\,\Delta\psi_{H'} = -S(\psi_{H'}+\Delta\psi_{H'}) \qquad (156)$$
and which differs from the approximate equation (153) by leaving
‘unsplit’ the energy K' of the perturbed motion and by preserving the
small term $S\Delta\psi_{H'}$. Dropping it, we get the equation of the first approximation:
$$(H-K')\,\Delta_1\psi_{H'} = -S\psi_{H'}. \qquad (156\,\mathrm{a})$$
Substituting on the right side the nth-order correction $\Delta_n\psi_{H'}$ for $\psi_{H'}$,
we get the equation for the correction of the (n+1)th order,
$$(H-K')\,\Delta_{n+1}\psi_{H'} = -S\,\Delta_n\psi_{H'}, \qquad (156\,\mathrm{b})$$
the exact function $\psi_{H'}+\Delta\psi_{H'}$ being thus defined as the limit of the
series $\psi_{H'}+\Delta_1\psi_{H'}+\Delta_2\psi_{H'}+\ldots$.
This method has been worked out by Born in connexion with collision
problems (see Part III). It can be applied also to the case of
discrete spectra (thus enabling one to avoid the determination of the
transformation coefficients a); but in this case it must be modified by
putting $K' = H'+\Delta H' = H'+\Delta_1H'+\Delta_2H'+\ldots$, which leads to the
equations
$$\left.\begin{aligned} (H-H')\,\Delta_1\psi_{H'} &= -(S-\Delta_1H')\,\psi_{H'} \\ (H-H')\,\Delta_2\psi_{H'} &= -(S-\Delta_1H')\,\Delta_1\psi_{H'}+(\Delta_2H')\,\psi_{H'} \\ &\;\;\vdots \\ (H-H')\,\Delta_{n+1}\psi_{H'} &= -(S-\Delta_1H')\,\Delta_n\psi_{H'}+(\Delta_2H')\,\Delta_{n-1}\psi_{H'}+\ldots+(\Delta_{n+1}H')\,\psi_{H'} \end{aligned}\right\} \qquad (157)$$
The problem becomes more complicated, for we must determine not
only the functions $\Delta_1\psi_{H'}$, $\Delta_2\psi_{H'}$, etc., but at the same time the numbers
$\Delta_1H'$, $\Delta_2H'$, .... This can be done with the help of the so-called orthogonality
property of the non-homogeneous linear equations of the form
$$(H-H')\,\chi = f. \qquad (157\,\mathrm{a})$$
This ‘orthogonality’ consists in the following: Multiplying the preceding
equation by the solution $\psi_{H'}$ of the corresponding homogeneous equation
$(H-H')\psi_{H'} = 0$, or its conjugate complex $\psi^*_{H'}$, and integrating, we get,
in view of the self-adjointness of the operator H,
$$\int \psi^*_{H'}(H-H')\chi\,dV = \int \chi\,(H-H')\psi^*_{H'}\,dV = 0,$$
and consequently
$$\int f\,\psi^*_{H'}\,dV = 0. \qquad (157\,\mathrm{b})$$
Applying this ‘orthogonality property’ to the first of equations (157), we
get
$$\Delta_1H'\int \psi^*_{H'}\psi_{H'}\,dV = \int \psi^*_{H'}S\psi_{H'}\,dV,$$
that is, $\Delta_1H' = S^0_{H'H'}$. Applying it to the second, we get in a similar
way
$$\Delta_2H' = \int \psi^*_{H'}(S-\Delta_1H')\,\Delta_1\psi_{H'}\,dV,$$
which can easily be evaluated after $\Delta_1\psi_{H'}$ has been determined from the
first of equations (157). This process can be prolonged as far as one
may desire, the determination of $\Delta_nH'$ always preceding by one step
that of $\Delta_n\psi_{H'}$.
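
The first two steps of this process can be sketched for a finite number of non-degenerate states, the functions being replaced by their coefficients with respect to the unperturbed states. The function name `first_two_corrections`, the list-of-lists representation of S, and the restriction to real symmetric S are assumptions of this illustration.

```python
import math

def first_two_corrections(levels, s, k):
    """First- and second-order energy corrections for state k.

    levels -- non-degenerate unperturbed energies H'
    s      -- real symmetric matrix of the perturbation S
    Returns (d1, d2, c1), where c1 holds the coefficients of the
    first-order correction to the wave function.
    """
    n = len(levels)
    # Orthogonality property applied to the first equation of the set:
    d1 = s[k][k]
    # Projection on the other states gives the first-order coefficients.
    c1 = [0.0 if m == k else s[m][k] / (levels[k] - levels[m])
          for m in range(n)]
    # Orthogonality property applied to the second equation; the term
    # in d1 drops out because c1[k] = 0.
    d2 = sum(s[k][m] * c1[m] for m in range(n))
    return d1, d2, c1

if __name__ == "__main__":
    g = 0.01
    d1, d2, c1 = first_two_corrections([0.0, 1.0], [[0.0, g], [g, 0.0]], 0)
    exact = 0.5 - math.sqrt(0.25 + g * g)   # lower root of the 2x2 problem
    print(d1 + d2, exact)
```

In this example the sum of the first two corrections agrees with the exact lower root up to terms of the fourth order in the coupling g.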
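If (157 a) is multiplied by $\psi^*_{H''}$ instead of $\psi^*_{H'}$, we obtain, on integration,
$$\int \psi^*_{H''}\,\chi\,dV = \frac{1}{H''-H'}\int \psi^*_{H''}\,f\,dV. \qquad (157\,\mathrm{c})$$
This gives, if applied to the first of equations (157),
$$\int \psi^*_{H''}\,\Delta_1\psi_{H'}\,dV = \frac{S^0_{H''H'}}{H'-H''},$$
i.e. the expression for the coefficient $\Delta_1 a_{H''K'}$. This is quite natural, for
if we put $\Delta_1\psi_{H'} = \sum_{H''}\Delta_1 a_{H''K'}\,\psi_{H''}$, then, in view of the orthogonality
of the functions $\psi_{H'}$ and $\psi_{H''}$, we get $\int \psi^*_{H''}\,\Delta_1\psi_{H'}\,dV = \Delta_1 a_{H''K'}$.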
The preceding results obviously hold for the case only when the
unperturbed problem is not degenerate, and must be modified if there
is degeneracy, either absolute or relative.
We shall, however, leave that case aside and shall briefly examine
the approximate effect produced by the perturbation on any physical
quantity F described as a matrix, from the point of view of H in the
case of the unperturbed motion and that of K in that of the perturbed
one. This can be readily done after we have succeeded in determining
the supposedly small quantities $\Delta a_{H'K''}$ or $\Delta\psi_{H'}$. Putting
$$F^0_{K'K''}-F^0_{H'H''} = \Delta F^0_{H'H''},$$
we have
$$\Delta F^0_{H'H''} = (a^{\dagger}Fa-F)^0_{H'H''},$$
or, since $a = \delta+\Delta a$ and $\delta F = F\delta = F$,
$$\Delta F^0_{H'H''} = (F\,\Delta a+\Delta a^{\dagger}F)^0_{H'H''}+(\Delta a^{\dagger}F\,\Delta a)^0_{H'H''}.$$
This gives, to the first order of approximation,
$$\Delta_1F^0_{H'H''} = (F\,\Delta_1a+\Delta_1a^{\dagger}F)^0_{H'H''},$$
or in the case of a discrete H-spectrum (with no degeneracy or a degeneracy
accounted for by a preliminary transformation), according to
(148b):
$$\Delta_1F^0_{H'H''} = \sum_{H'''\neq H''}\frac{F^0_{H'H'''}\,S^0_{H'''H''}}{H''-H'''}+\sum_{H'''\neq H'}\frac{S^0_{H'H'''}\,F^0_{H'''H''}}{H'-H'''}, \qquad (158)$$
since $\Delta_1a_{H'H''} = \alpha_{H'H''} = -\Delta_1a^{\dagger}_{H'H''} = S^0_{H'H''}/(H''-H')$. Putting H'' = H'
and writing H'' for H''' we obtain, in particular,
$$\Delta_1F^0_{H'H'} = \sum_{H''\neq H'}\frac{F^0_{H'H''}\,S^0_{H''H'}+S^0_{H'H''}\,F^0_{H''H'}}{H'-H''}. \qquad (158\,\mathrm{a})$$
This formula determines the change of the average or probable values
of F for the different unperturbed states as compared with the corresponding
perturbed states. Putting F = S, we get
$$\Delta_1S^0_{H'H'} = 2\sum_{H''\neq H'}\frac{|S^0_{H'H''}|^2}{H'-H''}.$$
Comparing this with (149 a), we obtain the following relation between the
second-order correction for the energy and the first-order correction
for S:
$$\Delta_2H' = \tfrac{1}{2}\,\Delta_1S^0_{H'H'}. \qquad (158\,\mathrm{b})$$
This formula is quite similar to
$$\Delta_1H' = S^0_{H'H'},$$
and can be further generalized with the result
$$\Delta_nH' = \frac{1}{n}\,\Delta_{n-1}S^0_{H'H'}$$
if higher-order corrections for the matrix elements are taken into consideration,
according to (157 a). We shall not, however, consider in
detail this question which can easily be solved by substituting in (157 a)
the expressions $\Delta a = \Delta_1a+\Delta_2a+\ldots$.
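
The relation between the first-order change of an average value and the second-order energy can be verified numerically for a small number of non-degenerate states. In the sketch below the function name `first_order_change` and the sample numbers are hypothetical, and all matrices are taken real and symmetric.

```python
def first_order_change(levels, s, f, k):
    """First-order change of the average value of F in state k.

    levels -- non-degenerate unperturbed energies H'
    s, f   -- real symmetric matrices of the perturbation S and of F
    """
    total = 0.0
    for m in range(len(levels)):
        if m != k:
            total += (f[k][m] * s[m][k] + s[k][m] * f[m][k]) / (levels[k] - levels[m])
    return total

def second_order_energy(levels, s, k):
    """Second-order energy correction: sum of |S|^2 / (H' - H'')."""
    return sum(s[k][m] * s[m][k] / (levels[k] - levels[m])
               for m in range(len(levels)) if m != k)

if __name__ == "__main__":
    levels = [0.0, 1.0, 3.0]
    s = [[0.1, 0.2, 0.3], [0.2, 0.0, 0.1], [0.3, 0.1, 0.2]]
    print(first_order_change(levels, s, s, 0), second_order_energy(levels, s, 0))
```

With F put equal to S the first quantity is twice the second, whatever the state k chosen.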
For the sake of illustration we shall apply the preceding equations
to the case of a hydrogen-like atom, perturbed by a homogeneous
electric field E parallel to the x-axis. We have in this case S = −eEx,
where x is the coordinate of the electron with respect to the nucleus.
Putting in (158 a) F = ex, we obtain the expression for the additional
electric moment induced by the field when the atom is supposed to
remain in the (non-degenerate) unperturbed state H':
$$\Delta_1(ex)^0_{H'H'} = 2e^2E\sum_{H''\neq H'}\frac{|x^0_{H'H''}|^2}{H''-H'} = \alpha E, \qquad (158\,\mathrm{c})$$
where α is the polarization (or susceptibility) coefficient. The corresponding
energy must obviously be equal to $-\tfrac{1}{2}\alpha E^2 = \tfrac{1}{2}\Delta_1S^0_{H'H'}$, which
is in agreement with the relation (158 b) since the energy in question
corresponds to the second-order correction ($\Delta_2H'$).
The same results are obtained, of course, if instead of the transformation
coefficients the transformed functions ψ, or rather the corrections
Δψ, are used. Limiting ourselves to the first approximation, we get
$$F^0_{K'K''} = \int \psi^*_{K'}F\psi_{K''}\,dV \cong \int \psi^*_{H'}F\psi_{H''}\,dV+\int \Delta_1\psi^*_{H'}\,F\psi_{H''}\,dV+\int \psi^*_{H'}F\,\Delta_1\psi_{H''}\,dV,$$
that is,
$$\Delta_1F^0_{H'H''} = \int \Delta_1\psi^*_{H'}\,F\psi_{H''}\,dV+\int \psi^*_{H'}F\,\Delta_1\psi_{H''}\,dV. \qquad (159)$$
These expressions can be used in the case of continuous H-spectra when
the functions $\Delta_1\psi_{H'}$ are determined directly by Born's method. If they
are determined with the help of the transformation coefficients, we get,
as before, $\Delta_1F^0_{H'H''} = (F\,\Delta_1a+\Delta_1a^{\dagger}F)^0_{H'H''}$, which means in the present
case
$$\Delta_1F^0_{H'H''} = \int \bigl(F^0_{H'H'''}\,\Delta_1a_{H'''H''}+\Delta_1a^{\dagger}_{H'H'''}\,F^0_{H'''H''}\bigr)\,dH'''$$
instead of (158).
In conclusion the following remark should be made. It can happen
that, while the unperturbed motion is confined to a finite region and
has accordingly, within a certain interval of energy values, a discrete
spectrum, the perturbed motion has, within the same interval, a continuous
energy spectrum, which means that the perturbing forces, even
when small, can extract the particle and drive it to infinity. An example
of this condition is furnished by the action of a homogeneous electric
field on a hydrogen atom. In the region of low energy values the continuous
energy spectrum, corresponding to the presence of the electric
field, practically reduces to a discrete one, with each H-level split up
(as a consequence of degeneracy) into several sub-levels. This phenomenon
is known as the Stark effect. The sub-levels in question have,
however, a certain effective width which increases with the strength of
the electric field and which corresponds to the phenomenon of predissociation,
discussed in Part I, § 16. This means that there exists a
certain probability for the atom to be ionized by the electric field even
if the unperturbed state of the atom corresponds to the lowest energy.
The width of the energy-levels becomes, however, marked for unperturbed
states, which correspond to comparatively high energy-levels,
where the energy spectrum of the perturbed atom becomes practically
continuous. In the case of the unperturbed atom, the continuous
spectrum starts at the point where the energy is equal to zero, while
for a perturbed atom it starts below this point, and indeed the more
below, the larger the perturbing electric field.
21. Perturbation Theory involving the Time; General Processes;
Theory of Transitions
In all the foregoing developments the time has been completely
ignored. This has been possible because we have limited ourselves to
the consideration of such physical quantities as do not depend upon
the time. It may seem, at first sight, that the introduction of the time
as an independent variable into the expression of an operator, F(x) say,
representing some variable physical quantity, would only have the
effect of making its characteristic values, and consequently the states
specified by them, functions of the time. That this is not so is clear,
however, from the example of the energy. If the energy operator K contains
the time explicitly, then an equation of the type $(K-K')\phi_{K'} = 0$
has no physical meaning and must be replaced by the general equation
of motion
$$(K+p_t)\,\phi = 0, \qquad (160)$$
where $p_t = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$. The equation $(K-K')\phi_{K'} = 0$ would correspond to
the treatment of the time as a simple parameter; from the purely mathematical
point of view, the appearance of the time would have no
particular meaning, save that of making the characteristic values K'
and the characteristic functions $\phi_{K'}(t)$ definite functions of the time.
These functions, as well as the corresponding characteristic values K'(t),
would, however, have nothing to do with those functions φ(x, t) which
describe wave-mechanically the motion determined by the energy
operator K and which are the solutions of equation (160).
So long as K depends upon the time, this equation does not admit
particular solutions of the type $\phi = \phi^0_{K'}(x)\,e^{-i2\pi K't/h}$, which means, from
the physical point of view, that K has no characteristic values, or, in
other words, that the values of a variable energy cannot be specified.
This result constitutes one of the fundamental differences between
wave mechanics and classical mechanics, where the value of a variable
energy can always be ascertained as a definite function of the time.
The same refers to other operators involving the time as an independent
variable.
It is true that the energy is more intimately connected with the time
than any other operator. It seems, however, doubtful whether an
equation of the form $F\psi = F'\psi$ defining the characteristic values of
an operator has any meaning if F(x) depends upon the time, so
long at least as the latter is treated on an entirely different basis from
that of the coordinates x, y, z. The exceptional role of the time is revealed
by the fact that, in contradistinction to the coordinates, it cannot be
used for the specification of the states, the latter being referred, in
general, to a particular instant of time. The time, therefore, cannot
be treated on the same lines as the coordinates and other physical
quantities, and, in particular, it cannot be represented as an operator
or a matrix with regard to some other basic quantity. Even when
completely ‘inactive’, the time remains above the realm of ordinary
quantities, ruling out the very possibility of their determination (so far
as exact and not probable values are concerned) by its active interference.
Nevertheless, the transformation theory which has been developed
in the preceding chapter can be applied in a somewhat modified and
generalized form to variable quantities and, in particular, to the energy
K of a particle moving in a variable field of force.
If the variable part of K refers to a comparatively small force, we
can regard the latter as a perturbing factor causing transitions between
the states specified by the part of K which does not contain the time.
This theory of transitions has been outlined already in Part I, § 14.
We shall now briefly recapitulate it, using the new notation, and we
shall point out its connexion with the transformation theory.
The variable part of K, which will be regarded as the perturbation
energy, will be denoted, as before, by S, and the constant part by H.
The function φ(x, t), which is the general solution of equation (160),
can be represented as a superposition of the (normalized) functions $\psi_{H'}$
which correspond to the different states specified by the operator H,
with suitably determined variable coefficients.
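Taking first the case of a discrete H-spectrum, we shall put accordingly
$$\phi(x,t) = \sum_{H'} c_{H'}(t)\,\psi_{H'} \qquad (160\,\mathrm{a})$$
with $\psi_{H'} = \psi^0_{H'}(x)\,e^{-i2\pi H't/h}$, or
$$\phi(x,t) = \sum_{H'} C_{H'}(t)\,\psi^0_{H'}(x), \qquad (160\,\mathrm{b})$$
where
$$C_{H'}(t) = c_{H'}(t)\,e^{-i2\pi H't/h}. \qquad (160\,\mathrm{c})$$
Substituting (160 a) in (160) and taking into account that the functions
$\psi_{H'}$ satisfy the equation $(H+p_t)\psi_{H'} = 0$, we have
$$(H+S+p_t)\sum_{H'} c_{H'}(t)\,\psi_{H'} = \sum_{H'}\bigl(\psi_{H'}\,p_tc_{H'}+c_{H'}\,S\psi_{H'}\bigr) = 0.$$
Since $S\psi_{H'} = \sum_{H''} S_{H''H'}\,\psi_{H''}$,
we get
$$\sum_{H'}\sum_{H''}\psi_{H''}\bigl[\delta_{H'H''}\,p_tc_{H'}+c_{H'}\,S_{H''H'}\bigr] = 0,$$
whence $\sum_{H'}\bigl(\delta_{H'H''}\,p_tc_{H'}+S_{H''H'}\,c_{H'}\bigr) = 0$,
or, interchanging H' and H'',
$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H''} S_{H'H''}\,c_{H''}. \qquad (161)$$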
It should be remembered that the quantities $S_{H'H''}$ represent not the
matrix elements but the matrix components of the perturbation energy,
so that $S_{H'H''} = S^0_{H'H''}\,e^{i2\pi(H'-H'')t/h}$. Further, so long as S contains the
time explicitly, the matrix elements $S^0_{H'H''} = \int \psi^{0*}_{H'}S\,\psi^0_{H''}\,dV$ must also
be certain functions of the time, so that (161) can be written in the form
$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H''} S^0_{H'H''}(t)\,e^{i2\pi(H'-H'')t/h}\,c_{H''}. \qquad (161\,\mathrm{a})$$
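If we substitute in (160) the expression (160 b) instead of (160 a), we
get in the same way, without, however, separating K into the parts
H and S,
$$(K+p_t)\sum_{H'} C_{H'}\,\psi^0_{H'} = \sum_{H'}\bigl(\psi^0_{H'}\,p_tC_{H'}+C_{H'}\,K\psi^0_{H'}\bigr) = 0,$$
or, since $K\psi^0_{H'} = \sum_{H''} K^0_{H''H'}\,\psi^0_{H''}$,
$$\sum_{H''}\psi^0_{H''}\sum_{H'}\bigl(K^0_{H''H'}\,C_{H'}+\delta_{H''H'}\,p_tC_{H'}\bigr) = 0,$$
or finally
$$-\frac{h}{2\pi i}\frac{dC_{H'}}{dt} = \sum_{H''} K^0_{H'H''}\,C_{H''}. \qquad (161\,\mathrm{b})$$
This equation can be derived from (161a), or the latter from it,
with the help of the relation (160c) between the coefficients C and c and
the relations
$$K^0_{H'H''} = H^0_{H'H''}+S^0_{H'H''} = H'\,\delta_{H'H''}+S^0_{H'H''}.$$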
As already explained in Part I, § 17, the squares of the moduli of the
coefficients $c_{H'}$ or $C_{H'}$, i.e. the quantities
$$N_{H'}(t) = C_{H'}C^*_{H'} = c_{H'}c^*_{H'}, \tag{162}$$
can be interpreted as the probabilities of finding the particle at the
instant t in the unperturbed state $H'$, or, using the ‘multiplex
representation’, as the relative numbers of the copies of the particle in the
state $H'$ at the instant t. These numbers can be determined as
functions of the time with the help of equations (161b) or (161) if the
initial values of the coefficients $C_{H'}$ (or $c_{H'}$) at some instant t = 0 are
supposed to be known. We shall denote them in future by $C^0_{H'}$ and
write accordingly $N_{H'}(0) = N^0_{H'}$.
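The determination of the numbers $N_{H'}(t)$ from equations (161a) can be illustrated by a small numerical sketch. It assumes a hypothetical two-level system, units in which h = 1, a real harmonic coupling of amplitude 0.05, and a standard fourth-order Runge-Kutta integrator; none of these choices comes from the text, and the function names are illustrative only.

```python
import cmath
import math

H_PLANCK = 1.0  # Planck's constant h in arbitrary units (illustrative assumption)


def rhs(t, c, levels, s0):
    """Time derivative of the coefficients c according to (161a)."""
    s = s0(t)
    out = []
    for a in range(len(c)):
        acc = 0j
        for b in range(len(c)):
            phase = cmath.exp(2j * math.pi * (levels[a] - levels[b]) * t / H_PLANCK)
            acc += s[a][b] * phase * c[b]
        out.append(-2j * math.pi / H_PLANCK * acc)
    return out


def rk4_step(t, c, dt, levels, s0):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return [ci + factor * ki for ci, ki in zip(c, k)]

    k1 = rhs(t, c, levels, s0)
    k2 = rhs(t + dt / 2, shifted(k1, dt / 2), levels, s0)
    k3 = rhs(t + dt / 2, shifted(k2, dt / 2), levels, s0)
    k4 = rhs(t + dt, shifted(k3, dt), levels, s0)
    return [ci + dt / 6 * (a + 2 * b + 2 * d + e)
            for ci, a, b, d, e in zip(c, k1, k2, k3, k4)]


def evolve(c0, t_end, steps, levels, s0):
    """Integrate (161a) from t = 0 to t_end starting from the coefficients c0."""
    c, dt = list(c0), t_end / steps
    for n in range(steps):
        c = rk4_step(n * dt, c, dt, levels, s0)
    return c


def populations(c):
    """The numbers N = |c|^2 of (162)."""
    return [abs(ci) ** 2 for ci in c]


def resonant_coupling(t):
    """Hypothetical matrix elements S0(t): harmonic coupling at the transition frequency."""
    amp = 0.05 * math.cos(2 * math.pi * 1.0 * t)
    return [[0.0, amp], [amp, 0.0]]


if __name__ == "__main__":
    c_final = evolve([1.0 + 0j, 0j], t_end=2.0, steps=4000,
                     levels=[0.0, 1.0], s0=resonant_coupling)
    print(populations(c_final))
```

In this sketch the sum of the populations stays equal to 1 up to the integration error, while the initially vacant state slowly acquires copies.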
The change of the numbers $N_{H'}$ with the time can be interpreted as
the result of transitions induced by the perturbing forces. So long,
however, as two or more of the numbers $N^0_{H'}$ are different from zero,
it is impossible to ascertain the original state from which the transition
to a given state takes place.
In order to be able to speak of definite transitions to a given final
state from a given initial state, we must therefore assume that initially
all the copies of the particle were in the same state, $H'$ say. This means
that all the coefficients $C^0_{H''}$ must be set equal to zero, with the
exception of one of them, $C^0_{H'}$, which can be put equal to 1. This can be
expressed by means of the formula
$$C^0_{H''} = \delta_{H''H'}, \tag{162a}$$
which serves to show that the coefficients $C_{H''}(t)$, not only for t = 0 but
also for t > 0, can be considered as the elements of a matrix, which we
shall call the transition matrix and shall denote by the same letter C.
The value of the coefficient $C_{H''}$ at the time t, on the assumption of
a definite initial state $H'$, will thus be denoted by
$$C_{H''}(t) = C_{H''H'}(t), \tag{162b}$$
the initial value of the matrix C being $\delta$ (that is, 1).
The formula $\phi = \sum_{H''} C_{H''}(t)\,\psi^0_{H''}$ represents the general solution of
Schrödinger’s equation (160). That particular solution of it which
reduces to $\psi_{H'}$ at the initial instant t = 0 can conveniently be denoted
by $\phi_{H'}(x,t)$. We thus get for particular solutions of this type, which
approximate to the particular solutions of the equation of the
unperturbed motion $(H+p_t)\psi = 0$, the following formula:
$$\phi_{H'} = \sum_{H''} C_{H''H'}\,\psi^0_{H''}, \tag{163}$$
which shows that the transition matrix C(t) can be regarded as the
transformation matrix from the wave functions $\psi^0_{H''}$ to the wave
functions $\phi_{H'}$. The latter can no longer be denoted by $\phi_{K'}$, as was done
before, since K has no characteristic values; these characteristic values
can, however, be replaced by a kind of ‘reminiscence’ of the particular
solutions of the equation $(K+p_t)\phi = 0$ about the H-state they
represented at the instant t = 0.
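It can easily be shown that the functions $\phi_{H'}$, $\phi_{H''}$, etc., are mutually
orthogonal, just as are the functions $\phi_{K'}$ considered before.
We have in fact
$$\left(K + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\phi_{H'} = 0, \qquad \left(K^* - \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\phi^*_{H''} = 0.$$
Multiplying the first of these equations by $\phi^*_{H''}$ and the second by $\phi_{H'}$,
subtracting one from the other, and integrating over the coordinates,
we get
$$\int\left(\phi^*_{H''}K\phi_{H'} - \phi_{H'}K^*\phi^*_{H''}\right)dV = -\frac{h}{2\pi i}\int\frac{\partial}{\partial t}\left(\phi^*_{H''}\phi_{H'}\right)dV,$$
or, since the left-hand side vanishes (so long as K, in spite of its
dependence upon the time, preserves the property of self-adjointness), we get
$$\frac{d}{dt}\int\phi^*_{H''}\phi_{H'}\,dV = 0.$$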
We thus see that the value of the integral $\int\phi^*_{H''}\phi_{H'}\,dV$ does not depend
upon the time. Since at the initial moment t = 0 we have $\phi_{H'} = \psi_{H'}$
and $\phi_{H''} = \psi_{H''}$, it follows from this that the functions $\phi_{H'}$ satisfy,
irrespective of the time, the same orthogonality and normalizing
conditions
$$\int\phi^*_{H''}\phi_{H'}\,dV = \delta_{H''H'} \tag{163a}$$
as the functions $\psi_{H'}$.
Substituting in these equations the expressions (163), we have further
$$\int\phi^*_{H''}\phi_{H'}\,dV = \sum_{H'''}\sum_{H''''} C^*_{H''''H''}\,C_{H'''H'}\int\psi^{0*}_{H''''}\psi^0_{H'''}\,dV,$$
that is,
$$\int\phi^*_{H''}\phi_{H'}\,dV = \sum_{H'''} C^*_{H'''H''}\,C_{H'''H'},$$
and consequently
$$\sum_{H'''} C^*_{H'''H''}\,C_{H'''H'} = \delta_{H''H'}. \tag{163b}$$
This equation shows that the transition matrix is unitary ($C^\dagger = C^{-1}$),
just as are the ordinary transformation matrices, which have been
considered in the preceding sections and which do not depend upon the
time. The transformation equations (163) can be written accordingly
in the ordinary matrix form
$$\phi = \psi^0 C, \quad\text{or}\quad \phi = \psi c. \tag{163c}$$
It follows from these results that the functions $\phi_{H'}$ specify perfectly
definite states in the same sense as those which would be represented
by the functions $\phi_{K'}$ if K were independent of the time and had definite
characteristic values; the only difference between them being that the
former vary with the time while the latter should remain constant.
The set of states specified by the functions $\phi_{H'}$ can be represented
geometrically as an orthogonal system of coordinates in the state-space,
the transformation coefficients $C_{H''H'}$ denoting the cosines of the angles
between the fixed axes which represent the states $\psi^0_{H''}$ and the movable
axes which represent the states $\phi_{H'}$. This movable system of axes,
rotating like a solid body in the state-space, can be regarded as the
geometrical representation of the variable energy K.
One might be inclined to go a step further and to represent K by
a quadric surface defined by the equation
$$\sum_{H'}\sum_{H''} K^0_{H'H''}\,a^*_{H'}a_{H''} = \text{const.},$$
thus fixing not only the directions but also the lengths of the axes
associated with K, i.e. the characteristic values of the latter. This
argument is, however, fallacious because the preceding equation has nothing
to do with the representation of the variable surface K, which we have
been considering, but represents in reality the fictitious ‘quasi-constant’
energy operator K with the time treated as a simple parameter.
The fallacy of the above argument becomes especially apparent when
K is actually constant (which can be considered as a special case of
a variable K). The equation $\sum_{H'}\sum_{H''} K^0_{H'H''}\,a^*_{H'}a_{H''} = \text{const.}$ will then
represent K as a quadric surface fixed in the state-space. Nothing,
however, will prevent us from solving the equation $(K+p_t)\phi = 0$ in
this case in the same way as in the preceding case, namely, by taking
particular solutions not of the usual K-type, $\phi_{K'} = \phi^0_{K'}\,e^{-i2\pi K't/h}$, but of
the H-type, i.e. such that, at the initial moment t = 0, $\phi$ coincides with
one of the functions $\psi^0_{H'}$. The functions $\phi_{H'}$ so obtained will represent
for $t \neq 0$ states entirely different both from those specified by the
functions $\psi_{H'}$ and from those specified by the functions $\phi_{K'}$. In order
to avoid confusion, we shall denote the characteristic functions of K
(when they exist of course, i.e. when K is independent of the time) by
$\chi_{K'}$ instead of $\phi_{K'}$. The connexion between these functions and the
functions $\psi^0_{H'}$,
$$\chi^0_{K''} = \sum_{H'} a_{H'K''}\,\psi^0_{H'}, \tag{164}$$
which has been investigated before, is represented by a constant
transformation matrix a, which has nothing to do with the variable
matrices C and c.
It should be remarked that the elements of these matrices are
connected with each other, according to (160c), by the relation
$$C_{H''H'} = c_{H''H'}\,e^{-i2\pi H''t/h},$$
which is not symmetrical with regard to the two indices and is in
agreement with the unitary character of the two matrices.
The transformation matrix a can be derived from the general
equations (161b) if the condition that the function $\phi$ should reduce to $\psi^0_{H'}$
for t = 0 is replaced by the condition that it should be a harmonic
function of the time of the type
$$\phi = \chi_{K''} = \chi^0_{K''}\,e^{-i2\pi K''t/h}. \tag{164a}$$
This means, on account of the equation
$$\phi = \sum_{H'} C_{H'}\,\psi^0_{H'},$$
that all the coefficients $C_{H'}$ should also be of the type
$$C_{H'} = C^0_{H'}\,e^{-i2\pi K''t/h}. \tag{165}$$
The differential equations (161b) reduce, subject to this condition, to
a system of ordinary algebraic equations for the amplitudes $C^0_{H'}$:
$$K''\,C^0_{H'} = \sum_{H''} K^0_{H'H''}\,C^0_{H''}, \tag{165a}$$
which are obviously identical with the equations determining the
transformation coefficients a.
We thus get $C^0_{H'} = a_{H'}$, or more exactly
$$C^0_{H'K''} = a_{H'K''}. \tag{165b}$$
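As an illustration of the algebraic system (165a), the following sketch solves it in closed form for a hypothetical system with only two unperturbed states. The matrix entries `k11`, `k22`, `k12` stand for $K^0_{H'H''}$ and are arbitrary illustrative numbers; the closed-form expressions are the standard ones for a 2 × 2 Hermitian matrix and are not taken from the text.

```python
import math


def eigen_2x2_hermitian(k11, k22, k12):
    """Characteristic values K and normalized amplitudes (a1, a2) of the
    Hermitian matrix [[k11, k12], [conj(k12), k22]], lowest K first."""
    if k12 == 0:
        # No coupling: the unperturbed states are already the stationary ones.
        pairs = [(k11, (1 + 0j, 0j)), (k22, (0j, 1 + 0j))]
        return sorted(pairs, key=lambda p: p[0])
    mean = 0.5 * (k11 + k22)
    root = math.sqrt((0.5 * (k11 - k22)) ** 2 + abs(k12) ** 2)
    pairs = []
    for energy in (mean - root, mean + root):
        # First row of (165a): (k11 - K) a1 + k12 a2 = 0.
        a1, a2 = complex(k12), complex(energy - k11)
        norm = math.sqrt(abs(a1) ** 2 + abs(a2) ** 2)
        pairs.append((energy, (a1 / norm, a2 / norm)))
    return pairs


def residual(k11, k22, k12, energy, vec):
    """Largest violation of the two equations (165a) for a candidate solution."""
    a1, a2 = vec
    r1 = k11 * a1 + k12 * a2 - energy * a1
    r2 = k12.conjugate() * a1 + k22 * a2 - energy * a2
    return max(abs(r1), abs(r2))


if __name__ == "__main__":
    for energy, vec in eigen_2x2_hermitian(1.0, 2.0, 0.3 + 0.1j):
        print(energy, vec, residual(1.0, 2.0, 0.3 + 0.1j, energy, vec))
```

The two amplitude columns obtained in this way are mutually orthogonal and normalized, so that together they form a unitary matrix.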
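The relations between the functions $\chi_{K'} = \chi^0_{K'}\,e^{-i2\pi K't/h}$ and
$\psi_{H'} = \psi^0_{H'}\,e^{-i2\pi H't/h}$ can be obtained from (164) if the coefficients $a_{H'K''}$
are replaced by
$$b_{H'K''} = a_{H'K''}\,e^{i2\pi(H'-K'')t/h}. \tag{166}$$
These coefficients also constitute a unitary matrix b. Combining the
matrix equations $\chi = \psi b$ and $\phi = \psi c$,
we can easily obtain a direct relation between the functions $\phi$ and $\chi$.
We have, namely, $\psi = \chi b^{-1} = \chi b^\dagger$,
and consequently $\phi = \chi d$,
with the transformation matrix
$$d = b^\dagger c.$$
Written in matrix elements, these equations run
$$\phi_{H'} = \sum_{K''} d_{K''H'}\,\chi_{K''}, \tag{166a}$$
with $d_{K''H'} = \sum_{H''} b^\dagger_{K''H''}\,c_{H''H'} = \sum_{H''} b^*_{H''K''}\,c_{H''H'}$,
or
$$d_{K''H'} = \sum_{H''} a^*_{H''K''}\,c_{H''H'}\,e^{-i2\pi(H''-K'')t/h}. \tag{166b}$$
Putting $d_{K''H'} = \eta_{K''H'}\,e^{i2\pi K''t/h}$,
we can rewrite (166a) in the more convenient form
$$\phi_{H'} = \sum_{K''}\eta_{K''H'}\,\chi^0_{K''}, \tag{166c}$$
with
$$\eta_{K''H'} = \sum_{H''} a^*_{H''K''}\,C_{H''H'}, \tag{166d}$$
showing that the dependence of $\phi_{H'}$ on the time is fully determined by
the transformation coefficients $C_{H''H'}$.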
Equations of exactly the same type as (165a) are obtained in classical
mechanics for the amplitudes of the free oscillations of a system of
particles held together by ‘quasi-elastic’ forces, i.e. forces which are
proportional to their displacements both from the respective equilibrium
positions and relative to each other. Such a system can be realized in
the simplest form by a set of coupled pendulums which can oscillate
in a definite plane under the influence of gravity and of forces due to
their being coupled together (by means of lateral strings or otherwise).†
Let $\xi_1, \xi_2, \ldots$ be the displacements of the given particles, or
pendulums, from their position of rest. Their dependence upon the time is
determined by a system of equations of the form
$$-\frac{d^2\xi_n}{dt^2} = \sum_m \Phi_{nm}\,\xi_m. \tag{167}$$
The coefficients $\Phi_{nn}$ thus specify the binding of the separate particles
to their positions of rest, and so determine the free vibrations which
they would carry out in the absence of any coupling with the other
particles. The coefficients $\Phi_{nm}$ ($m \neq n$) describe, on the other
hand, the perturbing coupling forces.
If we put
$$\Phi_{nn} = \Phi^0_{nn} + \Phi'_{nn}, \qquad \Phi_{nm} = \Phi'_{nm} \quad (n \neq m),$$
we can then regard the above equations as the equations of the perturbed
motion of the given quasi-elastic system. By the unperturbed motion
we are to understand the vibrations determined by the equations
$$-\frac{d^2\xi_n}{dt^2} = \Phi^0_{nn}\,\xi_n.$$
In this case each particle (pendulum or current) vibrates quite
independently of the others and with a frequency $\omega^0_n = \frac{1}{2\pi}\sqrt{\Phi^0_{nn}}$.
In the presence of perturbing coupling forces such independent
harmonic vibrations of the separate particles (or pendulums) are not
possible. They become replaced by harmonic vibrations of a different
kind, so-called ‘normal vibrations’ of the system, in which, with regard
to any kind of vibration characterized by the common frequency $\omega_k$,
all particles participate with definite relative amplitudes and definite
phase differences. The real amplitude and the initial phase (at time
t = 0) of each particle can be defined respectively as the modulus and
the argument of a complex amplitude $\gamma_n = |\gamma_n|\,e^{i\delta_n}$. These complex
amplitudes and the corresponding frequencies of vibration can be
determined from the equations of motion if we make the substitution
$$\xi_n = \gamma_n\,e^{-2\pi i\omega t} \tag{167a}$$
for the variables $\xi_n$. Equations (167) then reduce to the form
$$4\pi^2\omega^2\,\gamma_n = \sum_m \Phi_{nm}\,\gamma_m \tag{167b}$$
and thus with $4\pi^2\omega^2 = K''$ and $\Phi_{nm} = K^0_{H'H''}$ become identical with the
‘wave mechanics’ equations (165a).
† Instead of a mechanical model we could use, for the illustration of the equations
(165a), an electric model, formed by a system of electrically coupled electric circuits.
The general solution of the classical vibration problem (167), just
as of the corresponding ‘wave mechanics’ problem $(K+p_t)\chi = 0$, is
obtained by superposition of all harmonic particular solutions (with
arbitrary constant coefficients).
The similarity of the two problems enables us to relate the
perturbation theory of quantum mechanics, in a very clear manner, to the
classical theory of weakly coupled particles or pendulums. The
‘pendulum model’ (which can serve just as well for the illustration both of the
wave-mechanical and the electromagnetic vibrations) proves to be
especially convenient. Such a model consists of an infinite series of
pendulums which are suspended along a horizontal line in the order
of increasing frequencies of the unperturbed vibrations, i.e. in the
order of decreasing lengths, and which can be bound to one another
in pairs (see Fig. 2). Thus each pendulum corresponds to a definite
quantized state of the unperturbed system (atom, molecule), i.e. to a
definite characteristic function $\psi^0_{H'}$. In the case of ‘degeneracy’, i.e.
when several different pendulums have the same unperturbed vibration
frequency $\nu^0_{H'} = H'/h$, we can ascribe the same length to the
corresponding pendulums (in general, however, a different mass) and place
them beside one another transversely to the original direction of
suspension.
If, under the given conditions of the motion, there exists, besides
a discrete set of states, also a continuous set of stationary states, then
the discrete pendulum series of our model must be supplemented by a
continuous series, which can be conceived as a compact heavy fabric. For
this fabric not to tear, the amplitudes and phases of the vibration of
its vertical elements must be continuous functions of the (unperturbed)
vibration frequency $\nu^0 = H'/h$.†
From the point of view of the wave conception, the correspondence
between the vibrations of our pendulum model and the vibration
process in the corresponding mechanical system is very
straightforward and suggestive. Thus the different types of standing waves
represented by the functions $\psi^0_{H'}$ play the role of the single
pendulums; while the coefficients $C_{H'}$ (or $c_{H'}$) are the (complex)
amplitudes of vibration.
† We could replace the pendulum model by a string model (limiting ourselves to the
fundamental vibrations of each string). The continuous spectrum in this model would
be represented by a membrane. Such a membrane must, however, possess quite unusual
properties which are incompatible with the ordinary equations of the theory of elasticity
(for these equations correspond to a coupling between the neighbouring elements of the
elastic continuum only).
This correspondence acquires a purely symbolic character, however,
when we go over from the wave picture to the corpuscular picture. The
amplitude coefficients then acquire a quite different physical meaning;
for their norms $C_{H'}C^*_{H'} = |C_{H'}|^2$ then determine the relative number
of the copies of the given particle which are in the corresponding state.
To the continuous alteration of these coefficients with the time under
the action of the perturbing forces there corresponds a series of forced
transitions of these copies from one state to another. The derivative
$d|C_{H'}|^2/dt$ then gives the probability, referred to unit time, that any
copy of the particle will go over into the state $H'$ if $d|C_{H'}|^2/dt > 0$ or
out of this state if $d|C_{H'}|^2/dt < 0$.
One important difference between the pendulum model and the
wave-mechanical vibrations it represents, consists in the normalization
of the amplitudes of vibration to a definite value (1). A system of
pendulums, as considered in classical mechanics, can be at rest; or if
the system is vibrating, one has to distinguish not only the relative but
also the absolute values of the amplitudes. So far as this model is used
for the illustration of wave-mechanical vibrations a state of rest is
excluded—for the particle must always be found in some one of the
states represented by the pendulums. Moreover, only the relative values
of the amplitudes have a physical significance as defining the probability
amplitudes of the corresponding states—which can be taken into
account by normalizing the sum of their norms once and for all to 1.
In the case of certain relations between the amplitudes $\gamma_n$ of the
various pendulums, these amplitudes can preserve constant values, as we
have seen above. Such ‘normal vibrations’ of the system of pendulums
correspond to stationary distributions of the copies of the particles
among the different unperturbed states, and represent the stationary
states in the presence of the perturbing forces (i.e. states defined by
the energy K). If we introduce for the illustration of the perturbed
motion, i.e. of the vibrations defined by the operator K, a pendulum
model of the same kind as for the unperturbed motion (i.e. the
H-vibrations), then any such stationary distribution, i.e. any normal
vibration of the original model, will be represented by the vibrations
of a single pendulum of the new model. These new pendulums,
representing the transformed characteristic functions $\chi_{K'}$, must clearly be
considered as uncoupled. This means that transitions between the new
stationary states (which are the real stationary states) are impossible.
A transition between two different unperturbed states $H'$ and $H''$ is
possible in the first place if the corresponding matrix element of the
perturbation energy $S^0_{H'H''}$ is different from zero. The coupling
coefficients $\Phi_{nm}$, which represent these elements in our pendulum model, can
be regarded as a measure of the probability amplitude for transitions
between the corresponding states. It can easily be seen, however, that
transitions are also possible between unperturbed states $H'$ and $H''$
which are only indirectly coupled with each other, the matrix element
$S^0_{H'H''}$ vanishing, but certain other elements of the type $S^0_{H'H'''}$ and
$S^0_{H'''H''}$ being different from zero. Such ‘indirect transitions’ play, as
we shall see later on, an important role in many physical phenomena.
In the case of the stationary KK-states represented by a stationary
distribution of the copies over the various H-states (or by normal
vibrations of the pendulum-system) the transitions between different
H-states can be imagined to be mutually compensated.
The variable KH-states which are described by the functions $\phi_{H'}$ can
be represented in our pendulum model by vibrations which at the initial
time t = 0 involve one particular pendulum ($H'$) only. As time goes
on, the vibrations of this pendulum must be gradually transferred to
other pendulums, this transference representing the gradual transition
of the copies of the particle from the state $H'$ in which they were
initially supposed to be concentrated (whose probability, in other words,
was initially equal to 1) to other states.
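The gradual transfer of vibration from one pendulum to another can be made concrete with a minimal sketch for two identical, weakly coupled pendulums obeying equations of the form (167). The coupling matrix, the real (rather than complex) amplitudes, and the numerical values are illustrative assumptions; the closed-form solution used is the standard superposition of the two normal vibrations.

```python
import math


def two_pendulums(t, phi0, kappa):
    """Displacements (xi_1, xi_2) at time t for -xi_n'' = sum_m Phi_nm xi_m
    with Phi = [[phi0, kappa], [kappa, phi0]], starting from xi = (1, 0) at rest."""
    w_plus = math.sqrt(phi0 + kappa)   # angular frequency of the normal vibration (1, 1)
    w_minus = math.sqrt(phi0 - kappa)  # angular frequency of the normal vibration (1, -1)
    xi1 = 0.5 * (math.cos(w_plus * t) + math.cos(w_minus * t))
    xi2 = 0.5 * (math.cos(w_plus * t) - math.cos(w_minus * t))
    return xi1, xi2


def transfer_time(phi0, kappa):
    """Time after which the vibration has passed completely to the second pendulum."""
    return math.pi / (math.sqrt(phi0 + kappa) - math.sqrt(phi0 - kappa))


if __name__ == "__main__":
    phi0 = (2 * math.pi) ** 2   # unperturbed frequency of one cycle per unit time
    kappa = 0.05 * phi0         # weak coupling (assumed value)
    t_full = transfer_time(phi0, kappa)
    print(t_full, two_pendulums(t_full, phi0, kappa))
```

At the time returned by `transfer_time` the first pendulum is momentarily at rest at its equilibrium position, the whole vibration having passed to the second one; after twice that time it has returned.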
If the energy K, or what amounts to the same thing the perturbation
energy S, depends upon the time, only states of this type can be
defined and represented by means of the pendulum model, while normal
vibrations corresponding to definite values of K are impossible.
It is natural to consider vibrations due to an external influence,
specified as a given function of the time, as ‘forced vibrations’. It must
be borne in mind, however, that the forced vibrations we are referring
to are not of the usual type described by the non-homogeneous equations
$$\frac{d^2\xi_n}{dt^2} + \sum_m \Phi_{nm}\,\xi_m = F_n(t),$$
where $F_n(t)$ denotes the external force acting on the nth pendulum.
Such external forces do not have any place in our model. They are
replaced by a so-called ‘parametric perturbation’, i.e. by a change of
the parameters $\Phi_{mn}$ which determine the free vibrations of the
pendulums. In fact, the case of a perturbation energy depending upon the
time can be represented, in the pendulum model, by a type of forced
vibrations determined by the equations
$$\frac{d^2\xi_n}{dt^2} + \sum_m \Phi_{nm}(t)\,\xi_m = 0.$$
The model will, however, adequately reproduce the actual conditions
only when the dependence of S upon the time is harmonic and if,
besides, we restrict ourselves to the case of small perturbing forces;
otherwise the agreement between the wave-mechanical equations (161a)
or (161b) and the classical equations will be destroyed on account of
the fact that in the former we have first derivatives with respect to
the time (multiplied by $h/2\pi i$), while in the latter we have second
derivatives ($d^2\xi_n/dt^2$). This difference is immaterial only in the case of
harmonic vibrations represented by exponential functions of the type
$e^{i2\pi\nu t}$, the differentiation with regard to the time being in both cases
equivalent to multiplication by a real constant.
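The preceding theory can easily be extended to the case of a
continuous or mixed energy spectrum of the unperturbed motion.
Writing, for example,
$$\phi = \sum_{H'} c_{H'}\,\psi_{H'} + \int c_{H''}\,\psi_{H''}\,dH''$$
instead of (160a), we get
$$(H+S+p_t)\phi = \sum_{H'}\left(\psi_{H'}\,p_t c_{H'} + c_{H'}\,S\psi_{H'}\right) + \int\left(\psi_{H''}\,p_t c_{H''} + c_{H''}\,S\psi_{H''}\right)dH'' = 0.$$
We have further
$$S\psi_{H'} = \sum_{H'''} S_{H'''H'}\,\psi_{H'''} + \int S_{H''''H'}\,\psi_{H''''}\,dH'''',$$
where $H'$ and $H'''$ refer to the discrete and $H''$ and $H''''$ to the continuous
region of the H-spectrum, and consequently
$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H'''} S_{H'H'''}\,c_{H'''} + \int S_{H'H''''}\,c_{H''''}\,dH'''',$$
$$-\frac{h}{2\pi i}\frac{dc_{H''}}{dt} = \sum_{H'''} S_{H''H'''}\,c_{H'''} + \int S_{H''H''''}\,c_{H''''}\,dH''''. \tag{168a}$$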
The only difference between the discrete and the continuous case is
that in specifying the states we must, in general, replace the discrete
values of $H'$ by elementary regions or ranges of $H''$, the number of the
copies belonging to the range $\Delta H''$ being equal to $\int_{\Delta H''}|c_{H''}|^2\,dH''$, provided
the functions $\psi_{H''}$ are duly normalized according to the equation
$$\int\psi^*_{H''}\psi_{H''''}\,dV = \delta(H''-H'''') \quad\text{or}\quad \int dH''''\int\psi^*_{H''}\psi_{H''''}\,dV = 1.$$
It should be remembered that this condition is equivalent to the usual
normalizing condition $\int|\bar\psi_{H''}|^2\,dV = 1$ for the quasi-discrete functions
$$\bar\psi_{H''} = \frac{1}{\sqrt{\Delta H''}}\int_{\Delta H''}\psi_{H''}\,dH''.$$
With the help of the latter the case of a continuous spectrum can be
dealt with in exactly the same way as the discrete case, provided we
start with finite ranges $\Delta H''$ and pass to the limit $\Delta H'' \to 0$ after having
calculated the coefficients c.
The actual determination of the perturbed motion by the method of
transitions explained above, both in the case of a variable energy K
and in the special case of a constant K, can be carried out by means
of a process of successive approximations, based upon the following
consideration. If there were no perturbation, then the coefficients c
(but not C!) would remain constant, preserving those values $c^0$ which
they were supposed to have at the initial moment t = 0. The action
of the perturbation will be to modify these values, so that we can put
$c(t) = c^0 + \Delta c(t)$ and consider $\Delta c(t)$ as a small quantity, for sufficiently
weak perturbing forces and, in general, for sufficiently small values of t.
The latter condition constitutes an important restriction of the validity
of the approximation method in question, a restriction that does not
have any equivalent in the alternative method dealing with stationary
states and not involving the time (if K does not depend upon the time).
It is, however, perfectly natural from the physical point of view,
since, in the determination of transition probabilities, we have to limit
ourselves to short intervals of time. Regarding the matrix components
$S_{H'H''}$ as small quantities of the first order, we can put
$$c_{H'}(t) = c^0_{H'} + \Delta_1 c_{H'}(t) + \Delta_2 c_{H'}(t) + \ldots$$
and obtain the corrections $\Delta_1 c$, $\Delta_2 c$, etc., by the usual scheme of
successive approximations.
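Confining ourselves again, for the sake of simplicity, to the case of
a discrete spectrum, we obtain a chain of equations starting with
$$-\frac{h}{2\pi i}\frac{d}{dt}\Delta_1 c_{H'} = \sum_{H''} S_{H'H''}\,c^0_{H''} \tag{169}$$
(first approximation),
$$-\frac{h}{2\pi i}\frac{d}{dt}\Delta_2 c_{H'} = \sum_{H''} S_{H'H''}\,\Delta_1 c_{H''} \tag{169a}$$
(second approximation), and so on. Since the matrix components $S_{H'H''}$
are known functions of the time, equations (169) can be integrated
directly with the result
$$\Delta_1 c_{H'}(t) = -\frac{2\pi i}{h}\sum_{H''} c^0_{H''}\int_0^t S_{H'H''}(t')\,dt', \tag{170}$$
which, on substitution in (169a), gives
$$\Delta_2 c_{H'}(t) = -\frac{4\pi^2}{h^2}\sum_{H''}\sum_{H'''} c^0_{H'''}\int_0^t dt'\,S_{H'H''}(t')\int_0^{t'} dt''\,S_{H''H'''}(t''). \tag{170a}$$
In a similar way one can obtain an expression for $\Delta_n c_{H'}(t)$ which is of
the nth order with respect to the small quantities $S_{H'H''}$, etc.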
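The chain (169), (169a) lends itself to direct numerical quadrature. The sketch below evaluates (170) and (170a) on a time grid with the trapezoidal rule; the two degenerate levels, the constant coupling 0.05, and the choice h = 1 are illustrative assumptions, chosen because the exact coefficients are then $\cos(2\pi St/h)$ and $-i\sin(2\pi St/h)$ and the approximations can be checked against them.

```python
import math

H_PLANCK = 1.0  # Planck's constant h in arbitrary units (illustrative assumption)


def cumulative_integral(values, dt):
    """Running trapezoidal integral of sampled values, starting from zero."""
    out = [0j]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * dt * (values[k - 1] + values[k]))
    return out


def perturbation_orders(s_component, c0, t_end, steps):
    """Return (delta1, delta2) at t_end following (170) and (170a).

    s_component(a, b, t) must return the matrix component S_ab(t),
    including its exponential time factor."""
    n = len(c0)
    dt = t_end / steps
    times = [k * dt for k in range(steps + 1)]
    factor = -2j * math.pi / H_PLANCK
    delta1 = []  # first-order corrections on the whole grid, needed for the second order
    for a in range(n):
        integrand = [sum(s_component(a, b, t) * c0[b] for b in range(n)) for t in times]
        delta1.append([factor * v for v in cumulative_integral(integrand, dt)])
    delta2 = []
    for a in range(n):
        integrand = [sum(s_component(a, b, times[k]) * delta1[b][k] for b in range(n))
                     for k in range(steps + 1)]
        delta2.append(factor * cumulative_integral(integrand, dt)[-1])
    return [row[-1] for row in delta1], delta2


def constant_coupling(a, b, t):
    """Hypothetical perturbation: constant coupling between two degenerate states."""
    return 0.05 if a != b else 0.0


if __name__ == "__main__":
    d1, d2 = perturbation_orders(constant_coupling, [1.0, 0.0], t_end=1.0, steps=2000)
    print(d1, d2)
```

For the parameters shown, the first-order correction reproduces the leading term of $-i\sin(2\pi St/h)$ and the second-order correction the quadratic term of $\cos(2\pi St/h)$.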
The function S can usually be represented in the form of a product
of a function of the coordinates and a function of the time:
$$S = T(x,y,z)\,f(t), \tag{171}$$
or more generally as a sum of terms of this type. We get accordingly
$$S_{H'H''} = T^0_{H'H''}\,f(t)\,e^{i2\pi(H'-H'')t/h} \tag{171a}$$
and
$$\int_0^t S_{H'H''}(t')\,dt' = T^0_{H'H''}\,f_{\nu_{H'H''}}(t), \tag{171b}$$
where $\nu_{H'H''} = (H'-H'')/h$ and
$$f_\nu(t) = \int_0^t f(t')\,e^{i2\pi\nu t'}\,dt'. \tag{171c}$$
This function can be defined as the amplitude coefficient in the Fourier
integral representation of the function $f(t')$ within the interval $0 < t' < t$,
or more exactly of a function which is equal to $f(t')$ within this interval
and vanishes outside it. The latter function
$$f_0(t') = \int_{-\infty}^{+\infty} f_\nu(t)\,e^{-i2\pi\nu t'}\,d\nu$$
can replace the actual function f(t) so far as we are interested in the
results produced by the perturbation S during the limited time t.
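For a harmonic time factor the amplitude (171c) can be written in closed form, and the closed form can be checked against direct quadrature. The sketch below does this for the assumed choice $f(t') = \cos(2\pi\nu_0 t' + \beta)$; the helper names and the use of Simpson's rule are illustrative choices, not part of the text.

```python
import cmath
import math


def phase_integral(alpha, t):
    """Integral of exp(i 2 pi alpha t') dt' from 0 to t."""
    if alpha == 0:
        return complex(t)
    return (cmath.exp(2j * math.pi * alpha * t) - 1) / (2j * math.pi * alpha)


def fourier_amplitude_cos(nu, t, nu0, beta):
    """Closed form of (171c) for f(t') = cos(2 pi nu0 t' + beta)."""
    return 0.5 * (cmath.exp(1j * beta) * phase_integral(nu + nu0, t)
                  + cmath.exp(-1j * beta) * phase_integral(nu - nu0, t))


def fourier_amplitude_numeric(f, nu, t, intervals=2000):
    """Composite Simpson approximation of (171c); intervals must be even."""
    step = t / intervals
    total = 0j
    for k in range(intervals + 1):
        weight = 1 if k in (0, intervals) else (4 if k % 2 else 2)
        tk = k * step
        total += weight * f(tk) * cmath.exp(2j * math.pi * nu * tk)
    return total * step / 3


if __name__ == "__main__":
    def f(s):
        return math.cos(2 * math.pi * 1.0 * s + 0.3)

    for nu in (1.5, 1.0):
        print(nu, fourier_amplitude_cos(nu, 3.0, 1.0, 0.3),
              fourier_amplitude_numeric(f, nu, 3.0))
```

When $\nu$ coincides with $\nu_0$, one of the two terms grows in proportion to t, which anticipates the resonance case treated in the next section.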
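Turning to the quantities $N_{H'} = |c_{H'}|^2$, we get
$$N_{H'} = |c^0_{H'}|^2 + \left(c^{0*}_{H'}\,\Delta_1 c_{H'} + c^0_{H'}\,\Delta_1 c^*_{H'}\right) + |\Delta_1 c_{H'}|^2 + \left(c^{0*}_{H'}\,\Delta_2 c_{H'} + c^0_{H'}\,\Delta_2 c^*_{H'}\right). \tag{172}$$
Terms of higher order will not be needed in future and have accordingly
been dropped. In the particular case when $c^0_{H'} = 0$, this expression
reduces to
$$N_{H'} = |\Delta_1 c_{H'}|^2. \tag{172a}$$
If initially the particle were supposed to be in a definite state, $H'$ say
(so that $c^0_{H''} = c^0_{H''H'} = \delta_{H''H'}$), equations (170) and (170a) reduce to
$$\Delta_1 c_{H''H'} = -\frac{2\pi i}{h}\int_0^t S_{H''H'}(t')\,dt' \tag{173}$$
(with $H''$ and $H'$ interchanged) and
$$\Delta_2 c_{H''H'} = -\frac{4\pi^2}{h^2}\sum_{H'''}\int_0^t dt'\,S_{H''H'''}(t')\int_0^{t'} dt''\,S_{H'''H'}(t''). \tag{173a}$$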
These equations give the first and second approximation for the
elements of the ‘transition matrix’ $c_{H''H'}$. We need not consider here
their geometrical representation (as determining the angles between the
fixed H-axes and the rotating KH-axes in the state-space), since it is
identical with that of the transformation coefficients a, discussed in § 19.
It is also hardly necessary to point out the way in which the
preceding equations can be generalized to allow for the presence of a
continuous or mixed spectrum; all we need to do in this case is to replace
the sums wholly or partially by integrals extended over the continuously
variable parameters.
The equations (173) and (173a), as well as the higher approximations
for $c_{H''H'}$, can be obtained in a more straightforward, though somewhat
symbolic, way by considering the coefficients $c_{H''H'}(t)$ as a matrix and
writing the equations (161), which serve to define them, in the matrix
form
$$-\frac{h}{2\pi i}\frac{dc}{dt} = Sc. \tag{174}$$
We thus get, treating S as an ordinary function of the time,
$$c(t) = e^{-\frac{2\pi i}{h}\int_0^t S\,dt}\,c(0), \tag{174a}$$
or putting, for the sake of brevity, $\frac{2\pi}{h}\int_0^t S\,dt = R$ and expanding the
exponential in a power series,
$$c(t) = \left(1 - iR - \tfrac{1}{2}R^2 + \ldots\right)c(0). \tag{174b}$$
This formula contains the two equations (173) and (173a) as
corresponding to the terms of the first and second order in the expansion.
It is self-evident that all the multiplications must be carried out in
the order stated, according to the general rule of matrix multiplication,
and that, moreover, the matrix c(0) must be defined as the unit matrix $\delta$.
It may seem at first sight that there is a discrepancy between the
expression (173a) and the second-order term of (174b)
$$(\Delta_2 c)_{H''H'} = -\tfrac{1}{2}(R^2)_{H''H'} = -\tfrac{1}{2}\sum_{H'''} R_{H''H'''}\,R_{H'''H'},$$
i.e.
$$\Delta_2 c_{H''H'} = -\frac{2\pi^2}{h^2}\sum_{H'''}\int_0^t S_{H''H'''}\,dt'\int_0^t S_{H'''H'}\,dt''. \tag{174c}$$
As a matter of fact, they are easily seen to be identical (by a
generalization of the well-known relation for multiple integrals with the same
variable).
Since the exponent in the first factor of (174a) is a pure imaginary, we get at once
the relation $c^\dagger(t)\,c(t) = c^\dagger(0)\,c(0) = \delta$,
which means that $\sum_{H''}|c_{H''H'}(t)|^2 = 1$, in agreement with the elementary
theory of Part I (§ 18) or with the formula (163b) of this section.
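The power series (174b) and the unitarity relation that follows from it can be checked numerically. The sketch below sums the series for a small Hermitian matrix R; the particular matrix is an arbitrary illustrative choice, and the sketch takes R as given, which corresponds to a perturbation whose matrix components do not vary in time, so that the exponential form is unambiguous.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]


def exp_minus_i(r, terms=30):
    """Partial sum of the series 1 - iR - R^2/2 + ... of (174b)."""
    n = len(r)
    minus_i_r = [[-1j * x for x in row] for row in r]
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        # term now holds (-iR)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, minus_i_r)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


if __name__ == "__main__":
    r = [[0.2, 0.1 - 0.05j], [0.1 + 0.05j, -0.3]]  # hypothetical Hermitian R
    c = exp_minus_i(r)
    print(mat_mul(dagger(c), c))  # close to the unit matrix
```

Each column of the resulting matrix gives the coefficients $c_{H''H'}$ for one definite initial state, and the sum of the squared moduli in every column is equal to 1.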
It should be mentioned, in conclusion, that the case of a variable
perturbation can be dealt with by a method similar to that of Born
for the case of a constant perturbation in the theory of stationary states
(§ 20). We can, in fact, determine the functions $\phi_{H'}$, which are the
particular solutions of the equation $(H+S+p_t)\phi = 0$ reducing to
$\psi_{H'} = \psi^0_{H'}$ at the initial instant t = 0, by putting
$$\phi_{H'} = \psi_{H'} + \Delta_1\psi_{H'} + \Delta_2\psi_{H'} + \ldots,$$
and integrating successively the chain of equations
$$(H+p_t)\,\Delta_1\psi_{H'} = -S\psi_{H'},$$
$$(H+p_t)\,\Delta_2\psi_{H'} = -S\,\Delta_1\psi_{H'},$$
etc., subject to the condition that $\Delta_1\psi_{H'} = \Delta_2\psi_{H'} = \ldots = 0$ for t = 0.
This method can be advantageously applied in the case of continuous
spectra. It is, of course, completely equivalent to the method explained
above, differing from it only by avoiding the use of the coefficients c.
22. First Approximation; Theory of Simple Transitions
The study of transitions produced by a perturbing force can
conveniently be divided into two parts, corresponding to the first and to
the second approximation of the general theory. The first-order terms
determine the probability of simple (or direct) transitions between two
states, which have been dealt with already to some extent in Part I,
§ 18; while the second-order terms mainly determine the probability of
combined transitions, involving intermediate states.
So far as the action of variable forces is concerned, we shall restrict
ourselves to the case of a harmonically oscillating force represented by
the expression (171) with $f(t) = \cos(2\pi\nu t + \beta)$. In the general case of
a force represented by a sum (or integral) of terms of this form with
different frequencies $\nu$, $\Delta_1 c_{H''H'}$ reduces to the sum (or integral) of parts
corresponding to the separate harmonic terms of S.
Putting $f(t) = \tfrac{1}{2}\left[e^{i(2\pi\nu t+\beta)} + e^{-i(2\pi\nu t+\beta)}\right]$, we get, according to (170),
(171b), and (171c),
$$\Delta_1 c_{H''H'} = -\frac{T^0_{H''H'}}{2}\left[e^{i\beta}\,\frac{e^{i2\pi(H''-H'+h\nu)t/h}-1}{H''-H'+h\nu} + e^{-i\beta}\,\frac{e^{i2\pi(H''-H'-h\nu)t/h}-1}{H''-H'-h\nu}\right], \tag{175}$$
which can also be written in the form
$$\Delta_1 c_{H''H'} = -\frac{T^0_{H''H'}}{2h}\left[e^{i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{\nu_{H''H'}+\nu} + e^{-i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{\nu_{H''H'}-\nu}\right], \tag{175a}$$
involving the transition frequencies $\nu_{H''H'} = (H''-H')/h$ instead of the
energy values.
As pointed out in Part I, § 18, these expressions, regarded as
functions of the time, have two entirely different characters depending upon
whether the absolute value of the transition frequency $\nu_{H''H'}$ coincides
with $\nu$ (‘resonance’) or not.
In the latter case $\Delta_1 c_{H''H'}$ oscillates about the value zero, while $N_{H''}$,
as determined by (172a) (for a state $H''$ different from $H'$), oscillates
about a small (positive) average value
$$\overline{N}_{H''} = \frac{|T^0_{H''H'}|^2}{2}\left[\frac{1}{(H''-H'+h\nu)^2} + \frac{1}{(H''-H'-h\nu)^2}\right], \tag{176}$$
representing the average number of copies of the particle in the initially
vacant state $H''$.
In the case of resonance ($\nu = |\nu_{H''H'}|$) one of the two terms in
square brackets becomes infinite, which means that a stationary
distribution is impossible, i.e. that the number of copies in the state $H''$
is steadily increasing. With the help of the formula
$$\lim_{\alpha\to 0}\frac{e^{i2\pi\alpha t}-1}{\alpha} = 2\pi i t,$$
we get in this case, according to (175a),
$$\Delta_1 c_{H''H'} = -\frac{T^0_{H''H'}}{2h}\left[e^{\mp i\beta}\,2\pi i t + \text{periodic term}\right]$$
(the upper sign referring to $H'' > H'$ and the lower sign to
$H'' < H'$), that is, dropping the periodic term which remains small
while t increases:
$$N_{H''} = \frac{\pi^2}{h^2}\,|T^0_{H''H'}|^2\,t^2. \tag{176a}$$
A perturbing force is usually said to induce transitions from the state
$H'$ to $H''$ only when these transitions are manifested as a systematic
increase of $N_{H''}$ with the time, i.e. in the case of resonance. In the old
quantum theory the resonance or frequency condition was regarded as
the expression of the law of the conservation of energy on the
assumption that light of frequency $\nu$ can be absorbed or emitted in energy
quanta of the magnitude $h\nu$. We see that this relation is by no means
confined to light, being valid in the case of harmonic oscillations of any
kind. To the type of resonance implied there corresponds in our
pendulum model not ordinary resonance between the external force and
the free vibrations of a definite pendulum, but what in classical
mechanics is denoted by ‘parametric resonance’, which means the
coincidence of the frequency of the variation of the coupling $S^0_{H'H''}$
between two pendulums $H'$ and $H''$ with the difference of the
frequencies of their free vibrations (corresponding to the absence of the
coupling). It can, in fact, easily be shown that under this condition
even a very weak harmonic variation of the coupling coefficient
must produce a steady transfer of energy from the $H'$-pendulum
(supposed to be initially the only one set in motion) to the $H''$-pendulum
while all the other pendulums $H'''$ for which the condition of parametric
resonance is not fulfilled will perform oscillations of small amplitude
without any tendency towards a steady increase.
The quadratic increase of $N_{H''}$ with the time according to (176 a) corresponds to a transition probability (referred to unit time)
\[ \Gamma_{H''H'} = \frac{dN_{H''}}{dt} = \frac{2\pi^2}{h^2}\, |T^0_{H''H'}|^2\, t, \]
which is itself a linear function of the time.
This result is due to the exact coincidence between $\nu_{H''H'}$ and $\nu$ (sharp resonance), which is practically never realized in nature. It has been shown in Part I, § 18, that in the case of ‘nearly-monochromatic’ light, formed by a spectral line of finite width, $N_{H''}$ becomes a linear function of the time and the transition probability becomes a constant.
The same is true, of course, of any nearly-harmonic perturbation. We shall return to this question in the second part of this section, where it will be dealt with by a different method.
The preceding formula cannot be directly applied to the special case $\nu = 0$ corresponding to a perturbing force which does not depend upon the time. We must, namely, take into account the fact that in the case $\nu > 0$ only one term of (175) is effective in producing transitions from the state $H'$ to the state with higher energy $H'' = H' + h\nu$, while the other would be effective in producing transitions from $H'$ to the lower level $H'' = H' - h\nu$ (if such a level exists). Now when $\nu = 0$ both terms of (175) become equally effective for the transition $H' \to H''$ (more simply, the splitting of $S$ into two terms becomes meaningless). We thus get
\[ \Delta_1 c_{H''H'} = -S^0_{H''H'}\, \frac{e^{i2\pi\nu_{H''H'}t} - 1}{h\nu_{H''H'}}, \tag{177} \]
whence
\[ \overline{N_{H''}} = \frac{2\,|S^0_{H''H'}|^2}{(H''-H')^2} \tag{177a} \]
if $H'' \neq H'$, and
\[ N_{H''} = \frac{4\pi^2}{h^2}\, |S^0_{H''H'}|^2\, t^2 \tag{177b} \]
if $H'' = H'$, which is the resonance condition in the present case. This type of ‘inner’ resonance is faithfully reproduced in our pendulum model by the resonance between the pendulums representing the unperturbed states $H'$ and $H''$. It will be noticed that the expression (177 b) differs from the corresponding expression (176 a) for the case $\nu > 0$ by a factor 4 in the numerator.
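The two limiting forms of the first-order result can be checked numerically. The following is only a sketch in Python with NumPy: the unit $h = 1$, the value of the matrix element `S`, the energy difference `dE` and the function name `first_order_N` are all illustrative assumptions, not quantities taken from the text. It evaluates $|\Delta_1 c_{H''H'}|^2$ from (177), averages it over whole periods away from resonance, and takes the limit $\nu_{H''H'} \to 0$ at resonance.

```python
import numpy as np

h = 1.0          # Planck's constant in arbitrary units (assumption)
S = 1e-3         # matrix element S0_{H''H'} (hypothetical value)

def first_order_N(dE, t):
    """|Delta_1 c|^2 from (177) for the energy difference dE = H'' - H'."""
    nu = dE / h
    if nu == 0.0:
        # resonance: the limit of (exp(i 2 pi nu t) - 1)/nu is i 2 pi t
        amp = -S * 2j * np.pi * t / h
    else:
        amp = -S * (np.exp(2j * np.pi * nu * t) - 1.0) / (h * nu)
    return np.abs(amp) ** 2

# Off resonance: average over 20 whole periods (end point excluded)
dE = 0.5
t = np.linspace(0.0, 20.0 * h / dE, 20001)
N_avg = np.mean(first_order_N(dE, t)[:-1])
N_177a = 2 * S**2 / dE**2                      # average value (177 a)

# At resonance: quadratic growth
N_res = first_order_N(0.0, 3.0)
N_177b = 4 * np.pi**2 * S**2 * 3.0**2 / h**2   # expression (177 b)
```

Under these assumptions `N_avg` agrees with `N_177a` and `N_res` with `N_177b`, which is all the sketch is meant to show.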
The quantities $S^0_{H'H'}$, $S^0_{H''H''}$, etc., have the effect of slightly disturbing the resonance between the corresponding pendulums, while the quantities $S^0_{H''H'}$ describe the perturbing coupling forces. As long as the latter are weak and there is no resonance, there corresponds to the unperturbed vibration of each pendulum ($H'$) a perturbed normal vibration of the whole system ($K'$) in which this particular pendulum plays the principal role, while all the others only faintly accompany it. This state of affairs is described by the formula $a_{H''K'} = \delta_{H''H'} + \Delta_1 a_{H''K'}$ of § 19, where $a_{H''K'}$ are the transformation coefficients between the functions $\chi^0_{K'}$ and $\psi_{H'}$; the small quantities $\Delta_1 a_{H''K'}$ represent the participation of the pendulums $H'' \neq H'$ in the normal vibration $K'$, corresponding to the unperturbed oscillation of the pendulum $H'$ alone. We might expect the quantities $N_{H''}$, or their average values, to be equal to the square of the moduli of these small quantities. As a matter of fact, we have, according to (148 b),
\[ \Delta_1 a_{H''K'} = \frac{S^0_{H''H'}}{H'-H''} \]
and consequently
\[ |\Delta_1 a_{H''K'}|^2 = \frac{|S^0_{H''H'}|^2}{(H''-H')^2}, \]
which is equal to one-half of the value of $N_{H''}$ as determined by (177).
This discrepancy is explained by the fact that the quantities $|\Delta_1 a_{H''K'}|^2$ refer to the stationary states ($\chi_{K'}$) of the perturbed system, while the quantities (177 a) refer to the non-stationary states or, more exactly, to the initial stages in the development of these states, as follows from the method of approximation used in deriving equation (177). The limitation to the initial stages is practically irrelevant so long as the quantities $c_{H''H'}$ remain small, i.e. so long as there is no resonance ($H'' \neq H'$). It becomes, however, of primary importance in the case of resonance, the formula (177 b) being valid for small values of $t$ only.
The actual conditions met with in this case can be best understood with the help of the pendulum model. If initially only one pendulum, $H'$ say, were set in motion, then, however small the perturbing forces which couple it with other pendulums, those which are in resonance with it will gradually acquire large vibration amplitudes (while the rest will but faintly accompany them as before). Resonance thus excludes the ‘dominance’ of one particular pendulum in the perturbed vibrations: all the pendulums which are in resonance with each other become equally important in the vibrations started by any one of them.
In the simplest case of two coupled pendulums in resonance we obtain the following well-known results: If originally (when $t = 0$) only one of the two pendulums was vibrating, then its vibration energy must gradually go over to the second pendulum. If both pendulums are identical, this process goes on until the first pendulum comes to a standstill and the second takes over its role. Similar beats, i.e. relatively slow periodic increases and decreases of the vibrations of one pendulum at the cost of the other, must take place with any relations between their initial amplitudes and phases, except in two cases: ‘symmetrical’ vibrations with equal (real) amplitudes and phases, and ‘antisymmetrical’ with equal amplitudes and opposite phases. In these exceptional cases the vibrations maintain a stationary character, i.e. their amplitudes remain constant. The symmetrical and antisymmetrical vibrations have somewhat different frequencies, both of which are, in general, different from the common unperturbed vibration frequency of the pendulums.
The non-stationary vibrations can be represented by a superposition of the two kinds of stationary vibrations. The frequency of the resulting ‘beats’ must obviously be equal to the difference of the two fundamental frequencies.
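The behaviour of two states in exact resonance can be illustrated by a small numerical sketch. It is written in Python with NumPy under stated assumptions: $h = 1$, a common unperturbed energy `H0` and a real coupling `S` with hypothetical values, and the helper name `occupation`. The two stationary (symmetrical and antisymmetrical) vibrations are found by diagonalizing the $2 \times 2$ energy matrix, and the non-stationary motion is built from their superposition.

```python
import numpy as np

h = 1.0                 # Planck's constant, arbitrary units (assumption)
H0, S = 5.0, 0.02       # common unperturbed energy and coupling (hypothetical values)

K = np.array([[H0, S], [S, H0]])
K_vals, a = np.linalg.eigh(K)   # columns of a: antisymmetrical and symmetrical modes

def occupation(t):
    """Return (N_1, N_2) at time t when only state 1 is occupied at t = 0."""
    phases = np.exp(-2j * np.pi * K_vals * t / h)
    C = a @ (phases * (a.conj().T @ np.array([1.0, 0.0])))
    return np.abs(C) ** 2

beat_frequency = (K_vals[1] - K_vals[0]) / h   # difference of the two fundamental frequencies
t_transfer = 0.5 / beat_frequency              # half a beat period
N1_end, N2_end = occupation(t_transfer)
N2_early = occupation(0.1)[1]                  # early stage, to compare with (177 b)
```

With these values the beat frequency comes out as $2S/h$, the first state is emptied after half a beat period, and the early growth of the second state follows the quadratic law (177 b).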
These results can easily be generalized to any finite number, $r'$ say, of coupled pendulums in resonance. In the first approximation their coupling with other pendulums can be neglected. The resulting vibrations of the resonance group can be represented as a superposition of $r'$ independent normal vibrations with different frequencies. By suitably adjusting the amplitudes (and phases) of these normal vibrations, a resulting vibration can be obtained such that, at the instant $t = 0$, one pendulum only, $H'$ say, is in motion. The amplitudes of the others will then at the beginning increase linearly with the time and their energies increase proportionally to $t^2$, this dependence being restricted to such values of $t$ as are small compared with the ‘beat periods’, that is, the reciprocals of the frequency-differences between the different normal modes of vibration.
These results can easily be obtained from the general theory embodied in equations (161 a) and (161 b) of § 21. It should be remarked that, although equations (161 a) must be used for the approximate calculation of the numbers $N_{H''}$ (for the coefficients $c_{H''}$ can be supposed to be approximately constant while the coefficients $C_{H''}$ cannot), equations (161 b), with the coefficients $K^0_{H''H'}$ which are independent of the time, are more appropriate for the discussion of the case of resonance, because of their similarity to the equations which determine the vibrations of a system of coupled pendulums, the only modification consisting in replacing $\frac{d^2}{dt^2}$ by $\frac{h}{2\pi i}\frac{d}{dt}$.
If the coupling between the pendulums (i.e. $H$-states) not belonging to the resonance (degenerate) set in question and those which belong to this set is neglected, then the quantities $C_{H''}$ for the latter pendulums can be determined by the system of $r'$ equations
\[ -\frac{h}{2\pi i}\frac{d}{dt} C_{H''} = \sum_{H'''=H'} K^0_{H''H'''}\, C_{H'''}, \]
or in the notation corresponding to equation (154 b),
\[ -\frac{h}{2\pi i}\frac{d}{dt} C_n = \sum_{m=1}^{r'} K^0_{nm}\, C_m \qquad (n = 1, 2, \ldots, r'). \tag{178} \]
With the help of the relations
\[ C_n = c_n\, e^{-i2\pi H' t/h} \quad \text{and} \quad K^0_{mn} = \delta_{mn} H' + S^0_{mn} \]
these equations can be reduced to the form
\[ -\frac{h}{2\pi i}\frac{d}{dt} c_n = \sum_{m=1}^{r'} S^0_{nm}\, c_m. \tag{178a} \]
The latter equations can be derived directly from the general equations (161 a) in the same way as equations (178) have been derived from (161 b), in conjunction with the condition $H_m = H_n = H'$ (i.e. $S_{mn} = S^0_{mn}$), namely, by dropping terms connecting the states which belong to the same energy $H'$ with those which belong to different energy-levels. We have preferred, however, the indirect derivation in order to preserve throughout the analogy with the classical theory of the pendulum model. So far, however, as the results are concerned, the $r'$ states of the same energy $H'$ can be represented equally well by two systems of $r'$ pendulums whose oscillations are determined either by equations (178) or (178 a).
Taking equations (178 a), we can first of all obtain the normal vibrations (i.e. the $K$-stationary states) by putting $c_n = a_n e^{-i2\pi \Delta H' t/h}$ [or $C_n = a_n e^{-i2\pi K' t/h}$ in the case of equations (178)], whereby it reduces to the system of equations (154 b), which was obtained by another method in § 19. After this, the general solution of (178 a) can be written in the form
\[ c_n = \sum_{s=1}^{r'} \gamma_s\, a_{ns}\, e^{-i2\pi(\Delta H')_s t/h}, \tag{178b} \]
where the $(\Delta H')_s$ are the solutions of (154 c) and the $a_{ns}$ are the corresponding normalized solutions of (154 b), while the $\gamma_s$ denote arbitrary constants. As already mentioned, these constants can be adjusted in such a way as to make all the $c_n$ vanish at the initial instant ($t = 0$) with the exception of one of them, $c_m$ say. This particular set of $\gamma_s$ can conveniently be denoted by $\gamma_{sm}$.
We have, for their determination, the system of equations
\[ \sum_{s=1}^{r'} a_{ns}\, \gamma_{sm} = \delta_{nm}, \tag{179} \]
which shows that the matrix $\gamma$ is identical with $a^{-1}$ or $a^{\dagger}$. We thus get, writing $c_{nm}$ instead of $c_n$,
\[ c_{nm} = \sum_{s} a_{ns}\, a^*_{ms}\, e^{-i2\pi(\Delta H')_s t/h} \tag{179a} \]
or
\[ C_{nm} = \sum_{s} a_{ns}\, a^*_{ms}\, e^{-i2\pi K'_s t/h}. \tag{179b} \]
Multiplying these expressions by their conjugate complex, we get
\[ N_{nm} = \sum_{s}\sum_{s'} P^{nm}_{ss'} \cos\frac{2\pi}{h}\left[(\Delta H')_s - (\Delta H')_{s'}\right] t, \tag{179c} \]
where $P^{nm}_{ss'}$ is the real part of the product $a_{ns}\, a^*_{ms}\, a^*_{ns'}\, a_{ms'}$.
We thus see that $N_{nm}$ is represented as a function of the time as a sum of constant terms ($s' = s$) and terms oscillating with the ‘difference-’ or ‘beat-frequencies’ $(K'_s - K'_{s'})/h$.
So long as the product of the time $t$ with these frequencies (which are the reciprocals of the ‘beat periods’) is small compared with 1, we can put
\[ \cos\frac{2\pi}{h}\left[(\Delta H')_s - (\Delta H')_{s'}\right] t \cong 1 - \frac{2\pi^2 t^2}{h^2}\left[(\Delta H')_s - (\Delta H')_{s'}\right]^2, \]
which gives, since $N_{nm}$ vanishes for $t = 0$ (unless $m = n$),
\[ N_{nm} = -\frac{4\pi^2 t^2}{h^2} \sum_{s<s'} P^{nm}_{ss'} \left[(\Delta H')_s - (\Delta H')_{s'}\right]^2. \]
This expression coincides with (177 b) if
\[ -\sum_{s<s'} P^{nm}_{ss'} \left[(\Delta H')_s - (\Delta H')_{s'}\right]^2 = |S^0_{nm}|^2. \]
It can easily be shown, with the help of equations (154 b) and (154 c), that this relation actually holds. We shall not, however, give the proof of it here.
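Although the proof is omitted, the relation can at least be spot-checked numerically. The sketch below, in Python with NumPy, assumes $h = 1$, $r' = 3$, and an arbitrarily chosen Hermitian matrix `S` with hypothetical entries; the eigenvalues and eigenvectors returned by `numpy.linalg.eigh` play the part of the $(\Delta H')_s$ and the $a_{ns}$. It compares the sum over $s < s'$ with $|S^0_{nm}|^2$ and the early growth of $|c_{nm}|^2$ with (177 b).

```python
import numpy as np

h = 1.0                                        # arbitrary units (assumption)
S = 1e-3 * np.array([[0.0,        1.0 + 0.5j,  0.3],
                     [1.0 - 0.5j, 0.2,        -0.7j],
                     [0.3,        0.7j,       -0.4]])   # Hermitian, hypothetical values

dH, a = np.linalg.eigh(S)                      # (Delta H')_s and columns a_{ns}

def c(t):
    """Matrix of c_{nm}(t) = sum_s a_{ns} a*_{ms} exp(-i 2 pi dH_s t / h)."""
    return (a * np.exp(-2j * np.pi * dH * t / h)) @ a.conj().T

n, m = 2, 0
u = a[n] * a[m].conj()                         # u_s = a_{ns} a*_{ms}
P = np.real(np.outer(u, u.conj()))             # P_{ss'} = Re(u_s u*_{s'})
diff2 = (dH[:, None] - dH[None, :]) ** 2
lhs = -0.5 * np.sum(P * diff2)                 # minus the sum over s < s'
rhs = abs(S[n, m]) ** 2

t_small = 0.1
N_exact = abs(c(t_small)[n, m]) ** 2
N_quadratic = 4 * np.pi**2 * rhs * t_small**2 / h**2
```

For this particular matrix `lhs` equals `rhs` to rounding accuracy, and `N_exact` stays within about one per cent of `N_quadratic` at the chosen small time; a single example of this kind is, of course, an illustration and not a proof.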
It may be remarked that equation (179 a) reduces, subject to the same condition or rather subject to the condition $(\Delta H')_s\, t/h \ll 1$ (for all $s$), to
\[ c_{nm} \cong \delta_{nm} - \frac{i2\pi t}{h} \sum_s a_{ns}\, a^*_{ms}\, (\Delta H')_s, \]
while equation (177) gives, in the case of resonance,
\[ c_{nm} \cong -\frac{i2\pi t}{h}\, S^0_{nm}, \]
from which, by the way, it follows that
\[ S^0_{nm} = \sum_s a_{ns}\, a^*_{ms}\, (\Delta H')_s. \]
This relation can be derived from the equations
\[ \sum_k S^0_{nk}\, a_{ks} = (\Delta H')_s\, a_{ns} \]
by multiplying them by $a^*_{ms}$ and summing over $s$. We thus get
\[ \sum_s (\Delta H')_s\, a_{ns}\, a^*_{ms} = \sum_k S^0_{nk} \sum_s a_{ks}\, a^*_{ms} = \sum_k S^0_{nk}\, \delta_{km} = S^0_{nm}. \]
Further, it should be mentioned that an expansion of the same type as that for the coefficients $c_{nm}$ is not possible for the coefficients $C_{nm}$, as determined by (179 b), on account of the large value of the frequencies $K'_s/h$. More exactly, the corresponding approximate expression for $C_{nm}$ would be valid for exceedingly short times only (small compared with the reciprocal of $K'/h$), which hardly come into consideration.
The resonance between the $r'$ states we have just considered corresponds to an absolute degeneracy between these states in the sense of the perturbation theory not involving the time. In the present theory we need not, however, distinguish between this case and that of a ‘relative degeneracy’ (§ 20), so long as the energy-differences ($H' - H''$) between the states under consideration are small compared with the corresponding matrix elements of the perturbation energy $S^0_{H'H''}$. If the ratios $S^0_{H'H''}/(H'-H'')$ are large compared with 1 we can still use the expression (177 b) for the probability of the transition $H' \to H''$ provided the time $t$ is small compared with the reciprocal of the ‘beat frequency’ $(H''-H')/h$. In the contrary case we must limit ourselves to the expression (177 a) for the average value of the probability of finding the system in the new state $H''$.
We have, hitherto, confined ourselves exclusively to the case of a discrete $H$-spectrum. The modifications of the general theory which are necessary in order to allow for the presence of a continuous or mixed spectrum in a limited or unlimited range have already been indicated in the preceding section. They necessitate, however, an important revision of the approximate theory for the case of resonance between states belonging to a discrete set, on the one hand, and states belonging to a continuous set on the other (and also between states belonging to two different continuous sets). The essence of this revision consists in the replacement of the idea of sharp resonance, referring to two exactly determined states, by that of unsharp resonance for a narrow range or ‘band’ of final states belonging to a continuous set.
Let us consider transitions which are produced by a perturbing force vibrating harmonically with the frequency $\nu$. The initial state will be supposed to belong to a discrete set and to have the energy $H'$. If the energy $H' + h\nu$ lies in the region of the continuous spectrum (as can happen in the case of a hydrogen-like atom if $H' < 0$ while $H' + h\nu > 0$), then transitions will be produced not only to the state with the energy $H''_\nu = H' + h\nu$, but also to the neighbouring states whose energy $H''$ is slightly different from $H''_\nu$. This follows from two considerations. Firstly, the resonance condition $H'' = H' + h\nu$ need not be exactly satisfied even when the final state belongs to a discrete set. Secondly, the neighbouring states of a continuous set are themselves approximately in resonance with each other and cannot therefore be considered separately. We must consider instead a ‘band’ of neighbouring states or, in other words, a ‘wave group’ formed by the superposition of the harmonic waves representing them.
According to the general theory, we obtain for the coefficient $c_{H''}$ of the functions $\psi_{H''}$ belonging to a continuous set exactly the same differential equations as for the coefficients of the functions belonging to a discrete set. If the particle were supposed to be initially in the (discrete) state $H'$, then we have in both cases the same expression for $c_{H''} = c_{H''H'}$, namely, (175). Limiting ourselves to states in the neighbourhood of the resonance state with the energy $H'' = H''_\nu = H' + h\nu$, we can drop the first term in (175) on account of its relative smallness, so that
\[ \Delta_1 c_{H''H'} = -\tfrac{1}{2}\, T^0_{H''H'}\, e^{-i\beta}\; \frac{e^{i2\pi(H''-H'-h\nu)t/h} - 1}{H''-H'-h\nu}. \tag{180} \]
If the functions $\psi_{H''}$ are duly normalized, the number of copies of the particles that have passed during the time $t$ from the state $H'$ into a range $\Delta H''$ about the resonance value $H''_\nu$ is given by the expression
\[ N_{\Delta H''} = \int_{\Delta H''} |\Delta_1 c_{H''H'}|^2\, dH''. \tag{180a} \]
Before carrying out the integration over $H''$ we must notice that this integration actually refers to the energy alone if the other two parameters specifying the wave functions $\psi_{H''}$ remain discrete (as, for example, in the case of the hydrogen-like atom). If one or both of these parameters are continuously variable, $dH''$ must be replaced by the product of $dH''$ with the element or elements of these continuously variable parameters. Leaving this case aside, we can calculate (180 a) by integrating over the energy alone.
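Since the last factor in (180) has, for not too small values of $t$, a very sharp maximum at the resonance point $H'' = H''_\nu$ and comparatively very small values outside the immediate vicinity of this point, we can replace the first factor by its value for $H'' = H''_\nu$ and extend the integration over the difference $H'' - H''_\nu$ from $-\infty$ to $+\infty$. Putting, for brevity, $2\pi(H''-H'-h\nu)t/h = \xi$, we then get
\[ N_{\Delta H''_\nu} = \frac{\pi t}{2h}\, |T^0_{H''_\nu H'}|^2 \int_{-\infty}^{+\infty} \frac{|e^{i\xi} - 1|^2}{\xi^2}\, d\xi. \]
Since $|e^{i\xi} - 1|^2 = 2(1 - \cos\xi) = 4\sin^2\tfrac{1}{2}\xi$ and
\[ \int_{-\infty}^{+\infty} \frac{\sin^2\tfrac{1}{2}\xi}{\xi^2}\, d\xi = \frac{1}{2}\int_{-\infty}^{+\infty} \frac{\sin^2\eta}{\eta^2}\, d\eta = \frac{\pi}{2}, \]
this gives
\[ N_{\Delta H''_\nu} = \frac{\pi^2}{h}\, |T^0_{H''_\nu H'}|^2\, t. \tag{181} \]
The probability of a transition from the state $H'$ into the band $\Delta H''_\nu$ per unit time is thus equal to
\[ \Gamma_{H''_\nu H'} = \frac{\pi^2}{h}\, |T^0_{H''_\nu H'}|^2. \tag{181a} \]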
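The integral on which (181) rests is easily checked by direct summation. The following Python sketch (NumPy only) is an illustration under assumptions: the cut-off `L`, the grid size, the unit $h = 1$ and the value of `T` are arbitrary choices, and `N_band` is a hypothetical helper name. The integrand $|e^{i\xi}-1|^2/\xi^2$ is written by means of `numpy.sinc` so that it remains finite at $\xi = 0$.

```python
import numpy as np

# xi = 2*pi*(H'' - H' - h*nu)*t/h ; integrand |exp(i xi) - 1|^2 / xi^2
L, n = 4000.0, 2_000_001                        # cut-off and grid size (assumptions)
xi = np.linspace(-L, L, n)
integrand = np.sinc(xi / (2 * np.pi)) ** 2      # equals 4 sin^2(xi/2) / xi^2
d_xi = xi[1] - xi[0]
integral = np.sum(integrand) * d_xi             # Riemann sum; tails beyond |xi| = L are cut off

h, T = 1.0, 1e-3                                # arbitrary units, hypothetical matrix element

def N_band(t):
    """Copies in the resonance band after time t, with the factor T held fixed."""
    return (np.pi * t / (2 * h)) * T**2 * integral

rate = N_band(1.0)                              # probability per unit time
rate_181a = np.pi**2 * T**2 / h                 # expression (181 a)
```

The sum approaches $2\pi$ (the neglected tails contribute roughly $4/L$), so that `rate` reproduces `rate_181a` to the same accuracy, and `N_band` grows linearly with the time as in (181).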
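The same result could be obtained with the help of the quasi-discrete functions
\[ \psi_{\Delta H''} = \frac{1}{\sqrt{\Delta H''}} \int_{\Delta H''} \psi_{H''}\, dH''. \]
We must first consider the intervals $\Delta H''$ as finite and calculate the coefficients $c_{\Delta H'',H'} = \Delta_1 c_{\Delta H'',H'}$ according to formula (180) with the matrix elements $T^0_{H''H'}$ replaced by
\[ T^0_{\Delta H'',H'} = \int \psi^*_{\Delta H''}\, T\, \psi_{H'}\, dV = \frac{1}{\sqrt{\Delta H''}} \int dV \int_{\Delta H''} \psi^*_{H''}\, T\, \psi_{H'}\, dH'' = \sqrt{\Delta H''} \int \psi^*_{H''}\, T\, \psi_{H'}\, dV = \sqrt{\Delta H''}\; T^0_{H''H'}. \]
This formula is the more accurate the smaller the interval $\Delta H''$. We can therefore use it in the calculation of the limiting value of the sum $\sum |\Delta_1 c_{\Delta H'',H'}|^2 = \sum |\Delta_1 c_{H''H'}|^2\, \Delta H''$ extended over a large number of infinitely small intervals containing the resonance value $H''_\nu$. This limiting value is obviously nothing else but the integral (180 a).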
An important example of transitions of the mixed type just considered is the ionization of an atom by the action of light, i.e. the photoelectric effect. In this case we can put
\[ S = -eE_0\, z \cos(2\pi\nu t + \beta), \]
where $E_0$ is the amplitude of the electric vector of the light waves, supposed to be parallel to the $z$-axis, and $e$ is the charge of the electron. This gives
\[ \Gamma_{H''_\nu H'} = \frac{\pi^2}{h}\, e^2 E_0^2\, |z_{H''_\nu H'}|^2. \tag{181b} \]
Let us now turn to the case $\nu = 0$ corresponding to a perturbation which does not depend upon the time. The transition being again from a discrete state $H'$ to a continuous range of states $H''$ belonging to approximately the same value of the energy, we can determine its probability per unit time by the formula (181 a), putting $T = S$ and introducing the factor 4, for the same reason as in the formula (177 b) [in contradistinction from (176 a)]. We thus get
\[ \Gamma_{H''H'} = \frac{4\pi^2}{h}\, |S^0_{H''H'}|^2 \qquad (H'' = H'). \tag{182} \]
Another, purely formal, modification which must be introduced for the case $\nu = 0$ refers to the notation. If the continuous spectrum overlaps the discrete spectrum (which is necessary for the resonance condition $H'' = H'$ to be satisfied), we must introduce explicitly one or two parameters in order to distinguish the different states (continuous and discrete) which have the same energy. Denoting this parameter by $Q$, we can rewrite (182) in the form
\[ \Gamma_{Q''Q'} = \frac{4\pi^2}{h}\, |S^0_{H'Q'';\,H'Q'}|^2. \tag{182a} \]
If, finally, the parameter $Q''$ is continuously variable and if a range of the continuous spectrum is specified by the product
\[ \sigma(H'', Q'')\, dH''\, dQ'', \]
where $\sigma$ is a certain function of $H''$ and $Q''$ such that the probability of finding the particle in the above range is equal to
\[ |c_{H''Q''}|^2\, \sigma(H'', Q'')\, dH''\, dQ'', \]
then the probability of a resonance transition from the sharply defined state $H'Q'$ into a band corresponding to the interval $dQ''$ is given by
\[ d\Gamma_{Q''Q'} = \frac{4\pi^2}{h}\, |S^0_{H'Q'';\,H'Q'}|^2\, \sigma(H', Q'')\, dQ''. \tag{182b} \]
The same modification applies to a resonance transition produced by a harmonically vibrating perturbation. Instead of (181 a) we then get
\[ d\Gamma_{Q''Q'} = \frac{\pi^2}{h}\, |T^0_{H''_\nu Q'';\,H'Q'}|^2\, \sigma(H''_\nu, Q'')\, dQ''. \tag{182c} \]
It can easily be shown that these formulae remain valid when both the final and the initial states belong to a continuous set. We come upon this case in collision problems of the simplest type such as the deflexion of a particle by some field of force practically limited to a finite region of space, the initial and final states (‘before’ and ‘after’ the collision with the source of the perturbing field) being described by wave functions corresponding to the motion in the absence of this field.
If, however, the final state belongs to a discrete set, then the initial state must be specified unsharply, i.e. by a certain range of $H'$ (and eventually also of $Q'$).
In conclusion the following circumstance must be pointed out. From the corpuscular point of view resonance means the conservation of energy. The fact that perturbing forces practically produce only those transitions which satisfy the resonance condition can be regarded from this point of view as the natural consequence of the law of conservation of energy. As we have seen, however, the resonance condition is not strictly obeyed in wave mechanics. First of all, transitions of a non-systematic character are produced from the initial state to states with an entirely different energy, the average probability of finding the particle in these ‘stray’ states being given by the formula (177 a). Further, in the case of a continuous spectrum, the systematic transitions are governed by the condition of unsharp resonance, implying slight deviations from the law of conservation of energy. It thus seems that the latter does not strictly hold in wave mechanics.
This conclusion is, however, wrong, for the simple reason that $H$ does not represent the actual energy of the particle, this energy, if the perturbation $S$ does not depend upon the time, being specified by the characteristic values of the operator $K = H + S$. The resonance equation $H'' = H'$ is therefore merely an approximate expression of the law of conservation of energy which in reality should be expressed by $K'' = K'$.
As a matter of fact, if the motion of the particle is described from the point of view of $K$, i.e. by means of the characteristic functions of this operator, then a set of stationary states is obtained between which no transitions are possible, irrespective of whether $K'' = K'$ or $K'' \neq K'$. It is only when the motion of the particle is described from the point of view of $H$ that transitions appear, produced by the neglected part $S$ of the total energy $K$. It is precisely this ‘misuse’ of the energy $H$ which is the cause of the apparent violation of the law of conservation of energy. From the point of view of $H$, $S$ is not a constant (unless it commutes with $H$, which, in general, is not so) and therefore has no definite value. It can therefore be regarded as the ‘scapegoat’ responsible for the deviations from the conservation law $H'' = H'$ in the transitions for which this equation is not satisfied.
A similar consideration applies even more strongly to the general case in which $S$ does depend upon the time, for in this case the values of the total energy $K$ remain undetermined.
23. Second Approximation; Theory of Combined Transitions
The preceding considerations pave the way to an understanding of transitions the probability of which vanishes when derived from the equations of the first approximation but does not vanish when estimated with the help of the second approximation.
According to equations (173) and (173 a), we have this case if the matrix component $S_{H''H'}$ vanishes, while there is one or several states $H'''$ such that the components $S_{H''H'''}$ and $S_{H'''H'}$ are both different from zero.
For the sake of simplicity we shall first consider the case of discrete states together with a perturbation independent of the time. If there is no resonance between the initial and final states, i.e. if $H'' \neq H'$, then the probability amplitude, $c_{H''H'} = \Delta_2 c_{H''H'}$, of finding the particle in the state $H''$ will remain a small quantity of the second order, and the square of its modulus $N_{H''}$ will oscillate about an average value of the fourth order of smallness. If, however, $H'' = H'$, $c_{H''H'}$ will increase linearly and $N_{H''}$ will increase quadratically with the time, which means that there are systematic transitions from the initial state $H'$ to the final $H''$ via one or several intermediate states $H'''$. For these intermediate states the resonance condition with the end states need not (and in general cannot) be satisfied; the fact, however, that in the combined transitions $H' \to H''' \to H''$ the particle has to pass through a state with an energy $H'''$ different from the initial (and final) value does not in the least prevent it from making such transitions. The apparent violation of the energy law for each of the two ‘legs’ of the jump from $H'$ to $H''$ can obviously be straightened out by taking into account the perturbation energy $S$ not only as the cause of the transition but also as an invisible factor in the energy balance. If, for instance, $H''' > H'$, then we can imagine that the energy $H''' - H'$, which is required for the first step of the transition, is ‘borrowed’ from the perturbation energy $S$ and restored to it during the second step.
The probability amplitude $c_{H'''H'} = \Delta_1 c_{H'''H'}$ of the intermediate state will remain small, the corresponding probability (or number of copies in the state $H'''$) $N_{H'''} = |\Delta_1 c_{H'''H'}|^2$ oscillating about the constant value $2|S_{H'''H'}|^2/(H'''-H')^2$, while the number $N_{H''}$, though initially much smaller, increases with the time, and may finally become very large. We can visualize this process by imagining each state as a vessel which may be filled with a liquid representing the probability or the number of copies. This liquid is initially concentrated in the vessel $H'$ and is pumped by the perturbation to the vessel $H''$ with which it is connected indirectly through a set of vessels $H'''$; the liquid does not, however, accumulate in the latter, just passing through them and accumulating in $H''$. A still better picture of this transition process is provided by our pendulum model, the probability or number of copies being represented by the energy flowing from the pendulum $H'$ to the pendulum $H''$ which is coupled with it through the pendulums $H'''$. The lack of resonance between the latter and $H'$ results in these pendulums performing steady oscillations of small amplitude and functioning simply as carriers of energy from $H'$ to $H''$.
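The picture of the liquid passing through the intermediate vessel without accumulating in it can be reproduced in a small numerical experiment. The sketch below, in Python with NumPy, is a minimal model under assumptions: three states only, $h = 1$, a hypothetical energy offset `E` $= H''' - H'$ and equal real couplings `g` of the intermediate state to $H'$ and $H''$, with no direct coupling between the latter two. The motion is obtained exactly from the characteristic values of the $3 \times 3$ energy matrix.

```python
import numpy as np

h = 1.0                      # Planck's constant, arbitrary units (assumption)
E, g = 1.0, 0.01             # H''' - H' and the coupling matrix elements (hypothetical values)

# basis order: H', H'', H'''; H' and H'' have the same energy and no direct coupling
K = np.array([[0.0, 0.0, g],
              [0.0, 0.0, g],
              [g,   g,   E]])
K_vals, a = np.linalg.eigh(K)

def copies(t):
    """Numbers of copies (N', N'', N''') at time t, starting from H' alone."""
    phases = np.exp(-2j * np.pi * K_vals * t / h)
    C = a @ (phases * (a.conj().T @ np.array([1.0, 0.0, 0.0])))
    return np.abs(C) ** 2

times = np.linspace(0.0, h * E / (4 * g * g), 5001)
history = np.array([copies(t) for t in times])
N_intermediate_max = history[:, 2].max()     # stays of the order of 4 g^2 / E^2
N_final_end = history[-1, 1]                 # nearly the whole probability has arrived
```

In this model the number of copies in the intermediate state never exceeds a few times $g^2/E^2$, in keeping with the average value $2|S_{H'''H'}|^2/(H'''-H')^2$ quoted above, whereas the final state is almost completely filled at the end of the chosen interval.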
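After these preliminary considerations of a qualitative character, we can pass to the quantitative theory of the double transitions. Putting in (173 a)
\[ S_{H''H'''} = S^0_{H''H'''}\, e^{i2\pi(H''-H''')t/h} \quad \text{and} \quad S_{H'''H'} = S^0_{H'''H'}\, e^{i2\pi(H'''-H')t/h}, \]
we get
\[ \Delta_2 c_{H''H'} = \sum_{H'''} S^0_{H''H'''}\, S^0_{H'''H'}\, f_{H''H'''H'}(t), \tag{183} \]
where
\[ f_{H''H'''H'}(t) = -\frac{4\pi^2}{h^2} \int_0^t dt'\, e^{i2\pi(H''-H''')t'/h} \int_0^{t'} dt''\, e^{i2\pi(H'''-H')t''/h} = \frac{2\pi i}{h} \int_0^t dt'\, e^{i2\pi(H''-H''')t'/h}\, \frac{e^{i2\pi(H'''-H')t'/h} - 1}{H'''-H'}, \]
that is,
\[ f_{H''H'''H'}(t) = \frac{1}{H'''-H'} \left[ \frac{e^{i2\pi(H''-H')t/h} - 1}{H''-H'} - \frac{e^{i2\pi(H''-H''')t/h} - 1}{H''-H'''} \right]. \tag{183a} \]
In the case of resonance $H'' = H'$ this expression reduces to
\[ f_{H''H'''H'}(t) = \frac{i2\pi t}{h\,(H'''-H')} + \frac{e^{-i2\pi(H'''-H')t/h} - 1}{(H'''-H')^2}. \tag{183b} \]
Dropping the second term on account of its smallness, we thus get
\[ \Delta_2 c_{H''H'} = \frac{i2\pi t}{h} \sum_{H'''} \frac{S^0_{H''H'''}\, S^0_{H'''H'}}{H'''-H'}. \tag{183c} \]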
We did not replace $H''$ by $H'$ in spite of the fact that $H'' = H'$ in order to indicate somehow that the final state is different from the initial one. This can be done in a clearer way by introducing the additional suffix $Q$ and writing $S^0_{H'Q'';\,H'''Q'''}\, S^0_{H'''Q''';\,H'Q'}$ instead of $S^0_{H''H'''}\, S^0_{H'''H'}$.
In the case of double transitions, just as in the case of simple transitions, one usually has to do with an unsharp resonance between the initial state and a band of continuously variable final states. If the energy is the only continuously variable parameter, the probability of transition from $H'Q'$ to $H'Q''$ in the time $t$ is expressed by the integral
\[ \int |\Delta_2 c_{H''H'}(t)|^2\, dH'' \]
extended over the neighbourhood of the resonance value $H'' = H'$. In carrying out the integration we can drop the second term in the expression (183 a). With this condition we must obviously get the same result as for the simple resonance transition $H' \to H''$, with the matrix element $S^0_{H''H'}$ replaced by the expression
\[ \sum_{H'''} \frac{S^0_{H''H'''}\, S^0_{H'''H'}}{H'''-H'}. \]
We thus obtain for the probability per unit time of the transition $H' \to H''$ the following formula [cf. eq. (182 a)]:
\[ \Gamma_{H''H'} = \frac{4\pi^2}{h} \left| \sum_{H'''} \frac{S^0_{H''H'''}\, S^0_{H'''H'}}{H'''-H'} \right|^2 \qquad (H'' = H'). \tag{184} \]
This formula is not complete in two respects. Firstly, it does not take into account other parameters ($Q$) in addition to the energy. Secondly, it neglects intermediate states belonging to the continuous energy spectrum. If the parameter $Q$ is discretely variable, we get, instead of (184), the expression
\[ \Gamma_{Q''Q'} = \frac{4\pi^2}{h} \left| \sum_{H'''} \sum_{Q'''} \frac{S^0_{H'Q'';\,H'''Q'''}\, S^0_{H'''Q''';\,H'Q'}}{H'''-H'} + \sum_{Q'''} \int dH'''\, \frac{S^0_{H'Q'';\,H'''Q'''}\, S^0_{H'''Q''';\,H'Q'}}{H'''-H'} \right|^2. \tag{184a} \]
If $Q$ is itself continuously variable, then the summation over $Q'''$ must be replaced by an integration, the element $dQ'''$ being multiplied by the factor $\sigma(H''', Q''')$, and $Q''$ being replaced by the element $dQ''$ with the factor $\sigma(H', Q'')$ on the right side of (184 a).
If there is a slight direct coupling between the states $H'$ and $H''$, then the transition probability is determined by the sum of $\Delta_1 c_{H''H'}$ and $\Delta_2 c_{H''H'}$, so that instead of (184) we get
\[ \Gamma_{H''H'} = \frac{4\pi^2}{h} \left| S^0_{H''H'} - \sum_{H'''} \frac{S^0_{H''H'''}\, S^0_{H'''H'}}{H'''-H'} \right|^2. \tag{184b} \]
It often happens that the perturbation is due to the simultaneous action of two different forces which are incoherent with regard to each other, in the sense that they involve independent phase-factors, over which one must average, with the result that all quantities containing odd powers of these factors vanish.
We thus get
\[ S = F + G, \tag{185} \]
and
\[ \overline{|S^0_{H''H'}|^2} = |F^0_{H''H'}|^2 + |G^0_{H''H'}|^2, \tag{185a} \]
the average value of the product of $F^0_{H''H'}$ with $G^{0*}_{H''H'}$ being equal to zero.
If we consider simple transitions $H' \to H''$ produced by the simultaneous action of two such perturbations, we get for the transition probability the sum of the two probabilities, corresponding to the action of each of the two perturbations taken separately.
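The effect of averaging over independent phase-factors may be illustrated numerically. The following Python sketch (NumPy only) assumes two hypothetical complex matrix elements `F` and `G`, each multiplied by its own phase factor, and replaces the average over the phases by a mean over a uniform grid of 64 values for each; the grid size and the numerical values are arbitrary choices made for the illustration.

```python
import numpy as np

F = 0.8 - 0.3j          # matrix element F0_{H''H'} (hypothetical value)
G = -0.2 + 0.5j         # matrix element G0_{H''H'} (hypothetical value)

# independent phase factors of the two forces, sampled on a uniform grid
phi = 2 * np.pi * np.arange(64) / 64
pF, pG = np.meshgrid(np.exp(1j * phi), np.exp(1j * phi), indexing="ij")

mean_S2 = np.mean(np.abs(F * pF + G * pG) ** 2)    # average of |S|^2 over both phases
cross = np.mean(F * pF * np.conj(G * pG))          # averaged product of F with G*
sum_of_squares = abs(F) ** 2 + abs(G) ** 2
```

The averaged cross product vanishes to rounding accuracy and `mean_S2` coincides with `sum_of_squares`, which is the content of (185 a).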
However, in the case of combined transitions, we get, according to (184), the following expression for the transition probability
\[ \Gamma_{H''H'} = (F,F)_{H''H'} + (F,G)_{H''H'} + (G,G)_{H''H'}, \tag{186} \]
the first and last terms being obtained from (184) by replacing $S$ by $F$ or $G$. They represent the ‘solo’ action of the two perturbing forces, while the middle term represents their combined action, one of the perturbing forces producing the first and the other the second step of the transition. This combination term
\[ (F,G)_{H''H'} = \frac{4\pi^2}{h} \left| \sum_{H'''} \frac{F^0_{H''H'''}\, G^0_{H'''H'} + G^0_{H''H'''}\, F^0_{H'''H'}}{H'''-H'} \right|^2 \qquad (H'' = H') \tag{186a} \]
turns out to be, in many cases, more important than the two ‘pure’ terms.
These considerations acquire a particular importance in the generalization of the preceding results for the case of a perturbation depending upon the time.
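Let us first assume that $S$ reduces to a simple harmonic vibration without a constant term. We then have, as before,
\[ S = T(x,y,z)\cos(2\pi\nu t + \beta). \]
Substituting this in equation (173 a), we get the former expression (183) for $\Delta_2 c_{H''H'}$ (with $S^0$ replaced by $T^0$) with
\[ f_{H''H'''H'}(t) = -\frac{\pi^2}{h^2} \int_0^t dt' \left[ e^{i\beta} e^{i2\pi(\nu_{H''H'''}+\nu)t'} + e^{-i\beta} e^{i2\pi(\nu_{H''H'''}-\nu)t'} \right] \int_0^{t'} dt'' \left[ e^{i\beta} e^{i2\pi(\nu_{H'''H'}+\nu)t''} + e^{-i\beta} e^{i2\pi(\nu_{H'''H'}-\nu)t''} \right], \]
i.e.
\[ \begin{aligned} f_{H''H'''H'}(t) = \frac{1}{4h^2} \Bigg\{ & e^{i2\beta} \left[ \frac{e^{i2\pi(\nu_{H''H'}+2\nu)t} - 1}{(\nu_{H''H'}+2\nu)(\nu_{H'''H'}+\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}+\nu)t} - 1}{(\nu_{H''H'''}+\nu)(\nu_{H'''H'}+\nu)} \right] \\ & + \frac{e^{i2\pi\nu_{H''H'}t} - 1}{\nu_{H''H'}(\nu_{H'''H'}-\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}+\nu)t} - 1}{(\nu_{H''H'''}+\nu)(\nu_{H'''H'}-\nu)} \\ & + \frac{e^{i2\pi\nu_{H''H'}t} - 1}{\nu_{H''H'}(\nu_{H'''H'}+\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}-\nu)t} - 1}{(\nu_{H''H'''}-\nu)(\nu_{H'''H'}+\nu)} \\ & + e^{-i2\beta} \left[ \frac{e^{i2\pi(\nu_{H''H'}-2\nu)t} - 1}{(\nu_{H''H'}-2\nu)(\nu_{H'''H'}-\nu)} - \frac{e^{i2\pi(\nu_{H''H'''}-\nu)t} - 1}{(\nu_{H''H'''}-\nu)(\nu_{H'''H'}-\nu)} \right] \Bigg\}. \end{aligned} \]
This expression clearly shows that the resonance condition $\nu_{H''H'} = \pm\nu$ (i.e. $H'' - H' = \pm h\nu$) of the theory of simple transitions has to be replaced in the case of double transitions by the condition
\[ \nu_{H''H'} = \pm 2\nu \ \text{or} \ 0, \]
that is,
\[ H'' - H' = \pm 2h\nu \ \text{or} \ 0, \]
giving respectively
\[ f_{H''H'''H'} = \frac{i\pi t}{2h}\, \frac{e^{\mp i2\beta}}{H'''-H' \mp h\nu} \quad \text{or} \quad \frac{i\pi t}{2h} \left[ \frac{1}{H'''-H'+h\nu} + \frac{1}{H'''-H'-h\nu} \right]. \]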
These results can easily be interpreted by assuming that each step of
the double transition H ’ -> H ”f -> H ” consists either in the absorption or
in the (forced) emission of one quantum hv of light—if, for the sake of
concreteness, the perturbation S is regarded as due to monochromatic
light of frequency v.
This interpretation is supported by the fact that the transition
probability as determined by the square of A2cir ir turns out to be
proportional to the square of the intensity of the light (i.e. to the fourth
power of the electric force E 0i to which S must be proportional in the
case under consideration). This is just what would be expected if the
probability of each of the two steps of the transition is proportional
to the intensity of the light.
I t must be emphasized, however, th at for each of these two steps
the usual resonance condition vw >,R. = ±i/ is, in general, not satisfied.
§23 T H E O R Y OF C OM BI NE D T R ANSI T I O NS 231
We have here the same situation as in the case v = 0 discussed above—
an apparent violation of the energy principle, straightened out by the
perturbation energy whose value is actually indeterminate.
It is, in principle, quite possible for light to induce transitions whose
probability is proportional not to the first but to the second or even
to a higher power of its intensity. In order that such effects could be
observed, however, the intensity of the light must be extremely high,
in fact much higher than that with which we usually have to do in our
laboratory experiments. For, according to these experiments, the
transition probability, as measured by the rate of photo-ionization for
example, turns out to be exactly proportional to the light intensity.
We are thus entitled to conclude that double transitions produced
by the action of light alone practically do not occur—on the surface of
the earth at least.
There is, however, a great variety of phenomena which can be
described as double transitions under the combined action of light and
some other perturbation which does not depend upon the time.
Such combined perturbations are represented by a function of the
type
$$S = T(x,y,z)\cos(2\pi\nu t+\beta)+G(x,y,z). \tag{187}$$
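If, in the calculation of $\Delta_2 c_{H''H'}$, only those terms are preserved which
are bilinear in $T$ and $G$, i.e. proportional to their product, then, instead
of (183), we get
$$\Delta_2 c_{H''H'} = \sum_{H'''}\left(T_{H''H'''}G_{H'''H'}\,f_{H''H'''H'}+G_{H''H'''}T_{H'''H'}\,g_{H''H'''H'}\right) \tag{187 a}$$
with
$$f_{H''H'''H'}(t) = -\frac{4\pi^2}{h^2}\int_0^t dt'\,e^{i2\pi\nu_{H''H'''}t'}\cos(2\pi\nu t'+\beta)\int_0^{t'} dt''\,e^{i2\pi\nu_{H'''H'}t''}$$
and
$$g_{H''H'''H'}(t) = -\frac{4\pi^2}{h^2}\int_0^t dt'\,e^{i2\pi\nu_{H''H'''}t'}\int_0^{t'} dt''\,e^{i2\pi\nu_{H'''H'}t''}\cos(2\pi\nu t''+\beta),$$
that is
$$\begin{aligned}
f_{H''H'''H'} = \frac{1}{2h^2}\Bigg\{\; & e^{i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{(\nu_{H''H'}+\nu)\,\nu_{H'''H'}}-\frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)\,\nu_{H'''H'}}\right] \\
& +e^{-i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{(\nu_{H''H'}-\nu)\,\nu_{H'''H'}}-\frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)\,\nu_{H'''H'}}\right]\Bigg\}
\end{aligned}$$
and
$$\begin{aligned}
g_{H''H'''H'} = \frac{1}{2h^2}\Bigg\{\; & e^{i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{(\nu_{H''H'}+\nu)(\nu_{H'''H'}+\nu)}-\frac{e^{i2\pi\nu_{H''H'''}t}-1}{\nu_{H''H'''}(\nu_{H'''H'}+\nu)}\right] \\
& +e^{-i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{(\nu_{H''H'}-\nu)(\nu_{H'''H'}-\nu)}-\frac{e^{i2\pi\nu_{H''H'''}t}-1}{\nu_{H''H'''}(\nu_{H'''H'}-\nu)}\right]\Bigg\}.
\end{aligned}$$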
The two expressions define the resonance condition in the same way
as for a simple transition produced by the action of the light alone.
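In the case of an unsharp resonance in the neighbourhood of the
value $H'' = H'\pm h\nu$, these expressions practically reduce to
$$f_{H''H'''H'} = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{(\nu_{H''H'}\mp\nu)\,\nu_{H'''H'}},\qquad
g_{H''H'''H'} = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{(\nu_{H''H'}\mp\nu)(\nu_{H'''H'}\mp\nu)}, \tag{187 b}$$
so that we get
$$\Delta_2 c_{H''H'} = \frac{e^{\mp i\beta}}{2h}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{\nu_{H''H'}\mp\nu}\sum_{H'''}\left(\frac{T_{H''H'''}G_{H'''H'}}{H'''-H'}+\frac{G_{H''H'''}T_{H'''H'}}{H'''-H'\mp h\nu}\right),$$
and consequently
$$\Gamma_{H''H'} = \frac{\pi^2}{h}\left|\sum_{H'''}\left(\frac{T_{H''H'''}G_{H'''H'}}{H'''-H'}+\frac{G_{H''H'''}T_{H'''H'}}{H'''-H'\mp h\nu}\right)\right|^2 \tag{187 c}$$
instead of (184). This formula should be completed to allow for transitions
through states belonging to the continuous $H$-spectrum, and also
for other parameters ($Q$) besides the energy, in the same way as (184).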
It must be mentioned that those terms—quadratic in T or G —
which have been dropped in formula (187 a) have no importance so
long as we restrict ourselves to resonance transitions of the above type.
As shown above, they would become predominant only for transitions
of the type $H'' = H'\pm 2h\nu$ or $H'' = H'$.
An interesting feature of the expression (187 c) is the non-symmetrical
character of the two terms in the brackets with regard to the frequency
v. The latter affects the second term only, which corresponds to the
action of light in the first step of the transition, while in the first term,
which corresponds to the action of light in the second step, the frequency
$\nu$ appears only through the subscript $H''$.
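The asymmetry can be seen in a small numerical sketch of the sum in (187 c) for the absorption case (lower sign). The matrix elements and energies below are purely hypothetical numbers, chosen real for simplicity, and the function name is an assumption made for illustration.

```python
def bilinear_terms(intermediates, H_initial, h_nu):
    """Return (first, second) sums over intermediate states, absorption case.

    Each intermediate state is a tuple (H3, T_f3, G_3i, G_f3, T_3i): its energy
    H''' and four illustrative real matrix elements
    T_{H''H'''}, G_{H'''H'}, G_{H''H'''}, T_{H'''H'}.
    """
    first = sum(T_f3 * G_3i / (H3 - H_initial)
                for H3, T_f3, G_3i, G_f3, T_3i in intermediates)
    second = sum(G_f3 * T_3i / (H3 - H_initial - h_nu)
                 for H3, T_f3, G_3i, G_f3, T_3i in intermediates)
    return first, second

# two hypothetical intermediate states
states = [(2.0, 0.3, 0.1, 0.2, 0.4), (3.5, 0.1, 0.5, 0.3, 0.2)]
a1, b1 = bilinear_terms(states, H_initial=0.0, h_nu=1.0)
a2, b2 = bilinear_terms(states, H_initial=0.0, h_nu=1.5)
```

Changing $h\nu$ leaves the first sum untouched and alters only the second, whose denominators approach zero when $h\nu$ approaches an intermediate energy difference.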
As an example of the application of the formula (187 c) we could cite
the problem of the transformation of light into heat in gaseous bodies.
In this case G must represent the perturbing force experienced by the
atom under consideration due to other atoms with which it is supposed
to come into collision. The complete treatment of this problem
requires, however, the generalization of the preceding theory to allow
for the motion of all the particles which act on each other (see Part III).
Another example of double transitions of the above kind is provided
by the phenomenon of the scattering of light which can be considered
as a combination of two elementary acts (simple transitions)—namely,
the absorption of a light quantum $h\nu$ and the spontaneous emission of
another light quantum $h\nu'$ corresponding, in general, to a different
frequency. The two acts may take place in either order—since the law
of the conservation of energy need not be satisfied in the intermediate
state (if the perturbation energy is left out of account).
The application of formula (187 c) to the case of the scattering of
light necessitates, however, two important amendments both in the
underlying principles and in the form of the result.
First of all it is necessary to visualize a ‘spontaneous' transition,
associated with light emission, as caused by some perturbation G—the
reaction of the electron’s radiation field on itself, for example (see
Part I, § 18). This question has, however, no practical significance,
since in formula (187 c) we have to do not with the perturbation energy
G itself—which cannot be specified in the usual way, i.e. as a function
of the coordinates or as an operator G(x)—but with its matrix elements
only. The latter, however, can be regarded as known, since they
define the emission probability for which the expression (93), § 17,
Part I, can be used. Identifying this expression with the expression
$\frac{4\pi^2}{h}|G|^2\rho(H''')$, we can determine the matrix elements of $G$,
provided the function $\rho(H''')$ is known.
We shall not investigate this question here, for it will be considered
in detail later in connexion with a more direct theory of light-scattering.
It must be mentioned, however, that this theory leads to a formula for
$\Gamma_{H''H'}$ which differs from (187 c) in two respects.
Firstly, the resonance condition $H'' = H'\pm h\nu$ is replaced by
$$H'' = H'+h\nu-h\nu', \tag{188}$$
where $\nu$ is the frequency of the absorbed and $\nu'$ the frequency of the
emitted (‘scattered’) light. This result can be considered as the direct
consequence of the energy principle.
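As a minimal sketch of condition (188), the frequency of the scattered quantum can be computed from the two level energies. The function name is hypothetical, and units with $h = 1$ are assumed in the sample calls merely to keep the numbers simple.

```python
def scattered_frequency(nu: float, H_initial: float, H_final: float, h: float) -> float:
    """Frequency nu' fixed by H'' = H' + h*nu - h*nu'."""
    return nu - (H_final - H_initial) / h

# units with h = 1 (an assumption made only for this illustration)
shifted = scattered_frequency(5.0, H_initial=0.0, H_final=2.0, h=1.0)    # atom left excited
unshifted = scattered_frequency(5.0, H_initial=1.0, H_final=1.0, h=1.0)  # atom returns to H'
```

When the final level coincides with the initial one the scattered frequency equals the incident one; otherwise it is displaced by $(H''-H')/h$.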
Secondly, taking the sign $-$ in the denominator of the second term
in (187 c) (which corresponds to absorption of light), we must replace
the denominator of the first term, i.e. the difference $H'''-H'$, by
$H'''-H'+h\nu'$ (which corresponds to the emission of light of frequency
$\nu'$ in the first step of the double transition). We thus get for the
probability of scattering, instead of (187 c), the expression
$$\Gamma_{H''H'} = \frac{\pi^2}{h}\left|\sum_{H'''}\left(\frac{T_{H''H'''}G_{H'''H'}}{H'''-H'+h\nu'}+\frac{G_{H''H'''}T_{H'''H'}}{H'''-H'-h\nu}\right)\right|^2. \tag{188 a}$$
If the incident light is polarized in the direction of the unit vector q
and that part of the scattered radiation is considered which corresponds
to vibrations of the electron in the direction q', then we must put
$$T = -e(\mathbf r\cdot\mathbf q)E_0 \quad\text{and}\quad G = -e(\mathbf r\cdot\mathbf q')E_0', \tag{188 b}$$
where $\mathbf r$ is the radius vector of the electron (with respect to the nucleus
of the atom) and $E_0'$ is a certain ‘effective amplitude’. $G$ is thus obtained
from T by replacing the amplitude of the external electric force by
a certain constant, which will be determined later.
These results can be derived from the general perturbation theory by
replacing the spontaneous emission forming one of the two steps of the
scattering process by an induced emission, i.e. an emission due to the action
of a secondary light wave with the frequency $\nu'$ and the amplitude $E_0'$.
Assuming the electron to be exposed simultaneously to the action of
these two light waves, we have for the total perturbation energy an
expression of the form
$$S = T(x,y,z)\cos(2\pi\nu t+\beta)+T'(x,y,z)\cos(2\pi\nu' t+\beta'). \tag{189}$$
This gives for the bilinear part of $\Delta_2 c_{H''H'}$ the previous expression
(187 a) with $G = T'$ but with somewhat different values for the factors
$f$ and $g$.
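Limiting ourselves to the case of an approximate resonance in the
neighbourhood of the value (188) and dropping relatively small terms, we get
$$f_{H''H'''H'}(t) = \frac{1}{4h^2}\,e^{-i(\beta-\beta')}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{(\nu_{H''H'}-\nu+\nu')(\nu_{H'''H'}+\nu')},\qquad
g_{H''H'''H'}(t) = \frac{1}{4h^2}\,e^{-i(\beta-\beta')}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{(\nu_{H''H'}-\nu+\nu')(\nu_{H'''H'}-\nu)}, \tag{189 a}$$
which gives
$$\Delta_2 c_{H''H'} = \frac{e^{-i(\beta-\beta')}}{4h^2}\sum_{H'''}\left(\frac{T_{H''H'''}T'_{H'''H'}}{\nu_{H'''H'}+\nu'}+\frac{T'_{H''H'''}T_{H'''H'}}{\nu_{H'''H'}-\nu}\right)\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{\nu_{H''H'}-\nu+\nu'} \tag{189 b}$$
and consequently
$$\Gamma_{H''H'} = \frac{\pi^2}{h}\left|\sum_{H'''}\left(\frac{T_{H''H'''}T'_{H'''H'}}{H'''-H'+h\nu'}+\frac{T'_{H''H'''}T_{H'''H'}}{H'''-H'-h\nu}\right)\right|^2, \tag{189 c}$$
i.e. exactly formula (188 a) with $G$ replaced by $T'$. All that remains is
to assume a fixed effective value for $E_0'$ in order to obtain the probability
of scattering.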
This value can be determined in the following way:
The unsharpness of the resonance implied in the preceding calculations
can be realized either by a transition of the particle into a ‘band’
$\Delta H''$ of a continuous spectrum, with exactly specified values both of
$\nu$ and $\nu'$, or by a transition into a perfectly definite state $H''$ belonging to a
discrete set, the unsharpness of the resonance being due in this case to a
variation of $\nu'$ in a small interval $\Delta\nu'$ about the value $(H'-H''+h\nu)/h$
or, in other words, to the emission of a spectral line $\nu'$ of finite width.
From the latter point of view, which we shall adopt for the present,
we must consider instead of $S' = T'\cos(2\pi\nu' t+\beta')$ a superposition of
a set of harmonic vibrations with different frequencies contained in the
small interval $\Delta\nu'$ and with completely independent phase constants,
i.e. incoherent with regard to each other.
This means that $|T'_{H''H'}|^2$ must be proportional not to the square
of the sum of the amplitudes of the component vibrations, but to the
sum of the squares of these elementary amplitudes. Denoting the value
of this sum for all the frequencies contained within the interval $d\nu'$ by
$E'^2\,d\nu'$, we get
$$|T'_{H''H'}|^2 = e^2\,|(\mathbf r\cdot\mathbf q')_{H''H'}|^2\,E'^2\,d\nu'$$
or if (as has been done above) the integration is extended over the
values of the energy and not over the frequency,
$$|T'_{H''H'}|^2 = \frac{e^2}{h}\,|(\mathbf r\cdot\mathbf q')_{H''H'}|^2\,E'^2. \tag{190}$$
The corresponding transition probability is equal to
$$\frac{\pi^2}{h}\,|T'_{H''H'}|^2 = \frac{\pi^2e^2}{h^2}\,|(\mathbf r\cdot\mathbf q')_{H''H'}|^2\,E'^2.$$
This quantity must obviously be identified with the probability of
spontaneous emission (see Part I, eq. (93), § 17)
$$A = \frac{64\pi^4e^2\nu'^3}{3c^3h}\,|(\mathbf r\cdot\mathbf q')_{H''H'}|^2,$$
whence it follows that
$$E'^2 = \frac{64\pi^2}{3c^3}\,h\nu'^3. \tag{190 a}$$
Putting further $T = -e\,\mathbf r\cdot\mathbf q\,E_0$
($\mathbf q$ being the direction of the vector $\mathbf E_0$), we get, according to (189 c),
$$\Gamma_{H''H'} = \frac{64\pi^4}{3c^3h}\,\nu'^3e^4E_0^2\left|\sum_{H'''}\left[\frac{(\mathbf r_{H''H'''}\cdot\mathbf q)(\mathbf r_{H'''H'}\cdot\mathbf q')}{H'''-H'+h\nu'}+\frac{(\mathbf r_{H''H'''}\cdot\mathbf q')(\mathbf r_{H'''H'}\cdot\mathbf q)}{H'''-H'-h\nu}\right]\right|^2. \tag{190 b}$$
The intensity of the scattered radiation is equal to the product of
$\Gamma_{H''H'}$ and $h\nu'$.
If $\nu'$ is different from $\nu$, and if a direct transition from the state $H'$
to the (discrete) state $H''$ is impossible (as assumed hitherto), formula
(190 b) describes, in conjunction with the resonance condition (188),
the so-called Raman effect or incoherent scattering of light. If the
state H" belongs to a continuous set, corresponding to an ionized state
of the atom, we get the Compton effect instead of the Raman effect.
In this case it is necessary, however, to modify formula (190 b), firstly
by allowing for transitions through intermediate states belonging to the
continuous spectrum, and secondly by allowing for the finite speed of
light both in absorption and emission. These corrections will be introduced
later in Part III where an exact theory of the Compton effect
will be given.
24. Theory of Transitions for an Undefined Initial State
The coefficients $c_{Q'}$ (or in particular $c_{H'}$) are complex quantities,
whose modulus determines the probability of the corresponding states
—or the number of copies associated with the latter—while their phases
have no direct physical significance.
We shall see later that these phases can be used for the building up
of a theory, in which the copies of the particle appear as a number of
particles of the same sort (cf. Part I, § 20). So long, however, as we
confine ourselves to one particle only, the phases of the quantities $c_{H'}$
are devoid of all meaning and must therefore not appear in the final
equations. This means that the latter must contain only the moduli
or the squares of the moduli of the coefficients $c_{H'}$.
We shall apply this principle to the problem (first treated by Dirac)
of the change in the distribution of the copies of a particle among
different states due to a perturbation of any kind when the state of the
particle at the initial instant was not exactly specified, so that only
the initial values of the probabilities $N^0_{H'}$ were known. Our problem
will consist in the determination of these probabilities $N_{H'}(t)$ as functions
of the time (for sufficiently small values of the latter).
In this form the problem is indeterminate, for the equations of the
perturbation theory involve not the probabilities $N_{H'}$, but the probability
amplitudes $c_{H'}$, whose values, both with respect to modulus and
phase, are determined by the values of their moduli $\sqrt{N^0_{H'}}$ and phases
at the initial moment. In order to get rid of these phases, which
are completely irrelevant so far as the probabilities are concerned, we
can average the results over them, assuming all the values of these
phases to be equally probable.
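Taking the case of a discrete set of states, we have, according to (161),
$$-\frac{h}{2\pi i}\frac{dc_{H'}}{dt} = \sum_{H''}S_{H'H''}c_{H''}.$$
To these equations we shall add the conjugate complex equations
$$\frac{h}{2\pi i}\frac{dc^*_{H'}}{dt} = \sum_{H''}S^*_{H'H''}c^*_{H''} = \sum_{H''}S_{H''H'}c^*_{H''}.$$
Multiplying the former by $c^*_{H'}$ and the latter by $c_{H'}$ and subtracting
one from the other, we get
$$-\frac{h}{2\pi i}\frac{d}{dt}\left(c^*_{H'}c_{H'}\right) = \sum_{H''}\left(S_{H'H''}c^*_{H'}c_{H''}-S_{H''H'}c^*_{H''}c_{H'}\right),$$
i.e.
$$\frac{dN_{H'}}{dt} = \frac{2\pi i}{h}\sum_{H''}\left(S_{H''H'}c^*_{H''}c_{H'}-S_{H'H''}c^*_{H'}c_{H''}\right). \tag{191}$$
We see that the right side of these equations cannot be expressed as
a function of the numbers $N_{H'}$.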
One might be tempted to put
$$c_{H'} = \sqrt{N_{H'}}\,e^{i\gamma_{H'}}$$
and average over the phases $\gamma_{H'}$ (and $\gamma_{H''}$), considering all their values as
equally probable. This would, however, reduce the right side of (191) to
zero. In fact, we are not allowed to assume the equal probability
of all the values of the phases $\gamma_{H'}$ at any time; if they were equally
probable at the instant $t = 0$ they will no longer be so later on.
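Why the naive average vanishes can be checked numerically. In the sketch below (the function name and the grid size are assumptions made for illustration) the phase factor $e^{i(\gamma_{H'}-\gamma_{H''})}$ is averaged over a uniform grid of values of two independent phases; the only non-vanishing average is the one in which both phases belong to the same state.

```python
import cmath
import math

def phase_average(n: int, same_state: bool) -> complex:
    """Average of exp(i*(g1 - g2)) over a uniform grid of n values per phase."""
    phases = [2.0 * math.pi * k / n for k in range(n)]
    if same_state:
        # H' = H'': the two phases coincide, so every term equals 1
        return sum(cmath.exp(1j * (g - g)) for g in phases) / n
    # H' != H'': the two phases are varied independently
    total = sum(cmath.exp(1j * (g1 - g2)) for g1 in phases for g2 in phases)
    return total / (n * n)

off_diagonal = phase_average(16, same_state=False)
diagonal = phase_average(16, same_state=True)
```

Since the right side of (191) contains only products with $H''\neq H'$ (the diagonal terms cancel), averaging it directly over independent phases leaves nothing.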
We shall therefore, in the right side of (191), substitute for the
probability amplitudes $c_{H'}$ approximate expressions in terms of their
initial values, up to the first approximation, so as to obtain the
second-order approximation for the time derivatives of the numbers $N_{H'}$ (it
should be remembered that the matrix components of $S$ by which the
coefficients $c$ are multiplied are regarded as small quantities of the first
order).
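We thus get
$$c^*_{H''}c_{H'} \cong c^{0*}_{H''}c^0_{H'}+c^{0*}_{H''}\,\Delta_1c_{H'}+\Delta_1c^*_{H''}\,c^0_{H'}.$$
Now we obviously have
$$\Delta_1c_{H'} = \sum_{H'''}\Delta_1c_{H'H'''}\,c^0_{H'''}, \tag{191 a}$$
so that
$$c^*_{H''}c_{H'} \cong c^{0*}_{H''}c^0_{H'}+\sum_{H'''}\left(\Delta_1c_{H'H'''}\,c^{0*}_{H''}c^0_{H'''}+\Delta_1c^*_{H''H'''}\,c^{0*}_{H'''}c^0_{H'}\right).$$
If now we put
$$c^0_{H'} = \sqrt{N^0_{H'}}\,e^{i\gamma_{H'}} \tag{191 b}$$
and average over the values of the initial phases $\gamma_{H'}$, etc., regarding
them as independent of each other and equally probable, we get
$$\overline{c^*_{H''}c_{H'}} = \delta_{H'H''}N^0_{H'}+\Delta_1c_{H'H''}N^0_{H''}+\Delta_1c^*_{H''H'}N^0_{H'},$$
or since $\Delta_1c^*_{H''H'} = -\Delta_1c_{H'H''}$,
$$\overline{c^*_{H''}c_{H'}} = \delta_{H'H''}N^0_{H'}+\Delta_1c_{H'H''}\left(N^0_{H''}-N^0_{H'}\right). \tag{192}$$
Substituting this in (191) and remembering that
$$\Delta_1c_{H'H''} = -\frac{2\pi i}{h}\int_0^t S_{H'H''}\,dt,$$
we get
$$\frac{dN_{H'}}{dt} = \frac{4\pi^2}{h^2}\sum_{H''}\left(S_{H''H'}\int_0^t S_{H'H''}\,dt+S_{H'H''}\int_0^t S_{H''H'}\,dt\right)\left(N^0_{H''}-N^0_{H'}\right),$$
that is,
$$\frac{dN_{H'}}{dt} = \sum_{H''}\Gamma_{H'H''}\left(N^0_{H''}-N^0_{H'}\right) \tag{192 a}$$
with
$$\Gamma_{H'H''} = \frac{4\pi^2}{h^2}\,\frac{d}{dt}\left|\int_0^t S_{H'H''}\,dt\right|^2, \tag{192 b}$$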
which is obviously nothing else but the probability (per unit time)
of a direct transition from the state $H'$ into $H''$ or vice versa.
Equation (192 a) could be obtained directly from the symmetry relation
$\Gamma_{H'H''} = \Gamma_{H''H'}$. It is easy to obtain higher approximations for $dN_{H'}/dt$,
taking account of combined transitions. This would not affect the form
of equations (192 a). Instead of (192 b) we should, however, obtain the
following expression for the transition probability:
^H ir —
0 0 0
(192c)
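The balance equations (192 a) lend themselves to a short numerical sketch. The populations and the symmetric table of transition probabilities below are hypothetical numbers in arbitrary units, and the function name is an assumption; the only property taken from the text is the form of (192 a) together with the symmetry of $\Gamma$.

```python
def population_rates(N0, gamma):
    """dN_i/dt = sum_j gamma[i][j] * (N0[j] - N0[i]) for a symmetric table gamma."""
    n = len(N0)
    return [sum(gamma[i][j] * (N0[j] - N0[i]) for j in range(n) if j != i)
            for i in range(n)]

N0 = [0.7, 0.2, 0.1]              # initial probabilities of three states
gamma = [[0.0, 0.5, 0.1],         # illustrative symmetric transition probabilities
         [0.5, 0.0, 0.3],
         [0.1, 0.3, 0.0]]
rates = population_rates(N0, gamma)
total_rate = sum(rates)
```

Because the table is symmetric, every gain of one state is the loss of another, so that the rates add up to zero and the total probability is conserved; the most populated state is depleted.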
VI
RELATIVISTIC REMODELLING AND MAGNETIC
GENERALIZATION OF THE WAVE MECHANICS
OF A SINGLE ELECTRON
25. Simplest Form of Relativistic Wave Mechanics
All the developments of the preceding chapters were based on
Schrödinger's wave equation for a single particle moving in an external field
of force with a given potential-energy function $U(x,y,z,t)$.
This equation, as we have seen in Chap. I, corresponds to the
pre-relativistic classical mechanics, which neglects the variation of the mass
of a particle with its velocity. In addition it does not take into account
magnetic forces, which depend not only upon the position of a particle
but also upon its velocity (being in fact proportional to the latter).
Our next problem will be to find the improved form of the fundamental
equation of wave mechanics for a single particle (which we
shall think of as an electron) that will take account both of the
variability of mass and of the magnetic forces.
It turns out that the two parts of this problem can be solved
simultaneously, at one stroke as it were, if in reforming the Schrödinger
equation we let ourselves be guided by the basic principle of the
relativity theory, namely, the equivalence of the space coordinates and
the time (multiplied by $ic$), which must be expressed by the symmetry
of all the fundamental equations of physics with respect to both, and
which entails the four-dimensional character of all physical quantities.
It should be mentioned that the same principle can be applied to the
problem of improving the equations of the classical pre-relativistic
theory and finding their relativistically correct pre-quantum form.
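The formal correspondence between the energy-momentum relation
of Newtonian mechanics
$$\frac{1}{2m}\left(g_x^2+g_y^2+g_z^2\right)+U-W = 0 \tag{193}$$
and the Schrödinger equation written in the form
$$\left[\frac{1}{2m}\left(p_x^2+p_y^2+p_z^2\right)+U+p_t\right]\psi = 0, \tag{193 a}$$
with
$$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x},\quad p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y},\quad p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z},\quad p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}, \tag{193 b}$$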
leads us straight back to that four-dimensional representation of physi
cal quantities, which is the formal content of the relativity theory. We
must, therefore, so modify our original equations that they assume a
symmetrical form with respect to the components of four-dimensional
vectors appearing therein.
If, as will be done in future, the time is specified in the usual way,
i.e. by the real quantity $t$ without the imaginary factor $ic$, this symmetry
will be slightly distorted by the appearance of the factor $-c^2$
or $-1/c^2$ in the product of the fourth components of any two vectors.
To begin with, we must fill up an important gap in the usual defini
tion of the momentum-energy vector
$$g_x = mv_x,\quad g_y = mv_y,\quad g_z = mv_z,\quad -g_t = W, \tag{193 c}$$
a gap which makes this definition inconsistent from the point of view
of the relativity theory and which limits its correspondence with the
operator-vector (193 b).
In Einstein's mechanics of a particle with rest mass $m_0$ we have,
corresponding to the components of the momentum, i.e.
$$mv_x = \frac{m_0v_x}{\sqrt{1-v^2/c^2}},\ \text{etc.},$$
as fourth component of the four-vector concerned, the ‘proper energy’
$$mc^2 = \frac{m_0c^2}{\sqrt{1-v^2/c^2}}.$$
Now the quantity $p_t$ in (193 b) represents, not this proper energy,
but the total energy $E = mc^2+U$ diminished by the constant rest-energy
$m_0c^2$. For the relativistic formulation of the laws of corpuscular
mechanics we must clearly add this constant to the energy $W$, i.e. we
must put
$$-g_t = E = W+m_0c^2 = mc^2+U.$$
In addition to this, we must regard the potential energy U as the fourth
component, i.e. as the ‘time-projection’, of a certain four-vector and
also take into account its space projection. This space projection G,
which obviously corresponds to the momentum and which, just as U,
can be an arbitrary function of the coordinates and the time, will be
called the potential momentum. In the—so far exclusively considered—
special case G = 0 the components of the force acting on the particle
reduce to the usual expressions $-\partial U/\partial x$, $-\partial U/\partial y$, $-\partial U/\partial z$. The question as
to the nature and the mathematical expression of the force due to the
vector function $\mathbf G$ will be considered later on. We are at present only
interested in the fact that, by the introduction of the ‘potential
momentum’, the quantities $g_x, g_y, g_z$ appearing in formulae (193 c) must
be defined as the components of the total momentum $m\mathbf v+\mathbf G$ just as the
quantity $-g_t$ denotes the total energy $mc^2+U$.
We obtain, therefore, instead of (193 c), the formulae
$$g_x = mv_x+G_x,\quad g_y = mv_y+G_y,\quad g_z = mv_z+G_z,\quad -g_t = mc^2+U. \tag{194}$$
The components of the ‘proper energy momentum vector’ are related
to one another, according to definition, by the relation
$$(mv_x)^2+(mv_y)^2+(mv_z)^2-\frac{1}{c^2}(mc^2)^2 = -m_0^2c^2 \tag{194 a}$$
(which is equivalent to the formula $m = \dfrac{m_0}{\sqrt{1-v^2/c^2}}$). In the case
$\mathbf G = 0$ this relation can be written in the form
$$(mv_x)^2+(mv_y)^2+(mv_z)^2-\frac{1}{c^2}(E-U)^2 = -m_0^2c^2.$$
In the limiting case of small velocities ($v/c \ll 1$) we can put
approximately $(mv)^2 \cong (m_0v)^2$
and
$$\frac{1}{c^2}(E-U)^2 = \frac{1}{c^2}(m_0c^2+W-U)^2 \cong m_0^2c^2+2m_0(W-U).$$
Thus the previous equation reduces to
$$(m_0v_x)^2+(m_0v_y)^2+(m_0v_z)^2+2m_0(U-W) = 0,$$
which is the classical energy-momentum equation (193). It should be
noticed that it expresses the ‘law of the conservation of energy’ when
$W$ (or $E$) is constant, which can only be the case when the function
$U$ is independent of the time (static field).
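The quality of this limiting approximation is easily checked numerically. The sketch below assumes a free particle ($U = 0$, $\mathbf G = 0$) and units in which $m_0 = c = 1$; the function names are hypothetical.

```python
import math

def relativistic_mass(m0: float, v: float, c: float) -> float:
    """m = m0 / sqrt(1 - v^2/c^2)."""
    return m0 / math.sqrt(1.0 - (v / c) ** 2)

m0, c = 1.0, 1.0        # illustrative units with c = 1
v = 0.01                # v/c << 1
m = relativistic_mass(m0, v, c)

# relation (194 a): (m v)^2 - (m c^2)^2 / c^2 + m0^2 c^2 should vanish exactly
invariant = (m * v) ** 2 - (m * c ** 2) ** 2 / c ** 2 + (m0 * c) ** 2

# kinetic energy W - U = m c^2 - m0 c^2 compared with the Newtonian value
exact = (m - m0) * c ** 2
newton = 0.5 * m0 * v ** 2
relative_error = abs(exact - newton) / newton
```

For $v/c = 0.01$ the Newtonian kinetic energy differs from the exact one by less than one part in ten thousand, the relative error being of the order of $v^2/c^2$.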
We see therefore that the equation
$$(g_x-G_x)^2+(g_y-G_y)^2+(g_z-G_z)^2-\frac{1}{c^2}(g_t+U)^2+m_0^2c^2 = 0, \tag{194 b}$$
which results from (194) and (194 a), represents the relativistic
generalization and refinement of the Newtonian relation (193).
From this equation we can go over to the corresponding fundamental
equation of the relativistic wave mechanics in the same way as in the
non-relativistic case, namely, by replacing the vector $g$ in (194 b) by
the corresponding operator-vector $p$ and equating to zero the result
obtained by the application of the resulting operator to a wave function
$\psi$. We thus get
$$D\psi = 0, \tag{195}$$
with
$$D = \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)^2+\left(\frac{h}{2\pi i}\frac{\partial}{\partial y}-G_y\right)^2+\left(\frac{h}{2\pi i}\frac{\partial}{\partial z}-G_z\right)^2-\frac{1}{c^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right)^2+m_0^2c^2. \tag{195 a}$$
In the case of ‘multiplication’ of expressions which, besides ordinary
quantities, also contain differential operators, the order of the factors
must remain unaltered. Thus the ‘product’ $\dfrac{\partial}{\partial x}G_x\psi$, where the operator
$\partial/\partial x$ is to be applied to the function $G_x\psi$ standing on its right side,
differs from the ‘product’ $G_x\dfrac{\partial}{\partial x}\psi$ by the additional term $\dfrac{\partial G_x}{\partial x}\psi$.
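The rule can be verified numerically with finite differences. In the sketch below the functions chosen for $G_x$ and $\psi$, the point $x_0$ and the step are arbitrary assumptions made only for illustration.

```python
import math

def derivative(f, x, step=1e-5):
    """Central finite-difference approximation of df/dx."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

def G(x):
    return x ** 2          # illustrative component G_x of the potential momentum

def psi(x):
    return math.sin(x)     # illustrative (real) wave function

x0 = 0.8
applied_to_product = derivative(lambda x: G(x) * psi(x), x0)  # d/dx acting on G*psi
applied_to_psi = G(x0) * derivative(psi, x0)                  # G times d(psi)/dx
additional_term = derivative(G, x0) * psi(x0)                 # (dG/dx) * psi
```

The two ‘products’ differ by just the additional term, to within the accuracy of the finite-difference approximation.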
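If we take this into consideration we obtain
$$\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)^2\psi = \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)\left(\frac{h}{2\pi i}\frac{\partial\psi}{\partial x}-G_x\psi\right)
= -\frac{h^2}{4\pi^2}\frac{\partial^2\psi}{\partial x^2}-\frac{h}{\pi i}G_x\frac{\partial\psi}{\partial x}-\frac{h}{2\pi i}\frac{\partial G_x}{\partial x}\psi+G_x^2\psi$$
and similar expressions for the other terms in the equation. Written
out in detail it runs, therefore, as follows:
$$\begin{aligned}
&\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}
-\frac{4\pi i}{h}\left(G_x\frac{\partial\psi}{\partial x}+G_y\frac{\partial\psi}{\partial y}+G_z\frac{\partial\psi}{\partial z}+\frac{U}{c^2}\frac{\partial\psi}{\partial t}\right) \\
&\quad-\frac{2\pi i}{h}\left(\frac{\partial G_x}{\partial x}+\frac{\partial G_y}{\partial y}+\frac{\partial G_z}{\partial z}+\frac{1}{c^2}\frac{\partial U}{\partial t}\right)\psi
-\frac{4\pi^2}{h^2}\left(G_x^2+G_y^2+G_z^2-\frac{1}{c^2}U^2+m_0^2c^2\right)\psi = 0.
\end{aligned} \tag{196}$$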
If the rest-mass vanishes ($m_0 = 0$), and if there are no external forces,
i.e. in the case of an Einstein photon, this equation reduces to the
equation
$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0$$
for electromagnetic waves. Further, it can easily be shown that when
$m_0 \neq 0$ and $\mathbf G = 0$ the relativistic wave equation (196) for the special
case of a harmonic vibration process (i.e. motion with a given constant
energy) agrees with the relativistic equation (48 b), § 13, Part I.
In fact, if we put $\partial U/\partial t = 0$ and $\psi = \psi^0(x,y,z)e^{-2\pi i\nu t}$, equation
(196) reduces to the form
$$\nabla^2\psi^0+\left(\frac{4\pi^2\nu^2}{c^2}-\frac{8\pi^2\nu}{hc^2}U+\frac{4\pi^2}{h^2c^2}U^2-\frac{4\pi^2}{h^2}m_0^2c^2\right)\psi^0 = 0,$$
or, with $\nu = E/h$,
$$\nabla^2\psi^0+\frac{4\pi^2}{h^2c^2}\left[(E-U)^2-m_0^2c^4\right]\psi^0 = 0, \tag{196 a}$$
which is identical with (48 b), Part I.
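We shall now investigate the relation of equation (196) to the equation
of motion of Einstein's mechanics. For this purpose we shall put
in (196)
$$\psi = \text{const.}\;e^{2\pi iS/h}. \tag{197}$$
After dividing the result by $(2\pi i/h)^2e^{2\pi iS/h}$ and dropping the terms which
contain the small factor $h/2\pi i$ we obtain the equation
$$\begin{aligned}
&\left(\frac{\partial S}{\partial x}\right)^2+\left(\frac{\partial S}{\partial y}\right)^2+\left(\frac{\partial S}{\partial z}\right)^2-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}\right)^2
-2\left(G_x\frac{\partial S}{\partial x}+G_y\frac{\partial S}{\partial y}+G_z\frac{\partial S}{\partial z}+\frac{U}{c^2}\frac{\partial S}{\partial t}\right) \\
&\quad+G_x^2+G_y^2+G_z^2-\frac{1}{c^2}U^2+m_0^2c^2 = 0,
\end{aligned}$$
which must obviously be the relativity form of the Hamilton-Jacobi
equation. It can be written more briefly in the form
$$(\nabla S-\mathbf G)^2-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}+U\right)^2+m_0^2c^2 = 0 \tag{197 a}$$
and can be obtained directly from (195) if we replace the vector $p$ in
$D$ by the vector $g$ defined according to the equations
$$g_x = \frac{\partial S}{\partial x},\quad g_y = \frac{\partial S}{\partial y},\quad g_z = \frac{\partial S}{\partial z},\quad g_t = \frac{\partial S}{\partial t}. \tag{197 b}$$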
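From these equations, which refer to the copy continuum of one
particle, one can easily go over to the relativistic equations of motion
of a given copy and, indeed, just as in the non-relativity theory, by
differentiation of equation (197 a) with regard to the coordinates and
the time, bearing in mind the following relations resulting from
(194) and (197 b):
$$\frac{\partial S}{\partial x}-G_x = mv_x,\ \dots,\qquad \frac{\partial S}{\partial t}+U = -mc^2.$$
If we differentiate (197 a) with regard to $x$ and divide by $2m$, we get
$$v_x\left(\frac{\partial^2S}{\partial x^2}-\frac{\partial G_x}{\partial x}\right)+v_y\left(\frac{\partial^2S}{\partial x\,\partial y}-\frac{\partial G_y}{\partial x}\right)+v_z\left(\frac{\partial^2S}{\partial x\,\partial z}-\frac{\partial G_z}{\partial x}\right)+\frac{\partial^2S}{\partial x\,\partial t}+\frac{\partial U}{\partial x} = 0,$$
or, by (197 b),
$$\frac{\partial g_x}{\partial x}\frac{dx}{dt}+\frac{\partial g_x}{\partial y}\frac{dy}{dt}+\frac{\partial g_x}{\partial z}\frac{dz}{dt}+\frac{\partial g_x}{\partial t} = v_x\frac{\partial G_x}{\partial x}+v_y\frac{\partial G_y}{\partial x}+v_z\frac{\partial G_z}{\partial x}-\frac{\partial U}{\partial x},$$
i.e.
$$\frac{dg_x}{dt} = \frac{\partial}{\partial x}(\mathbf v\cdot\mathbf G-U). \tag{198}$$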
The three-dimensional velocity vector v referring to a definite particle
is here no longer considered as an explicit function of the coordinates
and the time. Therefore, its partial derivatives with regard to x, y, z, t
must be put equal to zero.
The equations for $g_y$ and $g_z$ analogous to (198) will not be written
down here. The fourth equation runs
$$\frac{dg_t}{dt} = \frac{\partial}{\partial t}(\mathbf v\cdot\mathbf G-U).$$
If the potential functions $\mathbf G$ and $U$ are independent of the time
(static field of force) this equation reduces to $dg_t/dt = 0$, i.e. $-g_t = E$
= const. (law of the conservation of energy).
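If we split up $g_x$ in (198) into the sum of $mv_x$ and $G_x$, we then get
$$\begin{aligned}
\frac{dg_x}{dt} &= \frac{d}{dt}(mv_x)+\frac{\partial G_x}{\partial t}+\frac{\partial G_x}{\partial x}\frac{dx}{dt}+\frac{\partial G_x}{\partial y}\frac{dy}{dt}+\frac{\partial G_x}{\partial z}\frac{dz}{dt} \\
&= \frac{d}{dt}(mv_x)+\frac{\partial G_x}{\partial t}+v_x\frac{\partial G_x}{\partial x}+v_y\frac{\partial G_x}{\partial y}+v_z\frac{\partial G_x}{\partial z},
\end{aligned}$$
and consequently,
$$\frac{d}{dt}(mv_x) = -\frac{\partial U}{\partial x}-\frac{\partial G_x}{\partial t}+v_y\left(\frac{\partial G_y}{\partial x}-\frac{\partial G_x}{\partial y}\right)-v_z\left(\frac{\partial G_x}{\partial z}-\frac{\partial G_z}{\partial x}\right). \tag{198 a}$$
The right side of this equation must obviously represent the $x$-component
of the force $\mathbf f$ acting on the particle.
If we put
$$U = e\phi,\qquad \mathbf G = \frac{e}{c}\mathbf A \tag{199}$$
with
$$E_x = -\frac{\partial\phi}{\partial x}-\frac{1}{c}\frac{\partial A_x}{\partial t},\quad E_y = -\frac{\partial\phi}{\partial y}-\frac{1}{c}\frac{\partial A_y}{\partial t},\quad E_z = -\frac{\partial\phi}{\partial z}-\frac{1}{c}\frac{\partial A_z}{\partial t} \tag{199 a}$$
$$\left(\mathbf E = -\nabla\phi-\frac{1}{c}\frac{\partial\mathbf A}{\partial t}\right),$$
$$H_x = \frac{\partial A_z}{\partial y}-\frac{\partial A_y}{\partial z},\quad H_y = \frac{\partial A_x}{\partial z}-\frac{\partial A_z}{\partial x},\quad H_z = \frac{\partial A_y}{\partial x}-\frac{\partial A_x}{\partial y} \tag{199 b}$$
$$(\mathbf H = \operatorname{curl}\mathbf A),$$
we obtain
$$f_x = e\left(E_x+\frac{v_y}{c}H_z-\frac{v_z}{c}H_y\right),$$
or in vector notation
$$\mathbf f = e\left(\mathbf E+\frac{1}{c}\,\mathbf v\times\mathbf H\right), \tag{200}$$
and
$$\frac{d}{dt}(m\mathbf v) = \mathbf f. \tag{200 a}$$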
Here <j> and A are the scalar (electric) and the vector (magnetic)
potentials, E and H the electric and magnetic field strengths re
spectively, while e is the electric charge of the particle. A point-like
corpuscle can thus be defined by two constants only—its rest-mass and
its charge.
The vector defined by (200) represents, therefore, the external force
(so-called ‘Lorentz force’) acting on an electron or a proton which is
moving in an arbitrary electromagnetic field.
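Formula (200) translates directly into a few lines of code. The sketch below uses Gaussian units as in the text; the helper names and the sample field and velocity values are assumptions made for illustration.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(e, E, H, v, c):
    """f = e * (E + (v x H) / c), formula (200), Gaussian units."""
    vxH = cross(v, H)
    return tuple(e * (E[i] + vxH[i] / c) for i in range(3))

e, c = 1.0, 1.0                     # illustrative units
E = (1.0, 0.0, 0.0)                 # electric field strength
H = (0.0, 0.0, 2.0)                 # magnetic field strength
v = (0.3, 0.5, 0.0)                 # velocity of the particle
f = lorentz_force(e, E, H, v, c)
power = dot(f, v)                   # work done per unit time
```

The magnetic part of the force is perpendicular to the velocity, so that the work done per unit time reduces to $e\,\mathbf E\cdot\mathbf v$.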
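The time projection of the four-dimensional equation of motion, of
which (200 a) is the space projection, has the form
$$\frac{d(mc^2)}{dt} = e\,\mathbf E\cdot\mathbf v = e(E_xv_x+E_yv_y+E_zv_z). \tag{200 b}$$
We thus obtain the relation
$$\frac{d}{dt}(mc^2) = \mathbf f\cdot\mathbf v = \mathbf v\cdot\frac{d}{dt}(m\mathbf v),$$
from which at once follows the well-known formula
$$m = \frac{m_0}{\sqrt{1-v^2/c^2}}.$$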
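That the mass formula is compatible with the relation $d(mc^2) = \mathbf v\cdot d(m\mathbf v)$ can be checked numerically for motion along a straight line. The sketch assumes units with $m_0 = c = 1$ and uses finite differences with respect to the speed; the function names are hypothetical.

```python
import math

def mass(m0: float, v: float, c: float) -> float:
    """Velocity-dependent mass m = m0 / sqrt(1 - v^2/c^2)."""
    return m0 / math.sqrt(1.0 - (v / c) ** 2)

def derivative(f, x, step=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

m0, c, v = 1.0, 1.0, 0.6   # illustrative units with c = 1

# left side: d(m c^2)/dv ; right side: v * d(m v)/dv
energy_rate = derivative(lambda u: mass(m0, u, c) * c ** 2, v)
momentum_rate = v * derivative(lambda u: mass(m0, u, c) * u, v)
```

Both sides equal $m_0v\,(1-v^2/c^2)^{-3/2}$, which for the values chosen is 1.171875.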
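It still remains to find out the expressions, corresponding to the
relativity wave equation just considered, for the quantities $\rho$ (probability
density) and $\mathbf j$ (probability current density). This is done most
simply as follows (according to W. Gordon). We first introduce the
operators:
$$u_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}-\frac{e}{c}A_x,\quad u_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}-\frac{e}{c}A_y,\quad u_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}-\frac{e}{c}A_z,\quad u_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}+e\phi, \tag{201}$$
by means of which we can write the relativistic wave equation (195) in
the form
$$\left(u_x^2+u_y^2+u_z^2-\frac{1}{c^2}u_t^2+m_0^2c^2\right)\psi = 0. \tag{201 a}$$
We multiply this equation on the left by $\psi^*$ and subtract from it the
conjugate equation for $\psi^*$ multiplied by $\psi$. We then get, bearing in
mind (196), the formula:
$$\begin{aligned}
&\frac{\partial}{\partial x}\left(\psi^*\frac{\partial\psi}{\partial x}-\psi\frac{\partial\psi^*}{\partial x}\right)+\frac{\partial}{\partial y}\left(\psi^*\frac{\partial\psi}{\partial y}-\psi\frac{\partial\psi^*}{\partial y}\right)+\frac{\partial}{\partial z}\left(\psi^*\frac{\partial\psi}{\partial z}-\psi\frac{\partial\psi^*}{\partial z}\right)-\frac{1}{c^2}\frac{\partial}{\partial t}\left(\psi^*\frac{\partial\psi}{\partial t}-\psi\frac{\partial\psi^*}{\partial t}\right) \\
&\quad-\frac{4\pi i}{h}\left[\operatorname{div}(\mathbf G\,\psi\psi^*)+\frac{1}{c^2}\frac{\partial}{\partial t}(U\psi\psi^*)\right] = 0,
\end{aligned}$$
or
$$\frac{\partial}{\partial x}(\psi^*u_x\psi+\psi u_x^*\psi^*)+\frac{\partial}{\partial y}(\psi^*u_y\psi+\psi u_y^*\psi^*)+\frac{\partial}{\partial z}(\psi^*u_z\psi+\psi u_z^*\psi^*)-\frac{1}{c^2}\frac{\partial}{\partial t}(\psi^*u_t\psi+\psi u_t^*\psi^*) = 0.$$
This formula can be regarded as the equation of continuity if we define
the quantities
$$j_x = \frac{1}{2m_0}(\psi^*u_x\psi+\psi u_x^*\psi^*),\ \dots,\qquad \rho = -\frac{1}{2m_0c^2}(\psi^*u_t\psi+\psi u_t^*\psi^*) \tag{201 b}$$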
as the components of the current-density vector and the copy density
respectively. With regard to the first, this definition is the immediate
generalization of that given earlier. The expression for $\rho$, on the other
hand, seems to be completely different from $\psi\psi^*$ which has been used
so far. We can easily convince ourselves, however, by the example of
a conservative motion, that this difference is, in practice, quite
unimportant. Putting
$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = -E\psi \quad\text{and}\quad -\frac{h}{2\pi i}\frac{\partial\psi^*}{\partial t} = -E\psi^*,$$
we obtain
$$\rho = \frac{E-U}{m_0c^2}\,\psi\psi^*,$$
and hence
$$\rho = \psi\psi^*\left(1+\frac{W-U}{m_0c^2}\right), \tag{201 c}$$
i.e., in so far as the kinetic energy $W-U$ is small compared with $m_0c^2$,
$\rho \cong \psi\psi^*$.
With regard to the exact meaning of $\psi\psi^*$, one can easily show that
it corresponds to the rest density. This can be seen, for example, from
the relation $\rho/(\psi\psi^*) = m/m_0$ which is obtained from (201 c) if the mass
$m$ is introduced by means of the usual formula
$$m = m_0+\frac{W-U}{c^2}.$$
26. Magnetic Forces in the Approximate Non-Relativistic Wave
Mechanics
If in reducing equation (196) to the form corresponding to conservative
motion the potential momentum is supposed to be different from zero,
we get instead of (196 a) an equation which in vector form can be
written as follows:
$$\left[\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2-\frac{1}{c^2}(E-U)^2+m_0^2c^2\right]\psi = 0. \tag{202}$$
If the energy $W = E-m_0c^2$ is small compared with the rest-energy
$E_0 = m_0c^2$, which classically corresponds to motion with a velocity $v$
small compared with the velocity of light, then we can put with
sufficiently good approximation
$$(E-U)^2 = (E_0+W-U)^2 \cong E_0^2+2E_0(W-U),$$
neglecting the relatively small term $(W-U)^2$, and thus replace equation
(202), which is supposed to be exact, by the approximate equation
$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2+U-W\right]\psi = 0. \tag{202 a}$$
This equation corresponds to the classical equation of motion allowing
for the presence of magnetic forces (derived from the constant potential
G) but neglecting the relativistic variation of the mass with velocity.
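The size of the neglected term $(W-U)^2$ can be estimated numerically. The sketch below takes, as an assumed example, the rest-energy of the electron (about $5.11\times10^{5}$ electron-volts) and a kinetic energy of 13.6 electron-volts, a value typical of atomic electrons; the function names are hypothetical.

```python
def exact_square(E0: float, kinetic: float) -> float:
    """(E0 + W - U)^2 with kinetic = W - U."""
    return (E0 + kinetic) ** 2

def approximate_square(E0: float, kinetic: float) -> float:
    """E0^2 + 2*E0*(W - U), i.e. the square with (W - U)^2 dropped."""
    return E0 ** 2 + 2.0 * E0 * kinetic

E0 = 510998.95        # rest-energy of the electron in electron-volts (approximate)
kinetic = 13.6        # W - U in electron-volts, an illustrative atomic value
neglected = exact_square(E0, kinetic) - approximate_square(E0, kinetic)
relative_error = neglected / exact_square(E0, kinetic)
```

For these values the neglected term amounts to less than one part in a thousand million of the whole expression.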
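As a rule, the magnetic forces are relatively weak, so that the terms
of (202 a) which are quadratic in $\mathbf G$ can be neglected compared with
the linear terms. With this condition, equation (202 a) reduces to the
still simpler form
$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\right)^2\nabla^2-\frac{h}{4\pi im_0}(\nabla\cdot\mathbf G+\mathbf G\cdot\nabla)+U-W\right]\psi = 0.$$
Now we have $\nabla\cdot\mathbf G\psi = \operatorname{div}\mathbf G\psi = \mathbf G\cdot\nabla\psi+\psi\operatorname{div}\mathbf G$.
It is well known further that in the case of a static field the divergence
of the magnetic potential $\mathbf A$ vanishes, so that we have $\operatorname{div}\mathbf G = 0$.
The preceding equation can therefore be written in the form
$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\,\mathbf G\cdot\nabla+U-W\right]\psi = 0. \tag{202 b}$$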
So far we have been making perfectly permissible approximations. We
are now going to generalize the preceding approximate equations for
the case of non-conservative motion (in a static or non-static field),
in the same way as was done before with $\mathbf G = 0$, namely, by replacing
the energy $W$ by the operator $-p_t$ (or $-p_t-m_0c^2$; the constant term
$m_0c^2$ is immaterial in this case because it is absorbed by the potential
energy).
We thus obtain the equations
$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\,\mathbf G\cdot\nabla+U+p_t\right]\psi = 0 \tag{203}$$
for weak magnetic fields or
$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf G\right)^2+U+p_t\right]\psi = 0 \tag{203 a}$$
for strong fields; these can be considered as the generalization of
Schrödinger's equation (193 a) for the case of the presence of magnetic
forces, with neglect of the relativistic variation of mass with velocity.
The transition from equations (202 a) and (202 b) to (203) and (203 a)
is certainly an illogical step, which, moreover, is in contradiction with
the results arrived at in the preceding section. For if equations (202 a)
and (202 b) are permissible approximations of equation (202), which is
supposed to be exact, referring to the case of motion with a definite
energy, equations (203) or (203 a) cannot be considered as an approximation,
in the strict sense of the word, to the general equation (196). In
fact, the latter involves a second derivative of $\psi$ with regard to the time,
which we are not entitled to drop or to replace by a first derivative
multiplied by a constant factor, unless the dependence of $\psi$ upon the
time is given by the factor $e^{-i2\pi Et/h}$, corresponding to a motion with
the constant energy $E$.
We have here an approximation of a kind similar to that which is
constituted by the Hamilton-Jacobi equation with respect to
Schrödinger's equation for the function $S = (h/2\pi i)\log\psi$: in the latter case,
however, it is the second derivatives with regard to the space
coordinates and not to the time which have to be dropped.
The preceding consideration does not, however, invalidate equations
(203) and (203 a) as good approximations to the truth within a certain
range corresponding to a negligible variation of the mass with velocity.
Apart from the fact that the validity of the relativistic equation (196)
still remains to be proved (and we shall see later that, as a matter of
fact, the contrary can be proved), equations (203) and (203 a) represent
a very natural generalization of Schrödinger’s equation for the
presence of magnetic forces, and must therefore describe the motion
affected by such forces just as well as Schrödinger’s equation describes
a motion unaffected by the latter.
An important advantage of the ‘approximate’ equations (203) and
(203 a) over the ‘exact’ equation (196) consists in the fact that they
fit into the general scheme of the operator theory developed on the
basis of Schrödinger’s equation, since they can be written in the same
form, namely,
$$(H + p_t)\psi = 0, \qquad (204)$$
where the Hamiltonian or energy operator $H$ must be defined by the
generalized formula
$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla - \mathbf{G}\right)^2 + U \qquad (204\,a)$$
or
$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2 - \frac{h}{2\pi m_0 i}\,\mathbf{G}\cdot\nabla + U. \qquad (204\,b)$$
Equation (196), since it contains the square of the operator $p_t$, cannot
be written in the form (204), unless we assume that it is possible
to extract square roots of operators in the same way as of ordinary
numbers and succeed in finding an equation linear with regard to $p_t$ and
actually equivalent to (196).
Leaving this question till a later section, we shall now indicate briefly
the principal modifications of the general theory, developed in the pre
ceding chapters, which are necessitated by the generalized form of the
Hamiltonian operator (204) or (204 a).
First of all, we must notice that this operator is complex (which does
not prevent it from representing a real quantity, just as the operator
$\mathbf{p} = \frac{h}{2\pi i}\nabla$ does). We must distinguish therefore the operator $H$ from
the conjugate complex operator $H^*$, which determines the conjugate
complex wave function $\psi^*$ by the equation
$$(H^* - p_t)\psi^* = 0. \qquad (204\,c)$$
Multiplying this equation by $\psi$ and subtracting it from equation
(204) multiplied by $\psi^*$, we get
$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0$$
with the old (non-relativistic) expression $\psi\psi^*$ for $\rho$ and the expression
$$\mathbf{j} = \frac{1}{2m_0}\left[\psi^*\left(\frac{h}{2\pi i}\nabla - \mathbf{G}\right)\psi + \psi\left(-\frac{h}{2\pi i}\nabla - \mathbf{G}\right)\psi^*\right] \qquad (205)$$
for the current density. This expression turns out to be the same for
the two Hamiltonians (204 a) and (204 b), and coincides with the expression
derived above from the ‘exact’ relativistic theory [cf. the first
equation (201 b)].
Equation (205) can obviously be rewritten in the form
$$\mathbf{j} = \frac{1}{m_0}\,\rho\,R(\nabla S - \mathbf{G}) = \frac{1}{m_0}\,\rho\,[\nabla R(S) - \mathbf{G}], \qquad (205\,a)$$
where
$$S = \frac{h}{2\pi i}\log\psi$$
and $R(S)$ is its real part. In the approximation corresponding to the
classical (Newtonian) theory of the motion of an electron in an
electromagnetic field, $S$ is the action function and its gradient $\nabla S$ is the total
momentum $\mathbf{g}$. The difference $\nabla S - \mathbf{G}$ thus reduces to the proper
momentum $m_0\mathbf{v}$ and the vector $\mathbf{j}$ reduces to the product $\rho\mathbf{v}$, just as
in the absence of magnetic forces, as, of course, is to be expected.
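The complex character of the operator $H$ necessitates the revision of
some of the properties of its characteristic functions, which were
established on the assumption that $H$ was real. This refers, in the first
place, to the orthogonality property which was deduced from the
self-adjointness of $H$, i.e. from the formula
$$\int (f_1 H f_2 - f_2 H f_1)\,dV = 0.$$
Now in the general case of a complex $H$ defined by (204 a) or (204 b),
this formula does not hold and must be replaced by
$$\int (f_1 H f_2 - f_2 H^* f_1)\,dV = 0. \qquad (206)$$
We have, in fact, according to either one of the two definitions of $H$,
$$f_1 H f_2 - f_2 H^* f_1 = -\frac{h^2}{8\pi^2 m_0}(f_1\nabla^2 f_2 - f_2\nabla^2 f_1) - \frac{h}{2\pi m_0 i}(f_1\,\mathbf{G}\cdot\nabla f_2 + f_2\,\mathbf{G}\cdot\nabla f_1)$$
$$= -\frac{h^2}{8\pi^2 m_0}\operatorname{div}(f_1\nabla f_2 - f_2\nabla f_1) - \frac{h}{2\pi m_0 i}\,\mathbf{G}\cdot\nabla(f_1 f_2),$$
or, so long as $\operatorname{div}\mathbf{G} = 0$,
$$f_1 H f_2 - f_2 H^* f_1 = \operatorname{div}\mathbf{f}_{12}, \qquad (206\,a)$$
where
$$\mathbf{f}_{12} = -\frac{h^2}{8\pi^2 m_0}(f_1\nabla f_2 - f_2\nabla f_1) - \frac{h}{2\pi m_0 i}\,\mathbf{G} f_1 f_2. \qquad (206\,b)$$
If, therefore, the functions $f_1$ and $f_2$ vanish sufficiently rapidly at infinity
(so that the integral $\int f_1 f_2\,dV$ converges), we must have equation (206).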
Putting, in particular, $f_1 = \psi^*_{H'}$ and $f_2 = \psi_{H''}$, where $H'$ and $H''$ are
two different (real) characteristic values of $H$, we get
$$\int (\psi^*_{H'} H \psi_{H''} - \psi_{H''} H^* \psi^*_{H'})\,dV = (H'' - H')\int \psi^*_{H'}\psi_{H''}\,dV = 0,$$
whence
$$\int \psi^*_{H'}\psi_{H''}\,dV = 0 \qquad (H' \neq H''),$$
as before.
It should be mentioned that in the case of a real $H$ (i.e. in the absence
of magnetic forces), the characteristic functions $\psi_{H'}$, neglecting the
time-factor $e^{-i2\pi H't/h}$, can always be defined to be real, i.e. to have real
amplitudes $\psi^0_{H'}(x, y, z)$, while in the case of a complex $H$ these amplitudes
are complex. The orthogonality relation holds therefore only in the
above form, and not in the form
$$\int \psi_{H'}\psi_{H''}\,dV = 0 \quad\text{or}\quad \int \psi^*_{H'}\psi^*_{H''}\,dV = 0$$
in which it can be expressed if $H$ is real.
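It should also be mentioned that the property of self-adjointness
expressed by equation (206 a) refers not only to the operator $H$ but
to any operator which represents a real quantity, i.e. which is a real
function of the coordinates and the elementary operators $p_x$, $p_y$, $p_z$,
or of the vector-operator $\mathbf{p} = \frac{h}{2\pi i}\nabla$. This can easily be shown with
the help of the relations
$$f_1\,p_x^{2n} f_2 - f_2\,p_x^{2n} f_1 = \frac{\partial}{\partial x} f_{12},$$
where
$$f_{12} = \left(\frac{h}{2\pi i}\right)^{2n}\left[f_1\frac{\partial^{2n-1} f_2}{\partial x^{2n-1}} - \frac{\partial f_1}{\partial x}\frac{\partial^{2n-2} f_2}{\partial x^{2n-2}} + \frac{\partial^2 f_1}{\partial x^2}\frac{\partial^{2n-3} f_2}{\partial x^{2n-3}} - \dots - \frac{\partial^{2n-1} f_1}{\partial x^{2n-1}} f_2\right],$$
and
$$f_1\,p_x^{2n+1} f_2 + f_2\,p_x^{2n+1} f_1 = \frac{\partial}{\partial x} f_{12},$$
with
$$f_{12} = \left(\frac{h}{2\pi i}\right)^{2n+1}\left[f_1\frac{\partial^{2n} f_2}{\partial x^{2n}} - \frac{\partial f_1}{\partial x}\frac{\partial^{2n-1} f_2}{\partial x^{2n-1}} + \frac{\partial^2 f_1}{\partial x^2}\frac{\partial^{2n-2} f_2}{\partial x^{2n-2}} - \dots + \frac{\partial^{2n} f_1}{\partial x^{2n}} f_2\right],$$
in conjunction with
$$(p_x^{2n})^* = p_x^{2n}, \qquad (p_x^{2n+1})^* = -p_x^{2n+1}.$$
We can thus say that not only the energy operator, but any operator
$F$ representing a real physical quantity is self-adjoint in the sense of
the equation
$$f_1 F f_2 - f_2 F^* f_1 = \operatorname{div}\mathbf{f}_{12},$$
and that the characteristic functions of this operator are orthogonal
with regard to each other in the same sense as the characteristic
functions of the energy operator.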
Another result which was associated with the reality of $H$ and its
self-adjointness in the old sense was the possibility of replacing the
differential equation for its characteristic functions and values
$$(H - H')\psi = 0$$
by the variational equation
$$\delta\int \psi^* H \psi\,dV = 0,$$
with the condition $\int \psi^*\psi\,dV = \text{const.}\ (= 1)$.
Since, in the case of a complex $H$, the function $\psi^*$ no longer satisfies
the same equation as $\psi$, the preceding results seem to require a
modification.
As a matter of fact, however, no such modification is needed, for we
have
$$\delta\int\psi^* H\psi\,dV = \int\delta\psi^*\,H\psi\,dV + \int\psi^* H\,\delta\psi\,dV$$
and, with the help of (206),
$$\int\psi^* H\,\delta\psi\,dV = \int\delta\psi\,H^*\psi^*\,dV,$$
that is,
$$\delta\int\psi^* H\psi\,dV = \int\delta\psi^*\,H\psi\,dV + \int\delta\psi\,H^*\psi^*\,dV = \delta\int\psi H^*\psi^*\,dV.$$
The variational principle thus preserves its usual form
$$\delta\bar H = 0, \qquad E = \text{const.}\ (= 1)$$
with $\bar H = \int\psi^* H\psi\,dV = \int\psi H^*\psi^*\,dV$
and $E = \int\psi^*\psi\,dV$.
As has been already pointed out, the two equations $\delta\bar H = 0$, $\delta E = 0$
can be replaced by the single one $\delta\bar H = 0$ if $\bar H$ is defined by the formula
$$\bar H = \int\psi^* H\psi\,dV \Big/ \int\psi^*\psi\,dV \quad \left(\text{or } \int\psi H^*\psi^*\,dV \Big/ \int\psi^*\psi\,dV\right),$$
without any normalizing condition for the function $\psi$.
It should be noticed further that the two equations $\delta\bar H = 0$, $\delta E = 0$
can be split up, as it were, into the following two pairs of equations:
$$\int\delta\psi^*\,H\psi\,dV = 0, \qquad \int\delta\psi^*\,\psi\,dV = 0 \qquad (207)$$
and
$$\int\delta\psi\,H^*\psi^*\,dV = 0, \qquad \int\delta\psi\,\psi^*\,dV = 0,$$
the first pair being equivalent to the equation $(H - H')\psi = 0$ and the
second to the equation $(H^* - H')\psi^* = 0$.
The preceding result can easily be generalized for non-stationary
motion, the equation $\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ being equivalent to
$$\int\delta\psi^*\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi\,dV = 0, \qquad (207\,a)$$
and the conjugate complex equation $\left(H^* - \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^* = 0$ to
$$\int\delta\psi\left(H^* - \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^*\,dV = 0.$$
So long as $\delta\psi^*$ is quite arbitrary, the variational equation (207 a) is nothing
but a transcription of the ordinary differential equation of motion. The
same variational equation is obtained, however, as the condition for
the error involved to be permanently small,† when $\psi$ is replaced by an
approximate function of some relatively simple form $\psi_1$.
At some initial moment $t = t_0$ the form of the function $\psi$ can be
fixed quite arbitrarily. We can accordingly identify $\psi(t_0)$ with $\psi_1(t_0)$.
The approximate function $\psi_1$ will then satisfy not the exact equation
$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ but an equation of the form
$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi_1 = \psi_2. \qquad (207\,b)$$
Our problem is thus reduced to that of making the additional term
$\psi_2(t)$ as small as possible for any time $t$. Taking $t = t_0 + dt$, we get from
the preceding equation
$$\psi_1(t_0 + dt) = \psi_1(t_0) - \frac{2\pi i}{h}H\psi_1(t_0)\,dt + \frac{2\pi i}{h}\psi_2(t_0)\,dt.$$
Now if the function $\psi_1(t_0 + dt)$ is altered by a small amount $\delta\psi_1(t_0 + dt)$,
the function $\psi_1(t_0)$ remaining the same as before, the corresponding
variation of the correction term $\psi_2(t_0)$ will be
$$\delta\psi_2(t_0) = \frac{h}{2\pi i\,dt}\,\delta\psi_1(t_0 + dt).$$
The condition that $\psi_2(t_0)$ should be as small as possible for all values
of the coordinates can be stated as the minimum condition for the
integral $\int\psi_2^*(t_0)\psi_2(t_0)\,dV$ and is equivalent accordingly to the equation
$$\int\delta\psi_2^*(t_0)\,\psi_2(t_0)\,dV = 0.$$
Replacing here $\delta\psi_2^*(t_0)$ by $-\frac{h}{2\pi i\,dt}\,\delta\psi_1^*(t_0 + dt)$, we get
$$\int\delta\psi_1^*(t_0 + dt)\,\psi_2(t_0)\,dV = 0,$$
or passing to the limit $dt \to 0$ and dropping the index 0 (since the above
results must hold for all values of $t$)
$$\int\delta\psi_1^*(t)\,\psi_2(t)\,dV = 0.$$
This equation means that the correction $\psi_2(t)$ must be orthogonal to
any variation of the approximate function $\psi_1(t)$. Hence $\psi_2$ can be
eliminated from the equation (207 b) if the latter is multiplied by $\delta\psi_1^*$
and integrated over the coordinates, thus giving equation (207 a) with
the exact function $\psi$ replaced by the approximate one $\psi_1$.
† The argument presented below is taken from Dirac’s appendix to the Russian edition
of his book, The Principles of Quantum Mechanics.
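The expressions $\int\psi^* H\psi\,dV$ and $\int\psi H^*\psi^*\,dV$ for $\bar H$ can easily be put
in the symmetrical form
$$\bar H = \int\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\psi - \mathbf{G}\psi\right)\cdot\left(-\frac{h}{2\pi i}\nabla\psi^* - \mathbf{G}\psi^*\right) + U\psi\psi^*\right]dV \qquad (208)$$
if $H$ is defined by (204 a), that is
$$\bar H = \int\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\psi\right)\cdot\left(-\frac{h}{2\pi i}\nabla\psi^*\right) - \mathbf{G}\cdot\mathbf{j} - \frac{G^2}{2m_0}\rho + U\rho\right]dV, \qquad (208\,a)$$
where $\rho = \psi\psi^*$ is the density of probability and $\mathbf{j}$ the probability
current density as defined by (205). Using the approximate expression
(204 b) for $H$, we get, instead of (208),
$$\bar H = \int\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\psi\right)\cdot\left(-\frac{h}{2\pi i}\nabla\psi^*\right) - \mathbf{G}\cdot\mathbf{j} + U\rho\right]dV, \qquad (208\,b)$$
which coincides with (208 a), apart from the term in $G^2$, provided that, in the
above definition of $\mathbf{j}$, we put $\mathbf{G} = 0$, thus coming back to the old definition
of the current density
$$\mathbf{j} = \frac{h}{4\pi i m_0}(\psi^*\nabla\psi - \psi\nabla\psi^*).$$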
So long as the reality of the characteristic values of the operator $H$
and the mutual orthogonality of its characteristic functions is unaffected
by that change of it which corresponds to the presence of a magnetic
field, we can preserve, without any modification, all the results of the
preceding chapters concerning the matrix representation of physical
quantities ‘from the point of view’ of $H$, the transformation theory and
the perturbation theory.
If the magnetic (or, in general, the electromagnetic) field specified by
the vector $\mathbf{G}$ is relatively weak (compared with the field of force defined
by the potential energy $U$), then it can itself be treated as a perturbation.
Subtracting from the Hamiltonian (204 b), which in future may
be denoted by $K$, the usual Hamiltonian
$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2 + U,$$
which corresponds to the absence of the ‘perturbing’ forces specified
by $\mathbf{G}$, we get the following expression for the perturbation energy:
$$S = -\frac{h}{2\pi m_0 i}\,\mathbf{G}\cdot\nabla \qquad (209)$$
or
$$S = \frac{ieh}{2\pi m_0 c}\,\mathbf{A}\cdot\nabla, \qquad (209\,a)$$
where $\mathbf{A}$ is the vector potential corresponding to $\mathbf{G}$ $(= e\mathbf{A}/c)$ and $e$ the
electric charge of the particle under consideration. Putting $\frac{h}{2\pi i}\nabla = \mathbf{p}$,
we can rewrite (209) in the form
$$S = -\frac{e}{m_0 c}\,\mathbf{A}\cdot\mathbf{p}. \qquad (209\,b)$$
The simplest application of this formula is provided by the special
case of the action of a permanent homogeneous magnetic field (Zeeman
effect). Denoting the field strength by $\mathfrak{H}$ we can, in this case, put
$$\mathbf{A} = \tfrac{1}{2}\,\mathfrak{H}\times\mathbf{r}, \qquad (210)$$
where $\mathbf{r}$ is the radius vector of the particle. This gives in fact
$$\operatorname{curl}\mathbf{A} = \mathfrak{H},$$
as can be verified most simply with the help of the coordinate
representation.
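The coordinate check of (210) can also be carried out numerically. The following Python sketch is only an illustration: the function names and the sample values of the field and of the point are assumptions, not part of the text. It forms $\mathbf{A} = \tfrac{1}{2}\,\mathfrak{H}\times\mathbf{r}$ and evaluates its curl by central differences; since $\mathbf{A}$ is linear in the coordinates, the differences are exact up to rounding.

```python
def cross(a, b):
    """Vector product a x b of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def vector_potential(field, r):
    """A = (1/2) field x r, equation (210), for a homogeneous field."""
    return tuple(0.5 * component for component in cross(field, r))


def curl(vector_function, r, step=1e-3):
    """Curl of vector_function at the point r, by central differences."""
    def partial(i, j):
        # derivative of component i with respect to coordinate j
        forward = list(r)
        backward = list(r)
        forward[j] += step
        backward[j] -= step
        return (vector_function(forward)[i]
                - vector_function(backward)[i]) / (2.0 * step)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


# Arbitrary sample values, chosen only for the check (assumptions).
field = (0.3, -1.2, 2.0)
point = (0.7, 0.1, -0.4)
result = curl(lambda r: vector_potential(field, r), point)
print(result)  # should agree with `field` up to rounding
```

The same `curl` helper could be applied to any other vector potential differing from (210) by a gradient, which would leave the result unchanged.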
Substituting (210) in (209 b), we get
$$S = -\frac{e}{2m_0 c}\,(\mathfrak{H}\times\mathbf{r})\cdot\mathbf{p},$$
which can be rewritten in the form
$$S = -\frac{e}{2m_0 c}\,\mathfrak{H}\cdot(\mathbf{r}\times\mathbf{p}).$$
Now the operator
$$\mathbf{r}\times\mathbf{p} = \mathbf{M}$$
obviously represents the angular momentum of the electron about the
central point (nucleus), from which the radius vector $\mathbf{r}$ is supposed to
be drawn. We thus get
$$S = -\frac{e}{2m_0 c}\,\mathfrak{H}\cdot\mathbf{M},$$
or
$$S = -\mathfrak{H}\cdot\boldsymbol{\mu}, \qquad (210\,a)$$
where
$$\boldsymbol{\mu} = \frac{e}{2m_0 c}\,\mathbf{M} \qquad (210\,b)$$
can be defined as the operator representing the magnetic moment due
to the rotation of the electron about the (fixed) nucleus.
This definition follows from the fact that (210 a) has exactly the same
form as the classical expression for the energy of a particle with a
(constant) magnetic moment $\boldsymbol{\mu}$ in a homogeneous magnetic field $\mathfrak{H}$.
If the unperturbed motion is a motion in a central field of force, so
that the vector $\mathbf{M}$ is constant, the vector $\boldsymbol{\mu}$ will also be a constant.
Its characteristic values are equal to those of $\mathbf{M}$ multiplied by $e/2m_0c$.
Taking the $z$-component of $\mathbf{M}$ and remembering that, with suitably
chosen characteristic functions (those which contain the azimuthal angle
$\phi$ only through the factor $e^{im\phi}$), the characteristic
values of $M_z$ are equal to integral multiples of $h/2\pi$, we get for the
characteristic values of $\mu_z$ integral multiples of the quantity
$$\mu_1 = \frac{eh}{4\pi m_0 c},$$
which is called the Bohr magneton (since it is equal to the magnetic
moment of a one-quantum Bohr orbit).
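As a rough numerical illustration of the magnitude of $\mu_1$, the formula can be evaluated in Gaussian (CGS) units. The values of the constants below are modern reference values and not figures taken from the text, and the function names are purely illustrative.

```python
import math

# Modern CGS (Gaussian) values; assumptions, not quoted from the text.
E_CHARGE = 4.80320e-10     # magnitude of the electronic charge, esu
PLANCK_H = 6.62607e-27     # Planck's constant h, erg s
M_ELECTRON = 9.10938e-28   # rest-mass m0 of the electron, g
C_LIGHT = 2.99792458e10    # velocity of light c, cm/s


def bohr_magneton():
    """mu_1 = e h / (4 pi m0 c), in erg per gauss."""
    return E_CHARGE * PLANCK_H / (4.0 * math.pi * M_ELECTRON * C_LIGHT)


def mu_z_values(l):
    """Magnitudes m * mu_1 of the characteristic values of mu_z, m = -l..l."""
    return [m * bohr_magneton() for m in range(-l, l + 1)]


print(bohr_magneton())   # about 9.27e-21 erg/gauss
print(mu_z_values(1))    # three values, for m = -1, 0, +1
```

The list returned by `mu_z_values` uses the magnitude of the charge; the sign of $\mu_z$ for a given $m$ follows the sign of $e$.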
If the magnetic field is parallel to the $z$-axis, or rather if the latter
is chosen in the direction of the magnetic field, then the
additional energy of the perturbed states of motion, compared with that
of the corresponding unperturbed states, can easily be shown to be
equal, apart from the sign, to the product of $\mathfrak{H}$ by the characteristic values of $\mu_z$. In fact
the non-diagonal matrix elements of the perturbation energy
$$S_{nlm;\,n'l'm'} = -\mathfrak{H}\,(\mu_z)_{nlm;\,n'l'm'}$$
with regard to the functions $\psi_{nlm}$ and $\psi_{n'l'm'}$ all vanish (which means
that the perturbation is of such a kind as to introduce no coupling
forces between the pendulums representing different states), so that
the additional values of the energy $\Delta H'$ reduce to the diagonal elements
of the perturbation matrix. We thus have, in the first approximation,
$$\Delta_1 H' = S_{nlm;\,nlm} = -\mathfrak{H}\,(\mu_z)_{nlm},$$
or
$$\Delta_1 H' = -\frac{\mathfrak{H}eh}{4\pi m_0 c}\,m. \qquad (211)$$
This splitting up of the energy-levels by the magnetic field, or rather
the corresponding splitting of the spectral lines due to transitions
between energy-levels with different values of the axial quantum number
$m$, is called the ‘normal’ Zeeman effect. Since only such transitions
occur for which $\Delta m = 0$, $+1$, or $-1$, the normal Zeeman effect consists
in the splitting up of each line into three lines, one of which coincides
with the original line (corresponding to the absence of the magnetic
field), while the other two are displaced in opposite directions by the
amount
$$\Delta\nu = -\frac{e\mathfrak{H}}{4\pi m_0 c} \qquad (211\,a)$$
(a positive quantity for the electron, whose charge $e$ is negative).
The undisplaced line corresponds to harmonic oscillations of the electron
parallel to the magnetic field, while the displaced ones correspond to
circular motion in the one or the other sense about the direction of this
field. The relative intensities of these three lines for the case $\Delta l = +1$
and $\Delta l = -1$ have been determined in § 13, Chap. III.
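To give an idea of the size of the normal Zeeman splitting, the displacement (211 a) can be evaluated for a given field. The sketch below is an illustration under stated assumptions: it uses modern CGS values of the constants and the magnitude of the electronic charge, and the sample line frequency and field strength are arbitrary.

```python
import math

# Modern CGS (Gaussian) values; assumptions, not quoted from the text.
E_CHARGE = 4.80320e-10     # magnitude of the electronic charge, esu
M_ELECTRON = 9.10938e-28   # rest-mass m0 of the electron, g
C_LIGHT = 2.99792458e10    # velocity of light c, cm/s


def zeeman_shift(field_gauss):
    """Magnitude of the displacement |e| H / (4 pi m0 c), in cycles per second."""
    return E_CHARGE * field_gauss / (4.0 * math.pi * M_ELECTRON * C_LIGHT)


def zeeman_triplet(nu0, field_gauss):
    """Frequencies of the three components for Delta m = -1, 0, +1."""
    shift = zeeman_shift(field_gauss)
    return (nu0 - shift, nu0, nu0 + shift)


nu0 = 5.0e14                       # hypothetical line frequency, 1/s
lines = zeeman_triplet(nu0, 1.0e4)  # hypothetical field of 10^4 gauss
print(zeeman_shift(1.0e4))          # about 1.4e10 per second
print(lines)
```

Even for a field of $10^4$ gauss the displacement is thus only a few parts in $10^5$ of an optical frequency.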
We shall not discuss the Zeeman effect in greater detail here, but
shall postpone this question until a later section where it will be dealt
with in connexion with the complications arising as a consequence of
the hitherto ignored ‘intrinsic’ magnetic moment of the electron
(‘anomalous’ Zeeman effect).
Although the preceding results have been obtained to a first
approximation by the perturbation method, they can easily be shown to
hold exactly, so long as the action of the magnetic field is represented
by the (approximate) operator (209) or (210 a).
We have, in fact, denoting by $\phi$ the azimuthal angle about the $z$-axis
(supposed to coincide with the direction of the magnetic field),
$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi},$$
and consequently
$$S = \frac{h\Delta\nu}{i}\frac{\partial}{\partial\phi}, \qquad (211\,b)$$
where $\Delta\nu$ is given by (211 a).
If we now compare the exact equation of the electron’s motion
$$(H + S - K')\chi^0_{K'} = 0$$
with the equation
$$(H - H')\psi^0_{H'} = 0,$$
corresponding to the absence of the magnetic field, we easily find that
they can be satisfied by the same functions
$$\chi^0_{K'} = \psi^0_{H'}$$
if we put
$$K' - H' = \Delta H' = h\Delta\nu\,m$$
in accordance with (211). Thus, in the present case, we have
$$\Delta H' = \Delta_1 H'.$$
We shall consider, in conclusion, another method of dealing with the
effect of a homogeneous magnetic field which is very instructive in that
it brings to light the similarity between the wave-mechanical and the
classical theory.
We shall write the equation of the electron’s motion in the general
form
$$\left(H + S + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\chi = 0 \qquad (212)$$
and shall introduce, instead of the original coordinate system $x, y, z$,
another system, $x', y', z'\,(= z)$, rotating about the common (fixed) $z$-axis
with a constant angular velocity $\omega$. The azimuthal angle $\phi'$ with respect
to this rotating system is thus connected with $\phi$ by the formula
$$\phi' = \phi - \omega t, \qquad (212\,a)$$
whence it follows that
$$\left(\frac{\partial}{\partial\phi}\right)_t = \left(\frac{\partial}{\partial\phi'}\right)_t, \qquad \left(\frac{\partial}{\partial t}\right)_{\phi} = \left(\frac{\partial}{\partial t}\right)_{\phi'} - \omega\frac{\partial}{\partial\phi'}. \qquad (212\,b)$$
Now the partial derivative with respect to $t$ in equation (212)
obviously refers to a constant value of $\phi$. Taking account of (212 b), we
can therefore rewrite this equation in the form
$$\left(H + S - \frac{h\omega}{2\pi i}\frac{\partial}{\partial\phi'} + \frac{h}{2\pi i}\frac{\partial'}{\partial t}\right)\chi = 0, \qquad (213)$$
where $\partial'/\partial t$ denotes the value of the partial derivative with respect to
$t$ taken for a constant value of $\phi'$. This equation can obviously be
regarded as describing the motion of the electron with respect to the
rotating coordinate system.
Substituting in it the expression (211 b) for $S$, we get, since
$\partial/\partial\phi = \partial/\partial\phi'$,
$$\left[H + \frac{h}{2\pi i}(2\pi\Delta\nu - \omega)\frac{\partial}{\partial\phi'} + \frac{h}{2\pi i}\frac{\partial'}{\partial t}\right]\chi = 0. \qquad (213\,a)$$
This equation reduces to that which describes the motion of the electron
with respect to the fixed axes in the absence of the magnetic field,
with the fixed axes replaced by the rotating ones, if the angular
velocity $\omega$ is defined by
$$\omega = 2\pi\Delta\nu, \qquad (213\,b)$$
i.e. if the frequency of revolution is just equal to $\Delta\nu$.
This result is identical with that which is obtained with the help of
classical mechanics, where it is interpreted as a precession of the electron’s
orbit about the direction of the magnetic field with the angular velocity
$\omega = 2\pi\Delta\nu$ (Larmor’s precession).
The particular solutions of the equation
$$\left(H + \frac{h}{2\pi i}\frac{\partial'}{\partial t}\right)\chi = 0,$$
corresponding to a conservative motion of the electron with respect to
the rotating axes, are obviously the same as those of the equation
$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$, with $\phi$ replaced by $\phi'$. We thus have
$$\chi = \chi_{H'} = \psi_{H'}(\phi - \omega t),$$
where $H'$ is a characteristic value of $H$, i.e.
$$\chi_{H'} = \psi^0_{H'}\,e^{-i2\pi(H' + h\omega m/2\pi)t/h}.$$
This is another expression of the result $\Delta H' = K' - H' = h\Delta\nu\,m$
found by the preceding method.
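The equivalence between a rotation of the axes and a displacement of the energy can be checked on the azimuthal and time factors alone. The sketch below is an illustration, not a solution of the wave equation: radial and polar factors are omitted, units with $h = 1$ are assumed by default, and the function names and sample numbers are hypothetical.

```python
import cmath
import math


def psi_fixed(m, energy, phi, t, h=1.0):
    """Azimuthal and time factors exp(i m phi) exp(-2 pi i E t / h)."""
    return cmath.exp(1j * m * phi) * cmath.exp(-2j * math.pi * energy * t / h)


def chi_rotating(m, energy, omega, phi, t, h=1.0):
    """The same state referred to axes rotating with angular velocity omega."""
    return psi_fixed(m, energy, phi - omega * t, t, h)


def shifted_energy(m, energy, omega, h=1.0):
    """Energy H' + h omega m / (2 pi) read off from the time dependence."""
    return energy + h * omega * m / (2.0 * math.pi)


# Arbitrary sample values (assumptions made only for the check).
m, energy, omega, phi, t = 2, 1.3, 0.45, 0.8, 3.7
left = chi_rotating(m, energy, omega, phi, t)
right = psi_fixed(m, shifted_energy(m, energy, omega), phi, t)
print(abs(left - right))  # zero up to rounding
```

With $\omega = 2\pi\Delta\nu$ the displacement `shifted_energy(...) - energy` is just $h\Delta\nu\,m$.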
27. Relativistic Wave Mechanics as a Formal Generalization of
Maxwell’s Electromagnetic Theory of Light
Coming back to the relativistic theory of the motion of an electron in
an external electromagnetic field, we have to face the following situation.
If the relativistic equation (196) established in § 25 is assumed to
be correct, we must give up the theory of the preceding chapters, so far
as the introduction of the energy operator $H$ is concerned. If, on the
other hand, we wish to preserve this theory and express the wave
equation in the form
$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0,$$
we must replace the relativistic equation (196) by an equation or system
of equations which are linear and not quadratic with respect to the
operator $\frac{h}{2\pi i}\frac{\partial}{\partial t}$.
We shall now try the second alternative, not only because it fits in
better with our previous ideas, but also because it is more general than
the first alternative. In fact, the order of a differential equation can
always be increased by repeated differentiation, so that, in particular,
from an equation of the type $(H + p_t)\psi = 0$ we can always pass to an
equation containing the square of $p_t$. This can be done, for instance,
by applying to the preceding equation the operator $H + p_t$ or $H - p_t$,
giving $(H^2 + 2Hp_t + p_t^2)\psi = 0$ in the first case and $(H^2 - p_t^2)\psi = 0$ in the
second (provided that $H$ commutes with $p_t$, i.e. does not contain the time explicitly).
Of course we must be prepared to find that the equation of the second
order (with regard to $p_t$), obtained in this way, will be somewhat
different from our original equation (196). Which one is chosen will
ultimately be decided by comparing theory with experiment.
It can easily be shown that a single equation of the first order with
one unknown function $\psi$, satisfying the space-time symmetry requirements
of the relativity theory and giving by repeated differentiation
anything like equation (196), is a thing utterly impossible. It is, however,
possible to replace equation (196) by a system of several equations
of the first order with as many unknown functions, which would satisfy
the space-time symmetry condition and with the help of a second
differentiation would assume a form similar to and, in the special case
of free motion, identical with equation (196). We shall see, moreover,
that this system of equations can be written in the form of a single
equation of the type $(H + p_t)\psi = 0$, where $H$, $p_t$, and $\psi$ are treated as
four-dimensional matrices, or similarly, in one of the following three
equivalent forms $(P_x - p_x)\psi = 0$, $(P_y - p_y)\psi = 0$, $(P_z - p_z)\psi = 0$, where
$P_x$, $P_y$, $P_z$ are matrix operators representing the components of the
electron’s momentum in the same sense as $H = P_t$ represents its energy.
The possibility of writing the equation of motion in these four equivalent
forms is the direct expression of the equivalence of the space coordinates
and the time, which forms the essence of the relativity theory.
The first part of our problem, namely, the establishment of a system
of first-order equations satisfying the space-time symmetry condition,
can be solved in a very simple way, with the help of the analogy
between mechanics and optics, which was the starting-point for the
development of wave mechanics and which can still be used—with
certain reservations—as a source of inspiration.
Equation (201 a)
$$(u_x^2 + u_y^2 + u_z^2 - u_t^2 + m_0^2c^2)\psi = 0$$
in the case of a particle with vanishing charge and rest-mass, reduces to
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0, \qquad (214)$$
i.e. to the equation of the propagation of light-waves (in empty space)
with the true velocity $c$. If the wave velocity is equal to $c$, then the
velocity of the associated particles must also be equal to $c$, so that these
particles can be identified with photons.
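Now, according to the electromagnetic theory of light, equation
(214), usually denoted as d’Alembert’s equation, does not give a complete
description of the electromagnetic field of the light-waves. This field
is specified by six quantities, namely, the three components $E_x$, $E_y$, $E_z$
of the electric field and the three components† $H_x$, $H_y$, $H_z$ of the magnetic
field, these quantities satisfying the well-known equations of Maxwell:
$$\begin{aligned}
\frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} - \frac{1}{c}\frac{\partial E_x}{\partial t} &= 0\\
\frac{\partial H_x}{\partial z} - \frac{\partial H_z}{\partial x} - \frac{1}{c}\frac{\partial E_y}{\partial t} &= 0\\
\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} - \frac{1}{c}\frac{\partial E_z}{\partial t} &= 0
\end{aligned}
\qquad \left(\operatorname{curl}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} = 0\right) \qquad (215)$$
and
$$\begin{aligned}
\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} + \frac{1}{c}\frac{\partial H_x}{\partial t} &= 0\\
\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} + \frac{1}{c}\frac{\partial H_y}{\partial t} &= 0\\
\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} + \frac{1}{c}\frac{\partial H_z}{\partial t} &= 0
\end{aligned}
\qquad \left(\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} = 0\right) \qquad (215\,a)$$
To these six equations we may add the following two:
$$\operatorname{div}\mathbf{E} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = 0, \qquad (216)$$
$$\operatorname{div}\mathbf{H} = \frac{\partial H_x}{\partial x} + \frac{\partial H_y}{\partial y} + \frac{\partial H_z}{\partial z} = 0. \qquad (216\,a)$$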
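The latter equations can, however, for vibrational processes, be regarded
as a consequence of (215) and (215 a) respectively. Thus, if we
differentiate equations (215) with regard to $x$, $y$, $z$, and add them, we get
$\frac{\partial}{\partial t}\operatorname{div}\mathbf{E} = 0$. From this it follows, in so far as we reject purely static
fields, that $\operatorname{div}\mathbf{E} = 0$. In the same way we can derive (216 a) from (215 a).
If we differentiate the left side of the first equation (215) with respect
to the time $t$ (and divide by $c$), we obtain, using (215 a),
$$\frac{\partial}{\partial y}\frac{1}{c}\frac{\partial H_z}{\partial t} - \frac{\partial}{\partial z}\frac{1}{c}\frac{\partial H_y}{\partial t} - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2}$$
$$= \frac{\partial}{\partial y}\left(\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x}\right) - \frac{\partial}{\partial z}\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right) - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2}$$
$$= \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2} - \frac{\partial}{\partial x}\left(\frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z}\right) - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2},$$
i.e., by (216),
$$\frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2} = 0,$$
which is merely equation (214) with $\psi = E_x$. In the same way we obtain
similar equations for the other five components of the electromagnetic
field. We see, therefore, that d’Alembert’s equation must be regarded
as the result of the elimination, with the help of a second differentiation,
of the different field-components from Maxwell’s equations.
† The reader will easily distinguish between the symbol $H$ in the combinations $H_x$, $H_y$, $H_z$
used here for the components of $\mathbf{H}$ and the simple $H$ used passim for the Hamiltonian energy.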
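The passage from Maxwell’s equations to d’Alembert’s equation can be illustrated on the simplest solution, a plane wave travelling along the $x$-axis with $E_y = H_z = E_0\cos k(x - ct)$. The sketch below is only a numerical check by finite differences; units with $c = 1$, the wave number, the amplitude, and the function names are all assumptions made for the illustration.

```python
import math

C = 1.0    # velocity of light in the chosen units (assumption)
K = 2.0    # wave number of the sample wave (assumption)
E0 = 1.0   # amplitude of the sample wave (assumption)


def e_y(x, t):
    """Electric component E_y of the plane wave."""
    return E0 * math.cos(K * (x - C * t))


def h_z(x, t):
    """Magnetic component H_z of the plane wave."""
    return E0 * math.cos(K * (x - C * t))


def d_dx(f, x, t, s=1e-3):
    return (f(x + s, t) - f(x - s, t)) / (2.0 * s)


def d_dt(f, x, t, s=1e-3):
    return (f(x, t + s) - f(x, t - s)) / (2.0 * s)


def d2_dx2(f, x, t, s=1e-3):
    return (f(x + s, t) - 2.0 * f(x, t) + f(x - s, t)) / (s * s)


def d2_dt2(f, x, t, s=1e-3):
    return (f(x, t + s) - 2.0 * f(x, t) + f(x, t - s)) / (s * s)


def maxwell_residuals(x, t):
    """Left sides of the y-equation of (215) and the z-equation of (215 a)."""
    first = -d_dx(h_z, x, t) - d_dt(e_y, x, t) / C
    second = d_dx(e_y, x, t) + d_dt(h_z, x, t) / C
    return first, second


def dalembert_residual(x, t):
    """Left side of (214) with psi = E_y (derivatives in y and z vanish)."""
    return d2_dx2(e_y, x, t) - d2_dt2(e_y, x, t) / C ** 2


print(maxwell_residuals(0.3, 0.7))   # both small
print(dalembert_residual(0.3, 0.7))  # small
```

The residuals differ from zero only by the truncation and rounding errors of the finite differences.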
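This elimination is usually carried out with the help of the potentials
$A_x$, $A_y$, $A_z$, $\phi$ which are introduced by means of the formulae
$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}.$$
Thereby equations (215 a) and (216 a) turn themselves into identities,
while equations (215) and (216), with the additional condition
$$\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0,$$
yield four d’Alembert equations of the type (214) for the components
of the potential.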
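The preceding relation leads to a simplification of the wave equation
(196), which assumes the following form:
$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \frac{4\pi ie}{hc}\left(A_x\frac{\partial\psi}{\partial x} + A_y\frac{\partial\psi}{\partial y} + A_z\frac{\partial\psi}{\partial z} + \frac{\phi}{c}\frac{\partial\psi}{\partial t}\right) + \frac{4\pi^2e^2}{h^2c^2}\left(\phi^2 - A^2 - \frac{m_0^2c^4}{e^2}\right)\psi = 0 \qquad (217)$$
or, in vector notation,
$$\nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \frac{4\pi ie}{hc}\left(\mathbf{A}\cdot\nabla\psi + \frac{\phi}{c}\frac{\partial\psi}{\partial t}\right) + \frac{4\pi^2e^2}{h^2c^2}\left(\phi^2 - A^2 - \frac{m_0^2c^4}{e^2}\right)\psi = 0. \qquad (217\,a)$$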
This equation, written in the form (201 a), can be regarded as the
simplest generalization of d’Alembert’s equation (214) for material
particles (electrons) with a non-vanishing charge $e$ and rest-mass $m_0$,
a generalization obtained by replacing the operators
$$\frac{h}{2\pi i}\frac{\partial}{\partial x}, \quad \frac{h}{2\pi i}\frac{\partial}{\partial y}, \quad \frac{h}{2\pi i}\frac{\partial}{\partial z}, \quad \frac{h}{2\pi i}\frac{1}{c}\frac{\partial}{\partial t}$$
by the operators $u_x = \frac{h}{2\pi i}\frac{\partial}{\partial x} - \frac{e}{c}A_x$, etc., and further by adding to the
left side of (214) the term $\left(\frac{2\pi i}{h}\right)^2 m_0^2c^2\psi$.
Now Maxwell’s equations form a system of equations of the first
order satisfying the space-time symmetry condition and implying
d’Alembert’s equation as a corollary. We are thus naturally led to
the conclusion that the first-order equations of the relativistic wave
mechanics, which must replace the second-order equation (201 a), can
be obtained as a generalization of Maxwell’s equations, in a way similar
to that which leads from d’Alembert’s equation (214) to the
wave-mechanical equation (201 a).
We shall assume, therefore, that the electron (or proton) waves can
be described not, as so far assumed, by a scalar quantity $\psi$ but by two
vector quantities $\mathbf{M}$ and $\mathbf{N}$ which are analogous to the magnetic and
electric field strength ($\mathbf{H}$ and $\mathbf{E}$) respectively, and we shall seek to
generalize Maxwell’s equations by introducing the operators $u_x$ instead
of $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, etc. The second part of this generalization, i.e. the introduction
of the rest-mass, we shall at first disregard, i.e. we shall put $m_0 = 0$.
To begin with, we must notice that the generalized operators $u$,
unlike the original, are non-commutative, i.e. we obtain different results
if we apply to any function $\psi$ two such operators in a different order.
For example, if we form the difference of the expressions $u_xu_y\psi$ and
$u_yu_x\psi$ we obtain
$$\left[\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2\psi}{\partial x\,\partial y} - \frac{h}{2\pi i}\frac{\partial}{\partial x}\left(\frac{e}{c}A_y\psi\right) - \frac{e}{c}A_x\frac{h}{2\pi i}\frac{\partial\psi}{\partial y} + \frac{e^2}{c^2}A_xA_y\psi\right]$$
$$- \left[\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2\psi}{\partial y\,\partial x} - \frac{h}{2\pi i}\frac{\partial}{\partial y}\left(\frac{e}{c}A_x\psi\right) - \frac{e}{c}A_y\frac{h}{2\pi i}\frac{\partial\psi}{\partial x} + \frac{e^2}{c^2}A_yA_x\psi\right]$$
$$= -\frac{he}{2\pi ic}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\psi,$$
or, by (199 b),
$$u_xu_y - u_yu_x = -\frac{he}{2\pi ic}H_z \qquad (218)$$
if we omit the factor $\psi$ operated upon. In a similar way we get the
formulae
$$u_yu_z - u_zu_y = -\frac{he}{2\pi ic}H_x, \qquad u_zu_x - u_xu_z = -\frac{he}{2\pi ic}H_y,$$
and also
$$u_xu_t - u_tu_x = -\frac{he}{2\pi ic}E_x \qquad (218\,a)$$
and two analogous formulae for the combinations $(y, t)$ and $(z, t)$.
the eight Maxwell equations [multiplied by A/(27rt)] in place of the
264 WAVE MECHANICS OF A SINGLE ELECTRON i 27
hf 8
operators — . —, etc., necessitates a further modification. We must,
2tt-i dx
namely, add to the right side of these equations extra terms of the
form uM0 or uN0 where M0 and A"0 are two new scalars; otherwise
(i.e. when M0 — N0 — 0) the eight equations obtained for the six
quantities Mx, Myy Mz, Nx, Ny, Nz would be, in general, incompatible
with one another. In fact, if we limited ourselves to a replacement of
r
the operators — — ,... by uy ,..., the equations obtained from (216) and
2iri dx J *
(216 a) would no longer be a corollary of the equations obtained from
(215) and (215 a) and would therefore contradict the latter.
In writing down the generalized ‘Maxwell-like’ equations, the following
circumstances should be noticed:
(1) The extra terms $uM_0$ and $uN_0$ on the right side must represent
the space-time components of two four-dimensional vectors analogous
respectively to the vector of electric current and charge density in the
case of equations (215) and (216), which will be referred to as the
I group of Maxwell’s equations, and to the vector of ‘magnetic current
and charge density’ in the case of the II group, formed by equations
(215 a) and (216 a).
Treating $M_0$ and $N_0$ as scalar quantities, we can define the components
of the first vector by $u_xM_0$, $u_yM_0$, $u_zM_0$, $\pm u_tM_0$, and that of
the second by $u_xN_0$, $u_yN_0$, $u_zN_0$, $\pm u_tN_0$.
(2) The ambiguity of sign ($\pm$) arising in this connexion can be removed
with the help of the fact that the two groups of Maxwell’s equations
can be derived from each other if $\mathbf{E}$ is replaced by $\mathbf{H}$ and $\mathbf{H}$ by $-\mathbf{E}$.
We must therefore require that one of the two groups of the generalized
Maxwell-like equations be obtained from the other by replacing
$N_x$, $N_y$, $N_z$, $N_0$ by $M_x$, $M_y$, $M_z$, $M_0$ and $M_x$, $M_y$, $M_z$, $M_0$ by $-N_x$, $-N_y$,
$-N_z$, $-N_0$. Taking this into consideration, we obtain, as the first step
in our generalization of Maxwell’s equations, leaving the rest-mass out
of account, the following system of equations:
$$\begin{aligned}
u_yM_z - u_zM_y - u_tN_x &= u_xM_0\\
u_zM_x - u_xM_z - u_tN_y &= u_yM_0\\
u_xM_y - u_yM_x - u_tN_z &= u_zM_0
\end{aligned} \qquad (219)$$
$$\begin{aligned}
u_yN_z - u_zN_y + u_tM_x &= u_xN_0\\
u_zN_x - u_xN_z + u_tM_y &= u_yN_0\\
u_xN_y - u_yN_x + u_tM_z &= u_zN_0
\end{aligned} \qquad (219\,a)$$
$$u_xN_x + u_yN_y + u_zN_z = -u_tM_0 \qquad (220)$$
$$u_xM_x + u_yM_y + u_zM_z = +u_tN_0. \qquad (220\,a)$$
From these equations we will now by ‘generalized differentiation’, i.e.
by repeated application of the operators $u$, obtain eight differential
equations of the second order which correspond to d’Alembert’s
differential equation.
If we apply the operators $u_x$, $u_y$, $u_z$ to the equations (219) and the
operator $u_t$ to equation (220), we obtain by addition, using (218) and
(218 a):
$$-\frac{he}{2\pi ic}\,[H_xM_x + H_yM_y + H_zM_z - E_xN_x - E_yN_y - E_zN_z] = (u_x^2 + u_y^2 + u_z^2 - u_t^2)M_0,$$
or, if we put for shortness
$$D_0 = u_x^2 + u_y^2 + u_z^2 - u_t^2 \qquad (221)$$
and use the vector notation:
$$D_0M_0 = -\frac{he}{2\pi ic}\,(\mathbf{H}\cdot\mathbf{M} - \mathbf{E}\cdot\mathbf{N}). \qquad (221\,a)$$
Similarly we get from (219 a) and (220 a) the equation
$$D_0N_0 = \frac{he}{2\pi ic}\,(-\mathbf{H}\cdot\mathbf{N} - \mathbf{E}\cdot\mathbf{M}). \qquad (221\,b)$$
With $e = 0$ these equations can be satisfied identically if we put
$M_0 = N_0 = 0$. In the general case, however, the scalar functions $M_0$
and $N_0$ must be different from zero.
If we apply the operator $u_t$ to the first equation (219) and interchange
the order of the different operators $u$, we get, taking account of (218 a),
$$\frac{he}{2\pi ic}(E_yM_z - E_zM_y - E_xM_0) + u_yu_tM_z - u_zu_tM_y - u_xu_tM_0 - u_t^2N_x = 0.$$
Now by (219 a) and (220):
$$u_yu_tM_z = u_yu_zN_0 - u_yu_xN_y + u_y^2N_x,$$
$$-u_zu_tM_y = -u_zu_yN_0 + u_z^2N_x - u_zu_xN_z,$$
$$-u_xu_tM_0 = u_x^2N_x + u_xu_yN_y + u_xu_zN_z.$$
By repeated application of the relations (218) and (218 a), we thus
obtain
$$(u_x^2 + u_y^2 + u_z^2 - u_t^2)N_x + \frac{he}{2\pi ic}(E_yM_z - E_zM_y - E_xM_0 + H_yN_z - H_zN_y - H_xN_0) = 0.$$
This equation and the two others which result from it by cyclic
interchange of the indices $x$, $y$, $z$ can be summarized in the following
vector equation:
$$D_0\mathbf{N} + \frac{he}{2\pi ic}\,[(\mathbf{E}\times\mathbf{M} - \mathbf{E}M_0) + (\mathbf{H}\times\mathbf{N} - \mathbf{H}N_0)] = 0. \qquad (222)$$
Similarly, by application of the same method to equations (219 a), we
obtain the second vector equation,
$$D_0\mathbf{M} + \frac{he}{2\pi ic}\,[(\mathbf{H}\times\mathbf{M} - \mathbf{H}M_0) - (\mathbf{E}\times\mathbf{N} - \mathbf{E}N_0)] = 0. \qquad (222\,a)$$
Equations (221 a), (221 b), (222), and (222 a) are the required generalization
of d’Alembert’s equation. They differ, however, from the latter,
not only by the differential operators $u$ appearing in $D_0$ instead of
$\frac{h}{2\pi i}\frac{\partial}{\partial x}$, etc., but also by additional terms which are proportional to the
electromagnetic field components and which for each equation have
a special form.
If we omit these additional terms (whose physical meaning will be
explained later) we obtain, for all the eight functions $M_x, \dots, N_z, M_0, N_0$,
identical equations of the d’Alembert type, equations which differ
from the relativity wave equation (201 a) or (217 a) found earlier only
by the absence of the ‘mass term’ $m_0^2c^2$ in the operator $D_0$. This shows
that the second step of our generalization of Maxwell’s equations, in
so far as it is a question of the resulting generalized d’Alembert equations,
must consist in replacing the operator $D_0$ by the operator introduced
earlier, namely,
$$D = D_0 + m_0^2c^2. \qquad (223)$$
The corresponding introduction of the parameter m0c into the equa
tions of the first order (219) to (220) is done most simply as follows:
In equations (219) and (219a), which contain the time derivatives of
the quantities Nx, Nyi Nz, N0, we replace the operator ut by
u't = ut—mQc, (223 a)
and in equations (219 a) and (220) by
u t” = ut+ m0c. (223 b)
Taking into account the relation ut ut = ut u't = u \—m\ ca, we can easily
convince ourselves th at from these generalized Maxwell’s equations
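$$\begin{aligned}
u_yM_z - u_zM_y - u_t'N_x &= u_xM_0\\
u_zM_x - u_xM_z - u_t'N_y &= u_yM_0\\
u_xM_y - u_yM_x - u_t'N_z &= u_zM_0
\end{aligned} \tag{224}$$
$$\begin{aligned}
u_yN_z - u_zN_y + u_t''M_x &= u_xN_0\\
u_zN_x - u_xN_z + u_t''M_y &= u_yN_0\\
u_xN_y - u_yN_x + u_t''M_z &= u_zN_0
\end{aligned} \tag{224a}$$
$$u_xN_x + u_yN_y + u_zN_z = -u_t''M_0 \tag{225}$$
$$u_xM_x + u_yM_y + u_zM_z = +u_t'N_0, \tag{225a}$$
there follow the generalized d'Alembert's equations:
$$\begin{aligned}
DM_0 + \frac{he}{2\pi i c}\,(\mathbf{H}\cdot\mathbf{M} - \mathbf{E}\cdot\mathbf{N}) &= 0\\
DN_0 + \frac{he}{2\pi i c}\,(\mathbf{H}\cdot\mathbf{N} + \mathbf{E}\cdot\mathbf{M}) &= 0
\end{aligned} \tag{226}$$
$$\begin{aligned}
D\mathbf{M} + \frac{he}{2\pi i c}\,[(\mathbf{H}\times\mathbf{M} - \mathbf{H}M_0) - (\mathbf{E}\times\mathbf{N} - \mathbf{E}N_0)] &= 0\\
D\mathbf{N} + \frac{he}{2\pi i c}\,[(\mathbf{E}\times\mathbf{M} - \mathbf{E}M_0) + (\mathbf{H}\times\mathbf{N} - \mathbf{H}N_0)] &= 0
\end{aligned} \tag{226a}$$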
Equations (226) and (226a) become identical if we put either
$$\mathbf{N} = i\mathbf{M}, \qquad N_0 = iM_0 \tag{227}$$
or
$$\mathbf{N} = -i\mathbf{M}, \qquad N_0 = -iM_0. \tag{227a}$$
Thereby they assume the following simple form:
$$DM_0 + \frac{he}{2\pi i c}\,(\mathbf{H} \mp i\mathbf{E})\cdot\mathbf{M} = 0, \tag{228}$$
$$D\mathbf{M} + \frac{he}{2\pi i c}\,[(\mathbf{H} \mp i\mathbf{E})\times\mathbf{M} - (\mathbf{H} \mp i\mathbf{E})M_0] = 0. \tag{228a}$$
Let that solution of these equations which corresponds to the upper sign be denoted by $\mathbf{M}^+$ and the other by $\mathbf{M}^-$. The general solution of equations (226) and (226 a), therefore, can obviously be written in the form
$$\begin{aligned}
\mathbf{M} &= c_1\mathbf{M}^+ + c_2\mathbf{M}^-, & M_0 &= c_1M_0^+ + c_2M_0^-,\\
\mathbf{N} &= i(c_1\mathbf{M}^+ - c_2\mathbf{M}^-), & N_0 &= i(c_1M_0^+ - c_2M_0^-),
\end{aligned}$$
where $c_1$ and $c_2$ are two arbitrary constants (which must be introduced if the solutions $\mathbf{M}^+$ and $\mathbf{M}^-$ are normalized in some way).
It must be mentioned, however, that the first-order equations (224)-(225 a) do not admit solutions of the type (227) and (227 a), because of the appearance of the two different operators $u_t'$ and $u_t''$. These solutions do not have, therefore, any real significance.
28. Alternative Form of the Wave Equations; Duplicity and
Quadruplicity Phenomenon
There is another possibility of halving the number both of the second-order equations (226)-(226a) and of the first-order equations (224)-(225a), as well as of the wave functions M, N, defined by them.
We must notice, first, that equations (224)-(225a) can be naturally regrouped by associating (225 a) not with (224 a), as has been done before, but with (224), and (225) with (224 a). The two groups of four equations thus formed will be denoted by I' and II'' respectively.
It is now easily seen that the equations of each group can be compounded in pairs and, as it were, folded up together, in such a way as to form two groups of two equations involving four unknown wave functions. Taking the group I' we can, for example, compound the first two equations (224) to form one pair and the third with equation (225a) to form the second pair. If we multiply the first equation of the second pair by i and add it to the other, we get
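$$\begin{aligned}
&(u_x - iu_y)M_x + (iu_x + u_y)M_y + u_z(M_z - iM_0) + u_t'(-iN_z - N_0)\\
&\quad = (u_x - iu_y)(M_x + iM_y) + u_z(M_z - iM_0) + u_t'(-iN_z - N_0) = 0.
\end{aligned}$$
Likewise we obtain, by subtracting the second equation (224) from the first equation multiplied by i,
$$\begin{aligned}
&(u_x + iu_y)M_z + u_z(-M_x - iM_y) + u_t'(-iN_x + N_y) - (iu_x - u_y)M_0\\
&\quad = (u_x + iu_y)(M_z - iM_0) - u_z(M_x + iM_y) + u_t'(-iN_x + N_y) = 0.
\end{aligned}$$
If we put, therefore,
$$\begin{aligned}
\psi_1 &= M_x + iM_y, & \psi_2 &= M_z - iM_0,\\
\psi_3 &= -i(N_x + iN_y), & \psi_4 &= -i(N_z - iN_0),
\end{aligned} \tag{229}$$
we can reduce the four equations under consideration to the following two:
$$\begin{aligned}
(u_x - iu_y)\psi_1 + u_z\psi_2 + u_t'\psi_4 &= 0\\
(u_x + iu_y)\psi_2 - u_z\psi_1 + u_t'\psi_3 &= 0.
\end{aligned} \tag{229a}$$
In a similar way the four equations of the group II'', (224 a) and (225), can be folded up into the two equations
$$\begin{aligned}
(u_x - iu_y)\psi_3 + u_z\psi_4 + u_t''\psi_2 &= 0\\
(u_x + iu_y)\psi_4 - u_z\psi_3 + u_t''\psi_1 &= 0
\end{aligned} \tag{229b}$$
with the same four unknown wave functions (229). The equations (229 a, b) were first derived by Dirac.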
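The folding step lends itself to a quick numerical check. The sketch below is only an illustration under stated assumptions: equations (224) and (225 a) are taken in the form $u_yM_z - u_zM_y - u_t'N_x = u_xM_0$ (and cyclically) and $u_xM_x + u_yM_y + u_zM_z = u_t'N_0$; the operators are replaced by arbitrary complex numbers (legitimate here because the folding is a purely linear combination in which no operators are interchanged); and all variable names are illustrative rather than taken from the text.

```python
import random

random.seed(0)

def rnd():
    """Arbitrary complex test value."""
    return complex(random.uniform(-1, 1), random.uniform(-1, 1))

# Plain numbers standing in for the operators (an assumption of this sketch).
ux, uy, uz, ut1 = rnd(), rnd(), rnd(), rnd()   # ut1 plays the role of u_t'
Mx, My, Mz, M0 = rnd(), rnd(), rnd(), rnd()
Nx, Ny, Nz, N0 = rnd(), rnd(), rnd(), rnd()

# Left side minus right side of the three equations (224) and of (225 a).
r1 = uy*Mz - uz*My - ut1*Nx - ux*M0
r2 = uz*Mx - ux*Mz - ut1*Ny - uy*M0
r3 = ux*My - uy*Mx - ut1*Nz - uz*M0
r4 = ux*Mx + uy*My + uz*Mz - ut1*N0

# Definitions (229) of the four folded functions.
psi1 = Mx + 1j*My
psi2 = Mz - 1j*M0
psi3 = -1j*(Nx + 1j*Ny)
psi4 = -1j*(Nz - 1j*N0)

# Left sides of the two folded equations (229 a).
f1 = (ux - 1j*uy)*psi1 + uz*psi2 + ut1*psi4
f2 = (ux + 1j*uy)*psi2 - uz*psi1 + ut1*psi3

# "Third equation times i plus (225 a)" and "first times i minus second".
combo1 = 1j*r3 + r4
combo2 = 1j*r1 - r2

err1 = abs(combo1 - f1)
err2 = abs(combo2 - f2)
print(err1, err2)
```

Both differences come out at rounding level, which confirms that the two linear combinations described in the text are exactly the left-hand sides of (229 a) expressed through the functions (229).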
The process just described can be applied to the second-order equations which are obtained from (226) and (226 a) by taking their components along the coordinate axes. We have, for instance, according to the first equation (226a),
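$$\begin{aligned}
D(M_x + iM_y) + \frac{he}{2\pi i c}\{&[(H_yM_z - H_zM_y - H_xM_0) + i(H_zM_x - H_xM_z - H_yM_0)]\\
- &[(E_yN_z - E_zN_y - E_xN_0) + i(E_zN_x - E_xN_z - E_yN_0)]\} = 0,
\end{aligned}$$
that is,
$$\begin{aligned}
D(M_x + iM_y) + \frac{he}{2\pi i c}\{&[iH_z(M_x + iM_y) - i(H_x + iH_y)(M_z - iM_0)]\\
- &[iE_z(N_x + iN_y) - i(E_x + iE_y)(N_z - iN_0)]\} = 0;
\end{aligned}$$
and similarly,
$$\begin{aligned}
D(M_z - iM_0) + \frac{he}{2\pi i c}\{&[(H_xM_y - H_yM_x - H_zM_0) - i(H_xM_x + H_yM_y + H_zM_z)]\\
- &[(E_xN_y - E_yN_x - E_zN_0) - i(E_xN_x + E_yN_y + E_zN_z)]\} = 0,
\end{aligned}$$
that is,
$$\begin{aligned}
D(M_z - iM_0) + \frac{he}{2\pi i c}\{&[-i(H_x - iH_y)(M_x + iM_y) - iH_z(M_z - iM_0)]\\
- &[-i(E_x - iE_y)(N_x + iN_y) - iE_z(N_z - iN_0)]\} = 0,
\end{aligned}$$
or, according to (229),
$$\begin{aligned}
D\psi_1 + \frac{he}{2\pi c}\{[H_z\psi_1 - (H_x + iH_y)\psi_2] - i[E_z\psi_3 - (E_x + iE_y)\psi_4]\} &= 0\\
D\psi_2 + \frac{he}{2\pi c}\{-[(H_x - iH_y)\psi_1 + H_z\psi_2] + i[(E_x - iE_y)\psi_3 + E_z\psi_4]\} &= 0.
\end{aligned} \tag{230}$$
In the same way the four remaining equations (226)-(226a) are folded up into
$$\begin{aligned}
D\psi_3 + \frac{he}{2\pi c}\{[H_z\psi_3 - (H_x + iH_y)\psi_4] - i[E_z\psi_1 - (E_x + iE_y)\psi_2]\} &= 0\\
D\psi_4 + \frac{he}{2\pi c}\{-[(H_x - iH_y)\psi_3 + H_z\psi_4] + i[(E_x - iE_y)\psi_1 + E_z\psi_2]\} &= 0.
\end{aligned} \tag{230a}$$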
They can be derived from (230) if $\psi_1$ and $\psi_2$ are replaced by $\psi_3$ and $\psi_4$, and the latter by $\psi_1$ and $\psi_2$. Both the equations (230) and (230 a) can be obtained, of course, directly from equations (229a, b) in the same way as the equations (226)-(226a) are obtained from (224)-(225a), i.e. by the application of the operators u to the left side of (229 a, b).
The latter equations were established by Dirac in an externally different form and by a different method, which will be indicated later and which does not make use of the formal analogy between wave mechanics and the electromagnetic theory of light. We shall see that this analogy is actually not so deep as it seems at first sight, and that the regrouping of the equations (224)-(225a), which is necessary for their folding up into the Dirac equations, is a formal expression of a drastic divergence between the wave-mechanical functions M, N and the electromagnetic functions H, E.
It is interesting to notice that a similar regrouping and folding up can be carried out with regard to Maxwell's equations. These 'disguised' Maxwell's equations can be obtained from equations (229 a) and (229 b) by putting $e = m_0 = 0$, and further by replacing the vectors M and N in the definition (229) of the functions $\psi$ by H and E, dropping the terms $M_0$, $N_0$.
In fact, it can be directly verified that if we put
$$\begin{aligned}
H_x + iH_y &= \psi_1, & H_z &= \psi_2,\\
-i(E_x + iE_y) &= \psi_3, & -iE_z &= \psi_4,
\end{aligned} \tag{231}$$
we obtain, instead of the eight equations (215)-(216a), the following four equations:
(231a)
Another well-known possibility of reducing the eight Maxwell equations to four consists in combining the electric and the magnetic field strengths to form a complex vector
$$\mathbf{K} = \mathbf{H} \pm i\mathbf{E}.$$
We then obtain, instead of (215)-(216a), four equations of a similar type, namely,
$$\operatorname{curl}\mathbf{K} \pm \frac{i}{c}\frac{\partial\mathbf{K}}{\partial t} = 0, \qquad \operatorname{div}\mathbf{K} = 0.$$
This method is not applicable to the generalized Maxwell equations (224)-(225a).
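As a hedged illustration of the complex-vector form, the following sketch checks by central differences that a simple plane wave in vacuum satisfies curl K ± (i/c) ∂K/∂t = 0 for K = H ± iE. The particular wave (E along x, H along y, propagation along +z), the units with c = 1, the wave number and all function names are assumptions chosen for the example, not taken from the text.

```python
import math

c = 1.0   # speed of light in arbitrary units (assumption)
k = 2.0   # wave number of the test wave (assumption)

def K(x, y, z, t, sign=+1):
    """K = H + sign*i*E for a plane wave travelling along +z."""
    phase = k*(z - c*t)
    E = (math.cos(phase), 0.0, 0.0)
    H = (0.0, math.cos(phase), 0.0)
    return tuple(h + sign*1j*e for h, e in zip(H, E))

def partial(f, point, axis, step=1e-5):
    """Central-difference derivative of a vector function along one axis."""
    hi, lo = list(point), list(point)
    hi[axis] += step
    lo[axis] -= step
    return tuple((a - b)/(2*step) for a, b in zip(f(*hi), f(*lo)))

def residual(point, sign):
    """Largest component of curl K + sign*(i/c) dK/dt at the given point."""
    f = lambda x, y, z, t: K(x, y, z, t, sign)
    dx, dy, dz, dt = (partial(f, point, axis) for axis in range(4))
    curl = (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])
    return max(abs(cu + sign*1j/c*d) for cu, d in zip(curl, dt))

point = (0.3, -0.7, 0.4, 0.1)   # arbitrary (x, y, z, t)
res_plus = residual(point, +1)
res_minus = residual(point, -1)
print(res_plus, res_minus)
```

The residuals vanish up to the finite-difference error for either choice of sign, in agreement with the statement that one complex vector equation replaces the two real curl equations.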
The formulae (230) and (230 a) correspond to the union of the variables x and y as well as of the corresponding components of various real vectors to form complex quantities $w = x + iy$, $H_x + iH_y = \psi_1$, $E_x + iE_y = i\psi_3$, etc. The operators $\partial/\partial x - i\,\partial/\partial y$ and $\partial/\partial x + i\,\partial/\partial y$ can thereby be regarded as the differential operators $\partial/\partial w$ and $\partial/\partial w^*$ corresponding to the complex variable $w$ and the complex conjugate variable $w^* = x - iy$ respectively.
While we can regard the formulae (230) as a decomposition of the complex functions $\psi_1, \ldots, \psi_4$ into real and imaginary parts, this is not so in the case of the analogous formulae (229). The fact that all the eight quantities M, N must in general assume complex values follows immediately from the complex nature of the operators u in the equations (224)-(225a) determining them.
The reduction of these eight equations to the four equations (229 a, b) is, therefore, an actual halving of the number of unknowns, while in the case of the Maxwell equations we have simply a union of real quantities (as the components of the electromagnetic field are) to form complex quantities.
If the four complex quantities $\psi_1, \ldots, \psi_4$ actually suffice for the complete determination of the electron waves it must be possible by means of these functions to express the statistical quantities, i.e. the probability density $\rho$ and the components of the probability current density $j_x, j_y, j_z$ which we have determined earlier by means of the scalar $\psi$.
In the new determination of these quantities we shall at first be guided by the same analogy as that which led us to the generalized Maxwell equations, or to the Dirac equations equivalent to them. From this point of view the quantities $\rho$ and $\mathbf{j}$ must correspond respectively to the electromagnetic energy density
$$\rho = \frac{E^2 + H^2}{8\pi}$$
and the energy-current density (i.e. to Poynting's vector)
$$\mathbf{J} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H}.$$
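If we put here, instead of the components of E and H, their expressions obtained from (231):
$$\begin{aligned}
H_x &= \tfrac12(\psi_1 + \psi_1^*), & H_y &= \tfrac{1}{2i}(\psi_1 - \psi_1^*), & H_z &= \psi_2 = \psi_2^*,\\
E_x &= \tfrac{i}{2}(\psi_3 - \psi_3^*), & E_y &= \tfrac12(\psi_3 + \psi_3^*), & E_z &= i\psi_4 = -i\psi_4^*,
\end{aligned}$$
we obtain
$$\rho = \frac{1}{8\pi}(\psi_1\psi_1^* + \psi_2\psi_2^* + \psi_3\psi_3^* + \psi_4\psi_4^*)$$
and further,
$$J_x = \frac{c}{4\pi}(E_yH_z - E_zH_y) = \frac{c}{8\pi}(\psi_3\psi_2^* + \psi_3^*\psi_2 + \psi_1\psi_4^* + \psi_1^*\psi_4),$$
and similar formulae for $J_y$ and $J_z$.
These quadratic expressions are clearly real and also remain real when all the four quantities $\psi_1, \ldots, \psi_4$ are complex. We are led, therefore, to use them for the representation of the quantities $\rho$ and $\mathbf{j}$. Omitting the common factor $1/8\pi$, we obtain
$$\rho = \psi_1\psi_1^* + \psi_2\psi_2^* + \psi_3\psi_3^* + \psi_4\psi_4^* \tag{232}$$
$$\begin{aligned}
j_x &= c\,(\psi_1\psi_4^* + \psi_4\psi_1^* + \psi_2\psi_3^* + \psi_3\psi_2^*)\\
j_y &= -ic\,(\psi_1\psi_4^* - \psi_4\psi_1^* - \psi_2\psi_3^* + \psi_3\psi_2^*)\\
j_z &= -c\,(\psi_1\psi_3^* + \psi_3\psi_1^* - \psi_2\psi_4^* - \psi_4\psi_2^*).
\end{aligned} \tag{232a}$$
If these expressions are correct they must, like the expressions obtained earlier for $\rho$ and $\mathbf{j}$, satisfy the equation
$$\frac{\partial j_x}{\partial x} + \frac{\partial j_y}{\partial y} + \frac{\partial j_z}{\partial z} + \frac{\partial\rho}{\partial t} = 0, \tag{232b}$$
expressing the law of the conservation of probability (or of the number of copies). It can easily be shown by means of equations (229 a, b) that this is indeed the case.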
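Multiplying these equations successively by $\psi_4^*, \psi_3^*, \psi_2^*, \psi_1^*$, subtracting from them the corresponding conjugate equations
$$(u_x^* + iu_y^*)\psi_1^* + u_z^*\psi_2^* + u_t'^*\psi_4^* = 0,$$
$$(u_x^* - iu_y^*)\psi_2^* - u_z^*\psi_1^* + u_t'^*\psi_3^* = 0,$$
etc., multiplied by $\psi_4$, $\psi_3$, etc., and finally adding the results, we get: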
[04 (MX )01 01 (Wx 1M* )04 ] + [03 (MX"I" *'W(/)02 02<WX"I" )03 J +
+ [ 0 ? (« x -* “ „ ) 0 3 - 0 3 « - * < ) 0 - ? ] - + [0 ?(« x + * “1/)0 4 -0 « (« ? + *M*)0J‘] +
+ <*?«, 0 2 - 02 t»*0T) - (0J«. 0, 01 «?0J) + (0?«z 0 4 - 0 4 < 0 ? ) -
- (0 N z 0 3 - 0 3 < 0 f ) + (0?« ;* 0 4 -0 4 * '? « ) + ( # « * 0 3 - 0 3 « ? 0 ? ) +
+ (0X02-0*M|'*0?)+(0N'01-0I M?*0?) = °>
which, by the definition of the operators u, easily reduces to (232 b) with the expressions (232) and (232 a) for $\rho$ and $j_x, j_y, j_z$.
Formula (232) is the immediate generalization of the formula $\rho = \psi\psi^*$ of the original non-relativistic Schrödinger theory. On the other hand, the expressions (232 a) have a form entirely different from the original expressions for the current density
$$j_x = \frac{h}{4\pi m_0 i}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right).$$
A more accurate investigation of equations (229 a, b) shows, however, that this difference is not so great as it seems. With harmonically vibrating waves, corresponding to a motion with a definite energy $\epsilon$, the dependence of the functions $\psi$ on the time is described by the common factor $e^{-i2\pi\epsilon t/h}$, so that the operators $u_t'$ and $u_t''$ reduce to the ordinary factors
$$u_t' = -\frac{1}{c}(\epsilon - U + m_0c^2) = -\frac{1}{c}(W - U + 2m_0c^2),$$
$$u_t'' = -\frac{1}{c}(\epsilon - U - m_0c^2) = -\frac{1}{c}(W - U),$$
where $U = e\phi$ is the potential energy of the electron and $W - U$ is its kinetic energy. In general (so far as we restrict ourselves to positive values of $\epsilon$, see below), the first factor is enormously large compared with the second; therefore the functions $\psi_3$ and $\psi_4$ which are multiplied by it in equations (229 a) must, with regard to their absolute magnitude, be very small compared with the functions $\psi_1$ and $\psi_2$. If, moreover, we restrict ourselves to the case of motion with a kinetic energy $W - U$ which is small compared with the rest-energy $m_0c^2$, i.e. with a velocity v whose square is small compared with $c^2$, we can put approximately, according to (229 a),
$$\begin{aligned}
2m_0c\,\psi_3 &= (u_x + iu_y)\psi_2 - u_z\psi_1\\
2m_0c\,\psi_4 &= (u_x - iu_y)\psi_1 + u_z\psi_2.
\end{aligned} \tag{233a}$$
Since these relations no longer contain the energy $\epsilon$, they may be regarded as approximately valid in the general case of non-conservative motion.
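To give a feeling for how unequal the two factors are, the following sketch evaluates the ratio $|u_t''|/|u_t'| = (W - U)/(W - U + 2m_0c^2)$ for a few kinetic energies. The sample energies (13.6 eV, roughly the kinetic energy in the ground state of hydrogen, and two larger values) are illustrative choices, and the function name is hypothetical.

```python
M0C2_EV = 510_998.95   # electron rest energy m0*c^2 in electron-volts

def factor_ratio(kinetic_ev):
    """Ratio |u_t''| / |u_t'| = (W - U) / (W - U + 2*m0*c^2)."""
    return kinetic_ev / (kinetic_ev + 2*M0C2_EV)

# Illustrative kinetic energies W - U in electron-volts.
for kinetic_ev in (13.6, 1.0e3, 1.0e5):
    print(kinetic_ev, factor_ratio(kinetic_ev))

ratio_hydrogen = factor_ratio(13.6)
```

For atomic energies the ratio is of the order of $10^{-5}$, which is why $\psi_3$ and $\psi_4$ can be treated as small quantities in what follows.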
It should be mentioned that, according to (233 a), the ratio of the
functions $\psi_3$, $\psi_4$ to the functions $\psi_1$, $\psi_2$ is of the order of magnitude
$g/(m_0c) \cong v/c$, where v is the velocity of the electron, and g is its proper
momentum estimated roughly by the ratio I t follows from
this that, to the first approximation with regard to small quantities of the order $v/c$, we can put, instead of (232),
$$\rho = \psi_1\psi_1^* + \psi_2\psi_2^* \tag{234}$$
neglecting the squares of $\psi_3$ and $\psi_4$. Substituting the expressions (233 a) in (232 a), we get further,
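$$\begin{aligned}
j_x = \frac{1}{2m_0}\{&[\psi_1^*(u_x - iu_y)\psi_1 + \psi_1^*u_z\psi_2] + [\psi_1(u_x^* + iu_y^*)\psi_1^* + \psi_1u_z^*\psi_2^*]\\
+ &[\psi_2^*(u_x + iu_y)\psi_2 - \psi_2^*u_z\psi_1] + [\psi_2(u_x^* - iu_y^*)\psi_2^* - \psi_2u_z^*\psi_1^*]\}\\
= \frac{1}{2m_0}\{&[(\psi_1^*u_x\psi_1 + \psi_1u_x^*\psi_1^*) + (\psi_2^*u_x\psi_2 + \psi_2u_x^*\psi_2^*)]\\
+ &\,i[(\psi_1u_y^*\psi_1^* - \psi_1^*u_y\psi_1) - (\psi_2u_y^*\psi_2^* - \psi_2^*u_y\psi_2)]\\
+ &[(\psi_1^*u_z\psi_2 + \psi_1u_z^*\psi_2^*) - (\psi_2^*u_z\psi_1 + \psi_2u_z^*\psi_1^*)]\}.
\end{aligned}$$
If we put $A_x = A_y = A_z = 0$ (i.e. if we neglect the potential momentum, if any, compared with the proper momentum), we obtain the following formula:
$$j_x = \frac{h}{4\pi m_0 i}\left\{\left[\psi_1^*\frac{\partial\psi_1}{\partial x} + \psi_2^*\frac{\partial\psi_2}{\partial x} - \psi_1\frac{\partial\psi_1^*}{\partial x} - \psi_2\frac{\partial\psi_2^*}{\partial x}\right] + i\frac{\partial}{\partial y}(\psi_2\psi_2^* - \psi_1\psi_1^*) + \frac{\partial}{\partial z}(\psi_1^*\psi_2 - \psi_2^*\psi_1)\right\},$$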
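the first term of which (in square brackets) is the same generalization of the original expression
$$j_x = \frac{h}{4\pi m_0 i}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)$$
as (234) is of the original expression for $\rho$. The physical meaning of the two additional terms will be cleared up in the next section. From the purely formal point of view, these two terms, as well as the corresponding terms in $j_y$ and $j_z$, can be regarded as the x-, y-, and z-components of the curl of a certain vector $c\,\mathfrak{M}$, defined by the formulae
$$\begin{aligned}
\mathfrak{M}_x &= \frac{h}{4\pi m_0c}(\psi_1^*\psi_2 + \psi_2^*\psi_1)\\
\mathfrak{M}_y &= \frac{ih}{4\pi m_0c}(\psi_1^*\psi_2 - \psi_2^*\psi_1)\\
\mathfrak{M}_z &= \frac{h}{4\pi m_0c}(\psi_2\psi_2^* - \psi_1\psi_1^*)
\end{aligned} \tag{234a}$$
so that the approximate expression for the current density in vector form is:
$$\mathbf{j} = \frac{h}{4\pi m_0 i}(\psi_1^*\nabla\psi_1 + \psi_2^*\nabla\psi_2 - \psi_1\nabla\psi_1^* - \psi_2\nabla\psi_2^*) + c\,\operatorname{curl}\mathfrak{M}. \tag{234b}$$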
If, further, we substitute the approximate expressions (233 a) in equations (229b), the latter assume the following form:
$$(u_x - iu_y)(u_x + iu_y)\psi_2 - (u_x - iu_y)u_z\psi_1 + u_z(u_x - iu_y)\psi_1 + u_z^2\psi_2 + 2m_0c\,u_t''\psi_2 = 0,$$
$$(u_x + iu_y)(u_x - iu_y)\psi_1 + (u_x + iu_y)u_z\psi_2 - u_z(u_x + iu_y)\psi_2 + u_z^2\psi_1 + 2m_0c\,u_t''\psi_1 = 0.$$
Now according to the relations (218) and (218 a), we have
$$(u_x - iu_y)(u_x + iu_y) = u_x^2 + u_y^2 + i(u_xu_y - u_yu_x) = u_x^2 + u_y^2 - \frac{he}{2\pi c}H_z,$$
$$(u_x + iu_y)(u_x - iu_y) = u_x^2 + u_y^2 + \frac{he}{2\pi c}H_z,$$
$$u_z(u_x - iu_y) - (u_x - iu_y)u_z = \frac{he}{2\pi c}(-H_x + iH_y),$$
$$(u_x + iu_y)u_z - u_z(u_x + iu_y) = -\frac{he}{2\pi c}(H_x + iH_y).$$
We have further
$$2m_0c\,u_t'' = 2m_0\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + U\right) + m_0c^2\right],$$
which reduces to $-2m_0(W - U)$ for conservative motion. We can drop the constant term $m_0c^2$ if $\frac{h}{2\pi i}\frac{\partial}{\partial t}$ is assumed to reduce in the latter case to $-W$ and not to $-\epsilon$ (this constant term entails an irrelevant factor $e^{-i2\pi m_0c^2t/h}$ in the expression of the functions $\psi$). With this condition, the preceding equations can be written down in the following form:
$$\begin{aligned}
\left(\frac{1}{2m_0}u^2 + p_t + U\right)\psi_1 + \frac{he}{4\pi m_0c}\,[H_z\psi_1 - (H_x + iH_y)\psi_2] &= 0\\
\left(\frac{1}{2m_0}u^2 + p_t + U\right)\psi_2 - \frac{he}{4\pi m_0c}\,[(H_x - iH_y)\psi_1 + H_z\psi_2] &= 0,
\end{aligned} \tag{235}$$
where u is the (three-dimensional) vector with the components $u_x, u_y, u_z$.
These equations represent the approximate form of the relativistic second-order equations (230) and can indeed be obtained from them by dropping the small quantities $\psi_3$, $\psi_4$. The approximation involved corresponds to neglecting terms of the second and higher orders in $v/c$, including those which represent the variation of mass with velocity. It must be mentioned that, although the functions $\psi_3$, $\psi_4$ are themselves small of the first order with regard to $\psi_1$, $\psi_2$, they are multiplied, in equations (230), by the factor $he/2\pi c$, which can be regarded as a small quantity of the first order (in $1/c$).
If, in equations (235), we drop the additional terms, proportional to the magnetic intensity (putting either H = 0 or c = ∞), they reduce to equation (203 a), § 26, the two functions $\psi_1$ and $\psi_2$ becoming identical with the single function $\psi$ of the previous theory. Equations (235) thus give a more complete description of the motion than equation (203 a). In fact they exhibit the duplicity phenomenon which has already been indicated in Part I, § 19, and traced to the electron's 'spin' or 'intrinsic magnetic moment'. To these properties correspond additional forces, which are represented by the additional terms, proportional to the magnetic field in equation (235), and also to the electric field in the exact equations (230).
The duplicity phenomenon, as explained in Part I, in its simplest form consists in the splitting-up of each quantized state, as determined by Bohr's theory, into two states which in general have slightly different energies. So far as the number of states is concerned, Bohr's theory gives the same results as the ordinary Schrödinger equation with one wave function $\psi$. Now to each solution of this equation, $\psi_{H'}$ say, there corresponds a set of two solutions of the system of equations (235) or rather of the equations obtained from them, if the operator $-p_t$ is replaced by the energy constant.
This means that to each energy-value $H'$ of the ordinary Schrödinger equation there correspond two slightly different energy values, $H'_+$ and $H'_-$ say, of the system of equations (235). Each of these energy values is associated with a set of two functions, $\psi_{1H'_+}$, $\psi_{2H'_+}$ and $\psi_{1H'_-}$, $\psi_{2H'_-}$; these four functions replace the single function $\psi_{H'}$ of the Schrödinger theory.
If, instead of the approximate equations (235), we take the system of four exact equations (230) and (230a), then by a similar argument it seems to follow that to each state of the ordinary Schrödinger theory there correspond, according to the exact theory, four states, whose energies, if the magnetic and electric field strengths are not too large, lie close to the energy $H'$ of the single Schrödinger state.
This conclusion is, however, fallacious, for the four second-order equations (230)-(230a) are not independent of each other, being in fact derived from the four first-order equations (229a)-(229b). So far as the number of solutions (i.e. states) is concerned, the latter are equivalent to two of the four second-order equations derived from them. We get, therefore, with the exact equations (230)-(230a), a duplicity phenomenon of the same type as with the approximate equations (235), the value of the energy being, of course, somewhat different in the exact theory from what it is in the approximate theory.
The exact theory, when compared with the approximate theory or with the original non-relativistic Schrödinger theory, leads, however, to an additional duplicity phenomenon of an entirely different type, which is not connected with the 'spin' property, but can be referred to as due to the variation of the mass with velocity. This type of duplicity is already implied in the relativistic equation with the single function $\psi$, which was derived at the beginning of this chapter. We come upon it in its simplest form in the case of free motion, when the operators $u_x, u_y, u_z, u_t$ can be replaced by ordinary numbers (multipliers) $g_x, g_y, g_z, \epsilon/c$ representing respectively the components of the momentum and the energy, including the rest-energy $m_0c^2$, divided by c. Equations (230) reduce in this case to the same form as equation (196), namely,
$$\left(g^2 - \frac{\epsilon^2}{c^2} + m_0^2c^2\right)\psi = 0,$$
which is equivalent to the ordinary relativistic relation between momentum and energy
$$g^2 - \frac{\epsilon^2}{c^2} + m_0^2c^2 = 0.$$
Now since this relation contains the square of the energy, it leads to two numerically equal values of the latter, one positive and the other negative,
$$\epsilon = \pm c\sqrt{m_0^2c^2 + g^2}.$$
In Einstein's mechanics, the negative value was rejected as having no physical meaning. It has, however, been explained already in Part I, § 19, that this rejection is not justified in wave mechanics, because of the possibility of a continuous transition from a state of positive to that of negative energy $\epsilon$ through imaginary values of the velocity or because of a 'jump' produced by some perturbing forces.
In the case of non-relativistic wave mechanics, we have, under the same conditions (free motion),
$$g^2 - 2m_0W = 0,$$
where W is the ordinary (kinetic) energy, not including the rest-energy $m_0c^2$. This non-relativistic energy is related to the positive energy $\epsilon$ of relativity mechanics by the equation
$$W = \epsilon - m_0c^2,$$
whereas the negative energy $\epsilon$ has no counterpart in non-relativity mechanics. The appearance of the negative energy $\epsilon$ in addition to the positive energy forms the essence of the duplicity phenomenon of the second kind. The situation is not substantially changed in the general case of motion in a conservative field of force, the only difference being that the positive and negative energies of the corresponding states are not numerically equal.
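A small numerical sketch may make the relation between the two descriptions concrete. It evaluates the two roots $\epsilon = \pm c\sqrt{m_0^2c^2 + g^2}$ for a free electron and compares $W = \epsilon - m_0c^2$ with the non-relativistic value $g^2/2m_0$. The chosen momentum (corresponding to about one per cent of the velocity of light), the CGS constants and the function name are assumptions made for the illustration.

```python
import math

C = 2.99792458e10     # velocity of light, cm/s
M0 = 9.1093837e-28    # electron rest mass, g

def energy_roots(g):
    """Both roots of g^2 - eps^2/c^2 + m0^2 c^2 = 0 for momentum g."""
    root = C*math.sqrt(M0**2*C**2 + g**2)
    return root, -root

g = M0*0.01*C                        # momentum at roughly v = 0.01 c
eps_plus, eps_minus = energy_roots(g)

W_rel = eps_plus - M0*C**2           # kinetic energy from the positive root
W_nonrel = g**2/(2*M0)               # value given by g^2 - 2*m0*W = 0
rel_diff = abs(W_rel - W_nonrel)/W_nonrel
print(eps_plus, eps_minus, rel_diff)
```

The two kinetic energies differ only by a relative amount of order $(v/c)^2$, while the negative root has no non-relativistic counterpart at all.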
Combining the two duplicity phenomena (that due to the spin and that due to the relativistic variation of the mass) we get a quadruplicity phenomenon which can conveniently (though not quite correctly) be associated with the replacement of the single $\psi$-function of the Schrödinger theory by the four $\psi$-functions of Dirac's theory. This association is not quite correct, for the same quadruplicity phenomenon would result from Pauli's theory, based on the use of two functions $\psi_1$ and $\psi_2$, if, in the approximate equations (235) defining them, the non-relativistic operator $u^2/2m_0 + p_t + U$ were replaced by the corresponding relativistic operator of the second order, $D = (u^2 - u_t^2 + m_0^2c^2)/2m_0$. It must be mentioned, however, that in doing this we should be guilty of inconsistency, because, having dropped additional terms of the second order proportional to the electric field strength in deriving the approximate equations (235) from the exact equations (230), we must also drop second- and higher-order terms, representing the dependence of mass upon velocity, in the main operator D.
In the case of free motion (represented by plane waves), there exists a very simple relation between the four functions $\psi$ referring to the positive energy and the corresponding negative energy solution of the Dirac equations (229a)-(229 b). Putting
$$\psi_k = a_k\,e^{i2\pi(g_xx + g_yy + g_zz - \epsilon t)/h}, \tag{236}$$
where the $a_k$ are constants (k = 1, 2, 3, 4), we can replace them by the following algebraic system:
$$\begin{aligned}
(g_x - ig_y)a_1 + g_za_2 + \frac{1}{c}(\epsilon + m_0c^2)a_4 &= 0\\
(g_x + ig_y)a_2 - g_za_1 + \frac{1}{c}(\epsilon + m_0c^2)a_3 &= 0
\end{aligned} \tag{236a}$$
$$\begin{aligned}
(g_x - ig_y)a_3 + g_za_4 + \frac{1}{c}(\epsilon - m_0c^2)a_2 &= 0\\
(g_x + ig_y)a_4 - g_za_3 + \frac{1}{c}(\epsilon - m_0c^2)a_1 &= 0.
\end{aligned} \tag{236b}$$
If, in these equations, the energy $\epsilon$ is replaced by $-\epsilon$, then the first two become identical with the second two and the latter with the former if simultaneously $a_1$, $a_2$ are replaced by $a_3$ and $a_4$, and $a_3$, $a_4$ by $-a_1$, $-a_2$. This means that, with
$$\psi_1 = \psi_1', \quad \psi_2 = \psi_2', \quad \psi_3 = \psi_3', \quad \psi_4 = \psi_4',$$
corresponding to $\epsilon = \epsilon' > 0$, we have
$$\psi_1 = \psi_3', \quad \psi_2 = \psi_4', \quad \psi_3 = -\psi_1', \quad \psi_4 = -\psi_2'$$
for $\epsilon = -\epsilon'$.
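The correspondence between positive and negative energy solutions can be verified numerically. The sketch below assumes the algebraic system in the form $(g_x - ig_y)a_1 + g_za_2 + \frac{1}{c}(\epsilon + m_0c^2)a_4 = 0$, $(g_x + ig_y)a_2 - g_za_1 + \frac{1}{c}(\epsilon + m_0c^2)a_3 = 0$ for (236 a), with the analogous pair for (236 b) containing $a_3, a_4$ and $\frac{1}{c}(\epsilon - m_0c^2)a_2$, $\frac{1}{c}(\epsilon - m_0c^2)a_1$. Units with $m_0 = c = 1$, the sample momentum and all function names are assumptions of the example.

```python
import math

def dirac_rows(g, eps, m0=1.0, c=1.0):
    """Coefficient rows of (236 a), (236 b); columns are a1, a2, a3, a4."""
    gx, gy, gz = g
    gm, gp = gx - 1j*gy, gx + 1j*gy
    ep = (eps + m0*c**2)/c
    em = (eps - m0*c**2)/c
    return [
        [gm,  gz, 0,   ep],   # first equation (236 a)
        [-gz, gp, ep,  0],    # second equation (236 a)
        [0,   em, gm,  gz],   # first equation (236 b)
        [em,  0,  -gz, gp],   # second equation (236 b)
    ]

def apply(rows, a):
    """Left-hand sides of the four equations for the amplitudes a."""
    return [sum(r*x for r, x in zip(row, a)) for row in rows]

def positive_solution(g, a1, a2, m0=1.0, c=1.0):
    """Positive-energy amplitudes: a3, a4 follow from (236 a)."""
    gx, gy, gz = g
    eps = c*math.sqrt(m0**2*c**2 + gx**2 + gy**2 + gz**2)
    k = c/(eps + m0*c**2)
    a3 = -k*((gx + 1j*gy)*a2 - gz*a1)
    a4 = -k*((gx - 1j*gy)*a1 + gz*a2)
    return eps, [a1, a2, a3, a4]

g = (0.03, -0.02, 0.05)                    # slow electron, |g| << m0*c
eps, a = positive_solution(g, 1.0, 0.5j)
res_pos = max(abs(v) for v in apply(dirac_rows(g, eps), a))

# Negative-energy partner: a1, a2 -> a3, a4 and a3, a4 -> -a1, -a2.
b = [a[2], a[3], -a[0], -a[1]]
res_neg = max(abs(v) for v in apply(dirac_rows(g, -eps), b))

small_over_large = (math.hypot(abs(a[2]), abs(a[3]))
                    / math.hypot(abs(a[0]), abs(a[1])))
print(res_pos, res_neg, small_over_large)
```

Both residuals vanish to rounding accuracy, and for the positive-energy solution the amplitudes $a_3$, $a_4$ are smaller than $a_1$, $a_2$ by a factor of order $v/c$; in the mapped negative-energy solution the roles of the two pairs are interchanged.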
It has been assumed, hitherto, that the functions $\psi_3$, $\psi_4$ were small (of the first order in $v/c$) compared with $\psi_1$, $\psi_2$. We now see that this is only true if we restrict ourselves to positive energy solutions; the converse is true in the case of negative energy solutions, both for free motion and for a motion in a conservative field of force.
From the point of view of the old relativity mechanics, the reversal of the sign of the energy $\epsilon = c^2m_0/\sqrt{1 - v^2/c^2}$ is equivalent to the reversal of the sign of the rest-mass $m_0$. This is not exactly true, however, in the wave-mechanical theory. For a reversal of the sign of $m_0$ in equations (236a)-(236b) leads to the replacement of $\psi_1$, $\psi_2$ by $\psi_3$, $\psi_4$ and $\psi_3$, $\psi_4$ by $\psi_1$, $\psi_2$ without reversal of the sign of the latter. The two solutions have nothing to do with each other, since they refer to particles of different kinds (particles with negative rest-mass being in reality non-existent), whereas the two solutions corresponding to $\epsilon = \pm\epsilon'$ refer to the same particle with a positive rest-mass $m_0$, the values of the energy being due to the ambiguity of sign in the radical of the expression
$$\epsilon = c^2m_0/\sqrt{1 - v^2/c^2}.$$
It is important to notice that the states of negative energy, as determined by relativity wave mechanics, are not directly observable. According to Dirac's theory of the duality of matter and electricity, outlined in Part I, § 19, nearly all these states are occupied by electrons, the vacant states ('holes') being observed as protons. According to the revised version of this theory, the holes in question represent not protons but positive electrons, which have been recently discovered by Anderson in America (1932) and by Blackett and Occhialini in England (1933).
29. The Approximate Pauli Theory in the Two-dimensional
Matrix Form; Electron's Magnetic Moment and Angular
Momentum
The approximate (non-relativistic) equations (235) were initially obtained by W. Pauli in 1927, not as an approximation to the Dirac theory, which was published a year later, but as the result of a semi-empirical attempt to interpret wave-mechanically the duplicity phenomenon, which a year before had been incorporated by Uhlenbeck and Goudsmit into the Bohr theory on the assumption that the electron possesses a spin motion, with an angular momentum equal to half of the Bohr unit $h/2\pi$ and a magnetic moment equal to Bohr's magneton
$\mu = eh/(4\pi m_0c)$.
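For orientation, the magnitudes involved can be evaluated directly from the formula $\mu = eh/(4\pi m_0c)$. The sketch below uses Gaussian (CGS) values of the constants; the variable names are illustrative.

```python
import math

E_CHARGE = 4.80320471e-10   # elementary charge, esu
H_PLANCK = 6.62607015e-27   # Planck's constant h, erg s
M0 = 9.1093837e-28          # electron rest mass, g
C = 2.99792458e10           # velocity of light, cm/s

# Bohr's magneton mu = e*h / (4*pi*m0*c), in erg per gauss.
bohr_magneton = E_CHARGE*H_PLANCK/(4*math.pi*M0*C)

# Spin angular momentum: half of the Bohr unit h/(2*pi), in erg s.
spin_angular_momentum = 0.5*H_PLANCK/(2*math.pi)

print(bohr_magneton, spin_angular_momentum)
```

The magneton comes out close to $9.27\times10^{-21}$ erg per gauss, so that even in a field of $10^4$ gauss the magnetic energy is only of the order of $10^{-16}$ erg, i.e. a small correction to atomic energies.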
Pauli's equations (235) can actually be put in a form corresponding to this assumption, i.e. giving a wave-mechanical interpretation of the electron's 'spin', and, indeed, by using a matrix notation, based upon the representation of the two functions $\psi_1$, $\psi_2$ as elements of a one-column matrix
$$\psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}, \tag{237}$$
the conjugate complex functions $\psi_1^*$, $\psi_2^*$ forming the adjoint one-row matrix
$$\psi^\dagger = \{\psi_1^*, \psi_2^*\}. \tag{237a}$$
Under this condition, the two equations (235) can be written in the form
$$P\psi = 0, \tag{238}$$
where P is a square 'operator-matrix' of the second rank
$$P = \begin{pmatrix}P_{11} & P_{12}\\ P_{21} & P_{22}\end{pmatrix} \tag{238a}$$
with suitably defined elements. These elements must be defined in such a way that the two equations (235) assume the form
$$\begin{aligned}
(P\psi)_1 &= P_{11}\psi_1 + P_{12}\psi_2 = 0\\
(P\psi)_2 &= P_{21}\psi_1 + P_{22}\psi_2 = 0.
\end{aligned} \tag{238b}$$
Hence it follows that
$$P = \left(\frac{1}{2m_0}u^2 + p_t + U\right)\delta - \mu\,\mathbf{H}\cdot\boldsymbol{\sigma}, \tag{239}$$
where
$$\delta = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \tag{239a}$$
is the unit matrix of the second rank and $\boldsymbol{\sigma}$ is a vector matrix with the following rectangular components:
$$\sigma_x = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \qquad \sigma_y = \begin{pmatrix}0 & i\\ -i & 0\end{pmatrix}, \qquad \sigma_z = \begin{pmatrix}-1 & 0\\ 0 & +1\end{pmatrix}. \tag{239b}$$
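The scalar product $\mathbf{H}\cdot\boldsymbol{\sigma}$ denotes, as usual, the sum $H_x\sigma_x + H_y\sigma_y + H_z\sigma_z$. This is a matrix with the elements
$$\begin{aligned}
(\mathbf{H}\cdot\boldsymbol{\sigma})_{11} &= -H_z, & (\mathbf{H}\cdot\boldsymbol{\sigma})_{12} &= H_x + iH_y,\\
(\mathbf{H}\cdot\boldsymbol{\sigma})_{21} &= H_x - iH_y, & (\mathbf{H}\cdot\boldsymbol{\sigma})_{22} &= +H_z.
\end{aligned} \tag{239c}$$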
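The matrix elements (239 c) can be checked mechanically. The sketch below assumes the components of the spin matrix in the form $\sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix}$, $\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$, $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$ (the sign convention implied by (239 c), which differs from the one most common today), and uses arbitrary sample field components. The helper names are hypothetical.

```python
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]

def h_dot_sigma(Hx, Hy, Hz):
    """The 2x2 matrix Hx*sigma_x + Hy*sigma_y + Hz*sigma_z."""
    return [[Hx*SIGMA_X[r][s] + Hy*SIGMA_Y[r][s] + Hz*SIGMA_Z[r][s]
             for s in range(2)] for r in range(2)]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[r][k]*B[k][s] for k in range(2)) for s in range(2)]
            for r in range(2)]

def dagger(A):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[complex(A[s][r]).conjugate() for s in range(2)] for r in range(2)]

Hx, Hy, Hz = 0.3, -1.2, 0.7          # arbitrary sample field components
S = h_dot_sigma(Hx, Hy, Hz)

# Elements as listed in (239 c).
expected = [[-Hz, Hx + 1j*Hy], [Hx - 1j*Hy, Hz]]
matches_239c = all(abs(S[r][s] - expected[r][s]) < 1e-15
                   for r in range(2) for s in range(2))

# Hermitian character, and (H.sigma)^2 = |H|^2 times the unit matrix.
is_hermitian = all(abs(S[r][s] - dagger(S)[r][s]) < 1e-15
                   for r in range(2) for s in range(2))
S2 = matmul(S, S)
H2 = Hx**2 + Hy**2 + Hz**2
square_ok = all(abs(S2[r][s] - (H2 if r == s else 0)) < 1e-12
                for r in range(2) for s in range(2))
print(matches_239c, is_hermitian, square_ok)
```

Since $(\mathbf{H}\cdot\boldsymbol{\sigma})^2 = H^2\delta$, the characteristic values of $\mathbf{H}\cdot\boldsymbol{\sigma}$ are $\pm|\mathbf{H}|$, so the magnetic term $-\mu\,\mathbf{H}\cdot\boldsymbol{\sigma}$ contributes the two energies $\mp\mu|\mathbf{H}|$.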
The matrix $\boldsymbol{\sigma}$ was introduced by Pauli for the wave-mechanical representation of the electron's magnetic moment which was supposed to be due to its spin. This 'intrinsic' magnetic moment can be defined as the operator or matrix
$$\boldsymbol{\mu} = \mu\,\boldsymbol{\sigma},$$
where $\mu = eh/(4\pi m_0c)$ is the value of the Bohr magneton.
The reason for this is that equation (238) can be written in the usual form
$$(K + p_t)\psi = 0 \tag{240}$$
if $p_t$ is defined as the matrix-operator
$$p_t = \delta\,\frac{h}{2\pi i}\frac{\partial}{\partial t} \tag{240a}$$
and K as the energy matrix-operator
$$K = \frac{1}{2m_0}u^2\delta + U\delta - \boldsymbol{\mu}\cdot\mathbf{H}, \tag{240b}$$
the additional term $-\boldsymbol{\mu}\cdot\mathbf{H}$ having exactly the same form as the energy of an elementary magnet with a moment $\boldsymbol{\mu}$ in the given external magnetic field H.
We thus see that the generalization of the Schrödinger theory which is necessary to account for the spin phenomenon consists in adding to the energy operator the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ and in replacing ordinary operators by operator-matrices of the second rank, the function $\psi$ being replaced accordingly by the one-column matrix (237). The old operators of the Schrödinger theory, such as $\frac{1}{2m_0}u^2 + U$ and $\frac{h}{2\pi i}\frac{\partial}{\partial t}$, are replaced by their products with the unit matrix of the second rank $\delta$.
In future we shall usually omit the unit matrix, its presence as a factor being understood whenever we have to deal with an ordinary operator (like $u^2$ or U, etc.) of the old theory. With this convention, the old theory can be preserved without any change of form whatsoever, except for the addition of the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ to the energy operator and the corresponding modification of other expressions connected with the resulting operator K.
Thus, for instance, if the characteristic values of K, which will be denoted by $K'$, $K''$, etc., as before, are imagined to be multiplied by the unit matrix $\delta$, we may write, omitting the latter, in the same way as in the old theory:
$$(K - K')\psi_{K'} = 0, \tag{241}$$
which is actually equivalent to the system of equations
$$\begin{aligned}
(K_{11} - K')\psi_{K'1} + K_{12}\psi_{K'2} &= 0\\
K_{21}\psi_{K'1} + (K_{22} - K')\psi_{K'2} &= 0.
\end{aligned} \tag{241a}$$
It should be mentioned that Schrödinger's theory can be regarded as a particular (or rather limiting) case of Pauli's theory, obtained by putting $\mu = 0$, i.e. by dropping the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ in K, but preserving the matrix form of the resulting operator H, which can be defined as the product of the ordinary operator $\frac{1}{2m_0}u^2 + U$ and the unit matrix $\delta$. The two functions $\psi_1$ and $\psi_2$ become identical in this case except for a constant factor which remains arbitrary, and which, without loss of generality, can be put equal to zero, the function $\psi_2$ thus vanishing and $\psi_1$ reducing to the ordinary Schrödinger function $\psi$.
Before proceeding further, we must consider the equation which is satisfied by the function-matrix $\psi^\dagger$, adjoint to $\psi$.
The conjugate complex of equation (240) satisfied by $\psi$ is
$$(K^* + p_t^*)\psi^* = 0, \tag{242}$$
where
$$\psi^* = \begin{pmatrix}\psi_1^*\\ \psi_2^*\end{pmatrix}$$
is the conjugate complex of $\psi$. We shall not, however, in future need this matrix, but the transposed matrix $\psi^\dagger = \{\psi_1^*, \psi_2^*\}$. If the matrix elements of K and $p_t$ were ordinary numbers (and not operators), we could, instead of the preceding equation, write
$$\psi^\dagger(K^\dagger + p_t^\dagger) = 0. \tag{243}$$
We shall preserve this equation in the general case, with the convention that the operators $K^\dagger$ and $p_t^\dagger$, contrary to the rule assumed hitherto, act not on their right but on their left. The same refers to matrix operators of any type. Thus, if
$$F = \begin{pmatrix}F_{11} & F_{12}\\ F_{21} & F_{22}\end{pmatrix}$$
is a matrix operator acting on $\psi$ and $F\psi$ the one-column matrix
$$F\psi = \begin{pmatrix}F_{11}\psi_1 + F_{12}\psi_2\\ F_{21}\psi_1 + F_{22}\psi_2\end{pmatrix}$$
resulting therefrom, then the adjoint matrix $(F\psi)^\dagger$ will be defined by
$$(F\psi)^\dagger = \psi^\dagger F^\dagger = \{\psi_1^*, \psi_2^*\}\begin{pmatrix}F_{11}^* & F_{21}^*\\ F_{12}^* & F_{22}^*\end{pmatrix} = \{F_{11}^*\psi_1^* + F_{12}^*\psi_2^*,\; F_{21}^*\psi_1^* + F_{22}^*\psi_2^*\},$$
which is in accordance with the usual definition of adjoint matrices. The necessity for reversing the direction of the action of an operator from right to left in a transition from F to $F^\dagger$ is due to the fact that $\psi^\dagger$, being a one-row matrix, must always stand as the first factor in a matrix product involving it (while $\psi$, being defined as a one-column matrix, must always stand in the second place).
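With this convention, the equation for the matrix-function $\psi^\dagger_{K'}$ can be written in the form
$$\psi^\dagger_{K'}(K^\dagger - K') = 0$$
or, since $K^\dagger = K$,
$$\psi^\dagger_{K'}(K - K') = 0. \tag{243a}$$
This is equivalent to the ordinary equations
$$\begin{aligned}
\psi^*_{K'1}(K_{11} - K') + \psi^*_{K'2}K_{21} &= (K_{11}^* - K')\psi^*_{K'1} + K_{12}^*\psi^*_{K'2} = 0,\\
\psi^*_{K'1}K_{12} + \psi^*_{K'2}(K_{22} - K') &= K_{21}^*\psi^*_{K'1} + (K_{22}^* - K')\psi^*_{K'2} = 0,
\end{aligned}$$
which are the conjugate complex of the equations (241 a) ($K'$ being real).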
The product of the matrices $\psi^\dagger$ and $\psi$ is a matrix consisting of one row and one column only; it can be treated accordingly as a simple number. This number
$$\psi^\dagger\psi = \psi_1^*\psi_1 + \psi_2^*\psi_2 \tag{244}$$
can also be regarded as the scalar product of the two-component vectors $\psi$ and $\psi^*$ (or $\psi^\dagger$). It measures, as we know, the probability-density for finding the electron at a given point in a state of motion specified by the matrix or vector $\psi$. If the latter is 'quadratically integrable', i.e. if the integral $\int\psi^\dagger\psi\,dV$ extended over the whole space converges, then $\psi$ can be normalized by setting this integral equal to 1. This refers, in particular, to functions $\psi_{K'}$ belonging to a discrete energy spectrum, in which case we can put
$$\int\psi^\dagger_{K'}\psi_{K'}\,dV = 1. \tag{244a}$$
It can in addition easily be shown in practically the same way as in the old theory that functions $\psi$ belonging to different energy values, $K'$ and $K''$ say, satisfy the orthogonality relation
$$\int\psi^\dagger_{K'}\psi_{K''}\,dV = 0 \qquad (K' \neq K''), \tag{244b}$$
where
$$\psi^\dagger_{K'}\psi_{K''} = \psi^*_{K'1}\psi_{K''1} + \psi^*_{K'2}\psi_{K''2}$$
is the product of the matrices (or vectors) $\psi^\dagger_{K'}$ and $\psi_{K''}$.
We have in fact, multiplying the equation (K—K')ipK>----- 0 (on the
left) by tp^ and the equation ^ . ( K —K ”) = 0 (on the right) by ipK>
and subtracting one from the other,
^ { K f a ) - { f a K ^ ) f a = {K’- K " ) fa fa . . (244c)
The two sides of this equation can be considered as ordinary numbers.
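If $K$ were not a differential operator but an ordinary matrix of
Hermitian character, i.e. satisfying the condition $K_{\alpha\beta} = K^*_{\beta\alpha}$ or
$K = K^\dagger$, then the left side of (244 c) would vanish identically. In
reality, the matrix $K$, as defined by formula (240 b), has two component
parts of the above type, namely the potential energy $U\delta$ and the
additional magnetic energy $-\boldsymbol{\mu}\cdot\mathbf{H} = -\mu\,\boldsymbol{\sigma}\cdot\mathbf{H}$. In fact, it can be directly
seen from the expressions (239 b) for the rectangular components of
Pauli’s ‘spin matrix’ $\boldsymbol{\sigma}$ that
$$\boldsymbol{\sigma}^\dagger = \boldsymbol{\sigma}. \tag{245}$$
The left side of equation (244 c) thus reduces to
$$\frac{1}{2m_0}\left[\psi^\dagger_{K''}(u^2\psi_{K'}) - (\psi^\dagger_{K''}u^2)\psi_{K'}\right]
= \frac{1}{2m_0}\sum_{\alpha=1}^{2}\left(\psi^*_{K''\alpha}\,u^2\psi_{K'\alpha} - \psi_{K'\alpha}\,u^{2*}\psi^*_{K''\alpha}\right).$$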
284 WAVE MECHANICS OF A SINGLE ELECTRON §29
It should be mentioned that in the case $\psi_{K'} = \psi_{K''}$ we obtain under
the div-sign an approximate expression for the current density $\mathbf{j}$. [Cf.
the derivation of the expressions (201 b) in § 25.]
Multiplying equation (244 c) by the volume-element $dV$ and
integrating over all space, we thus get
$$(K' - K'')\int \psi^\dagger_{K''}\psi_{K'}\, dV = 0,$$
whence the orthogonality relation (244 b) follows, unless $K' = K''$. The
case of degeneracy, i.e. when $K' = K''$, can be dealt with in the
new theory in exactly the same way as in the old theory, the Schrödinger
‘scalar’ function $\psi$ being replaced by the Pauli two-component vector
(or matrix) $\psi$.
The present theory in the above form is a combination of the ordinary
operator theory and the matrix theory, as developed in the preceding
chapters on the basis of Schrödinger’s equation. It can be reduced,
however, to the usual matrix form by introducing the matrix-components
of the various (two-dimensional) operators $F$ by means of the
formula
$$F_{K''K'} = \int \psi^\dagger_{K''} F \psi_{K'}\, dV, \tag{246}$$
where
$$\psi^\dagger_{K''} F \psi_{K'} = \sum_{\alpha=1}^{2}\sum_{\beta=1}^{2} \psi^*_{K''\alpha}\, F_{\alpha\beta}\, \psi_{K'\beta} \tag{246 a}$$
is an ordinary number (the ‘scalar product’ of the two-dimensional
vectors $\psi^\dagger_{K''}$ and $F\psi_{K'}$; the latter can be regarded as the product of
the vector $\psi_{K'}$ and the two-dimensional ‘tensor’ $F$).
Replacing the functions $\psi_{K'}$ by their ‘amplitudes’ $\psi^0_{K'}$, with which
they are connected by the same relation
$$\psi_{K'} = \psi^0_{K'}(x, y, z)\, e^{-i2\pi K' t/h}$$
as in the Schrödinger theory, we obtain the matrix-elements of $F$
$$F^0_{K''K'} = \int \psi^{0\dagger}_{K''} F \psi^0_{K'}\, dV.$$
They are connected with the matrix-components by the usual relations
$$F_{K''K'} = F^0_{K''K'}\, e^{i2\pi (K''-K')t/h}. \tag{246 b}$$
All the theorems which have been established in Chap. III with regard
to the matrix representation of physical quantities ‘from the point of
view’ of the energy $K$ remain valid if the latter, as well as the operators
representing other physical quantities, are defined as two-dimensional
tensors (or square matrices of the second rank). We have, for instance,
the usual expansion formula
$$F\psi_{K'} = \sum_{K''} F_{K''K'}\,\psi_{K''}, \tag{247}$$
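which is a direct consequence of the orthogonality and normalizing
relations for the vector-functions $\psi_{K'}$ and which is equivalent to the
following two component-equations:
$$\sum_{\beta=1}^{2} F_{\alpha\beta}\,\psi_{K'\beta} = \sum_{K''} F_{K''K'}\,\psi_{K''\alpha} \qquad (\alpha = 1, 2), \tag{247 a}$$
that is,
$$F_{11}\psi_{K'1} + F_{12}\psi_{K'2} = \sum_{K''} F_{K''K'}\,\psi_{K''1},$$
$$F_{21}\psi_{K'1} + F_{22}\psi_{K'2} = \sum_{K''} F_{K''K'}\,\psi_{K''2}.$$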
The transformation theory, i.e. the transformation of the matrices of
various physical quantities from the point of view of $K$ (original energy
matrix) to the point of view of some other quantity $L$, as developed
in Chap. IV on the basis of Schrödinger’s ‘one-dimensional’ theory, can
be applied without any formal modification to Pauli’s two-dimensional
theory. Introducing the transformation coefficients $a_{K''L'}$, we have, for
example, the usual equation
$$\psi_{L'} = \sum_{K''} a_{K''L'}\,\psi_{K''}, \tag{248}$$
which is equivalent to the two equations
$$\psi_{L'\alpha} = \sum_{K''} a_{K''L'}\,\psi_{K''\alpha} \qquad (\alpha = 1, 2). \tag{248 a}$$
To make the result expressed by these transformation equations
unambiguous, we must affix to the functions $\psi$ the index $x$ (short for $x, y, z$,
i.e. the rectangular coordinates of the point to which these functions
refer). We thus get
$$\psi_{L';\,\alpha x} = \sum_{K''} a_{K''L'}\,\psi_{K'';\,\alpha x}. \tag{248 b}$$
This equation clearly shows that the index $\alpha$ (which is supposed to
assume the two values 1 and 2) plays exactly the same role as the space
coordinates $x, y, z$. It can be considered accordingly as an additional
‘fourth’ coordinate, which is usually referred to as the ‘spin coordinate’.
With this condition, the two functions $\psi_1(x, y, z)$ and $\psi_2(x, y, z)$, forming
the components of the Pauli vector (or matrix) $\psi$, can be considered
as the two values of the same function $\psi(\alpha, x, y, z)$ referring to the same
values of $x, y, z$ and to the two different values $\alpha = 1$ and $\alpha = 2$ of the
spin coordinate. The addition of the latter to the usual three
coordinates $x, y, z$ enables one to reduce the two-dimensional Pauli theory
to the old uni-dimensional form, with one modification only concerning
the operators $F$ as defined ‘from the point of view’ of the basic
quantities $\alpha, x, y, z$. These operators can be defined as ordinary functions
of the continuously variable quantities $x, y, z$ and of the elementary
differential operators $p_x = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, $p_y = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial y}$, $p_z = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial z}$; they must,
however, be defined as matrices with regard to the discrete variable $\alpha$.
In fact, the result of the application of an operator $F$ to a function
of the type $\psi(\alpha, x, y, z)$ must be another function of the same type
$\phi(\beta, x, y, z)$, referring to the same values of $x, y, z$ but not necessarily to
the same value of $\alpha$. Assuming $\beta$ to be independent of $\alpha$, we see that the
most general type of linear operator satisfying the condition
$$F\psi(\alpha, x) = \phi(\beta, x)$$
can be defined by putting
$$F\psi(\alpha, x) = \sum_{\alpha=1}^{2} F_{\beta\alpha}\,\psi(\alpha, x),$$
where the $F_{\beta\alpha}$ are ordinary operators involving the space coordinates
only.
It is possible and sometimes convenient to modify the preceding
notation in the opposite way, namely, by preserving $\alpha$ as a duplicity
index and introducing similar indices for the two values of all the other
quantities which are derived from a single value through the action of
the spin term $-\mu\,\boldsymbol{\sigma}\cdot\mathbf{H}$ in the energy operator $K$. This refers in the first
place to the characteristic values of the energy itself. The two values
of $K'$, which are obtained by the splitting up of a certain characteristic
value of the Schrödinger energy operator $H'$ and which, in general, lie
very close to each other, could be denoted by adding to one of them
a subsidiary index, $\kappa$ say, assuming the two values 1 and 2, the
combination $(1, K')$ being equivalent to $K'_+$, say, and $(2, K')$ being equivalent
to $K'_-$, where $K'_\pm$ are the two values of $K'$ corresponding to the given
value of $H'$. With this notation, the transformation equation (248 b) can
be rewritten in the form
$$\psi_{\kappa'L';\,\alpha'x'} = \sum_{K''}\sum_{\kappa''=1}^{2} a_{\kappa''K'';\,\kappa'L'}\,\psi_{\kappa''K'';\,\alpha'x'},$$
where $K''$ and $L'$ are the single values of the energy operators $K$ or $L$
unperturbed by the spin term $-\boldsymbol{\mu}\cdot\mathbf{H}$.
From this point of view, the matrix components of an operator $F$:
$$F_{\kappa''K'';\,\kappa'K'} = \int \psi^\dagger_{\kappa''K''}\, F\, \psi_{\kappa'K'}\, dV, \tag{249}$$
can be grouped together into two-dimensional matrices
$$F_{K''K'} = \begin{pmatrix} F_{1K'';\,1K'} & F_{1K'';\,2K'} \\ F_{2K'';\,1K'} & F_{2K'';\,2K'} \end{pmatrix}, \tag{249 a}$$
which correspond to the ordinary components of the matrix $F_{K''K'}$ defined
from the point of view of the Schrödinger energy operator $K$ without
the spin term.
The matrix $F_{K''K'}$ considered in this way (i.e. as formed by elements
which are themselves matrices) is called a ‘super-matrix’.
We shall not consider the further development of these formal
considerations. The preceding outline will be sufficient for handling various
problems connected with Pauli’s theory in any one of the three
equivalent forms which have just been indicated. The simplest and most
important of these problems is the approximate solution of Pauli’s
equation, considering the spin term $-\mu\,\boldsymbol{\sigma}\cdot\mathbf{H}$ as a small perturbation.
The energy operator resulting from $K$ by the omission of this term will
be denoted by $H$; it is equal to the Schrödinger operator $u^2/(2m_0) + U$
multiplied by the two-dimensional unit matrix $\delta$. In order to avoid
confusion between this operator and the magnetic field strength, we
shall denote the latter by $\boldsymbol{\mathfrak{H}}$.
The change of the energy values $H'$ produced by this perturbation
can be calculated, to the first approximation, by means of the same
equations as in the case of the Schrödinger perturbation theory. In
doing this we must, however, keep in mind the fact that the
unperturbed problem is degenerate, each value of $H'$ corresponding to at least
two different states. It is just this latent duplicity which must be
revealed by taking into account the spin energy
$$S = -\mu\,\boldsymbol{\mathfrak{H}}\cdot\boldsymbol{\sigma}. \tag{250}$$
Assuming no other degeneracy to take place (or the matrix elements
of the perturbation energy $S$ with regard to other states of equal
unperturbed energy to vanish), we obtain the following equation for
the first-order correction $\Delta H'$ of the unperturbed energy
$$\begin{vmatrix} S^{1,1} - \Delta H' & S^{1,2} \\ S^{2,1} & S^{2,2} - \Delta H' \end{vmatrix} = 0, \tag{250 a}$$
where
$$S^{\kappa,\lambda} = S_{\kappa H';\,\lambda H'} = \int \psi^\dagger_{\kappa H'}\, S\, \psi_{\lambda H'}\, dV, \tag{250 b}$$
the indices $\kappa, \lambda$ (= 1, 2) specifying the two degenerate states in
question. They are used as superscripts in the matrix elements of $S$ in
order to distinguish the latter from the matrix elements with regard
to the spin-index $\alpha$.
The two functions $\psi_{\kappa H'}$ ($\kappa = 1, 2$), or rather function-pairs $\psi_{\kappa H';\,\alpha}$
($\alpha = 1, 2$) describing these degenerate states must be defined with the
help of the ordinary Schrödinger function $\psi_{H'x}$, satisfying $(H - H')\psi_{H'x} = 0$, in such a way as to
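satisfy the orthogonality and normalizing relations. The simplest way
to do this is to put
$$\psi_{1H';\,1x} = \psi_{H'x},\quad \psi_{1H';\,2x} = 0, \qquad \psi_{2H';\,1x} = 0,\quad \psi_{2H';\,2x} = \psi_{H'x} \tag{251}$$
(supposing the function $\psi_{H'x}$ to be normalized).
By the definition of the spin matrix $\boldsymbol{\sigma}$ [cf. equations (239 b)] we have,
dropping the indices $H'$ and $x$,
$$(S\psi_\lambda)_1 = S_{11}\psi_{\lambda 1} + S_{12}\psi_{\lambda 2} = \mu\left[\mathfrak{H}_z\psi_{\lambda 1} - (\mathfrak{H}_x + i\mathfrak{H}_y)\psi_{\lambda 2}\right],$$
$$(S\psi_\lambda)_2 = S_{21}\psi_{\lambda 1} + S_{22}\psi_{\lambda 2} = \mu\left[(-\mathfrak{H}_x + i\mathfrak{H}_y)\psi_{\lambda 1} - \mathfrak{H}_z\psi_{\lambda 2}\right].$$
In the present case these expressions reduce to
$$(S\psi_1)_1 = \mu\,\mathfrak{H}_z\psi, \qquad (S\psi_1)_2 = \mu(-\mathfrak{H}_x + i\mathfrak{H}_y)\psi$$
for $\lambda = 1$, and
$$(S\psi_2)_1 = -\mu(\mathfrak{H}_x + i\mathfrak{H}_y)\psi, \qquad (S\psi_2)_2 = -\mu\,\mathfrak{H}_z\psi$$
for $\lambda = 2$. We thus get, with the help of (250 b) or
$$S^{\kappa,\lambda} = \sum_{\alpha=1}^{2}\int \psi^*_{\kappa\alpha}\,(S\psi_\lambda)_\alpha\, dV:$$
$$S^{1,1} = \mu\int \mathfrak{H}_z\,\psi^*\psi\, dV, \qquad
S^{1,2} = -\mu\int (\mathfrak{H}_x + i\mathfrak{H}_y)\,\psi^*\psi\, dV, \qquad
S^{2,2} = -\mu\int \mathfrak{H}_z\,\psi^*\psi\, dV, \tag{251 a}$$
whence, according to (250 a),
$$(\Delta H')^2 = (S^{1,1})^2 + |S^{1,2}|^2,$$
since $S^{2,2} = -S^{1,1}$ and $S^{2,1} = S^{1,2\,*}$, or
$$\Delta H' = \pm\sqrt{(S^{1,1})^2 + |S^{1,2}|^2}. \tag{251 b}$$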
This formula solves our problem so far as the splitting of the original
‘unperturbed’ energy-level is concerned. The fact that the two
sub-levels have an additional energy of the same magnitude and of opposite
sign can be interpreted by assuming that the intrinsic magnetic moment
of the electron has in both cases opposite orientations varying, in
general, from one place to another according to the direction of the
magnetic field. In the simplest case of a homogeneous field, the two
orientations can be shown to be parallel to the latter.
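The step from the secular equation (250 a) to the closed form (251 b) can be checked with numbers. The sketch below solves the quadratic for a 2×2 matrix with $S^{2,2} = -S^{1,1}$ and $S^{2,1} = S^{1,2\,*}$ and compares the roots with the square-root formula. The numerical values of `S11` and `S12` and the function names are illustrative assumptions only.

```python
import cmath
import math

def secular_roots(S):
    """Roots of det(S - x * unit) = 0 for a 2x2 matrix S, in ascending order."""
    tr = S[0][0] + S[1][1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return sorted([((tr - disc) / 2).real, ((tr + disc) / 2).real])

def splitting(S11, S12):
    """Closed form (251 b): the two corrections -r and +r."""
    r = math.sqrt(S11 ** 2 + abs(S12) ** 2)
    return [-r, r]

# Illustrative matrix elements (arbitrary numbers, in arbitrary energy units).
S11 = 0.3
S12 = -0.4 + 0.2j
S = [[S11, S12], [S12.conjugate(), -S11]]

roots = secular_roots(S)
expected = splitting(S11, S12)
```

The two roots are equal in magnitude and opposite in sign, as the paragraph above states for the two sub-levels.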
We have, in fact, in this case
$$S^{1,1} = \mu\,\mathfrak{H}_z, \qquad S^{1,2} = -\mu(\mathfrak{H}_x + i\mathfrak{H}_y),$$
so that
$$\Delta H' = \pm\mu\,\mathfrak{H}, \tag{251 c}$$
where $\mathfrak{H} = \sqrt{\mathfrak{H}_x^2 + \mathfrak{H}_y^2 + \mathfrak{H}_z^2}$ is the magnitude of the magnetic field
strength. This formula is in full agreement with the assumption that
the electron has an intrinsic magnetic moment of magnitude $\mu$ (Bohr’s
magneton), which in a homogeneous magnetic field is oriented either
in the same or in the opposite direction to the magnetic lines of force.
It can in addition easily be shown that, in the case under
consideration, formula (251 c), which has been derived as the first approximation,
holds exactly.
For the sake of simplicity, we shall imagine the magnetic field to be
parallel to the z-axis. Pauli’s equation then reduces to the form
$$(H - \mu\,\mathfrak{H}\,\sigma_z - K')\psi = 0,$$
which is equivalent to the two equations (cf. (235)):
$$(H + \mu\,\mathfrak{H} - K')\psi_1 = 0,$$
$$(H - \mu\,\mathfrak{H} - K')\psi_2 = 0.$$
If $\psi_{H'}$ is the solution of the Schrödinger equation $(H - H')\psi_{H'} = 0$
corresponding to the unsplit energy-level $H'$, then the solution of the
preceding system can be put in the form
(1) $K' = H' + \mu\,\mathfrak{H}$, $\psi_1 = \psi_{H'}$, $\psi_2 = 0$.
(2) $K' = H' - \mu\,\mathfrak{H}$, $\psi_1 = 0$, $\psi_2 = \psi_{H'}$.
The first case obviously corresponds to an orientation in the direction
opposite to that of the magnetic field, and the second to an orientation
in a direction coinciding with it (i.e. in the direction of the positive
z-axis).
This indicates, incidentally, that the functions $\psi_1$ and $\psi_2$ can be
considered as the probability amplitudes for finding the electron at a
given point with its intrinsic magnetic moment pointing in the negative
and positive directions of the z-axis respectively. In the general case,
both of them are different from zero. It is perfectly natural that, under
this condition, the probability of finding the electron at a given point
irrespective of its orientation should be measured by the sum $|\psi_1|^2 + |\psi_2|^2$.
We see, further, that the index $\alpha$, which distinguishes the two
components of the ‘vector’ $\psi$, fully deserves the title of a fourth
‘spin-coordinate’; it must be borne in mind, however, that it specifies not
the orientation of the ‘spin’ or magnetic axis in space, but only its
orientation in one of the two senses parallel to a given direction, namely,
that of the z-axis.
This interpretation is supported by the form of the expression for
the average or probable value of the z-component of the electron’s
magnetic moment, as defined in the usual way by the formula
$$\bar\mu_z = \int \psi^\dagger \mu_z \psi\, dV.$$
We have, namely, with $\mu_z = \mu\sigma_z$ and
$$(\sigma_z\psi)_1 = \sigma_{z11}\psi_1 + \sigma_{z12}\psi_2 = -\psi_1, \qquad
(\sigma_z\psi)_2 = \sigma_{z21}\psi_1 + \sigma_{z22}\psi_2 = +\psi_2,$$
$$\bar\mu_z = \mu\int (\psi_2^*\psi_2 - \psi_1^*\psi_1)\, dV. \tag{252}$$
In a similar way we find
$$\bar\mu_x = \mu\int (\psi_1^*\psi_2 + \psi_2^*\psi_1)\, dV, \qquad
\bar\mu_y = i\mu\int (\psi_1^*\psi_2 - \psi_2^*\psi_1)\, dV. \tag{252 a}$$
We thus see that the direct relation of the functions $\psi_1$ and $\psi_2$ to the
orientation of the electron’s magnetic moment is limited to the z-axis.
The two functions $\psi_1^*\psi_2$ and $\psi_2^*\psi_1$ have complex conjugate values, and
cannot be associated with a definite direction of the electron’s moment
parallel to the x- or to the y-axis.
The quantities
$$\mathfrak{M}_x = \mu(\psi_1^*\psi_2 + \psi_2^*\psi_1), \qquad
\mathfrak{M}_y = i\mu(\psi_1^*\psi_2 - \psi_2^*\psi_1), \qquad
\mathfrak{M}_z = \mu(\psi_2^*\psi_2 - \psi_1^*\psi_1) \tag{252 b}$$
are the components of a certain vector $\boldsymbol{\mathfrak{M}}$, which can be defined as the
probable magnetization, i.e. the probable value per unit volume of the
magnetic moment of the ‘electron cloud’ distributed with the density
$\psi^\dagger\psi = \rho$. The vector $\boldsymbol{\mathfrak{M}}/\rho$ can be regarded accordingly as
defining, both with respect to magnitude and direction, the probable value
of the intrinsic magnetic moment of the electron, supposed to be
situated at a given point. The magnitude of $\boldsymbol{\mathfrak{M}}/\rho$ must, of course, be
expected to be equal to $\mu$. This is easily seen to be actually the case.
We have in fact,
$$\mathfrak{M}^2 = \mathfrak{M}_x^2 + \mathfrak{M}_y^2 + \mathfrak{M}_z^2
= (\mathfrak{M}_x + i\mathfrak{M}_y)(\mathfrak{M}_x - i\mathfrak{M}_y) + \mathfrak{M}_z^2$$
$$= \mu^2\left[4\,\psi_1^*\psi_2\cdot\psi_1\psi_2^* + (\psi_2\psi_2^*)^2 + (\psi_1\psi_1^*)^2 - 2\,\psi_1\psi_1^*\,\psi_2\psi_2^*\right] = \mu^2\rho^2,$$
so that $\mathfrak{M}/\rho = \mu$. The unit vector $\boldsymbol{\mathfrak{M}}/\mu\rho$ thus determines the probable
direction of the electron’s moment at a given point.
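The identity $\mathfrak{M}^2 = \mu^2\rho^2$ can be confirmed for arbitrary complex values of the two components. The sketch below evaluates the three quantities of (252 b) at a single point; the function name `magnetization`, the sample amplitudes and the value of `mu` are illustrative assumptions. The sign convention adopted for the y-component does not affect the magnitude being checked.

```python
import math

def magnetization(psi1, psi2, mu=1.0):
    """Components of the probable magnetization at one point, following (252 b)."""
    Mx = mu * (psi1.conjugate() * psi2 + psi2.conjugate() * psi1)
    My = 1j * mu * (psi1.conjugate() * psi2 - psi2.conjugate() * psi1)
    Mz = mu * (abs(psi2) ** 2 - abs(psi1) ** 2)
    # Mx and My are real by construction; drop the vanishing imaginary parts.
    return Mx.real, My.real, Mz

def density(psi1, psi2):
    """Probability density rho = |psi1|^2 + |psi2|^2."""
    return abs(psi1) ** 2 + abs(psi2) ** 2

# Arbitrary sample amplitudes (not normalized; the identity holds pointwise).
a, b = 0.3 + 0.4j, -0.5 + 0.2j
Mx, My, Mz = magnetization(a, b, mu=2.0)
magnitude = math.sqrt(Mx ** 2 + My ** 2 + Mz ** 2)
```

Here `magnitude` equals `2.0 * density(a, b)`, so that the ratio of the magnetization to the density has the magnitude $\mu$ whatever the amplitudes.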
The physical meaning of the vector $\boldsymbol{\mathfrak{M}}$ is in agreement with the
expression $c\,\mathrm{curl}\,\boldsymbol{\mathfrak{M}}$ in formula (234 a) for the additional current density
(cf., for instance, my Lehrbuch der Elektrodynamik, vol. ii, Chap. I).
In contradistinction to the electron’s position, its orientation cannot
be specified exactly, so that we must confine ourselves to the
determination of the probable orientation or of the probability of a certain
orientation (under given circumstances). The formal reason for this
difference is that the matrices $\mu_x, \mu_y, \mu_z$ or $\sigma_x, \sigma_y, \sigma_z$, whose
characteristic values should specify the orientation in the same way as the
values of the coordinates $x, y, z$ specify the position, are not independent
of each other.
In fact, multiplying them according to the usual rule of matrix
multiplication, we get
$$\sigma_x\sigma_y = i\sigma_z, \qquad \sigma_y\sigma_z = i\sigma_x, \qquad \sigma_z\sigma_x = i\sigma_y. \tag{253}$$
If the multiplication is effected in the opposite order, the same results
are obtained but with the opposite sign, so that
$$\sigma_y\sigma_x = -\sigma_x\sigma_y, \qquad \sigma_z\sigma_y = -\sigma_y\sigma_z, \qquad \sigma_x\sigma_z = -\sigma_z\sigma_x. \tag{253 a}$$
These equations express the fact that the matrices $\sigma_x, \sigma_y, \sigma_z$ do not
commute with each other, in contradistinction to the coordinates $x, y, z$;
according to Dirac’s terminology they are said to ‘anticommute’.
Combining equations (253) and (253 a), we get
$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z,$$
etc., or in vector notation
$$\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}. \tag{253 b}$$
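The relations (253), (253 a) and (253 b) lend themselves to a direct numerical check. Since the expressions (239 b) are not reproduced in this section, the sketch below assumes an explicit form of the three matrices, chosen so that $\sigma_z\psi = (-\psi_1, +\psi_2)$ as used above and so that $\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$ holds; the names `SX`, `SY`, `SZ` and the helpers are illustrative.

```python
# Assumed explicit spin matrices (a stand-in for (239 b), which is not shown here).
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    """Multiply every element of A by the number c."""
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def commutator(A, B):
    """AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree element by element."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

# (253): cyclic products; (253 a): reversed order changes the sign.
products_ok = (close(matmul(SX, SY), scale(1j, SZ))
               and close(matmul(SY, SZ), scale(1j, SX))
               and close(matmul(SZ, SX), scale(1j, SY)))
anticommute_ok = close(matmul(SY, SX), scale(-1, matmul(SX, SY)))
```

With these matrices `commutator(SX, SY)` equals `scale(2j, SZ)`, which is the z-component of (253 b).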
The non-commutability of the matrices $\sigma_x, \sigma_y, \sigma_z$ means that the
values of the quantities represented by them cannot be determined
(‘observed’ or ‘measured’) simultaneously. It should be mentioned
that these values are to be defined in the usual way, namely, as
the characteristic values of the corresponding matrices, regarded as
linear operators, acting on a two-component function of the type $\psi$.
Denoting these values by dashes, we have for their determination the
equations
$$\sigma_x\psi_x = \sigma'_x\psi_x, \qquad \sigma_y\psi_y = \sigma'_y\psi_y, \qquad \sigma_z\psi_z = \sigma'_z\psi_z,$$
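or in components
$$\sigma_{x11}\psi_{x1} + \sigma_{x12}\psi_{x2} = \sigma'_x\psi_{x1},$$
etc., that is,
$$\psi_{x2} = \sigma'_x\psi_{x1}, \qquad \psi_{x1} = \sigma'_x\psi_{x2},$$
$$i\psi_{y2} = \sigma'_y\psi_{y1}, \qquad -i\psi_{y1} = \sigma'_y\psi_{y2}, \tag{254}$$
$$-\psi_{z1} = \sigma'_z\psi_{z1}, \qquad \psi_{z2} = \sigma'_z\psi_{z2},$$
whence it follows that
$$\sigma'_x = \pm 1, \qquad \psi_{x2} = \pm\psi_{x1},$$
$$\sigma'_y = \pm 1, \qquad \psi_{y2} = \mp i\psi_{y1}, \tag{254 a}$$
$$\sigma'_z = \pm 1, \qquad \psi_{z1} = 0 \ \text{ or } \ \psi_{z2} = 0 \ \text{ respectively}.$$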
The characteristic values of the rectangular components of the
electron’s magnetic moment $\boldsymbol{\mu} = \mu\boldsymbol{\sigma}$ are equal accordingly to $\pm\mu$. This
means that, in determining the orientation of this moment with respect
to some axis, we have to assume beforehand that it is parallel to this
axis, the question to be decided reducing to the choice between the
positive and the negative direction. In other words, we have to assume
that the electron’s magnetic moment is quantized about some (arbitrarily
chosen) axis, the two possible values of its projection on this axis being
$+\mu$ and $-\mu$, while its projection on any other axis remains
undetermined. In the preceding theory this role of quantization or reference
axis has been conferred on the z-axis. The theory can easily be
generalized for the case when this reference axis has any direction
whatsoever with regard to the coordinate axes.
These results appear quite natural from the point of view of the
general transformation theory, developed in Chapter IV. Since the
matrices $\sigma_x, \sigma_y, \sigma_z$ do not commute with each other, one of them only
can be used as a basic quantity, not only for the determination of the
two others, but also for the determination of the matrix $\sigma_n$ representing
the projection of $\boldsymbol{\sigma}$ on any other direction $n$. In the preceding theory,
this basic role has been conferred on $\sigma_z$, which appears accordingly as
a diagonal matrix, while $\sigma_x$ and $\sigma_y$ are not diagonal.
The present case can serve as a very simple illustration of the
transformation theory, since we have to do with two states only, the
state-space thus reducing to a plane in which the two states are represented
by two mutually perpendicular axes, $z_+$ and $z_-$ say. Replacing $z$ as
a reference axis (in ordinary space) by some other axis $z'$, we obtain
two other states (in which the electron’s magnetic moment is oriented
parallel to $z'$), which are represented on the ‘state-plane’ by two other
mutually perpendicular axes $z'_+$ and $z'_-$ (with the same origin as the
axes $z_\pm$). If the angle between $z$ and $z'$ is equal to $\theta$, then the angle
between the axes $z_+$ and $z'_+$ in the state-plane must obviously be equal
to $\tfrac12\theta$, since to an angle of 180° between the direction of the positive
and negative $z$ (or $z'$) axis there corresponds an angle of 90° between
the axes $z_+$ and $z_-$ (or $z'_+$ and $z'_-$) on the state diagram. Now, as we
know from the general theory, the square of the cosine of the angle
between two axes in the state-space is equal to the relative probability
of the state represented by one of them subject to the assumption that
the probability of the other is equal to unity. Hence it follows that
if the magnetic moment of the electron is known to be pointing in
a certain direction (that of $+z$, say), there is a probability equal to
$\cos^2\tfrac12\theta$ that it will be found pointing in another direction (that of $+z'$)
making an angle $\theta$ with the former. The probability that it will be
found pointing in the direction opposite to the latter (i.e. that of $-z'$)
is equal to $\cos^2\tfrac12(\pi - \theta) = \sin^2\tfrac12\theta$. We thus see that if the electron’s
moment is known to point in a certain direction ($+z$), there is a
probability equal to $\cos^2\tfrac12\theta + \sin^2\tfrac12\theta = 1$ that it will be parallel to any
other direction (in the positive or the negative sense). This means, as
stated above, that the direction of the reference-axis to which the
electron’s moment must be assumed to be parallel can be chosen quite
arbitrarily.
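The half-angle rule can be reproduced from the spin matrices themselves. In the sketch below the new axis $z'$ is taken in the xz-plane at an angle $\theta$ to $z$, so that the projection of $\boldsymbol{\sigma}$ on it is $\sigma_x\sin\theta + \sigma_z\cos\theta$; the probabilities are obtained as expectation values of the projection matrices $\tfrac12(\delta \pm \sigma_{z'})$ in the state pointing along $+z$. The explicit matrices (with $\sigma_z$ carrying $-1$ in its first diagonal element, as used above) and all function names are assumptions made for illustration.

```python
import math

# Assumed explicit matrices; sigma_z has -1 for the first component, +1 for the second.
SX = [[0, 1], [1, 0]]
SZ = [[-1, 0], [0, 1]]

def projector(theta, sign):
    """(unit + sign * sigma_n) / 2 for an axis n at angle theta to z in the xz-plane."""
    c, s = math.cos(theta), math.sin(theta)
    sigma_n = [[s * SX[i][j] + c * SZ[i][j] for j in range(2)] for i in range(2)]
    return [[((1 if i == j else 0) + sign * sigma_n[i][j]) / 2 for j in range(2)]
            for i in range(2)]

def probability(theta, sign):
    """Probability of finding the moment along sign * z' when it points along +z."""
    up_z = [0.0, 1.0]  # state with sigma_z = +1 in the assumed convention
    P = projector(theta, sign)
    return sum(up_z[i] * P[i][j] * up_z[j] for i in range(2) for j in range(2))
```

For any `theta`, `probability(theta, +1)` reproduces $\cos^2\tfrac12\theta$ and `probability(theta, -1)` reproduces $\sin^2\tfrac12\theta$, the two adding up to unity.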
All these results can be considered as a particular case of those
holding for the magnetic moment (or the mechanical angular momentum)
due to the orbital motion of a (non-spinning) electron in a radially
symmetrical (central) field of force. As shown in Chapter II, the
z-component of this orbital angular momentum $M_z$ can be assumed to
be quantized, i.e. to take a discrete set of (characteristic) values $mh/2\pi$,
the axial quantum number $m$ varying from $-l$ to $+l$, where $l$ is the
angular quantum number determining the total angular momentum
according to the formula $M^2 = h^2 l(l+1)/4\pi^2$, while the x- and y-components
of $\mathbf{M}$ do not have definite values. The present case can be
obtained from the general case by taking $l$ equal to $\tfrac12$, i.e. by ascribing
to the electron, irrespective of its orbital motion, a spin motion of a
‘half-quantum’ magnitude. We have seen in Chapter III that the
matrix representation of physical quantities, being more general than
the operator representation, leaves room both for integral and
half-integral values of the angular quantum number, subject to the condition
that the axial quantum number should vary by elementary steps
$\Delta m = 1$ from $-l$ to $+l$. This vacant place, or rather the lowest vacant
step on the $l$-staircase, can now be filled by the electron’s spin angular
momentum. The other (higher) steps can be represented by combining
the latter with the orbital angular momentum, if any (see below).
The possibility of attributing to the electron, in addition to an
intrinsic magnetic moment $\boldsymbol{\mu}$, an intrinsic angular momentum $\mathbf{s}$
proportional to it, i.e. represented by the same matrix $\boldsymbol{\sigma}$ with a certain
numerical factor, follows also from the fact that this matrix satisfies
the commutation relation (253 b), which is quite similar to the
commutation relation $\mathbf{M}\times\mathbf{M} = -\dfrac{h}{2\pi i}\mathbf{M}$ satisfied by the orbital angular
momentum $\mathbf{M}$. Assuming the electron to possess an intrinsic angular
momentum
$$\mathbf{s} = \kappa\boldsymbol{\sigma} \tag{255}$$
satisfying the preceding relation, we get
$$\kappa^2\,\boldsymbol{\sigma}\times\boldsymbol{\sigma} = -\frac{h}{2\pi i}\,\kappa\boldsymbol{\sigma},$$
or, according to (253 b),
$$\kappa = \frac{1}{2}\,\frac{h}{2\pi}, \tag{255 a}$$
which means that the magnitude of this momentum corresponds to
$l = \tfrac12$, as was deduced above from the fact that the electron’s magnetic
moment can only assume two (opposite) orientations parallel to a
quantization axis.
It should be noticed that the formula $\mathbf{s} = \kappa\boldsymbol{\sigma}$, with the above
half-quantum value of $\kappa$, does not contradict the result that the
characteristic value of the square of $\mathbf{s}$ must be equal not to $\tfrac14 h^2/4\pi^2$, but to
$\tfrac34 h^2/4\pi^2$, where $\tfrac34 = l(l+1)$ with $l = \tfrac12$. In fact, squaring the equation
$\mathbf{s} = \kappa\boldsymbol{\sigma}$, we get
$$s^2 = \kappa^2\sigma^2 = \kappa^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2).$$
The characteristic values of $s^2$ are obtained by substituting the
characteristic values of $\sigma_x^2, \sigma_y^2, \sigma_z^2$. Now from the definition of the matrices
$\sigma_x, \sigma_y, \sigma_z$, it follows that their squares are equal to the unit matrix
$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \delta. \tag{255 b}$$
The characteristic values of the latter being equal to 1, we thus get
$$\text{char. value of } s^2 = 3\kappa^2 = \frac{3}{4}\,\frac{h^2}{4\pi^2}.$$
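The value $3\kappa^2$ follows from (255 b) alone and can be verified in a few lines. The sketch below works in units with $h/2\pi = 1$, so that $\kappa = \tfrac12$, and uses an assumed explicit form of the spin matrices (the expressions (239 b) are not shown in this section); the names are illustrative.

```python
# Assumed explicit spin matrices (stand-in for (239 b)); units with h / (2 pi) = 1.
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spin_squared(kappa):
    """Matrix s^2 = kappa^2 (sigma_x^2 + sigma_y^2 + sigma_z^2)."""
    total = [[0, 0], [0, 0]]
    for S in (SX, SY, SZ):
        S2 = matmul(S, S)
        total = [[total[i][j] + kappa ** 2 * S2[i][j] for j in range(2)]
                 for i in range(2)]
    return total

kappa = 0.5                  # (255 a) with h / (2 pi) = 1
l = 0.5                      # half-quantum value of the angular quantum number
s2 = spin_squared(kappa)     # a multiple of the unit matrix
```

The matrix `s2` is `0.75` times the unit matrix, in agreement with $l(l+1)$ for $l = \tfrac12$.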
While the electron’s intrinsic angular momentum $\kappa$ has a half-quantum
value, its magnetic moment $\mu = he/4\pi m_0c$ has a whole-quantum value,
i.e. the same value as the magnetic moment due to the orbital motion
with the angular quantum number $l = 1$. The ratio of the magnetic
moment to the angular momentum
$$\frac{\mu}{\kappa} = \frac{e}{m_0c}$$
is thus twice as large in the case of spin as it is in the case of the orbital
motion.
This difference may be reduced formally to the fact that the spin
matrix satisfies the relations (253) and (253 a), which are responsible for
the factor 2 in (253 b) and consequently for the factor $\tfrac12$ in (255 a) (these
relations have no parallel in the case of the matrices representing the
orbital angular momentum). It is the fundamental cause of the
complications in the action of a magnetic field on a spinning electron,
moving in a central field of force, which are usually referred to as the
‘anomalous’ Zeeman effect.
Postponing the detailed consideration of the latter till a later section,
we shall calculate here the rate of change of the total angular momentum
of the electron due to the couple produced by the magnetic field. If the
preceding assumptions about the electron’s spin are correct, then we
must have (so long as the electrostatic field can be supposed to produce
no couple), according to the classical mechanics,
$$\frac{d}{dt}(\mathbf{L} + \kappa\boldsymbol{\sigma}) = \frac{e}{2m_0c}(\mathbf{L} + 2\kappa\boldsymbol{\sigma})\times\boldsymbol{\mathfrak{H}}, \tag{256}$$
where $\mathbf{L}$ is that part of the angular momentum which is due to the
orbital motion. The same equation must hold in wave mechanics if
$\mathbf{L}$ and $\boldsymbol{\sigma}$ are considered as operators and if the time derivative of an
operator $F$ is defined with the help of the energy operator $K$ by means
of the formula
$$\frac{dF}{dt} = [K, F] = \frac{2\pi i}{h}(KF - FK). \tag{256 a}$$
In equation (256), the operator (or operator-matrix) $\mathbf{M} = \mathbf{L} + \kappa\boldsymbol{\sigma}$
represents the total angular momentum of the electron and the operator-matrix
$$\frac{e}{2m_0c}\mathbf{L} + \frac{e}{m_0c}\kappa\boldsymbol{\sigma} = \frac{e}{2m_0c}(\mathbf{L} + 2\mathbf{s})$$
the total magnetic moment, due both to its motion about the nucleus
and the supposed ‘spinning’ about its own axis.
Neglecting the terms proportional to the square of the magnetic
field, we can put
$$K = H - \mu\,\boldsymbol{\mathfrak{H}}\cdot\boldsymbol{\sigma},$$
where $H$ is the Schrödinger energy operator,
$$H = \frac{1}{2m_0}p^2 + U - \frac{e}{2m_0c}\,\boldsymbol{\mathfrak{H}}\cdot\mathbf{L}$$
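[cf. (210 a, b), § 26], supposed to be multiplied by the two-dimensional
unit matrix $\delta$ [$e\mathbf{L}/(2m_0c)$ is the magnetic moment of the orbital motion].
The sum of the first two terms of this operator, representing the
kinetic energy and the potential energy of the radially symmetrical
electric field, commutes both with $\mathbf{L}$ and $\boldsymbol{\sigma}$, so that in the formula
(256 a), with $F = \mathbf{M}$, we can put simply
$$K = -\frac{e}{2m_0c}(\boldsymbol{\mathfrak{H}}\cdot\mathbf{L}) - \mu(\boldsymbol{\mathfrak{H}}\cdot\boldsymbol{\sigma}), \tag{256 b}$$
it being understood that $\mathbf{L}$ is multiplied by the unit matrix $\delta$.
Now we have, since $\boldsymbol{\sigma}$ obviously commutes with $\mathbf{L}$,
$$[K, \mathbf{L}] = -\frac{e}{2m_0c}[(\boldsymbol{\mathfrak{H}}\cdot\mathbf{L}), \mathbf{L}], \qquad
[K, \kappa\boldsymbol{\sigma}] = -\kappa\mu[(\boldsymbol{\mathfrak{H}}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}].$$
For the sake of simplicity, we shall assume the magnetic field to be
parallel to the z-axis (this does not, of course, involve any loss of
generality). Taking the rectangular components of the bracket
expressions on the right side of the preceding equations, we get, with the
help of the equations $\mathbf{L}\times\mathbf{L} = -\dfrac{h}{2\pi i}\mathbf{L}$ and $\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$,
$$[\mathfrak{H}L_z, L_x] = \mathfrak{H}[L_z, L_x] = \frac{2\pi i}{h}\,\mathfrak{H}\,(\mathbf{L}\times\mathbf{L})_y = -\mathfrak{H}L_y = -(\mathbf{L}\times\boldsymbol{\mathfrak{H}})_x,$$
$$[\mathfrak{H}L_z, L_y] = \mathfrak{H}[L_z, L_y] = -\frac{2\pi i}{h}\,\mathfrak{H}\,(\mathbf{L}\times\mathbf{L})_x = \mathfrak{H}L_x = -(\mathbf{L}\times\boldsymbol{\mathfrak{H}})_y,$$
$$[\mathfrak{H}L_z, L_z] = 0,$$
and similarly for the components of $\boldsymbol{\sigma}$.
We thus have, returning to the vector notation,
$$[(\boldsymbol{\mathfrak{H}}\cdot\mathbf{L}), \mathbf{L}] = -\mathbf{L}\times\boldsymbol{\mathfrak{H}}, \qquad
[(\boldsymbol{\mathfrak{H}}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}] = -\frac{4\pi}{h}\,\boldsymbol{\sigma}\times\boldsymbol{\mathfrak{H}},$$
and consequently
$$\frac{d\mathbf{M}}{dt} = \frac{e}{2m_0c}\,\mathbf{L}\times\boldsymbol{\mathfrak{H}} + \frac{4\pi}{h}\,\kappa\mu\,\boldsymbol{\sigma}\times\boldsymbol{\mathfrak{H}},$$
or, since $\kappa = h/4\pi$,
$$\frac{d\mathbf{M}}{dt} = \frac{e}{2m_0c}\,\mathbf{L}\times\boldsymbol{\mathfrak{H}} + \mu\,\boldsymbol{\sigma}\times\boldsymbol{\mathfrak{H}},$$
which is nothing else but equation (256).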
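The spin part of this argument, the bracket of $\boldsymbol{\mathfrak{H}}\cdot\boldsymbol{\sigma}$ with $\boldsymbol{\sigma}$, can be checked with explicit matrices for a field of arbitrary direction. The sketch below uses the bracket of (256 a) and an assumed explicit form of the spin matrices satisfying $\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$; the field components, the choice $h = 2\pi$ and the function names are illustrative assumptions.

```python
import math

# Assumed explicit spin matrices (stand-in for (239 b)).
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]
SIGMA = (SX, SY, SZ)

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(A, F, h):
    """[A, F] = (2 pi i / h)(AF - FA), as in (256 a)."""
    c = 2j * math.pi / h
    AF, FA = matmul(A, F), matmul(F, A)
    return [[c * (AF[i][j] - FA[i][j]) for j in range(2)] for i in range(2)]

def field_dot_sigma(field):
    """Matrix H.sigma for a numerical field (Hx, Hy, Hz)."""
    return [[sum(field[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def sigma_cross_field(field, k):
    """k-th component (0, 1, 2 for x, y, z) of the matrix vector sigma x H."""
    a, b = (k + 1) % 3, (k + 2) % 3
    return [[SIGMA[a][i][j] * field[b] - SIGMA[b][i][j] * field[a]
             for j in range(2)] for i in range(2)]

field = (0.2, -0.5, 1.3)     # arbitrary illustrative field components
h = 2 * math.pi              # units with h / (2 pi) = 1
A = field_dot_sigma(field)
```

For each component `k`, `bracket(A, SIGMA[k], h)` equals $-(4\pi/h)$ times `sigma_cross_field(field, k)`, which is the second of the vector relations above.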
Our interpretation of the matrix $\boldsymbol{\sigma}$ as representing a spin motion of
the electron with an angular momentum $\kappa = h/4\pi$ and a magnetic
moment $\mu = eh/4\pi m_0c$ is thus fully checked, at least from the formal
point of view. One may argue that it cannot have an actual physical
significance since the electron in the Pauli theory, just as in that of
Schrödinger, is dealt with as a point, with definite coordinates $x, y, z$,
and a point-like particle cannot be imagined to be spinning. To this
one can retort firstly, that Pauli’s theory amounts to the addition of
a fourth ‘spin’ coordinate, giving a schematical representation of the
spin motion; and secondly, that the translational motion (in particular
the revolution about a fixed centre) in wave mechanics is also
represented in a schematical way only.
30. More Exact Form of the Two-dimensional Matrix Theory;
Electron’s Electric Moment
Pauli’s theory, discussed in the preceding section, accounts for the
duplicity phenomenon in the presence of a magnetic field only, whereas,
in reality, this phenomenon is observed just as well without such a field.
A full account of the experimental facts is given by the theory of Dirac
which we are now going to examine on the same lines. The preceding
analysis of Pauli’s theory will prove very helpful in the discussion of
the mathematical form and physical meaning of Dirac’s exact theory.
H wcpr t (257)
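then equations (229 a) and (229 b) of Dirac’s theory can be written in
the following form:
$$\boldsymbol{\sigma}\cdot\mathbf{u}\,\psi + (u_t - m_0c)\chi = 0, \qquad
\boldsymbol{\sigma}\cdot\mathbf{u}\,\chi + (u_t + m_0c)\psi = 0, \tag{257 a}$$
where $\boldsymbol{\sigma}$ is Pauli’s spin matrix, while the operators $u_t \pm m_0c$ are
understood to be multiplied by the unit matrix $\delta = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; $\psi$ denotes here
the two-dimensional matrix $\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ and $\chi$ denotes the matrix $\begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}$.
Applying to the first of equations (257 a) the operation $\boldsymbol{\sigma}\cdot\mathbf{u}$, we get,
with the help of the second equation,
$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2\psi + [(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u})]\chi + (u_t - m_0c)\,\boldsymbol{\sigma}\cdot\mathbf{u}\,\chi$$
$$= (\boldsymbol{\sigma}\cdot\mathbf{u})^2\psi + [(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u})]\chi - (u_t - m_0c)(u_t + m_0c)\psi = 0.$$
Now $(u_t - m_0c)(u_t + m_0c) = u_t^2 - m_0^2c^2$; we have further, according to
(218 a),
$$(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u}) = \frac{ihe}{2\pi c}\,\boldsymbol{\sigma}\cdot\mathbf{E}$$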
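and
$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2 = \sigma_x^2u_x^2 + \ldots + \sigma_x\sigma_y u_xu_y + \sigma_y\sigma_x u_yu_x + \ldots
= (u_x^2 + u_y^2 + u_z^2) + i\sigma_z(u_xu_y - u_yu_x) + \ldots,$$
i.e., according to (218),
$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2 = u^2 - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot\mathbf{H}.$$
Putting, for the sake of brevity, $u^2 - u_t^2 + m_0^2c^2 = D$, we thus get
$$D\psi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\psi - i\mathbf{E}\chi) = 0. \tag{258}$$
In a similar way we obtain the equation
$$D\chi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\chi - i\mathbf{E}\psi) = 0. \tag{258 a}$$
These equations are equivalent respectively to the second-order
equations (230) and (230 a) of the Dirac theory and could, of course, be
derived directly from the latter.
The expressions (232) and (232 a) for the probability density and the
probability current-density can be written in the form
$$\rho = \psi^\dagger\psi + \chi^\dagger\chi, \tag{259}$$
$$\mathbf{j} = c(\psi^\dagger\boldsymbol{\sigma}\chi + \chi^\dagger\boldsymbol{\sigma}\psi). \tag{259 a}$$
In the case of a conservative motion with a positive energy $\epsilon$ which
differs relatively little from the rest energy $m_0c^2$, the functions $\chi$ can
be expressed in terms of $\psi$ with the help of the relations (233 a) or
$$\chi = \frac{\boldsymbol{\sigma}\cdot\mathbf{u}}{2m_0c}\,\psi, \tag{260}$$
which is the approximate form of the first of equations (257 a).
Using the relation
$$\sigma_x(\boldsymbol{\sigma}\cdot\mathbf{u}) = \sigma_x\sigma_xu_x + \sigma_x\sigma_yu_y + \sigma_x\sigma_zu_z = u_x + i\sigma_zu_y - i\sigma_yu_z,$$
that is,
$$\boldsymbol{\sigma}(\boldsymbol{\sigma}\cdot\mathbf{u}) = \mathbf{u} + i\,\mathbf{u}\times\boldsymbol{\sigma}, \tag{260 a}$$
we get, substituting the expression (260) in (259 a),
$$\mathbf{j} = \frac{1}{2m_0}\,\psi^\dagger\mathbf{u}\psi + \frac{i}{2m_0}\,\psi^\dagger\,\mathbf{u}\times\boldsymbol{\sigma}\,\psi + \text{conjugate complex},$$
which is easily reduced to the approximate form (234 b) with
$\boldsymbol{\mathfrak{M}} = \mu\,\psi^\dagger\boldsymbol{\sigma}\psi$, in agreement with (252 b). As a matter of fact, we have
merely repeated the argument of § 28, using the new matrix notation
to illustrate its convenience.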
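The algebraic identity (260 a) depends only on the multiplication rules (253) and (253 a) and can be checked with explicit matrices. In the sketch below the components of $\mathbf{u}$ are replaced by ordinary numbers (so that they commute with each other, which is not true of the operators $u_x, u_y, u_z$ in a magnetic field), and the explicit spin matrices are an assumed form consistent with $\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$; all names and numerical values are illustrative.

```python
# Assumed explicit spin matrices (stand-in for (239 b)).
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]
SIGMA = (SX, SY, SZ)

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(u):
    """Matrix sigma.u for a numerical vector u = (ux, uy, uz)."""
    return [[sum(u[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def u_cross_sigma(u, k):
    """k-th component (0, 1, 2 for x, y, z) of the matrix vector u x sigma."""
    a, b = (k + 1) % 3, (k + 2) % 3
    return [[u[a] * SIGMA[b][i][j] - u[b] * SIGMA[a][i][j] for j in range(2)]
            for i in range(2)]

def right_side(u, k):
    """k-th component of u + i (u x sigma), with u multiplied by the unit matrix."""
    cross = u_cross_sigma(u, k)
    return [[(u[k] if i == j else 0) + 1j * cross[i][j] for j in range(2)]
            for i in range(2)]

u = (0.7, -1.1, 0.4)         # arbitrary commuting numbers standing in for u
left_x = matmul(SX, sigma_dot(u))
```

Here `left_x` agrees with `right_side(u, 0)`, i.e. with $u_x + i\sigma_zu_y - i\sigma_yu_z$, and the same holds for the two other components.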
The equation of Pauli’s theory was obtained from (258) by neglecting
the last term (proportional to $\chi$) and replacing the two terms
$-u_t^2 + m_0^2c^2$ in the relativistic operator $D$ by $2m_0(p_t + U)$. We shall get a
better approximation if we substitute in (258) the expression (2C0) for
X—which gives an additional term of the second order in 1/c—and
introduce a correction term of the same order in the expression for I).
Limiting ourselves, for the sake of simplicity, to the case of conservative
motion, and putting e = m0c2+ K and c' — w0c2-f A', we have
Ui — —-(e' —U ) — — -(?w0c2+A"/—U).
c c
This gives D = u 2—2wi0(A"' —U) —(K‘—U)2/c2, so that equation (258)
assumes the form
[ ^ - 2 m 0(K’- U ) - l ( K ' - U r } l , - ^ co m - i E x) = 0.
Neglecting the relativistic corrections, i.e. putting $c = \infty$, we obtain the ordinary Schrödinger equation
$$[u^2 - 2m_0(K'-U)]\,\psi = 0,$$
whence it follows that, with an accuracy of the order of $1/c^2$, we can replace the operator $(K'-U)^2/c^2$ by $u^4/(2m_0c)^2 = (u_x^2+u_y^2+u_z^2)^2/(2m_0c)^2$. The preceding equation thus reduces to the standard form
$$(K - K')\psi = 0,$$
with the energy operator
$$K = \frac{1}{2m_0}u^2 + U - \frac{1}{(2m_0)^3c^2}u^4 - \mu\Big[\boldsymbol{\sigma}\cdot\mathbf{H} - \frac{i}{2m_0c}(\boldsymbol{\sigma}\cdot\mathbf{E})(\boldsymbol{\sigma}\cdot\mathbf{u})\Big].$$
With the help of the formula (260 a) the last term in this expression can be rewritten in the form
$$-\mu\Big[\mathbf{H}\cdot\boldsymbol{\sigma} + \mathbf{E}\cdot\frac{1}{2m_0c}(\mathbf{u}\times\boldsymbol{\sigma} - i\mathbf{u})\Big].$$
The operator $i\mu\,\mathbf{E}\cdot\mathbf{u}$ represents a purely imaginary quantity whose average value vanishes and which can therefore be left out of account.† Putting $\mu\boldsymbol{\sigma} = \boldsymbol{\mu}$, we thus get
$$K = \Big(\frac{1}{2m_0}u^2 + U\Big) + S, \tag{261}$$
where the first term represents the usual (Schrödinger) energy operator (multiplied by the two-dimensional unit-matrix $\delta$), while the operator
$$S = -\frac{1}{(2m_0)^3c^2}\,u^4 - \mathbf{H}\cdot\boldsymbol{\mu} - \mathbf{E}\cdot\frac{1}{2m_0c}\,\mathbf{u}\times\boldsymbol{\mu} \tag{261 a}$$
can be regarded as a kind of perturbation energy, which specifies, with
t In fact the product ^ E u is approximately equal to the work done on the electron
per unit time, i.e. to —dU/dt; in the case of a stationary motion its average value
must obviously be equal to zero.
300 WAVE MECHANICS OF A SINGLE ELECTRON § 30
an accuracy of the second order in $1/c$, the influence of the relativity corrections. One of these, represented by the first term in $S$, refers to the variability of mass with velocity, while the other, represented by the second and third terms, corresponds to the spin phenomenon. The second term, which has been discussed already in the preceding section, can be regarded as the additional energy due to the electron's intrinsic magnetic moment $\boldsymbol{\mu}$. As to the third term, it can be interpreted in a similar way, namely, as the additional energy due to the presence of an electric moment represented by the operator
$$\boldsymbol{\nu} = \frac{1}{2m_0c}\,\mathbf{u}\times\boldsymbol{\mu}.$$
We are thus led to regard the electron as a particle combining the properties of a point charge, of an elementary magnet, and of an elementary electric dipole, with an electric moment proportional to the magnetic moment ($\boldsymbol{\mu}$) and to the velocity of translational motion, represented approximately by the operator $\mathbf{u}/m_0$.
It should be mentioned that the association of an electric moment with a moving particle which is known to possess, when at rest, a magnetic moment, is a direct consequence of the relativity theory as applied to the connexion between the magnetic and the electric field. If we have, for example, in the coordinate system $A$ only a magnetic field $\mathbf{H}$ ($\mathbf{E} = 0$), then in another system $A'$ which is moving relatively to the first with a velocity $\mathbf{v}' = -\mathbf{v}$, we must have, in addition to a magnetic field $\mathbf{H}'$ which is slightly different from $\mathbf{H}$ (the difference being of the second order in $v/c$), an electric field
$$\mathbf{E}' = -\mathbf{v}\times\mathbf{H}'/c \approx -\mathbf{v}\times\mathbf{H}/c,$$
and vice versa: in the case of the presence of a pure electric field $\mathbf{E}$ ($\mathbf{H} = 0$) in the system $A$, there must be, in the system $A'$, besides an electric field $\mathbf{E}'$ somewhat different from $\mathbf{E}$, also a magnetic field
$$\mathbf{H}' = \mathbf{v}\times\mathbf{E}'/c \approx \mathbf{v}\times\mathbf{E}/c.$$
Let us consider in the latter case a particle which is moving with the system $A'$ and which, with regard to this system, possesses a magnetic moment $\boldsymbol{\mu}$. It will have accordingly an additional magnetic energy $U' = -\boldsymbol{\mu}\cdot\mathbf{H}' = -\boldsymbol{\mu}\cdot\mathbf{v}\times\mathbf{E}'/c \approx \boldsymbol{\mu}\cdot\mathbf{v}'\times\mathbf{E}'/c$. Now this energy can be expressed in the form
$$U' = \frac{1}{c}\,\mathbf{E}'\cdot(\boldsymbol{\mu}\times\mathbf{v}')$$
or
$$U' \approx -\frac{1}{c}\,\mathbf{E}\cdot(\mathbf{v}'\times\boldsymbol{\mu})$$
and interpreted as the additional electric energy with regard to the system $A$ of an electric dipole with a moment
$$\boldsymbol{\nu} = \frac{1}{c}\,\mathbf{v}'\times\boldsymbol{\mu}.$$
We are thus entitled to assume that a particle which, when at rest, behaves like an elementary magnet with a moment $\boldsymbol{\mu}$ acquires, when moving with a velocity $\mathbf{v}'$, an electric moment $\mathbf{v}'\times\boldsymbol{\mu}/c$. This result can be obtained directly with the help of the spinning sphere model of the electron, if due account is taken of the redistribution of the electric current density produced by the superposition of the translatory motion on that of rotation.†
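The equivalence used above, between the magnetic energy of the moment in the induced field and the electric energy of a dipole with moment v'×μ/c, is a matter of rearranging a scalar triple product. A minimal Python sketch with arbitrary numerical vectors illustrates it; the function names and sample values are hypothetical, and the fields are treated to the first order in v/c only (E' is not distinguished from E).

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def magnetic_energy(mu, v_prime, E, c):
    """U' = -mu . H' with H' = v x E / c and v = -v'."""
    v = tuple(-x for x in v_prime)
    H = tuple(x / c for x in cross(v, E))
    return -dot(mu, H)


def dipole_energy(mu, v_prime, E, c):
    """Electric energy -E . nu of a dipole nu = v' x mu / c."""
    nu = tuple(x / c for x in cross(v_prime, mu))
    return -dot(E, nu)


mu = (0.2, -0.5, 1.0)
v_prime = (3.0, 1.0, -2.0)
E = (0.7, 0.1, 0.4)
print(magnetic_energy(mu, v_prime, E, 10.0), dipole_energy(mu, v_prime, E, 10.0))
```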
Replacing the velocity $\mathbf{v}'$ by the operator $\mathbf{u}/m_0$, we obtain for the representation of the electron's electric moment the operator
$$\boldsymbol{\nu} = \frac{1}{m_0c}\,\mathbf{u}\times\boldsymbol{\mu}, \tag{261 b}$$
which is just double the previous expression. The additional electric energy, represented by the last term in (261 a), must be written accordingly in the form
$$U_e = -\tfrac12\,\mathbf{E}\cdot\boldsymbol{\nu}, \tag{261 c}$$
while the magnetic energy is expressed in the usual way by
$$U_m = -\mathbf{H}\cdot\boldsymbol{\mu}.$$
The origin of the factor $\tfrac12$ in (261 c) can be interpreted in different ways. It can be obtained, in the first place, by applying the relativity theory to the spin motion.‡ It is simpler, however, to connect it with the fact that the energy $U_e$ corresponds to a second-order effect (while $U_m$ corresponds to a first-order effect), as in the familiar case of a particle possessing no rigid electric dipole moment, and acquiring such a moment under the influence of the electric field only. In the present case, this influence is an indirect one, proceeding through the velocity of translational motion which is maintained by the electric field.
Before discussing the exact theory of Dirac, we shall apply the preceding corrected form of the Pauli theory to the approximate calculation of the so-called 'relativity corrections', i.e. of the shift and splitting of the energy-levels of an electron moving in a spherically symmetrical electric field with or without a homogeneous magnetic field superposed upon it.

† See my Lehrbuch der Elektrodynamik, vol. i, pp. 295-6.
‡ See L. H. Thomas, Nature (1926), p. 514, and Phil. Mag. (1927); also J. Frenkel, Zeits. f. Phys. 37 (1926), 273.
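A. No magnetic field
The perturbation energy reduces in this case to
$$S = -\frac{1}{(2m_0)^3c^2}\,p^4 - \frac{\mu}{2m_0c}\,\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma}), \tag{262}$$
where $\mathbf{p} = (h/2\pi i)\nabla$ is the operator representing the electron's momentum. Putting
$$\mathbf{E} = \frac{Ze}{r^3}\,\mathbf{r},$$
which corresponds to a Coulomb field of force produced by a nucleus with a charge $Ze$, we get
$$\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma}) = \boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) = \frac{Ze}{r^3}\,\boldsymbol{\sigma}\cdot(\mathbf{r}\times\mathbf{p}) = \frac{Ze}{r^3}\,\boldsymbol{\sigma}\cdot\mathbf{L},$$
where $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is the operator of the electron's angular momentum (without the contribution $\tfrac12\boldsymbol{\sigma}$ due to the spin). Substituting this expression in (262) and replacing $p^2/2m_0$ by $H'-U = H' + Ze^2/r$, where $H'$ is the unperturbed energy, as given by Schrödinger's or Bohr's theory, we get
$$S = -\frac{1}{2m_0c^2}\Big[\Big(H' + \frac{Ze^2}{r}\Big)^2 + \frac{\alpha}{r^3}\,(\mathbf{L}\cdot\boldsymbol{\sigma})\Big], \tag{262 a}$$
where
$$\alpha = c\mu Ze = \frac{hZe^2}{4\pi m_0},$$
the charge of the electron being denoted by $-e$.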
The expression (262 a) is somewhat similar to the expression (150) for the magnetic perturbation energy, differing from it in the first place by the fact that the constant magnetic field $\mathfrak{H}$ is replaced by a kind of effective magnetic field
$$\mathfrak{H}_{\mathrm{eff}} = \frac{Ze}{2m_0cr^3}\,\mathbf{L}, \tag{262 b}$$
which is inversely proportional to the cube of the distance from the nucleus and parallel to the vector of the angular momentum $\mathbf{L}$, and in the second place by the appearance of the additional term
$$-\frac{1}{2m_0c^2}\Big(H' + \frac{Ze^2}{r}\Big)^2,$$
which is supposed to be multiplied by the unit matrix $\delta = \begin{pmatrix}1&0\\0&1\end{pmatrix}$.
The argument used for the solution of the magnetic perturbation problem in the previous section can thus be applied, practically without any modification, to the present case; it can be simplified by using from the outset a coordinate system with the z-axis parallel to the vector $\mathbf{L}$ (which is a constant of the unperturbed motion).
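The result is expressed by the formula
$$\Delta H'_\pm = -\frac{1}{2m_0c^2}\Big[\overline{\Big(H' + \frac{Ze^2}{r}\Big)^2} \pm \alpha L\,\overline{\Big(\frac{1}{r^3}\Big)}\Big], \tag{263}$$
where the averaging is to be carried out for the unperturbed motion with the help of the usual (scalar) Schrödinger function $\psi$ specifying it, according to the formula $\overline{F} = \int F\psi\psi^*\,dV$. The preceding formula can be interpreted by assuming two types of the perturbed motion with the electron's spin axis parallel to the axis of the orbit and having either the same or the opposite direction ($\mathbf{L}\cdot\boldsymbol{\sigma} = \pm L$). The numerical values of $\Delta H' = \Delta H'_\pm$ can be computed approximately by replacing the wave-mechanical averages or probable values by the time averages of the classical (Bohr) theory. The latter gives†
$$\overline{\frac{1}{r}} = \frac{1}{a},\qquad \overline{\frac{1}{r^2}} = \frac{1}{ab},\qquad \overline{\frac{1}{r^3}} = \frac{1}{b^3},$$
where $a$ is the semi-major and $b$ is the semi-minor axis of the electron's elliptical orbit. We thus get
$$\Delta H'_\pm = -\frac{1}{2m_0c^2}\Big[H'^2 + \frac{2H'Ze^2}{a} + \frac{Z^2e^4}{ab} \pm \frac{\alpha L}{b^3}\Big]. \tag{263 a}$$
Now according to the Bohr theory we have further:
$$a = \frac{h^2n^2}{4\pi^2m_0e^2Z},\qquad b = \frac{k}{n}\,a,\qquad L = \frac{h}{2\pi}\,k,\qquad H' = -\frac{Ze^2}{2a} = -\frac{2\pi^2m_0Z^2e^4}{h^2n^2},$$
where $n$ is the principal and $k$ the angular quantum number. Substituting these expressions in (263 a), we find
$$H'^2 + \frac{2H'Ze^2}{a} + \frac{Z^2e^4}{ab} = \Big(-3 + \frac{4n}{k}\Big)H'^2$$
and
$$\frac{\alpha L}{b^3} = \frac{Ze^2h}{4\pi m_0}\,\frac{hk}{2\pi}\,\frac{n^3}{k^3a^3} = \frac{(Ze^2)^2}{4a^2}\,\frac{2n}{k^2} = H'^2\,\frac{2n}{k^2},$$
whence
$$\Delta H'_\pm = -\frac{H'^2}{2m_0c^2}\Big[\frac{4n}{k} - 3 \pm \frac{2n}{k^2}\Big]. \tag{263 b}$$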
This formula was originally obtained in 1925 by Uhlenbeck and Goudsmit in practically the same way as that shown above, without, however, any use of the matrix $\boldsymbol{\sigma}$ (the product $\mathbf{L}\cdot\boldsymbol{\sigma}$ being replaced by $\pm L$ on the assumption that the electron's axis can have only two opposite orientations parallel to the axis of the orbit).
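The passage from (263 a) to (263 b) is pure algebra on the Bohr expressions, and it can be checked with exact rational arithmetic. The sketch below is only an illustration: the unit system (h = 2π, m_0 = e = 1) and the function name are choices made here for convenience, not anything taken from the text.

```python
from fractions import Fraction


def bohr_terms(n, k, Z=1):
    """Return (mass_term, spin_term) of (263 a), each divided by H'^2.

    Assumed units: h = 2*pi and m_0 = e = 1, so that a = n^2/Z,
    L = k and alpha = Z/2.
    """
    n, k, Z = Fraction(n), Fraction(k), Fraction(Z)
    a = n**2 / Z              # semi-major axis
    b = k * a / n             # semi-minor axis
    L = k                     # angular momentum h k / 2 pi
    H = -Z / (2 * a)          # unperturbed energy -Z e^2 / 2a
    alpha = Z / 2             # alpha = h Z e^2 / (4 pi m_0)
    mass_term = H**2 + 2 * H * Z / a + Z**2 / (a * b)
    spin_term = alpha * L / b**3
    return mass_term / H**2, spin_term / H**2


# Compare with the coefficients 4n/k - 3 and 2n/k^2 of (263 b).
for n, k in [(2, 1), (2, 2), (3, 2)]:
    print(n, k, bohr_terms(n, k), (Fraction(4 * n, k) - 3, Fraction(2 * n, k * k)))
```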
† Cf. Born, Atommechanik, i, p. 164 (Berlin, 1925).

By applying relativity mechanics to the stationary states of the Bohr theory, Sommerfeld, in 1915, derived the following formula:
$$\epsilon_{nk} = \frac{m_0c^2}{\sqrt{1 + \dfrac{\gamma^2Z^2}{(s+k')^2}}}, \tag{264}$$
which proved to be in exact agreement with the experimental data for the energy-levels in hydrogen and ionized helium. Here $\gamma$ is a dimensionless constant
$$\gamma = \frac{2\pi e^2}{hc} \approx 7\cdot10^{-3}, \tag{264 a}$$
$s = n-k$ is the radial quantum number, and
$$k' = \sqrt{k^2 - \gamma^2Z^2}. \tag{264 b}$$
The constant $\gamma Z$ determines the 'relativity splitting' of the energy-levels belonging to the same value of the principal quantum number $n$, and so determines the 'fine structure' of the spectrum. When $\gamma Z \ll 1$, we can replace formula (264) by the approximate formula
$$\epsilon_{nk} \approx \epsilon_n + \frac{2W_n^2}{m_0c^2}\Big(\frac{3}{4} - \frac{n}{k}\Big), \tag{264 c}$$
where
$$W_n = \epsilon_n - m_0c^2 = -\frac{m_0c^2\gamma^2Z^2}{2n^2} = -\frac{2\pi^2m_0e^4Z^2}{h^2n^2}$$
stands for $H'$.
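How closely the approximate formula (264 c) follows the exact expression (264) can be seen from a short numerical comparison. The Python sketch below measures energies in units of the rest energy m_0c²; the numerical value adopted for γ and the function names are assumptions made for this illustration.

```python
import math

# Assumed numerical value of the dimensionless constant gamma of (264 a).
GAMMA = 7.2973525693e-3


def sommerfeld_exact(n, k, Z=1):
    """Energy (264) in units of m_0 c^2."""
    s = n - k                                        # radial quantum number
    k_prime = math.sqrt(k * k - (GAMMA * Z) ** 2)    # eq. (264 b)
    return 1.0 / math.sqrt(1.0 + (GAMMA * Z) ** 2 / (s + k_prime) ** 2)


def sommerfeld_approx(n, k, Z=1):
    """Approximate energy (264 c) in units of m_0 c^2."""
    w_n = -((GAMMA * Z) ** 2) / (2 * n * n)          # W_n / (m_0 c^2)
    return 1.0 + w_n + 2 * w_n ** 2 * (0.75 - n / k)


for n, k in [(1, 1), (2, 1), (2, 2), (3, 1)]:
    exact, approx = sommerfeld_exact(n, k), sommerfeld_approx(n, k)
    print(n, k, exact - 1.0, approx - 1.0, exact - approx)
```

The last column is the neglected remainder, which is of the sixth order in γZ.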
This fine-structure formula of Sommerfeld has been brilliantly confirmed not only for hydrogen and ionized helium, but also for X-ray spectra of the heaviest atoms. The number of lines given by it in the latter case (with $k = 1, 2,..., n$ and with regard to the selection rule $\Delta k = \pm1$), or the number of energy-levels in the absorption spectrum of X-rays, comes out, however, too small, being equal to $n$ instead of $2n-1$, as found experimentally. Thus, for example, we have, when $n = 2$ (L-group), three energy-levels, while Sommerfeld's formula only gives two ($k = 1$ and $k = 2$); when $n = 3$ we have five levels instead of three, etc.
This difficulty was removed by Uhlenbeck and Goudsmit's theory of the spinning electron. To every orbit specified by the numbers $n$, $k$ there are two possible oppositely directed orientations of the spin axis perpendicular to the plane of the orbit. Corresponding to these two orientations, we must have two different additional energies which bring about the doubling of all the energy-levels $\epsilon_{nk}$, according to the formula (263 b).
However, some secondary difficulties remain unexplained by this theory: First, one of the levels belonging to the same principal quantum number ($n$) should remain undivided (since the number of different levels is equal to $2n-1$ and not to $2n$). This can be explained at once if we ascribe to the angular quantum number the values $0, 1,..., n-1$ instead of $1, 2,..., n$, i.e. if we introduce straight-line orbits instead of circular ones, because obviously for such straight-line orbits all orientations perpendicular to the direction of motion are equivalent. It should be noticed, however, that the approximate formulae (263 b) and (264 c), as well as the exact formula (264), cannot be applied to the case $k = 0$.
Secondly, for hydrogen and ionized helium (briefly, in the case of atomic systems with a single electron) the experimental data fit exactly with Sommerfeld's formula both with regard to the number and the position of the levels, if $k$ is assumed to take the values $1, 2,..., n$.
This difficulty can also be overcome by a more exact analysis of the 'splitting due to spin' and its comparison with that due to the variability of mass ('relativity splitting' in the sense of Sommerfeld's theory).
Formula (263 b) is not valid for $k = 0$. In general, it is so much the more accurate the larger $k$ is. In this limiting case we have
$$\frac{1}{k} \mp \frac{1}{2k^2} \approx \frac{1}{k \pm \frac12},$$
so that formula (263 b) becomes identical with Sommerfeld's formula (264 c), provided $k$ ($= n, n-1, n-2,...$) is replaced by $k \pm \tfrac12$, each energy-level appearing twice for two consecutive values of $k$ (the one increased and the other diminished by $\tfrac12$).
The appearance of half-integral values of $k$ ($= n-\tfrac12, n-\tfrac32$, etc.) can be explained by the fact that on the wave-mechanical theory the angular momentum $L$ is equal to $\sqrt{l(l+1)}\,h/2\pi$, and not to $kh/2\pi$. Now since $l(l+1) = (l+\tfrac12)^2 - \tfrac14$, we can put, for large values of $l$,
$$L \approx (l+\tfrac12)\,\frac{h}{2\pi} = (k-\tfrac12)\,\frac{h}{2\pi},$$
where $l = k-1$ is the angular quantum number of the Schrödinger theory.†
The average values of $1/r$, $1/r^2$, and $1/r^3$ have been calculated above, for the sake of simplicity, with the help of the old quantum theory; it can be shown, however, that the results obtained are not substantially altered on the Schrödinger theory if Bohr's $k$ is replaced everywhere by $l+\tfrac12$.
We shall see in a later section that the exact wave-mechanical theory based on Dirac's equation leads, in the case of a one-electron atomic system, to precisely the same results as the old theory of Sommerfeld, the spin-doubling remaining unrevealed. It becomes manifest, however, as soon as we turn to more complicated atoms in which the motion of each electron takes place in a field of force deviating (owing to the action of the other electrons) from the purely Coulomb one. This follows immediately from the expression (263) in which $1/r^3$ must be replaced by some other (more rapidly decreasing) function of the distance, with the result that the two terms of (263), corresponding to the relativistic variation of the mass and to the spin effect, can no longer be combined into a single term, corresponding on the old theory to the mass effect alone.

† Cf. infra, § 33.
The two states resulting from a single state of the Schrödinger theory and specified by the orientation of the electron's spin angular momentum in the direction of the orbital angular momentum or in the opposite direction are distinguished with the help of a special quantum number (formerly called the 'inner' quantum number) $j$, assuming the value $j = l+\tfrac12$ for the former state and the value $j = l-\tfrac12$ for the latter; the product of $j$ with $h/2\pi$ can be regarded accordingly as the resulting angular momentum of the electron. This interpretation corresponds rather to the old quantum theory; it can be shown, however, that in wave mechanics the number $j$ plays, in regard to the total angular momentum $\mathbf{M}$, exactly the same role as the angular quantum number $l$ in regard to the orbital angular momentum $\mathbf{L}$. We have, for instance, for the characteristic values of $M^2$
$$M^2 = \frac{h^2}{4\pi^2}\,j(j+1),$$
which can be obtained from the formula $M^2 = (\mathbf{L}+\mathbf{s})^2 = L^2 + 2\mathbf{L}\cdot\mathbf{s} + s^2$, where $\mathbf{s}$ denotes the spin angular momentum, if we put $s^2 = \tfrac34 h^2/4\pi^2$, $L^2 = h^2l(l+1)/4\pi^2$, and $2\mathbf{L}\cdot\mathbf{s} = h^2l/4\pi^2$ in the case $j = l+\tfrac12$ (in the case $j = l-\tfrac12$, $l$ must be replaced by $-l-1$).
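The values quoted for the product 2L·s follow from the three squares by simple subtraction. A short check with exact fractions, in units of h²/4π², is sketched below; the function name is hypothetical.

```python
from fractions import Fraction


def two_L_dot_s(l, j):
    """2 L.s in units of h^2 / 4 pi^2, from M^2 = L^2 + 2 L.s + s^2.

    Uses M^2 = j(j+1), L^2 = l(l+1) and s^2 = 3/4 in the same units.
    """
    l, j = Fraction(l), Fraction(j)
    return j * (j + 1) - l * (l + 1) - Fraction(3, 4)


for l in range(1, 4):
    upper = two_L_dot_s(l, Fraction(2 * l + 1, 2))   # j = l + 1/2
    lower = two_L_dot_s(l, Fraction(2 * l - 1, 2))   # j = l - 1/2
    print(l, upper, lower)                           # expect l and -(l + 1)
```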
As has been shown above, for a motion in a Coulomb field of force the inner quantum number $j$ also plays the same role as $l$ (in the absence of spin) with regard to the energy.
We shall presently see that this correspondence between $j$ and $l$ can be further extended in describing the splitting of the energy-levels produced by a weak magnetic field.
B. Influence of a magnetic field (Zeeman effect)
The preceding theory can easily be generalized to allow for the presence of a homogeneous magnetic field $\mathfrak{H}$. The radially symmetrical electric field will be represented by the vector $\mathbf{E} = f(r)\,\mathbf{r}$.
If the unperturbed motion is defined as that corresponding to the absence of the magnetic field and to the neglect of the relativity (mass-spin) corrections, i.e. if it is specified by the ordinary energy operator $H = \frac{1}{2m_0}p^2 + U(r)$ (multiplied by $\delta = \begin{pmatrix}1&0\\0&1\end{pmatrix}$), then neglecting terms of the second order in $\mathfrak{H}$ we can represent the complete energy operator $K$ as the sum of $H$ and of the perturbation energy
$$S = -\frac{1}{2m_0c^2}(H'-U)^2 + \frac{e}{2m_0c}\,\mathfrak{H}\cdot\mathbf{L} - \frac{\mu f}{2m_0c}\,\mathbf{L}\cdot\boldsymbol{\sigma} - \mu\,\mathfrak{H}\cdot\boldsymbol{\sigma},$$
where $\mu = he/(4\pi m_0c)$ is the absolute value of the electron's intrinsic moment, the electronic charge being denoted by $-e$ so that
$$-e\mathbf{E} = -\nabla U,\qquad\text{or}\qquad f(r) = \frac{1}{er}\,\frac{dU}{dr}.$$
This can be written in the form
$$S = A + \mathbf{B}\cdot\boldsymbol{\sigma} \tag{265}$$
with
$$A = -\frac{1}{2m_0c^2}(H'-U)^2 + \frac{e}{2m_0c}\,\mathfrak{H}\cdot\mathbf{L} \tag{265 a}$$
and
$$\mathbf{B} = -(\beta\mathbf{L} + \mu\mathfrak{H}), \tag{265 b}$$
where $\beta = \mu f/(2m_0c)$.
The determination of the energy-levels of the two perturbed states resulting from a single unperturbed one can be carried out with the help of the general method outlined in the preceding section in connexion with a perturbation due to the magnetic field alone [see equations (250)-(251 b)]. We thus get
$$\Delta H' = \overline{A} \pm \overline{B}, \tag{266}$$
where $\overline{A} = \int A\psi\psi^*\,dV$ and $\overline{B} = \sqrt{(\overline{B_x})^2 + (\overline{B_y})^2 + (\overline{B_z})^2}$ is the quadratic average of the vector $\mathbf{B}$. If $\mathbf{L}$ is dealt with as a constant vector (which is quite exact for the unperturbed motion), we have
$$\overline{B} = \sqrt{(\overline{\beta})^2L^2 + 2\overline{\beta}\mu\,\mathfrak{H}\cdot\mathbf{L} + \mu^2\mathfrak{H}^2}. \tag{266 a}$$
In the extreme case of a very strong magnetic field, such that $\mu\mathfrak{H} \gg \overline{\beta}L$, this expression reduces to $\mu\mathfrak{H}$. Putting, further, $\mathfrak{H}\cdot\mathbf{L} = \mathfrak{H}\,m_l\,h/2\pi$, where $m_l$ is the axial (magnetic) quantum number for the orbital motion, and neglecting the first terms in (265 a) and (265 b) compared with the second ones, we get
$$\Delta H' = \mu\mathfrak{H}\,(m_l \pm 1), \tag{266 b}$$
i.e. the same result as in the case of the 'normal' Zeeman effect, corresponding to the absence of spin; the influence of the latter is expressed in the replacement of the axial quantum number $m_l$ by $m = m_l \pm 1$, both numbers being integers.
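In the opposite case of a very weak magnetic field ($\mu\mathfrak{H} \ll \overline{\beta}L$) we obtain a splitting of a different type, usually denoted as the 'anomalous' Zeeman effect. Expanding the exact expression (266 a), and neglecting the terms of the second and higher orders in $\mathfrak{H}$, we get
$$\overline{B} = \overline{\beta}L\Big(1 + \frac{\overline{\beta}\mu\,\mathfrak{H}\cdot\mathbf{L}}{\overline{\beta}^2L^2}\Big) = \overline{\beta}L + \mu\,\frac{\mathfrak{H}\cdot\mathbf{L}}{L} = \overline{\beta}L + \mu\mathfrak{H}\,m_l\,\frac{h}{2\pi L}$$
or, putting $L = h(l+1)/2\pi$ and neglecting the 'relativity correction' (represented by the first term in (265 a)),
$$\Delta H' = \pm\overline{\beta}L + \mu\mathfrak{H}\,m_l\Big(1 \pm \frac{1}{l+1}\Big), \tag{266 c}$$
where the upper and lower signs refer to the values $j = l+\tfrac12$ and $j = l-\tfrac12$ of the 'inner quantum number' which determines the total angular momentum $\mathbf{M}$.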
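The two limiting forms of the quadratic average (266 a) can be illustrated numerically. The sketch below treats L and the field as classical vectors enclosing an angle θ and writes the cross term with a positive sign, which is an assumption about the garbled printed formula; the function names and sample values are likewise hypothetical.

```python
import math


def b_exact(beta_L, mu_H, cos_theta):
    """Magnitude of beta*L + mu*H for classical vectors at angle theta."""
    return math.sqrt(beta_L ** 2 + 2 * beta_L * mu_H * cos_theta + mu_H ** 2)


def b_weak(beta_L, mu_H, cos_theta):
    """First-order expansion in the field, valid for mu*H << beta*L."""
    return beta_L + mu_H * cos_theta


# Weak field: the expansion differs from the exact value in second order only.
print(b_exact(1.0, 1e-4, 0.5) - b_weak(1.0, 1e-4, 0.5))

# Strong field: the exact value approaches mu*H itself.
print(b_exact(1e-4, 1.0, 0.5) / 1.0)
```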
This result in a somewhat different external form involving the axial quantum number $m_j$, which determines the component of $\mathbf{M}$ along the magnetic field, so long as the latter is supposed to be weak, can be obtained by the following simple argument.
We have seen above that in the absence of a magnetic field the vectors $\mathbf{L}$ and $\mathbf{s}$ (spin angular momentum) are not constants of the motion, even if the latter takes place in a radially symmetrical electrical field; the sum $\mathbf{L}+\mathbf{s} = \mathbf{M}$ (total angular momentum) is, however, constant in this case. Further, it can easily be shown that the squares of $\mathbf{s}$ and $\mathbf{L}$ remain constant, so that the perturbation produced by the spin alone can be pictured as the rotation (precession) of the two vectors $\mathbf{L}$ and $\mathbf{s}$ of constant magnitude about their resultant $\mathbf{M}$ (Fig. 3). The average values of $\mathbf{s}$ and $\mathbf{L}$ must therefore be parallel to $\mathbf{M}$, and can be expressed accordingly by the equations
$$\overline{\mathbf{s}} = (g-1)\,\mathbf{M},\qquad \overline{\mathbf{L}} = (2-g)\,\mathbf{M}, \tag{267}$$
where $g$ is a certain numerical coefficient ($\overline{\mathbf{s}} + \overline{\mathbf{L}} = \mathbf{s} + \mathbf{L} = \mathbf{M}$).

[Fig. 3.]
It should be mentioned that this 'graphical' representation of the spin perturbation does not give correct results if we assume at the outset that the vectors $\mathbf{s}$ and $\mathbf{L}$ are parallel to each other (in the same or opposite directions), as has been concluded previously from equation (263).
The coefficient $g$ can be determined with the help of the formula $L^2 = (\mathbf{M}-\mathbf{s})^2 = M^2 - 2\mathbf{M}\cdot\mathbf{s} + s^2$ if we put $L^2 = h^2l(l+1)/4\pi^2$, $M^2 = h^2j(j+1)/4\pi^2$, $s^2 = \tfrac34h^2/4\pi^2$ and replace the scalar product $\mathbf{M}\cdot\mathbf{s}$ by $(g-1)M^2$. This gives
$$g - 1 = \frac{j(j+1) - l(l+1) + \frac34}{2j(j+1)}, \tag{267 a}$$
that is
$$g - 1 = \pm\frac{1}{2l+1}\qquad (j = l \pm \tfrac12). \tag{267 b}$$
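That (267 a) reduces to (267 b) for the two admissible values of j can be confirmed with exact fractions. The following Python sketch is an illustration only; the function name is hypothetical.

```python
from fractions import Fraction


def lande_g(l, j):
    """Factor g from (267 a), with the spin term s(s+1) = 3/4."""
    l, j = Fraction(l), Fraction(j)
    return 1 + (j * (j + 1) - l * (l + 1) + Fraction(3, 4)) / (2 * j * (j + 1))


for l in range(1, 4):
    j_up, j_down = Fraction(2 * l + 1, 2), Fraction(2 * l - 1, 2)
    # Both differences from 1 should equal +1/(2l+1) and -1/(2l+1).
    print(l, lande_g(l, j_up) - 1, lande_g(l, j_down) - 1, Fraction(1, 2 * l + 1))
```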
The perturbation produced by a sufficiently weak magnetic field can be pictured in the same graphical way as the rotation (precession) of the parallelogram, formed by the vectors $\mathbf{s}$, $\mathbf{L}$, $\mathbf{M}$, as a rigid body about the direction of the magnetic field, the magnitude of all the three vectors remaining thus constant as before.
The additional magnetic energy can be determined to the first approximation as the average value of the magnetic perturbation energy
$$\frac{2\pi\mu}{h}\,\mathfrak{H}\cdot(\mathbf{L} + 2\mathbf{s}) = \frac{2\pi\mu}{h}\,\mathfrak{H}\cdot(\mathbf{M} + \mathbf{s})$$
for the unperturbed motion. Replacing $\mathbf{s}$ by $(g-1)\mathbf{M}$, we get
$$\Delta H' = \frac{2\pi\mu}{h}\,g\,(\mathfrak{H}\cdot\mathbf{M}). \tag{268}$$
The factor $g$ was introduced for the first time by Landé (in 1922). It can be interpreted as the ratio of the angular velocity of precession of the $(\mathbf{s},\mathbf{L})$ parallelogram about the direction of $\mathfrak{H}$ to the classical or 'Larmor' angular velocity $\omega = e\mathfrak{H}/(2m_0c)$, which corresponds to the absence of spin.
The projection of the vector $\mathbf{M}$ on $\mathfrak{H}$ preserves a constant quantized value which can be shown to be given by the formula
$$M_{\mathfrak{H}} = m_j\,\frac{h}{2\pi}, \tag{268 a}$$
where $m_j$ is the axial quantum number. For a state with a given $j$ it can assume the $2j+1$ half-integral values lying between $+j$ and $-j$. It thus plays, with regard to $j$, exactly the same role as the ordinary axial quantum number $m_l$ with regard to $l$ in the theory of the spinless electron.
With the help of (268 a) the expression (268) can be rewritten in the form
$$\Delta H' = \mu\mathfrak{H}\,m_j\,g = \mu\mathfrak{H}\,m_j\Big(1 \pm \frac{1}{2l+1}\Big). \tag{268 b}$$
It differs from (266 c) (without the term $\overline{\beta}L$ not involving the magnetic field) by the fact that $m_l$ is replaced by $m_j$ and $\frac{1}{l+1}$ by $\frac{1}{2l+1}$. This difference is, however, easily seen to correspond to the connexion between the projections of the vectors $\mathbf{L}$ and $\mathbf{M}$ on the magnetic field. Replacing the vector $\mathbf{L}$ in (266 a) by its average value according to (267), we get
$$\mathfrak{H}\cdot\overline{\mathbf{L}} = (2-g)\,\mathfrak{H}\cdot\mathbf{M} = (2-g)\,\mathfrak{H}\,m_j\,\frac{h}{2\pi}$$
and consequently
$$\mu\mathfrak{H}\,m_j\,(2-g)\Big(1 \pm \frac{1}{l+1}\Big)$$
instead of the expression (266 c), or that part of it which is proportional to $\mathfrak{H}$. Equating this to $\mu\mathfrak{H}\,g\,m_j$ we obtain the following equation for the factor $g$:
$$g = (2-g)\Big(1 \pm \frac{1}{l+1}\Big),$$
whence approximately
$$g - 1 = \pm\frac{1}{2l+1},$$
which coincides with (267 b).
Each level, specified by the quantum numbers $n$, $l$, $j$, is split up in a weak magnetic field into $2j+1$ equidistant levels with the spacing
$$\mu\mathfrak{H}\,g = \mu\mathfrak{H}\Big(1 \pm \frac{1}{2l+1}\Big),$$
where the plus sign refers to the case $j = l+\tfrac12$ and the minus sign to the case $j = l-\tfrac12$.
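The weak-field pattern just described can be tabulated directly: for given l and j there are 2j+1 equidistant sub-levels with the shift m_j g, measured in units of μ times the field strength. The Python sketch below is a minimal illustration under that reading of (268 b); the function name and its arguments are hypothetical.

```python
from fractions import Fraction


def zeeman_levels(l, upper=True):
    """Weak-field shifts m_j * g in units of mu times the field strength.

    upper=True takes j = l + 1/2, upper=False takes j = l - 1/2
    (the latter is meaningful only for l >= 1).
    """
    l = Fraction(l)
    sign = 1 if upper else -1
    j = l + sign * Fraction(1, 2)
    g = 1 + sign / (2 * l + 1)          # g - 1 = +-1/(2l+1), eq. (267 b)
    n_levels = int(2 * j + 1)
    # m_j runs from +j down to -j in integer steps
    return [(j - i) * g for i in range(n_levels)]


print(zeeman_levels(1, upper=True))
print(zeeman_levels(1, upper=False))
```

For l = 1 the upper state gives four sub-levels and the lower state two, with the spacings 4/3 and 2/3 respectively.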
We have assumed above that the magnetic field was 'sufficiently small'. The standard field with which it has to be compared in this sense is the 'effective' magnetic field which determines the spin perturbation in the case $\mathfrak{H} = 0$. This field is parallel and proportional to $\mathbf{L}$, as has been shown above [cf. eq. (262 b)], and can therefore be defined by the formula $\mu\mathfrak{H}_{\mathrm{eff}} = \beta\mathbf{L}$.
If $\mathfrak{H}$ is much larger than $\mathfrak{H}_{\mathrm{eff}}$ the vectors $\mathbf{L}$ and $\mathbf{s}$ are no longer held together in the rigid parallelogram (Fig. 2), but must be imagined to precess independently about the direction of $\mathfrak{H}$, the former with the normal Larmor frequency and the latter with twice this frequency. We get in this case, instead of (268),
$$\Delta H' = \mu\mathfrak{H}\,(m_l \pm 1),$$
in agreement with (266 b). The modification of the Zeeman effect which takes place in a transition from a weak magnetic field to a strong one is known as the Paschen-Back effect.
The preceding results will be established in a more rigorous and complete way in a later section on the basis of Dirac's exact theory.
31. The Exact Four-dimensional Matrix Theory of Dirac
The four equations of the Dirac theory, which in the last section were written in the form of two matrix equations of the Pauli type, can be put in the form of a single matrix equation (they were actually first given by Dirac in this form), in a way perfectly similar to that which has been applied for the same purpose to the Pauli equations.
The four functions of Dirac, $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$, will be considered accordingly as the four elements of a one-column matrix:
$$\psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} \tag{269}$$
(or the components of a four-dimensional vector), the adjoint matrix (complex conjugate vector) being
$$\psi^\dagger = (\psi_1^*,\ \psi_2^*,\ \psi_3^*,\ \psi_4^*). \tag{269 a}$$
Introducing a suitably defined square matrix of the fourth rank (four-dimensional tensor) $A$ we can represent the four first-order equations (229 a)-(229 b) as the four components of the matrix (or vector) equation
$$A\psi = 0, \tag{270}$$
writing them in the form
$$\begin{aligned}
(A\psi)_1 &= A_{11}\psi_1 + A_{12}\psi_2 + A_{13}\psi_3 + A_{14}\psi_4 = 0\\
(A\psi)_2 &= A_{21}\psi_1 + A_{22}\psi_2 + A_{23}\psi_3 + A_{24}\psi_4 = 0\\
(A\psi)_3 &= A_{31}\psi_1 + A_{32}\psi_2 + A_{33}\psi_3 + A_{34}\psi_4 = 0\\
(A\psi)_4 &= A_{41}\psi_1 + A_{42}\psi_2 + A_{43}\psi_3 + A_{44}\psi_4 = 0
\end{aligned} \tag{270 a}$$
Identifying these equations respectively with the first, second, third, and fourth equations (229 a)-(229 b), we get the following definition of the matrix $A$:
$$A = \alpha_xu_x + \alpha_yu_y + \alpha_zu_z + \alpha_tu_t + \alpha_0m_0c \tag{271}$$
with
$$\alpha_x = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad
\alpha_y = \begin{pmatrix}-i&0&0&0\\0&i&0&0\\0&0&-i&0\\0&0&0&i\end{pmatrix},\quad
\alpha_z = \begin{pmatrix}0&1&0&0\\-1&0&0&0\\0&0&0&1\\0&0&-1&0\end{pmatrix},$$
$$\alpha_t = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\quad
\alpha_0 = \begin{pmatrix}0&0&0&-1\\0&0&-1&0\\0&1&0&0\\1&0&0&0\end{pmatrix}. \tag{271 a}$$
This form of the Dirac equations corresponds to a privileged role of the coordinate $x$, the associated matrix $\alpha_x$ reducing to the four-dimensional unit-matrix $\delta$. It is possible, however, to rewrite them in four other equivalent forms, corresponding to the shifting of this privilege to one of the other four matrices $\alpha$.
This can be done in the simplest way by rearranging the original equations (229 a)-(229 b) and eventually multiplying them by $-1$. For instance, to reduce the matrix $\alpha_0$ to $\delta$ we multiply the two equations (229 a) by $-1$, and rewrite the four equations in the reverse order. We thus get
$$\begin{aligned}
(u_x + iu_y)\psi_4 - u_z\psi_3 + (u_t + m_0c)\psi_1 &= 0,\\
(u_x - iu_y)\psi_3 + u_z\psi_4 + (u_t + m_0c)\psi_2 &= 0,\\
-(u_x + iu_y)\psi_2 + u_z\psi_1 + (-u_t + m_0c)\psi_3 &= 0,\\
-(u_x - iu_y)\psi_1 - u_z\psi_2 + (-u_t + m_0c)\psi_4 &= 0,
\end{aligned}$$
which can be written in the form
$$B\psi = 0, \tag{272}$$
with
$$B = \beta_xu_x + \beta_yu_y + \beta_zu_z + \beta_tu_t + \beta_0m_0c, \tag{272 a}$$
where
$$\beta_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&-1&0&0\\-1&0&0&0\end{pmatrix},\quad
\beta_y = \begin{pmatrix}0&0&0&i\\0&0&-i&0\\0&-i&0&0\\i&0&0&0\end{pmatrix},\quad
\beta_z = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\1&0&0&0\\0&-1&0&0\end{pmatrix},$$
$$\beta_t = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix},\quad
\beta_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}. \tag{272 b}$$
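The rearrangement just described (reversing the order of the four equations and changing the sign of the first two) is the same thing as multiplying A on the left by −α₀, so the matrices β can be generated from the matrices α and their algebraic properties checked mechanically. The Python sketch below does this. The entries of the α matrices are transcribed by hand from (271 a) and should be treated as an assumption to be compared with the printed book; all names are hypothetical.

```python
# Matrices alpha as read from (271 a); the transcription is an assumption.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
ALPHA = {
    "x": IDENTITY,
    "y": [[-1j, 0, 0, 0], [0, 1j, 0, 0], [0, 0, -1j, 0], [0, 0, 0, 1j]],
    "z": [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]],
    "t": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "0": [[0, 0, 0, -1], [0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
}


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scale(c, a):
    """Matrix a multiplied by the number c."""
    return [[c * x for x in row] for row in a]


def add(a, b):
    """Sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


ZERO = scale(0, IDENTITY)

# B = -alpha_0 A, applied to each coefficient matrix separately.
BETA = {name: scale(-1, matmul(ALPHA["0"], m)) for name, m in ALPHA.items()}


def beta_algebra_ok():
    """beta_0 = unit matrix, squares equal to -+delta, pairwise anticommutation."""
    names = ["x", "y", "z", "t"]
    squares = {"x": -1, "y": -1, "z": -1, "t": 1}
    if BETA["0"] != IDENTITY:
        return False
    for p in names:
        if matmul(BETA[p], BETA[p]) != scale(squares[p], IDENTITY):
            return False
        for q in names:
            anti = add(matmul(BETA[p], BETA[q]), matmul(BETA[q], BETA[p]))
            if p != q and anti != ZERO:
                return False
    return True


print(beta_algebra_ok())
```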
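Rewriting equations (229 a)-(229 b) in the inverse order without multiplying (229 a) by $-1$, we get in a similar way
$$\Gamma\psi = 0, \tag{273}$$
with
$$\Gamma = \gamma_xu_x + \gamma_yu_y + \gamma_zu_z + \gamma_tu_t + \gamma_0m_0c, \tag{273 a}$$
where
$$\gamma_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\quad
\gamma_y = \begin{pmatrix}0&0&0&i\\0&0&-i&0\\0&i&0&0\\-i&0&0&0\end{pmatrix},\quad
\gamma_z = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\-1&0&0&0\\0&1&0&0\end{pmatrix},$$
$$\gamma_t = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad
\gamma_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}. \tag{273 b}$$
This last form of the Dirac equations is especially useful because the matrices $\gamma$ are all Hermitian, while the matrices $\alpha$ and $\beta$ are not. There is, moreover, a very simple relationship between the Dirac matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ and Pauli 'spin' matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ which can be expressed by the equations
$$\gamma_x = \begin{pmatrix}0&\sigma_x\\ \sigma_x&0\end{pmatrix},\qquad
\gamma_y = \begin{pmatrix}0&\sigma_y\\ \sigma_y&0\end{pmatrix},\qquad
\gamma_z = \begin{pmatrix}0&\sigma_z\\ \sigma_z&0\end{pmatrix}, \tag{274}$$
with 0 meaning the two-dimensional zero matrix $\begin{pmatrix}0&0\\0&0\end{pmatrix}$. The Dirac matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ can be thus defined as 'supermatrices' of the second rank, whose elements are constituted by the corresponding Pauli matrices and the two-dimensional zero matrices.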
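Further, it can easily be shown that the matrices $\gamma_x$, $\gamma_y$, $\gamma_z$, just like the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$, anticommute with each other and with the matrix $\gamma_0$, so that putting for the sake of brevity
$$\gamma_x = \gamma_1,\quad \gamma_y = \gamma_2,\quad \gamma_z = \gamma_3,\quad \gamma_0 = \gamma_4$$
($\gamma_t$ must be left aside, since it is equal to the unit matrix $\delta$), we have
$$\gamma_\mu\gamma_\nu = -\gamma_\nu\gamma_\mu\qquad (\mu \neq \nu). \tag{274 a}$$
A relation of the type $\sigma_x\sigma_y = i\sigma_z$, etc., does not hold, however, for the matrices $\gamma_x$, $\gamma_y$, $\gamma_z$. We have, for instance, according to (273 b),
$$\gamma_x\gamma_y = \begin{pmatrix}-i&0&0&0\\0&i&0&0\\0&0&-i&0\\0&0&0&i\end{pmatrix}
= i\begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&1\end{pmatrix}
= i\begin{pmatrix}\sigma_z&0\\0&\sigma_z\end{pmatrix},$$
which is different from $i\gamma_z$.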
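The properties claimed for the matrices γ (Hermitian character, unit squares, mutual anticommutation, and the failure of a relation of the Pauli type) can be verified by direct multiplication. In the Python sketch below the entries are transcribed by hand from (273 b) and are to be regarded as an assumption; the helper names are hypothetical.

```python
# Matrices gamma as read from (273 b); the transcription is an assumption.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
GAMMA = {
    "x": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "y": [[0, 0, 0, 1j], [0, 0, -1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    "z": [[0, 0, -1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, 1, 0, 0]],
    "0": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
}


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scale(c, a):
    """Matrix a multiplied by the number c."""
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Hermitian conjugate (transposed complex conjugate)."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def gamma_algebra_ok():
    """Hermitian, unit squares (274 b), pairwise anticommutation (274 a)."""
    for p, gp in GAMMA.items():
        if dagger(gp) != gp or matmul(gp, gp) != IDENTITY:
            return False
        for q, gq in GAMMA.items():
            if p != q and matmul(gp, gq) != scale(-1, matmul(gq, gp)):
                return False
    return True


product_xy = matmul(GAMMA["x"], GAMMA["y"])
print(gamma_algebra_ok())
print(product_xy == scale(1j, GAMMA["z"]))   # the Pauli-type relation fails
```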
To equations (274 a) we may add the equations
$$\gamma_\mu^2 = \delta, \tag{274 b}$$
which are easily verified.
It should be mentioned that the four matrices $\alpha$ or $\beta$ (which are different from $\delta$) also satisfy anticommutative relations of the type (274 a), while their squares are equal to $\pm\delta$. We have, namely,
$$\beta_x^2 = \beta_y^2 = \beta_z^2 = -\delta,\qquad \beta_t^2 = \delta, \tag{274 c}$$
$$\alpha_y^2 = \alpha_z^2 = \alpha_0^2 = -\delta,\qquad \alpha_t^2 = \delta,$$
and, of course, $\beta_0^2 = \alpha_x^2 = \delta$ (since $\beta_0 = \alpha_x = \delta$).
With the help of these relations the transition from one form of Dirac's equations to some other equivalent form can be carried out by the multiplication of the former by that matrix which must be replaced by $\delta$ (with the + or - sign as the case may be). We have, for example,
$$A = \gamma_x\Gamma,\qquad B = \gamma_0\Gamma,\qquad \Gamma = \alpha_tA,$$
which means that
$$\alpha_x = \gamma_x^2 = \delta,\quad \alpha_y = \gamma_x\gamma_y,\quad \alpha_z = \gamma_x\gamma_z,\quad \alpha_t = \gamma_x\gamma_t = \gamma_x,\quad \alpha_0 = \gamma_x\gamma_0,\quad \beta_x = \gamma_0\gamma_x,$$
etc.; these relations can be verified directly.
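We can further easily derive from the first-order equations the second-order equations of Dirac's theory in a similar matrix form. This can be done in the simplest way by applying to the equation $B\psi = 0$ the operator
$$\overline{B} = -(\beta_xu_x + \beta_yu_y + \beta_zu_z + \beta_tu_t) + \beta_0m_0c.$$
We thus get $\overline{B}B\psi = 0$, or, carrying out the multiplication and taking account of the relations (274 a) and (274 b):
$$\begin{aligned}
\{(u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2) &- [\beta_y\beta_z(u_yu_z-u_zu_y) + \beta_z\beta_x(u_zu_x-u_xu_z) + \beta_x\beta_y(u_xu_y-u_yu_x)\\
&+ \beta_x\beta_t(u_xu_t-u_tu_x) + \beta_y\beta_t(u_yu_t-u_tu_y) + \beta_z\beta_t(u_zu_t-u_tu_z)]\}\psi = 0.
\end{aligned}$$
This equation can be written in the form
$$Q\psi = 0 \tag{275}$$
with the matrix operator
$$Q = D\delta - \frac{he}{2\pi c}\,(\mathbf{H}\cdot\boldsymbol{\xi} + \mathbf{E}\cdot\boldsymbol{\eta}), \tag{275 a}$$
where $D = u^2 - u_t^2 + m_0^2c^2$, as before, while $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ are vector-matrices with the rectangular components
$$\xi_x = \begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix},\quad
\xi_y = \begin{pmatrix}0&i&0&0\\-i&0&0&0\\0&0&0&i\\0&0&-i&0\end{pmatrix},\quad
\xi_z = \begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&1\end{pmatrix}, \tag{275 b}$$
$$\eta_x = \begin{pmatrix}0&0&0&-i\\0&0&-i&0\\0&-i&0&0\\-i&0&0&0\end{pmatrix},\quad
\eta_y = \begin{pmatrix}0&0&0&1\\0&0&-1&0\\0&1&0&0\\-1&0&0&0\end{pmatrix},\quad
\eta_z = \begin{pmatrix}0&0&i&0\\0&0&0&-i\\i&0&0&0\\0&-i&0&0\end{pmatrix}. \tag{275 c}$$
We can also write down the relations
$$\xi_x = i\beta_y\beta_z,\qquad \xi_y = i\beta_z\beta_x,\qquad \xi_z = i\beta_x\beta_y, \tag{276}$$
$$\eta_x = i\beta_x\beta_t,\qquad \eta_y = i\beta_y\beta_t,\qquad \eta_z = i\beta_z\beta_t,$$
or in vector notation
$$\boldsymbol{\xi} = \tfrac12 i\,\boldsymbol{\beta}\times\boldsymbol{\beta},\qquad \boldsymbol{\eta} = i\boldsymbol{\beta}\beta_t = -i\boldsymbol{\gamma}. \tag{276 a}$$
The identity of equation (275) with the four equations (230)-(230 a) is easily verified.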
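The relations (276) and (276 a) lend themselves to a mechanical check: starting from the matrices β and γ one can form the products in question and compare them. The Python sketch below does so. The matrix entries are transcribed by hand from (272 b) and (273 b) and are an assumption to be checked against the printed text; all helper names are hypothetical.

```python
# Matrices beta and gamma as read from (272 b) and (273 b); an assumption.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
BETA = {
    "x": [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    "y": [[0, 0, 0, 1j], [0, 0, -1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]],
    "z": [[0, 0, -1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, -1, 0, 0]],
    "t": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
}
GAMMA = {
    "x": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "y": [[0, 0, 0, 1j], [0, 0, -1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    "z": [[0, 0, -1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, 1, 0, 0]],
}
CYCLIC = {"x": ("y", "z"), "y": ("z", "x"), "z": ("x", "y")}


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scale(c, a):
    """Matrix a multiplied by the number c."""
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Hermitian conjugate (transposed complex conjugate)."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


# xi_x = i beta_y beta_z (cyclically), eta_x = i beta_x beta_t, as in (276).
XI = {k: scale(1j, matmul(BETA[a], BETA[b])) for k, (a, b) in CYCLIC.items()}
ETA = {k: scale(1j, matmul(BETA[k], BETA["t"])) for k in "xyz"}


def xi_eta_ok():
    """xi Hermitian with unit square; eta anti-Hermitian, equal to -i gamma."""
    for k in "xyz":
        if dagger(XI[k]) != XI[k] or matmul(XI[k], XI[k]) != IDENTITY:
            return False
        if dagger(ETA[k]) != scale(-1, ETA[k]):
            return False
        if matmul(ETA[k], ETA[k]) != scale(-1, IDENTITY):
            return False
        if ETA[k] != scale(-1j, GAMMA[k]):
            return False
    return True


print(xi_eta_ok())
```

The squares found here (+δ for the components of ξ, −δ for those of η) correspond to characteristic values ±1 and ±i respectively.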
It should be mentioned that the actual way in which Dirac first obtained his first-order equation $B\psi = 0$ was to some extent the reverse of the preceding derivation for the particular case of the free motion when the matrix $Q$ reduces to the operator $D$ (multiplied by $\delta$). Assuming the possibility of representing $Q$ in this case in the form $\overline{B}B$ one can easily obtain the conditions $\beta_x^2 = \beta_y^2 = \beta_z^2 = -\beta_t^2 = -\delta$ and $\beta_\mu\beta_\nu = -\beta_\nu\beta_\mu$ ($\mu \neq \nu$) for the matrices $\beta$; after this the first-order equation $B\psi = 0$ is naturally generalized for the motion of the electron in an arbitrary field of force (by replacing $\mathbf{p}$ by $\mathbf{u}$), and finally the corresponding generalized expression for the second-order operator $Q$ is obtained in the way shown above.
We have preferred to this straightforward method of Dirac the somewhat more lengthy and complicated path starting with Maxwell's equations, because of the resulting gain in the comprehensiveness of the theory. Moreover, the determination of the matrices $\beta$ from the properties above stated is an ambiguous problem, which can be solved only after some assumption has been made as to their rank, i.e. the number of wave functions $\psi$, whereas in our derivation this number is settled from the beginning with the help of the analogy between d'Alembert's equation and Maxwell's equations on the one hand, and the wave-mechanical equations of the second and first order on the other.
The four-dimensional second-order equation (275) is equivalent to
the two equations (258) and (258 a) involving the two-dimensional Pauli
spin matrix a. The Dirac matrix %can be defined as a duplication of
the latter according to the formula
(277)
0 0|
where 0 is short for the two-dimensional zero-matrix This
0 OJ
formula is equivalent to the following three:
which differ from the formulae (274) for yx, y u, y z by the fact that the
duplication is carried out in the direction of the right diagonal and not
of the left one. The formulae (274) can be replaced by the single vector
formula 0 o
Y= a 0
The vectors y and % are easily seen to be connected with each other
by the relations
Y = p5 = %P, %= PY = YP. (277 a)
where p is the scalar matrix
0 0 10
10 0 0 1
P ~ jl 0 0 0 " P *>• <277b>
,0 1 0 0.
which commutes with y and anticommutes with y 0:
PYo = —YoP-
It should be mentioned that $\gamma_0$ commutes with $\boldsymbol{\xi}$ (since it anticommutes
both with $\boldsymbol{\gamma}$ and with $\rho$). We have further, from comparing (273 b)
and (273 c):
-ip t (277 c)
The expression (275 a) for the matrix operator Q can thus be rewritten
in the form »
Q = D -p -J H -ip E y t
where the factor $\delta$ is to be understood in $D$.
It is clear that the matrix $\boldsymbol{\xi}$ must have in the Dirac theory a similar
physical meaning to that of the matrix $\boldsymbol{\sigma}$ in the Pauli theory, i.e. it must
represent, with a suitable numerical factor, the spin angular momentum
or the magnetic moment. The matrix $\boldsymbol{\eta}$ must represent accordingly,
when multiplied by $\mu$, the electric moment of the electron. An important
distinction between the matrices $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ consists in the fact that the
former is Hermitian and therefore represents a real quantity (with the
characteristic values $\pm 1$), while the latter is anti-Hermitian and
therefore represents an imaginary quantity (with the characteristic values
$\pm i$).
This result seems at first sight to contradict the conclusion arrived
at in the preceding section, namely, that a moving electron possesses
a real electric moment represented approximately (in the corrected
Pauli theory) by the matrix $\mu\,\mathbf{u}\times\boldsymbol{\sigma}/(2m_0c)$. As a matter of fact such
a contradiction does not exist, for the matrices $\mu\boldsymbol{\xi}$ and $\mu\boldsymbol{\eta}$ represent
the 'rest-values' of the magnetic and electric moments, i.e. their values
in a system of coordinates with respect to which the electron is at rest.
In a coordinate system with respect to which it is in motion, the
electron has an additional imaginary magnetic moment and an additional
real electric moment, these additional moments being numerically
equal and to a first approximation proportional to the velocity.
From the point of view of the classical theory, if $\boldsymbol{\mu}$ and $\boldsymbol{\nu}$ are the
rest-values of the magnetic and electric moments of a particle, then in
a coordinate system with respect to which this particle is moving with a
velocity $\mathbf{v}$ it will have an additional magnetic moment $\Delta\boldsymbol{\mu}$ equal, to
a first approximation, to $\frac{\mathbf{v}}{c}\times\boldsymbol{\nu}$ and an additional electric moment $\Delta\boldsymbol{\nu}$
equal (to the same approximation) to $\frac{\mathbf{v}}{c}\times\boldsymbol{\mu}$. Putting $\boldsymbol{\nu} = i\boldsymbol{\mu}$ we get
$\Delta\boldsymbol{\mu} = i\Delta\boldsymbol{\nu}$. The numerical equality of the two moments is thus maintained
for a moving electron (it can easily be shown to hold exactly),
the imaginary electric moment giving rise to an imaginary magnetic
one and the real magnetic moment to a real electric one. This real
electric moment is represented wave-mechanically by the operator
$\mu\,\mathbf{u}\times\boldsymbol{\sigma}/(m_0c)$.
We can now turn to the discussion of the physical meaning of Dirac's
first-order equation $\Gamma\psi = 0$. We shall note first of all that it can be
written in the standard form
$$(\epsilon + p_t)\psi = 0, \tag{278}$$
where $p_t$ denotes the operator $\frac{h}{2\pi i}\frac{\partial}{\partial t}$ multiplied by the four-dimensional
unit matrix $\delta$, and $\epsilon$ the first-order energy operator defined as the
four-dimensional matrix
$$\epsilon = U + c(\gamma_x u_x + \gamma_y u_y + \gamma_z u_z) + m_0c^2\gamma_0 = U + c\,\boldsymbol{\gamma}\cdot\mathbf{u} + m_0c^2\gamma_0. \tag{278 a}$$
The important point about Dirac's equation, namely its relativistic
symmetry with regard to time and space, is revealed by the possibility
of writing it in one of the three other equivalent forms:
$$(P_x - p_x)\psi = 0, \qquad (P_y - p_y)\psi = 0, \qquad (P_z - p_z)\psi = 0,$$
corresponding to the election of one of the space coordinates to the
presidential role played in the usual form of the theory by the time.
Replacing the latter by the coordinate $x$, for example, we get for the
corresponding 'momentum operator matrix', with the help of the equation
$\Gamma\psi = 0$, the following expression:
Px = Gx—a.yuy—<xzus—oLiul—oLfim'i ct
where $G_x$, the $x$-component of the 'potential momentum' $e\mathbf{A}/c$, is supposed
to be multiplied by the unit matrix $\delta$. The same refers, of course,
to the operator $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$ in the equation $(P_x - p_x)\psi = 0$ (as well as
to the operators $p_y$ and $p_z$ in the two other momentum equations).
If the operator $\epsilon$ does not contain the time explicitly, then equation
(278) admits particular solutions $\psi_{\epsilon'} = \psi^0_{\epsilon'}\,e^{-i2\pi\epsilon' t/h}$ for which it
reduces to the form $(\epsilon - \epsilon')\psi^0_{\epsilon'} = 0$. These solutions represent different
stationary states of the electron moving in a constant electromagnetic
field.
It can easily be shown in exactly the same way as in Pauli's theory
that functions $\psi_{\epsilon'}$ and $\psi_{\epsilon''}$, belonging to different energy values
which form a discrete spectrum, satisfy the orthogonality relation
$$\int \psi^\dagger_{\epsilon'}\psi_{\epsilon''}\,dV = 0,$$
where $\psi^\dagger_{\epsilon'}\psi_{\epsilon''} = \sum_{\alpha=1}^{4}\psi^*_{\epsilon'\alpha}\psi_{\epsilon''\alpha}$. This relation enables us to build up a matrix
representation of physical quantities and a transformation theory which
differs from that based on Pauli's equation by the fact that the additional
'spin-index' $\alpha$ assumes four values instead of two. Another
important difference consists in the fact that Dirac's equation
$(\epsilon - \epsilon')\psi = 0$ admits solutions corresponding to negative values of the
energy $\epsilon'$. This circumstance will be discussed in more detail later on
(§34).
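The occurrence of negative energy values can be made plausible by a small numerical check. The sketch below is an illustration only, under the following assumptions: a free electron ($U = 0$, no field) whose momentum components are ordinary numbers, the standard Pauli matrices, and hypothetical helper names. It verifies that the square of the matrix (278 a) is the number $c^2u^2 + m_0^2c^4$ times the unit matrix and that its trace vanishes, so that the characteristic values are $\pm\sqrt{c^2u^2 + m_0^2c^4}$, each occurring twice.

```python
# Standard Pauli matrices (an assumption; the book's sign convention may differ).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def mul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiply every element of a matrix by the number c."""
    return [[c * x for x in row] for row in a]


def add(*mats):
    """Elementwise sum of several matrices of equal size."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) for j in range(n)] for i in range(n)]


GAMMA = [block(Z2, s, s, Z2) for s in (SX, SY, SZ)]
GAMMA0 = block(I2, Z2, Z2, scale(-1, I2))


def free_energy_matrix(u, m0, c=1):
    """epsilon = c gamma.u + m0 c^2 gamma_0 for numerical u and U = 0."""
    terms = [scale(c * ui, g) for ui, g in zip(u, GAMMA)]
    terms.append(scale(m0 * c * c, GAMMA0))
    return add(*terms)


def square_is_scalar(u, m0, c=1):
    """True if epsilon^2 = (c^2 u^2 + m0^2 c^4) * 1 and trace(epsilon) = 0."""
    eps = free_energy_matrix(u, m0, c)
    sq = mul(eps, eps)
    target = c * c * sum(x * x for x in u) + (m0 * c * c) ** 2
    diag_ok = all(sq[i][i] == target for i in range(4))
    off_ok = all(sq[i][j] == 0 for i in range(4) for j in range(4) if i != j)
    return diag_ok and off_ok and sum(eps[i][i] for i in range(4)) == 0
```

With integer inputs such as `square_is_scalar((1, 2, 2), 2)` the arithmetic is exact.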
It may seem at first sight that the wave-mechanical expression
(278 a), because it is linear in the operators $u_x$, $u_y$, $u_z$, representing the
components of the electron's proper momentum, has no parallel in the
classical relativistic mechanics. A similar expression is obtained, however,
on the Einstein theory if the proper energy $mc^2 = m_0c^2/\sqrt{1 - v^2/c^2}$
is rewritten in the form
$$mc^2 = m_0c^2\sqrt{1 - v^2/c^2} + \mathbf{g}\cdot\mathbf{v},$$
where $\mathbf{g} = m\mathbf{v}$ is the proper momentum. Putting $\epsilon = mc^2 + U$ and
$\mathbf{g}\cdot\mathbf{v} = v_xg_x + v_yg_y + v_zg_z$, we get
$$\epsilon = U + v_xg_x + v_yg_y + v_zg_z + m_0c^2\sqrt{1 - v^2/c^2}, \tag{279}$$
which becomes identical with the expression (278 a) if we replace the
proper momentum vector $\mathbf{g}$ by the operator $\mathbf{u}$, the velocity vector $\mathbf{v}$ by
the vector-matrix $c\boldsymbol{\gamma}$, and the expression $\sqrt{1 - v^2/c^2}$ by the matrix $\gamma_0$.
We shall write this symbolically in the form of ordinary equations:
$$\mathbf{g} = \mathbf{u}, \qquad \mathbf{v} = c\boldsymbol{\gamma}, \qquad \sqrt{1 - v^2/c^2} = \gamma_0. \tag{279 a}$$
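The classical identity underlying (279) is easily verified numerically. The following sketch is an illustration only; the function name and the sample values are assumptions. It computes the proper energy $mc^2$ directly and as the sum $\mathbf{g}\cdot\mathbf{v} + m_0c^2\sqrt{1 - v^2/c^2}$.

```python
import math


def energy_split(m0, v, c=1.0):
    """Return (m c^2, g.v + m0 c^2 sqrt(1 - v^2/c^2)) for a velocity vector v."""
    v2 = sum(x * x for x in v)
    root = math.sqrt(1.0 - v2 / (c * c))
    m = m0 / root                       # actual (velocity-dependent) mass
    g = [m * x for x in v]              # proper momentum g = m v
    g_dot_v = sum(gi * vi for gi, vi in zip(g, v))
    return m * c * c, g_dot_v + m0 * c * c * root
```

The two returned numbers agree to rounding error for any velocity below $c$.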
The startling point about these relations is the fact that the classical
momentum and velocity are replaced by operators of an entirely different
type. This may be due partially to the variation of the mass,
which is the proportionality coefficient between momentum and velocity,
as a function of the latter. If, however, this were the only reason
for the difference, we should expect the relation
$$\gamma_0\mathbf{u} = m_0c\,\boldsymbol{\gamma}$$
to hold, which of course is not the case (see below).
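The fact that the operators $\mathbf{u}$ and $c\boldsymbol{\gamma}$ are the wave-mechanical
representatives of the momentum and velocity vectors respectively can be
established in a more direct and convincing way than has been done
above. Let us consider the classical equation of motion of the
relativity theory in the Lorentz-Einstein form
$$\frac{d\mathbf{g}}{dt} = e\Big(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\Big). \tag{280}$$
Replacing the classical time derivative of $\mathbf{g}$ by the wave-mechanical
expression
$$\frac{d\mathbf{u}}{dt} = \frac{\partial\mathbf{u}}{\partial t} + [\epsilon, \mathbf{u}], \tag{280 a}$$
we get, since $\mathbf{u} = \mathbf{p} - \mathbf{G} = \mathbf{p} - e\mathbf{A}/c$,
$$\frac{d\mathbf{u}}{dt} = -\frac{e}{c}\frac{\partial\mathbf{A}}{\partial t} + [\epsilon, \mathbf{u}],$$
and further, with the help of the expression (278 a) for the energy
operator,
$$[\epsilon, \mathbf{u}] = c[(\boldsymbol{\gamma}\cdot\mathbf{u}), \mathbf{u}] + [U, \mathbf{u}]$$
(since $\gamma_0$ commutes with $\mathbf{u}$). Now
$$[U, u_x] = [U, p_x] = -\frac{\partial U}{\partial x}$$
and
$$[(\boldsymbol{\gamma}\cdot\mathbf{u}), u_x] = \gamma_x[u_x, u_x] + \gamma_y[u_y, u_x] + \gamma_z[u_z, u_x]$$
$$= \frac{2\pi i}{h}\big[\gamma_y(u_yu_x - u_xu_y) + \gamma_z(u_zu_x - u_xu_z)\big]$$
$$= \frac{e}{c}(\gamma_yH_z - \gamma_zH_y) = \frac{e}{c}(\boldsymbol{\gamma}\times\mathbf{H})_x,$$
according to (218). We thus have
$$[\epsilon, \mathbf{u}] = e[-\nabla\phi + \boldsymbol{\gamma}\times\mathbf{H}],$$
and consequently
$$\frac{d\mathbf{u}}{dt} = e\Big[-\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi + \boldsymbol{\gamma}\times\mathbf{H}\Big],$$
or finally
$$\frac{d\mathbf{u}}{dt} = e(\mathbf{E} + \boldsymbol{\gamma}\times\mathbf{H}). \tag{280 b}$$
This equation is of exactly the same form as the classical equation
(280) with $\mathbf{g}$ replaced by $\mathbf{u}$ and $\mathbf{v}/c$ by $\boldsymbol{\gamma}$ in agreement with (279 a).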
Another still more direct and conclusive proof that the operator $c\boldsymbol{\gamma}$
is the wave-mechanical equivalent for the velocity is obtained by
calculating the operators $dx/dt$, $dy/dt$, $dz/dt$ which obviously represent
the components of the vector $\mathbf{v}$. We thus get
$$\frac{dx}{dt} = [\epsilon, x] = c[\gamma_xu_x, x]$$
(since all the other elementary operators constituting $\epsilon$ commute with
$x$), that is
$$\frac{dx}{dt} = c\gamma_x[u_x, x] = c\gamma_x[p_x, x] = c\gamma_x,$$
or
$$\frac{d\mathbf{r}}{dt} = c\boldsymbol{\gamma}, \tag{280 c}$$
which is the desired relation.
The physical meaning of the operator $c\boldsymbol{\gamma}$ as the representative of the
velocity can be finally recognized from the fact that, with the expression
$$\rho = \psi^\dagger\psi \tag{281}$$
for the density of probability, following from (232), the expressions
(232 a) for the probability current-density can be written in the form
$$\mathbf{j} = c\,\psi^\dagger(\boldsymbol{\gamma}\psi) = c\,(\psi^\dagger\boldsymbol{\gamma}^\dagger)\psi \tag{281 a}$$
corresponding to the classical relation $\mathbf{j} = \rho\mathbf{v}$. We have, for instance,
according to (271 b), taking the $x$-component of $\mathbf{j}$,
$$j_x = c\,\psi^\dagger\gamma_x\psi = c\big[\psi_1^*(\gamma_x\psi)_1 + \psi_2^*(\gamma_x\psi)_2 + \psi_3^*(\gamma_x\psi)_3 + \psi_4^*(\gamma_x\psi)_4\big],$$
which coincides with the expression (232 a) for $j_x$. Since all the three
matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ are Hermitian, we have $\boldsymbol{\gamma}^\dagger = \boldsymbol{\gamma}$, so that the two forms
(281 a) for $\mathbf{j}$ (with $\boldsymbol{\gamma}$ acting on $\psi$ and $\boldsymbol{\gamma}^\dagger$ on $\psi^\dagger$) are equivalent, being
actually obtained from each other by the associative law of multiplication.
The expressions (281) and (281 a) can be derived directly from Dirac's
equation $\Gamma\psi = 0$, and this in a much simpler way than without the
use of the matrix notation. Multiplying, namely, this equation (on the
left) by $\psi^\dagger$ and subtracting from it the product of the adjoint equation
$\psi^\dagger\Gamma^\dagger = 0$ by $\psi$ (on the right), we get
$$\psi^\dagger(\Gamma\psi) - (\psi^\dagger\Gamma^\dagger)\psi = 0,$$
that is, since $\gamma_0^\dagger = \gamma_0$ and $\boldsymbol{\gamma}^\dagger = \boldsymbol{\gamma}$,
'PHui'P)—('u*'P*)lp-i~lP*u 'YlP—(u**p^)yip — 0,
or finally
$$\frac{\partial}{\partial t}(\psi^\dagger\psi) + \operatorname{div}(c\,\psi^\dagger\boldsymbol{\gamma}\psi) = 0. \tag{281 b}$$
This is the equation of continuity for the probability density and
current density as defined by (281) and (281 a).
The expression (281 a) for the probability current-density can be
transformed (according to Gordon) in the following way. Replacing ip by
the expression —(^•u-{-ptvl)tpJm0c, with the help of equation (272 a) we
have ,
m0
or, since y = ftp ,
«*0j = —
We have further, according to (276),
f t p s ' l l — fixfi xu x~^~fixPvu v~^~Pxfizu s = £ zu y)i
th at is, p(P u) = —u —tu x ?
and pft — - it) . We thus got
j = - V ftu ^ + ± [ p fl l(ux%+ nu')'j,]. (282)
m0 Mq
Transforming in a similar way the factor «/r* (instead of ip) in the expres
sion j = ipiy'ip and adding the result to the previous one, we get finally,
remembering that
I e; = A = r«. V = 5 . V= -n = »Y.
j = ± R ( ^ V » u ^ ) + c « r ] ( ^ ^ yo ^ ) + | ( ^ - ^ yo ^ ) .
(282 a)
This expression multiplied by e/c (e = charge of the electron) gives the
density of the electric current (in e.m. units). The latter can accordingly
be written in the form
C M\% </']— 0 •/')J + curl Sl *'>
CC
(282 b)
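where
$$\mathfrak{M} = \mu\,\psi^\dagger\gamma_0\boldsymbol{\xi}\psi \quad\text{and}\quad \mathfrak{P} = \mu\,\psi^\dagger\gamma_0\boldsymbol{\eta}\psi. \tag{283}$$
The vector $\mathfrak{M}$ must obviously be interpreted as the 'magnetization',
i.e. the probable value per unit volume of the magnetic moment due
to the electron's spin. Its components are expressed by the formulae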
= vU'l’t'l’i+ 'l’t'I'i)—('I’t'f’i + t fM ] 'j
®i„ = • <283a)
SR* = m[(—<Pf t l ) —( —j
If in these expressions we neglect the products of $\psi_3$ with $\psi_4$ (which
are small quantities of the second order in $1/c$) they reduce to the
expressions (252 b) of Pauli's theory. Splitting up the matrix $\psi$ into
two two-dimensional matrices $\varphi$ and $\chi$, we can rewrite (283 a) in the form
$$\mathfrak{M} = \mu(\varphi^\dagger\boldsymbol{\sigma}\varphi - \chi^\dagger\boldsymbol{\sigma}\chi). \tag{283 b}$$
The vector $\mathfrak{P}$ represents the 'electric polarization', i.e. the probable
value per unit volume of the electric moment due to the electron's
spin. In spite of its imaginary appearance it is easily seen to be a real
quantity. We have, namely,
Wx = — + M i)] \
% = ^ 3—M s ] I (283c)
4>S)—(M*~02 0?)] J
which can also be written in the form
(283 d)
corresponding to (283 b). If $\chi$ is replaced here by its approximate
expression in terms of $\varphi$ according to (260) we get, with the help of
(260 a),
$$\mathfrak{P} \cong \frac{\mu}{m_0c}\,\varphi^\dagger(\mathbf{u}\times\boldsymbol{\sigma})\varphi$$
in agreement with our previous interpretation of the operator
$$\boldsymbol{\nu} = \frac{\mu}{m_0c}\,\mathbf{u}\times\boldsymbol{\sigma}$$
as the electron's real electric moment [cf. (261 b)].
It is interesting to note that the magnetic moment is in Dirac's
theory specified by the matrix $\gamma_0\boldsymbol{\xi}$ and not by the matrix $\boldsymbol{\xi}$ which was
assumed to specify the mechanical angular momentum due to spin.
This difference can be interpreted as the expression of the fact that in
the classical theory the ratio of the magnetic moment to the angular
momentum is equal to $e/(2cm)$ for orbital motion or $e/(cm)$ for the spin
motion, where $m$ is not the rest-mass, but the actual mass $m_0/\sqrt{1 - v^2/c^2}$.
If, therefore, in wave mechanics the spin angular momentum is represented
by the matrix $h\boldsymbol{\xi}/4\pi$, then the magnetic moment must be represented
by the matrix $eh\gamma_0\boldsymbol{\xi}/(4\pi m_0c) = \mu\gamma_0\boldsymbol{\xi}$, since the classical quantity
$\sqrt{1 - v^2/c^2}$ is represented by the matrix $\gamma_0$ [cf. (279 a)].
32. General Treatment of the Spin Effect; Angular Momentum
and Magnetic Moment
The fact that the spin angular momentum must be represented by the
vector $\mathbf{s} = h\boldsymbol{\xi}/4\pi$ can be proved in the same way as in the case of the
Pauli theory (where $\boldsymbol{\xi}$ is replaced by $\boldsymbol{\sigma}$).
We have, to begin with, according to (275 b), the following relations:
$$\xi_x\xi_y = -\xi_y\xi_x = i\xi_z, \qquad \xi_y\xi_z = -\xi_z\xi_y = i\xi_x, \qquad \xi_z\xi_x = -\xi_x\xi_z = i\xi_y \tag{284}$$
and consequently
$$\boldsymbol{\xi}\times\boldsymbol{\xi} = 2i\boldsymbol{\xi}, \tag{284 a}$$
so that the matrix $\boldsymbol{\xi}$ satisfies the same relations as Pauli's matrix $\boldsymbol{\sigma}$,
giving for the angular momentum $\mathbf{s} = \kappa\boldsymbol{\xi}$ ($\kappa = h/4\pi$) the usual commutation
relation $\mathbf{s}\times\mathbf{s} = -h\mathbf{s}/(2\pi i)$.
It should be mentioned that the characteristic values of the matrices
$\xi_x$, $\xi_y$, $\xi_z$ are equal to $\pm 1$ (each value occurring twice), while those of
$\xi_x^2$, $\xi_y^2$, $\xi_z^2$ are equal to 1. The characteristic value of $s^2$ thus turns out
to be equal to $\tfrac{3}{4}(h/2\pi)^2$, as before.
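The relations (284) and (284 a) can be confirmed by direct multiplication. The sketch below is an illustration only: it assumes the standard Pauli matrices for the blocks of $\boldsymbol{\xi}$ (the sign convention of (275 b) may differ without affecting the result), and its helper names are hypothetical.

```python
from fractions import Fraction

# Standard Pauli matrices (an assumption; the book's sign convention may differ).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def mul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Multiply every element of a matrix by the number c."""
    return [[c * x for x in row] for row in a]


def sub(a, b):
    """Elementwise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


XI = [block(s, Z2, Z2, s) for s in (SX, SY, SZ)]
UNIT4 = block(I2, Z2, Z2, I2)


def check_284():
    """True if xi_a xi_b = -xi_b xi_a = i xi_c (cyclic), xi x xi = 2i xi, xi_a^2 = 1."""
    ok = True
    for n in range(3):
        a, b, c = XI[n], XI[(n + 1) % 3], XI[(n + 2) % 3]
        ok = ok and mul(a, b) == scale(1j, c)
        ok = ok and mul(b, a) == scale(-1j, c)
        ok = ok and sub(mul(a, b), mul(b, a)) == scale(2j, c)  # component of xi x xi
        ok = ok and mul(a, a) == UNIT4
    return ok


def spin_squared():
    """s^2 in units of (h/2 pi)^2: three unit squares times kappa^2, kappa = 1/2."""
    return 3 * Fraction(1, 2) ** 2
```

The value returned by `spin_squared()` is the factor $\tfrac{3}{4}$ quoted above.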
It can easily be verified that the matrix $\gamma_0$ commutes with $\boldsymbol{\xi}$. Since,
further, its square is equal to 1, the preceding relations will hold for the
matrix $\gamma_0\boldsymbol{\xi}$ just as well as for $\boldsymbol{\xi}$. The necessity of interpreting the latter
and not the former as the spin angular momentum can be inferred in
an unambiguous way from the fact that the sum of $\mathbf{s} = h\boldsymbol{\xi}/4\pi$ and of the
orbital angular momentum
$$\mathbf{L} = \mathbf{r}\times\mathbf{u}, \tag{285}$$
that is, the vector
$$\mathbf{M} = \mathbf{L} + \mathbf{s},$$
satisfies the equation of motion
$$\frac{d\mathbf{M}}{dt} = \mathbf{r}\times\mathbf{F},$$
where $\mathbf{F} = e(\mathbf{E} + \boldsymbol{\gamma}\times\mathbf{H})$ is the force acting on the electron [cf. (280 b)],
and can accordingly be defined as the total angular momentum, while
the vector $\mathbf{L} + \gamma_0\mathbf{s}$ does not satisfy this equation.
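We have in fact,
$$\frac{d\mathbf{L}}{dt} = \frac{d\mathbf{r}}{dt}\times\mathbf{u} + \mathbf{r}\times\frac{d\mathbf{u}}{dt},$$
that is
$$\frac{d\mathbf{L}}{dt} = c\boldsymbol{\gamma}\times\mathbf{u} + \mathbf{r}\times\mathbf{F}, \tag{285 a}$$
according to (280 b) and (280 c).
Replacing $\mathbf{L}$ by $\mathbf{s}$, we get, on the other hand,
$$\frac{d\mathbf{s}}{dt} = [\epsilon, \mathbf{s}] = \kappa c\,[\boldsymbol{\gamma}\cdot\mathbf{u} + \gamma_0m_0c,\ \boldsymbol{\xi}],$$
or, putting $\boldsymbol{\gamma} = \rho\boldsymbol{\xi}$, since $\boldsymbol{\xi}$ commutes both with $\rho$ and with $\gamma_0$,
$$\frac{d\mathbf{s}}{dt} = \kappa c\rho\,[(\boldsymbol{\xi}\cdot\mathbf{u}), \boldsymbol{\xi}].$$
Taking the $z$-component of this vector, we have
$$\frac{ds_z}{dt} = \kappa c\rho\,(u_x[\xi_x, \xi_z] + u_y[\xi_y, \xi_z]) = c\rho\,(u_x\xi_y - u_y\xi_x) = c\rho\,(\mathbf{u}\times\boldsymbol{\xi})_z = c\,(\mathbf{u}\times\boldsymbol{\gamma})_z$$
according to (284), that is,
$$\frac{d\mathbf{s}}{dt} = -c\boldsymbol{\gamma}\times\mathbf{u}. \tag{285 b}$$
Adding (285 a) and (285 b), we get the equation
$$\frac{d}{dt}(\mathbf{L} + \mathbf{s}) = \frac{d\mathbf{M}}{dt} = \mathbf{r}\times\mathbf{F}, \tag{285 c}$$
which coincides with the classical equation for the total angular
momentum. In the case of a spherically symmetrical electric field and in
the absence of a magnetic field the product $\mathbf{r}\times\mathbf{F}$ vanishes, so that the
vector $\mathbf{M}$ is a constant of the motion.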
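Taking the square of $\mathbf{M}$, we get the expression
$$M^2 = L^2 + 2\mathbf{L}\cdot\mathbf{s} + s^2 \tag{286}$$
which is also a constant of the motion. Now since $s^2 = \tfrac{3}{4}(h/2\pi)^2$
is itself a constant, we get
$$\frac{d}{dt}(L^2 + 2\mathbf{L}\cdot\mathbf{s}) = 0. \tag{286 a}$$
The two terms in the brackets taken separately are not constant; as
has been shown, however, by Dirac, we obtain a new constant of the
motion, characteristic of the relation between $\mathbf{L}$ and $\mathbf{s}$, if we consider
the product $\gamma_0\,\mathbf{L}\cdot\mathbf{s} = \mathbf{L}\cdot\mathbf{s}\,\gamma_0$. Taking the time derivative of this product,
we get
$$\frac{d}{dt}(\mathbf{L}\cdot\mathbf{s}\gamma_0) = \frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 + \mathbf{L}\cdot\frac{d}{dt}(\mathbf{s}\gamma_0).$$
Replacing $d\mathbf{L}/dt$ by the expression $c\boldsymbol{\gamma}\times\mathbf{p}$, according to (285 a) (using
$\mathbf{u} = \mathbf{p}$ and $\mathbf{r}\times\mathbf{F} = 0$) we get
$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s} = c(\boldsymbol{\gamma}\times\mathbf{p})\cdot\mathbf{s} = c\kappa\rho\,(\boldsymbol{\xi}\times\mathbf{p})\cdot\boldsymbol{\xi} = -c\kappa\rho\,(\boldsymbol{\xi}\times\boldsymbol{\xi})\cdot\mathbf{p} = -2ic\kappa\rho\,\boldsymbol{\xi}\cdot\mathbf{p} = -2ic\kappa\,(\boldsymbol{\gamma}\cdot\mathbf{p}),$$
and consequently
$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 = -2ic\kappa\,(\boldsymbol{\gamma}\cdot\mathbf{p})\gamma_0 = -c\kappa\frac{h}{2\pi}\,[(\boldsymbol{\gamma}\cdot\mathbf{p}), \gamma_0]$$
since $\boldsymbol{\gamma}$ anticommutes with $\gamma_0$, or
$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 = -\frac{h^2}{8\pi^2}\frac{d\gamma_0}{dt}.$$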
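We have further
$$\frac{d}{dt}(\mathbf{s}\gamma_0) = -c(\boldsymbol{\gamma}\times\mathbf{p})\gamma_0 + ic\,\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\gamma})\gamma_0 = [-(\boldsymbol{\gamma}\times\mathbf{p}) + i\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\gamma})]\,c\gamma_0.$$
Now
$$\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\gamma}) = \rho\,\boldsymbol{\xi}(\mathbf{p}\cdot\boldsymbol{\xi}) = \rho\,(\mathbf{p} + i\mathbf{p}\times\boldsymbol{\xi}) = \rho\mathbf{p} + i\mathbf{p}\times\boldsymbol{\gamma},$$
so that
$$\frac{d}{dt}(\mathbf{s}\gamma_0) = i\rho\mathbf{p}\,c\gamma_0, \qquad \mathbf{L}\cdot\frac{d}{dt}(\mathbf{s}\gamma_0) = 0,$$
since $\mathbf{L}\cdot\mathbf{p} = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{p} = 0$. We thus get
$$\frac{d}{dt}(\mathbf{L}\cdot\mathbf{s}\gamma_0) = -\frac{h^2}{8\pi^2}\frac{d\gamma_0}{dt},$$
that is,
$$\Big(\mathbf{L}\cdot\mathbf{s} + \frac{h^2}{8\pi^2}\Big)\gamma_0 = \text{const.} = \frac{h^2}{8\pi^2}\,k, \tag{287}$$
where $k$ is an ordinary number, replacing the angular quantum number
of the old theory; the fact that it can assume integral values only will
be shown later on.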
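Taking into account the identity
$$(\mathbf{L}\cdot\boldsymbol{\xi})^2 = L^2 + i(\mathbf{L}\times\mathbf{L})\cdot\boldsymbol{\xi} = L^2 - \frac{h}{2\pi}\,\mathbf{L}\cdot\boldsymbol{\xi} = L^2 - 2\mathbf{L}\cdot\mathbf{s}$$
and rewriting (287) in the form
$$\mathbf{L}\cdot\boldsymbol{\xi} = \frac{h}{2\pi}(\gamma_0k - 1) \qquad (\gamma_0^2 = 1),$$
we get further
$$L^2 - 2\mathbf{L}\cdot\mathbf{s} = \Big(\frac{h}{2\pi}\Big)^2(\gamma_0k - 1)^2,$$
that is,
$$L^2 = \Big(\frac{h}{2\pi}\Big)^2\gamma_0k(k\gamma_0 - 1) = \Big(\frac{h}{2\pi}\Big)^2k(k - \gamma_0), \tag{287 a}$$
and
$$L^2 + 2\mathbf{L}\cdot\mathbf{s} = \Big(\frac{h}{2\pi}\Big)^2(k^2 - 1) = \text{const.}, \tag{287 b}$$
in agreement with (286 a). Adding to both sides of this equation the
term $s^2 = \tfrac{3}{4}(h/2\pi)^2$, we obtain finally
$$M^2 = \Big(\frac{h}{2\pi}\Big)^2\big(k^2 - \tfrac{1}{4}\big). \tag{287 c}$$
The latter expression is usually written in the form
$$M^2 = \Big(\frac{h}{2\pi}\Big)^2j(j + 1),$$
where $j = |k| - \tfrac{1}{2}$ is the so-called 'inner' or 'total' angular quantum
number.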
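The arithmetic connecting (287 a), (287 b), and (287 c) can be followed in exact rational numbers. The sketch below is an illustration only: it replaces the matrix $\gamma_0$ by one of its characteristic values $\pm 1$ (which is legitimate on the corresponding component pair), measures everything in units of $(h/2\pi)^2$, and uses a hypothetical function name.

```python
from fractions import Fraction


def angular_constants(k, gamma0):
    """Return (L^2, 2 L.s, M^2, j) in units of (h/2 pi)^2 for integral k != 0
    and a characteristic value gamma0 = +1 or -1 of the matrix gamma_0."""
    if k == 0 or gamma0 not in (1, -1):
        raise ValueError("k must be non-zero and gamma0 must be +1 or -1")
    l_squared = k * (k - gamma0)              # (287 a)
    two_l_dot_s = k * gamma0 - 1              # from (287): 2 L.s = (h/2 pi) L.xi
    s_squared = Fraction(3, 4)
    m_squared = l_squared + two_l_dot_s + s_squared   # (286)
    j = abs(k) - Fraction(1, 2)
    return l_squared, two_l_dot_s, m_squared, j
```

For every admissible pair the sum of the first two values is $k^2 - 1$, and the third equals both $k^2 - \tfrac{1}{4}$ and $j(j+1)$, although $L^2$ itself differs for the two values of $\gamma_0$.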
An angular quantum number of the same character as that which
in the Schrödinger theory specifies $\mathbf{L}$ according to the formula
$L^2 = (h/2\pi)^2l(l + 1)$ does not exist in Dirac's theory, since $L^2$ is not
a constant of the motion, as shown by the formula (287 a). It should
be noticed that the number $k$ can assume both positive and negative
values (which can be interpreted as corresponding respectively to the
same or to the opposite orientation of the orbit and spin axis), the value
$k = 0$ being obviously excluded [as seen from (287 c)].
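The preceding results, which are strictly valid for the motion in a
spherically symmetrical electric field, remain approximately valid in
the presence of a weak homogeneous magnetic field. Such a field $\mathfrak{H}$,
which can be derived from the vector potential $\mathbf{A} = \tfrac{1}{2}\mathfrak{H}\times\mathbf{r}$, corresponds
to the additional term $S_m = -(e/c)\mathbf{A}\cdot c\boldsymbol{\gamma} = -\tfrac{1}{2}e(\mathfrak{H}\times\mathbf{r})\cdot\boldsymbol{\gamma}$, that is
$$S_m = -\frac{e}{2c}\,\mathfrak{H}\cdot(\mathbf{r}\times c\boldsymbol{\gamma}) \tag{288}$$
in the energy operator $\epsilon$. This additional term can be identified with
the ordinary expression for the magnetic energy if the vector
$$\boldsymbol{\mu} = \frac{e}{2c}\,\mathbf{r}\times c\boldsymbol{\gamma} = \tfrac{1}{2}e\,\mathbf{r}\times\boldsymbol{\gamma} \tag{288 a}$$
is defined as the total magnetic moment of the electron.
We have in this case, according to (285 a) with $\mathbf{F} = e\boldsymbol{\gamma}\times\mathfrak{H}$,
$$\frac{d\mathbf{M}}{dt} = e\,\mathbf{r}\times(\boldsymbol{\gamma}\times\mathfrak{H}).$$
With the help of the equation
$$\frac{d}{dt}[\mathbf{r}\times(\mathbf{r}\times\mathfrak{H})] = \frac{d\mathbf{r}}{dt}\times(\mathbf{r}\times\mathfrak{H}) + \mathbf{r}\times\Big(\frac{d\mathbf{r}}{dt}\times\mathfrak{H}\Big)$$
we get, neglecting the left-hand term (since its time-average value
vanishes),
$$\boldsymbol{\gamma}\times(\mathbf{r}\times\mathfrak{H}) + \mathbf{r}\times(\boldsymbol{\gamma}\times\mathfrak{H}) = 0,$$
whence, using the vector identity
$$\mathbf{r}\times(\boldsymbol{\gamma}\times\mathfrak{H}) + \boldsymbol{\gamma}\times(\mathfrak{H}\times\mathbf{r}) + \mathfrak{H}\times(\mathbf{r}\times\boldsymbol{\gamma}) = 0,$$
$$\frac{d\mathbf{M}}{dt} = \tfrac{1}{2}e\,(\mathbf{r}\times\boldsymbol{\gamma})\times\mathfrak{H} = \boldsymbol{\mu}\times\mathfrak{H} \tag{288 b}$$
in agreement with the classical theory.
Taking the scalar product of both sides with $\mathfrak{H}$, we get
$$\frac{d}{dt}(\mathbf{M}\cdot\mathfrak{H}) = 0, \tag{288 c}$$
which means that the projection of the angular momentum in the
direction of the magnetic field remains constant.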
The formula (288 a) corresponds to the classical formula $\boldsymbol{\mu} = \tfrac{1}{2}e\,\mathbf{r}\times\mathbf{v}/c$
for the orbital magnetic moment due to the electron's translational
motion alone, without any spin. According to the considerations
developed before in connexion with the spin magnetic moment $\mu\gamma_0\boldsymbol{\xi}$ one
might expect that the total magnetic moment would be expressed as
the sum
$$\frac{e}{2m_0c}\,\gamma_0\,\mathbf{r}\times\mathbf{u} + \mu\gamma_0\boldsymbol{\xi} = \frac{e}{2m_0c}\,\gamma_0(\mathbf{L} + 2\mathbf{s}).$$
This expression is, however, not exactly equivalent to the expression
(288 a).
In order to transform the operator $\boldsymbol{\mu}$ to an equivalent form of
the above type, we shall consider its probable or average value
$\bar{\boldsymbol{\mu}} = \int\psi^\dagger\boldsymbol{\mu}\psi\,dV$, which can obviously be written in the form
$$\bar{\boldsymbol{\mu}} = \frac{e}{2c}\int\mathbf{r}\times\mathbf{j}\,dV,$$
where $\mathbf{j} = c\,\psi^\dagger\boldsymbol{\gamma}\psi$ is the probability current density. Using the expression
(282 b) for $e\mathbf{j}/c$, we get
[X= r——R f ^ y 0r xutff dV + i f rxcurlSWdF + - f r x~ t yd V.
2cm0 J J cJ dt
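Now the first integral is equal to the probable value of $\gamma_0\mathbf{L}$. With the
help of the vector identity
$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times\operatorname{curl}\mathbf{B} + \mathbf{B}\times\operatorname{curl}\mathbf{A}$$
we get further
$$\mathbf{r}\times\operatorname{curl}\mathfrak{M} = \nabla(\mathbf{r}\cdot\mathfrak{M}) - (\mathfrak{M}\cdot\nabla)\mathbf{r} - (\mathbf{r}\cdot\nabla)\mathfrak{M} = \nabla(\mathfrak{M}\cdot\mathbf{r}) - \mathfrak{M} - r\frac{\partial}{\partial r}\mathfrak{M},$$
since
$$\operatorname{curl}\mathbf{r} = 0, \qquad (\mathfrak{M}\cdot\nabla)\mathbf{r} = \mathfrak{M}, \qquad \mathbf{r}\cdot\nabla = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z} = r\frac{\partial}{\partial r}.$$
In the latter expression $\partial/\partial r$ denotes a partial differentiation with regard
to the distance from the origin of a polar coordinate system, the two
angular coordinates being kept constant. Writing the volume element
$dV$ in the form $r^2\,dr\,d\omega$, where $d\omega$ denotes the element of solid angle,
we have
$$\int r\frac{\partial}{\partial r}\mathfrak{M}\,dV = \int d\omega\int_0^\infty r^3\frac{\partial\mathfrak{M}}{\partial r}\,dr = \int d\omega\int_0^\infty\frac{\partial}{\partial r}(r^3\mathfrak{M})\,dr - 3\int d\omega\int_0^\infty\mathfrak{M}\,r^2\,dr = -3\int\mathfrak{M}\,dV.$$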
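The integration by parts used in the last step can be checked numerically on a sample radial profile. The sketch below is an illustration only: the profile $e^{-r}$, the finite upper limit, and the function names are assumptions chosen for convenience; any profile falling off sufficiently fast at infinity would serve.

```python
import math


def simpson(f, a, b, n=6000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def radial_check(upper=60.0):
    """Compare int r^3 f'(r) dr with -3 int r^2 f(r) dr for f(r) = exp(-r)."""
    profile = lambda r: math.exp(-r)
    derivative = lambda r: -math.exp(-r)
    lhs = simpson(lambda r: r ** 3 * derivative(r), 0.0, upper)
    rhs = -3 * simpson(lambda r: r ** 2 * profile(r), 0.0, upper)
    return lhs, rhs
```

Both integrals agree, the integrated term $r^3 f$ vanishing at both limits.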
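Consequently, the gradient term giving no contribution to the volume
integral,
$$\tfrac{1}{2}\int\mathbf{r}\times\operatorname{curl}\mathfrak{M}\,dV = \int\mathfrak{M}\,dV = \mu\,\overline{\gamma_0\boldsymbol{\xi}} = \frac{e}{m_0c}\,\overline{\gamma_0\mathbf{s}}.$$
We thus see that so far as its probable value is concerned the operator
$\boldsymbol{\mu}$ is equivalent, at least in the case of a stationary state when the
expression $\int\mathbf{r}\times\frac{\partial\mathfrak{P}}{\partial t}\,dV$
vanishes, to the following one:
$$\boldsymbol{\mu}_{\text{eff}} = \frac{e}{2m_0c}\,\gamma_0(\mathbf{L} + 2\mathbf{s}).$$
This 'effective' magnetic moment can be replaced approximately by
the expression
$$\frac{e}{2m_0c}(\mathbf{L} + 2\mathbf{s}),$$
not involving the factor $\gamma_0$, which accounts for the variation of the
mass with the velocity, and whose probable value differs by quantities
of the second order in $1/c$ from 1.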
The fact that the expression (288 a) does not contain explicitly the
spin contribution to the magnetic moment shows very clearly that
the 'spin-motion' has no real existence as something independent of the
translational motion, but is actually a certain aspect of it. This
circumstance can be regarded as a consequence of the fact that in Dirac's
theory there is no direct relation between the vector $\mathbf{u} = \mathbf{p} - e\mathbf{A}/c$
representing the proper momentum of the electron ($m\mathbf{v}$) and the vector
$c\boldsymbol{\gamma}$ representing its velocity. These two vectors cannot be treated
accordingly as parallel to each other. In fact, the lack of parallelism, as
measured by the vector product $c\boldsymbol{\gamma}\times\mathbf{u}$, can be considered according to
equations (285 a) and (285 b) as the cause of the change of the orbital
and spin components of the angular momentum in the absence of
a magnetic field.
The fact that the electron's spin is not an independent kinematic
property but merely an aspect of the translational motion (resulting
from the divorce between the velocity and momentum) is indicated
also by the relation (277 b) between the matrices $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$ representing
respectively the translational and the 'spin' velocity. If the
proportionality coefficient $\rho$ were an ordinary number, then the relation $\boldsymbol{\gamma} = \rho\boldsymbol{\xi}$
would imply that the two vectors represented by $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$ were parallel
to each other. Since, however, $\rho$ is a matrix, such a parallelism does
not necessarily exist, as may be seen from the calculation of the
probable values of $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$.
It should be mentioned further that the characteristic values of the
matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ are the same as those of $\xi_x$, $\xi_y$, $\xi_z$, that is, $+1$ and $-1$
(each of them occurring twice). This means that the characteristic
values of the components of the electron's velocity as defined by the
vector $c\boldsymbol{\gamma}$ are equal either to $+c$ or $-c$. We have here the same type
of duplicity as in the case of the electron's spin. For the components
of the momentum as represented by the vector $\mathbf{u}$ we get a continuous
spectrum extending from $-\infty$ to $+\infty$, as in the classical theory. The
same would refer to the velocity if the latter were defined not by the
vector $c\boldsymbol{\gamma}$ but by the vector $\gamma_0\mathbf{u}/m_0$, corresponding to the classical
relation between velocity and momentum. Such a definition is, however,
inconsistent with the relations $dx/dt = c\gamma_x$, etc., derived above.
It has been shown by V. Fock that, in spite of this, the two definitions
of the velocity become identical in the limiting case when the quantum
theory reduces to the classical one (for instance, in the case of a motion
with very large energy).
The relationship between the translational and spin motion can be
interpreted according to Bohr as a particular case of Heisenberg's
uncertainty relation, resulting from the consideration of the magnetic
force experienced or produced by a moving electrified particle without
any actual spin.
The magnetic field produced by such a particle (electron) at a distance
$r$ is given by the well-known Biot-Savart formula:
S = -e ™ r .
c r3
Now the exact determination of $\mathfrak{H}$ according to this formula requires
the simultaneous knowledge both of the position, i.e. the radius vector
$\mathbf{r}$ of the electron (drawn from the point $P$ for which $\mathfrak{H}$ is to be
determined) and its velocity $\mathbf{v}$. This is, however, impossible, since it is only
possible to measure both quantities at the same time with a limited
accuracy, so that the products $\Delta x\,\Delta v_x$, etc., are at least of the order of
magnitude of $h/m_0$. This implies an inaccuracy
$$\Delta\mathfrak{H} \cong \frac{eh}{cm_0r^3} \cong \frac{\mu}{r^3} \qquad \Big(\mu = \frac{eh}{4\pi m_0c}\Big)$$
in the determination of $\mathfrak{H}$, which can be interpreted as an additional
magnetic field (of unknown direction) due to a particle with a magnetic
moment $\mu$. The superposition of the magnetic field produced by the
electron's spin on that due to its translational motion thus secures the
validity of the uncertainty relation between position and velocity, so
far as they can be determined from the electron's magnetic action.
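A similar result is obtained if we consider the force $\mathbf{F} = e\,\mathbf{v}\times\mathfrak{H}/c$
experienced by an electron in a given external magnetic field. The
inaccuracy $\Delta\mathbf{r}$ in the electron's location leads to an inaccuracy
$\Delta\mathfrak{H} = (\Delta\mathbf{r}\cdot\nabla)\mathfrak{H}$ in the estimation of the field strength $\mathfrak{H}$. Replacing $\mathbf{v}$
in the preceding formula by the corresponding inaccuracy $\Delta\mathbf{v}$, we get
$$|\Delta\mathbf{F}| = \frac{e}{c}\,\big|\Delta\mathbf{v}\times(\Delta\mathbf{r}\cdot\nabla)\mathfrak{H}\big| \cong \frac{eh}{cm_0}\,|\nabla\mathfrak{H}| \cong \mu\,|\nabla\mathfrak{H}|,$$
which agrees, with regard to the order of magnitude, with the force
acting on a magnet with moment $\boldsymbol{\mu}$ in an inhomogeneous field [$(\boldsymbol{\mu}\cdot\nabla)\mathfrak{H}$].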
33. The Motion of an Electron in a Central Field of Force; Fine
Structure and Zeeman Effect
We shall now turn to the more detailed discussion of the problem of
the motion of an electron in a spherically symmetrical field of force
according to Dirac's theory.
The function quadruplet $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$ corresponding to a definite
energy-level $\epsilon = \epsilon'$ can be determined in a general way from the equation
$(\epsilon - \epsilon')\psi = 0$. In the case under consideration it is, however, more
advantageous to start not with the energy but with the angular constants
of the motion and specify the functions $\psi$ so as to make them
the characteristic functions of the corresponding operators.
The most suitable operators for this purpose are $M_z$, the projection
of the angular momentum operator on one of the coordinate axes, and
the operator $M^2$; the operator $L^2$, although it is not an exact constant
of the motion, can also serve for the determination of $\psi$.
Putting
$$(M_z - M_z')\psi = 0, \tag{289}$$
we get, from $M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi} + \frac{h}{4\pi}\xi_z$ and the definition (275 b) of $\xi_z$,
the following system of four ordinary equations:
$$\frac{1}{i}\frac{\partial\psi_1}{\partial\phi} - \tfrac{1}{2}\psi_1 = c'\psi_1, \qquad \frac{1}{i}\frac{\partial\psi_2}{\partial\phi} + \tfrac{1}{2}\psi_2 = c'\psi_2,$$
$$\frac{1}{i}\frac{\partial\psi_3}{\partial\phi} - \tfrac{1}{2}\psi_3 = c'\psi_3, \qquad \frac{1}{i}\frac{\partial\psi_4}{\partial\phi} + \tfrac{1}{2}\psi_4 = c'\psi_4,$$
where $c' = 2\pi M_z'/h$ is a constant. An immediate consequence of these
equations is that the dependence of the functions $\psi_3$, $\psi_4$ on the longitude
$\phi$ is the same as that of the functions $\psi_1$, $\psi_2$. This dependence is
obviously given by the formulae
$$\psi_1 = A_1e^{i(m+1)\phi}, \quad \psi_2 = A_2e^{im\phi}, \quad \psi_3 = B_1e^{i(m+1)\phi}, \quad \psi_4 = B_2e^{im\phi}, \tag{289 a}$$
where $A$ and $B$ are functions of the co-latitude $\theta$, with $c' = m + \tfrac{1}{2}$,
that is,
$$M_z' = \frac{h}{2\pi}\big(m + \tfrac{1}{2}\big), \tag{289 b}$$
$m$ denoting an arbitrary integral number.
The determination of the functions $A$, $B$ can be carried out in the
simplest way by applying to $\psi$ the operator $L^2$. This gives, according
to the relation (287 a),
$$L^2\psi_{1,2} = \Big(\frac{h}{2\pi}\Big)^2k(k - 1)\,\psi_{1,2}, \qquad L^2\psi_{3,4} = \Big(\frac{h}{2\pi}\Big)^2k(k + 1)\,\psi_{3,4}, \tag{290}$$
since
$$\gamma_0 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$
Equations (290) show that the functions $\psi$, so far as their dependence
on the polar angles $\theta$, $\phi$ is concerned, are spherical harmonics, just as
in Schrödinger's theory. (It will be remembered that $L^2 = -(h/2\pi)^2\Omega^2$,
where $\Omega^2$ is the Laplacian operator on the sphere, and that the equation
$\Omega^2\psi + l(l + 1)\psi = 0$ is satisfied by spherical harmonic functions of the
order $l \geq 0$.) They show, moreover, that the function pairs $\psi_1$, $\psi_2$ and
$\psi_3$, $\psi_4$ are spherical harmonics of different orders, and that the number
$k$ which determines these orders can have integral values only. We
must distinguish two cases, namely, $k > 0$ and $k < 0$. In the former
case we get, putting $k = l + 1$, with regard to (289 a):
$$\begin{aligned}
\psi_1 &= a_1F\,Y_{l,m+1}(\theta,\phi) = a_1F\,P_{l,m+1}(\theta)\,e^{i(m+1)\phi} \\
\psi_2 &= a_2F\,Y_{l,m}(\theta,\phi) = a_2F\,P_{l,m}(\theta)\,e^{im\phi} \\
\psi_3 &= a_3G\,Y_{l+1,m+1}(\theta,\phi) = a_3G\,P_{l+1,m+1}(\theta)\,e^{i(m+1)\phi} \\
\psi_4 &= a_4G\,Y_{l+1,m}(\theta,\phi) = a_4G\,P_{l+1,m}(\theta)\,e^{im\phi}
\end{aligned} \tag{290 a}$$
where $F$ and $G$ are two unknown functions of the distance $r$ alone,
while $a_1$, $a_2$, $a_3$, $a_4$ are certain numerical coefficients. $P_{lm}(\theta)$ denotes the
associated spherical harmonic function $\sin^{|m|}\theta\,P_l^{(|m|)}(\cos\theta)$.
In the case $k < 0$ we shall put $l = -k = |k|$, which gives
$$\begin{aligned}
\psi_1 &= b_1F\,Y_{l,m+1} = b_1F\,P_{l,m+1}(\theta)\,e^{i(m+1)\phi} \\
\psi_2 &= b_2F\,Y_{l,m} = b_2F\,P_{l,m}(\theta)\,e^{im\phi} \\
\psi_3 &= b_3G\,Y_{l-1,m+1} = b_3G\,P_{l-1,m+1}(\theta)\,e^{i(m+1)\phi} \\
\psi_4 &= b_4G\,Y_{l-1,m} = b_4G\,P_{l-1,m}(\theta)\,e^{im\phi}
\end{aligned} \tag{290 b}$$
where $b_1$, $b_2$, $b_3$, $b_4$ are another set of coefficients.
The number
$$l = k - 1 \ (k > 0) \quad\text{or}\quad l = -k \ (k < 0),$$
i.e. the order of the spherical harmonic functions appearing in the
principal pair $\psi_1$, $\psi_2$ is called the angular quantum number of the state
in question. The two states specified by the functions (290 a) and
(290 b) can be distinguished by their inner quantum number $j$ which
is equal to $l + \tfrac{1}{2}$ in the first case and to $l - \tfrac{1}{2}$ in the second (i.e. in both
cases to the arithmetic mean of the orders of the spherical harmonics
in $\psi_1$, $\psi_2$ and $\psi_3$, $\psi_4$). The two states belonging to the same $j$ and to
different values of $l$ are specified by functions of the type
$\psi_1, \psi_2 \sim Y_{j+\frac{1}{2}}$, $\psi_3, \psi_4 \sim Y_{j-\frac{1}{2}}$ and $\psi_1, \psi_2 \sim Y_{j-\frac{1}{2}}$, $\psi_3, \psi_4 \sim Y_{j+\frac{1}{2}}$
respectively.
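The correspondence between the number $k$, the angular quantum number $l$, the inner quantum number $j$, and the orders of the two pairs of spherical harmonics can be tabulated mechanically. The following sketch is an illustration only; the function and key names are hypothetical.

```python
from fractions import Fraction


def classify(k):
    """Map a non-zero integer k to l, j and the harmonic orders of the two pairs.

    k > 0: l = k - 1, pair (psi_3, psi_4) of order l + 1   [case (290 a)]
    k < 0: l = -k,    pair (psi_3, psi_4) of order l - 1   [case (290 b)]
    """
    if not isinstance(k, int) or k == 0:
        raise ValueError("k must be a non-zero integer")
    if k > 0:
        l = k - 1
        second_order = l + 1
    else:
        l = -k
        second_order = l - 1
    j = abs(k) - Fraction(1, 2)
    return {"l": l, "j": j, "first_order": l, "second_order": second_order}
```

For example, `classify(1)` and `classify(-1)` both give $j = \tfrac{1}{2}$, with $l = 0$ and $l = 1$ respectively, the orders of the two pairs being interchanged.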
The ratio between the coefficients $a_1$, $a_2$ on the one hand and $a_3$, $a_4$
on the other can be determined from the equation
$$(M^2 - M'^2)\psi = 0 \qquad \Big[M'^2 = \Big(\frac{h}{2\pi}\Big)^2\big(k^2 - \tfrac{1}{4}\big)\Big], \tag{291}$$
which can serve for the complete determination of the angular factors
in the quadruplet $\psi$ (inasmuch as the direction of the privileged axis
remains unsettled). It is somewhat simpler, however, to combine for
that purpose the equations (290) and (287). Putting $\mathbf{L} = h\boldsymbol{\Lambda}/2\pi$, we
can rewrite the latter in the form
$$(\boldsymbol{\Lambda}\cdot\boldsymbol{\xi})\psi = (k\gamma_0 - 1)\psi, \tag{291 a}$$
which is equivalent to the system of equations
$$\begin{aligned}
(\Lambda_x + i\Lambda_y)\psi_2 - \Lambda_z\psi_1 &= (k - 1)\psi_1 \\
(\Lambda_x - i\Lambda_y)\psi_1 + \Lambda_z\psi_2 &= (k - 1)\psi_2 \\
(\Lambda_x + i\Lambda_y)\psi_4 - \Lambda_z\psi_3 &= -(k + 1)\psi_3 \\
(\Lambda_x - i\Lambda_y)\psi_3 + \Lambda_z\psi_4 &= -(k + 1)\psi_4
\end{aligned} \tag{292}$$
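We have here
$$\Lambda_x = \frac{1}{i}\Big(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\Big), \qquad \Lambda_y = \frac{1}{i}\Big(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\Big), \qquad \Lambda_z = \frac{1}{i}\Big(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\Big),$$
or in polar coordinates $r$, $\theta$, $\phi$,
$$\Lambda_x + i\Lambda_y = e^{i\phi}\Big(\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\Big), \qquad \Lambda_x - i\Lambda_y = e^{-i\phi}\Big(-\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\Big), \qquad \Lambda_z = \frac{1}{i}\frac{\partial}{\partial\phi}.$$
The first two expressions can be obtained as follows. We shall put for
the sake of brevity $\partial/\partial x = \partial_x$, etc. We shall further introduce the complex
variable $w = x + iy$ and the corresponding derivative $\partial_w = \partial_x + i\partial_y$.
We get then
$$\Lambda_x + i\Lambda_y = z\partial_w - w\partial_z.$$
On the other hand, we have
$$\partial_\theta = \frac{\partial x}{\partial\theta}\partial_x + \frac{\partial y}{\partial\theta}\partial_y + \frac{\partial z}{\partial\theta}\partial_z = \cot\theta\,(x\partial_x + y\partial_y) - \tan\theta\,z\partial_z,$$
and
$$x\partial_x + y\partial_y = w^*\partial_w - i\partial_\phi = w\partial_w^* + i\partial_\phi,$$
whence
$$\partial_w = \frac{1}{w^*}(x\partial_x + y\partial_y - \Lambda_z) = \frac{\tan\theta}{w^*}\big[(\partial_\theta + \tan\theta\,z\partial_z) - \cot\theta\,\Lambda_z\big],$$
and consequently
$$\Lambda_x + i\Lambda_y = \frac{z\tan\theta}{w^*}\big[(\partial_\theta + \tan\theta\,z\partial_z) + i\cot\theta\,\partial_\phi\big] - w\partial_z.$$
Since $\dfrac{z}{|w|}\tan\theta = 1$, we find finally
$$\Lambda_x + i\Lambda_y = e^{i\phi}(\partial_\theta + i\cot\theta\,\partial_\phi).$$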
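We thus get in the case of the functions (290 a) ($k = l + 1$)
$$\begin{aligned}
a_2\Big(\frac{d}{d\theta} - m\cot\theta\Big)P_{l,m} &= a_1(l + m + 1)\,P_{l,m+1} \\
a_1\Big(\frac{d}{d\theta} + (m + 1)\cot\theta\Big)P_{l,m+1} &= -a_2(l - m)\,P_{l,m} \\
a_4\Big(\frac{d}{d\theta} - m\cot\theta\Big)P_{l+1,m} &= -a_3(l - m + 1)\,P_{l+1,m+1} \\
a_3\Big(\frac{d}{d\theta} + (m + 1)\cot\theta\Big)P_{l+1,m+1} &= a_4(l + m + 2)\,P_{l+1,m}
\end{aligned} \tag{292 a}$$
These equations can be used not only for the definition of the ratios
$a_1 : a_2$ and $a_3 : a_4$ but also for the determination of the 'associated'
spherical harmonics $P_{lm}$ etc. (supposed to be normalized in the same
way for all values of $l$ and $m$). Eliminating $P_{l,m+1}$ between the first
two equations (292 a), we find, for instance,
$$\frac{d^2P_{lm}}{d\theta^2} + \cot\theta\frac{dP_{lm}}{d\theta} + \Big[l(l + 1) - \frac{m^2}{\sin^2\theta}\Big]P_{lm} = 0,$$
which is the standard equation for the functions $P_{lm}$.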
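That a function of the form $\sin^{|m|}\theta\,P_l^{(|m|)}(\cos\theta)$ satisfies this equation can be checked numerically for a simple case. The sketch below is an illustration only: it takes $l = 2$, $m = 1$, for which the unnormalized function is $3\sin\theta\cos\theta$, approximates the derivatives by central differences, and uses hypothetical function names.

```python
import math


def p21(theta):
    """Unnormalized P_{2,1}(theta) = sin(theta) * d/dx[(3x^2 - 1)/2] at x = cos(theta)."""
    return 3.0 * math.sin(theta) * math.cos(theta)


def legendre_residual(p, l, m, theta, h=1e-3):
    """Left-hand side P'' + cot(theta) P' + [l(l+1) - m^2/sin^2(theta)] P,
    with the derivatives replaced by central finite differences of step h."""
    d1 = (p(theta + h) - p(theta - h)) / (2.0 * h)
    d2 = (p(theta + h) - 2.0 * p(theta) + p(theta - h)) / (h * h)
    bracket = l * (l + 1) - m * m / math.sin(theta) ** 2
    return d2 + d1 / math.tan(theta) + bracket * p(theta)
```

The residual vanishes to within the finite-difference error at every co-latitude away from the poles.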
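In the case (290 b) we get with $k = -l$ a similar set of equations,
namely,
$$\begin{aligned}
b_2\Big(\frac{d}{d\theta} - m\cot\theta\Big)P_{l,m} &= -b_1(l - m)\,P_{l,m+1} \\
b_1\Big(\frac{d}{d\theta} + (m + 1)\cot\theta\Big)P_{l,m+1} &= b_2(l + m + 1)\,P_{l,m} \\
b_4\Big(\frac{d}{d\theta} - m\cot\theta\Big)P_{l-1,m} &= b_3(l + m)\,P_{l-1,m+1} \\
b_3\Big(\frac{d}{d\theta} + (m + 1)\cot\theta\Big)P_{l-1,m+1} &= -b_4(l - m - 1)\,P_{l-1,m}
\end{aligned} \tag{292 b}$$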
We shall not write down the explicit expressions for the coefficients
$a$, $b$ (which depend upon the way the functions $P$ are normalized), and
shall now turn to the investigation of the radial factors $F$, $G$ and the
associated question of the characteristic values of the energy $\epsilon$.
The functions $F$ and $G$ can be investigated by transforming the
equation $(\epsilon - \epsilon')\psi = 0$ to polar coordinates and getting rid of the angular
factors in $\psi$ with the help of the preceding expressions.
To carry out this transformation we multiply the term $\boldsymbol{\gamma}\cdot\mathbf{u}$ in $\epsilon$ by
the square of the 'radial projection' of the vector $\boldsymbol{\gamma}$:
$$\gamma_r = \frac{1}{r}(\boldsymbol{\gamma}\cdot\mathbf{r}). \tag{293}$$
§ 33 MOTION OF ELECTRON IN CENTRAL FIELD OF FORCE 335
Taking into account the general relation,
(Y-A)(v B) = (S-A)($.B) = A * B + t (A x B )i
wc get y\ = 1 and, further,
y T u = I (r u-HL-5),
r
whence y u = yj y-u = ^ (r-u + iL -?). (293 a)
T
Now for a spherically symmetrical electric field we have u = p, and
consequently
i h 1/ a . d , a \ h d
- r u = — , - \ x ----\-y— \-z —I = ——.—.
r 27n r \ dx &y dz) 27n dr
Wc thus get, with the help of the equation L-£ = h(ky0—1)/27r,
£ = y^ i ~ ky^ l ) + m 6 c 2 yo + u ’
so that Dirac’s equation reduces to the form
where e0 = m0c2.
Since the operator-matrix yr commutes with 8/dr and 1/r, and anti-
commutes with y0,
( I + kvy l )yr -P + 2J - ('» ro -< '+ U)4> = 0. (294)
By the definition of the matrices yxi yy, yz [cf. (273 b)], we have
(Yr'l')i = z fa + i y ^ - v p s ] , (yr>p)t = l[(x-iy)< li9+ ali4],
r t
(Yr'f’)z = l[
r (*+ iyWz—# 1]. (yr’P)* = r-[(* —»y)</>i+zV,2],
or, putting 0, = <f>v <pt = <f>t, ifit = *l( h = X*. and
1. .
°t =; -(« r ),
(yr^)l = Kx)l> (yr^)« = (arX)*> (Vr^a = (Pt$)1. (Vr0)^ =
The equation (294) is thus equivalent to the following two:
(294 a)
( k + ' rr ) i°-* , - 2I , , ' + ,- - v> x ~ 0 ,
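The latter equation can be multiplied by $\sigma_r$, giving, since $\sigma_r$ commutes
with $\partial/\partial r$ and since its square is equal to 1, just as for $\gamma_r$,
$$\left(\frac{\partial}{\partial r} + \frac{1-k}{r}\right)\phi - \frac{2\pi i}{hc}(\epsilon + \epsilon_0 - U)(\sigma_r\chi) = 0. \qquad (294\,\text{b})$$
The equations (294 a) and (294 b) serve for the determination of the
functions $\phi$ and $\sigma_r\chi$. It should be remembered that each of these
functions represents a pair of ordinary functions. We thus see that the two
functions of each pair have the same radial factor, in agreement with
our previous results. Putting
$$\phi = F(r), \qquad \sigma_r\chi = iG(r),$$
we obtain the following system:
$$\left(\frac{d}{dr} + \frac{1-k}{r}\right)F + \frac{2\pi}{hc}(\epsilon + \epsilon_0 - U)G = 0,$$
$$\left(\frac{d}{dr} + \frac{1+k}{r}\right)G + \frac{2\pi}{hc}(\epsilon_0 - \epsilon + U)F = 0. \qquad (295)$$
Using the identity
$$\left(\frac{d}{dr} + \frac{1}{r}\right)F = \frac{1}{r}\frac{d}{dr}(rF),$$
we have
$$\frac{df}{dr} - \frac{k}{r}f + \frac{2\pi}{hc}(\epsilon + \epsilon_0 - U)g = 0,$$
$$\frac{dg}{dr} + \frac{k}{r}g + \frac{2\pi}{hc}(\epsilon_0 - \epsilon + U)f = 0, \qquad (296)$$
where $g = rG$, $f = rF$.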
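We shall solve these equations for the particular case of the hydrogen-like
atom, i.e. an electron moving in a Coulomb field with a potential
energy $U = -Ze^2/r$. We shall assume that $\epsilon < \epsilon_0$, which corresponds
to a bound electron ($H' < 0$) and leads to a discrete set of energy-levels.
Putting, for the sake of brevity,
$$\frac{2\pi}{hc}(\epsilon_0 + \epsilon) = \alpha^2, \qquad \frac{2\pi}{hc}(\epsilon_0 - \epsilon) = \beta^2, \qquad \frac{2\pi Ze^2}{hc} = \gamma,$$
we get for this case
$$\frac{df}{dr} - \frac{k}{r}f + \left(\alpha^2 + \frac{\gamma}{r}\right)g = 0, \qquad \frac{dg}{dr} + \frac{k}{r}g + \left(\beta^2 - \frac{\gamma}{r}\right)f = 0. \qquad (296\,\text{a})$$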
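For large values of $r$ these equations reduce to
$$\frac{df}{dr} + \alpha^2g = 0, \qquad \frac{dg}{dr} + \beta^2f = 0,$$
giving the following asymptotic solution:
$$f = Ae^{-\alpha\beta r}, \quad g = Be^{-\alpha\beta r}, \qquad A\beta = B\alpha, \qquad (296\,\text{b})$$
where $A$ and $B$ are considered as constants.
To get the exact solution we replace them by polynomials
$$A = A_0r^{\mu} + A_1r^{\mu+1} + \ldots + A_sr^{\mu+s},$$
$$B = B_0r^{\mu} + B_1r^{\mu+1} + \ldots + B_sr^{\mu+s},$$
obtaining the following relations between the coefficients:
$$A_n(\mu + n - k) + \gamma B_n = \alpha\beta A_{n-1} - \alpha^2B_{n-1},$$
$$B_n(\mu + n + k) - \gamma A_n = \alpha\beta B_{n-1} - \beta^2A_{n-1}. \qquad (297)$$
Multiplying the first of these equations by $\beta$ and the second by $\alpha$ and
adding the results, we get
$$A_n[\beta(\mu + n - k) - \alpha\gamma] + B_n[\alpha(\mu + n + k) + \beta\gamma] = 0. \qquad (297\,\text{a})$$
The 'boundary conditions' $A_n = B_n = 0$ for $n = -1$ and $n = s + 1$
applied to (297) give
$$A_0(\mu - k) + \gamma B_0 = 0, \qquad B_0(\mu + k) - \gamma A_0 = 0;$$
$$\beta A_s = \alpha B_s.$$
Eliminating $A_0$ and $B_0$ between the first two equations, we get
$$\mu = +\sqrt{k^2 - \gamma^2}. \qquad (297\,\text{b})$$
The ratio
$$\frac{A_0}{B_0} = \frac{\mu + k}{\gamma} = \frac{\gamma}{k - \sqrt{k^2 - \gamma^2}},$$
which follows from the preceding equation, is identical with that which
is obtained from (297 a) for $n = 0$. With $n = s$ we get, on the other
hand,
$$A_s[\beta(\mu + s - k) - \alpha\gamma] + B_s[\alpha(\mu + s + k) + \beta\gamma] = 0,$$
which becomes identical with $\beta A_s = \alpha B_s$ on using the condition
$$2\alpha\beta(\mu + s) = (\alpha^2 - \beta^2)\gamma. \qquad (297\,\text{c})$$
With the above definitions of $\alpha$, $\beta$ we get
$$\sqrt{\epsilon_0^2 - \epsilon^2}\,(\mu + s) = \epsilon\gamma,$$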
that is, from (297 b),
$$\epsilon = \frac{\epsilon_0}{\sqrt{1 + \dfrac{\gamma^2}{\left[s + \sqrt{k^2 - \gamma^2}\right]^2}}}. \qquad (298)$$
This is exactly Sommerfeld's formula (264) (with $\gamma Z$ replaced by $\gamma$).†
The angular quantum number $k$ has the same meaning in both cases,
so far as the value of the energy is concerned. It must be remembered,
however, that in the previous theory it was supposed to be essentially
positive, whereas in Dirac's theory it can assume both positive and
negative values (zero excluded). With $k > 0$ we get $l = k - 1$ and
$j = l + \tfrac12 = k - \tfrac12$, i.e. a solution of the type (290 a); while in the case
$k < 0$ we obtain a solution of the type (290 b) with $l = |k|$ and
$j = l - \tfrac12 = |k| - \tfrac12$.
It should be emphasized that the two solutions are characterized not
only by different angular factors, but also, as is plainly seen from (297),
by different radial factors $F = f/r$ and $G = g/r$; their similarity is
restricted to the value of the energy and of the $z$-component of the
angular momentum $M_z$.
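The statement that opposite values of k share the energy but not the radial factors can be checked numerically. The following is only a minimal sketch, not part of the original derivation: it assumes units in which 2π/hc = 1 and ε0 = 1 (so that α² = 1 + ε and β² = 1 - ε), evaluates the energy from Sommerfeld's formula, runs the recursion (297) starting from the ratio A0/B0 = (μ + k)/γ, and returns the quantity βA_s - αB_s, which should vanish if the series really stops at n = s. The function names are illustrative.

```python
import math


def sommerfeld_energy(s, k, gamma):
    """Return eps/eps0 from Sommerfeld's formula for radial number s and angular number k."""
    mu = math.sqrt(k * k - gamma * gamma)
    return 1.0 / math.sqrt(1.0 + (gamma / (s + mu)) ** 2)


def radial_coefficients(s, k, gamma):
    """Run the recursion (297) in assumed units 2*pi/(h*c) = 1 and eps0 = 1.

    Returns (A, B, alpha, beta); A[n] and B[n] multiply r**(mu + n).
    Intended for s >= 1, or for s = 0 with k > 0.
    """
    mu = math.sqrt(k * k - gamma * gamma)
    eps = sommerfeld_energy(s, k, gamma)
    alpha = math.sqrt(1.0 + eps)  # alpha**2 = eps0 + eps in these units
    beta = math.sqrt(1.0 - eps)   # beta**2  = eps0 - eps in these units
    # n = 0: A0 (mu - k) + gamma B0 = 0 and B0 (mu + k) - gamma A0 = 0.
    a = [(mu + k) / gamma]
    b = [1.0]
    for n in range(1, s + 1):
        r1 = alpha * beta * a[n - 1] - alpha ** 2 * b[n - 1]
        r2 = alpha * beta * b[n - 1] - beta ** 2 * a[n - 1]
        # Solve (mu+n-k) A_n + gamma B_n = r1 and -gamma A_n + (mu+n+k) B_n = r2.
        det = (mu + n - k) * (mu + n + k) + gamma ** 2
        a.append((r1 * (mu + n + k) - gamma * r2) / det)
        b.append(((mu + n - k) * r2 + gamma * r1) / det)
    return a, b, alpha, beta


def termination_residual(s, k, gamma):
    """beta*A_s - alpha*B_s, which must vanish for the series to stop at n = s."""
    a, b, alpha, beta = radial_coefficients(s, k, gamma)
    return beta * a[-1] - alpha * b[-1]


if __name__ == "__main__":
    gamma = 0.5  # unrealistically large, chosen only to make the effect visible
    for k in (1, -1):
        a, b, _, _ = radial_coefficients(1, k, gamma)
        print(k, sommerfeld_energy(1, k, gamma), a, b, termination_residual(1, k, gamma))
```

For s = 1 the two signs of k give the same energy and a vanishing residual, but different coefficient lists. With s = 0 the residual vanishes only for k > 0, so in this sketch the state without radial nodes exists only for positive k.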
† If instead of Dirac's equation we used the relativity second-order equation $D\psi = 0$,
not involving the spin, we should have obtained in the present case a solution of the same type
$$\psi = F(r)\,Y_{lm}(\theta, \phi)$$
as in Schrödinger's theory, with
$$rF = f = e^{-\alpha\beta r}\sum_{n=0}^{s}b_nr^{\mu+n}$$
and
$$\epsilon = \epsilon_0\left[1 + \frac{\gamma^2}{\left[s + \tfrac12 + \sqrt{(l + \tfrac12)^2 - \gamma^2}\right]^2}\right]^{-1/2},$$
corresponding to half-integral values of the radial and angular quantum numbers ($s + \tfrac12$
instead of $s$, and $l + \tfrac12$ instead of $k$). This result is, however, contradicted by the
experimental data, which are in agreement with Sommerfeld's formula.

The coincidence of the energy-levels corresponding to opposite values
of $k$ is a characteristic feature of the motion in a purely Coulomb field
of force. If the motion of the electron takes place in a field even
moderately deviating from the latter, due, for instance, to the variable
shielding action of the inner electrons in an alkali atom, the energies
of the states $+k$ and $-k$ become different and we obtain what is called
a 'screening doublet'. The two levels of such a doublet state belong
to two different values of the Schrödinger angular number $l$, namely,
$l = |k| - 1$ and $l = |k|$, and to the same value of the inner quantum
number $j = |k| - \tfrac12$. It should be mentioned that in the case of small
values of $j$ the separation between the two energy-levels in alkali atoms
or ions of a similar structure is so large that they are no longer
considered as forming a doublet and are referred to different series. This
notion can, however, be conveniently applied to X-ray absorption levels.
The two levels corresponding to the same value of the Schrödinger
angular quantum number $l$ and to consecutive values of the inner
quantum number $j = l - \tfrac12$ ($k = -l$) and $j = l + \tfrac12$ ($k = l + 1$) are said
to form a 'relativity doublet'. According to Sommerfeld's formula (298)
they correspond to consecutive values of the old angular quantum
number $|k|$ ($= l$, $l + 1$). Since in the Bohr-Sommerfeld theory this
number determined the eccentricity of the elliptical orbits, the relativity
doublets were associated with orbits of different eccentricity. From
the point of view of the present theory, the relativity doublets should
be associated rather with orbits of the same size and eccentricity but
with opposite orientations of the spin. Such relativity or 'spin'-doublets
are extremely narrow in hydrogen or ionized helium, but they become
very broad in X-ray spectra, their width increasing roughly as the
fourth power of the effective nuclear charge [according to the approximate
formula (264 c)]. They are rather broad, too, in the spectra of
alkali atoms and other complicated systems with one external electron.
In this case, however, they are due not to a large effective nuclear
charge, but to a rapid variation of the latter, owing to the decrease of
the shielding effect of the inner electrons when the outer electron
approaches the nucleus. Sommerfeld's formula is, of course, inapplicable
to this case, which is characterized by a large $\Delta l$-separation
('screening effect') and a relatively small $\Delta j$-separation ('spin' or
relativity effect).
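The fourth-power growth of the relativity doublet can be read off directly from Sommerfeld's formula. The sketch below is illustrative only: it assumes γ = Z/137.036 (a rounded value of the fine-structure constant), labels the levels by the usual principal number n = s + |k|, and compares the doublet j = l ± ½ for several values of Z. The helper names are hypothetical, and screening is ignored altogether.

```python
import math

ALPHA_FS = 1 / 137.036  # fine-structure constant 2*pi*e^2/(h*c), rounded


def level(s, k, z):
    """eps/eps0 from Sommerfeld's formula for nuclear charge number z."""
    gamma = z * ALPHA_FS
    mu = math.sqrt(k * k - gamma * gamma)
    return 1.0 / math.sqrt(1.0 + (gamma / (s + mu)) ** 2)


def relativity_doublet(l, n, z):
    """Separation, in units of eps0, between j = l + 1/2 (k = l + 1) and
    j = l - 1/2 (k = -l) for the same n = s + |k|; requires 1 <= l <= n - 1."""
    upper = level(n - (l + 1), l + 1, z)
    lower = level(n - l, -l, z)
    return upper - lower


if __name__ == "__main__":
    d1 = relativity_doublet(1, 2, 1)
    for z in (1, 2, 10, 30):
        d = relativity_doublet(1, 2, z)
        # Last two columns: actual ratio to hydrogen, and the Z**4 estimate.
        print(z, d, d / d1, z ** 4)
```

For small Z the separation of the n = 2, l = 1 doublet comes out close to γ⁴/32 in units of ε0, so that doubling Z multiplies it by very nearly 16; for larger Z the ratio drifts above Z⁴.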
To a given value of $k$ (i.e. of $l$ and $j$) there corresponds a degenerate
set of states specified by different values of the axial quantum number $m$
or of the number $m_j = m + \tfrac12$ which determines the $z$-component of the
total angular momentum. This degeneracy is of exactly the same
type as that discussed before in connexion with Schrödinger's theory;
it can be pictured as due to the possibility of $2j + 1 = 2|k|$ quantized
orientations of the angular momentum vector with regard to the $z$-axis,
corresponding to all half-integral values of $m + \tfrac12$ between $+j$ and $-j$.
We have in fact in the case $k > 0$ a set of function-quadruplets $\psi$ with
the following angular factors: $Y_{k-1,m}$, $Y_{k-1,m+1}$, $Y_{k,m}$, $Y_{k,m+1}$. The
maximum or minimum admissible value of $m$ is that for which one function
at least of each pair is different from zero. We thus get $m \le k - 1$ and
$m \ge -k$, i.e.
$$-k + \tfrac12 \le m + \tfrac12 \le k - \tfrac12.$$
A similar relation with $k$ replaced by $|k|$ is obtained in the case $k < 0$.
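The counting of sub-states can be made explicit with a few lines of exact arithmetic. This is a small sketch using the rule stated above (l = k - 1, j = k - ½ for k > 0; l = |k|, j = |k| - ½ for k < 0); the function names are illustrative.

```python
from fractions import Fraction


def quantum_numbers(k):
    """Return (l, j) for a non-zero integer k."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    half = Fraction(1, 2)
    if k > 0:
        return k - 1, k - half
    return -k, -k - half


def m_j_values(k):
    """All half-integral values of m + 1/2 between -j and +j."""
    _, j = quantum_numbers(k)
    count = int(2 * j + 1)
    return [-j + i for i in range(count)]


if __name__ == "__main__":
    for k in (1, -1, 2, -2):
        print(k, quantum_numbers(k), [str(v) for v in m_j_values(k)])
```

In every case the list has 2|k| entries, in agreement with 2j + 1 = 2|k|.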
Thus, for example, in the particular case $k = 1$, $l = 0$ and $j = \tfrac12$,
which corresponds to the normal state of the hydrogen atom ($n = 1$;
it should be mentioned that the case $k = -1$, i.e. $l = 1$, corresponds
to an excited state $n \ge 2$) we actually obtain two sub-states specified
by the following expressions for the functions $\psi_1, \ldots, \psi_4$:
$$\psi_\alpha = R\,Y_\alpha \qquad (\alpha = 1, 2, 3, 4),$$
with the radial factor
B(r) =
and the angular factors
%y
y l = 0, y2= i, y3 = - sm o c11
l+ V (l -y * )
iy
in the case m = 0 , i.e. m,- = + J , and
n -o, n -
vsin Be-i*
4 i+ V (i—r 2)
in the case $m = -1$, i.e. $m_j = -\tfrac12$. The two states correspond to the
same value of the inner quantum number $j$, namely, $j = \tfrac12$. They
are associated with the same spherically symmetrical distribution of
the probability density, which is proportional to the square of the
radial factor $R(r)$. It should be noticed that this factor becomes
infinite at $r = 0$, but in such a way that the integral $\int_0^\infty R^2r^2\,dr$ remains
convergent.
The difference between the two states consists in the fact that for
the first of them the spin axis of the electron is pointing in the positive
and for the second in the negative direction of the $z$-axis, as follows
from the approximate equation for the characteristic values
$$\sigma_z\psi = \sigma_z'\psi$$
with $\psi_3 = \psi_4 = 0$.
We must consider in conclusion the modification of the states, and
in particular of the energy-levels, of a hydrogen-like or an alkali-like atom
in the presence of a homogeneous magnetic field $\mathfrak{H}$ (Zeeman effect).
In the former case we have to deal with a twofold ($k$, $-k$) degeneracy,
corresponding to the absence of any screening effect. This degeneracy is
to be taken into account for very weak magnetic fields only, so weak
that the product $\mu\mathfrak{H}$ is very small compared with the relativistic ($\Delta j$)
separation. In the latter case, on the contrary, the relativistic splitting
is as a rule much smaller than the screening ($\pm k$) separation, so that
for fields of moderate strength the only degeneracy present is that
which corresponds to different values of the axial quantum number $m$.
It can easily be shown that the characteristic functions $\psi$ corresponding
to this privileged character of the $z$-axis in the absence of
a magnetic field are such that the non-diagonal matrix elements of the
magnetic perturbation energy
$$S = \tfrac12e\mathfrak{H}(x\gamma_y - y\gamma_x) \qquad (299)$$
all vanish. So long as the magnetic field is sufficiently weak the additional
energy due to its action can be determined accordingly as the
diagonal elements of $S$ with regard to the corresponding unperturbed
states.
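The additional magnetic energy of a state specified by the quantum
numbers $k$, $m$ is thus given by the formula
$$\Delta\epsilon'_{km} = S_{km;km} = \int\psi^*_{km}S\psi_{km}\,dV. \qquad (299\,\text{a})$$
Dropping for the sake of simplicity the indices $k$, $m$, we have, according
to (299),
$$(S\psi)_1 = \tfrac12e\mathfrak{H}\,i(x+iy)\psi_4, \qquad (S\psi)_2 = -\tfrac12e\mathfrak{H}\,i(x-iy)\psi_3,$$
$$(S\psi)_3 = \tfrac12e\mathfrak{H}\,i(x+iy)\psi_2, \qquad (S\psi)_4 = -\tfrac12e\mathfrak{H}\,i(x-iy)\psi_1,$$
and consequently
$$\psi^*S\psi = \tfrac12e\mathfrak{H}\,i\,[(x+iy)\psi_1^*\psi_4 - (x-iy)\psi_2^*\psi_3 + (x+iy)\psi_3^*\psi_2 - (x-iy)\psi_4^*\psi_1]$$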
or (299b)
X
Substituting here the expressions for the functions tft derived before
and integrating, we get
Aeim
= 2ne% f dr F {r )Q {r y f i '« a 4 P l+lm P,.m+l+ a*aAP m>ln+1 P (,Jsin 20 dd
(299 c)
in the case of the equations (290 a) and a similar expression in the case
(290 b).
The radial factor in this expression can easily be calculated with the
help of the differential equations (296) which are satisfied by the
functions $rF = f$ and $rG = g$. Taking the first of these equations and
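putting approximately $\epsilon + \epsilon_0 - U \approx 2\epsilon_0$, we get
$$g = -\frac{hc}{4\pi\epsilon_0}\left(\frac{df}{dr} - \frac{k}{r}f\right),$$
whence
$$\int FGr^3\,dr = \int fgr\,dr = \frac{hc}{4\pi\epsilon_0}\left[k\int f^2\,dr - \int rf\frac{df}{dr}\,dr\right],$$
or since $\int rf\dfrac{df}{dr}\,dr = \tfrac12\int r\,d(f^2) = -\tfrac12\int f^2\,dr$,
$$\int_0^\infty FGr^3\,dr = \frac{hc}{4\pi\epsilon_0}(k + \tfrac12)\int f^2\,dr = \frac{h}{4\pi m_0c}(k + \tfrac12)$$
if the function $f(r)$ is appropriately normalized ($\int f^2\,dr = 1$).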
The angular factor in (299 c) can also be evaluated without much
trouble with due regard to the normalizing conditions for the functions
$P(\theta)$.
We obtain in this way (neglecting terms of the second order in 1/c)
A('km - - 7 47T
~— ?7l0C?£>(w + I ) = ?(»»+£)> (30°)
with <7 = - - ^ < * > 0)- {k< 0 ) ' (300a)
in agreement with the results obtained at the end of § 30 (if $m + \tfrac12$ is
identified with $m_j$).
The integration of the expression (299 c) requires a great deal of
calculation. This can be avoided, however, if we replace the operator
M by the operators
” M« - S ^ <L+2S)’
which have been shown in the preceding section to be approximately
equivalent to it and to each other with an accuracy of the second order
in $1/c$. To the same approximation we can replace $\gamma_0$ in the expression
(287 a) by 1, with the result
$$L^2 = \left(\frac{h}{2\pi}\right)^2k(k-1) = \left(\frac{h}{2\pi}\right)^2l(l+1)$$
when $k > 0$. Combining it with the equation $M^2 = \left(\dfrac{h}{2\pi}\right)^2j(j+1)$ and putting
$\mathbf{s} = (g - 1)\mathbf{M}$, we obtain, with the help of (267 a) and (289 b), the above
approximate expression for $\Delta\epsilon'_{km}$.
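The factor g that results from combining L², M² and the spin in this way can be tabulated as a function of k. The sketch below assumes that g is the usual Landé factor of the vector model for spin ½, g = 1 + [j(j+1) + s(s+1) - l(l+1)] / 2j(j+1); this identification is an assumption of the example, since the equations (267 a) and (289 b) are not reproduced here.

```python
from fractions import Fraction


def lande_g(k):
    """Lande factor for spin 1/2, expressed through the Dirac number k (assumed form)."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    half = Fraction(1, 2)
    l = k - 1 if k > 0 else -k
    j = abs(k) - half
    s = half
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))


if __name__ == "__main__":
    for k in (1, -1, 2, -2, 3):
        print(k, lande_g(k), Fraction(2 * k, 2 * k - 1))
```

Under this assumption both signs of k are covered by the single expression g = 2k/(2k - 1), which gives 2 for k = 1 and 2/3 for k = -1.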
The preceding theory is applicable only to a comparatively weak
magnetic field. When the shift of the energy-levels produced by the
magnetic field becomes of the same order of magnitude as the $\Delta j$-doublet
separation, the spin perturbation to which this separation is due must
be taken into account together with the magnetic perturbation.
We must start in this case with the two unperturbed states of equal
energy $\epsilon'_{lm}$, specified by the same values of $l$ and $m$ and belonging to
the values $j = l \pm \tfrac12$ of the inner quantum number. The combined
spin-magnetic perturbation $S = S_{sp} + S_m$ produces a splitting-up of the
unperturbed energy-level into two levels $\epsilon'_{lm} + \Delta\epsilon'_{1,2}$, according to the
equation
$$\begin{vmatrix}S_{11} - \Delta\epsilon' & S_{12}\\ S_{21} & S_{22} - \Delta\epsilon'\end{vmatrix} = 0,$$
where the index 1 refers to one of the two degenerate states ($j = l + \tfrac12$,
say), and the index 2 to the other ($j = l - \tfrac12$).
The non-diagonal elements of the spin perturbation $(S_{sp})_{12}$ and $(S_{sp})_{21}$
must obviously vanish since the states $j = l \pm \tfrac12$ are stationary in the
absence of the magnetic field. The diagonal elements $(S_{sp})_{11} = \Delta_1\epsilon'$,
$(S_{sp})_{22} = \Delta_2\epsilon'$ can be defined therefore as the additional energies due
to the spin perturbation alone, their difference $\delta = \Delta_1\epsilon' - \Delta_2\epsilon'$ being
equal to the $\Delta j$-doublet separation in the absence of the magnetic field.
The action of the latter can thus be determined by the equation
$$\begin{vmatrix}\Delta_1\epsilon' + S_{m11} - \Delta\epsilon' & S_{m12}\\ S_{m21} & \Delta_2\epsilon' + S_{m22} - \Delta\epsilon'\end{vmatrix} = 0, \qquad (301)$$
where
S M11 =
. (301a)
and g —g — c.V{(i + m + 1)(^—m )}
The first two expressions are given by (300); the expressions for $S_{m12}$ and
$S_{m21}$ can be derived in a similar manner [see § 20, equation (155 b)].
It is customary to refer the displaced energy-levels $\epsilon'_1$ and $\epsilon'_2$ to the
'centre of gravity' of the doublet, i.e. to the energy $\epsilon'_0$ determined by
the formulae , , ,n , , , , 10
= €0 + ( l + l )P> €2 = €0 ~ lP
[8 = (2Z+1)/? = e[—c2]. Putting Ajc' = (Z+1)0, A2c' = —Z0, and
€/—€q == Ac', we obtain from (300) the following equation for Ac':
(Ac')2+ [ 0 + M$ ( 2 m + l ) ] A ^ ^ - 0.
Its solution runs
Ac' =
(301 b)
If the magnetic field is very weak, we get, in the first approximation,
Ac' = —P (l+ l)— (w + J),
and Ac' = + p i-y.§
in agreement with (300). In the opposite case of a very strong magnetic
field, so strong that the doublet distance $\delta$ is small in comparison with
the splitting $\mu\mathfrak{H}$ due to the field alone (when $\delta = 0$), the formula
(301 b) reduces to
Ac' = —p S K m + i J i — —/x§(w ± l),
i.e. to the earlier formula (266 b) which determines the normal Zeeman
effect.
34. Negative Energy States; Positive Electrons and Neutrons
We have seen above that in Pauli's theory the two values $\alpha = 1$ and
$\alpha = 2$ of the spin-coordinate refer to the two opposite orientations of
the electron's spin or magnetic axis parallel to the $z$-axis. One might
be inclined to think that the values $\alpha = 1, 2, 3, 4$ of the Dirac theory
refer to four different orientations of the electron. This is, however,
not true. Taking the probable value of the spin angular momentum in
the $z$ direction we get, according to (275 b):
$$\bar{s}_z = \frac{h}{4\pi}\int(-\psi_1^*\psi_1 + \psi_2^*\psi_2 - \psi_3^*\psi_3 + \psi_4^*\psi_4)\,dV,$$
which shows that the values $\alpha = 3$ and $\alpha = 4$ refer to the same orientations
(in the negative and positive direction parallel to $z$) as the values
$\alpha = 1$ and $\alpha = 2$ respectively.
It should be mentioned that we get exactly the opposite result as to
the meaning of $\alpha = 3$ and $\alpha = 4$ if, instead of the angular (mechanical)
momentum, we consider the magnetic moment due to the spin ($\boldsymbol{\mu} = \mu\gamma_0\boldsymbol{\xi}$).
We get, namely, in this case [cf. (283 a)]:
$$\bar{\mu}_z = \int\mu_z\,dV = \mu\int(-\psi_1^*\psi_1 + \psi_2^*\psi_2 + \psi_3^*\psi_3 - \psi_4^*\psi_4)\,dV.$$
This shows that in the states $\alpha = 3, 4$ the electron behaves, so far as
its spin magnetic moment is concerned, as a particle with a positive
charge.
As has been explained already, the quadruplicity of the Dirac theory
is connected with the introduction of states of negative energy $\epsilon$. The
values $\alpha = 3, 4$ for a state of this type have the same physical meaning
as the values $\alpha = 1, 2$ for the corresponding state of positive energy
(the functions $\psi_3$, $\psi_4$ being large compared with $\psi_1$, $\psi_2$ in the former
case and small in the latter). The quadruplicity appearing in the comparison
of Schrödinger's and Dirac's theory can be pictured as the
result of the reflection of a point representing a Schrödinger state in
the plane $\epsilon = 0$ and further as the splitting of the two points into
a Pauli doublet.
To each characteristic value of Schrödinger's energy constant $H'$
there correspond in Dirac's theory four energy values $\epsilon'$ which can be
denoted as follows:
m0c2+ i / ' | , m0c2+ H '± ( > 0),
m0c2+ //'+ , w,0c2- f / / '” (< 0),
the first pair lying close to each other as well as the second pair, the
two pairs having approximately opposite values.
The matrix elements of any physical quantity represented by the
four-dimensional matrix-operator F , as defined by the general formula
=
J
f F h dV = 1
J a -1 ^ 1
fi
dV
can be combined accordingly into four-dimensional matrices:
-1
;/rj
►s
Fj i-
V
F jj-\ir t F ,r : ,r
F n - < ,r ; F jr tH -l F H ’lH - t Fjr U r:
+*221
+1
*3
F ir -ii' i F j i. - h -z
B?
;+
11
I 1
Fr ' z h F ii-ziv t F u -zn i
If the function $F\psi_{\epsilon'}$ is expanded in a series of functions $\psi_{\epsilon''}$, according
to the formula
$$F\psi_{\epsilon'} = \sum_{\epsilon''}F_{\epsilon''\epsilon'}\psi_{\epsilon''},$$
negative energy states must be taken into account as well as the
states of positive energy unless the matrix elements $F_{\epsilon'\epsilon''}$, where $\epsilon' > 0$
and $\epsilon'' < 0$, all vanish. This circumstance is especially important in
various perturbation problems; with F denoting the operator of the
perturbation energy, correct results as to the probability of combined
(double) transitions are obtained only if intermediate states of negative
energy are considered along with those of positive energy. In the
problem of the scattering of light by a free electron, for example, the
relative importance of intermediate states of negative energy is larger
the smaller the (positive) energy of the initial and final state. This
result (due to Tamm) is especially startling because relativity corrections
vanish in the limiting case of small velocities, so that negative energy
states which form a characteristic relativity effect would be expected
to become insignificant in this limiting case.
Another interesting example of the paradoxical role played in Dirac's
theory by the states of negative energy is presented by the motion of
an electron through a potential energy jump, as discussed by O. Klein.
For the sake of simplicity we shall take the equation of the second
order, $D\psi = 0$ ($D = u^2 - u_t^2 + m_0^2c^2$), to which the four equations of the
Dirac theory reduce for free motion. The continuity conditions for the
four functions $\psi_1, \ldots, \psi_4$ can be replaced in this case by the continuity
condition for one of them and its derivative in the direction of the
energy jump. Assuming the latter to take place in the direction of
the $x$-axis, the potential energy being equal to 0 on the left of the
plane $x = 0$ and $U = \text{const.} > 0$ on the right, and assuming further the
electron to move parallel to the $x$-axis, we get
$$\psi = A'e^{i2\pi g_ax/h} + A''e^{-i2\pi g_ax/h}$$
for $x < 0$ (incident and reflected wave), and
$$\psi = B'e^{i2\pi g_bx/h}$$
for $x > 0$ (transmitted wave), where
$$g_a^2 = \epsilon^2/c^2 - m_0^2c^2 \quad\text{and}\quad g_b^2 = (\epsilon - U)^2/c^2 - m_0^2c^2.$$
The continuity conditions give the same relations $A' + A'' = B'$ and
$A' - A'' = B'g_b/g_a$ as in the non-relativity theory [cf. Part I]. The
important difference between the latter and the present theory consists
in the fact that the above relativity expression for $g_b$ remains real
not only in the case when $U$ is smaller than the kinetic energy of
the incident electron $\epsilon - m_0c^2$, but also in the case when it is larger
than $m_0c^2 + \epsilon \approx 2m_0c^2$ (if $\epsilon$ is not very different from $m_0c^2$). This
means that total reflection ($g_b$ imaginary) takes place only within the
range
$$\epsilon - m_0c^2 < U < \epsilon + m_0c^2,$$
whereas beyond it we get transmission both for small and for large
values of $U$.
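The three ranges of U can be separated with a short calculation. The following is a sketch only, in assumed units m0 = c = 1 (so that ε is measured in units of m0c²); it evaluates g_b² for the second-order equation used above and the reflected fraction |A''/A'|² that follows from the two continuity relations. Taking the positive root for g_b in the region of large U is an assumption of this example and not a statement about the correct physical branch.

```python
import cmath
import math


def momenta(eps, u):
    """Return (g_a, g_b**2) in assumed units m0 = c = 1."""
    if eps <= 1.0:
        raise ValueError("the incident electron needs eps > m0*c**2")
    return math.sqrt(eps * eps - 1.0), (eps - u) ** 2 - 1.0


def regime(eps, u):
    """Name the range in which the potential jump u falls."""
    _, gb2 = momenta(eps, u)
    if gb2 < 0.0:
        return "total reflection"
    return "ordinary transmission" if u <= eps - 1.0 else "paradoxical transmission"


def reflected_fraction(eps, u):
    """|A''/A'|**2 from A' + A'' = B' and A' - A'' = B' g_b/g_a.

    The positive (or positive-imaginary) root is taken for g_b; that sign
    choice is an assumption of this sketch.
    """
    ga, gb2 = momenta(eps, u)
    gb = cmath.sqrt(gb2)
    return abs((ga - gb) / (ga + gb)) ** 2


if __name__ == "__main__":
    eps = 1.5  # kinetic energy 0.5 m0 c^2
    for u in (0.2, 1.0, 3.0):
        print(u, regime(eps, u), reflected_fraction(eps, u))
```

With ε = 1.5 the jump U = 0.2 lies in the ordinary range, U = 1.0 gives total reflection, and U = 3.0 exceeds ε + m0c² = 2.5 and falls in the paradoxical range.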
It seems hardly possible to give a reasonable interpretation of this
result. It can be shown, however, that the paradoxical transmission
probability for the case $U > \epsilon + m_0c^2$ rapidly decreases when the
discontinuity $U$ in the potential energy at $x = 0$ is replaced by a gradual
increase within an interval comparable with or larger than the
wave-length of the electron $\lambda = h/g$.
The physical meaning of the states of negative energy is at present
not quite certain. They were initially interpreted by Dirac in connexion
with the duplicity of electricity, and served to reduce protons to a
mere absence of electrons if space is assumed to be nearly saturated with
electrons in states of negative energy, with due regard to Pauli's
exclusion principle. It is, however, impossible to interpret in this way the
difference in the mass of electrons and protons. According to Pauli and
to Weyl the rest-mass of a proton considered as a hole in the distribution
of electrons with negative energies should be exactly equal to the
rest-mass $m_0$ of an electron.
Although Dirac's original theory has thus failed to reduce protons to
electrons, yet it may perhaps be credited with predicting the existence
and properties of things that have hitherto never been anticipated by the
experimental physicist and that seem to reveal themselves in the Wilson
chamber cloud-tracks of particles released by the penetrating rays of
cosmic origin and by very hard gamma rays. These are the 'positive
electrons' whose discovery has recently been announced by Anderson
(1932) and also by Blackett (1933).
The experimental data are still too scarce to make it sure that positive
electrons really exist. But if they do exist they fit beautifully in the
scheme of Dirac's theory. The fact that they are not found under
ordinary conditions is explained by the extremely large probability that
a 'positive electron' will recombine with a negative one (the latter
falling from a state of positive energy into the hole constituting the
former), this recombination being accompanied by the emission of two
photons (cf. Part I, § 19).
The visible existence of the material world around us must be
guaranteed from this point of view by the fact that the total number of
electrons is larger than the number of available states of negative energy,
at least in that part of the world which is accessible to observation.
Assuming the existence of positive electrons, it would be natural to
postulate the existence of 'negative protons' formed by holes in a
practically saturated distribution of protons between states of negative
energy.
It is difficult, however, to accept the idea that space is filled up with
one or two sorts of particles forming a kind of infinitely dense 'ether'
which is revealed in a negative way only through the occasional absence
of the full quota of these particles.
Dirac's equation has served as a starting-point for the introduction,
besides positive electrons, of particles devoid of electrical charge and
denoted accordingly as 'neutrons'. Dirac himself attempted in 1931 to
introduce neutrons as magnetic analogues of electrons, i.e. as particles
possessing a magnetic charge instead of an electric one. Pauli on the
other hand proposed (simultaneously with Dirac) a theory of neutrons
devoid of charge (both electric and magnetic) but possessing a magnetic
moment and a spin angular momentum associated with it. The necessity,
or rather plausibility, of introducing neutrons in addition to protons
and electrons as constituent parts of atomic nuclei was dictated by
certain nuclear phenomena, like the apparent failure of the alternation
principle (Bose-Einstein statistics holding for nuclei supposed to
consist of an odd number of particles) and of the principle of conservation
of energy (continuous β-ray spectra of radioactive substances).
These difficulties could be removed by admitting the existence in the
nuclei of a third sort of elementary particles in a bound state. The idea
of treating these particles as 'magnetic neutrons' was suggested by the
possibility of replacing Dirac's equation for the electron by a similar
equation with $e = 0$ and with the mass $m_0$ increased by an additional
term
L = m(H -5 -E ij)
which represents the action of the magnetic and electric field on the
neutron's magnetic and electric moment ($\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ being the matrices
(275 b) and (275 c), and $\mu$ hypothetically Bohr's magneton). Pauli's
equation for the neutron can thus be written in the usual form
(e+2^ l)^ = 0with
<TP+yo(wtoc*+-&)>
where $\mathbf{p} = \dfrac{h}{2\pi i}\nabla$; the electromagnetic potentials $\mathbf{A}$ and $\phi$ do not appear
in $\mathcal{E}$ since the electric charge with which they must be multiplied is
supposed equal to zero.
We shall not stop here to develop Pauli's theory. The remarkable
fact we are mainly concerned with is that the neutron was discovered
experimentally by Chadwick, following observations by Curie and Joliot,
within a year after its existence had been tentatively admitted on
theoretical grounds. It made its appearance as the disintegration product of
certain nuclei bombarded by protons or α-particles in the form of a
particle with a mass very little different from that of a proton (while
Pauli expected it to have a mass of the same order of magnitude as the
electron). It is still a matter open to question whether a neutron is
a simple particle like an electron and a proton, or a combination of
both.† The latter alternative seems the more natural, although we are
not yet in a state to substantiate it theoretically, for the present
wave-mechanical theory is inadequate in treating such systems, whose linear
dimensions are of the same order of magnitude as the 'size' of the
electron (attributed to it on the electromagnetic theory of mass). As
to the forces binding the electron and proton in a neutron more
tightly than in a hydrogen atom, they may be due to the mutual
attraction of the spin magnetic moments. In fact this attraction (which
corresponds to a suitable orientation of the spins) increases with
decrease of distance much more rapidly than the attraction due to the
electric charges of the two particles, so that the Coulomb attraction
becomes negligibly small (relatively) at distances of the order of
$10^{-14}$ cm. It cannot be asserted, however, that the usual inverse
fourth-power law for the mutual attraction of two elementary magnets is
applicable for distances comparable with the electron's own dimensions.
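The order of magnitude quoted for the distance at which the magnetic attraction overtakes the Coulomb attraction can be checked with a rough estimate. The sketch below works in Gaussian units and rests on several assumptions that are not in the text: two point dipoles aligned along their line of centres (force 6μ1μ2/r⁴), Bohr's magneton for the electron, and one nuclear magneton for the proton. The constants are rounded values, and, as the text itself cautions, the inverse fourth-power law may not hold at such distances.

```python
E_CHARGE = 4.803e-10         # electron charge, esu
BOHR_MAGNETON = 9.274e-21    # erg/gauss
NUCLEAR_MAGNETON = 5.051e-24  # erg/gauss, assumed here for the proton's moment


def force_ratio(r_cm, mu1=BOHR_MAGNETON, mu2=NUCLEAR_MAGNETON):
    """Ratio of the dipole-dipole force to the Coulomb force at distance r_cm (cm)."""
    dipole = 6.0 * mu1 * mu2 / r_cm ** 4   # coaxial, attractively oriented dipoles
    coulomb = E_CHARGE ** 2 / r_cm ** 2
    return dipole / coulomb


if __name__ == "__main__":
    for r in (5.29e-9, 1e-12, 1e-13, 1e-14):
        print(r, force_ratio(r))
```

Under these assumptions the two forces are comparable near 10⁻¹² cm, and at 10⁻¹⁴ cm the magnetic force exceeds the Coulomb force by a factor of roughly 10⁴, whereas at the radius of the first Bohr orbit it is smaller by many orders of magnitude.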
† It might also be surmised that the proton is a complicated particle formed by the
combination of a neutron with a positive electron.

35. The Invariance of the Dirac Equation with regard to Coordinate Transformations
We have hitherto considered the Dirac equation of motion for a particular
frame of reference specified by the coordinates $x$, $y$, $z$ and the time $t$.
We shall now investigate the transformation properties of this equation
for such transformations as correspond to a rotation of the coordinate
system $x$, $y$, $z$ in space, or more generally to a Lorentz transformation of
the coordinates and the time (i.e. to a rotation of the original frame
in a four-dimensional space-time manifold).
We shall first write down the Dirac equation in the form of two
two-dimensional matrix equations
$$\boldsymbol{\sigma}\cdot\mathbf{u}\,\psi + (u_t - m_0c)\chi = 0, \qquad \boldsymbol{\sigma}\cdot\mathbf{u}\,\chi + (u_t + m_0c)\psi = 0 \qquad (302)$$
[cf. (257 a), § 30] and limit ourselves to rotations in ordinary space,
which do not affect the operator $u_t$. The invariance of equations (302)
with regard to such rotations can be achieved in two different ways:
(1) By considering the wave functions (matrices) $\psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}$ and
$\chi = \begin{pmatrix}\chi_1\\ \chi_2\end{pmatrix}$ as invariant and the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ as covariant, i.e.
transforming according to the same law as the coordinates $x$, $y$, $z$. Under
this condition the product $\boldsymbol{\sigma}\cdot\mathbf{u} = \sigma_xu_x + \sigma_yu_y + \sigma_zu_z$ will define a scalar
(invariant) operator.
(2) By considering the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ as invariant numerical
operators, and introducing a suitable transformation for the matrices
$\psi$, $\chi$.
The two methods must, of course, give equivalent results. In the
first case we can define the matrix $\sigma_n$ for any direction $n$ (which may
be that of one of the new coordinate axes) as the projection of the
vector $\boldsymbol{\sigma}$ in this direction. Using the polar angles $\theta_n$, $\phi_n$ to specify it
with respect to the original coordinate system $C(x, y, z)$, we have
$$\sigma_n = \sigma_x\cos(x,n) + \sigma_y\cos(y,n) + \sigma_z\cos(z,n) = \sin\theta_n(\sigma_x\cos\phi_n + \sigma_y\sin\phi_n) + \sigma_z\cos\theta_n,$$
which is equivalent to four equations for the matrix elements $\sigma_{n\alpha\beta}$
($\alpha, \beta = 1, 2$) of $\sigma_n$. With the help of the expressions $\sigma_x = \begin{pmatrix}0&1\\1&0\end{pmatrix}$,
$\sigma_y = \begin{pmatrix}0&i\\-i&0\end{pmatrix}$, $\sigma_z = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$ defining the rectangular components of $\boldsymbol{\sigma}$
in the system $C$, we get
$$\sigma_n = \begin{pmatrix}-\cos\theta_n & \sin\theta_n\,e^{i\phi_n}\\ \sin\theta_n\,e^{-i\phi_n} & \cos\theta_n\end{pmatrix}. \qquad (302\,\text{a})$$
This equation can be applied for the definition of the matrices $\sigma_{x'}$, $\sigma_{y'}$,
$\sigma_{z'}$, which represent the rectangular components of the vector $\boldsymbol{\sigma}$ with
regard to a new coordinate system $C'(x', y', z')$.
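The closed form (302 a) can be checked against the defining projection by direct multiplication of 2 × 2 matrices. The sketch below assumes the sign convention for σ_y and σ_z that (302 a) itself implies (σ_y with +i in the upper right corner and σ_z = diag(-1, 1)); the helper names are illustrative.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]


def sigma_n(theta, phi):
    """Projection of sigma on the direction with polar angles theta, phi."""
    cx = math.sin(theta) * math.cos(phi)
    cy = math.sin(theta) * math.sin(phi)
    cz = math.cos(theta)
    return [[cx * SIGMA_X[i][j] + cy * SIGMA_Y[i][j] + cz * SIGMA_Z[i][j]
             for j in range(2)] for i in range(2)]


def sigma_n_closed(theta, phi):
    """The explicit matrix of (302 a)."""
    st, ct = math.sin(theta), math.cos(theta)
    return [[-ct, st * cmath.exp(1j * phi)], [st * cmath.exp(-1j * phi), ct]]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def max_difference(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    s = sigma_n(0.7, 1.9)
    print(max_difference(s, sigma_n_closed(0.7, 1.9)))
    print(max_difference(matmul(s, s), [[1, 0], [0, 1]]))
```

Both printed differences should be at the level of rounding error: the two forms agree, and the square of σ_n is the unit matrix for any direction.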
We shall not, however, write down the explicit expressions for these
matrices (which can easily be found with the help of the three Eulerian
angles), but shall limit ourselves to presenting the general transformation
equation in the form
$$\sigma'_{n\alpha\beta} = \sum_{m=1}^{3}a_{mn}\sigma_{m\alpha\beta}, \qquad (302\,\text{b})$$
where the indices $(m, n) = 1, 2, 3$ stand for the three axes of the old
and the new system respectively ($\sigma'_1 = \sigma_{x'}$, etc.), while $a_{mn}$ is the matrix
of the orthogonal transformation $C \to C'$:
$$x'_n = \sum_ma_{mn}x_m.$$
It should be emphasized that the indices $m$, $n$ which specify the
coordinate axes or the rectangular components of $\boldsymbol{\sigma}$, have nothing to do
with the indices $\alpha$, $\beta$ which specify the matrix elements of $\boldsymbol{\sigma}$ or of its
rectangular components.
The transition from the first method (of transforming $\sigma_m$) to the
second method (of transforming $\psi_\alpha$ and $\chi_\alpha$) can be carried out in the
following way:
We try to find a unitary two-dimensional matrix $A$ such that the
transformation defined by (302 b) shall be equivalent to the following
one:
$$\sigma'_n = A^{-1}\sigma_nA \qquad (A^{-1} = A^\dagger), \qquad (303)$$
that is,
$$\sigma'_{n\alpha\beta} = \sum_{\gamma=1}^{2}\sum_{\delta=1}^{2}A^{-1}_{\alpha\gamma}\,\sigma_{n\gamma\delta}\,A_{\delta\beta} \qquad (n = 1, 2, 3),$$
involving a component of $\boldsymbol{\sigma}$ along a given new axis and along the
corresponding axis only of the original coordinate system.
The relation between the transformation (302 b) and (303) can be
stated as follows: in the former the matrices $\sigma_m$ (or $\sigma'_n$) appear as
components of a vector in ordinary three-dimensional space, whereas in the
second case they appear as tensors in the two-dimensional spin-space
specified by the Greek indices $\alpha$, $\beta$, etc. The transformation matrices
$a_{mn}$ and $A_{\alpha\beta}$ are both unitary and refer respectively to the ordinary
space and to the state-space.
Let us suppose that we have succeeded in finding $A$ and let us write the scalar product $\boldsymbol{\sigma}\cdot\mathbf{u}$ in the form
$$\sum_m \sigma_m u_m = \sum_n \sigma'_n u'_n = \sum_n A^{-1}\sigma_n A\,u'_n = A^{-1}\Bigl(\sum_n \sigma_n u'_n\Bigr)A.$$
($A$ commutes with $u'_n$ since the latter is a scalar in the state-space.) The transformed equations (302) can be written accordingly in the form
$$A^{-1}\Bigl(\sum_n \sigma_n u'_n\Bigr)A\,\psi + (u_t - m_0 c)\chi = 0,$$
$$A^{-1}\Bigl(\sum_n \sigma_n u'_n\Bigr)A\,\chi + (u_t + m_0 c)\psi = 0.$$
Multiplying them on the left by the matrix $A$, we get
$$\Bigl(\sum_n \sigma_n u'_n\Bigr)\psi' + (u_t - m_0 c)\chi' = 0, \qquad \Bigl(\sum_n \sigma_n u'_n\Bigr)\chi' + (u_t + m_0 c)\psi' = 0, \qquad (303\,\mathrm{a})$$
with the operator-matrix $\sigma_n$ of the same form as in the original coordinate system and with the transformed wave functions
$$\psi' = A\psi, \qquad \chi' = A\chi. \qquad (303\,\mathrm{b})$$
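We shall determine the transformation matrix $A$ for the simple case of a rotation in the $(x, y)$-plane through a given angle $\phi$ (in the direction from $x$ to $y$). This gives
$$x' = x\cos\phi + y\sin\phi, \qquad y' = -x\sin\phi + y\cos\phi,$$
and consequently
$$\sigma'_{x'} = \sigma_x\cos\phi + \sigma_y\sin\phi, \quad \sigma'_{y'} = -\sigma_x\sin\phi + \sigma_y\cos\phi, \quad \sigma'_{z'} = \sigma_z, \qquad (304)$$
that is,
$$\sigma'_{x'} = \begin{pmatrix} 0 & e^{i\phi} \\ e^{-i\phi} & 0 \end{pmatrix}, \quad \sigma'_{y'} = \begin{pmatrix} 0 & i e^{i\phi} \\ -i e^{-i\phi} & 0 \end{pmatrix}, \quad \sigma'_{z'} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Now we must have, irrespective of the index $n$,
$$\sigma_n A = A\sigma'_n,$$
and in particular for $n = 3$, $\sigma_z A = A\sigma_z$, that is, since $\sigma_z$ is a diagonal matrix,
$$(\sigma_z)_{\alpha\alpha} A_{\alpha\beta} = A_{\alpha\beta} (\sigma_z)_{\beta\beta},$$
whence it follows that $A$ must also be a diagonal matrix. Putting
$$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix},$$
we get further, from $\sigma_x A = A\sigma'_{x'}$,
$$\begin{pmatrix} 0 & A_2 \\ A_1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & A_1 e^{i\phi} \\ A_2 e^{-i\phi} & 0 \end{pmatrix},$$
that is, $A_2 = A_1 e^{i\phi}$, $A_1 = A_2 e^{-i\phi}$, or consequently $A_1 = c\,e^{-\frac{1}{2}i\phi}$, $A_2 = c\,e^{+\frac{1}{2}i\phi}$. The same result is obtained from the equation $\sigma_y A = A\sigma'_{y'}$. The constant $c$ is determined by the condition that the determinant of $A$ (a unitary matrix) is equal to 1. We thus get $c = 1$ and finally
$$A = \begin{pmatrix} e^{-\frac{1}{2}i\phi} & 0 \\ 0 & e^{\frac{1}{2}i\phi} \end{pmatrix} = \cos\tfrac{1}{2}\phi + i\sigma_z\sin\tfrac{1}{2}\phi \qquad (304\,\mathrm{a})$$
(the first term being understood to be multiplied by the unit matrix $\delta$), which corresponds to the following transformed expressions for the functions $\psi$ and $\chi$:
$$\psi'_1 = \psi_1 e^{-\frac{1}{2}i\phi}, \quad \psi'_2 = \psi_2 e^{+\frac{1}{2}i\phi}; \qquad \chi'_1 = \chi_1 e^{-\frac{1}{2}i\phi}, \quad \chi'_2 = \chi_2 e^{+\frac{1}{2}i\phi}. \qquad (304\,\mathrm{b})$$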
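The determination of $A$ for a rotation about the $z$-axis can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`matmul`, `combine`, `spin_matrix_z`, `satisfies_304`) are illustrative and not part of the text, and the spin matrices are entered in the sign convention used in this section.

```python
import cmath
import math

# Spin matrices in the sign convention of this section
# (sigma_y and sigma_z have the opposite sign from the usual modern choice).
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree element by element within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))


def spin_matrix_z(phi):
    """The diagonal matrix A = diag(exp(-i phi/2), exp(+i phi/2))."""
    return [[cmath.exp(-0.5j * phi), 0], [0, cmath.exp(0.5j * phi)]]


def satisfies_304(phi):
    """Check sigma_n A = A sigma'_n for all three components."""
    A = spin_matrix_z(phi)
    rotated = [
        combine(math.cos(phi), SX, math.sin(phi), SY),    # sigma'_x'
        combine(-math.sin(phi), SX, math.cos(phi), SY),   # sigma'_y'
        SZ,                                               # sigma'_z'
    ]
    return all(close(matmul(s, A), matmul(A, sp))
               for s, sp in zip((SX, SY, SZ), rotated))
```

Calling `satisfies_304` with any real angle tests the three conditions $\sigma_n A = A\sigma'_n$ at once; the half-angle in the exponents is what makes them hold.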
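For a rotation in the plane $(x, z)$ through the angle $\theta$ (in the direction from $z$ to $x$), i.e. for the transformation
$$\sigma'_{x'} = \sigma_x\cos\theta - \sigma_z\sin\theta, \quad \sigma'_{y'} = \sigma_y, \quad \sigma'_{z'} = \sigma_x\sin\theta + \sigma_z\cos\theta, \qquad (305)$$
or
$$\sigma'_{x'} = \begin{pmatrix} \sin\theta & \cos\theta \\ \cos\theta & -\sin\theta \end{pmatrix}, \quad \sigma'_{y'} = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \quad \sigma'_{z'} = \begin{pmatrix} -\cos\theta & \sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$
we get in a similar way
$$A_{11} = A_{22}, \qquad A_{12} = -A_{21}$$
(from the equation $\sigma_y A = A\sigma'_{y'}$), and further, from $\sigma_x A = A\sigma'_{x'}$ or $\sigma_z A = A\sigma'_{z'}$, together with the condition $|A| = 1$:
$$A = \begin{pmatrix} \cos\tfrac{1}{2}\theta & -\sin\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta & \cos\tfrac{1}{2}\theta \end{pmatrix}, \qquad (305\,\mathrm{a})$$
whence
$$\psi'_1 = \psi_1\cos\tfrac{1}{2}\theta - \psi_2\sin\tfrac{1}{2}\theta, \qquad \psi'_2 = \psi_1\sin\tfrac{1}{2}\theta + \psi_2\cos\tfrac{1}{2}\theta,$$
$$\chi'_1 = \chi_1\cos\tfrac{1}{2}\theta - \chi_2\sin\tfrac{1}{2}\theta, \qquad \chi'_2 = \chi_1\sin\tfrac{1}{2}\theta + \chi_2\cos\tfrac{1}{2}\theta.$$
It should be mentioned that the transformation matrices (304 a) and (305 a) can be written in the form $e^{\frac{1}{2}i\phi\sigma_z}$ and $e^{\frac{1}{2}i\theta\sigma_y}$ respectively. We have in fact, by the definition of the exponential function,
$$e^{i\mu\sigma_n} = \cos\mu + i\sigma_n\sin\mu,$$
since $\sigma_n^2 = \sigma_n^4 = \ldots = \delta\ (= 1)$, $\sigma_n^3 = \sigma_n^5 = \ldots = \sigma_n$. With $\mu = \tfrac{1}{2}\phi$ and $\sigma_n = \sigma_z$ this gives (304 a); with $\mu = \tfrac{1}{2}\theta$ and $\sigma_n = \sigma_y$, it gives (305 a).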
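The closed form of the exponential can be compared with a direct summation of the power series. This is a minimal sketch in plain Python; the function names and the truncation at 40 terms are assumptions made for the illustration, and the spin matrices follow the sign convention of this section.

```python
import math

# Spin matrices in the sign convention of this section.
SX = [[0, 1], [1, 0]]
SY = [[0, 1j], [-1j, 0]]
SZ = [[-1, 0], [0, 1]]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def sigma_along(n):
    """Component of sigma along the unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return add(add(scale(nx, SX), scale(ny, SY)), scale(nz, SZ))


def exp_series(m, terms=40):
    """exp(m) summed term by term from the defining power series."""
    result = IDENTITY
    term = IDENTITY
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, m))   # now equals m**k / k!
        result = add(result, term)
    return result


def closed_form(mu, n):
    """cos(mu) + i sigma_n sin(mu), with cos(mu) multiplying the unit matrix."""
    return add(scale(math.cos(mu), IDENTITY),
               scale(1j * math.sin(mu), sigma_along(n)))


def max_difference(mu, n):
    """Largest element-wise deviation between series and closed form."""
    series = exp_series(scale(1j * mu, sigma_along(n)))
    closed = closed_form(mu, n)
    return max(abs(series[i][j] - closed[i][j])
               for i in range(2) for j in range(2))
```

For a unit vector `n` the deviation returned by `max_difference` should be at the level of rounding error, since the even powers of $\sigma_n$ collapse to the unit matrix and the odd powers to $\sigma_n$ itself.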
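Two successive rotations are obviously equivalent to a single one, specified by a matrix ($a''$ or $A''$) which is equal to the product of the matrices ($a$, $a'$ or $A$, $A'$) specifying the two component rotations. Thus, for example, by combining the two preceding rotations in the order stated, we get a rotation with the transformation matrix (in the state-space):
$$A'' = \begin{pmatrix} \cos\tfrac{1}{2}\theta & -\sin\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta & \cos\tfrac{1}{2}\theta \end{pmatrix} \begin{pmatrix} e^{-\frac{1}{2}i\phi} & 0 \\ 0 & e^{\frac{1}{2}i\phi} \end{pmatrix} = \begin{pmatrix} \cos\tfrac{1}{2}\theta\, e^{-\frac{1}{2}i\phi} & -\sin\tfrac{1}{2}\theta\, e^{\frac{1}{2}i\phi} \\ \sin\tfrac{1}{2}\theta\, e^{-\frac{1}{2}i\phi} & \cos\tfrac{1}{2}\theta\, e^{\frac{1}{2}i\phi} \end{pmatrix},$$
which can be written symbolically in the form
$$e^{\frac{1}{2}i\theta\sigma_y}\, e^{\frac{1}{2}i\phi\sigma_z} = e^{\frac{1}{2}i(\phi\sigma_z + \theta\sigma_y)}$$
with the understanding that the order of the two factors should not be inverted.
This means that to a coordinate transformation defined by the equations
$$x'' = (x\cos\phi + y\sin\phi)\cos\theta - z\sin\theta,$$
$$y'' = -x\sin\phi + y\cos\phi,$$
$$z'' = (x\cos\phi + y\sin\phi)\sin\theta + z\cos\theta$$
there corresponds the following transformation of the functions $\psi$:
$$\psi''_1 = \psi_1\cos\tfrac{1}{2}\theta\, e^{-\frac{1}{2}i\phi} - \psi_2\sin\tfrac{1}{2}\theta\, e^{\frac{1}{2}i\phi}, \qquad \psi''_2 = \psi_1\sin\tfrac{1}{2}\theta\, e^{-\frac{1}{2}i\phi} + \psi_2\cos\tfrac{1}{2}\theta\, e^{\frac{1}{2}i\phi},$$
and a similar transformation of $\chi_1$, $\chi_2$.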
The preceding results are easily generalized for any number of successive rotations about arbitrarily chosen axes. These rotations are always equivalent to a single rotation over an angle $\omega$ about an axis specified by a unit vector $\mathbf{n}$. The transformation matrix $A$ corresponding to such a rotation is easily seen to be
$$A = \cos\tfrac{1}{2}\omega + i\sigma_n\sin\tfrac{1}{2}\omega = e^{\frac{1}{2}i\omega\sigma_n}, \qquad (306)$$
where $\sigma_n = \boldsymbol{\sigma}\cdot\mathbf{n} = n_x\sigma_x + n_y\sigma_y + n_z\sigma_z$ is the component of $\boldsymbol{\sigma}$ along the axis of rotation. The reciprocal matrix
$$A^{-1} = \cos\tfrac{1}{2}\omega - i\sigma_n\sin\tfrac{1}{2}\omega$$
corresponds to a rotation about the same axis in the opposite direction (or to a rotation about the oppositely directed axis $-\mathbf{n}$ through the same angle); it obviously coincides with $A^\dagger$ since $\sigma_n^\dagger = \sigma_n$. Hence it follows that $A$ is a unitary matrix, as was assumed at the beginning.
A two-dimensional unitary matrix can be represented with the help of two complex numbers $\alpha$, $\beta$ satisfying the condition $\alpha\alpha^* + \beta\beta^* = 1$ in the form
$$A = \begin{pmatrix} \alpha^* & \beta \\ -\beta^* & \alpha \end{pmatrix}.$$
In the present case these numbers are
$$\alpha = \cos\tfrac{1}{2}\omega + i n_z\sin\tfrac{1}{2}\omega, \qquad \beta = i(n_x + i n_y)\sin\tfrac{1}{2}\omega.$$
It should be mentioned that the number of real independent parameters which determine the rotation is equal to three (the rotation angle $\omega$ and the two angles $\theta$, $\phi$ which determine the direction of the axis of rotation $\mathbf{n}$, or three of the four real numbers which define $\alpha$ and $\beta$ under the condition $\alpha\alpha^* + \beta\beta^* = 1$).
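As has been shown in § 30, the probability density and the rectangular components of the probability current density are expressed, with the help of the two-dimensional matrices $\psi$, $\chi$, $\sigma$, by the equations
$$\rho = \psi^\dagger\psi + \chi^\dagger\chi, \qquad j_n = c(\psi^\dagger\sigma_n\chi + \chi^\dagger\sigma_n\psi)$$
[$n = 1, 2, 3$; cf. eqs. (259) and (259 a)]. Transforming the functions $\psi$ and $\chi$ according to the equations $\psi' = A\psi$, $\psi'^\dagger = \psi^\dagger A^\dagger$, and regarding the matrices $\sigma_n$ as invariant, we obtain for the same quantities referring to the rotated system the expressions
$$\rho' = \psi^\dagger A^\dagger A\psi + \chi^\dagger A^\dagger A\chi = \psi^\dagger\psi + \chi^\dagger\chi = \rho$$
(since $A^\dagger = A^{-1}$), and
$$j'_n = c[\psi^\dagger(A^\dagger\sigma_n A)\chi + \chi^\dagger(A^\dagger\sigma_n A)\psi] = c(\psi^\dagger\sigma'_n\chi + \chi^\dagger\sigma'_n\psi) = \sum_m a_{mn} j_m,$$
in agreement with the invariant character of $\rho$ and the covariant character of the components of the vector $\mathbf{j}$.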
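The invariance of $\rho$ and the vector character of $\mathbf{j}$ can be illustrated numerically for a rotation about an arbitrary axis. The sketch below is in plain Python and rests on two assumptions not stated in the text: the orthogonal matrix is extracted through the trace formula $a_{mn} = \tfrac{1}{2}\,\mathrm{tr}(\sigma_m A^\dagger\sigma_n A)$, and $c$ is set equal to 1. All function names are illustrative.

```python
import math

# Spin matrices in the sign convention of this section.
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, 1j], [-1j, 0]],    # sigma_y
    [[-1, 0], [0, 1]],      # sigma_z
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rotation_spin_matrix(omega, n):
    """A = cos(omega/2) + i sigma_n sin(omega/2) for a unit vector n."""
    c, s = math.cos(omega / 2), math.sin(omega / 2)
    return [[c * (i == j) + 1j * s * sum(n[k] * SIGMA[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]


def induced_rotation(A):
    """a[m][n] = (1/2) tr(sigma_m A^dagger sigma_n A), an assumed extraction rule."""
    Ad = dagger(A)
    a = [[0.0] * 3 for _ in range(3)]
    for n_index in range(3):
        rotated = matmul(Ad, matmul(SIGMA[n_index], A))   # sigma'_n
        for m in range(3):
            prod = matmul(SIGMA[m], rotated)
            a[m][n_index] = (0.5 * (prod[0][0] + prod[1][1])).real
    return a


def apply(A, spinor):
    return [sum(A[i][k] * spinor[k] for k in range(2)) for i in range(2)]


def density_and_current(psi, chi, c=1.0):
    """rho = psi+ psi + chi+ chi and j_n = c (psi+ sigma_n chi + chi+ sigma_n psi)."""
    rho = sum(abs(z) ** 2 for z in psi + chi)

    def sandwich(left, s, right):
        return sum(left[i].conjugate() * s[i][j] * right[j]
                   for i in range(2) for j in range(2))

    j = [c * (sandwich(psi, s, chi) + sandwich(chi, s, psi)).real for s in SIGMA]
    return rho, j
```

Transforming both two-component functions with the same `A` and comparing `density_and_current` before and after should leave `rho` unchanged, while the current components mix with the coefficients returned by `induced_rotation`.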
The preceding results are easily extended to the four-dimensional matrix form of the Dirac equation and of the associated operators. Taking, for example, the energy operator
$$H = U + \epsilon_0\gamma_0 + c\sum_n \gamma_n u_n,$$
we can consider it as an invariant with regard to rotations in ordinary space if the three four-dimensional matrices $\gamma_1 = \gamma_x$, $\gamma_2 = \gamma_y$, $\gamma_3 = \gamma_z$ are defined as covariant operators, satisfying the same transformation equations as the coordinates $x_1 = x$, $x_2 = y$, $x_3 = z$ or the components of the operator $\mathbf{u}$. The shape of the transformed matrices $\gamma'_n$ is easily obtained from the above expressions for the transformed matrices $\sigma'_n$ with the help of the invariant relations
$$\gamma_n = \begin{pmatrix} 0 & \sigma_n \\ \sigma_n & 0 \end{pmatrix}.$$
The same relations can serve for the determination of the unitary matrices, $L$ say, which determine the equivalent transformation in the four-dimensional spin-space according to the 'tensor' law
$$\gamma'_n = L^{-1}\gamma_n L = L^\dagger\gamma_n L \qquad (n = 1, 2, 3).$$
We have, namely,
$$L = \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}, \qquad (306\,\mathrm{a})$$
where $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$ is the two-dimensional unitary matrix defining the transformation of $\sigma_n$ $\Bigl[0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\Bigr]$.
With the help of the matrix $\boldsymbol{\xi} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix}$ which serves to describe the electron's spin or magnetic moment [cf. (277)] we can write the matrix $L$ corresponding to a given rotation $(\omega, \mathbf{n})$ explicitly in the form
$$L = e^{\frac{1}{2}i\omega\xi_n} = \cos\tfrac{1}{2}\omega + i\xi_n\sin\tfrac{1}{2}\omega, \qquad (306\,\mathrm{b})$$
similar to (306) with $\sigma$ replaced by $\xi$.
The matrix $\gamma_0$ remains invariant under this transformation. Writing Dirac's equation in the form $(H + p_t)\psi = 0$ and using equation (306) for the $\gamma'_n$, we can write it for the rotated coordinate system in the form
$$\Bigl[p_t + U + \epsilon_0\gamma_0 + c\sum_n \gamma'_n u'_n\Bigr]\psi = 0,$$
or since $(p_t + U + \epsilon_0\gamma_0) = L^{-1}(p_t + U + \epsilon_0\gamma_0)L$,
$$L^{-1}\Bigl[p_t + U + \epsilon_0\gamma_0 + c\sum_n \gamma_n u'_n\Bigr]L\psi = 0.$$
If this equation is multiplied on the left by $L$, it reduces to the original form, with the old matrices $\gamma_n$, the new components of $\mathbf{u}$, and a new wave function $\psi'$ derived from the old one by means of the transformation $\psi' = L\psi$.
Putting $\psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix}$, where $\phi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ and $\chi = \begin{pmatrix} \psi_3 \\ \psi_4 \end{pmatrix}$, we get, with the help of (306),
$$\psi' = \begin{pmatrix} A\phi \\ A\chi \end{pmatrix},$$
in agreement with the results obtained before.
It can further be shown directly that under the transformation $\psi' = L\psi$ and $\psi'^\dagger = \psi^\dagger L^\dagger$ the product $\psi^\dagger\psi$ remains invariant while the quantities $c\,\psi^\dagger\gamma_n\psi$ transform as the rectangular components of a vector.
We can now turn to the generalization of the preceding results for rotations in the four-dimensional space-time manifold of the relativity theory, i.e. for Lorentz transformations, corresponding to a transition from a state of 'rest' to that of uniform motion.
It will be convenient in this connexion to use Dirac's equation in the form $B\psi = 0$, i.e.
$$(\beta_x u_x + \beta_y u_y + \beta_z u_z + \beta_t u_t + m_0 c)\psi = 0,$$
or
$$\Bigl(\sum_{n=1}^{4} \beta_n u_n + m_0 c\Bigr)\psi = 0, \qquad (307)$$
where $n = 1, 2, 3$ stands for $x$, $y$, $z$ respectively, while
$$x_4 = \sqrt{-1}\,ct, \qquad u_4 = -\sqrt{-1}\,u_t, \qquad \beta_4 = \sqrt{-1}\,\beta_t.$$
It must be emphasized that the imaginary unit $\sqrt{-1}$ is introduced here simply for the sake of formal symmetry, and that it will be treated in the sequel as an ordinary 'real' number, in the sense that its sign will not be altered in a transition to conjugate complex quantities. In order to distinguish this relativistic $\sqrt{-1}$ from that of the quantum theory, which plays an essential role, we shall denote the relativistic $\sqrt{-1}$ by the Greek letter $\iota$ ($\iota^* = \iota$, $i^* = -i$).
A Lorentz transformation is defined as a linear transformation of the form
$$x'_n = \sum_{m=1}^{4} a_{mn} x_m,$$
satisfying the orthogonality condition $\sum_{n=1}^{4} x'^2_n = \sum_{m=1}^{4} x_m^2$ and the condition that the first three components of $x'$ should be real and the fourth imaginary (reality condition). The components of the four-dimensional operator $u$ are transformed in the same way as the corresponding coordinates, and if we wish to ensure the invariance of equation (307), we must either submit the matrices $\beta_n$ to the same Lorentz transformation
$$\beta'_n = \sum_{m=1}^{4} a_{mn}\beta_m \qquad \Bigl(\beta'_{n\kappa\lambda} = \sum_{m=1}^{4} a_{mn}\beta_{m\kappa\lambda}\Bigr), \qquad (307\,\mathrm{a})$$
or introduce the equivalent tensor transformation in the four-dimensional state-space
$$\beta'_n = K^{-1}\beta_n K \qquad \Bigl(\beta'_{n\kappa\lambda} = \sum_{\mu}\sum_{\nu} K^{-1}_{\kappa\mu}\,\beta_{n\mu\nu}\,K_{\nu\lambda}\Bigr). \qquad (307\,\mathrm{b})$$
With the help of the latter the transformed Dirac equation can be put in the form
$$K^{-1}\Bigl(\sum_{n=1}^{4} \beta_n u'_n + m_0 c\Bigr)K\psi = 0,$$
that is,
$$\Bigl(\sum_{n=1}^{4} \beta_n u'_n + m_0 c\Bigr)\psi' = 0,$$
with the same numerical matrices $\beta_n$ as the original ones and with the transformed wave function
$$\psi' = K\psi. \qquad (307\,\mathrm{c})$$
The possibility of replacing (307 a) by (307 b) is proved by the fact that the transformation matrices $a_{mn}$ (in the ordinary space-time) and $K_{\kappa\lambda}$ (in the four-dimensional state-space) have the same rank. They contain therefore the same number of elements.
The determination of $K$ through $a$ can be carried out in the same way as in the case of rotations in ordinary space, by combining rotations in different planes.
In the case of rotations in ordinary space the matrix $K$ must obviously coincide with the matrix $L$ considered before. This follows from the relations $\beta_n = \gamma_0\gamma_n$ for $n = 1, 2, 3$ ($\beta_4 = \iota\gamma_0$) in conjunction with the fact that $\gamma_0$ is not affected by a spatial rotation. Now for a rotation through an angle $\omega$ in the plane $(x_1, x_2)$ we have, as has been shown above, $L = e^{\frac{1}{2}i\omega\xi_3}$ or, since $\xi_3 = i\beta_1\beta_2$ [according to (276), § 31], $L = e^{-\frac{1}{2}\omega\beta_1\beta_2}$.
Identifying this with the matrix $K$ for the case under consideration and taking into account the relativistic symmetry of Dirac's equation in the form (307) with respect to the space coordinates and the time ($\iota ct$), we can define the matrix $K$ corresponding to a transition from a state of rest to that of a motion in the direction of the first axis with a velocity $v$ by the expression
$$K = e^{-\frac{1}{2}\vartheta\beta_1\beta_4},$$
corresponding to a rotation in the plane $(x_1, x_4)$ through the imaginary angle $\vartheta = \tan^{-1}(v/\iota c)$. Replacing here $\beta_1$ by $\gamma_0\gamma_1$, $\beta_4$ by $\iota\gamma_0$, and putting $\vartheta = \iota\theta$, where
$$\tanh\theta = \frac{v}{c} \qquad \Bigl(\cosh\theta = \frac{1}{\sqrt{1 - v^2/c^2}}, \quad \sinh\theta = \frac{v/c}{\sqrt{1 - v^2/c^2}}\Bigr),$$
we get, since $\gamma_0\gamma_1\gamma_0 = -\gamma_0^2\gamma_1 = -\gamma_1$,
$$K = e^{-\frac{1}{2}\theta\gamma_1} = \cosh\tfrac{1}{2}\theta - \gamma_1\sinh\tfrac{1}{2}\theta.$$
This result is easily generalized for the case of motion in any direction specified by the unit vector $\mathbf{n}'$. Denoting the corresponding component of $\boldsymbol{\gamma}$ (i.e. the scalar product $\boldsymbol{\gamma}\cdot\mathbf{n}'$) by $\gamma_{n'}$ we get
$$K = e^{-\frac{1}{2}\theta\gamma_{n'}} = \cosh\tfrac{1}{2}\theta - \gamma_{n'}\sinh\tfrac{1}{2}\theta. \qquad (308)$$
In order to find the corresponding expression for the matrix $L$ we must come back to that form of Dirac's equation which has been used hitherto, viz. $\bigl(\sum_{n=1}^{4} \gamma_n u_n + \gamma_0 m_0 c\bigr)\psi = 0$ with $\gamma_4 = \iota\delta$ and $u_4 = -\iota u_t$, where the factor $\iota$ is introduced in order to secure a more complete symmetry between the terms involving the space coordinates and the time. The Lorentz transformation of the components of the operator $u$, defined by the equations $u'_n = \sum_{m=1}^{4} a_{mn} u_m$, must be combined with an appropriate transformation of the wave function, $\psi' = L\psi$, so that the transformed equation shall reduce to the form $\bigl(\sum_{n=1}^{4} \gamma_n u'_n + \gamma_0 m_0 c\bigr)\psi' = 0$ with the same matrices $\gamma_n$ (including $\gamma_0$) as the original one. Replacing $\psi$ and $\psi'$ by $\gamma_0\bar\psi = \psi$ and $\gamma_0\bar\psi' = \psi'$ respectively, we come back to the equations
$$\Bigl(\sum_n \beta_n u_n + m_0 c\Bigr)\bar\psi = 0 \quad\text{and}\quad \Bigl(\sum_n \beta_n u'_n + m_0 c\Bigr)\bar\psi' = 0;$$
whence it follows that $L = \gamma_0 K\gamma_0^{-1}$, where $K$ is the transformation considered before. Since $\gamma_0^2 = 1$, i.e. $\gamma_0 = \gamma_0^{-1}$, we can put $L = \gamma_0 K\gamma_0$. Substituting here the expression (308) for $K$, we get
$$L = \gamma_0^2\cosh\tfrac{1}{2}\theta - \gamma_0\gamma_{n'}\gamma_0\sinh\tfrac{1}{2}\theta$$
or, since $\gamma_0^2 = 1$ and $\gamma_0\gamma_{n'}\gamma_0 = -\gamma_0^2\gamma_{n'} = -\gamma_{n'}$,
$$L = \cosh\tfrac{1}{2}\theta + \gamma_{n'}\sinh\tfrac{1}{2}\theta = e^{\frac{1}{2}\theta\gamma_{n'}}. \qquad (308\,\mathrm{a})$$
If $\gamma$ is replaced here by $i\eta$, where $\eta$ is the matrix which serves to define the electron's electric moment in the same way as $\xi$ defines the magnetic one [cf. (276 a)], $L$ assumes a form quite similar to that (306 a) which corresponds to an ordinary spatial rotation. It should be remembered, however, that while $\xi$ represents a real quantity, $\eta$ must be considered as a pure imaginary. This corresponds to an important distinction between the matrices (306 b) and (308 a), the former being unitary ($L^\dagger = L^{-1}$ defining a rotation in the opposite sense) and the latter Hermitian ($L^\dagger = L$).
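In the general case of a Lorentz transformation combining an ordinary rotation $(\omega, \mathbf{n})$ with a relative motion $(\theta, \mathbf{n}')$, the matrix $L$ can be represented as the product of the two component transformations taken in a definite order, for instance,
$$L = e^{\frac{1}{2}i\omega\xi_n}\, e^{\frac{1}{2}\theta\gamma_{n'}}. \qquad (308\,\mathrm{b})$$
The adjoint matrix is
$$L^\dagger = e^{\frac{1}{2}\theta\gamma_{n'}}\, e^{-\frac{1}{2}i\omega\xi_n},$$
so that
$$L^\dagger L = \cosh^2\tfrac{1}{2}\theta + \sinh^2\tfrac{1}{2}\theta + 2\gamma_{n'}\sinh\tfrac{1}{2}\theta\cosh\tfrac{1}{2}\theta = \cosh\theta + \gamma_{n'}\sinh\theta.$$
Substituting this expression in the formula $\rho' = \psi^\dagger L^\dagger L\psi$ for the transformed value of the probability density (in the 'moving' coordinate system), we get
$$\rho' = \psi^\dagger\psi\cosh\theta + \psi^\dagger\gamma_{n'}\psi\sinh\theta,$$
that is,
$$\rho' = \rho\cosh\theta + \frac{j_{n'}}{c}\sinh\theta = \frac{\rho + v j_{n'}/c^2}{\sqrt{1 - v^2/c^2}},$$
in agreement with the well-known result following directly from the Lorentz transformation equations.
If the moving axes are parallel to the original ones ($\omega = 0$) we get in a similar way from the general formula $j'_n = c\,\psi'^\dagger\gamma_n\psi' = c\,\psi^\dagger L^\dagger\gamma_n L\psi$
$$j'_{n'} = c\,\psi^\dagger[\gamma_{n'}(\cosh^2\tfrac{1}{2}\theta + \sinh^2\tfrac{1}{2}\theta) + 2\cosh\tfrac{1}{2}\theta\sinh\tfrac{1}{2}\theta]\psi,$$
that is,
$$j'_{n'} = j_{n'}\cosh\theta + c\rho\sinh\theta = \frac{j_{n'} + v\rho}{\sqrt{1 - v^2/c^2}}.$$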
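The transformation of $\rho$ and $j_{n'}$ under a relative motion can be checked with explicit four-row matrices. The sketch below, in plain Python, makes several assumptions for the sake of illustration: the motion is along the first axis, $\gamma_x$ is taken in the block form with $\sigma_x$ in the off-diagonal positions, $c = 1$ by default, and the function names are hypothetical. Only the properties $\gamma_x^\dagger = \gamma_x$ and $\gamma_x^2 = 1$ are actually used.

```python
import math

# Assumed representation: gamma_x = [[0, sigma_x], [sigma_x, 0]] in 2x2 blocks.
GAMMA_X = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def expect(psi, m):
    """psi^dagger m psi for a four-component psi and a Hermitian matrix m."""
    n = len(psi)
    total = sum(psi[i].conjugate() * m[i][j] * psi[j]
                for i in range(n) for j in range(n))
    return total.real


def boost_matrix(theta, gamma):
    """L = cosh(theta/2) + gamma sinh(theta/2)."""
    ch, sh = math.cosh(theta / 2), math.sinh(theta / 2)
    return [[ch * IDENTITY[i][j] + sh * gamma[i][j] for j in range(4)]
            for i in range(4)]


def transformed_density_current(psi, v, c=1.0, gamma=GAMMA_X):
    """Density and current component computed from psi' = L psi."""
    theta = math.atanh(v / c)            # tanh(theta) = v/c
    L = boost_matrix(theta, gamma)
    psi_new = [sum(L[i][k] * psi[k] for k in range(4)) for i in range(4)]
    rho_new = expect(psi_new, IDENTITY)
    j_new = c * expect(psi_new, gamma)
    return rho_new, j_new
```

Comparing the output with $(\rho + v j/c^2)/\sqrt{1 - v^2/c^2}$ and $(j + v\rho)/\sqrt{1 - v^2/c^2}$, computed from the untransformed $\psi$, reproduces the two closed expressions; the combination $\rho'^2 - j'^2/c^2$ is unchanged, as expected for a four-vector.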
It should be mentioned that instead of introducing the relativistic imaginary $\iota = \sqrt{-1}$ in the definition of the fourth component of four-dimensional vectors one can distinguish two types of real components, namely, the covariant and the contravariant, the latter differing from the former by the opposite sign of the fourth components. The contravariant components are denoted by the same letters as the covariant ones with the index placed above instead of below. If, for instance, $x_1 = x$, $x_2 = y$, $x_3 = z$, $x_4 = ct$ are the covariant components of the space-time vector, then its contravariant components must be defined by $x^{(1)} = x$, $x^{(2)} = y$, $x^{(3)} = z$, $x^{(4)} = -ct$. The square of a four-dimensional vector, $A$ say, is thus equal to the sum of the products of its covariant components with the corresponding contravariant ones:
$$A^2 = \sum_{k=1}^{4} A_k A^{(k)}.$$
In a similar way the scalar product of two vectors is defined by the sum $\sum_{k=1}^{4} A_k B^{(k)}$ or $\sum_{k=1}^{4} A^{(k)} B_k$. With this notation Dirac's equation can be written in the form
$$\Bigl[\sum_{k=1}^{4} \gamma^{(k)} u_k + \gamma_0 mc\Bigr]\psi = 0,$$
where $u_4 = u_t\ [= (p_t + e\phi)/c]$ and $\gamma^{(4)} = \delta\ (= 1)$. The covariant components of the four-dimensional velocity vector $\gamma$ must be defined accordingly as
$$\gamma_1 = \gamma_x, \quad \gamma_2 = \gamma_y, \quad \gamma_3 = \gamma_z, \quad \gamma_4 = -\delta\ (= -1),$$
and the contravariant components of the operator $u$ as
$$u^{(1)} = u_x, \quad u^{(2)} = u_y, \quad u^{(3)} = u_z, \quad u^{(4)} = -u_t.$$
The transformation matrix $L$ obtained above thus refers to the contravariant components of $\gamma$. It is easily seen, however, that it can be applied just as well to the covariant ones.
Quantities of the type of Dirac's wave function quadruplet $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$ can be regarded as forming in the space-time manifold a kind of tensor of rank $\tfrac{1}{2}$. This means that they are related to an ordinary vector (i.e. tensor of the first rank) in the same way as the latter is related to an ordinary tensor (of the second rank). This connexion is plainly seen from the fact that an ordinary vector, like the probability current density $(\mathbf{j}, \rho)$, can be expressed with the help of the $\psi$'s as a quadratic quantity, just as a tensor (of the second rank) can be expressed as a quadratic quantity by means of the components of a vector or of two different vectors.
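It has recently been shown by various authors‡ that each of the two pairs of functions $\psi_1, \psi_2$ ($= \phi_1, \phi_2$) and $\psi_3, \psi_4$ ($= \chi_1, \chi_2$) rather than the whole quadruplet determines a 'tensor of the rank $\tfrac{1}{2}$'. Any pair of such quantities, whose transformation properties in the state-space of the spin coordinate (with its two values 1 and 2) are connected with the transformation properties of vectors in the ordinary space-time manifold by the above equations, are called, following Ehrenfest, a spinor. The two components of a spinor, $\phi_1$ and $\phi_2$ say, are complex numbers; they determine therefore four real numbers which can serve to specify the components of an ordinary four-dimensional vector. A vector can be defined as a particular type of spinor of the second rank, i.e. as a quantity whose components (in the spin space!) transform like the products of the components of two ordinary spinors, or in particular of a single spinor $\phi$ and its adjoint quantity $\phi^\dagger$.
‡ Cf. O. Laporte and G. Uhlenbeck, Phys. Rev. 37 (1931), 1380.
It can easily be shown, for example, that the expressions
$$f_k = \phi^\dagger\sigma_k\phi \qquad (k = 1, 2, 3, 4),$$
where
$$\sigma_1 = \sigma_x, \quad \sigma_2 = \sigma_y, \quad \sigma_3 = \sigma_z, \quad \sigma_4 = \delta\ (= 1),$$
that is, $f_1 = \phi_1^*\phi_2 + \phi_2^*\phi_1$, $f_2 = i(\phi_1^*\phi_2 - \phi_2^*\phi_1)$, $f_3 = -\phi_1^*\phi_1 + \phi_2^*\phi_2$, and $f_4 = \phi_1^*\phi_1 + \phi_2^*\phi_2$, transform like the quantities $x$, $y$, $z$, $ct$ in any ordinary rotation or in a Lorentz transformation, if $\phi$ is transformed according to $\phi' = A\phi$ and $\phi^\dagger$ according to $\phi'^\dagger = \phi^\dagger A^\dagger$, $A$ being a two-dimensional matrix, which reduces to the form $e^{\frac{1}{2}i\omega\sigma_n} = \cos\tfrac{1}{2}\omega + i\sigma_n\sin\tfrac{1}{2}\omega$ already considered in the case of an ordinary rotation (through the angle $\omega$ about an axis $\mathbf{n}$). In the case of a relative motion in a direction $\mathbf{n}'$ with a velocity $v$ specified by the angle $\theta = \tanh^{-1}(v/c)$ we have, so long as the new axes are parallel to the old ones,
$$A = e^{\frac{1}{2}\theta\sigma_{n'}} = \cosh\tfrac{1}{2}\theta + \sigma_{n'}\sinh\tfrac{1}{2}\theta \qquad (\sigma_{n'} = n'_x\sigma_x + n'_y\sigma_y + n'_z\sigma_z).$$
This gives in particular, for a motion in the $z$-direction,
$$A = \begin{pmatrix} e^{-\frac{1}{2}\theta} & 0 \\ 0 & e^{\frac{1}{2}\theta} \end{pmatrix}.$$
In the most general case $A$ can be represented by the product of $e^{\frac{1}{2}i\omega\sigma_n}$ with $e^{\frac{1}{2}\theta\sigma_{n'}}$, that is,
$$A = \cos\tfrac{1}{2}\omega\cosh\tfrac{1}{2}\theta + i\sigma_n\sin\tfrac{1}{2}\omega\cosh\tfrac{1}{2}\theta + \sigma_{n'}\sinh\tfrac{1}{2}\theta\cos\tfrac{1}{2}\omega + i\sigma_n\sigma_{n'}\sin\tfrac{1}{2}\omega\sinh\tfrac{1}{2}\theta,$$
or, since $\sigma_n\sigma_{n'} = (\boldsymbol{\sigma}\cdot\mathbf{n})(\boldsymbol{\sigma}\cdot\mathbf{n}') = \mathbf{n}\cdot\mathbf{n}' + i\boldsymbol{\sigma}\cdot(\mathbf{n}\times\mathbf{n}')$,
$$A = \cos\tfrac{1}{2}\omega\cosh\tfrac{1}{2}\theta + i\,\mathbf{n}\cdot\mathbf{n}'\sin\tfrac{1}{2}\omega\sinh\tfrac{1}{2}\theta + i\sigma_n\sin\tfrac{1}{2}\omega\cosh\tfrac{1}{2}\theta + \sigma_{n'}\sinh\tfrac{1}{2}\theta\cos\tfrac{1}{2}\omega - \boldsymbol{\sigma}\cdot(\mathbf{n}\times\mathbf{n}')\sin\tfrac{1}{2}\omega\sinh\tfrac{1}{2}\theta.$$
The elements of this matrix are easily verified to satisfy the relation
$$|A| = A_{11}A_{22} - A_{12}A_{21} = 1.$$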
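The behaviour of the four quantities $f_k$ under a relative motion along $z$, and the unit determinant of the general matrix $A$, can be illustrated numerically. This is a minimal sketch in plain Python with hypothetical helper names; the spin matrices follow the sign convention of this section, and $\sigma_4$ is taken as the unit matrix.

```python
import math

# Spin matrices in the sign convention of this section.
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, 1j], [-1j, 0]],    # sigma_y
    [[-1, 0], [0, 1]],      # sigma_z
]
UNIT = [[1, 0], [0, 1]]


def sigma_along(n):
    """Component of sigma along the unit vector n."""
    return [[sum(n[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]


def spin_rotation(omega, n):
    """cos(omega/2) + i sigma_n sin(omega/2)."""
    s = sigma_along(n)
    c, sn = math.cos(omega / 2), math.sin(omega / 2)
    return [[c * (i == j) + 1j * sn * s[i][j] for j in range(2)] for i in range(2)]


def spin_boost(theta, n):
    """cosh(theta/2) + sigma_n' sinh(theta/2)."""
    s = sigma_along(n)
    ch, sh = math.cosh(theta / 2), math.sinh(theta / 2)
    return [[ch * (i == j) + sh * s[i][j] for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(a, phi):
    return [sum(a[i][k] * phi[k] for k in range(2)) for i in range(2)]


def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def four_vector(phi):
    """[f_1, f_2, f_3, f_4] with f_k = phi^dagger sigma_k phi, sigma_4 = 1."""
    def sandwich(s):
        return sum(phi[i].conjugate() * s[i][j] * phi[j]
                   for i in range(2) for j in range(2)).real
    return [sandwich(s) for s in SIGMA] + [sandwich(UNIT)]
```

For a motion along $z$ the first two components returned by `four_vector` stay fixed, while the third and fourth mix with the coefficients $\cosh\theta$ and $\sinh\theta$; `det2` applied to a product of `spin_rotation` and `spin_boost` returns 1 up to rounding.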
Using the notation
$$\phi_1^* = \phi_{\dot 1}, \qquad \phi_2^* = \phi_{\dot 2},$$
i.e. replacing the conjugate complex sign by dotted indices, one can write the covariant components of a spinor of the second rank in three different forms, namely,
$$\phi_{kl}, \quad \phi_{k\dot l}, \quad \phi_{\dot k\dot l} \qquad (k, l = 1, 2),$$
these components transforming as the products $\phi_k\phi_l$, $\phi_k\phi_{\dot l}$, and $\phi_{\dot k}\phi_{\dot l}$ respectively.
Besides covariant components of spinors we must also distinguish contravariant ones. For a spinor of the first rank these are defined by the relations
$$\phi^{(1)} = \phi_2, \qquad \phi^{(2)} = -\phi_1,$$
$$\phi^{(\dot 1)} = \phi_{\dot 2}, \qquad \phi^{(\dot 2)} = -\phi_{\dot 1},$$
because this ensures the invariance of the 'scalar products' $\phi_1\chi^{(1)} + \phi_2\chi^{(2)}$ and $\phi_{\dot 1}\chi^{(\dot 1)} + \phi_{\dot 2}\chi^{(\dot 2)}$.
The contravariant or mixed components of spinors of higher rank are connected with the covariant ones in a similar way. We have, for example,
$$\phi^{11} = \phi_{22}, \quad \phi^{12} = -\phi_{21}, \quad \phi^{21} = -\phi_{12}, \quad \phi^{22} = \phi_{11}, \quad\text{etc.}$$
The components of a (four-dimensional) vector $f$ can be represented with the help of a spinor of the second rank by the formulae
P = h = 1(4*1 ++»)> P = /. =
/ 3= / . = / 4 = - / « = W h + fc .)-
We shall not engage in a more detailed discussion of this question and shall point out in conclusion the following important circumstance.
In our derivation of Dirac's equations as a generalization of the equations of Maxwell's theory we originally introduced, instead of the quadruplet $\psi_1, \psi_2, \psi_3, \psi_4$, eight quantities $M_1, M_2, M_3, M_0$; $N_1, N_2, N_3, N_0$, visualizing the six quantities $M_1, M_2, M_3, -N_1, -N_2, -N_3$ as analogous to the electromagnetic field components $H_x, H_y, H_z, E_x, E_y, E_z$, while $M_0$ and $N_0$ were regarded as two additional scalar quantities. This point of view had to be abandoned in the sequel because of the rearrangement of the Maxwell-like equations, corresponding to the introduction of the additional terms containing the rest-mass of the electron $m_0$. If, however, instead of the first-order equations we consider the second-order equations only (which are a generalization of the d'Alembert equations of the electromagnetic theory), we can preserve the above point of view and treat the quantities $M_1, M_2, M_3, iN_1, iN_2, iN_3$ as the components of a four-dimensional antisymmetric tensor of the second rank $M_{kl} = -M_{lk}$ ($k, l = 1, 2, 3, 4$) transforming under a Lorentz transformation in the same way as the components of the electromagnetic field-tensor $F_{kl} = -F_{lk}$ ($F_{23} = H_x$, $F_{31} = H_y$, $F_{12} = H_z$, $F_{14} = -iE_x$, $F_{24} = -iE_y$, $F_{34} = -iE_z$). It has been shown further that in this case we can put $\mathbf{N} = \pm i\mathbf{M}$, which corresponds to the 'self-duality' of the tensor $M_{kl}$, and introduce accordingly a corresponding relation between the scalars (invariants) $M_0$ and $N_0$, thus reducing the eight quantities $M$, $N$ to four, just as in the case of Dirac's equation.
The fallacy of this procedure is shown by the fact that it does not permit us to define a four-dimensional vector representing the probability current $\mathbf{j}$ and density $\rho$. The latter would appear in such a theory not as the fourth (time) component of a vector but as the (4,4)-component of a tensor of the second rank, corresponding to the tensor of the electromagnetic energy and momentum; the components of the vector $\mathbf{j}$ would appear likewise as the (1,4), (2,4), and (3,4) components of this tensor, corresponding to the components of the energy-stream. So long as we confine ourselves to ordinary rotations in three-dimensional space this circumstance remains irrelevant; it becomes, however, a challenge to the theory when we pass to the more general Lorentz transformation, involving the transformation of the time. In order to make $\mathbf{j}, \rho$ a regular four-dimensional vector we must consider the quantities $M$, $N$ as defining a spinor $\psi$ (or more exactly two spinors $\phi$, $\chi$) whose transformation properties have been studied in this section.
The above argument serves to show in a most convincing way the restricted character of the analogy between matter and light as represented by the probability and the electromagnetic waves respectively. A 'wave-mechanical' theory of light similar to that of matter would necessitate the introduction of a new type of probability field, connected with the photons in the same way as with ordinary particles and entirely different from the electromagnetic field which has been used hitherto to describe the phenomena of light from the point of view of the wave theory. It does not seem, however, that the introduction of such a probability field with spinor properties is warranted by the experimental facts.
36. Transformation of the Dirac Equation to Curvilinear Coordinates
We have considered hitherto cartesian coordinates only. We shall now generalize the results obtained for a transformation from the cartesian system $x, y, z$ to any system of orthogonal curvilinear coordinates $q_1, q_2, q_3$. Such a system can be specified by the following expression for the square of the line-element (i.e. the distance between two neighbouring points):
$$ds^2 = e_1^2\,dq_1^2 + e_2^2\,dq_2^2 + e_3^2\,dq_3^2,$$
where $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ are mutually perpendicular vectors tangential to the coordinate lines which pass through one end $P$ of $ds$. The products $e_i\,dq_i$ play the same role as the differentials $dx_i$ of a local cartesian system passing through $P$ with its axes parallel to the vectors $\mathbf{e}_i$, so that the rectangular components of the operator $\mathbf{p}$ can be written in the form
$$p'_k = \frac{h}{2\pi i}\,\frac{1}{e_k}\,\frac{\partial}{\partial q_k}.$$
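In transforming the expression $\boldsymbol{\gamma}\cdot\mathbf{p}\,\psi = \bigl(\sum_k \gamma_k p_k\bigr)\psi$ to the new coordinates, with the help of the formula $\gamma'_k = L^{-1}\gamma_k L$, we must take into account the fact that the matrix $L$ is to be considered as a function of the coordinates, varying from point to point with the direction of the local cartesian axes. We thus get
$$\boldsymbol{\gamma}\cdot\mathbf{p}\,\psi = \sum_k L^{-1}\gamma_k L\,p'_k\psi = L^{-1}\Bigl[\sum_k \gamma_k p'_k L\psi - \sum_k \gamma_k (p'_k L - L p'_k)\psi\Bigr],$$
or
$$\boldsymbol{\gamma}\cdot\mathbf{p}\,\psi = L^{-1}\sum_{k=1}^{3} \gamma_k [p'_k - (p'_k L - L p'_k)L^{-1}]\psi'.$$
In order to obtain the transformed equations we must accordingly replace the components of the vector $\mathbf{p}$ by the 'covariant' operators
$$\bar p'_k = p'_k - (p'_k L - L p'_k)L^{-1}. \qquad (309)$$
In the special case of orthogonal coordinates they assume the form
$$\bar p'_k = \frac{h}{2\pi i}\,\frac{1}{e_k}\Bigl(\frac{\partial}{\partial q_k} - \frac{\partial}{\partial q_k}\log L\Bigr), \qquad (309\,\mathrm{a})$$
where
$$\frac{\partial}{\partial q_k}\log L = \frac{\partial L}{\partial q_k}\,L^{-1}.$$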
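Now, as has been shown above, the matrix $L$ can be defined by the expression $e^{\frac{1}{2}i\omega\xi_n}$, where the rotation angle $\omega$ and the axis of rotation $\mathbf{n}$ must be considered as certain functions of the coordinates. We thus have
$$\log L = \tfrac{1}{2}i\omega\xi_n = \tfrac{1}{2}i\,\boldsymbol{\xi}\cdot\boldsymbol{\omega}, \qquad (309\,\mathrm{b})$$
the vector $\boldsymbol{\omega}$ serving to determine the rotation both with respect to magnitude and direction.
Let us consider the infinitesimal rotation $d\boldsymbol{\omega}$ corresponding to a transition from a point $P$ (with the coordinates $q_k$) to a neighbouring point $P'$ (with the coordinates $q'_k = q_k + dq_k$). Introducing three unit vectors $\mathbf{f}_k = \mathbf{e}_k/e_k$ in the direction of the coordinate lines, we can obviously put
$$d\mathbf{f}_k = d\boldsymbol{\omega}\times\mathbf{f}_k,$$
whence
$$\mathbf{f}_i\cdot d\mathbf{f}_k = \mathbf{f}_i\cdot(d\boldsymbol{\omega}\times\mathbf{f}_k) = d\boldsymbol{\omega}\cdot(\mathbf{f}_k\times\mathbf{f}_i) = \pm(d\boldsymbol{\omega}\cdot\mathbf{f}_j) = \pm(d\boldsymbol{\omega})_j,$$
where $\mathbf{f}_j$ is the unit vector perpendicular to $\mathbf{f}_k$ and $\mathbf{f}_i$, the positive sign corresponding to an even character of the permutation $(k, i, j)$ and the negative sign to an odd one.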
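We have on the other hand
$$\mathbf{f}_i\cdot d\mathbf{f}_k = (\mathbf{e}_i/e_i)\cdot d(\mathbf{e}_k/e_k) = \mathbf{e}_i\cdot d\mathbf{e}_k/(e_i e_k),$$
since the vectors $\mathbf{e}_i$ and $\mathbf{e}_k$ are mutually orthogonal ($i \neq k$), whence, putting for the sake of brevity $\partial/\partial q_h = \partial_h$,
$$\pm(\partial_h\boldsymbol{\omega})_j = \mathbf{e}_i\cdot\partial_h\mathbf{e}_k/(e_i e_k). \qquad (310)$$
It follows from the formula
$$d\mathbf{r} = \mathbf{e}_1\,dq_1 + \mathbf{e}_2\,dq_2 + \mathbf{e}_3\,dq_3,$$
which can serve for the definition of the vectors $\mathbf{e}_i$, that the latter are equal to the differential coefficients of the radius vector $\mathbf{r}$ of the point $(q_1, q_2, q_3)$ with respect to the corresponding coordinates. We thus have
$$\partial_h\mathbf{e}_k = \partial_k\mathbf{e}_h$$
and consequently
$$\mathbf{e}_i\cdot\partial_i\mathbf{e}_k = \mathbf{e}_i\cdot\partial_k\mathbf{e}_i = \tfrac{1}{2}\partial_k e_i^2 = e_i\,\partial_k e_i.$$
Further, since $\mathbf{e}_i\cdot\mathbf{e}_k = 0$ ($k \neq i$),
$$\mathbf{e}_k\cdot\partial_i\mathbf{e}_i = -\mathbf{e}_i\cdot\partial_i\mathbf{e}_k = -e_i\,\partial_k e_i,$$
and if $h$ is different from both $k$ and $i$,
$$\mathbf{e}_i\cdot\partial_h\mathbf{e}_k = 0.$$
The latter equation is easily obtained in conjunction with the fact that $\partial_h(\mathbf{e}_i\cdot\mathbf{e}_k) = 0$.
Substituting these expressions in (310) we find
$$\begin{aligned}
(\partial_1\boldsymbol{\omega})_1 &= 0, & (\partial_1\boldsymbol{\omega})_2 &= \frac{1}{e_3}\partial_3 e_1, & (\partial_1\boldsymbol{\omega})_3 &= -\frac{1}{e_2}\partial_2 e_1, \\
(\partial_2\boldsymbol{\omega})_1 &= -\frac{1}{e_3}\partial_3 e_2, & (\partial_2\boldsymbol{\omega})_2 &= 0, & (\partial_2\boldsymbol{\omega})_3 &= \frac{1}{e_1}\partial_1 e_2, \\
(\partial_3\boldsymbol{\omega})_1 &= \frac{1}{e_2}\partial_2 e_3, & (\partial_3\boldsymbol{\omega})_2 &= -\frac{1}{e_1}\partial_1 e_3, & (\partial_3\boldsymbol{\omega})_3 &= 0.
\end{aligned} \qquad (310\,\mathrm{a})$$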
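Now according to (309 a) and (309 b)
$$\bar p'_k = \frac{h}{2\pi i}\,\frac{1}{e_k}\Bigl(\partial_k - \tfrac{1}{2}i\,\boldsymbol{\xi}\cdot\partial_k\boldsymbol{\omega}\Bigr),$$
that is,
$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \sum_k \gamma_k p'_k - \frac{h}{4\pi}\,\boldsymbol{\xi}\cdot\sum_k \gamma_k\frac{1}{e_k}\partial_k\boldsymbol{\omega}.$$
The first component of the vector $\sum_k \gamma_k\frac{1}{e_k}\partial_k\boldsymbol{\omega}$ (i.e. its product with $\mathbf{f}_1$) is
$$\gamma_1\frac{1}{e_1}(\partial_1\boldsymbol{\omega})_1 + \gamma_2\frac{1}{e_2}(\partial_2\boldsymbol{\omega})_1 + \gamma_3\frac{1}{e_3}(\partial_3\boldsymbol{\omega})_1 = -\gamma_2\frac{1}{e_3}\partial_3\log e_2 + \gamma_3\frac{1}{e_2}\partial_2\log e_3$$
according to (310 a). Multiplying the right-hand side by $\xi_1$ we get, since $\xi_1\gamma_2 = \rho\xi_1\xi_2 = i\rho\xi_3 = i\gamma_3$ and $\xi_1\gamma_3 = \rho\xi_1\xi_3 = -i\rho\xi_2 = -i\gamma_2$,
$$-i\Bigl[\gamma_3\frac{1}{e_3}\partial_3\log e_2 + \gamma_2\frac{1}{e_2}\partial_2\log e_3\Bigr].$$
We thus find
$$\boldsymbol{\xi}\cdot\sum_k \gamma_k\frac{1}{e_k}\partial_k\boldsymbol{\omega} = -i\Bigl[\gamma_1\frac{1}{e_1}\partial_1\log(e_2 e_3) + \gamma_2\frac{1}{e_2}\partial_2\log(e_3 e_1) + \gamma_3\frac{1}{e_3}\partial_3\log(e_1 e_2)\Bigr]$$
and consequently
$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \sum_k \gamma_k\Bigl[p'_k - \tfrac{1}{2}\,p'_k\log\frac{e_1 e_2 e_3}{e_k}\Bigr], \qquad (311)$$
it being understood that the second term in the brackets represents an ordinary number and not an operator. We can also write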
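$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \sum_k \gamma_k\,p'_k\Bigl[1 - \tfrac{1}{2}\log\frac{e_1 e_2 e_3}{e_k}\Bigr], \qquad (311\,\mathrm{a})$$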
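the transformed Dirac equation being
$$\Bigl\{(p_t + e\phi) + c\,\boldsymbol{\gamma}\cdot\Bigl(\bar{\mathbf{p}}' - \frac{e}{c}\mathbf{A}\Bigr) + \gamma_0 m_0 c^2\Bigr\}\psi' = 0. \qquad (311\,\mathrm{b})$$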
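Two special cases should be especially noted, namely, that of a cylindrical and of a spherical coordinate system. In the former case we have, putting
$$q_1 = r = \sqrt{x^2 + y^2}, \quad q_2 = \phi \text{ (angle)}, \quad\text{and}\quad q_3 = z,$$
$$e_1 = 1, \quad e_2 = r, \quad e_3 = 1,$$
and consequently
$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{1}{2}\frac{\partial}{\partial r}\log r\Bigr) + \gamma_2\frac{1}{r}\frac{\partial}{\partial\phi} + \gamma_3\frac{\partial}{\partial z}\Bigr],$$
that is,
$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{1}{2r}\Bigr) + \gamma_2\frac{1}{r}\frac{\partial}{\partial\phi} + \gamma_3\frac{\partial}{\partial z}\Bigr]. \qquad (312)$$
In the latter case, putting
$$q_1 = r = \sqrt{x^2 + y^2 + z^2}, \quad q_2 = \theta \text{ (colatitude)}, \quad\text{and}\quad q_3 = \phi,$$
we have $e_1 = 1$, $e_2 = r$, $e_3 = r\sin\theta$, and consequently
$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{1}{2}\frac{\partial}{\partial r}\log r^2\Bigr) + \gamma_2\frac{1}{r}\Bigl(\frac{\partial}{\partial\theta} - \frac{1}{2}\frac{\partial}{\partial\theta}\log\sin\theta\Bigr) + \gamma_3\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\Bigr],$$
that is,
$$\boldsymbol{\gamma}\cdot\bar{\mathbf{p}}' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{1}{r}\Bigr) + \gamma_2\frac{1}{r}\Bigl(\frac{\partial}{\partial\theta} - \frac{1}{2}\cot\theta\Bigr) + \gamma_3\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\Bigr]. \qquad (312\,\mathrm{a})$$
This expression can be used to reduce to its simplest form the problem of the hydrogen atom, which has been discussed already by a less straightforward method in § 33.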
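The size of the logarithmic term for a given set of scale factors $e_1, e_2, e_3$ can be evaluated numerically. The sketch below, in plain Python, computes only the magnitude $\frac{1}{e_k}\,\partial_k\bigl[\tfrac{1}{2}\log(e_1 e_2 e_3/e_k)\bigr]$ by central differences; the sign with which it enters the operator is taken from (311) and is not decided by the code. The function names and the step `h` are assumptions made for the illustration.

```python
import math


def scale_factors_spherical(q):
    """(e_1, e_2, e_3) for q = (r, theta, phi)."""
    r, theta, _phi = q
    return (1.0, r, r * math.sin(theta))


def scale_factors_cylindrical(q):
    """(e_1, e_2, e_3) for q = (r, phi, z)."""
    r, _phi, _z = q
    return (1.0, r, 1.0)


def log_derivative_terms(scale_factors, q, h=1e-6):
    """For each k: (1/e_k) d/dq_k [ (1/2) log(e_1 e_2 e_3 / e_k) ]."""

    def half_log(point, k):
        e = scale_factors(point)
        return 0.5 * math.log(e[0] * e[1] * e[2] / e[k])

    terms = []
    for k in range(3):
        up, down = list(q), list(q)
        up[k] += h
        down[k] -= h
        # Central difference approximation of the derivative along q_k.
        derivative = (half_log(up, k) - half_log(down, k)) / (2 * h)
        terms.append(derivative / scale_factors(q)[k])
    return terms
```

For spherical coordinates the three values approach $1/r$, $\cot\theta/2r$ and $0$; for cylindrical coordinates they approach $1/2r$, $0$ and $0$.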
It should be mentioned that in calculating the product $\boldsymbol{\gamma}\cdot\mathbf{A} = \sum_k \gamma_k A_k$ the quantities $A_k$ must be understood to represent the components of the vector potential along the axes of the local cartesian systems, i.e. along the vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ ($A_k = \mathbf{A}\cdot\mathbf{f}_k$). The matrices $\gamma_1, \gamma_2, \gamma_3$, though identical with the original matrices $\gamma_x, \gamma_y, \gamma_z$, have now a different physical meaning, denoting the components of the vector $\boldsymbol{\gamma}$ along the axes of the local system and not of the original cartesian system of coordinates.
The preceding results can be further generalized for the case of a non-orthogonal system of curvilinear coordinates. We must distinguish in this case contravariant and covariant components of different vectors, the former transforming as $dq_1, dq_2, dq_3$ and the latter as $\partial/\partial q_1, \partial/\partial q_2, \partial/\partial q_3$. Putting $p'_k = \frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ and denoting the contravariant components of the vector $\boldsymbol{\gamma}$ in the new system by $\gamma'^{(k)}$, we can write the operator $\boldsymbol{\gamma}\cdot\mathbf{p}$ in the form $\sum_{k=1}^{3} \gamma'^{(k)} p'_k$.
Introducing a generalized (non-unitary) transformation matrix $L$ according to the condition
$$\gamma'^{(k)} = L^{-1}\gamma_k L$$
(where $\gamma_1 = \gamma_x$, $\gamma_2 = \gamma_y$, $\gamma_3 = \gamma_z$), we get
$$(\boldsymbol{\gamma}\cdot\mathbf{p})\psi = L^{-1}\sum_k \gamma_k [p'_k - (p'_k L - L p'_k)L^{-1}]L\psi,$$
whence it follows that the transformed Dirac equation for the new wave functions $\psi' = L\psi$ will differ from the original one in the same way as in the case of orthogonal coordinates, the operators $p'_k = (h/2\pi i)\,\partial/\partial q_k$ being replaced by
$$\bar p'_k = \frac{h}{2\pi i}\Bigl(\frac{\partial}{\partial q_k} - \frac{\partial}{\partial q_k}\log L\Bigr) = p'_k[1 - \log L].$$
We shall not determine here the matrix $L$ for the general case of non-orthogonal coordinates, for it is not of practical interest.
The preceding results can be further generalized by introducing four-dimensional transformations, involving not only the space coordinates
but also the time. Such transformations can be used to include the
effects of the gravitational field on the motion of the electron in
accordance with the relativity theory of gravitation. These considera
tions lie, however, beyond the scope of the present book.
In conclusion, the following transformation property of Dirac's equation should be mentioned.
The electromagnetic field is represented in Dirac's equation by the potentials $\mathbf{A}$, $\phi$. Now from the relations $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/c\,\partial t$, $\mathbf{H} = \operatorname{curl}\mathbf{A}$, it follows that the electromagnetic field strengths are not altered if $\mathbf{A}$ is replaced by $\mathbf{A}' = \mathbf{A} + \nabla\chi$ and $\phi$ by $\phi' = \phi - \partial\chi/c\,\partial t$, where $\chi$ is an arbitrary function of the coordinates and of the time. Since it is the field strengths and not the potentials which have a direct physical meaning, the above transformation of the potentials must be irrelevant for Dirac's equation; that is, the transformed equation
$$[(p_t + e\phi')/c + \boldsymbol{\gamma}\cdot(\mathbf{p} - e\mathbf{A}'/c) + \gamma_0 mc]\psi' = 0$$
must be equivalent to the equation
$$[(p_t + e\phi)/c + \boldsymbol{\gamma}\cdot(\mathbf{p} - e\mathbf{A}/c) + \gamma_0 mc]\psi = 0$$
with the original potentials. This is easily verified, the transformed wave function $\psi'$ being connected with the original one by the equation $\psi' = e^{i2\pi e\chi/hc}\psi$. So long as $\chi$ is a real quantity (as of course it must be), the two functions, or rather function-quadruplets, $\psi$ and $\psi'$ correspond to identical values of the probabilities and thus determine the same motion.
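The verification referred to above can be written out in a few lines. The following is a sketch of the intermediate steps, using only $\mathbf{p} = (h/2\pi i)\nabla$ and $p_t = (h/2\pi i)\,\partial/\partial t$ as defined in the text; the abbreviation $\alpha = 2\pi e\chi/hc$ is introduced here for brevity and is not part of the original notation.

```latex
% Gauge factor: \psi' = e^{i\alpha}\psi with \alpha = 2\pi e\chi/(hc).
\begin{align*}
\mathbf{p}\,\psi'
  &= \frac{h}{2\pi i}\nabla\bigl(e^{i\alpha}\psi\bigr)
   = e^{i\alpha}\Bigl(\mathbf{p} + \frac{e}{c}\nabla\chi\Bigr)\psi, \\
\Bigl(\mathbf{p} - \frac{e}{c}\mathbf{A}'\Bigr)\psi'
  &= e^{i\alpha}\Bigl(\mathbf{p} + \frac{e}{c}\nabla\chi
     - \frac{e}{c}\mathbf{A} - \frac{e}{c}\nabla\chi\Bigr)\psi
   = e^{i\alpha}\Bigl(\mathbf{p} - \frac{e}{c}\mathbf{A}\Bigr)\psi, \\
p_t\,\psi'
  &= e^{i\alpha}\Bigl(p_t + \frac{e}{c}\frac{\partial\chi}{\partial t}\Bigr)\psi, \\
(p_t + e\phi')\,\psi'
  &= e^{i\alpha}\Bigl(p_t + \frac{e}{c}\frac{\partial\chi}{\partial t}
     + e\phi - \frac{e}{c}\frac{\partial\chi}{\partial t}\Bigr)\psi
   = e^{i\alpha}(p_t + e\phi)\,\psi.
\end{align*}
% The factor e^{i\alpha} commutes with the constant matrices \gamma and \gamma_0,
% so the whole transformed operator acting on \psi' equals e^{i\alpha} times
% the original operator acting on \psi.
```

Since the common factor $e^{i\alpha}$ never vanishes, the transformed equation holds for $\psi'$ exactly when the original one holds for $\psi$, and $|\psi'_k|^2 = |\psi_k|^2$ for real $\chi$.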
This transformation can be considered as a special transformation
of coordinates, the transformation matrix $L$ being defined as the product
of the matrix $\delta = 1$ by the function $e^{i2\pi e\chi/hc}$. It is clear that the
coordinates are actually not affected by a transformation of this type.
We see at the same time that the introduction of an electromagnetic
field can be described in a geometrical language as a generalization of
ordinary coordinate transformations, the quantities $(h/2\pi i)\,\partial L/\partial q_k$ being
replaced by $eA_k/c$.
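The verification of the gauge property can be written out in two lines. The following is only a sketch, assuming $\mathbf{p} = (h/2\pi i)\nabla$ and $p_t = (h/2\pi i)\,\partial/\partial t$ as elsewhere in this part, and using the fact that the matrices $\boldsymbol{\gamma}$, $\gamma_0$ commute with the scalar phase factor.

```latex
% Action of the momentum operators on psi' = exp(i 2 pi e chi / h c) psi:
\begin{align*}
\mathbf{p}\,\psi' &= e^{i2\pi e\chi/hc}\Bigl(\mathbf{p} + \frac{e}{c}\nabla\chi\Bigr)\psi,
&
p_t\,\psi' &= e^{i2\pi e\chi/hc}\Bigl(p_t + \frac{e}{c}\frac{\partial\chi}{\partial t}\Bigr)\psi,
\\
% The extra terms are cancelled by A' = A + grad chi and phi' = phi - (1/c) d chi/dt:
\Bigl(\mathbf{p} - \frac{e}{c}\mathbf{A}'\Bigr)\psi' &= e^{i2\pi e\chi/hc}\Bigl(\mathbf{p} - \frac{e}{c}\mathbf{A}\Bigr)\psi,
&
(p_t + e\phi')\,\psi' &= e^{i2\pi e\chi/hc}\,(p_t + e\phi)\,\psi .
\end{align*}
```

The transformed equation is therefore the original one multiplied on the left by the phase factor, which never vanishes.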
VII
THE PROBLEM OF MANY PARTICLES
37. General Results, Virial Theorem, Linear and Angular Momentum
The problem of many particles has been considered already in the first
part (Chapter IV) on the basis of the non-relativistic mechanics of a single
particle. Using the method of the configuration space, we arrived, in
the case of two different particles, at the equation (101), which in the
general case of a system of $n$ different particles with the masses
$m_1, m_2, \ldots, m_n$, and the potential energy $U(x_1, y_1, z_1; \ldots; x_n, y_n, z_n; t)$ can be
written in the form
$$\left[\sum_{k=1}^{n}\frac{1}{m_k}\nabla_k^2 + \frac{8\pi^2}{h^2}\left(\frac{ih}{2\pi}\frac{\partial}{\partial t} - U\right)\right]\psi = 0, \tag{313}$$
where
$$\nabla_k^2 = \frac{\partial^2}{\partial x_k^2} + \frac{\partial^2}{\partial y_k^2} + \frac{\partial^2}{\partial z_k^2}.$$
Using the notation
$$\mathbf{p}_k = \frac{h}{2\pi i}\nabla_k, \quad\text{i.e.}\quad p_{kx} = \frac{h}{2\pi i}\frac{\partial}{\partial x_k},\quad p_{ky} = \frac{h}{2\pi i}\frac{\partial}{\partial y_k},\quad p_{kz} = \frac{h}{2\pi i}\frac{\partial}{\partial z_k}, \tag{313a}$$
and
$$H = \sum_{k=1}^{n}\frac{1}{2m_k}\,p_k^2 + U, \tag{313b}$$
we can rewrite (313) in the standard operator form
$$(H + p_t)\psi = 0, \tag{314}$$
$p_t$ denoting as usual $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$, while $H$ represents the energy operator
or Hamiltonian for the system under consideration. It agrees with the
classical expression of the energy if the operators $\mathbf{p}_k$ are regarded as
representing the momenta of the separate particles. The wave-mechanical
equation (314) thus corresponds to the classical energy
equation $H - W = 0$ if $-p_t$ is replaced by the value of the energy $W$.
This correspondence has exactly the same character as for a single
particle, for which it has been discussed in detail in Chapters I and II.
We need not repeat here all that has been stated there, as well as in
the following three chapters, concerning the matrix representation, the
transformation, and the perturbation theory. It may suffice to remark
that a system of particles, defined by the Hamiltonian (313 b), can be
dealt with from the mathematical point of view as a single ‘particle’
moving in a space of $3n$ dimensions, with the coordinates
$$q_1 = \sqrt{\frac{m_1}{m}}\,x_1,\quad q_2 = \sqrt{\frac{m_1}{m}}\,y_1,\quad q_3 = \sqrt{\frac{m_1}{m}}\,z_1,\quad \ldots,\quad q_{3n} = \sqrt{\frac{m_n}{m}}\,z_n. \tag{314a}$$
Here $m$ is an arbitrary coefficient of the dimension of a mass, which
can be regarded as the mass of the ‘equivalent’ particle. We can put,
for instance, $m = (m_1 m_2 \ldots m_n)^{1/n}$, which gives
$$dV = dx_1\,dy_1\,dz_1 \ldots dx_n\,dy_n\,dz_n = dq_1\,dq_2 \ldots dq_{3n}.$$
The corresponding momentum components are defined in the classical
theory by the formulae
$$p_1 = m\frac{dq_1}{dt} = \sqrt{\frac{m}{m_1}}\,p_{1x},\quad \ldots,\quad p_{3n} = m\frac{dq_{3n}}{dt} = \sqrt{\frac{m}{m_n}}\,p_{nz}.$$
They are represented in the wave-mechanical theory by the operators
$$p_1 = \sqrt{\frac{m}{m_1}}\,p_{1x} = \sqrt{\frac{m}{m_1}}\,\frac{h}{2\pi i}\frac{\partial}{\partial x_1},\quad \ldots,\quad p_{3n} = \sqrt{\frac{m}{m_n}}\,\frac{h}{2\pi i}\frac{\partial}{\partial z_n},$$
that is, according to (314 a),
$$p_\alpha = \frac{h}{2\pi i}\frac{\partial}{\partial q_\alpha} \qquad (\alpha = 1, 2, \ldots, 3n), \tag{314b}$$
just as in the case of a single particle. Expressed in terms of these
coordinates and momenta the Hamiltonian (313 b) assumes the standard
form
$$H = \frac{1}{2m}\sum_{\alpha=1}^{3n} p_\alpha^2 + U. \tag{314c}$$
All the developments of the first five chapters of this part, referring
to the motion of a particle in ordinary three-dimensional space, can be
immediately generalized for the case of a symbolic particle representing
a system of $n$ ordinary particles in the $3n$-dimensional configuration
space. The generalization is in fact so simple that it is hardly necessary
to dwell upon it.
We shall therefore limit ourselves to the discussion of a few peculiarities
connected with the physical meaning of the problem and to
the possibility of completing and refining the theory in the same sense
as has been done in the preceding chapter for the case of a single
particle.
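The passage to the ‘equivalent’ particle can be illustrated numerically. The following is only a sketch: the function names and the sample masses, positions, and momenta are hypothetical, chosen for illustration. It rescales each Cartesian coordinate by $\sqrt{m_k/m}$ and each momentum component by $\sqrt{m/m_k}$, with $m$ taken as the geometric mean of the masses, and compares the kinetic energy in both descriptions.

```python
import math

def equivalent_mass(masses):
    """Geometric mean of the particle masses, m = (m1*m2*...*mn)**(1/n)."""
    return math.exp(sum(math.log(mk) for mk in masses) / len(masses))

def to_configuration_space(masses, positions, momenta):
    """Map Cartesian (x, p) of n particles to 3n-dimensional (q, p_q).

    Each coordinate is scaled by sqrt(m_k/m), each momentum by sqrt(m/m_k).
    """
    m = equivalent_mass(masses)
    q, pq = [], []
    for mk, xk, pk in zip(masses, positions, momenta):
        scale = math.sqrt(mk / m)
        q.extend(scale * c for c in xk)
        pq.extend(c / scale for c in pk)
    return m, q, pq

def jacobian(masses, m):
    """Ratio dq_1...dq_3n / (dx_1 dy_1 dz_1 ... dz_n) for this scaling."""
    return math.prod((mk / m) ** 1.5 for mk in masses)

# Hypothetical sample data (arbitrary units), for illustration only.
masses = [1.0, 1836.0, 7344.0]
positions = [(0.1, 0.2, 0.3), (1.0, -1.0, 0.5), (2.0, 0.0, -1.0)]
momenta = [(0.3, -0.1, 0.2), (5.0, 4.0, -3.0), (-2.0, 1.0, 6.0)]

m, q, pq = to_configuration_space(masses, positions, momenta)
kinetic_cartesian = sum(sum(c * c for c in pk) / (2 * mk)
                        for mk, pk in zip(masses, momenta))
kinetic_equivalent = sum(c * c for c in pq) / (2 * m)
```

With the geometric-mean choice of $m$ the Jacobian equals unity, so the volume element is unchanged, and the two kinetic energies agree.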
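From equation (314), with the Hamiltonian (314 c), and its conjugate
complex $(H - p_t)\psi^* = 0$ ($H^* = H$) we can obtain in the usual way
the ‘conservation’ or continuity equation
$$\frac{\partial\rho}{\partial t} + \sum_{\alpha=1}^{3n}\frac{\partial j_\alpha}{\partial q_\alpha} = 0, \tag{315}$$
where $\rho = \psi\psi^*$ is the probability density in the configuration space, and
$$j_\alpha = \frac{h}{4\pi i m}\left(\psi^*\frac{\partial\psi}{\partial q_\alpha} - \psi\frac{\partial\psi^*}{\partial q_\alpha}\right)$$
the components of the $3n$-dimensional probability current. If equation
(315) is multiplied by the volume-element of the configuration space
$$dV = dV_1\,dV_2 \ldots dV_n = \left(\frac{m^n}{m_1 m_2 \ldots m_n}\right)^{3/2} dq_1\,dq_2 \ldots dq_{3n}$$
and integrated over all this space, the result obtained is
$$\frac{d}{dt}\int \rho\,dV = 0,$$
expressing the law of conservation of probability.†
If, however, the integration is extended over the configuration space
of the second, third,..., $n$th particle, while the coordinates of the first
one, $x_1$, $y_1$, $z_1$, are kept constant, we obtain an equation of the usual
three-dimensional form
$$\frac{\partial\rho_1}{\partial t} + \frac{\partial j_{1x}}{\partial x_1} + \frac{\partial j_{1y}}{\partial y_1} + \frac{\partial j_{1z}}{\partial z_1} = 0, \tag{315a}$$
where the quantities
$$\rho_1 = \int\!\ldots\!\int \rho\,dV_2\,dV_3 \ldots dV_n, \qquad j_{1x} = \frac{h}{4\pi i m_1}\int\!\ldots\!\int\left(\psi^*\frac{\partial\psi}{\partial x_1} - \psi\frac{\partial\psi^*}{\partial x_1}\right)dV_2 \ldots dV_n, \text{ etc.} \tag{315b}$$
can be interpreted as the probability density and current density for
the first particle in the ordinary three-dimensional space. The same
results hold, of course, for each of the other particles.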
† We shall assume for the sake of simplicity that the integral $\int \rho\,dV$ is convergent,
which means that the particles are bound to remain in a finite region of space.
In the particular case of a system of particles which do not act on each
other the equation $(H + p_t)\psi = 0$ has multiplicative solutions of the
form $\psi = \psi_1\psi_2 \ldots \psi_n$, where $\psi_k$ depends upon the coordinates of the $k$th
particle alone; we get accordingly in this case
$$\rho_k = \psi_k\psi_k^*$$
(provided the separate factors of $\psi$ are normalized to unity) and
consequently $\rho = \rho_1\rho_2 \ldots \rho_n$. This result was the starting-point of our
discussion of the problem of many particles in Part I. In the general
case $\rho$ is, of course, different from the product $\rho_1\rho_2 \ldots \rho_n$; this
circumstance corresponds to a mutual dependence of the particles, a
dependence specified by the form of the potential-energy function $U$ or also
by statistical (i.e. symmetry) conditions, if the particles are all alike
(see below).
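The difference between independent and mutually dependent particles can be made concrete on a grid. The following is a minimal sketch under simplifying assumptions that are not in the text: each of two particles moves along a single axis, the wave functions are hypothetical Gaussian-type examples, and integrals are replaced by sums with step `DX`.

```python
import math

DX = 0.1
GRID = [(i - 40) * DX for i in range(81)]   # one coordinate per particle

def normalized(psi):
    """Scale a two-particle table psi[i][j] so that sum |psi|^2 dx1 dx2 = 1."""
    norm = math.sqrt(sum(abs(v) ** 2 for row in psi for v in row)) * DX
    return [[v / norm for v in row] for row in psi]

def densities(psi):
    """Return rho(x1, x2) and the one-particle densities rho1(x1), rho2(x2)."""
    rho = [[abs(v) ** 2 for v in row] for row in psi]
    rho1 = [sum(row) * DX for row in rho]
    rho2 = [sum(rho[i][j] for i in range(len(rho))) * DX
            for j in range(len(rho[0]))]
    return rho, rho1, rho2

def max_product_deviation(psi):
    """Largest value of |rho - rho1 * rho2| on the grid."""
    rho, rho1, rho2 = densities(psi)
    return max(abs(rho[i][j] - rho1[i] * rho2[j])
               for i in range(len(rho1)) for j in range(len(rho2)))

# Product function psi1(x1) * psi2(x2): particles not acting on each other.
independent = normalized([[math.exp(-x1 * x1) * x2 * math.exp(-0.5 * x2 * x2)
                           for x2 in GRID] for x1 in GRID])
# A function that does not factorize: mutually dependent particles.
coupled = normalized([[math.exp(-0.5 * (x1 * x1 + x2 * x2) - (x1 - x2) ** 2)
                       for x2 in GRID] for x1 in GRID])

deviation_independent = max_product_deviation(independent)
deviation_coupled = max_product_deviation(coupled)
```

For the product function the deviation vanishes up to rounding, while for the coupled function the density is clearly not the product of the one-particle densities.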
The function $U$ may be assumed to have the form
$$U = \sum_k U_k(x_k, y_k, z_k, t) + \frac{1}{2}\sum_k\sum_{l \neq k} U_{kl}(r_{kl}),$$
the first sum corresponding to the action of external forces, which can
depend upon the time explicitly, while the second represents the mutual
action of the particles ($U_{kl} = U_{lk}$, $r_{kl} = |\mathbf{r}_k - \mathbf{r}_l|$ = distance between
the $k$th and $l$th particles).
If $U$ does not depend upon $t$, then equation (314) admits solutions
of the form $\psi = \psi^0(x_1, \ldots, z_n)\,e^{-i2\pi H't/h}$, where $\psi^0$ and $H'$ are the
characteristic functions and the characteristic values of the energy operator
satisfying the usual equation
$$(H - H')\psi^0 = 0. \tag{316}$$
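In the case of a discrete spectrum of $H$ the functions $\psi^0$ are easily
proved to be orthogonal to each other (in the configuration space), this
orthogonality being a consequence of the self-adjoint character of the
operator $H$, since
$$fHg - gHf = -\frac{h^2}{8\pi^2 m}\sum_{k=1}^{3n}\frac{\partial}{\partial q_k}\left(f\frac{\partial g}{\partial q_k} - g\frac{\partial f}{\partial q_k}\right).$$
Another interesting consequence of this self-adjointness of $H$ is the
possibility of replacing the preceding equation by the variational
equation,
$$\delta\int \psi^{0*}H\psi^0\,dV = 0, \tag{316a}$$
with the accessory condition,
$$\int \psi^{0*}\psi^0\,dV = 1,$$
expressing the ‘normalization’ of the functions $\psi^0$. Using
$$\int \psi^{0*}\frac{\partial^2\psi^0}{\partial q_\alpha^2}\,dV = -\int \frac{\partial\psi^{0*}}{\partial q_\alpha}\frac{\partial\psi^0}{\partial q_\alpha}\,dV,$$
(316 a) can be rewritten in the form
$$\delta\int\left[\frac{h^2}{8\pi^2 m}\sum_{\alpha=1}^{3n}\frac{\partial\psi^{0*}}{\partial q_\alpha}\frac{\partial\psi^0}{\partial q_\alpha} + U\psi^{0*}\psi^0\right]dV = 0, \tag{316b}$$
which involves the first derivatives of $\psi^0$ only ($dV = dq_1 \ldots dq_{3n}$).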
An interesting application of the variational equation is afforded
by the following very simple and general proof of the virial theorem
(due to V. Fock). Let us replace the function $\psi^0(q_1, \ldots, q_{3n})$, which is
a solution of our problem, by the function $\psi' = c\,\psi^0(\lambda q_1, \ldots, \lambda q_{3n})$, which
is obtained from it by multiplying each coordinate by a certain
parameter $\lambda$ and introducing a normalizing factor $c$. Introducing
further a new set of coordinates $q'_\alpha = \lambda q_\alpha$, we can write the normalizing
condition $\int \psi'^*\psi'\,dV = 1$ in the form $\int \lambda^{-3n}\psi'^*\psi'\,dV' = 1$, where
$dV' = dq'_1 \ldots dq'_{3n}$, which gives, on using the original normalizing
condition,
$$\int \psi^{0*}(q)\,\psi^0(q)\,dV = \int \psi^{0*}(q')\,\psi^0(q')\,dV' = 1,$$
$c = \lambda^{3n/2}$. Using this value of $c$ we can reduce the variational equation
(316 a) or (316 b) to the form $dH'/d\lambda = 0$, where $H'$ is the value of the
integral (316 a) or (316 b) which is obtained by replacing the function
$\psi^0$ by the function $\psi'$. Its minimum value corresponds, of course, to
$\lambda = 1$, which is the solution of the equation $dH'/d\lambda = 0$.
Now using the coordinates $q'_\alpha$, we have
$$H' = \int\left[\lambda^2\,\frac{h^2}{8\pi^2 m}\sum_\alpha \frac{\partial\psi^{0*}}{\partial q'_\alpha}\frac{\partial\psi^0}{\partial q'_\alpha} + U\!\left(\frac{q'}{\lambda}\right)\psi^{0*}\psi^0\right]dV',$$
so that the preceding equation assumes the form
$$\int\left[2\lambda\,\frac{h^2}{8\pi^2 m}\sum_\alpha \frac{\partial\psi^{0*}}{\partial q'_\alpha}\frac{\partial\psi^0}{\partial q'_\alpha} - \frac{1}{\lambda^2}\sum_\alpha q'_\alpha\frac{\partial U}{\partial q_\alpha}\,\psi^{0*}\psi^0\right]dV' = 0.$$
Putting here $\lambda = 1$ and $q'_\alpha = q_\alpha$, we get
$$2\overline{T} = \overline{\sum_\alpha q_\alpha\frac{\partial U}{\partial q_\alpha}}, \tag{317}$$
where $\overline{T}$ denotes the probable (average) value of the kinetic energy of
the system, and $\sum_\alpha q_\alpha\,\partial U/\partial q_\alpha$ its ‘virial’. We have obviously
$$\sum_\alpha q_\alpha\frac{\partial U}{\partial q_\alpha} = \sum_k\left(x_k\frac{\partial U}{\partial x_k} + y_k\frac{\partial U}{\partial y_k} + z_k\frac{\partial U}{\partial z_k}\right).$$
If the potential energy is a homogeneous function of the coordinates,
this expression reduces to the product of $U$ with the number specifying
the corresponding power. In the special case of a system of electrified
particles obeying the Coulomb law, which is approximately the case
with any actual material system constituted by protons and electrons,
we must have
$$2\overline{T} = -\overline{U} \tag{317a}$$
or, since $\overline{T} + \overline{U} = W$ (= total energy of the system),
$$\overline{T} = -W. \tag{317b}$$
It should be remembered that these relations hold only so long as
the particles remain actually bound to each other, which is expressed
mathematically by the convergence of the integral $\int |\psi|^2\,dV$, a
convergence that subsists so long as the energy $W$ of the state under
consideration belongs to the discrete spectrum. It should further be
remarked that they remain valid if some of the particles are treated as
fixed centres of force producing an ‘external’ Coulomb field of force.
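The scaling argument can be tried out numerically. The sketch below is an illustration under stated assumptions, with hypothetical function names: the mean kinetic energy of the stretched function is taken to scale as $\lambda^2$ and a potential homogeneous of degree $n$ as $\lambda^{-n}$, and the stationary value of $\lambda$ is located by bisection. The Coulomb sample values correspond to a hydrogen-like trial function $e^{-ar}$ with $a = 0.8$ in atomic units, for which the mean kinetic energy is $a^2/2$ and the mean potential energy $-a$.

```python
def energy_parts(scale, kinetic0, potential0, degree):
    """Mean kinetic and potential energy of the stretched function.

    kinetic0 and potential0 are the mean values for scale = 1; the potential
    is assumed homogeneous of the given degree in the coordinates.
    """
    return scale ** 2 * kinetic0, scale ** (-degree) * potential0

def stationary_scale(kinetic0, potential0, degree, lo=1e-3, hi=1e3):
    """Solve dH'/d(lambda) = 0 by bisection (assumes a single sign change)."""
    def slope(s):
        return 2 * s * kinetic0 - degree * s ** (-degree - 1) * potential0
    for _ in range(200):
        mid = (lo * hi) ** 0.5          # bisect on a logarithmic scale
        if slope(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

# Coulomb case (degree -1), starting from an unoptimized trial function.
lam_coulomb = stationary_scale(0.32, -0.8, -1)
t_coulomb, u_coulomb = energy_parts(lam_coulomb, 0.32, -0.8, -1)

# Harmonic case (degree 2), hypothetical starting values.
lam_harmonic = stationary_scale(1.0, 0.25, 2)
t_harmonic, u_harmonic = energy_parts(lam_harmonic, 1.0, 0.25, 2)
```

At the stationary point the Coulomb values satisfy $2\overline{T} = -\overline{U}$ and the harmonic values $2\overline{T} = 2\overline{U}$, in agreement with the general rule that the virial equals the degree of homogeneity times $\overline{U}$.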
We shall now establish a few other general laws which hold for a
closed system of particles, i.e. a system unaffected by external forces,
such as an isolated atom or molecule, etc.
These laws are the exact equivalents of the laws of classical mechanics
concerning the conservation of the energy, momentum, and of the
moment of momentum (or angular momentum) of the system. The
first of them has been stated already. The other two can be established
with the help of the relation
$$\frac{dF}{dt} = [H, F] = \frac{2\pi i}{h}(HF - FH).$$
We put
$$F = \mathbf{p} = \sum_{k=1}^{n}\mathbf{p}_k, \qquad\text{or}\qquad F = \mathbf{M} = \sum_{k=1}^{n}\mathbf{r}_k \times \mathbf{p}_k,$$
in accordance with the classical definition of the total momentum and
angular momentum (the origin from which the vectors $\mathbf{r}_k$ are supposed
to be drawn can be chosen arbitrarily).
Taking the $x$-component of $\mathbf{p}$, we have
$$[H, p_x] = [U, p_x] = \sum_k [U, p_{kx}] = -\sum_k \frac{\partial U}{\partial x_k}.$$
Now $-\partial U/\partial x_k$ represents the force acting on the $k$th particle in the
direction of the $x$-axis; so long as there are no external forces, the sum
of such forces for all the particles must obviously vanish. Hence we get
$$\frac{d\mathbf{p}}{dt} = 0.$$
In a similar way we have
$$[H, M_x] = \sum_k [H, y_k p_{kz} - z_k p_{ky}] = \sum_k \{[H, y_k]p_{kz} + y_k[H, p_{kz}] - [H, z_k]p_{ky} - z_k[H, p_{ky}]\},$$
that is, since
$$[H, y_k] = \frac{\partial H}{\partial p_{ky}} = \frac{p_{ky}}{m_k}, \qquad [H, p_{kz}] = -\frac{\partial H}{\partial z_k} = -\frac{\partial U}{\partial z_k}, \text{ etc.,}$$
$$[H, M_x] = \sum_k \frac{1}{m_k}(p_{ky}p_{kz} - p_{kz}p_{ky}) - \sum_k\left(y_k\frac{\partial U}{\partial z_k} - z_k\frac{\partial U}{\partial y_k}\right).$$
The first sum vanishes since $p_{ky}$ and $p_{kz}$ commute with each other, while
the second is equal to the $x$-component of the vector $\sum_k (\mathbf{r}_k \times \mathbf{F}_k)$, where
$\mathbf{F}_k$ is the force acting on the $k$th particle. We thus get
$$\frac{d\mathbf{M}}{dt} = \sum_k \mathbf{r}_k \times \mathbf{F}_k,$$
just as in the classical theory. It is easy to see that in the case of
central forces, which we are considering, the vector $\sum_k \mathbf{r}_k \times \mathbf{F}_k$
(representing the resulting torque of all the forces acting on all the particles)
vanishes. We have in fact, putting $\mathbf{F}_k = \sum_{l \neq k}\mathbf{F}_{kl}$, and taking into account
that $\mathbf{F}_{kl} = -\mathbf{F}_{lk} = f_{kl}(\mathbf{r}_k - \mathbf{r}_l)$,
$$\sum_k \mathbf{r}_k \times \mathbf{F}_k = \frac{1}{2}\sum_k\sum_{l \neq k}(\mathbf{r}_k - \mathbf{r}_l) \times \mathbf{F}_{kl} = 0.$$
Hence it follows that
$$\frac{d\mathbf{M}}{dt} = 0,$$
i.e. the conservation law for the resulting angular momentum.
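The vanishing of the resulting force and torque for mutual central forces is a purely geometrical fact, which can be checked on the classical force expressions. The sketch below assumes Coulomb-type pair forces $f_{kl} = e_k e_l / r_{kl}^3$; the positions and charges are hypothetical sample values.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def central_forces(positions, charges):
    """Forces F_k = sum over l of f_kl (r_k - r_l), f_kl = e_k e_l / r_kl**3."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for k in range(n):
        for l in range(k + 1, n):
            d = [positions[k][a] - positions[l][a] for a in range(3)]
            f = charges[k] * charges[l] / math.dist(positions[k], positions[l]) ** 3
            for a in range(3):
                forces[k][a] += f * d[a]     # F_kl
                forces[l][a] -= f * d[a]     # F_lk = -F_kl
    return forces

# Hypothetical configuration, arbitrary units.
positions = [(0.0, 0.0, 0.0), (1.0, 0.2, -0.3), (-0.7, 0.9, 0.4), (0.3, -1.1, 0.8)]
charges = [2.0, -1.0, -1.0, 1.0]

forces = central_forces(positions, charges)
total_force = [sum(f[a] for f in forces) for a in range(3)]
total_torque = [sum(cross(r, f)[a] for r, f in zip(positions, forces))
                for a in range(3)]
```

Both sums vanish up to rounding errors, whatever origin is used for the vectors $\mathbf{r}_k$.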
This result, as well as the preceding one, can be obtained by another
method based on the invariance of the energy operator with regard to
a transformation of the coordinates (and momentum components)
involving a shift of the origin and a rotation of the axes about it. Let
$P$ be some fixed point in space (or in the configuration space) and $P'$
another point which in the new system has the same coordinates as
$P$ has in the old one ($x'_k = x_k$, etc.). If $f(P)$ is some function of the old
coordinates, then the transformed function will be defined by the
condition $Tf(P) = f(P')$, $T$ denoting the transformation under consideration.
The coordinates of the point $P'$ in the original system are defined
by the linear transformation equations
$$\begin{aligned}
x'_k &= x_0 + \alpha_{11}x_k + \alpha_{12}y_k + \alpha_{13}z_k,\\
y'_k &= y_0 + \alpha_{21}x_k + \alpha_{22}y_k + \alpha_{23}z_k,\\
z'_k &= z_0 + \alpha_{31}x_k + \alpha_{32}y_k + \alpha_{33}z_k
\end{aligned}
\qquad (k = 1, 2, \ldots, n).$$
In the special case of an infinitesimal transformation these equations
reduce to
$$\begin{aligned}
x'_k - x_k &= \delta x_k = x_0 + \omega_y z'_k - \omega_z y'_k,\\
y'_k - y_k &= \delta y_k = y_0 + \omega_z x'_k - \omega_x z'_k,\\
z'_k - z_k &= \delta z_k = z_0 + \omega_x y'_k - \omega_y x'_k,
\end{aligned}$$
where $\omega_x$, $\omega_y$, $\omega_z$ are the components of an infinitesimal rotation $\boldsymbol{\omega}$,
while $x_0$, $y_0$, $z_0$ are the components of an infinitesimal displacement $\mathbf{r}_0$.
We obtain in this case
$$\begin{aligned}
Tf(P) = f(P') &= f(P) + \sum_k\left(\frac{\partial f}{\partial x_k}\delta x_k + \frac{\partial f}{\partial y_k}\delta y_k + \frac{\partial f}{\partial z_k}\delta z_k\right)\\
&= f(P) + x_0\sum_k\frac{\partial f}{\partial x_k} + y_0\sum_k\frac{\partial f}{\partial y_k} + z_0\sum_k\frac{\partial f}{\partial z_k}\\
&\quad + \omega_x\sum_k\left(y'_k\frac{\partial f}{\partial z_k} - z'_k\frac{\partial f}{\partial y_k}\right) + \omega_y\sum_k\left(z'_k\frac{\partial f}{\partial x_k} - x'_k\frac{\partial f}{\partial z_k}\right) + \omega_z\sum_k\left(x'_k\frac{\partial f}{\partial y_k} - y'_k\frac{\partial f}{\partial x_k}\right),
\end{aligned}$$
the derivatives of $f$ being taken for the point $P$. Neglecting small
quantities of the second order, we can replace in this equation the
primed letters (referring to $P'$) by the unprimed (referring to $P$),
which gives
$$Tf(P) = f(P) + \frac{2\pi i}{h}\,(\mathbf{r}_0\cdot\mathbf{p} + \boldsymbol{\omega}\cdot\mathbf{M})f,$$
where $\mathbf{p} = \sum_k \mathbf{p}_k$, while $\mathbf{M}$ denotes, as before, the operator of the
resulting angular momentum. We thus see that an infinitesimal
transformation $T$ can be represented by an ordinary linear differential
operator
$$T = 1 + \frac{2\pi i}{h}\,(\mathbf{r}_0\cdot\mathbf{p} + \boldsymbol{\omega}\cdot\mathbf{M}).$$
Now it is obvious from symmetry considerations that the energy $H$
remains invariant under a transformation of the type $T$ since the latter
alters neither the value of the potential energy (depending on the
relative position of the particles only) nor the expression of the kinetic
energy operator (the operators $\nabla_k^2$ being independent of the orientation
of the coordinate system or of the position of its origin). This
circumstance can be expressed by the condition $TH\psi = HT\psi$, that is,
$HT = TH$, which, on the other hand, means that the operator $T$
represents a constant of the motion. In view of the arbitrariness of the
(infinitesimal) vectors $\mathbf{r}_0$ and $\boldsymbol{\omega}$, the equation $T$ = const. is split up
into two independent equations: $\mathbf{p}$ = const. and $\mathbf{M}$ = const., expressing
the conservation law of the resulting linear and angular momentum of
the system.
These laws are, of course, no longer satisfied in the presence of
external forces. If, however, the latter reduce to an attraction to a
fixed point (as in the case of a system of electrons revolving about
a fixed nucleus supposed to act like a point-charge) then we still have
$\mathbf{M}$ = const. In the presence of a homogeneous field, magnetic or
electric, parallel to a fixed direction in space, the energy operator
remains invariant for rotations about this direction only, and we obtain
accordingly the conservation law for the corresponding projection of
the angular momentum, the components of the latter in the perpendicular
directions being no longer constant.
The operator $\mathbf{M}_k = \mathbf{r}_k \times \mathbf{p}_k$ representing the angular momentum of
a single particle satisfies, as we know, the relation
$$\mathbf{M}_k \times \mathbf{M}_k = -\frac{h}{2\pi i}\,\mathbf{M}_k.$$
Replacing $\mathbf{M}_k$ by the resultant angular momentum operator $\mathbf{M} = \sum_k \mathbf{M}_k$,
we have
$$\mathbf{M} \times \mathbf{M} = \sum_k \mathbf{M}_k \times \mathbf{M}_k + \sum_{k<l}(\mathbf{M}_k \times \mathbf{M}_l + \mathbf{M}_l \times \mathbf{M}_k) = \sum_k \mathbf{M}_k \times \mathbf{M}_k,$$
since the operators $\mathbf{M}_k$ and $\mathbf{M}_l$ referring to different particles obviously
commute with each other. We thus get for the resulting angular momentum
the same relation as for the component ones, viz.:
$$\mathbf{M} \times \mathbf{M} = -\frac{h}{2\pi i}\,\mathbf{M}. \tag{318}$$
It has been shown in Chap. III, § 13, that it is possible by means of
the matrix method to derive from this relation the matrix elements
of $\mathbf{M}$ in a representation specified by the condition that $M^2$ and $M_z$
should be diagonal matrices (corresponding to a given value of the
energy). The number of particles involved is obviously immaterial (so
long as $\mathbf{M}$ commutes with $H$) and the results obtained before for the
case of a single particle can be directly applied to the present case.
We thus obtain, on denoting the angular quantum number by $j$ (instead
of $l$ as before) and the axial one by $m$,
(318a)
$$(M_z)_{m,m} = \frac{h}{2\pi}\,m \qquad (-j \le m \le j) \tag{318b}$$
[cf. e.g. (96) and (96 a)]. As has been pointed out in § 13, the number
$j$ can assume, from the matrix-theory point of view, both integral and
half-integral values (the values of $m$ being of the same nature);
half-integral values occur, however, only if the spin of the particles is
included, and if $\mathbf{M}$ refers to total not orbital angular momentum.
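The statement that the relation (318) admits both integral and half-integral $j$ can be checked on explicit matrices. The sketch below is an illustration, not the construction of § 13: it assumes the standard ladder elements $\sqrt{j(j+1) - m(m+1)}$ with all phase factors put equal to unity, works in units of $h/2\pi$ (so that the $z$-component of (318) reads $M_xM_y - M_yM_x = iM_z$), and uses hypothetical helper names.

```python
import math

def angular_momentum_matrices(j):
    """Matrices of M_x, M_y, M_z in units of h/2pi, ordered m = j, ..., -j."""
    dim = int(round(2 * j)) + 1
    ms = [j - i for i in range(dim)]
    raising = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]                      # M_x + i M_y connects m to m + 1
        raising[col - 1][col] = complex(math.sqrt(j * (j + 1) - m * (m + 1)))
    lowering = [[raising[c][r].conjugate() for c in range(dim)] for r in range(dim)]
    mx = [[(raising[r][c] + lowering[r][c]) / 2 for c in range(dim)] for r in range(dim)]
    my = [[(raising[r][c] - lowering[r][c]) / 2j for c in range(dim)] for r in range(dim)]
    mz = [[complex(ms[r]) if r == c else 0j for c in range(dim)] for r in range(dim)]
    return mx, my, mz

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def check(j):
    """Return the errors in M_x M_y - M_y M_x = i M_z and in M^2 = j(j+1)."""
    mx, my, mz = angular_momentum_matrices(j)
    n = len(mx)
    commutator = [[p - q for p, q in zip(rp, rq)]
                  for rp, rq in zip(matmul(mx, my), matmul(my, mx))]
    i_mz = [[1j * v for v in row] for row in mz]
    squares = [matmul(m, m) for m in (mx, my, mz)]
    m_squared = [[sum(s[r][c] for s in squares) for c in range(n)] for r in range(n)]
    expected = [[j * (j + 1) if r == c else 0 for c in range(n)] for r in range(n)]
    return max_abs_difference(commutator, i_mz), max_abs_difference(m_squared, expected)

errors = {j: check(j) for j in (0.5, 1, 1.5, 2)}
```

For each of the sample values of $j$ both errors are of the order of rounding, and the matrices have $2j + 1$ rows, one for each value of $m$.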
38. Magnetic Forces and Spin Effects
A generalization and refinement of the preceding theory along the same
lines as for a single particle (i.e. the establishment of a wave equation
$(H + p_t)\psi = 0$ which would describe the behaviour of a system of
particles in agreement with the relativity theory, taking account of
magnetic forces and of the spin effect) is a problem which admits only
of a partial and approximate solution. This circumstance is not
characteristic of the wave mechanics, for we meet with a similar situation in
the classical mechanics. The latter can be formulated in a relativistically
invariant form for the case of a single particle moving in an external
electromagnetic field, that is, in a field which is supposed to be known
a priori and specified by the potentials $\phi$ and $\mathbf{A}$. The more general
problem of the motion of two or more particles, acting on each other
according to the laws of the classical electromagnetic theory, cannot be
solved with the help of a single equation involving the coordinates of
all the particles for the same instant of time, for according to this
theory the action emanating from each particle travels through space
with a finite velocity ($c$). The force acting on a particle (1) at the instant
$t$ depends upon the position and motion of the other particles (2, 3,...)
at previous instants $t_{12} = t - R_{12}/c$, etc., $R_{12}$ being the distance between
the point where (1) is at the time $t$ and the point where (2) was at the
time $t_{12}$.
This fact, usually denoted as the law of retarded action, alone
precludes the possibility of treating the problem of motion and interaction
of a number of particles by means of a single equation of the
Hamilton-Jacobi type. We must, instead, write the relativistic equation of motion
for each individual particle assuming the electromagnetic potentials
produced by the other particles to be known, and furthermore a set of
equations defining the potentials produced by each particle, its motion
being supposedly known.
This problem allows, however, only an exact formulation. It cannot
be solved exactly even for the simplest case of two particles. And
there is no doubt that such a solution, if it could be obtained, would
be in contradiction to the experimental facts. Assuming that the latter
can be described adequately, so far as the motion of a particle in a
given external field is concerned, by means of the relativistic wave
mechanics, we must find a method of describing adequately the
electromagnetic field produced by a particle, whose motion is specified in
terms of wave mechanics, i.e. in terms of the probability theory. This
means that together with the classical mechanics we must abandon the
classical electrodynamics, based upon the idea of exactly specified motion,
and replace it by a new ‘quantum electrodynamics’, not involving
this idea.
We shall consider this problem more closely later on (Chapter IX)
and shall confine ourselves here to the more modest task of
incorporating into the wave-mechanical theory the magnetic forces, and
other effects connected with them, neglecting those which are due to the
retarded character of the interaction between the electrified particles
(electrons and protons) constituting matter.
So far as the action of an external magnetic (or electromagnetic)
field on a system of such particles is concerned, the required
generalization of the previous theory presents no difficulties. We have merely
to replace in the expression of the energy operator the momentum
operators of the single particles $\mathbf{p}_k$ by the differences
$$\mathbf{p}_k - \frac{e_k}{c}\,\mathbf{A}_k,$$
where $\mathbf{A}_k = \mathbf{A}(x_k, y_k, z_k, t)$ is the vector potential of the external field
at the point where the particle in question is supposed to be situated at
the instant $t$ under consideration.
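Putting further $U = \sum_k e_k\phi_k + U'$, where $\phi_k = \phi(x_k, y_k, z_k, t)$ is the
scalar potential of the external field at the point $(x_k, y_k, z_k)$, and
$U' = \sum_{i<k} e_i e_k / r_{ik}$ the mutual potential energy of the particles, we get
$$H = \sum_k \frac{1}{2m_k}\left(\mathbf{p}_k - \frac{e_k}{c}\,\mathbf{A}_k\right)^2 + \sum_k e_k\phi_k + U'. \tag{319}$$
In the case (usually met with in practice) where the square of $\mathbf{A}$ can
be neglected as well as div $\mathbf{A}$, this expression reduces to the form
$$H = \sum_k \frac{1}{2m_k}\,p_k^2 - \sum_k \frac{e_k}{m_k c}\,\mathbf{A}_k\cdot\mathbf{p}_k + \sum_k e_k\phi_k + U'. \tag{319a}$$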
We meet a much more difficult problem when we try to incorporate
in the energy operator terms representing the non-statical interaction
of the particles with each other. This problem can be solved
approximately if we neglect the retarded character of the electromagnetic
actions and define accordingly the vector potential produced by a
particle with a charge $e_i$ and velocity $\mathbf{v}_i$ at a distance $r_{ik}$ by the expression
$$\frac{e_i\mathbf{v}_i}{c\,r_{ik}}.$$
The total value of the vector potential $\mathbf{A}_k$ at a given point ($k$) is then
equal to the sum of the part $\mathbf{A}_k^0$ due to the external field and that
$\mathbf{A}'_k = \sum_{i \neq k} e_i\mathbf{v}_i/(c\,r_{ik})$ due to all the other particles. The total momentum of
the $k$th particle, $\mathbf{p}_k = m_k\mathbf{v}_k + (e_k/c)\mathbf{A}_k$, is thus given by the expression
$$\mathbf{p}_k = \sum_i g_{ki}\mathbf{v}_i + \frac{e_k}{c}\,\mathbf{A}_k^0, \tag{320}$$
where $g_{ki} = m_k$ if $i = k$ and $e_i e_k/(c^2 r_{ik})$ if $i \neq k$.
The corresponding expression for the total kinetic energy $T$ of the
whole system [equal to the sum of the ordinary kinetic energy $\sum_k \tfrac{1}{2}m_k v_k^2$
and of the mutual kinetic energy $T' = \tfrac{1}{2}\sum_k (e_k/c)\,\mathbf{v}_k\cdot\mathbf{A}'_k$] is
$$T = \frac{1}{2}\sum_i\sum_k g_{ik}\,\mathbf{v}_i\cdot\mathbf{v}_k. \tag{320a}$$
Putting $\mathbf{p}_k - (e_k/c)\mathbf{A}_k^0 = \mathbf{p}'_k$ and solving the equations $\mathbf{p}'_k = \sum_i g_{ki}\mathbf{v}_i$
with respect to the $\mathbf{v}_i$’s, we get $\mathbf{v}_i = \sum_k g^{ik}\mathbf{p}'_k$, where $g^{ik} = g^{-1}(\partial g/\partial g_{ik})$,
$g$ being the determinant $|g_{ik}|$, and
$$T = \frac{1}{2}\sum_i\sum_k g^{ik}\,\mathbf{p}'_i\cdot\mathbf{p}'_k. \tag{320b}$$
The classical Hamiltonian $H$ is equal to the sum of this expression and
the potential energy $U = \sum_k e_k\phi_k + U'$. The simplest way to obtain
the corresponding quantum Hamiltonian consists in replacing the $\mathbf{p}'$’s
in (320 b) by the operators $(h/2\pi i)\nabla - (e/c)\mathbf{A}^0$. Since, however, these
operators do not commute with the coefficients $g^{ik}$ we might just as
well write $\mathbf{p}'_i g^{ik}\mathbf{p}'_k$ instead of $g^{ik}\mathbf{p}'_i\mathbf{p}'_k$ or, more generally, $f^{-1}\mathbf{p}'_i g^{ik} f\,\mathbf{p}'_k$,
where $f$ is any function of the coordinates. If (following L. Landau) we
put $f = \sqrt{g}$ we obtain for the quantum $T$ an operator which can be
considered as a generalization of the ordinary Laplacian in a curved
space with the line-element $ds^2 = \sum_i\sum_k g_{ik}\,dq_i\,dq_k$.
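The classical step from velocities to momenta can be illustrated for two particles. The sketch below is an illustration with hypothetical function names and arbitrary sample values in arbitrary units; the external vector potential is put equal to zero, so that $\mathbf{p}'_k = \mathbf{p}_k$. It forms the coefficients $g_{ik}$, obtains $g^{ik}$ as cofactor over determinant, and compares the kinetic energy computed from the velocities with that computed from the momenta.

```python
def metric(m1, m2, e1, e2, r, c):
    """Coefficients g_ik for two particles: the masses on the diagonal and
    e_i e_k / (c**2 r_ik) off the diagonal."""
    g12 = e1 * e2 / (c * c * r)
    return [[m1, g12], [g12, m2]]

def inverse(g):
    """g^{ik} = (1/g) dg/dg_ik, i.e. cofactor over determinant (2 x 2 case)."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[1][0] / det],
            [-g[0][1] / det, g[0][0] / det]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def momenta(g, v):
    """p'_k = sum over i of g_ki v_i, component by component."""
    return [tuple(sum(g[k][i] * v[i][a] for i in range(2)) for a in range(3))
            for k in range(2)]

def kinetic(coefficients, vectors):
    """One half of the double sum of coefficients[i][k] * (vectors_i . vectors_k)."""
    return 0.5 * sum(coefficients[i][k] * dot(vectors[i], vectors[k])
                     for i in range(2) for k in range(2))

# Hypothetical sample values.
g = metric(m1=1.0, m2=2.0, e1=-1.0, e2=-1.0, r=0.5, c=10.0)
v = [(0.3, 0.1, -0.2), (-0.4, 0.2, 0.5)]
p = momenta(g, v)

t_from_velocities = kinetic(g, v)            # corresponds to (320a)
t_from_momenta = kinetic(inverse(g), p)      # corresponds to (320b)
t_mutual = g[0][1] * dot(v[0], v[1])         # the mutual kinetic energy T'
```

The two values of $T$ coincide, and the difference between $T$ and the ordinary kinetic energy is the mutual term $T'$.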
We shall now discuss some further complications of the theory of
a system of electrons, namely, those connected with the spin effect.
In the case of a single electron or proton this effect can be accounted
for approximately by introducing, in addition to the three space
coordinates of the particle $x$, $y$, $z$, a fourth ‘spin coordinate’ $\zeta$, able to
assume two values only. These values correspond, as we know, to two
opposite orientations of the spin parallel to a fixed direction, that of
the $z$-axis say, or, more exactly, to the two characteristic values of the
$z$-component of the spin matrix $\sigma_z$. We thus get, instead of a single
wave function $\psi(x, y, z)$ describing the motion of the particle in question,
a function doublet $\psi(x, y, z, \zeta)$ which can be dealt with as a linear
two-dimensional matrix with the elements $\psi_1(x, y, z) = \psi(x, y, z, 1)$ and
$\psi_2(x, y, z) = \psi(x, y, z, 2)$, 1 and 2 being the two values of $\zeta$. Instead of
these two values it is often more convenient to use $-\tfrac{1}{2}$ and $+\tfrac{1}{2}$, which
are equal to the respective values of the $z$-component of the spin angular
momentum expressed in the standard $h/2\pi$ unit.
The energy operator, as well as all the other operators referring to
the particle, must be defined accordingly as a square two-dimensional
matrix involving either the spin matrix $\boldsymbol{\sigma}$ or the unit matrix which is
equivalent to the square of any component of $\boldsymbol{\sigma}$.
These results can easily be generalized for a system of elementary
particles (electrons or protons) so long as their mutual action is neglected.
The wave function $\psi$ describing the behaviour of the whole system can
be defined as the product of the functions $\psi_k = \psi_k(x_k, y_k, z_k, \zeta_k)$ referring
to the individual particles ($k = 1, 2, \ldots, n$). The expression $\psi\psi^*$
multiplied by the volume-element of the configuration space $dV = dV_1 \ldots dV_n$
$= dx_1\,dy_1\,dz_1 \ldots dx_n\,dy_n\,dz_n$ is to be regarded as a measure of the
probability of finding the system in the corresponding configuration with
the specified values of the spin coordinates. The number of such
specified values is obviously equal to $2^n$, so that there are $2^n$ states
corresponding to each configuration and differing from each other by
the orientation of the separate particles inasmuch as this orientation is
specified by the characteristic values of $\sigma_z$. The total probability of a
given configuration, irrespective of the orientation of the particles, is
measured by the sum
$$\sum_{\zeta_1}\sum_{\zeta_2}\ldots\sum_{\zeta_n}\psi\psi^*$$
extended over the two possible values of each of the spin coordinates.
In the case of a motion belonging to a discrete spectrum this sum must
be normalized according to the condition
$$\int\sum_{\zeta}\psi\psi^*\,dV = 1,$$
the integration being extended over the whole configuration space.
With regard to the definition of $\psi$ we have
$$\int\sum_{\zeta}\psi\psi^*\,dV = \int\sum_{\zeta_1}\psi_1\psi_1^*\,dV_1 \ldots \int\sum_{\zeta_n}\psi_n\psi_n^*\,dV_n,$$
$$\int\sum_{\zeta_k}\psi_k\psi_k^*\,dV_k = \int(\psi_{k1}\psi_{k1}^* + \psi_{k2}\psi_{k2}^*)\,dV_k = 1,$$
where
$$\psi_{k\zeta_k}(x_k, y_k, z_k) = \psi_k(x_k, y_k, z_k, \zeta_k) \qquad (\zeta_k = 1, 2).$$
The product $\psi$ considered as a function of the space coordinates alone
can be dealt with as a linear matrix of $2^n$ dimensions
$$\psi_{\zeta_1\zeta_2\ldots\zeta_n} = \psi_{1\zeta_1}\psi_{2\zeta_2}\ldots\psi_{n\zeta_n}.$$
This involves the use of operators which should be defined as square
matrices of the same rank. Such an operator, $F$ say, can be defined
by the equation
$$(F\psi)_{\zeta'} = \sum_{\zeta''} F_{\zeta'\zeta''}\,\psi_{\zeta''},$$
where $F_{\zeta'\zeta''}$ is an operator of the ordinary kind with respect to the
space coordinates $x_1, \ldots, z_n$ and the corresponding momenta, specified by
two sets of particular values of the spin coordinates, $\zeta' = (\zeta'_1, \ldots, \zeta'_n)$
and $\zeta'' = (\zeta''_1, \ldots, \zeta''_n)$. Each of the individual wave functions $\psi_k$
satisfies the matrix-operator equation
$$(H_k + \delta_k p_t)\psi_k = 0,$$
where $\delta_k$ is the two-dimensional unit matrix referring to the $k$th particle
(with the elements $\delta_{\zeta'_k\zeta''_k}$). The factorized wave function $\psi$ is easily seen
to satisfy an equation of the same type,
$$(H + \delta p_t)\psi = 0, \tag{321}$$
where $\delta$ is the $2^n$-dimensional unit matrix with the elements
$$\delta_{\zeta'\zeta''} = \delta_{\zeta'_1\zeta''_1}\delta_{\zeta'_2\zeta''_2}\ldots\delta_{\zeta'_n\zeta''_n}, \tag{321a}$$
and $H$ the energy operator defined by the formula
$$H_{\zeta'\zeta''} = H_{1\zeta'_1\zeta''_1}\delta_{\zeta'_2\zeta''_2}\ldots\delta_{\zeta'_n\zeta''_n} + \delta_{\zeta'_1\zeta''_1}H_{2\zeta'_2\zeta''_2}\ldots\delta_{\zeta'_n\zeta''_n} + \ldots + \delta_{\zeta'_1\zeta''_1}\delta_{\zeta'_2\zeta''_2}\ldots H_{n\zeta'_n\zeta''_n}, \tag{321b}$$
$H_{k\zeta'_k\zeta''_k}$ being the elements of the ordinary two-dimensional matrix
operator referring to the $k$th particle.
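The structure of (321 a) and (321 b) is that of a sum of direct (Kronecker) products. The sketch below is an illustration under a simplifying assumption that is not in the text: the dependence on the space coordinates is ignored, so that each $H_k$ is a plain $2 \times 2$ matrix of numbers acting on the spin index of one particle. All names and sample values are hypothetical.

```python
def kron(a, b):
    """Kronecker (direct) product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def total_operator(single):
    """Sum over k of 1 x ... x H_k x ... x 1, acting on the spin indices only."""
    dim = 2 ** len(single)               # every single operator is 2 x 2
    total = [[0.0] * dim for _ in range(dim)]
    for k in range(len(single)):
        term = [[1.0]]
        for l, h in enumerate(single):
            term = kron(term, h if l == k else identity(len(h)))
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
    return total

def product_state(vectors):
    """Components psi_{z1 z2 ... zn} = psi_{1 z1} psi_{2 z2} ... psi_{n zn}."""
    state = [1.0]
    for v in vectors:
        state = [s * c for s in state for c in v]
    return state

def apply(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Three hypothetical single-particle spin operators with known eigenvectors.
h1, v1, e1 = [[0.3, 0.0], [0.0, -0.3]], [1.0, 0.0], 0.3
h2, v2, e2 = [[0.0, 0.2], [0.2, 0.0]], [0.5 ** 0.5, 0.5 ** 0.5], 0.2
h3, v3, e3 = [[0.1, 0.0], [0.0, -0.1]], [0.0, 1.0], -0.1

H = total_operator([h1, h2, h3])
psi = product_state([v1, v2, v3])
residual = max(abs(a - (e1 + e2 + e3) * b) for a, b in zip(apply(H, psi), psi))
```

The product of the single-particle eigenvectors is an eigenvector of the total operator, with the sum of the single-particle values as its characteristic value; the matrix has $2^n$ rows, here 8.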
Equation (321) can naturally be extended to functions $\psi$ of a more
general type, equal, for instance, to a sum of particular solutions of the
simple product type. It can be further generalized in order to account
for the mutual action of the particles by adding to it terms representing
the interaction energy multiplied by the unit matrix (321 a). (In
problems of the atomic theory involving only a small number of electrons
the mutual kinetic energy $T'$ can be neglected.) There remains,
however, still one step in this generalization, which consists in the addition
to the interaction energy of terms characteristic of the spin effect. We
can solve this problem in a tentative way with the help of the
approximate theory of § 30. We found there that the additional ‘spin’ force
acting on a particle (electron) in a given electromagnetic field $\mathbf{E}$, $\mathbf{H}$ can
be derived from the energy operator
$$\mu\left[\boldsymbol{\sigma}\cdot\mathbf{H} + \frac{1}{2m_0c}\,\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma})\right]$$
[cf. equation (261 a), where $\mathbf{u}$ is replaced by $\mathbf{p}$]. It is natural to suppose
that this result will still be valid for a system of particles, if $\mathbf{H}$ and $\mathbf{E}$
are defined as the total field acting on the given particle due both to
external causes and to other particles constituting the system. The
field $\mathbf{E}$, $\mathbf{H}$ produced by a certain particle at a distance $r$ can be derived
in the usual way from the potentials $\phi$, $\mathbf{A}$ defined by the following
formulae:
$$\phi = \frac{e}{r}, \qquad \mathbf{A} = \frac{e}{m_0cr}\,\mathbf{p} + \frac{\mu}{r^3}\,\boldsymbol{\sigma}\times\mathbf{r}.$$
The first term in $\mathbf{A}$ represents the ordinary electromagnetic field of
a moving point-charge, while the second is introduced as an equivalent
for the field produced by an elementary magnet with a moment $\mu\boldsymbol{\sigma}$.
Neglecting the electric field due to the variation of $\mathbf{A}$ with the time,
i.e. putting
$$\mathbf{E} = -\nabla\phi = \frac{e}{r^3}\,\mathbf{r}$$
and
$$\mathbf{H} = \operatorname{curl}\mathbf{A} = -\frac{e}{m_0cr^3}\,\mathbf{r}\times\mathbf{p} + \frac{\mu}{r^3}\left[\frac{3}{r^2}\,\mathbf{r}(\boldsymbol{\sigma}\cdot\mathbf{r}) - \boldsymbol{\sigma}\right],$$
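we get for the operator of the spin interaction energy the following
expression:
$$U_s = U'_s + U''_s, \tag{322}$$
$$U'_s = \frac{\mu e}{m_0c}\sum_k\sum_{i \neq k}\frac{1}{r_{ki}^3}\,\boldsymbol{\sigma}_k\cdot\left[\mathbf{r}_{ki}\times\left(\tfrac{1}{2}\mathbf{p}_k - \mathbf{p}_i\right)\right] \tag{322a}$$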
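and
$$U''_s = \mu^2\sum_{k<i}\frac{1}{r_{ki}^3}\left[\frac{3}{r_{ki}^2}\,(\boldsymbol{\sigma}_k\cdot\mathbf{r}_{ki})(\boldsymbol{\sigma}_i\cdot\mathbf{r}_{ki}) - \boldsymbol{\sigma}_k\cdot\boldsymbol{\sigma}_i\right]. \tag{322b}$$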
In deriving the term $U'_s$ which represents the linear or electromagnetic
part of the spin interaction energy we have simply summed up the
contributions of all the particles concerned ($\mathbf{r}_{ki}$ denoting the radius
vector from the $i$th particle to the $k$th), whereas in deriving the
quadratic or purely magnetic part of the energy $U''_s$, which is
symmetrical with regard to each pair of particles, we have taken each pair
only once (as indicated by the condition $k < i$).
It should be noticed further that in adding $U_s$ to the Hamiltonian
$H$ in (321) we must multiply it by the unit matrix (321 a). This amounts
to the multiplication of each term by those two-dimensional unit
matrices only which refer to other particles than those represented by
the matrices $\boldsymbol{\sigma}$. Dropping these unit matrices and neglecting the
mutual kinetic energy we can write the total Hamiltonian in the form
$$H = \sum_k H_k + U' + U_s, \tag{323}$$
where $U_s$ is defined by (322), while
(323 a)
$H_k$ is the energy operator for the $k$th particle, $\mathbf{A}_k$, $\phi_k$, $\mathbf{H}_k$, and $\mathbf{E}_k$ denoting
the potentials and intensities of the external electromagnetic field at
the point $(x_k, y_k, z_k)$. If this field does not depend upon the time, the
equation $(H + \delta p_t)\psi = 0$ admits solutions of the type $\psi = \psi^0 e^{-i2\pi H't/h}$
corresponding to a motion of the system with a fixed energy $H'$, the
function $\psi^0$ being defined by $(H - \delta H')\psi^0 = 0$. To each state or
energy-level defined by the approximate equation to which it reduces if the
spin effect of all the particles is neglected, there correspond, in general,
$2^n$ different states with slightly different energy-levels, which form what
is called a ‘spin multiplet’. The theory of such multiplets for the
simplest case of a single particle has been discussed in the preceding
chapter. The general results stated there (§ 29) about the orthogonality
properties of the functions $\psi^0$, the matrix and supermatrix representation
of various physical quantities, the perturbation theory, and so on
can easily be extended to the case $n > 1$. We shall not discuss these
questions here, but shall leave some of them for a later section where
they will be considered in connexion with Pauli’s exclusion principle for
identical elementary particles (electrons or protons).
The method which has been applied above for the description of the
spin effect characteristic of such particles can be used in a somewhat
generalized form for the description of the orientation or inner states
of complicated particles (such as atomic nuclei or whole atoms, etc.)
so long as they are treated as moving material points.
Let us consider, for example, a particle possessing an inner angular
momentum (which may be due both to orbital and spin motion of the
electrons and protons constituting it) of $s$ units. Such a particle can
assume $2s + 1$ quantized orientations, corresponding to the values
$$m = -s,\ -(s-1),\ -(s-2),\ \ldots,\ +(s-1),\ +s \tag{324}$$
of the $z$-component of $\mathbf{s}$. These numbers can be defined as the
characteristic values of a matrix $\sigma_z$, which is the $z$-component of a matrix $\boldsymbol{\sigma}$
representing the inner angular momentum of the particle in question
(in units of $h/2\pi$). The matrix elements of $\sigma_x$, $\sigma_y$, and $\sigma_z$ are defined
by the equations
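    $$(\sigma_x + i\sigma_y)_{m+1,m} = \sqrt{(s+\tfrac12)^2 - (m+\tfrac12)^2}\; e^{i\alpha_m},$$
    $$(\sigma_x - i\sigma_y)_{m,m+1} = \sqrt{(s+\tfrac12)^2 - (m+\tfrac12)^2}\; e^{-i\alpha_m}, \qquad (324\,a)$$
    $$(\sigma_z)_{mm} = m,$$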
which are obtained from the equations (94 b), (96), and (96 a) of § 13
(Chap. III), if $M$ is replaced by $hs/2\pi$ and $l$ by $s$. The motion of
such a particle in a given external field of force can be described in
exactly the same way as has been done above for the particular case
$s = \tfrac12$, namely, by introducing, in addition to the 'external'
coordinates $x, y, z$ defining the position of the particle's centre of
gravity, an 'inner' angular momentum coordinate $\zeta$, which should
assume the values $1, 2, \dots, 2s+1$, corresponding to the characteristic
values (324) of $\sigma_z$. If, moreover, the additional energy of the
particle in a magnetic field $\mathfrak{H}$ is represented by the operator
$\mu\,\mathfrak{H}\cdot\boldsymbol{\sigma}$, we get a direct generalization
of the Pauli theory of the spin effect, discussed in § 29. A similar
generalization is obtained if we consider a system of particles, such as
electrons and atomic nuclei, which differ from each other not only with
respect to the charge and mass, but also in respect to the inner momentum
number $s$ or the multiplicity $2s+1$. A problem of this sort is met with,
for instance, in connexion with the hyperfine structure of atomic
spectra, due to the fact that the nuclei of many atoms actually possess
an inner angular momentum and a very small magnetic moment associated
with it. The magnetic field produced by the latter can be specified by
a vector potential of the same form,
    $$\mathbf{A} = \frac{\mu}{r^3}\,\boldsymbol{\sigma}\times\mathbf{r},$$
as for an electron (or proton), giving rise to an interaction energy of
the type (322 a, b), with $\boldsymbol{\sigma}_k$ denoting matrices of
various ranks (2 for an electron; 1, 2, 3, etc., for a nucleus).
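The construction of the matrices for an inner momentum of $s$ units can be checked numerically. The sketch below is only illustrative: the function name `spin_matrices`, the ordering of the basis ($m = s, s-1, \dots, -s$) and the choice of setting every arbitrary phase factor equal to 1 are assumptions made here, not part of the text. It builds the matrices from the magnitudes $\sqrt{(s+\tfrac12)^2-(m+\tfrac12)^2}$ and the diagonal values (324).

```python
import numpy as np

def spin_matrices(s):
    """Return (sx, sy, sz) for an inner momentum of s units (units of h/2pi).

    Assumptions of this sketch: basis ordered as m = s, s-1, ..., -s,
    and all arbitrary phase factors set equal to 1.
    """
    dim = int(round(2 * s + 1))          # multiplicity 2s + 1
    m = s - np.arange(dim)               # characteristic values (324)
    sz = np.diag(m).astype(complex)

    # (sx + i*sy) has the only non-vanishing elements (m+1, m),
    # equal to sqrt((s + 1/2)^2 - (m + 1/2)^2).
    raise_op = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        mm = m[col]
        raise_op[col - 1, col] = np.sqrt((s + 0.5) ** 2 - (mm + 0.5) ** 2)
    lower_op = raise_op.conj().T         # (sx - i*sy), Hermitian conjugate

    sx = (raise_op + lower_op) / 2
    sy = (raise_op - lower_op) / (2j)
    return sx, sy, sz

# Example: a particle with s = 3/2 has four orientations.
sx, sy, sz = spin_matrices(1.5)
commutator = sx @ sy - sy @ sx           # should equal i * sz
square = sx @ sx + sy @ sy + sz @ sz     # should equal s(s+1) times unity
```

For any admissible $s$ the commutator reproduces $i\sigma_z$ and the square of the matrix vector is $s(s+1)$ times the unit matrix, which is the property used later for the resulting spin of several electrons.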
These considerations show, by the way, that an electron can be
visualized not as a point but as a spinning sphere, according to the
classical model, in spite of the fact that in the Pauli or Dirac theory
it is treated as a point.
39. Complex Particles treated as Material Points with Inner
Coordinates; Theory of Incomplete Systems
Complex particles can be treated as elementary, i.e. as material points,
if inner coordinates and momenta are introduced to specify their
orientation, the total value of the inner angular momentum, if it is
variable, as well as other quantities serving to describe their inner
properties.
Let us denote by $x$ $(x, y, z)$ the coordinates of the centre of gravity
of the particle, the coordinates specifying the relative motion of the
elementary particles (electrons, protons) constituting it being denoted
by $q$ $(q_1, q_2, \dots)$. Let us divide further the energy operator $H$
into three parts, $K$, $L$, $M$ (so that $H = K + L + M$), where $K$ is
a function of the $x$'s (and of the associated momenta represented by the
operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}$), $L$ a function of
the $q$'s (and of the associated momenta
$\frac{h}{2\pi i}\frac{\partial}{\partial q}$ as well as of the spin
variables), while $M$ is a function of both. We shall assume them all to
be independent of the time and shall denote the characteristic values and
functions of $L$ by $L'$ and $\chi_{L'}(q)$ respectively.
The solution of the equation $(H - H')\psi_{H'} = 0$ for a stationary
state of the complex particle (supposed to move in a given external field
of force) can be represented in the form
    $$\psi_{H'} = \sum_{L'} \varphi_{L'}(x)\,\chi_{L'}(q), \qquad (325)$$
where $\varphi_{L'}(x)$ are certain expansion coefficients with regard to
the variables $q$, being themselves functions of the variables $x$. These
functions can be determined by substituting (325) in the equation
$(H - H')\psi_{H'} = 0$, which gives
    $$\sum_{L'} (K + L' + M - H')\,\varphi_{L'}\,\chi_{L'} = 0.$$
Now the operator $M$ applied to the function $\chi_{L'}$ and thereafter
to the function $\varphi_{L'}$ gives the same result as the operator
    $$\sum_{L''} M_{L''L'}\,\chi_{L''}$$
acting directly on $\varphi_{L'}$, where
    $$M_{L''L'} = \int \chi^{*}_{L''}\, M\, \chi_{L'}\, dq$$
are the matrix elements of $M$ with regard to the characteristic inner
states of our complex particle (these matrix elements are functions of
the $x$ and, in general, of the associated operators
$\frac{h}{2\pi i}\frac{\partial}{\partial x}$). We thus get
    $$\sum_{L'} (K + L' - H')\,\varphi_{L'}\,\chi_{L'} + \sum_{L'}\sum_{L''} M_{L''L'}\,\varphi_{L'}\,\chi_{L''} = 0,$$
or, interchanging the summation indices in the double sum and equating
to zero the coefficients of the functions $\chi_{L'}$,
    $$K\varphi_{L'} + \sum_{L''} M_{L'L''}\,\varphi_{L''} = (H' - L')\,\varphi_{L'}. \qquad (325\,a)$$
The system of equations can be written in the form of a single
'operator-matrix' equation
    $$J\varphi = J'\varphi \qquad (325\,b)$$
if $\varphi$ is defined as a one-column matrix with the elements
$\varphi_{L'}$ and $J$ as a square matrix operator with the elements
    $$J_{L'L''} = K\,\delta_{L'L''} + M_{L'L''}, \qquad (325\,c)$$
$\delta_{L'L''}$ denoting the unit matrix and $J\varphi$ the one-column
matrix resulting from the matrix multiplication of $J$ by $\varphi$;
$J' = H' - L'$ are the characteristic values of $J$. We can also regard
$\varphi$ as a vector and $J$ as a tensor in the state-space,
corresponding to the inner motion (and orientation) of the particle under
consideration, and specified by the quantum numbers $L'$ (which must
include besides the energy other constants of the inner motion). We can
finally regard $L'$ as a sort of 'inner' coordinate (or coordinates) of
the particle so long as it is treated as a material point, in the same
sense as this is done in Pauli's or Dirac's theory of the spinning
electron, with the only difference that the number of possible values of
$L'$ is in general infinite, instead of being equal to 2 (as in Pauli's
theory) or to 4 (as in that of Dirac). The 'inner' quantum numbers
corresponding to these additional coordinates in the functions
$\varphi(x, L')$, compared with the functions $\varphi_{K'}(x)$ which are
the solutions of the 'unperturbed' equation $(K - K')\varphi_{K'}(x) = 0$,
can be represented by the values of the difference $J' - K'$ for the same
value of $K'$.
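The passage from the complete equation to the system (325 a) can be illustrated on a finite-dimensional model. The sketch below is an assumption-laden toy: $K$, $L$, and $M$ are replaced by random Hermitian matrices of small size, and the names `random_hermitian` and `operator_matrix` are introduced here for illustration only. It forms the blocks $M_{L'L''}$ with respect to the characteristic vectors of $L$ and assembles the matrix with elements $(K + L')\delta_{L'L''} + M_{L'L''}$, that is, (325 a) with the term $L'\varphi_{L'}$ moved to the left side, so that its characteristic values are the energies $H'$.

```python
import numpy as np

def random_hermitian(n, rng):
    """Random Hermitian matrix standing in for an energy operator."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def operator_matrix(K, L, M):
    """Matrix with blocks (K + L') delta_{L'L''} + M_{L'L''}.

    K acts on the outer coordinates x, L on the inner coordinates q,
    M on both (product basis ordered as (x, q), as produced by np.kron).
    """
    nx, nq = K.shape[0], L.shape[0]
    L_vals, chi = np.linalg.eigh(L)            # columns chi[:, a] = chi_{L'}
    M4 = M.reshape(nx, nq, nx, nq)             # M4[x, q, x', q']
    # Block (a, b) is the operator M_{L'L''} acting on the x variables.
    blocks = np.einsum('qa,xqyr,rb->axby', chi.conj(), M4, chi)
    J = blocks.copy()
    for a in range(nq):
        J[a, :, a, :] += K + L_vals[a] * np.eye(nx)
    return J.reshape(nq * nx, nq * nx)

rng = np.random.default_rng(0)
nx, nq = 5, 3                                   # illustrative sizes
K = random_hermitian(nx, rng)
L = random_hermitian(nq, rng)
M = 0.1 * random_hermitian(nx * nq, rng)

H = np.kron(K, np.eye(nq)) + np.kron(np.eye(nx), L) + M   # complete operator
J_full = operator_matrix(K, L, M)

levels_complete = np.linalg.eigvalsh(H)
levels_reduced = np.linalg.eigvalsh(J_full)     # same energy-levels H'
```

The two sets of levels coincide, since the reduction amounts to a change of representation for the inner variables only; nothing is lost by treating the particle as a point with an inner coordinate $L'$.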
The different solutions of the equation
    $$J\varphi_{J'} = J'\varphi_{J'},$$
i.e. solutions referring to different values of $J'$, if quadratically
integrable, are orthogonal to each other and can be normalized according
to the equation
    $$\int \varphi^{\dagger}_{J'}\,\varphi_{J''}\,dx = \delta_{J'J''}, \qquad (326)$$
where $\varphi^{\dagger}_{J'}$ is the one-row matrix formed by the elements
which are the conjugate complex of those constituting the one-column
matrix $\varphi_{J'}$. Introducing $L'$ as an inner coordinate, we can
rewrite the preceding equation in the form
    $$\sum_{L'} \int \varphi^{*}_{J'}(x, L')\,\varphi_{J''}(x, L')\,dx = \delta_{J'J''}. \qquad (326\,a)$$
This result easily follows from the self-adjoint character of the operator
matrix $J$, which in its turn is a consequence of the self-adjoint
character of the complete Hamiltonian $H$.
All quantities referring to the translational motion of the particle
under consideration must be represented by operator-matrices of the type
    $$F_{L'L''}\!\left(x, \frac{h}{2\pi i}\frac{\partial}{\partial x}\right) = F\!\left(x, L';\ \frac{h}{2\pi i}\frac{\partial}{\partial x}, L''\right),$$
the inner coordinates appearing twice: in the role of ordinary
coordinates, and in that of the momenta. The matrix element of such
a quantity with regard to two states of motion, specified by the
functions $\varphi_{J'}$ and $\varphi_{J''}$, is given by the expression
    $$F_{J'J''} = \int \varphi^{\dagger}_{J'} F \varphi_{J''}\,dx = \int \sum_{L'}\sum_{L''} \varphi^{*}_{J'}(x, L')\,F(L', L'')\,\varphi_{J''}(x, L'')\,dx. \qquad (326\,b)$$
This expression is a generalization of those appearing in the theory of
Pauli and Dirac, with the inner ('spin') coordinate assuming two or
four values only.
Let us suppose, for example, that the particle is an ion (charge $e$,
mass $m$) moving in an electrostatic field, which within the particle can
be dealt with as practically homogeneous and equal to
$\mathbf{E} = -\nabla V(x, y, z)$, where $V(x, y, z)$ is the electric
potential at the point (centre of gravity) representing the particle. We
then have, by the ordinary Schrödinger theory,
    $$K = -\frac{h^2}{8\pi^2 m}\nabla^2 + eV(x, y, z),$$
as for an elementary particle with a charge $e$ and a mass $m$, and
further
    $$M = -\mathbf{E}(x, y, z)\cdot\mathbf{P}(q),$$
where $\mathbf{P}$ is the resulting electric moment of the particle, the
position of the electrons and protons being referred to the point
$(x, y, z)$. The operator $L$ which specifies the inner motion of the
particle, in the absence of the external electric field, need not be
considered here. All we need to know are the matrix elements of
$\mathbf{P}$ with regard to the stationary states representing this inner
motion, the translational motion being determined by an equation of the
type (325 a) with
    $$M_{L'L''} = -\mathbf{E}(x)\cdot\mathbf{P}_{L'L''}.$$
For a particle moving in an inhomogeneous magnetic field (a problem
met with, for example, in the Stern-Gerlach experiments), we get in
a similar manner
    $$M_{L'L''} = -\mathfrak{H}(x)\cdot\boldsymbol{\mu}_{L'L''},$$
$\boldsymbol{\mu}_{L'L''}$ being the matrix elements of the resulting
magnetic moment of the particle.
The preceding theory can be easily extended to the general case of
a system of complex particles, considered as material points, or to the
still more general case of any 'incomplete' system $A$, which is a part
of a complete system $A + B$, specified by the Hamiltonian $H$. If the
part of $H$ corresponding to $A$ taken alone is denoted by $K$, that
corresponding to $B$ by $L$, and the rest, representing the mutual action
or 'coupling' between $A$ and $B$, by $M$, we obtain for the motion of $A$
the same results as before, the coordinates $x$ specifying in the general
case the configuration of $A$, and $\varphi(x, L')$ being the probability
amplitude of this configuration for a given stationary state $L'$ of $B$.
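In the case of two particles, for example, we have, denoting by
$x_1, x_2$ the coordinates of the respective centres of gravity and by
$q_1, q_2$ the inner coordinates,
    $$\chi_{L'}(q) = \chi_{L_1'}(q_1)\,\chi_{L_2'}(q_2), \qquad (327)$$
since the operator of the inner motion (without interaction) $L$ obviously
reduces to the sum of the corresponding operators $L_1$ and $L_2$ for each
of the two particles taken separately. Putting further
    $$\varphi_{L'}(x) = \varphi_{L_1'L_2'}(x_1, x_2), \qquad (327\,a)$$
we obtain for $\varphi$ an equation of the same type as before. If the two
particles are treated with regard to their mutual action as electrical
dipoles, their mutual potential energy will be represented by the
operator
    $$M = \frac{1}{r^3}\,\mathbf{P}_1\cdot\mathbf{P}_2 - \frac{3}{r^5}\,(\mathbf{r}\cdot\mathbf{P}_1)(\mathbf{r}\cdot\mathbf{P}_2),$$
where $\mathbf{r}$ is the radius vector drawn from one particle to the
other (with the components $x_1 - x_2$, etc.), whence
    $$M_{L'L''} = \frac{1}{r^3}\,\mathbf{P}_{1,L_1'L_1''}\cdot\mathbf{P}_{2,L_2'L_2''} - \frac{3}{r^5}\,(\mathbf{r}\cdot\mathbf{P}_{1,L_1'L_1''})(\mathbf{r}\cdot\mathbf{P}_{2,L_2'L_2''}). \qquad (327\,b)$$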
It should be noted that in spite of the incompleteness of the system
$A$, specified by the energy operator $K + M$ which represents its own
energy and the action on it produced by the 'ignored' part $B$, the motion
of $A$ is exactly determined if the operator $M$ is defined as a matrix
with regard to the stationary states of $B$. This method of describing the
motion of an incomplete system $A$ is especially convenient if its
coupling with $B$ is relatively weak and if for some reason we are not
concerned with the details of the motion of $B$. As a further example of
a (rather unconscious) application of this method we shall mention Fermi's
theory of the hyperfine structure of spectra, due to the mutual action of
an electron ($A$) with a nucleus ($B$) possessing a magnetic moment. The
motion of the electron is determined in this theory with the help of
Dirac's equation, the action of the nuclear magnetic moment on the
electron being represented by the vector potential
$\mathbf{A} = \frac{\mu}{r^3}\,\boldsymbol{\sigma}\times\mathbf{r}$, where
$\boldsymbol{\sigma}$ is the well-known matrix of rank $2s+1$, specifying
the angular momentum of the nucleus $hs/2\pi$. The wave function $\psi$
must be treated accordingly as a rectangular matrix with four columns
(corresponding to the four components of the Dirac wave function) and
$2s+1$ rows.
We shall discuss later another interesting application of the same
method (due to Heisenberg) to the problem of the interaction between
matter ($A$) and radiation ($B$), the latter being described by ordinary
electromagnetic oscillations, whose amplitudes are treated as matrices
(Chap. IX).
If the interaction energy $M$ is relatively small so that the second
term on the left side of the equation,
    $$K\varphi(x, L') + \sum_{L''} M(L', L'')\,\varphi(x, L'') = J'\varphi(x, L'),$$
can be treated as a small perturbation, this equation can be solved
approximately with the help of the ordinary perturbation method
starting with the solution of the equation which is obtained by dropping
the term $M$. More exactly, since our problem becomes degenerate, we
must consider the whole set of solutions corresponding to the same
unperturbed energy-level $H' - L' = K'$. Writing $(K', \bar L')$ for $J'$,
where $\bar L'$ denotes an inner quantum number independent of $L'$ but
identical with it in regard to the range of its possible values,† we can
define an orthogonal and normal set of solutions of the unperturbed
equation $K\varphi = K'\varphi$ by the formula
    $$\varphi_{K'\bar L'}(x, L') = \omega_{K'}(x)\,\delta_{\bar L'L'}, \qquad (328)$$
where $\omega_{K'}(x)$ denotes the solution of the above equation leaving
out of account the inner coordinates, while $\delta_{\bar L'L'}$ are the
elements of the unit matrix. The function $\omega_{K'}(x)$ is supposed to
be normalized according to the ordinary condition
$\int |\omega_{K'}(x)|^2\,dx = 1$; it is supposed, moreover, to be the only
solution of the ordinary Schrödinger equation $K\omega = K'\omega$
corresponding to the energy-level $K'$ (so that no further degeneracy
outside of that which is specified by the quantum numbers $\bar L'$ need
be considered).

† In the same sense as the spin coordinate $\zeta = 1, 2$ and the spin
quantum number, which likewise takes the values 1, 2, for a single
electron of the Pauli theory (cf. § 29).
The approximate solutions of the exact equation, 'stabilized' for the
perturbation $M$, can be defined, according to the general theory, as
linear combinations of the functions (328)
    $$\varphi_{J'}(x, L') = \sum_{\bar L'} c_{\bar L'}\,\varphi_{K'\bar L'}(x, L'). \qquad (328\,a)$$
The sum reduces in the present case to a single term, so that we get
    $$\varphi_{J'}(x, L') = c_{L'}\,\omega_{K'}(x). \qquad (328\,b)$$
If $M$ were an ordinary operator not involving the inner coordinates,
then the coefficients of the transformation (328 a) for each admissible
value of the perturbation energy $H' - L' - K' = \Delta K'$ (together with
the latter) would be determined by the system of equations
    $$\sum_{\bar L''} M_{\bar L'\bar L''}\,c_{\bar L''} = \Delta K'\,c_{\bar L'},$$
where $M_{\bar L'\bar L''}$ are the matrix elements of $M$ with respect to
the unperturbed functions. These equations remain valid in the present
case provided the matrix elements of $M$ are defined according to the
general formula (326 b), which gives, in virtue of (328),
    $$M_{\bar L'\bar L''} = \int \omega^{*}_{K'}(x)\,M(\bar L', \bar L'')\,\omega_{K'}(x)\,dx.$$
Denoting this expression by $M_{L'L''}$, i.e. dropping the bars over the
$L$'s, we get
    $$\sum_{L''} M_{L'L''}\,c_{L''} = \Delta K'\,c_{L'}. \qquad (329)$$
We shall not stop here to discuss these equations, since they are
practically identical with those of the ordinary perturbation theory.
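A small numerical sketch may make the use of the system (329) more concrete. Everything specific in it is an assumption chosen for illustration: the outer operator $K$ is a 4-by-4 tridiagonal matrix, the coupling has the product form $M = g\,F(x)\,A$ with $A$ acting on three degenerate inner states (so that $L$ is taken as zero), and the function names are hypothetical. The first-order shifts $\Delta K'$ are the characteristic values of the matrix $\int \omega^{*} M \omega\,dx$, and they are compared with the lowest levels of the complete operator.

```python
import numpy as np

def first_order_shifts(K, F, A, g):
    """Unperturbed level K' and the shifts Delta K' from the system (329)."""
    k_vals, k_vecs = np.linalg.eigh(K)
    omega = k_vecs[:, 0]                     # normalized omega_{K'}(x)
    f_mean = omega.conj() @ F @ omega        # integral of omega* F omega
    M_bar = g * f_mean * A                   # matrix elements M_{L'L''}
    return k_vals[0], np.linalg.eigvalsh(M_bar)

def exact_lowest_levels(K, F, A, g):
    """Lowest levels of the complete operator K + g F A (inner L = 0)."""
    nq = A.shape[0]
    H = np.kron(K, np.eye(nq)) + g * np.kron(F, A)
    return np.linalg.eigvalsh(H)[:nq]

# Illustrative data (assumptions of this sketch).
K = np.diag([0.0, 1.0, 2.0, 3.0]) + 0.3 * (np.eye(4, k=1) + np.eye(4, k=-1))
F = np.diag([-1.0, 0.0, 1.0, 2.0])           # field factor depending on x
sz = np.diag([1.0, 0.0, -1.0])
sx = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]) / np.sqrt(2)
A = sz + 0.5 * sx                            # inner (orientation) factor
g = 1e-3                                     # weak coupling

K0, dK = first_order_shifts(K, F, A, g)
approx = np.sort(K0 + dK)
exact = exact_lowest_levels(K, F, A, g)
```

For a coupling as weak as the one assumed here the two sets of levels differ only by terms of the second order in $g$.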
It should be added in conclusion that the preceding theory can easily
be generalized for non-stationary phenomena corresponding to an explicit
dependence of the energy operator $H$ upon the time. So long as this
dependence does not affect the operator $L$, it is sufficient to replace
the characteristic value $H'$ of $H$ in (325 a) by the operator
$-p_t = -\frac{h}{2\pi i}\frac{\partial}{\partial t}$, the functions
$\varphi_{L'}$ being determined by the equation
    $$K\varphi_{L'} + \sum_{L''} M_{L'L''}\,\varphi_{L''} = -\left(L' + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\varphi_{L'}. \qquad (329\,a)$$
40. Identical Particles (Electrons) and the Exclusion Principle
Returning to elementary particles, we shall now take into account the
restrictive condition which follows from the identity of all the electrons
or all the protons and which is expressed by Pauli's exclusion principle
or by the Dirac antisymmetry principle for the wave functions $\psi$
describing the behaviour of a system of electrons or protons (see § 22,
Part I). For the sake of simplicity we shall apply this principle to
a system of electrons only, treating protons and atomic nuclei as fixed
centres of force. Such a treatment can actually be applied with sufficient
accuracy to many problems connected with the structure of atoms,
molecules, and material bodies; for in view of the relatively large mass
of the atomic nuclei, protons included, they can be dealt with to a
certain approximation as fixed material points, producing the external
electrostatic (and also magnetostatic) field in which the electrons are
supposed to move.
We must, to begin with, check the validity of the Pauli principle in
Dirac's form, in the sense of its permanence in time, from the point
of view of the generalized equation of motion, involving the spin
coordinates, which has been established in the preceding chapter.†
This equation can be written in the following form:
    $$\sum_{\zeta'} H(x_1, \zeta_1, \dots, x_n, \zeta_n;\ p_1, \zeta_1', \dots, p_n, \zeta_n')\,\psi(x_1, \zeta_1', \dots, x_n, \zeta_n', t) = -\frac{h}{2\pi i}\frac{\partial}{\partial t}\,\psi(x_1, \zeta_1, \dots, x_n, \zeta_n, t), \qquad (330)$$
i.e. as a system of $2^n$ equations for the set of $2^n$ wave functions
$\psi$, where $x_k$ and $p_k$ stand short for the coordinate triplets
$x_k, y_k, z_k$ and the momentum components $p_{kx}, p_{ky}, p_{kz}$. The
space coordinates of each particle, together with its spin coordinate,
form a coordinate quadruplet; the same is true of the momenta, the
momentum corresponding to the spin coordinate being replaced by
a duplication of the latter, which gives to $H$ its operator-matrix
character.
In view of the identity of all the electrons, $H$ must be a symmetrical
function with regard to the indices 1, 2,... distinguishing them. If,
therefore, the wave function $\psi$ is symmetrical or antisymmetrical with
regard to these indices, i.e. with regard to all the coordinate
quadruplets, at some instant $t$, its derivative $\partial\psi/\partial t$,
and consequently its value for the next (or preceding) instant, will be so
too. The symmetrical or antisymmetrical character of $\psi$ can be
regarded therefore as a permanent property. The fact that for a system of
electrons (or protons) antisymmetrical wave functions only must be used to
interpret the experimental data has been discussed at length in § 22 of
Part I.

† It should be remembered that the permanence of the antisymmetrical
character of the wave function has been established in Part I on the basis
of the ordinary Schrödinger equation for a system of identical particles
without spin.
As the spin forces are very small compared with the electrostatic
ones, a fairly good approximation (of 'zero order') can be obtained by
totally neglecting them (as well as the magnetic forces of Biot and
Savart, specified by the mutual kinetic energy $T'$).
The energy operator-matrix reduces in this case to the product
of the ordinary Hamiltonian operator for the system of particles under
consideration,
    $$K = K(x_1, \dots, x_n;\ p_1, \dots, p_n),$$
with the unit matrix (321 a). Limiting ourselves to solutions of the
type $\psi = \psi^0(x_1\zeta_1, \dots, x_n\zeta_n)\,e^{-i2\pi K't/h}$, which
correspond to a motion with a fixed energy $K'$, we thus get, instead of
(330),
    $$(K - K')\psi^0 = 0. \qquad (330\,a)$$
This equation differs from that of the ordinary theory (not involving
the spin) only by the fact that $K$ is understood to contain as a factor
the unit matrix and that $\psi$ is to be regarded as a function both of the
ordinary coordinates and of the spin coordinates $\zeta_1, \dots, \zeta_n$.
Since $K$ does not contain the latter, or more exactly the spin matrices
$\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2, \dots, \boldsymbol{\sigma}_n$,
these matrices must commute with $K$ and represent consequently constants
of the motion. The characteristic values of their $z$-components
$\sigma_{zk} = 2m_k = \pm 1$ can be considered accordingly as additional
spin quantum numbers specifying $2^n$ solutions of (330 a), that is $2^n$
degenerate states which belong to the same value of the energy $K'$. We
shall distinguish these $2^n$ states with the help of the indices
$m_1, m_2, \dots, m_n$, writing $m$ short for the whole set of them. It
should be remembered that the product of $m_k$ by $h/2\pi$ represents the
projection of the spin of the $k$th electron on the $z$-axis.
If we write $\zeta_k = -\tfrac12, +\tfrac12$ instead of 1 and 2
respectively (as was done before), we can define a set of $2^n$ orthogonal
and normal solutions of the equation $(K - K')\psi_{K'} = 0$ which belong
to the same characteristic value of $K$ by the formula
    $$\psi_{K'm}(x, \zeta) = \varphi_{K'}(x)\,\delta_{m\zeta}, \qquad (331)$$
where $\delta_{m\zeta} = \delta_{m_1\zeta_1}\delta_{m_2\zeta_2}\cdots\delta_{m_n\zeta_n}$
is the $2^n$-dimensional unit matrix equivalent to (321 a) and
$\varphi_{K'}(x)$ the normalized solution of the ordinary Schrödinger
equation $(K - K')\varphi_{K'} = 0$ not involving any spin coordinates.
We have in fact, by the definition (331),
    $$\int \psi^{\dagger}_{K'm}\,\psi_{K'm'}\,dv = \sum_{\zeta}\delta_{m\zeta}\,\delta_{m'\zeta}\int |\varphi_{K'}(x)|^2\,dx = \delta_{mm'}. \qquad (331\,a)$$
This form of the solution of the Schrödinger equation with the spin
coordinates taken into account cannot, however, be reconciled with the
antisymmetry condition for the functions $\psi$, except when all the $n$
spin quantum numbers $m_1, \dots, m_n$ have the same value (either
$\tfrac12$ or $-\tfrac12$). In this case $\delta_{m\zeta}$ is a symmetrical
function of the spin coordinates, and in order to satisfy the antisymmetry
condition we must define $\varphi$ as an antisymmetrical function of all
the $n$ coordinate triplets $x_1, \dots, x_n$.
If some of the numbers $m_k$ have the value $-\tfrac12$ and others the
value $+\tfrac12$, the function $\psi$ as defined by (331) will not be
antisymmetrical, whatever the type of the space factor $\varphi$.
The spin factors $\delta_{m\zeta}$ can be used, however, in this case to
obtain somewhat more complicated spin functions $\epsilon(\zeta)$ which are
either symmetrical with regard to all the variables
$\zeta_1, \dots, \zeta_n$ or with regard to some of them, being in the
latter case antisymmetrical with regard to definite pairs of these
variables.
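A symmetrical spin function $\epsilon(\zeta)$ can be formed by permuting
the variables $\zeta_i$ and $\zeta_k$ in those factors
$\delta_{m_i\zeta_i}$ and $\delta_{m_k\zeta_k}$ for which $m_i \neq m_k$
and adding the results. If instead of adding we subtract them from
each other, we shall get a function antisymmetrical with regard to the
pair of variables $(\zeta_i, \zeta_k)$. Putting for the sake of brevity
    $$u(\zeta_i, \zeta_k) = \delta_{-\frac12\zeta_i}\delta_{+\frac12\zeta_k} - \delta_{-\frac12\zeta_k}\delta_{+\frac12\zeta_i} = -u(\zeta_k, \zeta_i),$$
    $$v(\zeta_i, \zeta_k) = \delta_{-\frac12\zeta_i}\delta_{+\frac12\zeta_k} + \delta_{-\frac12\zeta_k}\delta_{+\frac12\zeta_i} = +v(\zeta_k, \zeta_i), \qquad (332)$$
we get for $\epsilon(\zeta)$ an expression of the form
    $$\epsilon_{ij}(\zeta) = u(\zeta_1, \zeta_2)\,u(\zeta_3, \zeta_4)\cdots u(\zeta_{2i-1}, \zeta_{2i})\,v_j(\zeta_{2i+1}, \dots, \zeta_n), \qquad (332\,a)$$
where $v_j(\zeta_{2i+1}, \dots, \zeta_n)$ is a symmetrical function of the
$n - 2i$ variables $\zeta_{2i+1}, \dots, \zeta_n$ formed by taking the
product of a certain number $j$ of functions of the type
$v(\zeta_k, \zeta_l)$ and of $n - 2(i+j)$ simple functions
$\delta_{m'\zeta_k}$ with the same value $m'$ of $m_k$, and summing such
products for all non-trivial permutations of the variables
$\zeta_{2i+1}, \dots, \zeta_n$:
    $$v_j(\zeta_{2i+1}, \dots, \zeta_n) = \sum v(\zeta_{2i+1}, \zeta_{2i+2})\cdots v(\zeta_{2i+2j-1}, \zeta_{2i+2j})\,\delta_{m'\zeta_{2i+2j+1}}\cdots\delta_{m'\zeta_n}. \qquad (332\,b)$$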
The numbers $i$ and $j$ fully specify the spin functions
$\epsilon_{ij}(\zeta)$ for a fixed arrangement of the variables
$\zeta_1, \dots, \zeta_n$. By permuting the latter we can obtain other
functions of the same symmetry type.
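The symmetry properties of the pair functions (332) are easy to verify by direct enumeration of the two values of each spin coordinate. In the following sketch the names `delta`, `u`, `v`, and `eps_example` are illustrative; `eps_example` is one assumed instance of (332 a) for $n = 4$, $i = 1$, $j = 0$, in which the symmetrical factor reduces to a single product of two simple functions with the same $m'$.

```python
from itertools import product

HALF = 0.5
SPIN_VALUES = (-HALF, +HALF)        # the two values of a spin coordinate

def delta(m, zeta):
    """Simple spin function: 1 if zeta equals m, otherwise 0."""
    return 1 if m == zeta else 0

def u(zi, zk):
    """Antisymmetrical pair function of (332)."""
    return delta(-HALF, zi) * delta(+HALF, zk) - delta(-HALF, zk) * delta(+HALF, zi)

def v(zi, zk):
    """Symmetrical pair function of (332)."""
    return delta(-HALF, zi) * delta(+HALF, zk) + delta(-HALF, zk) * delta(+HALF, zi)

def eps_example(z1, z2, z3, z4, m_prime=+HALF):
    """Instance of (332 a) with n = 4, i = 1, j = 0 (an assumed example)."""
    return u(z1, z2) * delta(m_prime, z3) * delta(m_prime, z4)

# Enumerate all values of the spin coordinates and check the symmetry.
pair_checks = [
    (u(a, b) == -u(b, a), v(a, b) == v(b, a))
    for a, b in product(SPIN_VALUES, repeat=2)
]
```

The function `eps_example` changes sign when the first two variables are interchanged and is unaltered by an interchange of the last two, in agreement with the symmetry type described above.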
Before, however, proceeding to such permutations, let us multiply
the function (332 a) by a space factor $\varphi_i(x)$ which we shall assume
to be symmetrical with regard to the pairs of coordinate triplets
$(x_1, x_2), (x_3, x_4), \dots, (x_{2i-1}, x_{2i})$ and antisymmetrical with
regard to the rest. The product
    $$\varphi_i(x)\,\epsilon_{ij}(\zeta) \qquad (333)$$
will obviously be antisymmetrical with regard to the pairs of coordinate
quadruplets $(x_1, \zeta_1;\ x_2, \zeta_2)$, $(x_3, \zeta_3;\ x_4, \zeta_4)$,
..., $(x_{2i-1}, \zeta_{2i-1};\ x_{2i}, \zeta_{2i})$ and antisymmetrical with
regard to all the other coordinate quadruplets. It will have, however, no
symmetry whatever with regard to permutations affecting the variables of
different groups, corresponding, for example, to interchanges between the
first and the third electron, or the first and the $(2i+1)$th one. If we
now apply such permutations ($P_i$) to the function (333) and add the
results, we can obtain a function
    $$\psi_{ij}(x, \zeta) = \sum_{P_i} P_i[\varphi_i(x)\,\epsilon_{ij}(\zeta)] \qquad (333\,a)$$
which will be antisymmetrical with regard to all the electrons, i.e. all
the coordinate quadruplets. Permutations of this class can hardly be
defined explicitly for the general case (arbitrary values of $i$ and $j$).
They can, however, be specified unambiguously by certain simple
conditions which we shall not consider here.
The antisymmetrical wave functions (333 a) can also be obtained by
starting from 'spinless' functions of the type $\varphi_{ij}(x)$
symmetrical with regard to $i$ pairs of electrons and antisymmetrical with
regard to $j$ other pairs, while antisymmetrical with respect to all the
other $n - 2(i+j)$ electrons. The complementary spin factors
$\epsilon(\zeta)$ should reduce in this case to a product of $i$ factors
$u$, $j$ factors $v$, and $n - 2(i+j) = 2|m|$ factors $\delta_{m'\zeta}$.
The permutations which must be applied to the products
$\varphi_{ij}(x)\,\epsilon_{ij}(\zeta)$ in order to obtain the functions
    $$\psi_{ij}(x, \zeta) = \sum_{P} P[\varphi_{ij}(x)\,\epsilon_{ij}(\zeta)],$$
identical with those defined by (333 a), will constitute a broader class
than the permutations $P_i$. In fact, they can be defined as the products
of the latter and of the permutations which must be applied to the
spin functions
    $$v(\zeta_{2i+1}, \zeta_{2i+2})\cdots v(\zeta_{2(i+j)-1}, \zeta_{2(i+j)})\,\delta_{m'\zeta_{2(i+j)+1}}\cdots$$
in order to obtain upon addition the symmetrical function (332 b).
In constructing the functions (333 a) we have left out of account the
condition that they must satisfy the 'spinless' Schrödinger equation. Now
it is easily seen that this condition is fulfilled so long as it is
fulfilled for the space factor $\varphi_i(x)$ in the initially chosen
function (333). Applying to the equation $K\varphi_i(x) = K_i\varphi_i(x)$
any permutation $P_i$, we have indeed, since $K$ is symmetrical with
regard to all the electrons and $K_i$ is a pure number,
    $$K[P_i\varphi_i(x)] = P_i[K\varphi_i(x)] = K_i P_i[\varphi_i(x)].$$
This shows that if $\varphi_i(x)$ is a characteristic function of the
operator $K$ belonging to a certain characteristic value (energy-level)
$K_i$, then all the functions resulting from it by permuting the electrons
will also be characteristic functions, belonging to the same energy-level.
This being so, any linear combination of such functions will have the same
property, which therefore will be shared by the unique combination (333 a)
satisfying the antisymmetry condition (the factors
$P_i[\epsilon_{ij}(\zeta)]$, which are equal either to $\pm 1$ or to 0,
playing the role of ordinary coefficients with regard to the functions
$P_i[\varphi_i(x)]$).
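The statement that a permutation of the electrons turns a characteristic function of a symmetrical operator $K$ into another one with the same energy-level can be checked on a discrete model. The sketch below is an illustration under stated assumptions: two particles on a grid of four points, a one-particle operator `h` and a symmetric interaction `W` chosen arbitrarily, and hypothetical helper names `two_particle_operator` and `swap_operator`.

```python
import numpy as np

def two_particle_operator(h, W):
    """Symmetrical operator K = h(1) + h(2) + W(x1, x2) on a grid."""
    n = h.shape[0]
    I = np.eye(n)
    return np.kron(h, I) + np.kron(I, h) + np.diag(W.reshape(-1))

def swap_operator(n):
    """Permutation P interchanging the coordinates of the two particles."""
    P = np.zeros((n * n, n * n))
    for a in range(n):
        for b in range(n):
            P[a * n + b, b * n + a] = 1.0
    return P

n = 4                                               # illustrative grid size
h = np.diag([0.0, 0.5, 1.5, 3.0]) - 0.4 * (np.eye(n, k=1) + np.eye(n, k=-1))
idx = np.arange(n)
W = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))   # W(x1,x2) = W(x2,x1)

K = two_particle_operator(h, W)
P = swap_operator(n)

levels, functions = np.linalg.eigh(K)
phi = functions[:, 0]                               # a characteristic function
permuted = P @ phi                                  # the permuted function
residual = K @ permuted - levels[0] * permuted      # vanishes if still characteristic
```

Since $K$ commutes with $P$, the residual vanishes: the permuted function belongs to the same level as the original one.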
It remains to be seen whether the equation $K\varphi = K'\varphi$ actually
has solutions of the type $\varphi_i$, i.e. antisymmetrical with regard to
all the $n$ electrons ($i = 0$), or symmetrical with regard to one pair
(1, 2), and antisymmetrical with regard to the rest ($i = 1$), or
symmetrical with regard to two pairs [(1, 2), (3, 4)], and antisymmetrical
with regard to the rest ($i = 2$), and so on. A rigorous proof of this
existence theorem is not easy and we shall not stop to give it. The
following remarks are worth mentioning, however, in this connexion:
1. The functions $\varphi_i$ defined above (or their linear combinations)
are not the only characteristic functions of a symmetrical operator $K$;
the latter has besides, a number of characteristic functions with an
entirely different symmetry character, for instance, symmetrical with
regard to all the $n$ coordinate triplets or antisymmetrical with regard
to two or three of them, and symmetrical with regard to the rest, and so
on. Such solutions, although they exist mathematically, are non-existent
physically, i.e. they do not correspond to any real phenomenon, for
they cannot provide a basis for constructing functions antisymmetrical
with regard to all the coordinate quadruplets $x_k, \zeta_k$. The fact
that such a basis is provided only by functions of the type
$\varphi_i(x)$ is a consequence of the two-valuedness of the spin quantum
numbers $m_k$ of the individual electrons, this two-valuedness determining
the symmetry type of the 'spin-factors' $\epsilon(\zeta)$ and thence
indirectly the symmetry type of the associated space-factors
$\varphi_i(x)$.
2. The functions $\varphi_i(x)$ (or their linear combinations)
corresponding to different values of $i$ belong in general to different
characteristic values $K_i$ of the energy operator. They can be introduced
as 'non-combining' solutions of the equation
$\left(K + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\varphi = 0$
in that case also when $K$ contains the time explicitly (i.e. when the
electrons are supposed to move under the influence of a variable field of
some external origin). In this case the symmetry character of $\varphi$
remains a permanent property, if no difference is made between various
linear combinations of the functions $\varphi_i$ with the same value of
$i$, the permanence of the antisymmetry character of $\varphi_0$ being
a particular case of this theorem (the latter holds likewise for a number
of solutions belonging to other symmetry classes, not realized in nature).
It will be convenient in the sequel to replace the numbers $i$ and $j$,
which specify the functions (333) or (333 a), by two other numbers,
    $$s = \tfrac12(n - 2i) = \tfrac12 n - i, \qquad (334)$$
and
    $$m = \sum_k m_k = \pm(\tfrac12 n - i - j). \qquad (334\,a)$$
The latter can obviously be interpreted as the component of the resulting
spin angular momentum of all the electrons along the $z$-axis (in
$h/2\pi$ units); in fact it is equal to the algebraic sum of the
characteristic values of the matrices $\tfrac12\sigma_{zk}$ for the
individual electrons. For a given value of $s$, $m$ can assume $2s+1$
values differing from each other by 1 and lying between $+s$ and $-s$.
This circumstance suggests the interpretation of $s$ as the magnitude of
the vector specifying the resulting spin of all the electrons
(irrespective of its direction). The characteristic value of the square of
this total spin is equal to the product of $(h/2\pi)^2$ with $s(s+1)$,
just as in the case of the resulting 'orbital' momentum, defined by the
number $j$ (see § 37).
The above interpretation of the number $s$ is also supported by the
fact that its maximum value is equal to $\tfrac12 n$, which corresponds to
the same direction of the spin vectors $\boldsymbol{\sigma}_k$ of the
separate electrons. It thus appears that the resulting spin associated
with a given solution $\varphi_i$ of the 'spinless' Schrödinger equation
is equal to one-half of the number of electrons with regard to which this
function is antisymmetrical.
We shall now consider, for the sake of illustration, the special cases
of systems consisting of two and three electrons, a helium and a lithium
atom, say. In the first case we get functions $\varphi_i(x)$ of two types
only, namely, the antisymmetrical one
$\varphi_0(x) = \varphi_0(\underline{x_1, x_2})$ and the symmetrical
$\varphi_1(x) = \varphi_1(\overline{x_1, x_2})$ (following Heitler and
London, we introduce lines under or over the neighbouring variables, to
indicate the antisymmetrical or symmetrical character of the wave function
with regard to these variables). Taking further the four combinations of
the individual spin quantum numbers $m_1$ and $m_2$, namely,
$(-\tfrac12, -\tfrac12)$, $(-\tfrac12, +\tfrac12)$, $(+\tfrac12, -\tfrac12)$,
$(+\tfrac12, +\tfrac12)$, we can form three symmetrical spin functions,
    $$v(\zeta_1, \zeta_2) = \delta_{-\frac12\zeta_1}\delta_{-\frac12\zeta_2},\quad \delta_{-\frac12\zeta_1}\delta_{+\frac12\zeta_2} + \delta_{-\frac12\zeta_2}\delta_{+\frac12\zeta_1},\quad \delta_{+\frac12\zeta_1}\delta_{+\frac12\zeta_2},$$
and one antisymmetrical
    $$u(\zeta_1, \zeta_2) = \delta_{-\frac12\zeta_1}\delta_{+\frac12\zeta_2} - \delta_{-\frac12\zeta_2}\delta_{+\frac12\zeta_1}.$$
The products of the former with the antisymmetrical space function
$\varphi_0(\underline{x_1, x_2})$ define three states, corresponding to
the same resulting spin $s = 1$ (parallel orientation of the two
electrons) and to the values $m = -1, 0, +1$ of its projection on the
$z$-axis, whereas the product of $u(\zeta_1, \zeta_2)$ with
$\varphi_1(\overline{x_1, x_2})$ defines a single state corresponding to
$s = 0$ and $m = 0$ ('anti-parallel' orientations of the spins).
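The assignment of the resulting spin to the two-electron spin functions can be verified with the ordinary Pauli matrices. In the sketch below the representation of a spin function by a vector over the four combinations $(\zeta_1, \zeta_2)$, the ordering of the basis, and all names are assumptions of the illustration. It checks that the three symmetrical functions belong to the value $s(s+1) = 2$ of the squared total spin (in units of $(h/2\pi)^2$) and the antisymmetrical one to the value 0.

```python
import numpy as np

# Spin matrices (1/2) sigma for one electron, in units of h/2pi.
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2)
ZETA = (+0.5, -0.5)     # basis index 0 -> zeta = +1/2, index 1 -> zeta = -1/2

def total(op):
    """Sum of a one-electron matrix over the two electrons."""
    return np.kron(op, I2) + np.kron(I2, op)

S2 = sum(total(op) @ total(op) for op in (sx, sy, sz))   # squared total spin
Sz = total(sz)                                           # its z-component

def delta(m, z):
    return 1.0 if m == z else 0.0

def as_vector(f):
    """Tabulate a spin function f(zeta1, zeta2) in the product basis."""
    return np.array([f(z1, z2) for z1 in ZETA for z2 in ZETA], dtype=complex)

symmetrical = [
    as_vector(lambda a, b: delta(-0.5, a) * delta(-0.5, b)),
    as_vector(lambda a, b: delta(-0.5, a) * delta(0.5, b)
                           + delta(-0.5, b) * delta(0.5, a)),
    as_vector(lambda a, b: delta(0.5, a) * delta(0.5, b)),
]
antisymmetrical = as_vector(lambda a, b: delta(-0.5, a) * delta(0.5, b)
                                         - delta(-0.5, b) * delta(0.5, a))

# Projection m for each symmetrical function, read off from Sz.
m_values = [float(np.real(vec.conj() @ Sz @ vec / (vec.conj() @ vec)))
            for vec in symmetrical]
```

The values of $m$ obtained for the three symmetrical functions are $-1, 0, +1$, as stated in the text for the state with parallel spins.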
In the case of three electrons we must distinguish likewise two types
of 'spinless' functions, namely, those antisymmetrical with regard to
all the three electrons, $\varphi_0(x) = \varphi_0(\underline{x_1, x_2, x_3})$,
and those symmetrical with regard to two of them,
$\varphi_1(x) = \varphi_1(\overline{x_1, x_2}, x_3)$, say (the third
electron, being alone, does not require any specific condition with
respect to symmetry).
The functions of the first type must be combined with a symmetrical
spin factor $\epsilon(\zeta_1, \zeta_2, \zeta_3)$ which can be obtained
either in the form
    $$\epsilon = \delta_{m'\zeta_1}\delta_{m'\zeta_2}\delta_{m'\zeta_3}$$
if $m_1 = m_2 = m_3 = m' = \pm\tfrac12$ ($\sum m_k = \pm\tfrac32$), or in
the form
    $$\epsilon = v(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3} + v(\zeta_2, \zeta_3)\,\delta_{m'\zeta_1} + v(\zeta_3, \zeta_1)\,\delta_{m'\zeta_2}$$
if one of the numbers $m_k$ is different from the two others
($\sum m_k = \pm\tfrac12$). We thus get a 'quadruplet', i.e. four states
with the same $s = \tfrac32$ and consequently with the same value of the
energy $K = K_0$, which are distinguished from each other by the values of
the resulting 'axial' spin numbers
$m = -\tfrac32, -\tfrac12, \tfrac12, \tfrac32$.
The functions of the second type, $\varphi_1(\overline{x_1, x_2}, x_3)$,
must be combined with spin factors of the form
    $$\epsilon(\zeta_1, \zeta_2, \zeta_3) = u(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3}$$
and summed over the cyclic permutations of all the three electrons,
giving two antisymmetrical functions,
    $$\psi(x, \zeta) = \varphi_1(\overline{x_1, x_2}, x_3)\,u(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3} + \varphi_1(\overline{x_3, x_1}, x_2)\,u(\zeta_3, \zeta_1)\,\delta_{m'\zeta_2} + \varphi_1(\overline{x_2, x_3}, x_1)\,u(\zeta_2, \zeta_3)\,\delta_{m'\zeta_1},$$
for two different values of $m'$; the states defined by them belong to the
same value $\tfrac12$ of $s$ and to the same energy $K = K_1$, forming what
is called a 'spin doublet' of a similar type to that for a single electron.
The antisymmetrical character of the functions $\psi(x, \zeta)$ is clearly
seen from the fact that if two electrons, the first and the second, say,
are interchanged, the first term changes its sign, whereas the second and
third are transformed into each other with opposite signs. It should
be mentioned that the normal state of a lithium atom, constituted by
two equivalent inner electrons, forming its 'core', and one 'valence'
electron, must be described by a wave function of the above type.
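The antisymmetry of the three-electron doublet function can be confirmed by brute force on a small grid. The sketch below is only illustrative: the space factor is a random array made symmetrical in its first two arguments (standing in for a genuine solution of the Schrödinger equation, which is not computed here), and the names `build_psi` and `swap_electrons` are hypothetical. The function is assembled as the sum over the three cyclic permutations described above and is then tested against every interchange of two coordinate quadruplets.

```python
import numpy as np
from itertools import product

ZETA = (-0.5, +0.5)                  # values of a spin coordinate

def delta(m, z):
    return 1.0 if m == z else 0.0

def u(zi, zk):
    """Antisymmetrical pair function of (332)."""
    return delta(-0.5, zi) * delta(0.5, zk) - delta(-0.5, zk) * delta(0.5, zi)

def build_psi(phi1, m_prime):
    """Sum over cyclic permutations of phi1(x1,x2,x3) u(z1,z2) delta(m',z3).

    phi1 must be symmetrical in its first two arguments.
    Axes of the result: (x1, zeta1, x2, zeta2, x3, zeta3).
    """
    n = phi1.shape[0]
    psi = np.zeros((n, 2, n, 2, n, 2))
    for x1, x2, x3 in product(range(n), repeat=3):
        for (a, z1), (b, z2), (c, z3) in product(enumerate(ZETA), repeat=3):
            psi[x1, a, x2, b, x3, c] = (
                phi1[x1, x2, x3] * u(z1, z2) * delta(m_prime, z3)
                + phi1[x3, x1, x2] * u(z3, z1) * delta(m_prime, z2)
                + phi1[x2, x3, x1] * u(z2, z3) * delta(m_prime, z1)
            )
    return psi

def swap_electrons(psi, i, j):
    """Interchange the coordinate quadruplets of electrons i and j (0-based)."""
    axes = list(range(6))
    axes[2 * i], axes[2 * j] = axes[2 * j], axes[2 * i]
    axes[2 * i + 1], axes[2 * j + 1] = axes[2 * j + 1], axes[2 * i + 1]
    return np.transpose(psi, axes)

rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3, 3))       # illustrative grid of three points
phi1 = a + a.transpose(1, 0, 2)      # symmetrical in the first two arguments
psi = build_psi(phi1, m_prime=+0.5)
```

Interchanging any two electrons reverses the sign of `psi`, although the space factor itself is symmetrical only in one pair; the same holds for the second member of the doublet, obtained with `m_prime=-0.5`.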
VIII
REDUCTION OF THE PROBLEM OF A SYSTEM OF
IDENTICAL PARTICLES TO THAT OF A SINGLE PARTICLE
41. Perturbation Theory of a System of Spinless Electrons and
the Exchange Degeneracy
Further progress in the study of the problem of many electrons can be achieved only if we describe their motion in a way similar to that used in Bohr's theory of complex atoms, namely, by assigning to each electron an individual state of motion in a given field of force. The mutual action of the electrons can be partially accounted for by introducing some constants, like the screening constants, in the definition of the appropriate field of force for each electron, or by using the same suitably chosen field of force for all of them, a self-consistent field, for example (see below). The problem of the motion of the whole system is thus reduced to that of the motion of the separate particles constituting it and to the determination of the effective external field which can approximately represent their mutual action. Inasmuch as this mutual action is accounted for inexactly, we can obtain a better approximation by treating it, or that part of it which was not included to begin with in the effective field of force, as a small perturbation, and approach the exact solution by the methods of the perturbation theory, starting with the solution which corresponds to a distribution of the electrons between various individual states of motion (or 'orbits').
A characteristic distinction between Bohr's theory and the new quantum theory in connexion with this perturbation problem consists in the fact that the electrons must be interchanged between all the individual orbits in such a way as to be completely stripped of their individuality. This result, which is expressed by the symmetry principle for the probability density $\psi\psi^*$ or the antisymmetry principle for the probability amplitude $\psi$, can be shown to be in harmony with the principles of the perturbation theory applied to the problem of a system of identical particles.
The wave function $\phi$ describing their motion can be represented to begin with as the product of the functions $\psi_1(x_1), \ldots, \psi_n(x_n)$ describing the behaviour of the individual electrons in the given external field of force. Putting
$$\phi = \psi_1(x_1)\psi_2(x_2)\cdots\psi_n(x_n) \tag{335}$$
and denoting by $P\phi$ the function into which $\phi$ is transformed when
the permutation $P$ is applied to the electrons, we can represent the general solution of our undisturbed problem, belonging to the same energy as $\phi(x)$, by the expression
$$\chi(x) = \sum_P C_P\, P\phi, \tag{335 a}$$
where $C_P$ are arbitrary coefficients, the sum being extended over all the possible permutations, or at least over the 'effectively different' ones, i.e. such as lead to different functions $P\phi$.
If all the $n$ individual wave functions $\psi_1, \ldots, \psi_n$ are different, every one of the $n!$ possible permutations $P$ will be associated with a specific function $P\phi$. In the contrary case the permutations $P$ can be subdivided into separate sets of equivalent permutations, which correspond to identical functions $P\phi$, and in writing down (335 a) we shall have to consider only one representative of each set.
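The counting of 'effectively different' permutations can be checked by brute force. The following minimal sketch (the helper names `distinct_products` and `expected_count`, and the orbital labels, are hypothetical and introduced only for illustration) encodes a product function by the tuple of orbital labels carried by electrons 1, 2, ..., n and collects the distinct tuples produced by all the n! permutations.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_products(orbitals):
    """Set of distinct product functions P(phi).

    The product psi_a(x_1) psi_b(x_2) ... is encoded as the tuple (a, b, ...)
    of orbital labels carried by electrons 1, 2, ..., n.
    """
    return set(permutations(orbitals))

def expected_count(orbitals):
    """n! divided by the factorials of the multiplicities of repeated orbitals."""
    count = factorial(len(orbitals))
    for multiplicity in Counter(orbitals).values():
        count //= factorial(multiplicity)
    return count

all_different = distinct_products(("a", "b", "c"))   # three different orbitals
one_repeated = distinct_products(("a", "a", "b"))    # one orbital used twice
```

With three different orbitals every permutation gives a new function; with one orbital repeated the permutations fall into sets of equivalent ones, and only one representative of each set survives in the sum (335 a).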
We shall assume for the sake of simplicity that apart from this 'exchange degeneracy', arising from the possibility of interchanging the electrons between different individual states without altering the total energy, no other type of degeneracy need be considered.
We shall disregard in this section the spin effects and treat the electrons as spinless particles, using for the determination of their motion the ordinary Schrödinger theory. We shall leave aside furthermore the question as to the symmetry of the functions $\chi(x)$ and shall try to determine the coefficients $C_P$ by which they are defined in such a way as to ensure the approximate validity of the expression (335 a) when the perturbing forces (i.e. the mutual action of the electrons or the neglected part of this mutual action) are taken into account. In this case the function (335 a) is said to be 'stabilized' for the perturbation. It is meant by this that if the approximation is pushed further, the coefficients $C_P$ will suffer but a slight variation. This question has been considered in its most general form in the perturbation theory of degenerate systems. As has been shown there, the degenerate set of states specified by the functions $P\phi$ gives rise to the same number of states belonging in general to different energy-levels $H'$ and specified by the values of the coefficients $C_P$ which satisfy the system of equations
$$\sum_Q H_{P,Q}\, C_Q = H' C_P, \tag{336}$$
where $H_{P,Q}$ are the matrix elements of the total energy with regard to the approximate functions $P\phi$ and $Q\phi$:
$$H_{P,Q} = \int P\phi^*\, H\, Q\phi\, dV, \tag{336 a}$$
$Q$ denoting, as well as $P$, a permutation of the electrons.
In writing down the equations (336) we are tacitly assuming that the different functions $P\phi$ are mutually orthogonal. This assumption is easily seen to be verified if the functions $\psi_i(x)$ describing the different individual states are orthogonal with regard to each other. Now the mutual orthogonality of the individual functions is automatically secured if they represent different stationary states of an electron in a given external field, the same for all the $n$ electrons. In many actual problems it is more convenient, however, to assign to each electron a specific field of force (for instance, a Coulomb field, characterized by a specific value of the screening constant in the problem of the distribution of electrons in a heavy atom), in which case the individual wave functions can no longer be considered as mutually orthogonal. The equations (336) must be replaced in this case according to (61), § 9, by the following ones:
$$\sum_Q (H_{P,Q} - H' J_{P,Q})\, C_Q = 0, \tag{337}$$
where
$$J_{P,Q} = \int P\phi^*\, Q\phi\, dV. \tag{337 a}$$
The value of this integral must obviously remain unaltered if the integration variables are replaced by any other ones (which amounts simply to a change of notation). We can, in particular, interchange them in a manner corresponding to an arbitrary permutation $R$ of the electrons. The functions $P\phi^*$ and $Q\phi$ will be replaced accordingly by $RP\phi^*$ and $RQ\phi$, so that we shall get
$$J_{P,Q} = \int RP\phi^*\, RQ\phi\, dV = J_{RP,RQ}.$$
It should be noticed that the permutation $R$ must not be applied to the functions $\phi^*$ and $\phi$, the result
$$\int PR\phi^*\, QR\phi\, dV = J_{PR,QR}$$
being in general quite different from the preceding one.
If, in particular, $R$ is identified with the reciprocal of $Q$ ($R = Q^{-1}$), we get
$$J_{P,Q} = J_{Q^{-1}P}, \tag{338}$$
where $J_S$ is an abbreviation for $J_{S,I}$, $I$ denoting the identical permutation ($I\phi = \phi$).
We get likewise, because of the symmetry of the energy operator $H$ with regard to all the electrons,
$$H_{P,Q} = H_{RP,RQ}$$
and in particular
$$H_{P,Q} = H_{Q^{-1}P}, \tag{338 a}$$
$H_R$ being an abbreviation for $H_{R,I}$.
The relations $J_{Q,P} = J^*_{P,Q}$ and $H_{Q,P} = H^*_{P,Q}$ can be written accordingly in the following form:
$$J_{R^{-1}} = J_R^*, \qquad H_{R^{-1}} = H_R^*, \tag{338 b}$$
where $R = Q^{-1}P$ and $R^{-1} = P^{-1}Q$. We thus see that the number of different matrix elements $H_{P,Q}$ and $J_{P,Q}$ is actually reduced to the number, $g$ say, of different states $P\phi$ instead of being equal to its square $g^2$.
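The reduction $J_{P,Q} = J_{Q^{-1}P}$ can be verified numerically for product functions. The sketch below is only an illustration under stated assumptions: the overlap matrix `S` contains made-up numbers, the orbitals are taken real, and the convention $P\phi = \prod_i \psi_i(x_{p(i)})$ (with indices counted from 0) is a choice made here, not something fixed by the text.

```python
from itertools import permutations
from math import isclose

def compose(a, b):
    """Permutation 'a after b': index i is sent to a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def J(p, q, S):
    """Overlap of P(phi) and Q(phi) for phi = psi_0(x_0) psi_1(x_1) ...

    Assumed convention: P(phi) = prod_i psi_i(x_{p[i]}), so electron e sits in
    orbital inverse(p)[e]; S[i][j] is the one-electron overlap of psi_i, psi_j.
    """
    pinv, qinv = inverse(p), inverse(q)
    value = 1.0
    for e in range(len(p)):
        value *= S[pinv[e]][qinv[e]]
    return value

# Illustrative (made-up) overlaps of three non-orthogonal real orbitals.
S = [[1.0, 0.3, 0.1],
     [0.3, 1.0, 0.2],
     [0.1, 0.2, 1.0]]
identity = (0, 1, 2)
perms = list(permutations(range(3)))

# J_{P,Q} should coincide with J_{Q^-1 P, I} for every pair (P, Q).
reduced_ok = all(
    isclose(J(p, q, S), J(compose(inverse(q), p), identity, S))
    for p in perms for q in perms
)
```

All 36 elements $J_{P,Q}$ are thus fixed by the 6 numbers $J_R$, in agreement with the reduction from $g^2$ to $g$.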
The equations (337) can be rewritten as follows:
$$\sum_R (H_R - H' J_R)\, C_{PR^{-1}} = 0, \tag{339}$$
the summation over all the permutations $R$ being obviously equivalent to the original summation over the permutations $Q$, with a fixed permutation $P$, the latter specifying each of the $g$ equations forming our system. The perturbed values of the energy $H'$ are determined as the roots of the determinantal equation
$$|H_{Q^{-1}P} - H' J_{Q^{-1}P}| = 0, \tag{339 b}$$
which expresses the condition of their compatibility.
Two types of solution of our perturbation problem are immediately obtained from the equations (339), namely, those which correspond to the symmetrical and to the antisymmetrical functions $\chi$. In the former case all the coefficients $C_Q$ are equal, so that they cancel out and the equations (339) reduce to the single equation
$$\sum_R (H_R - H' J_R) = 0,$$
which serves for the determination of the energy
$$H'_{\rm sym} = \sum_R H_R \Big/ \sum_R J_R. \tag{340}$$
In the latter case the coefficients $C_Q$ are defined by the formula $C_Q = \epsilon_Q C$, where $\epsilon_Q = +1$ for even permutations (equivalent to an even number of transpositions) and $\epsilon_Q = -1$ for odd ones. Since in this case $C_{PR^{-1}} = \epsilon_P \epsilon_R C$, the $g$ equations (339) again reduce to the single equation
$$\sum_R \epsilon_R (H_R - H' J_R) = 0,$$
whence
$$H'_{\rm antisym} = \sum_R \epsilon_R H_R \Big/ \sum_R \epsilon_R J_R. \tag{340 a}$$
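As a small numerical illustration of (340) and (340 a), the sketch below evaluates both quotients for two electrons. The function names are hypothetical and the values of $H_R$ and $J_R$ are made-up numbers, chosen only to show the arithmetic.

```python
def parity(p):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1

def exchange_levels(H_R, J_R):
    """Evaluate (340) and (340 a) from dictionaries {permutation: value}."""
    sym = sum(H_R.values()) / sum(J_R.values())
    anti = (sum(parity(R) * h for R, h in H_R.items())
            / sum(parity(R) * j for R, j in J_R.items()))
    return sym, anti

# Two electrons: identity (0, 1) and transposition (1, 0); made-up numbers.
H_R = {(0, 1): -2.0, (1, 0): -0.3}
J_R = {(0, 1): 1.0, (1, 0): 0.25}
h_sym, h_anti = exchange_levels(H_R, J_R)
```

For two electrons these are the familiar quotients $(H_I \pm H_T)/(J_I \pm J_T)$, $T$ denoting the single transposition.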
One might be tempted to look for more general solutions of (339) by assuming that $C_{PQ} = \text{const.}\, C_P C_Q$, or $C_P = \text{const.}\, e^{i\alpha_P}$. It can easily be shown, however, in the same way as in Part I, § 22, that this assumption leads to symmetrical and antisymmetrical functions only. The symmetry properties of all the other solutions can be determined by the following method due to Dirac.
According to Dirac, permutations can be dealt with in exactly the
same way as ordinary linear operators which serve to represent various
physical quantities. They can, in fact, be multiplied by each other, the
product being in general non-commutative, i.e. depending upon the
order of the factors, but satisfying the associative law (just as in
the case of differential or matrix operators investigated hitherto).
It is possible further to define the sum of two or more permutations as an operator, which without being itself a permutation is equivalent to them in the sense of the distributive law:
$$(P_1 + P_2)F = P_1 F + P_2 F,$$
where $F$ denotes any other operator or function.
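To each permutation
$$P = \begin{pmatrix} 1, & 2, & \ldots, & n \\ k_1, & k_2, & \ldots, & k_n \end{pmatrix}$$
there corresponds the reciprocal permutation
$$P^{-1} = \begin{pmatrix} k_1, & k_2, & \ldots, & k_n \\ 1, & 2, & \ldots, & n \end{pmatrix},$$
whose product with $P$, irrespective of the order of the two factors, is equal to 1, i.e. is equivalent to the 'identical' permutation
$$\begin{pmatrix} 1 & 2 & \ldots & n \\ 1 & 2 & \ldots & n \end{pmatrix}.$$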
Every permutation $P$ can be represented as a product of 'cyclic' permutations, of the type
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix} = (1, 2, 3, 4),$$
where each element in the brackets ( ) is replaced by the next, the last one being replaced by the first. The different cycles into which $P$ is thus factorized must have no common elements; they can be therefore commuted with each other without changing the result. We have, for example,
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 5 & 4 & 2 & 3 & 9 & 1 & 8 & 6 \end{pmatrix} = (1, 7)(2, 5, 3, 4)(6, 9)(8),$$
the two-element cycles (1, 7) and (6, 9) being simply transpositions (i.e. interchanges of two elements), while the one-element cycle (8) denotes that the corresponding element is not affected by the permutation considered.
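The factorization into cycles is easily mechanized. The following sketch (the function name `cycles` is hypothetical) follows each element through its successive images until the starting element recurs, and reproduces the example just given.

```python
def cycles(bottom_row):
    """Cycle decomposition of the permutation sending k to bottom_row[k - 1]."""
    image = {k: v for k, v in enumerate(bottom_row, start=1)}
    seen, result = set(), []
    for start in sorted(image):
        if start in seen:
            continue
        cycle, k = [], start
        while k not in seen:          # follow start -> image -> ... until closed
            seen.add(k)
            cycle.append(k)
            k = image[k]
        result.append(tuple(cycle))
    return result

example = cycles([7, 5, 4, 2, 3, 9, 1, 8, 6])
# example == [(1, 7), (2, 5, 3, 4), (6, 9), (8,)]
```

The lengths of the cycles found in this way, here 2, 4, 2, and 1, add up to the number of elements, 9.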
Permutations which can be factorized into the same number of cycles
with the same number of elements (which may be different for different
permutations) are called 'similar' and form a 'class' specified by the 'partition' of the number $n$ into summands giving the number of elements in each cycle. The partition for the above permutation is
$$n = 1 + 2 + 2 + 4.$$
Similar permutations $P$ and $Q$ can thus be obtained from each other by permuting the elements appearing in the cycles of one of them. Denoting by $R$ the permutation which must be carried out in the cycles of $P$ in order to obtain $Q$, we get
$$Q = RPR^{-1}.$$
The factor $R^{-1}$ accounts for the fact that the permutation $R$ should not affect the operator or function to which $P$ or $Q$ is supposed to be applied ($RPF$ would be equivalent to applying the permutation $R$ both to $P$ and to $F$).
Since every permutation $P$ commutes with the energy operator $H$ ($H$ being symmetrical with respect to all the electrons), it can be treated as a constant of the motion. The fact that the different permutations do not in general commute with each other shows that it is impossible to assign simultaneously definite values to all these constants. It is possible, however, to combine them linearly into a set of commutable operators, which can be constructed by adding together all the permutations belonging to the same class. With a fixed $P$ and a variable $R$ each permutation $Q = RPR^{-1}$ will be obtained several times, namely, $n!/n_k$ times, where $n_k$ is the number of different permutations in the class under consideration. The sum of all such permutations, or their 'average',
$$\bar P = \frac{1}{n!} \sum_R RPR^{-1},$$
will obviously commute with all the permutations. We have in fact
$$T\bar P T^{-1} = \frac{1}{n!} \sum_R TRPR^{-1}T^{-1},$$
or putting $TR = S$ and $R^{-1}T^{-1} = S^{-1}$,
$$T\bar P T^{-1} = \frac{1}{n!} \sum_S SPS^{-1} = \bar P$$
(since for a fixed $T$ and a variable $R$ the product $TR$ varies over the same range as $R$). Hence $T\bar P = \bar P T$. It follows in particular that the operators $\bar P_k$ referring to different classes ($k = 1, 2, \ldots$) commute with each other. Since, moreover, they commute with the energy operator $H$, they can be considered as defining a set of independent constants
of the motion whose characteristic values $\bar P'_k$ can be determined simultaneously and can serve, together with the characteristic values of the energy $H'$, to specify the stationary states of the system.
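That the class average commutes with every permutation can be checked directly for a small number of electrons. The sketch below is an illustration only: it represents an element of the permutation algebra as a dictionary from permutations to coefficients (a representation chosen here for convenience, with hypothetical helper names), builds the average of the class of transpositions for $n = 4$, and compares left and right multiplication by each permutation $T$.

```python
from fractions import Fraction
from itertools import permutations

def compose(a, b):
    """Permutation 'a after b': index i is sent to a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def class_average(P, group):
    """(1/n!) * sum over R of R P R^-1, as {permutation: coefficient}."""
    weight = Fraction(1, len(group))
    average = {}
    for R in group:
        Q = compose(compose(R, P), inverse(R))
        average[Q] = average.get(Q, 0) + weight
    return average

def left(T, element):
    """Product T * element in the permutation algebra."""
    return {compose(T, Q): c for Q, c in element.items()}

def right(element, T):
    """Product element * T in the permutation algebra."""
    return {compose(Q, T): c for Q, c in element.items()}

n = 4
group = list(permutations(range(n)))
P = (1, 0, 2, 3)                          # a single transposition
P_bar = class_average(P, group)
commutes = all(left(T, P_bar) == right(P_bar, T) for T in group)
```

Each of the six transpositions appears in `P_bar` with coefficient 1/6, that is, it is obtained $n!/n_k = 24/6 = 4$ times in the sum over $R$.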
The characteristic values of the operators $P$ are obviously wholly independent of the form of the energy operator (so long as it is symmetrical between all the electrons). They must be connected therefore with the symmetry properties of the wave functions $\chi_{H'}$ which belong to them and can serve for the classification of the latter.
It should be noticed that the operators $P$ preserve their role of constants of the motion in the general case of an energy operator containing the time explicitly. This means that if the wave function $\chi$ satisfying Schrödinger's equation $-\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\chi = H\chi$ has at the initial moment $t = 0$ a definite symmetry type, specified by certain characteristic values of the operators $P$, it will maintain the same symmetry type at any other time. The same results can be expressed by saying that the stationary states of an unperturbed system belonging to different characteristic values of the permutation operators $P$ do not combine with each other under any perturbation (symmetrical in all the electrons).
The simplest examples of this theorem are provided by the symmetrical and the antisymmetrical wave functions. The characteristic values of the $P$ are equal to $+1$ for the former and to $\pm 1$ for the latter ($+1$ for even permutations and $-1$ for odd ones).
So long as the spin effects are left out of account we have to consider
symmetrical and antisymmetrical functions only; if, however, the spin
effect is allowed for, spinless functions of a more complicated character
have to be admitted; to each set of characteristic $P$-values there
corresponds in general not one but many wave functions of the same
symmetry type (cf. Part I, § 22). If, moreover, the spin forces are taken
into account (as a small perturbation), the states corresponding to
different P-values will combine with each other. We thus get rather
complicated results which can, however, be reduced to the original
simple form if the spin coordinates are introduced in the definition of
the wave functions on the same footing as the geometrical ones.
If the electrons are associated with different individual states specified by mutually orthogonal wave functions, the set of functions $P\phi$ can be replaced by the set $P_\psi\phi$ obtained from $\phi = \psi_1(x_1)\psi_2(x_2)\cdots\psi_n(x_n)$ by applying the different permutations $P$ not to the arguments of the functions $\psi$ but to their indices, that is, by permuting not the electrons between the given states, but on the contrary the different states between the
electrons. Since by applying the same permutation $P$ both to the arguments and to the indices, we obviously do not change the resulting factorized function, we can put
$$P_x P_\psi \phi = \phi,$$
where the suffix $x$ has been added to indicate explicitly that $P$ is applied to the electrons. We thus see that $P_\psi$ plays the same role as the reciprocal of $P_x$ and vice versa.
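The relation between permutations of the arguments and of the indices can be illustrated by a small sketch. The encoding used here is an assumption made for the example only: a factorized function is stored as the set of (orbital, electron) pairs it contains, and the helper names are hypothetical.

```python
from itertools import permutations

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def product_function(n):
    """phi = psi_0(x_0) psi_1(x_1) ... as a set of (orbital, electron) pairs."""
    return frozenset((i, i) for i in range(n))

def permute_arguments(p, phi):
    """P_x: electron e is replaced by electron p[e]."""
    return frozenset((orb, p[el]) for orb, el in phi)

def permute_indices(p, phi):
    """P_psi: orbital i is replaced by orbital p[i]."""
    return frozenset((p[orb], el) for orb, el in phi)

n = 4
phi = product_function(n)
group = list(permutations(range(n)))

# The same permutation applied to arguments and indices leaves phi unchanged.
unchanged = all(
    permute_arguments(p, permute_indices(p, phi)) == phi for p in group
)
# Permuting the indices is the same as permuting the arguments reciprocally.
reciprocal = all(
    permute_indices(p, phi) == permute_arguments(inverse(p), phi) for p in group
)
```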
Taking the matrix elements of the energy with respect to the new functions $P_\psi\phi$, $Q_\psi\phi$ and remembering that they are invariant with regard to any permutations $R$ of the electrons (i.e. of the integration variables), we have
$$H^\psi_{P,Q} = \int P_\psi\phi^*\, H\, Q_\psi\phi\, dV = R_x \int P_\psi\phi^*\, H\, Q_\psi\phi\, dV = \int P_\psi R_x\phi^*\, H\, Q_\psi R_x\phi\, dV$$
(since we must first permute the integration variables in $\phi^*$ and $\phi$ and thereafter only carry out the permutations $P_\psi$, $Q_\psi$ of the indices). The functions $R_x\phi^*$ and $R_x\phi$ can further be replaced by $R_\psi^{-1}\phi^*$ and $R_\psi^{-1}\phi$, the permutation $R_x$ applied to the arguments of any factorized function $\phi$ being equivalent to the reciprocal permutation applied to the indices. We thus get
$$H^\psi_{P,Q} = \int P_\psi R_\psi^{-1}\phi^*\, H\, Q_\psi R_\psi^{-1}\phi\, dV = H^\psi_{PR^{-1},\,QR^{-1}}. \tag{341}$$
With $R = Q$ this reduces to
$$H^\psi_{P,Q} = H^\psi_{PQ^{-1}}, \tag{341 a}$$
where $H^\psi_R$ is an abbreviation for $H^\psi_{R,I}$. The difference between this result and the expression (338 a) for the matrix element of $H$ with respect to the original functions $P_x\phi$ and $Q_x\phi$ consists only in the order in which the two permutations $P$ and $Q^{-1}$ must be multiplied by each other. We shall presently see that thanks to this difference it is possible to reduce our perturbation problem to a simpler form, corresponding to the replacement of the energy operator $H$ by the equivalent 'permutation operator'
$$W = \sum_R H^\psi_R\, R_\psi. \tag{342}$$
The fact that the two operators are equivalent so far as the first approximation equations (336) are concerned is proved by comparing the matrix elements of $W$ and $H$ with respect to the functions $P_\psi\phi$. We have, namely,
$$W^\psi_{P,Q} = \sum_R H^\psi_R \int P_\psi\phi^*\, R_\psi Q_\psi\phi\, dV,$$
which in view of the orthogonality and normalizing conditions for the functions $P_\psi\phi$ reduces to
$$W^\psi_{P,Q} = H^\psi_R \qquad (RQ = P),$$
that is, to $H^\psi_{P,Q}$ according to (341).
A similar result cannot be obtained with the wave functions $P_x\phi$ which have been used before, for with $W$ defined by the formula $W = \sum_R A_R R_x$ we get $W_{P,Q} = A_{PQ^{-1}}$. There can, however, be no correspondence between this expression and the matrix element $H_{P,Q} = H_{Q^{-1}P}$, for the two permutations $PQ^{-1}$ and $Q^{-1}P$ are in general quite different.
The form of the energy operator $H$ has been left hitherto quite arbitrary (apart from its symmetry with respect to all the electrons). Now in all actual problems $H$ can be written down in the form
$$H = \sum_i E(x_i, p_i) + \sum_{i<k} F(x_i, x_k), \tag{343}$$
where the first term represents the sum of the energies of the separate electrons, supposed to move independently, while the second term is equal to their interaction energy, so that $F(x_i, x_k) = e^2 / r(x_i, x_k)$ ($r$ being the distance apart between the $i$th and the $k$th electrons). It should be emphasized that in writing down the expression (343) we must not consider the energy $E(x_i, p_i)$ as corresponding to the approximate description of the motion by means of the individual wave function $\psi_i(x_i)$. The latter can correspond to a somewhat different energy operator $E_i(x_i, p_i)$ involving some additional terms which serve to account in a simplified way for the mutual action of the $i$th electron with the rest, by an adequately chosen value of the 'screening constant' in the case of a complex atom, or by some type of 'self-consistent' field. The difference
$$S = H - \sum_i E_i(x_i, p_i) \tag{343 a}$$
can be defined as the perturbation energy. In order to obtain by our perturbation method a good approximation to the truth we must adequately determine the 'effective' energy operators $E_i$ for the individual electrons in such a way that the matrix elements of the perturbation energy $S$ should be as small as possible. We shall come back to this question in § 43. We are interested here only in the specialization of our general theory for the actual case of an energy operator of the form (343).
We shall assume for the sake of simplicity the functions $\psi_i$ and
consequently $P\phi$ to be mutually orthogonal (and of course normalized to 1). The matrix element $E_R$ of the energy $E(x_i, p_i)$ defined by the general formula
$$E_R = \int R\phi^*\, E(x_i, p_i)\, \phi\, dX \qquad (dX = dx_1 \cdots dx_n)$$
is then easily seen to vanish for all the permutations $R$ except the identical one, in which case it reduces to
$$E_i = \int \psi_i^*\, E(x_i, p_i)\, \psi_i\, dx_i, \tag{344}$$
that is, to the average value of the energy of the $i$th electron with regard to the external field alone for the state of motion which was initially assigned to it. It should be kept in mind that this motion, inasmuch as it is described by the approximate energy operator $E_i(x_i, p_i)$ which contains some additional external field more or less equivalent to the mutual action of the $i$th electron with the rest, differs from the motion described by the operator $E(x_i, p_i)$, and that accordingly the energy $E_i$ is in general different from the characteristic value $E'_i$ of the energy corresponding to the wave function $\psi_i$.
Taking the matrix element $F_R$ of the interaction energy $F(x_i, x_k)$,
$$F_R = \int R\phi^*\, F(x_i, x_k)\, \phi\, dV,$$
we easily see that it does not vanish in two cases only, namely, in the case of the identical permutation, when it reduces to
$$F_{ik} = \iint \psi_i^*(x_i)\psi_k^*(x_k)\, F(x_i, x_k)\, \psi_i(x_i)\psi_k(x_k)\, dx_i\, dx_k, \tag{344 a}$$
and in that of a transposition $R = T_{ik}$ involving the interchange between the $i$th and $k$th electrons. We shall denote its value for this case by $G_{ik}$, where
$$G_{ik} = \iint \psi_i^*(x_k)\psi_k^*(x_i)\, F(x_i, x_k)\, \psi_i(x_i)\psi_k(x_k)\, dx_i\, dx_k. \tag{344 b}$$
All the other matrix elements of $E$ and $F$, and consequently all the coefficients $H_R$ for such permutations $R$ which are different from the identical permutation or from a transposition, vanish.
It should be noted that we obtain the same expressions for the matrix elements of $E$ and $F$ with respect to the wave functions $R_\psi\phi$. The identification of the integration variables in (344 a) and (344 b) with the coordinates of the $i$th and the $k$th electrons is irrelevant for the value of $F_{ik}$ and $G_{ik}$, this value being determined by the states to which the two electrons are referred, and not by the individuality of these
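electrons. We could therefore write
$$F_{ik} = F^\psi_{ik} = \iint \psi_i^*(x')\psi_k^*(x'')\, F(x', x'')\, \psi_i(x')\psi_k(x'')\, dx'\, dx''$$
and
$$G_{ik} = G^\psi_{ik} = \iint \psi_i^*(x'')\psi_k^*(x')\, F(x', x'')\, \psi_i(x')\psi_k(x'')\, dx'\, dx'',$$
leaving the indices of the two electrons unspecified.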
The permutation operator $W$ is thus reduced in all actual problems to the relatively simple form
$$W = W^0 + \sum_{i<k} G_{ik}\, T^\psi_{ik}, \tag{345}$$
where
$$W^0 = \sum_i E_i + \sum_{i<k} F_{ik} \tag{345 a}$$
can be defined as the approximate value of the energy of the system under consideration, the second term in (345) representing the operator of the 'exchange' energy.
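The action of a permutation operator of the form $W^0 + \sum_{i<k} G_{ik} T_{ik}$ can be tried out on the symmetrical and the antisymmetrical combinations of the $n!$ permuted functions. The following sketch makes several assumptions for the purpose of illustration: three electrons, made-up values of $W^0$ and $G_{ik}$, orthonormal functions, and a representation of a state as a dictionary from permutations to coefficients on which each transposition acts by left multiplication.

```python
from itertools import permutations

def compose(a, b):
    """Permutation 'a after b': index i is sent to a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(a)))

def parity(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def transposition(n, i, k):
    t = list(range(n))
    t[i], t[k] = t[k], t[i]
    return tuple(t)

def apply_W(vector, W0, G, n):
    """Apply W0 + sum_{i<k} G[i, k] T_ik to a state {permutation: coefficient}."""
    out = {P: W0 * c for P, c in vector.items()}
    for (i, k), g in G.items():
        T = transposition(n, i, k)
        for P, c in vector.items():
            TP = compose(T, P)
            out[TP] = out.get(TP, 0.0) + g * c
    return out

n = 3
W0 = -5.0                                          # made-up value
G = {(0, 1): 0.4, (0, 2): 0.1, (1, 2): 0.25}       # made-up exchange integrals
group = list(permutations(range(n)))
symmetric = {P: 1.0 for P in group}
antisymmetric = {P: float(parity(P)) for P in group}
W_sym = apply_W(symmetric, W0, G, n)
W_anti = apply_W(antisymmetric, W0, G, n)
```

Under these assumptions the symmetrical combination is reproduced with the factor $W^0 + \sum G_{ik}$ and the antisymmetrical one with $W^0 - \sum G_{ik}$, which is what (340) and (340 a) give when $J_R$ vanishes for all $R$ except the identical permutation.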
42. Introduction of the Spin Coordinates and Solution of the Perturbation Problem with Antisymmetrical Wave Functions
The results of the preceding section cannot be directly applied to the
general problem of the motion of a system of electrons, for this implies
the introduction of the spin coordinates which have been ignored
hitherto. Even if we neglect the spin forces—which we shall always do
in the sequel—we must take into account the spin coordinates and the
spin quantum numbers in order to set up the antisymmetrical wave
functions which describe a system of electrons.
We shall consider here the problem of the approximate determination
of the antisymmetrical wave functions with spin, which belong to a
spinless energy operator $H$, with the help of the individual wave functions $\psi_i(x, \zeta)$ describing the motion of the separate electrons in a given external field ($\zeta$ denotes the additional spin coordinate and the index $i$ is supposed to contain the spin quantum number).
This problem admits at first sight a simple and unique solution expressed by the determinant
$$\Phi = \frac{1}{\sqrt{n!}} \begin{vmatrix} \psi_1(x_1, \zeta_1) & \cdots & \psi_1(x_n, \zeta_n) \\ \vdots & & \vdots \\ \psi_n(x_1, \zeta_1) & \cdots & \psi_n(x_n, \zeta_n) \end{vmatrix} \tag{346}$$
since no other wave functions but the antisymmetrical one need be taken into account in connexion with the exchange phenomenon.
The simplification with regard to the exchange degeneracy intro
duced by the antisymmetry condition is, however, balanced by the addi
tional degeneracy, due to the possibility of assigning to each electron two
different spin-states connected with the same type of orbital motion and corresponding to the same value of the energy. We thus get for the whole system of $n$ electrons, distributed between $n$ 'orbits', i.e. spinless states, which can be specified by certain functions of the geometrical coordinates alone $\psi_1(x), \ldots, \psi_n(x)$, a degenerate set of $2^n$ states differing from each other by the spin quantum numbers $m_1, m_2, \ldots, m_n$, associated with each spinless state.
The individual states with spin can be described by the functions
$$\psi_i(x, \zeta) = \psi_i(x)\, \delta_{m_i\zeta}, \tag{346 a}$$
where $m$ and $\zeta$ assume the values $\tfrac12$ and $-\tfrac12$, $\delta_{m\zeta}$ being equal to 1 for $\zeta = m$, and to 0 for $\zeta \neq m$ (it should be remembered that $m$ denotes the characteristic value of the component of the spin-matrix $\sigma$ along some fixed axis).
The spinless functions $\psi_i(x)$ need not be all different; they can occur in pairs, under the condition that the associated spin quantum numbers $m_i$ are different. Instead of four degenerate states we get for each such pair only two, so that the total number of degenerate states of the whole system is equal to $2^{n'+n''} = g$, where $n'$ is the number of singly occupied spinless states and $n''$ the number of doubly occupied spinless states ($n = n' + 2n''$).
In the absence of any other degeneracy except the spin one [and the
exchange degeneracy which is taken care of by using as zero approxima
tion the antisymmetrical function (346)], the problem of determining
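to the first approximation the wave functions with spin $\chi(x, \zeta)$ corresponding to the spinless energy operator $H$ can be solved by defining these functions as linear combinations of $g$ functions of the type (346),
$$\chi = \sum_{\alpha=1}^{g} C_\alpha \Phi_\alpha, \tag{347}$$
where the coefficients $C_\alpha$ satisfy the system of $g$ equations,
$$\sum_\beta (H_{\alpha\beta} - H' J_{\alpha\beta})\, C_\beta = 0 \qquad (\alpha = 1, 2, \ldots, g) \tag{347 a}$$
under the compatibility condition
$$|H_{\alpha\beta} - H' J_{\alpha\beta}| = 0, \tag{347 b}$$
which serves for the determination of the energy-levels $H'$.
The matrix elements $H_{\alpha\beta}$ and $J_{\alpha\beta}$ must be defined here by the expressions
$$H_{\alpha\beta} = \sum_\zeta \int \Phi_\alpha^*\, H\, \Phi_\beta\, dX, \qquad J_{\alpha\beta} = \sum_\zeta \int \Phi_\alpha^*\, \Phi_\beta\, dX, \tag{348}$$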
where $\sum_\zeta$ denotes a summation over the spin coordinates of all the $n$ electrons involved in the functions $\Phi$.
Taking into account the relation
$$\sum_\zeta \delta_{m\zeta}\, \delta_{m'\zeta} = \delta_{mm'}, \tag{348 a}$$
which follows from the definition of the symbols $\delta$ (where $\zeta$ refers to one particular electron), we can easily find that the matrix elements (348) can be different from zero only if the functions $\Phi_\alpha$ and $\Phi_\beta$ are associated with the same value of the resulting spin component
$$m = \sum_i m_i. \tag{348 b}$$
In fact, $H_{\alpha\beta}$ and $J_{\alpha\beta}$ can be expressed as a sum of terms each involving a product of $n$ factors of the type (348 a). Now unless the two states $\alpha$ and $\beta$ are associated with an equal number of spins pointing in the same direction, i.e. specified by spin quantum numbers $m_i$ having the same value ($\tfrac12$ or $-\tfrac12$), one at least of these $n$ factors will vanish in each such term.
We thus see that the functions $\Phi_\alpha$ can be divided into a number of non-combining groups belonging to different characteristic values of the total spin component $m$ of all the electrons along a certain axis, $z$ say. This result is a direct corollary of the fact that the spinless energy operator $H$ commutes with each of the spin matrices $\sigma_{zi}$ and consequently with their sum
$$\sigma_z = \sum_{i=1}^{n} \sigma_{zi}.$$
Now this means that the matrix of $H$ is diagonal with respect to $m$. We have in fact (leaving other variables out of account)
$$(H\sigma_z - \sigma_z H)_{m'm''} = H_{m'm''}(m'' - m') = 0,$$
whence it follows that $H_{m'm''} = 0$ unless $m' = m''$.
The subdivision of the functions $\Phi$ into groups belonging to the same value of $m$ greatly simplifies the perturbation problem under consideration, for the $g$ equations (347 a) are split up hereby into a number of separate systems, containing coefficients which refer to functions $\Phi$ of the same group only. The function $\chi(x, \zeta)$ stabilized for the perturbation will belong accordingly to a definite characteristic value $m$ of $\sigma_z$ specifying the corresponding group. The equations (347), (347 a), and (347 b) will be understood in the sequel to refer to one particular group of $g$ states with the same value of $m$.
If all the spinless states $\psi_1, \ldots, \psi_n$ are different, the number $g$ is given by the formula [$C_n^r$ is the usual binomial coefficient $n!/\{r!(n-r)!\}$]
$$g(m) = C_n^{\frac12 n \pm m}. \tag{349}$$
In fact the number of ways in which $n_+$ positive and $n_-$ negative spins can be associated with the $n$ different orbits is obviously equal to $C_n^{n_+} = C_n^{n_-}$, which reduces to (349) since
$$m = \tfrac12 (n_+ - n_-), \tag{349 a}$$
that is,
$$n_\pm = \tfrac12 n \pm m. \tag{349 b}$$
The sum $\sum g(m)$ taken for all values of $m$ from $-\tfrac12 n$ to $\tfrac12 n$ is equal to $\sum_{n_+=0}^{n} C_n^{n_+} = 2^n$, as of course it should be.
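The formula (349) can be confirmed by direct enumeration of the spin assignments. In the sketch below (hypothetical function names; $n = 4$ chosen arbitrarily) every orbit is given the spin quantum number $+\tfrac12$ or $-\tfrac12$ and the assignments are counted according to the resulting $m$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

def g_by_enumeration(n):
    """Count spin assignments (m_1, ..., m_n), m_i = +-1/2, by their total m."""
    half = Fraction(1, 2)
    return Counter(sum(spins) for spins in product((half, -half), repeat=n))

def g_formula(n, m):
    """g(m) = C(n, n/2 + m), formula (349)."""
    return comb(n, int(Fraction(n, 2) + m))

n = 4
counts = g_by_enumeration(n)
```

For $n = 4$ the enumeration gives 1, 4, 6, 4, 1 states for $m = -2, -1, 0, 1, 2$, the sum being $2^4 = 16$.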
The $g(m)$ functions $\Phi_\alpha$ forming a certain group can be obtained from one of them, $\Phi$, by permuting the spin quantum numbers $m_1, m_2, \ldots, m_n$ associated with the separate orbits between the latter, with the condition that identical orbits, if present, should always be associated with opposite spins. Such permutations $P$ must be distinguished from those which we have considered before and which referred either to the distribution of the electrons between the (spinless) states or of the states between the electrons.
Just as before, however, it can be concluded from this circumstance that the number of different matrix elements $H_{\alpha\beta}$ and $J_{\alpha\beta}$ is reduced from $g_m^2$ to $g_m$. We shall not stop to investigate this question, for, as has been shown by Slater, all we need to know are the diagonal elements of the energy, from which the perturbed energy-levels can easily be computed without directly solving the perturbation equations (347 a).
The diagonal elements of $H$ are easily seen to have the same value, $H(m)$ say, for all the $g(m)$ functions $\Phi$. If the individual wave functions (with spin) $\psi_i$ are orthogonal and normalized to 1, i.e. if $J_{\alpha\beta} = \delta_{\alpha\beta}$ (which we shall assume to be the case), then according to (347 b) the sum of the diagonal elements of $H$, that is, the product $H(m)g(m)$, must be equal to the sum of the $g(m)$ characteristic values of $H$ belonging to $m$ which are the roots of equation (347 b). Now whereas $m$, being the characteristic value of the projection $\sigma_z$ of the resulting spin $\sigma$ on the direction of the arbitrarily chosen $z$-axis, depends upon the choice of its direction in space, the characteristic values of the energy must obviously be independent of the choice of this direction, being in fact invariant with respect to the rotations of the coordinate axes. They must be determined therefore by the characteristic values $s$ of the resulting spin itself, which are also invariant both with respect
to rotations of the coordinate axes and to the permutations of the
electrons.
So long as the forces due to the spin of the electrons (including the
effects of their orientation in an external magnetic field) are neglected,
all those states which belong to the same value of the resulting spin
form a degenerate set, so that their energy is wholly determined by s.
The number of such states f(s) and their energy H(s) can easily be
calculated from g(m) and H(m) if we take into account the fact that,
for a given m, s can assume the following values:
$$s = |m|,\ |m| + 1,\ \ldots,\ \tfrac12 n.$$
Subdividing all the states belonging to a definite $m$ into groups specified by different values of $s$, we thus get
$$g(m) = \sum_{s=|m|}^{n/2} f(s) \tag{350}$$
and
$$g(m)H(m) = \sum_{s=|m|}^{n/2} f(s)H(s). \tag{350 a}$$
The latter equation can be rewritten in the form
$$H(m) = \frac{\sum_{s=|m|}^{n/2} f(s)H(s)}{\sum_{s=|m|}^{n/2} f(s)}, \tag{350 b}$$
which expresses the fact that the diagonal elements of the energy $H$ are equal to the average value of the energy for all the states associated with the corresponding value of $m$.
From (350) and (350 a) we obtain
$$-f(s) = g(s+1) - g(s) = \Delta g(s) \tag{351}$$
and
$$-f(s)H(s) = g(s+1)H(s+1) - g(s)H(s) = \Delta[g(s)H(s)], \tag{351 a}$$
whence
$$H(s) = \frac{\Delta[g(s)H(s)]}{\Delta g(s)}. \tag{351 b}$$
Since g(s) is known, being determined by the equation (349) in the case
of n different orbits, our problem reduces to the calculation of a diagonal
element of H for a given value of m (= s).
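The passage from the diagonal elements $H(m)$ to the energies $H(s)$ by means of (351) and (351 b) can be sketched numerically. The example below is an illustration under stated assumptions: two electrons in two different orbits, made-up values of $W^0$ and $G$, and the two diagonal elements $H(1) = W^0 - G$, $H(0) = W^0$ supplied as assumed inputs rather than derived; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def g(n, m):
    """g(m) = C(n, n/2 + m) for n different orbits; zero outside |m| <= n/2."""
    r = Fraction(n, 2) + m
    return comb(n, int(r)) if 0 <= r <= n else 0

def multiplet_levels(n, H_of_m):
    """Return {s: (f(s), H(s))} from the diagonal elements H(m), m >= 0,
    using f(s) = g(s) - g(s+1) and formula (351 b)."""
    levels = {}
    for s in sorted(H_of_m, reverse=True):
        f = g(n, s) - g(n, s + 1)
        upper = g(n, s + 1) * H_of_m.get(s + 1, 0)
        levels[s] = (f, (g(n, s) * H_of_m[s] - upper) / f)
    return levels

# Made-up numbers; the diagonal elements below are assumed inputs.
W0, G = Fraction(-3), Fraction(1, 2)
H_of_m = {Fraction(1): W0 - G, Fraction(0): W0}
levels = multiplet_levels(2, H_of_m)
```

With these inputs the state $s = 1$ receives the energy $W^0 - G$ and the state $s = 0$ the energy $W^0 + G$, each occurring once for a given $m$.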
We shall take for the operator H the expression (343), i.e.
$$H = \sum_i E(x_i, p_i) + \sum_{i<k} F(x_i, x_k),$$
which is the only one occurring in practice.
We shall further write one of the functions $\Phi$ defined by the determinant (346) in the form
$$\Phi = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P\, P_x\phi(x)\, P_\zeta\delta_m(\zeta), \tag{352}$$
where $\phi(x) = \psi_1(x_1)\psi_2(x_2)\cdots\psi_n(x_n)$ is the product of the spinless functions and $\delta_m(\zeta) = \delta_{m_1\zeta_1}\delta_{m_2\zeta_2}\cdots\delta_{m_n\zeta_n}$ the product of the corresponding spin factors, $\epsilon_P$ being equal to $+1$ or $-1$ for permutations of the even and odd type respectively (the permutations $P_x$ refer to the geometrical coordinates and $P_\zeta$ to the spin coordinates of the electrons).
Let us consider the case when all the n orbits t/fj,02, \jtn are
different (and orthogonal to each other). The expression
h - 2 J 4>*//4> d x = i j 2 2 / pA H Q A dX 2 Pf m Q {m ,
* V" P Q i (352a)
which defines the diagonal matrix element of $H$ with respect to the
state $\Phi$ (or the corresponding average value) is then easily simplified.
The integral $H_{PQ} = \int P_x\phi^*\, H\, Q_x\phi\, dX$ differs from zero, as we know,
only when the permutations $P$ and $Q$ are identical ($P = Q$) or when
they differ by a transposition $T_{ik}$ of any two electrons ($Q = PT_{ik}$). It
reduces in the first case to $H_I = W^0 = \sum_i E_{ii} + \sum_{i<k} F_{ik}$ and in the
second to $G_{ik}$ [cf. equations (344), (344a), (344b), and (345a) of the
preceding section].
We have further, when $P = Q$,
$$ \frac{1}{n!} \sum_P \sum_\xi P_\xi\delta \cdot P_\xi\delta = \sum_\xi P_\xi\delta \cdot P_\xi\delta = 1, $$
since the total number of different permutations is just equal to $n!$.
A little more care is required for the calculation of the preceding
expressions when $Q = PT_{ik}$. It is clear that the function
$$ \delta = \prod_i \delta_{m_i\xi_i} $$
remains unaltered if the same permutation $R$ is applied both to the
spin coordinates $\xi_i$ and to the spin quantum numbers $m_i$ (or more
exactly, to the indices of these variables). Any permutation $P_\xi$ of the
former can therefore be replaced by the reciprocal permutation $P_m^{-1}$ of
the latter. We thus have
$$ \sum_\xi P_\xi\delta \cdot P_\xi T_{ik}\delta = \sum_\xi P_m^{-1}\delta \cdot P_m^{-1} T^{\xi}_{ik}\delta, $$
where $T^{\xi}_{ik}$ denotes, as before, the interchange of the coordinates $\xi_i$ and
$\xi_k$, which in the original distribution were assigned to the $i$th and $k$th
electrons.
Now in the function $P_m^{-1}\delta$ these coordinates will be associated with
the spin quantum numbers $P^{-1}m_i = m_{i'}$ and $P^{-1}m_k = m_{k'}$, where $i'$ and $k'$
are the numbers derived from $i$ and $k$ by the permutation $P^{-1}$. In the
function $P_m^{-1}T^{\xi}_{ik}\delta$ the same coordinates will be associated with the spin
quantum numbers $m_{k'}$ and $m_{i'}$ respectively. The sum $\sum_\xi P_m^{-1}\delta \cdot P_m^{-1}T^{\xi}_{ik}\delta$
will obviously be equal to 1 if these two numbers are equal ($+\tfrac12$ or $-\tfrac12$)
and to 0 if they are different.
Let us suppose that the numbers $m_1, m_2, \dots, m_n$ are labelled in such
a way that the first $n_+$ of them are equal to $+\tfrac12$ and the last $n_-$ to $-\tfrac12$
($n_+ + n_- = n$). If now all the permutations $P_m^{-1}$ are applied to their
indices, then each index will have an equal chance of being found at any
place of the line, under the condition that two originally different
indices will always have different places.
The number of positions which any two indices corresponding
originally to $i$ and $k$ can assume in the row of the $n_+$ positive spins is
obviously equal to $n_+(n_+-1)$, and in that of the $n_-$ negative spins to
$n_-(n_--1)$. The sum of these two numbers multiplied by $(n-2)!$ will
give the total number of distributions (i.e. permutations $P$) for which
the two spins in question are equal. We thus see that in the case
$Q = PT_{ik}$ the expression
$$ \frac{1}{n!} \sum_P \sum_\xi P_\xi\delta \cdot P_\xi T_{ik}\delta $$
is equal, irrespective of the choice of $i$ and $k$, to
$$ \frac{n_+(n_+-1) + n_-(n_--1)}{n(n-1)}. $$
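The counting argument just given can be verified by brute force for small $n$. The sketch below (our own illustration; the function names are hypothetical) applies every permutation to the indices of the spin quantum numbers and records the fraction of cases in which the two electrons $i$ and $k$ receive equal spins.

```python
from fractions import Fraction
from itertools import permutations

def same_spin_fraction(n_plus, n_minus, i=0, k=1):
    """Fraction of the n! permutations for which electrons i and k carry equal spins."""
    n = n_plus + n_minus
    spins = [+1] * n_plus + [-1] * n_minus   # m_1..m_n: first n_plus positive, rest negative
    hits = total = 0
    for perm in permutations(range(n)):
        total += 1
        # after the permutation, electron i carries spins[perm[i]], electron k spins[perm[k]]
        if spins[perm[i]] == spins[perm[k]]:
            hits += 1
    return Fraction(hits, total)

def closed_form(n_plus, n_minus):
    """The ratio [n+(n+ - 1) + n-(n- - 1)] / [n(n - 1)] obtained in the text."""
    n = n_plus + n_minus
    return Fraction(n_plus * (n_plus - 1) + n_minus * (n_minus - 1), n * (n - 1))
```

The enumeration agrees with the closed expression for any choice of the pair $(i, k)$, as the text asserts.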
The expression (352 a) for the average value of the energy assumes
accordingly the following form
$$ \bar H = \sum_i E_{ii} + \sum_{i<k} F_{ik} - \frac{n_+(n_+-1) + n_-(n_--1)}{n(n-1)} \sum_{i<k} G_{ik}, \tag{353} $$
where the negative sign corresponds to the fact that $\epsilon_P\epsilon_Q = -1$ for
two permutations differing from each other by a transposition (one of
them being of the even and the other of the odd type). Writing $W^0$
for the sum of the first two terms and putting $m = \tfrac12(n_+ - n_-)$, i.e.
$n_+ = \tfrac12 n + m$, $n_- = \tfrac12 n - m$, we can represent $\bar H$ as a function of $m$
explicitly by the formula
$$ H(m) = W^0 - \frac{\tfrac12 n(n-2) + 2m^2}{n(n-1)} \sum_{i<k} G_{ik}. \tag{353a} $$
As would be expected, this expression is a function of $m$ alone, and is
independent of the choice of $\Phi$ out of the group belonging to a given
$m$, i.e. is the same for all diagonal elements of the energy matrix. We
can now pass on to the calculation of the characteristic values of the
energy as functions of the resulting spin $s$.
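In the first place we have, according to (351) in conjunction with
(349),
$$ f(s) = C^{n}_{n/2+s} - C^{n}_{n/2+s+1} = \frac{2s+1}{n+1}\, C^{n+1}_{n/2+s+1}. \tag{354} $$
Further, according to (351 a) and (353 a),
$$ f(s)H(s) = g(s)H(s) - g(s+1)H(s+1), $$
with $H(m)$ taken from (353 a), whence
$$ H(s) = W^0 - \frac{\tfrac12 n(n-4) + 2s(s+1)}{n(n-1)} \sum_{i<k} G_{ik}. \tag{354a} $$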
This formula was originally derived by Heitler in connexion with the
spin theory of chemical forces. The derivation given above is a
modification of that given by Slater (in his theory of energy-levels in a
complex atom) and by Pauli (in connexion with Heisenberg’s theory of
ferromagnetism).
Pauli’s method of dealing with the perturbation problem under
consideration differs from that of Slater in the choice of the original wave
functions with spin. Instead of taking the antisymmetrical functions
defined by the determinant (346) we can use as the zero approximation,
just as in the spinless case, the factorized functions obtained by
multiplying by each other the individual functions $\psi_i(x_k)\,\delta_{m_i\xi_k}$.
We shall slightly modify our previous notation by introducing the
letters $J_1, J_2, \dots, J_n$ to specify the different spinless orbits with which
the separate electrons are associated and by writing $(J_i|x_k)$ instead of
$\psi_{J_i}(x_k)$, and $(m_i|\xi_k)$ instead of $\delta_{m_i\xi_k}$. The factorized function with which
we must start can be obtained from one of them
$$ \phi(x,\xi) = (J_1|x_1)(m_1|\xi_1) \cdots (J_n|x_n)(m_n|\xi_n) = (J|x)(m|\xi) \tag{355} $$
by permuting the different electrons, i.e. by applying the same
permutations $P$ to the arguments $x$ and $\xi$, and also taking the two possible
values for each of the spin quantum numbers $m_i$. Now, as has been
shown before, only those functions (355) must be combined with each
other which correspond to the same value of the sum $\sum m_i = m$ and
which accordingly can be obtained from each other by applying various
permutations $R$ (independent of $P$) to the indices $m_i$. The set of
degenerate states which must be taken into account for the construction
of the wave function $\chi(x,\xi)$ stabilized for the perturbation can thus be
specified by the expression
$$ (J|Px)(Rm|P\xi), \tag{355a} $$
where $P$ and $R$ are arbitrary permutations. Since a permutation of
the arguments $(x, \xi)$ is equivalent to the reciprocal permutation of the
indices $(J, m)$, we can replace the preceding expression by
$$ (P^{-1}J|x)(P^{-1}Rm|\xi) $$
or by
$$ \phi_{P,Q} = (PJ|x)(Qm|\xi), \tag{355b} $$
$P$ and $Q$ being independent of each other.
The $n!$ different permutations $Q$ actually lead to
$$ g(m) = C^{n}_{n_+} = C^{n}_{n_-} = C^{n}_{n/2+m} $$
different spin factors $(Qm|\xi) = (m|Q^{-1}\xi)$ which are distinguished from
each other by the coordinates $\xi_1, \dots, \xi_n$ associated with the values $m_i = +\tfrac12$
and $m_i = -\tfrac12$ respectively. In what follows we shall assume the
permutations $Q$ to be subdivided into $g(m)$ classes, corresponding to the
different functions $(Qm|\xi)$, and shall take for $Q$ only one representative
of each class, treating all the permutations of each class as identical.
The function $\chi(x,\xi)$ can now be defined by the formula
$$ \chi(x,\xi) = \sum_P \sum_Q C_{P,Q}\, \phi_{P,Q}, \tag{356} $$
where the coefficients $C_{P,Q}$ are determined by the equations
$$ \sum_{P'} \sum_{Q'} \left( H_{P,Q;P',Q'} - H' J_{P,Q;P',Q'} \right) C_{P',Q'} = 0 \tag{356a} $$
with
$$ H_{P,Q;P',Q'} = \sum_\xi \int \phi^*_{P,Q}\, H\, \phi_{P',Q'}\, dV = \sum_\xi (Qm|\xi)(Q'm|\xi) \int (x|PJ)\, H\, (P'J|x)\, dV $$
and
$$ J_{P,Q;P',Q'} = \sum_\xi \int \phi^*_{P,Q}\, \phi_{P',Q'}\, dV = \sum_\xi (Qm|\xi)(Q'm|\xi) \int (x|PJ)(P'J|x)\, dV. $$
So long as we are considering effectively different permutations $Q$ and
$Q'$ only we can assume the sums $\sum_\xi (Qm|\xi)(Q'm|\xi)$ to vanish except for
the case $Q = Q'$ when they are equal to 1. The non-vanishing matrix
elements of $H$ and $J$ thus reduce to
$$ H_{P,Q;P',Q} = H_{P,P'} = H_{PP'^{-1}}, \qquad J_{P,Q;P',Q} = J_{P,P'} = J_{PP'^{-1}}, $$
where $H_{P,P'}$ and $J_{P,P'}$ are the usual matrix elements of $H$ and of the
unit operator $J = 1$
with regard to the spinless functions $(PJ|x)$ and $(P'J|x)$. The equations
(356 a) can therefore be rewritten in the form
$$ \sum_{P'} \left( H_{PP'^{-1}} - H' J_{PP'^{-1}} \right) C_{P',Q} = 0 $$
or, if we put $PP'^{-1} = R$,
$$ \sum_R \left( H_R - H' J_R \right) C_{R^{-1}P,Q} = 0. \tag{356b} $$
We can now make use of the fact that the only functions (356) we
need are the antisymmetrical ones. This means that $\chi(Sx, S\xi) = \epsilon_S\,\chi(x,\xi)$,
where $\epsilon_S = +1$ for a permutation $S$ of even type and $-1$ for one of
odd type. Since the application of a permutation $S$ or $S^{-1}$ to the
arguments $x$, $\xi$ of the functions (355 b) is equivalent to the application
of the reciprocal permutation to the indices $J$, $m$, we get
$$ \sum_P \sum_Q C_{P,Q}\, \phi_{S^{-1}P,\,S^{-1}Q} = \epsilon_S \sum_P \sum_Q C_{P,Q}\, \phi_{P,Q} $$
or, replacing $S^{-1}P$ and $S^{-1}Q$ in the first sum by $P'$ and $Q'$,
$$ \sum_{P'} \sum_{Q'} C_{SP',SQ'}\, \phi_{P',Q'} = \sum_P \sum_Q C_{SP,SQ}\, \phi_{P,Q} = \epsilon_S \sum_P \sum_Q C_{P,Q}\, \phi_{P,Q}, $$
whence it follows that
$$ C_{SP,SQ} = \epsilon_S\, C_{P,Q}. \tag{357} $$
This gives, if $SQ$ is replaced by $Q$ (i.e. $Q$ by $S^{-1}Q$) and $S$ by $R^{-1}$,
$$ C_{R^{-1}P,Q} = \epsilon_R\, C_{P,RQ}, $$
so that the equations (356 b) can be rewritten in the form
$$ \sum_R \epsilon_R \left( H_R - H' J_R \right) C_{P,RQ} = 0. \tag{357a} $$
The index $P$ is irrelevant, as it is the same for the whole system of
equations and can therefore be left out of account. So far as the
coefficients $C_{P,RQ} = C_{RQ}$ are concerned the summation over $R$ can lead
to $g(m)$ different values only, which will be multiplied in equations
(357 a) by the sum of the expressions $\epsilon_R(H_R - H'J_R)$ for all the $R$’s which
correspond to equivalent permutations $RQ$.
Putting as before $H = \sum_i E(x_i,p_i) + \sum_{i<k} F(x_i,x_k)$ and assuming the
functions $(PJ|x)$ to be mutually orthogonal, we get
$$ (H_I - H')\,C_Q - \sum_{i<k} G_{ik}\, C_{T_{ik}Q} = 0, \tag{357b} $$
where $I$ denotes the identical permutation, so that
$$ H_I = W^0 = \sum_i E_{ii} + \sum_{i<k} F_{ik}, $$
while $T_{ik}$ corresponds to an interchange between the spin quantum
numbers $m_i$ and $m_k$.
The $g(m)$ different coefficients $C_Q$ can be specified unambiguously by
the indices of the $n_+$ electrons with a positive component of their spin
along the $z$-axis. We can thus write (following Pauli)
$$ C_Q = C(r_1, r_2, \dots, r_{n_+}), $$
where $r_1, r_2, \dots, r_{n_+}$ are the indices in question, $C$ being independent of
the order in which they appear. We can put in particular $r_1 = 1$,
$r_2 = 2, \dots, r_{n_+} = n_+$ without affecting the generality of our theory, since
the choice of the permutation $Q$ in the equations (357 b) is irrelevant
for their solution. Putting accordingly
$$ C_{T_{ik}Q} = T_{ik}C_Q = T_{ik}\,C(r_1, r_2, \dots, r_{n_+}) = C(r'_1, r'_2, \dots, r'_{n_+}) $$
we can rewrite the equation (357 b) in the form
$$ (H_I - H')\,C(r_1, r_2, \dots, r_{n_+}) - \sum_{i<k} G_{ik}\, T_{ik}\, C(r_1, r_2, \dots, r_{n_+}) = 0. \tag{357c} $$
If we consider the determinant of these equations, whose roots give
the allowed values of the energy $H'$, we see at once that the sum of
these values for all the $g(m)$ perturbed states is equal to the sum of the
coefficients of $C(r_1, r_2, \dots, r_{n_+})$ (without of course the term $H'$), that is,
to the expression
$$ \sum_r \Bigl( H_I - {\sum_{i<k}}' G_{ik} \Bigr). $$
The summation $\sum'$ is extended over those pairs of states (or electrons)
which interchange either two of the indices $r_1, r_2, \dots, r_{n_+}$ or two of the
remaining indices, $s_1, s_2, \dots, s_{n_-}$ say (corresponding to negative spins),
without interchanging any $r$ with any $s$,† whereas the summation $\sum_r$ is
extended over the $g(m)$ different combinations of the $r$’s. As a result we
obtain each $G_{ik}$ multiplied by the number of combinations for which
the spins associated with the states $i$ and $k$ (or the $i$th and the $k$th
electrons) are both positive or both negative, i.e.
$$ C^{n-2}_{n_+-2} + C^{n-2}_{n_--2} = g(m)\, \frac{n_+(n_+-1) + n_-(n_--1)}{n(n-1)}, $$
$H_I$ being multiplied by $g(m) = C^{n}_{n_+}$. We thus get for the average value
of the energy $H'$ of the $g(m)$ perturbed states the expression
$$ H(m) = W^0 - \frac{n_+(n_+-1) + n_-(n_--1)}{n(n-1)} \sum_{i<k} G_{ik}, $$
which has been obtained before.
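The trace argument can be made concrete for small $n$. In the sketch below (our own illustration, with hypothetical names) each of the $g(m)$ spin configurations is labelled by the set of indices $r_1, \dots, r_{n_+}$ carrying positive spin, and the diagonal part of $\sum_{i<k} G_{ik}T_{ik}$ is summed directly: a transposition $T_{ik}$ contributes to the diagonal only when it leaves the configuration unaltered, i.e. when the spins of $i$ and $k$ are equal.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def trace_of_exchange_sum(n, n_plus, G):
    """Trace of sum_{i<k} G[i][k] T_ik over the g(m) spin configurations."""
    trace = Fraction(0)
    for up in combinations(range(n), n_plus):   # indices r_1..r_{n+} with positive spin
        up = set(up)
        for i in range(n):
            for k in range(i + 1, n):
                # T_ik is diagonal on this configuration only if i and k carry equal spins
                if (i in up) == (k in up):
                    trace += G[i][k]
    return trace

def expected_trace(n, n_plus, G):
    """g(m) times the ratio found in the text times the sum of the G_ik."""
    n_minus = n - n_plus
    ratio = Fraction(n_plus * (n_plus - 1) + n_minus * (n_minus - 1), n * (n - 1))
    total_G = sum(G[i][k] for i in range(n) for k in range(i + 1, n))
    return comb(n, n_plus) * ratio * total_G

# Arbitrary integer exchange integrals, chosen only for illustration.
G_EXAMPLE = [[(i + 1) * (k + 2) for k in range(5)] for i in range(5)]
```

Although the individual $G_{ik}$ are all different in this example, each pair is counted the same number of times, so that only their sum enters the trace.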
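† i.e. over those indices $k$ which both occur either among the indices $r$ or among
the $n_-$ indices $s$.
As has been shown by Dirac, the transpositions $T_{ik}$ occurring in
(357 c) can be replaced by operators involving Pauli’s spin matrices
$\boldsymbol\sigma_i$ and $\boldsymbol\sigma_k$. Let us consider the scalar product of these spin vectors,
that is, the operator $\boldsymbol\sigma_i\cdot\boldsymbol\sigma_k$, applied to some function of $\boldsymbol\sigma_i$ and $\boldsymbol\sigma_k$, and
in the first place to $\boldsymbol\sigma_i$ and $\boldsymbol\sigma_k$ themselves or their components along
some axis, $z$ say. We have, putting $i = 1$ and $k = 2$,
$$ (\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\,\sigma_{1z} = (\sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z})\,\sigma_{1z}
 = (\sigma_{1x}\sigma_{1z})\,\sigma_{2x} + (\sigma_{1y}\sigma_{1z})\,\sigma_{2y} + (\sigma_{1z}\sigma_{1z})\,\sigma_{2z}, $$
since the vectors $\boldsymbol\sigma_1$ and $\boldsymbol\sigma_2$ commute with each other, and further, in
virtue of the relations (253), § 29,
$$ (\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\,\sigma_{1z} = -i\sigma_{1y}\sigma_{2x} + i\sigma_{1x}\sigma_{2y} + \sigma_{2z} $$
or
$$ (1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\,\sigma_{1z} = i(\sigma_{1x}\sigma_{2y} - \sigma_{1y}\sigma_{2x}) + \sigma_{1z} + \sigma_{2z} = [\,i(\boldsymbol\sigma_1\times\boldsymbol\sigma_2) + \boldsymbol\sigma_1 + \boldsymbol\sigma_2\,]_z. $$
Similar expressions are obtained if $\sigma_{1z}$ is replaced by $\sigma_{1x}$ or $\sigma_{1y}$, so that
$$ (1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\,\boldsymbol\sigma_1 = i\,\boldsymbol\sigma_1\times\boldsymbol\sigma_2 + \boldsymbol\sigma_1 + \boldsymbol\sigma_2. $$
We get likewise
$$ (1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\,\boldsymbol\sigma_2 = i\,\boldsymbol\sigma_2\times\boldsymbol\sigma_1 + \boldsymbol\sigma_1 + \boldsymbol\sigma_2, $$
and
$$ \boldsymbol\sigma_2\,(1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2) = i\,\boldsymbol\sigma_1\times\boldsymbol\sigma_2 + \boldsymbol\sigma_1 + \boldsymbol\sigma_2, $$
whence
$$ (1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)\,\boldsymbol\sigma_1 = \boldsymbol\sigma_2\,(1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2). \tag{358} $$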
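We have on the other hand
$$ (\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)^2 = (\sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z})^2 $$
$$ = \sigma_{1x}^2\sigma_{2x}^2 + \sigma_{1y}^2\sigma_{2y}^2 + \sigma_{1z}^2\sigma_{2z}^2 + \sigma_{1x}\sigma_{2x}\sigma_{1y}\sigma_{2y} + \sigma_{1y}\sigma_{2y}\sigma_{1x}\sigma_{2x} + \dots $$
$$ = 3 + 2\,(i\sigma_{1z})(i\sigma_{2z}) + \dots $$
$$ = 3 - 2\,\boldsymbol\sigma_1\cdot\boldsymbol\sigma_2, $$
and consequently
$$ (1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2)^2 = 4. \tag{358a} $$
It follows from these equations that the spin operator
$$ O_{12} = \tfrac12\,(1 + \boldsymbol\sigma_1\cdot\boldsymbol\sigma_2) \tag{358b} $$
has the same properties with respect to any function of the spin
variables $\boldsymbol\sigma_1$, $\boldsymbol\sigma_2$ as the permutation operator $T_{12}$. This becomes quite
clear if we rewrite the equation (358) in the form
$$ O_{12}\,\boldsymbol\sigma_1\,O_{12}^{-1} = \boldsymbol\sigma_2, $$
or in the equivalent form
$$ O_{12}^{-1}\,\boldsymbol\sigma_2\,O_{12} = \boldsymbol\sigma_1, $$
which reduces to
$$ O_{12}\,\boldsymbol\sigma_2\,O_{12} = \boldsymbol\sigma_1 $$
in view of the equation $O_{12}^2 = 1$ which corresponds to the relation
$T_{12}^2 = 1$ (= identical permutation).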
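These operator relations can be confirmed with explicit matrices. The sketch below is our own illustration: it assumes the usual representation of Pauli's matrices, builds the two-electron operators as Kronecker products in the basis $(++), (+-), (-+), (--)$, and uses plain nested lists so that nothing beyond the standard library is needed. All helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lincomb(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two matrices."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

I4 = kron(ONE, ONE)

# sigma_1 . sigma_2: each component acts on electron 1 (left factor) and electron 2 (right factor)
S12 = [[0] * 4 for _ in range(4)]
for s in (SX, SY, SZ):
    S12 = lincomb(1, S12, 1, kron(s, s))

O12 = lincomb(0.5, I4, 0.5, S12)        # the operator (358 b)

# Transposition of the two spins: interchanges the basis states (+-) and (-+)
T12 = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

S1Z = kron(SZ, ONE)                      # z-component of the spin of electron 1
S2Z = kron(ONE, SZ)                      # z-component of the spin of electron 2
```

In this representation `O12` coincides element by element with the transposition matrix `T12`, its square is the unit matrix, and `matmul(matmul(O12, S1Z), O12)` reproduces `S2Z`.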
The equivalence between $O_{12}$ and $T_{12}$ is preserved with regard to the
functions of the other spin variables $\boldsymbol\sigma_3$, $\boldsymbol\sigma_4$, etc., since they commute
with $\boldsymbol\sigma_1$ and $\boldsymbol\sigma_2$, and further with regard to any function of the type
$C(r_1, r_2, \dots, r_{n_+})$ since we can replace the indices $r_k$ of the electrons by the
corresponding spin variables $\boldsymbol\sigma_k$ (or their squares). We can accordingly
replace the permutation operators $T_{ik}$ in the system of equations (357 c)
by the spin operators $O_{ik}$ (the fact that the sign need not be changed
can easily be ascertained by considering a particular case). This system
of equations can thus be written in the standard form of a wave equation
$$ (W - H')\,C = 0, \tag{359} $$
where
$$ W = H_I - \tfrac12 \sum_{i<k} G_{ik}\,(1 + \boldsymbol\sigma_i\cdot\boldsymbol\sigma_k)
 = \sum_i E_{ii} + \sum_{i<k} \bigl(F_{ik} - \tfrac12 G_{ik}\bigr) - \tfrac12 \sum_{i<k} G_{ik}\,\boldsymbol\sigma_i\cdot\boldsymbol\sigma_k \tag{359a} $$
is the approximate energy operator, which is equivalent to $H$ as far as
the first approximation of the perturbation theory is concerned.
This result, due to Dirac, is very important both from the practical
point of view (for in many cases it enables one to calculate very easily
the perturbed energy-levels) and from the theoretical point of view,
for it shows that the ‘exchange energy’ in connexion with the
antisymmetry principle can be interpreted, in a purely formal way, as
due to a fictitious kind of magnetism associated with the spin. In fact
the expression
$$ H^{(ik)} = -\tfrac12\, G_{ik}\,\boldsymbol\sigma_i\cdot\boldsymbol\sigma_k \tag{359b} $$
can be considered as representing the energy of a fictitious magnetic
interaction between the $i$th and $k$th electrons, their actual magnetic
moments being replaced by quantities of an electrostatic nature. It
should be noted that only a part of the exchange energy can be
interpreted in this way; another part $-\tfrac12 \sum_{i<k} G_{ik}$ goes over into the ordinary
electrostatic energy $\sum_{i<k} F_{ik}$.
We shall consider in Part III some important applications of the
quasi-magnetic effects determined by (359 b) to the theory of the
magnetic properties of atoms and of ferromagnetic bodies. Another
illustration of equations (359 a) will be found in the theory of the chemical
forces between two atoms, inasmuch as no other type of degeneracy
than that due to the exchange and spin effect has to be taken into
account.
The above theory can easily be extended to the more general case
when an additional degeneracy (such as that due to the different
orientations of the electron orbits in a complex atom) must be included
in the perturbation problem. We shall not stop here, however, to
examine this general case.
43. The Method of the Self-consistent Field with Factorized
Wave Functions
The reduction of the problem of many electrons to that of a single
electron in that form in which it has been considered in the two
preceding sections is based on the description of the unperturbed motion
of each electron in a given external field, that is, by means of an
individual wave function of a given form. Now in actual problems,
connected with the structure of atoms and molecules, such a field
cannot be defined beforehand in a way which would ensure the degree of
accuracy of the zero-order approximation which is necessary for the
successful application of the perturbation theory. We must now turn
to the consideration of this problem, namely, the problem of the
determination of the ‘equivalent external field’ for the separate electrons
forming a more or less complicated system (such, for example, as a
complex atom).
A relatively simple method which is quite similar to that used in
the earlier (Bohr’s) quantum theory of complex atoms, consists in the
identification of the external field acting on a given electron with that
of a bare nucleus (or nuclei, if there is more than one) with an electric
charge differing from the actual one by a certain constant, which,
divided by the elementary charge, is denoted as the ‘screening constant’
and is to be chosen in such a way as to represent with the highest
possible degree of accuracy the effect of the repulsive forces acting on
each electron due to all the rest.
To get a more exact description of this action it is sometimes
preferable to distribute the electric charge of all the electrons except that
under consideration in a continuous way over some surface, or in a
certain volume, with a uniform density or a density varying according to
some more or less arbitrarily chosen law.
In all these cases we get a problem containing a finite number of
constant parameters which must be adjusted in a way leading to the
least possible error.
This problem is solved very easily, at least in principle, with the
help of the variational form of the equations of motion, namely,
$$ \delta\, \frac{\int \phi^* H \phi\, dV}{\int \phi^* \phi\, dV} = 0, $$
where $\phi(x_1, x_2, \dots, x_n)$ is determined as the product of $n$ individual
functions $\psi_1(x_1; a_1, b_1, \dots)$, $\psi_2(x_2; a_2, b_2, \dots)$, ..., $\psi_n(x_n; a_n, b_n, \dots)$ of known form,
containing a number of undetermined parameters† $a_1, b_1$, etc. [cf.
§ 9, Chap. II].
Under these conditions the expression $W = \int \phi^* H \phi\, dV \big/ \int \phi^* \phi\, dV$,
which is equal to the energy of the system, is defined as a certain
function of the parameters $a$, $b$, whose values must be determined from
the equations
$$ \frac{\partial W}{\partial a_1} = 0, \quad \frac{\partial W}{\partial b_1} = 0, \quad \frac{\partial W}{\partial a_2} = 0, \quad \frac{\partial W}{\partial b_2} = 0, \quad \text{etc.} $$
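The adjustment of a single parameter may be illustrated by the determination of a screening constant. The following Python sketch is an example of our own, not taken from the text: it assumes the standard result for a two-electron atom of nuclear charge $Z$, each electron being described by a hydrogen-like $1s$ function with an effective charge $\zeta$, namely $W(\zeta) = \zeta^2 - 2Z\zeta + \tfrac58\zeta$ in atomic units, and solves $\partial W/\partial\zeta = 0$ by a simple golden-section search.

```python
def energy(zeta, Z=2.0):
    """W(zeta) in atomic units for two electrons in 1s orbitals of effective charge zeta.

    Assumed textbook expression: kinetic energy zeta^2, nuclear attraction -2 Z zeta,
    mutual repulsion (5/8) zeta.
    """
    return zeta ** 2 - 2.0 * Z * zeta + 0.625 * zeta

def minimize_1d(func, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d          # the minimum lies in [a, d]
        else:
            a = c          # the minimum lies in [c, b]
    return (a + b) / 2

zeta_best = minimize_1d(energy, 1.0, 2.0)   # helium, Z = 2
screening = 2.0 - zeta_best                 # the 'screening constant'
```

For helium the search converges to $\zeta = Z - \tfrac{5}{16}$, so that under the stated assumption each electron screens the nucleus from the other by $\tfrac{5}{16}$ of an elementary charge.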
The equation $\delta W = 0$ can be used, however, not only to adjust the
values of a finite number of parameters introduced in the more or less
arbitrarily specified functions $\psi_1, \dots, \psi_n$, but also to determine these
functions themselves without the explicit introduction of any parameters
(implicitly they are contained in the definition of the functions $\psi$ if the
latter are supposed to be expanded in some sort of series). Now the
factorized form of the wave function $\phi$ describing the behaviour of
the whole system of electrons corresponds to the possibility of assigning
to each of them a separate ‘orbit’, i.e. a motion independent (explicitly)
of that of the rest (in the sense of the wave-mechanical probability
interpretation). Inasmuch as the variational principle $\delta W = 0$ ensures
the highest accuracy of the results consistent with any given assumption
about the character of the motion, we can thus state that the most
accurate description of the motion of a system of electrons in terms of
the quasi-independent motions of the separate electrons is obtained by
defining the functions $\psi_1(x_1), \psi_2(x_2), \dots, \psi_n(x_n)$, describing these individual
motions, with the help of the variational equation, with
$$ \phi(x_1, \dots, x_n) = \psi_1(x_1) \cdots \psi_n(x_n). $$
The above method has the advantage of avoiding the introduction of
an arbitrary effective external field for each electron. Such a field is,
however, introduced implicitly and can easily be determined in an
explicit form. This is the so-called ‘self-consistent field’ which we have
already alluded to many times, and which was applied for the first time
to the problem of complex atoms by Hartree.
† We shall leave aside for the time being the complications arising from the spin
effect.
In his original theory of the self-consistent field Hartree did not
make any use of the variational principle (which was introduced for
this purpose later on by V. Fock and J. C. Slater) but was guided by
the idea that the action experienced by one electron due to the rest
can be calculated approximately by distributing in space the electric
charge of the latter with a density proportional to the probability of
their respective positions. The contribution of each electron to the
probable density of charge $\rho$ at a given point is obviously given by
$\rho_k = e|\psi_k(x)|^2$ under the condition that all the individual functions $\psi$
are normalized to 1:
$$ \int |\psi_k(x)|^2\, dx = 1 $$
(where $dx$ is an abbreviation for the element of volume $dx\,dy\,dz$). The
potential energy $U'_i$ of the $i$th electron with respect to all the others
can be determined accordingly by the expression
$$ U'_i = \sum_{k \neq i} U_{ik}, $$
where
$$ U_{ik} = e^2 \int \frac{|\psi_k(x_k)|^2}{r_{ik}}\, dx_k, $$
or, with a slightly different notation,
$$ U'_i(\mathbf r) = e^2 \int \frac{1}{|\mathbf r - \mathbf r'|} \sum_{k \neq i} |\psi_k(\mathbf r')|^2\, dV'. \tag{360} $$
Adding to this expression the potential energy $U_0(\mathbf r)$ of the external
forces (which must obviously have the same form for all the electrons)
and substituting the resulting effective energy
$$ U_i(\mathbf r) = U_{0i}(\mathbf r) + U'_i(\mathbf r) \tag{360a} $$
in the Schrödinger equation
$$ \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla^2 + U_i(\mathbf r) - W_i \Bigr] \psi_i = 0, \tag{360b} $$
we can determine the wave function $\psi_i$ describing the motion of the
electron in question if the functions $\psi_k$ ($k \neq i$) describing that of the
other electrons are supposed to be known. Now as a matter of fact they
are not known beforehand, each of them being determined through the
rest by an equation of the form (360 b). We obtain in this way a system
of $n$ integro-differential equations which can serve for the simultaneous
determination of all the $n$ individual wave functions $\psi_i$.
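It may seem at first sight that the total energy $W$ of the whole
system is equal to the sum of the individual energies $W_i$. This is,
however, easily seen not to be the case. In fact multiplying equation (360 b)
on the left by $\psi_i^*$ and integrating, we have, in view of the supposed
normalization of $\psi_i$,
$$ W_i = \int \psi_i^* \Bigl( -\frac{h^2}{8\pi^2 m} \nabla^2 + U_{0i} + U'_i \Bigr) \psi_i\, dx, $$
or, according to the definition of $U'_i$,
$$ W_i = \int \phi^* \Bigl( -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} + \sum_{k \neq i} \frac{e^2}{r_{ik}} \Bigr) \phi\, dV, $$
whence it follows that
$$ \sum_i W_i = \int \phi^* \Bigl[ \sum_i \Bigl( -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} \Bigr) + \sum_i \sum_{k \neq i} \frac{e^2}{r_{ik}} \Bigr] \phi\, dV, $$
whereas the actual value of the total energy, corresponding to our
approximation, is
$$ W = \int \phi^* H \phi\, dV = \int \phi^* \Bigl[ \sum_i \Bigl( -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} \Bigr) + \tfrac12 \sum_i \sum_{k \neq i} \frac{e^2}{r_{ik}} \Bigr] \phi\, dV, $$
the mutual potential energy of all the electrons thus being doubled in
the expression $\sum_i W_i$.
In order to calculate the total energy $W$ with the help of the ‘partial
energies’ $W_i$ we must introduce in addition the ‘proper energies’ of the
separate electrons
$$ E_i = \int \psi_i^* \Bigl( -\frac{h^2}{8\pi^2 m} \nabla^2 + U_{0i} \Bigr) \psi_i\, dx. $$
Denoting their sum $\sum_i E_i$ by $E$, we get
$$ W - \tfrac12 \sum_i W_i = \tfrac12 E, $$
whence
$$ W = \tfrac12 \Bigl( E + \sum_i W_i \Bigr). \tag{360c} $$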
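The doubling of the mutual energy in $\sum W_i$ does not depend on the functions being self-consistent: it holds for any normalized product function if $W_i$ is understood as the average value of the single-electron energy in the field of the other. This can be seen on a toy model. The sketch below is our own illustration: two particles on three discrete sites, with an arbitrary real symmetric one-particle matrix `h` and interaction table `F` standing in for the kinetic-plus-external energy and for $e^2/r_{ik}$ respectively.

```python
def expect(h, psi):
    """Average value of the one-particle matrix h in the real normalized state psi."""
    n = len(psi)
    return sum(psi[x] * h[x][y] * psi[y] for x in range(n) for y in range(n))

def mean_field(F, psi_other):
    """Potential energy at each site due to the smeared-out charge of the other particle."""
    return [sum(F[x][y] * psi_other[y] ** 2 for y in range(len(psi_other)))
            for x in range(len(F))]

def partial_energy(h, F, psi, psi_other):
    """'Partial energy' W_i: proper energy plus average energy in the field of the other."""
    U = mean_field(F, psi_other)
    return expect(h, psi) + sum(psi[x] ** 2 * U[x] for x in range(len(psi)))

def total_energy(h, F, psi1, psi2):
    """Average of H = h(1) + h(2) + F(1,2) in the product state psi1(x1) psi2(x2)."""
    n = len(psi1)
    mutual = sum(psi1[x] ** 2 * F[x][y] * psi2[y] ** 2 for x in range(n) for y in range(n))
    return expect(h, psi1) + expect(h, psi2) + mutual

# Illustrative numbers only (hypothetical model, not from the text).
H1 = [[0.0, -1.0, 0.0], [-1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]
F12 = [[2.0, 1.0, 0.5], [1.0, 2.0, 1.0], [0.5, 1.0, 2.0]]
PSI1 = [0.6, 0.8, 0.0]
PSI2 = [0.0, 0.6, 0.8]

W = total_energy(H1, F12, PSI1, PSI2)
W1 = partial_energy(H1, F12, PSI1, PSI2)
W2 = partial_energy(H1, F12, PSI2, PSI1)
E = expect(H1, PSI1) + expect(H1, PSI2)
```

With these numbers `W1 + W2` exceeds `W` by exactly the mutual energy, and `W` equals `0.5 * (E + W1 + W2)` in agreement with (360 c).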
It should be mentioned that Hartree’s self-consistent field can be
defined either by the resulting probable density of the electric charge
$\rho = e \sum_{k=1}^{n} |\psi_k|^2$, from which the electric potential with due allowance for
the contribution of the external field can be derived by means of
Poisson’s equation, or by the electric density $\rho'_i = \rho - \rho_i = \rho - e|\psi_i|^2$
and the potential energy (360) which corresponds to an electric field
of specific form for each of the electrons.
We shall now come back to the variational equation in the form
$$ \delta \int \phi^* H \phi\, dV = 0, \tag{361} $$
with $\phi$ defined as the product $\psi_1(x_1)\psi_2(x_2)\cdots\psi_n(x_n)$ and the $n$ additional
normalizing conditions $\int \psi_i^* \psi_i\, dx = 1$ or
$$ \delta \int \psi_i^* \psi_i\, dx = 0. \tag{361a} $$
We have
$$ \delta \int \phi^* H \phi\, dV = \int \delta\phi^*\, H \phi\, dV + \int \phi^* H\, \delta\phi\, dV. $$
Now in virtue of the self-adjoint character of the operator $H$ (which we
shall suppose to involve real quantities only) we have further
$$ \int \phi^* H\, \delta\phi\, dV = \int \delta\phi\, H \phi^*\, dV, $$
so that (361) can be written in the form
$$ \int \delta\phi^*\, H \phi\, dV + \int \delta\phi\, H \phi^*\, dV = 0. $$
Substituting here the product $\psi_1(x_1) \cdots \psi_n(x_n)$ for $\phi$ we get
$$ \sum_{i=1}^{n} \int \delta\psi_i^* \prod_{k \neq i} \psi_k^*\, H \phi\, dV + \sum_{i=1}^{n} \int \delta\psi_i \prod_{k \neq i} \psi_k\, H \phi^*\, dV = 0. $$
If we subtract from this equation the $n$ equations equivalent to (361 a)
$$ \int \delta\psi_i^*\, \psi_i \prod_{k \neq i} \psi_k^* \psi_k\, dV + \int \delta\psi_i\, \psi_i^* \prod_{k \neq i} \psi_k^* \psi_k\, dV = 0 $$
multiplied by suitably chosen parameters, $\lambda_i$ say, we can equate to zero
the coefficients of all the variations $\delta\psi_i^*$ and $\delta\psi_i$ (Lagrange's method of
undetermined multipliers). This gives
$$ (H_i - \lambda_i)\,\psi_i = 0, \tag{362} $$
where
$$ H_i = \int \prod_{k \neq i} \psi_k^*\; H \prod_{k \neq i} \psi_k \prod_{k \neq i} dx_k \tag{362a} $$
is an operator which can be defined as the average value of the actual
energy operator $H$ for a given position of the $i$th electron and for all
the configurations of the other ones. Similar equations are obtained by
equating to zero the coefficients of the variations $\delta\psi_i$ with $H_i$ replaced by
$H_i^* = \int \prod_{k \neq i} \psi_k\; H^* \prod_{k \neq i} \psi_k^* \prod_{k \neq i} dx_k$. They need not be considered separately
for they are actually equivalent to the equations (362). The latter
provide the mathematical justification for the physical principle which
was used by Hartree† and are practically equivalent to Hartree’s
equation (360 b) if $H$ is determined, as usual, by the formula
$$ H = \sum_{i=1}^{n} E(x_i,p_i) + \tfrac12 \sum_{i=1}^{n} \sum_{k \neq i} F(x_i,x_k) \tag{362b} $$
with $F(x_i,x_k) = e^2/r_{ik}$.
The only difference between them consists, as is easily seen, in the fact
that $H_i$ involves in addition to the proper energy of the $i$th electron
$-\frac{h^2}{8\pi^2 m}\nabla_i^2 + U_{0i}$ and its average potential energy with respect to the
rest, the average of the energies of all the other electrons. Hence the
constants $\lambda_i$ appearing in (362) are easily seen to have the same value,
namely, $W$, the total energy of the system. It should be mentioned
that the normal state of the latter corresponds to the condition that
$W$ should have the least possible value of all the ‘stationary’ values
which are allowed by the variational equation (361), in conjunction
with (361 a).
† It may be remembered that essentially the same principle had been used before
by Schrödinger in connexion with his attempt to re-establish the wave theory of light
emission on the basis of wave mechanics. (See Part I, § 17.)
The preceding theory applies not only to a system of electrons but
just as well to a system consisting of different particles or indeed
of systems of any sort if $x_i$ denotes the totality of the coordinates
specifying the state of the corresponding elementary system and if the
total energy (362 b) is written in the somewhat more general form
(362 c)
44. The Method of the Self-consistent Field with Antisymmetrical
Functions and Dirac’s Density Matrix
In the particular case of a system of electrons the accuracy of Hartree’s
method is limited not only intrinsically but also by the fact that a
specific distribution of electrons among the $n$ orbits $\psi_1, \dots, \psi_n$ such as
that defined by the function $\phi$ violates the identity principle. The
function $\phi$ defined by the product $\psi_1(x_1) \cdots \psi_n(x_n)$ must serve merely as
a starting-point for the perturbation theory which has been considered
in § 37 in connexion with the exchange degeneracy.
Instead of accounting for the latter a posteriori we can take it into
account from the beginning if we replace the factorized function $\phi$ in
the variational equation by a linear combination of such functions,
corresponding to the different permutations $P$ of the electrons between
the individual states
$$ \chi = \sum_P C_P\, P\phi. $$
The functions $\psi_i$ obtained in this way will of course be somewhat
different from those which are defined by the equations (361c) and
which do not involve the exchange effect. As to the coefficients $C_P$,
they can be shown to be the same as in the case of the perturbation
problem corresponding to functions $\psi_1, \dots, \psi_n$ known a priori.
We shall determine the latter for the antisymmetrical functions with
spin which have been dealt with in the preceding section. We put
accordingly
$$ \chi(x,\xi) = C \sum_P \epsilon_P\, P\phi(x,\xi), \tag{363} $$
where $\psi_i(x,\xi) = \psi_i(x)\,\delta_{m_i\xi}$
and $\phi(x,\xi) = \psi_1(x_1,\xi_1)\,\psi_2(x_2,\xi_2) \cdots \psi_n(x_n,\xi_n)$.
We shall further assume for the sake of simplicity all the individual
wave functions with spin not only to be normalized but also to be
mutually orthogonal in the sense of the equations
$$ \sum_\xi \int \psi_i^*(x,\xi)\,\psi_k(x,\xi)\, dx = \delta_{ik}. \tag{363a} $$
It should be mentioned that if this orthogonality condition were not
fulfilled for the original wave functions $\psi_i$ we could replace them by
certain linear combinations satisfying these conditions. The a priori
introduction of the latter does not therefore impair the generality of
the theory. It serves, however, materially to simplify its external form.
The normalizing condition for the function (363) under the assumption
(363 a) gives $C = 1/\sqrt{n!}$.
It will be convenient in what follows to write $x_i$ for $x_i, \xi_i$ and $\int$ for
$\sum_\xi \int$, thus keeping externally the notation corresponding to spinless
functions. We can formally proceed in the same way as if we were
dealing with an antisymmetrical function (363) without spin.†
Substituting it instead of $\phi$ in the variational equation (361) (which in our
case should be written in the form $\delta \sum_\xi \int \chi^* H \chi\, dV = 0$) and taking account
of the self-adjointness of the operator $H$, we get as before
$$ \int \delta\chi^*\, H \chi\, dV + \int \delta\chi\, H \chi^*\, dV = 0 \tag{364} $$
(the summation over the $\xi$’s being understood).
Now we have according to (363)
$$ \delta\chi^* = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P\, P\,\delta\phi^*, $$
and further, since the integral $\int P\delta\phi^*\, H \chi\, dV$ (or more exactly
$\sum_\xi \int P\delta\phi^*\, H \chi\, dV$) does not change if any permutation, $P^{-1}$ in particular,
is applied to all the integration variables,
$$ \int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P \int \delta\phi^*\, H P^{-1}\chi\, dV; $$
or finally, since $P^{-1}\chi = \epsilon_P\,\chi$,
$$ \int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \int \delta\phi^*\, H \chi\, dV = \sqrt{n!} \int \delta\phi^*\, H \chi\, dV, \tag{364a} $$
and in the same way
$$ \int \delta\chi\, H \chi^*\, dV = \sqrt{n!} \int \delta\phi\, H \chi^*\, dV. $$
† The variations $\delta\psi$ must of course refer to the factor $\psi_i(x)$ only, leaving the spin
factor $\delta_{m_i\xi}$ unaltered.
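If we now substitute for $H$ the operator (362 b) (by definition not
involving the spin) and replace $\sqrt{n!}\,\chi$ by the expression $\sum_P \epsilon_P\, P\phi$, we get
$$ \int \delta\chi^*\, H \chi\, dV = \sum_P \epsilon_P \int \delta\phi^*\, H P\phi\, dV $$
$$ = \sum_P \epsilon_P \Bigl\{ \sum_i \int \delta\phi^*\, E(x_i,p_i)\, P\phi\, dV + \sum_{i<k} \int \delta\phi^*\, F(x_i,x_k)\, P\phi\, dV \Bigr\}. $$
The integral $\int \delta\phi^*\, E(x_i,p_i)\, P\phi\, dV$, where $\delta\phi^* = \sum_{j=1}^{n} \delta\psi_j^* \prod_{k \neq j} \psi_k^*$, is easily
seen to be different from zero only if $P$ denotes the identical
permutation (because of the orthogonality conditions $\int \psi_k^* \psi_l\, dx = \delta_{kl}$) when it
reduces to $\int \delta\psi_i^*\, E(x_i,p_i)\, \psi_i\, dx_i$. We have further, if $P$ is the identical
permutation,
$$ \int \delta\phi^*\, F(x_i,x_k)\, P\phi\, dV
 = \iint [\delta\psi_i^*(x_i)\psi_k^*(x_k) + \psi_i^*(x_i)\delta\psi_k^*(x_k)]\, F(x_i,x_k)\, \psi_i(x_i)\psi_k(x_k)\, dx_i\, dx_k, $$
and
$$ \int \delta\phi^*\, F(x_i,x_k)\, P\phi\, dV
 = \iint [\delta\psi_i^*(x_i)\psi_k^*(x_k) + \psi_i^*(x_i)\delta\psi_k^*(x_k)]\, F(x_i,x_k)\, \psi_i(x_k)\psi_k(x_i)\, dx_i\, dx_k $$
if $P$ is equal to the transposition $T_{ik}$, i.e. the interchange between the
$i$th and $k$th electrons, and zero in all other cases. We thus get, on
account of the symmetry relation $F(x_i,x_k) = F(x_k,x_i)$,
$$ \int \delta\chi^*\, H \chi\, dV = \sum_i \int dx_i\, \delta\psi_i^*(x_i) \Bigl\{ \Bigl[ E(x_i,p_i) + \sum_k \int dx_k\, F(x_i,x_k)\, \psi_k^*(x_k)\psi_k(x_k) \Bigr] \psi_i(x_i)
 - \sum_k \Bigl[ \int dx_k\, F(x_i,x_k)\, \psi_k^*(x_k)\psi_i(x_k) \Bigr] \psi_k(x_i) \Bigr\}. $$
Putting for the sake of brevity
$$ A_{kl}(x) = \int F(x,x')\, \psi_k^*(x')\, \psi_l(x')\, dx' \tag{365} $$
and
$$ B(x) = \sum_{k=1}^{n} A_{kk}(x), \tag{365a} $$
we can rewrite the preceding expression as follows:
$$ \int \delta\chi^*\, H \chi\, dV = \sum_i \int dx\, \delta\psi_i^*(x) \Bigl\{ [E(x,p) + B(x)]\, \psi_i(x) - \sum_k A_{ki}(x)\, \psi_k(x) \Bigr\} = 0. \tag{365b} $$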
Subtracting from this equation and the conjugate complex equation
$\int \delta\chi\, H \chi^*\, dV = 0$ the expressions
$$ \lambda_{ki} \int \delta\psi_i^*(x)\, \psi_k(x)\, dx + \lambda_{ki} \int \psi_i^*(x)\, \delta\psi_k(x)\, dx = 0, $$
which are derived from the orthogonality and normalizing conditions
$\int \psi_i^* \psi_k\, dx = \delta_{ik}$, and equating to zero the coefficients of the variations
$\delta\psi_i^*$, we obtain the following system of equations for the functions $\psi_i(x)$:
$$ (E + B)\,\psi_i(x) - \sum_k (A_{ki} + \lambda_{ki})\, \psi_k(x) = 0, \tag{366} $$
and a similar system for the conjugate complex functions. If we
multiply these equations on the left by $\psi_j^*(x)$ and integrate over $x$
(including summation over $\xi$) we get, in virtue of the orthogonality
and normalizing relations,
$$ \lambda_{ji} = \int \psi_j^*(x)\,(E + B)\,\psi_i(x)\, dx - \sum_k \int A_{ki}(x)\, \psi_j^*(x)\, \psi_k(x)\, dx $$
or, according to (365) and (365 a),
$$ \lambda_{ji} = E_{ji} + \sum_k \iint F(x,x')\, \psi_j^*(x)\psi_i(x)\, \psi_k^*(x')\psi_k(x')\, dx\, dx'
 - \sum_k \iint F(x,x')\, \psi_j^*(x)\psi_i(x')\, \psi_k^*(x')\psi_k(x)\, dx\, dx', \tag{366a} $$
where
$$ E_{ji} = \int \psi_j^*(x)\, E(x,p)\, \psi_i(x)\, dx \tag{366b} $$
are the matrix elements of the proper energy of an electron [including
its external potential energy $U_0(x)$] with respect to the states $i$ and $j$.
Although the coefficients $\lambda_{ki}$ are completely determined by these
equations, they can actually be considered as arbitrary constants
forming an Hermitian matrix, i.e. satisfying the relations $\lambda_{ik}^* = \lambda_{ki}$, and
further subject to the condition that the diagonal sum $\sum_i \lambda_{ii}$ should
have a given constant value.
This conclusion follows from the fact that the set of normalized and
orthogonal wave functions $\psi_i$ can be replaced by any set of linear
combinations of these functions, provided the transformed functions
$$\psi'_{i'} = \sum_i c_{i'i}\,\psi_i$$
also satisfy the normalizing and orthogonality conditions. In fact the
functions $A_{ik}$ are transformed by (365) according to the equations
$$A_{i'k'} = \sum_i\sum_k c^*_{i'i}\,c_{k'k}\,A_{ik},$$
i.e. like the components of a tensor in the $n$-dimensional space, whose
coordinates are defined by the values of the $n$ functions $\psi_i(x)$. The
latter can also be considered as the components of a vector referred
to a certain set of orthogonal coordinate axes, $\psi'_{i'}$ being its components
with respect to another system of such axes (with the same origin). In
other words, the equations (366) can be considered as invariant with
regard to all the orthogonal transformations or ‘rotations’ of the
coordinate axes, if the coefficients $\lambda_{ik}$ are likewise defined as the
components of an arbitrary tensor $\lambda$, the operators $E$ and $B$ being obviously
scalars.
As has been pointed out by Dirac, the arbitrariness involved in the
determination of the components $\psi_i(x)$ of a vector $\boldsymbol{\psi}(x)$ can be removed
if instead of such a vector we consider its scalar product with the
conjugate complex $\boldsymbol{\psi}^*(x')$ of a vector $\boldsymbol{\psi}(x')$ associated with some other
point $x'$. This product, which will be denoted as
$$\rho(x,x') = \sum_{i=1}^{n} \psi_i(x)\,\psi_i^*(x'), \tag{367}$$
is invariant under the above transformation and is therefore the only
quantity that can be determined unambiguously in connexion with our
problem. It can, moreover, easily be shown to be the only quantity
we actually need know, the energy
$$W = \int \chi^* H\chi\,dV$$
of the system of electrons being expressible as a function of $\rho$.
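The invariance of $\rho$ can be checked numerically on a small example. The following is only a sketch under stated assumptions: the "space" is a grid of three points, the two orbitals and the mixing angles are arbitrary illustrative choices, and the helper names `density_matrix` and `mix` are hypothetical, not taken from the text.

```python
import cmath
import math

def density_matrix(orbitals):
    """rho[x][x'] = sum_i psi_i(x) * conj(psi_i(x')) on a discrete grid."""
    m = len(orbitals[0])
    return [[sum(psi[x] * psi[xp].conjugate() for psi in orbitals)
             for xp in range(m)] for x in range(m)]

def mix(orbitals, c):
    """Transformed orbitals psi'_{i'} = sum_i c[i'][i] * psi_i."""
    m = len(orbitals[0])
    n = len(orbitals)
    return [[sum(c[ip][i] * orbitals[i][x] for i in range(n))
             for x in range(m)] for ip in range(len(c))]

s = 1.0 / math.sqrt(2.0)
# Two orthonormal "orbitals" on a three-point grid (illustrative values).
orbitals = [[1 + 0j, 0j, 0j],
            [0j, s + 0j, 1j * s]]

# A unitary 2x2 mixing matrix (a complex "rotation" of the orbital axes).
theta, alpha = 0.7, 0.3
c = [[math.cos(theta), math.sin(theta) * cmath.exp(1j * alpha)],
     [-math.sin(theta) * cmath.exp(-1j * alpha), math.cos(theta)]]

rho = density_matrix(orbitals)
rho_mixed = density_matrix(mix(orbitals, c))

# Largest element-wise difference between the two density matrices.
max_diff = max(abs(rho[x][xp] - rho_mixed[x][xp])
               for x in range(3) for xp in range(3))
print(max_diff)
```

The individual orbitals change under the mixing, while the sum (367) built from them does not, apart from rounding.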
In fact the preceding formula is reduced (in the same way as the
expression $\int \delta\chi^* H\chi\,dV$) to the form
$$W = \sqrt{n!}\int \phi^* H\chi\,dV = \sum_P \epsilon_P \int \phi^* HP\phi\,dV,$$
or, if the energy operator is defined by (362 b),
$$W = \int \phi^* H\phi\,dV - \sum_{i<k}\int \phi^* H\,T_{ik}\phi\,dV.$$
Hence we get in the same way as in the derivation of (365 b)
$$W = \sum_i E_{ii} + \tfrac{1}{2}\iint F(x,x')\,[\rho(x,x)\rho(x',x') - |\rho(x,x')|^2]\,dx\,dx', \tag{367 a}$$
where $|\rho(x,x')|^2 = \rho(x,x')\rho(x',x)$.
It should be mentioned that the sum $\sum_i \lambda_{ii}$ differs from this expression
by the absence of the factor $\tfrac{1}{2}$ in the second term, which corresponds
to the mutual energy of the different electrons, so that we can put
$$W = \tfrac{1}{2}\sum_{i=1}^{n}(E_{ii}+\lambda_{ii}).$$
The quantities $\lambda_{ii}$ are thus easily seen to correspond to the partial
energies of our previous theory.
The integral $\tfrac{1}{2}\iint F(x,x')\rho(x,x)\rho(x',x')\,dx\,dx'$ with $F(x,x') = e^2/r(x,x')$
represents the mutual potential energy which is obtained if the charges
of the electrons are distributed in space with a volume density $e|\psi_i(x)|^2$,
$e\rho(x,x) = e\sum_{i=1}^{n}|\psi_i(x)|^2$ being the resulting density of the ‘electron cloud’.
This includes the action of an electron spread out into a cloud upon
itself, which is devoid of physical meaning. Such self-action is, however,
cancelled out by the second integral on the right side of (367 a),
$$-\tfrac{1}{2}\iint F(x,x')\,|\rho(x,x')|^2\,dx\,dx',$$
which also represents the exchange effect or, as it is usually denoted,
the ‘exchange energy’ of the electrons.†
The first term in (367 a) does not seem at first sight to be consistent
with the representation of the energy as a function of the ‘density
matrix’ $\rho$. If, however, we introduce the elements of the electron’s own
energy matrix $E$ from the point of view of the coordinates $x$
$$E(x,x') = \int \delta(x-x'')\,E(x'',p_{x''})\,\delta(x''-x')\,dx''$$
(cf. § 17), we can put, since $E_{ii} = \int \psi_i^*(x)E\psi_i(x)\,dx$,
$$\sum_i E_{ii} = \iint E(x,x')\,\rho(x',x)\,dx\,dx'. \tag{367 b}$$
The fact that the energy $W = \int \chi^* H\chi\,dV$ is expressed as a function
(or rather a ‘functional’) of the density matrix $\rho$ alone, shows that the
latter can be determined directly without the functions $\psi_1(x),\ldots,\psi_n(x)$
which have initially served for its definition. Multiplying the equations
(366) by $\psi_i^*(x')$, subtracting therefrom the product by $\psi_i(x)$ of the
corresponding equations for the conjugate complex of $\psi_i(x')$, and summing
over $i$, taking into account the relations $A_{ik}^* = A_{ki}$ and $\lambda_{ik}^* = \lambda_{ki}$, we
can eliminate the coefficients $\lambda_{ik}$ with the result
$$[E(x,p_x)+B(x)-E(x',p_{x'})-B(x')]\sum_i \psi_i(x)\psi_i^*(x') - \sum_i\sum_k [A_{ki}(x)-A_{ki}(x')]\,\psi_k(x)\psi_i^*(x') = 0. \tag{368}$$
† As has been stated at the beginning, the integration sign in the preceding equations
actually means both integration with regard to the geometrical coordinates and a sum
mation over the spin coordinates. The latter can easily be introduced explicitly in the
final results. They are, however, wholly irrelevant so long as we are dealing with a
spinless energy. Their only effect is to allow the introduction of doubly occupied spinless
states (with opposite spin) without the violation of Pauli’s exclusion principle.
As a result we get a number of relations of tho form A/*(x) — An{x) — and
Ay{x) -- Ajy(x) for indices i and k which correspond to identical spinless states.
If we substitute here the expression (365) for $A_{ki}(x)$ with $x'$ replaced
by $x''$ and similarly put $A_{ki}(x') = \int F(x',x'')\psi_k^*(x'')\psi_i(x'')\,dx''$, we obtain
the following equation, containing the density matrix alone,
$$(E_x+B_x-E_{x'}-B_{x'})\,\rho(x,x') - \int [F(x,x'')-F(x',x'')]\,\rho(x,x'')\rho(x'',x')\,dx'' = 0, \tag{368 a}$$
where $E_x$, etc., is an abbreviation for $E(x,p_x)$, etc.
Introducing a matrix $K$ defined from the point of view of $x$ by the
formula
$$K(x,x') = E(x,x') + \delta(x-x')B(x') - A(x,x'), \tag{369}$$
where
$$A(x,x') = F(x,x')\rho(x,x') = e^2\,\frac{\rho(x,x')}{r(x,x')} \tag{369 a}$$
and, according to (365 a),
$$B(x) = \int F(x,x')\rho(x',x')\,dx' = e^2\int \frac{\rho(x',x')}{r(x,x')}\,dx', \tag{369 b}$$
we can consider the left-hand side of the equation (368 a) as the $(x,x')$
element of the matrix $K\rho-\rho K$ and accordingly rewrite it in the
following matrix form:
$$K\rho - \rho K = 0. \tag{370}$$
It should be mentioned that the matrix $A(x,x')$ subtracts from the
matrix $B(x,x') = \delta(x-x')B(x')$ physically irrelevant terms corresponding
to the action of an electron upon itself and at the same time accounts
for the exchange effect.
With the new matrix notation we can rewrite the expression of the
energy as a function of $\rho$ derived above
$$W = \iint dx\,dx'\,\{E(x,x')\rho(x',x) + \tfrac{1}{2}F(x,x')[\rho(x,x)\rho(x',x') - |\rho(x,x')|^2]\} \tag{371}$$
in the form
$$W = D[\rho(E+\tfrac{1}{2}B-\tfrac{1}{2}A)] = D[(E+\tfrac{1}{2}B-\tfrac{1}{2}A)\rho], \tag{371 a}$$
where $D(M)$ is an abbreviation for the so-called diagonal sum (German,
Spur), i.e. sum (or integral) of the diagonal elements of the matrix $M$;
in the present case we have
$$D(M) = \int M(x,x)\,dx.$$
The equation (370) which is satisfied by $\rho$ can be obtained directly,
i.e. without the use of the functions $\psi_1(x),\ldots,\psi_n(x)$, from the variational
equation $\delta W = 0$. With the expression (371) for $W$ we get, since
$F(x,x') = F(x',x)$,
$$\begin{aligned}
\delta W &= \iint dx\,dx'\,\{E(x,x')\,\delta\rho(x',x) + F(x,x')[\rho(x,x)\,\delta\rho(x',x') - \rho(x,x')\,\delta\rho(x',x)]\}\\
&= \iint dx\,dx'\,[E(x,x')+\delta(x-x')B(x')-A(x,x')]\,\delta\rho(x',x)\\
&= \iint dx'\,dx\,\delta\rho(x,x')\,[E(x',x)+\delta(x'-x)B(x)-A(x',x)],
\end{aligned}$$
that is, according to (369),
$$\delta W = D(\delta\rho\,K) = D(K\,\delta\rho) = 0. \tag{371 b}$$
It must not be concluded from this equation that $K = 0$, for the
matrix $\rho$ satisfies a certain accessory condition which is obtained by
comparing it with its square.
We have in fact, from the definition of matrix multiplication,
$$\begin{aligned}
\rho^2(x,x') &= \int \rho(x,x'')\rho(x'',x')\,dx'' = \sum_i\sum_k \int \psi_i(x)\psi_i^*(x'')\psi_k(x'')\psi_k^*(x')\,dx''\\
&= \sum_i\sum_k \psi_i(x)\psi_k^*(x')\,\delta_{ik} = \sum_i \psi_i(x)\psi_i^*(x') = \rho(x,x')
\end{aligned}$$
(because of the orthogonality and normalization of the functions $\psi_i$), i.e.
$$\rho^2 = \rho. \tag{372}$$
It follows that $\delta\rho = \rho\,\delta\rho + \delta\rho\,\rho$, that is,
$$\delta\rho(x',x'') = \int \rho(x',x''')\,\delta\rho(x''',x'')\,dx''' + \int \delta\rho(x',x''')\,\rho(x''',x'')\,dx''',$$
which in conjunction with (371 b) leads to (370).
The relation (372) shows that the characteristic values $\rho'$ of the matrix $\rho$
are equal either to 0 or to 1 (since they satisfy the same equation $\rho'^2 = \rho'$).
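The property (372) is easy to verify on a finite example. The sketch below assumes a four-point grid and two illustrative orthonormal orbitals; the helper names are hypothetical. It checks that the square of the density matrix reproduces the matrix itself and that its diagonal sum equals the number of occupied orbitals, as must be the case if the characteristic values are 0 or 1.

```python
import math

def density_matrix(orbitals):
    """rho[x][x'] = sum_i psi_i(x) * conj(psi_i(x')) on a discrete grid."""
    m = len(orbitals[0])
    return [[sum(psi[x] * psi[xp].conjugate() for psi in orbitals)
             for xp in range(m)] for x in range(m)]

def matmul(a, b):
    """Plain matrix product, the discrete analogue of integrating over x''."""
    m = len(a)
    return [[sum(a[x][y] * b[y][xp] for y in range(m))
             for xp in range(m)] for x in range(m)]

s = 1.0 / math.sqrt(2.0)
# Two orthonormal orbitals on a four-point grid (illustrative values).
orbitals = [[s + 0j, s + 0j, 0j, 0j],
            [0j, 0j, 1j * s, s + 0j]]

rho = density_matrix(orbitals)
rho_squared = matmul(rho, rho)

# Deviation from idempotency, rho^2 - rho, element by element.
idempotency_error = max(abs(rho_squared[x][xp] - rho[x][xp])
                        for x in range(4) for xp in range(4))
# Diagonal sum D(rho); it counts the occupied orbitals.
diagonal_sum = sum(rho[x][x] for x in range(4)).real
print(idempotency_error, diagonal_sum)
```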
We thus obtain, according to Dirac, a new formulation of Pauli’s
exclusion principle, for although the matrix $\rho$ can be introduced
irrespective of the statistical properties of the particles under
consideration, yet it can be shown to possess a dynamical meaning (in the
sense of describing the motion of a system of particles) for the Pauli-
Fermi statistics only (see below, p. 463), so that Pauli’s principle is
expressed implicitly by the property (372) of $\rho$.
If in the equation (367) we sum over all values of $i$ specifying a
complete set of individual wave functions $\psi_i$ (which corresponds to
$n = \infty$), then, so long as all these wave functions are normalized and
orthogonal to each other, we obtain
$$\rho(x,x') = \delta(x-x'). \tag{372 a}$$
This expression can be used as an approximation to $\rho$ for large values
of $n$. It is easily seen to satisfy the relation (372).
The preceding results can be generalized for a non-stationary motion
of the electrons, determined, in the method of the configuration space,
by the equation
$$\Big(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\Big)\chi = 0. \tag{373}$$
In order to obtain the corresponding generalized form of the equations
(366) or of the equation (370) for the density matrix, we need only
remember that the equation (373) is equivalent to the variational
equation
$$\int \delta\chi^*\Big(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\Big)\chi\,dV = 0$$
[cf. § 26, eq. (207 a)]. Now
J s* * t ^ | J V | JW -
- t L 'i'/i ' J
Putting for the sake of brevity
~ h J *r a T ^ £ a* = b> b~ a‘ = 6‘’
we thus get
Equating this expression to the expression (365 b) for $\int \delta\chi^* H\chi\,dV$ and
taking account of the orthogonality and normalizing conditions in the
form
$$\int \delta\psi_i^*(x)\,\psi_k(x)\,dx = 0,$$
we get instead of (366) the equations
$$\Big(E + B + \frac{h}{2\pi i}\frac{\partial}{\partial t}\Big)\psi_i(x) - \sum_k (A_{ki}+b_{ki})\,\psi_k(x) = 0, \tag{374}$$
where $b_{ki}$ are numerical coefficients, or more generally functions of the
time.
These can be determined in the same way as the coefficients $\lambda_{ki}$,
i.e. by expressions similar to (366 a) and differing from the latter by
additional terms $\frac{h}{2\pi i}\int \psi_k^*\frac{\partial\psi_i}{\partial t}\,dx$:
$$b_{ki} = \lambda_{ki} + \frac{h}{2\pi i}\int \psi_k^*\,\frac{\partial\psi_i}{\partial t}\,dx. \tag{374 a}$$
Hence we see that they must satisfy the conditions
$$b_{ki} = b_{ik}^*$$
and are otherwise quite arbitrary. Taking the sum $\sum_i b_{ii}$,
we have, according to (374),
$$\sum_i b_{ii} = \sum_i \lambda_{ii} - \sum_i \int \psi_i^*(E+B)\psi_i\,dx + \sum_i\sum_k \int (A_{ki}+b_{ki})\,\psi_i^*\psi_k\,dx.$$
Now in view of the orthogonality and normalizing relations
$$\sum_i\sum_k \int b_{ki}\,\psi_i^*\psi_k\,dx = \sum_i b_{ii}.$$
The sum $\sum_i b_{ii}$ thus drops out of the preceding equation, which reduces
to the equation (367 a) for $\sum_i \lambda_{ii}$.
The arbitrary coefficients $b_{ki}$ can be eliminated from the equations
(374) in the same way as from the equations (366), namely, by
multiplying (374) by $\psi_i^*(x')$, subtracting the product with $\psi_i(x)$ of the
corresponding (conjugate complex) equation for $\psi_i^*(x')$, and summing over $i$.
We thus get, instead of (368),
$$-\frac{h}{2\pi i}\frac{\partial\rho(x,x')}{\partial t} = (E_x+B_x-E_{x'}-B_{x'})\,\rho(x,x') - \int [F(x,x'')-F(x',x'')]\,\rho(x,x'')\rho(x'',x')\,dx'', \tag{375}$$
or in the matrix form corresponding to (370)
$$-\frac{h}{2\pi i}\frac{\partial\rho}{\partial t} = K\rho - \rho K. \tag{375 a}$$
This relation should be distinguished from the expression
$$\frac{h}{2\pi i}\frac{dM}{dt} = HM - MH$$
for the time derivative of any matrix or operator which is specified in
terms of the same variables as the Hamiltonian $H$ of the system and
which does not contain the time explicitly. If $K$ is considered as the
energy matrix of our system of electrons reduced by the method of the
self-consistent field to a single particle, then the expression
$$\frac{2\pi i}{h}(K\rho - \rho K) = [K,\rho]$$
gives that part of the time derivative of $\rho$ which corresponds to the
rate of change of the dynamical variables ($x$ and $p_x$ for example)
through which it can be expressed. The total derivative of $\rho$ with
respect to the time will thus be
$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + [K,\rho] = 0 \tag{375 b}$$
in virtue of (375 a), which means that $\rho$ is a constant of the motion
determined by $K$.
The total energy of the system of electrons $W$ as given by (371) or
(371 a) is not a matrix but an ordinary number. If the external field
involved in the proper energy of an electron $E$ does not depend upon
the time, $W$ can be shown to be a constant of the motion. We have
in fact, in exactly the same way as in the derivation of (371 b),
$$\frac{dW}{dt} = D\Big(\frac{\partial\rho}{\partial t}\,K\Big),$$
or, according to (375 a),
$$-\frac{h}{2\pi i}\frac{dW}{dt} = D[(K\rho-\rho K)K] = D[K\rho K] - D(\rho K^2),$$
which is easily seen to vanish.
The matrix $\rho(E+\tfrac{1}{2}B-\tfrac{1}{2}A) = \tfrac{1}{2}\rho(E+K)$
could be formally defined as the energy matrix of the system of
electrons, without, however, attaching any dynamical meaning to this
definition, for it is the matrix $K$ only which is entitled to play the role
of the energy matrix for a single particle. The matrix $K$ differs from
an ordinary energy matrix such as $E$ by the fact that it is itself
determined by the character of the motion, and that accordingly it cannot
be represented by an operator of the usual form $(p^2/2m)+U(x)$ even
with an unknown potential-energy function $U(x)$.
One might be tempted to replace the equation (375 a) by an
equation of the usual Schrödinger type
$$-\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = K\psi(x).$$
The latter can in fact be shown to be equivalent to (375 a) or to (375)
in the special case of a single electron (but not otherwise). Replacing
$K$ by an energy operator of the ordinary type, $E$ say, multiplying the
equation
$$-\frac{h}{2\pi i}\frac{\partial\psi(x)}{\partial t} = E_x\psi(x)$$
by $\psi^*(x')$ and subtracting from it the equation
$$\frac{h}{2\pi i}\frac{\partial\psi^*(x')}{\partial t} = E_{x'}\psi^*(x')$$
multiplied by $\psi(x)$, we get
$$-\frac{h}{2\pi i}\frac{\partial}{\partial t}[\psi(x)\psi^*(x')] = (E_x-E_{x'})\,\psi(x)\psi^*(x'),$$
which is a special case of (375 b). The equation (375) or (375 a) can
thus be considered as the generalization of the wave mechanics of a
single electron, which makes it possible, through the introduction of the
density matrix $\rho(x,x')$ instead of ordinary wave functions, to describe
the motion of a system of $n$ electrons in exactly the same way (with
a modified definition of the energy matrix $K$) as the motion of a single
electron.
The complete disappearance of the number of electrons from the
equations of the general theory seems at first sight very puzzling. This
number must obviously be introduced a posteriori as an integration
constant, or more exactly as a sort of quantum number, specifying the
system under consideration.
We thus see that the theory of the density matrix naturally leads
to a further development of quantum theory in the sense of second
quantization discussed already in Part I (see next chapter).
45. Approximate Solutions (Thomas-Fermi-Dirac Equation)
Using Dirac’s notation for the matrix elements and for the wave
functions we can transform the matrix $\rho$ from the point of view of $x$ to
that of $K$ with the help of the following equations:
$$(K'|\rho|K'') = \iint (K'|x')\,dx'\,(x'|\rho|x'')\,dx''\,(x''|K''), \tag{376}$$
the matrix elements $(K'|\rho|K'')$ and $(x'|\rho|x'')$ being both of the ‘pure’
type, corresponding to a definite point of view. We can, however, define
in a similar way the mixed elements of $\rho$ corresponding to a ‘double’
point of view $(K,E)$ which serves to connect the two matrices $K$ and
$E$ with each other. These elements are given by the formula
$$(E'|\rho|K'') = \iint (E'|x')\,dx'\,(x'|\rho|x'')\,dx''\,(x''|K''), \tag{376 a}$$
which is similar to the equation
$$(E'|K') = \int (E'|x)\,dx\,(x|K') \tag{376 b}$$
for the transformation coefficients $(E'|K')$ (cf. § 18), and reduces to it
in the limiting case $n = \infty$ according to (372 a). The wave function
$(x|K')$ appearing in the transformation equations (376 a) and (376 b)
replaces in a certain sense the whole set of individual wave functions
associated with the given value of $K'$, in agreement with
the fact that each of the $n$ electrons on account of the exchange
phenomenon must be distributed over all of them.
The introduction of the wave functions $(x|K')$, although it is by no
means necessary nor even convenient, raises the question as to the
possibility of representing the energy $K$ as a function (of a perhaps
unusual type) of the dynamical variables $x$ and $p$ ($= p_x$) used in the
wave mechanics of a single electron. Now, since $K$ is defined as a
function of the density matrix, this question amounts to the
transformation of the latter from the original viewpoint of $x$ to the ‘mixed’
viewpoint $(x,p)$. In other words, we must find the transformation from
the ‘pure’ matrix elements $(x|\rho|x')$ to the ‘mixed’ matrix elements
$(x|\rho|p)$. This transformation is given by the equation
$$(x|\rho|x') = \int (x|\rho|p)\,dp\,(p|x') \tag{377}$$
or the reciprocal equation
$$(x|\rho|p) = \int (x|\rho|x')\,dx'\,(x'|p), \tag{377 a}$$
where $(x|p) = (p|x)^*$ is the well-known function
$$(x|p) = e^{i2\pi x\cdot p/h}, \tag{377 b}$$
$x\cdot p$ being an abbreviation for $xp_x+yp_y+zp_z$. The function (377 b) is
understood to be normalized according to the condition
$$\int (x|p)(p'|x)\,dx = \delta\Big(\frac{p-p'}{h}\Big).$$
We shall give here, following Dirac, an approximate solution of this
problem by treating $x$ and $p$ as ordinary, i.e. mutually commuting,
quantities in the sense of classical mechanics. The density $\rho$ (as well
as the energy $K$) will thus appear as a function $\rho(x,p)$ of the coordinates
and momenta of an electron, its product with the volume element of
the phase space $dx\,dp$ being proportional to the probability of finding
the electron in this volume element, or, in other words, to the relative
number of electrons to be expected in the latter. This physical meaning
of $\rho$ will become apparent from the following argument.
Let us consider $\rho(x,p) = \rho_x(p)$ as a function of $p$ for a fixed value
of $x$ and expand it in a Fourier integral†
$$\rho_x(p) = \int \rho_{x,\xi}\,e^{i2\pi p\cdot\xi/h}\,d\xi. \tag{378}$$
This expansion is quite similar to the expansion of a function of the
time $t$, the coordinate of an electron for example, for a motion with
a fixed energy $W$,
$w/h$ being the frequency. Now in the latter case the Fourier coefficient
$f_{W,w}$ is well known (by the correspondence principle, Chap. III, § 12) to
represent approximately the matrix element of $f$, $(W|f|W+w)$, for two
neighbouring states with the energies $W$ and $W+w$ (provided $w \ll W$).
† It should be remembered that $x$ and $p$ are meant to denote the triplets of coordinates
and momenta, and that $dx$ actually means $dx\,dy\,dz$.
Since a coordinate $x$ and the corresponding momentum $p_x$ are related
to each other in exactly the same way as the energy and the time
(being canonically conjugate quantities), the Fourier coefficients of
$\rho_x(p)$ in (378) must likewise represent approximately the matrix
elements $(x|\rho|x+\xi)$ of $\rho$. The function $\rho(x,p)$ corresponding to the classical
definition of $p$ (as a quantity commuting with $x$) can thus be calculated
with the help of the matrix $(x|\rho|x')$ by the formula
$$\rho(x,p) = \int (x|\rho|x+\xi)\,e^{i2\pi p\cdot\xi/h}\,d\xi. \tag{378 a}$$
Comparing this with (377 a) and (377 b) we obtain the relation
$$(x|\rho|p) = \rho(x,p)\,e^{i2\pi x\cdot p/h}. \tag{379}$$
The Fourier coefficients in (378) can be calculated by the formula
$$\rho_{x,\xi} = (x|\rho|x+\xi) = \int \rho(x,p)\,e^{-i2\pi p\cdot\xi/h}\,dp/h^3, \tag{380}$$
$h^3$ appearing instead of $h$ because $dp$ actually denotes here the product
$dp_x\,dp_y\,dp_z$.
Putting here $\xi = 0$, we get
$$(x|\rho|x) = \frac{1}{h^3}\int \rho(x,p)\,dp, \tag{380 a}$$
whence it is clear that $\rho(x,p)$ can be defined as the probable number
of electrons per volume $h^3$ of the phase-space.
The preceding equations are obviously valid not only for the matrix
$\rho$ but also for any other matrix of the same type, and in particular for
the energy matrix $K$. Expressing it as a function of the variables $x$, $p$
we thus get
$$K(x,p) = E(x,p) + B(x,p) - A(x,p), \tag{381}$$
where $E(x,p)$ is the usual (classical) expression for the electron’s own
energy $p^2/2m + U(x)$,
$$B(x,p) = B(x)\int \delta(\xi)\,e^{i2\pi p\cdot\xi/h}\,d\xi;$$
that is,
$$B(x,p) = B(x) = e^2\int \frac{\rho(x',x')}{r(x,x')}\,dx', \tag{381 a}$$
the usual expression for the Coulomb energy of an electron in a cloud of
electric charge with the volume density $e\rho(x',x')$, and
$$A(x,p) = \int (x|A|x+\xi)\,e^{i2\pi p\cdot\xi/h}\,d\xi. \tag{381 b}$$
Taking for the matrix $(x|A|x+\xi) = A(x,x+\xi)$ its expression (369 a) and
substituting for $(x|\rho|x+\xi)$ the expression (380), we get
$$A(x,p) = \iint F(x,x+\xi)\,\rho(x,p')\,e^{i2\pi(p-p')\cdot\xi/h}\,d\xi\,dp'/h^3.$$
Now $(p-p')\cdot\xi$ is the scalar product $(p_x-p'_x)\xi_x+(p_y-p'_y)\xi_y+(p_z-p'_z)\xi_z$
of the vectors $p-p'$ and $\xi$. Keeping the vector $p-p' = g$ fixed and
denoting by $\theta$ the angle made with it by the variable vector $\xi$, we can replace
the volume-element $d\xi$ ($= d\xi_x\,d\xi_y\,d\xi_z$) by the expression $2\pi\xi^2\,d|\xi|\sin\theta\,d\theta$;
since $F(x,x+\xi)$ depends on the magnitude $|\xi| = r$ of the vector $\xi$ only,
we can carry out the integration over $\theta$, keeping $r$ constant. This gives
$$\int e^{i2\pi g\cdot\xi/h}\,d\xi = 2\pi r^2\,dr\int_{-1}^{1} e^{i2\pi gr\cos\theta/h}\,d(\cos\theta) = 2\pi r^2\,dr\,\frac{\sin(2\pi gr/h)}{\pi gr/h} = 2r\,dr\,\frac{\sin(2\pi gr/h)}{g/h}$$
and consequently
$$\int (x|A|x+\xi)\,e^{i2\pi p\cdot\xi/h}\,d\xi = \frac{2e^2}{h^2}\int \frac{\rho(x,p')}{g}\,dp'\int_0^\infty dr\,\sin(2\pi gr/h).$$
Now
$$\int_0^\infty \sin ar\,dr = -\frac{1}{a}\,[\cos ar]_0^\infty,$$
which is equal to $1/a$ + an indeterminate constant which can actually be
dropped. In fact, if instead of integrating over $r$ to $\infty$ we first extend
the integration to some large finite value, $R$ say, and pass to the limit
$R \to \infty$ after carrying out the subsequent integration over $p'$, the term
containing $R$ vanishes. We thus get finally
$$A(x,p) = \frac{e^2}{\pi h}\int \frac{\rho(x,p')}{|p-p'|^2}\,dp'. \tag{382}$$
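The dropping of the indeterminate constant in the radial integral can be illustrated numerically. The sketch below is a simplified one-dimensional stand-in for the argument, under stated assumptions: the integral $\int_0^R \sin ar\,dr = (1-\cos aR)/a$ is averaged over $a$ with a flat weight on the interval from 1 to 2 (an illustrative choice replacing the actual integration over $p'$), and the function names are hypothetical. The part depending on the cut-off $R$ is seen to die out as $R$ grows.

```python
import math

def truncated_radial_integral(a, R):
    """Exact value of the integral of sin(a r) over r from 0 to R."""
    return (1.0 - math.cos(a * R)) / a

def cutoff_residual(R, a_min=1.0, a_max=2.0, steps=20000):
    """Midpoint-rule integral over a of [truncated integral - 1/a].

    This is the contribution of the term containing R after the
    subsequent integration (here over a, with a flat weight).
    """
    h = (a_max - a_min) / steps
    total = 0.0
    for k in range(steps):
        a = a_min + (k + 0.5) * h
        total += (truncated_radial_integral(a, R) - 1.0 / a) * h
    return total

# The residual shrinks roughly like 1/R as the cut-off is pushed out.
residuals = {R: cutoff_residual(R) for R in (20.0, 200.0, 2000.0)}
for R, value in residuals.items():
    print(R, value)
```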
If the function $\rho(x,p')$ is replaced here by the function $f(x,p') = \rho/h^3$,
giving the probable number of electrons per unit volume of the phase-
space, the preceding expression assumes the form
$$A(x,p) = \frac{e^2h^2}{\pi}\int \frac{f(x,p')}{|p-p'|^2}\,dp', \tag{382 a}$$
which shows that the exchange energy, being purely a quantum effect,
vanishes with $h$, as of course it should provided the function $f$ remains
finite (which simply means that the number of electrons is finite).
If the function $f(x,p')$ vanishes for large values of $p'$, then for
sufficiently large values of $p$ we can put approximately
$$\int \frac{f(x,p')}{|p-p'|^2}\,dp' \cong \frac{1}{p^2}\int f(x,p')\,dp'$$
and consequently
$$A(x,p) \cong \frac{e^2h^2}{\pi p^2}\,\rho(x,x). \tag{383}$$
It was shown by Fock that this expression can be applied for all
values of $p$ in the case of an electron moving in the electric field of
a number of other electrons, if its reaction upon the latter can be
neglected. Thus, for example, if we consider an alkali atom containing
$n$ electrons of which $n-1$ form its inner core, while one can be treated
as an outsider (although it actually ‘dives’ into the core), then the
effect of the interchange of roles between this outsider and the core
electrons is the same as if the energy of the external electron were
decreased by the amount (383). Fock’s formula can be obtained by
applying the variation principle to that part of the total energy
$W = \int \chi^* H\chi\,dV$ which, besides the proper energy $E$ of the ‘external’
electron, contains terms representing its interaction with the other
electrons (whose motion is supposed to be given, i.e. to remain
unaffected by this interaction).
Taking for $W$ the expression (371) and putting $\rho = \rho_0+\rho_1$, we easily
get for the part in question the expression
$$W_1 = \int \psi_1^* E\psi_1\,dx + \iint F(x,x')\,[\rho_0(x',x')\rho_1(x,x) - \rho_0(x,x')\rho_1(x',x)]\,dx\,dx', \tag{384}$$
where $\rho_1(x,x') = \psi_1(x)\psi_1^*(x')$
is the contribution to the total density matrix $\rho(x,x')$ of the electron
under consideration and $\psi_1(x)$ its wave function. The latter is to be
determined from the condition $\delta W_1 = 0$ (the normalization condition
being now irrelevant). This leads to the equation
$$[E(x,p)+B_0(x)-A_0]\,\psi_1 = W_1\psi_1, \tag{385}$$
where $B_0(x) = e^2\int \frac{\rho_0(x',x')}{r(x,x')}\,dx'$ is the Coulomb energy of the electron
in question with respect to the rest, while $A_0$ is the operator of the
exchange energy [including the physically irrelevant action of the
electron upon itself which must be subtracted from $B_0(x)$]. It has an
unusual form, being defined by
$$A_0\psi(x) = e^2\int \frac{\rho_0(x,x')}{r(x,x')}\,\psi(x')\,dx'. \tag{385 a}$$
Now, as has been pointed out by Fock, the quantity $1/r(x,x')$, i.e. the
reciprocal of the distance between the points $x$ and $x'$, can be considered
(if we leave aside for a while the spin coordinates) as the matrix element
with respect to $x$ and $x'$ of the operator $-4\pi/\nabla^2$, where $\nabla^2$ is Laplace’s
operator. This follows from the fact that
$$\phi(x) = \int \frac{f(x')}{r(x,x')}\,dx'$$
is the solution of the equation
$$\nabla^2\phi = -4\pi f,$$
which can be rewritten symbolically in the form
$$\phi = -\frac{4\pi}{\nabla^2}\,f.$$
In applying this result to (385 a) we must take care of the fact that
$1/\nabla^2$ operates on a function of $x'$ leaving $x$ constant. We must
accordingly come back to the original expression $\rho_0(x,x') = \sum_{i=1}^{n-1}\psi_i(x)\psi_i^*(x')$ for
the density (where $n-1$ denotes the number of electrons in the core)
and insert the operator $1/\nabla^2$ between the $\psi_i(x)$ and $\psi_i^*(x')$ of the separate
terms.
Since $-\frac{h^2}{4\pi^2}\nabla^2 = p^2$, we obtain the following expression for $A_0\psi(x)$:
$$A_0\psi(x) = \frac{e^2h^2}{\pi}\sum_{i=1}^{n-1}\psi_i(x)\,p^{-2}\,\psi_i^*(x)\psi(x),$$
where $x'$ in $\psi_i^*(x')$ and $\psi(x')$ has been replaced by $x$ in view of the fact
that $p^{-2}$, by definition, converts a function $f(x')$ of $x'$ into a function
$\phi(x)$ of $x$.
If we now wish to consider the approximation corresponding to the
classical mechanics we must treat $p$ as an ordinary number, which
enables us to rewrite the preceding formula as follows:
$$A_0\psi(x) = \frac{e^2h^2}{\pi p^2}\,\rho_0(x,x)\,\psi(x),$$
and leads us back to the expression (383) for the operator $A_0$.
We have hitherto made no explicit use of the spin variables which
were understood to be included in $x$ and $p$ whenever they were
necessary. It is easy to rewrite the preceding equations with an explicit
notation for the spin variables. So long, however, as the dynamical
effects of the spin are neglected, its only influence will be to double the
maximum value of $\rho(x,p)$ which is allowed by the exclusion principle.
As has been stated above, $\rho(x,p)$ can be considered as the number of
electrons per volume $h^3$ of the classical phase-space (which, as we
know, corresponds to one single spinless state in the sense of classical
mechanics). Inasmuch as the inclusion of the spin allows each spinless
state to be doubly occupied (by electrons with their spin axes in
opposite directions), the effect of the spin will be simply to increase the
maximum value of $\rho(x,p)$ from 1 to 2.
If we consider a system of electrons, such as a complex atom, for
example, in the normal state, i.e. in the state of lowest energy $W$, we
can assume all the individual states of lowest energy to be doubly
occupied, or, in other words, all that part of the phase-space $x,p$ which
corresponds to the least possible value of the energy to be filled with
the maximum density $\rho = 2$ and the rest to remain quite empty. The
shape of the boundary surface can be determined from the condition
that $\rho$ is a constant of the motion as determined by the energy $K$. This
means, since $d\rho/dt = 0$, that $\rho$ must be a function of $K$, and that
consequently the boundary surface we are looking for must be a surface
of constant $K$.
Now we have
$$K = E(r,p) + B(r) - \frac{e^2}{\pi h}\int \frac{\rho(r,p')}{|p-p'|^2}\,dp', \tag{386}$$
where $r$ is written instead of $x$ in order to indicate the fact that we no
longer include the spin coordinate. Since within the part of the phase-
space which comes into play
$$\rho(r,p') = \text{const.} = 2,$$
the preceding equation is reduced to
$$K = E(r,p) + B(r) - \frac{2e^2}{\pi h}\int \frac{dp'}{|p-p'|^2}, \tag{386 a}$$
where the integral
$$\int \frac{dp'}{|p-p'|^2} = \iiint \frac{dp'_x\,dp'_y\,dp'_z}{(p_x-p'_x)^2+(p_y-p'_y)^2+(p_z-p'_z)^2}$$
must be extended over all the saturated part of the momentum space
$p'$ which is associated with a given point of the ordinary space $r$.
In order to evaluate this integral we must make some assumption
as to the shape of this saturated momentum space. We shall assume
it to be spherical, its radius $P_r$ being a certain (for the present
undetermined) function of $r$. We then get
$$A(r,p) = \frac{2e^2}{h}\Big[\frac{P_r^2-|p|^2}{|p|}\,\lg\frac{P_r+p}{P_r-p} + 2P_r\Big]. \tag{387}$$
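The closed form (387) can be compared with a direct numerical evaluation of the integral over the saturated sphere. The sketch below rests on stated assumptions: after the angular integration the volume integral is taken in the form $(2\pi/p)\int_0^{P} q\,\lg|(p+q)/(p-q)|\,dq$, which is evaluated by a simple midpoint rule; the logarithm is the natural one; units are chosen so that $e^2/h = 1$; and the function names are hypothetical.

```python
import math

def sphere_integral_closed_form(p, P):
    """pi * [ (P^2 - p^2)/p * ln((P + p)/(P - p)) + 2 P ], for 0 < p < P."""
    return math.pi * ((P * P - p * p) / p * math.log((P + p) / (P - p)) + 2.0 * P)

def sphere_integral_numeric(p, P, steps=20000):
    """Midpoint rule for (2 pi / p) * integral_0^P q ln|(p+q)/(p-q)| dq.

    The logarithmic singularity at q = p is integrable; the midpoints
    are chosen so that none of them coincides with q = p.
    """
    h = P / steps
    total = 0.0
    for k in range(steps):
        q = (k + 0.5) * h
        total += q * math.log(abs((p + q) / (p - q))) * h
    return 2.0 * math.pi / p * total

def exchange_energy(p, P):
    """A(r, p) of (387) in units of e^2/h, i.e. (2/pi) times the integral."""
    return 2.0 / math.pi * sphere_integral_closed_form(p, P)

p, P = 0.5, 1.0
closed = sphere_integral_closed_form(p, P)
numeric = sphere_integral_numeric(p, P)
relative_error = abs(closed - numeric) / closed
# Just inside the boundary p = P the exchange energy approaches 4 P.
near_boundary = exchange_energy(P * (1.0 - 1e-9), P)
print(closed, numeric, near_boundary)
```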
We have further
$$\rho(r,r) = \frac{1}{h^3}\int \rho(r,p)\,dp = \frac{8\pi}{3h^3}\,P_r^3, \tag{388}$$
and consequently
$$B(r) = \frac{8\pi}{3h^3}\,e^2\int \frac{P_{r'}^3}{|r-r'|}\,dr'. \tag{388 a}$$
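The relation (388) between the electron density and the radius of the saturated momentum sphere can be put in numerical form. The following is a minimal sketch assuming CGS units; the sample density of $10^{24}$ electrons per cubic centimetre is an arbitrary illustrative figure, and the function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-27  # Planck's constant in erg s (CGS units)

def density_from_fermi_momentum(P, h=PLANCK_H):
    """Number density 8 pi P^3 / (3 h^3), as in (388)."""
    return 8.0 * math.pi * P ** 3 / (3.0 * h ** 3)

def fermi_momentum_from_density(n, h=PLANCK_H):
    """Inverse relation: P = h * (3 n / (8 pi))^(1/3)."""
    return h * (3.0 * n / (8.0 * math.pi)) ** (1.0 / 3.0)

n = 1.0e24                        # electrons per cm^3 (illustrative value)
P = fermi_momentum_from_density(n)
n_back = density_from_fermi_momentum(P)

# Same number obtained by counting cells: two electrons per cell h^3
# inside a momentum sphere of volume (4 pi / 3) P^3.
n_cells = 2.0 * (4.0 * math.pi / 3.0) * P ** 3 / PLANCK_H ** 3
print(P, n_back, n_cells)
```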
At the boundary surface we must have $p = P_r$, and consequently
$$A(r,P_r) = \frac{4e^2}{h}\,P_r.$$
The equation of this boundary reduces accordingly to
$$K(r,p) = E(r,P_r) + B(r) - \frac{4e^2}{h}\,P_r = \text{const.} \tag{389}$$
This equation serves to determine $P$ as a function of $r$. It can be
replaced by a differential equation of the Poisson type if we take into
account the fact that $B(r)$ is the product of the charge $e$ of an electron
and the potential due to a distribution of charge with a density
(388) multiplied by $e$. We thus get, applying the Laplace operator $\nabla^2$
to the equation (389) and assuming $E(r,p)$ to be of the usual form
$p^2/(2m)+U(r)$ with $\nabla^2 U = 0$,
$$\nabla^2\Big(\frac{P^2}{2m} + B(r) - \frac{4e^2}{h}\,P\Big) = 0$$
or
$$\nabla^2\Big(\frac{P^2}{2m} - \frac{4e^2}{h}\,P\Big) = \frac{32\pi^2e^2}{3h^3}\,P^3.$$
This equation (due to Dirac) is a generalization of the equation of the
Thomas-Fermi theory which has been considered in Part I, § 32. It
differs from the latter by the additional term $-4e^2P/h$ which represents
the exchange effect (and also eliminates the self-action of the electrons),
the electric potential or the density function being replaced by the
function $P$.
IX
SECOND (INTENSITY) QUANTIZATION AND
QUANTUM ELECTRODYNAMICS
46. Second Quantization with respect to Electrons
The reduction of the problem of the motion of a number of identical
particles to that of a single one, carried out in the preceding chapter,
involves a more or less rough approximation. A similar reduction can,
however, be achieved in a different way, which corresponds to the
method of copies which was sketched in Part I, § 20, and is connected
with a quantization of the amplitudes of the waves representing the
motion of a single particle. This procedure may be denoted as ‘second’
or ‘intensity’ quantization.
This method was inaugurated by Dirac in connexion with the
theory of light quanta for a system of particles which are describable
by a symmetrical wave function. We shall, however, develop it
in the first place for a system of electrons which will lead us to a
generalization and improvement of the results obtained in the
preceding chapter.
In describing a system of $N$ electrons we have used hitherto only
$N$ individual wave functions $\psi_1(x), \psi_2(x),\ldots, \psi_N(x)$ which enable us in
the case of stationary states to account for the exchange degeneracy
only. We shall now introduce an infinite set of mutually orthogonal
and normalized wave functions of this sort (with spin), leaving their
form undetermined for a while (they may, for example, represent the
motion of an electron in the external field alone with neglect of its
mutual action with all the other electrons). We shall further combine
them into sets of $N$ functions and for each set form an antisymmetrical
function $\chi$ in the same way as before. Instead of, however, identifying
$\chi$ with the exact wave function $\Omega(x_1,x_2,\ldots,x_N,t)$ describing the motion
of the electrons, we shall define the latter as a linear combination of
all such functions,
$$\Omega = \sum_n C_n\chi_n, \tag{391}$$
and shall determine the coefficients $C_n$ as functions of the time in such
a way as to make $\Omega$ an actual solution of the exact wave equation
$$\Big(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\Big)\Omega = 0 \tag{392}$$
(which can involve the spin variables). The coefficients $C_n$ satisfying
this condition are determined by the well-known equations of the
perturbation theory
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'}\,C_{n'}, \tag{392 a}$$
where
$$H_{nn'} = \int \chi_n^* H\chi_{n'}\,dX \tag{392 b}$$
(the ‘integration’ over the coordinates $X$ of all the electrons being
understood to include a summation over the spin coordinates).
The indices $n$ specifying the functions $\chi$ can be considered as
representing the totality of the numbers $n_1, n_2,\ldots, n_r,\ldots$ corresponding to the
individual wave functions $\psi_1, \psi_2,\ldots, \psi_r,\ldots$ and equal to 1 if these
functions are included in the set forming $\chi_n$ and to 0 in the converse case.
Thus $n_r = 1$ if the function $\psi_r$ is contained in $\chi_n$ and 0 if it is not
contained in it. We could also write more fully $\chi_n = \chi(n_1,n_2,\ldots,n_r,\ldots;X)$
and $C_n = C(n_1,n_2,\ldots,n_r,\ldots;t)$. The numbers $n_r$ may be denoted as the
partition numbers, indicating whether the corresponding $r$th individual
state is occupied by an electron or not. In calculating the matrix
elements (392 a) we can use the formula [cf. (364 a), § 44]
$$H_{nn'} = \int \chi_n^* H\chi_{n'}\,dX = \sqrt{N!}\int \phi_n^* H\chi_{n'}\,dX, \tag{393}$$
where $\phi_n$ and $\phi_{n'}$ are the factorized wave functions corresponding to
a definite distribution of the electrons between the $N$ occupied states,
for instance,
$$\phi_n(X) = \psi_{r_1}(x_1)\,\psi_{r_2}(x_2)\cdots\psi_{r_N}(x_N) \tag{394}$$
and
$$\phi_{n'}(X) = \psi_{r'_1}(x_1)\,\psi_{r'_2}(x_2)\cdots\psi_{r'_N}(x_N). \tag{394 a}$$
It will be convenient to assume for a while that the indices $r_1, r_2,\ldots,r_N$
of the occupied states are arranged in the same order as the indices
$1, 2,\ldots, N$ of the electrons, i.e. that
$$r_1 < r_2 < r_3 \ldots. \tag{394 b}$$
This means merely a certain (arbitrary) denomination of the $N$ wave
functions $\psi$ forming the set under consideration. We could put, for
example, so long as we are concerned with this particular set, $r_1 = 1$,
$r_2 = 2,\ldots, r_N = N$. The order of the indices in the other sets $n'$ must
of course be left arbitrary. So long as $H$ has the usual form
$$H = \sum_{i=1}^{N} E(x_i,p_i) + \sum_{i<k} F(x_i,x_k)$$
the matrix elements (393) will vanish identically if the set $n'$ differs
by more than two individual states from the set $n$ (in view of the
§40 SEC O ND Q UANT I Z AT I O N F OR E L E C T R O NS 449
orthogonal property of the wave functions ip). We must therefore
distinguish three cases:
(1) n — n', i.e. nr — n'r, for all values of r, or simply xn = Xu'
this case the matrix element (393) reduces to the value of the energy
W already calculated in § 40. Putting
J ifi*Etpr‘ dx ■-= E„. (395)
and jjtf(x)tf(x')F (x,x')> l> r.(x)>l>Ax')dxdx' = Frtry,. (395a)
we can rewrite it in the following form:
Hnn = £ ^Vr+ Z 2 (Kr.r8~-Kr,*r)> (390)
r r<s
which is easily seen to coincide with (368 a).
(2) The set n differs from n’ by the fact th at one function, <pr.(x) say,
is replaced by another, ip^x), all the other factors in (394) and (394 a)
being the same. We then get in a simijar way, putting rt — p and
r* = P '' Hm, = Epp.+ 2 (Fpr;p.r- F pr;rp.), (396a)
where the sub-subscript i has been dropped.
(3) The set n differs from n ' by the fact th at two functions ipr{(xt)
and tprk(xk) are replaced by different functions (not belonging to the
original set) ipr^i) an(l We get in this case, writing p for r Land
<1 for rk, = Fpqt/q— Fpq.q.p, (p < q; p' ^ p ,q ' ^ q). (396 b)
Let $\alpha_r$ denote an operator which when applied to
$C_n = C(n_1, n_2, \ldots, n_r, \ldots)$ increases $n_r$ by unity if
$n_r = 0$, that is, transforms
$C(n_1, n_2, \ldots, n_{r-1}, n_r, n_{r+1}, \ldots)$
into $C(n_1, n_2, \ldots, n_{r-1}, n_r + 1, n_{r+1}, \ldots)$; if, on the
other hand, $n_r = 1$, the operator $\alpha_r$ reduces C to zero. Let,
further, $\alpha_r^\dagger$ denote an operator
which decreases $n_r$ by 1 if $n_r = 1$ and reduces C to zero if $n_r = 0$.
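The action of these two operators on the coefficients can be made concrete with a small sketch. It is only an illustration under stated assumptions: the states are numbered from 0 rather than 1, the coefficients are stored in a dictionary keyed by the tuple of partition numbers, and the helper names (`all_coefficients`, `alpha`, `alpha_dagger`) are hypothetical.

```python
from itertools import product

def all_coefficients(n_states):
    """Give every tuple of partition numbers a distinct dummy value C(n)."""
    tuples = product((0, 1), repeat=n_states)
    return {n: float(i + 1) for i, n in enumerate(tuples)}

def alpha(r, C):
    """(alpha_r C)(n) = C(..., n_r + 1, ...) if n_r = 0, and 0 if n_r = 1."""
    out = {}
    for n in C:
        if n[r] == 0:
            out[n] = C[n[:r] + (1,) + n[r + 1:]]
        else:
            out[n] = 0.0
    return out

def alpha_dagger(r, C):
    """(alpha_r^dagger C)(n) = C(..., n_r - 1, ...) if n_r = 1, and 0 if n_r = 0."""
    out = {}
    for n in C:
        if n[r] == 1:
            out[n] = C[n[:r] + (0,) + n[r + 1:]]
        else:
            out[n] = 0.0
    return out

C = all_coefficients(4)
n = (1, 1, 0, 0)           # states 0 and 1 occupied
n_single = (1, 0, 1, 0)    # state p = 1 replaced by p' = 2
n_double = (0, 0, 1, 1)    # states p = 0, q = 1 replaced by p' = 2, q' = 3

# alpha_p^dagger alpha_p' C, evaluated at n, picks out the coefficient of n_single.
single = alpha_dagger(1, alpha(2, C))
# alpha_p^dagger alpha_q^dagger alpha_q' alpha_p' C picks out the coefficient of n_double.
double = alpha_dagger(0, alpha_dagger(1, alpha(3, alpha(2, C))))
```

Evaluated at the tuple `n`, the composed operators return the coefficients belonging to the sets in which one or two occupied states have been replaced.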
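The coefficient $C_{n'}$ corresponding to case (2) can be written
accordingly as $\alpha_p^\dagger \alpha_{p'} C_n$, and the coefficient
$C_{n'}$ corresponding to case (3) as
$\alpha_p^\dagger \alpha_q^\dagger \alpha_{q'} \alpha_{p'} C_n$. It is now
possible to write the equations (392 a) as follows:
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = K_n C_n, \tag{397}$$
where
$$K_n = \sum_r E_{rr} + \sum_p \sum_{p' \neq p} E_{pp'}\, \alpha_p^\dagger \alpha_{p'} + \sum_{r<s} (F_{rs;rs} - F_{rs;sr}) + \sum_p \sum_{p' \neq p} \sum_{r > p} (F_{pr;p'r} - F_{pr;rp'})\, \alpha_p^\dagger \alpha_{p'}$$
$$\qquad + \sum_{p<q} \sum_{p' \neq p} \sum_{q' \neq q} (F_{pq;p'q'} - F_{pq;q'p'})\, \alpha_p^\dagger \alpha_q^\dagger \alpha_{q'} \alpha_{p'}. \tag{397 a}$$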
The summation over r and s includes all those individual states which
are contained in the set n, i.e. represented by partition numbers $n_r$ and
$n_s$ equal to 1. The summation over p and q is extended only over such
of these states as are replaced by one or two other states in the set n'.
Thus the indices p, q on the one hand, and p', q' on the other, are not
independent of each other. The expression (397 a) can be further simplified
with the help of the relations $\alpha_r C_n = 0$ if $n_r = 1$,
$\alpha_r^\dagger C_n = 0$ if $n_r = 0$. Applying the operator $\alpha_r$ to
$C_n$ and omitting all the arguments except $n_r$, we thus get
$$\alpha_r C(n_r) = \begin{cases} C(n_r + 1) & \text{if } n_r = 0, \\ 0 & \text{if } n_r = 1, \end{cases}$$
and
$$\alpha_r^\dagger C(n_r) = \begin{cases} C(n_r - 1) & \text{if } n_r = 1, \\ 0 & \text{if } n_r = 0. \end{cases}$$
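Under these conditions it is possible to represent the operators
$\alpha_r$ and $\alpha_r^\dagger$ as matrices from the point of view of
$n_r$, considered itself as a diagonal matrix
$$n_r = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \tag{398}$$
with the characteristic values 0 and 1 or, what amounts to the same
thing, as a one-column matrix $n_r = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
It should be mentioned that the difference $1 - n_r$ must be defined
accordingly as the matrix $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$
or $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ respectively.
Regarded from this point of view the operators $\alpha_r$ and
$\alpha_r^\dagger$ are represented by the matrices
$$\alpha_r = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \alpha_r^\dagger = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{398 a}$$
satisfying the relations
$$\alpha_r^\dagger \alpha_r = n_r, \qquad \alpha_r \alpha_r^\dagger = 1 - n_r. \tag{398 b}$$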
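The two-row matrices above are small enough to check by direct multiplication. The following sketch does so with plain nested lists; the helper `matmul` and the sample column `C` (standing for the pair C(0), C(1), with arbitrary entries) are introduced here only for illustration.

```python
def matmul(A, B):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = [[0, 0], [0, 1]]               # partition number, characteristic values 0 and 1
alpha = [[0, 1], [0, 0]]           # matrix of alpha_r
alpha_dagger = [[0, 0], [1, 0]]    # matrix of alpha_r^dagger
one_minus_n = [[1, 0], [0, 0]]     # the difference 1 - n_r

C = [[5], [7]]                     # column (C(0), C(1)), arbitrary sample values

raised = matmul(alpha, C)          # expected to be (C(1), 0)
lowered = matmul(alpha_dagger, C)  # expected to be (0, C(0))
```

The first component of each column refers to the characteristic value 0 and the second to the value 1, in agreement with the ordering of the diagonal matrix for the partition number.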
Any function $C(n_1, n_2, \ldots, n_r, \ldots)$ of the matrix arguments
$n_r$, or more exactly of their characteristic values, must likewise be dealt
with as a matrix. Leaving all the arguments but $n_r$ aside, we can define it
as a one-column matrix
$$C(n_r) = \begin{pmatrix} C(0) \\ C(1) \end{pmatrix}, \tag{399}$$
whose elements correspond to the two characteristic values of $n_r$. This
gives, according to the definition (398 a) of the operators $\alpha_r$ and
$\alpha_r^\dagger$,
$$\alpha_r C(n_r) = \begin{pmatrix} C(1) \\ 0 \end{pmatrix}, \qquad \alpha_r^\dagger C(n_r) = \begin{pmatrix} 0 \\ C(0) \end{pmatrix}, \tag{399 a}$$
which is in agreement with the original definition of $\alpha_r$ and
$\alpha_r^\dagger$. If therefore we agree to consider the partition numbers as
the characteristic values of the corresponding operators (which will be
denoted by the same letters), we can rewrite the sum of the first two terms in
(397 a), corresponding to the proper energy of the electrons E, as follows:
$$\sum_r E_{rr} + \sum_p \sum_{p' \neq p} E_{pp'}\, \alpha_p^\dagger \alpha_{p'} = \sum_{r=1}^{\infty} \sum_{r'=1}^{\infty} E_{rr'}\, \alpha_r^\dagger \alpha_{r'}, \tag{400}$$
since for all values of r and r' which are not actually represented
in the sum on the left-hand side of this equation, the operator
$\alpha_r^\dagger \alpha_{r'}$ is equivalent to 0.
Turning to the other terms of (397 a) which correspond to the mutual
energy of the electrons, we shall show in the first place that they can
be collected together in a form similar to the last term with no other
restriction imposed on the summation indices p, q, p', q' than the
condition p < q.
In fact the second term is easily seen to be obtained from the last
if we put q' = q = r and interpret the product
$\alpha_q^\dagger \alpha_{q'}$ as the operator $n_r$.
Since this operator commutes with $\alpha_p^\dagger$ (so long as
$p \neq q$) we can write it on the left of $\alpha_p^\dagger$ and extend the
summation over all values of r which are larger than p, those terms which
correspond to values of r not represented in n being automatically cancelled.
It should be emphasized in this connexion that the order of the four
factors $\alpha_p^\dagger \alpha_q^\dagger \alpha_{q'} \alpha_{p'}$ in the
last term of (397 a) is not taken at random, but precisely with a view to
ensuring the inclusion of the preceding term under the condition
q' = q = r. It is easily seen in the same way that the first term containing
F in (397 a) is obtained from the following one if we put p' = p, or
consequently from the last term if we put simultaneously p = p' = r and
q = q' = s with the one restriction r < s, i.e. p < q. We thus get
$$\sum_{r<s} (F_{rs;rs} - F_{rs;sr}) + \sum_p \sum_{p' \neq p} \sum_{r > p} (F_{pr;p'r} - F_{pr;rp'})\, \alpha_p^\dagger \alpha_{p'} + \sum_{p<q} \sum_{p' \neq p} \sum_{q' \neq q} (F_{pq;p'q'} - F_{pq;q'p'})\, \alpha_p^\dagger \alpha_q^\dagger \alpha_{q'} \alpha_{p'}$$
$$\qquad = \sum_{p<q} \sum_{p'} \sum_{q'} (F_{pq;p'q'} - F_{pq;q'p'})\, \alpha_p^\dagger \alpha_q^\dagger \alpha_{q'} \alpha_{p'} \tag{400 a}$$
and consequently
$$K_n = K = \sum_r \sum_{r'} E_{rr'}\, \alpha_r^\dagger \alpha_{r'} + \sum_{r<s} \sum_{r'} \sum_{s'} (F_{rs;r's'} - F_{rs;s'r'})\, \alpha_r^\dagger \alpha_s^\dagger \alpha_{s'} \alpha_{r'}, \tag{400 b}$$
where the indices p and q have been replaced by r and s in order to
indicate the fact that the summation can be extended over all values
of these indices, the terms represented by non-vanishing partition
numbers $n_r$, $n_s$ corresponding to the state n being actually the only
ones left. This is why we are now entitled to drop the index n for K. If
instead of writing
$\alpha_r^\dagger \alpha_s^\dagger \alpha_{s'} \alpha_{r'}$ in the second term
of (400 b) we had written the four factors in a different order, it would have
been impossible to include all the three terms of (397 a) containing F in one.
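The second step in the simplification of the operator K consists in the
removal of the restrictive condition r < s and in the simultaneous
unification of the positive and negative summands in the second term
of (400 b), representing the mutual action of the electrons. In order to
carry out this simplification we must introduce instead of the $\alpha$'s new
operators
$$a_r = \pm \alpha_r, \qquad a_r^\dagger = \pm \alpha_r^\dagger$$
with an appropriate rule for choosing the upper or lower sign, so that
we could write
$$a_r^\dagger a_{r'} = \alpha_r^\dagger \alpha_{r'}, \tag{401}$$
$$a_r^\dagger a_s^\dagger a_{s'} a_{r'} = \alpha_r^\dagger \alpha_s^\dagger \alpha_{s'} \alpha_{r'} \tag{401 a}$$
and
$$a_r^\dagger a_s^\dagger a_{s'} a_{r'} = -a_s^\dagger a_r^\dagger a_{s'} a_{r'} = -a_r^\dagger a_s^\dagger a_{r'} a_{s'}. \tag{401 b}$$
This enables us to put
$$\sum_r \sum_{r'} E_{rr'}\, \alpha_r^\dagger \alpha_{r'} = \sum_r \sum_{r'} E_{rr'}\, a_r^\dagger a_{r'}$$
and
$$\sum_{r<s} \sum_{r'} \sum_{s'} (F_{rs;r's'} - F_{rs;s'r'})\, \alpha_r^\dagger \alpha_s^\dagger \alpha_{s'} \alpha_{r'}$$
$$\qquad = \sum_{r<s} \sum_{r'} \sum_{s'} F_{rs;r's'}\, a_r^\dagger a_s^\dagger a_{s'} a_{r'} + \sum_{r<s} \sum_{r'} \sum_{s'} F_{rs;s'r'}\, a_r^\dagger a_s^\dagger a_{r'} a_{s'}$$
$$\qquad = \sum_{r<s} \sum_{r'} \sum_{s'} F_{rs;r's'}\, a_r^\dagger a_s^\dagger a_{s'} a_{r'} + \sum_{s<r} \sum_{r'} \sum_{s'} F_{sr;s'r'}\, a_s^\dagger a_r^\dagger a_{r'} a_{s'}$$
in view of the obvious relation
$$F_{rs;r's'} = F_{sr;s'r'}$$
and the relations (401 a), or finally
$$K = \sum_r \sum_{r'} E_{rr'}\, a_r^\dagger a_{r'} + \tfrac{1}{2} \sum_r \sum_s \sum_{r'} \sum_{s'} F_{rs;r's'}\, a_r^\dagger a_s^\dagger a_{s'} a_{r'}, \tag{402}$$
the summation being extended over all the values of the indices r, r', s, s'
without any restrictive conditions whatsoever, all the restrictions being
carried out automatically.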
In order to define the operators a explicitly we must take into account
the condition which has been stated at the beginning of this section as
to the arrangement of the indices $r_1, r_2, \ldots$ specifying the
individual states in the set n: $r_1 < r_2 < \ldots$.
This condition has been used in all our preceding expressions for K up
to the expression (400 b), and has been dropped only in the last
expression (402).
Now the operators a must be defined in such a way as actually to
enable us to get rid of this condition. This can be done, following
Jordan and Wigner, by putting
$$a_r = \alpha_r \nu_r, \qquad a_r^\dagger = \nu_r \alpha_r^\dagger, \tag{402 a}$$
where $\nu_r$ is an operator with the characteristic values 1 and -1 (i.e.
equivalent to taking $\alpha$ with the + or - sign) which is defined as the
product†
$$\nu_r = \prod_{s=1}^{r-1} (1 - 2n_s). \tag{402 b}$$
† We slightly diverge here from Jordan and Wigner by extending the product
over s to s = r - 1 instead of s = r.
The separate factors in this product are themselves operators of the
same kind as $\nu_r$ (the characteristic values of $n_s$ being 0 and 1, those
of the difference $1 - 2n_s$ must be +1 and -1). The operators a defined
in this way are easily seen to satisfy the conditions (401), (401 a), and
(401 b), or the more simple conditions not involving the original $\alpha$'s:
$$a_r^\dagger a_r = n_r, \tag{403}$$
$$a_r a_s + a_s a_r = a_r^\dagger a_s^\dagger + a_s^\dagger a_r^\dagger = 0 \tag{403 a}$$
and finally
$$a_r^\dagger a_{r'} + a_{r'} a_r^\dagger = \delta_{rr'}. \tag{403 b}$$
We have in fact
$$a_r^\dagger a_r = \nu_r \alpha_r^\dagger \alpha_r \nu_r = \nu_r n_r \nu_r = n_r \nu_r^2 = n_r,$$
since $n_r$ is represented by a diagonal matrix, just as $\nu_r$ is, and
therefore commutes with $\nu_r$, whose square is equal to 1, i.e. to the unit
matrix.
Further, if r < s, we obviously have
$$\alpha_r \alpha_s = \alpha_s \alpha_r$$
(the case r = s is devoid of interest since the operator
$\alpha_r \alpha_r$ applied to any function $C_n$ gives identically zero). On
the other hand,
$$a_r a_s = \alpha_r \nu_r \alpha_s \nu_s = \nu_r \alpha_r \alpha_s \nu_s,$$
since, according to the definition (402 b), $\alpha_r$ commutes with all the
factors in $\nu_r$. It does not commute, however, so long as r < s, with one
factor in $\nu_s$, namely, $(1 - 2n_r)$. Applying it to the latter we have
$$\alpha_r (1 - 2n_r) = \alpha_r - 2\alpha_r n_r.$$
Now $\alpha_r n_r = \alpha_r$ if $n_r = 0$, and 0 if $n_r = 1$. In both cases
we get
$$\alpha_r (1 - 2n_r) = -(1 - 2n_r)\alpha_r,$$
and consequently
$$\alpha_r \nu_s = -\nu_s \alpha_r \qquad (r < s).$$
The preceding relation can be derived in a somewhat different way if
we replace $n_r$ in $1 - 2n_r$ by the product $\alpha_r^\dagger \alpha_r$.
We then get
$$\alpha_r (1 - 2\alpha_r^\dagger \alpha_r) = \alpha_r - 2\alpha_r(\alpha_r^\dagger \alpha_r) = \alpha_r - 2(\alpha_r \alpha_r^\dagger)\alpha_r = (1 - 2\alpha_r \alpha_r^\dagger)\alpha_r = [1 - 2(1 - n_r)]\alpha_r$$
according to (398 b), which coincides with our previous result.
Coming back to our original expression for $a_r a_s$, we have
$$a_r a_s = \nu_r \alpha_s \alpha_r \nu_s = -\nu_r \alpha_s \nu_s \alpha_r$$
or, since the three operators $\nu_r$, $\alpha_s$, $\nu_s$ commute with each
other while $\nu_r$ commutes also with $\alpha_r$,
$$a_r a_s = -\alpha_s \nu_s \nu_r \alpha_r = -\alpha_s \nu_s \alpha_r \nu_r = -a_s a_r.$$
The second relation (403 a) is proved in exactly the same way.
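In the case r = r' relation (403 b) immediately follows from the relations
$\alpha_r^\dagger \alpha_r = n_r$ and $\alpha_r \alpha_r^\dagger = 1 - n_r$
(see 398 b). In order to prove it for the case r < r' (or in general
$r \neq r'$) we must use the fact that the operators $\alpha_r^\dagger$ and
$\alpha_{r'}$ commute with each other just as the operators $\alpha_r$ and
$\alpha_{r'}$ do. We have further, if $1 - 2n_r$ is written in the form
$$(1 - 2n_r) = -[1 - 2(1 - n_r)] = -(1 - 2\alpha_r \alpha_r^\dagger),$$
$$\alpha_r^\dagger (1 - 2n_r) = -\alpha_r^\dagger (1 - 2\alpha_r \alpha_r^\dagger) = -[\alpha_r^\dagger - 2\alpha_r^\dagger(\alpha_r \alpha_r^\dagger)] = -[\alpha_r^\dagger - 2(\alpha_r^\dagger \alpha_r)\alpha_r^\dagger] = -(1 - 2n_r)\alpha_r^\dagger,$$
so that
$$\alpha_r^\dagger \nu_{r'} = -\nu_{r'} \alpha_r^\dagger \qquad (r < r')$$
as before, and consequently, since
$$\alpha_{r'} \nu_r = \nu_r \alpha_{r'},$$
$$a_r^\dagger a_{r'} = \nu_r \alpha_r^\dagger \alpha_{r'} \nu_{r'} = \alpha_{r'} \nu_r \alpha_r^\dagger \nu_{r'} = -\alpha_{r'} \nu_{r'} \nu_r \alpha_r^\dagger = -a_{r'} a_r^\dagger.$$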
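The relations (403), (403 a) and (403 b) just proved can also be checked numerically for a small number of states. The sketch below is an illustration only: it assumes three states numbered from 0, builds each operator as a Kronecker product with the sign matrix diag(1, -1) (that is, 1 - 2n) in every slot preceding the state in question, and uses hypothetical helper names throughout.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker product: elements of A and B combined multiplicatively."""
    return [[x * y for x in ra for y in rb] for ra in A for rb in B]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(col) for col in zip(*A)]

UNIT = [[1, 0], [0, 1]]
ALPHA = [[0, 1], [0, 0]]
NUMBER = [[0, 0], [0, 1]]
SIGN = [[1, 0], [0, -1]]           # 1 - 2n for n = diag(0, 1)

def chain(factors):
    out = [[1]]
    for f in factors:
        out = kron(out, f)
    return out

def a(r, m):
    """a_r = alpha_r nu_r, with nu_r the product of (1 - 2 n_s) over s < r."""
    return chain([SIGN] * r + [ALPHA] + [UNIT] * (m - r - 1))

def a_dagger(r, m):
    return transpose(a(r, m))      # real matrices: adjoint equals transpose

def number(r, m):
    return chain([UNIT] * r + [NUMBER] + [UNIT] * (m - r - 1))

def relations_hold(m):
    dim = 2 ** m
    zero = [[0] * dim for _ in range(dim)]
    unit = [[int(i == j) for j in range(dim)] for i in range(dim)]
    for r in range(m):
        if matmul(a_dagger(r, m), a(r, m)) != number(r, m):
            return False
        for s in range(m):
            if add(matmul(a(r, m), a(s, m)), matmul(a(s, m), a(r, m))) != zero:
                return False
            mixed = add(matmul(a_dagger(r, m), a(s, m)),
                        matmul(a(s, m), a_dagger(r, m)))
            if mixed != (unit if r == s else zero):
                return False
    return True
```

Since all entries are integers, the comparisons are exact; dropping the sign factors (replacing `SIGN` by `UNIT`) gives back commuting operators for different states, for which the check fails.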
Now that the relations (403), (403 a), and (403 b) are all proved, we no
longer need to think of the auxiliary operators $\alpha$ and $\nu$ which have
been used in their derivation and which depend on the physically
irrelevant order in which the different individual states are numbered
in the set n. The above relations are self-supporting, for they specify
in a perfectly unambiguous way the operators a which serve to
express the energy operator of our problem K. These operators can be
represented by certain matrices, from the point of view of the matrix
formed by the totality of the partition numbers $n_1, n_2, \ldots$, in a way
implying a certain ordered arrangement of the different individual
states. So long as we are interested in one particular state only we can
define the corresponding partition number $n_r$ as a two-dimensional
matrix (398) and represent the operators $a_r$ and $a_r^\dagger$ by the same
matrices as those representing the operators $\alpha_r$ and
$\alpha_r^\dagger$. The difference between them and the operators $\alpha_r$,
$\alpha_r^\dagger$ becomes apparent when we take into account all the other
states (s = 1, 2, 3,...). The general representation of the operators
$\alpha_r$, $\alpha_r^\dagger$ can be derived from (398 a) by multiplication
by the unit matrices $\delta_s$ belonging to all the other states, for
instance
$$\alpha_r = \delta_1 \times \delta_2 \times \cdots \times \delta_{r-1} \times \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \times \delta_{r+1} \times \cdots,$$
the product $M_1 \times M_2$ of the matrices $M_1$ and $M_2$ denoting a
matrix M whose elements are obtained by combining multiplicatively the
elements of $M_1$ and $M_2$. In order to obtain the general representation of
$a_r$ and $a_r^\dagger$ we must replace the r - 1 matrices
$\delta_1, \ldots, \delta_{r-1}$ by the matrices
$1 - 2n_1, 1 - 2n_2, \ldots, 1 - 2n_{r-1}$. The matrices so defined
can easily be verified to satisfy all the relations (403 a, b). The totality
of the numbers $n_1, n_2, \ldots$ must be represented accordingly by the
product of the diagonal matrices representing each of them:
$$n = n_1 \times n_2 \times \ldots.$$
This operator product has, of course, nothing to do with an ordinary
product of the numbers which give the characteristic values of the
operators $n_r$ and which, as will be remembered, must satisfy the
relation
$$\sum_{r=1}^{\infty} n_r = N.$$
The operators $a_r$ are not Hermitian, although the symbol $a_r^\dagger$
preserves its meaning as the operator adjoint to $a_r$, i.e. as the conjugate
complex of the transposed matrix $a_r$. They can, however, be expressed with
the help of the Hermitian operators $\sigma_x$, $\sigma_y$, $\sigma_z$
whose products with $h/4\pi$ represent the components of the electron's
spin [cf. § 29]. We have, namely, limiting ourselves to one particular
state,
$$\sigma_x = a_r + a_r^\dagger, \qquad \sigma_y = i(a_r - a_r^\dagger),$$
whence
$$a_r = \tfrac{1}{2}(\sigma_x - i\sigma_y), \qquad a_r^\dagger = \tfrac{1}{2}(\sigma_x + i\sigma_y). \tag{404}$$
Hence it follows that
$$n_r = a_r^\dagger a_r = \tfrac{1}{4}[\sigma_x^2 + \sigma_y^2 - i(\sigma_x \sigma_y - \sigma_y \sigma_x)],$$
or, according to (253), § 29,
$$n_r = \tfrac{1}{4}(\sigma_x^2 + \sigma_y^2 + 2\sigma_z),$$
or, since $\sigma_x^2 = \sigma_y^2 = \delta$,
$$n_r = \tfrac{1}{2}(\delta + \sigma_z), \tag{404 a}$$
which agrees with the definition
$n_r = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$. It should be noticed
further that
$$1 - 2n_r = -\sigma_z,$$
and that accordingly
$$\nu_r = \prod_{s=1}^{r-1} (-\sigma_{sz}),$$
the subscript s in $\sigma_{sz}$ serving to show that it refers to the sth
state.
We thus see that the energy operator K (402) can be expressed with
the help of the familiar spin operators associated with the different
states. This is natural if we remember that it is possible to represent
the interaction energy of the electrons in connexion with the exchange
effect with the help of the operators
$-\tfrac{1}{2}(1 + \sigma_r \cdot \sigma_s)$, as has been shown in
§ 42 of the preceding chapter. The problem is complicated in the present
case by the necessity of introducing the operators $\nu_r$ in order to ensure
the anticommutation of the operators $a_r$ and $a_s$ (or $a_r^\dagger$ and
$a_s^\dagger$) referring to different states (whereas the operators
$\alpha_r$ and $\alpha_s$ must commute with each other).
The fact that the operators $n_r$ can have only two different
characteristic values 0 and 1, which has been used as the basis of the above
definition of the operators $a_r$, can be considered as a consequence of
the properties of these operators expressed by the relations (403 a)
and (403 b) in connexion with the equation (403) which from this point
of view serves simply for the definition of the operator $n_r$. We have,
namely, multiplying the relation $a_r a_r^\dagger = 1 - n_r$ on the left by
$a_r^\dagger$,
$$a_r^\dagger a_r a_r^\dagger = a_r^\dagger - a_r^\dagger n_r,$$
or, since $a_r^\dagger a_r = n_r$,
$$n_r a_r^\dagger = a_r^\dagger - a_r^\dagger n_r,$$
whence, by right-hand multiplication by $a_r$, we get
$$n_r^2 = n_r - a_r^\dagger n_r a_r.$$
Now $a_r^\dagger n_r a_r = a_r^\dagger a_r^\dagger a_r a_r$, and according to
the relations (403 a) we must have $a_r a_r = a_r^\dagger a_r^\dagger = 0$.
We thus get
$$n_r^2 - n_r = 0,$$
whence it follows that the only characteristic values of $n_r$ are 0 and 1.
The preceding theory can be put in a still more significant form by
introducing the expression
$$\Psi(x) = \sum_{r=1}^{\infty} a_r \psi_r(x). \tag{405}$$
Being an ordinary function of the coordinates of an electron (and
eventually of the time), it is to be considered at the same time as
an operator with respect to the amplitude coefficients
$C(n_1, n_2, \ldots)$ which play the role of the wave function in the
equation
$$-\frac{h}{2\pi i}\frac{\partial C}{\partial t} = KC$$
with the energy operator K defined by (400 b).
Multiplying $\Psi(x)$ on the left by the adjoint operator
$$\Psi^\dagger(x) = \sum_{r=1}^{\infty} a_r^\dagger \psi_r^*(x) \tag{405 a}$$
and integrating over x (which includes as usual the summation over the
spin coordinates), we get in virtue of the orthogonality and normalization
of the functions $\psi_r(x)$:
$$\int \Psi^\dagger(x) \Psi(x)\, dx = \sum_{r=1}^{\infty} a_r^\dagger a_r = \sum_{r=1}^{\infty} n_r. \tag{405 b}$$
This equation is quite similar to that corresponding to the ordinary
case of functions of the type (405) with amplitude coefficients $a_r$ defined
as ordinary numbers. Replacing such numbers by operators satisfying
the conditions (403 a) and (403 b), or even the less restrictive conditions
$a_r a_r = 0$, $a_r^\dagger a_r^\dagger = 0$,
$a_r^\dagger a_r + a_r a_r^\dagger = 1$, we obtain for the number of
electrons associated with any individual state one of the two characteristic
values 0, 1 of the operator $a_r^\dagger a_r = n_r$ in agreement with the
exclusion principle. The total number of electrons N can be defined
accordingly as a characteristic value of the sum $\sum_r n_r$, so that it
appears in the role of an additional 'intensity' or 'quantitative' quantum
number (cf. Part I, § 20). The operator
$$N = \sum_{r=1}^{\infty} n_r$$
is easily seen to commute with the energy operator K in virtue of the
relations (403 a, b) and to represent accordingly a constant of the
motion, which means that the number of electrons forming any particular
system is constant, as of course it should be. The operator K
can itself be expressed in terms of the operator-function $\Psi(x)$ and its
adjoint operator $\Psi^\dagger(x)$, not containing explicitly the
operator-coefficients $a_r$. We have, namely,
$$\sum_r \sum_{r'} E_{rr'}\, a_r^\dagger a_{r'} = \int \Psi^\dagger(x) E \Psi(x)\, dx$$
and
$$\sum_r \sum_s \sum_{r'} \sum_{s'} F_{rs;r's'}\, a_r^\dagger a_s^\dagger a_{s'} a_{r'} = \iint \Psi^\dagger(x) \Psi^\dagger(x') F(x,x') \Psi(x') \Psi(x)\, dx\, dx',$$
so that K can be written in the following form:
$$K = \int \Psi^\dagger(x) E \Psi(x)\, dx + \tfrac{1}{2} \iint \Psi^\dagger(x) \Psi^\dagger(x') F(x,x') \Psi(x') \Psi(x)\, dx\, dx', \tag{406}$$
which is somewhat similar to the expression for the value of the energy
W given by the equation (371), § 44, if the density matrix $\rho(x,x')$ is
replaced by the product $\Psi^\dagger(x)\Psi(x')$. The main difference between
the two expressions lies in the fact that the exchange effect which is
represented by the negative term under the double integral sign in (371) is
not present in (406) where this exchange effect is automatically
accounted for by the properties of the operators $\Psi(x)$.
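The statement made above, that N commutes with K in virtue of (403 a, b), can be traced in a few lines. The following is only a sketch; the auxiliary state index t is introduced here for illustration. From (403), (403 a) and (403 b),

$$n_t a_r^\dagger = a_t^\dagger a_t a_r^\dagger = a_t^\dagger(\delta_{tr} - a_r^\dagger a_t) = \delta_{tr}\, a_r^\dagger + a_r^\dagger a_t^\dagger a_t = \delta_{tr}\, a_r^\dagger + a_r^\dagger n_t,$$

so that, summing over t and taking the adjoint relation as well,

$$N a_r^\dagger - a_r^\dagger N = a_r^\dagger, \qquad N a_r - a_r N = -a_r.$$

Each factor $a^\dagger$ in a product thus contributes +1 and each factor $a$ contributes -1 when N is carried through it. Since every term of (402) contains equal numbers of the two kinds of factor,

$$N\, a_r^\dagger a_{r'} - a_r^\dagger a_{r'}\, N = 0, \qquad N\, a_r^\dagger a_s^\dagger a_{s'} a_{r'} - a_r^\dagger a_s^\dagger a_{s'} a_{r'}\, N = 0,$$

and therefore $NK - KN = 0$.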
Putting $F(x, x') = e^2/r(x, x')$, which corresponds to an ordinary
Coulomb interaction between the electrons, and introducing further
the operator
$$\varphi(x) = e \int \frac{\Psi^\dagger(x') \Psi(x')}{r(x, x')}\, dx', \tag{406 a}$$
which represents the electric potential due to a distribution of electricity
with a density $e\rho(x', x') = e\Psi^\dagger(x')\Psi(x')$,
we can replace (406) by an expression of still more familiar type,
$$K = \int \Psi^\dagger(x)\,(E + \tfrac{1}{2} e\varphi)\,\Psi(x)\, dx, \tag{406 b}$$
corresponding to the average value of the energy W for an electron
moving in an electric field which consists of an external part (included
in E) and a quasi-external part, due, as it were, to its own field and
represented by the electric potential $\varphi$ with the extra factor
$\tfrac{1}{2}$. It must clearly be understood that no actual self-action of a
single electron is implied by our theory, the commutation properties of the
operator-functions $\Psi$ being precisely such as to exclude any self-action.
These commutation properties are easily derived from those of the
operators a, and from the orthogonality and normalizing conditions for
the functions $\psi_r(x)$. We have, namely, multiplying $\Psi(x)$ by
$\Psi(x')$,
$$\Psi(x)\Psi(x') = \sum_r a_r \psi_r(x) \sum_s a_s \psi_s(x') = \sum_r \sum_s a_r a_s\, \psi_r(x)\psi_s(x') = -\sum_r \sum_s a_s a_r\, \psi_s(x')\psi_r(x),$$
that is,
$$\Psi(x)\Psi(x') + \Psi(x')\Psi(x) = 0, \tag{407}$$
and likewise,
$$\Psi^\dagger(x)\Psi^\dagger(x') + \Psi^\dagger(x')\Psi^\dagger(x) = 0.$$
We have further
$$\Psi^\dagger(x)\Psi(x') = \sum_r \sum_s a_r^\dagger a_s\, \psi_r^*(x)\psi_s(x') = -\sum_r \sum_s a_s a_r^\dagger\, \psi_r^*(x)\psi_s(x') + \sum_r \sum_s \delta_{rs}\, \psi_r^*(x)\psi_s(x'),$$
whence
$$\Psi^\dagger(x)\Psi(x') + \Psi(x')\Psi^\dagger(x) = \delta(x - x'), \tag{407 a}$$
where $\delta(x - x')$ denotes the product of the Dirac $\delta$-functions for
the geometrical coordinates by $\delta_{\xi\xi'}$ if $\xi$ and $\xi'$ are the
values of the spin coordinates associated with the points x and x'. It should
be remarked that the formula $a_r^\dagger a_r = n_r$ is replaced in the
present case by the formula (405 b) or
$$\int \Psi^\dagger(x)\Psi(x)\, dx = N. \tag{407 b}$$
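A finite analogue of the relations (407), (407 a) and (407 b) can be checked directly. The sketch below rests on several assumptions made only for illustration: the coordinate x takes just two values (so that the delta-function becomes a Kronecker symbol and the integral a sum), there are two real orthonormal functions obtained from a rotation by an arbitrary angle, and the operators are the signed Kronecker products for two states. All names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    return [[x * y for x in ra for y in rb] for ra in A for rb in B]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

UNIT = [[1, 0], [0, 1]]
ALPHA = [[0, 1], [0, 0]]
SIGN = [[1, 0], [0, -1]]                     # 1 - 2n for n = diag(0, 1)
a = [kron(ALPHA, UNIT), kron(SIGN, ALPHA)]   # a_0 and a_1 for two states

theta = 0.3                                  # arbitrary mixing angle (assumption)
psi = [[math.cos(theta), math.sin(theta)],   # psi[r][x], real and orthonormal
       [-math.sin(theta), math.cos(theta)]]

def field(x):
    """Psi(x) = sum over r of a_r psi_r(x)."""
    return add(scale(psi[0][x], a[0]), scale(psi[1][x], a[1]))

def field_dagger(x):
    return transpose(field(x))               # psi is real, so adjoint = transpose

def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))

def max_abs(A):
    return max(abs(v) for row in A for v in row)

UNIT4 = kron(UNIT, UNIT)
total_number = add(matmul(field_dagger(0), field(0)),
                   matmul(field_dagger(1), field(1)))
```

In the basis ordered as (n_0, n_1) = 00, 01, 10, 11 the matrix `total_number` is diagonal with the entries 0, 1, 1, 2, whatever the value chosen for `theta`.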
The functions $\psi_r(x)$ which serve to define the operator $\Psi(x)$ have
been left hitherto entirely arbitrary apart from the condition of being
mutually orthogonal and normalized. The actual problem, which was
put at the beginning of this section, was to find the coefficients $C_n$
which determine the wave function $\Omega = \sum_n C_n \chi_n$ describing the
behaviour of the system of electrons under consideration in the configuration
space. From this point of view the functions $\psi_r(x)$ play only an
auxiliary role.
But on the other hand, it is clear that the preceding theory can give
results of real practical value only in the case when the separate
antisymmetrical functions $\chi_n$ form a good approximation to the functions
$\Omega_W$, which describe the stationary states of the system when the
external field does not depend upon the time, or specific types of motion in
a given variable external field. Assuming the latter to be constant, we
are thus led to the problem of determining the individual wave functions
$\psi_r(x)$ in such a way as to make the functions $\chi_n$ the best possible
approximations to the exact antisymmetrical wave functions describing
the stationary states of the system.
Let us consider a stationary state of the system as determined by
the exact equation
$$KC = WC \tag{408}$$
to which the general equation
$-\dfrac{h}{2\pi i}\dfrac{\partial C}{\partial t} = KC$ is reduced if the
operator $-\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$ is replaced by the
characteristic value of the energy W, i.e. if all the values of the function
$C(n_1, n_2, \ldots)$ for which $\sum_r n_r = N$ are assumed to be
proportional to $e^{-i 2\pi W t/h}$ just as in the case of an ordinary
Schrödinger equation. We then get, denoting the amplitude of $C_n$ by
$C_n^0$, the following exact representation of a stationary state (in the
configuration space):
$$\Omega_W(X) = \sum_n C_n^0 \chi_n(X), \tag{408 a}$$
where X denotes the totality of the coordinates of all the electrons.
Now the equation (408) must obviously be equivalent to the variational
equation $\delta W = \delta \int \Omega_W^* H \Omega_W\, dX = 0$ with
$\Omega_W$ written in the form (408 a), in conjunction with the condition
$\sum_n |C_n^0|^2 = 1$. This variational equation can serve for the
determination of the coefficients $C_n^0$ if the functions $\chi_n$, i.e. the
individual functions $\psi_r$, are known. Or it can be used for the
determination of the latter if the coefficients $C_n^0$ are known. Assuming
them for the moment to vanish for all the subscripts n except one, we get
back to the self-consistent field considered in the preceding section.
The question we were discussing above is thus reduced to the following
one: Is it possible to determine both the functions $\psi_r$ and the
coefficients $C_n$ from the same variational equation $\delta W = 0$, where
$W = \int \Omega_W^* H \Omega_W\, dX$? Such a determination is certainly
possible for a function (408 a) containing a finite number of terms; if only
one of them is different from zero we get back to the problem of the
self-consistent field already solved.
However, the solution thus obtained will contain, as in the simple case
just mentioned, a certain amount of arbitrariness in the form of the
functions $\psi_r(x)$ (the latter being replaceable by any other set derived
from them by a linear orthogonal transformation). This arbitrariness will
increase with the number of non-vanishing terms and will become
infinite in the limiting case of an infinite series (408 a). So long however
as we are looking not for a formal but for a practical solution of
our problem, we can deal with it as if the number of terms in (408 a)
were finite; this procedure will ensure the most rapid convergence of
the series obtained in the limiting case. Dropping the affixes W and 0
in (408 a), we have
$$W = \int \Omega^* H \Omega\, dX = \sum_n \sum_{n'} C_n^* C_{n'} \int \chi_n^* H \chi_{n'}\, dX,$$
that is,
$$W = \sum_n \sum_{n'} H_{nn'}\, C_n^* C_{n'},$$
which according to our previous results can be rewritten in the form
$$W = \sum_n C_n^* K C_n. \tag{409}$$
The problem we have considered hitherto was equivalent to the variation
of the coefficients $C_n$, the operator K being fixed. It could thus
be expressed by the equation
$$\sum_n \delta C_n^*\, K C_n + \sum_n C_n^*\, K \delta C_n = 0$$
along with
$$\sum_n \delta C_n^*\, C_n + \sum_n C_n^*\, \delta C_n = 0,$$
which brings us back to the equation (408). The next step which we
must undertake in order to secure the best possible approximation
consists in the variation of the functions $\psi_r(x)$, i.e. of the operator
K which they define. We thus get the additional equation
$$\sum_n C_n^*\, \delta K\, C_n = 0. \tag{409 a}$$
With the help of the expressions (400) and (406 b) this is easily reduced
to the following form:
$$\sum_n C_n^* \Bigl( \delta \int \Psi^\dagger(x)[E + e\varphi(x)]\Psi(x)\, dx \Bigr) C_n = 0,$$
provided we consider the variations of the operators $\Psi(x)$ and
$\Psi^\dagger(x)$ as independent of each other, apart from being subject to
the condition $\int \delta\Psi^\dagger(x) \cdot \Psi(x)\, dx = 0$ [and
$\int \Psi^\dagger(x) \cdot \delta\Psi(x)\, dx = 0$]. Let us consider C and
$C^\dagger$ as a one-column or a one-row matrix and introduce the functions
$$\omega = \Psi C, \qquad \omega^\dagger = C^\dagger \Psi^\dagger. \tag{409 b}$$
The preceding equations can then be rewritten as follows:
$$\int \delta\omega^\dagger(x)\,[E + e\varphi(x)]\,\omega(x)\, dx = 0$$
and
$$\int \delta\omega^\dagger(x) \cdot \omega(x)\, dx = 0,$$
where $\delta\omega^\dagger = C^\dagger \delta\Psi^\dagger$.
Applying Lagrange's method of undetermined multipliers we thus
obtain the equation
$$[E + e\varphi(x) - W]\,\omega(x) = 0$$
or
$$[E + e\varphi(x) - W]\,\Psi(x) C = 0, \tag{410}$$
where $\varphi(x)$ is defined by (406 a) and W denotes a constant, the
characteristic value of the energy we are looking for. This equation can
serve for the determination of the functions $\psi_r$ so long as the
coefficients $C_n$ are supposed to be known for any choice of these functions.
The equation (410) is a good illustration of the method of double
quantization, as it is operational in the double sense of $\Psi(x)$ operating
on the matrix C and $E + e\varphi(x) - W$ operating on $\Psi(x)$.
Assuming the first of these operations to be understood, we can
rewrite the preceding equation in the form
$$[E + e\varphi(x) - W]\,\Psi(x) = 0 \tag{410 a}$$
as for an ordinary wave function, with the only difference that the
additional potential energy $e\varphi(x)$ is itself dependent upon the
function $\Psi$. It should be mentioned that this dependence can be expressed
by the differential equation
$$\nabla^2 \varphi = -4\pi e\, \Psi^\dagger(x)\Psi(x), \tag{410 b}$$
which is equivalent to the integral expression (406 a) for the operator
$\varphi$. This circumstance can be used, as will be shown later on, for a
very important generalization of the theory in the sense of taking into
account the exact electrodynamical laws which govern the interaction
of the electrons.
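The equivalence of (410 b) and (406 a) stated above can be sketched in one step, on the assumption that r(x, x') is the ordinary distance between the two points and that the spin summation implied in the integration is left understood. Using the known property of the Coulomb potential of a point charge,

$$\nabla_x^2 \frac{1}{r(x, x')} = -4\pi\, \delta(x - x'),$$

and applying $\nabla_x^2$ under the integral sign in (406 a), we get

$$\nabla^2 \varphi(x) = e \int \Psi^\dagger(x')\Psi(x')\, \nabla_x^2 \frac{1}{r(x, x')}\, dx' = -4\pi e \int \Psi^\dagger(x')\Psi(x')\, \delta(x - x')\, dx' = -4\pi e\, \Psi^\dagger(x)\Psi(x).$$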
47. Intensity Quantization of Particles described in the Configuration
Space by a Symmetrical Wave Function (Einstein-Bose Statistics)
The reduction of the problem of a system of identical particles to that
of a single particle, corresponding to the method of copies (Part I, § 20),
has been considered hitherto for the case of electrons, or more
generally such particles as in the method of the configuration space
are described by antisymmetrical wave functions. We are now going
to consider the same question for the case of particles which belong
to the symmetrical type, and conform accordingly to the statistics of
Einstein-Bose (for instance, $\alpha$-particles, hydrogen atoms considered as
elementary particles, etc.).
Let us start as before with a set of N different individual states
specified by the mutually orthogonal and normalized functions
$\psi_1(x), \psi_2(x), \ldots, \psi_N(x)$. Introducing the factorized wave
function
$$\varphi(X) = \psi_1(x_1)\psi_2(x_2)\cdots\psi_N(x_N)$$
we can define the symmetrical wave function describing the whole
system by the formula
$$\chi = \frac{1}{\sqrt{N!}} \sum_P P\varphi, \tag{411}$$
which differs from the corresponding antisymmetrical wave function
by the absence of the sign-factors $\epsilon_P$. This is of course a slight
simplification so long as we are considering a set of N different individual
states. We get in this case from the variational equation
$$\delta \int \chi^* H \chi\, dX = 0$$
in conjunction with the conditions
$$\int \psi_i^* \psi_k\, dx = \delta_{ik}$$
the following system of equations for the functions $\psi_i$ [corresponding
to the method of the self-consistent field, cf. (366)],
( E + B - A ii)*i(x)+ 2 (Aki+ \ kt)<pk(x) = 0 , (411 a)
k-Ai
the functions B(x) and Aki(x) being defined by the same formulae,
(365), (365 a), as before. The energy (or its probable value) is expressed
accordingly by the formula
V/(AT!) | dX =- 2 J dX,
which gives
W = | J # (* )[(£ + aB- 1
that is,
$$W = \sum_i \Bigl( E_{ii} - \int A_{ii}(x)\, |\psi_i(x)|^2\, dx \Bigr) + \tfrac{1}{2} \iint F(x,x')\,[\rho(x,x)\rho(x',x') + |\rho(x,x')|^2]\, dx\, dx'$$
or
$$W = \sum_i \Bigl[ E_{ii} - \iint F(x,x')\, |\psi_i(x)|^2 |\psi_i(x')|^2\, dx\, dx' \Bigr] + \tfrac{1}{2} \iint F(x,x')\,[\rho(x,x)\rho(x',x') + |\rho(x,x')|^2]\, dx\, dx'. \tag{411 b}$$
We thus see that in the case of symmetrical wave functions the density
matrix $\rho(x,x')$ cannot replace the separate wave functions. Using the
notation (395 a) for the matrix elements of F and affixing the index
n to $\chi$ we can rewrite the preceding expression in the form
$$H_{nn} = \int \chi_n^* H \chi_n\, dX = \sum_r (E_{rr} - F_{rr;rr}) + \tfrac{1}{2} \sum_r \sum_s (F_{rs;rs} + F_{rs;sr}) \tag{412}$$
similar to the expression (396) which corresponds to the antisymmetrical
case with a similar condition as to the arrangement of the indices
$r_1, r_2, \ldots$ in
$\varphi = \psi_{r_1}(x_1)\psi_{r_2}(x_2)\cdots\psi_{r_N}(x_N)$.
If $\chi_n$ is replaced by a function $\chi_{n'}$ differing from $\chi_n$ in
that a single individual function $\psi_p$ is replaced by $\psi_{p'}$, we get
likewise
$$H_{nn'} = E_{pp'} + \sum_{r \neq p} (F_{pr;p'r} + F_{pr;rp'}). \tag{412 a}$$
If finally two of the functions serving to construct $\chi_n$, $\psi_p$ and
$\psi_q$, say, are replaced by two new ones different from each other and from
all the other functions in the original set, we get
$$H_{nn'} = F_{pq;p'q'} + F_{pq;q'p'} \qquad (p < q). \tag{412 b}$$
Let us now pass to the general case, which has no parallel in the theory
of antisymmetrical functions $\chi$, where certain individual states are
multiply occupied, so that, for instance, each function $\psi_r(x)$ occurs
$n_r$ times ($n_r > 0$) in the set specified by the index n, where the sum
$\sum_r n_r$ must of course be equal to the number of particles.
The formula (411) will still be valid in this case, except for the
normalization factor which must be replaced by $1/\sqrt{g_n}$ if only
effectively different permutations P are included in the sum (411), i.e.
such permutations as interchange particles associated with different
individual states. Thus, leaving aside trivial permutations, we can
define the normalized symmetrical wave functions by the formula
$$\chi_n = \frac{1}{\sqrt{g_n}} \sum_P P\varphi_n, \tag{413}$$
where
$$g_n = \frac{N!}{n_1!\, n_2! \cdots} = \frac{N!}{\prod_r n_r!}. \tag{413 a}$$
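The number of effectively different permutations entering (413 a) can be checked by brute force for small cases. The sketch below is only an illustration; the function names and the way the occupation numbers are passed (as a tuple, one entry per individual state) are assumptions.

```python
from itertools import permutations
from math import factorial

def g(occupations):
    """g_n = N! divided by the product of n_r!, with N the sum of the n_r."""
    value = factorial(sum(occupations))
    for n_r in occupations:
        value //= factorial(n_r)
    return value

def count_distinct_assignments(occupations):
    """Count distinct ways of distributing the particles over the state labels."""
    labels = [r for r, n_r in enumerate(occupations) for _ in range(n_r)]
    return len(set(permutations(labels)))
```

For N different singly occupied states the two counts reduce to N!, in agreement with the normalization factor of (411).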
We then get, instead of (412),
H„„ - 2 nr(E,r- F rrrr) [ (414)
The calculation of the matrix elements corresponding to (412 a) requires
a little more care.
We must in the first place determine the number of times a given
variable, $x_r$ say, will be met in the sum $\sum_P P\varphi_n$ associated
with a certain function $\psi_p$. The number is obviously equal to
$n_p g_n / N$.
The function $\varphi_{n'}$ will differ from $\varphi_n$ by the fact that it
will contain $n'_p = n_p - 1$ factors $\psi_p$ and $n'_{p'} = n_{p'} + 1$
factors $\psi_{p'}$. Now the matrix element of $E_r$ with respect to the
functions $P\varphi_n$ and $P'\varphi_{n'}$ will be different from zero only
if the variable $x_r$ is shifted from $\psi_p$ to $\psi_{p'}$, all
the rest remaining in their places. There will thus be $n_p g_n / N$ terms
equal to $(E_r)_{pp'}$. Now since H contains the sum of N terms $E_r$,
corresponding to the proper energy of all the N particles, the matrix element
$E_{pp'}$ will appear in the expression
$\sum_P \sum_{P'} \int (P\varphi_n)^* H\, P'\varphi_{n'}\, dX$ just
$n_p g_n$ times.
The coefficient of $E_{pp'}$ in $H_{nn'}$ will thus be
$$\frac{n_p g_n}{\sqrt{g_n g_{n'}}} = n_p \sqrt{\frac{g_n}{g_{n'}}} = \sqrt{n_p (n_{p'} + 1)}.$$
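The last equality can be verified for particular occupation numbers. In the sketch below the function names and the sample occupation numbers are assumptions chosen for illustration; states are numbered from 0.

```python
from math import factorial, sqrt

def g(occupations):
    """g_n = N! divided by the product of n_r!."""
    value = factorial(sum(occupations))
    for n_r in occupations:
        value //= factorial(n_r)
    return value

def transfer_coefficient(occupations, p, p_prime):
    """n_p g_n / sqrt(g_n g_n'), where n' has one particle moved from p to p'."""
    shifted = list(occupations)
    shifted[p] -= 1
    shifted[p_prime] += 1
    g_n, g_shifted = g(occupations), g(shifted)
    return occupations[p] * g_n / sqrt(g_n * g_shifted)

occupations = (3, 1, 2)
lhs = transfer_coefficient(occupations, 0, 1)
rhs = sqrt(occupations[0] * (occupations[1] + 1))
```

With these sample numbers both sides come out as the square root of 6, the ratio of the two normalization constants being (n_p' + 1)/n_p.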
The same argument applies to the second term in (412 a) which
corresponds to the mutual energy of the particles and must besides be
multiplied by $n_r$. We get accordingly, instead of (412 a),
= V (n / ) M rV + + Z nr{Epr;p'r + Epr,rp')\ (414 a)
L r> p J
By a similar argument we obtain, instead of (412 b),
#nn' ~ + (414 b)
If we substitute these expressions in the equations
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'} C_{n'}$$
which determine the coefficients of the expansion
$\Omega = \sum_n C_n \chi_n$ and introduce the operators $\alpha_r$,
$\alpha_r^\dagger$ already used, we can bring them to the standard form
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = K C_n$$
with the following expression for the operator K:
K = 2 nrE „ + 2 I \n ll('Jnp.+ l)E/>p.(xl(xp—
r p*p'
~ Z nrErr,rr+ Z Z Wr Wt fW s ; r « + ^ r « ; s r ) +
r r<»
+ 2Jr-A
^p 3 V ( V +1)r2> p nr(Fpr;p'r+Fpr-,rp'Wp<V+
+ 2P*P'K
2 2I*Q'2 ^ p^ 9V(V+1WK-+1KJp«;pV+-fJ>9:«'p'K“9019'V>
P<Q
or
K = Z^ r r nr + Z Z, Epp' ^Up Z Z nr(ns^ K9)iKa;rs+Ks\a)+
P*P
+ 2 2 2 n r ( F p r .p T + F p r s p -) -ln p o£ a y + V +
r > p p l- p ’
+p<a;p^p;g'^(i
2 2 2 2 (F p q-,v<1- + F P < M*)'l n P 4 Vw«°v Vn«-vvv-
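This result can further be simplified if we replace the numbers $n_r$ by operators, represented by the diagonal matrices
$$n_r = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 2 & 0 & \cdots \\ 0 & 0 & 0 & 3 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \tag{415}$$
and represent accordingly the operators $\alpha_r$ and $\alpha_r^\dagger$ (from the point of view of $n_r$) by the matrices
$$\alpha_r = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \qquad \alpha_r^\dagger = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \tag{415 a}$$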
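which are a generalization of the matrices used to represent the operators $n_r$, $\alpha_r$, and $\alpha_r^\dagger$ in the antisymmetrical case. We can then put, since
$$\alpha_r^\dagger \alpha_r = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & & & \ddots \end{pmatrix},$$
$$n_r = \sqrt{n_r}\,\alpha_r^\dagger \alpha_r \sqrt{n_r},$$
where
$$\sqrt{n_r} = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 0 & \sqrt{1} & 0 & 0 & \cdots \\ 0 & 0 & \sqrt{2} & 0 & \cdots \\ 0 & 0 & 0 & \sqrt{3} & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \tag{415 b}$$
and combine the first two terms of $K$ into a single one,
$$\sum_r \sum_{r'} E_{rr'} \sqrt{n_r}\,\alpha_r^\dagger \alpha_{r'} \sqrt{n_{r'}},$$
the summation over $r$ and $r'$ being unrestricted (i.e. vanishing terms being cancelled out automatically).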
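The other three terms, corresponding to the mutual energy $F$, can likewise be combined into a single one,
$$\frac{1}{2} \sum_p \sum_q \sum_{p'} \sum_{q'} F_{pq;p'q'} \sqrt{n_p}\,\alpha_p^\dagger \sqrt{n_q}\,\alpha_q^\dagger\, \alpha_{q'} \sqrt{n_{q'}}\, \alpha_{p'} \sqrt{n_{p'}},$$
the second term corresponding to the case $q' = q$ and the first to the case $q' = q$, $p' = p$. It should be noticed that the term $n_r F_{rr;rr}$ is subtracted automatically in virtue of the relation $n_r \alpha_r = \alpha_r (n_r - 1)$, which reduces the product
$$\sqrt{n_r}\,\alpha_r^\dagger \sqrt{n_r}\,\alpha_r^\dagger\, \alpha_r \sqrt{n_r}\, \alpha_r \sqrt{n_r} = \sqrt{n_r}\,\alpha_r^\dagger\, n_r\, \alpha_r \sqrt{n_r}$$
to $\sqrt{n_r}\,\alpha_r^\dagger \alpha_r (n_r - 1) \sqrt{n_r} = n_r (n_r - 1)$.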
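It now remains to introduce, instead of the operators $n_r$, $\alpha_r$, and $\alpha_r^\dagger$, combined operators which can be defined by the formulae
$$b_r = \alpha_r \sqrt{n_r}, \qquad b_r^\dagger = \sqrt{n_r}\,\alpha_r^\dagger \tag{416}$$
or by the relations
$$b_r^\dagger b_r = n_r, \qquad b_r b_r^\dagger = n_r + 1 \tag{416 a}$$
following therefrom, and which will be subject to the commutation conditions
$$b_r b_s - b_s b_r = 0, \qquad b_r^\dagger b_s^\dagger - b_s^\dagger b_r^\dagger = 0, \tag{417}$$
$$b_r b_s^\dagger - b_s^\dagger b_r = \delta_{rs} \tag{417 a}$$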
(the latter being in agreement with (416 a) in the case $r = s$). With the help of these operators, which are quite similar to the operators $a_r$, $a_r^\dagger$ [differing from the latter by the sign only in the commutation relations (417 a, b)], the operator $K$ can be written in the form
$$K = \sum_r \sum_{r'} E_{rr'}\, b_r^\dagger b_{r'} + \frac{1}{2} \sum_r \sum_s \sum_{r'} \sum_{s'} F_{rs;r's'}\, b_r^\dagger b_s^\dagger b_{s'} b_{r'}, \tag{417 b}$$
which can be obtained from (402) by replacing the $a$'s by the $b$'s. It should be noticed that the order of the two last factors in (417 b) is irrelevant [while it is very important in (402)] since they commute with each other.
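The relations (416), (416 a) and (417 a) can be checked numerically on truncated matrices. The following is a minimal sketch and not part of the original argument: it assumes a cut-off at `DIM` occupation numbers (the matrices (415) to (415 b) are infinite), and the helper names `matmul` and `transpose` are introduced here for illustration only. The cut-off spoils the last diagonal element of $b\,b^\dagger$, so the commutation relation can be expected to hold only on the remaining ones.

```python
import math

DIM = 6  # assumed cut-off: occupation numbers 0 .. DIM-1 (the true matrices are infinite)

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def transpose(a):
    """Transpose; equal to the adjoint here because all entries are real."""
    return [list(row) for row in zip(*a)]

# alpha: ones on the first superdiagonal, as in (415 a)
alpha = [[1.0 if j == i + 1 else 0.0 for j in range(DIM)] for i in range(DIM)]
# sqrt(n): diagonal matrix diag(0, sqrt 1, sqrt 2, ...), as in (415 b)
sqrt_n = [[math.sqrt(i) if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]

b = matmul(alpha, sqrt_n)    # b = alpha sqrt(n), eq. (416)
b_dag = transpose(b)         # b^dagger = sqrt(n) alpha^dagger

number = matmul(b_dag, b)    # b^dagger b, to be compared with diag(0, 1, 2, ...)
b_b_dag = matmul(b, b_dag)   # b b^dagger, to be compared with n + 1
commutator = [[b_b_dag[i][j] - number[i][j] for j in range(DIM)]
              for i in range(DIM)]

for i in range(DIM):
    # columns: occupation number, diagonal of b^dagger b, diagonal of the commutator
    print(i, round(number[i][i], 12), round(commutator[i][i], 12))
```

Only the last printed row is affected by the cut-off; enlarging `DIM` pushes this artefact to higher occupation numbers.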
The commutation relations (417 a), just like their analogues (403 a) and (413 b), are actually self-supporting and can be used to define the operators $n_r$ by one of the expressions (416). The fact that the characteristic values of these operators are equal to 0, 1, 2,... can be considered as a consequence of the relations (417) and (417 a). We need not repeat in detail all that has been said in the preceding section about the operator
$$N = \sum_r n_r$$
representing the total number of particles, and the functional operators
$$\boldsymbol{\psi}(x) = \sum_r b_r \psi_r(x), \qquad \boldsymbol{\psi}^\dagger(x) = \sum_r b_r^\dagger \psi_r^{*}(x). \tag{418}$$
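It need only be stated that they satisfy the relations
$$\boldsymbol{\psi}(x)\boldsymbol{\psi}(x') - \boldsymbol{\psi}(x')\boldsymbol{\psi}(x) = 0, \qquad \boldsymbol{\psi}^\dagger(x)\boldsymbol{\psi}^\dagger(x') - \boldsymbol{\psi}^\dagger(x')\boldsymbol{\psi}^\dagger(x) = 0, \tag{418 a}$$
$$\boldsymbol{\psi}(x)\boldsymbol{\psi}^\dagger(x') - \boldsymbol{\psi}^\dagger(x')\boldsymbol{\psi}(x) = \delta(x - x') \tag{418 b}$$
and can serve to express the energy operator in the form
$$K = \int \boldsymbol{\psi}^\dagger(x) E \boldsymbol{\psi}(x)\, dx + \frac{1}{2} \iint \boldsymbol{\psi}^\dagger(x) \boldsymbol{\psi}^\dagger(x') F(x, x') \boldsymbol{\psi}(x') \boldsymbol{\psi}(x)\, dx\, dx',$$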
the order of the first two or of the last two factors in the double integral
being irrelevant. In the case of a stationary state of the system of
particles the equation of motion reduces to the form
$$KC = WC,$$
which can be derived from the variation principle $\delta \sum_n C_n^{*} K C_n = 0$ in conjunction with the condition $\sum_n C_n^{*} C_n = 1$. The same principle when applied once more to the operator $K$ itself, i.e. in the form
$$\sum_n C_n^{*}\, \delta K\, C_n = 0,$$
leads to a double operator equation
$$C^\dagger \Bigl\{ E + \int F(x, x')\, \boldsymbol{\psi}^\dagger(x') \boldsymbol{\psi}(x')\, dx' \Bigr\} \boldsymbol{\psi}(x)\, C = 0$$
for the determination of the functions $\psi_r(x)$.
In the special case when there is no interaction between the particles ($F = 0$) the transition from the equations
$$-\frac{h}{2\pi i}\frac{dc_r}{dt} = \sum_{r'} E_{rr'} c_{r'}, \tag{419}$$
describing the motion of a single particle, specified by the energy operator $E$ and the wave function $\psi(x) = \sum_r c_r \psi_r(x)$, to the equations
$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'} C_{n'} \tag{419 a}$$
for any number $N$ of such particles $\bigl(H = \sum_{i=1}^{N} E_i\bigr)$ can be carried out, according to Dirac, in the following way:
The right-hand side of the equation (419) can be defined as the differential coefficient with respect to $c_r^{*}$ of the expression
$$E = \sum_r \sum_{r'} c_r^{*} E_{rr'} c_{r'}, \tag{420}$$
which represents the probable value of the energy in the state specified by the wave function $\psi(x) = \sum_r c_r \psi_r(x)$. We thus get
$$-\frac{h}{2\pi i}\frac{dc_r}{dt} = \frac{\partial E}{\partial c_r^{*}}$$
and in a similar way
$$\frac{h}{2\pi i}\frac{dc_r^{*}}{dt} = \frac{\partial E}{\partial c_r}.$$
If we now put
$$c_r = Q_r, \qquad -\frac{h}{2\pi i} c_r^{*} = P_r \tag{420 a}$$
or
$$c_r^{*} = Q_r, \qquad \frac{h}{2\pi i} c_r = P_r, \tag{420 b}$$
these equations can be rewritten in the form
$$\frac{dP_r}{dt} = -\frac{\partial E}{\partial Q_r}, \qquad \frac{dQ_r}{dt} = \frac{\partial E}{\partial P_r}, \tag{420 c}$$
i.e. in the standard canonical form of the classical equations of motion. The variables $Q_r$ and $P_r$ can be identified with the generalized coordinates and momenta, the Hamiltonian $E$ being a bilinear function of them both and the number of degrees of freedom being infinite.
Let us now pass over from the equations (420 c) to the corresponding wave-mechanical equation
$$E\omega = -\frac{h}{2\pi i}\frac{\partial \omega}{\partial t}, \tag{421}$$
where $\omega$ is the wave function, or probability amplitude for given values of the coordinates $Q$, the classical momenta $P_r$ being replaced by the differential operators $\frac{h}{2\pi i}\frac{\partial}{\partial Q_r}$. Or let us take the equations (420) directly over into the quantum theory, considering the variables $Q$ and $P$ as operators (matrices) which satisfy the commutation relations
$$Q_r Q_{r'} - Q_{r'} Q_r = P_r P_{r'} - P_{r'} P_r = 0, \qquad P_r Q_{r'} - Q_{r'} P_r = \frac{h}{2\pi i}\,\delta_{rr'}.$$
Replacing here $P_r$ and $Q_r$ by their expressions in terms of $c_r$ and $c_r^{*}$, we get
$$c_r^{*} c_{r'}^{*} - c_{r'}^{*} c_r^{*} = c_r c_{r'} - c_{r'} c_r = 0, \qquad c_r^{*} c_{r'} - c_{r'} c_r^{*} = -\delta_{rr'}. \tag{421 a}$$
These relations are equivalent to the wave-mechanical relation
$$c_r^{*} = -\frac{\partial}{\partial c_r}, \quad \text{or} \quad c_r = \frac{\partial}{\partial c_r^{*}}, \tag{421 b}$$
which follows from $P_r = \frac{h}{2\pi i}\frac{\partial}{\partial Q_r}$. We thus see that the coefficients $c$ and $c^{*}$ satisfy exactly those conditions which have been established above for the operators $b$ and $b^\dagger$, and can accordingly be identified with the latter.
The application of the quantization process just shown ('second quantization') to the coefficients $c$, $c^{*}$, i.e. their replacement by the operators $b$ and $b^\dagger$, thus leads us directly from the equations (419) (which with their conjugate complex can be considered as a system of canonical equations in the classical sense) to the 'wave-mechanical' equation (421), that is,
$$K\omega = -\frac{h}{2\pi i}\frac{\partial \omega}{\partial t},$$
or the equivalent (operator) equation
$$K\omega = W\omega.$$
This is no other than our previous equation
$$KC = WC$$
with $\omega$ replacing $C$ and with an operator $K$ of the form
$$K = \sum_r \sum_{r'} E_{rr'}\, b_r^\dagger b_{r'},$$
which corresponds to a system of identical particles describable by symmetrical wave functions in the configuration space without any interaction.
In other words, the quantization of the equations (419), describing the motion of a single particle, leads us to an equation describing the motion of any number of such particles, provided they conform to the statistics of Einstein and Bose. The actual number of these particles is equal to one of the characteristic values of the operator
$$N = \sum_r b_r^\dagger b_r$$
and remains a constant of the motion since $N$ commutes with $K$. The motion of the whole assembly of particles is described by the operator-function
$$\boldsymbol{\psi}(x) = \sum_r b_r \psi_r(x),$$
with the help of which the energy operator can be written in the form
$$K = \int \boldsymbol{\psi}^\dagger(x) E \boldsymbol{\psi}(x)\, dx.$$
An exactly similar scheme can be applied, according to Jordan and Klein, in the general case of a system of identical particles of the 'symmetrical type' interacting with each other, if this interaction is represented by a 'quasi-external' potential energy of the form
$$W(x) = \frac{1}{2} \int \psi^{*}(x') F(x, x') \psi(x')\, dx',$$
the operator of the proper energy being replaced accordingly by $E + W$. We then get, putting $\psi(x) = \sum_r c_r \psi_r(x)$,
$$W(x) = \frac{1}{2} \sum_s \sum_{s'} F_{ss'}(x)\, c_s^{*} c_{s'},$$
and consequently
$$K = \int \psi^{*}(x)(E + W)\psi(x)\, dx = \int \psi^{*}(x) E \psi(x)\, dx + \frac{1}{2} \iint \psi^{*}(x) \psi^{*}(x') F(x, x') \psi(x) \psi(x')\, dx\, dx',$$
or
$$K = \sum_r \sum_{r'} E_{rr'}\, c_r^{*} c_{r'} + \frac{1}{2} \sum_r \sum_s \sum_{r'} \sum_{s'} F_{rs;r's'}\, c_r^{*} c_s^{*} c_{s'} c_{r'},$$
with
$$F_{rs;r's'} = (F_{ss'})_{rr'} = \iint \psi_r^{*}(x) \psi_s^{*}(x') F(x, x') \psi_{r'}(x) \psi_{s'}(x')\, dx\, dx'.$$
It now remains to replace the numerical coefficients $c$ and $c^{*}$ by the operators $b$ and $b^\dagger$ in order to obtain the energy operator $K$ corresponding to the problem of many particles.
It should be mentioned that the 'quasi-classical equations' for the coefficients $c$ can be written in the general case in the same canonical form (420 b) as in the special case of no interaction, and that the transition to the quantum (or doubly quantum) equations can be effected as before by treating the coefficients $c_r^{*}$ as the operators $-\partial/\partial c_r$ (or $c_r$ as $\partial/\partial c_r^{*}$).
The preceding scheme for carrying out the process of second (intensity) quantization could be applied in principle to the case of particles of the antisymmetrical type just as well as to particles of a symmetrical type, namely, by substituting the operators $a$ instead of the $b$'s for the coefficients $c$. It would, however, be impossible in this case to consider the conjugate complex coefficients $c^{*}$ as differential operators $-\partial/\partial c$ and to repeat with regard to the quasi-classical equations for the $c$'s and $c^{*}$'s the same process which leads from the classical equations of the motion of a particle to the wave-mechanical equation.
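The operators $b_r$ and $b_r^\dagger$ are written by Dirac in the form
$$b_r = e^{i 2\pi \theta_r / h} \sqrt{n_r}, \qquad b_r^\dagger = \sqrt{n_r}\, e^{-i 2\pi \theta_r / h}, \tag{422}$$
corresponding to the usual expressions for the coefficients $c_r$, $\sqrt{n_r}$ playing the role of the modulus and $2\pi\theta_r/h$ that of the argument or phase angle. It follows from a comparison of (422) with (416) that the operators $e^{i 2\pi \theta_r / h}$ and $e^{-i 2\pi \theta_r / h}$ are no other than the operators $\alpha_r$ and $\alpha_r^\dagger$ considered before. Hence it follows that the operators $\theta_r$ can be represented from the point of view of the operators $n_r$ by the formula
$$\theta_r = \frac{h}{2\pi i}\frac{\partial}{\partial n_r}. \tag{422 a}$$
We have in fact, applying the operator
$$e^{i 2\pi \theta_r / h} = e^{\partial/\partial n_r} = \sum_k \frac{1}{k!} \Bigl(\frac{\partial}{\partial n_r}\Bigr)^{k}$$
to any function of $n_r$,
$$e^{i 2\pi \theta_r / h} f(n_r) = \sum_k \frac{1}{k!} \frac{\partial^k f}{\partial n_r^k} = f(n_r + 1)$$
by Taylor's theorem, and in a similar way
$$e^{-i 2\pi \theta_r / h} f(n_r) = e^{-\partial/\partial n_r} f(n_r) = \sum_k \frac{(-1)^k}{k!} \frac{\partial^k f}{\partial n_r^k} = f(n_r - 1).$$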
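The Taylor-series argument can be tried out on a polynomial, for which the series terminates and the arithmetic is exact. The sketch below is illustrative only: the function names and the test polynomial are assumptions made here, and exact rational arithmetic from the standard library is used so that no rounding enters.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Coefficients of df/dn for f(n) = sum(coeffs[k] * n**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, n):
    """Value of the polynomial at n, as an exact fraction."""
    return sum((c * n**k for k, c in enumerate(coeffs)), Fraction(0))

def apply_shift(coeffs, n, sign=1):
    """Sum over k of sign**k / k! times the k-th derivative of f at n,
    i.e. exp(sign * d/dn) applied to f; the series ends for a polynomial."""
    total = Fraction(0)
    current = [Fraction(c) for c in coeffs]
    k = 0
    while current:
        total += Fraction(sign**k, factorial(k)) * evaluate(current, n)
        current = derivative(current)
        k += 1
    return total

f = [2, -3, 0, 1]                     # f(n) = n**3 - 3n + 2, an arbitrary test function
raised = apply_shift(f, 4, sign=+1)   # to be compared with f(5)
lowered = apply_shift(f, 4, sign=-1)  # to be compared with f(3)
print(raised, evaluate(f, 5))
print(lowered, evaluate(f, 3))
```

For a function that is not a polynomial the series would have to be truncated, and the agreement would then be approximate only.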
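If instead of considering $b_r$ and $b_r^\dagger$ from the point of view of $n_r$ we consider $b_r$ and $n_r$ from the point of view of $b_r^\dagger$, we get, as has been shown before,
$$b_r = \frac{\partial}{\partial b_r^\dagger} \tag{422 b}$$
and consequently $n_r = b_r^\dagger\, \partial/\partial b_r^\dagger$. Replacing $b_r^\dagger$ by $b_r$ as the basic quantity, we get likewise
$$b_r^\dagger = -\frac{\partial}{\partial b_r} \tag{422 c}$$
and
$$n_r = -\frac{\partial}{\partial b_r}\, b_r.$$
Representations of a similar type are not possible in the case of the operators $a$ and $a^\dagger$.
Just as in the latter case, the operators $b$ and $b^\dagger$, which are not Hermitian, can, however, be reduced to Hermitian operators $p$ and $q$ by means of the relations
$$b = \tfrac{1}{2}(q + ip), \qquad b^\dagger = \tfrac{1}{2}(q - ip), \tag{423}$$
which correspond to the relations (404).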
The operators $p$ and $q$ are represented by the matrices
$$q = \begin{pmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\ 0 & 0 & \sqrt{3} & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix}, \qquad p = \begin{pmatrix} 0 & -i\sqrt{1} & 0 & 0 & \cdots \\ i\sqrt{1} & 0 & -i\sqrt{2} & 0 & \cdots \\ 0 & i\sqrt{2} & 0 & -i\sqrt{3} & \cdots \\ 0 & 0 & i\sqrt{3} & 0 & \cdots \\ \vdots & & & & \ddots \end{pmatrix},$$
which follow at once from (416 a), (415 b), and (416), and are easily seen to coincide with the matrices representing the coordinate and the momentum of a linear harmonic oscillator (cf. Chap. III, § 13). Their non-vanishing matrix elements can indeed be written in the form
$$q_{n,n-1} = q_{n-1,n} = \sqrt{n}, \qquad p_{n,n-1} = -p_{n-1,n} = i\sqrt{n} \tag{423 a}$$
(where the index $r$ has been dropped), and differ by certain proportionality coefficients only from the expressions (88 a) derived in § 13.
From (423) we get
$$b^\dagger b = n = \tfrac{1}{4}\bigl[p^2 + q^2 - i(pq - qp)\bigr], \qquad b b^\dagger = n + 1 = \tfrac{1}{4}\bigl[p^2 + q^2 + i(pq - qp)\bigr],$$
whence
$$pq - qp = \frac{2}{i}. \tag{423 b}$$
This reduces to the usual relation $PQ - QP = h/2\pi i$ between the momentum $P$ and the coordinate $Q$ if they are defined as $\sqrt{h\omega/4\pi}\, p$ and $\sqrt{h/4\pi\omega}\, q$ respectively. With the help of the preceding relations we find the following expression for $n$:
$$n = \tfrac{1}{4}(p^2 + q^2 - 2), \tag{423 c}$$
which can be rewritten in the form
$$\tfrac{1}{2}(P^2 + \omega^2 Q^2) = h\bigl(n + \tfrac{1}{2}\bigr)\frac{\omega}{2\pi},$$
corresponding to the quantized values of the energy of a harmonic oscillator with the frequency $\nu = \omega/2\pi$, $n$ playing the role of the quantum number.
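The relations (423) to (423 c) lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions: the infinite matrices are cut off at `DIM` rows, `matmul` and `is_hermitian` are helper names chosen here, and $q$ and $p$ are built by inverting (423), i.e. $q = b + b^\dagger$ and $p = (b - b^\dagger)/i$. As with any such cut-off, the last diagonal element is unreliable and is left out of the comparison.

```python
import math

DIM = 6  # assumed cut-off of the infinite matrices

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def is_hermitian(a, tol=1e-12):
    """True if a[i][j] equals the complex conjugate of a[j][i] within tol."""
    size = len(a)
    return all(abs(a[i][j] - complex(a[j][i]).conjugate()) < tol
               for i in range(size) for j in range(size))

# b has sqrt(n) on the first superdiagonal: b_{n-1,n} = sqrt(n)
b = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(DIM)] for i in range(DIM)]
b_dag = [list(row) for row in zip(*b)]

# Inverting (423): q = b + b^dagger, p = (b - b^dagger) / i
q = [[b[i][j] + b_dag[i][j] for j in range(DIM)] for i in range(DIM)]
p = [[-1j * (b[i][j] - b_dag[i][j]) for j in range(DIM)] for i in range(DIM)]

p2, q2 = matmul(p, p), matmul(q, q)
pq, qp = matmul(p, q), matmul(q, p)

# (423 c): n = (p^2 + q^2 - 2) / 4, and (423 b): pq - qp = 2 / i = -2i
n_diag = [(p2[i][i] + q2[i][i] - 2) / 4 for i in range(DIM)]
comm_diag = [pq[i][i] - qp[i][i] for i in range(DIM)]

print(is_hermitian(p), is_hermitian(q))
for i in range(DIM - 1):  # the last row is spoiled by the cut-off
    print(i, n_diag[i], comm_diag[i])
```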
These results bring us back to the elementary theory of the quantized waves which has been sketched in Part I, § 20, with the trivial difference that we do not have to worry about the half-integral energy values of the harmonic oscillators representing the different states, since it is not their energy, but the quantum number $n$ which gives the number of particles associated with the corresponding state. It is of more importance that we have now obtained an exact and general expression for the energy $K$ of the system of particles in terms of the auxiliary variables $b_r$, $b_r^\dagger$, whereas it was assumed before without sufficient justification that this energy is simply equal to the sum $\sum_{r=1}^{\infty} E_r n_r$. In reality it reduces to this expression in the special case only of no interaction and for a special (though of course most natural) choice of the wave functions $\psi_r$, as corresponding to the stationary states, specified by the energy operator $E$ ($E_r = E_{rr}$). In this case the energy $K$ can be expressed as a simple function of the Hermitian variables $p$ and $q$, namely,
$$K = \tfrac{1}{4} \sum_r E_r (p_r^2 + q_r^2 - 2).$$
Their introduction in the general case instead of the variables $b_r$ and $b_r^\dagger$ would, however, lead only to a useless complication of the theory.
It is interesting to find the harmonic oscillator variables replaced in the case of electrons (or any other particles described by antisymmetrical functions) by the spin variables $\sigma_x$, $\sigma_y$, a fact which could hardly be anticipated in the early development of the theory of quantized waves, given in Part I.
48. Interaction between a 'Doubly Quantized' System and an Ordinary System: Application to Photons
We have considered hitherto a system of identical particles, with or without interaction, in a given external field of force (specified by the potential energy $U_0(x)$ or the operator $E$). We shall consider now the more general case of such a system in interaction with some system of a different kind which will be described to begin with in the usual way, i.e. by giving the coordinates of all the particles constituting it. The energy $H$ of the combined system $A + B$ will consist of three parts: the energy of $A$ taken alone ($H_A$), that of $B$ taken alone ($H_B$), and their mutual energy $M = H_{AB}$, which can be considered as a perturbation.
The method of the 'intensity quantization' discussed in the two preceding sections can easily be extended to the present case if the wave function $\Omega$ describing the whole system in the method of configuration space is written in the form
$$\Omega(X, Y) = \sum_n \omega_n(Y, t)\, \chi_n(X), \tag{424}$$
where $X$ and $Y$ denote the totality of coordinates specifying the corresponding system, while $\chi_n$ denotes a symmetrical or antisymmetrical function of the coordinates $x_1, x_2,...$ of the particles constituting $X$, according to the nature of these particles. Substituting (424) in the wave equation
$$-\frac{h}{2\pi i}\frac{\partial \Omega}{\partial t} = H\Omega,$$
we obtain a system of equations
$$-\frac{h}{2\pi i}\frac{\partial \omega_n}{\partial t} = \sum_{n'} H_{nn'}\, \omega_{n'}$$
of the same kind as before, with the only difference that $H_{nn'}$ must now be treated as an operator with regard to coordinates $Y$, and the 'coefficients' $\omega_n$ as functions of these coordinates and of the time.
Introducing the individual states $\psi_r$ ($r = 1, 2,...$) serving to define the functions $\chi$, we shall thus obtain an equation of exactly the same sort as before for the coefficients $\omega_n$, considered as functions of the partition numbers $n_1, n_2,..., n_r,...$, of the coordinates $Y$, and of the time, with the energy operator $\sum_i E_i$ increased by $H_B$ and by the interaction energy $M$ of $A$ and $B$. Putting
$$M = \sum_i V(x_i, Y),$$
where, in view of the identity of all the particles of $A$, $V$ is the same function for all of them, we must simply add to the energy operator $E_x$ of an individual particle the function $V(x, Y)$. We thus get the following equation:
$$\Bigl\{ H_B + \sum_r \sum_{r'} \bigl[E_{rr'} + V_{rr'}(Y)\bigr] c_r^\dagger c_{r'} + \frac{1}{2} \sum_r \sum_s \sum_{r'} \sum_{s'} F_{rs;r's'}\, c_r^\dagger c_s^\dagger c_{s'} c_{r'} \Bigr\}\, \omega = -\frac{h}{2\pi i}\frac{\partial \omega}{\partial t}, \tag{425}$$
where the operators $c$ stand for $a$ or for $b$ as the case may be.
It has been shown in Chap. VII, § 39, that it is possible to treat one part, $B$ say, of a complex system $A + B$ as a complete system by treating all the quantities referring to this part as matrices with elements defined with respect to the different stationary states of $A$ taken alone. This result has been proved by using for the function $\Omega$ describing the whole system just the expansion (424) with the important restriction that the functions $\chi$ should be exact solutions of the equation
$$H_A \chi = W_A \chi.$$
This treatment can be conveniently applied to the present problem only when there is no interaction between the particles of $A$ and when the individual functions $\psi_r(x)$ are exact solutions of the equation $E\psi_r(x) = E'_r \psi_r(x)$. In this case the symmetrical or antisymmetrical functions $\chi_n(X)$ will also be exact solutions of the equation
$$H_A \chi_n = W_{A,n}\, \chi_n,$$
where $H_A = \sum_{i=1}^{N} E_{x_i}$, and the theory of § 39 will be wholly applicable to our problem.
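This application is derived directly from the equation (425) if we put $F = 0$ and $E_{rr'} = \delta_{rr'} E'_r$. Denoting further the sum $\sum_r E'_r c_r^\dagger c_r = \sum_r E'_r n_r$ by $W_{A,n}$ and putting
$$\omega_n = \omega'_n\, e^{-i 2\pi W_{A,n} t / h}, \tag{425 a}$$
we get
$$\Bigl\{ H_B + \sum_r \sum_{r'} V_{rr'}(Y)\, c_r^\dagger c_{r'} \Bigr\}\, \omega'_n = -\frac{h}{2\pi i}\frac{\partial \omega'_n}{\partial t}. \tag{425 b}$$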
This equation coincides with the equation (329 a) of § 39 if the operator of the interaction energy $M$ is defined as $\sum_r \sum_{r'} V_{rr'}\, c_r^\dagger c_{r'}$. As a matter of fact, the result of its application to the function $\omega'_n$ can be written in the form $\sum_{n'} M_{nn'}\, \omega'_{n'}$, where $n$ and $n'$ denote two sequences of the partition numbers $n_1, n_2,..., n_r,...$ and $n'_1, n'_2,..., n'_r,...$ differing from each other (as in the previously considered case) by one of the numbers in the second sequence being greater and another less by 1 than the corresponding numbers of the first sequence. In other words, the matrix components of the interaction energy $M$ appearing in (425 b) are taken with respect to collective states of the 'ignored' part $A$ of the system $A + B$ which differ by just one particle jumping from one individual state to another, or, in other words, by a one-quantum jump in opposite directions of two of the quantized partition numbers $n_1, n_2,...$ which specify the states of $A$.
The system $B$ can in its turn consist of a number of identical particles of a different kind from those constituting the system $A$ (for instance, $A$ may be a system of photons or protons and $B$ a system of electrons). In this case it is possible to apply the method of intensity quantization to the two systems simultaneously, by defining the functions $\omega_n(Y, t)$ in (424) as symmetrical or antisymmetrical combinations of certain orthogonal and normalized functions $\varphi_1(y), \varphi_2(y),..., \varphi_s(y),...$ describing a sequence of stationary states of the separate particles of $B$. We can then take the equations (425 b) as our starting-point and transform them by putting
$$\omega_n(Y, t) = \sum_m C_{nm}(t)\, \omega_m(Y),$$
where $\omega_m(Y)$ depends (symmetrically or antisymmetrically) on the coordinates $Y$ only. We can also (and this is perhaps a more natural procedure) carry out the two quantization processes simultaneously, starting from the original equation $-\frac{h}{2\pi i}\frac{\partial \Omega}{\partial t} = H\Omega$ and putting
$$\Omega(X, Y, t) = \sum_m \sum_n C_{mn}(t)\, \omega_m(Y)\, \chi_n(X). \tag{426}$$
We thus obtain an equation of the following form:
$$(L + K + M)\, C = -\frac{h}{2\pi i}\frac{\partial C}{\partial t}, \tag{426 a}$$
where $L$ and $K$ are the quantized energy operators referring to the two systems $A$ and $B$ taken separately, while $M$ is the operator of their interaction energy. If $A$ is antisymmetrical and $B$ symmetrical, we can use for $L$ and $K$ the expressions (402) and (417 b) respectively (affixing the indices $x$ and $y$ to the operators $E$ and $F$ in order to distinguish the particles of the two sorts), whereas the operator $M$ is expressed in this case by the formula
$$M = \sum_r \sum_{r'} \sum_s \sum_{s'} v_{rr';ss'}\, a_r^\dagger a_{r'}\, b_s^\dagger b_{s'}, \tag{426 b}$$
where $v(x, y)$ is the interaction energy between one particle of the sort $A$ and one particle of the sort $B$, and
$$v_{rr';ss'} = \iint \psi_r^{*}(x)\, \varphi_s^{*}(y)\, v(x, y)\, \psi_{r'}(x)\, \varphi_{s'}(y)\, dx\, dy.$$
In the equations (426 a) $C = C_{mn}(t)$ is to be considered as a wave function whose arguments are the partition numbers $m_r$ and $n_s$, or rather the corresponding operators, defined as $b_r^\dagger b_r$ and $a_s^\dagger a_s$ respectively.
These results can be generalized further for the case of three or more systems of identical particles, for instance electrons, protons, and photons, interacting with each other.
We are now going to consider more closely the particular case of the photons, i.e. light waves, in interaction with an ordinary material system, which for the sake of simplicity we shall suppose to consist of a single electron, forming with the fixed source of the external field in which it moves a hydrogen-like atom. The peculiarity of this problem lies in the fact that photons cannot actually be treated as ordinary particles. As has been emphasized in Part I (§ 24) the analogy between light and matter has a very limited scope, and the notion of photons must be considered as a useful fiction of the same sort as that of 'phonons' (sound-quanta). In applying this fiction to the interaction between light and matter we must remember in the first place the fact that the number of photons does not remain constant, photons being created in the act of emission and destroyed in the act of absorption. This fact excludes the possibility of describing a system of photons by the method of configuration space. Under such conditions a strict application of the intensity quantization scheme devised for ordinary particles to the case of photons is impossible. It is nevertheless possible to apply the final results to this rather fictitious case, thanks to the fact that we do not have to introduce any interaction between the photons. We must, however, suitably define the expression for the mutual energy between the photons on the one hand, and the material system (electron, atom) on the other, in terms of the partition numbers which describe the distribution of the photons over the different states, and, moreover, provide in a physically irrelevant way for a formal conservation of the total number of photons.
This latter circumstance can easily be achieved by introducing an additional state of zero energy corresponding by definition to an actual absence of the photons. Emission or absorption of a photon will be interpreted under this condition as the transition of a photon from or into the zero state.
The total energy of the photons taken alone (if this part of the system is referred to as $B$) can thus be represented by the operator
$$H_B = \sum_{r=0}^{\infty} E_{rr}\, n_r = \sum_{r=0}^{\infty} h\nu_r\, b_r^\dagger b_r, \tag{427}$$
where $\nu_0 = 0$. The operators $b$, $b^\dagger$ are introduced here not on the ground that a system of photons is describable in the configuration space by a symmetrical wave function, but because we know that the photons conform to the statistics of Einstein and Bose, i.e. behave like material particles of the 'symmetrical' type. It should further be remarked that the quantities $E_{rr} = h\nu_r$ are introduced here not by the general formula $E_{rr} = \int \psi_r^{*} E \psi_r\, dy$ (since neither the operator $E_y$ nor the wave functions $\psi(y)$ have a meaning for photons) but by way of definition.
The part of the energy corresponding to the atom alone can be defined in the usual way. It thus remains to define suitably the interaction energy $M = \sum_i V(X, y_i)$, or rather the matrix elements $V_{rr'}(X)$, the function $V(X, y_i)$ being itself just as meaningless as the operator $E_y$.
In looking for such a definition we can be guided by the classical expression for the energy of an atom or electron in the electromagnetic field of the light waves. This field, according to classical electrodynamics, is fully determined by its vector potential $\mathbf{A}$ as a function of the coordinates and the time, while the scalar potential $\phi$ can without any loss of generality be set equal to zero. The electric and magnetic intensity can be calculated with the help of $\mathbf{A}$ by means of the formulae
$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl} \mathbf{A}.$$
Now the energy of an electron in an additional field specified by the vector potential $\mathbf{A}$ is equal, if terms quadratic in $\mathbf{A}$ are neglected, to
$$-\frac{e}{c}\, \mathbf{A} \cdot \mathbf{v} = -\frac{e}{cm}\, \mathbf{A} \cdot \mathbf{p},$$
where $\mathbf{p}$ is the electron's momentum. This formula can be taken over into the wave mechanics if $\mathbf{p}$ is defined as the operator $\frac{h}{2\pi i}\nabla$. In order to be able to treat this expression as the mutual energy of the electron and of the photons, it remains to split up $\mathbf{A}$ into separate parts, $\mathbf{A}_i$ say, which may be assumed to correspond to the separate photons, and to find the matrix elements of $\mathbf{A}_i$ with respect to the different 'states' $r$ and $s$ of the photons. Putting, for the sake of brevity,
$$-\frac{e}{cm}\, (\mathbf{A}_i)_{rs} = \mathbf{P}_{rs},$$
where $\mathbf{P}_{rs}$ is obviously independent of the individuality of the photon (specified by the index $i$), we thus get for the energy of the electron with respect to the light waves the expression
$$M = \mathbf{p} \cdot \sum_r \sum_{r'} \mathbf{P}_{rr'}\, b_r^\dagger b_{r'}, \tag{427 a}$$
which can be interpreted as the mutual energy of the electron and the photons. The problem is thus reduced to the determination of the matrix elements.
The simplest way to determine them is based on the assumption that the perturbation energy (427 a) must be responsible for such acts as the emission or absorption of light only. This means that the non-vanishing elements $\mathbf{P}_{rr'}$ must correspond either to $r = 0$ or $r' = 0$. Since the number of photons in the zero state can be assumed to be infinite (i.e. actually indeterminate) the operators $b_0 = \alpha_0 \sqrt{n_0}$ and $b_0^\dagger = \sqrt{n_0}\, \alpha_0^\dagger$ must also have infinite characteristic values, so that the matrix elements $\mathbf{P}_{0r}$ and $\mathbf{P}_{r0}$ must be infinitely small. All we need, however, is their products with $b_0^\dagger$ and $b_0$. Denoting these products by $\mathbf{v}_r^\dagger$ and $\mathbf{v}_r$ respectively, we can reduce (427 a) to the form
$$M = \mathbf{p} \cdot \sum_r (\mathbf{v}_r^\dagger b_r + \mathbf{v}_r b_r^\dagger). \tag{427 b}$$
The operator $\mathbf{p} \cdot \mathbf{v}_r^\dagger b_r$ determines the probability of emission and the operator $\mathbf{p} \cdot \mathbf{v}_r b_r^\dagger$ the probability of absorption of a photon $h\nu_r$. Our problem would be completely solved if we knew the dependence of $\mathbf{v}_r$ and $\mathbf{v}_r^\dagger$ on $h\nu_r$. This dependence can be found by comparing the quantum interaction operator (427 b) with the classical one
$$-\frac{e}{cm}\, \mathbf{p} \cdot \mathbf{A} = -\frac{e}{cm}\, \mathbf{p} \cdot \sum_r \mathbf{A}_r,$$
where $\mathbf{A}_r$ is the harmonic component in the Fourier analysis of $\mathbf{A}$ with the frequency $\nu_r$. The energy per unit volume corresponding to this component is equal to $(E_r^0)^2/8\pi$, where $E_r^0$ is the amplitude of the electric intensity (since in the case of light waves the amplitudes of the electric and magnetic vectors are numerically equal). Now according to the relation
$$\mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}$$
we have
$$E_r^0 = \frac{2\pi \nu_r}{c}\, A_r^0.$$
The energy corresponding to a given harmonic vibration in the whole volume $V$ of the enclosure where they take place is thus equal to $\frac{\pi}{2c^2}\, \nu_r^2 (A_r^0)^2 V$. On the other hand, this energy must be equal to the product of $h\nu_r$ with the number of photons associated with the vibrations under consideration. We have therefore
$$\frac{\pi}{2c^2}\, \nu_r^2 (A_r^0)^2 V = h\nu_r n_r,$$
whence
$$\frac{e}{cm}\, A_r^0 = \frac{e}{m} \sqrt{\frac{2 h n_r}{\pi V \nu_r}}. \tag{428}$$
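This expression, multiplied by the phase factor $\cos(2\pi\nu_r t + \gamma_r) = \cos\phi_r$, must obviously correspond to the quantum expression
$$\mathbf{v}_r^\dagger b_r + \mathbf{v}_r b_r^\dagger,$$
which can be written in a similar way if we assume that $\mathbf{v}_r = \mathbf{v}_r^\dagger$ and if further the operators $\alpha_r = e^{i 2\pi \theta_r / h}$ and $\alpha_r^\dagger = e^{-i 2\pi \theta_r / h}$ are identified with the complex phase factors $e^{i\phi_r}$ and $e^{-i\phi_r}$. In the limiting case of very high characteristic values of $n_r$ we can treat $\sqrt{n_r}$ and $\alpha_r$ as commuting (neglecting 1 compared with $n_r$) and write accordingly
$$\mathbf{v}_r^\dagger b_r + \mathbf{v}_r b_r^\dagger = \mathbf{v}_r (b_r + b_r^\dagger) = \mathbf{v}_r \sqrt{n_r}\, \bigl(e^{i 2\pi \theta_r / h} + e^{-i 2\pi \theta_r / h}\bigr) = 2 \mathbf{v}_r \sqrt{n_r} \cos\phi_r. \tag{428 a}$$
Hence it follows that
$$\frac{e}{cm}\, A_r^0 = 2 v_r \sqrt{n_r},$$
which is identical with (428) if we put
$$v_r = v_r^\dagger = \frac{e \sqrt{h}}{m \sqrt{2\pi V \nu_r}}. \tag{428 b}$$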
The direction of the vector $\mathbf{v}_r$ coincides with that of $\mathbf{A}_r$, i.e. with the direction of the electrical vibrations. The wave equation which determines the motion and interaction of the atom (electron) with the photons can be written accordingly in the form
$$\Bigl\{ H_A + \sum_r h\nu_r\, b_r^\dagger b_r + \mathbf{p} \cdot \sum_r \mathbf{v}_r (b_r + b_r^\dagger) \Bigr\}\, \omega = -\frac{h}{2\pi i}\frac{\partial \omega}{\partial t}, \tag{429}$$
which can be obtained from the general equation (425) if we put $F = 0$, interchange $x$ with $y$, and determine, as shown above, the interaction energy matrix $V_{rr'}$. Substituting in (429) $\omega = \omega' e^{-i 2\pi W_n t / h}$, where $W_n = \sum_r h\nu_r n_r$, we can reduce the preceding equation to the form
$$-\frac{h}{2\pi i}\frac{\partial \omega'}{\partial t} = \Bigl\{ H_A + \mathbf{p} \cdot \sum_r \mathbf{v}_r (b_r + b_r^\dagger) \Bigr\}\, \omega', \tag{429 a}$$
which is a special case of (425 b).
Regarding $M = \mathbf{p} \cdot \sum_r \mathbf{v}_r (b_r + b_r^\dagger)$ as the operator of the perturbation energy causing transitions between the stationary states of the atom (electron) with emission or absorption of radiation, we can determine the probability of such transitions by calculating the corresponding matrix elements of $M$. Now these matrix elements can be written in the form
$$M_{Jn;J'n'} = \sum_r (\mathbf{p} \cdot \mathbf{v}_r)_{JJ'}\, (b_r + b_r^\dagger)_{nn'}.$$
By the definition of the operators $b$ we have
$$b_r \omega_n = \alpha_r \sqrt{n_r}\, \omega(n_r) = \sqrt{n_r + 1}\, \omega(n_r + 1),$$
whence it follows, in view of the orthogonality and normalization of the functions $\omega_n$, that the matrix element $(b_r)_{nn'}$ is different from zero only if $n'_r = n_r + 1$, all the other numbers of the two sequences $n_r$ and $n'_r$ (apart from $n_0$) being the same. The value of this matrix element is equal in this case to $\sqrt{n_r + 1}$. For the matrix element $(b_r^\dagger)_{nn'}$ we find likewise a non-vanishing value, namely, $\sqrt{n_r}$ if $n'_r = n_r - 1$, all the other numbers of the two sequences being the same [cf. eq. (423 a), § 47].
We thus see that the probability of the emission of a quantum of frequency $\nu_r$ is proportional to
$$|(\mathbf{p} \cdot \mathbf{v}_r)_{JJ'}|^2\, (n_r + 1), \tag{429 b}$$
while that of its absorption is proportional to
$$|(\mathbf{p} \cdot \mathbf{v}_r)_{JJ'}|^2\, n_r, \tag{429 c}$$
the proportionality coefficient being, of course, the same in both cases.
The energies of the two states of the atom $J$ and $J'$ must differ from each other by an amount approximately equal to $\pm h\nu_r$. The fact that the absorption probability is proportional to the number of photons in the initial state, i.e. to the energy of the latter, is quite natural. It is, however, very remarkable to find that the emission probability is proportional to the number of photons not in the initial, but in the final state, being thus different from zero even if $n_r = 0$, i.e. if no photons of the given sort were present at the beginning (except in the zero state). This result gives an interpretation of the spontaneous emission of light as stimulated by a photon which was initially in the zero state. The sum $n_r + 1$ in (429 b) can be interpreted accordingly as the expression of the fact that the emission of light takes place in two ways, namely, as a result of the stimulative action of the light already present, the probability of this induced emission being exactly equal to the probability of the absorption, and also spontaneously. The ratio $n_r : 1$ must therefore be equal to the ratio $B\rho/A$ of the probability of absorption or induced emission to the probability of spontaneous emission, $A$ and $B$ being the well-known Einstein coefficients (see Part I, §§ 17 and 18) and $\rho$ the density of the energy per unit volume and per unit frequency range.
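The occupation-number factors in (429 b) and (429 c) can be tabulated directly from the matrix elements of $b_r$ and $b_r^\dagger$ quoted above. The following sketch is illustrative only: the function names are assumptions made here, and the common factor $|(\mathbf{p} \cdot \mathbf{v}_r)_{JJ'}|^2$ is left out, since it is the same for emission and absorption.

```python
import math

def b_element(n, n_prime):
    """(b_r)_{n n'}: nonzero only for n' = n + 1, where it equals sqrt(n + 1)."""
    return math.sqrt(n + 1) if n_prime == n + 1 else 0.0

def b_dag_element(n, n_prime):
    """(b_r^dagger)_{n n'}: nonzero only for n' = n - 1, where it equals sqrt(n)."""
    return math.sqrt(n) if n_prime == n - 1 else 0.0

def emission_weight(n):
    """Occupation-number factor of (429 b)."""
    return b_element(n, n + 1) ** 2

def absorption_weight(n):
    """Occupation-number factor of (429 c)."""
    return b_dag_element(n, n - 1) ** 2

def induced_to_spontaneous(n):
    """Ratio of the induced part of the emission weight to the spontaneous part."""
    spontaneous = emission_weight(0)             # the part present even when n = 0
    induced = emission_weight(n) - spontaneous   # the part that grows with n
    return induced / spontaneous

for n in range(5):
    # columns: n, emission weight, absorption weight, induced : spontaneous
    print(n, round(emission_weight(n), 12), round(absorption_weight(n), 12),
          round(induced_to_spontaneous(n), 12))
```

The induced part of the emission weight equals the absorption weight for every $n$, which is the statement that induced emission and absorption are equally probable.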
This result can easily be verified. We have, in fact,
P d v = Jv / dv
l nr hvr>
where the summation is extended over all the frequencies within the
given range. Now, as has been shown in Part 1, §§ 11 and 37, the number
dz of free oscillations of any kind in an enclosure with a volume V,
whose wave number lies in the range die, is equal to AnVk2dk. Apphing
this to light oscillations (with a given state of polarization) we get,
since k = vie, AT7
c3
If nr is considered as a practically continuous function of the frequency,
it can be assumed to have the same value for all oscillations within the
small range dv. We then get
1 An
p dv = yU r hvr dz ~ — nr hvz dv,
whence — = - ^ 3,
nT c3
which actually coincides with the ratio A/B found in P art I, eq. (103 a).
We thus see that the theory of the emission and absorption of radiation
developed in this section (and due to Dirac) has the advantage of
interpreting the spontaneous transitions with emission of radiation, actually
combining such spontaneous emissions with the induced ones.
It is easy to obtain from the above theory the absolute values of the
emission and absorption probabilities. To do this we must multiply
the expressions (429 b) and (429 c) by $\pi^2/h^2$ and further by $dz/d\nu = 4\pi V\nu^2/c^3$,
so long as we are interested in the emission or absorption not of a
particular photon with the frequency $\nu_r$ and a given direction of motion,
but of any photon with a frequency lying within a narrow range $\Delta\nu$
irrespective of the direction of motion. In view of the unsharp character
of the resonance, summation of all the transition probabilities within
the range $\Delta\nu$ leads to a result which is independent of the actual
magnitude of $\Delta\nu$.
The resulting probability of a ‘spontaneous’ emission, for example,
per unit time and unit frequency range thus turns out to be
$$A = \frac{2\pi^2}{h^2}\,|(\mathbf{p}\cdot\mathbf{v}_r)|^2\,\frac{4\pi V\nu^2}{c^3}.$$
Substituting here the expression (428 b) for $\mathbf{v}_r$ and denoting the
component of $\mathbf{p}$ in the direction of the vector $\mathbf{v}_r$ (i.e. the direction of the
electrical oscillations) by $p_r$, we get
A ./ X 12 e% 2?r2
A = KP' ) " ' 1 h t f ~r*V'
or, if $p_r$ is replaced by $m\,dx_r/dt = m\,2\pi\nu_r i\,x_r$,
8t74cV3
A
h
which coincides with the formula (93) of Part I if we take account
of a definitely polarized radiation only.
In order to account not only for the emission and absorption but
also for the scattering of radiation, we must consider the hitherto
neglected term of the perturbation energy, which is proportional to
the square of the vector potential A.
Subtracting from the operator $\frac{1}{2m}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2$ the operator $\mathbf{p}^2/2m$ which
corresponds to $\mathbf{A} = 0$, we find for $M$, the operator of the mutual
energy between the electron and the light, the expression
$$M = -\frac{e}{cm}\,\mathbf{A}\cdot\mathbf{p} + \frac{e^2}{2mc^2}\,\mathbf{A}^2, \tag{430}$$
differing from the previous one by the extra term $e^2\mathbf{A}^2/2mc^2$.
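In order to find its quantum interpretation let us put $\mathbf{A} = \sum_r \mathbf{A}_r$,
where $\mathbf{A}_r = \mathbf{A}_r^0\cos\phi_r$ denotes a harmonic component of $\mathbf{A}$. This gives
$$\mathbf{A}^2 = \sum_r\sum_s \mathbf{A}_r^0\cdot\mathbf{A}_s^0\,\cos\phi_r\cos\phi_s$$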
and consequently, according to (428 a),
which in view of the correspondence between the complex phase factors
and the operators $a = e^{i2\pi\theta/h}$, $a^\dagger = e^{-i2\pi\theta/h}$ can be considered
as the approximate form of the operator
$$K = \frac{m}{2}\sum_r\sum_s \mathbf{v}_r\cdot\mathbf{v}_s\,(b_r^\dagger b_s + b_s^\dagger b_r) = m\sum_r\sum_s \mathbf{v}_r\cdot\mathbf{v}_s\,b_r^\dagger b_s,$$
if we leave aside extra terms of the type $\frac{m}{2}\sum_r\sum_s \mathbf{v}_r\cdot\mathbf{v}_s\,(b_r b_s + b_r^\dagger b_s^\dagger)$ which
correspond to a double emission or a double absorption and which do
not seem to have any real physical significance. Substituting here the
expression (428 b) for $\mathbf{v}$ and denoting by $\theta_{rs}$ the angle between the
directions of the electrical vibrations of the types r and s, we get
$$K = \frac{e^2 h}{2\pi V m}\sum_r\sum_s \frac{\cos\theta_{rs}}{\sqrt{\nu_r\nu_s}}\,b_r^\dagger b_s. \tag{430 a}$$
This operator, considered as a perturbation energy, determines the
probability of those transitions in which one photon ($h\nu_r$) is absorbed
and another ($h\nu_s$) is emitted. Since the state of the atom must not
change [this follows from the fact that its coordinates do not explicitly
appear in (430 a)], i.e. its energy must remain the same, the two
frequencies $\nu_r$ and $\nu_s$ of the absorbed and emitted light must likewise be
the same; we thus have to do with a change of its direction only. This
is the normal coherent scattering. As has been pointed out in § 23, the
scattered light can in reality be different from the incident one (as in
the Raman or Compton effect). The above theory cannot be extended
to such cases of combined scattering.†
† As a matter of fact, it is not strictly applicable even to simple scattering: if instead
of the Schrödinger equation containing terms quadratic in the potential A, we used
Dirac's equation which is linear in A, we should obtain to the first approximation
(corresponding to simple transitions) no scattering at all.
49. Electromagnetic Waves with Quantized Amplitudes; Theory
of Spontaneous Transitions and of Radiation Damping
The preceding theory (due to Dirac) can be greatly simplified if,
following Jordan, Pauli, and especially Heisenberg, we do not explicitly
introduce the notion of photons but treat the phenomena of light from
the point of view of the wave theory, replacing, however, the classical
electromagnetic waves by waves (oscillations) with quantized amplitudes.
Let $\phi(x, y, z, t)$ denote a plane harmonic wave of some quantity $\phi$
characteristic of the electromagnetic field: electric or magnetic field-strength,
scalar or vector potential, etc. It may be a wave travelling in a definite
direction or a standing wave formed by the superposition of two waves
of the same frequency and amplitude, travelling in opposite directions.
In the former case we can put
$$\phi(x, y, z, t) = C_k\,e^{i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)} + C_k^*\,e^{-i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)}, \tag{431}$$
where $\mathbf{r}$ is the vector with the components x, y, z and $\mathbf{k}$ the wave vector;
the magnitude of the latter is connected with the frequency by the
relation $k = \nu/c$, c being the velocity of light. The two amplitudes $C_k$
and $C_k^*$ must be conjugate complex quantities so that $\phi$ may be real.
The expression (431) can be rewritten accordingly in the form
$$\phi(x, y, z, t) = A_k\cos 2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t) + B_k\sin 2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t), \tag{431 a}$$
where $A_k$ and $B_k$ are two real coefficients. Taking the sum of the
expressions (431) or (431 a) for various magnitudes and directions of
the vector $\mathbf{k}$ (forming a discrete or a continuous sequence) with suitably
chosen complex amplitude coefficients $C_k$ (or $A_k$, $B_k$), we can represent
the value of the quantity $\phi$ as a function of the space coordinates and
of the time for any electromagnetic field in ‘empty space’, i.e. satisfying
d'Alembert's equation
$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0. \tag{432}$$
It should be kept in mind, however, that this representation does not
hold for an electromagnetic field produced by electric charges situated
within the region under consideration, since such a field is determined
by a non-homogeneous equation of the form
$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho, \tag{432 a}$$
$\rho$ being the volume density of the charges if $\phi$ is the scalar potential,
or the electric current density if $\phi$ is the vector potential.
So long, however, as we are dealing with radiation, we may safely
assume equation (432) to hold, and accordingly represent its general
solution in the form of a sum (or integral) of the expressions (431).
The transition from the classical electromagnetic theory of light to the
quantum theory can be achieved in the simplest way (without
introducing the notion of light quanta) by regarding the amplitude
coefficients $C_k$, $C_k^*$ not as ordinary complex numbers but as non-commuting
quantum operators proportional to the operators $b$, $b^\dagger$ which have been
used before, with conjugate complex proportionality coefficients $\gamma_k$, $\gamma_k^*$
which are determined by the normalization condition for the function
$\phi_k$. Adding to k the further suffix $\xi$ to indicate the polarization
($\xi$ = 1, 2), we obtain the following quantum expression for a plane
polarized harmonic wave of light,
$$\phi_{k\xi}(x, y, z, t) = \gamma_{k\xi}\,b_{k\xi}\,e^{i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)} + \gamma_{k\xi}^*\,b_{k\xi}^\dagger\,e^{-i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)}. \tag{433}$$
The substitution of the operators $b$, $b^\dagger$ for the coefficients $C$, $C^*$ secures
the ‘quantization’ of all those quantities which are expressed as volume
integrals of the square of $\phi$ (extended over the whole region in which
$\phi$ is different from zero).
Thus, for instance, taking the square of (433) and integrating over
a volume V outside which $\phi$ can be assumed to vanish, we get
$$\int_V \phi_{k\xi}^2\,dV = \gamma_{k\xi}\gamma_{k\xi}^*\,V\,(b_{k\xi}b_{k\xi}^\dagger + b_{k\xi}^\dagger b_{k\xi}), \tag{433 a}$$
the squares of the two terms of (433) giving no contribution to the
integral on account of the periodic factors $e^{\pm i4\pi\mathbf{k}\cdot\mathbf{r}}$.
Now by the definition of the operators $b$, $b^\dagger$ we have
$$b^\dagger b = N, \qquad bb^\dagger = N + 1,$$
where N is an integer or, more exactly, an operator capable of assuming
integral positive values only. Affixing to it the suffixes k, $\xi$ which
specify the oscillations under consideration, we thus get
$$\int_V \phi_{k\xi}^2\,dV = 2|\gamma_{k\xi}|^2\,V\,(N_{k\xi} + \tfrac{1}{2}). \tag{433 b}$$
If $\phi$ is identified with the electric intensity E, the expression (433 b)
divided by $4\pi$ can be interpreted as the electromagnetic energy $W_{k\xi}$
enclosed in the volume V (since the magnetic part of the energy is equal
to the electric one). Putting
$$|\gamma_{k\xi}|^2 = \frac{2\pi h\nu}{V} \qquad (\nu = ck), \tag{433 c}$$
we obtain for this energy the expression
$$W_{k\xi} = (N_{k\xi} + \tfrac{1}{2})\,h\nu,$$
which differs from that of the photon theory by the presence of the
term ½ in the brackets (N being the number of photons).
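The origin of the term ½ can be made concrete with matrices. The sketch below is an illustration only: it assumes the usual matrix representation of $b$ in the basis of states with 0, 1, 2, ... quanta, cut off at a finite dimension, and the helper names are hypothetical. It builds $b$ and $b^\dagger$, forms $\tfrac{1}{2}(b^\dagger b + bb^\dagger)$, and returns its diagonal, which should read $N + \tfrac{1}{2}$ for every state below the cut-off.

```python
import math

def lowering(dim):
    """Matrix of b in the number basis |0>, ..., |dim-1> (truncated at dim states)."""
    b = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        b[n - 1][n] = math.sqrt(n)   # b|n> = sqrt(n) |n-1>
    return b

def transpose(m):
    """Transpose of a real matrix (the adjoint, since b is real here)."""
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def symmetrized_half(dim):
    """Diagonal of (b†b + bb†)/2 for the states below the truncation edge."""
    b = lowering(dim)
    bd = transpose(b)
    bdb = matmul(bd, b)    # b†b, diagonal entries N
    bbd = matmul(b, bd)    # bb†, diagonal entries N + 1 (except at the cut-off)
    return [(bdb[n][n] + bbd[n][n]) / 2.0 for n in range(dim - 1)]
```

The highest state is left out because the cut-off spoils the relation $bb^\dagger = N + 1$ there; this is an artefact of the finite matrix, not of the theory.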
In order to get rid of this term one usually replaces the sum $b^\dagger b + bb^\dagger$
in (433 a) by $2b^\dagger b$, thus putting
$$\int_V \phi_{k\xi}^2\,dV = 2|\gamma_{k\xi}|^2\,V\,b_{k\xi}^\dagger b_{k\xi} = 2|\gamma_{k\xi}|^2\,V\,N_{k\xi},$$
which is a wholly unwarranted procedure. It can be shown,
however, that in the accurate expression of the electromagnetic energy
which involves the sum of four terms (corresponding to the scalar
potential and to the three components of the vector potential) or of
six terms (corresponding to the three components of the electric
intensity and the three components of the magnetic intensity) the ½ cancels
out so that the energy reduces to an integral multiple of $h\nu$.
In the general case of an electromagnetic field represented by a sum
of terms of the form (433) satisfying given boundary conditions
(corresponding, for example, to radiation enclosed in a vessel with perfectly
reflecting walls), the integral $\int \phi^2\,dV$, on account of the mutual
orthogonality of the different normal oscillations $\phi_{k\xi}$, reduces to the sum of
the expressions (433 b) for all the values of k, $\xi$ concerned.
We shall now apply the method of quantized electromagnetic waves
to the interaction between light and matter. The light will be
considered as a perturbation and the matter described in the usual way
by a superposition of the stationary states that would persist in the
absence of the perturbation, i.e.
$$\psi = \sum_r a_r\psi_r = \sum_r a_r\,\psi_r^0(\mathbf{r})\,e^{-i2\pi\nu_r t}.$$
The amplitude coefficients $a_r$ will be treated to begin with as ordinary
numbers; for the sake of simplicity the material system will be imagined
to consist of a single electron bound to a fixed centre of force
(hydrogen-like atom).
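The perturbation due to the light will result in the variation of the
coefficients a with the time; this is determined by the well-known
equations
$$-\frac{h}{2\pi i}\frac{da_r}{dt} = \sum_s S_{rs}\,a_s. \tag{434}$$
The perturbation energy can be written in the form
$$S = T\phi = \sum_{k,\xi} T\phi_{k\xi}, \tag{434 a}$$
where T is some quantity characteristic of the atom, for instance, its
electric moment if $\phi$ represents the electric force.
Substituting in (434 a) the expression (433) for $\phi_{k\xi}$ we get
$$S = \sum_\alpha \gamma_\alpha\,(T_\alpha^+\,b_\alpha\,e^{-i2\pi\nu_\alpha t} + T_\alpha^-\,b_\alpha^\dagger\,e^{i2\pi\nu_\alpha t}), \tag{434 b}$$
where the index $\alpha$ is an abbreviation for k, $\xi$; $T_\alpha^\pm = -\frac{e}{cm}\,p_\xi\,e^{\pm i2\pi\mathbf{k}\cdot\mathbf{r}}$ if $\phi$
denotes the vector potential, and $\nu_\alpha = ck$. Hence we get
$$S_{rs} = \sum_\alpha \gamma_\alpha\,[(T_\alpha^+)_{rs}\,b_\alpha\,e^{i2\pi(\nu_{rs}-\nu_\alpha)t} + (T_\alpha^-)_{rs}\,b_\alpha^\dagger\,e^{i2\pi(\nu_{rs}+\nu_\alpha)t}].$$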
So far the present theory is formally identical with the previously
considered theory of the perturbation produced by classical (i.e.
non-quantized) electromagnetic waves. We can therefore use for the
amplitudes $a_r$ the same approximate expressions as have been derived
before [(175), § 22]. It must be remembered, however, that the
corresponding probabilities $|a_r|^2$, just as the probability amplitudes $a_r$
($r \neq s$), are to be dealt with not as ordinary numbers but as operators.
In order to obtain results comparable with the experimental data we
must consider the characteristic values of these operators, or their
probable values for a number of states corresponding to different
characteristic values. We need not discuss here the method of calculating
these probable values since in the applications they are usually known
a priori. The important thing to be noticed is that the use of quantized
electromagnetic waves involves the introduction of ‘second-order
probabilities’, i.e. of the probability that the ordinary (‘first-order’)
probability of some state (r) should have a given value, out of a number
of possible characteristic values. Instead of directly giving the value of
the transition probabilities, the operators $|a_r|^2$ considered as functions
of the time (with the condition that at t = 0 one of them only has
a characteristic value different from zero), will serve to determine the
probable (or average) values of these transition probabilities.
Another important point is the fact that in calculating the probability
operators $|a_r|^2$ we must take into account the non-commutative character
of the operators $b_\alpha$, $b_\alpha^\dagger$ whose squares or products occur in the expression
of the product of $a_r$ with $a_r^*$. It thus becomes necessary to define in
an unambiguous way the order in which the operators $a_r$ and $a_r^*$ must be
multiplied by each other. This order being adequately fixed, the
commutation relations which are satisfied by the operators $b$, $b^\dagger$ enable
one to incorporate in the perturbation theory of the radiative transitions
those transitions which are classically distinguished as spontaneous on
exactly the same footing as the ordinary ‘induced’ ones.
We shall consider, just as in § 22 (or § 18, Part I), a radiation with a
practically continuous spectrum (such as the thermal radiation in
statistical equilibrium at a given temperature). Assuming the material
system (atom) to be initially in a given state s, we get to a first
approximation ($r \neq s$)
$$a_r = \Delta a_r = -\frac{1}{h}\,a_s^0\sum_\alpha \gamma_\alpha\left[(T_\alpha^+)_{rs}\,b_\alpha\,\frac{e^{i2\pi(\nu_{rs}-\nu_\alpha)t}-1}{\nu_{rs}-\nu_\alpha} + (T_\alpha^-)_{rs}\,b_\alpha^\dagger\,\frac{e^{i2\pi(\nu_{rs}+\nu_\alpha)t}-1}{\nu_{rs}+\nu_\alpha}\right], \tag{435}$$
where $a_s^0 = a_s^{0*} = 1$.
Let us consider in the first place a transition s → r to a state of
higher energy $W_r > W_s$ under the condition of unsharp resonance
with the electromagnetic waves in a small frequency range near
$\nu_\alpha = \nu_{rs} = (W_r - W_s)/h$. We can then drop the second term in (435)
compared with the first one. It now remains to multiply $a_r$ by its conjugate
complex, dropping all terms containing the $b_\alpha$'s with different values
of $\alpha$, and to sum over the frequency range considered.
Before we do this we must, however, make the following important
remark about the order of the factors in the product of $a_r$ with $a_r^*$.
According to (435) $a_r$ ($r \neq s$) must be considered not as an ordinary
number but as an operator of the same type as $b$; its conjugate complex
must be replaced accordingly by the adjoint operator
$$a_r^\dagger = (\Delta a_r)^\dagger = -\frac{1}{h}\,a_s^{0*}\sum_\alpha \gamma_\alpha^*\,(T_\alpha^+)_{rs}^*\,b_\alpha^\dagger\,\frac{e^{-i2\pi(\nu_{rs}-\nu_\alpha)t}-1}{\nu_{rs}-\nu_\alpha}$$
[which corresponds to the first term of (435)].
Correct results are then obtained if the operator which determines the
probability of the state is defined by the product $a_r^\dagger a_r$ and not by $a_r a_r^\dagger$.
In carrying out the summation over the different oscillations we can
drop all those products $b_\alpha^\dagger b_\beta$ for which $\alpha \neq \beta$ (in view of the supposed
incoherent character of the radiation). This gives
(Av)
1 K b , +0 ei2n(v-vrt)l__ l
\ iT£ _ d(v—vr,),
' Kv /
where $\nu_0 = \nu_{rs}$ is the resonance frequency, $\overline{|(T^+)_{rs}|^2}$ the average value
of $|(T_\alpha^+)_{rs}|^2$ for all the directions of the vector $\mathbf{k}$ with the fixed magnitude
$\nu_0/c$, and $\Delta\nu$ a small frequency range containing the resonance frequency
and yet large enough to make the integrand very small compared with
1 for $\nu - \nu_{rs} = \pm\Delta\nu$. The integral being equal to $4\pi^2 t$, we thus get
Av
a la , ~ —«yv2.Rmj2
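Let $Z_\nu\,\Delta\nu$ be the number of different oscillations in the frequency
range under consideration, i.e. the number of summands in $\sum_{(\Delta\nu)}$.
For isotropic thermal radiation $Z_\nu\,\Delta\nu = \frac{8\pi V}{c^3}\,\nu^2\,\Delta\nu$ [cf. Part I, § 29, (141)];
we can then put
$$\frac{1}{\Delta\nu}\sum_{(\Delta\nu)} b_\alpha^\dagger b_\alpha = Z_\nu\,\overline{b^\dagger b}$$
and consequently
$$a_r^\dagger a_r = a_s^{0*}a_s^0\,B_{rs}^+\,\frac{Z_\nu}{V}\,h\nu\,\overline{b^\dagger b}\;t, \tag{435 a}$$
where
$$B_{rs}^+ = \frac{4\pi^2 V}{h^3\nu}\,|\gamma|^2\,\overline{|(T^+)_{rs}|^2} \qquad (\nu = \nu_{rs}). \tag{435 b}$$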
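The linear growth with the time rests on the resonance integral $\int |e^{i2\pi xt}-1|^2\,x^{-2}\,dx$ taken over $x = \nu - \nu_{rs}$, whose value is $4\pi^2 t$. A minimal numerical sketch of this statement follows; the function name, the cut-off `half_width` and the step are assumptions chosen for illustration, and the finite cut-off leaves a small truncation error of the order of $4/\texttt{half\_width}$.

```python
import math

def resonance_integral(t, half_width=200.0, step=1e-3):
    """Midpoint-rule estimate of the integral of |exp(i 2 pi x t) - 1|^2 / x^2
    over -half_width < x < half_width."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step       # midpoints never hit x = 0
        # |exp(i 2 pi x t) - 1|^2 = 4 sin^2(pi x t)
        total += 4.0 * math.sin(math.pi * x * t) ** 2 / (x * x)
    return total * step
```

Nearly the whole contribution comes from the region $|x| \lesssim 1/t$, which is the ‘unsharp resonance’ of the text: the range $\Delta\nu$ need only be large compared with $1/t$ for the result to be independent of it.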
Let us now consider the opposite transition r → s due to the (unsharp)
resonance with electromagnetic oscillations of the same frequency as in
the preceding case. Reversing the indices s and r, we obtain for the
probability amplitude of the transition r → s the expression
$$a_s = \Delta a_s = -\frac{1}{h}\,a_r^0\sum_\alpha \gamma_\alpha\left\{(T_\alpha^+)_{sr}\,b_\alpha\,\frac{e^{i2\pi(\nu_{sr}-\nu_\alpha)t}-1}{\nu_{sr}-\nu_\alpha} + (T_\alpha^-)_{sr}\,b_\alpha^\dagger\,\frac{e^{i2\pi(\nu_{sr}+\nu_\alpha)t}-1}{\nu_{sr}+\nu_\alpha}\right\}. \tag{436}$$
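Since $\nu_{sr} = -\nu_{rs}$, and consequently $\nu_{sr} + \nu_\alpha \approx 0$, we can now drop the
first term of this expression and not the second one, which gives
$$a_s^\dagger a_s = a_r^{0*}a_r^0\,B_{rs}^-\,\frac{Z_\nu}{V}\,h\nu\,\overline{bb^\dagger}\;t. \tag{436 a}$$
This differs from (435 a) by the inverse order of the factors $b_\alpha$ and $b_\alpha^\dagger$,
and also in a minor way through the substitution of $B_{rs}^-$ for $B_{rs}^+$.
Now we have
$$b_\alpha^\dagger b_\alpha = N_\alpha, \qquad b_\alpha b_\alpha^\dagger = N_\alpha + 1,$$
where $N_\alpha$ is the operator representing the (integral) number of light
quanta associated with the oscillations of the type $\alpha$. Passing from
operators to probable values, we get
$$\frac{Z_\nu}{V}\,h\nu\,\overline{b^\dagger b} = \frac{Z_\nu}{V}\,\overline{N}\,h\nu = \rho_\nu,$$
where $\rho_\nu$ is the spectral density of radiation per unit volume, and
$$\frac{Z_\nu}{V}\,h\nu\,\overline{bb^\dagger} = \frac{Z_\nu}{V}\,(\overline{N}+1)\,h\nu = \rho_\nu + \frac{8\pi\nu^2}{c^3}\,h\nu.$$
Hence the probable values of the probabilities for the transitions s → r
and r → s referred to unit time are
$$P_{s\to r} = B_{rs}^+\,\rho_{\nu_{rs}} \qquad \text{if } W_r > W_s,$$
and
$$P_{r\to s} = B_{rs}^-\left(\rho_{\nu_{rs}} + \frac{8\pi\nu^2}{c^3}\,h\nu\right) = A_{rs} + B_{rs}^-\,\rho_{\nu_{rs}}. \tag{436 b}$$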
We thus see that on the present theory ‘spontaneous’ transitions from
a state of higher energy to that of lower energy become completely
fused with the induced transitions of the same type. The relation
$$A_{rs} = \frac{8\pi\nu^2}{c^3}\,h\nu\,B_{rs}$$
between the probability coefficients A and B referring to spontaneous
and induced transitions is just that which has been obtained in Part I,
§§ 17 and 18, by the method of ‘classical’ electromagnetic waves. The
only difference consists in the multiplication of the quantity T
characterizing the atom by the factors $e^{\pm i2\pi\mathbf{k}\cdot\mathbf{r}}$ characterizing the radiation,
which corresponds to the introduction of two somewhat different
coefficients $B_{rs}^+$ (for absorption) and $B_{rs}^-$ (for emission) instead of the single
one considered before. It should be remarked, however, that for an
isotropic radiation characterized by all the directions of the vector $\mathbf{k}$
being equally probable, the two coefficients are identical. If, moreover,
the wave-length $\lambda = 1/k$ is large compared with the effective linear
extension of the atom the factors $e^{\pm i2\pi\mathbf{k}\cdot\mathbf{r}}$ can be dropped altogether.
The expression (435 b) reduces in this case to that obtained before
(Part I, § 17), if T is defined as the electric moment of the atom in the
direction of the electric intensity $\phi$. Substituting the corresponding
expression (433 c) for $\gamma_\alpha$ in (435 b), we get, since
$$\overline{|T_{rs}|^2} = \tfrac{1}{3}\,e^2\,(|x_{rs}|^2 + |y_{rs}|^2 + |z_{rs}|^2),$$
$$B_{rs} = \frac{8\pi^3}{3h^2}\,e^2\,(|x_{rs}|^2 + |y_{rs}|^2 + |z_{rs}|^2),$$
in agreement with (103), § 18 of Part I.
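As a second illustration of the method of quantized electromagnetic
waves we shall apply it, following Rosenfeld, Weisskopf and Wigner,
to the problem of the radiation damping.
Let us return to the perturbation equations (434) and let us assume
for the sake of simplicity that $S_{rs}$ is different from zero for two states
only, r = 1 and s = 0 say (the diagonal elements $S_{11}$ and $S_{00}$ likewise
vanishing).
The equations (434) reduce under these conditions to the following
two:
$$-\frac{h}{2\pi i}\frac{da_1}{dt} = f\,a_0, \qquad -\frac{h}{2\pi i}\frac{da_0}{dt} = g\,a_1, \tag{437}$$
where
$$f = \sum_\alpha \gamma_\alpha\,[(T_\alpha^+)_{10}\,b_\alpha\,e^{i2\pi(\nu_{10}-\nu_\alpha)t} + (T_\alpha^-)_{10}\,b_\alpha^\dagger\,e^{i2\pi(\nu_{10}+\nu_\alpha)t}],$$
$$g = \sum_\alpha \gamma_\alpha\,[(T_\alpha^+)_{01}\,b_\alpha\,e^{-i2\pi(\nu_{10}+\nu_\alpha)t} + (T_\alpha^-)_{01}\,b_\alpha^\dagger\,e^{-i2\pi(\nu_{10}-\nu_\alpha)t}]. \tag{437 a}$$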
We shall assume that at the initial moment (t = 0) $a_1 = 1$ and $a_0 = 0$,
which means that the atom was initially in the excited state, and shall
try to solve our problem more exactly than was done before (when
$a_1$ was considered as constant) by putting
$$a_1 = e^{-2\pi\Gamma t}. \tag{438}$$
This corresponds to a radioactive-like decay of the number of atoms
in the excited state (1) owing to their spontaneous transition to the
normal one (0). Substituting this expression in the second equation
(437) and integrating, we get
e-t2ir(v16+Ka-iDf_I
= —T 01 J -Vr) piiniva-vit+iiy_1
+
\
+ (^ n S - ^ + s r-) - <«»*>
The first term in this sum can be dropped so long as $\nu_\alpha$ lies in the
vicinity of $\nu_{10}$, just as in the derivation of (436 a).
In order to find the decay or damping constant $\Gamma$ we must substitute
this expression in the first equation (437) with due account of the order
of the factors $b$, $b^\dagger$ and sum over all the $\alpha$'s in the resonance range.
In doing this we can drop the second term in f, for in view of the
incoherent character of the oscillations the probable (average) value of
$b_\alpha^\dagger b_\beta^\dagger$ vanishes both when $\alpha \neq \beta$ and $\alpha = \beta$, the only non-vanishing
terms being those containing the products $b_\alpha b_\alpha^\dagger$. We thus
find
g - 12 i r V t __g i 2tt(i'1o- v j t
- r e - 2” n - ~ ^ y ! m ) 10(T-)01(Na+ i)
I ct v«—>'10+ ^
1 ^ 1 __ pi2n(vl0-va~ir)t
that is, r - - i ^ 0- V a- T f ) ' (438b)
Replacing here the operator $N_\alpha$ as before by its probable (average)
value $(c^3/8\pi h\nu^3)\,\rho_\nu$, and further replacing the summation over $\alpha$ by an
integration over $\nu$ with the expression $Z_\nu\,d\nu = 8\pi V\nu^2\,d\nu/c^3$ for the
number of $\alpha$-values in the range $d\nu$, we get
V2 ________________ / c3 \ r 1 ___g ia rr^ io -v -tT y
where $\nu_0$ denotes the resonance frequency $\nu_{10}$, and $\overline{(T^+)_{10}(T^-)_{01}}$ the
average value of the product $(T_\alpha^+)_{10}(T_\alpha^-)_{01}$ for the directions of the
vector $\mathbf{k}$ with the fixed magnitude $k = \nu/c$. The integral J appearing
in this formula is easily seen to be independent of the value of the
parameter $\Gamma$ and to be equal to $\pi$. We have in fact, putting $\Gamma = 0$
and $\nu - \nu_{10} = \xi$,
$$J = \int_{-\infty}^{+\infty} \frac{1 - e^{-i2\pi\xi t}}{i\xi}\,d\xi = -i\int_{-\infty}^{+\infty} \frac{1-\cos 2\pi\xi t}{\xi}\,d\xi + \int_{-\infty}^{+\infty} \frac{\sin 2\pi\xi t}{\xi}\,d\xi.$$
The first term obviously vanishes since the integrand is an odd function
of $\xi$, while the second reduces to the well-known integral of Laplace,
which is equal to $\pi$.
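Both statements about J can be illustrated numerically. The sketch below is an assumption-laden illustration: the function names, the symmetric cut-off and the step are chosen for convenience and are not taken from the text. It sums $\sin(2\pi\xi t)/\xi$ and $(1-\cos 2\pi\xi t)/\xi$ over a symmetric grid of midpoints; the first sum should approach $\pi$ whatever the value of t, the second should vanish by symmetry.

```python
import math

def dirichlet_integral(t, half_width=500.0, step=5e-3):
    """Midpoint-rule estimate of the integral of sin(2 pi xi t) / xi over |xi| < half_width."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        xi = -half_width + (i + 0.5) * step      # symmetric grid, never xi = 0
        total += math.sin(2.0 * math.pi * xi * t) / xi
    return total * step

def odd_part(t, half_width=500.0, step=5e-3):
    """Same grid applied to (1 - cos(2 pi xi t)) / xi, an odd function of xi."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        xi = -half_width + (i + 0.5) * step
        total += (1.0 - math.cos(2.0 * math.pi * xi * t)) / xi
    return total * step
```

The convergence of the first sum is slow (the truncation error falls off only as the inverse of the cut-off), so agreement with $\pi$ to a few parts in ten thousand is all that should be expected with these settings.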
Thus, if we neglect the difference between the factors $T^+$ and $T^-$,
replacing them simply by T (which is always permissible if the resonance
wave-length, $\lambda = c/\nu_{10}$, is large compared with the effective dimensions
of the atom), and take into account the relation (435 b), we obtain for
the constant $\Gamma$ the following expression:
whence
$$4\pi\Gamma = A_{10} + B_{10}\,\rho_\nu. \tag{439}$$
This quantity is usually denoted as the damping exponent since the
number of atoms in the excited state $|a_1|^2$ decreases with the time as
$e^{-4\pi\Gamma t}$. Under ordinary circumstances the second term in (439) is small
compared with the first one, so that the damping constant is numerically
equal to the probability of a spontaneous transition between the
corresponding states.
In the general case when the atom is initially in an excited state r
from which spontaneous transitions are possible to several states of
lower energy s, the damping constant is equal to the sum of the
corresponding transition probabilities
$$4\pi\Gamma_r = \sum_s A_{rs} \qquad (W_s < W_r). \tag{439 a}$$
The probability amplitude of the rth state $a_r$ decreases with the time
like $e^{-2\pi\Gamma_r t}$. Multiplying this expression by $\psi_r = \psi_r^0(x)\,e^{-i2\pi\nu_r t}$ we can
treat the resulting function
$$a_r\psi_r = \psi_r^0(x)\,e^{-i2\pi(\nu_r - i\Gamma_r)t}$$
as representing damped vibrations, corresponding to a complex value of
the frequency $\nu_r - i\Gamma_r$. Such damped vibrations starting at a certain
instant t = 0 can be analysed into a series of undamped harmonic
vibrations, according to the equation
$$f(t) = e^{-i2\pi(\nu_r - i\Gamma_r)t} = \int_{-\infty}^{+\infty} A_\nu\,e^{-i2\pi\nu t}\,d\nu \qquad (t \geq 0),$$
where
$$A_\nu = \int_0^\infty f(t)\,e^{i2\pi\nu t}\,dt = \int_0^\infty e^{-2\pi[\Gamma_r - i(\nu-\nu_r)]t}\,dt = \frac{1}{2\pi[\Gamma_r - i(\nu-\nu_r)]},$$
or
$$|A_\nu|^2 = \frac{1}{4\pi^2[(\nu-\nu_r)^2 + \Gamma_r^2]}. \tag{439 b}$$
This corresponds to an effective spectral width $\Gamma_r$ of the state in question,
in agreement with the interpretation of complex energy values given
in Part I, § 15, in connexion with the problem of radioactive decay.
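The Fourier amplitude (439 b) can be checked by direct numerical integration. The following is a minimal sketch; the function names and the choice of the upper limit `t_max` are assumptions made for illustration. It compares the closed form $1/2\pi[\Gamma_r - i(\nu-\nu_r)]$ with a midpoint-rule estimate of the time integral, and the closed form also shows that $|A_\nu|^2$ falls to half its maximum at $\nu - \nu_r = \pm\Gamma_r$.

```python
import cmath
import math

def amplitude_closed_form(nu, nu_r, gamma):
    """A_nu = 1 / (2 pi [gamma - i (nu - nu_r)])."""
    return 1.0 / (2.0 * math.pi * complex(gamma, -(nu - nu_r)))

def amplitude_numeric(nu, nu_r, gamma, t_max=None, steps=200000):
    """Midpoint-rule estimate of the integral of exp(-2 pi [gamma - i (nu - nu_r)] t)
    from t = 0 to t = t_max."""
    if t_max is None:
        t_max = 10.0 / gamma     # the integrand has decayed by exp(-20 pi) by then
    dt = t_max / steps
    k = -2.0 * math.pi * complex(gamma, -(nu - nu_r))
    total = 0j
    for i in range(steps):
        total += cmath.exp(k * (i + 0.5) * dt)
    return total * dt
```

For instance, `abs(amplitude_closed_form(nu_r + gamma, nu_r, gamma))**2` is one half of `abs(amplitude_closed_form(nu_r, nu_r, gamma))**2`, which is the sense in which $\Gamma_r$ measures the width of the state.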
50. Application of Quantized Electron Waves to the Emission
and Scattering of Radiation
If in the function
$$\psi = \sum_r a_r\,\psi_r(x, t) = \sum_r a_r\,\psi_r^0(x)\,e^{-i2\pi\nu_r t}, \tag{440}$$
representing the undisturbed motion of the electron, the coefficients
$a_r$ are treated not as ordinary complex numbers, but as operators
satisfying the relations
$$a_r a_s^\dagger + a_s^\dagger a_r = \delta_{rs}, \qquad a_r a_s + a_s a_r = 0, \qquad a_r^\dagger a_s^\dagger + a_s^\dagger a_r^\dagger = 0, \tag{440 a}$$
$\psi$ will represent the motion of any given number of electrons, distributed
over the individual states $\psi_r$, the number of electrons associated with
a particular state r being defined as the characteristic value of the
operator
$$a_r^\dagger a_r = n_r \tag{440 b}$$
(i.e. 1 or 0); it should be remembered that the product $a_r a_r^\dagger$ is equal
to $1 - n_r$.
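For a single state r the relations (440 a) and (440 b) can be realized by two-row matrices. The sketch below is an illustration only; the basis ordering (vacant state first, occupied state second) and the names are assumptions introduced here. It forms $n_r = a_r^\dagger a_r$, $a_r a_r^\dagger$ and $a_r a_r$ so that the three statements of the text can be read off directly.

```python
def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def matadd(x, y):
    """Element-wise sum of two matrices of equal shape."""
    return [[x[i][j] + y[i][j] for j in range(len(x[0]))] for i in range(len(x))]

# Basis: index 0 = state vacant, index 1 = state occupied (assumed ordering).
A = [[0, 1],
     [0, 0]]          # a: removes the electron, a|occupied> = |vacant>
A_DAG = [[0, 0],
         [1, 0]]      # a†: creates the electron

NUMBER = matmul(A_DAG, A)                                # n = a†a, characteristic values 0 and 1
ANTICOMMUTATOR = matadd(matmul(A, A_DAG), NUMBER)        # a a† + a†a
SQUARE = matmul(A, A)                                    # a a
```

Here `NUMBER` is diagonal with entries 0 and 1, `ANTICOMMUTATOR` is the unit matrix, so that $a a^\dagger = 1 - n$, and `SQUARE` vanishes, which is the matrix form of Pauli's exclusion principle for the state considered.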
It has been shown by Heisenberg that with this definition of $\psi$,
corresponding to quantized electron waves, it is possible to give an adequate
description of the emission (and scattering) of radiation in terms of the
classical electromagnetic theory, if, following Schrödinger, we replace
the classical mechanical quantities (coordinates, velocities, etc.) by their
average or probable values.
This wave-mechanical theory of light emission has been discussed
already in Part I, § 17, with the help of ‘classical’ (i.e. unquantized)
electron waves as giving rise to classical electromagnetic waves. It has
been shown there that light vibrations defined as ‘beats’ (‘difference
tones’) between two electron waves have correct frequencies, but that
their amplitude is proportional not only to the probability of the initial
state but also to that of the final one, which contradicts the photon
theory of radiation. Now this contradiction can be removed if the
‘classical’ electron waves are replaced by quantized ones; the resulting
electromagnetic waves appear likewise as quantized although in a way
somewhat different from that considered in the preceding section.
The mechanical quantity which determines the radiation emitted by
an atom can be defined according to Schrödinger's theory as the
probable value of the electric moment of the atom
$$\overline{P} = \int \psi^* P\,\psi\,dV.$$
If we are concerned with several electrons P must denote the sum
$\sum_i e\mathbf{r}_i$, where $\mathbf{r}_i$ is the radius vector of the ith electron (with respect to
the nucleus), and $\psi$ an antisymmetrical function of the coordinates
of all the electrons, the integration being extended over the whole
configuration space. Introducing quantized electron waves, we can
represent the totality of the electrons by the three-dimensional
operator-function (440), and replace the preceding expression for the probable
value of the resulting electric moment by the operator
$$\mathbf{P} = \int \psi^\dagger P\,\psi\,dV, \tag{441}$$
whose characteristic values must be considered as the probable values
of P. Just as for the quantized electromagnetic waves discussed in the
preceding section, we are thus concerned with probabilities of the
‘second order’, i.e. the probabilities of certain probable values of P, the
corresponding second-order probability amplitudes C being defined by
an equation of the form $\mathbf{P}C = P'C$. As a matter of fact, we need
not bother about these probabilities, for the quantity we are actually
interested in, and which can be directly compared with the experimental
facts, is the probable value $\overline{P}$ of the operator $\mathbf{P}$, which, as we shall
presently see, can usually be determined directly.
It should be emphasized that the order in which the two factors
$\psi^\dagger$ and $\psi$ appear in the expression (441) is an essential feature of this
expression, since these factors do not commute with each other. We
should obtain wrong results if the operator $\mathbf{P}$ were defined as $\int \psi P\,\psi^\dagger\,dV$.
Substituting in (441) the expression (440) for $\psi$, we get
$$\mathbf{P}(t) = \sum_r\sum_s a_r^\dagger a_s\,P_{rs}\,e^{i2\pi\nu_{rs}t}, \tag{441 a}$$
and consequently
$$\frac{d^2\mathbf{P}(t)}{dt^2} = -\sum_r\sum_s a_r^\dagger a_s\,P_{rs}\,(2\pi\nu_{rs})^2\,e^{i2\pi\nu_{rs}t}. \tag{441 b}$$
This expression can be considered as defining in the same way as in
the classical electromagnetic theory the electric and magnetic field
generated by the atom at sufficiently remote points.
The electrical intensity in a given direction $\tau$, say, at a distance R
from this atom (the unit vector $\tau$ being perpendicular to R) is thus
represented by the operator
$$E_\tau(R, t) = -\frac{1}{c^2R}\,\ddot{\mathbf{P}}_\tau(t - R/c),$$
c being the velocity of light, that is,
$$E_\tau = E_\tau^- + E_\tau^+ = \frac{1}{c^2R}\sum_r\sum_s (2\pi\nu_{rs})^2\,a_r^\dagger a_s\,(P_\tau)_{rs}\,e^{i2\pi\nu_{rs}(t-R/c)}, \tag{442}$$
where $E^-$ corresponds to terms with negative frequencies and $E^+$ to
those with positive frequencies.
The electric field defined by (442) is an operator, of a type somewhat
similar to that defined in the preceding section with the help of the
operators $b$, $b^\dagger$, the operators $a_r^\dagger a_s$ corresponding to $b$ if $\nu_{rs} < 0$
($W_r < W_s$) and to $b^\dagger$ if $\nu_{rs} > 0$ ($W_r > W_s$). The connexion between the
two types of operators will be examined later on. We are concerned
here only with the fact that in order to obtain the observed electric
field we must take the characteristic or probable values of (442). In
the absence of definite phase relations between the operators $a_r$ and $a_s$
referring to different states, i.e. when the different harmonic terms in
(440) are incoherent with regard to each other, the probable values
of $a_r^\dagger a_s$ are equal to zero so long as $r \neq s$, so that the probable value
of (442) vanishes. This is practically equivalent to the fact that
the average value of $E_\tau$ with respect to the time is equal to zero. The
quantity we are interested in is, however, not the electric field-strength
but the corresponding energy. According to the classical theory, the
latter (or more exactly the energy-density) is proportional to the square
of $E_\tau$. In order to obtain the operator which serves to define the energy
in agreement with the photon theory of radiation we must, instead of
squaring $E_\tau$, multiply $E^+$ by $E^-$ in the order stated (just as in the
preceding section where $E^+$ was replaced by $\phi^\dagger$ and $E^-$ by $\phi$). This
gives
it being understood that $\nu_{rs} > 0$ if r > s (the index $\tau$ is dropped for
convenience).
We shall take in the first place the time average of this expression,
which can be done by keeping those terms only for which $\nu_{rs} + \nu_{s'r'} = 0$,
that is, r' = r and s' = s. We thus get
It should be mentioned that the same result is obtained by averaging
over the phases of the operators $a_r$, etc., if they are assumed to
correspond to incoherent vibrations.
This formula shows that the intensity of the emitted light is equal to
the sum of terms corresponding to a combination of two states (r, s),
provided the upper state is occupied ($n_r = 1$) and the lower vacant ($n_s = 0$).
This result is in complete harmony with what should be expected on
the photon theory of light emission in connexion with Pauli's exclusion
principle. The formula (442 b) can thus be regarded as the improved
version of the ‘classical’ wave-mechanical equation (92) of Part I, § 17,
where the upper and lower states appeared in a quite symmetrical
manner. Indeed, we come back to this result if we consider the
amplitudes $a_r$, $a_s$ as ordinary numbers and not as operators.
If in the expression (441 a) $a_r$ and $a_s$ are multiplied by the damping
factors $e^{-2\pi\Gamma_r t}$ and $e^{-2\pi\Gamma_s t}$, the light vibrations with the frequency
$\nu_{rs} = (W_r - W_s)/h$ due to the combination of the corresponding states
appear as damped with the damping constant
rj-a = —2 2
V<^r g<8
The effective width of the spectral line emitted in a transition from
one state to another is thus equal to the sum of the widths of both the
initial and final states.
We shall now investigate, with the help of the formula (442), the light
emitted by an atom under the perturbing influence of ‘primary’
electromagnetic waves, or, in other words, the phenomenon of the scattering
of radiation. As has been shown in Chap. V, § 23, the interpretation
of this phenomenon from the purely mechanical point of view
necessitates the consideration of double transitions, which correspond to the
second approximation in the solution of the perturbation problem. If,
however, we consider the radiation emitted (scattered) by the perturbed
atom, we can confine ourselves to the first approximation, which in
conjunction with equation (442) gives equivalent results.
Let us for a moment treat the coefficients $a_r$ as ordinary numbers,
and define the electric field of the primary light waves by the expression
$$E^0 = b\,e^{-i2\pi\nu t} + b^\dagger e^{i2\pi\nu t},$$
where b is the (complex) amplitude of $E^0$. Let us assume further that
at the initial moment, t = 0, the coefficients $a_q$, $a_{q'}$, $a_{q''}$,... are different
from 0, while all the other coefficients $a_r$, $a_s$,... vanish. We then get
from (435), with $T_\alpha^\pm$ replaced by $P_\sigma$, the component of P in the direction
$\sigma$ of the vector $E^0$, and the summation over $\alpha$ by a summation over q:
\ ei2n(v
(v+vffV
+ v ^ V __ lJ “I
a, = \ a r
SfiM.t
- s 2 “! « 4 6
n L
---------- + 0 T-
vrg+v J ’
38
Substituting this expression and the similar expression for the conjugate
complex
Q\2.n{v-Vf^t J £-i2ir(v-j-vrjl_
«J = A1a t = — ---...+ 6 - ...... —
vra v vr<7+ 1'
in the formula (441 a) for the electric moment or its projection in a given
direction $\tau$ and dropping small terms of the second order, we have
PM = 2 £ [ \a \< l%{p X i ei‘L,,'’v,+ aV A1ar(P;) 9>ei2™«-''J.
r V
If furthermore we drop irrelevant terms which do not contain the
primary frequency (they can actually be considered as fading away
owing to the damping), we get, with the help of the relations
vrqJTvq,r ~ vq'q ~ *W >
— 1 ^ a v ~> a { r pi2n(v~Vq’qji />—i2w(v
[ p-iZn(v~v9Jt pi2.'n(y+v9'Jt'\\
b e------------+ t f -— — — ,
or, rearranging the different terms, vrq—v vrq+v J)
pM = 2 1 a - , 6 tei2»(v-^y]+
q a'
+ 22 [ « uqq' be-i2rT{v+v**)t-\-a$a% u£q6 tei27r(v+v«/^]. (443)
<i <r
In this formula
U«1 - 2k 2
r vrq~V
= 1 y i P l M Ptw
2h ^ Vrq—V
>. (443 a)
1 y (P?)q.f(P°)rfl
v Q~ 2 k£ vrqJr v
r '’rq+V
The electric field strength of the scattered radiation at a distance $R$,
Er(t) = — is thus given by the formula
K ( () = - ^ { [ ^ ( v- ^ q ) } \ aq < U«t be-i^<’- ' ' ^ ‘- Rlc>+
+ a ° fa“. m-. 6 te<2Ir(>'--,-,xi-R/c)j} _
—^ { [ 2tt(i/+ Vff)]2[a?ta«’w« ' be-i2n^v+v>’^l- n,c)+
+ < « S < « 6 Vm»+v.XMW«)]}. (443 b)
Although in the preceding calculation the coefficients $a_r$, etc., were
dealt with as ordinary numbers, the results obtained remain valid if
we regard them as operators, since in writing down the products
$a_q^\dagger a_{q'}$, etc., we have always preserved the correct order of the
factors. The smallness of the first-order coefficients $a_r$ must be
understood to mean in this case the smallness of the average (or probable)
values of the corresponding probabilities $|a_r|^2$ (i.e. the predominance
of the characteristic values $|a_r|^2 = 0$).
Let us consider separately the special case when the atom is supposed
to be initially in a definite state q. The double sum (443) reduces in
this case to the single term
\[ P_\tau(t) = a_q^\dagger a_q\, w_{qq}\,\bigl(b\,e^{-i2\pi\nu t} + b^\dagger e^{i2\pi\nu t}\bigr), \tag{444} \]
where 1 V •V(P 2W'P?)r« (444 a)
w h Z '
and the electric field strength (443 b) to
(444 b)
The scattered radiation has in this case the same frequency as the
primary one. This is the so-called simple or Rayleigh scattering. In
the general case of equation (443 a) we obtain in addition to this simple
scattering a ‘combination’ or Raman scattering with a number of
modified frequencies $\nu \mp \nu_{q'q}$.
In order to obtain the average energy of the scattered rays we must
take the square of $E_\tau$ or, more exactly, the product of $E^+$ with $E^-$.
For Rayleigh scattering this gives
\[ E^+E^- \sim w_{qq}^2\, n_q^2\, b^\dagger b = w_{qq}^2\, n_q\, b^\dagger b, \tag{445} \]
since the characteristic values of $n_q^2$ and $n_q$ are the same (1 or 0).
For the Raman scattering the situation is somewhat more complicated.
We shall consider separately the scattered rays with the
frequency $\nu - \nu_{q'q}$ and those with the frequency $\nu + \nu_{q'q}$.
Taking the time average of $E^+E^-$, according to (443 b) we get
■U ,, °q al b' b-
th at is, ~ {y-vq.q)*u^.u-.qn l(\-n \.y, (445a)
and in a similar way
•W , ~ ( * ■ + ^
or Jv+v,; ~ (v+ vq'q)iK ’qum K ^ l ~ nl ) ^ b- (445b)
These results are in harmony with the experimental data and with the
elementary theory (due to Smekal) of the Raman effect, based on the
idea of photons. In order to secure complete agreement we must make,
however, the additional assumption that $\nu_{q'q}$ is positive, i.e. that
$W_{q'} > W_q$.
The scattered photon with the decreased frequency $\nu - \nu_{q'q}$ is obtained
on this view if the atom was initially in the lower state ($n_q = 1$), the
higher state $q'$ being vacant ($n_{q'} = 0$). In the contrary case ($n_{q'} = 1$,
$n_q = 0$) the atom jumps from the higher state to the lower one, adding
the energy $h\nu_{q'q}$ to that of the incident photon, which results in the
emission of the scattered photon with the increased frequency $\nu + \nu_{q'q}$.
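The bookkeeping of the scattered frequencies can be illustrated by a short numerical sketch. It is not part of the original derivation: the function names `transition_frequency` and `scattered_frequencies` are hypothetical, and the labels ‘Stokes’ and ‘anti-Stokes’ are the usual modern names, assumed here, for the lines with the decreased and the increased frequency.

```python
# Illustrative sketch only: frequencies of the Rayleigh and Raman lines.
# All names below are assumptions introduced for this example.

H_PLANCK = 6.62607015e-34  # Planck's constant in J s (SI value)


def transition_frequency(w_upper, w_lower, h=H_PLANCK):
    """Return nu_{q'q} = (W_{q'} - W_q) / h, positive when q' lies above q."""
    return (w_upper - w_lower) / h


def scattered_frequencies(nu, nu_qq):
    """Return the unmodified and the two modified frequencies.

    nu     : frequency of the primary light
    nu_qq  : positive transition frequency nu_{q'q}
    """
    return {
        "rayleigh": nu,              # atom returns to its initial state
        "stokes": nu - nu_qq,        # atom raised from q to q'
        "anti_stokes": nu + nu_qq,   # atom lowered from q' to q
    }


if __name__ == "__main__":
    # Units chosen so that h = 1; the numbers are arbitrary.
    nu_qq = transition_frequency(5.0, 3.0, h=1.0)
    print(scattered_frequencies(10.0, nu_qq))
```

Which of the two modified lines actually appears depends on the occupation of the states $q$ and $q'$, as described in the text; the sketch only lists the possible frequencies.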
It should be mentioned that the intermediate states $r$, which determine
the intensity of the scattered radiation through the factors $u^{\pm}$,
in contradistinction to the final state $q$ or $q'$, need not be vacant,
since the corresponding numbers (operators) $n_r$ do not appear in the
equations (444 a) and (444 b). This can be explained by the fact that
if some intermediate state $r$ is occupied, the electron starting from the
state $q$, say, is interchanged with the electron in the state $r$, which
passes to the final state $q'$.
The probability amplitude of such double transitions $q \to r \to q'$ with
interchange must be the same as for double transitions without
interchange, since the electrons are indistinguishable.
The expressions
\[ (\nu - \nu_{q'q})^4\, u^-_{qq'}\, u^-_{q'q} \quad\text{and}\quad (\nu + \nu_{q'q})^4\, u^+_{q'q}\, u^+_{qq'}, \]
which are a measure of the intensity of the scattered radiation with
the frequency $\nu \mp \nu_{q'q}$, are in agreement with the expressions (184)
derived in § 23 for the probability of the double transitions which are
responsible for the scattering.
The preceding theory of the scattering process can be improved by
taking account of the damping which is described by adding to the
frequency $\nu_r$ of each state the imaginary term $i\Gamma_r$ considered above.
This correction becomes especially important in the neighbourhood of
resonance. We thus get, for example, instead of (444 a),
where $\Gamma_{qr} = \Gamma_r$ is the damping factor for the line emitted in the
transition between the states $q$ and $r$. This expression remains finite
when $\nu = \nu_{rq}$, determining the polarization and intensity of the so-called
‘resonance radiation’.
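A minimal numerical sketch of why the damping keeps the resonance term finite. It assumes that each resonance denominator $\nu_{rq} - \nu$ is replaced by $\nu_{rq} - \nu - i\Gamma$ (the sign of the imaginary part is a convention adopted here for illustration only), and it treats a single such denominator rather than the full sum of (444 a). The name `resonance_factor` is hypothetical.

```python
# Illustrative sketch only: one damped resonance denominator.
# Assumption: damping enters as 1 / (nu_rq - nu - i*gamma).


def resonance_factor(nu, nu_rq, gamma):
    """Return 1 / (nu_rq - nu - i*gamma) as a complex number."""
    return 1.0 / complex(nu_rq - nu, -gamma)


if __name__ == "__main__":
    nu_rq, gamma = 5.0, 0.5
    for nu in (4.0, 4.9, 5.0, 5.1, 6.0):
        # The modulus stays bounded by 1/gamma, reached exactly at nu = nu_rq.
        print(nu, abs(resonance_factor(nu, nu_rq, gamma)))
```

Without damping (`gamma = 0`) the factor would diverge at `nu == nu_rq`; with it the modulus is $1/\sqrt{(\nu_{rq}-\nu)^2 + \Gamma^2}$, which never exceeds $1/\Gamma$.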
The radiation theory sketched above is inexact in the sense that it does
not take into account adequately the retarded character of the
electromagnetic actions. This has been done approximately by substituting
the difference $t - R/c$ for $t$, where $R$ is the distance of some point (centre,
say) of the atom from the point in question. This approximation does
not hold, however, if the wave-length of the emitted or scattered light
$\lambda = c/\nu$ is of the same order of magnitude as or smaller than the linear
extension of the atom. The electromagnetic field generated by the
latter can be determined in this case by the classical expressions for
the scalar and the vector potential
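\[ \varphi(\mathbf{r},t) = e\int \frac{\rho(\mathbf{r}',\,t - R/c)}{R}\,dV', \qquad
\mathbf{A}(\mathbf{r},t) = \frac{e}{c}\int \frac{\mathbf{j}(\mathbf{r}',\,t - R/c)}{R}\,dV', \tag{446} \]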
where $R = |\mathbf{r} - \mathbf{r}'|$ is the distance of the point considered from some
point $\mathbf{r}'$ in the volume-element $dV'$ of the electron-cloud. Here $e$ denotes
the charge of the electron, while
\[ \rho = \psi^\dagger\psi \tag{446 a} \]
is the density of the cloud and $\mathbf{j}$ the corresponding current density.‡
According to Schrödinger’s theory, the latter is given by
j = (446 b)
m
where $\mathbf{u} = \frac{h}{2\pi i}\nabla - \frac{e}{c}\mathbf{A}$ is the operator of proper momentum, whereas
according to Dirac’s theory
\[ \mathbf{j} = c\,\psi^\dagger\boldsymbol{\gamma}\psi, \tag{446 c} \]
$c\boldsymbol{\gamma}$ being the velocity matrix and $\psi$ the operator corresponding to
Dirac’s wave function. Substituting for the latter the expression (440),
where $x$ is an abbreviation for the geometrical coordinates and the
spin-coordinate, we get
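\[ \rho = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^\dagger(x)\,\psi_s(x)\,e^{i2\pi\nu_{rs}t}, \]
\[ \mathbf{j} = c\sum_r\sum_s a_r^\dagger a_s\,\psi_r^\dagger(x)\,\boldsymbol{\gamma}\,\psi_s(x)\,e^{i2\pi\nu_{rs}t}. \]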
Before substituting these expressions in (446) we must replace $x$ by $x'$
(coordinates of the point $\mathbf{r}'$) and $t$ by $t' = t - R/c$. Now so long as $R$ is
very large compared with the atomic dimensions we can put
\[ R = R_0 - \mathbf{n}\cdot\mathbf{r}', \]
where $R_0$ is the distance of the point $\mathbf{r}$ from the centre (nucleus) of the
‡ More exactly, the operators whose characteristic values are the probable values of
the respective densities.
atom and $\mathbf{n} = \mathbf{R}_0/R_0$ the unit vector pointing in the corresponding
direction. We thus have
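\[ \rho(\mathbf{r}',t') = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^\dagger(x')\,\psi_s(x')\,e^{i2\pi\nu_{rs}(t - R_0/c + \mathbf{n}\cdot\mathbf{r}'/c)} \]
and a similar expression for $\mathbf{j}(\mathbf{r}',t')$.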
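Replacing $R$ in the denominator of the integrands in (446) by $R_0$,
which is permissible so long as $R_0$ is supposed to be sufficiently large,
we obtain the following expressions for the electromagnetic potentials,
\[ \varphi(\mathbf{r},t) = \frac{e}{R_0}\sum_r\sum_s a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t - R_0/c)}\,f_{rs}, \qquad
\mathbf{A}(\mathbf{r},t) = \frac{e}{R_0}\sum_r\sum_s a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t - R_0/c)}\,\mathbf{g}_{rs}, \tag{447} \]
where
\[ f_{rs} = \int \psi_r^\dagger\psi_s\,e^{i2\pi\nu_{rs}\mathbf{n}\cdot\mathbf{r}'/c}\,dV', \qquad
\mathbf{g}_{rs} = \int \psi_r^\dagger\boldsymbol{\gamma}\psi_s\,e^{i2\pi\nu_{rs}\mathbf{n}\cdot\mathbf{r}'/c}\,dV'. \tag{447 a} \]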
The electric and magnetic field strengths can be calculated from (447) with
the help of the classical equations
\[ \mathbf{E} = -\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}, \tag{448} \]
which give (if $R_0$ in the denominator of (447) is treated as a constant)
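\[ \mathbf{E} = \frac{i2\pi e}{cR_0}\sum_r\sum_s \nu_{rs}\,a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t - R_0/c)}\,(\mathbf{n}f_{rs} - \mathbf{g}_{rs}), \]
\[ \mathbf{H} = -\frac{i2\pi e}{cR_0}\sum_r\sum_s \nu_{rs}\,a_r^\dagger a_s\,e^{i2\pi\nu_{rs}(t - R_0/c)}\,(\mathbf{n}\times\mathbf{g}_{rs}). \tag{448 a} \]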
These expressions are easily seen to satisfy the relations
\[ \mathbf{H} = \mathbf{n}\times\mathbf{E}, \qquad \mathbf{E} = -\mathbf{n}\times\mathbf{H} \]
characteristic of the classical radiation field. Indeed the only
non-classical feature of the preceding equations besides the quantum
frequencies $\nu_{rs}$ (which appear just as well in the old Schrödinger theory)
is the non-commutative character of the coefficients $a_r$. This feature
becomes manifest, however, only when we pass to the calculation of
the electromagnetic energy.
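The two relations just quoted can be checked numerically for a single term of the sums. The sketch below is only an illustration under stated assumptions: apart from a common scalar factor the electric amplitude is taken as $\mathbf{n}f - \mathbf{g}$ and the magnetic one as $-\mathbf{n}\times\mathbf{g}$, and the scalar $f$ is set equal to $\mathbf{n}\cdot\mathbf{g}$, which is the condition that makes the electric amplitude transverse. All function names are hypothetical.

```python
# Illustrative sketch only: transversality of one radiation-field term.
# Assumptions: E ~ n*f - g, H ~ -(n x g), with f = n.g (transversality).


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def radiation_amplitudes(n, g):
    """Return (E, H) up to a common factor for unit vector n and vector g."""
    f = dot(n, g)                                    # assumed relation f = n.g
    e_amp = tuple(ni * f - gi for ni, gi in zip(n, g))
    h_amp = tuple(-c for c in cross(n, g))
    return e_amp, h_amp


if __name__ == "__main__":
    n = (0.0, 0.0, 1.0)          # direction of observation (unit vector)
    g = (1.0, 2.0, 3.0)          # arbitrary vector amplitude
    e_amp, h_amp = radiation_amplitudes(n, g)
    print("E =", e_amp, " n x E =", cross(n, e_amp))
    print("H =", h_amp, "-n x H =", tuple(-c for c in cross(n, h_amp)))
```

With `f = dot(n, g)` the component of `g` along `n` drops out of `E`, so that `E`, `H`, and `n` form a mutually perpendicular set, as in a classical radiation field.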
51. Connexion between Quantized Mechanical (Electron) Waves
and Electromagnetic Waves
As we have already pointed out, in order to obtain a correct expression
for the energy (as well as for the other quadratic quantities) we
must split up the linear parameters of the electromagnetic field $\varphi$, $\mathbf{A}$,
$\mathbf{E}$, $\mathbf{H}$ into two parts: $\varphi^-$, $\mathbf{A}^-$, $\mathbf{E}^-$, $\mathbf{H}^-$ and $\varphi^+$, $\mathbf{A}^+$, $\mathbf{E}^+$, $\mathbf{H}^+$, corresponding
to terms with negative and positive frequencies respectively. The
energy density is then represented by the operator
\[ \eta = \frac{1}{8\pi}\,(\mathbf{E}^+\cdot\mathbf{E}^- + \mathbf{H}^+\cdot\mathbf{H}^-). \tag{449} \]
In a similar way the energy stream (Poynting’s vector) is represented
by the operator
\[ \mathbf{K} = \frac{c}{8\pi}\,(\mathbf{E}^+\times\mathbf{H}^- - \mathbf{H}^+\times\mathbf{E}^-). \tag{449 a} \]
The negative and positive frequency terms of $\varphi$, etc., should not be
identified with the operators $\varphi$ and $\varphi^\dagger$ which have been introduced in
§ 49 with the help of the operators $b$, $b^\dagger$ of the Einstein-Bose statistics.
In fact the electromagnetic waves we are now considering are not plane
waves but spreading spherical waves, with amplitudes which vary as
the reciprocal distance from the emitting atom and decrease
exponentially with the time, the vibration $(r, s)$ being in fact damped
according to the law $e^{-2\pi(\Gamma_r+\Gamma_s)t}$. These damped spherical waves are,
moreover, quantized in a way different from the plane waves considered
before, namely, through the operators $a_r^\dagger a_s$ and $a_s^\dagger a_r$ instead of the
operators $b^\dagger$ and $b$ of the previous theory.
It is interesting, however, to note that the operators of these two
types are to some extent very similar. If $r > s$ (i.e. $W_r > W_s$), then
$a_r^\dagger a_s$ obviously corresponds to $b_{rs}^\dagger$ and $a_s^\dagger a_r$ to $b_{rs}$ (in the sense that the
former relate to harmonic terms with positive frequencies and the
latter to terms with negative frequencies). Putting accordingly
\[ a_r^\dagger a_s = b_{rs}^+ \quad\text{and}\quad a_s^\dagger a_r = b_{rs}^-, \]
we get
\[ b_{rs}^- b_{rs}^+ = a_s^\dagger a_r a_r^\dagger a_s = a_s^\dagger a_s a_r a_r^\dagger = n_s(1 - n_r), \]
\[ b_{rs}^+ b_{rs}^- = a_r^\dagger a_s a_s^\dagger a_r = n_r(1 - n_s), \]
and consequently
\[ b_{rs}^- b_{rs}^+ - b_{rs}^+ b_{rs}^- = n_s - n_r. \]
In the case of an emission due to the transition $r \to s$ the characteristic
values of $n_r$ and $n_s$ after the transition are $n_r = 0$ and $n_s = 1$, so that
the preceding expression reduces to 1, just like $b_\alpha b_\alpha^\dagger - b_\alpha^\dagger b_\alpha$. In a similar
way it can be shown that the operators $a_r^\dagger a_s = b_{rs}^+$ and $a_{s'}^\dagger a_{r'} = b_{r's'}^-$
commute with each other unless $r' = r$ or $s' = s$ (if $r' = r$, then
$b_{rs'}^- b_{rs}^+ - b_{rs}^+ b_{rs'}^- = a_{s'}^\dagger a_s$), while $b_{rs}^+$ always commutes with $b_{r's'}^+$, and $b_{rs}^-$
with $b_{r's'}^-$.
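The relation $b^-_{rs}b^+_{rs} - b^+_{rs}b^-_{rs} = n_s - n_r$ can be verified with explicit matrices. The sketch below is an illustration, not the author's method: it represents two electron states $r$ and $s$ by 4×4 matrices built in the Jordan-Wigner manner (an assumption of this example), using plain Python lists; all helper names are hypothetical.

```python
# Illustrative sketch only: pair operators built from two fermion amplitudes.
# Basis |n_r n_s>, index = 2*n_r + n_s. Jordan-Wigner matrices are assumed.


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def kron(A, B):
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


LOWER = [[0, 1], [0, 0]]   # annihilation operator for a single state
UNIT = [[1, 0], [0, 1]]
SIGN = [[1, 0], [0, -1]]   # sign string that enforces anticommutation

a_r = kron(LOWER, UNIT)
a_s = kron(SIGN, LOWER)
a_r_dag = transpose(a_r)   # the matrices are real, so adjoint = transpose
a_s_dag = transpose(a_s)

n_r = matmul(a_r_dag, a_r)           # occupation number of state r
n_s = matmul(a_s_dag, a_s)           # occupation number of state s

b_plus = matmul(a_r_dag, a_s)        # a_r^dagger a_s
b_minus = matmul(a_s_dag, a_r)       # a_s^dagger a_r

commutator = sub(matmul(b_minus, b_plus), matmul(b_plus, b_minus))

if __name__ == "__main__":
    print("b- b+ - b+ b- =", commutator)
    print("n_s - n_r     =", sub(n_s, n_r))
```

In the state with $n_r = 0$, $n_s = 1$ (index 1 of the basis) the diagonal element of the commutator is 1, which is the Bose-like value mentioned in the text; in the other states it is 0 or −1, which shows how restricted the analogy is.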
These results seem to indicate that it is neither necessary nor possible
to build up a theory of quantized electromagnetic waves in empty space
on the basis of the very restricted analogy between these waves and
the quantized waves representing the motion of ordinary particles which
conform to the statistics of Einstein-Bose. The true relationship
between the electromagnetic waves and the quantized electron waves in
three-dimensional space is probably much more adequately represented
by the fact that the amplitudes of the former are quadratic in the
amplitudes of the latter, the ‘symmetrical’ operators $b$ being thus
replaced by quadratic combinations of the ‘antisymmetrical’ operators $a$.
The theory of quantized electromagnetic waves developed in § 49
must therefore be regarded as a convenient though artificial method
for dealing with radiation problems involving ‘spontaneous’ transitions,
rather than the true picture of a physical reality. As a matter
of fact, this method implies that the radiation emitted by an atom
which is situated in a rectangular enclosure with reflecting walls
is converted into plane standing waves, which represent the normal
modes of electromagnetic vibrations consistent with the corresponding
boundary conditions. Under such circumstances it is not necessary
to consider the damped spherical electromagnetic waves which
are emitted during the transition of the atom from one state to
another, this transition along with the resulting change in the radiation
field being described as a transition of the complete system: atom +
radiation (in the form of normal vibrations) from one stationary state
to another. It should be noted that this is exactly the same type of
description as that used in the perturbation theory of ordinary
transitions not involving any radiation effects: the transition is not
investigated as a process with a definite course in time, it being simply
assumed that this process brings the system from one unperturbed state to
another.
If we wished to consider the ‘spontaneous’ transition of the atom
from a higher to a lower state as the result of its own radiation field,
described by spherical waves, we should use a more complicated
perturbation method, involving damped vibrations, the transition
appearing not as an instantaneous jump with a certain probability per unit
time, but as a continuous process starting at $t = 0$ and ending at $t = \infty$,
with an effective duration of the order of $1/\Gamma$.
It should be mentioned further that from this point of view (which
seems to be the really correct one) the electromagnetic radiation ought
to be considered always in conjunction with the matter by which it
is emitted, absorbed, or scattered. In fact the radiation enclosed in
an empty vessel with perfectly reflecting walls and considered as an
independent dynamical system is merely a fiction, since its reflection
by the walls is actually due to the absorption and re-emission, or to
the scattering, by the atoms constituting these walls. The absorption
of radiation which, according to the method of quantized
electromagnetic waves in an enclosure, is simply a transition of the absorbing
atom from a state of lower energy to that of a higher energy with the
accompanying decrease of the energy of the corresponding
electromagnetic wave system by just one quantum, must be considered as the
result of the superposition on the primary radiation, causing the
transition, of the secondary radiation emitted by the atom. This is the
picture of the absorption process which is given by classical
electrodynamics, and it must remain fundamentally unchanged in a consistent
quantum theory, where actual processes must only be replaced by
probable ones.
The current idea that the emission of radiation can be due only to
a transition of the atom from a higher to a lower state is fundamentally
wrong; the converse transition is just as well accompanied by emission
of radiation, which, however, cuts down the primary radiation causing
the transition, and is therefore manifested as the decrease (i.e.
absorption) of the latter.
In the preceding discussion of the connexion between the quantized
mechanical (electron) waves and the electromagnetic waves, the former
were dealt with as the cause of the latter. This relation can, however,
be reversed in the sense that the motion of the electrons is influenced
by electromagnetic waves of external origin. This influence has been
actually examined already by the method of the perturbation theory
in the preceding section (in connexion with the scattering) and especially
in § 49. It remains to be seen whether the two types of quantization,
assumed for the two kinds of waves, are consistent with each other in
this respect.
The expressions obtained in § 50 by the perturbation theory for the
amplitudes $a_r$ which were supposed to have initially a characteristic
value zero, must obviously satisfy the general commutation relations
$a_r^\dagger a_s + a_s a_r^\dagger = \delta_{rs}$, etc. Assuming for the sake of simplicity that all the
coefficients $a_q^0$ but one vanish, we get, preserving the order of all the
non-trivial factors involved,
a \a t 1 f W -aa“W 6a° + -aa °W a °
4A2 l(*'r4—^ (% +*)
+ a number of harmonically oscillating terms which we shall leave
aside, since their average value vanishes.
Now the products $b^\dagger b$ and $bb^\dagger$, whether $b$ and $b^\dagger$ are defined as the
amplitude-operators of the Bose-Einstein statistics or as the products
of the type $a_p^\dagger a_s$ and $a_s^\dagger a_p$ (with suitably chosen values of $p$ and $s$)
commute with $a_q^0$ and $a_q^{0\dagger}$. We thus get
aotao j L 'p pr__
q V4/*2' ^ [ K - v )2+ r(^TQ+V)
and in a similar way
6*6
arar = al aV
4«
^ ,2 , ,
l> r ,— W2 ( ^ ,+
,
W2}
We see from these equations that the relation $a_r^\dagger a_r + a_r a_r^\dagger = 1$ will
follow from the relation $a_q^{0\dagger}a_q^0 + a_q^0 a_q^{0\dagger} = 1$ only if it is assumed that
$bb^\dagger = b^\dagger b$, that is, if $b$ and $b^\dagger$ are treated as ordinary (commutable)
numbers. As to the relations $a_r^\dagger a_s + a_s a_r^\dagger = 0$, etc., they are easily seen
to hold (if $r \neq s$); in fact, so long as oscillating terms are dropped, we
get separately $a_r^\dagger a_s = a_s a_r^\dagger = 0$.
52. The Quantum Electrodynamics of Heisenberg, Pauli, and
Dirac
The absence of complete harmony between the mechanical and the
electromagnetic waves from the point of view of their quantization is a
very unsatisfactory feature of the preceding theory. It can be shown,
however, to be due, at least to some extent, to the approximate form in
which this theory has been developed hitherto. We shall now briefly
consider its more exact formulation due to Heisenberg and Pauli. This
formulation is at the same time a generalization, which treats the
radiation field as but a special case of the electromagnetic field,
produced by matter and acting upon it, and includes ordinary electric and
magnetic forces, treating them in the same way as radiation effects.
The theory of Heisenberg and Pauli can be condensed into the
following equations:
1. The equation of motion
+ e<£+ c y A + y0m0c2]i/r = 0, (450)
where $\psi$ is Dirac’s one-column matrix with the four components
$\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$.
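2. The equations of the electromagnetic field
\[ \Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\varphi = -4\pi e\,\psi^\dagger\psi, \qquad
\Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\mathbf{A} = -4\pi e\,\psi^\dagger\boldsymbol{\gamma}\psi \tag{451} \]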
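with the usual relations
\[ \mathbf{E} = -\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A} \tag{451 a} \]
between the potentials $\varphi$, $\mathbf{A}$ and the field strengths $\mathbf{E}$, $\mathbf{H}$.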
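3. The commutability equations expressing the quantization of the
mechanical field according to the Pauli-Fermi statistics:
\[ \begin{aligned}
\psi^\dagger(x)\psi(x') + \psi(x')\psi^\dagger(x) &= \delta(x - x'),\\
\psi(x)\psi(x') + \psi(x')\psi(x) &= 0,\\
\psi^\dagger(x)\psi^\dagger(x') + \psi^\dagger(x')\psi^\dagger(x) &= 0.
\end{aligned} \tag{452} \]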
4. The commutability equations for the electromagnetic field in
empty space (i.e. in the absence of matter, see below):
hr
Ek(x)Atx')-AAx')E k(x) = (453)
Ztt I
(453 a)
Ek(x)El( x ' ) - E t(x')Ek(x) = 0 I
The equations (450) and (451) along with the quantum conditions
(452) can be considered as a generalization of the equations (410) and
(410 b) which have been established in § 46 as the exact equivalent of the
Schrödinger theory of a system of electrons described by unquantized
$\psi$-waves in the configuration space, and acting on each other according
to Coulomb’s law. This generalization consists in the introduction of
the finite velocity of propagation $c$ of electromagnetic actions, both in
an indirect way, by substituting the relativistic equation of motion
(450) for the non-relativistic one (410), and in a direct way, by
substituting the equations (451) expressing the law of the retarded action
for the Poisson equation (410 b).
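The differential equations (451) can be replaced by the explicit
expressions for the ‘retarded’ potentials
\[ \varphi(\mathbf{r},t) = e\int \frac{\psi^\dagger(\mathbf{r}',t')\,\psi(\mathbf{r}',t')}{|\mathbf{r}-\mathbf{r}'|}\,dV' + \varphi^0(\mathbf{r},t), \]
\[ \mathbf{A}(\mathbf{r},t) = e\int \frac{\psi^\dagger(\mathbf{r}',t')\,\boldsymbol{\gamma}\,\psi(\mathbf{r}',t')}{|\mathbf{r}-\mathbf{r}'|}\,dV' + \mathbf{A}^0(\mathbf{r},t), \tag{454} \]
where $t' = t - |\mathbf{r}-\mathbf{r}'|/c$; $\varphi^0$ and $\mathbf{A}^0$ are arbitrary solutions of the
homogeneous d’Alembert equations
\[ \nabla^2\varphi^0 - \frac{1}{c^2}\frac{\partial^2\varphi^0}{\partial t^2} = 0, \qquad
\nabla^2\mathbf{A}^0 - \frac{1}{c^2}\frac{\partial^2\mathbf{A}^0}{\partial t^2} = 0, \tag{454 a} \]
satisfying the relation
\[ \operatorname{div}\mathbf{A}^0 + \frac{1}{c}\frac{\partial\varphi^0}{\partial t} = 0. \tag{454 b} \]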
If we put $\varphi^0 = 0$, $\mathbf{A}^0 = 0$, that is, confine ourselves to the retarded
potentials produced by the motion of the electrons which is described
by the operator-function $\psi$, the action of an electron on itself which
may seem to follow from these equations is actually eliminated
automatically owing to the commutation relations (452). The equations
(452), (450), and (454) (with $\varphi^0 = 0$, $\mathbf{A}^0 = 0$) must thus give the adequate
description of the mutual action of the electrons allowing for the
relativity and retardation effects.
The weak point of the Heisenberg-Pauli theory consists, as it seems,
in the introduction of additional quantization rules for the
electromagnetic field expressed by the equations (453). These equations do
not follow from the equations (451) in conjunction with (452), but are
postulated on the basis of the analogy between the light waves and the
mechanical waves which describe the motion of particles conforming to
the Einstein-Bose statistics. In order to obtain the commutability
relations for the electromagnetic field, Heisenberg and Pauli (following
an earlier paper by Pauli and Jordan) actually come back to the old
mechanical theory of light, considered as vibrations of an elastic ether,
and give the quantum-mechanical theory of these vibrations, based on
the classical wave equations (454 a). It is indeed possible to write down
the latter in a form corresponding to the ordinary Hamiltonian
equations of motion of a system of material points for the limiting case when
these points constitute a continuous medium. Replacing the classical
Hamiltonian equations of the motion of such a continuous medium by
the corresponding matrix or wave-mechanical equations, one obtains
the equations for the quantized elastic or electromagnetic waves. The
photons corresponding to these waves are thus introduced in exactly
the same way as the phonons, corresponding to ordinary sound waves
(Part I). The energy of electromagnetic (or ‘elastic’) oscillations of a
given frequency $\nu$ is thus quantized according to the usual formula
$(n + \tfrac{1}{2})h\nu$ for the ordinary harmonic oscillator. In order to get rid of the
$\tfrac{1}{2}$ it is necessary to modify the definition of the energy in the way
shown in § 49 and § 50.
It should be remembered that the above theory refers to the ‘free
ether’, i.e. to empty space, without electric charges. This corresponds
to the electromagnetic field which has been denoted above as $\varphi^0$, $\mathbf{A}^0$.
Now such a field can be described, as is well known, without loss of
generality by putting $\varphi^0 = 0$. Treating the components of the vector $\mathbf{A}^0$
as the coordinates of the particles of an elastic ether described by the
Lagrangian function $L = \frac{1}{8\pi}\int (E^2 - H^2)\,dV$, one can define the electric
field $\mathbf{E}^0 = -\frac{1}{c}\frac{\partial\mathbf{A}^0}{\partial t}$ as the quantity corresponding to the mechanical
momentum of these particles. Hence we obtain the commutation
relations (453), (453 a) which are merely the ordinary commutation
relations
\[ p_{kn}q_{ln'} - q_{ln'}p_{kn} = \frac{h}{2\pi i}\,\delta_{kl}\,\delta_{nn'} \qquad (k, l = 1, 2, 3), \]
etc., for a system of particles $1, 2, \ldots, n, \ldots, n', \ldots$ in the limiting case when
these particles form a continuum.
It should be mentioned that this field can be represented as a
superposition of plane harmonic waves, as has already been done in § 49. The
commutation relations (453), (453 a) can be replaced accordingly by the
relations
A'm(k)An( k ' ) - A n(k')A'm(k) = - ^ 8mn8 ( k - k ') , (455)
to which we must add the relation
ch
<t>\k)<f>(k')-<f>(k')<t>'(k) = + ^ 8(k - k ') , (455a)
all other combinations being mutually commutable. These relations
can be derived directly from the relations of § 49 for the operators
$b^\dagger$, $b$ representing the amplitudes of the harmonic terms with positive
and negative frequencies respectively for the limiting case of an
enclosure with an infinite volume.
In order to preserve the above commutation relations for the
electromagnetic field in the presence of electric charges (electrons) it is
necessary to modify Maxwell’s equations by the addition of small terms
proportional to the expression $P_4 = \operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t}$ or to its derivatives,
replacing the condition (454 b) by the additional commutation relation
< p4( * w * ')-^ ( * X (* )] = £ s(r ~ r /)> (456b)
where $\epsilon$ is the above-mentioned proportionality coefficient which in the
final result is set equal to zero.
It has been recently shown by Dirac that it is possible to give a
somewhat different (relativistically invariant) formulation of the
Heisenberg-Pauli theory for a system consisting of a given number of
electrons or indeed of electrified particles of any kind. In Dirac’s theory
the particles are described by the method of the configuration space,
and their mutual action is defined implicitly through their coupling
with the quantized electromagnetic field in empty space in conjunction
with a certain restrictive condition imposed on the wave function.
Let $\psi(x_1, t_1; x_2, t_2; \ldots; x_N, t_N)$ be the wave function of the particles
(electrons) each considered with its own individual time, and let further
$\varphi(x,t)$, $\mathbf{A}(x,t)$ be the potentials of the quantized electromagnetic field,
satisfying the equations (454 a), (454 b) and the commutation conditions
(455). Dirac’s equations can then be written as follows:
where
{u ‘ + £ i i ) * = °- <4561
Hk = vi - 4 )j + y 0A:m0c2 (456 a)
is the Hamiltonian for the $k$th particle.
The function $\psi$ must be actually treated as a matrix with respect to
the stationary states of the field taken alone. These states correspond
to the different plane harmonic waves specified by the wave-number
vector k and the polarization quantum number £. Associating these
with photons, we can regard the above treatment as a particular case
of the general method of treating incomplete systems, explained in
Chap. VII, § 39, the ignored part (B) of the complete system being the
‘photon gas’.
It could be argued that it must be possible in this way to give an
adequate description of the mutual action between the particles,
inasmuch as their mutual action with the photons [ignored in the equations
(454 a)] is represented by the energy operators
\[ M_k = e_k\bigl[\varphi(x_k, t_k) - \boldsymbol{\gamma}_k\cdot\mathbf{A}(x_k, t_k)\bigr] \tag{456 b} \]
(the operator $c\boldsymbol{\gamma}_k\cdot\mathbf{p}_k + \gamma_{0k}m_{0k}c^2$ corresponding to the energy of the $k$th
particle taken alone).
This is, however, not so, for the relation between matter and field is
expressed not only by this operator $M$, describing the effect of the
latter on the former, but also by the terms $e\psi^\dagger\psi$ and $e\psi^\dagger\boldsymbol{\gamma}\psi$ on the right
side of the equations (451) which describe the effect of the matter on the
field. It is obviously impossible to get rid of this side of their mutual
relationship, and it must be introduced somehow, explicitly or implicitly,
into the preceding theory in order to transform it into a theory not only
of the motion but also of the mutual action between all the particles
concerned. This is done by Dirac in the following manner:
Let us come back to the complete system: electrons + photons
(electromagnetic field), and let us consider $\psi$ as a function both of the
$x_k$, $t_k$ of the former, and of the $x$, $t$ of the latter, it being understood
that the system is doubly quantized with respect to the photons [which
corresponds to the commutation relations (455)]. The equations (454 a),
(454 b) will then be rewritten in the form
i 0, (457)
c2 dt2
\[ \Bigl[\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t}\Bigr]\psi = 0, \tag{457 a} \]
$\varphi$ and $\mathbf{A}$ being defined as certain operators acting on $\psi$. The latter
equation can be considered as a constraint to which the function $\psi$ is subject.
Now in order to describe the influence of matter on the electromagnetic
field this equation must be replaced by the following generalized
equation:
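\[ \Bigl[\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t}\Bigr]\psi = \sum_{s=1}^{N} e_s\,\Delta(X - X_s)\,\psi. \tag{458} \]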
Here $X_s = (x_s, y_s, z_s, t_s)$ and $\Delta(X)$ is the so-called ‘invariant delta-function’
(introduced by Jordan and Pauli)
\[ \Delta(X) = \frac{1}{r}\bigl[\delta(r + ct) - \delta(r - ct)\bigr] \tag{458 a} \]
(it represents a spherical wave concentrated in an infinitely thin layer
and travelling with the velocity of light from infinity so as to converge
at the point $r = 0$ at $t = 0$ and then diverging again to infinity). Using
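the relations $\mathbf{E} = -\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$, $\mathbf{H} = \operatorname{curl}\mathbf{A}$, one obtains accordingly,
besides the equations
\[ \operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} = 0, \qquad \operatorname{div}\mathbf{H} = 0, \]
which can be considered as identities, the equations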
N (458 b)
(div E)iji = £ « * A (X -X .)U
^ 1 *■
Let us now put $t_1 = t_2 = \ldots = t_N = t = T$, i.e. introduce a common
time for all the particles and for the field, and denote the corresponding
complete derivative for any quantity $f$ by $df/dT$, so that
\[ \frac{d}{dT} = \Bigl[\frac{\partial}{\partial t} + \sum_{k=1}^{N}\frac{\partial}{\partial t_k}\Bigr]_{t_k = t = T}. \]
Then remembering the relations
dA dA d<f> d<f>
~dt = dT* ~dt= d f' ai l- w ./] .
and with the help of the formula
we easily get, along with the trivial expressions
E = H = curl A,
* the equations (459)
d iv A + ^ = 0>
and
|c u r l H - i —Jv = 4jr[fcf e*r*S(r-r*)]^
(459 a)
(divE )</> = 4 t t J ek8(r ~r k)
k—l '
which are equivalent to
(459 b)
and (V2A - I = - [4»J; e*Yfc8(r-r* )]* .
In the limit $c = \infty$ these equations, together with the equations of
motion (456), reduce to the ordinary Schrödinger equation for the
$N$ particles in the configuration space with the mutual potential energy
$U = \sum_{k<k'} e_k e_{k'}/r_{kk'}$ corresponding to the Coulomb forces.
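The Coulomb limit can be made concrete with a short sketch that evaluates the mutual potential energy as a sum over distinct pairs of particles. It is an illustration only: Gaussian-type units with unit coupling constant are assumed, and the name `coulomb_energy` is hypothetical.

```python
# Illustrative sketch only: mutual Coulomb energy of N point charges,
# U = sum over pairs k < k' of e_k * e_k' / r_kk'  (unit coupling assumed).
import math


def coulomb_energy(charges, positions):
    """Sum e_k * e_k' / r_kk' over all pairs with k < k'.

    charges   : sequence of N charges
    positions : sequence of N points, each a tuple (x, y, z)
    """
    total = 0.0
    n = len(charges)
    for k in range(n):
        for kp in range(k + 1, n):       # each pair is counted once
            r = math.dist(positions[k], positions[kp])
            total += charges[k] * charges[kp] / r
    return total


if __name__ == "__main__":
    # Two like charges at distance 2, and one opposite charge off the axis.
    charges = [1.0, 1.0, -1.0]
    positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    print(coulomb_energy(charges, positions))
```

Restricting the sum to `kp > k` plays the part of the condition $k < k'$ and excludes the self-action terms $k = k'$.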
53. Breit’s Formula. Concluding Remarks
The theories of Heisenberg and Pauli and of Dirac have been
hitherto in practice rather fruitless, that is, they have not led to any
marked progress in the theory of the interaction of electrons. The only
improvement over the simple interaction theory based on Coulomb’s
law is represented by a formula originally derived by Breit from the
general equations of Heisenberg and Pauli’s theory. Breit’s results
amount to the following approximate expression for the mutual energy
of two electrons
\[ W = \frac{e^2}{r} - \frac{e^2}{2}\Bigl[\frac{\boldsymbol{\gamma}^{\mathrm{I}}\cdot\boldsymbol{\gamma}^{\mathrm{II}}}{r} + \frac{(\boldsymbol{\gamma}^{\mathrm{I}}\cdot\mathbf{r})(\boldsymbol{\gamma}^{\mathrm{II}}\cdot\mathbf{r})}{r^3}\Bigr], \tag{460} \]
where $c\boldsymbol{\gamma}^{\mathrm{I}}$ and $c\boldsymbol{\gamma}^{\mathrm{II}}$ are the respective velocity matrices of Dirac’s theory.
This expression takes account of the electromagnetic (spin-orbit) and
magnetic (spin-spin) interaction and also to some extent of the
retardation effects. It can be derived in a much simpler way without any use
of the Heisenberg-Pauli-Dirac electrodynamics. The simplest and most
straightforward of these derivations is the following one due to K.
Nikolsky.†
The energy of an electron in an external electromagnetic field
specified by the potentials $\varphi$, $\mathbf{A}$ is given by the formula
\[ W = e\varphi - e\,\boldsymbol{\gamma}\cdot\mathbf{A}. \tag{461} \]
Let us imagine that these potentials are due to the retarded action of
a second electron moving classically. Their values at a given point and
instant $\tau$ can be expanded in Lagrange’s series in the form‡
r Z, ll Ic ^ dr* 7
(462)
Z* filcJ1 dr ^\ c]
where $\mathbf{v}$ is the velocity of the electron producing the field and $r$ its
distance from the other electron at the instant $\tau$. It is natural to think
that the quantum theory of the interaction can be obtained from the
classical one by replacing the classical time derivatives $d\varphi/d\tau$ by the
quantum Poisson bracket expression $[H, \varphi] = (2\pi i/h)(H\varphi - \varphi H)$, where
$H$ is the Hamiltonian of the system formed by the two electrons without
the interaction term $W$. The velocity vector $\mathbf{v}$ must naturally also be
replaced by the matrix vector $c\boldsymbol{\gamma}$. We thus get
(463)
where $C_n^\nu = \dfrac{n!}{\nu!\,(n-\nu)!}$.
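Here $\tau$ corresponds to the common time $T$ of the whole system, that
is, of the two electrons (the electromagnetic field being no longer
considered as a dynamical system and playing an auxiliary role only). It
is natural to define the corresponding energy $H$ as the sum
\[ H = H^{\mathrm{I}} + H^{\mathrm{II}}, \tag{464} \]
where $H^{\mathrm{I}}$ and $H^{\mathrm{II}}$ are the Hamiltonians of the two electrons taken
separately, i.e.
\[ H^{\mathrm{I}} = c\,\boldsymbol{\gamma}^{\mathrm{I}}\cdot\mathbf{p}^{\mathrm{I}} + \gamma_0^{\mathrm{I}} m_0 c^2, \qquad
H^{\mathrm{II}} = c\,\boldsymbol{\gamma}^{\mathrm{II}}\cdot\mathbf{p}^{\mathrm{II}} + \gamma_0^{\mathrm{II}} m_0 c^2. \tag{464 a} \]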
† Not yet published. The other derivations are due to Møller, Rosenfeld, and Scherzer.
‡ Cf. my Lehrbuch der Elektrodynamik, i, p. 184.
It must not be supposed that this expression for $H$ omits completely
the mutual action of the electrons. In fact the operators $\mathbf{p}^{\mathrm{I}}$ and $\mathbf{p}^{\mathrm{II}}$
must be considered as representing the total momentum of the respective
electrons, including the ‘potential momentum’ due to its partner, i.e.
«i = ___ !?o___ v 1-I- — v 11 v11= ____ m° ___ v 114- — v1
P VO- ^ 1lc)2} ’ P J { i - ( v”/c)*}’
(464 b)
in agreement with the approximate theory of § 38. This will become
more apparent when we compare Breit’s formula with the result of the
above theory. With the help of (461), (462), and (463) we get:
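$$W = W^{I,II} = e^{2}\sum_{n=0}^{\infty}\sum_{\nu=0}^{n}\frac{1}{n!}\left(-\frac{2\pi i}{hc}\right)^{n}(-1)^{\nu}C_{\nu}^{n}
\left[H^{n-\nu}r^{n-1}H^{\nu} - \boldsymbol{\gamma}^{I}\cdot H^{n-\nu}\boldsymbol{\gamma}^{II}\,r^{n-1}H^{\nu}\right], \qquad (465)$$
where
$$H^{\nu} = \sum_{\lambda=0}^{\nu}C_{\lambda}^{\nu}\,(H^{I})^{\lambda}(H^{II})^{\nu-\lambda}. \qquad (465\,a)$$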
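Dropping terms of the third and higher orders with respect to $v/c$
(i.e. $\boldsymbol{\gamma}$), we have
$$W = e^{2}\left(\frac{1}{r} - \frac{\boldsymbol{\gamma}^{I}\cdot\boldsymbol{\gamma}^{II}}{r}\right) - \frac{2\pi^{2}e^{2}}{h^{2}c^{2}}\left\{rH^{2} - 2HrH + H^{2}r\right\},$$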
whence, according to (465 a),
$$W^{I,II} = e^{2}\,\frac{1-\boldsymbol{\gamma}^{I}\cdot\boldsymbol{\gamma}^{II}}{r} - \frac{2\pi^{2}e^{2}}{h^{2}c^{2}}\left[rH^{I}H^{II} - H^{II}rH^{I} - H^{I}rH^{II} + H^{I}H^{II}r\right]$$
together with terms which are proportional to the square of $H^{I}$ or $H^{II}$,
which we shall neglect as having no physical meaning (they represent
the action of a point-like electron on itself).†
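Now the expression in the brackets [ ] can be put in the form
$$H^{II}(H^{I}r - rH^{I}) - (H^{I}r - rH^{I})H^{II}.$$
Using the formulae (464 a) for $H^{I}$ and $H^{II}$ we get
$$H^{I}r - rH^{I} = \frac{h}{2\pi i}\,\frac{dr}{dt^{I}} = \frac{h}{2\pi i}\,\frac{c\,\boldsymbol{\gamma}^{I}\cdot\mathbf{r}}{r}$$
and
$$H^{II}(H^{I}r - rH^{I}) - (H^{I}r - rH^{I})H^{II} = \left(\frac{ch}{2\pi i}\right)^{2}\left[\frac{\boldsymbol{\gamma}^{I}\cdot\boldsymbol{\gamma}^{II}}{r} - \frac{(\boldsymbol{\gamma}^{I}\cdot\mathbf{r})(\boldsymbol{\gamma}^{II}\cdot\mathbf{r})}{r^{3}}\right],$$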
† These terms are physically irrelevant also for another reason, namely, because the
squares of the matrices $\boldsymbol{\gamma}^{I}$ and $\boldsymbol{\gamma}^{II}$ are equal to 1 (or rather to 3), whereas they must
represent small quantities of the second order with respect to $v^{I}/c$ and $v^{II}/c$. This
difficulty has, however, an origin entirely different from the preceding one, being connected
with the existence of states of negative proper energy.
which leads to Breit's formula for $W^{I,II}$, this expression being actually
symmetrical with regard to the two electrons.
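The rearrangement of the bracket expression used above relies only on the
fact that the Hamiltonians of the two electrons commute with each other.
As a minimal sketch (the matrices below are hypothetical stand-ins, not
the actual Dirac operators), one may build two commuting matrices as
Kronecker products acting on different factors and confirm that the
four-term bracket equals the double commutator form for an arbitrary
matrix standing in for r.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Element-wise sum a + b."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matsub(a, b):
    """Element-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron(a, b):
    """Kronecker product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

IDENTITY = [[1, 0], [0, 1]]
# Stand-ins for H^I and H^II: each acts on its own factor, so they commute.
H1 = kron([[1, 2], [2, -1]], IDENTITY)
H2 = kron(IDENTITY, [[0, 3], [3, 2]])
# Arbitrary stand-in for the operator r, acting on both factors.
R = [[1, 2, 0, 1], [2, 3, 1, 0], [0, 1, -2, 4], [1, 0, 4, 5]]

def bracket_four_terms(h1, h2, r):
    """r H1 H2 - H2 r H1 - H1 r H2 + H1 H2 r."""
    t1 = matmul(matmul(r, h1), h2)
    t2 = matmul(matmul(h2, r), h1)
    t3 = matmul(matmul(h1, r), h2)
    t4 = matmul(matmul(h1, h2), r)
    return matadd(matsub(matsub(t1, t2), t3), t4)

def bracket_double_commutator(h1, h2, r):
    """H2 (H1 r - r H1) - (H1 r - r H1) H2."""
    inner = matsub(matmul(h1, r), matmul(r, h1))
    return matsub(matmul(h2, inner), matmul(inner, h2))

assert matmul(H1, H2) == matmul(H2, H1)
assert bracket_four_terms(H1, H2, R) == bracket_double_commutator(H1, H2, R)
```

If the two stand-in Hamiltonians did not commute, the two forms would
differ by the term (H1 H2 - H2 H1) R, which is why the definition of H as
a sum of independent one-electron operators matters at this step.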
The classical expression for $W$ corresponding to Breit's formula is
$$W = \frac{e^{2}}{r} - \frac{e^{2}}{2c^{2}}\left[\frac{1}{r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II} + \frac{1}{r^{3}}(\mathbf{r}\cdot\mathbf{v}^{I})(\mathbf{r}\cdot\mathbf{v}^{II})\right]. \qquad (466)$$
The second term must obviously represent the effect of the electromagnetic
interaction between the two electrons with due account of the
retardation. Now in the non-relativistic theory of § 38 where this
retardation was left out of account, the electromagnetic interaction was
shown to correspond to a mutual kinetic energy
$$T = \frac{e^{2}}{c^{2}r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}, \qquad (466\,a)$$
which is quite different from the second term of (466).
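This difference is, however, greatly attenuated if we consider the
total energy of the two electrons $H + W = H^{I} + H^{II} + W$, or more exactly
the classical expression which corresponds to it and which is obtained
if $c\boldsymbol{\gamma}$ is replaced by $\mathbf{v}$, $\gamma_{0}$ by $\sqrt{1-(v/c)^{2}}$, and the $\mathbf{p}$'s by the expressions
(464 b). We thus get
$$H = \mathbf{v}^{I}\cdot\left(\frac{m_{0}\mathbf{v}^{I}}{\sqrt{1-(v^{I}/c)^{2}}} + \frac{e^{2}}{c^{2}r}\,\mathbf{v}^{II}\right) + m_{0}c^{2}\sqrt{1-(v^{I}/c)^{2}}
+ \mathbf{v}^{II}\cdot\left(\frac{m_{0}\mathbf{v}^{II}}{\sqrt{1-(v^{II}/c)^{2}}} + \frac{e^{2}}{c^{2}r}\,\mathbf{v}^{I}\right) + m_{0}c^{2}\sqrt{1-(v^{II}/c)^{2}}$$
$$\quad = \frac{m_{0}c^{2}}{\sqrt{1-(v^{I}/c)^{2}}} + \frac{m_{0}c^{2}}{\sqrt{1-(v^{II}/c)^{2}}} + \frac{2e^{2}}{c^{2}r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}$$
and consequently
$$H + W = \frac{m_{0}c^{2}}{\sqrt{1-(v^{I}/c)^{2}}} + \frac{m_{0}c^{2}}{\sqrt{1-(v^{II}/c)^{2}}} + \frac{e^{2}}{r}
+ \frac{e^{2}}{2c^{2}}\left[\frac{3}{r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II} - \frac{1}{r^{3}}(\mathbf{r}\cdot\mathbf{v}^{I})(\mathbf{r}\cdot\mathbf{v}^{II})\right]. \qquad (466\,b)$$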
The first three terms in this expression represent the proper energy of
the two electrons and their mutual potential energy, whereas the last
one gives the energy of the electromagnetic interaction. Although still
somewhat different from (466 a), it is, however, much more similar to it
than the corresponding term of $W$. We obtain a still closer similarity
if we average over all the directions of the vector $\mathbf{r}$, considering them
as equally probable. We thus get
$$\overline{(\mathbf{r}\cdot\mathbf{v}^{I})(\mathbf{r}\cdot\mathbf{v}^{II})} = \tfrac{1}{3}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}\,r^{2},$$
which gives
$$H + W = \frac{m_{0}c^{2}}{\sqrt{1-(v^{I}/c)^{2}}} + \frac{m_{0}c^{2}}{\sqrt{1-(v^{II}/c)^{2}}} + \frac{e^{2}}{r} + \frac{4}{3}\,\frac{e^{2}}{c^{2}r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}. \qquad (466\,c)$$
The factor 4/3 appearing in the last (electromagnetic) term is the same
as that which is met with in the calculation of the electron's mass as
due to the electromagnetic mutual action of the elements of its charge
(supposed to be distributed in a spherically symmetrical way in a finite
volume).
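The directional average used above, and the resulting coefficient 4/3, can
be checked with a few lines of arithmetic. The sketch below is only an
illustration under stated assumptions: the two velocity vectors are
arbitrary sample values, the average over directions is approximated by a
midpoint rule in the polar angles, and the function name is hypothetical.

```python
import math
from fractions import Fraction

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def sphere_average(a, b, n_theta=200, n_phi=400):
    """Approximate the mean of (n.a)(n.b) over all unit vectors n,
    using a midpoint rule weighted by sin(theta)."""
    total = 0.0
    weight = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * math.pi / n_theta
        s, c = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            n = (s * math.cos(phi), s * math.sin(phi), c)
            total += dot(n, a) * dot(n, b) * s
            weight += s
    return total / weight

# Arbitrary sample velocities (assumed values, in any common unit).
V1 = (1.0, 2.0, -1.0)
V2 = (0.5, -1.0, 2.0)

# The average of (r.v1)(r.v2)/r^2 over directions should be (v1.v2)/3.
assert abs(sphere_average(V1, V2) - dot(V1, V2) / 3.0) < 1e-3

# Coefficient of e^2 (v1.v2)/(c^2 r) in the averaged total energy:
# (1/2) * (3 - 1/3), i.e. the bracket of (466 b) after averaging.
COEFFICIENT = Fraction(1, 2) * (3 - Fraction(1, 3))
assert COEFFICIENT == Fraction(4, 3)
```

The same number follows from adding the contribution 2 of the potential
momenta to the averaged Breit term, 2 - (1/2)(1 + 1/3) = 4/3.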
The above derivation of Breit's formula is not free from objection,
especially with regard to the definition (464) of the energy $H$. It could
be slightly modified by adding $W$ to the expression used before (this
would not alter the results to the approximation considered). The
important point is that any symmetrization of the expression for $H$
leads to cancelling terms of odd degree in the products of $\boldsymbol{\gamma}^{I}$ and $\boldsymbol{\gamma}^{II}$. The
same result is obtained if in the derivation of Lagrange's series for the
potentials $\phi$ and $\mathbf{A}$ we replace the retarded potentials by the mean
value of the retarded and the accelerated (advanced) ones. This
symmetrization with respect to the time (which has been actually used for a
similar purpose by Fokker) is equivalent to the symmetrization of the
energy $H$ with respect to the two electrons. This is natural since the
time and the energy are dynamically conjugate quantities.
We thus see incidentally that so long as we are using a symmetrical
energy operator for two electrons, it is impossible to describe that part
of their mutual action which is antisymmetrical in the two particles or
in the time and which corresponds to the dissipation of energy by
radiation.
This reproach may not be applicable to the accurate form of the
Heisenberg-Pauli-Dirac theory. This theory cannot be considered,
however, as a satisfactory system of quantum electrodynamics for
many other reasons. In the first place it is based on a fundamentally
wrong interpretation of the relationship between matter (electrons)
and electromagnetic field (photons) as a formal analogy, the quantum
theory of the electromagnetic field being developed accordingly as a
wave-mechanical theory of the 'ether' in a somewhat disguised form
adjusted to Maxwell's equations.
A second, more important, reason lies in the fact that material particles
are visualized as the primary things in Nature and are dealt with
as unextended points with dynamical properties independent of those of
the electromagnetic field, while the electromagnetic field is treated as
but an auxiliary agent introduced for the description of their mutual
action and serving to determine their motion. It seems, however, more
reasonable to think that the electromagnetic field is the primary and
fundamental thing in Nature, the material particles (electrons and
protons) being derivable from it, and possessing no independent mechanical
properties. This point of view corresponds to the latest development of
the classical electrodynamics, culminating in the electromagnetic theory
of mass. The mechanical momentum and energy (potential and kinetic)
must be interpreted from this point of view as the approximate form
of electromagnetic momentum and energy, directly connected not with
the particles but with the electromagnetic field. The laws of motion
can be derived accordingly from the principle of conservation of
electromagnetic momentum and energy, applied to separate electrons, if the
latter are considered not as points but as extended bodies (spheres) and
if the external force acting on them is supposed to be balanced by the
'inner' force, due to their own motion.
This classical theory which means the complete reduction of mechanics
to electrodynamics has met with one serious difficulty, connected with
the problem of the spatial extension or 'structure' of the electron. It
is responsible for the fact that the electromagnetic theory of mass, or,
in other words, the electromagnetic derivation of mechanics, has
remained without further development until now. The advent of the
quantum theory did not in the least alter the situation, the modern
wave or quantum mechanics being simply a modified form of the old
mechanics of a point-like particle with a given mass.
Now it seems quite certain that this new theory is in principle just
as wrong as the old one, and that the next task in the development of
our theory of the physical universe will consist in the application of the
quantum ideas to the electromagnetic field in such a way as to obtain
the mechanical laws as a corollary from the laws of conservation of
electromagnetic energy and momentum. It is to be hoped that the
main difficulty of the classical theory connected with the problem of
the electron's spatial extension will be eliminated by considering the
electron as the product (and not the source) of the electromagnetic field,
described in a consistent quantum way. One might, for example, define
the electromagnetic field as a matrix from the point of view of the
space-time manifold, i.e. as a matrix with the elements $(x'|F|x'')$, where
$x$ is an abbreviation for $x, y, z, t$, the diagonal elements representing the
probable values of the field at different points $x' = x''$. The electron
could be described accordingly with the help of a function $D(|x' - x''|)$
similar to a Gaussian function, with a finite parameter $a$ playing the
role of the electron's radius, $|x' - x''|$ being the four-dimensional distance
between the points $x'$ and $x''$. We are thus entitled to think that Dirac's
equation of motion will be replaced by an equation containing the
electromagnetic momentum-energy tensor; the mass of the electron,
instead of being introduced a priori as a parameter, being derivable
from the quantum equivalent for its radius. A closer discussion of this
question is, however, hardly possible at the present time.
REFERENCES
§ 4
1. Approximate solution of Schrödinger's equation based on the equation of
Hamilton-Jacobi:
G. Wentzel, Zs. f. Phys. 38, 518 (1926).
L. Brillouin, C. R. 183, 24 (1926).
H. A. Kramers, Zs. f. Phys. 39, 828 (1926).
H. A. Kramers u. G. P. Ittmann, Zs. f. Phys. 58, 217 (1929).
2. The Virial Theorem:
B. Finkelstein, Zs. f. Phys. 50, 293 (1927).
A. Sommerfeld, Wellenmechanischer Ergänzungsband, Kap. II, § 9.
3. The motion of a wave packet:
P. Ehrenfest, Zs. f. Phys. 45, 455 (1927).
§ 5
1. Theory of canonical transformations and of conditionally periodic motion:
M. Born, Atommechanik, I, ch. ii.
2. Connexion between classical and wave-mechanical average values:
J. H. Van Vleck, Proc. Nat. Ac. Sci. 14, 179 (1928).
§ 7
Born, Heisenberg, u. Jordan, Zs. f. Phys. 35, 557 (1926).
P. A. M. Dirac, The Principles of Quantum Mechanics, ch. iii.
§ 9
E. Schrödinger, Ann. d. Phys. 79, 361 (1926).
W. Gordon, Zs. f. Phys. 40, 117 (1926).
§ 10
On the normalization of wave functions in the case of continuous spectra:
E. Fues, Ann. d. Phys. 81, 281 (1926).
P. A. M. Dirac, The Principles of Q. M., ch. iv.
§ 11
M. Born, W. Heisenberg, u. P. Jordan, Zs. f. Phys. 35, 557 (1926).
E. Schrödinger, Ann. d. Phys. 79, 734 (1926).
§ 12
W. Heisenberg, Zs. f. Phys. 33, 879 (1925).
N. Bohr, Über die Quantentheorie der Linienspektra, Braunschweig, 1923.
§ 13
M. Born u. P. Jordan, Elementare Quantenmechanik.
P. A. M. Dirac, The Principles of Q. M., §§ 29 and 30.
CHAPTER IV (Transformation Theory)
P. A. M. Dirac, The Principles of Q. M., ch. v.
M. Born u. P. Jordan, Elementare Quantenmechanik.
J. v. Neumann, Mathematische Grundlagen der Quantenmechanik (1932); Gött.
Nachr. 1927, p. 245.
§ 19
E. Schrödinger, Abhandlungen zur Wellenmechanik, III, S. 85.
Born, Heisenberg, u. Jordan, Zs. f. Phys. 35, 557 (1926).
H. Weyl, Gruppentheorie und Quantenmechanik, § 16.
§ 21
P. A. M. Dirac, Proc. Roy. Soc. 112, 661 (1926).
§ 25
W. Gordon, Zs. f. Phys. 40, 117 (1926).
O. Klein, Zs. f. Phys. 37, 895 (1926).
§ 28
P. A. M. Dirac, Proc. Roy. Soc. 117, 610 and 118, 351 (1928).
C. G. Darwin, Proc. Roy. Soc. 118, 654 (1928).
J. Frenkel, Zs. f. Phys. 52, 356 (1928).
§§ 29, 30
W. Pauli, Zs. f. Phys. 43, 601 (1927).
J. Frenkel, Lehrbuch der Elektrodynamik, i. 294 and 353.
L. H. Thomas, Phil. Mag., June, 1927, and Nature, 107, 514 (1926).
§ 31
P. A. M. Dirac, The Principles of Q. M., §§ 74 and 76.
G. Breit, Proc. Nat. Ac. Sci. 14, 553 (1928).
D. Ivanenko, C. R., 25 Feb. 1929.
V. Fock, Zs. f. Phys. 55, 127 (1929); 57, 261 (1929).
W. Gordon, Zs. f. Phys. 50, 630 (1927).
§§ 32, 33
P. A. M. Dirac, The Principles of Q. M., §§ 76, 77, 78.
A. Sommerfeld, Wellenmechanischer Ergänzungsband.
§ 35
P. A. M. Dirac, The Principles of Q. M., § 75.
O. Laporte and G. Uhlenbeck, Phys. Rev. 37, 1380 (1931).
H. Weyl, Gruppentheorie und Quantenmechanik, § 39.
§ 36
V. Fock, Zs. f. Phys. 55, 127 (1929).
§ 37
E. Schrödinger, Abhandlungen zur Wellenmechanik, II.
V. Fock, Zs. f. Phys. 63, 855 (1930).
H. Weyl, Gruppentheorie und Quantenmechanik.
B. L. van der Waerden, Die gruppentheoretische Methode in der
Quantenmechanik.
§ 38
W. Pauli, Zs. f. Phys. 43, 601 (1927).
§ 41
P. A. M. Dirac, Proc. Roy. Soc. 112, 661 (1926).
P. A. M. Dirac, The Principles of Q. M., ch. xi.
E. Wigner, Zs. f. Phys. 40, 883 (1927).
§ 42
J. C. Slater, Phys. Rev. 34, 1293 (1929); 36, 57 (1930).
W. Pauli, Rapport du Congrès Solvay de 1930, i, § 4.
P. A. M. Dirac, The Principles of Q. M., § 66.
§ 43
D. R. Hartree, Proc. Cam. Phil. Soc. 26, 85 (1928).
§ 44
V. Fock, Zs. f. Phys. 61, 126 (1930).
P. A. M. Dirac, Proc. Cam. Phil. Soc. 26, 376 (1930).
§ 45
V. Fock, Zs. f. Phys. 81, 195 (1933).
L. H. Thomas, Proc. Cam. Phil. Soc. 23, 542 (1927).
E. Fermi, Zs. f. Phys. 48, 73 (1928).
§ 46
P. Jordan u. E. Wigner, Zs. f. Phys. 47, 631 (1928).
V. Fock, Zs. f. Phys. 75, 622 (1932).
§ 47
P. A. M. Dirac, Proc. Roy. Soc. 114, 243 and 710 (1927).
P. Jordan u. O. Klein, Zs. f. Phys. 45, 751 (1927).
V. Fock, Zs. f. Phys. 75, 622 (1932).
§ 48
P. A. M. Dirac, The Principles of Q. M., ch. xii.
§§ 49, 50
W. Heisenberg, Ann. d. Phys. 9, 338 (1931).
V. Weisskopf u. E. Wigner, Zs. f. Phys. 63, 64 (1930); 65, 18 (1930).
E. Fermi, Reviews of Modern Physics, 4, 87 (1932).
G. Breit, Reviews of Modern Physics, 4, 504 (1932).
§ 52
P. Jordan u. W. Pauli, Zs. f. Phys. 47, 151 (1927).
W. Heisenberg u. W. Pauli, Zs. f. Phys. 56, 1 (1927); 59, 168 (1930).
L. Landau u. R. Peierls, Zs. f. Phys. 62, 188 (1930).
P. A. M. Dirac, Proc. Roy. Soc. 136, 453 (1932).
V. Fock and B. Podolsky, Sow. Phys. 1, 801 (1932).
Dirac, Fock, and Podolsky, Sow. Phys. 2, 468 (1932).
§ 53
G. Breit, Phys. Rev. 34, 553 (1929); 36, 383 (1930); 39, 616 (1932).
Chr. Møller, Zs. f. Phys. 70, 786 (1931).
L. Rosenfeld, Zs. f. Phys. 73, 253 (1932).
D O VE R B O O KS ON SCI E NCE
P UR E M ATHEMATICS
ANSCHAULICHE GEOMETRIE by D. Hilbert and S. Cohn-Vossen. Yellow
(Grundlehren) S eries. Text in German. English translation of table of
con ten ts. G e rm a n -E n g lish g lo ss a ry -in d e x . 5 -1 /2 x 8 -1 /2 . x + 314
pages. 330 illu stratio n s. (Originally published at $10.00). $3.95
AUFGABEN UND LEHRSATZE AUS DER ANALYSIS by G. Po'lya and G.
Szego. Two volume set. Text in German. English translation of table
of contents. G erm an-E nglish g lo ssary -in d ex . 5-1/2 x 8 -1 /2 . Volume
I: xxvi + 342 pages. Volume II: xx + 412 pages. (Originally published
at $14.40 for both volu m es. Each V o lu m e--$ 3 .9 5 , The S e t - - $7.90
A CONCISE HISTORY OF MATHEMATICS by D irk J. Struik. Emphasizes
ideas and continuity of m a th em atics r a th e r than anecdotal asp ects
from O riental beginnings through 19th century. " , . .r ic h iji content,
thoughtful in i n te r p r e ta t io n ... ” --U . S. Q u a rte rly Book L ist. Two
volume se t. Dover S eries in M athem atics and P hysics. Bibliography.
Index. 4 -1 /2 x 6 -3 /4 . Volume I: xviii + 123 pages. Volume II: vi + 175
pages. 47 illu stratio n s. The S e t- - $3.00
COURS D ’ANALYSE INFINITESIMALE by Ch. J. de la V alle Poussin.
Eighth revised edition. “ The handling throughout is clear, elegant and
co n c ise .. . “ --B ulletin of the American Mathematical Society. Two vol
ume set. Text in French. 5 -1 /2 x 8 -1 /2 . Volume I: xxi+524 pages.Volume
II: xii + 460 pages. Each V olum e--$ 4 .5 0 , The S e t- - $8.75
EINFUHRUNG IN DIE ALGEBRAISCHE GEOMETRIE by B. L. van der
W aerden. “ C lear, system atic exposition of an im portant new m athe
m atical developm ent.“ —Bulletin of the A m erican M athem atical Soci
ety. Yellow (Grundlehren) S eries. Text in German. 5-1/2 x 8 -1 /2 . ix +
247 pages. 15 illu stratio n s. (Originally published at $7.80). $3.95
EINLEITUNG IN DIE MENGENLEHRE by Adolf Fraenkel. Third revised
edition. “ The tre a tise by Fraenkel on the theory of aggregates is now
one of the fin e s t.“ --B u lle tin of the A m erican M athem atical Society.
Yellow (G rundlehren) S e rie s. T ext in G erm an. B ibliography. Index.
5-1/2 x 8 -1 /2 . xiii + 424 pages. 13 fig u res. $4.00
ELEMENTARY MATHEMATICS FROM AN ADVANCED STANDPOINT by
Felix Klein. Volume I: A rithmetic, Algebra, Analysis. T ranslated from
the th ird German edition by E. R. Hedrick and C. A. Noble. “ A very a t
tractiv e introduction into some of the most modern developments of the
theory of groups of finite o rder, with em phasis on its applications. ’*—
A m erican M athem atical Monthly. Yellow (Grundlehren) S eries. Index.
5 -1 /2 x 8 -1 /2 . xiv + 274 pages. 125 illu stratio n s. $3.75
ELEMENTARY MATHEMATICS FROM AN ADVANCED STANDPOINT by
Felix Klein. Volume II: Geom etry. T ran slated from the th ird German
edition by E. R. H edrick and C. A. Noble. “ Required reading for any
one planning to teach high school geom etry a n d .. . Interesting and v al
uable to the experienced te a c h e r.“ --School Science and M athem atics.
Yellow (Grundlehren) S eries. 5-1/2 x 8 -1 /2 . ix + 214 pages. 141 illu s
tratio n s. $2.95
D O VE R B O O KS ON SCI E NCE
GRUNDZUGE DER THEORETISCHEN LOGIK by D Hilbert and W. Acker
man. Second revised edition. Yellow (Grundlehren) Series. Text in G er
man. Bibliography. Index. 5 -1 /2 x 8 -1 /2 . xi + 133 pages. (Originally
published at $4.50). $3.00
MENGENLEHRE by F. Hausdorff. Third revised edition. Text in German.
B ibliography. Index. 5 -1 /2 x 8 -1 /2 . v 4 307 pag es. 12 illu stratio n s.
(Originally published at $10.00). $3.95
ORDINARY DIFFERENTIAL EQUATIONS by E. L Ince. Fourth revised
edition. " Notable addition to the m athem atical literatu re in E nglish."
--B ulletin of the American Mathematical Society. 4 appendices. Index.
5-1 /2 x 9. viii + 558 pages. 18 illu stratio n s. (O riginally published at
$12.00). $4.95
THEORIE DER DIFFERENTIALGLEICHUNGEN byLudwigBieberbach.
Third rev ised edition. Yellow (Grundlehren) S eries. Text in Germ an.
Index. 5 -1 /2 x 8 -1 /2 . xvii + 399 pages. 22 illu stra tio n s. (O riginally
published at $10.00). $3.95
DIE THEORIE DER GRUPPEN VON ENDLICHER ORDNUNG by Andreas
S peiser. Third revised edition. Yellow (Grundlehren) S eries. Text in
German. Index. 5-1 /2 x 8 -1 /2 . x 4 262 pages. 41 illu stratio n s. (Orig
inally published at $9.00). $3.95
THEORY OF FUNCTIONS by Konrad Knopp. P art I: Elem ents of the Gen
e ra l Theory of Analytic Functions. T ran slated from the fifth German
edition by F re d erick Bagemihl. "T h e re is little doubt but that this is
the best monograph on functions of a com plex v ariab le yet w ritte n ."
--A m erican Mathematical Monthly. Bibliography. Index. 4 -1 /4 x 6 -1 /2 .
xii 4 146 pages. 4 illu stratio n s. $1.50
THEORY OF FUNCTIONS by Konrad Knopp. P a rt II: A pplications and
F u rth e r D evelopm ent of the G en eral T h eo ry . T ra n sla te d fro m the
fourth G erm an edition by F re d e ric k B agem ihl. B ibliography. Index.
4-1/4 x 6 -1 /2 . x 4 150 pages. 8 illu stratio n s. $1.50
PROBLEM BOOK IN THE THEORY OF FUNCTIONS by Konrad Knopp.
Volume I: Problems in the Elementary Theory of Functions. T ranslated
by Lipman B e rs. "T h e difficult ta sk of selectin g from the im m ense
m aterial of the modern theory of functions the problem s Just within the
reach of the beginner is here m asterfully accom plished."--B ulletin of
the Am erican M athem atical Society. Dover S eries in M athem atics and
P hysics. 4 -1 /4 x 6 -3 /8 . viii 4 126 pages. $1.85
VORLESUNGEN UBER DIFFERENTIALGEOMETRIE by Wilhelm Blaschke.
Volume I: Elementare Differentialgeometrie. Third revised edition. Yel
low (Grundlehren) S eries. Text in German. English translation of table
of contents. G erm an-English glossary-index. 5 -1 /2 x 8 -1 /2 . xiv 4 322
pages. 35 figures. (Originally published at $9.00). $3.95
D O VE R B O O KS O N SC I E N C E
A P P L I E D M AT H E M AT I C S
AND M AT H E M AT I C AL P H YSI C S
APPLIED ELASTICITY by John P rescott. “ . . . important contribution.. .
old m a te ria l p resented in new and refresh in g f o r m .. . many original
investigations.’’--N ature. 3 appendices. Index. 5-1/2 x 8 - l/2 . vi + 666
pages. (Originally published at $9.50). $3.95
BESSEL FUNCTIONS, Eleven and Fifteen-Place Tables of B essel Func
tions of the F irs t Kina to All Significant O rd ers by Enzo Cambi. The
main tables give Jn (x) for x = 0 (0.01) 10.5 and n = 0 (1) 29 to 11 places.
A supplementary table gives Jn (x) for x = 0 (0.001) 0.5 and n = 0 (1) 11
to 15 p la c e s. B ibliography. 8 -1 /2 x 1 0 -3 /4 . H ard binding, vi + 160
pages. 2 graphs. $3.95
FOUNDATIONS OF NUCLEAR PHYSICS. Compiled by Robert T. Beyer.
F acsim ile reproductions with text in the original language of French,
German or English of the 13 most important papers in atomic research
by Chadwick, Cockcroft, Yukawa, F erm i, etc. 122 page bibliography
with over 5,000 classified en tries. 6-1/8 x 9 -1/4. x = 272 pages. Dlus-
trated . $2.95
HIGHER MATHEMATICS FOR STUDENTS OF CHEMISTRY AND PHYSICS
by J* W. Mellor. Fourth revised edition. “ . . . an eminently readable and
thoroughly p ractical tr e a tis e .’’--N atu re. 2 appendices. Index. 5-1/2 x
8-1/2. xxix + 641 pages. 189 figures. 18 tables. (Originally published at
$7.00). $4.50
HYDRODYNAMICS by Sir Horace Lamb. Sixth revised edition. “ Standard
w o rk .. . im p o rtan t th e o rie s (of the dynam ics of liquids and g ases),
which underlie many present-day practical applications, a re dealt with
thoroughly and with m ath em atical r ig o u r .’’--E n g in e erin g S ocieties
L ib rary . Index. 6 x 9 . xviii + 738 pages. 83 illu stra tio n s. (Originally
published at $13.75). $5.95
INTRODUCTION TO THE DIFFERENTIAL EQUATIONS OF PHYSICS by
L. Hopf. T ranslated by W alter Nef. “ T here is a su rp risin g amount of
valuable m a teria l packed into th is sm all book.’’--School Science and
M athem atics. Dover S eries in M athem atics and P hysics. Index. 4-1 /4
x 6 -3 /8 . vi + 154 pages. 48 illu stratio n s. $1.95
INTRODUCTION TO THE THEORY OF FOURIER’S SERIES AND INTE
GRALS by H. S. Car slaw . T hird revised edition. “ . . .n eed s little in tro
duction. . . much new m aterial has been introduced (in the p resen t edi
tion). . . clearly and attractiv ely w ritte n .’’--N atu re . 2 appendices. In
dex. 5 -3 /8 x 8. xlii + 368 pages. 39 illu stratio n s. $3.95
DIE MATHEMATISCHEN H ILFSM ITTEL DES PHYSIKERS by Erwin
Made lung. T h ird re v ise d edition. “ Standard. ..c o lle c tio n of m atne-
m atical definitions and fo rm u las and of laws and equations used in
th e o re tic a l and ap p lied p h y s ic s .’’- - E le c tr o n ic s In d u stries.Y ello w
(G rundlehren) S e rie s . T ext in G erm an . G erm a n -E n g lish g lo ssa ry .
B ibliography. Index. 6 x 9. xvi + 384 pages. 25 illu stra tio n s. (O rigi
nally published at $12.00). $3.95
D O VE R B O O KS O N S C I E N C E
MICRO-WAVES AND WAVE GUIDES by H. M. B arlow . U p-to -d ate ex
position which d e sc rib e s both the accom plishm ents and future p o ssi
b ilities in this increasingly important field. G lossary of symbols used.
Bibliography. Index. 5-1/2 x 8-1/2. x + 122 pages. 70 illustrations. $1.95
PARTIAL DIFFERENTIAL EQUATIONS OF MATHEMATICAL PHYSICS
by H. B atem an. F irs t A m erican edition with c o rre c tio n s. “ The book
m ust be in the hands of everyone who is in te re ste d in the boundary
value problem s of m athem atical p h y sic s.’’--B ulletin of the A m erican
M athem atical Society. Appendix. Index. 6 x 9 . xxii + 522 pages. 29 il
lustrations. (O riginally published at $10.00). $4.95
PRACTICAL ANALYSIS (GRAPHICAL AND NUMERICAL METHODS) by
F r. A W illers. T ranslated by Robert T. Beyer. Section on calculating
m achines rew ritten by T racy W. Simpson to reflec t c u rre n t methods
with A merican-made calculators. “ . . . is to be recommended as a con
venient reference book.’’--B ulletin of the American M athem atical So
ciety. Index. 6 -1 /8 x 9 -1 /4 . x + 422 pages. 132 illu stratio n s. $6.00
SPHERICAL HARMONICS, An E lem entary T re atise on Harm onic Func
tions with A pplications by T. M. M acRobert. Second rev ised edition.
“ . . . sc h o la rly tre a tm e n t of the type of p ro b lem s a ris in g in a great
many branches of theoretical physics and the tools whereby such prob
lems may be attacked.’’--B ulletin of the American M athem atical Soci
ety. Index. 5-1/2 x 8 -1 /2 . vi + 372 pages. $4.50
THEORIE UND ANWENDUNG DER LAPLACE-TRANSFORMATION by
Gustav Doetsch. Second revised edition. Yellow (Grundlehren) Series.
Text in German. German-English glossary. Bibliography. Index. 6 x 9.
xiv + 439 pages. 18 illustrations. Tables of Laplace transformations.
(Originally published at $14.50). $3.95
THE THEORY OF SOUND by Lord Rayleigh. With an Historical
Introduction by Robert Bruce Lindsay. Second revised edition. “. . . makes
this outstanding treatise available again, and furthermore, at a popular
price.”--Review of Scientific Instruments. Appendix. Index. 5-1/2 x
8-1/2. Volume I: xlii + 408 pages. Volume II: xvi + 504 pages.
(Originally published in two volumes at $8.00).
Unabridged One Volume Edition--$5.95
A TREATISE ON THE ANALYTICAL DYNAMICS OF PARTICLES AND
RIGID BODIES by E. T. Whittaker. Fourth revised edition. “. . . exhibits
great mathematical power and attainments . . .”--Bulletin of the
American Mathematical Society. Index. 6 x 9. xiv + 456 pages.
(Originally published at $6.00). $4.50
A TREATISE ON THE MATHEMATICAL THEORY OF ELASTICITY by
A. E. H. Love. Fourth revised edition. “. . . has been for years the
standard treatise on elasticity . . . presents a picture of this extensive
field in all its aspects in a single volume . . .”--American Mathematical
Monthly. Index. 6 x 9. xxi + 643 pages. 76 illustrations.
(Originally published at $10.50). $5.95
DOVER BOOKS ON SCIENCE
PHYSICS AND CHEMISTRY
ATOMIC SPECTRA AND ATOMIC STRUCTURE by Gerhard Herzberg.
Translated with the cooperation of the author by J. W. T. Spinks.
Second revised edition. “. . . the vector model and the quantum mechanical
view are skillfully blended together into a unified description of atomic
processes . . .”--Nature. Bibliography. Index. 5-1/4 x 8-1/4. xv + 257
pages. 80 illustrations. 21 tables. (Originally published at $5.70). $3.95
BIOMETRICAL GENETICS, THE STUDY OF CONTINUOUS VARIATION
by K. Mather. Based on the use of measurements, this work examines
the phenotype classes for which older methods of discontinuous
variation are useless. 5-1/2 x 8-1/2. x + 158 pages. 16 diagrams. $3.50
COSMIC RADIATION. Edited by W. Heisenberg. Translated from the
German by T. H. Johnson. 15 articles on recent accomplishments in the
field written by eminent German physicists during World War II.
Material well integrated with numerous cross references and consistent
notation. Bibliography. Index. 6 x 9. xvi + 192 pages. 36 illustrations.
13 tables. $3.95
DESIGN OF CRYSTAL VIBRATING SYSTEMS by William J. Fry, John
M. Taylor and Bertha W. Henvis. Second revised edition. Procedures
for design of projectors involving a general set of curves based on
fundamental piezoelectric relations. “Contains much valuable material
released for the first time for general publication.”--Electronic
Engineering. 4 appendices. 6-1/8 x 9-1/4. viii + 182 pages. 126 graphs.
$3.50
THE EVOLUTION OF SCIENTIFIC THOUGHT FROM NEWTON TO
EINSTEIN by A. d’Abro. “. . . covers many more topics than any other
popular book in English of which I know, and there are many
admirable features in the presentation . . .”--Physical Review. 4 appendices.
5-3/8 x 8. 544 pages. $5.00
GAS DYNAMICS TABLES FOR AIR by Howard W. Emmons. “The
precision of the computations makes the tables adequate for many special
uses.”--Review of Scientific Instruments. 6-1/8 x 9-1/4. Semi-stiff
binding. 46 pages. 3 illustrations. 4 tables. 10 graphs. $1.75
HYDROLOGY. Edited by Oscar E. Meinzer. Chapters by 24 experts on
precipitation, glaciers, soil moistures, runoff, droughts, hydrology of
limestone and lava-rock terranes, etc. “Most up-to-date and most
complete treatment of the subject . . .”--Bulletin of the American
Association of Petroleum Geologists. Physics of the Earth Series.
Bibliography. Index. 6-1/8 x 9-1/4. xi + 712 pages. 165 illustrations. 23
tables. (Originally published at $8.00). $4.95
MATHEMATICAL FOUNDATIONS OF STATISTICAL MECHANICS by
A. I. Khinchin. Translated by G. Gamow. The most rigorous
mathematical discussion available. Dover Series in Mathematics and Physics.
Appendix. Notations. Index. 5 x 7-3/8. viii + 179 pages. $2.95
MATHEMATISCHE GRUNDLAGEN DER QUANTENMECHANIK by Johann
von Neumann. Yellow (Grundlehren) Series. Text in German.
German-English glossary. Index. 6 x 9. vi + 266 pages. 4 illustrations.
(Originally published at $7.85). $3.95
MATTER AND LIGHT, THE NEW PHYSICS by Louis de Broglie.
Translated by W. H. Johnston. 21 essays on present day physics, matter and
electricity, light and radiation, wave mechanics and the philosophical
implications of scientific achievement. 4-7/8 x 7-3/4. iv + 300 pages.
(Originally published at $3.50). $2.75
THE NATURE OF PHYSICAL THEORY by P. W. Bridgman. “It can easily
be read in about three hours, but it will then demand to be reread,
parts of it several times over.”--Review of Scientific Instruments.
Index. 5-3/8 x 8. xi + 138 pages. $2.25
THE PHASE RULE AND ITS APPLICATIONS by Alexander Findlay.
Eighth revised edition. “It has established itself as the standard work
on the subject . . .”--Nature. Index. 5-1/2 x 8-1/2. xxxi + 313 pages.
163 illustrations. $3.95
POLAR MOLECULES by P. Debye. “This book not only brings together
for the first time the accumulated information on electric dipoles, but
also points out the gaps which still exist in theory and experiment.”
--Nature. Index. 5-1/2 x 8-1/2. iv + 172 pages. 33 illustrations.
(Originally published at $8.00). $3.50
TABLES OF FUNCTIONS WITH FORMULAE AND CURVES
(FUNKTIONENTAFELN) by Eugene Jahnke and Fritz Emde. Fourth revised edition
containing 400 corrections of errors and a supplementary bibliography
of 43 titles. Text in German and English. Bibliography. Index. 5-1/2
x 8-1/2. xvi + 382 pages. 212 illustrations. (Originally published at
$6.00). $4.95
LES TENSEURS EN MECANIQUE ET EN ELASTICITE by Leon Brillouin.
“. . . first comprehensive treatise in any language on which the main
emphasis is laid on the tensorial formulation of the classical
(non-relativistic) laws of physics . . .”--Review of Scientific Instruments.
Text in French. Index. 6 x 9. xx + 364 pages. 114 figures. $3.95
TERRESTRIAL MAGNETISM AND ELECTRICITY. Edited by J. A.
Fleming. Chapters by 14 leading geophysicists. “An important and
authoritative production . . . making available . . . the present state and
fascinating and difficult problems of this branch of earth science.”
--Proceedings, Physical Society of London. Physics of the Earth Series.
Bibliography with 1,523 entries. Index. 6-1/8 x 9-1/4. xii + 794 pages.
296 illustrations. (Originally published at $8.00). $4.95
TIME, KNOWLEDGE, AND THE NEBULAE by Martin Johnson. Foreword
by Professor E. A. Milne. “. . . succinct and lucid summary of the new
cosmology involved in Professor Milne's theory of relativity, of its
physical background and of its possible philosophical significance.”
--London Times. Bibliography. Index. 5-1/2 x 8-1/2. iii + 189 pages.
$2.75
TREATISE ON THERMODYNAMICS by Max Planck. Translated with the
author’s sanction by Alexander Ogg. Third revised edition (translated
from the seventh German edition). “. . . an English translation of
Professor Planck’s book will receive a warm welcome.”--Nature. Index.
5-1/4 x 8-1/4. xxxiii + 297 pages. 5 illustrations. (Originally published
at $4.80). $3.50
Please send for our free, new catalog which gives full
descriptions of all Dover Books on Science
Dover Publications, 1780 Broadway, New York 19, N. Y.