Multiplicative Number Theory I-Montgomery
Prime numbers are the multiplicative building blocks of natural numbers. Un-
derstanding their overall influence and especially their distribution gives rise
to central questions in mathematics and physics. In particular their finer distri-
bution is closely connected with the Riemann hypothesis, the most important
unsolved problem in the mathematical world. Assuming only subjects covered
in a standard degree in mathematics, the authors comprehensively cover all the
topics met in first courses on multiplicative number theory and the distribution
of prime numbers. They bring their extensive and distinguished research exper-
tise to bear in preparing the student for intelligent reading of the more advanced
research literature. The text, which is based on courses taught successfully over
many years at Michigan, Imperial College and Pennsylvania State, is enriched
by comprehensive historical notes and references as well as over 500 exercises.
HUGH L. MONTGOMERY
University of Michigan, Ann Arbor
ROBERT C. VAUGHAN
Pennsylvania State University, University Park
Cambridge University Press
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press has no responsibility for the persistence or accuracy of urls
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
Dedicated to our teachers:
P. T. Bateman
J. H. H. Chalk
H. Davenport
T. Estermann
H. Halberstam
A. E. Ingham
Number is the beginning and end of thought.
With thought, number was born.
Beyond number, thought does not reach.
Preface
List of notation
1 Dirichlet series: I
1.1 Generating functions and asymptotics
1.2 Analytic properties of Dirichlet series
1.3 Euler products and the zeta function
1.4 Notes
1.5 References
2 The elementary theory of arithmetic functions
2.1 Mean values
2.2 The prime number estimates of Chebyshev and of Mertens
2.3 Applications to arithmetic functions
2.4 The distribution of Ω(n) − ω(n)
2.5 Notes
2.6 References
3 Principles and first examples of sieve methods
3.1 Initiation
3.2 The Selberg lambda-squared method
3.3 Sifting an arithmetic progression
3.4 Twin primes
3.5 Notes
3.6 References
4 Primes in arithmetic progressions: I
4.1 Additive characters
4.2 Dirichlet characters
4.3 Dirichlet L-functions
14 Zeros
14.1 General distribution of the zeros
14.2 Zeros on the critical line
14.3 Notes
14.4 References
APPENDICES
A The Riemann–Stieltjes integral
A.1 Notes
A.2 References
Our object is to introduce the interested student to the techniques, results, and
terminology of multiplicative number theory. It is not intended that our discus-
sion will always reach the research frontier. Rather, it is hoped that the material
here will prepare the student for intelligent reading of the more advanced re-
search literature.
Analytic number theorists are not very uniformly distributed around the
world and it is possible that a student may be working without the guidance of an
experienced mentor in the area. With this in mind, we have tried to make this
volume as self-contained as possible.
We assume that the reader has some acquaintance with the fundamentals of
elementary number theory, abstract algebra, measure theory, complex analysis,
and classical harmonic analysis. More specialized or advanced background
material in analysis is provided in the appendices.
The relationship of exercises to the material developed in a given section
varies widely. Some exercises are designed to illustrate the theory directly
whilst others are intended to give some idea of the ways in which the theory can
be extended, or developed, or paralleled in other areas. The reader is cautioned
that papers cited in exercises do not necessarily contain a solution.
This volume is the first instalment of a larger project. We are preparing a
second volume, which will cover such topics as uniform distribution, bounds for
exponential sums, a wider zero-free region for the Riemann zeta function, mean
and large values of Dirichlet polynomials, approximate functional equations,
moments of the zeta function and L-functions on the line σ = 1/2, the large
sieve, Vinogradov’s method of prime number sums, zero density estimates,
primes in arithmetic progressions on average, sums of primes, sieve methods,
the distribution of additive functions and mean values of multiplicative func-
tions, and the least prime in an arithmetic progression. The present volume was
List of notation
Symbol    Meaning    Found on page
$\Gamma(s, a)$    $= \int_a^\infty e^{-w} w^{s-1}\,dw$; the incomplete Gamma function.    327
$\gamma$    The imaginary part of a zero of the zeta function or of an L-function.    172
$\Delta_N(\theta)$    $= 1 + 2\sum_{n=1}^{N-1} (1 - n/N)\cos 2\pi n\theta$; known as the Fejér kernel.    174
$\varepsilon(\chi)$    $= \tau(\chi)/(i^{\kappa} q^{1/2})$.    332
$\zeta(s)$    $= \sum_{n=1}^\infty n^{-s}$ for $\sigma > 1$, known as the Riemann zeta function.    2
$\zeta(s, \alpha)$    $= \sum_{n=0}^\infty (n + \alpha)^{-s}$ for $\sigma > 1$; known as the Hurwitz zeta function.    30
$\zeta_K(s)$    $= \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$; known as the Dedekind zeta function of the algebraic number field $K$.    343
$\Theta$    $= \sup \Re\rho$.    430, 463
$\vartheta(x)$    $= \sum_{p \le x} \log p$.    46
$\vartheta(z)$    $= \sum_{n=-\infty}^\infty e^{-\pi n^2 z}$ for $\Re z > 0$.    329
$\vartheta(x; q, a)$    The sum of $\log p$ over primes $p \le x$ for which $p \equiv a \pmod q$.    128, 377ff
$\vartheta(x, \chi)$    $= \sum_{p \le x} \chi(p)\log p$.    377ff
$\kappa$    $= (1 - \chi(-1))/2$.    332
$\Lambda(n)$    $= \log p$ if $n = p^k$, $= 0$ otherwise; known as the von Mangoldt Lambda function.    23
$\Lambda_2(n)$    $= \Lambda(n)\log n + \sum_{bc=n} \Lambda(b)\Lambda(c)$.    251
$\Lambda(x; q, a)$    The sum of $\lambda(n)$ over those $n \le x$ such that $n \equiv a \pmod q$.    383
$\Lambda(x, \chi)$    $= \sum_{n \le x} \chi(n)\lambda(n)$.    383
$\lambda(n)$    $= (-1)^{\Omega(n)}$; known as the Liouville lambda function.    21
$\mu(n)$    $= (-1)^{\omega(n)}$ for square-free $n$, $= 0$ otherwise. Known as the Möbius mu function.    21
$\mu(\sigma)$    The Lindelöf mu function.    330
$\xi(s)$    $= \tfrac12 s(s-1)\zeta(s)\Gamma(s/2)\pi^{-s/2}$.    328
$\xi(s, \chi)$    $= L(s, \chi)\Gamma((s+\kappa)/2)(q/\pi)^{(s+\kappa)/2}$ where $\chi$ is a primitive character modulo $q$, $q > 1$.    333
for $|z| < 1$, then the $n$th power series coefficient of $f(z)^s$ is the number $r_{k,s}(n)$ of representations of $n$ as a sum of $s$ positive $k$th powers,
$$n = m_1^k + m_2^k + \cdots + m_s^k.$$
We can recover $r_{k,s}(n)$ from $f(z)^s$ by means of Cauchy's coefficient formula:
$$r_{k,s}(n) = \frac{1}{2\pi i}\oint \frac{f(z)^s}{z^{n+1}}\,dz.$$
By choosing an appropriate contour, and estimating the integrand, we can determine the asymptotic size of $r_{k,s}(n)$ as $n \to \infty$, provided that $s$ is sufficiently large, say $s > s_0(k)$. This is the germ of the Hardy–Littlewood circle method, but considerable effort is required to construct the required estimates.
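As a small numerical illustration of the coefficient interpretation above, the following sketch computes $r_{k,s}(n)$ by multiplying truncated power series. It assumes that $f(z) = \sum_{m \ge 1} z^{m^k}$ (the definition is not restated in this passage), and the name `representation_counts` is purely illustrative. It does not implement the circle method; it only reads off coefficients of $f(z)^s$ directly.

```python
def representation_counts(k, s, n_max):
    """Coefficients r_{k,s}(0..n_max) of f(z)^s, where f(z) = sum_{m>=1} z^(m^k) (assumed)."""
    # Exponents m^k <= n_max that occur in f(z), in increasing order.
    powers = []
    m = 1
    while m ** k <= n_max:
        powers.append(m ** k)
        m += 1

    result = [1] + [0] * n_max          # the constant series 1
    for _ in range(s):                  # multiply by f(z), s times
        nxt = [0] * (n_max + 1)
        for i, coeff in enumerate(result):
            if coeff == 0:
                continue
            for j in powers:
                if i + j > n_max:
                    break
                nxt[i + j] += coeff
        result = nxt
    return result


r = representation_counts(2, 2, 50)
print(r[5], r[25], r[50])   # ordered representations as a sum of two positive squares
```

The counts are of ordered representations, since that is what the coefficients of a product of power series record.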
To appreciate why power series are useful in dealing with additive problems, note that if $A(z) = \sum a_k z^k$ and $B(z) = \sum b_m z^m$ then the power series
The terms are grouped according to the sum of the indices, because $z^k z^m = z^{k+m}$.
A Dirichlet series is a series of the form $\alpha(s) = \sum_{n=1}^\infty a_n n^{-s}$ where $s$ is a complex variable. If $\beta(s) = \sum_{m=1}^\infty b_m m^{-s}$ is a second Dirichlet series and $\gamma(s) = \alpha(s)\beta(s)$, then (ignoring questions relating to the rearrangement of terms of infinite series)
$$\gamma(s) = \sum_{k=1}^\infty a_k k^{-s} \sum_{m=1}^\infty b_m m^{-s} = \sum_{k=1}^\infty \sum_{m=1}^\infty a_k b_m (km)^{-s} = \sum_{n=1}^\infty \Big(\sum_{km=n} a_k b_m\Big) n^{-s}. \tag{1.2}$$
That is, we expect that $\gamma(s)$ is a Dirichlet series, $\gamma(s) = \sum_{n=1}^\infty c_n n^{-s}$, whose coefficients are
$$c_n = \sum_{km=n} a_k b_m. \tag{1.3}$$
This corresponds to (1.1), but the terms are now grouped according to the product of the indices, since $k^{-s} m^{-s} = (km)^{-s}$.
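The coefficient rule (1.3) is easy to experiment with. The following is a minimal sketch, with the hypothetical helper name `dirichlet_convolution`, that forms $c_n = \sum_{km=n} a_k b_m$ for finitely many $n$; lists are indexed from 1, with index 0 used only as padding.

```python
def dirichlet_convolution(a, b):
    """Return c with c[n] = sum over km = n of a[k] * b[m]; index 0 is unused padding."""
    n_max = min(len(a), len(b)) - 1
    c = [0] * (n_max + 1)
    for k in range(1, n_max + 1):
        if a[k] == 0:
            continue
        for m in range(1, n_max // k + 1):
            c[k * m] += a[k] * b[m]      # the pair (k, m) contributes to n = km
    return c


N = 6
ones = [0] + [1] * N                 # a_n = 1
identity = list(range(N + 1))        # b_n = n
print(dirichlet_convolution(ones, identity)[1:])   # sums of the divisors of 1..6
```

With $a_n = 1$ and $b_n = n$ the result is the sum of the divisors of $n$, because each divisor $m$ of $n$ contributes $b_m = m$.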
Since we shall employ the complex variable s extensively, it is useful to have
names for its real and imaginary parts. In this regard we follow the rather peculiar
notation that has become traditional: s = σ + it.
Among the Dirichlet series we shall consider is the Riemann zeta function, which for $\sigma > 1$ is defined by the absolutely convergent series
$$\zeta(s) = \sum_{n=1}^\infty n^{-s}. \tag{1.4}$$
As a first application of (1.3), we note that if $\alpha(s) = \beta(s) = \zeta(s)$ then the manipulations in (1.3) are justified by absolute convergence, and hence we see that
$$\sum_{n=1}^\infty d(n) n^{-s} = \zeta(s)^2 \tag{1.5}$$
for $\sigma > 1$. Here $d(n)$ is the divisor function, $d(n) = \sum_{d \mid n} 1$.
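The identity (1.5) can be checked numerically at a single point. The sketch below takes $s = 2$, where $\zeta(2) = \pi^2/6$ is a standard value, and compares a partial sum of $\sum d(n) n^{-2}$ with $\zeta(2)^2$. The cutoff `N` and the function name are illustrative choices; agreement of a partial sum is only a sanity check, not a proof.

```python
import math


def divisor_counts(n_max):
    """d[n] = number of positive divisors of n, for 0 <= n <= n_max (d[0] unused)."""
    d = [0] * (n_max + 1)
    for k in range(1, n_max + 1):
        for multiple in range(k, n_max + 1, k):
            d[multiple] += 1             # k divides each of its multiples
    return d


N = 100_000
d = divisor_counts(N)
lhs = sum(d[n] / n**2 for n in range(1, N + 1))   # partial sum of the left side of (1.5)
rhs = (math.pi**2 / 6) ** 2                        # zeta(2)^2
print(lhs, rhs)
```

The partial sum falls slightly below $\zeta(2)^2$ because the omitted terms are positive.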
From the rate of growth or analytic behaviour of generating functions we
glean information concerning the sequence of coefficients. In expressing our
findings we employ a special system of notation. For example, we say, ‘f (x) is
asymptotic to g(x)’ as x tends to some limiting value (say x → ∞), and write
Figure 1.1 Graph of $\pi(x)$ (solid) and $x/\log x$ (dotted) for $2 \le x \le 10^6$.
proved that $\pi(x) \ll x/\log x$. This is of course weaker than the Prime Number Theorem, but it was derived much earlier, in 1852. Chebyshev also showed that $\pi(x) \gg x/\log x$. In general, we say that $f(x) \gg g(x)$ if there is a positive constant $c$ such that $f(x) \ge c g(x)$ and $g$ is non-negative. In this situation both $f$ and $g$ take only positive values. If both $f \ll g$ and $f \gg g$ then we say that $f$ and $g$ have the same order of magnitude, and write $f \asymp g$. Thus Chebyshev's estimates can be expressed as a single relation,
$$\pi(x) \asymp \frac{x}{\log x}.$$
The estimate (1.6) is best possible to the extent that the error term is not $o(x(\log x)^{-2})$. We have also a special notation to express this:
$$\pi(x) - \frac{x}{\log x} = \Omega\Big(\frac{x}{(\log x)^2}\Big).$$
In general, if $\limsup_{x\to\infty} |f(x)|/g(x) > 0$ then we say that $f(x)$ is 'Omega of $g(x)$', and write $f(x) = \Omega(g(x))$. This is precisely the negation of the statement '$f(x) = o(g(x))$'. When studying numerical values, as in Figure 1.1, we find that the fit of $x/\log x$ to $\pi(x)$ is not very compelling. This is because the error term in the approximation is only one logarithm smaller than the main term. This error term is not oscillatory; rather there is a second main term of this size:
$$\pi(x) = \frac{x}{\log x} + \frac{x}{(\log x)^2} + O\Big(\frac{x}{(\log x)^3}\Big).$$
This is also best possible, but the main term can be made still more elaborate to give a smaller error term. Gauss was the first to propose a better approximation to $\pi(x)$. Numerical studies led him to observe that the density of prime numbers in the neighbourhood of $x$ is approximately $1/\log x$. This suggests that the number of primes not exceeding $x$ might be approximately equal to the logarithmic integral,
$$\operatorname{li}(x) = \int_2^x \frac{1}{\log u}\,du.$$
(Orally, 'li' rhymes with 'pi'.) By repeated integration by parts we can show that
$$\operatorname{li}(x) = x\sum_{k=1}^{K-1} \frac{(k-1)!}{(\log x)^k} + O_K\Big(\frac{x}{(\log x)^K}\Big)$$
for any positive integer $K$; thus the secondary main terms of the approximation to $\pi(x)$ are contained in $\operatorname{li}(x)$.
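To see how much closer $\operatorname{li}(x)$ is to $\pi(x)$ than $x/\log x$ is, one can compute all three at $x = 10^6$. The sketch below is only illustrative: `prime_count` is a plain sieve of Eratosthenes, and `li` evaluates $\int_2^x du/\log u$ by Simpson's rule after the substitution $u = e^t$. Both the substitution and the number of steps are choices made here for numerical convenience, not methods taken from the text.

```python
import math


def prime_count(x):
    """pi(x) computed with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"                      # 0 and 1 are not prime
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Strike out p^2, p^2 + p, ... ; the replacement has the matching length.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


def li(x, steps=20_000):
    """Approximate the integral of 1/log u over [2, x]; steps must be even."""
    a, b = math.log(2), math.log(x)               # after u = e^t the integrand is e^t / t
    h = (b - a) / steps
    g = lambda t: math.exp(t) / t
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)   # Simpson weights 4, 2, 4, ...
    return total * h / 3


x = 10**6
print(prime_count(x), x / math.log(x), li(x))
```

At this value of $x$ the logarithmic integral overshoots $\pi(x)$ by roughly a hundred, while $x/\log x$ undershoots by several thousand.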
In Chapter 6 we shall prove the Prime Number Theorem in the sharper quantitative form
$$\pi(x) = \operatorname{li}(x) + O\Big(\frac{x}{\exp(c\sqrt{\log x})}\Big)$$
for some suitable positive constant $c$. Note that $\exp(c\sqrt{\log x})$ tends to infinity faster than any power of $\log x$. The error term above seems to fall far from what seems to be the truth. Numerical evidence, such as that in Table 1.1, suggests that the error term in the Prime Number Theorem is closer to $\sqrt{x}$ in size. Gauss noted the good fit, and also that $\pi(x) < \operatorname{li}(x)$ for all $x$ in the range of his extensive computations. He proposed that this might continue indefinitely, but the numerical evidence is misleading, for in 1914 Littlewood showed that
$$\pi(x) - \operatorname{li}(x) = \Omega_\pm\Big(\frac{x^{1/2}\log\log\log x}{\log x}\Big).$$
Here the subscript $\pm$ indicates that the error term achieves the stated order of magnitude infinitely often, and in both signs. In particular, the difference $\pi - \operatorname{li}$ has infinitely many sign changes. More generally, we write $f(x) = \Omega_+(g(x))$ if $\limsup_{x\to\infty} f(x)/g(x) > 0$, we write $f(x) = \Omega_-(g(x))$ if $\liminf_{x\to\infty} f(x)/g(x) < 0$, and we write $f(x) = \Omega_\pm(g(x))$ if both these relations hold.
1.1.1 Exercises
1. Let $r(n)$ be the number of ways that $n$ cents of postage can be made, using only 1 cent, 2 cent, and 3 cent stamps. That is, $r(n)$ is the number of ordered triples $(x_1, x_2, x_3)$ of non-negative integers such that $x_1 + 2x_2 + 3x_3 = n$.
(a) Show that
$$\sum_{n=0}^\infty r(n) z^n = \frac{1}{(1-z)(1-z^2)(1-z^3)}$$
$$\frac{a}{(z-1)^3} + \frac{b}{(z-1)^2} + \frac{c}{z-1} + \frac{d}{z+1} + \frac{e}{z-\omega} + \frac{f}{z-\overline{\omega}}$$
where $\omega = e^{2\pi i/3}$ and $\overline{\omega} = e^{-2\pi i/3}$ are the primitive cube roots of unity.
(c) Show that $r(n)$ is the integer nearest $(n+3)^2/12$.
(d) Show that $r(n)$ is the number of ways of writing $n = y_1 + y_2 + y_3$ with $y_1 \ge y_2 \ge y_3 \ge 0$.
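The claim in part (c) of Exercise 1 can be tested by brute force before attempting a proof. In this sketch (function names are illustrative) `postage_count` enumerates the triples directly, and `nearest_to_formula` rounds $(n+3)^2/12$ in integer arithmetic; rounding is unambiguous because a square is never congruent to 6 modulo 12.

```python
def postage_count(n):
    """Number of (x1, x2, x3) with x1 + 2*x2 + 3*x3 = n and all xi >= 0."""
    # Once x3 and x2 are chosen with 3*x3 + 2*x2 <= n, x1 is determined.
    return sum(1
               for x3 in range(n // 3 + 1)
               for x2 in range((n - 3 * x3) // 2 + 1))


def nearest_to_formula(n):
    """Integer nearest to (n + 3)^2 / 12, computed without floating point."""
    return ((n + 3) ** 2 + 6) // 12


print([postage_count(n) for n in range(10)])
print([nearest_to_formula(n) for n in range(10)])
```

A numerical match over a finite range is of course evidence only; the exercise asks for an argument via the partial fraction expansion.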
2. Explain why
$$\prod_{k=0}^\infty \big(1 + z^{2^k}\big) = 1 + z + z^2 + \cdots$$
$$\sum_{k=1}^K \frac{z^{a_k}}{1 - z^{m_k}} = \frac{1}{1-z}$$
$$\sim \frac{e^{2\pi i a_K/m_K}}{m_K(1-r)}$$
for all $n \ge 2$.
(b) Let $P(z) = \sum_{n=0}^\infty A(n) z^n$. Show that
$$P(z)^2 = P(z) - z.$$
Deduce that
$$P(z) = \frac{1 - \sqrt{1-4z}}{2} = \sum_{n=1}^\infty \binom{1/2}{n} 2^{2n-1} (-1)^{n-1} z^n.$$
(c) Conclude that $A(n) = \binom{2n-2}{n-1}/n$ for all $n \ge 1$. These are called the Catalan numbers.
(d) What needs to be said concerning the convergence of the series used above?
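The closed form in part (c) can be compared with the coefficients forced by $P(z)^2 = P(z) - z$. The sketch below assumes $A(0) = 0$ and $A(1) = 1$, which is what the identity requires at orders $z^0$ and $z^1$; comparing coefficients of $z^n$ for $n \ge 2$ then gives $A(n) = \sum_{k=1}^{n-1} A(k)A(n-k)$. The function name is illustrative.

```python
from math import comb


def coefficients_from_identity(n_max):
    """A(0..n_max) determined by P(z)^2 = P(z) - z with A(0) = 0 (assumed), n_max >= 1."""
    A = [0, 1] + [0] * (n_max - 1)
    for n in range(2, n_max + 1):
        # Coefficient of z^n in P(z)^2 must equal the coefficient of z^n in P(z).
        A[n] = sum(A[k] * A[n - k] for k in range(1, n))
    return A


A = coefficients_from_identity(10)
closed_form = [comb(2 * n - 2, n - 1) // n for n in range(1, 11)]
print(A[1:])
print(closed_form)
```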
6. (a) Let $n_k$ denote the total number of monic polynomials of degree $k$ in $\mathbb{F}_p[x]$. Show that $n_k = p^k$.
(b) Let $P_1, P_2, \ldots$ be the irreducible monic polynomials in $\mathbb{F}_p[x]$, listed in some (arbitrary) order. Show that
$$\prod_{r=1}^\infty \big(1 + z^{\deg P_r} + z^{2\deg P_r} + z^{3\deg P_r} + \cdots\big) = 1 + pz + p^2 z^2 + p^3 z^3 + \cdots$$
(j) Show that $g_n > 0$ for all $p$ and all $n \ge 1$. (If $P \in \mathbb{F}_p[x]$ is irreducible and has degree $n$, then the quotient ring $\mathbb{F}_p[x]/(P)$ is a field of $p^n$ elements. Thus we have proved that there is such a field, for each prime $p$ and integer $n \ge 1$. It may be further shown that the order of a finite field is necessarily a prime power, and that any two finite fields of the same order are isomorphic. Hence the field of order $p^n$, whose existence we have proved, is essentially unique.)
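The product identity in part (b) of Exercise 6 can be checked by machine in the smallest case $p = 2$. The sketch below is an illustration under stated assumptions: polynomials over $\mathbb{F}_2$ are stored as bitmasks (bit $i$ is the coefficient of $x^i$), irreducibles are found by striking out all products of two polynomials of positive degree, and the product is expanded only up to a fixed degree. All function names are hypothetical.

```python
def carryless_mul(a, b):
    """Multiply two polynomials over F_2 stored as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in F_2[x] is XOR
        a <<= 1
        b >>= 1
    return result


def irreducible_counts(max_deg):
    """counts[d] = number of irreducible polynomials of degree d over F_2."""
    limit = 1 << (max_deg + 1)           # bitmasks below this have degree <= max_deg
    reducible = set()
    for a in range(2, limit):            # a and b run over polynomials of degree >= 1
        for b in range(a, limit):
            prod = carryless_mul(a, b)
            if prod >= limit:            # degree too large; a larger b cannot lower it
                break
            reducible.add(prod)
    counts = [0] * (max_deg + 1)
    for f in range(2, limit):
        if f not in reducible:
            counts[f.bit_length() - 1] += 1
    return counts


def euler_product_coefficients(counts):
    """Coefficients of the product over d of (1 - z^d)^(-counts[d]), truncated."""
    max_deg = len(counts) - 1
    series = [1] + [0] * max_deg
    for d in range(1, max_deg + 1):
        for _ in range(counts[d]):
            for i in range(d, max_deg + 1):      # multiply in place by 1/(1 - z^d)
                series[i] += series[i - d]
    return series


counts = irreducible_counts(6)
print(counts[1:])
print(euler_product_coefficients(counts))        # compare with 1, 2, 4, 8, ...
```

Each factor $1 + z^{\deg P} + z^{2\deg P} + \cdots$ equals $1/(1 - z^{\deg P})$, which is why the expansion only needs the number of irreducibles of each degree.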
7. (E. Berlekamp) Let $p$ be a prime number. We recall that polynomials in a single variable (mod $p$) factor uniquely into irreducible polynomials. Thus a monic polynomial $f(x)$ can be expressed uniquely (mod $p$) in the form $g(x)h(x)^2$ where $g(x)$ is square-free (mod $p$) and both $g$ and $h$ are monic. Let $s_n$ denote the number of monic square-free polynomials (mod $p$) of degree $n$. Show that
$$\Big(\sum_{k=0}^\infty s_k z^k\Big)\Big(\sum_{m=0}^\infty p^m z^{2m}\Big) = \sum_{n=0}^\infty p^n z^n$$
$$\sum_{k=0}^\infty s_k z^k = \frac{1 - pz^2}{1 - pz},$$
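Expanding $(1 - pz^2)/(1 - pz)$ as a power series gives $s_0 = 1$, $s_1 = p$ and $s_n = p^n - p^{n-1}$ for $n \ge 2$. The following sketch checks this for small cases by brute force. It treats a monic polynomial as non-square-free exactly when it can be written as $g h^2$ with $h$ monic of positive degree; polynomials are coefficient tuples, lowest degree first, and all names are illustrative.

```python
from itertools import product


def poly_mul(a, b, p):
    """Product of two coefficient tuples modulo p (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return tuple(out)


def monic_polys(deg, p):
    """All monic polynomials of the given degree over F_p."""
    for lower in product(range(p), repeat=deg):
        yield lower + (1,)


def count_squarefree(n, p):
    """Number of monic square-free polynomials of degree n over F_p, by enumeration."""
    non_squarefree = set()
    for m in range(1, n // 2 + 1):                  # m = deg h >= 1
        for h in monic_polys(m, p):
            h_squared = poly_mul(h, h, p)
            for g in monic_polys(n - 2 * m, p):
                non_squarefree.add(poly_mul(g, h_squared, p))
    return p**n - len(non_squarefree)


def predicted(n, p):
    """Coefficient of z^n in (1 - p z^2) / (1 - p z)."""
    return 1 if n == 0 else p if n == 1 else p**n - p ** (n - 1)


print([count_squarefree(n, 3) for n in range(5)])
print([predicted(n, 3) for n in range(5)])
```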
induction that
$$A(z) - B(z) = \prod_{k=0}^\infty \big(1 - z^{2^k}\big)$$
since $R(u)$ is constant in the interval $[n-1, n)$. The integrals combine to give (1.7).
If $|R(u)| \le \varepsilon$ for all $u \ge M$ and if $\sigma > \sigma_0$, then from (1.7) we see that
$$\Big|\sum_{n=M+1}^N a_n n^{-s}\Big| \le 2\varepsilon + \varepsilon|s - s_0| \int_M^\infty u^{\sigma_0-\sigma-1}\,du \le \Big(2 + \frac{|s - s_0|}{\sigma - \sigma_0}\Big)\varepsilon.$$
For $s$ in the prescribed region we see that
$$|s - s_0| \le \sigma - \sigma_0 + |t - t_0| \le (H + 1)(\sigma - \sigma_0),$$
so that the sum $\sum_{M+1}^N a_n n^{-s}$ is uniformly small, and the result follows by the uniform version of Cauchy's principle.
Weierstrass that $\alpha(s)$ is analytic for $\sigma > \sigma_c$, and that the differentiated series is locally uniformly convergent to $\alpha'(s)$:
$$\alpha'(s) = -\sum_{n=1}^\infty a_n (\log n) n^{-s} \tag{1.8}$$
need not exist. However, $\alpha(s)$ is continuous in the sector $S$ of Theorem 1.1, in view of the uniform convergence there. That is,
$$\lim_{\substack{s \to s_0 \\ s \in S}} \alpha(s) = \alpha(s_0), \tag{1.9}$$
Let $\phi$ denote the left-hand side of (1.11). If $\theta > \phi$ then $A(x) \ll x^\theta$ where the implicit constant may depend on the $a_n$ and on $\theta$. Thus if $\sigma > \theta$, then the integral in (1.10) is absolutely convergent. Thus we obtain (1.10) by letting $N \to \infty$, since the first term above tends to 0 as $N \to \infty$.
Suppose that $\sigma_c < 0$. By Corollary 1.2 we know that $A(x)$ tends to a finite limit as $x \to \infty$, and hence $\phi \le 0$, so that (1.10) holds for all $\sigma > 0$.
Now suppose that $\sigma_c \ge 0$. By Corollary 1.2 we know that the series in (1.10) diverges when $\sigma < \sigma_c$. Hence $\phi \ge \sigma_c$. To complete the proof it suffices to show that $\phi \le \sigma_c$. Choose $\sigma_0 > \sigma_c$. By (1.7) with $s = 0$ and $M = 0$ we see that
$$A(N) = -R(N)N^{\sigma_0} + \sigma_0 \int_0^N R(u) u^{\sigma_0-1}\,du.$$
Since $R(u)$ is a bounded function, it follows that $A(N) \ll N^{\sigma_0}$ where the implicit constant may depend on the $a_n$ and on $\sigma_0$. Hence $\phi \le \sigma_0$. Since this holds for any $\sigma_0 > \sigma_c$, we conclude that $\phi \le \sigma_c$.
converges for $\sigma > 0$, but it is absolutely convergent only for $\sigma > 1$. In general we let $\sigma_a$ denote the infimum of those $\sigma$ for which $\sum_{n=1}^\infty |a_n| n^{-\sigma} < \infty$. Then $\sigma_a$, the abscissa of absolute convergence, is the abscissa of convergence of the series $\sum_{n=1}^\infty |a_n| n^{-s}$, and we see that $\sum a_n n^{-s}$ is absolutely convergent if $\sigma > \sigma_a$, but not if $\sigma < \sigma_a$. We now show that the strip $\sigma_c \le \sigma \le \sigma_a$ of conditional convergence is never wider than in the example (1.12).
Proof The first inequality is obvious. To prove the second, suppose that $\varepsilon > 0$. Since the series $\sum a_n n^{-\sigma_c-\varepsilon}$ is convergent, the summands tend to 0, and hence $a_n \ll n^{\sigma_c+\varepsilon}$ where the implicit constant may depend on the $a_n$ and on $\varepsilon$. Hence the series $\sum a_n n^{-\sigma_c-1-2\varepsilon}$ is absolutely convergent by comparison with the series $\sum n^{-1-\varepsilon}$.
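Theorem 1.5 Suppose that $\alpha(s) = \sum a_n n^{-s}$ has abscissa of convergence $\sigma_c$. If $\delta$ and $\varepsilon$ are fixed, $0 < \varepsilon < \delta < 1$, then
$$\alpha(s) \ll \tau^{1-\delta+\varepsilon}$$
By the example found in Exercise 8 at the end of this section, we see that the bound above is reasonably sharp.
Since the series $\alpha(\sigma_c + \varepsilon)$ converges, we know that $a_n \ll n^{\sigma_c+\varepsilon}$, and also that $R(u) \ll 1$. Thus the above is
$$\ll \sum_{n=1}^M n^{-\delta+\varepsilon} + M^{-\delta+\varepsilon} + \frac{|\sigma_c + \varepsilon - s|}{\sigma - \sigma_c - \varepsilon} M^{\sigma_c+\varepsilon-\sigma}.$$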
By Theorem 1.4 this sum is absolutely convergent for $\sigma > \sigma_0 + 1$. Since each term tends to 0 as $\sigma \to \infty$, we see that the right-hand side tends to 0, by the principle of dominated convergence. Hence $c_N = 0$, and by induction we deduce that this holds for all $N$.
for $\sigma > 1$. That this is an entire function follows from Theorem 10.2.) Since a Dirichlet series does not in general have a singularity on its line of convergence, it is noteworthy that a Dirichlet series with non-negative coefficients not only has a singularity on the line $\sigma_c + it$, but actually at the point $\sigma_c$.
Theorem 1.7 (Landau) Let $\alpha(s) = \sum a_n n^{-s}$ be a Dirichlet series whose abscissa of convergence $\sigma_c$ is finite. If $a_n \ge 0$ for all $n$ then the point $\sigma_c$ is a singularity of the function $\alpha(s)$.
It is enough to assume that $a_n \ge 0$ for all sufficiently large $n$, since any finite sum $\sum_{n=1}^N a_n n^{-s}$ is an entire function.
Proof By replacing $a_n$ by $a_n n^{-\sigma_c}$, we may assume that $\sigma_c = 0$. Suppose that $\alpha(s)$ is analytic at $s = 0$, so that $\alpha(s)$ is analytic in the domain $D = \{s : \sigma > 0\} \cup \{|s| < \delta\}$ if $\delta > 0$ is sufficiently small. We expand $\alpha(s)$ as a power series at $s = 1$:
$$\alpha(s) = \sum_{k=0}^\infty c_k (s-1)^k. \tag{1.13}$$
for $|s - 1| < 1 + \delta'$. If $s < 1$ then all terms above are non-negative. Since series of non-negative numbers may be arbitrarily rearranged, for $-\delta' < s < 1$ we may interchange the summations over $k$ and $n$ to see that
$$\alpha(s) = \sum_{n=1}^\infty a_n n^{-1} \sum_{k=0}^\infty \frac{(1-s)^k (\log n)^k}{k!} = \sum_{n=1}^\infty a_n n^{-1} \exp\big((1-s)\log n\big) = \sum_{n=1}^\infty a_n n^{-s}.$$
Hence this last series converges at $s = -\delta'/2$, contrary to the assumption that $\sigma_c = 0$. Thus $\alpha(s)$ is not analytic at $s = 0$.
1.2.1 Exercises
1. Suppose that $\alpha(s)$ is a Dirichlet series, and that the series $\alpha(s_0)$ is boundedly oscillating. Show that $\sigma_c = \sigma_0$.
2. Suppose that $\alpha(s) = \sum_{n=1}^\infty a_n n^{-s}$ is a Dirichlet series with abscissa of convergence $\sigma_c$. Suppose that $\alpha(0)$ converges, and put $R(x) = \sum_{n>x} a_n$. Show that $\sigma_c$ is the infimum of those numbers $\theta$ such that $R(x) \ll x^\theta$.
3. Let $A_k(x) = \sum_{n \le x} a_n (\log n)^k$.
marks following the proof of Theorem 1.1 imply only that $\sigma_c' \le \sigma_c$.)
4. (Landau 1909b) Let $\alpha(s) = \sum a_n n^{-s}$ be a Dirichlet series with abscissa of convergence $\sigma_c$ and abscissa of absolute convergence $\sigma_a > \sigma_c$. Let $C(x) = \sum_{n \le x} a_n n^{-\sigma_c}$ and $A(x) = \sum_{n \le x} |a_n| n^{-\sigma_c}$.
(a) By a suitable application of Theorem 1.3, or otherwise, show that $C(x) \ll x^\varepsilon$ and that $A(x) \ll x^{\sigma_a-\sigma_c+\varepsilon}$ for any $\varepsilon > 0$, where the implicit constants may depend on $\varepsilon$ and on the sequence $\{a_n\}$.
(b) Show that if $\sigma > \sigma_c$ then
$$\sum_{n>N} a_n n^{-s} = -C(N)N^{\sigma_c-s} + (s - \sigma_c)\int_N^\infty C(u) u^{\sigma_c-s-1}\,du.$$
$$\alpha(s) \ll \tau^{\theta(\sigma)+\varepsilon}$$
$$\lim_{\sigma\to\infty} \alpha(\sigma) = a_1.$$
(b) Show that $\zeta'(s) = -\sum_{n=1}^\infty (\log n) n^{-s}$ for $\sigma > 1$.
(c) Show that $\lim_{\sigma\to\infty} \zeta'(\sigma) = 0$.
(d) Show that there is no half-plane in which $1/\zeta'(s)$ can be written as a convergent Dirichlet series.
6. Let $\alpha(s) = \sum a_n n^{-s}$ be a Dirichlet series with $a_n \ge 0$ for all $n$. Show that $\sigma_c = \sigma_a$, and that
(a) Show that $\sum_{n=t_r}^{2t_r} a_n = 0$.
(b) Show that if $t_r \le x < 2t_r$ for some $r$, then $A(x) = [x]^{it_r}$ where $A(x) = \sum_{n \le x} a_n$.
(c) Show that $A(x) \ll 1$ uniformly for $x \ge 1$.
(d) Deduce that $\alpha(s)$ converges for $\sigma > 0$.
(e) Show that $\alpha(it)$ does not converge; conclude that $\sigma_c = 0$.
(f) Show that if $\sigma > 0$, then
$$\alpha(s) = \sum_{r=1}^R \sum_{n=t_r}^{2t_r} a_n n^{-s} + s\int_{t_{R+1}}^\infty A(x) x^{-s-1}\,dx.$$
(i) Show that if $n \le x < n + 1$, then $\Re(n^{it_R} x^{-it_R}) \ge 1/2$. Deduce that
$$\Big|\int_{t_R}^{2t_R} [x]^{it_R} x^{-\sigma-it_R-1}\,dx\Big| \gg t_R^{-\sigma}.$$
(j) Suppose that $\delta > 0$ is fixed. Conclude that if $R \ge R_0(\delta)$, then $|\alpha(\sigma + it_R)| \gg t_R^{1-\sigma}$ uniformly for $\delta \le \sigma \le 1 - \delta$.
(k) Show that $\sum |a_n| n^{-\sigma} < \infty$ when $\sigma > 1$. Deduce that $\sigma_a = 1$.
The mere convergence of α(s) and β(s) is not sufficient to justify (1.2).
Indeed, the square of the series (1.12) can be shown to have abscissa of conver-
gence ≥ 1/4.
$$\prod_p \big(1 + f(p)p^{-s} + f(p^2)p^{-2s} + f(p^3)p^{-3s} + \cdots\big)$$
Set $n = p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$. Since $f$ is multiplicative, the above is $f(n)n^{-s}$. Moreover, this correspondence between products of prime powers and positive integers $n$ is one-to-one, in view of the fundamental theorem of arithmetic. Hence after rearranging the terms, we obtain the sum $\sum f(n)n^{-s}$. That is, we expect that
$$\sum_{n=1}^\infty f(n)n^{-s} = \prod_p \big(1 + f(p)p^{-s} + f(p^2)p^{-2s} + \cdots\big). \tag{1.14}$$
The product on the right-hand side is called the Euler product of the Dirichlet series. The mere convergence of the series on the left does not imply that the product converges; as in the case of the identity (1.2), we justify (1.14) only under the stronger assumption of absolute convergence.
Theorem 1.9 If $f$ is multiplicative and $\sum |f(n)|n^{-\sigma} < \infty$, then (1.14) holds.
Hence
$$\Big|\prod_{p \le y} \big(1 + f(p)p^{-s} + f(p^2)p^{-2s} + \cdots\big) - \sum_{n=1}^\infty f(n)n^{-s}\Big| \le \sum_{n \notin \mathcal{N}} |f(n)|n^{-\sigma}.$$
Let $\omega(n)$ denote the number of distinct primes dividing $n$, and let $\Omega(n)$ be the number of distinct prime powers dividing $n$. That is,
$$\omega(n) = \sum_{p \mid n} 1, \qquad \Omega(n) = \sum_{p^k \mid n} 1 = \sum_{p^k \| n} k. \tag{1.16}$$
It is easy to distinguish these functions, since $\omega(n) \le \Omega(n)$ for all $n$, with equality if and only if $n$ is square-free. These functions are examples of additive functions because they satisfy the functional relation $f(mn) = f(m) + f(n)$ whenever $(m, n) = 1$. Moreover, $\Omega(n)$ is totally additive because this functional relation holds for all pairs $m$, $n$. An exponential of an additive function is a multiplicative function. In particular, the Liouville lambda function is the totally multiplicative function $\lambda(n) = (-1)^{\Omega(n)}$. Closely related is the Möbius mu function, which is defined to be $\mu(n) = (-1)^{\omega(n)}$ if $n$ is square-free, $\mu(n) = 0$ otherwise. By the fundamental theorem of arithmetic we know that a multiplicative (or additive) function is uniquely determined by its values at prime powers, and similarly that a totally multiplicative (or totally additive) function is uniquely determined by its values at the primes. Thus $\mu(n)$ is the unique multiplicative function that takes the value $-1$ at every prime, and the value 0 at every higher power of a prime, while $\lambda(n)$ is the unique totally multiplicative function that takes the value $-1$ at every prime. By using Theorem 1.9 we can determine the Dirichlet series generating functions of $\lambda(n)$ and of $\mu(n)$ in terms of the Riemann zeta function.
Corollary 1.10 For $\sigma > 1$,
$$\sum_{n=1}^\infty n^{-s} = \zeta(s) = \prod_p (1 - p^{-s})^{-1}, \tag{1.17}$$
$$\sum_{n=1}^\infty \mu(n)n^{-s} = \frac{1}{\zeta(s)} = \prod_p (1 - p^{-s}), \tag{1.18}$$
and
$$\sum_{n=1}^\infty \lambda(n)n^{-s} = \frac{\zeta(2s)}{\zeta(s)} = \prod_p (1 + p^{-s})^{-1}. \tag{1.19}$$
Proof All three series are absolutely convergent, since $\sum n^{-\sigma} < \infty$ for $\sigma > 1$, by the integral test. Since the coefficients are multiplicative, the Euler product formulae follow by Theorem 1.9. In the first and third cases use the variant (1.15). On comparing the Euler products in (1.17) and (1.18), it is immediate that the second of these Dirichlet series is $1/\zeta(s)$. As for (1.19), from the identity $1 + z = (1 - z^2)/(1 - z)$ we deduce that
$$\prod_p (1 + p^{-s}) = \frac{\prod_p (1 - p^{-2s})}{\prod_p (1 - p^{-s})} = \frac{\zeta(s)}{\zeta(2s)}.$$
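The three identities of Corollary 1.10 can be sanity-checked at $s = 2$, using the standard values $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$. The sketch below builds $\mu(n)$ and $\lambda(n)$ from a smallest-prime-factor table and compares truncated sums and a truncated Euler product with the closed forms. The cutoff and the helper name `arithmetic_tables` are illustrative, and truncation means the agreement is only approximate.

```python
import math


def arithmetic_tables(n_max):
    """Return (mu, lam, primes) for 1..n_max using a smallest-prime-factor sieve."""
    spf = list(range(n_max + 1))                 # spf[n] = smallest prime factor of n
    for p in range(2, math.isqrt(n_max) + 1):
        if spf[p] == p:
            for m in range(p * p, n_max + 1, p):
                if spf[m] == m:
                    spf[m] = p
    mu = [0, 1] + [0] * (n_max - 1)
    lam = [0, 1] + [0] * (n_max - 1)
    for n in range(2, n_max + 1):
        p = spf[n]
        m = n // p
        lam[n] = -lam[m]                         # lambda is totally multiplicative
        mu[n] = 0 if m % p == 0 else -mu[m]      # mu vanishes when p^2 divides n
    primes = [n for n in range(2, n_max + 1) if spf[n] == n]
    return mu, lam, primes


N = 100_000
mu, lam, primes = arithmetic_tables(N)
zeta2, zeta4 = math.pi**2 / 6, math.pi**4 / 90

euler_product = math.prod(1 / (1 - p**-2) for p in primes)     # compare with (1.17)
mu_sum = sum(mu[n] / n**2 for n in range(1, N + 1))            # compare with (1.18)
lam_sum = sum(lam[n] / n**2 for n in range(1, N + 1))          # compare with (1.19)

print(euler_product, zeta2)
print(mu_sum, 1 / zeta2)
print(lam_sum, zeta4 / zeta2)
```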
$$\log \zeta(s) = \sum_p \log(1 - p^{-s})^{-1} = \sum_p \sum_{k=1}^\infty k^{-1} p^{-ks}.$$
for $\sigma > 1$. This is a Dirichlet series, whose $n$th coefficient is the von Mangoldt lambda function: $\Lambda(n) = \log p$ if $n$ is a power of $p$, $\Lambda(n) = 0$ otherwise.
and
$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \Lambda(n)n^{-s}.$$
for $\sigma > 1$.
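One consequence of the series for $-\zeta'(s)/\zeta(s)$ is worth checking by hand or by machine. Multiplying by $\zeta(s)$ and using $-\zeta'(s) = \sum (\log n) n^{-s}$ together with the coefficient rule (1.3) gives $\sum_{d \mid n} \Lambda(d) = \log n$. The following sketch verifies this numerically for small $n$; the function names are illustrative and the comparison uses a floating-point tolerance.

```python
import math


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p (exponent >= 1), else 0."""
    if n < 2:
        return 0.0
    p = n                                  # default: n itself is prime
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            p = d                          # smallest prime factor of n
            break
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def divisor_sum_of_lambda(n):
    """Sum of Lambda(d) over the divisors d of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)


for n in (1, 12, 30, 64, 97):
    print(n, divisor_sum_of_lambda(n), math.log(n))
```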
We now continue the zeta function beyond the half-plane in which it was
initially defined.
Here {u} denotes the fractional part of u, so that {u} = u − [u] where [u]
denotes the integral part of u.
We evaluate the first integral on the right-hand side, and integrate the second one by parts. Thus the above is
$$= \frac{x^{1-s}}{s-1} + \{x\}x^{-s} + \int_x^\infty \{u\}\,d u^{-s}.$$
Since $(u^{-s})' = -s u^{-s-1}$, the desired formula now follows by Theorem A.3.
The integral in (1.23) is convergent in the half-plane σ > 0, and uniformly so
for σ ≥ δ > 0. Since the integrand is an analytic function of s, it follows that the
integral is itself an analytic function for σ > 0. By the uniqueness of analytic
continuation the formula (1.23) holds in this larger half-plane.
In addition,
$$\sum_{n \le x} \frac{1}{n} = \log x + C_0 + O(1/x) \tag{1.26}$$
Proof The first estimate follows by crudely estimating the integral in (1.23):
$$\int_x^\infty \{u\}u^{-s-1}\,du \ll \int_x^\infty u^{-\sigma-1}\,du = \frac{x^{-\sigma}}{\sigma}.$$
As for the second estimate, we note that the sum is
$$\int_{1^-}^x u^{-1}\,d[u] = \int_{1^-}^x u^{-1}\,du - \int_{1^-}^x u^{-1}\,d\{u\} = \log x + 1 - \{x\}/x - \int_1^x \{u\}u^{-2}\,du.$$
The result now follows by writing $\int_1^x = \int_1^\infty - \int_x^\infty$, and noting that
$$\int_x^\infty \{u\}u^{-2}\,du \ll \int_x^\infty u^{-2}\,du = 1/x.$$
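The estimate (1.26) is easy to observe numerically at integer values of $x$. The sketch below uses the decimal expansion of Euler's constant $C_0$ quoted in the notes to this chapter, truncated to double precision, and compares $\sum_{n \le N} 1/n - \log N - C_0$ with $1/N$. The function name is illustrative.

```python
import math

C0 = 0.5772156649015329     # Euler's constant, rounded to double precision


def harmonic_error(N):
    """sum_{n <= N} 1/n - log N - C0 for a positive integer N."""
    harmonic = math.fsum(1.0 / n for n in range(1, N + 1))   # accurately rounded sum
    return harmonic - math.log(N) - C0


for N in (10, 100, 1000, 100_000):
    print(N, harmonic_error(N), 1.0 / N)
```

For these values the difference is positive and smaller than $1/N$, in line with the $O(1/x)$ term.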
Proof The first assertion is clear from (1.24). When $|t|$ is larger, we obtain a bound for $|\zeta(s)|$ by estimating the sum in (1.25). Assume that $x \ge 2$. We observe that
$$\sum_{n \le x} n^{-s} \ll \sum_{n \le x} n^{-\sigma} \ll 1 + \int_1^x u^{-\sigma}\,du$$
1.3.1 Exercises
1. Suppose that $f(mn) = f(m)f(n)$ whenever $(m, n) = 1$, and that $f$ is not identically 0. Deduce that $f(1) = 1$, and hence that $f$ is multiplicative.
2. (Stieltjes 1887) Suppose that $\sum a_n$ converges, that $\sum |b_n| < \infty$, and that $c_n$ is given by (1.3). Show that $\sum c_n$ converges to $(\sum a_n)(\sum b_n)$. (Hint: Write $\sum_{n \le x} c_n = \sum_{n \le x} b_n A(x/n)$ where $A(y) = \sum_{n \le y} a_n$.)
3. Determine $\sum \varphi(n)n^{-s}$, $\sum \sigma(n)n^{-s}$, and $\sum |\mu(n)|n^{-s}$ in terms of the zeta function. Here $\varphi(n)$ is Euler's 'totient function', which is the number of $a$, $1 \le a \le n$, such that $(a, n) = 1$.
4. Let $q$ be a positive integer. Show that if $\sigma > 1$, then
$$\sum_{\substack{n=1 \\ (n,q)=1}}^\infty n^{-s} = \zeta(s)\prod_{p \mid q} (1 - p^{-s}).$$
6. Let $\sigma_a(n) = \sum_{d \mid n} d^a$. Show that
$$\sum_{n=1}^\infty \sigma_a(n)\sigma_b(n)n^{-s} = \zeta(s)\zeta(s-a)\zeta(s-b)\zeta(s-a-b)/\zeta(2s-a-b)$$
for $\sigma > \sigma_c$.
(b) Show that
$$\sum_{n=1}^\infty (-1)^{n-1} n^{-s} = (1 - 2^{1-s})\zeta(s)$$
for $\sigma > 0$.
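The alternating series in part (b) gives a practical way to evaluate $\zeta(s)$ for real $s$ with $0 < s < 1$, where the defining series (1.4) diverges. The sketch below is an illustration under stated assumptions: it handles real $s > 0$, $s \ne 1$ only, and it averages two consecutive partial sums to speed up convergence, which is a numerical convenience chosen here and not part of the exercise. Function names are illustrative.

```python
import math


def eta(s, n_terms=100_000):
    """Approximate sum_{n>=1} (-1)^(n-1) n^(-s) for real s > 0."""
    partial = 0.0
    sign = 1.0
    for n in range(1, n_terms + 1):
        partial += sign * n ** (-s)
        sign = -sign
    following = partial + sign * (n_terms + 1) ** (-s)
    # Consecutive partial sums bracket the limit; their mean is much closer to it.
    return 0.5 * (partial + following)


def zeta_from_eta(s):
    """zeta(s) recovered from the identity in part (b); requires s != 1."""
    return eta(s) / (1.0 - 2.0 ** (1.0 - s))


print(zeta_from_eta(2.0), math.pi**2 / 6)
print(zeta_from_eta(0.5))
```

The value printed for $s = 1/2$ is negative, which is consistent with the factor $1 - 2^{1-s}$ being negative there while the alternating sum is positive.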
Thus if
$$\varepsilon_k \log\frac{N_k}{N_{k-1}} \gg 1 \tag{1.31}$$
then the series is not uniformly convergent in $D$.
(b) By using Corollary 1.15, or otherwise, show that if $(a, b] \subseteq (N_{k-1}, N_k]$, then
$$\sum_{a < n \le b} a_n n^{-1} \ll \frac{\varepsilon_k}{t_k}.$$
Hence if
$$\sum_{k=1}^\infty \frac{\varepsilon_k}{t_k} < \infty, \tag{1.32}$$
then the series $\alpha(1)$ converges.
(c) Show that the parameters can be chosen so that (1.30)–(1.32) hold, say by taking $N_k = \exp(1/\varepsilon_k)$ and $t_k = \varepsilon_k^{1/2}$ with $\varepsilon_k$ tending rapidly to 0.
14. Let $t(n) = (-1)^{\Omega(n)-\omega(n)} \prod_{p \mid n} (p-1)^{-1}$, and put $T(s) = \sum_n t(n)n^{-s}$.
(a) Show that for $\sigma > 0$, $T(s)$ has the absolutely convergent Euler product
$$T(s) = \prod_p \Big(1 + \frac{1}{(p-1)(p^s+1)}\Big).$$
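(c) Deduce that $\zeta(s, \alpha)$ is an analytic function of $s$ for $\sigma > 0$ apart from a simple pole at $s = 1$ with residue 1.
(d) Show that
$$\lim_{s \to 1}\Big(\zeta(s, \alpha) - \frac{1}{s-1}\Big) = 1/\alpha - \log \alpha - \int_0^\infty \frac{\{u\}}{(u+\alpha)^2}\,du.$$
(e) Show that
$$\lim_{s \to 1}\Big(\zeta(s, \alpha) - \frac{1}{s-1}\Big) = \sum_{0 \le n \le x} \frac{1}{n+\alpha} - \log(x+\alpha) + \frac{\{x\}}{x+\alpha} - \int_x^\infty \frac{\{u\}}{(u+\alpha)^2}\,du.$$
(f) Let $x \to \infty$ in the above, and use (C.2), (C.10) to show that
$$\lim_{s \to 1}\Big(\zeta(s, \alpha) - \frac{1}{s-1}\Big) = -\frac{\Gamma'}{\Gamma}(\alpha).$$
(This is consistent with Corollary 1.16, in view of (C.11).)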
1.4 Notes
Section 1.1. For a brief introduction to the Hardy–Littlewood circle method,
including its application to Waring’s problem, see Davenport (2005). For a
comprehensive account of the method, see Vaughan (1997). Other examples
of the fruitful use of generating functions are found in many sources, such as
Andrews (1976) and Wilf (1994).
Algorithms for the efficient computation of π(x) have been developed by Meissel (Lehmer, 1959), Mapes (1963), Lagarias, Miller & Odlyzko (1985), Deléglise & Rivat (1996), and by X. Gourdon. For discussion of these methods, see Chapter 1 of Riesel (1994) and the web page of Gourdon & Sebah at http://numbers.computation.free.fr/Constants/Primes/countingPrimes.html.
The ‘big oh’ notation was introduced by Paul Bachmann (1894, p. 401). The ‘little oh’ was introduced by Edmund Landau (1909a, p. 61). The $\asymp$ notation was introduced by Hardy (1910, p. 2). Our notation f ∼ g also follows Hardy (1910). The Omega notation was introduced by G. H. Hardy and J. E. Littlewood (1914, p. 225). Ingham (1932) replaced the $\Omega_R$ and $\Omega_L$ of Hardy and Littlewood by $\Omega_+$ and $\Omega_-$. The $\ll$ notation is due to I. M. Vinogradov.
Section 1.2. The series $\sum a_n n^{-s}$ is called an ordinary Dirichlet series, to distinguish it from a generalized Dirichlet series, which is a sum of the form $\sum a_n e^{-\lambda_n s}$ where $0 < \lambda_1 < \lambda_2 < \cdots$, $\lambda_n \to \infty$. We see that generalized Dirichlet series include both ordinary Dirichlet series ($\lambda_n = \log n$) and power series ($\lambda_n = n$). Theorems 1.1, 1.3, 1.6, and 1.7 extend naturally to generalized Dirichlet series, and even to the more general class of functions $\int_0^\infty e^{-us}\,dA(u)$ where $A(u)$ is assumed to have finite variation on each finite interval $[0, U]$. The proof of the general form of Theorem 1.6 must be modified to depend on uniform, rather than absolute, convergence, since a generalized Dirichlet series may be never more than conditionally convergent (e.g., $\sum (-1)^n (\log n)^{-s}$). If we put $a = \limsup (\log n)/\lambda_n$, then the general form of Theorem 1.4 reads $\sigma_c \le \sigma_a \le \sigma_c + a$. Hardy & Riesz (1915) have given a detailed account of this subject, with historical attributions. See also Bohr & Cramér (1923).
Jensen (1884) showed that the domain of convergence of a generalized
Dirichlet series is always a half-plane. The more precise information provided
by Theorem 1.1 is due to Cahen (1894) who proved it not only for ordinary
Dirichlet series but also for generalized Dirichlet series.
The construction in Exercise 1.2.8 would succeed with the simpler choice $a_n = n^{it_r}$ for $t_r \le n \le 2t_r$, $a_n = 0$ otherwise, but then to complete the argument one would need a further tool, such as the Kusmin–Landau inequality (cf. Mordell 1958). The square of the Dirichlet series in Exercise 1.2.8 has abscissa of convergence 1/2; this bears on the result of Exercise 2.1.9. Information concerning the convergence of the product of two Dirichlet series is found in Exercises 1.3.2, 2.1.9, 5.2.16, and in Hardy & Riesz (1915).
Theorem 1.7 originates in Landau (1905). The analogue for power series had
been proved earlier by Vivanti (1893) and Pringsheim (1894). Landau’s proof
extends to generalized Dirichlet series (including power series).
Section 1.3. The hypothesis $\sum |f(n)|n^{-\sigma} < \infty$ of Theorem 1.9 is equivalent to the assertion that
which is slightly stronger than merely asserting that the Euler product converges absolutely. We recall that a product $\prod_n (1 + a_n)$ is said to be absolutely convergent if $\prod_n (1 + |a_n|) < \infty$. To see that the hypothesis $\prod_p (1 + |f(p)p^{-s} + \cdots|) < \infty$ is not sufficient, consider the following example due to Ingham: For every prime $p$ we take $f(p) = 1$, $f(p^2) = -1$, and $f(p^k) = 0$ for $k > 2$. Then the product is absolutely convergent at $s = 0$, but the terms $f(n)$ do not tend to 0, and hence the series $\sum f(n)$ diverges. Indeed, it can be shown that $\sum_{n \le x} f(n) \sim cx$ as $x \to \infty$ where $c = \prod_p (1 - 2p^{-2} + p^{-3}) > 0$.
Euler (1735) defined the constant $C_0$, which he denoted $C$. Mascheroni (1790) called the constant γ, which is in common use, but we wish to reserve this symbol for the imaginary part of a zero of the zeta function or an L-function. It is conjectured that Euler's constant $C_0$ is irrational. The early history of the determination of the initial digits of $C_0$ has been recounted by Nielsen (1906, pp. 8–9). More recently, Wrench (1952) computed 328 digits, Knuth (1963) computed 1,271 digits, Sweeney (1963) computed 3,566 digits, Beyer & Waterman (1974) computed 4,879 digits, Brent (1977) computed 20,700 digits, Brent & McMillan (1980) computed 30,100 digits. At this time, it seems that more than $10^8$ digits have been computed; see the web page of X. Gourdon & P. Sebah at http://numbers.computation.free.fr/Constants/Gamma/gamma.html. To 50 places, Euler's constant is
$C_0$ = 0.57721 56649 01532 86060 65120 90082 40243 10421 59335 93992.
For work on Dirichlet series with natural boundaries see Estermann (1928a,b) and
Kurokawa (1987).
1.5 References
Andrews, G. E. (1976). The Theory of Partitions, Reprint. Cambridge: Cambridge Uni-
versity Press (1998).
Bachmann, P. (1894). Zahlentheorie, II, Die analytische Zahlentheorie, Leipzig:
Teubner.
Beyer, W. A. & Waterman, M. S. (1974). Error analysis of a computation of Euler’s
constant and ln 2, Math. Comp. 28, 599–604.
Bohr, H. (1910). Bidrag til de Dirichlet’ske Rækkers theori, København: G. E. C. Gad;
Collected Mathematical Works, Vol. I, København: Danske Mat. Forening, 1952.
A3.
Bohr, H. & Cramér, H. (1923). Die neuere Entwicklung der analytischen Zahlentheo-
rie, Enzyklopädie der Mathematischen Wissenschaften, 2, C8, 722–849; H. Bohr,
Collected Mathematical Works, Vol. III, København: Dansk Mat. Forening, 1952,
H; H. Cramér, Collected Works, Vol. 1, Berlin: Springer-Verlag, 1952, pp. 289–
416.
Brent, R. P. (1977). Computation of the regular continued fraction of Euler’s constant,
Math. Comp. 31, 771–777.
Brent, R. P. & McMillan, E. M. (1980). Some new algorithms for high-speed computation
of Euler’s constant, Math. Comp. 34, 305–312.
Cahen, E. (1894). Sur la fonction ζ (s) de Riemann et sur des fonctions analogues, Ann.
de l’École Normale (3) 11, 75–164.
Davenport, H. (2005). Analytic Methods for Diophantine Equations and Diophantine
Inequalities. Second edition, Cambridge: Cambridge University Press.
Deléglise, M. & Rivat, J. (1996). Computing π (x): the Meissel, Lehmer, Lagarias, Miller,
Odlyzko method, Math. Comp. 65, 235–245.
Estermann, T. (1928a). On certain functions represented by Dirichlet series, Proc. Lon-
don Math. Soc. (2) 27, 435–448.
(1928b). On a problem of analytic continuation, Proc. London Math. Soc. (2) 27,
471–482.
Euler, L. (1735). De Progressionibus harmonicus observationes, Comm. Acad. Sci. Imper.
Petropol. 7, 157; Opera Omnia, ser. 1, vol. 14, Teubner, 1914, pp. 93–95.
Hardy, G. H. (1910). Orders of Infinity. Cambridge Tract 12, Cambridge: Cambridge
University Press.
Hardy, G. H. & Littlewood, J. E. (1914). Some problems of Diophantine approximation
(II), Acta Math. 37, 193–238; Collected Papers, Vol I. Oxford: Oxford University
Press. 1966, pp. 67–112.
Hardy, G. H. & Riesz, M. (1915). The General Theory of Dirichlet’s Series, Cambridge
Tract No. 18. Cambridge: Cambridge University Press. Reprint: Stechert–Hafner
(1964).
Ingham, A. E. (1932). The Distribution of Prime Numbers, Cambridge Tract 30. Cam-
bridge: Cambridge University Press.
34 Dirichlet series: I
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F(n) = c.$$
In this section we develop a simple method by which mean values can be shown
to exist in many interesting cases.
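If two arithmetic functions $f$ and $F$ are related by the identity
$$F(n) = \sum_{d|n} f(d), \tag{2.1}$$
then $f$ can be recovered from $F$:
$$f(n) = \sum_{d|n} \mu(d) F(n/d). \tag{2.2}$$
This is the Möbius inversion formula. Conversely, if (2.2) holds for all $n$ then
so also does (2.1). If $f$ is generally small then $F$ has an asymptotic mean value.
To see this, observe that
$$\sum_{n \le x} F(n) = \sum_{n \le x} \sum_{d|n} f(d).$$
By iterating the sums in the reverse order, we see that the above is
$$= \sum_{d \le x} f(d) \sum_{\substack{n \le x \\ d|n}} 1 = \sum_{d \le x} f(d)[x/d].$$
Since $[y] = y + O(1)$, this is
$$= x \sum_{d \le x} \frac{f(d)}{d} + O\Bigl(\sum_{d \le x} |f(d)|\Bigr). \tag{2.3}$$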
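The interchange of summations above can be confirmed on small cases. This is a minimal sketch with hypothetical helper names; it compares the double sum over $n$ and $d \mid n$ with the single sum $\sum_{d \le x} f(d)[x/d]$ for two integer-valued choices of $f$.

```python
def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def divisor_sum_direct(f, x):
    """sum_{n <= x} sum_{d | n} f(d), computed as written."""
    total = 0
    for n in range(1, x + 1):
        total += sum(f(d) for d in range(1, n + 1) if n % d == 0)
    return total


def divisor_sum_swapped(f, x):
    """sum_{d <= x} f(d) * [x/d], the sums taken in the reverse order."""
    return sum(f(d) * (x // d) for d in range(1, x + 1))


if __name__ == "__main__":
    print(divisor_sum_direct(mobius, 50), divisor_sum_swapped(mobius, 50))
```

With $f = \mu$ both sides equal 1 for every $x \ge 1$, since $\sum_{d \mid n} \mu(d)$ vanishes unless $n = 1$.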
36 The elementary theory of arithmetic functions
by Corollary 1.10. From Corollary B.3 we know that $\zeta(2) = \pi^2/6$; hence the
proof is complete.
Let $Q(x)$ denote the number of square-free integers not exceeding $x$, $Q(x) =
\sum_{n \le x} \mu(n)^2$. We now calculate the asymptotic density of these numbers,
starting from the identity
$$\mu(n)^2 = \sum_{d^2|n} \mu(d). \tag{2.4}$$
This is a relation of the shape (2.1) where $f(d) = \mu(\sqrt{d})$ if $d$ is a perfect square,
and $f(d) = 0$ otherwise. Hence by (2.3),
$$Q(x) = x \sum_{d^2 \le x} \frac{\mu(d)}{d^2} + O\Bigl(\sum_{d^2 \le x} 1\Bigr).$$
The error term is $\ll x^{1/2}$, and the sum in the main term is treated as in the
preceding proof.
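The count of square-free integers can be computed exactly from the relation $\mu(n)^2 = \sum_{d^2 \mid n} \mu(d)$, which gives $Q(x) = \sum_{d^2 \le x} \mu(d)[x/d^2]$. The sketch below (function names are illustrative, not from the text) evaluates this and compares it with the main term $6x/\pi^2$.

```python
from math import isqrt, pi


def mobius_sieve(limit):
    """Table mu[0..limit] of the Moebius function (mu[0] is unused)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one sign change per prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0               # not square-free
    return mu


def squarefree_count(x):
    """Q(x) = sum over d with d^2 <= x of mu(d) * [x / d^2]."""
    r = isqrt(x)
    mu = mobius_sieve(r)
    return sum(mu[d] * (x // (d * d)) for d in range(1, r + 1))


if __name__ == "__main__":
    x = 10**6
    print(squarefree_count(x), 6 * x / pi**2)
```

Only about $x^{1/2}$ terms are needed, in line with the size of the error term.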
We note that the argument above is routine once the appropriate identity
(2.4) is established. This relation can be discovered by considering (2.2), or by
using Dirichlet series: Let Q denote the class of square-free numbers. Then for
σ > 1,
$$\sum_{n \in Q} n^{-s} = \prod_p \bigl(1 + p^{-s}\bigr) = \prod_p \frac{1 - p^{-2s}}{1 - p^{-s}} = \frac{\zeta(s)}{\zeta(2s)}.$$
Now $1/\zeta(2s)$ can be written as a Dirichlet series in $s$, with coefficients $f(n) =
\mu(d)$ if $n = d^2$, $f(n) = 0$ otherwise. Hence the convolution equation (2.4) gives
the coefficients of the product Dirichlet series $\zeta(s) \cdot 1/\zeta(2s)$.
Suppose that $a_k$, $b_m$, $c_n$ are joined by the convolution relation
$$c_n = \sum_{km=n} a_k b_m, \tag{2.5}$$
and that $A(x)$, $B(x)$, $C(x)$ are their respective summatory functions. Then
$$C(x) = \sum_{km \le x} a_k b_m, \tag{2.6}$$
and it is useful to note that this double sum can be iterated in various ways. On
one hand we see that
$$C(x) = \sum_{k \le x} a_k B(x/k); \tag{2.7}$$
this is the line of reasoning that led to (2.3) (take $a_k = f(k)$, $b_m = 1$). At the
opposite extreme,
$$C(x) = \sum_{m \le x} b_m A(x/m), \tag{2.8}$$
and more generally
$$C(x) = \sum_{k \le y} a_k B(x/k) + \sum_{m \le x/y} b_m A(x/m) - A(y)B(x/y) \tag{2.9}$$
for $0 < y \le x$. This is obvious once it is observed that the first term on the right
sums those terms ak bm for which km ≤ x, k ≤ y, the second sum includes the
pairs (k, m) for which km ≤ x, m ≤ x/y, and the third term subtracts those ak bm
for which k ≤ y, m ≤ x/y, since these (k, m) were included in both the previous
terms. The advantage of (2.9) over (2.7) is that the number of terms is reduced
($\ll y + x/y$ instead of $x$), and at the same time $A$ and $B$ are evaluated only
at large values of the argument, so that asymptotic formulæ for these quantities
may be expected to be more accurate. For example, if we wish to estimate the
average size of $d(n)$ we take $a_k = b_m = 1$, and then from (2.3) we see that
$$\sum_{n \le x} d(n) = x \log x + O(x).$$
To obtain a more accurate estimate we observe that the first term on the
right-hand side of (2.9) is
$$\sum_{k \le y} [x/k] = x \sum_{k \le y} 1/k + O(y).$$
Here the error term is minimized by taking $y = x^{1/2}$. The second term
on the right in (2.9) is then identical to the first, and the third term is
$[x^{1/2}]^2 = x + O(x^{1/2})$, and we have
$$\sum_{n \le x} d(n) = x \log x + (2C_0 - 1)x + O\bigl(x^{1/2}\bigr).$$
We often construct estimates with one or more parameters, and then choose
values of the parameters to optimize the result. The instance above is typical –
we minimized $x/y + y$ by taking $y = x^{1/2}$. Suppose, more generally, that we
wish to minimize $T_1(y) + T_2(y)$ where $T_1$ is a decreasing function, and $T_2$ is
an increasing function. We could differentiate and solve for a root of $T_1'(y) +
T_2'(y) = 0$, but there is a quicker method: Find $y_0$ so that $T_1(y_0) = T_2(y_0)$. This
does not necessarily yield the exact minimum value of $T_1(y) + T_2(y)$, but it is
easy to see that
$$T_1(y_0) \le \min_y \bigl(T_1(y) + T_2(y)\bigr) \le 2T_1(y_0),$$
so the bound obtained in this way is at most twice the optimal bound.
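For the divisor function, the hyperbola identity with $a_k = b_m = 1$ and $y = x^{1/2}$ reads $\sum_{n \le x} d(n) = 2\sum_{k \le \sqrt{x}} [x/k] - [\sqrt{x}]^2$. The following sketch (function names are illustrative) evaluates both this and the one-sided iteration $\sum_{k \le x} [x/k]$, and compares the result with the main terms $x \log x + (2C_0 - 1)x$, using the value of $C_0$ quoted in the notes to Chapter 1.

```python
from math import isqrt, log

C0 = 0.5772156649015329  # Euler's constant, truncated to double precision


def divisor_sum_hyperbola(x):
    """sum_{n <= x} d(n) using about 2*sqrt(x) terms."""
    r = isqrt(x)
    return 2 * sum(x // k for k in range(1, r + 1)) - r * r


def divisor_sum_one_sided(x):
    """The same sum iterated one way only: sum_{k <= x} [x/k]."""
    return sum(x // k for k in range(1, x + 1))


def divisor_main_term(x):
    """x log x + (2 C0 - 1) x."""
    return x * log(x) + (2 * C0 - 1) * x


if __name__ == "__main__":
    x = 10**6
    print(divisor_sum_hyperbola(x), divisor_main_term(x))
```

The two exact evaluations agree, but the hyperbola form needs far fewer terms, which is the practical content of the remark about the number of terms being reduced.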
Despite the great power of analytic techniques, the ‘method of the hyperbola’
used above is a valuable tool. The sequence cn given by (2.5) is called the
Dirichlet convolution of ak and bm ; in symbols, c = a ∗ b. Arithmetic functions
form a ring when equipped with pointwise addition, (a + b)n = an + bn , and
2.1 Mean values 39
Dirichlet convolution for multiplication. This ring is called the ring of formal
Dirichlet series. Manipulations of arithmetic functions in this way correspond
to manipulations of Dirichlet series without regard to convergence. This is
analogous to the ring of formal power series, in which multiplication is provided
by Cauchy convolution, $c_n = \sum_{k+m=n} a_k b_m$.
In the ring of formal Dirichlet series we let $O$ denote the arithmetic function
that is identically 0; this is the additive identity. The multiplicative identity is $i$
where $i_1 = 1$, $i_n = 0$ for $n > 1$. The arithmetic function that is identically 1 we
denote by $\mathbf{1}$, and we similarly abbreviate $\mu(n)$, $\Lambda(n)$, and $\log n$ by $\mu$, $\Lambda$, and
$L$. In this notation, the characteristic property of $\mu(n)$ is that $\mu * \mathbf{1} = i$, which
is to say that $\mu$ and $\mathbf{1}$ are convolution inverses of each other, and the Möbius
inversion formula takes the compact form
$$a * \mathbf{1} = b \iff a = b * \mu.$$
In the elementary study of prime numbers the relations $\Lambda * \mathbf{1} = L$, $L * \mu = \Lambda$
are fundamental.
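These convolution identities are easy to test on truncated sequences. The sketch below is illustrative only: arithmetic functions are represented as Python lists indexed by $n$ (index 0 unused), and the helper names are hypothetical.

```python
from math import isclose, log


def dirichlet_convolve(a, b):
    """Dirichlet convolution c = a * b, truncated at n = len(a) - 1."""
    n = len(a) - 1
    c = [0] * (n + 1)
    for k in range(1, n + 1):
        if a[k] == 0:
            continue
        for m in range(1, n // k + 1):
            c[k * m] += a[k] * b[m]
    return c


def arithmetic_tables(n):
    """Return (mu, Lambda, L, one, i) as lists of length n + 1."""
    mu = [1] * (n + 1)
    lam = [0.0] * (n + 1)
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if not is_prime[p]:
            continue
        for m in range(p, n + 1, p):
            if m > p:
                is_prime[m] = False
            mu[m] = -mu[m]
        for m in range(p * p, n + 1, p * p):
            mu[m] = 0
        q = p
        while q <= n:               # Lambda(p^k) = log p
            lam[q] = log(p)
            q *= p
    mu[0] = 0
    L = [0.0] + [log(m) for m in range(1, n + 1)]
    one = [0] + [1] * n
    i = [0, 1] + [0] * (n - 1)
    return mu, lam, L, one, i


if __name__ == "__main__":
    mu, lam, L, one, i = arithmetic_tables(30)
    print(dirichlet_convolve(mu, one) == i)
```

The first identity holds exactly in integer arithmetic; the two involving $\Lambda$ and $L$ hold up to floating-point rounding.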
2.1.1 Exercises
1. (de la Vallée Poussin 1898; cf. Landau 1911) Show that
$$\sum_{n \le x} \{x/n\} = (1 - C_0)x + O\bigl(x^{1/2}\bigr)$$
for σ > 1.
for x ≥ 1.
4. (cf. Evelyn & Linfoot 1930) Let $N$ be a positive integer, and suppose that
$P$ is square-free.
(a) Show that the number of residue classes $n \pmod{P^2}$ for which $(n, P^2)$
is square-free and $(N - n, P^2)$ is square-free is
$$P^2 \prod_{\substack{p|P \\ p^2|N}} \Bigl(1 - \frac{1}{p^2}\Bigr) \prod_{\substack{p|P \\ p^2 \nmid N}} \Bigl(1 - \frac{2}{p^2}\Bigr).$$
(b) Show that the number of integers $n$, $0 < n < N$, for which $(n, P^2)$ is
square-free and $(N - n, P^2)$ is square-free is
$$N \prod_{\substack{p|P \\ p^2|N}} \Bigl(1 - \frac{1}{p^2}\Bigr) \prod_{\substack{p|P \\ p^2 \nmid N}} \Bigl(1 - \frac{2}{p^2}\Bigr) + O(P^2).$$
(c) Show that the number of $n$, $0 < n < N$, such that $n$ is divisible by the
square of a prime $> y$ is $\ll N/y$.
(d) Take $P$ to be the product of all primes not exceeding $y$. By letting $y$
tend to infinity slowly, show that the number of ways of writing $N$ as
a sum of two square-free integers is $\sim c(N)N$ where
$$c(N) = a \prod_{p^2|N} \Bigl(1 + \frac{1}{p^2 - 2}\Bigr), \qquad a = \prod_p \Bigl(1 - \frac{2}{p^2}\Bigr).$$
5. (cf. Hille 1937) Suppose that f (x) and F(x) are complex-valued functions
defined on [1, ∞). Show that
$$F(x) = \sum_{n \le x} f(x/n)$$
for all x.
6. (cf. Hartman & Wintner 1947) Suppose that $\sum |f(n)| d(n) < \infty$, and that
$\sum |F(n)| d(n) < \infty$. Show that
$$F(n) = \sum_{\substack{m \\ n|m}} f(m)$$
7. (Jarník 1926; cf. Bombieri & Pila 1989) Let $C$ be a simple closed curve in
the plane, of arc length $L$. Show that the number of ‘lattice points’ $(m, n)$,
$m, n \in \mathbb{Z}$, lying on $C$ is at most $L + 1$. Show that if $C$ is strictly convex
then the number of lattice points on $C$ is $\ll 1 + L^{2/3}$, and that this estimate
is best possible.
8. Let C be a simple closed curve in the plane, of arc length L that encloses
a region of area A. Let N be the number of lattice points inside C. Show
that |N − A| ≤ 3(L + 1).
9. Let $r(n)$ be the number of pairs $(j, k)$ of integers such that $j^2 + k^2 = n$.
Show that
$$\sum_{n \le x} r(n) = \pi x + O\bigl(x^{1/2}\bigr).$$
10. (Stieltjes 1887) Suppose that $\sum a_n$, $\sum b_n$ are convergent series, and that
$c_n = \sum_{km=n} a_k b_m$. Show that $\sum c_n n^{-1/2}$ converges. (Hence if two Dirichlet
series have abscissa of convergence $\le \sigma$ then the product series $\gamma(s) =
\alpha(s)\beta(s)$ has abscissa of convergence $\sigma_c \le \sigma + 1/2$.)
11. (a) Show that $\sum_{n \le x} \varphi(n) = (3/\pi^2)x^2 + O(x \log x)$ for $x \ge 2$.
(b) Show that
$$\sum_{\substack{m \le x \\ n \le x \\ (m,n)=1}} 1 = -1 + 2 \sum_{n \le x} \varphi(n)$$
for $x \ge 1$. Deduce that the expression above is $(6/\pi^2)x^2 + O(x \log x)$.
12. Let $\sigma(n) = \sum_{d|n} d$. Show that
$$\sum_{n \le x} \sigma(n) = \frac{\pi^2}{12} x^2 + O(x \log x)$$
for $x \ge 2$.
13. (Landau 1900, 1936; cf. Sitaramachandrarao 1982, 1985, Nowak 1989)
(a) Show that $n/\varphi(n) = \sum_{d|n} \mu(d)^2/\varphi(d)$.
(b) Show that
$$\sum_{n \le x} \frac{n}{\varphi(n)} = \frac{\zeta(2)\zeta(3)}{\zeta(6)} x + O(\log x)$$
for $x \ge 2$.
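where
$$C = \frac{1}{8 \log 2} \prod_{p > 2} \Bigl(1 + \frac{1}{p(p - 2)}\Bigr).$$
(b) Show that for any real number $x \ge 1$ and any positive integer $q$,
$$\sum_{\substack{m \le x \\ (m,q)=1}} \frac{1}{m} = \Bigl(\log x + C_0 + \sum_{p|q} \frac{\log p}{p - 1}\Bigr) \frac{\varphi(q)}{q} + O\bigl(2^{\omega(q)}/x\bigr).$$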
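(c) Show that for any real number $x \ge 2$ and any positive integer $q$,
$$\sum_{\substack{n \le x \\ (n,q)=1}} \frac{1}{\varphi(n)} = \frac{\zeta(2)\zeta(3)}{\zeta(6)} \prod_{p|q} \Bigl(1 - \frac{p}{p^2 - p + 1}\Bigr) \Bigl(\log x + C_0 + \sum_{p|q} \frac{\log p}{p - 1} - \sum_{p \nmid q} \frac{\log p}{p^2 - p + 1}\Bigr) + O\Bigl(2^{\omega(q)} \frac{\log x}{x}\Bigr).$$
18. Let $d_k(n)$ be the number of ordered $k$-tuples $(d_1, \ldots, d_k)$ of positive integers
such that $d_1 d_2 \cdots d_k = n$.
(a) Show that $d_k(n) = \sum_{d|n} d_{k-1}(d)$.
(b) Show that $\sum_{n=1}^{\infty} d_k(n) n^{-s} = \zeta(s)^k$ for $\sigma > 1$.
(c) Show that for every fixed positive integer $k$,
$$\sum_{n \le x} d_k(n) = x P_k(\log x) + O\bigl(x^{1-1/k} (\log x)^{k-2}\bigr)$$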
embarrassing that this is the best-known upper bound for gaps between
sums of two squares.)
22. (Feller & Tornier 1932) Let $f(n)$ denote the multiplicative function such
that $f(p) = 1$ for all $p$, and $f(p^k) = -1$ whenever $k > 1$.
(a) Show that
$$\sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \zeta(s) \prod_p \Bigl(1 - \frac{2}{p^{2s}}\Bigr)$$
for $\sigma > 1$.
(b) Deduce that
$$f(n) = \sum_{d^2|n} \mu(d) 2^{\omega(d)}.$$
Let $\Delta(x)$ be as in Exercise 23(b). Show that $\Delta(x) \ll x^{1/3} (\log x)^2$.
27. Let $R(x)$ be as in Exercise 24(c). Show that $R(x) \ll x^{1/3} \log x$.
The proof we give below establishes only that there is an $x_0$ such that
$\psi(x) \asymp x$ uniformly for $x \ge x_0$. However, both $\psi(x)$ and $x$ are bounded away
from 0 and from $\infty$ in the interval $[2, x_0]$, and hence the implicit constants can
be adjusted so that $\psi(x) \asymp x$ uniformly for $x \ge 2$. In subsequent situations of
2.2 Estimates of Chebyshev and of Mertens 47
this sort, we shall assume without comment that the reader understands that it
suffices to prove the result for all sufficiently large x.
which arise in the main terms. To avoid this problem we introduce an idea that
is fundamental to much of prime number theory, namely we replace µ(d) by
an arithmetic function ad that in some way forms a truncated approximation to
$\mu(d)$. Suppose that $D$ is a finite set of numbers, and that $a_d = 0$ when $d \notin D$.
Then by (2.11) we see that
$$\sum_{d \in D} a_d T(x/d) = (x \log x - x) \sum_{d \in D} a_d/d - x \sum_{d \in D} \frac{a_d \log d}{d} + O(\log 2x). \tag{2.12}$$
Here the implicit constant depends on the choice of $a_d$, which we shall consider
to be fixed. Since we want the above to approximate the relation (2.10), and
since we are hoping that $\psi(x) \asymp x$, we restrict our attention to $a_d$ that satisfy
the condition
$$\sum_{d \in D} \frac{a_d}{d} = 0, \tag{2.13}$$
in view of (1.20). Thus $E(y)$ will be near 1 for $y$ not too large if $a_d$ is near $\mu(d)$
for small $d$. Moreover, by (2.13) we see that $E(y) = -\sum_{d \in D} a_d \{y/d\}$, so that
$E(y)$ is periodic with period dividing $\operatorname{lcm}_{d \in D} d$. Hence for a given choice of
the $a_d$, the behaviour of $E(y)$ can be determined by a finite calculation.
The simplest realization of this approach involves taking $a_1 = 1$, $a_2 = -2$,
$a_d = 0$ for $d > 2$. Then (2.13) holds, the expression (2.14) is $\log 2$, $E(y)$ has
period 2 and $E(y) = 0$ for $0 \le y < 1$, $E(y) = 1$ for $1 \le y < 2$. Hence for this
choice of the $a_d$ the sum in (2.15) satisfies the inequalities
$$\psi(x) - \psi(x/2) = \sum_{x/2 < k \le x} \Lambda(k) \le \sum_{k \le x} \Lambda(k) E(x/k) \le \sum_{k \le x} \Lambda(k) = \psi(x).$$
Thus $\psi(x) \ge (\log 2)x + O(\log x)$, which is a lower bound of the desired shape.
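The sandwich for the choice $a_1 = 1$, $a_2 = -2$ can be checked numerically. The sketch below rests on two assumptions read off from the surrounding text rather than quoted from it: that $E(y) = [y] - 2[y/2]$ (which agrees with $E(y) = -\sum a_d\{y/d\}$ under (2.13)), and that $T(x) = \sum_{n \le x} \log n = \log([x]!)$, so that the middle sum equals $T(x) - 2T(x/2)$. Function names are illustrative.

```python
from math import lgamma, log


def von_mangoldt_table(limit):
    """Table lam[0..limit] with lam[p^k] = log p and 0 elsewhere."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_prime[m] = False
        q = p
        while q <= limit:
            lam[q] = log(p)
            q *= p
    return lam


def chebyshev_sandwich(x):
    """Return (psi(x) - psi(x/2), sum Lambda(k) E(x/k), psi(x)) for integer x."""
    lam = von_mangoldt_table(x)
    psi_x = sum(lam)
    psi_half = sum(lam[: x // 2 + 1])
    # E(x/k) = [x/k] - 2[x/(2k)] is 0 or 1
    middle = sum(lam[k] * ((x // k) - 2 * (x // (2 * k))) for k in range(1, x + 1))
    return psi_x - psi_half, middle, psi_x


if __name__ == "__main__":
    x = 1000
    low, mid, high = chebyshev_sandwich(x)
    print(low, mid, high, x * log(2))
```

For even $x$ the middle quantity is the logarithm of the central binomial coefficient $\binom{x}{x/2}$, which is at least $x \log 2 - \log(x+1)$; this is one way to see the lower bound for $\psi(x)$.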
In addition,
and
By computing the implicit constants one can use this method to determine a
constant x0 such that ψ(2x) − ψ(x) > x/2 for all x > x0 . Since the contribution
of the proper prime powers is small, it follows that there is at least one prime
in the interval (x, 2x], when x > x0 . After separate consideration of x ≤ x0 ,
one obtains Bertrand’s postulate: For each real number x > 1, there is a prime
number in the interval (x, 2x).
Chebyshev said it, but I’ll say it again:
There’s always a prime between n and 2n.
N. J. Fine
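and
$$\pi(x) = \frac{\psi(x)}{\log x} + O\Bigl(\frac{x}{(\log x)^2}\Bigr).$$
Proof Clearly
$$\psi(x) = \sum_{p^k \le x} \log p = \sum_{k=1}^{\infty} \vartheta\bigl(x^{1/k}\bigr).$$
and that
$$\sum_{p \le x} \frac{1}{p} \sim \log \log x.$$
However, these assertions are weaker than PNT, as we can derive them from
Theorem 2.4.
By Theorem 2.4 the error term is $\ll x$. Thus (2.11) gives (a). The sum in (b)
differs from that in (a) by the amount
$$\sum_{\substack{p^k \le x \\ k \ge 2}} \frac{\log p}{p^k} \le \sum_p \frac{\log p}{p(p - 1)} \ll 1.$$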
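by Theorem 2.4. We now prove (d) without determining the value of the constant
$b$. We express (b) in the form $L(x) = \log x + R(x)$ where $R(x) \ll 1$.
Then
$$\sum_{p \le x} \frac{1}{p} = \int_{2^-}^{x} (\log u)^{-1} \, dL(u) = \int_{2^-}^{x} \frac{d \log u}{\log u} + \int_{2^-}^{x} \frac{dR(u)}{\log u}$$
$$= \int_{2}^{x} \frac{du}{u \log u} + \Bigl[\frac{R(u)}{\log u}\Bigr]_{2^-}^{x} - \int_{2^-}^{x} R(u) \, d(\log u)^{-1}$$
$$= \log \log x - \log \log 2 + 1 + \frac{R(x)}{\log x} + \int_{2}^{x} \frac{R(u)}{u (\log u)^2} \, du.$$
The penultimate term is $\ll 1/\log x$, and the integral is $\int_2^\infty - \int_x^\infty =
\int_2^\infty + O(1/\log x)$, so we have (d) with
$$b = 1 - \log \log 2 + \int_{2}^{\infty} \frac{R(u)}{u (\log u)^2} \, du.$$
As for (e), we note that
$$\sum_{p \le x} \log \Bigl(1 - \frac{1}{p}\Bigr)^{-1} = \sum_{p \le x} \frac{1}{p} + \sum_{p \le x} \Bigl(\log \Bigl(1 - \frac{1}{p}\Bigr)^{-1} - \frac{1}{p}\Bigr).$$
$$\sum_{p \le x} \log \Bigl(1 - \frac{1}{p}\Bigr)^{-1} = \log \log x + c + O(1/\log x) \tag{2.16}$$
where $c = b + \sum_p \sum_{k \ge 2} (k p^k)^{-1}$. Since $e^z = 1 + O(|z|)$ for $|z| \le 1$, on exponentiating
we deduce that
$$\prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} = e^c \log x + O(1).$$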
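Once the constant is identified as $c = C_0$, which is what the remainder of the proof establishes, the product formula can be compared with direct computation. A minimal sketch, with an illustrative function name:

```python
from math import exp, log

C0 = 0.5772156649015329  # Euler's constant, truncated to double precision


def mertens_product(x):
    """prod over primes p <= x of (1 - 1/p)^(-1), via a simple sieve."""
    is_prime = [True] * (x + 1)
    product = 1.0
    for p in range(2, x + 1):
        if is_prime[p]:
            product /= 1 - 1 / p
            for m in range(p * p, x + 1, p):
                is_prime[m] = False
    return product


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, mertens_product(x), exp(C0) * log(x))
```

The difference between the two columns stays bounded as $x$ grows, in agreement with the $O(1)$ error term.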
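To complete the proof it suffices to show that $c = C_0$. To this end we first note
that if $p \le x$ and $p^k > x$, then $k \ge (\log x)/\log p$. Hence
$$\sum_{\substack{p \le x \\ p^k > x}} \frac{1}{k p^k} \ll \sum_{\substack{p \le x \\ p^k > x}} \frac{\log p}{(\log x) p^k} \ll \frac{1}{\log x} \sum_p \log p \sum_{k \ge 2} p^{-k} \ll \frac{1}{\log x} \sum_p \frac{\log p}{p^2} \ll \frac{1}{\log x},$$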
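Since this is trivial when $1 \le x < 2$, the above holds for all $x \ge 1$. We
express this briefly as $T_1 = T_2 + T_3 + T_4$, and estimate the quantities $I_i =
\delta \int_1^\infty x^{-1-\delta} T_i(x) \, dx$. On comparing the results as $\delta \to 0^+$ we shall deduce
that $c = C_0$. By Theorem 1.3, Corollary 1.11, and Corollary 1.13 we see that
$$I_1 = \log \zeta(1 + \delta) = \log \frac{1}{\delta} + O(\delta)$$
as $\delta \to 0^+$. Secondly,
$$I_2 = \sum_{n=1}^{\infty} \frac{1}{n} \, \delta \int_{e^n}^{\infty} x^{-1-\delta} \, dx = \sum_{n=1}^{\infty} \frac{1}{n} e^{-\delta n} = \log(1 - e^{-\delta})^{-1}$$
$$= \log(\delta + O(\delta^2))^{-1} = \log 1/\delta + O(\delta).$$
Thirdly,
$$I_3 = c - C_0,$$
and finally
$$I_4 \ll \delta \int_1^\infty \frac{x^{-1-\delta}}{\log 2x} \, dx \ll \delta + \delta \int_2^{e^{1/\delta}} \frac{dx}{x \log x} + \delta^2 \int_{e^{1/\delta}}^\infty x^{-1-\delta} \, dx \ll \delta \log 1/\delta.$$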
Then there is an $x_0$ such that $\psi(x) \le (a + \varepsilon)x$ for all $x \ge x_0$, and hence
$$\int_1^x \psi(u) u^{-2} \, du \le \int_1^{x_0} \psi(u) u^{-2} \, du + (a + \varepsilon) \int_{x_0}^x u^{-1} \, du \le (a + \varepsilon) \log x + O_\varepsilon(1).$$
Since this holds for arbitrary $\varepsilon > 0$, it follows that $\int_1^x \psi(u) u^{-2} \, du \le (a +
o(1)) \log x$. Thus by Theorem 2.7(c) we have $a \ge 1$. Similarly $\liminf \psi(u)/u
\le 1$.
2.2.1 Exercises
1. (a) Let $d_n = [1, 2, \ldots, n]$. Show that $d_n = e^{\psi(n)}$.
(b) Let $P \in \mathbb{Z}[x]$, $\deg P \le n$. Put $I = I(P) = \int_0^1 P(x) \, dx$. Show that
$I d_{n+1} \in \mathbb{Z}$, and hence that $d_{n+1} \ge 1/|I|$ if $I \ne 0$.
(c) Show that there is a polynomial $P$ as above so that $I d_{n+1} = 1$.
(d) Verify that $\max_{0 \le x \le 1} |x^2 (1 - x)^2 (2x - 1)| = 5^{-5/2}$.
(e) For $P(x) = \bigl(x^2 (1 - x)^2 (2x - 1)\bigr)^{2n}$, verify that $0 < I < 5^{-5n}$.
(f) Show that $\psi(10n + 1) \ge (\tfrac{1}{2} \log 5) \cdot 10n$.
2. Let A be the set of integers composed entirely of primes p ≤ A1 , and
let B be the set of integers composed entirely of primes p > A1 . Then n
is uniquely of the form n = ab, a ∈ A, b ∈ B. Let δ(A1 , A2 ) denote the
density of those n such that a ≤ A2 .
(a) Give a formula for δ(A1 , A2 ).
(b) Show that $\delta(A_1, A_2) \ll (\log A_2)/\log A_1$ for $2 \le A_2 \le A_1$.
3. Let $a_n = 1 + \cos \log n$, and note that $a_n \ge 0$ for all $n$.
(a) Show that
$$\sum_{n=1}^{\infty} a_n n^{-s} = \zeta(s) + \frac{1}{2} \zeta(s + i) + \frac{1}{2} \zeta(s - i)$$
for $\sigma > 1$.
(b) By Corollary 1.15, or otherwise, show that
$$\sum_{n \le x} \frac{a_n}{n} = \log x + O(1).$$
5. (Chebyshev 1850) From Corollaries 2.5 and 2.8 we see that if there is a
number a such that ψ(x) = (a + o(1))x as x → ∞, then we must have
a = 1. We now take this a step further.
(a) Suppose that there is a number a such that
as x → ∞. Deduce that
$$\int_2^x \frac{\psi(u)}{u^2} \, du = \log x + (a + o(1)) \log \log x$$
as $x \to \infty$.
(b) By comparing the above with Theorem 2.7(c), deduce that if (2.17)
holds, then necessarily $a = 0$.
(c) Suppose that there is a constant $A$ such that
$$\pi(x) = \frac{x}{\log x - A} + o\Bigl(\frac{x}{(\log x)^2}\Bigr) \tag{2.18}$$
as $x \to \infty$. By writing $\vartheta(x) = \int_{2^-}^x \log u \, d\pi(u)$, integrating by parts,
and estimating the expressions that arise, show that if (2.18) holds,
then
as x → ∞.
(d) Deduce that if (2.18) holds, then A = 1.
Proof Let $R$ be the set of those $n$ for which $\varphi(n)/n < \varphi(m)/m$ for all $m < n$.
We first prove the inequality for these ‘record-breaking’ $n \in R$. Suppose that
$\omega(n) = k$, and let $n^*$ be the product of the first $k$ primes. If $n \ne n^*$ then $n^* < n$
and $\varphi(n^*)/n^* < \varphi(n)/n$. Hence $R$ is the set of $n$ of the form
$$n = \prod_{p \le y} p. \tag{2.19}$$
which gives the desired result for $n \in R$. If $n \notin R$ then there is an $m < n$ such
that $m \in R$, $\varphi(m)/m < \varphi(n)/n$. Hence
$$\frac{\varphi(n)}{n} > \frac{\varphi(m)}{m} = \frac{1}{\log \log m} \Bigl(e^{-C_0} + O\Bigl(\frac{1}{\log \log m}\Bigr)\Bigr) \ge \frac{1}{\log \log n} \Bigl(e^{-C_0} + O\Bigl(\frac{1}{\log \log n}\Bigr)\Bigr).$$
We note that equality holds for n of the type (2.19), so the proof is complete.
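The extremal behaviour on the products of initial primes can be observed numerically. In the sketch below (illustrative names; logarithms are used so that the very large integer $n$ is never formed) the quantity $(\varphi(n)/n) \log \log n$ is evaluated for $n = \prod_{p \le y} p$ and compared with $e^{-C_0}$.

```python
from math import exp, log

C0 = 0.5772156649015329  # Euler's constant, truncated to double precision


def primes_up_to(limit):
    """List of primes p <= limit by the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if is_prime[p]:
            primes.append(p)
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return primes


def primorial_ratio(y):
    """(phi(n)/n) * log log n for n = product of the primes p <= y (y >= 3)."""
    ratio, log_n = 1.0, 0.0
    for p in primes_up_to(y):
        ratio *= 1 - 1 / p       # phi(n)/n = prod (1 - 1/p)
        log_n += log(p)          # log n = theta(y)
    return ratio * log(log_n)


if __name__ == "__main__":
    for y in (10**2, 10**3, 10**4, 10**5):
        print(y, primorial_ratio(y), exp(-C0))
```

The values approach $e^{-C_0}$ slowly, consistent with a relative error of order $1/\log \log n$.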
We now consider the maximum order of $d(n)$. From the pairing $d \leftrightarrow n/d$
of divisors, and the fact that at least one of these is $\le \sqrt{n}$, it is immediate that
$d(n) \le 2\sqrt{n}$. On the other hand, if $n$ is square-free then $d(n) = 2^{\omega(n)}$, which
can be large, but not nearly as large as $\sqrt{n}$. Indeed, for each $\varepsilon > 0$ there is a
constant $C(\varepsilon)$ such that
$$d(n) \le C(\varepsilon) n^{\varepsilon} \tag{2.20}$$
for all $n \ge 1$. To see this we express $n$ in terms of its canonical factorization,
$n = \prod_p p^a$, so that
$$\frac{d(n)}{n^{\varepsilon}} = \prod_p \frac{a + 1}{p^{a\varepsilon}} = \prod_p f_p(a),$$
say. Let $\alpha_p$ be an integral value of $a$ for which $f_p(a)$ is maximized. From the
inequalities $f_p(\alpha_p) \ge f_p(\alpha_p \pm 1)$ we see that
$$(p^{\varepsilon} - 1)^{-1} - 1 \le \alpha_p \le (p^{\varepsilon} - 1)^{-1},$$
so that we may take $\alpha_p = [(p^{\varepsilon} - 1)^{-1}]$. Hence (2.20) holds with
$$C(\varepsilon) = \prod_p f_p(\alpha_p).$$
This constant is best possible, since equality holds when $n = \prod_p p^{\alpha_p}$. By
analysing the rate at which $C(\varepsilon)$ grows as $\varepsilon \to 0^+$, we derive
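Theorem 2.11 For all $n \ge 3$
$$\log d(n) \le \frac{\log n}{\log \log n} \bigl(\log 2 + O(1/\log \log n)\bigr).$$
We note that this bound is sharp for n of the form in (2.19).
Proof It suffices to show that there is an absolute constant $K$ such that
$$C(\varepsilon) \le \exp\bigl(K \varepsilon^2 2^{1/\varepsilon}\bigr), \tag{2.21}$$
since the stated bound then follows by taking $\varepsilon = (\log 2)/\log \log n$. We observe
that $\alpha_p = 0$ if $p > 2^{1/\varepsilon}$, that $\alpha_p = 1$ if $(3/2)^{1/\varepsilon} < p \le 2^{1/\varepsilon}$, and that $\alpha_p \ll 1/\varepsilon$
when $p \le (3/2)^{1/\varepsilon}$. Hence
$$\log C(\varepsilon) \ll \sum_{p \le 2^{1/\varepsilon}} \log(2/p^{\varepsilon}) + \sum_{p \le (3/2)^{1/\varepsilon}} \log(1/\varepsilon).$$
Here the second sum is $\pi\bigl((3/2)^{1/\varepsilon}\bigr) \log 1/\varepsilon \ll \varepsilon^2 2^{1/\varepsilon}$. The first sum is
$(\log 2)\pi(2^{1/\varepsilon}) - \varepsilon \vartheta(2^{1/\varepsilon})$, and by Corollary 2.5 this is $\ll \varepsilon^2 2^{1/\varepsilon}$. Thus we have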
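The constant $C(\varepsilon)$ in (2.20) is a finite product, so it can be evaluated directly. The following sketch (illustrative names, floating-point arithmetic) computes $C(\varepsilon)$ together with the extremal integer $n = \prod_p p^{\alpha_p}$, and can be used to confirm $d(n) \le C(\varepsilon) n^{\varepsilon}$ over a range of $n$.

```python
from math import floor, isclose, sqrt


def best_constant(eps):
    """Return (C(eps), n_star) with alpha_p = [1/(p^eps - 1)] and
    n_star = prod p^alpha_p; only primes p <= 2^(1/eps) contribute."""
    limit = int(2 ** (1 / eps)) + 1
    C, n_star = 1.0, 1
    for p in range(2, limit + 1):
        if any(p % q == 0 for q in range(2, int(p**0.5) + 1)):
            continue                      # p is composite
        alpha = floor(1 / (p**eps - 1))
        C *= (alpha + 1) / p ** (alpha * eps)
        n_star *= p**alpha
    return C, n_star


def divisor_counts(limit):
    """Table d[0..limit] of the divisor function (d[0] is unused)."""
    d = [0] * (limit + 1)
    for k in range(1, limit + 1):
        for m in range(k, limit + 1, k):
            d[m] += 1
    return d


if __name__ == "__main__":
    print(best_constant(0.5))
```

For $\varepsilon = 1/2$ the exponents are $\alpha_2 = 2$, $\alpha_3 = 1$ and $\alpha_p = 0$ otherwise, so $C(1/2) = \sqrt{3}$ with equality at $n = 12$; this is the inequality $d(n) \le \sqrt{3n}$.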
so that $\omega(n) = \sum_p X_p(n)$. If we were to treat the $X_p$ as though they
were independent random variables then we would have $E(X_p) = 1/p$,
$\operatorname{Var}(X_p) = (1 - 1/p)/p$. Hence we expect that the average of $\omega(n)$ should be
approximately
$$E\Bigl(\sum_{p \le n} X_p\Bigr) = \sum_{p \le n} E(X_p) = \sum_{p \le n} \frac{1}{p} = \log \log n + O(1),$$
$$\sum_{n \le x} \omega(n) = x \sum_{p \le x} \frac{1}{p} + O(\pi(x)).$$
and
$$\sum_{1 < n \le x} (\omega(n) - \log \log n)^2 \ll x \log \log x. \tag{2.24}$$
so we have
2.3 Applications to arithmetic functions 59
Note that in analytic number theory we say ‘almost all’ when the excep-
tional set has asymptotic density 0; this conflicts with the usage in some
parts of algebra, where the term means that there are at most finitely many
exceptions.
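The first and second moments of $\omega(n)$ discussed here can be tabulated with a sieve. This is a sketch with illustrative names: it returns the mean of $\omega(n)$, the exact sum $\sum_{p \le x} [x/p]$ to which $\sum_{n \le x} \omega(n)$ is equal, and the second moment about $\log \log x$ divided by $x \log \log x$.

```python
from math import log


def omega_table(limit):
    """Return (omega[0..limit], primes), omega(n) = number of distinct primes dividing n."""
    omega = [0] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if omega[p] == 0:               # no smaller prime divides p, so p is prime
            primes.append(p)
            for m in range(p, limit + 1, p):
                omega[m] += 1
    return omega, primes


def omega_statistics(x):
    """Return (sum of omega(n), sum of [x/p], normalised second moment)."""
    omega, primes = omega_table(x)
    first = sum(omega[1:])
    by_primes = sum(x // p for p in primes)
    ll = log(log(x))
    second = sum((omega[n] - ll) ** 2 for n in range(2, x + 1))
    return first, by_primes, second / (x * ll)


if __name__ == "__main__":
    x = 10**5
    first, by_primes, ratio = omega_statistics(x)
    print(first / x, log(log(x)), ratio)
```

The normalised second moment stays bounded, as the variance estimates assert, while the mean exceeds $\log \log x$ by a bounded amount.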
Proof of Theorem 2.12 To prove (2.23) we first multiply out the square on the
left, and write the sum as
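$$\sum_{\substack{p_1 p_2 \le x \\ p_1 \ne p_2}} \Bigl[\frac{x}{p_1 p_2}\Bigr] \le x \sum_{\substack{p_1 p_2 \le x \\ p_1 \ne p_2}} \frac{1}{p_1 p_2} \le x \Bigl(\sum_{p \le x} \frac{1}{p}\Bigr)^2 = x (\log \log x)^2 + O(x \log \log x) \tag{2.26}$$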
The estimate (2.23) now follows by inserting this and (2.22) in (2.25).
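We derive (2.24) from (2.23) by applying the triangle inequality $\bigl|\|x\| -
\|y\|\bigr| \le \|x - y\|$ for vectors. This gives
$$\Bigl| \Bigl(\sum_{1 < n \le x} (\omega(n) - \log \log n)^2\Bigr)^{1/2} - \Bigl(\sum_{1 < n \le x} (\omega(n) - \log \log x)^2\Bigr)^{1/2} \Bigr| \le \Bigl(\sum_{1 < n \le x} (\log \log x - \log \log n)^2\Bigr)^{1/2}.$$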
and (2.24) follows by squaring both sides and applying (2.23). We omit the
similar argument for $\Omega(n)$.
Since $2^{\omega(n)} \le d(n) \le 2^{\Omega(n)}$ for all $n$, Corollary 2.13 carries an interesting
piece of information for $d(n)$:
for almost all n. Since this is smaller than the average size of d(n), we see that
the average is determined not by the usual size of d(n) but by a sparse set of n for
which d(n) is disproportionately large. Since the first moment (i.e., average) of
d(n) is inflated by the ‘tail’ in its distribution, it is not surprising that this effect
is more pronounced for the higher moments. As was originally suggested by
Ramanujan, it can be shown that for any fixed real number κ there is a positive
constant c(κ) such that
$$\sum_{n \le x} d(n)^{\kappa} \sim c(\kappa) x (\log x)^{2^{\kappa} - 1} \tag{2.27}$$
as x → ∞.
In order to handle the error terms that arise in our arguments we are frequently
led to estimate the mean value of multiplicative functions. In most such cases
the method of the hyperbola or the simpler identity (2.3) will suffice, but the
labour involved quickly becomes tiresome. It will therefore be convenient to
have the following result on record, as it is very readily applied.
Then for $x \ge 2$,
$$\sum_{n \le x} f(n) \ll (A + 1) \frac{x}{\log x} \sum_{n \le x} \frac{f(n)}{n}.$$
We note that this is sharper than the trivial estimate
$$\sum_{n \le x} f(n) \le x \sum_{n \le x} f(n)/n \tag{2.30}$$
for any fixed $\kappa$. Though weaker than (2.27), this is all that is needed in many
cases. We can similarly show that for any fixed real $\kappa$,
$$\sum_{n \le x} \Bigl(\frac{n}{\varphi(n)}\Bigr)^{\kappa} \ll x. \tag{2.32}$$
2.3.1 Exercises
1. Let $\sigma(n) = \sum_{d|n} d$.
(a) Show that $\sigma(n)\varphi(n) \le n^2$ for all $n \ge 1$.
(b) Deduce that $n + 1 \le \sigma(n) \le e^{C_0} n \bigl(\log \log n + O(1)\bigr)$ for all $n \ge 3$.
2. Show that $d(n) \le \sqrt{3n}$ with equality if and only if $n = 12$.
3. Let $f(n) = \prod_{p|n} (1 + p^{-1/2})$.
(a) Show that there is a constant $a$ such that if $n \ge 3$, then
$$f(n) < \exp\bigl(a (\log n)^{1/2} (\log \log n)^{-1}\bigr).$$
(b) Show that $\sum_{n \le x} f(n) = cx + O\bigl(x^{1/2}\bigr)$ where $c = \prod_p (1 + p^{-3/2})$.
4. Let $d_k(n)$ be as in Exercise 2.1.18. Show that if $k$ and $\kappa$ are fixed, then
$$\sum_{n \le x} d_k(n)^{\kappa} \ll x (\log x)^{k^{\kappa} - 1}$$
for $x \ge 2$.
5. (Davenport 1932) Let
$$f(n) = -\sum_{d|n} \frac{\mu(d) \log d}{d}.$$
(d) Show that the right-hand side above is $\ll (\log y)/\log x$.
2.4 The distribution of (n) − ω(n) 65
(e) Deduce that the second and third terms in (2.35) are $\ll 1$.
(f) Conclude that
2 = x(log log x)2 + (2b + 1) log log x + O(x)
where b is the constant in Theorem 2.7(d).
(g) Show that the left-hand side of (2.23) is = x log log x + O(x).
(h) Show that the left-hand side of (2.24) is = x log log x + O(x).
9. (cf. Pomerance 1977, Shan 1985) Note that ϕ(n)|(n − 1) when n is prime. An
old – and still unsolved – problem of D. H. Lehmer asks whether there exists
a composite integer n such that ϕ(n)|(n − 1). Let S denote the (presumably
empty) set of such numbers.
(a) Show that if n ∈ S, then n is square-free.
(b) Suppose that mp ∈ S. Show that m ≡ 1 (mod p − 1).
(c) Let $p$ be given. Show that the number of $m$ such that $mp \le x$ and $mp \in S$
is $\ll x/p^2$.
(d) Show that the number of $n \in S$, $n \le x$, such that $n$ has a prime factor
$> y$ is $\ll x/(y \log y)$.
(e) Suppose that $x/y < n \le x$ and that $n$ is composed entirely of primes
$p \le y$. Show that $\omega(n) \ge (\log x)/(\log y) - 1$.
(f) By Exercise 4, or otherwise, show that the number of $n \le x$ such that
$\omega(n) \ge z$ is $\ll x (\log x)^2/3^z$.
(g) Conclude that the number of $n \le x$ such that $n \in S$ is
$\ll x/\exp\bigl(\sqrt{\log x}\bigr)$.
uniform in k. This is indeed the case, as we see from the following quantitative
form of Rényi’s theorem.
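$$\sum_{\substack{n=1 \\ (n,f)=1}}^{\infty} \mu(n)^2 n^{-s} = \prod_{p \nmid f} (1 + p^{-s}) = \frac{\zeta(s)}{\zeta(2s)} \prod_{p|f} (1 + p^{-s})^{-1} = \frac{\zeta(s)}{\zeta(2s)} \sum_{d \in D} \lambda(d) d^{-s},$$
by Theorem 2.2. But $\sum_{d \in D} \lambda(d)/d = \prod_{p|f} (1 + 1/p)^{-1}$ and $\sum_{d \in D} d^{-1/2} =
\prod_{p|f} (1 - p^{-1/2})^{-1}$, so that the proof is complete.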
Proof of Theorem 2.16 Let Q denote the set of square-free numbers and F
denote the set of ‘power-full’ numbers (i.e., those f such that p| f ⇒ p 2 | f ).
Every number is uniquely expressible in the form n = q f , q ∈ Q, f ∈ F,
(q, f ) = 1. Hence
$$N_k = \sum_{\substack{f \le x \\ f \in F \\ \Omega(f) - \omega(f) = k}} \ \sum_{\substack{q \le x/f \\ q \in Q \\ (q,f)=1}} 1.$$
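$$\frac{6}{\pi^2} \, x \sum_{\substack{f \le x \\ f \in F \\ \Omega(f) - \omega(f) = k}} \frac{1}{f} \prod_{p|f} \bigl(1 + p^{-1}\bigr)^{-1} + O\Biggl(x^{1/2} \sum_{\substack{f \le x \\ f \in F \\ \Omega(f) - \omega(f) = k}} f^{-1/2} \prod_{p|f} \bigl(1 - p^{-1/2}\bigr)^{-1}\Biggr).$$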
In order to appreciate the nature of these sums it is helpful to observe that each
member of $F$ is uniquely of the form $a^2 b^3$ with $b$ square-free, so that there are
$\ll x^{1/2}$ members of $F$ not exceeding $x$. Suppose that $z \ge 1$. Then the sum in
the error term is
$$\le z^{-k} \sum_{\substack{f \le x \\ f \in F}} z^{\Omega(f) - \omega(f)} f^{-1/2} \prod_{p|f} \bigl(1 - p^{-1/2}\bigr)^{-1}.$$
To see that (2.36) holds, it suffices to multiply this by z k and sum over k.
2.4.1 Exercise
1. Let $d_k$ be as in (2.36). Show that
$$d_k = c 2^{-k} + O(5^{-k})$$
where
$$c = \frac{1}{4} \prod_{p > 2} \Bigl(1 - \frac{1}{(p - 1)^2}\Bigr)^{-1}.$$
2.5 Notes
Section 2.1. Mertens (1874a) showed that $\sum_{n \le x} \varphi(n) = 3x^2/\pi^2 + O(x \log x)$.
This refines an earlier estimate of Dirichlet, and is equivalent to Theorem 2.1,
by partial summation. Let $R(x)$ denote the error term in Theorem 2.1. Chowla
(1932) showed that
$$\int_1^x R(u)^2 \, du \sim \frac{x}{2\pi^2}$$
as $x \to \infty$, and Walfisz (1963, p. 144) showed that
$$R(x) \ll (\log x)^{2/3} (\log \log x)^{4/3}.$$
In the opposite direction, Pillai & Chowla (1930) showed (cf. Exercise
7.3.6) that $R(x) = \Omega(\log \log \log x)$. That the error term changes sign infinitely
often was first proved by Erdős & Shapiro (1951), who showed that
$R(x) = \Omega_{\pm}(\log \log \log \log x)$. More recently, Montgomery (1987) showed that
$R(x) = \Omega_{\pm}\bigl(\sqrt{\log \log x}\bigr)$. It may be speculated that $R(x) \ll \log \log x$ and that
$R(x) = \Omega_{\pm}(\log \log x)$.
Theorem 2.2 is due to Gegenbauer (1885).
Theorem 2.3 is due to Dirichlet (1849). The problem of improving the error
term in this theorem is known as the Dirichlet divisor problem. Let $\Delta(x)$ denote
the error term. Voronoï (1903) showed that $\Delta(x) \ll x^{1/3} \log x$ (see Exercises
2.1.23, 2.1.25, 2.1.26). van der Corput (1922) used estimates of exponential
sums to show that $\Delta(x) \ll x^{33/100+\varepsilon}$. This exponent has since been reduced
by van der Corput (1928), Chih (1950), Richert (1953), Kolesnik (1969, 1973,
1982, 1985), Iwaniec & Mozzochi (1988), and by Huxley (1993), who showed
that $\Delta(x) \ll x^{23/73+\varepsilon}$. In the opposite direction, Hardy (1916) showed that
$\Delta(x) = \Omega_{\pm}(x^{1/4})$. Soundararajan (2003) showed that
$$\Delta(x) = \Omega\bigl(x^{1/4} (\log x)^{1/4} (\log \log x)^{b} (\log \log \log x)^{-5/8}\bigr)$$
with $b = \tfrac{3}{4}(2^{4/3} - 1)$, and it is plausible that the first three exponents above are
optimal.
The result of Exercise 2.1.12 generalizes to $\mathbb{R}^n$: A lattice point
$(a_1, a_2, \ldots, a_n) \in \mathbb{Z}^n$ is said to be primitive if $\gcd(a_1, a_2, \ldots, a_n) = 1$. The
asymptotic density of primitive lattice points is easily shown to be $1/\zeta(n)$.
In addition, Cai & Bach (2003) have shown that the density of lattice points
$a \in \mathbb{Z}^n$ such that $\gcd(a_i, a_j) = 1$ for all pairs with $1 \le i < j \le n$ is
$$\prod_p \Bigl(\Bigl(1 - \frac{1}{p}\Bigr)^{n} + \frac{n}{p} \Bigl(1 - \frac{1}{p}\Bigr)^{n-1}\Bigr).$$
pp. 285–288). Polynomials can be found that produce better constants, but
Gorshkov (1956) showed that the supremum of such constants is < 1, so
the Prime Number Theorem cannot be established by this method. For more
on this subject, see Montgomery (1994, Chapter 10), Pritsker (1999), and
Borwein (2002, Chapter 10).
Theorem 2.7(b)–(e) is due to Mertens (1874a, b). Our determination of the
constant in Theorem 2.7(e) incorporates an expository finesse due to Heath-
Brown.
Section 2.3. Theorem 2.9 is due to Landau (1903). Runge (1885) proved
(2.20), and Wigert (1906/7) showed that $d(n) < n^{(\log 2 + \varepsilon)/\log \log n}$ for $n > n_0(\varepsilon)$.
Ramanujan (1915a, b) established the upper bound of Theorem 2.11, first with
an extra log log log n in the error term, and then without. Ramanujan (1915b)
also proved that
log d(n)
< li(n) + O n exp − c log n
log 2
for all n ≥ 2, and that
log d(n)
> li(n) + O n exp − c log n
log 2
for infinitely many n. For a survey of extreme value estimates of arithmetic
functions, see Nicolas (1988).
Theorem 2.12 is due to Turán (1934), although Corollary 2.13 and the es-
timate (2.22) used in the proof of Theorem 2.12 were established earlier by
Hardy & Ramanujan (1917). Kubilius (1956) generalized Turán’s inequality to
arbitrary additive functions. See Tenenbaum (1995, pp. 302–304) for a proof,
and discussion of the sharpest constants.
Theorem 2.14 is due to Hall & Tenenbaum (1988, pp. 2, 11). It represents
a weakening of sharper estimates that can be derived with more work. For
example, Wirsing (1961) showed that if $f$ is a multiplicative function such that
$f(n) \ge 0$ for all $n$, if there is a constant $C < 2$ such that $f(p^k) \ll C^k$ for all
$k \ge 2$, and if
$$\sum_{p \le x} f(p) \sim \kappa x/\log x$$
(1984, 1986, 1987). For a comprehensive account of the mean values of (not
necessarily non-negative) multiplicative functions, see Tenenbaum (1995, pp.
48–50, 308–310, 325–357). The two sides of (2.31) are of the same order of
magnitude, and with more work one can derive a more precise asymptotic
estimate; see Wilson (1922).
Section 2.4. Rényi (1955) gave a qualitative form of Theorem 2.16. Robinson
(1966) gave formulæ for the densities dk . Kac (1959, pp. 64–71) gave a proof
by probabilistic techniques. Generalizations have been given by Cohen (1964)
and Kubilius (1964). Sharper estimates for the error term have been derived
by Delange (1965, 1967/68, 1973), Kátai (1966), Saffari (1970), and Schwarz
(1970).
For a much more detailed historical account of the development of prime
number theory, see Narkiewicz (2000).
2.6 References
Bateman, P. T. (1949). Note on the coefficients of the cyclotomic polynomial, Bull. Amer.
Math. Soc. 55, 1180–1181.
Bateman, P. T. & Grosswald, E. (1958). On a theorem of Erdős and Szekeres, Illinois J.
Math. 2, 88–98.
Bombieri, E. & Pila, J. (1989). The number of integral points on arcs and ovals, Duke
Math. J. 59, 337–357.
Borwein, P. (2002). Computational excursions in analysis and number theory. Canadian
Math. Soc., New York: Springer.
Cai, J.-Y. & Bach, E. (2003). On testing for zero polynomials by a set of points with
bounded precision, Theoret. Comp. Sci. 296, 15–25.
Chebyshev, P. L. (1848). Sur la fonction qui détermine la totalité des nombres premiers
inférieurs à une limite donnée, Mem. Acad. Sci. St. Petersburg 6, 1–19.
(1850). Mémoire sur nombres premiers, Mem. Acad. Sci. St. Petersburg 7, 17–33.
(1946). Collected works of P. L. Chebyshev, Vol. 1, Akad. Nauk SSSR, Moscow–
Leningrad.
Chih, T.-T. (1950). A divisor problem, Acta Sinica Sci. Record 3, 177–182.
Chowla, S. (1932). Contributions to the analytic theory of numbers, Math. Zeit. 35,
279–299.
Cohen, E. (1964). Some asymptotic formulas in the theory of numbers, Trans. Amer.
Math. Soc. 112, 214–227.
van der Corput, J. G. (1922). Verschärfung der Abschätzung beim Teilerproblem, Math.
Ann. 87, 39–65.
(1928). Zum Teilerproblem, Math. Ann. 98, 697–716.
Costa Pereira, N. (1989). Elementary estimates for the Chebyshev function ψ(x) and
for the Möbius function M(x), Acta Arith. 52, 307–337.
Davenport, H. (1932). On a generalization of Euler’s function φ(n), J. London Math. Soc.
7, 290–296; Collected Works, Vol. IV. London: Academic Press, pp. 1827–1833.
72 The elementary theory of arithmetic functions
Huxley, M. N. (1993). Exponential sums and lattice points II. Proc. London Math. Soc.
(3) 66, 279–301.
Iwaniec, H. & Mozzochi, C. J. (1988). On the divisor and circle problems, J. Number
Theory 29, 60–93.
Jarník, V. (1926). Über die Gitterpunkte auf konvexen Curven, Math. Z. 24, 500–518.
Kac, M. (1959). Statistical Independence in Probability, Analysis and Number Theory,
Carus Monograph 12. Washington: Math. Assoc. Amer.
Kátai, I. (1966). A remark on H. Delange’s paper “Sur un théorème de Rényi”, Magyar
Tud. Akad. Mat. Fiz. Oszt. Közl. 16, 269–273.
Kolesnik, G. (1969). The improvement of the error term in the divisor problem, Mat.
Zametki 6, 545–554.
(1973). On the estimation of the error term in the divisor problem, Acta Arith. 25,
7–30.
(1982). On the order of $\zeta(\tfrac12 + it)$ and $\Delta(R)$, Pacific J. Math. 82, 107–122.
(1985). On the method of exponent pairs, Acta Arith. 45, 115–143.
Kubilius, J. (1956). Probabilistic methods in the theory of numbers (in Russian), Uspehi
Mat. Nauk 11, 31–66; Amer. Math. Soc. Transl. (2) 19 (1962), 47–85.
(1964). Probabilistic Methods in the Theory of Numbers, Translations of Mathematical
Monographs, Vol. 11. Providence: American Mathematical Society.
Landau, E. (1900). Ueber die zahlentheoretische Function ϕ(n) und ihre Beziehung zum
Goldbachschen Satz, Nachr. Akad. Wiss. Göttingen, 177–186; Collected Works,
Vol. 1. Essen: Thales Verlag, 1985, pp. 106–115.
(1903). Über den Verlauf der zahlentheoretischen Funktion ϕ(x), Arch. Math. Phys.
(3) 5, 86–91; Collected Works, Vol. 1. Essen: Thales Verlag, 1985, pp. 378–383.
(1911). Sur les valeurs moyennes de certaines fonctions arithmétiques, Bull. Acad.
Royale Belgique, 443–472; Collected Works, Vol. 4. Essen: Thales Verlag, 1986,
pp. 377–406.
(1936). On a Titchmarsh–Estermann sum, J. London Math. Soc. 11, 242–245;
Collected Works, Vol. 9. Essen: Thales Verlag, 1987, pp. 393–396.
Linfoot, E. H. & Evelyn, C. J. A. (1929). On a problem in the additive theory of numbers,
I, J. Reine Angew. Math. 164, 131–140.
Mąkowski, A. (1960). Partitions into unequal primes, Bull. Acad. Pol. Sci. 8, 125–126.
Massias, J.-P. & Robin, G. (1996). Bornes effectives pour certaines fonctions concernant
les nombres premiers, J. Théor. Nombres Bordeaux 8, 215–242.
Mertens, F. (1874a). Ueber einige asymptotische Gesetze der Zahlentheorie, J. Reine
Angew. Math. 77, 289–338.
(1874b). Ein Beitrag zur analytischen Zahlentheorie, J. Reine Angew. Math. 78,
46–62.
Montgomery, H. L. (1987). Fluctuations in the mean of Euler’s phi function, Proc. Indian
Acad. Sci. (Math. Sci.) 97, 239–245.
(1994). Ten Lectures on the Interface of Analytic Number Theory and Harmonic
Analysis, CBMS 84. Providence: Amer. Math. Soc.
Narkiewicz, W. (2000). The Development of Prime Number Theory. Berlin: Springer-
Verlag.
Nicolas, J.-L. (1988). On Highly Composite Numbers. Ramanujan Revisited (G. E.
Andrews, R. A. Askey, B. C. Berndt, K. G. Ramanathan, R. A. Rankin, eds.). New
York: Academic Press, pp. 215–244.
Shan, Z. (1985). On composite n for which ϕ(n)|(n − 1), J. China Univ. Sci. Tech. 15,
109–112.
Sitaramachandrarao, R. (1982). On an error term of Landau, Indian J. Pure Appl. Math.
13, 882–885.
(1985). On an error term of Landau, II, Rocky Mountain J. Math. 15, 579–588.
Soundararajan, K. (2003). Omega results for the divisor and circle problems, Int. Math.
Res. Not., 1987–1998.
Stieltjes, T. J. (1887). Note sur la multiplication de deux séries, Nouvelles Annales (3)
6, 210–215.
Sylvester, J. J. (1881). On Tchebycheff’s theory of the totality of the prime numbers
comprised within given limits, Amer. J. Math. 4, 230–247.
Tenenbaum, G. (1995). Introduction to Analytic and Probabilistic Number Theory, Cam-
bridge Studies 46, Cambridge: Cambridge University Press.
Turán, P. (1934). On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9,
274–276.
de la Vallée Poussin, C. J. (1898). Sur les valeurs moyennes de certaines fonctions
arithmétiques, Ann. Soc. Sci. Bruxelles 22, 84–90.
Voronoï, G. (1903). Sur un problème du calcul des fonctions asymptotiques, J. Reine
Angew. Math. 126, 241–282.
Walfisz, A. (1963). Weylsche Exponentialsummen in der neueren Zahlentheorie, Math-
ematische Forschungsberichte 15, Berlin: VEB Deutscher Verlag Wiss.
Ward, D. R. (1927). Some series involving Euler’s function, J. London Math. Soc. 2,
210–214.
Wigert, S. (1906/7). Sur l’ordre de grandeur du nombre des diviseurs d’un entier, Ark.
Mat. 3, 1–9.
Wilson, B. M. (1922). Proofs of some formulæ enunciated by Ramanujan, Proc. London
Math. Soc. 21, 235–255.
Wintner, A. (1944). The Theory of Measure in Arithmetic Semigroups. Baltimore:
Waverly Press.
Wirsing, E. (1961). Das asymptotische Verhalten von Summen über multiplikative Funk-
tionen, Math. Ann. 143, 75–102.
(1967). Das asymptotische Verhalten von Summen über multiplikative Funktionen,
II, Acta Math. Acad. Sci. Hungar. 18, 411–467.
3
Principles and first examples of sieve methods
3.1 Initiation
The aim of sieve theory is to construct estimates for the number of integers
remaining in a set after members of certain arithmetic progressions have been
discarded. If P is given, then the asymptotic density of the set of integers
relatively prime to P is ϕ(P)/P; with the aid of sieves we can estimate how
quickly this asymptotic behaviour is approached. Throughout this chapter we
let S(x, y; P) denote the number of integers n in the interval x < n ≤ x + y
for which (n, P) = 1. A first (weak) result is provided by
Proof From the characteristic property (1.20) of the Möbius µ-function, and
the fact that d|(n, P) if and only if d|n and d|P, we see that
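$$
\begin{aligned}
S(x, y; P) &= \sum_{x<n\le x+y}\ \sum_{\substack{d\mid n\\ d\mid P}} \mu(d) \\
&= \sum_{d\mid P} \mu(d) \sum_{\substack{x<n\le x+y\\ d\mid n}} 1 \\
&= \sum_{d\mid P} \mu(d)\left(\left[\frac{x+y}{d}\right] - \left[\frac{x}{d}\right]\right).
\end{aligned}
\tag{3.1}
$$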
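The identity (3.1) can be checked numerically. The following is a minimal sketch, not part of the original text; the helper names `mobius`, `sift_direct` and `sift_legendre` are hypothetical, and [t] is taken to be the floor of t.

```python
from fractions import Fraction
from math import floor, gcd


def mobius(n: int) -> int:
    """Möbius function via trial division; adequate for small n."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divided the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def sift_direct(x, y, P: int) -> int:
    """S(x, y; P): integers n with x < n <= x + y and gcd(n, P) = 1."""
    return sum(1 for n in range(floor(x) + 1, floor(x + y) + 1) if gcd(n, P) == 1)


def sift_legendre(x, y, P: int) -> int:
    """Right-hand side of (3.1): sum over d | P of mu(d)([(x+y)/d] - [x/d])."""
    return sum(mobius(d) * (floor((x + y) / d) - floor(x / d))
               for d in range(1, P + 1) if P % d == 0)


if __name__ == "__main__":
    x, y = Fraction(17, 2), Fraction(201, 2)  # exact rationals avoid rounding
    print(sift_direct(x, y, 30), sift_legendre(x, y, 30))
```

Exact rational inputs are used so that the floor functions are evaluated without floating-point error at the endpoints.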
where
$$\Sigma_s = \sum_{1\le r_1<\cdots<r_s\le R} \operatorname{card}\Big(\bigcap_{j=1}^{s} S_{r_j}\Big).$$
where ε(y) → 0 as y → ∞. This bound is very weak, but has the interesting
property of being uniform in x. Since the bound for the error term in Theorem 3.1
is very crude, we might expect that more is true, so that perhaps
$$S(x, y; P) \sim \frac{\varphi(P)}{P}\, y$$
even when z is fairly large. However, as we have already noted in our remarks
following Theorem 2.11, this asymptotic formula fails when $z = y^{1/2}$.
In order to derive a sharper estimate for S(x, y; P), we replace µ(d) by a more
general arithmetic function $\lambda_d$ that in some sense is a truncated approximation
to µ(d). This is reminiscent of our derivation of the Chebyshev bounds, but in
fact the specific properties required of the $\lambda_d$ are now rather different. Suppose
that we seek an upper bound for S(x, y; P). Let $\lambda_n^+$ be a function such that
$$\sum_{d\mid n} \lambda_d^+ \ge \begin{cases} 1 & \text{if } n = 1,\\ 0 & \text{otherwise.} \end{cases} \tag{3.5}$$
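Such a $\lambda_d^+$ we call an ‘upper bound sifting function’, and by arguing as in the
proof of Theorem 3.1 we see that
$$S(x, y; P) \le \sum_{x<n\le x+y}\ \sum_{\substack{d\mid n\\ d\mid P}} \lambda_d^+ = y \sum_{d\mid P} \frac{\lambda_d^+}{d} + O\Big(\sum_{d\mid P} |\lambda_d^+|\Big). \tag{3.6}$$
This will be useful if $\sum_{d\mid P} \lambda_d^+/d$ is not much larger than $\varphi(P)/P$, and if
$\sum_{d\mid P} |\lambda_d^+|$ is much smaller than $2^{\omega(P)}$. Brun (1915) was the first to succeed
with an argument of this kind. He took his $\lambda_n^+$ to be of the form
$$\lambda_n^+ = \begin{cases} \mu(n) & \text{if } n \in D^+,\\ 0 & \text{otherwise,} \end{cases}$$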
where $D^+$ is a judiciously chosen set of integers. A sieve of this kind is called
‘combinatorial’. With Brun’s choice of $D^+$ it is easy to verify (3.5), and it
is not hard to bound $\sum_{d\mid P} |\lambda_d^+|$, but the determination of the asymptotic size
of the main term $\sum_{d\mid P} \lambda_d^+/d$ presents some technical difficulties. We do not
develop a detailed account of Brun’s method, but the spirit of the approach can
be appreciated by considering the following simple choice of $D^+$: Let r be an
integer at our disposal, and put
$$D^+ = \{n : \omega(n) \le 2r\}.$$
We observe that
$$\sum_{d\mid P} \lambda_d^+ = \sum_{j=0}^{2r} \sum_{\substack{d\mid P\\ \omega(d)=j}} \mu(d) = \sum_{j=0}^{2r} (-1)^j \binom{\omega(P)}{j}.$$
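The alternating binomial sum above can be evaluated directly. The sketch below is an illustration added here rather than the book's argument: it compares the truncated sum with the standard closed form $\binom{w-1}{2r}$ for $w = \omega(P) \ge 1$, which makes the non-negativity needed for (3.5) visible. The function name `truncated_sum` is hypothetical.

```python
from math import comb


def truncated_sum(w: int, r: int) -> int:
    """Sum of (-1)^j C(w, j) over 0 <= j <= 2r, where w plays the role of omega(P)."""
    return sum((-1) ** j * comb(w, j) for j in range(2 * r + 1))


if __name__ == "__main__":
    for w in range(0, 8):
        # Each row is non-negative, and equals 1 when w = 0.
        print(w, [truncated_sum(w, r) for r in range(0, 4)])
```

Stopping the inclusion-exclusion at an even level therefore never undercounts, which is the property an upper bound sifting function requires.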
3.1.1 Exercises
1. (Charles Dodgson) In a very hotly fought battle, at least 70% of the combat-
ants lost an eye, at least 75% an ear, at least 80% an arm, and at least 85% a
leg. What can you say about the percentage that lost all four members?
2. (P. T. Bateman) Would you believe a market investigator who reports that of
1000 people, 816 like candy, 723 like ice cream, 645 like cake, while 562
like both candy and ice cream, 463 like both candy and cake, 470 like both
ice cream and cake, while 310 like all three?
3. (Erdős 1946) For x > 0 write
$$\sum_{\substack{1\le n\le x\\ (n,k)=1}} 1 = \frac{\varphi(k)}{k}\, x + E_k(x).$$
(c) By using the result of Exercise B.10, or otherwise, show that if d|k and
e|k, then
$$\int_0^k B_1(\{x/d\})\, B_1(\{x/e\})\, dx = \frac{(d, e)^2}{12de}\, k.$$
(d) Show that if k > 1, then
$$\int_0^k E_k(x)^2\, dx = \frac{1}{12}\, 2^{\omega(k)} \varphi(k).$$
(e) Deduce that if k > 1, then
$$\max_x |E_k(x)| \gg 2^{\omega(k)/2} \Big(\frac{\varphi(k)}{k}\Big)^{1/2}.$$
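The mean-square formula in part (d) can be tested exactly for small k. The sketch below is an illustration only (it is not a solution of the exercise): since the counting function is constant on each interval [m, m + 1), $E_k(x)$ is linear there and the integral can be summed in closed form with rational arithmetic. The names `omega` and `mean_square_error` are hypothetical.

```python
from fractions import Fraction
from math import gcd


def omega(k: int) -> int:
    """Number of distinct prime factors of k."""
    count, p = 0, 2
    while p * p <= k:
        if k % p == 0:
            count += 1
            while k % p == 0:
                k //= p
        p += 1
    return count + (1 if k > 1 else 0)


def mean_square_error(k: int) -> Fraction:
    """Exact value of the integral of E_k(x)^2 over [0, k]."""
    phi = sum(1 for n in range(1, k + 1) if gcd(n, k) == 1)
    a = Fraction(phi, k)
    total, coprime_so_far = Fraction(0), 0
    for m in range(k):
        if m >= 1 and gcd(m, k) == 1:
            coprime_so_far += 1
        # On [m, m+1): E_k(x) = coprime_so_far - a*x, integrated exactly.
        u = coprime_so_far - a * m
        v = coprime_so_far - a * (m + 1)
        total += (u ** 3 - v ** 3) / (3 * a)
    return total


if __name__ == "__main__":
    print(mean_square_error(6))  # compare with 2^omega(6) * phi(6) / 12
```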
4. (Lehmer 1955; cf. Vijayaraghavan 1951) Let $E_k(x)$ be defined as above.
(a) Show that $|E_k(x)| \le 2^{\omega(k)-1}$ for all k > 1.
(b) Suppose that k is composed of distinct primes p ≡ 3 (mod 4), and that
ω(k) is even. Show that if d|k, then $\mu(d) B_1(\{k/(4d)\}) = -1/4$.
(c) Show that there exist infinitely many numbers k for which
5. (Behrend 1948; cf. Heilbronn 1937, Rohrbach 1937, Chung 1941, van der
Corput 1958) Let $a_1, \ldots, a_J$ be positive integers, and let $T(a_1, \ldots, a_J)$ denote
the asymptotic density of the set of those positive integers that are not
divisible by any of the $a_i$.
(a) Show that $T(a_1, \ldots, a_J) = \sum_{j=0}^{J} (-1)^j \Sigma_j$ where
$$\Sigma_j = \sum_{1\le i_1<\cdots<i_j\le J} \frac{1}{[a_{i_1}, \ldots, a_{i_j}]}.$$
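This simple observation can be used to obtain an upper bound for S(x, y; P);
namely
$$
\begin{aligned}
S(x, y; P) &\le \sum_{x<n\le x+y} \Big(\sum_{\substack{d\mid n\\ d\mid P}} \Lambda_d\Big)^2 \\
&= \sum_{\substack{d\mid P\\ e\mid P}} \Lambda_d \Lambda_e \sum_{\substack{x<n\le x+y\\ d\mid n,\ e\mid n}} 1 \\
&= \sum_{\substack{d\mid P\\ e\mid P}} \Lambda_d \Lambda_e \left(\left[\frac{x+y}{[d, e]}\right] - \left[\frac{x}{[d, e]}\right]\right) \\
&= y \sum_{\substack{d\mid P\\ e\mid P}} \frac{\Lambda_d \Lambda_e}{[d, e]} + O\Big(\Big(\sum_{d\mid P} |\Lambda_d|\Big)^2\Big).
\end{aligned}
\tag{3.10}
$$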
Theorem 3.2 Let x, y, and z be real numbers such that y > 0 and z ≥ 1. For
any positive integer P we have
$$S(x, y; P) \le \frac{y}{L_P(z)} + O\big(z^2 L_P(z)^{-2}\big)$$
3.2 The Selberg lambda-squared method 83
where
$$L_P(z) = \sum_{\substack{n\le z\\ n\mid P}} \frac{\mu(n)^2}{\varphi(n)}.$$
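Hence
$$\sum_{\substack{d\mid P\\ e\mid P}} \frac{\Lambda_d \Lambda_e}{[d, e]} = \sum_{f\mid P} \varphi(f) \sum_{\substack{d\\ f\mid d\mid P}} \frac{\Lambda_d}{d} \sum_{\substack{e\\ f\mid e\mid P}} \frac{\Lambda_e}{e} = \sum_{f\mid P} \varphi(f)\, y_f^2$$
where
$$y_f = \sum_{\substack{d\\ f\mid d\mid P}} \frac{\Lambda_d}{d}. \tag{3.11}$$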
Moreover, from these formulæ we see that $\Lambda_d = 0$ for all d > z if and only if
$y_f = 0$ for all f > z. Thus we have diagonalized the quadratic form in (3.10),
and by (3.12) we see that the constraint $\Lambda_1 = 1$ is equivalent to the linear
condition
$$\sum_{f\mid P} y_f\, \mu(f) = 1. \tag{3.13}$$
$$\sum_{f\mid P} \varphi(f)\, y_f^2 = \sum_{\substack{f\mid P\\ f\le z}} \varphi(f) \Big(y_f - \frac{\mu(f)}{\varphi(f) L_P(z)}\Big)^2 + \frac{1}{L_P(z)}. \tag{3.14}$$
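The completed square in (3.14) suggests the choice $y_f = \mu(f)/(\varphi(f) L_P(z))$ for f ≤ z, which makes the quadratic form equal to $1/L_P(z)$. The sketch below checks this for a small square-free P in exact arithmetic. It assumes that the inverse of (3.11) is the Möbius inversion $\Lambda_d = d \sum_{f:\, d\mid f\mid P} y_f\, \mu(f/d)$ (the formula (3.12) is not reproduced in this excerpt, so this form is an assumption), and all function names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def mobius(n: int) -> int:
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n: int) -> int:
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def selberg_weights(P: int, z: int):
    """Return (L_P(z), weights Lambda_d, quadratic form) for square-free P."""
    divs = [d for d in range(1, P + 1) if P % d == 0]
    L = sum(Fraction(mobius(f) ** 2, phi(f)) for f in divs if f <= z)
    # Assumed optimal choice suggested by (3.14); y_f = 0 for f > z.
    y = {f: Fraction(mobius(f), phi(f)) / L if f <= z else Fraction(0) for f in divs}
    # Assumed Möbius inversion of (3.11).
    lam = {d: d * sum(y[f] * mobius(f // d) for f in divs if f % d == 0) for d in divs}
    # 1/[d, e] = (d, e)/(d e)
    form = sum(lam[d] * lam[e] * Fraction(gcd(d, e), d * e) for d in divs for e in divs)
    return L, lam, form


if __name__ == "__main__":
    L, lam, form = selberg_weights(210, 10)
    print(L, lam[1], form)
```

With P = 210 and z = 10 the weights vanish for d > z, satisfy $\Lambda_1 = 1$, and the quadratic form equals $1/L_P(z)$, as (3.13) and (3.14) predict.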
In order to apply Theorem 3.2, we require a lower bound for the sum $L_P(z)$.
To this end we show that
$$\sum_{n\le z} \frac{\mu(n)^2}{\varphi(n)} > \log z \tag{3.18}$$
for all z ≥ 1. Let s(n) denote the largest square-free number dividing n (sometimes
called the ‘square-free kernel of n’). Then for square-free n,
$$\frac{1}{\varphi(n)} = \frac{1}{n} \prod_{p\mid n} \Big(1 + \frac{1}{p} + \frac{1}{p^2} + \cdots\Big) = \sum_{\substack{m\\ s(m)=n}} \frac{1}{m},$$
Here the last inequality is obtained by the integral test. With more work one can
derive an asymptotic formula for the sum in (3.18) (recall Exercise 2.1.17).
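A quick numerical look at (3.18) is possible with a sieve. The following sketch is only an illustration (the function name `squarefree_phi_sum` is hypothetical); it tabulates φ(n) and a square-free flag for n ≤ z and compares the sum with log z.

```python
from math import log


def squarefree_phi_sum(z: int) -> float:
    """Sum of mu(n)^2 / phi(n) over 1 <= n <= z."""
    phi = list(range(z + 1))
    squarefree = [True] * (z + 1)
    for p in range(2, z + 1):
        if phi[p] == p:  # phi[p] untouched so far, hence p is prime
            for m in range(p, z + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, z + 1, p * p):
                squarefree[m] = False
    return sum(1.0 / phi[n] for n in range(1, z + 1) if squarefree[n])


if __name__ == "__main__":
    for z in (10, 100, 1000, 10000):
        print(z, squarefree_phi_sum(z), log(z))
```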
By taking $z = y^{1/2}$ in Theorem 3.2, and appealing to (3.18), we obtain
Theorem 3.3 Let $P = \prod_{p\le\sqrt{y}} p$. Then for any x and any y ≥ 2,
$$S(x, y; P) \le \frac{2y}{\log y} \Big(1 + O\Big(\frac{1}{\log y}\Big)\Big).$$
By combining the above with (3.3) we obtain an immediate application to
the distribution of prime numbers.
Corollary 3.4 For any x ≥ 0 and any y ≥ 2,
$$\pi(x + y) - \pi(x) \le \frac{2y}{\log y} \Big(1 + O\Big(\frac{1}{\log y}\Big)\Big).$$
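As an illustration of Corollary 3.4, one can count primes in sliding windows and compare with the main term $2y/\log y$. The sketch below drops the $O(1/\log y)$ factor, which the notes to this chapter report is permissible by a result of Montgomery & Vaughan (1973); the helper names are hypothetical and the ranges are arbitrary choices.

```python
from math import log


def prime_flags(limit: int) -> list:
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime (limit >= 1)."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return flags


def primes_between(x: int, y: int, flags: list) -> int:
    """pi(x + y) - pi(x) for non-negative integers x, y."""
    return sum(flags[x + 1 : x + y + 1])


if __name__ == "__main__":
    flags = prime_flags(3000)
    y = 100
    worst = max(primes_between(x, y, flags) for x in range(0, 2900))
    print(worst, 2 * y / log(y))
```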
In Theorem 3.3 we consider only a very special sort of P, but the following
lemma enables us to obtain corresponding results for more general P.
Lemma 3.5 Put $M(y; P) = \max_x S(x, y; P)$. If (P, q) = 1, then
$$M(y; P) \le \frac{q}{\varphi(q)}\, M(y; qP).$$
Proof It suffices to show that
$$\varphi(q)\, S(x, y; P) = \sum_{m=1}^{q} S(x + Pm, y; qP), \tag{3.19}$$
since the right-hand side is bounded above by q M(y; qP). Suppose that x +
Pm < n ≤ x + Pm + y and that (n, qP) = 1. Put r = n − Pm. Then x <
r ≤ x + y, (r, P) = 1, and (r + Pm, q) = 1. Thus the right-hand side above is
$$\sum_{m=1}^{q} \sum_{\substack{x<r\le x+y\\ (r,P)=1\\ (r+Pm,q)=1}} 1 = \sum_{\substack{x<r\le x+y\\ (r,P)=1}}\ \sum_{\substack{1\le m\le q\\ (r+Pm,q)=1}} 1.$$
Since (P, q) = 1, the map m → r + Pm permutes the residue classes (mod q).
Hence the inner sum above is ϕ(q), and we have (3.19).
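The identity (3.19) is exact, so it can be confirmed by brute force for small parameters. This is a minimal sketch with hypothetical function names, restricted to integer x and y for simplicity.

```python
from math import gcd


def phi(n: int) -> int:
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def S(x: int, y: int, P: int) -> int:
    """Number of integers n with x < n <= x + y and gcd(n, P) = 1."""
    return sum(1 for n in range(x + 1, x + y + 1) if gcd(n, P) == 1)


def both_sides(x: int, y: int, P: int, q: int):
    """Left and right sides of (3.19); requires gcd(P, q) = 1."""
    assert gcd(P, q) == 1
    lhs = phi(q) * S(x, y, P)
    rhs = sum(S(x + P * m, y, q * P) for m in range(1, q + 1))
    return lhs, rhs


if __name__ == "__main__":
    print(both_sides(10, 50, 6, 35))
```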
Proof Let
$$P_1 = \prod_{\substack{p\mid P\\ p\le\sqrt{y}}} p, \qquad q_1 = \prod_{\substack{p\nmid P\\ p\le\sqrt{y}}} p.$$
Theorem 3.3 provides an upper bound for $M(y; q_1 P_1)$, and hence by Lemma
3.5 we have an upper bound for $M(y; P_1)$. To complete the argument it suffices
to note that $S(x, y; P) \le S(x, y; P_1) \le M(y; P_1)$, and to appeal to Mertens’
formula (Theorem 2.7(e)).
We note that Theorem 3.3 is a special case of Theorem 3.6. Although we have
taken great care to derive uniform estimates, for many purposes it is enough to
know that
$$S(x, y; P) \ll y \prod_{\substack{p\mid P\\ p\le y}} \Big(1 - \frac{1}{p}\Big). \tag{3.20}$$
This follows from Theorem 3.6 since $\prod_{\sqrt{y}<p\le y} (1 - 1/p)^{-1} \ll 1$ by Mertens’
formula. To obtain an estimate in the opposite direction, write $P = P_1 q_1$ where
$P_1$ is composed entirely of primes > y, and $q_1$ is composed entirely of primes
≤ y. Since the integers in the interval (0, y] have no prime factor > y, we see
that $M(y; P_1) \ge [y]$. Hence by Lemma 3.5,
$$M(y; P) \ge [y] \prod_{\substack{p\mid P\\ p\le y}} \Big(1 - \frac{1}{p}\Big). \tag{3.21}$$
Then by Exercise 2.1.17 and Mertens’ estimates (Theorem 2.7) it follows that
this is $\tfrac{1}{4}(3 - 2\log 2)\log y + O(1)$.
3.2.1 Exercises
1. Let $\Lambda_d$ be defined as in the proof of Theorem 3.2.
(a) Show that
$$\Lambda_d \ll \frac{d}{L_P(z)\varphi(d)} \log\frac{2z}{d}$$
for d ≤ z.
(b) Use the above to give a second proof of (3.17).
2. Show that for y ≥ 2 the number of prime powers $p^k$ in the interval
(x, x + y] is
$$\le \frac{2y}{\log y} \Big(1 + O\Big(\frac{1}{\log y}\Big)\Big).$$
3. (Chowla 1932) Let f (n) be an arithmetic function, put
$$g(n) = \sum_{[d,e]=n} f(d) f(e),$$
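5. (Hensley 1978)
(a) Let $P = \prod_{p\le\sqrt{y}} p$. Show that the number of n, x < n ≤ x + y, such
that Ω(n) = 2, is
$$\le S(x, y; P) + \sum_{p\le\sqrt{y}} \Big(\pi\Big(\frac{x+y}{p}\Big) - \pi\Big(\frac{x}{p}\Big)\Big).$$
(b) By using Theorem 3.3 and Corollary 3.4, show that for y ≥ 2,
$$\sum_{\substack{x<n\le x+y\\ \Omega(n)=2}} 1 \le \frac{2y\log\log y}{\log y} \Big(1 + O\Big(\frac{1}{\log\log y}\Big)\Big).$$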
(c) Let $\lambda_d^-$ be a lower bound sifting function such that $\lambda_d^- = 0$ for d > z.
Show that for any q,
$$\frac{\varphi(q)}{q} \sum_{\substack{d\\ (d,q)=1}} \frac{\lambda_d^-}{d} \ge \sum_{d} \frac{\lambda_d^-}{d}.$$
Theorem 3.8 Suppose that (a, q) = 1, that (P, q) = 1, and that x and y are
real numbers with y ≥ 2q. The number of n, x < n ≤ x + y, such that n ≡ a
(mod q) and (n, P) = 1 is
$$\le e^{C_0}\, \frac{y}{q} \Bigg(\prod_{\substack{p\mid P\\ p\le\sqrt{y/q}}} \Big(1 - \frac{1}{p}\Big)\Bigg) \Big(1 + O\Big(\frac{1}{\log y/q}\Big)\Big).$$
Thus the n that remain after sifting are precisely the n for which (a(n), P) = 1.
By the sieve we obtain upper and lower bounds for the number of remaining n
of the form
$$\sum_{x<n\le x+y}\ \sum_{m\mid (a(n),P)} \lambda_m = \sum_{m\mid P} \lambda_m \sum_{\substack{x<n\le x+y\\ m\mid a(n)}} 1. \tag{3.24}$$
Now p|a(n) if and only if n ∈ B(p) (mod p). By the Chinese remainder theorem,
this will be the case for all p|m when n lies in one of precisely $\prod_{p\mid m} b(p)$
residue classes modulo m. The b(p) are defined only for primes, but it is
convenient now to extend the definition to all positive integers by putting
$$b(m) = \prod_{p^\alpha \| m} b(p)^\alpha.$$
Thus b(m) is the totally multiplicative function generated by the b( p). For
square-free m, b(m) represents the number of deleted residue classes modulo
m. We are now in a position to estimate the inner sum above. We partition the
interval (x, x + y] into [y/m] intervals of length m, and one interval of length
{y/m}m. In each interval of length m there are precisely b(m) values of n for
which m|a(n). In the final shorter interval, the number of such n lies between
0 and b(m). Thus the inner sum on the right above is $= y\,b(m)/m + O(b(m))$,
and hence the expression (3.24) is
$$= y \sum_{m\mid P} \frac{b(m)\lambda_m}{m} + O\Big(\sum_{m\mid P} b(m)|\lambda_m|\Big). \tag{3.25}$$
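The counting step behind (3.25) can be illustrated with a concrete choice of a(n). The sketch below assumes the twin-prime setting a(n) = n(n + 2), so that b(2) = 1 and b(p) = 2 for odd p; this choice and the function names are assumptions made for the example, not taken from the text above. It checks that the number of n in (x, x + y] with m | a(n) differs from y b(m)/m by at most b(m) for square-free m.

```python
def prime_factors(m: int) -> list:
    """Distinct prime factors of m in increasing order."""
    factors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def b(m: int) -> int:
    """Deleted residue classes mod square-free m, assuming a(n) = n(n + 2)."""
    result = 1
    for p in prime_factors(m):
        result *= 1 if p == 2 else 2
    return result


def count_divisible(x: int, y: int, m: int) -> int:
    """Number of n with x < n <= x + y and m | n(n + 2)."""
    return sum(1 for n in range(x + 1, x + y + 1) if (n * (n + 2)) % m == 0)


if __name__ == "__main__":
    x, y, m = 17, 997, 105
    print(count_divisible(x, y, m), y * b(m) / m, b(m))
```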
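To continue from this point, one should specify the choice of $\lambda_m$, and then
estimate the main term and error term. In the context of Selberg’s $\Lambda^2$ method,
we have real $\Lambda_d$ with $\Lambda_1 = 1$ and $\Lambda_d = 0$ for d > z. The number of n ∈ (x, x + y]
that survive sifting is
$$
\begin{aligned}
&\le \sum_{x<n\le x+y} \Big(\sum_{d\mid (a(n),P)} \Lambda_d\Big)^2 = \sum_{d\mid P} \sum_{e\mid P} \Lambda_d \Lambda_e \sum_{\substack{x<n\le x+y\\ [d,e]\mid a(n)}} 1 \\
&= y \sum_{d\mid P} \sum_{e\mid P} \frac{b([d, e])}{[d, e]}\, \Lambda_d \Lambda_e + O\Big(\sum_{d\mid P} \sum_{e\mid P} b([d, e])\, |\Lambda_d \Lambda_e|\Big).
\end{aligned}
\tag{3.26}
$$
This is (3.25) with $\lambda_m = \sum_{[d,e]=m} \Lambda_d \Lambda_e$.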
We consider first the main term above. Clearly [d, e] = de/(d, e) and
b([d, e]) = b(d)b(e)/b((d, e)). For square-free m put
$$g(m) = \prod_{p\mid m} \frac{b(p)}{p - b(p)}. \tag{3.27}$$
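By applying this with m = (d, e) we see that the first sum in (3.26) is
$$
\begin{aligned}
\sum_{\substack{d\mid P\\ e\mid P}} \frac{b(d)\Lambda_d}{d} \cdot \frac{b(e)\Lambda_e}{e} \cdot \frac{(d, e)}{b((d, e))}
&= \sum_{\substack{d\mid P\\ e\mid P}} \frac{b(d)\Lambda_d}{d} \cdot \frac{b(e)\Lambda_e}{e} \sum_{\substack{f\mid d\\ f\mid e}} \frac{1}{g(f)} \\
&= \sum_{f\mid P} \frac{1}{g(f)} \Big(\sum_{\substack{d\\ f\mid d\mid P}} \frac{b(d)}{d} \Lambda_d\Big) \Big(\sum_{\substack{e\\ f\mid e\mid P}} \frac{b(e)}{e} \Lambda_e\Big) \\
&= \sum_{f\mid P} \frac{1}{g(f)}\, y_f^2
\end{aligned}
\tag{3.28}
$$
where
$$y_f = \sum_{\substack{d\\ f\mid d\mid P}} \frac{b(d)}{d} \Lambda_d. \tag{3.29}$$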
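The first step of (3.28) replaces $(d, e)/b((d, e))$ by a sum of $1/g(f)$ over common divisors f. For square-free m this rests on the identity $m/b(m) = \sum_{f\mid m} 1/g(f)$, which follows from (3.27) by multiplicativity since $1 + (p - b(p))/b(p) = p/b(p)$. The sketch below checks the identity exactly; the values b(2) = 1 and b(p) = 2 for odd p are an assumed example (the twin-prime case), and the function names are hypothetical.

```python
from fractions import Fraction


def prime_factors(m: int) -> list:
    factors, p = [], 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def b_prime(p: int) -> int:
    """Assumed example values: b(2) = 1, b(p) = 2 for odd primes p."""
    return 1 if p == 2 else 2


def b(m: int) -> int:
    result = 1
    for p in prime_factors(m):
        result *= b_prime(p)
    return result


def g(m: int) -> Fraction:
    """g(m) as in (3.27), for square-free m."""
    result = Fraction(1)
    for p in prime_factors(m):
        result *= Fraction(b_prime(p), p - b_prime(p))
    return result


def identity_holds(m: int) -> bool:
    """Check m/b(m) = sum over f | m of 1/g(f) for square-free m."""
    rhs = sum(1 / g(f) for f in range(1, m + 1) if m % f == 0)
    return Fraction(m, b(m)) == rhs


if __name__ == "__main__":
    print([identity_holds(m) for m in (1, 2, 3, 6, 15, 30, 210)])
```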
3.4 Twin primes 93
By the above formulæ we see that the condition that $\Lambda_d = 0$ for d > z is
equivalent to the condition that $y_f = 0$ for f > z. Also, the condition that
$\Lambda_1 = 1$ is equivalent to
$$\sum_{f\mid P} y_f\, \mu(f) = 1. \tag{3.31}$$
where
$$L = \sum_{\substack{f\le z\\ f\mid P}} \mu(f)^2 g(f). \tag{3.33}$$
Hence
$$L = \sum_{m\le z} \mu(m)^2 g(m) = \sum_{k\le z} c(k) \sum_{n\le z/k} d(n)/n.$$
The Euler product in (3.35) is absolutely convergent for σ > −1/2. Hence
$\sum |c(k)| k^{-\sigma} < \infty$ for σ > −1/2. Thus the two sums in the error terms above
are convergent. Also,
$$\sum_{k>z} |c(k)| \le \frac{1}{\log z} \sum_{k=1}^{\infty} |c(k)| \log k \ll \frac{1}{\log z}.$$
Thus by taking s = 0 in (3.35) we find that
$$L = \frac{1}{2c} (\log z)^2 + O(\log z). \tag{3.36}$$
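It remains to bound the error term in (3.26). Since 0 ≤ b([d, e]) ≤ b(d)b(e),
the error term is
$$\ll \Big(\sum_{d\le z} b(d)|\Lambda_d|\Big)^2.$$
Hence
$$
\begin{aligned}
\sum_{d\le z} b(d)|\Lambda_d| &\ll \frac{1}{L} \sum_{d\le z} \mu(d)^2\, d\, g(d) \sum_{m\le z/d} \mu(m)^2 g(m) \\
&= \frac{1}{L} \sum_{m\le z} \mu(m)^2 g(m) \sum_{d\le z/m} \mu(d)^2\, d\, g(d).
\end{aligned}
$$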
We note that
$$\sum_{\substack{b=1\\ b\notin B_1(p')}}^{p'}\ \sum_{\substack{x<n\le x+y\\ n\notin B_1\\ n\not\equiv b\ (p')}} 1 = \sum_{\substack{x<n\le x+y\\ n\notin B_1}}\ \sum_{\substack{b=1\\ b\notin B_1(p')\\ b\not\equiv n\ (p')}}^{p'} 1. \tag{3.37}$$
Consider the inner sum on the right. Since $n \notin B_1(p')$, the variable b is restricted
to lie in one of $p' - b_1(p') - 1$ residue classes. Hence the right-hand side above
is
$$= (p' - b_1(p') - 1)\, M(x, y; b_1).$$
Since there are $p' - b_1(p')$ values of b in the outer sum on the left-hand side of
(3.37), it follows that there is a choice of b such that $b \notin B_1(p')$ and
$$\sum_{\substack{x<n\le x+y\\ n\notin B_1\\ n\not\equiv b\ (p')}} 1 \ge M(x, y; b_1)\, \frac{p' - b_1(p') - 1}{p' - b_1(p')}.$$
Thus
$$M(x, y; b_1) \le M(x, y; b_2) \prod_{p\mid P} \frac{p - b_1(p)}{p - b_2(p)},$$
and that
$$\sum_{n\le x} r(n)^2 \ll x. \tag{3.39}$$
The first of these estimates is easy: Put y = [(log x)/ log 2]. If 0 ≤ k ≤ y − 1,
then $2^k \le x/2$, and if also p ≤ x/2, then $p + 2^k \le x$. Thus the sum in (3.38)
is
$$\ge \pi(x/2)\, y \gg \frac{x}{\log x} \log x \gg x$$
for x ≥ 4.
To prove (3.39), we first observe that the sum on the left-hand side is
$$= \sum_{\substack{p_1, p_2, j, k\\ p_1 + 2^j \le x\\ p_2 + 2^k \le x\\ p_1 + 2^j = p_2 + 2^k}} 1.$$
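The two sums can be tabulated for moderate x. The sketch below assumes that r(n) counts the pairs (p, k) with p prime, k ≥ 0 and $p + 2^k = n$; the definition of r(n) is not reproduced in this excerpt, so this reading (suggested by the range 0 ≤ k ≤ y − 1 used above) is an assumption, as are the function names.

```python
def prime_flags(limit: int) -> list:
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime (limit >= 1)."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return flags


def representation_counts(x: int) -> list:
    """r[n] = number of (p, k), p prime, k >= 0, with p + 2^k = n, for n <= x."""
    flags = prime_flags(x)
    r = [0] * (x + 1)
    for p in range(2, x + 1):
        if flags[p]:
            power = 1  # 2^0
            while p + power <= x:
                r[p + power] += 1
                power *= 2
    return r


if __name__ == "__main__":
    x = 10 ** 5
    r = representation_counts(x)
    print(sum(r) / x, sum(v * v for v in r) / x)
```

Both printed ratios stay of moderate size as x grows, which is what the lower bound for the first moment and the upper bound (3.39) for the second moment describe.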
3.4.1 Exercises
1. For each prime p let B(p) be the union of b(p) ‘bad’ arithmetic progressions
with common difference p. Put $B = \bigcup_{p\mid P} B(p)$, and let
$$m(x, y; b) = \min_{B} \sum_{\substack{x<n\le x+y\\ n\notin B}} 1$$
where the minimum is over all choices of the B(p) with b(p) fixed. Show
that if $b_1(p) \le b_2(p)$ for all p, then
$$m(x, y; b_1) \prod_{p} \Big(1 - \frac{b_1(p)}{p}\Big)^{-1} \ge m(x, y; b_2) \prod_{p} \Big(1 - \frac{b_2(p)}{p}\Big)^{-1}.$$
(d) Show that if a and b are fixed real numbers with a < b, then
(b log p − d( p)) 4(b − a)2 x .
p≤x
a log p≤d( p)≤b log p
(i) Take b = a + 1/8, and suppose that d(p) ≥ a log p for all $p > p_0$.
Show that the estimates of (f) and (h) are inconsistent if a > 15/16.
Thus conclude that
$$\liminf_{p\to\infty} \frac{d(p)}{\log p} \le \frac{15}{16}.$$
4. Let r(n) be defined as in the proof of Theorem 3.15. Show that
$$\sum_{n\le x} r(n) \sim \frac{x}{\log 2}.$$
5. Let r(n) be defined as in the proof of Theorem 3.15. Show that
$$\sum_{\substack{n\le x\\ 2\mid n}} r(n) \ll \frac{x}{\log x}.$$
6. (Erdős 1950)
(a) Show that if n ≡ 1 (mod 3) and k ≡ 0 (mod 2), then $3 \mid (n - 2^k)$.
(b) Show that if n ≡ 1 (mod 7) and k ≡ 0 (mod 3), then $7 \mid (n - 2^k)$.
(c) Show that if n ≡ 2 (mod 5) and k ≡ 1 (mod 4), then $5 \mid (n - 2^k)$.
(d) Show that if n ≡ 8 (mod 17) and k ≡ 3 (mod 8), then $17 \mid (n - 2^k)$.
(e) Show that if n ≡ 11 (mod 13) and k ≡ 7 (mod 12), then $13 \mid (n - 2^k)$.
(f) Show that if n ≡ 121 (mod 241) and k ≡ 23 (mod 24), then $241 \mid (n - 2^k)$.
(g) Show that every integer k satisfies at least one of the congruences
k ≡ 0 (mod 2), k ≡ 0 (mod 3), k ≡ 1 (mod 4), k ≡ 3 (mod 8), k ≡
7 (mod 12), k ≡ 23 (mod 24).
(h) Show that if n satisfies all the congruences n ≡ 1 (mod 3), n ≡ 1
(mod 7), n ≡ 2 (mod 5), n ≡ 8 (mod 17), n ≡ 11 (mod 13), n ≡
121 (mod 241), then $n - 2^k$ is divisible by at least one of the primes
3, 7, 5, 17, 13, 241.
(i) Show that these congruential conditions are equivalent to the single
condition n ≡ 2036812 (mod 5592405).
(j) An integer n satisfying the above might still be representable in the
form $p + 2^k$, but if it is, then the prime in question must be one of the
six primes listed. Show that if in addition, n ≡ 9 or 11 or 15 (mod 16),
then n cannot be expressed as a sum of a prime and a power of 2.
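The arithmetic in parts (a) to (i) is finite and can be checked mechanically. The sketch below is an aid to the reader rather than a solution: it verifies that the six congruences on k cover every residue modulo 24, that each pairing forces the stated prime to divide $n - 2^k$, and it computes the combined residue class of n by a simple Chinese remainder search instead of assuming it. The table `CONGRUENCES` simply transcribes the data of parts (a) to (f); the function names are hypothetical.

```python
# (prime p, residue of n mod p, residue of k, modulus for k), from parts (a)-(f)
CONGRUENCES = [
    (3, 1, 0, 2),
    (7, 1, 0, 3),
    (5, 2, 1, 4),
    (17, 8, 3, 8),
    (13, 11, 7, 12),
    (241, 121, 23, 24),
]


def every_k_is_covered() -> bool:
    """Part (g): each k mod 24 satisfies at least one congruence on k."""
    return all(any(k % m == a for _, _, a, m in CONGRUENCES) for k in range(24))


def divisibility_holds() -> bool:
    """Parts (a)-(f): p divides n - 2^k whenever n and k lie in the paired classes."""
    return all((n_res - pow(2, k, p)) % p == 0
               for p, n_res, a, m in CONGRUENCES
               for k in range(a, a + 24 * m, m))


def combined_residue():
    """Least non-negative n satisfying all six congruences, with its modulus."""
    n, modulus = 0, 1
    for p, n_res, _, _ in CONGRUENCES:
        while n % p != n_res:
            n += modulus
        modulus *= p
    return n, modulus


if __name__ == "__main__":
    print(every_k_is_covered(), divisibility_holds(), combined_residue())
```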
3.5 Notes
Sections 3.1, 3.2. The modern era of sieve methods began with the work
of Brun (1915, 1919). Hardy & Littlewood (1922) used Brun’s method to
establish the estimate (3.9). The sharp form of this in Corollary 3.4 is due
$$L = \sum_{q\le Q} \frac{\mu(q)^2}{1 + \frac{3}{2} qQ/N} \prod_{p\mid q} \frac{b(p)}{p - b(p)}. \tag{3.43}$$
Here b( p) is the number of residue classes modulo p that are deleted. This is
both a generalization and a sharpening of Theorem 3.2.
Section 3.3. Titchmarsh (1930) used Brun’s method to obtain Theorem 3.9,
but with a larger constant instead of 2. Montgomery & Vaughan (1973) have
shown that Corollary 3.4 and Theorem 3.9 are still valid when the error terms are
omitted. See also Selberg (1991, Section 22). The first significant improvement
of Theorem 3.9 was obtained by Motohashi (1973). Other improvements of
various kinds have been derived by Motohashi (1974), Hooley (1972, 1975),
Goldfeld (1975), Iwaniec (1982), and Friedlander & Iwaniec (1997).
In Lemmas 3.5 and 3.12, and in Exercises 3.2.7, 3.2.9, 3.2.10, 3.4.1 we see
evidence of a monotonicity principle that permeates sieve theory; cf. Selberg
(1991, pp. 72–73).
Hooley (1994) has shown that quite sharp sieve bounds can be derived using
the interrupted inclusion–exclusion idea that Brun started with. This approach
has been developed further by Ford & Halberstam (2000). An exposition of
sieves based on these ideas is given by Bateman & Diamond (2004, Chapters 12,
13). Still more extensive accounts of sieve methods have been given by Greaves
(2001), Halberstam & Richert (1974), Iwaniec & Kowalski (2004, Chapter
6), Motohashi (1983), and Selberg (1971, 1991). In addition, a collection of
applications of sieves to arithmetic problems has been given by Hooley (1976),
and additional sieve ideas are found in Bombieri (1977), Bombieri, Friedlander
& Iwaniec (1986, 1987, 1989), Fouvry & Iwaniec (1997), Friedlander & Iwaniec
(1998a, b), and Iwaniec (1978, 1980a, b, 1981).
Section 3.4. The twin prime conjecture is a special case of the prime k-tuple
conjecture. Suppose that $d_1, \ldots, d_k$ are distinct integers, and let b(p) denote
the number of distinct residue classes modulo p found among the $d_i$. The prime
k-tuple conjecture asserts that if b(p) < p for every prime number p, then there
exist infinitely many positive integers n such that the k numbers $n + d_i$ are all
prime. Hardy & Littlewood (1922) put this in a quantitative form: If b(p) < p
for all p, then the number of n ≤ N for which the k numbers $n + d_i$ are all
prime is conjectured to be
$$\sim \mathfrak{S}(\mathbf{d})\, \frac{N}{(\log N)^k} \tag{3.44}$$
as N → ∞ where
$$\mathfrak{S}(\mathbf{d}) = \prod_{p} \Big(1 - \frac{b(p)}{p}\Big) \Big(1 - \frac{1}{p}\Big)^{-k}. \tag{3.45}$$
This product is absolutely convergent, since b( p) = k for all sufficiently large
primes p. Although this remains unproved, by sifting we can obtain an upper
bound of the expected order of magnitude. In particular, from (3.43) it can be
shown that the number of n, M + 1 ≤ n ≤ M + N, for which the numbers
$n + d_i$ are all prime is
$$\lesssim 2^k k!\, \mathfrak{S}(\mathbf{d})\, \frac{N}{(\log N)^k}. \tag{3.46}$$
Corollaries 3.4 and 3.14 are special cases of this.
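The product (3.45) converges quickly enough to be approximated by truncation. The sketch below is a rough numerical illustration, with the hypothetical function name `singular_series` and an arbitrary truncation point; for the twin prime pair d = (0, 2) the truncated product should be close to twice the twin prime constant, about 1.3203, and for an inadmissible tuple such as (0, 1) it vanishes because b(2) = 2.

```python
def primes_up_to(limit: int) -> list:
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return [n for n, is_prime in enumerate(flags) if is_prime]


def singular_series(shifts, prime_limit: int) -> float:
    """Truncation of (3.45) to primes p <= prime_limit."""
    k = len(shifts)
    product = 1.0
    for p in primes_up_to(prime_limit):
        b = len({d % p for d in shifts})  # distinct residue classes mod p
        product *= (1 - b / p) * (1 - 1 / p) ** (-k)
    return product


if __name__ == "__main__":
    print(singular_series((0, 2), 10 ** 5))     # twin primes
    print(singular_series((0, 2, 6), 10 ** 5))  # an admissible triple
```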
Theorem 3.15 is due to Romanoff (1934). Once the bound for the number
of twin primes is in place, the hardest part of the proof is to establish the
estimate (3.41). Romanoff’s original proof of this was rather difficult. Erdős
& Turán (1935) gave a simpler proof, but the clever proof we have given is
due to Erdős (1951). Let r(n) be defined as in the proof of Theorem 3.15.
Erdős (1950) showed that $r(n) = \Omega(\log\log n)$, and that $\sum_{n\le x} r(n)^k \ll_k x$ for
any positive k. Presumably r(n) = o(log n), but for all we know there could be,
although it seems unlikely, infinitely many n such that $n - 2^k$ is prime whenever
$0 < 2^k < n$. The number n = 105 has this property, and is probably the largest
such number. The best upper bound we have for the number of such n not
exceeding X is (Vaughan 1973),
$$X \exp\Big(-\frac{c \log X \log\log\log X}{\log\log X}\Big).$$
3.6 References
Ankeny, N. C. & Onishi, H. (1964/1965). The general sieve, Acta Arith. 10, 31–62.
Bateman, P. T. & Diamond, H. (2004). Analytic Number Theory, Hackensack: World
Scientific.
Behrend, F. A. (1948). Generalization of an inequality of Heilbronn and Rohrbach, Bull.
Amer. Math. Soc. 54, 681–684.
Bombieri, E. (1977). The asymptotic sieve, Rend. Accad. Naz. XL (5) 1/2 (1975/76),
243–269.
Bombieri, E., Friedlander, J. B., & Iwaniec, H. (1986). Primes in arithmetic progressions
to large moduli, Acta Math. 156, 203–251.
(1987). Primes in arithmetic progressions to large moduli, II, Math. Ann. 277, 361–
393.
(1989). Primes in arithmetic progressions to large moduli, III, J. Amer. Math. Soc. 2,
215–224.
Brun, V. (1915). Über das Goldbachsche Gesetz und die Anzahl der Primzahlpaare,
Archiv for Math. og Naturvid. B 34, no. 8, 19 pp.
(1919). La série 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/29 + 1/31 +
1/41 + 1/43 + 1/59 + 1/61 + · · · où les dénominateurs sont “nombres premiers
jumeaux” est convergente ou finie, Bull. Sci. Math. (2) 43, 100–104; 124–128.
(1967). Reflections on the sieve of Eratosthenes, Norske Vid. Selsk. Skr. Trondheim,
no. 1, 9 pp.
Buchstab, A. A. (1938). New improvements in the method of the sieve of Eratosthenes,
Mat. Sb. (N. S.) 4 (46), 375–387.
Chowla, S. (1932). Contributions to the analytic theory of numbers, Math. Z. 35, 279–
299.
Chung, K.-L. (1941). A generalization of an inequality in the elementary theory of
numbers, J. Reine Angew. Math. 183, 193–196.
van der Corput, J. G. (1958). Inequalities involving least common multiple and other
arithmetical functions, Nederl. Akad. Wetensch. Proc. Ser. A 61 (= Indag. Math.
20), 5–15.
Erdős, P. (1940). The difference of consecutive primes, Duke Math. J. 6, 438–441.
(1946). On the coefficients of the cyclotomic polynomial, Bull. Amer. Math. Soc. 52,
179–184.
(1950). On integers of the form $2^k + p$ and some related problems, Summa Brasil.
Math. 2, 113–123.
(1951). On some problems of Bellman and a theorem of Romanoff, J. Chinese Math.
Soc. (N. S.) 1, 409–421.
Erdős, P. & Turán, P. (1935). Ein zahlentheoretischer Satz, Mitt. Forsch. Inst. Math.
Mech. Univ. Tomsk 1, 101–103.
Ford, K. & Halberstam, H. (2000). The Brun–Hooley sieve, J. Number Theory 81,
335–350.
Fouvry, E. & Iwaniec, H. (1997). Gaussian primes, Acta Arith. 79, 249–287.
Friedlander, J. B. & Iwaniec, H. (1997). The Brun–Titchmarsh theorem, Analytic Number
Theory (Kyoto, 1996). London Math. Soc. Lecture Note Ser. 247, Cambridge:
Cambridge University Press, pp. 85–93.
(1998a). The polynomial $X^2 + Y^4$ captures its primes, Ann. of Math. (2) 148, 945–
1040.
(1998b). Asymptotic sieve for primes, Ann. of Math. (2) 148, 1041–1065.
Goldfeld, D. M. (1975). A further improvement of the Brun–Titchmarsh theorem, J.
London Math. Soc. (2) 11, 434–444.
Greaves, G. (2001). Sieves in Number Theory. Berlin: Springer.
Halberstam, H. (1985). Lectures on the linear sieve, Topics in Analytic Number Theory
(Austin, 1982). Austin: University of Texas Press, pp. 165–220.
Halberstam, H. & Richert, H.-E. (1973). Brun’s method and the fundamental lemma,
Acta Arith. 24, 113–133.
(1974). Sieve Methods. London: Academic Press.
(1975). Brun’s method and the fundamental lemma. II, Acta Arith. 27, 51–59.
Hardy, G. H. & Littlewood, J. E. (1922). Some problems of ‘Partitio Numerorum’: III.
On the expression of a number as a sum of primes, Acta Math. 44, 1–70; Collected
Papers, Vol. I, London: Oxford University Press, 1966, pp. 561–630.
Heilbronn, H. (1937). On an inequality in the elementary theory of numbers, Proc.
Cambridge Philos. Soc. 33, 207–209.
Hensley, D. (1978). An almost-prime sieve, J. Number Theory 10, 250–262; Corrigen-
dum, 12, (1980), 437.
Hooley, C. (1972). On the Brun–Titchmarsh theorem, J. Reine Angew. Math. 255,
60–79.
(1975). On the Brun–Titchmarsh theorem, II, Proc. London Math. Soc. (3) 30, 114–
128.
(1976). Applications of Sieve Methods to the Theory of Prime Numbers, Cambridge
Tract 70. Cambridge: Cambridge University Press.
(1994). An almost pure sieve, Acta Arith. 66, 359–368.
Iwaniec, H. (1978). Almost-primes represented by quadratic polynomials, Invent. Math.
47, 171–188.
(1980a). Rosser’s sieve, Acta Arith. 36, 171–202.
(1980b). A new form of the error term in the linear sieve, Acta Arith. 37, 307–320.
(1981). Rosser’s sieve – bilinear forms of the remainder terms – some applications.
Recent Progress in Analytic Number Theory, Vol. 1. New York: Academic Press,
pp. 203–230.
(1982). On the Brun–Titchmarsh theorem, J. Math. Soc. Japan 34, 95–123.
Iwaniec, H. & Kowalski, E. (2004). Analytic Number Theory, Colloquium Publications
53. Providence: Amer. Math. Soc.
Boston: Academic Press, pp. 467–484; Collected Papers, Vol. 1. Berlin: Springer-
Verlag, 1989, pp. 675–69.
(1991). Lectures on Sieves, Collected Papers, Vol. 2. Berlin: Springer-Verlag,
pp. 65–247.
Titchmarsh, E. C. (1930). A divisor problem, Rend. Circ. Math. Palermo 54, 414–429.
Tsang, K. M. (1989). Remarks on the sieving limit of the Buchstab–Rosser sieve, Number
Theory, Trace Formulas and Discrete Groups (Oslo, 1987). Boston: Academic
Press, pp. 485–502.
Vaughan, R. C. (1973). Some applications of Montgomery’s sieve, J. Number Theory 5,
64–79.
Vijayaraghavan, T. (1951). On a problem in elementary number theory, J. Indian Math.
Soc. (N.S.) 15, 51–56.
4
Primes in arithmetic progressions: I
or
$$\frac{1}{2} f(z) - \frac{1}{2} f(-z) = \sum_{\substack{n=0\\ n\equiv 1\ (2)}}^{\infty} c_n z^n.$$
4.1 Additive characters 109
unless ζ = 1. Hence
$$\frac{1}{q} \sum_{k=1}^{q} e(-ka/q)\, e(kn/q) = \begin{cases} 1 & \text{if } n \equiv a \pmod{q},\\ 0 & \text{otherwise,} \end{cases} \tag{4.1}$$
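The orthogonality relation (4.1) is easy to confirm in floating point. The sketch below assumes the usual convention $e(\theta) = e^{2\pi i\theta}$ (the definition is not reproduced in this excerpt); the function names are hypothetical.

```python
import cmath


def e(theta: float) -> complex:
    """Assumed convention: e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)


def residue_indicator(n: int, a: int, q: int) -> complex:
    """Left-hand side of (4.1)."""
    return sum(e(-k * a / q) * e(k * n / q) for k in range(1, q + 1)) / q


if __name__ == "__main__":
    q = 12
    print([round(residue_indicator(n, 5, q).real) for n in range(0, 24)])
```

The printed list has a 1 exactly at the positions n ≡ 5 (mod 12), showing how the additive characters pick out a single residue class.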
Here the exact values that k runs through are immaterial, as long as the set of
these values forms a complete residue system modulo q. Hence we may replace
k by −k in the above, and so we see that
$$f(n) = \sum_{k=1}^{q} \widehat{f}(k)\, e(kn/q). \tag{4.3}$$
and Fourier expansion of a function $f \in L^1(\mathbb{T})$, but the situation here is simpler
because our sums have only finitely many terms.
Let v(h) be the vector v(h) = [e(h/q), e(2h/q), . . . , e((q − 1)h/q), 1].
From (4.1) we see that two such vectors $v(h_1)$ and $v(h_2)$ are orthogonal unless
$h_1 \equiv h_2 \pmod{q}$. These vectors are not normalized, but they all have the
same length $\sqrt{q}$, so apart from some rescaling, the transformation from f to $\widehat{f}$
is an isometry. More precisely, if f has period q and $\widehat{f}$ is given by (4.2), then
by (4.3),
$$\sum_{n=1}^{q} |f(n)|^2 = \sum_{n=1}^{q} \Big|\sum_{k=1}^{q} \widehat{f}(k)\, e(kn/q)\Big|^2.$$
By expanding and taking the sum over n inside, we see that this is
$$= \sum_{j=1}^{q} \sum_{k=1}^{q} \widehat{f}(j)\, \overline{\widehat{f}(k)} \sum_{n=1}^{q} e(jn/q)\, e(-kn/q).$$
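The rescaled isometry can be seen numerically. The sketch below assumes the normalization $\widehat{f}(k) = \frac{1}{q}\sum_{n=1}^{q} f(n)\, e(-kn/q)$, which is the one consistent with the inversion formula (4.3) (the display (4.2) itself is not reproduced in this excerpt), together with $e(\theta) = e^{2\pi i\theta}$. Under these assumptions $\sum_n |f(n)|^2 = q \sum_k |\widehat{f}(k)|^2$. Function names are hypothetical.

```python
import cmath


def e(theta: float) -> complex:
    return cmath.exp(2j * cmath.pi * theta)


def finite_fourier(values: list) -> list:
    """values[n-1] = f(n) for n = 1..q; returns [fhat(1), ..., fhat(q)]."""
    q = len(values)
    return [sum(values[n - 1] * e(-k * n / q) for n in range(1, q + 1)) / q
            for k in range(1, q + 1)]


def reconstruct(coeffs: list) -> list:
    """Apply (4.3): f(n) = sum over k of fhat(k) e(kn/q), for n = 1..q."""
    q = len(coeffs)
    return [sum(coeffs[k - 1] * e(k * n / q) for k in range(1, q + 1))
            for n in range(1, q + 1)]


if __name__ == "__main__":
    f = [3, -1, 4, 1, -5, 9, 2]
    fhat = finite_fourier(f)
    print(sum(abs(v) ** 2 for v in f), len(f) * sum(abs(c) ** 2 for c in fhat))
```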
Finally,
$$c_q(n) = \sum_{d\mid (q,n)} d\, \mu(q/d) = \frac{\mu(q/(q, n))}{\varphi(q/(q, n))}\, \varphi(q). \tag{4.7}$$
Proof The first assertion is evident, as each term in the sum (4.5) has period
q. As for the second, suppose that $q = q_1 q_2$ where $(q_1, q_2) = 1$. By the Chinese
Remainder Theorem, for each a (mod q) there is a unique pair $a_1, a_2$ with $a_i$
determined (mod $q_i$), so that $a \equiv a_1 q_2 + a_2 q_1 \pmod{q}$. Moreover, under this
correspondence we see that (a, q) = 1 if and only if $(a_i, q_i) = 1$ for i = 1, 2.
Then
$$
\begin{aligned}
c_q(n) &= \sum_{\substack{a_1=1\\ (a_1,q_1)=1}}^{q_1}\ \sum_{\substack{a_2=1\\ (a_2,q_2)=1}}^{q_2} e\big((a_1 q_2 + a_2 q_1) n/(q_1 q_2)\big) \\
&= \Big(\sum_{\substack{a_1=1\\ (a_1,q_1)=1}}^{q_1} e(a_1 n/q_1)\Big) \Big(\sum_{\substack{a_2=1\\ (a_2,q_2)=1}}^{q_2} e(a_2 n/q_2)\Big)
\end{aligned}
$$
$$
\begin{aligned}
\sum_{a=1}^{q} e(na/q) &= \sum_{d\mid q}\ \sum_{\substack{a=1\\ (a,q)=d}}^{q} e(na/q) \\
&= \sum_{d\mid q}\ \sum_{\substack{b=1\\ (b,q/d)=1}}^{q/d} e\big(nb/(q/d)\big) \\
&= \sum_{d\mid q} c_{q/d}(n).
\end{aligned}
$$
By (4.1), the left-hand side above is q when q|n, and is 0 otherwise. Thus we
have (4.6).
The first formula in (4.7) is merely the Möbius inverse of (4.6). To obtain
the second formula in (4.7), we begin by considering the special case in which
q is a prime power, q = p k .
$$c_{p^k}(n) = \sum_{\substack{a=1\\ p \nmid a}}^{p^k} e(na/p^k)
= \sum_{a=1}^{p^k} e(na/p^k) - \sum_{a=1}^{p^{k-1}} e(na/p^{k-1}).$$
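Here the first sum is $p^k$ if $p^k \mid n$, and is 0 otherwise. Similarly, the second
sum is $p^{k-1}$ if $p^{k-1} \mid n$, and is 0 otherwise. Hence the above is
$$= \begin{cases} 0 & \text{if } p^{k-1} \nmid n,\\ -p^{k-1} & \text{if } p^{k-1} \,\|\, n,\\ p^k - p^{k-1} & \text{if } p^k \mid n \end{cases}
\;=\; \frac{\mu\big(p^k/(n,p^k)\big)}{\varphi\big(p^k/(n,p^k)\big)}\,\varphi(p^k).$$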
The general case of (4.7) now follows because cq (n) is a multiplicative function
of q.
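As a sanity check on (4.7), the following sketch computes Ramanujan's sum in three ways: directly as a sum of $e(an/q)$ over reduced residues $a$, through the divisor sum $\sum_{d\mid(q,n)} d\,\mu(q/d)$, and through the closed form on the right of (4.7). The helper names are illustrative, and the trial-division factorization is only meant for small arguments.

```python
import cmath
from math import gcd

def factorize(m):
    """Prime factorization of m >= 1 by trial division, as {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def mobius(m):
    f = factorize(m)
    return 0 if any(e > 1 for e in f.values()) else (-1) ** len(f)

def phi(m):
    result = m
    for p in factorize(m):
        result -= result // p
    return result

def ramanujan_direct(q, n):
    """c_q(n) as the sum of e(an/q) over 1 <= a <= q with (a, q) = 1."""
    total = sum(cmath.exp(2j * cmath.pi * a * n / q)
                for a in range(1, q + 1) if gcd(a, q) == 1)
    return round(total.real)

def ramanujan_divisor_sum(q, n):
    """First formula in (4.7): sum of d * mu(q/d) over d dividing (q, n)."""
    g = gcd(q, n)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)

def ramanujan_closed_form(q, n):
    """Second formula in (4.7): mu(q') * phi(q) / phi(q') with q' = q/(q, n)."""
    d = q // gcd(q, n)
    return mobius(d) * phi(q) // phi(d)

table = [(q, n, ramanujan_direct(q, n)) for q in range(1, 13) for n in range(1, 13)]
```

For example, `ramanujan_closed_form(q, q)` returns $\varphi(q)$ and `ramanujan_closed_form(p, 1)` returns $-1$ for a prime $p$, in agreement with the case analysis in the proof.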
4.1.1 Exercises
1. Let $U = [u_{kn}]$ be the $q \times q$ matrix with elements $u_{kn} = e(kn/q)/\sqrt{q}$. Show
that $UU^* = U^*U = I$, i.e., that $U$ is unitary.
2. (Friedman 1957; cf. Reznick 1995)
(a) Show that
$$\int_0^1 \big(u\,e(\theta/2) + v\,e(-\theta/2)\big)^{2r}\, d\theta = \binom{2r}{r} u^r v^r$$
for any non-negative integer r and arbitrary complex numbers u, v.
(b) Show that if u = (x − iy)/2, v = (x + iy)/2, then
$$x \cos \pi\theta + y \sin \pi\theta = u\,e(\theta/2) + v\,e(-\theta/2)$$
for all θ.
(c) Show that
$$\int_0^1 \big(x \cos \pi\theta + y \sin \pi\theta\big)^{2r}\, d\theta = \binom{2r}{r} 2^{-2r} (x^2 + y^2)^r$$
for any non-negative integer r and arbitrary real or complex numbers
x, y.
(d) Show that
$$\sum_{a=1}^{q} \big(u\,e^{\pi i a/q} + v\,e^{-\pi i a/q}\big)^{2r} = q \binom{2r}{r} u^r v^r$$
if r is an integer, 0 ≤ r < q.
(e) Show that
$$\sum_{a=1}^{q} \big(x \cos \pi a/q + y \sin \pi a/q\big)^{2r} = q \binom{2r}{r} 2^{-2r} (x^2 + y^2)^r$$
if r is an integer, 0 ≤ r < q.
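where
$$a_q = \frac{6\mu(q)}{\pi^2 q^2} \prod_{p \mid q} \Big(1 - \frac{1}{p^2}\Big)^{-1}.$$
by taking
$$a_q = \frac{1}{\varphi(q)} \lim_{x\to\infty} \frac{1}{x} \sum_{n \le x} F(n) c_q(n). \tag{4.9}$$
In the following, suppose that $f(r)$ is chosen so that $F(n) = \sum_{r \mid n} f(r)$ for
all n.
(a) Suppose that
$$\sum_{r=1}^{\infty} \frac{|f(r)|}{r} < \infty. \tag{4.10}$$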
as x → ∞.
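(b) Suppose that (4.10) holds. Show that
$$\lim_{x\to\infty} \frac{1}{x} \sum_{n \le x} F(n) c_q(n) = \varphi(q) \sum_{\substack{r=1\\ q \mid r}}^{\infty} \frac{f(r)}{r}.$$
(c) Put
$$a_q = \sum_{\substack{r=1\\ q \mid r}}^{\infty} \frac{f(r)}{r}.$$
Show that if
$$\sum_{r=1}^{\infty} \frac{|f(r)|\,d(r)}{r} < \infty \tag{4.11}$$
then (4.8) and (4.9) hold, and moreover that $\sum_{q=1}^{\infty} |a_q c_q(n)| < \infty$.
8. (Ramanujan 1918) Show that if q > 1, then $\sum_{n=1}^{\infty} c_q(n)/n = -\Lambda(q)$. (See
also Exercise 8.3.4.)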
9. Let $\Phi_q(z)$ denote the q th cyclotomic polynomial, i.e., the monic polynomial
whose roots are precisely the primitive q th roots of unity, so that
$$\Phi_q(z) = \prod_{\substack{n=1\\ (n,q)=1}}^{q} \big(z - e(n/q)\big).$$
and that $(z^d - 1)^{\mu(q/d)}$ has a power series expansion, valid when |z| < 1,
with integer coefficients. Deduce that $\Phi_q(z) \in \mathbb{Z}[z]$.
(b) Suppose that $z \in \mathbb{Z}$ and $p \mid \Phi_q(z)$ and let e denote the order of z modulo
p. Show that $e \mid q$ and that if $p \mid (z^d - 1)$ then $e \mid d$.
(c) Choose t so that $p^t \,\|\, (z^e - 1)$. Show that for $m \in \mathbb{N}$ with $p \nmid m$ one has
$p^t \,\|\, (z^{me} - 1)$.
(d) Show that if $p \nmid q$, then $p^{ht} \,\|\, \Phi_q(z)$ where $h = \sum_{e \mid d \mid q} \mu(q/d)$. Deduce that
e = q and that $q \mid (p - 1)$.
(e) By taking z to be a suitable multiple of q, or otherwise, show that there
are infinitely many primes p with p ≡ 1 (mod q).
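Lemma 4.2 Suppose that G is cyclic of order n, say G = (a). Then there are
exactly n characters of G, namely $\chi_k(a^m) = e(km/n)$ for 1 ≤ k ≤ n. Moreover,
$$\sum_{g \in G} \chi(g) = \begin{cases} n & \text{if } \chi = \chi_0,\\ 0 & \text{otherwise,} \end{cases} \tag{4.12}$$
and
$$\sum_{\chi \in \widehat G} \chi(g) = \begin{cases} n & \text{if } g = e,\\ 0 & \text{otherwise.} \end{cases} \tag{4.13}$$
In this situation, $\widehat G$ is cyclic, $\widehat G = (\chi_1)$.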
Since the characters are now known explicitly, the remaining assertions are
easily verified.
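The explicit description in Lemma 4.2 makes the identities (4.12) and (4.13) easy to test numerically. The sketch below tabulates $\chi_k(a^m) = e(km/n)$ for a cyclic group of order n (here n = 8, an arbitrary choice) and forms the row and column sums; the variable names are illustrative.

```python
import cmath

def cyclic_characters(n):
    """Table of chi_k(a^m) = e(km/n); row k-1 holds chi_k at m = 0, ..., n-1."""
    return [[cmath.exp(2j * cmath.pi * k * m / n) for m in range(n)]
            for k in range(1, n + 1)]

n = 8
chars = cyclic_characters(n)

# (4.12): sum over group elements; the principal character is chi_n (k = n).
row_sums = [sum(row) for row in chars]

# (4.13): sum over characters; the identity element corresponds to m = 0.
col_sums = [sum(chars[k][m] for k in range(n)) for m in range(n)]
```

Only the last entry of `row_sums` and the first entry of `col_sums` should be close to n; every other entry should vanish up to rounding.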
Next we describe the characters of the direct product of two groups in terms
of the characters of the factors.
Lemma 4.3 Suppose that $G_1$ and $G_2$ are finite abelian groups, and that $G =
G_1 \otimes G_2$. If $\chi_i$ is a character of $G_i$, i = 1, 2, and $g \in G$ is written $g = (g_1, g_2)$,
$g_i \in G_i$, then $\chi(g) = \chi_1(g_1)\chi_2(g_2)$ is a character of G. Conversely, if $\chi \in \widehat G$,
then there exist unique $\chi_i \in \widehat G_i$ such that $\chi(g) = \chi_1(g_1)\chi_2(g_2)$. The identities
(4.12) and (4.13) hold for G if they hold for both $G_1$ and $G_2$.
We see here that each $\chi \in \widehat G$ corresponds to a pair $(\chi_1, \chi_2) \in \widehat G_1 \times \widehat G_2$. Thus
$\widehat G \cong \widehat G_1 \otimes \widehat G_2$.
Proof The first assertion is clear. As for the second, put $\chi_1(g_1) = \chi((g_1, e_2))$,
$\chi_2(g_2) = \chi((e_1, g_2))$. Then $\chi_i \in \widehat G_i$ for i = 1, 2, and $\chi_1(g_1)\chi_2(g_2) = \chi(g)$. The
$\chi_i$ are unique, for if $g = (g_1, e_2)$, then
$$\chi(g) = \chi((g_1, e_2)) = \chi_1(g_1)\chi_2(e_2) = \chi_1(g_1),$$
and similarly for $\chi_2$. If $\chi(g) = \chi_1(g_1)\chi_2(g_2)$, then
$$\sum_{g \in G} \chi(g) = \Big( \sum_{g_1 \in G_1} \chi_1(g_1) \Big) \Big( \sum_{g_2 \in G_2} \chi_2(g_2) \Big),$$
so that (4.12) holds for G if it holds for $G_1$ and for $G_2$. Similarly, if $g = (g_1, g_2)$,
then
$$\sum_{\chi \in \widehat G} \chi(g) = \Big( \sum_{\chi_1 \in \widehat G_1} \chi_1(g_1) \Big) \Big( \sum_{\chi_2 \in \widehat G_2} \chi_2(g_2) \Big),$$
Though $G$ and $\widehat G$ are isomorphic, the isomorphism is not canonical. That is,
no particular one-to-one correspondence between the elements of $G$ and those
of $\widehat G$ is naturally distinguished.
If (n, q) = 1, then
$$\sum_{\chi} \chi(n) = \begin{cases} \varphi(q) & \text{if } n \equiv 1 \pmod q,\\ 0 & \text{otherwise,} \end{cases} \tag{4.15}$$
where the sum is extended over the ϕ(q) Dirichlet characters χ (mod q).
Proof The first assertion follows immediately from the observations that
χ1 (n)χ2 (n) is totally multiplicative, that it vanishes if (n, [q1 , q2 ]) > 1, and
that it has period [q1 , q2 ]. As for the second assertion, we may suppose that
(n, q) = 1. By the Chinese Remainder Theorem we see that
$$(\mathbb{Z}/q\mathbb{Z})^{\times} \cong (\mathbb{Z}/q_1\mathbb{Z})^{\times} \otimes (\mathbb{Z}/q_2\mathbb{Z})^{\times}$$
Our proof of Theorem 4.4 depends on Abel’s theorem that any finite abelian
group is isomorphic to the direct product of cyclic groups, but we can prove
Corollary 4.5 without appealing to this result, as follows. By the Chinese Re-
mainder Theorem we see that
$$(\mathbb{Z}/q\mathbb{Z})^{\times} \cong \bigotimes_{p^{\alpha} \| q} (\mathbb{Z}/p^{\alpha}\mathbb{Z})^{\times} .$$
If p is odd, then the reduced residue classes (mod p α ) form a cyclic group; in
classical language we say there is a primitive root g. Thus if (n, p) = 1, then
there is a unique ν (mod ϕ( p α )) such that g ν ≡ n (mod p α ). The number ν is
called the index of n, and is denoted $\nu = \operatorname{ind}_g n$. From Lemma 4.2 it follows
that the characters (mod $p^{\alpha}$), p > 2, are given by
$$\chi_k(n) = e\Big( \frac{k \operatorname{ind}_g n}{\varphi(p^{\alpha})} \Big) \tag{4.16}$$
for (n, p) = 1. We obtain $\varphi(p^{\alpha})$ different characters by allowing k to assume
integral values in the range $1 \le k \le \varphi(p^{\alpha})$. By Lemma 4.3 it follows that if q
is odd, then the general character (mod q) is given by
$$\chi(n) = e\Big( \sum_{p^{\alpha} \| q} \frac{k \operatorname{ind}_g n}{\varphi(p^{\alpha})} \Big) \tag{4.17}$$
when (n, q) = 1.
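The construction (4.16) can be carried out mechanically for a small odd prime power. The following sketch finds a primitive root g by brute force, tabulates the index $\operatorname{ind}_g n$, and returns the $\varphi(p^\alpha)$ characters as Python functions; it then serves to check the orthogonality relation (4.15). The function names and the brute-force search are illustrative choices suited only to small moduli.

```python
import cmath
from math import gcd

def multiplicative_order(g, m):
    """Order of g in the group of reduced residues modulo m (requires (g, m) = 1)."""
    k, x = 1, g % m
    while x != 1:
        x = x * g % m
        k += 1
    return k

def characters_mod_prime_power(p, alpha):
    """Characters modulo p**alpha for an odd prime p, built as in (4.16)."""
    m = p ** alpha
    phi = m - m // p
    # Brute-force search for a primitive root g modulo p**alpha.
    g = next(g for g in range(2, m)
             if gcd(g, m) == 1 and multiplicative_order(g, m) == phi)
    index = {pow(g, nu, m): nu for nu in range(phi)}   # n -> ind_g n

    def make_chi(k):
        def chi(n):
            if n % p == 0:
                return 0
            return cmath.exp(2j * cmath.pi * k * index[n % m] / phi)
        return chi

    return [make_chi(k) for k in range(1, phi + 1)], phi

chars, phi = characters_mod_prime_power(3, 2)          # the 6 characters modulo 9
sums = {n: sum(chi(n) for chi in chars) for n in range(1, 10) if gcd(n, 9) == 1}
```

Here `sums[1]` should be close to $\varphi(9) = 6$, while the sums at the other reduced residues should vanish up to rounding.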
By definition, if f (n) is totally multiplicative, f (n) = 0 whenever (n, q) > 1,
and f (n) has period q, then f is a Dirichlet character (mod q). It is useful to
note that the first condition can be relaxed.
Theorem 4.7 If f is multiplicative, f (n) = 0 whenever (n, q) > 1, and f has
period q, then f is a Dirichlet character modulo q.
Proof It suffices to show that f is totally multiplicative. If (mn, q) > 1, then
f (mn) = f (m) f (n) since 0 = 0. Suppose that (mn, q) = 1. Hence in partic-
ular (m, q) = 1, so that the map k → n + kq (mod m) permutes the residue
classes (mod m). Thus there is a k for which n + kq ≡ 1 (mod m), and
4.2.1 Exercises
1. Let G be a finite abelian group of order n. Let $g_1, g_2, \ldots, g_n$ denote the
elements of G, and let $\chi_1(g), \chi_2(g), \ldots, \chi_n(g)$ denote the characters of G.
Let $U = [u_{ij}]$ be the $n \times n$ matrix with elements $u_{ij} = \chi_i(g_j)/\sqrt{n}$. Show
that $UU^* = U^*U = I$, i.e., that U is unitary.
2. Show that for arbitrary real or complex numbers $c_1, \ldots, c_q$,
$$\sum_{\chi} \Big| \sum_{n=1}^{q} c_n \chi(n) \Big|^2 = \varphi(q) \sum_{\substack{n=1\\ (n,q)=1}}^{q} |c_n|^2$$
where the sum on the left-hand side runs over all Dirichlet characters
χ (mod q).
3. Show that for arbitrary real or complex numbers $c_\chi$,
$$\sum_{n=1}^{q} \Big| \sum_{\chi} c_\chi \chi(n) \Big|^2 = \varphi(q) \sum_{\chi} |c_\chi|^2$$
where the sum over χ is extended over all Dirichlet characters (mod q).
4. Let (a, q) = 1, and suppose that k is the order of a in the multiplicative group
of reduced residue classes (mod q).
(a) Show that if χ is a Dirichlet character (mod q), then χ (a) is a k th root
of unity.
(b) Show that if z is a k th root of unity, then
$$1 + z + \cdots + z^{k-1} = \begin{cases} k & \text{if } z = 1,\\ 0 & \text{otherwise.} \end{cases}$$
(b) Show that each k th root of unity occurs precisely ϕ(q)/k times among the
numbers χ (a) as a runs over the ϕ(q) reduced residue classes (mod q).
6. Let χ be a character (mod q) such that χ(a) = ±1 whenever (a, q) = 1, and
put $S(\chi) = \sum_{n=1}^{q} n\chi(n)$. Thus S(χ) is an integer.
(a) Show that if (a, q) = 1 then aχ (a)S(χ ) ≡ S(χ ) (mod q).
(b) Show that there is an a such that (a, q) = 1 and (aχ (a) − 1, q)|12.
(c) Deduce that 12S(χ ) ≡ 0 (mod q).
In algebraic number fields we encounter not only Dirichlet characters, but
also characters of ideal class groups and of Galois groups. In addition, algebraic
number fields possessing one or more complex embeddings also have a further
kind of character, Hecke’s Grössencharaktere. In a sequence of exercises,
beginning with the one below, we develop the basic properties of these characters
for the Gaussian field $\mathbb{Q}(\sqrt{-1})$.
7. Let K be the Gaussian field,
$$K = \mathbb{Q}(\sqrt{-1}) = \{a + bi : a, b \in \mathbb{Q}\}, \qquad \mathcal{O}_K = \{a + bi : a, b \in \mathbb{Z}\}.$$
Elements $\alpha = a + bi \in K$ have a norm, $N(\alpha) = a^2 + b^2$, and we observe
that $N(\alpha\beta) = N(\alpha)N(\beta)$. An element α of a ring is a unit if α has an inverse
in the ring. The ring $\mathcal{O}_K$ has precisely four units, namely $i^k$ for k = 0, 1, 2, 3.
Two elements $\alpha, \beta \in \mathcal{O}_K$ are associates if α = uβ for some unit u. For each
integer m we define the Hecke Grössencharakter
$$\chi_m(\alpha) = \begin{cases} e^{4mi \arg \alpha} & \text{if } \alpha \ne 0,\\ 0 & \text{if } \alpha = 0. \end{cases}$$
for k = 1, 2, 3, . . . . Hence
$$\Big| \sum_{n \le x} \chi(n) \Big| \le q \tag{4.23}$$
for any x, so that by Theorem 1.3, the series (4.20) converges for σ > 0. This
result is best possible since the terms in (4.20) do not tend to 0 when σ = 0. On
the other hand, we shall show in Chapter 10 that the function L(s, χ) is entire
if $\chi \ne \chi_0$. For σ > 1 we can take logarithms in (4.21), and differentiate, as in
Corollary 1.11, and thus we obtain
In these last formulæ we see how relations for L-functions parallel those
for the zeta functions. Indeed, when manipulating Dirichlet series formally, the
only property of $n^{-s}$ that is used is that it is totally multiplicative. Hence all
such calculations can be made with $n^{-s}$ replaced by $\chi(n)n^{-s}$. For example, we
know that $\sum \mu(n)^2 n^{-s} = \zeta(s)/\zeta(2s)$ for σ > 1. Hence formally
$$\sum_{n=1}^{\infty} \mu(n)^2 \chi(n) n^{-s} = L(s, \chi)/L(2s, \chi^2). \tag{4.26}$$
where the sum is extended over all characters χ (mod q). This is the multiplicative
analogue of (4.1). Hence if (a, q) = 1 then
$$\sum_{\substack{n=1\\ n \equiv a\ (q)}}^{\infty} \Lambda(n) n^{-s} = \frac{1}{\varphi(q)} \sum_{n=1}^{\infty} \Lambda(n) n^{-s} \sum_{\chi} \overline{\chi}(a)\chi(n)
= \frac{-1}{\varphi(q)} \sum_{\chi} \overline{\chi}(a) \frac{L'}{L}(s, \chi) \tag{4.28}$$
for σ > 1. As $L(s, \chi_0)$ has a simple pole at s = 1, the function $\frac{L'}{L}(s, \chi_0)$ has a
simple pole at 1 with residue −1. Thus the term arising from $\chi_0$ on the right-hand
side above is
$$\frac{1}{\varphi(q)(s - 1)} + O_q(1) \tag{4.29}$$
Suppose that (a, q) = 1. Then the above, with (4.28), (4.29), and (4.30) give
the estimate
$$\sum_{\substack{n=1\\ n \equiv a\ (q)}}^{\infty} \Lambda(n) n^{-s} = \frac{1}{\varphi(q)(s - 1)} + O_q(1)$$
as s → 1+ . Consequently
$$\sum_{\substack{n=1\\ n \equiv a\ (q)}}^{\infty} \frac{\Lambda(n)}{n} = \infty.$$
We call a character real if all its values are real (i.e., χ(n) = 0 or ±1 for all
n). Otherwise a character is complex. A character is quadratic if it has order
2 in the character group: $\chi^2 = \chi_0$ but $\chi \ne \chi_0$. Thus a quadratic character is
real, and a real character is either principal or quadratic. In Chapter 9 we shall
express quadratic characters in terms of the Kronecker symbol $\left(\frac{d}{n}\right)$.
If we take s = σ > 1, then the sum above is a non-negative real number, and
hence we see that
$$\prod_{\chi} L(\sigma, \chi) \ge 1 \tag{4.32}$$
for σ > 1. Now $L(s, \chi_0)$ has a simple pole at s = 1, but the other L(s, χ)
are analytic at s = 1. Thus L(1, χ) = 0 can hold for at most one χ, since
otherwise the product in (4.32) would tend to 0 as σ → 1+. If χ is a character
(mod q), then $\overline{\chi}$ is a character (mod q), and $\overline{\chi} \ne \chi$ if χ is complex. Moreover
Hence r(n) ≥ 0 for all n, and $r(n^2) \ge 1$ for all n. Suppose that L(1, χ) = 0.
Then ζ(s)L(s, χ) is analytic for σ > 0, and by Landau’s theorem (Theorem
1.7) the series $\sum r(n)n^{-s}$ converges for σ > 0. But this is false, since
$$\sum_{n=1}^{\infty} r(n) n^{-1/2} \ge \sum_{n=1}^{\infty} r(n^2) n^{-1} \ge \sum_{n=1}^{\infty} n^{-1} = +\infty.$$
Hence $L(1, \chi) \ne 0$. Since L(σ, χ) > 0 for σ > 1 when χ is quadratic, we see
in fact that L(1, χ) > 0 in this case.
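The two inequalities that drive this argument, r(n) ≥ 0 and r(n²) ≥ 1, can be observed directly for a specific quadratic character. The sketch below takes $r(n) = \sum_{d\mid n}\chi(d)$ with χ the non-principal character modulo 4; that choice of χ, and the names `chi4` and `r`, are assumptions made for the sake of a concrete example.

```python
def chi4(n):
    """Non-principal character modulo 4: 1, 0, -1, 0 on n = 1, 2, 3, 4 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def r(n):
    """r(n) = sum of chi(d) over the divisors d of n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

values = [r(n) for n in range(1, 201)]          # expected to be non-negative
at_squares = [r(n * n) for n in range(1, 31)]   # expected to be at least 1
```

Because r is multiplicative with $r(p^k) = 1 + \chi(p) + \cdots + \chi(p)^k$, each factor is non-negative, and it is at least 1 when k is even, which is what the two lists exhibit.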
By using the techniques of Chapter 2 we can prove more than the mere
divergence of the series in Corollary 4.10.
This last error term is $\ll_\chi 1$, and then (a) follows from (4.33) and the fact that
$L(1, \chi) \ne 0$. The derivation of (b) from (a), and of (c) from (b) proceeds as in
the proof of Theorem 2.7. Continuing as in that proof, we see from (c) that
$$\sum_{1 < n \le x} \frac{\Lambda(n)\chi(n)}{n \log n} = c(\chi) + O_\chi\Big( \frac{1}{\log x} \Big)$$
where
$$c(\chi) = b(\chi) + \sum_{p} \sum_{k > 1} \frac{\chi(p^k)}{k p^k} .$$
We let s → 1+ in (4.24), and deduce by Theorem 1.1 that c(χ) = log L(1, χ).
To complete the derivation of (d) it suffices to argue as in the proof of
Theorem 2.7.
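$$\text{(b)}\quad \sum_{\substack{p \le x\\ p \equiv a\ (q)}} \frac{\log p}{p} = \frac{1}{\varphi(q)} \log x + O_q(1),$$
$$\text{(c)}\quad \sum_{\substack{p \le x\\ p \equiv a\ (q)}} \frac{1}{p} = \frac{1}{\varphi(q)} \log\log x + b(q, a) + O_q\Big( \frac{1}{\log x} \Big),$$
$$\text{(d)}\quad \prod_{\substack{p \le x\\ p \equiv a\ (q)}} \Big( 1 - \frac{1}{p} \Big)^{-1} = c(q, a)(\log x)^{1/\varphi(q)} \Big( 1 + O_q\Big( \frac{1}{\log x} \Big) \Big)$$
where
$$b(q, a) = \frac{1}{\varphi(q)} \Big( C_0 + \sum_{p \mid q} \log\Big( 1 - \frac{1}{p} \Big) + \sum_{\chi \ne \chi_0} \overline{\chi}(a) \log L(1, \chi) \Big) - \sum_{\substack{p^k \equiv a\ (q)\\ k > 1}} \frac{1}{k p^k}$$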
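and
$$c(q, a) = \Bigg( e^{C_0}\, \frac{\varphi(q)}{q} \prod_{\chi \ne \chi_0} \Big( L(1, \chi) \prod_{p} \Big( 1 - \frac{1}{p} \Big)^{-\chi(p)} \Big( 1 - \frac{\chi(p)}{p} \Big) \Big)^{\overline{\chi}(a)} \Bigg)^{1/\varphi(q)} .$$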
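Proof To derive (a) from Theorem 4.11(a) we use (4.27) and the estimate
$$\sum_{n \le x} \frac{\Lambda(n)\chi_0(n)}{n} = \log x + O_q(1),$$
which follows from Theorem 2.7(a) since
$$\sum_{\substack{p^k\\ p \mid q}} \frac{\log p}{p^k} = \sum_{p \mid q} \frac{\log p}{p - 1} \ll_q 1.$$
We derive (b) and (c) similarly from the corresponding parts of Theorem 4.11.
In the latter case we use the estimate
$$\sum_{p \le x} \frac{\chi_0(p)}{p} = \log\log x + b(\chi_0) + O_q\Big( \frac{1}{\log x} \Big)$$
where
$$b(\chi_0) = C_0 + \sum_{p \mid q} \log\Big( 1 - \frac{1}{p} \Big) - \sum_{p} \sum_{k > 1} \frac{\chi_0(p^k)}{k p^k} .$$
To derive (d) we observe first that
$$\prod_{p \le x} \Big( 1 - \frac{\chi_0(p)}{p} \Big)^{-1} = \prod_{p \le x} \Big( 1 - \frac{1}{p} \Big)^{-1} \prod_{\substack{p \le x\\ p \mid q}} \Big( 1 - \frac{1}{p} \Big),$$
which by Theorem 2.7(e) is
$$= \frac{\varphi(q)}{q} \Bigg( \prod_{\substack{p \mid q\\ p > x}} \Big( 1 - \frac{1}{p} \Big) \Bigg)^{-1} e^{C_0} (\log x) \Big( 1 + O\Big( \frac{1}{\log x} \Big) \Big).$$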
Here each term in the product is 1 + O(1/x), and the number of factors is
≤ ω(q), so the product is $1 + O_q(1/x)$, and hence the above is
$$= \frac{\varphi(q)}{q}\, e^{C_0} (\log x) \Big( 1 + O_q\Big( \frac{1}{\log x} \Big) \Big).$$
To complete the proof it suffices to combine this with Theorem 4.11(d)
in (4.27).
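Estimate (b) can be illustrated numerically. The sketch below sieves the primes up to $10^5$ and, for the modulus q = 4 (an arbitrary choice with ϕ(q) = 2), compares $\sum_{p\le x,\ p\equiv a\,(q)} (\log p)/p$ with $(\log x)/\varphi(q)$; the function names are illustrative. Since the estimate only asserts a bounded error, the comparison is made both through the difference itself and through the growth of the sum between $x = 10^4$ and $x = 10^5$.

```python
from math import log

def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [i for i in range(x + 1) if sieve[i]]

def weighted_prime_sum(x, q, a, primes):
    """Sum of log(p)/p over primes p <= x with p congruent to a modulo q."""
    return sum(log(p) / p for p in primes if p <= x and p % q == a % q)

primes = primes_up_to(100_000)
q, phi_q = 4, 2
report = {}
for a in (1, 3):
    s_small = weighted_prime_sum(10_000, q, a, primes)
    s_large = weighted_prime_sum(100_000, q, a, primes)
    report[a] = {
        "difference": s_large - log(100_000) / phi_q,   # stays bounded, by (b)
        "growth": s_large - s_small,                    # about log(10)/phi_q
    }
print(report)
```

The "growth" entries should both be close to $(\log 10)/2 \approx 1.15$, while the "difference" entries approximate constants that depend on the residue class a.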
4.3.1 Exercises
1. Let χ be a Dirichlet character (mod q). Show that if σ > 1, then
$$\text{(a)}\quad \sum_{n=1}^{\infty} (-1)^{n-1} \chi(n) n^{-s} = \big( 1 - \chi(2) 2^{1-s} \big) L(s, \chi);$$
$$\text{(b)}\quad \sum_{n=1}^{\infty} d(n)^2 \chi(n) n^{-s} = \frac{L(s, \chi)^4}{L(2s, \chi^2)} .$$
2. (Mertens 1895a,b) Let $r(n) = \sum_{d \mid n} \chi(d)$.
(a) Show that if χ is a non-principal character (mod q), then
$$\sum_{n > x} \frac{\chi(n)}{\sqrt{n}} \ll_\chi \frac{1}{\sqrt{x}} .$$
(b) Show that if χ is a non-principal character (mod q), then
$$\sum_{n \le x} \frac{r(n)}{n^{1/2}} = 2x^{1/2} L(1, \chi) + O_\chi(1).$$
(c) Recall that if χ is quadratic then r(n) ≥ 0 for all n, and that $r(n^2) \ge 1$.
Deduce that if χ is a quadratic character, then the left-hand side above
is $\gg \log x$.
(d) Conclude that if χ is a quadratic character, then L(1, χ) > 0.
3. (Mertens 1897, 1899) For u ≥ 0, put $f(u) = \sum_{m \le u} (1 - m/u)$.
(a) Show that f(u) ≥ 0, that f(u) is continuous, and that if u is not an
integer, then
$$f'(u) = \frac{[u]([u] + 1)}{2u^2} ;$$
deduce that f is increasing.
(b) Show also that
$$f(u) = \frac{u}{2} - \frac{1}{u} \int_0^u \{v\}\, dv = \frac{u}{2} - \frac{1}{2} + O(1/u) .$$
(c) Let $r(n) = \sum_{d \mid n} \chi(d)$, and assume that χ is non-principal. Show that
$$\sum_{n \le x} r(n)(1 - n/x) = \sum_{d \le x} \chi(d) f(x/d) .$$
(d) Write $\sum_{d \le x} = \sum_{d \le y} + \sum_{y < d \le x} = S_1 + S_2$ where 1 ≤ y ≤ x. Use
part (b) to show that $S_1 = \tfrac{1}{2} x L(1, \chi) + O_\chi(x/y) + O(y^2/x)$.
(d) Show that for any positive integer q there is a small number $c_q$ and a
large number $C_q$ such that if $x \ge 2C_q$ and (a, q) = 1, then
$$\sum_{\substack{x/C_q < p \le x\\ p \equiv a\ (q)}} \frac{\log p}{p} > c_q .$$
(e) Show that for any positive integer q there is a $C_q$ such that if (a, q) = 1,
then
$$\pi(x; q, a) \gg_q \frac{x}{\log x}$$
uniformly for $x \ge C_q$.
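(f) Show that if (a, q) = 1, then
$$\liminf_{x\to\infty} \frac{\pi(x; q, a)}{x/\log x} \le \frac{1}{\varphi(q)}, \qquad \limsup_{x\to\infty} \frac{\pi(x; q, a)}{x/\log x} \ge \frac{1}{\varphi(q)} .$$
6. (a) Show that
$$\vartheta(x) \le \pi(x) \log x \le \vartheta(x) + O\Big( \frac{x}{\log x} \Big)$$
for x ≥ 2.
(b) Let $\mathcal{P}$ denote a set of prime numbers, and put
$$\pi_{\mathcal{P}}(x) = \sum_{\substack{p \le x\\ p \in \mathcal{P}}} 1, \qquad \vartheta_{\mathcal{P}}(x) = \sum_{\substack{p \le x\\ p \in \mathcal{P}}} \log p .$$
Show that
$$\vartheta_{\mathcal{P}}(x) = \pi_{\mathcal{P}}(x) \log x + O\Big( \frac{x}{\log x} \Big)$$
for x ≥ 2, where the implicit constant is absolute.
(c) Let
$$n = \prod_{\substack{p \le y\\ p \in \mathcal{P}}} p .$$
as y → ∞.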
7. Let R(n) denote the number of ordered pairs a, b such that $a^2 + b^2 = n$
with a ≥ 0 and b > 0. Also, let r(n) denote the number of such pairs for
which (a, b) = 1. Finally, let $\chi_{-4}(n) = \left(\frac{-4}{n}\right)$ be the non-principal character
(mod 4). We recall that if the prime factorization of n is written in the form
$$n = 2^{\alpha} \prod_{\substack{p^{\beta} \| n\\ p \equiv 1\ (4)}} p^{\beta} \prod_{\substack{q^{\gamma} \| n\\ q \equiv 3\ (4)}} q^{\gamma},$$
then r(n) > 0 if and only if γ = 0 for all primes q and α ≤ 1. We also
recall that
$$R(n) = \sum_{d^2 \mid n} r(n/d^2) = \sum_{d \mid n} \chi_{-4}(d) = \begin{cases} \prod_{p} (\beta + 1) & \text{if } 2 \mid \gamma \text{ for all } q,\\ 0 & \text{otherwise.} \end{cases}$$
(a) Show that $\sum_{n=1}^{\infty} R(n) n^{-s} = \zeta(s) L(s, \chi_{-4})$ for σ > 1.
(b) Show that $\sum_{n=1}^{\infty} r(n) n^{-s} = \zeta(s) L(s, \chi_{-4})/\zeta(2s)$ for σ > 1.
(c) Show that if x ≥ 0 and y ≥ 2, then
y
card{n ∈ (x, x + y] : r (n) > 0} √ .
log y
(d) Show that
x
card{n ≤ x : R(n) > 0} √
log x
for x ≥ 2.
(e) Suppose that n is of the form
$$n = \prod_{\substack{p \le y\\ p \equiv 1\ (4)}} p .$$
In the above it is noteworthy that although R(n) ≤ d(n) for all n, that
R(n) is usually 0 and has a smaller average value (cf. Exercise 2.1.9)
than d(n) (cf. Theorem 2.3), the maximum order of magnitude of R(n)
is the same as for d(n).
8. Let $K = \mathbb{Q}(\sqrt{-1})$ be the Gaussian field, $\mathcal{O}_K = \{a + ib : a, b \in \mathbb{Z}\}$ the ring
of integers in K. Ideals $\mathfrak{a}$ in $\mathcal{O}_K$ are principal, $\mathfrak{a} = (a + ib)$, and have norm
$N(\mathfrak{a}) = a^2 + b^2$.
(a) Explain why the number of ideals $\mathfrak{a}$ with $N(\mathfrak{a}) \le x$ is $\frac{\pi}{4} x + O(x^{1/2})$.
(b) For σ > 1, let $\zeta_K(s) = \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$ be the Dedekind zeta function of
K. Show that $\zeta_K(s) = \zeta(s) L(s, \chi_{-4})$.
(c) For the Gaussian field K, show that $N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a}) N(\mathfrak{b})$. (This is true
in any algebraic number field.)
(d) Assume that ideals in K factor uniquely into prime ideals. (This is true
in any algebraic number field, and is particularly easy to establish for
the Gaussian field since it has a division algorithm.) Deduce that if
σ > 1, then
$$\zeta_K(s) = \prod_{\mathfrak{p}} \Big( 1 - \frac{1}{N(\mathfrak{p})^{s}} \Big)^{-1}$$
for σ > 1.
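(f) Let $\mathfrak{a}$ and $\mathfrak{b}$ be given ideals. Show that
$$\sum_{\substack{\mathfrak{d} \mid \mathfrak{a}\\ \mathfrak{d} \mid \mathfrak{b}}} \mu(\mathfrak{d}) = \begin{cases} 1 & \text{if } \gcd(\mathfrak{a}, \mathfrak{b}) = 1,\\ 0 & \text{otherwise.} \end{cases}$$
(g) Among pairs $\mathfrak{a}, \mathfrak{b}$ of ideals with $N(\mathfrak{a}) \le x$, $N(\mathfrak{b}) \le x$, show that the
probability that $\gcd(\mathfrak{a}, \mathfrak{b}) = 1$ is
$$\frac{1}{\zeta_K(2)} + O\big( x^{-1/2} \big) = \frac{6}{\pi^2 L(2, \chi_{-4})} + O\big( x^{-1/2} \big).$$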
9. (Erdős 1946, 1949, 1957, Vaughan 1974, Saffari, unpublished, but see
Bateman, Pomerance & Vaughan 1981; cf. Exercise 2.3.7) Let $\Phi_q(z) =
\prod_{d \mid q} (z^d - 1)^{\mu(q/d)}$ denote the q th cyclotomic polynomial. Suppose that
$$q = \prod_{\substack{p \le y\\ p \equiv \pm 2\ (5)}} p$$
(d) Deduce that $\Phi_q(z)$ has a coefficient whose absolute value is at least
$$\exp\big( q^{(\log 2 - \varepsilon)/\log\log q} \big)$$
if $y > y_0(\varepsilon)$.
10. Grössencharaktere for $\mathbb{Q}(\sqrt{-1})$, continued from Exercise 4.2.7.
(a) For σ > 1 put
$$L(s, \chi_m) = \sideset{}{'}\sum_{\alpha \in \mathcal{O}_K} \chi_m(\alpha) N(\alpha)^{-s} = \frac{1}{4} \sum_{\substack{a, b \in \mathbb{Z}\\ (a,b) \ne (0,0)}} \chi_m(a + bi)(a^2 + b^2)^{-s}$$
where $\sum'_{\alpha}$ denotes a sum over unassociated members of $\mathcal{O}_K$. Show
that the above sum is absolutely convergent in this half-plane.
(b) We recall that members of $\mathcal{O}_K$ factor uniquely into Gaussian primes.
Also, the Gaussian primes are obtained by factoring the rational primes:
The prime 2 ramifies, $2 = i^3 (1 + i)^2$, the rational primes p ≡ 1 (mod 4)
split into two distinct Gaussian primes, p = (a + bi)(a − bi), and the
rational primes q ≡ 3 (mod 4) are inert. Show that
$$L(s, \chi_m) = \prod_{\mathfrak{p}} \big( 1 - \chi_m(\mathfrak{p}) N(\mathfrak{p})^{-s} \big)^{-1}$$
(f) Show that if $m \ne 0$, then the Dirichlet series $L(s, \chi_m)$ is convergent for
σ > 1/2.
(g) Show that $L(s, \chi_m)$ and $L(s, \chi_{-m})$ are identically equal, and hence that
$L(\sigma, \chi_m) \in \mathbb{R}$ for σ > 1/2.
4.4 Notes
Section 4.1. Ramanujan’s sum was introduced by Ramanujan (1918). Incredibly,
both Hardy and Ramanujan missed the fact that $c_q(n)$ can be written in closed
form: The formula on the extreme right of (4.7) is due to Hölder (1936). Normally
one would say that a function f is even if f(x) = f(−x). However, in
the present context, an arithmetic function f with period q is said to be even
if f(n) is a function only of (n, q). Thus $c_q(n)$ is an even function. The space
of almost-even functions is rather small, but includes several arithmetic functions
of interest. For such functions one may hope for a representation in the
form $f(n) = \sum_{q=1}^{\infty} a_q c_q(n)$, called a Ramanujan expansion. For a survey of the
theory of such expansions, see Schwarz (1988). Hildebrand (1984) established
definitive results concerning the pointwise convergence of Ramanujan expansions.
An appropriate Parseval identity has been established for mean-square
summable almost-even functions; see Hildebrand, Schwarz & Spilker (1988).
Section 4.2. The first instance of characters of a non-cyclic group occurs in
Gauss’s analysis of the genus structure of the class group of binary quadratic
forms. The quotient of the class group by the principal genus is isomorphic to
C2 ⊗ C2 ⊗ · · · ⊗ C2 , and the associated characters are given by Kronecker’s
symbol. Dirichlet (1839) defined the Dirichlet characters for the multiplicative
group (Z/qZ)× of reduced residues modulo q, and the same technique suffices
to construct the characters for any finite Abelian group. More generally, if
G is a group, then a homomorphism h : G −→ G L(n, C) is called a group
representation, and the trace of h(g) is a group character. Note that if a and
b are conjugate elements of G, say a = gbg −1 , then h(a) and h(b) are similar
matrices. Hence they have the same eigenvalues, and in particular tr h(a) =
tr h(b). Thus a group character is constant on conjugacy classes. In the case of a
finite Abelian group it suffices to take n = 1, and in this case the representation
and its trace are essentially the same. For an introduction to characters in a
wider setting, see Serre (1977).
Section 4.3. Dirichlet (1837a,b,c) first proved Corollary 4.10 in the case that
q is prime. The definition of the Dirichlet characters is not difficult in that case,
since the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^{\times}$ of reduced residues is cyclic. The most
challenging part of the proof is to show that $L(1, \chi) \ne 0$ when χ is the Legendre
symbol (mod p). If p ≡ 3 (mod 4), then
$$\sum_{a=1}^{p-1} a \Big( \frac{a}{p} \Big) \equiv \sum_{a=1}^{p-1} a = \frac{p(p - 1)}{2} \equiv 1 \pmod 2,$$
and hence the sum on the left is non-zero. It follows by (9.9) that $L(1, \chi_p) \ne 0$
in this case. If p ≡ 1 (mod 4), then one has the identity of Exercise 9.3.7(c),
4.5 References
Baker, A., Birch, B. J., & Wirsing, E. A. (1973). On a problem of Chowla, J. Number
Theory 5, 224–236.
Bateman, P. T. (1959). Theorems implying the non-vanishing of $\sum \chi(m) m^{-1}$ for real
residue-characters, J. Indian Math. Soc. 23, 101–115.
(1966). Lower bounds for $\sum h(m)/m$ for arithmetical function h similar to real
residue characters, J. Math. Anal. Appl. 15, 2–20.
The interchange of limits here is difficult to justify, since α(s) may not be
uniformly convergent, and because the integral in (5.3) is neither uniformly nor
absolutely convergent. Moreover, if x is an integer, then the term n = x in (5.4)
gives rise to the integral (5.3) with y = 1, and this integral does not converge,
although its Cauchy principal value exists:
$$\lim_{T\to\infty} \frac{1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \frac{ds}{s} = \frac{1}{2} \tag{5.5}$$
Here $\sum'$ indicates that if x is an integer, then the last term is to be counted with
weight 1/2.
Proof Choose N so large that N > 2x + 2, and write
$$\alpha(s) = \sum_{n \le N} a_n n^{-s} + \sum_{n > N} a_n n^{-s} = \alpha_1(s) + \alpha_2(s),$$
here the justification is trivial since there are only finitely many terms. As for
$\alpha_2(s)$, we observe that
$$\alpha_2(s) = \int_N^{\infty} u^{-s}\, d(A(u) - A(N)) = s \int_N^{\infty} (A(u) - A(N)) u^{-s-1}\, du.$$
We have now established a precise relationship between (5.1) and (5.2), but
Theorem 5.1 is not sufficiently quantitative to be useful in practice. We express
the error term more explicitly in terms of the sine integral
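$$\operatorname{si}(x) = -\int_x^{\infty} \frac{\sin u}{u}\, du.$$
By integration by parts we see that $\operatorname{si}(x) \ll 1/x$ for x ≥ 1, and hence that
$$\operatorname{si}(x) \ll \min(1, 1/x) \tag{5.6}$$
for x > 0. We also note that
$$\operatorname{si}(x) + \operatorname{si}(-x) = -\int_{-\infty}^{+\infty} \frac{\sin u}{u}\, du = -\pi. \tag{5.7}$$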
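The properties of si(x) quoted above are easy to confirm numerically. The sketch below evaluates $\operatorname{Si}(x) = \int_0^x (\sin u)/u\, du$ from its Maclaurin series and uses the classical value $\int_0^\infty (\sin u)/u\, du = \pi/2$ to write $\operatorname{si}(x) = \operatorname{Si}(x) - \pi/2$. The series is only a convenient choice for moderate |x| (roughly |x| ≤ 20 in double precision); the function names are illustrative.

```python
from math import pi

def Si(x, terms=80):
    """Si(x) = sum_{k>=0} (-1)^k x^(2k+1) / ((2k+1) (2k+1)!), for moderate |x|."""
    total, term = 0.0, x            # term holds (-1)^k x^(2k+1) / (2k+1)!
    for k in range(terms):
        total += term / (2 * k + 1)
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def si(x):
    """si(x) = -int_x^infinity sin(u)/u du = Si(x) - pi/2."""
    return Si(x) - pi / 2

print(si(0.0), si(pi), si(3.0) + si(-3.0))
```

The three printed values should be close to $-\pi/2$, to 0.28114, and to $-\pi$ respectively, the last being (5.7); the bound $|\operatorname{si}(x)| \le 2/x$ that integration by parts gives can be checked in the same way for x ≥ 1.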
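Theorem 5.2 If $\sigma_0 > \max(0, \sigma_a)$ and x > 0, then
$$\sideset{}{'}\sum_{n \le x} a_n = \frac{1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s) \frac{x^s}{s}\, ds + R \tag{5.8}$$
where
$$R = \frac{1}{\pi} \sum_{x/2 < n < x} a_n \operatorname{si}\Big( T \log \frac{x}{n} \Big) - \frac{1}{\pi} \sum_{x < n < 2x} a_n \operatorname{si}\Big( T \log \frac{n}{x} \Big) + O\Big( \frac{4^{\sigma_0} + x^{\sigma_0}}{T} \sum_{n} \frac{|a_n|}{n^{\sigma_0}} \Big) .$$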
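Proof Since the series α(s) is absolutely convergent on the interval $[\sigma_0 -
iT, \sigma_0 + iT]$, we see that
$$\frac{1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s) \frac{x^s}{s}\, ds = \sum_{n} a_n \frac{1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \Big( \frac{x}{n} \Big)^{s} \frac{ds}{s} .$$
Thus it suffices to show that
$$\frac{1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} y^s \frac{ds}{s} = \begin{cases} 1 + O(y^{\sigma_0}/T) & \text{if } y \ge 2,\\ 1 + \frac{1}{\pi} \operatorname{si}(T \log y) + O(2^{\sigma_0}/T) & \text{if } 1 \le y \le 2,\\ -\frac{1}{\pi} \operatorname{si}(T \log 1/y) + O(2^{\sigma_0}/T) & \text{if } 1/2 \le y \le 1,\\ O(y^{\sigma_0}/T) & \text{if } y \le 1/2 \end{cases} \tag{5.9}$$
for $\sigma_0 > 0$.
To establish the first part of this formula, suppose that y ≥ 2, and let $\mathcal{C}$ be
the piecewise linear path from $-\infty - iT$ to $\sigma_0 - iT$ to $\sigma_0 + iT$ to $-\infty + iT$.
Then by the calculus of residues we see that
$$\frac{1}{2\pi i} \int_{\mathcal{C}} y^s \frac{ds}{s} = 1,$$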
In classical harmonic analysis, for $f \in L^1(\mathbb{T})$ we define Fourier coefficients
$\widehat f(k) = \int_0^1 f(\alpha) e(-k\alpha)\, d\alpha$, and we expect that the Fourier series $\sum \widehat f(k) e(k\alpha)$
provides a useful formula for f(α). As it happens, the Fourier series may
diverge, or converge to a value other than f(α), but for most f a satisfactory
alternative can be found. For example, if f is of bounded variation, then
$$\frac{f(\alpha^-) + f(\alpha^+)}{2} = \lim_{K\to\infty} \sum_{k=-K}^{K} \widehat f(k) e(k\alpha).$$
As in the case of Fourier series, this may fail, but it is not difficult to show that
if f is of bounded variation on [−A, A] for every A, then
$$\frac{f(x^-) + f(x^+)}{2} = \lim_{T\to\infty} \int_{-T}^{T} \widehat f(t) e(tx)\, dt. \tag{5.12}$$
The relationship between (5.1) and (5.2) is precisely the same as between
(5.10) and (5.11). Indeed, if we take $f(x) = A(e^{2\pi x}) e^{-2\pi\sigma x}$, then $f \in L^1(\mathbb{R})$ by
Theorem 1.3, and by changing variables in (5.1) we find that
$$\widehat f(t) = \frac{\alpha(\sigma + it)}{2\pi(\sigma + it)} .$$
Thus (5.2) is equivalent to (5.11), and an appeal to (5.12) provides a second
(real variable) proof of Theorem 5.1.
In general, if
$$F(s) = \int_0^{\infty} f(x) x^{s-1}\, dx, \tag{5.13}$$
then we say that F(s) is the Mellin transform of f(x). By (5.10) and (5.11) we
expect that
$$f(x) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} F(s) x^{-s}\, ds, \tag{5.14}$$
and when this latter formula holds we say that f is the inverse Mellin transform
of F. Thus if A(x) is the summatory function of a Dirichlet series α(s), then
α(s)/s is the Mellin transform of A(1/x) for $\sigma > \max(0, \sigma_c)$, and Perron’s
formula (Theorem 5.1) asserts that if $\sigma_0 > \max(0, \sigma_c)$, then A(1/x) is the inverse
and that
$$A_w(x) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) K(s) x^s\, ds. \tag{5.16}$$
Alternatively, we may start with a kernel K (s), and define the weight w(x)
to be its inverse Mellin transform. The precise conditions under which these
identities hold depends on the weight or kernel; we mention several important
examples.
1. Cesàro weights. For a positive integer k, put
$$C_k(x) = \frac{1}{k!} \sum_{n \le x} a_n (x - n)^k . \tag{5.17}$$
Then $C_k(x) = \int_0^x C_{k-1}(u)\, du$ for k ≥ 1 where $C_0(x) = A(x)$, and hence
$C_k(x) \ll x^{\theta}$ for $\theta > k + \max(0, \sigma_c)$. (The implicit constant here may depend
on k, on θ, and on the $a_n$.) By integrating (5.1) by parts repeatedly, we see
that
$$\alpha(s) = s(s + 1) \cdots (s + k) \int_1^{\infty} C_k(x) x^{-s-k-1}\, dx \tag{5.18}$$
for $\sigma > \max(0, \sigma_c)$. By following the method used to prove Theorem 5.1, it
may also be shown that
$$C_k(x) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) \frac{x^{s+k}}{s(s + 1) \cdots (s + k)}\, ds \tag{5.19}$$
when x > 0 and $\sigma_0 > \max(0, \sigma_c)$. Here the critical step is to show that if y ≥ 1
and $\sigma_0 > 0$, then
$$\frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \frac{y^s}{s(s + 1) \cdots (s + k)}\, ds = \sum_{j=0}^{k} \operatorname{Res} \frac{y^s}{s(s + 1) \cdots (s + k)} \bigg|_{s = -j}$$
for $\sigma > \max(0, \sigma_c)$. By following the method used to prove Theorem 5.1 we
also find that
$$R_k(x) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) \frac{x^s}{s^{k+1}}\, ds \tag{5.22}$$
when x > 0 and $\sigma_0 > \max(0, \sigma_c)$. Here the critical observation is that if y ≥ 1
and $\sigma_0 > 0$, then
$$\frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \frac{y^s}{s^{k+1}}\, ds = \operatorname{Res} \frac{y^s}{s^{k+1}} \bigg|_{s=0} = \frac{1}{k!} (\log y)^k .$$
3. Abelian weights. For σ > 0 we have
$$\Gamma(s) = \int_0^{\infty} e^{-u} u^{s-1}\, du = n^s \int_0^{\infty} e^{-nx} x^{s-1}\, dx.$$
where
$$P(x) = \sum_{n=1}^{\infty} a_n e^{-nx} . \tag{5.24}$$
These operations are valid for $\sigma > \max(0, \sigma_a)$, but by partial summation
$P(x) \ll x^{-\theta}$ as x → 0+ for $\theta > \max(0, \sigma_c)$, so that the integral in (5.23) is
absolutely convergent in the half-plane $\sigma > \max(0, \sigma_c)$. Hence the integral is
an analytic function in this half-plane, so that by the principle of uniqueness
of analytic continuation it follows that (5.23) holds for $\sigma > \max(0, \sigma_c)$. In the
opposite direction,
$$P(x) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) \Gamma(s) x^{-s}\, ds \tag{5.25}$$
for x > 0, $\sigma_0 > \max(0, \sigma_c)$. To prove this we recall from Theorem 1.5 that
$\alpha(s) \ll \tau$ uniformly for $\sigma \ge \varepsilon + \max(0, \sigma_c)$, and from Stirling’s formula
(Theorem C.1) we see that $|\Gamma(s)| \ll e^{-\frac{\pi}{2}|t|} |t|^{\sigma - 1/2}$ as |t| → ∞ with σ bounded.
Thus the value of the integral is independent of $\sigma_0$, and in particular we may
assume that $\sigma_0 > \max(0, \sigma_a)$. Consequently the terms in α(s) can be integrated
individually, and it suffices to appeal to Theorem C.4.
The formulæ (5.23) and (5.25) provide an important link between the Dirichlet
series α(s) and the power series generating function P(x). Indeed, these
formulæ hold for complex x, provided that $\Re x > 0$. In particular, by taking
x = δ − 2πiα we find that
$$\sum_{n=1}^{\infty} a_n e(n\alpha) e^{-n\delta} = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) \Gamma(s) (\delta - 2\pi i\alpha)^{-s}\, ds.$$
It may be noted in the above examples that smoother weights w(x) give rise
to kernels K (s) that tend to 0 rapidly as |t| → ∞. Further useful kernels can
be constructed as linear combinations of the above kernels.
Since the Mellin transform is a Fourier transform with altered variables, all
results pertaining to Fourier transforms can be reformulated in terms of Mellin
transforms. Particularly useful is Plancherel’s identity, which asserts that if $f \in
L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $\|f\|_2 = \|\widehat f\|_2$. This is the analogue for Fourier transforms
of Parseval’s identity for Fourier series, which asserts that $\sum_k |\widehat f(k)|^2 = \|f\|_2^2$.
By the changes of variables we noted before, we obtain
Theorem 5.4 (Plancherel’s identity) Suppose that $\int_0^{\infty} |w(x)| x^{-\sigma-1}\, dx < \infty$,
and also that $\int_0^{\infty} |w(x)|^2 x^{-2\sigma-1}\, dx < \infty$. Put $K(s) = \int_0^{\infty} w(x) x^{-s-1}\, dx$. Then
$$2\pi \int_0^{\infty} |w(x)|^2 x^{-2\sigma-1}\, dx = \int_{-\infty}^{+\infty} |K(\sigma + it)|^2\, dt.$$
5.1.1 Exercises
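1. Show that if $\sigma_c < \sigma_0 < 0$, then
$$\lim_{T\to\infty} \frac{1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s) \frac{x^s}{s}\, ds = -\sideset{}{'}\sum_{n > x} a_n .$$
2. (a) Show that if y ≥ 0, then
$$-\frac{\pi}{2} = \operatorname{si}(0) \le \operatorname{si}(y) \le \operatorname{si}(\pi) = 0.28114\ldots .$$
(b) Let β > 0 be fixed. Show that if x > 0 and $\sigma_0 > \max(0, \sigma_c)$, then
$$\frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) \Gamma(s/\beta) x^s\, ds = \beta \sum_{n=1}^{\infty} a_n e^{-(n/x)^{\beta}} .$$
(b) Explain why the values of the integrals above are independent of the
value of $\sigma_0$. Hence show that if $\sigma_0 = -b/a^2$, then the above is
$$= \frac{e^{-b^2/(2a^2)}}{2\pi} \int_{-\infty}^{+\infty} e^{-a^2 t^2/2}\, dt = \frac{1}{\sqrt{2\pi}\, a}\, e^{-b^2/(2a^2)} .$$
(c) Show that if a > 0, x > 0 and $\sigma_0 > \sigma_c$, then
$$\frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) e^{a^2 s^2/2} x^s\, ds = \frac{1}{\sqrt{2\pi}\, a} \sum_{n=1}^{\infty} a_n \exp\Big( -\frac{(\log x/n)^2}{2a^2} \Big) .$$
for $\sigma > \sigma_w$.
(a) Show that $A_w(x) = \sum_{n=1}^{\infty} a_n w(x/n)$ satisfies $A_w(x) \ll x^{\theta}$ for $\theta >
\max(\sigma_w, \sigma_c)$.
(b) Show that
$$K(s)\alpha(s) = \int_0^{\infty} A_w(x) x^{-s-1}\, dx$$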
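10. Suppose that F is strictly increasing, and that for i = 1, 2 the functions $f_i$
are real-valued with $f_i \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and $F(f_i) \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.
(a) Show that
$$\int_{-\infty}^{+\infty} \big( f_1(x) - f_2(x) \big) \big( F(f_1(x)) - F(f_2(x)) \big)\, dx
= \int_{-\infty}^{+\infty} \big( \widehat{f_1}(t) - \widehat{f_2}(t) \big) \overline{\big( \widehat{F(f_1)}(t) - \widehat{F(f_2)}(t) \big)}\, dt.$$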
5.2 Summability
We say that an infinite series $\sum a_n$ is Abel summable to a, and write $\sum a_n = a$
(A) if
$$\lim_{r \to 1^-} \sum_{n=0}^{\infty} a_n r^n = a.$$
Abel proved that if a series converges, then it is A-summable to the same value.
Because of this historical antecedent, we call a theorem ‘Abelian’ if it states
that one kind of summability implies another. Perhaps the simplest Abelian
theorem asserts that if $\sum_{n=1}^{\infty} a_n$ converges to a, then
$$\lim_{N\to\infty} \sum_{n=1}^{N} \Big( 1 - \frac{n}{N} \Big) a_n = a. \tag{5.27}$$
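A small numerical experiment illustrates (5.27). The sketch below uses the alternating harmonic series $a_n = (-1)^{n-1}/n$, which converges to log 2 (this particular series is an assumed example, not one taken from the text), and evaluates the weighted sums $\sum_{n\le N}(1 - n/N)a_n$ for increasing N.

```python
from math import log

def weighted_partial_sum(a, N):
    """The sum over 1 <= n <= N of (1 - n/N) * a(n), as on the left of (5.27)."""
    return sum((1 - n / N) * a(n) for n in range(1, N + 1))

def alternating_harmonic(n):
    """a_n = (-1)^(n-1) / n, whose series converges to log 2."""
    return (-1) ** (n - 1) / n

approximations = {N: weighted_partial_sum(alternating_harmonic, N)
                  for N in (10, 100, 1000, 10000)}
errors = {N: abs(v - log(2)) for N, v in approximations.items()}
print(errors)
```

The errors shrink roughly like 1/N here, consistent with the weighted means converging to the same limit as the series.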
$$\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} s_n = a. \tag{5.28}$$
$$\lim_{m\to\infty} \sum_{n=1}^{\infty} t_{mn} = 1. \tag{5.31}$$
We now show that regular transformations preserve limits, and relegate the
verification of the converse to exercises.
The important special case (5.28) is obtained by noting that the (semi-infinite)
matrix $[t_{mn}]$ with
$$t_{mn} = \begin{cases} 1/m & \text{if } 1 \le n \le m,\\ 0 & \text{if } n > m \end{cases}$$
To establish the second assertion, suppose that ε > 0 and that $|a_n| < \varepsilon$ for
n > N = N(ε). Now
$$|b_m| \le \sum_{n=1}^{N} |t_{mn} a_n| + \sum_{n > N} |t_{mn} a_n| = \Sigma_1 + \Sigma_2,$$
say. From (5.29) and the argument above with A = ε we see that $\Sigma_2 \le C\varepsilon$.
From (5.30) we see that $\lim_{m\to\infty} \Sigma_1 = 0$. Hence $\limsup_{m\to\infty} |b_m| \le C\varepsilon$, and
we have the desired conclusion since ε is arbitrary. Finally, suppose that T is
regular and that $\lim_{n\to\infty} a_n = a$. We write $a_n = a + \alpha_n$, so that
$$b_m = a \sum_{n=1}^{\infty} t_{mn} + \sum_{n=1}^{\infty} t_{mn} \alpha_n .$$
To see how this may also be derived from Theorem 5.5, let $\{s_m\}$ be an arbitrary
sequence of points of $\mathcal{S}$ for which $\lim_{m\to\infty} s_m = s_0$. It suffices to show that
$\lim_{m\to\infty} \alpha(s_m) = \alpha(s_0)$. Take
$$t_{mn} = n^{s_0 - s_m} - (n + 1)^{s_0 - s_m},$$
so that
$$\alpha(s_m) = \sum_{n=1}^{\infty} t_{mn} \sum_{k=1}^{n} a_k k^{-s_0} .$$
In view of Theorem 5.5, it suffices to show that $[t_{mn}]$ is regular. The conditions
(5.30) and (5.31) are clearly satisfied, and (5.29) follows on observing that if
$s \in \mathcal{S}$, then $|s - s_0| \ll_H \sigma - \sigma_0$, so that
$$\big| n^{s_0 - s} - (n + 1)^{s_0 - s} \big| = \Big| (s - s_0) \int_n^{n+1} u^{s_0 - s - 1}\, du \Big|
\ll_H (\sigma - \sigma_0) \int_n^{n+1} u^{\sigma_0 - \sigma - 1}\, du
= n^{\sigma_0 - \sigma} - (n + 1)^{\sigma_0 - \sigma} .$$
Thus we have the result. Abel’s analogous theorem on the convergence of power
series can be derived similarly from Theorem 5.5.
The converse of Abel’s theorem on power series is false, but Tauber (1897)
proved a partial converse: If $a_n = o(1/n)$ and $\sum a_n = a$ (A), then $\sum a_n = a$.
Following Hardy and Littlewood, we call a theorem ‘Tauberian’ if it provides
a partial converse of an Abelian theorem. The qualifying hypothesis (‘$a_n = o(1/n)$’
in the above) is the ‘Tauberian hypothesis’. For simplicity we begin
with partial converses of (5.27).
Theorem 5.6 If $\sum_{n=1}^{\infty} a_n = a$ (C, 1), then $\sum a_n = a$ provided that one of the
following hypotheses holds:
(a) $a_n \ge 0$ for $n \ge 1$;
(b) $a_n = O(1/n)$ for $n \ge 1$;
(c) There is a constant $A$ such that $a_n \ge -A/n$ for all $n \ge 1$.
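Proof Clearly (a) implies (c). If (b) holds, then both $\Re a_n$ and $\Im a_n$ satisfy (c).
Thus it suffices to prove that $\sum a_n = a$ when (c) holds. We observe that if $H$
is a positive integer, then
\[
\sum_{n=1}^{N} a_n = \frac{N + H}{H} \sum_{n=1}^{N+H} a_n \Bigl(1 - \frac{n}{N + H}\Bigr)
- \frac{N}{H} \sum_{n=1}^{N} a_n \Bigl(1 - \frac{n}{N}\Bigr)
- \frac{1}{H} \sum_{N < n < N + H} a_n (N + H - n) \tag{5.33}
\]
\[
= T_1 - T_2 - T_3,
\]
say. Take $H = [\varepsilon N]$ for some $\varepsilon > 0$. By hypothesis, $\lim_{N \to \infty} T_1 = a(1 + \varepsilon)/\varepsilon$,
and $\lim_{N \to \infty} T_2 = a/\varepsilon$. From (c) we see that
\[
T_3 \ge -A \sum_{N < n < N + H} \frac{1}{n} \ge -\frac{AH}{N} \ge -A\varepsilon.
\]
Hence on combining these estimates in (5.33) we see that
\[
\limsup_{N \to \infty} \sum_{n=1}^{N} a_n \le a + A\varepsilon.
\]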
so that
\[
\liminf_{N \to \infty} \sum_{n=1}^{N} a_n \ge a,
\]
If we had argued from (a) or (b), then the treatment of the term $T_3$ above
would have been simpler, since from (a) it follows that $T_3 \ge 0$, while from
(b) we have $T_3 \ll \varepsilon$.
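The unsmoothing identity (5.33) is purely algebraic, so it can be checked exactly on arbitrary data. The sketch below does this with rational arithmetic; the helper names `smoothed` and `unsmoothed` and the random test coefficients are assumptions made for illustration only.

```python
import random
from fractions import Fraction


def smoothed(a, N):
    """sum_{n=1}^{N} a_n (1 - n/N); the list a is 0-based, so a_n = a[n-1]."""
    return sum(a[n - 1] * (1 - Fraction(n, N)) for n in range(1, N + 1))


def unsmoothed(a, N, H):
    """Right-hand side T1 - T2 - T3 of (5.33)."""
    T1 = Fraction(N + H, H) * smoothed(a, N + H)
    T2 = Fraction(N, H) * smoothed(a, N)
    T3 = Fraction(1, H) * sum(a[n - 1] * (N + H - n) for n in range(N + 1, N + H))
    return T1 - T2 - T3


random.seed(0)
coeffs = [Fraction(random.randint(-5, 5)) for _ in range(60)]
N, H = 40, 7
lhs = sum(coeffs[:N])            # plain partial sum a_1 + ... + a_N
rhs = unsmoothed(coeffs, N, H)
print(lhs == rhs)
```

Exact fractions are used so that the comparison is an identity check rather than a floating-point approximation.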
Our next objective is to generalize and strengthen Theorem 5.6. The type of
generalization we have in mind is exhibited in the following result, which can
be established by adapting the above proof: Let $\beta$ be fixed, $\beta \ge 0$. If
\[
\sum_{n=1}^{N} a_n \Bigl(1 - \frac{n}{N}\Bigr) = (a + o(1)) N^{\beta},
\]
\[
\beta\,\Gamma(\beta) = \Gamma(\beta + 1) \tag{5.38}
\]
when $\beta > 0$.
The amount of unsmoothing required in deriving (5.37) from (5.35) is now
much greater than it was in the proof of Theorem 5.6. Nevertheless we follow
the same line of attack. To obtain the proper perspective we review the preceding
proof. Let $J = [0, 1]$, let $\chi_J(u)$ be its characteristic function, and put
$K(u) = \max(0, 1 - u)$ for $u \ge 0$. Thus $\sum_{n=1}^{N} a_n = \sum_n a_n \chi_J(n/N)$, and
$\sum_{n=1}^{N} a_n (1 - n/N) = \sum_n a_n K(n/N)$. Our strategy was to approximate to $\chi_J(u)$ by linear
combinations of $K(\kappa u)$ for various values of $\kappa$, $\kappa > 0$. The relation underlying
(5.33) and (5.34) is both simple and explicit:
\[
\frac{1}{\varepsilon}\bigl(K(u) - (1 - \varepsilon) K(u/(1 - \varepsilon))\bigr) \le \chi_J(u)
\le \frac{1}{\varepsilon}\bigl((1 + \varepsilon) K(u/(1 + \varepsilon)) - K(u)\bigr); \tag{5.39}
\]
we took $\varepsilon = H/N$. In the present situation we wish to approximate to $\chi_J(u)$ by
linear combinations of $e^{-\kappa u}$, $\kappa > 0$. We make the change of variable $x = e^{-u}$,
so that $0 \le x \le 1$, and we put $J = [1/e, 1]$. Then we want to approximate to
$\chi_J(x)$ by a linear combination $P(x)$ of the functions $x^{\kappa}$, $\kappa > 0$. In fact it suffices
to use only integral values of $\kappa$, so that $P(x)$ is a polynomial that vanishes at
the origin. In place of (5.33), (5.34) and (5.39) we shall substitute
Lemma 5.8 Let $\varepsilon$ be given, $0 < \varepsilon < 1/4$, and put $J = [1/e, 1]$,
$K = [e^{-1-\varepsilon}, e^{-1+\varepsilon}]$. There exist polynomials $P_{\pm}(x)$ such that for $0 \le x \le 1$ we have
and
Proof Let $g(x) = (\chi_J(x) - x)/(x(1 - x))$. Then $g$ is continuous in $[0, 1]$
apart from a jump discontinuity at $x = 1/e$ of height $e^2/(e - 1) < 5$. Hence
by Weierstrass’s theorem on the uniform approximation of continuous functions
by polynomials we see that there are polynomials $Q_{\pm}(x)$ such that
$Q_-(x) \le g(x) \le Q_+(x)$ for $0 \le x \le 1$, and for which
for $0 \le x \le 1$. Then the polynomials $P_{\pm}(x) = x + x(1 - x) Q_{\pm}(x)$ have the
desired properties.
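The stated size of the jump of $g$ at $x = 1/e$ can be confirmed directly: to the left of $1/e$ we have $g(x) = -1/(1 - x)$, and to the right $g(x) = 1/x$. The following small numerical sketch (the step size `h` is an arbitrary choice) compares the one-sided values with $e^2/(e - 1)$.

```python
import math


def g(x):
    """g(x) = (chi_J(x) - x) / (x (1 - x)) with J = [1/e, 1], for 0 < x < 1."""
    chi = 1.0 if 1 / math.e <= x <= 1 else 0.0
    return (chi - x) / (x * (1 - x))


h = 1e-9
x0 = 1 / math.e
jump = g(x0 + h) - g(x0 - h)          # difference of one-sided values near 1/e
expected = math.e**2 / (math.e - 1)   # height claimed in the proof
print(jump, expected)
```

The same formulas show that $g(x) \to -1$ as $x \to 0^+$ and $g(x) \to 1$ as $x \to 1^-$, which is why the endpoints cause no difficulty.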
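and that
\[
\int_0^{\infty} v^{\beta - 1} e^{-v\delta}\,dv = \delta^{-\beta} \int_0^{\infty} w^{\beta - 1} e^{-w}\,dw = \delta^{-\beta}\,\Gamma(\beta).
\]
Hence if $b(u) = a(u) - \alpha (u + 1)^{\beta - 1}/\Gamma(\beta)$, then $b(u) \ge -B(u + 1)^{\beta - 1}$, and
\[
\int_0^{\infty} b(u) e^{-u\delta}\,du = o(\delta^{-\beta}).
\]
Thus $\int_0^U b(u)\,du = o(U^{\beta})$, so that
\[
\int_0^U a(u)\,du = \frac{\alpha}{\beta\,\Gamma(\beta)} U^{\beta} + o(U^{\beta}),
\]
and we have (5.37), in view of (5.38).
For the remaining case, $\beta = 0$, it suffices to consider $b(u) = a(u) - \alpha \chi_{[0,1]}(u)$.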
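Corollary 5.9 Suppose that $p(z) = \sum_{n=0}^{\infty} a_n z^n$ converges for $|z| < 1$, and
that $\beta \ge 0$. If $p(x) = (\alpha + o(1))(1 - x)^{-\beta}$ as $x \to 1^-$, and if $a_n \ge -A n^{\beta - 1}$
for $n \ge 1$, then
\[
\sum_{n=0}^{N} a_n = \Bigl(\frac{\alpha}{\Gamma(\beta + 1)} + o(1)\Bigr) N^{\beta}.
\]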
Proof Take $\beta = 1$, $p(z) = \sum_{n=0}^{\infty} s_n z^n = (1 - z)^{-1} \sum_{n=0}^{\infty} a_n z^n$ in Corollary
5.9. Then $\sum_{n=0}^{N} s_n = (\alpha + o(1)) N$, which is the desired result.
so that
\[
\sum_{n \le N} \frac{a_n}{n} \ge \Bigl(\frac{\alpha}{\Gamma(\beta + 1)} + o(1)\Bigr) (\log N)^{\beta} - A_1 (\log N)^{\beta - 1}.
\]
then
\[
\sum_{n=1}^{N} a_n (N - n) = \frac{1}{2} N^2 + O\bigl(N^{3/2}\bigr). \tag{5.45}
\]
This is best possible (take $a_n = 1 + n^{-1/2}$), but if the error term is oscillatory,
then smoothing may reduce its size (consider $a_n = \cos \sqrt{n}$). Conversely if
(5.45) holds and if the sequence $a_n$ is bounded, then the method used to prove
Theorem 5.6 can be used to show that
\[
\sum_{n=1}^{N} a_n = N + O\bigl(N^{3/4}\bigr). \tag{5.46}
\]
This error term, though weak, is best possible (take $a_n = 1 + \cos (\log n)^2$).
For Dirichlet series it can be shown that if
\[
\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s} = \frac{1}{s - 1} + O(1)
\]
This is also best possible (take $a_n = 1 + \cos (\log \log n)^2$), but we can obtain a
sharper result by strengthening our analytic hypothesis. For example, it can be
shown that if $\alpha(s)$ is analytic in a neighbourhood of 1 and if the sequence $a_n$ is
bounded, then
\[
\sum_{n=1}^{N} \frac{a_n}{n} = O(1).
\]
However, even this stronger assumption does not allow us to deduce that
\[
\sum_{n=1}^{N} a_n = o(N),
\]
5.2.1 Exercises
1. Let T be a regular matrix such that tmn ≥ 0 for all m, n. Show that if
limn→∞ an = +∞, then limm→∞ bm = +∞.
2. Show that if $T = [t_{mn}]$ and $U = [u_{mn}]$ are regular matrices, then so is
$TU = V = [v_{mn}]$ where
\[
v_{mn} = \sum_{k=1}^{\infty} t_{mk} u_{kn}.
\]
3. Show that if b = T a and limm→∞ bm = a whenever limn→∞ an = a, then
T is regular.
4. For $n = 0, 1, 2, \ldots$ let $t_n(x)$ be defined on $[0, 1)$, and suppose that the $t_n$
satisfy the following conditions:
(i) There is a constant $C$ such that if $x \in [0, 1)$, then $\sum_{n=0}^{\infty} |t_n(x)| \le C$.
(ii) For all $n$, $\lim_{x \to 1^-} t_n(x) = 0$.
(iii) $\lim_{x \to 1^-} \sum_{n=0}^{\infty} t_n(x) = 1$.
Show that if $\lim_{n \to \infty} a_n = a$ and if $b(x) = \sum_{n=0}^{\infty} a_n t_n(x)$, then
$\lim_{x \to 1^-} b(x) = a$.
5. (Kojima 1917) Suppose that the numbers $t_{mn}$ satisfy the following
conditions:
(i) There is a constant $C$ such that $\sum_{n=1}^{\infty} |t_{mn}| \le C$ for all $m$.
(ii) For all $n$, $\lim_{m \to \infty} t_{mn}$ exists.
(iii) $\lim_{m \to \infty} \sum_{n=1}^{\infty} t_{mn}$ exists.
Show that if $\lim_{n \to \infty} a_n$ exists and if $b_m = \sum_{n=1}^{\infty} t_{mn} a_n$, then $\lim_{m \to \infty} b_m$
exists.
6. For positive integers $n$ let $K_n(x)$ be a function defined on $[0, \infty)$ such that
(i) $\int_0^{\infty} K_n(x)\,dx \to 1$ as $n \to \infty$;
(ii) $\int_0^{\infty} |K_n(x)|\,dx \le C$ for all $n$;
(iii) $\lim_{n \to \infty} K_n(x) = 0$ uniformly for $0 \le x \le X$.
Suppose that $a(x)$ is a bounded function, and that $b_n = \int_0^{\infty} a(x) K_n(x)\,dx$.
Show that if $\lim_{x \to \infty} a(x) = a$, then $\lim_{n \to \infty} b_n = a$.
7. Let $r_m$ be a sequence of positive real numbers with $r_m \to 1^-$ as $m \to \infty$.
For $m \ge 1$, $n \ge 1$, put $t_{mn} = n r_m^{n-1} (1 - r_m)^2$.
(a) Show that $[t_{mn}]$ is regular.
(b) Show that if $a_n = \sum_{k=0}^{n-1} c_k (1 - k/n)$ and $b_m$ is defined by (5.32), then
$b_m = \sum_{k=0}^{\infty} c_k r_m^k$.
(c) Show that if $\sum c_n = c$ (C, 1), then $\sum c_n = c$ (A).
8. Suppose that $T = [t_{mn}]$ is given by
\[
t_{mn} = \begin{cases}
0 & \text{if } n = 0, \\[2pt]
\dfrac{m!\,n}{m^{n+1} (m - n)!} & \text{if } m \ge n > 0, \\[6pt]
0 & \text{if } m < n.
\end{cases}
\]
for 1 ≤ k ≤ m .
(b) Verify that T is regular.
(c) Show that if $a_n = \sum_{k=0}^{n} x^k/k!$ for $n \ge 0$, then $b_m = (1 + x/m)^m$ for
$m \ge 1$.
9. (Mercer’s theorem) Suppose that
\[
b_m = \frac{1}{2} a_m + \frac{1}{2} \cdot \frac{a_1 + a_2 + \cdots + a_m}{m}
\]
for $m \ge 1$. Show that
\[
a_n = \frac{2n}{n + 1} b_n - \frac{2}{n(n + 1)} \sum_{m=1}^{n-1} m b_m.
\]
but
\[
\lim_{N \to \infty} \sum_{n=1}^{N} a_n (1 - n/N) = 0.
\]
(c) Let $B(N) = \sum_{n=1}^{N} n a_n$. Show that if $\sum a_n$ converges, then $B(N) = o(N)$
as $N \to \infty$.
(d) Show that if P(δ) converges for δ > 0, then
B(N ) N
1 e−u/N e−u/N
s N − P(1/N ) = + B(u) − − du
N 1 u2 u2 uN
∞
u du
+ B(u)e−u/N −1 .
N N u2
(e) Show that if $B(N) = o(N)$, then $s_N - P(1/N) = o(1)$.
(f) Show that if $\sum a_n = a$ (A), then $\sum a_n = a$ if and only if $B(N) = o(N)$.
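21. (a) Using Ramanujan’s identity $\sum_{n=1}^{\infty} d(n)^2 n^{-s} = \zeta(s)^4/\zeta(2s)$ and Theorem 5.11,
show that $\sum_{n \le x} d(n)^2/n \sim (4\pi^2)^{-1} (\log x)^4$.
(b) Show that if $\sum_{n \le x} d(n)^2 \sim c x (\log x)^3$ as $x \to \infty$, then $c = 1/\pi^2$.
22. Show that $\sum_{n=1}^{\infty} 1/(d(n) n^s) \sim c (s - 1)^{-1/2}$ as $s \to 1^+$ where
\[
c = \prod_p (p^2 - p)^{1/2} \log \frac{p}{p - 1}.
\]
Deduce that
\[
\sum_{n \le x} \frac{1}{n\,d(n)} \sim \frac{2c}{\sqrt{\pi}} (\log x)^{1/2}
\]
as $x \to \infty$.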
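23. Show that if $\sum_{n \le N} a_n/n = O(1)$ and $\lim_{s \to 1^+} \sum_{n=1}^{\infty} a_n n^{-s} = a$, then
\[
\lim_{x \to \infty} \sum_{n \le x} \frac{a_n}{n} \Bigl(1 - \frac{\log n}{\log x}\Bigr) = a.
\]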
27. Suppose that for every $\varepsilon > 0$ there is an $\eta > 0$ such that
$\bigl|\sum_{N < n \le (1 + \eta) N} a_n\bigr| < \varepsilon$ whenever $N > 1/\eta$. Show that if $\sum a_n = a$ (A),
then $\sum a_n = a$.
28. Show that if $\sum a_n = a$ (C, 1) and if $a_{n+1} - a_n = O(|a_n|/n)$, then $\sum a_n = a$.
29. (Hardy & Littlewood 1913, Theorem 27) Show that if $\sum a_n = a$ (A) and if
$a_{n+1} - a_n = O(|a_n|/n)$, then $\sum a_n = a$.
30. (Hardy 1907) Show that
\[
\lim_{x \to 1^-} \sum_{k=0}^{\infty} (-1)^k x^{2^k}
\]
5.3 Notes
Section 5.1. Theorem 5.1 and the more general (5.22) were first proved rig-
orously by Perron (1908). Although the Mellin transform had been used by
Riemann and Cahen, it was Mellin (1902) who first described a general class
of functions for which the inversion succeeds. Hjalmar Mellin was Finnish, but
his family name is of Swedish origin, so it is properly pronounced mĕ · lēn .
However, in English-speaking countries the uncultured pronunciation mĕl · ı̆n
is universal.
In connection with Theorem 5.4, it should be noted that Plancherel’s formula
$\|f\|_2 = \|\hat f\|_2$ holds not just for all $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ but actually for all
$f \in L^2(\mathbb{R})$. However, in this wider setting one must adopt a new definition for
$\hat f$, since the definition we have taken is valid only for $f \in L^1(\mathbb{R})$. See Goldberg
(1961, pp. 46–47) for a resolution of this issue.
For further material concerning properties of Dirichlet series, one should
consult Hardy & Riesz (1915), Titchmarsh (1939, Chapter 9), or Widder (1971,
Chapter 2). Beyond the theory developed in these sources, we call attention to
two further topics of importance in number theory. Wiener (1932, p. 91) proved
that if the Fourier series of $f \in L^1(\mathbb{T})$ is absolutely convergent and is never zero,
then the Fourier series of $1/f$ is also absolutely convergent. Wiener’s proof was
rather difficult, but Gel’fand (1941) devised a simpler proof depending on his
theory of normed rings. Lévy (1934) proved more generally that the Fourier
series of F( f ) is absolutely convergent provided that F is analytic at all points
in the range of f . Elementary proofs of these theorems have been given by
Zygmund (1968, pp. 245–246) and Newman (1975). These theorems were
generalized to absolutely convergent Dirichlet series by Hewitt & Williamson
(1957), who showed that if $\alpha(s) = \sum a_n n^{-s}$ is absolutely convergent for σ ≥
σ0 , then 1/α(s) is represented by an absolutely convergent Dirichlet series
in the same half-plane, if and only if the values taken by α(s) in this half-
plane are bounded away from 0. Ingham (1962) noted a fallacy in Zygmund’s
account of Lévy’s theorem, corrected it, and gave an elementary proof of the
generalization to absolutely convergent Dirichlet series. See also Goodman &
Newman (1984). Secondly, Bohr (1919) developed a theory concerning the
values taken on by an absolutely convergent Dirichlet series. This is described
by Titchmarsh (1986, Chapter 11), and in greater detail by Apostol (1976,
Chapter 8). For a small footnote to this theory, see Montgomery & Schinzel
(1977).
Section 5.2. That conditions (5.29)–(5.31) are necessary and sufficient for
the transformation T to preserve limits was proved by Toeplitz (1911) for upper
triangular matrices, and by Steinhaus (1911) in general. See also Kojima (1917)
and Schur (1921). For more on the Toeplitz matrix theorem and various aspects
of Tauberian theorems, see Peyerimhoff (1969).
Theorem 5.6 under the hypothesis (a) is trivial by dominated convergence.
Theorem 5.6(b) is a special case of a theorem of Hardy (1910), who considered
the more general (C,k) convergence, and Theorem 5.6(c) is similarly a special
case of a theorem of Landau (1910, pp. 103–113).
Tauber (1897) proved two theorems, the second of which is found in Exer-
cise 5.2.18. Littlewood (1911) derived his strengthening of Tauber’s first theo-
rem by using high-order derivatives. Subsequently Hardy & Littlewood (1913,
1914a, b, 1926, 1930) used the same technique to obtain Theorem 5.8 and
its corollaries. Karamata (1930, 1931a, b) introduced the use of Weierstrass’s
approximation theorem. Karamata also considered a more general situation,
in which the right-hand sides of (5.35) and (5.36) are multiplied by a slowly
oscillating function L(1/δ), and the right-hand side of (5.37) is multiplied by
L(U ). Our exposition employs a further simplification due to Wielandt (1952).
Other proofs of Littlewood’s theorem have been given by Delange (1952) and
by Eggleston (1951). Ingham (1965) observed that a peak function similar
to Littlewood’s can be constructed by using high-order differencing instead
of differentiation. Since many proofs of the Weierstrass theorem involve con-
structing a peak function, the two methods are not materially different. Sharp
quantitative Tauberian theorems have been given by Postnikov (1951), Kore-
vaar (1951, 1953, 1954a–d), Freud (1952, 1953, 1954), Ingham (1965), and
Ganelius (1971).
For other accounts of the Hardy–Littlewood theorem, see Hardy (1949) or
Widder (1946, 1971). For a brief survey of applications of summability to
classical analysis, see Rubel (1989).
Wiener (1932, 1933) invented a general Tauberian theory that contains the
Hardy–Littlewood theorems for power series (Theorem 5.8 and its corollaries)
as a special case. Wiener’s theory is discussed by Hardy (1949), Pitt (1958), and
Widder (1946). Among the longer expositions of Tauberian theory, the recent
accounts of Korevaar (2002, 2004) are especially recommended.
5.4 References
Apostol, T. (1976). Modular Functions and Dirichlet Series in Number Theory, Graduate
Texts Math. 41. New York: Springer-Verlag.
Bohr, H. (1909). Über die Summabilität Dirichletscher Reihen, Nachr. König. Gesell.
Wiss. Göttingen Math.-Phys. Kl., 247–262; Collected Mathematical Works, Vol. I.
København: Dansk Mat. Forening, 1952, A2.
(1919). Zur Theorie algemeinen Dirichletschen Reihen, Math. Ann. 79, 136–156;
Collected Mathematical Works, Vol. I. København: Dansk Mat. Forening, 1952,
A13.
Delange, H. (1952). Encore une nouvelle démonstration du théorème taubérien de Lit-
tlewood, Bull. Sci. Math. (2) 76, 179–189.
Edwards, D. A. (1957). On absolutely convergent Dirichlet series, Proc. Amer. Math.
Soc. 8, 1067–1074.
Eggleston, H. G. (1951). A Tauberian lemma, Proc. London Math. Soc. (3) 1, 28–45.
Freud, G. (1952). Restglied eines Tauberschen Satzes, I, Acta Math. Acad. Sci. Hungar.
2, 299–308.
(1953). Restglied eines Tauberschen Satzes, II, Acta Math. Acad. Sci. Hungar. 3,
299–307.
(1954). Restglied eines Tauberschen Satzes, III, Acta Math. Acad. Sci. Hungar. 5,
275–289.
Ganelius, T. (1971). Tauberian Remainder Theorems, Lecture Notes Math. 232. Berlin:
Springer-Verlag.
Gel’fand, I. M. (1941). Über absolut konvergente trigonometrische Reihen und Integrale,
Mat. Sb. N. S. 9, 51–66.
Goldberg, R. R. (1961). Fourier Transforms, Cambridge Tract 52. Cambridge: Cambridge
University Press.
Goodman, A. & Newman, D. J. (1984). A Wiener type theorem for Dirichlet series,
Proc. Amer. Math. Soc. 92, 521–527.
Hardy, G. H. (1907). On certain oscillating series, Quart. J. Math. 38, 269–288; Collected
Papers, Vol. 6. Oxford: Clarendon Press, 1974, pp. 146–167.
(1910). Theorems relating to the summability and convergence of slowly oscillating
series, Proc. London Math. Soc. (2) 8, 301–320; Collected Papers, Vol. 6. Oxford:
Clarendon Press, 1974, pp. 291–310.
(1949). Divergent Series, Oxford: Oxford University Press.
Hardy, G. H. & Littlewood, J. E. (1913). Contributions to the arithmetic theory of
series, Proc. London Math. Soc. (2) 11, 411–478; Collected Papers, Vol. 6. Oxford:
Clarendon Press, 1974, pp. 428–495.
(1914a). Tauberian theorems concerning power series and Dirichlet series whose co-
efficients are positive, Proc. London Math. Soc. (2) 13, 174–191; Collected Papers,
Vol. 6. Oxford: Clarendon Press, 1974, pp. 510–527.
(1914b). Some theorems concerning Dirichlet’s series, Messenger Math. 43, 134–147;
Collected Papers, Vol. 6. Oxford: Clarendon Press, 1974, pp. 542–555.
(1926). A further note on the converse of Abel’s theorem, Proc. London Math.
Soc. (2) 25, 219–236; Collected Papers, Vol. 6. Oxford: Clarendon Press, 1974,
pp. 699–716.
(1930). Notes on the theory of series XI: On Tauberian theorems, Proc. London
Math. Soc. (2) 30, 23–37; Collected Papers, Vol. 6. Oxford: Clarendon Press, 1974,
pp. 745–759.
Hardy, G. H. & Riesz, M. (1915). The General Theory of Dirichlet’s Series, Cambridge
Tract No. 18. Cambridge: Cambridge University Press. Reprint: Stechert–Hafner
(1964).
Hewitt, E. & Williamson, H. (1957). Note on absolutely convergent Dirichlet series,
Proc. Amer. Math. Soc. 8, 863–868.
Ingham, A. E. (1962). On absolutely convergent Dirichlet series. Studies in Mathemati-
cal Analysis and Related Topics. Stanford: Stanford University Press, pp. 156–164.
(1965). On tauberian theorems, Proc. London Math. Soc. (3) 14A, 157–173.
Karamata, J. (1930). Über die Hardy–Littlewoodschen Umkehrungen des Abelschen
Stetigkeitssatzes, Math. Z. 32, 319–320.
(1931a). Neuer Beweis und Verallgemeinerung einiger Tauberian-Sätze, Math. Z. 33,
294–300.
(1931b). Neuer Beweis und Verallgemeinerung der Tauberschen Sätze, welche die
Laplacesche und Stieltjessche Transformation betreffen, J. Reine Angew. Math.
164, 27–40.
Kojima, T. (1917). On generalized Toeplitz’s theorems on limit and their application,
Tôhoku Math. J. 12, 291–326.
Korevaar, J. (1951). An estimate of the error in Tauberian theorems for power series,
Duke Math. J. 18, 723–734.
(1953). Best $L^1$ approximation and the remainder in Littlewood’s theorem, Proc.
Nederl. Akad. Wetensch. Ser. A 56 (= Indagationes Math. 15), 281–293.
(1954a). A very general form of Littlewood’s theorem, Proc. Nederl. Akad. Wetensch.
Ser. A 57 (= Indagationes Math. 16), 36–45.
(1954b). Another numerical Tauberian theorem for power series, Proc. Nederl. Akad.
Wetensch. Ser. A 57 (= Indagationes Math. 16), 46–56.
(1954c). Numerical Tauberian theorems for Dirichlet and Lambert series, Proc.
Nederl. Akad. Wetensch. Ser. A 57 (= Indagationes Math. 16), 152–160.
(1954d). Numerical Tauberian theorems for power series and Dirichlet series, I, II,
Proc. Nederl. Akad. Wetensch. Ser. A 57 (= Indagationes Math. 16), 432–443,
444–455.
(2001). Tauberian theory, approximation, and lacunary series of powers, Trends in
approximation theory (Nashville, 2000), Innov. Appl. Math. Nashville: Vanderbilt
University Press, pp. 169–189.
(2002). A century of complex Tauberian theory, Bull. Amer. Math. Soc. (N.S.) 39,
475–531.
(2004). Tauberian Theory. A Century of Developments. Grundl. Math. Wiss. 329.
Berlin: Springer-Verlag.
Landau, E. (1907). Über die Multiplikation Dirichletscher Reihen, Rend. Circ. Mat.
Palermo 24, 81–160.
(1908). Zwei neue Herleitungen für die asymptotische Anzahl der Primzahlen unter
einer gegebenen Grenze, Sitzungsberichte Akad. Wiss. Berlin 746–764; Collected
Works, Vol.4. Essen: Thales Verlag, 1986, pp. 21–39.
(1909). Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig: Teubner.
Reprint: Chelsea (New York), 1953.
(1910). Über die Bedeutung einiger neuerer Grenzwertsätze der Herren Hardy und
Axer, Prace mat.-fiz. (Warsaw) 21, 97–177; Collected Works, Vol. 4. Essen: Thales
Verlag, 1986, pp. 267–347.
(1913). Einige Ungleichungen für zweimal differentiierbare Funktionen, Proc. Lon-
don Math. Soc. (2) 13, 43–49; Collected Works, Vol. 6. Essen: Thales Verlag, 1986,
pp. 49–55.
Lévy, P. (1934). Sur la convergence absolue des séries de Fourier, Compositio Math. 1,
1–14.
Littlewood, J. E. (1911). The converse of Abel’s theorem on power series, Proc. London
Math. Soc. (2) 9, 434–448; Collected Papers, Vol. 1. Oxford: Oxford University
Press, 1982, pp. 757–773.
(1986). Littlewood’s Miscellany, Bollobas, B. Ed., Cambridge: Cambridge University
Press.
van de Lune, J. (1986). An Introduction to Tauberian Theory: From Tauber to Wiener.
CWI Syllabus 12. Amsterdam: Mathematisch Centrum.
Mellin, H. (1902). Über den Zusammenhang zwischen den linearen Differential- und
Differenzengleichungen, Acta Math. 25, 139–164.
Montgomery, H. L. & Schinzel, A. (1977). Some arithmetic properties of polynomials in
several variables. Transcendence Theory: Advances and Applications (Cambridge,
1976). London: Academic Press, pp. 195–203.
Newman, D. J. (1975). A simple proof of Wiener’s 1/ f theorem, Proc. Amer. Math. Soc.
48, 264–265.
Perron, O. (1908). Zur Theorie der Dirichletschen Reihen, J. Reine Angew. Math. 134,
95–143.
Peyerimhoff, A. (1969). Lectures on summability, Lecture Notes Math. 107. Berlin:
Springer-Verlag.
Pitt, H. R. (1958). Tauberian Theorems. Tata Monographs. London: Oxford University
Press.
Postnikov, A. G. (1951). The remainder term in the Tauberian theorem of Hardy and
Littlewood, Dokl. Akad. Nauk SSSR N. S. 77, 193–196.
Riesz, M. (1909). Sur la sommation des séries de Dirichlet, C. R. Acad. Sci. Paris 149,
18–21.
Rubel, L. (1989). Summability theory: a neglected tool of analysis, Amer. Math. Monthly
96, 421–423.
Schoenberg, I. J. (1973). The elementary cases of Landau’s problem of inequalities
between derivatives, Amer. Math. Monthly 80, 121–158.
Schur, I. (1921). Über lineare Transformationen in der Theorie der unendlichen Reihen,
J. Reine Angew. Math. 151, 79–111.
Steinhaus, H. (1911). Kilka słów o uogólnieniu pojęcia granicy, Warsaw: Prace mat-fiz
22, 121–134.
Tauber, A. (1897). Ein Satz aus der Theorie der unendlichen Reihen, Monat. Math. 8,
273–277.
6.1 A zero-free region
put
\[
g(z) = f(z) \prod_{k=1}^{K} \frac{R^2 - z \overline{z_k}}{R(z - z_k)}.
\]
The $k$th factor of the product has been constructed so that it has a pole at $z_k$, and
so that it has modulus 1 on the circle $|z| = R$. Hence $g$ is an analytic function
in the disc $|z| \le R$, and if $|z| = R$, then $|g(z)| = |f(z)| \le M$. Hence by the
maximum modulus principle, $|g(0)| \le M$. But
\[
|g(0)| = |f(0)| \prod_{k=1}^{K} \frac{R}{|z_k|}.
\]
We now show that a bound for the modulus of an analytic function can be
derived from a one-sided bound for its real part in a slightly larger region.
Lemma 6.2 (The Borel–Carathéodory Lemma) Suppose that $h(z)$ is analytic
in a domain containing the disc $|z| \le R$, that $h(0) = 0$, and that $\Re h(z) \le M$
for $|z| \le R$. If $|z| \le r < R$, then
\[
|h(z)| \le \frac{2Mr}{R - r}
\]
and
\[
|h'(z)| \le \frac{2MR}{(R - r)^2}.
\]
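Proof It suffices to show that
\[
\frac{|h^{(k)}(0)|}{k!} \le \frac{2M}{R^k} \tag{6.1}
\]
for all $k \ge 1$, for then
\[
|h(z)| \le \sum_{k=1}^{\infty} \frac{|h^{(k)}(0)|}{k!} r^k \le 2M \sum_{k=1}^{\infty} \Bigl(\frac{r}{R}\Bigr)^k = \frac{2Mr}{R - r},
\]
and
\[
|h'(z)| \le \sum_{k=1}^{\infty} \frac{|h^{(k)}(0)|\,k r^{k-1}}{k!} \le \frac{2M}{R} \sum_{k=1}^{\infty} k \Bigl(\frac{r}{R}\Bigr)^{k-1} = \frac{2MR}{(R - r)^2}.
\]
To prove (6.1) we first note that
\[
\int_0^1 h(Re(\theta))\,d\theta = \frac{1}{2\pi i} \oint_{|z| = R} h(z)\,\frac{dz}{z} = h(0) = 0.
\]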
and
\[
\int_0^1 h(Re(\theta)) e(-k\theta)\,d\theta = \frac{R^k}{2\pi i} \oint_{|z| = R} h(z) z^{-k-1}\,dz = \frac{R^k h^{(k)}(0)}{k!}.
\]
By forming a linear combination of these identities we see that if $k > 0$, then
\[
\int_0^1 h(Re(\theta)) \bigl(1 + \cos 2\pi(k\theta + \phi)\bigr)\,d\theta = \frac{R^k e(-\phi) h^{(k)}(0)}{2 \cdot k!}.
\]
By taking real parts it follows that
\[
\frac{1}{2} \Re\bigl(R^k e(-\phi) h^{(k)}(0)/k!\bigr) \le M \int_0^1 \bigl(1 + \cos 2\pi(k\theta + \phi)\bigr)\,d\theta = M
\]
for $k > 0$. Since this holds for any real $\phi$, we are free to choose $\phi$ so that
$e(-\phi) h^{(k)}(0) = |h^{(k)}(0)|$. Then the above inequality gives (6.1), and the proof
is complete.
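The first bound of the lemma can be sanity-checked numerically on the function $h(z) = 2Mz/(z + R + \varepsilon)$, which is the test function of Exercise 2 in Section 6.1.1. The sketch below samples circles rather than whole discs (by the maximum modulus principle this suffices for $|h|$); the parameter values and helper names are arbitrary assumptions.

```python
import cmath
import math


def h(z, M, c):
    """Test function h(z) = 2 M z / (z + c), analytic for |z| < c."""
    return 2 * M * z / (z + c)


def circle(radius, samples=2000):
    """Equally spaced sample points on the circle |z| = radius."""
    return [radius * cmath.exp(2j * math.pi * k / samples) for k in range(samples)]


M, R, eps, r = 1.0, 1.0, 0.05, 0.6
c = R + eps

# Hypothesis of the lemma: the real part of h is at most M on |z| = R.
max_real_part = max(h(z, M, c).real for z in circle(R))

# Conclusion: |h(z)| <= 2 M r / (R - r) on |z| = r.
max_modulus = max(abs(h(z, M, c)) for z in circle(r))
bound = 2 * M * r / (R - r)
print(max_real_part, max_modulus, bound)
```

For this particular $h$ the maximum on $|z| = r$ is attained at $z = -r$ and equals $2Mr/(R + \varepsilon - r)$, so letting $\varepsilon \to 0$ shows that the constant in the lemma cannot be improved.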
If $P(z) = c \prod_{k=1}^{K} (z - z_k)$, then
\[
\frac{P'}{P}(z) = \sum_{k=1}^{K} \frac{1}{z - z_k}.
\]
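As a quick check of the partial-fraction formula for the logarithmic derivative of a polynomial, the following sketch evaluates both sides at a sample point. The zeros, the leading coefficient and the evaluation point are arbitrary illustrative choices, and the helper name is hypothetical.

```python
def poly_and_derivative(c, zeros, z):
    """Return (P(z), P'(z)) for P(z) = c * prod(z - z_k), using the product rule."""
    value = c
    for zk in zeros:
        value *= z - zk
    derivative = 0
    for j in range(len(zeros)):
        term = c
        for k, zk in enumerate(zeros):
            if k != j:           # omit the j-th factor
                term *= z - zk
        derivative += term
    return value, derivative


zeros = [1 + 2j, -0.5, 3j]
z = 0.3 - 0.7j
P, dP = poly_and_derivative(2.0, zeros, z)
log_derivative = dP / P
zero_sum = sum(1 / (z - zk) for zk in zeros)
print(log_derivative, zero_sum)
```

Note that the leading coefficient $c$ cancels in the quotient, which is why only the zeros appear on the right-hand side.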
We now generalize this to analytic functions $f(z)$, to the extent that $f'/f$ can
be approximated by a sum over its nearby zeros.
Lemma 6.3 Suppose that $f(z)$ is analytic in a domain containing the disc
$|z| \le 1$, that $|f(z)| \le M$ in this disc, and that $f(0) \ne 0$. Let $r$ and $R$ be fixed,
$0 < r < R < 1$. Then for $|z| \le r$ we have
\[
\frac{f'}{f}(z) = \sum_{k=1}^{K} \frac{1}{z - z_k} + O\Bigl(\log \frac{M}{|f(0)|}\Bigr)
\]
where the sum is extended over all zeros $z_k$ of $f$ for which $|z_k| \le R$. (The implicit
constant depends on $r$ and $R$, but is otherwise absolute.)
Proof If $f(z)$ has zeros on the circle $|z| = R$, then we replace $R$ by a very
slightly larger value. Thus we may assume that $f(z) \ne 0$ for $|z| = R$. Set
\[
g(z) = f(z) \prod_{k=1}^{K} \frac{R^2 - z \overline{z_k}}{R(z - z_k)}.
\]
where τ = |t| + 4 and the sum is extended over all zeros ρ of ζ (s) for which
|ρ − (3/2 + it)| ≤ 5/6.
Proof We apply Lemma 6.3 to the function $f(z) = \zeta(z + (3/2 + it))$, with
$R = 5/6$ and $r = 2/3$. To complete the proof it suffices to note that $|f(0)| \gg 1$
by the (absolutely convergent) Euler product formula (1.17), and that $f(z) \ll \tau$
for $|z| \le 1$ by Corollary 1.17.
We now use Lemmas 6.4 and 6.5 to establish the existence of a zero-free
region for the zeta function.
Theorem 6.6 There is an absolute constant $c > 0$ such that $\zeta(s) \ne 0$ for
$\sigma \ge 1 - c/\log \tau$.
Proof Since $\zeta(s)$ is given by the absolutely convergent product (1.17) for
$\sigma > 1$, it suffices to consider $\sigma \le 1$. From (1.24) we see that
\[
\Bigl| \zeta(s) - \frac{s}{s - 1} \Bigr| \le |s| \int_1^{\infty} u^{-\sigma - 1}\,du = \frac{|s|}{\sigma} \tag{6.5}
\]
for $\sigma > 0$. From this we see that $\zeta(s) \ne 0$ when $\sigma > |s - 1|$, i.e., in the parabolic
region $\sigma > (1 + t^2)/2$. In particular, $\zeta(s) \ne 0$ in the rectangle $8/9 \le \sigma \le 1$,
$|t| \le 7/8$. Now suppose that $\rho_0 = \beta_0 + i\gamma_0$ is a zero of the zeta function with
$5/6 \le \beta_0 \le 1$, $|\gamma_0| \ge 7/8$. Since $\Re \rho \le 1$ for all zeros $\rho$ of $\zeta(s)$, it follows that
$\Re\, 1/(s - \rho) > 0$ whenever $\sigma > 1$. Hence by Lemma 6.4 with $s = 1 + \delta + i\gamma_0$
we see that
\[
-\Re \frac{\zeta'}{\zeta}(1 + \delta + i\gamma_0) \le -\frac{1}{1 + \delta - \beta_0} + c_1 \log(|\gamma_0| + 4).
\]
Similarly, by Lemma 6.4 with $s = 1 + \delta + 2i\gamma_0$ we find that
\[
-\Re \frac{\zeta'}{\zeta}(1 + \delta + 2i\gamma_0) \le c_1 \log(|2\gamma_0| + 4).
\]
From Corollary 1.13 we see that
\[
-\frac{\zeta'}{\zeta}(1 + \delta) = \frac{1}{\delta} + O(1).
\]
On combining these estimates in Lemma 6.5 we conclude that
\[
\frac{3}{\delta} - \frac{4}{1 + \delta - \beta_0} + c_2 \log(|\gamma_0| + 4) \ge 0.
\]
We take $\delta = 1/(2c_2 \log(|\gamma_0| + 4))$. Thus the above gives
\[
7c_2 \log(|\gamma_0| + 4) \ge \frac{4}{1 + \delta - \beta_0},
\]
which is to say that
\[
1 + \frac{1}{2c_2 \log(|\gamma_0| + 4)} - \beta_0 \ge \frac{4}{7c_2 \log(|\gamma_0| + 4)}.
\]
Hence
\[
1 - \beta_0 \ge \frac{1}{14 c_2 \log(|\gamma_0| + 4)},
\]
so the proof is complete.
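The closing arithmetic of the proof can be verified with exact fractions. In the sketch below the inequality is normalized by writing $L = c_2 \log(|\gamma_0| + 4)$ and taking $\delta = 1/(kL)$; the parameter $k$ and the function name `width` are introduced here only for illustration, with $k = 2$ being the choice made in the proof. The last lines also confirm that the rectangle $8/9 \le \sigma \le 1$, $|t| \le 7/8$ lies inside the parabolic region $\sigma > (1 + t^2)/2$.

```python
from fractions import Fraction


def width(k):
    """Lower bound for (1 - beta_0) * L obtained with delta = 1/(k L).

    From 3/delta + L >= 4/(1 + delta - beta_0), i.e.
    (3k + 1) L >= 4/(1 + delta - beta_0), one gets
    1 - beta_0 >= 4/((3k + 1) L) - 1/(k L).
    """
    k = Fraction(k)
    return Fraction(4) / (3 * k + 1) - 1 / k


proof_choice = width(2)     # the choice delta = 1/(2 c_2 log(|gamma_0| + 4))
degenerate = width(1)       # delta = 1/L gives no information

# Worst case of (1 + t^2)/2 on |t| <= 7/8, compared with sigma = 8/9.
parabola_at_edge = (1 + Fraction(7, 8) ** 2) / 2
print(proof_choice, degenerate, parabola_at_edge)
```

The value `width(2)` reproduces the constant $1/14$ in the final display of the proof.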
It is useful to have bounds for the zeta function and its logarithmic derivative
in the zero-free region.
Theorem 6.7 Let $c$ be the constant in Theorem 6.6. If $\sigma > 1 - c/(2 \log \tau)$
and $|t| \ge 7/8$, then
\[
\frac{\zeta'}{\zeta}(s) \ll \log \tau, \tag{6.6}
\]
\[
|\log \zeta(s)| \le \log \log \tau + O(1), \tag{6.7}
\]
and
\[
\frac{1}{\zeta(s)} \ll \log \tau. \tag{6.8}
\]
On the other hand, if $1 - c/(2 \log \tau) < \sigma \le 2$ and $|t| \le 7/8$, then
$\frac{\zeta'}{\zeta}(s) = -1/(s - 1) + O(1)$, $\log(\zeta(s)(s - 1)) \ll 1$, and $1/\zeta(s) \ll |s - 1|$.
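Proof If $\sigma > 1$, then by Corollary 1.11 and the triangle inequality we see that
\[
\Bigl| \frac{\zeta'}{\zeta}(s) \Bigr| \le \sum_{n=1}^{\infty} \Lambda(n) n^{-\sigma} = -\frac{\zeta'}{\zeta}(\sigma) \ll \frac{1}{\sigma - 1}.
\]
Hence (6.6) is obvious if $\sigma \ge 1 + 1/\log \tau$. Let $s_1 = 1 + 1/\log \tau + it$. In particular
we have
\[
\frac{\zeta'}{\zeta}(s_1) \ll \log \tau. \tag{6.9}
\]
From this estimate and Lemma 6.4 we deduce that
\[
\sum_{\rho} \Re \frac{1}{s_1 - \rho} \ll \log \tau \tag{6.10}
\]
where the sum is over those zeros $\rho$ for which $|\rho - (3/2 + it)| \le 5/6$. Suppose
that $1 - c/(2 \log \tau) \le \sigma \le 1 + 1/\log \tau$. Then by Lemma 6.4 we see that
\[
\frac{\zeta'}{\zeta}(s) - \frac{\zeta'}{\zeta}(s_1) = \sum_{\rho} \Bigl( \frac{1}{s - \rho} - \frac{1}{s_1 - \rho} \Bigr) + O(\log \tau). \tag{6.11}
\]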
But by Theorem 1.14 we know that $\zeta(\sigma) < 1 + 1/(\sigma - 1)$, so that (6.7)
holds when $\sigma \ge 1 + 1/\log \tau$. In particular (6.7) holds at the point
$s_1 = 1 + 1/\log \tau + it$, so that to treat the remaining $s$ it suffices to bound the
difference
\[
\log \zeta(s) - \log \zeta(s_1) = \int_{s_1}^{s} \frac{\zeta'}{\zeta}(w)\,dw.
\]
We take the path of integration to be the line segment joining the endpoints.
Then the length of this interval multiplied by the bound (6.6) gives the error
term $O(1)$ in (6.7).
The estimate (6.8) follows directly from (6.7), since $\log 1/|\zeta| = -\Re \log \zeta$.
The remaining estimates follow trivially from (6.5).
The ideas we have used enable us not only to derive a zero-free region but
also to place a bound on the number of zeros ρ that might lie near the point
1 + it.
Theorem 6.8 Let $n(r; t)$ denote the number of zeros $\rho$ of $\zeta(s)$ in the disc
$|\rho - (1 + it)| \le r$. Then $n(r; t) \ll r \log \tau$, uniformly for $r \le 3/4$.
6.1.1 Exercises
1. (a) Show that if $|z| < R$, $|w| \le R$, and $z \ne w$, then
\[
\Bigl| \frac{z \overline{w} - R^2}{(z - w) R} \Bigr| \ge 1.
\]
(b) Show that if $|w| \le \rho < R$, $|z| = r < R$, and $z \ne w$, then
\[
\Bigl| \frac{z \overline{w} - R^2}{(z - w) R} \Bigr| \ge \frac{r\rho + R^2}{(r + \rho) R}.
\]
(c) Suppose that $f$ is analytic in the disc $|z| \le R$. For $r \le R$ put
$M(r) = \max_{|z| \le r} |f(z)|$. Show that if $0 < r < R$ and $0 < \rho < R$, then the number
of zeros of $f$ in the disc $|z| \le \rho$ does not exceed
\[
\frac{\log \dfrac{M(R)}{M(r)}}{\log \dfrac{r\rho + R^2}{(r + \rho) R}}.
\]
2. Suppose that $R$, $M$, and $\varepsilon$ are positive real numbers, and set
$h(z) = 2Mz/(z + R + \varepsilon)$.
(a) Show that $h(0) = 0$, that $h(z)$ is analytic for $|z| < R + \varepsilon$, and that
$\Re h(z) \le M$ for $|z| \le R + \varepsilon$.
(b) Show that if $0 < r < R$, then
\[
\max_{|z| \le r} |h(z)| = -h(-r) = \frac{2Mr}{R + \varepsilon - r}.
\]
(c) Show that if $0 < r < R$, then
\[
\max_{|z| \le r} |h'(z)| = h'(-r) = \frac{2M(R + \varepsilon)}{(R + \varepsilon - r)^2}.
\]
3. Show that, in the situation of the Borel–Carathéodory lemma (Lemma 6.2),
if $|z| \le r < R$, then
\[
|h''(z)| \le \frac{4MR}{(R - r)^3}.
\]
4. (Mertens 1898) Use the Dirichlet series expansion of log ζ (s) to show that
if σ > 1, then
\[
|\zeta(\sigma)^3 \zeta(\sigma + it)^4 \zeta(\sigma + 2it)| \ge 1.
\]
The method used to establish a zero-free region for the zeta function can be
applied to any particular Dirichlet L-function, though the constants involved
may depend on the function. We shall pursue this systematically in Chapter 11,
but in the exercise below we treat one interesting example.
5. Let $\chi_0$ denote the principal character (mod 4), and $\chi_1$ the non-principal
character (mod 4).
(a) Show that $L(1, \chi_1) = \pi/4$, and hence that there is a neighbourhood of
1 in which $L(s, \chi_1) \ne 0$.
(b) Show that if $\sigma > 1$, then
\[
\Re\Bigl( -3 \frac{L'}{L}(\sigma, \chi_0) - 4 \frac{L'}{L}(\sigma + it, \chi_1) - \frac{L'}{L}(\sigma + 2it, \chi_0) \Bigr) \ge 0.
\]
(c) Show that there is a constant $c > 0$ such that $L(s, \chi_1) \ne 0$ for
$\sigma > 1 - c/\log \tau$.
(d) Show that there is a constant $c > 0$ such that if $\sigma > 1 - c/\log \tau$, then
\[
\frac{L'}{L}(s, \chi_1) \ll \log \tau, \qquad
|\log L(s, \chi_1)| \le \log \log \tau + O(1), \qquad
\frac{1}{L(s, \chi_1)} \ll \log \tau.
\]
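6. (a) Show that if $1 < \sigma_1 \le \sigma_2$, then
\[
\frac{\zeta(\sigma_2)}{\zeta(\sigma_1)} \le \Bigl| \frac{\zeta(\sigma_2 + it)}{\zeta(\sigma_1 + it)} \Bigr| \le \frac{\zeta(\sigma_1)}{\zeta(\sigma_2)}
\]
for all real $t$.
(b) Show that if $1 < \sigma_1 \le \sigma_2 \le 2$, then
\[
\frac{\sigma_1 - 1}{\sigma_2 - 1} \ll \Bigl| \frac{\zeta(\sigma_2 + it)}{\zeta(\sigma_1 + it)} \Bigr| \ll \frac{\sigma_2 - 1}{\sigma_1 - 1}
\]
uniformly in $t$.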
7. (Montgomery & Vaughan 2001)
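(a) Show that if $\sigma > 1$, then
\[
\Bigl| \frac{\zeta(\sigma + i(t + 1))}{\zeta(\sigma + it)} \Bigr| \le \exp\Bigl( 2 \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{\sigma} \log n} \Bigl| \sin\bigl(\tfrac{1}{2} \log n\bigr) \Bigr| \Bigr)
\]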
for σ ≥ 1 − θ(t + 1)/3 where the sum is over zeros ρ for which |ρ −
(1 + θ (t + 1) + it)| ≤ 5θ (t + 1)/3.
(b) Show that there is an absolute constant $c > 0$ such that $\zeta(s) \ne 0$ for
\[
\sigma \ge 1 - c\,\frac{\theta(2t + 1)}{\phi(2t + 1)}.
\]
(c) Show that the zero-free region (6.26) follows from the estimate (6.25).
6.2 The Prime Number Theorem
and then use partial summation to derive an estimate for $\pi(x)$. It would be more
direct to apply Perron’s formula to $\log \zeta(s)$, but our approach is technically
simpler since $\log \zeta(s)$ has a logarithmic singularity at $s = 1$ while $\frac{\zeta'}{\zeta}(s)$ has
only a simple pole there.
since we may suppose that $0 < c < 1$. Thus the proof of (6.12) is complete.
To derive (6.13) it suffices to combine (6.12) with the first estimate of Corollary 2.5.
As for (6.14), we note that
\[
\pi(x) = \int_{2^-}^{x} \frac{1}{\log u}\,d\vartheta(u) = \operatorname{li}(x) + \int_{2^-}^{x} \frac{1}{\log u}\,d(\vartheta(u) - u).
\]
By integrating by parts we see that this last integral is
\[
\frac{\vartheta(u) - u}{\log u} \bigg|_{2^-}^{x} + \int_{2}^{x} \frac{\vartheta(u) - u}{u (\log u)^2}\,du,
\]
and by (6.13) it follows that this is $\ll x \exp(-c\sqrt{\log x})$. Thus we have (6.14),
and the proof is complete.
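To get a feeling for how closely $\operatorname{li}(x)$ tracks $\pi(x)$, one can compare the two numerically. The sketch below takes $\operatorname{li}(x) = \int_2^x du/\log u$, as the display above suggests, counts primes with a sieve of Eratosthenes, and approximates the integral by the composite Simpson rule. The function names, the cutoff $x = 10^5$ and the number of Simpson steps are illustrative assumptions, and a computation of this kind is of course no substitute for the estimate (6.14).

```python
import math


def prime_count(limit):
    """Number of primes <= limit, via a sieve of Eratosthenes (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # strike out the multiples p*p, p*p + p, ... of the prime p
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(sieve)


def li(x, steps=100_000):
    """Composite Simpson approximation to the integral of 1/log(u) over [2, x].

    The number of steps must be even.
    """
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / math.log(2 + k * h)
    return total * h / 3


x = 10**5
pi_x = prime_count(x)
li_x = li(x)
print(pi_x, round(li_x, 2), round(li_x - pi_x, 2))
```

At this height the difference $\operatorname{li}(x) - \pi(x)$ is a few dozen, far smaller than either quantity.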
The method we used to derive Theorem 6.9 is very flexible, and can be
applied to many other situations. For example, the summatory function
\[
M(x) = \sum_{n \le x} \mu(n)
\]
6.2.1 Exercises
1. (Landau 1901b; cf. Rosser & Schoenfeld 1962) Use Theorem 6.9 to show
that
\[
\pi(2x) - 2\pi(x) = -2(\log 2)\,x (\log x)^{-2} + O\bigl(x (\log x)^{-3}\bigr).
\]
Deduce that for all large $x$, the interval $(x, 2x]$ contains fewer prime numbers
than the interval $(0, x]$.
2. Use Theorem 6.9 to show that if $n$ is of the form $n = \prod_{p \le y} p$ where $y$ is
sufficiently large, then $d(n) > n^{(\log 2)/\log \log n}$.
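3. (a) Use Theorem 6.9 to show that
\[
\sum_{x < p \le y} \frac{1}{p} = \log \frac{\log y}{\log x} + O\bigl(\exp(-c\sqrt{\log x})\bigr).
\]
(b) Use the above and Theorem 2.7 to show that
\[
\sum_{p \le x} \frac{1}{p} = \log \log x + b + O\bigl(\exp(-c\sqrt{\log x})\bigr)
\]
where $b = C_0 - \sum_p \sum_{k=2}^{\infty} 1/(k p^k)$.
4. Show that for $x \ge 2$,
\[
\sum_{n \le x} \frac{\Lambda(n)}{n} = \log x - C_0 + O\bigl(\exp(-c\sqrt{\log x})\bigr).
\]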
5. (cf. Cipolla 1902; Rosser 1939) Let $p_1 < p_2 < \cdots$ denote the prime numbers.
Show that
\[
p_n = n \Bigl( \log n + \log \log n - 1 + \frac{\log \log n}{\log n} - \frac{2}{\log n} + O\Bigl( \frac{(\log \log n)^2}{(\log n)^2} \Bigr) \Bigr).
\]
6. (Landau 1900) Let πk (x) denote the number of integers not exceeding x
that are composed of exactly k distinct primes.
(a) Show that
\[ \pi_2(x) = \sum_{p \le \sqrt{x}} \pi(x/p) + O\big(x(\log x)^{-2}\big). \]
(c) Using Theorem 6.9 and integration by parts, show that the sum above
is
\[ x\int_{2}^{\sqrt{x}} \frac{du}{u\,\log(x/u)\,\log u} + O(x/\log x). \]
(d) Conclude that π2 (x) = x(log log x)/ log x + O(x/ log x).
7. (D. E. Knutson) Let dn denote the least common multiple of the numbers
1, 2, . . . , n.
(a) Show that dn = exp(ψ(n)).
(b) Let $E(z) = \sum_{n=1}^{\infty} z^n/d_n$. Show that this power series has radius of
convergence e.
(c) Show that E(1) is irrational.
8. (Landau 1905) Let Q(x) denote the number of square-free integers not
exceeding x, and define R(x) by the relation Q(x) = (6/π 2 )x + R(x).
(a) Show that
\[ R(x) = M(y)\{x/y^2\} - \sum_{d \le y} \mu(d)\{x/d^2\} + \sum_{m \le x/y^2} M\big(\sqrt{x/m}\big) - 2x\int_{y}^{\infty} M(u)u^{-3}\,du. \]
(b) Taking $y = x^{1/2}\exp(-c\sqrt{\log x})$ where c is sufficiently small, show
that $R(x) \ll x^{1/2}\exp(-c\sqrt{\log x})$.
9. Let $N = N(Q) = 1 + \sum_{q \le Q} \varphi(q)$ be the number of Farey points of order
Q, and for 0 ≤ α ≤ 1 write
for 0 ≤ α ≤ 1.
(d) Show that R Q uniformly for 0 ≤ α ≤ 1.
10. (Landau 1903b; Massias, Nicolas & Robin 1988, 1989) Let f (n) denote
the maximal order of any element of the symmetric group Sn .
(a) Show that $f(n) = \max \operatorname{lcm}(n_1, n_2, \ldots, n_k)$ where the maximum is
extended over all sets $\{n_1, n_2, \ldots, n_k\}$ of natural numbers for which
$n_1 + n_2 + \cdots + n_k \le n$.
(b) Choose y as large as possible so that $\sum_{p \le y} p \le n$. Show that
\[ \log f(n) \ge \sum_{p \le y} \log p = (1 + o(1))(n\log n)^{1/2}. \]
16. (Landau 1899b, 1901a, 1903c) Use the method of proof of Theorem 6.9 to
show that
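\[ \text{(a)}\quad \sum_{n=1}^{\infty} \frac{\mu(n)\log n}{n} = -1; \]
\[ \text{(b)}\quad \sum_{n=1}^{\infty} \frac{\mu(n)(\log n)^2}{n} = -2C_0; \]
\[ \text{(c)}\quad \sum_{n=1}^{\infty} \frac{\lambda(n)\log n}{n} = -\zeta(2). \]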
17. Taking (6.18) and a quantitative form of the first part of the preceding
exercise for granted, use elementary reasoning to show that if q ≤ x then
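\[ \text{(a)}\quad \sum_{\substack{n \le x \\ (n,q)=1}} \frac{\mu(n)}{n} \ll \exp\big(-c\sqrt{\log x}\big), \]
\[ \text{(b)}\quad \sum_{\substack{n \le x \\ (n,q)=1}} \frac{\mu(n)\log n}{n} = -\frac{q}{\varphi(q)} + O\big(\exp(-c\sqrt{\log x})\big). \]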
18. (Hardy 1921) Use the method of proof of Theorem 6.9 to show that
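\[ \text{(a)}\quad \sum_{n=1}^{\infty} \frac{\mu(n)}{\varphi(n)} = 0; \]
\[ \text{(b)}\quad \sum_{n=1}^{\infty} \frac{\mu(n)\log n}{\varphi(n)} = 0; \]
\[ \text{(c)}\quad \sum_{n=1}^{\infty} \frac{\mu(n)(\log n)^2}{\varphi(n)} = 4A\log 2 \]
where $A = \prod_{p>2}\big(1 - \frac{1}{(p-1)^2}\big)$.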
19. Let Q(x) denote the number of square-free integers not exceeding x, and
recall Theorem 2.2.
(a) Show that
\[ Q(x) = \frac{6}{\pi^2}x - x\sum_{n > \sqrt{x}} \frac{\mu(n)}{n^2} - \sum_{n \le \sqrt{x}} \mu(n)\{x/n^2\} \]
as x → ∞.
(d) Use the estimate $\sum_{d \le y} \mu(d)/d \ll (\log 2y)^{-2}$ to show that
\[ \int_{1}^{x} \sum_{d \le x/v} \frac{\mu(d)}{d}\,\frac{dv}{v} \ll 1. \]
(e) Mimic the proof of Theorem 5.5, or use Exercise 5.2.6 to show that if
(i) holds, then
\[ \lim_{x \to \infty} \sum_{n \le x} \frac{f(n)}{n}\Big(1 - \frac{n}{x}\Big) = c. \]
(f) Use Theorem 5.6 to show that if (i) holds and f (n) = O(1), then (ii)
follows.
(g) Take f(n) = µ(n) to deduce that $\sum_{n=1}^{\infty} \mu(n)/n = 0$. (Of course we
used much more above in (d). For a result in the converse direction, see
Exercise 8.1.5.)
21. (Landau 1908b) Let R be the set of positive integers that can be expressed
as a sum of two squares, let R(x) denote the number of such integers not
exceeding x, and let χ1 denote the non-principal character (mod 4), as in
Exercise 6.1.5.
(a) Show that
\[ \sum_{n \in R} n^{-s} = (1 - 2^{-s})^{-1} \prod_{p \equiv 1\,(4)} (1 - p^{-s})^{-1} \prod_{p \equiv 3\,(4)} (1 - p^{-2s})^{-1} \]
for σ > 1.
(b) Show that the Dirichlet series above is $f(s)\sqrt{\zeta(s)L(s, \chi_1)}$ where
\[ f(s) = (1 - 2^{-s})^{-1/2} \prod_{p \equiv 3\,(4)} (1 - p^{-2s})^{-1/2} \]
where
\[ g(s) = \frac{f(s)}{s}\sqrt{(s-1)\zeta(s)L(s, \chi_1)} \]
is analytic in a neighbourhood of 1.
(f) Show that
\[ g(1) = \sqrt{\frac{\pi}{2}} \prod_{p \equiv 3\,(4)} (1 - p^{-2})^{-1/2}. \]
\[ b = 2^{-1/2} \prod_{p \equiv 3\,(4)} (1 - p^{-2})^{-1/2}. \]
22. Let A denote the set of those positive integers that are composed entirely
of the prime 2 and primes ≡ 1 (mod 4), and let B be the set of those
positive integers that are composed entirely of primes ≡ 3 (mod 4).
(a) Explain why any positive integer n has a unique representation in the
form n = a(n)b(n) where a(n) ∈ A and b(n) ∈ B.
(b) Let A(x) denote the number of a ∈ A, a ≤ x. Show that
\[ A(x) = \frac{\alpha x}{\sqrt{\log x}} + O\Big(\frac{x}{(\log x)^{3/2}}\Big) \]
where $\alpha = 1/\sqrt{2}$.
(c) Let B(x) denote the number of b ∈ B, b ≤ x. Show that
\[ B(x) = \frac{\beta x}{\sqrt{\log x}} + O\Big(\frac{x}{(\log x)^{3/2}}\Big) \]
where $\beta = \sqrt{2}/\pi$.
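(d) For 0 ≤ κ ≤ 1 let $N_\kappa(x)$ denote the number of n ≤ x such that $a(n) \le n^{\kappa}$. Show that
\[ N_\kappa(x) = \sum_{\substack{a \le x^{\kappa} \\ a \in A}} \; \sum_{\substack{a^{1/\kappa - 1} \le b \le x/a \\ b \in B}} 1. \]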
(a) Show that the least term in the sum above occurs when k = [log x] + 1.
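(b) Show that if $x \ge e^{K}$, then
\[ \operatorname{Li}(x) = x\sum_{k=1}^{K} \frac{(k-1)!}{(\log x)^k} + \operatorname{Li}(e) + \sum_{k=1}^{K-1}\Big(k!\int_{e^k}^{e^{k+1}} \frac{dt}{(\log t)^{k+1}} - \frac{(k-1)!\,e^k}{k^k}\Big) - \frac{(K-1)!\,e^K}{K^K} + K!\int_{e^K}^{x} \frac{dt}{(\log t)^{K+1}}. \]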
(c) Define R(x) by the relation
\[ \operatorname{Li}(x) = x\sum_{k=1}^{[\log x]} \frac{(k-1)!}{(\log x)^k} + R(x). \]
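Exercise 7(a) above asserts that $d_n = \operatorname{lcm}(1, 2, \ldots, n) = \exp(\psi(n))$. Since $\exp(\psi(n))$ is the product of p over all prime powers $p^k \le n$, the identity can be checked in exact integer arithmetic. The following Python sketch does this for small n; the function names are hypothetical, and the check is an illustration rather than a proof.

```python
import math

def lcm_upto(n):
    """d_n = lcm(1, 2, ..., n), computed directly."""
    d = 1
    for k in range(1, n + 1):
        d = d * k // math.gcd(d, k)
    return d

def exp_psi(n):
    """exp(psi(n)) as an exact integer: one factor p per prime power p^k <= n."""
    out = 1
    for p in range(2, n + 1):
        # trial division is enough for the small n used here
        if all(p % q for q in range(2, math.isqrt(p) + 1)):
            pk = p
            while pk <= n:
                out *= p
                pk *= p
    return out

print(lcm_upto(10), exp_psi(10))  # both equal 2520
```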
6.3 Notes
Section 6.1. Jensen (1899) proved that if f satisfies the hypotheses of
Lemma 6.1, then
\[ |f(0)| \prod_{k=1}^{n} \frac{R}{|z_k|} = \exp\Big(\frac{1}{2\pi}\int_{0}^{2\pi} \log|f(Re^{i\theta})|\,d\theta\Big) \]
where z 1 , . . . , z n are the zeros of f in the disc |z| ≤ R. Here the right-hand side
may be regarded as being the geometric mean of | f (z)| for z on the circle |z| =
R. Each factor of the product above is ≥ 1, and if |z k | ≤ r , then R/|z k | ≥ R/r .
Thus Lemma 6.1 follows easily from the above. The products used in the proofs
of Lemmas 6.1 and 6.3 are known as Blaschke products. Their use (usually with
infinitely many factors) is an important tool of complex analysis. Lemma 6.2 is
due to Borel (1897); it refines an earlier estimate of Hadamard. Carathéodory’s
contributions on this subject are recounted by Landau (1906; Section 4).
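Jensen's formula as quoted above can be checked numerically for a polynomial. The Python sketch below is only an illustration, under the assumptions that f is a polynomial given by its zeros, that f(0) ≠ 0, and that no zero lies on the circle |z| = R; the function name and the sample zeros are hypothetical. The mean of log |f| over the circle is approximated by an equally spaced sum, which converges rapidly for a smooth periodic integrand.

```python
import cmath
import math

def jensen_sides(zeros, R, n_points=4096):
    """Return both sides of Jensen's formula for f(z) = prod (z - a)."""
    def f(z):
        out = 1.0 + 0j
        for a in zeros:
            out *= (z - a)
        return out

    # Left side: |f(0)| times the product of R/|z_k| over zeros in |z| <= R.
    lhs = abs(f(0))
    for a in zeros:
        if abs(a) <= R:
            lhs *= R / abs(a)

    # Right side: exp of the mean of log|f| over the circle |z| = R.
    mean_log = sum(
        math.log(abs(f(R * cmath.exp(2j * math.pi * k / n_points))))
        for k in range(n_points)
    ) / n_points
    return lhs, math.exp(mean_log)

lhs, rhs = jensen_sides([0.5, -0.3j, 2.0], 1.0)
print(lhs, rhs)  # the two values agree to high precision
```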
Lemma 6.4 is implicit in Landau (1909, p. 372), and may have been known
earlier. It can also be easily derived from the identity (10.29) that arises by
applying Hadamard’s theory of entire functions to the zeta function.
The Prime Number Theorem was first proved, in the qualitative form π (x) ∼
x/ log x, independently by Hadamard (1896) and de la Vallée Poussin (1896).
In these papers, it was shown that ζ(1 + it) ≠ 0, but no specific zero-free region
was established. The first proof that ζ(1 + it) ≠ 0 given by de la Vallée Poussin
was rather complicated, but later in his long paper he gave a second proof
depending on the inequality 1 − cos 2θ ≤ 4(1 + cosθ ). This is equivalent to the
non-negativity of the cosine polynomial 3 + 4 cos θ + cos 2θ , which Mertens
(1898) used to obtain the result of Exercise 6.4. Our Lemma 6.5 is derived by
the same method. The classical zero-free region of Theorem 6.6 was established
first by de la Vallée Poussin (1899). The estimates (6.6) and (6.8) of Theorem 6.7
were first proved by Gronwall (1913).
Wider zero-free regions have been established by using exponential sum es-
timates to obtain better upper bounds for |ζ (s)| when σ is near 1 . The first such
improvement was derived by Hardy & Littlewood. Their paper on this was never
published, but accounts of their approach have been given by Landau (1924b)
and Titchmarsh (1986, Chapter 5). Littlewood (1922) announced that from
these estimates he had deduced that ζ(s) ≠ 0 for σ ≥ 1 − c(log log τ)/log τ.
As explained by Ingham (1932, p. 66), Littlewood never published his com-
plicated proof, because the simpler method of Landau (1924a) had become
available.
In 1935, Vinogradov introduced a new method for estimating Weyl sums. A
Weyl sum is a sum of the form $\sum_{n=1}^{N} e(f(n))$ where f ∈ R[x]. The quality of
Vinogradov’s estimate depends on rational approximations to the coefficients
of f , and on the degree of f . The function f (x) = t log x is not a polynomial,
but by approximating to it by polynomials one can make Vinogradov’s method
apply. This was first done by Chudakov (1936 a, b, c), who derived estimates
for ζ(s) for σ near 1 that allowed him to deduce that ζ(s) ≠ 0 for
\[ \sigma > 1 - c(\log\tau)^{-a} \tag{6.24} \]
for a > 10/11. Vinogradov (1936b) gave stronger exponential sum estimates,
which Titchmarsh (1938) used to obtain a zero-free region of the above form for
a > 4/5. Hua (1949) introduced a further refinement of Vinogradov’s method,
from which Titchmarsh (1951, Chapter 6) and Tatuzawa (1952) derived the
zero-free region
\[ \sigma > 1 - c(\log\tau)^{-3/4}(\log\log\tau)^{-3/4}. \]
By refining the passage from Weyl sums to the zeta function, Korobov (1958a)
obtained (6.24) for a > 5/7, and then Korobov (1958b, c) and Vinogradov
(1958) obtained a > 2/3. In fact, Vinogradov claimed that one can take a =
2/3, but this seems to be still out of reach. Richert’s polished exposition of
Vinogradov’s method is reproduced in Walfisz (1963). Other expositions have
since been given by Karatsuba & Voronin (1992, Chapter 4), Montgomery
(1994, Chapter 4), and Vaughan (1997). Richert (1967) used Vinogradov’s
where b = 1/(1 + a). Similarly, from the zero-free region (6.26) it follows that
\[ \pi(x) = \operatorname{li}(x) + O\Big(x\exp\big(-c(\log x)^{3/5}(\log\log x)^{-1/5}\big)\Big). \tag{6.28} \]
Turán (1950) used his method of power sums to show conversely that (6.27)
implies (6.24). More general converse theorems have since been established by
Stás (1961) and Pintz (1980, 1983, 1984). A similar converse theorem in which
an upper bound for $M(x) = \sum_{n \le x} \mu(n)$ is used to produce a zero-free region
has been given by Allison (1970).
That M(x) = o(x) was first proved by von Mangoldt (1897). The quantitative
estimate (6.17) is due to Landau (1908a). The relation (6.19), asserted by Euler
(1748; Chapter 15, no. 277), was first proved by von Mangoldt (1897). Landau
(1899a) and de la Vallée Poussin (1899) shortly gave simpler proofs.
6.4 References
Allison, D. (1970). On obtaining zero-free regions for the zeta-function from estimates
of M(x), Proc. Cambridge Philos. Soc. 67, 333–337.
Borel, E. (1897). Sur les zéros des fonctions entières, Acta Math. 20, 357–396.
Chudakov, N. G. (1936a). Sur les zéros de la fonction ζ (s), C. R. Acad. Sci. Paris 202,
191–193.
(1936b). On zeros of the function ζ (s), Dokl. Akad. Nauk SSSR 1, 201–204.
(1936c). On zeros of Dirichlet’s L-functions, Mat. Sb. (1) 43, 591–602.
(1937). On Weyl’s sums, Mat. Sb. (2) 44, 17–35.
(1938). On the functions ζ (s) and π(x), Dokl. Akad. Nauk SSSR 21, 421–422.
Cipolla, M. (1902). La determinazione assintotica dell’$n^{\text{imo}}$ numero primo, Rend. Accad.
Sci. Fis-Mat. Napoli (3) 8, 132–166.
Euler, L. (1748). Introductio in analysin infinitorum, I, Lausanne; Opera omnia Ser 1,
Vol. 8, Teubner, 1922.
Gronwall, T. H. (1913). Sur la fonction ζ (s) de Riemann au voisinage de σ = 1, Rend.
Mat. Cir. Palermo 35, 95–102.
Hadamard, J. (1896). Sur la distribution des zéros de la fonction ζ (s) et ses conséquences
arithmétiques, Bull. Soc. Math. France 24, 199–220.
Hardy, G. H. (1921). Note on Ramanujan’s trigonometrical function cq (n), and certain
series of arithmetical functions, Proc. Cambridge Philos. Soc. 20, 263–271.
Hecke, E. (1917). Über die Zetafunktion beliebiger algebraischer Zahlkörper, Nachr.
Akad. Wiss. Göttingen, 77–89; Mathematische Werke, Göttingen: Vandenhoeck &
Ruprecht, 1959, pp. 159–171.
Hua, L. K. (1949). An improvement of Vinogradov’s mean-value theorem and several
applications, Quart. J. Math. Oxford Ser. 20, 48–61.
Ingham, A. E. (1932). The Distribution of Prime Numbers, Cambridge Tracts Math. 30.
Cambridge: Cambridge University Press.
(1945). Some Tauberian theorems connected with the Prime Number Theorem, J.
London Math. Soc. 20, 171–180.
Jensen, J. L. W. V. (1899). Sur un nouvel et important théorème de la théorie des
fonctions, Acta Math. 22, 359–364.
Karatsuba, A. A. & Voronin, S. M. (1992). The Riemann Zeta-function. Berlin: de
Gruyter.
Korobov, N. M. (1958a). On the zeros of the function ζ (s), Dokl. Akad. Nauk SSSR 118,
231–232.
(1958b). Weyl’s estimates of sums and the distribution of primes, Dokl. Akad. Nauk
SSSR 123, 28–31.
(1958c). Evaluation of trigonometric sums and their applications, Usp. Mat. Nauk 13,
no. 4, 185–192.
Landau, E. (1899a). Neuer Beweis der Gleichung $\sum_{k=1}^{\infty} \frac{\mu(k)}{k} = 0$, Inaugural Dissertation,
Berlin; Collected Works, Vol. 1. Essen: Thales Verlag, pp. 69–83.
(1947). The Method of Trigonometrical Sums in the Theory of Numbers, Trav. Inst.
Math. Stecklov 23; English translation, London: Interscience Publishers, 1954.
(1958). A new evaluation of ζ (1 + it), Izv. Akad. Nauk SSSR 22, 161–164.
Walfisz, A. (1963). Weylsche Exponentialsummen in der neueren Zahlentheorie, Math.
Forschungsberichte 15. Berlin: Deutscher Verlag Wiss.
7
Applications of the Prime Number Theorem
We now use the Prime Number Theorem, and other estimates obtained by similar
methods, to estimate the number of integers whose multiplicative structure is
of a specified type.
By the estimates of Chebyshev and Mertens (Corollary 2.6 and Theorem 2.7(d)),
this is
\[ = x\Big(1 - \log\frac{\log x}{\log y}\Big) + O\Big(\frac{x}{\log x}\Big). \]
Thus if we take u = (log x)/(log y), so that $y = x^{1/u}$, then we see that
\[ \psi\big(x, x^{1/u}\big) = (1 - \log u)x + O\Big(\frac{x}{\log x}\Big) \tag{7.2} \]
for u ≥ 1.
As might be surmised from Figure 7.1, ρ(u) is positive and decreasing. To
prove this, let u 0 be the infimum of the set of all solutions of the equation
ρ(u) = 0. By the continuity of ρ it follows that ρ(u 0 ) = 0. But ρ(u) > 0 for
7.1 Numbers composed of small primes 201
We now use elementary reasoning to show that (7.3) holds uniformly for u
in bounded intervals.
Theorem 7.2 (Dickman) Let ψ(x, y) be the number of positive integers not
exceeding x composed entirely of prime numbers not exceeding y, and let ρ(u)
be defined as above. Then for any U ≥ 0 we have
\[ \psi\big(x, x^{1/u}\big) = \rho(u)x + O\Big(\frac{x}{\log x}\Big) \tag{7.9} \]
uniformly for 0 ≤ u ≤ U and all x ≥ 2.
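To get a feeling for the size of the error term in Theorem 7.2, one can count smooth numbers directly and compare with ρ(u) in the range 1 ≤ u ≤ 2, where (7.2) gives ρ(u) = 1 − log u. The Python sketch below does this with a largest-prime-factor sieve; the function names are hypothetical, and the comparison is a rough illustration only, since the implied constant in the error term is not specified.

```python
import math

def largest_prime_factors(x):
    """lpf[n] = largest prime factor of n for 2 <= n <= x (lpf[0] = lpf[1] = 0)."""
    lpf = [0] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 0:                      # p is prime
            for m in range(p, x + 1, p):
                lpf[m] = p                   # larger primes overwrite smaller ones
    return lpf

def smooth_count(x, y, lpf):
    """psi(x, y): number of n <= x all of whose prime factors are <= y."""
    return 1 + sum(1 for n in range(2, x + 1) if lpf[n] <= y)

def compare_with_rho(x, u, lpf):
    """Return (psi(x, x^(1/u)) / x, 1 - log u), valid for 1 <= u <= 2."""
    y = x ** (1.0 / u)
    return smooth_count(x, y, lpf) / x, 1.0 - math.log(u)

lpf = largest_prime_factors(10**5)
for u in (1.25, 1.5, 2.0):
    print(u, compare_with_rho(10**5, u, lpf))
```

At x = 10^5 the two columns differ by a few hundredths, which is consistent with an error of order x/log x.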
Here the first term on the right reflects the fact that if x ≥ 1, then ψ(x, y)
counts the number n = 1 for which P(1) is undefined. In the sum on the right,
the summand is ψ(x/ p, p), and hence we see that
\[ \psi(x, y) = 1 + \sum_{p \le y} \psi(x/p, p). \tag{7.11} \]
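The recursion (7.11) translates directly into a short program. The Python sketch below (with hypothetical function names) evaluates ψ(x, y) recursively, using the fact that ψ(x/p, p) depends only on the integer part of x/p, and compares the result with a brute-force count.

```python
def primes_upto(n):
    """Primes p <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def psi_recursive(x, y, primes):
    """psi(x, y) via psi(x, y) = 1 + sum over p <= y of psi(x/p, p)."""
    if x < 1:
        return 0
    total = 1                                # the term counting n = 1
    for p in primes:
        if p > y or p > x:
            break
        total += psi_recursive(x // p, p, primes)   # n with P(n) = p
    return total

def psi_bruteforce(x, y, primes):
    """psi(x, y) by stripping all primes <= y from each n <= x."""
    count = 0
    for n in range(1, x + 1):
        m = n
        for p in primes:
            if p > y:
                break
            while m % p == 0:
                m //= p
        count += (m == 1)
    return count

primes = primes_upto(100)
print(psi_recursive(100, 5, primes), psi_bruteforce(100, 5, primes))  # 34 34
```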
Let $s(w) = \sum_{p \le w} 1/p$, and write Mertens’ estimate (Theorem 2.7(d)) in the
form s(w) = log log w + c + r(w). Then the sum in the main term above is
\[ \int_{y}^{z} \rho\Big(\frac{\log x}{\log w} - 1\Big)\,ds(w) = \int_{y}^{z} \rho\Big(\frac{\log x}{\log w} - 1\Big)\,d\log\log w + \int_{y}^{z} \rho\Big(\frac{\log x}{\log w} - 1\Big)\,dr(w). \tag{7.14} \]
We put t = (log x)/(log w). Since
\[ d\log\log w = \frac{dw}{w\log w} = -\frac{dt}{t}, \]
the first integral on the right-hand side of (7.14) is
\[ \int_{U}^{u} \rho(t-1)\,\frac{dt}{t}. \tag{7.15} \]
By integrating by parts and the estimate $r(w) \ll 1/\log w$ we see that the second
integral on the right-hand side of (7.14) is
\[ \rho\Big(\frac{\log x}{\log w} - 1\Big)r(w)\bigg|_{y}^{z} - \int_{y}^{z} r(w)\,d\rho\Big(\frac{\log x}{\log w} - 1\Big) \ll \frac{1}{\log x}\Big(1 + \int_{y}^{z} 1\,\Big|d\rho\Big(\frac{\log x}{\log w} - 1\Big)\Big|\Big) \ll \frac{1}{\log x} \]
since ρ is monotonic and bounded. By Mertens’ estimate (Theorem 2.7(d)) we
also see that the error term in (7.13) is
\[ \ll \frac{x}{\log x}\sum_{y < p \le z} \frac{1}{p} \ll \frac{x}{\log x} \]
since log log z = log log y + O(1). On combining our estimates in (7.12) we
find that
\[ \psi\big(x, x^{1/u}\big) = x\Big(\rho(U) - \int_{U}^{u} \rho(t-1)\,\frac{dt}{t}\Big) + O\Big(\frac{x}{\log x}\Big). \]
Thus by (7.6) we have the desired estimate for U ≤ u ≤ U + 1, and the proof
is complete.
Rankin used this chain of inequalities to derive an upper bound for ψ(x, y).
This approach is fruitful in a variety of settings, and has become known as
‘Rankin’s method’.
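For illustration, assume that the chain of inequalities in question is the standard one, $\psi(x, y) \le \sum_{p \mid n \Rightarrow p \le y} (x/n)^{\sigma} = x^{\sigma}\prod_{p \le y}(1 - p^{-\sigma})^{-1}$ for σ > 0. The Python sketch below (hypothetical function names) evaluates this bound on a grid of σ and compares the best value found with the exact count. It is meant only to show how the bound behaves as σ varies, not to reproduce the optimization carried out in the text.

```python
import math

def primes_upto(n):
    """Primes p <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]

def smooth_count(x, y):
    """Exact psi(x, y) by stripping all primes <= y from each n <= x."""
    ps = primes_upto(y)
    count = 0
    for n in range(1, x + 1):
        m = n
        for p in ps:
            while m % p == 0:
                m //= p
        count += (m == 1)
    return count

def rankin_bound(x, y, sigma):
    """x^sigma * prod over p <= y of (1 - p^(-sigma))^(-1), for sigma > 0."""
    prod = 1.0
    for p in primes_upto(y):
        prod /= 1.0 - p ** (-sigma)
    return x ** sigma * prod

def best_rankin_bound(x, y, steps=200):
    """Smallest bound over sigma = 1/steps, 2/steps, ..., 1; returns (bound, sigma)."""
    return min((rankin_bound(x, y, k / steps), k / steps)
               for k in range(1, steps + 1))

exact = smooth_count(10**4, 10)
bound, sigma = best_rankin_bound(10**4, 10)
print(exact, bound, sigma)
```

Every value of σ gives a valid upper bound, so a coarse grid already suffices to see which range of σ is useful for given x and y.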
To use the above, we must establish an upper bound for the product on the
right-hand side. The size of this product is a little difficult to describe, because its
behaviour depends on the size of σ. If σ is near 0, then most of the factors are
approximately $(1 - y^{-\sigma})^{-1}$, and hence we expect the product to be approximately
$(1 - y^{-\sigma})^{-y/\log y}$. If σ is larger (but still < 1), then the general factor is
approximately $\exp(p^{-\sigma})$, and hence the product is approximately the exponential of
\[ \sum_{p \le y} p^{-\sigma} \sim \int_{2}^{y} \frac{dt}{t^{\sigma}\log t} \sim \frac{y^{1-\sigma}}{(1-\sigma)\log y}. \]
We begin by making these relations precise.
Lemma 7.3 If 0 ≤ σ ≤ 1, then
\[ \sum_{p \le y} p^{-\sigma} = \int_{2}^{y} \frac{du}{u^{\sigma}\log u} + O\big(y^{1-\sigma}\exp(-c\sqrt{\log y})\big) + O(1). \tag{7.18} \]
that
\[ I_2 = \frac{y^{1-\sigma}}{(1-\sigma)\log y} - \frac{v^{1-\sigma}}{(1-\sigma)\log v} + \frac{1}{1-\sigma}\int_{v}^{y} \frac{du}{u^{\sigma}(\log u)^2}. \]
Here the first term on the right is one of the main terms in (7.22), and the second
term is O(1). Let J denote the integral on the right. To complete the proof it
suffices to show that
\[ J \ll \frac{y^{1-\sigma}}{(1-\sigma)(\log y)^2}. \tag{7.23} \]
To this end we integrate by parts again:
\[ J = \frac{y^{1-\sigma}}{(1-\sigma)(\log y)^2} - \frac{v^{1-\sigma}}{(1-\sigma)(\log v)^2} + \frac{2}{1-\sigma}\int_{v}^{y} \frac{dw}{w^{\sigma}(\log w)^3}. \]
Here the second term on the right-hand side is $e^{4}2^{-4}(1-\sigma) \ll 1 - \sigma$, while
the first term on the right-hand side is larger. As for the integral on the right, we
observe that if w ≥ v, then (log w)3 ≥ 4(log w)2 /(1 − σ ). Hence the last term
on the right above has absolute value not exceeding J/2. Thus we have (7.23),
and the proof is complete.
Lemma 7.5 Suppose that y ≥ 2. If $\max\big(2/\log y,\ 1 - 4/\log y\big) \le \sigma \le 1$,
then
\[ \prod_{p \le y} \big(1 - p^{-\sigma}\big)^{-1} \ll \log y. \tag{7.24} \]
Now (7.24) follows at once from Lemma 7.4 when σ ≥ 2/3. Thus it remains
to establish (7.25). The sum in the error term above is $\ll 1$ for σ > 5/8. If
3/8 ≤ σ ≤ 5/8, then by Lemma 7.4 it is $\ll y^{1/4}/\log y$. If 2/log y ≤ σ ≤ 3/8,
then by Lemma 7.4 the sum is $\ll y^{1-2\sigma}/\log y$. Thus in any case this error term
is majorized by the error terms on the right-hand side of (7.25). By Lemma 7.4,
the main term is
\[ \sum_{v < p \le y} p^{-\sigma} = \frac{y^{1-\sigma}}{(1-\sigma)\log y} + \log\frac{1}{1-\sigma} + O\Big(\frac{y^{1-\sigma}}{(1-\sigma)^2(\log y)^2}\Big) + O\Big(\frac{v}{\log v}\Big). \]
Since 2/log y ≤ σ ≤ 1 − 4/log y, y satisfies $y \ge e^{6}$, and σ(1 − σ) log y ≥
2(1 − 2/log y) ≥ 4/3. Hence $(y^{1-\sigma})^{3/4} \ge v$ and the second error term above
is dominated by the first.
It remains to consider the contribution of the primes p ≤ v. If σ > 1/3, then
the contribution of these primes is $\ll 1$, so we may suppose that 2/log y ≤
σ ≤ 1/3. In this range
\[ 1 - p^{-\sigma} \gg \sigma\log p = \frac{\log p}{\log v}. \]
Since
\[ \sum_{p \le v} \log\Big(C\,\frac{\log v}{\log p}\Big) \ll v, \]
it follows that
\[ \prod_{p \le v} (1 - p^{-\sigma})^{-1} < \exp(Cv) = \exp\big(Ce^{1/\sigma}\big) \le \exp\big(Cy^{1/2}\big), \]
We now bound ψ(x, y) by combining Lemma 7.5 with the inequalities (7.17).
Theorem 7.6 If $y = x^{1/u}$ and $\log x \le y \le x^{1/9}$, then
\[ \psi(x, y) < x(\log y)\exp\Big(-u\log u - u\log\log u + u - \frac{u\log\log u}{\log u} + O\Big(\frac{u}{\log u}\Big) + O\Big(\frac{u^2\log u}{y}\Big)\Big). \]
Here the first error term is larger than the second if y ≥ (log x) log log x,
while if y is smaller, then the second error term dominates.
Proof We first note that we may suppose that y ≥ 9 log x, since the bound for
smaller y follows by taking y = 9 log x. To motivate the choice of σ in (7.17)
we note that the expression to be minimized is approximately
\[ x^{\sigma}\exp\Big(\int_{2}^{y} \frac{u^{-\sigma}}{\log u}\,du\Big). \]
Lemma 7.7 Let A(r, k) denote the number of solutions of the inequality $a_1 + a_2 + \cdots + a_r \le k$ in non-negative integers $a_i$. Then $A(r, k) = \binom{r+k}{k}$.
Analytic Proof Let $a_{r+1} = k - \sum_{i=1}^{r} a_i$. Then A(r, k) is the number of ways
of writing $k = a_1 + a_2 + \cdots + a_{r+1}$, which is the coefficient of $x^k$ in the power
series
\[ \Big(\sum_{a=0}^{\infty} x^{a}\Big)^{r+1} = (1 - x)^{-r-1} = \sum_{k=0}^{\infty} \binom{r+k}{k} x^{k} \]
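The count in Lemma 7.7 is easy to confirm by enumeration for small r and k. The following Python sketch (the function name is hypothetical) lists all r-tuples of non-negative integers with sum at most k and compares the count with the binomial coefficient.

```python
from itertools import product
from math import comb

def count_solutions(r, k):
    """Number of (a_1, ..., a_r) with a_i >= 0 and a_1 + ... + a_r <= k."""
    # Each a_i is at most k, so it is enough to search the cube {0..k}^r.
    return sum(1 for a in product(range(k + 1), repeat=r) if sum(a) <= k)

for r, k in [(1, 3), (2, 4), (3, 5)]:
    print(r, k, count_solutions(r, k), comb(r + k, k))
```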
When y is of the form y = (log x)a with a not too large, the upper bound of
Theorem 7.6 and the lower bound of Theorem 7.8 are quite close, and we have
Corollary 7.9 If $y = (\log x)^{a}$ and $1 \le a \le (\log x)^{1/2}/(2\log\log x)$, then
\[ x^{1-1/a}\exp\Big(\frac{\log x}{5a\log\log x}\Big) < \psi(x, y) < x^{1-1/a}\exp\Big(\frac{(\log a + O(1))\log x}{a\log\log x}\Big). \]
Proof The lower bound follows from Theorem 7.8 since log y ≤ (log x)/
(4a log log x) in the range under consideration. As for the upper bound, we
note that $\log u \asymp \log\log x$, so that log log u = log log log x + O(1). Hence
log u + log log u = log log x − log a + O(1), and the result follows from
Theorem 7.6.
For 1 ≤ u ≤ 4 we may use the differential equation (7.4) and the initial
condition (7.5) to derive formulæ for ρ(u) (see Exercise 7.1.6 below), but for
larger u we take a different approach.
The differential–delay identity (7.4) for ρ(u) thus yields a differential equation
for F(s),
by (C.12) and Theorem C.2. Hence F(0) = eC0 . An arithmetic proof of this
is found in Exercise 7.1.7 below. Thus we have the identity (7.29), and (7.30)
follows by applying the inverse Laplace transform to both sides.
7.1.1 Exercises
1. (Chowla & Vijayaraghavan 1947) Show that if f (x) is a function that tends
to infinity in such a way that log f (x) = o(log x) then almost all integers n
have a prime factor larger than f (n). That is
\[ \lim_{x \to \infty} \frac{1}{x}\operatorname{card}\{n \le x : P(n) > f(n)\} = 1 \]
4. Show that $\rho^{(k)}(u)$ has a jump discontinuity at u = k, and is continuous for
u > k.
5. (a) Show that ρ(u) is convex upwards for all u ≥ 1.
(b) Show that if u ≥ 2, then uρ(u) ≥ ρ(u − 1/2).
(b) Show that if σ ≥ 1, then $\frac{P'}{P}(\sigma) \ll \log y$.
(c) Deduce that
\[ -P'(1) = \sum_{\substack{n \\ p \mid n \Rightarrow p \le y}} \frac{\log n}{n} \ll (\log y)^2. \]
(c) Show that if σ > 1 and A ≥ 1, then the number of integers n ≤ x such
that ω(n) > α(log x)/log log x is at most
\[ x^{\sigma}A^{-k}\sum_{n=1}^{\infty} \frac{A^{\omega(n)}}{n^{\sigma}}. \]
(d) Show that if A = log x and σ = 1 + (log log log x)/log log x, then the
above is $x^{1-\alpha+o(1)}$.
9. (de Bruijn 1966) Assume that 0 < σ ≤ 3/ log y, and note that this interval
covers a range that is not treated in Lemma 7.5.
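(a) Show that $1 - p^{-\sigma} \gg \sigma\log p$, and hence deduce that
\[ \prod_{p \le y} (1 - p^{-\sigma})^{-1} \le \exp\Big(\sum_{p \le y} \log\frac{C}{\sigma\log p}\Big) \le \exp\Big(\frac{Cy}{\log y}\log\frac{4}{\sigma\log y}\Big) \tag{7.32} \]
for a suitable constant C.
(b) Write
\[ \prod_{p \le y} (1 - p^{-\sigma})^{-1} = (1 - y^{-\sigma})^{-\pi(y)}\prod_{p \le y} \frac{1 - y^{-\sigma}}{1 - p^{-\sigma}} = F_1 \cdot F_2, \]
say. Show that
\[ F_1 \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{Cy}{(\log y)^2}\log\frac{4}{\sigma\log y}\Big). \]
(c) Note that
\[ \frac{1 - p^{-\sigma}}{1 - y^{-\sigma}} = 1 - \frac{(y/p)^{\sigma} - 1}{y^{\sigma} - 1}, \tag{7.33} \]
and hence deduce that the above is $\ge 1 - c\,\frac{\log(y/p)}{\log y}$, so that
\[ F_2 \le \exp\Big(\frac{C}{\log y}\sum_{p \le y} \log(y/p)\Big) \le \exp\big(Cy/(\log y)^2\big). \]
(d) Conclude that
\[ \prod_{p \le y} (1 - p^{-\sigma})^{-1} \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{Cy}{(\log y)^2}\log\frac{4}{\sigma\log y}\Big) \]
for 0 < σ ≤ 3/log y.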
10. (de Bruijn 1966) Lemma 7.5 suffers from a loss of precision when
3/log y ≤ σ ≤ (log log y)/log y. To obtain a refined estimate in this range,
write
\[ \prod_{p \le y} (1 - p^{-\sigma})^{-1} = F_1 \cdot F_2 \cdot F_3 \]
where the $F_i$ are products over the intervals p ≤ exp(1/σ), exp(1/σ) <
p ≤ y/exp(1/σ), and y/exp(1/σ) < p ≤ y, respectively.
(a) Use (7.32) to show that $F_1 \le \exp\big(C\sigma e^{1/\sigma}\big)$.
(b) Use Lemma 7.5 to show that
\[ F_2 \le \exp\Big(\frac{Cy^{1-\sigma}}{e^{1/\sigma}\log y}\Big). \]
(c) Use the identity (7.33) to show that
\[ \frac{1 - p^{-\sigma}}{1 - y^{-\sigma}} \ge 1 - \frac{c\sigma\log(y/p)}{y^{\sigma}}, \]
and hence deduce that
\[ F_3 \le (1 - y^{-\sigma})^{-\pi(y)}\exp\Big(C\sigma\sum_{p \le y} \frac{\log(y/p)}{y^{\sigma}}\Big) \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{y^{1-\sigma}}{(\log y)^2} + \frac{C\sigma y^{1-\sigma}}{\log y}\Big). \]
(d) Conclude that
\[ \prod_{p \le y} (1 - p^{-\sigma})^{-1} \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{C\sigma y^{1-\sigma}}{\log y}\Big) \]
when 3/log y ≤ σ ≤ (log log y)/log y.
11. (de Bruijn 1966)
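(a) For σ > 0 let $f(\sigma) = x^{\sigma}(1 - y^{-\sigma})^{-y/\log y}$. Show that f(σ) is minimized
precisely when
\[ \sigma = \frac{\log(1 + y/\log x)}{\log y}. \]
(b) Show that for the above σ,
\[ f(\sigma) = \exp\Big(\frac{\log x}{\log y}\log\frac{y + \log x}{\log x} + \frac{y}{\log y}\log\frac{y + \log x}{y}\Big). \]
(c) Show that if y ≤ log x, then
\[ \psi(x, y) \le \exp\Big(\frac{\log x}{\log y}\log\frac{y + \log x}{\log x} + \frac{y}{\log y}\Big(1 + O\Big(\frac{1}{\log y}\Big)\Big)\log\frac{y + \log x}{y}\Big). \]
(d) Show that if log x ≤ y ≤ (log x)², then
\[ \psi(x, y) \le \exp\Big(\frac{\log x}{\log y}\Big(1 + O\Big(\frac{1}{\log y}\Big)\Big)\log\frac{y + \log x}{\log x} + \frac{y}{\log y}\log\frac{y + \log x}{y}\Big). \]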
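Parts (a) and (b) of Exercise 11 can be checked numerically before attempting a proof. The Python sketch below works with log f(σ) rather than f(σ) to avoid overflow, and takes log x as an input parameter; the function names and the sample values of x and y are hypothetical.

```python
import math

def log_f(sigma, log_x, y):
    """log f(sigma) for f(sigma) = x^sigma * (1 - y^(-sigma))^(-y/log y)."""
    return sigma * log_x - (y / math.log(y)) * math.log1p(-y ** (-sigma))

def sigma_star(log_x, y):
    """The claimed minimizer sigma = log(1 + y/log x) / log y."""
    return math.log1p(y / log_x) / math.log(y)

def log_f_closed_form(log_x, y):
    """The claimed value of log f at the minimizer, from part (b)."""
    ly = math.log(y)
    return ((log_x / ly) * math.log((y + log_x) / log_x)
            + (y / ly) * math.log((y + log_x) / y))

log_x, y = 50 * math.log(10), 100.0        # x = 10^50, y = 100
s = sigma_star(log_x, y)
print(s, log_f(s, log_x, y), log_f_closed_form(log_x, y))
```

Since log f is convex in σ, comparing the value at the claimed minimizer with values at nearby points is already a meaningful check.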
7.2 Numbers composed of large primes 215
Figure 7.2 Buchstab’s function w(u) and its horizontal asymptote $e^{-C_0}$ for 1 ≤ u ≤ 4.
where u = (log x)/ log y and w(u) is a function determined by the initial con-
dition
w(u) = 1/u (7.37)
for 1 < u ≤ 2 and for u > 2 by the differential–delay equation
\[ (uw(u))' = w(u - 1). \tag{7.38} \]
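The initial condition (7.37) and the differential–delay equation (7.38) determine w(u) step by step, and this is easy to imitate numerically. The Python sketch below tabulates w on a uniform grid by integrating (uw(u))′ = w(u − 1) with the trapezoidal rule. The function names, the grid size and the value used for Euler's constant $C_0$ are assumptions made for this illustration.

```python
import math

def buchstab_table(u_max, n=1000):
    """Tabulate w(u) at u = 1 + i/n for 1 <= u <= u_max (u_max >= 2)."""
    h = 1.0 / n
    m = max(n, int(round((u_max - 1.0) * n)))
    w = [0.0] * (m + 1)
    for i in range(n + 1):                  # initial condition w(u) = 1/u on [1, 2]
        w[i] = 1.0 / (1.0 + i * h)
    g = 1.0                                 # g(u) = u*w(u), which equals 1 at u = 2
    for i in range(n, m):
        # g(u + h) - g(u) is the integral of w(v - 1) over [u, u + h],
        # approximated here by the trapezoidal rule.
        g += 0.5 * h * (w[i - n] + w[i + 1 - n])
        w[i + 1] = g / (1.0 + (i + 1) * h)
    return w, n

def w_at(table, u):
    """Look up w(u) at the nearest grid point."""
    w, n = table
    return w[int(round((u - 1.0) * n))]

table = buchstab_table(10.0)
print(w_at(table, 2.5), (1 + math.log(1.5)) / 2.5)   # closed form on [2, 3]
print(w_at(table, 10.0), math.exp(-0.5772156649015329))
```

The first printed pair compares the table with the closed form w(u) = (1 + log(u − 1))/u on 2 ≤ u ≤ 3, and the second shows how close w(10) already is to the asymptote $e^{-C_0}$.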
Before proceeding further we first derive some of the simplest properties of
the function w(u) depicted in Figure 7.2. By integrating (7.38) we deduce that
$uw(u) = \int_{1}^{u-1} w(v)\,dv + C$ for u > 2, and by letting u tend to 2 we find that
C = 1 so that
\[ uw(u) = \int_{1}^{u-1} w(v)\,dv + 1 \tag{7.39} \]
for u > 2. Since $w'(u)$ tends to 0 rapidly, it follows that the integral $\int_{2}^{\infty} w'(v)\,dv$
converges absolutely, and hence we see that $\lim_{u \to \infty} w(u)$ exists. Since it is to
be expected that Φ(x, y) is approximately $x\prod_{p<y}(1 - 1/p)$ when y is small,
it is not surprising that
Theorem 7.11 (Buchstab) Let Φ(x, y) denote the number of positive integers
n ≤ x composed entirely of prime numbers p ≥ y, and let w(u) be defined as
above. Then
\[ \Phi(x, y) = \frac{w(u)x}{\log y} - \frac{y}{\log y} + O\Big(\frac{x}{(\log x)^2}\Big) \tag{7.42} \]
uniformly for 1 ≤ u ≤ U and all y ≥ 2. Here u = (log x)/log y, which is to
say that $y = x^{1/u}$.
The term −y/log y can be included in the error term when $y \ll x/\log x$ but,
in view of (7.35), has to be present when y is close to x. It might be difficult
to prove that the above holds uniformly for all u ≥ 1 because of the precise
form of the error term, but the weaker assertion (7.36) can be shown to hold for
u ≥ 1 + ε, since sieve methods can be used when u is large.
The sum over p of the first error term is $\ll x/(\log x)^2$, and the sum over p of the
second is $\ll x^{2/U}/(\log x)^2$, which is acceptable since U ≥ 2. To estimate the
contribution of the main term in the sum we write the Prime Number Theorem in
the form π(t) = li(t) + R(t), apply Riemann–Stieltjes integration, and integrate
the term involving R(t) by parts, to see that the sum of the main term is
\[ \int_{x^{1/u}}^{x^{1/U}} \frac{x\,w\big(\frac{\log x}{\log t} - 1\big)}{t(\log t)^2}\,dt + f(t)R(t)\bigg|_{x^{1/u}-}^{x^{1/U}-} - \int_{x^{1/u}}^{x^{1/U}} R(t)\,df(t) \tag{7.45} \]
where
\[ f(t) = \frac{x\,w\big(\frac{\log x}{\log t} - 1\big)}{t\log t}. \]
Since $f'(t) \ll x/(t^2\log t)$ and $R(t) \ll t/(\log t)^{A}$, the terms involving R(t)
contribute an amount $\ll_U x/(\log x)^{A}$. By the change of variables v =
(log x)/log t − 1 we see that the first integral in (7.45) is
\[ \frac{x}{\log x}\int_{U-1}^{u-1} w(v)\,dv, \]
which by (7.39) is
\[ = \frac{x}{\log x}\big(uw(u) - Uw(U)\big). \]
On combining our estimates we obtain (7.44), so the inductive step is
complete.
We now derive formulæ for w(u) similar to those in Theorem 7.10 involving
ρ(u).
7.2.1 Exercises
1. By using (7.31), or otherwise, show that
\[ \int_{0}^{s} \frac{1 - e^{-z}}{z}\,dz = C_0 + \log s + \int_{s}^{\infty} \frac{e^{-z}}{z}\,dz \]
when s > 0.
2. (a) Show that
\[ w(u) = \frac{1 + \log(u - 1)}{u} \]
for 2 ≤ u ≤ 3.
for 4 ≤ u ≤ 5.
3. (Friedlander 1972) Let S be a set of positive integers not exceeding X, and
suppose that (a, b) ≤ Y whenever a ∈ S, b ∈ S, a ≠ b. Let M(X, Y) denote
the maximum cardinality of all such sets S.
(a) Let S0 be the set of those positive integers n ≤ X such that if d|n, d < n,
then d ≤ Y . Show that card S0 = M(X, Y ).
(b) Show that if Y ≤ X 1/2 , then
\[ M(X, Y) = 1 + \pi(X) - \pi(Y) + \sum_{p \le Y} \Phi(Y, p). \]
g(q) ≥ ω(q) + 1.
7.3 Primes in short intervals 221
Proof of Lemma 7.13 Let L be large and fixed, and put N = [z L/3]. We show
that if z > z 0 (L), then there exists an integer M such that (M + n, P(z)) > 1
for 1 ≤ n ≤ N . Put
\[ P_1 = \prod_{p \le L} p, \qquad P_2 = \prod_{L < p \le L^{L}} p, \qquad P_3 = \prod_{L^{L} < p \le z/3} p, \qquad P_4 = \prod_{z/3 < p \le z} p, \]
card N < π (N ).
For such an M,
\[ \operatorname{card}\{1 \le n \le N : (M + n, P_1P_2P_3) = 1\} \le \pi(N)\prod_{p \mid P_2}\Big(1 - \frac{1}{p}\Big). \]
The success of the argument just completed can be attributed to the fact that
the number of n, 1 ≤ n ≤ N, for which $(n, P_1P_3) = 1$ is considerably smaller
than $N\prod_{p \mid P_1P_3}(1 - 1/p)$. By considering how L may be chosen as a function
of z we obtain a quantitative improvement of Lemma 7.13 and hence also of
Theorem 7.14.
Theorem 7.15 (Rankin) Let $p_n$ denote the n-th prime number in increasing
order. There is a constant c > 0 such that
\[ \limsup_{n \to \infty} \frac{p_{n+1} - p_n}{(\log p_n)(\log\log p_n)(\log\log\log\log p_n)/(\log\log\log p_n)^2} \ge c. \]
Proof We repeat the argument in the proof of Lemma 7.13, with the sole
change that L is allowed to depend on z. If L is chosen so that
\[ \psi(N, L^{L}) < \frac{N}{(\log N)^2}, \tag{7.49} \]
then L = o(log N), and hence
\[ \psi(N, L^{L}) = o\Big(\frac{z}{\log N}\Big). \]
Since $z/\log N \le z/\log z \ll \pi(z/3)$, it follows that
\[ \psi(N, L^{L}) = o(\pi(z/3)), \]
and the proof proceeds as before.
By Theorem 7.6 we see that
\[ \psi\big(N, N^{1/u}\big) < \frac{N}{(\log N)^2} \]
if u log u ≥ 3 log log N, which is the case if u ≥ 4(log log N)/log log log N.
Taking $u = (\log N)/\log L^{L}$, we deduce that (7.49) holds if
\[ L\log L < \frac{(\log N)(\log\log\log N)}{4\log\log N}. \]
This is satisfied if
\[ L < \frac{(\log N)(\log\log\log N)}{4(\log\log N)^2}, \]
since then log L < log log N. Since N > z when L ≥ 3, we conclude that we
may take
\[ L = \frac{(\log z)(\log\log\log z)}{4(\log\log z)^2}. \]
Hence
\[ g(P(z)) > \frac{z(\log z)(\log\log\log z)}{13(\log\log z)^2} \]
for all $z > z_0$, and this gives the stated result.
Thus ρ(y) < (2 + ε)π(y). Very little is known about ρ(y). It was once conjec-
tured that
π (M + N ) ≤ π(M) + π (N ) (7.51)
for M > 1, N > 1, but there is now serious doubt as to the validity of this
inequality. Indeed, it seems likely that ρ(y) > π (y) for all large y. To see why,
let
\[ \rho(N) = \max_{M} \sum_{\substack{n = M+1 \\ p \mid n \Rightarrow p > N}}^{M+N} 1. \tag{7.52} \]
Proof Suppose that N is even and that N > 2. Then for every M,
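\[ \sum_{\substack{n = M+1 \\ p \mid n \Rightarrow p > N}}^{M+N} 1 = \sum_{\substack{n = M+1 \\ p \mid n \Rightarrow p \ge N}}^{M+N} 1 \ge \sum_{\substack{n = M+1 \\ p \mid n \Rightarrow p > N-1}}^{M+N-1} 1. \]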
Hence ρ(N ) ≥ ρ(N − 1) when N is even, N > 2, so it suffices to treat the case
when N is odd, say N = 2K + 1. Let P(K ) denote the set of integers n with
K /(2 log K ) < |n| ≤ K and |n| prime. Then
so by Theorem 6.9,
\[ \operatorname{card} P(K) = \pi(2K + 1) + (c + o(1))\frac{K}{(\log K)^2} \]
where c = 2 log 2 − 1 > 0. We now show that P(K) can be translated to form
a set of integers {M + n : n ∈ P(K)} with each member coprime to $\prod_{p \le N} p$.
By the Chinese Remainder Theorem it suffices to show that for every prime
7.3.1 Exercises
1. Show that the function ρ(N ) is weakly increasing.
2. (a) Show that in the prime k-tuple conjecture, the hypothesis that for every
prime p the numbers a j do not cover all residue classes (mod p) is
satisfied for all p > k, so that it is enough to verify the hypothesis for
p ≤ k (a finite calculation for any given set of a j ).
(b) Prove the converse of the prime k-tuple conjecture: If there exist in-
finitely many integers n for which n + a j is prime for all j, 1 ≤ j ≤ k,
then for every prime p there is a residue class x (mod p) such that
x + a j ≡ 0 (mod p)(1 ≤ j ≤ k).
3. Show that g(q) qω(q)/ϕ(q).
4. (cf. Erdős 1951) Show that if 0 < c < 1/2 then there exist arbitrarily large
numbers x such that the interval (x, x + c(log x)/ log log x) contains no
square-free number.
5. (cf. Erdős 1946, Montgomery 1987) Suppose that 2 ≤ h ≤ x. Let P denote
the set of all primes p ≤ h, let D denote the set of positive integers
composed entirely of primes in P, and let $f(n) = \prod_{p \mid n,\, p \in P}(1 - 1/p)$.
(a) Show that $f(n) = \sum_{d \mid n,\, d \in D} \mu(d)/d$.
(b) Show that
\[ \sum_{x < n \le x+h} f(n) = \frac{6}{\pi^2}h + O(\log h) \]
uniformly in x.
(d) Among those primes p > h that divide an integer in the interval (x, x +
h], let Q be those for which p ≤ h log x, and R those for which p >
h log x. Show that
1
log log log x.
p∈Q
p
p n,
p∈R x<n≤x+h
U < p≤2U
ϕ(q)2 r2 p( p − 2)
= µ(r )2 {h/r }(1 − {h/r }) .
q r |q ϕ(r )2 p|q
( p − 1)2
r >1 pr
\[ \sum_{n=1}^{q} \Big(\sum_{\substack{m=1 \\ (m+n,q)=1}}^{h} 1 - \frac{\varphi(q)}{q}h\Big)^{2} \le h\varphi(q). \]
8. (Erdős 1951) (a) For a positive integer q, let S(q) denote the set of those
residue classes s modulo q 2 such that (s, q) is a perfect square. Show
that if q is square-free, then S(q) contains exactly $\prod_{p \mid q}(p^2 - p + 1)$
elements.
(b) Show that if q is square-free and 1 ≤ h ≤ q 2 , then there is an integer
a such that the number of members of S(q) in the interval (a, a + h]
is at most
\[ h\prod_{p \mid q}\Big(1 - \frac{1}{p} + \frac{1}{p^2}\Big). \]
(c) From now on, suppose that q is the product of those primes p ≤ y such
that p ≡ 3 (mod 4). By recalling Corollary 4.12, or otherwise, show
that the expression above is $\ll h/\sqrt{\log y}$.
(d) Show that if an integer n can be expressed as a sum of two squares,
then n ∈ S(q).
(e) Let R be the set of those primes p, y < p ≤ C y, such that p ≡
3 (mod 4). Here C is an absolute constant, taken to be sufficiently
large to ensure that R has at least y/ log y elements. Note that such a
constant exists, in view of Exercise 4.3.5(e). Let r denote the product of
all members of R. Suppose that the number of members of S(q) lying
in the interval (a, a + h] is < y/ log y. For each s ∈ S(q) satisfying
a < s ≤ a + h, associate a prime p ∈ R. Suppose that the integer b is
chosen modulo p 2 so that s + bq 2 ≡ p (mod p 2 ). Show that the interval
(a + bq 2 , a + bq 2 + h] does not contain a sum of two squares.
(f) Show that a and b can be chosen so that 0 < a + bq 2 < (qr )2 .
(g) Show that log qr y.
√
(h) Show that this construction succeeds with h y/ log y
(log qr )/(log log qr )1/2 .
(i) Conclude that there exist arbitrarily large x such that there is no sum of
two squares between x and x + c(log x)/(log log x)1/2 . Here c is a suit-
ably small positive constant. (Note that a stronger result is established
in the next exercise.)
9. (Richards 1982) For every prime p ≤ y, let β( p) denote the greatest positive
integer such that p β ≤ y, and put
\[ q = \prod_{\substack{p \le y \\ p \equiv 3\,(4)}} p^{2\beta(p)}. \]
for any fixed k. Since the sum over all k ≥ 1 of the right-hand side is exactly x,
it is tempting to think that the above holds quite uniformly in k. However this
is not the case, as we shall presently discover. To obtain precise estimates that
are uniform in k we apply analytic methods. In Section 2.4 we determined the
asymptotic distribution of the additive function Ω(n) − ω(n) by establishing
the mean value of the multiplicative function z^{Ω(n)−ω(n)}. In the same spirit
we shall derive information concerning the distribution of Ω(n) from mean
value estimates of z^{Ω(n)}. Since the Euler product of this latter function behaves
badly when |z| is large, we start not with z^{Ω(n)} but with d_z(n) defined by the
identities
\[ \zeta(s)^z = \prod_p \big(1 - p^{-s}\big)^{-z} = \sum_{n=1}^{\infty} d_z(n) n^{-s} \qquad (\sigma > 1). \tag{7.56} \]
Since d_z(p) = z = z^{Ω(p)}, the functions d_z(n) and z^{Ω(n)} are ‘nearby’, and hence
the mean value of z^{Ω(n)} can be derived from that for d_z(n) by elementary
reasoning.
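As a small illustration of (7.56), the following Python sketch computes d_z(n) for rational z from the Euler product. It assumes the standard binomial expansion, which the text does not spell out: the coefficient of p^{-ks} in (1 − p^{-s})^{-z} is z(z+1)···(z+k−1)/k!. The helper names factorize, d_z_prime_power and d_z are illustrative only.

```python
from fractions import Fraction


def factorize(n):
    """Return the prime factorization of n as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d_z_prime_power(z, k):
    """Coefficient of t^k in (1 - t)^(-z), namely z(z+1)...(z+k-1)/k!."""
    value = Fraction(1)
    for j in range(k):
        value = value * (z + j) / (j + 1)
    return value


def d_z(z, n):
    """d_z(n), extended multiplicatively from its values at prime powers."""
    value = Fraction(1)
    for k in factorize(n).values():
        value *= d_z_prime_power(z, k)
    return value
```

With z = 2 this reproduces the ordinary divisor function, and for any z it gives d_z(p) = z at primes, which is the sense in which d_z(n) and z^{Ω(n)} are 'nearby'.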
Theorem 7.17 Let D_z(x) = Σ_{n≤x} d_z(n), and let R be any positive real number.
If x ≥ 2, then
\[ D_z(x) = \frac{x(\log x)^{z-1}}{\Gamma(z)} + O\big(x(\log x)^{z-2}\big) \]
uniformly for |z| ≤ R.
Proof Let a = 1 + 1/log x. Then by Corollary 5.3,
\[ D_z(x) - \frac{1}{2\pi i}\int_{a-iT}^{a+iT} \zeta(s)^z \frac{x^s}{s}\,ds \ll \sum_{x/2<n<2x} |d_z(n)| \min\Big(1, \frac{x}{T|x-n|}\Big) + \frac{x^a}{T}\sum_{n} |d_z(n)| n^{-a}. \tag{7.57} \]
Since |d_z(n)| is erratic, we must exercise some care in estimating the error terms
above. Let A = {n : |n − x| ≤ x/(log x)^{2R+1}}. Without loss of generality we
may suppose that R is an integer. We note that |d_z(n)| ≤ d_{|z|}(n) ≤ d_R(n). By
the method of the hyperbola we see by induction on R that
\[ D_R(x) = x P_R(\log x) + O_R\big(x^{1-1/R}\big) \]
where P_R is a polynomial of degree R − 1. Hence the contribution to the first
sum in the error term in (7.57) of the n ∈ A is
\[ \ll \sum_{n \in A} |d_z(n)| \ll x(\log x)^{-R-2} \]
x(log x)^{z−2}.
In the main term, when m ≤ x^{1/2} we write
\[ (\log x/m)^{z-1} = (\log x)^{z-1} + O\big((\log m)(\log x)^{z-2}\big). \]
Thus the first sum on the right-hand side of (7.59) is
\[ = (\log x)^{z-1} \sum_{m \le x/2} \frac{b_z(m)}{m} + O\Big( (\log x)^{z-2} \sum_{m \le \sqrt{x}} \frac{|b_z(m)|}{m} \log m + (\log x)^{R-1} \sum_{m > \sqrt{x}} \frac{|b_z(m)|}{m} \Big) \]
\[ = (\log x)^{z-1} F(1, z) + O\Big( (\log x)^{z-2} \sum_{m} \frac{|b_z(m)|}{m} (\log m)^{2R+1} \Big), \]
for σ > 1, |z| ≤ R. Then a_z(n) = z^{Ω(n)} in the notation of Theorem 7.18. Hence,
with σ_k(x) defined as at the beginning of this section we find that
\[ A_z(x) = \sum_{n \le x} z^{\Omega(n)} = \sum_{k=0}^{\infty} \sigma_k(x) z^k. \]
Here the power series on the right is actually a polynomial, since σ_k(x) = 0 for
sufficiently large k, when x is fixed. Our asymptotic estimate for A_z(x) enables
us to recover an estimate for the power series coefficients σ_k(x), since Cauchy’s
formula asserts that
\[ \sigma_k(x) = \frac{1}{2\pi i} \oint_{|z|=r} \frac{A_z(x)}{z^{k+1}}\,dz \tag{7.61} \]
for r < 2.
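The coefficient extraction in (7.61) can be tried out numerically. The sketch below is only an illustration under stated assumptions: it tabulates Ω(n) with a smallest-prime-factor sieve, forms the polynomial A_z(x) exactly, and then approximates the contour integral by an equally spaced sum over the circle |z| = r. The function names, the radius r = 1.5 and the number of sample points are all illustrative choices, not taken from the text.

```python
import cmath


def big_omega_table(x):
    """Omega(n) for 0 <= n <= x, via a smallest-prime-factor sieve."""
    spf = list(range(x + 1))
    for p in range(2, int(x ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, x + 1, p):
                if spf[m] == m:
                    spf[m] = p
    big_omega = [0] * (x + 1)
    for n in range(2, x + 1):
        big_omega[n] = big_omega[n // spf[n]] + 1
    return big_omega


def sigma_counts(x):
    """List whose k-th entry is sigma_k(x) = #{n <= x : Omega(n) = k}."""
    table = big_omega_table(x)
    counts = [0] * (max(table) + 1)
    for n in range(1, x + 1):
        counts[table[n]] += 1
    return counts


def A(z, counts):
    """A_z(x) = sum over k of sigma_k(x) z^k."""
    return sum(c * z ** k for k, c in enumerate(counts))


def sigma_by_cauchy(k, counts, r=1.5, points=64):
    """Approximate (7.61) by averaging A_z(x) z^(-k) over points on |z| = r."""
    total = 0
    for j in range(points):
        z = r * cmath.exp(2j * cmath.pi * j / points)
        total += A(z, counts) / z ** k
    return (total / points).real
```

Since A_z(x) is a polynomial of degree at most (log x)/log 2, the equally spaced sum recovers each coefficient up to rounding as soon as the number of sample points exceeds that degree.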
Theorem 7.19 Suppose that R < 2, that F(s, z) is given by (7.60), and that
G(z) = F(1, z)/Γ(z + 1). Then
\[ \sigma_k(x) = G\Big(\frac{k-1}{\log\log x}\Big) \frac{x(\log\log x)^{k-1}}{(k-1)!\,\log x} \Big(1 + O_R\Big(\frac{k}{(\log\log x)^2}\Big)\Big) \tag{7.62} \]
uniformly for 1 ≤ k ≤ R log log x.
Since G(0) = G(1) = 1, we see that (7.55) holds when k = o(log log x), and
also when k = (1 + o(1)) log log x, but that (7.55) does not hold in general. The
restriction to R < 2 is necessary because of the contribution of the prime p = 2
in the Euler product (7.60) for F(s, z). If z ≥ 2, then the behaviour is different;
see Exercises 7.4.5 and 7.4.6, below.
Proof Our quantitative form of the Prime Number Theorem (Theorem 6.9)
gives the case k = 1, so we may assume that k > 1. We substitute the estimate of
Theorem 7.18 in (7.61) with r = (k − 1)/log log x. The error term contributes
an amount
\[ \ll x(\log x)^{r-2} r^{-k} = \frac{x}{(\log x)^2}\, e^{k-1} \frac{(\log\log x)^k}{(k-1)^k} \ll \frac{x(\log\log x)^k}{(k-1)!\,(\log x)^2} \ll \frac{x(\log\log x)^{k-3}}{(k-1)!\,\log x}. \]
7.4 Numbers composed of a prescribed number of primes 233
This is majorized by the error term in (7.62) since G((k − 1)/log log x) ≫ 1.
The main term we obtain from (7.61) is xI/log x where
\[ I = \frac{1}{2\pi i} \oint_{|z|=r} G(z)(\log x)^z z^{-k}\,dz = \frac{G(r)}{2\pi i} \oint_{|z|=r} (\log x)^z z^{-k}\,dz + \frac{1}{2\pi i} \oint_{|z|=r} (G(z) - G(r))(\log x)^z z^{-k}\,dz. \]
By integration by parts we find that
\[ \frac{r}{2\pi i} \oint_{|z|=r} (\log x)^z z^{-k}\,dz = \frac{1}{2\pi i} \oint_{|z|=r} (\log x)^z z^{1-k}\,dz. \]
We multiply both sides by G'(r) and combine with the former identity to see
that
\[ I = \frac{G(r)}{2\pi i} \oint_{|z|=r} (\log x)^z z^{-k}\,dz + \frac{1}{2\pi i} \oint_{|z|=r} \big(G(z) - G(r) - G'(r)(z - r)\big)(\log x)^z z^{-k}\,dz. \tag{7.63} \]
Here the first integral is (log log x)^{k−1}/(k − 1)! by Cauchy’s theorem, which
gives the desired main term. On the other hand,
\[ G(z) - G(r) - G'(r)(z - r) = \int_r^z (z - w) G''(w)\,dw \ll |z - r|^2, \]
But |sin x| ≤ |x| and cos 2πθ ≤ 1 − 8θ^2 for −1/2 ≤ θ ≤ 1/2, so the above is
\[ \ll r^{3-k} e^{k-1} \int_0^{\infty} \theta^2 e^{-8(k-1)\theta^2}\,d\theta \ll r^{3-k} e^{k-1} (k-1)^{-3/2} = \frac{(\log\log x)^{k-3} e^{k-1}}{(k-1)^{k-3/2}} \ll k(\log\log x)^{k-3}/(k-1)!. \]
This completes the proof of the theorem.
whereas
(z − r )(log x)z z −k dz = o |z − r ||(log x)z z −k | |dz| .
Theorem 7.20 Let A(x, r) denote the number of n ≤ x such that Ω(n) ≤
r log log x, and let B(x, r) denote the number of n ≤ x for which Ω(n) ≥
r log log x. If 0 < r ≤ 1 and x ≥ 2, then
Theorem 7.21 Let α_n be given by (7.64) and suppose that Y > 0. Then the
number of n, 3 ≤ n ≤ x, such that α_n ≤ y is
\[ \Phi(y)x + O_Y\Big(\frac{x}{\sqrt{\log\log x}}\Big) \]
Proof Let
\[ \beta_n = \frac{\Omega(n) - \log\log x}{\sqrt{\log\log x}}. \]
Since Φ'(y) ≪ 1 and α_n − β_n ≪ 1/√(log log x) when x^{1/2} ≤ n ≤ x and Ω(n) ≤
2 log log x, it suffices to consider β_n in place of α_n. We may of course also
suppose that x is large.
Let k be a natural number and let u be defined by writing k = u + log log x.
If |u| ≤ (1/2) log log x, then by Stirling’s formula (see (B.26) or the more general
Theorem C.1) we see that
\[ \frac{(\log\log x)^{k-1}}{(k-1)!} = \frac{e^u \log x}{\sqrt{2\pi \log\log x}} \Big(1 + \frac{u}{\log\log x}\Big)^{\frac12 - \log\log x - u} \Big(1 + O\Big(\frac{1}{\log\log x}\Big)\Big). \]
The estimate log(1 + δ) = δ − δ^2/2 + O(|δ|^3) holds uniformly for |δ| ≤ 1/2.
By taking δ = u/log log x we find that
\[ \Big(1 + \frac{u}{\log\log x}\Big)^{\frac12 - \log\log x - u} = \exp\Big( -u + \frac{u - u^2}{2\log\log x} - \frac{u^2}{4(\log\log x)^2} + O\Big(\frac{|u|^3}{(\log\log x)^2}\Big) \Big). \]
Suppose now that |u| ≤ (log log x)^{2/3}. By considering separately |u| ≤
(log log x)^{1/2} and (log log x)^{1/2} < |u| ≤ (log log x)^{2/3} we see that
\[ \frac{u}{\log\log x} \ll \frac{1}{\sqrt{\log\log x}} + \frac{|u|^3}{(\log\log x)^2}. \]
Similarly, by considering |u| ≤ 1 and |u| > 1 we see that
\[ \frac{u^2}{(\log\log x)^2} \ll \frac{1}{\sqrt{\log\log x}} + \frac{|u|^3}{(\log\log x)^2}. \]
On combining these estimates we deduce that
\[ \frac{(\log\log x)^{k-1}}{(k-1)!} = \frac{\log x}{\sqrt{2\pi \log\log x}} \exp\Big(\frac{-u^2}{2\log\log x}\Big) \Big(1 + O\Big(\frac{1}{\sqrt{\log\log x}}\Big) + O\Big(\frac{|u|^3}{(\log\log x)^2}\Big)\Big) \]
uniformly for |u| ≤ (log log x)^{2/3}. In Theorem 7.19 we have G(1) = 1 and
\[ G\Big(\frac{k-1}{\log\log x}\Big) = G(1) + O\Big(\frac{1 + |u|}{\log\log x}\Big). \]
Hence by Theorem 7.19,
\[ \sigma_k(x) = \frac{x \exp\Big(\frac{-(k - \log\log x)^2}{2\log\log x}\Big)}{\sqrt{2\pi \log\log x}} \Big(1 + O\Big(\frac{1}{\sqrt{\log\log x}}\Big) + O\Big(\frac{|k - \log\log x|^3}{(\log\log x)^2}\Big)\Big). \]
By Theorem 7.20 we know that the contribution of k ≤ log log x −
(log log x)^{2/3} is negligible. We sum over the range
\[ \log\log x - (\log\log x)^{2/3} \le k \le \log\log x + y(\log\log x)^{1/2}. \]
This gives rise to three sums, one for the main term and two for error terms.
Each of these sums can be considered to be a Riemann sum for an associated
integral, and the stated result follows.
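The Stirling step in this proof, which replaces (log log x)^{k−1}/(k − 1)! by a Gaussian in u = k − log log x, can be checked numerically. The sketch below is a minimal check, not part of the proof: the parameter L stands in for log log x (so that e^L plays the role of log x), and the function names are illustrative. Working with logarithms via math.lgamma avoids overflow for large L.

```python
import math


def log_poisson_weight(L, k):
    """log of L^(k-1) / (k-1)!, where L plays the role of log log x."""
    return (k - 1) * math.log(L) - math.lgamma(k)


def log_gaussian_weight(L, k):
    """log of e^L / sqrt(2 pi L) * exp(-u^2 / (2L)) with u = k - L."""
    u = k - L
    return L - 0.5 * math.log(2 * math.pi * L) - u * u / (2 * L)


def ratio(L, k):
    """Quotient of the two weights; close to 1 when |u| is at most L^(2/3)."""
    return math.exp(log_poisson_weight(L, k) - log_gaussian_weight(L, k))
```

For instance, with L = 400 the quotient at k = 400 and at k = 410 stays within a few percent of 1, in line with the relative error O(1/√L) + O(|u|^3/L^2) stated above.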
7.4.1 Exercises
1. Let p_1, p_2, . . . , p_K be distinct primes. Show that the number of n ≤ x
composed entirely of the p_k is
\[ \frac{(\log x)^K}{K! \prod_{k=1}^{K} \log p_k} + O\big((\log x)^{K-1}\big). \]
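A quick numerical experiment for Exercise 1 is sketched below. It counts the integers n ≤ x whose prime factors all lie in a given list by recursing over the exponent of the first prime, and compares the count with the stated main term. The names count_composed and main_term are hypothetical helpers, and the comparison is an illustration rather than a proof.

```python
import math


def count_composed(x, primes):
    """Number of n <= x (n = 1 included) with all prime factors in `primes`."""
    if not primes:
        return 1 if x >= 1 else 0
    p, rest = primes[0], primes[1:]
    total = 0
    power = 1
    while power <= x:
        # Fix the exact power of p, then count the cofactors built from `rest`.
        total += count_composed(x // power, rest)
        power *= p
    return total


def main_term(x, primes):
    """(log x)^K / (K! * product of log p_k), the main term in Exercise 1."""
    K = len(primes)
    denominator = math.factorial(K) * math.prod(math.log(p) for p in primes)
    return math.log(x) ** K / denominator
```

The count is a lattice-point count in a simplex, so it always lies at or above the main term; for x = 10^30 and the primes 2, 3, 5 the quotient is already fairly close to 1.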
2. (a) Let d_z(n) be defined as in (7.56), and suppose that |z| ≤ R. Show that
|d_z(n)| ≤ d_{|z|}(n) ≤ d_R(n).
(b) Let F(s, z) be defined as in (7.60). Show that if 0 < r < 1 and σ > 1/2,
then 0 < F(σ, r) < 1.
(c) Let F(s, z) be defined as in (7.60). Show that if 1 < r < 2, then the
Dirichlet series coefficients of F(s, r) are all non-negative.
3. (a) Show that if
\[ F(s, z) = \prod_p \Big(1 + \frac{z}{p^s - 1}\Big)\Big(1 - \frac{1}{p^s}\Big)^z, \]
then F(s, z) converges for σ > 1/2, uniformly for |z| ≤ R.
(b) Show that if F(s, z) is taken as above, and if a_z(n) is defined as in
Theorem 7.18, then a_z(n) = z^{ω(n)}.
(c) Let ρ_k(x) denote the number of n ≤ x for which ω(n) = k. Show that
if x ≥ 2, then
\[ \rho_k(x) = G\Big(\frac{k-1}{\log\log x}\Big) \frac{x(\log\log x)^{k-1}}{(k-1)!\,\log x} \Big(1 + O_R\Big(\frac{k}{(\log\log x)^2}\Big)\Big) \]
uniformly for 1 ≤ k ≤ R log log x where G(z) = F(1, z)/Γ(z + 1).
(d) Show that G(0) = G(1) = 1.
(e) Let A(x, r) denote the number of n ≤ x for which ω(n) ≤ r log log x.
Show that
\[ A(x, r) \ll x(\log x)^{r - 1 - r\log r} \]
uniformly for 0 < r ≤ 1.
(f) Let B(x, r) denote the number of n ≤ x for which ω(n) ≥ r log log x.
Show that
\[ B(x, r) \ll x(\log x)^{r - 1 - r\log r} \]
uniformly for 1 ≤ r ≤ R.
4. (a) Show that if
\[ F(s, z) = \prod_p \Big(1 + \frac{z}{p^s}\Big)\Big(1 - \frac{1}{p^s}\Big)^z, \]
then F(s, z) converges for σ > 1/2, uniformly for |z| ≤ R.
(b) Show that if F(s, z) is taken as above, and if a_z(n) is defined as in
Theorem 7.18, then a_z(n) = µ(n)^2 z^{ω(n)}.
(c) Let π_k(x) denote the number of square-free n ≤ x for which ω(n) = k.
Show that if x ≥ 2, then
\[ \pi_k(x) = G\Big(\frac{k-1}{\log\log x}\Big) \frac{x(\log\log x)^{k-1}}{(k-1)!\,\log x} \Big(1 + O_R\Big(\frac{k}{(\log\log x)^2}\Big)\Big) \]
uniformly for 1 ≤ k ≤ R log log x where G(z) = F(1, z)/Γ(z + 1).
as x → ∞.
13. Show that if x ≥ 2, then
\[ \sum_{1 < n \le x} \frac{\Omega(n)}{\omega(n)} = x + O\Big(\frac{x}{\log\log x}\Big). \]
7.5 Notes
Section 7.1. Theorem 7.2 was first proved by Dickman (1930), and was redis-
covered by Chowla & Vijayaraghavan (1947), Ramaswami (1949), and Buch-
stab (1949). de Bruijn (1951a) gave a more precise estimate for ψ(x, y), over
a longer range of y. There is a considerable range of applications of ψ(x, y),
such as those to the distribution of k th power residues, Waring’s problem, and
the complexity of arithmetical algorithms in computer science. As a reflection
of this there have been two significant survey articles, by Norton (1971) and by
Hildebrand & Tenenbaum (1993).
Our treatment of ψ(x, y) is fairly elementary, but it would be natural to take
a more analytic approach, and use Perron’s formula to write
\[ \psi(x, y) = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \prod_{p \le y} (1 - p^{-s})^{-1} \frac{x^s}{s}\,ds = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \zeta(s) \prod_{p > y} (1 - p^{-s}) \frac{x^s}{s}\,ds. \]
For s not too large, an approximation to the product over p > y is provided by
the Prime Number Theorem, and this suggests the main term
\[ \Lambda(x, y) = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} \zeta(s) \exp\Big( -\int_y^{\infty} v^{-s} (\log v)^{-1}\,dv \Big) \frac{x^s}{s}\,ds. \]
It can be shown that this is indeed a good approximation to ψ(x, y) over a very
long range, but the technical details are rather heavy. By Theorem 7.10 it is not
hard to show that
\[ \Lambda(x, y) = x \int_{0^-}^{\infty} \rho(u - v)\,d\big([y^v] y^{-v}\big) \]
\[ \Lambda(x, y) \sim \rho(u) x \]
for a large range of u. For the further development of the theory, especially on
the analytic side, see Hildebrand & Tenenbaum (1993).
Section 7.2. Theorem 7.11 is due to Buchstab (1937). The finer details of
the behaviour of Φ(x, y) when u is large are intimately connected with sieve
theory, especially that of the linear sieve, i.e., the sieve in which on average one
residue class (mod p) is removed. The standard references are Greaves (2001),
Halberstam & Richert (1974), Selberg (1991).
Section 7.3. Theorem 7.14 was first proved by Westzynthius (1931). Erdős
(1935a) showed that
\[ \limsup_{n \to \infty} \frac{p_{n+1} - p_n}{(\log p_n)(\log\log p_n)/(\log\log\log p_n)^2} > 0, \]
and then Rankin (1938) obtained Theorem 7.15 with c = 1/3. The value of c
has been successively improved by Schönhage (1963), Rankin (1963), Maier
& Pomerance (1990), culminating in the value c = 2e^{C_0} of Pintz (1997). Erdős
offered a $10,000 prize for the first proof that the limsup in Theorem 7.15 is
+∞.
Early studies of g(P(z)) were conducted by Backlund (1929), Brauer &
Zeitz (1930), Ricci (1934), and Chang (1938). The size of g(P(z)) is not known;
possibly it is z log z. However, it is conceivable that infinitely often p_{n+1} − p_n
is as large as (log p_n)^θ where θ > 1. In particular, Cramér (1936) conjectured
that
\[ \limsup_{n \to \infty} \frac{p_{n+1} - p_n}{(\log p_n)^2} = 1. \]
Theorem 7.16 is due to Hensley & Richards (1973).
Section 7.4. The analysis of σk (x) is based on Selberg’s exposition (1954) of
Sathe (1953a,b, 1954a,b). Sathe (1954b) also shows that the bound R log log x
cannot be replaced by 2 log log x + 1. Arguments giving rise to versions of
Theorem 7.20 occur in Erdős (1935b). A qualitative version of Theorem 7.21
is a special case of Erdős & Kac (1940). Quantitative versions with various
weaker error terms were obtained by LeVeque (1949) and Kubilius (1956).
Theorem 7.21 had been conjectured by LeVeque and was established by Rényi
& Turán (1958). They also showed that the error term is both uniform in x and
best-possible.
7.6 References
Addison, A. W. (1957). A note on the compositeness of numbers, Proc. Amer. Math.
Soc. 8, 151–154.
Alladi, K. & Erdős, P. (1977). On an additive arithmetic function, Pacific J. Math. 71,
275–294.
Backlund, R. J. (1929). Über die Differenzen zwischen den Zahlen, die zu den n ersten
Primzahlen teilerfremd sind, Annales Acad. sci. Fennicae 32 (Lindelöf-Festschrift),
Nr. 2, 9 pp.
Brauer, A. & Zeitz, H. (1930). Über eine zahlentheoretische Behauptung von Legendre,
Sitzungsb. Math. Ges. Berlin 29, 116–125.
de Bruijn, N. G. (1949). The asymptotically periodic behavior of the solutions of some
linear functional equations, Amer. J. Math. 71, 313–330.
(1950a). On the number of uncancelled elements in the sieve of Eratosthenes, Nederl.
Akad. Wetensch. Proc. 52, 803–812. (= Indag. Math. 12, 247–256)
(1950b). On some linear functional equations, Publ. Math. 1, 129–134.
(1951a). The asymptotic behaviour of a function occurring in the theory of primes,
J. Indian Math. Soc. 15 (A), 25–32.
(1951b). On the number of positive integers ≤ x and free of prime factors > y, Proc.
Nederl. Akad. Wetensch. 54, 50–60.
(1966). On the number of positive integers ≤ x and free of prime factors > y, II,
Proc. Koninkl. Nederl. Akad. Wetensch. A 69, 239–247. (= Indag. Math. 28)
Buchstab, A. A. (1937). Asymptotic estimates of a general number-theoretic function,
Mat. Sb. (2) 44, 1239–1246.
(1949). On those numbers in an arithmetic progression all prime factors of which are
small in magnitude, Dokl. Akad. Nauk SSSR (N. S.) 67, 5–8.
Chang, T.-H. (1938). Über aufeinanderfolgende Zahlen, von denen jede mindestens
einer von n linearen Kongruenzen genügt, deren Moduln die ersten n Primzahlen
sind, Schr. Math. Sem. Inst. Angew. Math. Univ. Berlin 4, 35–55.
Chowla, S. D. & Vijayaraghavan, T. (1947). On the largest prime divisors of numbers,
J. Indian Math. Soc. (2) 12, 31–37.
Cramér, H. (1936). On the order of magnitude of the difference between consecutive
prime numbers, Acta Arith. 2, 23–46.
DeKoninck, J.-M. (1972). On a class of arithmetical functions, Duke Math. J. 39, 807–
818.
Dickman, K. (1930). On the frequency of numbers containing prime factors of a certain
relative magnitude, Ark. Mat. Astr. fys. 22, 1–14.
Duncan, R. L. (1970). On the factorization of integers, Proc. Amer. Math. Soc. 25,
191–192.
Erdős, P. (1935a). On the difference of consecutive primes, Quart. J. Math., Oxford ser.
6, 124–128.
(1935b). On the normal number of prime factors of p − 1 and some related problems
concerning Euler’s φ- function. Quart. J. Math., Oxford ser. 6, 205–213.
(1946). Some remarks about additive and multiplicative functions, Bull. Amer. Math.
Soc. 52, 527–537.
(1951). Some problems and results in elementary number theory, Publ. Math. Debre-
cen 2, 103–109.
(1962). On the integers relatively prime to n and on a number-theoretic function
considered by Jacobsthal, Math. Scand. 10, 163–170.
(1963). Problem and Solution Nr. 136, Wiskundige opgaven met de Oplossingen 21.
Erdős, P. & Kac, M. (1940). The Gaussian law of errors in the theory of additive number
theoretic functions, Amer. J. Math. 62, 738–742.
Erdős, P. & Nicolas, J.-L. (1981). Sur la fonction: nombre de facteurs premiers de n,
Enseignement Math. (2) 27, 3–27.
Friedlander, J. B. (1972). Maximal sets of integers with small common divisors, Math.
Ann. 195, 107–113.
Greaves, G. (2001). Sieves in Number Theory, Ergeb. Math. (3) 43. Berlin: Springer-
Verlag.
Halberstam, H. (1970). On integers all of whose prime factors are small, Proc. London
Math. Soc. (3) 21, 102–107.
Halberstam, H. & Richert, H.-E. (1974). Sieve Methods, London Mathematical Society
Monographs No. 4. London: Academic Press, 1974.
Hardy, G. H. & Littlewood, J. E. (1923). Some problems of “Partitio Numerorum”: III
On the expression of a number as a sum of primes, Acta Math. 44, 1–70.
Hausman, M. & Shapiro, H. N. (1973). On the mean square distribution of primitive
roots of unity, Comm. Pure Appl. Math. 26, 539–547.
Hensley, D. & Richards, I. (1973). Two conjectures concerning primes, Analytic Number
Theory, Proc. Sympos. Pure Math. 24. Providence: Amer. Math. Soc., 123–128.
(1973/4). Primes in intervals, Acta Arith. 25, 375–391.
Hildebrand, A. (1984). Integers free of large prime factors and the Riemann Hypothesis,
Mathematika 31, 258–271.
(1985). Integers free of large prime divisors in short intervals, Oxford Quart. J. 36,
57–69.
(1986a). On the number of positive integers ≤ x and free of prime factors > y,
J. Number Theory 22, 289–307.
(1986b). On the local behavior of ψ(x, y), Trans. Amer. Math. Soc. 297, 729–751.
(1987). On the number of prime factors of integers without large prime divisors,
J. Number Theory 25, 81–106.
Hildebrand, A. & Tenenbaum, G. (1986). On integers free of large prime factors, Trans.
Amer. Math. Soc. 296, 265–290.
(1993). Integers without large prime factors, J. Théor. Nombres Bordeaux. 5, 411–484.
Kubilius, I. P. (1956). Probabilistic methods in the theory of numbers, Uspehi Mat. Nauk
(N.S.) 11 68, 31–66.
Legendre, A. M. (1798). Théorie des Nombres, First edition, Vol. 2, pp. 71–79.
LeVeque, W. J. (1949). On the size of certain number-theoretic functions, Trans. Amer.
Math. Soc. 66, 440–463.
Maier, H. & Pomerance, C. (1990). Unusually large gaps between consecutive primes,
Trans. Amer. Math. Soc. 322, 201–237.
Montgomery, H. L. (1987). Fluctuations in the mean of Euler’s phi function, Proc. Indian
Acad. Sci. (Math. Sci.) 97, 239–245.
Montgomery, H. L. & Vaughan, R. C. (1986). On the distribution of reduced residues,
Ann. of Math. (2) 123 (1986), 311–333.
Norton, K. K. (1971). Numbers with Small Factors and the Least k’th Power Non-
Residues, Memoir 106, Providence: Amer. Math. Soc.
Pillai, S. S. & Chowla, S. D. (1930). On the error terms in some asymptotic formulæ in
the theory of numbers, I, J. London Math Soc. 5, 95–101.
Pintz, J. (1997). Very large gaps between consecutive primes, J. Number Theory 63,
286–301.
Ramaswami, V. (1949). The number of positive integers ≤ x and free of prime divisors
> y, and a problem of S. S. Pillai, Duke Math. J. 16, 99–109.
Rankin, R. A. (1938). The difference between consecutive primes, J. London Math. Soc.
13, 242–247.
(1963). The difference between consecutive primes, V, Proc. Edinburgh Math. Soc.
(2)13, 331–332.
Rényi, A. & Turán, P. (1958). On a theorem of Erdős–Kac, Acta Arith. 4, 71–84.
Ricci, G. (1934). Ricerche aritmetiche sui polinomi, II, Rend. Palermo 58, 190–208.
Richards, I. (1982). On the gaps between numbers which are sums of two squares, Adv.
in Math. 46, 1–2.
Sathe, L. G. (1953a,b,1954a,b). On a problem of Hardy on the distribution of integers
with a given number of prime factors I, II, III, IV, J. Indian Math. Soc. (N.S.) 17,
63–82 & 83–141, 18, 27–42 & 43–81.
Schinzel, A. (1961). Remarks on the paper “Sur certaines hypothèses concernant les
nombres premiers”, Acta Arith. 7, 1–8.
Schönhage, A. (1963). Eine Bemerkung zur Konstruktion grosser Primzahllücken, Arch.
Math. 14, 29–30.
Selberg, A. (1954). Note on a paper of L. G. Sathe, J. Indian Math. Soc. 18, 83–87.
(1991). Collected papers, Vol. II. Berlin: Springer-Verlag.
Westzynthius, E. (1931). Über die Verteilung der Zahlen, die zu den n ersten Primzahlen
teilerfremd sind, Comment. Phys.–Math. Soc. Sci. Fennica 5, Nr. 25, 37 pp.
8
Further discussion of the
Prime Number Theorem
8.1 Relations equivalent to the Prime Number Theorem 245
d|n
ζ
From (8.2) we know that for any ε > 0 there is a large number C = C(ε) such
that |ψ(y) − [y]| < εy provided that y ≥ C. That is, |ψ(x/d) − [x/d]| ≤ εx/d
for d ≤ x/C. Thus
\[ \Big| \sum_{d \le x/C} \mu(d) \big(\psi(x/d) - [x/d]\big) \Big| \le \sum_{d \le x/C} \frac{\varepsilon x}{d} \le \varepsilon x \log x. \]
Since ε can be taken arbitrarily small, we see that (8.5), and hence (8.4), follows
from (8.2).
It is worth pausing here to note that the choice of the main term above is
extremely delicate. If we had subtracted x/d instead of [x/d], then we would
have had to consider the question of the size of the sum Σ_{d≤x} µ(d)/d, which
will be considered later. Since Σ_{d≤x} µ(d)[x/d] = 1 for all x ≥ 1, we avoid the
problem by this judicious choice of the main term.
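The identity Σ_{d≤x} µ(d)[x/d] = 1 used here is easy to confirm by machine. The following sketch sieves the Möbius function and evaluates both the floor sum and, with exact rational arithmetic, the sum Σ_{d≤x} µ(d)/d that the text postpones. The function names are illustrative, not from the text.

```python
from fractions import Fraction


def mobius_table(x):
    """mu(n) for 0 <= n <= x (with mu[0] set to 0), by a sieve over primes."""
    mu = [1] * (x + 1)
    mu[0] = 0
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(p, x + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more distinct prime factor
            for m in range(p * p, x + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu


def floor_sum(x, mu):
    """Sum of mu(d) * [x/d] over d <= x."""
    return sum(mu[d] * (x // d) for d in range(1, x + 1))


def reciprocal_sum(x, mu):
    """Sum of mu(d) / d over d <= x, as an exact fraction."""
    return sum(Fraction(mu[d], d) for d in range(1, x + 1))
```

The floor sum equals 1 for every x ≥ 1, whereas the sum with x/d in place of [x/d] has no such closed form and only stays bounded.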
To complete our proof that (8.4) is equivalent to (8.2) we now assume (8.4),
and derive (8.2). By summing (8.6) over n, which is to say by applying (2.7),
we see that
\[ \psi(x) = \sum_{d \le x} \mu(d) T(x/d) \]
where T(x) = Σ_{m≤x} log m as in Section 2.2. We recall that T(x) = x log x −
x + O(log x) by the integral test. The main term here is approximately the same
as applies to the summatory function of the divisor function, since Theorem 2.2
asserts that D(x) = Σ_{m≤x} d(m) = x log x + (2C_0 − 1)x + O(x^{1/2}). Indeed,
the arithmetic function d(m) − 2C_0, when summed over m, produces exactly
the same main terms as log m. That is, if f(m) = log m − d(m) + 2C_0 and
F(x) = Σ_{m≤x} f(m) then F(x) ≪ x^{1/2}. On the other hand, Σ_{r|n} µ(r)d(n/r) =
1 for all n and Σ_{d|n} µ(d) = 0 for all n > 1, so that
\[ \sum_{d \mid n} \mu(d) f(n/d) = \begin{cases} \Lambda(n) - 1 & (n > 1), \\ 2C_0 - 1 & (n = 1). \end{cases} \]
We now use (8.4) to show that the left-hand side above is o(x), which thus gives
(8.2). The reasoning employed at this point is useful for other purposes, so we
axiomatize the argument, as follows.
By taking ad = µ(d) and F(x) as in (8.7), we see that (8.4) implies (8.2).
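Both identities used in this passage, ψ(x) = Σ_{d≤x} µ(d)T(x/d) and the convolution formula for f(m) = log m − d(m) + 2C_0, can be checked in floating point. The sketch below is an illustration only: it takes T(x) = log([x]!) through math.lgamma, hard-codes a numerical value of Euler's constant C_0, and uses helper names that are not from the text.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant, written C_0 in the text


def arithmetic_tables(x):
    """Return (mu, Lambda, d) as lists indexed by n = 0..x, for x >= 1."""
    spf = list(range(x + 1))  # smallest prime factor
    for p in range(2, int(x ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, x + 1, p):
                if spf[m] == m:
                    spf[m] = p
    mu = [0, 1] + [0] * (x - 1)
    lam = [0.0] * (x + 1)
    tau = [0, 1] + [0] * (x - 1)
    for n in range(2, x + 1):
        p, m, e = spf[n], n // spf[n], 1
        while m % p == 0:          # strip the full power of p from n
            m //= p
            e += 1
        mu[n] = -mu[m] if e == 1 else 0
        tau[n] = tau[m] * (e + 1)
        if m == 1:                 # n is a power of the prime p
            lam[n] = math.log(p)
    return mu, lam, tau


def psi(x, lam):
    """psi(x) = sum of Lambda(n) over n <= x."""
    return sum(lam[1:x + 1])


def psi_by_inversion(x, mu):
    """Sum of mu(d) * T(x/d) over d <= x, with T(y) = log([y]!)."""
    return sum(mu[d] * math.lgamma(x // d + 1) for d in range(1, x + 1))


def mu_convolved_f(n, mu, tau):
    """Sum over d | n of mu(d) f(n/d), f(m) = log m - d(m) + 2 C_0."""
    def f(m):
        return math.log(m) - tau[m] + 2 * EULER_C0
    return sum(mu[d] * f(n // d) for d in range(1, n + 1) if n % d == 0)
```

Up to rounding, psi and psi_by_inversion agree, and mu_convolved_f(n, ...) returns Λ(n) − 1 for n > 1 and 2C_0 − 1 for n = 1.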
Proof Suppose that 1 ≤ U ≤ x/2. From (ii) and (iv) we see that
\[ \sum_{x/(2U) < d \le x/U} a_d F(x/d) \ll \frac{U}{(\log U)^c} \sum_{x/(2U) < d \le x/U} |a_d| \ll \frac{x}{(\log U)^c}. \]
Since A(n_i) = o(x) and F(x/n_i) ≪_J 1, the first two terms are harmless. As the
points x/d are monotonically arranged in the interval [1, 2^J], the sum above
has absolute value not exceeding
\[ \max_{d \le x} |A(d)| \sum_{n_0 < d < n_1} |F(x/d) - F(x/(d+1))| \le \max_{d \le x} |A(d)| \operatorname{var}_{[1, 2^J]} F. \]
By (i) and (iii) this is o(x) for any given J. Thus the proof is complete.
248 Further discussion of the Prime Number Theorem
Since this is o(x), we obtain (8.8). To derive (8.4) from (8.8) is easier, in view
of the following useful principle:
Lemma 8.2 If Σ_{d=1}^∞ a_d/d converges, then Σ_{d≤x} a_d = o(x).
Proof Let x be given, set r(u) = Σ_{u<d≤x} a_d/d, and note that
\[ \sum_{d \le x} a_d = \int_0^x r(u)\,du. \]
But r(u) is bounded (independently of x), and |r(u)| < ε for u > U_0, so the
integral is ≪ U_0 + εx. That is, the sum is o(x), as desired.
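The integral identity at the heart of this proof can be verified exactly for integer x, since r(u) is constant on each interval [j, j + 1). The sketch below does this with rational arithmetic; the function names and the sample sequence a_d = (−1)^d are illustrative assumptions, chosen because Σ a_d/d converges.

```python
from fractions import Fraction


def tail(u, x, a):
    """r(u) = sum of a(d)/d over u < d <= x, for an integer u >= 0."""
    return sum(Fraction(a(d), d) for d in range(u + 1, x + 1))


def integral_of_tail(x, a):
    """Integral of r(u) over [0, x] for a positive integer x.

    r is a step function equal to tail(j, x, a) on [j, j + 1).
    """
    return sum(tail(j, x, a) for j in range(x))


def alternating(d):
    """Sample coefficients a_d = (-1)^d; the series of a_d/d converges."""
    return (-1) ** d
```

Each term a_d/d is counted once for every integer j < d, that is d times, which is why the integral returns Σ_{d≤x} a_d.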
8.1.1 Exercises
1. As in Section 2.2, let T(x) = Σ_{n≤x} log n, and recall that T(x) = x log x −
x + O(log x).
(a) Show that T(x) = Σ_{d≤x} Λ(d)[x/d].
(b) Show that
\[ x \sum_{d \le x} \frac{\Lambda(d)}{d} = T(x) + \sum_{d \le x} \{x/d\} + \sum_{d \le x} (\Lambda(d) - 1)\{x/d\}. \]
(c) Use (8.2) and Axer’s theorem to show that the last sum above is o(x).
(d) Recall Exercise 2.1.1.
(e) Show that (8.2) implies that
\[ \sum_{d \le x} \frac{\Lambda(d)}{d} = \log x - C_0 + o(1). \tag{8.9} \]
(f) Apply Lemma 8.2 with a_d = Λ(d) − 1 to show that (8.9) implies (8.2).
Hence (8.2) and (8.9) are equivalent.
(g) Show that
\[ \sum_{n \le x} \Lambda(n)\{x/n\} = (1 - C_0)x + o(x). \]
2. (a) By recalling the proof of Theorem 2.2(c), or otherwise, show that (8.2)
implies that
\[ \int_1^x \frac{\psi(u)}{u^2}\,du = \log x - 1 - C_0 + o(1). \tag{8.10} \]
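Parts (a) and (g) of Exercise 1 lend themselves to a numerical sanity check. The sketch below sieves Λ(n), compares Σ Λ(d)[x/d] with T(x) = log(x!) computed through math.lgamma, and evaluates Σ Λ(n){x/n} for an integer x. The helper names and the hard-coded value of Euler's constant are assumptions made for illustration.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant, written C_0 in the text


def von_mangoldt_table(x):
    """Lambda(n) for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            q = p
            while q <= x:              # mark every power of p
                lam[q] = math.log(p)
                q *= p
    return lam


def T(x):
    """T(x) = sum of log n over n <= x, for a positive integer x."""
    return math.lgamma(x + 1)


def weighted_floor_sum(x, lam):
    """Sum of Lambda(d) * [x/d] over d <= x, the right side of part (a)."""
    return sum(lam[d] * (x // d) for d in range(1, x + 1))


def fractional_part_sum(x, lam):
    """Sum of Lambda(n) * {x/n} over n <= x, the left side of part (g)."""
    return sum(lam[n] * ((x % n) / n) for n in range(1, x + 1))
```

Part (a) is an exact identity, so the first two quantities agree up to rounding; the quotient fractional_part_sum(x, lam) / x should drift towards 1 − C_0 ≈ 0.4228 as x grows.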
(c) By partial summation, derive (8.4) from (8.13), and thus show that (8.2),
(8.12) and (8.13) are all equivalent. (Note that a deeper assertion concerning
the sum in (8.13) was already proved in Exercise 6.2.15.)
5. Let F(n) = Σ_{d|n} f(d) for all n. The opening remarks in Chapter 2 raise the
possibility of a connection between the two relations
(i) S(x) = Σ_{n≤x} F(n) = cx + o(x);
(ii) Σ_{d=1}^∞ f(d)/d = c.
In Exercise 6.2.19 we have seen that (i) and the hypothesis f(n) ≪ 1 imply
(ii). Apply Axer’s theorem with a_d = f(d), F(x) = {x} to show that (ii) and
the hypothesis Σ_{n≤x} |f(n)| ≪ x imply (i).
6. Let d_k(n) be the k-th divisor function, as defined in Exercise 2.1.18. Put
D_0(x) = 1, and for positive integral k let D_k(x) = Σ_{n≤x} d_k(n).
(a) Show that if k is a positive integer, then Σ_{d≤x} µ(d)D_k(x/d) = D_{k−1}(x).
(b) Let g(n) be an arithmetic function, put G(x) = Σ_{n≤x} g(n), and suppose
that
\[ G(x) = x P(\log x) + O\big(x/(\log x)^c\big) \]
are equivalent. Some familiar – and useful – examples of this pairing are
displayed in Table 8.1. In many instances of (8.14), the functions A(x) and
B(x) are summatory functions of arithmetic functions a(n) and b(n), respectively,
in which case a(n) and b(n) are linked by the more common Möbius
inversion
\[ b(n) = \sum_{d \mid n} a(d), \qquad a(n) = \sum_{d \mid n} \mu(d) b(n/d). \tag{8.15} \]
The linear operator that takes A(x) to B(x) is continuous, but the transformation
is nevertheless quite unstable. For example, the functions A(x) in
the second and third lines of Table 8.1 are very close, and yet the corresponding
functions B(x) differ quite substantially.
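The instability just described can be seen numerically. The sketch below assumes, as the entries of Table 8.1 suggest, that the pairing (8.14) sends A to B(x) = Σ_{n≤x} A(x/n); this form of (8.14) is an assumption here, since the display itself is not reproduced in this passage. It compares A(x) = x with A(x) = [x]; the helper names are illustrative.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant, written C_0 in the text


def B_from_A(A, x):
    """B(x) = sum of A(x/n) over n <= x, for a positive integer x (floats)."""
    return sum(A(x / n) for n in range(1, x + 1))


def divisor_sum(x):
    """Sum of d(n) over n <= x, computed with a divisor-counting sieve."""
    tau = [0] * (x + 1)
    for d in range(1, x + 1):
        for m in range(d, x + 1, d):
            tau[m] += 1
    return sum(tau)


def gap(x):
    """B for A(x) = x minus B for A(x) = [x]; this is the sum of {x/n}."""
    return B_from_A(lambda t: t, x) - B_from_A(math.floor, x)
```

Although x and [x] differ by less than 1, the corresponding values of B differ by about (1 − C_0)x, which matches the second and third lines of Table 8.1.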
When the asymptotic rate of growth of A(x) is known, it is easy to deduce that
of B(x), as a form of Abelian theorem. For example, if A(x) ∼ x as x → ∞,
then B(x) ∼ x log x. However, from the fourth line of Table 8.1 we see that
8.2 An elementary proof of the Prime Number Theorem 251
Table 8.1

A(x)        B(x)
1           [x]
x           x Σ_{n≤x} 1/n = x log x + C_0 x + O(1)
[x]         Σ_{n≤x} d(n) = x log x + (2C_0 − 1)x + O(x^{1/2})
ψ(x)        Σ_{n≤x} log n = x log x − x + O(log x)
x log x     x Σ_{n≤x} (log x/n)/n = (1/2) x(log x)^2 + C_1 x log x + C_2 x + O(1)

some sort of Tauberian converse would be useful, for the purpose of proving
the Prime Number Theorem. Unfortunately, it is difficult to establish anything
stronger than the trivial estimate
\[ A(x) \ll \sum_{n \le x} |B(x/n)|. \tag{8.16} \]
From this we see that if B(x) ≪ 1, then A(x) ≪ x. This is rather weak, since
the same upper bound for A(x) can be deduced from a weaker upper bound for
B(x): From (8.16) we see that
Then for x ≥ 1,
\[ \sum_{n \le x} \Lambda_2(n) = 2x \log x + O(x). \]
Clearly Λ_2(n) > 0 only when ω(n) ≤ 2. Thus the sum on the left above is
analogous to ψ(x) but with prime powers replaced by products of two prime
powers, counted with suitable weights.
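Selberg's asymptotic can be explored numerically. The sketch below assumes the decomposition Λ_2(n) = Λ(n) log n + Σ_{bc=n} Λ(b)Λ(c), which is the identity stated in Exercise 8.2.1.6; the function names are illustrative, and the computation is a sanity check rather than a proof.

```python
import math


def von_mangoldt_table(x):
    """Lambda(n) for 0 <= n <= x."""
    lam = [0.0] * (x + 1)
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(2 * p, x + 1, p):
                is_prime[m] = False
            q = p
            while q <= x:
                lam[q] = math.log(p)
                q *= p
    return lam


def lambda2_table(x):
    """Lambda_2(n) = Lambda(n) log n + sum over bc = n of Lambda(b) Lambda(c)."""
    lam = von_mangoldt_table(x)
    lam2 = [lam[n] * math.log(n) if n > 1 else 0.0 for n in range(x + 1)]
    prime_powers = [n for n in range(2, x + 1) if lam[n] > 0]
    for b in prime_powers:
        for c in prime_powers:     # ascending, so we can stop early
            if b * c > x:
                break
            lam2[b * c] += lam[b] * lam[c]
    return lam2


def omega(n):
    """Number of distinct prime factors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)
```

For moderate x the quantity (Σ_{n≤x} Λ_2(n) − 2x log x)/x stays bounded, consistent with the O(x) error term, and Λ_2(n) vanishes whenever n has three or more distinct prime factors.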
Take now
\[ A(x) = \sum_{n \le x} \Lambda_2(n) - 2x \log x + c_1 x + c_2 \tag{8.20} \]
where c_1 and c_2 are constants to be chosen later. Then by (8.18) and lines
1, 2, and 5 of Table 8.1 we see that the corresponding B(x) given by (8.14)
is
\[ B(x) = \sum_{n \le x} (\log n)^2 - 2x \sum_{n \le x} \frac{\log x/n}{n} + c_1 x \sum_{n \le x} \frac{1}{n} + c_2 [x]. \]
By the integral test the first sum is ∫_1^x (log u)^2 du + O((log x)^2) = x(log x)^2 −
2x log x + 2x + O((log x)^2). Hence the above is
We now choose c_1 and c_2 so that the leading terms cancel. That is, we take
c_1 = 2 + 2C_1 and c_2 = −2 + 2C_2 − c_1 C_0. Then B(x) ≪ (log x)^2, and hence
by (8.17) it follows that A(x) ≪ x. The desired estimate then follows from
(8.20).
Our object is to show that each term on the left above is ∼ x log x as x →
∞. Suppose, to the contrary, that ψ(x) is somewhat larger than anticipated,
say ψ(x) = ax with a > 1. By combining Mertens’ estimate Σ_{n≤x} Λ(n)/n =
log x + O(1) with (8.22), we see that ψ(y)/y is on average approximately 2 − a
as y runs over the points x/p^k, counted with the appropriate weights. Note that
2 − a < 1. That is, if x is chosen so that ψ(x) is unusually large, then ψ(x/p^k)
must be unusually small for many prime powers p^k. Such an argument may
be repeated, so that one finds that ψ(x/(p^k q^ℓ)) is unusually large for many
prime powers q^ℓ. The points x/p^k and x/(p^k q^ℓ) are highly interlacing, so that
ψ(y) would have to switch rapidly back and forth between large and small
values. However, ψ(x) is a (weakly) increasing function, which implies that
if it is unusually large at one point, then it continues to be unusually large for
some time after. More precisely, if ψ(x) ≥ ax with a > 1, then ψ(y) ≥ √a y
uniformly for x ≤ y ≤ √a x. Similarly, if ψ(x) ≤ bx with b < 1 then ψ(y) ≤
√b y uniformly for √b x ≤ y ≤ x. Of course an interval on which ψ(y) is
large cannot overlap with one on which ψ(y) is small. One expects to reach a
contradiction by showing that these intervals are too numerous and too long to
all fit in the interval [1, x]. Our remaining task is to convert this intuitive line
of reasoning into a rigorous proof.
Let R(x) be defined by the relation ψ(x) = x + R(x). By combining the
estimate of Mertens cited above with (8.22) we see that
\[ R(x) \log x + \sum_{n \le x} R(x/n) \Lambda(n) \ll x. \tag{8.23} \]
Here the sum is a weighted average of values of R, but the total amount of
weight, Σ_{n≤x} Λ(n) = ψ(x), remains in doubt. To overcome this difficulty, we
iterate the identity (8.23) as follows: By replacing x in (8.23) by x/m we find
that
\[ R(x/m) \log x/m + \sum_{n \le x/m} R(x/(mn)) \Lambda(n) \ll x/m. \]
We multiply this by Λ(m) and sum over all m ≤ x, and thus find that
\[ \sum_{m \le x} R(x/m) \Lambda(m) \log x/m + \sum_{mn \le x} R(x/(mn)) \Lambda(m) \Lambda(n) \ll x \log x. \]
We multiply both sides of (8.23) by log x and subtract the above to see that
\[ R(x)(\log x)^2 = -\sum_{n \le x} R(x/n) \Lambda(n) \log n + \sum_{mn \le x} R(x/(mn)) \Lambda(m) \Lambda(n) + O(x \log x). \tag{8.24} \]
This has the advantage over (8.23) that we know how much weight resides in the
coefficients on the right-hand side, by virtue of Theorem 8.3. We now formulate
a Tauberian principle that is appropriate to estimate the above expression.
when v ≥ u. Then
\[ \sum_{n \le x} (a_n - b_n) r(x/n) \le \Big( \beta - \frac{\beta^2}{100} + o(1) \Big) x (\log x)^2. \]
for all large y. Then (8.32) follows on summing this over y = x16^{−k}, 1 ≤ k ≤
[(log x)/log 16]. In proving (8.33) we consider three cases.
Case 1. r(u) ≤ (1/2)βu for all u ∈ [x/(16y), x/(4y)]. Then r(x/n) ≤ (1/2)βx/n for all n ∈
[4y, 16y], and hence
\[ \sum_{y < n \le 16y} a_n \Big( \frac{\beta x}{n} - r(x/n) \Big) \ge \frac{1}{2} \beta x \sum_{4y < n \le 16y} \frac{a_n}{n}. \]
Thus we have (8.33) in this case also, and the proof of Lemma 8.4 is complete.
To complete the proof of the Prime Number Theorem we apply Lemma 8.4
with
\[ a_n = \Lambda(n) \log n, \qquad b_n = \sum_{bc = n} \Lambda(b) \Lambda(c). \]
This gives (8.25), and (8.27) is Selberg’s identity as expressed in Theorem 8.3.
To obtain (8.26) it suffices to subtract (8.38) from (8.27). We apply the lemma
with r(u) = R(u) = ψ(u) − u. Then
\[ r(v) - r(u) = \sum_{u < n \le v} \Lambda(n) - (v - u) \ge -(v - u), \]
so we have (8.28). Let α = lim sup |r(u)|/u. Our object is to show that α = 0.
We know that α ≤ 1/2, by Chebyshev’s estimates. Suppose that α > 0, and
8.2.1 Exercises
1. For which entries in Table 8.1 are A(x) and B(x) summatory functions of
arithmetic functions a(n) and b(n) related as in (8.15)?
2. If A(x) = M(x) := Σ_{n≤x} µ(n) in (8.14), then what is the function B(x)?
3. (a) Verify the Dirichlet series identity
\[ \Big(\frac{\zeta'}{\zeta}\Big)'(s) + \Big(\frac{\zeta'}{\zeta}\Big)^2(s) = \frac{\zeta''}{\zeta}(s). \]
(b) Compute the Dirichlet series coefficients of the three functions in the
above identity, and thus give a proof of (8.18) by means of formal Dirichlet
series.
(c) Compute the leading term of the Laurent expansions of the three functions
above, at the point s = 1.
(d) Suppose that ρ is a zero of ζ(s) of multiplicity m > 0. Compute the
singular portion of the Laurent expansions of the three functions above,
at s = ρ. Note that the pole of ζ''/ζ at s = ρ is simple if and only if ρ
is a simple zero of ζ(s).
4. Let a = lim sup_{x→∞} ψ(x)/x and b = lim inf_{x→∞} ψ(x)/x. Suppose that a
sequence x_ν tending to infinity is chosen so that lim_{ν→∞} ψ(x_ν)/x_ν = a. Use
(8.22) to show that for each ν a prime p_ν can be selected so that x_ν/p_ν → ∞
and lim inf_{ν→∞} ψ(x_ν/p_ν)/(x_ν/p_ν) ≤ 2 − a. Thus show that a + b ≤ 2. By
a similar argument, show that a + b ≥ 2. Hence demonstrate that the relation
a + b = 2 is a consequence of (8.22).
5. (a) Show that
\[ \log x \sum_{\substack{p^k \le x \\ k \ge 2}} \log p + \sum_{\substack{p^k q^\ell \le x \\ k + \ell \ge 3}} (\log p) \log q \ll x. \]
6. Show that Σ_{d|n} µ(d)(log n/d)^2 = Λ(n) log n + Σ_{d|n} Λ(d)Λ(n/d).
7. Let k be a positive integer, and put
\[ \Lambda_k(n) = \sum_{d \mid n} \mu(d) (\log n/d)^k. \]
(b) Show that Λ_k(n) ≥ 0 for all n, and that if Λ_k(n) > 0, then ω(n) ≤ k.
8. Let c and M be positive constants, and suppose that f(x) is a function
defined on [1, ∞) such that (i) |∫_1^x f(u)u^{−2} du| ≤ M for all x ≥ 1, and also
(ii) |f(u) − f(v)| ≤ c|u − v| whenever u ≥ 1 and v ≥ 1. Put
\[ \alpha = \limsup_{x \to \infty} \frac{|f(x)|}{x}, \qquad \beta = \limsup_{x \to \infty} \frac{1}{\log x} \int_1^x \frac{|f(u)|}{u^2}\,du. \]
Show that β ≤ α(1 − α^2/(32cM)).
when s is near 1 + it. Since this is possible only when m = 0, we have (8.39).
The above observations can be paraphrased as ‘the Prime Number Theorem
is equivalent to the assertion (8.39)’, although one needs to bear in mind the
continuity conditions also.
Suppose that α(s) = Σ_{n=1}^∞ a_n n^{−s}. In Section 5.2 we derived information
concerning partial sums of this series at s = 1 from the behaviour of α(σ) as
σ → 1^+. We now take much stronger hypotheses that concern α(s) throughout
the closed half-plane σ ≥ 1, but we obtain from them much stronger conclu-
sions, concerning partial sums of the series at s = 0. Our proof of the Hardy–
Littlewood Tauberian theorem (Theorem 5.7) depended on a simple lemma con-
cerning one-sided polynomial approximation (Lemma 5.8). Our new approach
depends similarly on a corresponding lemma concerning one-sided trigonomet-
ric approximation, as follows.
Lemma 8.5 Let $E(x) = e^x$ for x ≤ 0, and E(x) = 0 for x > 0. For any given
ε > 0 there is a T and continuous functions $f_+(x)$, $f_-(x)$ with $f_\pm \in L^1(\mathbb{R})$
such that
(i) $f_-(x) \le E(x) \le f_+(x)$ for all real x;
(ii) $\widehat{f}_\pm(t) = 0$ for |t| ≥ T;
(iii) $\int_{-\infty}^{\infty} f_+(x)\,dx < 1 + \varepsilon$, $\int_{-\infty}^{\infty} f_-(x)\,dx > 1 - \varepsilon$.
converges for all s with σ > 1, and that r (s) := α(s) − c/(s − 1) extends to a
continuous function in the closed half-plane σ ≥ 1. Then
\[ \int_0^x 1\,da(u) = c e^{x} + o(e^{x}) \]
as x → ∞.
By making the change of variable $a(u) = A(e^u)$, we obtain the following
equivalent formulation.
Corollary 8.7 (Wiener–Ikehara) Suppose that A(v) is non-negative and increasing
on [1, ∞), that
\[ \alpha(s) = \int_1^\infty v^{-s}\,dA(v) \]
converges for all s with σ > 1, and that r(s) := α(s) − c/(s − 1) extends to a
continuous function in the closed half-plane σ ≥ 1. Then
\[ \int_1^x 1\,dA(v) = cx + o(x) \]
as x → ∞.
By setting $A(v) = \sum_{n<v} a_n$ we obtain a useful Tauberian theorem for Dirichlet
series.
Corollary 8.8 (Wiener–Ikehara) Suppose that $a_n \ge 0$ for all n, that
\[ \alpha(s) = \sum_{n=1}^\infty a_n n^{-s} \]
converges for all s with σ > 1, and that r(s) := α(s) − c/(s − 1) extends to a
continuous function in the closed half-plane σ ≥ 1. Then
\[ \sum_{n\le x} a_n = cx + o(x) \]
as x → ∞.
By taking $a_n = \Lambda(n)$, we see that (8.39) gives the hypotheses with c = 1,
and hence we obtain the Prime Number Theorem in the form (8.2).
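As a numerical illustration of the conclusion of Corollary 8.8 with $a_n = \Lambda(n)$, one can tabulate ψ(x)/x, where ψ(x) is the sum of Λ(n) over n ≤ x. The sketch below assumes only the definition of Λ; the function name `chebyshev_psi` is an illustrative choice, and the computation is evidence for the asymptotic, not a substitute for the Tauberian argument.

```python
from math import log

def chebyshev_psi(x: int) -> float:
    """Return psi(x) = sum of Lambda(n) over n <= x, using a sieve of Eratosthenes."""
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    total = 0.0
    for p in range(2, x + 1):
        if is_prime[p]:
            # Cross off the multiples of p.
            for m in range(p * p, x + 1, p):
                is_prime[m] = False
            # Each prime power p^k <= x contributes log p.
            pk = p
            while pk <= x:
                total += log(p)
                pk *= p
    return total

# The ratio psi(x)/x should approach c = 1 as x grows.
ratios = {x: chebyshev_psi(x) / x for x in (10**3, 10**4, 10**5)}
```

The dictionary `ratios` records how close ψ(x)/x is to 1 at a few sample points.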
Proof of Theorem 8.6 Take δ > 0, and let E(u) be as in Lemma 8.5. Then
\[ \int_0^x e^{-\delta u}\,da(u) = e^{x}\int_0^\infty E(u-x)\,e^{-(1+\delta)u}\,da(u), \]
By (8.40) this is
\[ = e^{x}\int_0^\infty \Big(\int_{-T}^{T} \widehat{f}_+(t)\,e(tu - tx)\,dt\Big) e^{-(1+\delta)u}\,da(u). \]
If $a(u) = e^u$, then α(s) = 1/(s − 1), and thus from the above calculation we
see in particular that
\[ \int_0^\infty f_+(u-x)\,e^{-\delta u}\,du = \int_{-T}^{T} \widehat{f}_+(t)\,e(-tx)\,\frac{1}{\delta - 2\pi i t}\,dt. \]
On multiplying both sides by $ce^x$ and combining this with (8.41), we deduce
that
\[ \int_0^x e^{-\delta u}\,da(u) \le e^{x}\int_{-T}^{T} \widehat{f}_+(t)\,e(-tx)\,r(1+\delta-2\pi i t)\,dt + c e^{x}\int_0^\infty f_+(u-x)\,e^{-\delta u}\,du. \]
We divide through by $e^x$ and let x tend to infinity. The first integral on the right
tends to 0 by the Riemann–Lebesgue lemma, and the second integral on the
right tends to $\int_{-\infty}^{\infty} f_+(u)\,du$. Thus we see that
\[ \limsup_{x\to\infty} e^{-x}\int_0^x 1\,da(u) \le c\int_{-\infty}^{\infty} f_+(u)\,du \le c(1+\varepsilon) \]
Since ε may be taken arbitrarily small, we obtain the stated result, apart from
the need to prove Lemma 8.5.
8.3 The Wiener–Ikehara Tauberian theorem 263
This is a weighted average of the values of E(u) with special emphasis on those
u near x. We show that
To establish this we consider several cases. If |x| ≤ 1/T we simply observe
that $0 \le f(x) \le \int_{-\infty}^{\infty} J_T(u)\,du = 1$. If x ≥ 1/T we observe that
\[ 0 \le f(x) \ll T^{-3}\int_{-\infty}^{0} (x-u)^{-4}\,du \ll 1/(Tx)^3. \]
By the calculus of residues it is easy to show
that $\int_{-\infty}^{\infty} J_T(u)\,du = 1$. Hence
\[ f(x) - E(x) = \int_{-\infty}^{\infty} (E(u) - E(x))\,J_T(x-u)\,du. \]
Here the first integral on the right vanishes because the integrand is an odd
function, and the second integral is $\ll 1/T^2$. On the other hand,
\[ \int_0^\infty (E(u) - E(x))\,J_T(x-u)\,du \ll T^{-3}\int_{-x}^{\infty} u^{-4}\,du \ll 1/|Tx|^3, \]
and similarly $\int_{-\infty}^{2x} \ll 1/|Tx|^3$, so we have (8.42) in this case also. Finally,
suppose that x ≤ −1. Then $E(u) - E(x) = e^x\big(u - x + O((u-x)^2)\big)$ for
x − 1 ≤ u ≤ x + 1, so that
\[ \int_{x-1}^{x+1} (E(u) - E(x))\,J_T(x-u)\,du = -e^{x}\int_{-1}^{1} u\,J_T(u)\,du + O\Big(e^{x}\int_{-1}^{1} u^2 J_T(u)\,du\Big) \ll e^{x}T^{-2}, \]
264 Further discussion of the Prime Number Theorem
and
\[ \int_{x+1}^{\infty} (E(u) - E(x))\,J_T(x-u)\,du \ll T^{-3}x^{-4}, \]
8.3.1 Exercises
1. Use the Wiener–Ikehara theorem (Theorem 8.6) to show that M(x) = o(x).
2. (Dressler 1970; cf. Bateman 1972) Let f (n) denote the number of positive
integers k such that ϕ(k) = n.
(a) Show that if σ > 1, then
\[ \sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \sum_{k=1}^{\infty} \frac{1}{\varphi(k)^s} = \prod_p \Big(1 + \frac{1}{\varphi(p)^s} + \frac{1}{\varphi(p^2)^s} + \cdots\Big), \]
and explain why this is not an Euler product in the usual sense.
(b) Let the above Dirichlet series be F(s). Show that F(s) = ζ(s)G(s) for
σ > 1, where
\[ G(s) = \prod_p \Big(1 - \frac{1}{p^s} + \frac{1}{(p-1)^s}\Big). \]
(c) By writing
\[ \frac{1}{(p-1)^s} - \frac{1}{p^s} = s\int_{p-1}^{p} u^{-s-1}\,du, \]
(b) Explain why the right-hand side above is $= T\widehat{f}_+(0) = T\int_{\mathbb{R}} f_+(x)\,dx$.
(c) Explain why the left-hand side above is $\ge (1 - e^{-1/T})^{-1}$.
(d) Deduce that
\[ \int_{\mathbb{R}} f_+(x)\,dx \ge \frac{1}{T(1 - e^{-1/T})}. \]
(e) Suppose that T ≥ 2. Show that the right-hand side above is $= 1 + 1/(2T) + O(1/T^2)$.
(f) Show similarly that
\[ \int_{\mathbb{R}} f_-(x)\,dx \le \frac{1}{T(e^{1/T} - 1)}, \]
and that the right-hand side is $= 1 - 1/(2T) + O(1/T^2)$ when T ≥ 2.
This is an ordinary Dirichlet series, since the N(a) are positive integers, and
thus the above can be written in the form $\sum a_n n^{-s}$ where $a_n$ is the number of
ideals with norm n.
Counting ideals a with N(a) ≤ x is rather like counting rational integers. The
ideals can be parametrized by the points of a lattice in $\mathbb{R}^d$, so one is counting
lattice points in a certain region, which is approximately the volume of that
region, and thus it can be shown that the number I(x) of ideals a with N(a) ≤ x is
\[ I(x) = cx + O\big(x^{1-1/d}\big) \tag{8.43} \]
where c = c(K) is a certain positive constant, called the ideal density. Here
the implicit constant may also depend on K, which we assume is fixed. By
Theorem 1.3 it follows that
\[ \zeta_K(s) = s\int_1^\infty I(x)\,x^{-s-1}\,dx = \frac{cs}{s-1} + s\int_1^\infty (I(x) - cx)\,x^{-s-1}\,dx. \]
8.4 Beurling’s generalized prime numbers 267
Since this latter integral is uniformly convergent for σ > 1 − 1/d + δ, we deduce
that $\zeta_K(s)$ is analytic in the half-plane σ > 1 − 1/d apart from a simple
pole at s = 1 with residue c. Moreover, we see that if δ is fixed, δ > 0, then
$\zeta_K(s) \ll |t|$ uniformly for σ ≥ 1 − 1/d + δ, |t| ≥ 1.
If a and b are two ideals in O K , then
It is notable that the chain of reasoning we have just described depends only
on the estimate (8.43) and the identity (8.44). Thus the entire situation could
be abstracted as follows. Suppose we have a sequence $\mathcal{P}$ of real numbers $p_i$
such that $1 < p_1 \le p_2 \le \cdots$ and $p_i \to \infty$. We call these numbers ‘generalized
primes’. We form products of powers of these numbers, $p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, and
call such products ‘generalized integers’. Let N(x) denote the number of such
products whose value does not exceed x. If
\[ N(x) = cx + O\big(x^{\theta}\big) \tag{8.45} \]
for some c > 0 and θ < 1, then by the reasoning we have outlined it follows
that the number P(x) of generalized primes $p_i$ such that $p_i \le x$ is
$\mathrm{li}(x) + O\big(x\exp(-c\sqrt{\log x})\big)$.
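The counting function N(x) of generalized integers can be made concrete for a finite list of generalized primes. The sketch below is an illustration under that assumption (a finite list, each member greater than 1); the function name `count_generalized_integers` is hypothetical. Products are counted with multiplicity, so that different exponent vectors giving the same real number are counted separately.

```python
def count_generalized_integers(primes, x):
    """Count exponent vectors (a_1, ..., a_k) with p_1^a_1 ... p_k^a_k <= x.

    `primes` is a finite list of real numbers > 1 (the generalized primes);
    repeated entries are allowed and are treated as distinct generators.
    """
    ps = sorted(primes)
    if x < 1:
        return 0

    def rec(i, prod):
        # Number of products prod * (product of powers of ps[i:]) that are <= x.
        # Invariant: prod <= x.
        if i == len(ps) or prod * ps[i] > x:
            return 1  # only the empty extension fits, since ps is sorted
        total = 0
        while prod <= x:
            total += rec(i + 1, prod)
            prod *= ps[i]
        return total

    return rec(0, 1)

# With the rational primes 2, 3, 5, 7 every integer up to 10 is a product.
n_rational = count_generalized_integers([2, 3, 5, 7], 10)
# Non-integral generalized primes: 1, 1.5, 2.25, 3.375, 2.5, 3.75 are <= 4.
n_real = count_generalized_integers([1.5, 2.5], 4)
```

Taking all rational primes up to x recovers N(x) = [x], the case in which (8.45) holds in the strong form [x] = x + O(1).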
The integers Z form an additive group, a cyclic group generated by the
number 1. Moreover, the positive integers form a multiplicative semigroup
with the primes as generators. From the additive property of the integers we
know that [x] = x + O(1), which is a strong form of (8.45). However, it is now
quite clear that our proof of the Prime Number Theorem requires no further
knowledge of the additive nature of the integers beyond this estimate.
We have seen that the estimate (8.45) gives a generalization of the Prime
Number Theorem with the classical error term. We now consider the issue of
how much this hypothesis can be weakened, if the goal is only to obtain a
generalization of (8.1), namely that P(x) ∼ x/ log x as x → ∞.
Theorem 8.10 (Beurling) Let $\mathcal{P} = \{p_i\}$ where $1 < p_1 \le p_2 \le \cdots$ and $p_i \to \infty$,
and let N(x) denote the number of products $p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} \le x$ where the
$a_i$ are non-negative integers. Suppose that there is a positive constant c such
that
\[ N(x) = cx + O\Big(\frac{x}{(\log x)^{\gamma}}\Big) \tag{8.46} \]
for x ≥ 2. Let P(x) denote the number of members of $\mathcal{P}$ not exceeding x. If
γ > 3/2, then
\[ P(x) \sim \frac{x}{\log x} \tag{8.47} \]
as x → ∞.
Proof Let $\mathcal{N} = \{n_j\}$ where $1 = n_1 < n_2 \le n_3 \le \cdots$ are the generalized integers,
and for σ > 1 let
\[ \zeta_{\mathcal{P}}(s) = \sum_{n\in\mathcal{N}} n^{-s}. \]
Since the n ∈ N are not necessarily rational integers, the above is not necessarily
an ordinary Dirichlet series, but it is an example of a ‘generalized Dirichlet
series’. In any case it is an absolutely convergent series and by integration by
parts as in the proof of Theorem 1.3 we see that
\[ \zeta_{\mathcal{P}}(s) = \int_{1^-}^{\infty} u^{-s}\,dN(u) = s\int_1^\infty N(u)\,u^{-s-1}\,du. \]
From (8.46) we know that $\int_1^\infty |N(u) - cu|\,u^{-2}\,du < \infty$. Hence the integral
above is uniformly convergent for σ ≥ 1, and consequently it is continuous in
this closed half-plane. Thus we can extend the definition of $\zeta_{\mathcal{P}}(s)$ so that
$\zeta_{\mathcal{P}}(s) = c/(s-1) + r_0(s)$ and $r_0(s)$ is continuous for σ ≥ 1. To bound the modulus
of continuity of $r_0(s)$ we differentiate. Thus $\zeta_{\mathcal{P}}'(s) = -c/(s-1)^2 + r_1(s)$ for
σ > 1 where
\[ r_1(s) = r_0'(s) = \int_1^\infty (N(u) - cu)\,u^{-s-1}\,du - s\int_1^\infty (N(u) - cu)(\log u)\,u^{-s-1}\,du. \]
If (8.46) holds with γ > 2, then $\int_1^\infty |N(u) - cu|(\log u)\,u^{-2}\,du < \infty$ and then
r1 (s) is continuous in the closed half-plane σ ≥ 1. When γ is smaller, however,
the situation is more delicate. From now on we assume, as we may, that 3/2 <
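γ ≤ 2. Since
\[ \int_2^\infty (\log u)^{1-\gamma}u^{-\sigma}\,du = \int_{\log 2}^{\infty} v^{1-\gamma}e^{-(\sigma-1)v}\,dv = (\sigma-1)^{\gamma-2}\int_{(\sigma-1)\log 2}^{\infty} u^{1-\gamma}e^{-u}\,du \ll (\sigma-1)^{-\frac12+\eta}, \]
where η = η(γ) > 0, from (8.46) we deduce that $r_1(s) \ll (\sigma-1)^{-\frac12+\eta}$ uniformly
for σ > 1. Consequently, if t is fixed, t ≠ 0, then
\[ \zeta_{\mathcal{P}}(\sigma+it) - \zeta_{\mathcal{P}}(1+it) = \int_1^{\sigma} \zeta_{\mathcal{P}}'(\alpha+it)\,d\alpha \ll (\sigma-1)^{\frac12+\eta} \tag{8.48} \]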
for σ > 1. This product is absolutely convergent, and each factor is non-zero,
so $\zeta_{\mathcal{P}}(s) \ne 0$ for σ > 1, and indeed we may write
\[ \log\zeta_{\mathcal{P}}(s) = \sum_{p\in\mathcal{P}}\sum_{r=1}^{\infty} \frac{1}{r}\,p^{-rs}. \tag{8.50} \]
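case that $a_1 < 2a_0$, but we can make $a_1$ as close to $2a_0$ as we wish by using the
Fejér kernel $\Delta_K(\theta)$ with K large, since
\[ \Delta_K(\theta) = 1 + 2\sum_{k=1}^{K}\Big(1 - \frac{k}{K}\Big)\cos 2\pi k\theta = \frac{1}{K}\Big(\frac{\sin \pi K\theta}{\sin \pi\theta}\Big)^{2} \ge 0. \]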
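The Fejér kernel identity used here, namely that $1 + 2\sum_{k=1}^{K}(1 - k/K)\cos 2\pi k\theta$ equals $K^{-1}(\sin \pi K\theta/\sin \pi\theta)^2$, is easy to check numerically. The following sketch compares the two expressions at a few non-integral θ; the function names are illustrative and the kernel symbol is written out in words.

```python
from math import cos, sin, pi

def fejer_sum(K: int, theta: float) -> float:
    """Cosine-polynomial form: 1 + 2 * sum_{k=1}^{K} (1 - k/K) cos(2 pi k theta)."""
    return 1 + 2 * sum((1 - k / K) * cos(2 * pi * k * theta) for k in range(1, K + 1))

def fejer_closed(K: int, theta: float) -> float:
    """Closed form (1/K) (sin(pi K theta) / sin(pi theta))^2, for non-integral theta."""
    return (sin(pi * K * theta) / sin(pi * theta)) ** 2 / K

K = 25
thetas = [0.013, 0.3, 0.5, 0.77]
max_gap = max(abs(fejer_sum(K, t) - fejer_closed(K, t)) for t in thetas)

# Constant term a_0 = 1 and first cosine coefficient a_1 = 2 (1 - 1/K):
a0, a1 = 1.0, 2 * (1 - 1 / K)
```

The last line records why the kernel is useful in this argument: the ratio $a_1/a_0 = 2(1 - 1/K)$ stays below 2 but approaches 2 as K grows, while the polynomial remains non-negative.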
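Hence if σ > 1, then
\[ \prod_{k=-K}^{K} \zeta_{\mathcal{P}}(\sigma+ikt)^{(1-|k|/K)} = \exp\Big(\sum_{p\in\mathcal{P}}\sum_{r=1}^{\infty} \frac{1}{r\,p^{r\sigma}} \sum_{k=-K}^{K} (1-|k|/K)\,p^{-irkt}\Big) = \exp\Big(\sum_{p\in\mathcal{P}}\sum_{r=1}^{\infty} \frac{1}{r\,p^{r\sigma}}\,\Delta_K\big(rt(\log p)/(2\pi)\big)\Big). \]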
Suppose for the moment that γ > 2. Then $r_0(s)$ and $r_1(s)$ are both continuous
in the closed half-plane σ ≥ 1, and then
\[ -\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \frac{1}{s-1} + r(s) \]
where
\[ r(s) = -\frac{r_0(s) + (s-1)r_1(s)}{(s-1)\,\zeta_{\mathcal{P}}(s)} \]
and
\[ -\zeta_{\mathcal{P}}'(s) = \frac{c}{(s-1)^2} - \frac{r_0(s) - c}{s} + sJ(s). \]
Thus
\[ -\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) - \frac{1}{s-1} = \frac{c(s-1) + (1-2s)\,r_0(s)}{s(s-1)\,\zeta_{\mathcal{P}}(s)} + \frac{s}{\zeta_{\mathcal{P}}(s)}\,J(s) \]
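and by splitting the integral at X, where X is a large parameter we have
\[ -\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) - \frac{1}{s-1} = C(s) + R(s) \]
where
\[ R(s) = \int_X^\infty (N(u) - cu)(\log u)\,u^{-s-1}\,du \]
which by (8.46) is
\[ \ll \int_X^\infty u^{-1}(\log u)^{2-2\gamma}\,du \ll_{\gamma} (\log X)^{3-2\gamma} \]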
uniformly for δ > 0. The first integral on the right-hand side of (8.53) is also
uniformly bounded as δ tends to 0, since $\zeta_{\mathcal{P}}(1+it) \ne 0$. Thus the contribution
of R(s) to (8.52) is $\ll_{\gamma} (\log X)^{3/2-\gamma}$, uniformly for δ > 0. Hence if we let δ
tend to 0 from above in (8.52), and divide through by x, we find that
\[ \frac{S(x)}{x} \le \int_1^\infty u^{-1}f_+(\log u - \log x)\,du + \int_{-T}^{T} \widehat{f}_+(t)\,x^{-2\pi i t}\,C(1-2\pi i t)\,dt + O_{\gamma}\big((\log X)^{3/2-\gamma}\big). \]
As x tends to infinity, the first integral on the right tends to $\int_{-\infty}^{\infty} f_+(v)\,dv$. Since
$\widehat{f}_+(t)\,C(1-2\pi i t)$ is a continuous function of t, by the Riemann–Lebesgue
lemma the second integral on the right tends to 0 as x tends to infinity. Hence
\[ \limsup_{x\to\infty} \frac{S(x)}{x} \le \int_{-\infty}^{\infty} f_+(v)\,dv + O_{\gamma}\big((\log X)^{3/2-\gamma}\big). \]
By Lemma 8.5 we know that the integral on the right is < 1 + ε if T is suffi-
ciently large. Since X may also be taken arbitrarily large, we conclude that the
limsup above is ≤ 1. By a similar argument with f + replaced by f − , we find
that the corresponding liminf is ≥ 1, so we have the generalized Prime Number
Theorem in the form S(x) ∼ x. By integrating by parts we obtain the desired
relation (8.47).
We note that this function is increasing and tends to infinity with x. Hence for
each positive integer j there is a unique real number $p_j$ such that $f(p_j) = j$. If
$p_j \le x < p_{j+1}$, then P(x) = j and j ≤ f(x) < j + 1; hence P(x) = [f(x)].
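By integration by parts we see that
\[ \int_2^x \frac{u^{i\alpha}}{\log u}\,du = \frac{x^{1+i\alpha}}{(1+i\alpha)\log x} + O\Big(\frac{x}{(\log x)^2}\Big). \]
By taking α = −a, 0, a, and combining, we see that
\[ f(x) = \Big(1 - \frac{x^{ia}}{2(1+ia)} - \frac{x^{-ia}}{2(1-ia)}\Big)\frac{x}{\log x} + O\Big(\frac{x}{(\log x)^2}\Big), \]
and consequently
\[ \liminf_{x\to\infty} \frac{P(x)}{x/\log x} = 1 - \frac{1}{\sqrt{1+a^2}}, \qquad \limsup_{x\to\infty} \frac{P(x)}{x/\log x} = 1 + \frac{1}{\sqrt{1+a^2}}. \]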
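The values of the liminf and limsup come from the fact that $x^{ia}/(2(1+ia))$ and $x^{-ia}/(2(1-ia))$ are complex conjugates, so the bracketed factor equals $1 - \mathrm{Re}\,\big(x^{ia}/(1+ia)\big)$, which oscillates with amplitude $1/|1+ia| = 1/\sqrt{1+a^2}$. The sketch below samples this factor over one period in log x; the function names and the sample value a = 2 are illustrative assumptions.

```python
import cmath
from math import pi, sqrt

def main_term_factor(a: float, log_x: float) -> float:
    """Return 1 - Re(x^{ia} / (1 + ia)), the coefficient of x / log x in f(x)."""
    return 1.0 - (cmath.exp(1j * a * log_x) / (1 + 1j * a)).real

def extremes(a: float, samples: int = 10000):
    """Sample one full period 2*pi/a of log x and return (min, max) of the factor."""
    period = 2 * pi / a
    values = [main_term_factor(a, period * j / samples) for j in range(samples)]
    return min(values), max(values)

a = 2.0
low, high = extremes(a)
predicted_low = 1 - 1 / sqrt(1 + a * a)
predicted_high = 1 + 1 / sqrt(1 + a * a)
```

The sampled extremes `low` and `high` should agree with the predicted values up to the resolution of the grid.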
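Clearly
\[ \sum_{\substack{p\in\mathcal{P} \\ p\le x}} \log p = \int_1^x \log u\,d[f(u)] = \int_1^x \log u\,df(u) - \int_1^x \log u\,d\{f(u)\} \]
\[ = \int_1^x \big(1 - \cos(a\log u)\big)\,du - \Big[\{f(u)\}\log u\Big]_1^x + \int_1^x \frac{\{f(u)\}}{u}\,du \]
\[ = x - \frac{x^{1+ia}}{2(1+ia)} - \frac{x^{1-ia}}{2(1-ia)} + O(\log x), \]
and hence
\[ S(x) = x - \frac{x^{1+ia}}{2(1+ia)} - \frac{x^{1-ia}}{2(1-ia)} + O\big(x^{1/2}\big). \]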
Let r(x) denote this last error term. Then for σ > 1,
\[ -\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \int_1^\infty u^{-s}\,dS(u) = \frac{1}{s-1} - \frac{1}{2(s-1-ia)} - \frac{1}{2(s-1+ia)} + g(s) \]
where g(s) is analytic for σ > 1/2. Hence
\[ \log\zeta_{\mathcal{P}}(s) = -\log(s-1) + \frac12\log(s-1-ia) + \frac12\log(s-1+ia) + G(s) \]
where $G'(s) = -g(s)$, and so we have (8.54) with $H(s) = e^{G(s)}$.
To complete the proof we need not only (8.54) but also an estimate of the
size of $\zeta_{\mathcal{P}}(s)$ when σ < 1. To this end we mimic the approach used to estimate
1/ζ(s) in Theorem 6.7. Since $P(x) \ll x/\log x$ it follows that
$\log\zeta_{\mathcal{P}}(1+\delta+it) \ll \log 1/\delta$ uniformly for 0 < δ ≤ 1/2. If t ≥ 4 + a and
1 − 1/log t ≤ σ ≤ 1 + 1/log t, then
\[ -\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \sum_{\substack{n\le t^2 \\ n\in\mathcal{N}}} \Lambda(n)\,n^{-s} + \int_{t^2}^{\infty} u^{-s}\,dS(u). \]
so that
\[ \log\zeta_{\mathcal{P}}(s) = -\int_{\sigma}^{1+1/\log t} \frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(\alpha+it)\,d\alpha + \log\zeta_{\mathcal{P}}(1+1/\log t+it) \ll 1 + \log\log t \]
for σ ≥ 1 − 1/log t. Hence there is a constant A such that $\zeta_{\mathcal{P}}(s) \ll (\log t)^A$ for
σ ≥ 1 − 1/log t, t ≥ 4 + a.
We now estimate N (x) by taking an inverse Mellin transform of ζP (s).
However, the truncated Perron formula (Corollary 5.3) is not so useful since
we lack information concerning the number of generalized integers in a short
interval. To avoid this difficulty we use Cesàro weights as discussed in Section
5.1, by means of which we see that if b > 1 and h > 0, then
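\[ \frac{1}{2\pi i}\int_{b-i\infty}^{b+i\infty} \zeta_{\mathcal{P}}(s)\,\frac{(x+h)^{s+1} - x^{s+1}}{h\,s(s+1)}\,ds = \sum_{n\in\mathcal{N}} w_+(n) \]
where
\[ w_+(u) = \begin{cases} 1 & (u \le x), \\ (x+h-u)/h & (x < u \le x+h), \\ 0 & (u > x+h). \end{cases} \]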
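The weight $w_+$ equals 1 up to x and decays linearly to 0 on (x, x + h], so the weighted sum over generalized integers lies between N(x) and N(x + h). The sketch below illustrates this with the ordinary positive integers standing in for the generalized integers (an assumption made purely for the example, so that N(x) = [x] and c = 1); the function name `w_plus` is illustrative.

```python
def w_plus(u: float, x: float, h: float) -> float:
    """Cesaro-type weight: 1 for u <= x, linear decay on (x, x + h], 0 beyond x + h."""
    if u <= x:
        return 1.0
    if u <= x + h:
        return (x + h - u) / h
    return 0.0

x, h = 100, 10
# Ordinary integers used as a stand-in for the generalized integers.
weighted = sum(w_plus(n, x, h) for n in range(1, x + h + 1))
count_x = x            # N(x) for the ordinary integers
count_xh = x + h       # N(x + h)
```

For these values the weighted sum is 104.5, which sits between N(x) = 100 and N(x + h) = 110 and is close to the residue term x + h/2 = 105.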
We now pull the contour to the left. In view of (8.54), at s = 1 we encounter a
simple pole with residue c(x + h/2) where c = aH(1). Because of the branch
points at 1 ± ia, we slit the plane by the segments σ ± ia for −∞ < σ ≤ 1.
Our contour follows the upper and lower sides of these segments; the integral
along these loops is $\ll \int_{-\infty}^{1} (x+h)^{\sigma}(1-\sigma)^{1/2}\,d\sigma \ll x/(\log x)^{3/2}$. By taking
more care, and using Theorem C.3, we could obtain oscillatory main terms of
this order of magnitude. On the rest of the contour we estimate the integral as
in the proof of the Prime Number Theorem, and thus we see that
\[ N(x) \le \sum_{n\in\mathcal{N}} w_+(n) = cx + \frac12 ch + O\Big(\frac{x}{(\log x)^{3/2}}\Big) + O\Big(\frac{x^2}{h}\exp\big(-C\sqrt{\log x}\big)\Big). \]
On taking h = x/(log x)2 we obtain an upper bound of the desired type. To
obtain a corresponding lower bound we argue similarly from the formula
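\[ \frac{1}{2\pi i}\int_{b-i\infty}^{b+i\infty} \zeta_{\mathcal{P}}(s)\,\frac{x^{s+1} - (x-h)^{s+1}}{h\,s(s+1)}\,ds = \sum_{n\in\mathcal{N}} w_-(n) \]
where
\[ w_-(u) = \begin{cases} 1 & (u \le x-h), \\ (x-u)/h & (x-h < u \le x), \\ 0 & (u \ge x). \end{cases} \]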
8.5 Notes
Section 8.1. Historical accounts of the development of prime number theory
and of the various proofs of the Prime Number Theorem have been given
by Bateman & Diamond (1996), Narkiewicz (2000), and by Schwarz (1994).
Axer’s theorem originates in Axer (1911). The definitive account of Axer’s
theorem is that of Landau (1912).
Section 8.2. In former times, an argument was considered to be ‘non-
elementary’ if it involved Cauchy’s theorem or Fourier inversion. Prior to Sel-
berg’s elementary proof of the Prime Number Theorem, a distinction was drawn
between those results that could be obtained by elementary arguments, and those
that could not. Selberg’s elementary proof rendered the terminology nugatory.
Theorem 8.3 and a deduction of the Prime Number Theorem occur in Selberg
(1949). There are a number of variants of the less than straightforward Tauberian
process used in the deduction; see, for example, Erdős (1949), Wright (1952),
and Levinson (1969). For a historical review of elementary proofs of the Prime
Number Theorem see Goldfeld (2004).
Quantitative estimates of the form
\[ \pi(x) = \mathrm{li}(x)\big(1 + O((\log x)^{-a})\big) \]
have been derived by elementary methods. van der Corput (1956) obtained
a = 1/200, Kuhn (1955) obtained a = 1/10, Breusch (1960) a = 1/6 − ε, and
Wirsing (1962) a = 3/4. Then Bombieri (1962a,b) and Wirsing (1964) showed
that the above is true for any fixed positive a. Subsequently, elementary techniques
have been used to show that
\[ \pi(x) = \mathrm{li}(x) + O\big(x\exp(-c(\log x)^{b})\big) \]
for various values of b. Diamond & Steinig (1970) obtained b = 1/7 − ε, Lavrik
& Sobirov (1973) b = 1/6 − ε, and Srinivasan & Sampath (1988) b = 1/6.
Although the estimates obtained by elementary methods have thus far been
weaker than those derived by analytic means, we have no reason to believe that
this will always be the case.
Section 8.3. The theorem of Ikehara (1931) represented a major advance,
because it gave for the first time a Tauberian theorem that could be used to
prove the Prime Number Theorem without imposing growth conditions on the
Dirichlet series generating function. Ikehara assumed that α(s) − c/(s − 1) is
analytic in the closed half-plane σ ≥ 1. Wiener (1932) showed that mere conti-
nuity is enough, but this is of lesser significance, since still weaker hypotheses
are sufficient – see Korevaar (2006).
The heart of the Wiener–Ikehara proof of the Prime Number Theorem is
Lemma 8.5, which has the effect of enabling one to reduce directly to a use
of the Riemann–Lebesgue lemma on a finite section of the line s = 1. In the
proof of Lemma 8.5 we see that it suffices to take T = C/ε, and from Exercise
8.3.5 we see that it is necessary to take T ≥ 1/(2ε) + O(1). Graham & Vaaler
(1981) have shown that f + and f − can be constructed so that equality is achieved
in Exercise 8.3.5(e),(g).
Lemma 8.5, with T small and ε large, is also useful for proving interesting
theorems of Fatou and Riesz. Fatou (1906) showed that if $a_n = o(1)$, then the
series $f(z) = \sum a_n z^n$ converges at any point of the circle |z| = 1 at which f is
analytic. Landau (1910, Section 10) gives Riesz’s proof that if $\sum_{n\le x} a_n = o(x)$,
then the Dirichlet series $\alpha(s) = \sum a_n n^{-s}$ converges at every point of the line
σ = 1 at which α(s) is analytic. Riesz (1916) extended this to generalized
Dirichlet series.
For detailed discussion of Wiener’s Tauberian theorem, the Ikehara theorem,
and Tauberian theorems associated with the elementary proof of the Prime
Number Theorem see Pitt (1958).
Section 8.4. The concept of generalized primes was introduced in Beurling
(1937). The hypothesis of Theorem 8.10 can be weakened: Kahane (1997) has
shown that if
\[ \int_1^\infty (N(x) - cx)^2\,x^{-3}(\log x)^2\,dx < \infty, \]
then (8.47) still follows.
In the negative direction, Hall (1973) showed that if γ < 1, then the hypothesis
(8.46) is not sufficient to imply a Chebyshev estimate. Also, Kahane (1998) has
shown that the hypothesis
\[ \int_1^\infty \frac{|N(x) - cx|}{x^2}\,dx < \infty \]
does not imply a Chebyshev estimate. Zhang (1987b) has shown that if (8.46)
holds with γ > 1, then
\[ \sum_{\substack{n\le x \\ n\in\mathcal{N}}} \mu(n) = o(x). \]
8.6 References
Axer, A. (1911). Über einige Grenzwertsätze, Sitz. Kais. Akad. Wiss. Wien. math-natur.
Klasse 120, 1253–1298.
Balanzario, E. P. (2000). On Chebyshev’s inequalities for Beurling’s generalized primes,
Math. Slovaca 50, No.4, 415–436.
Bateman, P. T. (1972). The distribution of values of the Euler function, Acta Arith. 21,
329–345.
Bateman, P. T. & Diamond, H. G. (1969). Asymptotic distribution of Beurling’s
generalized prime numbers, Studies in Number Theory, W. J. LeVeque, Ed.,
MAA Studies in math. 6. Washington: Mathematical Association of America,
pp. 152–210.
(1996). A hundred years of prime numbers, Amer. Math. Monthly 103, 729–741.
Beurling, A. (1937). Analyse de la loi asymptotique de la distribution des nombres
premiers généralisés, I, Acta Math. 68, 255–291.
Bombieri, E. (1962a). Maggiorazione del resto nel “Primzahlsatz” col metodo di Erdős–
Selberg, Ist. Lombardo Accad. Sci. Lett. Rend. A 96, 343–350.
(1962b). Sulle formule di A. Selberg generalizzate per classi di funzioni aritmetiche
e le applicazioni al problema del resto nel “Primzahlsatz”, Riv. Mat. Univ. Parma
(2) 3, 393–440.
Borel, J.-P. (1980/81). Quelques résultats d’équirépartition liés aux nombres généralisés
de Beurling, Acta Arith. 38, 255–272.
(1984). Sur le prolongement des fonctions ζ associées à un système des nombres
premiers généralisés de Beurling, Acta Arith. 43, 273–282.
Breusch, R. (1960). An elementary proof of the prime number theorem with remainder
term, Pacific J. Math. 10, 487–497.
van der Corput, J. G. (1956). Sur le reste dans la démonstration élémentaire du théorème
des nombres premiers, Colloque sur la Théorie des Nombres (Bruxelles, 1955).
Paris: Masson & Cie, pp. 163–182.
Diamond, H. G. (1969). The prime number theorem for Beurling’s generalized numbers,
J. Number Theory 1, 200–207.
(1970a). Asymptotic distribution of Beurling’s generalized integers, Illinois J. Math.
14, 12–28.
(1970b). A set of generalized numbers showing Beurling’s theorem to be sharp, Illinois
J. Math. 14, 29–34.
(1973). Chebyshev estimates for Beurling generalized prime numbers, Proc. Amer.
Math. Soc. 39, 503–508.
(1977). When do Beurling generalized integers have a density?, J. Reine Angew. Math.
295, 22–39.
Diamond, H. G., Montgomery, H. L., & Vorhauer, U. M. A. (2006). Beurling primes
with large oscillation, Math. Ann., 334, 1–36.
Diamond, H. G. & Steinig, J. (1970). An elementary proof of the prime number theorem
with a remainder term, Invent. Math. 11, 199–258.
Dressler, R. E. (1970). A density which counts multiplicity, Pacific J. Math. 34, 371–378.
Erdős, P. (1949). On a new method in elementary number theory which leads to an
elementary proof of the prime number theorem, Proc. Natl. Acad. Sci. USA 35,
374–384.
Fatou, P. (1906). Séries trigonométriques et séries de Taylor, Acta Math. 30, 335–400.
Goldfeld, D. (2004). The elementary proof of the prime number theorem: an histori-
cal perspective, Number Theory (New York, 2003). New York: Springer-Verlag,
pp. 179–192.
Graham, S. W. & Vaaler, J. D. (1981). A class of extremal functions for the Fourier
transform, Trans. Amer. Math. Soc. 265, 283–302.
Hall, R. S. (1972). The prime number theorem for generalized primes, J. Number Theory
4, 313–320.
(1973). Beurling generalized prime number systems in which the Chebyshev inequal-
ities fail, Proc. Amer. Math. Soc. 40, 79–82.
Hejhal, D. A. (1976). The Selberg Trace Formula for P S L(2, R). Vol. I, Lecture Notes
Math. 548. Berlin: Springer-Verlag.
(1983). The Selberg Trace Formula for P S L(2, R). Vol. 2, Lecture Notes Math. 1001.
Berlin: Springer-Verlag.
Ikehara, S. (1931). An extension of Landau’s theorem in the analytic theory of numbers,
J. Math. Phys. 10, 1–12.
Ingham, A. E. (1945). Some Tauberian theorems connected with the prime number
theorem, J. London Math. Soc. 20, 171–180.
Kahane, J.-P. (1995). Sur travaux de Beurling et Malliavin, Séminaire Bourbaki Vol. 7
Exp. 225, Paris: Soc. Math. France, 27–39.
(1996). Une formula de Fourier pour les nombres premiers. Application aux nombres
premiers généralisés de Beurling, Harmonic analysis from the Pichorides viewpoint
(Anogia, 1995) Publ. Math. Orsay, 96–01, Orsay: Univ. Paris XI, 41–49.
(1997). Sur les nombres premiers généralisés de Beurling. Preuve d’une conjecture
de Bateman et Diamond, J. Théor. Nombres Bordeaux 9, 251–266.
(1998). Le rôle des algèbres A de Wiener, A∞ de Beurling et H 1 de Sobolev
dans la théorie des nombres premiers généralisés de Beurling, Ann. Inst. Fourier
(Grenoble) 48, 611–648.
(1999). Un théorème de Littlewood pour les nombres premiers de Beurling Bull.
London Math. Soc. 31, 424–430.
Knopfmacher, J. (1990). Abstract Analytic Number Theory, Second Edition. New York:
Dover.
Korevaar, J. (2006). The Wiener–Ikehara theorem by complex analysis, Proc. Amer.
Math. Soc. 134, 1107–1116.
Kuhn, P. (1955). Eine Verbesserung des Restgliedes beim elementaren Beweis des
Primzahlsatzes, Math. Scand. 3, 75–89.
Landau, E. (1910). Über die Bedeutung einiger neuen Grenzwertsätze der Herren Hardy
und Axer, Prace mat.-fiz. 21, 97–177; Collected Works, Vol. 4. Essen: Thales Verlag,
1986, pp. 267–347.
(1912). Über einige neuere Grenzwertsätze, Rend. Circ. Mat. Palermo 34, 121–131;
Collected Works, Vol. 5. Essen: Thales Verlag, 1986, pp. 145–155.
Lavrik, A. F. & Sobirov, A. Š. (1973). The remainder term in the elementary proof of
the Prime Number Theorem, Dokl. Akad. Nauk SSSR 211, 534–536.
Levinson, N. (1969). A motivated account of an elementary proof of the Prime Number
Theorem, Amer. Math. Monthly 76, 225–245.
Malliavin, P. (1961). Sur le reste de la loi asymptotique de répartition des nombres
premiers généralisés de Beurling, Acta Math. 106, 281–298.
With more effort (see Exercise 9.1.1) it can be shown that if d1 and d2
are quasiperiods of χ , then (d1 , d2 ) is also a quasiperiod, and hence the least
9.1 Primitive characters 283
By Lemma 9.3 we see that in order to exhibit the primitive characters explicitly
it suffices to determine the primitive characters (mod $p^\alpha$). Suppose first
that p is odd, and let g be a primitive root of $p^\alpha$. Then by (4.16) we know that
any character χ (mod $p^\alpha$) is given by
\[ \chi(n) = e\Big(\frac{k\,\mathrm{ind}_g\,n}{\varphi(p^\alpha)}\Big) \]
Theorem 9.4 Let χ be a character modulo q. Then the following are equivalent:
(1) χ is primitive.
(2) If d | q and d < q, then there is a c such that c ≡ 1 (mod d), (c, q) = 1,
χ(c) ≠ 1.
(3) If d | q and d < q, then for every integer a,
\[ \sum_{\substack{n=1 \\ n\equiv a\ (\mathrm{mod}\ d)}}^{q} \chi(n) = 0. \]
Proof (1) ⇒ (2). Suppose that d | q, d < q. Since χ is primitive, there exist
integers m and n such that m ≡ n (mod d), χ(m) ≠ χ(n), χ(mn) ≠ 0. Choose
c so that (c, q) = 1, cm ≡ n (mod q). Thus we have (2).
(2) ⇒ (3). Let c be as in (2). As k runs through a complete residue system
(mod q/d), the numbers n = ac + kcd run through all residues (mod q) for
\[ S = \sum_{k=1}^{q/d} \chi(ac + kcd) = \chi(c)S. \]
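Criterion (3) of Theorem 9.4 can be tested directly on small examples. The sketch below uses two characters modulo 15 chosen for illustration: the product of the Legendre symbols modulo 3 and modulo 5, which is primitive modulo 15, and the character modulo 15 induced by the Legendre symbol modulo 3, which is not. The function names are hypothetical.

```python
from math import gcd

def legendre(n: int, p: int) -> int:
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def chi_primitive(n: int) -> int:
    """Primitive character modulo 15: product of the Legendre symbols mod 3 and mod 5."""
    return legendre(n, 3) * legendre(n, 5)

def chi_induced(n: int) -> int:
    """Character modulo 15 induced by the Legendre symbol mod 3 (not primitive)."""
    return legendre(n, 3) if gcd(n, 15) == 1 else 0

def residue_class_sum(chi, q: int, a: int, d: int) -> int:
    """Sum of chi(n) over 1 <= n <= q with n congruent to a modulo d."""
    return sum(chi(n) for n in range(1, q + 1) if (n - a) % d == 0)

# For the primitive character every such sum with d | 15, d < 15 vanishes.
primitive_sums = [residue_class_sum(chi_primitive, 15, a, d)
                  for d in (1, 3, 5) for a in range(d)]
# For the induced character the sum over n = 1 (mod 3) does not vanish.
induced_sum = residue_class_sum(chi_induced, 15, 1, 3)
```

For the induced character the sum over the class 1 (mod 3) is 4, which is the value φ(15)/φ(3) that one would expect when d is a quasiperiod.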
9.1.1 Exercises
1. Let f (n) be an arithmetic function with period q such that f (n) = 0 when-
ever (n, q) > 1. Call d a quasiperiod of f if f (m) = f (n) whenever m ≡ n
(mod d) and (mn, q) = 1.
(a) Suppose that d1 and d2 are quasiperiods, put g = (d1 , d2 ), and suppose
that m ≡ n (mod g) and (mn, q) = 1. Show that there exist integers a
and b such that m = n + ad1 + bd2 and (n + ad1 , q) = 1.
(b) Show that if d1 and d2 are quasiperiods of f then so also is (d1 , d2 ).
(c) Show that the least quasiperiod of f divides all quasiperiods.
2. Let S(q) denote the set of all Dirichlet characters χ (mod q), and put
$T(q) = \bigcup_{d\mid q} S(d)$. Show that the members of T(q) form a basis of the vector space
of all arithmetic functions with period q if and only if q is square-free.
3. For d | q let U(d, q) denote the set of ϕ(q/d) functions
\[ f(a) = \begin{cases} \chi(a/d) & (a, q) = d, \\ 0 & \text{otherwise} \end{cases} \]
where χ runs over all Dirichlet characters (mod q/d). Set
$V(q) = \bigcup_{d\mid q} U(d, q)$. Show that the members of V(q) form a basis for the vector
space of arithmetic functions with period q.
4. For i = 1, 2 let χi be a character (mod qi ) where (q1 , q2 ) = 1, and suppose
that di is the conductor of χi . Show that d1 d2 is the conductor of χ1 χ2 .
5. For i = 1, 2 suppose that χi is a character (mod qi ). Show that the following
two assertions are equivalent:
(a) The characters χ1 and χ2 are induced by the same primitive character.
(b) χ1 ( p) = χ2 ( p) for all but at most finitely many primes p.
6. Let ϕ2 (q) denote the number of primitive characters (mod q).
(a) Show that ϕ2 (q) is a multiplicative function.
(b) Show that $\sum_{d\mid q} \varphi_2(d) = \varphi(q)$.
286 Primitive characters and Gauss sums
\[ \sum_{\substack{n=1 \\ n\equiv a\ (\mathrm{mod}\ d)}}^{q} \chi(n) = \frac{\varphi(q)}{\varphi(d)}. \]
This may be regarded as the inner product of the multiplicative character χ(a)
with the additive character e(a/q). As such, it is analogous to the gamma
function $\Gamma(s) = \int_0^\infty x^{s-1}e^{-x}\,dx$, which is the inner product of the multiplicative
character $x^s$ with the additive character $e^{-x}$ with respect to the invariant measure
dx/x. Gauss sums are invaluable in transferring questions concerning Dirichlet
characters to questions concerning additive characters, and vice versa.
The Gauss sum is a special case of the more general sum
\[ c_\chi(n) = \sum_{a=1}^{q} \chi(a)\,e(an/q). \tag{9.4} \]
9.2 Gauss sums 287
whose properties were discussed in Section 4.1. We now show that the sum
cχ (n) is closely related to τ (χ ).
Theorem 9.5 Suppose that χ is a character modulo q. If (n, q) = 1, then
\[ \chi(n)\,\tau(\overline{\chi}) = \sum_{a=1}^{q} \overline{\chi}(a)\,e(an/q), \tag{9.6} \]
and in particular
\[ \tau(\overline{\chi}) = \chi(-1)\,\overline{\tau(\chi)}. \tag{9.7} \]
Proof If (n, q) = 1, then the map a → an permutes the residues modulo q,
and hence
\[ \chi(n)\,c_\chi(n) = \sum_{a=1}^{q} \chi(an)\,e(an/q) = \tau(\chi). \]
For primitive characters the hypothesis that (n, q) = 1 in Theorem 9.5 can
be removed.
Theorem 9.7 Suppose that χ is a primitive character modulo q. Then (9.6)
holds for all n, and $|\tau(\chi)| = \sqrt{q}$.
Proof It suffices to prove (9.6) when (n, q) > 1. Choose m and d so that
(m, d) = 1 and m/d = n/q. Then
\[ \sum_{a=1}^{q} \overline{\chi}(a)\,e(an/q) = \sum_{h=1}^{d} e(hm/d) \sum_{\substack{a=1 \\ a\equiv h\ (\mathrm{mod}\ d)}}^{q} \overline{\chi}(a). \]
Since d | q and d < q, the inner sum vanishes by Theorem 9.4. Thus (9.6) holds
also in this case.
The innermost sum on the right is 0 unless a ≡ b (mod q), in which case it is
equal to q. Thus $\varphi(q)|\tau(\chi)|^2 = \varphi(q)q$, and hence $|\tau(\chi)| = \sqrt{q}$.
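The statements of Theorems 9.5 and 9.7 can be checked numerically for a small primitive character. The sketch below uses the character modulo 5 determined by χ(2) = i (2 is a primitive root modulo 5); this choice, and all function names, are assumptions made for the example. It verifies that |τ(χ)| = √5, that the sum of χ̄(a)e(an/q) over a equals χ(n)τ(χ̄) for every n (including multiples of q), and that τ(χ̄) = χ(−1) times the complex conjugate of τ(χ).

```python
import cmath
from math import pi, sqrt

Q = 5  # modulus; 2 is a primitive root modulo 5

def e(t: float) -> complex:
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * pi * t)

def chi(n: int) -> complex:
    """The character modulo 5 with chi(2) = i (illustrative choice)."""
    n %= Q
    if n == 0:
        return 0j
    k, power = 0, 1
    while power != n:              # find k with 2^k = n (mod 5)
        power = power * 2 % Q
        k += 1
    return (1 + 0j, 1j, -1 + 0j, -1j)[k % 4]

def chi_bar(n: int) -> complex:
    return chi(n).conjugate()

def tau(f) -> complex:
    """Gauss sum: sum over a of f(a) e(a/q)."""
    return sum(f(a) * e(a / Q) for a in range(1, Q + 1))

def twisted_sum(f, n: int) -> complex:
    """Sum over a of f(a) e(an/q)."""
    return sum(f(a) * e(a * n / Q) for a in range(1, Q + 1))

modulus_gap = abs(abs(tau(chi)) - sqrt(Q))
identity_gap = max(abs(twisted_sum(chi_bar, n) - chi(n) * tau(chi_bar)) for n in range(10))
conjugate_gap = abs(tau(chi_bar) - chi(-1) * tau(chi).conjugate())
```

All three gaps should be at the level of floating-point rounding. The case n ≡ 0 (mod 5) in `identity_gap` illustrates that the coprimality hypothesis is not needed for a primitive character.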
If χ is primitive modulo q, then not only does (9.6) hold for all n but also
$\tau(\overline{\chi}) \ne 0$, and hence we have
\[ \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a)\,e(an/q). \]
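\[ L(1,\chi) = -\frac{\tau(\chi)}{q} \sum_{a=1}^{q-1} \overline{\chi}(a)\log(\sin \pi a/q), \tag{9.8} \]
\[ L(1,\chi) = \frac{i\pi\,\tau(\chi)}{q^2} \sum_{a=1}^{q-1} a\,\overline{\chi}(a). \tag{9.9} \]
Proof Since $L(1,\chi) = \sum_{n=1}^{\infty} \chi(n)/n$, by Corollary 9.8,
\[ L(1,\chi) = \frac{1}{\tau(\overline{\chi})} \sum_{n=1}^{\infty} \frac{1}{n} \sum_{a=1}^{q-1} \overline{\chi}(a)\,e(an/q) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q-1} \overline{\chi}(a) \sum_{n=1}^{\infty} \frac{e(an/q)}{n}. \]
But $\log(1-z)^{-1} = \sum_{n=1}^{\infty} z^n/n$ for |z| ≤ 1, z ≠ 1, where the logarithm is
the principal branch. We take z = e(θ) where 0 < θ < 1. Since 1 − e(θ) =
−2ie(θ/2) sin πθ, it follows that log(1 − e(θ)) = log(2 sin πθ) + iπ(θ − 1/2).
Thus
\[ L(1,\chi) = \frac{-1}{\tau(\overline{\chi})} \sum_{a=1}^{q-1} \overline{\chi}(a)\big(\log(2\sin \pi a/q) + i\pi(a/q - 1/2)\big). \]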
Since $\sum_{a=1}^{q-1} \overline{\chi}(a) = 0$, this is
\[ \frac{-1}{\tau(\overline{\chi})}\,(S + iT) \]
where $S = \sum_{a=1}^{q-1} \overline{\chi}(a)\log(\sin \pi a/q)$ and $T = (\pi/q)\sum_{a=1}^{q-1} \overline{\chi}(a)\,a$. On replacing
a by q − a we see that S = χ(−1)S and T = −χ(−1)T. Thus if χ(−1) = 1,
then T = 0 and so
\[ L(1,\chi) = \frac{-1}{\tau(\overline{\chi})} \sum_{a=1}^{q-1} \overline{\chi}(a)\log(\sin \pi a/q). \]
Then by (9.7) we obtain (9.8). If χ(−1) = −1 then S = 0 and so
\[ L(1,\chi) = \frac{-i\pi}{\tau(\overline{\chi})\,q} \sum_{a=1}^{q-1} \overline{\chi}(a)\,a. \]
Then by (9.7) we obtain (9.9).
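The two closed forms for L(1, χ) can be compared with classical values. The sketch below evaluates the odd-character formula, L(1, χ) = (iπτ(χ)/q²) Σ a χ̄(a), for the non-principal character modulo 4, where the value is π/4 (Leibniz's series), and the even-character formula, L(1, χ) = −(τ(χ)/q) Σ χ̄(a) log sin(πa/q), for the quadratic character modulo 5, where the value is 2 log((1 + √5)/2)/√5. The function names and the two test characters are illustrative choices.

```python
import cmath
from math import pi, sin, log, sqrt

def gauss_sum(chi, q: int) -> complex:
    """tau(chi) = sum over a of chi(a) e(a/q)."""
    return sum(chi(a) * cmath.exp(2j * pi * a / q) for a in range(1, q + 1))

def L1_even(chi, q: int) -> complex:
    """Formula for a primitive character with chi(-1) = 1."""
    s = sum(chi(a).conjugate() * log(sin(pi * a / q)) for a in range(1, q))
    return -gauss_sum(chi, q) / q * s

def L1_odd(chi, q: int) -> complex:
    """Formula for a primitive character with chi(-1) = -1."""
    t = sum(a * chi(a).conjugate() for a in range(1, q))
    return 1j * pi * gauss_sum(chi, q) / q ** 2 * t

def chi4(n: int) -> complex:
    """Non-principal character modulo 4 (odd, primitive)."""
    return complex((0, 1, 0, -1)[n % 4])

def chi5(n: int) -> complex:
    """Quadratic character modulo 5 (even, primitive)."""
    return complex((0, 1, -1, -1, 1)[n % 5])

value_mod4 = L1_odd(chi4, 4)     # compare with pi / 4
value_mod5 = L1_even(chi5, 5)    # compare with 2 log((1 + sqrt 5)/2) / sqrt 5
```

Both computed values are real up to rounding, as they must be for real characters.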
We now turn our attention to the more general cχ (n). To this end we begin
with an auxiliary result.
Proof Let S(b, r) denote the sum in question. If p | (b, r) and n ≡ b (mod r),
then p | n, and so (n, q) > 1. Thus each term in S(b, r) is 0. Thus we are
done when (b, r) > 1, so we suppose that (b, r) = 1. Consider next the case
when d ∤ r. Then r is not a quasiperiod of χ. Hence there exist m and n such
that (mn, q) = 1, m ≡ n (mod r), and χ(m) ≠ χ(n). Choose c so that cn ≡
m (mod q). Then c ≡ 1 (mod r) and χ(c) ≠ 1. Hence χ(c)S(b, r) = S(b, r),
as in the proof of Theorem 9.4, so S(b, r) = 0 in this case. Finally suppose
that d | r. Let $\chi_0$ be the principal character modulo q. If n ≡ b (mod r), then
$\chi^{\star}(n) = \chi^{\star}(b)$. Thus
\[ S(b, r) = \chi^{\star}(b) \sum_{\substack{n=1 \\ n\equiv b\ (\mathrm{mod}\ r)}}^{q} \chi_0(n). \]
Write $q/r = q_1q_2$ where $q_1$ is the largest divisor of q/r that is relatively prime
to r. Then the sum on the right above is
\[ \sum_{\substack{k=1 \\ (kr+b,\,q_1)=1}}^{q_1q_2} 1 = q_2\,\varphi(q_1) = \varphi(q)/\varphi(r), \]
as required.
1 ≤ k ≤ r. Then
\[ c_\chi(n) = \sum_{k=1}^{r} e(kn/q) \sum_{b=1}^{q/r} \chi(br + k). \]
Put m = n/(q, n), and let $\chi_1$ denote the character modulo r induced by $\chi^{\star}$.
Then the above is
\[ = \frac{\varphi(q)}{\varphi(r)} \sum_{k=1}^{r} e(km/r)\,\chi_1(k). \]
Since (m, r) = 1, we see by the first case treated that the above is
\[ \frac{\varphi(q)}{\varphi(r)}\,\overline{\chi^{\star}}(m)\,\mu(r/d)\,\chi^{\star}(r/d)\,\tau(\chi^{\star}), \]
which suffices.
9.2.1 Exercises
1. (a) Show that
1 e(a/q) (a, q) = 1,
χ (a)τ (χ ) =
ϕ(q) χ 0 otherwise.
2. Let
\[ G_k(a) = \sum_{n=1}^{p} e\Big(\frac{a n^k}{p}\Big). \]
(b) Let $l = (k, p - 1)$. Show that if $k$ is a positive integer, then $N_k(h) =
N_l(h)$ for all $h$, and hence that $G_k(a) = G_l(a)$.
(c) Suppose that $k \mid (p - 1)$. Explain why
\[ \sum_{a=1}^{p} |G_k(a)|^2 = p \sum_{h=1}^{p} N_k(h)^2. \]
(d) Suppose that $k \mid (p - 1)$. Show that there are $(p - 1)/k$ residues $h$
(mod $p$) for which $N_k(h) = k$, that $N_k(0) = 1$, and that $N_k(h) = 0$ for
all other residue classes (mod $p$). Hence show that the right-hand side
above is $p(1 + (p - 1)k)$.
(e) Let $k$ be a divisor of $p - 1$. Suppose that $p \nmid a$, $p \nmid c$, and that $b \equiv ac^k$
(mod $p$). Show that $G_k(a) = G_k(b)$.
(f) Suppose that $k \mid (p - 1)$. Show that if $p \nmid a$ then $|G_k(a)| < k\sqrt{p}$.
3. Suppose that k | ϕ(q) and that (h, q) = 1.
(a) Explain why
\[ \frac{1}{\varphi(q)} \sum_{\chi} \chi(x^k)\overline{\chi}(h) = \begin{cases} 1 & \text{if } x^k \equiv h \pmod q, \\ 0 & \text{otherwise.} \end{cases} \]
7. Let $N(q)$ denote the number of pairs $x, y$ of residue classes (mod $q$) such
that $y^2 \equiv x^3 + 7 \pmod q$.
(a) Show that $N(q)$ is a multiplicative function of $q$, that $N(2) = 2$, $N(3) =
3$, $N(7) = 7$, and that $N(p) = p$ when $p \equiv 2 \pmod 3$.
(b) Suppose that $p \equiv 1 \pmod 3$. Let $\chi_1(n)$ be a cubic character modulo $p$,
and let $\chi_2(n) = \left(\frac{n}{p}\right)$ be the quadratic character modulo $p$. Show that
\[ N(p) = \frac{1}{p} \sum_{a=1}^{p} e(7a/p) \sum_{h=1}^{p} \big(1 + \chi_1(h) + \overline{\chi}_1(h)\big) e(ah/p) \times \sum_{k=1}^{p} (1 + \chi_2(k)) e(-ak/p) \]
\[ = p + \frac{2}{p}\, \Re\Big( \tau(\chi_1)\tau(\chi_2)\tau\big(\chi_1^2\chi_2\big)\, \chi_1\chi_2(-7) \Big), \]
and deduce that $|N(p) - p| \le 2\sqrt{p}$.
(c) Deduce that $N(p) > 0$ for all $p$.
(d) Show that $N(2^k) = 2^{k-1}$ for $k \ge 2$, that $N(3^k) = 2 \cdot 3^{k-1}$ for $k \ge 2$,
that $N(7^k) = 6 \cdot 7^{k-1}$ for $k \ge 2$, and that $N(p^k) = N(p) p^{k-1}$ for all
other primes.
(e) Conclude that the congruence $y^2 \equiv x^3 + 7 \pmod q$ has solutions for
every positive integer $q$.
(f) Suppose that $x$ and $y$ are integers such that $y^2 = x^3 + 7$. Show that
$2 \mid y$, $x \equiv 1 \pmod 4$, and that $x > 0$. Note that $y^2 + 1 = (x + 2)(x^2 -
2x + 4)$, so that $y^2 + 1$ is composed of primes $\equiv 1 \pmod 4$, and yet $x +
2 \equiv 3 \pmod 4$. Deduce that this equation has no solution in integers.
8. (Mordell 1933) Explain why the number $N$ of solutions of the congruence
$c_1 x_1^{k_1} + \cdots + c_m x_m^{k_m} \equiv c \pmod p$ is
\[ N = \frac{1}{p} \sum_{a=1}^{p} e(-ac/p) \prod_{j=1}^{m} G_{k_j}(a c_j) \]
When a = 1, the sum (9.10) is known as the Jacobi sum J (χ1 , χ2 ). In the
same way that the Gauss sum is analogous to the gamma function, the Jacobi
sum (and its evaluation in terms of Gauss sums) is analogous to the beta function
\[ B(\alpha, \beta) = \int_0^1 x^{\alpha-1} (1 - x)^{\beta-1}\, dx = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}. \]
11. Let C be the smallest field that contains the field Q of rational numbers and
is closed under square roots. Thus C is the set of complex numbers that
are constructible by ruler-and-compass. We show that if $p$ is of the form
$p = 2^k + 1$, then $\zeta = e(1/p) \in$ C, which is to say that a regular $p$-gon can
be constructed.
(a) Let $p$ be any prime, and $\chi$ any non-principal character modulo $p$.
Explain why
\[ \tau(\chi)^2 \sum_{n=1}^{p} \overline{\chi}(n)\overline{\chi}(1 - n) = p\,\tau(\chi^2). \]
lie in C.
(e) Explain why $\sum_{\chi} \tau(\chi) = (p - 1)\zeta$.
(f) (Gauss) If $p = 2^k + 1$, then $\zeta \in$ C.
12. Let $\chi$ be a character modulo $p$ and put $J(\chi) = \sum_{n=1}^{p} \chi(n)\chi(1 - n)$.
(a) Show that if $\chi^2 \ne \chi_0$, then $|J(\chi)| = \sqrt{p}$.
(b) Suppose that $p \equiv 1 \pmod 4$. Show that there is a quartic character $\chi$
modulo $p$.
9.3 Quadratic characters 295
p = 2 then this is impossible, but for p > 2 this is equivalent to the condition
k ≡ ( p − 1)/2 (mod p − 1). Thus there is no quadratic character modulo 2,
but for each odd prime p there is a unique quadratic character, given by the
Legendre symbol.
Now suppose that $p$ is an odd prime and that $q = p^m$ with $m > 1$. We have
seen that a character $\chi$ modulo such a $q$ is of the form $\chi(n) = e(k\,\mathrm{ind}\, n/\varphi(q))$,
and that $\chi$ is primitive if and only if $p \nmid k$. This character is quadratic only when
$k \equiv \varphi(q)/2 \pmod{\varphi(q)}$, so there is a unique quadratic character modulo $q$, but
it is not primitive because $p \mid k$ for this $k$. That is, the only quadratic character
modulo $p^m$ is induced by the primitive quadratic character modulo $p$.
Finally, suppose that $q = 2^m$. For the modulus 2 there is only the principal
character, but for $q = 4$ we have a primitive quadratic character
\[ \chi_1(n) = \begin{cases} (-1)^{(n-1)/2} & (n \text{ odd}), \\ 0 & (n \text{ even}). \end{cases} \]
For $m > 2$ we write $\chi((-1)^\mu 5^\nu) = e(j\mu/2 + k\nu/2^{m-2})$, and we see that this
character is real if and only if $2^{m-3} \mid k$. However, the character is primitive if and
only if $k$ is odd, so primitive quadratic characters arise only when $m = 3$, and for
this modulus we have two different characters (corresponding to $j = 0$, $j = 1$).
Let $\chi_2((-1)^\mu 5^\nu) = e(\nu/2)$. That is, $\chi_2(n) = (-1)^{(n^2-1)/8}$. Then the characters
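Theorem 9.13 Let $d$ be a quadratic discriminant. Then $\chi_d(n) = \left(\frac{d}{n}\right)_K$ is a
primitive quadratic character modulo $|d|$, and every primitive quadratic
character is given uniquely in this way.
Proof It is easy to see that $\left(\frac{-4}{n}\right)_K$ is the primitive quadratic character modulo
4. Similarly, $\left(\frac{8}{n}\right)_K$ and $\left(\frac{-8}{n}\right)_K$ are the primitive quadratic characters modulo
8.
Suppose that $p$ is a prime, $p \equiv 1 \pmod 4$. We show that $\left(\frac{p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all
$n$. To see this, note that if $q$ is an odd prime, then by (iii) and quadratic reciprocity,
$\left(\frac{p}{q}\right)_K = \left(\frac{p}{q}\right)_L = \left(\frac{q}{p}\right)_L$. Also, $\left(\frac{p}{2}\right)_K = (-1)^{(p^2-1)/8} = \left(\frac{2}{p}\right)_L$, and
$\left(\frac{p}{-1}\right)_K = 1 = \left(\frac{-1}{p}\right)_L$. Since these two functions agree on all primes, and also on $-1$, and
both are totally multiplicative, it follows that $\left(\frac{p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all integers $n$.
Suppose that $p$ is a prime, $p \equiv 3 \pmod 4$. We show that $\left(\frac{-p}{n}\right)_K = \left(\frac{n}{p}\right)_L$
for all $n$. To see this, note that if $q$ is an odd prime, then by (iii) and
quadratic reciprocity, $\left(\frac{-p}{q}\right)_K = \left(\frac{-p}{q}\right)_L = \left(\frac{q}{p}\right)_L$. Also,
$\left(\frac{-p}{2}\right)_K = (-1)^{((-p)^2-1)/8} = (-1)^{(p^2-1)/8} = \left(\frac{2}{p}\right)_L$, and
$\left(\frac{-p}{-1}\right)_K = -1 = \left(\frac{-1}{p}\right)_L$. Since these two functions
agree on all primes, and also on $-1$, and both are totally multiplicative, it follows
that $\left(\frac{-p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all integers $n$.
Suppose next that $d_1$ and $d_2$ are quadratic discriminants with $(d_1, d_2) = 1$. Put
$d = d_1 d_2$. Supposing that $\left(\frac{d_i}{n}\right)_K$ is a primitive quadratic character modulo $|d_i|$ for
$i = 1, 2$, we shall show that $\left(\frac{d}{n}\right)_K$ is a primitive quadratic character modulo $|d|$. If
$q$ is an odd prime, then by (iii),
$\left(\frac{d}{q}\right)_K = \left(\frac{d}{q}\right)_L = \left(\frac{d_1}{q}\right)_L \left(\frac{d_2}{q}\right)_L = \left(\frac{d_1}{q}\right)_K \left(\frac{d_2}{q}\right)_K$. Also,
by (ii) we see that $\left(\frac{d}{2}\right)_K = \left(\frac{d_1}{2}\right)_K \left(\frac{d_2}{2}\right)_K$, and by (iv) that
$\left(\frac{d}{-1}\right)_K = \left(\frac{d_1}{-1}\right)_K \left(\frac{d_2}{-1}\right)_K$.
Since $\left(\frac{d}{n}\right)_K = \left(\frac{d_1}{n}\right)_K \left(\frac{d_2}{n}\right)_K$ when $n$ is a prime or $n = -1$, and since both sides
are totally multiplicative functions, it follows that this identity holds for all
integers $n$. Hence by Lemma 9.3, $\left(\frac{d}{n}\right)_K$ is a primitive character modulo $|d|$.
This allows us to account for all primitive quadratic characters, so the proof
is complete.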
Since the Kronecker symbol and Legendre symbol agree whenever both are
defined, we may omit the subscripts. The same remark applies to the Jacobi
symbol $\left(\frac{n}{q}\right)_J$, which for odd positive $q = p_1 p_2 \cdots p_r$ is defined to be
$\left(\frac{n}{q}\right)_J = \prod_{i=1}^{r} \left(\frac{n}{p_i}\right)_L$. Sometimes we let $\chi_d(n)$ denote the character $\left(\frac{d}{n}\right)$.
A character $\chi$ modulo $q$ is an even function, $\chi(-n) = \chi(n)$, if $\chi(-1) = 1$;
for the primitive quadratic character $\chi_d$ this occurs if $d > 0$. In the case of the
Legendre symbol, if $p \equiv 1 \pmod 4$, then $\left(\frac{n}{p}\right)_L = \chi_p(n)$ is even. Similarly, $\chi$ is
odd, $\chi(-n) = -\chi(n)$, if $\chi(-1) = -1$. For $\chi_d$ this occurs when $d < 0$. For the
Legendre symbol, if $p \equiv 3 \pmod 4$, then $\left(\frac{n}{p}\right)_L = \chi_{-p}(n)$ is odd.
We have taken the quadratic reciprocity law for the Legendre symbol for
granted, since it is treated in a variety of ways in elementary texts. In Exercise
9.3.6 below we outline a proof of quadratic reciprocity that is unusual in that
298 Primitive characters and Gauss sums
it applies directly to the Jacobi symbol, without first being restricted to the
Legendre symbol. For future purposes it is convenient to formulate quadratic
reciprocity also for the Kronecker symbol.
Theorem 9.14 Suppose that d1 and d2 are relatively prime quadratic discrim-
inants. Then
\[ \left(\frac{d_1}{d_2}\right) = \varepsilon(d_1, d_2) \left(\frac{d_2}{d_1}\right) \tag{9.11} \]
where ε(d1 , d2 ) = 1 if d1 > 0 or d2 > 0, and ε(d1 , d2 ) = −1 if d1 < 0 and
d2 < 0.
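Theorem 9.14 lends itself to a direct machine check on small discriminants. The Python sketch below is an illustration rather than part of the text: `kronecker` is one standard way to evaluate the Kronecker symbol (reduce to the Jacobi symbol after handling the sign and the power of 2 in the lower entry), and `is_quadratic_discriminant` assumes the usual definition (d ≡ 1 (mod 4) square-free, or d = 4m with m ≡ 2 or 3 (mod 4) square-free). The trivial value d = 1 is excluded from the search; all names are hypothetical.

```python
from math import gcd

def kronecker(a, n):
    """Kronecker symbol (a/n) for integers a and nonzero n."""
    sign = 1
    if n < 0:                       # (a/-1) = -1 exactly when a < 0
        n = -n
        if a < 0:
            sign = -sign
    twos = 0
    while n % 2 == 0:               # strip the power of 2 from n
        n //= 2
        twos += 1
    if twos:
        if a % 2 == 0:
            return 0                # (a/2) = 0 for even a
        if twos % 2 == 1 and a % 8 in (3, 5):
            sign = -sign            # (a/2) = -1 when a = +-3 (mod 8)
    a %= n                          # n is now odd and positive: Jacobi symbol
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                sign = -sign
        a, n = n, a                 # reciprocity for odd positive a, n
        if a % 4 == 3 and n % 4 == 3:
            sign = -sign
        a %= n
    return sign if n == 1 else 0

def is_squarefree(m):
    m = abs(m)
    k = 2
    while k * k <= m:
        if m % (k * k) == 0:
            return False
        k += 1
    return m > 0

def is_quadratic_discriminant(d):
    """Assumed definition: d = 1 (mod 4) square-free, or d = 4m, m = 2, 3 (mod 4) square-free."""
    if d % 4 == 1:
        return is_squarefree(d)
    if d % 4 == 0 and d != 0:
        m = d // 4
        return m % 4 in (2, 3) and is_squarefree(m)
    return False

def epsilon(d1, d2):
    return -1 if (d1 < 0 and d2 < 0) else 1

def reciprocity_holds(bound=60):
    """Check (d1/d2) = epsilon(d1, d2) (d2/d1) for coprime discriminants up to the bound."""
    discs = [d for d in range(-bound, bound + 1)
             if d != 1 and is_quadratic_discriminant(d)]
    return all(kronecker(d1, d2) == epsilon(d1, d2) * kronecker(d2, d1)
               for d1 in discs for d2 in discs if gcd(d1, d2) == 1)
```

For example, (−3/−4) = −1 while (−4/−3) = 1, in agreement with ε(−3, −4) = −1.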
For odd $n$ let $m^2$ be the largest square dividing $n$. Then there is a unique
choice of sign and a unique quadratic discriminant $d_2$ such that $n = \pm m^2 d_2$,
and then if $(n, d_1) = 1$ the above can be applied to express $\left(\frac{d_1}{n}\right)$ in terms of $\left(\frac{d_2}{d_1}\right)$.
If $n$ is even, then $4n = m^2 d_2$ for unique $m$ and quadratic discriminant $d_2$, so if
$(n, d_1) = 1$ we can again express $\left(\frac{d_1}{n}\right)$ in terms of $\left(\frac{d_2}{d_1}\right)$.
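, so (9.11) holds. Similarly, if $d_2$ is odd, then
\[ \left(\frac{-8}{d_2}\right)_K = \left(\frac{-4}{d_2}\right)_K \left(\frac{8}{d_2}\right)_K = \left(\frac{8}{d_2}\right)_K = \left(\frac{d_2}{8}\right)_K = \left(\frac{d_2}{-1}\right)_K \left(\frac{d_2}{-8}\right)_K, \]
so again (9.11) holds.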
Now let d1 , d2 and d be pairwise coprime quadratic discriminants. Then
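\[ \left(\frac{d_1 d_2}{d}\right)_K = \left(\frac{d_1}{d}\right)_K \left(\frac{d_2}{d}\right)_K. \]
Suppose that (9.11) holds for the pair $d_1, d$, and also for the pair $d_2, d$. Then
the above is
\[ = \varepsilon(d_1, d) \left(\frac{d}{d_1}\right)_K \varepsilon(d_2, d) \left(\frac{d}{d_2}\right)_K = \varepsilon(d_1, d)\varepsilon(d_2, d) \left(\frac{d}{d_1 d_2}\right)_K. \]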
But ε(d1 , d)ε(d2 , d) = ε(d1 d2 , d), so it follows that (9.11) holds also for the
pair d1 d2 , d. Since all quadratic discriminants can be constructed as the product
of smaller quadratic discriminants, or by appealing to the special cases already
considered, it follows now that (9.11) holds for all quadratic discriminants.
If $a$ and $q$ are positive integers and at least one of them is even, then
\[ S(a, q) = \overline{S(q, a)}\; e(1/8) \sqrt{q/a}. \]
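The reciprocity formula above is easy to test numerically. The sketch below assumes, as in the proof that follows, that S(a, q) is the sum of e(an²/(2q)) over 1 ≤ n ≤ q; the function names are illustrative only. Exponents are reduced modulo 2q in integer arithmetic before the exponential is taken, to limit rounding error.

```python
import cmath
import math

def S(a, q):
    """S(a, q) = sum_{n=1}^{q} e(a n^2 / (2q)), exponents reduced mod 2q."""
    return sum(cmath.exp(2j * math.pi * ((a * n * n) % (2 * q)) / (2 * q))
               for n in range(1, q + 1))

def reciprocity_defect(a, q):
    """|S(a, q) - conj(S(q, a)) e(1/8) sqrt(q/a)|; should vanish when aq is even."""
    rhs = S(q, a).conjugate() * cmath.exp(2j * math.pi / 8) * math.sqrt(q / a)
    return abs(S(a, q) - rhs)
```

For a = 1, q = 4 both sides equal 2e(1/8). When a and q are both odd the identity can fail: for a = q = 1 the defect is |e(1/8) − 1|.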
Proof We apply the Poisson summation formula, in the form of Theorem D.3,
to the function f (x) = e(ax 2 /(2q)) for 1/2 < x < q + 1/2, with f (x) = 0
otherwise. Thus
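\[ S(a, q) = \sum_{n} f(n) = \lim_{K \to \infty} \sum_{k=-K}^{K} \widehat{f}(k) \]
where
\[ \widehat{f}(k) = \int_{1/2}^{q+1/2} e\big(ax^2/(2q) - kx\big)\, dx. \]
We complete the square, writing
\[ \frac{ax^2}{2q} - kx = \frac{a}{2q}(x - kq/a)^2 - \frac{k^2 q}{2a}, \]
and make the change of variable $u = (x - kq/a)/q$, to see that
\[ \widehat{f}(k) = q\, e\big(-k^2 q/(2a)\big) \int_{1/(2q)-k/a}^{1/(2q)+1-k/a} e(aqu^2/2)\, du. \]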
k=−K r =1
2a m=−K /a 1/(2q)−m−r/a
+Oq,a (1/K ).
Let
\[ G(a, q) = \sum_{x=1}^{q} e\Big(\frac{ax^2}{q}\Big). \tag{9.12} \]
9.3.1 Exercises
1. (a) Show that if $p > 2$ and $p \nmid b$, then
\[ \sum_{n=1}^{p} \Big(\frac{n}{p}\Big)\Big(\frac{n+b}{p}\Big) = -1. \]
4. We used Corollary 9.16 to determine the sign of τ (χ± p ), and then used
quadratic reciprocity to determine the sign of τ (χd ) for the general quadratic
discriminant d. We now show that quadratic reciprocity for the Legendre
symbol can be derived from Theorem 9.15 (mainly Corollary 9.16). Let
$G(a, q) = \sum_{n=1}^{q} e(an^2/q)$.
(a) Suppose that $p$ is an odd prime. Explain why
\[ G(a, p) = \Big(\frac{a}{p}\Big)_L \sum_{n=1}^{p} \Big(\frac{n}{p}\Big) e(n/p) \]
when $(a, p) = 1$.
(b) Suppose that $(q_1, q_2) = 1$. By writing $n$ modulo $q_1 q_2$ in the form $n =
n_1 q_2 + n_2 q_1$, show that $G(a, q_1 q_2) = G(aq_2, q_1)G(aq_1, q_2)$.
(c) Let $p$ and $q$ denote odd primes. Show that
\[ G(1, pq) = \Big(\frac{p}{q}\Big)_L \Big(\frac{q}{p}\Big)_L G(1, p)G(1, q), \]
and use Corollary 9.16 to show that
\[ \Big(\frac{p}{q}\Big)_L \Big(\frac{q}{p}\Big)_L = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}. \]
(d) By taking $a = -1$ in (a), and using Corollary 9.16, show that
$\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$.
(e) By taking $a = 4$ in Theorem 9.15, show that $\left(\frac{2}{p}\right)_L = (-1)^{(p^2-1)/8}$.
if k and ak lie in the same subset, otherwise put εk = −1. Note that
εk = ε−k . Let π + be the permutation that leaves N fixed and maps P to
itself by the formula k → εk ak (mod n). Let π − be the map that leaves
P fixed and maps N to itself by the formula k → εk ak (mod n). Finally
let π ∗ be the product of those transpositions (ak − ak) for which k ∈ P
and ak ∈ N . Show that the map x → ax (mod n) is the permutation
π ∗ π + π − . Let σ be the ‘sign change permutation’ x → −x (mod n).
Show that π − = σ π + σ . That is, π + and π − are conjugate permutations.
They are the same apart from the fact that they operate on different sets.
Thus they have the same cycle structure, and hence the same parity.
Deduce that $\left(\frac{a}{n}\right)_Z = (-1)^K$.
(h) Suppose that $n$ is odd and positive, that $(a, n) = 1$, and that $a > 0$.
Show that $\left(\frac{a}{n}\right)_Z = (-1)^K$ where $K$ is the number of integers lying in the
intervals $\big((r - \tfrac12)\tfrac{n}{a},\ r\tfrac{n}{a}\big)$ for $r = 1, 2, \ldots, [a/2]$.
(i) Show that if $a > 0$, $(2a, n) = 1$, $m \equiv n \pmod{4a}$, then $\left(\frac{a}{m}\right)_Z = \left(\frac{a}{n}\right)_Z$.
(j) Show that if $n$ is odd and positive, then $\left(\frac{2}{n}\right)_Z = (-1)^{(n^2-1)/8}$.
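(k) Suppose that $m$ and $n$ are odd and positive, and that $m \equiv -n \pmod 4$,
say $m + n = 4a$. Justify the following manipulations:
\[ \Big(\frac{m}{n}\Big)_Z = \Big(\frac{4a}{n}\Big)_Z = \Big(\frac{a}{n}\Big)_Z = \Big(\frac{a}{m}\Big)_Z = \Big(\frac{4a}{m}\Big)_Z = \Big(\frac{n}{m}\Big)_Z. \]
(l) Suppose that $m$ and $n$ are odd and positive, and that $m \equiv n \pmod 4$, say
$m > n$ and $m - n = 4a$. Justify the following manipulations:
\[ \Big(\frac{m}{n}\Big)_Z = \Big(\frac{4a}{n}\Big)_Z = \Big(\frac{a}{n}\Big)_Z = \Big(\frac{a}{m}\Big)_Z = \Big(\frac{4a}{m}\Big)_Z = \Big(\frac{-n}{m}\Big)_Z = \Big(\frac{n}{m}\Big)_Z (-1)^{(m-1)/2}. \]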
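(m) Suppose that $a$ is odd and positive and that $(2a, mn) = 1$. Show that
\[ \Big(\frac{a}{mn}\Big)_Z = \Big(\frac{mn}{a}\Big)_Z (-1)^{\frac{a-1}{2} \cdot \frac{mn-1}{2}} = \Big(\frac{m}{a}\Big)_Z \Big(\frac{n}{a}\Big)_Z (-1)^{\frac{a-1}{2} \cdot \frac{mn-1}{2}} \]
\[ = \Big(\frac{a}{m}\Big)_Z \Big(\frac{a}{n}\Big)_Z (-1)^{\frac{a-1}{2} \cdot \frac{mn-1}{2} + \frac{a-1}{2} \cdot \frac{m-1}{2} + \frac{a-1}{2} \cdot \frac{n-1}{2}}. \]
Show that this last exponent is even, so that $\left(\frac{a}{mn}\right)_Z = \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z$ in this
case.
(n) Suppose that $a$ is odd and negative and that $(a, mn) = 1$. Use (m) to
show that the identity $\left(\frac{a}{mn}\right)_Z = \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z$ holds in this case also. Thus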
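(o) Suppose that $a$ is even and that $(a, mn) = 1$. Justify the following
manipulations:
\[ \Big(\frac{a}{mn}\Big)_Z = \Big(\frac{-a}{mn}\Big)_Z (-1)^{\frac{mn-1}{2}} = \Big(\frac{mn - a}{mn}\Big)_Z (-1)^{\frac{mn-1}{2}} = \Big(\frac{mn - a}{m}\Big)_Z \Big(\frac{mn - a}{n}\Big)_Z (-1)^{\frac{mn-1}{2}} \]
\[ = \Big(\frac{-a}{m}\Big)_Z \Big(\frac{-a}{n}\Big)_Z (-1)^{\frac{mn-1}{2}} = \Big(\frac{a}{m}\Big)_Z \Big(\frac{a}{n}\Big)_Z (-1)^{\frac{mn-1}{2} + \frac{m-1}{2} + \frac{n-1}{2}}. \]
Show that this last exponent is even, and thus deduce that
\[ \Big(\frac{a}{mn}\Big)_Z = \Big(\frac{a}{m}\Big)_Z \Big(\frac{a}{n}\Big)_Z \]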
\[ f(z) = \prod_{r \in R} \frac{1 - z^{cr}}{1 - z^{r}} - 1. \]
(d) By taking $z = 1$ in the above, show that it would follow that $c^{(p-1)/2} \equiv
1 \pmod p$.
(e) Explain why $c^{(p-1)/2} \equiv -1 \pmod p$; deduce that $L(1, \chi_p) \ne 0$.
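\[ \sum_{n=M+1}^{M+N} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a) \sum_{n=M+1}^{M+N} e(an/q). \]
\[ \sum_{n=M+1}^{M+N} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a)\, e\Big(\frac{a(2M + N + 1)}{2q}\Big) \frac{\sin \pi aN/q}{\sin \pi a/q}. \tag{9.15} \]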
By Theorem 9.7 and the triangle inequality the right-hand side has absolute
value
\[ < \frac{1}{\sqrt{q}} \sum_{\substack{a=1 \\ (a,q)=1}}^{q-1} \frac{1}{\sin \pi a/q}. \]
Here the second half of the range of summation contributes the same amount as
the first. Hence it suffices to multiply by 2 and sum over 1 ≤ a ≤ q/2. However,
if q is odd, then q/2 is not an integer and hence the sum is actually over the
range 1 ≤ a ≤ (q − 1)/2, while if q is even, then 4 | q (since if q ≡ 2 (mod 4),
then there is no primitive character modulo q), and hence (q/2, q) > 1, and so
it suffices to sum over 1 ≤ a ≤ q/2 − 1 in this case. Hence in either case the
9.4 Incomplete character sums 307
expression above is
\[ \le \frac{2}{\sqrt{q}} \sum_{a=1}^{(q-1)/2} \frac{1}{\sin \pi a/q}. \]
The function f (α) = sin πα is concave downward in the interval [0, 1/2], and
hence it lies above the chord through the points (0, 0), (1/2, 1). That is, sin π α ≥
2α for 0 ≤ α ≤ 1/2. Thus the above is
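\[ \le \sqrt{q} \sum_{a=1}^{(q-1)/2} \frac{1}{a} < \sqrt{q} \sum_{a=1}^{(q-1)/2} \log \frac{1 + \frac{1}{2a}}{1 - \frac{1}{2a}} = \sqrt{q} \sum_{a=1}^{(q-1)/2} \log \frac{2a + 1}{2a - 1} = \sqrt{q} \log q. \]
That is,
\[ \Big| \sum_{n=M+1}^{M+N} \chi(n) \Big| < \sqrt{q} \log q \tag{9.16} \]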
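\[ = \sum_{n=M+1}^{M+N} \chi^\star(n) \sum_{k \mid (n,r)} \mu(k) = \sum_{k \mid r} \mu(k) \sum_{\substack{M < n \le M+N \\ k \mid n}} \chi^\star(n) = \sum_{k \mid r} \mu(k)\chi^\star(k) \sum_{M/k < m \le (M+N)/k} \chi^\star(m). \]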
By the case already treated, we know that the inner sum above has absolute
value not exceeding $d^{1/2} \log d$, and hence the given sum has absolute value
not more than $2^{\omega(r)} d^{1/2} \log d$. But $2^{\omega(r)} \le d(r) \ll r^{1/2} \le (q/d)^{1/2}$, so we have
proved
Theorem 9.18 (The Pólya–Vinogradov inequality) Let $\chi$ be a non-principal
character modulo $q$. Then for any integers $M$ and $N$ with $N > 0$,
\[ \sum_{n=M+1}^{M+N} \chi(n) \ll \sqrt{q} \log q. \]
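For primitive characters the argument above gives the explicit bound (9.16), which can be observed numerically. The following Python sketch is an illustration under an assumption: χ is taken to be the Legendre symbol modulo an odd prime p, which is a primitive character modulo p. Because the sum of χ over a full period vanishes, the largest interval sum equals the difference between the largest and smallest prefix sums over one period. The function names are hypothetical.

```python
import math

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def max_interval_sum(p):
    """max over M, N of |sum_{n=M+1}^{M+N} (n/p)|, from prefix sums over one period."""
    prefix, lowest, highest = 0, 0, 0
    for n in range(1, p + 1):
        prefix += legendre(n, p)
        lowest = min(lowest, prefix)
        highest = max(highest, prefix)
    return highest - lowest

def polya_vinogradov_bound(p):
    """Right-hand side of (9.16) with q = p."""
    return math.sqrt(p) * math.log(p)
```

A typical use is to tabulate `max_interval_sum(p) / polya_vinogradov_bound(p)` over a range of primes; by (9.16) the ratio stays below 1.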
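Thus
\[ \prod_{i=1}^{r} \Big( \chi_0(n) - \frac{1}{q_i} \sum_{a_i=1}^{q_i} \chi_i(n)^{a_i} \Big) = \begin{cases} 1 & \text{if } n \text{ is a primitive root (mod } p), \\ 0 & \text{otherwise.} \end{cases} \]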
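The nature of this integral depends on whether $k = 0$ or not. In the former case
we find that
\[ \widehat{f_\chi}(0) = \sum_{n=1}^{q} \chi(n) \Big(1 - \frac{n}{q}\Big) = \frac{-1}{q} \sum_{n=1}^{q} n\chi(n), \]
while for $k \ne 0$ we have
\[ \widehat{f_\chi}(k) = \sum_{n=1}^{q} \chi(n) \frac{1 - e(-kn/q)}{-2\pi i k} = \frac{1}{2\pi i k} \sum_{n=1}^{q} \chi(n)e(-kn/q) = \frac{c_\chi(-k)}{2\pi i k}. \]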
In the special case that χ is a quadratic character we know the exact value
of the Gauss sum, and hence we can say more.
Corollary 9.22 If $d$ is a quadratic discriminant with $d < 0$, then
\[ \sum_{1 \le n \le |d|/2} \Big(\frac{d}{n}\Big) > 0. \]
Since $e(kN/q) - 1 \sim 2\pi i kN/q$ when $|k|$ is small compared with $q/N$, for
rough heuristics we think of the above as being approximately
\[ \frac{\tau(\chi)N}{q} \sum_{0 < |k| \le q/N} \overline{\chi}(-k)e(kM/q). \]
for $K \le q^{1-\varepsilon}$. This can be used to obtain sharper constants in the
Pólya–Vinogradov inequality; see Exercise 9.4.9.
We can also show that the estimate provided by the Pólya–Vinogradov
inequality is in general not far from the truth.
\[ \max_{M,N} \Big| \sum_{n=M+1}^{M+N} \chi(n) \Big| \ge \frac{|\tau(\chi)|}{\pi}. \]
Proof Clearly
\[ \Big| \sum_{M=1}^{q} e(M/q) \sum_{n=M+1}^{M+N} \chi(n) \Big| \le \sum_{M=1}^{q} \Big| \sum_{n=M+1}^{M+N} \chi(n) \Big| \le q \max_{M} \Big| \sum_{n=M+1}^{M+N} \chi(n) \Big|. \]
By (9.14) this is
\[ e\Big(\frac{-(N + 1)}{2q}\Big) \frac{\sin \pi N/q}{\sin \pi/q}\, \tau(\chi). \]
If $q$ is even, then we may take $N = q/2$, and then the quotient of sines is
$1/(\sin \pi/q) \ge q/\pi$, while if $q$ is odd, then we may take $N = (q - 1)/2$, in
which case the quotient of sines is
\[ \frac{\cos \frac{\pi}{2q}}{\sin \frac{\pi}{q}} = \frac{1}{2 \sin \frac{\pi}{2q}} \ge \frac{q}{\pi}. \]
\[ \max_{M,N} \Big| \sum_{n=M+1}^{M+N} \Big(\frac{d}{n}\Big) \Big| > c \sqrt{d} \log \log d \]
\[ \sum_{n=k-h}^{k+h} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a)e(ak/q) \frac{\sin \pi a(2h + 1)/q}{\sin \pi a/q}. \]
Let $h$ be the integer closest to $q/3$. Then the sine in the numerator is
approximately $\sin 2\pi a/3$ when $a$ is small. We shall choose $\chi$ so that $\chi(a) = \left(\frac{a}{3}\right)_L$ when
$a$ is small. Thus these two factors are strongly correlated. We would take $k = 0$
except for the need to dampen the effects of the larger values of $a$. To this end
\[ \frac{2q}{\pi \tau(\overline{\chi})} \sum_{a=1}^{A} \frac{\overline{\chi}(a)}{a} \sin 2\pi a/3 \]
where A = q/K . To make this precise we observe that
and that
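\[ \frac{\sin \pi(2K + 1)a/q}{(2K + 1) \sin \pi a/q} = \begin{cases} 1 + O\big(K^2 \|a/q\|^2\big) & (\|a/q\| \le 1/K), \\ O\big(K^{-1} \|a/q\|^{-1}\big) & (\|a/q\| > 1/K). \end{cases} \]
Thus the right-hand side of (9.20) is
\[ = \frac{2}{\tau(\overline{\chi})} \sum_{a=1}^{q/K} \overline{\chi}(a) \Big( \frac{1}{\pi a/q} + O\Big(\frac{a}{q}\Big) \Big) \Big( \sin 2\pi a/3 + O\Big(\frac{a}{q}\Big) \Big) \times \Big( 1 + O\Big(\frac{K^2 a^2}{q^2}\Big) \Big) + O\Big( \frac{1}{\sqrt{q}} \sum_{q/K < a \le q/2} \frac{q^2}{K a^2} \Big) \]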
\[ q \equiv 5 \pmod 8, \qquad \Big(\frac{q}{p}\Big)_L = \Big(\frac{p}{3}\Big)_L \quad (3 < p \le y). \tag{9.22} \]
Thus by the Chinese Remainder Theorem, $q$ is restricted to certain residue
classes modulo $Q = 8 \prod_{3 < p \le y} p$. Now let $q$ be the least positive number that
satisfies these constraints. Then $q$ is square-free, and hence $q$ is a quadratic
discriminant, so we may take $\chi(n) = \left(\frac{q}{n}\right)_K$. Also, $q < Q$. By the Prime Number
Theorem in the form of (6.13) we see that $\log Q = (1 + o(1))y$. Let $K$ be the
least integer such that $K > q/y$. Then by (9.22), $\chi(a) = \left(\frac{a}{3}\right)_L$ for $1 \le a \le q/K$,
$(a, 3) = 1$. Thus $\sum_{1 \le a \le u} \chi(a) \sin 2\pi a/3 = u/\sqrt{3} + O(1)$, so the main term in
(9.21) is
\[ \frac{2\sqrt{q}}{\pi\sqrt{3}} (\log y + O(1)) \ge \Big( \frac{2}{\pi\sqrt{3}} + o(1) \Big) \sqrt{q} \log \log q. \]
This completes the proof.
In the two preceding theorems we have seen that the character sum can be
large when N is comparable to q. For shorter sums we would expect the sum
to be smaller, and indeed one would conjecture that if χ is a non-principal
character modulo q, then
\[ \sum_{n=M+1}^{M+N} \chi(n) \ll_{\varepsilon} N^{1/2} q^{\varepsilon} \tag{9.23} \]
for any ε > 0. Although our present knowledge falls far short of this, we now
show that some improvement of the Pólya–Vinogradov inequality is possible, at
least in some situations. Our approach depends on the Riemann hypothesis for
curves over a finite field, in the form of the following character sum estimate,
which we derive from the exposition of Schmidt (1976).
Lemma 9.25 (Weil) Suppose that $d \mid (p - 1)$ with $d > 1$ and that $\chi$ is a
character modulo $p$ of order $d$. Suppose further that $e_j \ge 1$ ($1 \le j \le k$), that $d \nmid e_j$
for some $j$ with $1 \le j \le k$ and that the $c_1, c_2, \ldots, c_k$ are distinct modulo $p$.
Then
\[ \Big| \sum_{n=1}^{p} \chi\big( (n + c_1)^{e_1} (n + c_2)^{e_2} \cdots (n + c_k)^{e_k} \big) \Big| \le (k - 1) p^{1/2}. \]
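A special case of Lemma 9.25 can be checked by exhaustive computation for small primes. The sketch below assumes that χ is the quadratic character modulo p (so d = 2) and that every exponent e_j equals 1, so the hypothesis that d does not divide some e_j holds. It reports the largest value of the ratio of the complete sum to (k − 1)√p over all choices of k distinct shifts. The function names are illustrative.

```python
import math
from itertools import combinations

def legendre_table(p):
    """chi[a] = (a/p) for 0 <= a < p, built from the nonzero squares mod p."""
    chi = [-1] * p
    chi[0] = 0
    for x in range(1, p):
        chi[x * x % p] = 1
    return chi

def worst_ratio(p, k=3):
    """max over distinct c_1 < ... < c_k of |sum_n chi((n+c_1)...(n+c_k))| / ((k-1) sqrt(p))."""
    chi = legendre_table(p)
    worst = 0.0
    for shifts in combinations(range(p), k):
        total = 0
        for n in range(p):
            term = 1
            for c in shifts:            # chi is multiplicative, so multiply the values
                term *= chi[(n + c) % p]
            total += term
        worst = max(worst, abs(total) / ((k - 1) * math.sqrt(p)))
    return worst
```

For k = 2 the complete sum is always −1, so the ratio is 1/√p; for k = 3 the check amounts to the bound 2√p for sums of the quadratic character over a cubic polynomial with distinct roots.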
Proof Clearly we may suppose that $h \le p$. Let $d$ denote the order of $\chi$. Then
$d > 1$ and
\[ S_{h,r} = \sum_{m_1, \ldots, m_{2r}} \sum_{n=1}^{p} \chi\big( (n + m_1) \cdots (n + m_r)(n + m_{r+1})^{d-1} \cdots (n + m_{2r})^{d-1} \big). \]
For a given $2r$-tuple $m_1, \ldots, m_{2r}$ let $c_1 < c_2 < \cdots < c_k$ be the distinct
values of the $m_j$, and let $a_l$ and $b_l$ denote the number of occurrences of
$c_l$ amongst the $m_1, \ldots, m_r$ and $m_{r+1}, \ldots, m_{2r}$ respectively. Let $e_l = a_l +
(d - 1)b_l$. Then $(n + m_1) \cdots (n + m_r)(n + m_{r+1})^{d-1} \cdots (n + m_{2r})^{d-1} =
(n + c_1)^{e_1} \cdots (n + c_k)^{e_k}$. Note that $e_1 + \cdots + e_k = r + r(d - 1) = rd$. If there is an
$l$ such that $d \nmid e_l$, then by Lemma 9.25 the sum over $n$ is bounded by $(k - 1) p^{1/2}$,
and so the total contribution to $S_{h,r}$ from such $2r$-tuples is
\[ \le 2r h^{2r} p^{1/2}. \]
Theorem 9.27 (Burgess) For any odd prime $p$ and any positive integer $r$ we
have
\[ \sum_{n=M+1}^{M+N} \chi(n) \ll_r N^{1 - \frac{1}{r}}\, p^{\frac{r+1}{4r^2}} (\log p)^{\alpha_r} \]
Suppose that $\delta > 1/4$. If $N > p^{\delta}$, then the bound above is $o(N)$ if $r$ is
chosen suitably large in terms of $\delta$. Thus any interval of length $N$ contains both
quadratic residues and quadratic non-residues. In addition the reasoning used
to derive Corollary 9.19 applies here, so we see that the least positive quadratic
non-residue modulo $p$ is $\ll_{\varepsilon} p^{\frac{1}{4\sqrt{e}} + \varepsilon}$.
Proof When $r = 1$ or $N > p^{5/8}$ the bound is weaker than the Pólya–Vinogradov
Inequality (Theorem 9.18), and when $r > 2$ and $N > p^{1/2}$ the
stated bound is weaker than the case $r = 2$. Also, when $N \le p^{\frac{r+1}{4r}}$ the bound is
Let
Then
M+N
S(M, N ) = χ (n + ab) + 2θM(ab)
n=M+1
We suppose that
A< p (9.25)
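and then define $\nu(\ell)$ to be the number of pairs $a, n$ with $a \in [1, A]$, $n \in [M +
1, M + N]$ and $n \equiv a\ell \pmod p$. Thus
\[ \Big| \sum_{n,a,b} \chi(n + ab) \Big| = \Big| \sum_{\ell=1}^{p} \sum_{\substack{n,a \\ n \equiv a\ell \ (\mathrm{mod}\ p)}} \chi(a) \sum_{b} \chi(\ell + b) \Big| \le \sum_{\ell=1}^{p} \nu(\ell) \Big| \sum_{b} \chi(\ell + b) \Big|. \]
By Hölder's inequality,
\[ \Big( \sum_{\ell=1}^{p} \nu(\ell) \Big| \sum_{b} \chi(\ell + b) \Big| \Big)^{2r} \le \Big( \sum_{\ell=1}^{p} \nu(\ell)^{\frac{2r}{2r-1}} \Big)^{2r-1} \sum_{\ell=1}^{p} \Big| \sum_{b} \chi(\ell + b) \Big|^{2r} \]
and
\[ \Big( \sum_{\ell=1}^{p} \nu(\ell)^{\frac{2r}{2r-1}} \Big)^{2r-1} \le \Big( \sum_{\ell=1}^{p} \nu(\ell) \Big)^{2r-2} \sum_{\ell=1}^{p} \nu(\ell)^2. \]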
Clearly
\[ \sum_{\ell=1}^{p} \nu(\ell) = AN. \]
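$n = n_0$, $n' = n_0'$ satisfying this equation we have, in general, $n = n_0 + \frac{a}{(a,a')} h$,
$n' = n_0' + \frac{a'}{(a,a')} h$. Moreover $|h| \le \frac{N(a,a')}{\max\{a,a'\}}$. Therefore the total number of possible
pairs $n, n'$ is at most $1 + \frac{2N(a,a')}{\max\{a,a'\}}$. Hence
\[ \sum_{\ell=1}^{p} \nu(\ell)^2 \ll A^2 + \sum_{1 \le a \le a' \le A} \frac{N(a, a')}{a'} \ll A^2 + \sum_{d \le A} \sum_{1 \le b \le b' \le A/d} \frac{N}{b'} \ll A^2 + AN \log 2A. \]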
9.4.1 Exercises
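1. Let $\chi$ be a non-principal character modulo $q$, and suppose that $(a, q) = 1$.
Choose $\overline{a}$ so that $a\overline{a} \equiv 1 \pmod q$.
(a) Explain why
\[ \overline{\chi}(a) \sum_{n=M+1}^{M+N} \chi(an + b) = \sum_{n=M+\overline{a}b+1}^{M+\overline{a}b+N} \chi(n). \]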
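\[ \widehat{f}(k) = e(-(2M + N + 1)k/q) \frac{\sin \pi kN/q}{\sin \pi k/q} \]
for $k \not\equiv 0 \pmod q$.
(c) By subtracting $\widehat{c}(0)N/q$ from both sides and applying the triangle
inequality, show that
\[ \Big| \sum_{n=M+1}^{M+N} c_n - \frac{N}{q} \sum_{n=1}^{q} c_n \Big| \le \frac{1}{q} \sum_{k=1}^{q-1} \frac{|\widehat{c}(k)|}{\sin \pi k/q} \]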
for δ > 0.
(b) Take $f(u) = \csc \pi u$, $x = k/q$, and $\delta = 1/(2q)$, and sum over $k$ to see
that
\[ \sum_{k=1}^{q-1} \frac{1}{\sin \pi k/q} < q \int_{1/(2q)}^{1-1/(2q)} \frac{1}{\sin \pi u}\, du. \]
(c) Note that $\csc v$ has the antiderivative $\log(\csc v - \cot v)$, and hence
deduce that the integral above is
\[ = \frac{q}{\pi} \log \frac{1 + \cos \frac{\pi}{2q}}{1 - \cos \frac{\pi}{2q}}. \]
for 1 ≤ N ≤ q.
(b) Suppose that $c_n = 1$ for $0 < n < q$ and that $c_0 = 0$. Show that $\widehat{c}(0) =
q - 1$ and that $\widehat{c}(k) = -1$ for $0 < k < q$. Deduce that
\[ \sum_{k=1}^{q-1} \frac{\sin^2 \pi Nk/q}{\sin^2 \pi k/q} = (q - N)N \]
for $0 \le N \le q$.
(c) Take $q = 2N$ and write $k = 2n - 1$ to deduce that
\[ \sum_{n=1}^{N} \frac{1}{\big( N \sin \pi \frac{2n-1}{2N} \big)^2} = 1. \]
Let $N$ tend to infinity to show that $\sum_{n=1}^{\infty} (2n - 1)^{-2} = \pi^2/8$, and hence
that $\zeta(2) = \pi^2/6$.
for 1 ≤ N ≤ q.
(b) Show that if $\chi \ne \chi_0 \pmod p$, then
\[ \sum_{M=1}^{p} \Big| \sum_{n=M+1}^{M+N} \chi(n) \Big|^2 = N(p - N) \]
for $1 \le N \le p$.
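8. Let $f_\chi(\alpha) = \sum_{0 < n \le q\alpha} \chi(n)$. Show that if $\chi$ is a primitive character modulo
$q$, then
\[ \int_0^1 |f_\chi(\alpha) - a_\chi|^2\, d\alpha = \frac{q}{12} \prod_{p \mid q} \Big( 1 - \frac{1}{p^2} \Big) \]
where $a_\chi = 0$ if $\chi(-1) = 1$, and
\[ a_\chi = \frac{-1}{q} \sum_{n=1}^{q} n\chi(n) = -i L(1, \overline{\chi})\tau(\chi)/\pi \]
if $\chi(-1) = -1$.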
for 1 ≤ K ≤ q.
(c) Suppose that $\chi$ is a primitive character modulo $q$, $q > 1$. Use
Theorem D.2 to show that
\[ \sum_{n=M+1}^{M+N} \chi(n) = \frac{\tau(\chi)}{2\pi i} \sum_{0 < |k| \le K} \frac{\overline{\chi}(-k)}{k} e(kM/q)(e(kN/q) - 1) + O\Big( \frac{\varphi(q)}{K} \log 2K \Big) \]
when $K < q^{1-\varepsilon}$.
(d) By taking $K = q^{1/2} \log q$ show that if $\chi$ is a primitive character modulo
$q$, $q > 1$, then
\[ \Big| \sum_{n=M+1}^{M+N} \chi(n) \Big| \le \frac{\varphi(q)}{\pi q}\, q^{1/2} \log q + O\big( q^{1/2} \log \log 3q \big). \]
10. (Bernstein 1914a,b) Let $\chi$ be a primitive character (mod $q$), with $q > 1$.
Show that
\[ \sum_{|n| \le q} (1 - |n|/q)\chi(n)e(n\alpha) \ll \sqrt{q} \]
uniformly in $\alpha$.
9.5 Notes
Section 9.2. That the sum in (9.6) vanishes when (n, q) > 1 was proved by de la
Vallée Poussin (1896), in a complicated way. We follow the simpler argument
that Schur showed Landau (1908, pp. 430–431).
The evaluation of the sum cχ is found in Hasse (1964, pp. 449–450). Our
derivation follows that of Montgomery & Vaughan (1975). A different proof
has been given by Joris (1977).
Section 9.3. Let $\zeta_K(s) = \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$ be the Dedekind zeta function of the
algebraic number field $K$. Here the sum is over all ideals $\mathfrak{a}$ in the ring $\mathcal{O}_K$ of
integers in $K$. In case $K$ is a quadratic extension of $\mathbb{Q}$, then the discriminant
$d$ of $K$ is a quadratic discriminant, $K = \mathbb{Q}(\sqrt{d})$, and $\zeta_K(s) = \zeta(s)L(s, \chi_d)$. In
other words, the number of ideals of norm $n$ is $\sum_{k \mid n} \chi_d(k)$.
Section 9.4. Concerning the constant that can be taken in Theorem 9.18,
see Landau (1918), Cochrane (1987), Hildebrand (1988a,b), and Granville &
Soundararajan (2005). Granville & Soundararajan (2005) also show that in the
case of a cubic character, the sum in Theorem 9.18 is $\ll \sqrt{q} (\log q)^{\theta}$ where $\theta$
is an absolute constant, $\theta < 1$.
On the assumption of the Generalized Riemann Hypothesis for all Dirichlet
characters, Montgomery & Vaughan (1977) have shown that
\[ \sum_{n=M+1}^{M+N} \chi(n) \ll q^{1/2} \log \log q. \]
See Granville & Soundararajan (2005) for a much simpler proof. Paley’s lower
bound, Theorem 9.24 above, shows that the above is essentially best-possible.
Nevertheless, it is known that one can do better a good deal of the time. In fact
in Montgomery & Vaughan (1979) it is shown that for each θ ∈ (0, 1) there is a
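$c(\theta) > 0$ such that if $P > P_0(\theta)$, then for at least $\theta\pi(P)$ primes $p \le P$ we have
\[ \max_{N} \Big| \sum_{n=1}^{N} \Big(\frac{n}{p}\Big) \Big| \le c(\theta) p^{1/2}, \]
and if $q > P_0(\theta)$, then for at least $\theta\varphi(q)$ of the non-principal characters modulo
$q$ we have
\[ \max_{N} \Big| \sum_{n=1}^{N} \chi(n) \Big| \le c(\theta) q^{1/2}. \]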
Walfisz (1942) and Chowla (1947) showed that there exist infinitely many
primitive quadratic characters $\chi$ for which $L(1, \chi) \gtrsim e^{C_0} \log \log q$. In view
of Theorem 9.21, this provides an alternative approach for proving estimates
similar to Paley’s Theorem 9.24. For recent developments concerning large
L(1, χ ), see Vaughan (1996), Montgomery & Vaughan (1999), and Granville
& Soundararajan (2003).
Lemma 9.25 is a consequence of Weil’s proof of the Riemann Hypothesis
for curves over finite fields, and originally depended on considerable machinery
from algebraic geometry. Later Stepanov used constructs from transcendence
theory to estimate complete character sums, and subsequently Bombieri used
Stepanov’s ideas to give a proof of Weil’s theorem that depends only on the
Riemann–Roch theorem. Schmidt (1976) gives an exposition of this more
elementary approach that even avoids the Riemann–Roch theorem. Friedlander
& Iwaniec (1992) showed that the Pólya–Vinogradov inequality can be sharp-
ened, in the direction of Burgess’ estimates, without using Weil’s estimates. The
9.6 References
Apostol, T. M. (1970). Euler’s ϕ-function and separable Gauss sums, Proc. Amer. Math.
Soc. 24, 482–485.
Baker, R. C. & Montgomery, H. L. (1990). Oscillations of quadratic L-functions,
Analytic Number Theory (Urbana, 1989), Prog. Math. 85. Boston: Birkhäuser,
pp. 23–40.
Bernstein, S. N. (1914a). Sur la convergence absolue des séries trigonométriques, C. R.
Acad. Sci. Paris 158, 1661–1663.
(1914b). Ob absoliutnoi skhodimosti trigonometricheskikh riadov, Soobsch. Khar’k.
matem. ob-va (2) 14, 145–152; 200–201.
Burgess, D. A. (1957). The distribution of quadratic residues and non-residues, Mathe-
matika 4, 106–112.
(1962a). On character sums and primitive roots, Proc. London Math. Soc. (3) 12,
179–192.
(1962b). On character sums and L-series, Proc. London Math. Soc. (3) 12, 193–
206.
(1986). The character sum estimate with r = 3, J. London Math. Soc. (2) 33, 219–
226.
Chowla, S. (1947). On the class-number of the corpus P(√−k), Proc. Nat. Inst. Sci.
India 13, 197–200.
Chowla, S. & Mordell, L. J. (1961). Note on the nonvanishing of L(1), Proc. Amer.
Math. Soc. 12, 283–284.
Cochrane, T. (1987). On a trigonometric inequality of Vinogradov, J. Number Theory
27, 9–16.
Conway, J. H. (1997). The Sensuous Quadratic Form, Carus monograph 26. Washington:
Math. Assoc. Amer.
Friedlander, J. B. (1987). Primes in arithmetic progressions and related topics, Analytic
Number Theory and Diophantine Problems (Stillwater, 1984), Prog. Math. 70,
Boston: Birkhäuser, pp. 125–134.
Friedlander, J. B. & Iwaniec, H. (1992). A mean-value theorem for character sums,
Michigan Math. J. 39, 153–159.
(1993). Estimates for character sums, Proc. Amer. Math. Soc. 119, 365–372.
(1994). A note on character sums, The Rademacher legacy to mathematics (University
Park, 1992), Contemp. Math. 166, Providence: Amer. Math. Soc., pp. 295–299.
Fujii, A., Gallagher, P. X., & Montgomery, H. L. (1976). Some hybrid bounds for
character sums and Dirichlet L-series, Topics in Number Theory (Proc. Colloq.
Debrecen, 1974), Colloq. Math. Soc. Janos Bolyai 13. Amsterdam: North-Holland,
pp. 41–57.
Granville, A. & Soundararajan, K. (2003). The distribution of values of L(1, χd ), Geom.
Funct. Anal. 13, 992–1028; Errata 14 (2004), 245–246.
(2006). Large character sums: pretentious characters and the Pólya-Vinogradov in-
equality, to appear, 24 pp.
Hasse, H. (1964). Vorlesungen über Zahlentheorie, Second Edition, Grundl. Math. Wiss.
59. Berlin: Springer-Verlag.
Hildebrand, A. (1988a). On the constant in the Pólya–Vinogradov inequality, Canad.
Math. Bull. 31, 347–352.
(1988b). Large values of character sums, J. Number Theory 29, 271–296.
Joris, H. (1977). On the evaluation of Gaussian sums for non-primitive characters,
Enseignement Math. (2) 23, 13–18.
Landau, E. (1908). Nouvelle démonstration pour la formule de Riemann sur le nom-
bre des nombres premiers inférieurs à une limite donnée, et démonstration d’une
formule plus générale pour le cas des nombres premiers d’une progression
arithmétique, Ann. École Norm. Sup. (3) 25 399–448; Collected Works, Vol. 4.
Essen: Thales Verlag, 1986, pp. 87–130.
(1918). Abschätzungen von Charaktersummen, Einheiten und Klassenzahlen, Nachr.
Akad. Wiss. Göttingen, 79–97; Collected Works, Vol. 7. Essen: Thales Verlag, 1986,
pp. 114–132.
Martin, G. (2006). Inequities in the Shanks–Rényi prime number race, 32 pp., to appear.
Mattics, L. E. (1984). Advanced problem 6461, Amer. Math. Monthly 91, 371.
Montgomery, H. L. (1976). Distribution questions concerning a character sum, Topics in
Number Theory (Proc. Colloq. Debrecen, 1974), Colloq. Math. Soc. Janos Bolyai
13. Amsterdam: North-Holland, pp. 195–203.
(1980). An exponential polynomial formed with the Legendre symbol, Acta Arith.
37, 375–380.
Montgomery, H. L. & Vaughan, R. C. (1975). The exceptional set in Goldbach’s problem,
Acta Arith. 27, 353–370.
(1977). Exponential sums with multiplicative coefficients, Invent. Math. 43, 69–82.
(1979). Mean values of character sums, Canad. J. Math. 31, 476–487.
(1999). Extreme values of Dirichlet L-functions at 1, Number Theory in Progress,
Vol. 2 (Zakopane–Kościelisko, 1997). Berlin: de Gruyter, pp. 1039–1052.
Mordell, L. J. (1933). The number of solutions of some congruences in two variables,
Math. Z. 37, 193–209.
Paley, R. E. A. C. (1932). A theorem of characters, J. London Math. Soc. 7, 28–32.
Pólya, G. (1918). Über die Verteilung der quadratischen Reste und Nichtreste, Nachr.
Akad. Wiss. Göttingen, 21–29.
Schmidt, W. M. (1976). Equations over finite fields. An elementary approach, Lecture
Notes Math. 536, Berlin: Springer-Verlag.
Schur, I. (1918). Einige Bemerkungen zu der vorstehenden Arbeit des Herrn G. Pólya:
Über die Verteilung der quadratischen Reste und Nichtreste, Nachr. Akad. Wiss.
Göttingen, 30–36.
de la Vallée Poussin, C. J. (1896). Recherches analytiques sur la théorie des nombres
premiers, I–III, Ann. Soc. Sci. Bruxelles 20, 183–256, 281–362, 363–397.
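Theorem 10.1 For arbitrary real $\alpha$, and complex numbers $z$ with $\Re z > 0$,
\[ \sum_{n=-\infty}^{\infty} e^{-\pi (n+\alpha)^2 z} = z^{-1/2} \sum_{k=-\infty}^{\infty} e(k\alpha) e^{-\pi k^2/z}, \tag{10.1} \]
and
\[ \sum_{n=-\infty}^{\infty} (n + \alpha) e^{-\pi (n+\alpha)^2 z} = -i z^{-3/2} \sum_{k=-\infty}^{\infty} k\, e(k\alpha) e^{-\pi k^2/z} \tag{10.2} \]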
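Both identities (10.1) and (10.2) converge very rapidly on each side, so they can be confirmed to machine precision with truncated sums. The sketch below is an illustration only: it assumes the principal branch for the powers of z (the branch that is positive for positive real z), truncates each series at |n| ≤ 60, and uses hypothetical function names.

```python
import cmath
import math

def defect_10_1(alpha, z, terms=60):
    """|LHS - RHS| of (10.1), truncated to |n|, |k| <= terms."""
    lhs = sum(cmath.exp(-math.pi * (n + alpha) ** 2 * z)
              for n in range(-terms, terms + 1))
    rhs = z ** -0.5 * sum(cmath.exp(2j * math.pi * k * alpha - math.pi * k * k / z)
                          for k in range(-terms, terms + 1))
    return abs(lhs - rhs)

def defect_10_2(alpha, z, terms=60):
    """|LHS - RHS| of (10.2), truncated to |n|, |k| <= terms."""
    lhs = sum((n + alpha) * cmath.exp(-math.pi * (n + alpha) ** 2 * z)
              for n in range(-terms, terms + 1))
    rhs = -1j * z ** -1.5 * sum(k * cmath.exp(2j * math.pi * k * alpha - math.pi * k * k / z)
                                for k in range(-terms, terms + 1))
    return abs(lhs - rhs)
```

The truncation level is adequate when |z| is of moderate size; for z very close to 0 or very large one side converges slowly and more terms would be needed.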
10.1 Functional equations and analytic continuation 327
we see that
\[ \widehat{f}(t) = e^{-\pi t^2/z} \int_{-\infty}^{+\infty} e^{-\pi (x + it/z)^2 z}\, dx. \]
proof.
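Theorem 10.2 For any complex number $s$, except $s = 0$ and $s = 1$, and any
non-zero complex number $z$ with $\Re z \ge 0$,
\[ \zeta(s)\Gamma(s/2)\pi^{-s/2} = \pi^{-s/2} \sum_{n=1}^{\infty} n^{-s}\, \Gamma(s/2, \pi n^2 z) + \pi^{(s-1)/2} \sum_{n=1}^{\infty} n^{s-1}\, \Gamma((1 - s)/2, \pi n^2/z) + \frac{z^{(s-1)/2}}{s - 1} - \frac{z^{s/2}}{s}. \tag{10.3} \]
Here $\Gamma(s, a)$ is the incomplete gamma function,
\[ \Gamma(s, a) = \int_a^{\infty} e^{-w} w^{s-1}\, dw, \tag{10.4} \]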
two sums on the right are uniformly convergent for s in any compact set, and
hence by a theorem of Weierstrass they represent entire functions. The last two
terms have simple poles at 1 and 0, respectively. As for the left-hand side, we
note that $\Gamma(s/2)$ has a pole at $s = 0$, and never vanishes, so it follows that $\zeta(s)$
is analytic for all $s \ne 1$. If we simultaneously replace $s$ by $1 - s$ and $z$ by $1/z$,
then the two sums on the right in (10.3) are exchanged, and the last two terms
are also exchanged, so that the value of the right-hand side is invariant. These
observations may be summarized as follows:
328 Analytic properties of ζ (s) and L(s, χ )
n=1 0
\[ = \int_0^{\infty} \sum_{n=1}^{\infty} e^{-\pi n^2 u}\, u^{s/2-1}\, du. \tag{10.7} \]
$z$ to $\infty$. We call these integrals $\int_1$, $\int_2$, respectively. By reversing the steps we
made in passing from (10.6) to (10.7) we see immediately that
\[ \int_2 = \pi^{-s/2} \sum_{n=1}^{\infty} n^{-s}\, \Gamma(s/2, \pi n^2 z). \]
To treat $\int_1$ we let
\[ \vartheta(u) = \sum_{n=-\infty}^{+\infty} e^{-\pi n^2 u} \tag{10.8} \]
for $\Re u > 0$. Then the sum in the integrand in (10.7) is $(\vartheta(u) - 1)/2$. Thus
\[ \int_1 = \frac{1}{2} \int_0^z \vartheta(u) u^{s/2-1}\, du - \frac{1}{2} \int_0^z u^{s/2-1}\, du. \]
Here the second integral is $\frac{2}{s} z^{s/2}$. By Theorem 10.1 we know that $\vartheta(u) =
u^{-1/2} \vartheta(1/u)$. Hence the first term above is
\[ \frac{1}{2} \int_0^z \vartheta(1/u) u^{s/2-3/2}\, du = \int_0^z \sum_{n=1}^{\infty} e^{-\pi n^2/u}\, u^{s/2-3/2}\, du + \frac{1}{2} \int_0^z u^{s/2-3/2}\, du. \]
Here the second integral is $\frac{2}{s-1} z^{(s-1)/2}$. By the change of variable $v = 1/u$ we
see that the first term above is
\[ \int_{1/z}^{\infty} \sum_{n=1}^{\infty} e^{-\pi n^2 v}\, v^{(1-s)/2-1}\, dv. \]
We exchange the order of summation and integration, and make the linear
change of variables $x = \pi n^2 v$, to see that this is
\[ \pi^{(s-1)/2} \sum_{n=1}^{\infty} n^{s-1}\, \Gamma((1 - s)/2, \pi n^2/z). \]
Hence
\[ \int_1 = \frac{z^{(s-1)/2}}{s - 1} - \frac{z^{s/2}}{s} + \pi^{(s-1)/2} \sum_{n=1}^{\infty} n^{s-1}\, \Gamma((1 - s)/2, \pi n^2/z), \]
so we have the desired identity for $\sigma > 1$. But, as already noted, the two sums
represent entire functions, so the right-hand side of (10.3) is analytic for all $s$
except for simple poles at $s = 1$ and $s = 0$. Hence by the uniqueness of analytic
continuation the identity (10.3) holds for all $s$ except at the poles.
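Proof By the reflection principle (C.6) and the duplication formula (C.9), we
see that
$$\frac{\Gamma\big(\frac{1-s}{2}\big)}{\Gamma\big(\frac{s}{2}\big)} = \frac{1}{\pi}\,\Gamma\Big(\frac{1-s}{2}\Big)\Gamma\Big(1-\frac{s}{2}\Big)\sin\frac{\pi s}{2} = \pi^{-1/2}\, 2^{s}\,\Gamma(1-s)\sin\frac{\pi s}{2}.$$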
Let $\sigma$ be fixed, and let $\mu(\sigma)$ denote the infimum of those exponents $\mu$
such that $\zeta(\sigma + it) \ll \tau^{\mu}$. This is the Lindelöf $\mu$-function. By Corollary 1.17
we know that $\mu(\sigma) = 0$ for $\sigma \ge 1$ and that $\mu(\sigma) \le 1 - \sigma$ for $0 < \sigma \le 1$. By
Corollary 10.5 we see that $\mu(\sigma) = \mu(1 - \sigma) + 1/2 - \sigma$. Hence in particular,
$\mu(\sigma) = 1/2 - \sigma$ for $\sigma \le 0$. For $0 < \sigma < 1$ the value of $\mu(\sigma)$ is at present
unknown, but the Lindelöf Hypothesis (LH) asserts that $\zeta(1/2 + it) \ll_{\varepsilon} \tau^{\varepsilon}$,
which is to say that $\mu(1/2) = 0$. From this it follows that
$$\mu(\sigma) = \begin{cases} 0 & \text{for } \sigma \ge 1/2, \\ 1/2 - \sigma & \text{for } \sigma \le 1/2. \end{cases} \tag{10.10}$$
Three different proofs that LH implies the above are found in Exercises
10.1.18–20. Also, from Exercises 10.1.20 and 10.1.21 we see that LH is equivalent
to a certain assertion concerning the distribution of the zeros of ζ (s). Since
this assertion is visibly weaker than RH, it is evident that RH implies LH. In
Chapter 13 we shall show that RH implies a quantitative form of LH.
Concerning special values of the zeta function, we observe first that since
ζ (s) ∼ 1/(s − 1) for s near 1, it follows from Corollary 10.4 that
ζ (0) = −1/2. (10.11)
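The passage to (10.11) can be made explicit. The sketch below assumes that Corollary 10.4 is the asymmetric functional equation $\zeta(s) = \zeta(1-s)\,2^{s}\pi^{s-1}\Gamma(1-s)\sin(\pi s/2)$, which is the form consistent with the gamma-function identity used in its proof; the statement itself is not reproduced in this passage.

```latex
\begin{align*}
\zeta(0) &= \lim_{s\to 0}\, \zeta(1-s)\, 2^{s}\pi^{s-1}\,\Gamma(1-s)\,\sin\frac{\pi s}{2} \\
         &= \lim_{s\to 0}\, \Big(-\frac{1}{s} + O(1)\Big)\cdot \frac{1}{\pi}\cdot\Big(\frac{\pi s}{2} + O(s^{3})\Big)
          = -\frac12 .
\end{align*}
```

Here $\zeta(1-s) \sim 1/((1-s)-1) = -1/s$ is the only place where the behaviour of $\zeta$ near its pole is used.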
Since χ is primitive, we know by Theorem 9.7 that the inner sum on the right is
τ (χ )χ (k) for all k. This gives the identity for ϑ0 . The identity for ϑ1 is proved
similarly, using (10.2).
As was the case with the zeta function, the above is first proved for σ > 1.
Since each term of the series is entire, and since the series are locally uniformly
convergent, the right-hand side is an entire function of s, and this provides an
analytic continuation of L(s, χ) to the entire complex plane. If in the above we
If $p \mid d$, then $\chi^{\star}(p) = 0$, and thus in the above product we may confine our
attention to those primes $p \mid q$ such that $p \nmid d$. For such a prime, the factor
$1 - \chi^{\star}(p)/p^{s}$ is an entire function whose zeros form an arithmetic progression
on the imaginary axis. Thus $L(s, \chi)$ has all the zeros of $L(s, \chi^{\star})$, and if there are
primes $p \mid q$ such that $p \nmid d$, then $L(s, \chi)$ has additional zeros on the imaginary
axis. Such zeros constitute a finite union of arithmetic progressions. In the
special case $\chi = \chi_0$, we have
$$L(s, \chi_0) = \zeta(s) \prod_{p \mid q} \Big(1 - \frac{1}{p^{s}}\Big).$$
Thus $L(s, \chi_0)$ has a pole at $s = 1$ with residue $\varphi(q)/q$, it has all the zeros of
$\zeta(s)$, and it also has zeros of the form $2\pi i k/\log p$ where $k$ takes integral values
and $p \mid q$.
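Both claims about the factor $\prod_{p \mid q}(1 - p^{-s})$ are easy to test numerically. The following is a minimal Python sketch; the helper names and the sample modulus $q = 12$ are illustrative assumptions. It checks that the product equals $\varphi(q)/q$ at $s = 1$ and vanishes at $s = 2\pi i k/\log p$.

```python
import cmath
import math


def prime_divisors(q):
    """Distinct prime divisors of q, by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes


def euler_correction(q, s):
    """Product over primes p dividing q of (1 - p^{-s})."""
    result = 1
    for p in prime_divisors(q):
        result *= 1 - cmath.exp(-s * math.log(p))
    return result


def phi(q):
    """Euler's totient, computed directly from the definition."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)


if __name__ == "__main__":
    q = 12
    print(euler_correction(q, 1), phi(q) / q)
    zero = 2j * math.pi * 3 / math.log(3)  # k = 3, p = 3
    print(abs(euler_correction(q, zero)))
```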
10.1.1 Exercises
1. Let $\vartheta(u)$ be defined as in (10.8). Show that $\vartheta'(1) = -\vartheta(1)/4$.
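2. Let $f$ be an even function in $L^1(\mathbb{R})$, let $\beta > 1$, suppose that $f(x) = O(x^{-\beta})$
as $x \to \infty$, and that $\hat f(u) = O(u^{-\beta})$ as $u \to \infty$. Show that
$$2\zeta(s) \int_0^{\infty} f(x) x^{s-1}\, dx = 2\sum_{n=1}^{\infty} n^{-s} \int_n^{\infty} f(x) x^{s-1}\, dx + 2\sum_{n=1}^{\infty} n^{s-1} \int_n^{\infty} \hat f(u) u^{-s}\, du - f(0)/s + \hat f(0)/(s-1)$$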
(b) With ϑ(x) defined as in (10.8), use the functional equation of the zeta
function to show that ϑ(x) = x −1/2 ϑ(1/x) for x > 0.
4. (Lavrik 1965)
(a) Suppose that z > 0, that $\sigma_0 > \max(0, -\sigma)$, and that $s \ne 0$, $s \ne -1$,
$s \ne -2, \ldots$. By pulling the contour to the left and summing the
residues, show that
$$\frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Gamma(w+s) z^{-w}\, \frac{dw}{w} = \Gamma(s) - \sum_{k=0}^{\infty} \frac{(-1)^k z^{s+k}}{k!\,(s+k)}.$$
(b) Show that if $\sigma > 0$, then the right-hand side above is $\Gamma(s, z)$.
(c) Argue that both sides are entire functions of $s$, and hence that the
identity
$$\Gamma(s, z) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Gamma(w+s) z^{-w}\, \frac{dw}{w}$$
holds for all complex $s$.
(d) Show that if $\sigma_0 > \max(0, (1-\sigma)/2)$, then
$$\pi^{-s/2} \sum_{n=1}^{\infty} n^{-s}\,\Gamma(s/2, \pi n^2 z) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \zeta(s+2w)\,\Gamma(w+s/2)\,\pi^{-w-s/2} z^{-w}\, \frac{dw}{w}.$$
(e) Suppose now that $s \ne 0$ and $s \ne 1$. Explain why the integrand has poles
at $w = 0$, $w = (1-s)/2$, $w = -s/2$, and nowhere else.
(f) Show that when the contour is pulled to the left, the pole at $w = 0$
contributes $\zeta(s)\Gamma(s/2)\pi^{-s/2}$, the pole at $w = (1-s)/2$ contributes
$z^{(s-1)/2}/(s-1)$, and the pole at $-s/2$ contributes $-z^{s/2}/s$.
(g) Suppose the contour is pulled to the left to an abscissa $\sigma_1 <
\min(0, -\sigma/2)$. By means of the identity $\zeta(s)\Gamma(s/2)\pi^{-s/2} = \zeta(1-s)
\Gamma((1-s)/2)\pi^{(s-1)/2}$ and the change of variable $w \to -w$, show
that the expression is $\pi^{(s-1)/2} \sum_{n=1}^{\infty} n^{s-1}\,\Gamma((1-s)/2, \pi n^2/z)$. Thus
demonstrate that Theorem 10.2 can be derived from Corollary 10.3.
5. Suppose that α is real, that z > 0 and that χ is a primitive character
(mod q).
(a) Show that
$$\sum_{n=-\infty}^{\infty} \chi(n)\, e^{-\pi (n+\alpha)^2 z/q} = \frac{\tau(\chi)}{q^{1/2}}\, z^{-1/2} \sum_{k=-\infty}^{\infty} \bar\chi(k)\, e(k\alpha/q)\, e^{-\pi k^2/(qz)}.$$
n=−∞ iq k=−∞
6. Let α and β be real numbers, and suppose that z > 0, and put
$$\vartheta_0(z; \alpha, \beta) = \sum_{n=-\infty}^{\infty} e(n\beta)\, e^{-\pi (n+\alpha)^2 z}.$$
(a) Show that if f (x) = e(βx)e−π (x+α) z , then *f (t) = e(−αβ)z −1/2 .
2
n=−∞ n=−∞
(b) Show that when k = 0, the above is consistent with the formula of
Theorem 9.9.
(c) For non-negative integers $k$, deduce that
$$L(-2k, \chi) = \frac{-q^{2k}}{2k+1} \sum_{a=1}^{q} \chi(a)\, B_{2k+1}(a/q).$$
16. (a) Let p1 and p2 be distinct primes. Show that (log p1 )/(log p2 ) is irra-
tional.
(b) Let χ be a character modulo q. Show that all zeros of L(s, χ ) on the
imaginary axis are simple, except possibly for zeros at the point s = 0.
(c) Let a positive integer $m$ and a primitive character $\chi^{\star}$ be given. Show
that there is a character $\chi$ induced by $\chi^{\star}$ such that $L(s, \chi)$ has a zero
at $s = 0$ of exact multiplicity $m$.
17. (Landau 1907) (a) Let $\chi$ denote the character modulo 5 such that $\chi(2) = i$.
Show that $L(1, \chi) = (-1 - 3i)\pi\tau(\chi)/25$.
(b) With $\chi$ as above, show that $L(2, \chi^2) = 4\sqrt{5}\,\pi^2/125$.
(c) Let $\chi$ be as above. By using Exercise 9.2.9, or otherwise, show that
$\tau(\chi)^2 = (-1 - 2i)\sqrt{5}$.
(d) With $\chi$ as above, show that
$$\frac{L(1, \chi)^2}{L(2, \chi^2)} = 1 + i/2.$$
(e) Let $\chi$ denote a non-principal character modulo $q$. Show that
$$\sum_{n=1}^{\infty} 2^{\omega(n)} \chi(n) n^{-s} = \frac{L(s, \chi)^2}{L(2s, \chi^2)}$$
for $\sigma > 1/2$.
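(b) By taking $\alpha(w) = \zeta(1/2 + it + w)$, and considering the residues arising
from poles at $w = 1/2 - it$ and at $w = \delta$, show that
$$\zeta(1/2 + \delta + it) = x^{-\delta} \sum_{n \le x} n^{-1/2-it}\big((x/n)^{\delta} - (n/x)^{\delta}\big) + \frac{\delta x^{-\delta}}{\pi} \int_{-\infty}^{\infty} \zeta(1/2 + it + iu)\, \frac{x^{iu}}{u^2 + \delta^2}\, du - \frac{2\delta x^{1/2-\delta-it}}{(1/2 - it - \delta)(1/2 - it + \delta)} = T_1 + T_2 + T_3,$$
say.
(c) Show that
$$T_1 \ll 1 + x^{1/2-\delta} \min\Big(\frac{1}{|\delta - 1/2|}, \log x\Big).$$
(d) Let $M(T) = \max_{0 \le t \le T} |\zeta(1/2 + it)|$. Show that
$$T_2 \ll x^{-\delta} M(2\tau)$$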
for σ > 0.
(b) Show that
$$(n+\alpha)^{-s} - n^{-s} + \alpha s n^{-s-1} = s(s+1) \int_n^{n+\alpha} (n + \alpha - u)\, u^{-s-2}\, du.$$
for σ > −1, and that the series is locally uniformly convergent in this
half-plane.
(b) Apply Theorem 10.1 to the inner sum, take the sum over $n$ inside, and
apply Theorem 10.1 a second time to show that $\vartheta_Q(z) = \vartheta_Q(1/z)/z$.
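values $Q_i(m, n)$ run over the values $N(\mathfrak{a})$ for ideals $\mathfrak{a}$ in the $i$th ideal
class $C_i$, each value being taken exactly $w$ times. Thus
$$\zeta_{Q_i}(s) = w \sum_{\mathfrak{a} \in C_i} N(\mathfrak{a})^{-s}$$
$$\zeta_K(s) = \frac{1}{w} \sum_{i=1}^{h} \zeta_{Q_i}(s).$$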
i=1 n=1
where $r(n) = r_K(n) = \sum_{k \mid n} \chi_d(k)$ is the number of ideals in $O_K$ with
norm $n$. Show that $\vartheta_K(z) = \vartheta_K(1/z)/z$.
(b) Show that if z ≥ 0, then
for our purposes. We do not quite achieve a formula of the type (10.21) for the
zeta function, but we obtain a serviceable substitute.
Lemma 10.11 Suppose that f (z) is an entire function with a zero of order K
at 0, and that f (z) vanishes at the non-zero numbers z 1 , z 2 , z 3 , . . . . Suppose
also that there is a constant θ , 1 < θ < 2, such that
$$\max_{|z| \le R} |f(z)| \le \exp(R^{\theta})$$
for all z. Here the product is uniformly convergent for z in compact sets.
Proof We may suppose that $K = 0$, since if $K > 0$ then the function $f(z)/z^{K}$
does not vanish at the origin. Let $N_f(R)$ denote the number of zeros of $f(z)$ in the
disc $|z| \le R$. By Jensen's inequality (Lemma 6.1) we find that $N_f(R) \le 8R^{\theta}$ for
all sufficiently large $R$. Thus $\sum_{R < |z_k| \le 2R} |z_k|^{-2} \le 8R^{\theta-2}$, so by summing over
dyadic blocks we see that $\sum_{k=1}^{\infty} |z_k|^{-2} < \infty$. (Alternatively, if more precision
were desired, we could write this sum as $\int_0^{\infty} r^{-2}\, dN_f(r)$, and integrate by parts.)
But $(1 - z)e^{z} = 1 + O(|z|^2)$ uniformly for $|z| \le 1$, so the product
$$g(z) = \prod_{k=1}^{\infty} \Big(1 - \frac{z}{z_k}\Big) e^{z/z_k}$$
Now
$$\sum_{k \in \mathcal{K}_1} \frac{1}{|z_k|} \ll R^{\theta-1}.$$
10.2 Products and sums over zeros 347
Thus
$$|P_1(z)| \ge e^{-cR^{\theta}}$$
for all large $R$. Since $\operatorname{card} \mathcal{K}_2 \le 72R^{\theta}$, it follows that there is an $r$, $R \le r \le 2R$,
for which $|r - |z_k|| \ge 1/R^2$ for all $k$. If $r$ is chosen in this way and $|z| = r$, then
$$|1 - z/z_k| \ge \frac{|r - |z_k||}{|z_k|} \ge \frac{1}{27R^3}$$
for all $k \in \mathcal{K}_2$. Hence
$$|P_2(z)| \ge e^{-cR^{\theta} \log R}$$
for $|z| \le 2R$. Hence we see that for each large $R$ there is an $r$, $R \le r \le 2R$, for
which $|g(z)| \ge e^{-cR^{\theta} \log R}$ when $|z| = r$. Thus $|h(z)| \le e^{cR^{\theta} \log R}$ for such $z$, and
hence by the maximum modulus principle
$$M_h(R) \le e^{cR^{\theta} \log R}.$$
Now put $j(z) = \log h(z)$ with $j(0) = 0$. Then $\Re j(z) \le cR^{\theta} \log R$ for all large
$R$, so that by the Borel–Carathéodory lemma (Lemma 6.2),
$$j(z) \ll R^{\theta} \log R$$
for all large $R$. But $\theta < 2$, so $j(z)$ must be a polynomial of degree at most 1,
say $j(z) = A + Bz$, and the proof is complete.
In order to apply our lemma to $\xi(s)$ we need an upper bound for $|\xi(s)|$. From
Corollary 1.17 we see that $\zeta(s) \ll |s|^{1/2}$ when $\sigma \ge 1/2$ and $|s| \ge 2$. Thus by
Stirling's formula (Theorem C.1) it follows that
$$\xi(s) \ll \exp(|s| \log |s|) \tag{10.22}$$
when $\sigma \ge 1/2$ and $|s| \ge 2$. In view of the functional equation found in Corollary
10.3, this same upper bound therefore holds for all $s$ with $|s| \ge 2$. Since
$$\xi(s) = (s - 1)\zeta(s)\Gamma(1 + s/2)\pi^{-s/2}, \tag{10.23}$$
it follows from (10.11) that $\xi(0) = 1/2$. Thus by Lemma 10.11 we obtain
Theorem 10.12 Let $\xi(s)$ be defined as in Corollary 10.3. There is a constant
$B$ such that
$$\xi(s) = \frac12 e^{Bs} \prod_{\rho} \Big(1 - \frac{s}{\rho}\Big) e^{s/\rho} \tag{10.24}$$
for all $s$. Here the product is extended over all zeros $\rho$ of $\xi(s)$.
All known zeros of the zeta function are simple, and it is plausible to conjec-
ture that they all are. In the (unlikely) event that a multiple zero is encountered,
the associated factor in the above product is to be repeated as many times as
the multiplicity.
Thus far we have remarked upon the zeros of ξ (s) without having proved
that they exist. However, from (10.24) we see that if $\xi(s)$ had at most finitely
many zeros then there would be a constant $C > 0$ such that $\xi(s) \ll \exp(C|s|)$
for all large $s$. On the contrary, by Stirling's formula we find that $\xi(\sigma) =
\exp\big(\tfrac12 \sigma \log \sigma + O(\sigma)\big)$ as $\sigma \to \infty$, so it is evident that $\xi(s)$ has infinitely many
zeros. Concerning the density of the zeros, the following estimate is useful.
$$N(T + 1) - N(T) \ll \log(T + 2).$$
Proof We apply Jensen's inequality (Lemma 6.1) to $\xi(s)$, on a disc with centre
$2 + i(T + 1/2)$ and radius $R = 11/6$. By taking $r = 7/4$, it follows from the
estimates of Corollary 1.17 that the number of zeros $\rho$ in the rectangle $1/2 \le
\beta \le 1$, $T \le \gamma \le T + 1$ is $\ll \log(T + 2)$. (Alternatively, we could appeal to
Theorem 6.8.) But ρ is a zero if and only if 1 − ρ is a zero, so the rectangle
0 ≤ β ≤ 1/2, T ≤ γ ≤ T + 1 contains the same number of zeros as the former
one. Thus we have the result.
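and
$$\frac{\zeta'}{\zeta}(s) = B + \frac12 \log \pi - \frac{1}{s-1} - \frac12 \frac{\Gamma'}{\Gamma}(s/2 + 1) + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big). \tag{10.29}$$
Moreover,
$$B = -\frac12 \sum_{\rho} \Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big) = -\sum_{\rho} \Re\frac{1}{\rho} = -\frac{C_0}{2} - 1 + \frac12 \log 4\pi = -0.0230957\ldots. \tag{10.30}$$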
In the above, it is to be understood that if ξ (s) has a multiple zero ρ, then the
summand arising from ρ is to be repeated as many times as the multiplicity.
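The numerical value in (10.30) can be reproduced from the closed form. A minimal Python sketch follows; the function name is an illustrative assumption, and $C_0$ is taken to be Euler's constant.

```python
import math

# C_0 in (10.30), assumed here to denote Euler's constant.
EULER_CONSTANT = 0.5772156649015329


def hadamard_constant_B():
    """Evaluate -C_0/2 - 1 + (1/2) log(4 pi), the closed form in (10.30)."""
    return -EULER_CONSTANT / 2 - 1 + 0.5 * math.log(4 * math.pi)


if __name__ == "__main__":
    print(hadamard_constant_B())
```

The smallness of $|B|$ reflects the fact that $\sum_\rho \Re(1/\rho)$ is dominated by high zeros, each of which contributes very little.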
Proof The second identity follows from the first by means of (10.25). As for
(10.30), we observe first by taking $s = 0$ in (10.28) that $B = \frac{\xi'}{\xi}(0)$. Also, by
taking $s = 1$ in (10.28) we find that $\frac{\xi'}{\xi}(1) = B + \sum_{\rho} (1/(1 - \rho) + 1/\rho)$. By
(10.26), this is $-B$, so we obtain the first identity in (10.30). Since $B$ is real,
we may write
$$B = -\frac12 \sum_{\rho} \Re\Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big).$$
However, $\sum_{\rho} \Re\, 1/(1 - \rho)$ and $\sum_{\rho} \Re\, 1/\rho$ are absolutely convergent, so these
two sums may be written separately, above. Since $1 - \rho$ runs over zeros of
the zeta function as $\rho$ does, the two sums are equal, and we obtain the second
identity in (10.30). By logarithmically differentiating the fundamental identity
$s\Gamma(s) = \Gamma(s + 1)$ we see that $1/s + \frac{\Gamma'}{\Gamma}(s) = \frac{\Gamma'}{\Gamma}(s + 1)$. Hence (10.25) may be
rewritten as
$$\frac{\xi'}{\xi}(s) = \frac{1}{s-1} + \frac{\zeta'}{\zeta}(s) + \frac12 \frac{\Gamma'}{\Gamma}(s/2 + 1) - \frac12 \log \pi.$$
We obtain the third identity in (10.30) by taking s = 0 in the above, in view of
(10.11), (10.14), and (C.12).
This is analogous to Theorem 1.12. To estimate the sum we use (1.29). For the
remaining terms we use the trivial estimate $S(u, \chi) \ll q$. The stated estimate
then follows by taking x = qτ .
for all s. Here the product is extended over all zeros ρ of ξ (s, χ).
We expect that the zeros of ξ (s, χ ) are all simple, but if a multiple zero is
encountered, then the factor that it contributes to the above product is to be
repeated as many times as its multiplicity. In analogy to Theorem 10.13, we
have
Here each factor in the product has zeros forming an arithmetic progression
on the imaginary axis with common difference $2\pi i/\log p$. Thus $L(s, \chi)$ has
$\ll \log r(|T| + 2)$ zeros of $L(s, \chi^{\star})$, and additionally has $\ll \sum_{p \mid q} \log p \le \log q$
zeros on the imaginary axis with imaginary part between $T$ and $T + 1$. This
completes the argument.
we see that
$$\frac{\xi'}{\xi}(s, \chi) = -\frac{\xi'}{\xi}(1 - s, \bar\chi). \tag{10.34}$$
By logarithmically differentiating the asymmetric form of the functional equation
found in Corollary 10.9, we discover that
$$\frac{L'}{L}(s, \chi) = -\frac{L'}{L}(1 - s, \bar\chi) - \log\frac{q}{2\pi} - \frac{\Gamma'}{\Gamma}(1 - s) + \frac{\pi}{2} \cot\frac{\pi}{2}(s + \kappa) \tag{10.35}$$
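and
$$\frac{L'}{L}(s, \chi) = B(\chi) - \frac12 \frac{\Gamma'}{\Gamma}((s + \kappa)/2) - \frac12 \log\frac{q}{\pi} + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big). \tag{10.37}$$
Moreover,
$$\Re B(\chi) = -\frac12 \sum_{\rho} \Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big) = -\sum_{\rho} \Re\frac{1}{\rho} \tag{10.38}$$
and
$$B(\chi) = -\frac12 \log\frac{q}{\pi} - \frac{L'}{L}(1, \bar\chi) + \frac12 C_0 + (1 - \kappa) \log 2. \tag{10.39}$$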
As always, multiple zeros are counted multiply.
Proof The second identity follows from the first by means of (10.33). To
obtain the first identity in (10.38), we take s = 1 in (10.36), and apply (10.34)
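to see that
$$B(\chi) + \sum_{\rho} \Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big) = \frac{\xi'}{\xi}(1, \chi) = -\frac{\xi'}{\xi}(0, \bar\chi) = -B(\bar\chi) = -\overline{B(\chi)}.$$
From Theorem 10.17 we know that the number of zeros $\rho$ of $\xi(s, \chi)$ with $|\rho| \le
R$ is $\ll R \log qR$ for $R \ge 2$. Hence the sums $\sum_{\rho} \Re\, 1/(1 - \rho)$ and $\sum_{\rho} \Re\, 1/\rho$
are absolutely convergent. As the map $\rho \mapsto 1 - \bar\rho$ merely permutes zeros of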
10.2.1 Exercises
1. Let $f$ satisfy the hypotheses of Lemma 10.11, and suppose that
$$\sum_{k=1}^{\infty} \frac{1}{|z_k|} < \infty.$$
(a) Show that there are numbers $A$ and $B$ and a non-negative integer $K$ such
that $f(z) = z^{K} e^{A+Bz} g(z)$ where $g(z) = \prod_{k=1}^{\infty} (1 - z/z_k)$.
(b) Observe that for any complex number $w$, $|1 - w| \le e^{|w|}$ and show that
there is a number $C$ such that $|g(z)| \le e^{C|z|}$.
(c) Deduce that $\sum_{\rho} 1/|\rho| = \infty$ where the sum is over all non-trivial zeros
of the zeta function.
2. (a) Let B be the constant given in (10.30). Show that if ρ = 1/2 + iγ is a
zero of the zeta function on the critical line, then
for σ ≥ 2, where the sum is over all non-trivial zeros of the zeta
function.
for σ ≥ 2.
(c) Show that each summand above is ≤ 1/(σ − 1).
(d) Show that if |γ | ≥ 3σ and σ is large, then the summand arising from ρ
in the sum above is ≤ 0.
(e) Conclude that $N(T) \gg T \log T$ when $T$ is large.
5. Put $f(s) = \dfrac{1}{s+1} - \dfrac{3/4}{s+2}$.
(a) Show that if $t \ge 2$, then
$$\sum_{\rho} \Re f(1 + it - \rho) = \frac18 \log t + O(1)$$
(c) Assume RH. Show that if c is fixed, c > 0, then all zeros of ξ (s + c) +
ξ (s − c) have real part 1/2.
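8. (Vorhauer 2006) Let $B(\chi)$ denote the constant in Theorem 10.16.
(a) Show that
$$\frac{1-\beta}{(1-\beta)^2 + \gamma^2} + \frac{\beta}{\beta^2 + \gamma^2} \ge \frac{1}{1 + \gamma^2}$$
uniformly for $0 \le \beta \le 1$.
(b) Deduce that
$$\Re B(\chi) \le -\frac12 \sum_{\gamma} \frac{1}{1 + \gamma^2}.$$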
(b) Deduce that the power series coefficients of $E'(z)$ are all $\le 0$.
(c) Write $E(z) = \sum_{m=0}^{\infty} A_m z^m$. Show that $A_0 = 1$, $A_m = 0$ for $1 \le m \le K$,
$A_m < 0$ for $m > K$, and that $\sum_{m > K} A_m = -1$.
(d) Show that if $|z| \le r \le 1$, then $|1 - E(z)| \le 1 - E(r) \le r^{K+1}$.
10.3 Notes
Section 10.1. The case α = 0 of (10.1) was given by Poisson (1823). de la Vallée
Poussin observed that the left-hand side of (10.1) has period 1 with respect to
α, and then computed the Fourier coefficients of this function to obtain (10.1).
This is rather similar to using the Poisson summation formula, as we have done.
Theorem 10.1 is the basis for a very large class of functional equations and was
first exploited systematically by Hecke. For the most general version see Tate’s
thesis, reproduced in Tate (1967). Riemann gave two proofs of Corollary 10.3.
Riemann’s second method involved using Theorem 10.1 to establish the formula
of Exercise 10.1.10. This is the case z = 1 of Theorem 10.2, with the order of
summation and integration reversed. Theorem 10.2 is due to Lavrik (1965),
who derived it from Corollary 10.3 in the manner outlined in Exercise 10.1.4.
For further proofs of the functional equation, see Titchmarsh (1986, Chapter 2).
The proof of Theorem 10.1 can be arranged so that one does not depend on
the fact that $\int_{-\infty}^{\infty} e^{-\pi x^2}\, dx = 1$. To see this, let $c$ denote the value of this integral.
Then the proof given establishes (10.1) with the factor c on the right-hand side.
But if z = 1 and α = 0 the two sides of (10.1) are visibly equal and positive,
so it follows that c = 1.
The functional equation for ζ (s) was established by Riemann (1860), and
that for L(s, χ) by de la Vallée Poussin (1896) although it was known in some
special cases earlier. See the commentary of Landau (1909, p. 899).
Section 10.2 The product formula of Theorem 10.12 was established by
Hadamard (1893). The constant B(χ ) in Theorem 10.16 was long considered
to be mysterious; the simple formula (10.39) for it is due to Vorhauer (2006).
10.4 References
Backlund, R. J. (1918). Über die Beziehung zwischen Anwachsen und Nullstellen der
Zetafunktion, Öfv. af finska vet. soc. förh. 61A, Nr. 9.
Berndt, B. C. (1985). The gamma function and the Hurwitz zeta-function, Amer. Math.
Monthly 92, 126–130.
Hadamard, J. (1893). Étude sur les propriétés des fonctions entières et en particulier
d’une fonction considérée par Riemann, J. Math. Pures Appl. (4) 9, 171–215.
Heilbronn, H. (1938). On Dirichlet series which satisfy a certain functional equation,
Quart J. Math. Oxford Ser. 9, 194–195.
Landau, E. (1903). Über die zahlentheoretische Funktion µ(k), Sitzungsber. Kais. Akad.
Wiss. Wien 112, 537–570; Collected Works, Vol. 2. Essen: Thales Verlag, 1986,
pp. 60–93.
(1907). Bemerkungen zu einer Arbeit des Herrn V. Furlan, Rend. Circ. Mat. Palermo
23, 367–373; Collected Works, Vol. 3. Essen: Thales Verlag, 1986, pp. 316–322.
(1909). Handbuch der Lehre von der Verteilung der Primzahlen, Third edition. New
York: Chelsea, 1974.
Lavrik, A. F. (1965). The abbreviated functional equation for the L-function of Dirichlet,
Izv. Akad. Nauk UzSSR Ser. Fiz.-Mat. Nauk 9, 17–22.
Lerch, M. (1894). Weitere Studien auf dem Gebiete der Malmstén’schen Reihen. Mit
einem Briefe des Herrn Hermite, Rozpravy 3, No. 28, 63 pp.
Littlewood, J. E. (1924). On the zeros of the Riemann Zeta-function, Cambridge Philos.
Soc. Proc. 22, 295–318.
Mallik, A. (1977). If L( 12 , χ) > 0, then L 12 , χ cannot be a minimum, Studia Sci.
Math. Hungar. 12, 445–446.
Poisson, S. D. (1823). Suite de mémoire sur les intégrales définies et sur la sommation
des séries, l’École Royale, J. Polytechnique 12, 404–509.
Riemann, B. (1860). Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse,
Monatsberichte der Königlichen Preussichen Akademie der Wissenschaften zu
Berlin aus dem Jahre 1859, 671–680; Werke. Leipzig: Teubner, 1876, pp. 3–47.
Reprint: New York: Dover, 1953.
Tate, J. T. (1967). Fourier analysis in number fields, and Hecke’s zeta-functions, Alge-
braic Number Theory (Brighton, 1965). Washington: Thompson, pp. 305–347.
Taylor, P. R. (1945). On the Riemann zeta function, Quart. J. Math. Oxford Ser. 16,
1–21.
Titchmarsh, E. C. (1986). The Theory of the Riemann Zeta-function, Second Edition.
Oxford: Oxford University Press.
de la Vallée Poussin, C. (1896). Recherches analytique sur la théorie des nombres pre-
miers. Deuxième partie: Les fonctions de Dirichlet et les nombres premiers de
la forme linéaire M x + N , Annales de la Société scientifique de Bruxelles, 20,
281–342.
Vorhauer, U. M. A. (2006). The Hadamard product formula for Dirichlet L-functions,
to appear.
Walfisz, A. (1931). Teilerprobleme, II, Math. Z. 34, 448–472.
Weil, A. (1967). Über die Bestimmung Dirichletscher Reihen durch Funktionalgleichungen,
Math. Ann. 168, 149–156.
11
Primes in arithmetic progressions: II
11.1 A zero-free region 359
The quantity $\chi(n)n^{-it}$ is unimodular when $(n, q) = 1$, so for such $n$ there is a
real number $\theta_n$ such that $\chi(n)n^{-it} = e^{i\theta_n}$. Thus the above is
$$\sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} \frac{\Lambda(n)}{n^{\sigma}} (3 + 4\cos\theta_n + \cos 2\theta_n).$$
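The sum above is non-negative term by term, since $\Lambda(n) \ge 0$ and the trigonometric factor is a perfect square. As a quick check of that elementary identity (added here only for illustration):

```latex
\[
3 + 4\cos\theta + \cos 2\theta
  = 3 + 4\cos\theta + (2\cos^{2}\theta - 1)
  = 2(1 + \cos\theta)^{2} \;\ge\; 0 .
\]
```

Equality holds only when $\cos\theta = -1$, which is why the coefficients 3, 4, 1 lose nothing at the extreme case.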
Rq = {s : σ > 1 − c/ log qτ }
three. By Lemma 11.2, the resulting left-hand side is non-negative. That is,
$$\frac{3}{\delta} - \frac{4}{1 + \delta - \beta_0} + c_2 \log q(|\gamma_0| + 4) \ge 0$$
for some constant $c_2$. If $\beta_0 = 1$, then letting $\delta \to 0^{+}$ gives an immediate
contradiction, so it may be assumed that $\beta_0 < 1$. Then, on taking $\delta = 6(1 - \beta_0)$, it
follows that
$$1 - \beta_0 \ge \frac{1}{14 c_2 \log q(|\gamma_0| + 4)}.$$
Hence $\rho_0 \notin R_q$ if $c$ is chosen sufficiently small.
This argument also applies with only small changes when χ is quadratic,
provided that |γ0 | is large. We can even allow |γ0 | to be small, as long as it is
large compared with 1 − β0 . We now consider such a case.
Case 2. Quadratic $\chi$, $|\gamma_0| \ge 6(1 - \beta_0)$. By Theorem 4.9, $L(1, \chi) \ne 0$, so $\gamma_0 \ne
0$. Hence we can proceed as above, except that as $\chi^2 = \chi_0$ the third inequality
in (11.2) must be replaced by the weaker inequality
$$-\Re\frac{L'}{L}(1 + \delta + 2i\gamma_0, \chi^2) \le \frac{\delta}{\delta^2 + 4\gamma_0^2} + c_1 \log q(2|\gamma_0| + 4).$$
Again if $\beta_0 = 1$, then taking $\delta \to 0^{+}$ gives a contradiction. Thus it can be
supposed that $\beta_0 < 1$. Since $|\gamma_0| \ge 6(1 - \beta_0)$, this implies that
$$-\Re\frac{L'}{L}(1 + \delta + 2i\gamma_0, \chi^2) \le \frac{\delta}{\delta^2 + 144(1 - \beta_0)^2} + c_1 \log q(2|\gamma_0| + 4).$$
We combine this inequality with the first two inequalities in (11.2) and apply
Lemma 11.2 with $\sigma = 1 + \delta = 1 + 6(1 - \beta_0)$ to see that
$$\frac{1}{1 - \beta_0} \Big(\frac{3}{6} - \frac{4}{7} + \frac{6}{180}\Big) + c_2 \log q(|\gamma_0| + 4) \ge 0.$$
The factor in large parentheses above is $-4/105 < -1/27$, so
$$1 - \beta_0 \ge \frac{1}{27 c_2 \log q(|\gamma_0| + 4)}.$$
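The constants 14 and 27 in Cases 1 and 2 come from exact rational arithmetic after substituting $\delta = 6(1 - \beta_0)$. A minimal Python sketch of that bookkeeping follows; the helper name `bracket` and its parameters are illustrative assumptions.

```python
from fractions import Fraction


def bracket(ratio, pole_term=Fraction(0)):
    """Coefficient of 1/(1 - beta0) when delta = ratio * (1 - beta0).

    3/delta - 4/(1 + delta - beta0) contributes 3/ratio - 4/(ratio + 1);
    pole_term is the extra coefficient that appears only in Case 2.
    """
    return Fraction(3) / ratio - Fraction(4) / (ratio + 1) + pole_term


# Case 1: only the first two terms carry a factor 1/(1 - beta0).
case1 = bracket(Fraction(6))

# Case 2: delta / (delta^2 + 144 (1 - beta0)^2) with delta = 6 (1 - beta0).
case2 = bracket(Fraction(6), Fraction(6, 6 ** 2 + 144))

if __name__ == "__main__":
    print(case1, case2)
```

Case 1 gives exactly $-1/14$, and Case 2 gives $-4/105$, which is slightly smaller than $-1/27$; the text rounds to 27 for convenience.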
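Case 3. Quadratic $\chi$, $0 < |\gamma_0| \le 6(1 - \beta_0)$. Since $L(s, \chi)$ is real when $s$ is
real, it follows by the Schwarz reflection principle that $L(\beta_0 - i\gamma_0, \chi) = 0$.
Hence by Lemma 11.1 we see that if $1 < \sigma \le 2$, then
$$\begin{aligned} -\frac{L'}{L}(\sigma, \chi) &\le -\frac{1}{\sigma - \rho_0} - \frac{1}{\sigma - \bar\rho_0} + c_1 \log 4q \\ &= \frac{-2(\sigma - \beta_0)}{(\sigma - \beta_0)^2 + \gamma_0^2} + c_1 \log 4q \\ &\le \frac{-2(\sigma - \beta_0)}{(\sigma - \beta_0)^2 + 36(1 - \beta_0)^2} + c_1 \log 4q. \end{aligned} \tag{11.3}$$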
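Rather than apply Lemma 11.2 we simply observe that if $\sigma > 1$, then
$$-\frac{L'}{L}(\sigma, \chi_0) - \frac{L'}{L}(\sigma, \chi) = \sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} \frac{\Lambda(n)}{n^{\sigma}} (1 + \chi(n)) \ge 0. \tag{11.4}$$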
In the same way that Theorem 6.7 was derived from Theorem 6.6, we now
derive estimates for $\frac{L'}{L}(s, \chi)$ and $\log L(s, \chi)$ in a portion of the critical strip.
$1/\log q$, then
$$\frac{L'}{L}(s, \chi) \ll \log q\tau, \tag{11.5}$$
$$|\log L(s, \chi)| \le \log\log q\tau + O(1), \tag{11.6}$$
and
$$\frac{1}{L(s, \chi)} \ll \log q\tau. \tag{11.7}$$
Alternatively, if $\beta_1$ is an exceptional zero of $L(s, \chi)$ and $|s - \beta_1| \le 1/\log q$,
then
$$\frac{L'}{L}(s, \chi) = \frac{1}{s - \beta_1} + O(\log q) \qquad (s \ne \beta_1), \tag{11.8}$$
$$|\arg L(s, \chi)| \le \log\log q + O(1) \qquad (s \ne \beta_1), \tag{11.9}$$
and
where the sum is over those zeros of $L(s, \chi)$ for which $|\rho - (3/2 + it)| \le 5/6$.
Hence
$$\sum_{\rho} \frac{1}{s - \rho} = \sum_{\rho} \Big(\frac{1}{s - \rho} - \frac{1}{s_1 - \rho}\Big) + O(\log q\tau). \tag{11.12}$$
Our methods yield not only a zero-free region, but also enable us to bound
the number of zeros ρ of L(s, χ ) that might lie near 1 + it.
Theorem 11.5 Let $n(r; t, \chi)$ denote the number of zeros $\rho$ of $L(s, \chi)$ in the
disc $|\rho - (1 + it)| \le r$. Then $n(r; t, \chi) \ll r \log q\tau$ uniformly for $1/\log q\tau \le
r \le 3/4$.
where the sum is over all zeros $\rho$ such that $|\rho - (3/2 + it)| \le 5/6$. By
Lemma 11.1 we see that the above is $\ll \log q\tau$, since
$$\Big|\frac{L'}{L}(s_1)\Big| \le -\frac{\zeta'}{\zeta}(1 + r) \ll \frac{1}{r} \ll \log q\tau.$$
If 1/3 ≤ r ≤ 3/4, then it suffices to apply Jensen’s inequality to L(s, χ) on a
disc with centre 3/2 + it, with R = 4/3 and r = 5/4, in view of the estimates
provided by Lemma 10.15.
11.1.1 Exercises
1. Let S(x; q) denote the number of integers n, 0 < n ≤ x, such that (n, q) = 1,
and put R(x; q) = S(x; q) − (ϕ(q)/q)x.
(a) Show that if $\sigma > 0$, $x > 0$, and $s \ne 1$, then
$$L(s, \chi_0) = \sum_{n \le x} \chi_0(n) n^{-s} + \frac{\varphi(q)}{q} \cdot \frac{x^{1-s}}{s - 1} - \frac{R(x; q)}{x^{s}} + s \int_x^{\infty} R(u; q) u^{-s-1}\, du.$$
uniformly for σ ≥ δ. (This improves on the estimate used in the latter part
of the proof of Lemma 11.1.)
3. (a) Show that if $\sigma > 0$, then
$$\zeta(s) = \frac{1}{s - 1} + \frac12 - s \int_1^{\infty} (\{x\} - 1/2)\, x^{-s-1}\, dx.$$
for σ > 0.
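(e) Show that if $\sigma > 0$, then
$$\Big|\zeta'(\sigma) + \frac{1}{(\sigma - 1)^2}\Big| < \frac12 \int_1^{\infty} |1 - \sigma \log x|\, x^{-\sigma-1}\, dx = \frac{1}{e\sigma}.$$
(f) Justify the following chain of inequalities for $\sigma > 1$:
$$-\frac{\zeta'}{\zeta}(\sigma) < \frac{\frac{1}{(\sigma-1)^2} + \frac{1}{e\sigma}}{\frac{1}{\sigma-1} + \frac12} = \frac{1}{\sigma - 1} \cdot \frac{1 + \frac{(\sigma-1)^2}{e\sigma}}{1 + \frac{\sigma-1}{2}} < \frac{1}{\sigma - 1}.$$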
for σ > 1, where the product is over all prime ideals p in the ring.
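(b) Let $\Lambda(\mathfrak{a}) = \log(a^2 + b^2)$ if $\mathfrak{a} = (a + ib)^k$ for some positive integer
$k$ and $a + ib$ is a Gaussian prime, and $\Lambda(\mathfrak{a}) = 0$ otherwise. Show
that
$$\frac{L'}{L}(s, \chi_m) = -\sum_{\mathfrak{a}} \frac{\Lambda(\mathfrak{a}) \chi_m(\mathfrak{a})}{N(\mathfrak{a})^{s}}$$
for $\sigma > 1$.
(c) Show that there is an absolute constant $c > 0$ such that $L(s, \chi_m) \ne 0$ for
$\sigma > 1 - c/\log m\tau$ for every positive integer $m$.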
for all n.
Proof Since any given L-function can have at most one such zero, if there
are two zeros, then one of them, say β1 , is a zero of L(s, χ1 ), and the other,
β2 , is a zero of L(s, χ2 ). We may assume that c is so small that 5/6 ≤ βi < 1.
Also, we note that χ1 χ2 is a non-principal character (mod q1 q2 ). Hence by four
applications of Lemma 11.1 we see that if 0 < δ ≤ 1, then
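$$-\frac{\zeta'}{\zeta}(1 + \delta) \le \frac{1}{\delta} + c_1 \log 4,$$
$$-\frac{L'}{L}(1 + \delta, \chi_i) \le \frac{-1}{1 + \delta - \beta_i} + c_1 \log q_i,$$
$$-\frac{L'}{L}(1 + \delta, \chi_1\chi_2) \le c_1 \log q_1 q_2.$$
We sum these inequalities and apply Lemma 11.4 to see that
$$\frac{1}{\delta} - \frac{1}{1 + \delta - \beta_1} - \frac{1}{1 + \delta - \beta_2} + c_2 \log q_1 q_2 \ge 0.$$
Without loss of generality we may suppose that $\beta_1 \le \beta_2$. Then
$$\frac{1}{\delta} - \frac{2}{1 + \delta - \beta_1} + c_2 \log q_1 q_2 \ge 0,$$
and by taking $\delta = 2(1 - \beta_1)$ we deduce that
$$1 - \beta_1 \ge \frac{1}{6 c_2 \log q_1 q_2}.$$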
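The final step is a one-line substitution, written out here for illustration. With $\delta = 2(1 - \beta_1)$ we have $1 + \delta - \beta_1 = 3(1 - \beta_1)$, so

```latex
\[
\frac{1}{\delta} - \frac{2}{1+\delta-\beta_1}
  = \frac{1}{1-\beta_1}\Big(\frac12 - \frac23\Big)
  = -\frac{1}{6(1-\beta_1)},
\qquad\text{hence}\qquad
\frac{1}{6(1-\beta_1)} \le c_2 \log q_1 q_2 .
\]
```

Any fixed multiple $\delta = \lambda(1 - \beta_1)$ with $0 < \lambda < 1/(2-1)\cdot 1$, that is $1/\lambda < 2/(1+\lambda)$, would serve; $\lambda = 2$ is not optimal but keeps the constants simple. (The condition $1/\lambda < 2/(1+\lambda)$ is equivalent to $\lambda > 1$.)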
Corollary 11.9 (Landau) For each positive number A there is a c(A) > 0
such that if {qi } is a strictly increasing sequence of natural numbers with the
property that for each qi there is a primitive quadratic character χi (mod qi )
for which $L(s, \chi_i)$ has a zero $\beta_i$ satisfying
$$\beta_i > 1 - \frac{c(A)}{\log q_i},$$
then
$$q_{i+1} > q_i^{A}.$$
11.2 Exceptional zeros 369
Corollary 11.10 (Page) There is a constant $c > 0$ such that for every $Q \ge 1$
the region $\sigma \ge 1 - c/\log Q\tau$ contains at most one zero of the function
$$\prod_{q \le Q}\ {\prod_{\chi}}^{*} L(s, \chi)$$
where $\prod^{*}_{\chi}$ denotes a product over all primitive characters $\chi$ (mod $q$). If such
a zero exists, then it is necessarily real and the associated character $\chi$ is
quadratic.
We now turn to the problem of showing that even an exceptional zero cannot
be too close to 1. By taking s = 1 in (11.10) we see that this is equivalent
to showing that L(1, χ) cannot be too small. Suppose that χ is a primitive
quadratic character modulo $q$, and let $r(n) = \sum_{d \mid n} \chi(d)$. Then $r(n) \ge 0$ for all
$n$ and $r(n) \ge 1$ when $n$ is a perfect square. Since $\sum_{n=1}^{\infty} r(n) n^{-s} = \zeta(s) L(s, \chi)$
for $\sigma > 1$, we find that
$$\sum_{n \le x} r(n) n^{-s} = \frac{L(1, \chi) x^{1-s}}{1 - s} + \zeta(s) L(s, \chi) + \text{error terms}. \tag{11.14}$$
Here the error terms are small if $x$ is sufficiently large in terms of $q$. Estimates of
this kind can be derived from Corollary 1.15 by the method of the hyperbola, or
else by employing an inverse Mellin transform. Suppose that $0 \le s < 1$ in the
above. We can give a lower bound for the left-hand side, which yields a lower
bound for $L(1, \chi)$ if the second term on the right-hand side does not interfere.
Since $\zeta(s) < 0$ for $0 < s < 1$ (cf. Corollary 1.14), this term is harmless if
$L(s, \chi) \ge 0$. If this cannot be arranged, we may alternatively eliminate this
term by taking two values of $x$ and differencing. Since the method of the
hyperbola leads to tedious details, we use an inverse Mellin transform to derive
a more precise version of (11.14). To make the estimates easier we introduce
an Abelian weighting of the sum. By (5.23) with $x$ replaced by $1/x$ we see that
$$\sum_{n=1}^{\infty} r(n) e^{-n/x} = \frac{1}{2\pi i} \int_{2 - i\infty}^{2 + i\infty} \zeta(s) L(s, \chi) \Gamma(s) x^{s}\, ds.$$
We move the contour of integration to the line $\Re s = -1/2$, which gives rise to
residues at the poles at $s = 1$ and $s = 0$. Thus the above is
$$= L(1, \chi) x + \zeta(0) L(0, \chi) + \frac{1}{2\pi i} \int_{-1/2 - i\infty}^{-1/2 + i\infty} \zeta(s) L(s, \chi) \Gamma(s) x^{s}\, ds.$$
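The positivity properties of $r(n) = \sum_{d \mid n} \chi(d)$ that drive this argument can be observed on a small example. The sketch below is only an illustration: it assumes the primitive quadratic character modulo 4 as the sample $\chi$, and the function names are hypothetical.

```python
def chi4(n):
    """Primitive quadratic character modulo 4 (an illustrative choice of chi)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def r(n, chi=chi4):
    """r(n) = sum of chi(d) over the positive divisors d of n."""
    return sum(chi(d) for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    print([r(n) for n in range(1, 26)])
```

Since $r$ is multiplicative and $r(p^k) = 1 + \chi(p) + \cdots + \chi(p)^k$ is non-negative for a real character, the same check passes for any quadratic $\chi$, not only the one assumed here.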
for σ > 1, that r (1) = 1, and that r (n) ≥ 0 for all n. If there is a σ ∈ [19/20, 1)
such that $f(\sigma) \ge 0$, then
$$f(1) \ge \frac14 (1 - \sigma) M^{-3(1-\sigma)}.$$
To put this in perspective, we recall that our proof in Chapter 4 that
$L(1, \chi) \ne 0$ depended on Landau's theorem (Theorem 1.7). The above amounts
to a quantitative elaboration of Landau’s theorem, for if f (1) were 0, then F(s)
would be analytic for s > 1/2, so by Landau’s theorem the Dirichlet series
would converge when σ > 1/2. This would imply that F(σ ) > 0 for σ > 1/2.
But ζ (σ ) < 0 for 1/2 < σ < 1 (cf. Corollary 1.14), so it would follow that
for |s − 2| < 1. But the left-hand side is analytic for |s − 2| ≤ 3/2, so the series
converges in this larger disc. In order to estimate the coefficients on the right-
hand side we bound the left-hand side when s lies on the circle |s − 2| = 3/2.
To this end, we note by (1.24) that
$$|\zeta(s)| = \Big|1 + \frac{1}{s - 1} + s \int_1^{\infty} \frac{[u] - u}{u^{s+1}}\, du\Big| \le 1 + \frac{1}{|s - 1|} + \frac{|s|}{\sigma}.$$
The relation $|s - 2| = 3/2$ implies that $|s - 1| \ge 1/2$, that $|s| \le 7/2$, and that
$\sigma \ge 1/2$. Hence $|\zeta(s)| \le 10$ for the $s$ under consideration. Since $|f(1)/(s -
1)| \le 2M$, it follows that the left-hand side of (11.16) has modulus $\le 12M$
for $|s - 2| \le 3/2$. By the Cauchy coefficient inequalities we deduce that $|b_k -
f(1)| \le 12M(2/3)^k$. We apply this bound for all $k > K$ where $K$ is a parameter
to be chosen later. Thus from (11.16) we see that if $1/2 < \sigma \le 2$, then
$$\zeta(\sigma) f(\sigma) - \frac{f(1)}{\sigma - 1} \ge \sum_{k=0}^{K} (b_k - f(1))(2 - \sigma)^k - 12M \sum_{k > K} \Big(\frac23 (2 - \sigma)\Big)^{k}.$$
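The three geometric facts about the circle $|s - 2| = 3/2$, and the resulting bound $|\zeta(s)| \le 10$, can be confirmed by sampling. A minimal Python sketch follows; the function names and the sample count are illustrative assumptions, and only the elementary majorant $1 + 1/|s-1| + |s|/\sigma$ is evaluated, not $\zeta$ itself.

```python
import cmath
import math


def circle_points(samples=3600):
    """Equally spaced points on the circle |s - 2| = 3/2."""
    return [2 + 1.5 * cmath.exp(2j * math.pi * k / samples)
            for k in range(samples)]


def zeta_majorant(s):
    """The majorant 1 + 1/|s - 1| + |s|/sigma for |zeta(s)|, sigma = Re s > 0."""
    return 1 + 1 / abs(s - 1) + abs(s) / s.real


if __name__ == "__main__":
    pts = circle_points()
    print(min(abs(s - 1) for s in pts),
          max(abs(s) for s in pts),
          min(s.real for s in pts),
          max(zeta_majorant(s) for s in pts))
```

The text bounds the three quantities separately, giving $1 + 2 + 7 = 10$; on the circle itself the majorant is considerably smaller, so the constant 12 in the Cauchy estimate has room to spare.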
Here the first factor is $< 13/7$. Since $\log(1 + \delta) \le \delta$ for any $\delta \ge 0$, on taking
$\delta = 1 - \sigma$ we see that $\log(2 - \sigma) \le 1 - \sigma$. Also, $\log 10/7 > 1/3$ and it can
certainly be supposed that $M \ge 1$, so the expression above is $< (13/7) M^{3(1-\sigma)}$.
This with (11.17) gives the desired lower bound for $f(1)$.
Theorem 11.14 (Siegel) For each positive number ε there is a positive con-
stant C(ε) such that if χ is a quadratic character modulo q, then
Proof We assume, as we may, that ε ≤ 1/5. For the present we restrict our
attention to primitive characters. We consider two cases, according to whether
there exists a primitive quadratic character χ1 such that L(s, χ1 ) has a real zero
β1 in the interval [1 − ε/4, 1), or not. Suppose first that there is no such zero.
We take $f(s) = L(s, \chi)$, $\sigma = 1 - \varepsilon/4$. Then $f(\sigma) > 0$ and by Lemma 10.15
we may take $M \ll q^{1/2}$. Hence by Lemma 11.13, $f(1) \gg \varepsilon q^{-3\varepsilon/8}$. Thus there
is a constant $C_1(\varepsilon) > 0$ such that $L(1, \chi) \ge C_1(\varepsilon) q^{-\varepsilon}$.
Now consider the contrary case, in which there is a primitive quadratic character
$\chi_1$ modulo $q_1$ such that $L(s, \chi_1)$ has a real zero $\beta_1 \ge 1 - \varepsilon/4$. Since
$L(1, \chi_1) > 0$ there is a constant $C_2(\varepsilon) > 0$ such that $L(1, \chi_1) \ge C_2(\varepsilon) q_1^{-\varepsilon}$.
We are unable to compute the value of the constant C(ε) in Siegel’s theorem
when ε < 1/2, because we have no way of estimating the size of the small-
est possible q1 when the second case arises in the proof. Such a constant is
called ‘non-effective.’ This is our first encounter with a non-effective constant,
so the distinction between effectively computable constants and non-effective
constants arises here for the first time.
Corollary 11.15 For any ε > 0 there is a positive number C(ε) such that
if χ is a quadratic character modulo q and β is a real zero of L(s, χ ), then
$\beta < 1 - C(\varepsilon)q^{-\varepsilon}$.
Proof We may certainly suppose that β > 1 − c/log 4q > 1 − 1/log q, where
c is the number appearing in Theorem 11.3, so that β is an exceptional zero by
the criterion following that theorem. By taking s = 1 in (10) we see that
11.2.1 Exercises
1. Call a modulus q ‘exceptional’ if there is a primitive quadratic character
χ (mod q) such that L(s, χ ) has a real zero β such that β > 1 − c/ log q.
Show that if c is sufficiently small, then the number of exceptional q not
exceeding Q is ≪ log log Q.
2. Use the last part of Theorem 11.4 to show that if L(s, χ) has an exceptional
zero β1, then L′(β1, χ) ≫ 1.
3. (cf. Mahler 1934, Davenport 1966, Haneke 1973, Goldfeld & Schinzel 1975)
Suppose that χ is a quadratic character, and put $r(n) = \sum_{d\mid n}\chi(d)$.
(a) Show that
\[ \sum_{n\le y}\frac{\chi(n)}{n} = L(1,\chi) + O\bigl(q^{1/2}y^{-1}\log q\bigr). \]
say.
(d) Show that
1 = (log x + C0 )L(1, χ) + L (1, χ ) + O q 1/2 y −1 (log qy)2 + O(yx −1 ).
(h) Show that for each c < 1/2 there is a constant q0 (c) such that if q ≥ q0 (c)
and L(1, χ) < c/ log q, then
r (n)
L (1, χ ) .
n≤q n
4. Use Estermann’s lemma (Lemma 11.13) to give a second proof that if L(s, χ)
has an exceptional zero β1, then L(1, χ) ≫ 1 − β1 (cf. (11.10) of Theorem
11.4).
5. Use Estermann’s lemma (Lemma 11.13) to give a second proof that if χ is a
cubic character (mod q), then $L(1,\chi) \gg (\log q)^{-1/2}$ (cf. Exercise 11.1.4(e)).
6. (Tatuzawa 1951) Let χ1 and χ2 be distinct primitive quadratic characters,
modulo q1 and q2, respectively, and suppose that $L(1,\chi_i) < C\varepsilon q_i^{-\varepsilon}$ for i =
1, 2 where 0 < ε ≤ 1 and C > 0.
(a) Show that $\min_{x>1} x/\log x = e$. By a change of variables, deduce
that if ε > 0, then $\min_{x>1} x^{\varepsilon}/\log x = e\varepsilon$. Use this to show that
$\min_{x>1} x^{\varepsilon}/(\log x)^2 = e^2\varepsilon^2/4$.
(b) Explain why there exists a constant c1 > 0 such that L(1, χ) ≥ c1/log q
whenever L(s, χ) has no exceptional zero. Let $C_1 = e\,c_1$. Show that if
C < C1 , then L(s, χ1 ) and L(s, χ2 ) have exceptional zeros, say β1 and
β2 . (From now on, suppose that C < C1 .)
(c) Explain why there is a positive constant c2 such that L(1, χ) ≥ c2 (1 − β)
whenever β is an exceptional zero of L(s, χ ). Let C2 = c2 /6. Show that
if C < C2 , then β > 1 − ε/6. Let C3 = c2 /20. Show that if C < C3 ,
then β > 19/20. (From now on, suppose that C < Ci for i = 1, 2, 3.)
(d) Explain why there is a constant c3 > 0 such that at most one of L(s, χ1 ),
L(s, χ2 ) has a zero in the interval [1 − c3 / log q1 q2 , 1].
(e) Show that L(s, χ1 )L(s, χ2 ) has a zero β that satisfies the three inequal-
ities β ≥ 19/20, β ≥ 1 − ε/6, β ≤ 1 − c3 / log q1 q2 .
(f) Let f(s) = L(s, χ1)L(s, χ2)L(s, χ1χ2). Show that there is an absolute
constant c4 > 0 such that $f(1) \ge c_4(\log q_1q_2)^{-1}(q_1q_2)^{-\varepsilon/2}$.
(g) Explain why there is a constant c5 > 0 such that L(1, χ1χ2) ≤
c5 log q1q2.
(h) Show that $C \ge c_4^{1/2}c_5^{-1/2}e/4$.
(i) Conclude that there is a positive effectively computable absolute C such
that if 0 < ε ≤ 1, then the inequality $L(1,\chi) > C\varepsilon q^{-\varepsilon}$ holds for all
primitive quadratic characters, with at most one exception.
7. (Fekete & Pólya 1912, Pólya & Szegö 1925, p. 44, Heilbronn 1937) Let
$S_1(x,\chi) = \sum_{1\le n\le x}\chi(n)$.
(a) Show that if χ is a quadratic character such that S1(x, χ) ≥ 0 for all
x ≥ 1, then L(σ, χ) > 0 for all σ > 0.
(b) Let $\chi_d(n) = \left(\frac{d}{n}\right)$. Show that the hypothesis above holds for d =
−3, −4, −7, −8, but not for d = 5, 8.
(c) For k > 1 let $S_k(N,\chi) = \sum_{n=1}^{N} S_{k-1}(n,\chi)$. Show that
\[ S_k(N,\chi) = \sum_{n=1}^{N}\binom{N-n+k-1}{k-1}\chi(n). \]
(d) Let $\Delta f(x) = f(x+1) - f(x)$ and $\Delta^k f(x) = \Delta(\Delta^{k-1} f(x))$. Show that
$\Delta^k f(x) = \sum_{r=0}^{k}(-1)^r\binom{k}{r} f(x+k-r)$, and that if $f^{(k)}(x)$ is continuous
then
\[ \Delta^k f(x) = \int_x^{x+1}\int_{u_1}^{u_1+1}\cdots\int_{u_{k-1}}^{u_{k-1}+1} f^{(k)}(u_k)\,du_k\,du_{k-1}\cdots du_1. \]
(e) Show that if σ > 0, then $(-1)^k\Delta^k(x^{-\sigma}) > 0$ for all x > 0.
(f) Show that $L(s,\chi) = (-1)^k\sum_{n=1}^{\infty} S_k(n,\chi)\Delta^k(n^{-s})$ for σ > 0.
(g) Show that if χ is a quadratic character and k is an integer such that
$S_k(N,\chi) \ge 0$ for all integers N ≥ 1, then L(σ, χ) > 0 for all σ > 0.
(h) For $\chi_5(n) = \left(\frac{n}{5}\right)$ and $\chi_8(n) = \left(\frac{n}{8}\right)$ find the least k such that the hypothesis
above is satisfied.
(i) Let $P(z,\chi) = \sum_{n=1}^{\infty}\chi(n)z^n$ for |z| < 1. Show that
$P(z,\chi)(1-z)^{-k} = \sum_{n=1}^{\infty} S_k(n,\chi)z^n$ for |z| < 1.
0.0323. Deduce that P(0.7, χ−163 ) < 0, and hence that for any k there
is an N for which Sk (N , χ−163 ) < 0.
11.3 The Prime Number Theorem for APs
Theorem 11.16 There is a constant c1 > 0 such that if $q \le \exp(2c_1\sqrt{\log x})$,
then
\[ \psi(x,\chi) = E_0(\chi)x + O\bigl(x\exp(-c_1\sqrt{\log x})\bigr) \tag{11.23} \]
when L(s, χ) has no exceptional zero, but
\[ \psi(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\bigl(x\exp(-c_1\sqrt{\log x})\bigr) \tag{11.24} \]
when L(s, χ) has an exceptional zero β1. Here E0(χ) = 1 if χ = χ0, and
E0(χ) = 0 otherwise.
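Proof By Theorems 4.8 and 5.2 we see that
\[ \psi(x,\chi) = \frac{-1}{2\pi i}\int_{\sigma_0-iT}^{\sigma_0+iT}\frac{L'}{L}(s,\chi)\frac{x^s}{s}\,ds + R \]
where σ0 > 1 and
\[ R \ll \sum_{x/2<n<2x}\Lambda(n)\min\Bigl(1,\frac{x}{T|x-n|}\Bigr) + \frac{(4x)^{\sigma_0}}{T}\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^{\sigma_0}} \]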
a pole inside C at β1, so the left-hand side of (11.25) has the value $-x^{\beta_1}/\beta_1$.
Otherwise, the estimates proceed as before, and we find that
\[ \psi(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\Bigl(x(\log x)^2\Bigl(\frac{1}{T} + \exp\Bigl(\frac{-c\log x}{5\log qT}\Bigr)\Bigr)\Bigr). \tag{11.27} \]
Case 3. There is an exceptional zero β1, but it satisfies β1 < 1 − c/(4 log qT).
We proceed exactly as in Case 1, and so we obtain (11.26). To pass to (11.27)
it suffices to note that
\[ \frac{x^{\beta_1}}{\beta_1} \ll x\exp\Bigl(\frac{-c\log x}{5\log qT}\Bigr) \]
in the current case.
We have established (11.26) if there is no exceptional zero, and (11.27)
if there is one. To complete our argument, we need only observe that if
√ √ √
c1 = c/20, if q ≤ exp(2c1 log x), and if T = exp(2c1 log x), then (11.26)
gives (11.23) and (11.27) gives (11.24).
term when $x < \exp(c_1^2/(1-\beta_1)^2)$. If β1 is extremely close to 1, then one might
have β1 ≥ 1 − 1/log x, and in such a situation the second main term is of the
same order of magnitude as the first main term, since
\[ x - \frac{x^{\beta_1}}{\beta_1} = (\beta_1-1)x^{\beta_1}/\beta_1 + (\log x)\int_{\beta_1}^{1}x^{\sigma}\,d\sigma \asymp (1-\beta_1)x\log x. \tag{11.30} \]
β1 β1
Thus if 1 − β1 is small compared with 1/ log x, then the main term is nearly
doubled if χ1 (a) = −1, and it is nearly annihilated if χ1 (a) = 1. Unfortunately,
the upper bound provided by the Brun–Titchmarsh theorem (Theorem 3.9) is
not quite strong enough to refute such a possibility.
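The size comparison in (11.30) can be checked numerically. The following is a minimal sketch, assuming purely hypothetical values β1 = 1 − 1/(k log x); no such zero is known to exist, and the function names and the choice x = 10^12 are illustrative only.

```python
from math import log

def main_term_gap(x, beta):
    """Return x - x**beta / beta, the difference of the two main terms."""
    return x - x**beta / beta

def gap_ratio(x, k):
    """Ratio of the gap to (1 - beta) * x * log(x), for the hypothetical
    zero beta = 1 - 1/(k * log x), which satisfies beta >= 1 - 1/log x."""
    beta = 1.0 - 1.0 / (k * log(x))
    return main_term_gap(x, beta) / ((1.0 - beta) * x * log(x))

if __name__ == "__main__":
    x = 1e12  # illustrative value only
    for k in (1, 10, 100):
        print(k, round(gap_ratio(x, k), 3))
```

In each case the ratio stays between fixed positive bounds, which is what the relation ≍ in (11.30) asserts for β1 ≥ 1 − 1/log x.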
The constants c and c1 in Theorems 11.3, 11.4, 11.16 and Corollary 11.17
are effectively computable. However, if we are willing to accept non-effective
constants, then by Siegel’s theorem (Theorem 11.14), or more precisely by its
corollary (Corollary 11.15), we can eliminate the second main term, provided
that q is more sharply limited.
Corollary 11.18 Let c1 be the same constant as in Theorem 11.16. For any
positive A there is an x0(A) such that if $q \le (\log x)^A$, then
\[ \psi(x,\chi) = E_0(\chi)x + O\bigl(x\exp(-c_1\sqrt{\log x})\bigr) \tag{11.31} \]
for x ≥ x0 (A).
Proof Suppose that χ is quadratic and that L(s, χ ) has an exceptional zero
β1 . Then
\[ \le x\exp\bigl(-C(\varepsilon)(\log x)^{1-A\varepsilon}\bigr). \]
In order to reach (11.31) we need to take ε a little smaller than 1/(2A), say
ε = 1/(3A). Then the above is
\[ \le x\exp(-c_1\sqrt{\log x}) \]
obtain an estimate
\[ \psi(x,\chi) \ll_A x\exp(-c_1\sqrt{\log x}) \]
that is valid for all q and all $x \ge \exp(q^{1/A})$, though of course the implicit
constant is so large that the bound is worse than the trivial ψ(x, χ) ≪ x when
x < x0 . By applying (11.22) and (11.28), we obtain
Proof Since
find that it is
ϑ(u; q, a) − u/ϕ(q) x x
ϑ(u; q, a) − u/ϕ(q)
− du.
log u 2− 2 u(log u)2
If there is no exceptional zero, then the numerator in the integrand is
$\ll u\exp(-c_1\sqrt{\log u}) \ll x\exp(-c_1\sqrt{\log x})$, so we obtain (11.33). If there is
an exceptional character χ1, then the main term is reduced by χ1(a)/ϕ(q) times
the amount
\[ \int_2^x\frac{1}{\log u}\,d\frac{u^{\beta_1}}{\beta_1} = \int_2^x\frac{u^{\beta_1-1}}{\log u}\,du = \int_{2^{\beta_1}}^{x^{\beta_1}}\frac{1}{\log v}\,dv = \mathrm{li}(x^{\beta_1}) + O(1). \]
The error term is still treated in the same way, so we obtain (11.35).
Corollary 11.21 Let c1 be the constant in Theorem 11.16, and suppose that
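A is given, A > 0. If $q \le (\log x)^A$ and (a, q) = 1, then
\[ \vartheta(x;q,a) = \frac{x}{\varphi(q)} + O_A\bigl(x\exp(-c_1\sqrt{\log x})\bigr) \tag{11.36} \]
and
\[ \pi(x;q,a) = \frac{\mathrm{li}(x)}{\varphi(q)} + O_A\bigl(x\exp(-c_1\sqrt{\log x})\bigr). \tag{11.37} \]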
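The main term in (11.37) is easy to compare with actual prime counts for a small modulus. The sketch below is only an illustration: the helper names, the midpoint-rule approximation to li(x) = ∫_2^x du/log u, and the choices x = 10^5, q = 4 are assumptions made here, not part of the text.

```python
from math import gcd, log

def primes_upto(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def li(x, steps=200_000):
    """Midpoint-rule approximation to the integral of du/log u over [2, x]."""
    h = (x - 2.0) / steps
    return sum(h / log(2.0 + (i + 0.5) * h) for i in range(steps))

def pi_in_classes(x, q):
    """Map each reduced residue a (mod q) to pi(x; q, a)."""
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_upto(x):
        if p % q in counts:
            counts[p % q] += 1
    return counts

if __name__ == "__main__":
    x, q = 100_000, 4
    counts = pi_in_classes(x, q)
    expected = li(x) / len(counts)  # len(counts) equals phi(q)
    for a, c in sorted(counts.items()):
        print(a, c, round(expected, 1))
```

For such a small q the counts in the reduced residue classes are all close to li(x)/ϕ(q); the corollary asserts this uniformly for q ≤ (log x)^A, with a non-effective implicit constant.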
11.3.1 Exercises
1. Suppose that χ is a character modulo q. Explain why
\[ \psi(x,\chi) = \sum_{\substack{a=1\\ (a,q)=1}}^{q}\chi(a)\psi(x;q,a). \]
2. Suppose that $\exp(2c_1\sqrt{\log x}) \le q \le x$. Show that there is a positive
constant c2 such that
\[ \psi(x,\chi) = E_0(\chi)x + O\Bigl(x\exp\Bigl(\frac{-c_2\log x}{\log q}\Bigr)\Bigr) \]
if L(s, χ) has no exceptional zero, and that
\[ \psi(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\Bigl(x\exp\Bigl(\frac{-c_2\log x}{\log q}\Bigr)\Bigr) \]
if L(s, χ) has the exceptional zero β1.
3. Show that if $q \le \exp(2c_1\sqrt{\log x})$, then
\[ \vartheta(x,\chi) = E_0(\chi)x + O\bigl(x\exp(-c_1\sqrt{\log x})\bigr) \]
and that
\[ \pi(x,\chi) = E_0(\chi)\,\mathrm{li}(x) + O\bigl(x\exp(-c_1\sqrt{\log x})\bigr). \]
7. Let c1 be the constant of Theorem 11.16, suppose that $q \le \exp(2c_1\sqrt{\log x})$
and that χ is a character modulo q. Show that
\[ M(x,\chi) \ll x\exp(-c_1\sqrt{\log x}) \]
and that
\[ M(x,\chi) \ll_A x\exp(-c_1\sqrt{\log x}). \]
and that
\[ M(x;q,a) = \frac{1}{\varphi(q)}\sum_{\chi}\overline{\chi}(a)M(x,\chi). \]
10. Let c1 be the constant in Theorem 11.16. Show that if (a, q) = 1, then
(x; q, a) x exp − c1 log x
for all a.
12. Suppose that (a, q) = 1. Show that
M(x; q, a) x exp − c1 log x
where kk ≡ 1 (mod r ).
(d) Show that M(x; q, a) ≪ x/q in any case.
(e) Deduce that $M(x;q,a) \ll x\exp(-c\sqrt{\log x})$ if there is no exceptional
character modulo r, and that
µ(d)χ1 (b)(x/d)β1 χ1 ( p)
M(x; q, a) = 1 − + O x exp − c log x
ϕ(r )L (β1 , χ1 )β1 p|d p β1
pr
11.4 Applications
The fundamental estimates of the preceding section can be applied to a
wide variety of counting problems, of which the following are representative
examples.
Theorem 11.22 (Walfisz) Let A > 0 be fixed, and let R(n) denote the number
of ways of writing n as a sum of a prime and a square-free number. Then
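\[ R(n) = c(n)\,\mathrm{li}(n) + O\bigl(n/(\log n)^A\bigr) \]
where
\[ c(n) = \prod_{p\nmid n}\Bigl(1-\frac{1}{p(p-1)}\Bigr) = \prod_{p\mid n}\Bigl(1+\frac{1}{p^2-p-1}\Bigr)\prod_{p}\Bigl(1-\frac{1}{p(p-1)}\Bigr). \]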
Proof Clearly
\[ R(n) = \sum_{p<n}\mu(n-p)^2 = \sum_{p<n}\sum_{d^2\mid(n-p)}\mu(d) \]
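Since ϕ(d²) = dϕ(d), we see that the sum in the main term is
\[ \sum_{\substack{d=1\\ (d,n)=1}}^{\infty}\frac{\mu(d)}{d\varphi(d)} + O\Bigl(\sum_{d>y}\frac{1}{d\varphi(d)}\Bigr) = \prod_{p\nmid n}\Bigl(1-\frac{1}{p(p-1)}\Bigr) + O(1/y) \]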
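The main term in Theorem 11.22 can be compared with a direct count for moderate n. The following is a rough sketch under stated assumptions: the product for c(n) is truncated at an arbitrary prime bound, li(n) = ∫_2^n du/log u is approximated by the midpoint rule, and all function names are illustrative choices made here.

```python
from math import log

def primes_upto(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def is_squarefree(m):
    """True if no square d*d with d >= 2 divides m."""
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True

def R(n):
    """Number of primes p < n for which n - p is square-free."""
    return sum(1 for p in primes_upto(n - 1) if is_squarefree(n - p))

def c(n, prime_bound=100_000):
    """Product of 1 - 1/(p(p-1)) over primes p <= prime_bound not dividing n
    (a truncation of the infinite product; the bound is an assumption)."""
    prod = 1.0
    for p in primes_upto(prime_bound):
        if n % p != 0:
            prod *= 1.0 - 1.0 / (p * (p - 1))
    return prod

def li(x, steps=100_000):
    """Midpoint-rule approximation to the integral of du/log u over [2, x]."""
    h = (x - 2.0) / steps
    return sum(h / log(2.0 + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    n = 10_000
    print(R(n), round(c(n) * li(n), 1))
```

The two printed numbers should be of comparable size; the theorem itself only controls the difference up to O(n/(log n)^A) with a non-effective constant.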
Theorem 11.23 Let N(x) denote the number of integers n ≤ x for which
(n, ϕ(n)) = 1. Then
\[ N(x) \sim \frac{e^{-C_0}x}{\log\log\log x} \]
as x → ∞.
Proof We note that (n, ϕ(n)) = 1 if and only if n has the following two
properties: (i) n is square-free, and (ii) there do not exist prime factors p, p′ of n
such that p′ ≡ 1 (mod p). Let p(n) denote the least prime factor of n. We shall
show that if p(n) is small compared with log log x then n is unlikely to have the
property (ii). We also show that n is likely to have both properties (i) and (ii) if
p(n) is large compared with log log x. Thus N(x) is approximately the number
of integers n ≤ x for which p(n) > log log x.
Let $A_p(x)$ denote the number of n ≤ x that satisfy (i) and (ii) and for which
p(n) = p. Thus
\[ N(x) = \sum_{p\le x}A_p(x). \]
We begin by estimating $A_p(x)$ when p ≤ log log x. Let p be given, and suppose
that n is an integer such that p(n) = p and for which (ii) holds. Write n = pm;
then m is relatively prime to all prime numbers < p and also to all primes
≡ 1 (mod p). Thus by the sieve estimate (3.20) we see that
\[ A_p(x) \ll \frac{x}{p}\prod_{p'<p}\Bigl(1-\frac{1}{p'}\Bigr)\prod_{\substack{p'\le x/p\\ p'\equiv 1\,(p)}}\Bigl(1-\frac{1}{p'}\Bigr). \]
uniformly for p ≤ log log x. Hence there is a constant c > 0 such that in this
range,
\[ A_p(x) \ll \frac{x}{p\log p}\exp(-c(\log\log x)/p). \]
Now it is not hard to show that the number of integers n ≤ x such that p(n) = p
is ≍ x/(p log p) uniformly for p ≤ x/2. Hence the exponential above reflects
the relative improbability that n satisfies condition (ii). On summing, we find
that
\[ \sum_{\frac{1}{2}U<p\le U}A_p(x) \ll \frac{x}{(\log U)^2}\exp(-c(\log\log x)/U). \]
To derive a corresponding lower bound for the left-hand side we start with the
numbers counted by Φ(x, y) and then delete those that do not satisfy (i) or (ii).
If n does not satisfy (i), then there is a prime number p such that p² | n. The
number of such n ≤ x is not more than [x/p²] ≤ x/p². Hence the total number
of n counted in Φ(x, y) for which (i) fails is not more than
$x\sum_{p>y}p^{-2} \ll x/(y\log y)$. Similarly, if n does not satisfy (ii), then there exist primes p, p′
with pp′ | n such that p′ ≡ 1 (mod p). If p and p′ are given, then the number
of n ≤ x for which pp′ | n is ≤ x/(pp′). Hence the total number of n counted in
Φ(x, y) for which (ii) fails is not more than
\[ x\sum_{y\le p\le\sqrt{x}}\frac{1}{p}\sum_{\substack{p'\le x/p\\ p'\equiv 1\,(p)}}\frac{1}{p'}. \tag{11.40} \]
uniformly for U ≥ p. We take $U = 2^k p$ and sum over k to see that the inner
sum in (11.40) is ≪ (log log 4x/p²)/p. Hence the expression (11.40) is
\[ \ll x(\log\log x)\sum_{p>y}\frac{1}{p^2} \ll \frac{x\log\log x}{y\log y}. \]
In order that this is a smaller order of magnitude than the main term, it is
necessary to take $y \le (\log\log x)^{1+\varepsilon}$ with ε → 0 as x → ∞. By taking y to
be of this form with ε tending to 0 slowly, we obtain the stated result.
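The characterization used at the start of the proof, and the count N(x) itself, can be explored by direct computation. The sketch below is illustrative only (function names and ranges are choices made here); since log log log x grows extremely slowly, close numerical agreement with the asymptotic formula should not be expected in any range accessible to such a computation.

```python
from math import gcd, log, exp

def totients(limit):
    """Return a list phi with phi[n] = Euler's totient of n for 0 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: no smaller prime has touched it
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    return phi

def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def has_properties_i_and_ii(n):
    """(i) n square-free, and (ii) no primes p, p' | n with p' = 1 (mod p)."""
    ps = prime_factors(n)
    squarefree = all(n % (p * p) != 0 for p in ps)
    no_link = all(q % p != 1 for p in ps for q in ps)
    return squarefree and no_link

def N(x):
    """Number of n <= x with gcd(n, phi(n)) = 1."""
    phi = totients(x)
    return sum(1 for n in range(1, x + 1) if gcd(n, phi[n]) == 1)

if __name__ == "__main__":
    C0 = 0.5772156649  # Euler's constant
    for x in (10**3, 10**4, 10**5):
        print(x, N(x), round(exp(-C0) * x / log(log(log(x))), 1))
```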
11.4.1 Exercises
1. Let R(n) be defined as in Theorem 11.22.
(a) Show that if there is a primitive quadratic character χ1 (mod q1),
$q_1 \le \exp(\sqrt{\log x})$, for which L(s, χ1) has a real zero $\beta_1 > 1 - c(\log x)^{-1/2}$,
then
\[ R(n) = c(n)\,\mathrm{li}(n) - \chi_1(n)c_1(n)\,\mathrm{li}(n^{\beta_1}) + O\bigl(n\exp(-c\sqrt{\log n})\bigr) \]
where
\[ c_1(n) = \sum_{\substack{d=1\\ (d,n)=1\\ q_1\mid d^2}}^{\infty}\frac{\mu(d)}{d\varphi(d)}. \]
11.5 Notes
Section 11.1. Theorem 11.3 is a combination of work by Gronwall (1913) and
Titchmarsh (1930).
Section 11.2. Lemma 11.6, Theorem 11.7, and Corollaries 11.8, 11.9 origi-
nate in Landau (1918a, b), while Corollary 11.10 is from Page (1935). Theorem
11.11 can also be proved by appealing to the Dirichlet class number formula,
which asserts that if d is a quadratic discriminant and $\chi_d(n) = \left(\frac{d}{n}\right)_K$ is the
associated quadratic character, then
\[ L(1,\chi_d) = \begin{cases} \dfrac{2\pi h}{w\sqrt{-d}} & (d<0),\\[1ex] \dfrac{h\log\varepsilon}{\sqrt{d}} & (d>0); \end{cases} \]
see Davenport (2000, Section 6). If d < 0, then χd(−1) = −1, $\mathbb{Q}(\sqrt{d})$ is an
imaginary quadratic field with class number h, and w denotes the number of
roots of unity in the field (which is to say that w = 6 if d = −3, w = 4 if
d = −4, and w = 2 otherwise). If d > 0, then χd(−1) = 1, $\mathbb{Q}(\sqrt{d})$ is a real
quadratic field with class number h and fundamental unit ε. Since $\varepsilon \gg \sqrt{d}$,
it follows that if χ is a quadratic character with χ(−1) = 1, then
$L(1,\chi) \gg (\log q)/q^{1/2}$.
Corollary 11.12 has been sharpened by Davenport (1966), Haneke (1973),
and by Goldfeld & Schinzel (1975).
Section 11.3. Let h(d) denote the number of equivalence classes of primitive
binary quadratic forms of discriminant d. Gauss (1801, Section 303) conjec-
tured that h(d) → ∞ as d → −∞. (The behaviour for d > 0 is quite different –
the heuristics of Cohen & Lenstra (1984a, b) predict that h( p) = 1 for a positive
proportion of primes p ≡ 1 (mod 4).) For Gauss, the generic binary quadratic
form was written ax² + 2bxy + cy², which is to say that the middle coefficient
is even. Put Δ = b² − ac. In Gauss’s notation, Landau (1903) found that if
Δ < 0, then the class number is 1 precisely when Δ = −1, −2, −3, −4, −7.
Binary quadratic forms ax² + bxy + cy² with d = b² − 4ac correspond, when
d is a fundamental quadratic discriminant, to ideals in the ring $O_K$ of integers
in the quadratic number field $K = \mathbb{Q}(\sqrt{d})$. In this notation, h(d) = 1 if and
only if O K is a unique factorization domain. The problem of determining all
d < 0 for which h(d) = 1 is now solved, but historically it was enormously
more difficult than the class number 1 problem settled by Landau. Landau
(1918b) recorded Hecke’s observation that if d < 0 is a quadratic discriminant
and L(s, χd) > 0 for 1 − c/log |d| < s < 1, then $h(d) \gg_c |d|^{1/2}/\log|d|$. In
view of Dirichlet’s class number formula (4.36), we have obtained Hecke’s
result – by a different method – in Theorem 11.4. Thus we have a good lower
bound for h(d) when d < 0, except for those d for which L(s, χd) has an
exceptional real zero. Deuring (1933) showed that if h(d) = 1 has infinitely many
solutions with d < 0, then the Riemann Hypothesis is true. Mordell (1934)
showed that the same conclusion can be derived from the weaker hypothe-
sis that h(d) does not tend to infinity as d → −∞. Heilbronn (1934) found
that instead of arguing from a hypothetical zero ρ of the zeta function with
β > 1/2 one could just as well argue from an exceptional zero of a quadratic
L-function, and thus proved Gauss’s conjecture that h(d) → ∞ as d → −∞.
Landau (1935) put Heilbronn’s theorem in a quantitative form: $h(d) > |d|^{3/8-\varepsilon}$
as d → −∞. Through a different arrangement of the technical details, Siegel
(1935) sharpened Landau’s argument to show that $h(d) > |d|^{1/2-\varepsilon}$, which by
(4.36) is the case d < 0 of Theorem 11.14. To achieve his result, Siegel first
generalized to algebraic number fields the formula (found in Exercise 10.1.10) that
Riemann used to prove the functional equation for ζ(s). Then Siegel applied this
to the quartic number field $K = \mathbb{Q}(\sqrt{d_1},\sqrt{d_2})$ whose Dedekind zeta function
is $\zeta_K(s) = \zeta(s)L(s,\chi_{d_1})L(s,\chi_{d_2})L(s,\chi_{d_1d_2})$. It is now recognized that Siegel’s
formula arises through the choice of the kernel in a Mellin transform, and that
many other choices work just as well; see Goldfeld (1974). Our exposition is
based on that of Estermann (1948).
It is easy to show that the complex quadratic field of discriminant d < 0
has unique factorization in the nine cases d = −3, −4, −7, −8, −11, −19,
−43, −67, −163. Heilbronn & Linfoot (1934) showed that there could ex-
ist at most one more such discriminant. The ‘problem of the tenth discrimi-
nant’ was solved first by Heegner (1952). However, Heegner’s paper contained
many assertions for which proofs were not provided, and Heegner also used
results from Weber’s Algebra which were known not to be trustworthy. Con-
sequently, for many years Heegner’s paper was thought to be incorrect. Baker
(1966) proved a fundamental lower bound for linear forms in logarithms of
algebraic numbers, which by means of a result of Gel’fond & Linnik (1948)
reduced the class number 1 problem to a finite calculation. Meanwhile, Stark
(1967) showed that there is no tenth discriminant by translating Heegner’s
argument into parallel language where it could be checked. After a reexami-
nation of Heegner’s work, Deuring (1968), Birch (1969), and Stark (1969) all
concluded that Heegner’s paper was after all correct. Gel’fond & Linnik re-
duced the class number problem to a question concerning linear forms in three
logarithms, which Baker treated successfully. However, with a small modifi-
cation of their argument, Gel’fond & Linnik could have reduced the problem
to linear forms in two logarithms, which Gel’fond had already treated. Thus
one could say that Gel’fond & Linnik ‘should’ have solved the problem in
1948.
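The assertion that h(d) = 1 for the nine discriminants listed above can be confirmed by counting reduced forms. The following is a minimal sketch, assuming the classical reduction conditions for primitive positive definite forms ax² + bxy + cy² with d = b² − 4ac (namely −a < b ≤ a ≤ c, with b ≥ 0 when a = c); the function name is an illustrative choice.

```python
from math import gcd

def class_number(d):
    """Count reduced primitive positive definite forms (a, b, c) of
    discriminant d = b*b - 4*a*c < 0."""
    assert d < 0 and d % 4 in (0, 1), "d must be a negative discriminant"
    h = 0
    a = 1
    # A reduced form has |b| <= a <= c, which forces 3*a*a <= |d|.
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):  # -a < b <= a
            numerator = b * b - d       # equals 4*a*c
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:        # keep only b >= 0 when a == c
                continue
            if gcd(gcd(a, b), c) == 1:  # primitive forms only
                h += 1
        a += 1
    return h

if __name__ == "__main__":
    for d in (-3, -4, -7, -8, -11, -19, -43, -67, -163):
        print(d, class_number(d))
```

This only verifies class number one for the listed discriminants; showing that there is no tenth such discriminant is the deep part of the problem discussed above and is not addressed by any finite computation of this kind.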
Baker (1971) and Stark (1971b, 1972) reduced the complete determination
of complex quadratic fields with h(d) = 2 to a finite calculation which was
provided by Bundschuh & Hock (1969), Ellison et al. (1971), Montgomery &
Weinberger (1973), and by Stark (1975).
The effective determination of all quadratic discriminants d < 0 for which
h(d) takes specific larger values became possible only with the addition of
further ideas. Goldfeld (1976) showed that a zero at s = 1/2 of the L-function
of an elliptic curve would be useful if it is of sufficiently high multiplicity.
In particular, if (i) the Birch–Swinnerton-Dyer conjectures are true, and if (ii)
there exist elliptic curves of arbitrarily high rank, then $h(d) \gg_A (\log|d|)^A$ for
arbitrarily large A, with an effectively computable implicit constant. Although
these conjectures remain unproved, Gross & Zagier (1986) were able to establish
enough to give an effective lower bound for h(d) tending to infinity. For accounts
of this, see Zagier (1984), Goldfeld (1985), Coates (1986), and finally Oesterlé
(1988), who developed the Goldfeld and Gross–Zagier work to show that
\[ h(d) \ge \frac{1}{55}(\log|d|)\prod_{\substack{p\mid d\\ p<|d|}}\Bigl(1-\frac{[2\sqrt{p}]}{p+1}\Bigr). \]
By means of this inequality, Arno (1992), Wagner (1996), and Arno, Robinson &
Wheeler (1998) treated progressively larger collections of class numbers. Most
recently, Watkins (2004) settled the complete determination of all discriminants
d < 0 for which h(d) ≤ 100.
With regard to Corollary 11.17, Page (1935) states the final conclusion in
a less precise form in which the term corresponding to the exceptional zero is
replaced by O(x β1 /φ(q)).
The deduction of Corollaries 11.18 and 11.19 from Siegel’s theorem was
first recorded by Walfisz (1936).
Section 11.4. Theorem 11.22 is due to Walfisz (1936). In a weaker form it
occurs first in Estermann (1931), and is given in a somewhat refined form but
without the benefit of Siegel’s theorem in Page (1935). For similar theorems
see Mirsky (1949).
Theorem 11.23 is due to Erdős (1948).
11.6 References
Arno, S. (1992). The imaginary quadratic fields of class number 4, Acta Arith. 60,
321–334.
Arno, S., Robinson, M. L., & Wheeler, F. S. (1998). Imaginary quadratic fields with
small class number, Acta Arith. 83, 295–330.
(1985). Gauss’ class number problems for imaginary quadratic fields, Bull. Amer.
Math. Soc. 13, 23–37.
(2004). The Gauss class number problem for imaginary quadratic fields, Heegner
Points and Rankin L-series, Math. Sci. Res. Inst. Publ. 49. Cambridge: Cambridge
University Press, 25–36.
Goldfeld, D. M. & Schinzel, A. (1975). On Siegel’s zero, Ann. Scuola Norm. Sup. Pisa
Cl. Sci. (4) 2, 571–583.
Gronwall, T. H. (1913). Sur les séries de Dirichlet correspondant à des caractères com-
plexes, Rend. Circ. Mat. Palermo 35, 145–159.
Gross, B. H. & Zagier, D. B. (1986). Heegner points and derivatives of L-series, Invent.
Math. 84, 225–320.
Haneke, W. (1973). Über die reellen Nullstellen der Dirichletschen L-Reihen, Acta Arith.
22, 391–421; Corrigendum, 31 (1976), 99–100.
Heegner, K. (1952). Diophantische Analysis und Modulfunktionen, Math. Z. 56, 227–
253.
Heilbronn, H. (1934). On the class-number in imaginary quadratic fields, Quart. J. Math.
Oxford Ser. 5, 150–160.
(1937). On real characters, Acta Arith. 2, 212–213.
Heilbronn, H. & Linfoot, E. (1934). On the imaginary quadratic corpora of class-number
one, Quart. J. Math. Oxford Ser. 5, 293–301.
Landau, E. (1903). Über die Klassenzahl der binären quadratischen Formen von neg-
ativer Discriminante, Math. Ann. 56, 671–676; Collected Works, Vol. 1. Essen:
Thales Verlag, 1985, pp. 354–359.
(1918a). Über imaginär-quadratische Zahlkörper mit gleicher Klassenzahl, Nachr.
Akad. Wiss. Göttingen, 277–284; Collected Works, Vol. 7. Essen: Thales Verlag,
1986, pp. 142–160.
(1918b). Über die Klassenzahl imaginär-quadratischer Zahlkörper, Nachr. Akad.
Wiss. Göttingen, 285–295; Collected Works, Vol. 7. Essen: Thales Verlag,
pp. 150–160.
(1935). Bemerkungen zum Heilbronnschen Satz, Acta Arith. 1, 1–18; Collected Works,
Vol. 9. Essen: Thales Verlag, 1987, pp. 265–282.
Mahler, K. (1934). On Hecke’s theorem on the real zeros of the L-functions and the
class number of quadratic fields, J. London Math. Soc. 9, 298–302.
Mirsky, L. (1949). The number of representations of an integer as the sum of a prime
and a k-free integer, Amer. Math. Monthly 56, 17–19.
Montgomery, H. L. & Weinberger, P. J. (1973). Notes on small class numbers, Acta
Arith. 24, 529–542.
Mordell, L. J. (1934). On the Riemann Hypothesis and imaginary quadratic fields with
given class number, J. London Math. Soc. 9, 405–415.
Oesterlé, J. (1988). Le problème de Gauss sur le nombre de classes, Enseignement Math.
(2) 34, 43–67.
Page, A. (1935). On the number of primes in an arithmetic progression, Proc. London
Math. Soc. (2) 39, 116–141.
Pólya, G. & Szegö, G. (1925). Aufgaben und Lehrsätze aus der Analysis, Vol. 2, Grundl.
Math. Wiss. 20. Berlin: Springer.
Rosser, J. B. (1950). Real roots of real Dirichlet L-series, J. Research Nat. Bur. Standards
45, 505–514.
uniformly for −1 ≤ σ ≤ 2.
398 Explicit formulæ
Here the first term on the right is significant only for |t| ≤ 1. We could prove
the above by the same method that we used to prove Lemma 6.4, but we find it
instructive to argue instead from Corollary 10.14.
Proof By combining (10.29) and Theorem C.1, it is immediate that
\[ \frac{\zeta'}{\zeta}(s) = \frac{-1}{s-1} + \sum_{\rho}\Bigl(\frac{1}{s-\rho}+\frac{1}{\rho}\Bigr) - \frac{1}{2}\log\tau + O(1). \]
On applying this at σ + it and at 2 + it, and differencing, it follows that
\[ \frac{\zeta'}{\zeta}(s) = \frac{-1}{s-1} + \sum_{\rho}\Bigl(\frac{1}{s-\rho}-\frac{1}{2+it-\rho}\Bigr) + O(1). \]
By Theorem 10.13 it is clear that
\[ \sum_{\substack{\rho\\ |\gamma-t|\le 1}}\frac{1}{2+it-\rho} \ll \sum_{\substack{\rho\\ |\gamma-t|\le 1}}1 \ll \log\tau. \]
Now suppose that n is a positive integer, and consider those zeros ρ for which
n ≤ |γ − t| ≤ n + 1. Since
\[ \frac{1}{s-\rho}-\frac{1}{2+it-\rho} = \frac{2-\sigma}{(s-\rho)(2+it-\rho)} \ll \frac{1}{n^2}, \]
it follows that such zeros contribute an amount
\[ \ll \frac{N(t+n+1)-N(t+n)+N(t-n)-N(t-n-1)}{n^2} \ll \frac{\log(\tau+n)}{n^2}. \]
On summing over n we obtain the stated estimate.
The next lemma is useful in Chapter 14, but we establish it here since it is
also an immediate corollary of Lemma 12.1.
Lemma 12.3 For any real number t,
\[ \arg\zeta(\sigma+it) \ll \log\tau \]
uniformly for −1 ≤ σ ≤ 2.
12.1 Classical formulæ 399
The function log ζ (s) has a branch point at s = 1, and also at zeros ρ of
the zeta function. To obtain a single branch of the logarithm, we remove from
the complex plane the interval (−∞, 1], and also intervals of the form (−∞ +
iγ , β + iγ ]. What remains is simply connected, and in this region we take
that branch of log ζ (s) for which log ζ (s) → 0 as σ → ∞. This is the branch
of the logarithm that we have expanded as a Dirichlet series, for σ > 1 (cf.
Corollary 1.11). Thus, if t is not the ordinate of a zero, we define arg ζ(s) =
ℑ log ζ(s) by continuous variation from ∞ + it to σ + it, which is to say
that
\[ \arg\zeta(s) = -\int_{\sigma}^{\infty}\Im\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha. \]
If t is the ordinate of a zero then we set
$\arg\zeta(s) = (\arg\zeta(\sigma+it^{+}) + \arg\zeta(\sigma+it^{-}))/2$.
Lemma 12.4 Let A denote the set of those points s ∈ C such that σ ≤ −1
and |s + 2k| ≥ 1/4 for every positive integer k. Then
\[ \frac{\zeta'}{\zeta}(s) \ll \log(|s|+1) \]
uniformly for s ∈ A.
Proof We recall (10.27), in which the first two terms are bounded for s ∈ A.
Also,
\[ \frac{\Gamma'}{\Gamma}(1-s) \ll \log(|s|+1) \]
where
\[ R(x,T) \ll (\log x)\min\Bigl(1,\frac{x}{T\langle x\rangle}\Bigr) + \frac{x}{T}(\log xT)^2. \tag{12.4} \]
Since ⟨x⟩ > 0 for all x, we obtain (12.1) by letting T → ∞ in the above.
Moreover, if n1 < n2 are two consecutive prime powers, then from the above
we see that $\sum_{|\gamma|\le T}x^{\rho}/\rho$ converges uniformly for x in an interval of the form
[n1 + δ, n2 − δ]. This sum, of course, cannot be uniformly convergent for x
in a neighbourhood of a prime power, since ψ0(x) has jump discontinuities
at such points, but we see from the above that it is boundedly convergent in
the neighbourhood of a prime power. The sum over ρ is also convergent when
x = 1, but it is not boundedly convergent near 1, since $\log(1-1/x^2) \to -\infty$
as $x \to 1^{+}$.
Proof Let T1 be the number supplied by Lemma 12.2. Then by Theorem 5.2
and its Corollary 5.3, with σ0 = 1 + 1/ log x, we see that
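\[ \psi_0(x) = \frac{-1}{2\pi i}\int_{\sigma_0-iT_1}^{\sigma_0+iT_1}\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\,ds + R_1 \]
where
\[ R_1 \ll \sum_{\substack{x/2<n<2x\\ n\ne x}}\Lambda(n)\min\Bigl(1,\frac{x}{T|x-n|}\Bigr) + \frac{x}{T}\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^{\sigma_0}}. \]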
Here the second sum is $-\frac{\zeta'}{\zeta}(\sigma_0) \asymp 1/(\sigma_0-1) = \log x$. In the first sum, the
terms for which x + 1 ≤ n < 2x contribute an amount
\[ \ll \sum_{x+1\le n<2x}\frac{x\log x}{T(n-x)} \ll \frac{x}{T}(\log x)^2. \]
The terms for which x/2 < n ≤ x − 1 are handled similarly. Finally, any terms
for which x − 1 < n < x + 1 contribute an amount
\[ \ll (\log x)\min\Bigl(1,\frac{x}{T\langle x\rangle}\Bigr), \]
so
\[ R_1 \ll (\log x)\min\Bigl(1,\frac{x}{T\langle x\rangle}\Bigr) + \frac{x}{T}(\log x)^2. \]
Let K denote an odd positive integer, and let C denote the contour consisting
of line segments connecting σ0 − iT1, −K − iT1, −K + iT1, σ0 + iT1. Then
by Cauchy’s residue theorem,
\[ \psi_0(x) = x - \sum_{\substack{\rho\\ |\gamma|<T_1}}\frac{x^{\rho}}{\rho} + \sum_{1\le k<K/2}\frac{x^{-2k}}{2k} - \frac{\zeta'}{\zeta}(0) + R_1 + R_2 \]
where
\[ R_2 = \frac{-1}{2\pi i}\int_{C}\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\,ds. \]
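Since |σ ± iT1| ≥ T, we see by Lemma 12.2 that
\[ \int_{-1\pm iT_1}^{\sigma_0\pm iT_1}\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\,ds \ll \frac{(\log T)^2}{T}\int_{-1}^{\sigma_0}x^{\sigma}\,d\sigma \ll \frac{x(\log T)^2}{T\log x} \ll \frac{x(\log T)^2}{T}. \]
Similarly, since (log |σ ± iT1|)/|σ ± iT1| ≪ (log T)/T, we see by Lemma
12.4 that
\[ \int_{-K\pm iT_1}^{-1\pm iT_1}\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\,ds \ll \frac{\log T}{T}\int_{-\infty}^{-1}x^{\sigma}\,d\sigma \ll \frac{\log T}{xT\log x} \ll \frac{\log T}{T}. \]
As |−K + it| ≥ K, by Lemma 12.4 we also see that
\[ \int_{-K-iT_1}^{-K+iT_1}\frac{\zeta'}{\zeta}(s)\frac{x^s}{s}\,ds \ll \frac{\log KT}{K}x^{-K}\int_{-T_1}^{T_1}1\,dt \ll \frac{T\log KT}{Kx^{K}}. \]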
This tends to 0 as K → ∞, so we obtain the stated result.
uniformly for −1 ≤ σ ≤ 2.
Proof By combining (10.37) and Theorem C.1, it is immediate that
\[ \frac{L'}{L}(s,\chi) = B(\chi) + \sum_{\rho}\Bigl(\frac{1}{s-\rho}+\frac{1}{\rho}\Bigr) + O(\log q\tau). \]
On applying this at σ + it and 2 + it, and differencing, it follows that
\[ \frac{L'}{L}(s,\chi) = \sum_{\rho}\Bigl(\frac{1}{s-\rho}-\frac{1}{2+it-\rho}\Bigr) + O(\log q\tau). \]
By Theorem 10.17 it is clear that
\[ \sum_{\substack{\rho\\ |\gamma-t|\le 1}}\frac{1}{2+it-\rho} \ll \sum_{\substack{\rho\\ |\gamma-t|\le 1}}1 \ll \log q\tau. \]
Now suppose that n is a positive integer, and consider those zeros ρ for which
n ≤ |γ − t| ≤ n + 1. Since
\[ \frac{1}{s-\rho}-\frac{1}{2+it-\rho} = \frac{2-\sigma}{(s-\rho)(2+it-\rho)} \ll \frac{1}{n^2}, \]
it follows that such zeros contribute an amount
\[ \ll \frac{\log q+\log(|t+n|+2)+\log(|t-n|+2)}{n^2} \ll \frac{\log q(\tau+n)}{n^2}. \]
On summing over n we obtain the stated estimate.
Proof Suppose that −1 ≤ σ ≤ 2, and that t is not the ordinate of a zero. Then
\[ \arg L(\sigma+it,\chi) = \arg L(2+it,\chi) - \int_{\sigma}^{2}\Im\frac{L'}{L}(\alpha+it,\chi)\,d\alpha. \]
Here arg L(2 + it, χ) ≪ 1 uniformly in t, by Theorem 4.8. Thus by
Lemma 12.6, the right-hand side above is
\[ -\sum_{|\gamma-t|\le 1}\int_{\sigma}^{2}\Im\frac{1}{\alpha+it-\rho}\,d\alpha + O(\log q\tau). \]
where
L q
C(χ ) = (1, χ ) + log − C0 (12.7)
L 2π
and
\[ R(x,T;\chi) \ll (\log x)\min\Bigl(1,\frac{x}{T\langle x\rangle}\Bigr) + \frac{x}{T}(\log qxT)^2. \tag{12.8} \]
Here ⟨x⟩ denotes the distance from x to the nearest prime power, other than x
itself.
if L(s, χ ) has the exceptional zero β1 . In this latter case, the sum over ρ includes
a large term due to ρ = 1 − β1 . This, however, is largely cancelled by C(χ),
since
\[ -\frac{x^{1-\beta_1}-1}{1-\beta_1} = -\frac{\log x}{1-\beta_1}\int_{0}^{1-\beta_1}x^{\sigma}\,d\sigma \ll x^{1-\beta_1}\log x. \tag{12.11} \]
Thus we have (12.15). It is also clear that (12.8) gives (12.16). To obtain (12.17),
we note that
x
u x x
1 x 2 log T
min 1, du ≤ 1+ du .
c T u T pk ≤2x x/T u T log x
12.1.1 Exercises
1. Suppose that |s − 1| ≥ 1. Show that
log ζ (s) = log(s − ρ) + O(log τ )
ρ
|γ −t|≤1
8. (Hardy & Littlewood 1918; Wigert 1920) (a) Let k be a non-negative integer.
Show that for s near −k, the Laurent expansion of Γ(s) begins
\[ \Gamma(s) = \frac{(-1)^k}{k!(s+k)} + \frac{(-1)^k}{k!}\frac{\Gamma'}{\Gamma}(k+1) + \cdots. \]
(b) Let k be a positive integer. Show that for s near −2k, the Laurent expansion
of $\frac{\zeta'}{\zeta}(s)$ begins
\[ \frac{\zeta'}{\zeta}(s) = \frac{1}{s+2k} - \frac{\Gamma'}{\Gamma}(2k+1) + \log 2\pi - \frac{\zeta'}{\zeta}(2k+1) + \cdots. \]
2
9. Suppose that a > 0, that x ≥ 1, and that x is not of the form e2a k
where k
is a positive integer. Show that
1 ∞
−(log x/n)2
√ (n) exp
2π a n=1 2a 2
= ea /2 x − ea ρ /2 x ρ + e2a k x −2k
2 2 2 2 2
ρ 0<k< log2x
2a
2 ∞
1 −(log x) ζ
(−(log x)/a 2 + it)e−a t /2 dt.
2 2
− exp
2π 2a 2 −∞ ζ
and
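\[ \int_{-\infty}^{\infty}e^{(\frac{1}{2}+\delta_0)2\pi|x|}\,|dF(x)| < \infty \tag{12.21} \]
where δ0 > 0 is fixed. Suppose that $F(x) = \tfrac{1}{2}(F(x^{-})+F(x^{+}))$ for all x, and
that F(x) + F(−x) = 2F(0) + O(|x|). Put
\[ \Phi(s) = \int_{-\infty}^{\infty}F(x)e^{-(s-1/2)2\pi x}\,dx \]
and
\[ \frac{\Gamma'}{\Gamma}(3/4) = -C_0 - 3\log 2 + \pi/2. \]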
Here C0 is Euler’s constant. Since |d(fg)| ≤ |f| |dg| + |g| |df|, from
(12.20) and (12.21) we see that $e^{a|x|}F(x)$ is of bounded variation for any a,
0 ≤ a ≤ (1/2 + δ0)2π. Hence $F(x) \ll \exp(-(1/2+\delta_0)2\pi|x|)$, and Φ(s) is
analytic in the strip −δ0 < σ < 1 + δ0. For |t| ≤ 1 we note that Φ(s) ≪ 1. For
|t| ≥ 1 we integrate by parts to see that
\[ \Phi(s) = \frac{1}{2\pi it}\int_{-\infty}^{\infty}e(-tx)\,d\bigl(F(x)\exp((1-2\sigma)\pi x)\bigr); \]
and similarly
L ∞
(1 − s) (s, χ ) = − (n)χ (n)n −1/2
L n=1
∞
1
× F −x + log n e−(s−1/2)2π x d x. (12.26)
−∞ 2π
From the estimate F(x) e−(1/2+δ0 )2π |x| we see that
∞
(n)n −1/2 F x − 2π 1
log n e−(1/2+δ1 )2π x d x
n −∞
⎛
∞
∞
⎜
−1/2
(n)n ⎝ e−(1+δ0 +δ1 )2π x n 1/2+δ0 d x
n=1
(log n)/(2π )
(log n)/(2π )
⎞
A similar calculation relates to the second term (12.26), and hence for
s = 1 + δ1 + it,
∞
L L * (t)
(s) (s, χ) + (1 − s) (s, χ ) = H (x)e(−t x) d x = H
L L −∞
12.2 Weil’s explicit formula 413
where
∞
(n) log n
H (x) = − χ (n)F x −
n=1
n 1/2 2π
log n
+ χ (n)F −x + e−(1/2+δ1 )2π x .
2π
Now H (x) is of bounded variation, since
(n)
log n −(1/2+δ1 )2π x
VarH ≤ 1/2
Var F x − e
n n 2π
(n)
log n −(1/2+δ1 )2π x
+ 1/2
Var F −x + e
n n 2π
=2 (n)n −1−δ1 Var F(x)e−(1/2+δ1 )2π x 1.
n
That is,
1+δ1 +i T
1 L L
lim (s) (s, χ) + (1 − s) (s, χ ) ds
T →∞ 2πi 1+δ −i T L L
1
−1 (n) − log n log n
= χ (n)F + χ (n)F .
2π n n 1/2 2π 2π
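The remaining terms from (12.24) contribute to the integral (12.23) an amount
\[ \frac{1}{2\pi i}\int_{1+\delta_1-iT}^{1+\delta_1+iT}G(s)\,ds, \]
where
\[ G(s) = \Big(E_0(\chi)\Big(\frac{1}{s}+\frac{1}{s-1}\Big) + \frac12\log\frac{q}{\pi} + \frac12\,\frac{\Gamma'}{\Gamma}\Big(\frac{s+\kappa}{2}\Big)\Big)\big(\Phi(s)+\Phi(1-s)\big). \]
By Cauchy's theorem this is
\[ \frac{1}{2\pi i}\int_{1/2-iT}^{1/2+iT}G(s)\,ds + E_0(\chi)\big(\Phi(0)+\Phi(1)\big) + O\Big(\frac{\log 2qT}{T}\Big). \]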
Since $\frac{\Gamma'}{\Gamma}(a+ibt) \ll \log(|t|+2)$, the inner integral above is $\ll T\log T$, uniformly
in $x$. Put $\delta = T^{-2/3}$. The contribution to the above by those $x$ for which
$|x| \le \delta$ is
\[ \ll \int_{-\delta}^{\delta}|x|\,T\log T\,dx \ll \delta^2 T\log T = T^{-1/3}\log T. \]
For $|x| \ge \delta$ we appeal to Theorem C.5 to estimate the inner integral. The error
term in Theorem C.5 contributes an amount
\[ \ll \int_{\delta}^{\infty}\min(x,1)\,T^{-1}x^{-2}\,dx \ll T^{-1}\log T. \]
On the right-hand side we see that $\int_{-\delta}^{0}\cdots \ll \delta$, so that
\[ \lim_{T\to\infty}\int_{-T}^{T}\frac{\Gamma'}{\Gamma}(a+ibt)\widehat{J}(t)\,dt = \frac{-2\pi}{b}\int_{0}^{\infty}J(-x)\frac{e^{-2\pi ax/b}}{1-e^{-2\pi x/b}}\,dx \]
provided that $J(0) = 0$. To obtain the general case we apply the above to
the function $K(x) = J(x) - J(0)e^{-\pi x^2/A}$ where $A > 0$ is large. Then
$\widehat{K}(t) = \widehat{J}(t) - J(0)\sqrt{A}\,e^{-\pi At^2}$, and hence
\[ \lim_{T\to\infty}\int_{-T}^{T}\frac{\Gamma'}{\Gamma}(a+ibt)\widehat{K}(t)\,dt = \lim_{T\to\infty}\int_{-T}^{T}\frac{\Gamma'}{\Gamma}(a+ibt)\widehat{J}(t)\,dt - J(0)\sqrt{A}\int_{-\infty}^{\infty}\frac{\Gamma'}{\Gamma}(a+ibt)e^{-\pi At^2}\,dt. \]
On combining these estimates, we see that (12.29) holds apart from an error
term $O(A^{-1/2})$, and we obtain the result since $A$ can be arbitrarily large.
12.3 Notes
Section 12.1. Let $\Pi(x) = \sum_{n\le x}\Lambda(n)/\log n$. Riemann (1859) gave a heuristic
proof that if $x > 1$, and $x$ is not a prime power, then
\[ \Pi(x) = \operatorname{Li}(x) - \sum_{\rho}\operatorname{Li}(x^{\rho}) - \log 2 + \int_{x}^{\infty}\frac{du}{(u^2-1)u\log u}. \]
Here the sum over the zeros is conditionally convergent, and it is to be un-
derstood that it is computed as the limit, as T → ∞, of the sum over those
zeros for which |γ | ≤ T . The above formula was first proved rigorously by von
Mangoldt (1895), and additional proofs were subsequently given by Landau
(1908a, b). For further discussion of the explicit formula in the form given by
Riemann, see Edwards (1974, Chapter 1). von Mangoldt (1895) also proved the
explicit formula (12.1). Landau (1909, Section 89) was the first to show that
the limit in (12.1) is attained uniformly for x in a compact interval not con-
taining a prime power. Cramér (1918) showed that (12.1) can be derived from
the above. von Koch (1910) and Landau (1912) estimated the error term that
arises when the explicit formula is truncated, as in Theorem 12.5. The explicit
formula for ψ0 (x, χ) was first established by Landau (1908b), but with not
so much attention to the constant term. In the customary form of this explicit
formula (cf. Davenport (2000, p. 117)), the constant term is expressed in terms
of the constant B(χ ) that arises in the Hadamard product formula for ξ (s, χ ).
Our presentation, which avoids this, is that of Vorhauer (2006).
Section 12.2. Although many specific explicit formulæ were derived by vari-
ous authors for a variety of purposes, it was Guinand (1942) who first suggested
that it would be possible to specify a general class of such formulæ. Guinand
(1948) did this assuming the Riemann Hypothesis, but it seems that he im-
posed RH only in order to obtain a wider class of test functions. Theorem
12.13 is a special case of the main result of Weil (1952), who treats general
L-functions associated with Grössencharaktere χ , which are representations
of the group of idèle-classes of an algebraic number field k into the multiplica-
tive group of non-zero complex numbers. Weil also showed that a necessary
and sufficient condition for the Riemann hypothesis to hold for L is that the
right-hand side corresponding to (12.22) is non-negative for all functions F of a
certain class. Gallagher (1987) widened the class of test functions in Guinand’s
formula and gave several applications. See also Besenfelder (1977a, b),
Yoshida (1992), Jorgenson, Lang & Goldfeld (1994), and Bombieri & Lagarias
(1999).
12.4 References
Barner, K. (1981). On A. Weil’s explicit formula, J. Reine Angew. Math. 323, 139–152.
Besenfelder, H.-J. (1977a). Die Weilsche “Explizite Formel” und temperierte Distribu-
tionen, J. Reine Angew. Math. 293–294, 228–257.
(1977b). Zur Nullstellenfreiheit der Riemannschen Zeta-funktion auf der Geraden
σ = 1, J. Reine Angew. Math. 295, 116–119.
Besenfelder, H.-J. & Palm, G. (1977). Einige Äquivalenzen zur Riemannschen Vermu-
tung, J. Reine Angew. Math. 293–294, 109–115.
Bombieri, E. & Lagarias, J. C. (1999). Complements to Li’s criterion for the Riemann
hypothesis, J. Number Theory 77, 274–287.
Cramér, H. (1918). Über die Herleitung der Riemannschen Primzahlformel, Arkiv för
Mat. Astr. Fys. 13, no. 24, 7 pp.
Davenport, H. (2000). Multiplicative Number Theory, Third Edition, Graduate Texts
Math. 74. New York: Springer-Verlag.
Edwards, H. M. (1974). Riemann’s Zeta Function, Pure and Applied Math. 58. New
York: Academic Press.
Gallagher, P. X. (1987). Applications of Guinand’s formula, Analytic number the-
ory and Diophantine problems (Stillwater, 1984), Progress in Math. 70. Boston:
Birkhäuser, pp. 135–157.
Guinand, A. P. (1937). A class of self-reciprocal functions connected with summation
formulæ, Proc. London Math. Soc. (2) 43, 439–448.
(1938). Summation formulæ and self-reciprocal functions, Quart. J. Math. Oxford
Ser. 9, 53–67.
(1939a). Finite summation formulæ, Quart. J. Math. 10, 38–44.
(1939b). Summation formulæ and self-reciprocal functions (II), Quart. J. Math. 10,
104–118.
(1939c). A formula for ζ (s) in the critical strip, J. London Math. Soc. 14, 97–100.
(1941). On Poisson’s summation formula, Ann. of Math. (2) 42, 591–603.
(1942). Summation formulæ and self-reciprocal functions (III), Quart. J. Math. 13,
30–39.
(1948). A summation formula in the theory of prime numbers, Proc. London Math.
Soc. 50, 107–119.
Hardy, G. H. & Littlewood, J. E. (1918). Contributions to the theory of the Riemann
zeta-function and the theory of the distribution of primes, Acta Math. 41, 119–196;
Collected Papers, Vol. 2. Oxford: Clarendon Press, 1967, pp. 20–97.
Ingham, A. E. (1932). The Distribution of Prime Numbers, Cambridge Tract No. 30.
Cambridge: Cambridge University Press.
Jorgenson, J., Lang, S., & Goldfeld, D. (1994). Explicit Formulas. Lecture Notes in
Math. 1593. Berlin: Springer-Verlag.
von Koch, H. (1910). Contributions à la théorie des nombres premiers, Acta Math. 33,
293–320.
Landau, E. (1908a). Neuer Beweis der Riemannschen Primzahlformel, Sitzungsber.
Königl. Preuß. Akad. Wiss. Berlin, 737–745; Collected Works, Vol. 4, Essen: Thales
Verlag, 1986, pp. 11–19.
(1908b). Nouvelle démonstration pour la formule de Riemann sur le nombre des
nombres premiers inférieurs à une limite donnée, et démonstration d’une formule
plus générale pour le cas des nombres premiers d’une progression arithmétique,
Ann. l’École Norm. Sup. (3) 25, 399–442; Collected Works, Vol. 4, Essen: Thales
Verlag, 1986, pp. 87–130.
(1909). Handbuch der Lehre von der Verteilung der Primzahlen. Leipzig: Teubner.
Reprint: New York: Chelsea, 1953.
(1912). Über einige Summen, die von den Nullstellen der Riemannschen Zetafunktion
abhängen, Acta Math. 35, 271–294; Collected Works, Vol. 5. Essen: Thales Verlag,
1986, pp. 62–85.
von Mangoldt, H. (1895). Zu Riemann’s Abhandlung “Ueber die Anzahl der Primzahlen
unter einer gegebenen Grösse”, J. Reine Angew. Math. 114, 255–305.
Riemann, B. (1859). Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse,
Monatsber. Kgl. Preuss. Akad. Wiss. Berlin, 671–680; Werke, Leipzig: Teubner,
1876, pp. 3–47. Reprint: New York: Dover, 1953.
de la Vallée Poussin, C. J. (1896). Recherches analytiques sur la théorie des nombres
premiers, I–III, Ann. Soc. Sci. Bruxelles 20, 183–256, 281–362, 363–397.
Vorhauer, U. M. A. (2006). The Hadamard product formula for Dirichlet L-functions,
to appear.
Weil, A. (1952). Sur les “formules explicites” de la théorie des nombres premiers, Comm.
Sém. Math. Univ. Lund [Medd. Lunds Univ. Mat. Sem.], Tome Supplementaire,
252–265.
Wigert, S. (1920). Sur la théorie de la fonction ζ (s) de Riemann, Ark. Mat. 14, 1–17.
Yoshida, H. (1992). On Hermitian forms attached to zeta functions, Zeta functions in
geometry (Tokyo, 1990), Adv. Stud. Pure Math. 21. Tokyo: Kinokuniya, pp. 281–325.
13
Conditional estimates
The factor $(\log x)^2$ in (13.2) can be avoided if we take smoother weights.
For example, put
\[ \psi_1(x) = \sum_{n\le x}(x-n)\Lambda(n). \tag{13.6} \]
and
\[ \pi(x) - \operatorname{li}(x) = \frac{\vartheta(x)-x}{\log x} + O\Big(\frac{x^{1/2}}{(\log x)^2}\Big). \tag{13.10} \]
Proof By an easy elaboration on Corollary 2.5, we see that
\[ \vartheta(x) = \psi(x) - \psi\big(x^{1/2}\big) + O\big(x^{1/3}\big). \]
assuming RH. This improves on (13.12) when $|\gamma| < x/h$. We use this estimate
for the size of the summand together with Theorem 10.13 to see that the sum
in (13.11) is $\ll hx^{1/2}\log x/h$. Hence if $h = Cx^{1/2}\log x$, then
\[ \sum_{x-h<n<x+h}\Lambda(n) \ge \frac{h}{2}. \]
To complete the proof it remains to estimate the contribution made by higher
powers of primes on the left-hand side. The number of squares in this interval is
$\ll \log x$, so the squares of the primes contribute an amount that is $\ll (\log x)^2$.
For each $k > 2$ there is at most one $k$th power in the interval. Moreover, if $p^k$ is
in the interval, then $k \ll \log x$. Hence the higher powers contribute an amount
$\ll (\log x)^2$, and the proof is complete.
if $0 \le h \le x$ and $\beta = 1/2$. Taking this a step further, we see that the above is
if we assume RH. Here the sum has $\asymp xh^{-1}\log x/h$ terms, and with sums of
independent random variables in mind, we might guess that the above sum is
$\ll (x/h)^{1/2+\varepsilon}$, which suggests
Note that if we were to use the pointwise bound of Theorem 13.1 to bound
the left-hand side above, then we would obtain an estimate that is larger than
the above by a factor $(\log X)^4$. From the above we see that $\psi(x) = x + O(x^{1/2})$
on average.
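The averaged statement can be explored numerically. The following is only an illustrative sketch, not part of the text: it tabulates $\Lambda(n)$ with a sieve, forms $\psi(x)$ at integers, and approximates $X^{-2}\int_X^{2X}(\psi(x)-x)^2\,dx$ by a sum over integer $x$. The function names are hypothetical, and a finite computation of this kind says nothing about RH; it merely shows the size of $\psi(x)-x$ in a small range.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = True
            # Lambda(p^k) = log p for every prime power p^k <= limit
            pk = p
            while pk <= limit:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def psi_table(limit):
    """Return psi with psi[x] = sum of Lambda(n) over n <= x, for integers x."""
    lam = von_mangoldt_table(limit)
    psi = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        psi[n] = psi[n - 1] + lam[n]
    return psi

def mean_square_error(X):
    """Approximate X^{-2} * integral_X^{2X} (psi(x) - x)^2 dx by a sum over integers."""
    psi = psi_table(2 * X)
    total = sum((psi[x] - x) ** 2 for x in range(X, 2 * X))
    return total / X ** 2

if __name__ == "__main__":
    for X in (10**3, 10**4, 10**5):
        print(X, mean_square_error(X))
```

If $\psi(x) - x$ is of size $x^{1/2}$ on average, the printed ratios should stay bounded as $X$ grows, which is what the normalisation by $X^2$ is designed to display.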
where
∞
2X 2
R(x)2 d x X (log X )4 + log p k 1+ u −2 du
X X/2< p k <3X 1
X (log X )4 .
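In view of the symmetry of zeros about the real axis, we may confine our
attention to $\gamma_1 > 0$. For each such zero, we consider $\gamma_2$ in various ranges. By
Theorem 10.13, the sum over $\gamma_2 < -\gamma_1$ is
\[ \ll \sum_{\gamma_2<-\gamma_1}\frac{1}{|\gamma_2|(1+|\gamma_1-\gamma_2|)} \ll \sum_{\gamma_2<-\gamma_1}\frac{1}{\gamma_2^2} \ll \sum_{n>\gamma_1}\frac{\log n}{n^2} \ll \frac{\log\gamma_1}{\gamma_1}. \]
We sum these estimates, multiply by $1/\gamma_1$, and sum over $\gamma_1$ to see that the
expression (13.16) is
\[ \ll \sum_{\gamma_1>0}\frac{(\log\gamma_1)^2}{\gamma_1^2} \ll \sum_{n=1}^{\infty}\frac{(\log n)^3}{n^2} < \infty. \]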
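By averaging $|f(u)|^2$ over a longer interval we obtain not just an upper bound,
but an asymptotic formula.
Theorem 13.6 Assume RH, and let $f(u)$ be defined as in (13.17). Then
\[ \lim_{U\to\infty}\frac{1}{U}\int_{0}^{U}|f(u)|^2\,du = \sum_{\text{distinct }\gamma}\frac{m_\rho^2}{|\rho|^2}. \]
Proof Since the explicit formula for $\psi_0(x)$ is uniformly convergent in intervals
free of prime powers, and is boundedly convergent in a neighbourhood of a
prime power, it follows that
\[ \frac{1}{U}\int_{1}^{U}|f(u)|^2\,du = \lim_{T\to\infty}\sum_{\substack{\gamma_1,\gamma_2\\ |\gamma_i|\le T}}\frac{1}{\rho_1\overline{\rho_2}}\cdot\frac{1}{U}\int_{1}^{U}e^{i(\gamma_1-\gamma_2)u}\,du + o(1) \]
\[ = \Big(1-\frac{1}{U}\Big)\sum_{\substack{\gamma_1,\gamma_2\\ \gamma_1=\gamma_2}}\frac{1}{|\rho_1|^2} + O\Big(\sum_{\substack{\gamma_1,\gamma_2\\ \gamma_1\ne\gamma_2}}\frac{1}{|\gamma_1\gamma_2|}\min\Big(1,\frac{1}{U|\gamma_1-\gamma_2|}\Big)\Big) + o(1). \]
Suppose that $\rho = 1/2 + i\gamma$ is a zero, and that its multiplicity is $m_\rho$. Then the
equation $\gamma_i = \gamma$ has $m_\rho$ solutions for $i = 1$ and for $i = 2$. Thus there are $m_\rho^2$
pairs $(\gamma_1, \gamma_2)$ such that $\gamma_1 = \gamma_2 = \gamma$, so we have the result.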
We now return to the distribution of primes in arithmetic progressions.
Theorem 13.7 Let $q$ be given, and suppose that GRH holds for all L-functions
modulo $q$. Then for $x \ge 2$,
\[ \psi(x,\chi) = E_0(\chi)x + O\big(x^{1/2}(\log x)(\log qx)\big), \tag{13.19} \]
\[ \vartheta(x,\chi) = E_0(\chi)x + O\big(x^{1/2}(\log x)(\log qx)\big), \tag{13.20} \]
\[ \pi(x,\chi) = E_0(\chi)\operatorname{li}(x) + O\big(x^{1/2}\log qx\big) \tag{13.21} \]
where $E_0(\chi) = 1$ or $0$ according as $\chi = \chi_0$ or not.
Proof For $\chi_0$ these relations follow from Theorem 1 and (12.14). Suppose
that $\chi$ is non-principal, and that $\chi^\star$ is a primitive character that induces $\chi$. Thus
$\chi^\star$ is a character modulo $d$ for some $d \mid q$, $1 < d \le q$. By taking $T = x$ in the
explicit formula for $\psi(x,\chi^\star)$, and appealing to Theorem 10.17, we see that
\[ \psi(x,\chi^\star) \ll x^{1/2}(\log qx)(\log x), \]
and then by (12.14) we have (13.19). By the triangle inequality, $|\psi(x,\chi) - \vartheta(x,\chi)| \le \psi(x) - \vartheta(x)$.
From Corollary 2.5 we know that this latter quantity
is $\ll x^{1/2}$, so (13.20) follows from (13.19). On inserting (13.20) into the identity
\[ \pi(x,\chi) = \frac{\vartheta(x,\chi)}{\log x} + \int_{2}^{x}\frac{\vartheta(u,\chi)}{u(\log u)^2}\,du, \]
we obtain (13.21).
Corollary 13.8 Let $q$ be given, and assume GRH for all L-functions modulo
$q$. Suppose that $(a, q) = 1$. Then for $x \ge 2$,
\[ \psi(x;q,a) = \frac{x}{\varphi(q)} + O\big(x^{1/2}(\log x)^2\big), \tag{13.22} \]
\[ \vartheta(x;q,a) = \frac{x}{\varphi(q)} + O\big(x^{1/2}(\log x)^2\big), \tag{13.23} \]
\[ \pi(x;q,a) = \frac{\operatorname{li}(x)}{\varphi(q)} + O\big(x^{1/2}\log x\big). \tag{13.24} \]
Note that trivially,
\[ 0 \le \psi(x;q,a) \le (\log x)\sum_{\substack{0<n\le x\\ n\equiv a\ (q)}}1 \le (\log x)(1+x/q). \]
Thus we see that the bound (13.22) is worse than trivial if $q > x^{1/2}$. However,
if $q$ is smaller, say $q \le x^{\theta}$ with $\theta < 1/2$, then (13.22) provides a form of the
Prime Number Theorem for arithmetic progressions with a much better error
term than we were able to prove unconditionally (cf. Corollary 11.17).
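To see how the main term $x/\varphi(q)$ and the trivial bound $(\log x)(1+x/q)$ compare in practice, one can tabulate $\psi(x;q,a)$ directly. The sketch below is an illustration under simple assumptions (trial division for $\Lambda(n)$, small $x$); the helper names are hypothetical and not taken from the text.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # n was a power of p exactly when nothing is left over
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # no factor up to sqrt(n): n is prime

def psi_progression(x, q, a):
    """psi(x; q, a): sum of Lambda(n) over n <= x with n congruent to a mod q."""
    n = a % q
    if n == 0:
        n = q
    total = 0.0
    while n <= x:
        total += von_mangoldt(n)
        n += q
    return total

def euler_phi(q):
    """Euler's totient, by direct count of reduced residues."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)

if __name__ == "__main__":
    x = 10**4
    for q in (3, 7, 101, 1009):
        value = psi_progression(x, q, 1)
        main_term = x / euler_phi(q)
        trivial = math.log(x) * (1 + x / q)
        print(q, round(value, 2), round(main_term, 2), round(trivial, 2))
```

For $q$ small compared with $x^{1/2}$ the computed value tracks $x/\varphi(q)$ closely, while for $q$ well beyond $x^{1/2}$ the progression contains so few terms that the trivial bound is the more informative one.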
Proof In view of the remarks above, we may assume that $q \le x^{1/2}$. By (11.22)
we see that
\[ \psi(x;q,a) - \frac{x}{\varphi(q)} = \frac{\psi(x,\chi_0)-x}{\varphi(q)} + \frac{1}{\varphi(q)}\sum_{\chi\ne\chi_0}\overline{\chi}(a)\psi(x,\chi). \tag{13.25} \]
and so (13.22) follows from (13.19). The other relations are proved
similarly.
Since $L(s,\chi)$ has $\asymp \log q$ zeros with $\gamma \ll 1$, we expect (assuming GRH) that
$\psi(x,\chi)$ is usually about $(x\log q)^{1/2}$ in size. Thus the estimates of Theorem 13.7
are close to what we presume would be best possible. On the right-hand side
of (13.25), we have $\varphi(q)$ terms. With sums of independent random variables in
mind, we would expect therefore that the right-hand side of (13.25) is usually
of order $(x(\log q)/\varphi(q))^{1/2}$. Since we are unable to prove that there is cancellation
in (13.25), we have no recourse but to use the triangle inequality, as in (13.26).
However, we conjecture that a lot has been lost at this point.
for arbitrary complex numbers $c(\chi)$. To understand why this holds, expand the
left-hand side and take the sum over $a$ inside, to see that it is
\[ = \sum_{\chi_1}\sum_{\chi_2}c(\chi_1)\overline{c(\chi_2)}\sum_{\substack{a=1\\ (a,q)=1}}^{q}\chi_1(a)\overline{\chi_2}(a). \]
By the basic orthogonality property of Dirichlet characters (cf. (4.14)), the inner
sum here is $\varphi(q)$ if $\chi_1 = \chi_2$, and is 0 otherwise, and this gives (13.27). By
taking $c(\chi) = (\psi(x,\chi) - E_0(\chi)x)/\varphi(q)$, it follows by (11.22) that
\[ \sum_{\substack{a=1\\ (a,q)=1}}^{q}\big(\psi(x;q,a) - x/\varphi(q)\big)^2 = \frac{1}{\varphi(q)}\sum_{\chi}|\psi(x,\chi) - E_0(\chi)x|^2, \]
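The orthogonality argument above can be checked numerically. The sketch below assumes that (13.27) is the identity $\sum_{(a,q)=1}\big|\sum_\chi c(\chi)\chi(a)\big|^2 = \varphi(q)\sum_\chi|c(\chi)|^2$, which is what the expansion just described produces; the display itself is not reproduced in this passage. For simplicity it treats only an odd prime modulus $p$, where the characters are generated from a primitive root. All function names are hypothetical.

```python
import cmath
import math
import random

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p, found by brute force."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no primitive root found; p should be an odd prime")

def characters_mod_prime(p):
    """All p-1 Dirichlet characters mod p, each as a dict a -> chi(a), 1 <= a < p."""
    g = primitive_root(p)
    index = {}  # discrete logarithm table: index[g^k mod p] = k
    x = 1
    for k in range(p - 1):
        index[x] = k
        x = x * g % p
    chars = []
    for m in range(p - 1):
        # chi_m(g^k) = e(m k / (p - 1))
        chars.append({a: cmath.exp(2j * math.pi * m * index[a] / (p - 1))
                      for a in index})
    return chars

def parseval_sides(p, coeffs):
    """Return both sides of the mean-square identity for the given coefficients."""
    chars = characters_mod_prime(p)
    lhs = sum(abs(sum(c * chi[a] for c, chi in zip(coeffs, chars))) ** 2
              for a in range(1, p))
    rhs = (p - 1) * sum(abs(c) ** 2 for c in coeffs)
    return lhs, rhs

if __name__ == "__main__":
    random.seed(1)
    p = 11
    coeffs = [complex(random.random(), random.random()) for _ in range(p - 1)]
    print(parseval_sides(p, coeffs))
```

The two printed numbers agree up to rounding, whatever coefficients are chosen, which is the content of the identity: the cross terms $\chi_1 \ne \chi_2$ vanish after summing over $a$.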
for $x \ge C(\log q)(\log\log q)$, it follows by taking $x = C(\log q)^2(\log\log q)^2$ in
(13.19) that $n(\chi) \ll (\log q)^2(\log\log q)^2$. As was the case with Cramér's the-
orem (Theorem 13.3), we can do slightly better by using a weighted sum of
primes.
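For a concrete feel for the size of $n(\chi)$, one can take $\chi$ to be the Legendre symbol modulo an odd prime $p$, so that the least $n$ with $\chi(n) \ne 1$ is the least quadratic non-residue. This reading of $n(\chi)$ is an assumption, since the definition lies outside this passage. The sketch below uses Euler's criterion, and the function names are hypothetical.

```python
import math

def least_quadratic_nonresidue(p):
    """Least n >= 2 with Legendre symbol (n/p) = -1, for an odd prime p."""
    for n in range(2, p):
        # Euler's criterion: n^((p-1)/2) is congruent to (n/p) modulo p
        if pow(n, (p - 1) // 2, p) == p - 1:
            return n
    raise ValueError("p must be an odd prime")

def odd_primes_up_to(limit):
    """Odd primes not exceeding limit, by the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [p for p in range(3, limit + 1) if sieve[p]]

if __name__ == "__main__":
    # Track the largest value of n2(p) / (log p)^2 over odd primes p <= 10^5.
    best = max(odd_primes_up_to(10**5),
               key=lambda p: least_quadratic_nonresidue(p) / math.log(p) ** 2)
    n2 = least_quadratic_nonresidue(best)
    print(best, n2, n2 / math.log(best) ** 2)
```

In this small range the ratio $n_2(p)/(\log p)^2$ stays well below 1, which is compatible with, though of course no evidence for, the conditional bound.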
On pulling the contour to the line $\sigma = 1/4$, we see that the above is
\[ -\sum_{\rho}\frac{x^{\rho+1}}{\rho(\rho+1)} - \frac{x^{5/4}}{2\pi}\int_{-\infty}^{\infty}\frac{L'}{L}(1/4+it,\chi)\frac{x^{it}}{(1/4+it)(5/4+it)}\,dt. \]
By Theorem 10.17, the sum over $\rho$ is $\ll x^{3/2}\log q$. By Theorem 10.17 with
Lemma 12.7, we see that $\frac{L'}{L}(1/4+it,\chi) \ll \log q\tau$. Hence the second term
above is $\ll x^{5/4}\log q$. Thus
\[ \sum_{n\le x}\chi(n)\Lambda(n)(x-n) \ll x^{3/2}\log q. \tag{13.28} \]
if $x \ge C(\log q)(\log\log q)$. If $\chi(n) = \chi_0(n)$ for all prime powers $n \le x$, then
the left-hand sides of (13.28) and (13.29) are equal. However, the right-
hand sides are inconsistent if we take $x = C(\log q)^2$, so we obtain the stated
result.
Weaker hypotheses concerning the zeros of L(s, χ ) also imply bounds for
n(χ ). The argument here depends on a careful selection of the kernel in the
inverse Mellin transform.
If $\rho$ is a zero for which $\beta > 1 - \delta$ and $\rho$ is not real, then by hypothesis we have
$|\gamma| \ge \delta^2\log q$. The summand in (13.32) is $\ll x/|\rho-1|$, so that from (13.30)
with $R = \delta^2\log q$ we see that such zeros contribute an amount $\ll x/\delta^2$. On
combining these estimates we find that there is an absolute constant $c_1 > 0$
such that
\[ \sum_{n}w(n)\chi(n)\Lambda(n) \le c_1\big(x^{1-\delta}y^{2\delta}\delta^{-1}\log q + x\delta^{-2}\big). \tag{13.33} \]
13.1.1 Exercises
1. Let $\Theta = \sup_\rho \beta$ where $\rho$ runs over all non-trivial zeros of $\zeta(s)$. Show that
where
\[ S(\rho) = \frac{(x+h+\Delta)^{\rho+1} - (x+h)^{\rho+1} - x^{\rho+1} + (x-\Delta)^{\rho+1}}{\Delta\rho(\rho+1)}. \]
(b) Show that if RH holds, then $S(\rho) \ll hx^{-1/2}$ for $|\gamma| \le x/h$, that
$S(\rho) \ll x^{1/2}/|\gamma|$ for $x/h \le |\gamma| \le x/\Delta$, and that $S(\rho) \ll x^{3/2}/\Delta\gamma^2$ for
$|\gamma| \ge x/\Delta$.
(c) Show that if RH holds, then
\[ \psi(x+h) - \psi(x) = h + O\Big(x^{1/2}(\log x)\log\frac{2h}{x^{1/2}\log x}\Big) \]
uniformly for $x^{1/2}\log x \le h \le x$.
3. Assume RH. Show that
\[ \int_{2}^{X}(\psi(x)-x)^2\,\frac{dx}{x^2} \sim (\log X)\sum_{\rho}\frac{m_\rho^2}{|\rho|^2} \]
as $X \to \infty$.
4. Assume RH. Suppose that $T$ is given, $T \ge 2$, and let $f(u)$ be defined as in
(13.17). Show that
\[ \lim_{U\to\infty}\frac{1}{U}\int_{1}^{U}\Big|f(u) + \sum_{\substack{\rho\\ |\gamma|\le T}}\frac{e^{i\gamma u}}{\rho}\Big|^2\,du = \sum_{\substack{\rho\\ |\gamma|>T}}\frac{m_\rho^2}{|\rho|^2}. \]
(b) Assume that GRH holds for the $d - 1$ L-functions $L(s,\chi^k)$ where
$0 < k < d$. Show that for each $d$th root of unity $e(a/d)$ there is a prime
$p$ such that $\chi(p) = e(a/d)$, with $p \ll d^2(\log q)^2$.
9. (Montgomery 1971, p. 122) Let $\mathcal{P}(y)$ denote the set of those primes $p$ such
that $\big(\frac{n}{p}\big) = 1$ for all $n \le y$, and let $P(y)$ be the product of all primes not
exceeding $y$. Suppose that $2 \le y \le x$.
(a) Explain why
\[ \sum_{\substack{x<p\le 2x\\ p\in\mathcal{P}(y)}}\log p = 2^{-\pi(y)}\sum_{x<p\le 2x}(\log p)\prod_{p_1\le y}\Big(1+\Big(\frac{p_1}{p}\Big)\Big). \]
(b) For each $m \mid P(y)$, $m > 1$, let $\chi_m$ be the quadratic character determined
by quadratic reciprocity so that $\chi_m(p) = \prod_{p_1\mid m}\big(\frac{p_1}{p}\big)$. Also, let $\chi_1(n) = 1$
for all $n$. Explain why the above is
\[ = 2^{-\pi(y)}\sum_{m\mid P(y)}\big(\vartheta(2x,\chi_m) - \vartheta(x,\chi_m)\big). \]
(c) Assume GRH for all quadratic L-functions. Show that the above is
\[ = 2^{-\pi(y)}x(1+o(1)) + O\big(x^{1/2}(\log x)^2\big). \]
(d) Show that if $y = \frac23(\log x)(\log\log x)$, then the above is positive, for all
sufficiently large $x$.
(e) Let $n_2(p)$ denote the least quadratic non-residue of $p$, which is to say
the least positive integer $n$ such that $\big(\frac{n}{p}\big) = -1$. Show that if GRH
is true for all quadratic L-functions, then there exist infinitely many
primes $p$ such that $n_2(p) > \frac23(\log p)(\log\log p)$.
10. (Littlewood 1924a; cf. Goldston 1982)
(a) Show (unconditionally) that
\[ \psi(x) \le x - \sum_{\rho}\frac{(x+h)^{\rho+1} - x^{\rho+1}}{h\rho(\rho+1)} + O(h) \]
for $2 \le h \le x/2$.
(b) Show (unconditionally) that
\[ \psi(x) \ge x - \sum_{\rho}\frac{x^{\rho+1} - (x-h)^{\rho+1}}{h\rho(\rho+1)} - O(h) \]
for $2 \le h \le x/2$.
13.2 Estimates for the zeta function
provided that $s \ne 1$ and that $\zeta(s) \ne 0$. This much is true unconditionally, but
from now on we assume RH, and show that the sum on the left provides a useful
approximation to $-\frac{\zeta'}{\zeta}(s)$ when $\sigma > 1/2$.
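Proof If $\sigma \ge 1/2$, then $|y^{\rho-s} - 1| \le 2$. Hence for $\sigma > 1/2$, the sum over $\rho$
in (13.25) has absolute value not exceeding
\[ \frac{2x^{1/2-\sigma}}{\log y}\sum_{\rho}\frac{1}{|s-\rho|^2}. \]
\[ = \operatorname{Re}\frac{\zeta'}{\zeta}(s) + \frac12\operatorname{Re}\frac{\Gamma'}{\Gamma}(s/2+1) - \frac12\log\pi + \frac{\sigma-1}{(\sigma-1)^2+t^2}, \]
and by Theorem C.1 this is
\[ = \operatorname{Re}\frac{\zeta'}{\zeta}(s) + \frac12\log\tau + O(1). \]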
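On inserting this in (13.35), we find that
\[ \frac{\zeta'}{\zeta}(s) = -\sum_{n\le xy}w(n)\frac{\Lambda(n)}{n^s} + \frac{2\theta x^{1/2-\sigma}}{(\sigma-1/2)\log y}\,\frac{\zeta'}{\zeta}(s) + O\Big(\frac{x^{1/2-\sigma}\log\tau}{(\sigma-1/2)\log y}\Big) + O\Big(\frac{(xy)^{1-\sigma}}{\tau^2}\Big) + O\Big(\frac{y^{1-\sigma}}{\tau^2}\Big) \tag{13.37} \]
where $\theta$ is a complex number satisfying $|\theta| \le 1$. Thus
\[ \frac{\zeta'}{\zeta}(s) \ll \Big|\sum_{n\le xy}w(n)\frac{\Lambda(n)}{n^s}\Big| + \frac{x^{1/2-\sigma}\log\tau}{(\sigma-1/2)\log y} + \frac{(xy)^{1-\sigma}}{\tau^2} + \frac{y^{1-\sigma}}{\tau^2} \tag{13.38} \]
provided that
\[ \frac{2x^{1/2-\sigma}}{(\sigma-1/2)\log y} \le c < 1. \tag{13.39} \]
We take
\[ y = \exp\Big(\frac{1}{\sigma-1/2}\Big), \qquad x = (\log\tau)^2/y. \]
Then the left-hand side of (13.39) is $2e(\log\tau)^{1-2\sigma}$, and so (13.39) holds with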
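Proof Since
\[ \log\zeta(\sigma+it) = \log\zeta(3/2+it) - \int_{\sigma}^{3/2}\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha, \]
it follows by the triangle inequality that
\[ |\log\zeta(\sigma+it)| \le |\log\zeta(3/2+it)| + \int_{\sigma}^{3/2}\Big|\frac{\zeta'}{\zeta}(\alpha+it)\Big|\,d\alpha, \]
which by Corollary 13.13 is
\[ \le |\log\zeta(3/2+it)| + \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{\log n}\big(n^{-\sigma} - n^{-3/2}\big) + O\Big(\frac{(\log\tau)^{2-2\sigma}}{\log\log\tau}\Big). \]
But
\[ |\log\zeta(3/2+it)| = \Big|\sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log n}n^{-3/2-it}\Big| \le \sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log n}n^{-3/2}, \]
so it follows that
\[ |\log\zeta(\sigma+it)| \le \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{\log n}n^{-\sigma} + \sum_{n>(\log\tau)^2}\frac{\Lambda(n)}{\log n}n^{-3/2} + O\Big(\frac{(\log\tau)^{2-2\sigma}}{\log\log\tau}\Big). \tag{13.41} \]
By the Chebyshev estimate $\psi(x) \ll x$ we see that
\[ \sum_{U<n\le 2U}\frac{\Lambda(n)}{\log n}n^{-3/2} \ll U^{-1/2}(\log U)^{-1}. \]
for $1 \le n \le z$, so that
\[ \sum_{n\le z}\frac{\Lambda(n)}{\log n}\big(n^{-\sigma} - n^{-1}\big) \ll |\sigma-1|\sum_{n\le z}\frac{\Lambda(n)}{n} \ll |\sigma-1|\log z \ll 1. \]
On combining these estimates with $z = (\log\tau)^2$, we see that the sum in (13.40)
is $\le \log\log\log\tau + O(1)$, which gives the desired estimate.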
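Concerning (13.44), we note that
\[ \sum_{n\le z}\frac{\Lambda(n)}{\log n}n^{-\sigma} = \int_{2^-}^{z}\frac{1}{u^{\sigma}\log u}\,d\psi(u) = \int_{2}^{z}\frac{du}{u^{\sigma}\log u} + \frac{\psi(z)-z}{z^{\sigma}\log z} + \frac{2^{1-\sigma}}{\log 2} + \int_{2}^{z}\frac{\psi(u)-u}{u^{\sigma+1}\log u}\Big(\sigma+\frac{1}{\log u}\Big)\,du. \tag{13.45} \]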
Corollary 13.17 Assume RH. Then $|\zeta(1+it)| \le 2e^{C_0}\log\log\tau + O(1)$.
To complete the picture, we estimate $|\zeta(s)|$ and $\arg\zeta(s)$ when $\sigma$ is near 1/2.
Of these estimates, the upper bound for $|\zeta(s)|$ is the most immediate.
Theorem 13.18 Assume RH. There is an absolute constant $C > 0$ such that
\[ |\zeta(s)| < \exp\Big(\frac{C\log\tau}{\log\log\tau}\Big) \]
uniformly for $\sigma \ge 1/2$, $|t| \ge 1$.
Here the first member on the right-hand side is bounded by Corollary 13.15,
and 0 ≤ σ1 − σ ≤ 1/ log log τ , so we have the stated bound.
To obtain the remaining estimates, we first establish two lemmas, which are
of interest in their own right.
Here each summand has positive real part, and for $T \le \gamma \le T + 1/\log\log T$
the real part is $\ge \frac12\log\log T$, so we obtain the stated bound.
Here the first term on the right-hand side is $\ll \log\tau$, by Corollary 13.14. Let
$k$ be a positive integer, and consider zeros for which $k/\log\log\tau \le |\gamma - t| \le
(k+1)/\log\log\tau$. By the preceding lemma, there are $\ll (\log\tau)/\log\log\tau$ such
zeros, each one of which contributes an amount $\ll (\log\log\tau)/k^2$ to the above
sum. On summing over $k$ we see that the contribution of zeros for which
$|\gamma - t| > 1/\log\log\tau$ is $\ll \log\tau$. Finally, for the zeros with $|\gamma - t| \le 1$, we observe
that $|1/(s_1-\rho)| \le \log\log\tau$, and there are $\ll (\log\tau)/\log\log\tau$ such zeros, so
we have the stated result.
If $t$ is not the ordinate of a zero of the zeta function, then we define $\arg\zeta(s)$
by continuous variation along the ray $\alpha + it$ where $\alpha$ runs from $\sigma$ to $+\infty$,
and $\arg\zeta(+\infty+it) = 0$. If $t$ is the ordinate of a zero, then we put
$\arg\zeta(s) = (\arg\zeta(\sigma+it^+) + \arg\zeta(\sigma+it^-))/2$.
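Proof We may assume that $t$ is not the ordinate of a zero. Let $\sigma_1$ and $s_1$
be defined as in the preceding proof. If $\sigma \ge \sigma_1$, then the above follows from
Corollary 13.16. Suppose now that $1/2 \le \sigma \le \sigma_1$. Then
\[ \arg\zeta(s) = \arg\zeta(s_1) - \int_{\sigma}^{\sigma_1}\operatorname{Im}\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha. \]
Since $0 \le \sigma_1 - \sigma \le 1/\log\log\tau$, by Lemma 13.20 the right-hand side above is
\[ = -\sum_{|\gamma-t|\le 1/\log\log\tau}\int_{\sigma}^{\sigma_1}\operatorname{Im}\frac{1}{\alpha+it-\rho}\,d\alpha + O\Big(\frac{\log\tau}{\log\log\tau}\Big). \]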
Although a lower bound for $|\zeta(s)|$ at all heights is out of the question, we
can show, assuming RH, that there are heights for which a lower bound can be
established.
Theorem 13.22 Assume RH. There is an absolute constant $C$ such that for
every $T \ge 4$ there is a $t$, $T \le t \le T + 1$, such that
\[ |\zeta(s)| \ge \exp\Big(\frac{-C\log T}{\log\log T}\Big) \]
uniformly for $-1 \le \sigma \le 2$.
that
\[ \int_{T}^{T+1}\log\frac{1}{\min_{\sigma\in I}|\zeta(s)|}\,dt \ll \frac{\log T}{\log\log T}. \tag{13.46} \]
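Since this lower bound applies for all $\sigma \in I$, the above provides a lower bound
for $\log\min_{\sigma\in I}|\zeta(s)|$. We note that
\[ \int_{1/2}^{\sigma_1}\int_{\gamma-\delta}^{\gamma+\delta}\operatorname{Re}\frac{1}{\alpha+it-\rho}\,dt\,d\alpha = \int_{0}^{\delta}\int_{-\delta}^{\delta}\frac{x}{x^2+y^2}\,dy\,dx \le \int_{-\pi/2}^{\pi/2}\int_{0}^{2\delta}\frac{r\cos\theta}{r^2}\,r\,dr\,d\theta = 4\delta. \]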
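The passage to polar coordinates replaces the box $0 \le x \le \delta$, $|y| \le \delta$ by the larger half-disc of radius $2\delta$, so the bound $4\delta$ is not sharp. As a sanity check, and not as part of the argument, the box integral can be evaluated: the inner integral is $2\arctan(\delta/x)$, and integrating this over $0 \le x \le \delta$ gives $\delta(\pi/2 + \log 2)$. The short sketch below confirms this numerically with a midpoint rule; the function name is hypothetical.

```python
import math

def box_integral(delta, steps=100000):
    """Midpoint rule for the integral of x/(x^2+y^2) over 0<=x<=delta, |y|<=delta.

    The inner y-integral is done in closed form: it equals 2*atan(delta/x).
    """
    h = delta / steps
    return h * sum(2.0 * math.atan(delta / ((i + 0.5) * h)) for i in range(steps))

if __name__ == "__main__":
    delta = 0.3
    print(box_integral(delta))                    # numerical value
    print(delta * (math.pi / 2 + math.log(2)))    # closed form
    print(4 * delta)                              # the bound used in the text
```

Since $\pi/2 + \log 2 < 4$, the numerical value sits comfortably below $4\delta$; only the order of magnitude $\delta$ matters for the estimate that follows.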
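Hence
\[ \int_{T}^{T+1}\int_{1/2}^{\sigma_1}\sum_{\substack{\rho\\ |\gamma-t|\le\delta}}\operatorname{Re}\frac{1}{\alpha+it-\rho}\,d\alpha\,dt \ll \sum_{\substack{\rho\\ T-1\le\gamma\le T+2}}\delta \ll \frac{\log T}{\log\log T}, \]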
Theorem 13.23 Assume RH. There is a constant $C > 0$ such that if $|t| \ge 1$,
then
\[ \Big|\frac{1}{\zeta(s)}\Big| \le \begin{cases} \exp\Big(\dfrac{C\log\tau}{\log\log\tau}\Big) & \text{for } \sigma \ge 1/2 + 1/\log\log\tau, \\[2ex] \exp\Big(\dfrac{C\log\tau}{\log\log\tau}\log\dfrac{e}{(\sigma-1/2)\log\log\tau}\Big) & \text{for } 1/2 < \sigma \le 1/2 + 1/\log\log\tau. \end{cases} \]
Proof The first part follows from Corollary 13.14. Let $\sigma_1$ and $s_1$ be defined
as in the proof of Lemma 13.20, and suppose that $1/2 < \sigma \le \sigma_1$. Then
\[ \log\zeta(s) = \log\zeta(s_1) - \int_{\sigma}^{\sigma_1}\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha. \]
Here the first term on the right is $\ll (\log\tau)/\log\log\tau$, by Corollary 13.16. By
Lemma 13.19 we know that the sum in Lemma 13.20 has $\ll (\log\tau)/\log\log\tau$
terms. Since each term has absolute value $\le 1/(\sigma - 1/2)$, it follows that
\[ \frac{\zeta'}{\zeta}(\alpha+it) \ll \frac{\log\tau}{(\alpha-1/2)\log\log\tau} \]
for $1/2 < \alpha \le \sigma_1$. Hence
\[ \log\zeta(s) \ll \Big(1+\log\frac{\sigma_1-1/2}{\sigma-1/2}\Big)\frac{\log\tau}{\log\log\tau}, \]
which gives the stated bound.
Theorem 13.24 Assume RH. Then there is an absolute constant $C > 0$ such
that
\[ M(x) \ll x^{1/2}\exp\Big(\frac{C\log x}{\log\log x}\Big) \]
for $x \ge 4$.
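The quantity bounded here, $M(x) = \sum_{n\le x}\mu(n)$, is easy to tabulate for small $x$. The sketch below is illustrative only: it computes $\mu(n)$ with a linear sieve and records the largest value of $|M(x)|/x^{1/2}$ in a short range. The function names are hypothetical, and such data cannot distinguish between the bound of the theorem and weaker or stronger ones.

```python
import math

def mobius_table(limit):
    """Return mu with mu[n] = Moebius function of n for 1 <= n <= limit (limit >= 1)."""
    mu = [0] * (limit + 1)
    mu[1] = 1
    primes = []
    is_composite = [False] * (limit + 1)
    for n in range(2, limit + 1):
        if not is_composite[n]:
            primes.append(n)
            mu[n] = -1
        for p in primes:
            if n * p > limit:
                break
            is_composite[n * p] = True
            if n % p == 0:
                mu[n * p] = 0      # p^2 divides n*p
                break
            mu[n * p] = -mu[n]     # one more distinct prime factor
    return mu

def mertens_table(limit):
    """Return M with M[x] = sum of mu(n) over n <= x."""
    mu = mobius_table(limit)
    M = [0] * (limit + 1)
    for n in range(1, limit + 1):
        M[n] = M[n - 1] + mu[n]
    return M

if __name__ == "__main__":
    limit = 10**5
    M = mertens_table(limit)
    x_best = max(range(4, limit + 1), key=lambda x: abs(M[x]) / math.sqrt(x))
    print(x_best, M[x_best], abs(M[x_best]) / math.sqrt(x_best))
```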
Proof Put $\sigma_1 = 1/2 + 1/\log\log x$, and let $\mathcal{C}$ denote the contour that passes
by straight line segments from $\sigma_0 - ix$ to $\sigma_1 - ix$ to $\sigma_1 + ix$ to $\sigma_0 + ix$. Then
\[ \int_{\sigma_0-ix}^{\sigma_0+ix}\frac{x^s}{\zeta(s)s}\,ds = \int_{\mathcal{C}}\frac{x^s}{\zeta(s)s}\,ds, \]
since the integrand is analytic in the rectangle enclosed by these contours. By
the first case of Theorem 13.22 we see that
\[ \int_{\sigma_1+ix}^{\sigma_0+ix}\frac{x^s}{\zeta(s)s}\,ds \ll \exp\Big(\frac{C\log x}{\log\log x}\Big)\int_{\sigma_1}^{\sigma_0}x^{\sigma-1}\,d\sigma \ll \exp\Big(\frac{C\log x}{\log\log x}\Big), \]
and the same estimate applies to the integral from $\sigma_1 - ix$ to $\sigma_0 - ix$. Similarly,
by the second part of Theorem 13.22 we see that
\[ \int_{\sigma_1-ix}^{\sigma_1+ix}\frac{x^s}{\zeta(s)s}\,ds \ll x^{\sigma_1}\int_{0}^{x}\exp\Big(\frac{C\log\tau}{\log\log\tau}\log\frac{e\log\log x}{\log\log\tau}\Big)\frac{dt}{\tau}. \]
13.2.1 Exercises
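1. (a) Show (unconditionally) that
\[ \operatorname{Re}\frac{\xi'}{\xi}(s) = \sum_{\rho}\operatorname{Re}\frac{1}{s-\rho} \]
whenever $\xi(s) \ne 0$.
(b) Show (unconditionally) that
\[ \operatorname{Re}\frac{\xi'}{\xi}(1/2+it) = 0 \]
for all $t$ such that $\xi(1/2+it) \ne 0$.
(c) Assume RH. Show that
\[ \operatorname{Re}\frac{\xi'}{\xi}(s)\ \begin{cases} > 0 & \text{if } \sigma > 1/2, \\ = 0 & \text{if } \sigma = 1/2 \text{ and } \xi(s) \ne 0, \\ < 0 & \text{if } \sigma < 1/2. \end{cases} \]
(d) Assume RH. Show that if $\xi'(s) = 0$, then $\operatorname{Re} s = 1/2$.
(e) Assume RH, and let $t$ be any fixed real number. Show that $|\xi(\sigma+it)|$
is a strictly increasing function of $\sigma$ for $1/2 \le \sigma < \infty$, and that
$|\xi(\sigma+it)|$ is a strictly decreasing function of $\sigma$ for $-\infty < \sigma \le 1/2$.
(f) Assume RH, and suppose that $t$ is a fixed real number. Show that
$(\sigma - 1/2)\operatorname{Re}\frac{\xi'}{\xi}(\sigma+it)$ is an increasing function of $\sigma$ for $1/2 \le \sigma < \infty$.
(g) Assume RH. Show that if $1/2 < \sigma_2 \le \sigma_1$, then
\[ |\xi(\sigma_2+it)| \ge |\xi(\sigma_1+it)|\cdot\Big(\frac{\sigma_2-1/2}{\sigma_1-1/2}\Big)^{(\sigma_1-1/2)\operatorname{Re}\frac{\xi'}{\xi}(\sigma_1+it)}. \]
2. (a) Show (unconditionally) that if $\xi(s) \ne 0$, then
\[ \frac{\xi''}{\xi}(s) - \Big(\frac{\xi'}{\xi}(s)\Big)^2 = -\sum_{\rho}\frac{1}{(s-\rho)^2}. \]
is real.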
Let $\Sigma_1$ denote the sum of the above terms for which $d \le y$, and let
$\Sigma_2$ denote the sum of the above terms for which $d > y$. Here $y$ is a
parameter to be determined later, $1 \le y \le x^{1/2}$.
(b) Put
\[ S(x,y) = \sum_{d\le y}\mu(d)B_1(x/d^2) \]
13.3 Notes
Section 13.1. Theorem 13.1 is due to von Koch (1901). Theorems 13.3 and
13.5 are due to Cramér (1921). The order of magnitude of the estimate in
Theorem 13.5 is optimal, in view of Theorem 13.6, which is from Cramér
(1922). Wintner (1941) showed (assuming RH) that the function f (u) defined
in (13.17) has a limiting distribution. That is, there is a weakly monotonic
function $F(x)$ with $\lim_{x\to-\infty}F(x) = 0$, $\lim_{x\to+\infty}F(x) = 1$, such that
\[ \lim_{U\to\infty}\frac{1}{U}\operatorname{meas}\{u \in [0,U] : f(u) \le x\} = F(x) \]
where the $X_\gamma$ are independent random variables, each one uniformly distributed
on $[0, 1]$. It can be shown (unconditionally) that the distribution function $F_X$ of
$X$ satisfies the inequalities
\[ \exp\big(-c_1\sqrt{x}\,e^{\sqrt{2\pi x}}\big) < 1 - F_X(x) < \exp\big(-c_2\sqrt{x}\,e^{\sqrt{2\pi x}}\big) \tag{13.48} \]
function in the critical strip. Let $\alpha(\sigma)$ denote the least number such that
\[ \zeta(\sigma+it) \ll \exp\big((\log\tau)^{\alpha(\sigma)+\varepsilon}\big) \]
as $t \to \infty$. From Corollary 13.16 we see that $\alpha(\sigma) \le 2 - 2\sigma$, assuming RH.
In the opposite direction, Titchmarsh (1928) showed (unconditionally) that
$\alpha(\sigma) \ge 1 - \sigma$. More precisely, it is known that if $1/2 \le \sigma < 1$, then there is a
$c(\sigma) > 0$ such that
\[ |\zeta(\sigma+it)| = \Omega\Big(\exp\Big(\frac{c(\sigma)(\log\tau)^{1-\sigma}}{(\log\log\tau)^{\sigma}}\Big)\Big). \]
For 1/2 < σ < 1 this is due to Montgomery (1977); the case σ = 1/2 is due
to Balasubramanian & Ramachandra (1977). Opinions as to where the truth
lies between these bounds vary widely among experts. For more on the value
distribution of the zeta function and L-functions, see Titchmarsh (1986), Joyner
(1986), and Laurinčikas (1996).
That the estimate $M(x) \ll x^{1/2+\varepsilon}$ is equivalent to RH was proved by
Littlewood (1912). Theorems 13.22 through 13.24 are due to Titchmarsh (1927).
Theorem 13.24 has been improved upon by Maier & Montgomery (2006), who
showed (assuming RH) that
\[ M(x) \ll x^{1/2}\exp\big((\log x)^{39/61}\big). \]
13.4 References
Ankeny, N. C. (1952). The least quadratic non residue, Ann. of Math. 55, 65–72.
Axer, A. (1911). Über einige Grenzwertsätze, S.-B. Wiss. Wien IIa 120, 1253–1298.
Balasubramanian, R. & Ramachandra, K. (1977). On the frequency of Titchmarsh’s
phenomenon for ζ (s), III, Proc. Indian Acad. Sci. Sect. A 86, 341–351.
Bohr, H., Landau, E., & Littlewood, J. E. (1913). Sur la fonction ζ (s) dans le voisi-
nage de la droite σ = 1/2, Acad. Roy. Belgique Bull. Cl. Sci., 1144–1175; Bohr’s
Collected Works, Vol. 1. København: Dansk Mat. Forening, 1952, B.2; Landau’s
Collected Works, Vol. 6. Essen: Thales Verlag, 1986, pp. 61–93; Littlewood’s Col-
lected Papers, Vol. 2. Oxford: Oxford University Press, 1982, pp. 797–828.
Cramér, H. (1918). Über die Nullstellen der Zetafunktion, Math. Z. 2, 237–241;
Collected Works, Vol. 1. Berlin: Springer-Verlag, 1994, 92–96.
(1921). Some theorems concerning prime numbers, Arkiv för Mat. Astr. Fys. 15, no. 5,
33 pp.; Collected Works, Vol. 1. Berlin: Springer-Verlag, 1994, pp. 138–170.
(1922). Ein Mittelwertsatz der Primzahltheorie, Math. Z. 12, 147–153; Collected
Works, Vol. 1. Berlin: Springer-Verlag, 1994, pp. 229–235.
Goldston, D. A. (1982). On a result of Littlewood concerning prime numbers, Acta Arith.
40, 263–271.
Joyner, D. (1986). Distribution Theorems of L-functions, Pitman Research Notes in
Math. 142. Harlow: Longman.
von Koch, H. (1901). Sur la distribution des nombres premiers, Acta Math. 24, 159–182.
Lagarias, J. C., Montgomery, H. L., & Odlyzko, A. M. (1979). A bound for the least
prime ideal in the Chebotarev density theorem, Invent. Math. 54, 271–296.
Landau, E. (1920). Über die Nullstellen der Zetafunktion, Math. Z. 6, 151–154;
Collected Works, Vol. 7. Essen: Thales Verlag, 1986, pp. 226–229.
Laurinčikas, A. (1996). Limit Theorems for the Riemann Zeta-function, Mathematics
and its Applications 352. Dordrecht: Kluwer.
Littlewood, J. E. (1912). Quelques conséquences de l’hypothèse que la fonction ζ (s) de
Riemann n’a pas de zéros dans le demi-plan R(s) > 1/2, Comptes Rendus Acad. Sci.
Paris 154, 263–266; Collected Papers, Vol. 2. Oxford: Oxford University Press,
1982, pp. 793–796.
(1922). Researches in the theory of the Riemann ζ -function, Proc. London Math. Soc.
(2) 20, xxii–xxviii; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 844–850.
(1924a). Two notes on the Riemann Zeta-function, Proc. Cambridge Philos. Soc.
22, 234–242; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 851–859.
(1924b). On the zeros of the Riemann zeta-function, Proc. Cambridge Philos. Soc.
22, 295–318; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982, pp.
860–883.
(1926). On the Riemann zeta function, Proc. London Math. Soc. (2) 24, 175–201;
Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982, pp. 884–910.
(1928). Mathematical Notes (5): On the function 1/ζ (1 + ti), Proc. London Math.
Soc. (2) 27, 349–357; Collected Papers, Vol. 2, Oxford: Oxford University Press,
1982, pp. 911–919.
Maier, H. & Montgomery, H. L. (2006). On the sum of the Möbius function, to appear,
16 pp.
Montgomery, H. L. (1971). Topics in Multiplicative Number Theory, Lecture Notes in
Math. 227. Berlin: Springer-Verlag.
(1977). Extreme values of the Riemann zeta-function, Comment. Math. Helv. 52,
511–518.
(1994). Ten lectures on the interface between analytic number theory and harmonic
analysis, CBMS 84. Providence: Amer. Math. Soc.
Montgomery, H. L. & Vaughan, R. C. (1981). The distribution of square-free numbers,
Recent Progress in Analytic Number Theory (Durham, 1979), Vol. 1. London:
Academic Press, pp. 247–256.
Selberg, A. (1943). On the normal density of primes in small intervals, Arch. Math.
Natur-vid. 47, 87–105; Collected Papers, Vol. 1, New York: Springer Verlag, 1989,
pp. 160–178.
(1944). On the Remainder in the Formula for N (T ), the Number of Zeros of ζ (s) in
the Strip 0 < t < T . Avhandl. Norske Vid.-Akad. Oslo I. Mat.-Naturv. Kl., no. 1;
Collected Papers, Vol. 1, New York: Springer Verlag, 1989, pp. 179–203.
(1946a). Contributions to the Theory of the Riemann zeta-function, Arch. Math.
Naturvid. 48, 89–155; Collected Papers, Vol. 1, New York: Springer Verlag, 1989,
pp. 214–280.
(1946b). Contributions to the Theory of Dirichlet’s L-functions, Skrifter Norske Vid.-
Akad. Oslo I. Mat.-Naturvid. Kl., no. 3; Collected Papers, Vol. 1, New York:
Springer Verlag, 1989, pp. 281–340.
14.1 General distribution of the zeros
By the Schwarz reflection principle, the real parts cancel and the imaginary
parts reinforce. Thus the above is
\[ = \frac{1}{\pi}\Big(\arg(1/2+iT) + \arg(-1/2+iT) + \arg\zeta(1/2+iT) + \arg\Gamma(1/4+iT/2) - \frac{T}{2}\log\pi\Big). \]
Clearly
14.1.1 Exercise
1. Let $\chi$ be a primitive character modulo $q$ with $q > 1$. Show that if $L(s,\chi) \ne 0$
for $\sigma > 1/2$, then
\[ N(T,\chi) = \frac{T}{2\pi}\log\frac{qT}{2\pi} - \frac{T}{2\pi} + O\Big(\frac{\log qT}{\log\log qT}\Big) \]
for $T \ge 2$.
Theorem 14.8 (Hardy) There exist infinitely many real numbers γ such that
ζ (1/2 + iγ ) = 0.
then Z (t) is not of constant sign in the interval (T, 2T ), which is to say that ζ (s)
has at least one zero 1/2 + iγ of odd multiplicity, with T < γ < 2T . Although
it is possible to show that (14.7) holds for all large T , the requisite arguments
involve technical tools that we have not yet developed.
Fortunately, there is a family of weights W(t) such that the integral
$\int W(t)Z(t)\,dt$ can be evaluated by interpreting it as an inverse Mellin
transform with a familiar kernel. Thus we are able to establish a weighted
variant of (14.7), which suffices for our purpose.
In preparation for the main argument, we establish two preliminary results.
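Lemma 14.9 If $\Re z > 0$ and $\sigma_0 > 1$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \zeta(s)\,\Gamma(s/2)\,(\pi z)^{-s/2}\,ds = 2\sum_{n=1}^{\infty} e^{-\pi n^2 z}.$$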
This is the inverse of the Mellin transform relationship (10.7) that Riemann
used to establish the functional equation.
Proof By Theorem C.4 we see that if $\Re w > 0$ and $\sigma_0 > 0$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Gamma(s/2)\,w^{-s/2}\,ds = 2e^{-w}.$$
We take $w = \pi n^2 z$, and sum over n, to obtain the desired identity. Here the
458 Zeros
uniformly for T ≥ 2.
Proof Let C denote the rectangular contour with vertices 1/2 + i, 2 + i,
2 + iT, 1/2 + iT. Since ζ(s) is analytic in this rectangle, we have
$$\int_C \zeta(s)\,ds = 0.$$
Thus
$$\int_1^T \zeta(1/2 + it)\,dt = \int_1^T \zeta(2 + it)\,dt + O\big(T^{1/2}\big).$$
n=1
Here the left-hand side is of the form $\int_{-\infty}^{\infty} W(t)Z(t)\,dt$ with
$$W(t) = \frac{|\Gamma(1/4 + it/2)|}{2\pi^{5/4} z^{it/2}}.$$
Write z in polar coordinates, $z = re^{i\theta}$. Then $z^{-it/2} = r^{-it/2}e^{\theta t/2}$. For our
approach to work, W(t) must have constant argument. Accordingly, we take r = 1,
and set θ = π/2 − δ where δ is small and positive. By (C.19) we see that
Hence
−1/4 π(t−τ )/4 −δt/2 τ −1/4 e−(π −δ)τ/2 if t ≥ 0,
W (t) τ e e
τ −1/4 e−(1−δ)π τ/2 if t ≤ 0.
Thus W (t) tends to 0 very rapidly as t → −∞, but relatively slowly as t →
+∞. In particular,
W (t) τ −1/4
δ −3/4 .
n=1
If ζ (s) had only finitely many zeros on the critical line, then we would have
∞ ∞
W (t)Z (t) dt = W (t)|Z (t)| dt + O(1)
−∞ −∞
14.2.1 Exercise
1. (a) Show that the right-hand side of (14.8) is
n=−∞
for 0 < δ ≤ 1.
(d) By taking α = 1/2 in Theorem 10.1, or otherwise, show that
∞
(−1)n e−πn x −1/2 e−π/(4x)
2
x
n=−∞
14.3 Notes
Section 14.1. Theorem 14.1 and Corollary 14.2 are due to Backlund (1914,
1918), and this gave a shorter proof of Corollary 14.3 which had been ob-
tained by von Mangoldt (1905). Earlier von Mangoldt (1895) had the error
term O((log T )2 ). Riemann (1859) proposed Corollary 14.3 but with no indica-
tion of a proof. It is remarkable that Corollary 14.3 is perhaps the only theorem
on the Riemann zeta function that has not seen some significant improvement
in the last 100 years.
Although the maximum order of S(t) is unclear, even assuming the Riemann
Hypothesis, we have considerable (unconditional) knowledge of its moments
and distribution. Selberg (1944) showed that if k is a fixed non-negative even
integer, then
$$\int_0^T S(t)^k\,dt = \frac{k!}{(k/2)!\,(2\pi)^k}\,T(\log\log T)^{k/2} + O\big(T(\log\log T)^{k/2-1}\big).$$
Although Selberg did not mention it, his techniques can also be used to show
that
$$\int_0^T S(t)^k\,dt = o\big(T(\log\log T)^{k/2}\big)$$
when k is odd. From these estimates it follows that the distribution of S(t) is
for any given real number c. Similar results apply to the distribution of the real
part of log ζ (1/2 + it), and indeed Selberg (unpublished) showed that the real
and imaginary parts can be treated simultaneously. Specifically,
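$$\int_0^T (\log\zeta(1/2 + it))^h\,(\log\zeta(1/2 - it))^k\,dt = \delta_{h,k}\,k!\,T(\log\log T)^k + O_{h,k}\big(T(\log\log T)^{(h+k-1)/2}\big)$$
where
$$\delta_{h,k} = \begin{cases} 1 & \text{if } h = k, \\ 0 & \text{otherwise.} \end{cases}$$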
From this it follows that log ζ(1/2 + it) is asymptotically normally distributed
in the complex plane, in the sense that if $\Omega$ is a set in the complex plane with
Jordan content, then
$$\lim_{T\to\infty}\frac{1}{T}\,\mathrm{meas}\Big\{t \in [4, T] : \frac{\log\zeta(1/2 + it)}{\sqrt{\log\log t}} \in \Omega\Big\} = \frac{1}{\pi}\iint_{\Omega} e^{-|z|^2}\,dx\,dy.$$
Section 14.2. Theorem 14.8 was announced and a proof sketched in Hardy
(1914). Further details are given in Hardy & Littlewood (1917). Let $N_0(T)$
denote the number of zeros of the form 1/2 + iγ with 0 < γ ≤ T. Hardy
& Littlewood (1921) showed that $N_0(T) \gg T$. Later Selberg improved this,
first (1942a) to $N_0(T) \gg T\log\log T$ and then (1942b) to $N_0(T) \gg T\log T$,
so that a positive proportion of the zeros are on the 1/2-line. Levinson (1974)
introduced an alternative method that enabled him to show that at least
one-third of the non-trivial zeros are on the 1/2-line. Selberg’s method detects only
zeros of odd multiplicity. This should not be a handicap, since presumably all
zeros are simple. Heath-Brown (1979) has observed that Levinson’s method
detects only simple zeros. Conrey (1989) used Levinson’s method to show that
$N_0(T) > \frac{2}{5}N(T)$.
The proof we have given of Hardy’s Theorem 14.8 is but one of several
described by Titchmarsh (1986, Chapter 10).
14.4 References
Backlund, R. J. (1914). Sur les zéros de la fonction ζ (s) de Riemann, C. R. Acad. Sci.
Paris 158, 1979–1981.
(1918). Über die Nullstellen der Riemannschen Zetafunktion, Acta Math. 41, 345–
375.
Conrey, J. B. (1989). More than two fifths of the zeros of the Riemann zeta function are
on the critical line, J. Reine Angew. Math. 399, 1–26.
Hardy, G. H. (1914). Sur les zéros de la fonction ζ (s) de Riemann, C. R. Acad. Sci. Paris
158, 1012–1014; Collected Papers, Vol. 2, Oxford: Oxford University Press, 1967,
pp. 6–8.
Hardy, G. H. & Littlewood, J. E. (1917). Contributions to the theory of the Riemann
Zeta-function and the theory of the distribution of primes, Acta Math. 41, 119–196;
Collected Papers, Vol. 2, Oxford: Oxford University Press, 1967, pp. 20–97.
(1921). The zeros of Riemann’s zeta-function on the critical line, Math. Z. 10, 283–
317; Collected Papers, Vol. 2, Oxford: Oxford University Press, 1967, pp. 115–149.
Heath–Brown, D. R. (1979). Simple zeros of the Zeta-function on the critical line, Bull.
London Math. Soc. 11, 17–18.
Levinson, N. (1974). More than one third of zeros of Riemann’s zeta-function are on
σ = 1/2, Adv. Math. 13, 383–436.
von Mangoldt, H. (1895). Zu Riemann’s Abhandlung “Ueber die Anzahl der Primzahlen
unter einer gegebenen Grösse”, J. Reine Angew. Math. 114, 255–305.
(1905). Zur Verteilung der Nullstellen der Riemannschen Funktion ξ (t), Math. Ann.
60, 1–19.
Riemann, B. (1859). Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse,
Monatsber. Kgl. Preuss. Akad. Wiss. Berlin, 671–680; Werke, Leipzig: Teubner,
1876, pp. 3–47. Reprint: New York: Dover, 1953.
Selberg, A. (1942a). On the zeros of Riemann’s zeta-function on the critical line, Arch.
Math. Naturvid. 45, 101–114; Collected Papers, Vol. 1, New York: Springer Verlag,
1989, pp. 142–155.
(1942b). On the Zeros of Riemann’s Zeta-function, Skr. Norske Vid. Akad. Oslo I.,
no. 10; Collected Papers, Vol. 1, New York: Springer Verlag, 1989, pp. 156–159.
(1944). On the Remainder in the Formula for N (T ), the Number of Zeros of ζ (s) in
the Strip 0 < t < T , Avh. Norske Vid. Akad. Oslo. I, no. 1; Collected Papers, Vol.
1, New York: Springer Verlag, 1989, pp. 179–203.
Titchmarsh, E. C. (1986). The Theory of the Riemann Zeta-function, Second edition.
New York: Oxford University Press.
15
Oscillations of error terms
Proof Write
$$F(s) = \int_1^{X_0} A(x)x^{-s}\,dx + \int_{X_0}^{\infty} A(x)x^{-s}\,dx = F_1(s) + F_2(s),$$
say. Then the function F1 (s) is entire, and the proof of Theorem 1.7 can be
adapted to F2 (s) to give the stated result.
In Exercise 13.1.1 we saw that if $\Theta$ denotes the supremum of the real parts of
the zeros of the zeta function, then $\psi(x) = x + O(x^{\Theta}(\log x)^2)$. Conversely, if
$\psi(x) = x + O(x^{\alpha+\varepsilon})$, then by Theorem 1.3 the Dirichlet series
$\sum_{n=1}^{\infty}(\Lambda(n) - 1)n^{-s}$ converges for σ > α, and hence $\zeta(s) \ne 0$ in this half-plane. That is,
$\psi(x) - x = \Omega(x^{\Theta-\varepsilon})$. We now sharpen this, by showing that ψ(x) − x must
be large in both signs.
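Theorem 15.2 Let $\Theta$ denote the supremum of the real parts of the zeros of
the zeta function. Then for every ε > 0,
$$\psi(x) - x = \Omega_{\pm}\big(x^{\Theta-\varepsilon}\big) \qquad (15.1)$$
and
$$\pi(x) - \mathrm{li}(x) = \Omega_{\pm}\big(x^{\Theta-\varepsilon}\big) \qquad (15.2)$$
as x → ∞.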
Here the left-hand side has a pole at $\Theta - \varepsilon$, but is analytic for real $s > \Theta - \varepsilon$,
in view of Corollary 1.14. Hence the above identity holds for $\sigma > \Theta - \varepsilon$,
and both sides are analytic in this half-plane. But by the definition of $\Theta$,
the function $\zeta'/\zeta$ has poles with real part $> \Theta - \varepsilon$. From this contradiction
we deduce that the assertion (15.3) is false. That is, $\psi(x) - x = \Omega_{+}(x^{\Theta-\varepsilon})$.
To obtain the corresponding $\Omega_{-}$ estimate we argue similarly using the
identity
$$\frac{1}{s - \Theta + \varepsilon} - \frac{\zeta'(s)}{s\zeta(s)} - \frac{1}{s - 1} = \int_1^{\infty}\big(x^{\Theta-\varepsilon} + \psi(x) - x\big)x^{-s-1}\,dx.$$
In contrast to the situation of Corollary 2.5 or Theorem 13.2, it does not seem
possible to derive (15.2) from (15.1) by integrating by parts. Instead, we pursue
an argument modelled on the one just given. First we examine the Mellin
transform of li(x). By integrating by parts we see that
$$s\int_2^{\infty}\mathrm{li}(x)x^{-s-1}\,dx = \int_2^{\infty}\frac{dx}{x^{s}\log x} = \int_{(s-1)\log 2}^{\infty} e^{-u}\,\frac{du}{u}.$$
15.1 Applications of Landau’s theorem 465
Clearly this is
$$= \int_1^{\infty} e^{-u}\,\frac{du}{u} + \int_{(s-1)\log 2}^{1}\frac{e^{-u} - 1}{u}\,du - \log(s - 1) - \log\log 2.$$
By (7.31) we see that this is
$$= -\int_0^{(s-1)\log 2}\frac{e^{-u} - 1}{u}\,du - C_0 - \log(s - 1) - \log\log 2.$$
Thus we find that
$$s\int_2^{\infty}\mathrm{li}(x)x^{-s-1}\,dx = -\log(s - 1) + r(s)$$
for σ > 1. We observe that this function is analytic on the real axis for $s > \Theta - \varepsilon$.
Thus by Lemma 15.1, if $\Pi(x) - \mathrm{li}(x) < x^{\Theta-\varepsilon}$ for all sufficiently large x, then the
identity above holds in the half-plane $\sigma > \Theta - \varepsilon$. However, we are assuming
that the zeta function has a zero ρ = β + iγ with $\beta > \Theta - \varepsilon$, and the left-hand
side above has a logarithmic singularity at s = ρ. Thus we have a contradiction,
and so $\Pi(x) - \mathrm{li}(x) = \Omega_{+}(x^{\Theta-\varepsilon})$. Since $\pi(x) = \Pi(x) + O(x^{1/2}/\log x)$, and
since $\Theta \ge 1/2$, it follows that $\pi(x) - \mathrm{li}(x) = \Omega_{+}(x^{\Theta-\varepsilon})$. For the corresponding
$\Omega_{-}$ estimate, we argue similarly from the identity
$$\frac{1}{s - \Theta + \varepsilon} + \frac{1}{s}\log\big(\zeta(s)(s - 1)\big) - \frac{r(s)}{s} = \int_2^{\infty}\big(x^{\Theta-\varepsilon} + \Pi(x) - \mathrm{li}(x)\big)x^{-s-1}\,dx.$$
Next we show that if there is a zero of ζ(s) on the line $\sigma = \Theta$, then we may
draw a stronger conclusion.
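Theorem 15.3 Suppose that $\Theta$ is the supremum of the real parts of the zeros
of ζ(s), and that there is a zero ρ with $\Re\rho = \Theta$, say $\rho = \Theta + i\gamma$. Then
$$\limsup_{x\to\infty}\frac{\psi(x) - x}{x^{\Theta}} \ge \frac{1}{|\rho|}, \qquad (15.4)$$
and
$$\liminf_{x\to\infty}\frac{\psi(x) - x}{x^{\Theta}} \le -\frac{1}{|\rho|}. \qquad (15.5)$$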
Proof Suppose that $\psi(x) \le x + cx^{\Theta}$ for all $x \ge X_0$. Then by Lemma 15.1,
$$\frac{c}{s - \Theta} + \frac{\zeta'(s)}{s\zeta(s)} + \frac{1}{s - 1} = \int_1^{\infty}\big(cx^{\Theta} - \psi(x) + x\big)x^{-s-1}\,dx \qquad (15.6)$$
for $\sigma > \Theta$. We now consider the behaviour of these two expressions as s tends
to $\Theta$ from above through real values. On the right-hand side, the integral from
1 to $X_0$ is uniformly bounded, while the integral from $X_0$ to ∞ is non-negative.
Thus the lim inf of the right-hand side is > −∞ as $s \to \Theta^{+}$. On the other hand,
the left-hand side is a meromorphic function that has a pole at $s = \Theta$ with
residue
$$c + \frac{me^{i\phi}}{2\rho} + \frac{me^{-i\phi}}{2\overline{\rho}}$$
where m ≥ 1 denotes the multiplicity of the zero ρ. We choose φ so that $e^{i\phi}/\rho = -1/|\rho|$.
Then the above is c − m/|ρ|. This quantity must be non-negative, for if
it were negative, then the left-hand side would tend to −∞ as $s \to \Theta^{+}$. Hence
c ≥ 1/|ρ|, and we have (15.4). The proof of (15.5) is similar.
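The two-sided oscillation asserted in (15.4) and (15.5) can already be glimpsed numerically. The following is a minimal sketch, not part of the argument above: the helper names `von_mangoldt_table` and `chebyshev_psi` are illustrative, the cut-off 10 000 is arbitrary, and the normalisation by $x^{1/2}$ assumes $\Theta = 1/2$, which is the Riemann Hypothesis. It tabulates ψ(x) from the von Mangoldt function and records the normalised error at integer x.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = True
            # Lambda(p^k) = log p for every prime power p^k <= limit.
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam

def chebyshev_psi(limit):
    """Return a list psi with psi[x] = sum of Lambda(n) over n <= x."""
    lam = von_mangoldt_table(limit)
    psi = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        psi[n] = psi[n - 1] + lam[n]
    return psi

LIMIT = 10_000  # arbitrary illustrative cut-off
psi = chebyshev_psi(LIMIT)

# (psi(x) - x) / x^(1/2) at integers x; the exponent 1/2 assumes Theta = 1/2.
normalised = [(psi[x] - x) / math.sqrt(x) for x in range(2, LIMIT + 1)]
print(min(normalised), max(normalised))
```

Since ψ(x) = log lcm(1, ..., x), one has ψ(10) = log 2520 < 10 while ψ(19) = log 232792560 > 19, so both signs occur very early. A finite table of this kind says nothing about the lim sup or lim inf, of course.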
Proof We first prove (15.7). If RH is false, then $\Theta > 1/2$, and we have a
stronger result by Theorem 15.2. If RH holds, then we have (15.7) by
Theorem 15.3, and the remaining assertions follow by Theorem 13.2.
Many similar results can be proved using the above ideas. For example, for
$M(x) = \sum_{n\le x}\mu(n)$ we find, in the manner of Theorem 15.2, that
$$M(x) = \Omega_{\pm}\big(x^{\Theta-\varepsilon}\big). \qquad (15.10)$$
In analogy to (15.6) we put
$$G(s) = \frac{1}{s\zeta(s)} - \frac{c}{s - \Theta} = \int_1^{\infty}\big(M(x) - cx^{\Theta}\big)x^{-s-1}\,dx.$$
to disclose any linear dependences, and in the absence of any indication to the
contrary, we presume that the ordinates γ > 0 are linearly independent. Under
this assumption, we can improve on the estimate (15.13).
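Theorem 15.5 Let $0 < \gamma_1 < \gamma_2 < \cdots < \gamma_K$ and γ be ordinates of zeros of
ζ(s). For 1 ≤ k ≤ K let $\varepsilon_k$ take one of the values −1, 0, 1. Suppose that
$$\sum_{k=1}^{K}\varepsilon_k\gamma_k = 0 \qquad (15.15)$$
for such $\varepsilon_k$ only when $\varepsilon_k = 0$ for all k. Suppose also that the equation
$$\sum_{k=1}^{K}\varepsilon_k\gamma_k = \gamma \qquad (15.16)$$
has a solution only if γ is one of the $\gamma_k$, say $\gamma = \gamma_{k_0}$, and that in this case the
only solution is obtained by taking $\varepsilon_{k_0} = 1$, $\varepsilon_k = 0$ for $k \ne k_0$. Then
$$\limsup_{x\to\infty}\frac{M(x)}{x^{1/2}} \ge \sum_{k=1}^{K}\frac{1}{|\rho_k\zeta'(\rho_k)|} \qquad (15.17)$$
and
$$\liminf_{x\to\infty}\frac{M(x)}{x^{1/2}} \le -\sum_{k=1}^{K}\frac{1}{|\rho_k\zeta'(\rho_k)|}. \qquad (15.18)$$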
Proof In view of (15.10) and (15.14), we may assume that RH holds and that
all zeros of the zeta function are simple. We suppose that $M(x) \le cx^{1/2}$ for all
large x and consider the integral
$$I(s) = \int_1^{\infty}\frac{M(x) - cx^{1/2}}{x^{s+1}}\prod_{k=1}^{K}\big(1 + \cos(\phi_k - \gamma_k\log x)\big)\,dx.$$
With G(s) defined as above (with $\Theta = 1/2$), we multiply out the product to
see that this integral is a linear combination of G at various arguments. More
precisely, we see that
$$I(s) = G(s) + \frac{1}{2}\sum_{k=1}^{K}\big(e^{i\phi_k}G(s + i\gamma_k) + e^{-i\phi_k}G(s - i\gamma_k)\big) + J(s)$$
where J(s) is a linear combination of G at arguments of the form
$$s + i\sum_{k=1}^{K}\varepsilon_k\gamma_k$$
with more than one of the $\varepsilon_k$ non-zero. The function G(s) is analytic in the
half-plane σ > 0, except for poles at s = 1/2 and at the non-trivial zeros ρ.
Hence by Landau’s theorem we see that I(s) converges for σ > 1/2, and our
hypotheses (15.15), (15.16) imply that J(s) is analytic at the point s = 1/2.
Thus the integral I(s) has a pole at s = 1/2 with residue
$$-c + \sum_{k=1}^{K}\frac{e^{i\phi_k}}{\rho_k\zeta'(\rho_k)}.$$
We choose the $\phi_k$ so that the summands here are positive real. Since I(s) is
bounded above uniformly for s > 1/2, by letting s tend to 1/2 from above we
deduce that
$$c \ge \sum_{k=1}^{K}\frac{1}{|\rho_k\zeta'(\rho_k)|}.$$
It is not known whether it is possible to choose zeros ρ in such a way that the
hypotheses (15.15), (15.16) hold, and for which the sum in (15.17) and (15.18)
is large, but at least we are able to establish
Theorem 15.6 Suppose that the Riemann Hypothesis is true and that the zeros
of the zeta function are simple. Then
$$\sum_{0<\gamma\le T}\frac{1}{|\zeta'(\rho)|} \gg T$$
as T → ∞.
Corollary 15.7 If the ordinates γ > 0 of the Riemann zeta function are
linearly independent over Q, then
$$\limsup_{x\to\infty}\frac{M(x)}{x^{1/2}} = +\infty$$
and
$$\liminf_{x\to\infty}\frac{M(x)}{x^{1/2}} = -\infty.$$
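For orientation, the size of $M(x)/x^{1/2}$ in a small range can be tabulated directly. The sketch below is only an illustration under stated assumptions: the function names `mobius_sieve` and `mertens_table` are hypothetical, and the cut-off 10 000 is arbitrary. It sieves the Möbius function and forms the partial sums M(x). In such a range the ratio stays well inside [−1, 1], so the unbounded oscillation described above cannot be seen by direct computation at this scale.

```python
def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each prime factor p flips the sign once.
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # A square factor p^2 forces mu to vanish.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu

def mertens_table(limit):
    """Return a list M with M[x] = sum of mu(n) over n <= x."""
    mu = mobius_sieve(limit)
    M = [0] * (limit + 1)
    for n in range(1, limit + 1):
        M[n] = M[n - 1] + mu[n]
    return M

LIMIT = 10_000  # arbitrary illustrative cut-off
M = mertens_table(LIMIT)
largest_ratio = max(abs(M[x]) / x ** 0.5 for x in range(1, LIMIT + 1))
print(M[10], M[100], M[1000], M[10_000], largest_ratio)
```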
Proof of Theorem 15.6 It is enough to prove the inequality with T restricted
to the special sequence of values Tν of Theorem 13.21, for which |ζ (s)| τ −ε
15.1.1 Exercises
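1. (a) Suppose that ε is small and positive, and let Li(x) be defined as in
Exercise 6.2.22. Explain why
$$s\int_{1+\varepsilon}^{\infty}\mathrm{Li}(x)x^{-s-1}\,dx = \mathrm{Li}(1 + \varepsilon)(1 + \varepsilon)^{-s} + \int_{1+\varepsilon}^{\infty}\frac{dx}{x^{s}\log x} = T_1 + T_2.$$
(b) Show that Li(1 − ε) = Li(1 + ε) + O(ε).
(c) Show that
$$\mathrm{Li}(1 - \varepsilon) = -\int_{\varepsilon}^{\infty} e^{-v}\,\frac{dv}{v}.$$
(d) Show that $\mathrm{Li}(1 + \varepsilon) \ll \log 1/\varepsilon$.
(e) Deduce that
$$T_1 = -\int_{\varepsilon}^{\infty} e^{-v}\,\frac{dv}{v} + O\Big(\varepsilon\log\frac{1}{\varepsilon}\Big).$$
(f) Show that
$$T_2 = \int_{(s-1)\log(1+\varepsilon)}^{\infty} e^{-v}\,\frac{dv}{v}.$$
(g) Show that
$$T_2 = \int_{(s-1)\varepsilon}^{\infty} e^{-v}\,\frac{dv}{v} + O(\varepsilon).$$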
for σ > 1.
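2. Let $\psi_1(x) = \sum_{n\le x}\Lambda(n)(x - n)$. Show that $\psi_1(x) - \frac{1}{2}x^2 = \Omega_{\pm}(x^{3/2})$.
3. Show that $\psi(2x) - 2\psi(x) = \Omega_{\pm}(x^{1/2})$.
4. (a) Show that as x → ∞,
$$\sum_{n\le x}(1 - n/x)\mu(n) = \Omega_{\pm}\big(x^{1/2}\big).$$
and that
$$\liminf_{x\to\infty}\frac{\psi(x) - x}{x^{1/2}} \le -\frac{1}{|1/2 + i\gamma_1|} - \frac{1}{|1/2 + i\gamma_2|}.$$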
7. Show that n≤x (−1)ω(n) x 1/2+ε if and only if (3s − 2)/ζ (s) is analytic
for σ > 1/2.
8. (Ingham 1942; cf. Haselgrove 1958) Let $L(x) = \sum_{n\le x}\lambda(n)$.
(a) Show that if $\Theta > 1/2$, then for every ε > 0, $L(x) = \Omega_{\pm}(x^{\Theta-\varepsilon})$ as
x → ∞.
(b) Show that $\liminf_{x\to\infty} L(x)/x^{1/2} \le 1/\zeta(1/2)$ (= −0.685 . . . ).
(c) Show that if ζ(s) has a multiple zero, then $L(x) = \Omega_{\pm}(x^{1/2}\log x)$.
(d) Show that if RH holds and σ is fixed, 1/4 < σ < 1/2, then
$|\zeta(2s)/\zeta(s)| = \tau^{\sigma-1/2+o(1)}$.
(e) Show that if RH holds, then there is a sequence of Tν → ∞ in such a
way that Tν+1 ≤ Tν + 2, and
ζ (2ρ)
= Tν + O Tν3/4+ε .
0<γ ≤Tν
ζ (ρ)
(f) Show that if RH holds and the ordinates γ > 0 of the zeros of the zeta
function are linearly independent over Q, then
$$\limsup_{x\to\infty}\frac{L(x)}{x^{1/2}} = +\infty$$
and
$$\liminf_{x\to\infty}\frac{L(x)}{x^{1/2}} = -\infty.$$
9. (Turán 1948; cf. Haselgrove 1958)
(a) Show that if $\sum_{n\le x}\lambda(n)/n \ge 0$ for all x ≥ 1, then the Riemann
Hypothesis is true.
(b) Show that
$$\sum_{n\le x}\lambda(n)/n = \Omega_{+}\big(x^{-1/2}\big)$$
as x → ∞.
10. Let the positive integer q be fixed. Suppose that if χ is a character (mod
q), then $L(\sigma,\chi) \ne 0$ for 0 < σ < 1. Suppose also that a and b are integers
such that (ab, q) = 1 and $a \not\equiv b \pmod q$.
(a) Let $\Theta = \Theta(q; a, b)$ denote the supremum of the real parts of the poles
of the function
$$\sum_{\chi}\big(\chi(a) - \chi(b)\big)\frac{L'}{L}(s,\chi).$$
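Show that
$$\psi(x; q, a) - \psi(x; q, b) = \Omega_{\pm}\big(x^{\Theta-\varepsilon}\big)$$
for any ε > 0.
(b) Let r(a) denote the number of solutions of the congruence $x^2 \equiv a \pmod q$.
Show that
$$\vartheta(x; q, a) = \psi(x; q, a) - \frac{r(a)}{\varphi(q)}x^{1/2} + o\big(x^{1/2}\big).$$
(c) Show that if $\Theta(q; a, b) > 1/2$, then
$$\vartheta(x; q, a) - \vartheta(x; q, b) = \Omega_{\pm}\big(x^{\Theta-\varepsilon}\big),$$
$$\pi(x; q, a) - \pi(x; q, b) = \Omega_{\pm}\big(x^{\Theta-\varepsilon}\big)$$
for any ε > 0.
(d) Show that $\Theta(q; a, b) \ge 1/2$.
(e) Show that
$$\psi(x; q, a) - \psi(x; q, b) = \Omega_{\pm}\big(x^{1/2}\big).$$
(f) Show that if r(a) ≥ r(b), then
$$\vartheta(x; q, a) - \vartheta(x; q, b) = \Omega_{-}\big(x^{1/2}\big),$$
$$\pi(x; q, a) - \pi(x; q, b) = \Omega_{-}\big(x^{1/2}/\log x\big).$$
(g) Show that if r(a) ≤ r(b), then
$$\vartheta(x; q, a) - \vartheta(x; q, b) = \Omega_{+}\big(x^{1/2}\big),$$
$$\pi(x; q, a) - \pi(x; q, b) = \Omega_{+}\big(x^{1/2}/\log x\big).$$
(h) Show that
$$\pi(x; 4, 1) - \pi(x; 4, 3) = \Omega_{-}\big(x^{1/2}/\log x\big).$$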
11. (Hardy & Littlewood 1918; Landau 1918a, b) Let $\chi_{-4}(n) = \left(\frac{-4}{n}\right)$ denote
the non-principal character modulo 4, and let
$$T_1(x) = \sum_{n\le x}\Lambda(n)\chi_{-4}(n)(x - n).$$
$$L'(1,\chi_{-4}) = \frac{\log 3}{6} + \sum_{k=2}^{\infty}\frac{(-1)^k}{2}\Big(\frac{\log(2k - 1)}{2k - 1} - \frac{\log(2k + 1)}{2k + 1}\Big),$$
and apply the alternating series test to show that $0.19 < L'(1,\chi_{-4}) < 0.196$.
(e) Deduce that
$$0.148 < \sum_{\rho}\frac{1}{|\rho|^2} < 0.164.$$
(f) Show that $|T_1(x)| < (0.165)x^{3/2}$ for all large x.
(g) Show that
$$\sum_{p\le x^{1/2}}(\log p)(x - p^2) = \frac{2}{3}x^{3/2} + o\big(x^{3/2}\big).$$
(h) Let $T_2(x) = \sum_{2<p\le x}(\log p)(-1)^{(p-1)/2}(x - p)$. Show that
$$-\frac{5}{6}x^{3/2} < T_2(x) < -\frac{1}{2}x^{3/2}$$
for all large x.
(i) Let T3 (x) = 2< p≤x (−1)( p−1)/2 (x − p). Show that
T2 (x) x
T2 (u) 2(x − u)
T3 (x) =+ 2 2
x+ du
log x 3 u (log u) log u
T2 (x) x 3/2
= +O .
log x (log x)2
(j) Let $P(x) = \sum_{p>2}(-1)^{(p-1)/2}e^{-p/x}$. Show that
$$P(x) = \frac{1}{x^2}\int_0^{\infty} T_3(u)e^{-u/x}\,du.$$
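The numerical bracket $0.19 < L'(1,\chi_{-4}) < 0.196$ quoted earlier in this exercise can be sanity-checked by summing the averaged alternating series directly. The following is a sketch only: the function name `l_prime_partial` and the truncation at 100 000 terms are arbitrary illustrative choices, and a floating-point sum is no substitute for the alternating-series argument the exercise asks for.

```python
import math

def l_prime_partial(terms):
    """Partial sum of log(3)/6 + sum_{k=2}^{terms} (-1)^k/2 * (a(k-1) - a(k)),
    where a(m) = log(2m+1)/(2m+1)."""
    def a(m):
        return math.log(2 * m + 1) / (2 * m + 1)

    total = math.log(3) / 6
    for k in range(2, terms + 1):
        sign = 1.0 if k % 2 == 0 else -1.0
        total += 0.5 * sign * (a(k - 1) - a(k))
    return total

# Two consecutive partial sums; the summands are tiny at this depth.
lower_or_upper = l_prime_partial(100_000)
upper_or_lower = l_prime_partial(100_001)
print(lower_or_upper, upper_or_lower)
```

Because the summands decrease in absolute value once k ≥ 3, consecutive partial sums lie on opposite sides of the limit, which is the mechanism behind the alternating series test.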
Theorem 15.8 Let $\Theta$ denote the supremum of the real parts of the zeros of
ζ(s). If ζ(s) has a zero with real part $\Theta$, then there exists a constant C > 0 such
that ψ(x) − x changes sign in every interval [x, Cx] for which x ≥ 2.
We see easily that $R_k(y)$ is differentiable for k > 1, and that $R_k'(y) = R_{k-1}(y)$.
By the method used to prove explicit formulæ we see also that
$$R_k(y) = -\sum_{\rho}\frac{e^{\rho y}}{\rho^{k+1}} + O\big(y^{k+1}\big).$$
Suppose that the numbers $\gamma_j$ are determined, $0 < \gamma_1 < \gamma_2 < \ldots$, so that the
numbers $\Theta \pm i\gamma_j$ constitute all the zeros of ζ(s) on the line $\sigma = \Theta$, and let
$m_j$ denote the multiplicity of the zero $\rho_j = \Theta + i\gamma_j$. Since $\sum_{\rho}|\rho|^{-\alpha} < \infty$ for
α > 1, we see that if k ≥ 1, then
$$R_k(y) = -2e^{\Theta y}\,\Re\sum_{j}\frac{m_j e^{i\gamma_j y}}{\rho_j^{k+1}} + o\big(e^{\Theta y}\big) \qquad (15.19)$$
Choose φ so that eiγ1 φ /ρ1K > 0. By taking k = K in (15.19) and using the above
inequality, we see that for all large numbers n, R K (φ + π n/γ1 ) is positive or
Here each term in the sum is periodic, and if γ is large, then both the period and
the amplitude of the term are small. The sum is not absolutely convergent, but
by suitably averaging this with respect to y we may arrange that the γ beyond
a chosen point make a small contribution. Suppose, for simplicity, that by such
an averaging we could truncate the sum, which would leave us to consider the
partial sum
$$-2\sum_{0<\gamma\le T}\frac{\sin\gamma y}{\gamma}. \qquad (15.20)$$
Here the sum of the absolute values of the coefficients is $\asymp (\log T)^2$, and the
sum will be of this order of magnitude if we can find a y for which the fractional
parts {γy/(2π)} are approximately 1/4 for all the above γ. This, however, is an
inhomogeneous problem of Diophantine approximation, and in general such a
15.2 The error term in the Prime Number Theorem 477
problem has a solution only if the coefficients γ are linearly independent over Q.
Moreover, in order to obtain a quantitative result it would be necessary to have
quantitative lower bounds for the absolute values of linear forms in the γ. Since
we have no such information, we are confined to homogeneous approximation.
Dirichlet’s theorem assures us that there exist large y for which each of the
numbers γy/(2π) is near an integer. That is, $\|\gamma y/(2\pi)\|$ is small for 0 < γ ≤ T,
where $\|\theta\|$ denotes the distance from θ to the nearest integer, $\|\theta\| = \min_{n\in\mathbb{Z}}|\theta - n|$.
However, the sum (15.20) vanishes when y = 0, and will therefore be small
when the numbers $\|\gamma y/(2\pi)\|$ are small. On the other hand, if we take y = π/T
in (15.20), then $\sin\gamma y \gg \gamma/T$, and the sum is $\gg N(T)/T \asymp \log T$. While this
is smaller than the $(\log T)^2$ that we might have hoped for, it is definitely large.
This y is small, but by Dirichlet’s theorem there exists a large number $y_0$ for
which the numbers $\|\gamma y_0/(2\pi)\|$ are small, and then we may take $y = y_0 \pm \pi/T$
to make the sum (15.20) large in either sign.
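The homogeneous approximation step can be made concrete by brute force. The sketch below is an illustration under stated assumptions rather than the book's argument: it restricts y to integers n, takes the ordinates of the first three zeros as hard-coded six-decimal approximations, and the function names are hypothetical. It searches 1 ≤ n ≤ N^K for a multiplier that brings every θ_k n within 1/N of an integer, where θ_k = γ_k/(2π).

```python
import math

def dist_to_nearest_integer(t):
    """Distance from the real number t to the nearest integer."""
    return abs(t - round(t))

def dirichlet_multiplier(thetas, N):
    """Search 1 <= n <= N**K for an n with ||theta_k * n|| < 1/N for all k.

    The pigeon-hole principle guarantees that such an n exists, so None is
    returned only if floating-point error interferes in a borderline case.
    """
    K = len(thetas)
    for n in range(1, N ** K + 1):
        if all(dist_to_nearest_integer(theta * n) < 1.0 / N for theta in thetas):
            return n
    return None

# Approximate ordinates of the first three zeros (illustrative input data).
GAMMAS = [14.134725, 21.022040, 25.010858]
THETAS = [g / (2 * math.pi) for g in GAMMAS]
N = 5
n_found = dirichlet_multiplier(THETAS, N)
print(n_found, [dist_to_nearest_integer(t * n_found) for t in THETAS])
```

The search space N^K grows exponentially in the number K of ordinates, which is the source of the triple logarithm in results obtained this way.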
The truth of the matter is that the sum (15.20) is not an average of the error
term in the Prime Number Theorem, but we can form a weighted sum that
resembles (15.20).
The first factor in the sum is near 1 if γ is small compared to 1/δ, and then
becomes small for larger γ. Thus, despite its more complicated appearance, the
above sum behaves like the partial sum (15.20) with $T \asymp 1/\delta$.
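To see the two ingredients side by side, the sketch below evaluates the truncated sum (15.20) at y = 0 and y = ±π/T, together with a damping factor of the shape sin(γδ)/(γδ). Everything here is an assumption made for illustration: the ten ordinates are hard-coded six-decimal approximations of the first zeros, T = 50 is chosen so that all ten are included, and the names `partial_sum` and `damping` are hypothetical.

```python
import math

# Approximate ordinates of the first ten zeros of zeta (illustrative data).
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def partial_sum(y, T):
    """The truncated sum -2 * sum_{0 < gamma <= T} sin(gamma * y) / gamma."""
    return -2.0 * sum(math.sin(g * y) / g for g in GAMMAS if g <= T)

def damping(gamma, delta):
    """A factor sin(gamma*delta)/(gamma*delta): near 1 when gamma is small
    compared to 1/delta, and of size at most 1/(gamma*delta) otherwise."""
    return math.sin(gamma * delta) / (gamma * delta)

T = 50.0
at_zero = partial_sum(0.0, T)
at_plus = partial_sum(math.pi / T, T)    # every sin(gamma*y) is positive here
at_minus = partial_sum(-math.pi / T, T)  # every sin(gamma*y) is negative here
print(at_zero, at_plus, at_minus)
print(damping(GAMMAS[0], 0.001), damping(GAMMAS[-1], 1.0))
```

With 0 < γ ≤ T and y = π/T, every argument γy lies in (0, π], so all the sines share one sign; this is why the sum is large in absolute value at y = ±π/T while it vanishes at y = 0.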
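for x ≥ 2. We replace x by $e^{\pm\delta}x$ and difference to see that the left-hand side in
the lemma is
$$-\frac{\delta}{\sinh\delta}\sum_{\rho}\frac{\big(e^{\delta(\rho+1)} - e^{-\delta(\rho+1)}\big)x^{\rho}}{2\delta\rho(\rho + 1)} + O(1). \qquad (15.21)$$
so our expression is
$$-ix^{1/2}\sum_{\rho}\frac{\sin\gamma\delta}{\delta}\cdot\frac{x^{i\gamma}}{\rho(\rho + 1)} + O\big(x^{1/2}\big).$$
Now $1/\rho = 1/(i\gamma) + O(1/\gamma^2)$, and the first factor in the above sum is $\ll |\gamma|$,
so that if we replace 1/ρ by 1/(iγ), then we introduce an error term that is
$\ll x^{1/2}\sum_{\gamma}1/\gamma^2 \ll x^{1/2}$. Similarly we may replace 1/(ρ + 1) by 1/(iγ). Thus
we see that the above sum is
$$-x^{1/2}\sum_{\rho}\frac{\sin\gamma\delta}{\gamma\delta}\cdot\frac{x^{i\gamma}}{i\gamma} + O\big(x^{1/2}\big).$$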
Proof The point $p(n) = (\{\theta_1 n\}, \ldots, \{\theta_K n\})$ lies in the hypercube $[0, 1)^K$. We
partition this hypercube into $N^K$ hypercubes of side length 1/N. We allow n
to take the values $0, 1, \ldots, N^K$, which gives us $N^K + 1$ points. Hence by the
pigeon-hole principle there are two values of n, say $0 \le n_1 < n_2 \le N^K$, for
which the points $p(n_1)$, $p(n_2)$ lie in the same hypercube. Thus
and
$$\pi(x) - \mathrm{li}(x) = \Omega_{\pm}\big(x^{1/2}(\log x)^{-1}\log\log\log x\big). \qquad (15.23)$$
Proof We consider (15.22). If RH is false, then Theorem 15.2 is stronger.
Thus it remains to prove (15.22) if RH holds. Let N be a large integer. We
apply Lemma 15.10 to those numbers γ(log N)/(2π) for which 0 < γ ≤ T =
N log N. Thus in Lemma 15.10 we have $K = N(T) \asymp T\log T$, and there exists
an integer n, $1 \le n \le N^K$, such that
$$\Big\|\frac{\gamma n}{2\pi}\log N\Big\| < \frac{1}{N}$$
for 0 < γ ≤ T. We take $x = N^n e^{\pm 1/N}$, δ = 1/N in Lemma 15.9. From the
general inequality $|\sin 2\pi\alpha - \sin 2\pi\beta| \le 2\pi\|\alpha - \beta\|$ we see that
$$|\sin(\gamma\log x) \mp \sin\gamma/N| \le 2\pi/N.$$
Since
$$\sum_{\gamma}\frac{\sin\gamma/N}{\gamma/N}\cdot\frac{1}{\gamma} \ll (\log N)^2$$
and $\sum_{\gamma>T}1/\gamma^2 \ll T^{-1}\log T \ll 1/N$, we deduce that the right-hand side in
Lemma 15.9 is
$$\mp 2x^{1/2}N^{-1}\sum_{\gamma>0}\Big(\frac{\sin\gamma/N}{\gamma/N}\Big)^2 + O\big(x^{1/2}\big).$$
The sum over γ is $\asymp N\log N$. But $x \le N^{N^K}e^{1/N}$ and $K = N(T) \asymp T\log T \asymp N(\log N)^2$, so that
$$\log\log x \ll N(\log N)^3,$$
and hence log N ≥ (1 + o(1)) log log log x. The left-hand side in Lemma 15.9
is simply the average of ψ(u) − u over a neighbourhood of x. Since $x \gg N$
and N is arbitrarily large, we have (15.22).
As for (15.23), we note that if RH holds, then (15.22) and (15.23) are
equivalent, in view of Theorem 13.2. If RH is false, then Theorem 15.2 gives a stronger
result.
15.2.1 Exercises
1. Show that
$$\pi(x; 4, 1) - \pi(x; 4, 3) = \Omega_{\pm}\big(x^{1/2}(\log x)^{-1}\log\log\log x\big)$$
as x → ∞.
2. (a) Show that if f (k−1) (x) is continuous in [a, a + kh] and if f (k) (x) ex-
ists throughout (a, a + kh), then there exists a ξ ∈ (a, a + kh) such
that
k
k
h k f (k) (ξ ) = (−1)k f (a + j h).
j=0
j
(b) Show that there exist constants C > 0, c > 0 such that if RH holds,
then for all x ≥ 2,
and
3. Show that for every C > 1 there is a δ = δ(C) > 0 such that if RH holds,
then
for all x ≥ 2.
4. (Ingham 1936)
(a) Let N be a positive integer, Y a positive real number, and let $\theta_1, \ldots, \theta_K$
be arbitrary real numbers. By using Dirichlet’s theorem, or otherwise,
show that there is a real number y, $Y \le y \le YN^K$, such that $\|\theta_k y\| < 1/N$
for 1 ≤ k ≤ K.
(b) Let N be an integer > 1, Y a positive real number. Show that there
exist real numbers $\theta_1, \ldots, \theta_K$ such that $\max_k\|\theta_k y\| \ge 1/N$ uniformly
for all real y in the interval $Y \le y \le Y(N - 1)^K$.
(c) Suppose that RH holds. Show that there exists an absolute constant
c > 0 such that for any real numbers X ≥ 2 and Z ≥ 16 there exists
an x, X ≤ x ≤ XZ, for which
$$\pi(x) - \mathrm{li}(x) < -cx^{1/2}(\log x)^{-1}\log\log\log Z.$$
(d) Deduce that there is an absolute constant C > 0 such that if RH holds,
then π(x) − li(x) changes sign in every interval [X, CX] for X ≥ 2.
for x > 0, and that the convergence is uniform in intervals that do not
contain a square-free number.
(c) Let
$$g(y) = \lim_{\nu\to\infty}\sum_{|\gamma|\le T_\nu}\frac{e^{i\gamma y}}{\rho\zeta'(\rho)}.$$
Show that if g(y) is continuous at y0 , then for any ε > 0 there exist
arbitrarily large y such that |g(y) − g(y0 )| < ε.
(d) Show that $g(0^{+}) - g(0^{-}) = 1$.
(e) Deduce that $\limsup_{x\to\infty}|M(x)|/x^{1/2} \ge 1/2$.
10. (a) Let $h(x) = (M(2x) - M(x))/x^{1/2}$. Show that $h(1^{+}) = -1$ and that
$h(1^{-}) = 1$.
(b) Show that
$$\limsup_{x\to\infty}\Big(\sum_{x<n\le 2x}\mu(n)\Big)x^{-1/2} \ge 1.$$
15.3 Notes
Theorems 15.2 and 15.3, and Corollary 15.4, are due in substance to E. Schmidt
(1903). Mertens (1897) conjectured that $|M(x)| \le x^{1/2}$ for all x ≥ 1. This
‘Mertens Hypothesis’ was disproved by Odlyzko and te Riele (1984), who
showed that
$$\limsup_{x\to\infty}\frac{M(x)}{x^{1/2}} \ge 1.06$$
and that
$$\liminf_{x\to\infty}\frac{M(x)}{x^{1/2}} \le -1.009.$$
One would expect that here the lim sup is +∞ and the lim inf is −∞, but
neither of these assertions has been proved. Ingham (1942) proved Theorem
15.5 under the stronger hypothesis that the ordinates γ > 0 are joined by at
most a finite number of linear relations. That one may restrict the coefficients
of the linear relations, and thus in principle verify the hypothesis for the first
several zeros, was shown by Bateman et al. (1971). The product used in the
proof of Theorem 15.5 is very similar to the Riesz products used in the study
of lacunary Fourier series (see Zygmund 1959, pp. 208–212).
The method used to prove Theorem 15.8 was introduced by Littlewood
(1927) for the purpose of providing a simple proof of Theorem 15.3.
Theorem 15.11 was announced by Littlewood (1914), who sketched the
proof. Full details were given later by Hardy and Littlewood (1918). The initial
proofs depended on an appeal to the Phragmén–Lindelöf principle. Ingham
(1936) found that this could be dispensed with. Ingham considered a more
complicated weighted average of ψ(u) − u which led to the simpler weighted
partial sum
$$\sum_{0<\gamma\le T}(1 - \gamma/T)\frac{\sin\gamma y}{\gamma}$$
of the sum (15.20). The present exposition was inspired by Ingham’s editorial
remark in Hardy’s Collected Works (1967, p. 99).
The proof given of Theorem 15.11 is non-effective in the sense that it does
not permit one to determine an explicit constant c about which one can assert
that π(x) > li(x) for some x < c. Skewes (1933, 1955) formulated a slightly
different division into cases (RH ‘nearly true’ vs. RH ‘significantly false’),
which permitted him to show that one can take
c = exp(exp(exp(exp(7.705)))).
One of the problems here is to construct a function f (x) about which one can
assert that in any interval [x0 , f (x0 )] there exist x for which the sum over the non-
trivial zeros is not highly cancelling. That is, the conclusion of Theorem 15.2
must be put in a more quantitative, localized form. In this connection, Littlewood
(1937) was led to consider a question concerning a sum of cosines. Turán
(1946) discovered that the theorem formulated by Littlewood is false – the
argument provided establishes a weaker result than claimed. Turán undertook a
detailed study of such power sums. His ‘power sum method’ has many important
applications to the oscillatory error terms that arise in analytic number theory
(see Turán 1984). In particular, Knapowski (1961) used Turán’s method to
show, without need of extensive numerical calculations, that an effective upper
bound for the constant c can be determined. Subsequently, Lehman (1966)
used extensive numerical information concerning the zeros ρ to show that one
can take $c = 1.65 \times 10^{1165}$. Using the same method te Riele (1989) shows that
π(x) > li x for at least $10^{180}$ consecutive integers in the interval
$[6.627\ldots \times 10^{370},\ 6.687\ldots \times 10^{370}]$. More recently Bays & Hudson (2000) have given
some new regions where π(x) > li(x), the first of these being around
$1.39 \times 10^{316}$. An extension of Littlewood’s theorem to Beurling primes has been given
by Kahane (1999).
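The enormous size of these bounds reflects the fact that π(x) stays below li(x) throughout any range that can be tabulated directly. A minimal sketch of such a tabulation follows, under stated assumptions: li(x) is taken to be the integral of 1/log u from 2 to x (the normalisation suggested by the Mellin transform computation in Section 15.1), it is approximated by composite Simpson quadrature with an arbitrary step count, and the function names are illustrative. Only the three sample points shown are checked.

```python
import math

def prime_count(x):
    """Number of primes <= x, by the sieve of Eratosthenes (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Strike out the multiples p*p, p*p + p, ... up to x.
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

def li(x, steps=200_000):
    """Composite Simpson approximation to the integral of du/log u over [2, x]."""
    if steps % 2:
        steps += 1
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight / math.log(2 + i * h)
    return total * h / 3

for x in (10 ** 3, 10 ** 4, 10 ** 5):
    print(x, prime_count(x), li(x))
```

At each of these sample points the prime count is the smaller quantity, by a margin far larger than the quadrature error.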
Monach & Montgomery (cf. Monach 1980) have conjectured that for every
ε > 0 and every K > 0 there is a $T_0(\varepsilon, K)$ such that
$$\Big|\sum_{0<\gamma\le T}k_\gamma\gamma\Big| > \exp\big(-T^{1+\varepsilon}\big) \qquad (15.24)$$
whenever $T \ge T_0$ and the $k_\gamma$ are integers, not all 0, for which $|k_\gamma| \le K$. From
15.4 References
Anderson, R. J. (1991). On the Möbius sum function, Acta Arith. 59, 205–213.
Bateman, P. T., Brown, J. W., Hall, R. S., Kloss, K. E., Stemmler, R. M. (1971). Linear
relations connecting the imaginary parts of the zeros of the zeta function, Computers
in Number Theory. New York: Academic Press, pp. 11–19.
Bays, C. & Hudson, R. H. (2000). A new bound for the smallest x with π (x) > li(x),
Math. Comp. 69, 1285–1296.
Chebyshev, P. L. (1853). On a new theorem concerning prime numbers of the forms
4n + 1 and 4n + 3, Bull. Acad. Imp. Sci. St. Petersburg, Phys.-Mat. Kl. 11, 208;
Collected Works, Vol. 1. Moscow-Leningrad: Akad. Nauk SSSR.
Hardy, G. H. (1967). Collected Papers of G. H. Hardy, Vol. 2, Oxford: Clarendon Press.
Hardy, G. H. & Littlewood, J. E. (1918). Contributions to the theory of the Riemann
zeta-function and the theory of the distribution of primes, Acta Math. 41, 119–196;
Collected Papers, Vol. 2. Oxford: Clarendon Press, 1967, pp. 20–97.
Haselgrove, C. B. (1958). A disproof of a conjecture of Pólya, Mathematika 5, 141–145.
Ingham, A. E. (1936). A note on the distribution of primes, Acta Arith. 1, 201–211.
(1942). On two conjectures in the theory of numbers, Amer. J. Math. 64, 313–319.
Jurkat, W. B. (1973). On the Mertens Conjecture and Related General Ω-theorems,
Analytic Number Theory (St. Louis, 1972), Proc. Sympos. Pure Math. 24. Providence:
Amer. Math. Soc., pp. 147–158.
Kahane, J.-P. (1999). Un théorème de Littlewood pour les nombres premiers de Beurling,
Bull. London Math. Soc. 31, 424–430.
Knapowski, S. (1961). On sign-changes in the remainder-term in the prime-number
formula, J. London Math. Soc. 36, 451–460.
Landau, E. (1905). Über einen Satz von Tschebyscheff, Math. Ann. 61, 527–550;
Collected Works, Vol. 2. Essen: Thales Verlag, 1986, pp. 206–229; Commentary,
Collected Works, Vol. 3. pp. 72–75.
(1918a). Über einige ältere Vermutungen und Behauptungen in der Primzahlentheorie,
Math. Z. 1, 1–24; Collected Works, Vol. 6. Essen: Thales Verlag, 1986, pp. 469–492.
(1918b). Über einige ältere Vermutungen und Behauptungen in der Primzahlentheorie,
Zweite Abhandlung, Math. Z. 1, 213–219; Collected Works, Vol. 6. Essen: Thales
Verlag, 1986, pp. 506–512.
Lehman, R. S. (1960). On Liouville’s function, Math. Comp. 14, 311–320.
(1966). On the difference π (x) − li(x), Acta Arith. 11, 397–410.
Littlewood, J. E. (1914). Sur la distribution des nombres premiers, C. R. Acad. Sci. Paris
158, 1869–1872; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 829–832.
(1927). Mathematical notes (3): On a theorem concerning the distribution of prime
numbers, J. London Math. Soc. 2, 41–45; Collected Papers, Vol. 2. Oxford: Oxford
University Press, 1982, pp. 833–837.
(1937). Mathematical notes. XII.: An inequality for a sum of cosines, J. London Math.
Soc. 12, 217–221; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 838–842.
Mertens, F. (1897). Über eine zahlentheoretische Funktion, Sitz. Akad. Wiss. Wien 106,
761–830.
Monach, W. R. (1980). Numerical Investigation of Several Problems in Number Theory,
Doctoral Thesis. Ann Arbor: University of Michigan.
Odlyzko, A. M. & te Riele, H. J. J. (1984). Disproof of the Mertens conjecture, J. Reine
Angew. Math. 357, 138–160.
Pólya, G. (1919). Verschiedene Bemerkungen zur Zahlentheorie, Jahresbericht
Deutsche Math.–Ver. 28, 31–40.
te Riele, H. J. J. (1989). On the sign of the difference π(x) − lix, Math. Comp. 48,
323–328.
Schmidt, E. (1903). Über die Anzahl der Primzahlen unter gegebener Grenze, Math.
Ann. 57, 195–204.
Skewes, S. (1933). On the difference π(x) − li x, J. London Math. Soc. 8, 277–283.
(1955). On the difference π(x) − li x, II, Proc. London Math. Soc. (3) 5, 48–69.
Turán, P. (1946). On a theorem of Littlewood, J. London Math. Soc. 21, 268–275;
Collected Papers, Vol. 1. Budapest: Akad Kiadó, 1990, pp. 284–293.
(1948). On some approximative Dirichlet polynomials in the theory of the zeta-
function of Riemann, Danske Vid. Selsk. Mat.-Fys. Medd. 24, no. 17, 36 pp.;
Collected Papers, Vol. 1. Budapest: Akad Kiadó, 1990, pp. 369–402.
(1984). On a New Method of Analysis and its Applications, New York: Wiley-
Interscience.
Zygmund, A. (1959). Trigonometric Series, Vol. 1. Cambridge: Cambridge University
Press.
Appendix A
The Riemann–Stieltjes integral
We generalize the Riemann integral $\int_a^b f(x)\,dx$ by defining an integral $\int_a^b f(x)\,dg(x)$ as a limit of Riemann sums $\sum_n f(\xi_n)\,\Delta g(x_n)$. More precisely, for $a < b$ suppose that we have a partition
\[ a = x_0 \le x_1 \le \cdots \le x_N = b. \tag{A.1} \]
|S(xn , ξn ) − I | < ε
where the supremum is taken over all $\{x_n\}$ satisfying (A.1). Since $f$ is uniformly continuous on $[a, b]$, there is a $\delta > 0$ such that $|f(\xi) - f(\xi')| < \varepsilon$ whenever $|\xi - \xi'| \le \delta$. We show that
\[ = \varepsilon \sum_{m=1}^{M} |g(x''_m) - g(x''_{m-1})| \le \varepsilon \operatorname{Var}_{[a,b]} g. \tag{A.3} \]
We now take $\{x''_m\}$ to be the union of $\{x_n\}$ and $\{x'_n\}$, so that both $\{x_n\}$ and $\{x'_n\}$ are subsequences of $\{x''_m\}$. Since
\[ |S(x_n, \xi_n) - S(x'_n, \xi'_n)| = |S(x_n, \xi_n) - S(x''_m, \xi''_m) + S(x''_m, \xi''_m) - S(x'_n, \xi'_n)| \]
\[ \le |S(x_n, \xi_n) - S(x''_m, \xi''_m)| + |S(x''_m, \xi''_m) - S(x'_n, \xi'_n)| \]
by the triangle inequality, the desired bound (A.2) follows by applying (A.3) twice.
The main negative feature of the Riemann–Stieltjes integral is that $\int_a^b f\,dg$ does not exist if $f$ and $g$ have a common discontinuity in $(a, b)$. However,
There is some freedom in the interval of integration, since the left endpoint can be any number in $[0, 1)$, and the right endpoint can be any number in $[N, N + 1)$ without affecting the value of the integral. Frequently it is useful to integrate from $1^-$ to $N$, i.e. to consider $\lim_{\varepsilon \to 0^+} \int_{1-\varepsilon}^N$. Some care must be exercised in choosing the endpoints of integration, since for example
\[ \int_1^N f(x)\,dA(x) = \sum_{n=2}^{N} a_n f(n). \]
Theorem A.2 If $\int_a^b f\,dg$ exists, then $\int_a^b g\,df$ also exists, and
\[ \int_a^b g\,df = f(b)g(b) - f(a)g(a) - \int_a^b f\,dg. \]
As we see in the above, we lose no information by writing $\int_a^b f\,dg$ instead of the longer $\int_a^b f(x)\,dg(x)$. On combining Theorems A.1 and A.2 we see that $\int_a^b f\,dg$ exists if $f$ is of bounded variation on $[a, b]$ and $g$ is continuous on $[a, b]$.
Here the sum on the right-hand side is a Riemann–Stieltjes sum $S(\xi_n, x_{n-1})$ approximating to $\int_a^b f\,dg$, since $x_{n-1} \in [\xi_{n-1}, \xi_n]$. Moreover, $\operatorname{mesh}\{\xi_n\} \le 2\operatorname{mesh}\{x_n\}$, so that the sum on the right tends to $\int_a^b f\,dg$ as $\operatorname{mesh}\{x_n\}$ tends to 0.
This proof displays the close relation between partial summation and integration by parts. Rather than sum the series $\sum a_n f(n)$ by parts, we can integrate by parts in (A.4) to see that
\[ \sum_{n=1}^{N} a_n f(n) = A(N) f(N) - \int_0^N A(x)\,df(x). \tag{A.5} \]
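The identity (A.5) can be checked exactly on a small example. The following is a minimal sketch, not part of the original text: the coefficients `coeffs` and the weight function `weight` are hypothetical choices, and the integral is evaluated using the fact that $A(x)$ is constant on each interval $[k, k+1)$ while the integrator is continuous.

```python
from fractions import Fraction

def partial_sums(a):
    """Return [A(0), A(1), ..., A(N)], where A(k) = a_1 + ... + a_k."""
    sums = [Fraction(0)]
    for term in a:
        sums.append(sums[-1] + term)
    return sums

def direct_sum(a, f):
    """Left-hand side of (A.5): the sum of a_n f(n) for 1 <= n <= N."""
    return sum(a_n * f(n) for n, a_n in enumerate(a, start=1))

def by_parts(a, f):
    """Right-hand side of (A.5): A(N) f(N) minus the integral of A df over [0, N]."""
    N = len(a)
    A = partial_sums(a)
    # A(x) equals A(k) on [k, k+1) and f is continuous, so the Stieltjes
    # integral of A df over [k, k+1] is exactly A(k) * (f(k+1) - f(k)).
    integral = sum(A[k] * (f(k + 1) - f(k)) for k in range(N))
    return A[N] * f(N) - integral

# Hypothetical data chosen only for illustration.
coeffs = [Fraction((-1) ** n * n) for n in range(1, 13)]

def weight(x):
    return Fraction(1, (x + 1) ** 2)

print(direct_sum(coeffs, weight) == by_parts(coeffs, weight))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.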
It is to be expected that if $g$ is differentiable, then $\int_a^b f\,dg$ should resemble
Theorem A.4 Suppose that $g$ has bounded variation, and put $g^*(x) = \operatorname{Var}_{[a,x]} g$. Then
\[ \left| \int_a^b f(x)\,dg(x) \right| \le \int_a^b |f(x)|\,dg^*(x). \]
Proof Clearly
\[ |S(x_n, \xi_n)| \le \sum_{n=1}^{N} |f(\xi_n)|\,|g(x_n) - g(x_{n-1})| \le \sum_{n=1}^{N} |f(\xi_n)|\,(g^*(x_n) - g^*(x_{n-1})), \]
Exercises
1. Suppose that ϕ(t) is continuous and strictly increasing for α ≤ t ≤ β, and
that ϕ(α) = a, ϕ(β) = b. Put F(t) = f (ϕ(t)), G(t) = g(ϕ(t)). Show that
\[ \int_a^b f(x)\,dg(x) = \int_\alpha^\beta F(t)\,dG(t) \]
4. Show that
\[ \left| \int_a^b f g\,dh \right|^2 \le \left( \int_a^b |f|^2\,|dh| \right) \left( \int_a^b |g|^2\,|dh| \right) \]
7. (Second mean value theorem) Suppose that f and g are real-valued functions
with f weakly increasing on [a, b], and g continuous on this interval. Show
that there is an x0 ∈ [a, b] such that
\[ \int_a^b f\,dg = f(a)(g(x_0) - g(a)) + f(b)(g(b) - g(x_0)). \]
8. (Darst & Pollard 1970) Suppose that f and g are real-valued functions with
f of bounded variation on [a, b], and g continuous on this interval. (a) Show
that if ξ ∈ [a, b] and f (ξ ) = 0, then
\[ \int_\xi^b f\,dg \le \operatorname{Var}_{[\xi,b]}(f)\,\max_{\xi \le x \le b} (g(b) - g(x)), \]
\[ \int_a^\xi f\,dg \le \operatorname{Var}_{[a,\xi]}(f)\,\max_{a \le x \le \xi} (g(x) - g(a)). \]
9. Suppose that
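\[ f(x) = \begin{cases} 1 & \text{if } 0 < x \le 1, \\ 0 & \text{otherwise;} \end{cases} \qquad g(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1, \\ 0 & \text{otherwise.} \end{cases} \]
Show that $\int_{-1}^0 f\,dg$ and $\int_0^1 f\,dg$ both exist, but that $\int_{-1}^1 f\,dg$ does not exist.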
A.1 Notes
Our treatment follows that of Ingham in his lectures at Cambridge University.
Several variants of the Riemann–Stieltjes (R-S) integral have been proposed.
The integral as we have defined it is known as the uniform Riemann–Stieltjes
integral. A slightly more powerful variant is the refinement Riemann–Stieltjes integral, in which $\int_a^b f\,dg$ is said to have the value $I$ if for every $\varepsilon > 0$ there is a partition $\{x_n\}$ such that if $\{x'_m\}$ is a refinement of $\{x_n\}$, then $|S(x'_m, \xi'_m) - I| < \varepsilon$ for all choices of $\xi'_m \in [x'_{m-1}, x'_m]$. The refinement Riemann–Stieltjes integral is developed in considerable detail by Apostol (1974, Chapter 9) and Bartle (1964, Section 22), and is used by Bateman & Diamond (2004). If $\int_a^b f\,dg$ exists in the sense of uniform R–S integration, then it also exists in the refinement R–S sense, and has the same value. The refinement integral has the attractive property that if $a < b < c$, and if $\int_a^b f\,dg$, $\int_b^c f\,dg$ both exist, then $\int_a^c f\,dg$ exists and
\[ \int_a^c f\,dg = \int_a^b f\,dg + \int_b^c f\,dg. \]
This is not true for the uniform R–S integral, as we see by the example in
Exercise A.9.
We mention without proof two more advanced properties of the Riemann–
Stieltjes integral: If f is continuous on [a, b], and if g is absolutely continuous
on the same interval, then
\[ \int_a^b f\,dg = \int_a^b f g' \]
where the integral on the right is a Lebesgue integral. Secondly, the Riesz
representation theorem, which is fundamental to functional analysis, asserts that
if G is a positive bounded linear functional on the space C[a, b] of continuous
functions on [a, b], then there exists a weakly increasing function g on [a, b]
such that
\[ G(f) = \int_a^b f\,dg \]
for all f ∈ C[a, b]. An account of this is given in Kestelman (1960, pp. 265–
269).
For more extensive accounts of Riemann–Stieltjes integration, see Apostol
(1974, Chapter 9), Hildebrandt (1938), Kestelman (1960, Chapter 11), Rankin
(1963, Section 29), or Widder (1946, Chapter 1).
A.2 References
Apostol, T. M. (1974). Mathematical Analysis, Second edition. Menlo Park: Addison–
Wesley.
Bartle, R. G. (1964). The Elements of Real Analysis. New York: Wiley.
Bateman, P. T. & Diamond, H. G. (2004). Analytic number theory. An introductory
course, Hackensack: World Scientific.
Darst, R. & Pollard, H. (1970). An inequality for the Riemann–Stieltjes integral, Proc.
Amer. Math. Soc. 25, 912–913.
Hildebrandt, T. H. (1938). Stieltjes integrals of the Riemann type, Amer. Math. Monthly
45, 265–277.
Kestelman, H. (1960). Modern Theories of Integration. New York: Dover.
Rankin, R. A. (1963). An Introduction to Mathematical Analysis. Oxford: Pergamon.
Widder, D. V. (1946). The Laplace Transform, Princeton: Princeton University
Press.
Appendix B
Bernoulli numbers and the Euler–Maclaurin
summation formula
since $[x] = x - \{x\}$. On integrating the last integral by parts (recall Theorem A.2), we find that the right-hand side above is
\[ \int_a^b f(x)\,dx - f(b)\{b\} + f(a)\{a\} + \int_a^b \{x\}\,df(x). \]
The familiar ‘integral test’ is an immediate corollary of this identity, and indeed the last term on the right gives an explicit representation of the difference between $\sum f(n)$ and $\int f(x)\,dx$. If $f$ has a continuous first derivative then (by Theorem A.3) we may replace $df(x)$ by $f'(x)\,dx$ in the last integral, so that
\[ \sum_{a < n \le b} f(n) = \int_a^b f(x)\,dx - f(b)\{b\} + f(a)\{a\} + \int_a^b \{x\} f'(x)\,dx. \tag{B.1} \]
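A small exact check of (B.1) may make the role of each term clearer. The sketch below is illustrative only: it fixes the hypothetical choice $f(x) = x^2$, so that every integral has a closed form, and splits $[a, b]$ at the integers so that $\{x\} = x - k$ on each piece.

```python
from fractions import Fraction
from math import floor

def f(x):
    return x * x

def f_antiderivative(x):
    return x ** 3 / 3

def weighted_antiderivative(x, k):
    # Antiderivative of (x - k) * f'(x) = (x - k) * 2x, used on [k, k+1].
    return 2 * x ** 3 / 3 - k * x ** 2

def frac_part(x):
    return x - floor(x)

def sum_side(a, b):
    """Left-hand side of (B.1): the sum of f(n) over integers a < n <= b."""
    return sum(f(n) for n in range(floor(a) + 1, floor(b) + 1))

def integral_side(a, b):
    """Right-hand side of (B.1), evaluated exactly for f(x) = x^2."""
    main = f_antiderivative(b) - f_antiderivative(a)
    boundary = -f(b) * frac_part(b) + f(a) * frac_part(a)
    # Cut [a, b] at the integers strictly inside, so {x} = x - k on each piece.
    cuts = [a] + [Fraction(n) for n in range(floor(a) + 1, floor(b) + 1) if n < b] + [b]
    correction = Fraction(0)
    for left, right in zip(cuts, cuts[1:]):
        k = floor(left)
        correction += weighted_antiderivative(right, k) - weighted_antiderivative(left, k)
    return main + boundary + correction

a, b = Fraction(3, 2), Fraction(27, 4)
print(sum_side(a, b), integral_side(a, b))
```

For these endpoints the integers in $(a, b]$ are 2 through 6, so both sides equal $4 + 9 + 16 + 25 + 36 = 90$.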
\[ B_0(x) = 1. \tag{B.2} \]
If $B_{k-1}(x)$ is given, then $B_k(x)$ is determined, apart from its constant term, by the differential equation
\[ \frac{d}{dx} B_k(x) = k B_{k-1}(x) \qquad (k \ge 1). \tag{B.3} \]
The Bernoulli number $B_k$ is the constant term of $B_k(x)$. Its value is determined by the condition
\[ \int_0^1 B_k(x)\,dx = 0 \qquad (k \ge 1). \tag{B.4} \]
From (B.2) and (B.3) we see that $B_1(x) = x + B_1$, and from (B.4) we deduce that $B_1 = -1/2$. Hence $B_2(x) = x^2 - x + B_2$, and then we find that $B_2 = 1/6$. These polynomials and numbers have many significant properties, a few of which we now investigate.
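The defining relations (B.2)–(B.4) translate directly into a short computation. The following is a sketch under the assumption that a polynomial is stored as a list of rational coefficients, lowest degree first; the function name `bernoulli_polynomials` is our own.

```python
from fractions import Fraction

def bernoulli_polynomials(K):
    """Coefficient lists (lowest degree first) of B_0(x), ..., B_K(x)."""
    polys = [[Fraction(1)]]                       # (B.2): B_0(x) = 1
    for k in range(1, K + 1):
        previous = polys[-1]
        # (B.3): integrate k * B_{k-1}(x); entry j is the coefficient of x^(j+1).
        higher = [k * c / (j + 1) for j, c in enumerate(previous)]
        # (B.4): choose the constant term so that the integral over [0, 1] is 0.
        constant = -sum(c / (j + 2) for j, c in enumerate(higher))
        polys.append([constant] + higher)
    return polys

polys = bernoulli_polynomials(12)
bernoulli_numbers = [p[0] for p in polys]         # B_k is the constant term
print(bernoulli_numbers[:5])
```

The constant terms reproduce $B_1 = -1/2$ and $B_2 = 1/6$ as derived above, and the later entries can be compared with Table B.1.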
[Figure B.1: graphs of Bernoulli polynomials.]
In view of (B.3), the integral (B.4) is (Bk+1 (1) − Bk+1 (0))/(k + 1). Thus (B.4)
is equivalent to the assertion that
After subtracting Bk from both sides, this identity provides a formula for Bk−1
in terms of B0 , B1 , . . . , Bk−2 .
Next we determine a power series generating function for the $B_k$. The function $z/(e^z - 1)$ is analytic except at the points $z = 2\pi k i$, $k \ne 0$. In particular, this function is analytic in the disc $|z| < 2\pi$, and we may write its power series in the form
\[ \frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{c_k}{k!} z^k. \]
After multiplying both sides by $e^z - 1$ and equating power series coefficients, we see not only that $c_0 = 1$ but also that the $c_k$ satisfy the recurrence (B.7). Consequently $c_k = B_k$ for all $k$. That is,
\[ \frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!} z^k \qquad (|z| < 2\pi). \tag{B.8} \]
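As a numerical illustration of (B.8), one can generate the coefficients by equating power series coefficients and compare a truncated series with $z/(e^z - 1)$. This is a sketch only: the recurrence used in the code, $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$, is what the coefficient comparison gives, and we assume (without the text of (B.7) at hand) that it agrees with the recurrence cited above. The truncation point `K` is an arbitrary choice.

```python
from fractions import Fraction
from math import comb, exp, factorial

def bernoulli_numbers(K):
    """B_0, ..., B_K from sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def series_value(z, K=40):
    """Truncation of the power series (B.8) after the term k = K."""
    B = bernoulli_numbers(K)
    return sum(float(B[k]) / factorial(k) * z ** k for k in range(K + 1))

z = 1.0
print(series_value(z), z / (exp(z) - 1))
```

Since the series has radius of convergence $2\pi$, the truncation error at $z = 1$ is far below floating-point precision; closer to $|z| = 2\pi$ many more terms would be needed.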
Theorem B.1 If $k$ is odd, then
\[ B_k = 0 \qquad (k \ge 3), \tag{B.9} \]
\[ B_k(x) = -B_k(1 - x) \qquad (k \ge 1), \tag{B.10} \]
\[ \operatorname{sgn} B_k(x) = (-1)^{(k+1)/2} \qquad (k \ge 1,\ 0 < x < 1/2). \tag{B.11} \]
If $k$ is even, then
\[ (-1)^{k/2} B_k(x) \uparrow \qquad (k \ge 2,\ 0 < x < 1/2), \tag{B.12} \]
\[ B_k(x) = B_k(1 - x) \qquad (k \ge 0), \tag{B.13} \]
\[ \operatorname{sgn} B_k = (-1)^{(k/2)+1} \qquad (k \ge 2). \tag{B.14} \]
From (B.10) and (B.13) we see that $B_k(x + 1/2)$ is an odd function for odd $k$, and an even function for even $k$. From (B.10) it follows that the sign is reversed in (B.11) if the interval $0 < x < 1/2$ is replaced by $1/2 < x < 1$, and similarly from (B.12) and (B.13) we see that $(-1)^{k/2} B_k(x)$ is strictly decreasing for $1/2 \le x \le 1$ when $k$ is even, $k \ge 2$. Such properties are evident in the graphs of Figure B.1.
Proof These assertions are evident for $k = 0, 1, 2$. We proceed by induction.
Case 1. $k$ odd. We integrate by parts in (B.4) and use (B.3) to see that
\[ 0 = B_k - k \int_0^1 x B_{k-1}(x)\,dx. \]
Table B.1
k Bk
0 1/1 = 1.00000 00000
1 −1/2 = −0.50000 00000
2 1/6 = 0.16666 66667
4 −1/30 = −0.03333 33333
6 1/42 = 0.02380 95238
8 −1/30 = −0.03333 33333
10 5/66 = 0.07575 75758
12 −691/2730 = −0.25311 35531
14 7/6 = 1.16666 66667
16 −3617/510 = −7.09215 68627
18 43867/798 = 54.97117 79449
20 −174611/330 = −529.12424 24242
From (B.13)$_{k-1}$ we see that this integral is $\frac{1}{2} \int_0^1 B_{k-1}$. By (B.4) this integral vanishes, so we have (B.9). To prove (B.10), let
\[ f_k(x) = B_k(x) + B_k(1 - x). \]
Then (B.3) gives $f_k'(x) = k(B_{k-1}(x) - B_{k-1}(1 - x))$, which vanishes by (B.13)$_{k-1}$. Thus $f_k(x)$ is a constant. To determine its value we note that by (B.6) and (B.9), $f_k(0) = 2B_k = 0$. Thus we have (B.10). To prove (B.11) we first note that $B_k(0) = B_k(1/2) = 0$ by (B.9) and (B.10). Suppose that $k \equiv 1 \pmod 4$. It now suffices to show that $B_k(x)$ is convex for $0 < x < 1/2$. But this follows from (B.3) and (B.12)$_{k-1}$. If $k \equiv 3 \pmod 4$, then $B_k(x)$ is concave for $0 < x < 1/2$, and (B.11) again follows.
Case 2. $k$ even. The assertion (B.12) is immediate from (B.3) and (B.11)$_{k-1}$. To prove (B.13), take
\[ g_k(x) = B_k(x) - B_k(1 - x). \]
Then by (B.3) we have $g_k'(x) = k f_{k-1}(x) = 0$ by (B.10)$_{k-1}$. Thus $g_k(x)$ is a constant. But $g_k(0) = 0$ by (B.6). To prove (B.14) we note by (B.4) and (B.13) that
\[ \int_0^{1/2} B_k(x)\,dx = 0. \]
From this and (B.12) it follows that $(-1)^{k/2} B_k(0) < 0$, $(-1)^{k/2} B_k(1/2) > 0$. Thus we have (B.14), and the proof is complete.
The first Bernoulli numbers are easily calculated; in Table B.1 we display
only the non-zero values.
For even k, the identity (B.13) contains (B.6) as a special case. For odd k,
(B.6) is similarly contained in (B.10), in view of (B.9). The identity (B.6) can
be generalized in other ways. For example,
\[ \frac{B_{k+1}(x + 1) - B_{k+1}(x)}{k + 1} = x^k \qquad (k \ge 0). \tag{B.15} \]
This is obvious for k = 0; to prove this for larger k we argue by induction.
By the inductive hypothesis we see that the derivatives of the two sides are
equal. Thus the two sides differ by at most a constant. We set x = 0 and use
(B.6) to see that this constant is 0.
Suppose that a and b are integers with a < b. In (B.15) we let x take on the
values a, a + 1, . . . , b, and sum, to obtain the important corollary
\[ \sum_{n=a}^{b} n^k = \frac{B_{k+1}(b + 1) - B_{k+1}(a)}{k + 1} \qquad (k \ge 0). \tag{B.16} \]
Apart from the value of the constant term, there can be at most one polynomial
with this property. Hence this identity provides a further characterization of the
polynomials Bk (x).
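The corollary (B.16) is easy to test on concrete ranges. The sketch below rebuilds the Bernoulli polynomials from (B.2)–(B.4) with exact rational coefficients and compares both sides of (B.16); the helper names and the sample values of $a$, $b$ are our own choices.

```python
from fractions import Fraction

def bernoulli_polynomials(K):
    """Coefficient lists (lowest degree first) of B_0(x), ..., B_K(x)."""
    polys = [[Fraction(1)]]
    for k in range(1, K + 1):
        higher = [k * c / (j + 1) for j, c in enumerate(polys[-1])]
        constant = -sum(c / (j + 2) for j, c in enumerate(higher))
        polys.append([constant] + higher)
    return polys

def evaluate(poly, x):
    """Value of the polynomial with the given coefficient list at x."""
    return sum((c * x ** j for j, c in enumerate(poly)), Fraction(0))

def power_sum_direct(a, b, k):
    return sum(n ** k for n in range(a, b + 1))

def power_sum_bernoulli(a, b, k, polys):
    """Right-hand side of (B.16)."""
    B = polys[k + 1]
    return (evaluate(B, b + 1) - evaluate(B, a)) / (k + 1)

polys = bernoulli_polynomials(8)
for k in range(7):
    print(k, power_sum_direct(3, 10, k), power_sum_bernoulli(3, 10, k, polys))
```

The two columns printed for each $k$ should coincide, which is the content of (B.16) for $a = 3$, $b = 10$.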
When (B.1) is integrated by parts repeatedly the functions $B_k(\{x\})$ arise. Since these latter functions have period 1, it is natural to consider their expansions in Fourier series. In general, if $f$ has period 1 we define the Fourier coefficient $\widehat{f}(m)$ by the formula
\[ \widehat{f}(m) = \int_0^1 f(x) e(-mx)\,dx \]
where $e(\theta) = e^{2\pi i \theta}$. From (B.4) we see that $\widehat{B}_k(0) = 0$ for all $k \ge 1$. By integrating by parts we find that if $m \ne 0$, then $\widehat{B}_1(m) = -1/(2\pi i m)$. If $F$ has period 1 and $F' = f \in L^1(\mathbb{T})$, then $\widehat{F}(m) = \widehat{f}(m)/(2\pi i m)$ for $m \ne 0$. Hence by (B.3) we see that $\widehat{B}_k(m) = k \widehat{B}_{k-1}(m)/(2\pi i m)$ and hence that $\widehat{B}_k(m) = -k!/(2\pi i m)^k$ for $m \ne 0$. Now $B_1(\{x\})$ has a jump discontinuity at the integers, but since it has bounded variation on $[0, 1]$ the symmetric partial sums of its Fourier series will converge to $B_1(\{x\})$ when $x$ is not an integer. For $k > 1$ the function $B_k(\{x\})$ is continuous and its Fourier series is absolutely convergent, so the series converges uniformly to $B_k(\{x\})$. Thus we have proved
Theorem B.2 If $x \notin \mathbb{Z}$, then
\[ B_1(\{x\}) = -\frac{1}{\pi} \sum_{m=1}^{\infty} \frac{1}{m} \sin 2\pi m x. \tag{B.17} \]
If $k > 1$, then
\[ B_k(\{x\}) = -k! \sum_{m \ne 0} (2\pi i m)^{-k} e(mx) \tag{B.18} \]
uniformly in $x$.
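The expansion (B.18) can be compared numerically with the explicit polynomials $B_2(x) = x^2 - x + 1/6$ and $B_3(x) = x^3 - \tfrac{3}{2}x^2 + \tfrac{1}{2}x$. This is a rough sketch: the cutoff `M` is a hypothetical choice, and the truncated sum differs from the true value by roughly the size of the omitted tail.

```python
import cmath
from math import factorial, pi

def bernoulli_fourier(k, x, M=5000):
    """Symmetric truncation of the series (B.18) to the terms 0 < |m| <= M."""
    total = 0j
    for m in range(1, M + 1):
        for signed_m in (m, -m):
            total += (2j * pi * signed_m) ** (-k) * cmath.exp(2j * pi * signed_m * x)
    return (-factorial(k) * total).real

def b2(x):
    return x * x - x + 1 / 6

def b3(x):
    return x ** 3 - 1.5 * x * x + 0.5 * x

x = 0.3
print(bernoulli_fourier(2, x), b2(x))
print(bernoulli_fourier(3, x), b3(x))
```

The agreement is much better for $k = 3$ than for $k = 2$, reflecting the faster decay $|m|^{-k}$ of the Fourier coefficients.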
A self-contained proof of (B.17), with particular attention to the rate of
convergence, is given in Appendix D.1. Since only the defining properties (B.3)
and (B.4) were used in deriving the above, these formulæ provide a second
means of proving the earlier assertions (B.6), (B.9), (B.10), (B.13), (B.14).
These formulæ have many applications. For example, we may take x = 0 in
(B.18) to obtain
Corollary B.3 For any integer $k \ge 1$,
\[ \zeta(2k) = (-1)^{k-1} 2^{2k-1} \pi^{2k} B_{2k}/(2k)!. \tag{B.19} \]
Hence $\zeta(2) = \pi^2/6$, $\zeta(4) = \pi^4/90$, $\zeta(6) = \pi^6/945$, and in general $\zeta(2k)$ is a rational multiple of $\pi^{2k}$.
Since $1 < \zeta(2k) < 1 + 2^{2-2k}$ for $k \ge 1$, this gives not only the sign of $B_{2k}$ but also a very precise estimate of its size, namely
\[ 2(2k)!(2\pi)^{-2k} < |B_{2k}| < 2(2k)!(2\pi)^{-2k}(1 + 2^{2-2k}) \qquad (k \ge 1). \tag{B.20} \]
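Both (B.19) and (B.20) lend themselves to a quick numerical check. The sketch below is illustrative: the Bernoulli numbers are generated by the standard recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ (an assumption about how one chooses to compute them, not a formula quoted from this passage), and the direct summation of $\zeta(s)$ uses an arbitrary cutoff.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(K):
    """B_0, ..., B_K from sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_even(k, B):
    """Right-hand side of (B.19)."""
    return (-1) ** (k - 1) * 2 ** (2 * k - 1) * pi ** (2 * k) * float(B[2 * k]) / factorial(2 * k)

def zeta_direct(s, terms=20000):
    """Partial sum of the Dirichlet series; adequate here for s >= 4."""
    return sum(n ** (-s) for n in range(1, terms + 1))

def b20_bounds(k):
    """Lower and upper bounds for |B_{2k}| from (B.20)."""
    lower = 2 * factorial(2 * k) * (2 * pi) ** (-2 * k)
    return lower, lower * (1 + 2.0 ** (2 - 2 * k))

B = bernoulli_numbers(16)
print(zeta_even(2, B), zeta_direct(4))
for k in range(1, 9):
    lower, upper = b20_bounds(k)
    print(k, lower, abs(float(B[2 * k])), upper)
```

For large $k$ the lower bound in (B.20) is extremely close to $|B_{2k}|$, so a floating-point comparison is only meaningful for moderate $k$; exact arithmetic would be needed beyond that.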
We may similarly derive from Theorem B.2 an estimate for the Bernoulli poly-
nomials in the interval 0 ≤ x ≤ 1.
Corollary B.4 Suppose that $0 \le x \le 1$. Then $|B_1(x)| \le 1/2$, and
\[ |B_k(x)| \le k!\,2^{1-k} \pi^{-k} \zeta(k) \qquad (k \ge 2). \tag{B.21} \]
If k is even, then this takes the simpler form |Bk (x)| ≤ |Bk |, and equality
is achieved when x = 0 or 1. For odd k ≥ 3 the inequality can be improved
slightly (see Exercise B.5(e)).
We are now in a position to formulate the Euler–Maclaurin summation
formula.
Theorem B.5 (Euler–Maclaurin) Suppose that $K$ is a positive integer and that $f$ has continuous derivatives through the $K$th order on the interval $[a, b]$ where $a$ and $b$ are real numbers with $a < b$. Then
\[ \sum_{a < n \le b} f(n) = \int_a^b f(x)\,dx + \sum_{k=1}^{K} \frac{(-1)^k}{k!} \left( B_k(\{b\}) f^{(k-1)}(b) - B_k(\{a\}) f^{(k-1)}(a) \right) - \frac{(-1)^K}{K!} \int_a^b B_K(\{x\}) f^{(K)}(x)\,dx. \]
In most applications the last term is treated as an error term that is only
crudely bounded. For example, by Corollary B.4 above we see that the modulus
of this term does not exceed
\[ \frac{2\zeta(K)}{(2\pi)^K} \int_a^b |f^{(K)}(x)|\,dx. \tag{B.22} \]
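When $f$ is a polynomial of degree less than $K$, the derivative $f^{(K)}$ vanishes identically, so the last term of the Euler–Maclaurin formula is zero and the remaining terms must reproduce the sum exactly. The following sketch uses this to test the formula with non-integral endpoints; the polynomial $f(x) = x^3$, the endpoints, and all helper names are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import factorial, floor

def bernoulli_polynomials(K):
    """Coefficient lists (lowest degree first) of B_0(x), ..., B_K(x)."""
    polys = [[Fraction(1)]]
    for k in range(1, K + 1):
        higher = [k * c / (j + 1) for j, c in enumerate(polys[-1])]
        constant = -sum(c / (j + 2) for j, c in enumerate(higher))
        polys.append([constant] + higher)
    return polys

def evaluate(poly, x):
    return sum((c * x ** j for j, c in enumerate(poly)), Fraction(0))

def derivative(poly):
    return [j * c for j, c in enumerate(poly)][1:]

def integral(poly, a, b):
    anti = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(poly)]
    return evaluate(anti, b) - evaluate(anti, a)

def euler_maclaurin_main_terms(f, a, b, K):
    """Integral plus boundary terms of Theorem B.5, omitting the last integral."""
    B = bernoulli_polynomials(K)
    frac_a, frac_b = a - floor(a), b - floor(b)
    total = integral(f, a, b)
    deriv = f                                    # holds f^(k-1) at step k
    for k in range(1, K + 1):
        boundary = (evaluate(B[k], frac_b) * evaluate(deriv, b)
                    - evaluate(B[k], frac_a) * evaluate(deriv, a))
        total += Fraction((-1) ** k, factorial(k)) * boundary
        deriv = derivative(deriv)
    return total

cube = [Fraction(0), Fraction(0), Fraction(0), Fraction(1)]   # f(x) = x^3
a, b = Fraction(1, 2), Fraction(11, 3)
print(euler_maclaurin_main_terms(cube, a, b, 4))               # integers in (a, b] are 1, 2, 3
```

With $K = 4$ the omitted integral involves $f^{(4)} = 0$, so the printed value should equal $1^3 + 2^3 + 3^3 = 36$.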
Here the second term has a pole at $s = 1$, but the integral converges for $\sigma > 1 - K$, and hence this formula provides an analytic continuation of $\zeta(s)$ into this larger half-plane. Since $K$ can be taken arbitrarily large, it follows that $\zeta(s)$ is analytic in the entire plane, apart from the pole at $s = 1$. Moreover, the factor $\binom{-s}{K}$ has zeros at $s = 0$, $s = -1$, $\ldots$, $s = 1 - K$, and so the last term vanishes when $s$ is a non-positive integer and $K$ is sufficiently large. Let $n$ denote a non-negative integer, and set $s = -n$. If $K \ge n + 2$, then we find that
\[ \zeta(-n) = 1 - \frac{1}{n + 1} - \sum_{k=1}^{K} (-1)^k \binom{n}{k-1} \frac{B_k}{k}. \]
Here the sum may be restricted to $1 \le k \le n + 1$, since the binomial coefficient vanishes when $k > n + 1$. Thus we obtain an expression for $\zeta(-n)$ that is
independent of K . Since there are only finitely many terms on the right-hand side
above, and since each term is rational, it is at once clear that ζ (−n) is a rational
number. However, by making use of the properties of Bernoulli polynomials we can make this more precise. First we use the identity $(n + 1)\binom{n}{k-1} = k\binom{n+1}{k}$, and then we observe that the second term on the right supplies an amount that would arise if we allowed $k = 0$ in the sum. Thus we see that
\[ \zeta(-n) = 1 - \frac{1}{n + 1} \sum_{k=0}^{n+1} (-1)^k \binom{n+1}{k} B_k. \]
By taking $x = -1$ in (B.5), we see that the above is
\[ = 1 + \frac{(-1)^n}{n + 1} B_{n+1}(-1). \]
By taking $x = -1$ in (B.15) we see that $B_{n+1}(-1) = B_{n+1} - (-1)^n (n + 1)$. Hence we conclude that
\[ \zeta(-n) = (-1)^n \frac{B_{n+1}}{n + 1}. \]
In conjunction with the values provided by Theorem B.1, this may be formulated
as follows.
Theorem B.6 Apart from a simple pole at s = 1, the zeta function is analytic
in the complex plane. Moreover, ζ (0) = −1/2, ζ (−2n) = 0 for n = 1, 2, . . . ,
and ζ (1 − 2n) = −B2n /(2n) for n = 1, 2, . . . .
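The finite sum for $\zeta(-n)$ and the closed form $(-1)^n B_{n+1}/(n+1)$ can be compared exactly. In this sketch the Bernoulli numbers come from the standard recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$, which is an assumption of the code rather than a formula quoted from this passage; the function names are our own.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(K):
    """B_0, ..., B_K from sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_neg_sum(n, B):
    """zeta(-n) from the finite sum with K = n + 2, before simplification."""
    K = n + 2
    tail = sum(Fraction((-1) ** k * comb(n, k - 1)) * B[k] / k for k in range(1, K + 1))
    return 1 - Fraction(1, n + 1) - tail

def zeta_neg_closed(n, B):
    """zeta(-n) = (-1)^n B_{n+1} / (n + 1)."""
    return (-1) ** n * B[n + 1] / (n + 1)

B = bernoulli_numbers(16)
for n in range(6):
    print(n, zeta_neg_sum(n, B), zeta_neg_closed(n, B))
```

The output illustrates Theorem B.6: the value at $n = 0$ is $-1/2$, the values at positive even $n$ vanish, and the values at odd $n$ are $-B_{n+1}/(n+1)$.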
The functional equation of the zeta function (Corollary 10.3) relates $\zeta(s)$ to $\zeta(1 - s)$, so that for many purposes it suffices to consider $\zeta(s)$ for $\sigma \ge 1/2$. In this half-plane, the formula (B.23) is not very useful, since the terms in the sum are far larger than $\zeta(s)$ when $|s|$ is large. This is due to the fact that in our application of the Euler–Maclaurin summation formula, the numbers $f^{(k)}(1)$ increase rapidly in size with $k$. It is in situations in which the values $f^{(k)}(x)$ decrease rapidly in size as $k$ increases that the Euler–Maclaurin formula provides accurate estimates. With this in mind we break the defining series $\sum n^{-s}$ into two ranges, $n \le N$ and $n > N$, and apply the sum formula only in the second range. Taking $a = N$ and letting $b$ tend to infinity, we find that
\[ \zeta(s) = \sum_{n=1}^{N} n^{-s} + \frac{N^{1-s}}{s - 1} + N^{-s} \sum_{k=1}^{K} \binom{s+k-2}{k-1} B_k N^{-k+1}/k - \binom{s+K-1}{K} \int_N^{\infty} B_K(\{x\}) x^{-s-K}\,dx. \tag{B.24} \]
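To see (B.24) at work, one can drop the final integral and evaluate the remaining terms. The sketch below does this for modest hypothetical defaults $N = 10$, $K = 10$; the function names are our own, the generalized binomial coefficient is computed as a finite product, and the omitted integral is simply ignored, so the result is an approximation whose quality degrades as $|s|$ grows.

```python
from fractions import Fraction
from math import comb, pi

def bernoulli_numbers(K):
    """B_0, ..., B_K from sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def gen_binom(z, j):
    """Generalized binomial coefficient z(z-1)...(z-j+1)/j! for real or complex z."""
    result = 1
    for i in range(j):
        result = result * (z - i) / (i + 1)
    return result

def zeta_em(s, N=10, K=10):
    """Main terms of (B.24); the remainder integral is omitted."""
    B = bernoulli_numbers(K)
    head = sum(n ** (-s) for n in range(1, N + 1))
    pole_term = N ** (1 - s) / (s - 1)
    correction = sum(gen_binom(s + k - 2, k - 1) * float(B[k]) * N ** (1 - k) / k
                     for k in range(1, K + 1))
    return head + pole_term + N ** (-s) * correction

print(zeta_em(2), pi ** 2 / 6)
print(zeta_em(0), zeta_em(-1))
```

At $s = 0$ and $s = -1$ the binomial coefficients with $k \ge 3$ vanish, so the main terms already give $-1/2$ and $-1/12$ up to rounding, in agreement with Theorem B.6.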
The initial derivation of this is carried out under the assumption that σ > 1,
but then one sees that the above provides a valid formula for ζ (s) throughout
where
\[ c = \frac{11}{12} + \frac{1}{2} \int_1^{\infty} B_2(\{x\}) x^{-2}\,dx. \]
From (B.22) we see that the last term in (B.25) has modulus less than $1/(12n)$. In addition we describe below how it may be shown that $c = \frac{1}{2} \log 2\pi$, so that on exponentiating we obtain Stirling’s formula
\[ n! = \left( \frac{n}{e} \right)^n \sqrt{2\pi n}\,(1 + O(1/n)). \tag{B.26} \]
More accurate approximations can be derived by using larger values of K . The
value of c can be determined by appealing to Wallis’s formula, which asserts
that
\[ \frac{2}{\pi} = \prod_{n=1}^{\infty} \left( 1 - \frac{1}{4n^2} \right). \tag{B.27} \]
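Both Stirling's formula (B.26) and Wallis's product (B.27) are easy to examine numerically. The sketch below works with logarithms to avoid overflow; it tests the two-sided bound $0 < \log\bigl(n!/((n/e)^n\sqrt{2\pi n})\bigr) < 1/(12n)$, which is the sharper statement posed later in the exercises, and the cutoff used for the Wallis product is an arbitrary choice.

```python
from math import lgamma, log, pi

def log_stirling_ratio(n):
    """log of n! / ((n/e)^n * sqrt(2*pi*n)), computed via lgamma(n + 1) = log n!."""
    return lgamma(n + 1) - (n * log(n) - n + 0.5 * log(2 * pi * n))

def wallis_partial_product(N):
    """Product of (1 - 1/(4 n^2)) over 1 <= n <= N."""
    product = 1.0
    for n in range(1, N + 1):
        product *= 1 - 1 / (4 * n * n)
    return product

for n in (1, 2, 5, 10, 100):
    print(n, log_stirling_ratio(n), 1 / (12 * n))
print(wallis_partial_product(100000), 2 / pi)
```

The Wallis product converges slowly, with an error of order $1/N$ after $N$ factors, which is one reason it serves to identify the constant $c$ rather than to compute $\pi$.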
Exercises
1. Show that $(-1)^k B_k(-x) = B_k(x) + k x^{k-1}$ for all $k \ge 0$.
4. Show that if k ≥ 3 is odd, then Bk (x) has simple zeros at 0, 1/2, and 1,
and no other zeros in [0, 1]. Show that if k ≥ 2 is even, then Bk (x) has one
simple zero in (0, 1/2) and another in (1/2, 1), and no other zeros in [0, 1].
5. (Lehmer 1940)
(a) Show that $\max_{0 \le x \le 1} |B_3(x)| = \sqrt{3}/36 < 3/(2\pi^3)$.
(b) Deduce that
\[ \max_x \sum_{m=1}^{\infty} m^{-3} \sin 2\pi m x = \sqrt{3}\,\pi^3/54 = 0.994527\ldots. \]
8. Show that
\[ \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m + 1)^3} = \frac{\pi^3}{32}. \]
(Suggestion: Suppose first that 0 < x < 1/q, and use Theorem B.2.)
10. Show that if a and b are positive integers, then
\[ \int_0^1 B_1(\{ax\}) B_1(\{bx\})\,dx = \frac{(a, b)^2}{12ab}. \]
11. Using (8), or otherwise, show that
\[ z \cot z = \sum_{k=0}^{\infty} (-1)^k \frac{B_{2k}}{(2k)!} (2z)^{2k} \]
for |z| < π/2. Show that all coefficients in the latter series are positive.
12. (a) Suppose that $A(z) = \sum_{n=0}^{\infty} a_n z^n/n!$ and $B(z) = \sum_{n=0}^{\infty} b_n z^n/n!$ are
for |z| < π/2 (cf. Exercise 11). By taking C(z) = sin z, B(z) = cos z in the
preceding exercise, or otherwise, show that the Tk are all positive integers.
where
\[ g(x) = \sum_{r=1}^{b-a} \left( f^{(K+1)}(a + r - 1 + x) - f^{(K+1)}(a + r - x) \right). \]
(c) Show that if f (K +1) (x) exists and is monotonically decreasing in [a, b],
then
\[ \operatorname{sgn} \int_a^b B_K(\{x\}) f^{(K)}(x)\,dx = -\operatorname{sgn} B_K. \]
(d) Show that if f (K ) < 0, f (K +1) > 0, f (K +2) < 0 throughout [a, b], then
the last term in the Euler–Maclaurin formula has smaller modulus than,
and opposite sign to, the term k = K in the sum.
(e) Show that
\[ 1 < \frac{n!}{(n/e)^n \sqrt{2\pi n}} < e^{1/(12n)}. \]
17. For $n \ge 0$, let $I_n = \int_0^{\pi} (\sin x)^n\,dx$.
(a) Show that $I_0 = \pi$, $I_1 = 2$.
(b) Show that $I_{n+2} = \frac{n+1}{n+2} I_n$.
(c) Show that In /In+1 → 1 as n → ∞.
(d) Deduce the formula (B.27) of Wallis (1656).
18. Show that if 0 < x < 1, then
\[ \sum_{n=-\infty}^{\infty} \frac{e(n\alpha)}{x^2 - n^2} = \frac{\pi}{x} \cdot \frac{\sin 2\pi\alpha x - \sin 2\pi(\alpha - 1)x}{1 - \cos 2\pi x}. \]
19. Let C0 denote Euler’s constant. Show that if N and K are positive integers,
then
\[ \sum_{n=1}^{N} \frac{1}{n} = \log N + C_0 + \frac{1}{2N} - \sum_{k=1}^{K-1} \frac{B_{2k}}{2k N^{2k}} - \theta \frac{B_{2K}}{2K N^{2K}} \]
for some θ ∈ (0, 1).
20. Let $t$ be real, fixed. Show that $\sum_{n \le x} (-1)^{n-1} n^{-it}$ is boundedly oscillating.
21. (Carlitz 1964)
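(a) Choose $\sigma_0 > 1$ so that $\log \zeta(\sigma_0) = 2\pi$. By substituting $z = \log \zeta(s)$ in (B.8), show that
\[ \frac{\log \zeta(s)}{\zeta(s) - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!} (\log \zeta(s))^k \]
for $\sigma > \sigma_0$.
(b) Choose $\sigma_1 > 1$ so that $\zeta(\sigma_1) = 2$. By writing $\log \zeta(s) = \log(1 + (\zeta(s) - 1))$, show that
\[ \frac{\log \zeta(s)}{\zeta(s) - 1} = \sum_{k=0}^{\infty} (-1)^k \frac{(\zeta(s) - 1)^k}{k + 1} \]
for $\sigma > \sigma_1$.
(c) Show that there exist rational numbers $b(n)$ such that
\[ \frac{\log \zeta(s)}{\zeta(s) - 1} = \sum_{n=1}^{\infty} b(n) n^{-s} \]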
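for $x \ge 1$.
(c) Put $A_2(x) = \frac{1}{2} \sum_{n \le x} (x - n)^2 n \log n = \int_1^x A_1(u)\,du$. Show that
\[ A_2(x) = \frac{1}{24} x^4 \log x - \frac{13}{288} x^4 + \frac{1}{2} (\log C - 1/12) x^2 + O(x \log x) \]
for $x \ge 1$.
(d) By using (5.19), show that
\[ A_2(x) = \frac{-1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \zeta'(s - 1) \frac{x^{s+2}}{s(s + 1)(s + 2)}\,ds. \]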
Show that $\Delta F = f$.
(f) Let $f$ and $F$ be as above, and suppose that $G$ is a further function such that $\Delta G = f$. Show that $F - G$ is periodic with period 1, and hence that if $G$ is a polynomial then $G = F + C$ for some constant $C$.
(g) Let $f$ and $F$ be as above, and suppose that $a$ and $b$ are integers such that $a \le b$. Show that
\[ \sum_{j=a}^{b} f(x + j) = F(x + b + 1) - F(x + a). \]
(c) Put
\[ F(x) = \sum_{r=0}^{k} a_r(k)\, r! \binom{x}{r+1}. \]
31. Put $S_k(p) = \sum_{a=1}^{p} a^k$.
(a) Show that $S_0(p) \equiv 0 \pmod p$.
(b) Show that if $(p - 1) \mid k$ and $k > 0$, then $S_k(p) \equiv -1 \pmod p$.
(c) Show that if $(c, p) = 1$, then $c^k S_k(p) \equiv S_k(p) \pmod p$.
(d) Show that if $(p - 1) \nmid k$, then there is a $c$, $(c, p) = 1$, such that $c^k \not\equiv 1 \pmod p$.
(e) Deduce that if $(p - 1) \nmid k$, then $S_k(p) \equiv 0 \pmod p$.
(f) Summarize:
\[ S_k(p) \equiv \begin{cases} -1 \pmod p & \text{if } (p - 1) \mid k,\ k > 0; \\ 0 \pmod p & \text{otherwise.} \end{cases} \]
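Before attempting the proof it can be reassuring to confirm the summary in part (f) on small cases. The following sketch simply evaluates $S_k(p)$ modulo $p$ for a few primes; the list of primes and the range of $k$ are arbitrary choices, and the function names are our own.

```python
def power_sum_mod(k, p):
    """S_k(p) = 1^k + 2^k + ... + p^k reduced modulo p."""
    return sum(pow(a, k, p) for a in range(1, p + 1)) % p

def predicted_residue(k, p):
    """Residue claimed in part (f), written as a least non-negative residue."""
    if k > 0 and k % (p - 1) == 0:
        return (p - 1) % p        # this is -1 modulo p
    return 0

for p in (2, 3, 5, 7, 11, 13):
    mismatches = [k for k in range(31) if power_sum_mod(k, p) != predicted_residue(k, p)]
    print(p, mismatches)
```

An empty list of mismatches for each prime is consistent with the claim; it is of course no substitute for the argument via parts (a)–(e).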
32. (von Staudt 1840, Clausen 1840, cf. Lucas 1891, Carlitz 1960/61) By
combining the preceding two exercises, deduce the von Staudt–Clausen
theorem: If k is positive and even, then
\[ B_k + \sum_{(p-1) \mid k} \frac{1}{p} \]
is an integer.
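The von Staudt–Clausen theorem can be observed directly with exact rational arithmetic. In this sketch the Bernoulli numbers are produced by the standard recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ (an assumption of the code, not a formula from this exercise), and primality is tested by trial division, which is adequate for the small range shown.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(K):
    """B_0, ..., B_K from sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def von_staudt_clausen_value(k, B):
    """B_k plus the sum of 1/p over primes p with (p - 1) | k."""
    primes = [p for p in range(2, k + 2) if is_prime(p) and k % (p - 1) == 0]
    return B[k] + sum(Fraction(1, p) for p in primes)

B = bernoulli_numbers(30)
for k in range(2, 31, 2):
    print(k, von_staudt_clausen_value(k, B))
```

Every printed value should have denominator 1; for example $B_2 + \tfrac12 + \tfrac13 = 1$.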
33. (a) Let $S_k(p)$ be defined as in Exercise 31. Use the binomial theorem to show that
\[ \sum_{k=0}^{n-1} \binom{n}{k} S_k(p) \equiv 0 \pmod p. \]
(b) Deduce that
\[ \sum_{\substack{0 < k < n \\ (p-1) \mid k}} \binom{n}{k} \equiv 0 \pmod p. \]
(b) Suppose that $k = 1$ or that $k$ is a positive even integer, and let $q$ be a positive integer. By using the von Staudt–Clausen theorem, or otherwise, show that
\[ q^k B_k + \sum_{\substack{(p-1) \mid k \\ p \nmid q}} \frac{1}{p} \]
is an integer.
is an integer.
(d) Suppose that $k$ is odd, $k \ge 3$, and that $q$ is a positive integer. By inducting on $a$, show that $q^k B_k(a/q)$ is an integer, for all non-negative integers $a$.
35. (Almkvist & Meurman 1991) Suppose that $q$ and $k$ are positive integers. Show that $q^k (B_k(a/q) - B_k)$ is an integer for all integers $a$.
36. Suppose that $0 < \alpha \le 1$, and recall that the Hurwitz zeta function is defined to be $\zeta(s, \alpha) = \sum_{n=0}^{\infty} (n + \alpha)^{-s}$ for $\sigma > 1$.
(a) Show that
\[ \zeta(s, \alpha) = \frac{1}{\alpha^s} + \frac{1}{s - 1} - \sum_{k=1}^{K} (-1)^k \binom{-s}{k-1} \frac{B_k(1 - \alpha)}{k} - (-1)^K \binom{-s}{K} \int_1^{\infty} B_K(\{x - \alpha\}) x^{-s-K}\,dx \]
for $\sigma > 1 - K$.
(b) Deduce that $\zeta(s, \alpha)$ is an analytic function of $s$ throughout the complex plane, except for a simple pole with residue 1 at $s = 1$.
(c) Let $n$ denote a non-negative integer. Show that
\[ \zeta(-n, \alpha) = \alpha^n - \frac{1}{n + 1} \sum_{k=0}^{n+1} (-1)^k \binom{n+1}{k} B_k(1 - \alpha). \]
(d) By (B.10), (B.13), (B.15), and Exercise 2, deduce that
\[ \zeta(-n, \alpha) = -\frac{B_{n+1}(\alpha)}{n + 1}. \]
B.1 Notes
Although the notation we have adopted here is quite common, other (conflicting)
notations for the Bernoulli numbers are to be found in the literature. Thus it is
important to recognize the notational conventions when comparing texts.
The basic facts concerning the Bernoulli numbers and polynomials can be
derived in many ways, so the approach depends on one’s motivation. Other
expositions of note are found in Borevich & Shafarevich (1966, Section 5.8),
Rademacher (1973, Chapters 1, 2), and Boas (1977). The proof of the von
can be derived in this way. Apéry (cf. van der Poorten (1978/79), (1980),
Beukers (1979), Ball & Rivoal (2001)) used this formula to prove that ζ (3)
is irrational. It still is not known whether ζ (2k + 1) is irrational when k ≥ 2,
nor is it known whether ζ (2k + 1)/π 2k+1 is irrational. (In this latter connec-
tion see Grosswald (1970) and Terras (1976).) Presumably Euler’s constant
$C_0 = 0.577215664901532\ldots$ and Catalan’s constant
\[ L(2, \chi_{-4}) = \sum_{m=0}^{\infty} (-1)^m/(2m + 1)^2 = 0.915965594\ldots \]
for σ > 1. Now suppose that the complex plane is slit along the positive real
axis, and that C is the ‘Hankel path’ that starts at +∞ on the positive side of
the slit, and follows the slit to the origin, circles the origin in the positive sense,
and then returns to +∞ along the negative side of the slit. Set
\[ I(s) = \int_C \frac{z^{s-1}}{e^z - 1}\,dz. \]
This integral is uniformly convergent in any compact portion of the plane, and therefore defines an entire function. Suppose that $\sigma > 1$. We shrink the path $C$ until it coincides with the slit. The integral along the first leg of the path is then
\[ -\int_0^{\infty} \frac{x^{s-1}}{e^x - 1}\,dx. \]
The portion of the path that circles the origin becomes negligible, and the integral along the second leg is
\[ \int_0^{\infty} \frac{(x e^{2\pi i})^{s-1}}{e^x - 1}\,dx. \]
On combining these results and using the fact that $\Gamma(s)\Gamma(1 - s) = \pi/\sin \pi s$ (see Appendix C), we find that
Although we have derived this under the assumption that σ > 1, by the unique-
ness of analytic continuation it remains valid throughout the complex plane.
In general the integrand in I (s) has a branch point at the origin, but if s is a
negative integer then the singularity is merely a pole, the residue can then be
calculated using the power series (B.8), and we obtain Theorem B.6 once more.
See Apostol (1951) for a discussion of the values of the Lerch zeta functions.
By means of the Euler–Maclaurin formula one can calculate ζ (s) and its
derivatives, when |s| is not too large. Let S(t) and Z (t) be defined as in Chapter
14. As long as ζ (1/2 + it) is calculated sufficiently accurately to allow the sign
of Z (t) to be determined, one can prove the existence of zeros on the critical
line by detecting changes of sign of Z (t). Let H (n) denote the assertion that
the first n zeros lie on the critical line and are simple. Gram (1903) established
H (10), Backlund (1914) H (79), and Hutchinson (1925) H (138), all using the
Euler–Maclaurin formula. Since the amount of computation to evaluate Z (t)
for a single value of t is comparable to t by this method, it would be slow
work to continue this for larger t. However, in unpublished notes of Riemann,
Siegel (1932) discovered indications of a more rapidly convergent formula,
known today as the Riemann–Siegel formula: Let $\theta = \theta(t) = -\frac{1}{2} t \log \pi + \arg \Gamma(1/4 + it/2)$, $m = [\sqrt{t/(2\pi)}]$. Then
\[ Z(t) = 2 \sum_{n=1}^{m} n^{-1/2} \cos(\theta - t \log n) + R(t) \]
where the remainder $R(t)$ has an asymptotic expansion that is rapidly convergent when $t$ is large. The most trivial estimate is that $R(t) \ll t^{-1/4}$, but if this is not sufficient one can write
\[ R(t) = (-1)^{m-1} \frac{h\bigl(\sqrt{t/(2\pi)} - m\bigr)}{(t/(2\pi))^{1/4}} + O\bigl(t^{-3/4}\bigr) \]
where $h(u) = (\cos 2\pi(u^2 - u - 1/16))/\cos 2\pi u$ for $0 \le u < 1$. Titchmarsh
(1935, 1936) used the above to establish H (1041). All such calculations fall
into two parts. First one calculates Z (t); by detecting sign changes one obtains
a lower bound for N (t). Secondly, one computes S(t), so that N (t) is known
via Theorem 14.1. Titchmarsh argued that if $\Re\,\zeta(\sigma + it) > 0$ for $\sigma \ge 1/2$, then $N(t)$ is the integer nearest to
\[ \frac{1}{\pi} \arg \Gamma(1/4 + it/2) - \frac{t}{2\pi} \log \pi + 1. \]
Values of t for which this works are rare when t is large, but Turing (1953)
devised an alternative procedure that depends on the estimate
\[ \int_0^T S(t)\,dt \ll \log T, \tag{B.31} \]
which is due to Littlewood (1924). Turing (1953) was the first to employ a
digital computer as an aid to the computation; he achieved H (1104). To be use-
ful in numerical calculations, estimates need to be constructed for the various
implicit constants. For the Riemann–Siegel formula this was done by Titch-
marsh. For (B.31) this was done by Turing. Titchmarsh’s analysis contained
errors that were later corrected by Rosser, Yohe & Schoenfeld (1969). Turing’s
argument also contained errors, which were repaired by Lehman (1970). Sub-
sequently, Lehmer (1956a,b) achieved H (25,000), Meller (1958) H (35,337),
Lehman (1966) H (250,000), Rosser, Yohe & Schoenfeld (1969) H (3,500,000),
Brent (1979) H (81,000,001), Brent, van de Lune, te Riele & Winter (1982a,b)
H (200,000,001), van de Lune & te Riele (1983) H (300,000,001), van de Lune, te Riele & Winter (1986) H (1,500,000,001) and Wedeniwski $H(9 \cdot 10^{11})$ (cf. http://www.zetagrid.net). The evaluation of $\zeta(1/2 + it)$ by means of the Riemann–Siegel formula involves $t^{1/2}$ arithmetic operations, which is a big improvement over the Euler–Maclaurin method. Odlyzko & Schönhage (1988) have shown that if multiple evaluations are to be made, the amount of calculation per evaluation can be reduced to $t^{\varepsilon}$. This new algorithm was implemented by Gourdon & Demichel (2004), who used it to establish $H(10^{13})$.
B.2 References
Almkvist, G. & Meurman, A. (1991). Values of Bernoulli polynomials and Hur-
witz’s zeta function at rational points, C. R. Math. Rep. Acad. Sci. Canada 13,
104–108.
Apéry, R. (1979). Irrationalité de ζ (2) et ζ (3), Astérisque 61, 11–13.
Apostol, T. M. (1951). On the Lerch zeta functions, Pacific J. Math. 1, 161–167.
Backlund, R. (1914). Sur les zéros de la fonction ζ (s) de Riemann, C. R. Acad. Sci.
Paris, 158, 1979–1982.
Ball, K. & Rivoal, T. (2001). Irrationalité d’une infinité de valeurs de la fonction zeta
aux entiers impairs, Invent. Math. 146, 193–207.
Barnes, E. W. (1903). The generalisation of the Maclaurin sum formula, and the range
of its applicability, Quart. J. 35, 175–188.
(1905). The Maclaurin sum-formula, Proc. London Math. Soc. (2) 3, 253–272.
Bartz, K. & Rutkowski, J. (1993). On the von Staudt–Clausen theorem, C. R. Math. Rep.
Acad. Sci. Canada 15, 46–48.
Beukers, F. (1979). A note on the irrationality of ζ (2) and ζ (3), Bull. London Math. Soc.
11, 268–272.
Boas, R. P. (1977). Partial sums of infinite series, and how they grow, Amer. Math.
Monthly 84, 237–258.
Borevich, Z. I. & Shafarevich, I. R. (1966). Number Theory. New York: Academic
Press.
Brent, R. (1979). On the zeros of the Riemann zeta function in the critical strip, Math.
Comp. 33, 1361–1372.
Brent, R. P., van de Lune, J., te Riele, H. J. J., Winter, D. T. (1982a). The first 200,000,001
zeros of Riemann’s zeta function, Computational Methods in Number Theory, Part
II, Math. Centre Tracts 155. Amsterdam: Math. Centrum, 389–403.
(1982b). On the zeros of the Riemann zeta function in the critical strip. II, Math.
Comp. 39, 681–688; Corrigenda, 46 (1986), 771.
Carlitz, L. (1960/1961). The Staudt–Clausen theorem, Math. Mag. 34, 131–146.
(1964). Extended Bernoulli and Eulerian numbers, Duke Math. J. 31, 667–689.
Cassels, J. W. S. (1986). Local Fields, London Math Soc. Student Texts 3, Cambridge:
Cambridge University Press.
Clausen, Th. (1840). Theorem, Astronomische Nachrichten 17, 351.
Euler, L. (1732/33). Comm. Petropol. 6, 68–97; Opera, Vol. 1, 15, pp. 42–72.
Glaisher, J. W. L. (1895). On the constant which occurs in the formula for $1^1 2^2 \cdots n^n$,
Messenger of Math. 24, 1–16.
Gourdon, X. & Demichel, P. (2004). The $10^{13}$ first zeros of the Riemann zeta function,
and zeros computation at very large height,
http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf.
Gram, J. (1903). Sur les zéros de la fonction ζ (s) de Riemann, Acta Math. 27,
289–304.
Grosswald, E., (1970). Die Werte der Riemannschen Zetafunktion an ungeraden Argu-
mentstellen, Nachr. Akad. Wiss. Göttingen Math.–Phys. Kl. II, 9–13.
Hardy, G. H., (1949). Divergent Series. London: Oxford University Press.
Hutchinson, J. I. (1925). On the roots of the Riemann zeta function, Trans. Amer. Math.
Soc. 27, 49–60.
518 The Euler–Maclaurin summation formula
For any complex number $s$ not equal to a non-positive integer we define the
gamma function by its Weierstrass product,
$$\Gamma(s) = \frac{e^{-C_0 s}}{s} \prod_{n=1}^{\infty} \frac{e^{s/n}}{1 + s/n}. \tag{C.1}$$
Here $C_0$ is Euler’s constant, and we recall from Corollary 1.14 or Exercise B.15
that this constant is determined by the relation
$$\sum_{n=1}^{N} \frac{1}{n} = \log N + C_0 + O(1/N). \tag{C.2}$$
From (C.1) it is evident that $1/\Gamma(s)$ is an entire function with simple zeros at the
non-positive integers, which is to say that $\Gamma(s)$ is a non-vanishing meromorphic
function with simple poles at the non-positive integers as depicted in Figure C.1.
On considering the $N$th partial product in (C.1) and appealing to (C.2), we obtain
Gauss’s formula,
$$\Gamma(s) = \lim_{N \to \infty} \frac{N^s N!}{s(s+1) \cdots (s+N)}. \tag{C.3}$$
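As a numerical illustration of Gauss’s formula (C.3), the following minimal sketch evaluates the expression inside the limit for real $s > 0$ and compares it with the library value of $\Gamma(s)$. The function name `gauss_partial` and the restriction to positive real $s$ are assumptions made for this example only; the computation is done with logarithms to avoid overflow in $N!$.

```python
import math


def gauss_partial(s: float, N: int) -> float:
    """Return N^s * N! / (s (s+1) ... (s+N)) for real s > 0.

    Hypothetical helper for illustration; it works in log space so that
    N! does not overflow a float.
    """
    if s <= 0 or N < 1:
        raise ValueError("this sketch assumes s > 0 and N >= 1")
    log_numerator = s * math.log(N) + math.lgamma(N + 1)  # log(N^s * N!)
    log_denominator = math.fsum(math.log(s + n) for n in range(N + 1))
    return math.exp(log_numerator - log_denominator)


if __name__ == "__main__":
    s = 0.5
    for N in (10, 1000, 100000):
        print(N, gauss_partial(s, N), math.gamma(s))
```

The printed values approach `math.gamma(0.5)` slowly, in keeping with a relative error of order $1/N$.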
The gamma function 521
two functions. To this end we let $p_N(s)$ denote the expression on the right in
(C.3), and note that
$$p_N(s)\, p_N(1-s) = \frac{N}{s(N+1-s)} \prod_{n=1}^{N} \bigl(1 - (s/n)^2\bigr)^{-1}.$$
On the other hand, we recall that the Weierstrass product for the sine function
may be written
$$\sin s = s \prod_{n=1}^{\infty} \Bigl(1 - \frac{s^2}{(\pi n)^2}\Bigr).$$
From (C.1) we see that $\Gamma(s)$ never takes the value 0, and that it has simple
poles at the non-positive integers. Let $k$ be a non-negative integer. Since
Also, since
$$\frac{-1-i}{4}\, e(n/4) - \frac{1}{2}\, e(n/2) + \frac{-1+i}{4}\, e(3n/4) =
\begin{cases}
1 & \text{if } n \equiv 1 \pmod 4,\\
-1 & \text{if } n \equiv 0 \pmod 4,\\
0 & \text{otherwise,}
\end{cases}$$
by taking $\theta = 1/4, 1/2, 3/4$ in (C.13) we deduce via (C.10) that
$$\frac{\Gamma'}{\Gamma}(1/4) = -C_0 - 3\log 2 - \pi/2. \tag{C.15}$$
Similarly,
$$\frac{\Gamma'}{\Gamma}(3/4) = -C_0 - 3\log 2 + \pi/2. \tag{C.16}$$
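The special values (C.15) and (C.16) can be checked numerically. The sketch below is not the method of the text: it assumes the recurrence $\frac{\Gamma'}{\Gamma}(x+1) = \frac{1}{x} + \frac{\Gamma'}{\Gamma}(x)$ together with a few terms of the standard asymptotic expansion of $\Gamma'/\Gamma$ for large real $x$. The names `digamma` and `EULER_C0` are hypothetical, and the numerical value of $C_0$ is hard-coded.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant C_0, hard-coded for this sketch


def digamma(x: float) -> float:
    """Approximate Gamma'(x)/Gamma(x) for real x > 0."""
    if x <= 0:
        raise ValueError("this sketch assumes x > 0")
    shift = 0.0
    # Move x to the right with psi(x) = psi(x + 1) - 1/x until x >= 10.
    while x < 10.0:
        shift -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # log x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    tail = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return shift + math.log(x) - 0.5 / x - tail


if __name__ == "__main__":
    print(digamma(0.25), -EULER_C0 - 3 * math.log(2) - math.pi / 2)
    print(digamma(0.75), -EULER_C0 - 3 * math.log(2) + math.pi / 2)
```

Each printed pair should agree to many decimal places, since the truncation error of the expansion at $x \ge 10$ is far below $10^{-8}$.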
Theorem C.1 Let $\delta > 0$ be given, and let $R = R(\delta)$ be the set of those complex
numbers $s$ for which $|s| \ge \delta$ and $|\arg s| < \pi - \delta$. Then
$$\frac{\Gamma'}{\Gamma}(s) = \log s + O(1/|s|) \tag{C.17}$$
and
$$\Gamma(s) = \sqrt{2\pi}\, s^{s-1/2} e^{-s} \bigl(1 + O(1/|s|)\bigr) \tag{C.18}$$
uniformly for $s \in R$.
The second estimate here is Stirling’s formula for the gamma function, which
generalizes his estimate (B.26) for n!. From this we see that
Proof From (C.2) and (C.10) we see that if $N > |s|$, then
$$\frac{\Gamma'}{\Gamma}(s) = \log N - \sum_{n=0}^{N} \frac{1}{n+s} + O(|s|/N).$$
By the Euler–Maclaurin summation formula (Theorem B.5) with $f(x) =
1/(x+s)$, $a = 0^-$, $b = N$, $K = 2$ we find that
$$\sum_{n=0}^{N} \frac{1}{n+s} = \log(N+s) - \log s + \frac{1}{2s} + \frac{1}{2(s+N)} + O(|s|^{-2}).$$
On combining these estimates and letting $N$ tend to infinity we find that
$$\frac{\Gamma'}{\Gamma}(s) = \log s - \frac{1}{2s} + O(|s|^{-2}). \tag{C.20}$$
This estimate is more precise than (C.17), and still greater accuracy can be
obtained by choosing a larger value of K .
To derive (C.18) we begin by taking logarithms in (C.3) and applying the
Euler–Maclaurin summation formula, or we integrate (C.20) from $s$ to $s + \infty$
along a ray parallel to the real axis. In either case we find that
$$\log \Gamma(s) = s \log s - s - \frac{1}{2} \log s + c + O(1/|s|),$$
and it remains to determine the value of the constant $c$. This may be done
in a number of ways. For example, we could appeal to (C.5) and (B.26).
Alternatively, we can take logarithms in (C.9) and apply the above to see that
$c = (\log 2\pi)/2$. Then (C.18) follows by exponentiating.
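The value $c = (\log 2\pi)/2$ can be observed numerically for large real $s$. The following sketch, with the hypothetical helper `stirling_constant_estimate`, subtracts the main terms $s\log s - s - \frac{1}{2}\log s$ from the library value of $\log\Gamma(s)$; it is an illustration for real arguments only, not a proof.

```python
import math


def stirling_constant_estimate(s: float) -> float:
    """Return log Gamma(s) - (s log s - s - (1/2) log s) for real s > 0."""
    if s <= 0:
        raise ValueError("this sketch assumes s > 0")
    return math.lgamma(s) - (s * math.log(s) - s - 0.5 * math.log(s))


if __name__ == "__main__":
    c = 0.5 * math.log(2 * math.pi)
    for s in (10.0, 100.0, 1000.0):
        # The difference shrinks like 1/|s|, consistent with the O(1/|s|) term.
        print(s, stirling_constant_estimate(s) - c)
```

For real $s$ the printed differences are close to $1/(12s)$, which is the first term of Stirling’s series.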
where
$$f_N(x) = \begin{cases} (1 - x/N)^N x^{s-1} & \text{for } 0 \le x \le N,\\ 0 & \text{for } x > N. \end{cases}$$
Proof It is clear that the left-hand side is an entire function of $s$. Thus it suffices
to prove the identity when $\sigma < 1$. For such $s$ we let $r \to 0^+$, and note that the
integral along the semicircle tends to 0. The remaining integrals tend to
$$e^{i\pi s} \int_0^{\infty} e^{-x} x^{-s}\, dx - e^{-i\pi s} \int_0^{\infty} e^{-x} x^{-s}\, dx = 2i (\sin \pi s)\, \Gamma(1-s)$$
Euler’s formula asserts that the gamma function is the Mellin transform of
the function $e^{-x}$. We now establish the inverse.
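Suppose that $x < 0$. Let $C$ be the contour passing by line segment from
$-\infty - iT$ to $-iT$ to $iT$ to $-\infty + iT$. By the calculus of residues and (C.10)
we find that
$$\int_C \frac{\Gamma'}{\Gamma}(a + bs)\, e^{-2\pi x s}\, ds = -\frac{2\pi i}{b} \sum_{n=0}^{\infty} e^{2\pi x (n+a)/b}
= -\frac{2\pi i}{b}\, e^{2\pi a x/b} \bigl(1 - e^{2\pi x/b}\bigr)^{-1}.$$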
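We parametrize the integral $\int_{-\infty - iT}^{-iT}$, and integrate by parts, to see that it is
$$\int_{-\infty}^{0} \frac{\Gamma'}{\Gamma}(a + b\sigma - ibT)\, e(xT)\, e^{-2\pi x \sigma}\, d\sigma
= -\frac{e(xT)}{2\pi x}\, \frac{\Gamma'}{\Gamma}(a - ibT) + \frac{b\, e(xT)}{2\pi x} \int_{-\infty}^{0} \Bigl(\frac{\Gamma'}{\Gamma}\Bigr)'(a + b\sigma - ibT)\, e^{-2\pi x \sigma}\, d\sigma.$$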
But
$$\Bigl(\frac{\Gamma'}{\Gamma}\Bigr)'(s) = \sum_{n=0}^{\infty} (n+s)^{-2} \ll 1/|t|$$
Exercises
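1. Show:
(a) $|\Gamma(it)|^2 = \dfrac{\pi}{t \sinh \pi t}$;
(b) $|\Gamma(1/2 + it)|^2 = \dfrac{\pi}{\cosh \pi t}$;
(c) $\operatorname{Im} \dfrac{\Gamma'}{\Gamma}(s) > 0$ if $t > 0$;
(d) $\dfrac{\partial}{\partial t} \log |\Gamma(s)| < 0$ when $t > 0$;
(e) For any given $\sigma$, $|\Gamma(s)|$ is a strictly decreasing function of $t$ on the
interval $0 < t < \infty$.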
2. (Gauss 1812) Prove Gauss’s multiplication formula:
$$\prod_{a=0}^{q-1} \Gamma(s + a/q) = (2\pi)^{(q-1)/2}\, q^{1/2 - qs}\, \Gamma(qs).$$
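3. Show:
(a) $\dfrac{\Gamma'}{\Gamma}(1-s) - \dfrac{\Gamma'}{\Gamma}(s) = \pi \cot \pi s$;
(b) $\dfrac{\Gamma'}{\Gamma}(s+1) = \dfrac{1}{s} + \dfrac{\Gamma'}{\Gamma}(s)$;
(c) If $n$ is an integer, $n > 1$, then
$$\frac{\Gamma'}{\Gamma}(n) = -C_0 + \sum_{k=1}^{n-1} \frac{1}{k}.$$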
for |s − 1| < 1.
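7. Show:
(a) $\Bigl(\dfrac{\Gamma'}{\Gamma}\Bigr)'(s) = \displaystyle\sum_{n=0}^{\infty} (s+n)^{-2}$;
(b) $\dfrac{\Gamma''(s)}{\Gamma(s)} = \Bigl(\dfrac{\Gamma'}{\Gamma}(s)\Bigr)^2 + \displaystyle\sum_{n=0}^{\infty} (s+n)^{-2}$;
(c) The functions $\Gamma(\sigma)$, $\Gamma''(\sigma)$ have the same sign for all real $\sigma$.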
8. Show that if $x > 0$ and $y \ge 1$, then
$$\frac{\Gamma(x+y)}{\Gamma(x)} \ge x^y.$$
9. (Hermite 1881) Let $x_n$ denote the unique critical point of $\Gamma(\sigma)$ in the
interval $(-n, -n+1)$. Show that $x_n = -n + (\log n)^{-1} + O((\log n)^{-2})$ for
$n \ge 2$.
10. Show that $\Bigl(\dfrac{\Gamma'}{\Gamma}\Bigr)'(s) = s^{-1} + \frac{1}{2} s^{-2} + O(|s|^{-3})$ uniformly in the region $R$ of
Theorem C.1.
11. (a) Show that $\int_1^{\infty} e^{-x} x^{s-1}\, dx$ is an entire function.
and
$$\Gamma(s) = \int_{-\infty}^{\infty} e^{-e^x} e^{sx}\, dx.$$
(a) Write
$$\Gamma(a)\Gamma(b) = \int_0^{\infty} \int_0^{\infty} e^{-u-v} u^{a-1} v^{b-1}\, du\, dv$$
Show that
C.1 Notes
Euler, in a letter of 1729 to Goldbach (cf. Fuss 1843, p. 3) gave the formula
$$\Gamma(s) = \frac{1}{s} \prod_{n=1}^{\infty} \Bigl(1 + \frac{1}{n}\Bigr)^{s} \Bigl(1 + \frac{s}{n}\Bigr)^{-1}.$$
This is substantially the same as the formula (C.3) that Gauss (1812) took to be
fundamental. Based on the above definition of the gamma function, the formula
(C.1) was proved by Schlömilch (1844) and Newman (1848). Weierstrass (1856)
took (C.1) to be the definition of the gamma function. Euler had given the special
value (C.7) already in his letter to Goldbach. Euler (1771) also discovered the
reflection formula (C.6). The duplication formula (C.9) of Legendre (1809) is
a special case of the multiplication formula of Gauss (1812), given in Exercise
C.3. Stirling (1730, p. 135) gave the series expansion
$$\log \Gamma(s) = \Bigl(s - \frac{1}{2}\Bigr) \log s - s + \frac{1}{2} \log 2\pi + \sum_{n=2}^{\infty} \frac{B_n}{n(n-1) s^{n-1}}.$$
q
q √
q
χ (a) log (a/q) = −(C0 + log 2π ) aχ (a) − L (1, χ )
a=1 a=1
π
C.2 References
Artin, E. (1931). Einführung in die Theorie der Gamma-Funktion. Hamburger math.
Einzelschriften 11. Leipzig: Teubner.
(1964). The Gamma Function. New York: Holt, Rinehart and Winston.
Barnes, E. W. (1900). The theory of the G-function, Quart. J. Math. 31, 264–314.
Cauchy, A. L. (1827). Exercices de Math. Vol. 2. Paris: de Bure Frères, pp. 91–92.
Lejeune–Dirichlet, P. G. (1839). Sur une nouvelle méthode pour la détermination des
intégrales multiples, J. Math. pures appl. 4, 164–168; Werke I, pp. 375–380.
Euler, L. (1730). De Progressionibus transcendentibus seu quarum termini generales
algebraice dari nequeunt, Comment. Acad. Sci. Petropolitanae 5, 36–57; Opera
Omnia, Ser. 1, Vol. 14, Teubner, 1924, pp. 1–14.
(1771). Evolutio formulae integralis $\int x^{f-1} (\log x)^{m/n}\, dx$ integratione a valore x = 0
ad x = 1 extensa, Novi Comment. Acad. Petropol. 16, 91–139.
(1794). Institutiones calculi integralis, Vol. 4, p. 342.
Feller, W. (1965). A direct proof of Stirling’s formula, Amer. Math. Monthly 74, 1223–
1225.
Fuss, P.-H. (1843). Correspondance Mathématique et Physique de quelques célèbres
géomètres du XVIIIème siècle, Vol. 1. St. Petersburg: Acad. Impér. Sci.
Gauss, C. F. (1812). Disquisitiones generales circa seriem infinitam etc., Comment. Gott.
2, 1–46; Werke, Vol. 3. Berlin: Deutsch von H. Simon, 1888, pp. 123–162.
Gram, J. P. (1899). Nyt Tidsskrift Mat. 10B, 96.
Hankel, H. (1864). Die Eulerschen Integrale bei unbeschränkter Variabilität des Argu-
ments, Zeit. Math. Phys. 9, 1–21.
Henrici, P. (1977). Applied and Computational Complex Analysis, Vol. 2. New York:
Wiley.
Hermite, Ch. (1881). Sur l’intégrale Eulérienne de seconde espèce, J. Reine Angew.
Math. 90, 332–338.
Hölder, O. (1886). Über die Eigenschaft der Gammafunktion keiner algebraischen Dif-
ferentialgleichung zu genügen, Math. Ann. 28, 1–13.
$$\widehat{f}(k) = \int_{\mathbb{T}} f(x)\, e(-kx)\, dx \tag{D.1}$$
are the Fourier coefficients of $f$. Here $e(\theta) = e^{2\pi i \theta}$ is the complex exponential
with period 1. It is a familiar fact in the theory of Fourier series that if $f$ has
bounded variation on $\mathbb{T}$, then
$$\lim_{K \to \infty} \sum_{k=-K}^{K} \widehat{f}(k)\, e(k\alpha) = \frac{f(\alpha^+) + f(\alpha^-)}{2}. \tag{D.2}$$
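Less familiar is the strong quantitative version of this that we now derive.
Let $D_K(x) = \sum_{k=-K}^{K} e(kx)$. This is the Dirichlet kernel. We multiply both
sides of (D.1) by $e(k\alpha)$ and sum, to see that
$$\sum_{k=-K}^{K} \widehat{f}(k)\, e(k\alpha) = \int_{\mathbb{T}} f(x)\, D_K(\alpha - x)\, dx = \int_{\mathbb{T}} D_K(x)\, f(\alpha - x)\, dx
= \int_{\mathbb{T}} D_K(x)\, f(\alpha + x)\, dx, \tag{D.3}$$
where the last equality holds because $D_K$ is an even function.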
536 Topics in harmonic analysis
Figure D.1 Graph of $s(x)$ and its Fourier approximation $-\sum_{k=1}^{15} \sin(2\pi k x)/(\pi k)$.
Proof All terms comprising $E_K(x)$ are odd, and hence $E_K$ is odd. Thus we
may suppose that $0 \le x \le 1/2$. The case $x = 0$ is clear. We observe that if
$x \notin \mathbb{Z}$, then
$$E_K'(x) = 1 + 2 \sum_{k=1}^{K} \cos 2\pi k x = D_K(x).$$
D.1 Pointwise convergence of Fourier series 537
$$= -\frac{1}{2} \int_x^{1-x} \frac{\sin (2K+1)\pi z}{\sin \pi z}\, dz
= \frac{i}{2} \int_x^{1-x} \frac{e\bigl((K + \tfrac{1}{2}) z\bigr)}{\sin \pi z}\, dz.$$
This gives the second part of the bound. The first bound, $|E_K(x)| \le 1/2$,
is weaker if $1/(2K+1) \le x \le 1/2$, since $\sin \pi x \ge 2x$ in this range. Thus
it suffices to show that $|E_K(x)| \le 1/2$ when $0 < x < 1/(2K+1)$. Since
$0 < \sin u < u$ for $0 < u < \pi$, it follows from the definition of $E_K(x)$
that
$$x - \frac{1}{2} \le E_K(x) \le (2K+1)x - \frac{1}{2}$$
for $0 < x \le 1/(2K+1)$. This gives the desired bound.
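The two inequalities in this proof can be spot-checked numerically. The sketch below assumes that $E_K(x) = s(x) + \sum_{k=1}^{K} \sin(2\pi k x)/(\pi k)$, where $s(x)$ is the saw-tooth function equal to $\{x\} - 1/2$ for $x \notin \mathbb{Z}$ and to 0 at integers; this definition is inferred from the caption of Figure D.1 and from the displayed inequality, and the function names are hypothetical.

```python
import math


def sawtooth(x: float) -> float:
    """s(x) = {x} - 1/2 for non-integer x, and 0 at integers (assumed definition)."""
    frac = x - math.floor(x)
    return 0.0 if frac == 0.0 else frac - 0.5


def E(K: int, x: float) -> float:
    """Assumed E_K(x): saw-tooth plus the K-th partial sum of sin(2 pi k x)/(pi k)."""
    partial = sum(math.sin(2 * math.pi * k * x) / (math.pi * k) for k in range(1, K + 1))
    return sawtooth(x) + partial


def check_bounds(K: int, steps: int = 2000, tol: float = 1e-12) -> bool:
    """Test |E_K| <= 1/2 on a grid in (0, 1/2], and the sandwich near 0."""
    edge = 1.0 / (2 * K + 1)
    for j in range(1, steps + 1):
        x = 0.5 * j / steps
        value = E(K, x)
        if abs(value) > 0.5 + tol:
            return False
        if x <= edge and not (x - 0.5 - tol <= value <= (2 * K + 1) * x - 0.5 + tol):
            return False
    return True


if __name__ == "__main__":
    print(all(check_bounds(K) for K in (1, 2, 5, 20, 100)))
```

A grid check of this kind only supports the inequalities at the sampled points; the argument in the text is what establishes them for all $x$.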
To complete the proof it suffices to apply the triangle inequality (as in Theorem
A.4) and the bound of Lemma D.1.
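Moreover, $F(\alpha)$ has period 1, $\int_{\mathbb{T}} |F(\alpha)|\, d\alpha < \infty$, and $F$ has Fourier coefficients
$$\widehat{F}(k) = \int_0^1 F(\alpha)\, e(-k\alpha)\, d\alpha = \sum_{n \in \mathbb{Z}} \int_0^1 f(n + \alpha)\, e(-k\alpha)\, d\alpha
= \int_{\mathbb{R}} f(x)\, e(-kx)\, dx = \widehat{f}(k). \tag{D.9}$$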
Here the interchange of the integral and the sum is justified by absolute
convergence. Thus the Fourier expansion of $F$ is
$$\sum_{k \in \mathbb{Z}} \widehat{f}(k)\, e(k\alpha).$$
The Poisson summation formula (D.6) is simply the assertion that this Fourier
expansion converges to $F(\alpha)$ when $\alpha = 0$. Our hypotheses thus far do not ensure
this, but in this direction we establish the following two precise results.
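For a concrete instance in which the Fourier expansion does converge to $F(0)$, one may take the Gaussian $f(x) = e^{-\pi t x^2}$ with $t > 0$. The sketch below assumes the standard fact, not derived in this section, that $\widehat{f}(k) = t^{-1/2} e^{-\pi k^2/t}$, and compares the two sides of the summation formula numerically; the function names and the truncation parameter `terms` are choices made for this example.

```python
import math


def gaussian_side(t: float, terms: int = 60) -> float:
    """Sum of f(n) = exp(-pi t n^2) over |n| <= terms."""
    return math.fsum(math.exp(-math.pi * t * n * n) for n in range(-terms, terms + 1))


def transform_side(t: float, terms: int = 60) -> float:
    """Sum of fhat(k) = t^(-1/2) exp(-pi k^2 / t) over |k| <= terms (assumed transform)."""
    total = math.fsum(math.exp(-math.pi * k * k / t) for k in range(-terms, terms + 1))
    return total / math.sqrt(t)


if __name__ == "__main__":
    for t in (0.3, 1.0, 2.5):
        print(t, gaussian_side(t), transform_side(t))
```

Both sums converge very rapidly, so the truncation at `terms = 60` has no visible effect at double precision for the values of $t$ shown.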
$f$ is of bounded variation once more, we see that $F(\alpha^+) = \sum_{n \in \mathbb{Z}} f((n + \alpha)^+)$,
and similarly for $F(\alpha^-)$. Hence we have the stated result.
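Theorem D.4 Suppose that $f$ is continuous, and that the series $\sum_{n \in \mathbb{Z}} f(n + \alpha)$
is uniformly convergent for $0 \le \alpha \le 1$. Then
$$\sum_{n \in \mathbb{Z}} f(n) = \lim_{K \to \infty} \sum_{k=-K}^{K} \Bigl(1 - \frac{|k|}{K}\Bigr) \widehat{f}(k).$$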
Proof Clearly $F(\alpha)$ given in (D.8) is continuous. Since we have not assumed
that $f \in L^1(\mathbb{R})$, the Fourier transform $\widehat{f}(t)$ may not exist. However, if $k$ is an
integer, then $\widehat{f}(k)$ exists as a convergent improper integral. To see this we first
note that $\sum_{n=M}^{N} f(n + \alpha)$ is small if $M$ and $N$ are large integers and $0 \le \alpha \le 1$.
Then
$$\int_0^1 \sum_{n=M}^{N} f(n + \alpha)\, e(-k\alpha)\, d\alpha = \int_M^{N+1} f(x)\, e(-kx)\, dx$$
is small. The hypothesis that $\sum_n f(n + \alpha)$ converges uniformly implies that
$f(x) \to 0$ as $|x| \to \infty$. Hence $\int_u^v f(x)\, e(-kx)\, dx \to 0$ as $u$, $v$ tend to infinity
through real values. The calculation of $\widehat{f}(k)$ in (D.9) is still valid, but is now
justified by uniform convergence. Next we appeal to a theorem of Fejér, which
asserts that the Fourier series of a continuous function $F(\alpha)$ with period 1 is
uniformly (C, 1)-summable to $F$ (see Katznelson (2004), p. 19). That is,
$$\sum_{k=-K}^{K} \Bigl(1 - \frac{|k|}{K}\Bigr) \widehat{f}(k)\, e(k\alpha) \longrightarrow F(\alpha)$$
Exercises
1. Show that if $f$ satisfies the hypotheses of Theorem D.2, and $\alpha$ and $\beta$ are
real numbers, then the function $f(x + \alpha)\, e(\beta x)$ does also. Specify conditions
under which
$$\sum_n f(n + \alpha)\, e(\beta n) = \sum_k \widehat{f}(k - \beta)\, e((k - \beta)\alpha).$$
2. Suppose that $f$ has bounded variation on $[-A, A]$, for every $A > 0$. Show
that
$$\lim_{N \to \infty} \sum_{n=-N}^{N} f(n) = \lim_{T \to \infty} \sum_{k=-\infty}^{\infty} \int_{-T}^{T} f(x)\, e(-kx)\, dx$$
(a) Show that the sum $F(x)$ is absolutely convergent for almost all $x$.
(c) Define the Fourier transform of $f$, and the Fourier coefficient
of $F$, respectively, to be $\widehat{f}(t) = \int_{\mathbb{R}^n} f(x)\, e(-t \cdot x)\, dx$, $\widehat{F}(k) =
\int_{\mathbb{T}^n} F(x)\, e(-k \cdot x)\, dx$. Show that $\widehat{F}(k) = \widehat{f}(k)$.
4. (a) Suppose that there is a $\delta > 0$ such that $c(k) \ll (1 + |k|)^{-n-\delta}$. Show that
$$\sum_{k \in \mathbb{Z}^n} c(k)\, e(k \cdot x)$$
is a continuous function of $x \in \mathbb{T}^n$.
(b) Suppose that there is a $\delta > 0$ such that $f(x) \ll (1 + |x|)^{-n-\delta}$ for $x \in \mathbb{R}^n$.
Suppose also that $f(x)$ is continuous. Show that
$$F(x) = \sum_{\lambda \in \mathbb{Z}^n} f(\lambda + x)$$
for all $x \in \mathbb{T}^n$.
5. A lattice in $\mathbb{R}^n$ is a set of points of the form $A\mathbb{Z}^n$ where $A$ is a non-singular
$n \times n$ matrix. Thus $\mathbb{Z}^n$ is an example of a lattice, called the lattice of integral
points.
(a) Suppose that $\Lambda_1 = A\mathbb{Z}^n$ and $\Lambda_2 = B\mathbb{Z}^n$ are two lattices. Show that $\Lambda_2 \subseteq
\Lambda_1$ if and only if there is an $n \times n$ matrix $K$ with integral entries such
that $B = AK$.
(b) An $n \times n$ matrix $U$ is said to be unimodular if (i) its entries are integers,
and (ii) $\det U = \pm 1$. Show that if $\Lambda_1 = A\mathbb{Z}^n$ and $\Lambda_2 = B\mathbb{Z}^n$ are two
lattices, then $\Lambda_1 = \Lambda_2$ if and only if there is a unimodular matrix $U$
such that $B = AU$.
(c) Let $a_1, \ldots, a_n$ denote the columns of $A$. These vectors are said to form a
basis for $\Lambda_1$, because every member of $\Lambda_1$ has a unique representation in
the form $c_1 a_1 + \cdots + c_n a_n$ where the $c_i$ are integers. If $\Lambda = A\mathbb{Z}^n$, we say
that the determinant of $\Lambda$ is $d(\Lambda) = |\det A|$. Show that the determinant
of a lattice is independent of the basis by which it is presented.
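(d) Suppose that $\Lambda = A\mathbb{Z}^n$ is a lattice in $\mathbb{R}^n$. Let $\Lambda^*$ be the set of all those
points $\mu \in \mathbb{R}^n$ such that $\mu \cdot \lambda \in \mathbb{Z}$ for all $\lambda \in \Lambda$. Show that $\Lambda^*$ is a
lattice, and indeed that $\Lambda^* = (A^{-1})^T \mathbb{Z}^n$.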
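(e) Suppose that $f$ is a continuous function on $\mathbb{R}^n$ such that
$$f(x) \ll (1 + |x|)^{-n-\delta}, \qquad \widehat{f}(t) \ll (1 + |t|)^{-n-\delta}$$
for some $\delta > 0$. Let $\Lambda = A\mathbb{Z}^n$ be a lattice. Show that
$$\sum_{\lambda \in \Lambda} f(\lambda + x) = \frac{1}{d(\Lambda)} \sum_{\mu \in \Lambda^*} \widehat{f}(\mu)\, e(\mu \cdot x)$$
for all $x$.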
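The dual lattice of Exercise 5(d) can be illustrated in the plane with exact rational arithmetic. The following sketch, in which the matrix `A` and all function names are hypothetical choices for this example, forms $(A^{-1})^T$ for a $2 \times 2$ integer matrix and checks that its columns pair with the columns of $A$ to give 0 or 1, so that every $\mu \cdot \lambda$ is an integer.

```python
from fractions import Fraction


def inverse_transpose_2x2(A):
    """Return (A^{-1})^T for a non-singular 2x2 matrix given as nested lists."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix must be non-singular")
    # A^{-1} = (1/det) [[d, -b], [-c, a]]; transposing swaps the off-diagonal entries.
    return [[d / det, -c / det], [-b / det, a / det]]


def columns(M):
    """Columns of a 2x2 matrix, as tuples."""
    return [tuple(M[i][j] for i in range(2)) for j in range(2)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


if __name__ == "__main__":
    A = [[2, 1], [0, 3]]  # example basis matrix, chosen arbitrarily
    B = inverse_transpose_2x2(A)
    # Gram-type table of mu_i . a_j; it should be the identity matrix.
    print([[dot(mu, a) for a in columns(A)] for mu in columns(B)])
    print(abs(B[0][0] * B[1][1] - B[0][1] * B[1][0]))  # determinant of the dual lattice
```

For this choice of `A` the determinant of the dual lattice is $1/6$, the reciprocal of $d(\Lambda) = 6$.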
D.3 Notes
Section D.1. The relation (D.2) is the famous Dirichlet–Jordan test, which is
usually derived with much less effort. Theorem D.2 generalizes and refines an
argument of Pólya (1918), who estimated the rate of convergence of the Fourier
series (9.18). For more on the convergence of Fourier series, see Katznelson
(2004, Chapter 2), Körner (1988, Part I), or Zygmund (2002, Chapter II).
Section D.2. For more on the Poisson summation formula, see Katznelson
(2004, VI.1.15), Körner (1988, Section 27), or Zygmund (2002, Chapter 2,
Section 13). For a discussion of the Poisson summation formula in higher
dimensions, see Stein & Weiss (1971, Chapter VII Section 2). Siegel (1935)
showed that Minkowski’s convex body theorem could be derived by applying
the Poisson summation formula. Cohn & Elkies (2003), Cohn (2002) and Cohn
& Kumar (2004) have applied the Poisson summation formula in Rn to limit
the density of sphere packings.
D.4 References
Cohn, H. (2002). New upper bounds on sphere packings, II, Geom. Topol. 6, 329–353.
Cohn, H. & Elkies, N. (2003). New upper bounds on sphere packings, I, Ann. of Math.
(2) 157, 689–714.
Cohn, H. & Kumar, A. (2004). The densest lattice in twenty-four dimensions, Electron.
Res. Announc. Amer. Math. Soc. 10, 58–67.
Katznelson, Y. (2004). An Introduction to Harmonic Analysis, Third edition. Cambridge:
Cambridge University Press.