Multiplicative Number Theory I-Montgomery

The document introduces 'Multiplicative Number Theory I: Classical Theory,' which focuses on the distribution of prime numbers and their significance in mathematics and physics, particularly in relation to the Riemann hypothesis. Authored by Hugh Montgomery and Robert Vaughan, it serves as a comprehensive resource for students, covering foundational topics in multiplicative number theory and providing extensive exercises and historical context. This volume is the first part of a larger project, with a second volume planned to explore more advanced topics in the field.


CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS 97

Editorial Board
B. Bollobas, W. Fulton, A. Katok, F. Kirwan, P. Sarnak, B. Simon, B. Totaro

MULTIPLICATIVE NUMBER THEORY I:

CLASSICAL THEORY

Prime numbers are the multiplicative building blocks of natural numbers. Un-
derstanding their overall influence and especially their distribution gives rise
to central questions in mathematics and physics. In particular their finer distri-
bution is closely connected with the Riemann hypothesis, the most important
unsolved problem in the mathematical world. Assuming only subjects covered
in a standard degree in mathematics, the authors comprehensively cover all the
topics met in first courses on multiplicative number theory and the distribution
of prime numbers. They bring their extensive and distinguished research exper-
tise to bear in preparing the student for intelligent reading of the more advanced
research literature. The text, which is based on courses taught successfully over
many years at Michigan, Imperial College and Pennsylvania State, is enriched
by comprehensive historical notes and references as well as over 500 exercises.

Hugh Montgomery is a Professor of Mathematics at the University of Michigan.

Robert Vaughan is a Professor of Mathematics at Pennsylvania State
University.
CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS
All the titles listed below can be obtained from good booksellers or from Cambridge University
Press. For a complete series listing visit:
http://www.cambridge.org/series/sSeries.asp?code=CSAM
Already published
70 R. Iorio & V. Iorio Fourier analysis and partial differential equations
71 R. Blei Analysis in integer and fractional dimensions
72 F. Borceaux & G. Janelidze Galois theories
73 B. Bollobás Random graphs
74 R. M. Dudley Real analysis and probability
75 T. Sheil-Small Complex polynomials
76 C. Voisin Hodge theory and complex algebraic geometry, I
77 C. Voisin Hodge theory and complex algebraic geometry, II
78 V. Paulsen Completely bounded maps and operator algebras
79 F. Gesztesy & H. Holden Soliton Equations and Their Algebro-Geometric Solution, I
81 S. Mukai An Introduction to Invariants and Moduli
82 G. Tourlakis Lectures in Logic and Set Theory, I
83 G. Tourlakis Lectures in Logic and Set Theory, II
84 R. A. Bailey Association Schemes
85 J. Carlson, S. Müller-Stach & C. Peters Period Mappings and Period Domains
86 J. J. Duistermaat & J. A. C. Kolk Multidimensional Real Analysis I
87 J. J. Duistermaat & J. A. C. Kolk Multidimensional Real Analysis II
89 M. Golumbic & A. Trenk Tolerance Graphs
90 L. Harper Global Methods for Combinatorial Isoperimetric Problems
91 I. Moerdijk & J. Mrcun Introduction to Foliations and Lie Groupoids
92 J. Kollar, K. E. Smith & A. Corti Rational and Nearly Rational Varieties
93 D. Applebaum Levy Processes and Stochastic Calculus
94 B. Conrad Modular Forms and the Ramanujan Conjecture
95 M. Schechter An Introduction to Nonlinear Analysis
96 R. Carter Lie Algebras of Finite and Afﬁne Type
97 H. L. Montgomery & R. C. Vaughan Multiplicative Number Theory I
98 I. Chavel Riemannian Geometry
99 D. Goldfeld Automorphic Forms and L-Functions for the Group GL(n,R)
100 M. Marcus & J. Rosen Markov Processes, Gaussian Processes, and Local Times
101 P. Gille & T. Szamuely Central Simple Algebras and Galois Cohomology
102 J. Bertoin Random Fragmentation and Coagulation Processes
Multiplicative Number Theory
I. Classical Theory

HUGH L. MONTGOMERY
University of Michigan, Ann Arbor
ROBERT C. VAUGHAN
Pennsylvania State University, University Park
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press

The Edinburgh Building, Cambridge CB2 2RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521849036

© Cambridge University Press 2006

This publication is in copyright. Subject to statutory exception and to the provision of

relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.

First published in print format 2006

isbn-13 978-0-511-25645-5 eBook (EBL)

isbn-10 0-511-25645-0 eBook (EBL)

isbn-13 978-0-521-84903-6 hardback

isbn-10 0-521-84903-9 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of urls
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
Dedicated to our teachers:
P. T. Bateman
J. H. H. Chalk
H. Davenport
T. Estermann
H. Halberstam
A. E. Ingham
Talet är tänkandets början och slut.
Med tanken föddes talet.
Utöfver talet når tanken icke.

Numbers are the beginning and end of thinking.

With thoughts were numbers born.
Beyond numbers thought does not reach.
Magnus Gustaf Mittag-Leffler, 1903
Contents

Preface page xi
List of notation xiii
1 Dirichlet series: I 1
1.1 Generating functions and asymptotics 1
1.2 Analytic properties of Dirichlet series 11
1.3 Euler products and the zeta function 19
1.4 Notes 31
1.5 References 33
2 The elementary theory of arithmetic functions 35
2.1 Mean values 35
2.2 The prime number estimates of Chebyshev and of Mertens 46
2.3 Applications to arithmetic functions 54
2.4 The distribution of Ω(n) − ω(n) 65
2.5 Notes 68
2.6 References 71
3 Principles and ﬁrst examples of sieve methods 76
3.1 Initiation 76
3.2 The Selberg lambda-squared method 82
3.3 Sifting an arithmetic progression 89
3.4 Twin primes 91
3.5 Notes 101
3.6 References 104
4 Primes in arithmetic progressions: I 108
4.1 Additive characters 108
4.2 Dirichlet characters 115
4.3 Dirichlet L-functions 120


4.4 Notes 133

4.5 References 134
5 Dirichlet series: II 137
5.1 The inverse Mellin transform 137
5.2 Summability 147
5.3 Notes 162
5.4 References 164
6 The Prime Number Theorem 168
6.1 A zero-free region 168
6.2 The Prime Number Theorem 179
6.3 Notes 192
6.4 References 195
7 Applications of the Prime Number Theorem 199
7.1 Numbers composed of small primes 199
7.2 Numbers composed of large primes 215
7.3 Primes in short intervals 220
7.4 Numbers composed of a prescribed number of primes 228
7.5 Notes 239
7.6 References 241
8 Further discussion of the Prime Number Theorem 244
8.1 Relations equivalent to the Prime Number Theorem 244
8.2 An elementary proof of the Prime Number Theorem 250
8.3 The Wiener–Ikehara Tauberian theorem 259
8.4 Beurling’s generalized prime numbers 266
8.5 Notes 276
8.6 References 279
9 Primitive characters and Gauss sums 282
9.1 Primitive characters 282
9.2 Gauss sums 286
9.3 Quadratic characters 295
9.4 Incomplete character sums 306
9.5 Notes 321
9.6 References 323
10 Analytic properties of the zeta function and L-functions 326
10.1 Functional equations and analytic continuation 326
10.2 Products and sums over zeros 345
10.3 Notes 356
10.4 References 356
11 Primes in arithmetic progressions: II 358

11.1 A zero-free region 358
11.2 Exceptional zeros 367
11.3 The Prime Number Theorem for arithmetic progressions 377
11.4 Applications 386
11.5 Notes 391
11.6 References 393

12 Explicit formulæ 397

12.1 Classical formulæ 397
12.2 Weil’s explicit formula 410
12.3 Notes 416
12.4 References 417

13 Conditional estimates 419

13.1 Estimates for primes 419
13.2 Estimates for the zeta function 433
13.3 Notes 447
13.4 References 449

14 Zeros 452
14.1 General distribution of the zeros 452
14.2 Zeros on the critical line 456
14.3 Notes 460
14.4 References 461

15 Oscillations of error terms 463

15.1 Applications of Landau’s theorem 463
15.2 The error term in the Prime Number Theorem 475
15.3 Notes 482
15.4 References 484

APPENDICES
A The Riemann–Stieltjes integral 486
A.1 Notes 492
A.2 References 493

B Bernoulli numbers and the Euler–MacLaurin summation formula 495
B.1 Notes 513
B.2 References 517
C The gamma function 520

C.1 Notes 531
C.2 References 533
D Topics in harmonic analysis 535
D.1 Pointwise convergence of Fourier series 535
D.2 The Poisson summation formula 538
D.3 Notes 542
D.4 References 542

Name index 544

Subject index 550
Preface

Our object is to introduce the interested student to the techniques, results, and
terminology of multiplicative number theory. It is not intended that our discus-
sion will always reach the research frontier. Rather, it is hoped that the material
here will prepare the student for intelligent reading of the more advanced re-
search literature.
Analytic number theorists are not very uniformly distributed around the
world and it is possible that a student may be working without the guidance of an
experienced mentor in the area. With this in mind, we have tried to make this
volume as self-contained as possible.
We assume that the reader has some acquaintance with the fundamentals of
elementary number theory, abstract algebra, measure theory, complex analysis,
and classical harmonic analysis. More specialized or advanced background
material in analysis is provided in the appendices.
The relationship of exercises to the material developed in a given section
varies widely. Some exercises are designed to illustrate the theory directly
whilst others are intended to give some idea of the ways in which the theory can
be extended, or developed, or paralleled in other areas. The reader is cautioned
that papers cited in exercises do not necessarily contain a solution.
This volume is the ﬁrst instalment of a larger project. We are preparing a
second volume, which will cover such topics as uniform distribution, bounds for
exponential sums, a wider zero-free region for the Riemann zeta function, mean
and large values of Dirichlet polynomials, approximate functional equations,
moments of the zeta function and L functions on the line σ = 1/2, the large
sieve, Vinogradov’s method of prime number sums, zero density estimates,
primes in arithmetic progressions on average, sums of primes, sieve methods,
the distribution of additive functions and mean values of multiplicative functions,
and the least prime in an arithmetic progression. The present volume was
twenty-five years in preparation; we hope to be a little quicker with the second
volume.
Many people have assisted us in this work—including P. T. Bateman, E.
Bombieri, T. Chan, J. B. Conrey, H. G. Diamond, T. Estermann, J. B. Friedlan-
der, S. W. Graham, S. M. Gonek, A. Granville, D. R. Heath-Brown, H. Iwaniec,
H. Maier, G. G. Martin, D. W. Masser, A. M. Odlyzko, G. Peng, C. Pomerance,
H.–E. Richert, K. Soundararajan, and U. M. A. Vorhauer. In particular, our
doctoral students, and their students also, have been most helpful in detecting
errors of all types. We are grateful to them all. We would be most happy to hear
from any reader who detects a misprint, or might suggest improvements.
Finally we thank our loved ones and friends for their long term support
and the long–suffering David Tranah at Cambridge University Press for his
forbearance.
Notation

Symbol    Meaning    Found on page

$\mathbb{C}$    The set of complex numbers.    109
$\mathbb{F}_p$    A field of $p$ elements.    9
$\mathbb{N}$    The set of natural numbers, 1, 2, . . .    114
$\mathbb{Q}$    The set of rational numbers.    120
$\mathbb{R}$    The set of real numbers.    43
$\mathbb{T}$    $\mathbb{R}/\mathbb{Z}$, known as the circle group or the one-dimensional torus, which is to say the real numbers modulo 1.    110
$\mathbb{Z}$    The set of rational integers.    20
$B$    Constant in the Hadamard product for $\xi(s)$.    347, 349
$B_k$    Bernoulli numbers.    496ff
$B_k(x)$    Bernoulli polynomials.    45, 495ff
$B(\chi)$    Constant in the Hadamard product for $\xi(s, \chi)$.    351, 352
$C_0$    Euler's constant.    26
$c_q(n)$    The sum of $e(an/q)$ with $a$ running over a reduced residue system modulo $q$; known as Ramanujan's sum.    110
$c_\chi(n)$    $= \sum_{a=1}^{q} \chi(a) e(an/q)$.    286, 290
$d(n)$    The number of positive divisors of $n$, called the divisor function.    2
$d_k(n)$    The number of ordered $k$-tuples of positive integers whose product is $n$.    43
$E_0(\chi)$    $= 1$ if $\chi = \chi_0$, $0$ otherwise.    358
$E_k$    The Euler numbers, also known as the secant coefficients.    506
$e(\theta)$    $= e^{2\pi i\theta}$; the complex exponential with period 1.    64, 108ff
$L(s, \chi)$    A Dirichlet L-function.    120
$\mathrm{Li}(x)$    $= \int_0^x \frac{du}{\log u}$ with the Cauchy principal value taken at 1; the logarithmic integral.    189
$\mathrm{li}(x)$    $= \int_2^x \frac{du}{\log u}$; the logarithmic integral.    5
$M(x)$    $= \sum_{n \le x} \mu(n)$.    182
$M(x; q, a)$    The sum of $\mu(n)$ over those $n \le x$ for which $n \equiv a \pmod{q}$.    383
$M(x, \chi)$    The sum of $\chi(n)\mu(n)$ over those $n \le x$.    383
$N(T)$    The number of zeros $\rho = \beta + i\gamma$ of $\zeta(s)$ with $0 < \gamma \le T$.    348, 452ff
$N(T, \chi)$    The number of zeros $\rho = \beta + i\gamma$ of $L(s, \chi)$ with $\beta > 0$ and $0 \le \gamma \le T$.    454
$P(n)$    The largest prime factor of $n$.    202
$Q(x)$    The number of square-free numbers not exceeding $x$.    36
$S(t)$    $= \frac{1}{\pi} \arg \zeta(\tfrac12 + it)$.    452
$S(t, \chi)$    $= \frac{1}{\pi} \arg L(\tfrac12 + it, \chi)$.    454
$\mathrm{si}(x)$    $= -\int_x^\infty \frac{\sin u}{u}\, du$; the sine integral.    139
$T_k$    The tangent coefficients.    505
$w(u)$    The Buchstab function, defined by the equation $(u w(u))' = w(u - 1)$ for $u > 2$ together with the initial condition $w(u) = 1/u$ for $1 < u \le 2$.    216
$Z(t)$    Hardy's function. The function $Z(t)$ is real-valued, and $|Z(t)| = |\zeta(\tfrac12 + it)|$.    456ff
$\beta$    The real part of a zero of the zeta function or of an L-function.    173
$\Gamma(s)$    $= \int_0^\infty e^{-x} x^{s-1}\, dx$ for $\sigma > 0$; called the Gamma function.    30, 520ff
$\Gamma(s, a)$    $= \int_a^\infty e^{-w} w^{s-1}\, dw$; the incomplete Gamma function.    327
$\gamma$    The imaginary part of a zero of the zeta function or of an L-function.    172
$\Delta_N(\theta)$    $= 1 + 2\sum_{n=1}^{N-1} (1 - n/N) \cos 2\pi n\theta$; known as the Fejér kernel.    174
$\varepsilon(\chi)$    $= \tau(\chi)/(i^\kappa q^{1/2})$.    332
$\zeta(s)$    $= \sum_{n=1}^\infty n^{-s}$ for $\sigma > 1$, known as the Riemann zeta function.    2
$\zeta(s, \alpha)$    $= \sum_{n=0}^\infty (n + \alpha)^{-s}$ for $\sigma > 1$; known as the Hurwitz zeta function.    30
$\zeta_K(s)$    $= \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$; known as the Dedekind zeta function of the algebraic number field $K$.    343
$\Theta$    $= \sup \Re\rho$.    430, 463
$\vartheta(x)$    $= \sum_{p \le x} \log p$.    46
$\vartheta(z)$    $= \sum_{n=-\infty}^{\infty} e^{-\pi n^2 z}$ for $\Re z > 0$.    329
$\vartheta(x; q, a)$    The sum of $\log p$ over primes $p \le x$ for which $p \equiv a \pmod{q}$.    128, 377ff
$\vartheta(x, \chi)$    $= \sum_{p \le x} \chi(p) \log p$.    377ff
$\kappa$    $= (1 - \chi(-1))/2$.    332
$\Lambda(n)$    $= \log p$ if $n = p^k$, $= 0$ otherwise; known as the von Mangoldt Lambda function.    23
$\Lambda_2(n)$    $= \Lambda(n) \log n + \sum_{bc=n} \Lambda(b)\Lambda(c)$.    251
$\Lambda(x; q, a)$    The sum of $\lambda(n)$ over those $n \le x$ such that $n \equiv a \pmod{q}$.    383
$\Lambda(x, \chi)$    $= \sum_{n \le x} \chi(n)\lambda(n)$.    383
$\lambda(n)$    $= (-1)^{\Omega(n)}$; known as the Liouville lambda function.    21
$\mu(n)$    $= (-1)^{\omega(n)}$ for square-free $n$, $= 0$ otherwise. Known as the Möbius mu function.    21
$\mu(\sigma)$    The Lindelöf mu function.    330
$\xi(s)$    $= \tfrac12 s(s - 1)\zeta(s)\Gamma(s/2)\pi^{-s/2}$.    328
$\xi(s, \chi)$    $= L(s, \chi)\Gamma((s + \kappa)/2)(q/\pi)^{(s+\kappa)/2}$ where $\chi$ is a primitive character modulo $q$, $q > 1$.    333
$\Pi(x)$    $= \sum_{n \le x} \Lambda(n)/\log n$.    416
$\pi(x)$    The number of primes not exceeding $x$.    3
$\pi(x; q, a)$    The number of $p \le x$ such that $p \equiv a \pmod{q}$.    90, 358
$\pi(x, \chi)$    $= \sum_{p \le x} \chi(p)$.    377ff
$\rho$    $= \beta + i\gamma$; a zero of the zeta function or of an L-function.    173
$\rho(u)$    The Dickman function, defined by the equation $u\rho'(u) = -\rho(u - 1)$ for $u > 1$ together with the initial condition $\rho(u) = 1$ for $0 \le u \le 1$.    200
$\sigma(n)$    The sum of the positive divisors of $n$.    27
$\sigma_a(n)$    $= \sum_{d \mid n} d^a$.    28
$\tau$    $= |t| + 4$.    14
$\tau(\chi)$    $= \sum_{a=1}^{q} \chi(a) e(a/q)$; known as the Gauss sum of $\chi$.    286ff
$\Phi_q(z)$    The $q$th cyclotomic polynomial, which is to say a monic polynomial with integral coefficients, of degree $\varphi(q)$, whose roots are the numbers $e(a/q)$ for $(a, q) = 1$.    64
$\Phi(x, y)$    The number of $n \le x$ such that all prime factors of $n$ are $\ge y$.    215
$\Phi(y)$    $= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-t^2/2}\, dt$; the cumulative distribution function of a normal random variable with mean 0 and variance 1.    235
$\varphi(n)$    The number of $a$, $1 \le a \le n$, for which $(a, n) = 1$; known as Euler's totient function.    27
$\chi(n)$    A Dirichlet character.    115
$\psi(x)$    $= \sum_{n \le x} \Lambda(n)$.    46
$\psi(x, y)$    The number of $n \le x$ composed entirely of primes $p \le y$.    199
$\psi(x; q, a)$    The sum of $\Lambda(n)$ over $n \le x$ for which $n \equiv a \pmod{q}$.    128, 377ff
$\psi(x, \chi)$    $= \sum_{n \le x} \chi(n)\Lambda(n)$.    377ff
$\Omega(n)$    The number of prime factors of $n$, counting multiplicity.    21
$\omega(n)$    The number of distinct primes dividing $n$.    21
$[x]$    The unique integer such that $[x] \le x < [x] + 1$; called the integer part of $x$.    15, 24
$\{x\}$    $= x - [x]$; called the fractional part of $x$.    24
$\|x\|$    The distance from $x$ to the nearest integer.    477
$f(x) = O(g(x))$    $|f(x)| \le C g(x)$ where $C$ is an absolute constant.    3
$f(x) = o(g(x))$    $\lim f(x)/g(x) = 0$.    3
$f(x) \ll g(x)$    $f(x) = O(g(x))$.    3
$f(x) \gg g(x)$    $g(x) = O(f(x))$, $g$ non-negative.    4
$f(x) \asymp g(x)$    $c f(x) \le g(x) \le C f(x)$ for some positive absolute constants $c$, $C$.    4
$f(x) \sim g(x)$    $\lim f(x)/g(x) = 1$.    3
1
Dirichlet series: I

1.1 Generating functions and asymptotics

The general rationale of analytic number theory is to derive statistical information
about a sequence $\{a_n\}$ from the analytic behaviour of an appropriate generating
function, such as a power series $\sum a_n z^n$ or a Dirichlet series $\sum a_n n^{-s}$.
The type of generating function employed depends on the problem being investigated.
There are no rigid rules governing the kind of generating function
that is appropriate – the success of a method justiﬁes its use – but we usually
deal with additive questions by means of power series or trigonometric sums,
and with multiplicative questions by Dirichlet series. For example, if

\[ f(z) = \sum_{n=1}^{\infty} z^{n^k} \]

for $|z| < 1$, then the $n$th power series coefficient of $f(z)^s$ is the number $r_{k,s}(n)$
of representations of $n$ as a sum of $s$ positive $k$th powers,

\[ n = m_1^k + m_2^k + \cdots + m_s^k. \]

We can recover $r_{k,s}(n)$ from $f(z)^s$ by means of Cauchy's coefficient formula:

\[ r_{k,s}(n) = \frac{1}{2\pi i} \oint \frac{f(z)^s}{z^{n+1}}\, dz. \]

By choosing an appropriate contour, and estimating the integrand, we can determine
the asymptotic size of $r_{k,s}(n)$ as $n \to \infty$, provided that $s$ is sufficiently
large, say $s > s_0(k)$. This is the germ of the Hardy–Littlewood circle method,
but considerable effort is required to construct the required estimates.
To appreciate why power series are useful in dealing with additive problems,
note that if $A(z) = \sum a_k z^k$ and $B(z) = \sum b_m z^m$ then the power series
coefficients of $C(z) = A(z)B(z)$ are given by the formula

\[ c_n = \sum_{k+m=n} a_k b_m. \tag{1.1} \]

The terms are grouped according to the sum of the indices, because
$z^k z^m = z^{k+m}$.
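
As an illustration of (1.1), the following Python sketch multiplies truncated power series and uses this to compute $r_{k,s}(n)$ as the $n$th coefficient of $f(z)^s$. The function names and the truncation bound `n_max` are choices made for this example only; truncation is harmless here because the coefficient of $z^n$ in a product depends only on coefficients of index at most $n$.

```python
def cauchy_product(a, b, n_max):
    """Coefficients c_0..c_{n_max} of A(z)B(z), using c_n = sum_{k+m=n} a_k b_m."""
    c = [0] * (n_max + 1)
    for k, ak in enumerate(a[: n_max + 1]):
        if ak == 0:
            continue
        for m, bm in enumerate(b[: n_max + 1 - k]):
            c[k + m] += ak * bm
    return c


def power_representations(k, s, n_max):
    """r_{k,s}(n) for 0 <= n <= n_max: ordered sums of s positive k-th powers."""
    f = [0] * (n_max + 1)          # truncated f(z) = sum_{m>=1} z^(m^k)
    m = 1
    while m ** k <= n_max:
        f[m ** k] = 1
        m += 1
    result = [1] + [0] * n_max     # the constant series 1 = f(z)^0
    for _ in range(s):
        result = cauchy_product(result, f, n_max)
    return result


r = power_representations(2, 2, 50)   # sums of two positive squares
print(r[2], r[5], r[25], r[50])
```

For instance, 50 = 1 + 49 = 49 + 1 = 25 + 25 gives three ordered representations as a sum of two positive squares.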

A Dirichlet series is a series of the form $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ where $s$ is
a complex variable. If $\beta(s) = \sum_{m=1}^{\infty} b_m m^{-s}$ is a second Dirichlet series and
$\gamma(s) = \alpha(s)\beta(s)$, then (ignoring questions relating to the rearrangement of terms
of infinite series)

\[ \gamma(s) = \sum_{k=1}^{\infty} a_k k^{-s} \sum_{m=1}^{\infty} b_m m^{-s}
   = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} a_k b_m (km)^{-s}
   = \sum_{n=1}^{\infty} \Big( \sum_{km=n} a_k b_m \Big) n^{-s}. \tag{1.2} \]

That is, we expect that $\gamma(s)$ is a Dirichlet series, $\gamma(s) = \sum_{n=1}^{\infty} c_n n^{-s}$, whose
coefficients are

\[ c_n = \sum_{km=n} a_k b_m. \tag{1.3} \]

This corresponds to (1.1), but the terms are now grouped according to the
product of the indices, since $k^{-s} m^{-s} = (km)^{-s}$.
Since we shall employ the complex variable s extensively, it is useful to have
names for its real and imaginary parts. In this regard we follow the rather peculiar
notation that has become traditional: s = σ + it.
Among the Dirichlet series we shall consider is the Riemann zeta function,
which for σ > 1 is deﬁned by the absolutely convergent series

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s}. \tag{1.4} \]

As a first application of (1.3), we note that if α(s) = β(s) = ζ(s) then the
manipulations in (1.2) are justified by absolute convergence, and hence we see
that

\[ \sum_{n=1}^{\infty} d(n) n^{-s} = \zeta(s)^2 \tag{1.5} \]

for σ > 1. Here d(n) is the divisor function, $d(n) = \sum_{d \mid n} 1$.
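
The coefficient rule (1.3) can be checked numerically. The sketch below, in Python, forms the Dirichlet convolution of two coefficient sequences and applies it to $a_n = b_n = 1$, the coefficients of $\zeta(s)$; by (1.5) the result should be $d(n)$. The function name and the cutoff `N` are assumptions of this example.

```python
def dirichlet_convolution(a, b, n_max):
    """c_n = sum_{km=n} a_k b_m for 1 <= n <= n_max; index 0 is unused."""
    c = [0] * (n_max + 1)
    for k in range(1, n_max + 1):
        for m in range(1, n_max // k + 1):
            c[k * m] += a[k] * b[m]
    return c


N = 30
ones = [0] + [1] * N                      # coefficients of zeta(s)
d = dirichlet_convolution(ones, ones, N)  # coefficients of zeta(s)^2
print(d[1:13])
```

Each pair (k, m) with km ≤ N contributes exactly once, so the double loop costs about N log N operations rather than N².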
From the rate of growth or analytic behaviour of generating functions we
glean information concerning the sequence of coefficients. In expressing our
findings we employ a special system of notation. For example, we say, ‘f (x) is
asymptotic to g(x)’ as x tends to some limiting value (say x → ∞), and write
$f(x) \sim g(x)$ $(x \to \infty)$, if

\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \]

An instance of this arises in the formulation of the Prime Number Theorem
(PNT), which concerns the asymptotic size of the number π(x) of prime numbers
not exceeding x; $\pi(x) = \sum_{p \le x} 1$. Conjectured by Legendre in 1798, and
finally proved in 1896 independently by Hadamard and de la Vallée Poussin,
the Prime Number Theorem asserts that

\[ \pi(x) \sim \frac{x}{\log x}. \]

Alternatively, we could say that

\[ \pi(x) = (1 + o(1)) \frac{x}{\log x}, \]

which is to say that π(x) is x/ log x plus an error term that is in the limit
negligible compared with x/ log x. More generally, we say, ‘f (x) is small oh
of g(x)’, and write f (x) = o(g(x)), if f (x)/g(x) → 0 as x tends to its limit.
The Prime Number Theorem can be put in a quantitative form,

\[ \pi(x) = \frac{x}{\log x} + O\Big( \frac{x}{(\log x)^2} \Big). \tag{1.6} \]

Here the last term denotes an implicitly defined function (the difference between
the other members of the equation); the assertion is that this function has
absolute value not exceeding $C x (\log x)^{-2}$. That is, the above is equivalent to
asserting that there is a constant C > 0 such that the inequality

\[ \Big| \pi(x) - \frac{x}{\log x} \Big| \le \frac{C x}{(\log x)^2} \]

holds for all x ≥ 2. In general, we say that f (x) is ‘big oh of g(x)’, and write
f (x) = O(g(x)) if there is a constant C > 0 such that | f (x)| ≤ Cg(x) for all
x in the appropriate domain. The function f may be complex-valued, but g
is necessarily non-negative. The constant C is called the implicit constant;
it is an absolute constant unless the contrary is indicated. For example, if C
is liable to depend on a parameter α, we might say, ‘For any fixed value of
α, f (x) = O(g(x))’. Alternatively, we might say, ‘ f (x) = O(g(x)) where the
implicit constant may depend on α’, or more briefly, $f(x) = O_\alpha(g(x))$.
When there is no main term, instead of writing f(x) = O(g(x)) we save a
pair of parentheses by writing instead $f(x) \ll g(x)$. This is read, ‘f(x) is
less-than-less-than g(x)’, and we write $f(x) \ll_\alpha g(x)$ if the implicit constant may
depend on α. To provide an example of this notation, we recall that Chebyshev
proved that $\pi(x) \ll x/\log x$. This is of course weaker than the Prime Number
Theorem, but it was derived much earlier, in 1852. Chebyshev also showed
that $\pi(x) \gg x/\log x$. In general, we say that $f(x) \gg g(x)$ if there is a positive
constant c such that f(x) ≥ cg(x) and g is non-negative. In this situation both
f and g take only positive values. If both $f \ll g$ and $f \gg g$ then we say that f
and g have the same order of magnitude, and write $f \asymp g$. Thus Chebyshev’s
estimates can be expressed as a single relation,

\[ \pi(x) \asymp \frac{x}{\log x}. \]

Figure 1.1 Graph of π(x) (solid) and x/log x (dotted) for 2 ≤ x ≤ 10^6.

The estimate (1.6) is best possible to the extent that the error term is not
$o(x(\log x)^{-2})$. We have also a special notation to express this:

\[ \pi(x) - \frac{x}{\log x} = \Omega\Big( \frac{x}{(\log x)^2} \Big). \]

In general, if $\limsup_{x \to \infty} |f(x)|/g(x) > 0$ then we say that f(x) is ‘Omega of
g(x)’, and write $f(x) = \Omega(g(x))$. This is precisely the negation of the statement
‘f(x) = o(g(x))’. When studying numerical values, as in Figure 1.1, we find
that the fit of x/log x to π(x) is not very compelling. This is because the error
term in the approximation is only one logarithm smaller than the main term.
This error term is not oscillatory; rather there is a second main term of this
size:

\[ \pi(x) = \frac{x}{\log x} + \frac{x}{(\log x)^2} + O\Big( \frac{x}{(\log x)^3} \Big). \]

This is also best possible, but the main term can be made still more elaborate to
give a smaller error term. Gauss was the first to propose a better approximation to
π(x). Numerical studies led him to observe that the density of prime numbers in
the neighbourhood of x is approximately 1/ log x. This suggests that the number
of primes not exceeding x might be approximately equal to the logarithmic
integral,
\[ \operatorname{li}(x) = \int_2^x \frac{1}{\log u}\, du. \]

(Orally, ‘li’ rhymes with ‘pi’.) By repeated integration by parts we can show
that

\[ \operatorname{li}(x) = x \sum_{k=1}^{K-1} \frac{(k-1)!}{(\log x)^k} + O_K\Big( \frac{x}{(\log x)^K} \Big) \]

for any positive integer K; thus the secondary main terms of the approximation
to π(x) are contained in li(x).
In Chapter 6 we shall prove the Prime Number Theorem in the sharper
quantitative form

\[ \pi(x) = \operatorname{li}(x) + O\Big( \frac{x}{\exp(c\sqrt{\log x})} \Big) \]

for some suitable positive constant c. Note that $\exp(c\sqrt{\log x})$ tends to infinity
faster than any power of log x. The error term above seems to fall far from
what seems to be the truth. Numerical evidence, such as that in Table 1.1,
suggests that the error term in the Prime Number Theorem is closer to $\sqrt{x}$ in
size. Gauss noted the good fit, and also that π(x) < li(x) for all x in the range of
his extensive computations. He proposed that this might continue indefinitely,
but the numerical evidence is misleading, for in 1914 Littlewood showed that

\[ \pi(x) - \operatorname{li}(x) = \Omega_\pm\Big( \frac{x^{1/2} \log\log\log x}{\log x} \Big). \]

Here the subscript ± indicates that the error term achieves the stated order
of magnitude infinitely often, and in both signs. In particular, the difference
π − li has infinitely many sign changes. More generally, we write
$f(x) = \Omega_+(g(x))$ if $\limsup_{x \to \infty} f(x)/g(x) > 0$, we write $f(x) = \Omega_-(g(x))$
if $\liminf_{x \to \infty} f(x)/g(x) < 0$, and we write $f(x) = \Omega_\pm(g(x))$ if both these
relations hold.

Table 1.1 Values of π(x), li(x), x/log x for x = 10^k, 1 ≤ k ≤ 22.

x        π(x)                     li(x)                       x/log x

10^1     4                        5.12                        4.34
10^2     25                       29.08                       21.71
10^3     168                      176.56                      144.76
10^4     1229                     1245.09                     1085.74
10^5     9592                     9628.76                     8685.89
10^6     78498                    78626.50                    72382.41
10^7     664579                   664917.36                   620420.69
10^8     5761455                  5762208.33                  5428681.02
10^9     50847534                 50849233.90                 48254942.43
10^10    455052511                455055613.54                434294481.90
10^11    4118054813               4118066399.58               3948131653.67
10^12    37607912018              37607950279.76              36191206825.27
10^13    346065536839             346065458090.05             334072678387.12
10^14    3204941750802            3204942065690.91            3102103442166.08
10^15    29844570422669           29844571475286.54           28952965460216.79
10^16    279238341033925          279238344248555.75          271434051189532.39
10^17    2623557157654233         2623557165610820.07         2554673422960304.87
10^18    24739954287740860        24739954309690413.98        24127471216847323.76
10^19    234057667276344607       234057667376222382.22       228576043106974646.13
10^20    2220819602560918840      2220819602783663483.55      2171472409516259138.26
10^21    21127269486018731928     21127269486616126182.33     20680689614440563221.48
10^22    201467286689315906290    201467286691248261498.15    197406582683296285295.97
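
The first six rows of Table 1.1 can be reproduced with a short computation. The Python sketch below is only an illustration: it counts primes with the sieve of Eratosthenes and approximates li(x) by Simpson's rule after the substitution u = e^t, which turns the integral into that of e^t/t over [log 2, log x]. The function names and the step count are assumptions of the example, not part of the text.

```python
import math


def prime_count(x):
    """pi(x): the number of primes not exceeding the integer x (simple sieve)."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # strike out the multiples p^2, p^2 + p, ... of the prime p
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


def li(x, steps=20000):
    """li(x) = int_2^x du/log u, via Simpson's rule in t = log u (steps must be even)."""
    a, b = math.log(2.0), math.log(x)
    h = (b - a) / steps
    g = lambda t: math.exp(t) / t
    total = g(a) + g(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3


for k in range(1, 7):
    x = 10 ** k
    print(x, prime_count(x), round(li(x), 2), round(x / math.log(x), 2))
```

The larger entries of the table are beyond the reach of a plain sieve of this kind and call for more refined prime-counting methods.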

In the exercises below we give several examples of the use of generating

functions, mostly power series, to establish relations between various counting
functions.

1.1.1 Exercises

1. Let r(n) be the number of ways that n cents of postage can be made, using
only 1 cent, 2 cent, and 3 cent stamps. That is, r(n) is the number of ordered
triples $(x_1, x_2, x_3)$ of non-negative integers such that $x_1 + 2x_2 + 3x_3 = n$.
(a) Show that

\[ \sum_{n=0}^{\infty} r(n) z^n = \frac{1}{(1 - z)(1 - z^2)(1 - z^3)} \]

for |z| < 1.

(b) Determine the partial fraction expansion of the rational function above.
That is, find constants a, b, . . . , f so that the above is

\[ \frac{a}{(z-1)^3} + \frac{b}{(z-1)^2} + \frac{c}{z-1} + \frac{d}{z+1} + \frac{e}{z-\omega} + \frac{f}{z-\overline{\omega}} \]

where $\omega = e^{2\pi i/3}$ and $\overline{\omega} = e^{-2\pi i/3}$ are the primitive cube roots of unity.
(c) Show that r(n) is the integer nearest $(n + 3)^2/12$.
(d) Show that r(n) is the number of ways of writing $n = y_1 + y_2 + y_3$ with
$y_1 \ge y_2 \ge y_3 \ge 0$.
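
A numerical check is no substitute for the proof requested in Exercise 1(c), but it is a useful sanity test. The Python sketch below builds r(n) by multiplying by 1/(1 − z^j) for j = 1, 2, 3 in turn, and compares the result with the nearest integer to (n + 3)²/12. The helper names are hypothetical. The nearest integer is computed in exact arithmetic, which is legitimate because a square is never congruent to 6 modulo 12, so no ties occur.

```python
def postage_counts(n_max):
    """r(n) for 0 <= n <= n_max: solutions of x1 + 2*x2 + 3*x3 = n, x_i >= 0."""
    r = [1] + [0] * n_max              # the constant power series 1
    for stamp in (1, 2, 3):
        # multiplying by 1/(1 - z^stamp) is the recurrence r[n] += r[n - stamp]
        for n in range(stamp, n_max + 1):
            r[n] += r[n - stamp]
    return r


def nearest_integer_formula(n):
    """The integer nearest (n + 3)^2 / 12, computed without floating point."""
    return ((n + 3) ** 2 + 6) // 12


r = postage_counts(200)
print(r[:10])
print(all(r[n] == nearest_integer_formula(n) for n in range(201)))
```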
2. Explain why

\[ \prod_{k=0}^{\infty} \big( 1 + z^{2^k} \big) = 1 + z + z^2 + \cdots \]

for |z| < 1.

3. (L. Mirsky & D. J. Newman) Suppose that $0 \le a_k < m_k$ for 1 ≤ k ≤ K, and
that $m_1 < m_2 < \cdots < m_K$. This is called a family of covering congruences
if every integer x satisfies at least one of the congruences $x \equiv a_k \pmod{m_k}$.
A system of covering congruences is called exact if for every value of x
there is exactly one value of k such that $x \equiv a_k \pmod{m_k}$. Show that if the
system is exact then

\[ \sum_{k=1}^{K} \frac{z^{a_k}}{1 - z^{m_k}} = \frac{1}{1 - z} \]

for |z| < 1. Show that the left-hand side above is

\[ \sim \frac{e^{2\pi i a_K/m_K}}{m_K (1 - r)} \]

when $z = r e^{2\pi i/m_K}$ and $r \to 1^-$. On the other hand, the right-hand side is
bounded for z in a neighbourhood of $e^{2\pi i/m_K}$ if $m_K > 1$. Deduce that a family
of covering congruences is not exact if $m_k > 1$.
4. Let p(n; k) denote the number of partitions of n into at most k parts, that is, the
number of ordered k-tuples $(x_1, x_2, \ldots, x_k)$ of non-negative integers such
that $n = x_1 + x_2 + \cdots + x_k$ and $x_1 \ge x_2 \ge \cdots \ge x_k$. Let p(n) = p(n; n) denote
the total number of partitions of n. Also let $p_o(n)$ be the number of
partitions of n into an odd number of parts, $p_o(n) = \sum_{2 \nmid k} p(n; k)$. Finally,
let $p_d(n)$ denote the number of partitions of n into distinct parts, so that
$x_1 > x_2 > \cdots > x_k$. By convention, put $p(0) = p_o(0) = p_d(0) = 1$.
(a) Show that there are precisely p(n; k) partitions of n into parts not
exceeding k.
(b) Show that

\[ \sum_{n=0}^{\infty} p(n; k) z^n = \prod_{j=1}^{k} (1 - z^j)^{-1} \]

for |z| < 1.

(d) Show that

\[ \sum_{n=0}^{\infty} p_d(n) z^n = \prod_{k=1}^{\infty} (1 + z^k) \]

for |z| < 1.

(e) Show that

\[ \sum_{n=0}^{\infty} p_o(n) z^n = \prod_{k=1}^{\infty} (1 - z^{2k-1})^{-1} \]

for |z| < 1.

(f) By using the result of Exercise 2, or otherwise, show that the last two
generating functions above are identically equal. Deduce that $p_o(n) = p_d(n)$
for all n.
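
The identity in Exercise 4(f) can be tested numerically by expanding both infinite products up to a fixed degree. In the Python sketch below, $p_o(n)$ is taken to be the coefficient of $z^n$ in the product of part (e), which counts partitions of n into odd parts; that reading, like the function names and the bound 60, is an assumption of this example.

```python
def odd_part_partitions(n_max):
    """Coefficients of prod_{k>=1} (1 - z^(2k-1))^(-1) up to z^n_max."""
    c = [1] + [0] * n_max
    for part in range(1, n_max + 1, 2):
        # multiply by 1/(1 - z^part): each odd part may be used repeatedly
        for n in range(part, n_max + 1):
            c[n] += c[n - part]
    return c


def distinct_part_partitions(n_max):
    """Coefficients of prod_{k>=1} (1 + z^k) up to z^n_max."""
    c = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        # multiply by (1 + z^part): descending n uses each part at most once
        for n in range(n_max, part - 1, -1):
            c[n] += c[n - part]
    return c


p_o = odd_part_partitions(60)
p_d = distinct_part_partitions(60)
print(p_d[:11])
print(p_o == p_d)
```

Factors with index larger than `n_max` do not affect coefficients up to $z^{n_{max}}$, so truncating the products at that point is exact for the range computed.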
5. Let A(n) denote the number of ways of associating a product of n terms;
thus A(1) = A(2) = 1 and A(3) = 2. By convention, A(0) = 0.
(a) By considering the possible positionings of the outermost parentheses,
show that

\[ A(n) = \sum_{k=1}^{n-1} A(k) A(n - k) \]

for all n ≥ 2.

(b) Let $P(z) = \sum_{n=0}^{\infty} A(n) z^n$. Show that

\[ P(z)^2 = P(z) - z. \]

Deduce that

\[ P(z) = \frac{1 - \sqrt{1 - 4z}}{2} = \sum_{n=1}^{\infty} \binom{1/2}{n} 2^{2n-1} (-1)^{n-1} z^n. \]

(c) Conclude that $A(n) = \binom{2n-2}{n-1}/n$ for all n ≥ 1. These are called the Catalan
numbers.
(d) What needs to be said concerning the convergence of the series used
above?
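
The recurrence of Exercise 5(a) and the closed form of 5(c) are easy to compare by machine. The following Python sketch (function name hypothetical) computes A(n) from the recurrence with A(0) = 0, A(1) = 1, and checks it against the binomial expression for small n.

```python
from math import comb


def associations(n_max):
    """A(0..n_max) from A(n) = sum_{k=1}^{n-1} A(k) A(n-k), A(0) = 0, A(1) = 1."""
    A = [0] * (n_max + 1)
    if n_max >= 1:
        A[1] = 1
    for n in range(2, n_max + 1):
        A[n] = sum(A[k] * A[n - k] for k in range(1, n))
    return A


A = associations(20)
print(A[1:9])
print(all(A[n] == comb(2 * n - 2, n - 1) // n for n in range(1, 21)))
```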
6. (a) Let $n_k$ denote the total number of monic polynomials of degree k in
$\mathbb{F}_p[x]$. Show that $n_k = p^k$.
(b) Let $P_1, P_2, \ldots$ be the irreducible monic polynomials in $\mathbb{F}_p[x]$, listed in
some (arbitrary) order. Show that

\[ \prod_{r=1}^{\infty} \big( 1 + z^{\deg P_r} + z^{2 \deg P_r} + z^{3 \deg P_r} + \cdots \big) = 1 + pz + p^2 z^2 + p^3 z^3 + \cdots \]

for |z| < 1/p.

(c) Let $g_k$ denote the number of irreducible monic polynomials of degree k
in $\mathbb{F}_p[x]$. Show that

\[ \prod_{k=1}^{\infty} (1 - z^k)^{-g_k} = (1 - pz)^{-1} \qquad (|z| < 1/p). \]

(d) Take logarithmic derivatives to show that

\[ \sum_{k=1}^{\infty} k g_k \frac{z^{k-1}}{1 - z^k} = \frac{p}{1 - pz} \qquad (|z| < 1/p). \]

(e) Show that

\[ \sum_{k=1}^{\infty} k g_k \sum_{m=1}^{\infty} z^{mk} = \sum_{n=1}^{\infty} p^n z^n \qquad (|z| < 1/p). \]

(f) Deduce that

\[ \sum_{k \mid n} k g_k = p^n \]

for all positive integers n.

(g) (Gauss) Use the Möbius inversion formula to show that

\[ g_n = \frac{1}{n} \sum_{k \mid n} \mu(k) p^{n/k} \]

for all positive integers n.

(h) Use (f) (not (g)) to show that

\[ \frac{p^n}{n} - \frac{2 p^{n/2}}{n} \le g_n \le \frac{p^n}{n}. \]

(i) If a monic polynomial of degree n is chosen at random from F p [x], about
how likely is it that it is irreducible? (Assume that p and/or n is large.)
10 Dirichlet series: I

(j) Show that $g_n > 0$ for all $p$ and all $n \ge 1$. (If $P \in \mathbb{F}_p[x]$ is irreducible and
has degree $n$, then the quotient ring $\mathbb{F}_p[x]/(P)$ is a field of $p^n$ elements.
Thus we have proved that there is such a field, for each prime $p$ and
integer $n \ge 1$. It may be further shown that the order of a finite field
is necessarily a prime power, and that any two finite fields of the same
order are isomorphic. Hence the field of order $p^n$, whose existence we
have proved, is essentially unique.)
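As a numerical illustration of parts (f), (g), (h) and (j), one can evaluate Gauss's formula directly and test the identities for small $p$ and $n$. This is a sketch under the assumption that a trial-division Möbius function is adequate for small arguments; the names `mobius` and `irreducible_count` are illustrative and not from the text.

```python
def mobius(n):
    """Möbius function mu(n), computed by trial division."""
    result = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:       # d^2 divided the original n
                return 0
            result = -result
        d += 1
    if n > 1:                    # one remaining prime factor
        result = -result
    return result

def irreducible_count(p, n):
    """g_n from part (g): (1/n) * sum over k | n of mu(k) * p^(n/k)."""
    total = sum(mobius(k) * p ** (n // k)
                for k in range(1, n + 1) if n % k == 0)
    return total // n

# Number of irreducible monic polynomials over F_2 of degrees 1..6.
print([irreducible_count(2, n) for n in range(1, 7)])
```

The bound in part (h) can be tested in exact integer arithmetic by checking that $0 \le p^n - n g_n$ and $(p^n - n g_n)^2 \le 4p^n$.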
7. (E. Berlekamp) Let $p$ be a prime number. We recall that polynomials in a
single variable (mod $p$) factor uniquely into irreducible polynomials. Thus
a monic polynomial $f(x)$ can be expressed uniquely (mod $p$) in the form
$g(x)h(x)^2$ where $g(x)$ is square-free (mod $p$) and both $g$ and $h$ are monic. Let
$s_n$ denote the number of monic square-free polynomials (mod $p$) of degree
$n$. Show that
\[ \left( \sum_{k=0}^{\infty} s_k z^k \right) \left( \sum_{m=0}^{\infty} p^m z^{2m} \right) = \sum_{n=0}^{\infty} p^n z^n \]
for $|z| < 1/p$. Deduce that
\[ \sum_{k=0}^{\infty} s_k z^k = \frac{1 - pz^2}{1 - pz}, \]
and hence that $s_0 = 1$, $s_1 = p$, and that $s_k = p^k(1 - 1/p)$ for all $k \ge 2$.
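For $p = 2$ the count $s_k$ can be checked by brute force. The sketch below encodes a polynomial over $\mathbb{F}_2$ as an integer whose binary digits are its coefficients (an encoding chosen here for convenience, not taken from the text) and tests divisibility by every square $h^2$ with $\deg h \ge 1$. The helper names are illustrative.

```python
def clmul(a, b):
    """Product of two polynomials over F_2 (carry-less multiplication)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def polymod(a, m):
    """Remainder of a modulo m, as polynomials over F_2 (m nonzero)."""
    dm = m.bit_length()
    while a.bit_length() >= dm:
        a ^= m << (a.bit_length() - dm)
    return a

def squarefree_count(k):
    """Number of monic square-free polynomials of degree k over F_2."""
    count = 0
    for f in range(1 << k, 1 << (k + 1)):            # all monic f of degree k
        squares = (clmul(h, h) for h in range(2, 1 << (k // 2 + 1)))
        if all(polymod(f, sq) != 0 for sq in squares):
            count += 1
    return count

print([squarefree_count(k) for k in range(7)])
```

With $p = 2$ the formula of the exercise predicts $s_0 = 1$, $s_1 = 2$ and $s_k = 2^{k-1}$ for $k \ge 2$.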

8. (cf Wagon 1987) (a) Let $I = [a, b]$ be an interval. Show that $\int_I e^{2\pi i x}\,dx = 0$
if and only if the length $b - a$ of $I$ is an integer.
(b) Let $R = [a, b] \times [c, d]$ be a rectangle. Show that $\iint_R e^{2\pi i(x+y)}\,dx\,dy = 0$
if and only if at least one of the edge lengths of $R$ is an integer.
(c) Let $R$ be a rectangle that is a union of finitely many rectangles $R_i$; the
$R_i$ are disjoint apart from their boundaries. Show that if all the $R_i$ have
the property that at least one of their side lengths is an integer, then $R$
also has this property.
9. (L. Moser) If $A$ is a set of non-negative integers, let $r_A(n)$ denote the number
of representations of $n$ as a sum of two distinct members of $A$. That is, $r_A(n)$ is
the number of ordered pairs $(a_1, a_2)$ for which $a_1 \in A$, $a_2 \in A$, $a_1 + a_2 = n$,
and $a_1 \ne a_2$. Let $A(z) = \sum_{a \in A} z^a$.
(a) Show that $\sum_n r_A(n) z^n = A(z)^2 - A(z^2)$ for $|z| < 1$.
(b) Suppose that the non-negative integers are partitioned into two sets $A$
and $B$ in such a way that $r_A(n) = r_B(n)$ for all non-negative integers $n$.
Without loss of generality, $0 \in A$. Show that $1 \in B$, that $2 \in B$, and
that $3 \in A$.
(c) With $A$ and $B$ as above, show that $A(z) + B(z) = 1/(1 - z)$ for $|z| < 1$.
(d) Show that $A(z) - B(z) = (1 - z)\left(A(z^2) - B(z^2)\right)$, and hence by
induction that
\[ A(z) - B(z) = \prod_{k=0}^{\infty} \left(1 - z^{2^k}\right) \]
for $|z| < 1$.
(e) Let the binary weight of $n$, denoted $w(n)$, be the number of 1's in the
binary expansion of $n$. That is, if $n = 2^{k_1} + \cdots + 2^{k_r}$ with $k_1 > \cdots > k_r$,
then $w(n) = r$. Show that $A$ consists of those non-negative integers $n$
for which $w(n)$ is even, and that $B$ is the set of those integers for which
$w(n)$ is odd.
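The conclusion of part (e) is easy to test empirically: split the integers below some bound by the parity of their binary weight and compare the two representation counts. This is a finite check only; the bound 256 and the function names are arbitrary illustrative choices.

```python
def split_by_binary_weight(limit):
    """Partition {0, ..., limit-1} by the parity of the binary weight w(n)."""
    A = [n for n in range(limit) if bin(n).count("1") % 2 == 0]
    B = [n for n in range(limit) if bin(n).count("1") % 2 == 1]
    return A, B

def representations(S, n):
    """r_S(n): ordered pairs (a1, a2) of distinct members of S with a1 + a2 = n."""
    members = set(S)
    return sum(1 for a in members if (n - a) in members and n - a != a)

limit = 256
A, B = split_by_binary_weight(limit)
# For n < limit every summand is < limit, so the truncation loses nothing.
print(all(representations(A, n) == representations(B, n) for n in range(limit)))
```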

1.2 Analytic properties of Dirichlet series

Having provided some motivation for the use of Dirichlet series, we now turn to
the task of establishing some of their basic analytic properties, corresponding
to well-known facts concerning power series.

Theorem 1.1 Suppose that the Dirichlet series $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ converges
at the point $s = s_0$, and that $H > 0$ is an arbitrary constant. Then the series
$\alpha(s)$ is uniformly convergent in the sector
$S = \{s : \sigma \ge \sigma_0,\ |t - t_0| \le H(\sigma - \sigma_0)\}$.
By taking H large, we see that the series α(s) converges for all s in the
half-plane σ > σ0 , and hence that the domain of convergence is a half-plane.
More precisely, we have

Corollary 1.2 Any Dirichlet series $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ has an abscissa of
convergence $\sigma_c$ with the property that $\alpha(s)$ converges for all $s$ with $\sigma > \sigma_c$, and
for no $s$ with $\sigma < \sigma_c$. Moreover, if $s_0$ is a point with $\sigma_0 > \sigma_c$, then there is a
neighbourhood of $s_0$ in which $\alpha(s)$ converges uniformly.
In extreme cases a Dirichlet series may converge throughout the plane ($\sigma_c =
-\infty$), or nowhere ($\sigma_c = +\infty$). When the abscissa of convergence is finite, the
series may converge everywhere on the line $\sigma_c + it$, it may converge at some
but not all points on this line, or nowhere on the line.

Proof of Theorem 1.1 Let $R(u) = \sum_{n>u} a_n n^{-s_0}$ be the remainder term of the
series $\alpha(s_0)$. First we show that for any $s$,
\[ \sum_{n=M+1}^{N} a_n n^{-s} = R(M)M^{s_0-s} - R(N)N^{s_0-s} + (s_0 - s)\int_M^N R(u)u^{s_0-s-1}\,du. \tag{1.7} \]

To see this we note that $a_n = (R(n-1) - R(n))\,n^{s_0}$, so that by partial
summation
\[ \sum_{n=M+1}^{N} a_n n^{-s} = \sum_{n=M+1}^{N} (R(n-1) - R(n))\,n^{s_0-s} \]
\[ = R(M)M^{s_0-s} - R(N)N^{s_0-s} - \sum_{n=M+1}^{N} R(n-1)\left((n-1)^{s_0-s} - n^{s_0-s}\right). \]
The second factor in this last sum can be expressed as an integral,
\[ (n-1)^{s_0-s} - n^{s_0-s} = -(s_0 - s)\int_{n-1}^{n} u^{s_0-s-1}\,du, \]
and hence the sum is
\[ (s - s_0)\sum_{n=M+1}^{N} R(n-1)\int_{n-1}^{n} u^{s_0-s-1}\,du = (s - s_0)\sum_{n=M+1}^{N} \int_{n-1}^{n} R(u)u^{s_0-s-1}\,du \]
since $R(u)$ is constant in the interval $[n-1, n)$. The integrals combine to give
(1.7).
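If $|R(u)| \le \varepsilon$ for all $u \ge M$ and if $\sigma > \sigma_0$, then from (1.7) we see that
\[ \left| \sum_{n=M+1}^{N} a_n n^{-s} \right| \le 2\varepsilon + \varepsilon|s - s_0| \int_M^{\infty} u^{\sigma_0-\sigma-1}\,du \le \left( 2 + \frac{|s - s_0|}{\sigma - \sigma_0} \right)\varepsilon. \]
For $s$ in the prescribed region we see that
\[ |s - s_0| \le \sigma - \sigma_0 + |t - t_0| \le (H + 1)(\sigma - \sigma_0), \]
so that the sum $\sum_{n=M+1}^{N} a_n n^{-s}$ is uniformly small, and the result follows by the
uniform version of Cauchy's principle.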

In deriving (1.7) we used partial summation, although it would have been
more efficient to use the properties of the Riemann–Stieltjes integral (see
Appendix A):
\[ \sum_{n=M+1}^{N} a_n n^{-s} = -\int_M^N u^{s_0-s}\,dR(u) = -u^{s_0-s}R(u)\Big|_M^N + \int_M^N R(u)\,du^{s_0-s} \]
by Theorems A.1 and A.2. By Theorem A.3 this is
\[ = M^{s_0-s}R(M) - N^{s_0-s}R(N) + (s_0 - s)\int_M^N R(u)u^{s_0-s-1}\,du. \]
In more complicated situations it is an advantage to use the Riemann–Stieltjes
integral, and subsequently we shall do so without apology.

The series $\alpha(s) = \sum a_n n^{-s}$ is locally uniformly convergent for $\sigma > \sigma_c$, and
each term is an analytic function, so it follows from a general principle of
Weierstrass that $\alpha(s)$ is analytic for $\sigma > \sigma_c$, and that the differentiated series is
locally uniformly convergent to $\alpha'(s)$:
\[ \alpha'(s) = -\sum_{n=1}^{\infty} a_n (\log n) n^{-s} \tag{1.8} \]
for $s$ in the half-plane $\sigma > \sigma_c$.

Suppose that $s_0$ is a point on the line of convergence (i.e., $\sigma_0 = \sigma_c$), and that
the series $\alpha(s_0)$ converges. It can be shown by example that
\[ \lim_{\substack{s \to s_0 \\ \sigma > \sigma_c}} \alpha(s) \]
need not exist. However, $\alpha(s)$ is continuous in the sector $S$ of Theorem 1.1, in
view of the uniform convergence there. That is,
\[ \lim_{\substack{s \to s_0 \\ s \in S}} \alpha(s) = \alpha(s_0), \tag{1.9} \]
which is analogous to Abel's theorem for power series.

We now express a convergent Dirichlet series as an absolutely convergent
integral.

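Theorem 1.3 Let $A(x) = \sum_{n \le x} a_n$. If $\sigma_c < 0$, then $A(x)$ is a bounded function, and
\[ \sum_{n=1}^{\infty} a_n n^{-s} = s\int_1^{\infty} A(x)x^{-s-1}\,dx \tag{1.10} \]
for $\sigma > 0$. If $\sigma_c \ge 0$, then
\[ \limsup_{x \to \infty} \frac{\log |A(x)|}{\log x} = \sigma_c, \tag{1.11} \]
and (1.10) holds for $\sigma > \sigma_c$.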
Proof We note that
\[ \sum_{n=1}^{N} a_n n^{-s} = \int_{1^-}^{N} x^{-s}\,dA(x) = A(x)x^{-s}\Big|_{1^-}^{N} - \int_{1^-}^{N} A(x)\,dx^{-s} = A(N)N^{-s} + s\int_1^N A(x)x^{-s-1}\,dx. \]
Let $\varphi$ denote the left-hand side of (1.11). If $\theta > \varphi$ then $A(x) \ll x^{\theta}$ where the
implicit constant may depend on the $a_n$ and on $\theta$. Thus if $\sigma > \theta$, then the integral
in (1.10) is absolutely convergent. Thus we obtain (1.10) by letting $N \to \infty$,
since the first term above tends to 0 as $N \to \infty$.
Suppose that $\sigma_c < 0$. By Corollary 1.2 we know that $A(x)$ tends to a finite
limit as $x \to \infty$, and hence $\varphi \le 0$, so that (1.10) holds for all $\sigma > 0$.

Now suppose that $\sigma_c \ge 0$. By Corollary 1.2 we know that the series in (1.10)
diverges when $\sigma < \sigma_c$. Hence $\varphi \ge \sigma_c$. To complete the proof it suffices to show
that $\varphi \le \sigma_c$. Choose $\sigma_0 > \sigma_c$. By (1.7) with $s = 0$ and $M = 0$ we see that
\[ A(N) = -R(N)N^{\sigma_0} + \sigma_0\int_0^N R(u)u^{\sigma_0-1}\,du. \]
Since $R(u)$ is a bounded function, it follows that $A(N) \ll N^{\sigma_0}$ where the implicit
constant may depend on the $a_n$ and on $\sigma_0$. Hence $\varphi \le \sigma_0$. Since this holds for
any $\sigma_0 > \sigma_c$, we conclude that $\varphi \le \sigma_c$.

The terms of a power series are majorized by a geometric progression at
points strictly inside the circle of convergence. Consequently power series
converge very rapidly. In contrast, Dirichlet series are not so well behaved. For
example, the series
\[ \sum_{n=1}^{\infty} (-1)^{n-1} n^{-s} \tag{1.12} \]
converges for $\sigma > 0$, but it is absolutely convergent only for $\sigma > 1$. In general
we let $\sigma_a$ denote the infimum of those $\sigma$ for which $\sum_{n=1}^{\infty} |a_n| n^{-\sigma} < \infty$. Then $\sigma_a$,
the abscissa of absolute convergence, is the abscissa of convergence of the series
$\sum_{n=1}^{\infty} |a_n| n^{-s}$, and we see that $\sum a_n n^{-s}$ is absolutely convergent if $\sigma > \sigma_a$,
but not if $\sigma < \sigma_a$. We now show that the strip $\sigma_c \le \sigma \le \sigma_a$ of conditional
convergence is never wider than in the example (1.12).

Theorem 1.4 In the above notation, σc ≤ σa ≤ σc + 1.

Proof The first inequality is obvious. To prove the second, suppose that $\varepsilon > 0$.
Since the series $\sum a_n n^{-\sigma_c-\varepsilon}$ is convergent, the summands tend to 0, and hence
$a_n \ll n^{\sigma_c+\varepsilon}$ where the implicit constant may depend on the $a_n$ and on $\varepsilon$. Hence
the series $\sum a_n n^{-\sigma_c-1-2\varepsilon}$ is absolutely convergent by comparison with the series
$\sum n^{-1-\varepsilon}$.

Clearly a Dirichlet series α(s) is uniformly bounded in the half-plane

σ > σa + ε, but this is not generally the case in the strip of conditional conver-
gence. Nevertheless, we can limit the rate of growth of α(s) in this strip.
To aid in formulating our next result we introduce a notational convention
that arises because many estimates relating to Dirichlet series are expressed
in terms of the size of |t|. Our interest is in large values of this quantity, but
in order that the statements be valid for small |t| we sometimes write |t| + 4.
Since this is cumbersome in complicated expressions, we introduce a shorthand:
τ = |t| + 4.

Theorem 1.5 Suppose that $\alpha(s) = \sum a_n n^{-s}$ has abscissa of convergence $\sigma_c$.
If $\delta$ and $\varepsilon$ are fixed, $0 < \varepsilon < \delta < 1$, then
\[ \alpha(s) \ll \tau^{1-\delta+\varepsilon} \]
uniformly for $\sigma \ge \sigma_c + \delta$. The implicit constant may depend on the coefficients
$a_n$, on $\delta$, and on $\varepsilon$.

By the example found in Exercise 8 at the end of this section, we see that
the bound above is reasonably sharp.

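Proof Let $s$ be a complex number with $\sigma \ge \sigma_c + \delta$. By (1.7) with $s_0 = \sigma_c + \varepsilon$
and $N \to \infty$, we see that
\[ \alpha(s) = \sum_{n=1}^{M} a_n n^{-s} + R(M)M^{\sigma_c+\varepsilon-s} + (\sigma_c + \varepsilon - s)\int_M^{\infty} R(u)u^{\sigma_c+\varepsilon-s-1}\,du. \]
Since the series $\alpha(\sigma_c + \varepsilon)$ converges, we know that $a_n \ll n^{\sigma_c+\varepsilon}$, and also that
$R(u) \ll 1$. Thus the above is
\[ \ll \sum_{n=1}^{M} n^{-\delta+\varepsilon} + M^{-\delta+\varepsilon} + \frac{|\sigma_c + \varepsilon - s|}{\sigma - \sigma_c - \varepsilon}\,M^{\sigma_c+\varepsilon-\sigma}. \]
By the integral test the sum here is
\[ < \int_0^M u^{-\delta+\varepsilon}\,du = \frac{M^{1-\delta+\varepsilon}}{1-\delta+\varepsilon} \ll M^{1-\delta+\varepsilon}. \]
Hence on taking $M = [\tau]$ we obtain the stated estimate.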

We know that the power series expansion of a function is unique; we now

show that the same is true for Dirichlet series expansions.

Theorem 1.6 If $\sum a_n n^{-s} = \sum b_n n^{-s}$ for all $s$ with $\sigma > \sigma_0$ then $a_n = b_n$ for
all positive integers $n$.

Proof We put $c_n = a_n - b_n$, and consider $\sum c_n n^{-s}$. Suppose that $c_n = 0$ for
all $n < N$. Since $\sum c_n n^{-\sigma} = 0$ for $\sigma > \sigma_0$ we may write
\[ c_N = -\sum_{n>N} c_n (N/n)^{\sigma}. \]
By Theorem 1.4 this sum is absolutely convergent for $\sigma > \sigma_0 + 1$. Since each
term tends to 0 as $\sigma \to \infty$, we see that the right-hand side tends to 0, by
the principle of dominated convergence. Hence $c_N = 0$, and by induction we
deduce that this holds for all $N$.

Suppose that $f$ is analytic in a domain $D$, and that $0 \in D$. Then $f$ can
be expressed as a power series $\sum_{n=0}^{\infty} a_n z^n$ in the disc $|z| < r$ where $r$ is the
distance from 0 to the boundary $\partial D$ of $D$. Although Dirichlet series are analytic
functions, the situation regarding Dirichlet series expansions is very different:
The collection of functions that may be expressed as a Dirichlet series in some
half-plane is a very special class. Moreover, the line $\sigma_c + it$ of convergence
need not contain a singular point of $\alpha(s)$. For example, the Dirichlet series
(1.12) has abscissa of convergence $\sigma_c = 0$, but it represents the entire function
$(1 - 2^{1-s})\zeta(s)$. (The connection of (1.12) to the zeta function is easy to establish,
since
\[ \sum_{n=1}^{\infty} (-1)^{n-1} n^{-s} = \sum_{n=1}^{\infty} n^{-s} - 2\sum_{\substack{n=1 \\ n \text{ even}}}^{\infty} n^{-s} = \zeta(s) - 2^{1-s}\zeta(s) \]
for $\sigma > 1$. That this is an entire function follows from Theorem 10.2.) Since a
Dirichlet series does not in general have a singularity on its line of convergence,
it is noteworthy that a Dirichlet series with non-negative coefficients not only
has a singularity on the line $\sigma_c + it$, but actually at the point $\sigma_c$.

Theorem 1.7 (Landau) Let $\alpha(s) = \sum a_n n^{-s}$ be a Dirichlet series whose
abscissa of convergence $\sigma_c$ is finite. If $a_n \ge 0$ for all $n$ then the point $\sigma_c$ is a
singularity of the function $\alpha(s)$.
It is enough to assume that $a_n \ge 0$ for all sufficiently large $n$, since any finite
sum $\sum_{n=1}^{N} a_n n^{-s}$ is an entire function.
Proof By replacing $a_n$ by $a_n n^{-\sigma_c}$, we may assume that $\sigma_c = 0$. Suppose that
$\alpha(s)$ is analytic at $s = 0$, so that $\alpha(s)$ is analytic in the domain $D = \{s : \sigma >
0\} \cup \{|s| < \delta\}$ if $\delta > 0$ is sufficiently small. We expand $\alpha(s)$ as a power series
at $s = 1$:
\[ \alpha(s) = \sum_{k=0}^{\infty} c_k (s - 1)^k. \tag{1.13} \]
The coefficients $c_k$ can be calculated by means of (1.8),
\[ c_k = \frac{\alpha^{(k)}(1)}{k!} = \frac{1}{k!}\sum_{n=1}^{\infty} a_n (-\log n)^k n^{-1}. \]
The radius of convergence of the power series (1.13) is the distance from 1 to
the nearest singularity of $\alpha(s)$. Since $\alpha(s)$ is analytic in $D$, and since the nearest
points not in $D$ are $\pm i\delta$, we deduce that the radius of convergence is at least
$\sqrt{1 + \delta^2} = 1 + \delta'$, say. That is,
\[ \alpha(s) = \sum_{k=0}^{\infty} \frac{(1 - s)^k}{k!} \sum_{n=1}^{\infty} a_n (\log n)^k n^{-1} \]
for $|s - 1| < 1 + \delta'$. If $s < 1$ then all terms above are non-negative. Since
series of non-negative numbers may be arbitrarily rearranged, for $-\delta' < s < 1$
we may interchange the summations over $k$ and $n$ to see that
\[ \alpha(s) = \sum_{n=1}^{\infty} a_n n^{-1} \sum_{k=0}^{\infty} \frac{(1 - s)^k (\log n)^k}{k!} = \sum_{n=1}^{\infty} a_n n^{-1} \exp\left((1 - s)\log n\right) = \sum_{n=1}^{\infty} a_n n^{-s}. \]
Hence this last series converges at $s = -\delta'/2$, contrary to the assumption that
$\sigma_c = 0$. Thus $\alpha(s)$ is not analytic at $s = 0$.

1.2.1 Exercises
1. Suppose that $\alpha(s)$ is a Dirichlet series, and that the series $\alpha(s_0)$ is boundedly
oscillating. Show that $\sigma_c = \sigma_0$.

2. Suppose that $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ is a Dirichlet series with abscissa of
convergence $\sigma_c$. Suppose that $\alpha(0)$ converges, and put $R(x) = \sum_{n>x} a_n$. Show
that $\sigma_c$ is the infimum of those numbers $\theta$ such that $R(x) \ll x^{\theta}$.

3. Let $A_k(x) = \sum_{n \le x} a_n (\log n)^k$.
(a) Show that
\[ A_0(x) - \frac{A_1(x)}{\log x} = a_1 + \int_2^x \frac{A_1(u)}{u(\log u)^2}\,du. \]
(b) Suppose that $A_1(x) \ll x^{\theta}$ where $\theta > 0$ and the implicit constant may
depend on the sequence $\{a_n\}$. Show that
\[ A_0(x) = \frac{A_1(x)}{\log x} + O\left(x^{\theta}(\log x)^{-2}\right). \]
(c) Let $\sigma_c$ denote the abscissa of convergence of $\sum a_n n^{-s}$, and $\sigma_c'$ the
abscissa of convergence of $\sum a_n (\log n) n^{-s}$. Show that $\sigma_c' = \sigma_c$. (The
remarks following the proof of Theorem 1.1 imply only that $\sigma_c' \le \sigma_c$.)

4. (Landau 1909b) Let $\alpha(s) = \sum a_n n^{-s}$ be a Dirichlet series with abscissa of
convergence $\sigma_c$ and abscissa of absolute convergence $\sigma_a > \sigma_c$. Let
$C(x) = \sum_{n \le x} a_n n^{-\sigma_c}$ and $A(x) = \sum_{n \le x} |a_n| n^{-\sigma_c}$.
(a) By a suitable application of Theorem 1.3, or otherwise, show that
$C(x) \ll x^{\varepsilon}$ and that $A(x) \ll x^{\sigma_a-\sigma_c+\varepsilon}$ for any $\varepsilon > 0$, where the implicit
constants may depend on $\varepsilon$ and on the sequence $\{a_n\}$.
(b) Show that if $\sigma > \sigma_c$ then
\[ \sum_{n>N} a_n n^{-s} = -C(N)N^{\sigma_c-s} + (s - \sigma_c)\int_N^{\infty} C(u)u^{\sigma_c-s-1}\,du. \]
Deduce that the above is $\ll \tau N^{\sigma_c-\sigma+\varepsilon}$ uniformly for $s$ in the half-plane
$\sigma \ge \sigma_c + \varepsilon$ where the implicit constant may depend on $\varepsilon$ and on the
sequence $\{a_n\}$.
(c) Show that
\[ \sum_{n=1}^{N} |a_n| n^{-\sigma} = A(N)N^{-\sigma+\sigma_c} + (\sigma - \sigma_c)\int_1^N A(u)u^{-\sigma+\sigma_c-1}\,du \]
for any $\sigma$. Deduce that the above is $\ll N^{\sigma_a-\sigma+\varepsilon}$ uniformly for $\sigma$ in the
interval $\sigma_c \le \sigma \le \sigma_a$, for any given $\varepsilon > 0$. Here the implicit constant
may depend on $\varepsilon$ and on the sequence $\{a_n\}$.
(d) Let $\theta(\sigma) = (\sigma_a - \sigma)/(\sigma_a - \sigma_c)$. By making a suitable choice of $N$, show
that
\[ \alpha(s) \ll \tau^{\theta(\sigma)+\varepsilon} \]
uniformly for $s$ in the strip $\sigma_c + \varepsilon \le \sigma \le \sigma_a$.

5. (a) Show that if $\alpha(s) = \sum a_n n^{-s}$ has abscissa of convergence $\sigma_c < \infty$, then
\[ \lim_{\sigma \to \infty} \alpha(\sigma) = a_1. \]
(b) Show that $\zeta'(s) = -\sum_{n=1}^{\infty} (\log n) n^{-s}$ for $\sigma > 1$.
(c) Show that $\lim_{\sigma \to \infty} \zeta'(\sigma) = 0$.
(d) Show that there is no half-plane in which $1/\zeta'(s)$ can be written as a
convergent Dirichlet series.

6. Let $\alpha(s) = \sum a_n n^{-s}$ be a Dirichlet series with $a_n \ge 0$ for all $n$. Show that
$\sigma_c = \sigma_a$, and that
\[ \sup_t |\alpha(s)| = \alpha(\sigma) \]
for any given $\sigma > \sigma_c$.

7. (Vivanti 1893; Pringsheim 1894) Suppose that $f(z) = \sum_{n=0}^{\infty} a_n z^n$ has radius
of convergence 1 and that $a_n \ge 0$ for all $n$. Show that $z = 1$ is a singular point
of $f$.

8. (Bohr 1910, p. 32) Let $t_1 = 4$, $t_{r+1} = 2^{t_r}$ for $r \ge 1$. Put $\alpha(s) = \sum a_n n^{-s}$
where $a_n = 0$ unless $n \in [t_r, 2t_r]$ for some $r$, in which case put
\[ a_n = \begin{cases} t_r^{it_r} & (n = t_r), \\ n^{it_r} - (n - 1)^{it_r} & (t_r < n < 2t_r), \\ -(2t_r - 1)^{it_r} & (n = 2t_r). \end{cases} \]
(a) Show that $\sum_{n=t_r}^{2t_r} a_n = 0$.
(b) Show that if $t_r \le x < 2t_r$ for some $r$, then $A(x) = [x]^{it_r}$ where
$A(x) = \sum_{n \le x} a_n$.
(c) Show that $A(x) \ll 1$ uniformly for $x \ge 1$.
(d) Deduce that $\alpha(s)$ converges for $\sigma > 0$.
(e) Show that $\alpha(it)$ does not converge; conclude that $\sigma_c = 0$.
(f) Show that if $\sigma > 0$, then
\[ \alpha(s) = \sum_{r=1}^{R} \sum_{n=t_r}^{2t_r} a_n n^{-s} + s\int_{t_{R+1}}^{\infty} A(x)x^{-s-1}\,dx. \]
(g) Suppose that $\sigma > 0$. Show that the above is
\[ \sum_{n=t_R}^{2t_R} a_n n^{-s} + O(t_{R-1}) + O\left(\frac{|s|}{\sigma t_{R+1}^{\sigma}}\right). \]
(h) Show that if $\sigma > 0$, then
\[ \sum_{n=t_R}^{2t_R} a_n n^{-s} = s\int_{t_R}^{2t_R} [x]^{it_R} x^{-s-1}\,dx. \]
(i) Show that if $n \le x < n + 1$, then $\Re\left(n^{it_R} x^{-it_R}\right) \ge 1/2$. Deduce that
\[ \left| \int_{t_R}^{2t_R} [x]^{it_R} x^{-\sigma-it_R-1}\,dx \right| \gg t_R^{-\sigma}. \]
(j) Suppose that $\delta > 0$ is fixed. Conclude that if $R \ge R_0(\delta)$, then
$|\alpha(\sigma + it_R)| \gg t_R^{1-\sigma}$ uniformly for $\delta \le \sigma \le 1 - \delta$.
(k) Show that $\sum |a_n| n^{-\sigma} < \infty$ when $\sigma > 1$. Deduce that $\sigma_a = 1$.

1.3 Euler products and the zeta function

The situation regarding products of Dirichlet series is somewhat complicated,
but it is useful to note that the formal calculation in (1.2) is justified if the series
are absolutely convergent.

Theorem 1.8 Let $\alpha(s) = \sum a_n n^{-s}$ and $\beta(s) = \sum b_n n^{-s}$ be two Dirichlet
series, and put $\gamma(s) = \sum c_n n^{-s}$ where the $c_n$ are given by (1.3). If $s$ is a point at
which the two series $\alpha(s)$ and $\beta(s)$ are both absolutely convergent, then $\gamma(s)$ is
absolutely convergent and $\gamma(s) = \alpha(s)\beta(s)$.

The mere convergence of $\alpha(s)$ and $\beta(s)$ is not sufficient to justify (1.2).
Indeed, the square of the series (1.12) can be shown to have abscissa of
convergence $\ge 1/4$.

A function is called an arithmetic function if its domain is the set $\mathbb{Z}$ of
integers, or some subset of the integers such as the natural numbers. An arithmetic
function $f(n)$ is said to be multiplicative if $f(1) = 1$ and if $f(mn) = f(m)f(n)$
whenever $(m, n) = 1$. Also, an arithmetic function $f(n)$ is called totally
multiplicative if $f(1) = 1$ and if $f(mn) = f(m)f(n)$ for all $m$ and $n$. If $f$ is
multiplicative then the Dirichlet series $\sum f(n)n^{-s}$ factors into a product over primes.
To see why this is so, we first argue formally (i.e., we ignore questions of
convergence). When the product
\[ \prod_p \left(1 + f(p)p^{-s} + f(p^2)p^{-2s} + f(p^3)p^{-3s} + \cdots\right) \]
is expanded, the generic term is
\[ \frac{f\left(p_1^{k_1}\right) f\left(p_2^{k_2}\right) \cdots f\left(p_r^{k_r}\right)}{\left(p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}\right)^s}. \]
Set $n = p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r}$. Since $f$ is multiplicative, the above is $f(n)n^{-s}$.
Moreover, this correspondence between products of prime powers and positive
integers $n$ is one-to-one, in view of the fundamental theorem of arithmetic. Hence
after rearranging the terms, we obtain the sum $\sum f(n)n^{-s}$. That is, we expect
that
\[ \sum_{n=1}^{\infty} f(n)n^{-s} = \prod_p \left(1 + f(p)p^{-s} + f(p^2)p^{-2s} + \cdots\right). \tag{1.14} \]
The product on the right-hand side is called the Euler product of the Dirichlet
series. The mere convergence of the series on the left does not imply that the
product converges; as in the case of the identity (1.2), we justify (1.14) only
under the stronger assumption of absolute convergence.

Theorem 1.9 If $f$ is multiplicative and $\sum |f(n)| n^{-\sigma} < \infty$, then (1.14) holds.

If $f$ is totally multiplicative, then the terms on the right-hand side in (1.14)
form a geometric progression, in which case the identity may be written more
concisely,
\[ \sum_{n=1}^{\infty} f(n)n^{-s} = \prod_p \left(1 - f(p)p^{-s}\right)^{-1}. \tag{1.15} \]

Proof For any prime $p$,
\[ \sum_{k=0}^{\infty} |f(p^k)| p^{-k\sigma} \le \sum_{n=1}^{\infty} |f(n)| n^{-\sigma} < \infty, \]
so each sum on the right-hand side of (1.14) is absolutely convergent. Let
$y$ be a positive real number, and let $\mathcal{N}$ be the set of those positive integers
composed entirely of primes not exceeding $y$, $\mathcal{N} = \{n : p \mid n \Rightarrow p \le y\}$. (Note
that $1 \in \mathcal{N}$.) Since a product of finitely many absolutely convergent series may
be arbitrarily rearranged, we see that
\[ \Pi_y = \prod_{p \le y} \left(1 + f(p)p^{-s} + f(p^2)p^{-2s} + \cdots\right) = \sum_{n \in \mathcal{N}} f(n)n^{-s}. \]
Hence
\[ \left| \Pi_y - \sum_{n=1}^{\infty} f(n)n^{-s} \right| \le \sum_{n \notin \mathcal{N}} |f(n)| n^{-\sigma}. \]
If $n \le y$ then all prime factors of $n$ are $\le y$, and hence $n \in \mathcal{N}$. Consequently
the sum on the right above is
\[ \le \sum_{n>y} |f(n)| n^{-\sigma}, \]
which is small if $y$ is large. Thus the partial products $\Pi_y$ tend to $\sum f(n)n^{-s}$ as
$y \to \infty$.
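The error bound in this proof can be observed numerically in the simplest case $f(n) = 1$, $s = 2$, where the series has the known value $\pi^2/6$ and the tail $\sum_{n>y} n^{-2}$ is less than $1/y$. The sketch below is an illustration only; `primes_up_to` and `euler_partial_product` are hypothetical helper names, and the sieve is a standard sieve of Eratosthenes.

```python
from math import pi

def primes_up_to(y):
    """All primes p <= y (y >= 1), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (y + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(y ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, y + 1, i)))
    return [i for i in range(2, y + 1) if sieve[i]]

def euler_partial_product(s, y):
    """Partial Euler product over primes p <= y for f = 1, as in (1.15)."""
    prod = 1.0
    for p in primes_up_to(y):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

for y in (10, 100, 1000):
    gap = abs(euler_partial_product(2, y) - pi ** 2 / 6)
    print(y, gap, gap < 1 / y)
```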

Let $\omega(n)$ denote the number of distinct primes dividing $n$, and let $\Omega(n)$ be
the number of distinct prime powers dividing $n$. That is,
\[ \omega(n) = \sum_{p \mid n} 1, \qquad \Omega(n) = \sum_{p^k \mid n} 1 = \sum_{p^k \| n} k. \tag{1.16} \]
It is easy to distinguish these functions, since $\omega(n) \le \Omega(n)$ for all $n$, with
equality if and only if $n$ is square-free. These functions are examples of additive
functions because they satisfy the functional relation $f(mn) = f(m) + f(n)$
whenever $(m, n) = 1$. Moreover, $\Omega(n)$ is totally additive because this
functional relation holds for all pairs $m$, $n$. An exponential of an additive function is
a multiplicative function. In particular, the Liouville lambda function is the
totally multiplicative function $\lambda(n) = (-1)^{\Omega(n)}$. Closely related is the Möbius mu
function, which is defined to be $\mu(n) = (-1)^{\omega(n)}$ if $n$ is square-free, $\mu(n) = 0$
otherwise. By the fundamental theorem of arithmetic we know that a
multiplicative (or additive) function is uniquely determined by its values at prime
powers, and similarly that a totally multiplicative (or totally additive) function
is uniquely determined by its values at the primes. Thus $\mu(n)$ is the unique
multiplicative function that takes the value $-1$ at every prime, and the value 0
at every higher power of a prime, while $\lambda(n)$ is the unique totally multiplicative
function that takes the value $-1$ at every prime. By using Theorem 1.9 we can
determine the Dirichlet series generating functions of $\lambda(n)$ and of $\mu(n)$ in terms
of the Riemann zeta function.
Corollary 1.10 For $\sigma > 1$,
\[ \sum_{n=1}^{\infty} n^{-s} = \zeta(s) = \prod_p (1 - p^{-s})^{-1}, \tag{1.17} \]
\[ \sum_{n=1}^{\infty} \mu(n)n^{-s} = \frac{1}{\zeta(s)} = \prod_p (1 - p^{-s}), \tag{1.18} \]
and
\[ \sum_{n=1}^{\infty} \lambda(n)n^{-s} = \frac{\zeta(2s)}{\zeta(s)} = \prod_p (1 + p^{-s})^{-1}. \tag{1.19} \]

Proof All three series are absolutely convergent, since $\sum n^{-\sigma} < \infty$ for $\sigma >
1$, by the integral test. Since the coefficients are multiplicative, the Euler product
formulae follow by Theorem 1.9. In the first and third cases use the variant
(1.15). On comparing the Euler products in (1.17) and (1.18), it is immediate
that the second of these Dirichlet series is $1/\zeta(s)$. As for (1.19), from the identity
$1 + z = (1 - z^2)/(1 - z)$ we deduce that
\[ \prod_p (1 + p^{-s}) = \frac{\prod_p (1 - p^{-2s})}{\prod_p (1 - p^{-s})} = \frac{\zeta(s)}{\zeta(2s)}. \]
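The identities (1.18) and (1.19) can be tested at $s = 2$, where $1/\zeta(2) = 6/\pi^2$ and $\zeta(4)/\zeta(2) = \pi^2/15$ are classical values. The sketch below computes $\omega$ and $\Omega$ with a smallest-prime-factor sieve (an implementation choice, not something the text prescribes), builds $\mu$ and $\lambda$ from their definitions, and compares truncated sums; the truncation error is below $1/N$ because the coefficients are bounded by 1.

```python
from math import pi

def omega_tables(limit):
    """Return lists (omega, Omega) indexed by n = 0..limit."""
    spf = list(range(limit + 1))              # smallest prime factor of n
    for i in range(2, int(limit ** 0.5) + 1):
        if spf[i] == i:
            for j in range(i * i, limit + 1, i):
                if spf[j] == j:
                    spf[j] = i
    omega = [0] * (limit + 1)
    Omega = [0] * (limit + 1)
    for n in range(2, limit + 1):
        p = spf[n]
        m = n // p
        Omega[n] = Omega[m] + 1
        omega[n] = omega[m] + (0 if m % p == 0 else 1)
    return omega, Omega

N = 2000
omega, Omega = omega_tables(N)
# mu(n) = (-1)^omega(n) if n is square-free (omega = Omega), else 0.
mu = [0] + [(-1) ** omega[n] if omega[n] == Omega[n] else 0 for n in range(1, N + 1)]
# lambda(n) = (-1)^Omega(n).
lam = [0] + [(-1) ** Omega[n] for n in range(1, N + 1)]

mu_sum = sum(mu[n] / n ** 2 for n in range(1, N + 1))
lam_sum = sum(lam[n] / n ** 2 for n in range(1, N + 1))
print(mu_sum, 6 / pi ** 2)
print(lam_sum, pi ** 2 / 15)
```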

The manipulation of Euler products, as exemplified above, provides a
powerful tool for relating one Dirichlet series to another.
In (1.17) we have expressed $\zeta(s)$ as an absolutely convergent product; hence
in particular $\zeta(s) \ne 0$ for $\sigma > 1$. We have not yet defined the zeta function
outside this half-plane, but we shall do so shortly, and later we shall find that
the zeta function does have zeros in the half-plane $\sigma \le 1$. These zeros play an
important role in determining the distribution of prime numbers.
Many important relations involving arithmetic functions can be expressed
succinctly in terms of Dirichlet series. For example, the fundamental elementary
identity
\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1 \end{cases} \tag{1.20} \]
is equivalent to the identity
\[ \zeta(s) \cdot \frac{1}{\zeta(s)} = 1, \]
in view of (1.3), (1.17), (1.18), and Theorem 1.6. More generally, if
\[ F(n) = \sum_{d \mid n} f(d) \tag{1.21} \]
for all $n$, then, apart from questions of convergence,
\[ \sum F(n)n^{-s} = \zeta(s) \sum f(n)n^{-s}. \]
By Möbius inversion, the identity (1.21) is equivalent to the relation
\[ f(n) = \sum_{d \mid n} \mu(d)F(n/d), \]
which is to say that
\[ \sum f(n)n^{-s} = \frac{1}{\zeta(s)} \sum F(n)n^{-s}. \]
Such formal manipulations can be used to suggest (or establish) many useful
elementary identities.
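On the level of coefficients, multiplying Dirichlet series corresponds to Dirichlet convolution, so (1.20) and Möbius inversion can be checked with finite tables. In the sketch below the names `dirichlet_convolve` and `mobius_table` are illustrative, $\mu$ is generated directly from the recursion implied by (1.20), and the test function $f(n) = n$ is an arbitrary choice.

```python
def dirichlet_convolve(f, g, limit):
    """h(n) = sum over d | n of f(d) g(n/d), for 1 <= n <= limit (index 0 unused)."""
    h = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(1, limit // d + 1):
            h[d * m] += f[d] * g[m]
    return h

def mobius_table(limit):
    """mu(1..limit), defined recursively by (1.20): mu(n) = -sum of mu(d), d | n, d < n."""
    mu = [0] * (limit + 1)
    mu[1] = 1
    for d in range(1, limit + 1):
        for n in range(2 * d, limit + 1, d):
            mu[n] -= mu[d]
    return mu

limit = 200
ones = [0] + [1] * limit                       # coefficients of zeta(s)
f = [0] + list(range(1, limit + 1))            # f(n) = n
mu = mobius_table(limit)
F = dirichlet_convolve(ones, f, limit)         # F(n) = sum of divisors of n, as in (1.21)
recovered = dirichlet_convolve(mu, F, limit)   # Möbius inversion
print(recovered == f)
```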
For $\sigma > 1$ the product (1.17) is absolutely convergent. Since $\log(1 - z)^{-1} =
\sum_{k=1}^{\infty} z^k/k$ for $|z| < 1$, it follows that
\[ \log \zeta(s) = \sum_p \log(1 - p^{-s})^{-1} = \sum_p \sum_{k=1}^{\infty} k^{-1} p^{-ks}. \]
On differentiating, we find also that
\[ \frac{\zeta'(s)}{\zeta(s)} = -\sum_p \sum_{k=1}^{\infty} (\log p) p^{-ks} \]
for $\sigma > 1$. This is a Dirichlet series, whose $n$th coefficient is the von Mangoldt
lambda function: $\Lambda(n) = \log p$ if $n$ is a power of $p$, $\Lambda(n) = 0$ otherwise.

Corollary 1.11 For $\sigma > 1$,
\[ \log \zeta(s) = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{\log n}\,n^{-s} \]
and
\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \Lambda(n)n^{-s}. \]

The quotient $f'(s)/f(s)$, obtained by differentiating the logarithm of $f(s)$,
is known as the logarithmic derivative of $f$. Subsequently we shall often write
it more concisely as $\frac{f'}{f}(s)$.

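The important elementary identity
\[ \sum_{d \mid n} \Lambda(d) = \log n \tag{1.22} \]
is reflected in the relation
\[ \zeta(s)\left(-\frac{\zeta'}{\zeta}(s)\right) = -\zeta'(s), \]
since
\[ -\zeta'(s) = \sum_{n=1}^{\infty} (\log n)n^{-s} \]
for $\sigma > 1$.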
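The identity (1.22) is simple to confirm for small $n$ in floating point. The sketch below uses trial division for the von Mangoldt function, which is adequate only for small arguments; the function name `von_mangoldt` is an illustrative choice.

```python
from math import log, isclose

def von_mangoldt(n):
    """Lambda(n) = log p if n > 1 is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:                 # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)                      # n itself is prime

def divisor_sum_of_lambda(n):
    """Left-hand side of (1.22)."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)

print(all(isclose(divisor_sum_of_lambda(n), log(n), abs_tol=1e-9)
          for n in range(1, 501)))
```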
We now continue the zeta function beyond the half-plane in which it was
initially defined.

Theorem 1.12 Suppose that $\sigma > 0$, $x > 0$, and that $s \ne 1$. Then
\[ \zeta(s) = \sum_{n \le x} n^{-s} + \frac{x^{1-s}}{s - 1} + \frac{\{x\}}{x^s} - s\int_x^{\infty} \{u\}u^{-s-1}\,du. \tag{1.23} \]
Here $\{u\}$ denotes the fractional part of $u$, so that $\{u\} = u - [u]$ where $[u]$
denotes the integral part of $u$.

Proof of Theorem 1.12 For $\sigma > 1$ we have
\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \sum_{n \le x} n^{-s} + \sum_{n>x} n^{-s}. \]
This second sum we write as
\[ \int_x^{\infty} u^{-s}\,d[u] = \int_x^{\infty} u^{-s}\,du - \int_x^{\infty} u^{-s}\,d\{u\}. \]
We evaluate the first integral on the right-hand side, and integrate the second
one by parts. Thus the above is
\[ = \frac{x^{1-s}}{s - 1} + \{x\}x^{-s} + \int_x^{\infty} \{u\}\,du^{-s}. \]
Since $(u^{-s})' = -su^{-s-1}$, the desired formula now follows by Theorem A.3.
The integral in (1.23) is convergent in the half-plane $\sigma > 0$, and uniformly so
for $\sigma \ge \delta > 0$. Since the integrand is an analytic function of $s$, it follows that the
integral is itself an analytic function for $\sigma > 0$. By the uniqueness of analytic
continuation the formula (1.23) holds in this larger half-plane.
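Formula (1.23) also gives a crude way to evaluate $\zeta(s)$ numerically for $0 < \sigma < 1$. Taking $x = N$ an integer makes $\{x\} = 0$, and since $0 \le \{u\} < 1$ the omitted integral term has absolute value at most $|s| N^{-\sigma}/\sigma$. The sketch below drops that term; `zeta_approx` is a hypothetical helper, and the reference value $\zeta(1/2) \approx -1.4603545088$ is a standard tabulated constant rather than something derived in the text.

```python
from math import pi

def zeta_approx(s, N):
    """Approximate zeta(s) for Re s > 0, s != 1, from (1.23) with x = N.

    The term -s * integral_N^infinity {u} u^(-s-1) du is omitted, so the
    error is at most |s| * N**(-sigma) / sigma.
    """
    partial = sum(n ** (-s) for n in range(1, N + 1))
    return partial + N ** (1 - s) / (s - 1)

print(zeta_approx(2, 10 ** 4), pi ** 2 / 6)
print(zeta_approx(0.5, 10 ** 5))   # compare with zeta(1/2) = -1.4603545088...
```

Euler–Maclaurin summation, mentioned in the text, gives far better accuracy for the same $N$; the point here is only that (1.23) already determines $\zeta(s)$ to the left of $\sigma = 1$.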
Figure 1.2 The Riemann zeta function $\zeta(s)$ for $0 < s \le 5$.

By taking $x = 1$ in (1.23) we obtain in particular the identity
\[ \zeta(s) = \frac{s}{s - 1} - s\int_1^{\infty} \{u\}u^{-s-1}\,du \tag{1.24} \]
for $\sigma > 0$. Hence we have

Corollary 1.13 The Riemann zeta function has a simple pole at $s = 1$ with
residue 1, but is otherwise analytic in the half-plane $\sigma > 0$.
A graph of $\zeta(s)$ that exhibits the pole at $s = 1$ is provided in Figure 1.2. By
repeatedly integrating by parts we can continue $\zeta(s)$ into successively larger
half-planes; this is systematized by using the Euler–Maclaurin summation
formula (see Theorem B.5). In Chapter 10 we shall continue the zeta function by a
different method. For the present we note that (1.24) yields useful inequalities
for the zeta function on the real line.
Corollary 1.14 The inequalities
\[ \frac{1}{\sigma - 1} < \zeta(\sigma) < \frac{\sigma}{\sigma - 1} \]
hold for all $\sigma > 0$. In particular, $\zeta(\sigma) < 0$ for $0 < \sigma < 1$.
Proof From the inequalities $0 \le \{u\} < 1$ it follows that
\[ 0 \le \int_1^{\infty} \{u\}u^{-\sigma-1}\,du < \int_1^{\infty} u^{-\sigma-1}\,du = \frac{1}{\sigma}. \]
This suffices.

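We now put the parameter $x$ in (1.23) to good use.

Corollary 1.15 Let $\delta$ be fixed, $\delta > 0$. Then for $\sigma \ge \delta$, $s \ne 1$,
\[ \sum_{n \le x} n^{-s} = \frac{x^{1-s}}{1 - s} + \zeta(s) + O(\tau x^{-\sigma}). \tag{1.25} \]
In addition,
\[ \sum_{n \le x} \frac{1}{n} = \log x + C_0 + O(1/x) \tag{1.26} \]
where $C_0$ is Euler's constant,
\[ C_0 = 1 - \int_1^{\infty} \{u\}u^{-2}\,du = 0.5772156649\ldots. \tag{1.27} \]

Proof The first estimate follows by crudely estimating the integral in (1.23):
\[ \int_x^{\infty} \{u\}u^{-s-1}\,du \ll \int_x^{\infty} u^{-\sigma-1}\,du = \frac{x^{-\sigma}}{\sigma}. \]
As for the second estimate, we note that the sum is
\[ \int_{1^-}^{x} u^{-1}\,d[u] = \int_{1^-}^{x} u^{-1}\,du - \int_{1^-}^{x} u^{-1}\,d\{u\} = \log x + 1 - \{x\}/x - \int_1^x \{u\}u^{-2}\,du. \]
The result now follows by writing $\int_1^x = \int_1^{\infty} - \int_x^{\infty}$, and noting that
\[ \int_x^{\infty} \{u\}u^{-2}\,du \le \int_x^{\infty} u^{-2}\,du = 1/x. \]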
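The estimate (1.26) is easy to observe numerically at integer arguments. The sketch below uses the ten-digit value of $C_0$ quoted in (1.27) and checks that the difference between the harmonic sum and $\log N + C_0$ is positive and below $1/N$; the constant name `EULER_C0` and the function `harmonic` are illustrative.

```python
from math import log

EULER_C0 = 0.5772156649   # the value quoted in (1.27), truncated to ten digits

def harmonic(N):
    """Partial sum of 1/n over 1 <= n <= N."""
    return sum(1.0 / n for n in range(1, N + 1))

for N in (10, 100, 1000, 10000):
    diff = harmonic(N) - log(N) - EULER_C0
    print(N, diff, 0 < diff < 1 / N)
```

The observed differences are close to $1/(2N)$, consistent with the $O(1/x)$ error term.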

By letting $s \to 1$ in (1.25) and comparing the result with (1.26), or by letting
$s \to 1$ in (1.24) and comparing the result with (1.27), we obtain
Corollary 1.16 Let
\[ \zeta(s) = \frac{1}{s - 1} + \sum_{k=0}^{\infty} a_k (s - 1)^k \tag{1.28} \]
be the Laurent expansion of $\zeta(s)$ at $s = 1$. Then $a_0$ is Euler's constant, $a_0 = C_0$.
Euler's constant also arises in the theory of the gamma function. (See
Appendix C and Chapter 10.)
Corollary 1.17 Let $\delta > 0$ be fixed. Then
\[ \zeta(s) = \frac{1}{s - 1} + O(1) \]
uniformly for $s$ in the rectangle $\delta \le \sigma \le 2$, $|t| \le 1$, and
\[ \zeta(s) \ll \left(1 + \tau^{1-\sigma}\right) \min\left(\frac{1}{|\sigma - 1|}, \log \tau\right) \]
uniformly for $\delta \le \sigma \le 2$, $|t| \ge 1$.

Proof The first assertion is clear from (1.24). When $|t|$ is larger, we obtain
a bound for $|\zeta(s)|$ by estimating the sum in (1.25). Assume that $x \ge 2$. We
observe that
\[ \sum_{n \le x} n^{-s} \ll \sum_{n \le x} n^{-\sigma} \ll 1 + \int_1^x u^{-\sigma}\,du \]
uniformly for $\sigma \ge 0$. If $0 \le \sigma \le 1 - 1/\log x$, then this integral is
$(x^{1-\sigma} - 1)/(1 - \sigma) < x^{1-\sigma}/(1 - \sigma)$. If $|\sigma - 1| \le 1/\log x$, then $u^{-\sigma} \asymp u^{-1}$
uniformly for $1 \le u \le x$, and hence the integral is $\asymp \int_1^x u^{-1}\,du = \log x$. If
$\sigma \ge 1 + 1/\log x$, then the integral is $< \int_1^{\infty} u^{-\sigma}\,du = 1/(\sigma - 1)$. Thus
\[ \sum_{n \le x} n^{-s} \ll \left(1 + x^{1-\sigma}\right) \min\left(\frac{1}{|\sigma - 1|}, \log x\right) \tag{1.29} \]
uniformly for $0 \le \sigma \le 2$. The second assertion now follows by taking $x = \tau$
in (1.25).

1.3.1 Exercises
1. Suppose that $f(mn) = f(m)f(n)$ whenever $(m, n) = 1$, and that $f$ is not
identically 0. Deduce that $f(1) = 1$, and hence that $f$ is multiplicative.

2. (Stieltjes 1887) Suppose that $\sum a_n$ converges, that $\sum |b_n| < \infty$, and that
$c_n$ is given by (1.3). Show that $\sum c_n$ converges to $\left(\sum a_n\right)\left(\sum b_n\right)$. (Hint:
Write $\sum_{n \le x} c_n = \sum_{n \le x} b_n A(x/n)$ where $A(y) = \sum_{n \le y} a_n$.)

3. Determine $\sum \varphi(n)n^{-s}$, $\sum \sigma(n)n^{-s}$, and $\sum |\mu(n)|n^{-s}$ in terms of the zeta
function. Here $\varphi(n)$ is Euler's ‘totient function’, which is the number of $a$,
$1 \le a \le n$, such that $(a, n) = 1$.
4. Let $q$ be a positive integer. Show that if $\sigma > 1$, then
\[ \sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} n^{-s} = \zeta(s) \prod_{p \mid q} (1 - p^{-s}). \]

5. Show that if $\sigma > 1$, then
\[ \sum_{n=1}^{\infty} d(n)^2 n^{-s} = \zeta(s)^4/\zeta(2s). \]

6. Let $\sigma_a(n) = \sum_{d \mid n} d^a$. Show that
\[ \sum_{n=1}^{\infty} \sigma_a(n)\sigma_b(n)n^{-s} = \zeta(s)\zeta(s - a)\zeta(s - b)\zeta(s - a - b)/\zeta(2s - a - b) \]
when $\sigma > \max\left(1, 1 + \Re a, 1 + \Re b, 1 + \Re(a + b)\right)$.

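7. Let $F(s) = \sum_p (\log p)p^{-s}$, $G(s) = \sum_p p^{-s}$ for $\sigma > 1$. Show that in this
half-plane,
\[ -\frac{\zeta'}{\zeta}(s) = \sum_{k=1}^{\infty} F(ks), \]
\[ F(s) = -\sum_{d=1}^{\infty} \mu(d)\frac{\zeta'}{\zeta}(ds), \]
\[ \log \zeta(s) = \sum_{k=1}^{\infty} G(ks)/k, \]
\[ G(s) = \sum_{d=1}^{\infty} \frac{\mu(d)}{d} \log \zeta(ds). \]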
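8. Let $F(s)$ and $G(s)$ be defined as in the preceding problem. Show that if
$\sigma > 1$, then
\[ \sum_{n=1}^{\infty} \omega(n)n^{-s} = \zeta(s)G(s) = \zeta(s)\sum_{d=1}^{\infty} \frac{\mu(d)}{d} \log \zeta(ds), \]
\[ \sum_{n=1}^{\infty} \Omega(n)n^{-s} = \zeta(s)\sum_{k=1}^{\infty} G(ks) = \zeta(s)\sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log \zeta(ks). \]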
9. Let $t$ be a fixed real number, $t \ne 0$. Describe the limit points of the sequence
of partial sums $\sum_{n \le x} n^{-1-it}$.
10. Show that $\sum_{n=1}^{N} n^{-1} > \log N + C_0$ for all positive integers $N$, and that
$\sum_{n \le x} n^{-1} > \log x$ for all positive real numbers $x$.

11. (a) Show that if $a_n$ is totally multiplicative, and if $\alpha(s) = \sum a_n n^{-s}$ has
abscissa of convergence $\sigma_c$, then
\[ \sum_{n=1}^{\infty} (-1)^{n-1} a_n n^{-s} = (1 - 2a_2 2^{-s})\alpha(s) \]
for $\sigma > \sigma_c$.
(b) Show that
\[ \sum_{n=1}^{\infty} (-1)^{n-1} n^{-s} = (1 - 2^{1-s})\zeta(s) \]
for $\sigma > 0$.
(c) (Shafer 1984) Show that
\[ \sum_{n=1}^{\infty} (-1)^n (\log n)n^{-1} = C_0 \log 2 - \frac{1}{2}(\log 2)^2. \]
12. (Stieltjes 1885) Show that if $k$ is a positive integer, then
\[ \sum_{n \le x} \frac{(\log n)^k}{n} = \frac{(\log x)^{k+1}}{k + 1} + C_k + O_k\left(\frac{(\log x)^k}{x}\right) \]
for $x \ge 1$ where
\[ C_k = \int_1^{\infty} \{u\}(\log u)^{k-1}(k - \log u)u^{-2}\,du. \]
Show that the numbers $a_k$ in (1.28) are given by $a_k = (-1)^k C_k/k!$.

13. Let $D$ be the disc of radius 1 and centre 2. Suppose that the numbers $\varepsilon_k$ tend
monotonically to 0, that the numbers $t_k$ tend monotonically to 0, and that
the numbers $N_k$ tend monotonically to infinity. We consider the Dirichlet
series $\alpha(s) = \sum_n a_n n^{-s}$ with coefficients $a_n = \varepsilon_k n^{it_k}$ for $N_{k-1} < n \le N_k$.
For suitable choices of the $\varepsilon_k$, $t_k$, and $N_k$ we show that the series converges
at $s = 1$ but that it is not uniformly convergent in $D$.
(a) Suppose that $\sigma_k = 2 - \sqrt{1 - t_k^2}$, so that $s_k = \sigma_k + it_k \in D$. Show that if
\[ N_k^{t_k^2} \ll 1, \tag{1.30} \]
then
\[ \left| \sum_{N_{k-1} < n \le N_k} a_n n^{-s_k} \right| \gg \varepsilon_k \log \frac{N_k}{N_{k-1}}. \]
Thus if
\[ \varepsilon_k \log \frac{N_k}{N_{k-1}} \gg 1 \tag{1.31} \]
then the series is not uniformly convergent in $D$.
(b) By using Corollary 1.15, or otherwise, show that if $(a, b] \subseteq (N_{k-1}, N_k]$,
then
\[ \sum_{a < n \le b} a_n n^{-1} \ll \frac{\varepsilon_k}{t_k}. \]
Hence if
\[ \sum_{k=1}^{\infty} \frac{\varepsilon_k}{t_k} < \infty, \tag{1.32} \]
then the series $\alpha(1)$ converges.
(c) Show that the parameters can be chosen so that (1.30)–(1.32) hold, say
by taking $N_k = \exp(1/\varepsilon_k)$ and $t_k = \varepsilon_k^{1/2}$ with $\varepsilon_k$ tending rapidly to 0.

14. Let $t(n) = (-1)^{\Omega(n)-\omega(n)} \prod_{p \mid n} (p - 1)^{-1}$, and put $T(s) = \sum_n t(n)n^{-s}$.
(a) Show that for $\sigma > 0$, $T(s)$ has the absolutely convergent Euler product
\[ T(s) = \prod_p \left(1 + \frac{1}{(p - 1)(p^s + 1)}\right). \]
(b) Determine all zeros of the function $1 + 1/((p - 1)(p^s + 1))$.
(c) Show that the line $\sigma = 0$ is a natural boundary of the function $T(s)$.
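15. Suppose throughout that $0 < \alpha \le 1$. For $\sigma > 1$ we define the Hurwitz zeta
function by the formula
\[ \zeta(s, \alpha) = \sum_{n=0}^{\infty} (n + \alpha)^{-s} . \]
Thus $\zeta(s, 1) = \zeta(s)$.

(a) Show that $\zeta(s, 1/2) = (2^s - 1)\zeta(s)$.
(b) Show that if $x \ge 0$ then
\[ \zeta(s, \alpha) = \sum_{0\le n\le x} (n + \alpha)^{-s} + \frac{(x + \alpha)^{1-s}}{s - 1} + \frac{\{x\}}{(x + \alpha)^{s}} - s\int_x^{\infty} \{u\}(u + \alpha)^{-s-1}\,du. \]
(c) Deduce that $\zeta(s, \alpha)$ is an analytic function of $s$ for $\sigma > 0$ apart from a
simple pole at $s = 1$ with residue 1.
(d) Show that
\[ \lim_{s\to 1}\Big(\zeta(s, \alpha) - \frac{1}{s - 1}\Big) = 1/\alpha - \log\alpha - \int_0^{\infty} \frac{\{u\}}{(u + \alpha)^2}\,du. \]
(e) Show that
\[ \lim_{s\to 1}\Big(\zeta(s, \alpha) - \frac{1}{s - 1}\Big) = \sum_{0\le n\le x} \frac{1}{n + \alpha} - \log(x + \alpha) + \frac{\{x\}}{x + \alpha} - \int_x^{\infty} \frac{\{u\}}{(u + \alpha)^2}\,du. \]
(f) Let $x \to \infty$ in the above, and use (C.2), (C.10) to show that
\[ \lim_{s\to 1}\Big(\zeta(s, \alpha) - \frac{1}{s - 1}\Big) = -\frac{\Gamma'}{\Gamma}(\alpha). \]
(This is consistent with Corollary 1.16, in view of (C.11).)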

1.4 Notes
Section 1.1. For a brief introduction to the Hardy–Littlewood circle method,
including its application to Waring’s problem, see Davenport (2005). For a
comprehensive account of the method, see Vaughan (1997). Other examples
of the fruitful use of generating functions are found in many sources, such as
Andrews (1976) and Wilf (1994).
Algorithms for the efficient computation of π(x) have been developed
by Meissel (Lehmer, 1959), Mapes (1963), Lagarias, Miller & Odlyzko
(1985), Deléglise & Rivat (1996), and by X. Gourdon. For discussion
of these methods, see Chapter 1 of Riesel (1994) and the web page of
Gourdon & Sebah at http://numbers.computation.free.fr/Constants/Primes/countingPrimes.html.
The ‘big oh’ notation was introduced by Paul Bachmann (1894, p. 401). The
‘little oh’ was introduced by Edmund Landau (1909a, p. 61). The $\asymp$ notation
was introduced by Hardy (1910, p. 2). Our notation $f \sim g$ also follows Hardy
(1910). The Omega notation was introduced by G. H. Hardy and J. E. Littlewood
(1914, p. 225). Ingham (1932) replaced the $\Omega_R$ and $\Omega_L$ of Hardy and Littlewood
by $\Omega_+$ and $\Omega_-$. The $\ll$ notation is due to I. M. Vinogradov.

Section 1.2. The series $\sum a_n n^{-s}$ is called an ordinary Dirichlet series,
to distinguish it from a generalized Dirichlet series, which is a sum of the
form $\sum a_n e^{-\lambda_n s}$ where $0 < \lambda_1 < \lambda_2 < \cdots$, $\lambda_n \to \infty$. We see that generalized
Dirichlet series include both ordinary Dirichlet series ($\lambda_n = \log n$) and power
series ($\lambda_n = n$). Theorems 1.1, 1.3, 1.6, and 1.7 extend naturally to generalized
Dirichlet series, and even to the more general class of functions $\int_0^{\infty} e^{-us}\,dA(u)$
where $A(u)$ is assumed to have finite variation on each finite interval $[0, U]$.
The proof of the general form of Theorem 1.6 must be modified to depend on
uniform, rather than absolute, convergence, since a generalized Dirichlet series
may be never more than conditionally convergent (e.g., $\sum (-1)^n (\log n)^{-s}$).
If we put $a = \limsup (\log n)/\lambda_n$, then the general form of Theorem 1.4
reads $\sigma_c \le \sigma_a \le \sigma_c + a$. Hardy & Riesz (1915) have given a detailed
account of this subject, with historical attributions. See also Bohr & Cramér
(1923).
Jensen (1884) showed that the domain of convergence of a generalized
Dirichlet series is always a half-plane. The more precise information provided
by Theorem 1.1 is due to Cahen (1894) who proved it not only for ordinary
Dirichlet series but also for generalized Dirichlet series.
The construction in Exercise 1.2.8 would succeed with the simpler choice
$a_n = n^{it_r}$ for $t_r \le n \le 2t_r$, $a_n = 0$ otherwise, but then to complete the
argument one would need a further tool, such as the Kusmin–Landau inequality

(cf. Mordell 1958). The square of the Dirichlet series in Exercise 1.2.8 has ab-
scissa of convergence 1/2; this bears on the result of Exercise 2.1.9. Information
concerning the convergence of the product of two Dirichlet series is found in
Exercises 1.3.2, 2.1.9, 5.2.16, and in Hardy & Riesz (1915).
Theorem 1.7 originates in Landau (1905). The analogue for power series had
been proved earlier by Vivanti (1893) and Pringsheim (1894). Landau’s proof
extends to generalized Dirichlet series (including power series).

Section 1.3. The hypothesis $\sum |f(n)| n^{-\sigma} < \infty$ of Theorem 1.9 is equivalent
to the assertion that
\[ \prod_p \big(1 + |f(p)| p^{-\sigma} + |f(p^2)| p^{-2\sigma} + \cdots\big) < \infty, \]
which is slightly stronger than merely asserting that the Euler product converges
absolutely. We recall that a product $\prod_n (1 + a_n)$ is said to be absolutely
convergent if $\prod_n (1 + |a_n|) < \infty$. To see that the hypothesis
$\prod_p (1 + |f(p) p^{-s} + \cdots|) < \infty$ is not sufficient, consider the following example due to Ingham:
For every prime $p$ we take $f(p) = 1$, $f(p^2) = -1$, and $f(p^k) = 0$ for $k > 2$.
Then the product is absolutely convergent at $s = 0$, but the terms $f(n)$ do not
tend to 0, and hence the series $\sum f(n)$ diverges. Indeed, it can be shown that
$\sum_{n\le x} f(n) \sim cx$ as $x \to \infty$ where $c = \prod_p \big(1 - 2p^{-2} + p^{-3}\big) > 0$.
Euler (1735) deﬁned the constant C0 , which he denoted C.
Mascheroni (1790) called the constant γ , which is in common use, but
we wish to reserve this symbol for the imaginary part of a zero of the
zeta function or an L-function. It is conjectured that Euler’s constant C0
is irrational. The early history of the determination of the initial digits of
C0 has been recounted by Nielsen (1906, pp. 8–9). More recently, Wrench
(1952) computed 328 digits, Knuth (1962) computed 1,271 digits, Sweeney
(1963) computed 3,566 digits, Beyer & Waterman (1974) computed 4,879
digits, Brent (1977) computed 20,700 digits, Brent & McMillan (1980)
computed 30,100 digits. At this time, it seems that more than $10^8$ digits
have been computed; see the web page of X. Gourdon & P. Sebah at
http://numbers.computation.free.fr/Constants/Gamma/gamma.html. To 50
places, Euler’s constant is

C0 = 0.57721 56649 01532 86060 65120 90082 40243 10421 59335 93992.

Statistical analysis of the continued fraction coefficients of $C_0$ suggests that it
satisfies the Gauss–Kusmin law, which is to say that $C_0$ seems to be a typical
irrational number.
Landau & Walﬁsz (1920) showed that the functions F(s) and G(s) of Ex-
ercise 1.3.7 have the imaginary axis σ = 0 as a natural boundary. For further

work on Dirichlet series with natural boundaries see Estermann (1928a,b) and
Kurokawa (1987).

1.5 References
Andrews, G. E. (1976). The Theory of Partitions, Reprint. Cambridge: Cambridge Uni-
versity Press (1998).
Bachmann, P. (1894). Zahlentheorie, II, Die analytische Zahlentheorie, Leipzig:
Teubner.
Beyer, W. A. & Waterman, M. S. (1974). Error analysis of a computation of Euler’s
constant and ln 2, Math. Comp. 28, 599–604.
Bohr, H. (1910). Bidrag til de Dirichlet’ske Rækkers theori, København: G. E. C. Gad;
Collected Mathematical Works, Vol. I, København: Danske Mat. Forening, 1952.
A3.
Bohr, H. & Cramér, H. (1923). Die neuere Entwicklung der analytischen Zahlentheo-
rie, Enzyklopädie der Mathematischen Wissenschaften, 2, C8, 722–849; H. Bohr,
Collected Mathematical Works, Vol. III, København: Dansk Mat. Forening, 1952,
H; H. Cramér, Collected Works, Vol. 1, Berlin: Springer-Verlag, 1952, pp. 289–
416.
Brent, R. P. (1977). Computation of the regular continued fraction of Euler’s constant,
Math. Comp. 31, 771–777.
Brent, R. P. & McMillan, E. M. (1980). Some new algorithms for high-speed computation
of Euler’s constant, Math. Comp. 34, 305–312.
Cahen, E. (1894). Sur la fonction ζ (s) de Riemann et sur des fonctions analogues, Ann.
de l’École Normale (3) 11, 75–164.
Davenport, H. (2005). Analytic Methods for Diophantine Equations and Diophantine
Inequalities. Second edition, Cambridge: Cambridge University Press.
Deléglise, M. & Rivat, J. (1996). Computing π (x): the Meissel, Lehmer, Lagarias, Miller,
Odlyzko method, Math. Comp. 65, 235–245.
Estermann, T. (1928a). On certain functions represented by Dirichlet series, Proc. Lon-
don Math. Soc. (2) 27, 435–448.
(1928b). On a problem of analytic continuation, Proc. London Math. Soc. (2) 27,
471–482.
Euler, L. (1735). De Progressionibus harmonicus observationes, Comm. Acad. Sci. Imper.
Petropol. 7, 157; Opera Omnia, ser. 1, vol. 14, Teubner, 1914, pp. 93–95.
Hardy, G. H. (1910). Orders of Inﬁnity. Cambridge Tract 12, Cambridge: Cambridge
University Press.
Hardy, G. H. & Littlewood, J. E. (1914). Some problems of Diophantine approximation
(II), Acta Math. 37, 193–238; Collected Papers, Vol I. Oxford: Oxford University
Press. 1966, pp. 67–112.
Hardy, G. H. & Riesz, M. (1915). The General Theory of Dirichlet’s Series, Cambridge
Tract No. 18. Cambridge: Cambridge University Press. Reprint: Stechert–Hafner
(1964).
Ingham, A. E. (1932). The Distribution of Prime Numbers, Cambridge Tract 30. Cam-
bridge: Cambridge University Press.

Jensen, J. L. W. V. (1884). Om Rækkers Konvergens, Tidsskrift for Math. (5) 2, 63–72.

(1887). Sur la fonction ζ (s) de Riemann, Comptes Rendus Acad. Sci. Paris 104,
1156–1159.
Knuth, D. E. (1962). Euler’s constant to 1271 places, Math. Comp. 16, 275–281.
Kurokawa, N. (1987). On certain Euler products, Acta Arith. 48, 49–52.
Lagarias, J. C., Miller, V. S., & Odlyzko, A. M. (1985). Computing π(x): The Meissel–
Lehmer method, Math. Comp. 44, 537–560.
Lagarias, J. C. & Odlyzko, A. M. (1987). Computing π(x): An analytic method, J.
Algorithms 8, 173–191.
Landau, E. (1905). Über einen Satz von Tschebyschef, Math. Ann. 61, 527–550;
Collected Works, Vol. 2, Essen: Thales, 1986, pp. 206–229.
(1909a). Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig: Teubner.
Reprint: Chelsea (1953).
(1909b). Über das Konvergenzproblem der Dirichlet’schen Reihen, Rend. Circ. Mat.
Palermo 28, 113–151; Collected Works, Vol. 4, Essen: Thales, 1986, pp. 181–220.
Landau, E. & Walﬁsz, A. (1920). Über die Nichtfortsetzbarkeit einiger durch Dirich-
letsche Reihen deﬁnierte Funktionen, Rend. Circ. Mat. Palermo 44, 82–86;
Collected Works, Vol. 7, Essen: Thales, 1986, pp. 252–256.
Lehmer, D. H. (1959). On the exact number of primes less than a given limit, Illinois J.
Math. 3, 381–388.
Mapes, D. C. (1963). Fast method for computing the number of primes less than a given
limit, Math. Comp. 17, 179–185.
Mascheroni, L. (1790). Abnotationes ad calculum integrale Euleri, Vol. 1. Ticino:
Galeatii. Reprinted in the Opera Omnia of L. Euler, Ser. 1, Vol 12, Teubner, 1914,
pp. 415–542.
Mordell, L. J. (1958). On the Kusmin–Landau inequality for exponential sums, Acta
Arith. 4, 3–9.
Nielsen, N. (1906). Handbuch der Theorie der Gammafunktion. Leipzig: Teubner.
Pringsheim, A. (1894). Über Functionen, welche in gewissen Punkten endliche Differen-
tialquotienten jeder endlichen Ordnung, aber kein Taylorsche Reihenentwickelung
besitzen, Math. Ann. 44, 41–56.
Riesel, H. (1994). Prime Numbers and Computer Methods for Factorization, Second
ed., Progress in Math. 126. Boston: Birkhäuser.
Shafer, R. E. (1984). Advanced problem 6456, Amer. Math. Monthly 91, 205.
Stieltjes, T. J. (1885). Letter 75 in Correspondance d’Hermite et de Stieltjes, B. Baillaud
& H. Bourget, eds., Paris: Gauthier-Villars, 1905.
(1887). Note sur la multiplication de deux séries, Nouvelles Annales (3) 6, 210–215.
Sweeney, D. W. (1963). On the computation of Euler’s constant, Math. Comp. 17, 170–
178.
Vaughan, R. C. (1997). The Hardy–Littlewood Method, Second edition, Cambridge Tract
125. Cambridge: Cambridge University Press.
Vivanti, G. (1893). Sulle serie di potenze, Rivista di Mat. 3, 111–114.
Wagon, S. (1987). Fourteen proofs of a result about tiling a rectangle, Amer. Math.
Monthly 94, 601–617.
Widder, D. V. (1971). An Introduction to Transform Theory. New York: Academic Press.
Wilf, H. (1994). Generatingfunctionology, Second edition. Boston: Academic Press.
Wrench, W. R. Jr (1952). A new calculation of Euler’s constant, MTAC 6, 255.
2
The elementary theory of arithmetic functions

2.1 Mean values

We say that an arithmetic function $F(n)$ has a mean value $c$ if
\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} F(n) = c. \]

In this section we develop a simple method by which mean values can be shown
to exist in many interesting cases.
If two arithmetic functions $f$ and $F$ are related by the identity
\[ F(n) = \sum_{d|n} f(d), \tag{2.1} \]
then we can write $f$ in terms of $F$:
\[ f(n) = \sum_{d|n} \mu(d) F(n/d). \tag{2.2} \]
This is the Möbius inversion formula. Conversely, if (2.2) holds for all $n$ then
so also does (2.1). If $f$ is generally small then $F$ has an asymptotic mean value.
To see this, observe that
\[ \sum_{n\le x} F(n) = \sum_{n\le x} \sum_{d|n} f(d). \]
By iterating the sums in the reverse order, we see that the above is
\[ = \sum_{d\le x} f(d) \sum_{\substack{n\le x \\ d|n}} 1 = \sum_{d\le x} f(d)[x/d]. \]


Since $[y] = y + O(1)$, this is
\[ = x \sum_{d\le x} \frac{f(d)}{d} + O\Big(\sum_{d\le x} |f(d)|\Big). \tag{2.3} \]
Thus $F$ has the mean value $\sum_{d=1}^{\infty} f(d)/d$ if this series converges and if
$\sum_{d\le x} |f(d)| = o(x)$. This approach, though somewhat crude, often yields
useful results.
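As an illustration only (the helper names below are hypothetical and not part of the text), the interchange of summation that leads to (2.3) can be checked numerically. The sketch compares the double sum over $n \le x$ and $d \mid n$ with the single sum $\sum_{d\le x} f(d)[x/d]$ for two simple choices of $f$.

```python
def summatory_direct(f, x):
    """Sum of F(n) for n <= x, where F(n) is the sum of f(d) over divisors d of n."""
    return sum(f(d) for n in range(1, x + 1)
               for d in range(1, n + 1) if n % d == 0)


def summatory_swapped(f, x):
    """The same quantity after interchanging the sums: sum of f(d) * [x/d] for d <= x."""
    return sum(f(d) * (x // d) for d in range(1, x + 1))


if __name__ == "__main__":
    x = 300
    # f = 1 gives F = d (the divisor function); f(d) = d gives F = sigma.
    for label, f in [("f(d) = 1", lambda d: 1), ("f(d) = d", lambda d: d)]:
        print(label, summatory_direct(f, x), summatory_swapped(f, x))
```

The two functions agree exactly, since only the order of summation differs; the estimate (2.3) then comes from replacing $[x/d]$ by $x/d + O(1)$.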

Theorem 2.1 Let $\varphi(n)$ be Euler’s totient function. Then for $x \ge 2$,
\[ \sum_{n\le x} \frac{\varphi(n)}{n} = \frac{6}{\pi^2}\,x + O(\log x). \]

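Proof We recall that $\varphi(n) = n \prod_{p|n} (1 - 1/p)$. On multiplying out the product,
we see that
\[ \frac{\varphi(n)}{n} = \sum_{d|n} \frac{\mu(d)}{d} . \]
On taking $f(d) = \mu(d)/d$ in (2.3), it follows that
\[ \sum_{n\le x} \frac{\varphi(n)}{n} = x \sum_{d\le x} \frac{\mu(d)}{d^2} + O(\log x). \]
Since $\sum_{d>x} d^{-2} \ll x^{-1}$, we see that
\[ \sum_{d\le x} \frac{\mu(d)}{d^2} = \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} + O\Big(\frac{1}{x}\Big) = \frac{1}{\zeta(2)} + O\Big(\frac{1}{x}\Big) \]
by Corollary 1.10. From Corollary B.3 we know that $\zeta(2) = \pi^2/6$; hence the
proof is complete.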
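A small numerical check of Theorem 2.1 may help fix ideas. The following is a minimal sketch with hypothetical function names; it sieves the totients and reports the difference between $\sum_{n\le x} \varphi(n)/n$ and the main term $(6/\pi^2)x$, which the theorem bounds by a multiple of $\log x$.

```python
import math


def totients(limit):
    """Return a list phi with phi[n] equal to Euler's totient of n for 1 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # still untouched, hence p is prime
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p  # multiply phi[m] by (1 - 1/p)
    return phi


def totient_ratio_error(x):
    """Sum of phi(n)/n for n <= x, minus the main term (6/pi^2) x."""
    phi = totients(x)
    total = math.fsum(phi[n] / n for n in range(1, x + 1))
    return total - 6 * x / math.pi ** 2


if __name__ == "__main__":
    for x in (10, 1000, 100000):
        print(x, totient_ratio_error(x), math.log(x))
```

Tracing the constants in the proof above gives the explicit bound $|{\rm error}| < 2 + \log x$ for integer $x \ge 1$, which is what the printed values can be compared against.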

Let $Q(x)$ denote the number of square-free integers not exceeding $x$,
$Q(x) = \sum_{n\le x} \mu(n)^2$. We now calculate the asymptotic density of these numbers.

Theorem 2.2 For all $x \ge 1$,
\[ Q(x) = \frac{6}{\pi^2}\,x + O\big(x^{1/2}\big). \]
Proof Every positive integer $n$ is uniquely of the form $n = ab^2$ where $a$ is
square-free. Thus $n$ is square-free if and only if $b = 1$, so that by (1.20)
\[ \sum_{d^2|n} \mu(d) = \sum_{d|b} \mu(d) = \mu(n)^2 . \tag{2.4} \]

This is a relation of the shape (2.1) where $f(d) = \mu(\sqrt{d})$ if $d$ is a perfect square,
and $f(d) = 0$ otherwise. Hence by (2.3),
\[ Q(x) = x \sum_{d^2\le x} \frac{\mu(d)}{d^2} + O\Big(\sum_{d^2\le x} 1\Big). \]
The error term is $\ll x^{1/2}$, and the sum in the main term is treated as in the
preceding proof.
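The identity (2.4) also gives a practical way to compute $Q(x)$ exactly: summing it over $n \le x$ yields $Q(x) = \sum_{d\le\sqrt{x}} \mu(d)[x/d^2]$. The sketch below is an illustration under that reading, with hypothetical helper names, and compares the formula with a direct count.

```python
import math


def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Moebius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(2 * p, limit + 1, p):
                is_prime[m] = False
            for m in range(p, limit + 1, p):
                mu[m] = -mu[m]  # one more prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0  # divisible by the square of a prime
    return mu


def squarefree_count(x):
    """Q(x) as the sum over d <= sqrt(x) of mu(d) * [x / d^2]."""
    root = math.isqrt(x)
    mu = mobius_sieve(root)
    return sum(mu[d] * (x // (d * d)) for d in range(1, root + 1))


def squarefree_count_direct(x):
    """Q(x) by testing each n <= x for a square divisor d^2 > 1."""
    return sum(1 for n in range(1, x + 1)
               if all(n % (d * d) for d in range(2, math.isqrt(n) + 1)))


if __name__ == "__main__":
    x = 10 ** 6
    print(squarefree_count(x), 6 * x / math.pi ** 2)
```

Only about $\sqrt{x}$ terms are needed, which mirrors the size of the error term in Theorem 2.2.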

We note that the argument above is routine once the appropriate identity
(2.4) is established. This relation can be discovered by considering (2.2), or by
using Dirichlet series: Let $\mathcal{Q}$ denote the class of square-free numbers. Then for
$\sigma > 1$,
\[ \sum_{n\in\mathcal{Q}} n^{-s} = \prod_p \big(1 + p^{-s}\big) = \prod_p \frac{1 - p^{-2s}}{1 - p^{-s}} = \frac{\zeta(s)}{\zeta(2s)} . \]
Now $1/\zeta(2s)$ can be written as a Dirichlet series in $s$, with coefficients $f(n) = \mu(d)$
if $n = d^2$, $f(n) = 0$ otherwise. Hence the convolution equation (2.4) gives
the coefficients of the product Dirichlet series $\zeta(s) \cdot 1/\zeta(2s)$.
Suppose that $a_k$, $b_m$, $c_n$ are joined by the convolution relation
\[ c_n = \sum_{km=n} a_k b_m , \tag{2.5} \]
and that $A(x)$, $B(x)$, $C(x)$ are their respective summatory functions. Then
\[ C(x) = \sum_{km\le x} a_k b_m , \tag{2.6} \]
and it is useful to note that this double sum can be iterated in various ways. On
one hand we see that
\[ C(x) = \sum_{k\le x} a_k B(x/k); \tag{2.7} \]
this is the line of reasoning that led to (2.3) (take $a_k = f(k)$, $b_m = 1$). At the
opposite extreme,
\[ C(x) = \sum_{m\le x} b_m A(x/m), \tag{2.8} \]
and between these we have the more general identity
\[ C(x) = \sum_{k\le y} a_k B(x/k) + \sum_{m\le x/y} b_m A(x/m) - A(y)B(x/y) \tag{2.9} \]
for $0 < y \le x$. This is obvious once it is observed that the first term on the right
sums those terms $a_k b_m$ for which $km \le x$, $k \le y$, the second sum includes the

pairs $(k, m)$ for which $km \le x$, $m \le x/y$, and the third term subtracts those $a_k b_m$
for which $k \le y$, $m \le x/y$, since these $(k, m)$ were included in both the previous
terms. The advantage of (2.9) over (2.7) is that the number of terms is reduced
($\ll y + x/y$ instead of $\ll x$), and at the same time $A$ and $B$ are evaluated only
at large values of the argument, so that asymptotic formulæ for these quantities
may be expected to be more accurate. For example, if we wish to estimate the
average size of $d(n)$ we take $a_k = b_m = 1$, and then from (2.3) we see that
\[ \sum_{n\le x} d(n) = x \log x + O(x). \]

To obtain a more accurate estimate we observe that the first term on the
right-hand side of (2.9) is
\[ \sum_{k\le y} [x/k] = x \sum_{k\le y} 1/k + O(y). \]
By Corollary 1.15 this is
\[ x \log y + C_0 x + O(x/y + y). \]
Here the error term is minimized by taking $y = x^{1/2}$. The second term
on the right in (2.9) is then identical to the first, and the third term is
$[x^{1/2}]^2 = x + O(x^{1/2})$, and we have

Theorem 2.3 For $x \ge 2$,
\[ \sum_{n\le x} d(n) = x \log x + (2C_0 - 1)x + O\big(x^{1/2}\big). \]
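The hyperbola identity (2.9) with $a_k = b_m = 1$ and $y = x^{1/2}$ can be tried out directly. The following sketch (hypothetical names; the numerical value of $C_0$ is hard-coded to double precision) computes $\sum_{n\le x} d(n)$ both from (2.7) and from (2.9), and measures the deviation from the main term of Theorem 2.3.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant C0, rounded to double precision


def divisor_sum_direct(x):
    """Sum of d(n) for n <= x via (2.7): the sum of [x/k] over all k <= x."""
    return sum(x // k for k in range(1, x + 1))


def divisor_sum_hyperbola(x):
    """The same sum via (2.9) with y = sqrt(x): 2 * sum_{k <= sqrt(x)} [x/k] - [sqrt(x)]^2."""
    root = math.isqrt(x)
    return 2 * sum(x // k for k in range(1, root + 1)) - root * root


def divisor_sum_error(x):
    """Sum of d(n) for n <= x, minus the main term x log x + (2 C0 - 1) x."""
    return divisor_sum_hyperbola(x) - (x * math.log(x) + (2 * EULER_C0 - 1) * x)


if __name__ == "__main__":
    x = 10 ** 6
    print(divisor_sum_hyperbola(x), divisor_sum_error(x), math.sqrt(x))
```

The hyperbola form uses about $\sqrt{x}$ terms instead of $x$, which is the reduction in the number of terms described before the theorem.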

We often construct estimates with one or more parameters, and then choose
values of the parameters to optimize the result. The instance above is typical:
we minimized $x/y + y$ by taking $y = x^{1/2}$. Suppose, more generally, that we
wish to minimize $T_1(y) + T_2(y)$ where $T_1$ is a decreasing function, and $T_2$ is
an increasing function. We could differentiate and solve for a root of
$T_1'(y) + T_2'(y) = 0$, but there is a quicker method: Find $y_0$ so that $T_1(y_0) = T_2(y_0)$. This
does not necessarily yield the exact minimum value of $T_1(y) + T_2(y)$, but it is
easy to see that
\[ T_1(y_0) \le \min_y \big(T_1(y) + T_2(y)\big) \le 2T_1(y_0), \]
so the bound obtained in this way is at most twice the optimal bound.
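To see the balancing principle on an example where it does not give the exact minimum, one may take $T_1(y) = x/y^2$ and $T_2(y) = y$; this choice is an assumption made purely for illustration. The sketch below finds $y_0$ by bisection and compares $T_1(y_0)$ and $2T_1(y_0)$ with a grid approximation to the true minimum.

```python
def balance_point(t1, t2, lo, hi, iterations=100):
    """Bisect for y0 in [lo, hi] with t1(y0) = t2(y0).

    Assumes t1 is decreasing, t2 is increasing, t1(lo) >= t2(lo) and t1(hi) <= t2(hi).
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if t1(mid) > t2(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def grid_minimum(t1, t2, lo, hi, steps=100_000):
    """Approximate the minimum of t1(y) + t2(y) over [lo, hi] on a uniform grid."""
    points = (lo + (hi - lo) * j / steps for j in range(steps + 1))
    return min(t1(y) + t2(y) for y in points)


if __name__ == "__main__":
    x = 1000.0
    t1 = lambda y: x / y ** 2
    t2 = lambda y: y
    y0 = balance_point(t1, t2, 1.0, 100.0)
    print(y0, t1(y0), grid_minimum(t1, t2, 1.0, 100.0), 2 * t1(y0))
```

Here $y_0 = x^{1/3}$ and the balanced bound is $2x^{1/3}$, while the true minimum is $(2^{-2/3} + 2^{1/3})x^{1/3} \approx 1.89\,x^{1/3}$, consistent with the displayed inequality.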
Despite the great power of analytic techniques, the ‘method of the hyperbola’
used above is a valuable tool. The sequence $c_n$ given by (2.5) is called the
Dirichlet convolution of $a_k$ and $b_m$; in symbols, $c = a * b$. Arithmetic functions
form a ring when equipped with pointwise addition, $(a + b)_n = a_n + b_n$, and

Dirichlet convolution for multiplication. This ring is called the ring of formal
Dirichlet series. Manipulations of arithmetic functions in this way correspond
to manipulations of Dirichlet series without regard to convergence. This is
analogous to the ring of formal power series, in which multiplication is provided
by Cauchy convolution, $c_n = \sum_{k+m=n} a_k b_m$.
In the ring of formal Dirichlet series we let $O$ denote the arithmetic function
that is identically 0; this is the additive identity. The multiplicative identity is $i$
where $i_1 = 1$, $i_n = 0$ for $n > 1$. The arithmetic function that is identically 1 we
denote by $\mathbf{1}$, and we similarly abbreviate $\mu(n)$, $\Lambda(n)$, and $\log n$ by $\mu$, $\Lambda$, and
$L$. In this notation, the characteristic property of $\mu(n)$ is that $\mu * \mathbf{1} = i$, which
is to say that $\mu$ and $\mathbf{1}$ are convolution inverses of each other, and the Möbius
inversion formula takes the compact form
\[ a * \mathbf{1} = b \iff a = b * \mu . \]
In the elementary study of prime numbers the relations $\Lambda * \mathbf{1} = L$, $L * \mu = \Lambda$
are fundamental.
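The ring operations just described are easy to experiment with on truncated sequences. The sketch below is illustrative only: the list representation (index 0 unused) and the function names are assumptions, not notation from the text. It checks $\mu * \mathbf{1} = i$ and recovers $\Lambda$ from $L * \mu$.

```python
import math


def dirichlet_convolution(a, b):
    """Return c with c[n] equal to the sum over km = n of a[k] * b[m]; index 0 is unused."""
    limit = min(len(a), len(b)) - 1
    c = [0] * (limit + 1)
    for k in range(1, limit + 1):
        for m in range(1, limit // k + 1):
            c[k * m] += a[k] * b[m]
    return c


def mobius_table(limit):
    """mu[n] for 1 <= n <= limit, built from the defining relation mu * 1 = i."""
    mu = [0] * (limit + 1)
    if limit >= 1:
        mu[1] = 1
    for d in range(1, limit + 1):
        for n in range(2 * d, limit + 1, d):
            mu[n] -= mu[d]  # forces the sum of mu(d) over d | n to vanish for n > 1
    return mu


if __name__ == "__main__":
    N = 30
    one = [0] + [1] * N
    log_seq = [0.0] + [math.log(n) for n in range(1, N + 1)]
    mu = mobius_table(N)
    print(dirichlet_convolution(mu, one)[1:])  # the identity i
    von_mangoldt = dirichlet_convolution(log_seq, mu)  # L * mu
    print([round(v, 4) for v in von_mangoldt[1:11]])
```

Truncating at $N$ is harmless because $c_n$ depends only on $a_k$, $b_m$ with $k, m \le n$.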

2.1.1 Exercises
1. (de la Vallée Poussin 1898; cf. Landau 1911) Show that
\[ \sum_{n\le x} \{x/n\} = (1 - C_0)x + O\big(x^{1/2}\big) \]
where $C_0$ is Euler’s constant, and $\{u\} = u - [u]$ is the fractional part of $u$.

2. (Duncan 1965; cf. Rogers 1964, Orr 1969) Let $Q(x)$ be defined as in Theorem 2.2.

(a) Show that $Q(N) \ge N - \sum_p [N/p^2]$ for every positive integer $N$.
(b) Justify the relations
\[ \sum_p \frac{1}{p^2} < \frac{1}{4} + \sum_{k=1}^{\infty} \frac{1}{(2k + 1)^2} < \frac{1}{4} + \frac{1}{2}\sum_{k=1}^{\infty} \Big(\frac{1}{2k} - \frac{1}{2k + 2}\Big) = 1/2. \]
(c) Show that $Q(N) > N/2$ for all positive integers $N$.

(d) Show that every positive integer $n > 1$ can be written as a sum of two
square-free numbers.
3. (Linfoot & Evelyn 1929) Let $\mathcal{Q}_k$ denote the set of positive $k$th power free
integers (i.e., $q \in \mathcal{Q}_k$ if and only if $m^k \mid q \Rightarrow m = 1$).
(a) Show that
\[ \sum_{n\in\mathcal{Q}_k} n^{-s} = \frac{\zeta(s)}{\zeta(ks)} \]
for $\sigma > 1$.

(b) Show that for any fixed integer $k > 1$
\[ \sum_{\substack{n\le x \\ n\in\mathcal{Q}_k}} 1 = \frac{x}{\zeta(k)} + O\big(x^{1/k}\big) \]
for $x \ge 1$.
4. (cf. Evelyn & Linfoot 1930) Let $N$ be a positive integer, and suppose that
$P$ is square-free.
(a) Show that the number of residue classes $n \pmod{P^2}$ for which $(n, P^2)$
is square-free and $(N - n, P^2)$ is square-free is
\[ P^2 \prod_{\substack{p|P \\ p^2|N}} \Big(1 - \frac{1}{p^2}\Big) \prod_{\substack{p|P \\ p^2\nmid N}} \Big(1 - \frac{2}{p^2}\Big). \]
(b) Show that the number of integers $n$, $0 < n < N$, for which $(n, P^2)$ is
square-free and $(N - n, P^2)$ is square-free is
\[ N \prod_{\substack{p|P \\ p^2|N}} \Big(1 - \frac{1}{p^2}\Big) \prod_{\substack{p|P \\ p^2\nmid N}} \Big(1 - \frac{2}{p^2}\Big) + O(P^2). \]
(c) Show that the number of $n$, $0 < n < N$, such that $n$ is divisible by the
square of a prime $> y$ is $\ll N/y$.
(d) Take $P$ to be the product of all primes not exceeding $y$. By letting $y$
tend to infinity slowly, show that the number of ways of writing $N$ as
a sum of two square-free integers is $\sim c(N)N$ where
\[ c(N) = a \prod_{p^2|N} \Big(1 + \frac{1}{p^2 - 2}\Big), \qquad a = \prod_p \Big(1 - \frac{2}{p^2}\Big). \]

5. (cf. Hille 1937) Suppose that $f(x)$ and $F(x)$ are complex-valued functions
defined on $[1, \infty)$. Show that
\[ F(x) = \sum_{n\le x} f(x/n) \]
for all $x$ if and only if
\[ f(x) = \sum_{n\le x} \mu(n) F(x/n) \]

for all n if and only if

\[ f(n) = \sum_{\substack{m \\ n|m}} \mu(m/n) F(m). \]

7. (Jarník 1926; cf. Bombieri & Pila 1989) Let $C$ be a simple closed curve in
the plane, of arc length $L$. Show that the number of ‘lattice points’ $(m, n)$,
$m, n \in \mathbb{Z}$, lying on $C$ is at most $L + 1$. Show that if $C$ is strictly convex
then the number of lattice points on $C$ is $\ll 1 + L^{2/3}$, and that this estimate
is best possible.
8. Let $C$ be a simple closed curve in the plane, of arc length $L$ that encloses
a region of area $A$. Let $N$ be the number of lattice points inside $C$. Show
that $|N - A| \le 3(L + 1)$.
9. Let $r(n)$ be the number of pairs $(j, k)$ of integers such that $j^2 + k^2 = n$.
Show that
\[ \sum_{n\le x} r(n) = \pi x + O\big(x^{1/2}\big). \]
10. (Stieltjes 1887) Suppose that $\sum a_n$, $\sum b_n$ are convergent series, and that
$c_n = \sum_{km=n} a_k b_m$. Show that $\sum c_n n^{-1/2}$ converges. (Hence if two Dirichlet
series have abscissa of convergence $\le \sigma$ then the product series $\gamma(s) =
\alpha(s)\beta(s)$ has abscissa of convergence $\sigma_c \le \sigma + 1/2$.)

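11. (a) Show that $\sum_{n\le x} \varphi(n) = (3/\pi^2)x^2 + O(x \log x)$ for $x \ge 2$.
(b) Show that
\[ \sum_{\substack{m\le x,\ n\le x \\ (m,n)=1}} 1 = -1 + 2 \sum_{n\le x} \varphi(n) \]
for $x \ge 1$. Deduce that the expression above is $(6/\pi^2)x^2 + O(x \log x)$.

12. Let $\sigma(n) = \sum_{d|n} d$. Show that
\[ \sum_{n\le x} \sigma(n) = \frac{\pi^2}{12}\,x^2 + O(x \log x) \]
for $x \ge 2$.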
13. (Landau 1900, 1936; cf. Sitaramachandrarao 1982, 1985, Nowak 1989)

(a) Show that $n/\varphi(n) = \sum_{d|n} \mu(d)^2/\varphi(d)$.
(b) Show that
\[ \sum_{n\le x} \frac{n}{\varphi(n)} = \frac{\zeta(2)\zeta(3)}{\zeta(6)}\,x + O(\log x) \]
for $x \ge 2$.

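(c) Show that
\[ \sum_{d=1}^{\infty} \frac{\mu(d)^2 \log d}{d\varphi(d)} = \Big(\sum_p \frac{\log p}{p^2 - p + 1}\Big) \prod_p \Big(1 + \frac{1}{p(p - 1)}\Big). \]
(d) Show that for $x \ge 2$,
\[ \sum_{n\le x} \frac{1}{\varphi(n)} = \frac{\zeta(2)\zeta(3)}{\zeta(6)} \Big(\log x + C_0 - \sum_p \frac{\log p}{p^2 - p + 1}\Big) + O((\log x)/x). \]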

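14. Let $\kappa$ be a fixed real number. Show that
\[ \sum_{n\le x} \Big(\frac{\varphi(n)}{n}\Big)^{\kappa} = c(\kappa)x + O(x^{\varepsilon}) \]
where
\[ c(\kappa) = \prod_p \Big(1 - \frac{1}{p}\big(1 - (1 - 1/p)^{\kappa}\big)\Big). \]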
15. (cf. Grosswald 1956, Bateman 1957)
(a) By using Euler products, or otherwise, show that
\[ 2^{\omega(n)} = \sum_{d^2 m=n} \mu(d)\,d(m). \]
(b) Deduce that
\[ \sum_{n\le x} 2^{\omega(n)} = \frac{6}{\pi^2}\,x \log x + cx + O\big(x^{1/2} \log x\big) \]
for $x \ge 2$ where $c = 2C_0 - 1 - 2\zeta'(2)/\zeta(2)^2$.

where
\[ C = \frac{1}{8 \log 2} \prod_{p>2} \Big(1 + \frac{1}{p(p - 2)}\Big). \]

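16. (a) Show that for any positive integer $q$,
\[ \sum_{d|q} \frac{\mu(d) \log d}{d} = -\frac{\varphi(q)}{q} \sum_{p|q} \frac{\log p}{p - 1} . \]
(b) Show that for any real number $x \ge 1$ and any positive integer $q$,
\[ \sum_{\substack{m\le x \\ (m,q)=1}} \frac{1}{m} = \Big(\log x + C_0 + \sum_{p|q} \frac{\log p}{p - 1}\Big) \frac{\varphi(q)}{q} + O\big(2^{\omega(q)}/x\big). \]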

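(c) Show that for any real number $x \ge 2$ and any positive integer $q$,
\[ \sum_{\substack{n\le x \\ (n,q)=1}} \frac{1}{\varphi(n)} = \frac{\zeta(2)\zeta(3)}{\zeta(6)} \prod_{p|q} \Big(1 - \frac{p}{p^2 - p + 1}\Big) \Big(\log x + C_0 + \sum_{p|q} \frac{\log p}{p - 1} - \sum_{p\nmid q} \frac{\log p}{p^2 - p + 1}\Big) + O\Big(2^{\omega(q)}\,\frac{\log x}{x}\Big). \]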

17. (cf. Ward 1927) Show that for $x \ge 2$,
\[ \sum_{n\le x} \frac{\mu(n)^2}{\varphi(n)} = \log x + C_0 + \sum_p \frac{\log p}{p(p - 1)} + O\big(x^{-1/2} \log x\big). \]

18. Let $d_k(n)$ be the number of ordered $k$-tuples $(d_1, \ldots, d_k)$ of positive integers
such that $d_1 d_2 \cdots d_k = n$.

(a) Show that $d_k(n) = \sum_{d|n} d_{k-1}(d)$.
(b) Show that $\sum_{n=1}^{\infty} d_k(n) n^{-s} = \zeta(s)^k$ for $\sigma > 1$.
(c) Show that for every fixed positive integer $k$,
\[ \sum_{n\le x} d_k(n) = x P_k(\log x) + O\big(x^{1-1/k} (\log x)^{k-2}\big) \]
for $x \ge 2$, where $P_k \in \mathbb{R}[z]$ has degree $k - 1$ and leading coefficient
$1/(k - 1)!$.
19. (cf. Erdős & Szekeres 1934, Schmidt 1967/68) Let $A_n$ denote the number
of non-isomorphic Abelian groups of order $n$.

(a) Show that $\sum_{n=1}^{\infty} A_n n^{-s} = \prod_{k=1}^{\infty} \zeta(ks)$ for $\sigma > 1$.
(b) Show that
\[ \sum_{n\le x} A_n = cx + O\big(x^{1/2}\big) \]
where $c = \prod_{k=2}^{\infty} \zeta(k)$.

20. (Wintner 1944, p. 46) Suppose that $\sum_d |g(d)|/d < \infty$. Show that
$\sum_{d\le x} |g(d)| = o(x)$. Suppose also that $\sum_{n\le x} f(n) = cx + o(x)$, and put
$h(n) = \sum_{d|n} f(d)g(n/d)$. Show that
\[ \sum_{n\le x} h(n) = cgx + o(x) \]
where $g = \sum_d g(d)/d$.
21. (a) Show that if $a^2$ is the largest perfect square $\le x$ then $x - a^2 \le 2\sqrt{x}$.
(b) Let $a^2$ be as above, and let $b^2$ be the least perfect square such that $a^2 +
b^2 > x$. Show that $a^2 + b^2 < x + 6x^{1/4}$. Thus for any $x \ge 1$, there is
a sum of two squares in the interval $(x, x + 6x^{1/4})$. (It is somewhat

embarrassing that this is the best-known upper bound for gaps between
sums of two squares.)
22. (Feller & Tornier 1932) Let $f(n)$ denote the multiplicative function such
that $f(p) = 1$ for all $p$, and $f(p^k) = -1$ whenever $k > 1$.
(a) Show that
\[ \sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \zeta(s) \prod_p \Big(1 - \frac{2}{p^{2s}}\Big) \]
for $\sigma > 1$.
(b) Deduce that
\[ f(n) = \sum_{d^2|n} \mu(d)\,2^{\omega(d)} . \]
(c) Explain why $2^{\omega(n)} \le d(n)$ for all $n$.

(d) Show that
\[ \sum_{n\le x} f(n) = ax + O\big(x^{1/2} \log x\big) \]
where $a$ is the constant of Exercise 4.

(e) Let $g(n)$ denote the number of primes $p$ such that $p^2 \mid n$. Show
that the set of $n$ for which $g(n)$ is even has asymptotic density
$(1 + a)/2$.
(f) Put
\[ e_k = \frac{1}{k} \sum_{d|k} \mu(d)\,2^{k/d} . \]
Show that if $|z| < 1/2$, then
\[ \log(1 - 2z) = \sum_{k=1}^{\infty} e_k \log\big(1 - z^k\big). \]
(g) Deduce that
\[ a = \prod_{k=1}^{\infty} \zeta(2k)^{-e_k} . \]
Note that the $k$th factor here differs from 1 by an amount that is
$\ll 1/(k2^k)$. Hence the product converges very rapidly. Since $\zeta(2k)$
can be calculated very accurately by the Euler–Maclaurin formula (see
Appendix B), the formula above permits the rapid calculation of the
constant $a$.

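23. Let $B_1(x) = x - 1/2$, as in Appendix B.

(a) Show that
\[ \sum_{n\le x} \frac{1}{n} = \log x + C_0 - B_1(\{x\})/x + O(1/x^2). \]
(b) Write $\sum_{n\le x} d(n) = x \log x + (2C_0 - 1)x + \Delta(x)$. Show that
\[ \Delta(x) = -2 \sum_{n\le\sqrt{x}} B_1(\{x/n\}) + O(1). \]
(c) Show that $\int_0^X \Delta(x)\,dx \ll X$.
(d) Deduce that
\[ \sum_{n\le X} d(n)(X - n) = \int_0^X \Big(\sum_{n\le x} d(n)\Big)\,dx = \frac{1}{2} X^2 \log X + \Big(C_0 - \frac{3}{4}\Big) X^2 + O(X). \]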
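24. Let $r(n)$ be the number of ordered pairs $(a, b)$ of integers for which $a^2 +
b^2 = n$.
(a) Show that
\[ \sum_{n\le x} r(n) = 1 + 4[\sqrt{x}] + 8 \sum_{1\le n\le\sqrt{x/2}} \big[\sqrt{x - n^2}\,\big] - 4\big[\sqrt{x/2}\,\big]^2 . \]
(b) Show that
\[ \sum_{1\le n\le\sqrt{x/2}} \sqrt{x - n^2} = \Big(\frac{\pi}{8} + \frac{1}{4}\Big) x - B_1\big(\{\sqrt{x/2}\}\big)\sqrt{x/2} - \frac{1}{2}\sqrt{x} + O(1). \]
(c) Write $\sum_{0\le n\le x} r(n) = \pi x + R(x)$. Show that
\[ R(x) = -8 \sum_{1\le n\le\sqrt{x/2}} B_1\big(\{\sqrt{x - n^2}\}\big) + O(1). \]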

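25. (a) Show that if $(a, q) = 1$, and $\beta$ is real, then
\[ \sum_{n=1}^{q} B_1\Big(\Big\{\frac{a}{q}\,n + \beta\Big\}\Big) = B_1(\{q\beta\}). \]
(b) Show that if $A \ge 1$, $|f'(x) - a/q| \le A/q^2$ for $1 \le x \le q$, and $(a, q) = 1$, then
\[ \sum_{n=1}^{q} B_1(\{f(n)\}) \ll A. \]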

(c) Suppose that $Q \ge 1$ is an integer, $B \ge 1$, and that $1/Q^3 \le \pm f''(x) \le
B/Q^3$ for $0 \le x \le N$ where the choice of sign is independent of
$x$. Show that numbers $a_r$, $q_r$, $N_r$ can be determined, $0 \le r \le R$ for
some $R$, so that (i) $(a_r, q_r) = 1$, (ii) $q_r \le Q$, (iii) $|f'(N_r) - a_r/q_r| \le
1/(q_r Q)$, and (iv) $N_0 = 0$, $N_r = N_{r-1} + q_{r-1}$ for $1 \le r \le R$, $N - Q \le
N_R \le N$.
(d) Show that under the above hypotheses
\[ \sum_{n=0}^{N} B_1(\{f(n)\}) \ll B(R + 1) + Q. \]
(e) Show that the number of $s$ for which $a_s/q_s = a_r/q_r$ is $\ll Q^2/q_r^2$.
Let $1 \le q \le Q$. Show that the number of $r$ for which $q_r = q$ is
$\ll (Q/q)^2 (BNq/Q^3 + 1)$.
(f) Conclude that under the hypotheses of (c),
\[ \sum_{n=0}^{N} B_1(\{f(n)\}) \ll B^2 N Q^{-1} \log 2Q + BQ^2 . \]
26. Show that if $U \le \sqrt{x}$, then
\[ \sum_{U<n\le 2U} B_1(\{x/n\}) \ll x^{1/3} \log x. \]
Let $\Delta(x)$ be as in Exercise 23(b). Show that $\Delta(x) \ll x^{1/3} (\log x)^2$.
27. Let $R(x)$ be as in Exercise 24(c). Show that $R(x) \ll x^{1/3} \log x$.

2.2 The prime number estimates of Chebyshev and of Mertens
Because of the irregular spacing of the prime numbers, it seems hopeless to
give a useful exact formula for the $n$th prime. As a compromise we estimate the
$n$th prime, or equivalently, estimate the number $\pi(x)$ of primes not exceeding $x$.
Similarly we put $\vartheta(x) = \sum_{p\le x} \log p$, and $\psi(x) = \sum_{n\le x} \Lambda(n)$. As we shall see,
these three summatory functions are closely related. We estimate $\psi(x)$ first.

Theorem 2.4 (Chebyshev) For $x \ge 2$, $\psi(x) \asymp x$.

The proof we give below establishes only that there is an $x_0$ such that
$\psi(x) \asymp x$ uniformly for $x \ge x_0$. However, both $\psi(x)$ and $x$ are bounded away
from 0 and from $\infty$ in the interval $[2, x_0]$, and hence the implicit constants can
be adjusted so that $\psi(x) \asymp x$ uniformly for $x \ge 2$. In subsequent situations of

this sort, we shall assume without comment that the reader understands that it
sufﬁces to prove the result for all sufﬁciently large x.

Proof By applying the Möbius inversion formula to (1.22) we find that
\[ \Lambda(n) = \sum_{d|n} \mu(d) \log(n/d). \]
Thus by (2.7) it follows that
\[ \psi(x) = \sum_{d\le x} \mu(d) T(x/d) \tag{2.10} \]
where $T(x) = \sum_{n\le x} \log n$. By the integral test we see that
\[ \int_1^{N} \log u\,du \le T(N) \le \int_1^{N+1} \log u\,du \]
for any positive integer $N$. Since $\int \log x\,dx = x \log x - x$, it follows easily
that
\[ T(x) = x \log x - x + O(\log 2x) \tag{2.11} \]
for $x \ge 1$. Despite the precision of this estimate, we encounter difficulties when
we substitute this in (2.10), since we have no useful information concerning the
sums
\[ \sum_{d\le x} \frac{\mu(d)}{d}, \qquad \sum_{d\le x} \frac{\mu(d) \log d}{d}, \]
which arise in the main terms. To avoid this problem we introduce an idea that
is fundamental to much of prime number theory, namely we replace $\mu(d)$ by
an arithmetic function $a_d$ that in some way forms a truncated approximation to
$\mu(d)$. Suppose that $D$ is a finite set of numbers, and that $a_d = 0$ when $d \notin D$.
Then by (2.11) we see that
\[ \sum_{d\in D} a_d T(x/d) = (x \log x - x) \sum_{d\in D} a_d/d - x \sum_{d\in D} \frac{a_d \log d}{d} + O(\log 2x). \tag{2.12} \]
Here the implicit constant depends on the choice of $a_d$, which we shall consider
to be fixed. Since we want the above to approximate the relation (2.10), and
since we are hoping that $\psi(x) \asymp x$, we restrict our attention to $a_d$ that satisfy
the condition
\[ \sum_{d\in D} \frac{a_d}{d} = 0, \tag{2.13} \]

and hope that
\[ -\sum_{d\in D} \frac{a_d \log d}{d} \ \text{ is near } 1. \tag{2.14} \]
By the definition of $T(x)$ we see that the left-hand side of (2.12) is
\[ \sum_{dn\le x} a_d \log n = \sum_{dn\le x} a_d \sum_{k|n} \Lambda(k) = \sum_{dkm\le x} a_d \Lambda(k) = \sum_{k\le x} \Lambda(k) E(x/k) \tag{2.15} \]
where $E(y) = \sum_{dm\le y} a_d = \sum_d a_d [y/d]$. The expression above will be near
$\psi(x)$ if $E(y)$ is near 1. If $y \ge 1$ then
\[ \sum_d \mu(d)[y/d] = \sum_d \mu(d) \sum_{k\le y/d} 1 = \sum_{dk\le y} \mu(d) = \sum_{n\le y} \sum_{d|n} \mu(d) = 1, \]
in view of (1.20). Thus $E(y)$ will be near 1 for $y$ not too large if $a_d$ is near $\mu(d)$
for small $d$. Moreover, by (2.13) we see that $E(y) = -\sum_{d\in D} a_d \{y/d\}$, so that
$E(y)$ is periodic with period dividing $\operatorname{lcm}_{d\in D} d$. Hence for a given choice of
the $a_d$, the behaviour of $E(y)$ can be determined by a finite calculation.
The simplest realization of this approach involves taking $a_1 = 1$, $a_2 = -2$,
$a_d = 0$ for $d > 2$. Then (2.13) holds, the expression (2.14) is $\log 2$, $E(y)$ has
period 2 and $E(y) = 0$ for $0 \le y < 1$, $E(y) = 1$ for $1 \le y < 2$. Hence for this
choice of the $a_d$ the sum in (2.15) satisfies the inequalities
\[ \psi(x) - \psi(x/2) = \sum_{x/2<k\le x} \Lambda(k) \le \sum_{k\le x} \Lambda(k) E(x/k) \le \sum_{k\le x} \Lambda(k) = \psi(x). \]
Thus $\psi(x) \ge (\log 2)x + O(\log x)$, which is a lower bound of the desired shape.
In addition,
\[ \psi(x) - \psi(x/2) \le (\log 2)x + O(\log x). \]
On replacing $x$ by $x/2^r$ and summing over $r$ we deduce that
\[ \psi(x) \le 2(\log 2)x + O((\log x)^2), \]
so the proof is complete.
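The sandwich inequality in the proof can be observed numerically. With $a_1 = 1$, $a_2 = -2$ the left-hand side of (2.12) is $T(x) - 2T(x/2)$, and the sketch below (hypothetical helper names, integer $x$ only) computes it together with $\psi(x) - \psi(x/2)$ and $\psi(x)$.

```python
import math


def von_mangoldt_table(limit):
    """lam[n] = log p if n is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = True
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam


def chebyshev_sandwich(x):
    """Return (psi(x) - psi(x/2), T(x) - 2 T(x/2), psi(x)) for an integer x >= 1."""
    lam = von_mangoldt_table(x)
    half = x // 2  # psi and T only change at integers, so x/2 may be rounded down

    def psi(y):
        return math.fsum(lam[1:y + 1])

    def big_t(y):
        return math.fsum(math.log(n) for n in range(1, y + 1))

    return psi(x) - psi(half), big_t(x) - 2 * big_t(half), psi(x)


if __name__ == "__main__":
    for x in (100, 10_000, 100_000):
        low, middle, high = chebyshev_sandwich(x)
        print(x, low, middle, high, math.log(2) * x)
```

The middle quantity stays close to $(\log 2)x$, as (2.12) predicts for this choice of the $a_d$.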

Chebyshev obtained better constants than above, by taking $a_1 = a_{30} = 1$,
$a_2 = a_3 = a_5 = -1$, $a_d = 0$ otherwise. Then (2.13) holds, the expression (2.14)
is $0.92129\ldots$, $E(y) = 1$ for $1 \le y < 6$, and $0 \le E(y) \le 1$ for all $y$, with the
result that
\[ \psi(x) \ge (0.9212)x + O(\log x) \]


and
\[ \psi(x) \le (1.1056)x + O((\log x)^2). \]
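The claims made about Chebyshev's coefficients amount to a finite calculation, which the following sketch carries out. The dictionary layout and function names are assumptions made for illustration; the coefficients themselves are those quoted above.

```python
import math
from fractions import Fraction

# Chebyshev's coefficients a_d, as quoted in the text
CHEBYSHEV_A = {1: 1, 2: -1, 3: -1, 5: -1, 30: 1}


def weight_sum(a):
    """Exact value of the sum of a_d / d, which (2.13) requires to be 0."""
    return sum(Fraction(c, d) for d, c in a.items())


def main_constant(a):
    """The quantity -sum of a_d (log d) / d appearing in (2.14)."""
    return -sum(c * math.log(d) / d for d, c in a.items())


def e_values(a):
    """E(y) = sum of a_d [y/d] at the integers y in one full period lcm(d)."""
    period = 1
    for d in a:
        period = period * d // math.gcd(period, d)
    return [sum(c * (y // d) for d, c in a.items()) for y in range(period)]


if __name__ == "__main__":
    print(weight_sum(CHEBYSHEV_A))     # 0
    print(main_constant(CHEBYSHEV_A))  # about 0.92129
    print(e_values(CHEBYSHEV_A))       # zeros and ones over one period of length 30
```

Since $E(y)$ is a step function that changes only at integers, checking the integers in one period covers every real $y$.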

By computing the implicit constants one can use this method to determine a
constant x0 such that ψ(2x) − ψ(x) > x/2 for all x > x0 . Since the contribution
of the proper prime powers is small, it follows that there is at least one prime
in the interval (x, 2x], when x > x0 . After separate consideration of x ≤ x0 ,
one obtains Bertrand’s postulate: For each real number x > 1, there is a prime
number in the interval (x, 2x).
Chebyshev said it, but I’ll say it again:
There’s always a prime between n and 2n.
N. J. Fine

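Corollary 2.5 For $x \ge 2$,
\[ \vartheta(x) = \psi(x) + O\big(x^{1/2}\big) \]
and
\[ \pi(x) = \frac{\psi(x)}{\log x} + O\Big(\frac{x}{(\log x)^2}\Big). \]
Proof Clearly
\[ \psi(x) = \sum_{p^k\le x} \log p = \sum_{k=1}^{\infty} \vartheta\big(x^{1/k}\big). \]
But $\vartheta(y) \le \psi(y) \ll y$, so that
\[ \psi(x) - \vartheta(x) = \sum_{k\ge 2} \vartheta(x^{1/k}) \ll x^{1/2} + x^{1/3} \log x \ll x^{1/2}. \]
As for $\pi(x)$, we note that
\[ \pi(x) = \int_{2^-}^{x} (\log u)^{-1}\,d\vartheta(u) = \frac{\vartheta(x)}{\log x} + \int_2^{x} \frac{\vartheta(u)}{u(\log u)^2}\,du. \]
This last integral is
\[ \ll \int_2^{x} (\log u)^{-2}\,du \ll x(\log x)^{-2}, \]
so we have the stated result.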
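For orientation, the three summatory functions can be tabulated together. The sketch below is a plain sieve with hypothetical names, not an efficient method; it returns $\pi(x)$, $\vartheta(x)$ and $\psi(x)$ so that the two estimates of Corollary 2.5 can be compared with actual values.

```python
import math


def prime_summatory_functions(x):
    """Return (pi(x), theta(x), psi(x)) for an integer x >= 2, using a simple sieve."""
    is_composite = [False] * (x + 1)
    count, theta_terms, psi_terms = 0, [], []
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, x + 1, p):
            is_composite[m] = True
        log_p = math.log(p)
        count += 1
        theta_terms.append(log_p)
        # p contributes log p to psi once for each power p^k <= x
        k, q = 0, p
        while q <= x:
            k += 1
            q *= p
        psi_terms.append(k * log_p)
    return count, math.fsum(theta_terms), math.fsum(psi_terms)


if __name__ == "__main__":
    x = 10 ** 5
    pi_x, theta_x, psi_x = prime_summatory_functions(x)
    print(pi_x, theta_x, psi_x)
    print(psi_x - theta_x, math.sqrt(x))
    print(pi_x - psi_x / math.log(x), x / math.log(x) ** 2)
```

The last two printed lines put each difference next to the size of the corresponding error term in the corollary.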

Corollary 2.6 For $x \ge 2$, $\vartheta(x) \asymp x$ and $\pi(x) \asymp x/\log x$.

In Chapters 6 and 8 we shall give several proofs of the Prime Number
Theorem (PNT), which asserts that $\pi(x) \sim x/\log x$. By Corollary 2.5 this is

equivalent to the estimates $\vartheta(x) \sim x$, $\psi(x) \sim x$. By partial summation it is
easily seen that the PNT implies that
\[ \sum_{p\le x} \frac{\log p}{p} \sim \log x, \]
and that
\[ \sum_{p\le x} \frac{1}{p} \sim \log\log x. \]
However, these assertions are weaker than PNT, as we can derive them from
Theorem 2.4.

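Theorem 2.7 For $x \ge 2$,

(a) $\displaystyle\sum_{n\le x}\frac{\Lambda(n)}{n} = \log x + O(1)$,
(b) $\displaystyle\sum_{p\le x}\frac{\log p}{p} = \log x + O(1)$,
(c) $\displaystyle\int_1^x\psi(u)u^{-2}\,du = \log x + O(1)$,
(d) $\displaystyle\sum_{p\le x}\frac{1}{p} = \log\log x + b + O(1/\log x)$,
(e) $\displaystyle\prod_{p\le x}\Bigl(1-\frac{1}{p}\Bigr)^{-1} = e^{C_0}\log x + O(1)$,

where $C_0$ is Euler's constant and

\[
b = C_0 - \sum_p\sum_{k=2}^{\infty}\frac{1}{kp^k}.
\]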
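Parts (d) and (e) lend themselves to a quick numerical sanity check. The sketch below is illustrative only: it truncates the sum defining $b$ to primes $p \le x$, uses the identity $\sum_{k\ge 2} 1/(kp^k) = -\log(1-1/p) - 1/p$, and hard-codes a floating-point value of Euler's constant under the hypothetical name `EULER`.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

EULER = 0.5772156649015329  # Euler's constant C_0

x = 10**5
primes = primes_up_to(x)

# b = C_0 - sum_p sum_{k>=2} 1/(k p^k) = C_0 + sum_p (log(1 - 1/p) + 1/p),
# truncated here to p <= x.
b = EULER + sum(math.log1p(-1 / p) + 1 / p for p in primes)

# Part (d): sum_{p<=x} 1/p - log log x - b should be O(1/log x).
d_error = sum(1 / p for p in primes) - math.log(math.log(x)) - b

# Part (e): prod_{p<=x} (1 - 1/p)^{-1} / log x should be close to e^{C_0}.
e_ratio = math.exp(-sum(math.log1p(-1 / p) for p in primes)) / math.log(x)

print(f"b (truncated)        = {b:.6f}")
print(f"error in (d)         = {d_error:.6f}")
print(f"product / log x      = {e_ratio:.6f}")
print(f"exp(C_0)             = {math.exp(EULER):.6f}")
```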

Proof Taking $f(d) = \Lambda(d)$ in (2.1), we see from (2.3) that

\[
T(x) = \sum_{n\le x}\log n = x\sum_{d\le x}\frac{\Lambda(d)}{d} + O(\psi(x)).
\]

By Theorem 2.4 the error term is $\ll x$. Thus (2.11) gives (a). The sum in (b) differs from that in (a) by the amount

\[
\sum_{\substack{p^k\le x\\ k\ge 2}}\frac{\log p}{p^k} \le \sum_p\frac{\log p}{p(p-1)} \ll 1.
\]

To derive (c) we note that the sum in (a) is

\[
\int_{2^-}^{x}u^{-1}\,d\psi(u) = \frac{\psi(u)}{u}\bigg|_{2^-}^{x} + \int_2^x\psi(u)u^{-2}\,du = \int_2^x\psi(u)u^{-2}\,du + O(1)
\]

by Theorem 2.4. We now prove (d) without determining the value of the constant $b$. We express (b) in the form $L(x) = \log x + R(x)$ where $R(x) \ll 1$. Then

\[
\begin{aligned}
\sum_{p\le x}\frac{1}{p} &= \int_{2^-}^{x}(\log u)^{-1}\,dL(u) = \int_{2^-}^{x}\frac{d\log u}{\log u} + \int_{2^-}^{x}\frac{dR(u)}{\log u}\\
&= \int_2^x\frac{du}{u\log u} + \frac{R(u)}{\log u}\bigg|_{2^-}^{x} - \int_{2^-}^{x}R(u)\,d(\log u)^{-1}\\
&= \log\log x - \log\log 2 + 1 + \frac{R(x)}{\log x} + \int_2^x\frac{R(u)}{u(\log u)^2}\,du.
\end{aligned}
\]

The penultimate term is $\ll 1/\log x$, and the integral is $\int_2^\infty - \int_x^\infty = \int_2^\infty + O(1/\log x)$, so we have (d) with

\[
b = 1 - \log\log 2 + \int_2^\infty\frac{R(u)}{u(\log u)^2}\,du.
\]
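As for (e), we note that

\[
\log\prod_{p\le x}\Bigl(1-\frac{1}{p}\Bigr)^{-1} = \sum_{p\le x}\frac{1}{p} + \sum_{p\le x}\Bigl(\log\Bigl(1-\frac{1}{p}\Bigr)^{-1} - \frac{1}{p}\Bigr).
\]

The second sum on the right is

\[
\sum_p\sum_{k=2}^{\infty}\frac{1}{kp^k} + O\Bigl(\sum_{p>x}p^{-2}\Bigr)
\]

and the error term here is $\ll \sum_{n>x}n^{-2} \ll x^{-1}$, so from (d) we have

\[
\log\prod_{p\le x}\Bigl(1-\frac{1}{p}\Bigr)^{-1} = \log\log x + c + O(1/\log x) \tag{2.16}
\]

where $c = b + \sum_p\sum_{k\ge 2}(kp^k)^{-1}$. Since $e^z = 1 + O(|z|)$ for $|z| \le 1$, on exponentiating we deduce that

\[
\prod_{p\le x}\Bigl(1-\frac{1}{p}\Bigr)^{-1} = e^c\log x + O(1).
\]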

To complete the proof it suffices to show that $c = C_0$. To this end we first note that if $p \le x$ and $p^k > x$, then $k \ge (\log x)/\log p$. Hence

\[
\sum_{\substack{p\le x\\ p^k>x}}\frac{1}{kp^k} \le \sum_{\substack{p\le x\\ p^k>x}}\frac{\log p}{(\log x)p^k} \le \frac{1}{\log x}\sum_p\log p\sum_{k\ge 2}p^{-k} \ll \frac{1}{\log x}\sum_p\frac{\log p}{p^2} \ll \frac{1}{\log x},
\]

so that from (2.16) we have

\[
\sum_{1<n\le x}\frac{\Lambda(n)}{n\log n} = \log\log x + c + O(1/\log x).
\]

By Corollary 1.15 this can be written

\[
\sum_{1<n\le x}\frac{\Lambda(n)}{n\log n} = \sum_{n\le\log x}\frac{1}{n} + (c - C_0) + O(1/\log 2x).
\]

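Since this is trivial when $1 \le x < 2$, the above holds for all $x \ge 1$. We express this briefly as $T_1 = T_2 + T_3 + T_4$, and estimate the quantities $I_i = \delta\int_1^\infty x^{-1-\delta}T_i(x)\,dx$. On comparing the results as $\delta \to 0^+$ we shall deduce that $c = C_0$. By Theorem 1.3, Corollary 1.11, and Corollary 1.13 we see that

\[
I_1 = \log\zeta(1+\delta) = \log\frac{1}{\delta} + O(\delta)
\]

as $\delta \to 0^+$. Secondly,

\[
\begin{aligned}
I_2 &= \delta\sum_{n=1}^{\infty}\frac{1}{n}\int_{e^n}^{\infty}x^{-1-\delta}\,dx = \sum_{n=1}^{\infty}\frac{1}{n}e^{-\delta n} = \log(1-e^{-\delta})^{-1}\\
&= \log(\delta + O(\delta^2))^{-1} = \log 1/\delta + O(\delta).
\end{aligned}
\]

Thirdly,

\[
I_3 = c - C_0,
\]

and finally

\[
I_4 \ll \delta\int_1^\infty\frac{x^{-1-\delta}}{\log 2x}\,dx \ll \delta + \delta\int_2^{e^{1/\delta}}\frac{dx}{x\log x} + \delta^2\int_{e^{1/\delta}}^{\infty}x^{-1-\delta}\,dx \ll \delta\log 1/\delta.
\]

Since the main terms cancel, on letting $\delta \to 0^+$ we see that $c = C_0$.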

Corollary 2.8 We have

\[
\limsup_{x\to\infty}\frac{\pi(x)}{x/\log x} \ge 1
\]

and

\[
\liminf_{x\to\infty}\frac{\pi(x)}{x/\log x} \le 1.
\]

Proof By Corollary 2.5 it suffices to show that $\limsup\psi(u)/u \ge 1$, and that $\liminf\psi(u)/u \le 1$. Suppose that $\limsup\psi(u)/u = a$, and suppose that $\varepsilon > 0$. Then there is an $x_0$ such that $\psi(x) \le (a+\varepsilon)x$ for all $x \ge x_0$, and hence

\[
\int_1^x\psi(u)u^{-2}\,du \le \int_1^{x_0}\psi(u)u^{-2}\,du + (a+\varepsilon)\int_{x_0}^{x}u^{-1}\,du \le (a+\varepsilon)\log x + O_\varepsilon(1).
\]

Since this holds for arbitrary $\varepsilon > 0$, it follows that $\int_1^x\psi(u)u^{-2}\,du \le (a + o(1))\log x$. Thus by Theorem 2.7(c) we have $a \ge 1$. Similarly $\liminf\psi(u)/u \le 1$.

2.2.1 Exercises
1. (a) Let $d_n = [1, 2, \ldots, n]$. Show that $d_n = e^{\psi(n)}$.
(b) Let $P \in \mathbb{Z}[x]$, $\deg P \le n$. Put $I = I(P) = \int_0^1 P(x)\,dx$. Show that $Id_{n+1} \in \mathbb{Z}$, and hence that $d_{n+1} \ge 1/|I|$ if $I \ne 0$.
(c) Show that there is a polynomial $P$ as above so that $Id_{n+1} = 1$.
(d) Verify that $\max_{0\le x\le 1}|x^2(1-x)^2(2x-1)| = 5^{-5/2}$.
(e) For $P(x) = \bigl(x^2(1-x)^2(2x-1)\bigr)^{2n}$, verify that $0 < I < 5^{-5n}$.
(f) Show that $\psi(10n+1) \ge (\tfrac12\log 5)\cdot 10n$.
2. Let $\mathcal{A}$ be the set of integers composed entirely of primes $p \le A_1$, and let $\mathcal{B}$ be the set of integers composed entirely of primes $p > A_1$. Then $n$ is uniquely of the form $n = ab$, $a \in \mathcal{A}$, $b \in \mathcal{B}$. Let $\delta(A_1, A_2)$ denote the density of those $n$ such that $a \le A_2$.
(a) Give a formula for $\delta(A_1, A_2)$.
(b) Show that $\delta(A_1, A_2) \ll (\log A_2)/\log A_1$ for $2 \le A_2 \le A_1$.
3. Let $a_n = 1 + \cos\log n$, and note that $a_n \ge 0$ for all $n$.
(a) Show that

\[
\sum_{n=1}^{\infty}a_nn^{-s} = \zeta(s) + \tfrac12\zeta(s+i) + \tfrac12\zeta(s-i)
\]

for $\sigma > 1$.
(b) By Corollary 1.15, or otherwise, show that

\[
\sum_{n\le x}\frac{a_n}{n} = \log x + O(1).
\]

(c) By integrating by parts as in the proof of Theorem 1.12, show that

\[
\sum_{n\le x}a_n = \Bigl(1 + \frac{x^{i}}{2(1+i)} + \frac{x^{-i}}{2(1-i)}\Bigr)x + O(\log x).
\]

(d) Deduce that

\[
\liminf_{x\to\infty}\frac{1}{x}\sum_{n\le x}a_n = 1 - \frac{1}{\sqrt2},\qquad \limsup_{x\to\infty}\frac{1}{x}\sum_{n\le x}a_n = 1 + \frac{1}{\sqrt2}.
\]

Thus for the coefficients $a_n$ we have an analogue of Mertens’ estimate of Theorem 2.7(b), but not an analogue of the Prime Number Theorem.
4. (Golomb 1992) Let $d_x$ denote the least common multiple of the positive integers not exceeding $x$. Show that

\[
\binom{2n}{n} = \prod_{k=1}^{\infty}d_{2n/k}^{(-1)^{k-1}}.
\]

5. (Chebyshev 1850) From Corollaries 2.5 and 2.8 we see that if there is a number $a$ such that $\psi(x) = (a + o(1))x$ as $x \to \infty$, then we must have $a = 1$. We now take this a step further.
(a) Suppose that there is a number $a$ such that

\[
\psi(x) = x + (a + o(1))x/\log x \tag{2.17}
\]

as $x \to \infty$. Deduce that

\[
\int_2^x\frac{\psi(u)}{u^2}\,du = \log x + (a + o(1))\log\log x
\]

as $x \to \infty$.
(b) By comparing the above with Theorem 2.7(c), deduce that if (2.17) holds, then necessarily $a = 0$.
(c) Suppose that there is a constant $A$ such that

\[
\pi(x) = \frac{x}{\log x - A} + o\Bigl(\frac{x}{(\log x)^2}\Bigr) \tag{2.18}
\]

as $x \to \infty$. By writing $\vartheta(x) = \int_{2^-}^{x}\log u\,d\pi(u)$, integrating by parts, and estimating the expressions that arise, show that if (2.18) holds, then

\[
\psi(x) = x + (A - 1 + o(1))x/\log x
\]

as $x \to \infty$.
(d) Deduce that if (2.18) holds, then $A = 1$.

2.3 Applications to arithmetic functions

The results above are useful in determining the extreme values of familiar arithmetic functions. We consider three instances.

Theorem 2.9 For all $n \ge 3$,

\[
\varphi(n) \ge \frac{n}{\log\log n}\Bigl(e^{-C_0} + O(1/\log\log n)\Bigr),
\]

and there are infinitely many $n$ for which the above relation holds with equality.

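Proof Let $\mathcal{R}$ be the set of those $n$ for which $\varphi(n)/n < \varphi(m)/m$ for all $m < n$. We first prove the inequality for these ‘record-breaking’ $n \in \mathcal{R}$. Suppose that $\omega(n) = k$, and let $n^*$ be the product of the first $k$ primes. If $n \ne n^*$ then $n^* < n$ and $\varphi(n^*)/n^* < \varphi(n)/n$. Hence $\mathcal{R}$ is the set of $n$ of the form

\[
n = \prod_{p\le y}p. \tag{2.19}
\]

Taking logarithms, we see that $\log n = \vartheta(y) \asymp y$ by Corollary 2.6. On taking logarithms a second time, it follows that $\log\log n = \log y + O(1)$. Thus by Mertens’ formula (Theorem 2.7(e)) we see that

\[
\frac{\varphi(n)}{n} = \prod_{p\le y}\Bigl(1-\frac{1}{p}\Bigr) = \frac{e^{-C_0}}{\log y}\bigl(1 + O(1/\log y)\bigr),
\]

which gives the desired result for $n \in \mathcal{R}$. If $n \notin \mathcal{R}$ then there is an $m < n$ such that $m \in \mathcal{R}$, $\varphi(m)/m < \varphi(n)/n$. Hence

\[
\begin{aligned}
\frac{\varphi(n)}{n} > \frac{\varphi(m)}{m} &= \frac{1}{\log\log m}\Bigl(e^{-C_0} + O\Bigl(\frac{1}{\log\log m}\Bigr)\Bigr)\\
&\ge \frac{1}{\log\log n}\Bigl(e^{-C_0} + O\Bigl(\frac{1}{\log\log n}\Bigr)\Bigr).
\end{aligned}
\]

We note that equality holds for $n$ of the type (2.19), so the proof is complete.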
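The extremal role of the products (2.19) can be illustrated numerically. The sketch below, with the hypothetical function name `normalized_totient`, evaluates $(\varphi(n)/n)\log\log n$ for $n = \prod_{p\le y}p$ using logarithms (so that $n$ itself is never formed) and compares it with $e^{-C_0}$; the value of Euler's constant is hard-coded as `EULER`.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

EULER = 0.5772156649015329  # Euler's constant C_0

def normalized_totient(y):
    """(phi(n)/n) * log log n for n equal to the product of the primes p <= y."""
    primes = primes_up_to(y)
    log_n = sum(math.log(p) for p in primes)            # log n = theta(y)
    log_ratio = sum(math.log1p(-1 / p) for p in primes)  # log(phi(n)/n)
    return math.exp(log_ratio) * math.log(log_n)

for y in (10**2, 10**3, 10**4, 10**5):
    print(y, f"{normalized_totient(y):.5f}")
print("exp(-C_0) =", f"{math.exp(-EULER):.5f}")
```

The printed values approach $e^{-C_0}$ only slowly, in keeping with the $O(1/\log\log n)$ error term in Theorem 2.9.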

Theorem 2.10 For all $n \ge 3$,

\[
1 \le \omega(n) \le \frac{\log n}{\log\log n}\bigl(1 + O(1/\log\log n)\bigr).
\]
Proof As in the preceding proof we see that record-breaking values of ω(n)
occur when n is of the form (2.19), and that it sufﬁces to prove the bound for
these n. As in the preceding proof, for n given by (2.19) we have ϑ(y) = log n
and log y = log log n + O(1). This gives the result, and we note that the bound
is sharp for these n.

We now consider the maximum order of $d(n)$. From the pairing $d \leftrightarrow n/d$ of divisors, and the fact that at least one of these is $\le \sqrt n$, it is immediate that $d(n) \le 2\sqrt n$. On the other hand, if $n$ is square-free then $d(n) = 2^{\omega(n)}$, which can be large, but not nearly as large as $\sqrt n$. Indeed, for each $\varepsilon > 0$ there is a constant $C(\varepsilon)$ such that

\[
d(n) \le C(\varepsilon)n^{\varepsilon} \tag{2.20}
\]

for all $n \ge 1$. To see this we express $n$ in terms of its canonical factorization, $n = \prod_p p^{a}$, so that

\[
\frac{d(n)}{n^{\varepsilon}} = \prod_p\frac{a+1}{p^{a\varepsilon}} = \prod_p f_p(a),
\]

say. Let $\alpha_p$ be an integral value of $a$ for which $f_p(a)$ is maximized. From the inequalities $f_p(\alpha_p) \ge f_p(\alpha_p \pm 1)$ we see that

\[
(p^{\varepsilon}-1)^{-1} - 1 \le \alpha_p \le (p^{\varepsilon}-1)^{-1},
\]

so that we may take $\alpha_p = [(p^{\varepsilon}-1)^{-1}]$. Hence (2.20) holds with

\[
C(\varepsilon) = \prod_p f_p(\alpha_p).
\]

This constant is best possible, since equality holds when $n = \prod_p p^{\alpha_p}$. By analysing the rate at which $C(\varepsilon)$ grows as $\varepsilon \to 0^+$, we derive
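Theorem 2.11 For all $n \ge 3$

\[
\log d(n) \le \frac{\log n}{\log\log n}\bigl(\log 2 + O(1/\log\log n)\bigr).
\]

We note that this bound is sharp for $n$ of the form in (2.19).
Proof It suffices to show that there is an absolute constant $K$ such that

\[
C(\varepsilon) \le \exp\bigl(K\varepsilon^2 2^{1/\varepsilon}\bigr), \tag{2.21}
\]

since the stated bound then follows by taking $\varepsilon = (\log 2)/\log\log n$. We observe that $\alpha_p = 0$ if $p > 2^{1/\varepsilon}$, that $\alpha_p = 1$ if $(3/2)^{1/\varepsilon} < p \le 2^{1/\varepsilon}$, and that $\alpha_p \ll 1/\varepsilon$ when $p \le (3/2)^{1/\varepsilon}$. Hence

\[
\log C(\varepsilon) \ll \sum_{p\le 2^{1/\varepsilon}}\log(2/p^{\varepsilon}) + \sum_{p\le(3/2)^{1/\varepsilon}}\log(1/\varepsilon).
\]

Here the second sum is $\pi\bigl((3/2)^{1/\varepsilon}\bigr)\log 1/\varepsilon \ll \varepsilon^2 2^{1/\varepsilon}$. The first sum is $(\log 2)\pi(2^{1/\varepsilon}) - \varepsilon\vartheta(2^{1/\varepsilon})$, and by Corollary 2.5 this is $\ll \varepsilon^2 2^{1/\varepsilon}$. Thus we have (2.21), and the proof is complete.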
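The construction of the best constant $C(\varepsilon)$ in (2.20) is explicit enough to be computed. The following sketch (function names `best_constant` and `divisor_counts` are hypothetical) forms $\alpha_p = [(p^{\varepsilon}-1)^{-1}]$ for the finitely many primes with $p \le 2^{1/\varepsilon}$, returns $C(\varepsilon)$ together with the extremal integer $\prod_p p^{\alpha_p}$, and then checks (2.20) for small $n$ in the case $\varepsilon = 1/2$. Floating-point rounding in the floor is ignored here, which is harmless for the value of $\varepsilon$ used.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def best_constant(eps):
    """Return (C(eps), n_eps), where n_eps is the product of p^alpha_p."""
    limit = int(2 ** (1 / eps))  # alpha_p = 0 once p > 2^(1/eps)
    C, n = 1.0, 1
    for p in primes_up_to(limit):
        alpha = int(1 / (p ** eps - 1))
        C *= (alpha + 1) / p ** (alpha * eps)
        n *= p ** alpha
    return C, n

def divisor_counts(x):
    """d[n] = number of divisors of n, for 1 <= n <= x."""
    d = [0] * (x + 1)
    for i in range(1, x + 1):
        for m in range(i, x + 1, i):
            d[m] += 1
    return d

C, n_ext = best_constant(0.5)
d = divisor_counts(2000)
worst = max(d[m] / math.sqrt(m) for m in range(1, 2001))
print(f"C(1/2) = {C:.6f}, extremal n = {n_ext}, max d(n)/sqrt(n) = {worst:.6f}")
```

For $\varepsilon = 1/2$ the extremal integer is small, so the maximum of $d(n)/\sqrt{n}$ over the tested range should agree with $C(1/2)$.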

It is very instructive to consider our various results from the perspective of elementary probability theory. Let $d$ be a fixed integer. Then the set of $n$ that are divisible by $d$ has asymptotic density $1/d$, and we might say, loosely, that the ‘probability’ that $d \mid n$ when $n$ is ‘randomly chosen’ is $1/d$. If $d_1$ and $d_2$ are two fixed numbers then the ‘probability’ that $d_1 \mid n$ and $d_2 \mid n$ is $1/[d_1, d_2]$. If $(d_1, d_2) = 1$ then this ‘probability’ is $1/(d_1d_2)$, and we see that the ‘events’ $d_1 \mid n$, $d_2 \mid n$ are ‘independent.’ To make this rigorous we consider the integers $1 \le n \le N$, and assign probability $1/N$ to each of the $N$ numbers $n$. Then

\[
P(d \mid n) = [N/d]/N = \frac{1}{d} - \frac{1}{N}\{N/d\}.
\]

This is $1/d$ if $d \mid N$; otherwise it is close to $1/d$ if $d$ is small compared to $N$. Similarly the events $d_1 \mid n$, $d_2 \mid n$ are not independent in general, but are nearly independent if $N/(d_1d_2)$ is large. The probabilistic heuristic, in which independence is assumed, provides a useful means of constructing conjectures. Many of our investigations can be considered to be directed toward determining whether the cumulative effect of the error terms $\{N/d\}/N$ has a discernible effect.
As an example of the probabilistic approach, we note that $n$ is square-free if and only if none of the numbers $2^2, 3^2, 5^2, \ldots, p^2, \ldots$ divide $n$. The ‘probability’ that $p^2 \nmid n$ is approximately $1 - 1/p^2$. Since these events are nearly independent, we predict that the probability that a random integer $n \in [1, N]$ is square-free is approximately $\prod_{p\le N}(1 - 1/p^2)$. This was confirmed in Theorem 2.2. On the other hand, the sieve of Eratosthenes asserts that

\[
\sum_{\substack{n\le N\\(n,P)=1}}1 = \pi(N) - \pi\bigl(\sqrt N\bigr) + 1
\]

where $P = \prod_{p\le\sqrt N}p$. For a random $n \in [1, N]$ we expect that the probability that $(n, P) = 1$ should be approximately

\[
\frac{\varphi(P)}{P} = \prod_{p\le\sqrt N}\Bigl(1-\frac{1}{p}\Bigr) \sim \frac{2e^{-C_0}}{\log N}
\]

by Mertens’ formula (Theorem 2.7(e)). This would suggest that perhaps

\[
\pi(x) \sim 2e^{-C_0}\frac{x}{\log x}.
\]

However, since $2e^{-C_0} = 1.1229189\ldots$, this conflicts with the Prime Number Theorem, and also with Corollary 2.8. Thus the probabilistic model is misleading in this case.
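The square-free prediction, where the heuristic does succeed, is easy to test empirically. The sketch below (helper names are hypothetical) counts the square-free integers in $[1, N]$ with a sieve and compares their proportion with the product $\prod_{p\le N}(1 - 1/p^2)$ and with $6/\pi^2$.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def squarefree_flags(N):
    """flags[n] = 1 if n is square-free, else 0, for 0 <= n <= N."""
    flags = bytearray([1]) * (N + 1)
    flags[0] = 0
    for d in range(2, math.isqrt(N) + 1):
        sq = d * d
        flags[sq::sq] = bytearray(len(range(sq, N + 1, sq)))
    return flags

N = 10**5
observed = sum(squarefree_flags(N)) / N
predicted = math.prod(1 - 1 / p**2 for p in primes_up_to(N))
print(f"observed proportion  = {observed:.5f}")
print(f"heuristic product    = {predicted:.5f}")
print(f"6/pi^2               = {6 / math.pi**2:.5f}")
```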
Suppose now that $X_p(n)$ is the arithmetic function

\[
X_p(n) = \begin{cases}1 & \text{if } p \mid n,\\ 0 & \text{otherwise,}\end{cases}
\]

so that $\omega(n) = \sum_p X_p(n)$. If we were to treat the $X_p$ as though they were independent random variables then we would have $E(X_p) = 1/p$, $\operatorname{Var}(X_p) = (1 - 1/p)/p$. Hence we expect that the average of $\omega(n)$ should be approximately

\[
E\Bigl(\sum_{p\le n}X_p\Bigr) = \sum_{p\le n}E(X_p) = \sum_{p\le n}\frac{1}{p} = \log\log n + O(1),
\]

and that its variance is approximately

\[
\operatorname{Var}\Bigl(\sum_{p\le n}X_p\Bigr) = \sum_{p\le n}\operatorname{Var}(X_p) = \sum_{p\le n}\Bigl(1-\frac{1}{p}\Bigr)\frac{1}{p} = \log\log n + O(1).
\]

The first of these is easily confirmed, since by (2.3) we have

\[
\sum_{n\le x}\omega(n) = x\sum_{p\le x}\frac{1}{p} + O(\pi(x)).
\]

By Mertens’ formula (Theorem 2.7(d)) and Chebyshev’s bound (Corollary 2.6) this is

\[
= x\log\log x + bx + O(x/\log x). \tag{2.22}
\]

As for the variance, we have

Theorem 2.12 (Turán) For $x \ge 3$,

\[
\sum_{n\le x}(\omega(n) - \log\log x)^2 \ll x\log\log x \tag{2.23}
\]

and

\[
\sum_{1<n\le x}(\omega(n) - \log\log n)^2 \ll x\log\log x. \tag{2.24}
\]

These estimates also hold with $\omega(n)$ replaced by $\Omega(n)$.
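Both the mean value (2.22) and the second-moment bound (2.23) can be observed on a table of $\omega(n)$. The sketch below is illustrative: `omega_table` is a hypothetical helper, and the figure $0.2615$ used in the comment for the constant $b$ of Theorem 2.7(d) is an approximate numerical value supplied here as an assumption, not taken from the text.

```python
import math

def omega_table(x):
    """omega[n] = number of distinct prime factors of n, for 0 <= n <= x."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, x + 1, p):
                omega[m] += 1
    return omega

x = 10**5
omega = omega_table(x)
L = math.log(math.log(x))

mean = sum(omega[1:]) / x                               # compare with L + b, b ~ 0.2615
second = sum((w - L) ** 2 for w in omega[1:]) / x       # left side of (2.23), divided by x
ratio = second / L                                      # (2.23) says this stays bounded

print(f"log log x                = {L:.4f}")
print(f"mean of omega(n)         = {mean:.4f}")
print(f"(2.23) / (x log log x)   = {ratio:.4f}")
```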

Let $\mathcal{E}$ be the set of ‘exceptional’ $n$ for which

\[
|\omega(n) - \log\log n| > (\log\log n)^{3/4}.
\]

By Theorem 2.12 we see that

\[
\sum_{\substack{x<n\le 2x\\ n\in\mathcal{E}}}1 \le (\log\log x)^{-3/2}\sum_{n\le 2x}(\omega(n) - \log\log n)^2 \ll \frac{x}{(\log\log x)^{1/2}} = o(x),
\]

so we have

Corollary 2.13 (Hardy–Ramanujan) For almost all $n$, $\omega(n) \sim \Omega(n) \sim \log\log n$.

Note that in analytic number theory we say ‘almost all’ when the exceptional set has asymptotic density 0; this conflicts with the usage in some parts of algebra, where the term means that there are at most finitely many exceptions.

Proof of Theorem 2.12 To prove (2.23) we first multiply out the square on the left, and write the sum as

\[
\Sigma_2 - 2(\log\log x)\Sigma_1 + [x](\log\log x)^2. \tag{2.25}
\]

We have already determined the size of $\Sigma_1$ in (2.22). The new sum is

\[
\Sigma_2 = \sum_{n\le x}\omega(n)^2 = \sum_{n\le x}\Bigl(\sum_{p_1\mid n}1\Bigr)\Bigl(\sum_{p_2\mid n}1\Bigr) = \sum_{\substack{p_1\le x\\ p_2\le x}}\sum_{\substack{n\le x\\ p_i\mid n}}1.
\]

The terms for which $p_1 = p_2$ contribute

\[
\sum_{p\le x}[x/p] = x\sum_{p\le x}\frac{1}{p} + O(\pi(x)) = x\log\log x + O(x).
\]

The terms $p_1 \ne p_2$ contribute

\[
\sum_{p_1\ne p_2}\Bigl[\frac{x}{p_1p_2}\Bigr] \le x\sum_{\substack{p_1p_2\le x\\ p_1\ne p_2}}\frac{1}{p_1p_2} \le x\Bigl(\sum_{p\le x}\frac{1}{p}\Bigr)^2 = x(\log\log x)^2 + O(x\log\log x) \tag{2.26}
\]

by Mertens’ formula (Theorem 2.7(d)). Thus

\[
\Sigma_2 \le x(\log\log x)^2 + O(x\log\log x).
\]

The estimate (2.23) now follows by inserting this and (2.22) in (2.25).
We derive (2.24) from (2.23) by applying the triangle inequality $\bigl|\,\|\mathbf{x}\| - \|\mathbf{y}\|\,\bigr| \le \|\mathbf{x} - \mathbf{y}\|$ for vectors. This gives

\[
\Bigl|\Bigl(\sum_{1<n\le x}(\omega(n) - \log\log n)^2\Bigr)^{1/2} - \Bigl(\sum_{1<n\le x}(\omega(n) - \log\log x)^2\Bigr)^{1/2}\Bigr| \le \Bigl(\sum_{1<n\le x}(\log\log x - \log\log n)^2\Bigr)^{1/2}.
\]

By the integral test the sum on the right is

\[
= \int_e^x(\log\log x - \log\log u)^2\,du + O((\log\log x)^2).
\]

By integrating by parts twice we find that this integral is

\[
-e(\log\log x)^2 - 2e\log\log x + 2\int_e^x\frac{1 + \log\log x - \log\log u}{(\log u)^2}\,du \ll \frac{x}{(\log x)^2}.
\]

Thus

\[
\Bigl(\sum_{1<n\le x}(\omega(n) - \log\log n)^2\Bigr)^{1/2} = \Bigl(\sum_{n\le x}(\omega(n) - \log\log x)^2\Bigr)^{1/2} + O\bigl(x^{1/2}/\log x\bigr),
\]

and (2.24) follows by squaring both sides and applying (2.23). We omit the similar argument for $\Omega(n)$.

Since $2^{\omega(n)} \le d(n) \le 2^{\Omega(n)}$ for all $n$, Corollary 2.13 carries an interesting piece of information for $d(n)$:

\[
d(n) = (\log n)^{\log 2 + o(1)}
\]

for almost all $n$. Since this is smaller than the average size of $d(n)$, we see that the average is determined not by the usual size of $d(n)$ but by a sparse set of $n$ for which $d(n)$ is disproportionately large. Since the first moment (i.e., average) of $d(n)$ is inflated by the ‘tail’ in its distribution, it is not surprising that this effect is more pronounced for the higher moments. As was originally suggested by Ramanujan, it can be shown that for any fixed real number $\kappa$ there is a positive constant $c(\kappa)$ such that

\[
\sum_{n\le x}d(n)^{\kappa} \sim c(\kappa)x(\log x)^{2^{\kappa}-1} \tag{2.27}
\]

as $x \to \infty$.
In order to handle the error terms that arise in our arguments we are frequently
led to estimate the mean value of multiplicative functions. In most such cases
the method of the hyperbola or the simpler identity (2.3) will sufﬁce, but the
labour involved quickly becomes tiresome. It will therefore be convenient to
have the following result on record, as it is very readily applied.

Theorem 2.14 Let $f$ be a non-negative multiplicative function. Suppose that $A$ is a constant such that

\[
\sum_{p\le x}f(p)\log p \le Ax \tag{2.28}
\]

for all $x \ge 1$, and that

\[
\sum_{\substack{p^k\\ k\ge 2}}\frac{f(p^k)k\log p}{p^k} \le A. \tag{2.29}
\]

Then for $x \ge 2$,

\[
\sum_{n\le x}f(n) \ll (A+1)\frac{x}{\log x}\sum_{n\le x}\frac{f(n)}{n}.
\]

We note that this is sharper than the trivial estimate

\[
\sum_{n\le x}f(n) \le x\sum_{n\le x}f(n)/n \tag{2.30}
\]

that holds whenever $f \ge 0$.
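As an illustration of how much Theorem 2.14 gains over (2.30), one can take $f = d$, the divisor function, which satisfies (2.28) and (2.29) for a suitable $A$. The sketch below (with hypothetical names) compares $\sum_{n\le x}d(n)$ with $x\sum_{n\le x}d(n)/n$ and with the smaller quantity $(x/\log x)\sum_{n\le x}d(n)/n$; the theorem asserts only that the last ratio is bounded, not any particular value.

```python
import math

def divisor_counts(x):
    """d[n] = number of divisors of n, for 1 <= n <= x."""
    d = [0] * (x + 1)
    for i in range(1, x + 1):
        for m in range(i, x + 1, i):
            d[m] += 1
    return d

x = 20000
d = divisor_counts(x)

lhs = sum(d[1:])                                        # sum_{n<=x} f(n)
weighted = sum(d[n] / n for n in range(1, x + 1))       # sum_{n<=x} f(n)/n

trivial_ratio = lhs / (x * weighted)                    # (2.30): at most 1
sharp_ratio = lhs / (x / math.log(x) * weighted)        # Theorem 2.14: bounded

print(f"sum d(n)                         = {lhs}")
print(f"ratio against trivial bound      = {trivial_ratio:.4f}")
print(f"ratio against Theorem 2.14 bound = {sharp_ratio:.4f}")
```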

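If $f \ge 0$ and $f$ is multiplicative, then

\[
\sum_{n\le x}\frac{f(n)}{n} \le \prod_{p\le x}\Bigl(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \cdots\Bigr).
\]

On combining this with Theorem 2.14 we obtain
Corollary 2.15 Under the above hypotheses

\[
\sum_{n\le x}f(n) \ll (A+1)\frac{x}{\log x}\prod_{p\le x}\Bigl(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \cdots\Bigr).
\]

Suppose for example that $f(n) = d(n)^{\kappa}$. We write

\[
\prod_{p\le x}\Bigl(1 + \frac{2^{\kappa}}{p} + \frac{3^{\kappa}}{p^2} + \cdots\Bigr) = \prod_{p\le x}\Bigl(1-\frac{1}{p}\Bigr)^{-2^{\kappa}}\prod_{p\le x}\Bigl(1-\frac{1}{p}\Bigr)^{2^{\kappa}}\Bigl(1 + \frac{2^{\kappa}}{p} + \frac{3^{\kappa}}{p^2} + \cdots\Bigr)
\]

and observe that the second product tends to a finite limit as $x \to \infty$, so that by Mertens’ formula (Theorem 2.7(e)) we have

\[
\sum_{n\le x}d(n)^{\kappa} \ll x(\log x)^{2^{\kappa}-1} \tag{2.31}
\]

for any fixed $\kappa$. Though weaker than (2.27), this is all that is needed in many cases. We can similarly show that for any fixed real $\kappa$,

\[
\sum_{n\le x}\Bigl(\frac{n}{\varphi(n)}\Bigr)^{\kappa} \ll x. \tag{2.32}
\]

Thus we see that $\varphi(n)/n$ is not often very small.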

Proof of Theorem 2.14 The desired bound is obtained by adding the two estimates

\[
\sum_{n\le x}f(n)\log\frac{x}{n} \ll x\sum_{n\le x}\frac{f(n)}{n}, \tag{2.33}
\]

\[
\sum_{n\le x}f(n)\log n \ll Ax\sum_{n\le x}\frac{f(n)}{n}. \tag{2.34}
\]

The first of these is immediate, since $f \ge 0$ and $\log x/n \ll x/n$ uniformly for $1 \le n \le x$. Since $\log n = \sum_{d\mid n}\Lambda(d)$, the second sum is

\[
\sum_{d\le x}\Lambda(d)\sum_{m\le x/d}f(md).
\]

Writing $d = p^i$, $m = p^jr$ where $p \nmid r$, we see that this is

\[
\sum_{\substack{p,\,i\ge 1,\,j\ge 0\\ p^{i+j}\le x}}(\log p)f(p^{i+j})\sum_{\substack{r\le x/p^{i+j}\\ p\nmid r}}f(r) = \sum_{\substack{p,\,k\\ p^k\le x}}k(\log p)f(p^k)\sum_{\substack{r\le x/p^k\\ p\nmid r}}f(r).
\]

Here we have put $i + j = k$; each $k$ arises from exactly $k$ pairs $(i, j)$. We now drop the condition $p \nmid r$ on the right-hand side, and consider first the contribution of the proper prime powers (i.e., $k \ge 2$). By (2.30) with $x$ replaced by $x/p^k$ we see that the terms for which $k \ge 2$ contribute

\[
\le x\sum_{p,\,k\ge 2}(\log p^k)f(p^k)p^{-k}\sum_{r\le x/p^k}f(r)/r \le Ax\sum_{n\le x}f(n)/n
\]

by (2.29). It remains to bound

\[
\sum_{p\le x}(\log p)f(p)\sum_{r\le x/p}f(r) = \sum_{r\le x}f(r)\sum_{p\le x/r}f(p)\log p.
\]

By (2.28) this is $\le Ax\sum_{r\le x}f(r)/r$, so we have (2.34) and the proof is complete.

In the above proof we made no use of prime number estimates, but as we have seen the estimates of Chebyshev are useful in verifying the hypotheses and Mertens’ formula is helpful in estimating the sum $\sum_{n\le x}f(n)/n$.

2.3.1 Exercises

1. Let $\sigma(n) = \sum_{d\mid n}d$.
(a) Show that $\sigma(n)\varphi(n) \le n^2$ for all $n \ge 1$.
(b) Deduce that $n + 1 \le \sigma(n) \le e^{C_0}n\bigl(\log\log n + O(1)\bigr)$ for all $n \ge 3$.
2. Show that $d(n) \le \sqrt{3n}$ with equality if and only if $n = 12$.
3. Let $f(n) = \prod_{p\mid n}(1 + p^{-1/2})$.
(a) Show that there is a constant $a$ such that if $n \ge 3$, then

\[
f(n) < \exp\bigl(a(\log n)^{1/2}(\log\log n)^{-1}\bigr).
\]

(b) Show that $\sum_{n\le x}f(n) = cx + O\bigl(x^{1/2}\bigr)$ where $c = \prod_p(1 + p^{-3/2})$.
4. Let $d_k(n)$ be as in Exercise 2.1.18. Show that if $k$ and $\kappa$ are fixed, then

\[
\sum_{n\le x}d_k(n)^{\kappa} \ll x(\log x)^{k^{\kappa}-1}
\]

for $x \ge 2$.
5. (Davenport 1932) Let

\[
f(n) = -\sum_{d\mid n}\frac{\mu(d)\log d}{d}.
\]

(a) By recalling Exercise 2.1.16(a), or otherwise, show that $f(n) \ge 0$ for all $n$.
(b) Show that $f(n) \ll \log\log n$ for $n \ge 3$.
(c) Show that $f(n) \sim \tfrac14\log\log n$ if $n = \prod_{y<p\le y^2}p$.
(d) Show that $f(n) \le \bigl(\tfrac14 + o(1)\bigr)\log\log n$ as $n \to \infty$.
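6. (cf. Bateman & Grosswald 1958) Let $\mathcal{F}$ be the set of ‘power-full’ numbers where $n$ is power-full if $p \mid n \Rightarrow p^2 \mid n$.
(a) Show that

\[
\sum_{n\in\mathcal{F}}n^{-s} = \frac{\zeta(2s)\zeta(3s)}{\zeta(6s)}
\]

for $\sigma > 1/2$.
(b) Show that

\[
\sum_{\substack{a,b,c\\ a^2b^3c^6=n}}\mu(c) = \begin{cases}1 & \text{if } n \in \mathcal{F},\\ 0 & \text{otherwise.}\end{cases}
\]

(c) Show that

\[
\sum_{a^2b^3\le y}1 = \zeta(3/2)y^{1/2} + \zeta(2/3)y^{1/3} + O\bigl(y^{1/5}\bigr).
\]

(d) Show that

\[
\sum_{\substack{n\le x\\ n\in\mathcal{F}}}1 = \frac{\zeta(3/2)}{\zeta(3)}x^{1/2} + \frac{\zeta(2/3)}{\zeta(2)}x^{1/3} + O\bigl(x^{1/5}\bigr).
\]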
7. (Bateman 1949) Let $\Phi_q(z)$ denote the $q$th cyclotomic polynomial,

\[
\Phi_q(z) = \prod_{\substack{a=1\\(a,q)=1}}^{q}(z - e(a/q))
\]

where $e(\theta) = e^{2\pi i\theta}$.
(a) Show that

\[
\prod_{d\mid q}\Phi_d(z) = z^q - 1.
\]

(b) Show that

\[
\Phi_q(z) = \prod_{d\mid q}(z^d - 1)^{\mu(q/d)}.
\]

(c) If $P(z) = \sum p_nz^n$ and $Q(z) = \sum q_nz^n$ are polynomials with real coefficients, then we say that $P \ll Q$ if $|p_n| \le q_n$ for all non-negative integers $n$. Show that if $P_1 \ll Q_1$ and $P_2 \ll Q_2$, then $P_1 + P_2 \ll Q_1 + Q_2$ and $P_1P_2 \ll Q_1Q_2$.
(d) Show that $\Phi_q(z) \ll Q_q(z)$ where

\[
Q_q(z) = \prod_{d\mid q}(1 + z^d + z^{2d} + \cdots + z^{q-d}).
\]

(e) Show that $Q_q(1) = q^{d(q)/2}$.
(f) Show that for any $\varepsilon > 0$ there is a $q_0(\varepsilon)$ such that if $q > q_0(\varepsilon)$, then all coefficients of $\Phi_q$ have absolute value not exceeding

\[
\exp\bigl(q^{(\log 2+\varepsilon)/\log\log q}\bigr).
\]
8. (Turán 1934) (a) Show that the first sum in (2.26) is

\[
= x\sum_{p_1p_2\le x}\frac{1}{p_1p_2} + O(x).
\]

(b) Explain why the sum above is

\[
\Bigl(\sum_{p\le x}\frac{1}{p}\Bigr)^2 - 2\sum_{p_1\le\sqrt x}\frac{1}{p_1}\sum_{x/p_1<p_2\le x}\frac{1}{p_2} - \Bigl(\sum_{\sqrt x<p\le x}\frac{1}{p}\Bigr)^2. \tag{2.35}
\]

(c) Show that if $y \le \sqrt x$, then

\[
\sum_{x/y<p\le x}\frac{1}{p} = \log\log x - \log\log(x/y) + O(1/\log x).
\]

(d) Show that the right-hand side above is $\ll (\log y)/\log x$.
(e) Deduce that the second and third terms in (2.35) are $\ll 1$.
(f) Conclude that

\[
\Sigma_2 = x(\log\log x)^2 + (2b+1)x\log\log x + O(x)
\]

where $b$ is the constant in Theorem 2.7(d).
(g) Show that the left-hand side of (2.23) is $= x\log\log x + O(x)$.
(h) Show that the left-hand side of (2.24) is $= x\log\log x + O(x)$.
9. (cf. Pomerance 1977, Shan 1985) Note that $\varphi(n) \mid (n-1)$ when $n$ is prime. An old – and still unsolved – problem of D. H. Lehmer asks whether there exists a composite integer $n$ such that $\varphi(n) \mid (n-1)$. Let $\mathcal{S}$ denote the (presumably empty) set of such numbers.
(a) Show that if $n \in \mathcal{S}$, then $n$ is square-free.
(b) Suppose that $mp \in \mathcal{S}$. Show that $m \equiv 1 \pmod{p-1}$.
(c) Let $p$ be given. Show that the number of $m$ such that $mp \le x$ and $mp \in \mathcal{S}$ is $\ll x/p^2$.
(d) Show that the number of $n \in \mathcal{S}$, $n \le x$, such that $n$ has a prime factor $> y$ is $\ll x/(y\log y)$.
(e) Suppose that $x/y < n \le x$ and that $n$ is composed entirely of primes $p \le y$. Show that $\omega(n) \ge (\log x)/(\log y) - 1$.
(f) By Exercise 4, or otherwise, show that the number of $n \le x$ such that $\omega(n) \ge z$ is $\ll x(\log x)^2/3^z$.
(g) Conclude that the number of $n \le x$ such that $n \in \mathcal{S}$ is $\ll x/\exp\bigl(\sqrt{\log x}\bigr)$.

2.4 The distribution of Ω(n) − ω(n)

In order to illustrate further the use of elementary techniques we now discuss an elegant result of Rényi, which asserts that the set of numbers $n$ such that $\Omega(n) - \omega(n) = k$ has density $d_k$, where the $d_k$ are the power series coefficients of the meromorphic function

\[
F(z) = \sum_{k=0}^{\infty}d_kz^k = \prod_p\Bigl(1-\frac{1}{p}\Bigr)\Bigl(1 + \frac{1}{p-z}\Bigr). \tag{2.36}
\]

By examining this product we see that $F$ has simple poles at the points $z = p$ ($p \ne 3$), and simple zeros at the points $z = p + 1$ ($p \ne 2$), so that the power series converges for $|z| < 2$. We let $N_k(x)$ denote the number of $n \le x$ for which $\Omega(n) - \omega(n) = k$; our object is to show that $N_k(x) \sim d_kx$. If this holds for each $k$ then we can deduce that $\sum d_k \le 1$. By taking $z = 1$ in (2.36) we see that $\sum d_k = 1$, which gives us hope that the asymptotic relation may be fairly uniform in $k$. This is indeed the case, as we see from the following quantitative form of Rényi’s theorem.

Theorem 2.16 For any non-negative integer $k$, and any $x \ge 2$,

\[
N_k(x) = d_kx + O\Bigl(\bigl(\tfrac34\bigr)^kx^{1/2}(\log x)^{4/3}\Bigr).
\]
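The first two densities can be compared with actual counts. The sketch below (names hypothetical) tabulates $\Omega(n) - \omega(n)$ by sieving over prime powers $p^k$ with $k \ge 2$ and counts $N_0(x)$ and $N_1(x)$. For the densities it uses the series for $d_k$ over power-full $f$ obtained in the proof of Theorem 2.16: only $f = 1$ has $\Omega(f) - \omega(f) = 0$, and only $f = p^2$ has $\Omega(f) - \omega(f) = 1$, which gives $d_0 = 6/\pi^2$ and $d_1 = (6/\pi^2)\sum_p 1/(p(p+1))$. The sum over $p$ is truncated at $x$.

```python
import math

def excess_table(x):
    """Return (excess, primes), where excess[n] = Omega(n) - omega(n), 0 <= n <= x."""
    excess = [0] * (x + 1)
    composite = bytearray(x + 1)
    primes = []
    for p in range(2, x + 1):
        if composite[p]:
            continue
        primes.append(p)
        for m in range(p * p, x + 1, p):
            composite[m] = 1
        pk = p * p
        while pk <= x:  # each p^k (k >= 2) dividing n adds 1 to Omega - omega
            for m in range(pk, x + 1, pk):
                excess[m] += 1
            pk *= p
    return excess, primes

x = 10**5
excess, primes = excess_table(x)
N0 = sum(1 for n in range(1, x + 1) if excess[n] == 0)
N1 = sum(1 for n in range(1, x + 1) if excess[n] == 1)

d0 = 6 / math.pi**2
d1 = d0 * sum(1 / (p * (p + 1)) for p in primes)  # truncated at p <= x

print(f"N_0(x)/x = {N0 / x:.5f}   d_0 = {d0:.5f}")
print(f"N_1(x)/x = {N1 / x:.5f}   d_1 = {d1:.5f}")
```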

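In preparation for the proof of this result we first establish a subsidiary estimate.

Lemma 2.17 For any $y \ge 0$ and any natural number $f$,

\[
\sum_{\substack{n\le y\\(n,f)=1}}\mu(n)^2 = \frac{6}{\pi^2}\prod_{p\mid f}\Bigl(1+\frac{1}{p}\Bigr)^{-1}y + O\Bigl(y^{1/2}\prod_{p\mid f}\bigl(1-p^{-1/2}\bigr)^{-1}\Bigr).
\]

Proof Let $\mathcal{D} = \{d : p \mid d \Rightarrow p \mid f\}$. By considering the Dirichlet series identity

\[
\sum_{\substack{n=1\\(n,f)=1}}^{\infty}\mu(n)^2n^{-s} = \prod_{p\nmid f}(1+p^{-s}) = \frac{\zeta(s)}{\zeta(2s)}\prod_{p\mid f}(1+p^{-s})^{-1} = \frac{\zeta(s)}{\zeta(2s)}\sum_{d\in\mathcal{D}}\lambda(d)d^{-s},
\]

or by elementary considerations, we see that the characteristic function of the set of those square-free $n$ such that $(n, f) = 1$ may be written

\[
\sum_{\substack{dm=n\\ d\in\mathcal{D}}}\lambda(d)\mu(m)^2.
\]

Hence the sum in question is

\[
\sum_{d\in\mathcal{D}}\lambda(d)\sum_{m\le y/d}\mu(m)^2 = \sum_{d\in\mathcal{D}}\lambda(d)\Bigl(\frac{6}{\pi^2}\cdot\frac{y}{d} + O\bigl(y^{1/2}d^{-1/2}\bigr)\Bigr)
\]

by Theorem 2.2. But $\sum_{d\in\mathcal{D}}\lambda(d)/d = \prod_{p\mid f}(1+1/p)^{-1}$ and $\sum_{d\in\mathcal{D}}d^{-1/2} = \prod_{p\mid f}(1-p^{-1/2})^{-1}$, so that the proof is complete.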

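Proof of Theorem 2.16 Let $\mathcal{Q}$ denote the set of square-free numbers and $\mathcal{F}$ denote the set of ‘power-full’ numbers (i.e., those $f$ such that $p \mid f \Rightarrow p^2 \mid f$). Every number is uniquely expressible in the form $n = qf$, $q \in \mathcal{Q}$, $f \in \mathcal{F}$, $(q, f) = 1$. Hence

\[
N_k(x) = \sum_{\substack{f\le x\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}\ \sum_{\substack{q\le x/f\\ q\in\mathcal{Q}\\(q,f)=1}}1.
\]

By Lemma 2.17 this is

\[
\frac{6}{\pi^2}x\sum_{\substack{f\le x\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}\frac{1}{f}\prod_{p\mid f}(1+p^{-1})^{-1} + O\Biggl(x^{1/2}\sum_{\substack{f\le x\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}f^{-1/2}\prod_{p\mid f}\bigl(1-p^{-1/2}\bigr)^{-1}\Biggr).
\]

In order to appreciate the nature of these sums it is helpful to observe that each member of $\mathcal{F}$ is uniquely of the form $a^2b^3$ with $b$ square-free, so that there are $\asymp x^{1/2}$ members of $\mathcal{F}$ not exceeding $x$. Suppose that $z \ge 1$. Then the sum in the error term is

\[
\le z^{-k}\sum_{\substack{f\le x\\ f\in\mathcal{F}}}z^{\Omega(f)-\omega(f)}f^{-1/2}\prod_{p\mid f}\bigl(1-p^{-1/2}\bigr)^{-1}.
\]

Since $\Omega(f) - \omega(f)$ is an additive function, it follows that $z^{\Omega(f)-\omega(f)}$ is a multiplicative function. Hence the above is

\[
\le z^{-k}\prod_{p\le x}\Bigl(1 + \bigl(1-p^{-1/2}\bigr)^{-1}\Bigl(\frac{z}{p} + \frac{z^2}{p^{3/2}} + \frac{z^3}{p^2} + \cdots\Bigr)\Bigr).
\]

When $p = 2$ the sum converges only for $z < \sqrt2$. Hence we take $z = 4/3$, and then the product is

\[
\le \prod_{p\le x}\Bigl(1 + \frac{4}{3p} + \frac{C}{p^{3/2}}\Bigr) \ll (\log x)^{4/3}
\]

by Mertens’ formula. Thus

\[
\sum_{\substack{f\le x\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}f^{-1/2}\prod_{p\mid f}\bigl(1-p^{-1/2}\bigr)^{-1} \ll \bigl(\tfrac34\bigr)^k(\log x)^{4/3}
\]

which suffices for the error term.

We now consider the effect of dropping the condition $f \le x$ in the main term. Since

\[
\begin{aligned}
\sum_{\substack{U<f\le 2U\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}\frac{1}{f}\prod_{p\mid f}\Bigl(1+\frac{1}{p}\Bigr)^{-1} &\le U^{-1/2}\sum_{\substack{U<f\le 2U\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}f^{-1/2}\prod_{p\mid f}\bigl(1-p^{-1/2}\bigr)^{-1}\\
&\ll U^{-1/2}\bigl(\tfrac34\bigr)^k(\log 2U)^{4/3},
\end{aligned}
\]

on taking $U = x2^r$ and summing over $r \ge 0$ we see that the omitted terms, those with $f > x$, satisfy

\[
\sum_{\substack{f>x\\ f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}\frac{1}{f}\prod_{p\mid f}\Bigl(1+\frac{1}{p}\Bigr)^{-1} \ll x^{-1/2}\bigl(\tfrac34\bigr)^k(\log x)^{4/3}.
\]

Hence we have the stated result with

\[
d_k = \frac{6}{\pi^2}\sum_{\substack{f\in\mathcal{F}\\ \Omega(f)-\omega(f)=k}}\frac{1}{f}\prod_{p\mid f}\Bigl(1+\frac{1}{p}\Bigr)^{-1}.
\]

To see that (2.36) holds, it suffices to multiply this by $z^k$ and sum over $k$.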

2.4.1 Exercise
1. Let $d_k$ be as in (2.36). Show that

\[
d_k = c2^{-k} + O(5^{-k})
\]

where

\[
c = \frac14\prod_{p>2}\Bigl(1 - \frac{1}{(p-1)^2}\Bigr)^{-1}.
\]

2.5 Notes

Section 2.1. Mertens (1874a) showed that $\sum_{n\le x}\varphi(n) = 3x^2/\pi^2 + O(x\log x)$. This refines an earlier estimate of Dirichlet, and is equivalent to Theorem 2.1, by partial summation. Let $R(x)$ denote the error term in Theorem 2.1. Chowla (1932) showed that

\[
\int_1^xR(u)^2\,du \sim \frac{x}{2\pi^2}
\]

as $x \to \infty$, and Walfisz (1963, p. 144) showed that

\[
R(x) \ll (\log x)^{2/3}(\log\log x)^{4/3}.
\]

In the opposite direction, Pillai & Chowla (1930) showed (cf. Exercise 7.3.6) that $R(x) = \Omega(\log\log\log x)$. That the error term changes sign infinitely often was first proved by Erdős & Shapiro (1951), who showed that $R(x) = \Omega_\pm(\log\log\log\log x)$. More recently, Montgomery (1987) showed that $R(x) = \Omega_\pm\bigl(\sqrt{\log\log x}\bigr)$. It may be speculated that $R(x) \ll \log\log x$ and that $R(x) = \Omega_\pm(\log\log x)$.
Theorem 2.2 is due to Gegenbauer (1885).
Theorem 2.3 is due to Dirichlet (1849). The problem of improving the error term in this theorem is known as the Dirichlet divisor problem. Let $\Delta(x)$ denote the error term. Voronoï (1903) showed that $\Delta(x) \ll x^{1/3}\log x$ (see Exercises 2.1.23, 2.1.25, 2.1.26). van der Corput (1922) used estimates of exponential sums to show that $\Delta(x) \ll x^{33/100+\varepsilon}$. This exponent has since been reduced by van der Corput (1928), Chih (1950), Richert (1953), Kolesnik (1969, 1973, 1982, 1985), Iwaniec & Mozzochi (1988), and by Huxley (1993), who showed that $\Delta(x) \ll x^{23/73+\varepsilon}$. In the opposite direction, Hardy (1916) showed that $\Delta(x) = \Omega_\pm(x^{1/4})$. Soundararajan (2003) showed that

\[
\Delta(x) = \Omega\bigl(x^{1/4}(\log x)^{1/4}(\log\log x)^{b}(\log\log\log x)^{-5/8}\bigr)
\]

with $b = \tfrac34(2^{4/3} - 1)$, and it is plausible that the first three exponents above are optimal.
The result of Exercise 2.1.12 generalizes to $\mathbb{R}^n$: A lattice point $(a_1, a_2, \ldots, a_n) \in \mathbb{Z}^n$ is said to be primitive if $\gcd(a_1, a_2, \ldots, a_n) = 1$. The asymptotic density of primitive lattice points is easily shown to be $1/\zeta(n)$. In addition, Cai & Bach (2003) have shown that the density of lattice points $\mathbf{a} \in \mathbb{Z}^n$ such that $\gcd(a_i, a_j) = 1$ for all pairs with $1 \le i < j \le n$ is

\[
\prod_p\Bigl(\Bigl(1-\frac{1}{p}\Bigr)^n + \frac{n}{p}\Bigl(1-\frac{1}{p}\Bigr)^{n-1}\Bigr).
\]

Section 2.2. Chebyshev (1848) used the asymptotics of log ζ (σ ) as σ → 1+

to obtain Corollary 2.8. In his second paper on prime numbers, Chebyshev
(1850) introduced the notations ϑ(x), ψ(x), T (x), and proved Theorem 2.4,
Corollaries 2.5, 2.6, Theorem 2.7(a), and the results of Exercise 2.2.5. Sylvester
(1881) devised a more complicated choice of the ad that gave better constants
than those of Chebyshev. Diamond & Erdős (1980) have shown that for any
ε > 0 it is possible to choose numbers ad as in the proof of Theorem 2.4 to
show that (1 − ε)x < ψ(x) < (1 + ε)x for all sufﬁciently large x. This does
not constitute a proof of the Prime Number Theorem, because the PNT is used
in the proof. Chebyshev (1850) also used his main results to prove Bertrand’s
postulate. Simpler proofs have been devised by various authors. For an easy
exposition, see Theorem 8.7 of Niven, Zuckerman & Montgomery (1991).
Richert (1949a, b) (cf. Ma̧kowski 1960) used Bertrand’s postulate to show that
every integer > 6 can be expressed as a sum of distinct primes. Rosser &
Schoenfeld (1962, 1975) and Schoenfeld (1976) have given a large number of
very useful explicit estimates for primes and for the Chebyshev functions, of
which one example is that $\pi(x) > x/\log x$ for all $x \ge 17$. For the $k$th prime number, $p_k$, Dusart (1999) has given the lower bound

\[
p_k > k(\log k + \log\log k - 1)
\]

for $k \ge 2$. For further explicit estimates, see Schoenfeld (1969), Costa Pereira (1989), and Massias & Robin (1996). In Exercise 2.2.1 we find that $\psi(x) \ge cx + O(1)$ with $c = \tfrac12\log 5 = 0.8047\ldots$. This approach is mentioned by Gel’fond, in his editorial remarks in the Collected Works of Chebyshev (1946, pp. 285–288). Polynomials can be found that produce better constants, but
Gorshkov (1956) showed that the supremum of such constants is < 1, so
the Prime Number Theorem cannot be established by this method. For more
on this subject, see Montgomery (1994, Chapter 10), Pritsker (1999), and
Borwein (2002, Chapter 10).
Theorem 2.7(b)–(e) is due to Mertens (1874a, b). Our determination of the
constant in Theorem 2.7(e) incorporates an expository finesse due to Heath-
Brown.
Section 2.3. Theorem 2.9 is due to Landau (1903). Runge (1885) proved
(2.20), and Wigert (1906/7) showed that $d(n) < n^{(\log 2+\varepsilon)/\log\log n}$ for $n > n_0(\varepsilon)$.
Ramanujan (1915a, b) established the upper bound of Theorem 2.11, first with
an extra log log log n in the error term, and then without. Ramanujan (1915b)
also proved that
log d(n)
< li(n) + O n exp − c log n
log 2
for all n ≥ 2, and that
log d(n)
> li(n) + O n exp − c log n
log 2
for infinitely many n. For a survey of extreme value estimates of arithmetic
functions, see Nicolas (1988).
Theorem 2.12 is due to Turán (1934), although Corollary 2.13 and the es-
timate (2.22) used in the proof of Theorem 2.12 were established earlier by
Hardy & Ramanujan (1917). Kubilius (1956) generalized Turán’s inequality to
arbitrary additive functions. See Tenenbaum (1995, pp. 302–304) for a proof,
and discussion of the sharpest constants.
Theorem 2.14 is due to Hall & Tenenbaum (1988, pp. 2, 11). It represents a weakening of sharper estimates that can be derived with more work. For example, Wirsing (1961) showed that if $f$ is a multiplicative function such that $f(n) \ge 0$ for all $n$, if there is a constant $C < 2$ such that $f(p^k) \ll C^k$ for all $k \ge 2$, and if

\[
\sum_{p\le x}f(p) \sim \kappa x/\log x
\]

as $x \to \infty$ where $\kappa$ is a positive real number, then

\[
\sum_{n\le x}f(n) \sim \frac{e^{-C_0\kappa}}{\Gamma(\kappa)}\frac{x}{\log x}\prod_{p\le x}\Bigl(1 + \frac{f(p)}{p} + \frac{f(p^2)}{p^2} + \cdots\Bigr).
\]

For more information concerning non-negative multiplicative functions, see Wirsing (1967), Hall (1974), Halberstam & Richert (1979), and Hildebrand (1984, 1986, 1987). For a comprehensive account of the mean values of (not necessarily non-negative) multiplicative functions, see Tenenbaum (1995, pp. 48–50, 308–310, 325–357). The two sides of (2.31) are of the same order of magnitude, and with more work one can derive a more precise asymptotic estimate; see Wilson (1922).
Section 2.4. Rényi (1955) gave a qualitative form of Theorem 2.16. Robinson
(1966) gave formulæ for the densities dk . Kac (1959, pp. 64–71) gave a proof
by probabilistic techniques. Generalizations have been given by Cohen (1964)
and Kubilius (1964). Sharper estimates for the error term have been derived
by Delange (1965, 1967/68, 1973), Kátai (1966), Saffari (1970), and Schwarz
(1970).
For a much more detailed historical account of the development of prime
number theory, see Narkiewicz (2000).

2.6 References
Bateman, P. T. (1949). Note on the coefﬁcients of the cyclotomic polynomial, Bull. Amer.
Math. Soc. 55, 1180–1181.
Bateman, P. T. & Grosswald, E. (1958). On a theorem of Erdős and Szekeres, Illinois J.
Math. 2, 88–98.
Bombieri, E. & Pila, J. (1989). The number of integral points on arcs and ovals, Duke
Math. J. 59, 337–357.
Borwein, P. (2002). Computational excursions in analysis and number theory. Canadian
Math. Soc., New York: Springer.
Cai, J.-Y. & Bach, E. (2003). On testing for zero polynomials by a set of points with
bounded precision, Theoret. Comp. Sci. 296, 15–25.
Chebyshev, P. L. (1848). Sur la fonction qui détermine la totalité des nombres premiers
inférieurs à une limite donné, Mem. Acad. Sci. St. Petersburg 6, 1–19.
(1850). Mémoire sur nombres premiers, Mem. Acad. Sci. St. Petersburg 7, 17–33.
(1946). Collected works of P. L. Chebyshev, Vol. 1, Akad. Nauk SSSR, Moscow–
Leningrad.
Chih, T.-T. (1950). A divisor problem, Acta Sinica Sci. Record 3, 177–182.
Chowla, S. (1932). Contributions to the analytic theory of numbers, Math. Zeit. 35,
279–299.
Cohen, E. (1964). Some asymptotic formulas in the theory of numbers, Trans. Amer.
Math. Soc. 112, 214–227.
van der Corput, J. G. (1922). Verschärfung der Abschätzung beim Teilerproblem, Math.
Ann. 87, 39–65.
(1928). Zum Teilerproblem, Math. Ann. 98, 697–716.
Costa Pereira, N. (1989). Elementary estimates for the Chebyshev function ψ(x) and
for the Möbius function M(x), Acta Arith. 52, 307–337.
Davenport, H. (1932). On a generalization of Euler’s function φ(n), J. London Math. Soc.
7, 290–296; Collected Works, Vol. IV. London: Academic Press, pp. 1827–1833.
Delange, H. (1965). Sur un théorème de Rényi, Acta Arith. 11, 241–252.
(1967/68). Sur un théorème de Rényi, II, Acta Arith. 13, 339–362.
(1973). Sur un théorème de Rényi, III, Acta Arith. 23, 157–182.
Diamond, H. G. & Erdős, P. (1980). On sharp elementary prime number estimates,
Enseignement Math. (2) 26, 313–321.
Dirichlet, L. (1849). Über die Bestimmung der mittleren Werthe in der Zahlentheorie,
Math. Abhandl. Königl. Akad. Wiss. Berlin, 69–83; Werke, Vol. 2, pp. 49–66.
Duncan, R. L. (1965). The Schnirelmann density of the k-free integers, Proc. Amer.
Math. Soc. 16, 1090–1091.
Dusart, P. (1999). The kth prime is greater than k(log k + log log k − 1) for k ≥ 2, Math.
Comp. 68, 411–415.
Erdős, P. & Shapiro, H. N. (1951). On the change of sign of a certain error function,
Canadian J. Math. 3, 375–385.
Erdős, P. & Szekeres, G. (1934). Über die Anzahl der Abelschen Gruppen gegebener
Ordnung und über ein verwandtes zahlentheoretisches Problem, Acta Litt. Sci.
Szeged 7, 95–102.
Evelyn, C. J. A. & Linfoot, E. H. (1930). On a problem in the additive theory of numbers,
II, J. Reine Angew. Math. 164, 131–140.
Feller, W. & Tornier, E. (1932). Mengentheoretische Untersuchungen von Eigenschaften
der Zahlenreihe, Math. Ann. 107, 188–232.
Gegenbauer, L. (1885). Asymptotische Gesetze der Zahlentheorie, Denkschriften
Österreich. Akad. Wiss. Math.-Natur. Cl. 49, 37–80.
Golomb, S. (1992). An inequality for $\binom{2n}{n}$, Amer. Math. Monthly 99, 746–748.
Gorshkov, L. S. (1956). On the deviation of polynomials with rational integer coefﬁcients
from zero on the interval [0, 1]. Proceedings of the 3rd All-union congress of Soviet
mathematicians, Vol. 3, Moscow, pp. 5–7.
Grosswald, E. (1956). The average order of an arithmetic function, Duke Math. J. 23,
41–44.
Halberstam, H. & Richert, H.-E. (1979). On a result of R. R. Hall, J. Number Theory
11, 76–89.
Hall, R. R. (1974). Halving an estimate obtained from the Selberg upper bound method,
Acta Arith. 25, 487–500.
Hall, R. R. & Tenenbaum, G. (1988). Divisors, Cambridge Tract 90. Cambridge: Cam-
bridge University Press.
Hardy, G. H. (1916). On Dirichlet’s divisor problem, Proc. London Math. Soc. (2)
15, 1–25; Collected Papers, Vol. 2. Cambridge: Cambridge University Press,
pp. 268–292.
Hardy, G. H. & Ramanujan, S. (1917). The normal order of prime factors of a number
n, Quart. J. Math. 48, 76–92; Collected Papers, Vol. II. Oxford: Oxford University
Press, 100–113.
Hartman, P. & Wintner, A. (1947). On Möbius’ inversion, Amer. J. Math. 69, 853–858.
Hildebrand, A. (1984). Quantitative mean value theorems for non-negative multiplicative
functions I, J. London Math. Soc. (2) 30, 394–406.
(1986). On Wirsing’s mean value theorem for multiplicative functions, Bull. London
Math. Soc. 18, 147–152.
(1987). Quantitative mean value theorems for non-negative multiplicative functions
II, Acta Arith. 48, 209–260.
Hille, E. (1937). The inversion problem of Möbius, Duke Math. J. 3, 549–568.
Huxley, M. N. (1993). Exponential sums and lattice points II. Proc. London Math. Soc.
(3) 66, 279–301.
Iwaniec, H. & Mozzochi, C. J. (1988). On the divisor and circle problems, J. Number
Theory 29, 60–93.
Jarnı́k, V. (1926). Über die Gitterpunkte auf konvexen Curven, Math. Z. 24, 500–
518.
Kac, M. (1959). Statistical Independence in Probability, Analysis and Number Theory,
Carus Monograph 12. Washington: Math. Assoc. Amer.
Kátai, I. (1966). A remark on H. Delange’s paper “Sur un théorème de Rényi”, Magyar
Tud. Akad. Mat. Fiz. Oszt. Közl. 16, 269–273.
Kolesnik, G. (1969). The improvement of the error term in the divisor problem, Mat.
Zametki 6, 545–554.
(1973). On the estimation of the error term in the divisor problem, Acta Arith. 25,
7–30.
(1982). On the order of $\zeta(\tfrac12 + it)$ and $\Delta(R)$, Pacific J. Math. 82, 107–122.
(1985). On the method of exponent pairs, Acta Arith. 45, 115–143.
Kubilius, J. (1956). Probabilistic methods in the theory of numbers (in Russian), Uspehi
Mat. Nauk 11, 31–66; Amer. Math. Soc. Transl. (2) 19 (1962), 47–85.
(1964). Probabilistic Methods in the Theory of Numbers, Translations of Mathematical
Monographs, Vol. 11. Providence: American Mathematical Society.
Landau, E. (1900). Ueber die zahlentheoretische Function ϕ(n) und ihre Beziehung zum
Goldbachschen Satz, Nachr. Akad. Wiss. Göttingen, 177–186; Collected Works,
Vol. 1. Essen: Thales Verlag, 1985, pp. 106–115.
(1903). Über den Verlauf der zahlentheoretischen Funktion ϕ(x), Arch. Math. Phys.
(3) 5, 86–91; Collected Works, Vol. 1. Essen: Thales Verlag, 1985, pp. 378–383.
(1911). Sur les valeurs moyennes de certaines fonctions arithmétiques, Bull. Acad.
Royale Belgique, 443–472; Collected Works, Vol. 4. Essen: Thales Verlag, 1986,
pp. 377–406.
(1936). On a Titchmarsh–Estermann sum, J. London Math. Soc. 11, 242–245;
Collected Works, Vol. 9. Essen: Thales Verlag, 1987, pp. 393–396.
Linfoot, E. H. & Evelyn, C. J. A. (1929). On a problem in the additive theory of numbers,
I, J. Reine Angew. Math. 164, 131–140.
Ma̧kowski, A. (1960). Partitions into unequal primes, Bull. Acad. Pol. Sci. 8, 125–126.
Massias, J.-P. & Robin, G. (1996). Bornes effectives pour certaines fonctions concernant
les nombres premiers, J. Théor. Nombres Bordeaux 8, 215–242.
Mertens, F. (1874a). Ueber einige asymptotische Gesetze der Zahlentheorie, J. Reine
Angew. Math. 77, 289–338.
(1874b). Ein Beitrag zur analytischen Zahlentheorie, J. Reine Angew. Math. 78,
46–62.
Montgomery, H. L. (1987). Fluctuations in the mean of Euler’s phi function, Proc. Indian
Acad. Sci. (Math. Sci.) 97, 239–245.
(1994). Ten Lectures on the Interface of Analytic Number Theory and Harmonic
Analysis, CBMS 84. Providence: Amer. Math. Soc.
Narkiewicz, W. (2000). The Development of Prime Number Theory. Berlin: Springer-
Verlag.
Nicolas, J.-L. (1988). On Highly Composite Numbers. Ramanujan Revisited (G. E.
Andrews, R. A. Askey, B. C. Berndt, K. G. Ramanathan, R. A. Rankin, eds.). New
York: Academic Press, pp. 215–244.
Niven, I., Zuckerman, H. S. & Montgomery, H. L. (1991). An Introduction to the Theory
of Numbers, Fifth edition. New York: Wiley & Sons.
Nowak, W. G. (1989). On an error term involving the totient function, Indian J. Pure
Appl. Math. 20, 537–542.
Orr, R. C. (1969). On the Schnirelmann density of the sequence of k-free integers, J.
London Math. Soc. 44, 313–319.
Pillai, S. S. & Chowla, S. D. (1930). On the error term in some formulae in the theory
of numbers (I), J. London Math. Soc. 5, 95–101.
Pomerance, C. (1977). On composite n for which ϕ(n)|(n − 1), II, Pacific J. Math. 69,
177–186.
Pritsker, I. E. (1999). Chebyshev Polynomials with Integer Coefficients, in Analytic and
Geometric Inequalities and Applications, Math. Appl. 478. Dordrecht: Kluwer,
pp. 335–348.
Ramanujan, S. (1915a). On the number of divisors of a number, J. Indian Math. Soc.
7, 131–133; Collected Papers, Cambridge: Cambridge University Press, 1927,
pp. 44–46.
(1915b). Highly composite numbers, Proc. London Math. Soc. (2) 14, 347–409;
Collected Papers, Cambridge: Cambridge University Press, 1927, pp. 78–128.
Rényi, A. (1955). On the density of certain sequences of integers, Acad. Serbe Sci. Publ.
Inst. Math. 8, 157–162.
Richert, H.-E. (1949a). Über Zerfällungen in ungleiche Primzahlen, Math. Z. 52, 342–
343.
(1949b). Über Zerlegungen in paarweise verschiedene Zahlen, Norsk Mat. Tidsskr.
31, 120–122.
(1953). Verschärfung der Abschätzung beim Dirichletschen Teilerproblem, Math. Z.
58, 204–218.
Robinson, R. L. (1966). An estimate for the enumerative functions of certain sets of
integers, Proc. Amer. Math. Soc. 17, 232–237; Errata, 1474.
Rogers, K. (1964). The Schnirelmann density of the square-free integers, Proc. Amer.
Math. Soc. 15, 515–516.
Rosser, J. B. & Schoenfeld, L. (1962). Approximate formulas for some functions of
prime numbers, Illinois J. Math. 6, 64–94.
(1975). Sharper bounds for the Chebyshev functions θ (x) and ψ(x), Math. Comp. 29,
243–269.
Runge, C. (1885). Über die auflösbaren Gleichungen von der Form $x^5 + ux + v = 0$,
Acta Math. 7, 173–186.
Saffari, B. (1970). Sur quelques applications de la “méthode de l’hyperbole” de Dirichlet
à la théorie des nombres premiers, Enseignement Math. (2) 14, 205–224.
Schmidt, P. G. (1967/68). Zur Anzahl Abelscher Gruppen gegebener Ordnung, II, Acta
Arith. 13, 405–417.
Schoenfeld, L. (1969). An improved estimate for the summatory function of the Möbius
function, Acta Arith. 15, 221–233.
(1976). Sharper bounds for the Chebyshev functions θ(x) and ψ(x), II, Math. Comp.
30, 337–360.
Schwarz, W. (1970). Eine Bemerkung zu einer asymptotischen Formel von Herrn Rényi,
Arch. Math. (Basel) 21, 157–166.
Shan, Z. (1985). On composite n for which ϕ(n)|(n − 1), J. China Univ. Sci. Tech. 15,
109–112.
Sitaramachandrarao, R. (1982). On an error term of Landau, Indian J. Pure Appl. Math.
13, 882–885.
(1985). On an error term of Landau, II, Rocky Mountain J. Math. 15, 579–588.
Soundararajan, K. (2003). Omega results for the divisor and circle problems, Int. Math.
Res. Not., 1987–1998.
Stieltjes, T. J. (1887). Note sur la multiplication de deux séries, Nouvelles Annales (3)
6, 210–215.
Sylvester, J. J. (1881). On Tchebycheff’s theory of the totality of the prime numbers
comprised within given limits, Amer. J. Math. 4, 230–247.
Tenenbaum, G. (1995). Introduction to Analytic and Probabilistic Number Theory, Cam-
bridge Studies 46, Cambridge: Cambridge University Press.
Turán, P. (1934). On a theorem of Hardy and Ramanujan, J. London Math. Soc. 9,
274–276.
de la Vallée Poussin, C. J. (1898). Sur les valeurs moyennes de certaines fonctions
arithmétiques, Ann. Soc. Sci. Bruxelles 22, 84–90.
Voronoı̈, G. (1903). Sur un problème du calcul des fonctions asymptotiques, J. Reine
Angew. Math. 126, 241–282.
Walﬁsz, A. (1963). Weylsche Exponentialsummen in der neueren Zahlentheorie, Math-
ematische Forschungsberichte 15, Berlin: VEB Deutscher Verlag Wiss.
Ward, D. R. (1927). Some series involving Euler’s function, J. London Math. Soc. 2,
210–214.
Wigert, S. (1906/7). Sur l’ordre de grandeur du nombre des diviseurs d’un entier, Ark.
Mat. 3, 1–9.
Wilson, B. M. (1922). Proofs of some formulæ enunciated by Ramanujan, Proc. London
Math. Soc. 21, 235–255.
Wintner, A. (1944). The Theory of Measure in Arithmetic Semigroups. Baltimore:
Waverly Press.
Wirsing, E. (1961). Das asymptotische Verhalten von Summen über multiplikative Funk-
tionen, Math. Ann. 143, 75–102.
(1967). Das asymptotische Verhalten von Summen über multiplikative Funktionen,
II, Acta Math. Acad. Sci. Hungar. 18, 411–467.
3
Principles and ﬁrst examples of sieve methods

3.1 Initiation
The aim of sieve theory is to construct estimates for the number of integers
remaining in a set after members of certain arithmetic progressions have been
discarded. If P is given, then the asymptotic density of the set of integers
relatively prime to P is ϕ(P)/P; with the aid of sieves we can estimate how
quickly this asymptotic behaviour is approached. Throughout this chapter we
let $S(x, y; P)$ denote the number of integers $n$ in the interval $x < n \le x + y$
for which $(n, P) = 1$. A first (weak) result is provided by

Theorem 3.1 (Eratosthenes–Legendre) For any real $x$, and any $y \ge 0$,

\[ S(x, y; P) = \frac{\varphi(P)}{P}\, y + O\left( 2^{\omega(P)} \right). \]

Of course if $y$ is an integral multiple of $P$ then the above holds with no error
term. Since $2^{\omega(P)} \le d(P) \ll P^{\varepsilon}$, the main term above is larger than the error
term if $y \ge P^{\varepsilon}$; thus the reduced residues are roughly uniformly distributed in
the interval $(0, P]$.

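Proof By the characteristic property of the Möbius function, $\sum_{d|m} \mu(d)$ is 1 if $m = 1$ and 0 otherwise. Hence

\[ S(x, y; P) = \sum_{x < n \le x+y} \sum_{d|(n,P)} \mu(d) = \sum_{d|P} \mu(d) \sum_{\substack{x < n \le x+y \\ d|n}} 1 = \sum_{d|P} \mu(d) \left( \left[ \frac{x+y}{d} \right] - \left[ \frac{x}{d} \right] \right). \tag{3.1} \]

Removing the square brackets, we see that this is

\[ = y \sum_{d|P} \frac{\mu(d)}{d} + O\Big( \sum_{d|P} |\mu(d)| \Big), \]

which is the desired result.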
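As a numerical illustration of Theorem 3.1, the following sketch counts $S(x, y; P)$ directly for integer $x$ and $y$ and compares it with the main term $y\varphi(P)/P$. The function names (`sifted_count`, `euler_phi`, `omega`) and the choice $P = 210$ are assumptions made for this example only. Each of the $2^{\omega(P)}$ terms in the proof contributes a rounding error of absolute value less than 1, so the observed deviation should stay below $2^{\omega(P)}$.

```python
from math import gcd

def sifted_count(x, y, P):
    """S(x, y; P) for integers x, y >= 0: count n in (x, x + y] with gcd(n, P) = 1."""
    return sum(1 for n in range(x + 1, x + y + 1) if gcd(n, P) == 1)

def euler_phi(P):
    """Euler's totient, by direct count (adequate for small P)."""
    return sum(1 for n in range(1, P + 1) if gcd(n, P) == 1)

def omega(P):
    """Number of distinct prime factors of P, by trial division."""
    count, p = 0, 2
    while p * p <= P:
        if P % p == 0:
            count += 1
            while P % p == 0:
                P //= p
        p += 1
    return count + (1 if P > 1 else 0)

P = 2 * 3 * 5 * 7  # hypothetical sample modulus
density = euler_phi(P) / P
# Largest observed deviation |S(x, y; P) - y * phi(P)/P| over a grid of x and y.
worst = max(
    abs(sifted_count(x, y, P) - y * density)
    for x in range(0, P, 3)
    for y in range(0, 300, 7)
)
print(worst, 2 ** omega(P))
```

When $y$ is a multiple of $P$ the count equals $y\varphi(P)/P$ exactly, in line with the remark following the theorem.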

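The identity (3.1) can be considered to be an instance of Sylvester’s principle
of inclusion–exclusion, which in general asserts that if $S$ is a finite set and
$S_1, \ldots, S_R$ are subsets of $S$, then

\[ \operatorname{card}\Big( S \setminus \bigcup_{r=1}^{R} S_r \Big) = \operatorname{card}(S) - \Sigma_1 + \Sigma_2 - \cdots + (-1)^R \Sigma_R \tag{3.2} \]

where

\[ \Sigma_s = \sum_{1 \le r_1 < \cdots < r_s \le R} \operatorname{card}\Big( \bigcap_{j=1}^{s} S_{r_j} \Big). \]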

To obtain (3.1) we take $S = \{n \in \mathbb{Z} : x < n \le x + y\}$, $R = \omega(P)$, we let
$p_1, \ldots, p_R$ be the distinct primes dividing $P$, and we put
$S_r = \{n : x < n \le x + y,\ p_r | n\}$. Here we see that the Möbius $\mu$-function has an important
combinatorial significance, namely that it enables us to present the
inclusion–exclusion identity in a compact manner, in arithmetic situations such as (3.1)
above.
To prove (3.2) it suffices to note that if an element of $S$ is not in any of the $S_r$,
then it is counted once on the right-hand side, while if it is in precisely $t > 0$ of
the sets $S_r$ then it is counted $\binom{t}{s}$ times in $\Sigma_s$, and hence it contributes altogether

\[ \sum_{s=0}^{R} (-1)^s \binom{t}{s} = \sum_{s=0}^{t} (-1)^s \binom{t}{s} = (1 - 1)^t = 0. \]
If $p$ is a prime, then either $p | P$ or $(p, P) = 1$. Hence

\[ \pi(x + y) - \pi(x) \le \omega(P) + S(x, y; P), \tag{3.3} \]

so that a bound for $S(x, y; P)$ can be used to bound the number of prime numbers
in an interval. In view of the main term in Theorem 3.1, it is reasonable to expect
that it will be best to take $P$ of the form

\[ P = \prod_{p \le z} p. \tag{3.4} \]

On taking $z = \log y$, we see immediately that

\[ \pi(x + y) - \pi(x) \le \left( e^{-C_0} + \varepsilon(y) \right) \frac{y}{\log \log y} \]

where $\varepsilon(y) \to 0$ as $y \to \infty$. This bound is very weak, but has the interesting
property of being uniform in $x$. Since the bound for the error term in Theorem 3.1
is very crude, we might expect that more is true, so that perhaps

\[ S(x, y; P) \sim \frac{\varphi(P)}{P}\, y \]

even when $z$ is fairly large. However, as we have already noted in our remarks
following Theorem 2.11, this asymptotic formula fails when $z = y^{1/2}$.
In order to derive a sharper estimate for $S(x, y; P)$, we replace $\mu(d)$ by a more
general arithmetic function $\lambda_d$ that in some sense is a truncated approximation
to $\mu(d)$. This is reminiscent of our derivation of the Chebyshev bounds, but in
fact the specific properties required of the $\lambda_d$ are now rather different. Suppose
that we seek an upper bound for $S(x, y; P)$. Let $\lambda_n^+$ be a function such that

\[ \sum_{d|n} \lambda_d^+ \ge \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.5} \]

Such a $\lambda_d^+$ we call an ‘upper bound sifting function’, and by arguing as in the
proof of Theorem 3.1 we see that

\[ S(x, y; P) \le \sum_{x < n \le x+y} \sum_{\substack{d|n \\ d|P}} \lambda_d^+ = y \sum_{d|P} \frac{\lambda_d^+}{d} + O\Big( \sum_{d|P} |\lambda_d^+| \Big). \tag{3.6} \]

This will be useful if $\sum_{d|P} \lambda_d^+/d$ is not much larger than $\varphi(P)/P$, and if
$\sum_{d|P} |\lambda_d^+|$ is much smaller than $2^{\omega(P)}$. Brun (1915) was the first to succeed
with an argument of this kind. He took his $\lambda_n^+$ to be of the form

\[ \lambda_n^+ = \begin{cases} \mu(n) & \text{if } n \in \mathcal{D}^+, \\ 0 & \text{otherwise,} \end{cases} \]

where $\mathcal{D}^+$ is a judiciously chosen set of integers. A sieve of this kind is called
‘combinatorial’. With Brun’s choice of $\mathcal{D}^+$ it is easy to verify (3.5), and it
is not hard to bound $\sum_{d|P} |\lambda_d^+|$, but the determination of the asymptotic size
of the main term $\sum_{d|P} \lambda_d^+/d$ presents some technical difficulties. We do not
develop a detailed account of Brun’s method, but the spirit of the approach can
be appreciated by considering the following simple choice of $\mathcal{D}^+$: Let $r$ be an
integer at our disposal, and put

\[ \mathcal{D}^+ = \{n : \omega(n) \le 2r\}. \]

We observe that

\[ \sum_{d|P} \lambda_d^+ = \sum_{j=0}^{2r} \sum_{\substack{d|P \\ \omega(d) = j}} \mu(d) = \sum_{j=0}^{2r} (-1)^j \binom{\omega(P)}{j}. \]

Then (3.5) follows on taking $J = 2r$, $h = \omega(P)$ in the binomial coefficient
identity

\[ \sum_{j=0}^{J} (-1)^j \binom{h}{j} = (-1)^J \binom{h-1}{J}. \]

This identity can in turn be proved by induction, or by equating coefficients in
the power series identity

\[ \Big( \sum_{i=0}^{\infty} x^i \Big) \Big( \sum_{j=0}^{h} (-1)^j \binom{h}{j} x^j \Big) = (1 - x)^{h-1} = \sum_{J=0}^{h-1} (-1)^J \binom{h-1}{J} x^J. \]
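The truncated alternating binomial identity can be checked mechanically. The sketch below is an illustration only; the helper name `truncated_alternating_sum` is an assumption of this example. It tests the identity for $h \ge 1$ (the case $h = 0$ corresponds to $n = 1$, where the sum is trivially 1), and confirms that for even $J = 2r$ the value $\binom{h-1}{2r}$ is non-negative, which is what (3.5) requires.

```python
from math import comb

def truncated_alternating_sum(h, J):
    """Sum of (-1)^j * C(h, j) for 0 <= j <= J."""
    return sum((-1) ** j * comb(h, j) for j in range(J + 1))

# Compare both sides of the identity on a small grid of (h, J).
identity_holds = all(
    truncated_alternating_sum(h, J) == (-1) ** J * comb(h - 1, J)
    for h in range(1, 12)
    for J in range(0, h + 3)
)

# With J = 2r the truncated sum is C(h - 1, 2r) >= 0, as needed for (3.5).
even_truncations_nonnegative = all(
    truncated_alternating_sum(h, 2 * r) >= 0
    for h in range(1, 12)
    for r in range(0, 7)
)
print(identity_holds, even_truncations_nonnegative)
```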

Lower bounds for $S(x, y; P)$ can be derived in a parallel manner, by
introducing a lower bound sifting function $\lambda_n^-$. That is, $\lambda_n^-$ is an arithmetic function
such that

\[ \sum_{d|n} \lambda_d^- \le \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3.7} \]

Corresponding to the upper bound (3.6) we have

\[ S(x, y; P) \ge y \sum_{d|P} \frac{\lambda_d^-}{d} - O\Big( \sum_{d|P} |\lambda_d^-| \Big). \tag{3.8} \]

Unfortunately, this lower bound may be negative, in which case it is useless,
since trivially $S(x, y; P) \ge 0$. Brun determined $\lambda_d^-$ combinatorially by
constructing a set $\mathcal{D}^-$ similar to his $\mathcal{D}^+$. Indeed, an admissible set can be obtained
by taking

\[ \mathcal{D}^- = \{n : \omega(n) \le 2r - 1\}. \]

By Brun’s method it can be shown that

\[ \pi(x + y) - \pi(x) \ll \frac{y}{\log y}. \tag{3.9} \]

When $x = 0$ this is merely a weak form of the Chebyshev upper bound. The
main utility of the above is that it holds uniformly in $x$. We shall establish a
refined form of (3.9) in the next section (cf. Corollary 3.4).
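The simple truncations $\omega(d) \le 2r$ and $\omega(d) \le 2r - 1$ described above bracket $S(x, y; P)$ from both sides. The following sketch illustrates this for integer $x$ and $y$; the helper names and the sample values of `primes`, `x`, `y` and `r` are assumptions made for the example, and this is not Brun's full method, only the elementary truncation discussed in the text.

```python
from itertools import combinations
from math import gcd, prod

def sifted_count(x, y, P):
    """S(x, y; P): count n in (x, x + y] with gcd(n, P) = 1 (integer x, y)."""
    return sum(1 for n in range(x + 1, x + y + 1) if gcd(n, P) == 1)

def truncated_legendre(x, y, primes, max_omega):
    """Sum of mu(d) * ([(x+y)/d] - [x/d]) over d | P with omega(d) <= max_omega."""
    total = 0
    for j in range(max_omega + 1):
        for combo in combinations(primes, j):
            d = prod(combo)  # the empty product is 1
            total += (-1) ** j * ((x + y) // d - x // d)
    return total

primes = [2, 3, 5, 7, 11]  # hypothetical choice; P is their product
P = prod(primes)
x, y, r = 1000, 500, 1
exact = sifted_count(x, y, P)
upper = truncated_legendre(x, y, primes, 2 * r)      # D+ : omega(d) <= 2r
lower = truncated_legendre(x, y, primes, 2 * r - 1)  # D- : omega(d) <= 2r - 1
print(lower, exact, upper)
```

Taking `max_omega` at least as large as the number of primes recovers the full identity (3.1), so the two truncations meet at the exact count.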

3.1.1 Exercises

1. (Charles Dodgson) In a very hotly fought battle, at least 70% of the combat-
ants lost an eye, at least 75% an ear, at least 80% an arm, and at least 85% a
leg. What can you say about the percentage that lost all four members?
2. (P. T. Bateman) Would you believe a market investigator who reports that of
1000 people, 816 like candy, 723 like ice cream, 645 like cake, while 562
like both candy and ice cream, 463 like both candy and cake, 470 like both
ice cream and cake, while 310 like all three?
3. (Erdős 1946) For $x > 0$ write

\[ \sum_{\substack{1 \le n \le x \\ (n,k)=1}} 1 = \frac{\varphi(k)}{k}\, x + E_k(x). \]

(a) Show that if $k > 1$, then

\[ E_k(x) = -\sum_{d|k} \mu(d) B_1(\{x/d\}) \]

where $B_1(z) = z - 1/2$ is the first Bernoulli polynomial. Let $E_k(x)$ be
defined by this formula when $x < 0$.
(b) Show that if $k > 1$, then $E_k(x)$ is periodic with period $k$, that $E_k(x)$ is
an odd function (apart from values at discontinuities), and that

\[ \int_0^k E_k(x)\, dx = 0. \]

(c) By using the result of Exercise B.10, or otherwise, show that if $d|k$ and
$e|k$, then

\[ \int_0^k B_1(\{x/d\}) B_1(\{x/e\})\, dx = \frac{(d, e)^2}{12de}\, k. \]

(d) Show that if $k > 1$, then

\[ \int_0^k E_k(x)^2\, dx = \frac{1}{12}\, 2^{\omega(k)} \varphi(k). \]

(e) Deduce that if $k > 1$, then

\[ \max_x |E_k(x)| \gg 2^{\omega(k)/2} \left( \frac{\varphi(k)}{k} \right)^{1/2}. \]

4. (Lehmer 1955; cf. Vijayaraghavan 1951) Let $E_k(x)$ be defined as above.
(a) Show that $|E_k(x)| \le 2^{\omega(k)-1}$ for all $k > 1$.
(b) Suppose that $k$ is composed of distinct primes $p \equiv 3 \pmod 4$, and that
$\omega(k)$ is even. Show that if $d|k$, then $\mu(d) B_1(\{k/(4d)\}) = -1/4$.
(c) Show that there exist infinitely many numbers $k$ for which

\[ \max_x |E_k(x)| \ge 2^{\omega(k)-2}. \]
5. (Behrend 1948; cf. Heilbronn 1937, Rohrbach 1937, Chung 1941, van der
Corput 1958) Let $a_1, \ldots, a_J$ be positive integers, and let $T(a_1, \ldots, a_J)$
denote the asymptotic density of the set of those positive integers that are not
divisible by any of the $a_i$.
(a) Show that $T(a_1, \ldots, a_J) = \sum_{j=0}^{J} (-1)^j \Sigma_j$ where

\[ \Sigma_j = \sum_{1 \le i_1 < \cdots < i_j \le J} \frac{1}{[a_{i_1}, \ldots, a_{i_j}]}. \]

(b) Show that if $a_1, \ldots, a_J$ are pairwise relatively prime, then

\[ T(a_1, \ldots, a_J) = \prod_{j=1}^{J} \left( 1 - \frac{1}{a_j} \right). \]

(c) Show that if $(d, v_s) = 1$ for $1 \le s \le S$, then

\[ T(du_1, \ldots, du_R, v_1, \ldots, v_S) = \frac{1}{d}\, T(u_1, \ldots, u_R, v_1, \ldots, v_S) + \left( 1 - \frac{1}{d} \right) T(v_1, \ldots, v_S). \]

(d) Suppose that $d|a_j$ for $1 \le j \le j_0$, that $(d, a_j) = 1$ for $j > j_0$, that $d|b_k$
for $1 \le k \le k_0$, and that $(d, b_k) = 1$ for $k_0 < k \le K$. Put $a_j' = a_j/d$ for
$1 \le j \le j_0$, and $b_k' = b_k/d$ for $1 \le k \le k_0$. Explain why

\begin{align*}
T(a_1, \ldots, a_J)\, T(b_1, \ldots, b_K)
&= \frac{1}{d}\, T(a_1', \ldots, a_{j_0}', a_{j_0+1}, \ldots, a_J)\, T(b_1', \ldots, b_{k_0}', b_{k_0+1}, \ldots, b_K) \\
&\quad + \left( 1 - \frac{1}{d} \right) T(a_{j_0+1}, \ldots, a_J)\, T(b_{k_0+1}, \ldots, b_K) \\
&\quad - \frac{1}{d} \left( 1 - \frac{1}{d} \right) \bigl( T(a_{j_0+1}, \ldots, a_J) - T(a_1', \ldots, a_{j_0}', a_{j_0+1}, \ldots, a_J) \bigr) \\
&\qquad \cdot \bigl( T(b_{k_0+1}, \ldots, b_K) - T(b_1', \ldots, b_{k_0}', b_{k_0+1}, \ldots, b_K) \bigr).
\end{align*}

(e) Explain why the factors that constitute the last term above are all
non-negative.
(f) Show that

\[ T(a_1, \ldots, a_J, b_1, \ldots, b_K) \ge T(a_1, \ldots, a_J)\, T(b_1, \ldots, b_K). \]

(g) Show that

\[ T(a_1, \ldots, a_J) \ge \prod_{j=1}^{J} \left( 1 - \frac{1}{a_j} \right). \]
3.2 The Selberg lambda-squared method

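Let $\Lambda_n$ be a real-valued arithmetic function such that $\Lambda_1 = 1$. Then

\[ \Big( \sum_{d|n} \Lambda_d \Big)^2 \ge \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \]

This simple observation can be used to obtain an upper bound for $S(x, y; P)$;
namely

\begin{align*}
S(x, y; P) &\le \sum_{x < n \le x+y} \Big( \sum_{\substack{d|n \\ d|P}} \Lambda_d \Big)^2 \\
&= \sum_{\substack{d|P \\ e|P}} \Lambda_d \Lambda_e \sum_{\substack{x < n \le x+y \\ d|n,\ e|n}} 1 \\
&= \sum_{\substack{d|P \\ e|P}} \Lambda_d \Lambda_e \left( \left[ \frac{x+y}{[d, e]} \right] - \left[ \frac{x}{[d, e]} \right] \right) \\
&= y \sum_{\substack{d|P \\ e|P}} \frac{\Lambda_d \Lambda_e}{[d, e]} + O\Big( \Big( \sum_{d|P} |\Lambda_d| \Big)^2 \Big). \tag{3.10}
\end{align*}

In the general framework of the preceding section this amounts to taking

\[ \lambda_n^+ = \sum_{\substack{d, e \\ [d,e]=n}} \Lambda_d \Lambda_e, \]

since it then follows that

\[ \sum_{d|n} \lambda_d^+ = \Big( \sum_{d|n} \Lambda_d \Big)^2. \]

We now suppose that $\Lambda_n = 0$ for $n > z$ where $z$ is a parameter at our disposal, in
the hope that this will restrict the size of the error term. As for the main term, we
see that we wish to minimize a quadratic form subject to the constraint $\Lambda_1 = 1$.
In fact we can diagonalize this quadratic form and determine the optimal $\Lambda_n$
exactly; this permits us to prove

Theorem 3.2 Let $x$, $y$, and $z$ be real numbers such that $y > 0$ and $z \ge 1$. For
any positive integer $P$ we have

\[ S(x, y; P) \le \frac{y}{L_P(z)} + O\left( z^2 L_P(z)^{-2} \right) \]

where

\[ L_P(z) = \sum_{\substack{n \le z \\ n|P}} \frac{\mu(n)^2}{\varphi(n)}. \]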

Proof Clearly we may assume that $P$ is square-free. Since $[d, e](d, e) = de$
and $\sum_{d|n} \varphi(d) = n$, we see that

\[ \frac{1}{[d, e]} = \frac{(d, e)}{de} = \frac{1}{de} \sum_{f|d,\ f|e} \varphi(f). \]

Hence

\[ \sum_{d|P,\ e|P} \frac{\Lambda_d \Lambda_e}{[d, e]} = \sum_{f|P} \varphi(f) \sum_{\substack{d \\ f|d|P}} \frac{\Lambda_d}{d} \sum_{\substack{e \\ f|e|P}} \frac{\Lambda_e}{e} = \sum_{f|P} \varphi(f)\, y_f^2 \]

where

\[ y_f = \sum_{\substack{d \\ f|d|P}} \frac{\Lambda_d}{d}. \tag{3.11} \]

This linear change of variables, from $\Lambda_d$ to $y_f$, is non-singular. That is, if the $y_f$
are given then there exist unique $\Lambda_d$ such that the above holds. Indeed, by a form
of the Möbius inversion formula (cf. Exercise 2.1.6) the above is equivalent to
the relation

\[ \Lambda_d = d \sum_{\substack{f \\ d|f|P}} y_f\, \mu(f/d). \tag{3.12} \]

Moreover, from these formulæ we see that $\Lambda_d = 0$ for all $d > z$ if and only if
$y_f = 0$ for all $f > z$. Thus we have diagonalized the quadratic form in (3.10),
and by (3.12) we see that the constraint $\Lambda_1 = 1$ is equivalent to the linear
condition

\[ \sum_{f|P} y_f\, \mu(f) = 1. \tag{3.13} \]

We determine the value of the constrained minimum by completing squares. If
the $y_f$ satisfy (3.13), then

\[ \sum_{f|P} \varphi(f)\, y_f^2 = \sum_{\substack{f|P \\ f \le z}} \varphi(f) \left( y_f - \frac{\mu(f)}{\varphi(f) L_P(z)} \right)^2 + \frac{1}{L_P(z)}. \tag{3.14} \]

Here the right-hand side is minimized by taking

\[ y_f = \frac{\mu(f)}{\varphi(f) L_P(z)} \tag{3.15} \]

for $f \le z$, and we note that these $y_f$ satisfy (3.13). Hence the minimum of the
quadratic form in (3.10), subject to $\Lambda_1 = 1$, is precisely $1/L_P(z)$; this gives the
main term.
We now treat the error term. Since $P$ is square-free, from (3.12) and (3.15)
we see that

\[ \Lambda_d = \frac{d}{L_P(z)} \sum_{\substack{f \\ d|f|P \\ f \le z}} \frac{\mu(f)\mu(f/d)}{\varphi(f)} = \frac{d\,\mu(d)}{L_P(z)\varphi(d)} \sum_{\substack{m|P \\ (m,d)=1 \\ m \le z/d}} \frac{\mu(m)^2}{\varphi(m)}; \tag{3.16} \]

here we have put $m = f/d$. Thus

\[ \sum_{d \le z} |\Lambda_d| \le \frac{1}{L_P(z)} \sum_{d \le z} \frac{d}{\varphi(d)} \sum_{m \le z/d} \frac{1}{\varphi(m)} = \frac{1}{L_P(z)} \sum_{m \le z} \frac{1}{\varphi(m)} \sum_{d \le z/m} \frac{d}{\varphi(d)}. \]

Since $d/\varphi(d) = \sum_{r|d} \mu^2(r)/\varphi(r)$, it follows by the method of Section 2.1 that

\[ \sum_{d \le y} \frac{d}{\varphi(d)} = \sum_{r \le y} \frac{\mu^2(r)}{\varphi(r)}\, [y/r] \le y \sum_{r} \frac{\mu^2(r)}{r\varphi(r)} \ll y. \]

On inserting this in our former estimate, we find that

\[ \sum_{d \le z} |\Lambda_d| \ll \frac{z}{L_P(z)} \sum_{m \le z} \frac{1}{m\varphi(m)} \ll \frac{z}{L_P(z)}. \tag{3.17} \]

This gives the stated error term, so the proof is complete.
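The construction in the proof can be followed in exact rational arithmetic. The sketch below is an illustration under stated assumptions: $P$ is given as a list of distinct primes, and the helper names (`squarefree_divisors`, `selberg_weights`, `quadratic_form`, `selberg_upper_sum`) are invented for this example. It builds $y_f$ from (3.15), recovers $\Lambda_d$ from (3.12), and lets one confirm that $\Lambda_1 = 1$, that $\Lambda_d = 0$ for $d > z$, and that the quadratic form in (3.10) equals $1/L_P(z)$.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, prod

def squarefree_divisors(primes):
    """Map each divisor d of P = prod(primes) to the pair (mu(d), phi(d))."""
    table = {}
    for k in range(len(primes) + 1):
        for combo in combinations(primes, k):
            table[prod(combo)] = ((-1) ** k, prod(p - 1 for p in combo))
    return table

def selberg_weights(primes, z):
    """Return (weights, L): the optimal Lambda_d for d | P, and L = L_P(z)."""
    divs = squarefree_divisors(primes)
    L = sum(Fraction(1, phi) for f, (_, phi) in divs.items() if f <= z)
    # y_f as in (3.15) for f <= z, and y_f = 0 for f > z.
    y = {f: Fraction(mu, phi) / L if f <= z else Fraction(0)
         for f, (mu, phi) in divs.items()}
    # Lambda_d = d * sum over f with d | f | P of y_f * mu(f / d), as in (3.12).
    weights = {d: d * sum(y[f] * divs[f // d][0] for f in divs if f % d == 0)
               for d in divs}
    return weights, L

def quadratic_form(weights):
    """Sum of Lambda_d * Lambda_e / [d, e] over d | P, e | P."""
    return sum(wd * we * Fraction(gcd(d, e), d * e)
               for d, wd in weights.items() for e, we in weights.items())

def selberg_upper_sum(x, y, weights):
    """Sum over x < n <= x + y of (sum of Lambda_d over d | (n, P)) squared."""
    total = Fraction(0)
    for n in range(x + 1, x + y + 1):
        inner = sum(w for d, w in weights.items() if n % d == 0)
        total += inner * inner
    return total

weights, L = selberg_weights([2, 3, 5, 7], 10)  # hypothetical P = 210, z = 10
print(L, quadratic_form(weights), weights[1])
```

Exact fractions are used so that the equality of the quadratic form with $1/L_P(z)$ can be tested without rounding; the direct sum in `selberg_upper_sum` is the first line of (3.10) and so bounds $S(x, y; P)$ from above for integer $x$ and $y$.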

In order to apply Theorem 3.2, we require a lower bound for the sum $L_P(z)$.
To this end we show that

\[ \sum_{n \le z} \frac{\mu(n)^2}{\varphi(n)} > \log z \tag{3.18} \]

for all $z \ge 1$. Let $s(n)$ denote the largest square-free number dividing $n$
(sometimes called the ‘square-free kernel of $n$’). Then for square-free $n$,

\[ \frac{1}{\varphi(n)} = \frac{1}{n} \prod_{p|n} \left( 1 + \frac{1}{p} + \frac{1}{p^2} + \cdots \right) = \sum_{\substack{m \\ s(m)=n}} \frac{1}{m}, \]

so that the sum in (3.18) is

\[ \sum_{\substack{m \\ s(m) \le z}} \frac{1}{m}. \]

Since $s(m) \le m$, this latter sum is

\[ \ge \sum_{m \le z} \frac{1}{m} > \log z. \]

Here the last inequality is obtained by the integral test. With more work one can
derive an asymptotic formula for the sum in (3.18) (recall Exercise 2.1.17).
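The inequality (3.18) is easy to examine numerically. The following sketch sieves $\varphi(n)$ and the square-free indicator up to a limit and accumulates the sum; the function name `squarefree_phi_sums` and the limit 2000 are assumptions of this example. Since the sum is constant on $[n, n+1)$, comparing its value at $n$ with $\log(n+1)$ covers every real $z$ in that range.

```python
from math import log

def squarefree_phi_sums(limit):
    """Return sums[n] = sum of mu(k)^2 / phi(k) over 1 <= k <= n, for n <= limit."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: no smaller prime has reduced phi[p]
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
            for m in range(p * p, limit + 1, p * p):
                squarefree[m] = False
    sums, running = [0.0], 0.0
    for n in range(1, limit + 1):
        if squarefree[n]:
            running += 1.0 / phi[n]
        sums.append(running)
    return sums

sums = squarefree_phi_sums(2000)
# The sum up to n should exceed log z for every real z in [n, n + 1).
margin = min(sums[n] - log(n + 1) for n in range(1, 2001))
print(margin)
```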
By taking $z = y^{1/2}$ in Theorem 3.2, and appealing to (3.18), we obtain

Theorem 3.3 Let $P = \prod_{p \le \sqrt{y}} p$. Then for any $x$ and any $y \ge 2$,

\[ S(x, y; P) \le \frac{2y}{\log y} \left( 1 + O\left( \frac{1}{\log y} \right) \right). \]

By combining the above with (3.3) we obtain an immediate application to
the distribution of prime numbers.

Corollary 3.4 For any $x \ge 0$ and any $y \ge 2$,

\[ \pi(x + y) - \pi(x) \le \frac{2y}{\log y} \left( 1 + O\left( \frac{1}{\log y} \right) \right). \]

In Theorem 3.3 we consider only a very special sort of $P$, but the following
lemma enables us to obtain corresponding results for more general $P$.

Lemma 3.5 Put $M(y; P) = \max_x S(x, y; P)$. If $(P, q) = 1$, then

\[ M(y; P) \le \frac{q}{\varphi(q)}\, M(y; qP). \]
Proof It suffices to show that

\[ \varphi(q)\, S(x, y; P) = \sum_{m=1}^{q} S(x + Pm, y; qP), \tag{3.19} \]

since the right-hand side is bounded above by $q M(y; qP)$. Suppose that
$x + Pm < n \le x + Pm + y$ and that $(n, qP) = 1$. Put $r = n - Pm$. Then
$x < r \le x + y$, $(r, P) = 1$, and $(r + Pm, q) = 1$. Thus the right-hand side above is

\[ \sum_{m=1}^{q} \sum_{\substack{x < r \le x+y \\ (r,P)=1 \\ (r+Pm,q)=1}} 1 = \sum_{\substack{x < r \le x+y \\ (r,P)=1}} \sum_{\substack{1 \le m \le q \\ (r+Pm,q)=1}} 1. \]

Since $(P, q) = 1$, the map $m \mapsto r + Pm$ permutes the residue classes (mod $q$).
Hence the inner sum above is $\varphi(q)$, and we have (3.19).
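The identity (3.19) is exact, so it can be tested directly for small coprime $P$ and $q$. In the sketch below the names `sifted_count`, `euler_phi` and `identity_3_19_holds`, and the sample values $P = 15$, $q = 14$, are assumptions made for illustration; $x$ and $y$ are restricted to integers.

```python
from math import gcd

def sifted_count(x, y, P):
    """S(x, y; P): count n in (x, x + y] with gcd(n, P) = 1 (integer x, y)."""
    return sum(1 for n in range(x + 1, x + y + 1) if gcd(n, P) == 1)

def euler_phi(q):
    """Euler's totient, by direct count (adequate for small q)."""
    return sum(1 for n in range(1, q + 1) if gcd(n, q) == 1)

def identity_3_19_holds(x, y, P, q):
    """Compare phi(q) * S(x, y; P) with the sum of S(x + P*m, y; q*P) over 1 <= m <= q."""
    lhs = euler_phi(q) * sifted_count(x, y, P)
    rhs = sum(sifted_count(x + P * m, y, q * P) for m in range(1, q + 1))
    return lhs == rhs

P, q = 15, 14  # hypothetical coprime pair
all_ok = all(identity_3_19_holds(x, y, P, q)
             for x in range(0, 40) for y in range(0, 60, 5))
print(all_ok)
```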

Theorem 3.6 For any real $x$ and any $y \ge 2$,

\[ S(x, y; P) \le e^{C_0} y \Big( \prod_{\substack{p|P \\ p \le \sqrt{y}}} \Big( 1 - \frac{1}{p} \Big) \Big) \left( 1 + O\left( \frac{1}{\log y} \right) \right). \]

Proof Let

\[ P_1 = \prod_{\substack{p|P \\ p \le \sqrt{y}}} p, \qquad q_1 = \prod_{\substack{p \nmid P \\ p \le \sqrt{y}}} p. \]

Theorem 3.3 provides an upper bound for $M(y; q_1 P_1)$, and hence by Lemma
3.5 we have an upper bound for $M(y; P_1)$. To complete the argument it suffices
to note that $S(x, y; P) \le S(x, y; P_1) \le M(y; P_1)$, and to appeal to Mertens’
formula (Theorem 2.7(e)).

We note that Theorem 3.3 is a special case of Theorem 3.6. Although we have
taken great care to derive uniform estimates, for many purposes it is enough to
know that

\[ S(x, y; P) \ll y \prod_{\substack{p|P \\ p \le y}} \Big( 1 - \frac{1}{p} \Big). \tag{3.20} \]

This follows from Theorem 3.6 since $\prod_{\sqrt{y} < p \le y} (1 - 1/p)^{-1} \ll 1$ by Mertens’
formula. To obtain an estimate in the opposite direction, write $P = P_1 q_1$ where
$P_1$ is composed entirely of primes $> y$, and $q_1$ is composed entirely of primes
$\le y$. Since the integers in the interval $(0, y]$ have no prime factor $> y$, we see
that $M(y; P_1) \ge [y]$. Hence by Lemma 3.5,

\[ M(y; P) \ge [y] \prod_{\substack{p|P \\ p \le y}} \Big( 1 - \frac{1}{p} \Big). \tag{3.21} \]

Thus the bound (3.20) is of the correct order of magnitude.
The advantage of Theorem 3.6 lies in its uniformity. On the other hand, the
use of Lemma 3.5 is wasteful if the $P$ in Theorem 3.6 is much smaller than in
Theorem 3.3. For example, if $P = \prod_{p \le y^{1/4}} p$, then by Theorem 3.6 we find that

\[ S(x, y; P) \le \frac{cy}{\log y} \left( 1 + O\left( \frac{1}{\log y} \right) \right) \]

with $c = 4$, whereas by Theorem 3.2 with $z = y^{1/2}$ we obtain the above with
the better constant

\[ c = \frac{4}{3 - 2\log 2} = 2.4787668\ldots. \]

To see this, we note that

\[ L_P(z) = \sum_{n \le z} \frac{\mu(n)^2}{\varphi(n)} - \sum_{z^{1/2} < p \le z} \frac{1}{p - 1} \sum_{n \le z/p} \frac{\mu(n)^2}{\varphi(n)}. \tag{3.22} \]

Then by Exercise 2.1.17 and Mertens’ estimates (Theorem 2.7) it follows that
this is $\tfrac14 (3 - 2\log 2) \log y + O(1)$.
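The evaluation of (3.22) can be traced as follows. This is a sketch that assumes only the estimate $\sum_{n \le u} \mu(n)^2/\varphi(n) = \log u + O(1)$ for $u \ge 1$ (the content of the exercise cited above) together with the Mertens estimates $\sum_{p \le u} 1/p = \log\log u + b + O(1/\log u)$ and $\sum_{p \le u} (\log p)/p = \log u + O(1)$; replacing $1/(p-1)$ by $1/p$ costs only $O(1)$.

```latex
\begin{align*}
L_P(z) &= \log z + O(1) - \sum_{z^{1/2} < p \le z} \frac{1}{p-1}
          \Bigl( \log\frac{z}{p} + O(1) \Bigr) \\
       &= \log z - (\log z) \sum_{z^{1/2} < p \le z} \frac{1}{p}
          + \sum_{z^{1/2} < p \le z} \frac{\log p}{p} + O(1) \\
       &= \log z - (\log 2)\log z + \tfrac12 \log z + O(1) \\
       &= \bigl( \tfrac32 - \log 2 \bigr) \log z + O(1)
        = \tfrac14 (3 - 2\log 2) \log y + O(1).
\end{align*}
```

Here the sum of $1/p$ over $z^{1/2} < p \le z$ is $\log\log z - \log\log z^{1/2} + O(1/\log z) = \log 2 + O(1/\log z)$, and the last step uses $z = y^{1/2}$. Dividing $y$ by this value of $L_P(z)$ gives the constant $c = 4/(3 - 2\log 2)$.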

3.2.1 Exercises
1. Let $\Lambda_d$ be defined as in the proof of Theorem 3.2.
(a) Show that

\[ \Lambda_d \ll \frac{d}{L_P(z)\varphi(d)} \log\frac{2z}{d} \]

for $d \le z$.
(b) Use the above to give a second proof of (3.17).
2. Show that for $y \ge 2$ the number of prime powers $p^k$ in the interval
$(x, x + y]$ is

\[ \le \frac{2y}{\log y} \left( 1 + O\left( \frac{1}{\log y} \right) \right). \]

3. (Chowla 1932) Let $f(n)$ be an arithmetic function, put

\[ g(n) = \sum_{[d,e]=n} f(d) f(e), \]

and let $\sigma_c$ denote the abscissa of convergence of the Dirichlet series
$\sum g(n) n^{-s}$.
(a) Show that if $\sigma > \max(1, \sigma_c)$, then

\[ \zeta(s) \sum_{d, e} \frac{f(d) f(e)}{[d, e]^s} = \sum_{n=1}^{\infty} \Big( \sum_{d|n} f(d) \Big)^2 n^{-s}. \]

(b) Show that

\[ \sum_{d, e} \frac{\mu(d)\mu(e)}{[d, e]^2} = \frac{6}{\pi^2}. \]

(c) Show that

\[ \sum_{\substack{d, e \\ [d,e]=n}} \mu(d)\mu(e) = \mu(n) \]

for all positive integers $n$.

4. Let $f(n)$ be an arithmetic function such that $f(1) = 1$. Show that $f$ is
multiplicative if and only if $f(m) f(n) = f((m, n)) f([m, n])$ for all pairs
of positive integers $m$, $n$.
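5. (Hensley 1978)
(a) Let $P = \prod_{p \le \sqrt{y}} p$. Show that the number of $n$, $x < n \le x + y$, such
that $\Omega(n) = 2$, is

\[ \le S(x, y; P) + \sum_{p \le \sqrt{y}} \left( \pi\Big( \frac{x + y}{p} \Big) - \pi\Big( \frac{x}{p} \Big) \right). \]

(b) By using Theorem 3.3 and Corollary 3.4, show that for $y \ge 2$,

\[ \sum_{\substack{x < n \le x+y \\ \Omega(n)=2}} 1 \le \frac{2y \log\log y}{\log y} \left( 1 + O\left( \frac{1}{\log\log y} \right) \right). \]

6. (H.-E. Richert, unpublished)
(a) Show that

\[ \sum_{x < n \le x+y} \Big( \sum_{d^2|n} \Lambda_d \Big)^2 = y \sum_{d, e} \frac{\Lambda_d \Lambda_e}{[d, e]^2} + O\Big( \Big( \sum_{d} |\Lambda_d| \Big)^2 \Big). \]

(b) Let $f(n) = n^2 \prod_{p|n} (1 - p^{-2})$. Show that $\sum_{d|n} f(d) = n^2$.
(c) For $1 \le d \le z$ let $\Lambda_d$ be real numbers such that $\Lambda_1 = 1$. Show that the
minimum of $\sum_{d, e} \Lambda_d \Lambda_e/[d, e]^2$ is $1/L$ where $L = \sum_{n \le z} \mu(n)^2/f(n)$.
Show also that $\Lambda_d \ll 1$ for the extremal $\Lambda_d$.
(d) Show that $\zeta(2) - 1/z \le L \le \zeta(2)$.
(e) Let $Q(x)$ denote the number of square-free numbers not exceeding $x$.
Show that for $x \ge 0$, $y \ge 1$,

\[ Q(x + y) - Q(x) \le \frac{y}{\zeta(2)} + O\left( y^{2/3} \right). \]

7. Let $m(y; P) = \min_x S(x, y; P)$. Show that if $(q, P) = 1$, then

\[ m(y; P) \ge \frac{q}{\varphi(q)}\, m(y; qP). \]

8. (N. G. de Bruijn, unpublished; cf. van Lint & Richert 1964) Let $\mathcal{M}$ be an
arbitrary set of natural numbers, and let $s(n)$ denote the largest square-free
divisor of $n$. Show that

\[ 0 \le \sum_{\substack{n \le x \\ n \in \mathcal{M}}} \frac{\mu(n)^2}{\varphi(n)} - \sum_{\substack{n \le x \\ s(n) \in \mathcal{M}}} \frac{1}{n} \le \sum_{n \le x} \frac{\mu(n)^2}{\varphi(n)} - \sum_{n \le x} \frac{1}{n} \ll 1. \]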

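9. (van Lint & Richert 1965)
(a) Show that

\[ \sum_{n \le z} \frac{\mu(n)^2}{\varphi(n)} \le \Big( \sum_{d|q} \frac{\mu(d)^2}{\varphi(d)} \Big) \Big( \sum_{\substack{m \le z \\ (m,q)=1}} \frac{\mu(m)^2}{\varphi(m)} \Big). \]

(b) Deduce that

\[ \sum_{\substack{n \le z \\ (n,q)=1}} \frac{\mu(n)^2}{\varphi(n)} \ge \frac{\varphi(q)}{q} \sum_{n \le z} \frac{\mu(n)^2}{\varphi(n)}. \]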

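10. (Hooley 1972; Montgomery & Vaughan 1979)
(a) Let $\lambda_d^+$ be an upper bound sifting function such that $\lambda_d^+ = 0$ for all
$d > z$. Show that for any $q$,

\[ 0 \le \frac{\varphi(q)}{q} \sum_{\substack{d \\ (d,q)=1}} \frac{\lambda_d^+}{d} \le \sum_{d} \frac{\lambda_d^+}{d}. \]

(Hint: Multiply both sides by $P/\varphi(P) = \sum 1/m$ where $m$ runs over
all integers composed of the primes dividing $P$, and $P = \prod_{p \le z} p$.)
(b) Let $\Lambda_d$ be real with $\Lambda_d = 0$ for $d > z$. Show that for any $q$,

\[ 0 \le \frac{\varphi(q)}{q} \sum_{\substack{d, e \\ (de,q)=1}} \frac{\Lambda_d \Lambda_e}{[d, e]} \le \sum_{d, e} \frac{\Lambda_d \Lambda_e}{[d, e]}. \]

(c) Let $\lambda_d^-$ be a lower bound sifting function such that $\lambda_d^- = 0$ for $d > z$.
Show that for any $q$,

\[ \frac{\varphi(q)}{q} \sum_{\substack{d \\ (d,q)=1}} \frac{\lambda_d^-}{d} \ge \sum_{d} \frac{\lambda_d^-}{d}. \]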

3.3 Sifting an arithmetic progression

Thus far we have sifted only the zero residue class from a set of consecutive
integers. We now widen the situation slightly.
Lemma 3.7 Let $P$ be a positive integer, and for each prime $p$ dividing $P$
suppose that one particular residue class $a_p$ has been chosen. Let $S'(x, y; P)$
denote the number of integers $m$, $x < m \le x + y$, such that for each $p|P$,
$m \not\equiv a_p \pmod p$. Then

\[ \max_x S'(x, y; P) = \max_x S(x, y; P). \]

Since $S'(x, y; P)$ reduces to $S(x, y; P)$ when we take $a_p = 0$ for all $p|P$,
we see that there is no loss of generality in sifting only the zero residue class,
when the initial set of numbers consists of consecutive integers. Also, we note
that the value of the maximum taken above is independent of the choice of the
$a_p$.

Proof By the Chinese remainder theorem there is a number $c$ such that $c \equiv a_p \pmod p$
for every $p|P$. Put $n = m - c$. Thus the inequality $x < m \le x + y$ is
equivalent to $x - c < n \le x - c + y$, and the condition that $p|P$ implies
$m \not\equiv a_p \pmod p$ is equivalent to $(n, P) = 1$. Hence $S'(x, y; P) = S(x - c, y; P)$,
so that

\[ \max_x S'(x, y; P) = \max_x S(x - c, y; P) = \max_x S(x, y; P), \]

and the proof is complete.
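The translation used in the proof can be made concrete. The sketch below is an illustration under stated assumptions: the forbidden classes are supplied as a dictionary `residues` mapping each prime $p | P$ to $a_p$, and the helper names `crt_shift`, `shifted_sift_count` and `sifted_count` are invented for this example. It computes a shift $c$ by the Chinese remainder theorem and compares $S'(x, y; P)$ with $S(x - c, y; P)$ for integer $x$ and $y$.

```python
from math import gcd, prod

def crt_shift(residues):
    """Return c >= 0 with c = a_p (mod p) for each pair (p, a_p); the p are distinct primes."""
    c, modulus = 0, 1
    for p, a in residues.items():
        # Choose t so that c + modulus * t = a (mod p); modulus is invertible mod p.
        t = ((a - c) * pow(modulus, -1, p)) % p
        c += modulus * t
        modulus *= p
    return c

def shifted_sift_count(x, y, residues):
    """S'(x, y; P): count m in (x, x + y] avoiding the class a_p modulo each p."""
    return sum(1 for m in range(x + 1, x + y + 1)
               if all(m % p != a % p for p, a in residues.items()))

def sifted_count(x, y, P):
    """S(x, y; P): count n in (x, x + y] with gcd(n, P) = 1."""
    return sum(1 for n in range(x + 1, x + y + 1) if gcd(n, P) == 1)

residues = {2: 1, 3: 2, 5: 4, 7: 3}  # hypothetical choice of the classes a_p
P = prod(residues)
c = crt_shift(residues)
agree = all(shifted_sift_count(x, y, residues) == sifted_count(x - c, y, P)
            for x in range(0, 50) for y in range(0, 80, 9))
print(c, agree)
```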

Theorem 3.8 Suppose that $(a, q) = 1$, that $(P, q) = 1$, and that $x$ and $y$ are
real numbers with $y \ge 2q$. The number of $n$, $x < n \le x + y$, such that $n \equiv a \pmod q$
and $(n, P) = 1$ is

\[ \le e^{C_0}\, \frac{y}{q} \Big( \prod_{\substack{p|P \\ p \le \sqrt{y/q}}} \Big( 1 - \frac{1}{p} \Big) \Big) \left( 1 + O\left( \frac{1}{\log (y/q)} \right) \right). \]

Proof Write $n = mq + a$, so that $x' < m \le x' + y'$ where $x' = (x - a)/q$
and $y' = y/q$. For each $p|P$ let $a_p$ be the unique residue class (mod $p$) such
that $a_p q + a \equiv 0 \pmod p$. Thus $p|n$ if and only if $m \equiv a_p \pmod p$. Hence
the number of $n$ in question is $S'(x', y'; P)$, in the language of Lemma 3.7. The
stated bound now follows from this lemma and Theorem 3.6.

Using the estimate above, we generalize Corollary 3.4 to arithmetic progressions. We let $\pi(x; q, a)$ denote the number of prime numbers $p \le x$ such that $p \equiv a \pmod q$.

Theorem 3.9 (Brun–Titchmarsh) Let $a$ and $q$ be integers with $(a, q) = 1$, and let $x$ and $y$ be real numbers with $x \ge 0$ and $y \ge 2q$. Then
\[ \pi(x + y; q, a) - \pi(x; q, a) \le \frac{2y}{\varphi(q) \log (y/q)} \Big(1 + O\Big(\frac{1}{\log (y/q)}\Big)\Big). \tag{3.23} \]

Proof Take $P$ to be the product of those primes $p \le \sqrt{y/q}$ such that $p \nmid q$. Then
\[ \prod_{p \mid P} \Big(1 - \frac{1}{p}\Big) = \prod_{\substack{p \mid q \\ p \le \sqrt{y/q}}} \Big(1 - \frac{1}{p}\Big)^{-1} \prod_{p \le \sqrt{y/q}} \Big(1 - \frac{1}{p}\Big) \le \prod_{p \mid q} \Big(1 - \frac{1}{p}\Big)^{-1} \prod_{p \le \sqrt{y/q}} \Big(1 - \frac{1}{p}\Big). \]
By Mertens' estimate this is
\[ = \frac{q}{\varphi(q)} \cdot \frac{2e^{-C_0}}{\log (y/q)} \Big(1 + O\Big(\frac{1}{\log (y/q)}\Big)\Big). \]
Thus by Theorem 3.8, the number of primes $p$, $x < p \le x + y$, such that $p \equiv a \pmod q$ and $(p, P) = 1$ satisfies the bound (3.23). To complete the proof it remains to note that the number of primes $p$, $x < p \le x + y$, such that $p \equiv a \pmod q$ and $p \mid P$ is at most $\omega(P) \le \sqrt{y/q}$, which can be absorbed in the error term in (3.23).
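As a numerical illustration of (3.23), the following sketch counts primes in each reduced residue class in a short interval and compares the counts with the main term $2y/(\varphi(q)\log (y/q))$. The parameters $x = 10^5$, $y = 10^4$, $q = 7$ and the helper names are assumptions chosen for the example; the comparison ignores the error term, so it is an experiment rather than a proof.

```python
from math import gcd, isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def euler_phi(q):
    """Euler's totient by direct count (adequate for small q)."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)

def brun_titchmarsh_table(x, y, q):
    """Map each reduced class a (mod q) to (prime count in (x, x+y], main term)."""
    in_window = [p for p in primes_up_to(x + y) if p > x]
    main_term = 2 * y / (euler_phi(q) * log(y / q))
    return {
        a: (sum(1 for p in in_window if p % q == a % q), main_term)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    }

table = brun_titchmarsh_table(10**5, 10**4, 7)
for a, (count, main_term) in sorted(table.items()):
    print(a, count, round(main_term, 1))
```

In such experiments the counts sit near $y/(\varphi(q)\log x)$, comfortably below the main term of (3.23).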

3.4 Twin primes

Thus far we have removed at most one residue class per prime. More generally, we might wish to delete from an interval $(x, x + y]$ those numbers $n$ that lie in a certain set $B(p)$ of 'bad' residue classes modulo $p$. Let $b(p) = \operatorname{card} B(p)$ denote the number of residue classes to be removed, for $p \mid P$ where $P$ is a given square-free number, and set
\[ a(n) = \prod_{\substack{p \mid P \\ n \in B(p) \ (\mathrm{mod}\ p)}} p. \]
Thus the $n$ that remain after sifting are precisely the $n$ for which $(a(n), P) = 1$. By the sieve we obtain upper and lower bounds for the number of remaining $n$ of the form
\[ \sum_{x < n \le x + y} \ \sum_{m \mid (a(n), P)} \lambda_m = \sum_{m \mid P} \lambda_m \sum_{\substack{x < n \le x + y \\ m \mid a(n)}} 1. \tag{3.24} \]
Now $p \mid a(n)$ if and only if $n \in B(p) \pmod p$. By the Chinese remainder theorem, this will be the case for all $p \mid m$ when $n$ lies in one of precisely $\prod_{p \mid m} b(p)$ residue classes modulo $m$. The $b(p)$ are defined only for primes, but it is convenient now to extend the definition to all positive integers by putting
\[ b(m) = \prod_{p^\alpha \| m} b(p)^\alpha. \]
Thus $b(m)$ is the totally multiplicative function generated by the $b(p)$. For square-free $m$, $b(m)$ represents the number of deleted residue classes modulo $m$. We are now in a position to estimate the inner sum above. We partition the interval $(x, x + y]$ into $[y/m]$ intervals of length $m$, and one interval of length $\{y/m\}m$. In each interval of length $m$ there are precisely $b(m)$ values of $n$ for which $m \mid a(n)$. In the final shorter interval, the number of such $n$ lies between $0$ and $b(m)$. Thus the inner sum on the right above is $= y\,b(m)/m + O(b(m))$, and hence the expression (3.24) is
\[ = y \sum_{m \mid P} \frac{b(m) \lambda_m}{m} + O\Big( \sum_{m \mid P} b(m) |\lambda_m| \Big). \tag{3.25} \]
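The counting step behind (3.25) can be tested directly. The sketch below takes the twin-prime sets $B(p) = \{0, -2\} \pmod p$ as an assumed example, with $m = 3 \cdot 5 \cdot 7$, and checks that the number of $n \in (x, x+y]$ with $m \mid a(n)$ differs from $y\,b(m)/m$ by at most $b(m)$. The function name and the test intervals are illustrative.

```python
def count_bad(x, y, primes):
    """Number of n in (x, x + y] with n in B(p) for every p in `primes`,
    where B(p) consists of the classes 0 and -2 modulo p."""
    return sum(
        all(n % p == 0 or (n + 2) % p == 0 for p in primes)
        for n in range(x + 1, x + y + 1)
    )

primes = (3, 5, 7)   # m = 105; each odd prime contributes b(p) = 2
m, b_m = 105, 8      # b(m) = 2 * 2 * 2

deviations = []
for x, y in [(0, 1000), (17, 333), (10**6, 5000)]:
    exact = count_bad(x, y, primes)
    deviations.append(abs(exact - y * b_m / m))
    print(x, y, exact, round(y * b_m / m, 2))
```

Each deviation is bounded by $b(m)$, which is the source of the $O(b(m))$ term in the inner sum.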
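To continue from this point, one should specify the choice of $\lambda_m$, and then estimate the main term and error term. In the context of Selberg's $\Lambda^2$ method, we have real $\Lambda_d$ with $\Lambda_1 = 1$ and $\Lambda_d = 0$ for $d > z$. The number of $n \in (x, x + y]$ that survive sifting is
\[ \le \sum_{x < n \le x + y} \Big( \sum_{d \mid (a(n), P)} \Lambda_d \Big)^2 = \sum_{d \mid P} \sum_{e \mid P} \Lambda_d \Lambda_e \sum_{\substack{x < n \le x + y \\ [d,e] \mid a(n)}} 1 \]
\[ = y \sum_{d \mid P} \sum_{e \mid P} \frac{b([d, e])}{[d, e]} \Lambda_d \Lambda_e + O\Big( \sum_{d \mid P} \sum_{e \mid P} b([d, e]) |\Lambda_d \Lambda_e| \Big). \tag{3.26} \]
This is (3.25) with $\lambda_m = \sum_{[d,e] = m} \Lambda_d \Lambda_e$.

We consider first the main term above. Clearly $[d, e] = de/(d, e)$ and $b([d, e]) = b(d)b(e)/b((d, e))$. For square-free $m$ put
\[ g(m) = \prod_{p \mid m} \frac{b(p)}{p - b(p)}. \tag{3.27} \]
Here we have $0$ in the denominator if there is a prime $p$ for which $b(p) = p$. However, in that case all residues modulo $p$ are removed, and no integer survives sifting. Thus we may confine our attention to $b(p)$ such that $b(p) < p$ for all $p$. If $m$ is square-free, then
\[ \sum_{d \mid m} \frac{1}{g(d)} = \prod_{p \mid m} \Big( 1 + \frac{p - b(p)}{b(p)} \Big) = \frac{m}{b(m)}. \]
By applying this with $m = (d, e)$ we see that the first sum in (3.26) is
\[ \sum_{d \mid P} \sum_{e \mid P} \frac{b(d) \Lambda_d}{d} \cdot \frac{b(e) \Lambda_e}{e} \cdot \frac{(d, e)}{b((d, e))} = \sum_{d \mid P} \sum_{e \mid P} \frac{b(d) \Lambda_d}{d} \cdot \frac{b(e) \Lambda_e}{e} \sum_{\substack{f \mid d \\ f \mid e}} \frac{1}{g(f)} \]
\[ = \sum_{f \mid P} \frac{1}{g(f)} \Big( \sum_{\substack{d \\ f \mid d \mid P}} \frac{b(d)}{d} \Lambda_d \Big) \Big( \sum_{\substack{e \\ f \mid e \mid P}} \frac{b(e)}{e} \Lambda_e \Big) = \sum_{f \mid P} \frac{1}{g(f)}\, y_f^2 \tag{3.28} \]
where
\[ y_f = \sum_{\substack{d \\ f \mid d \mid P}} \frac{b(d)}{d} \Lambda_d. \tag{3.29} \]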
The linear change of variables from $\Lambda_d$ to $y_f$ is invertible:
\[ \Lambda_d = \frac{d}{b(d)} \sum_{\substack{f \\ d \mid f \mid P}} y_f\, \mu(f/d). \tag{3.30} \]
By the above formulæ we see that the condition that $\Lambda_d = 0$ for $d > z$ is equivalent to the condition that $y_f = 0$ for $f > z$. Also, the condition that $\Lambda_1 = 1$ is equivalent to
\[ \sum_{f \mid P} y_f\, \mu(f) = 1. \tag{3.31} \]
For such $y_f$ we see that
\[ \sum_{f \mid P} \frac{1}{g(f)}\, y_f^2 = \sum_{\substack{f \mid P \\ f \le z}} \frac{1}{g(f)} \big( y_f - \mu(f) g(f)/L \big)^2 + \frac{1}{L} \tag{3.32} \]
where
\[ L = \sum_{\substack{f \le z \\ f \mid P}} \mu(f)^2 g(f). \tag{3.33} \]
Thus our main term is minimized by taking
\[ y_f = \begin{cases} \mu(f) g(f)/L & (f \le z), \\ 0 & (\text{otherwise}), \end{cases} \tag{3.34} \]
and we note that these $y_f$ satisfy (3.31). The size of $L$ depends on $P$, $z$, and the $b(p)$. In the case of twin primes we obtain the following estimate.

Theorem 3.10 Let $P = \prod_{p \le \sqrt{y}} p$ where $y \ge 4$. The number of integers $n \in (x, x + y]$ such that $(n, P) = (n + 2, P) = 1$ does not exceed
\[ \frac{8cy}{(\log y)^2} \Big( 1 + O\Big( \frac{\log \log y}{\log y} \Big) \Big) \]
where
\[ c = 2 \prod_{p > 2} \Big( 1 - \frac{1}{(p - 1)^2} \Big). \]

The number of primes $p \in (x, x + y]$ for which $p \mid P$ is $\le \pi(\sqrt{y})$. Likewise, the number of primes $p \in (x, x + y]$ for which $p + 2$ is prime and $(p + 2) \mid P$ is $\le \pi(\sqrt{y})$. Otherwise, if $p \in (x, x + y]$ and $p + 2$ is prime, then $(p, P) = (p + 2, P) = 1$; the number of such $p$ is bounded by the above. Since $\pi(\sqrt{y})$ is negligible by comparison, the above bound applies also to the number of primes $p \in (x, x + y]$ for which $p + 2$ is prime.
Proof We first estimate $L$ as given in (3.33). We have $b(2) = 1$ and $b(p) = 2$ for $p > 2$. Since $\mu(m)^2 g(m)$ is a multiplicative function that takes the value $2/(p - 2)$ when $m = p > 2$, and since $d(n)/n$ is a multiplicative function that takes the value $2/p$ when $n = p$, we expect that $d(n)/n$ and $\mu(m)^2 g(m)$ are 'close' in the sense that we can obtain the latter function by convolving $d(n)/n$ with a fairly tame function $c(k)$. On comparing the Euler products of the respective Dirichlet series generating functions, we see that if the $c(k)$ are defined so that
\[ \sum_{k=1}^\infty c(k) k^{-s} = (1 + 2^{-s})(1 - 2^{-s-1})^2 \prod_{p > 2} \Big( 1 + \frac{2}{(p - 2)p^s} \Big) \Big( 1 - \frac{1}{p^{s+1}} \Big)^2, \tag{3.35} \]
then
\[ \mu(m)^2 g(m) = \sum_{\substack{k, n \\ kn = m}} c(k)\, d(n)/n. \]
Hence
\[ L = \sum_{m \le z} \mu(m)^2 g(m) = \sum_{k \le z} c(k) \sum_{n \le z/k} d(n)/n. \]
By Theorem 2.3 and (Riemann–Stieltjes) integration by parts we see that
\[ \sum_{n=1}^N \frac{d(n)}{n} = \frac{1}{2} (\log N)^2 + O(\log N). \]
Hence
\[ L = \sum_{k \le z} c(k) \Big( \tfrac{1}{2} (\log z/k)^2 + O(\log z) \Big) \]
\[ = \frac{1}{2} (\log z)^2 \sum_{k \le z} c(k) + O\Big( (\log z) \sum_k |c(k)| \log 2k \Big) + O\Big( \sum_k |c(k)| (\log k)^2 \Big). \]
The Euler product in (3.35) is absolutely convergent for $\sigma > -1/2$. Hence $\sum |c(k)| k^{-\sigma} < \infty$ for $\sigma > -1/2$. Thus the two sums in the error terms above are convergent. Also,
\[ \sum_{k > z} |c(k)| \le \frac{1}{\log z} \sum_{k=1}^\infty |c(k)| \log k \ll \frac{1}{\log z}. \]
Thus by taking $s = 0$ in (3.35) we find that
\[ L = \frac{1}{2c} (\log z)^2 + O(\log z). \tag{3.36} \]
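It remains to bound the error term in (3.26). Since $0 \le b([d, e]) \le b(d)b(e)$, the error term is
\[ \ll \Big( \sum_{d \le z} b(d) |\Lambda_d| \Big)^2. \]
From (3.30) and (3.34) we see that
\[ \Lambda_d = \frac{d}{b(d) L} \sum_{\substack{f \le z \\ d \mid f}} \mu(f) g(f) \mu(f/d) = \frac{\mu(d)\, d\, g(d)}{b(d) L} \sum_{\substack{m \le z/d \\ (m, d) = 1}} \mu(m)^2 g(m). \]
Hence
\[ \sum_{d \le z} b(d) |\Lambda_d| \ll \frac{1}{L} \sum_{d \le z} \mu(d)^2 d\, g(d) \sum_{m \le z/d} \mu(m)^2 g(m) = \frac{1}{L} \sum_{m \le z} \mu(m)^2 g(m) \sum_{d \le z/m} \mu(d)^2 d\, g(d). \]
By Corollary 2.15 we see that
\[ \sum_{d \le D} \mu(d)^2 d\, g(d) \ll \frac{D}{\log D} \prod_{p \le D} (1 + g(p)) \ll \frac{D}{\log D} \prod_{p \le D} \Big( 1 - \frac{1}{p} \Big)^{-2} \ll D \log D. \]
Since $L \gg (\log z)^2$, it follows that
\[ \sum_{d \le z} b(d) |\Lambda_d| \ll \frac{z}{\log z} \sum_{m \le z} \mu(m)^2 g(m)/m \ll \frac{z}{\log z}. \]
On combining our estimates, we see that the number of $n$, $x < n \le x + y$, such that $(a(n), P) = 1$ is
\[ \le \frac{2cy}{(\log z)^2} + O\Big( \frac{y}{(\log z)^3} \Big) + O\Big( \frac{z^2}{(\log z)^2} \Big). \]
In order that the last error term is majorized by the one before it, we take $z = (y/\log y)^{1/2}$. Then
\[ \log z = \frac{1}{2} \log y + O(\log \log y), \]
so we obtain the stated result.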

Corollary 3.11 (Brun) Let $\sum^*_p$ denote a sum over those primes $p$ for which $p + 2$ is prime. Then $\sum^*_p 1/p$ converges.

Proof The number of twin primes for which $2^{k-1} < p \le 2^k$ is $\ll 2^k/k^2$. Hence the contribution of such primes to the sum in question is $\ll 1/k^2$. But $\sum 1/k^2 < \infty$, so we obtain the stated result.

Let $r$ be an even non-zero integer. To bound the number of primes $p$ for which $p + r$ is also prime, it suffices to establish the following monotonicity principle, which is a natural generalization of Lemma 3.5.

Lemma 3.12 For each prime $p$ let $B(p)$ be the union of $b(p)$ arithmetic progressions with common difference $p$. Put $B = \bigcup_{p \mid P} B(p)$, and set
\[ M(x, y; b) = \max_B \sum_{\substack{x < n \le x + y \\ n \notin B}} 1 \]
where the maximum is over all choices of the $B(p)$ with $b(p)$ fixed. If $0 \le b_1(p) \le b_2(p) < p$ for all $p$, then
\[ M(x, y; b_1) \prod_{p \mid P} \Big( 1 - \frac{b_1(p)}{p} \Big)^{-1} \le M(x, y; b_2) \prod_{p \mid P} \Big( 1 - \frac{b_2(p)}{p} \Big)^{-1}. \]

Proof We induct on $\sum_{p \mid P} (b_2(p) - b_1(p))$. If $b_1(p) = b_2(p)$ for all $p \mid P$, then we have equality in the above. Let $p' \mid P$ be a prime for which $b_1(p') < b_2(p')$. Suppose that the $B_1(p)$ are chosen so that $\operatorname{card} B_1(p) = b_1(p)$ and
\[ \sum_{\substack{x < n \le x + y \\ n \notin B_1}} 1 = M(x, y; b_1). \]
We note that
\[ \sum_{\substack{b = 1 \\ b \notin B_1(p')}}^{p'} \ \sum_{\substack{x < n \le x + y \\ n \notin B_1 \\ n \not\equiv b \ (p')}} 1 = \sum_{\substack{x < n \le x + y \\ n \notin B_1}} \ \sum_{\substack{b = 1 \\ b \notin B_1(p') \\ b \not\equiv n \ (p')}}^{p'} 1. \tag{3.37} \]
Consider the inner sum on the right. Since $n \notin B_1(p')$, the variable $b$ is restricted to lie in one of $p' - b_1(p') - 1$ residue classes. Hence the right-hand side above is
\[ = (p' - b_1(p') - 1) M(x, y; b_1). \]
Since there are $p' - b_1(p')$ values of $b$ in the outer sum on the left-hand side of (3.37), it follows that there is a choice of $b$ such that $b \notin B_1(p')$ and
\[ \sum_{\substack{x < n \le x + y \\ n \notin B_1 \\ n \not\equiv b \ (p')}} 1 \ge M(x, y; b_1)\, \frac{p' - b_1(p') - 1}{p' - b_1(p')}. \]
Let $b_1'(p) = b_1(p)$ for $p \ne p'$, $b_1'(p') = b_1(p') + 1$. The left-hand side above is $\le M(x, y; b_1')$, which by the inductive hypothesis is
\[ \le M(x, y; b_2)\, \frac{p' - b_1(p') - 1}{p' - b_2(p')} \prod_{\substack{p \mid P \\ p \ne p'}} \frac{p - b_1(p)}{p - b_2(p)}. \]
Thus
\[ M(x, y; b_1) \le M(x, y; b_2) \prod_{p \mid P} \frac{p - b_1(p)}{p - b_2(p)}, \]
and the induction is complete.

By combining Theorem 3.10 and Lemma 3.12, we obtain

Theorem 3.13 Suppose that $y \ge 4$. Let $B(p)$ be the union of $b(p)$ arithmetic progressions with common difference $p$, and put $B = \bigcup_{p \mid P} B(p)$. If $b(2) \le 1$ and $b(p) \le 2$ for $p > 2$, then the number of $n \in (x, x + y]$ such that $n \notin B$ is
\[ \le 8\, \frac{y}{(\log y)^2} \prod_{p \mid P} \Big( 1 - \frac{b(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2} \Big( 1 + O\Big( \frac{\log \log y}{\log y} \Big) \Big). \]

Corollary 3.14 Let $r$ be an even non-zero integer, and suppose that $y \ge 4$. The number of primes $p \in (x, x + y]$ such that $p + r$ is also prime is
\[ \le \frac{8 c(r) y}{(\log y)^2} \Big( 1 + O\Big( \frac{\log \log y}{\log y} \Big) \Big) \]
uniformly in $r$ where
\[ c(r) = \prod_{p \mid r} \Big( 1 - \frac{1}{p} \Big)^{-1} \prod_{p \nmid r} \Big( 1 - \frac{2}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2} = \Bigg( \prod_{\substack{p \mid r \\ p > 2}} \frac{p - 1}{p - 2} \Bigg) c \]
and $c$ is the constant in Theorem 3.10.

Suppose that $r$ is a fixed even non-zero integer. It is conjectured that the number of primes $p \le y$ such that $p + r$ is also prime is asymptotic to
\[ \frac{c(r) y}{(\log y)^2} \]
as $y$ tends to infinity. Thus the bound we have derived is larger than this by a factor of 8. We conclude with an application of the above.
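The constant $c(r)$ and the count of prime pairs can be compared in a small experiment. In the sketch below the value `C_TWIN` is the constant $c$ of Theorem 3.10 rounded to seven decimals, and the function names, the bound $10^5$, and the sample shifts $r = 2, 6, 30$ are assumptions for illustration. The error factor of Corollary 3.14 is ignored.

```python
from math import isqrt, log

C_TWIN = 1.3203236   # c = 2 * prod_{p > 2} (1 - 1/(p - 1)^2), rounded

def odd_prime_factors(r):
    """Distinct odd prime factors of a non-zero integer r."""
    r = abs(r)
    while r % 2 == 0:
        r //= 2
    factors, p = [], 3
    while p * p <= r:
        if r % p == 0:
            factors.append(p)
            while r % p == 0:
                r //= p
        p += 2
    if r > 1:
        factors.append(r)
    return factors

def c_of_r(r):
    """c(r) = c * prod over odd p | r of (p - 1)/(p - 2), for even r != 0."""
    value = C_TWIN
    for p in odd_prime_factors(r):
        value *= (p - 1) / (p - 2)
    return value

def pi2(x, r):
    """Number of primes p <= x such that p + r is also prime (r > 0)."""
    n = x + r
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return sum(1 for p in range(2, x + 1) if sieve[p] and sieve[p + r])

y = 10**5
for r in (2, 6, 30):
    conjectured = c_of_r(r) * y / log(y) ** 2
    print(r, pi2(y, r), round(conjectured, 1), round(8 * conjectured, 1))
```

The observed counts track the conjectured size $c(r)y/(\log y)^2$ and lie well below eight times that quantity.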
Theorem 3.15 (Romanoff) Let $N(x)$ denote the number of integers $n \le x$ that can be expressed as a sum of a prime and a power of 2. Then $N(x) \gg x$ for $x \ge 4$.

Proof Let $r(n)$ denote the number of solutions of $n = p + 2^k$. By Cauchy's inequality,
\[ \Big( \sum_{n \le x} r(n) \Big)^2 \le N(x) \sum_{n \le x} r(n)^2. \]
Thus to complete the proof it suffices to show that
\[ \sum_{n \le x} r(n) \gg x \qquad (x \ge 4), \tag{3.38} \]
and that
\[ \sum_{n \le x} r(n)^2 \ll x. \tag{3.39} \]
The first of these estimates is easy: Put $y = [(\log x)/\log 2]$. If $0 \le k \le y - 1$, then $2^k \le x/2$, and if also $p \le x/2$, then $p + 2^k \le x$. Thus the sum in (3.38) is
\[ \ge \pi(x/2)\, y \gg \frac{x}{\log x} \log x \gg x \]
for $x \ge 4$.

To prove (3.39), we first observe that the sum on the left-hand side is
\[ = \sum_{\substack{p_1, p_2, j, k \\ p_1 + 2^j \le x \\ p_2 + 2^k \le x \\ p_1 + 2^j = p_2 + 2^k}} 1. \]
This sum includes 'diagonal' terms, in which $p_1 = p_2$ and $j = k$; there are $\ll x/\log x$ choices for $p_1$ and $\ll \log x$ choices for $j$, so there are $\ll x$ such terms. The remaining terms above contribute an amount that is
\[ \ll \sum_{0 \le j < k \le y} \pi_2(x, 2^k - 2^j) \tag{3.40} \]
where $\pi_2(x, r)$ denotes the number of primes $p \le x$ for which $p + r$ is also prime. From Corollary 3.14 we know that if $r \ne 0$, then
\[ \pi_2(x, r) \ll \frac{x}{(\log x)^2} \prod_{\substack{p \mid r \\ p > 2}} \Big( 1 + \frac{1}{p} \Big) \ll \frac{x}{(\log x)^2} \sum_{\substack{m \mid r \\ 2 \nmid m}} \frac{1}{m}, \]
uniformly in $r$. Thus the expression (3.40) is
\[ \ll \frac{x}{(\log x)^2} \sum_{0 \le j < k \le y} \ \sum_{\substack{m \mid (2^k - 2^j) \\ 2 \nmid m}} \frac{1}{m}. \]
Put $n = k - j$. Thus $0 < n \le y$. Let $h_2(m)$ denote the order of 2 modulo $m$, which is to say that $h_2(m)$ is the least positive integer $h$ such that $2^h \equiv 1 \pmod m$. We note that $m \mid (2^n - 1)$ if and only if $h_2(m) \mid n$. The number of such $n$, $0 < n \le y$, is $\le y/h_2(m)$. There are also $\le y$ choices of $j$. Thus to complete the proof of (3.39) it suffices to show that
\[ \sum_{\substack{m \\ 2 \nmid m}} \frac{1}{m\, h_2(m)} < \infty. \tag{3.41} \]
To this end, let
\[ a_n = \sum_{\substack{m \\ 2 \nmid m \\ h_2(m) = n}} \frac{1}{m}, \]
and set $A(x) = \sum_{n \le x} a_n$. We shall show that
\[ A(x) \ll \log x. \tag{3.42} \]
By summation by parts it follows that $\sum a_n/n$ converges. (Alternatively, we could appeal to Theorem 1.3, from which we see that $\sum a_n/n^s$ converges for $\sigma > 0$.) This suffices, since the sum in (3.41) is $\sum a_n/n$.

It remains to establish (3.42). Set
\[ P = P(x) = \prod_{n \le x} (2^n - 1). \]
If $h_2(m) = n \le x$, then $m \mid P$. Hence
\[ A(x) \le \sum_{m \mid P} \frac{1}{m} \le \prod_{p \mid P} \Big( 1 + \frac{1}{p} + \frac{1}{p^2} + \cdots \Big) = \frac{P}{\varphi(P)} \ll \log \log P \]
by Theorem 2.9. But $P \le 2^{x^2}$, so we have (3.42), and the proof is complete.
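The quantities in the proof of Theorem 3.15 can be tabulated for moderate $x$. The following sketch computes $r(n)$ for $n \le x$, then $N(x)$ and the two sums appearing in Cauchy's inequality. The choice $x = 10^4$ and the function names are assumptions for illustration.

```python
from math import isqrt

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def representation_counts(x):
    """r[n] = number of solutions of n = p + 2**k with p prime and k >= 0."""
    r = [0] * (x + 1)
    for p in primes_up_to(x):
        power = 1
        while p + power <= x:
            r[p + power] += 1
            power *= 2
    return r

x = 10**4
r = representation_counts(x)
N = sum(1 for value in r if value > 0)      # N(x)
first_moment = sum(r)                       # sum of r(n), as in (3.38)
second_moment = sum(v * v for v in r)       # sum of r(n)^2, as in (3.39)
print(N / x, first_moment / x, second_moment / x)
```

Cauchy's inequality gives $N(x) \ge (\sum r(n))^2 / \sum r(n)^2$, so a bounded ratio of the second moment to $x$ forces a positive proportion of representable integers.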

3.4.1 Exercises
1. For each prime $p$ let $B(p)$ be the union of $b(p)$ 'bad' arithmetic progressions with common difference $p$. Put $B = \bigcup_{p \mid P} B(p)$, and let
\[ m(x, y; b) = \min_B \sum_{\substack{x < n \le x + y \\ n \notin B}} 1 \]
where the minimum is over all choices of the $B(p)$ with $b(p)$ fixed. Show that if $b_1(p) \le b_2(p)$ for all $p$, then
\[ m(x, y; b_1) \prod_{p \mid P} \Big( 1 - \frac{b_1(p)}{p} \Big)^{-1} \ge m(x, y; b_2) \prod_{p \mid P} \Big( 1 - \frac{b_2(p)}{p} \Big)^{-1}. \]
2. Show that the number of primes $p \le 2n$ such that $2n - p$ is prime is
\[ \le 8c \Bigg( \prod_{\substack{p \mid n \\ p > 2}} \frac{p - 1}{p - 2} \Bigg) \frac{2n}{(\log 2n)^2} \Big( 1 + O\Big( \frac{\log \log 4n}{\log 2n} \Big) \Big) \]
where $c$ is the constant in Theorem 3.10.

3. (Erdős 1940, Ricci 1954)
(a) Show that
\[ \sum_{r \le x} c(r) = x + O(\log x) \]
where $c(r)$ is defined as in Corollary 3.14.

(b) Let $p'$ denote the least prime $> p$, and put $d(p) = p' - p$. Show that if $a$ and $b$ are fixed real numbers with $a < b$, then
\[ \sum_{\substack{p \le x \\ a \log p \le d(p) \le b \log p}} \log p \lesssim 8(b - a)x. \]

(c) Suppose that $f$ is a non-negative, properly Riemann-integrable function on a finite interval $[a, b]$. Show that
\[ \sum_{p \le x} f\Big( \frac{d(p)}{\log p} \Big) \log p \le (8 + o(1))\, x \int_a^b f(u)\, du. \]

(d) Show that if $a$ and $b$ are fixed real numbers with $a < b$, then
\[ \sum_{\substack{p \le x \\ a \log p \le d(p) \le b \log p}} (b \log p - d(p)) \lesssim 4(b - a)^2 x. \]

(e) Explain why
\[ \sum_{\substack{p \le x \\ d(p) > b \log p}} (d(p) - b \log p) \ge 0. \]

(f) Deduce that
\[ \sum_{\substack{p \le x \\ d(p) \ge a \log p}} (b \log p - d(p)) \lesssim 4(b - a)^2 x. \]

(g) Show that
\[ \sum_{p \le x} d(p) \sim x. \]

(h) Show that
\[ \sum_{p \le x} (b \log p - d(p)) = (b - 1 + o(1))x. \]
(i) Take $b = a + 1/8$, and suppose that $d(p) \ge a \log p$ for all $p > p_0$. Show that the estimates of (f) and (h) are inconsistent if $a > 15/16$. Thus conclude that
\[ \liminf_{p \to \infty} \frac{d(p)}{\log p} \le \frac{15}{16}. \]
4. Let $r(n)$ be defined as in the proof of Theorem 3.15. Show that
\[ \sum_{n \le x} r(n) \sim \frac{x}{\log 2}. \]
5. Let $r(n)$ be defined as in the proof of Theorem 3.15. Show that
\[ \sum_{\substack{n \le x \\ 2 \mid n}} r(n) \ll \frac{x}{\log x}. \]
6. (Erdős 1950)
(a) Show that if $n \equiv 1 \pmod 3$ and $k \equiv 0 \pmod 2$, then $3 \mid (n - 2^k)$.
(b) Show that if $n \equiv 1 \pmod 7$ and $k \equiv 0 \pmod 3$, then $7 \mid (n - 2^k)$.
(c) Show that if $n \equiv 2 \pmod 5$ and $k \equiv 1 \pmod 4$, then $5 \mid (n - 2^k)$.
(d) Show that if $n \equiv 8 \pmod{17}$ and $k \equiv 3 \pmod 8$, then $17 \mid (n - 2^k)$.
(e) Show that if $n \equiv 11 \pmod{13}$ and $k \equiv 7 \pmod{12}$, then $13 \mid (n - 2^k)$.
(f) Show that if $n \equiv 121 \pmod{241}$ and $k \equiv 23 \pmod{24}$, then $241 \mid (n - 2^k)$.
(g) Show that every integer $k$ satisfies at least one of the congruences $k \equiv 0 \pmod 2$, $k \equiv 0 \pmod 3$, $k \equiv 1 \pmod 4$, $k \equiv 3 \pmod 8$, $k \equiv 7 \pmod{12}$, $k \equiv 23 \pmod{24}$.
(h) Show that if $n$ satisfies all the congruences $n \equiv 1 \pmod 3$, $n \equiv 1 \pmod 7$, $n \equiv 2 \pmod 5$, $n \equiv 8 \pmod{17}$, $n \equiv 11 \pmod{13}$, $n \equiv 121 \pmod{241}$, then $n - 2^k$ is divisible by at least one of the primes 3, 7, 5, 17, 13, 241.
(i) Show that these congruential conditions are equivalent to the single condition $n \equiv 172677 \pmod{3728270}$.
(j) An integer $n$ satisfying the above might still be representable in the form $p + 2^k$, but if it is, then the prime in question must be one of the six primes listed. Show that if in addition, $n \equiv 9$ or $11$ or $15 \pmod{16}$, then $n$ cannot be expressed as a sum of a prime and a power of 2.
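The arithmetic behind parts (a) to (g) of Exercise 6 can be confirmed mechanically. The sketch below is a check, not a solution: the table `covering` simply transcribes the six congruence pairs from the exercise, and the variable names are illustrative.

```python
# Each entry: (residue of k, modulus for k, prime p, residue of n modulo p).
covering = [
    (0, 2, 3, 1),
    (0, 3, 7, 1),
    (1, 4, 5, 2),
    (3, 8, 17, 8),
    (7, 12, 13, 11),
    (23, 24, 241, 121),
]

# The order of 2 modulo p divides the modulus for k, so 2**k mod p depends
# only on the residue of k, and that residue class of powers matches n.
period_ok = all(pow(2, k_mod, p) == 1 for _, k_mod, p, _ in covering)
residue_ok = all(pow(2, k_res, p) == n_res for k_res, _, p, n_res in covering)

# Part (g): every k modulo 24 falls into at least one of the six classes.
covered = all(
    any(k % k_mod == k_res for k_res, k_mod, _, _ in covering)
    for k in range(24)
)
print(period_ok, residue_ok, covered)
```

Since all six moduli for $k$ divide 24, checking the residues $k = 0, \ldots, 23$ is enough for part (g).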

3.5 Notes
Sections 3.1, 3.2. The modern era of sieve methods began with the work
of Brun (1915, 1919). Hardy & Littlewood (1922) used Brun’s method to
establish the estimate (3.9). The sharp form of this in Corollary 3.4 is due
to Selberg (1952a,b). The $\Lambda^2$ method of Selberg (1947) provides only upper
bounds, but lower bounds can also be derived from it by using ideas of Buchstab
(1938).
In contrast to the elegance of the Selberg $\Lambda^2$ method, the further study of
sieves leads us to construct asymptotic estimates for complicated sums over
integers whose prime factors are distributed in certain ways. In this connection,
the argument (3.22) is a simple foretaste of more complicated things to come.
Hence further discussion of sieves is possible only after the appropriate technical
tools are in place.
In this chapter we have applied the sieve only to arithmetic progressions,
but it can be shown that the sieve is applicable to much more general sets. This
makes sieves very versatile, but it also means that they are subject to certain
unfortunate limitations. In order to estimate the number of elements of a set S
that remain after sifting, it sufﬁces to have a reasonably precise estimate of the
number X d of multiples of d in the set, say of the form X d = f (d)X/d + O(Rd )
where X is an estimate for the cardinality of S, and f is a multiplicative function.
Thus Theorem 3.3 can be generalized to much more general sets, and in that
more general setting it is known that the constant 2 is best-possible. It may be
true that the constant 2 can be improved in the special case that one is sifting
an interval, but this has not been achieved thus far.
When sifting an interval, the error terms can be avoided by using Fourier
analysis as in Selberg (1991, Sections 19–22), or by using the large sieve as
in Montgomery & Vaughan (1973). In particular, the number of integers in
$[M + 1, M + N]$ remaining after sifting is at most $N/L$ where
\[ L = \sum_{q \le Q} \frac{\mu(q)^2}{1 + \frac{3}{2} qQ/N} \prod_{p \mid q} \frac{b(p)}{p - b(p)}. \tag{3.43} \]

Here b( p) is the number of residue classes modulo p that are deleted. This is
both a generalization and a sharpening of Theorem 3.2.
Section 3.3. Titchmarsh (1930) used Brun’s method to obtain Theorem 3.9,
but with a larger constant instead of 2. Montgomery & Vaughan (1973) have
shown that Corollary 3.4 and Theorem 3.9 are still valid when the error terms are
omitted. See also Selberg (1991, Section 22). The ﬁrst signiﬁcant improvement
of Theorem 3.9 was obtained by Motohashi (1973). Other improvements of
various kinds have been derived by Motohashi (1974), Hooley (1972, 1975),
Goldfeld (1975), Iwaniec (1982), and Friedlander & Iwaniec (1997).
In Lemmas 3.5 and 3.12, and in Exercises 3.2.7, 3.2.9, 3.2.10, 3.4.1 we see
evidence of a monotonicity principle that permeates sieve theory; cf. Selberg
(1991, pp. 72–73).
Hooley (1994) has shown that quite sharp sieve bounds can be derived using
the interrupted inclusion–exclusion idea that Brun started with. This approach
has been developed further by Ford & Halberstam (2000). An exposition of
sieves based on these ideas is given by Bateman & Diamond (2004, Chapters 12,
13). Still more extensive accounts of sieve methods have been given by Greaves
(2001), Halberstam & Richert (1974), Iwaniec & Kowalski (2004, Chapter
6), Motohashi (1983), and Selberg (1971, 1991). In addition, a collection of
applications of sieves to arithmetic problems has been given by Hooley (1976),
and additional sieve ideas are found in Bombieri (1977), Bombieri, Friedlander
& Iwaniec (1986, 1987, 1989), Fouvry & Iwaniec (1997), Friedlander & Iwaniec
(1998a, b), and Iwaniec (1978, 1980a, b, 1981).
Section 3.4. The twin prime conjecture is a special case of the prime k-tuple
conjecture. Suppose that d1 , . . . , dk are distinct integers, and let b( p) denote
the number of distinct residue classes modulo p found among the di . The prime
k-tuple conjecture asserts that if b( p) < p for every prime number p, then there
exist infinitely many positive integers n such that the k numbers n + di are all
prime. Hardy & Littlewood (1922) put this in a quantitative form: If $b(p) < p$ for all $p$, then the number of $n \le N$ for which the $k$ numbers $n + d_i$ are all prime is conjectured to be
\[ \sim \mathfrak{S}(\mathbf{d}) \frac{N}{(\log N)^k} \tag{3.44} \]
as $N \to \infty$ where
\[ \mathfrak{S}(\mathbf{d}) = \prod_p \Big( 1 - \frac{b(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-k}. \tag{3.45} \]
This product is absolutely convergent, since $b(p) = k$ for all sufficiently large primes $p$. Although this remains unproved, by sifting we can obtain an upper bound of the expected order of magnitude. In particular, from (3.43) it can be shown that the number of $n$, $M + 1 \le n \le M + N$, for which the numbers $n + d_i$ are all prime is
\[ \lesssim 2^k k!\, \mathfrak{S}(\mathbf{d}) \frac{N}{(\log N)^k}. \tag{3.46} \]
Corollaries 3.4 and 3.14 are special cases of this.
Theorem 3.15 is due to Romanoff (1934). Once the bound for the number
of twin primes is in place, the hardest part of the proof is to establish the
estimate (3.41). Romanoff’s original proof of this was rather difficult. Erdős
& Turán (1935) gave a simpler proof, but the clever proof we have given is
due to Erdős (1951). Let $r(n)$ be defined as in the proof of Theorem 3.15. Erdős (1950) showed that $r(n) = \Omega(\log \log n)$, and that $\sum_{n \le x} r(n)^k \ll_k x$ for any positive $k$. Presumably $r(n) = o(\log n)$, but for all we know there could be, although it seems unlikely, infinitely many $n$ such that $n - 2^k$ is prime whenever $0 < 2^k < n$ (with $k \ge 1$). The number $n = 105$ has this property, and is probably the largest such number. The best upper bound we have for the number of such $n$ not exceeding $X$ is (Vaughan 1973),
\[ X \exp\Big( - \frac{c \log X \log \log \log X}{\log \log X} \Big). \]
For generalizations of Romanoff’s theorem, see Erdős (1950, 1951).
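The remark about $n = 105$ invites a direct search. The sketch below lists the integers $n \le 1000$ for which $n - 2^k$ is prime for every $k \ge 1$ with $2^k < n$; the restriction to $k \ge 1$, the search limit, and the function names are assumptions of this example.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))

def all_shifts_prime(n):
    """True if n - 2**k is prime for every k >= 1 with 2**k < n."""
    power = 2
    while power < n:
        if not is_prime(n - power):
            return False
        power *= 2
    return True

# Start at 3 so that at least one power 2**k < n is actually tested.
found = [n for n in range(3, 1001) if all_shifts_prime(n)]
print(found)
```

For $n = 105$ the differences are $103, 101, 97, 89, 73, 41$, all prime.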

3.6 References
Ankeny, N. C. & Onishi, H. (1964/1965). The general sieve, Acta Arith. 10, 31–62.
Bateman, P. T. & Diamond, H. (2004). Analytic Number Theory, Hackensack: World
Scientific.
Behrend, F. A. (1948). Generalization of an inequality of Heilbronn and Rohrbach, Bull.
Amer. Math. Soc. 54, 681–684.
Bombieri, E. (1977). The asymptotic sieve, Rend. Accad. Naz. XL (5) 1/2 (1975/76),
243–269.
Bombieri, E., Friedlander, J. B., & Iwaniec, H. (1986). Primes in arithmetic progressions
to large moduli, Acta Math. 156, 203–251.
(1987). Primes in arithmetic progressions to large moduli, II, Math. Ann. 277, 361–
393.
(1989). Primes in arithmetic progressions to large moduli, III, J. Amer. Math. Soc. 2,
215–224.
Brun, V. (1915). Über das Goldbachsche Gesetz und die Anzahl der Primzahlpaare,
Archiv for Math. og Naturvid. B 34, no. 8, 19 pp.
(1919). La série 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/29 + 1/31 +
1/41 + 1/43 + 1/59 + 1/61 + · · · où les dénominateurs sont “nombres premiers
jumeaux” est convergente ou finie, Bull. Sci. Math. (2) 43, 100–104; 124–128.
(1967). Reflections on the sieve of Eratosthenes, Norske Vid. Selsk. Skr. Trondheim,
no. 1, 9 pp.
Buchstab, A. A. (1938). New improvements in the method of the sieve of Eratosthenes,
Mat. Sb. (N. S.) 4 (46), 375–387.
Chowla, S. (1932). Contributions to the analytic theory of numbers, Math. Z. 35, 279–
299.
Chung, K.-L. (1941). A generalization of an inequality in the elementary theory of
numbers, J. Reine Angew. Math. 183, 193–196.
van der Corput, J. G. (1958). Inequalities involving least common multiple and other
arithmetical functions, Nederl. Akad. Wetensch. Proc. Ser. A 61 (= Indag. Math.
20), 5–15.
Erdős, P. (1940). The difference of consecutive primes, Duke Math. J. 6, 438–441.
(1946). On the coefficients of the cyclotomic polynomial, Bull. Amer. Math. Soc. 52,
179–184.
(1950). On integers of the form $2^k + p$ and some related problems, Summa Brasil.
Math. 2, 113–123.
(1951). On some problems of Bellman and a theorem of Romanoff, J. Chinese Math.
Soc. (N. S.) 1, 409–421.
Erdős, P. & Turán, P. (1935). Ein zahlentheoretischer Satz, Mitt. Forsch. Inst. Math.
Mech. Univ. Tomsk 1, 101–103.
Ford, K. & Halberstam, H. (2000). The Brun–Hooley sieve, J. Number Theory 81,
335–350.
Fouvry, E. & Iwaniec, H. (1997). Gaussian primes, Acta Arith. 79, 249–287.
Friedlander, J. B. & Iwaniec, H. (1997). The Brun–Titchmarsh theorem, Analytic Number
Theory (Kyoto, 1996). London Math. Soc. Lecture Note Ser. 247, Cambridge:
Cambridge University Press, pp. 85–93.
(1998a). The polynomial $X^2 + Y^4$ captures its primes, Ann. of Math. (2) 148, 945–
1040.
(1998b). Asymptotic sieve for primes, Ann. of Math. (2) 148, 1041–1065.
Goldfeld, D. M. (1975). A further improvement of the Brun–Titchmarsh theorem, J.
London Math. Soc. (2) 11, 434–444.
Greaves, G. (2001). Sieves in Number Theory. Berlin: Springer.
Halberstam, H. (1985). Lectures on the linear sieve, Topics in Analytic Number Theory
(Austin, 1982). Austin: University of Texas Press, pp. 165–220.
Halberstam, H. & Richert, H.-E. (1973). Brun’s method and the fundamental lemma,
Acta Arith. 24, 113–133.
(1974). Sieve Methods. London: Academic Press.
(1975). Brun’s method and the fundamental lemma. II, Acta Arith. 27, 51–59.
Hardy, G. H. & Littlewood, J. E. (1922). Some problems of ‘Partitio Numerorum’: III.
On the expression of a number as a sum of primes, Acta Math. 44, 1–70; Collected
Papers, Vol. I, London: Oxford University Press, 1966, pp. 561–630.
Heilbronn, H. (1937). On an inequality in the elementary theory of numbers, Proc.
Cambridge Philos. Soc. 33, 207–209.
Hensley, D. (1978). An almost-prime sieve, J. Number Theory 10, 250–262; Corrigen-
dum, 12, (1980), 437.
Hooley, C. (1972). On the Brun–Titchmarsh theorem, J. Reine Angew. Math. 255,
60–79.
(1975). On the Brun–Titchmarsh theorem, II, Proc. London Math. Soc. (3) 30, 114–
128.
(1976). Applications of Sieve Methods to the Theory of Prime Numbers, Cambridge
Tract 70. Cambridge: Cambridge University Press.
(1994). An almost pure sieve, Acta Arith. 66, 359–368.
Iwaniec, H. (1978). Almost-primes represented by quadratic polynomials, Invent. Math.
47, 171–188.
(1980a). Rosser’s sieve, Acta Arith. 36, 171–202.
(1980b). A new form of the error term in the linear sieve, Acta Arith. 37, 307–320.
(1981). Rosser’s sieve – bilinear forms of the remainder terms – some applications.
Recent Progress in Analytic Number Theory, Vol. 1. New York: Academic Press,
pp. 203–230.
(1982). On the Brun–Titchmarsh theorem, J. Math. Soc. Japan 34, 95–123.
Iwaniec, H. & Kowalski, E. (2004). Analytic Number Theory, Colloquium Publications
53. Providence: Amer. Math. Soc.
Jurkat, W. B. & Richert, H.-E. (1965). An improvement in Selberg’s sieve method, I,
Acta Arith. 11, 217–240.
Lehmer, D. H. (1955). The distribution of totatives, Canad. J. Math. 7, 347–357.
van Lint, J. H. & Richert, H.-E. (1964). Über die Summe $\sum_{n \le x,\ p(n) < y} \mu^2(n)/\varphi(n)$, Nederl. Akad.
Wetensch. Proc. Ser. A 67 (= Indag. Math. 26), 582–587.
(1965). On primes in arithmetic progressions, Acta Arith. 11, 209–216.
Montgomery, H. L. (1968). A note on the large sieve, J. London Math. Soc. 43,
93–98.
Montgomery, H. L. & Vaughan, R. C. (1973). The large sieve, Mathematika 20, 119–134.
(1979). Mean values of character sums, Canad. J. Math. 31, 476–487.
Motohashi, Y. (1973). On some improvements of the Brun–Titchmarsh theorem, II,
Research of analytic number theory (Proc. Sympos., Res. Inst. Math. Sci., Kyoto,
1973), Sūrikaisekikenkyūsho Kōkyūroku, No. 193, 97–109.
(1974). On some improvements of the Brun–Titchmarsh theorem, J. Math. Soc. Japan
26, 306–323.
(1975). On some improvements of the Brun–Titchmarsh theorem, III, J. Math. Soc.
Japan 27, 444–453.
(1983). Lectures on Sieve Methods and Prime Number theory. Tata Institute of Fun-
damental Research (Bombay). Berlin: Springer-Verlag.
Ricci, G. (1954). Sull’andamento della differenza di numeri primi consecutivi, Riv. Mat.
Univ. Parma 5, 3–54.
Riesel, H. & Vaughan, R. C. (1983). On sums of primes, Ark. Mat. 21, 46–74.
Rohrbach, H. (1937). Beweis einer zahlentheoretischen Ungleichung, J. Reine Angew.
Math. 177, 193–196.
Romanoff, N. P. (1934). Über einige Sätze der additiven Zahlentheorie, Math. Ann. 109,
668–678.
Selberg, A. (1947). On an elementary method in the theory of primes, Norske Vid. Selsk.
Forh., Trondhjem 19, no. 18, 64–67; Collected Papers, Vol. 1. Berlin: Springer-
Verlag, 1989, pp. 363–366.
(1952a). On elementary methods in primenumber-theory and their limitations, Den
11te Skandinaviske Matematikerkongress (Trondheim, 1949), Oslo: Johan Grundt
Tanums Forlag, pp. 13–22; Collected Papers, Vol. 1. Berlin: Springer-Verlag, 1989,
pp. 388–397.
(1952b). The general sieve-method and its place in prime-number theory. Proceedings
of the International Congress of Mathematicians (Cambridge MA, 1950), Vol. 1,
Providence: Amer. Math. Soc., pp. 286–292; Collected Papers, Vol. 1. Berlin:
Springer-Verlag, 1989, pp. 411–417.
(1971). Sieve methods, Proceedings of Symposium on Pure Mathematics (SUNY
Stony Brook, 1969), Vol. XX. Providence: Amer. Math. Soc., 311–351; Collected
Papers, Vol. 1. Berlin: Springer-Verlag, 1989, pp. 568–608.
(1972). Remarks on sieves, Proceedings of the Number Theory Conference (Boulder
CO Aug. 14–18), pp. 205–216; Collected Papers, Vol. 1. Berlin: Springer-Verlag,
1989, pp. 609–615.
(1989). Sifting problems, sifting density and sieves, Number Theory, Trace Formulas,
and Discrete Groups (Oslo, 1987), K. E. Aubert, E. Bombieri, D. Goldfeld, eds.
Boston: Academic Press, pp. 467–484; Collected Papers, Vol. 1. Berlin: Springer-
Verlag, 1989, pp. 675–69.
(1991). Lectures on Sieves, Collected Papers, Vol. 2. Berlin: Springer-Verlag,
pp. 65–247.
Titchmarsh, E. C. (1930). A divisor problem, Rend. Circ. Math. Palermo 54, 414–429.
Tsang, K. M. (1989). Remarks on the sieving limit of the Buchstab–Rosser sieve, Number
Theory, Trace Formulas and Discrete Groups (Oslo, 1987). Boston: Academic
Press, pp. 485–502.
Vaughan, R. C. (1973). Some applications of Montgomery’s sieve, J. Number Theory 5,
64–79.
Vijayaraghavan, T. (1951). On a problem in elementary number theory, J. Indian Math.
Soc. (N.S.) 15, 51–56.
4
Primes in arithmetic progressions: I

4.1 Additive characters

If $f(z) = \sum_{n=0}^\infty c_n z^n$ is a power series, we can restrict our attention to terms for which $n$ has prescribed parity by considering
\[ \frac{1}{2} f(z) + \frac{1}{2} f(-z) = \sum_{\substack{n = 0 \\ n \equiv 0 \ (2)}}^\infty c_n z^n \]
or
\[ \frac{1}{2} f(z) - \frac{1}{2} f(-z) = \sum_{\substack{n = 0 \\ n \equiv 1 \ (2)}}^\infty c_n z^n. \]
That is, we can express the characteristic function of an arithmetic progression (mod 2) as a linear combination $\frac{1}{2} 1^n \pm \frac{1}{2} (-1)^n$ of $1^n$ and $(-1)^n$. Here $1$ and $-1$ are the square-roots of 1, and we can similarly express the characteristic function of an arithmetic progression (mod $q$) as a linear combination of the sequences $\zeta^n$ where $\zeta$ runs over the $q$ different $q$th roots of unity. We write $e(\theta) = e^{2\pi i \theta}$, and then the $q$th roots of unity are the numbers $\zeta = e(a/q)$ for $1 \le a \le q$. If $(a, q) = 1$ then the least positive integer $n$ such that $\zeta^n = 1$ is $q$, and we say that $\zeta$ is a primitive $q$th root of unity. From the formula
\[ \sum_{k=0}^{q-1} \zeta^k = \frac{1 - \zeta^q}{1 - \zeta} \]
for the sum of a geometric progression, we see that if $\zeta$ is a $q$th root of unity then
\[ \sum_{k=1}^{q} \zeta^k = 0 \]
unless $\zeta = 1$. Hence
\[ \frac{1}{q} \sum_{k=1}^{q} e(-ka/q)\, e(kn/q) = \begin{cases} 1 & \text{if } n \equiv a \pmod q, \\ 0 & \text{otherwise}, \end{cases} \tag{4.1} \]
and thus the characteristic function of an arithmetic progression (mod $q$) can be expressed as a linear combination of the sequences $e(kn/q)$. These functions are called the additive characters (mod $q$) because they are the homomorphisms from the additive group $(\mathbb{Z}/q\mathbb{Z})^+$ of integers (mod $q$) to the multiplicative group $\mathbb{C}^\times$ of non-zero complex numbers.
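The orthogonality relation (4.1) is easy to test in floating-point arithmetic. The sketch below evaluates the left-hand side for sample values; the function names `e` and `progression_indicator` and the modulus $q = 12$ are assumptions for the example.

```python
import cmath

def e(theta):
    """e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def progression_indicator(n, a, q):
    """Left-hand side of (4.1): (1/q) * sum_{k=1}^{q} e(-ka/q) e(kn/q)."""
    return sum(e(-k * a / q) * e(k * n / q) for k in range(1, q + 1)) / q

q, a = 12, 5
values = {n: progression_indicator(n, a, q) for n in range(0, 30)}
in_progression = [n for n, v in values.items() if abs(v - 1) < 1e-9]
print(in_progression)
```

Up to rounding error the sum is 1 exactly when $n \equiv a \pmod q$ and 0 otherwise.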
In the language of linear algebra we see that the arithmetic functions of period $q$ form a vector space of dimension $q$. For any $k$, $1 \le k \le q$, the sequence $\{e(kn/q)\}_{n=-\infty}^{\infty}$ has period $q$, and these $q$ sequences form a basis for the space of $q$-periodic arithmetic functions. Indeed, the formula (4.1) expresses the $a$th elementary vector as a linear combination of the vectors $[e(n/q), e(2n/q), \ldots, e((q - 1)n/q), 1]$.
If $f(n)$ is an arithmetic function with period $q$ then we define the finite Fourier transform of $f$ to be the function
\[ \widehat{f}(k) = \frac{1}{q} \sum_{n=1}^{q} f(n)\, e(-kn/q). \tag{4.2} \]
To obtain a Fourier representation of $f$ we multiply both sides of (4.1) by $f(n)$ and sum over $n$ to see that
\[ f(a) = \sum_{n=1}^{q} \frac{f(n)}{q} \sum_{k=1}^{q} e(-ka/q)\, e(kn/q) = \sum_{k=1}^{q} e(-ka/q)\, \frac{1}{q} \sum_{n=1}^{q} f(n)\, e(kn/q) = \sum_{k=1}^{q} e(-ka/q)\, \widehat{f}(-k). \]
Here the exact values that $k$ runs through are immaterial, as long as the set of these values forms a complete residue system modulo $q$. Hence we may replace $k$ by $-k$ in the above, and so we see that
\[ f(n) = \sum_{k=1}^{q} \widehat{f}(k)\, e(kn/q). \tag{4.3} \]

This includes (4.1) as a special case, for if we take $f$ to be the characteristic
function of the arithmetic progression $a$ (mod $q$) then by (4.2) we have
$\hat f(k) = e(-ka/q)/q$, and then (4.3) coincides with (4.1). The pair (4.2), (4.3)
of inversion formulæ are analogous to the formula for the Fourier coefficients
and Fourier expansion of a function $f \in L^1(\mathbb{T})$, but the situation here is simpler
because our sums have only finitely many terms.
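The inversion pair (4.2), (4.3) can be sketched directly in code. This is an illustrative sketch only; the function names and the sample $q$-periodic function `f` below are assumptions made for the example, not anything from the text.

```python
import cmath

def e(theta):
    """e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def finite_fourier_transform(f, q):
    """Return {k: hat f(k)} for k = 1..q, following (4.2)."""
    return {
        k: sum(f(n) * e(-k * n / q) for n in range(1, q + 1)) / q
        for k in range(1, q + 1)
    }

def fourier_inversion(f_hat, q, n):
    """Right-hand side of (4.3): sum over k = 1..q of hat f(k) e(kn/q)."""
    return sum(f_hat[k] * e(k * n / q) for k in range(1, q + 1))

q = 10

def f(n):
    """An arbitrary q-periodic test function (assumed for illustration)."""
    r = n % q
    return r * r - 3 * r

f_hat = finite_fourier_transform(f, q)
reconstructed = [fourier_inversion(f_hat, q, n) for n in range(1, q + 1)]
max_error = max(abs(reconstructed[n - 1] - f(n)) for n in range(1, q + 1))
```

Up to rounding, `reconstructed` reproduces the values $f(1), \ldots, f(q)$, so `max_error` is of the order of machine precision.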
Let $v(h)$ be the vector $v(h) = [e(h/q), e(2h/q), \ldots, e((q-1)h/q), 1]$.
From (4.1) we see that two such vectors $v(h_1)$ and $v(h_2)$ are orthogonal
unless $h_1 \equiv h_2$ (mod $q$). These vectors are not normalized, but they all have the
same length $\sqrt{q}$, so apart from some rescaling, the transformation from $f$ to $\hat f$
is an isometry. More precisely, if $f$ has period $q$ and $\hat f$ is given by (4.2), then
by (4.3),
$$\sum_{n=1}^{q} |f(n)|^2 = \sum_{n=1}^{q} \Big|\sum_{k=1}^{q} \hat f(k)\,e(kn/q)\Big|^2 .$$

By expanding and taking the sum over $n$ inside, we see that this is
$$= \sum_{j=1}^{q}\sum_{k=1}^{q} \hat f(j)\,\overline{\hat f(k)} \sum_{n=1}^{q} e(jn/q)\,e(-kn/q).$$

By (4.1) the innermost sum is $q$ if $j = k$ and is 0 otherwise. Hence
$$\sum_{n=1}^{q} |f(n)|^2 = q \sum_{k=1}^{q} |\hat f(k)|^2 . \tag{4.4}$$

This is analogous to Parseval’s identity for functions $f \in L^2(\mathbb{T})$, or to
Plancherel’s identity for functions $f \in L^2(\mathbb{R})$.
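As a quick sanity check of (4.4), one can compare both sides for a short list of sample values. The list `values` below is an arbitrary assumption chosen for illustration, and the helper names are hypothetical.

```python
import cmath

def e(theta):
    """e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def finite_fourier_transform(values):
    """values[n-1] holds f(n) for n = 1..q; returns [hat f(1), ..., hat f(q)] as in (4.2)."""
    q = len(values)
    return [
        sum(values[n - 1] * e(-k * n / q) for n in range(1, q + 1)) / q
        for k in range(1, q + 1)
    ]

values = [3, -1, 4, 1, -5, 9, 2, -6]   # one period of a sample f, so q = 8
f_hat = finite_fourier_transform(values)

lhs = sum(abs(v) ** 2 for v in values)                 # sum of |f(n)|^2
rhs = len(values) * sum(abs(c) ** 2 for c in f_hat)    # q * sum of |hat f(k)|^2
```

The factor $q$ on the right-hand side reflects the normalization $1/q$ chosen in (4.2).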
Among the exponential sums that we shall have occasion to consider is
Ramanujan’s sum
$$c_q(n) = \sum_{\substack{a=1 \\ (a,q)=1}}^{q} e(an/q). \tag{4.5}$$

We now establish some of the interesting properties of this quantity.

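Theorem 4.1 As a function of $n$, $c_q(n)$ has period $q$. For any given $n$, $c_q(n)$
is a multiplicative function of $q$. Also,
$$\sum_{d \mid q} c_d(n) = \begin{cases} q & \text{if } q \mid n, \\ 0 & \text{otherwise.} \end{cases} \tag{4.6}$$
Finally,
$$c_q(n) = \sum_{d \mid (q,n)} d\,\mu(q/d) = \frac{\mu\big(q/(q,n)\big)}{\varphi\big(q/(q,n)\big)}\,\varphi(q). \tag{4.7}$$
The case $n = 1$ of this last formula is especially memorable:
$$\sum_{\substack{a=1 \\ (a,q)=1}}^{q} e(a/q) = \mu(q).$$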
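The identities in Theorem 4.1 lend themselves to a brute-force check. The sketch below computes $c_q(n)$ from the definition (4.5) and compares it with both expressions in (4.7); the helper functions (`mobius`, `phi`, and so on) are simple illustrative implementations written for this example, not optimized routines.

```python
import cmath
from math import gcd

def ramanujan_sum(q, n):
    """c_q(n) from the definition (4.5), rounded to the nearest integer."""
    total = sum(
        cmath.exp(2j * cmath.pi * a * n / q)
        for a in range(1, q + 1)
        if gcd(a, q) == 1
    )
    return round(total.real)

def mobius(m):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0          # a square divides the argument
            result = -result
        p += 1
    return -result if m > 1 else result

def phi(m):
    """Euler's function, counted directly."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def divisor_formula(q, n):
    """First expression in (4.7): sum over d | (q, n) of d * mu(q/d)."""
    g = gcd(q, n)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)

def closed_form(q, n):
    """Second expression in (4.7); the division is exact because phi(m) divides phi(q) when m | q."""
    m = q // gcd(q, n)
    return mobius(m) * phi(q) // phi(m)

for q in range(1, 25):
    for n in range(1, 30):
        assert ramanujan_sum(q, n) == divisor_formula(q, n) == closed_form(q, n)
```

Rounding in `ramanujan_sum` is safe here because $c_q(n)$ is an integer and the floating-point error is far below $1/2$ for moduli of this size.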

Proof The first assertion is evident, as each term in the sum (4.5) has period
$q$. As for the second, suppose that $q = q_1 q_2$ where $(q_1, q_2) = 1$. By the Chinese
Remainder Theorem, for each $a$ (mod $q$) there is a unique pair $a_1, a_2$ with $a_i$
determined (mod $q_i$), so that $a \equiv a_1 q_2 + a_2 q_1$ (mod $q$). Moreover, under this
correspondence we see that $(a, q) = 1$ if and only if $(a_i, q_i) = 1$ for $i = 1, 2$.
Then
$$\begin{aligned}
c_q(n) &= \sum_{\substack{a_1=1 \\ (a_1,q_1)=1}}^{q_1} \sum_{\substack{a_2=1 \\ (a_2,q_2)=1}}^{q_2} e\big((a_1 q_2 + a_2 q_1)n/(q_1 q_2)\big) \\
&= \Bigg(\sum_{\substack{a_1=1 \\ (a_1,q_1)=1}}^{q_1} e(a_1 n/q_1)\Bigg)\Bigg(\sum_{\substack{a_2=1 \\ (a_2,q_2)=1}}^{q_2} e(a_2 n/q_2)\Bigg) \\
&= c_{q_1}(n)\,c_{q_2}(n).
\end{aligned}$$

To establish (4.6), suppose that $d \mid q$, and consider those $a$, $1 \le a \le q$, such
that $(a, q) = d$. Put $b = a/d$. Then the numbers $a$ are in one-to-one correspondence
with those $b$, $1 \le b \le q/d$, for which $(b, q/d) = 1$. Hence
$$\begin{aligned}
\sum_{a=1}^{q} e(na/q) &= \sum_{d \mid q} \sum_{\substack{a=1 \\ (a,q)=d}}^{q} e(na/q) \\
&= \sum_{d \mid q} \sum_{\substack{b=1 \\ (b,q/d)=1}}^{q/d} e\big(nb/(q/d)\big) \\
&= \sum_{d \mid q} c_{q/d}(n).
\end{aligned}$$
By (4.1), the left-hand side above is $q$ when $q \mid n$, and is 0 otherwise. Thus we
have (4.6).
The first formula in (4.7) is merely the Möbius inverse of (4.6). To obtain
the second formula in (4.7), we begin by considering the special case in which
$q$ is a prime power, $q = p^k$:
$$\begin{aligned}
c_{p^k}(n) &= \sum_{\substack{a=1 \\ p \nmid a}}^{p^k} e(na/p^k) \\
&= \sum_{a=1}^{p^k} e(na/p^k) - \sum_{a=1}^{p^{k-1}} e(na/p^{k-1}).
\end{aligned}$$
Here the first sum is $p^k$ if $p^k \mid n$, and is 0 otherwise. Similarly, the second
sum is $p^{k-1}$ if $p^{k-1} \mid n$, and is 0 otherwise. Hence the above is
$$= \begin{cases} 0 & \text{if } p^{k-1} \nmid n, \\ -p^{k-1} & \text{if } p^{k-1} \,\|\, n, \\ p^k - p^{k-1} & \text{if } p^k \mid n \end{cases}
\;=\; \frac{\mu\big(p^k/(n, p^k)\big)}{\varphi\big(p^k/(n, p^k)\big)}\,\varphi(p^k).$$
The general case of (4.7) now follows because $c_q(n)$ is a multiplicative function
of $q$.

4.1.1 Exercises
1. Let $U = [u_{kn}]$ be the $q \times q$ matrix with elements $u_{kn} = e(kn/q)/\sqrt{q}$. Show
that $UU^* = U^*U = I$, i.e., that $U$ is unitary.
2. (Friedman 1957; cf. Reznick 1995)
(a) Show that
$$\int_0^1 \big(u\,e(\theta/2) + v\,e(-\theta/2)\big)^{2r}\,d\theta = \binom{2r}{r} u^r v^r$$
for any non-negative integer $r$ and arbitrary complex numbers $u$, $v$.
(b) Show that if $u = (x - iy)/2$, $v = (x + iy)/2$, then
$$x\cos\pi\theta + y\sin\pi\theta = u\,e(\theta/2) + v\,e(-\theta/2)$$
for all $\theta$.
(c) Show that
$$\int_0^1 \big(x\cos\pi\theta + y\sin\pi\theta\big)^{2r}\,d\theta = \binom{2r}{r} 2^{-2r}(x^2 + y^2)^r$$
for any non-negative integer $r$ and arbitrary real or complex numbers
$x$, $y$.
(d) Show that
$$\sum_{a=1}^{q} \big(u\,e^{\pi i a/q} + v\,e^{-\pi i a/q}\big)^{2r} = q\binom{2r}{r} u^r v^r$$
if $r$ is an integer, $0 \le r < q$.
(e) Show that
$$\sum_{a=1}^{q} \big(x\cos \pi a/q + y\sin \pi a/q\big)^{2r} = q\binom{2r}{r} 2^{-2r}(x^2 + y^2)^r$$
if $r$ is an integer, $0 \le r < q$.

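3. Show that $|c_q(n)| \le (q, n)$.

4. (Carmichael 1932)
(a) Show that if $q > 1$, then
$$\sum_{n=1}^{q} c_q(n) = 0.$$
(b) Show that if $q_1 \ne q_2$ and $[q_1, q_2] \mid N$, then
$$\sum_{n=1}^{N} c_{q_1}(n)\,c_{q_2}(n) = 0.$$
(c) Show that if $q \mid N$, then
$$\sum_{n=1}^{N} c_q(n)^2 = N\varphi(q).$$

5. (Grytczuk 1981; cf. Redmond 1983) Show that
$$\sum_{d \mid q} |c_d(n)| = 2^{\omega(q/(q,n))}(q, n).$$

6. (Ramanujan 1918) Show that
$$\frac{\varphi(n)}{n} = \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} \sum_{q \mid d} c_q(n) = \sum_{q=1}^{\infty} a_q c_q(n)$$
where
$$a_q = \frac{6}{\pi^2}\,\frac{\mu(q)}{q^2} \prod_{p \mid q}\Big(1 - \frac{1}{p^2}\Big)^{-1}.$$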

7. (Wintner 1943, Sections 33–35) The orthogonality relations of Exercise 4
give us hope that it might be possible to represent an arithmetic function
$F(n)$ in the form
$$F(n) = \sum_{q=1}^{\infty} a_q c_q(n) \tag{4.8}$$
by taking
$$a_q = \frac{1}{\varphi(q)} \lim_{x\to\infty} \frac{1}{x}\sum_{n \le x} F(n)\,c_q(n). \tag{4.9}$$
In the following, suppose that $f(r)$ is chosen so that $F(n) = \sum_{r \mid n} f(r)$ for
all $n$.
(a) Suppose that
$$\sum_{r=1}^{\infty} \frac{|f(r)|}{r} < \infty. \tag{4.10}$$
Let $d$ be a fixed positive integer. Show that
$$\sum_{\substack{n \le x \\ d \mid n}} F(n) = \frac{x}{d}\sum_{r=1}^{\infty} \frac{f(r)}{r}(d, r) + o(x)$$
as $x \to \infty$.
(b) Suppose that (4.10) holds. Show that
$$\lim_{x\to\infty} \frac{1}{x}\sum_{n \le x} F(n)\,c_q(n) = \varphi(q) \sum_{\substack{r=1 \\ q \mid r}}^{\infty} \frac{f(r)}{r}.$$
Show that if
$$\sum_{r=1}^{\infty} \frac{|f(r)|\,d(r)}{r} < \infty \tag{4.11}$$
then (4.8) and (4.9) hold, and moreover that $\sum_{q=1}^{\infty} |a_q c_q(n)| < \infty$.
8. (Ramanujan 1918) Show that if $q > 1$, then $\sum_{n=1}^{\infty} c_q(n)/n = -\Lambda(q)$. (See
also Exercise 8.3.4.)
9. Let $\Phi_q(z)$ denote the $q$th cyclotomic polynomial, i.e., the monic polynomial
whose roots are precisely the primitive $q$th roots of unity, so that
$$\Phi_q(z) = \prod_{\substack{n=1 \\ (n,q)=1}}^{q} \big(z - e(n/q)\big).$$
(a) Show that
$$\Phi_q(z) = \prod_{d \mid q} (z^d - 1)^{\mu(q/d)}$$
and that $(z^d - 1)^{\mu(q/d)}$ has a power series expansion, valid when $|z| < 1$,
with integer coefficients. Deduce that $\Phi_q(z) \in \mathbb{Z}[z]$.
(b) Suppose that $z \in \mathbb{Z}$ and $p \mid \Phi_q(z)$ and let $e$ denote the order of $z$ modulo
$p$. Show that $e \mid q$ and that if $p \mid (z^d - 1)$ then $e \mid d$.
(c) Choose $t$ so that $p^t \,\|\, (z^e - 1)$. Show that for $m \in \mathbb{N}$ with $p \nmid m$ one has
$p^t \,\|\, (z^{me} - 1)$.
(d) Show that if $p \nmid q$, then $p^{ht} \,\|\, \Phi_q(z)$ where $h = \sum_{e \mid d \mid q} \mu(q/d)$. Deduce that
$e = q$ and that $q \mid (p - 1)$.
(e) By taking $z$ to be a suitable multiple of $q$, or otherwise, show that there
are infinitely many primes $p$ with $p \equiv 1$ (mod $q$).

4.2 Dirichlet characters

In the preceding section we expressed the characteristic function of an arithmetic
progression as a linear combination of additive characters. For purposes of
multiplicative number theory we shall similarly represent the characteristic
function of a reduced residue class (mod $q$) as a linear combination of totally
multiplicative functions $\chi(n)$ each one supported on the reduced residue classes
and having period $q$. These are the Dirichlet characters. Since $\chi(n)$ has period
$q$ we may think of it as mapping from residue classes, and since $\chi(n) \ne 0$ if and
only if $(n, q) = 1$, we may think of $\chi$ as mapping from the multiplicative group
of reduced residue classes to the multiplicative group $\mathbb{C}^\times$ of non-zero complex
numbers. As $\chi$ is totally multiplicative, $\chi(mn) = \chi(m)\chi(n)$ for all $m$, $n$, we see
that the map $\chi : (\mathbb{Z}/q\mathbb{Z})^\times \to \mathbb{C}^\times$ is a homomorphism. The method we use to
describe these characters applies when $(\mathbb{Z}/q\mathbb{Z})^\times$ is replaced by an arbitrary finite
abelian group $G$, so we consider the slightly more general problem of finding
all homomorphisms $\chi : G \to \mathbb{C}^\times$ from such a group $G$ to $\mathbb{C}^\times$. We call these
homomorphisms the characters of $G$, and let $\hat G$ denote the set of all characters
of $G$. We let $\chi_0$ denote the principal character, whose value is identically 1.
We note that if $\chi \in \hat G$, then $\chi(e) = 1$ where $e$ denotes the identity in $G$. Let $n$
denote the order of $G$. If $g \in G$ and $\chi \in \hat G$, then $g^n = e$, and hence $\chi(g^n) = 1$.
Consequently $\chi(g)^n = 1$, and so we see that all values taken by characters are
$n$th roots of unity. In particular, this implies that $\hat G$ is finite, since there can be at
most $n^n$ such maps. If $\chi_1$ and $\chi_2$ are two characters of $G$, then we can define
a product character $\chi_1\chi_2$ by $\chi_1\chi_2(g) = \chi_1(g)\chi_2(g)$. For $\chi \in \hat G$, let $\bar\chi$ be the
character $\overline{\chi(g)}$. Then $\chi \cdot \bar\chi = \chi_0$, and we see that $\hat G$ is a finite abelian group
with identity $\chi_0$. The following lemmas prepare for a full description of $\hat G$ in
Theorem 4.4.
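Lemma 4.2 Suppose that $G$ is cyclic of order $n$, say $G = (a)$. Then there are
exactly $n$ characters of $G$, namely $\chi_k(a^m) = e(km/n)$ for $1 \le k \le n$. Moreover,
$$\sum_{g \in G} \chi(g) = \begin{cases} n & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise,} \end{cases} \tag{4.12}$$
and
$$\sum_{\chi \in \hat G} \chi(g) = \begin{cases} n & \text{if } g = e, \\ 0 & \text{otherwise.} \end{cases} \tag{4.13}$$
In this situation, $\hat G$ is cyclic, $\hat G = (\chi_1)$.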
Proof Suppose that $\chi \in \hat G$. As we have observed, $\chi(a)$ is an $n$th root of unity,
say $\chi(a) = e(k/n)$ for some $k$, $1 \le k \le n$. Hence $\chi(a^m) = \chi(a)^m = e(km/n)$.
Since the characters are now known explicitly, the remaining assertions are
easily verified.

Next we describe the characters of the direct product of two groups in terms
of the characters of the factors.
Lemma 4.3 Suppose that $G_1$ and $G_2$ are finite abelian groups, and that $G =
G_1 \otimes G_2$. If $\chi_i$ is a character of $G_i$, $i = 1, 2$, and $g \in G$ is written $g = (g_1, g_2)$,
$g_i \in G_i$, then $\chi(g) = \chi_1(g_1)\chi_2(g_2)$ is a character of $G$. Conversely, if $\chi \in \hat G$,
then there exist unique $\chi_i \in \hat G_i$ such that $\chi(g) = \chi_1(g_1)\chi_2(g_2)$. The identities
(4.12) and (4.13) hold for $G$ if they hold for both $G_1$ and $G_2$.
We see here that each $\chi \in \hat G$ corresponds to a pair $(\chi_1, \chi_2) \in \hat G_1 \times \hat G_2$. Thus
$\hat G \cong \hat G_1 \otimes \hat G_2$.
Proof The first assertion is clear. As for the second, put $\chi_1(g_1) = \chi((g_1, e_2))$,
$\chi_2(g_2) = \chi((e_1, g_2))$. Then $\chi_i \in \hat G_i$ for $i = 1, 2$, and $\chi_1(g_1)\chi_2(g_2) = \chi(g)$. The
$\chi_i$ are unique, for if $g = (g_1, e_2)$, then
$$\chi(g) = \chi((g_1, e_2)) = \chi_1(g_1)\chi_2(e_2) = \chi_1(g_1),$$
and similarly for $\chi_2$. If $\chi(g) = \chi_1(g_1)\chi_2(g_2)$, then
$$\sum_{g \in G} \chi(g) = \Big(\sum_{g_1 \in G_1} \chi_1(g_1)\Big)\Big(\sum_{g_2 \in G_2} \chi_2(g_2)\Big),$$
so that (4.12) holds for $G$ if it holds for $G_1$ and for $G_2$. Similarly, if $g = (g_1, g_2)$,
then
$$\sum_{\chi \in \hat G} \chi(g) = \Big(\sum_{\chi_1 \in \hat G_1} \chi_1(g_1)\Big)\Big(\sum_{\chi_2 \in \hat G_2} \chi_2(g_2)\Big),$$
so that (4.13) holds for $G$ if it holds for $G_1$ and $G_2$.

Theorem 4.4 Let $G$ be a finite abelian group. Then $\hat G$ is isomorphic to $G$,
and (4.12) and (4.13) both hold.
Proof Any finite abelian group is isomorphic to a direct product of cyclic
groups, say
$$G \cong C_{n_1} \otimes C_{n_2} \otimes \cdots \otimes C_{n_r}.$$
The result then follows immediately from the lemmas.

Though $G$ and $\hat G$ are isomorphic, the isomorphism is not canonical. That is,
no particular one-to-one correspondence between the elements of $G$ and those
of $\hat G$ is naturally distinguished.
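The construction behind Lemmas 4.2 and 4.3 can be made concrete for a product of cyclic groups. The sketch below is illustrative only: group elements are written as exponent tuples $(m_1, \ldots, m_r)$, the characters are labelled by tuples $(k_1, \ldots, k_r)$ with $0 \le k_i < n_i$ (a complete residue system, in place of $1 \le k_i \le n_i$), and the sample orders $(2, 4, 3)$ are an assumption.

```python
import cmath
from itertools import product

def e(theta):
    """e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def characters_of_product(orders):
    """Characters of C_{n_1} x ... x C_{n_r}.

    The character labelled (k_1, ..., k_r) sends the element with exponent
    tuple (m_1, ..., m_r) to the product of e(k_i m_i / n_i), as in
    Lemma 4.2 combined with Lemma 4.3.
    """
    def make(label):
        def chi(g):
            return e(sum(k * m / n for k, m, n in zip(label, g, orders)))
        return chi
    labels = product(*[range(n) for n in orders])
    return {label: make(label) for label in labels}

orders = (2, 4, 3)                                   # sample group of order 24
elements = list(product(*[range(n) for n in orders]))
characters = characters_of_product(orders)
identity = (0, 0, 0)
principal = (0, 0, 0)                                # label of chi_0

# (4.12): summing a character over the group.
for label, chi in characters.items():
    total = sum(chi(g) for g in elements)
    assert abs(total - (len(elements) if label == principal else 0)) < 1e-9

# (4.13): summing over all characters at a fixed group element.
for g in elements:
    total = sum(chi(g) for chi in characters.values())
    assert abs(total - (len(elements) if g == identity else 0)) < 1e-9
```

The labelling of characters by tuples is exactly the kind of non-canonical identification of $\hat G$ with $G$ mentioned above: it depends on the chosen generators of the cyclic factors.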

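Corollary 4.5 The multiplicative group $(\mathbb{Z}/q\mathbb{Z})^\times$ of reduced residue classes
(mod $q$) has $\varphi(q)$ Dirichlet characters. If $\chi$ is such a character, then
$$\sum_{\substack{n=1 \\ (n,q)=1}}^{q} \chi(n) = \begin{cases} \varphi(q) & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise.} \end{cases} \tag{4.14}$$
If $(n, q) = 1$, then
$$\sum_{\chi} \chi(n) = \begin{cases} \varphi(q) & \text{if } n \equiv 1 \pmod q, \\ 0 & \text{otherwise,} \end{cases} \tag{4.15}$$
where the sum is extended over the $\varphi(q)$ Dirichlet characters $\chi$ (mod $q$).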

As we remarked at the outset, for our purposes it is convenient to define the
Dirichlet characters (mod $q$) on all integers; we do this by setting $\chi(n) = 0$
when $(n, q) > 1$. Thus $\chi$ is a totally multiplicative function with period $q$ that
vanishes whenever $(n, q) > 1$, and any such function is a Dirichlet character
(mod $q$). In this book a character is understood to be a Dirichlet character unless
the contrary is indicated.

Corollary 4.6 If $\chi_i$ is a character (mod $q_i$) for $i = 1, 2$, then $\chi_1(n)\chi_2(n)$
is a character (mod $[q_1, q_2]$). If $q = q_1 q_2$, $(q_1, q_2) = 1$, and $\chi$ is a character
(mod $q$), then there exist unique characters $\chi_i$ (mod $q_i$), $i = 1, 2$, such that
$\chi(n) = \chi_1(n)\chi_2(n)$ for all $n$.

Proof The first assertion follows immediately from the observations that
$\chi_1(n)\chi_2(n)$ is totally multiplicative, that it vanishes if $(n, [q_1, q_2]) > 1$, and
that it has period $[q_1, q_2]$. As for the second assertion, we may suppose that
$(n, q) = 1$. By the Chinese Remainder Theorem we see that
$$(\mathbb{Z}/q\mathbb{Z})^\times \cong (\mathbb{Z}/q_1\mathbb{Z})^\times \otimes (\mathbb{Z}/q_2\mathbb{Z})^\times$$
if $(q_1, q_2) = 1$. Thus the result follows from Lemma 4.3.

Our proof of Theorem 4.4 depends on Abel’s theorem that any finite abelian
group is isomorphic to the direct product of cyclic groups, but we can prove
Corollary 4.5 without appealing to this result, as follows. By the Chinese
Remainder Theorem we see that
$$(\mathbb{Z}/q\mathbb{Z})^\times \cong \bigotimes_{p^\alpha \| q} (\mathbb{Z}/p^\alpha\mathbb{Z})^\times .$$

If $p$ is odd, then the reduced residue classes (mod $p^\alpha$) form a cyclic group; in
classical language we say there is a primitive root $g$. Thus if $(n, p) = 1$, then
there is a unique $\nu$ (mod $\varphi(p^\alpha)$) such that $g^\nu \equiv n$ (mod $p^\alpha$). The number $\nu$ is
called the index of $n$, and is denoted $\nu = \operatorname{ind}_g n$. From Lemma 4.2 it follows
that the characters (mod $p^\alpha$), $p > 2$, are given by
$$\chi_k(n) = e\Big(\frac{k \operatorname{ind}_g n}{\varphi(p^\alpha)}\Big) \tag{4.16}$$
for $(n, p) = 1$. We obtain $\varphi(p^\alpha)$ different characters by allowing $k$ to assume
integral values in the range $1 \le k \le \varphi(p^\alpha)$. By Lemma 4.3 it follows that if $q$
is odd, then the general character (mod $q$) is given by
$$\chi(n) = e\Big(\sum_{p^\alpha \| q} \frac{k \operatorname{ind}_g n}{\varphi(p^\alpha)}\Big) \tag{4.17}$$
for $(n, q) = 1$, where it is understood that $k = k(p^\alpha)$ is determined (mod $\varphi(p^\alpha)$)
and that $g = g(p^\alpha)$ is a primitive root (mod $p^\alpha$).
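For a prime modulus the recipe (4.16) is short enough to run directly. The sketch below restricts to $\alpha = 1$ and finds a primitive root by brute force; the function names and the choice $p = 11$ are illustrative assumptions, and the approach is only sensible for small $p$.

```python
import cmath

def e(theta):
    """e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p, found by brute force."""
    for g in range(1, p):
        if len({pow(g, v, p) for v in range(p - 1)}) == p - 1:
            return g
    raise ValueError("p must be prime")

def characters_mod_prime(p):
    """The phi(p) = p - 1 characters (mod p) given by (4.16) with alpha = 1."""
    g = primitive_root(p)
    index = {pow(g, v, p): v for v in range(p - 1)}   # index[n] = ind_g n
    def make(k):
        def chi(n):
            n %= p
            return 0 if n == 0 else e(k * index[n] / (p - 1))
        return chi
    return [make(k) for k in range(1, p)]             # k = 1, ..., p - 1

p = 11
chars = characters_mod_prime(p)     # chars[-1] has k = p - 1, the principal character

# (4.14): each non-principal character sums to 0 over a period.
for k, chi in enumerate(chars, start=1):
    total = sum(chi(n) for n in range(1, p + 1))
    assert abs(total - (p - 1 if k == p - 1 else 0)) < 1e-9

# (4.15): the sum over all characters detects n = 1 (mod p).
for n in range(1, p):
    total = sum(chi(n) for chi in chars)
    assert abs(total - (p - 1 if n == 1 else 0)) < 1e-9
```

A different primitive root $g$ would permute the labels $k$ but produce the same set of characters.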
The multiplicative structure of the reduced residues (mod $2^\alpha$) is more complicated.
For $\alpha = 1$ or $\alpha = 2$ the group is cyclic (of order 1 or 2, respectively),
and (4.16) holds as before. For $\alpha \ge 3$ the group is not cyclic, but if $n$ is odd, then
there exist unique $\mu$ (mod 2) and $\nu$ (mod $2^{\alpha-2}$) such that $n \equiv (-1)^\mu 5^\nu$ (mod $2^\alpha$).
In group-theoretic terms this means that
$$(\mathbb{Z}/2^\alpha\mathbb{Z})^\times \cong C_2 \otimes C_{2^{\alpha-2}}$$
when $\alpha \ge 3$. By Lemma 4.3 the characters in this case take the form
$$\chi(n) = e\Big(\frac{j\mu}{2} + \frac{k\nu}{2^{\alpha-2}}\Big) \tag{4.18}$$
for odd $n$ where $j = 0$ or 1 and $1 \le k \le 2^{\alpha-2}$. Thus (4.17) holds if $8 \nmid q$, but if
$8 \mid q$, then the general character takes the form
$$\chi(n) = e\Bigg(\frac{j\mu}{2} + \frac{k\nu}{2^{\alpha-2}} + \sum_{\substack{p^\alpha \| q \\ p > 2}} \frac{\ell \operatorname{ind}_g n}{\varphi(p^\alpha)}\Bigg) \tag{4.19}$$
when $(n, q) = 1$.
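The decomposition $n \equiv (-1)^\mu 5^\nu$ (mod $2^\alpha$) and the characters (4.18) can be explored by exhaustive search. This is a sketch under the assumption $\alpha \ge 3$; the names `decompose` and `character` and the sample values $\alpha = 5$, $j = 1$, $k = 3$ are chosen for illustration only.

```python
import cmath

def e(theta):
    """e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def decompose(n, alpha):
    """Return (mu, nu) with n = (-1)^mu * 5^nu (mod 2^alpha), for odd n and alpha >= 3."""
    modulus = 2 ** alpha
    for mu in (0, 1):
        for nu in range(2 ** (alpha - 2)):
            if ((-1) ** mu * pow(5, nu, modulus) - n) % modulus == 0:
                return mu, nu
    raise ValueError("n must be odd")

def character(j, k, alpha):
    """The character (4.18) with parameters j and k, extended by 0 to even n."""
    def chi(n):
        if n % 2 == 0:
            return 0
        mu, nu = decompose(n, alpha)
        return e(j * mu / 2 + k * nu / 2 ** (alpha - 2))
    return chi

alpha = 5
odd_residues = range(1, 2 ** alpha, 2)
pairs = {n: decompose(n, alpha) for n in odd_residues}

# Each of the 2 * 2^(alpha-2) pairs (mu, nu) occurs exactly once.
assert len(set(pairs.values())) == len(odd_residues) == 2 * 2 ** (alpha - 2)

chi = character(1, 3, alpha)
for m in odd_residues:
    for n in odd_residues:
        assert abs(chi(m * n) - chi(m) * chi(n)) < 1e-9
```

The uniqueness of $(\mu, \nu)$ is visible in the assertion that the map from odd residues to pairs is a bijection.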
By definition, if $f(n)$ is totally multiplicative, $f(n) = 0$ whenever $(n, q) > 1$,
and $f(n)$ has period $q$, then $f$ is a Dirichlet character (mod $q$). It is useful to
note that the first condition can be relaxed.
Theorem 4.7 If $f$ is multiplicative, $f(n) = 0$ whenever $(n, q) > 1$, and $f$ has
period $q$, then $f$ is a Dirichlet character modulo $q$.
Proof It suffices to show that $f$ is totally multiplicative. If $(mn, q) > 1$, then
$f(mn) = f(m)f(n)$ since $0 = 0$. Suppose that $(mn, q) = 1$. Hence in particular
$(m, q) = 1$, so that the map $k \mapsto n + kq$ (mod $m$) permutes the residue
classes (mod $m$). Thus there is a $k$ for which $n + kq \equiv 1$ (mod $m$), and
consequently $(m, n + kq) = 1$. Then
$$\begin{aligned}
f(mn) &= f(m(n + kq)) && \text{(by periodicity)} \\
&= f(m)\,f(n + kq) && \text{(by multiplicativity)} \\
&= f(m)\,f(n) && \text{(by periodicity),}
\end{aligned}$$
and the proof is complete.

We shall discuss further properties of Dirichlet characters in Chapter 9.

4.2.1 Exercises
1. Let $G$ be a finite abelian group of order $n$. Let $g_1, g_2, \ldots, g_n$ denote the
elements of $G$, and let $\chi_1(g), \chi_2(g), \ldots, \chi_n(g)$ denote the characters of $G$.
Let $U = [u_{ij}]$ be the $n \times n$ matrix with elements $u_{ij} = \chi_i(g_j)/\sqrt{n}$. Show
that $UU^* = U^*U = I$, i.e., that $U$ is unitary.
2. Show that for arbitrary real or complex numbers $c_1, \ldots, c_q$,
$$\sum_{\chi} \Big|\sum_{n=1}^{q} c_n \chi(n)\Big|^2 = \varphi(q) \sum_{\substack{n=1 \\ (n,q)=1}}^{q} |c_n|^2$$
where the sum on the left-hand side runs over all Dirichlet characters
$\chi$ (mod $q$).
3. Show that for arbitrary real or complex numbers $c_\chi$,
$$\sum_{n=1}^{q} \Big|\sum_{\chi} c_\chi \chi(n)\Big|^2 = \varphi(q) \sum_{\chi} |c_\chi|^2$$
where the sum over $\chi$ is extended over all Dirichlet characters (mod $q$).
4. Let $(a, q) = 1$, and suppose that $k$ is the order of $a$ in the multiplicative group
of reduced residue classes (mod $q$).
(a) Show that if $\chi$ is a Dirichlet character (mod $q$), then $\chi(a)$ is a $k$th root
of unity.
(b) Show that if $z$ is a $k$th root of unity, then
$$1 + z + \cdots + z^{k-1} = \begin{cases} k & \text{if } z = 1, \\ 0 & \text{otherwise.} \end{cases}$$
(c) Let $\zeta$ be a $k$th root of unity. By taking $z = \chi(a)/\zeta$, show that each $k$th
root of unity occurs precisely $\varphi(q)/k$ times among the numbers $\chi(a)$ as
$\chi$ runs over the $\varphi(q)$ Dirichlet characters (mod $q$).
5. Let $\chi$ be a Dirichlet character (mod $q$), and let $k$ denote the order of $\chi$ in the
character group.
(a) Show that if $(a, q) = 1$, then $\chi(a)$ is a $k$th root of unity.
(b) Show that each $k$th root of unity occurs precisely $\varphi(q)/k$ times among the
numbers $\chi(a)$ as $a$ runs over the $\varphi(q)$ reduced residue classes (mod $q$).
6. Let $\chi$ be a character (mod $q$) such that $\chi(a) = \pm 1$ whenever $(a, q) = 1$, and
put $S(\chi) = \sum_{n=1}^{q} n\chi(n)$. Thus $S(\chi)$ is an integer.
(a) Show that if $(a, q) = 1$ then $a\chi(a)S(\chi) \equiv S(\chi)$ (mod $q$).
(b) Show that there is an $a$ such that $(a, q) = 1$ and $(a\chi(a) - 1, q) \mid 12$.
(c) Deduce that $12S(\chi) \equiv 0$ (mod $q$).
In algebraic number fields we encounter not only Dirichlet characters, but
also characters of ideal class groups and of Galois groups. In addition, algebraic
number fields possessing one or more complex embeddings also have a further
kind of character, Hecke’s Grössencharaktere. In a sequence of exercises,
beginning with the one below, we develop the basic properties of these characters
for the Gaussian field $\mathbb{Q}(\sqrt{-1})$.
7. Let $K$ be the Gaussian field,
$$K = \mathbb{Q}(\sqrt{-1}) = \{a + bi : a, b \in \mathbb{Q}\},$$
and let $\mathcal{O}_K$ be the ring of algebraic integers in $K$,
$$\mathcal{O}_K = \{a + bi : a, b \in \mathbb{Z}\}.$$
Elements $\alpha = a + bi \in K$ have a norm, $N(\alpha) = a^2 + b^2$, and we observe
that $N(\alpha\beta) = N(\alpha)N(\beta)$. An element $\alpha$ of a ring is a unit if $\alpha$ has an inverse
in the ring. The ring $\mathcal{O}_K$ has precisely four units, namely $i^k$ for $k = 0, 1, 2, 3$.
Two elements $\alpha, \beta \in \mathcal{O}_K$ are associates if $\alpha = u\beta$ for some unit $u$. For each
integer $m$ we define the Hecke Grössencharakter
$$\chi_m(\alpha) = \begin{cases} e^{4mi \arg \alpha} & \text{if } \alpha \ne 0, \\ 0 & \text{if } \alpha = 0. \end{cases}$$
(a) Show that if $\alpha$ and $\beta$ are associates then $\chi_m(\alpha) = \chi_m(\beta)$.
(b) Show that $\chi_m(\alpha\beta) = \chi_m(\alpha)\chi_m(\beta)$ for all $\alpha$ and $\beta$ in $\mathcal{O}_K$.

4.3 Dirichlet L-functions

Let $\chi$ be a character (mod $q$). For $\sigma > 1$ we put
$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)\,n^{-s}. \tag{4.20}$$
Since $\chi$ is totally multiplicative, by Theorem 1.9 we have
$$L(s, \chi) = \prod_{p} \big(1 - \chi(p)p^{-s}\big)^{-1} \tag{4.21}$$
for $\sigma > 1$. Thus we see that
$$L(s, \chi_0) = \sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} n^{-s} = \zeta(s) \prod_{p \mid q} \big(1 - p^{-s}\big) \tag{4.22}$$
for $\sigma > 1$. By (4.14) we see that if $\chi \ne \chi_0$, then
$$\sum_{1 \le n \le kq} \chi(n) = 0$$
for $k = 1, 2, 3, \ldots$. Hence
$$\Big|\sum_{n \le x} \chi(n)\Big| \le q \tag{4.23}$$
for any $x$, so that by Theorem 1.3, the series (4.20) converges for $\sigma > 0$. This
result is best possible since the terms in (4.20) do not tend to 0 when $\sigma = 0$. On
the other hand, we shall show in Chapter 10 that the function $L(s, \chi)$ is entire
if $\chi \ne \chi_0$. For $\sigma > 1$ we can take logarithms in (4.21), and differentiate, as in
Corollary 1.11, and thus we obtain

Theorem 4.8 If $\chi \ne \chi_0$, then $L(s, \chi)$ is analytic for $\sigma > 0$. On the other
hand, the function $L(s, \chi_0)$ is analytic in this half-plane except for a simple
pole at $s = 1$ with residue $\varphi(q)/q$. In either case,
$$\log L(s, \chi) = \sum_{n=2}^{\infty} \frac{\Lambda(n)}{\log n}\,\chi(n)\,n^{-s} \tag{4.24}$$
for $\sigma > 1$, and
$$-\frac{L'}{L}(s, \chi) = \sum_{n=1}^{\infty} \Lambda(n)\chi(n)\,n^{-s}. \tag{4.25}$$
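The bounded partial sums (4.23) and the resulting convergence of (4.20) at $s = 1$ can be observed numerically for the non-principal character (mod 4). The sketch below is illustrative; it relies on the classical evaluation $L(1, \chi) = 1 - 1/3 + 1/5 - \cdots = \pi/4$ for this character (the Leibniz series), which is an outside fact not stated in this section, and the cutoff $N$ is an arbitrary assumption.

```python
from math import pi

def chi_minus_4(n):
    """The non-principal character (mod 4): values 0, 1, 0, -1 for n = 0, 1, 2, 3 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def partial_sum_L(s, N):
    """Partial sum of (4.20) over n <= N for the character chi_minus_4."""
    return sum(chi_minus_4(n) / n ** s for n in range(1, N + 1))

# Partial sums of the character itself stay bounded, as in (4.23).
running, character_sums = 0, []
for n in range(1, 200):
    running += chi_minus_4(n)
    character_sums.append(running)
assert max(abs(s) for s in character_sums) <= 4

N = 100001                       # illustrative cutoff
approx = partial_sum_L(1, N)
error = abs(approx - pi / 4)     # alternating series: error is below the next term, about 1/N
```

For this character the series at $s = 1$ is alternating with decreasing terms, so the truncation error is smaller than the first omitted term.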

In these last formulæ we see how relations for $L$-functions parallel those
for the zeta functions. Indeed, when manipulating Dirichlet series formally, the
only property of $n^{-s}$ that is used is that it is totally multiplicative. Hence all
such calculations can be made with $n^{-s}$ replaced by $\chi(n)n^{-s}$. For example, we
know that $\sum \mu(n)^2 n^{-s} = \zeta(s)/\zeta(2s)$ for $\sigma > 1$. Hence formally
$$\sum_{n=1}^{\infty} \mu(n)^2 \chi(n)\,n^{-s} = L(s, \chi)/L(2s, \chi^2). \tag{4.26}$$
Since $|\chi(n)n^{-s}| \le n^{-\sigma}$, this latter series is absolutely convergent whenever the
former one is, and by (4.21) we see that (4.26) holds for $\sigma > 1$. In fact, by a
theorem of Stieltjes (see Exercise 1.3.2), the identity (4.26) holds for $\sigma > 1/2$
if $\chi \ne \chi_0$.

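We now use the identity (4.15) to capture a prescribed residue class. If
$(a, q) = 1$, then
$$\frac{1}{\varphi(q)} \sum_{\chi} \bar\chi(a)\chi(n) = \begin{cases} 1 & \text{if } n \equiv a \pmod q, \\ 0 & \text{otherwise} \end{cases} \tag{4.27}$$
where the sum is extended over all characters $\chi$ (mod $q$). This is the multiplicative
analogue of (4.1). Hence if $(a, q) = 1$ then
$$\begin{aligned}
\sum_{\substack{n=1 \\ n \equiv a\,(q)}}^{\infty} \Lambda(n)\,n^{-s} &= \frac{1}{\varphi(q)} \sum_{n=1}^{\infty} \Lambda(n)\,n^{-s} \sum_{\chi} \bar\chi(a)\chi(n) \\
&= \frac{-1}{\varphi(q)} \sum_{\chi} \bar\chi(a)\,\frac{L'}{L}(s, \chi)
\end{aligned} \tag{4.28}$$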

for $\sigma > 1$. As $L(s, \chi_0)$ has a simple pole at $s = 1$, the function $\frac{L'}{L}(s, \chi_0)$ has a
simple pole at 1 with residue $-1$. Thus the term arising from $\chi_0$ on the right-hand
side above is
$$\frac{1}{\varphi(q)(s - 1)} + O_q(1) \tag{4.29}$$
as $s \to 1^+$. This enables us to prove that there are infinitely many primes
$p \equiv a$ (mod $q$), provided that we can show that the terms from $\chi \ne \chi_0$ on the
right-hand side of (4.28) do not interfere with the main term (4.29). But $L(s, \chi)$
is analytic for $\sigma > 0$, so that $\frac{L'}{L}(s, \chi)$ is analytic except at zeros of $L(s, \chi)$.
Hence
$$\lim_{s \to 1^+} \frac{L'}{L}(s, \chi) = \frac{L'}{L}(1, \chi) \tag{4.30}$$
for $\chi \ne \chi_0$, provided that $L(1, \chi) \ne 0$. Thus the following result lies at the
heart of the matter.

Theorem 4.9 (Dirichlet) If $\chi$ is a character (mod $q$) with $\chi \ne \chi_0$, then
$L(1, \chi) \ne 0$.

Suppose that $(a, q) = 1$. Then the above, with (4.28), (4.29), and (4.30) give
the estimate
$$\sum_{\substack{n=1 \\ n \equiv a\,(q)}}^{\infty} \Lambda(n)\,n^{-s} = \frac{1}{\varphi(q)(s - 1)} + O_q(1)$$
as $s \to 1^+$. Consequently
$$\sum_{\substack{n=1 \\ n \equiv a\,(q)}}^{\infty} \frac{\Lambda(n)}{n} = \infty.$$
Here the contribution of the proper prime powers is
$$\sum_{\substack{p^k \equiv a\,(q) \\ k \ge 2}} \frac{\log p}{p^k} \le \sum_{p} \log p \sum_{k=2}^{\infty} p^{-k} = \sum_{p} \frac{\log p}{p(p - 1)} < \infty, \tag{4.31}$$
and thus we have

Corollary 4.10 (Dirichlet’s theorem) If $(a, q) = 1$, then there are infinitely
many primes $p \equiv a$ (mod $q$), and indeed
$$\sum_{p \equiv a\,(q)} \frac{\log p}{p} = \infty.$$

We call a character real if all its values are real (i.e., $\chi(n) = 0$ or $\pm 1$ for all
$n$). Otherwise a character is complex. A character is quadratic if it has order
2 in the character group: $\chi^2 = \chi_0$ but $\chi \ne \chi_0$. Thus a quadratic character is
real, and a real character is either principal or quadratic. In Chapter 9 we shall
express quadratic characters in terms of the Kronecker symbol $\big(\frac{d}{n}\big)$.

Proof of Theorem 4.9 We treat quadratic and complex characters separately.

Case 1: Complex $\chi$. From (4.24) we have
$$\prod_{\chi} L(s, \chi) = \exp\Big(\sum_{\chi} \sum_{n=2}^{\infty} \frac{\Lambda(n)}{\log n}\,\chi(n)\,n^{-s}\Big)$$
for $\sigma > 1$. By (4.15) this is
$$= \exp\Big(\varphi(q) \sum_{\substack{n=2 \\ n \equiv 1\,(q)}}^{\infty} \frac{\Lambda(n)}{\log n}\,n^{-s}\Big).$$
If we take $s = \sigma > 1$, then the sum above is a non-negative real number, and
hence we see that
$$\prod_{\chi} L(\sigma, \chi) \ge 1 \tag{4.32}$$
for $\sigma > 1$. Now $L(s, \chi_0)$ has a simple pole at $s = 1$, but the other $L(s, \chi)$
are analytic at $s = 1$. Thus $L(1, \chi) = 0$ can hold for at most one $\chi$, since
otherwise the product in (4.32) would tend to 0 as $\sigma \to 1^+$. If $\chi$ is a character
(mod $q$), then $\bar\chi$ is a character (mod $q$), and $\bar\chi \ne \chi$ if $\chi$ is complex. Moreover
$\overline{L(s, \chi)} = L(\bar s, \bar\chi)$ by the Schwarz reflection principle, so that $L(1, \bar\chi) = 0$ if
$L(1, \chi) = 0$. Consequently $L(1, \chi) \ne 0$ for complex $\chi$.
Case 2: Quadratic $\chi$. Let $r(n) = \sum_{d \mid n} \chi(d)$. Thus $\sum_{n=1}^{\infty} r(n)n^{-s} =
\zeta(s)L(s, \chi)$ for $\sigma > 1$, $r(n)$ is multiplicative, and
$$r(p^\alpha) = \begin{cases} 1 & \text{if } p \mid q, \\ \alpha + 1 & \text{if } \chi(p) = 1, \\ 1 & \text{if } \chi(p) = -1 \text{ and } 2 \mid \alpha, \\ 0 & \text{if } \chi(p) = -1 \text{ and } 2 \nmid \alpha. \end{cases}$$
Hence $r(n) \ge 0$ for all $n$, and $r(n^2) \ge 1$ for all $n$. Suppose that $L(1, \chi) = 0$.
Then $\zeta(s)L(s, \chi)$ is analytic for $\sigma > 0$, and by Landau’s theorem (Theorem
1.7) the series $\sum r(n)n^{-s}$ converges for $\sigma > 0$. But this is false, since
$$\sum_{n=1}^{\infty} r(n)\,n^{-1/2} \ge \sum_{n=1}^{\infty} r(n^2)\,n^{-1} \ge \sum_{n=1}^{\infty} n^{-1} = +\infty.$$
Hence $L(1, \chi) \ne 0$. Since $L(\sigma, \chi) > 0$ for $\sigma > 1$ when $\chi$ is quadratic, we see
in fact that $L(1, \chi) > 0$ in this case.
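The two properties of $r(n)$ used in Case 2 can be checked by direct computation for a specific quadratic character. The sketch below uses the non-principal character (mod 4) as an illustrative choice; the function names and the ranges tested are assumptions of the example.

```python
def chi_minus_4(n):
    """The non-principal (quadratic) character (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def r(n):
    """r(n) = sum of chi(d) over the divisors d of n, computed by trial division."""
    return sum(chi_minus_4(d) for d in range(1, n + 1) if n % d == 0)

# r(n) >= 0 for all n, and r(n^2) >= 1 for all n, on a sample range.
assert all(r(n) >= 0 for n in range(1, 500))
assert all(r(n * n) >= 1 for n in range(1, 60))

# Prime-power values follow the case analysis in the proof.
sample = {n: r(n) for n in (3, 9, 5, 25, 8)}
```

Here $\chi(3) = -1$ gives $r(3) = 0$ and $r(9) = 1$, while $\chi(5) = 1$ gives $r(5) = 2$ and $r(25) = 3$; the prime 2 divides the modulus, so $r(8) = 1$.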

By using the techniques of Chapter 2 we can prove more than the mere
divergence of the series in Corollary 4.10.

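Theorem 4.11 Suppose that $\chi$ is a non-principal Dirichlet character. Then
for $x \ge 2$,
(a) $\displaystyle \sum_{n \le x} \frac{\chi(n)\Lambda(n)}{n} \ll_\chi 1$,
(b) $\displaystyle \sum_{p \le x} \frac{\chi(p)\log p}{p} \ll_\chi 1$,
(c) $\displaystyle \sum_{p \le x} \frac{\chi(p)}{p} = b(\chi) + O_\chi\Big(\frac{1}{\log x}\Big)$,
(d) $\displaystyle \prod_{p \le x} \Big(1 - \frac{\chi(p)}{p}\Big)^{-1} = L(1, \chi) + O_\chi\Big(\frac{1}{\log x}\Big)$
where
$$b(\chi) = \log L(1, \chi) - \sum_{\substack{p^k \\ k > 1}} \frac{\chi(p^k)}{k p^k}.$$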

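Proof We show first that
$$\sum_{n \le x} \frac{\chi(n)\log n}{n} = -L'(1, \chi) + O_q\Big(\frac{\log x}{x}\Big). \tag{4.33}$$
To this end we put $S(x) = \sum_{n \le x} \chi(n)$. Then from (4.23) we see that $S(x) \ll_\chi 1$.
Thus the error term above is
$$\begin{aligned}
\sum_{n > x} \frac{\chi(n)\log n}{n} &= \int_x^{\infty} \frac{\log u}{u}\,dS(u) \\
&= -\frac{S(x)\log x}{x} - \int_x^{\infty} S(u)(1 - \log u)\,u^{-2}\,du \\
&\ll_\chi \frac{\log x}{x}.
\end{aligned}$$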

As $\log n = \sum_{d \mid n} \Lambda(d)$, the left-hand side of (4.33) is
$$\sum_{md \le x} \frac{\Lambda(d)\chi(md)}{md} = \sum_{d \le x} \frac{\Lambda(d)\chi(d)}{d} \sum_{m \le x/d} \frac{\chi(m)}{m}. \tag{4.34}$$
Here the inner sum is of the form
$$\sum_{m \le y} \frac{\chi(m)}{m} = L(1, \chi) - \sum_{m > y} \frac{\chi(m)}{m},$$
and this last sum is
$$\int_y^{\infty} u^{-1}\,dS(u) = -\frac{S(y)}{y} + \int_y^{\infty} S(u)\,u^{-2}\,du \ll_\chi y^{-1}.$$
Hence the right-hand side of (4.34) is
$$L(1, \chi) \sum_{d \le x} \frac{\Lambda(d)\chi(d)}{d} + O_\chi\Big(\frac{1}{x} \sum_{d \le x} \Lambda(d)\Big).$$
This last error term is $\ll_\chi 1$, and then (a) follows from (4.33) and the fact that
$L(1, \chi) \ne 0$. The derivation of (b) from (a), and of (c) from (b) proceeds as in
the proof of Theorem 2.7. Continuing as in that proof, we see from (c) that
$$\sum_{1 < n \le x} \frac{\Lambda(n)\chi(n)}{n \log n} = c(\chi) + O_\chi\Big(\frac{1}{\log x}\Big)$$
where
$$c(\chi) = b(\chi) + \sum_{\substack{p^k \\ k > 1}} \frac{\chi(p^k)}{k p^k}.$$
We let $s \to 1^+$ in (4.24), and deduce by Theorem 1.1 that $c(\chi) = \log L(1, \chi)$.
To complete the derivation of (d) it suffices to argue as in the proof of
Theorem 2.7.

By forming a linear combination of these estimates as in (4.27) we obtain

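Corollary 4.12 If $(a, q) = 1$ and $x \ge 2$, then
(a) $\displaystyle \sum_{\substack{n \le x \\ n \equiv a\,(q)}} \frac{\Lambda(n)}{n} = \frac{1}{\varphi(q)} \log x + O_q(1)$,
(b) $\displaystyle \sum_{\substack{p \le x \\ p \equiv a\,(q)}} \frac{\log p}{p} = \frac{1}{\varphi(q)} \log x + O_q(1)$,
(c) $\displaystyle \sum_{\substack{p \le x \\ p \equiv a\,(q)}} \frac{1}{p} = \frac{1}{\varphi(q)} \log\log x + b(q, a) + O_q\Big(\frac{1}{\log x}\Big)$,
(d) $\displaystyle \prod_{\substack{p \le x \\ p \equiv a\,(q)}} \Big(1 - \frac{1}{p}\Big)^{-1} = c(q, a)(\log x)^{1/\varphi(q)} \Big(1 + O_q\Big(\frac{1}{\log x}\Big)\Big)$
where
$$b(q, a) = \frac{1}{\varphi(q)} \Big(C_0 + \sum_{p \mid q} \log\Big(1 - \frac{1}{p}\Big) + \sum_{\chi \ne \chi_0} \bar\chi(a) \log L(1, \chi)\Big) - \sum_{\substack{p^k \equiv a\,(q) \\ k > 1}} \frac{1}{k p^k}$$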

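and
$$c(q, a) = \Bigg(e^{C_0}\,\frac{\varphi(q)}{q} \prod_{\chi \ne \chi_0} L(1, \chi)^{\bar\chi(a)} \prod_{p} \Big(1 - \frac{1}{p}\Big)^{-\chi(p)} \Big(1 - \frac{\chi(p)}{p}\Big)\Bigg)^{1/\varphi(q)}.$$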
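Proof To derive (a) from Theorem 4.11(a) we use (4.27) and the estimate
$$\sum_{n \le x} \frac{\Lambda(n)\chi_0(n)}{n} = \log x + O_q(1),$$
which follows from Theorem 2.7(a) since
$$\sum_{\substack{p^k \\ p \mid q}} \frac{\log p}{p^k} = \sum_{p \mid q} \frac{\log p}{p - 1} \ll_q 1.$$
We derive (b) and (c) similarly from the corresponding parts of Theorem 4.11.
In the latter case we use the estimate
$$\sum_{p \le x} \frac{\chi_0(p)}{p} = \log\log x + b(\chi_0) + O_q\Big(\frac{1}{\log x}\Big)$$
where
$$b(\chi_0) = C_0 + \sum_{p \mid q} \log\Big(1 - \frac{1}{p}\Big) - \sum_{\substack{p^k \\ k > 1}} \frac{\chi_0(p^k)}{k p^k}.$$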
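To derive (d) we observe first that
$$\prod_{p \le x} \Big(1 - \frac{\chi_0(p)}{p}\Big)^{-1} = \prod_{p \le x} \Big(1 - \frac{1}{p}\Big)^{-1} \prod_{\substack{p \le x \\ p \mid q}} \Big(1 - \frac{1}{p}\Big),$$
which by Theorem 2.7(e) is
$$= \frac{\varphi(q)}{q} \Bigg(\prod_{\substack{p \mid q \\ p > x}} \Big(1 - \frac{1}{p}\Big)\Bigg)^{-1} e^{C_0} (\log x) \Big(1 + O\Big(\frac{1}{\log x}\Big)\Big).$$
Here each term in the product is $1 + O(1/x)$, and the number of factors is
$\le \omega(q)$, so the product is $1 + O_q(1/x)$, and hence the above is
$$= \frac{\varphi(q)}{q}\,e^{C_0} (\log x) \Big(1 + O_q\Big(\frac{1}{\log x}\Big)\Big).$$
To complete the proof it suffices to combine this with Theorem 4.11(d)
in (4.27).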
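Corollary 4.12(b) can be illustrated numerically: for $q = 4$ the sums of $(\log p)/p$ over $p \equiv 1$ and $p \equiv 3$ (mod 4) should each stay within a bounded distance of $\tfrac{1}{2}\log x$. The sketch below is only a numerical illustration under assumed parameters ($q = 4$, $x = 10^5$, and a simple sieve written for the example); it proves nothing about the size of the $O_q(1)$ term.

```python
from math import log

def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples p*p, p*p + p, ... of p.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def weighted_prime_sum(primes, q, a):
    """Sum of log(p)/p over the listed primes p with p = a (mod q)."""
    return sum(log(p) / p for p in primes if p % q == a % q)

q, x = 4, 10 ** 5                     # illustrative choices; phi(4) = 2
primes = primes_up_to(x)
differences = {
    a: weighted_prime_sum(primes, q, a) - log(x) / 2
    for a in (1, 3)
}
```

The two entries of `differences` remain bounded as $x$ grows, but they need not be equal: the constant implied by $O_q(1)$ depends on the residue class $a$.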

4.3.1 Exercises
1. Let $\chi$ be a Dirichlet character (mod $q$). Show that if $\sigma > 1$, then
(a) $\displaystyle \sum_{n=1}^{\infty} (-1)^{n-1} \chi(n)\,n^{-s} = \big(1 - \chi(2)2^{1-s}\big) L(s, \chi)$;
(b) $\displaystyle \sum_{n=1}^{\infty} d(n)^2 \chi(n)\,n^{-s} = \frac{L(s, \chi)^4}{L(2s, \chi^2)}$.
2. (Mertens 1895a,b) Let $r(n) = \sum_{d \mid n} \chi(d)$.
(a) Show that if $\chi$ is a non-principal character (mod $q$), then
$$\sum_{n > x} \frac{\chi(n)}{\sqrt{n}} \ll_\chi \frac{1}{\sqrt{x}}.$$
(b) Show that if $\chi$ is a non-principal character (mod $q$), then
$$\sum_{n \le x} \frac{r(n)}{n^{1/2}} = 2x^{1/2} L(1, \chi) + O_\chi(1).$$
(c) Recall that if $\chi$ is quadratic then $r(n) \ge 0$ for all $n$, and that $r(n^2) \ge 1$.
Deduce that if $\chi$ is a quadratic character, then the left-hand side above
is $\gg \log x$.
(d) Conclude that if $\chi$ is a quadratic character, then $L(1, \chi) > 0$.

3. (Mertens 1897, 1899) For $u \ge 0$, put $f(u) = \sum_{m \le u} (1 - m/u)$.
(a) Show that $f(u) \ge 0$, that $f(u)$ is continuous, and that if $u$ is not an
integer, then
$$f'(u) = \frac{[u]([u] + 1)}{2u^2};$$
deduce that $f$ is increasing.
(b) Show also that
$$f(u) = \frac{u}{2} - \frac{1}{u} \int_0^u \{v\}\,dv = \frac{u}{2} - \frac{1}{2} + O(1/u).$$
(c) Let $r(n) = \sum_{d \mid n} \chi(d)$, and assume that $\chi$ is non-principal. Show that
$$\sum_{n \le x} r(n)(1 - n/x) = \sum_{d \le x} \chi(d)\,f(x/d).$$
(d) Write $\sum_{d \le x} = \sum_{d \le y} + \sum_{y < d \le x} = S_1 + S_2$ where $1 \le y \le x$. Use
part (b) to show that $S_1 = \frac{1}{2} x L(1, \chi) + O_\chi(x/y) + O(y^2/x)$.
(e) Use the results of part (a) to show that $S_2 \ll_\chi f(x/y)$.
(f) By making an appropriate choice of $y$, deduce that if $\chi$ is a non-principal
character, then
$$\sum_{n \le x} r(n)(1 - n/x) = \frac{x}{2} L(1, \chi) + O_\chi\big(x^{1/3}\big).$$
(g) Argue that if $\chi$ is a quadratic character, then the left-hand side above
is $\gg x^{1/2}$; deduce that $L(1, \chi) > 0$.
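4. (Ingham 1929) Let $f_1(n)$ and $f_2(n)$ be totally multiplicative functions, and
suppose that $|f_i(n)| \le 1$ for all $n$.
(a) Show that if $\sigma > 1$, then
$$\begin{aligned}
\sum_{n=1}^{\infty} \Big(\sum_{d \mid n} f_1(d)\Big)\Big(\sum_{d \mid n} f_2(d)\Big) n^{-s}
&= \frac{\zeta(s) \Big(\sum_{n=1}^{\infty} f_1(n)n^{-s}\Big)\Big(\sum_{n=1}^{\infty} f_2(n)n^{-s}\Big)\Big(\sum_{n=1}^{\infty} f_1(n)f_2(n)n^{-s}\Big)}{\sum_{n=1}^{\infty} f_1(n)f_2(n)n^{-2s}} \\
&= \prod_{p} \frac{1 - \dfrac{f_1(p)f_2(p)}{p^{2s}}}{\Big(1 - \dfrac{1}{p^s}\Big)\Big(1 - \dfrac{f_1(p)}{p^s}\Big)\Big(1 - \dfrac{f_2(p)}{p^s}\Big)\Big(1 - \dfrac{f_1(p)f_2(p)}{p^s}\Big)}.
\end{aligned}$$
(b) By considering
$$F(s) = \sum_{n=1}^{\infty} \Big|\sum_{d \mid n} \chi(d)\,d^{-iu}\Big|^2 n^{-s},$$
show that $L(1 + iu, \chi) \ne 0$.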

5. Let $\pi(x; q, a)$ denote the number of primes $p \equiv a$ (mod $q$) with $p$ not
exceeding $x$. Similarly, let
$$\vartheta(x; q, a) = \sum_{\substack{p \le x \\ p \equiv a\,(q)}} \log p, \qquad \psi(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a\,(q)}} \Lambda(n).$$
(a) Show that
$$\vartheta(x; q, a) = \psi(x; q, a) + O\big(x^{1/2}\big).$$
(b) Show that
$$\pi(x; q, a) = \frac{\vartheta(x; q, a)}{\log x} + O\Big(\frac{x}{(\log x)^2}\Big).$$
(c) Show that if $x \ge C$, $C \ge 2$, and $(a, q) = 1$, then
$$\sum_{\substack{x/C < p \le x \\ p \equiv a\,(q)}} \frac{\log p}{p} = \frac{\log C}{\varphi(q)} + O_q(1).$$
(d) Show that for any positive integer $q$ there is a small number $c_q$ and a
large number $C_q$ such that if $x \ge 2C_q$ and $(a, q) = 1$, then
$$\sum_{\substack{x/C_q < p \le x \\ p \equiv a\,(q)}} \frac{\log p}{p} > c_q.$$
(e) Show that for any positive integer $q$ there is a $C_q$ such that if $(a, q) = 1$,
then
$$\pi(x; q, a) \gg_q \frac{x}{\log x}$$
uniformly for $x \ge C_q$.
(f) Show that if $(a, q) = 1$, then
$$\liminf_{x \to \infty} \frac{\pi(x; q, a)}{x/\log x} \le \frac{1}{\varphi(q)}, \qquad \limsup_{x \to \infty} \frac{\pi(x; q, a)}{x/\log x} \ge \frac{1}{\varphi(q)}.$$
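6. (a) Show that
$$\vartheta(x) \le \pi(x)\log x \le \vartheta(x) + O\Big(\frac{x}{\log x}\Big)$$
for $x \ge 2$.
(b) Let $\mathcal{P}$ denote a set of prime numbers, and put
$$\pi_{\mathcal{P}}(x) = \sum_{\substack{p \le x \\ p \in \mathcal{P}}} 1, \qquad \vartheta_{\mathcal{P}}(x) = \sum_{\substack{p \le x \\ p \in \mathcal{P}}} \log p.$$
Show that
$$\vartheta_{\mathcal{P}}(x) = \pi_{\mathcal{P}}(x)\log x + O\Big(\frac{x}{\log x}\Big)$$
for $x \ge 2$, where the implicit constant is absolute.
(c) Let
$$n = \prod_{\substack{p \le y \\ p \in \mathcal{P}}} p.$$
Show that $\log n = \omega(n)\log y + O(y/\log y)$ for $y \ge 2$.
(d) From now on, assume that $\vartheta_{\mathcal{P}}(x) \gg x$ for all sufficiently large $x$, where
the implicit constant may depend on $\mathcal{P}$. Show that $\log\log n = \log y +
O_{\mathcal{P}}(1)$.
(e) Deduce that
$$d(n) = n^{(\log 2 + o(1))/\log\log n}$$
as $y \to \infty$.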
7. Let $R(n)$ denote the number of ordered pairs $a, b$ such that $a^2 + b^2 = n$
with $a \ge 0$ and $b > 0$. Also, let $r(n)$ denote the number of such pairs for
which $(a, b) = 1$. Finally, let $\chi_{-4}(n) = \big(\frac{-4}{n}\big)$ be the non-principal character
(mod 4). We recall that if the prime factorization of $n$ is written in the form
$$n = 2^\alpha \prod_{\substack{p^\beta \| n \\ p \equiv 1\,(4)}} p^\beta \prod_{\substack{q^\gamma \| n \\ q \equiv 3\,(4)}} q^\gamma,$$
then $r(n) > 0$ if and only if $\gamma = 0$ for all primes $q$ and $\alpha \le 1$. We also
recall that
$$R(n) = \sum_{d^2 \mid n} r(n/d^2) = \sum_{d \mid n} \chi_{-4}(d) = \begin{cases} \prod_p (\beta + 1) & \text{if } 2 \mid \gamma \text{ for all } q, \\ 0 & \text{otherwise.} \end{cases}$$
(a) Show that $\sum_{n=1}^{\infty} R(n)n^{-s} = \zeta(s)L(s, \chi_{-4})$ for $\sigma > 1$.
(b) Show that $\sum_{n=1}^{\infty} r(n)n^{-s} = \zeta(s)L(s, \chi_{-4})/\zeta(2s)$ for $\sigma > 1$.
(c) Show that if $x \ge 0$ and $y \ge 2$, then
$$\operatorname{card}\{n \in (x, x + y] : r(n) > 0\} \ll \frac{y}{\sqrt{\log y}}.$$
(d) Show that
$$\operatorname{card}\{n \le x : R(n) > 0\} \gg \frac{x}{\sqrt{\log x}}$$
for $x \ge 2$.
(e) Suppose that $n$ is of the form
$$n = \prod_{\substack{p \le y \\ p \equiv 1\,(4)}} p.$$
Thus $\log n = \vartheta(y; 4, 1) \asymp y$ for $y \ge 5$, and hence $\log y = \log\log n +
O(1)$. Show that for such $n$,
$$R(n) = n^{(\log 2 + o(1))/\log\log n}.$$
In the above it is noteworthy that although $R(n) \le d(n)$ for all $n$, and
$R(n)$ is usually 0 and has a smaller average value (cf. Exercise 2.1.9)
than $d(n)$ (cf. Theorem 2.3), the maximum order of magnitude of $R(n)$
is the same as for $d(n)$.
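8. Let $K = \mathbb{Q}(\sqrt{-1})$ be the Gaussian field, $\mathcal{O}_K = \{a + ib : a, b \in \mathbb{Z}\}$ the ring
of integers in $K$. Ideals $\mathfrak{a}$ in $\mathcal{O}_K$ are principal, $\mathfrak{a} = (a + ib)$, and have norm
$N(\mathfrak{a}) = a^2 + b^2$.
(a) Explain why the number of ideals $\mathfrak{a}$ with $N(\mathfrak{a}) \le x$ is $\frac{\pi}{4} x + O(x^{1/2})$.
(b) For $\sigma > 1$, let $\zeta_K(s) = \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$ be the Dedekind zeta function of
$K$. Show that $\zeta_K(s) = \zeta(s)L(s, \chi_{-4})$.
(c) For the Gaussian field $K$, show that $N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$. (This is true
in any algebraic number field.)
(d) Assume that ideals in $K$ factor uniquely into prime ideals. (This is true
in any algebraic number field, and is particularly easy to establish for
the Gaussian field since it has a division algorithm.) Deduce that if
$\sigma > 1$, then
$$\zeta_K(s) = \prod_{\mathfrak{p}} \Big(1 - \frac{1}{N(\mathfrak{p})^s}\Big)^{-1}$$
where the product runs over all prime ideals $\mathfrak{p}$ in $\mathcal{O}_K$.
(e) Define a function $\mu(\mathfrak{a}) = \mu_K(\mathfrak{a})$ in such a way that
$$\frac{1}{\zeta_K(s)} = \sum_{\mathfrak{a}} \frac{\mu(\mathfrak{a})}{N(\mathfrak{a})^s}$$
for $\sigma > 1$.
(f) Let $\mathfrak{a}$ and $\mathfrak{b}$ be given ideals. Show that
$$\sum_{\substack{\mathfrak{d} \mid \mathfrak{a} \\ \mathfrak{d} \mid \mathfrak{b}}} \mu(\mathfrak{d}) = \begin{cases} 1 & \text{if } \gcd(\mathfrak{a}, \mathfrak{b}) = 1, \\ 0 & \text{otherwise.} \end{cases}$$
(g) Among pairs $\mathfrak{a}, \mathfrak{b}$ of ideals with $N(\mathfrak{a}) \le x$, $N(\mathfrak{b}) \le x$, show that the
probability that $\gcd(\mathfrak{a}, \mathfrak{b}) = 1$ is
$$\frac{1}{\zeta_K(2)} + O\big(x^{-1/2}\big) = \frac{6}{\pi^2 L(2, \chi_{-4})} + O\big(x^{-1/2}\big).$$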
9. (Erdős 1946, 1949, 1957, Vaughan 1974, Saffari, unpublished, but see
Bateman, Pomerance & Vaughan 1981; cf. Exercise 2.3.7) Let $\Phi_q(z) =
\prod_{d \mid q} (z^d - 1)^{\mu(q/d)}$ denote the $q$th cyclotomic polynomial. Suppose that
$$q = \prod_{\substack{p \le y \\ p \equiv \pm 2\,(5)}} p$$
where $y$ is chosen so that $\omega(q)$ is odd.
(a) Show that if $d \mid q$ and $\omega(d)$ is even, then $|e(d/5) - 1| = |e(1/5) - 1|$.
(b) Show that if $d \mid q$ and $\omega(d)$ is odd, then $|e(d/5) - 1| = |e(2/5) - 1|$.
(c) Deduce that $|\Phi_q(e(1/5))| = |e(1/5) + 1|^{d(q)/2}$.
(d) Deduce that $\Phi_q(z)$ has a coefficient whose absolute value is at least
$$\exp\big(q^{(\log 2 - \varepsilon)/\log\log q}\big)$$
if $y > y_0(\varepsilon)$.
10. Grössencharaktere for $\mathbb{Q}(\sqrt{-1})$, continued from Exercise 4.2.7.
(a) For $\sigma > 1$ put
$$L(s, \chi_m) = {\sum_{\alpha \in \mathcal{O}_K}}' \chi_m(\alpha)N(\alpha)^{-s} = \frac{1}{4}\sum_{\substack{a, b \in \mathbb{Z}\\ (a,b) \ne (0,0)}} \chi_m(a + bi)(a^2 + b^2)^{-s}$$
where $\sum'_{\alpha}$ denotes a sum over unassociated members of $\mathcal{O}_K$. Show
that the above sum is absolutely convergent in this half-plane.
(b) We recall that members of $\mathcal{O}_K$ factor uniquely into Gaussian primes.
Also, the Gaussian primes are obtained by factoring the rational primes:
The prime 2 ramifies, $2 = i^3(1 + i)^2$, the rational primes $p \equiv 1 \pmod 4$
split into two distinct Gaussian primes, $p = (a + bi)(a - bi)$, and the
rational primes $q \equiv 3 \pmod 4$ are inert. Show that
$$L(s, \chi_m) = \prod_{\mathfrak{p}} \big(1 - \chi_m(\mathfrak{p})N(\mathfrak{p})^{-s}\big)^{-1}$$
for $\sigma > 1$ where the product is over an unassociated family of Gaussian
primes $\mathfrak{p}$.
(c) By grouping associates together, show that if $4 \nmid m$, then the sum
$$\sum_{\substack{a, b \in \mathbb{Z}\\ (a,b) \ne (0,0)}} e^{mi \arg(a + bi)}(a^2 + b^2)^{-s}$$
vanishes identically for $\sigma > 1$.
(d) For $0 \le \theta \le 2\pi$, put $N(x; \theta) = \operatorname{card}\{(a, b) \in \mathbb{Z}^2 : a^2 + b^2 \le x,\ 0 < \arg(a + bi) \le \theta\}$. Show that for $x \ge 1$,
$$N(x; \theta) = \frac{\theta}{2}x + O\big(x^{1/2}\big)$$
uniformly in $\theta$.
(e) Show that if $m \ne 0$, then
$$\sum_{\substack{a^2 + b^2 \le x\\ a > 0,\ b \ge 0}} \chi_m(a + bi) = \int_0^{\pi/2} e^{4mi\theta}\, dN(x; \theta) \ll |m|x^{1/2}.$$
(f) Show that if $m \ne 0$, then the Dirichlet series $L(s, \chi_m)$ is convergent for
$\sigma > 1/2$.
(g) Show that $L(s, \chi_m)$ and $L(s, \chi_{-m})$ are identically equal, and hence that
$L(\sigma, \chi_m) \in \mathbb{R}$ for $\sigma > 1/2$.

4.4 Notes
Section 4.1. Ramanujan’s sum was introduced by Ramanujan (1918). Incredibly,
both Hardy and Ramanujan missed the fact that $c_q(n)$ can be written in closed
form: the formula on the extreme right of (4.7) is due to Hölder (1936). Normally
one would say that a function $f$ is even if $f(x) = f(-x)$. However, in
the present context, an arithmetic function $f$ with period $q$ is said to be even
if $f(n)$ is a function only of $(n, q)$. Thus $c_q(n)$ is an even function. The space
of almost-even functions is rather small, but includes several arithmetic functions
of interest. For such functions one may hope for a representation in the
form $f(n) = \sum_{q=1}^{\infty} a_q c_q(n)$, called a Ramanujan expansion. For a survey of the
theory of such expansions, see Schwarz (1988). Hildebrand (1984) established
definitive results concerning the pointwise convergence of Ramanujan expansions.
An appropriate Parseval identity has been established for mean-square
summable almost-even functions; see Hildebrand, Schwarz & Spilker (1988).
Section 4.2. The first instance of characters of a non-cyclic group occurs in
Gauss’s analysis of the genus structure of the class group of binary quadratic
forms. The quotient of the class group by the principal genus is isomorphic to
C2 ⊗ C2 ⊗ · · · ⊗ C2 , and the associated characters are given by Kronecker’s
symbol. Dirichlet (1839) defined the Dirichlet characters for the multiplicative
group (Z/qZ)× of reduced residues modulo q, and the same technique suffices
to construct the characters for any finite Abelian group. More generally, if
$G$ is a group, then a homomorphism $h : G \longrightarrow GL(n, \mathbb{C})$ is called a group
representation, and the trace of $h(g)$ is a group character. Note that if $a$ and
$b$ are conjugate elements of $G$, say $a = gbg^{-1}$, then $h(a)$ and $h(b)$ are similar
matrices. Hence they have the same eigenvalues, and in particular tr h(a) =
tr h(b). Thus a group character is constant on conjugacy classes. In the case of a
finite Abelian group it suffices to take n = 1, and in this case the representation
and its trace are essentially the same. For an introduction to characters in a
wider setting, see Serre (1977).
Section 4.3. Dirichlet (1837a,b,c) first proved Corollary 4.10 in the case that
$q$ is prime. The definition of the Dirichlet characters is not difficult in that case,
since the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^\times$ of reduced residues is cyclic. The most
challenging part of the proof is to show that $L(1, \chi) \ne 0$ when $\chi$ is the Legendre
symbol (mod $p$). If $p \equiv 3 \pmod 4$, then
$$\sum_{a=1}^{p-1} a\Big(\frac{a}{p}\Big) \equiv \sum_{a=1}^{p-1} a = \frac{p(p-1)}{2} \equiv 1 \pmod 2,$$
and hence the sum on the left is non-zero. It follows by (9.9) that $L(1, \chi_p) \ne 0$
in this case. If $p \equiv 1 \pmod 4$, then one has the identity of Exercise 9.3.7(c),
and thus to show that $L(1, \chi_p) \ne 0$ it suffices to show that $Q \ne 1$. Dirichlet
established this by means of Gauss’s theory of cyclotomy. Accounts of this are
found in Davenport (2000, Sections 1–3), and in Narkiewicz (2000, pp. 64–65).
An alternative proof that $Q \ne 1$ was given more recently by Chowla &
Mordell (1961) (cf. Exercise 9.3.8). In order to prove that $L(1, \chi) \ne 0$ when $\chi$
is quadratic, Dirichlet related $L(1, \chi)$ to the class number of binary quadratic
forms. Suppose that $d$ is a fundamental quadratic discriminant, and put
$\chi_d(n) = \big(\frac{d}{n}\big)$, the Kronecker symbol (as discussed in Section 9.3). Suppose first that
$d > 0$. Among the solutions of Pell’s equation $x^2 - dy^2 = 4$, let $(x_0, y_0)$ be
the solution with $x_0 > 0$, $y_0 > 0$, and $y_0$ minimal, and put $\eta = \frac{1}{2}(x_0 + y_0\sqrt{d})$.
Dirichlet showed that
$$L(1, \chi_d) = \frac{h \log \eta}{\sqrt{d}} \tag{4.35}$$
where $h$ is the number of equivalence classes of binary quadratic forms with
discriminant $d$. Since $h \ge 1$ and $y_0 \ge 1$, it follows that $L(1, \chi_d) \gg (\log d)/\sqrt{d}$
in this case. Now suppose that $d < 0$ and that $w$ denotes the number of automorphs
of the positive definite binary quadratic forms of discriminant $d$ (i.e.,
$w = 6$ if $d = -3$, $w = 4$ if $d = -4$, and $w = 2$ if $d < -4$). Dirichlet showed
that
$$L(1, \chi_d) = \frac{2\pi h}{w\sqrt{-d}}. \tag{4.36}$$
Thus $L(1, \chi_d) \ge \pi/\sqrt{-d}$ when $d < -4$.
Our treatment of quadratic characters in the proof of Theorem 4.9 is due
to Landau (1906). Mertens (1895a,b, 1897, 1899) gave two elementary proofs
that L(1, χ) > 0 when χ is quadratic; cf. Exercises 2.4.2 and 2.4.3. For a
definitive account of Mertens’ methods, see Bateman (1959). Other proofs
have been given by Teege (1901), Gel’fond & Linnik (1962, Chapter 3 Section
2), Bateman (1966, 1997), Pintz (1971), and Monsky (1993). See also Baker,
Birch & Wirsing (1973).

4.5 References
Baker, A., Birch, B. J., & Wirsing, E. A. (1973). On a problem of Chowla, J. Number
Theory 5, 224–236.
Bateman, P. T. (1959). Theorems implying the non-vanishing of $\sum \chi(m)m^{-1}$ for real
residue-characters, J. Indian Math. Soc. 23, 101–115.
(1966). Lower bounds for $\sum h(m)/m$ for arithmetical functions h similar to real
residue characters, J. Math. Anal. Appl. 15, 2–20.
(1997). A theorem of Ingham implying that Dirichlet’s L-functions have no zeros
with real part one, Enseignement Math. (2) 43, 281–284.
Bateman, P. T., Pomerance, C., & Vaughan, R. C. (1981). On the size of the coefficients
of the cyclotomic polynomial, Coll. Math. Soc. J. Bolyai, pp. 171–202.
Carmichael, R. (1932). Expansions of arithmetical functions in infinite series, Proc.
London Math. Soc. (2) 34, 1–26.
Chowla, S. & Mordell, L. J. (1961). Note on the nonvanishing of L(1), Proc. Amer.
Math. Soc. 12, 283–284.
Davenport, H. (2000). Multiplicative Number Theory, Graduate Texts Math. 74. New
York: Springer-Verlag.
Delange, H. (1976). On Ramanujan expansions of certain arithmetical functions, Acta
Arith. 31, 259–270.
Dirichlet, P. G. L. (1837a). Sur l’usage des intégrales définies dans la sommation des
séries finies ou infinies, J. Reine Angew. Math. 17, 57–67; Werke, Vol. 1, Berlin:
Reimer, 1889, pp. 237–256.
(1837b). Beweis eines Satzes ueber die arithmetische Progression, Ber Verhandl. Kgl.
Preuss. Akad. Wiss., 108–110; Werke, Vol. 1, Berlin: Reimer, 1889, pp. 307–312.
(1837c). Beweis des Satzes, dass jede unbegrenzte arithmetische Progression, deren
erstes Glied und Differenz ganze Zahlen ohne gemeinschaftlichen Factor sind, un-
endlich viele Primzahlen enthält, Abhandl. Kgl. Preuss. Akad. Wiss. 45–81; Werke,
Vol. 1, Berlin: Reimer, 1889, pp. 313–342.
(1839). Recherches sur diverses applications de l’analyse infinitésimale a la théorie
des nombres, J. Reine Angew. Math. 19, 324–369; Werke, Vol. 1, Berlin: Reimer,
1889, pp. 411–496.
Erdős, P. (1946). On the coefficients of the cyclotomic polynomial, Bull. Amer. Math.
Soc. 52, 179–184.
(1949). On the coefficients of the cyclotomic polynomial, Portugal. Math. 8, 63–71.
(1957). On the growth of the cyclotomic polynomial in the interval (0, 1), Proc.
Glasgow Math. Assoc. 3, 102–104.
Friedman, A. (1957). Mean-values and polyharmonic polynomials, Michigan Math. J.
4, 67–74.
Gel’fond, A. O. & Linnik, Ju. V. (1962). Elementary Methods in Analytic Number
Theory. Moscow: Gosudarstv. Izdat. Fiz.-Mat. Lit.; English translation, Chicago:
Rand McNally, 1965; English translation, Cambridge: M. I. T. Press, 1966.
Grytczuk, A. (1981). An identity involving Ramanujan’s sum, Elem. Math. 36, 16–17.
Hildebrand, A. (1984). Über die punktweise Konvergenz von Ramanujan-Entwicklungen
zahlentheoretischer Funktionen, Acta Arith. 44, 108–140.
Hildebrand, A., Schwarz, W., & Spilker, J. (1988). Still another proof of Parseval’s
equation for almost-even arithmetical functions, Aequationes Math. 35, 132–139.
Hölder, O. (1936). Zur Theorie der Kreisteilungsgleichung, Prace Mat.–Fiz. 43, 13–23.
Ingham, A. E. (1929). Note on Riemann’s ζ -function and Dirichlet’s L-functions,
J. London Math. Soc. 5, 107–112.
Landau, E. (1906). Über das Nichtverschwinden einer Dirichletschen Reihe, Sitzungsber.
Akad. Wiss. Berlin 11, 314–320; Collected Works, Vol. 2. Essen: Thales, 1986, pp.
230–236.
Mertens, F. (1895a). Über Dirichletsche Reihen, Sitzungsber. Kais. Akad. Wiss. Wien
104, 2a, 1093–1153.
(1895b). Über das Nichtverschwinden Dirichletscher Reihen mit reellen Gliedern,
Sitzber. Kais. Akad. Wiss. Wien 104, 2a, 1158–1166.
(1897). Über Multiplikation und Nichtverschwinden Dirichlet’scher Reihen, J. Reine
Angew. Math. 117, 169–184.
(1899). Eine asymptotische Aufgabe, Sitzber. Kais. Akad. Wiss. Wien 108, 2a, 32–37.
Monsky, P. (1993). Simplifying the proof of Dirichlet’s theorem, Amer. Math. Monthly
100, 861–862.
Narkiewicz, W. (2000). The Development of Prime Number Theory, Berlin: Springer-
Verlag.
Pintz, J. (1971). On a certain point in the theory of Dirichlet’s L-functions, I,II, Mat.
Lapok 22, 143–148; 331–335.
Ramanujan, S. (1918). On certain trigonometrical sums and their applications in the
theory of numbers, Trans. Cambridge Philos. Soc. 22, 259–276; Collected papers.
Cambridge: Cambridge University Press, 1927, pp. 179–199.
Redmond, D. (1983). A remark on a paper: “An identity involving Ramanujan’s sum”
by A. Grytczuk, Elem. Math. 38, 17–20.
Reznick, B. (1995). Some constructions of spherical 5-designs, Linear Algebra Appl.,
226/228, 163–196.
Schwarz, W. (1988). Ramanujan expansions of arithmetical functions, Ramanujan revis-
ited, Proc. Centenary Conference (Urbana, June 1987). Boston: Academic Press,
pp. 187–214.
Serre, J.–P. (1977). Linear Representations of Finite Groups, Graduate Texts Math. 42.
New York: Springer-Verlag.

Teege, H. (1901). Beweis, daß die unendliche Reihe $\sum_{n=1}^{n=\infty} \big(\frac{p}{n}\big)\frac{1}{n}$ einen positiven von
Null verschiedenen Wert hat, Mitt. Math. Ges. Hamburg 4, 1–11.
Vaughan, R. C. (1974). Bounds for the coefﬁcients of cyclotomic polynomials, Michigan
Math. J. 21, 289–295.
Wintner, A. (1943). Eratosthenian averages. Baltimore: Waverly Press.
5
Dirichlet series: II

5.1 The inverse Mellin transform

In Chapter 1 we saw that we can express a Dirichlet series $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$
in terms of the coefficient sum $A(x) = \sum_{n \le x} a_n$, by means of the formula
$$\alpha(s) = s\int_1^{\infty} A(x)x^{-s-1}\,dx, \tag{5.1}$$
which holds for $\sigma > \max(0, \sigma_c)$. This is an example of a Mellin transform. In
the reverse direction, Perron’s formula asserts that
$$A(x) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\frac{x^s}{s}\,ds \tag{5.2}$$
for $\sigma_0 > \max(0, \sigma_c)$. This is an example of an inverse Mellin transform.
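To understand why we might expect that (5.2) should be true, note that if
$\sigma_0 > 0$, then by the calculus of residues
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} y^s\,\frac{ds}{s} = \begin{cases} 1 & \text{if } y > 1,\\ 0 & \text{if } 0 < y < 1.\end{cases} \tag{5.3}$$
Thus we would expect that
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\frac{x^s}{s}\,ds = \sum_{n} a_n \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Big(\frac{x}{n}\Big)^{s}\frac{ds}{s} = \sum_{n \le x} a_n. \tag{5.4}$$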

The interchange of limits here is difficult to justify, since $\alpha(s)$ may not be
uniformly convergent, and because the integral in (5.3) is neither uniformly nor
absolutely convergent. Moreover, if $x$ is an integer, then the term $n = x$ in (5.4)
gives rise to the integral (5.3) with $y = 1$, and this integral does not converge,
although its Cauchy principal value exists:
$$\lim_{T \to \infty} \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \frac{ds}{s} = \frac{1}{2} \tag{5.5}$$
for $\sigma_0 > 0$. We now give a rigorous form of Perron’s formula.


Theorem 5.1 (Perron’s formula) If $\sigma_0 > \max(0, \sigma_c)$ and $x > 0$, then
$${\sum_{n \le x}}' a_n = \lim_{T \to \infty} \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s)\frac{x^s}{s}\,ds.$$
Here $\sum'$ indicates that if $x$ is an integer, then the last term is to be counted with
weight 1/2.
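Proof Choose $N$ so large that $N > 2x + 2$, and write
$$\alpha(s) = \sum_{n \le N} a_n n^{-s} + \sum_{n > N} a_n n^{-s} = \alpha_1(s) + \alpha_2(s),$$
say. By (5.4), modified in recognition of (5.5), we see that
$${\sum_{n \le x}}' a_n = \lim_{T \to \infty} \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha_1(s)\frac{x^s}{s}\,ds;$$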

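here the justification is trivial since there are only finitely many terms. As for
$\alpha_2(s)$, we observe that
$$\alpha_2(s) = \int_N^{\infty} u^{-s}\,d(A(u) - A(N)) = s\int_N^{\infty} (A(u) - A(N))u^{-s-1}\,du.$$
But $A(u) - A(N) \ll u^{\theta}$ for $\theta > \max(0, \sigma_c)$, and hence
$$\alpha_2(s) \ll \Big(1 + \frac{|s|}{\sigma - \theta}\Big)N^{\theta - \sigma}$$
for $\sigma > \theta > \max(0, \sigma_c)$. Implicit constants here and in the rest of this proof
may depend on the $a_n$. Hence
$$\int_{\sigma_0 \pm iT}^{T \pm iT} \alpha_2(s)\frac{x^s}{s}\,ds \ll \frac{N^{\theta}}{\sigma_0 - \theta}\int_{\sigma_0}^{\infty} \Big(\frac{x}{N}\Big)^{\sigma} d\sigma = \frac{N^{\theta}(x/N)^{\sigma_0}}{(\sigma_0 - \theta)\log N/x},$$
and
$$\int_{T - iT}^{T + iT} \alpha_2(s)\frac{x^s}{s}\,ds \ll N^{\theta}(x/N)^{\sigma_0}$$
for large $T$. We take $\theta$ so that $\sigma_0 > \theta > \max(0, \sigma_c)$. Hence by Cauchy’s theorem
$$\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha_2(s)\frac{x^s}{s}\,ds = \int_{\sigma_0 - iT}^{T - iT} + \int_{T - iT}^{T + iT} + \int_{T + iT}^{\sigma_0 + iT} \ll x^{\sigma_0} N^{\theta - \sigma_0}.$$
On combining our estimates, we see that
$$\limsup_{T \to \infty} \bigg| {\sum_{n \le x}}' a_n - \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s)\frac{x^s}{s}\,ds \bigg| \ll x^{\sigma_0} N^{\theta - \sigma_0}.$$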
Since this holds for arbitrarily large N , it follows that the lim sup is 0, and the
proof is complete.

We have now established a precise relationship between (5.1) and (5.2), but
Theorem 5.1 is not sufficiently quantitative to be useful in practice. We express
the error term more explicitly in terms of the sine integral
$$\operatorname{si}(x) = -\int_x^{\infty} \frac{\sin u}{u}\,du.$$
By integration by parts we see that $\operatorname{si}(x) \ll 1/x$ for $x \ge 1$, and hence that
$$\operatorname{si}(x) \ll \min(1, 1/x) \tag{5.6}$$
for $x > 0$. We also note that
$$\operatorname{si}(x) + \operatorname{si}(-x) = -\int_{-\infty}^{+\infty} \frac{\sin u}{u}\,du = -\pi. \tag{5.7}$$
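The properties of the sine integral quoted above can be checked numerically. The following is a minimal sketch, assuming a plain composite Simpson rule is accurate enough for moderate arguments; the helper names `sine_integral_Si` and `si` are illustrative and do not come from the text.

```python
import math

def sine_integral_Si(x, steps=20000):
    """Simpson approximation to Si(x) = integral_0^x sin(u)/u du, for x >= 0."""
    if x == 0:
        return 0.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals

    def f(u):
        # sin(u)/u extended continuously by the value 1 at u = 0
        return 1.0 if u == 0.0 else math.sin(u) / u

    h = x / steps
    total = f(0.0) + f(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3

def si(x):
    """si(x) = -integral_x^infinity sin(u)/u du, computed as Si(x) - pi/2."""
    if x < 0:
        # sin(u)/u is even, so the integral from x to 0 equals Si(|x|)
        return -sine_integral_Si(-x) - math.pi / 2
    return sine_integral_Si(x) - math.pi / 2
```

With these definitions one can compare `si(x) + si(-x)` against $-\pi$ as in (5.7), and compare `abs(si(x))` against `min(1, 1/x)` for a few values of $x$ as in (5.6).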
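Theorem 5.2 If $\sigma_0 > \max(0, \sigma_a)$ and $x > 0$, then
$${\sum_{n \le x}}' a_n = \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s)\frac{x^s}{s}\,ds + R \tag{5.8}$$
where
$$R = \frac{1}{\pi}\sum_{x/2 < n < x} a_n \operatorname{si}\Big(T\log\frac{x}{n}\Big) - \frac{1}{\pi}\sum_{x < n < 2x} a_n \operatorname{si}\Big(T\log\frac{n}{x}\Big) + O\bigg(\frac{4^{\sigma_0} + x^{\sigma_0}}{T}\sum_{n} \frac{|a_n|}{n^{\sigma_0}}\bigg).$$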

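Proof Since the series $\alpha(s)$ is absolutely convergent on the interval $[\sigma_0 - iT, \sigma_0 + iT]$, we see that
$$\frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s)\frac{x^s}{s}\,ds = \sum_{n} a_n \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \Big(\frac{x}{n}\Big)^{s}\frac{ds}{s}.$$
Thus it suffices to show that
$$\frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} y^s\,\frac{ds}{s} = \begin{cases} 1 + O(y^{\sigma_0}/T) & \text{if } y \ge 2,\\ 1 + \frac{1}{\pi}\operatorname{si}(T\log y) + O(2^{\sigma_0}/T) & \text{if } 1 \le y \le 2,\\ -\frac{1}{\pi}\operatorname{si}(T\log 1/y) + O(2^{\sigma_0}/T) & \text{if } 1/2 \le y \le 1,\\ O(y^{\sigma_0}/T) & \text{if } y \le 1/2 \end{cases} \tag{5.9}$$
for $\sigma_0 > 0$.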
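To establish the first part of this formula, suppose that $y \ge 2$, and let $C$ be
the piecewise linear path from $-\infty - iT$ to $\sigma_0 - iT$ to $\sigma_0 + iT$ to $-\infty + iT$.
Then by the calculus of residues we see that
$$\frac{1}{2\pi i}\int_C y^s\,\frac{ds}{s} = 1,$$
since the integrand has a pole with residue 1 at $s = 0$. In addition,
$$\int_{-\infty \pm iT}^{\sigma_0 \pm iT} y^s\,\frac{ds}{s} = \int_{-\infty}^{\sigma_0} \frac{y^{\sigma \pm iT}}{\sigma \pm iT}\,d\sigma \ll \frac{1}{T}\int_{-\infty}^{\sigma_0} y^{\sigma}\,d\sigma = \frac{y^{\sigma_0}}{T\log y} \ll \frac{y^{\sigma_0}}{T},$$
so we have (5.9) in the case $y \ge 2$. The case $y \le 1/2$ is treated similarly, but
the contour is taken to the right, and there is no residue.
Suppose now that $1 \le y \le 2$, and take $C$ to be the closed rectangular path
from $\sigma_0 - iT$ to $\sigma_0 + iT$ to $iT$ to $-iT$ to $\sigma_0 - iT$, with a semicircular indentation
of radius $\varepsilon$ at $s = 0$. Then by Cauchy’s theorem
$$\frac{1}{2\pi i}\int_C y^s\,\frac{ds}{s} = 0.$$
We note that
$$\int_{\pm iT}^{\sigma_0 \pm iT} y^s\,\frac{ds}{s} \ll \frac{1}{T}\int_0^{\sigma_0} y^{\sigma}\,d\sigma \le \frac{1}{T}\int_0^{\sigma_0} 2^{\sigma}\,d\sigma \ll \frac{2^{\sigma_0}}{T}.$$
The integral around the semicircle tends to 1/2 as $\varepsilon \to 0$, and the remaining
integral is
$$\frac{1}{2\pi i}\lim_{\varepsilon \to 0}\bigg(\int_{i\varepsilon}^{iT} + \int_{-iT}^{-i\varepsilon}\bigg) y^s\,\frac{ds}{s} = \frac{1}{2\pi i}\lim_{\varepsilon \to 0}\int_{\varepsilon}^{T} \big(y^{it} - y^{-it}\big)\frac{dt}{t} = \frac{1}{\pi}\int_0^{T\log y} \sin v\,\frac{dv}{v} = \frac{1}{2} + \frac{1}{\pi}\operatorname{si}(T\log y)$$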
by (5.7). This gives (5.9) when 1 ≤ y ≤ 2 and the case 1/2 ≤ y ≤ 1 is treated
similarly.
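The behaviour described by (5.9) can be observed numerically. The sketch below approximates the truncated integral $\frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} y^s\,ds/s$ by a composite Simpson rule in the variable $t$; the function name `truncated_perron_kernel`, the default parameters and the step count are assumptions chosen for illustration, not part of the text.

```python
import cmath
import math

def truncated_perron_kernel(y, sigma0=1.0, T=200.0, steps=20000):
    """Approximate (1/(2*pi*i)) * integral of y**s / s over s = sigma0 + i*t, |t| <= T."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    log_y = math.log(y)

    def integrand(t):
        s = complex(sigma0, t)
        return cmath.exp(s * log_y) / s

    h = 2 * T / steps
    total = integrand(-T) + integrand(T)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(-T + k * h)
    # ds = i dt, so the factor i cancels against the 1/(2*pi*i) in front
    return (total * h / 3 / (2 * math.pi)).real
```

For example, `truncated_perron_kernel(3.0)` should be close to 1, `truncated_perron_kernel(1/3)` close to 0, and `truncated_perron_kernel(1.0)` close to 1/2, each up to an error of the size $O(1/T)$ allowed by (5.9).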

In many situations, Theorem 5.2 contains more information than is really

needed – it is often more convenient to appeal to the following less precise result.
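Corollary 5.3 In the situation of Theorem 5.2,
$$R \ll \sum_{\substack{x/2 < n < 2x\\ n \ne x}} |a_n| \min\Big(1, \frac{x}{T|x - n|}\Big) + \frac{4^{\sigma_0} + x^{\sigma_0}}{T}\sum_{n=1}^{\infty} \frac{|a_n|}{n^{\sigma_0}}.$$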

Proof From (5.6) we see that

In classical harmonic analysis, for $f \in L^1(\mathbb{T})$ we define Fourier coefficients
$\hat{f}(k) = \int_0^1 f(\alpha)e(-k\alpha)\,d\alpha$, and we expect that the Fourier series $\sum \hat{f}(k)e(k\alpha)$
provides a useful formula for $f(\alpha)$. As it happens, the Fourier series may
diverge, or converge to a value other than $f(\alpha)$, but for most $f$ a satisfactory
alternative can be found. For example, if $f$ is of bounded variation, then
$$\frac{f(\alpha^-) + f(\alpha^+)}{2} = \lim_{K \to \infty} \sum_{k=-K}^{K} \hat{f}(k)e(k\alpha).$$

A sharp quantitative form of this is established in Appendix D.1. Analogously,
if $f \in L^1(\mathbb{R})$, then we can define the Fourier transform of $f$,
$$\hat{f}(t) = \int_{-\infty}^{+\infty} f(x)e(-tx)\,dx, \tag{5.10}$$
and we expect that
$$f(x) = \int_{-\infty}^{+\infty} \hat{f}(t)e(tx)\,dt. \tag{5.11}$$
As in the case of Fourier series, this may fail, but it is not difficult to show that
if $f$ is of bounded variation on $[-A, A]$ for every $A$, then
$$\frac{f(x^-) + f(x^+)}{2} = \lim_{T \to \infty} \int_{-T}^{T} \hat{f}(t)e(tx)\,dt. \tag{5.12}$$

The relationship between (5.1) and (5.2) is precisely the same as between
(5.10) and (5.11). Indeed, if we take $f(x) = A(e^{2\pi x})e^{-2\pi\sigma x}$, then $f \in L^1(\mathbb{R})$ by
Theorem 1.3, and by changing variables in (5.1) we find that
$$\hat{f}(t) = \frac{\alpha(\sigma + it)}{2\pi(\sigma + it)}.$$
Thus (5.2) is equivalent to (5.11), and an appeal to (5.12) provides a second
(real variable) proof of Theorem 5.1.
In general, if
$$F(s) = \int_0^{\infty} f(x)x^{s-1}\,dx, \tag{5.13}$$
then we say that $F(s)$ is the Mellin transform of $f(x)$. By (5.10) and (5.11) we
expect that
$$f(x) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} F(s)x^{-s}\,ds, \tag{5.14}$$

and when this latter formula holds we say that f is the inverse Mellin transform
of F. Thus if A(x) is the summatory function of a Dirichlet series α(s), then
α(s)/s is the Mellin transform of A(1/x) for σ > max(0, σc ), and Perron’s
formula (Theorem 5.1) asserts that if σ0 > max(0, σc ), then A(1/x) is the inverse

Mellin transform of α(s)/s. Further instances of this pairing arise if we take a

weight function $w(x)$, and form a weighted summatory function
$$A_w(x) = \sum_{n=1}^{\infty} a_n w(n/x).$$
Let $K(s)$ denote the Mellin transform of $w(x)$,
$$K(s) = \int_0^{\infty} w(x)x^{s-1}\,dx.$$
Then we expect that
$$\alpha(s)K(s) = \int_0^{\infty} A_w(x)x^{-s-1}\,dx, \tag{5.15}$$
and that
$$A_w(x) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)K(s)x^{s}\,ds. \tag{5.16}$$

Alternatively, we may start with a kernel $K(s)$, and define the weight $w(x)$
to be its inverse Mellin transform. The precise conditions under which these
identities hold depend on the weight or kernel; we mention several important
examples.
1. Cesàro weights. For a positive integer $k$, put
$$C_k(x) = \frac{1}{k!}\sum_{n \le x} a_n (x - n)^k. \tag{5.17}$$
Then $C_k(x) = \int_0^x C_{k-1}(u)\,du$ for $k \ge 1$ where $C_0(x) = A(x)$, and hence
$C_k(x) \ll x^{\theta}$ for $\theta > k + \max(0, \sigma_c)$. (The implicit constant here may depend
on $k$, on $\theta$, and on the $a_n$.) By integrating (5.1) by parts repeatedly, we see
that
$$\alpha(s) = s(s + 1)\cdots(s + k)\int_1^{\infty} C_k(x)x^{-s-k-1}\,dx \tag{5.18}$$
for $\sigma > \max(0, \sigma_c)$. By following the method used to prove Theorem 5.1, it
may also be shown that
$$C_k(x) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\frac{x^{s+k}}{s(s + 1)\cdots(s + k)}\,ds \tag{5.19}$$
when $x > 0$ and $\sigma_0 > \max(0, \sigma_c)$. Here the critical step is to show that if $y \ge 1$
and $\sigma_0 > 0$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \frac{y^s}{s(s + 1)\cdots(s + k)}\,ds = \sum_{j=0}^{k} \operatorname{Res}\frac{y^s}{s(s + 1)\cdots(s + k)}\bigg|_{s=-j}$$
by the calculus of residues; this is
$$= \sum_{j=0}^{k} \frac{(-1)^j y^{-j}}{j!(k - j)!} = \frac{1}{k!}(1 - 1/y)^k$$

by the binomial theorem.
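The residue computation for the Cesàro kernel is a finite identity, so it can be confirmed in exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the coefficient list `a` stands for $a_1, a_2, \ldots$ with $x$ taken to be a positive integer.

```python
from fractions import Fraction
from math import factorial

def residue_sum(y, k):
    """Sum over j = 0..k of the residue of y^s / (s(s+1)...(s+k)) at s = -j."""
    y = Fraction(y)
    return sum(Fraction((-1) ** j, factorial(j) * factorial(k - j)) * y ** (-j)
               for j in range(k + 1))

def closed_form(y, k):
    """The value (1/k!) * (1 - 1/y)^k given by the binomial theorem."""
    return (1 - 1 / Fraction(y)) ** k / factorial(k)

def cesaro_direct(a, x, k):
    """C_k(x) = (1/k!) * sum_{n <= x} a_n (x - n)^k, as in (5.17)."""
    return sum(Fraction(a[n - 1]) * (x - n) ** k for n in range(1, x + 1)) / factorial(k)

def cesaro_via_kernel(a, x, k):
    """The same sum assembled termwise from the kernel value at y = x/n, scaled by x^k."""
    return x ** k * sum(a[n - 1] * closed_form(Fraction(x, n), k) for n in range(1, x + 1))
```

Here `residue_sum(y, k)` and `closed_form(y, k)` agree exactly for rational $y \ge 1$, and `cesaro_direct` agrees with `cesaro_via_kernel`, which mirrors how (5.19) reduces to (5.17) once the terms of $\alpha(s)$ are integrated individually.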

2. Riesz typical means. For positive integers $k$ and positive real $x$ put
$$R_k(x) = \frac{1}{k!}\sum_{n \le x} a_n (\log x/n)^k. \tag{5.20}$$
Then $R_k(x) = \int_0^x R_{k-1}(u)/u\,du$ where $R_0(x) = A(x)$, so that $R_k(x) \ll x^{\theta}$ for
$\theta > \max(0, \sigma_c)$. (The implicit constant here may depend on $k$, on $\theta$, and on the
$a_n$.) By integrating (5.1) by parts repeatedly we see that
$$\alpha(s) = s^{k+1}\int_1^{\infty} R_k(x)x^{-s-1}\,dx \tag{5.21}$$
for $\sigma > \max(0, \sigma_c)$. By following the method used to prove Theorem 5.1 we
also find that
$$R_k(x) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\frac{x^s}{s^{k+1}}\,ds \tag{5.22}$$
when $x > 0$ and $\sigma_0 > \max(0, \sigma_c)$. Here the critical observation is that if $y \ge 1$
and $\sigma_0 > 0$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \frac{y^s}{s^{k+1}}\,ds = \operatorname{Res}\frac{y^s}{s^{k+1}}\bigg|_{s=0} = \frac{1}{k!}(\log y)^k.$$
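The recursion $R_k(x) = \int_0^x R_{k-1}(u)/u\,du$ can be illustrated in the first case $k = 1$, where $A(u)$ is a step function and the integral can be evaluated piece by piece. This is a small sketch under the assumption that the list `a` holds $a_1, a_2, \ldots$ and has at least $\lfloor x \rfloor$ entries; the function names are hypothetical.

```python
import math

def riesz_mean_direct(a, x, k):
    """R_k(x) = (1/k!) * sum_{n <= x} a_n * log(x/n)^k, as in (5.20)."""
    N = math.floor(x)
    return sum(a[n - 1] * math.log(x / n) ** k for n in range(1, N + 1)) / math.factorial(k)

def riesz_mean_k1_by_integration(a, x):
    """R_1(x) = integral_1^x A(u)/u du, using that A(u) is constant on each [n, n+1)."""
    total = 0.0
    partial_sum = 0.0  # running value of A(n) = a_1 + ... + a_n
    for n in range(1, math.floor(x) + 1):
        partial_sum += a[n - 1]
        upper = min(n + 1, x)  # the last piece stops at x
        total += partial_sum * (math.log(upper) - math.log(n))
    return total
```

The two functions return the same value up to rounding, since $A(u) = 0$ for $u < 1$ and summation by parts turns $\sum_n A(n)\log\frac{\min(n+1,x)}{n}$ into $\sum_{n \le x} a_n \log(x/n)$.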
3. Abelian weights. For $\sigma > 0$ we have
$$\Gamma(s) = \int_0^{\infty} e^{-u}u^{s-1}\,du = n^s\int_0^{\infty} e^{-nx}x^{s-1}\,dx.$$
We multiply by $a_n n^{-s}$ and sum, to find that
$$\alpha(s)\Gamma(s) = \int_0^{\infty} P(x)x^{s-1}\,dx \tag{5.23}$$
where
$$P(x) = \sum_{n=1}^{\infty} a_n e^{-nx}. \tag{5.24}$$
These operations are valid for $\sigma > \max(0, \sigma_a)$, but by partial summation
$P(x) \ll x^{-\theta}$ as $x \to 0^+$ for $\theta > \max(0, \sigma_c)$, so that the integral in (5.23) is
absolutely convergent in the half-plane $\sigma > \max(0, \sigma_c)$. Hence the integral is
an analytic function in this half-plane, so that by the principle of uniqueness
of analytic continuation it follows that (5.23) holds for $\sigma > \max(0, \sigma_c)$. In the
opposite direction,
$$P(x) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\Gamma(s)x^{-s}\,ds \tag{5.25}$$
for $x > 0$, $\sigma_0 > \max(0, \sigma_c)$. To prove this we recall from Theorem 1.5 that
$\alpha(s) \ll \tau$ uniformly for $\sigma \ge \varepsilon + \max(0, \sigma_c)$, and from Stirling’s formula
(Theorem C.1) we see that $|\Gamma(s)| \ll e^{-\frac{\pi}{2}|t|}|t|^{\sigma - 1/2}$ as $|t| \to \infty$ with $\sigma$ bounded.
Thus the value of the integral is independent of $\sigma_0$, and in particular we may
assume that $\sigma_0 > \max(0, \sigma_a)$. Consequently the terms in $\alpha(s)$ can be integrated
individually, and it suffices to appeal to Theorem C.4.
The formulæ (5.23) and (5.25) provide an important link between the Dirichlet
series $\alpha(s)$ and the power series generating function $P(x)$. Indeed, these
formulæ hold for complex $x$, provided that $\Re x > 0$. In particular, by taking
$x = \delta - 2\pi i\alpha$ we find that
$$\sum_{n=1}^{\infty} a_n e(n\alpha)e^{-n\delta} = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\Gamma(s)(\delta - 2\pi i\alpha)^{-s}\,ds.$$
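The identity (5.23) is easy to test for a Dirichlet polynomial, where every series is a finite sum. The sketch below assumes real $s > 1$ (so that the integrand vanishes at $x = 0$), truncates the integral at a finite upper limit, and uses a composite Simpson rule; the function names and numerical parameters are illustrative choices, not taken from the text.

```python
import math

def dirichlet_polynomial(a, s):
    """alpha(s) = sum_n a_n n^(-s) for a finite list a = [a_1, ..., a_N]."""
    return sum(a_n * n ** (-s) for n, a_n in enumerate(a, start=1))

def power_series_P(a, x):
    """P(x) = sum_n a_n exp(-n x), as in (5.24)."""
    return sum(a_n * math.exp(-n * x) for n, a_n in enumerate(a, start=1))

def mellin_of_P(a, s, upper=60.0, steps=60000):
    """Simpson approximation to integral_0^upper P(x) x^(s-1) dx, assuming real s > 1."""
    def integrand(x):
        return power_series_P(a, x) * x ** (s - 1) if x > 0 else 0.0

    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3
```

For instance, with `a = [1, -1, 2]` and `s = 3.0`, the product `dirichlet_polynomial(a, s) * math.gamma(s)` and the value `mellin_of_P(a, s)` agree to many decimal places.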

It may be noted in the above examples that smoother weights w(x) give rise
to kernels K (s) that tend to 0 rapidly as |t| → ∞. Further useful kernels can
be constructed as linear combinations of the above kernels.
Since the Mellin transform is a Fourier transform with altered variables, all
results pertaining to Fourier transforms can be reformulated in terms of Mellin
transforms. Particularly useful is Plancherel’s identity, which asserts that if
$f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $\|f\|_2 = \|\hat{f}\|_2$. This is the analogue for Fourier transforms
of Parseval’s identity for Fourier series, which asserts that $\sum_k |\hat{f}(k)|^2 = \|f\|_2^2$.
By the changes of variables we noted before, we obtain

Theorem 5.4 (Plancherel’s identity) Suppose that $\int_0^{\infty} |w(x)|x^{-\sigma-1}\,dx < \infty$,
and also that $\int_0^{\infty} |w(x)|^2 x^{-2\sigma-1}\,dx < \infty$. Put $K(s) = \int_0^{\infty} w(x)x^{-s-1}\,dx$. Then
$$2\pi\int_0^{\infty} |w(x)|^2 x^{-2\sigma-1}\,dx = \int_{-\infty}^{+\infty} |K(\sigma + it)|^2\,dt.$$

Among the many possible applications of this theorem, we note in particular
that
$$2\pi\int_0^{\infty} |A(x)|^2 x^{-2\sigma-1}\,dx = \int_{-\infty}^{+\infty} \bigg|\frac{\alpha(\sigma + it)}{\sigma + it}\bigg|^2 dt \tag{5.26}$$

for σ > max(0, σc ).
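As a sanity check on (5.26), both sides can be computed for a Dirichlet polynomial with real coefficients. In this sketch the left side is evaluated exactly, since $A(x)$ is a step function, while the right side is approximated by a Simpson rule on $[-T, T]$; the truncation at $T$ loses a tail of size $O(1/T)$. The names `lhs_5_26` and `rhs_5_26` and the numerical parameters are assumptions made for illustration.

```python
import cmath
import math

def lhs_5_26(a, sigma):
    """2*pi * integral_0^inf |A(x)|^2 x^(-2*sigma-1) dx for a finite list a = [a_1, ..., a_N]."""
    N = len(a)
    total = 0.0
    partial_sum = 0.0
    for n in range(1, N + 1):
        partial_sum += a[n - 1]
        # A(x) is constant on [n, n+1); the last piece runs from N to infinity
        upper_term = 0.0 if n == N else (n + 1) ** (-2 * sigma)
        total += abs(partial_sum) ** 2 * (n ** (-2 * sigma) - upper_term) / (2 * sigma)
    return 2 * math.pi * total

def rhs_5_26(a, sigma, T=1000.0, steps=100000):
    """Simpson approximation to integral_{-T}^{T} |alpha(sigma+it)/(sigma+it)|^2 dt."""
    def integrand(t):
        s = complex(sigma, t)
        alpha = sum(a_n * cmath.exp(-s * math.log(n)) for n, a_n in enumerate(a, start=1))
        return abs(alpha / s) ** 2

    h = T / steps
    total = integrand(0.0) + integrand(T)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    # For real a_n the integrand is even in t, so double the integral over [0, T]
    return 2 * total * h / 3
```

With `a = [1]` and `sigma = 1.0` both sides are approximately $\pi$, which is the classical evaluation $\int_{-\infty}^{+\infty} dt/(1 + t^2) = \pi$.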


5.1.1 Exercises
1. Show that if $\sigma_c < \sigma_0 < 0$, then
$$\lim_{T \to \infty} \frac{1}{2\pi i}\int_{\sigma_0 - iT}^{\sigma_0 + iT} \alpha(s)\frac{x^s}{s}\,ds = -{\sum_{n > x}}' a_n.$$
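2. (a) Show that if $y \ge 0$, then
$$-\frac{\pi}{2} = \operatorname{si}(0) \le \operatorname{si}(y) \le \operatorname{si}(\pi) = 0.28114\ldots.$$
(b) Show that if $y \ge 0$, then
$$\int_y^{\infty} \frac{e^{iu}}{u}\,du = \int_y^{y + i\infty} \frac{e^{iz}}{z}\,dz.$$
(c) Deduce that if $y \ge 0$, then $|\operatorname{si}(y)| < 1/y$.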
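3. (a) Let $\beta > 0$ be fixed. Show that if $\sigma_0 > 0$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Gamma(s/\beta)y^{s}\,ds = \beta e^{-y^{-\beta}}.$$
(b) Let $\beta > 0$ be fixed. Show that if $x > 0$ and $\sigma_0 > \max(0, \sigma_c)$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)\Gamma(s/\beta)x^{s}\,ds = \beta\sum_{n=1}^{\infty} a_n e^{-(n/x)^{\beta}}.$$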

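4. (a) Suppose that $a > 0$ and that $b$ is real. Explain why
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} e^{a^2 s^2/2 + bs}\,ds = \frac{e^{-b^2/(2a^2)}}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} e^{a^2 (s + b/a^2)^2/2}\,ds.$$
(b) Explain why the values of the integrals above are independent of the
value of $\sigma_0$. Hence show that if $\sigma_0 = -b/a^2$, then the above is
$$= \frac{e^{-b^2/(2a^2)}}{2\pi}\int_{-\infty}^{+\infty} e^{-a^2 t^2/2}\,dt = \frac{1}{\sqrt{2\pi}\,a}e^{-b^2/(2a^2)}.$$
(c) Show that if $a > 0$, $x > 0$ and $\sigma_0 > \sigma_c$, then
$$\frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s)e^{a^2 s^2/2}x^{s}\,ds = \frac{1}{\sqrt{2\pi}\,a}\sum_{n=1}^{\infty} a_n \exp\Big(-\frac{(\log x/n)^2}{2a^2}\Big).$$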

5. Take $k = 1$ in (5.22) for several different values of $x$, and form a suitable
linear combination, to show that if $x \ge 0$ and $\sigma_c < 0$, then
$$\frac{2}{\pi}\int_{-\infty}^{+\infty} \alpha(it)\bigg(\frac{\sin\frac{1}{2}t\log x}{t}\bigg)^2 dt = \sum_{n \le x} a_n \log x/n.$$

6. Let w(x) , and suppose that w(x) x σ as

∞x → ∞−σ
for some ﬁxed σ .
Let σw be the inﬁmum of those σ such that 0 w(x)x −1 d x < ∞, and
put
∞
K (s) = w(x)x −s−1 d x
0

for σ > σw .

(a) Show that $A_w(x) = \sum_{n=1}^{\infty} a_n w(x/n)$ satisfies $A_w(x) \ll x^{\theta}$ for $\theta >
\max(\sigma_w, \sigma_c)$.
(b) Show that
$$K(s)\alpha(s) = \int_0^{\infty} A_w(x)x^{-s-1}\,dx$$
for $\sigma > \max(\sigma_w, \sigma_c)$.

for σ0 > max(σw , σc ), x > 0.

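7. Show that
$$\zeta(s) = -s\int_0^{\infty} \frac{\{x\}}{x^{s+1}}\,dx$$
for $0 < \sigma < 1$, and that
$$2\pi\int_0^{\infty} \{x\}^2 x^{-2\sigma-1}\,dx = \int_{-\infty}^{+\infty} \bigg|\frac{\zeta(\sigma + it)}{\sigma + it}\bigg|^2 dt$$
for $0 < \sigma < 1$.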
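8. (a) Show that if $f \in L^1(\mathbb{R})$ and $f' \in L^1(\mathbb{R})$, then $\widehat{f'}(t) = 2\pi i t\hat{f}(t)$.
(b) Suppose that $f$ is a function such that $f \in L^1(\mathbb{R})$, that $x f(x) \in L^2(\mathbb{R})$,
and that $f' \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. Show that
$$\int_{-\infty}^{+\infty} |f(x)|^2\,dx = -\int_{-\infty}^{+\infty} x\Big(f(x)\overline{f'(x)} + f'(x)\overline{f(x)}\Big)\,dx.$$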

The Cauchy–Schwarz inequality asserts that
$$\bigg|\int_{-\infty}^{+\infty} a(x)b(x)\,dx\bigg|^2 \le \bigg(\int_{-\infty}^{+\infty} |a(x)|^2\,dx\bigg)\bigg(\int_{-\infty}^{+\infty} |b(x)|^2\,dx\bigg).$$
By means of this inequality, or otherwise, show that
$$\bigg(\int_{-\infty}^{+\infty} |x f(x)|^2\,dx\bigg)\bigg(\int_{-\infty}^{+\infty} |t\hat{f}(t)|^2\,dt\bigg) \ge \frac{1}{16\pi^2}\bigg(\int_{-\infty}^{+\infty} |f(x)|^2\,dx\bigg)^2.$$
This is a form of the Heisenberg uncertainty principle. From it we see that
if $f$ tends to 0 rapidly outside $[-A, A]$, and if $\hat{f}$ tends to 0 rapidly outside
$[-B, B]$, then $AB \gg 1$.
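9. (a) Note the identity
$$f\overline{g} = \tfrac{1}{4}|f + g|^2 - \tfrac{1}{4}|f - g|^2 + \tfrac{i}{4}|f + ig|^2 - \tfrac{i}{4}|f - ig|^2.$$
(b) Show that if $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and if $g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then
$$\int_{-\infty}^{+\infty} f(x)\overline{g(x)}\,dx = \int_{-\infty}^{+\infty} \hat{f}(t)\overline{\hat{g}(t)}\,dt.$$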

10. Suppose that F is strictly increasing, and that for i = 1, 2 the functions f i
are real-valued with f i ∈ L 1 (R) ∩ L 2 (R) and F( f i ) ∈ L 1 (R) ∩ L 2 (R).
(a) Show that
+∞
( f 1 (x) − f 2 (x))(F( f 1 (x)) − F( f 2 (x))) d x
−∞
+∞
= f*1 (t) − f*2 (t) F(
f 1 )(t) − F( f 2 )(t) dt.
−∞

(b) Suppose additionally that *

f i (t) = 0 for |t| ≥ T , and that F( f 1 )(t) =

F( f 2 )(t) for −T ≤ t ≤ T . Show that f 1 = f 2 a.e.

5.2 Summability

We say that an infinite series $\sum a_n$ is Abel summable to $a$, and write $\sum a_n = a$
(A) if
$$\lim_{r \to 1^-} \sum_{n=0}^{\infty} a_n r^n = a.$$

Abel proved that if a series converges, then it is A-summable to the same value.
Because of this historical antecedent, we call a theorem ‘Abelian’ if it states
that one kind of summability implies another. Perhaps the simplest Abelian
theorem asserts that if $\sum_{n=1}^{\infty} a_n$ converges to $a$, then
$$\lim_{N \to \infty} \sum_{n=1}^{N} \Big(1 - \frac{n}{N}\Big)a_n = a. \tag{5.27}$$
This is the Cesàro method of summability of order 1, and so we abbreviate the
relation above as $\sum a_n = a$ (C, 1). On putting $s_N = \sum_{n=1}^{N} a_n$, we reformulate
the above by saying that if $\lim_{N \to \infty} s_N = a$, then
$$\lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} s_n = a. \tag{5.28}$$

Here, as in Abel summability and in most other summabilities, each term in
the second limit is a linear function of the terms in the first limit. Following
Toeplitz and Schur, we characterize those linear transformations $T = [t_{mn}]$ that
preserve limits of sequences. We call $T$ regular if the following three conditions
are satisfied:
$$\text{There is a } C = C(T) \text{ such that } \sum_{n=1}^{\infty} |t_{mn}| \le C \text{ for all } m; \tag{5.29}$$
$$\lim_{m \to \infty} t_{mn} = 0 \text{ for all } n; \tag{5.30}$$
$$\lim_{m \to \infty} \sum_{n=1}^{\infty} t_{mn} = 1. \tag{5.31}$$

We now show that regular transformations preserve limits, and relegate the
veriﬁcation of the converse to exercises.

Theorem 5.5 Suppose that $T$ satisfies (5.29) above. If $\{a_n\}$ is a bounded
sequence, then the sequence
$$b_m = \sum_{n=1}^{\infty} t_{mn}a_n \tag{5.32}$$
is also bounded. If $T$ satisfies (5.29) and (5.30), and if $\lim_{n \to \infty} a_n = 0$,
then $\lim_{m \to \infty} b_m = 0$. Finally, if $T$ is regular and $\lim_{n \to \infty} a_n = a$, then
$\lim_{m \to \infty} b_m = a$.

The important special case (5.28) is obtained by noting that the (semi-infinite)
matrix $[t_{mn}]$ with
$$t_{mn} = \begin{cases} 1/m & \text{if } 1 \le n \le m,\\ 0 & \text{if } n > m \end{cases}$$

is regular. Moreover, the proof of Theorem 5.5 requires only a straightforward

elaboration of the usual proof of (5.28).
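The special case (5.28) can be tried out on a concrete sequence. The following sketch builds one row of the (C, 1) matrix and applies it to a convergent sequence; the helper names and the sample sequence $a_n = 1 + (-1)^n/\sqrt{n}$ are illustrative assumptions.

```python
def cesaro_row(m):
    """Row m of the (C, 1) matrix: t_mn = 1/m for 1 <= n <= m (entries with n > m are 0)."""
    return [1.0 / m] * m

def transform(row, a):
    """b_m = sum_n t_mn a_n, where a = [a_1, a_2, ...] is at least as long as the row."""
    return sum(t * a_n for t, a_n in zip(row, a))

# A sample sequence with limit 1: a_n = 1 + (-1)^n / sqrt(n)
sample = [1 + (-1) ** n / n ** 0.5 for n in range(1, 100001)]
b_large = transform(cesaro_row(100000), sample)
```

Each row has non-negative entries summing to 1, so (5.29) holds with $C = 1$ and (5.31) holds trivially, while $t_{mn} = 1/m \to 0$ gives (5.30). Accordingly `b_large` is close to the limit 1 of the sample sequence.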

Proof If $|a_n| \le A$ and (5.29) holds, then
$$|b_m| \le \sum_{n=1}^{\infty} |t_{mn}a_n| \le A\sum_{n=1}^{\infty} |t_{mn}| \le CA.$$
To establish the second assertion, suppose that $\varepsilon > 0$ and that $|a_n| < \varepsilon$ for
$n > N = N(\varepsilon)$. Now
$$|b_m| \le \sum_{n=1}^{N} |t_{mn}a_n| + \sum_{n > N} |t_{mn}a_n| = \Sigma_1 + \Sigma_2,$$
say. From (5.29) and the argument above with $A = \varepsilon$ we see that $\Sigma_2 \le C\varepsilon$.
From (5.30) we see that $\lim_{m \to \infty} \Sigma_1 = 0$. Hence $\limsup_{m \to \infty} |b_m| \le C\varepsilon$, and
we have the desired conclusion since $\varepsilon$ is arbitrary. Finally, suppose that $T$ is
regular and that $\lim_{n \to \infty} a_n = a$. We write $a_n = a + \alpha_n$, so that
$$b_m = a\sum_{n=1}^{\infty} t_{mn} + \sum_{n=1}^{\infty} t_{mn}\alpha_n.$$
Since $\lim_{n \to \infty} \alpha_n = 0$, we may appeal to the preceding case to see that
the second sum tends to 0 as $m \to \infty$. Hence by (5.31) we conclude that
$\lim_{m \to \infty} b_m = a$, and the proof is complete.

In Chapter 1 we used Theorem 1.1 to show that if $S$ is a sector of the
form $S = \{s : \sigma > \sigma_0, |t - t_0| \le H(\sigma - \sigma_0)\}$ where $H$ is an arbitrary positive
constant, and if the Dirichlet series $\alpha(s)$ converges at the point $s_0$, then
$$\lim_{\substack{s \to s_0\\ s \in S}} \alpha(s) = \alpha(s_0).$$
To see how this may also be derived from Theorem 5.5, let $\{s_m\}$ be an arbitrary
sequence of points of $S$ for which $\lim_{m \to \infty} s_m = s_0$. It suffices to show that
$\lim_{m \to \infty} \alpha(s_m) = \alpha(s_0)$. Take
$$t_{mn} = n^{s_0 - s_m} - (n + 1)^{s_0 - s_m},$$
so that
$$\alpha(s_m) = \sum_{n=1}^{\infty} t_{mn}\sum_{k=1}^{n} a_k k^{-s_0}.$$
In view of Theorem 5.5, it suffices to show that $[t_{mn}]$ is regular. The conditions
(5.30) and (5.31) are clearly satisfied, and (5.29) follows on observing that if
$s \in S$, then $|s - s_0| \ll_H \sigma - \sigma_0$, so that
$$\big|n^{s_0 - s} - (n + 1)^{s_0 - s}\big| = \bigg|(s - s_0)\int_n^{n+1} u^{s_0 - s - 1}\,du\bigg| \ll_H (\sigma - \sigma_0)\int_n^{n+1} u^{\sigma_0 - \sigma - 1}\,du = n^{\sigma_0 - \sigma} - (n + 1)^{\sigma_0 - \sigma}.$$
Thus we have the result. Abel’s analogous theorem on the convergence of power
series can be derived similarly from Theorem 5.5.

The converse of Abel’s theorem on power series is false, but Tauber (1897)
proved a partial converse: If $a_n = o(1/n)$ and $\sum a_n = a$ (A), then $\sum a_n = a$.
Following Hardy and Littlewood, we call a theorem ‘Tauberian’ if it provides
a partial converse of an Abelian theorem. The qualifying hypothesis (‘$a_n =
o(1/n)$’ in the above) is the ‘Tauberian hypothesis’. For simplicity we begin
with partial converses of (5.27).

Theorem 5.6 If $\sum_{n=1}^{\infty} a_n = a$ (C, 1), then $\sum a_n = a$ provided that one of the
following hypotheses holds:
(a) $a_n \ge 0$ for $n \ge 1$;
(b) $a_n = O(1/n)$ for $n \ge 1$;
(c) There is a constant $A$ such that $a_n \ge -A/n$ for all $n \ge 1$.
Proof Clearly (a) implies (c). If (b) holds, then both $a_n$ and $-a_n$ satisfy (c).
Thus it suffices to prove that $\sum a_n = a$ when (c) holds. We observe that if $H$
is a positive integer, then
$$\sum_{n=1}^{N} a_n = \frac{N + H}{H}\sum_{n=1}^{N+H} a_n\Big(1 - \frac{n}{N + H}\Big) - \frac{N}{H}\sum_{n=1}^{N} a_n\Big(1 - \frac{n}{N}\Big) - \frac{1}{H}\sum_{N < n < N + H} a_n(N + H - n) = T_1 - T_2 - T_3, \tag{5.33}$$
say. Take $H = [\varepsilon N]$ for some $\varepsilon > 0$. By hypothesis, $\lim_{N \to \infty} T_1 = a(1 + \varepsilon)/\varepsilon$,
and $\lim_{N \to \infty} T_2 = a/\varepsilon$. From (c) we see that
$$T_3 \ge -A\sum_{N < n < N + H} \frac{1}{n} \ge -\frac{AH}{N} \ge -A\varepsilon.$$
Hence on combining these estimates in (5.33) we see that
$$\limsup_{N \to \infty} \sum_{n=1}^{N} a_n \le a + A\varepsilon.$$
Since $\varepsilon$ can be taken arbitrarily small, it follows that
$$\limsup_{N \to \infty} \sum_{n=1}^{N} a_n \le a.$$

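To obtain a corresponding lower bound we note that
$$\sum_{n=1}^{N} a_n = \frac{N}{H}\sum_{n=1}^{N} a_n\Big(1 - \frac{n}{N}\Big) - \frac{N - H}{H}\sum_{n=1}^{N-H} a_n\Big(1 - \frac{n}{N - H}\Big) + \frac{1}{H}\sum_{N - H < n \le N} a_n(n + H - N). \tag{5.34}$$
Arguing as we did before, we find that
$$\liminf_{N \to \infty} \sum_{n=1}^{N} a_n \ge a - A\varepsilon/(1 - \varepsilon),$$
so that
$$\liminf_{N \to \infty} \sum_{n=1}^{N} a_n \ge a,$$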

and the proof is complete.
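The decomposition (5.33) used in this proof is a purely algebraic identity, valid for any coefficients, so it can be verified in exact arithmetic. The sketch below is illustrative: the function names are hypothetical and the list `a` stands for $a_1, a_2, \ldots$ with at least $N + H$ entries.

```python
from fractions import Fraction

def fejer_weighted_sum(a, M):
    """sum_{n=1}^{M} a_n (1 - n/M), the weighted sum appearing in (5.27)."""
    return sum(a[n - 1] * (1 - Fraction(n, M)) for n in range(1, M + 1))

def decomposition_5_33(a, N, H):
    """Return (T1, T2, T3) from (5.33); their combination T1 - T2 - T3 equals a_1 + ... + a_N."""
    T1 = Fraction(N + H, H) * fejer_weighted_sum(a, N + H)
    T2 = Fraction(N, H) * fejer_weighted_sum(a, N)
    T3 = Fraction(1, H) * sum(a[n - 1] * (N + H - n) for n in range(N + 1, N + H))
    return T1, T2, T3
```

For any test sequence, `T1 - T2 - T3` reproduces the partial sum `sum(a[:N])` exactly. The coefficient of $a_n$ is $1$ for $n \le N$ and $0$ for $N < n \le N + H$, which is the cancellation the proof relies on.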

If we had argued from (a) or (b), then the treatment of the term $T_3$ above
would have been simpler, since from (a) it follows that $T_3 \ge 0$, while from
(b) we have $T_3 \ll \varepsilon$.
Our next objective is to generalize and strengthen Theorem 5.6. The type of
generalization we have in mind is exhibited in the following result, which can
be established by adapting the above proof: Let $\beta$ be fixed, $\beta \ge 0$. If
$$\sum_{n=1}^{N} a_n\Big(1 - \frac{n}{N}\Big) = (a + o(1))N^{\beta},$$
and if $a_n \ge -An^{\beta - 1}$, then
$$\sum_{n=1}^{N} a_n = (a(\beta + 1) + o(1))N^{\beta}.$$

Concerning the possibility of strengthening Theorem 5.6, we note that by an
Abelian argument (or by an application of Theorem 5.5) it may be shown that
$\sum a_n = a$ (C, 1) implies that $\sum a_n = a$ (A). Thus if we replace (C, 1) by (A)
in Theorem 5.6, then we have weakened the hypothesis, and the result would
therefore be stronger. Indeed, Hardy (1910) conjectured and Littlewood (1911)
proved that if $\sum a_n = a$ (A) and $a_n = O(1/n)$, then $\sum a_n = a$. That is, the
condition ‘$a_n = o(1/n)$’ in Tauber’s theorem can be replaced by the condition
(b) above. In fact the still weaker condition (c) suffices, as will be seen by
taking $\beta = 0$ in Corollary 5.9 below. We now formulate a general result for the
Laplace transform, from which the analogues for power series and Dirichlet
series follow easily.

Theorem 5.7 (Hardy–Littlewood) Suppose that $a(u)$ is Riemann-integrable
over $[0, U]$ for every $U > 0$, and that the integral
\[
I(\delta) = \int_0^{\infty} a(u)e^{-u\delta}\,du
\]
converges for every $\delta > 0$. Let $\beta$ be fixed, $\beta \ge 0$, and suppose that
\[
I(\delta) = (\alpha + o(1))\delta^{-\beta} \tag{5.35}
\]
as $\delta \to 0^+$. If, moreover, there is a constant $A \ge 0$ such that
\[
a(u) \ge -A(u+1)^{\beta-1} \tag{5.36}
\]
for all $u \ge 0$, then
\[
\int_0^{U} a(u)\,du = \Bigl(\frac{\alpha}{\Gamma(\beta+1)} + o(1)\Bigr)U^{\beta}. \tag{5.37}
\]
The basic properties of the gamma function are developed in Appendix C,
but for our present purposes it suffices to put
\[
\Gamma(\beta) = \int_0^{\infty} u^{\beta-1}e^{-u}\,du
\]
for $\beta > 0$. From this it follows by integration by parts that
\[
\beta\,\Gamma(\beta) = \Gamma(\beta+1) \tag{5.38}
\]
when $\beta > 0$.
The amount of unsmoothing required in deriving (5.37) from (5.35) is now
much greater than it was in the proof of Theorem 5.6. Nevertheless we follow
the same line of attack. To obtain the proper perspective we review the preceding
proof. Let $J = [0, 1]$, let $\chi_J(u)$ be its characteristic function, and put
$K(u) = \max(0, 1-u)$ for $u \ge 0$. Thus $\sum_{n=1}^{N} a_n = \sum_n a_n \chi_J(n/N)$, and
$\sum_{n=1}^{N} a_n(1 - n/N) = \sum_n a_n K(n/N)$. Our strategy was to approximate to $\chi_J(u)$ by linear
combinations of $K(\kappa u)$ for various values of $\kappa$, $\kappa > 0$. The relation underlying
(5.33) and (5.34) is both simple and explicit:
\[
\frac{1}{\varepsilon}\bigl(K(u) - (1-\varepsilon)K(u/(1-\varepsilon))\bigr)
  \le \chi_J(u) \le
\frac{1}{\varepsilon}\bigl((1+\varepsilon)K(u/(1+\varepsilon)) - K(u)\bigr); \tag{5.39}
\]
we took $\varepsilon = H/N$. In the present situation we wish to approximate to $\chi_J(u)$ by
linear combinations of $e^{-\kappa u}$, $\kappa > 0$. We make the change of variable $x = e^{-u}$,
so that $0 \le x \le 1$, and we put $J = [1/e, 1]$. Then we want to approximate to
$\chi_J(x)$ by a linear combination $P(x)$ of the functions $x^{\kappa}$, $\kappa > 0$. In fact it suffices
to use only integral values of $\kappa$, so that $P(x)$ is a polynomial that vanishes at
the origin. In place of (5.33), (5.34) and (5.39) we shall substitute

Lemma 5.8 Let $\varepsilon$ be given, $0 < \varepsilon < 1/4$, and put $J = [1/e, 1]$,
$K = [e^{-1-\varepsilon}, e^{-1+\varepsilon}]$. There exist polynomials $P_{\pm}(x)$ such that for $0 \le x \le 1$ we have
\[
P_-(x) \le \chi_J(x) \le P_+(x) \tag{5.40}
\]
and
\[
|P_{\pm}(x) - \chi_J(x)| \le \varepsilon x(1-x) + 5\chi_K(x). \tag{5.41}
\]

Proof Let $g(x) = (\chi_J(x) - x)/(x(1-x))$. Then $g$ is continuous in $[0, 1]$
apart from a jump discontinuity at $x = 1/e$ of height $e^2/(e-1) < 5$. Hence
by Weierstrass’s theorem on the uniform approximation of continuous functions
by polynomials we see that there are polynomials $Q_{\pm}(x)$ such that
$Q_-(x) \le g(x) \le Q_+(x)$ for $0 \le x \le 1$, and for which
\[
|g(x) - Q_{\pm}(x)| \le \varepsilon + 5\chi_K(x) \tag{5.42}
\]
for $0 \le x \le 1$. Then the polynomials $P_{\pm}(x) = x + x(1-x)Q_{\pm}(x)$ have the
desired properties.
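
Two of the elementary facts used above lend themselves to a quick numerical check: the sandwich inequality (5.39) for the kernel $K(u) = \max(0, 1-u)$, and the size of the jump of $g$ at $x = 1/e$. The following is only a sketch on a finite grid under assumed parameters ($\varepsilon = 0.1$, a step of $10^{-3}$, and the helper names are all illustrative choices), so it illustrates the statements rather than proving them.

```python
import math


def K(u):
    """Cesaro kernel K(u) = max(0, 1 - u)."""
    return max(0.0, 1.0 - u)


def chi_unit(u):
    """Characteristic function of [0, 1]."""
    return 1.0 if 0.0 <= u <= 1.0 else 0.0


def sandwich(u, eps):
    """Lower and upper members of (5.39) at the point u."""
    lower = (K(u) - (1 - eps) * K(u / (1 - eps))) / eps
    upper = ((1 + eps) * K(u / (1 + eps)) - K(u)) / eps
    return lower, upper


def g(x):
    """g(x) = (chi_J(x) - x) / (x (1 - x)) with J = [1/e, 1], for 0 < x < 1."""
    chi = 1.0 if x >= 1 / math.e else 0.0
    return (chi - x) / (x * (1 - x))


eps, tol = 0.1, 1e-9
grid = [i / 1000 for i in range(2001)]  # u in [0, 2]
holds = all(
    sandwich(u, eps)[0] <= chi_unit(u) + tol and chi_unit(u) <= sandwich(u, eps)[1] + tol
    for u in grid
)

exact_jump = math.e ** 2 / (math.e - 1)
numeric_jump = g(1 / math.e + 1e-7) - g(1 / math.e - 1e-7)
print(holds, exact_jump, numeric_jump)
```

The jump is about 4.3, which is why the constant 5 in (5.41) and (5.42) leaves room for the approximation error near $x = 1/e$.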

Proof of Theorem 5.7 We suppose first that $\alpha = 0$. We note that if $P(x)$ is a
polynomial such that $P(0) = 0$, say $P(x) = \sum_{r=1}^{R} c_r x^r$, then by (5.35) we see
that
\[
\int_0^{\infty} a(u)P(e^{-u\delta})\,du = \sum_{r=1}^{R} c_r I(r\delta) = o(\delta^{-\beta}) \tag{5.43}
\]
as $\delta \to 0^+$. In the notation of the above lemma,
\[
\int_0^{U} a(u)\,du = \int_0^{\infty} a(u)\chi_J(e^{-u/U})\,du.
\]
If (5.40) holds, then by (5.36) we see that
\[
\int_0^{\infty} a(u)\bigl(P_+(e^{-u/U}) - \chi_J(e^{-u/U})\bigr)\,du
  \ge -A\int_0^{\infty} (u+1)^{\beta-1}\bigl(P_+(e^{-u/U}) - \chi_J(e^{-u/U})\bigr)\,du.
\]
By (5.41) this latter integral is
\[
\ll \varepsilon\int_0^{\infty} (u+1)^{\beta-1}e^{-u/U}(1 - e^{-u/U})\,du
  + \int_{(1-\varepsilon)U}^{(1+\varepsilon)U} (u+1)^{\beta-1}\,du.
\]
In the first term, the integrand is $\ll (u+1)^{\beta}U^{-1}$ for $0 \le u \le U$; it is
$\ll u^{\beta-1}e^{-u/U}$ for $u \ge U$. Hence the first integral is $\ll U^{\beta}$. The second integral is
$\ll \varepsilon U^{\beta}$. On taking $\delta = 1/U$, $P = P_+$ in (5.43) and combining our results, we
find that
\[
\int_0^{U} a(u)\,du \le A_1\varepsilon U^{\beta} + o(U^{\beta}).
\]
Since $\varepsilon$ can be arbitrarily small, we deduce that
\[
\limsup_{U\to\infty} U^{-\beta}\int_0^{U} a(u)\,du \le 0.
\]
By arguing similarly with $P_-$ instead of $P_+$, we see that the corresponding
liminf is $\ge 0$, and so we have (5.37) in the case $\alpha = 0$.
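Suppose now that $\alpha \ne 0$, $\beta > 0$. We note first that
\[
\int_0^{\infty} (u+1)^{\beta-1}e^{-u\delta}\,du
  = e^{\delta}\int_1^{\infty} v^{\beta-1}e^{-v\delta}\,dv
  = e^{\delta}\int_0^{\infty} v^{\beta-1}e^{-v\delta}\,dv + O(e^{\delta}),
\]
and that
\[
\int_0^{\infty} v^{\beta-1}e^{-v\delta}\,dv = \delta^{-\beta}\int_0^{\infty} w^{\beta-1}e^{-w}\,dw = \delta^{-\beta}\Gamma(\beta).
\]
Hence if $b(u) = a(u) - \alpha(u+1)^{\beta-1}/\Gamma(\beta)$, then $b(u) \ge -B(u+1)^{\beta-1}$, and
\[
\int_0^{\infty} b(u)e^{-u\delta}\,du = o(\delta^{-\beta}).
\]
Thus $\int_0^{U} b(u)\,du = o(U^{\beta})$, so that
\[
\int_0^{U} a(u)\,du = \frac{\alpha}{\beta\,\Gamma(\beta)}U^{\beta} + o(U^{\beta}),
\]
and we have (5.37), in view of (5.38).
For the remaining case, $\beta = 0$, it suffices to consider $b(u) = a(u) - \alpha\chi_{[0,1]}(u)$.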
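
Corollary 5.9 Suppose that $p(z) = \sum_{n=0}^{\infty} a_n z^n$ converges for $|z| < 1$, and
that $\beta \ge 0$. If $p(x) = (\alpha + o(1))(1-x)^{-\beta}$ as $x \to 1^-$, and if $a_n \ge -An^{\beta-1}$
for $n \ge 1$, then
\[
\sum_{n=0}^{N} a_n = \Bigl(\frac{\alpha}{\Gamma(\beta+1)} + o(1)\Bigr)N^{\beta}.
\]

Proof Put $a(u) = a_n$ for $n \le u < n+1$. Then (5.36) holds, and
\[
I(\delta) = \sum_{n=0}^{\infty} a_n\int_n^{n+1} e^{-u\delta}\,du = \frac{1 - e^{-\delta}}{\delta}\,p(e^{-\delta}).
\]
But $1 - e^{-\delta} \sim \delta$ as $\delta \to 0^+$, so that (5.35) holds. The result now follows by
taking $U = N + 1$ in (5.37).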
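
As an illustration of Corollary 5.9, consider the assumed example $a_n = n + 1$, for which $p(x) = (1-x)^{-2}$, so that $\alpha = 1$ and $\beta = 2$ and the corollary predicts $\sum_{n \le N} a_n \sim N^2/\Gamma(3) = N^2/2$. The sketch below (function names are illustrative) compares the partial sums with this prediction.

```python
import math


def partial_sum(N):
    """Sum of a_n = n + 1 for 0 <= n <= N; here p(x) = (1 - x)**(-2)."""
    return sum(n + 1 for n in range(N + 1))


def predicted(N, alpha=1.0, beta=2.0):
    """Main term alpha / Gamma(beta + 1) * N**beta from Corollary 5.9."""
    return alpha / math.gamma(beta + 1) * N ** beta


for N in (10, 100, 1000, 10000):
    print(N, partial_sum(N) / predicted(N))
```

The ratio equals $(1 + 1/N)(1 + 2/N)$ exactly in this example, so it tends to 1 as the corollary asserts.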

Corollary 5.10 If $\sum a_n = \alpha$ (A), and if the sequence $s_N = \sum_{n=0}^{N} a_n$ is
bounded, then $\sum a_n = \alpha$ (C, 1).

Proof Take $\beta = 1$, $p(z) = \sum_{n=0}^{\infty} s_n z^n = (1-z)^{-1}\sum_{n=0}^{\infty} a_n z^n$ in Corollary
5.9. Then $\sum_{n=0}^{N} s_n = (\alpha + o(1))N$, which is the desired result.

For Dirichlet series we have similarly

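Theorem 5.11 Suppose that $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ converges for $\sigma > 1$, and
that $\beta \ge 0$. If $\alpha(\sigma) = (\alpha + o(1))(\sigma - 1)^{-\beta}$ as $\sigma \to 1^+$, and if
$a_n \ge -A(1 + \log n)^{\beta-1}$, then
\[
\sum_{n=1}^{N} \frac{a_n}{n} = \Bigl(\frac{\alpha}{\Gamma(\beta+1)} + o(1)\Bigr)(\log N)^{\beta}.
\]

Proof Take $a(u) = \sum_{u-1 \le \log n < u} a_n/n$. Then $I(\delta)$ converges for $\delta > 0$, and
moreover
\[
I(\delta) = \sum_{n=1}^{\infty} \frac{a_n}{n}\int_{\log n}^{1+\log n} e^{-u\delta}\,du
  = \frac{1 - e^{-\delta}}{\delta}\,\alpha(1+\delta),
\]
so that (5.37) follows. To obtain the desired conclusion we require a further
appeal to our Tauberian hypothesis. We note that
\[
\int_0^{\log N} a(u)\,du = \sum_{n \le N} \frac{a_n}{n} - \sum_{N/e < n \le N} \frac{a_n}{n}\log\frac{ne}{N}.
\]
By our Tauberian hypothesis this is
\[
\le \sum_{n \le N} \frac{a_n}{n} + A_1(\log N)^{\beta-1},
\]
so that
\[
\sum_{n \le N} \frac{a_n}{n} \ge \Bigl(\frac{\alpha}{\Gamma(\beta+1)} + o(1)\Bigr)(\log N)^{\beta} - A_1(\log N)^{\beta-1}.
\]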

On taking U = 1 + log N in (5.37) we may derive a corresponding upper bound

to complete the proof.
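
The simplest instance of Theorem 5.11 is the assumed example $a_n = 1$, for which $\alpha(s) = \zeta(s) \sim 1/(s-1)$, so that $\alpha = 1$ and $\beta = 1$ and the theorem gives $\sum_{n \le N} 1/n \sim \log N$. The sketch below (the function name is an illustrative choice) shows how slowly the ratio approaches 1, which is typical of results on the logarithmic scale.

```python
import math


def harmonic_ratio(N):
    """Ratio of sum_{n <= N} 1/n to the main term (log N)**1 / Gamma(2)."""
    total = sum(1.0 / n for n in range(1, N + 1))
    return total / (math.log(N) / math.gamma(2.0))


for N in (10 ** 2, 10 ** 4, 10 ** 6):
    print(N, harmonic_ratio(N))
```

The deviation from 1 is of size about $0.577/\log N$ here, consistent with the lower-order term $A_1(\log N)^{\beta-1}$ that appears in the proof.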

The qualitative arguments we have given can be put in quantitative form as

the need arises. For example, it is easy to see that if

\[
\sum_{n=1}^{N} a_n = N + O\bigl(\sqrt{N}\bigr), \tag{5.44}
\]
then
\[
\sum_{n=1}^{N} a_n(N - n) = \frac{1}{2}N^2 + O\bigl(N^{3/2}\bigr). \tag{5.45}
\]
This is best possible (take $a_n = 1 + n^{-1/2}$), but if the error term is oscillatory,
then smoothing may reduce its size (consider $a_n = \cos\sqrt{n}$). Conversely if
(5.45) holds and if the sequence $a_n$ is bounded, then the method used to prove
Theorem 5.6 can be used to show that
\[
\sum_{n=1}^{N} a_n = N + O\bigl(N^{3/4}\bigr). \tag{5.46}
\]
This conclusion, though it falls short of (5.44), is best possible (take
$a_n = 1 + \cos n^{1/4}$). We can also put Theorem 5.7 in quantitative form, but here
the loss in precision is much greater, and in general the importance of Theorem 5.7
and its corollaries lies in its versatility. For example, it can be
shown that if $\sum_{n=0}^{\infty} a_n r^n = (1-r)^{-1} + O(1)$ as $r \to 1^-$, and if $a_n = O(1)$,
then
\[
\sum_{n=0}^{N} a_n = N + O\Bigl(\frac{N}{\log N}\Bigr).
\]
This error term, though weak, is best possible (take $a_n = 1 + \cos(\log n)^2$).
For Dirichlet series it can be shown that if
\[
\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s} = \frac{1}{s-1} + O(1)
\]
as $s \to 1^+$, and if the sequence $a_n$ is bounded, then
\[
\sum_{n=1}^{N} \frac{a_n}{n} = \log N + O\Bigl(\frac{\log N}{\log\log N}\Bigr).
\]
This is also best possible (take $a_n = 1 + \cos(\log\log n)^2$), but we can obtain a
sharper result by strengthening our analytic hypothesis. For example, it can be
shown that if $\alpha(s)$ is analytic in a neighbourhood of 1 and if the sequence $a_n$ is
bounded, then
\[
\sum_{n=1}^{N} \frac{a_n}{n} = O(1).
\]
However, even this stronger assumption does not allow us to deduce that
\[
\sum_{n=1}^{N} a_n = o(N),
\]
as we see by considering $a_n = \cos\log n$. In Chapter 8 we shall encounter further

Tauberian theorems in which the above conclusion is derived from hypotheses
concerning the behaviour of α(s) throughout the half-plane σ ≥ 1.
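
The passage from (5.44) to (5.45) can be observed numerically for the extremal example $a_n = 1 + n^{-1/2}$ mentioned above. The sketch below is an illustration only; the function name and the choice $N = 10^4$ are assumptions. Comparing sums with integrals suggests that the unsmoothed error is about $2\sqrt{N}$ and the smoothed error about $\frac{4}{3}N^{3/2}$, so neither exponent can be lowered for this sequence.

```python
import math


def sharp_and_smoothed_errors(N):
    """Errors in (5.44) and (5.45) for the sequence a_n = 1 + n**(-1/2)."""
    a = [1.0 + n ** -0.5 for n in range(1, N + 1)]
    sharp_error = sum(a) - N
    smoothed_error = sum(a_n * (N - n) for n, a_n in enumerate(a, start=1)) - 0.5 * N * N
    return sharp_error, smoothed_error


N = 10 ** 4
sharp, smoothed = sharp_and_smoothed_errors(N)
print(sharp / math.sqrt(N), smoothed / N ** 1.5)
```

For an oscillatory sequence such as $a_n = \cos\sqrt{n}$ the same computation would show the smoothed error falling well below $N^{3/2}$, in line with the remark on smoothing.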

5.2.1 Exercises
1. Let $T$ be a regular matrix such that $t_{mn} \ge 0$ for all $m$, $n$. Show that if
$\lim_{n\to\infty} a_n = +\infty$, then $\lim_{m\to\infty} b_m = +\infty$.
2. Show that if $T = [t_{mn}]$ and $U = [u_{mn}]$ are regular matrices, then so is
$TU = V = [v_{mn}]$ where
\[
v_{mn} = \sum_{k=1}^{\infty} t_{mk}u_{kn}.
\]
3. Show that if $b = Ta$ and $\lim_{m\to\infty} b_m = a$ whenever $\lim_{n\to\infty} a_n = a$, then
$T$ is regular.
4. For $n = 0, 1, 2, \ldots$ let $t_n(x)$ be defined on $[0, 1)$, and suppose that the $t_n$
satisfy the following conditions:
(i) There is a constant $C$ such that if $x \in [0, 1)$, then $\sum_{n=0}^{\infty} |t_n(x)| \le C$.
(ii) For all $n$, $\lim_{x\to 1^-} t_n(x) = 0$.
(iii) $\lim_{x\to 1^-} \sum_{n=0}^{\infty} t_n(x) = 1$.
Show that if $\lim_{n\to\infty} a_n = a$ and if $b(x) = \sum_{n=0}^{\infty} a_n t_n(x)$, then
$\lim_{x\to 1^-} b(x) = a$.
5. (Kojima 1917) Suppose that the numbers $t_{mn}$ satisfy the following
conditions:
(i) There is a constant $C$ such that $\sum_{n=1}^{\infty} |t_{mn}| \le C$ for all $m$.
(ii) For all $n$, $\lim_{m\to\infty} t_{mn}$ exists.
(iii) $\lim_{m\to\infty} \sum_{n=1}^{\infty} t_{mn}$ exists.
Show that if $\lim_{n\to\infty} a_n$ exists and if $b_m = \sum_{n=1}^{\infty} t_{mn}a_n$, then $\lim_{m\to\infty} b_m$
exists.
6. For positive integers $n$ let $K_n(x)$ be a function defined on $[0, \infty)$ such that
(i) $\int_0^{\infty} K_n(x)\,dx \to 1$ as $n \to \infty$;
(ii) $\int_0^{\infty} |K_n(x)|\,dx \le C$ for all $n$;
(iii) $\lim_{n\to\infty} K_n(x) = 0$ uniformly for $0 \le x \le X$.
Suppose that $a(x)$ is a bounded function, and that $b_n = \int_0^{\infty} a(x)K_n(x)\,dx$.
Show that if $\lim_{x\to\infty} a(x) = a$, then $\lim_{n\to\infty} b_n = a$.
7. Let $r_m$ be a sequence of positive real numbers with $r_m \to 1^-$ as $m \to \infty$.
For $m \ge 1$, $n \ge 1$, put $t_{mn} = n r_m^{n-1}(1 - r_m)^2$.
(a) Show that $[t_{mn}]$ is regular.
(b) Show that if $a_n = \sum_{k=0}^{n-1} c_k(1 - k/n)$ and $b_m$ is defined by (5.32), then
$b_m = \sum_{k=0}^{\infty} c_k r_m^k$.
(c) Show that if $\sum c_n = c$ (C, 1), then $\sum c_n = c$ (A).
8. Suppose that $T = [t_{mn}]$ is given by
\[
t_{mn} =
\begin{cases}
0 & \text{if } n = 0,\\[2pt]
\dfrac{m!\,n}{m^{n+1}(m-n)!} & \text{if } m \ge n > 0,\\[6pt]
0 & \text{if } m < n.
\end{cases}
\]
(a) Show that
\[
\sum_{n=k}^{m} t_{mn} = \frac{m!}{m^{k}(m-k)!}
\]
for $1 \le k \le m$.
(b) Verify that $T$ is regular.
(c) Show that if $a_n = \sum_{k=0}^{n} x^k/k!$ for $n \ge 0$, then $b_m = (1 + x/m)^m$ for
$m \ge 1$.
9. (Mercer’s theorem) Suppose that
\[
b_m = \frac{1}{2}a_m + \frac{1}{2}\cdot\frac{a_1 + a_2 + \cdots + a_m}{m}
\]
for $m \ge 1$. Show that
\[
a_n = \frac{2n}{n+1}b_n - \frac{2}{n(n+1)}\sum_{m=1}^{n-1} m b_m.
\]
Conclude that $\lim_{n\to\infty} a_n = a$ if and only if $\lim_{m\to\infty} b_m = a$.

10. For a non-negative integer $k$ we say that $\sum a_n = a$ (C, $k$) if
\[
\lim_{x\to\infty} \sum_{n \le x} a_n\Bigl(1 - \frac{n}{x}\Bigr)^{k} = a.
\]
This is Cesàro summability of order $k$.
(a) Show that if $\sum a_n = a$ (C, $j$), then $\sum a_n = a$ (C, $k$) for all $k \ge j$.
(b) Show that if $\sum a_n = a$ (C, $k$) for some $k$, then $\sum a_n = a$ (A).

11. Show that if $\sum a_n = a$ (A), then $\lim_{s\to 0^+} \sum a_n n^{-s} = a$. (See Wintner 1943
for Tauberian converses.)

12. For a non-negative integer $k$ we say that $\sum a_n = a$ (R, $k$) if
\[
\lim_{x\to\infty} \sum_{n \le x} a_n\Bigl(1 - \frac{\log n}{\log x}\Bigr)^{k} = a.
\]
This is Riesz summability of order $k$.
(a) Show that if $\sum a_n = a$ (R, $j$), then $\sum a_n = a$ (R, $k$) for all $k \ge j$.
(b) Show that if $\sum a_n = a$ (R, $k$) for some $k$, then $\lim_{s\to 0^+} \alpha(s) = a$.
13. Put $t_{mn} = 0$ for $n > m$, set
\[
t_{mm} = \frac{m+1}{\log(m+1)}\bigl(\log(m+1) - \log m\bigr),
\]
while for $1 \le n < m$ put
\[
t_{mn} = \frac{n+1}{\log(m+1)}\bigl(-\log n + 2\log(n+1) - \log(n+2)\bigr).
\]
(a) Show that if
\[
a_n = \sum_{k=1}^{n} c_k\Bigl(1 - \frac{k}{n+1}\Bigr)
\]
for $n \ge 1$, then the $b_m$ given in (5.32) satisfies
\[
b_m = \sum_{k=1}^{m} c_k\Bigl(1 - \frac{\log k}{\log(m+1)}\Bigr).
\]
(b) Show that $t_{mn} \ge 0$ for all $m$, $n$.
(c) Show that
\[
\sum_{n=1}^{\infty} t_{mn} = 1 + \frac{\log 2}{\log(m+1)}.
\]
(d) Show that $\lim_{m\to\infty} t_{mn} = 0$.
(e) Conclude that if $\sum c_k = c$ (C, 1), then $\sum c_k = c$ (R, 1).

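14. Let $A(x) = \sum_{0 < n \le x} a_n$.
(a) Show that
\[
\sum_{n=1}^{N} a_n\Bigl(1 - \frac{n}{N}\Bigr) = \frac{1}{N}\int_0^{N} A(x)\,dx.
\]
(b) Show that
\[
\sum_{n=1}^{N} a_n\Bigl(1 - \frac{\log n}{\log N}\Bigr) = \frac{1}{\log N}\int_1^{N} \frac{A(x)}{x}\,dx.
\]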
(c) Suppose that t is a ﬁxed non-zero real number. By Corollary 1.15, or
otherwise, show that

N
−1−it n N −it log N
n 1− = + ζ (1 + it) + O .
n=1
N (1 − it)2 N
(d) Similarly, show that
\[
\sum_{n=1}^{N} n^{-1-it}\Bigl(1 - \frac{\log n}{\log N}\Bigr) = \zeta(1+it) + O\Bigl(\frac{1}{\log N}\Bigr).
\]
(e) Conclude that $\sum_{n=1}^{\infty} n^{-1-it}$ is not summable (C, 1), but that it is
summable (R, 1) to $\zeta(1+it)$.

15. We say that a series is Lambert summable, and write $\sum a_n = a$ (L), if
\[
\lim_{r\to 1^-} (1-r)\sum_{n=1}^{\infty} \frac{n a_n r^n}{1 - r^n} = a.
\]
(a) Show that if $\sum a_n = a$, then $\sum a_n = a$ (L).
(b) Show that if $a_n$ is a bounded sequence and $|z| < 1$, then
\[
\sum_{n=1}^{\infty} \frac{n a_n z^n}{1 - z^n} = \sum_{n=1}^{\infty}\Bigl(\sum_{d \mid n} d\,a_d\Bigr)z^n.
\]
(c) Show that $\sum_{n=1}^{\infty} \mu(n)/n = 0$ (L).
(d) Deduce that if $\sum_{n=1}^{\infty} \mu(n)/n$ converges, then its value is 0. (See (6.18)
and (8.6).)
(e) Show that $\sum_{n=1}^{\infty} (\Lambda(n) - 1)/n = -2C_0$ (L).
(f) Deduce that if $\sum_{n \le x} \Lambda(n)/n = \log x + c + o(1)$ then $c = -C_0$. (See
Exercise 8.1.1.)
16. (Bohr 1909; Riesz 1909; Phragmén (cf. Landau 1909, pp. 762, 904))
Let $\alpha(s) = \sum a_n n^{-s}$, $\beta(s) = \sum b_n n^{-s}$, and $\gamma(s) = \alpha(s)\beta(s) = \sum c_n n^{-s}$
where $c_n = \sum_{d \mid n} a_d b_{n/d}$. Further, put $A(x) = \sum_{n \le x} a_n$ and
$B(x) = \sum_{n \le x} b_n$.
(a) Show that
\[
\int_1^{x} A(y)B(x/y)\,\frac{dy}{y} = \sum_{n \le x} c_n \log x/n.
\]
(b) Show that if $\sum a_n$ converges and $\sum b_n$ converges, then
$\sum c_n = \alpha(0)\beta(0)$ (R, 1).
(c) (Landau 1907) By taking $j = 0$ in Exercise 12(a), or otherwise, show
that if the three series $\sum a_n$, $\sum b_n$, $\sum c_n$ all converge, then
$\sum c_n = \bigl(\sum a_n\bigr)\bigl(\sum b_n\bigr)$.
17. Suppose that $f(n) \nearrow \infty$. Construct $a_n$ so that $|a_n| \le f(n)/n$ for all $n$,
\[
\limsup_{N\to\infty} \sum_{n=1}^{N} a_n = 1, \qquad \liminf_{N\to\infty} \sum_{n=1}^{N} a_n = -1,
\]
but
\[
\lim_{N\to\infty} \sum_{n=1}^{N} a_n(1 - n/N) = 0.
\]

18. (Landau 1908) Show that if $f(x) \sim x$ as $x \to \infty$ and $x f'(x)$ is increasing,
then $\lim_{x\to\infty} f'(x) = 1$.
19. (Landau (1913); cf. Littlewood (1986, pp. 54–55); Schoenberg 1973) Show
that if $f(x) \to 0$ as $x \to \infty$, and if $f''(x) = O(1)$, then $f'(x) \to 0$ as
$x \to \infty$.

20. (Tauber’s ‘second theorem’) Suppose that $P(\delta) = \sum_{n=0}^{\infty} a_n e^{-n\delta}$ for $\delta > 0$,
and put $s_N = \sum_{n=0}^{N} a_n$.
(a) Show that if $a_n = O(1/n)$, then $s_N = P(1/N) + O(1)$.
(b) Show that if $a_n = o(1/n)$, then $s_N = P(1/N) + o(1)$.
(c) Let $B(N) = \sum_{n=1}^{N} n a_n$. Show that if $\sum a_n$ converges, then $B(N) = o(N)$
as $N \to \infty$.
(d) Show that if $P(\delta)$ converges for $\delta > 0$, then
\[
s_N - P(1/N) = \frac{B(N)}{N}
  + \int_1^{N} B(u)\Bigl(\frac{1}{u^2} - \frac{e^{-u/N}}{u^2} - \frac{e^{-u/N}}{uN}\Bigr)\,du
  + \int_N^{\infty} B(u)e^{-u/N}\Bigl(-\frac{u}{N} - 1\Bigr)\frac{du}{u^2}.
\]
(e) Show that if $B(N) = o(N)$, then $s_N - P(1/N) = o(1)$.
(f) Show that if $\sum a_n = a$ (A), then $\sum a_n = a$ if and only if $B(N) = o(N)$.
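21. (a) Using Ramanujan’s identity $\sum_{n=1}^{\infty} d(n)^2 n^{-s} = \zeta(s)^4/\zeta(2s)$ and
Theorem 5.11, show that $\sum_{n \le x} d(n)^2/n \sim (4\pi^2)^{-1}(\log x)^4$.
(b) Show that if $\sum_{n \le x} d(n)^2 \sim cx(\log x)^3$ as $x \to \infty$, then $c = 1/\pi^2$.

22. Show that $\sum_{n=1}^{\infty} 1/(d(n)n^s) \sim c(s-1)^{-1/2}$ as $s \to 1^+$ where
\[
c = \prod_p (p^2 - p)^{1/2}\log\frac{p}{p-1}.
\]
Deduce that
\[
\sum_{n \le x} \frac{1}{n\,d(n)} \sim \frac{2c}{\sqrt{\pi}}(\log x)^{1/2}
\]
as $x \to \infty$.

23. Show that if $\sum_{n \le N} a_n/n = O(1)$ and $\lim_{s\to 1^+} \sum_{n=1}^{\infty} a_n n^{-s} = a$, then
\[
\lim_{x\to\infty} \sum_{n \le x} \frac{a_n}{n}\Bigl(1 - \frac{\log n}{\log x}\Bigr) = a.
\]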

24. Show that
\[
\int_0^{\infty} \frac{\sin x}{x}e^{-sx}\,dx = \arctan 1/s
\]
for $s > 0$. Using Theorem 5.7, deduce that
\[
\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}.
\]
25. Suppose that $f(u) \ge 0$, that $\int_0^{\infty} f(u)\,du < \infty$, and that
$\int_0^{\infty} f(u)(1 - e^{-\delta u})\,du \sim \delta^{1/2}$ as $\delta \to 0^+$. Show that
$\int_U^{\infty} f(u)\,du \sim (\pi U)^{-1/2}$ as $U \to \infty$.

26. Show that $\sum_{n=1}^{\infty} a_n = a$ if and only if
\[
\lim_{r\to 1^-} \sum_{n=0}^{\infty} a_n r^{2^n} = a.
\]

27. Suppose that for every $\varepsilon > 0$ there is an $\eta > 0$ such that
$\sum_{N < n \le (1+\eta)N} |a_n| < \varepsilon$ whenever $N > 1/\eta$. Show that if $\sum a_n = a$ (A),
then $\sum a_n = a$.

28. Show that if $\sum a_n = a$ (C, 1) and if $a_{n+1} - a_n = O(|a_n|/n)$, then $\sum a_n = a$.

29. (Hardy & Littlewood 1913, Theorem 27) Show that if $\sum a_n = a$ (A) and if
$a_{n+1} - a_n = O(|a_n|/n)$, then $\sum a_n = a$.
30. (Hardy 1907) Show that
\[
\lim_{x\to 1^-} \sum_{k=0}^{\infty} (-1)^k x^{2^k}
\]
does not exist.

5.3 Notes
Section 5.1. Theorem 5.1 and the more general (5.22) were first proved rig-
orously by Perron (1908). Although the Mellin transform had been used by
Riemann and Cahen, it was Mellin (1902) who first described a general class
of functions for which the inversion succeeds. Hjalmar Mellin was Finnish, but
his family name is of Swedish origin, so it is properly pronounced mĕ · lēn .
However, in English-speaking countries the uncultured pronunciation mĕl · ı̆n
is universal.
In connection with Theorem 5.4, it should be noted that Plancherel’s formula
$\|f\|_2 = \|\widehat{f}\|_2$ holds not just for all $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ but actually for all
$f \in L^2(\mathbb{R})$. However, in this wider setting one must adopt a new definition for
$\widehat{f}$, since the definition we have taken is valid only for $f \in L^1(\mathbb{R})$. See Goldberg
(1961, pp. 46–47) for a resolution of this issue.
For further material concerning properties of Dirichlet series, one should
consult Hardy & Riesz (1915), Titchmarsh (1939, Chapter 9), or Widder (1971,
Chapter 2). Beyond the theory developed in these sources, we call attention to
two further topics of importance in number theory. Wiener (1932, p. 91) proved
that if the Fourier series of $f \in L^1(\mathbb{T})$ is absolutely convergent and is never zero,
then the Fourier series of $1/f$ is also absolutely convergent. Wiener’s proof was
rather difﬁcult, but Gel’fand (1941) devised a simpler proof depending on his
theory of normed rings. Lévy (1934) proved more generally that the Fourier
series of F( f ) is absolutely convergent provided that F is analytic at all points
in the range of f . Elementary proofs of these theorems have been given by
Zygmund (1968, pp. 245–246) and Newman (1975). These theorems were
generalized to absolutely convergent Dirichlet series by Hewitt & Williamson

(1957), who showed that if $\alpha(s) = \sum a_n n^{-s}$ is absolutely convergent for $\sigma \ge \sigma_0$,
then $1/\alpha(s)$ is represented by an absolutely convergent Dirichlet series
in the same half-plane, if and only if the values taken by α(s) in this half-
plane are bounded away from 0. Ingham (1962) noted a fallacy in Zygmund’s
account of Lévy’s theorem, corrected it, and gave an elementary proof of the
generalization to absolutely convergent Dirichlet series. See also Goodman &
Newman (1984). Secondly, Bohr (1919) developed a theory concerning the
values taken on by an absolutely convergent Dirichlet series. This is described
by Titchmarsh (1986, Chapter 11), and in greater detail by Apostol (1976,
Chapter 8). For a small footnote to this theory, see Montgomery & Schinzel
(1977).
Section 5.2. That conditions (5.29)–(5.31) are necessary and sufficient for
the transformation T to preserve limits was proved by Toeplitz (1911) for upper
triangular matrices, and by Steinhaus (1911) in general. See also Kojima (1917)
and Schur (1921). For more on the Toeplitz matrix theorem and various aspects
of Tauberian theorems, see Peyerimhoff (1969).
Theorem 5.6 under the hypothesis (a) is trivial by dominated convergence.
Theorem 5.6(b) is a special case of a theorem of Hardy (1910), who considered
the more general (C,k) convergence, and Theorem 5.6(c) is similarly a special
case of a theorem of Landau (1910, pp. 103–113).
Tauber (1897) proved two theorems, the second of which is found in Exer-
cise 5.2.18. Littlewood (1911) derived his strengthening of Tauber’s first theo-
rem by using high-order derivatives. Subsequently Hardy & Littlewood (1913,
1914a, b, 1926, 1930) used the same technique to obtain Theorem 5.7 and
its corollaries. Karamata (1930, 1931a, b) introduced the use of Weierstrass’s
approximation theorem. Karamata also considered a more general situation,
in which the right-hand sides of (5.35) and (5.36) are multiplied by a slowly
oscillating function L(1/δ), and the right-hand side of (5.37) is multiplied by
L(U ). Our exposition employs a further simplification due to Wielandt (1952).
Other proofs of Littlewood’s theorem have been given by Delange (1952) and
by Eggleston (1951). Ingham (1965) observed that a peak function similar
to Littlewood’s can be constructed by using high-order differencing instead
of differentiation. Since many proofs of the Weierstrass theorem involve con-
structing a peak function, the two methods are not materially different. Sharp
quantitative Tauberian theorems have been given by Postnikov (1951), Kore-
vaar (1951, 1953, 1954a–d), Freud (1952, 1953, 1954), Ingham (1965), and
Ganelius (1971).
For other accounts of the Hardy–Littlewood theorem, see Hardy (1949) or
Widder (1946, 1971). For a brief survey of applications of summability to
classical analysis, see Rubel (1989).
Wiener (1932, 1933) invented a general Tauberian theory that contains the
Hardy–Littlewood theorems for power series (Theorem 5.7 and its corollaries)
as a special case. Wiener’s theory is discussed by Hardy (1949), Pitt (1958), and
Widder (1946). Among the longer expositions of Tauberian theory, the recent
accounts of Korevaar (2002, 2004) are especially recommended.

5.4 References
Apostol, T. (1976). Modular Functions and Dirichlet Series in Number Theory, Graduate
Texts Math. 41. New York: Springer-Verlag.
Bohr, H. (1909). Über die Summabilität Dirichletscher Reihen, Nachr. König. Gesell.
Wiss. Göttingen Math.-Phys. Kl., 247–262; Collected Mathematical Works, Vol. I.
København: Dansk Mat. Forening, 1952, A2.
(1919). Zur Theorie algemeinen Dirichletschen Reihen, Math. Ann. 79, 136–156;
Collected Mathematical Works, Vol. I. København: Dansk Mat. Forening, 1952,
A13.
Delange, H. (1952). Encore une nouvelle démonstration du théorème taubérien de Lit-
tlewood, Bull. Sci. Math. (2) 76, 179–189.
Edwards, D. A. (1957). On absolutely convergent Dirichlet series, Proc. Amer. Math.
Soc. 8, 1067–1074.
Eggleston, H. G. (1951). A Tauberian lemma, Proc. London Math. Soc. (3) 1, 28–45.
Freud, G. (1952). Restglied eines Tauberschen Satzes, I, Acta Math. Acad. Sci. Hungar.
2, 299–308.
(1953). Restglied eines Tauberschen Satzes, II, Acta Math. Acad. Sci. Hungar. 3,
299–307.
(1954). Restglied eines Tauberschen Satzes, III, Acta Math. Acad. Sci. Hungar. 5,
275–289.
Ganelius, T. (1971). Tauberian Remainder Theorems, Lecture Notes Math. 232. Berlin:
Springer-Verlag.
Gel’fand, I. M. (1941). Über absolut konvergente trigonometrische Reihen und Integrale,
Mat. Sb. N. S. 9, 51–66.
Goldberg R. R. (1961). Fourier Transforms, Cambridge Tract 52. Cambridge: Cambridge
University Press.
Goodman, A. & Newman, D. J. (1984). A Wiener type theorem for Dirichlet series,
Proc. Amer. Math. Soc. 92, 521–527.
Hardy, G. H. (1907). On certain oscillating series, Quart. J. Math. 38, 269–288; Collected
Papers, Vol. 6. Oxford: Clarendon Press, 1974, pp. 146–167.
(1910). Theorems relating to the summability and convergence of slowly oscillating
series, Proc. London Math. Soc. (2) 8, 301–320; Collected Papers, Vol. 6. Oxford:
Clarendon Press, 1974, pp. 291–310.
(1949). Divergent Series, Oxford: Oxford University Press.
Hardy, G. H. & Littlewood, J. E. (1913). Contributions to the arithmetic theory of
series, Proc. London Math. Soc. (2) 11, 411–478; Collected Papers, Vol. 6. Oxford:
Clarendon Press, 1974, pp. 428–495.
(1914a). Tauberian theorems concerning power series and Dirichlet series whose co-
efﬁcients are positive, Proc. London Math. Soc. (2) 13, 174–191; Collected Papers,
Vol. 6. Oxford: Clarendon Press, 1974, pp. 510–527.
(1914b). Some theorems concerning Dirichlet’s series, Messenger Math. 43, 134–147;
Collected Papers, Vol. 6. Oxford: Clarendon Press, 1974, pp. 542–555.
(1926). A further note on the converse of Abel’s theorem, Proc. London Math.
Soc. (2) 25, 219–236; Collected Papers, Vol. 6. Oxford: Clarendon Press, 1974,
pp. 699–716.
(1930). Notes on the theory of series XI: On Tauberian theorems, Proc. London
Math. Soc. (2) 30, 23–37; Collected Papers, Vol. 6. Oxford: Clarendon Press, 1974,
pp. 745–759.
Hardy, G. H. & Riesz, M. (1915). The General Theory of Dirichlet’s Series, Cambridge
Tract No. 18. Cambridge: Cambridge University Press. Reprint: Stechert–Hafner
(1964).
Hewitt, E. & Williamson, H. (1957). Note on absolutely convergent Dirichlet series,
Proc. Amer. Math. Soc. 8, 863–868.
Ingham, A. E. (1962). On absolutely convergent Dirichlet series. Studies in Mathemati-
cal Analysis and Related Topics. Stanford: Stanford University Press, pp. 156–164.
(1965). On tauberian theorems, Proc. London Math. Soc. (3) 14A, 157–173.
Karamata, J. (1930). Über die Hardy–Littlewoodschen Umkehrungen des Abelschen
Stetigkeitssatzes, Math. Z. 32, 319–320.
(1931a). Neuer Beweis und Verallgemeinerung einiger Tauberian-Sätze, Math. Z. 33,
294–300.
(1931b). Neuer Beweis und Verallgemeinerung der Tauberschen Sätze, welche die
Laplacesche und Stieltjessche Transformation betreffen, J. Reine Angew. Math.
164, 27–40.
Kojima, T. (1917). On generalized Toeplitz’s theorems on limit and their application,
Tôhoku Math. J. 12, 291–326.
Korevaar, J. (1951). An estimate of the error in Tauberian theorems for power series,
Duke Math. J. 18, 723–734.
(1953). Best L 1 approximation and the remainder in Littlewood’s theorem, Proc.
Nederl. Akad. Wetensch. Ser. A 56 (= Indagationes Math. 15), 281–293.
(1954a). A very general form of Littlewood’s theorem, Proc. Nederl. Akad. Wetensch.
Ser. A 57 (= Indagationes Math. 16), 36–45.
(1954b). Another numerical Tauberian theorem for power series, Proc. Nederl. Akad.
Wetensch. Ser. A 57 (= Indagationes Math. 16), 46–56.
(1954c). Numerical Tauberian theorems for Dirichlet and Lambert series, Proc.
Nederl. Akad. Wetensch. Ser. A 57 (= Indagationes Math. 16), 152–160.
(1954d). Numerical Tauberian theorems for power series and Dirichlet series, I, II,
Proc. Nederl. Akad. Wetensch. Ser. A 57 (= Indagationes Math. 16), 432–443,
444–455.
(2001). Tauberian theory, approximation, and lacunary series of powers, Trends in
approximation theory (Nashville, 2000), Innov. Appl. Math. Nashville: Vanderbilt
University Press, pp. 169–189.
(2002). A century of complex Tauberian theory, Bull. Amer. Math. Soc. (N.S.) 39,
475–531.
(2004). Tauberian Theory. A Century of Developments. Grundl. Math. Wiss. 329.
Berlin: Springer-Verlag.
Landau, E. (1907). Über die Multiplikation Dirichletscher Reihen, Rend. Circ. Mat.
Palermo 24, 81–160.
(1908). Zwei neue Herleitungen für die asymptotische Anzahl der Primzahlen unter
einer gegebenen Grenze, Sitzungsberichte Akad. Wiss. Berlin 746–764; Collected
Works, Vol.4. Essen: Thales Verlag, 1986, pp. 21–39.
(1909). Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig: Teubner.
Reprint: Chelsea (New York), 1953.
(1910). Über die Bedeutung einiger neuerer Grenzwertsätze der Herren Hardy und
Axer, Prace mat.-ﬁz. (Warsaw) 21, 97–177; Collected Works, Vol. 4. Essen: Thales
Verlag, 1986, pp. 267–347.
(1913). Einige Ungleichungen für zweimal differentiierbare Funktionen, Proc. Lon-
don Math. Soc. (2) 13, 43–49; Collected Works, Vol. 6. Essen: Thales Verlag, 1986,
pp. 49–55.
Lévy, P. (1934). Sur la convergence absolue des séries de Fourier, Compositio Math. 1,
1–14.
Littlewood, J. E. (1911). The converse of Abel’s theorem on power series, Proc. London
Math. Soc. (2) 9, 434–448; Collected Papers, Vol. 1. Oxford: Oxford University
Press, 1982, pp. 757–773.
(1986). Littlewood’s Miscellany, Bollobas, B. Ed., Cambridge: Cambridge University
Press.
van de Lune, J. (1986). An Introduction to Tauberian Theory: From Tauber to Wiener.
CWI Syllabus 12. Amsterdam: Mathematisch Centrum.
Mellin, H. (1902). Über den Zusammenhang zwischen den linearen Differential- und
Differenzengleichungen, Acta Math. 25, 139–164.
Montgomery, H. L. & Schinzel, A. (1977). Some arithmetic properties of polynomials in
several variables. Transcendence Theory: Advances and Applications (Cambridge,
1976). London: Academic Press, pp. 195–203.
Newman, D. J. (1975). A simple proof of Wiener’s 1/ f theorem, Proc. Amer. Math. Soc.
48, 264–265.
Perron, O. (1908). Zur Theorie der Dirichletschen Reihen, J. Reine Angew. Math. 134,
95–143.
Peyerimhoff, A. (1969). Lectures on summability, Lecture Notes Math. 107. Berlin:
Springer-Verlag.
Pitt, H. R. (1958). Tauberian Theorems. Tata Monographs. London: Oxford University
Press.
Postnikov, A. G. (1951). The remainder term in the Tauberian theorem of Hardy and
Littlewood, Dokl. Akad. Nauk SSSR N. S. 77, 193–196.
Riesz, M. (1909). Sur la sommation des séries de Dirichlet, C. R. Acad. Sci. Paris 149,
18–21.
Rubel, L. (1989). Summability theory: a neglected tool of analysis, Amer. Math. Monthly
96, 421–423.
Schoenberg, I. J. (1973). The elementary cases of Landau’s problem of inequalities
between derivatives, Amer. Math. Monthly 80, 121–158.
Schur, I. (1921). Über lineare Transformationen in der Theorie der unendlichen Reihen,
J. Reine Angew. Math. 151, 79–111.
Steinhaus, H. (1911). Kilka slów o uogólnieniu pojȩcia granicy, Warsaw: Prace mat-ﬁz
22, 121–134.
Tauber, A. (1897). Ein Satz aus der Theorie der unendlichen Reihen, Monat. Math. 8,
273–277.
Titchmarsh, E. C. (1939). The Theory of Functions, Second Edition. Oxford: Oxford
University Press.
(1986). The Theory of the Riemann Zeta-function, Second Edition. Oxford: Oxford
University Press.
Toeplitz, O. (1911). Über algemeine lineare Mittelbildungen, Warsaw: Prace mat–ﬁz
22, 113–119.
Widder, D. V. (1946). The Laplace transform, Princeton: Princeton University Press.
(1971). An Introduction to Transform Theory. New York: Academic Press.
Wielandt, H. (1952). Zur Umkehrung des Abelschen Stetigkeitssatzes, Math Z. 56, 206–
207.
Wiener, N. (1932). Tauberian theorems, Ann. of Math. (2) 33, 1–100.
(1933). The Fourier Integral, and Certain of its Applications. Cambridge: Cambridge
University Press.
Wintner, A. (1943). Eratosthenian averages. Baltimore: Waverly Press.
Zygmund, A. (1968). Trigonometric series, Vol. 1, Second Edition. Cambridge: Cam-
bridge University Press.
6
The Prime Number Theorem

6.1 A zero-free region

The Prime Number Theorem (PNT) asserts that
\[
\pi(x) \sim \frac{x}{\log x}
\]
as $x$ tends to infinity. We shall prove this by using Perron’s formula, but in
the course of our arguments it will be important to know that $\zeta(s) \ne 0$ for
$\sigma \ge 1$. In Chapter 1 we saw that $\zeta(s) \ne 0$ for $\sigma > 1$, but it remains to show
that $\zeta(1 + it) \ne 0$. To obtain a quantitative form of the Prime Number Theorem
we take some care to show that $\zeta(s) \ne 0$ for $\sigma \ge 1 - \delta(t)$ where $\delta(t)$
is some function of t. We would like the width δ(t) of the zero-free region
to be as large as possible, as the rate at which δ(t) tends to 0 determines the
size of the estimate we can derive for the error term in the Prime Number
Theorem.
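
The asymptotic relation $\pi(x) \sim x/\log x$ can be observed numerically, although the convergence is slow. The sketch below counts primes with a sieve of Eratosthenes; the function name `prime_count` and the sample values of $x$ are illustrative assumptions, not part of the text.

```python
import math


def prime_count(x):
    """Return pi(x), the number of primes not exceeding the integer x >= 1."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Strike out the multiples of p, starting from p * p.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


for x in (10 ** 3, 10 ** 4, 10 ** 5):
    print(x, prime_count(x), prime_count(x) / (x / math.log(x)))
```

The ratio in the last column decreases towards 1 only slowly, which is one reason for seeking a quantitative form of the theorem with an explicit error term.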
We begin by reviewing some basic facts concerning functions of a complex
variable. If P(z) is a polynomial, then the rate of growth of |P(z)| as |z| →
∞ reﬂects the number of zeros of P(z). This is generalized to other analytic
functions by Jensen’s formula. For our purposes we are content to establish the
following simple consequence of Jensen’s formula.

Lemma 6.1 (Jensen’s inequality) If $f(z)$ is analytic in a domain containing
the disc $|z| \le R$, if $|f(z)| \le M$ in this disc, and if $f(0) \ne 0$, then for $r < R$ the
number of zeros of $f$ in the disc $|z| \le r$ does not exceed
\[
\frac{\log M/|f(0)|}{\log R/r}.
\]

Proof Let $z_1, z_2, \ldots, z_K$ denote the zeros of $f$ in the disc $|z| \le R$, and
put
\[
g(z) = f(z)\prod_{k=1}^{K} \frac{R^2 - z\overline{z_k}}{R(z - z_k)}.
\]
The $k$th factor of the product has been constructed so that it has a pole at $z_k$, and
so that it has modulus 1 on the circle $|z| = R$. Hence $g$ is an analytic function
in the disc $|z| \le R$, and if $|z| = R$, then $|g(z)| = |f(z)| \le M$. Hence by the
maximum modulus principle, $|g(0)| \le M$. But
\[
|g(0)| = |f(0)|\prod_{k=1}^{K} \frac{R}{|z_k|}.
\]
Each factor in the product is $\ge 1$, and if $|z_k| \le r$, then the factor is $\ge R/r$. If
there are $L$ such zeros, then the above is $\ge |f(0)|(R/r)^L$, which gives the stated
upper bound for $L$.
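
To see the inequality of Lemma 6.1 at work, the sketch below takes an assumed polynomial $f(z) = \prod_k (z - z_k)$ with prescribed zeros, bounds it on $|z| \le R$ by the crude estimate $|z - z_k| \le R + |z_k|$, and compares the number of zeros in $|z| \le r$ with the bound of the lemma. The zeros, radii and function name are illustrative choices.

```python
import math


def jensen_bound(M, f0_abs, R, r):
    """Upper bound log(M / |f(0)|) / log(R / r) from Lemma 6.1."""
    return math.log(M / f0_abs) / math.log(R / r)


# Hypothetical example: f(z) = (z - 0.5)(z + 0.5)(z - 0.5i)(z - 2).
zeros = [0.5, -0.5, 0.5j, 2.0]
R, r = 1.0, 0.6

# On |z| <= R each factor satisfies |z - z_k| <= R + |z_k|, so this M is admissible.
M = math.prod(R + abs(z) for z in zeros)
f0_abs = math.prod(abs(z) for z in zeros)

zeros_in_small_disc = sum(1 for z in zeros if abs(z) <= r)
bound = jensen_bound(M, f0_abs, R, r)
print(zeros_in_small_disc, bound)
```

Here three zeros lie in $|z| \le 0.6$, while the bound is a little above 7; the slack comes both from the crude choice of $M$ and from the zero at $z = 2$, which lies outside the disc but still decreases $|f(0)|/M$.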

We now show that a bound for the modulus of an analytic function can be
derived from a one-sided bound for its real part in a slightly larger region.
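Lemma 6.2 (The Borel–Carathéodory Lemma) Suppose that $h(z)$ is analytic
in a domain containing the disc $|z| \le R$, that $h(0) = 0$, and that $\Re h(z) \le M$
for $|z| \le R$. If $|z| \le r < R$, then
\[
|h(z)| \le \frac{2Mr}{R - r}
\]
and
\[
|h'(z)| \le \frac{2MR}{(R - r)^2}.
\]
Proof It suffices to show that
\[
\frac{|h^{(k)}(0)|}{k!} \le \frac{2M}{R^k} \tag{6.1}
\]
for all $k \ge 1$, for then
\[
|h(z)| \le \sum_{k=1}^{\infty} \frac{|h^{(k)}(0)|}{k!}r^k \le 2M\sum_{k=1}^{\infty} \Bigl(\frac{r}{R}\Bigr)^{k} = \frac{2Mr}{R - r},
\]
and
\[
|h'(z)| \le \sum_{k=1}^{\infty} \frac{|h^{(k)}(0)|\,k r^{k-1}}{k!}
  \le \frac{2M}{R}\sum_{k=1}^{\infty} k\Bigl(\frac{r}{R}\Bigr)^{k-1} = \frac{2MR}{(R - r)^2}.
\]
To prove (6.1) we first note that
\[
\int_0^1 h(Re(\theta))\,d\theta = \frac{1}{2\pi i}\oint_{|z|=R} h(z)\,\frac{dz}{z} = h(0) = 0.
\]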
170 The Prime Number Theorem

Moreover, if k > 0, then

1
R −k
h(Re(θ ))e(kθ ) dθ = h(z)z k−1 dz = 0,
0 2πi |z|=R

and
1
Rk R k h (k) (0)
h(Re(θ ))e(−kθ) dθ = h(z)z −k−1 dz = .
0 2πi |z|=R k!
By forming a linear combination of these identities we see that if k > 0, then
1
R k e(−φ)h (k) (0)
h(Re(θ ))(1 + cos 2π(kθ + φ)) dθ = .
0 2 · k!
By taking real parts it follows that
1
1 k
R e(−φ)h (k) (0)/k! ≤ M (1 + cos 2π (kθ + φ)) dθ = M
2 0

for k > 0. Since this holds for any real φ, we are free to choose φ so that
e(−φ)h (k) (0) = |h (k) (0)|. Then the above inequality gives (6.1), and the proof
is complete.
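The first bound of Lemma 6.2 can be probed numerically with the function $h(z) = 2Mz/(z + R + \varepsilon)$ of Exercise 6.1.2, which shows that the bound is close to best possible. The sketch below samples $h$ on circles; the parameter values and the sampling density are arbitrary assumptions made for illustration.

```python
import cmath
import math

def h(z, M, R, eps):
    """The near-extremal function of Exercise 6.1.2."""
    return 2 * M * z / (z + R + eps)

def circle(radius, n=2000):
    """n equally spaced points on the circle |z| = radius."""
    return [radius * cmath.exp(2j * math.pi * k / n) for k in range(n)]

# Illustrative parameters (assumptions of this sketch).
M, R, eps, r = 1.0, 1.0, 0.01, 0.5

# Hypothesis of the lemma: Re h <= M on |z| = R (hence on the whole disc).
max_real_on_R = max(h(z, M, R, eps).real for z in circle(R))

# Conclusion of the lemma: |h| <= 2Mr/(R - r) on |z| <= r.
max_abs_on_r = max(abs(h(z, M, R, eps)) for z in circle(r))
bound = 2 * M * r / (R - r)
extremal_value = 2 * M * r / (R + eps - r)  # attained at z = -r

print(max_real_on_R, max_abs_on_r, bound)
```

As $\varepsilon \to 0^+$ the sampled maximum approaches the bound $2Mr/(R - r)$.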
If $P(z) = c \prod_{k=1}^{K} (z - z_k)$, then
\[ \frac{P'}{P}(z) = \sum_{k=1}^{K} \frac{1}{z - z_k}. \]
We now generalize this to analytic functions $f(z)$, to the extent that $f'/f$ can be approximated by a sum over its nearby zeros.

Lemma 6.3 Suppose that $f(z)$ is analytic in a domain containing the disc $|z| \le 1$, that $|f(z)| \le M$ in this disc, and that $f(0) \ne 0$. Let $r$ and $R$ be fixed, $0 < r < R < 1$. Then for $|z| \le r$ we have
\[ \frac{f'}{f}(z) = \sum_{k=1}^{K} \frac{1}{z - z_k} + O\Big(\log \frac{M}{|f(0)|}\Big) \]
where the sum is extended over all zeros $z_k$ of $f$ for which $|z_k| \le R$. (The implicit constant depends on $r$ and $R$, but is otherwise absolute.)

Proof If $f(z)$ has zeros on the circle $|z| = R$, then we replace $R$ by a very slightly larger value. Thus we may assume that $f(z) \ne 0$ for $|z| = R$. Set
\[ g(z) = f(z) \prod_{k=1}^{K} \frac{R^2 - z\overline{z_k}}{R(z - z_k)}. \]
By Lemma 6.1 we know that
\[ K \le \frac{\log M/|f(0)|}{\log 1/R} \ll \log \frac{M}{|f(0)|}. \tag{6.2} \]
If $|z| = R$, then each factor in the product has modulus 1. Consequently $|g(z)| \le M$ when $|z| = R$, and by the maximum modulus principle $|g(z)| \le M$ for $|z| \le R$. We also note that
\[ |g(0)| = |f(0)| \prod_{k=1}^{K} \frac{R}{|z_k|} \ge |f(0)|. \]
Since $g(z)$ has no zeros in the disc $|z| \le R$, we may put $h(z) = \log(g(z)/g(0))$. Then $h(0) = 0$, and
\[ \Re h(z) = \log |g(z)| - \log |g(0)| \le \log M - \log |f(0)| \]
for $|z| \le R$. Hence by the Borel–Carathéodory lemma we see that
\[ h'(z) \ll \log \frac{M}{|f(0)|} \tag{6.3} \]
for $|z| \le r$. But
\[ h'(z) = \frac{g'}{g}(z) = \frac{f'}{f}(z) - \sum_{k=1}^{K} \frac{1}{z - z_k} + \sum_{k=1}^{K} \frac{1}{z - R^2/\overline{z_k}}. \tag{6.4} \]
Now $|R^2/\overline{z_k}| \ge R$, so that if $|z| \le r$ then $|z - R^2/\overline{z_k}| \ge R - r$. Hence for $|z| \le r$ the last sum above has modulus
\[ \le \frac{K}{R - r} \ll \log \frac{M}{|f(0)|} \]
by (6.2). To obtain the stated result it suffices to combine this estimate and (6.3) in (6.4).

We now apply these general principles to the zeta function.

Lemma 6.4 If $|t| \ge 7/8$ and $5/6 \le \sigma \le 2$, then
\[ \frac{\zeta'}{\zeta}(s) = \sum_{\rho} \frac{1}{s - \rho} + O(\log \tau) \]
where $\tau = |t| + 4$ and the sum is extended over all zeros $\rho$ of $\zeta(s)$ for which $|\rho - (3/2 + it)| \le 5/6$.
Proof We apply Lemma 6.3 to the function $f(z) = \zeta(z + (3/2 + it))$, with $R = 5/6$ and $r = 2/3$. To complete the proof it suffices to note that $|f(0)| \gg 1$ by the (absolutely convergent) Euler product formula (1.17), and that $f(z) \ll \tau$ for $|z| \le 1$ by Corollary 1.17.

If the zeta function were to have a zero of multiplicity $m$ at $1 + i\gamma$, then we would have
\[ \frac{\zeta'}{\zeta}(1 + \delta + i\gamma) \sim \frac{m}{\delta} \]
as $\delta \to 0^+$. But
\[ \Re \frac{\zeta'}{\zeta}(1 + \delta + i\gamma) = -\sum_{n=1}^{\infty} \Lambda(n) n^{-1-\delta} \cos(\gamma \log n), \]
and in the very worst case this could be no larger than
\[ \sum_{n=1}^{\infty} \Lambda(n) n^{-1-\delta} = -\frac{\zeta'}{\zeta}(1 + \delta) \sim \frac{1}{\delta}. \]
Thus $m$ is at most 1, and even in this case $\Re\, \zeta'/\zeta$ would be essentially as large as it could possibly be. Roughly speaking, this would imply that $p^{i\gamma}$ is near $-1$ for most primes. But then it would follow that $p^{2i\gamma}$ is near 1 for most primes, so that
\[ \frac{\zeta'}{\zeta}(1 + \delta + 2i\gamma) \sim -\frac{1}{\delta} \]
as $\delta \to 0^+$. Then $\zeta(s)$ would have a pole at $1 + 2i\gamma$, contrary to Corollary 1.13. The essence of this informal argument is captured very effectively by the following elementary inequality.

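Lemma 6.5 If $\sigma > 1$, then
\[ \Re\Big( -3\frac{\zeta'}{\zeta}(\sigma) - 4\frac{\zeta'}{\zeta}(\sigma + it) - \frac{\zeta'}{\zeta}(\sigma + 2it) \Big) \ge 0. \]
Proof From Corollary 1.11 we see that the left-hand side above is
\[ \sum_{n=1}^{\infty} \Lambda(n) n^{-\sigma} \big( 3 + 4\cos(t \log n) + \cos(2t \log n) \big). \]
It now suffices to note that $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2 \ge 0$ for all $\theta$.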
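The trigonometric identity behind Lemma 6.5 is easy to confirm numerically. The following minimal sketch evaluates both sides on a grid; the function names and the grid size are hypothetical choices for illustration.

```python
import math

def mertens_poly(theta):
    """The cosine polynomial 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)

def square_form(theta):
    """The equivalent closed form 2 (1 + cos(theta))^2."""
    return 2 * (1 + math.cos(theta)) ** 2

# Sample one full period on an evenly spaced grid.
grid = [2 * math.pi * k / 1000 for k in range(1001)]
max_gap = max(abs(mertens_poly(t) - square_form(t)) for t in grid)
min_value = min(mertens_poly(t) for t in grid)

print(max_gap, min_value)
```

The minimum over the grid is attained at $\theta = \pi$, where the polynomial vanishes; this is the case $p^{i\gamma} = -1$ of the informal argument above.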

We now use Lemmas 6.4 and 6.5 to establish the existence of a zero-free
region for the zeta function.

Theorem 6.6 There is an absolute constant $c > 0$ such that $\zeta(s) \ne 0$ for $\sigma \ge 1 - c/\log\tau$.

This is the classical zero-free region for the zeta function.

Proof Since $\zeta(s)$ is given by the absolutely convergent product (1.17) for $\sigma > 1$, it suffices to consider $\sigma \le 1$. From (1.24) we see that
\[ \Big| \zeta(s) - \frac{s}{s-1} \Big| \le |s| \int_1^{\infty} u^{-\sigma-1}\, du = \frac{|s|}{\sigma} \tag{6.5} \]
for $\sigma > 0$. From this we see that $\zeta(s) \ne 0$ when $\sigma > |s - 1|$, i.e., in the parabolic region $\sigma > (1 + t^2)/2$. In particular, $\zeta(s) \ne 0$ in the rectangle $8/9 \le \sigma \le 1$, $|t| \le 7/8$. Now suppose that $\rho_0 = \beta_0 + i\gamma_0$ is a zero of the zeta function with $5/6 \le \beta_0 \le 1$, $|\gamma_0| \ge 7/8$. Since $\Re\rho \le 1$ for all zeros $\rho$ of $\zeta(s)$, it follows that $\Re\, 1/(s - \rho) > 0$ whenever $\sigma > 1$. Hence by Lemma 6.4 with $s = 1 + \delta + i\gamma_0$ we see that
\[ -\Re \frac{\zeta'}{\zeta}(1 + \delta + i\gamma_0) \le -\frac{1}{1 + \delta - \beta_0} + c_1 \log(|\gamma_0| + 4). \]
Similarly, by Lemma 6.4 with $s = 1 + \delta + 2i\gamma_0$ we find that
\[ -\Re \frac{\zeta'}{\zeta}(1 + \delta + 2i\gamma_0) \le c_1 \log(|2\gamma_0| + 4). \]
From Corollary 1.13 we see that
\[ -\frac{\zeta'}{\zeta}(1 + \delta) = \frac{1}{\delta} + O(1). \]
On combining these estimates in Lemma 6.5 we conclude that
\[ \frac{3}{\delta} - \frac{4}{1 + \delta - \beta_0} + c_2 \log(|\gamma_0| + 4) \ge 0. \]
We take $\delta = 1/(2c_2 \log(|\gamma_0| + 4))$. Thus the above gives
\[ 7c_2 \log(|\gamma_0| + 4) \ge \frac{4}{1 + \delta - \beta_0}, \]
which is to say that
\[ 1 + \frac{1}{2c_2 \log(|\gamma_0| + 4)} - \beta_0 \ge \frac{4}{7c_2 \log(|\gamma_0| + 4)}. \]
Hence
\[ 1 - \beta_0 \ge \frac{1}{14 c_2 \log(|\gamma_0| + 4)}, \]
so the proof is complete.

In the above argument it is essential that the coefficient of $\frac{\zeta'}{\zeta}(s)$ is larger than the coefficient of $\frac{\zeta'}{\zeta}(\sigma)$. Among non-negative cosine polynomials $T(\theta) = a_0 + a_1 \cos 2\pi\theta + \cdots + a_N \cos 2\pi N\theta$, the ratio $a_1/a_0$ can be arbitrarily close to 2, as we see in the Fejér kernel
\[ \Delta_N(\theta) = 1 + 2\sum_{n=1}^{N-1} \Big(1 - \frac{n}{N}\Big) \cos 2\pi n\theta = \frac{1}{N} \Big( \frac{\sin \pi N\theta}{\sin \pi\theta} \Big)^2 \ge 0, \]
but it must be strictly less than 2 since
\[ a_0 - \tfrac12 a_1 = \int_0^1 T(\theta)(1 - \cos 2\pi\theta)\, d\theta > 0. \]
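The two expressions for the Fejér kernel, and the ratio $a_1/a_0 = 2(1 - 1/N)$ read off from the cosine expansion, can be checked with a short sketch. The function names and the sample points are illustrative assumptions.

```python
import math

def fejer_sum(N, theta):
    """Cosine-polynomial form: 1 + 2 sum_{n<N} (1 - n/N) cos(2 pi n theta)."""
    return 1 + 2 * sum((1 - n / N) * math.cos(2 * math.pi * n * theta)
                       for n in range(1, N))

def fejer_closed(N, theta):
    """Closed form (1/N) (sin(pi N theta) / sin(pi theta))^2, theta not an integer."""
    return (math.sin(math.pi * N * theta) / math.sin(math.pi * theta)) ** 2 / N

def ratio_a1_a0(N):
    """a_1 / a_0 for the Fejer kernel: a_0 = 1 and a_1 = 2 (1 - 1/N)."""
    return 2 * (1 - 1 / N)

N = 8
samples = [0.1, 0.23, 0.5, 0.77]
max_gap = max(abs(fejer_sum(N, t) - fejer_closed(N, t)) for t in samples)
print(max_gap, [ratio_a1_a0(n) for n in (2, 10, 100, 1000)])
```

The printed ratios increase towards 2 without reaching it, in line with the strict inequality $a_0 - \tfrac12 a_1 > 0$.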

It is useful to have bounds for the zeta function and its logarithmic derivative
in the zero-free region.
Theorem 6.7 Let $c$ be the constant in Theorem 6.6. If $\sigma > 1 - c/(2\log\tau)$ and $|t| \ge 7/8$, then
\[ \frac{\zeta'}{\zeta}(s) \ll \log\tau, \tag{6.6} \]
\[ |\log\zeta(s)| \le \log\log\tau + O(1), \tag{6.7} \]
and
\[ \frac{1}{\zeta(s)} \ll \log\tau. \tag{6.8} \]
On the other hand, if $1 - c/(2\log\tau) < \sigma \le 2$ and $|t| \le 7/8$, then $\frac{\zeta'}{\zeta}(s) = -1/(s - 1) + O(1)$, $\log\big(\zeta(s)(s - 1)\big) \ll 1$, and $1/\zeta(s) \ll |s - 1|$.
Proof If $\sigma > 1$, then by Corollary 1.11 and the triangle inequality we see that
\[ \Big| \frac{\zeta'}{\zeta}(s) \Big| \le \sum_{n=1}^{\infty} \Lambda(n) n^{-\sigma} = -\frac{\zeta'}{\zeta}(\sigma) \ll \frac{1}{\sigma - 1}. \]
Hence (6.6) is obvious if $\sigma \ge 1 + 1/\log\tau$. Let $s_1 = 1 + 1/\log\tau + it$. In particular we have
\[ \frac{\zeta'}{\zeta}(s_1) \ll \log\tau. \tag{6.9} \]
From this estimate and Lemma 6.4 we deduce that
\[ \sum_{\rho} \Re \frac{1}{s_1 - \rho} \ll \log\tau \tag{6.10} \]
where the sum is over those zeros $\rho$ for which $|\rho - (3/2 + it)| \le 5/6$. Suppose that $1 - c/(2\log\tau) \le \sigma \le 1 + 1/\log\tau$. Then by Lemma 6.4 we see that
\[ \frac{\zeta'}{\zeta}(s) - \frac{\zeta'}{\zeta}(s_1) = \sum_{\rho} \Big( \frac{1}{s - \rho} - \frac{1}{s_1 - \rho} \Big) + O(\log\tau). \tag{6.11} \]
Since $|s - \rho| \asymp |s_1 - \rho|$ for all zeros $\rho$ in the sum, it follows that
\[ \frac{1}{s - \rho} - \frac{1}{s_1 - \rho} \ll \frac{1}{|s_1 - \rho|^2 \log\tau} \ll \Re \frac{1}{s_1 - \rho}. \]
Now (6.6) follows on combining this with (6.9) and (6.10) in (6.11).
To derive (6.7) we begin as in our proof of (6.6). From Corollary 1.11 and the triangle inequality we see that if $\sigma > 1$, then
\[ |\log\zeta(s)| \le \sum_{n=2}^{\infty} \frac{\Lambda(n)}{\log n} n^{-\sigma} = \log\zeta(\sigma). \]
But by Theorem 1.14 we know that $\zeta(\sigma) < 1 + 1/(\sigma - 1)$, so that (6.7) holds when $\sigma \ge 1 + 1/\log\tau$. In particular (6.7) holds at the point $s_1 = 1 + 1/\log\tau + it$, so that to treat the remaining $s$ it suffices to bound the difference
\[ \log\zeta(s) - \log\zeta(s_1) = \int_{s_1}^{s} \frac{\zeta'}{\zeta}(w)\, dw. \]
We take the path of integration to be the line segment joining the endpoints. Then the length of this interval multiplied by the bound (6.6) gives the error term $O(1)$ in (6.7).
The estimate (6.8) follows directly from (6.7), since $\log 1/|\zeta| = -\Re\log\zeta$. The remaining estimates follow trivially from (6.5).

The ideas we have used enable us not only to derive a zero-free region but
also to place a bound on the number of zeros ρ that might lie near the point
1 + it.

Theorem 6.8 Let $n(r; t)$ denote the number of zeros $\rho$ of $\zeta(s)$ in the disc $|\rho - (1 + it)| \le r$. Then $n(r; t) \ll r\log\tau$, uniformly for $r \le 3/4$.

Proof If $c_1$ is a small positive constant and $r < c_1/\log\tau$, then $n(r; t) = 0$ by Theorem 6.6. Suppose that $c_1/\log\tau \le r \le 1/6$, $|t| \ge 7/8$. As in the proof of Theorem 6.7, the estimate (6.10) holds when we take $s_1 = 1 + r + it$. In the sum over $\rho$, each term is non-negative, and those zeros $\rho$ counted in $n(r; t)$ contribute at least $1/(2r)$ apiece. Hence their number is $\ll r\log\tau$. If $1/6 < r \le 3/4$ and $|t| \ge 3$, then the desired bound follows at once by applying Jensen's inequality (Lemma 6.1 above) to the function $f(z) = \zeta(z + 2 + it)$, with $R = 11/6$, in view of the bounds provided by Corollary 1.17. Note that $|f(0)| \gg 1$ because of the absolute convergence of the Euler product. If $1/6 < r \le 3/4$ and $|t| \le 3$, then we apply Jensen's inequality to the function $f(z) = (z + 1 + it)\zeta(z + 2 + it)$.

6.1.1 Exercises
1. (a) Show that if $|z| < R$, $|w| \le R$, and $z \ne w$, then
\[ \Big| \frac{z\overline{w} - R^2}{(z - w)R} \Big| \ge 1. \]
(b) Show that if $|w| \le \rho < R$, $|z| = r < R$, and $z \ne w$, then
\[ \Big| \frac{z\overline{w} - R^2}{(z - w)R} \Big| \ge \frac{r\rho + R^2}{(r + \rho)R}. \]
(c) Suppose that $f$ is analytic in the disc $|z| \le R$. For $r \le R$ put $M(r) = \max_{|z| \le r} |f(z)|$. Show that if $0 < r < R$ and $0 < \rho < R$, then the number of zeros of $f$ in the disc $|z| \le \rho$ does not exceed
\[ \frac{\log \dfrac{M(R)}{M(r)}}{\log \dfrac{r\rho + R^2}{(r + \rho)R}}. \]

2. Suppose that $R$, $M$, and $\varepsilon$ are positive real numbers, and set $h(z) = 2Mz/(z + R + \varepsilon)$.
(a) Show that $h(0) = 0$, that $h(z)$ is analytic for $|z| < R + \varepsilon$, and that $\Re h(z) \le M$ for $|z| \le R + \varepsilon$.
(b) Show that if $0 < r < R$, then
\[ \max_{|z| \le r} |h(z)| = -h(-r) = \frac{2Mr}{R + \varepsilon - r}. \]
(c) Show that if $0 < r < R$, then
\[ \max_{|z| \le r} |h'(z)| = h'(-r) = \frac{2M(R + \varepsilon)}{(R + \varepsilon - r)^2}. \]
3. Show that, in the situation of the Borel–Carathéodory lemma (Lemma 6.2), if $|z| \le r < R$, then
\[ |h''(z)| \le \frac{4MR}{(R - r)^3}. \]
4. (Mertens 1898) Use the Dirichlet series expansion of $\log\zeta(s)$ to show that if $\sigma > 1$, then
\[ |\zeta(\sigma)^3 \zeta(\sigma + it)^4 \zeta(\sigma + 2it)| \ge 1. \]
The method used to establish a zero-free region for the zeta function can be
applied to any particular Dirichlet L-function, though the constants involved
may depend on the function. We shall pursue this systematically in Chapter 11,
but in the exercise below we treat one interesting example.
5. Let $\chi_0$ denote the principal character (mod 4), and $\chi_1$ the non-principal character (mod 4).
(a) Show that $L(1, \chi_1) = \pi/4$, and hence that there is a neighbourhood of 1 in which $L(s, \chi_1) \ne 0$.
(b) Show that if $\sigma > 1$, then
\[ \Re\Big( -3\frac{L'}{L}(\sigma, \chi_0) - 4\frac{L'}{L}(\sigma + it, \chi_1) - \frac{L'}{L}(\sigma + 2it, \chi_0) \Big) \ge 0. \]
(c) Show that there is a constant $c > 0$ such that $L(s, \chi_1) \ne 0$ for $\sigma > 1 - c/\log\tau$.
(d) Show that there is a constant $c > 0$ such that if $\sigma > 1 - c/\log\tau$, then
\[ \frac{L'}{L}(s, \chi_1) \ll \log\tau, \]
\[ |\log L(s, \chi_1)| \le \log\log\tau + O(1), \]
\[ \frac{1}{L(s, \chi_1)} \ll \log\tau. \]
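6. (a) Show that if $1 < \sigma_1 \le \sigma_2$, then
\[ \frac{\zeta(\sigma_2)}{\zeta(\sigma_1)} \le \Big| \frac{\zeta(\sigma_2 + it)}{\zeta(\sigma_1 + it)} \Big| \le \frac{\zeta(\sigma_1)}{\zeta(\sigma_2)} \]
for all real $t$.
(b) Show that if $1 < \sigma_1 \le \sigma_2 \le 2$, then
\[ \frac{\sigma_1 - 1}{\sigma_2 - 1} \ll \Big| \frac{\zeta(\sigma_2 + it)}{\zeta(\sigma_1 + it)} \Big| \ll \frac{\sigma_2 - 1}{\sigma_1 - 1} \]
uniformly in $t$.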
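7. (Montgomery & Vaughan 2001)
(a) Show that if $\sigma > 1$, then
\[ \Big| \frac{\zeta(\sigma + i(t + 1))}{\zeta(\sigma + it)} \Big| \le \exp\Big( 2\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{\sigma}\log n} \big| \sin\tfrac12 \log n \big| \Big) \]
uniformly for all real $t$.
(b) Put $f(\theta) = |\sin\pi\theta|$, and for integers $k$ set $\widehat{f}(k) = \int_0^1 f(\theta) e(-k\theta)\, d\theta$ where $e(\theta) = e^{2\pi i\theta}$. Show that $\widehat{f}(k) = -2/(\pi(4k^2 - 1))$.
(c) By Corollary D.3, or otherwise, show that
\[ |\sin\pi\theta| = \sum_{k=-\infty}^{\infty} \widehat{f}(k) e(k\theta). \]
(d) Show that if $1 < \sigma \le 2$, then
\[ \Big| \frac{\zeta(\sigma + i(t + 1))}{\zeta(\sigma + it)} \Big| \le \prod_{k=-\infty}^{\infty} |\zeta(\sigma + ik)|^{2\widehat{f}(k)} \]
uniformly for all real $t$.
(e) Show that if $\sigma > 1$, then
\[ (\sigma - 1)^{4/\pi} \ll \Big| \frac{\zeta(\sigma + i(t + 1))}{\zeta(\sigma + it)} \Big| \ll (\sigma - 1)^{-4/\pi} \]
uniformly in $t$.
(f) Show that
\[ (\log t)^{-4/\pi} \ll \Big| \frac{\zeta(1 + i(t + 1))}{\zeta(1 + it)} \Big| \ll (\log t)^{4/\pi} \]
uniformly for $t \ge 2$.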
8. Suppose that $a$ and $b$ are fixed, $0 < a < b < 1$. Suppose that $f$ is analytic in a domain containing the disc $|z| \le R$, that $f(0) \ne 0$, and that $|f(z)| \le M$ for $|z| \le R$. Show that
\[ \frac{f'}{f}(z) = \sum_{k=1}^{K} \frac{1}{z - z_k} + O\Big( \frac{1}{R} \log \frac{M}{|f(0)|} \Big) \]
for $|z| \le aR$ where the sum is over those zeros $z_k$ of $f(z)$ for which $|z_k| \le bR$.
9. (Landau 1924a) Suppose that $\theta(t)$ and $\phi(t)$ are functions with the following properties: $\phi(t) > 0$, $\phi(t) \nearrow$, $e^{-\phi(t)} \le \theta(t) \le 1/2$, $\theta(t) \searrow$. Suppose also that
\[ \zeta(s) \ll e^{\phi(t)} \]
for $\sigma \ge 1 - \theta(t)$, $t \ge 2$.
(a) Show that
\[ \frac{\zeta'}{\zeta}(s) = \sum_{\rho} \frac{1}{s - \rho} + O\Big( \frac{\phi(t + 1)}{\theta(t + 1)} \Big) \]
for $\sigma \ge 1 - \theta(t + 1)/3$ where the sum is over zeros $\rho$ for which $|\rho - (1 + \theta(t + 1) + it)| \le 5\theta(t + 1)/3$.
(b) Show that there is an absolute constant $c > 0$ such that $\zeta(s) \ne 0$ for
\[ \sigma \ge 1 - c\,\frac{\theta(2t + 1)}{\phi(2t + 1)}. \]
(c) Show that the zero-free region (6.26) follows from the estimate (6.25).
(d) By mimicking the proof of Theorem 6.7, but with $s_1 = 1 + \theta(2t + 1)/\phi(2t + 1) + it$, show that
\[ \frac{\zeta'}{\zeta}(s) \ll \frac{\phi(2t + 2)}{\theta(2t + 2)}, \]
\[ |\log\zeta(s)| \le \log \frac{\phi(2t + 2)}{\theta(2t + 2)} + O(1), \]
\[ \frac{1}{\zeta(s)} \ll \frac{\phi(2t + 2)}{\theta(2t + 2)} \]
for $\sigma \ge 1 - \tfrac12 c\,\theta(2t + 2)/\phi(2t + 2)$.
10. Suppose that $\zeta(s) \ne 0$ for $\sigma \ge 1 - \eta(t)$, $t \ge 2$, where $\eta(t) \searrow$, $\eta(t) \gg 1/\log t$. Show that
\[ \frac{\zeta'}{\zeta}(s) \ll \log t \]
for $\sigma \ge 1 - \tfrac12 \eta(t + 1)$, $t \ge 2$.

6.2 The Prime Number Theorem

We are now in a position to prove the Prime Number Theorem in a quantitative form. We apply Perron's formula to $\frac{\zeta'}{\zeta}(s)$ to obtain an asymptotic estimate for
\[ \psi(x) = \sum_{n \le x} \Lambda(n), \]
and then use partial summation to derive an estimate for $\pi(x)$. It would be more direct to apply Perron's formula to $\log\zeta(s)$, but our approach is technically simpler since $\log\zeta(s)$ has a logarithmic singularity at $s = 1$ while $\frac{\zeta'}{\zeta}(s)$ has only a simple pole there.

Theorem 6.9 There is a constant $c > 0$ such that
\[ \psi(x) = x + O\Big( \frac{x}{\exp(c\sqrt{\log x})} \Big), \tag{6.12} \]
\[ \vartheta(x) = x + O\Big( \frac{x}{\exp(c\sqrt{\log x})} \Big), \tag{6.13} \]
and
\[ \pi(x) = \operatorname{li}(x) + O\Big( \frac{x}{\exp(c\sqrt{\log x})} \Big) \tag{6.14} \]
uniformly for $x \ge 2$.

Here $\operatorname{li}(x)$ is the logarithmic integral,
\[ \operatorname{li}(x) = \int_2^x \frac{1}{\log u}\, du. \]
By integrating this integral by parts $K$ times we see that
\[ \operatorname{li}(x) = x\sum_{k=1}^{K-1} \frac{(k - 1)!}{(\log x)^k} + O_K\Big( \frac{x}{(\log x)^K} \Big). \tag{6.15} \]
On combining this with (6.14) we see that
\[ \pi(x) = \frac{x}{\log x} + O\Big( \frac{x}{(\log x)^2} \Big). \]
This is a quantitative form of the Prime Number Theorem. When this main term is used, the error term is genuinely of the indicated size, since by (6.14) and (6.15) again we see that
\[ \pi(x) = \frac{x}{\log x} + \frac{x}{(\log x)^2} + O\Big( \frac{x}{(\log x)^3} \Big). \]
Thus we see that in order to obtain a precise estimate of $\pi(x)$, it is essential to use the logarithmic integral (or some similar function) to express the main term.
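The gap between the two main terms is already visible at modest heights. The sketch below counts primes with a sieve and evaluates $\operatorname{li}(x) = \operatorname{Li}(x) - \operatorname{Li}(2)$ through the power series (6.21) of Exercise 6.2.23; the cut-off $x = 10^6$, the number of series terms and the function names are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant C_0

def Li(x, terms=200):
    """Li(x) = log log x + C_0 + sum_{n>=1} (log x)^n / (n! n), for x > 1."""
    L = math.log(x)
    total = math.log(L) + EULER_GAMMA
    term = 1.0
    for n in range(1, terms + 1):
        term *= L / n          # term now equals L^n / n!
        total += term / n
    return total

def li(x):
    """The logarithmic integral from 2 to x, as used in Theorem 6.9."""
    return Li(x) - Li(2)

def prime_count(x):
    """pi(x) for an integer x >= 2, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

x = 10 ** 6
pi_x = prime_count(x)
li_x = li(x)
naive = x / math.log(x)
print(pi_x, li_x, naive)
```

At this height $\operatorname{li}(x)$ misses $\pi(x)$ by a little over a hundred, while $x/\log x$ is off by several thousand.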
Proof From Corollary 1.11 and Theorem 5.2 we see that
\[ \psi(x) = \frac{-1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \frac{\zeta'}{\zeta}(s) \frac{x^s}{s}\, ds + R \tag{6.16} \]
for $\sigma_0 > 1$, where by Corollary 5.3 we see that
\[ R \ll \sum_{x/2 < n < 2x} \Lambda(n) \min\Big( 1, \frac{x}{T|x - n|} \Big) + \frac{(4x)^{\sigma_0}}{T} \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{\sigma_0}}. \]
Here the second sum is $-\frac{\zeta'}{\zeta}(\sigma_0)$, which is $\asymp 1/(\sigma_0 - 1)$ for $1 < \sigma_0 \le 2$. To estimate the first sum we note that $\Lambda(n) \le \log n \ll \log x$. For the $n$ that is nearest to $x$ we replace the minimum by its first member, and for all other values of $n$ we replace it by its second member. Thus the first sum is
\[ \ll (\log x)\Big( 1 + \frac{x}{T} \sum_{1 \le k \le x} \frac{1}{k} \Big) \ll \log x + \frac{x}{T}(\log x)^2. \]
Suppose that $2 \le T \le x$ and that $\sigma_0 = 1 + 1/\log x$. Then
\[ R \ll \frac{x}{T}(\log x)^2. \]
Put $\sigma_1 = 1 - c/\log T$ where $c$ is a small positive constant, and let $\mathcal{C}$ denote the closed contour that consists of line segments joining the points $\sigma_0 - iT$, $\sigma_0 + iT$, $\sigma_1 + iT$, $\sigma_1 - iT$. From Theorem 6.6 we know that $\frac{\zeta'}{\zeta}(s)$ has a simple pole with residue $-1$ at $s = 1$, but that it is otherwise analytic within $\mathcal{C}$. Hence by the calculus of residues,
\[ \frac{-1}{2\pi i} \oint_{\mathcal{C}} \frac{\zeta'}{\zeta}(s) \frac{x^s}{s}\, ds = x. \]
If $c$ is small, then the estimate (6.6) of Theorem 6.7 applies on this contour. Hence
\[ \int_{\sigma_0 + iT}^{\sigma_1 + iT} -\frac{\zeta'}{\zeta}(s) \frac{x^s}{s}\, ds \ll \frac{\log T}{T}\, x^{\sigma_0} (\sigma_0 - \sigma_1) \ll \frac{x}{T}, \]
and similarly for the integral from $\sigma_1 - iT$ to $\sigma_0 - iT$. Using (6.6) again, we also see that
\[ \int_{\sigma_1 + iT}^{\sigma_1 - iT} -\frac{\zeta'}{\zeta}(s) \frac{x^s}{s}\, ds \ll x^{\sigma_1} (\log T) \int_{-T}^{T} \frac{dt}{1 + |t|} + x^{\sigma_1} \int_{-1}^{1} \frac{dt}{|\sigma_1 + it - 1|} \ll x^{\sigma_1} (\log T)^2 + \frac{x^{\sigma_1}}{1 - \sigma_1} \ll x^{\sigma_1} (\log T)^2. \]
On combining these estimates we conclude that
\[ \psi(x) = x + O\Big( x(\log x)^2 \Big( \frac{1}{T} + x^{-c/\log T} \Big) \Big). \]
We choose $T$ so that the two terms in the last factor of the error term are equal, i.e., $T = \exp\big(\sqrt{c\log x}\big)$. With this choice of $T$, the error term above is
\[ \ll x(\log x)^2 \exp\big( -\sqrt{c\log x} \big) \ll x\exp\big( -c\sqrt{\log x} \big) \]
since we may suppose that $0 < c < 1$. Thus the proof of (6.12) is complete.
To derive (6.13) it suffices to combine (6.12) with the first estimate of Corollary 2.5. As for (6.14), we note that
\[ \pi(x) = \int_{2^-}^{x} \frac{1}{\log u}\, d\vartheta(u) = \operatorname{li}(x) + \int_{2^-}^{x} \frac{1}{\log u}\, d(\vartheta(u) - u). \]
By integrating by parts we see that this last integral is
\[ \frac{\vartheta(u) - u}{\log u} \bigg|_{2^-}^{x} + \int_2^x \frac{\vartheta(u) - u}{u(\log u)^2}\, du, \]
and by (6.13) it follows that this is $\ll x\exp(-c\sqrt{\log x})$. Thus we have (6.14), and the proof is complete.
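To see the estimate (6.12) at work on a small scale, one can compute $\psi(x)$ directly from its definition. The following is a minimal sketch (the function name and the sample heights are assumptions); it sums $\log p$ once for each prime power $p^k \le x$.

```python
import math

def chebyshev_psi(x):
    """psi(x) = sum of Lambda(n) over n <= x, for an integer x >= 2."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            if p * p <= x:
                # Strike out the multiples of p, starting from p^2.
                sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
            # Count the powers p, p^2, ... not exceeding x.
            power, k = p, 0
            while power <= x:
                k += 1
                power *= p
            total += k * math.log(p)
    return total

psi_10 = chebyshev_psi(10)
psi_big = chebyshev_psi(10 ** 5)
print(psi_10, math.log(2520), psi_big)
```

The value at $x = 10$ equals $\log 2520 = \log \operatorname{lcm}(1, \ldots, 10)$, in agreement with Exercise 6.2.7(a), and at $x = 10^5$ the difference $\psi(x) - x$ is tiny compared with $x$.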
The method we used to derive Theorem 6.9 is very flexible, and can be applied to many other situations. For example, the summatory function
\[ M(x) = \sum_{n \le x} \mu(n) \]
can be estimated by applying the above method with $\zeta'/\zeta$ replaced by $1/\zeta$. Thus it may be shown that
\[ M(x) \ll x\exp\big( -c\sqrt{\log x} \big) \tag{6.17} \]
for $x \ge 2$. If instead we were to apply the method to the function $1/\zeta(s + 1)$, we would find that
\[ \sum_{n \le x} \frac{\mu(n)}{n} \ll \exp\big( -c\sqrt{\log x} \big), \tag{6.18} \]
since $1/(s\zeta(s + 1))$ is analytic at $s = 0$. Hence in particular,
\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n} = 0. \tag{6.19} \]
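The cancellation expressed by (6.17) and (6.19) can be observed numerically. The sketch below tabulates $\mu(n)$ with a sieve and forms $M(x)$ and the partial sums of $\sum \mu(n)/n$; the function names and the range $N = 10^4$ are illustrative assumptions.

```python
def mobius_sieve(N):
    """Return a list mu with mu[n] = Moebius function of n for 0 <= n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p, N + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one sign change per prime factor
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0               # n divisible by p^2 is not square-free
    return mu

def mertens(mu, x):
    """M(x) = sum of mu(n) over n <= x."""
    return sum(mu[1:x + 1])

def mobius_series(mu, x):
    """Partial sum of mu(n)/n over n <= x."""
    return sum(mu[n] / n for n in range(1, x + 1))

N = 10 ** 4
mu = mobius_sieve(N)
print([mertens(mu, x) for x in (100, 1000, 10000)], mobius_series(mu, N))
```

Both quantities are far smaller than the trivial bounds $x$ and $\log x$, though a computation of this size says nothing about the true rate of decay.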

6.2.1 Exercises

1. (Landau 1901b; cf. Rosser & Schoenfeld 1962) Use Theorem 6.9 to show that
\[ \pi(2x) - 2\pi(x) = -2(\log 2)x(\log x)^{-2} + O\big( x(\log x)^{-3} \big). \]
Deduce that for all large $x$, the interval $(x, 2x]$ contains fewer prime numbers than the interval $(0, x]$.
2. Use Theorem 6.9 to show that if $n$ is of the form $n = \prod_{p \le y} p$ where $y$ is sufficiently large, then $d(n) > n^{(\log 2)/\log\log n}$.
3. (a) Use Theorem 6.9 to show that
\[ \sum_{x < p \le y} \frac{1}{p} = \log\frac{\log y}{\log x} + O\big( \exp(-c\sqrt{\log x}) \big). \]
(b) Use the above and Theorem 2.7 to show that
\[ \sum_{p \le x} \frac{1}{p} = \log\log x + b + O\big( \exp(-c\sqrt{\log x}) \big) \]
where $b = C_0 - \sum_{p} \sum_{k=2}^{\infty} 1/(kp^k)$.
4. Show that for $x \ge 2$,
\[ \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x - C_0 + O\big( \exp(-c\sqrt{\log x}) \big). \]
5. (cf. Cipolla 1902; Rosser 1939) Let $p_1 < p_2 < \cdots$ denote the prime numbers. Show that
\[ p_n = n\Big( \log n + \log\log n - 1 + \frac{\log\log n}{\log n} - \frac{2}{\log n} + O\Big( \frac{(\log\log n)^2}{(\log n)^2} \Big) \Big). \]
6. (Landau 1900) Let $\pi_k(x)$ denote the number of integers not exceeding $x$ that are composed of exactly $k$ distinct primes.
(a) Show that
\[ \pi_2(x) = \sum_{p \le \sqrt{x}} \pi(x/p) + O\big( x(\log x)^{-2} \big). \]
(b) Show that the sum above is
\[ \sum_{p \le \sqrt{x}} \frac{x}{p\log x/p} + O\big( x(\log\log x)(\log x)^{-2} \big). \]
(c) Using Theorem 6.9 and integration by parts, show that the sum above is
\[ x\int_2^{\sqrt{x}} \frac{du}{u(\log x/u)\log u} + O(x/\log x). \]
(d) Conclude that $\pi_2(x) = x(\log\log x)/\log x + O(x/\log x)$.
7. (D. E. Knutson) Let $d_n$ denote the least common multiple of the numbers $1, 2, \ldots, n$.
(a) Show that $d_n = \exp(\psi(n))$.
(b) Let $E(z) = \sum_{n=1}^{\infty} z^n/d_n$. Show that this power series has radius of convergence $e$.
(c) Show that $E(1)$ is irrational.
8. (Landau 1905) Let $Q(x)$ denote the number of square-free integers not exceeding $x$, and define $R(x)$ by the relation $Q(x) = (6/\pi^2)x + R(x)$.
(a) Show that
\[ R(x) = M(y)\{x/y^2\} - \sum_{d \le y} \mu(d)\{x/d^2\} + \sum_{m \le x/y^2} M\big( \sqrt{x/m} \big) - 2x\int_y^{\infty} M(u) u^{-3}\, du. \]
(b) Taking $y = x^{1/2}\exp(-c\sqrt{\log x})$ where $c$ is sufficiently small, show that $R(x) \ll x^{1/2}\exp(-c\sqrt{\log x})$.

9. Let $N = N(Q) = 1 + \sum_{q \le Q} \varphi(q)$ be the number of Farey points of order $Q$, and for $0 \le \alpha \le 1$ write
\[ \operatorname{card}\{(a, q) : q \le Q,\ (a, q) = 1,\ a/q \le \alpha\} = N\alpha + R \]
where $R = R(Q, \alpha)$.
(a) Show that if $\alpha = (1/Q)^-$, then $R = -N/Q \asymp -Q$.
(b) Show that if $\alpha = 1 - 1/Q$, then $R = N/Q - 1 \asymp Q$.
(c) Show that
\[ R = -\sum_{r \le Q} \{r\alpha\} M(Q/r) \]
for $0 \le \alpha \le 1$.
(d) Show that $R \ll Q$ uniformly for $0 \le \alpha \le 1$.
10. (Landau 1903b; Massias, Nicolas & Robin 1988, 1989) Let $f(n)$ denote the maximal order of any element of the symmetric group $S_n$.
(a) Show that $f(n) = \max \operatorname{lcm}(n_1, n_2, \ldots, n_k)$ where the maximum is extended over all sets $\{n_1, n_2, \ldots, n_k\}$ of natural numbers for which $n_1 + n_2 + \cdots + n_k \le n$.
(b) Choose $y$ as large as possible so that $\sum_{p \le y} p \le n$. Show that
\[ \log f(n) \ge \sum_{p \le y} \log p = (1 + o(1))(n\log n)^{1/2}. \]
(c) Show that $f(n) = \max q_1 q_2 \cdots q_k$ where $q_i = p_i^{a(i)}$, $p_i \ne p_j$ for $i \ne j$, and $\sum q_i \le n$.
(d) Use the arithmetic–geometric mean inequality to show that $\prod q_i \le (n/k)^k$.
(e) Show that if $k$ is the number of $q_i$'s in (c), then $k \le (2 + o(1))(n/\log n)^{1/2}$.
(f) Conclude that $\log f(n) \ll (n\log n)^{1/2}$.
11. Let $\lambda(n) = (-1)^{\Omega(n)}$ be Liouville's lambda function.
(a) Show that $\sum_{n=1}^{\infty} \lambda(n) n^{-s} = \zeta(2s)/\zeta(s)$ for $\sigma > 1$.
(b) Using the method of proof of Theorem 6.9, show that
\[ \sum_{n \le x} \lambda(n) \ll x\exp\big( -c\sqrt{\log x} \big). \]
(c) Use (6.17) and the fact that $\lambda(n) = \sum_{d^2 | n} \mu(n/d^2)$ to give a second proof of the above estimate.
12. (Landau 1907, Section 14) Let $c_n = 1$ if $n$ is a prime or a prime power, $c_n = 0$ otherwise.
(a) Show that $\mu(n)\omega(n) = -\sum_{d | n} c_d\, \mu(n/d)$.
(b) Use (6.18) and the method of the hyperbola to show that
\[ \sum_{n=1}^{\infty} \frac{\mu(n)\omega(n)}{n} = 0. \]
n
13. Use the method of proof of Theorem 6.9 to show that
\[ \sum_{n \le x} \Lambda(n) n^{-it} = \frac{x^{1-it}}{1 - it} + O\big( x\exp(-c\sqrt{\log x}) \big) + O\Big( x(\log x)^2 \exp\Big( \frac{-c\log x}{\log\tau} \Big) \Big) \]
uniformly for $|t| \le x$.
14. Use the method of proof of Theorem 6.9 to show that for any fixed real $t$,
\[ \sum_{n=1}^{\infty} \mu(n) n^{-1-it} = \frac{1}{\zeta(1 + it)}. \]
15. (a) Use the method of proof of Theorem 6.9 to show that for any fixed $t \ne 0$,
\[ \sum_{n=1}^{\infty} \frac{\Lambda(n)}{\log n} n^{-1-it} = \log\zeta(1 + it). \]
(b) Deduce that for any $t \ne 0$,
\[ \prod_{p} (1 - p^{-1-it})^{-1} = \zeta(1 + it). \]
16. (Landau 1899b, 1901a, 1903c) Use the method of proof of Theorem 6.9 to show that
(a) $\displaystyle \sum_{n=1}^{\infty} \frac{\mu(n)\log n}{n} = -1$;
(b) $\displaystyle \sum_{n=1}^{\infty} \frac{\mu(n)(\log n)^2}{n} = -2C_0$;
(c) $\displaystyle \sum_{n=1}^{\infty} \frac{\lambda(n)\log n}{n} = -\zeta(2)$.
17. Taking (6.18) and a quantitative form of the first part of the preceding exercise for granted, use elementary reasoning to show that if $q \le x$ then
(a) $\displaystyle \sum_{\substack{n \le x \\ (n,q)=1}} \frac{\mu(n)}{n} \ll \exp\big( -c\sqrt{\log x} \big)$,
(b) $\displaystyle \sum_{\substack{n \le x \\ (n,q)=1}} \frac{\mu(n)\log n}{n} = -\frac{q}{\varphi(q)} + O\big( \exp(-c\sqrt{\log x}) \big)$.
18. (Hardy 1921) Use the method of proof of Theorem 6.9 to show that
(a) $\displaystyle \sum_{n=1}^{\infty} \frac{\mu(n)}{\varphi(n)} = 0$;
(b) $\displaystyle \sum_{n=1}^{\infty} \frac{\mu(n)\log n}{\varphi(n)} = 0$;
(c) $\displaystyle \sum_{n=1}^{\infty} \frac{\mu(n)(\log n)^2}{\varphi(n)} = 4A\log 2$
where $A = \prod_{p > 2} \big( 1 - \frac{1}{(p-1)^2} \big)$.
19. Let $Q(x)$ denote the number of square-free integers not exceeding $x$, and recall Theorem 2.2.
(a) Show that
\[ Q(x) = \frac{6}{\pi^2}x - x\sum_{n > \sqrt{x}} \frac{\mu(n)}{n^2} - \sum_{n \le \sqrt{x}} \mu(n)\{x/n^2\} \]
where $\{\theta\} = \theta - [\theta]$ is the fractional part of $\theta$.
(b) Show that $\sum_{n > y} \mu(n)/n^2 \ll y^{-1}\exp(-c\sqrt{\log y})$ for $y \ge 2$.
(c) Note that if $k$ is a positive integer, then $\{x/n^2\}$ is monotonic for $n$ in the interval $\sqrt{x/(k + 1)} < n \le \sqrt{x/k}$. Deduce that if $x \ge 2k^2$, then
\[ \sum_{\sqrt{x/(k+1)} < n \le \sqrt{x/k}} \mu(n)\{x/n^2\} \ll \sqrt{x/k}\, \exp\big( -c\sqrt{\log x} \big). \]
(d) By using the above for $1 \le k \le K = \exp(-b\sqrt{\log x})$ where $b$ is suitably chosen in terms of $c$, show that
\[ Q(x) = \frac{6}{\pi^2}x + O\Big( x^{1/2}\exp\Big( -\frac{c}{2}\sqrt{\log x} \Big) \Big). \]

20. (Ingham 1945) Let $F(n) = \sum_{d | n} f(d)$ for all $n$. From our remarks at the beginning of Chapter 2 we see that it is natural to expect a connection between
(i) $S(x) := \sum_{n \le x} F(n) = cx + o(x)$;
(ii) $\sum_{n=1}^{\infty} f(n)/n = c$.
Neither of these implies the other, but we show now that (i) implies that the series (ii) is (C,1) summable to $c$.
(a) Show that $S(x) = \sum_{n \le x} f(n)[x/n]$.
(b) Show that
\[ \sum_{n \le x} \frac{f(n)}{n}\Big( 1 - \frac{n}{x} \Big) = \int_1^x S(v)\Big( \sum_{d \le x/v} \mu(d)/d \Big) \frac{dv}{v^2}. \]
(c) Show that
\[ \int_1^x \Big( \sum_{d \le x/v} \frac{\mu(d)}{d} \Big) \frac{dv}{v} \to 1 \]
as $x \to \infty$.
(d) Use the estimate $\sum_{d \le y} \mu(d)/d \ll (\log 2y)^{-2}$ to show that
\[ \int_1^x \Big| \sum_{d \le x/v} \frac{\mu(d)}{d} \Big| \frac{dv}{v} \ll 1. \]
(e) Mimic the proof of Theorem 5.5, or use Exercise 5.2.6, to show that if (i) holds, then
\[ \lim_{x \to \infty} \sum_{n \le x} \frac{f(n)}{n}\Big( 1 - \frac{n}{x} \Big) = c. \]
(f) Use Theorem 5.6 to show that if (i) holds and $f(n) = O(1)$, then (ii) follows.
(g) Take $f(n) = \mu(n)$ to deduce that $\sum_{n=1}^{\infty} \mu(n)/n = 0$. (Of course we used much more above in (d). For a result in the converse direction, see Exercise 8.1.5.)
21. (Landau 1908b) Let $\mathcal{R}$ be the set of positive integers that can be expressed as a sum of two squares, let $R(x)$ denote the number of such integers not exceeding $x$, and let $\chi_1$ denote the non-principal character (mod 4), as in Exercise 6.1.5.
(a) Show that
\[ \sum_{n \in \mathcal{R}} n^{-s} = (1 - 2^{-s})^{-1} \prod_{p \equiv 1\ (4)} (1 - p^{-s})^{-1} \prod_{p \equiv 3\ (4)} (1 - p^{-2s})^{-1} \]
for $\sigma > 1$.
(b) Show that the Dirichlet series above is $f(s)\sqrt{\zeta(s)L(s, \chi_1)}$ where
\[ f(s) = (1 - 2^{-s})^{-1/2} \prod_{p \equiv 3\ (4)} (1 - p^{-2s})^{-1/2} \]
is a Dirichlet series with abscissa of convergence $\sigma_c = 1/2$.
(c) Deduce that the Dirichlet series generating function for $\mathcal{R}$ has a quadratic singularity at $s = 1$.
(d) Show that
\[ R(x) = \frac{1}{2\pi i} \int_{\mathcal{C}} f(s)\sqrt{\zeta(s)L(s, \chi_1)}\, \frac{x^s}{s}\, ds + O\big( x\exp(-c\sqrt{\log x}) \big) \]
where $\mathcal{C}$ is the contour running from $1 - c - i\delta$ along a straight line to $1 - i\delta$, then along the semicircle $1 + \delta e^{i\theta}$, $-\pi/2 \le \theta \le \pi/2$, and finally along a straight line to $1 - c + i\delta$. Here $c$ should be sufficiently small and $\delta = 1/\log x$.
(e) Show that the integral above is
\[ = \frac{1}{2\pi i} \int_{\mathcal{C}} \frac{g(s) x^s}{\sqrt{s - 1}}\, ds \]
where
\[ g(s) = \frac{f(s)}{s} \sqrt{(s - 1)\zeta(s)L(s, \chi_1)} \]
is analytic in a neighbourhood of 1.
(f) Show that
\[ g(1) = \sqrt{\frac{\pi}{2}} \prod_{p \equiv 3\ (4)} (1 - p^{-2})^{-1/2}. \]
(g) Show that $g(s) = g(1) + O(|s - 1|)$ when $s$ is near 1.
(h) By means of Theorem C.3 with $s = 1/2$, or otherwise, show that
\[ \frac{1}{2\pi i} \int_{\mathcal{C}} \frac{x^s}{\sqrt{s - 1}}\, ds = \frac{x}{\sqrt{\pi\log x}} + O(x^{1-c}). \]
(i) Show that if $\delta = 1/\log x$, then
\[ \int_{\mathcal{C}} |s - 1|^{1/2} x^{\sigma}\, |ds| \ll \frac{x}{(\log x)^{3/2}}. \]
(j) Show that
\[ R(x) = \frac{bx}{\sqrt{\log x}} + O\big( x(\log x)^{-3/2} \big) \]
where
\[ b = 2^{-1/2} \prod_{p \equiv 3\ (4)} (1 - p^{-2})^{-1/2}. \]

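22. Let $\mathcal{A}$ denote the set of those positive integers that are composed entirely of the prime 2 and primes $\equiv 1 \pmod 4$, and let $\mathcal{B}$ be the set of those positive integers that are composed entirely of primes $\equiv 3 \pmod 4$.
(a) Explain why any positive integer $n$ has a unique representation in the form $n = a(n)b(n)$ where $a(n) \in \mathcal{A}$ and $b(n) \in \mathcal{B}$.
(b) Let $A(x)$ denote the number of $a \in \mathcal{A}$, $a \le x$. Show that
\[ A(x) = \frac{\alpha x}{\sqrt{\log x}} + O\Big( \frac{x}{(\log x)^{3/2}} \Big) \]
where $\alpha = 1/\sqrt{2}$.
(c) Let $B(x)$ denote the number of $b \in \mathcal{B}$, $b \le x$. Show that
\[ B(x) = \frac{\beta x}{\sqrt{\log x}} + O\Big( \frac{x}{(\log x)^{3/2}} \Big) \]
where $\beta = \sqrt{2}/\pi$.
(d) For $0 \le \kappa \le 1$ let $N_\kappa(x)$ denote the number of $n \le x$ such that $a(n) \le n^{\kappa}$. Show that
\[ N_\kappa(x) = \sum_{\substack{a \le x^{\kappa} \\ a \in \mathcal{A}}}\ \sum_{\substack{a^{1/\kappa - 1} \le b \le x/a \\ b \in \mathcal{B}}} 1. \]
(e) Show that if $\kappa$ is fixed, $0 \le \kappa \le 1$, then
\[ N_\kappa(x) = c(\kappa)x + O\Big( \frac{x}{\sqrt{\log x}} \Big) \]
where
\[ c(\kappa) = \frac{1}{\pi} \int_0^{\kappa} \frac{du}{\sqrt{u(1 - u)}}. \]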
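23. The definition of $\operatorname{li}(x)$ is somewhat arbitrary because of the casual choice of the lower endpoint of integration. A more intrinsic logarithmic integral is $\operatorname{Li}(x)$, which is defined to be
\[ \operatorname{Li}(x) = \lim_{\varepsilon \to 0^+} \Big( \int_0^{1-\varepsilon} + \int_{1+\varepsilon}^{x} \Big) \frac{dt}{\log t} \tag{6.20} \]
for $x > 1$. (Note that $\operatorname{li}(x) = \operatorname{Li}(x) - \operatorname{Li}(2)$.)
(a) Show that
\[ \int_0^{1-\varepsilon} \frac{dt}{\log t} = -\int_{-\log(1-\varepsilon)}^{\infty} e^{-v}\, \frac{dv}{v}. \]
(b) Show that
\[ \int_0^{1-\varepsilon} \frac{dt}{\log t} = \log\varepsilon - \int_0^{\infty} (\log v) e^{-v}\, dv + O(\varepsilon\log 1/\varepsilon), \]
and explain why the integral on the right is $\Gamma'(1) = -C_0$.
(c) Show that if $x > 1$, then
\[ \int_{1+\varepsilon}^{x} \frac{dt}{\log t} = \int_{\log(1+\varepsilon)}^{\log x} e^{v}\, \frac{dv}{v}. \]
(d) Show that if $x > 1$, then
\[ \int_{1+\varepsilon}^{x} \frac{dt}{\log t} = \log\log x - \log\varepsilon + \int_0^{\log x} \frac{e^v - 1}{v}\, dv + O(\varepsilon). \]
(e) Show that if $x > 1$, then
\[ \operatorname{Li}(x) = \log\log x + C_0 + \int_0^{\log x} \frac{e^v - 1}{v}\, dv. \]
(f) Expand $e^v$ as a power series, and integrate term-by-term, to show that if $x > 1$, then
\[ \operatorname{Li}(x) = \log\log x + C_0 + \sum_{n=1}^{\infty} \frac{(\log x)^n}{n!\, n}. \tag{6.21} \]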

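24. For $0 < x < 1$ let
\[ \operatorname{Li}(x) = \int_0^x \frac{dt}{\log t}. \]
(a) Show that if $0 < x < 1$, then
\[ \operatorname{Li}(x) = x\log\log 1/x - \int_{-\log x}^{\infty} e^{-v}\log v\, dv. \]
(b) Show that if $0 < x < 1$, then
\[ \operatorname{Li}(x) = x\log\log 1/x + C_0 + \int_0^{-\log x} e^{-v}\log v\, dv. \]
(c) Show that if $0 < x < 1$, then
\[ \operatorname{Li}(x) = \log\log 1/x + C_0 - \int_0^{-\log x} \frac{1 - e^{-v}}{v}\, dv. \]
(d) Show that if $0 < x < 1$, then
\[ \operatorname{Li}(x) = \log\log 1/x + C_0 + \sum_{n=1}^{\infty} \frac{(\log x)^n}{n!\, n}. \]
(e) (Pólya & Szegö 1972, p. 8) Show that
\[ \sum_{n=1}^{\infty} \frac{z^n}{n!\, n} = -e^{z} \sum_{n=1}^{\infty} \Big( \sum_{k=1}^{n} \frac{1}{k} \Big) \frac{(-z)^n}{n!}. \]
(f) Show that if $0 < x < 1$, then
\[ \operatorname{Li}(x) = \log\log 1/x + C_0 - x\sum_{n=1}^{\infty} \Big( \sum_{k=1}^{n} \frac{1}{k} \Big) \frac{(\log 1/x)^n}{n!}. \tag{6.22} \]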

25. By repeated integration by parts we know that
\[ \operatorname{Li}(x) = x\sum_{k=1}^{K} \frac{(k - 1)!}{(\log x)^k} + O_K\Big( \frac{x}{(\log x)^{K+1}} \Big). \]
Our object is to determine how closely one can approximate to $\operatorname{Li}(x)$ by partial sums of the formal asymptotic expansion
\[ \operatorname{Li}(x) \sim x\sum_{k=1}^{\infty} \frac{(k - 1)!}{(\log x)^k}. \]
(a) Show that the least term in the sum above occurs when $k = [\log x] + 1$.
(b) Show that if $x \ge e^K$, then
\[ \operatorname{Li}(x) = x\sum_{k=1}^{K} \frac{(k - 1)!}{(\log x)^k} + \operatorname{Li}(e) + \sum_{k=1}^{K-1} \Big( k!\int_{e^k}^{e^{k+1}} \frac{dt}{(\log t)^{k+1}} - \frac{(k - 1)!\, e^k}{k^k} \Big) - \frac{(K - 1)!\, e^K}{K^K} + K!\int_{e^K}^{x} \frac{dt}{(\log t)^{K+1}}. \]
(c) Define $R(x)$ by the relation
\[ \operatorname{Li}(x) = x\sum_{k=1}^{[\log x]} \frac{(k - 1)!}{(\log x)^k} + R(x). \]
Show that $R(x)$ is increasing, continuous, and convex downward for $x \in [e^K, e^{K+1})$. Let $\alpha_K = R(e^K)$, and let $\beta_K$ be the limit of $R(x)$ as $x$ tends to $e^{K+1}$ from below.
(d) Show that
\[ \int_{e^K}^{e^{K+1}} \frac{dt}{(\log t)^{K+1}} = \frac{e^K}{K^K} \int_0^{1/K} \frac{e^{Kw}}{(1 + w)^{K+1}}\, dw. \]
(e) Show that the integrand on the right above is $\le 1$ in the range of integration.
(f) Show that the minimum of $e^{Kw}/(1 + w)^{K+1}$ for $w > 0$ occurs when $w = 1/K$.
(g) Show that
\[ \frac{e^{K+1}}{(K + 1)^{K+1}} < \int_{e^K}^{e^{K+1}} \frac{dt}{(\log t)^{K+1}} < \frac{e^K}{K^{K+1}}. \]
(h) Show that $\alpha_K \nearrow$ and that $\beta_K \searrow$.
(i) Show that $\beta_K - \alpha_K \ll K^{-1/2}$.
(j) Show that $R(x) = c + O((\log x)^{-1/2})$ where
\[ c = \operatorname{Li}(e) + \sum_{k=1}^{\infty} \Big( k!\int_{e^k}^{e^{k+1}} \frac{dt}{(\log t)^{k+1}} - \frac{(k - 1)!\, e^k}{k^k} \Big). \]
(k) Show that if $x \ge e$, then
\[ \alpha_1 \le \operatorname{Li}(x) - x\sum_{k=1}^{[\log x]} \frac{(k - 1)!}{(\log x)^k} \le \beta_1 \tag{6.23} \]
where $\alpha_1 = -0.82316\ldots$ and $\beta_1 = 1.259706\ldots$.
26. (Ingham 1932, pp. 60–63) Suppose that $\eta(t)$ is defined for $t \ge 2$, that $\eta'(t)$ is continuous, $\eta'(t) \to 0$ as $t \to \infty$, that $\eta(t) \searrow$, that $1/\log t \ll \eta(t) \le 1/2$, and that $\zeta(s) \ne 0$ for $\sigma \ge 1 - \eta(t)$, $t \ge 2$. For $x \ge 2$, put
\[ \omega(x) = \min_{2 \le t < \infty} \big( \eta(t)\log x + \log t \big). \]
(a) Show that there is an absolute constant $c > 0$ such that
\[ \pi(x) = \operatorname{li}(x) + O\big( x\exp(-c\,\omega(x)) \big). \]
(b) Show that if $a > 0$ is fixed and (6.24) below holds, then (6.27) below holds with $b = 1/(1 + a)$.
(c) Show that (6.28) follows from (6.26).

6.3 Notes
Section 6.1. Jensen (1899) proved that if $f$ satisfies the hypotheses of Lemma 6.1, then
\[ |f(0)| \prod_{k=1}^{n} \frac{R}{|z_k|} = \exp\Big( \frac{1}{2\pi} \int_0^{2\pi} \log |f(Re^{i\theta})|\, d\theta \Big) \]
where $z_1, \ldots, z_n$ are the zeros of $f$ in the disc $|z| \le R$. Here the right-hand side may be regarded as being the geometric mean of $|f(z)|$ for $z$ on the circle $|z| = R$. Each factor of the product above is $\ge 1$, and if $|z_k| \le r$, then $R/|z_k| \ge R/r$.
Thus Lemma 6.1 follows easily from the above. The products used in the proofs
of Lemmas 6.1 and 6.3 are known as Blaschke products. Their use (usually with
infinitely many factors) is an important tool of complex analysis. Lemma 6.2 is
due to Borel (1897); it refines an earlier estimate of Hadamard. Carathéodory’s
contributions on this subject are recounted by Landau (1906; Section 4).
Lemma 6.4 is implicit in Landau (1909, p. 372), and may have been known
earlier. It can also be easily derived from the identity (10.29) that arises by
applying Hadamard’s theory of entire functions to the zeta function.
The Prime Number Theorem was first proved, in the qualitative form $\pi(x) \sim x/\log x$, independently by Hadamard (1896) and de la Vallée Poussin (1896). In these papers, it was shown that $\zeta(1 + it) \ne 0$, but no specific zero-free region was established. The first proof that $\zeta(1 + it) \ne 0$ given by de la Vallée Poussin was rather complicated, but later in his long paper he gave a second proof depending on the inequality $1 - \cos 2\theta \le 4(1 + \cos\theta)$. This is equivalent to the non-negativity of the cosine polynomial $3 + 4\cos\theta + \cos 2\theta$, which Mertens (1898) used to obtain the result of Exercise 6.1.4. Our Lemma 6.5 is derived by the same method. The classical zero-free region of Theorem 6.6 was established first by de la Vallée Poussin (1899). The estimates (6.6) and (6.8) of Theorem 6.7 were first proved by Gronwall (1913).
Wider zero-free regions have been established by using exponential sum estimates to obtain better upper bounds for $|\zeta(s)|$ when $\sigma$ is near 1. The first such improvement was derived by Hardy & Littlewood. Their paper on this was never published, but accounts of their approach have been given by Landau (1924b) and Titchmarsh (1986, Chapter 5). Littlewood (1922) announced that from these estimates he had deduced that $\zeta(s) \ne 0$ for $\sigma \ge 1 - c(\log\log\tau)/\log\tau$. As explained by Ingham (1932, p. 66), Littlewood never published his complicated proof, because the simpler method of Landau (1924a) had become available.
In 1935, Vinogradov introduced a new method for estimating Weyl sums. A
N
Weyl sum is a sum of the form n=1 e( f (n)) where f ∈ R[x]. The quality of
Vinogradov’s estimate depends on rational approximations to the coefficients
of f , and on the degree of f . The function f (x) = t log x is not a polynomial,
but by approximating to it by polynomials one can make Vinogradov’s method
apply. This was first done by Chudakov (1936 a, b, c), who derived estimates
for ζ (s) for σ near 1 that allowed him to deduce that ζ (s) = 0 for
σ > 1 − c(log τ )−a (6.24)
for a > 10/11. Vinogradov (1936b) gave stronger exponential sum estimates,
which Titchmarsh (1938) used to obtain a zero-free region of the above form for
a > 4/5. Hua (1949) introduced a further refinement of Vinogradov’s method,
from which Titchmarsh (1951, Chapter 6) and Tatuzawa (1952) derived the
zero-free region
$$\sigma > 1 - c(\log\tau)^{-3/4}(\log\log\tau)^{-3/4}.$$
By refining the passage from Weyl sums to the zeta function, Korobov (1958a)
obtained (6.24) for a > 5/7, and then Korobov (1958b, c) and Vinogradov
(1958) obtained a > 2/3. In fact, Vinogradov claimed that one can take a =
2/3, but this seems to be still out of reach. Richert’s polished exposition of
Vinogradov’s method is reproduced in Walfisz (1963). Other expositions have
since been given by Karatsuba & Voronin (1992, Chapter 4), Montgomery
(1994, Chapter 4), and Vaughan (1997). Richert (1967) used Vinogradov’s
method to show that
$$\zeta(s) \ll t^{100(1-\sigma)^{3/2}}(\log t)^{2/3} \tag{6.25}$$
for σ ≤ 1, t ≥ 2. From this it follows that ζ(s) ≠ 0 for
$$\sigma \ge 1 - c(\log\tau)^{-2/3}(\log\log\tau)^{-1/3}. \tag{6.26}$$

The methods of Hadamard and de la Vallée Poussin depended on the analytic

continuation of ζ (s), on bounds for the size of ζ (s) in the complex plane, and
on Hadamard’s theory of entire functions. The first two of these are achieved
most easily by Riemann’s functional equation (see Corollaries 10.3–10.5). An
abbreviated account of the third is found in Lemma 10.11. Landau (1903a)
showed that one can obtain a zero-free region using only the local analytic
properties of the zeta function. This enabled Landau to prove the Prime Ideal
Theorem, which is the natural extension of the Prime Number Theorem to
algebraic number fields: If K is an algebraic number field, then the number
of prime ideals p in K with N (p) ≤ x is asymptotic to x/ log x as x → ∞.
This could not have been done at that time by the methods of Hadamard and
de la Vallée Poussin, since the analytic continuation and functional equation of
the Dedekind zeta function ζ K (s) was established only later, by Hecke (1917).
Landau did not achieve Theorem 6.6 at the first attempt, but he refined his
approach in a series of papers culminating in the polished exposition of Landau
(1924a).
Section 6.2. Ingham (1932, pp. 60–65; cf. Titchmarsh 1986, pp. 56–60)
developed a general system by which any given zero-free region of the zeta
function can be used to derive an associated bound for the error term in the
Prime Number Theorem. In particular, he showed that if ζ(s) ≠ 0 for s in the
region (6.24), then
$$\psi(x) = x + O\big(x\exp(-c(\log x)^{b})\big) \tag{6.27}$$
where b = 1/(1 + a). Similarly, from the zero-free region (6.26) it follows that
$$\pi(x) = \operatorname{li}(x) + O\Big(x\exp\big(-c(\log x)^{3/5}(\log\log x)^{-1/5}\big)\Big). \tag{6.28}$$

Turán (1950) used his method of power sums to show conversely that (6.27)
implies (6.24). More general converse theorems have since been established by
Stás (1961) and Pintz (1980, 1983, 1984). A similar converse theorem in which
an upper bound for $M(x) = \sum_{n\le x}\mu(n)$ is used to produce a zero-free region
has been given by Allison (1970).
That M(x) = o(x) was first proved by von Mangoldt (1897). The quantitative
estimate (6.17) is due to Landau (1908a). The relation (6.19), asserted by Euler
(1748; Chapter 15, no. 277), was first proved by von Mangoldt (1897). Landau
(1899a) and de la Vallée Poussin (1899) shortly gave simpler proofs.

6.4 References
Allison, D. (1970). On obtaining zero-free regions for the zeta-function from estimates
of M(x), Proc. Cambridge Philos. Soc. 67, 333–337.
Borel, E. (1897). Sur les zéros des fonctions entières, Acta Math. 20, 357–396.
Chudakov, N. G. (1936a). Sur les zéros de la fonction ζ (s), C. R. Acad. Sci. Paris 202,
191–193.
(1936b). On zeros of the function ζ (s), Dokl. Akad. Nauk SSSR 1, 201–204.
(1936c). On zeros of Dirichlet’s L-functions, Mat. Sb. (1) 43, 591–602.
(1937). On Weyl’s sums, Mat. Sb. (2) 44, 17–35.
(1938). On the functions ζ (s) and π(x), Dokl. Akad. Nauk SSSR 21, 421–422.
Cipolla, M. (1902). La determinazione assintotica dell’$n^{\mathrm{imo}}$ numero primo, Rend. Accad.
Sci. Fis-Mat. Napoli (3) 8, 132–166.
Euler, L. (1748). Introductio in analysin inﬁnitorum, I, Lausanne; Opera omnia Ser 1,
Vol. 8, Teubner, 1922.
Gronwall, T. H. (1913). Sur la fonction ζ (s) de Riemann au voisinage de σ = 1, Rend.
Mat. Cir. Palermo 35, 95–102.
Hadamard, J. (1896). Sur la distribution des zéros de la fonction ζ (s) et ses conséquences
arithmétiques, Bull. Soc. Math. France 24, 199–220.
Hardy, G. H. (1921). Note on Ramanujan’s trigonometrical function cq (n), and certain
series of arithmetical functions, Proc. Cambridge Philos. Soc. 20, 263–271.
Hecke, E. (1917). Über die Zetafunktion beliebiger algebraischer Zahlkörper, Nachr.
Akad. Wiss. Göttingen, 77–89; Mathematische Werke, Göttingen: Vandenhoeck &
Ruprecht, 1959, pp. 159–171.
Hua, L. K. (1949). An improvement of Vinogradov’s mean-value theorem and several
applications, Quart. J. Math. Oxford Ser. 20, 48–61.
Ingham, A. E. (1932). The Distribution of Prime Numbers, Cambridge Tracts Math. 30.
Cambridge: Cambridge University Press.
(1945). Some Tauberian theorems connected with the Prime Number Theorem, J.
London Math. Soc. 20, 171–180.
Jensen, J. L. W. V. (1899). Sur un nouvel et important théorème de la théorie des
fonctions, Acta Math. 22, 359–364.
Karatsuba, A. A. & Voronin, S. M. (1992). The Riemann Zeta-function. Berlin: de
Gruyter.
Korobov, N. M. (1958a). On the zeros of the function ζ (s), Dokl. Akad. Nauk SSSR 118,
231–232.
(1958b). Weyl’s estimates of sums and the distribution of primes, Dokl. Akad. Nauk
SSSR 123, 28–31.
(1958c). Evaluation of trigonometric sums and their applications, Usp. Mat. Nauk 13,
no. 4, 185–192.
Landau, E. (1899a). Neuer Beweis der Gleichung $\sum_{k=1}^{\infty} \mu(k)/k = 0$, Inaugural Dissertation,
Berlin; Collected Works, Vol. 1. Essen: Thales Verlag, pp. 69–83.
(1899b). Contribution à la théorie de la fonction ζ(s) de Riemann, C. R. Acad. Sci.
Paris, 129, 812–815; Collected Works, Vol. 1. Essen: Thales Verlag, 1985, pp.
84–88.
(1900). Sur quelques problèmes rélatifs à la distribution des nombres premiers, Bull.
Soc. Math. France 28, 25–38; Collected Works, Vol. 1. Essen: Thales Verlag, 1985,
pp. 92–105.
(1901a). Über die asymptotischen Werthe einiger zahlentheoretischer Functionen,
Math. Ann. 54, 570–591; Collected Works, Vol. 1. Essen: Thales Verlag, 1985, pp.
141–162.
(1901b). Solutions de questions proposées, Nouv. Ann. de Math. (4) 1, 281–283;
Collected Works, Vol. 1. Essen: Thales Verlag, 1985, pp. 181–182.
(1903a). Neuer Beweis des Primzahlsatzes und Beweis des Primidealsatzes, Math.
Ann. 56, 645–670; Collected Works, Vol. 1. Essen: Thales Verlag, 1985, pp. 327–
353.
(1903b). Über die Maximalordnung der Permutationen gegebenen Grades, Arch.
Math. Phys. (3) 5, 92–103; Collected Works, Vol. 1. Essen: Thales Verlag, 1985,
pp. 384–396.
(1903c). Über die zahlentheoretische Funktion µ(k), Sitzungsber. Kaiserl. Akad. Wiss.
Wien math-natur. Kl. 112, 537–570; Collected Works, Vol. 2. Essen: Thales Verlag,
1986, pp. 60–93.
(1905). Sur quelques inégalités dans la théorie de la fonction ζ (s) de Riemann, Bull.
Soc. Math. France 33, 229–241; Collected Works, Vol. 2. Essen: Thales Verlag,
1986, pp. 167–179.
(1906). Über den Picardschen Satz, Vierteljahrschr. der Naturf. Ges. Zürich 51, 252–
318; Collected Works, Vol. 3. Essen: Thales Verlag, 1986, pp. 113–179.
(1907). Über die Multiplikation Dirichlet’scher Reihen, Rend. Circ. Mat. Palermo 24,
81–160; Collected Works, Vol. 3. Essen: Thales Verlag, 1986, pp. 323–401.
(1908a). Beiträge zur analytischen Zahlentheorie, Rend. Mat. Circ. Palermo 26, 169–
302; Collected Works, Vol. 3. Essen: Thales Verlag, 1986, pp. 411–544.
(1908b). Über die Einteilung der positiven ganzen Zahlen in vier Klassen nach der
Mindestzahl der zu ihrer additiven Zusammensetzung erforderlichen Quadrate,
Arch. Math Phys. (3) 13, 305–312; Collected Works, Vol. 4. Essen: Thales Verlag,
1986, 59–66.
(1909). Handbuch der Lehre von der Verteilung der Primzahlen, Leipzig: Teubner.
(1924a). Über die Wurzeln der Zetafunktion, Math. Z. 20, 98–104; Collected Works,
Vol. 8. Essen: Thales Verlag, 1987, pp. 70–76.
(1924b). Über die ζ-funktion und die L-funktionen, Math. Z. 20, 105–125; Collected
Works, Vol. 8. Essen: Thales Verlag, 1987, pp. 77–98.
Littlewood, J. E. (1922). Researches in the theory of the Riemann ζ -function, Proc.
London Math. Soc. (2), 20, xxii–xxvii; Collected papers, Vol. 2. Oxford: Oxford
University Press, 1982, pp. 844–850.
von Mangoldt, H. (1897). Beweis der Gleichung $\sum_{k=1}^{\infty} \mu(k)/k = 0$, Sitzungsber. Königl.
Preuß. Akad. Wiss. Berlin, 835–852.
Massias, J.-P., Nicolas, J.-L., & Robin, G. (1988). Évaluation asymptotique de l’ordre
maximum d’un élément du groupe symétrique, Acta Arith. 50, 221–242.
(1989). Effective bounds for the maximal order of an element in the symmetric group,
Math. Comp. 53, 665–678.
Mertens, F. (1897). Ueber eine Zahlentheoretische Function, Sitzungsber. Akad. Wiss.
Wien Abt. 2a 106.
(1898). Über eine Eigenschaft der Riemannscher ζ -Funktion, Sitzungsber. Kais. Akad.
Wiss. Wien Abt. 2a 107, 1429–1434.
Montgomery, H. L. (1994). Ten Lectures on the Interface Between Analytic Number The-
ory and Harmonic Analysis, CBMS Regional Conf. Series in Math. 84. Providence:
Amer. Math. Soc.
Montgomery, H. L. & Vaughan, R. C. (2001). Mean values of multiplicative functions,
Period. Math. Hungar. 43, 199–214.
Pintz, J. (1980). On the remainder term of the prime number formula, II. On a theorem
of Ingham, Acta Arith. 37, 209–220.
(1983). Oscillatory Properties of the Remainder Term of the Prime Number Formula,
Studies in Pure Math. Basel: Birkhäuser, pp. 551–560.
(1984). On the remainder term of the prime number formula and the zeros of Rie-
mann’s zeta-function, Number Theory (Noordwijkerhout, 1983). Lecture notes in
math. 1068. Berlin: Springer-Verlag, pp. 186–197.
Pólya, G. & Szegö, G. (1972). Problems and Theorems in Analysis, Vol. 1. Grundl.
math. Wiss. 193. New York: Springer-Verlag.
Richert, H.-E. (1967). Zur Abschätzung der Riemannschen Zetafunktion in der Nähe
der Vertikalen σ = 1, Math. Ann. 169, 97–101.
Rosser, J. B. (1939). The n-th prime is greater than n log n, Proc. London Math. Soc. (2)
45, 21–44.
Rosser, J. B. & Schoenfeld, L. (1962). Approximate formulas for some functions of
prime numbers, Illinois J. Math. 6, 64–94.
Stás, W. (1961). Über die Umkehrung eines Satzes von Ingham, Acta Arith. 6, 435–
446.
Tatuzawa, T. (1952). On the number of primes in an arithmetic progression, Jap. J. Math.
21, 93–111.
Titchmarsh, E. C. (1938). On ζ (s) and π (x), Quart. J. Math. Oxford Ser. 9, 97–108.
(1951). The Theory of the Riemann Zeta-function, Oxford: Oxford University
Press.
(1986). The Theory of the Riemann Zeta-function, Second Ed. Oxford: Oxford
University Press.
Turán, P. (1950). On the remainder-term in the prime-number formula, II, Acta. Math.
Acad. Sci. Hungar. 1, 155–166; Collected Papers, Vol. 1. Budapest: Akadémiai
Kiado, 1990, pp. 541–551.
de la Vallée Poussin, C. J. (1896). Recherches analytiques sur la théorie des nombres
premiers, I–III, Ann. Soc. Sci. Bruxelles 20, 183–256, 281–362, 363–397.
(1899). Sur la fonction ζ (s) et le nombre des nombres premiers inférieurs à une limite
donnée, Mem. Couronnés de l’Acad. Roy. Sci. Bruxelles 59.
Vaughan, R. C. (1997). The Hardy–Littlewood Method, Second Edition, Cambridge
Tracts in Math. 125, Cambridge: Cambridge University Press.
Vinogradov, I. M. (1935). On Weyl’s sums, Mat. Sb. 42, 521–530.
(1936a). A new method for resolving certain general questions in the theory of num-
bers, Mat. Sb. (1) 43, 9–19.
(1936b). A new method of estimation of trigonometrical sums, Mat. Sb. (1) 43, 175–
188.
(1947). The Method of Trigonometrical Sums in the Theory of Numbers, Trav. Inst.
Math. Steklov 23; English translation, London: Interscience Publishers, 1954.
(1958). A new evaluation of ζ (1 + it), Izv. Akad. Nauk SSSR 22, 161–164.
Walfisz, A. (1963). Weylsche Exponentialsummen in der neueren Zahlentheorie, Math.
Forschungsberichte 15. Berlin: Deutscher Verlag Wiss.
7
Applications of the Prime Number Theorem

We now use the Prime Number Theorem, and other estimates obtained by similar
methods, to estimate the number of integers whose multiplicative structure is
of a speciﬁed type.

7.1 Numbers composed of small primes

Let ψ(x, y) denote the number of integers n, 1 ≤ n ≤ x, all of whose prime
factors are ≤ y. Obviously, if y ≥ x, then
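$$\psi(x, y) = [x] = x + O(1). \tag{7.1}$$
Also, if n ≤ x, then n can have at most one prime factor $p > \sqrt{x}$, and hence if
$x^{1/2} \le y \le x$, then
$$\begin{aligned}
\psi(x, y) &= [x] - \sum_{y<p\le x}\ \sum_{\substack{n\le x\\ p\mid n}} 1 \\
&= [x] - \sum_{y<p\le x} [x/p] \\
&= x - x\sum_{y<p\le x}\frac{1}{p} + O(\pi(x)).
\end{aligned}$$
By the estimates of Chebyshev and Mertens (Corollary 2.6 and Theorem 2.7(d)),
this is
$$= x\Big(1 - \log\frac{\log x}{\log y}\Big) + O\Big(\frac{x}{\log x}\Big).$$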
Figure 7.1 The Dickman function ρ(u) for 0 ≤ u ≤ 4.

Thus if we take u = (log x)/(log y), so that $y = x^{1/u}$, then we see that
$$\psi\big(x, x^{1/u}\big) = (1 - \log u)x + O\Big(\frac{x}{\log x}\Big) \tag{7.2}$$
uniformly for 1 ≤ u ≤ 2. We shall show more generally that there is a function
ρ(u) > 0 such that
$$\psi\big(x, x^{1/u}\big) \sim \rho(u)x \tag{7.3}$$
as x → ∞ with u bounded. The function ρ(u) that arises here is known as the
Dickman function; it may be deﬁned to be the unique continuous function on
[0, ∞) satisfying the differential–delay equation
$$u\rho'(u) = -\rho(u-1) \tag{7.4}$$
for u > 1 together with the initial condition that
$$\rho(u) = 1 \tag{7.5}$$
for 0 ≤ u ≤ 1. Before proceeding further we note some simple properties of
this function. By dividing both sides of (7.4) by u and then integrating, we ﬁnd
that
$$\rho(v) = \rho(u) - \int_u^v \rho(t-1)\,\frac{dt}{t} \tag{7.6}$$
for 1 ≤ u ≤ v. Also, from (7.4) we see that $(u\rho(u))' = \rho(u) - \rho(u-1)$, so that
by integrating it follows that
$$u\rho(u) = \int_{u-1}^{u} \rho(v)\,dv + C$$
for u ≥ 1, where C is a constant of integration. On taking u = 1 we deduce that
C = 0, and hence that
$$u\rho(u) = \int_{u-1}^{u} \rho(v)\,dv \tag{7.7}$$
for u ≥ 1.
As might be surmised from Figure 7.1, ρ(u) is positive and decreasing. To
prove this, let $u_0$ be the infimum of the set of all solutions of the equation
ρ(u) = 0. By the continuity of ρ it follows that $\rho(u_0) = 0$. But ρ(u) > 0 for
$0 \le u < u_0$, and hence if we take $u = u_0$ in (7.7), then the left-hand side is
0 while the right-hand side is positive, a contradiction. Thus ρ(u) > 0 for all
u ≥ 0, and by (7.4) it follows that ρ′(u) < 0 for all u > 1. Figure 7.1 also
suggests that ρ(u) tends to 0 rapidly as u → ∞. We now establish a crude
estimate in this direction.
Lemma 7.1 The function ρ(u) is positive and decreasing for u ≥ 0, and
satisfies the inequalities
$$\frac{1}{2\Gamma(2u+1)} \le \rho(u) \le \frac{1}{\Gamma(u+1)}.$$
Proof For positive integers U we prove by induction that the upper bound
holds for 0 ≤ u ≤ U. To provide the basis of the induction we need to show
that Γ(s) ≤ 1 for 1 ≤ s ≤ 2. This is immediate from the relations
$$\Gamma(1) = \Gamma(2) = 1, \qquad \Gamma''(s) = \int_0^\infty e^{-x}x^{s-1}(\log x)^2\,dx > 0 \quad (0 < s < \infty). \tag{7.8}$$
Since ρ(u) is decreasing, we see by (7.7) that uρ(u) ≤ ρ(u − 1). Thus if the
desired upper bound holds for u ≤ U and if U ≤ u ≤ U + 1, then
$$\rho(u) \le \frac{\rho(u-1)}{u} \le \frac{1}{u\Gamma(u)} = \frac{1}{\Gamma(u+1)}$$
by (C.4).
After making the change of variables u = v/2, the desired lower bound
asserts that ρ(v/2) ≥ 1/(2Γ(v + 1)). We let V run through positive integral
values, and prove by induction on V that the lower bound holds for 0 ≤ v ≤ V.
To establish the lower bound for 0 ≤ v ≤ 2 it suffices to show that Γ(s) ≥ 1/2
for all s > 0. From (7.8) we see that Γ(s) ≥ 1 for 0 < s ≤ 1 and for s ≥ 2; thus
it remains to note that if 1 ≤ s ≤ 2, then
$$\Gamma(s) = \int_0^\infty e^{-x}x^{s-1}\,dx \ge \int_0^1 e^{-x}x\,dx + \int_1^\infty e^{-x}\,dx = 1 - \frac{1}{e} > \frac{1}{2}.$$
(The actual fact of the matter is that $\min_{s>0}\Gamma(s) = \Gamma(1.4616\ldots) = 0.8856\ldots$.)
Since ρ(u) is decreasing, we see by (7.7) that uρ(u) ≥ ρ(u − 1/2)/2.
Thus if the lower bound holds for 0 ≤ v ≤ V and if V ≤ v ≤ V + 1,
then
$$\rho(v/2) \ge \frac{\rho((v-1)/2)}{v} \ge \frac{1}{2v\Gamma(v)} = \frac{1}{2\Gamma(v+1)}$$
by (C.4). This completes the inductive step, so the proof is complete.
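
As a numerical sanity check on Lemma 7.1 (not part of the proof), the delay equation (7.4) can be integrated on a uniform grid and the result compared with the two bounds. The following is a minimal Python sketch; the function name `dickman_table`, the trapezoidal rule and the step size are illustrative choices made here, not anything prescribed in the text.

```python
import math

def dickman_table(u_max, steps_per_unit=1000):
    """Approximate rho on the grid u = i*h, h = 1/steps_per_unit, 0 <= u <= u_max.

    Uses rho(u) = 1 on [0, 1], as in (7.5), and the integrated form (7.6) of
    the delay equation (7.4), applying the trapezoidal rule on each grid cell.
    """
    n = steps_per_unit
    h = 1.0 / n
    size = int(round(u_max * n)) + 1
    rho = [1.0] * size
    for i in range(n + 1, size):
        # The integrand in (7.6) is rho(t - 1)/t; both endpoint values are
        # already known because they refer to indices smaller than i.
        left = rho[i - 1 - n] / ((i - 1) * h)
        right = rho[i - n] / (i * h)
        rho[i] = rho[i - 1] - 0.5 * h * (left + right)
    return rho, n

rho, n = dickman_table(5.0)
for u in (1.5, 2.0, 3.0, 5.0):
    value = rho[round(u * n)]
    lower = 1.0 / (2.0 * math.gamma(2 * u + 1))   # lower bound of Lemma 7.1
    upper = 1.0 / math.gamma(u + 1)               # upper bound of Lemma 7.1
    print(f"u = {u}: {lower:.3e} <= {value:.6f} <= {upper:.6f}")
```

The grid is aligned with the integers, where the derivatives of ρ have their discontinuities, so the trapezoidal rule is applied only on intervals where the integrand is smooth.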

We now use elementary reasoning to show that (7.3) holds uniformly for u
in bounded intervals.
Theorem 7.2 (Dickman) Let ψ(x, y) be the number of positive integers not
exceeding x composed entirely of prime numbers not exceeding y, and let ρ(u)
be defined as above. Then for any U ≥ 0 we have
$$\psi\big(x, x^{1/u}\big) = \rho(u)x + O\Big(\frac{x}{\log x}\Big) \tag{7.9}$$
uniformly for 0 ≤ u ≤ U and all x ≥ 2.

Proof We restrict U to integral values, and induct on U . The basis of the

induction is provided by (7.1) and (7.5). Also, (7.2) gives (7.9) for 1 ≤ u ≤ 2
since from (7.6) we see that

ρ(u) = 1 − log u (7.10)

for 1 ≤ u ≤ 2. Suppose now that U is an integer, U ≥ 2, and that (7.9) holds

uniformly for 0 ≤ u ≤ U . We show that (7.9) holds uniformly for U ≤ u ≤
U + 1. To this end we classify n according to the size of the largest prime
factor P(n) of n. Thus we see that

$$\psi(x, y) = 1 + \sum_{p\le y} \operatorname{card}\{n \le x : P(n) = p\}.$$
Here the first term on the right reflects the fact that if x ≥ 1, then ψ(x, y)
counts the number n = 1 for which P(1) is undefined. In the sum on the right,
the summand is ψ(x/p, p), and hence we see that
$$\psi(x, y) = 1 + \sum_{p\le y} \psi(x/p, p). \tag{7.11}$$
On differencing, it follows that if y ≤ z, then
$$\psi(x, y) = \psi(x, z) - \sum_{y<p\le z} \psi(x/p, p). \tag{7.12}$$

Suppose that $z = x^{1/U}$ and that $y = x^{1/u}$ with U ≤ u ≤ U + 1. Define $u_p$ by
the relation $p = (x/p)^{1/u_p}$. That is,
$$u_p = \frac{\log x}{\log p} - 1,$$
which is ≤ u − 1 ≤ U if p ≥ y. Hence by the inductive hypothesis the right-hand
side of (7.12) is
$$\rho(U)x + O\Big(\frac{x}{\log x}\Big) - x\sum_{y<p\le z}\frac{\rho((\log x)/(\log p) - 1)}{p} + O\Big(x\sum_{y<p\le z}\frac{1}{p\log(x/p)}\Big). \tag{7.13}$$
Let $s(w) = \sum_{p\le w} 1/p$, and write Mertens’ estimate (Theorem 2.7(d)) in the
form s(w) = log log w + c + r(w). Then the sum in the main term above is
$$\int_y^z \rho\Big(\frac{\log x}{\log w} - 1\Big)\,ds(w) = \int_y^z \rho\Big(\frac{\log x}{\log w} - 1\Big)\,d\log\log w + \int_y^z \rho\Big(\frac{\log x}{\log w} - 1\Big)\,dr(w). \tag{7.14}$$
We put t = (log x)/(log w). Since
$$d\log\log w = \frac{dw}{w\log w} = -\frac{dt}{t},$$
the first integral on the right-hand side of (7.14) is
$$\int_U^u \rho(t-1)\,\frac{dt}{t}. \tag{7.15}$$
By integrating by parts and the estimate $r(w) \ll 1/\log w$ we see that the second
integral on the right-hand side of (7.14) is
$$\Big[\rho\Big(\frac{\log x}{\log w} - 1\Big)r(w)\Big]_y^z - \int_y^z r(w)\,d\rho\Big(\frac{\log x}{\log w} - 1\Big) \ll \frac{1}{\log x}\Big(1 + \int_y^z \Big|d\rho\Big(\frac{\log x}{\log w} - 1\Big)\Big|\Big) \ll \frac{1}{\log x}$$
since ρ is monotonic and bounded. By Mertens’ estimate (Theorem 2.7(d)) we
also see that the error term in (7.13) is
$$\ll \frac{x}{\log x}\sum_{y<p\le z}\frac{1}{p} \ll \frac{x}{\log x}$$
since log log z = log log y + O(1). On combining our estimates in (7.12) we
find that
$$\psi\big(x, x^{1/u}\big) = x\Big(\rho(U) - \int_U^u \rho(t-1)\,\frac{dt}{t}\Big) + O\Big(\frac{x}{\log x}\Big).$$
Thus by (7.6) we have the desired estimate for U ≤ u ≤ U + 1, and the proof
is complete.
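
Theorem 7.2 can be illustrated numerically for small x by counting smooth numbers directly. The sketch below is only an illustration: the helper names `largest_prime_factors` and `psi` are introduced here, and the comparison uses u close to 2, where ρ(2) = 1 − log 2 by (7.10). For x of this size the error term O(x/log x) is still clearly visible.

```python
import math

def largest_prime_factors(limit):
    """Return lpf with lpf[n] = largest prime factor of n (lpf[0] = lpf[1] = 0)."""
    lpf = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if lpf[p] == 0:                      # untouched so far, hence p is prime
            for m in range(p, limit + 1, p):
                lpf[m] = p                   # larger primes overwrite smaller ones
    return lpf

def psi(x, y):
    """Count 1 <= n <= x all of whose prime factors are <= y (n = 1 is counted)."""
    lpf = largest_prime_factors(x)
    return sum(1 for n in range(1, x + 1) if lpf[n] <= y)

x = 10**5
y = math.isqrt(x)                            # so that u = (log x)/(log y) is close to 2
print("psi(x, y)/x  =", psi(x, y) / x)
print("rho(2)       =", 1 - math.log(2))
print("1/log x      =", 1 / math.log(x))     # size of the relative error term in (7.9)
```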

As for ψ(x, y) when $y < x^{\varepsilon}$, we show next that
$$\psi(x, (\log x)^{a}) = x^{1-1/a+o(1)} \tag{7.16}$$
for any ﬁxed a ≥ 1. The upper bound portion of this is obtained by means of
bounds for an associated Dirichlet series, while the lower bound is derived by
combinatorial reasoning.
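An upper bound for ψ(x, y) can be constructed by observing that if σ > 0,
then
$$\psi(x, y) \le \sum_{\substack{n\le x\\ p\mid n\Rightarrow p\le y}} \Big(\frac{x}{n}\Big)^{\sigma} \le x^{\sigma}\sum_{p\mid n\Rightarrow p\le y}\frac{1}{n^{\sigma}} = x^{\sigma}\prod_{p\le y}\Big(1 - \frac{1}{p^{\sigma}}\Big)^{-1}. \tag{7.17}$$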

Rankin used this chain of inequalities to derive an upper bound for ψ(x, y).
This approach is fruitful in a variety of settings, and has become known as
‘Rankin’s method’.
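
The inequality (7.17) holds for every σ > 0, so in practice one scans over σ and keeps the smallest value. A small Python sketch of this, with hypothetical helper names (`primes_up_to`, `psi_bruteforce`, `rankin_bound`) and a deliberately naive smooth-number count for comparison, might look as follows.

```python
def primes_up_to(y):
    """Primes p <= y by trial division (adequate for small y)."""
    return [p for p in range(2, y + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def psi_bruteforce(x, y):
    """Count n <= x whose prime factors are all <= y, by dividing them out."""
    primes = primes_up_to(y)
    count = 0
    for n in range(1, x + 1):
        m = n
        for p in primes:
            while m % p == 0:
                m //= p
        count += (m == 1)        # nothing left over means n is y-smooth
    return count

def rankin_bound(x, y, sigma):
    """Right-hand side of (7.17): x^sigma * prod_{p <= y} (1 - p^(-sigma))^(-1)."""
    bound = float(x) ** sigma
    for p in primes_up_to(y):
        bound /= 1.0 - p ** (-sigma)
    return bound

x, y = 10**4, 20
exact = psi_bruteforce(x, y)
# Scan a grid of sigma values and keep the best (smallest) upper bound.
best_sigma = min((s / 100 for s in range(5, 101)), key=lambda s: rankin_bound(x, y, s))
print(exact, best_sigma, rankin_bound(x, y, best_sigma))
```

The bound is never sharp, since it counts every y-smooth n with weight (x/n)^σ, but it is simple to optimise because the product over primes is explicit.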
To use the above, we must establish an upper bound for the product on the
right-hand side. The size of this product is a little difﬁcult to describe, because its
behaviour depends on the size of σ. If σ is near 0, then most of the factors are
approximately $(1 - y^{-\sigma})^{-1}$, and hence we expect the product to be approximately
$(1 - y^{-\sigma})^{-y/\log y}$. If σ is larger (but still < 1), then the general factor is
approximately $\exp(p^{-\sigma})$, and hence the product is approximately the exponential of
$$\sum_{p\le y} p^{-\sigma} \sim \int_2^y \frac{dt}{t^{\sigma}\log t} \sim \frac{y^{1-\sigma}}{(1-\sigma)\log y}.$$
We begin by making these relations precise.
Lemma 7.3 If 0 ≤ σ ≤ 1, then
$$\sum_{p\le y} p^{-\sigma} = \int_2^y \frac{du}{u^{\sigma}\log u} + O\big(y^{1-\sigma}\exp(-c\sqrt{\log y})\big) + O(1). \tag{7.18}$$

Proof We write the left-hand side as
$$\int_{2^-}^{y} u^{-\sigma}\,d\pi(u) = \int_{2^-}^{y} u^{-\sigma}\,d\operatorname{li}(u) + \int_{2^-}^{y} u^{-\sigma}\,dr(u)$$
where r(u) = π(u) − li(u). The first integral on the right is $\int_2^y u^{-\sigma}(\log u)^{-1}\,du$.
By integrating by parts we find that the second integral is
$$y^{-\sigma}r(y) - 2^{-\sigma}r(2^-) + \sigma\int_2^y r(u)u^{-\sigma-1}\,du.$$
Suppose that b is a positive constant chosen so that $r(u) \ll u\exp(-b\sqrt{\log u})$.
Then the first two terms above can be absorbed into the error terms in (7.18) if
c < b. To complete the proof it suffices to show that
$$\int_2^y u^{-\sigma}\exp(-b\sqrt{\log u})\,du \ll 1 + y^{1-\sigma}\exp\Big(-\frac{b}{3}\sqrt{\log y}\Big), \tag{7.19}$$
for then we have (7.18) with c = b/3.
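To prove (7.19) we note that if $\sigma \ge 1 - b/(2\sqrt{\log y})$, then
$$\begin{aligned}
u^{1-\sigma}\exp\Big(-\frac{b}{2}\sqrt{\log u}\Big) &= \exp\Big((1-\sigma)\log u - \frac{b}{2}\sqrt{\log u}\Big) \\
&\le \exp\Big(\frac{b}{2}(\log u)/\sqrt{\log y} - \frac{b}{2}\sqrt{\log u}\Big) \\
&\le 1
\end{aligned}$$
for 2 ≤ u ≤ y. Hence for σ in this range the integral in (7.19) is
$$\le \int_2^y \frac{du}{u\exp\big(\frac{b}{2}\sqrt{\log u}\big)} < \int_2^\infty \frac{du}{u\exp\big(\frac{b}{2}\sqrt{\log u}\big)} \ll 1.$$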
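Now suppose that
$$\sigma \le 1 - \frac{b}{2\sqrt{\log y}}. \tag{7.20}$$
We write the integral in (7.19) as $\int_2^{y^{1/4}} + \int_{y^{1/4}}^{y} = I_1 + I_2$, say. Then
$$I_1 \le \int_2^{y^{1/4}} u^{-\sigma}\,du < \frac{y^{(1-\sigma)/4}}{1-\sigma},$$
which by (7.20) is
$$\ll y^{1-\sigma}\sqrt{\log y}\,\exp\Big(-\frac{3}{4}(1-\sigma)\log y\Big) \ll y^{1-\sigma}\exp\Big(-\frac{b}{3}\sqrt{\log y}\Big).$$
As for $I_2$, we note that if $u \ge y^{1/4}$, then $\log u \ge \frac{1}{4}\log y$. Hence
$$\begin{aligned}
I_2 &\le \exp\Big(-\frac{b}{2}\sqrt{\log y}\Big)\int_2^y u^{-\sigma}\,du \le \exp\Big(-\frac{b}{2}\sqrt{\log y}\Big)\frac{y^{1-\sigma}}{1-\sigma} \\
&\ll \exp\Big(-\frac{b}{2}\sqrt{\log y}\Big)y^{1-\sigma}\sqrt{\log y} \ll y^{1-\sigma}\exp\Big(-\frac{b}{3}\sqrt{\log y}\Big).
\end{aligned}$$
These estimates combine to give (7.19), so the proof is complete.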

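Lemma 7.4 If y ≥ 2 and 1 − 4/log y ≤ σ ≤ 1, then
$$\sum_{p\le y} p^{-\sigma} = \log\log y + O(1). \tag{7.21}$$
If y ≥ 2 and 0 ≤ σ ≤ 1 − 4/log y, then
$$\sum_{p\le y} p^{-\sigma} = \frac{y^{1-\sigma}}{(1-\sigma)\log y} + \log\frac{1}{1-\sigma} + O\Big(\frac{y^{1-\sigma}}{(1-\sigma)^2(\log y)^2}\Big). \tag{7.22}$$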

Proof Suppose that 1 − 4/log y ≤ σ ≤ 1. If u ≤ y, then
$$u^{-\sigma} = u^{-1}u^{1-\sigma} = u^{-1}\exp\big((1-\sigma)\log u\big) = u^{-1}\big(1 + O((1-\sigma)\log u)\big) = u^{-1} + O\big(u^{-1}(1-\sigma)\log u\big).$$
Hence
$$\int_2^y \frac{du}{u^{\sigma}\log u} = \int_2^y \frac{du}{u\log u} + O\Big((1-\sigma)\int_2^y \frac{du}{u}\Big) = \log\log y + O(1).$$
Thus (7.21) follows from Lemma 7.3.
To prove (7.22) we let v = exp(4/(1 − σ)), and observe that v ≤ y. We write
the integral in Lemma 7.3 as $\int_2^v + \int_v^y = I_1 + I_2$, say. By the above we see that
$I_1 = \log\log v + O(1) = \log 1/(1-\sigma) + O(1)$. By integration by parts we see
that
$$I_2 = \frac{y^{1-\sigma}}{(1-\sigma)\log y} - \frac{v^{1-\sigma}}{(1-\sigma)\log v} + \frac{1}{1-\sigma}\int_v^y \frac{du}{u^{\sigma}(\log u)^2}.$$
Here the first term on the right is one of the main terms in (7.22), and the second
term is O(1). Let J denote the integral on the right. To complete the proof it
suffices to show that
$$J \ll \frac{y^{1-\sigma}}{(1-\sigma)(\log y)^2}. \tag{7.23}$$
To this end we integrate by parts again:
$$J = \frac{y^{1-\sigma}}{(1-\sigma)(\log y)^2} - \frac{v^{1-\sigma}}{(1-\sigma)(\log v)^2} + \frac{2}{1-\sigma}\int_v^y \frac{dw}{w^{\sigma}(\log w)^3}.$$
Here the second term on the right-hand side is $e^{4}2^{-4}(1-\sigma) \ll 1-\sigma$, while
the first term on the right-hand side is larger. As for the integral on the right, we
observe that if w ≥ v, then $(\log w)^3 \ge 4(\log w)^2/(1-\sigma)$. Hence the last term
on the right above has absolute value not exceeding J/2. Thus we have (7.23),
and the proof is complete.

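Lemma 7.5 Suppose that y ≥ 2. If max(2/log y, 1 − 4/log y) ≤ σ ≤ 1,
then
$$\prod_{p\le y}\big(1 - p^{-\sigma}\big)^{-1} \ll \log y. \tag{7.24}$$
If 2/log y ≤ σ ≤ 1 − 4/log y, then
$$\prod_{p\le y}(1 - p^{-\sigma})^{-1} = \frac{1}{1-\sigma}\exp\Big(\frac{y^{1-\sigma}}{(1-\sigma)\log y}\Big\{1 + O\Big(\frac{1}{(1-\sigma)\log y}\Big) + O(y^{-\sigma})\Big\}\Big). \tag{7.25}$$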

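Proof The bound (7.24) is trivial when σ ≤ 2/3 since then $y \le e^{12}$. The
estimate $(1-\delta)^{-1} = \exp\big(\delta + O(\delta^2)\big)$ holds uniformly for |δ| ≤ 1/2. We take
$\delta = p^{-\sigma}$ for $p > v = e^{1/\sigma}$ to deduce that
$$\prod_{v<p\le y}\big(1 - p^{-\sigma}\big)^{-1} = \exp\Big(\sum_{v<p\le y} p^{-\sigma} + O\Big(\sum_{v<p\le y} p^{-2\sigma}\Big)\Big).$$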

Now (7.24) follows at once from Lemma 7.4 when σ ≥ 2/3. Thus it remains
to establish (7.25). The sum in the error term above is ≪ 1 for σ > 5/8. If
3/8 ≤ σ ≤ 5/8, then by Lemma 7.4 it is $\ll y^{1/4}/\log y$. If 2/log y ≤ σ ≤ 3/8,
then by Lemma 7.4 the sum is $\ll y^{1-2\sigma}/\log y$. Thus in any case this error term
is majorized by the error terms on the right-hand side of (7.25). By Lemma 7.4,
the main term is
$$\sum_{v<p\le y} p^{-\sigma} = \frac{y^{1-\sigma}}{(1-\sigma)\log y} + \log\frac{1}{1-\sigma} + O\Big(\frac{y^{1-\sigma}}{(1-\sigma)^2(\log y)^2}\Big) + O\Big(\frac{v}{\log v}\Big).$$
Since 2/log y ≤ σ ≤ 1 − 4/log y, y satisfies $y \ge e^{6}$, and σ(1 − σ) log y ≥
2(1 − 2/log y) ≥ 4/3. Hence $(y^{1-\sigma})^{3/4} \ge v$ and the second error term above
is dominated by the first.
It remains to consider the contribution of the primes p ≤ v. If σ > 1/3, then
the contribution of these primes is ≪ 1, so we may suppose that 2/log y ≤
σ ≤ 1/3. In this range
$$1 - p^{-\sigma} \gg \sigma\log p = \frac{\log p}{\log v}.$$
Since
$$\sum_{p\le v}\log\frac{C\log v}{\log p} \ll v,$$
it follows that
$$\prod_{p\le v}(1 - p^{-\sigma})^{-1} < \exp(Cv) = \exp\big(Ce^{1/\sigma}\big) \le \exp\big(Cy^{1/2}\big),$$
which suffices. Thus the proof is complete.

We now bound ψ(x, y) by combining Lemma 7.5 with the inequalities (7.17).
Theorem 7.6 If $y = x^{1/u}$ and $\log x \le y \le x^{1/9}$, then
$$\psi(x, y) < x(\log y)\exp\Big(-u\log u - u\log\log u + u - \frac{u\log\log u}{\log u} + O\Big(\frac{u}{\log u}\Big) + O\Big(\frac{u^2\log u}{y}\Big)\Big).$$
Here the first error term is larger than the second if y ≥ (log x) log log x,
while if y is smaller, then the second error term dominates.
Proof We first note that we may suppose that y ≥ 9 log x, since the bound for
smaller y follows by taking y = 9 log x. To motivate the choice of σ in (7.17)
we note that the expression to be minimized is approximately
$$x^{\sigma}\exp\Big(\int_2^y \frac{u^{-\sigma}}{\log u}\,du\Big).$$
On taking logarithmic derivatives, this suggests that we should take σ to be the
root of the equation
$$\log x = \frac{y^{1-\sigma}}{1-\sigma}. \tag{7.26}$$
In actual fact we take
$$\sigma = 1 - \frac{\log u + \log\log u}{\log y}. \tag{7.27}$$
It is easy to see that for this σ the right-hand side of (7.26) is
$$\log x\,\frac{\log u}{\log u + \log\log u},$$
so it is reasonable to expect that the simple choice (7.27) is close enough to the
root of (7.26) for our present purposes.
From the inequalities $9\log x \le y \le x^{1/9}$ it follows that the σ given by (7.27)
satisfies 2/log y ≤ σ ≤ 1 − 1/log y. Hence the stated upper bound follows by
combining (7.17) with the estimates of Lemma 7.5.
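
For orientation, the main terms of Theorem 7.6 can be traced from the choice (7.27) as follows. This is only a sketch of the bookkeeping, with the error terms of Lemma 7.5 suppressed; it uses log x = u log y and assumes u ≥ 9, so that log u + log log u ≥ 1.

```latex
\[
(1-\sigma)\log y = \log u + \log\log u
\;\Longrightarrow\;
y^{1-\sigma} = u\log u,
\qquad
x^{\sigma} = x\exp\bigl(-(1-\sigma)\,u\log y\bigr) = x\exp(-u\log u - u\log\log u),
\]
\[
\frac{y^{1-\sigma}}{(1-\sigma)\log y} = \frac{u\log u}{\log u+\log\log u}
 = u - \frac{u\log\log u}{\log u} + O\!\Bigl(\frac{u(\log\log u)^2}{(\log u)^2}\Bigr),
\qquad
\frac{1}{1-\sigma} = \frac{\log y}{\log u+\log\log u} \le \log y .
\]
```

Multiplying the factor x^σ by the right-hand side of (7.25) then produces the terms −u log u − u log log u + u − (u log log u)/log u in the exponent, together with the factor log y.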

To obtain companion lower bounds we observe that if k is chosen so that $y^k \le x$,
then ψ(x, y) certainly counts all integers n composed of primes p ≤ y such
that Ω(n) ≤ k. Put r = π(y), and suppose that $p_1, p_2, \ldots, p_r$ are the primes
not exceeding y. Then n is of the form $n = p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r}$, and ψ(x, y) is at least
as large as the number of solutions of the inequality $a_1 + a_2 + \cdots + a_r \le k$ in
non-negative integers $a_i$. For this quantity we have an exact formula, as follows.

Lemma 7.7 Let A(r, k) denote the number of solutions of the inequality $a_1 +
a_2 + \cdots + a_r \le k$ in non-negative integers $a_i$. Then $A(r, k) = \binom{r+k}{k}$.

Analytic Proof Let $a_{r+1} = k - \sum_{i=1}^{r} a_i$. Then A(r, k) is the number of ways
of writing $k = a_1 + a_2 + \cdots + a_{r+1}$, which is the coefficient of $x^k$ in the power
series
$$\Big(\sum_{a=0}^{\infty} x^{a}\Big)^{r+1} = (1-x)^{-r-1} = \sum_{k=0}^{\infty}\binom{r+k}{k}x^{k}$$
by the ‘negative’ binomial theorem.

Combinatorial Proof Suppose that we have k circles ◦ and r bars | arranged
in a line. Let $a_1$ be the number of circles to the left of the first bar, let $a_2$ be the
number of circles between the first and second bar, and so on, so that $a_r$ is the
number of circles between the last two bars. (The number of circles to the right
of the last bar is $k - \sum a_i$.) Thus a configuration of circles and bars determines
a choice of non-negative $a_i$ with $a_1 + a_2 + \cdots + a_r \le k$. But conversely, a
choice of such $a_i$ determines a configuration of circles and bars. The number
of ways of choosing the positions of the k circles in the r + k available places
is $\binom{r+k}{k}$.
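
The formula of Lemma 7.7 is easy to confirm by exhaustive enumeration for small r and k. The following Python sketch does so; the function name `count_solutions` is introduced here purely for illustration.

```python
from itertools import product
from math import comb

def count_solutions(r, k):
    """Brute-force count of (a_1, ..., a_r), a_i >= 0 integers, with a_1 + ... + a_r <= k."""
    # Each a_i is at most k, so it suffices to search the box {0, ..., k}^r.
    return sum(1 for a in product(range(k + 1), repeat=r) if sum(a) <= k)

for r in range(1, 5):
    for k in range(0, 6):
        assert count_solutions(r, k) == comb(r + k, k)
print("A(r, k) = C(r + k, k) confirmed for 1 <= r <= 4, 0 <= k <= 5")
```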

Theorem 7.8 If log x ≤ y ≤ x, then
$$\psi(x, y) \gg \frac{x}{y}\exp(-u\log\log x + u/2).$$
Proof Let r = π(y) and let k be the largest integer such that $y^k \le x$. That is,
k = [u]. Then by Lemma 7.7 and Stirling’s formula we see that
$$\psi(x, y) \ge \binom{r+k}{k} \gg \Big(\frac{r+k}{k}\Big)^{k}\Big(\frac{r+k}{r}\Big)^{r}\frac{1}{\sqrt{k}}. \tag{7.28}$$
The identity
$$k\log(1 + r/k) + r\log(1 + k/r) = \int_0^r \log(1 + k/t)\,dt$$

shows that the left-hand side is an increasing function of r. It can be supposed
that x is sufficiently large. Let z = y/(k log y). Then the expression (7.28) is
$$\ge \Big(1 + \frac{y}{k\log y}\Big)^{k}\Big(1 + \frac{k\log y}{y}\Big)^{y/\log y}\frac{1}{\sqrt{k}} \ge \frac{1}{\sqrt{k}}\big(z(1+1/z)^{z}\big)^{k}.$$
Moreover u − 1 < k ≤ u ≤ y/log y and $z(1+1/z)^{z}$ is increasing for z ≥ 1.
Thus the above is $\ge k^{-1/2}\big(z'(1+1/z')^{z'}\big)^{k} \ge k^{-1/2}\big(z'(1+1/z')^{z'}\big)^{u-1}$ where
z′ = y/(u log y). As $z' \le y/\sqrt{k}$ this is
$$\ge \frac{1}{y}\Big(\frac{y}{u\log y}\Big)^{u}\Big(1 + \frac{u\log y}{y}\Big)^{y/\log y} = \frac{x}{y}\exp\Big(-u\log\log x + \frac{y}{\log y}\log\big(1 + (\log x)/y\big)\Big).$$
The stated inequality now follows on noting that log(1 + δ) ≥ δ/2 for 0 ≤
δ ≤ 1.

When y is of the form $y = (\log x)^{a}$ with a not too large, the upper bound of
Theorem 7.6 and the lower bound of Theorem 7.8 are quite close, and we have

Corollary 7.9 If $y = (\log x)^{a}$ and $1 \le a \le (\log x)^{1/2}/(2\log\log x)$, then
$$x^{1-1/a}\exp\Big(\frac{\log x}{5a\log\log x}\Big) < \psi(x, y) < x^{1-1/a}\exp\Big(\frac{(\log a + O(1))\log x}{a\log\log x}\Big).$$
Proof The lower bound follows from Theorem 7.8 since log y ≤ (log x)/
(4a log log x) in the range under consideration. As for the upper bound, we
note that $\log u \asymp \log\log x$, so that log log u = log log log x + O(1). Hence
log u + log log u = log log x − log a + O(1), and the result follows from
Theorem 7.6.

For 1 ≤ u ≤ 4 we may use the differential equation (7.4) and the initial
condition (7.5) to derive formulæ for ρ(u) (see Exercise 7.1.6 below), but for
larger u we take a different approach.

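Theorem 7.10 For any real or complex number s we have
$$\int_0^\infty \rho(u)e^{-us}\,du = \exp\Big(C_0 + \int_0^s \frac{e^{-z}-1}{z}\,dz\Big) \tag{7.29}$$
where $C_0$ is Euler’s constant. Conversely, for any u > 0 and any real $\sigma_0$ we
have
$$\rho(u) = \frac{e^{C_0}}{2\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty}\exp\Big(\int_0^s \frac{e^{-z}-1}{z}\,dz\Big)e^{us}\,ds. \tag{7.30}$$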
Proof Let F(s) denote the integral on the left-hand side of (7.29); this is the
Laplace transform of ρ(u). In view of the rapid decay of ρ(u) established in
Lemma 7.1, we see that the integral converges for all s, and hence that F(s) is
an entire function. On integrating by parts we see that
$$F(s) = \frac{1}{s} + \frac{1}{s}\int_1^\infty \rho'(u)e^{-us}\,du,$$
and hence that
$$(sF(s))' = -\int_1^\infty u\rho'(u)e^{-us}\,du.$$
The differential–delay identity (7.4) for ρ(u) thus yields a differential equation
for F(s),
$$(sF(s))' = e^{-s}F(s).$$
By separation of variables it follows that
$$F(s) = F(0)\exp\Big(\int_0^s \frac{e^{-z}-1}{z}\,dz\Big).$$
To determine the value of F(0) we note that
$$1 = \lim_{s\to+\infty} sF(s) = F(0)\exp\Big(\int_0^1 \frac{e^{-z}-1}{z}\,dz + \int_1^\infty \frac{e^{-z}}{z}\,dz\Big).$$
By integration by parts we see that
$$\int_0^1 \frac{e^{-z}-1}{z}\,dz + \int_1^\infty \frac{e^{-z}}{z}\,dz = \int_0^\infty e^{-z}\log z\,dz = \Gamma'(1) = -C_0 \tag{7.31}$$
by (C.12) and Theorem C.2. Hence $F(0) = e^{C_0}$. An arithmetic proof of this
is found in Exercise 7.1.7 below. Thus we have the identity (7.29), and (7.30)
follows by applying the inverse Laplace transform to both sides.
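
The value $F(0) = \int_0^\infty \rho(u)\,du = e^{C_0}$ obtained in the proof can be checked numerically. The sketch below is illustrative only: it integrates the delay equation (7.4) by the trapezoidal rule on a grid, truncates the integral at u = 20 (where ρ is negligible by Lemma 7.1), and hard-codes a decimal value of Euler’s constant; the function names are introduced here.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant C_0, hard-coded decimal value

def dickman_table(u_max, steps_per_unit=500):
    """Approximate rho on the grid u = i/steps_per_unit via (7.5) and (7.6)."""
    n = steps_per_unit
    h = 1.0 / n
    size = int(round(u_max * n)) + 1
    rho = [1.0] * size
    for i in range(n + 1, size):
        left = rho[i - 1 - n] / ((i - 1) * h)    # rho(t - 1)/t at t = (i - 1)h
        right = rho[i - n] / (i * h)             # rho(t - 1)/t at t = ih
        rho[i] = rho[i - 1] - 0.5 * h * (left + right)
    return rho, n

def integral_of_rho(u_max=20.0, steps_per_unit=500):
    """Trapezoidal approximation to the integral of rho over [0, u_max]."""
    rho, n = dickman_table(u_max, steps_per_unit)
    return (sum(rho) - 0.5 * (rho[0] + rho[-1])) / n

print("integral of rho :", integral_of_rho())
print("exp(C_0)        :", math.exp(EULER_GAMMA))
```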

7.1.1 Exercises
1. (Chowla & Vijayaraghavan 1947) Show that if f(x) is a function that tends
to infinity in such a way that log f(x) = o(log x) then almost all integers n
have a prime factor larger than f(n). That is
$$\lim_{x\to\infty}\frac{1}{x}\operatorname{card}\{n \le x : P(n) > f(n)\} = 1$$
where P(n) denotes the largest prime factor of n.

2. (de Bruijn 1951b) Let P(n) denote the largest prime factor of n. Show that
$$\sum_{n\le x}\log P(n) \sim Dx\log x$$
where $D = \int_0^\infty \rho(u)(u+1)^{-2}\,du$ is called Dickman’s constant.
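3. (cf. Alladi & Erdős 1977) Let P(n) denote the largest prime factor of n.
(a) Show that
$$\sum_{n\le x} P(n) = \sum_{\sqrt{x}<p\le x} p\Big[\frac{x}{p}\Big] + O\big(x^{3/2}\big).$$
(b) Show that the sum on the right above is
$$= \sum_{1\le k\le\sqrt{x}} k\sum_{x/(k+1)<p\le x/k} p + O\big(x^{3/2}\big).$$
(c) Show that
$$\sum_{p\le y} p = \frac{y^2}{2\log y} + O\Big(\frac{y^2}{(\log y)^2}\Big).$$
(d) Show that
$$\sum_{k=1}^{\infty} k\Big(\frac{1}{k^2} - \frac{1}{(k+1)^2}\Big) = \frac{\pi^2}{6}.$$
(e) Conclude that
$$\sum_{n\le x} P(n) = \frac{\pi^2}{12}\frac{x^2}{\log x} + O\Big(\frac{x^2}{(\log x)^2}\Big).$$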

4. Show that $\rho^{(k)}(u)$ has a jump discontinuity at u = k, and is continuous for
u > k.
5. (a) Show that ρ(u) is convex upwards for all u ≥ 1.
(b) Show that if u ≥ 2, then uρ(u) ≥ ρ(u − 1/2).
(c) Show that if u ≥ 2, then (2u − 1)ρ(u) ≤ ρ(u − 1).

6. (a) Show that if 1 ≤ u ≤ 2, then ρ(u) = 1 − log u.
(b) Show that if 2 ≤ u ≤ 3, then
$$\rho(u) = 1 - \log u + \int_2^u \frac{\log(t-1)}{t}\,dt.$$
(c) Show that if 3 ≤ u ≤ 4, then
$$\rho(u) = 1 - \log u + \int_2^u \frac{\log(t-1)}{t}\,dt - \int_3^u \frac{(\log u/t)\log(t-2)}{t-1}\,dt.$$

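7. Let $P(\sigma) = \prod_{p\le y}(1 - p^{-\sigma})^{-1}$.
(a) Explain why
$$P(1) = \sum_{p\mid n\Rightarrow p\le y}\frac{1}{n} = e^{C_0}\log y + O(1).$$
(b) Show that if σ ≥ 1, then $\frac{P'}{P}(\sigma) \ll \log y$.
(c) Deduce that
$$-P'(1) = \sum_{p\mid n\Rightarrow p\le y}\frac{\log n}{n} \ll (\log y)^2.$$
(d) Conclude that
$$\sum_{\substack{n>x\\ p\mid n\Rightarrow p\le y}}\frac{1}{n} \ll \frac{(\log y)^2}{\log x}.$$
(e) Show that
$$\sum_{\substack{n\le x\\ p\mid n\Rightarrow p\le y}}\frac{1}{n} = (\log y)\int_0^u \frac{\psi(y^{v}, y)}{y^{v}}\,dv + O(1)$$
where u = (log x)/log y.
(f) Deduce that
$$\int_0^\infty \rho(u)\,du = e^{C_0}.$$
(g) Show that $\sum_{n=1}^{\infty} n\rho(n) = e^{C_0}$.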

8. (Erdős & Nicolas 1981) Let α be ﬁxed, 0 < α < 1.

(a) Let $k$ be the least integer $> \alpha(\log x)/\log\log x$, put $y = x^{1/k}$, and set
$r = \pi(y)$. Show that there are at least $\binom{r}{k}$ integers $n \le x$ such that
$\omega(n) > \alpha(\log x)/\log\log x$.
(b) Show that the number of integers $n \le x$ such that $\omega(n) > \alpha(\log x)/\log\log x$ is at least $x^{1-\alpha+o(1)}$.
(c) Show that if $\sigma > 1$ and $A \ge 1$, then the number of integers $n \le x$ such
that $\omega(n) > \alpha(\log x)/\log\log x$ is at most
\[ x^{\sigma}A^{-k}\sum_{n=1}^{\infty} \frac{A^{\omega(n)}}{n^{\sigma}}. \]
(d) Show that if $A = \log x$ and $\sigma = 1 + (\log\log\log x)/\log\log x$, then the
above is $x^{1-\alpha+o(1)}$.
9. (de Bruijn 1966) Assume that 0 < σ ≤ 3/ log y, and note that this interval
covers a range that is not treated in Lemma 7.5.
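(a) Show that $1 - p^{-\sigma} \gg \sigma\log p$, and hence deduce that
\[ \prod_{p\le y}(1 - p^{-\sigma})^{-1} \le \exp\Big(\sum_{p\le y}\log\frac{C}{\sigma\log p}\Big) \le \exp\Big(\frac{Cy}{\log y}\log\frac{4}{\sigma\log y}\Big) \tag{7.32} \]
for a suitable constant $C$.
(b) Write
\[ \prod_{p\le y}(1 - p^{-\sigma})^{-1} = (1 - y^{-\sigma})^{-\pi(y)}\prod_{p\le y}\frac{1 - y^{-\sigma}}{1 - p^{-\sigma}} = F_1\cdot F_2, \]
say. Show that
\[ F_1 \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{Cy}{(\log y)^2}\log\frac{4}{\sigma\log y}\Big). \]
(c) Note that
\[ \frac{1 - p^{-\sigma}}{1 - y^{-\sigma}} = 1 - \frac{(y/p)^{\sigma} - 1}{y^{\sigma} - 1}, \tag{7.33} \]
and hence deduce that the above is $\ge 1 - c\,\frac{\log y/p}{\log y}$, so that
\[ F_2 \le \exp\Big(\frac{C}{\log y}\sum_{p\le y}\log y/p\Big) \le \exp\big(Cy/(\log y)^2\big). \]
(d) Conclude that
\[ \prod_{p\le y}(1 - p^{-\sigma})^{-1} \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{Cy}{(\log y)^2}\log\frac{4}{\sigma\log y}\Big) \]
for $0 < \sigma \le 3/\log y$.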
10. (de Bruijn 1966) Lemma 7.5 suffers from a loss of precision when
$3/\log y \le \sigma \le (\log\log y)/\log y$. To obtain a refined estimate in this range,
write
\[ \prod_{p\le y}(1 - p^{-\sigma})^{-1} = F_1\cdot F_2\cdot F_3 \]
where the $F_i$ are products over the intervals $p \le \exp(1/\sigma)$, $\exp(1/\sigma) < p \le y/\exp(1/\sigma)$, and $y/\exp(1/\sigma) < p \le y$, respectively.
(a) Use (7.32) to show that $F_1 \le \exp\big(C\sigma e^{1/\sigma}\big)$.
(b) Use Lemma 7.5 to show that
\[ F_2 \le \exp\Big(\frac{Cy^{1-\sigma}}{e^{1/\sigma}\log y}\Big). \]
(c) Use the identity (7.33) to show that
\[ \frac{1 - p^{-\sigma}}{1 - y^{-\sigma}} \ge 1 - \frac{c\sigma\log y/p}{y^{\sigma}}, \]
and hence deduce that
\[ F_3 \le (1 - y^{-\sigma})^{-\pi(y)}\exp\Big(C\sigma\sum_{p\le y}\frac{\log y/p}{y^{\sigma}}\Big) \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{y^{1-\sigma}}{(\log y)^2} + \frac{C\sigma y^{1-\sigma}}{\log y}\Big). \]
(d) Conclude that
\[ \prod_{p\le y}(1 - p^{-\sigma})^{-1} \le (1 - y^{-\sigma})^{-y/\log y}\exp\Big(\frac{C\sigma y^{1-\sigma}}{\log y}\Big) \]
when $3/\log y \le \sigma \le (\log\log y)/\log y$.
11. (de Bruijn 1966)
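(a) For $\sigma > 0$ let $f(\sigma) = x^{\sigma}(1 - y^{-\sigma})^{-y/\log y}$. Show that $f(\sigma)$ is minimized precisely when
\[ \sigma = \frac{\log(1 + y/\log x)}{\log y}. \]
(b) Show that for the above $\sigma$,
\[ f(\sigma) = \exp\Big(\frac{\log x}{\log y}\log\frac{y + \log x}{\log x} + \frac{y}{\log y}\log\frac{y + \log x}{y}\Big). \]
(c) Show that if $y \le \log x$, then
\[ \psi(x, y) \le \exp\Big(\frac{\log x}{\log y}\log\frac{y + \log x}{\log x} + \frac{y}{\log y}\Big(1 + O\Big(\frac{1}{\log y}\Big)\Big)\log\frac{y + \log x}{y}\Big). \]
(d) Show that if $\log x \le y \le (\log x)^2$, then
\[ \psi(x, y) \le \exp\Big(\frac{\log x}{\log y}\Big(1 + O\Big(\frac{1}{\log y}\Big)\Big)\log\frac{y + \log x}{\log x} + \frac{y}{\log y}\log\frac{y + \log x}{y}\Big). \]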

12. (Erdős 1963) Show that
\[ \psi(x, \log x) = \exp\Big((2\log 2 + o(1))\frac{\log x}{\log\log x}\Big). \]
13. (de Bruijn 1966) Show that if $a$ is fixed, $0 < a < 1$, then
\[ \psi(x, (\log x)^a) = \exp\big((1/a - 1 + o(1))(\log x)^a\big). \]
14. Let $\psi_2(x, y)$ denote the number of square-free integers $n \le x$ composed
entirely of primes $p \le y$.
(a) Show that
\[ \psi_2(x, y) = \sum_{\substack{d\le x\\ p|d\Rightarrow p\le y}} \mu(d)\,\psi(x/d^2, y). \]
(b) (Ivić) Let $\delta > 0$ be fixed. Then
\[ \psi_2(x, y) \sim \frac{6}{\pi^2}\,\psi(x, y) \]
uniformly for $x^{\delta} \le y \le x$.
(c) Show that $\psi_2(x, \log x) = \psi(x, \log x)^{1/2+o(1)}$.
(d) Show that if $a > 1$ and $y \ge (\log x)^a$, then $\psi_2(x, y) = \psi(x, y)^{1+o(1)}$.
(e) Show that if $0 < a < 1$ and $y \le (\log x)^a$, then $\psi_2(x, y) = \psi(x, y)^{o(1)}$.
(f) Show that $\psi_2(x, c\log x) = \psi(x, c\log x)^{\varphi(c)+o(1)}$ for any fixed $c > 0$,
where
\[ \varphi(c) = \begin{cases} \dfrac{c\log 2}{(c+1)\log(c+1) - c\log c} & (0 < c \le 2), \\[2ex] \dfrac{c\log c - (c-1)\log(c-1)}{(c+1)\log(c+1) - c\log c} & (c \ge 2). \end{cases} \]

7.2 Numbers composed of large primes

Let $\Phi(x, y)$ denote the number of integers $n \le x$ composed entirely of primes
$p \ge y$. The number 1 is such a number as it is an empty product. Thus it is clear
that if $y > x$, then
\[ \Phi(x, y) = 1. \tag{7.34} \]
Also, if $x^{1/2} \le y \le x$, then
\[ \Phi(x, y) = \pi(x) - \pi(y-) + O(1) = \frac{x}{\log x} - \frac{y}{\log y} + O\Big(\frac{x}{(\log x)^2}\Big). \tag{7.35} \]
For smaller values of $y$ we show that
\[ \Phi(x, y) \sim \frac{w(u)x}{\log y} \tag{7.36} \]
Figure 7.2 Buchstab's function $w(u)$ and its horizontal asymptote $e^{-C_0}$ for $1 \le u \le 4$.

where $u = (\log x)/\log y$ and $w(u)$ is a function determined by the initial condition
\[ w(u) = 1/u \tag{7.37} \]
for $1 < u \le 2$ and for $u > 2$ by the differential–delay equation
\[ (uw(u))' = w(u - 1). \tag{7.38} \]
Before proceeding further we first derive some of the simplest properties of
the function $w(u)$ depicted in Figure 7.2. By integrating (7.38) we deduce that
$uw(u) = \int_1^{u-1} w(v)\,dv + C$ for $u > 2$, and by letting $u$ tend to 2 we find that
$C = 1$ so that
\[ uw(u) = \int_1^{u-1} w(v)\,dv + 1 \tag{7.39} \]

for $u \ge 2$. From this it is evident that if $w(v) \le 1$ for $v \le u - 1$, then $w(v) \le 1$
for $v \le u$, and that if $w(v) \ge 1/2$ for $v \le u - 1$, then $w(v) \ge 1/2$ for $v \le u$.
Thus we conclude that $1/2 \le w(u) \le 1$ for all $u > 1$. From the identity
$uw'(u) = w(u - 1) - w(u)$ we deduce that $|w'(u)| \le 1/(2u)$ for all $u > 2$. Let
$M(u) = \max_{v\ge u}|w'(v)|$. Since $w(u - 1) - w(u) = -w'(\xi)$ for some $\xi$, $u - 1 < \xi < u$, we know that
\[ M(u) \le M(u - 1)/u. \]
Let $k$ be chosen so that $1 < u - k \le 2$. By using the above inequality $k$ times
we find that
\[ M(u) \le \frac{M(u - k)}{u(u - 1)\cdots(u - k + 1)} \ll \frac{1}{\Gamma(u + 1)}. \]
That is,
\[ w'(u) \ll \frac{1}{\Gamma(u + 1)} \tag{7.40} \]
for $u > 2$. Since $w'(u)$ tends to 0 rapidly, it follows that the integral $\int_2^\infty w'(v)\,dv$
converges absolutely, and hence we see that $\lim_{u\to\infty} w(u)$ exists. Since it is to
be expected that $\Phi(x, y)$ is approximately $x\prod_{p<y}(1 - 1/p)$ when $y$ is small,
it is not surprising that
\[ \lim_{u\to\infty} w(u) = e^{-C_0}. \tag{7.41} \]
We shall prove this later, as a consequence of Theorem 7.12. First we establish
the basic asymptotic estimate (7.36).
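As a numerical illustration (not part of the argument), the integral form (7.39) can be stepped forward on a grid. The following Python sketch is one way to do this; the name `buchstab_w`, the grid size and the use of the trapezoidal rule are illustrative assumptions, and the value $w(1) = 1$ is simply the continuous extension of (7.37).

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant C_0

def buchstab_w(u_max=6.0, n=2000):
    """Tabulate w(u) on [1, u_max] with step 1/n, using (7.37) and (7.39).

    Returns (us, ws). Illustrative only: the integral in (7.39) is
    approximated by the trapezoidal rule.
    """
    h = 1.0 / n
    steps = int(round((u_max - 1.0) * n))
    us = [1.0 + i * h for i in range(steps + 1)]
    ws = [0.0] * (steps + 1)
    cum = [0.0] * (steps + 1)  # cum[i] approximates the integral of w over [1, us[i]]
    ws[0] = 1.0                # continuous extension of w(u) = 1/u to u = 1
    for i in range(1, steps + 1):
        if i <= n:
            ws[i] = 1.0 / us[i]                 # (7.37): 1 < u <= 2
        else:
            ws[i] = (cum[i - n] + 1.0) / us[i]  # (7.39): u >= 2
        cum[i] = cum[i - 1] + 0.5 * h * (ws[i - 1] + ws[i])
    return us, ws

us, ws = buchstab_w()
print(ws[4000], (1 + math.log(2)) / 3)  # w(3) against (1 + log 2)/3
print(ws[-1], math.exp(-EULER_C0))      # w(6) against e^{-C_0}
```

The computed values stay within the bounds $1/2 \le w(u) \le 1$ derived above, and by $u = 6$ they are already very close to $e^{-C_0}$, in keeping with the rapid decay in (7.40).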

Theorem 7.11 (Buchstab) Let $\Phi(x, y)$ denote the number of positive integers
$n \le x$ composed entirely of prime numbers $p \ge y$, and let $w(u)$ be defined as
above. Then
\[ \Phi(x, y) = \frac{w(u)x}{\log y} - \frac{y}{\log y} + O\Big(\frac{x}{(\log x)^2}\Big) \tag{7.42} \]
uniformly for $1 \le u \le U$ and all $y \ge 2$. Here $u = (\log x)/\log y$, which is to
say that $y = x^{1/u}$.

The term $-y/\log y$ can be included in the error term when $y \ll x/\log x$ but,
in view of (7.35), has to be present when $y$ is close to $x$. It might be difficult
to prove that the above holds uniformly for all $u \ge 1$ because of the precise
form of the error term, but the weaker assertion (7.36) can be shown to hold for
$u \ge 1 + \varepsilon$, since sieve methods can be used when $u$ is large.

Proof The number of positive integers $n \le x$ whose least prime factor is $p$ is
exactly $\Phi(x/p, p)$. Hence by classifying integers according to their least prime
factor we see that
\[ \Phi(x, y) = 1 + \sum_{y\le p\le x} \Phi(x/p, p). \tag{7.43} \]
This is an identity of Buchstab; similar 'Buchstab identities' are important in
sieve theory. We show by induction on $U$ that
\[ \Phi(x, y) = \frac{w(u)x}{\log y} - \frac{y}{\log y} + O\Big(\frac{x}{(\log x)^2}\Big) \tag{7.44} \]
for $U \le u \le U + 1$. When $U = 1$ this is (7.35), and it is only in this first range
that the second main term is significant. For the inductive step we apply (7.43)
with $y = x^{1/u}$ and with $y = x^{1/U}$ and subtract to see that
\[ \Phi\big(x, x^{1/u}\big) = \Phi\big(x, x^{1/U}\big) + \sum_{x^{1/u}\le p<x^{1/U}} \Phi(x/p, p). \]
Choose $u_p$ so that $p = (x/p)^{1/u_p}$. Then the above is
\[ \Phi\big(x, x^{1/U}\big) + \sum_{x^{1/u}\le p<x^{1/U}} \Phi\big(x/p, (x/p)^{1/u_p}\big). \]
But $u_p = (\log x)/\log p - 1 \in [U - 1, U]$, so by the inductive hypothesis,
when $U \ge 2$, the above is
\[ \frac{Uw(U)x}{\log x} + O\Big(\frac{x}{(\log x)^2}\Big) + \sum_{x^{1/u}\le p<x^{1/U}} \Big(\frac{u_p w(u_p)x}{p\log x/p} + O\Big(\frac{x}{p(\log x)^2}\Big) + O\Big(\frac{p}{\log p}\Big)\Big). \]
The sum over $p$ of the first error term is $\ll x/(\log x)^2$, and the sum over $p$ of the
second is $\ll x^{2/U}/(\log x)^2$, which is acceptable since $U \ge 2$. To estimate the
contribution of the main term in the sum we write the Prime Number Theorem in
the form $\pi(t) = \operatorname{li}(t) + R(t)$, apply Riemann–Stieltjes integration, and integrate
the term involving $R(t)$ by parts, to see that the sum of the main term is
\[ \int_{x^{1/u}}^{x^{1/U}} \frac{x\,w\big(\frac{\log x}{\log t} - 1\big)}{t(\log t)^2}\,dt + \Big[f(t)R(t)\Big]_{x^{1/u}-}^{x^{1/U}-} - \int_{x^{1/u}-}^{x^{1/U}-} R(t)\,df(t) \tag{7.45} \]
where
\[ f(t) = \frac{x\,w\big(\frac{\log x}{\log t} - 1\big)}{t\log t}. \]
Since $f'(t) \ll x/(t^2\log t)$ and $R(t) \ll t/(\log t)^A$, the terms involving $R(t)$
contribute an amount $\ll_U x/(\log x)^A$. By the change of variables $v = (\log x)/\log t - 1$ we see that the first integral in (7.45) is
\[ \frac{x}{\log x}\int_{U-1}^{u-1} w(v)\,dv, \]
which by (7.39) is
\[ = \frac{x}{\log x}\big(uw(u) - Uw(U)\big). \]
On combining our estimates we obtain (7.44), so the inductive step is
complete.
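The Buchstab identity (7.43) is exact, so it can be checked by direct enumeration for small $x$. The Python sketch below does this with a least-prime-factor sieve; the helper names (`least_prime_factors`, `phi_count`, `buchstab_rhs`) and the choice of limit are illustrative assumptions, not notation from the text.

```python
def least_prime_factors(limit):
    """lpf[n] = least prime factor of n for 2 <= n <= limit (simple sieve)."""
    lpf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if lpf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if lpf[m] == m:
                    lpf[m] = p
    return lpf

def phi_count(x, y, lpf):
    """Phi(x, y): number of n <= x all of whose prime factors are >= y.
    The integer n = 1 is counted, as an empty product."""
    return 1 + sum(1 for n in range(2, x + 1) if lpf[n] >= y)

def buchstab_rhs(x, y, lpf):
    """Right-hand side of (7.43): 1 + sum over primes y <= p <= x of Phi(x/p, p)."""
    total = 1
    for p in range(2, x + 1):
        if lpf[p] == p and p >= y:
            # integers m <= x/p are exactly the integers m <= floor(x/p)
            total += phi_count(x // p, p, lpf)
    return total

LIMIT = 2000
lpf = least_prime_factors(LIMIT)
for y in (2, 3, 7, 30, 50):
    print(y, phi_count(LIMIT, y, lpf), buchstab_rhs(LIMIT, y, lpf))
```

Each printed row should show the same value twice, since both sides count the same integers, grouped on the right by least prime factor.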

We now derive formulæ for w(u) similar to those in Theorem 7.10 involving
ρ(u).

Theorem 7.12 If $s > 0$, then
\[ s + s\int_1^\infty w(u)e^{-us}\,du = \exp\Big(-C_0 + \int_0^s \frac{1 - e^{-z}}{z}\,dz\Big) \tag{7.46} \]
where $C_0$ is Euler's constant. If $u > 1$ and $\sigma_0 > 0$, then
\[ w(u) = \frac{1}{2\pi i}\int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Big(\exp\Big(\int_s^\infty \frac{e^{-z}}{z}\,dz\Big) - 1\Big)e^{us}\,ds. \tag{7.47} \]
Since the right-hand side of (7.46) is an entire function, we see that the
Laplace transform of $w(u)$ is entire apart from a simple pole at $s = 0$ with
residue $e^{-C_0}$.
Proof Let $G(s)$ denote the left-hand side of (7.46). Then
\[ \Big(\frac{G(s)}{s}\Big)' = -\int_1^\infty w(u)u e^{-us}\,du. \]
By integrating by parts we see that this is
\[ \Big[\frac{w(u)u e^{-us}}{s}\Big]_1^\infty - \frac{1}{s}\int_2^\infty w(u - 1)e^{-us}\,du = \frac{-e^{-s}G(s)}{s^2} \]
by (7.37) and (7.38). That is,
\[ G'(s) = G(s)\,\frac{1 - e^{-s}}{s}, \]
which by the method of separation of variables implies that
\[ G(s) = A\exp\Big(\int_0^s \frac{1 - e^{-z}}{z}\,dz\Big) \]
where $A$ is a positive constant. To determine the value of $A$ we note that
\[ 1 = \lim_{s\to\infty} \frac{G(s)}{s} = A\exp\Big(\int_0^1 \frac{1 - e^{-z}}{z}\,dz - \int_1^\infty \frac{e^{-z}}{z}\,dz\Big). \]
From (7.31) we deduce that $A = e^{-C_0}$, and hence we have (7.46). To obtain
(7.47) it suffices to take the inverse Laplace transform, since
\[ \int_0^s \frac{1 - e^{-z}}{z}\,dz = \int_s^\infty \frac{e^{-z}}{z}\,dz + \log s + C_0. \]

7.2.1 Exercises
1. By using (7.31), or otherwise, show that
\[ \int_0^s \frac{1 - e^{-z}}{z}\,dz = C_0 + \log s + \int_s^\infty \frac{e^{-z}}{z}\,dz \]
when $s > 0$.
2. (a) Show that
\[ w(u) = \frac{1 + \log(u - 1)}{u} \]
for $2 \le u \le 3$.
(b) Show that
\[ w(u) = \frac{1}{u}\Big(1 + \log(u - 1) + \int_3^u \frac{\log(v - 2)}{v - 1}\,dv\Big) \]
for $3 \le u \le 4$.
(c) Show that
\[ w(u) = \frac{1}{u}\Big(1 + \log(u - 1) + \int_3^u \frac{\log(v - 2)}{v - 1}\,dv + \int_4^u \frac{\big(\log\frac{u-1}{v-1}\big)\log(v - 3)}{v - 2}\,dv\Big) \]
for $4 \le u \le 5$.
3. (Friedlander 1972) Let $S$ be a set of positive integers not exceeding $X$, and
suppose that $(a, b) \le Y$ whenever $a \in S$, $b \in S$, $a \ne b$. Let $M(X, Y)$ denote
the maximum cardinality of all such sets $S$.
(a) Let $S_0$ be the set of those positive integers $n \le X$ such that if $d|n$, $d < n$,
then $d \le Y$. Show that $\operatorname{card} S_0 = M(X, Y)$.
(b) Show that if $Y \le X^{1/2}$, then
\[ M(X, Y) = 1 + \pi(X) - \pi(Y) + \sum_{p\le Y} \Phi(Y, p). \]
(c) Show that if $X^{1/2} < Y \le X$, then
\[ M(X, Y) = 1 + \pi(X) - \pi(Y) + \sum_{p<X/Y} \Phi(Y, p) + \sum_{X/Y\le p\le Y} \Phi(X/p, p). \]

7.3 Primes in short intervals

Let Jacobsthal’s function g(q) be the length of the longest gap between con-
secutive reduced residues modulo q. We show that there are long gaps between
primes by showing that there exist integers q for which g(q) is large. Since the
average gap between consecutive reduced residues (mod q) is q/ϕ(q), it is
obvious that
\[ g(q) \ge \frac{q}{\varphi(q)}. \]
If $p_1 < p_2 < \cdots < p_k$ are the distinct primes dividing $q$, then by the Chinese
Remainder Theorem there is an $x$ such that $x \equiv -i \pmod{p_i}$ for $1 \le i \le k$.
Then $(x + i, q) > 1$ for $1 \le i \le k$, and hence
\[ g(q) \ge \omega(q) + 1. \]
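For small moduli the two lower bounds above can be compared with the true value of $g(q)$ by brute force. The Python sketch below is only an illustration; the function names `jacobsthal_g`, `euler_phi` and `omega` are assumptions introduced here, and the enumeration is far too slow for large $q$.

```python
from math import gcd

def jacobsthal_g(q):
    """Largest gap between consecutive integers coprime to q."""
    residues = [a for a in range(1, q + 1) if gcd(a, q) == 1]
    residues.append(residues[0] + q)  # the pattern repeats with period q
    return max(b - a for a, b in zip(residues, residues[1:]))

def euler_phi(q):
    """Number of reduced residues modulo q."""
    return sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)

def omega(q):
    """Number of distinct prime factors of q."""
    count, p = 0, 2
    while p * p <= q:
        if q % p == 0:
            count += 1
            while q % p == 0:
                q //= p
        p += 1
    return count + (1 if q > 1 else 0)

# q runs over the products of the first 2, 3, 4, 5 primes
for q in (6, 30, 210, 2310):
    print(q, jacobsthal_g(q), q / euler_phi(q), omega(q) + 1)
```

For these primorials $g(q)$ already exceeds both elementary bounds, which is the phenomenon exploited in what follows.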
These observations can be combined: It can be shown that
\[ g(q) \gg \frac{q\,\omega(q)}{\varphi(q)}. \tag{7.48} \]
This is not quite enough to produce long gaps between primes, but for certain
$q$ we improve on the above to establish

Lemma 7.13 Let $P = P(z) = \prod_{p\le z} p$. Then
\[ \lim_{z\to\infty} \frac{g(P(z))}{z} = \infty. \]
This immediately yields

Theorem 7.14 (Westzynthius) Let $p_n$ denote the $n$th prime number. Then
\[ \limsup_{n\to\infty} \frac{p_{n+1} - p_n}{\log p_n} = \infty. \]
Proof of Theorem 7.14 Suppose that $N = g(P) - 1$ and that $M$ is chosen,
$P \le M < 2P$, so that $(M + m, P) > 1$ for $1 \le m \le N$. But $M + m > P \ge (M + m, P)$, and hence $M + m$ is composite because it has the proper divisor
$(M + m, P)$. If $n$ is chosen so that $p_n$ is the largest prime not exceeding $M$,
then $p_{n+1} - p_n \ge g(P)$ and $p_n < 2P$, which is $< e^{2z}$ when $z$ is large. Hence
\[ \frac{p_{n+1} - p_n}{\log p_n} \ge \frac{g(P)}{2z} \]
which tends to infinity as $z \to \infty$.
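To get a feeling for the size of the ratio in Theorem 7.14, one can tabulate the record gaps between consecutive primes in a small range. The Python sketch below is only illustrative; the names `primes_up_to` and `record_gaps` are assumptions introduced here, and such a short range cannot exhibit the unbounded growth that the theorem asserts.

```python
import math

def primes_up_to(limit):
    """All primes <= limit, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def record_gaps(limit):
    """List (p_n, p_{n+1} - p_n, gap / log p_n) each time the gap sets a new record."""
    primes = primes_up_to(limit)
    records, best = [], 0
    for p, q in zip(primes, primes[1:]):
        if q - p > best:
            best = q - p
            records.append((p, best, best / math.log(p)))
    return records

for p, gap, ratio in record_gaps(10 ** 6):
    print(p, gap, round(ratio, 3))
```

The last column is the quantity whose lim sup is shown to be infinite; in a range this small it grows only slowly.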

Proof of Lemma 7.13 Let $L$ be large and fixed, and put $N = [zL/3]$. We show
that if $z > z_0(L)$, then there exists an integer $M$ such that $(M + n, P(z)) > 1$
for $1 \le n \le N$. Put
\[ P_1 = \prod_{p\le L} p, \qquad P_2 = \prod_{L<p\le L^L} p, \qquad P_3 = \prod_{L^L<p\le z/3} p, \qquad P_4 = \prod_{z/3<p\le z} p, \]
and let $\mathcal{N}$ be the set of those integers $n$, $1 \le n \le N$, such that $(n, P_1P_3) = 1$.
The members of $\mathcal{N}$ are (i) 1; (ii) integers $n$ composed entirely of prime factors
of $P_2$; (iii) primes $p$, $z/3 < p \le N$. Thus
\[ \operatorname{card}\mathcal{N} \le 1 + \psi(N, L^L) + \pi(N) - \pi(z/3). \]
If $z$ is sufficiently large, then $L^L < \log N$, so that $\psi(N, L^L) < N^{\varepsilon}$ by Corollary 7.9. Hence
\[ \operatorname{card}\mathcal{N} < \pi(N). \]
We choose $M \equiv 0 \pmod{P_1P_3}$, so that $(M + n, P_1P_3) > 1$ if $1 \le n \le N$, $n \notin \mathcal{N}$. To bound the number of $n \in \mathcal{N}$ such that $(M + n, P_2) = 1$ we average as
in the proof of Lemma 3.5. Clearly
\[ \sum_{m=1}^{q} \sum_{\substack{n\in\mathcal{N}\\ (m+n,q)=1}} 1 = \sum_{n\in\mathcal{N}} \sum_{\substack{m=1\\ (m+n,q)=1}}^{q} 1 = \sum_{n\in\mathcal{N}} \varphi(q) = \varphi(q)\operatorname{card}\mathcal{N} \]
for any integer $q$. Hence
\[ \min_{m} \sum_{\substack{n\in\mathcal{N}\\ (m+n,q)=1}} 1 \le (\operatorname{card}\mathcal{N})\prod_{p|q}\Big(1 - \frac{1}{p}\Big). \]
By taking $q = P_2$ we see that there is an $M \pmod{P_2}$ such that
\[ \operatorname{card}\{n \in \mathcal{N} : (M + n, P_2) = 1\} \le (\operatorname{card}\mathcal{N})\prod_{p|P_2}\Big(1 - \frac{1}{p}\Big). \]
For such an $M$,
\[ \operatorname{card}\{1 \le n \le N : (M + n, P_1P_2P_3) = 1\} \le \pi(N)\prod_{p|P_2}\Big(1 - \frac{1}{p}\Big). \]
By Mertens' theorem (Theorem 2.7(e)), the product on the right is $\sim 1/L$ as
$L \to \infty$. Suppose that $L$ is chosen sufficiently large to ensure that this product
is $\le 3/(2L)$. Then the right-hand side above is
\[ \lesssim \frac{3N}{2L\log N} \sim \frac{z}{2\log z}. \]
The number of primes dividing $P_4$ is $\pi(z) - \pi(z/3) \sim 2z/(3\log z)$ as $z \to \infty$.
Thus if $z$ is large, then there are more such primes than there are integers $n$,
$1 \le n \le N$, for which $(M + n, P_1P_2P_3) = 1$. Hence for each such $n$ we may associate a prime $p_n$, $p_n | P_4$, in a one-to-one manner, and take $M \equiv -n \pmod{p_n}$.
Then $(M + n, P_4) > 1$ and we are done.

The success of the argument just completed can be attributed to the fact that
the number of $n$, $1 \le n \le N$, for which $(n, P_1P_3) = 1$ is considerably smaller
than $N\prod_{p|P_1P_3}(1 - 1/p)$. By considering how $L$ may be chosen as a function
of $z$ we obtain a quantitative improvement of Lemma 7.13 and hence also of
Theorem 7.14.
Theorem 7.15 (Rankin) Let $p_n$ denote the $n$th prime number in increasing
order. There is a constant $c > 0$ such that
\[ \limsup_{n\to\infty} \frac{p_{n+1} - p_n}{(\log p_n)(\log\log p_n)(\log\log\log\log p_n)/(\log\log\log p_n)^2} \ge c. \]
Proof We repeat the argument in the proof of Lemma 7.13, with the sole
change that $L$ is allowed to depend on $z$. If $L$ is chosen so that
\[ \psi(N, L^L) < \frac{N}{(\log N)^2}, \tag{7.49} \]
then $L = o(\log N)$, and hence
\[ \psi(N, L^L) = o\Big(\frac{z}{\log N}\Big). \]
Since $z/\log N \le z/\log z \ll \pi(z/3)$, it follows that
\[ \psi(N, L^L) = o(\pi(z/3)), \]
and the proof proceeds as before.
By Theorem 7.6 we see that
\[ \psi\big(N, N^{1/u}\big) < \frac{N}{(\log N)^2} \]
if $u\log u \ge 3\log\log N$, which is the case if $u \ge 4(\log\log N)/\log\log\log N$.
Taking $u = (\log N)/\log L^L$, we deduce that (7.49) holds if
\[ L\log L < \frac{(\log N)(\log\log\log N)}{4\log\log N}. \]
This is satisfied if
\[ L < \frac{(\log N)(\log\log\log N)}{4(\log\log N)^2}, \]
since then $\log L < \log\log N$. Since $N > z$ when $L \ge 3$, we conclude that we
may take
\[ L = \frac{(\log z)(\log\log\log z)}{4(\log\log z)^2}. \]
Hence
\[ g(P(z)) > \frac{z(\log z)(\log\log\log z)}{13(\log\log z)^2} \]
for all $z > z_0$, and this gives the stated result.

Concerning the maximum number of primes in a short interval, by the Brun–Titchmarsh inequality (Theorem 3.9) and the Prime Number Theorem we see
that
\[ \pi(x + y) - \pi(x) < (2 + \varepsilon)\pi(y) \]
for $y > y_0(\varepsilon)$. Let
\[ \rho(y) = \limsup_{x\to\infty}\big(\pi(x + y) - \pi(x)\big). \tag{7.50} \]
Thus $\rho(y) < (2 + \varepsilon)\pi(y)$. Very little is known about $\rho(y)$. It was once conjectured that
\[ \pi(M + N) \le \pi(M) + \pi(N) \tag{7.51} \]
for $M > 1$, $N > 1$, but there is now serious doubt as to the validity of this
inequality. Indeed, it seems likely that $\rho(y) > \pi(y)$ for all large $y$. To see why,
let
\[ \overline{\rho}(N) = \max_{M} \sum_{\substack{n=M+1\\ p|n\Rightarrow p>N}}^{M+N} 1. \tag{7.52} \]
Clearly $\rho(N) \le \overline{\rho}(N)$. We expect that
\[ \rho(N) = \overline{\rho}(N) \tag{7.53} \]
for all $N$, since this would follow from the

Prime k-tuple conjecture. Let $a_1, a_2, \ldots, a_k$ be given integers. Then there
exist infinitely many positive integers $n$ such that $n + a_1, n + a_2, \ldots, n + a_k$
are all prime, provided that for every prime number $p$ there is an integer $n$ such
that $(n + a_i, p) = 1$ for $i = 1, 2, \ldots, k$.
We now show that $\overline{\rho}(N) > \pi(N)$ for all large $N$, so that (7.51) and (7.53)
are inconsistent.

Theorem 7.16 There is an absolute constant $N_0$ such that if $N > N_0$ then
\[ \overline{\rho}(N) - \pi(N) \gg N(\log N)^{-2}. \]

Proof Suppose that $N$ is even and that $N > 2$. Then for every $M$,
\[ \sum_{\substack{n=M+1\\ p|n\Rightarrow p>N}}^{M+N} 1 = \sum_{\substack{n=M+1\\ p|n\Rightarrow p\ge N}}^{M+N} 1 \ge \sum_{\substack{n=M+1\\ p|n\Rightarrow p>N-1}}^{M+N-1} 1. \]
Hence $\overline{\rho}(N) \ge \overline{\rho}(N - 1)$ when $N$ is even, $N > 2$, so it suffices to treat the case
when $N$ is odd, say $N = 2K + 1$. Let $\mathcal{P}(K)$ denote the set of integers $n$ with
$K/(2\log K) < |n| \le K$ and $|n|$ prime. Then
\[ \operatorname{card}\mathcal{P}(K) = 2\big(\pi(K) - \pi(K/(2\log K))\big), \]
so by Theorem 6.9,
\[ \operatorname{card}\mathcal{P}(K) = \pi(2K + 1) + (c + o(1))\frac{K}{(\log K)^2} \]
where $c = 2\log 2 - 1 > 0$. We now show that $\mathcal{P}(K)$ can be translated to form
a set of integers $\{M + n : n \in \mathcal{P}(K)\}$ with each member coprime to $\prod_{p\le N} p$.
By the Chinese Remainder Theorem it suffices to show that for every prime
number $p \le N$ there is a residue class $r_p \pmod{p}$ that contains no element of
$\mathcal{P}(K)$.
Obviously each element of $\mathcal{P}(K)$ is coprime to each prime $p \le K/(2\log K)$,
so we may take $r_p = 0$ for such primes. It remains to treat the primes $p$ for
which $K/(2\log K) < p \le 2K + 1$. This is accomplished by means of a clever
application of Lemma 7.13. Suppose that $K/(2\log K) < p \le 2K + 1$. We
show that there is an $r_p$ such that if $|hp + r_p| \le K$, then $hp + r_p \notin \mathcal{P}(K)$. By
Lemma 7.13 there is an interval $J = [M_1 - 3\log K, M_1 + 3\log K]$ in which
every integer $j$ is divisible by a prime $p_j$ with $p_j \le \frac{1}{3}\log K$. By the Chinese
Remainder Theorem, we can choose $r_p$ so that $r_p \equiv M_1 p \pmod{p_j}$ for each
$j \in J$. This can be done with $0 < r_p \le \exp\big(\vartheta(\tfrac{1}{3}\log K)\big) < K^{1/2}$. If $|h| \le 3\log K$ then $h = j - M_1$ for some $j \in J$ and so $h \equiv -M_1 \pmod{p_j}$. Hence
$hp + r_p \equiv -M_1 p + r_p \equiv 0 \pmod{p_j}$, which implies that $hp + r_p \notin \mathcal{P}(K)$. On
the other hand, if $|h| > 3\log K$, then $|hp + r_p| \ge \big(\tfrac{3}{2} - o(1)\big)K > K$, so that
$hp + r_p \notin \mathcal{P}(K)$ in this case also. Since the arithmetic progression $hp + r_p$ has
no element in common with $\mathcal{P}(K)$ the proof is complete.

7.3.1 Exercises
1. Show that the function ρ(N ) is weakly increasing.
2. (a) Show that in the prime k-tuple conjecture, the hypothesis that for every
prime p the numbers a j do not cover all residue classes (mod p) is
satisfied for all p > k, so that it is enough to verify the hypothesis for
p ≤ k (a finite calculation for any given set of a j ).
(b) Prove the converse of the prime k-tuple conjecture: If there exist infinitely many integers $n$ for which $n + a_j$ is prime for all $j$, $1 \le j \le k$,
then for every prime $p$ there is a residue class $x \pmod{p}$ such that
$x + a_j \not\equiv 0 \pmod{p}$ $(1 \le j \le k)$.
3. Show that $g(q) \gg q\,\omega(q)/\varphi(q)$.
4. (cf. Erdős 1951) Show that if 0 < c < 1/2 then there exist arbitrarily large
numbers x such that the interval (x, x + c(log x)/ log log x) contains no
square-free number.
5. (cf. Erdős 1946, Montgomery 1987) Suppose that $2 \le h \le x$. Let $\mathcal{P}$ denote the set of all primes $p \le h$, let $\mathcal{D}$ denote the set of positive integers
composed entirely of primes in $\mathcal{P}$, and let $f(n) = \prod_{p|n,\, p\in\mathcal{P}}(1 - 1/p)$.
(a) Show that $f(n) = \sum_{d|n,\, d\in\mathcal{D}} \mu(d)/d$.
(b) Show that
\[ \sum_{x<n\le x+h} f(n) = \frac{6}{\pi^2}h + O(\log h) \]
uniformly in $x$.
(c) Show that
\[ \frac{\varphi(n)}{n} \ge f(n) - \sum_{\substack{p|n\\ p>h}} \frac{1}{p}. \]
(d) Among those primes $p > h$ that divide an integer in the interval $(x, x + h]$, let $\mathcal{Q}$ be those for which $p \le h\log x$, and $\mathcal{R}$ those for which $p > h\log x$. Show that
\[ \sum_{p\in\mathcal{Q}} \frac{1}{p} \ll \log\log\log x. \]
(e) Explain why
\[ \prod_{\substack{p\in\mathcal{R}\\ U<p\le 2U}} p \;\Big|\; \prod_{x<n\le x+h} n, \]
and deduce that
\[ \operatorname{card}\{p \in \mathcal{R} : U < p \le 2U\} \ll \frac{h\log x}{\log U}. \]
(f) By summing over $U = 2^k h\log x$, show that
\[ \sum_{p\in\mathcal{R}} \frac{1}{p} \ll \frac{1}{\log(h\log x)}. \]
(g) Show that
\[ \frac{6}{\pi^2}h + O(\log h) + O(\log\log\log x) \le \sum_{x<n\le x+h} \frac{\varphi(n)}{n} \le \frac{6}{\pi^2}h + O(\log h). \]
6. (cf. Pillai & Chowla 1930) Show that there is an absolute constant $c > 0$
such that there exist arbitrarily large $x$ for which $\varphi(n)/n < 1/4$ when $x < n \le x + c\log\log\log x$. Deduce that
\[ \sum_{n\le x} \frac{\varphi(n)}{n} - \frac{6}{\pi^2}x = \Omega(\log\log\log x). \]
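7. (Hausman & Shapiro 1973; cf. Montgomery & Vaughan 1986)
(a) Show that
\[ \sum_{n=1}^{q}\Big(\sum_{\substack{m=1\\ (m+n,q)=1}}^{h} 1 - \frac{\varphi(q)}{q}h\Big)^2 = \frac{\varphi(q)^2}{q}\sum_{\substack{r|q\\ r>1}} \mu(r)^2\,\frac{r^2}{\varphi(r)^2}\,\{h/r\}(1 - \{h/r\})\prod_{\substack{p|q\\ p\nmid r}} \frac{p(p - 2)}{(p - 1)^2}. \]
(b) Use the inequality $\{\alpha\}(1 - \{\alpha\}) \le \alpha$ to show that
\[ \sum_{n=1}^{q}\Big(\sum_{\substack{m=1\\ (m+n,q)=1}}^{h} 1 - \frac{\varphi(q)}{q}h\Big)^2 \le h\varphi(q). \]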

8. (Erdős 1951) (a) For a positive integer $q$, let $S(q)$ denote the set of those
residue classes $s$ modulo $q^2$ such that $(s, q^2)$ is a perfect square. Show
that if $q$ is square-free, then $S(q)$ contains exactly $\prod_{p|q}(p^2 - p + 1)$
elements.
(b) Show that if $q$ is square-free and $1 \le h \le q^2$, then there is an integer
$a$ such that the number of members of $S(q)$ in the interval $(a, a + h]$
is at most
\[ h\prod_{p|q}\Big(1 - \frac{1}{p} + \frac{1}{p^2}\Big). \]
(c) From now on, suppose that $q$ is the product of those primes $p \le y$ such
that $p \equiv 3 \pmod 4$. By recalling Corollary 4.12, or otherwise, show
that the expression above is $\ll h/\sqrt{\log y}$.
(d) Show that if an integer $n$ can be expressed as a sum of two squares,
then $n \in S(q)$.
(e) Let $\mathcal{R}$ be the set of those primes $p$, $y < p \le Cy$, such that $p \equiv 3 \pmod 4$. Here $C$ is an absolute constant, taken to be sufficiently
large to ensure that $\mathcal{R}$ has at least $y/\log y$ elements. Note that such a
constant exists, in view of Exercise 4.3.5(e). Let $r$ denote the product of
all members of $\mathcal{R}$. Suppose that the number of members of $S(q)$ lying
in the interval $(a, a + h]$ is $< y/\log y$. For each $s \in S(q)$ satisfying
$a < s \le a + h$, associate a prime $p \in \mathcal{R}$. Suppose that the integer $b$ is
chosen modulo $p^2$ so that $s + bq^2 \equiv p \pmod{p^2}$. Show that the interval
$(a + bq^2, a + bq^2 + h]$ does not contain a sum of two squares.
(f) Show that $a$ and $b$ can be chosen so that $0 < a + bq^2 < (qr)^2$.
(g) Show that $\log qr \ll y$.
(h) Show that this construction succeeds with $h \gg y/\sqrt{\log y} \gg (\log qr)/(\log\log qr)^{1/2}$.
(i) Conclude that there exist arbitrarily large $x$ such that there is no sum of
two squares between $x$ and $x + c(\log x)/(\log\log x)^{1/2}$. Here $c$ is a suitably small positive constant. (Note that a stronger result is established
in the next exercise.)

9. (Richards 1982) For every prime $p \le y$, let $\beta(p)$ denote the greatest positive
integer such that $p^{\beta} \le y$, and put
\[ q = \prod_{\substack{p\le y\\ p\equiv 3\ (4)}} p^{2\beta(p)}. \]
(a) Show that $q = \exp(2\psi(y; 4, 3))$.
(b) Show that $\log q \ll y$.
(c) Suppose that 1 ≤ n ≤ y. Show that if n ≡ 3 (mod 4), then there is a
prime p|q such that p divides n to an odd power.
(d) Let x = (q − 1)/4. Show that x is an integer, and that 4x ≡ −1
(mod q).
(e) Show that if 1 ≤ i ≤ y/4 and p|q, then the power of p that exactly
divides x + i is the same as the power of p that exactly divides 4i − 1.
(f) Deduce that no integer in the interval (x, x + y/4] can be expressed as
a sum of two squares.
(g) Conclude that there exist arbitrarily large numbers x such that no num-
ber between x and x + c log x is a sum of two squares. Here c is a
suitably small positive constant.

7.4 Numbers composed of a prescribed number of primes

Let $\sigma_k(x)$ denote the number of integers $n$ with $1 \le n \le x$ and $\Omega(n) = k$. Then
$\sigma_1(x) = \pi(x) \sim x/\log x$. Consider $\sigma_2(x)$. Clearly
\[ \sigma_2(x) = \sum_{\substack{p_1, p_2\\ p_1\le p_2\\ p_1p_2\le x}} 1 = \sum_{p\le\sqrt{x}} \big(\pi(x/p) - \pi(p) + O(1)\big). \]
By the Prime Number Theorem this is
\[ = (1 + o(1))\sum_{p\le\sqrt{x}} \frac{x}{p(\log x/p)} + O\Big(\frac{x}{\log x}\Big). \]
Thus, by partial summation and a further application of the Prime Number
Theorem we find that
\[ \sigma_2(x) \sim \frac{x\log\log x}{\log x}. \tag{7.54} \]
By inducting on $k$ in this manner it can be shown that
\[ \sigma_k(x) \sim \frac{x(\log\log x)^{k-1}}{(k - 1)!\log x} \tag{7.55} \]
for any fixed $k$. Since the sum over all $k \ge 1$ of the right-hand side is exactly $x$,
it is tempting to think that the above holds quite uniformly in $k$. However this
is not the case, as we shall presently discover. To obtain precise estimates that
are uniform in $k$ we apply analytic methods. In Section 2.4 we determined the
asymptotic distribution of the additive function $\Omega(n) - \omega(n)$ by establishing
the mean value of the multiplicative function $z^{\Omega(n)-\omega(n)}$. In the same spirit
we shall derive information concerning the distribution of $\Omega(n)$ from mean
value estimates of $z^{\Omega(n)}$. Since the Euler product of this latter function behaves
badly when $|z|$ is large, we start not with $z^{\Omega(n)}$ but with $d_z(n)$ defined by the
identities
\[ \zeta(s)^z = \prod_p \big(1 - p^{-s}\big)^{-z} = \sum_{n=1}^{\infty} d_z(n)n^{-s} \qquad (\sigma > 1). \tag{7.56} \]
Since $d_z(p) = z = z^{\Omega(p)}$, the functions $d_z(n)$ and $z^{\Omega(n)}$ are 'nearby', and hence
the mean value of $z^{\Omega(n)}$ can be derived from that for $d_z(n)$ by elementary
reasoning.
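The counts $\sigma_k(x)$ are easy to tabulate for moderate $x$, which gives a rough picture of how (7.55) behaves for several $k$ at once. The Python sketch below is illustrative only; the names `big_omega_table`, `sigma_counts` and `sigma_asymptotic` are assumptions introduced here, and because $\log\log x$ grows so slowly, close numerical agreement should not be expected at this size.

```python
import math

def big_omega_table(limit):
    """table[n] = Omega(n), the number of prime factors of n counted
    with multiplicity, for 0 <= n <= limit (table[0] is unused)."""
    lpf = list(range(limit + 1))  # least prime factor sieve
    for p in range(2, int(limit ** 0.5) + 1):
        if lpf[p] == p:
            for m in range(p * p, limit + 1, p):
                if lpf[m] == m:
                    lpf[m] = p
    table = [0] * (limit + 1)
    for n in range(2, limit + 1):
        table[n] = table[n // lpf[n]] + 1
    return table

def sigma_counts(x):
    """counts[k] = sigma_k(x) = #{1 <= n <= x : Omega(n) = k}."""
    table = big_omega_table(x)
    counts = {}
    for n in range(1, x + 1):
        counts[table[n]] = counts.get(table[n], 0) + 1
    return counts

def sigma_asymptotic(x, k):
    """Right-hand side of (7.55)."""
    ll = math.log(math.log(x))
    return x * ll ** (k - 1) / (math.factorial(k - 1) * math.log(x))

x = 10 ** 5
counts = sigma_counts(x)
for k in range(1, 8):
    print(k, counts.get(k, 0), round(sigma_asymptotic(x, k)))
```

Comparing the two columns for increasing $k$ gives an informal sense of why uniformity in $k$ needs the more careful analysis that follows.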

Theorem 7.17 Let $D_z(x) = \sum_{n\le x} d_z(n)$, and let $R$ be any positive real number. If $x \ge 2$, then
\[ D_z(x) = \frac{x(\log x)^{z-1}}{\Gamma(z)} + O\big(x(\log x)^{z-2}\big) \]
uniformly for $|z| \le R$.
Proof Let $a = 1 + 1/\log x$. Then by Corollary 5.3,
\[ D_z(x) - \frac{1}{2\pi i}\int_{a-iT}^{a+iT} \zeta(s)^z\,\frac{x^s}{s}\,ds \ll \sum_{\frac{1}{2}x<n<2x} |d_z(n)|\min\Big(1, \frac{x}{T|x - n|}\Big) + \frac{x^a}{T}\sum_{n} |d_z(n)|n^{-a}. \tag{7.57} \]
Since $|d_z(n)|$ is erratic, we must exercise some care in estimating the error terms
above. Let $\mathcal{A} = \{n : |n - x| \le x/(\log x)^{2R+1}\}$. Without loss of generality we
may suppose that $R$ is an integer. We note that $|d_z(n)| \le d_{|z|}(n) \le d_R(n)$. By
the method of the hyperbola we see by induction on $R$ that
\[ D_R(x) = xP_R(\log x) + O_R\big(x^{1-1/R}\big) \]
where $P_R$ is a polynomial of degree $R - 1$. Hence the contribution to the first
sum in the error term in (7.57) of the $n \in \mathcal{A}$ is
\[ \ll \sum_{n\in\mathcal{A}} |d_z(n)| \ll x(\log x)^{-R-2}. \]
The contribution of the $n \notin \mathcal{A}$ is
\[ \ll T^{-1}(\log x)^{2R+1}x(\log x)^{R-1}. \]
We take $T = \exp\big(\sqrt{\log x}\big)$ to see that this is also $\ll x(\log x)^{-R-2}$. The second
sum in the error term in (7.57) is $\ll \zeta(a)^R \ll (\log x)^R$. Thus the total error term
is $\ll x(\log x)^{-R-2}$.
If $z$ is a positive integer, then $\zeta(s)^z$ has a pole at $s = 1$, and we can extract
a main term by the calculus of residues, as in our proof of the Prime Number
Theorem (Theorem 6.9). On the other hand, if $z$ is not an integer, then $\zeta(s)^z$
has a branch point at $s = 1$, so greater care must be exercised in moving the
path of integration. Put $b = 1 - c/\log T$ where $c$ is a small positive constant,
and replace the contour from $a - iT$ to $a + iT$ by a path consisting of $C_1$,
$C_2$, $C_3$ where $C_1$ is polygonal with vertices $a - iT$, $b - iT$, $b - i/\log x$, $C_2$
begins with a line segment from $b - i/\log x$ to $1 - i/\log x$, continues with
the semicircle $\{1 + e^{i\theta}/\log x : -\pi/2 \le \theta \le \pi/2\}$, and concludes with the line
segment from $1 + i/\log x$ to $b + i/\log x$, and finally $C_3$ is polygonal with
vertices $b + i/\log x$, $b + iT$, $a + iT$. By Theorem 6.7, $\zeta(s)^z \ll (\log x)^R$ on the
new path, so the integrals over $C_1$ and $C_3$ contribute an amount $\ll x(\log x)^{-R-2}$.
On $C_2$ we have $\zeta(s)^z/s = (s - 1)^{-z}(1 + O(|s - 1|))$. Hence
\[ \frac{1}{2\pi i}\int_{C_2} \zeta(s)^z\,\frac{x^s}{s}\,ds = \frac{1}{2\pi i}\int_{C_2} (s - 1)^{-z}x^s\,ds + O\Big(\int_{C_2} |s - 1|^{1-z}x^{\sigma}\,|ds|\Big). \tag{7.58} \]
By the change of variables $s = 1 + w/\log x$ we see that the main term above is
\[ x(\log x)^{z-1}\,\frac{1}{2\pi i}\int_{H_2} w^{-z}e^{w}\,dw \]
where $H_2$ starts at $-\beta - i$, loops around 0, and ends at $-\beta + i$ where $\beta = c(\log x)/\log T$. Let $H_1$ be the contour $H_1 = \{w = u - i : -\infty < u \le -\beta\}$,
and similarly let $H_3 = \{w = u + i : -\infty < u \le -\beta\}$. If we integrate over
the union of the $H_i$, then we obtain Hankel's formula (see Theorem C.3)
for $1/\Gamma(z)$. The integral over $H_1$ is $\ll_R \int_{\beta}^{\infty} e^{-u/2}\,du \ll_R e^{-\beta/2}$, which is
small since $T = \exp\big(\sqrt{\log x}\big)$. Thus we see that the main term in (7.58) is
$x(\log x)^{z-1}/\Gamma(z) + O_R\big(x\exp(-c\sqrt{\log x})\big)$ for some constant $c$. On the semicircular part of $C_2$ the integrand in the error term in (7.58) is $\ll x(\log x)^{z-1}$, so
the contribution is $\ll x(\log x)^{z-2}$. By the change of variables $s = 1 + w/\log x$
we see that the linear portions of $C_2$ contribute an amount
\[ \ll x(\log x)^{z-2}\int_0^{\infty} (u^2 + 1)^{(R-1)/2}e^{-u}\,du \ll_R x(\log x)^{z-2}. \]
Thus we have the stated estimate, and the proof is complete.
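For real $z$ the main term of Theorem 7.17 can be compared with a direct summation of $d_z(n)$. The Python sketch below assumes the standard expansion $d_z(p^a) = z(z+1)\cdots(z+a-1)/a!$, which comes from applying the binomial series to each Euler factor in (7.56); the names `d_z_table` and `main_term` are illustrative assumptions, and the code handles real $z$ only.

```python
import math

def d_z_table(z, limit):
    """d[n] = d_z(n) for 1 <= n <= limit, via multiplicativity and
    d_z(p^a) = z(z+1)...(z+a-1)/a! (binomial series for (1 - p^{-s})^{-z})."""
    lpf = list(range(limit + 1))  # least prime factor sieve
    for p in range(2, int(limit ** 0.5) + 1):
        if lpf[p] == p:
            for m in range(p * p, limit + 1, p):
                if lpf[m] == m:
                    lpf[m] = p
    d = [0.0] * (limit + 1)
    d[1] = 1.0
    for n in range(2, limit + 1):
        p, m, a = lpf[n], n, 0
        while m % p == 0:  # strip the full power of p from n
            m //= p
            a += 1
        local = 1.0
        for j in range(a):
            local *= (z + j) / (j + 1)
        d[n] = d[m] * local
    return d

def main_term(z, x):
    """Main term x (log x)^(z-1) / Gamma(z) of Theorem 7.17, for real z > 0."""
    return x * math.log(x) ** (z - 1) / math.gamma(z)

x = 10 ** 5
for z in (0.5, 1.0, 1.5, 2.0):
    total = sum(d_z_table(z, x)[1:])
    print(z, total, main_term(z, x))
```

For $z = 2$ the function $d_z(n)$ is the ordinary divisor function, so the first column can be checked against known values of $\sum_{n\le x} d(n)$; the relative error term in the theorem is only of order $1/\log x$, so the two columns agree loosely at this size.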


We now establish a procedure by which we can pass from $d_z(n)$ to other
nearby functions.

Theorem 7.18 Suppose that $\sum_{m=1}^{\infty} |b_z(m)|(\log m)^{2R+1}/m$ is uniformly
bounded for $|z| \le R$, and for $\sigma \ge 1$ let
\[ F(s, z) = \sum_{m=1}^{\infty} b_z(m)m^{-s}. \]
Let $a_z(n)$ be defined by the relation
\[ \zeta(s)^z F(s, z) = \sum_{n=1}^{\infty} a_z(n)n^{-s} \qquad (\sigma > 1) \]
and let $A_z(x) = \sum_{n\le x} a_z(n)$. Then for $x \ge 2$,
\[ A_z(x) = \frac{F(1, z)}{\Gamma(z)}\,x(\log x)^{z-1} + O\big(x(\log x)^{z-2}\big). \]

Proof Since $a_z(n) = \sum_{m|n} b_z(m)d_z(n/m)$, we see by Theorem 7.17 that
\[ A_z(x) = \sum_{m\le x/2} b_z(m)D_z(x/m) + \sum_{x/2<m\le x} b_z(m) = \frac{x}{\Gamma(z)}\sum_{m\le x/2} \frac{b_z(m)}{m}(\log x/m)^{z-1} + O\Big(x\sum_{m\le x} \frac{|b_z(m)|}{m}(\log 2x/m)^{z-2}\Big). \tag{7.59} \]
The error term here is
\[ \ll x(\log x)^{z-2}\sum_{m\le\sqrt{x}} \frac{|b_z(m)|}{m} + x(\log x)^{-R-2}\sum_{m>\sqrt{x}} \frac{|b_z(m)|}{m}(\log m)^{2R} \ll x(\log x)^{z-2}. \]
In the main term, when $m \le x^{1/2}$ we write
\[ (\log x/m)^{z-1} = (\log x)^{z-1} + O\big((\log m)(\log x)^{z-2}\big). \]
Thus the first sum on the right-hand side of (7.59) is
\[ = (\log x)^{z-1}\sum_{m\le x/2} \frac{b_z(m)}{m} + O\Big((\log x)^{z-2}\sum_{m\le\sqrt{x}} \frac{|b_z(m)|}{m}\log m + (\log x)^{R-1}\sum_{m>\sqrt{x}} \frac{|b_z(m)|}{m}\Big) \]
\[ = (\log x)^{z-1}F(1, z) + O\Big((\log x)^{z-2}\sum_{m} \frac{|b_z(m)|}{m}(\log m)^{2R+1}\Big), \]
which gives the result.

Suppose that $R < 2$, and let
\[ F(s, z) = \prod_p \Big(1 - \frac{z}{p^s}\Big)^{-1}\Big(1 - \frac{1}{p^s}\Big)^{z} \tag{7.60} \]
for $\sigma > 1$, $|z| \le R$. Then $a_z(n) = z^{\Omega(n)}$ in the notation of Theorem 7.18. Hence,
with $\sigma_k(x)$ defined as at the beginning of this section we find that
\[ A_z(x) = \sum_{n\le x} z^{\Omega(n)} = \sum_{k=0}^{\infty} \sigma_k(x)z^k. \]
Here the power series on the right is actually a polynomial, since $\sigma_k(x) = 0$ for
sufficiently large $k$, when $x$ is fixed. Our asymptotic estimate for $A_z(x)$ enables
us to recover an estimate for the power series coefficients $\sigma_k(x)$, since Cauchy's
formula asserts that
\[ \sigma_k(x) = \frac{1}{2\pi i}\oint_{|z|=r} \frac{A_z(x)}{z^{k+1}}\,dz \tag{7.61} \]
for $r < 2$.

Theorem 7.19 Suppose that $R < 2$, that $F(s, z)$ is given by (7.60), and that
$G(z) = F(1, z)/\Gamma(z + 1)$. Then
\[ \sigma_k(x) = G\Big(\frac{k - 1}{\log\log x}\Big)\frac{x(\log\log x)^{k-1}}{(k - 1)!\log x}\Big(1 + O_R\Big(\frac{k}{(\log\log x)^2}\Big)\Big) \tag{7.62} \]
uniformly for $1 \le k \le R\log\log x$.

Since G(0) = G(1) = 1, we see that (7.55) holds when k = o(log log x), and
also when k = (1 + o(1)) log log x, but that (7.55) does not hold in general. The
restriction to R < 2 is necessary because of the contribution of the prime p = 2
in the Euler product (7.60) for F(s, z). If z ≥ 2, then the behaviour is different;
see Exercises 7.4.5 and 7.4.6, below.

Proof Our quantitative form of the Prime Number Theorem (Theorem 6.9)
gives the case $k = 1$, so we may assume that $k > 1$. We substitute the estimate of
Theorem 7.18 in (7.61) with $r = (k - 1)/\log\log x$. The error term contributes
an amount
\[ \ll x(\log x)^{r-2}r^{-k} = \frac{x}{(\log x)^2}\,e^{k-1}\,\frac{(\log\log x)^k}{(k - 1)^k} \ll \frac{x(\log\log x)^k}{(k - 1)!(\log x)^2} \ll \frac{x(\log\log x)^{k-3}}{(k - 1)!\log x}. \]
This is majorized by the error term in (7.62) since $G((k - 1)/\log\log x) \gg 1$.
The main term we obtain from (7.61) is $xI/\log x$ where
\[ I = \frac{1}{2\pi i}\oint_{|z|=r} G(z)(\log x)^z z^{-k}\,dz = \frac{G(r)}{2\pi i}\oint_{|z|=r} (\log x)^z z^{-k}\,dz + \frac{1}{2\pi i}\oint_{|z|=r} (G(z) - G(r))(\log x)^z z^{-k}\,dz. \]
By integration by parts we find that
\[ \frac{r}{2\pi i}\oint_{|z|=r} (\log x)^z z^{-k}\,dz = \frac{1}{2\pi i}\oint_{|z|=r} (\log x)^z z^{1-k}\,dz. \]
We multiply both sides by $G'(r)$ and combine with the former identity to see
that
\[ I = \frac{G(r)}{2\pi i}\oint_{|z|=r} (\log x)^z z^{-k}\,dz + \frac{1}{2\pi i}\oint_{|z|=r} \big(G(z) - G(r) - G'(r)(z - r)\big)(\log x)^z z^{-k}\,dz. \tag{7.63} \]
Here the first integral is $(\log\log x)^{k-1}/(k - 1)!$ by Cauchy's theorem, which
gives the desired main term. On the other hand,
\[ G(z) - G(r) - G'(r)(z - r) = \int_r^z (z - w)G''(w)\,dw \ll |z - r|^2, \]
so that if we write $z = re^{2\pi i\theta}$, then the second integral in (7.63) is
\[ \ll r^{3-k}\int_{-1/2}^{1/2} (\sin\pi\theta)^2 e^{(k-1)\cos 2\pi\theta}\,d\theta. \]
But $|\sin x| \le |x|$ and $\cos 2\pi\theta \le 1 - 8\theta^2$ for $-1/2 \le \theta \le 1/2$, so the above is
\[ \ll r^{3-k}e^{k-1}\int_0^{\infty} \theta^2 e^{-8(k-1)\theta^2}\,d\theta \ll r^{3-k}e^{k-1}(k - 1)^{-3/2} = \frac{(\log\log x)^{k-3}e^{k-1}}{(k - 1)^{k-3/2}} \ll k(\log\log x)^{k-3}/(k - 1)!. \]
This completes the proof of the theorem.

The decomposition in (7.63) is motivated by the observation that $|(\log x)^z|$
is largest, for |z| = r, when z = r. We take the Taylor expansion to the second
term because
\[
\oint_{|z|=r} (z-r)^2(\log x)^z z^{-k}\,dz \asymp \oint_{|z|=r} |z-r|^2\,|(\log x)^z z^{-k}|\,|dz|,
\]
whereas
\[
\oint_{|z|=r} (z-r)(\log x)^z z^{-k}\,dz = o\Bigl(\oint_{|z|=r} |z-r|\,|(\log x)^z z^{-k}|\,|dz|\Bigr).
\]

By the calculus of residues we may write

\[
I = \frac{1}{(k-1)!}\,\frac{d^{k-1}}{dz^{k-1}}\,G(z)(\log x)^z\Big|_{z=0}
  = \sum_{\nu=0}^{k-1}\frac{G^{(\nu)}(0)}{\nu!}\cdot\frac{(\log\log x)^{k-1-\nu}}{(k-1-\nu)!}.
\]

This gives a more accurate, but more complicated, main term.

In Section 2.3 we saw that Ω(n) rarely differs very much from log log n.
In particular, from Theorem 2.12 we see that if r < 1, then the number
of n ≤ x for which Ω(n) < r log log n is $\ll_r x/\log\log x$. We now give a
much sharper upper bound for the number of occurrences of such large
deviations.

Theorem 7.20 Let A(x, r) denote the number of n ≤ x such that Ω(n) ≤
r log log x, and let B(x, r) denote the number of n ≤ x for which Ω(n) ≥
r log log x. If 0 < r ≤ 1 and x ≥ 2, then
\[
A(x,r) \ll x(\log x)^{r-1-r\log r}.
\]
If 1 ≤ r ≤ R < 2 and x ≥ 2, then
\[
B(x,r) \ll_R x(\log x)^{r-1-r\log r}.
\]

Proof We argue directly from Theorem 7.18, using a modified form of
Rankin’s method. If 0 ≤ r ≤ 1 and Ω(n) ≤ r log log x, then $r^{r\log\log x} \le r^{\Omega(n)}$.
Hence
\[
A(x,r) \le (\log x)^{-r\log r}\sum_{n\le x} r^{\Omega(n)}.
\]
By Theorem 7.18 this is
\[
\sim \frac{F(1,r)}{\Gamma(r)}\,x(\log x)^{r-1-r\log r}
\]
where F(s, z) is taken as in (7.60). This gives the result since F(1, r) ≪ 1 and
Γ(r) ≫ 1 uniformly for 0 < r ≤ 1.
Now suppose that 1 ≤ r ≤ R < 2 and that Ω(n) ≥ r log log x. Then
$r^{\Omega(n)} \ge r^{r\log\log x}$, and hence
\[
B(x,r) \le (\log x)^{-r\log r}\sum_{n\le x} r^{\Omega(n)}.
\]
Thus we have only to proceed as before to obtain the result.
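
The first step of this argument, the Rankin-type inequality that bounds the counting function by a weighted sum, holds exactly for every x and can be checked numerically. The following Python sketch is an illustration only: the function names and the sample values of x and r are assumptions made here, not part of the text, and the code verifies only the elementary inequality, not the asymptotic estimate supplied by Theorem 7.18.

```python
import math

def big_omega_table(limit):
    """om[n] = Omega(n), the number of prime factors of n counted with multiplicity."""
    spf = list(range(limit + 1))              # spf[n] ends up as the smallest prime factor of n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:                       # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    om = [0] * (limit + 1)
    for n in range(2, limit + 1):
        om[n] = om[n // spf[n]] + 1
    return om

def rankin_count_and_bound(x, r):
    """Return (count, bound) for an integer x >= 3 and real r > 0.

    count is A(x, r) when r <= 1 and B(x, r) when r > 1;
    bound is (log x)^(-r log r) * sum_{n <= x} r^Omega(n).
    """
    om = big_omega_table(x)
    threshold = r * math.log(math.log(x))
    if r <= 1:
        count = sum(1 for n in range(1, x + 1) if om[n] <= threshold)
    else:
        count = sum(1 for n in range(1, x + 1) if om[n] >= threshold)
    weighted = sum(r ** om[n] for n in range(1, x + 1))
    bound = math.log(x) ** (-r * math.log(r)) * weighted
    return count, bound
```

Calling `rankin_count_and_bound(20000, 0.5)` or `rankin_count_and_bound(20000, 1.5)` returns a pair in which the count does not exceed the bound, as the inequality requires.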


In discussing Theorem 2.12 we proposed a probabilistic model, which in
conjunction with the Central Limit Theorem would predict that the quantity
\[
\alpha_n = \frac{\Omega(n)-\log\log n}{\sqrt{\log\log n}} \tag{7.64}
\]
is asymptotically normally distributed. We now confirm this.

Theorem 7.21 Let α_n be given by (7.64) and suppose that Y > 0. Then the
number of n, 3 ≤ n ≤ x, such that α_n ≤ y is
\[
\Phi(y)x + O_Y\Bigl(\frac{x}{\sqrt{\log\log x}}\Bigr)
\]
uniformly for −Y ≤ y ≤ Y where
\[
\Phi(y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{y} e^{-t^2/2}\,dt.
\]

Proof Let
\[
\beta_n = \frac{\Omega(n)-\log\log x}{\sqrt{\log\log x}}.
\]
Since Φ′(y) ≪ 1 and $\alpha_n - \beta_n \ll 1/\sqrt{\log\log x}$ when $x^{1/2} \le n \le x$ and Ω(n) ≤
2 log log x, it suffices to consider β_n in place of α_n. We may of course also
suppose that x is large.
Let k be a natural number and let u be defined by writing k = u + log log x.
If |u| ≤ ½ log log x, then by Stirling’s formula (see (B.26) or the more general
Theorem C.1) we see that
\[
\frac{(\log\log x)^{k-1}}{(k-1)!}
= \frac{e^u\log x}{\sqrt{2\pi\log\log x}}
  \Bigl(1+\frac{u}{\log\log x}\Bigr)^{\frac12-\log\log x-u}
  \Bigl(1+O\Bigl(\frac{1}{\log\log x}\Bigr)\Bigr).
\]
The estimate log(1 + δ) = δ − δ²/2 + O(|δ|³) holds uniformly for |δ| ≤ 1/2.
By taking δ = u/ log log x we find that
\[
\Bigl(1+\frac{u}{\log\log x}\Bigr)^{\frac12-\log\log x-u}
= \exp\Bigl(-u+\frac{u-u^2}{2\log\log x}-\frac{u^2}{4(\log\log x)^2}
  +O\Bigl(\frac{|u|^3}{(\log\log x)^2}\Bigr)\Bigr).
\]

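Suppose now that $|u| \le (\log\log x)^{2/3}$. By considering separately $|u| \le
(\log\log x)^{1/2}$ and $(\log\log x)^{1/2} < |u| \le (\log\log x)^{2/3}$ we see that
\[
\frac{u}{\log\log x} \ll \frac{1}{\sqrt{\log\log x}} + \frac{|u|^3}{(\log\log x)^2}.
\]
Similarly, by considering |u| ≤ 1 and |u| > 1 we see that
\[
\frac{u^2}{(\log\log x)^2} \ll \frac{1}{\sqrt{\log\log x}} + \frac{|u|^3}{(\log\log x)^2}.
\]
On combining these estimates we deduce that
\[
\frac{(\log\log x)^{k-1}}{(k-1)!}
= \frac{\log x}{\sqrt{2\pi\log\log x}}\exp\Bigl(\frac{-u^2}{2\log\log x}\Bigr)
  \Bigl(1+O\Bigl(\frac{1}{\sqrt{\log\log x}}\Bigr)+O\Bigl(\frac{|u|^3}{(\log\log x)^2}\Bigr)\Bigr)
\]
uniformly for $|u| \le (\log\log x)^{2/3}$. In Theorem 7.19 we have G(1) = 1 and
\[
G\Bigl(\frac{k-1}{\log\log x}\Bigr) = G(1) + O\Bigl(\frac{1+|u|}{\log\log x}\Bigr).
\]
Hence by Theorem 7.19,
\[
\sigma_k(x) = \frac{x\exp\Bigl(\dfrac{-(k-\log\log x)^2}{2\log\log x}\Bigr)}{\sqrt{2\pi\log\log x}}
  \Bigl(1+O\Bigl(\frac{1}{\sqrt{\log\log x}}\Bigr)+O\Bigl(\frac{|k-\log\log x|^3}{(\log\log x)^2}\Bigr)\Bigr).
\]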
By Theorem 7.20 we know that the contribution of k ≤ log log x −
(log log x)2/3 is negligible. We sum over the range

log log x − (log log x)2/3 ≤ k ≤ log log x + y(log log x)1/2 .

This gives rise to three sums, one for the main term and two for error terms.
Each of these sums can be considered to be a Riemann sum for an associated
integral, and the stated result follows.
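
The local approximation used in this proof, in which the Poisson-type quantity (log log x)^{k−1}/((k−1)! log x) is replaced by a Gaussian density in u = k − log log x, can be examined numerically. The Python sketch below is a hedged illustration: it treats L = log log x as a free real parameter (an assumption made for convenience, since L = 100 corresponds to an x far beyond direct computation), and the function names are hypothetical.

```python
import math

def poisson_term(k, L):
    """L^(k-1) / ((k-1)! * e^L): the main term of sigma_k(x)/x with L = log log x and G = 1."""
    return math.exp((k - 1) * math.log(L) - math.lgamma(k) - L)

def gaussian_term(k, L):
    """exp(-u^2 / (2L)) / sqrt(2*pi*L) with u = k - L."""
    u = k - L
    return math.exp(-u * u / (2 * L)) / math.sqrt(2 * math.pi * L)

def ratio(k, L):
    """Poisson-type term divided by its Gaussian approximation.

    The proof shows this is 1 + O(L^(-1/2)) + O(|u|^3 / L^2) for |u| <= L^(2/3).
    """
    return poisson_term(k, L) / gaussian_term(k, L)
```

For example, `ratio(100, 100.0)` should be very close to 1, while `ratio(110, 100.0)` should deviate from 1 by an amount comparable to the two error terms, each of which is about 0.1 for these values.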

7.4.1 Exercises
1. Let p_1, p_2, . . . , p_K be distinct primes. Show that the number of n ≤ x
composed entirely of the p_k is
\[
\frac{(\log x)^K}{K!\prod_{k=1}^{K}\log p_k} + O\bigl((\log x)^{K-1}\bigr).
\]

2. (a) Let d_z(n) be defined as in (7.56), and suppose that |z| ≤ R. Show that
$|d_z(n)| \le d_{|z|}(n) \le d_R(n)$.
(b) Let F(s, z) be defined as in (7.60). Show that if 0 < r < 1 and σ > 1/2,
then 0 < F(σ, r ) < 1.
(c) Let F(s, z) be defined as in (7.60). Show that if 1 < r < 2, then the
Dirichlet series coefficients of F(s, r ) are all non-negative.
3. (a) Show that if
\[
F(s,z) = \prod_p\Bigl(1+\frac{z}{p^s-1}\Bigr)\Bigl(1-\frac{1}{p^s}\Bigr)^{z},
\]
then F(s, z) converges for σ > 1/2, uniformly for |z| ≤ R.
(b) Show that if F(s, z) is taken as above, and if a_z(n) is defined as in
Theorem 7.18, then $a_z(n) = z^{\omega(n)}$.
(c) Let ρ_k(x) denote the number of n ≤ x for which ω(n) = k. Show that
if x ≥ 2, then
\[
\rho_k(x) = G\Bigl(\frac{k-1}{\log\log x}\Bigr)\frac{x(\log\log x)^{k-1}}{(k-1)!\,\log x}
  \Bigl(1+O_R\Bigl(\frac{k}{(\log\log x)^2}\Bigr)\Bigr)
\]
uniformly for 1 ≤ k ≤ R log log x where G(z) = F(1, z)/Γ(z + 1).
(d) Show that G(0) = G(1) = 1.
(e) Let A(x, r) denote the number of n ≤ x for which ω(n) ≤ r log log x.
Show that
\[
A(x,r) \ll x(\log x)^{r-1-r\log r}
\]
uniformly for 0 < r ≤ 1.
(f) Let B(x, r) denote the number of n ≤ x for which ω(n) ≥ r log log x.
Show that
\[
B(x,r) \ll x(\log x)^{r-1-r\log r}
\]
uniformly for 1 ≤ r ≤ R.
4. (a) Show that if
\[
F(s,z) = \prod_p\Bigl(1+\frac{z}{p^s}\Bigr)\Bigl(1-\frac{1}{p^s}\Bigr)^{z},
\]
then F(s, z) converges for σ > 1/2, uniformly for |z| ≤ R.
(b) Show that if F(s, z) is taken as above, and if a_z(n) is defined as in
Theorem 7.18, then $a_z(n) = \mu(n)^2 z^{\omega(n)}$.
(c) Let π_k(x) denote the number of square-free n ≤ x for which ω(n) = k.
Show that if x ≥ 2, then
\[
\pi_k(x) = G\Bigl(\frac{k-1}{\log\log x}\Bigr)\frac{x(\log\log x)^{k-1}}{(k-1)!\,\log x}
  \Bigl(1+O_R\Bigl(\frac{k}{(\log\log x)^2}\Bigr)\Bigr)
\]
uniformly for 1 ≤ k ≤ R log log x where G(z) = F(1, z)/Γ(z + 1).
(d) Show that G(0) = G(1) = 1.

5. (a) Show that if x ≥ 2, then
\[
\sum_{n\le x} 2^{\Omega(n)} = cx(\log x)^2 + O(x\log x)
\]
where c is a positive constant.
(b) Show that if x ≥ 2, then
\[
\sum_{n\le x} 2^{\omega(n)} = cx\log x + O(x)
\]
where c is a positive constant.

6. Show that if (2 + ε) log log x ≤ k ≤ R log log x, then $\sigma_k(x) \sim c\,2^{-k}x\log x$.
7. Show that if δ ≤ r ≤ 1 − δ (or 1 + δ ≤ r ≤ 2 − δ), then A(x, r) (or
B(x, r), respectively) is $\asymp x(\log x)^{r-1-r\log r}/\sqrt{\log\log x}$.
8. Show that if x is large, then there is a k such that
\[
\sigma_k(x) \ge \frac{x}{3\sqrt{\log\log x}}.
\]
9. Show that the mean value $\sum_{n\le x} d(n) \sim x\log x$ is due to the numbers n ≤ x
for which $|\omega(n) - 2\log\log x| \ll \sqrt{\log\log x}$.
10. Suppose that 1/2 ≤ r ≤ R. Show that the number of square-free n ≤ x that
can be written as a sum of two squares and for which ω(n) ≥ r log log x is
$\ll_R x(\log x)^{r-1-r\log 2r}$.
11. (Addison 1957) Let M_{q,k}(x) denote the number of n ≤ x such that Ω(n) ≡
k (mod q).
(a) Show that if q is fixed, then M_{q,k}(x) ∼ x/q as x → ∞.
(b) Show that if q is fixed, q > 2, then
\[
M_{q,k}(x) - \frac{x}{q} = \Omega_{\pm}\Bigl(\frac{x}{(\log x)^{\kappa}}\Bigr)
\]
where κ = 1 − cos 2π/q.
12. Show that
\[
\sum_{1<n\le x}\frac{1}{\omega(n)} \sim \frac{x}{\log\log x}
\]
as x → ∞.
13. Show that if x ≥ 2, then
\[
\sum_{1<n\le x}\frac{\Omega(n)}{\omega(n)} = x + O\Bigl(\frac{x}{\log\log x}\Bigr).
\]

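14. Suppose that 0 ≤ α ≤ 1. Show that
\[
\sum_{n\le x}\frac{\operatorname{card}\{m : m\mid n,\ m\le n^{\alpha}\}}{d(n)}
= \frac{2}{\pi}\,x\arcsin\sqrt{\alpha} + O\Bigl(\frac{x}{\sqrt{\log x}}\Bigr).
\]
15. Show that if x ≥ 16, then
\[
\sum_{\substack{n\le x\\ (n,\Omega(n))=1}} 1 = \frac{6}{\pi^2}\,x + O\Bigl(\frac{x}{\log\log\log x}\Bigr).
\]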

7.5 Notes
Section 7.1. Theorem 7.2 was first proved by Dickman (1930), and was redis-
covered by Chowla & Vijayaraghavan (1947), Ramaswami (1949), and Buch-
stab (1949). de Bruijn (1951a) gave a more precise estimate for ψ(x, y), over
a longer range of y. There is a considerable range of applications of ψ(x, y),
such as those to the distribution of kth power residues, Waring’s problem, and
the complexity of arithmetical algorithms in computer science. As a reflection
of this there have been two significant survey articles, by Norton (1971) and by
Hildebrand & Tenenbaum (1993).
Our treatment of ψ(x, y) is fairly elementary, but it would be natural to take
a more analytic approach, and use Perron’s formula to write
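\[
\begin{aligned}
\psi(x,y) &= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\prod_{p\le y}(1-p^{-s})^{-1}\,\frac{x^s}{s}\,ds \\
          &= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\zeta(s)\prod_{p>y}(1-p^{-s})\,\frac{x^s}{s}\,ds.
\end{aligned}
\]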

For s not too large, an approximation to the product over p > y is provided by
the Prime Number Theorem, and this suggests the main term
\[
\Lambda(x,y) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\zeta(s)
  \exp\Bigl(-\int_y^{\infty} v^{-s}(\log v)^{-1}\,dv\Bigr)\frac{x^s}{s}\,ds.
\]
It can be shown that this is indeed a good approximation to ψ(x, y) over a very
long range, but the technical details are rather heavy. By Theorem 7.10 it is not
hard to show that
\[
\Lambda(x,y) = x\int_{0-}^{\infty}\rho(u-v)\,d\bigl([y^v]y^{-v}\bigr)
\]
where we use (7.30) to extend the definition of ρ(u) to u ≤ 0. It follows that
\[
\Lambda(x,y) \sim \rho(u)x
\]

for a large range of u. For the further development of the theory, especially on
the analytic side, see Hildebrand & Tenenbaum (1993).
Section 7.2. Theorem 7.11 is due to Buchstab (1937). The ﬁner details of
the behaviour of Φ(x, y) when u is large are intimately connected with sieve
theory, especially that of the linear sieve, i.e., the sieve in which on average one
residue class (mod p) is removed. The standard references are Greaves (2001),
Halberstam & Richert (1974), Selberg (1991).
Section 7.3. Theorem 7.14 was ﬁrst proved by Westzynthius (1931). Erdős
(1935a) showed that
\[
\limsup_{n\to\infty}\frac{p_{n+1}-p_n}{(\log p_n)(\log\log p_n)/(\log\log\log p_n)^2} > 0,
\]

and then Rankin (1938) obtained Theorem 7.15 with c = 1/3. The value of c
has been successively improved by Schönhage (1963), Rankin (1963), Maier
& Pomerance (1990), culminating in the value $c = 2e^{C_0}$ of Pintz (1997). Erdős
offered a $10,000 prize for the ﬁrst proof that the limsup in Theorem 7.15 is
+∞.
Early studies of g(P(z)) were conducted by Backlund (1929), Brauer &
Zeitz (1930), Ricci (1934), and Chang (1938). The size of g(P(z)) is not known;
possibly it is z log z. However, it is conceivable that inﬁnitely often pn+1 − pn
is as large as (log pn )θ where θ > 1. In particular, Cramér (1936) conjectured
that
\[
\limsup_{n\to\infty}\frac{p_{n+1}-p_n}{(\log p_n)^2} = 1.
\]
Theorem 7.16 is due to Hensley & Richards (1973).
Section 7.4. The analysis of σk (x) is based on Selberg’s exposition (1954) of
Sathe (1953a,b, 1954a,b). Sathe (1954b) also shows that the bound R log log x
cannot be replaced by 2 log log x + 1. Arguments giving rise to versions of
Theorem 7.20 occur in Erdős (1935b). A qualitative version of Theorem 7.21
is a special case of Erdős & Kac (1940). Quantitative versions with various
weaker error terms were obtained by LeVeque (1949) and Kubilius (1956).
Theorem 7.21 had been conjectured by LeVeque and was established by Rényi
& Turán (1958). They also showed that the error term is both uniform in x and
best-possible.

7.6 References
Addison, A. W. (1957). A note on the compositeness of numbers, Proc. Amer. Math.
Soc. 8, 151–154.

Alladi, K. & Erdős, P. (1977). On an additive arithmetic function, Paciﬁc J. Math. 71,
275–294.
Backlund, R. J. (1929). Über die Differenzen zwischen den Zahlen, die zu den n ersten
Primzahlen teilerfremd sind, Annales Acad. sci. Fennicae 32 (Lindelöf-Festschrift),
Nr. 2, 9 pp.
Brauer, A. & Zeitz, H. (1930). Über eine zahlentheoretische Behauptung von Legendre,
Sitzungsb. Math. Ges. Berlin 29, 116–125.
de Bruijn, N. G. (1949). The asymptotically periodic behavior of the solutions of some
linear functional equations, Amer. J. Math. 71, 313–330.
(1950a). On the number of uncancelled elements in the sieve of Eratosthenes, Nederl.
Akad. Wetensch. Proc. 52, 803–812. (= Indag. Math. 12, 247–256)
(1950b). On some linear functional equations, Publ. Math. 1, 129–134.
(1951a). The asymptotic behaviour of a function occurring in the theory of primes,
J. Indian Math. Soc. 15 (A), 25–32.
(1951b). On the number of positive integers ≤ x and free of prime factors > y, Proc.
Nederl. Akad. Wetensch. 54, 50–60.
(1966). On the number of positive integers ≤ x and free of prime factors > y, II,
Proc. Koninkl. Nederl. Akad. Wetensch. A 69, 239–247. (= Indag. Math. 28)
Buchstab, A. A. (1937). Asymptotic estimates of a general number-theoretic function,
Mat. Sb. (2) 44, 1239–1246.
(1949). On those numbers in an arithmetic progression all prime factors of which are
small in magnitude, Dokl. Akad. Nauk SSSR (N. S.) 67, 5–8.
Chang, T.-H. (1938). Über aufeinanderfolgende Zahlen, von denen jede mindestens
einer von n linearen Kongruenzen genügt, deren Moduln die ersten n Primzahlen
sind, Schr. Math. Sem. Inst. Angew. Math. Univ. Berlin 4, 35–55.
Chowla, S. D. & Vijayaraghavan, T. (1947). On the largest prime divisors of numbers,
J. Indian Math. Soc. (2) 12, 31–37.
Cramér, H. (1936). On the order of magnitude of the difference between consecutive
prime numbers, Acta Arith. 2, 23–46.
DeKoninck, J.-M. (1972). On a class of arithmetical functions, Duke Math. J. 39, 807–
818.
Dickman, K. (1930). On the frequency of numbers containing prime factors of a certain
relative magnitude, Ark. Mat. Astr. fys. 22, 1–14.
Duncan, R. L. (1970). On the factorization of integers, Proc. Amer. Math. Soc. 25,
191–192.
Erdős, P. (1935a). On the difference of consecutive primes, Quart. J. Math., Oxford ser.
6, 124–128.
(1935b). On the normal number of prime factors of p − 1 and some related problems
concerning Euler’s φ- function. Quart. J. Math., Oxford ser. 6, 205–213.
(1946). Some remarks about additive and multiplicative functions, Bull. Amer. Math.
Soc. 52, 527–537.
(1951). Some problems and results in elementary number theory, Publ. Math.
Debrecen 2, 103–109.
(1962). On the integers relatively prime to n and on a number-theoretic function
considered by Jacobsthal, Math. Scand. 10, 163–170.
(1963). Problem and Solution Nr. 136, Wiskundige opgaven met de Oplossingen 21.

Erdős, P. & Kac, M. (1940). The Gaussian law of errors in the theory of additive number
theoretic functions, Amer. J. Math. 62, 738–742.
Erdős, P. & Nicolas, J.-L. (1981). Sur la fonction: nombre de facteurs premiers de n,
Enseignement Math. (2) 27, 3–27.
Friedlander, J. B. (1972). Maximal sets of integers with small common divisors, Math.
Ann. 195, 107–113.
Greaves, G. (2001). Sieves in Number Theory, Ergeb. Math. (3) 43. Berlin: Springer-
Verlag.
Halberstam, H. (1970). On integers all of whose prime factors are small, Proc. London
Math. Soc. (3) 21, 102–107.
Halberstam, H. & Richert, H.-E. (1974). Sieve Methods, London Mathematical Society
Monographs No. 4. London: Academic Press, 1974.
Hardy, G. H. & Littlewood, J. E. (1923). Some problems of “Partitio Numerorum”: III
On the expression of a number as a sum of primes, Acta Math. 44, 1–70.
Hausman, M. & Shapiro, H. N. (1973). On the mean square distribution of primitive
roots of unity, Comm. Pure Appl. Math. 26, 539–547.
Hensley, D. & Richards, I. (1973). Two conjectures concerning primes, Analytic Number
Theory, Proc. Sympos. Pure Math. 24. Providence: Amer. Math. Soc., 123–128.
(1973/4). Primes in intervals, Acta Arith. 25, 375–391.
Hildebrand, A. (1984). Integers free of large prime factors and the Riemann Hypothesis,
Mathematika 31, 258–271.
(1985). Integers free of large prime divisors in short intervals, Oxford Quart. J. 36,
57–69.
(1986a). On the number of positive integers ≤ x and free of prime factors > y,
J. Number Theory 22, 289–307.
(1986b). On the local behavior of ψ(x, y), Trans. Amer. Math. Soc. 297, 729–751.
(1987). On the number of prime factors of integers without large prime divisors,
J. Number Theory 25, 81–106.
Hildebrand, A. & Tenenbaum, G. (1986). On integers free of large prime factors, Trans.
Amer. Math. Soc. 296, 265–290.
(1993). Integers without large prime factors, J. Théor. Nombres Bordeaux. 5, 411–484.
Kubilius, I. P. (1956). Probabilistic methods in the theory of numbers, Uspehi Mat. Nauk
(N.S.) 11 68, 31–66.
Legendre, A. M. (1798). Théorie des Nombres, First edition, Vol. 2, pp. 71–79.
LeVeque, W. J. (1949). On the size of certain number-theoretic functions, Trans. Amer.
Math. Soc. 66, 440–463.
Maier, H. & Pomerance, C. (1990). Unusually large gaps between consecutive primes,
Trans. Amer. Math. Soc. 322, 201–237.
Montgomery, H. L. (1987). Fluctuations in the mean of Euler’s phi function, Proc. Indian
Acad. Sci. (Math. Sci.) 97, 239–245.
Montgomery, H. L. & Vaughan, R. C. (1986). On the distribution of reduced residues,
Ann. of Math. (2) 123 (1986), 311–333.
Norton, K. K. (1971). Numbers with Small Factors and the Least k’th Power Non-
Residues, Memoir 106, Providence: Amer. Math. Soc.
Pillai, S. S. & Chowla, S. D. (1930). On the error terms in some asymptotic formulæ in
the theory of numbers, I, J. London Math Soc. 5, 95–101.

Pintz, J. (1997). Very large gaps between consecutive primes, J. Number Theory 63,
286–301.
Ramaswami, V. (1949). The number of positive integers ≤ x and free of prime divisors
> y, and a problem of S. S. Pillai, Duke Math. J. 16, 99–109.
Rankin, R. A. (1938). The difference between consecutive primes, J. London Math. Soc.
13, 242–247.
(1963). The difference between consecutive primes, V, Proc. Edinburgh Math. Soc.
(2)13, 331–332.
Rényi, A. & Turán, P. (1958). On a theorem of Erdős–Kac, Acta Arith. 4, 71–84.
Ricci, G. (1934). Ricerche aritmetiche sui polinomi, II, Rend. Palermo 58, 190–208.
Richards, I. (1982). On the gaps between numbers which are sums of two squares, Adv.
in Math. 46, 1–2.
Sathe, L. G. (1953a,b,1954a,b). On a problem of Hardy on the distribution of integers
with a given number of prime factors I, II, III, IV, J. Indian Math. Soc. (N.S.) 17,
63–82 & 83–141, 18, 27–42 & 43–81.
Schinzel, A. (1961). Remarks on the paper “Sur certaines hypothèses concernant les
nombres premiers”, Acta Arith. 7, 1–8.
Schönhage, A. (1963). Eine Bemerkung zur Konstruktion grosser Primzahllücken, Arch.
Math. 14, 29–30.
Selberg, A. (1954). Note on a paper of L. G. Sathe, J. Indian Math. Soc. 18, 83–87.
(1991). Collected papers, Vol. II. Berlin: Springer-Verlag.
Westzynthius, E. (1931). Über die Verteilung der Zahlen, die zu den n ersten Primzahlen
teilerfremd sind, Comment. Phys.–Math. Soc. Sci. Fennica 5, Nr. 25, 37 pp.
8 Further discussion of the Prime Number Theorem

8.1 Relations equivalent to the Prime Number Theorem

The Prime Number Theorem asserts that
\[
\pi(x) \sim \frac{x}{\log x} \tag{8.1}
\]
as x → ∞. In this section we consider a number of asymptotic relations
that are equivalent to the Prime Number Theorem in the sense that they can
be derived from, and also imply the Prime Number Theorem, by means of
simple elementary arguments. These relations can also be proved by using the
same analytic machinery that we used to prove the Prime Number Theorem, but
the elementary techniques that we use to derive one relationship from another
have permanent utility.
In Corollary 2.5 we saw that $\pi(x) = \psi(x)/\log x + O(x/(\log x)^2)$ and that
$\psi(x) = \vartheta(x) + O(x^{1/2})$. Hence (8.1) is equivalent to
ψ(x) = x + o(x), (8.2)
and also to
ϑ(x) = x + o(x). (8.3)
These equivalences are fairly trivial, since the arithmetic functions involved are
nearly the same. At a somewhat deeper level, we consider $M(x) = \sum_{n\le x}\mu(n)$,
and show that the estimate
M(x) = o(x) (8.4)
is equivalent to the Prime Number Theorem. As was remarked in Chapter 6,
the relation (8.4) can be proved analytically, by applying the truncated Perron
formula to the Dirichlet series 1/ζ (s) and using the zero-free region of the zeta
function, as in the proof of the Prime Number Theorem. To derive (8.4) from
(8.2) it would be natural to express µ(n) as the Dirichlet convolution of Λ(n)
with some other function. As an aid to discovering such a function we would
write
\[
\frac{1}{\zeta(s)} = \frac{\zeta'(s)}{\zeta(s)}\cdot\frac{1}{\zeta'(s)}.
\]
Unfortunately, $1/\zeta'(s) = -1/\sum (\log n)n^{-s}$ cannot be expanded as a Dirichlet
series (because log 1 = 0), so we reach an impasse. To circumvent this difficulty
we introduce a valuable trick. Instead of treating M(x) directly, we first consider
$N(x) := \sum_{n\le x}\mu(n)\log n$. Since
\[
M(x)\log x - N(x) = \sum_{n\le x}\mu(n)\log(x/n) \ll \sum_{n\le x}\log(x/n) \ll x,
\]

it is clear that (8.4) is equivalent to the estimate

N (x) = o(x log x). (8.5)
To derive (8.5) from (8.2) we observe that the Dirichlet series generating
function of µ(n) log n is $-(1/\zeta(s))' = \zeta'(s)/\zeta(s)^2$. Alternatively, in elementary
language, we recall (1.22), which asserts that
\[
\sum_{d\mid n}\Lambda(d) = \log n
\qquad\Bigl(-\frac{\zeta'}{\zeta}(s)\cdot\zeta(s) = -\zeta'(s)\Bigr).
\]
By the Möbius inversion formula, this gives
\[
\Lambda(n) = \sum_{d\mid n}\mu(d)\log n/d
\qquad\Bigl(-\frac{\zeta'}{\zeta}(s) = -\zeta'(s)\cdot 1/\zeta(s)\Bigr), \tag{8.6}
\]
as was already noted in the proof of Theorem 2.4. But
\[
0 = (\log n)\sum_{d\mid n}\mu(d)
\qquad\Bigl(0 = \frac{d}{ds}\bigl(\zeta(s)\cdot 1/\zeta(s)\bigr)\Bigr)
\]
for all n, and so
\[
\Lambda(n) = -\sum_{d\mid n}\mu(d)\log d
\qquad\Bigl(-\frac{\zeta'}{\zeta}(s) = -\zeta(s)\cdot\bigl(\zeta'(s)/\zeta(s)^2\bigr)\Bigr).
\]
By Möbius inversion a second time, we deduce that
\[
\mu(n)\log n = -\sum_{d\mid n}\mu(d)\Lambda(n/d)
\qquad\Bigl(\zeta'(s)/\zeta(s)^2 = (1/\zeta(s))\cdot\frac{\zeta'}{\zeta}(s)\Bigr).
\]
Since Λ(n/d) is 1 on average, we adjust by this amount:
\[
\sum_{d\mid n}\mu(d)\bigl(1-\Lambda(n/d)\bigr) =
\begin{cases}
\mu(n)\log n & (n>1),\\
1 & (n=1).
\end{cases}
\]

We sum this over n ≤ x (which is to say we apply (2.7)) to see that
\[
\sum_{d\le x}\mu(d)\bigl([x/d]-\psi(x/d)\bigr) = N(x)+1.
\]
From (8.2) we know that for any ε > 0 there is a large number C = C(ε) such
that |ψ(y) − [y]| < εy provided that y ≥ C. That is, |ψ(x/d) − [x/d]| ≤ εx/d
for d ≤ x/C. Thus
\[
\Bigl|\sum_{d\le x/C}\mu(d)\bigl(\psi(x/d)-[x/d]\bigr)\Bigr| \le \sum_{d\le x/C}\frac{\varepsilon x}{d} \le \varepsilon x\log x.
\]
The remaining range we treat trivially:
\[
\sum_{x/C<d\le x}\mu(d)\bigl(\psi(x/d)-[x/d]\bigr) \ll \sum_{x/C<d\le x}\frac{x}{d} \ll x\log 2C.
\]

Since ε can be taken arbitrarily small, we see that (8.5), and hence (8.4), follows
from (8.2).
It is worth pausing here to note that the choice of the main term above is
extremely delicate. If we had subtracted x/d instead of [x/d], then we would
have had to consider the question of the size of the sum $\sum_{d\le x}\mu(d)/d$, which
will be considered later. Since $\sum_{d\le x}\mu(d)[x/d] = 1$ for all x ≥ 1, we avoid the
problem by this judicious choice of the main term.
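
Both exact identities used in this derivation, namely that the sum of µ(d)[x/d] over d ≤ x equals 1 and that the sum of µ(d)([x/d] − ψ(x/d)) over d ≤ x equals N(x) + 1, can be confirmed numerically for integer x. The following Python sketch is illustrative only; the function names are assumptions introduced here, and floating-point logarithms mean the second identity is checked up to rounding error.

```python
import math

def mobius_and_mangoldt(limit):
    """Sieve mu[n] (Moebius) and lam[n] (von Mangoldt) for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                is_prime[m] = False
            mu[m] = -mu[m]                    # one sign change per prime divisor
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0                         # not square-free
        q = p
        while q <= limit:
            lam[q] = math.log(p)              # Lambda(p^j) = log p
            q *= p
    return mu, lam

def check_identities(x):
    """For an integer x >= 1 return (sum mu(d)[x/d], sum mu(d)([x/d] - psi(x/d)), N(x) + 1)."""
    mu, lam = mobius_and_mangoldt(x)
    psi = [0.0] * (x + 1)
    for n in range(1, x + 1):
        psi[n] = psi[n - 1] + lam[n]
    floor_sum = sum(mu[d] * (x // d) for d in range(1, x + 1))
    lhs = sum(mu[d] * ((x // d) - psi[x // d]) for d in range(1, x + 1))
    n_of_x = sum(mu[n] * math.log(n) for n in range(1, x + 1))
    return floor_sum, lhs, n_of_x + 1.0
```

For integer x the quantities [x/d] and ψ(x/d) depend only on the integer part of x/d, which is why integer division suffices here.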
To complete our proof that (8.4) is equivalent to (8.2) we now assume (8.4),
and derive (8.2). By summing (8.6) over n, which is to say by applying (2.7),
we see that
\[
\psi(x) = \sum_{d\le x}\mu(d)T(x/d)
\]
where $T(x) = \sum_{m\le x}\log m$ as in Section 2.2. We recall that T(x) = x log x −
x + O(log x) by the integral test. The main term here is approximately the same
as applies to the summatory function of the divisor function, since Theorem 2.2
asserts that $D(x) = \sum_{m\le x} d(m) = x\log x + (2C_0-1)x + O(x^{1/2})$. Indeed,
the arithmetic function d(m) − 2C_0, when summed over m, produces exactly
the same main terms as log m. That is, if f(m) = log m − d(m) + 2C_0 and
$F(x) = \sum_{m\le x} f(m)$ then $F(x) \ll x^{1/2}$. On the other hand, $\sum_{r\mid n}\mu(r)d(n/r) = 1$
for all n and $\sum_{d\mid n}\mu(d) = 0$ for all n > 1, so that
\[
\sum_{d\mid n}\mu(d)f(n/d) =
\begin{cases}
\Lambda(n)-1 & (n>1),\\
2C_0-1 & (n=1).
\end{cases}
\]
On summing this over n ≤ x we find that
\[
\sum_{d\le x}\mu(d)F(x/d) = \psi(x)-[x]+2C_0. \tag{8.7}
\]

We now use (8.4) to show that the left-hand side above is o(x), which thus gives
(8.2). The reasoning employed at this point is useful for other purposes, so we
axiomatize the argument, as follows.

Theorem 8.1 (Axer’s theorem) Suppose that a_d is a sequence such that
(i) $\sum_{d\le x} a_d = o(x)$ and that (ii) $\sum_{d\le x}|a_d| \ll x$. Suppose also that F(x) is
a function defined on [1, ∞) such that (iii) F(x) has bounded variation in the
interval [1, C] for any finite C ≥ 1, and that (iv) $F(x) \ll x/(\log x)^c$ for some
constant c > 1. Then
\[
\sum_{d\le x} a_d F(x/d) = o(x).
\]
By taking a_d = µ(d) and F(x) as in (8.7), we see that (8.4) implies (8.2).

Proof Suppose that 1 ≤ U ≤ x/2. From (ii) and (iv) we see that
\[
\sum_{x/(2U)<d\le x/U} a_d F(x/d) \ll \frac{U}{(\log U)^c}\sum_{x/(2U)<d\le x/U}|a_d| \ll \frac{x}{(\log U)^c}.
\]
On taking $U = 2^j$ and summing over j ≥ J we find that
\[
\sum_{d\le x/2^J} a_d F(x/d) \ll x\sum_{j=J}^{\infty}\frac{1}{j^c} \ll_c \frac{x}{J^{c-1}}.
\]
This is small compared with x if J is large. Let $A(x) = \sum_{d\le x} a_d$. To treat the
remaining range, $x/2^J < d \le x$, we sum by parts. We do not use the Riemann–
Stieltjes integral here because A(y) and F(x/y) may have common
discontinuities. Let $n_0 = [x/2^J]$ and $n_1 = [x]$. Then
\[
\begin{aligned}
\sum_{n_0<d\le n_1} a_d F(x/d) &= \sum_{n_0<d\le n_1}\bigl(A(d)-A(d-1)\bigr)F(x/d) \\
&= \sum_{n_0<d\le n_1} A(d)F(x/d) - \sum_{n_0-1<d\le n_1-1} A(d)F(x/(d+1)) \\
&= A(n_1)F(x/n_1) - A(n_0)F(x/(n_0+1)) \\
&\quad + \sum_{n_0<d<n_1} A(d)\bigl(F(x/d)-F(x/(d+1))\bigr).
\end{aligned}
\]
Since A(n_i) = o(x) and $F(x/n_i) \ll_J 1$, the first two terms are harmless. As the
points x/d are monotonically arranged in the interval $[1, 2^J]$, the sum above
has absolute value not exceeding
\[
\max_{d\le x}|A(d)|\sum_{n_0<d<n_1}\bigl|F(x/d)-F(x/(d+1))\bigr| \le \max_{d\le x}|A(d)|\,\operatorname{var}_{[1,2^J]}F.
\]
By (i) and (iii) this is o(x) for any given J. Thus the proof is complete.
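
The summation by parts in this proof is an exact finite identity, so it can be verified with rational arithmetic. The Python sketch below is an illustration under assumptions made here: the sequence a_d, the choice F(y) = {y}, and the values of x and J are arbitrary test data, and the function names are hypothetical.

```python
import math
from fractions import Fraction

def frac_part(y):
    """Fractional part {y} = y - [y]; exact when y is a Fraction."""
    return y - math.floor(y)

def partial_summation_sides(a, x, J, F):
    """Return both sides of the summation-by-parts identity over n0 < d <= n1,
    where n0 = [x / 2^J], n1 = [x], and A(d) = a(1) + ... + a(d)."""
    n0 = math.floor(x / 2 ** J)
    n1 = math.floor(x)
    A = [0] * (n1 + 1)
    for d in range(1, n1 + 1):
        A[d] = A[d - 1] + a(d)
    lhs = sum(a(d) * F(x / d) for d in range(n0 + 1, n1 + 1))
    rhs = (A[n1] * F(x / n1) - A[n0] * F(x / (n0 + 1))
           + sum(A[d] * (F(x / d) - F(x / (d + 1))) for d in range(n0 + 1, n1)))
    return lhs, rhs
```

With `x = Fraction(2001, 2)`, `J = 3`, `a = lambda d: (-1) ** d * (d % 3)` and `F = frac_part`, the two returned values agree exactly, since every quantity involved is rational.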

By means of a further application of Axer’s theorem, we now show that
\[
\sum_{d=1}^{\infty}\frac{\mu(d)}{d} = 0 \tag{8.8}
\]
is also equivalent to the Prime Number Theorem. We take a_d = µ(d) and
F(x) = {x} = x − [x] in Axer’s theorem. Thus from (8.4) we deduce that
\[
\sum_{d\le x}\mu(d)\{x/d\} = o(x).
\]
But $\sum_{d\le x}\mu(d)[x/d] = 1$ when x ≥ 1, so the left-hand side above is
\[
-1 + x\sum_{d\le x}\frac{\mu(d)}{d}.
\]

Since this is o(x), we obtain (8.8). To derive (8.4) from (8.8) is easier, in view
of the following useful principle:

Lemma 8.2 If $\sum_{d=1}^{\infty} a_d/d$ converges, then $\sum_{d\le x} a_d = o(x)$.

Proof Let x be given, set $r(u) = \sum_{u<d\le x} a_d/d$, and note that
\[
\sum_{d\le x} a_d = \int_0^x r(u)\,du.
\]
But r(u) is bounded (independently of x), and |r(u)| < ε for u > U_0, so the
integral is ≪ U_0 + εx. That is, the sum is o(x), as desired.
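
The integral identity at the start of this proof can be checked exactly when x is a positive integer N, because r(u) is then constant on each interval [m, m + 1). The Python sketch below is a hedged illustration with a_d = µ(d) as test data; the function names are assumptions introduced here.

```python
from fractions import Fraction

def mobius(n):
    """Moebius function by trial division (adequate for small n)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0                      # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result                      # one remaining prime factor
    return result

def tail_integral(a, N):
    """Exact integral over [0, N] of r(u) = sum_{u < d <= N} a(d)/d."""
    total = Fraction(0)
    tail = Fraction(0)
    for m in range(N - 1, -1, -1):
        tail += Fraction(a(m + 1), m + 1)     # tail is now r(u) for m <= u < m + 1
        total += tail                         # that interval has length 1
    return total
```

Here `tail_integral(mobius, N)` equals `sum(mobius(d) for d in range(1, N + 1))` for every positive integer N, in agreement with the identity.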

8.1.1 Exercises

1. As in Section 2.2, let $T(x) = \sum_{n\le x}\log n$, and recall that T(x) = x log x −
x + O(log x).
(a) Show that $T(x) = \sum_{d\le x}\Lambda(d)[x/d]$.
(b) Show that
\[
x\sum_{d\le x}\frac{\Lambda(d)}{d} = T(x) + \sum_{d\le x}\{x/d\} + \sum_{d\le x}\bigl(\Lambda(d)-1\bigr)\{x/d\}.
\]
(c) Use (8.2) and Axer’s theorem to show that the last sum above is o(x).
(d) Recall Exercise 2.1.1.
(e) Show that (8.2) implies that
\[
\sum_{d\le x}\frac{\Lambda(d)}{d} = \log x - C_0 + o(1), \tag{8.9}
\]
and note how this compares with Theorem 2.7(a).
(f) Apply Lemma 8.2 with a_d = Λ(d) − 1 to show that (8.9) implies (8.2).
Hence (8.2) and (8.9) are equivalent.
(g) Show that
\[
\sum_{n\le x}\Lambda(n)\{x/n\} = (1-C_0)x + o(x).
\]

2. (a) By recalling the proof of Theorem 2.2(c), or otherwise, show that (8.2)
implies that
\[
\int_1^x\frac{\psi(u)}{u^2}\,du = \log x - 1 - C_0 + o(1). \tag{8.10}
\]
(b) Show that (8.10) implies (8.2).

3. Let b be defined as in Theorem 2.7. (a) Imitate the proof of Theorem 2.7(d)
to show that (8.2) implies that
\[
\sum_{p\le x}\frac{1}{p} = \log\log x + b + o(1/\log x). \tag{8.11}
\]
(b) Show that (8.11) implies (8.1).

4. (a) Use (8.10) and Exercise 5.2.12 to show that
\[
\sum_{d\le x}\frac{\mu(d)}{d}\log(x/d) = o(\log x). \tag{8.12}
\]
(b) Show that (8.10) implies that
\[
\sum_{d\le x}\frac{\mu(d)}{d}\log d = o(\log x). \tag{8.13}
\]
(c) By partial summation, derive (8.4) from (8.13), and thus show that (8.2),
(8.12) and (8.13) are all equivalent. (Note that a deeper assertion concerning
the sum in (8.13) was already proved in Exercise 6.2.15.)

5. Let $F(n) = \sum_{d\mid n} f(d)$ for all n. The opening remarks in Chapter 2 raise the
possibility of a connection between the two relations
(i) $S(x) = \sum_{n\le x} F(n) = cx + o(x)$;
(ii) $\sum_{d=1}^{\infty} f(d)/d = c$.
In Exercise 6.2.19 we have seen that (i) and the hypothesis f(n) ≪ 1 imply
(ii). Apply Axer’s theorem with a_d = f(d), F(x) = {x} to show that (ii) and
the hypothesis $\sum_{n\le x}|f(n)| \ll x$ imply (i).
6. Let d_k(n) be the kth divisor function, as defined in Exercise 2.1.18. Put
D_0(x) = 1, and for positive integral k let $D_k(x) = \sum_{n\le x} d_k(n)$.
(a) Show that if k is a positive integer, then $\sum_{d\le x}\mu(d)D_k(x/d) = D_{k-1}(x)$.

(b) Let g(n) be an arithmetic function, put $G(x) = \sum_{n\le x} g(n)$, and suppose
that
\[
G(x) = xP(\log x) + O\bigl(x/(\log x)^c\bigr)
\]
where c > 1 and P is a polynomial of degree K. Let P_k be the polynomial
defined in Exercise 2.1.18, and explain why there exist constants a_k so that
$P(z) = \sum_{k=1}^{K+1} a_k P_k(z)$. By applying Axer’s theorem with
$F(x) = G(x) - \sum_{k=1}^{K+1} a_k D_k(x)$, show that
\[
\sum_{d\le x}\mu(d)G(x/d) = xQ(\log x) + o(x)
\]
where Q is a polynomial of degree K − 1 with leading coefficient equal to
K times the leading coefficient of P.
7. Show that Axer’s theorem holds with hypothesis (iv) replaced by the weaker
condition that |F(x)| ≤ ω(x)x for some non-negative function ω(x) satisfying
ω(x) ↘ and $\int_1^{\infty}\omega(x)/x\,dx < \infty$.

8.2 An elementary proof of the Prime Number Theorem

As we saw in Exercise 2.1.5, a version of Möbius inversion asserts that the two
relationships

\[
B(x) = \sum_{n\le x} A(x/n), \qquad A(x) = \sum_{n\le x}\mu(n)B(x/n) \tag{8.14}
\]

are equivalent. Some familiar – and useful – examples of this pairing are
displayed in Table 8.1. In many instances of (8.14), the functions A(x) and
B(x) are summatory functions of arithmetic functions a(n) and b(n), respec-
tively, in which case a(n) and b(n) are linked by the more common Möbius
inversion

\[
b(n) = \sum_{d\mid n} a(d), \qquad a(n) = \sum_{d\mid n}\mu(d)b(n/d). \tag{8.15}
\]

The linear operator that takes A(x) to B(x) is continuous, but the transformation
is nevertheless quite unstable. For example, the choice of the functions A(x) in
the second and third lines of Table 8.1 are very close, and yet the corresponding
functions B(x) differ quite substantially.
When the asymptotic rate of growth of A(x) is known, it is easy to deduce that
of B(x), as a form of Abelian theorem. For example, if A(x) ∼ x as x → ∞,
then B(x) ∼ x log x.

Table 8.1

A(x)        B(x)
1           [x]
x           $x\sum_{n\le x} 1/n = x\log x + C_0 x + O(1)$
[x]         $\sum_{n\le x} d(n) = x\log x + (2C_0-1)x + O(x^{1/2})$
ψ(x)        $\sum_{n\le x}\log n = x\log x - x + O(\log x)$
x log x     $x\sum_{n\le x}\frac{\log x/n}{n} = \tfrac12 x(\log x)^2 + C_1 x\log x + C_2 x + O(1)$

However, from the fourth line of Table 8.1 we see that
some sort of Tauberian converse would be useful, for the purpose of proving
the Prime Number Theorem. Unfortunately, it is difficult to establish anything
stronger than the trivial estimate
\[
A(x) \ll \sum_{n\le x}|B(x/n)|. \tag{8.16}
\]

From this we see that if B(x) ≪ 1, then A(x) ≪ x. This is rather weak, since
the same upper bound for A(x) can be deduced from a weaker upper bound for
B(x): From (8.16) we see that
\[
B(x) \ll x^{\alpha},\ 0\le\alpha<1 \implies A(x) \ll_{\alpha} x. \tag{8.17}
\]
As a first application of this, we take A(x) = ψ(x) − x + 1 + C_0. Then from
lines 1, 2, and 4 of Table 8.1 we see that B(x) ≪ log x, and by (8.17) it follows
that A(x) ≪ x. That is, ψ(x) ≪ x, which is the upper bound portion of
Chebyshev’s estimate. To achieve greater success we construct a prime number sum
in which the main term is larger than O(x).
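
The pairing (8.14) can be checked exactly on the third line of Table 8.1, where A(x) = [x] and B(x) is the summatory function of the divisor function. The Python sketch below is illustrative only; the function names are assumptions made here, and x is restricted to positive integers so that A(x/n) and B(x/n) depend only on the integer part of x/n.

```python
def mobius_table(limit):
    """mu[n] for 0 <= n <= limit, by a sieve over primes."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                is_prime[m] = False
            mu[m] = -mu[m]
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu

def divisor_counts(limit):
    """d[n] = number of positive divisors of n, for 0 <= n <= limit."""
    d = [0] * (limit + 1)
    for k in range(1, limit + 1):
        for m in range(k, limit + 1, k):
            d[m] += 1
    return d

def check_pair(x):
    """Return (B(x), sum_{n<=x} A(x/n), sum_{n<=x} mu(n) B(x/n)) with A(y) = [y]."""
    mu = mobius_table(x)
    d = divisor_counts(x)
    B = [0] * (x + 1)
    for n in range(1, x + 1):
        B[n] = B[n - 1] + d[n]               # B(n) = d(1) + ... + d(n)
    forward = sum(x // n for n in range(1, x + 1))
    inverse = sum(mu[n] * B[x // n] for n in range(1, x + 1))
    return B[x], forward, inverse
```

For a positive integer x the first two returned values coincide, and the third equals x, which is A(x) = [x].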

Theorem 8.3 (Selberg) Let
\[
\Lambda_2(n) = \Lambda(n)\log n + \sum_{bc=n}\Lambda(b)\Lambda(c).
\]
Then for x ≥ 1,
\[
\sum_{n\le x}\Lambda_2(n) = 2x\log x + O(x).
\]
Clearly Λ_2(n) > 0 only when ω(n) ≤ 2. Thus the sum on the left above is
analogous to ψ(x) but with prime powers replaced by products of two prime
powers, counted with suitable weights.

Proof We begin by noting that
\[
\begin{aligned}
\sum_{d\mid n}\Lambda_2(d) &= \sum_{d\mid n}\Lambda(d)\log d + \sum_{d\mid n}\sum_{bc=d}\Lambda(b)\Lambda(c) \\
&= \sum_{d\mid n}\Lambda(d)\log d + \sum_{b\mid n}\Lambda(b)\sum_{c\mid n/b}\Lambda(c).
\end{aligned}
\]
Here the sum over c is log n/b, so the above is
\[
= \log n\sum_{d\mid n}\Lambda(d) = (\log n)^2. \tag{8.18}
\]
Hence by Möbius inversion it follows that
\[
\Lambda_2(n) = \sum_{d\mid n}\mu(d)(\log n/d)^2. \tag{8.19}
\]
Take now
\[
A(x) = \sum_{n\le x}\Lambda_2(n) - 2x\log x + c_1 x + c_2 \tag{8.20}
\]
where c_1 and c_2 are constants to be chosen later. Then by (8.18) and lines
1, 2, and 5 of Table 8.1 we see that the corresponding B(x) given by (8.14)
is
\[
B(x) = \sum_{n\le x}(\log n)^2 - 2x\sum_{n\le x}\frac{\log x/n}{n} + c_1 x\sum_{n\le x}\frac{1}{n} + c_2[x].
\]
By the integral test the first sum is $\int_1^x(\log u)^2\,du + O((\log x)^2) = x(\log x)^2 -
2x\log x + 2x + O((\log x)^2)$. Hence the above is
\[
\begin{aligned}
= {} & -2x\log x + 2x - 2C_1 x\log x - 2C_2 x \\
& + c_1 x\log x + c_1 C_0 x + c_2 x + O\bigl((\log x)^2\bigr).
\end{aligned}
\]
We now choose c_1 and c_2 so that the leading terms cancel. That is, we take
c_1 = 2 + 2C_1 and c_2 = −2 + 2C_2 − c_1 C_0. Then B(x) ≪ (log x)², and hence
by (8.17) it follows that A(x) ≪ x. The desired estimate then follows from
(8.20).
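
The identity (8.18), which drives this proof, lends itself to a direct numerical check. The Python sketch below builds Λ_2(n) from its definition and measures how far the divisor sum of Λ_2 is from (log n)²; it also exposes the ratio of the summatory function of Λ_2 to 2x log x, which by Theorem 8.3 is 1 + O(1/log x). The function names are assumptions made here, and the floating-point comparison is only accurate up to rounding error.

```python
import math

def mangoldt_table(limit):
    """lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(2 * p, limit + 1, p):
                is_prime[m] = False
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam

def lambda2_table(limit):
    """lam2[n] = Lambda(n) log n + sum over ordered pairs bc = n of Lambda(b) Lambda(c)."""
    lam = mangoldt_table(limit)
    lam2 = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        lam2[n] = lam[n] * math.log(n)
    for b in range(2, limit + 1):
        if lam[b] == 0.0:
            continue
        for c in range(2, limit // b + 1):
            lam2[b * c] += lam[b] * lam[c]
    return lam2

def divisor_sum_defect(limit):
    """Largest value of |sum_{d|n} Lambda_2(d) - (log n)^2| over 1 <= n <= limit."""
    lam2 = lambda2_table(limit)
    worst = 0.0
    for n in range(1, limit + 1):
        s = sum(lam2[d] for d in range(1, n + 1) if n % d == 0)
        worst = max(worst, abs(s - math.log(n) ** 2))
    return worst

def selberg_ratio(x):
    """(sum_{n <= x} Lambda_2(n)) / (2 x log x) for an integer x >= 2."""
    return sum(lambda2_table(x)[1:]) / (2 * x * math.log(x))
```

The defect returned by `divisor_sum_defect` should be at the level of rounding error. The ratio from `selberg_ratio` approaches 1 only slowly, since the secondary term in Theorem 8.3 is of order x.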

Selberg’s identity may be modified in a variety of ways. For example, we
note that
\[
\sum_{n\le x}\Lambda(n)\log n = \int_1^x\log u\,d\psi(u) = \psi(x)\log x - \int_1^x\frac{\psi(u)}{u}\,du.
\]
By Chebyshev’s estimate this last integral is ≪ x, and hence the above is
\[
= \psi(x)\log x + O(x). \tag{8.21}
\]
On inserting this in Selberg’s identity, we find that
\[
\psi(x)\log x + \sum_{n\le x}\psi(x/n)\Lambda(n) = 2x\log x + O(x). \tag{8.22}
\]

Our object is to show that each term on the left above is $\sim x\log x$ as $x \to \infty$.
Suppose, to the contrary, that $\psi(x)$ is somewhat larger than anticipated,
say $\psi(x) = ax$ with $a > 1$. By combining Mertens’ estimate $\sum_{n\le x}\Lambda(n)/n = \log x + O(1)$
with (8.22), we see that $\psi(y)/y$ is on average approximately $2 - a$
as $y$ runs over the points $x/p^k$, counted with the appropriate weights. Note that
$2 - a < 1$. That is, if $x$ is chosen so that $\psi(x)$ is unusually large, then $\psi(x/p^k)$
must be unusually small for many prime powers $p^k$. Such an argument may
be repeated, so that one finds that $\psi(x/(p^kq^\ell))$ is unusually large for many
prime powers $q^\ell$. The points $x/p^k$ and $x/(p^kq^\ell)$ are highly interlacing, so that
$\psi(y)$ would have to switch rapidly back and forth between large and small
values. However, $\psi(x)$ is a (weakly) increasing function, which implies that
if it is unusually large at one point, then it continues to be unusually large for
some time after. More precisely, if $\psi(x) \ge ax$ with $a > 1$, then $\psi(y) \ge \sqrt{a}\,y$
uniformly for $x \le y \le \sqrt{a}\,x$. Similarly, if $\psi(x) \le bx$ with $b < 1$ then $\psi(y) \le \sqrt{b}\,y$
uniformly for $\sqrt{b}\,x \le y \le x$. Of course an interval on which $\psi(y)$ is
large cannot overlap with one on which $\psi(y)$ is small. One expects to reach a
contradiction by showing that these intervals are too numerous and too long to
all fit in the interval $[1, x]$. Our remaining task is to convert this intuitive line
of reasoning into a rigorous proof.
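The two persistence claims in the preceding paragraph use nothing beyond the monotonicity of $\psi$. Written out as a worked step (our spelling-out, not part of the original text):

```latex
\[
  \psi(y) \ge \psi(x) \ge ax = \sqrt{a}\,\bigl(\sqrt{a}\,x\bigr) \ge \sqrt{a}\,y
  \qquad (x \le y \le \sqrt{a}\,x,\ a > 1),
\]
\[
  \psi(y) \le \psi(x) \le bx = \sqrt{b}\,\bigl(\sqrt{b}\,x\bigr) \le \sqrt{b}\,y
  \qquad (\sqrt{b}\,x \le y \le x,\ b < 1).
\]
```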
Let $R(x)$ be defined by the relation $\psi(x) = x + R(x)$. By combining the
estimate of Mertens cited above with (8.22) we see that
\[
R(x)\log x + \sum_{n\le x}R(x/n)\Lambda(n) \ll x. \tag{8.23}
\]
Here the sum is a weighted average of values of $R$, but the total amount of
weight, $\sum_{n\le x}\Lambda(n) = \psi(x)$, remains in doubt. To overcome this difficulty, we
iterate the identity (8.23) as follows: By replacing $x$ in (8.23) by $x/m$ we find
that
\[
R(x/m)\log x/m + \sum_{n\le x/m}R(x/(mn))\Lambda(n) \ll x/m.
\]
We multiply this by $\Lambda(m)$ and sum over all $m \le x$, and thus find that
\[
\sum_{m\le x}R(x/m)\Lambda(m)\log x/m + \sum_{mn\le x}R(x/(mn))\Lambda(m)\Lambda(n) \ll x\log x.
\]

We multiply both sides of (8.23) by $\log x$ and subtract the above to see that
\[
R(x)(\log x)^2 = -\sum_{n\le x}R(x/n)\Lambda(n)\log n + \sum_{mn\le x}R(x/(mn))\Lambda(m)\Lambda(n) + O(x\log x). \tag{8.24}
\]
This has the advantage over (8.23) that we know how much weight resides in the
coefficients on the right-hand side, by virtue of Theorem 8.3. We now formulate
a Tauberian principle that is appropriate to estimate the above expression.

Lemma 8.4 Suppose that $a_n \ge 0$ and $b_n \ge 0$ for all $n$, and that
\[
\tfrac12 x\log x \le \sum_{n\le x}a_n \le \tfrac32 x\log x, \tag{8.25}
\]
\[
\tfrac12 x\log x \le \sum_{n\le x}b_n \le \tfrac32 x\log x \tag{8.26}
\]
for all large $x$. Suppose also that
\[
\sum_{n\le x}(a_n + b_n) \sim 2x\log x \tag{8.27}
\]
as $x \to \infty$. Finally, suppose that $r(u)$ is a function such that
\[
|r(u)| \le \beta u \tag{8.28}
\]
for all large $u$ where $0 < \beta \le 1$, and that
\[
r(v) - r(u) \ge -(v - u) \tag{8.29}
\]
when $v \ge u$. Then
\[
\Bigl|\sum_{n\le x}(a_n - b_n)r(x/n)\Bigr| \le \Bigl(\beta - \frac{\beta^2}{100} + o(1)\Bigr)x(\log x)^2.
\]

Proof Without loss of generality the hypotheses hold for all $x \ge 1$, $u \ge 1$,
since changes in the definitions of $a_n$, $b_n$ for small $n$, and $r(u)$ for small $u$ entail
additional error terms of magnitude $O(x\log x)$. It suffices to show that
\[
\sum_{n\le x}(a_n - b_n)r(x/n) \le \Bigl(\beta - \frac{\beta^2}{100} + o(1)\Bigr)x(\log x)^2, \tag{8.30}
\]
since the reverse inequality can then be derived by exchanging the roles of $a_n$
and $b_n$. By applying first (8.28) and then (8.27) we see that the left-hand side
above is trivially
\[
\le \beta x\sum_{n\le x}\frac{a_n + b_n}{n} \sim \beta x(\log x)^2. \tag{8.31}
\]
We write the left-hand side of (8.30) in the form
\[
\beta x\sum_{n\le x}\frac{a_n + b_n}{n} - \sum_{n\le x}a_n\Bigl(\frac{\beta x}{n} - r(x/n)\Bigr) - \sum_{n\le x}b_n\Bigl(\frac{\beta x}{n} + r(x/n)\Bigr).
\]
By (8.31), this is
\[
\sim \beta x(\log x)^2 - S_A - S_B,
\]
say. Note that both factors of the summands in $S_A$ are non-negative, so that
$S_A \ge 0$. Similarly, $S_B \ge 0$. We need to show that
\[
S_A + S_B \ge \Bigl(\frac{\beta^2}{100} + o(1)\Bigr)x(\log x)^2. \tag{8.32}
\]
To this end we show that
\[
\sum_{y<n\le 16y}\Bigl(a_n\Bigl(\frac{\beta x}{n} - r(x/n)\Bigr) + b_n\Bigl(\frac{\beta x}{n} + r(x/n)\Bigr)\Bigr) \ge \frac{1}{16}\beta^2x\log y \tag{8.33}
\]
for all large $y$. Then (8.32) follows on summing this over $y = x16^{-k}$, $1 \le k \le [(\log x)/\log 16]$.
In proving (8.33) we consider three cases.
Case 1. $r(u) \le \frac12\beta u$ for all $u \in [\frac{x}{16y}, \frac{x}{4y}]$. Then $r(x/n) \le \frac12\beta x/n$ for all $n \in [4y, 16y]$, and hence
\[
\sum_{y<n\le 16y}a_n\Bigl(\frac{\beta x}{n} - r(x/n)\Bigr) \ge \frac12\beta x\sum_{4y<n\le 16y}\frac{a_n}{n}.
\]
Since the denominator does not exceed $16y$, the above is
\[
\ge \frac{\beta x}{32y}\sum_{4y<n\le 16y}a_n.
\]
Here the sum is $\sum_{n\le 16y}a_n - \sum_{n\le 4y}a_n$, which by (8.25) is $\ge 8y\log 16y - 6y\log 4y > 2y\log y$. Thus the above is
\[
\ge \frac{\beta x}{16}\log y.
\]
Since $\beta \le 1$, this gives (8.33) in this case.
Case 2. $r(u) \ge -\frac12\beta u$ for all $u \in [\frac{x}{4y}, \frac{x}{y}]$. Then $r(x/n) \ge -\frac12\beta x/n$ for $n \in [y, 4y]$. Arguing as in the preceding case, but using (8.26) instead of (8.25), we
find that
\[
\sum_{y<n\le 4y}b_n\Bigl(\frac{\beta x}{n} + r(x/n)\Bigr) \ge \frac12\beta x\sum_{y<n\le 4y}\frac{b_n}{n} \ge \frac{\beta x}{8y}\sum_{y<n\le 4y}b_n \ge \frac{\beta x\log y}{16}.
\]
This gives (8.33) in this case.

If neither Case 1 nor Case 2 applies, then we have

Case 3. There is a $u_1 \in [\frac{x}{16y}, \frac{x}{4y}]$ such that $r(u_1) \ge \frac12\beta u_1$, and a $u_2 \in [\frac{x}{4y}, \frac{x}{y}]$
such that $r(u_2) \le -\frac12\beta u_2$. Let $u_4$ be the inf of those $u \ge u_1$ such that
$r(u) \le -\frac12\beta u$. We show that $r(u_4) = -\frac12\beta u_4$. Suppose that $r(u_4) > -\frac12\beta u_4$,
say $r(u_4) + \frac12\beta u_4 = \delta > 0$. Suppose that
\[
u_4 \le v < u_4 + \frac{\delta}{1 - \frac12\beta}. \tag{8.34}
\]
Then by (8.29) we see that
\[
r(v) \ge r(u_4) - (v - u_4) = -\tfrac12\beta u_4 + \delta - (v - u_4).
\]
From the upper bound in (8.34) we deduce that the above expression is $> -\frac12\beta v$.
That is, the inequality $r(u) \le -\frac12\beta u$ holds at no point of the interval
(8.34). Since this contradicts the definition of $u_4$, it follows that $r(u_4) \le -\frac12\beta u_4$.
Now suppose that $r(u_4) < -\frac12\beta u_4$, say $-r(u_4) - \frac12\beta u_4 = \delta > 0$. Suppose also
that
\[
u_4 - \frac{\delta}{1 - \frac12\beta} \le u \le u_4. \tag{8.35}
\]
Then by (8.29) we see that
\[
r(u) \le r(u_4) + (u_4 - u) = -\tfrac12\beta u_4 - \delta + (u_4 - u).
\]
From the lower bound in (8.35) we deduce that this expression is $\le -\frac12\beta u$.
That is, the inequality $r(u) \le -\frac12\beta u$ holds throughout the interval (8.35).
Since this contradicts the definition of $u_4$, we conclude that $r(u_4) = -\frac12\beta u_4$.
Put
\[
u_3 = \frac{1 - \frac12\beta}{1 + \frac12\beta}\,u_4,
\]
and suppose that
\[
u_3 < u \le u_4. \tag{8.36}
\]
Then by (8.29) we see that
\[
r(u) \le r(u_4) + (u_4 - u) = -\tfrac12\beta u_4 + (u_4 - u).
\]
From the lower bound in (8.36) we deduce that this expression is $< \frac12\beta u$. That
is, the inequality $r(u) \ge \frac12\beta u$ holds at no point of the interval (8.36), and hence
$u_1 \le u_3$.

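To summarize, we have $\frac{x}{16y} \le u_1 \le u_3 \le u_4 \le \frac{x}{y}$ and $|r(u)| \le \frac12\beta u$ for $u_3 < u \le u_4$. Hence
\[
\sum_{x/u_4\le n<x/u_3}\Bigl(a_n\Bigl(\frac{\beta x}{n} - r(x/n)\Bigr) + b_n\Bigl(\frac{\beta x}{n} + r(x/n)\Bigr)\Bigr)
 \ge \frac12\beta x\sum_{x/u_4\le n<x/u_3}\frac{a_n + b_n}{n}
 = \Bigl(\frac12\beta + o(1)\Bigr)x\bigl((\log x/u_3)^2 - (\log x/u_4)^2\bigr). \tag{8.37}
\]
To estimate the last factor above we note that
\[
\log\frac{x}{u_3} - \log\frac{x}{u_4} = \log\frac{1 + \frac12\beta}{1 - \frac12\beta} = \sum_{r=0}^{\infty}\frac{\beta^{2r+1}}{(2r+1)2^{2r}} > \beta.
\]
Also, since $u_3$ and $u_4$ do not exceed $x/y$, it follows that
\[
\log\frac{x}{u_3} + \log\frac{x}{u_4} \ge 2\log y.
\]
Hence the expression (8.37) is
\[
\ge \bigl(\beta^2 + o(1)\bigr)x\log y.
\]
Thus we have (8.33) in this case also, and the proof of Lemma 8.4 is complete.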
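The series expansion used in Case 3 to bound $\log(x/u_3) - \log(x/u_4)$ from below by $\beta$ can be checked numerically. The following sketch (function names are ours) compares the closed form with a truncated series for a few admissible values of $\beta$.

```python
import math

def log_ratio(beta):
    """log((1 + beta/2) / (1 - beta/2)) for 0 < beta <= 1."""
    return math.log((1 + beta / 2) / (1 - beta / 2))

def odd_series(beta, terms=60):
    """Partial sum of sum_{r >= 0} beta^(2r+1) / ((2r+1) 2^(2r))."""
    return sum(beta ** (2 * r + 1) / ((2 * r + 1) * 2 ** (2 * r))
               for r in range(terms))

if __name__ == "__main__":
    for beta in (0.1, 0.5, 1.0):
        # the r = 0 term is beta itself, and all later terms are positive
        print(beta, log_ratio(beta), odd_series(beta))
```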

To complete the proof of the Prime Number Theorem we apply Lemma 8.4
with
\[
a_n = \Lambda(n)\log n, \qquad b_n = \sum_{bc=n}\Lambda(b)\Lambda(c).
\]
We combine Chebyshev’s estimates in the form
\[
(\log 2 + o(1))x \le \psi(x) \le (2\log 2 + o(1))x
\]
with (8.21) to see that
\[
(\log 2 + o(1))x\log x \le \sum_{n\le x}a_n \le (2\log 2 + o(1))x\log x. \tag{8.38}
\]
This gives (8.25), and (8.27) is Selberg’s identity as expressed in Theorem 8.3.
To obtain (8.26) it suffices to subtract (8.38) from (8.27). We apply the lemma
with $r(u) = R(u) = \psi(u) - u$. Then
\[
r(v) - r(u) = \sum_{u<n\le v}\Lambda(n) - (v - u) \ge -(v - u),
\]
so we have (8.29). Let $\alpha = \limsup |r(u)|/u$. Our object is to show that $\alpha = 0$.
We know that $\alpha \le 1/2$, by Chebyshev’s estimates. Suppose that $\alpha > 0$, and
choose $\beta$, $0 < \beta \le 1$, so that
\[
\beta - \frac{\beta^2}{100} < \alpha < \beta.
\]
By combining the conclusion of Lemma 8.4 with (8.24) we deduce that $\alpha \le \beta - \beta^2/100$,
a contradiction. Thus $\alpha = 0$, and the proof of the Prime Number
Theorem is complete.
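The final step needs some $\beta$ with $\beta - \beta^2/100 < \alpha < \beta$ and $0 < \beta \le 1$. The text does not name one; as an illustrative assumption, the sketch below uses $\beta = \alpha + \alpha^2/200$, which works for every $0 < \alpha \le 1/2$ because $\alpha^2/200 < \beta^2/100$ whenever $\beta > \alpha$.

```python
def choose_beta(alpha):
    """One admissible beta for a given 0 < alpha <= 1/2 (an illustrative choice)."""
    if not 0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 1/2]")
    return alpha + alpha * alpha / 200

def is_admissible(alpha, beta):
    """Check 0 < beta <= 1 and beta - beta^2/100 < alpha < beta."""
    return 0 < beta <= 1 and beta - beta * beta / 100 < alpha < beta

if __name__ == "__main__":
    for alpha in (0.01, 0.25, 0.5):
        beta = choose_beta(alpha)
        print(alpha, beta, is_admissible(alpha, beta))
```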

8.2.1 Exercises
1. For which entries in Table 8.1 are $A(x)$ and $B(x)$ summatory functions of
arithmetic functions $a(n)$ and $b(n)$ related as in (8.15)?
2. If $A(x) = M(x) := \sum_{n\le x}\mu(n)$ in (8.14), then what is the function $B(x)$?
3. (a) Verify the Dirichlet series identity
\[
\Bigl(\frac{\zeta'}{\zeta}\Bigr)'(s) + \Bigl(\frac{\zeta'}{\zeta}(s)\Bigr)^2 = \frac{\zeta''}{\zeta}(s).
\]
(b) Compute the Dirichlet series coefficients of the three functions in the
above identity, and thus give a proof of (8.18) by means of formal Dirichlet series.
(c) Compute the leading term of the Laurent expansions of the three functions above, at the point $s = 1$.
(d) Suppose that $\rho$ is a zero of $\zeta(s)$ of multiplicity $m > 0$. Compute the
singular portion of the Laurent expansions of the three functions above,
at $s = \rho$. Note that the pole of $\zeta''/\zeta$ at $s = \rho$ is simple if and only if $\rho$
is a simple zero of $\zeta(s)$.
4. Let $a = \limsup_{x\to\infty}\psi(x)/x$ and $b = \liminf_{x\to\infty}\psi(x)/x$. Suppose that a
sequence $x_\nu$ tending to infinity is chosen so that $\lim_{\nu\to\infty}\psi(x_\nu)/x_\nu = a$. Use
(8.22) to show that for each $\nu$ a prime $p_\nu$ can be selected so that $x_\nu/p_\nu \to \infty$
and $\liminf_{\nu\to\infty}\psi(x_\nu/p_\nu)/(x_\nu/p_\nu) \le 2 - a$. Thus show that $a + b \le 2$. By
a similar argument, show that $a + b \ge 2$. Hence demonstrate that the relation
$a + b = 2$ is a consequence of (8.22).
5. (a) Show that
\[
\log x\sum_{\substack{p^k\le x\\ k\ge 2}}\log p + \sum_{\substack{p^kq^\ell\le x\\ k+\ell\ge 3}}(\log p)\log q \ll x.
\]
Here $p$ and $q$ denote prime numbers.
(b) As usual, let $\vartheta(x) = \sum_{p\le x}\log p$, and use Selberg’s identity to show
that
\[
\vartheta(x)\log x + \sum_{p\le x}\vartheta(x/p)\log p = 2x\log x + O(x).
\]
6. Show that $\sum_{d\mid n}\mu(d)(\log n/d)^2 = \Lambda(n)\log n + \sum_{d\mid n}\Lambda(d)\Lambda(n/d)$.
7. Let $k$ be a positive integer, and put
\[
\Lambda_k(n) = \sum_{d\mid n}\mu(d)(\log n/d)^k.
\]
(a) Show that
\[
\Lambda_{k+1}(n) = \Lambda_k(n)\log n + \sum_{d\mid n}\Lambda_k(d)\Lambda(n/d).
\]

\[
\alpha = \limsup_{x\to\infty}\frac{|f(x)|}{x}, \qquad \beta = \limsup_{x\to\infty}\frac{1}{\log x}\int_1^x\frac{|f(u)|}{u^2}\,du.
\]
Show that $\beta \le \alpha(1 - \alpha^2/(32cM))$.

8.3 The Wiener–Ikehara Tauberian theorem

In Chapter 6 we developed some understanding of the analytic behaviour of
the zeta function, which allowed us to show that $\zeta(s) \ne 0$ for $\sigma \ge 1 - c/\log\tau$,
which in turn permitted us to establish the Prime Number Theorem with an error
term $\ll x\exp(-c\sqrt{\log x})$. On the other hand, it is reasonable to ask what is the
least information concerning the zeta function that would suffice to establish
the Prime Number Theorem in the weak form (8.1). In this section we establish
a general Tauberian theorem, from which the Prime Number Theorem follows
from the information that the functions
\[
\zeta(s) - \frac{1}{s-1}, \qquad \zeta'(s) + \frac{1}{(s-1)^2}
\]
are continuous in the closed half-plane $\sigma \ge 1$, and that
\[
\zeta(1 + it) \ne 0 \tag{8.39}
\]
for all real $t$. Conversely from (8.2) we see that
\[
-\frac{\zeta'}{\zeta}(s) = \frac{s}{s-1} + s\int_1^{\infty}\frac{\psi(x) - x}{x^{s+1}}\,dx = o\Bigl(\frac{1}{\sigma - 1}\Bigr)
\]
as $\sigma \to 1^+$ with $t$ fixed, $t \ne 0$. But if $\zeta(s)$ had a zero of multiplicity $m$ at $1 + it$,
then
\[
\frac{\zeta'}{\zeta}(s) \sim \frac{m}{s - 1 - it}
\]
when $s$ is near $1 + it$. Since this is possible only when $m = 0$, we have (8.39).
The above observations can be paraphrased as ‘the Prime Number Theorem
is equivalent to the assertion (8.39)’, although one needs to bear in mind the
continuity conditions also.

Suppose that $\alpha(s) = \sum_{n=1}^{\infty}a_nn^{-s}$. In Section 5.2 we derived information
concerning partial sums of this series at $s = 1$ from the behaviour of $\alpha(\sigma)$ as
$\sigma \to 1^+$. We now take much stronger hypotheses that concern $\alpha(s)$ throughout
the closed half-plane $\sigma \ge 1$, but we obtain from them much stronger conclusions,
concerning partial sums of the series at $s = 0$. Our proof of the Hardy–Littlewood
Tauberian theorem (Theorem 5.7) depended on a simple lemma concerning
one-sided polynomial approximation (Lemma 5.8). Our new approach
depends similarly on a corresponding lemma concerning one-sided trigonometric
approximation, as follows.

Lemma 8.5 Let $E(x) = e^x$ for $x \le 0$, and $E(x) = 0$ for $x > 0$. For any given
$\varepsilon > 0$ there is a $T$ and continuous functions $f_+(x)$, $f_-(x)$ with $f_\pm \in L^1(\mathbb{R})$
such that
(i) $f_-(x) \le E(x) \le f_+(x)$ for all real $x$;
(ii) $\widehat{f}_\pm(t) = 0$ for $|t| \ge T$;
(iii) $\int_{-\infty}^{\infty}f_+(x)\,dx < 1 + \varepsilon$, $\int_{-\infty}^{\infty}f_-(x)\,dx > 1 - \varepsilon$.

Before proving the above, we ﬁrst explore its consequences.

Since the $f_\pm \in L^1(\mathbb{R})$, it follows that the Fourier transforms $\widehat{f}_\pm(t)$ are uniformly
continuous. Thus from (ii) above it follows that $\widehat{f}_\pm(\pm T) = 0$, so that
$\widehat{f}_\pm(t) = 0$ for all $t$ with $|t| \ge T$. Since the $f_\pm$ are also continuous, it follows
by the Fourier integral theorem that
\[
\lim_{\tau\to\infty}\int_{-\tau}^{\tau}(1 - |t|/\tau)\widehat{f}_\pm(t)e(tx)\,dt = f_\pm(x)
\]
for all $x$. But the functions $\widehat{f}_\pm$ are supported on the fixed interval $[-T, T]$, so
the limit on the left above is simply $\int_{-T}^{T}\widehat{f}_\pm(t)e(tx)\,dt$. That is,
\[
f_\pm(x) = \int_{-T}^{T}\widehat{f}_\pm(t)e(tx)\,dt \tag{8.40}
\]
for all $x$. It may be further noted that $\int_{-T}^{T}\widehat{f}_\pm(t)e^{2\pi itz}\,dt$ is an entire function of
$z$. Thus $f_\pm(x)$ is the restriction to the real axis of an entire function.

Theorem 8.6 (Wiener–Ikehara) Suppose that the function $a(u)$ is non-negative
and increasing on $[0, \infty)$, that
\[
\alpha(s) = \int_0^{\infty}e^{-us}\,da(u)
\]
converges for all $s$ with $\sigma > 1$, and that $r(s) := \alpha(s) - c/(s-1)$ extends to a
continuous function in the closed half-plane $\sigma \ge 1$. Then
\[
\int_0^x 1\,da(u) = ce^x + o(e^x)
\]
as $x \to \infty$.
By making the change of variable $a(u) = A(e^u)$, we obtain the following
equivalent formulation.
Corollary 8.7 (Wiener–Ikehara) Suppose that $A(v)$ is non-negative and increasing
on $[1, \infty)$, that
\[
\alpha(s) = \int_1^{\infty}v^{-s}\,dA(v)
\]
converges for all $s$ with $\sigma > 1$, and that $r(s) := \alpha(s) - c/(s-1)$ extends to a
continuous function in the closed half-plane $\sigma \ge 1$. Then
\[
\int_1^x 1\,dA(v) = cx + o(x)
\]
as $x \to \infty$.

By setting $A(v) = \sum_{n<v}a_n$ we obtain a useful Tauberian theorem for Dirichlet series.
Corollary 8.8 (Wiener–Ikehara) Suppose that $a_n \ge 0$ for all $n$, that
\[
\alpha(s) = \sum_{n=1}^{\infty}a_nn^{-s}
\]
converges for all $s$ with $\sigma > 1$, and that $r(s) := \alpha(s) - c/(s-1)$ extends to a
continuous function in the closed half-plane $\sigma \ge 1$. Then
\[
\sum_{n\le x}a_n = cx + o(x)
\]
as $x \to \infty$.
By taking $a_n = \Lambda(n)$, we see that (8.39) gives the hypotheses with $c = 1$,
and hence we obtain the Prime Number Theorem in the form (8.2).
Proof of Theorem 8.6 Take $\delta > 0$, and let $E(u)$ be as in Lemma 8.5. Then
\[
\int_0^x e^{-\delta u}\,da(u) = e^x\int_0^{\infty}E(u - x)e^{-(1+\delta)u}\,da(u),
\]
which by Lemma 8.5(i) is
\[
\le e^x\int_0^{\infty}f_+(u - x)e^{-(1+\delta)u}\,da(u).
\]
By (8.40) this is
\[
= e^x\int_0^{\infty}\int_{-T}^{T}\widehat{f}_+(t)e(tu - tx)\,dt\,e^{-(1+\delta)u}\,da(u).
\]
By Fubini’s theorem we may interchange the order of integration. Thus the
above is
\[
= e^x\int_{-T}^{T}\widehat{f}_+(t)e(-tx)\int_0^{\infty}e^{-(1+\delta-2\pi it)u}\,da(u)\,dt
= e^x\int_{-T}^{T}\widehat{f}_+(t)e(-tx)\alpha(1 + \delta - 2\pi it)\,dt. \tag{8.41}
\]
If $a(u) = e^u$, then $\alpha(s) = 1/(s-1)$, and thus from the above calculation we
see in particular that
\[
\int_0^{\infty}f_+(u - x)e^{-\delta u}\,du = \int_{-T}^{T}\widehat{f}_+(t)e(-tx)\frac{1}{\delta - 2\pi it}\,dt.
\]
On multiplying both sides by $ce^x$ and combining this with (8.41), we deduce
that
\[
\int_0^x e^{-\delta u}\,da(u) \le e^x\int_{-T}^{T}\widehat{f}_+(t)e(-tx)r(1 + \delta - 2\pi it)\,dt + ce^x\int_0^{\infty}f_+(u - x)e^{-\delta u}\,du.
\]
Since $r(s)$ is uniformly continuous in the closed rectangle $1 \le \sigma \le 1 + \delta$,
$|t| \le 2\pi T$, each of the above three terms tends to a limit as $\delta \to 0^+$.
Thus
\[
\int_0^x 1\,da(u) \le e^x\int_{-T}^{T}\widehat{f}_+(t)e(-tx)r(1 - 2\pi it)\,dt + ce^x\int_0^{\infty}f_+(u - x)\,du.
\]
We divide through by $e^x$ and let $x$ tend to infinity. The first integral on the right
tends to 0 by the Riemann–Lebesgue lemma, and the second integral on the
right tends to $\int_{-\infty}^{\infty}f_+(u)\,du$. Thus we see that
\[
\limsup_{x\to\infty}e^{-x}\int_0^x 1\,da(u) \le c\int_{-\infty}^{\infty}f_+(u)\,du \le c(1 + \varepsilon)
\]
by Lemma 8.5(iii). By using $f_-$ similarly we may also show that
\[
\liminf_{x\to\infty}e^{-x}\int_0^x 1\,da(u) \ge c(1 - \varepsilon).
\]
Since $\varepsilon$ may be taken arbitrarily small, we obtain the stated result, apart from
the need to prove Lemma 8.5.

Proof of Lemma 8.5 We assume, as we may, that $T \ge 1$. Let
\[
\Delta_T(x) = T\Bigl(\frac{\sin\pi Tx}{\pi Tx}\Bigr)^2, \qquad J_T(x) = \frac{3T}{4}\Bigl(\frac{\sin\pi Tx/2}{\pi Tx/2}\Bigr)^4
\]
be the Fejér and Jackson kernels, respectively. These functions have a peak of
height $\asymp T$ and width $\asymp 1/T$ at 0, and have total mass 1. Set
\[
f(x) = (E \star J_T)(x) = \int_{-\infty}^{\infty}E(u)J_T(x - u)\,du.
\]
This is a weighted average of the values of $E(u)$ with special emphasis on those
$u$ near $x$. We show that
\[
f(x) = E(x) + O(\min(1, 1/(Tx)^2)). \tag{8.42}
\]

To establish this we consider several cases. If $|x| \le 1/T$ we simply observe
that $0 \le f(x) \le \int_{-\infty}^{\infty}J_T(u)\,du = 1$. If $x \ge 1/T$ we observe that
\[
0 \le f(x) \ll T^{-3}\int_{-\infty}^{0}(x - u)^{-4}\,du \ll 1/(Tx)^3.
\]
By the calculus of residues it is easy to show
that $\int_{-\infty}^{\infty}J_T(u)\,du = 1$. Hence
\[
f(x) - E(x) = \int_{-\infty}^{\infty}(E(u) - E(x))J_T(x - u)\,du.
\]
Next, suppose that $-1 \le x \le -1/T$. If $2x \le u \le 0$, then $E(u) - E(x) = e^x(e^{u-x} - 1) = e^x(u - x + O((u - x)^2))$. Thus
\[
\int_{2x}^{0}(E(u) - E(x))J_T(x - u)\,du = -e^x\int_{x}^{-x}uJ_T(u)\,du + O\Bigl(\int_{x}^{-x}u^2J_T(u)\,du\Bigr).
\]
Here the first integral on the right vanishes because the integrand is an odd
function, and the second integral is $\ll 1/T^2$. On the other hand,
\[
\int_0^{\infty}(E(u) - E(x))J_T(x - u)\,du \ll T^{-3}\int_{-x}^{\infty}u^{-4}\,du \ll 1/|Tx|^3,
\]
and similarly $\int_{-\infty}^{2x} \ll 1/|Tx|^3$, so we have (8.42) in this case also. Finally,
suppose that $x \le -1$. Then $E(u) - E(x) = e^x(u - x + O((u - x)^2))$ for $x - 1 \le u \le x + 1$, so that
\[
\int_{x-1}^{x+1}(E(u) - E(x))J_T(x - u)\,du = -e^x\int_{-1}^{1}uJ_T(u)\,du + O\Bigl(e^x\int_{-1}^{1}u^2J_T(u)\,du\Bigr) \ll e^xT^{-2},
\]
which is $\ll 1/(Tx)^2$. On the other hand,
\[
\int_{-\infty}^{x-1}(E(u) - E(x))J_T(x - u)\,du \ll e^xT^{-3}\int_1^{\infty}u^{-4}\,du \ll (Tx)^{-2},
\]
and
\[
\int_{x+1}^{\infty}(E(u) - E(x))J_T(x - u)\,du \ll T^{-3}x^{-4},
\]
so we again have (8.42).

Clearly $\Delta_T(x) \ll T\min(1, 1/(Tx)^2)$, but there is no inequality in the reverse
direction because $\Delta_T(x)$ vanishes at integral multiples of $1/T$. To overcome
this difficulty we consider also a translate of the Fejér kernel. Since
\[
\Delta_T(x) + \Delta_T(x + 1/(2T)) \asymp T\min(1, 1/(Tx)^2),
\]
we take
\[
f_\pm(x) = f(x) \pm \frac{C}{T}\bigl(\Delta_T(x) + \Delta_T(x + 1/(2T))\bigr).
\]
By (8.42) we see that if $C$ is taken large enough, then $f_-(x) \le E(x) \le f_+(x)$
for all $x$.
By Fubini’s theorem it is easy to see that if $f_1, f_2 \in L^1(\mathbb{R})$ then the convolution
$f_1 \star f_2$ is also in $L^1(\mathbb{R})$, and also that $\widehat{f_1 \star f_2}(t) = \widehat{f}_1(t)\widehat{f}_2(t)$. Hence
in particular, $f \in L^1(\mathbb{R})$ and $\widehat{f}(t) = \widehat{E}(t)\widehat{J}_T(t)$. But $\widehat{J}_T(t) = 0$ for $|t| \ge T$,
so $\widehat{f}(t) = 0$ for $|t| \ge T$. Also, $\widehat{\Delta}_T(t) = 0$ for $|t| \ge T$, and we see that the
functions $f_\pm$ have the property (ii).
Finally, we note by Fubini’s theorem that
\[
\int_{-\infty}^{\infty}f(x)\,dx = \int_{-\infty}^{\infty}E(x)\,dx\int_{-\infty}^{\infty}J_T(u)\,du = 1\cdot 1 = 1,
\]
and hence $\int_{-\infty}^{\infty}f_\pm(x)\,dx = 1 \pm 2C/T$. Thus we have (iii) if $T > 2C/\varepsilon$, so the
proof is complete.
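The normalizations of the Fejér and Jackson kernels used in this proof (peak height of order $T$, total mass 1) can be sanity-checked by direct numerical integration. The sketch below uses a plain midpoint rule on a truncated interval; the helper names and the truncation widths are our own choices. The Fejér kernel decays only like $x^{-2}$, so its truncated mass falls visibly short of 1, whereas the $x^{-4}$ decay of the Jackson kernel makes the truncation error negligible.

```python
import math

def sinc(y):
    """sin(y)/y, extended continuously to y = 0."""
    return 1.0 if y == 0.0 else math.sin(y) / y

def fejer(x, T=1.0):
    """Fejer kernel T * (sin(pi T x) / (pi T x))^2."""
    return T * sinc(math.pi * T * x) ** 2

def jackson(x, T=1.0):
    """Jackson kernel (3T/4) * (sin(pi T x / 2) / (pi T x / 2))^4."""
    return 0.75 * T * sinc(math.pi * T * x / 2) ** 4

def mass(kernel, half_width, steps):
    """Midpoint-rule approximation to the integral over [-half_width, half_width]."""
    h = 2 * half_width / steps
    return h * sum(kernel(-half_width + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    print("Jackson mass:", mass(jackson, 50.0, 20000))
    print("Fejer mass  :", mass(fejer, 200.0, 40000))
```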

8.3.1 Exercises
1. Use the Wiener–Ikehara theorem (Theorem 8.6) to show that M(x) = o(x).
2. (Dressler 1970; cf. Bateman 1972) Let $f(n)$ denote the number of positive
integers $k$ such that $\varphi(k) = n$.
(a) Show that if $\sigma > 1$, then
\[
\sum_{n=1}^{\infty}\frac{f(n)}{n^s} = \sum_{k=1}^{\infty}\frac{1}{\varphi(k)^s} = \prod_p\Bigl(1 + \frac{1}{\varphi(p)^s} + \frac{1}{\varphi(p^2)^s} + \cdots\Bigr),
\]
and explain why this is not an Euler product in the usual sense.
(b) Let the above Dirichlet series be $F(s)$. Show that $F(s) = \zeta(s)G(s)$ for
$\sigma > 1$, where
\[
G(s) = \prod_p\Bigl(1 - \frac{1}{p^s} + \frac{1}{(p-1)^s}\Bigr).
\]
(c) By writing
\[
\frac{1}{(p-1)^s} - \frac{1}{p^s} = s\int_{p-1}^{p}u^{-s-1}\,du,
\]
show that the above is $\ll p^{-\sigma-1}$ for any fixed $s$.
(d) Let $K$ be a compact set in the complex plane, and let $\sigma_0 = \min_{s\in K}\sigma$.
Show that $(p-1)^{-s} - p^{-s} \ll p^{-\sigma_0-1}$ uniformly for $s \in K$.
(e) Show the product $G(s)$ converges locally uniformly in the half-plane
$\sigma > 0$, and hence represents an analytic function in this region.
(f) Show that $G(1) = \zeta(2)\zeta(3)/\zeta(6)$.
(g) Use the Wiener–Ikehara theorem (Theorem 8.6) to show that the number
of integers $k$ such that $\varphi(k) \le x$ is asymptotic to $G(1)x$ as $x \to \infty$.
3. Show that Corollary 8.8 still holds if the hypothesis $a_n \ge 0$ is replaced by
the weaker hypothesis that there is a constant $C$ such that $a_n \ge C$ for all $n$.

4. Let $\sigma_s(n) = \sum_{d\mid n}d^s$, and let $c_q(n)$ be Ramanujan’s sum, as discussed in
Section 4.1.
(a) Show that if $n$ is a positive integer, then
\[
\sum_{q=1}^{\infty}\frac{c_q(n)}{q^s} = \frac{\sigma_{1-s}(n)}{\zeta(s)} \qquad (\sigma > 1).
\]
(b) Show that if $n$ is a fixed positive integer, then $\sum_{q\le x}c_q(n) = o(x)$ as
$x \to \infty$.
(c) Show that if $n$ is a positive integer, then
\[
\sum_{q\le x}c_q(n)\Bigl[\frac{x}{q}\Bigr] = \sum_{\substack{d\mid n\\ d\le x}}d.
\]
(d) By Axer’s theorem, or otherwise, show that if $n$ is a positive integer, then
\[
\sum_{q=1}^{\infty}\frac{c_q(n)}{q} = 0.
\]
(See also Exercise 4.1.8.)

5. (Graham & Vaaler 1981) Let $f_+(x)$ and $f_-(x)$ be as in Lemma 8.5.
(a) Use the Poisson summation formula to show that
\[
\sum_{n=-\infty}^{\infty}f_+(n/T) = T\sum_{k=-\infty}^{\infty}\widehat{f}_+(kT).
\]
(b) Explain why the right-hand side above is $= T\widehat{f}_+(0) = T\int_{\mathbb{R}}f_+(x)\,dx$.
(c) Explain why the left-hand side above is $\ge (1 - e^{-1/T})^{-1}$.
(d) Deduce that
\[
\int_{\mathbb{R}}f_+(x)\,dx \ge \frac{1}{T(1 - e^{-1/T})}.
\]
(e) Suppose that $T \ge 2$. Show that the right-hand side above is $= 1 + 1/(2T) + O(1/T^2)$.
(f) Show similarly that
\[
\int_{\mathbb{R}}f_-(x)\,dx \le \frac{1}{T(e^{1/T} - 1)},
\]
and that the right-hand side is $= 1 - 1/(2T) + O(1/T^2)$ when $T \ge 2$.

8.4 Beurling’s generalized prime numbers

One of the most valuable generalizations of the Prime Number Theorem is to
algebraic number fields. Suppose that $K$ is an algebraic number field of degree
$d$ over the rationals, and let $\mathcal{O}_K$ denote the ring of algebraic integers in $K$. For
some fields $K$ the members of $\mathcal{O}_K$ factor uniquely into primes, but in general
this is not the case. However, it is always true that ideals in $\mathcal{O}_K$ factor uniquely
into prime ideals. For an ideal $\mathfrak{a}$ of $\mathcal{O}_K$, let $N(\mathfrak{a})$ denote its norm, which is to
say the size of the quotient ring $\mathcal{O}_K/\mathfrak{a}$. For $\sigma > 1$ we can define the Dedekind
zeta function of $K$ by the absolutely convergent series
\[
\zeta_K(s) = \sum_{\mathfrak{a}}N(\mathfrak{a})^{-s}.
\]
This is an ordinary Dirichlet series, since the $N(\mathfrak{a})$ are positive integers, and
thus the above can be written in the form $\sum a_nn^{-s}$ where $a_n$ is the number of
ideals with norm $n$.
Counting ideals $\mathfrak{a}$ with $N(\mathfrak{a}) \le x$ is rather like counting rational integers. The
ideals can be parametrized by the points of a lattice in $\mathbb{R}^d$, so one is counting
lattice points in a certain region, which is approximately the volume of that
region, and thus it can be shown that the number $I(x)$ of ideals $\mathfrak{a}$ with $N(\mathfrak{a}) \le x$ is
\[
I(x) = cx + O\bigl(x^{1-1/d}\bigr) \tag{8.43}
\]
where $c = c(K)$ is a certain positive constant, called the ideal density. Here
the implicit constant may also depend on $K$, which we assume is fixed. By
Theorem 1.3 it follows that
\[
\zeta_K(s) = s\int_1^{\infty}I(x)x^{-s-1}\,dx = \frac{cs}{s-1} + s\int_1^{\infty}(I(x) - cx)x^{-s-1}\,dx.
\]
Since this latter integral is uniformly convergent for $\sigma > 1 - 1/d + \delta$, we deduce
that $\zeta_K(s)$ is analytic in the half-plane $\sigma > 1 - 1/d$ apart from a simple
pole at $s = 1$ with residue $c$. Moreover, we see that if $\delta$ is fixed, $\delta > 0$, then
$\zeta_K(s) \ll |t|$ uniformly for $\sigma \ge 1 - 1/d + \delta$, $|t| \ge 1$.
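To make (8.43) concrete, consider the example (ours, not taken from the text) of $K = \mathbb{Q}(i)$. Every ideal of the Gaussian integers is principal and there are four units, so each non-zero ideal of norm $n$ corresponds to four lattice points $(a, b)$ with $a^2 + b^2 = n$. Thus $I(x)$ is a quarter of the number of non-zero lattice points in the disc of radius $\sqrt{x}$, and one expects $c = \pi/4$ with an error of order $x^{1/2} = x^{1-1/d}$ for $d = 2$. A minimal sketch, with a hypothetical function name:

```python
import math

def gaussian_ideal_count(x):
    """Number of non-zero ideals of Z[i] with norm <= x (x a positive integer)."""
    r = math.isqrt(x)
    lattice_points = 0
    for a in range(-r, r + 1):
        b_max = math.isqrt(x - a * a)       # largest b with a^2 + b^2 <= x
        lattice_points += 2 * b_max + 1
    # remove the origin, then divide by the four units 1, -1, i, -i
    return (lattice_points - 1) // 4

if __name__ == "__main__":
    for x in (10**2, 10**4, 10**6):
        count = gaussian_ideal_count(x)
        print(x, count, (count - math.pi / 4 * x) / math.sqrt(x))
```

The last printed column is the error scaled by $\sqrt{x}$, which should remain bounded.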
If $\mathfrak{a}$ and $\mathfrak{b}$ are two ideals in $\mathcal{O}_K$, then
\[
N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b}). \tag{8.44}
\]
Hence $\zeta_K(s)$ has an Euler product formula
\[
\zeta_K(s) = \prod_{\mathfrak{p}}\bigl(1 - N(\mathfrak{p})^{-s}\bigr)^{-1}
\]
for $\sigma > 1$. On taking logarithmic derivatives we also see that
\[
-\frac{\zeta_K'}{\zeta_K}(s) = \sum_{\mathfrak{a}}\Lambda(\mathfrak{a})N(\mathfrak{a})^{-s}
\]
where $\Lambda(\mathfrak{a}) = \log N(\mathfrak{p})$ if $\mathfrak{a} = \mathfrak{p}^k$, $\Lambda(\mathfrak{a}) = 0$ otherwise. Thus, as in Lemma 6.5,
\[
-3\frac{\zeta_K'}{\zeta_K}(\sigma) - 4\,\Re\frac{\zeta_K'}{\zeta_K}(\sigma + it) - \Re\frac{\zeta_K'}{\zeta_K}(\sigma + 2it) \ge 0
\]
for $\sigma > 1$ and any real $t$. Also as in Chapter 6 we may derive a zero-free
region for $\zeta_K(s)$, namely that $\zeta_K(s) \ne 0$ provided that $\sigma > 1 - c/\log\tau$. Here,
as before, $\tau = |t| + 4$, and $c$ is a constant depending on $K$. Continuing as in
Chapter 6, we can derive estimates analogous to those in Theorem 6.7, but with
constants depending on $K$, and we may use our quantitative version of Perron’s
formula (Theorem 5.2) to establish a quantitative version of the Prime Ideal
Theorem:

Theorem 8.9 (Landau) Let $K$ be an algebraic number field of finite degree
over $\mathbb{Q}$, and let $\mathcal{O}_K$ denote the ring of algebraic integers in $K$. Then
for $x \ge 2$ the number of prime ideals $\mathfrak{p}$ in $\mathcal{O}_K$ such that $N(\mathfrak{p}) \le x$ is
$\operatorname{li}(x) + O_K(x\exp(-c\sqrt{\log x}))$ where $c$ depends on $K$.

It is notable that the chain of reasoning we have just described depends only
on the estimate (8.43) and the identity (8.44). Thus the entire situation could
be abstracted as follows. Suppose we have a sequence $\mathcal{P}$ of real numbers $p_i$
such that $1 < p_1 \le p_2 \le \cdots$ and $p_i \to \infty$. We call these numbers ‘generalized
primes’. We form products of powers of these numbers, $p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}$, and
call such products ‘generalized integers’. Let $N(x)$ denote the number of such
products whose value does not exceed $x$. If
\[
N(x) = cx + O(x^{\theta}) \tag{8.45}
\]
for some $c > 0$ and $\theta < 1$, then by the reasoning we have outlined it follows
that the number $P(x)$ of generalized primes $p_i$ such that $p_i \le x$ is $\operatorname{li}(x) + O(x\exp(-c\sqrt{\log x}))$.
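The counting function $N(x)$ of generalized integers can be computed by brute force for a short list of generalized primes. The sketch below is only an illustration under our own conventions: the function name is hypothetical, the generalized primes may be arbitrary real numbers exceeding 1, and products are counted with multiplicity, one for each tuple of exponents, so repeated generalized primes are allowed.

```python
def count_generalized_integers(gen_primes, x):
    """Count exponent tuples (a_1, ..., a_k) with p_1^a_1 ... p_k^a_k <= x."""
    ps = sorted(p for p in gen_primes if 1 < p <= x)

    def count_from(i, prod):
        # Number of products prod * ps[i]^a_i * ps[i+1]^a_(i+1) * ... <= x,
        # given prod <= x.  Since ps is sorted, if ps[i] no longer fits then
        # no later prime fits either, and only prod itself is counted.
        if i == len(ps) or prod * ps[i] > x:
            return 1
        total = 0
        while prod <= x:
            total += count_from(i + 1, prod)
            prod *= ps[i]
        return total

    return count_from(0, 1) if x >= 1 else 0

if __name__ == "__main__":
    # With the ordinary primes up to x one recovers N(x) = [x].
    print(count_generalized_integers([2, 3, 5, 7], 10))
    # Non-integer generalized primes are allowed as well.
    print(count_generalized_integers([2.5, 3.5], 10))
```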
The integers Z form an additive group, a cyclic group generated by the
number 1. Moreover, the positive integers form a multiplicative semigroup
with the primes as generators. From the additive property of the integers we
know that [x] = x + O(1), which is a strong form of (8.45). However, it is now
quite clear that our proof of the Prime Number Theorem requires no further
knowledge of the additive nature of the integers beyond this estimate.
We have seen that the estimate (8.45) gives a generalization of the Prime
Number Theorem with the classical error term. We now consider the issue of
how much this hypothesis can be weakened, if the goal is only to obtain a
generalization of (8.1), namely that P(x) ∼ x/ log x as x → ∞.
Theorem 8.10 (Beurling) Let $\mathcal{P} = \{p_i\}$ where $1 < p_1 \le p_2 \le \cdots$ and $p_i \to \infty$,
and let $N(x)$ denote the number of products $p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k} \le x$ where the
$a_i$ are non-negative integers. Suppose that there is a positive constant $c$ such
that
\[
N(x) = cx + O\Bigl(\frac{x}{(\log x)^{\gamma}}\Bigr) \tag{8.46}
\]
for $x \ge 2$. Let $P(x)$ denote the number of members of $\mathcal{P}$ not exceeding $x$. If
$\gamma > 3/2$, then
\[
P(x) \sim \frac{x}{\log x} \tag{8.47}
\]
as $x \to \infty$.
Proof Let $\mathcal{N} = \{n_j\}$ where $1 = n_1 < n_2 \le n_3 \le \cdots$ are the generalized integers,
and for $\sigma > 1$ let
\[
\zeta_{\mathcal{P}}(s) = \sum_{n\in\mathcal{N}}n^{-s}.
\]
Since the $n \in \mathcal{N}$ are not necessarily rational integers, the above is not necessarily
an ordinary Dirichlet series, but it is an example of a ‘generalized Dirichlet
series’. In any case it is an absolutely convergent series and by integration by
parts as in the proof of Theorem 1.3 we see that
\[
\zeta_{\mathcal{P}}(s) = \int_{1^-}^{\infty}u^{-s}\,dN(u) = s\int_1^{\infty}N(u)u^{-s-1}\,du.
\]
We subtract $cu$ from $N(u)$ to see that
\[
\zeta_{\mathcal{P}}(s) = \frac{cs}{s-1} + s\int_1^{\infty}(N(u) - cu)u^{-s-1}\,du.
\]
From (8.46) we know that $\int_1^{\infty}|N(u) - cu|u^{-2}\,du < \infty$. Hence the integral
above is uniformly convergent for $\sigma \ge 1$, and consequently it is continuous in
this closed half-plane. Thus we can extend the definition of $\zeta_{\mathcal{P}}(s)$ so that $\zeta_{\mathcal{P}}(s) = c/(s-1) + r_0(s)$
and $r_0(s)$ is continuous for $\sigma \ge 1$. To bound the modulus
of continuity of $r_0(s)$ we differentiate. Thus $\zeta_{\mathcal{P}}'(s) = -c/(s-1)^2 + r_1(s)$ for
$\sigma > 1$ where
\[
r_1(s) = r_0'(s) = \int_1^{\infty}(N(u) - cu)u^{-s-1}\,du - s\int_1^{\infty}(N(u) - cu)(\log u)u^{-s-1}\,du.
\]
If (8.46) holds with $\gamma > 2$, then $\int_1^{\infty}|N(u) - cu|(\log u)u^{-2}\,du < \infty$ and then
$r_1(s)$ is continuous in the closed half-plane $\sigma \ge 1$. When $\gamma$ is smaller, however,
the situation is more delicate. From now on we assume, as we may, that $3/2 < \gamma \le 2$. Since
\[
\int_2^{\infty}(\log u)^{1-\gamma}u^{-\sigma}\,du = \int_{\log 2}^{\infty}v^{1-\gamma}e^{-(\sigma-1)v}\,dv
 = (\sigma - 1)^{\gamma-2}\int_{(\sigma-1)\log 2}^{\infty}u^{1-\gamma}e^{-u}\,du \ll (\sigma - 1)^{-\frac12+\eta},
\]
where $\eta = \eta(\gamma) > 0$, from (8.46) we deduce that $r_1(s) \ll (\sigma - 1)^{-\frac12+\eta}$ uniformly
for $\sigma > 1$. Consequently, if $t$ is fixed, $t \ne 0$, then
\[
\zeta_{\mathcal{P}}(\sigma + it) - \zeta_{\mathcal{P}}(1 + it) = \int_1^{\sigma}\zeta_{\mathcal{P}}'(\alpha + it)\,d\alpha \ll (\sigma - 1)^{\frac12+\eta} \tag{8.48}
\]
for $\sigma > 1$, $\sigma$ near 1.

Next we use the above estimate to show that
\[
\zeta_{\mathcal{P}}(1 + it) \ne 0 \tag{8.49}
\]
when $t$ is real, $t \ne 0$. By mimicking the proof of the usual Euler product formula
for $\zeta(s)$, we see that
\[
\zeta_{\mathcal{P}}(s) = \prod_{p\in\mathcal{P}}(1 - p^{-s})^{-1}
\]
for $\sigma > 1$. This product is absolutely convergent, and each factor is non-zero,
so $\zeta_{\mathcal{P}}(s) \ne 0$ for $\sigma > 1$, and indeed we may write
\[
\log\zeta_{\mathcal{P}}(s) = \sum_{p\in\mathcal{P}}\sum_{r=1}^{\infty}\frac1r p^{-rs}. \tag{8.50}
\]
Instead of the cosine polynomial $3 + 4\cos\theta + \cos 2\theta$ used in Chapter 6, we
must now employ a non-negative cosine polynomial $a_0 + \sum_{k=1}^{K}a_k\cos k\theta$ for
which the ratio $a_1/a_0$ is larger. As we observed in Section 6.1, it is always the
case that $a_1 < 2a_0$, but we can make $a_1$ as close to $2a_0$ as we wish by using the
Fejér kernel $\Delta_K(\theta)$ with $K$ large, since
\[
\Delta_K(\theta) = 1 + 2\sum_{k=1}^{K}\Bigl(1 - \frac{k}{K}\Bigr)\cos 2\pi k\theta = \frac1K\Bigl(\frac{\sin\pi K\theta}{\sin\pi\theta}\Bigr)^2 \ge 0.
\]
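The closed form of the Fejér kernel, and the fact that its coefficient ratio $a_1/a_0 = 2(1 - 1/K)$ approaches 2, are easy to confirm numerically. The following sketch uses function names of our own choosing.

```python
import math

def fejer_cosine(K, theta):
    """1 + 2 * sum_{k=1}^{K} (1 - k/K) cos(2 pi k theta)."""
    return 1 + 2 * sum((1 - k / K) * math.cos(2 * math.pi * k * theta)
                       for k in range(1, K + 1))

def fejer_closed(K, theta):
    """(1/K) * (sin(pi K theta) / sin(pi theta))^2, for non-integer theta."""
    return (math.sin(math.pi * K * theta) / math.sin(math.pi * theta)) ** 2 / K

def a1_over_a0(K):
    """Ratio of the first cosine coefficient to the constant term."""
    return 2 * (1 - 1 / K)

if __name__ == "__main__":
    for K in (2, 5, 50):
        theta = 0.13
        print(K, fejer_cosine(K, theta), fejer_closed(K, theta), a1_over_a0(K))
```

For comparison, the polynomial $3 + 4\cos\theta + \cos 2\theta$ of Chapter 6 has $a_1/a_0 = 4/3$, which `a1_over_a0(K)` already exceeds for $K = 4$.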
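Hence if $\sigma > 1$, then
\[
\prod_{k=-K}^{K}\zeta_{\mathcal{P}}(\sigma + ikt)^{1-|k|/K}
 = \exp\Bigl(\sum_{p\in\mathcal{P}}\sum_{r=1}^{\infty}\frac{1}{rp^{r\sigma}}\sum_{k=-K}^{K}(1 - |k|/K)p^{-irkt}\Bigr)
 = \exp\Bigl(\sum_{p\in\mathcal{P}}\sum_{r=1}^{\infty}\frac{1}{rp^{r\sigma}}\Delta_K\bigl(rt(\log p)/(2\pi)\bigr)\Bigr).
\]
Now $\zeta_{\mathcal{P}}(\sigma - it) = \overline{\zeta_{\mathcal{P}}(\sigma + it)}$, so that $|\zeta_{\mathcal{P}}(\sigma - it)| = |\zeta_{\mathcal{P}}(\sigma + it)|$. Also,
$\Delta_K(\theta) \ge 0$ for all $\theta$. Hence from the above we see that
\[
\zeta_{\mathcal{P}}(\sigma)\prod_{k=1}^{K}|\zeta_{\mathcal{P}}(\sigma + ikt)|^{2(1-k/K)} \ge 1.
\]
Suppose that $t$ is a fixed, non-zero real number. As $\sigma$ tends to 1 from above,
the numbers $|\zeta_{\mathcal{P}}(\sigma + ikt)|$ tend to finite limits, and $\zeta_{\mathcal{P}}(\sigma) \ll 1/(\sigma - 1)$. Thus
\[
|\zeta_{\mathcal{P}}(\sigma + it)| \gg (\sigma - 1)^{\frac{K}{2(K-1)}}
\]
as $\sigma \to 1^+$. Here the implicit constant may depend not only on $\mathcal{P}$ but also on
$t$. Suppose now that $\zeta_{\mathcal{P}}(1 + it) = 0$. Then from (8.48) we have $\zeta_{\mathcal{P}}(\sigma + it) \ll (\sigma - 1)^{\frac12+\eta}$
as $\sigma \to 1^+$. This contradicts the lower bound above if $K$ is large
enough, say $K > 1 + \frac{1}{2\eta}$. Hence $\zeta_{\mathcal{P}}(1 + it) \ne 0$, as desired.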
For $n \in \mathcal{N}$ let $\Lambda(n) = \log p$ if $n = p^r$ and $p \in \mathcal{P}$, $\Lambda(n) = 0$ otherwise. On
differentiating (8.50) we see that
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \sum_{n\in\mathcal{N}}\Lambda(n)n^{-s}
\]
for $\sigma > 1$. Set
\[
S(x) = \sum_{\substack{n\in\mathcal{N}\\ n\le x}}\Lambda(n).
\]
Suppose for the moment that $\gamma > 2$. Then $r_0(s)$ and $r_1(s)$ are both continuous
in the closed half-plane $\sigma \ge 1$, and then
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \frac{1}{s-1} + r(s)
\]
where
\[
r(s) = -\frac{r_0(s) + (s-1)r_1(s)}{(s-1)\zeta_{\mathcal{P}}(s)}
\]
is continuous in the closed half-plane $\sigma \ge 1$. Then by the Wiener–Ikehara
theorem it follows that $S(x) \sim x$ as $x \to \infty$. Under the weaker hypothesis that
$3/2 < \gamma \le 2$ we are no longer able to guarantee that $r_1(s)$ is continuous, but
by Plancherel’s identity it is bounded in mean-square. Thus, below, we follow
the lines of the proof of the Wiener–Ikehara theorem, but with an appeal to
Plancherel’s identity where continuity had sufficed before.
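Suppose that $\delta > 0$, that $T$ is a large positive number, and that $E(u)$ is defined
as in Lemma 8.5. Then
\[
\sum_{\substack{n\in\mathcal{N}\\ n\le x}}\Lambda(n)n^{-\delta} = x\sum_{n\in\mathcal{N}}\Lambda(n)n^{-1-\delta}E(\log n - \log x)
\]
which by Lemma 8.5 is
\[
\le x\sum_{n\in\mathcal{N}}\Lambda(n)n^{-1-\delta}f_+(\log n - \log x)
 \le x\sum_{n\in\mathcal{N}}\Lambda(n)n^{-1-\delta}\int_{-T}^{T}\widehat{f}_+(t)\Bigl(\frac{x}{n}\Bigr)^{-2\pi it}\,dt
 = -x\int_{-T}^{T}\widehat{f}_+(t)x^{-2\pi it}\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(1 + \delta - 2\pi it)\,dt. \tag{8.51}
\]
As for the main term, we note that similarly
\[
\int_1^{\infty}u^{-1-\delta}f_+(\log u - \log x)\,du
 = \int_1^{\infty}u^{-1-\delta}\int_{-T}^{T}\widehat{f}_+(t)\Bigl(\frac{x}{u}\Bigr)^{-2\pi it}\,dt\,du
 = \int_{-T}^{T}\widehat{f}_+(t)x^{-2\pi it}\int_1^{\infty}u^{-1-\delta+2\pi it}\,du\,dt
 = \int_{-T}^{T}\widehat{f}_+(t)x^{-2\pi it}\frac{1}{\delta - 2\pi it}\,dt.
\]
We multiply both sides of this by $x$ and combine with (8.51) to see that
\[
\sum_{\substack{n\in\mathcal{N}\\ n\le x}}\Lambda(n)n^{-\delta} \le x\int_1^{\infty}u^{-1-\delta}f_+(\log u - \log x)\,du
 + x\int_{-T}^{T}\widehat{f}_+(t)x^{-2\pi it}\Bigl(-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(1 + \delta - 2\pi it) - \frac{1}{\delta - 2\pi it}\Bigr)\,dt. \tag{8.52}
\]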
By using our formulæ for $r_i(s)$ in terms of integrals we see that we may write
\[
r_1(s) = r_0'(s) = -sJ(s) + \frac{r_0(s) - c}{s}
\]
where
\[
J(s) = \int_1^{\infty}(N(u) - cu)(\log u)u^{-s-1}\,du,
\]
and
\[
-\zeta_{\mathcal{P}}'(s) = \frac{c}{(s-1)^2} - \frac{r_0(s) - c}{s} + sJ(s).
\]
Thus
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) - \frac{1}{s-1} = \frac{c(s-1) + (1-2s)r_0(s)}{s(s-1)\zeta_{\mathcal{P}}(s)} + \frac{s}{\zeta_{\mathcal{P}}(s)}J(s)
\]
and by splitting the integral at $X$, where $X$ is a large parameter, we have
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) - \frac{1}{s-1} = C(s) + R(s)
\]
where
\[
R(s) = \int_X^{\infty}(N(u) - cu)(\log u)u^{-s-1}\,du
\]
and $C(s)$ is continuous for $\sigma \ge 1$. We consider first the contribution of the
remainder $R(s)$ to (8.52). By the Cauchy–Schwarz inequality we see that
\[
\Bigl|\int_{-T}^{T}\widehat{f}_+(t)x^{-2\pi it}R(1 + \delta - 2\pi it)\,dt\Bigr|^2
 \le \int_{-T}^{T}\Bigl|\widehat{f}_+(t)\frac{1 + \delta - 2\pi it}{\zeta_{\mathcal{P}}(1 + \delta - 2\pi it)}\Bigr|^2\,dt
 \int_{-T}^{T}\Bigl|\int_X^{\infty}\frac{(N(u) - cu)(\log u)}{u^{2+\delta-2\pi it}}\,du\Bigr|^2\,dt. \tag{8.53}
\]
In Theorem 5.4 we take $\sigma = 1 + \delta$ and $w(u) = (N(u) - cu)\log u$ for $u \ge X$,
$w(u) = 0$ otherwise. Thus we see that
\[
\int_{-\infty}^{\infty}\Bigl|\int_X^{\infty}(N(u) - cu)(\log u)u^{-2-\delta+2\pi it}\,du\Bigr|^2\,dt
 = \int_X^{\infty}(N(u) - cu)^2(\log u)^2u^{-3-2\delta}\,du,
\]
which by (8.46) is
\[
\ll \int_X^{\infty}u^{-1}(\log u)^{2-2\gamma}\,du \ll_{\gamma} (\log X)^{3-2\gamma}
\]
uniformly for $\delta > 0$. The first integral on the right-hand side of (8.53) is also
uniformly bounded as $\delta$ tends to 0, since $\zeta_{\mathcal{P}}(1 + it) \ne 0$. Thus the contribution
of $R(s)$ to (8.52) is $\ll_{\gamma} (\log X)^{3/2-\gamma}$, uniformly for $\delta > 0$. Hence if we let $\delta$
tend to 0 from above in (8.52), and divide through by $x$, we find that
\[
\frac{S(x)}{x} \le \int_1^{\infty}u^{-1}f_+(\log u - \log x)\,du + \int_{-T}^{T}\widehat{f}_+(t)x^{-2\pi it}C(1 - 2\pi it)\,dt
 + O_{\gamma}\bigl((\log X)^{3/2-\gamma}\bigr).
\]

As $x$ tends to infinity, the first integral on the right tends to $\int_{-\infty}^{\infty}f_+(v)\,dv$. Since
$\widehat{f}_+(t)C(1 - 2\pi it)$ is a continuous function of $t$, by the Riemann–Lebesgue
lemma the second integral on the right tends to 0 as $x$ tends to infinity. Hence
\[
\limsup_{x\to\infty}\frac{S(x)}{x} \le \int_{-\infty}^{\infty}f_+(v)\,dv + O_{\gamma}\bigl((\log X)^{3/2-\gamma}\bigr).
\]
By Lemma 8.5 we know that the integral on the right is $< 1 + \varepsilon$ if $T$ is sufficiently
large. Since $X$ may also be taken arbitrarily large, we conclude that the
limsup above is $\le 1$. By a similar argument with $f_+$ replaced by $f_-$, we find
that the corresponding liminf is $\ge 1$, so we have the generalized Prime Number
Theorem in the form $S(x) \sim x$. By integrating by parts we obtain the desired
relation (8.47).

We now show that the exponent 3/2 is critical in Beurling’s theorem.

Theorem 8.11 The primes P can be chosen in such a way that (8.46) holds
with γ = 3/2 but (8.47) fails.
The general idea is that if ζ_P(s) has a simple pole at s = 1 and zeros of
multiplicity 1/2 at 1 ± ia, say
\[ \zeta_{\mathcal{P}}(s) = \frac{(s-1-ia)^{1/2}(s-1+ia)^{1/2}}{s-1}\,H(s) \tag{8.54} \]
where H(s) is analytic for σ > θ, θ < 1, then we can express N(x) by Perron's
formula applied to ζ_P(s). After moving the contour to the left, we would find
that the residue at s = 1 gives rise to the main term cx, and the loops of contour
around the branch points at 1 ± ia give oscillatory terms of size x/(log x)^{3/2}.
On the other hand,
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \frac{1}{s-1} - \frac{1}{2(s-1-ia)} - \frac{1}{2(s-1+ia)} - \frac{H'}{H}(s),
\]
which suggests that S is approximately
\[ x - \frac{x^{1+ia}}{2(1+ia)} - \frac{x^{1-ia}}{2(1-ia)}. \]
This is of the order of magnitude x but not asymptotic to x. It is of course
essential that the above main term should be increasing; we note that its
derivative is 1 − cos(a log x) ≥ 0. For a rigorous construction we begin by
defining primes so that S(x) approximates this main term, and then we show that
the resulting ζ_P(s) satisfies (8.54).
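
The two claims about the main term (that its derivative is 1 − cos(a log x), and that it stays of order x without being asymptotic to x) can be checked numerically. The sketch below is only an illustration: the function names, the step size, and the sample value a = 1.7 are assumptions made here, not part of the text.

```python
import cmath
import math


def main_term(x: float, a: float) -> float:
    """M(x) = x - x^(1+ia)/(2(1+ia)) - x^(1-ia)/(2(1-ia)); the value is real."""
    z = cmath.exp(complex(1.0, a) * math.log(x)) / (2 * complex(1.0, a))
    # The two oscillatory terms are complex conjugates, so together they give 2*Re(z).
    return x - 2.0 * z.real


def claimed_derivative(x: float, a: float) -> float:
    """The derivative stated in the text: 1 - cos(a log x), which is >= 0."""
    return 1.0 - math.cos(a * math.log(x))


def central_difference(x: float, a: float, step: float = 1e-4) -> float:
    """Numerical derivative of main_term at x (hypothetical helper)."""
    return (main_term(x + step, a) - main_term(x - step, a)) / (2 * step)


A = 1.7  # illustrative choice of the parameter a
SAMPLES = [2.0, 10.0, 100.5, 1000.0]
derivative_errors = [abs(central_difference(x, A) - claimed_derivative(x, A)) for x in SAMPLES]
# M(x)/x oscillates in [1 - 1/sqrt(1+a^2), 1 + 1/sqrt(1+a^2)], so M(x) is not ~ x.
ratio_deviations = [abs(main_term(x, A) / x - 1.0) for x in SAMPLES]
```

The amplitude bound 1/√(1 + a²) used in the second check is the same quantity that appears in the lim inf and lim sup of P(x)/(x/log x) later in the proof.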
Proof Let a be a fixed positive real number, and set
\[ f(x) = \int_1^x \frac{1 - \cos(a\log u)}{\log u}\,du. \]

We note that this function is increasing and tends to inﬁnity with x. Hence for
each positive integer j there is a unique real number p j such that f ( p j ) = j. If
p j ≤ x < p j+1 , then P(x) = j and j ≤ f (x) < j + 1; hence P(x) = [ f (x)].
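By integration by parts we see that
\[
\int_2^x \frac{u^{i\alpha}}{\log u}\,du = \frac{x^{1+i\alpha}}{(1+i\alpha)\log x} + O\Big(\frac{x}{(\log x)^2}\Big).
\]
By taking α = −a, 0, a, and combining, we see that
\[
f(x) = \Big(1 - \frac{x^{ia}}{2(1+ia)} - \frac{x^{-ia}}{2(1-ia)}\Big)\frac{x}{\log x} + O\Big(\frac{x}{(\log x)^2}\Big),
\]
and consequently
\[
\liminf_{x\to\infty}\frac{P(x)}{x/\log x} = 1 - \frac{1}{\sqrt{1+a^2}}, \qquad
\limsup_{x\to\infty}\frac{P(x)}{x/\log x} = 1 + \frac{1}{\sqrt{1+a^2}}.
\]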
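Clearly
\[
\begin{aligned}
\sum_{\substack{p\in\mathcal{P}\\ p\le x}} \log p &= \int_1^x \log u\,d[f(u)]\\
&= \int_1^x \log u\,df(u) - \int_1^x \log u\,d\{f(u)\}\\
&= \int_1^x \big(1 - \cos(a\log u)\big)\,du - \Big[\{f(u)\}\log u\Big]_1^x + \int_1^x \frac{\{f(u)\}}{u}\,du\\
&= x - \frac{x^{1+ia}}{2(1+ia)} - \frac{x^{1-ia}}{2(1-ia)} + O(\log x),
\end{aligned}
\]
and hence
\[
S(x) = x - \frac{x^{1+ia}}{2(1+ia)} - \frac{x^{1-ia}}{2(1-ia)} + O\big(x^{1/2}\big).
\]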
Let r(x) denote this last error term. Then for σ > 1,
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \int_1^\infty u^{-s}\,dS(u)
= \frac{1}{s-1} - \frac{1}{2(s-1-ia)} - \frac{1}{2(s-1+ia)} + g(s)
\]
where g(s) is analytic for σ > 1/2. Hence
\[
\log\zeta_{\mathcal{P}}(s) = -\log(s-1) + \tfrac12\log(s-1-ia) + \tfrac12\log(s-1+ia) + G(s)
\]
where G′(s) = −g(s), and so we have (8.54) with H(s) = e^{G(s)}.
To complete the proof we need not only (8.54) but also an estimate of the
size of ζ_P(s) when σ < 1. To this end we mimic the approach used to estimate
1/ζ(s) in Theorem 6.7. Since P(x) ≪ x/log x it follows that
log ζ_P(1 + δ + it) ≪ log 1/δ uniformly for 0 < δ ≤ 1/2. If t ≥ 4 + a and
1 − 1/log t ≤ σ ≤ 1 + 1/log t, then
\[
-\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(s) = \sum_{\substack{n\le t^2\\ n\in\mathcal{N}}} \Lambda(n)n^{-s} + \int_{t^2}^\infty u^{-s}\,dS(u).
\]
Here the sum is
\[ \ll \sum_{\substack{n\le t^2\\ n\in\mathcal{N}}} \frac{\Lambda(n)}{n} \ll \log t, \]
and the integral is
\[
\frac{t^{2(1-s)}}{s-1} - \frac{t^{2(1+ia-s)}}{2(s-1-ia)} - \frac{t^{2(1-ia-s)}}{2(s-1+ia)} - \frac{r(t^2)}{t^{2s}} + s\int_{t^2}^\infty r(u)u^{-s-1}\,du \ll 1,
\]
so that
\[
\log\zeta_{\mathcal{P}}(s) = -\int_\sigma^{1+1/\log t}\frac{\zeta_{\mathcal{P}}'}{\zeta_{\mathcal{P}}}(\alpha+it)\,d\alpha + \log\zeta_{\mathcal{P}}(1+1/\log t+it) \ll 1 + \log\log t
\]
for σ ≥ 1 − 1/log t. Hence there is a constant A such that ζ_P(s) ≪ (log t)^A for
σ ≥ 1 − 1/log t, t ≥ 4 + a.
We now estimate N (x) by taking an inverse Mellin transform of ζP (s).
However, the truncated Perron formula (Corollary 5.3) is not so useful since
we lack information concerning the number of generalized integers in a short
interval. To avoid this difﬁculty we use Cesàro weights as discussed in Section
5.1, by means of which we see that if b > 1 and h > 0, then
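\[
\frac{1}{2\pi i h}\int_{b-i\infty}^{b+i\infty} \zeta_{\mathcal{P}}(s)\,\frac{(x+h)^{s+1} - x^{s+1}}{s(s+1)}\,ds = \sum_{n\in\mathcal{N}} w_+(n)
\]
where
\[
w_+(u) = \begin{cases}
1 & (u \le x),\\
(x+h-u)/h & (x < u \le x+h),\\
0 & (u > x+h).
\end{cases}
\]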
We now pull the contour to the left. In view of (8.54), at s = 1 we encounter a
simple pole with residue c(x + h/2) where c = aH(1). Because of the branch
points at 1 ± ia, we slit the plane by the segments σ ± ia for −∞ < σ ≤ 1.
Our contour follows the upper and lower sides of these segments; the integral
along these loops is
\[ \ll \int_{-\infty}^{1} (x+h)^{\sigma}(1-\sigma)^{1/2}\,d\sigma \ll \frac{x}{(\log x)^{3/2}}. \]
By taking more care, and using Theorem C.3, we could obtain oscillatory main
terms of this order of magnitude. On the rest of the contour we estimate the
integral as in the proof of the Prime Number Theorem, and thus we see that
\[
N(x) \le \sum_{n\in\mathcal{N}} w_+(n) = cx + \tfrac12 ch + O\Big(\frac{x}{(\log x)^{3/2}}\Big) + O\Big(\frac{x^2}{h}\exp\big(-C\sqrt{\log x}\big)\Big).
\]
On taking h = x/(log x)^2 we obtain an upper bound of the desired type. To
obtain a corresponding lower bound we argue similarly from the formula
\[
\frac{1}{2\pi i h}\int_{b-i\infty}^{b+i\infty} \zeta_{\mathcal{P}}(s)\,\frac{x^{s+1} - (x-h)^{s+1}}{s(s+1)}\,ds = \sum_{n\in\mathcal{N}} w_-(n)
\]
where
\[
w_-(u) = \begin{cases}
1 & (u \le x-h),\\
(x-u)/h & (x-h < u \le x),\\
0 & (u \ge x).
\end{cases}
\]

8.5 Notes
Section 8.1. Historical accounts of the development of prime number theory
and of the various proofs of the Prime Number Theorem have been given
by Bateman & Diamond (1996), Narkiewicz (2000), and by Schwarz (1994).
Axer’s theorem originates in Axer (1911). The deﬁnitive account of Axer’s
theorem is that of Landau (1912).
Section 8.2. In former times, an argument was considered to be ‘non-
elementary’ if it involved Cauchy’s theorem or Fourier inversion. Prior to Sel-
berg’s elementary proof of the Prime Number Theorem, a distinction was drawn
between those results that could be obtained by elementary arguments, and those
that could not. Selberg’s elementary proof rendered the terminology nugatory.
Theorem 8.3 and a deduction of the Prime Number Theorem occur in Selberg
(1949). There are a number of variants of the less than straightforward Tauberian
process used in the deduction; see, for example, Erdős (1949), Wright (1952),
and Levinson (1969). For a historical review of elementary proofs of the Prime
Number Theorem see Goldfeld (2004).
Quantitative estimates of the form
\[ \pi(x) = \operatorname{li}(x)\big(1 + O((\log x)^{-a})\big) \]
have been derived by elementary methods. van der Corput (1956) obtained
a = 1/200, Kuhn (1955) obtained a = 1/10, Breusch (1960) a = 1/6 − ε, and
Wirsing (1962) a = 3/4. Then Bombieri (1962a,b) and Wirsing (1964) showed
that the above is true for any fixed positive a. Subsequently, elementary
techniques have been used to show that
\[ \pi(x) = \operatorname{li}(x) + O\big(x\exp(-c(\log x)^{b})\big) \]
for various values of b. Diamond & Steinig (1970) obtained b = 1/7 − ε, Lavrik
& Sobirov (1973) b = 1/6 − ε, and Srinivasan & Sampath (1988) b = 1/6.
Although the estimates obtained by elementary methods have thus far been
weaker than those derived by analytic means, we have no reason to believe that
this will always be the case.
Section 8.3. The theorem of Ikehara (1931) represented a major advance,
because it gave for the first time a Tauberian theorem that could be used to
prove the Prime Number Theorem without imposing growth conditions on the
Dirichlet series generating function. Ikehara assumed that α(s) − c/(s − 1) is
analytic in the closed half-plane σ ≥ 1. Wiener (1932) showed that mere conti-
nuity is enough, but this is of lesser significance, since still weaker hypotheses
are sufficient – see Korevaar (2006).
The heart of the Wiener–Ikehara proof of the Prime Number Theorem is
Lemma 8.5, which has the effect of enabling one to reduce directly to a use
of the Riemann–Lebesgue lemma on a finite section of the line s = 1. In the
proof of Lemma 8.5 we see that it suffices to take T = C/ε, and from Exercise
8.3.5 we see that it is necessary to take T ≥ 1/(2ε) + O(1). Graham & Vaaler
(1981) have shown that f + and f − can be constructed so that equality is achieved
in Exercise 8.3.5(e),(g).
Lemma 8.5, with T small and ε large, is also useful for proving interesting
theorems of Fatou and Riesz. Fatou (1906) showed that if a_n = o(1), then the
series f(z) = Σ a_n z^n converges at any point of the circle |z| = 1 at which f is
analytic. Landau (1910, Section 10) gives Riesz's proof that if Σ_{n≤x} a_n = o(x),
then the Dirichlet series α(s) = Σ a_n n^{−s} converges at every point of the line
σ = 1 at which α(s) is analytic. Riesz (1916) extended this to generalized
Dirichlet series.
For detailed discussion of Wiener’s Tauberian theorem, the Ikehara theorem,
and Tauberian theorems associated with the elementary proof of the Prime
Number Theorem see Pitt (1958).
Section 8.4. The concept of generalized primes was introduced in Beurling
(1937). The hypothesis of Theorem 8.10 can be weakened: Kahane (1997) has
shown that if
\[ \int_1^\infty (N(x) - cx)^2\,x^{-3}(\log x)^2\,dx < \infty, \]
then (8.47) still follows.

Theorem 8.11 is due to Diamond (1970b). Diamond (1973) also showed
that if (8.46) holds with γ > 1, then one has an estimate P(x) ≍ x/log x of
the Chebyshev kind. Zhang (1993) showed that the hypothesis here can be
weakened to
\[ \int_1^\infty \sup_{y\le x}\frac{|N(y) - cy|}{y}\,\frac{dx}{x} < \infty. \]

In the negative direction, Hall (1973) showed that if γ < 1, then the hypothesis
(8.46) is not sufficient to imply a Chebyshev estimate. Also, Kahane (1998) has
shown that the hypothesis
\[ \int_1^\infty \frac{|N(x) - cx|}{x^2}\,dx < \infty \]
does not imply a Chebyshev estimate. Zhang (1987b) has shown that if (8.46)
holds with γ > 1, then
\[ \sum_{\substack{n\le x\\ n\in\mathcal{N}}} \mu(n) = o(x). \]
In the classical context, the above is equivalent – by Axer's theorem – to the
Prime Number Theorem. However, in the Beurling situation, if 1 < γ ≤ 3/2,
the above holds but PNT may fail.
Nyman (1949) showed that if (8.46) holds for all γ (with the implicit con-
stant depending on γ ), then P(x) = li(x) + Oc (x/(log x)c ) for all c. Malliavin
(1961) showed that if N (x) = cx + O(x exp(−(log x)a )) where 0 < a < 1,
then π(x) = li(x) + O(x exp(−(log x)b )) with b = a/10. Both these authors
proved converse theorems in which an estimate for P(x) is used to estab-
lish a corresponding estimate for N (x), but those results have since been
sharpened by Diamond (1970a). It is now known that the method of Landau,
in which one starts from (8.45) to derive the indicated error term, is
sharp: Diamond, Montgomery & Vorhauer (2006) have shown that if θ is given,
1/2 < θ < 1, then there exists a Beurling system for which (8.45) holds, but
\[ P(x) - \operatorname{li}(x) = \Omega_\pm\big(x\exp(-c\sqrt{\log x})\big). \]
Some of the ideas and themes developed in connection with the Prime Number
Theorem have had ramifications in surprisingly diverse areas. See, for example,
Hejhal's expositions (1976, 1983) of Selberg's trace formula for PSL(2, R),
and the monograph of Parry & Pollicott (1990) on the periodic orbit structure
of hyperbolic dynamics.
Some writers avoid the term ‘Beurling’, and instead discuss ‘arithmetic
semigroups’. The mathematics is the same in either case. For more on this topic
see Bateman & Diamond (1969), and Knopfmacher (1990).

8.6 References
Axer, A. (1911). Über einige Grenzwertsätze, Sitz. Kais. Akad. Wiss. Wien. math-natur.
Klasse 120, 1253–1298.
Balanzario, E. P. (2000). On Chebyshev’s inequalities for Beurling’s generalized primes,
Math. Slovaca 50, No.4, 415–436.
Bateman, P. T. (1972). The distribution of values of the Euler function, Acta Arith. 21,
329–345.
Bateman, P. T. & Diamond, H. G. (1969). Asymptotic distribution of Beurling’s
generalized prime numbers, Studies in Number Theory, W. J. LeVeque, Ed.,
MAA Studies in math. 6. Washington: Mathematical Association of America,
pp. 152–210.
(1996). A hundred years of prime numbers, Amer. Math. Monthly 103, 729–741.
Beurling, A. (1937). Analyse de la loi asymptotique de la distribution des nombres
premiers généralisés, I, Acta Math. 68, 255–291.
Bombieri, E. (1962a). Maggiorazione del resto nel “Primzahlsatz” col metodo di Erdős–
Selberg, Ist. Lombardo Accad. Sci. Lett. Rend. A 96, 343–350.
(1962b). Sulle formule di A. Selberg generalizzate per classi di funzioni aritmetiche
e le applicazioni al problema del resto nel “Primzahlsatz”, Riv. Mat. Univ. Parma
(2) 3, 393–440.
Borel, J.-P. (1980/81). Quelques résultats d’équirépartition liés aux nombres généralisés
de Beurling, Acta Arith. 38, 255–272.
(1984). Sur le prolongement des fonctions ζ associées à un système des nombres
premiers généralisés de Beurling, Acta Arith. 43, 273–282.
Breusch, R. (1960). An elementary proof of the prime number theorem with remainder
term, Paciﬁc J. Math. 10, 487–497.
van der Corput, J. G. (1956). Sur le reste dans la démonstration élémentaire du theorème
des nombres premiers, Colloque sur la Théorie des Nombres (Bruxelles, 1955).
Paris: Masson & Cie, pp. 163–182.
Diamond, H. G. (1969). The prime number theorem for Beurling’s generalized numbers,
J. Number Theory 1, 200–207.
(1970a). Asymptotic distribution of Beurling’s generalized integers, Illinois J. Math.
14, 12–28.
(1970b). A set of generalized numbers showing Beurling’s theorem to be sharp, Illinois
J. Math. 14, 29–34.
(1973). Chebyshev estimates for Beurling generalized prime numbers, Proc. Amer.
Math. Soc. 39, 503–508.
(1977). When do Beurling generalized integers have a density?, J. Reine Angew. Math.
295, 22–39.
Diamond, H. G., Montgomery, H. L., & Vorhauer, U. M. A. (2006). Beurling primes
with large oscillation, Math. Ann., 334, 1–36.
Diamond, H. G. & Steinig, J. (1970). An elementary proof of the prime number theorem
with a remainder term, Invent. Math. 11, 199–258.
Dressler, R. E. (1970). A density which counts multiplicity, Pacific J. Math. 34, 371–378.
Erdős, P. (1949). On a new method in elementary number theory which leads to an
elementary proof of the prime number theorem, Proc. Natl. Acad. Sci. USA 35,
374–384.

Fatou, P. (1906). Séries trigonométriques et séries de Taylor, Acta Math. 30, 335–400.
Goldfeld, D. (2004). The elementary proof of the prime number theorem: an histori-
cal perspective, Number Theory (New York, 2003). New York: Springer-Verlag,
pp. 179–192.
Graham, S. W. & Vaaler, J. D. (1981). A class of extremal functions for the Fourier
transform, Trans. Amer. Math. Soc. 265, 283–302.
Hall, R. S. (1972). The prime number theorem for generalized primes, J. Number Theory
4, 313–320.
(1973). Beurling generalized prime number systems in which the Chebyshev inequal-
ities fail, Proc. Amer. Math. Soc. 40, 79–82.
Hejhal, D. A. (1976). The Selberg Trace Formula for P S L(2, R). Vol. I, Lecture Notes
Math. 548. Berlin: Springer-Verlag.
(1983). The Selberg Trace Formula for P S L(2, R). Vol. 2, Lecture Notes Math. 1001.
Berlin: Springer-Verlag.
Ikehara, S. (1931). An extension of Landau’s theorem in the analytic theory of numbers,
J. Math. Phys. 10, 1–12.
Ingham, A. E. (1945). Some Tauberian theorems connected with the prime number
theorem, J. London Math. Soc. 20, 171–180.
Kahane, J.-P. (1995). Sur travaux de Beurling et Malliavin, Séminaire Bourbaki Vol. 7
Exp. 225, Paris: Soc. Math. France, 27–39.
(1996). Une formule de Fourier pour les nombres premiers. Application aux nombres
premiers généralisés de Beurling, Harmonic analysis from the Pichorides viewpoint
(Anogia, 1995) Publ. Math. Orsay, 96–01, Orsay: Univ. Paris XI, 41–49.
(1997). Sur les nombres premiers généralisés de Beurling. Preuve d’une conjecture
de Bateman et Diamond, J. Théor. Nombres Bordeaux 9, 251–266.
(1998). Le rôle des algèbres A de Wiener, A∞ de Beurling et H 1 de Sobolev
dans la théorie des nombres premiers généralisés de Beurling, Ann. Inst. Fourier
(Grenoble) 48, 611–648.
(1999). Un théorème de Littlewood pour les nombres premiers de Beurling Bull.
London Math. Soc. 31, 424–430.
Knopfmacher, J. (1990). Abstract Analytic Number Theory, Second Edition. New York:
Dover.
Korevaar, J. (2006). The Wiener–Ikehara theorem by complex analysis, Proc. Amer.
Math. Soc. 134, 1107–1116.
Kuhn, P. (1955). Eine Verbesserung des Restgliedes beim elementaren Beweis des
Primzahlsatzes, Math. Scand. 3, 75–89.
Landau, E. (1910). Über die Bedeutung einiger neuen Grenzwertsätze der Herren Hardy
und Axer, Prace mat.-ﬁz. 21, 97–177; Collected Works, Vol. 4. Essen: Thales Verlag,
1986, pp. 267–347.
(1912). Über einige neuere Grenzwertsätze, Rend. Circ. Mat. Palermo 34, 121–131;
Collected Works, Vol. 5. Essen: Thales Verlag, 1986, pp. 145–155.
Lavrik, A. F. & Sobirov, A. Š. (1973). The remainder term in the elementary proof of
the Prime Number Theorem, Dokl. Akad. Nauk SSSR 211, 534–536.
Levinson, N. (1969). A motivated account of an elementary proof of the Prime Number
Theorem, Amer. Math. Monthly 76, 225–245.
Malliavin, P. (1961). Sur le reste de la loi asymptotique de répartition des nombres
premiers généralisés de Beurling, Acta Math. 106, 281–298.

Narkiewicz, W. (2000). The Development of Prime Number Theory. Berlin:
Springer-Verlag.
Nyman, B. (1949). A general Prime Number Theorem, Acta Math. 81, 299–307.
Parry, W. & Pollicott, M. (1990). Zeta functions and the periodic orbit structure of
hyperbolic dynamics, Astérisque No. 268, pp. 187–188.
Pitt, H. R. (1958). Tauberian Theorems. Oxford: Oxford University Press.
Riesz, M. (1916). Ein Konvergenzsatz für Dirichletsche Reihen, Acta Math. 40, 349–361.
Schwarz, W. (1994). Some remarks on the history of the Prime Number Theorem
from 1896 to 1960, Development of mathematics 1900–1950 (Luxembourg, 1992).
Basel: Birkhäuser, pp. 565–616.
Selberg, A. (1949). An elementary proof of the prime-number theorem, Ann. Math. (2)
50, 305–313.
Srinivasan, B. R. & Sampath, A. (1988). An elementary proof of the Prime Number
Theorem with a remainder term, J. Indian Math. Soc., New Ser. 53, No. 1–4, 1–50.
Widder, D. V. (1971). An Introduction to Transform Theory. New York: Academic Press.
Wiener, N. (1932). Tauberian theorems, Ann. of Math. (2) 33, 1–100; Collected Works,
Vol. 2. Cambridge: MIT, 1979, pp. 519–619.
Wirsing, E. (1962). Elementare Beweise des Primzahlsatzes mit Restglied, I, J. Reine
Angew. Math. 211, 205–214.
(1964). Elementare Beweise des Primzahlsatzes mit Restglied, II, J. Reine
Angew. Math. 214/215, 1–18.
Wright, E. M. (1952). The elementary proof of the Prime Number Theorem, Proc. Roy.
Soc. Edinburgh A 63, 257–267.
Zhang, W. B. (1987a). Chebyshev type estimates for Beurling generalized prime num-
bers, Proc. Amer. Math. Soc. 101, 205–212.
(1987b). A generalization of Halász’s theorem to Beurling’s generalized integers and
its application, Illinois J. Math. 31, 645–664.
(1988). Density and O-density of Beurling generalized integers, J. Number Theory
30, 120–139.
(1993). Chebyshev type estimates for Beurling generalized prime numbers, II, Trans.
Amer. Math. Soc. 337, 651–675.
9
Primitive characters and Gauss sums

9.1 Primitive characters

Suppose that d | q and that χ* is a character (mod d), and set
\[
\chi(n) = \begin{cases}
\chi^{*}(n) & (n, q) = 1;\\
0 & \text{otherwise.}
\end{cases}
\tag{9.1}
\]
Then χ(n) is multiplicative and has period q, so by Theorem 4.7 we deduce that
χ(n) is a Dirichlet character (mod q). In this situation we say that χ* induces
χ. If q is composed entirely of primes dividing d, then χ(n) = χ*(n) for all n,
but if there is a prime factor of q not found in d, then χ(n) does not have period
d. Nevertheless, χ and χ* are nearly the same in the sense that χ(p) = χ*(p)
for all but at most finitely many primes, and hence
\[
L(s, \chi) = L(s, \chi^{*})\prod_{p\mid q}\Big(1 - \frac{\chi^{*}(p)}{p^{s}}\Big). \tag{9.2}
\]

Our immediate task is to determine when one character induces another.

Lemma 9.1 Let χ be a character (mod q). We say that d is a quasiperiod

of χ if χ (m) = χ (n) whenever m ≡ n (mod d) and (mn, q) = 1. The least
quasiperiod of χ is a divisor of q.

Proof Let d be a quasiperiod of χ , and put g = (d, q). We show that g is

also a quasiperiod of χ . Suppose that m ≡ n (mod g) and that (mn, q) = 1.
Since g is a linear combination of d and q, and m − n is a multiple of g,
it follows that there are integers x and y such that m − n = d x + qy. Then
χ (m) = χ (m − qy) = χ (n + d x) = χ (n). Thus g is a quasiperiod of χ .

With more effort (see Exercise 9.1.1) it can be shown that if d1 and d2
are quasiperiods of χ , then (d1 , d2 ) is also a quasiperiod, and hence the least


quasiperiod divides all other quasiperiods, and in particular it divides q (since

q is a quasiperiod of χ ).
The least quasiperiod d of χ is called the conductor of χ. Suppose that d
is the conductor of χ. If (n, d) = 1, then (n + kd, d) = 1. Also, if (r, d) = 1
then there exist values of k (mod r) for which (n + kd, r) = 1. Hence there
exist integers k for which (n + kd, q) = 1. For such a k put χ*(n) = χ(n + kd).
Although there are many such k, there is only one value of χ(n + kd) when
(n + kd, q) = 1. We extend the definition of χ* by setting χ*(n) = 0 when
(n, d) > 1. It is readily seen that χ* is multiplicative and that χ* has period
d. Thus by Theorem 4.7, χ* is a character modulo d. Moreover, if χ0 is the
principal character modulo q, then χ(n) = χ*(n)χ0(n). Thus χ* induces χ.
Clearly χ* has no quasiperiod smaller than d, for otherwise χ would have a
smaller quasiperiod, contradicting the minimality of d. In addition, χ* is the
only character (mod d) that induces χ, for if there were another, say χ1, then
for any n with (n, d) = 1 we would have χ*(n) = χ*(n + kd) = χ(n + kd) =
χ1(n + kd) = χ1(n), on choosing k as above.
A character χ modulo q is said to be primitive when q is the least quasiperiod
of χ. Such χ are not induced by any character having a smaller conductor. We
summarize our discussion as follows.
Theorem 9.2 Let χ denote a Dirichlet character modulo q and let d be the
conductor of χ. Then d | q, and there is a unique primitive character χ* modulo
d that induces χ.
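
For small moduli the conductor can be found by brute force straight from the definition of a quasiperiod. The following is a minimal sketch under assumptions made here: a character modulo q is stored as a list of its values on 0, 1, ..., q − 1, and the helper names `induce`, `is_quasiperiod` and `conductor` are hypothetical, not notation from the text.

```python
from math import gcd


def induce(chi_star, d, q):
    """Values mod q of the character induced by chi_star (mod d), following (9.1)."""
    assert q % d == 0
    return [chi_star[n % d] if gcd(n, q) == 1 else 0 for n in range(q)]


def is_quasiperiod(chi, q, d):
    """For d dividing q: chi(m) == chi(n) whenever m = n (mod d) and (mn, q) = 1.

    Because d | q and chi has period q, it is enough to test residues mod q.
    """
    units = [n for n in range(q) if gcd(n, q) == 1]
    return all(chi[m] == chi[n] for m in units for n in units if (m - n) % d == 0)


def conductor(chi, q):
    """Least quasiperiod of chi; by Lemma 9.1 only divisors of q need be tried."""
    return min(d for d in range(1, q + 1) if q % d == 0 and is_quasiperiod(chi, q, d))


chi4 = [0, 1, 0, -1]          # the non-principal character mod 4
chi12 = induce(chi4, 4, 12)   # the character mod 12 that it induces
chi0 = induce([1], 1, 12)     # the principal character mod 12
```

Here `conductor(chi12, 12)` recovers the modulus 4 of the inducing character, `conductor(chi4, 4)` is 4 (so that character is primitive), and the principal character has conductor 1.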
We now identify the primitive characters in such a way that we can describe
them in terms of the explicit construction of Section 5.2.
Lemma 9.3 Suppose that (q1 , q2 ) = 1 and that χ1 and χ2 are characters
modulo q1 and q2 , respectively. Put χ (n) = χ1 (n)χ2 (n). Then the character χ
is primitive modulo q1 q2 if and only if both χ1 and χ2 are primitive.
Proof For convenience write q = q1 q2. Suppose that χ is primitive modulo
q, and for i = 1, 2 let di be the conductor of χi. If (mn, q) = 1 and m ≡ n
(mod d1 d2) then χi(m) = χi(n) for i = 1, 2, and hence d1 d2 is a quasiperiod of
χ. Since χ is primitive, this means that d1 d2 = q. But di | qi, so this implies
that di = qi, which is to say that the characters χi are primitive.
Now suppose that χi is primitive modulo qi for i = 1, 2, and let d be the
conductor of χ. Put di = (d, qi). We show that d1 is a quasiperiod of χ1.
Suppose that m ≡ n (mod d1) and that (mn, q1) = 1. Choose m′ so that m′ ≡ m
(mod q1), m′ ≡ 1 (mod q2). Similarly, choose n′ so that n′ ≡ n (mod q1)
and n′ ≡ 1 (mod q2). Thus m′ ≡ n′ (mod d) and (m′n′, q) = 1, and hence
χ(m′) = χ(n′). But χ(m′) = χ1(m) and χ(n′) = χ1(n), so χ1(m) = χ1(n). Thus
d1 is a quasiperiod of χ1. Since χ1 is primitive, it follows that d1 = q1. Similarly
d2 = q2. Thus d = q, which is to say that χ is primitive.

By Lemma 9.3 we see that in order to exhibit the primitive characters
explicitly it suffices to determine the primitive characters (mod p^α). Suppose
first that p is odd, and let g be a primitive root of p^α. Then by (4.16) we know
that any character χ (mod p^α) is given by
\[ \chi(n) = e\Big(\frac{k\,\operatorname{ind}_g n}{\varphi(p^{\alpha})}\Big) \]
for some integer k. If α = 1, then χ is primitive if and only if it is non-principal,
which is to say that (p − 1) ∤ k. If α > 1, then χ is primitive if and only if p ∤ k.
Now consider primitive characters (mod 2^α). When α = 1 we have only the
principal character, which is imprimitive. When α = 2 we have two characters,
namely the principal character, which is imprimitive, and the primitive character
χ given by χ(4k + 1) = 1, χ(4k − 1) = −1. When α ≥ 3, we write an odd
integer n in the form n ≡ (−1)^μ 5^ν (mod 2^α), and then characters (mod 2^α) are
of the form
\[ \chi(n) = e\Big(\frac{j\mu}{2} + \frac{k\nu}{2^{\alpha-2}}\Big) \]
where j is determined (mod 2) and k is determined (mod 2^{α−2}). Here χ is
primitive if and only if k is odd.
We now give two useful criteria for primitivity.
We now give two useful criteria for primitivity.

Theorem 9.4 Let χ be a character modulo q. Then the following are
equivalent:
(1) χ is primitive.
(2) If d | q and d < q, then there is a c such that c ≡ 1 (mod d), (c, q) = 1,
χ(c) ≠ 1.
(3) If d | q and d < q, then for every integer a,
\[ \sum_{\substack{n=1\\ n\equiv a \ (\mathrm{mod}\ d)}}^{q} \chi(n) = 0. \]

Proof (1) ⇒ (2). Suppose that d | q, d < q. Since χ is primitive, there exist
integers m and n such that m ≡ n (mod d), χ(m) ≠ χ(n), χ(mn) ≠ 0. Choose
c so that (c, q) = 1, cm ≡ n (mod q). Thus we have (2).
(2) ⇒ (3). Let c be as in (2). As k runs through a complete residue system
(mod q/d), the numbers n = ac + kcd run through all residues (mod q) for
which n ≡ a (mod d). Thus the sum S in question is
\[ S = \sum_{k=1}^{q/d} \chi(ac + kcd) = \chi(c)S. \]
Since χ(c) ≠ 1, it follows that S = 0.

(3) ⇒ (1). Suppose that d | q, d < q. Take a = 1 in (3). Then χ(1) = 1
is one term in the sum, but the sum is 0, so there must be another term χ(n)
in the sum such that χ(n) ≠ 1, χ(n) ≠ 0. But n ≡ 1 (mod d), so d is not a
quasiperiod of χ, and hence χ is primitive.
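
Criterion (3) is easy to test by machine for small q. The sketch below is an illustration under assumptions made here: the list-of-values representation of a character and the function name `class_sums_vanish` are hypothetical. It compares the two characters modulo 8 that take the value 1 at n = 1 and vanish on even n: the primitive one with values 1, −1, −1, 1 at n = 1, 3, 5, 7, and the imprimitive one induced by the non-principal character modulo 4.

```python
def class_sums_vanish(chi, q):
    """Criterion (3) of Theorem 9.4: for each divisor d < q of q and each a,
    the sum of chi(n) over 1 <= n <= q with n = a (mod d) is 0."""
    for d in range(1, q):
        if q % d != 0:
            continue
        for a in range(d):
            total = sum(chi[n % q] for n in range(1, q + 1) if (n - a) % d == 0)
            if total != 0:
                return False
    return True


chi8_primitive = [0, 1, 0, -1, 0, -1, 0, 1]   # conductor 8
chi8_induced = [0, 1, 0, -1, 0, 1, 0, -1]     # induced by the character mod 4

primitive_passes = class_sums_vanish(chi8_primitive, 8)
induced_passes = class_sums_vanish(chi8_induced, 8)
```

For the induced character the test fails at d = 4, a = 1, where the two terms n = 1 and n = 5 both equal 1; this is exactly the failure of primitivity that criterion (3) detects.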

9.1.1 Exercises
1. Let f (n) be an arithmetic function with period q such that f (n) = 0 when-
ever (n, q) > 1. Call d a quasiperiod of f if f (m) = f (n) whenever m ≡ n
(mod d) and (mn, q) = 1.
(a) Suppose that d1 and d2 are quasiperiods, put g = (d1 , d2 ), and suppose
that m ≡ n (mod g) and (mn, q) = 1. Show that there exist integers a
and b such that m = n + ad1 + bd2 and (n + ad1 , q) = 1.
(b) Show that if d1 and d2 are quasiperiods of f then so also is (d1 , d2 ).
(c) Show that the least quasiperiod of f divides all quasiperiods.
2. Let S(q) denote the set of all Dirichlet characters χ (mod q), and put
T(q) = ∪_{d|q} S(d). Show that the members of T(q) form a basis of the vector
space of all arithmetic functions with period q if and only if q is square-free.
3. For d|q let U(d, q) denote the set of ϕ(q/d) functions
\[
f(a) = \begin{cases}
\chi(a/d) & (a, q) = d,\\
0 & \text{otherwise}
\end{cases}
\]
where χ runs over all Dirichlet characters (mod q/d). Set V(q) =
∪_{d|q} U(d, q). Show that the members of V(q) form a basis for the vector
space of arithmetic functions with period q.
4. For i = 1, 2 let χi be a character (mod qi ) where (q1 , q2 ) = 1, and suppose
that di is the conductor of χi . Show that d1 d2 is the conductor of χ1 χ2 .
5. For i = 1, 2 suppose that χi is a character (mod qi ). Show that the following
two assertions are equivalent:
(a) The characters χ1 and χ2 are induced by the same primitive character.
(b) χ1 ( p) = χ2 ( p) for all but at most ﬁnitely many primes p.
6. Let ϕ2(q) denote the number of primitive characters (mod q).
(a) Show that ϕ2(q) is a multiplicative function.
(b) Show that Σ_{d|q} ϕ2(d) = ϕ(q).
(c) Show that
\[
\varphi_2(q) = q\prod_{p\,\|\,q}\Big(1 - \frac{2}{p}\Big)\prod_{p^2\mid q}\Big(1 - \frac{1}{p}\Big)^{2}.
\]
(d) Show that ϕ2(q) > 0 if and only if q ≢ 2 (mod 4).

7. Suppose that χ is a character (mod q), and that d is the conductor of χ. Show
that if (a, q) = 1, then
\[
\Bigg|\sum_{\substack{n=1\\ n\equiv a \ (\mathrm{mod}\ d)}}^{q} \chi(n)\Bigg| = \frac{\varphi(q)}{\varphi(d)}.
\]

8. (Martin 2006; Vorhauer 2006) Let d(χ) denote the conductor of χ.
(a) Use the identity log d = Σ_{r|d} Λ(r) to show that
\[
\sum_{\chi} \log d(\chi) = \varphi(q)\log q - \sum_{r\mid q}\Lambda(r)\sum_{\substack{\chi\\ r\nmid d(\chi)}} 1.
\]
(b) Show that if p^a ‖ q and 1 ≤ b ≤ a, then the number of χ modulo q such
that p^b ∤ d(χ) is exactly ϕ(q)ϕ(p^{b−1})/ϕ(p^a).
(c) Conclude that
\[
\sum_{\chi} \log d(\chi) = \varphi(q)\Big(\log q - \sum_{p\mid q}\frac{\log p}{p-1}\Big).
\]

9.2 Gauss sums

Given a character χ modulo q, we define the Gauss sum τ(χ) of χ to be
\[ \tau(\chi) = \sum_{a=1}^{q} \chi(a)e(a/q). \tag{9.3} \]
This may be regarded as the inner product of the multiplicative character χ(a)
with the additive character e(a/q). As such, it is analogous to the gamma
function Γ(s) = ∫_0^∞ x^{s−1} e^{−x} dx, which is the inner product of the multiplicative
character x^s with the additive character e^{−x} with respect to the invariant measure
dx/x. Gauss sums are invaluable in transferring questions concerning Dirichlet
characters to questions concerning additive characters, and vice versa.
The Gauss sum is a special case of the more general sum
\[ c_\chi(n) = \sum_{a=1}^{q} \chi(a)e(an/q). \tag{9.4} \]
When χ is the principal character, this is Ramanujan's sum
\[ c_q(n) = \sum_{\substack{a=1\\ (a,q)=1}}^{q} e(an/q), \tag{9.5} \]
whose properties were discussed in Section 4.1. We now show that the sum
c_χ(n) is closely related to τ(χ).
Theorem 9.5 Suppose that χ is a character modulo q. If (n, q) = 1, then
\[ \chi(n)\tau(\overline{\chi}) = \sum_{a=1}^{q} \overline{\chi}(a)e(an/q), \tag{9.6} \]
and in particular
\[ \overline{\tau(\chi)} = \chi(-1)\tau(\overline{\chi}). \tag{9.7} \]
Proof If (n, q) = 1, then the map a ↦ an permutes the residues modulo q,
and hence
\[ \chi(n)c_\chi(n) = \sum_{a=1}^{q} \chi(an)e(an/q) = \tau(\chi). \]
On replacing χ by χ̄, this gives (9.6), and (9.7) follows by taking n = −1.

Theorem 9.6 Suppose that (q1 , q2 ) = 1, that χi is a character modulo qi for
i = 1, 2, and that χ = χ1 χ2 . Then
τ (χ ) = τ (χ1 )τ (χ2 )χ1 (q2 )χ2 (q1 ).
Proof By the Chinese Remainder Theorem, each a (mod q1 q2 ) can be written
uniquely as a1 q2 + a2 q1 with 1 ≤ ai ≤ qi . Thus the general term in (9.3) is
χ1 (a1 q2 )χ2 (a2 q1 )e(a1 /q1 ) e(a2 /q2 ), so the result follows.
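
The factorization in Theorem 9.6 can be confirmed numerically in a small case. The sketch below takes q1 = 3 and q2 = 4 with the non-principal characters to those moduli; the list representation of characters and the helper names `e` and `gauss_sum` are assumptions made for this illustration.

```python
import cmath


def e(x: float) -> complex:
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def gauss_sum(chi, q: int) -> complex:
    """tau(chi) as in (9.3); chi is the list of values chi(0), ..., chi(q-1)."""
    return sum(chi[a % q] * e(a / q) for a in range(1, q + 1))


q1, q2 = 3, 4
chi3 = [0, 1, -1]       # non-principal character mod 3
chi4 = [0, 1, 0, -1]    # non-principal character mod 4
chi = [chi3[n % q1] * chi4[n % q2] for n in range(q1 * q2)]  # chi = chi3 * chi4 mod 12

lhs = gauss_sum(chi, q1 * q2)
rhs = gauss_sum(chi3, q1) * gauss_sum(chi4, q2) * chi3[q2 % q1] * chi4[q1 % q2]
```

In this case τ(χ3) = i√3, τ(χ4) = 2i, χ3(4) = 1 and χ4(3) = −1, so both sides equal 2√3.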

For primitive characters the hypothesis that (n, q) = 1 in Theorem 9.5 can
be removed.
Theorem 9.7 Suppose that χ is a primitive character modulo q. Then (9.6)
holds for all n, and |τ(χ)| = √q.
Proof It suffices to prove (9.6) when (n, q) > 1. Choose m and d so that
(m, d) = 1 and m/d = n/q. Then
\[
\sum_{a=1}^{q} \overline{\chi}(a)e(an/q) = \sum_{h=1}^{d} e(hm/d)\sum_{\substack{a=1\\ a\equiv h \ (\mathrm{mod}\ d)}}^{q} \overline{\chi}(a).
\]
Since d | q and d < q, the inner sum vanishes by Theorem 9.4. Thus (9.6) holds
also in this case.
We replace χ in (9.6) by χ̄, take the square of the absolute value of both
sides, and sum over n to see that
\[
\varphi(q)|\tau(\chi)|^2 = \sum_{n=1}^{q}\Big|\sum_{a=1}^{q}\chi(a)e(an/q)\Big|^2
= \sum_{a=1}^{q}\sum_{b=1}^{q}\chi(a)\overline{\chi}(b)\sum_{n=1}^{q} e((a-b)n/q).
\]
The innermost sum on the right is 0 unless a ≡ b (mod q), in which case it is
equal to q. Thus ϕ(q)|τ(χ)|^2 = ϕ(q)q, and hence |τ(χ)| = √q.
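
As a numerical illustration of Theorem 9.7, the sketch below computes |τ(χ)| for the quadratic character modulo several odd primes p (a non-principal character to a prime modulus, hence primitive), and contrasts it with an imprimitive character whose Gauss sum vanishes. Using the Legendre symbol computed by Euler's criterion is a choice made here for convenience; the helper names are hypothetical.

```python
import cmath
import math


def e(x: float) -> complex:
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum(chi, q: int) -> complex:
    """tau(chi) as in (9.3); chi is the list of values chi(0), ..., chi(q-1)."""
    return sum(chi[a % q] * e(a / q) for a in range(1, q + 1))


PRIMES = [3, 5, 7, 11, 13]
abs_values = {p: abs(gauss_sum([legendre(a, p) for a in range(p)], p)) for p in PRIMES}

# Imprimitive contrast: the character mod 8 induced by the character mod 4.
chi8_induced = [0, 1, 0, -1, 0, 1, 0, -1]
tau_imprimitive = gauss_sum(chi8_induced, 8)
```

The vanishing of `tau_imprimitive` shows that the hypothesis of primitivity cannot simply be dropped from the statement |τ(χ)| = √q.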

If χ is primitive modulo q, then not only does (9.6) hold for all n but also
τ(χ̄) ≠ 0, and hence we have

Corollary 9.8 Suppose that χ is a primitive character modulo q. Then for
any integer n,
\[ \chi(n) = \frac{1}{\tau(\overline{\chi})}\sum_{a=1}^{q} \overline{\chi}(a)e(an/q). \]

This is very useful, since it allows us to express the multiplicative character
χ as a linear combination of additive characters e(an/q). As a first application,
we use this formula to express L(1, χ) in closed form.
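
The expansion of a primitive character in additive characters can be checked directly for a small complex-valued example. The sketch below uses the character modulo 5 determined by χ(2) = i (2 is a primitive root of 5); this particular character, the list representation, and the helper names are assumptions made for the illustration.

```python
import cmath


def e(x: float) -> complex:
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


q = 5
# chi(1) = 1, chi(2) = i, chi(3) = chi(2)^3 = -i, chi(4) = chi(2)^2 = -1.
chi = [0j, 1 + 0j, 1j, -1j, -1 + 0j]
chi_bar = [z.conjugate() for z in chi]

tau_chi_bar = sum(chi_bar[a % q] * e(a / q) for a in range(1, q + 1))


def additive_expansion(n: int) -> complex:
    """Right-hand side of Corollary 9.8 evaluated at the integer n."""
    return sum(chi_bar[a % q] * e(a * n / q) for a in range(1, q + 1)) / tau_chi_bar


max_error = max(abs(additive_expansion(n) - chi[n % q]) for n in range(-5, 15))
```

The range of n deliberately includes multiples of 5, where both sides vanish; that is the case (n, q) > 1 which requires primitivity.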

Theorem 9.9 Suppose that χ is a primitive character modulo q with q > 1.
If χ(−1) = 1, then
\[
L(1, \chi) = \frac{-\tau(\chi)}{q}\sum_{a=1}^{q-1} \overline{\chi}(a)\log(\sin \pi a/q), \tag{9.8}
\]
while if χ(−1) = −1, then
\[
L(1, \chi) = \frac{i\pi\tau(\chi)}{q^{2}}\sum_{a=1}^{q-1} a\overline{\chi}(a). \tag{9.9}
\]
Proof Since L(1, χ) = Σ_{n=1}^∞ χ(n)/n, by Corollary 9.8,
\[
L(1, \chi) = \frac{1}{\tau(\overline{\chi})}\sum_{n=1}^{\infty}\frac{1}{n}\sum_{a=1}^{q-1}\overline{\chi}(a)e(an/q)
= \frac{1}{\tau(\overline{\chi})}\sum_{a=1}^{q-1}\overline{\chi}(a)\sum_{n=1}^{\infty}\frac{e(an/q)}{n}.
\]
But log(1 − z)^{−1} = Σ_{n=1}^∞ z^n/n for |z| ≤ 1, z ≠ 1, where the logarithm is
the principal branch. We take z = e(θ) where 0 < θ < 1. Since 1 − e(θ) =
−2ie(θ/2) sin πθ, it follows that log(1 − e(θ)) = log(2 sin πθ) + iπ(θ − 1/2).
Thus
\[
L(1, \chi) = \frac{-1}{\tau(\overline{\chi})}\sum_{a=1}^{q-1}\overline{\chi}(a)\big(\log(2\sin \pi a/q) + i\pi(a/q - 1/2)\big).
\]
Since Σ_{a=1}^{q−1} χ̄(a) = 0, this is
\[ \frac{-1}{\tau(\overline{\chi})}(S + iT) \]
where S = Σ_{a=1}^{q−1} χ̄(a) log(sin πa/q) and T = (π/q) Σ_{a=1}^{q−1} χ̄(a)a. On replacing
a by q − a we see that S = χ(−1)S and T = −χ(−1)T. Thus if χ(−1) = 1,
then T = 0 and so
\[
L(1, \chi) = \frac{-1}{\tau(\overline{\chi})}\sum_{a=1}^{q-1}\overline{\chi}(a)\log(\sin \pi a/q).
\]
Then by (9.7) we obtain (9.8). If χ(−1) = −1 then S = 0 and so
\[
L(1, \chi) = \frac{-i\pi}{\tau(\overline{\chi})q}\sum_{a=1}^{q-1}\overline{\chi}(a)a.
\]
Then by (9.7) we obtain (9.9).
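
The closed forms (9.8) and (9.9) can be compared with partial sums of the series Σ χ(n)/n. The sketch below does this for three primitive characters: the odd character modulo 4, the even quadratic character modulo 5, and the odd complex character modulo 5 with χ(2) = i. The function names, the list representation of characters, and the cutoff 100000 are assumptions of this illustration; since the series converges only conditionally, the comparison is made to a loose tolerance.

```python
import cmath
import math


def e(x: float) -> complex:
    """Additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def gauss_sum(chi, q: int) -> complex:
    """tau(chi) as in (9.3); chi is the list of values chi(0), ..., chi(q-1)."""
    return sum(chi[a % q] * e(a / q) for a in range(1, q + 1))


def l_one_closed_form(chi, q: int) -> complex:
    """L(1, chi) from (9.8) if chi(-1) = 1 and from (9.9) if chi(-1) = -1."""
    tau = gauss_sum(chi, q)
    chi_bar = [complex(v).conjugate() for v in chi]
    if chi[(-1) % q] == 1:
        s = sum(chi_bar[a] * math.log(math.sin(math.pi * a / q)) for a in range(1, q))
        return -tau / q * s
    s = sum(a * chi_bar[a] for a in range(1, q))
    return 1j * math.pi * tau / q ** 2 * s


def l_one_partial_sum(chi, q: int, terms: int = 100_000) -> complex:
    """Partial sum of the Dirichlet series at s = 1 (a whole number of periods)."""
    return sum(chi[n % q] / n for n in range(1, terms + 1))


chi4 = [0, 1, 0, -1]               # odd, primitive mod 4
chi5_real = [0, 1, -1, -1, 1]      # even, primitive mod 5 (quadratic character)
chi5_complex = [0, 1, 1j, -1j, -1]  # odd, primitive mod 5, chi(2) = i
```

For the character modulo 4 the formula (9.9) reduces to Leibniz's value π/4, since τ(χ) = 2i and Σ aχ̄(a) = 1 − 3 = −2.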

We next show that $\tau(\chi)$ can be expressed in terms of $\tau(\chi^\star)$ where $\chi^\star$ is the primitive character that induces $\chi$.

Theorem 9.10 Let $\chi$ be a character modulo $q$ that is induced by the primitive character $\chi^\star$ modulo $d$. Then $\tau(\chi) = \mu(q/d)\chi^\star(q/d)\tau(\chi^\star)$.

Proof If $(d, q/d) > 1$, then $\chi^\star(q/d) = 0$, so we begin by showing that $\tau(\chi) = 0$ in this case. Let $p$ be a prime such that $p \mid d$, $p \mid q/d$, and write $a = jq/p + k$ with $0 \le j < p$, $0 \le k < q/p$. Then
\[
\tau(\chi) = \sum_{a=0}^{q-1} \chi(a) e(a/q) = \sum_{k=1}^{q/p} \sum_{j=1}^{p} \chi(jq/p + k)\, e(j/p + k/q).
\]
But $p \mid (q/p)$, so $(jq/p + k, q) = 1$ if and only if $(jq/p + k, q/p) = 1$, which in turn is equivalent to $(k, q/p) = 1$. Also, $d \mid q/p$, so the above is
\[
= \sum_{\substack{k=1 \\ (k,q/p)=1}}^{q/p} \chi^\star(k) e(k/q) \sum_{j=1}^{p} e(j/p).
\]
Here the inner sum vanishes, so $\tau(\chi) = 0$ when $(d, q/d) > 1$.

Now suppose that $(d, q/d) = 1$, and let $\chi_0$ denote the principal character modulo $q/d$. Then by Theorem 9.6,
\[
\tau(\chi) = \tau(\chi_0 \chi^\star) = \tau(\chi_0)\tau(\chi^\star)\chi_0(d)\chi^\star(q/d).
\]
By taking $n = 1$ in Theorem 4.1 we find that $\tau(\chi_0) = \mu(q/d)$. Thus we have the stated result.
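Theorem 9.10 is easy to test numerically. The sketch below is an illustration only: it assumes $\chi^\star$ is the non-principal character modulo $d = 3$ and induces it to several moduli $q$, including cases where $(d, q/d) > 1$ or $q/d$ is not square-free, in which the Gauss sum should vanish. The helper names are hypothetical.

```python
import cmath
import math

def e(x):
    return cmath.exp(2j * math.pi * x)

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def chi_star(n):
    """The primitive (non-principal) character modulo d = 3."""
    return (0, 1, -1)[n % 3]

def induced(q):
    """The character modulo q (a multiple of 3) induced by chi_star."""
    return lambda n: chi_star(n) if math.gcd(n, q) == 1 else 0

def gauss_sum(chi, q):
    return sum(chi(a) * e(a / q) for a in range(1, q + 1))

d = 3
for q in (3, 9, 12, 15, 21):
    lhs = gauss_sum(induced(q), q)
    rhs = mobius(q // d) * chi_star(q // d) * gauss_sum(chi_star, d)
    print(q, abs(lhs - rhs) < 1e-9)
```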

We now turn our attention to the more general $c_\chi(n)$. To this end we begin with an auxiliary result.

Lemma 9.11 Let $\chi$ be a character modulo $q$ induced by the primitive character $\chi^\star$ modulo $d$. Suppose that $r \mid q$. Then
\[
\sum_{\substack{n=1 \\ n \equiv b \ (\mathrm{mod}\ r)}}^{q} \chi(n) =
\begin{cases}
\chi^\star(b)\varphi(q)/\varphi(r) & \text{if } (b, r) = 1 \text{ and } d \mid r, \\
0 & \text{otherwise.}
\end{cases}
\]

Proof Let $S(b, r)$ denote the sum in question. If $p \mid (b, r)$ and $n \equiv b \pmod r$, then $p \mid n$, and so $(n, q) > 1$. Thus each term in $S(b, r)$ is 0, and we are done when $(b, r) > 1$, so we suppose that $(b, r) = 1$. Consider next the case when $d \nmid r$. Then $r$ is not a quasiperiod of $\chi$. Hence there exist $m$ and $n$ such that $(mn, q) = 1$, $m \equiv n \pmod r$, and $\chi(m) \ne \chi(n)$. Choose $c$ so that $cn \equiv m \pmod q$. Then $c \equiv 1 \pmod r$ and $\chi(c) \ne 1$. Hence $\chi(c)S(b, r) = S(b, r)$, as in the proof of Theorem 9.4, so $S(b, r) = 0$ in this case. Finally suppose that $d \mid r$. Let $\chi_0$ be the principal character modulo $q$. If $n \equiv b \pmod r$, then $\chi^\star(n) = \chi^\star(b)$. Thus
\[
S(b, r) = \chi^\star(b) \sum_{\substack{n=1 \\ n \equiv b \ (\mathrm{mod}\ r)}}^{q} \chi_0(n).
\]
Write $q/r = q_1 q_2$ where $q_1$ is the largest divisor of $q/r$ that is relatively prime to $r$. Then the sum on the right above is
\[
\sum_{\substack{k=1 \\ (kr+b,\,q_1)=1}}^{q_1 q_2} 1 = q_2 \varphi(q_1) = \varphi(q)/\varphi(r),
\]
as required.

We are now in a position to deal with $c_\chi(n)$.

Theorem 9.12 Let $\chi$ be a character modulo $q$ induced by the primitive character $\chi^\star$ modulo $d$. Put $r = q/(q, n)$. Then $c_\chi(n) = 0$ if $d \nmid r$, while if $d \mid r$, then
\[
c_\chi(n) = \overline{\chi}^\star(n/(q, n))\, \chi^\star(r/d)\, \mu(r/d)\, \frac{\varphi(q)}{\varphi(r)}\, \tau(\chi^\star).
\]

Proof If $(n, q) = 1$, then by Theorem 9.5 and Theorem 9.10 we see that
\[
c_\chi(n) = \overline{\chi}(n)\tau(\chi) = \overline{\chi}^\star(n)\mu(q/d)\chi^\star(q/d)\tau(\chi^\star).
\]
Since $r = q$, we have $d \mid r$, so we have the correct result. Now suppose that $(n, q) > 1$. In the definition (9.4) of $c_\chi(n)$, let $a = br + k$ with $0 \le b < q/r$, $1 \le k \le r$. Then
\[
c_\chi(n) = \sum_{k=1}^{r} e(kn/q) \sum_{b=1}^{q/r} \chi(br + k).
\]
By Lemma 9.11 this is 0 when $d \nmid r$. Thus we may suppose that $d \mid r$. Then, by Lemma 9.11,
\[
c_\chi(n) = \sum_{\substack{k=1 \\ (k,r)=1}}^{r} e(kn/q)\, \chi^\star(k)\, \varphi(q)/\varphi(r).
\]
Put $m = n/(q, n)$, and let $\chi_1$ denote the character modulo $r$ induced by $\chi^\star$. Then the above is
\[
= \frac{\varphi(q)}{\varphi(r)} \sum_{k=1}^{r} e(km/r)\chi_1(k).
\]
Since $(m, r) = 1$, we see by the first case treated that the above is
\[
\frac{\varphi(q)}{\varphi(r)}\, \overline{\chi}^\star(m)\mu(r/d)\chi^\star(r/d)\tau(\chi^\star),
\]
which suffices.
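The formula of Theorem 9.12 can be compared with a direct evaluation of $c_\chi(n) = \sum_{a=1}^{q} \chi(a) e(an/q)$. The sketch below is illustrative and rests on assumed choices: $\chi^\star$ is the character of order 4 modulo $d = 5$ with $\chi^\star(2) = i$, induced to the modulus $q = 20$. All function names are hypothetical.

```python
import cmath
import math

def e(x):
    return cmath.exp(2j * math.pi * x)

def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def phi(n):
    """Euler's function by brute force (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)

d, q = 5, 20
index = {1: 0, 2: 1, 4: 2, 3: 3}   # indices with respect to the primitive root 2 modulo 5

def chi_star(n):
    n %= d
    return 0j if n == 0 else 1j ** index[n]

def chi(n):
    """The character modulo q induced by chi_star."""
    return chi_star(n) if math.gcd(n, q) == 1 else 0j

def c_chi(n):
    """Direct evaluation of the definition (9.4)."""
    return sum(chi(a) * e(a * n / q) for a in range(1, q + 1))

tau_star = sum(chi_star(a) * e(a / d) for a in range(1, d + 1))

def c_chi_formula(n):
    """Right-hand side of Theorem 9.12."""
    g = math.gcd(q, n)
    r = q // g
    if r % d != 0:
        return 0j
    return (chi_star(n // g).conjugate() * chi_star(r // d) * mobius(r // d)
            * phi(q) / phi(r) * tau_star)

print(max(abs(c_chi(n) - c_chi_formula(n)) for n in range(0, 2 * q)))
```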

9.2.1 Exercises
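1. (a) Show that
\[
\frac{1}{\varphi(q)} \sum_{\chi} \overline{\chi}(a)\tau(\chi) =
\begin{cases}
e(a/q) & (a, q) = 1, \\
0 & \text{otherwise.}
\end{cases}
\]
(b) Show that for all integers $a$,
\[
e(a/q) = \sum_{\substack{d \mid q \\ d \mid a}} \frac{1}{\varphi(q/d)} \sum_{\chi \ (\mathrm{mod}\ q/d)} \overline{\chi}(a/d)\tau(\chi).
\]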

2. Let
\[
G_k(a) = \sum_{n=1}^{p} e\Bigl(\frac{a n^k}{p}\Bigr).
\]
(a) Let $N_k(h)$ denote the number of solutions of the congruence $x^k \equiv h \pmod p$. Explain why
\[
G_k(a) = \sum_{h=1}^{p} N_k(h)\, e\Bigl(\frac{ah}{p}\Bigr).
\]
(b) Let $l = (k, p - 1)$. Show that if $k$ is a positive integer, then $N_k(h) = N_l(h)$ for all $h$, and hence that $G_k(a) = G_l(a)$.
(c) Suppose that $k \mid (p - 1)$. Explain why
\[
\sum_{a=1}^{p} |G_k(a)|^2 = p \sum_{h=1}^{p} N_k(h)^2.
\]
(d) Suppose that $k \mid (p - 1)$. Show that there are $(p - 1)/k$ residues $h \pmod p$ for which $N_k(h) = k$, that $N_k(0) = 1$, and that $N_k(h) = 0$ for all other residue classes $\pmod p$. Hence show that the right-hand side above is $p(1 + (p - 1)k)$.
(e) Let $k$ be a divisor of $p - 1$. Suppose that $p \nmid a$, $p \nmid c$, and that $b \equiv ac^k \pmod p$. Show that $G_k(a) = G_k(b)$.
(f) Suppose that $k \mid (p - 1)$. Show that if $p \nmid a$ then $|G_k(a)| < k\sqrt{p}$.
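3. Suppose that $k \mid \varphi(q)$ and that $(h, q) = 1$.
(a) Explain why
\[
\frac{1}{\varphi(q)} \sum_{\chi} \chi(x^k)\overline{\chi}(h) =
\begin{cases}
1 & \text{if } x^k \equiv h \pmod q, \\
0 & \text{otherwise.}
\end{cases}
\]
(b) Let $N_k(h)$ be as in Exercise 2(a). Show that
\[
N_k(h) = \sum_{\substack{\chi \\ \chi^k = \chi_0}} \chi(h).
\]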

4. Suppose that $k \mid (p - 1)$, that $N_k(h)$ is as in Exercise 2(a), and let $\chi$ be a character of order $k$, say $\chi(n) = e((\operatorname{ind} n)/k)$.
(a) Show that for all $h$,
\[
N_k(h) = 1 + \sum_{j=1}^{k-1} \chi^j(h).
\]
(b) Show that if $p \nmid a$, then
\[
G_k(a) = \sum_{j=1}^{k-1} \overline{\chi}^j(a)\tau(\chi^j).
\]
(c) Show that if $p \nmid a$, then $|G_k(a)| \le (k - 1)\sqrt{p}$.
5. Suppose that $\chi_i$ is a character (mod $q_i$) for $i = 1, 2$, with $(q_1, q_2) = 1$. Show that
\[
c_{\chi_1\chi_2}(n) = \chi_1(q_2)\chi_2(q_1)\, c_{\chi_1}(n)\, c_{\chi_2}(n).
\]
6. (Apostol 1970) Let $\chi$ be a character modulo $q$ such that the identity (9.6) holds for all integers $n$. Show that $\chi$ is primitive (mod $q$).

7. Let $N(q)$ denote the number of pairs $x, y$ of residue classes (mod $q$) such that $y^2 \equiv x^3 + 7 \pmod q$.
(a) Show that $N(q)$ is a multiplicative function of $q$, that $N(2) = 2$, $N(3) = 3$, $N(7) = 7$, and that $N(p) = p$ when $p \equiv 2 \pmod 3$.
(b) Suppose that $p \equiv 1 \pmod 3$. Let $\chi_1(n)$ be a cubic character modulo $p$, and let $\chi_2(n) = \left(\frac{n}{p}\right)$ be the quadratic character modulo $p$. Show that
\[
N(p) = \frac{1}{p} \sum_{a=1}^{p} e(7a/p) \Bigl( \sum_{h=1}^{p} \bigl(1 + \chi_1(h) + \chi_1^2(h)\bigr) e(ah/p) \Bigr) \Bigl( \sum_{k=1}^{p} (1 + \chi_2(k)) e(-ak/p) \Bigr)
\]
\[
= p + \frac{2}{p}\, \Re\Bigl( \tau(\chi_1)\tau(\chi_2)\tau\bigl(\chi_1^2\chi_2\bigr)\, \chi_1\chi_2(-7) \Bigr),
\]
and deduce that $|N(p) - p| \le 2\sqrt{p}$.
(c) Deduce that $N(p) > 0$ for all $p$.
(d) Show that $N(2^k) = 2^{k-1}$ for $k \ge 2$, that $N(3^k) = 2 \cdot 3^{k-1}$ for $k \ge 2$, that $N(7^k) = 6 \cdot 7^{k-1}$ for $k \ge 2$, and that $N(p^k) = N(p)p^{k-1}$ for all other primes.
(e) Conclude that the congruence $y^2 \equiv x^3 + 7 \pmod q$ has solutions for every positive integer $q$.
(f) Suppose that $x$ and $y$ are integers such that $y^2 = x^3 + 7$. Show that $2 \mid y$, $x \equiv 1 \pmod 4$, and that $x > 0$. Note that $y^2 + 1 = (x + 2)(x^2 - 2x + 4)$, so that $y^2 + 1$ is composed of primes $\equiv 1 \pmod 4$, and yet $x + 2 \equiv 3 \pmod 4$. Deduce that this equation has no solution in integers.
8. (Mordell 1933) (a) Explain why the number $N$ of solutions of the congruence $c_1 x_1^{k_1} + \cdots + c_m x_m^{k_m} \equiv c \pmod p$ is
\[
N = \frac{1}{p} \sum_{a=1}^{p} e(-ac/p) \prod_{j=1}^{m} G_{k_j}(a c_j)
\]
where $G_k$ is defined as in Exercise 2.
(b) Suppose that $c = 0$ but that $p$ does not divide any of the numbers $c_j$. Show that $|N - p^{m-1}| \le C p^{m/2}$ where $C = \prod_{j=1}^{m} ((k_j, p - 1) - 1)$.
(c) Suppose that $c \not\equiv 0 \pmod p$ and that for all $j$, $c_j \not\equiv 0 \pmod p$. Show that $|N - p^{m-1}| \le C p^{(m-1)/2}$ where $C$ is defined as above.
9. (Mattics 1984) Suppose that $h$ has order $(p - 1)/k$ modulo $p$. Show that
\[
\Bigl| \sum_{m=1}^{p-1} e\Bigl(\frac{h^m}{p}\Bigr) \Bigr| \le 1 + (k - 1)\sqrt{p}.
\]
10. Let $\chi_1$ and $\chi_2$ be primitive characters (mod $q$).
(a) Show that if $(a, q) = 1$, then
\[
\sum_{n=1}^{q} \chi_1(n)\chi_2(a - n) = \chi_1\chi_2(a)\, q\, \frac{\tau(\overline{\chi}_1 \overline{\chi}_2)}{\tau(\overline{\chi}_1)\tau(\overline{\chi}_2)}.
\]
(b) Show that if $\chi_1\chi_2$ is primitive, then
\[
\sum_{n=1}^{q} \chi_1(n)\chi_2(a - n) = \chi_1\chi_2(a)\, \frac{\tau(\chi_1)\tau(\chi_2)}{\tau(\chi_1\chi_2)} \tag{9.10}
\]
for all $a$.

When $a = 1$, the sum (9.10) is known as the Jacobi sum $J(\chi_1, \chi_2)$. In the same way that the Gauss sum is analogous to the gamma function, the Jacobi sum (and its evaluation in terms of Gauss sums) is analogous to the beta function
\[
B(\alpha, \beta) = \int_0^1 x^{\alpha-1}(1 - x)^{\beta-1}\, dx = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}.
\]
11. Let $C$ be the smallest field that contains the field $\mathbb{Q}$ of rational numbers and is closed under square roots. Thus $C$ is the set of complex numbers that are constructible by ruler-and-compass. We show that if $p$ is of the form $p = 2^k + 1$, then $\zeta = e(1/p) \in C$, which is to say that a regular $p$-gon can be constructed.
(a) Let $p$ be any prime, and $\chi$ any non-principal character modulo $p$. Explain why
\[
\tau(\chi)^2 \sum_{n=1}^{p} \overline{\chi}(n)\overline{\chi}(1 - n) = p\,\tau(\chi^2).
\]
(b) From now on assume that $p$ is of the form $p = 2^k + 1$. Explain why $\chi^{2^k} = \chi_0$ for any character modulo $p$, and deduce that $\chi(n) \in C$ for all $\chi$ and all integers $n$.
(c) Deduce that if $\tau(\chi^2) \in C$, then $\tau(\chi) \in C$.
(d) Suppose that $\chi$ has order $2^r$. Show successively that the numbers
\[
-1 = \tau(\chi^{2^r}),\ \tau(\chi^{2^{r-1}}),\ \ldots,\ \tau(\chi^2),\ \tau(\chi)
\]
lie in $C$.
(e) Explain why $\sum_{\chi} \tau(\chi) = (p - 1)\zeta$.
(f) (Gauss) If $p = 2^k + 1$, then $\zeta \in C$.
12. Let $\chi$ be a character modulo $p$ and put $J(\chi) = \sum_{n=1}^{p} \chi(n)\chi(1 - n)$.
(a) Show that if $\chi^2 \ne \chi_0$, then $|J(\chi)| = \sqrt{p}$.
(b) Suppose that $p \equiv 1 \pmod 4$. Show that there is a quartic character $\chi$ modulo $p$.
(c) Show that if $\chi$ is a quartic character, then $J(\chi)$ is a Gaussian integer. That is, $J(\chi) = a + ib$ where $a$ and $b$ are rational integers.
(d) Deduce that $a^2 + b^2 = p$.
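13. (a) Write
\[
|\tau(\chi)|^2 = \sum_{m=1}^{q} \chi(m)e(m/q) \sum_{n=1}^{q} \overline{\chi}(n)e(-n/q),
\]
and in the second sum replace $n$ by $mn$ where $(m, q) = 1$, to see that the above is
\[
= \sum_{n=1}^{q} \overline{\chi}(n)\, c_q(n - 1).
\]
(b) Use Theorem 4.1 to show that the above is
\[
= \sum_{d \mid q} d\,\mu(q/d) \sum_{\substack{n=1 \\ n \equiv 1 \ (\mathrm{mod}\ d)}}^{q} \overline{\chi}(n).
\]
(c) Use Theorem 9.4 to show that if $\chi$ is primitive, then $|\tau(\chi)| = \sqrt{q}$.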

9.3 Quadratic characters

A character is quadratic if it has order 2 in the group of characters modulo $q$. That is, the character takes on only the values $-1$, 0, and 1, with at least one $-1$. Similarly, a character is real if all its values are real. Hence a real character is either the principal character or a quadratic character. The Legendre symbol $\left(\frac{n}{p}\right)_L$ is a primitive quadratic character modulo $p$, and further quadratic characters arise from the Jacobi and Kronecker symbols. We now determine all quadratic characters modulo $q$. If $\chi$ is a character modulo $q$ induced by the primitive character $\chi^\star$ modulo $d$, $d \mid q$, then $\chi$ is quadratic if and only if $\chi^\star$ is quadratic. Hence it suffices to determine the primitive quadratic characters.

Suppose that $\chi$ is a character modulo $q$, that $q = q_1 q_2$, $(q_1, q_2) = 1$, $\chi = \chi_1\chi_2$ as in Lemma 9.3. By the Chinese Remainder Theorem we see that $\chi$ is a real character if and only if both $\chi_1$ and $\chi_2$ are real characters. Hence by Lemma 9.3, $\chi$ is a primitive quadratic character if and only if $\chi_1$ and $\chi_2$ are. Thus it suffices to determine the primitive quadratic characters modulo a prime power.

In Section 5.2 we saw that a character $\chi$ modulo $p$ may be written in the form $\chi(n) = e(k \operatorname{ind} n/(p - 1))$. Such a character is primitive provided that it is non-principal, which is to say that $k \not\equiv 0 \pmod{p - 1}$. Similarly, $\chi$ is quadratic if and only if the least denominator of the fraction $k/(p - 1)$ is 2. If $p = 2$ then this is impossible, but for $p > 2$ this is equivalent to the condition $k \equiv (p - 1)/2 \pmod{p - 1}$. Thus there is no quadratic character modulo 2, but for each odd prime $p$ there is a unique quadratic character, given by the Legendre symbol.

Now suppose that $p$ is an odd prime and that $q = p^m$ with $m > 1$. We have seen that a character $\chi$ modulo such a $q$ is of the form $\chi(n) = e(k \operatorname{ind} n/\varphi(q))$, and that $\chi$ is primitive if and only if $p \nmid k$. This character is quadratic only when $k \equiv \varphi(q)/2 \pmod{\varphi(q)}$, so there is a unique quadratic character modulo $q$, but it is not primitive because $p \mid k$ for this $k$. That is, the only quadratic character modulo $p^m$ is induced by the primitive quadratic character modulo $p$.
Finally, suppose that $q = 2^m$. For the modulus 2 there is only the principal character, but for $q = 4$ we have a primitive quadratic character
\[
\chi_1(n) =
\begin{cases}
(-1)^{(n-1)/2} & (n \text{ odd}), \\
0 & (n \text{ even}).
\end{cases}
\]
For $m > 2$ we write $\chi((-1)^\mu 5^\nu) = e(j\mu/2 + k\nu/2^{m-2})$, and we see that this character is real if and only if $2^{m-3} \mid k$. However, the character is primitive if and only if $k$ is odd, so primitive quadratic characters arise only when $m = 3$, and for this modulus we have two different characters (corresponding to $j = 0$, $j = 1$). Let $\chi_2((-1)^\mu 5^\nu) = e(\nu/2)$. That is, $\chi_2(n) = (-1)^{(n^2-1)/8}$. Then the characters modulo 8 are $\chi_0$, $\chi_1$, $\chi_2$, and $\chi_1\chi_2$, of which the latter two are primitive.

We next show that the primitive quadratic characters arise precisely from the Kronecker symbol $\left(\frac{d}{n}\right)_K$. We say that $d$ is a quadratic discriminant if either
(a) $d \equiv 1 \pmod 4$ and $d$ is square-free
or
(b) $4 \mid d$, $d/4 \equiv 2$ or $3 \pmod 4$, and $d/4$ is square-free.
For each quadratic discriminant $d$ we define the Kronecker symbol $\left(\frac{d}{n}\right)_K$ by the following relations:

(i) $\left(\frac{d}{p}\right)_K = 0$ when $p \mid d$;

(ii) $\left(\frac{d}{2}\right)_K = \begin{cases} 1 & \text{when } d \equiv 1 \pmod 8, \\ -1 & \text{when } d \equiv 5 \pmod 8; \end{cases}$

(iii) $\left(\frac{d}{p}\right)_K = \left(\frac{d}{p}\right)_L$, the Legendre symbol, when $p > 2$;

(iv) $\left(\frac{d}{-1}\right)_K = \begin{cases} 1 & \text{when } d > 0, \\ -1 & \text{when } d < 0; \end{cases}$

(v) $\left(\frac{d}{n}\right)_K$ is a totally multiplicative function of $n$.

It is not immediately apparent that this definition of the Kronecker symbol gives rise to a character, but we now show that this is the case.
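Rules (i) to (v) translate directly into a small program. The following is a sketch rather than a reference implementation: it assumes that `d` is a quadratic discriminant in the sense just defined (this is not verified), factors $n$ by trial division, and adopts the convention that the symbol is 0 at $n = 0$ unless $d = 1$, a case the rules above do not address. The function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def kronecker(d, n):
    """Kronecker symbol (d/n) for a quadratic discriminant d (assumed, not checked)."""
    if n == 0:
        return 1 if d == 1 else 0     # convention chosen for this sketch
    result = 1
    if n < 0:                         # rule (iv), combined with rule (v)
        n = -n
        if d < 0:
            result = -result
    p = 2
    while n > 1:
        if p * p > n:
            p = n                     # what remains of n is prime
        while n % p == 0:
            n //= p
            if d % p == 0:
                return 0              # rule (i)
            if p == 2:
                result *= 1 if d % 8 == 1 else -1   # rule (ii); d is odd here
            else:
                result *= legendre(d, p)            # rule (iii)
        p += 1
    return result

# The symbol should be periodic in n with period |d|.
for d in (-3, -4, 5, 8, -8, 12, -15):
    print(d, [kronecker(d, n) for n in range(1, abs(d) + 1)])
```

Periodicity modulo $|d|$ is not evident from the code, which is the point of the theorem that follows in the text.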

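Theorem 9.13 Let $d$ be a quadratic discriminant. Then $\chi_d(n) = \left(\frac{d}{n}\right)_K$ is a primitive quadratic character modulo $|d|$, and every primitive quadratic character is given uniquely in this way.

Proof It is easy to see that $\left(\frac{-4}{n}\right)_K$ is the primitive quadratic character modulo 4. Similarly, $\left(\frac{8}{n}\right)_K$ and $\left(\frac{-8}{n}\right)_K$ are the primitive quadratic characters modulo 8.

Suppose that $p$ is a prime, $p \equiv 1 \pmod 4$. We show that $\left(\frac{p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all $n$. To see this, note that if $q$ is an odd prime, then by (iii) and quadratic reciprocity, $\left(\frac{p}{q}\right)_K = \left(\frac{p}{q}\right)_L = \left(\frac{q}{p}\right)_L$. Also, $\left(\frac{p}{2}\right)_K = (-1)^{(p^2-1)/8} = \left(\frac{2}{p}\right)_L$, and $\left(\frac{p}{-1}\right)_K = 1 = \left(\frac{-1}{p}\right)_L$. Since these two functions agree on all primes, and also on $-1$, and both are totally multiplicative, it follows that $\left(\frac{p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all integers $n$.

Suppose that $p$ is a prime, $p \equiv 3 \pmod 4$. We show that $\left(\frac{-p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all $n$. To see this, note that if $q$ is an odd prime, then by (iii) and quadratic reciprocity, $\left(\frac{-p}{q}\right)_K = \left(\frac{-p}{q}\right)_L = \left(\frac{q}{p}\right)_L$. Also, $\left(\frac{-p}{2}\right)_K = (-1)^{((-p)^2-1)/8} = (-1)^{(p^2-1)/8} = \left(\frac{2}{p}\right)_L$, and $\left(\frac{-p}{-1}\right)_K = -1 = \left(\frac{-1}{p}\right)_L$. Since these two functions agree on all primes, and also on $-1$, and both are totally multiplicative, it follows that $\left(\frac{-p}{n}\right)_K = \left(\frac{n}{p}\right)_L$ for all integers $n$.

Suppose next that $d_1$ and $d_2$ are quadratic discriminants with $(d_1, d_2) = 1$. Put $d = d_1 d_2$. Supposing that $\left(\frac{d_i}{n}\right)_K$ is a primitive quadratic character modulo $|d_i|$ for $i = 1, 2$, we shall show that $\left(\frac{d}{n}\right)_K$ is a primitive quadratic character modulo $|d|$. If $q$ is an odd prime, then by (iii), $\left(\frac{d}{q}\right)_K = \left(\frac{d}{q}\right)_L = \left(\frac{d_1}{q}\right)_L \left(\frac{d_2}{q}\right)_L = \left(\frac{d_1}{q}\right)_K \left(\frac{d_2}{q}\right)_K$. Also, by (ii) we see that $\left(\frac{d}{2}\right)_K = \left(\frac{d_1}{2}\right)_K \left(\frac{d_2}{2}\right)_K$, and by (iv) that $\left(\frac{d}{-1}\right)_K = \left(\frac{d_1}{-1}\right)_K \left(\frac{d_2}{-1}\right)_K$. Since $\left(\frac{d}{n}\right)_K = \left(\frac{d_1}{n}\right)_K \left(\frac{d_2}{n}\right)_K$ when $n$ is a prime or $n = -1$, and since both sides are totally multiplicative functions, it follows that this identity holds for all integers $n$. Hence by Lemma 9.3, $\left(\frac{d}{n}\right)_K$ is a primitive character modulo $|d|$. This allows us to account for all primitive quadratic characters, so the proof is complete.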

Since the Kronecker symbol and Legendre symbol agree whenever both are defined, we may omit the subscripts. The same remark applies to the Jacobi symbol $\left(\frac{n}{q}\right)_J$, which for odd positive $q = p_1 p_2 \cdots p_r$ is defined to be $\left(\frac{n}{q}\right)_J = \prod_{i=1}^{r} \left(\frac{n}{p_i}\right)_L$. Sometimes we let $\chi_d(n)$ denote the character $\left(\frac{d}{n}\right)$.

A character $\chi$ modulo $q$ is an even function, $\chi(-n) = \chi(n)$, if $\chi(-1) = 1$; for the primitive quadratic character $\chi_d$ this occurs if $d > 0$. In the case of the Legendre symbol, if $p \equiv 1 \pmod 4$, then $\left(\frac{n}{p}\right)_L = \chi_p(n)$ is even. Similarly, $\chi$ is odd, $\chi(-n) = -\chi(n)$, if $\chi(-1) = -1$. For $\chi_d$ this occurs when $d < 0$. For the Legendre symbol, if $p \equiv 3 \pmod 4$, then $\left(\frac{n}{p}\right)_L = \chi_{-p}(n)$ is odd.

We have taken the quadratic reciprocity law for the Legendre symbol for granted, since it is treated in a variety of ways in elementary texts. In Exercise 9.3.6 below we outline a proof of quadratic reciprocity that is unusual in that it applies directly to the Jacobi symbol, without first being restricted to the Legendre symbol. For future purposes it is convenient to formulate quadratic reciprocity also for the Kronecker symbol.

Theorem 9.14 Suppose that $d_1$ and $d_2$ are relatively prime quadratic discriminants. Then
\[
\left(\frac{d_1}{d_2}\right) = \varepsilon(d_1, d_2) \left(\frac{d_2}{d_1}\right) \tag{9.11}
\]
where $\varepsilon(d_1, d_2) = 1$ if $d_1 > 0$ or $d_2 > 0$, and $\varepsilon(d_1, d_2) = -1$ if $d_1 < 0$ and $d_2 < 0$.

For odd $n$ let $m^2$ be the largest square dividing $n$. Then there is a unique choice of sign and a unique quadratic discriminant $d_2$ such that $n = \pm m^2 d_2$, and then if $(n, d_1) = 1$ the above can be applied to express $\left(\frac{d_1}{n}\right)$ in terms of $\left(\frac{d_2}{d_1}\right)$. If $n$ is even, then $4n = m^2 d_2$ for unique $m$ and quadratic discriminant $d_2$, so if $(n, d_1) = 1$ we can again express $\left(\frac{d_1}{n}\right)$ in terms of $\left(\frac{d_2}{d_1}\right)$.

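Proof Suppose that $d_1 = p \equiv 1 \pmod 4$. Then
\[
\left(\frac{p}{d_2}\right)_K = \left(\frac{d_2}{p}\right)_L = \left(\frac{d_2}{p}\right)_K,
\]
so (9.11) holds in this case. Next suppose that $d_1 = -p$ where $p \equiv 3 \pmod 4$. Then
\[
\left(\frac{-p}{d_2}\right)_K = \left(\frac{d_2}{p}\right)_L = \left(\frac{d_2}{-1}\right)_K \left(\frac{d_2}{-p}\right)_K,
\]
so (9.11) holds in this case also. Next consider the case $d_1 = -4$. Then $d_2$ is odd, and hence $d_2 \equiv 1 \pmod 4$, so that $\left(\frac{-4}{d_2}\right)_K = \left(\frac{-4}{1}\right)_K = 1$, while $\left(\frac{d_2}{-4}\right)_K = \left(\frac{d_2}{-1}\right)_K$, and (9.11) again holds. If $d_1 = 8$ then $d_2$ is odd and $\left(\frac{8}{d_2}\right)_K = (-1)^{(d_2^2-1)/8} = \left(\frac{d_2}{8}\right)_K$, so (9.11) holds. Similarly, if $d_2$ is odd, then $\left(\frac{-8}{d_2}\right)_K = \left(\frac{-4}{d_2}\right)_K \left(\frac{8}{d_2}\right)_K = \left(\frac{8}{d_2}\right)_K = \left(\frac{d_2}{8}\right)_K = \left(\frac{d_2}{-1}\right)_K \left(\frac{d_2}{-8}\right)_K$, so again (9.11) holds.

Now let $d_1$, $d_2$ and $d$ be pairwise coprime quadratic discriminants. Then
\[
\left(\frac{d_1 d_2}{d}\right)_K = \left(\frac{d_1}{d}\right)_K \left(\frac{d_2}{d}\right)_K.
\]
Suppose that (9.11) holds for the pair $d_1, d$, and also for the pair $d_2, d$. Then the above is
\[
= \varepsilon(d_1, d) \left(\frac{d}{d_1}\right)_K \varepsilon(d_2, d) \left(\frac{d}{d_2}\right)_K
= \varepsilon(d_1, d)\varepsilon(d_2, d) \left(\frac{d}{d_1 d_2}\right)_K.
\]
But $\varepsilon(d_1, d)\varepsilon(d_2, d) = \varepsilon(d_1 d_2, d)$, so it follows that (9.11) holds also for the pair $d_1 d_2, d$. Since all quadratic discriminants can be constructed as the product of smaller quadratic discriminants, or by appealing to the special cases already considered, it follows now that (9.11) holds for all quadratic discriminants.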

Let $\chi$ be a character modulo $q$. By means of Theorems 9.7 and 9.10 we can describe $|\tau(\chi)|$. By Theorem 9.5 we may also relate the argument of $\tau(\chi)$ to that of $\tau(\overline{\chi})$, but otherwise there is little in general that we can say about the argument of $\tau(\chi)$. However, in the special case of quadratic characters, a striking phenomenon arises, which was first noted and established by Gauss. Suppose that $\chi_d$ is a primitive quadratic character. Then $\overline{\chi}_d = \chi_d$, so by multiplying both sides of (9.7) by $\tau(\chi_d)$, and using Theorem 9.7, we see that $\tau(\chi_d)^2 = \chi_d(-1)|d| = d$. Thus $\tau(\chi_d) = \pm\sqrt{d}$ if $d > 0$ and $\tau(\chi_d) = \pm i\sqrt{-d}$ if $d < 0$. We show below that in both cases it is always the positive sign that occurs. We begin with the following fundamental result.

Theorem 9.15 Let
\[
S(a, q) = \sum_{n=1}^{q} e\Bigl(\frac{an^2}{2q}\Bigr).
\]
If $a$ and $q$ are positive integers and at least one of them is even, then
\[
S(a, q) = \overline{S(q, a)}\, e(1/8) \sqrt{q/a}.
\]

Proof We apply the Poisson summation formula, in the form of Theorem D.3, to the function $f(x) = e(ax^2/(2q))$ for $1/2 < x < q + 1/2$, with $f(x) = 0$ otherwise. Thus
\[
S(a, q) = \sum_{n} f(n) = \lim_{K \to \infty} \sum_{k=-K}^{K} \widehat{f}(k)
\]
where
\[
\widehat{f}(k) = \int_{1/2}^{q+1/2} e\bigl(ax^2/(2q) - kx\bigr)\, dx.
\]
We complete the square by writing
\[
\frac{ax^2}{2q} - kx = \frac{a}{2q}(x - kq/a)^2 - \frac{k^2 q}{2a},
\]
and make the change of variable $u = (x - kq/a)/q$, to see that
\[
\widehat{f}(k) = q\, e\bigl(-k^2 q/(2a)\bigr) \int_{1/(2q)-k/a}^{1/(2q)+1-k/a} e(aqu^2/2)\, du.
\]
By integrating by parts we see that
\[
\widehat{f}(k) \ll_{a,q} 1/(|k| + 1).
\]
Since at least one of $a$ and $q$ is even, if $k \equiv r \pmod a$ then $qk^2 \equiv qr^2 \pmod{2a}$. Thus if we write $k = am + r$, then
\[
\sum_{k=-K}^{K} \widehat{f}(k) = q \sum_{r=1}^{a} e\Bigl(\frac{-qr^2}{2a}\Bigr) \sum_{m=-K/a}^{K/a} \int_{1/(2q)-m-r/a}^{1/(2q)+1-m-r/a} e(aqu^2/2)\, du + O_{q,a}(1/K).
\]
Here the integrals may be combined to form one integral, which, as $K$ tends to infinity, tends to $I(aq/2)$ where $I(c) = \int_{-\infty}^{\infty} e(cu^2)\, du$. This is a conditionally convergent improper Riemann integral, but it is not necessary to evaluate this symmetrically as $\lim_{U \to \infty} \int_{-U}^{U}$, since $\int_{U}^{\infty} e(cu^2)\, du \ll 1/U$, by integration by parts. Thus we have shown that
\[
S(a, q) = q\, \overline{S(q, a)}\, I(aq/2).
\]
We take $a = 2$ and $q = 1$, and note that $S(2, 1) = 1$ and $S(1, 2) = 1 + i$. Hence $I(1) = 1/(1 - i) = e(1/8)/\sqrt{2}$. By a linear change of variables it is clear that if $c > 0$ then $I(c) = I(1)/\sqrt{c}$. On combining this information in the above, we obtain the stated identity.
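The reciprocity relation of Theorem 9.15 can be spot-checked in floating point. The sketch below is illustrative only; the name `S` mirrors the notation of the theorem, and the residue $an^2$ is reduced modulo $2q$ before the exponential is taken purely to limit rounding error.

```python
import cmath
import math

def e(x):
    return cmath.exp(2j * math.pi * x)

def S(a, q):
    """S(a, q) = sum_{n=1}^{q} e(a n^2 / (2q))."""
    return sum(e((a * n * n % (2 * q)) / (2 * q)) for n in range(1, q + 1))

worst = 0.0
for a in range(1, 13):
    for q in range(1, 13):
        if (a * q) % 2 == 0:          # the theorem requires a or q to be even
            rhs = S(q, a).conjugate() * e(1 / 8) * math.sqrt(q / a)
            worst = max(worst, abs(S(a, q) - rhs))
print(worst)
```

The printed discrepancy should be of the size of rounding error. Dropping the parity condition makes the identity fail, for example at $a = q = 1$.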

By taking a = 2 we immediately obtain

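Corollary 9.16 (Gauss) For any positive integer $q$,
\[
\sum_{n=1}^{q} e(n^2/q) = \frac{1 + i^{-q}}{1 + i^{-1}}\, q^{1/2} =
\begin{cases}
q^{1/2} & \text{if } q \equiv 1 \pmod 4, \\
0 & \text{if } q \equiv 2 \pmod 4, \\
i q^{1/2} & \text{if } q \equiv 3 \pmod 4, \\
(1 + i) q^{1/2} & \text{if } q \equiv 0 \pmod 4.
\end{cases}
\]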
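Gauss's evaluation is simple to confirm by direct summation. A brief sketch (the function names are hypothetical):

```python
import cmath
import math

def e(x):
    return cmath.exp(2j * math.pi * x)

def quadratic_gauss_sum(q):
    """sum_{n=1}^{q} e(n^2 / q), with n^2 reduced mod q to limit rounding error."""
    return sum(e((n * n % q) / q) for n in range(1, q + 1))

def predicted(q):
    """The four cases of Corollary 9.16, selected by q mod 4."""
    factor = {1: 1, 2: 0, 3: 1j, 0: 1 + 1j}[q % 4]
    return factor * math.sqrt(q)

print(max(abs(quadratic_gauss_sum(q) - predicted(q)) for q in range(1, 61)))
```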

This in turn enables us to reach our goal.

Theorem 9.17 Let $\chi_d(n) = \left(\frac{d}{n}\right)$ be a primitive quadratic character. If $d > 0$, then $\tau(\chi_d) = \sqrt{d}$. If $d < 0$ then $\tau(\chi_d) = i\sqrt{-d}$.

In the special case of the Legendre symbol, if we write $\tau_p = \sum_{n=1}^{p} \left(\frac{n}{p}\right) e(n/p)$, then this asserts that $\tau_p = \sqrt{p}$ for $p \equiv 1 \pmod 4$, while $\tau_p = i\sqrt{p}$ for $p \equiv 3 \pmod 4$.

Proof As in some of the preceding proofs, we establish the identities when the modulus is an odd prime or power of 2, and then write $d = d_1 d_2$ to extend to the general primitive quadratic character.

Let
\[
G(a, q) = \sum_{x=1}^{q} e\Bigl(\frac{ax^2}{q}\Bigr). \tag{9.12}
\]
If $p$ is an odd prime, then the number of solutions of the congruence $x^2 \equiv n \pmod p$ is $1 + \left(\frac{n}{p}\right)_L$, so $G(a, p) = \sum_{n=1}^{p} \bigl(1 + \left(\frac{n}{p}\right)\bigr) e(an/p)$. Thus if $p \nmid a$, then
\[
G(a, p) = \sum_{n=1}^{p} \left(\frac{n}{p}\right) e(an/p). \tag{9.13}
\]
Suppose that $p \equiv 1 \pmod 4$. Then from the above we see that $\tau(\chi_p) = G(1, p)$, and then by taking $q = p$ in Corollary 9.16 it follows that $G(1, p) = \sqrt{p}$ in this case.

Now suppose that $p \equiv 3 \pmod 4$. Then from the above we see that $\tau(\chi_{-p}) = G(1, p)$, and then by taking $q = p$ in Corollary 9.16 it follows that $G(1, p) = i\sqrt{p}$ in this case.

Clearly $\tau(\chi_{-4}) = e(1/4) - e(3/4) = 2i$, $\tau(\chi_8) = e(1/8) - e(3/8) - e(5/8) + e(7/8) = \sqrt{8}$, and $\tau(\chi_{-8}) = e(1/8) + e(3/8) - e(5/8) - e(7/8) = i\sqrt{8}$. Thus we have the stated result when $d$ is a power of 2.

Next suppose that $d = d_1 d_2$ where $d_1$ and $d_2$ are quadratic discriminants and $(d_1, d_2) = 1$. Then by Theorem 9.6, $\tau(\chi_d) = \tau(\chi_{d_1})\tau(\chi_{d_2})\chi_{d_1}(|d_2|)\chi_{d_2}(|d_1|)$. By considering the possible combinations of signs of $d_1$ and of $d_2$ we find that $\chi_{d_1}(|d_2|)\chi_{d_2}(|d_1|) = \chi_{d_1}(d_2)\chi_{d_2}(d_1)$ in all cases. This product is $\varepsilon(d_1, d_2)$ in the notation of Theorem 9.14. That is,
\[
\tau(\chi_d) = \varepsilon(d_1, d_2)\tau(\chi_{d_1})\tau(\chi_{d_2}).
\]
Thus if $\tau(\chi_{d_1})$ and $\tau(\chi_{d_2})$ have the asserted values, then so also does $\tau(\chi_d)$. Since every primitive quadratic character can be constructed this way, the proof is complete.
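The sign determination in Theorem 9.17 can be observed numerically in the Legendre-symbol case. The following sketch computes $\tau_p = \sum_{n=1}^{p} \left(\frac{n}{p}\right) e(n/p)$ for small odd primes and compares it with $\sqrt{p}$ or $i\sqrt{p}$; the list of primes and the helper names are choices made for this illustration.

```python
import cmath
import math

def e(x):
    return cmath.exp(2j * math.pi * x)

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def tau_p(p):
    return sum(legendre(n, p) * e(n / p) for n in range(1, p + 1))

def expected(p):
    return math.sqrt(p) if p % 4 == 1 else 1j * math.sqrt(p)

for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59):
    print(p, abs(tau_p(p) - expected(p)) < 1e-9)
```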

9.3.1 Exercises
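1. (a) Show that if $p > 2$ and $p \nmid b$, then
\[
\sum_{n=1}^{p} \left(\frac{n}{p}\right) \left(\frac{n + b}{p}\right) = -1.
\]
(b) Suppose that $p > 2$ and that $p \nmid d$. Explain why
\[
\sum_{x=1}^{p} \left(\frac{x^2 - d}{p}\right) = \sum_{n=1}^{p} \Bigl(1 + \left(\frac{n}{p}\right)\Bigr) \left(\frac{n - d}{p}\right),
\]
and deduce that this sum is $-1$.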

(c) Put $d = b^2 - 4ac$, and suppose that $p > 2$, $p \nmid d$. Show that
\[
\sum_{x=1}^{p} \left(\frac{ax^2 + bx + c}{p}\right) = -\left(\frac{a}{p}\right).
\]
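2. Let $p$ be a prime, $p \equiv 1 \pmod 4$, and let $\mathcal{N}$ be a set of $Z$ residue classes modulo $p$.
(a) Explain why
\[
\sum_{m \in \mathcal{N}} \sum_{n \in \mathcal{N}} \left(\frac{m - n}{p}\right) = \frac{1}{\sqrt{p}} \sum_{a=1}^{p} \left(\frac{a}{p}\right) \Bigl| \sum_{n \in \mathcal{N}} e(an/p) \Bigr|^2.
\]
(b) Suppose that $\left(\frac{m - n}{p}\right) = 1$ whenever $m \in \mathcal{N}$, $n \in \mathcal{N}$, and $m \ne n$. Show that $Z \le \sqrt{p}$.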
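3. Put $f_{\mathbf{a}}(r) = r^2 + a_1 r + a_0$ where $\mathbf{a} = (a_0, a_1)$. Show that if $r_1, r_2, r_3$ are distinct modulo $p$, then
\[
\sum_{a_0=1}^{p} \sum_{a_1=1}^{p} \left(\frac{f_{\mathbf{a}}(r_1)}{p}\right) \left(\frac{f_{\mathbf{a}}(r_2)}{p}\right) \left(\frac{f_{\mathbf{a}}(r_3)}{p}\right) = p.
\]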

4. We used Corollary 9.16 to determine the sign of $\tau(\chi_{\pm p})$, and then used quadratic reciprocity to determine the sign of $\tau(\chi_d)$ for the general quadratic discriminant $d$. We now show that quadratic reciprocity for the Legendre symbol can be derived from Theorem 9.15 (mainly Corollary 9.16). Let $G(a, q) = \sum_{n=1}^{q} e(an^2/q)$.
(a) Suppose that $p$ is an odd prime. Explain why
\[
G(a, p) = \left(\frac{a}{p}\right)_L \sum_{n=1}^{p} \left(\frac{n}{p}\right) e(n/p)
\]
when $(a, p) = 1$.
(b) Suppose that $(q_1, q_2) = 1$. By writing $n$ modulo $q_1 q_2$ in the form $n = n_1 q_2 + n_2 q_1$, show that $G(a, q_1 q_2) = G(aq_2, q_1)G(aq_1, q_2)$.
(c) Let $p$ and $q$ denote odd primes. Show that
\[
G(1, pq) = \left(\frac{p}{q}\right)_L \left(\frac{q}{p}\right)_L G(1, p)G(1, q),
\]
and use Corollary 9.16 to show that
\[
\left(\frac{p}{q}\right)_L \left(\frac{q}{p}\right)_L = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.
\]
(d) By taking $a = -1$ in (a), and using Corollary 9.16, show that $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$.
(e) By taking $a = 4$ in Theorem 9.15, show that $\left(\frac{2}{p}\right)_L = (-1)^{(p^2-1)/8}$.

(f) Suppose that $p$ is an odd prime, and $k$ is an integer, $k \ge 2$. Show that
\[
G(a, p^k) = p\,G(a, p^{k-2}).
\]
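5. Let $L_1$ denote the contour $z = u$, $-\infty < u < \infty$ in the complex plane, let $L_2$ denote the contour $z = (1 + i)u$, $-\infty < u < \infty$, and let $I(c) = \int_{-\infty}^{\infty} e(cu^2)\, du$, as in the proof of Theorem 9.15.
(a) Note that $I(c) = \int_{L_1} e^{2\pi i c z^2}\, dz$.
(b) Explain why $\int_{L_1} e^{2\pi i c z^2}\, dz = \int_{L_2} e^{2\pi i c z^2}\, dz$.
(c) Show that
\[
\int_{L_2} e^{2\pi i c z^2}\, dz = (1 + i) \int_{-\infty}^{\infty} e^{-4\pi c u^2}\, du = \frac{1 + i}{2\sqrt{\pi c}} \int_{-\infty}^{\infty} e^{-v^2}\, dv = \frac{1 + i}{2\sqrt{c}}.
\]
(d) Thus give a proof, independent of that found in the proof of Theorem 9.15, that
\[
\int_{-\infty}^{\infty} e(cu^2)\, du = \frac{1}{(1 - i)\sqrt{c}}.
\]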
6. Quadratic reciprocity à la Conway (1997, pp. 127–133). If $(a, n) = 1$ and $n$ is an odd positive integer, then we define the Zolotarev symbol (not a standard term) $\left(\frac{a}{n}\right)_Z$ to be 1 if the map $x \mapsto ax$ is an even permutation of a complete residue system modulo $n$, and $\left(\frac{a}{n}\right)_Z = -1$ if it is odd.
(a) Compute the decomposition of the permutation $x \mapsto 7x \pmod{15}$ into disjoint cycles, and thus show that $\left(\frac{7}{15}\right)_Z = -1$.
(b) Suppose that $p$ is an odd prime and that $a$ has order $h$ modulo $p$. Show that the map $x \mapsto ax \pmod p$ consists of one 1-cycle (0) and $(p - 1)/h$ $h$-cycles. Deduce that $\left(\frac{a}{p}\right)_Z = (-1)^{(p-1)/h}$.
(c) Continue in the same notation, and show that $(p - 1)/h$ is even if and only if $a^{(p-1)/2} \equiv 1 \pmod p$. Deduce that $\left(\frac{a}{p}\right)_Z = \left(\frac{a}{p}\right)_L$.
(d) If $n$ is odd and positive, then the permutation $x \mapsto -x \pmod n$ consists of one 1-cycle and $(n - 1)/2$ 2-cycles of the form $(x\ \ {-x})$. Hence deduce that $\left(\frac{-1}{n}\right)_Z = (-1)^{(n-1)/2}$.
(e) If $(ab, n) = 1$, then the map $x \mapsto abx \pmod n$ is the composition of the map $x \mapsto ax \pmod n$ and the map $x \mapsto bx \pmod n$. Deduce that $\left(\frac{ab}{n}\right)_Z = \left(\frac{a}{n}\right)_Z \left(\frac{b}{n}\right)_Z$.
(f) Let $p$ be a prime, $p > 2$, and let $g$ be a primitive root of $p$. By (b) with $h = p - 1$, deduce that $\left(\frac{g}{p}\right)_Z = -1$. Then by (e) deduce that $\left(\frac{g^k}{p}\right)_Z = (-1)^k$, and hence give a second proof of (c).
(g) Suppose that $n$ is odd and positive, and that $(a, n) = 1$. Let
\[
P = \{1, 2, \ldots, (n - 1)/2\}, \qquad N = \{-1, -2, \ldots, -(n - 1)/2\}.
\]
Let $K$ be the number of $k \in P$ such that $ak \in N \pmod n$. Put $\varepsilon_k = 1$ if $k$ and $ak$ lie in the same subset, otherwise put $\varepsilon_k = -1$. Note that $\varepsilon_k = \varepsilon_{-k}$. Let $\pi^+$ be the permutation that leaves $N$ fixed and maps $P$ to itself by the formula $k \mapsto \varepsilon_k ak \pmod n$. Let $\pi^-$ be the map that leaves $P$ fixed and maps $N$ to itself by the formula $k \mapsto \varepsilon_k ak \pmod n$. Finally let $\pi^*$ be the product of those transpositions $(ak\ \ {-ak})$ for which $k \in P$ and $ak \in N$. Show that the map $x \mapsto ax \pmod n$ is the permutation $\pi^* \pi^+ \pi^-$. Let $\sigma$ be the 'sign change permutation' $x \mapsto -x \pmod n$. Show that $\pi^- = \sigma \pi^+ \sigma$. That is, $\pi^+$ and $\pi^-$ are conjugate permutations. They are the same apart from the fact that they operate on different sets. Thus they have the same cycle structure, and hence the same parity. Deduce that $\left(\frac{a}{n}\right)_Z = (-1)^K$.
(h) Suppose that $n$ is odd and positive, that $(a, n) = 1$, and that $a > 0$. Show that $\left(\frac{a}{n}\right)_Z = (-1)^K$ where $K$ is the number of integers lying in the intervals $\bigl((r - \tfrac12)\tfrac{n}{a},\ r\tfrac{n}{a}\bigr)$ for $r = 1, 2, \ldots, [a/2]$.
(i) Show that if $a > 0$, $(2a, n) = 1$, $m \equiv n \pmod{4a}$, then $\left(\frac{a}{m}\right)_Z = \left(\frac{a}{n}\right)_Z$.
(j) Show that if $n$ is odd and positive, then $\left(\frac{2}{n}\right)_Z = (-1)^{(n^2-1)/8}$.
(k) Suppose that $m$ and $n$ are odd and positive, and that $m \equiv -n \pmod 4$, say $m + n = 4a$. Justify the following manipulations:
\[
\left(\frac{m}{n}\right)_Z = \left(\frac{4a}{n}\right)_Z = \left(\frac{a}{n}\right)_Z = \left(\frac{a}{m}\right)_Z = \left(\frac{4a}{m}\right)_Z = \left(\frac{n}{m}\right)_Z.
\]
(l) Suppose that $m$ and $n$ are odd and positive, and that $m \equiv n \pmod 4$, say $m > n$ and $m - n = 4a$. Justify the following manipulations:
\[
\left(\frac{m}{n}\right)_Z = \left(\frac{4a}{n}\right)_Z = \left(\frac{a}{n}\right)_Z = \left(\frac{a}{m}\right)_Z = \left(\frac{4a}{m}\right)_Z = \left(\frac{-n}{m}\right)_Z = \left(\frac{n}{m}\right)_Z (-1)^{(m-1)/2}.
\]
(m) Suppose that $a$ is odd and positive and that $(2a, mn) = 1$. Show that
\[
\left(\frac{a}{mn}\right)_Z = \left(\frac{mn}{a}\right)_Z (-1)^{\frac{a-1}{2} \cdot \frac{mn-1}{2}} = \left(\frac{m}{a}\right)_Z \left(\frac{n}{a}\right)_Z (-1)^{\frac{a-1}{2} \cdot \frac{mn-1}{2}}
\]
\[
= \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z (-1)^{\frac{a-1}{2} \cdot \frac{mn-1}{2} + \frac{a-1}{2} \cdot \frac{m-1}{2} + \frac{a-1}{2} \cdot \frac{n-1}{2}}.
\]
Show that this last exponent is even, so that $\left(\frac{a}{mn}\right)_Z = \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z$ in this case.
(n) Suppose that $a$ is odd and negative and that $(a, mn) = 1$. Use (m) to show that the identity $\left(\frac{a}{mn}\right)_Z = \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z$ holds in this case also. Thus this holds for all odd $a$.
(o) Suppose that $a$ is even and that $(a, mn) = 1$. Justify the following manipulations:
\[
\left(\frac{a}{mn}\right)_Z = \left(\frac{-a}{mn}\right)_Z (-1)^{\frac{mn-1}{2}} = \left(\frac{mn - a}{mn}\right)_Z (-1)^{\frac{mn-1}{2}} = \left(\frac{mn - a}{m}\right)_Z \left(\frac{mn - a}{n}\right)_Z (-1)^{\frac{mn-1}{2}}
\]
\[
= \left(\frac{-a}{m}\right)_Z \left(\frac{-a}{n}\right)_Z (-1)^{\frac{mn-1}{2}} = \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z (-1)^{\frac{mn-1}{2} + \frac{m-1}{2} + \frac{n-1}{2}}.
\]
Show that this last exponent is even, and thus deduce that
\[
\left(\frac{a}{mn}\right)_Z = \left(\frac{a}{m}\right)_Z \left(\frac{a}{n}\right)_Z
\]
holds in all cases.
(p) Suppose that $(a, m) = 1$ and that $m$ is odd, composite, and square-free. Show that the permutation $x \mapsto ax \pmod m$ of reduced residues modulo $m$ is always even. (Hence it is essential that we used complete residue systems in the above.)
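7. Let $p$ be a prime number, $p > 2$.
(a) Show that
\[
\prod_{k=1}^{p-1} (1 - e(k/p))^{\left(\frac{k}{p}\right)} = \exp\bigl(-\tau(\chi_p)L(1, \chi_p)\bigr)
\]
where $\chi_p(k) = \left(\frac{k}{p}\right)$.
Let $R = \bigl\{r : 0 < r < p,\ \left(\frac{r}{p}\right) = 1\bigr\}$, $N = \bigl\{n : 0 < n < p,\ \left(\frac{n}{p}\right) = -1\bigr\}$, and set
\[
Q = \frac{\prod_{n \in N} \sin \pi n/p}{\prod_{r \in R} \sin \pi r/p}.
\]
(b) Show that if $p \equiv 3 \pmod 4$, then $Q = 1$.
(c) Show that if $p \equiv 1 \pmod 4$, then $Q = \exp\bigl(\sqrt{p}\, L(1, \chi_p)\bigr)$.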
8. (Chowla & Mordell 1961) Continue with the notation of the preceding problem, let $c$ be chosen, $0 < c < p$, so that $\left(\frac{c}{p}\right) = -1$, and put
\[
f(z) = \prod_{r \in R} \frac{1 - z^{cr}}{1 - z^r} - 1.
\]
(a) Show that if $L(1, \chi_p) = 0$, then $f(e(1/p)) = 0$.
(b) Explain why $f$ is a polynomial with integral coefficients.
(c) Show that if $L(1, \chi_p) = 0$, then there exists a polynomial $g \in \mathbb{Z}[z]$ such that $f(z) = g(z)(1 + z + \cdots + z^{p-1})$.
(d) By taking $z = 1$ in the above, show that it would follow that $c^{(p-1)/2} \equiv 1 \pmod p$.
(e) Explain why $c^{(p-1)/2} \equiv -1 \pmod p$; deduce that $L(1, \chi_p) \ne 0$.

9.4 Incomplete character sums

Let $\chi$ be a character modulo $q$. We call the sum $\sum_{n=M+1}^{M+N} \chi(n)$ incomplete if $N < q$. Such a sum trivially has absolute value at most $N$. We now use our knowledge of Gauss sums to show that if $\chi$ is non-principal, then this sum is $o(N)$ provided that $N$ is not too small compared with $q$. Suppose first that $\chi$ is a primitive character modulo $q$ with $q > 1$. Then by Corollary 9.8,
\[
\sum_{n=M+1}^{M+N} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a) \sum_{n=M+1}^{M+N} e(an/q).
\]
Here the inner sum is a geometric series. We note that
\[
\sum_{n=M+1}^{M+N} e(n\alpha) = \frac{e((M + N + 1)\alpha) - e((M + 1)\alpha)}{e(\alpha) - 1}
= e((2M + N + 1)\alpha/2)\, \frac{\sin \pi N\alpha}{\sin \pi\alpha} \tag{9.14}
\]
if $\alpha$ is not an integer. (If $\alpha \in \mathbb{Z}$, then the sum is $N$.) On combining this with the above, and noting that the term $a = q$ vanishes because $\overline{\chi}(q) = 0$, we see that
\[
\sum_{n=M+1}^{M+N} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q-1} \overline{\chi}(a)\, e\Bigl(\frac{a(2M + N + 1)}{2q}\Bigr) \frac{\sin \pi aN/q}{\sin \pi a/q}. \tag{9.15}
\]
By Theorem 9.7 and the triangle inequality the right-hand side has absolute value
\[
< \frac{1}{\sqrt{q}} \sum_{\substack{a=1 \\ (a,q)=1}}^{q-1} \frac{1}{\sin \pi a/q}.
\]
Here the second half of the range of summation contributes the same amount as the first. Hence it suffices to multiply by 2 and sum over $1 \le a \le q/2$. However, if $q$ is odd, then $q/2$ is not an integer and hence the sum is actually over the range $1 \le a \le (q - 1)/2$, while if $q$ is even, then $4 \mid q$ (since if $q \equiv 2 \pmod 4$, then there is no primitive character modulo $q$), and hence $(q/2, q) > 1$, and so it suffices to sum over $1 \le a \le q/2 - 1$ in this case. Hence in either case the expression above is
\[
\le \frac{2}{\sqrt{q}} \sum_{a=1}^{(q-1)/2} \frac{1}{\sin \pi a/q}.
\]
The function $f(\alpha) = \sin \pi\alpha$ is concave downward in the interval $[0, 1/2]$, and hence it lies above the chord through the points $(0, 0)$, $(1/2, 1)$. That is, $\sin \pi\alpha \ge 2\alpha$ for $0 \le \alpha \le 1/2$. Thus the above is
\[
\le \sqrt{q} \sum_{a=1}^{(q-1)/2} \frac{1}{a} < \sqrt{q} \sum_{a=1}^{(q-1)/2} \log \frac{1 + \frac{1}{2a}}{1 - \frac{1}{2a}} = \sqrt{q} \sum_{a=1}^{(q-1)/2} \log \frac{2a + 1}{2a - 1} = \sqrt{q} \log q.
\]
That is,
\[
\Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr| < \sqrt{q} \log q \tag{9.16}
\]
when $\chi$ is primitive. We now extend this to imprimitive non-principal characters.

Suppose that $\chi$ is induced by $\chi^\star$ modulo $d$. Let $r$ be the product of those primes that divide $q$ but not $d$. Then
\[
\sum_{n=M+1}^{M+N} \chi(n) = \sum_{\substack{n=M+1 \\ (n,r)=1}}^{M+N} \chi^\star(n)
= \sum_{n=M+1}^{M+N} \chi^\star(n) \sum_{k \mid (n,r)} \mu(k)
\]
\[
= \sum_{k \mid r} \mu(k) \sum_{\substack{M < n \le M+N \\ k \mid n}} \chi^\star(n)
= \sum_{k \mid r} \mu(k)\chi^\star(k) \sum_{M/k < m \le (M+N)/k} \chi^\star(m).
\]
By the case already treated, we know that the inner sum above has absolute value not exceeding $d^{1/2} \log d$, and hence the given sum has absolute value not more than $2^{\omega(r)} d^{1/2} \log d$. But $2^{\omega(r)} \le d(r) \ll r^{1/2} \le (q/d)^{1/2}$, so we have proved
Theorem 9.18 (The Pólya–Vinogradov inequality) Let $\chi$ be a non-principal character modulo $q$. Then for any integers $M$ and $N$ with $N > 0$,
\[
\sum_{n=M+1}^{M+N} \chi(n) \ll \sqrt{q} \log q.
\]

In (9.16) we saw that the implicit constant can be taken to be 1 when $\chi$ is primitive. With a little more care it can be seen that the implicit constant can be taken to be 1 for all non-principal characters. The above estimate is important in many contexts, but we confine ourselves to two applications at this point.
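To see the bound (9.16) in action, one can compute the largest incomplete sum of the Legendre symbol for a few primes. The sketch below is an illustration, not part of the text: since the sum of $\chi$ over a full period vanishes, every block sum is a difference of two values of the periodic prefix sum, so the maximum of $\bigl|\sum_{n=M+1}^{M+N} \chi(n)\bigr|$ equals the range of the prefix sums over one period. The function names are hypothetical.

```python
import math

def legendre(n, p):
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def max_block_sum(p):
    """Largest |sum_{n=M+1}^{M+N} (n/p)| over all integers M and all N >= 1."""
    prefix, lo, hi = 0, 0, 0
    for n in range(1, p):
        prefix += legendre(n, p)
        lo, hi = min(lo, prefix), max(hi, prefix)
    return hi - lo

for p in (3, 5, 7, 101, 1009, 10007):
    print(p, max_block_sum(p), math.sqrt(p) * math.log(p))
```

In these small examples the maximum is well below $\sqrt{p}\log p$; the sketch says nothing about how sharp the inequality is in general.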

Corollary 9.19 Let χ be a non-principal character modulo p, and let n χ be

1
√ +ε
the least positive integer n such that χ (n) = 1. Then n χ ε p2 e .

Proof Suppose that χ(n) = 1 for all n ≤ y. Then χ(n) = 1 whenever n is
composed entirely of primes q ≤ y. Hence, in the notation of Section 7.1, if
$y \le x < y^2$, then
\[
\sum_{n \le x} \chi(n) = \psi(x, y) + \sum_{y < q \le x} \chi(q) [x/q]
\]
where q denotes a prime. Thus
\[
\Bigl| \sum_{n \le x} \chi(n) \Bigr| \ge \psi(x, y) - \sum_{y < q \le x} [x/q]
= [x] - 2 \sum_{y < q \le x} [x/q]
= x \Bigl( 1 - 2 \log \frac{\log x}{\log y} \Bigr) + O\Bigl( \frac{x}{\log x} \Bigr).
\]
If $x = p^{1/2} (\log p)^2$, then the sum on the left is o(x), while
if $y > x^{1/\sqrt{e}+\varepsilon}$, then
the lower bound on the right is $\gg_\varepsilon x$. Thus $n_\chi \ll_\varepsilon x^{1/\sqrt{e}+\varepsilon}$.
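To get a feel for the size of $n_\chi$, here is a small Python sketch that computes the least quadratic non-residue for a few primes and prints it next to $p^{1/(2\sqrt{e})}$. It treats only the quadratic character, which is one special case of the corollary, and the function name `least_nonresidue` is hypothetical. Since the implied constant in the corollary is not specified, the printed comparison is purely illustrative.

```python
import math


def least_nonresidue(p: int) -> int:
    """Least n >= 1 with (n/p) != 1, for an odd prime p (Euler's criterion)."""
    n = 2
    while pow(n, (p - 1) // 2, p) == 1:
        n += 1
    return n


if __name__ == "__main__":
    exponent = 1 / (2 * math.sqrt(math.e))
    for p in (7, 23, 71, 10007, 1000003):
        print(p, least_nonresidue(p), round(p ** exponent, 2))
```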

Corollary 9.20 The number of primitive roots modulo p in the interval [M +
1, M + N] is
\[
\frac{\varphi(p-1)}{p} N + O\bigl( p^{1/2+\varepsilon} \bigr).
\]
Since the number of primitive roots in an interval of length p is exactly ϕ(p −
1), the above asserts that primitive roots are roughly uniformly distributed into
subintervals of length N provided that $N > p^{1/2+\varepsilon}$.
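The main term of Corollary 9.20 can be compared with a direct count. The sketch below is a brute-force illustration for small primes; the helper names `prime_factors`, `count_primitive_roots` and `main_term` are assumptions made for this example.

```python
def prime_factors(n: int) -> list[int]:
    """Distinct prime factors of n, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def count_primitive_roots(p: int, M: int, N: int) -> int:
    """Number of primitive roots modulo the prime p in [M + 1, M + N]."""
    qs = prime_factors(p - 1)
    count = 0
    for n in range(M + 1, M + N + 1):
        # n is a primitive root iff n^((p-1)/q) != 1 for every prime q | p - 1
        if n % p != 0 and all(pow(n, (p - 1) // q, p) != 1 for q in qs):
            count += 1
    return count


def main_term(p: int, N: int) -> float:
    """The main term phi(p - 1) N / p of Corollary 9.20."""
    phi = p - 1
    for q in prime_factors(p - 1):
        phi -= phi // q
    return phi * N / p


if __name__ == "__main__":
    p = 10007
    for M, N in ((0, 2000), (3000, 2000), (500, 5000)):
        print(M, N, count_primitive_roots(p, M, N), round(main_term(p, N), 1))
```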

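Proof Let $q_1, q_2, \ldots, q_r$ be the distinct prime factors of p − 1, and put
$q = \prod_{i=1}^{r} q_i$. Then n is a primitive root modulo p if and only if (ind n, q) = 1. For
1 ≤ i ≤ r put
\[
\chi_i(n) = e\Bigl( \frac{\operatorname{ind} n}{q_i} \Bigr).
\]
Then
\[
\frac{1}{q_i} \sum_{a=1}^{q_i} \chi_i(n)^a =
\begin{cases}
1 & \text{if } q_i \mid \operatorname{ind} n, \\
0 & \text{otherwise.}
\end{cases}
\]
Thus
\[
\prod_{i=1}^{r} \Bigl( \chi_0(n) - \frac{1}{q_i} \sum_{a_i=1}^{q_i} \chi_i(n)^{a_i} \Bigr) =
\begin{cases}
1 & \text{if } n \text{ is a primitive root (mod } p), \\
0 & \text{otherwise.}
\end{cases}
\]
The left-hand side above is
\[
\prod_{i=1}^{r} \Bigl( (1 - 1/q_i) \chi_0(n) - \frac{1}{q_i} \sum_{a_i=1}^{q_i-1} \chi_i(n)^{a_i} \Bigr)
= \sum_{d \mid q} \frac{\varphi(q/d)}{q/d} \, \frac{\mu(d)}{d} \sum_{\substack{\chi \\ \operatorname{ord} \chi = d}} \chi(n).
\]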

Thus the number of primitive roots in the interval [M + 1, M + N] is
\[
\frac{1}{q} \sum_{d \mid q} \varphi(q/d) \mu(d) \sum_{\substack{\chi \\ \operatorname{ord} \chi = d}} \sum_{n=M+1}^{M+N} \chi(n). \tag{9.17}
\]
The only character of order d = 1 is the principal character $\chi_0$, which gives us
the main term
\[
\frac{\varphi(q)}{q} \bigl( (1 - 1/p) N + O(1) \bigr) = \frac{\varphi(p-1)}{p} N + O(1).
\]
A character of order d > 1 is non-principal, and for such characters the innermost
sum in (9.17) is $\ll p^{1/2} \log p$. Since there are ϕ(d) such characters, the
contribution in (9.17) of d > 1 is
\[
\ll \frac{\varphi(q)}{q} p^{1/2} \log p \sum_{d \mid (p-1)} |\mu(d)| \ll 2^{\omega(p-1)} p^{1/2} \log p \ll p^{1/2+\varepsilon}.
\]

This gives the stated result.

Suppose that χ is a non-principal character modulo q. Further insights
into the Pólya–Vinogradov inequality may be gained by considering the sum
$f_\chi(\alpha) = \sum_{0 < n \le q\alpha} \chi(n)$ as a function of the real variable α, for 0 ≤ α ≤ 1. We
extend the domain of $f_\chi(\alpha)$ by periodicity, and compute its Fourier coefficients:
\[
\widehat{f_\chi}(k) = \int_0^1 f_\chi(\alpha) e(-k\alpha) \, d\alpha = \sum_{n=1}^{q} \chi(n) \int_{n/q}^{1} e(-k\alpha) \, d\alpha.
\]
The nature of this integral depends on whether k = 0 or not. In the former case
we find that
\[
\widehat{f_\chi}(0) = \sum_{n=1}^{q} \chi(n) \Bigl( 1 - \frac{n}{q} \Bigr) = \frac{-1}{q} \sum_{n=1}^{q} n \chi(n),
\]
while for k ≠ 0 we have
\[
\widehat{f_\chi}(k) = \sum_{n=1}^{q} \chi(n) \frac{1 - e(-kn/q)}{-2\pi i k}
= \frac{1}{2\pi i k} \sum_{n=1}^{q} \chi(n) e(-kn/q) = \frac{c_\chi(-k)}{2\pi i k}.
\]
It is convenient to restrict to primitive characters, since then $c_\chi(-k) =
\overline{\chi}(-k) \tau(\chi)$ by Theorem 9.5. Since $f_\chi(\alpha)$ is a function of bounded variation
it follows that
\[
f_\chi(\alpha) = \frac{-1}{q} \sum_{n=1}^{q} n \chi(n) + \frac{\tau(\chi)}{2\pi i} \sum_{k \ne 0} \frac{\overline{\chi}(-k)}{k} e(k\alpha) \tag{9.18}
\]

at points of continuity of $f_\chi$, with the understanding that the sum is calculated
as the limit of the symmetric partial sums $\sum_{-K}^{K}$. If χ(−1) = 1, then $f_\chi(\alpha)$ is
an odd function and the contributions of k and of −k can be combined to form
a sine series. If χ(−1) = −1, then f χ (α) is an even function, and the two terms
merge to form a cosine series. In this case it is interesting to note that if we take
α = 0 then we obtain another proof of (9.9). Among other possible values of
α that might be considered, the possibility α = 1/2 is particularly striking. If
χ (−1) = 1 then f χ (1/2) = 0 by symmetry, so in continuing we suppose that
χ (−1) = −1. Note that if q is odd then 1/2 is not of the form n/q, and hence
f χ (α) is continuous at 1/2. On the other hand, there is no primitive character
modulo 2 and hence if q is even then 4 | q. In this case we can solve the equation
n/q = 1/2 by taking n = q/2, but then q/2 is even, so that (q/2, q) > 1, and
hence χ (q/2) = 0. Hence f χ (α) is continuous at 1/2 in all cases, and we deduce
that
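\[
\sum_{0 < n \le q/2} \chi(n) = \frac{-1}{q} \sum_{n=1}^{q} n \chi(n) - \frac{\tau(\chi)}{\pi i} \sum_{k=1}^{\infty} (-1)^k \frac{\overline{\chi}(k)}{k}.
\]
As we already discovered by taking α = 0, the first term on the right is
$\tau(\chi) L(1, \overline{\chi})/(\pi i)$. But
\[
\sum_{k=1}^{\infty} \frac{\chi(k) (-1)^k}{k^s} = \bigl( 2^{1-s} \chi(2) - 1 \bigr) L(s, \chi)
\]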
for any character χ and any s with positive real part, so we have proved
Theorem 9.21 Let χ be a primitive character modulo q such that χ(−1)
= −1. Then
\[
\sum_{1 \le n \le q/2} \chi(n) = \bigl( 2 - \overline{\chi}(2) \bigr) \frac{\tau(\chi)}{\pi i} L(1, \overline{\chi}).
\]

In the special case that χ is a quadratic character we know the exact value
of the Gauss sum, and hence we can say more.
Corollary 9.22 If d is a quadratic discriminant with d < 0, then
\[
\sum_{1 \le n \le |d|/2} \Bigl( \frac{d}{n} \Bigr) > 0.
\]
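Theorem 9.21 and Corollary 9.22 can be checked numerically in a simple special case. The sketch below assumes that p is a prime with p ≡ 3 (mod 4), so that the Legendre symbol modulo p is a real primitive character with χ(−1) = −1 and coincides with the Kronecker symbol of the discriminant d = −p. The value L(1, χ) is approximated by truncating the series after a whole number of periods, so the agreement is only to a few decimal places. All function names are hypothetical.

```python
import cmath
import math


def legendre(n: int, p: int) -> int:
    """Legendre symbol (n/p) for an odd prime p, by Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def half_sum(p: int) -> int:
    """Left-hand side: sum of chi(n) over 1 <= n <= p/2."""
    return sum(legendre(n, p) for n in range(1, p // 2 + 1))


def theorem_9_21_rhs(p: int, periods: int = 20000) -> complex:
    """(2 - chi(2)) tau(chi) L(1, chi) / (pi i) for the real character chi = (./p)."""
    chi = [legendre(n, p) for n in range(p)]
    tau = sum(chi[a] * cmath.exp(2j * math.pi * a / p) for a in range(1, p))
    # Truncate L(1, chi) after `periods` full periods; the tail is O(1/periods).
    l_one = sum(chi[n % p] / n for n in range(1, periods * p + 1))
    return (2 - chi[2]) * tau * l_one / (math.pi * 1j)


if __name__ == "__main__":
    for p in (7, 11, 19, 23):
        print(p, half_sum(p), theorem_9_21_rhs(p))
```

Because χ is real here, the conjugations in the theorem have no effect; a complex character would need them carried through explicitly.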
On taking α = (M + N)/q and then α = M/q, and differencing, we see
that
\[
\sum_{n=M+1}^{M+N} \chi(n) = \frac{\tau(\chi)}{2\pi i} \sum_{k \ne 0} \frac{\overline{\chi}(-k)}{k} e(kM/q) \bigl( e(kN/q) - 1 \bigr) + O(1).
\]
Since e(kN/q) − 1 ∼ 2πikN/q when |k| is small compared with q/N, for
rough heuristics we think of the above as being approximately
\[
\frac{\tau(\chi) N}{q} \sum_{0 < |k| \le q/N} \overline{\chi}(-k) e(kM/q).
\]
Here a sum over an interval of length N reflects, approximately, to form a sum
over an interval of length q/N. Further examples of this sort of phenomenon
will emerge when we consider approximate functional equations of ζ(s) and of
L(s, χ).
The Fourier expansion (9.18) is also useful in deriving quantitative estimates.
We know not only that $\operatorname{Var}_{[0,1]} f_\chi = \varphi(q)$, but (by Theorems 2.10 and 3.1) also
that this variation is reasonably well distributed in subintervals, in the sense
that $\operatorname{Var}_{[\alpha,\beta]} f_\chi \ll \varphi(q)(\beta - \alpha)$ when $\beta - \alpha > q^{-1+\varepsilon}$. We apply Theorem D.2
to $f_\chi(\alpha)$, and divide the range of integration (0, 1) into K intervals of length
1/K, throughout each of which the integrand has a constant order of magnitude.
Thus we see that
\[
f_\chi(\alpha) = \frac{-1}{q} \sum_{n=1}^{q} n \chi(n) + \frac{\tau(\chi)}{2\pi i} \sum_{0 < |k| \le K} \frac{\overline{\chi}(-k)}{k} e(k\alpha) + O\Bigl( \frac{\varphi(q)}{K} \log 2K \Bigr) \tag{9.19}
\]
for $K \le q^{1-\varepsilon}$. This can be used to obtain sharper constants in the Pólya–
Vinogradov inequality; see Exercise 9.4.9.
We can also show that the estimate provided by the Pólya–Vinogradov in-
equality is in general not far from the truth.

Theorem 9.23 Suppose that χ is a non-principal character modulo q. Then
\[
\max_{M,N} \Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr| \ge \frac{|\tau(\chi)|}{\pi}.
\]

Proof Clearly
\[
\Bigl| \sum_{M=1}^{q} e(M/q) \sum_{n=M+1}^{M+N} \chi(n) \Bigr|
\le \sum_{M=1}^{q} \Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr|
\le q \max_{M} \Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr|.
\]
Here the sum on the left is
\[
\sum_{n=1}^{N} \sum_{M=1}^{q} e(M/q) \chi(M + n) = \sum_{n=1}^{N} e(-n/q) \sum_{M=1}^{q} \chi(M) e(M/q).
\]
By (9.14) this is
\[
e\Bigl( \frac{-(N+1)}{2q} \Bigr) \frac{\sin \pi N/q}{\sin \pi/q} \, \tau(\chi).
\]
If q is even, then we may take N = q/2, and then the quotient of sines is
= 1/(sin π/q) ≥ q/π, while if q is odd, then we may take N = (q − 1)/2, in
which case the quotient of sines is
\[
\frac{\cos \frac{\pi}{2q}}{\sin \frac{\pi}{q}} = \frac{1}{2 \sin \frac{\pi}{2q}} \ge \frac{q}{\pi}.
\]

The stated lower bound now follows by combining these estimates.

If χ is primitive modulo q, then the lower bound of Theorem 9.23 is $\sqrt{q}/\pi$.
Further lower bounds of this nature can be derived by using Parseval’s identity
(4.4) for the ﬁnite Fourier transform; see Exercise 9.4.8. In addition to the lower
bound above, which applies to all characters, for a sparse subset of characters
we can obtain a better lower bound.
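The two-sided picture (Theorem 9.23 from below, (9.16) from above) can be illustrated for a complex character. The sketch below builds a character of order p − 1 modulo a small prime p from a primitive root. It is a brute-force illustration only, and the names `primitive_root`, `character`, `gauss_sum` and `max_incomplete_sum` are assumptions for this example.

```python
import cmath
import math


def primitive_root(p: int) -> int:
    """Smallest primitive root modulo a prime p (brute force, small p only)."""
    for g in range(2, p):
        x, order = g, 1
        while x != 1:
            x = x * g % p
            order += 1
        if order == p - 1:
            return g
    raise ValueError("no primitive root found")


def character(p: int, j: int = 1) -> list[complex]:
    """Values chi(0), ..., chi(p-1) of the character with chi(g^k) = e(jk/(p-1))."""
    g = primitive_root(p)
    chi = [0j] * p
    x = 1
    for k in range(p - 1):
        chi[x] = cmath.exp(2j * math.pi * j * k / (p - 1))
        x = x * g % p
    return chi


def gauss_sum(chi: list[complex]) -> complex:
    """tau(chi) = sum over a of chi(a) e(a/p)."""
    p = len(chi)
    return sum(chi[a] * cmath.exp(2j * math.pi * a / p) for a in range(p))


def max_incomplete_sum(chi: list[complex]) -> float:
    """max over M, N of |sum_{n=M+1}^{M+N} chi(n)|, via differences of prefix sums."""
    p = len(chi)
    prefix = [0j]
    for n in range(1, p + 1):
        prefix.append(prefix[-1] + chi[n % p])
    return max(abs(u - v) for u in prefix for v in prefix)


if __name__ == "__main__":
    p = 101
    chi = character(p)
    print(abs(gauss_sum(chi)) / math.pi, max_incomplete_sum(chi), math.sqrt(p) * math.log(p))
```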

Theorem 9.24 (Paley) There is a positive constant c such that
\[
\max_{M,N} \Bigl| \sum_{n=M+1}^{M+N} \Bigl( \frac{d}{n} \Bigr) \Bigr| > c \sqrt{d} \log \log d
\]
for infinitely many positive quadratic discriminants d.

Proof Let χ be a primitive character modulo q such that χ(−1) = 1. By taking
M = k − h − 1 and N = 2h + 1 in (9.15) we see that
\[
\sum_{n=k-h}^{k+h} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a) e(ak/q) \frac{\sin \pi a(2h+1)/q}{\sin \pi a/q}.
\]
Let h be the integer closest to q/3. Then the sine in the numerator is
approximately sin 2πa/3 when a is small. We shall choose χ so that $\chi(a) = \bigl( \frac{a}{3} \bigr)_L$ when
a is small. Thus these two factors are strongly correlated. We would take k = 0
except for the need to dampen the effects of the larger values of a. To this end
we sum over k, for −K ≤ k ≤ K and divide by 2K + 1. Thus by (9.14),
\[
\frac{1}{2K+1} \sum_{k=-K}^{K} \sum_{n=k-h}^{k+h} \chi(n)
= \frac{1}{\tau(\overline{\chi})} \sum_{a=1}^{q} \overline{\chi}(a) \frac{\sin \pi a(2h+1)/q}{\sin \pi a/q} \cdot \frac{\sin \pi (2K+1) a/q}{(2K+1) \sin \pi a/q}. \tag{9.20}
\]
Here the last factor is approximately 1 if $\|a/q\| \le 1/K$, and decreases as $\|a/q\|$
becomes larger. Thus, despite its complicated appearance, the expression above
is effectively
\[
\frac{2q}{\pi \tau(\overline{\chi})} \sum_{a=1}^{A} \frac{\overline{\chi}(a) \sin 2\pi a/3}{a}
\]
where A = q/K. To make this precise we observe that

sin π(2h + 1)a/q = sin 2πa/3 + O(a/q)

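and that
\[
\frac{\sin \pi (2K+1) a/q}{(2K+1) \sin \pi a/q} =
\begin{cases}
1 + O\bigl( K^2 \|a/q\|^2 \bigr) & (\|a/q\| \le 1/K), \\
O\bigl( K^{-1} \|a/q\|^{-1} \bigr) & (\|a/q\| > 1/K).
\end{cases}
\]
Thus the right-hand side of (9.20) is
\[
\begin{aligned}
&= \frac{2}{\tau(\overline{\chi})} \sum_{a=1}^{q/K} \overline{\chi}(a)
\Bigl( \frac{1}{\pi a/q} + O\Bigl( \frac{a}{q} \Bigr) \Bigr)
\Bigl( \sin 2\pi a/3 + O\Bigl( \frac{a}{q} \Bigr) \Bigr)
\Bigl( 1 + O\Bigl( \frac{K^2 a^2}{q^2} \Bigr) \Bigr)
+ O\Bigl( \frac{1}{\sqrt{q}} \sum_{q/K < a \le q/2} \frac{q^2}{K a^2} \Bigr) \\
&= \frac{2q}{\pi \tau(\overline{\chi})} \sum_{a=1}^{q/K} \frac{\overline{\chi}(a) \sin 2\pi a/3}{a} + O(\sqrt{q}).
\end{aligned} \tag{9.21}
\]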
Now let y be a large parameter, and suppose that
\[
q \equiv 5 \pmod{8}, \qquad \Bigl( \frac{q}{p} \Bigr)_L = \Bigl( \frac{p}{3} \Bigr)_L \quad (3 < p \le y). \tag{9.22}
\]
Thus by the Chinese Remainder Theorem, q is restricted to certain residue
classes modulo $Q = 8 \prod_{3 < p \le y} p$. Now let q be the least positive number that
satisfies these constraints. Then q is square-free, and hence q is a quadratic
discriminant, so we may take $\chi(n) = \bigl( \frac{q}{n} \bigr)_K$. Also, q < Q. By the Prime Number
Theorem in the form of (6.13) we see that log Q = (1 + o(1))y. Let K be the
least integer such that K > q/y. Then by (9.22), $\chi(a) = \bigl( \frac{a}{3} \bigr)_L$ for 1 ≤ a ≤ q/K,
(a, 3) = 1. Thus $\sum_{1 \le a \le u} \chi(a) \sin 2\pi a/3 = u/\sqrt{3} + O(1)$, so the main term in
(9.21) is
\[
\frac{2\sqrt{q}}{\pi \sqrt{3}} (\log y + O(1)) \ge \Bigl( \frac{2}{\pi \sqrt{3}} + o(1) \Bigr) \sqrt{q} \log \log q.
\]
This completes the proof.

In the two preceding theorems we have seen that the character sum can be
large when N is comparable to q. For shorter sums we would expect the sum
to be smaller, and indeed one would conjecture that if χ is a non-principal
character modulo q, then

\[
\sum_{n=M+1}^{M+N} \chi(n) \ll_\varepsilon N^{1/2} q^{\varepsilon} \tag{9.23}
\]

for any ε > 0. Although our present knowledge falls far short of this, we now
show that some improvement of the Pólya–Vinogradov inequality is possible, at
least in some situations. Our approach depends on the Riemann hypothesis for
curves over a ﬁnite ﬁeld, in the form of the following character sum estimate,
which we derive from the exposition of Schmidt (1976).

Lemma 9.25 (Weil) Suppose that d | (p − 1) with d > 1 and that χ is a character
modulo p of order d. Suppose further that $e_j \ge 1$ (1 ≤ j ≤ k), that $d \nmid e_j$
for some j with 1 ≤ j ≤ k and that the $c_1, c_2, \ldots, c_k$ are distinct modulo p.
Then
\[
\Bigl| \sum_{n=1}^{p} \chi\bigl( (n + c_1)^{e_1} (n + c_2)^{e_2} \cdots (n + c_k)^{e_k} \bigr) \Bigr| \le (k - 1) p^{1/2}.
\]

Proof Let $f(x) = (x + c_1)^{e_1} (x + c_2)^{e_2} \cdots (x + c_k)^{e_k}$. Then, by Lemma 4B of
Schmidt (1976), f(x) cannot satisfy $f(x) \equiv g(x)^d \pmod{p}$ identically where g
is a polynomial with integer coefficients. The lemma then follows from Theorem
2C ibidem.
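For small primes the bound of Lemma 9.25 can be verified exhaustively. The following Python sketch takes χ to be the Legendre symbol (so d = 2) and all exponents $e_j = 1$, which satisfies the hypothesis $d \nmid e_j$. These choices and the helper names are assumptions made for illustration, not part of the text.

```python
import math
from itertools import combinations


def legendre_table(p: int) -> list[int]:
    """chi(0), ..., chi(p-1) for the Legendre symbol modulo an odd prime p."""
    return [0] + [1 if pow(n, (p - 1) // 2, p) == 1 else -1 for n in range(1, p)]


def complete_sum(p: int, shifts: tuple[int, ...], chi: list[int]) -> int:
    """sum over n mod p of chi((n + c_1) ... (n + c_k)), using multiplicativity."""
    total = 0
    for n in range(p):
        value = 1
        for c in shifts:
            value *= chi[(n + c) % p]
        total += value
    return total


def worst_case(p: int, k: int) -> int:
    """Largest |complete sum| over all k-tuples of distinct shifts modulo p."""
    chi = legendre_table(p)
    return max(abs(complete_sum(p, cs, chi)) for cs in combinations(range(p), k))


if __name__ == "__main__":
    p = 31
    for k in (2, 3):
        print(k, worst_case(p, k), (k - 1) * math.sqrt(p))
```

For k = 2 the quadratic sum is known to equal −1 for every pair of distinct shifts, so the bound is far from tight there; the case k = 3 is more representative.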

Lemma 9.26 Suppose that χ is a non-principal character modulo p and let
\[
S_{h,r} = \sum_{n=1}^{p} \Bigl| \sum_{m=1}^{h} \chi(m + n) \Bigr|^{2r}.
\]
Then $S_{h,r} \ll r^{2r} h^{r} p + h^{2r} p^{1/2}$ for positive integers r.
Proof Clearly we may suppose that h ≤ p. Let d denote the order of χ. Then
d > 1 and
\[
S_{h,r} = \sum_{m_1, \ldots, m_{2r}} \sum_{n=1}^{p} \chi\bigl( (n + m_1) \cdots (n + m_r) (n + m_{r+1})^{d-1} \cdots (n + m_{2r})^{d-1} \bigr).
\]
For a given 2r-tuple $m_1, \ldots, m_{2r}$ let $c_1 < c_2 < \cdots < c_k$ be the distinct values
of the $m_j$, and let $a_l$ and $b_l$ denote the number of occurrences of
$c_l$ amongst the $m_1, \ldots, m_r$ and $m_{r+1}, \ldots, m_{2r}$ respectively. Let $e_l = a_l +
(d - 1) b_l$. Then $(n + m_1) \cdots (n + m_r)(n + m_{r+1})^{d-1} \cdots (n + m_{2r})^{d-1} = (n +
c_1)^{e_1} \cdots (n + c_k)^{e_k}$. Note that $e_1 + \cdots + e_k = r + r(d - 1) = rd$. If there is an
l such that $d \nmid e_l$, then by Lemma 9.25 the sum over n is bounded by $(k - 1) p^{1/2}$,
and so the total contribution to $S_{h,r}$ from such 2r-tuples is
\[
\le 2r h^{2r} p^{1/2}.
\]
On the other hand, if $d \mid e_l$ for every l, then $kd \le e_1 + \cdots + e_k = rd$ and so k ≤ r.
The number of choices of $m_1, \ldots, m_{2r}$ with $m_l \in \{c_1, \ldots, c_k\}$ is at most $k^{2r}$
and the number of choices for $c_1, \ldots, c_k$ is $\binom{h}{k}$. Thus the total contribution to
$S_{h,r}$ from these terms is bounded by
\[
\sum_{k \le r} \binom{h}{k} k^{2r} p \ll r^{2r} h^{r} p.
\]

Our main result takes the following form.

Theorem 9.27 (Burgess) For any odd prime p and any positive integer r we
have
\[
\sum_{n=M+1}^{M+N} \chi(n) \ll r N^{1 - \frac{1}{r}} p^{\frac{r+1}{4r^2}} (\log p)^{\alpha_r}
\]
where $\alpha_r = 1$ when r = 1 or 2 and $\alpha_r = \frac{1}{2r}$ otherwise.

Suppose that δ > 1/4. If N > p δ , then the bound above is o(N ) if r is
chosen suitably large in terms of δ. Thus any interval of length N contains both
quadratic residues and quadratic non-residues. In addition the reasoning used
to derive Corollary 9.19 applies here, so we see that the least positive quadratic
non-residue modulo p is $\ll_\varepsilon p^{\frac{1}{4\sqrt{e}}+\varepsilon}$.
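The phrase "r chosen suitably large in terms of δ" can be made concrete with exact rational arithmetic. Ignoring the logarithmic factor and the factor r, for $N = p^{\delta}$ the bound of Theorem 9.27 equals $N \cdot p^{(r+1)/(4r^2) - \delta/r}$, which beats the trivial bound N exactly when that exponent is negative. The sketch below, with the hypothetical helpers `burgess_saving` and `smallest_r`, finds the least such r.

```python
from fractions import Fraction


def burgess_saving(delta: Fraction, r: int) -> Fraction:
    """Exponent e with N^(1-1/r) p^((r+1)/(4r^2)) = N p^e when N = p^delta."""
    return Fraction(r + 1, 4 * r * r) - delta / r


def smallest_r(delta: Fraction) -> int:
    """Least r for which the Burgess bound (log factors ignored) is o(N)."""
    if delta <= Fraction(1, 4):
        raise ValueError("no r works unless delta > 1/4")
    r = 1
    while burgess_saving(delta, r) >= 0:
        r += 1
    return r


if __name__ == "__main__":
    for delta in (Fraction(1, 2), Fraction(3, 10), Fraction(26, 100)):
        r = smallest_r(delta)
        print(delta, r, burgess_saving(delta, r))
```

The condition simplifies to r > 1/(4δ − 1), which is why the required r grows without bound as δ decreases to 1/4.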

Proof When r = 1 or $N > p^{5/8}$ the bound is weaker than the Pólya–
Vinogradov Inequality (Theorem 9.18), and when r > 2 and $N > p^{1/2}$ the
stated bound is weaker than the case r = 2. Also, when $N \le p^{\frac{r+1}{4r}}$ the bound is
worse than trivial. Hence we may suppose that
\[
p > p_0, \quad r \ge 2, \quad \text{and} \quad p^{\frac{r+1}{4r}} < N \le
\begin{cases}
p^{5/8} & \text{when } r = 2, \\
p^{1/2} & \text{when } r > 2.
\end{cases} \tag{9.24}
\]
Let S(M, N) denote the sum in question. Then
\[
S(M, N) = \sum_{n=M+1}^{M+N} \chi(n + ab) + S(M, ab) - S(M + N, ab).
\]
Let
\[
\mathcal{M}(y) = \max_{\substack{M, N \\ N \le y}} |S(M, N)|.
\]
Then
\[
S(M, N) = \sum_{n=M+1}^{M+N} \chi(n + ab) + 2\theta \mathcal{M}(ab)
\]
where |θ| ≤ 1. We sum this over a ∈ [1, A] and b ∈ [1, B]. Thus
\[
AB \, S(M, N) = \sum_{n,a,b} \chi(n + ab) + 2AB \theta_1 \mathcal{M}(AB).
\]

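We suppose that
\[
A < p \tag{9.25}
\]
and then define ν(ℓ) to be the number of pairs a, n with a ∈ [1, A], n ∈ [M +
1, M + N] and n ≡ aℓ (mod p). Thus
\[
\Bigl| \sum_{n,a,b} \chi(n + ab) \Bigr|
= \Bigl| \sum_{\ell=1}^{p} \sum_{\substack{n, a \\ n \equiv a\ell \ (\mathrm{mod}\ p)}} \chi(a) \sum_{b} \chi(\ell + b) \Bigr|
\le \sum_{\ell=1}^{p} \nu(\ell) \Bigl| \sum_{b} \chi(\ell + b) \Bigr|.
\]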

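By Hölder’s inequality,
\[
\Bigl( \sum_{\ell=1}^{p} \nu(\ell) \Bigl| \sum_{b} \chi(\ell + b) \Bigr| \Bigr)^{2r}
\le \Bigl( \sum_{\ell=1}^{p} \nu(\ell)^{\frac{2r}{2r-1}} \Bigr)^{2r-1} \sum_{\ell=1}^{p} \Bigl| \sum_{b} \chi(\ell + b) \Bigr|^{2r}
\]
and
\[
\Bigl( \sum_{\ell=1}^{p} \nu(\ell)^{\frac{2r}{2r-1}} \Bigr)^{2r-1}
\le \Bigl( \sum_{\ell=1}^{p} \nu(\ell) \Bigr)^{2r-2} \sum_{\ell=1}^{p} \nu(\ell)^2.
\]
Clearly
\[
\sum_{\ell=1}^{p} \nu(\ell) = AN.
\]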

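We show below that if
\[
AN < \tfrac{1}{2} p, \qquad 1 \le A \le N, \tag{9.26}
\]
then
\[
\sum_{\ell=1}^{p} \nu(\ell)^2 \ll AN \log p. \tag{9.27}
\]
Assuming this, we take $A = \bigl[ \tfrac{1}{10} N p^{-1/(2r)} \bigr]$, $B = \bigl[ p^{1/(2r)} \bigr]$. Then (9.24) gives
(9.25) and (9.26). Thus from Lemma 9.26 with h = B we see that
\[
\sum_{n,a,b} \chi(n + ab) \ll r N^{2 - \frac{1}{r}} p^{\frac{r+1}{4r^2}} (\log p)^{\frac{1}{2r}}.
\]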

Hence there is an absolute constant C such that
\[
|S(M, N)| \le C r N^{1 - \frac{1}{r}} p^{\frac{r+1}{4r^2}} (\log p)^{\frac{1}{2r}} + 2 \mathcal{M}(N/10). \tag{9.28}
\]
Choose $M_1, N_1$ with $N_1 \le N$ so that $|S(M_1, N_1)| = \mathcal{M}(N)$. If (9.24) fails
because $N_1 \le p^{\frac{r+1}{4r}}$, then (9.28) with $M = M_1$, $N = N_1$ is trivial. Thus we
have
\[
\mathcal{M}(N) \le N^{1 - \frac{1}{r}} \lambda + 2 \mathcal{M}(N/10) \tag{9.29}
\]
where
\[
\lambda = C r p^{\frac{r+1}{4r^2}} (\log p)^{\frac{1}{2r}}.
\]
Moreover (9.29) is also trivial when $N \le p^{\frac{r+1}{4r}}$. We apply (9.29) repeatedly with
N replaced by [N/10], [[N/10]/10], and so on. Thus
\[
\mathcal{M}(N) \le N^{1 - \frac{1}{r}} \lambda \sum_{k=0}^{K} 2^k 10^{-k(1 - \frac{1}{r})} + 2^{K+1} \mathcal{M}(10^{-K-1} N).
\]
The trivial bound $\mathcal{M}(10^{-K-1} N) \ll 10^{-K} N$ with a judicious choice of K suffices
to give
\[
\mathcal{M}(N) \ll N^{1 - \frac{1}{r}} \lambda
\]
which completes the proof, apart from the need to establish (9.27) with (9.26).
Clearly
\[
\sum_{\ell} \nu(\ell)^2
\]
is the number of choices of a, n, a′, n′, with a, a′ ∈ [1, A], n, n′ ∈ [1, N],
M + n ≡ aℓ (mod p), M + n′ ≡ a′ℓ (mod p). Since 1 ≤ a, a′ ≤ A < p, by
elimination of ℓ we see that this is the number of solutions of (a − a′)M ≡
a′n − an′ (mod p) with a, n, a′, n′ as before. Given any such pair a, a′, choose
k so that k ≡ (a − a′)M (mod p) and |k| < p/2. We have
$1 \le a'n, \, an' \le AN \le \tfrac{1}{10} N^2 p^{-\frac{1}{2r}} < p/2$ in all cases. Thus a′n − an′ = k. Given any one pair $n = n_0$,
$n' = n_0'$ satisfying this equation we have, in general, $n = n_0 + \frac{a}{(a,a')} h$, $n' =
n_0' + \frac{a'}{(a,a')} h$. Moreover $|h| \le \frac{N (a,a')}{\max\{a, a'\}}$. Therefore the total number of possible
pairs n, n′ is at most $1 + \frac{2N (a,a')}{\max\{a, a'\}}$. Hence
\[
\begin{aligned}
\sum_{\ell} \nu(\ell)^2 &\ll A^2 + \sum_{1 \le a \le a' \le A} \frac{N (a, a')}{a'} \\
&\ll A^2 + \sum_{d \le A} \sum_{1 \le b \le b' \le A/d} \frac{N}{b'} \\
&\ll A^2 + AN \log 2A,
\end{aligned}
\]
and so we have (9.27).

9.4.1 Exercises
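1. Let χ be a non-principal character modulo q, and suppose that (a, q) = 1.
Choose $\overline{a}$ so that $a \overline{a} \equiv 1 \pmod{q}$.
(a) Explain why
\[
\overline{\chi}(a) \sum_{n=M+1}^{M+N} \chi(an + b) = \sum_{n=M+\overline{a}b+1}^{M+\overline{a}b+N} \chi(n).
\]
(b) Show that
\[
\sum_{n=M+1}^{M+N} \chi(an + b) \ll \sqrt{q} \log q.
\]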
2. With reference to the proof of Theorem 9.18, show that $2^{\omega(r)} \le c \sqrt{r}$ for
all positive integers r where $c = 4/\sqrt{6}$, and that equality holds only when
r = 6.
3. Show that if χ is a character modulo q with χ(−1) = −1, then
\[
\sum_{n=1}^{q} n^2 \chi(n) = q \sum_{n=1}^{q} n \chi(n).
\]
4. (a) Let $c_n$ and f(n) have period q. Show that
\[
\sum_{n=1}^{q} c_n f(n) = \sum_{n=1}^{q} c_n \, \frac{1}{q} \sum_{k=1}^{q} \widehat{f}(k) e(kn/q) = \frac{1}{q} \sum_{k=1}^{q} \widehat{f}(k) \widehat{c}(-k).
\]
(b) Suppose that 1 ≤ N ≤ q and set f(n) = 1 for M + 1 ≤ n ≤ M + N,
and f(n) = 0 for other residues (mod q). Show that $\widehat{f}(0) = N$ and by
(9.14) or otherwise that
\[
\widehat{f}(k) = e\bigl( -(2M + N + 1) k/(2q) \bigr) \frac{\sin \pi k N/q}{\sin \pi k/q}
\]
for $k \not\equiv 0 \pmod{q}$.
(c) By subtracting $\widehat{c}(0) N/q$ from both sides and applying the triangle
inequality, show that
\[
\Bigl| \sum_{n=M+1}^{M+N} c_n - \frac{N}{q} \sum_{n=1}^{q} c_n \Bigr| \le \frac{1}{q} \sum_{k=1}^{q-1} \frac{|\widehat{c}(k)|}{\sin \pi k/q}.
\]

5. (a) Suppose that a function f is concave upwards. Explain why
\[
f(x) \le \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(u) \, du
\]
for δ > 0.
(b) Take f(u) = csc πu, x = k/q, and δ = 1/(2q), and sum over k to see
that
\[
\sum_{k=1}^{q-1} \frac{1}{\sin \pi k/q} < q \int_{1/(2q)}^{1 - 1/(2q)} \frac{1}{\sin \pi u} \, du.
\]
(c) Note that csc v has the antiderivative log(csc v − cot v), and hence
deduce that the integral above is
\[
= \frac{q}{\pi} \log \frac{1 + \cos \frac{\pi}{2q}}{1 - \cos \frac{\pi}{2q}}.
\]
(d) By means of the inequalities $1 - \theta^2/2 \le \cos \theta \le 1$ deduce that the
above is
\[
< \frac{q}{\pi} \log \frac{16 q^2}{\pi^2} = \frac{2q}{\pi} \log \frac{4q}{\pi}.
\]
(e) Note that this is < q log q if q > exp((log 4/π)/(1 − 2/π)) =
1.944 . . . .
6. Let $c_n$ be a sequence with period q and finite Fourier transform $\widehat{c}(k)$.
(a) Show that
\[
\sum_{M=1}^{q} \Bigl| \sum_{n=M+1}^{M+N} c_n - \frac{N}{q} \sum_{n=1}^{q} c_n \Bigr|^2
= \frac{1}{q} \sum_{k=1}^{q-1} |\widehat{c}(k)|^2 \frac{\sin^2 \pi N k/q}{\sin^2 \pi k/q}
\]
for 1 ≤ N ≤ q.
(b) Suppose that $c_n = 1$ for 0 < n < q and that $c_0 = 0$. Show that $\widehat{c}(0) =
q - 1$ and that $\widehat{c}(k) = -1$ for 0 < k < q. Deduce that
\[
\sum_{k=1}^{q-1} \frac{\sin^2 \pi N k/q}{\sin^2 \pi k/q} = (q - N) N
\]
for 0 ≤ N ≤ q.
(c) Take q = 2N and write k = 2n − 1 to deduce that
\[
\sum_{n=1}^{N} \frac{1}{\bigl( N \sin \pi \frac{2n-1}{2N} \bigr)^2} = 1.
\]
Let N tend to infinity to show that $\sum_{n=1}^{\infty} (2n - 1)^{-2} = \pi^2/8$, and hence
that $\zeta(2) = \pi^2/6$.

7. (a) Show that if χ is a primitive character modulo q, q > 1, then
\[
\sum_{M=1}^{q} \Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr|^2 \le Nq
\]
for 1 ≤ N ≤ q.
(b) Show that if $\chi \ne \chi_0$ (mod p), then
\[
\sum_{M=1}^{p} \Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr|^2 = N(p - N)
\]
for 1 ≤ N ≤ p.

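8. Let $f_\chi(\alpha) = \sum_{0 < n \le q\alpha} \chi(n)$. Show that if χ is a primitive character modulo
q, then
\[
\int_0^1 |f_\chi(\alpha) - a_\chi|^2 \, d\alpha = \frac{q}{12} \prod_{p \mid q} \Bigl( 1 - \frac{1}{p^2} \Bigr)
\]
where $a_\chi = 0$ if χ(−1) = 1, and
\[
a_\chi = \frac{-1}{q} \sum_{n=1}^{q} n \chi(n) = -i L(1, \overline{\chi}) \tau(\chi)/\pi
\]
if χ(−1) = −1.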
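9. (a) Show that
\[
\sum_{p \mid q} \frac{\log p}{p - 1} \ll \log \log 3q.
\]
(b) Recall Exercise 2.1.16, and show that
\[
\sum_{\substack{k \le K \\ (k,q)=1}} \frac{1}{k} = \frac{\varphi(q)}{q} \log K + O\Bigl( \frac{\varphi(q)}{q} \log \log q \Bigr) + O\Bigl( \frac{2^{\omega(q)}}{K} \Bigr)
\]
for 1 ≤ K ≤ q.
(c) Suppose that χ is a primitive character modulo q, q > 1. Use
Theorem D.2 to show that
\[
\sum_{n=M+1}^{M+N} \chi(n) = \frac{\tau(\chi)}{2\pi i} \sum_{0 < |k| \le K} \frac{\overline{\chi}(-k)}{k} e(kM/q) \bigl( e(kN/q) - 1 \bigr) + O\Bigl( \frac{\varphi(q)}{K} \log 2K \Bigr)
\]
when $K < q^{1-\varepsilon}$.
(d) By taking $K = q^{1/2} \log q$ show that if χ is a primitive character modulo
q, q > 1, then
\[
\Bigl| \sum_{n=M+1}^{M+N} \chi(n) \Bigr| \le \frac{\varphi(q)}{\pi q} q^{1/2} \log q + O\bigl( q^{1/2} \log \log 3q \bigr).
\]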

10. (Bernstein 1914a,b) Let χ be a primitive character (mod q), with q > 1.
Show that
\[
\sum_{|n| \le q} (1 - |n|/q) \chi(n) e(n\alpha) \ll \sqrt{q}
\]
uniformly in α.
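Before attempting Exercise 6(b) and (c), it can be reassuring to confirm the closed forms numerically. The Python sketch below is only a floating-point check, not a proof, and the function names are hypothetical.

```python
import math


def sine_ratio_sum(q: int, N: int) -> float:
    """sum_{k=1}^{q-1} sin^2(pi N k / q) / sin^2(pi k / q), expected to equal (q - N) N."""
    return sum(
        (math.sin(math.pi * N * k / q) / math.sin(math.pi * k / q)) ** 2
        for k in range(1, q)
    )


def odd_reciprocal_squares(terms: int) -> float:
    """Partial sum of sum_{n>=1} (2n - 1)^(-2), which tends to pi^2 / 8."""
    return sum(1.0 / (2 * n - 1) ** 2 for n in range(1, terms + 1))


if __name__ == "__main__":
    print(sine_ratio_sum(12, 5), (12 - 5) * 5)
    print(sine_ratio_sum(200, 100), 100 * 100)
    print(odd_reciprocal_squares(100000), math.pi ** 2 / 8)
```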

9.5 Notes
Section 9.2. That the sum in (9.6) vanishes when (n, q) > 1 was proved by de la
Vallée Poussin (1896), in a complicated way. We follow the simpler argument
that Schur showed Landau (1908, pp. 430–431).
The evaluation of the sum cχ is found in Hasse (1964, pp. 449–450). Our
derivation follows that of Montgomery & Vaughan (1975). A different proof
has been given by Joris (1977).

Section 9.3. Let $\zeta_K(s) = \sum_{\mathfrak{a}} N(\mathfrak{a})^{-s}$ be the Dedekind zeta function of the
algebraic number field K. Here the sum is over all ideals $\mathfrak{a}$ in the ring $\mathcal{O}_K$ of
integers in K. In case K is a quadratic extension of Q, then the discriminant
d of K is a quadratic discriminant, $K = \mathbb{Q}(\sqrt{d})$, and $\zeta_K(s) = \zeta(s) L(s, \chi_d)$. In
other words, the number of ideals of norm n is $\sum_{k \mid n} \chi_d(k)$.
Section 9.4. Concerning the constant that can be taken in Theorem 9.18,
see Landau (1918), Cochrane (1987), Hildebrand (1988a,b), and Granville &
Soundararajan (2005). Granville & Soundararajan (2005) also show that in the
case of a cubic character, the sum in Theorem 9.18 is $\ll \sqrt{q} (\log q)^{\theta}$ where θ
is an absolute constant, θ < 1.
On the assumption of the Generalized Riemann Hypothesis for all Dirichlet
characters, Montgomery & Vaughan (1977) have shown that
\[
\sum_{n=M+1}^{M+N} \chi(n) \ll q^{1/2} \log \log q.
\]

See Granville & Soundararajan (2005) for a much simpler proof. Paley’s lower
bound, Theorem 9.24 above, shows that the above is essentially best-possible.
Nevertheless, it is known that one can do better a good deal of the time. In fact
in Montgomery & Vaughan (1979) it is shown that for each θ ∈ (0, 1) there is a
c(θ) > 0 such that if $P > P_0(\theta)$, then for at least θπ(P) primes p ≤ P we have
\[
\max_{N} \Bigl| \sum_{n=1}^{N} \Bigl( \frac{n}{p} \Bigr) \Bigr| \le c(\theta) p^{1/2},
\]
and if $q > P_0(\theta)$, then for at least θϕ(q) of the non-principal characters modulo
q we have
\[
\max_{N} \Bigl| \sum_{n=1}^{N} \chi(n) \Bigr| \le c(\theta) q^{1/2}.
\]

Walfisz (1942) and Chowla (1947) showed that there exist infinitely many
primitive quadratic characters χ for which $L(1, \chi) \gtrsim e^{C_0} \log \log q$. In view
of Theorem 9.21, this provides an alternative approach for proving estimates
similar to Paley’s Theorem 9.24. For recent developments concerning large
L(1, χ ), see Vaughan (1996), Montgomery & Vaughan (1999), and Granville
& Soundararajan (2003).
Lemma 9.25 is a consequence of Weil’s proof of the Riemann Hypothesis
for curves over finite fields, and originally depended on considerable machinery
from algebraic geometry. Later Stepanov used constructs from transcendence
theory to estimate complete character sums, and subsequently Bombieri used
Stepanov’s ideas to give a proof of Weil’s theorem that depends only on the
Riemann–Roch theorem. Schmidt (1976) gives an exposition of this more
elementary approach that even avoids the Riemann–Roch theorem. Friedlander
& Iwaniec (1992) showed that the Pólya–Vinogradov inequality can be sharpened,
in the direction of Burgess’ estimates, without using Weil’s estimates. The
proof of Theorem 9.27 above is developed from one of Iwaniec appearing in
Friedlander (1987), with a further wrinkle from Friedlander & Iwaniec (1993).
Burgess ﬁrst (1957) treated the Legendre symbol and then (1962a, b) gener-
alized his method to deal with arbitrary Dirichlet characters having cube-free
conductor. Burgess’ extension to composite moduli involves an extra new idea
that does not extend well when the conductor is divisible by higher powers of
primes. For some progress in this direction see Burgess (1986).

9.6 References
Apostol, T. M. (1970). Euler’s ϕ-function and separable Gauss sums, Proc. Amer. Math.
Soc. 24, 482–485.
Baker, R. C. & Montgomery, H. L. (1990). Oscillations of quadratic L-functions,
Analytic Number Theory (Urbana, 1989), Prog. Math. 85. Boston: Birkhäuser,
pp. 23–40.
Bernstein, S. N. (1914a). Sur la convergence absolue des séries trigonométriques, C. R.
Acad. Sci. Paris 158, 1661–1663.
(1914b). Ob absoliutnoi skhodimosti trigonometricheskikh riadov, Soobsch. Khar’k.
matem. ob-va (2) 14, 145–152; 200–201.
Burgess, D. A. (1957). The distribution of quadratic residues and non-residues, Mathe-
matika 4, 106–112.
(1962a). On character sums and primitive roots, Proc. London Math. Soc. (3) 12,
179–192.
(1962b). On character sums and L-series, Proc. London Math. Soc. (3) 12, 193–
206.
(1986). The character sum estimate with r = 3, J. London Math. Soc. (2) 33, 219–
226.
Chowla, S. (1947). On the class-number of the corpus P(√−k), Proc. Nat. Inst. Sci.
India 13, 197–200.
Chowla, S. & Mordell, L. J. (1961). Note on the nonvanishing of L(1), Proc. Amer.
Math. Soc. 12, 283–284.
Cochrane, T. (1987). On a trigonometric inequality of Vinogradov, J. Number Theory
27, 9–16.
Conway, J. H. (1997). The Sensuous Quadratic Form, Carus monograph 26. Washington:
Math. Assoc. Amer.
Friedlander, J. B. (1987). Primes in arithmetic progressions and related topics, Analytic
Number Theory and Diophantine Problems (Stillwater, 1984), Prog. Math. 70,
Boston: Birkhäuser, pp. 125–134.
Friedlander, J. B. & Iwaniec, H. (1992). A mean-value theorem for character sums,
Michigan Math. J. 39, 153–159.
(1993). Estimates for character sums, Proc. Amer. Math. Soc. 119, 365–372.
(1994). A note on character sums, The Rademacher legacy to mathematics (University
Park, 1992), Contemp. Math. 166, Providence: Amer. Math. Soc., pp. 295–299.
Fujii, A., Gallagher, P. X., & Montgomery, H. L. (1976). Some hybrid bounds for
character sums and Dirichlet L-series, Topics in Number Theory (Proc. Colloq.
Debrecen, 1974), Colloq. Math. Soc. Janos Bolyai 13. Amsterdam: North-Holland,
pp. 41–57.
Granville, A. & Soundararajan, K. (2003). The distribution of values of L(1, χd ), Geom.
Funct. Anal. 13, 992–1028; Errata 14 (2004), 245–246.
(2006). Large character sums: pretentious characters and the Pólya-Vinogradov in-
equality, to appear, 24 pp.
Hasse, H. (1964). Vorlesungen über Zahlentheorie, Second Edition, Grundl. Math. Wiss.
59. Berlin: Springer-Verlag.
Hildebrand, A. (1988a). On the constant in the Pólya–Vinogradov inequality, Canad.
Math. Bull. 31, 347–352.
(1988b). Large values of character sums, J. Number Theory 29, 271–296.
Joris, H. (1977). On the evaluation of Gaussian sums for non-primitive characters,
Enseignement Math. (2) 23, 13–18.
Landau, E. (1908). Nouvelle démonstration pour la formule de Riemann sur le nom-
bre des nombres premiers inférieurs à une limite donnée, et démonstration d’une
formule plus générale pour le cas des nombres premiers d’une progression
arithmétique, Ann. École Norm. Sup. (3) 25 399–448; Collected Works, Vol. 4.
Essen: Thales Verlag, 1986, pp. 87–130.
(1918). Abschätzungen von Charaktersummen, Einheiten und Klassenzahlen, Nachr.
Akad. Wiss. Göttingen, 79–97; Collected Works, Vol. 7. Essen: Thales Verlag, 1986,
pp. 114–132.
Martin, G. (2006). Inequities in the Shanks–Rényi prime number race, 32 pp., to appear.
Mattics, L. E. (1984). Advanced problem 6461, Amer. Math. Monthly 91, 371.
Montgomery, H. L. (1976). Distribution questions concerning a character sum, Topics in
Number Theory (Proc. Colloq. Debrecen, 1974), Colloq. Math. Soc. Janos Bolyai
13. Amsterdam: North-Holland, pp. 195–203.
(1980). An exponential polynomial formed with the Legendre symbol, Acta Arith.
37, 375–380.
Montgomery, H. L. & Vaughan, R. C. (1975). The exceptional set in Goldbach’s problem,
Acta Arith. 27, 353–370.
(1977). Exponential sums with multiplicative coefficients, Invent. Math. 43, 69–82.
(1979). Mean values of character sums, Canad. J. Math. 31, 476–487.
(1999). Extreme values of Dirichlet L-functions at 1, Number Theory in Progress,
Vol. 2 (Zakopane–Kościelisko, 1997). Berlin: de Gruyter, pp. 1039–1052.
Mordell, L. J. (1933). The number of solutions of some congruences in two variables,
Math. Z. 37, 193–209.
Paley, R. E. A. C. (1932). A theorem of characters, J. London Math. Soc. 7, 28–32.
Pólya, G. (1918). Über die Verteilung der quadratischen Reste und Nichtreste, Nachr.
Akad. Wiss. Göttingen, 21–29.
Schmidt, W. M. (1976). Equations over finite fields. An elementary approach, Lecture
Notes Math. 536, Berlin: Springer-Verlag.
Schur, I. (1918). Einige Bemerkungen zu der vorstehenden Arbeit des Herrn G. Pólya:
Über die Verteilung der quadratischen Reste und Nichtreste, Nachr. Akad. Wiss.
Göttingen, 30–36.
de la Vallée Poussin, C. J. (1896). Recherches analytiques sur la théorie des nombres
premiers, I–III, Ann. Soc. Sci. Bruxelles 20, 183–256, 281–362, 363–397.
Vaughan, R. C. (1996). Small values of Dirichlet L-functions at 1, Analytic Number
Theory (Allerton Park, 1995), Vol. 2, Prog. Math. 139, Boston: Birkhäuser, pp.
755–766.
Vinogradov, I. M. (1918). Sur la distribution des résidus et des nonrésidus des puissances,
J. Soc. Phys. Math. Univ. Permi, 18–28.
(1919). Über die Verteilung der quadratischen Reste und Nichtreste, J. Soc. Phys.
Math. Univ. Permi, 1–14.
Vorhauer, U. M. A. (2006). A note on comparative prime number theory, to appear.
Walﬁsz, A. (1942). On the class-number of binary quadratic forms, Trudy Tbliss. Mat.
Inst. 11, 57–71.
10
Analytic properties of the zeta function
and L-functions

10.1 Functional equations and analytic continuation

In Section 1.3 we saw that the zeta function can be analytically continued to the
half-plane σ > 0. We now derive an important formula for the Riemann zeta
function, one that serves to define the zeta function throughout the complex
plane. From this formula we see that the zeta function is analytic at all points
except for s = 1, and we find that ζ (s) is related to ζ (1 − s). In preparation for
this we first use the Poisson summation formula to establish a corresponding
functional equation for theta functions.

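Theorem 10.1 For arbitrary real α, and complex numbers z with ℜz > 0,
\[
\sum_{n=-\infty}^{\infty} e^{-\pi (n + \alpha)^2 z} = z^{-1/2} \sum_{k=-\infty}^{\infty} e(k\alpha) e^{-\pi k^2/z}, \tag{10.1}
\]
and
\[
\sum_{n=-\infty}^{\infty} (n + \alpha) e^{-\pi (n + \alpha)^2 z} = -i z^{-3/2} \sum_{k=-\infty}^{\infty} k \, e(k\alpha) e^{-\pi k^2/z} \tag{10.2}
\]
where the branch of $z^{1/2}$ is determined by $1^{1/2} = 1$.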
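The identity (10.1) is easy to test numerically, since both series converge very rapidly when |z| is of moderate size. The Python sketch below truncates each series symmetrically; the truncation length and the function names are assumptions made for this illustration, and the truncation would need to be longer if ℜz or ℜ(1/z) were small.

```python
import cmath
import math


def theta_left(alpha: float, z: complex, terms: int = 12) -> complex:
    """Truncation of sum over n of exp(-pi (n + alpha)^2 z)."""
    return sum(
        cmath.exp(-math.pi * (n + alpha) ** 2 * z) for n in range(-terms, terms + 1)
    )


def theta_right(alpha: float, z: complex, terms: int = 12) -> complex:
    """Truncation of z^(-1/2) sum over k of e(k alpha) exp(-pi k^2 / z)."""
    total = sum(
        cmath.exp(2j * math.pi * k * alpha) * cmath.exp(-math.pi * k * k / z)
        for k in range(-terms, terms + 1)
    )
    # cmath.sqrt is the principal branch, which satisfies sqrt(1) = 1.
    return total / cmath.sqrt(z)


if __name__ == "__main__":
    for alpha, z in ((0.3, 0.7 + 0.4j), (0.0, 2.0 + 0j), (0.5, 1.0 - 0.5j)):
        print(alpha, z, abs(theta_left(alpha, z) - theta_right(alpha, z)))
```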

Proof We can obtain (10.2) from (10.1) by differentiating with respect
to α, since the differentiated series are uniformly convergent for α in a
compact set. As for (10.1), we note that if g(u) = f(u + α), then $\widehat{g}(t) =
\widehat{f}(t) e(t\alpha)$. (Conventions governing the definition of the Fourier transform $\widehat{f}$
are established in Appendix D.) We apply the Poisson summation formula
(Theorem D.3) to g(u), where $f(u) = e^{-\pi u^2 z}$, and it remains only to demonstrate
that $\widehat{f}(t) = z^{-1/2} e^{-\pi t^2/z}$. Writing
\[
-\pi x^2 z - 2\pi i t x = -\pi (x + it/z)^2 z - \pi t^2/z,
\]
we see that
\[
\widehat{f}(t) = e^{-\pi t^2/z} \int_{-\infty}^{+\infty} e^{-\pi (x + it/z)^2 z} \, dx.
\]
We consider this integral to be a contour integral in the complex plane. We
note that the integrand tends to 0 very rapidly as |ℜx| tends to infinity with
|ℑx| bounded. Hence by Cauchy’s theorem we may translate the path of
integration to the line x − it/z, −∞ < x < +∞, and we find that the above
integral is $\int_{-\infty}^{+\infty} e^{-\pi x^2 z} \, dx$. We now turn the path of integration through an
angle $-\tfrac{1}{2} \arg z$ and again apply Cauchy’s theorem. After reparametrizing,
we see that our integral is $z^{-1/2} \int_{-\infty}^{+\infty} e^{-\pi x^2} \, dx = z^{-1/2}$. This completes the
proof.

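Theorem 10.2 For any complex number s, except s = 0 and s = 1, and any
non-zero complex number z with ℜz ≥ 0,
\[
\begin{aligned}
\zeta(s) \Gamma(s/2) \pi^{-s/2} = {} & \pi^{-s/2} \sum_{n=1}^{\infty} n^{-s} \Gamma(s/2, \pi n^2 z) \\
& + \pi^{(s-1)/2} \sum_{n=1}^{\infty} n^{s-1} \Gamma((1 - s)/2, \pi n^2/z) \\
& + \frac{z^{(s-1)/2}}{s - 1} - \frac{z^{s/2}}{s}.
\end{aligned} \tag{10.3}
\]
Here Γ(s, a) is the incomplete gamma function,
\[
\Gamma(s, a) = \int_{a}^{\infty} e^{-w} w^{s-1} \, dw, \tag{10.4}
\]
and we may take the path of integration to be the ray w = a + u, 0 ≤ u < ∞,
so that
\[
\Gamma(s, a) = \int_{0}^{\infty} e^{-u-a} (u + a)^{s-1} \, du.
\]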

Now $(u + a)^{s-1} \ll |a|^{\sigma - 1}$ uniformly for ℜa ≥ 0, |a| ≥ ε > 0, and |σ| ≤ C, so
that $n^{-s} \Gamma(s/2, \pi n^2 z) \ll n^{-2}$ uniformly for ℜz ≥ 0, |z| ≥ ε, |s| ≤ C. Thus the

two sums on the right are uniformly convergent for s in any compact set, and
hence by a theorem of Weierstrass they represent entire functions. The last two
terms have simple poles at 1 and 0, respectively. As for the left-hand side, we
note that Γ(s/2) has a pole at s = 0, and never vanishes, so it follows that ζ(s)
is analytic for all s ≠ 1. If we simultaneously replace s by 1 − s and z by 1/z,
then the two sums on the right in (10.3) are exchanged, and the last two terms
are also exchanged, so that the value of the right-hand side is invariant. These
observations may be summarized as follows:
Corollary 10.3 The function
\[
\xi(s) = \tfrac{1}{2} s (s - 1) \zeta(s) \Gamma(s/2) \pi^{-s/2} \tag{10.5}
\]
is entire, and ξ (s) = ξ (1 − s) for all s.
This is the functional equation of the zeta function, first proved by Riemann
in 1860. Since ζ(s) ≠ 0 for σ ≥ 1, it follows that ξ(s) ≠ 0 for σ ≥ 1, and
by the functional equation that ξ(s) ≠ 0 for σ ≤ 0. The zeros of ζ(s) in the
critical strip 0 < σ < 1 coincide precisely with those of ξ(s). As Γ(s/2) has
simple poles at s = 0, −2, −4, −6, . . . , the zeta function has simple zeros at
s = −2, −4, −6, . . . . These are the trivial zeros of the zeta function. The only
other zeros of the zeta function are the non-trivial zeros, in the critical strip.
The generic non-trivial zero is denoted ρ = β + iγ. By the Schwarz reflection
principle, $\xi(\overline{s}) = \overline{\xi(s)}$; hence in particular $\xi\bigl( \tfrac{1}{2} - it \bigr) = \overline{\xi\bigl( \tfrac{1}{2} + it \bigr)}$. But the
functional equation gives $\xi\bigl( \tfrac{1}{2} - it \bigr) = \xi\bigl( \tfrac{1}{2} + it \bigr)$, so it follows that $\xi\bigl( \tfrac{1}{2} + it \bigr)$
is real for all real t. Similarly, if ρ is a zero of ξ(s) then so also are $\overline{\rho}$, 1 − ρ,
and $1 - \overline{\rho}$. The as yet unproved Riemann Hypothesis (RH) asserts that all
non-trivial zeros of the zeta function have real part 1/2; that is, all the zeros of ξ(s)
lie on the critical line σ = 1/2. We shall find it instructive to explore a number
of consequences of this famous conjecture, in Chapter 13.
Proof of Theorem 10.2 By Euler's integral formula (Theorem C.2) for $\Gamma(s/2)$
we see that if $\sigma > 0$, then
\[
\Gamma(s/2) = \int_0^{\infty} e^{-x} x^{s/2-1}\,dx. \tag{10.6}
\]
By the linear change of variables $x = \pi n^2 u$ it follows that
\[
n^{-s}\Gamma(s/2)\pi^{-s/2} = \int_0^{\infty} e^{-\pi n^2 u} u^{s/2-1}\,du.
\]
We assume that $\sigma > 1$ and sum over $n$ to find that
\[
\zeta(s)\Gamma(s/2)\pi^{-s/2} = \sum_{n=1}^{\infty}\int_0^{\infty} e^{-\pi n^2 u} u^{s/2-1}\,du
= \int_0^{\infty}\sum_{n=1}^{\infty} e^{-\pi n^2 u}\, u^{s/2-1}\,du. \tag{10.7}
\]
Here the exchange of integration and summation is permitted by absolute
convergence. Suppose, for the present, that $\Re z > 0$. We may consider the integral
above to be a contour integral in the complex plane, and by Cauchy’s theorem
we may replace the path of integration by the ray from 0 that passes through
z. We now consider separately the integral from 0 to z, and the integral from
$z$ to $\infty$. We call these integrals $\int_1$, $\int_2$, respectively. By reversing the steps we
made in passing from (10.6) to (10.7) we see immediately that
\[
\int_2 = \pi^{-s/2}\sum_{n=1}^{\infty} n^{-s}\,\Gamma(s/2, \pi n^2 z).
\]

To treat $\int_1$ we let
\[
\vartheta(u) = \sum_{n=-\infty}^{+\infty} e^{-\pi n^2 u} \tag{10.8}
\]
for $\Re u > 0$. Then the sum in the integrand in (10.7) is $(\vartheta(u) - 1)/2$. Thus
\[
\int_1 = \frac12\int_0^z \vartheta(u) u^{s/2-1}\,du - \frac12\int_0^z u^{s/2-1}\,du.
\]
Here the second integral is $\frac{2}{s} z^{s/2}$. By Theorem 10.1 we know that $\vartheta(u) =
u^{-1/2}\vartheta(1/u)$. Hence the first term above is
\[
\frac12\int_0^z \vartheta(1/u) u^{s/2-3/2}\,du
= \int_0^z \sum_{n=1}^{\infty} e^{-\pi n^2/u}\, u^{s/2-3/2}\,du + \frac12\int_0^z u^{s/2-3/2}\,du.
\]
Here the second integral is $\frac{2}{s-1} z^{(s-1)/2}$. By the change of variable $v = 1/u$ we
see that the first term above is
\[
\int_{1/z}^{\infty}\sum_{n=1}^{\infty} e^{-\pi n^2 v}\, v^{(1-s)/2-1}\,dv.
\]
We exchange the order of summation and integration, and make the linear
change of variables $x = \pi n^2 v$, to see that this is
\[
\pi^{(s-1)/2}\sum_{n=1}^{\infty} n^{s-1}\,\Gamma((1 - s)/2, \pi n^2/z).
\]
Hence
\[
\int_1 = \frac{z^{(s-1)/2}}{s-1} - \frac{z^{s/2}}{s} + \pi^{(s-1)/2}\sum_{n=1}^{\infty} n^{s-1}\,\Gamma((1 - s)/2, \pi n^2/z),
\]
so we have the desired identity for $\sigma > 1$. But, as already noted, the two sums
represent entire functions, so the right-hand side of (10.3) is analytic for all $s$
except for simple poles at $s = 1$ and $s = 0$. Hence by the uniqueness of analytic
continuation the identity (10.3) holds for all $s$ except at the poles.
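
The identity (10.3) can be checked numerically in a special case. The following is a minimal sketch, assuming real $z > 0$ and $s = 2$, where the left-hand side is $\zeta(2)\Gamma(1)/\pi = \pi/6$, where $\Gamma(1, a) = e^{-a}$, and where one integration by parts gives $\Gamma(-1/2, a) = 2a^{-1/2}e^{-a} - 2\sqrt{\pi}\,\mathrm{erfc}(\sqrt{a})$. The function names and the truncation length are illustrative choices, not part of the text.

```python
import math

def theta(u, terms=60):
    """Theta function (10.8) for real u > 0: sum over all integers n of exp(-pi n^2 u)."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * u) for n in range(1, terms))

def upper_gamma_minus_half(a):
    """Gamma(-1/2, a) for real a > 0, from one integration by parts:
    Gamma(-1/2, a) = 2 a^(-1/2) e^(-a) - 2 sqrt(pi) erfc(sqrt(a))."""
    root = math.sqrt(a)
    return 2.0 * math.exp(-a) / root - 2.0 * math.sqrt(math.pi) * math.erfc(root)

def rhs_10_3_at_s2(z, terms=60):
    """Right-hand side of (10.3) at s = 2 for real z > 0 (truncated sums)."""
    # First sum: pi^(-1) * sum n^(-2) Gamma(1, pi n^2 z), with Gamma(1, a) = e^(-a).
    first = sum(math.exp(-math.pi * n * n * z) / (n * n) for n in range(1, terms)) / math.pi
    # Second sum: pi^(1/2) * sum n Gamma(-1/2, pi n^2 / z).
    second = math.sqrt(math.pi) * sum(
        n * upper_gamma_minus_half(math.pi * n * n / z) for n in range(1, terms)
    )
    # Last two terms: z^((s-1)/2)/(s-1) - z^(s/2)/s at s = 2.
    return first + second + math.sqrt(z) - z / 2.0

lhs = math.pi / 6.0  # zeta(2) * Gamma(1) * pi^(-1)
for z in (0.5, 1.0, 2.0):
    print(z, rhs_10_3_at_s2(z), lhs)
```

The right-hand side should agree with $\pi/6$ for each $z$, which illustrates that the value does not depend on where the integral is split.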

The functional equation of Corollary 10.3 can also be expressed asymmetrically:
Corollary 10.4 For all $s \ne 1$,
\[
\zeta(s) = \zeta(1 - s)\, 2^s \pi^{s-1}\, \Gamma(1 - s) \sin\frac{\pi s}{2}. \tag{10.9}
\]

Proof By the reflection principle (C.6) and the duplication formula (C.9), we
see that
\[
\frac{\Gamma\big(\frac{1-s}{2}\big)}{\Gamma\big(\frac{s}{2}\big)}
= \frac{1}{\pi}\,\Gamma\Big(\frac{1-s}{2}\Big)\Gamma\Big(1 - \frac{s}{2}\Big)\sin\frac{\pi s}{2}
= \pi^{-1/2}\, 2^s\, \Gamma(1 - s)\sin\frac{\pi s}{2}.
\]
Thus the stated identity follows from Corollary 10.3.

By Stirling's formula, we can describe $|\zeta(s)|$ in terms of $|\zeta(1 - s)|$.

Corollary 10.5 Suppose that $A > 0$ is fixed. Then
\[
|\zeta(s)| \asymp \tau^{1/2-\sigma}\,|\zeta(1 - s)|
\]
uniformly for $|\sigma| \le A$ and $|t| \ge 1$. Here $\tau = |t| + 4$, as usual.
Proof Since the above is invariant when $s$ is replaced by $1 - s$, we may suppose
that $-A \le \sigma \le 1/2$. We may also suppose that $t \ge 1$, since $|\zeta(\sigma - it)| =
|\zeta(\sigma + it)|$. We consider the factors on the right-hand side of (10.9). By
Stirling's formula as formulated in (C.18), we see that
\[
|\Gamma(1 - s)| \asymp \big|(1 - s)^{1/2-s}\big| = |1 - s|^{1/2-\sigma}\exp(t \arg(1 - s)).
\]
But $\arg(1 - s) = -\arctan t/(1 - \sigma) = -\pi/2 + O(1/t)$ and $|1 - s| \sim t$, so
$|\Gamma(1 - s)| \asymp t^{1/2-\sigma}\exp(-\pi t/2)$. On the other hand, $\sin z = (e^{iz} - e^{-iz})/(2i)$,
so $|\sin \pi s/2| \asymp \exp(\pi t/2)$, and we obtain the stated result.

Let $\sigma$ be fixed, and let $\mu(\sigma)$ denote the infimum of those exponents $\mu$
such that $\zeta(\sigma + it) \ll \tau^{\mu}$. This is the Lindelöf $\mu$-function. By Corollary 1.17
we know that $\mu(\sigma) = 0$ for $\sigma \ge 1$ and that $\mu(\sigma) \le 1 - \sigma$ for $0 < \sigma \le 1$. By
Corollary 10.5 we see that $\mu(\sigma) = \mu(1 - \sigma) + 1/2 - \sigma$. Hence in particular,
$\mu(\sigma) = 1/2 - \sigma$ for $\sigma \le 0$. For $0 < \sigma < 1$ the value of $\mu(\sigma)$ is at present
unknown, but the Lindelöf Hypothesis (LH) asserts that $\zeta(1/2 + it) \ll_\varepsilon \tau^{\varepsilon}$,
which is to say that $\mu(1/2) = 0$. From this it follows that
\[
\mu(\sigma) = \begin{cases} 0 & \text{for } \sigma \ge 1/2, \\ 1/2 - \sigma & \text{for } \sigma \le 1/2. \end{cases} \tag{10.10}
\]
Three different proofs that LH implies the above are found in Exercises
10.1.18–20. Also, from Exercises 10.1.20 and 10.1.21 we see that LH is equivalent
to a certain assertion concerning the distribution of the zeros of ζ (s). Since
this assertion is visibly weaker than RH, it is evident that RH implies LH. In
Chapter 13 we shall show that RH implies a quantitative form of LH.
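
The relation $\mu(\sigma) = \mu(1 - \sigma) + 1/2 - \sigma$ can be illustrated with a small sketch. The two functions below are illustrative names: the first encodes the conditional formula (10.10), and the second encodes the unconditional upper bound that combines the known values for $\sigma \ge 1$ and $\sigma \le 0$ with the bound $\frac12(1 - \sigma)$ of Exercise 19(e) in between. Exact rational arithmetic is used so that the symmetry can be compared without rounding.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def mu_under_lh(sigma):
    """mu(sigma) as given by (10.10), which assumes the Lindelof Hypothesis."""
    return Fraction(0) if sigma >= HALF else HALF - sigma

def mu_convexity_bound(sigma):
    """Unconditional upper bound for mu(sigma): the exact value for sigma >= 1
    and sigma <= 0, and the bound (1 - sigma)/2 of Exercise 19(e) for 0 < sigma < 1."""
    if sigma >= 1:
        return Fraction(0)
    if sigma <= 0:
        return HALF - sigma
    return (1 - sigma) / 2

def satisfies_reflection(mu, sigma):
    """Check the relation mu(sigma) = mu(1 - sigma) + 1/2 - sigma from Corollary 10.5."""
    return mu(sigma) == mu(1 - sigma) + HALF - sigma

samples = [Fraction(k, 8) for k in range(-16, 25)]
print(all(satisfies_reflection(mu_under_lh, s) for s in samples))
print(all(satisfies_reflection(mu_convexity_bound, s) for s in samples))
```

Both piecewise-linear functions are compatible with the functional equation; they differ only inside the critical strip, where the true value of $\mu(\sigma)$ is not known.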
Concerning special values of the zeta function, we observe first that since
$\zeta(s) \sim 1/(s - 1)$ for $s$ near 1, it follows from Corollary 10.4 that
\[
\zeta(0) = -1/2. \tag{10.11}
\]
In addition, we note that Corollary B.3 asserts that
\[
\zeta(2k) = \frac{(-1)^{k-1}\, 2^{2k-1} B_{2k}}{(2k)!}\,\pi^{2k} \tag{10.12}
\]
for each positive integer $k$. Hence by taking $s = 1 - 2k$ in Corollary 10.4 we
deduce that
\[
\zeta(1 - 2k) = \frac{-B_{2k}}{2k} \tag{10.13}
\]
for positive integers $k$. An alternative proof of this is found in Appendix B.
We may also determine the value of $\zeta'(0)$, as follows. Let $f(s) = (s - 1)\zeta(s)$.
By Corollary 1.16 we know that $f(s) = 1 + C_0(s - 1) + \cdots$ for $s$ near 1.
On multiplying both sides of (10.9) by $s - 1$ we see that $f(s) = -\zeta(1 - s)\,
2^s \pi^{s-1}\,\Gamma(2 - s)\sin \pi s/2$. On differentiating both sides and setting $s = 1$ we
discover that $C_0 = 2\zeta'(0) - 2\zeta(0)\log 2\pi + 2\zeta(0)\Gamma'(1)$. But $\zeta(0) = -1/2$ and
$\Gamma'(1) = -C_0$, so we find that
\[
\zeta'(0) = -\tfrac12 \log 2\pi. \tag{10.14}
\]
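
The passage from (10.12) to (10.13) can be traced numerically. The sketch below is an illustration under stated assumptions: Bernoulli numbers are generated from the standard recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ (with $B_1 = -1/2$), $\zeta(2k)$ is evaluated from (10.12) and compared with a direct sum plus the first Euler–Maclaurin tail terms, and $\zeta(1 - 2k)$ is then obtained from (10.9) with $s = 1 - 2k$. All function names are illustrative.

```python
import math
from fractions import Fraction

def bernoulli(m_max):
    """Bernoulli numbers B_0..B_{m_max} (with B_1 = -1/2), from the recurrence
    sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        acc = sum(math.comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B

def zeta_even(k, B):
    """zeta(2k) from formula (10.12)."""
    return ((-1) ** (k - 1) * 2 ** (2 * k - 1) * float(B[2 * k])
            * math.pi ** (2 * k) / math.factorial(2 * k))

def zeta_direct(s, N=1000):
    """zeta(s) for real s > 1: partial sum up to N plus two Euler-Maclaurin tail terms."""
    partial = sum(n ** -s for n in range(1, N + 1))
    return partial + N ** (1 - s) / (s - 1) - 0.5 * N ** -s

def zeta_negative_odd(k, B):
    """zeta(1 - 2k) from the functional equation (10.9) with s = 1 - 2k."""
    s = 1 - 2 * k
    return (zeta_even(k, B) * 2.0 ** s * math.pi ** (s - 1)
            * math.gamma(1 - s) * math.sin(math.pi * s / 2))

B = bernoulli(8)
for k in (1, 2, 3):
    print(k, zeta_even(k, B), zeta_direct(2 * k),
          zeta_negative_odd(k, B), float(-B[2 * k] / (2 * k)))
```

For $k = 1$ this reproduces $\zeta(2) = \pi^2/6$ and $\zeta(-1) = -1/12$.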
Our treatment of the zeta function extends readily to L-functions.
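Theorem 10.6 For $z$ with $\Re z > 0$ let
\[
\vartheta_0(z, \chi) = \sum_{n=-\infty}^{\infty} \chi(n) e^{-\pi n^2 z/q},
\qquad
\vartheta_1(z, \chi) = \sum_{n=-\infty}^{\infty} n\chi(n) e^{-\pi n^2 z/q}.
\]
If $\chi$ is a primitive character modulo $q$, then
\[
\vartheta_0(z, \chi) = \frac{\tau(\chi)}{q^{1/2}}\, z^{-1/2}\, \vartheta_0(1/z, \bar\chi),
\qquad
\vartheta_1(z, \chi) = \frac{\tau(\chi)}{i q^{1/2}}\, z^{-3/2}\, \vartheta_1(1/z, \bar\chi)
\]
where the branch of $z^{1/2}$ is determined by $1^{1/2} = 1$.
Though both these functions are defined for all $\chi$, we note that if $\chi(-1) =$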
−1, then ϑ0 (z, χ) = 0 for all z, while if χ (−1) = 1, then ϑ1 (z, χ) = 0 identi-
cally. Thus ϑ0 (z, χ) is of interest when χ (−1) = 1, and ϑ1 (z, χ ) is useful when
χ (−1) = −1.
Proof Since $\chi$ is periodic with period $q$, it follows that
\[
\vartheta_0(z, \chi) = \sum_{a=1}^{q} \chi(a) \sum_{m=-\infty}^{\infty} e^{-\pi (mq+a)^2 z/q}.
\]
By (10.1) with $\alpha = a/q$ and $z$ replaced by $qz$ we see that the above is
\[
= (qz)^{-1/2} \sum_{a=1}^{q} \chi(a) \sum_{k=-\infty}^{\infty} e^{-\pi k^2/(qz)} e(ak/q)
= (qz)^{-1/2} \sum_{k=-\infty}^{\infty} e^{-\pi k^2/(qz)} \sum_{a=1}^{q} \chi(a) e(ak/q).
\]
Since $\chi$ is primitive, we know by Theorem 9.7 that the inner sum on the right is
$\tau(\chi)\bar\chi(k)$ for all $k$. This gives the identity for $\vartheta_0$. The identity for $\vartheta_1$ is proved
similarly, using (10.2).

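In order to unify our formulæ we find it convenient to put
\[
\kappa = \kappa(\chi) = \begin{cases} 0 & \text{if } \chi(-1) = 1, \\ 1 & \text{if } \chi(-1) = -1. \end{cases} \tag{10.15}
\]
In this notation, the formulæ of Theorem 10.6 read
\[
\vartheta_\kappa(z, \chi) = \frac{\varepsilon(\chi)}{z^{1/2+\kappa}}\, \vartheta_\kappa(1/z, \bar\chi) \tag{10.16}
\]
where
\[
\varepsilon(\chi) = \frac{\tau(\chi)}{i^{\kappa}\sqrt{q}}. \tag{10.17}
\]
Suppose that $\chi$ is primitive. Some of our results concerning Gauss sums can be
reformulated in terms of $\varepsilon(\chi)$. Firstly, from Theorem 9.7 we see that $|\varepsilon(\chi)| = 1$.
Secondly, by Theorems 9.5 and 9.7 we see that $\varepsilon(\chi)\varepsilon(\bar\chi) = 1$. Finally, if $\chi$ is
not only primitive but also quadratic, then $\varepsilon(\chi) = 1$, by Theorem 9.17.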
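
These properties of $\varepsilon(\chi)$, and the transformation formula (10.16), can be checked numerically for small moduli. The sketch below is illustrative and makes its own choices: a character is stored as a list of its values on the residues $0, 1, \ldots, q - 1$, the sums are truncated, and $z$ is taken real and positive. The three sample characters (the quadratic character modulo 5, the character modulo 5 with $\chi(2) = i$, and the odd character modulo 4) are all primitive.

```python
import cmath
import math

# Characters as value tables on the residues 0, 1, ..., q-1 (illustrative encoding).
CHI5_QUAD = [0, 1, -1, -1, 1]    # quadratic character mod 5, chi(-1) = 1
CHI5_CPLX = [0, 1, 1j, -1j, -1]  # character mod 5 with chi(2) = i, chi(-1) = -1
CHI4 = [0, 1, 0, -1]             # odd character mod 4

def conj(chi):
    """Value table of the conjugate character."""
    return [complex(c).conjugate() for c in chi]

def gauss_sum(chi, q):
    """tau(chi) = sum_{a=1}^{q} chi(a) e(a/q), where e(x) = exp(2 pi i x)."""
    return sum(chi[a % q] * cmath.exp(2j * math.pi * a / q) for a in range(1, q + 1))

def kappa(chi, q):
    """kappa = 0 if chi(-1) = 1 and kappa = 1 if chi(-1) = -1, as in (10.15)."""
    return 0 if abs(chi[q - 1] - 1) < 1e-12 else 1

def epsilon(chi, q):
    """epsilon(chi) = tau(chi) / (i^kappa sqrt(q)), as in (10.17)."""
    return gauss_sum(chi, q) / (1j ** kappa(chi, q) * math.sqrt(q))

def theta_kappa(z, chi, q, terms=40):
    """Truncated theta_kappa(z, chi) = sum over n of n^kappa chi(n) exp(-pi n^2 z / q)."""
    k = kappa(chi, q)
    return sum((n ** k) * chi[n % q] * math.exp(-math.pi * n * n * z / q)
               for n in range(-terms, terms + 1))

def theta_defect(z, chi, q):
    """Absolute difference between the two sides of (10.16) for real z > 0."""
    k = kappa(chi, q)
    rhs = epsilon(chi, q) * z ** (-0.5 - k) * theta_kappa(1.0 / z, conj(chi), q)
    return abs(theta_kappa(z, chi, q) - rhs)

for chi, q in ((CHI5_QUAD, 5), (CHI5_CPLX, 5), (CHI4, 4)):
    print(q, epsilon(chi, q), abs(epsilon(chi, q)), theta_defect(0.7, chi, q))
```

For the two quadratic characters the computed $\varepsilon(\chi)$ should be 1 up to rounding, while for the complex character it is a point on the unit circle other than 1.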
In the same way that Theorem 10.2 was derived from (10.8), the following
is an immediate consequence of (10.16).
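Theorem 10.7 Let $\chi$ be a primitive character modulo $q$ with $q > 1$. Then for
any complex numbers $s$ and $z$ with $\Re z \ge 0$,
\[
\begin{aligned}
L(s, \chi)\,\Gamma((s + \kappa)/2)\,(q/\pi)^{(s+\kappa)/2}
={}& (q/\pi)^{(s+\kappa)/2} \sum_{n=1}^{\infty} \chi(n) n^{-s}\, \Gamma((s + \kappa)/2, \pi n^2 z/q) \\
&+ \varepsilon(\chi)(q/\pi)^{(1-s+\kappa)/2} \sum_{n=1}^{\infty} \bar\chi(n) n^{s-1}\, \Gamma((1 - s + \kappa)/2, \pi n^2/(qz)).
\end{aligned} \tag{10.18}
\]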

As was the case with the zeta function, the above is first proved for $\sigma > 1$.
Since each term of the series is entire, and since the series are locally uniformly
convergent, the right-hand side is an entire function of $s$, and this provides an
analytic continuation of $L(s, \chi)$ to the entire complex plane. If in the above we
replace $\chi$ by $\bar\chi$, $s$ by $1 - s$, and $z$ by $1/z$, and then multiply both sides by $\varepsilon(\chi)$
then the right-hand side above is unchanged, and thus we obtain a functional
equation for $L(s, \chi)$, as follows.
Corollary 10.8 Let $\chi$ be a primitive character modulo $q$ with $q > 1$. The
function
\[
\xi(s, \chi) = L(s, \chi)\,\Gamma((s + \kappa)/2)\,(q/\pi)^{(s+\kappa)/2} \tag{10.19}
\]
is entire, and $\xi(s, \chi) = \varepsilon(\chi)\,\xi(1 - s, \bar\chi)$ for all $s$.
Let $\chi$ be a primitive character modulo $q$, $q > 1$. We already know that
$L(s, \chi) \ne 0$ for $\sigma > 1$. Since the gamma function has no zeros, it follows that
$\xi(s, \chi) \ne 0$ in this half-plane. By the functional equation, $\xi(s, \chi) \ne 0$ also
for $\sigma < 0$, and hence $L(s, \chi) \ne 0$ for $\sigma < 0$ except that $L(s, \chi)$ must have
simple zeros where the gamma factor has simple poles, which is to say at
$-\kappa, -\kappa - 2, -\kappa - 4, \ldots$. These are the trivial zeros of $L(s, \chi)$. Zeros $\rho =
\beta + i\gamma$ of $L(s, \chi)$ in the critical strip $0 \le \beta \le 1$ are called non-trivial. The
conjecture that these latter zeros all lie on the critical line $\sigma = 1/2$ is the
Generalized Riemann Hypothesis (GRH). If $\rho$ is a non-trivial zero of $L(s, \chi)$,
then by the functional equation $1 - \rho$ is a zero of $L(s, \bar\chi)$. Consequently $1 - \bar\rho$ is
a zero of $L(s, \chi)$, since in general $\overline{L(s, \chi)} = L(\bar s, \bar\chi)$. The pair of zeros $\rho$, $1 - \bar\rho$
are symmetrically placed with respect to the critical line. Of course, if $\beta = 1/2$
then $\rho = 1 - \bar\rho$. For complex characters there is no symmetry about the real
axis, but if $\chi$ is quadratic then $\bar\chi = \chi$, and so if $\rho$ is a zero then so also are $\bar\rho$,
$1 - \rho$, and $1 - \bar\rho$.
The functional equation of an L-function can also be expressed asymmetri-
cally.
Corollary 10.9 Suppose that $\chi$ is a primitive character (mod $q$) with $q > 1$.
Then for all $s$,
\[
L(s, \chi) = \varepsilon(\chi)\, L(1 - s, \bar\chi)\, 2^s \pi^{s-1} q^{1/2-s}\, \Gamma(1 - s) \sin\frac{\pi}{2}(s + \kappa).
\]
Proof When $\kappa = 0$ we proceed as in the proof of Corollary 10.4. When $\kappa = 1$
we use the reflection formula (C.6) and the duplication formula (C.9) to see
that
\[
\frac{\Gamma(1 - s/2)}{\Gamma((s + 1)/2)}
= \frac{1}{\pi}\,\Gamma(1 - s/2)\,\Gamma(1/2 - s/2) \sin \pi(s + 1)/2
= 2^s \pi^{-1/2}\, \Gamma(1 - s) \sin\frac{\pi}{2}(s + 1).
\]
This, with the identity $\xi(s, \chi) = \varepsilon(\chi)\xi(1 - s, \bar\chi)$, gives the stated result.

By the same method used to prove Corollary 10.5 we obtain

Corollary 10.10 Let $\chi$ be a primitive character (mod $q$) with $q > 1$, and
suppose that $A > 0$ is fixed. Then
\[
|L(s, \chi)| \asymp (q\tau)^{1/2-\sigma}\, |L(1 - s, \bar\chi)|
\]
uniformly for $|\sigma| \le A$ and $|t| \ge 1$. If $-A \le \sigma \le 1/2$ and $|t| \le 1$, then
\[
L(s, \chi) \ll q^{1/2-\sigma}\, |L(1 - s, \bar\chi)|.
\]

Let $\chi$ be a character modulo $q$. If $\chi$ is imprimitive, then $\chi$ is induced by a
primitive character $\chi^\star$ modulo $d$, for some $d \mid q$, and
\[
L(s, \chi) = L(s, \chi^\star) \prod_{p \mid q} \Big(1 - \frac{\chi^\star(p)}{p^s}\Big). \tag{10.20}
\]
If $p \mid d$, then $\chi^\star(p) = 0$, and thus in the above product we may confine our
attention to those primes $p \mid q$ such that $p \nmid d$. For such a prime, the factor
$1 - \chi^\star(p)/p^s$ is an entire function whose zeros form an arithmetic progression
on the imaginary axis. Thus $L(s, \chi)$ has all the zeros of $L(s, \chi^\star)$, and if there are
primes $p \mid q$ such that $p \nmid d$, then $L(s, \chi)$ has additional zeros on the imaginary
axis. Such zeros constitute a finite union of arithmetic progressions. In the
special case $\chi = \chi_0$, we have
\[
L(s, \chi_0) = \zeta(s) \prod_{p \mid q} \Big(1 - \frac{1}{p^s}\Big).
\]

Thus L(s, χ0 ) has a pole at s = 1 with residue ϕ(q)/q, it has all the zeros of
ζ (s), and it also has zeros of the form 2πik/ log p where k takes integral values
and p|q.
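
The two facts just stated about $L(s, \chi_0)$ are easy to illustrate. In the sketch below (function names are illustrative), the residue $\varphi(q)/q$ is computed as the value of the product $\prod_{p \mid q}(1 - 1/p)$ in exact arithmetic, and the factor $1 - p^{-s}$ is evaluated at the points $2\pi i k/\log p$ to confirm that it vanishes there up to rounding.

```python
import cmath
import math
from fractions import Fraction

def prime_divisors(q):
    """Distinct prime divisors of q >= 1, by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes

def residue_at_one(q):
    """Residue of L(s, chi_0) at s = 1: the product of (1 - 1/p) over p | q, i.e. phi(q)/q."""
    r = Fraction(1)
    for p in prime_divisors(q):
        r *= Fraction(p - 1, p)
    return r

def euler_factor(s, p):
    """The factor 1 - p^(-s) for complex s."""
    return 1 - cmath.exp(-s * math.log(p))

def imaginary_axis_zero(p, k):
    """The point 2 pi i k / log p, a zero of 1 - p^(-s) for every integer k."""
    return 2j * math.pi * k / math.log(p)

print(residue_at_one(12))  # phi(12)/12 as an exact fraction
for p in prime_divisors(12):
    for k in (-2, 1, 3):
        print(p, k, abs(euler_factor(imaginary_axis_zero(p, k), p)))
```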

10.1.1 Exercises
1. Let $\vartheta(u)$ be defined as in (10.8). Show that $\vartheta'(1) = -\vartheta(1)/4$.
2. Let $f$ be an even function in $L^1(\mathbb{R})$, let $\beta > 1$, suppose that $f(x) = O(x^{-\beta})$
as $x \to \infty$, and that $\hat f(u) = O(u^{-\beta})$ as $u \to \infty$. Show that
\[
\begin{aligned}
2\zeta(s)\int_0^{\infty} f(x) x^{s-1}\,dx ={}& 2\sum_{n=1}^{\infty} n^{-s}\int_n^{\infty} f(x) x^{s-1}\,dx
+ 2\sum_{n=1}^{\infty} n^{s-1}\int_n^{\infty} \hat f(u) u^{-s}\,du \\
&- f(0)/s + \hat f(0)/(s - 1)
\end{aligned}
\]
for $1 - \beta < \sigma < \beta$.


3. (Heilbronn 1938; cf. Weil 1967)

(a) Show that for $c > 1$, $x > 0$,
\[
\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} \zeta(s)\Gamma(s/2)(\pi x)^{-s/2}\,ds = 2\sum_{n=1}^{\infty} e^{-\pi n^2 x}.
\]
(b) With $\vartheta(x)$ defined as in (10.8), use the functional equation of the zeta
function to show that $\vartheta(x) = x^{-1/2}\vartheta(1/x)$ for $x > 0$.
4. (Lavrik 1965)
(a) Suppose that $\Re z > 0$, that $\sigma_0 > \max(0, -\sigma)$, and that $s \ne 0$, $s \ne -1$,
$s \ne -2, \ldots$. By pulling the contour to the left and summing the
residues, show that
\[
\frac{1}{2\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty} \Gamma(w + s) z^{-w}\,\frac{dw}{w}
= \Gamma(s) - \sum_{k=0}^{\infty} \frac{(-1)^k z^{s+k}}{k!\,(s + k)}.
\]
(b) Show that if $\sigma > 0$, then the right-hand side above is $\Gamma(s, z)$.
(c) Argue that both sides are entire functions of $s$, and hence that the
identity
\[
\Gamma(s, z) = \frac{1}{2\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty} \Gamma(w + s) z^{-w}\,\frac{dw}{w}
\]
holds for all complex $s$.
(d) Show that if $\sigma_0 > \max(0, (1 - \sigma)/2)$, then
\[
\pi^{-s/2}\sum_{n=1}^{\infty} n^{-s}\,\Gamma(s/2, \pi n^2 z)
= \frac{1}{2\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty} \zeta(s + 2w)\Gamma(w + s/2)\pi^{-w-s/2} z^{-w}\,\frac{dw}{w}.
\]
(e) Suppose now that $s \ne 0$ and $s \ne 1$. Explain why the integrand has poles
at $w = 0$, $w = (1 - s)/2$, $w = -s/2$, and nowhere else.
(f) Show that when the contour is pulled to the left, the pole at $w = 0$
contributes $\zeta(s)\Gamma(s/2)\pi^{-s/2}$, the pole at $w = (1 - s)/2$ contributes
$z^{(s-1)/2}/(s - 1)$, and the pole at $-s/2$ contributes $-z^{s/2}/s$.
(g) Suppose the contour is pulled to the left to an abscissa $\sigma_1 <
\min(0, -\sigma/2)$. By means of the identity $\zeta(s)\Gamma(s/2)\pi^{-s/2} = \zeta(1 - s)
\Gamma((1 - s)/2)\pi^{(s-1)/2}$ and the change of variable $w \to -w$, show
that the expression is $\pi^{(s-1)/2}\sum_{n=1}^{\infty} n^{s-1}\,\Gamma((1 - s)/2, \pi n^2/z)$. Thus
demonstrate that Theorem 10.2 can be derived from Corollary 10.3.
5. Suppose that $\alpha$ is real, that $\Re z > 0$ and that $\chi$ is a primitive character
(mod $q$).
(a) Show that
\[
\sum_{n=-\infty}^{\infty} \chi(n) e^{-\pi(n+\alpha)^2 z/q}
= \frac{\tau(\chi)}{q^{1/2}}\, z^{-1/2} \sum_{k=-\infty}^{\infty} \bar\chi(k) e(k\alpha/q) e^{-\pi k^2/(qz)}.
\]
(b) By differentiating with respect to $\alpha$, or otherwise, show that
\[
\sum_{n=-\infty}^{\infty} \chi(n)(n + \alpha) e^{-\pi(n+\alpha)^2 z/q}
= \frac{\tau(\chi)}{i q^{1/2}}\, z^{-3/2} \sum_{k=-\infty}^{\infty} \bar\chi(k) k\, e(k\alpha/q) e^{-\pi k^2/(qz)}.
\]

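6. Let $\alpha$ and $\beta$ be real numbers, and suppose that $\Re z > 0$, and put
\[
\vartheta_0(z; \alpha, \beta) = \sum_{n=-\infty}^{\infty} e(n\beta) e^{-\pi(n+\alpha)^2 z}.
\]
(a) Show that if $f(x) = e(\beta x) e^{-\pi(x+\alpha)^2 z}$, then
$\hat f(t) = e(-\alpha\beta) z^{-1/2} e(\alpha t) e^{-\pi(t-\beta)^2/z}$.
(b) Show that $\vartheta_0(z; \alpha, \beta) = e(-\alpha\beta) z^{-1/2}\, \vartheta_0(1/z; -\beta, \alpha)$.
(c) Without using the result of (b), show that $\vartheta_0(z; \alpha, \beta) = \vartheta_0(z; -\alpha, -\beta)$.
7. Show that
\[
\sum_{n=-\infty}^{\infty} (1 - 2\pi n^2 x) e^{-\pi n^2 x}
> \sum_{n=-\infty}^{\infty} \big(2\pi(n + 1/2)^2 x - 1\big) e^{-\pi(n+1/2)^2 x} > 0
\]
for all $x > 0$.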

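8. Use the functional equation of the zeta function in any convenient form to
show that
\[
\zeta(1 - s) = \zeta(s)\, 2^{1-s} \pi^{-s}\, \Gamma(s) \cos\frac{\pi s}{2}.
\]
9. Show that if $k$ is a positive integer, then
\[
\zeta'(-2k) = \frac{(-1)^k (2k)!\, \zeta(2k + 1)}{2^{2k+1} \pi^{2k}}.
\]
10. Let $\vartheta(x)$ be defined as in (10.8). Show that
\[
\zeta(s)\Gamma(s/2)\pi^{-s/2} = \frac{1}{s(s - 1)} + \frac12\int_1^{\infty} \big(x^{s/2} + x^{(1-s)/2}\big)(\vartheta(x) - 1)\,\frac{dx}{x}
\]
for all $s$ except $s = 1$ or $s = 0$.
11. (Walfisz 1931, p. 454) Show that
\[
\sum_{a=1}^{\infty} \sum_{\substack{b=1 \\ (a,b)=1}}^{\infty} \frac{1}{a^2 b^2} = \frac{5}{2}.
\]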

12. (Mallik 1977) Let $\chi$ be a primitive quadratic character.

(a) Show that $\xi'(1/2, \chi) = 0$.
(b) Show that if $L(1/2, \chi) \ne 0$, then $\operatorname{sgn} L'(1/2, \chi) = -\operatorname{sgn} L(1/2, \chi)$.
13. Let $\chi$ be a primitive character modulo $q$, and let $\theta$ be a real number such
that $e^{2i\theta} = \varepsilon(\chi)$. Thus $e^{i\theta}$ is one of the square roots of $\varepsilon(\chi)$. Show that
$\xi(1/2 + it, \chi) e^{-i\theta}$ is real for all real $t$.
14. Let $\chi$ be a primitive character modulo $q$ with $q > 1$, and suppose that
$\chi(-1) = 1$.
(a) For each positive integer $k$, show that
\[
L(2k, \chi) = \frac{(-1)^{k-1}\, 2^{2k-1} \pi^{2k}\, \tau(\chi)}{(2k)!\, q} \sum_{a=1}^{q} \bar\chi(a) B_{2k}(a/q).
\]
(b) For positive integers $k$, deduce that
\[
L(1 - 2k, \chi) = \frac{-q^{2k-1}}{2k} \sum_{a=1}^{q} \chi(a) B_{2k}(a/q).
\]
15. Let $\chi$ be a primitive character modulo $q$ with $q > 1$, and suppose that
$\chi(-1) = -1$.
(a) For each non-negative integer $k$, show that
\[
L(2k + 1, \chi) = \frac{i(-1)^k\, 2^{2k} \pi^{2k+1}\, \tau(\chi)}{(2k + 1)!\, q} \sum_{a=1}^{q} \bar\chi(a) B_{2k+1}(a/q).
\]
(b) Show that when $k = 0$, the above is consistent with the formula of
Theorem 9.9.
(c) For non-negative integers $k$, deduce that
\[
L(-2k, \chi) = \frac{-q^{2k}}{2k + 1} \sum_{a=1}^{q} \chi(a) B_{2k+1}(a/q).
\]
16. (a) Let p1 and p2 be distinct primes. Show that (log p1 )/(log p2 ) is irra-
tional.
(b) Let χ be a character modulo q. Show that all zeros of L(s, χ ) on the
imaginary axis are simple, except possibly for zeros at the point s = 0.
(c) Let a positive integer $m$ and a primitive character $\chi^\star$ be given. Show
that there is a character $\chi$ induced by $\chi^\star$ such that $L(s, \chi)$ has a zero
at $s = 0$ of exact multiplicity $m$.
17. (Landau 1907) (a) Let $\chi$ denote the character modulo 5 such that $\chi(2) = i$.
Show that $L(1, \chi) = (-1 - 3i)\pi\tau(\chi)/25$.
(b) With $\chi$ as above, show that $L(2, \chi^2) = 4\sqrt{5}\,\pi^2/125$.
(c) Let $\chi$ be as above. By using Exercise 9.2.9, or otherwise, show that
$\tau(\chi)^2 = (-1 - 2i)\sqrt{5}$.
(d) With $\chi$ as above, show that
\[
\frac{L(1, \chi)^2}{L(2, \chi^2)} = 1 + i/2.
\]
(e) Let $\chi$ denote a non-principal character modulo $q$. Show that
\[
\sum_{n=1}^{\infty} 2^{\omega(n)} \chi(n) n^{-s} = \frac{L(s, \chi)^2}{L(2s, \chi^2)}
\]
for $\sigma > 1/2$.
(f) Let $\varepsilon_n = 1$ if $n \equiv 1 \pmod 5$, $\varepsilon_n = -1$ if $n \equiv -1 \pmod 5$, and $\varepsilon_n = 0$
otherwise. Show that
\[
\sum_{n=1}^{\infty} \frac{\varepsilon_n 2^{\omega(n)}}{n} = 1.
\]

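18. Suppose throughout that $0 < \delta \le 1/2$. (a) Let $\alpha(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ be
a Dirichlet series with abscissa of convergence $\sigma_c$. Show that if $\sigma_0 >
\max(\delta, \sigma_c)$, then
\[
\sum_{n \le x} a_n\big((x/n)^{\delta} - (n/x)^{\delta}\big)
= \frac{\delta}{\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty} \alpha(w)\,\frac{x^w}{(w - \delta)(w + \delta)}\,dw.
\]
(b) By taking $\alpha(w) = \zeta(1/2 + it + w)$, and considering the residues arising
from poles at $w = 1/2 - it$ and at $w = \delta$, show that
\[
\begin{aligned}
\zeta(1/2 + \delta + it) ={}& x^{-\delta}\sum_{n \le x} n^{-1/2-it}\big((x/n)^{\delta} - (n/x)^{\delta}\big) \\
&+ \frac{\delta x^{-\delta}}{\pi}\int_{-\infty}^{\infty} \zeta(1/2 + it + iu)\,\frac{x^{iu}}{u^2 + \delta^2}\,du \\
&- \frac{2\delta x^{1/2-\delta-it}}{(1/2 - it - \delta)(1/2 - it + \delta)} \\
={}& T_1 + T_2 + T_3,
\end{aligned}
\]
say.
(c) Show that
\[
T_1 \ll 1 + x^{1/2-\delta}\min\Big(\frac{1}{|\delta - 1/2|}, \log x\Big).
\]
(d) Let $M(T) = \max_{0 \le t \le T} |\zeta(1/2 + it)|$. Show that
\[
T_2 \ll x^{-\delta} M(2\tau)
\]
uniformly for $0 < \delta \le 1/2$.
(e) Show that $T_3 \ll x^{1/2-\delta}/\tau^2$.
(f) By taking $x = M(2\tau)^2$, show that
\[
\zeta(\sigma + it) \ll M(2\tau)^{2-2\sigma}\min\Big(\frac{1}{|\sigma - 1|}, \log M(2\tau)\Big)
\]
uniformly for $1/2 \le \sigma \le 1$.
(g) Show that if $M(T) \ll_\varepsilon T^{\varepsilon}$, then $\mu(\sigma) = 0$ for $\sigma \ge 1/2$.
(h) By Corollary 10.5, deduce that if $M(T) \ll_\varepsilon T^{\varepsilon}$, then $\mu(\sigma) = 1/2 - \sigma$
when $\sigma \le 1/2$.
19. Let $M(\sigma, T) = \max_{1 \le t \le T} |\zeta(\sigma + it)|$. Suppose that $\sigma, \sigma_1, \sigma_2$ are fixed, $0 \le
\sigma_1 < \sigma < \sigma_2 \le 1$. Let $\mathcal{C}$ denote the rectangular contour with vertices $\sigma_2 -
\sigma - i\tau/2$, $\sigma_2 - \sigma + i\tau/2$, $\sigma_1 - \sigma + i\tau/2$, $\sigma_1 - \sigma - i\tau/2$.
(a) Show that
\[
\zeta(\sigma + it) = \frac{1}{2\pi i}\int_{\mathcal{C}} \zeta(s + w)\,\frac{x^w}{w(w + 1)}\,dw.
\]
(b) Deduce that
\[
\zeta(\sigma + it) \ll M(\sigma_1, 2\tau) x^{\sigma_1-\sigma} + M(\sigma_2, 2\tau) x^{\sigma_2-\sigma}.
\]
(c) By choosing $x$ suitably, show that
\[
M(\sigma, T) \ll M(\sigma_1, 2T)^{(\sigma_2-\sigma)/(\sigma_2-\sigma_1)}\, M(\sigma_2, 2T)^{(\sigma-\sigma_1)/(\sigma_2-\sigma_1)}.
\]
(d) Deduce that
\[
\mu(\sigma) \le \frac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1}\,\mu(\sigma_1) + \frac{\sigma - \sigma_1}{\sigma_2 - \sigma_1}\,\mu(\sigma_2).
\]
(e) Conclude that $\mu(\sigma) \le \frac12(1 - \sigma)$ for $0 \le \sigma \le 1$.
(f) Show that if $\mu(1/2) = 0$, then (10.10) holds for all $\sigma$.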
20. (Backlund 1918) Assume the Lindelöf Hypothesis (LH) throughout, and
suppose that $\delta$ is a small fixed positive number and that $t$ is not the ordinate
$\gamma$ of a zero $\rho$ of $\zeta(s)$.
(a) Show that the number of zeros $\rho = \beta + i\gamma$ of $\zeta(s)$ in the rectangle
$1/2 + \delta \le \beta \le 1$, $T - 1 \le \gamma \le T + 1$ is $o(\log T)$.
(b) Show that
\[
\frac{\zeta'}{\zeta}(s) = \sum_{\rho} \frac{1}{s - \rho} + o(\log \tau)
\]
uniformly for $1/2 + 2\delta \le \sigma \le 2$ where the sum is over those zeros $\rho$
for which $1/2 + \delta \le \beta \le 1$, $t - 1 \le \gamma \le t + 1$.
(c) Show that if $\sigma_1 < \sigma_2$ and $t \ne \gamma$, then
\[
\int_{\sigma_1}^{\sigma_2} \frac{\sigma - \beta}{(\sigma - \beta)^2 + (t - \gamma)^2}\,d\sigma
= \frac12 \log\frac{(\sigma_2 - \beta)^2 + (t - \gamma)^2}{(\sigma_1 - \beta)^2 + (t - \gamma)^2}.
\]
(d) Show that if $1/2 \le \sigma_1 \le 1$ and $t \ne \gamma$, then
\[
\int_{\sigma_1}^{2} \frac{\sigma - \beta}{(\sigma - \beta)^2 + (t - \gamma)^2}\,d\sigma \ge 0.
\]
(e) Show that if $t$ is not the ordinate of a zero, then
\[
\Re\int_{\sigma_1}^{2} \frac{\zeta'}{\zeta}(\sigma + it)\,d\sigma \ge -\varepsilon \log \tau
\]
uniformly for $1/2 + 2\delta \le \sigma_1 \le 2$.
(f) Show that $\mu(\sigma) = 0$ for $1/2 < \sigma \le 2$.
(g) Deduce that $\mu(\sigma) = 1/2 - \sigma$ for $-1 \le \sigma < 1/2$.
(h) Show that
σ2
t −γ t −γ t −γ
dσ = arctan − arctan .
σ1 (σ − β)2 + (t − γ )2 σ2 − β σ1 − β
(i) Deduce that
σ2
t −γ
dσ ≤ π.
σ1 (σ − β)2 + (t − γ )2
(j) Conclude that arg ζ (1/2 + 2δ + it) = o(log τ ).
21. (Backlund 1918; cf. Littlewood 1924) Suppose now that the number of zeros
$\rho$ of $\zeta(s)$ in a rectangle $1/2 + \delta \le \beta \le 1$, $t - 1 \le \gamma \le t + 1$ is $o(\log \tau)$ as
$t \to \infty$, and put
\[
f(s) = \frac{\zeta'}{\zeta}(s) - \sum_{\rho} \frac{1}{s - \rho}
\]
where the sum is over the $o(\log \tau)$ zeros in such a rectangle.
(a) Explain why $f(s) \ll \log \tau$ in the disc $|s - 2 - it_0| \le 3/2 - 2\delta$.
(b) Explain why $f(s) = o(\log \tau)$ in the disc $|s - 2 - it_0| \le 1/2$.
(c) Use Hadamard's three circles theorem to show that $f(s) = o(\log \tau)$ for
$|s - 2 - it_0| \le 3/2 - 3\delta$.
(d) Deduce that $\zeta(1/2 + 3\delta + it) \ll \tau^{\varepsilon}$.
(e) Suppose that our hypothesis concerning the number of zeros in a
rectangle holds for every fixed positive $\delta$. Deduce that $\mu(\sigma) = 0$ for
$\sigma > 1/2$.
(f) By Exercise 19(d), conclude that $\mu(1/2) = 0$, i.e., that LH follows.

22. For $0 < \alpha \le 1$ and $\sigma > 1$ let $\zeta(s, \alpha) = \sum_{n=0}^{\infty} (n + \alpha)^{-s}$ be the Hurwitz zeta
function.
(a) Show that
\[
\zeta(s, \alpha)\Gamma(s) = \int_0^{\infty} \frac{x^{s-1} e^{-\alpha x}}{1 - e^{-x}}\,dx
\]
for $\sigma > 1$.
(b) Let
\[
I(s, \alpha) = \int_{C(r)} \frac{z^{s-1} e^{-\alpha z}}{1 - e^{-z}}\,dz
\]
where $C(r)$ is a contour that runs by a straight line from $ir + \infty$ to $ir$,
by a semicircle from $ir$ through $-r$ to $-ir$, and then by a straight line
from $-ir$ to $-ir + \infty$. Note that the value of $I(s, \alpha)$ is independent
of $r$ for $0 < r < 2\pi$. By letting $r \to 0$ show that $I(s, \alpha) = (e^{2\pi i s} - 1)
\zeta(s, \alpha)\Gamma(s)$ for $\sigma > 1$.
(c) By means of (C.6), show that
\[
\zeta(s, \alpha) = \frac{\Gamma(1 - s) e^{-\pi i s}}{2\pi i}\, I(s, \alpha)
\]
for $\sigma > 1$.
(d) Show that $I(s, \alpha)$ is an entire function of $s$. Deduce by the above that
$\zeta(s, \alpha)$ is meromorphic.
(e) Show that $I(k, \alpha) = 0$ for $k = 2, 3, \ldots$.
(f) Show that $I(1, \alpha) = 2\pi i$.
(g) Deduce that $\zeta(s, \alpha)$ is analytic everywhere except for a simple pole at
$s = 1$ with residue 1.
(h) Show that if $k$ is an integer, then
\[
I(k, \alpha) = \int_{|z|=1} \frac{z e^{(1-\alpha)z}}{e^z - 1}\, z^{k-2}\,dz.
\]
(i) By Exercise B.3, deduce that if $k$ is a non-negative integer, then
\[
I(-k, \alpha) = 2\pi i\, B_{k+1}(1 - \alpha)/(k + 1)!.
\]
(j) By Theorem B.1, deduce that if $k$ is a positive integer then
\[
\zeta(1 - k, \alpha) = \frac{-B_k(\alpha)}{k}.
\]
In particular, $\zeta(0, \alpha) = 1/2 - \alpha$.
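23. (Lerch 1894; cf. Berndt 1985) Let $\alpha$ be fixed, $0 < \alpha \le 1$. (a) Show that
\[
\zeta(s, \alpha) - \zeta(s) = \alpha^{-s} + \sum_{n=1}^{\infty} \big((n + \alpha)^{-s} - n^{-s}\big)
\]
for $\sigma > 0$.
(b) Show that
\[
(n + \alpha)^{-s} - n^{-s} + \alpha s n^{-s-1} = s(s + 1)\int_n^{n+\alpha} (n + \alpha - u) u^{-s-2}\,du.
\]
(c) Deduce that
\[
\zeta(s, \alpha) - \zeta(s) + \alpha s \zeta(s + 1) = \alpha^{-s} + \sum_{n=1}^{\infty} \big((n + \alpha)^{-s} - n^{-s} + \alpha s n^{-s-1}\big)
\]
for $\sigma > -1$, and that the series is locally uniformly convergent in this
half-plane.
(d) Show that
\[
\begin{aligned}
&\zeta'(s, \alpha) - \zeta'(s) + \alpha\zeta(s + 1) + \alpha s \zeta'(s + 1) \\
&\quad = -\alpha^{-s}\log\alpha + \sum_{n=1}^{\infty} \Big(\frac{-\log(n + \alpha)}{(n + \alpha)^s} + \frac{\log n}{n^s} + \frac{\alpha}{n^{s+1}} - \frac{\alpha s \log n}{n^{s+1}}\Big)
\end{aligned}
\]
for $\sigma > -1$. (Here $\zeta'(s, \alpha)$ is meant to denote $\frac{\partial}{\partial s}\zeta(s, \alpha)$.)
(e) By Corollary 1.16, or otherwise, show that
\[
\lim_{s \to 0}\big(\zeta(s + 1) + s\zeta'(s + 1)\big) = C_0.
\]
(f) Deduce that
\[
\zeta'(0, \alpha) - \zeta'(0) + \alpha C_0 = -\log\alpha + \sum_{n=1}^{\infty} \big(-\log(n + \alpha) + \log n + \alpha/n\big).
\]
By (10.14) and the definition (C.1) of the gamma function, conclude
that
\[
\zeta'(0, \alpha) = \log\frac{\Gamma(\alpha)}{\sqrt{2\pi}}.
\]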
24. (a) Let $\chi$ be a character modulo $q$. Show that
\[
L(s, \chi) = q^{-s}\sum_{a=1}^{q} \chi(a)\zeta(s, a/q).
\]
(b) Show that if $\chi$ is a non-principal character modulo $q$, then
\[
L(0, \chi) = \frac{-1}{q}\sum_{a=1}^{q} \chi(a) a.
\]
(c) Show that if $\chi$ is a non-principal character modulo $q$, then
\[
L'(0, \chi) = -L(0, \chi)\log q + \sum_{a=1}^{q} \chi(a)\log\Gamma(a/q).
\]

25. Let Q(x, y) = ax + bx y + cy where a, b, c are real numbers, and put

2 2

d = b2 − 4ac. Suppose that Q is positive-deﬁnite, which is to say that

a > 0 and d < 0. For z with z > 0, put
√
ϑ Q (z) = e−2π Q(m,n)z/ −d .
m,n∈Z

(a) Show that

√ √
e−π zn −d/(2a)
e−2πa(m+bn/(2a)) z/ −d
2 2
ϑ Q (z) = .
n m

(b) Apply Theorem 10.1 to the inner sum, take the sum over n inside, and
apply Theorem 10.1 a second time to show that ϑ Q (z) = ϑ Q (1/z)/z.

(c) For σ > 1 put

ζ Q (s) = Q(m, n)−s .
(m,n)=(0,0)

Show that if z ≥ 0, then

ζ Q (s) (s)(−d)s/2 (2π)−s

2π Q(m, n)z
= (−d)s/2 (2π)−s Q(m, n)−s s, √
(m,n)=(0,0) −d

2π Q(m, n)
+ (−d)(1−s)/2 (2π )s−1 Q(m, n)s−1 1 − s, √
(m,n)=(0,0) z −d
z s−1 z −s
+ − .
2(s − 1) 2s

(d) Deduce that ζ Q (s) is a meromorphic function whose only singularity

√
is a simple pole at s = 1 with residue π/ −d.
(e) Put ξ Q (s) = ζ Q (s) (s)(−d)s/2 (2π)−s . Show that ξ Q (s) = ξ Q (1 − s)
for all s except s = 0, s = 1.
(f) Show that ζ Q (0) = −1/2.
(g) Show that ζ Q (−k) = 0 for all positive integers k.
26. Let K be an algebraic number ﬁeld. The Dedekind zeta function of K is de-

ﬁned to be ζ K (s) = a N (a)−s for σ > 1, where the sum is over all integral
ideals in the ring O K of algebraic integers in K . This is a natural general-
ization of the Riemann zeta function, and indeed ζQ (s) = ζ (s). Since ideals
in O K factor uniquely into prime ideals, and since N (ab) = N (a)N (b) for
any pair a, b of ideals, it follows that

ζ K (s) = (1 − N (p)−s )−1

for σ > 1. Let d denote the discriminant of K . In the case that K is a

quadratic field, by analysing how rational primes split in K it emerges
that ζ K (s) = ζ (s)L(s, χd ) where χd (n) = dn K is the Kronecker symbol.
Thus the functional equations of ζ (s) and of L(s, χd ) give a functional
equation for ζ K (s) in this case. From now on, suppose √ that K is a com-
plex quadratic field, which is to say that K = Q( d) where d < 0 is a
fundamental quadratic discriminant. Let w denote the number of units in
O K , which is to say that w = 6 if d = −3, w = 4 if d = −4, and w = 2
if d < −4. Let h be the class number of K . Then there are precisely h
reduced positive definite binary quadratic forms of discriminant d, say
Q 1 , Q 2 , . . . , Q h . As m and n run over integral values, (m, n) = (0, 0), the
values Q i (m, n) run over the the values N (a) for ideals a in the i th ideal
class Ci , each value being taken exactly w times. Thus

ζ Q i (s) = w N (a)−s
a∈Ci

in the notation of the preceding exercise, and

1 h
ζ K (s) = ζ Q (s).
w i=1 i

(a) For z > 0, let

h
∞ √
ϑK (z) = ϑ Q i (z) = h + w r (n)e−2π nz/−d

i=1 n=1

where r (n) = r K (n) = k|n χd (k) is the number of ideals in O K with
norm n. Show that ϑK (z) = ϑK (1/z)/z.
(b) Show that if z ≥ 0, then

ζ K (s) (s)(−d)s/2 (2π)−s

∞
√
= (−d)s/2 (2π )−s r (n)n −s s, 2πnz/ −d
n=1

∞
√
+ (−d)(1−s)/2 (2π )s−1 r (n)n s−1 1 − s, 2π n/ z −d
n=1
s−1 s
hz hz
+ − .
2w(s − 1) 2ws
(c) Deduce that ζ K (s) is a meromorphic function whose only singularity
√
is a simple pole at s = 1 with residue hπ/(w −d).
(d) Put ξ K (s) = ζ K (s) (s)(−d)s/2 (2π)−s . Show that ξ K (s) = ξ K (1 − s)
for all s except s = 1 and s = 0.
(e) Show that ζ K (0) = −h/(2w).
(f) Show that ζ K (−k) = 0 for all positive integers k.
(g) Show that r (n 2 ) ≥ 1 for all positive integers n.
(h) Show that if L(1/2, χd ) ≥ 0, then h (−d)1/4 log(−d).
27. Let $\alpha$ be an arbitrary complex number and $z$ a complex number with $\Re z > 0$.
Let $f(u) = e^{-\pi(u+\alpha)^2 z}$. Show that $\hat f(t) = z^{-1/2} e^{2\pi i t\alpha} e^{-\pi t^2/z}$. Deduce that
the identities of Theorem 10.1 hold for arbitrary complex $\alpha$.

√
28. Grössencharaktere for Q( −1), continued from Exercises 4.2.7 and
4.3.10. (a) By two applications of the preceding exercise, show that if z
and w are complex numbers with z > 0, then

1 −π (c2 +d 2 )/z 2πi(c+id)w/z
e−π(a +b ) e2πi(a+ib)w =
2 2
e e .
a,b∈Z
z c,d∈Z
(b) Differentiate both sides of the above m times with respect to w, and
then set w = 0, to show that
1 −π (c2 +d 2 )/z
e−π (a +b )z (a + ib)m = m+1
2 2
e (c + id)m .
a,b
z c,d

(c) Explain why the above reduces to 0 = 0 if 4 m.

(d) Let χm and L(s, χm ) be deﬁned as before. Show that if m is a positive
integer and z ≥ 0, then
L(s, χm ) (s + 2m)π −s
π −s χm (a + ib)
= (s + 2m, π (a 2 + b2 )z)
4 (a,b)=(0,0) (a 2 + b2 )s
π s−1 χm (a + ib)
+ (1 − s + 2m, π (a 2 + b2 )/z).
4 (a,b)=(0,0) (a 2 + b2 )1−s

(e) Deduce that L(s, χm ) is an entire function when m is a non-zero integer.

(f) For each positive integer m, put ξ (s, χm ) = L(s, χm ) (s + 2m)π −s .
Show that ξ (s, χm ) = ξ (1 − s, χm ) for all s.
(g) Show that if m is a positive integer, then L(s, χm ) has simple zeros
at −2m, −2m − 1, −2m − 2, . . . , but no other zeros in the half-plane
σ < 0.
(h) Show that ξ (σ, χm ) is real for all real σ , and that ξ (1/2 + it, χm ) is real
for all real t.

10.2 Products and sums over zeros

If $P(z)$ is a polynomial, then we may express $P(z)$ as a product over its zeros
$z_i$,
\[
P(z) = c(z - z_1)(z - z_2) \cdots (z - z_n).
\]
The question arises whether a more general entire function may be similarly
represented as a product over its zeros, say
\[
f(z) = c\prod_{n}\Big(1 - \frac{z}{z_n}\Big). \tag{10.21}
\]
This is an issue that was addressed by Weierstrass and Hadamard. Rather than
derive their extensive theory, we establish only a simple part of it that suffices
for our purposes. We do not quite achieve a formula of the type (10.21) for the
zeta function, but we obtain a serviceable substitute.

Lemma 10.11 Suppose that $f(z)$ is an entire function with a zero of order $K$
at 0, and that $f(z)$ vanishes at the non-zero numbers $z_1, z_2, z_3, \ldots$. Suppose
also that there is a constant $\theta$, $1 < \theta < 2$, such that
\[
\max_{|z| \le R} |f(z)| \le \exp(R^{\theta})
\]
for all sufficiently large $R$. Then there exist numbers $A = A(f)$ and $B = B(f)$,
such that
\[
f(z) = z^K e^{A+Bz}\prod_{k=1}^{\infty}\Big(1 - \frac{z}{z_k}\Big) e^{z/z_k}
\]
for all $z$. Here the product is uniformly convergent for $z$ in compact sets.

Proof We may suppose that $K = 0$, since if $K > 0$ then the function $f(z)/z^K$
does not vanish at the origin. Let $N_f(R)$ denote the number of zeros of $f(z)$ in the
disc $|z| \le R$. By Jensen's inequality (Lemma 6.1) we find that $N_f(R) \le 8R^{\theta}$ for
all sufficiently large $R$. Thus $\sum_{R < |z_k| \le 2R} |z_k|^{-2} \le 8R^{\theta-2}$, so by summing over
dyadic blocks we see that $\sum_{k=1}^{\infty} |z_k|^{-2} < \infty$. (Alternatively, if more precision
were desired, we could write this sum as $\int_0^{\infty} r^{-2}\,dN_f(r)$, and integrate by parts.)
But $(1 - z)e^{z} = 1 + O(|z|^2)$ uniformly for $|z| \le 1$, so the product
\[
g(z) = \prod_{k=1}^{\infty}\Big(1 - \frac{z}{z_k}\Big) e^{z/z_k}
\]
is uniformly convergent in compact regions, and hence represents an entire
function. Thus $h(z) = f(z)/(f(0)g(z))$ is a non-vanishing entire function with
$h(0) = 1$.
Next we derive an upper bound for $M_h(R)$. To this end we write the product
above in three parts,
\[
g(z) = \prod_{k \in K_1}\prod_{k \in K_2}\prod_{k \in K_3} = P_1(z)P_2(z)P_3(z),
\]
where $|z_k| \le R/2$ for $k \in K_1$, $R/2 < |z_k| \le 3R$ for $k \in K_2$, and $|z_k| > 3R$ for
$k \in K_3$. Suppose that $R \le |z| \le 2R$. If $|z_k| \le R/2$, then $|1 - z/z_k| \ge |z/z_k| -
1 \ge 1$, and hence
\[
|P_1(z)| \ge \prod_{k \in K_1} e^{-2R/|z_k|}.
\]
Now
\[
\sum_{k \in K_1} \frac{1}{|z_k|} \ll R^{\theta-1}.
\]
Thus
\[
|P_1(z)| \ge e^{-cR^{\theta}}
\]
for all large $R$. Since $\operatorname{card} K_2 \le 72R^{\theta}$, it follows that there is an $r$, $R \le r \le 2R$,
for which $|r - |z_k|| \ge 1/R^2$ for all $k$. If $r$ is chosen in this way and $|z| = r$, then
\[
|1 - z/z_k| \ge \frac{|r - |z_k||}{|z_k|} \ge \frac{1}{27R^3}
\]
for all $k \in K_2$. Hence
\[
|P_2(z)| \ge e^{-cR^{\theta}\log R}
\]
when $|z| = r$. Finally,
\[
|P_3(z)| \ge \prod_{k \in K_3} e^{-cR^2/|z_k|^2} \ge e^{-cR^{\theta}}
\]
for $|z| \le 2R$. Hence we see that for each large $R$ there is an $r$, $R \le r \le 2R$, for
which $|g(z)| \ge e^{-cR^{\theta}\log R}$ when $|z| = r$. Thus $|h(z)| \le e^{cR^{\theta}\log R}$ for such $z$, and
hence by the maximum modulus principle
\[
M_h(R) \le e^{cR^{\theta}\log R}.
\]
Now put $j(z) = \log h(z)$ with $j(0) = 0$. Then $\Re j(z) \le cR^{\theta}\log R$ for all large
$R$, so that by the Borel–Carathéodory lemma (Lemma 6.2),
\[
j(z) \ll R^{\theta}\log R
\]
for all large $R$. But $\theta < 2$, so $j(z)$ must be a polynomial of degree at most 1,
say $j(z) = A + Bz$, and the proof is complete.
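
A familiar instance of a product of this shape may help fix ideas; it is an illustration chosen here and is not taken from the text. The function $f(z) = \sin(\pi z)/\pi$ is entire, satisfies $|f(z)| \le e^{\pi|z|}$, has a simple zero at 0 (so $K = 1$) and zeros at the non-zero integers, and the classical product for the sine corresponds to $A = B = 0$. The sketch below multiplies the factors $(1 - z/k)e^{z/k}$ over $0 < |k| \le N$ and compares the result with $\sin(\pi z)/\pi$; the truncation $N$ is an arbitrary choice, and the relative error of the truncated product is of order $|z|^2/N$.

```python
import cmath
import math

def truncated_product(z, N=20000):
    """z * product over 0 < |k| <= N of (1 - z/k) exp(z/k), the truncated
    product of Lemma 10.11 for f(z) = sin(pi z)/pi with K = 1 and A = B = 0."""
    prod = complex(1.0)
    for k in range(1, N + 1):
        prod *= (1 - z / k) * cmath.exp(z / k)    # factor for the zero at k
        prod *= (1 + z / k) * cmath.exp(-z / k)   # factor for the zero at -k
    return z * prod

def sine_over_pi(z):
    """The entire function sin(pi z)/pi."""
    return cmath.sin(math.pi * z) / math.pi

for z in (0.5, 0.3 + 0.4j, -1.25j):
    print(z, truncated_product(z), sine_over_pi(z))
```

The exponential factors attached to $k$ and $-k$ cancel in pairs here, which is why the product collapses to the classical $\prod (1 - z^2/k^2)$; for the zeros of $\xi(s)$ no such exact cancellation is assumed in the lemma.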

In order to apply our lemma to $\xi(s)$ we need an upper bound for $|\xi(s)|$. From
Corollary 1.17 we see that $\zeta(s) \ll |s|^{1/2}$ when $\sigma \ge 1/2$ and $|s| \ge 2$. Thus by
Stirling's formula (Theorem C.1) it follows that
\[
\xi(s) \ll \exp(|s| \log |s|) \tag{10.22}
\]
when $\sigma \ge 1/2$ and $|s| \ge 2$. In view of the functional equation found in Corollary
10.3, this same upper bound therefore holds for all $s$ with $|s| \ge 2$. Since
\[
\xi(s) = (s - 1)\zeta(s)\Gamma(1 + s/2)\pi^{-s/2}, \tag{10.23}
\]
it follows from (10.11) that $\xi(0) = 1/2$. Thus by Lemma 10.11 we obtain
Theorem 10.12 Let $\xi(s)$ be defined as in Corollary 10.3. There is a constant
$B$ such that
\[
\xi(s) = \frac12\, e^{Bs}\prod_{\rho}\Big(1 - \frac{s}{\rho}\Big) e^{s/\rho} \tag{10.24}
\]
for all $s$. Here the product is extended over all zeros $\rho$ of $\xi(s)$.

All known zeros of the zeta function are simple, and it is plausible to conjec-
ture that they all are. In the (unlikely) event that a multiple zero is encountered,
the associated factor in the above product is to be repeated as many times as
the multiplicity.
Thus far we have remarked upon the zeros of $\xi(s)$ without having proved
that they exist. However, from (10.24) we see that if $\xi(s)$ had at most finitely
many zeros then there would be a constant $C > 0$ such that $\xi(s) \ll \exp(C|s|)$
for all large $s$. On the contrary, by Stirling’s formula we find that
$\xi(\sigma) = \exp\big(\tfrac{1}{2}\sigma \log \sigma + O(\sigma)\big)$ as $\sigma \to \infty$, so it is evident that $\xi(s)$ has
infinitely many zeros. Concerning the density of the zeros, the following estimate
is useful.

Theorem 10.13 For $T \ge 0$, let $N(T)$ denote the number of zeros $\rho = \beta + i\gamma$
of $\xi(s)$ in the rectangle $0 < \beta < 1$, $0 < \gamma \le T$. Any zeros with $\gamma = T$ should
be counted with weight 1/2. Then
\[
N(T + 1) - N(T) \ll \log(T + 2).
\]

Proof We apply Jensen’s inequality (Lemma 6.1) to $\xi(s)$, on a disc with centre
$2 + i(T + 1/2)$ and radius $R = 11/6$. By taking $r = 7/4$, it follows from the
estimates of Corollary 1.17 that the number of zeros $\rho$ in the rectangle
$1/2 \le \beta \le 1$, $T \le \gamma \le T + 1$ is $\ll \log(T + 2)$. (Alternatively, we could appeal to
Theorem 6.8.) But $\rho$ is a zero if and only if $1 - \rho$ is a zero, so the rectangle
$0 \le \beta \le 1/2$, $T \le \gamma \le T + 1$ contains the same number of zeros as the former
one. Thus we have the result.

By summing the above over integral values of $T$, we deduce that
$N(T) \ll T \log T$. Alternatively, this same upper bound follows from (10.22) by means
of Jensen’s inequality. Hence $\sum_{\rho} |\rho|^{-A} < \infty$ for all $A > 1$. With a little more
work we could show that $\sum_{\rho} 1/|\rho| = \infty$ (see Exercise 10.1), and indeed that
$N(T) \gg T \log T$ for all large $T$ (see Exercise 10.4). A much more precise
asymptotic formula for $N(T)$ will be derived in Chapter 14.
We recall that the logarithmic derivative of a function $f(z)$ is defined to
be $f'(z)/f(z)$. Since $f'(z)/f(z) = \frac{d}{dz} \log f(z)$, it follows that the logarithmic
derivative of a product is the sum of the logarithmic derivatives of the factors.
Although $\log f(z)$ is multiple-valued, the ambiguity involves only an additive
constant, so $f'(z)/f(z)$ is a well-defined single-valued analytic function wherever
$f(z)$ is analytic and non-zero. If $f$ has a zero at $a$ of multiplicity $m$, then
$f'/f$ has a simple pole at $a$ with residue $m$. If $f$ has a pole at $a$ of multiplicity $m$
then $f'/f$ has a simple pole at $a$ with residue $-m$. Hence if $f$ is meromorphic
then $f'/f$ is meromorphic with only simple poles, which occur at the zeros and
poles of $f$.

By taking logarithmic derivatives in the definition (10.5) of $\xi(s)$ we find that
\[
\frac{\xi'}{\xi}(s) = \frac{1}{s} + \frac{1}{s-1} + \frac{\zeta'}{\zeta}(s) + \frac{1}{2}\frac{\Gamma'}{\Gamma}(s/2) - \frac{1}{2}\log \pi. \tag{10.25}
\]
By taking logarithmic derivatives in the functional equation of Corollary 10.3
we see that
\[
\frac{\xi'}{\xi}(s) = -\frac{\xi'}{\xi}(1 - s). \tag{10.26}
\]
By logarithmically differentiating the asymmetric form (10.9) of the functional
equation, we discover that
\[
\frac{\zeta'}{\zeta}(s) = -\frac{\zeta'}{\zeta}(1 - s) + \log 2\pi - \frac{\Gamma'}{\Gamma}(1 - s) + \frac{\pi}{2}\cot\frac{\pi s}{2}. \tag{10.27}
\]
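By taking logarithmic derivatives of both sides of the identity (10.24) we obtain
Corollary 10.14 Let $B$ be defined as in Theorem 10.12. Then
\[
\frac{\xi'}{\xi}(s) = B + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big) \tag{10.28}
\]
and
\[
\frac{\zeta'}{\zeta}(s) = B + \frac{1}{2}\log \pi - \frac{1}{s-1} - \frac{1}{2}\frac{\Gamma'}{\Gamma}(s/2 + 1) + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big). \tag{10.29}
\]
Moreover,
\[
B = -\frac{1}{2}\sum_{\rho} \Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big) = -\sum_{\rho} \Re\frac{1}{\rho} = -\frac{C_0}{2} - 1 + \frac{1}{2}\log 4\pi = -0.0230957\ldots. \tag{10.30}
\]
In the above, it is to be understood that if $\xi(s)$ has a multiple zero $\rho$, then the
summand arising from $\rho$ is to be repeated as many times as the multiplicity.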
Proof The second identity follows from the first by means of (10.25). As for
(10.30), we observe first by taking $s = 0$ in (10.28) that $B = \frac{\xi'}{\xi}(0)$. Also, by
taking $s = 1$ in (10.28) we find that $\frac{\xi'}{\xi}(1) = B + \sum_{\rho} (1/(1 - \rho) + 1/\rho)$. By
(10.26), this is $-B$, so we obtain the first identity in (10.30). Since $B$ is real,
we may write
\[
B = -\frac{1}{2}\sum_{\rho} \Re\Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big).
\]
However, $\sum_{\rho} \Re\, 1/(1 - \rho)$ and $\sum_{\rho} \Re\, 1/\rho$ are absolutely convergent, so these
two sums may be written separately, above. Since $1 - \rho$ runs over zeros of
the zeta function as $\rho$ does, the two sums are equal, and we obtain the second
identity in (10.30). By logarithmically differentiating the fundamental identity
$s\Gamma(s) = \Gamma(s + 1)$ we see that $1/s + \frac{\Gamma'}{\Gamma}(s) = \frac{\Gamma'}{\Gamma}(s + 1)$. Hence (10.25) may be
rewritten as
\[
\frac{\xi'}{\xi}(s) = \frac{1}{s-1} + \frac{\zeta'}{\zeta}(s) + \frac{1}{2}\frac{\Gamma'}{\Gamma}(s/2 + 1) - \frac{1}{2}\log \pi.
\]
We obtain the third identity in (10.30) by taking $s = 0$ in the above, in view of
(10.11), (10.14), and (C.12).
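
As a quick numerical sanity check of the closed form in (10.30), the following minimal sketch evaluates $-C_0/2 - 1 + \frac{1}{2}\log 4\pi$ in floating point. The names `EULER_C0` and `hadamard_constant_b` are illustrative choices, not notation from the text, and the value of Euler's constant is hard-coded because the Python standard library does not provide it.

```python
import math

# Euler's constant C_0 (hard-coded; an assumption of this sketch, since the
# standard library has no named constant for it).
EULER_C0 = 0.5772156649015329


def hadamard_constant_b():
    """Evaluate the third expression in (10.30): -C_0/2 - 1 + (1/2) log(4*pi)."""
    return -EULER_C0 / 2 - 1 + 0.5 * math.log(4 * math.pi)


B = hadamard_constant_b()
print(f"B = {B:.7f}")
```

The printed value agrees with the decimal expansion quoted in (10.30) to the digits shown there.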

In order to extend our theory to include L-functions, we need an upper bound
for $|L(s, \chi)|$ that corresponds to the bound for the zeta function provided by
Corollary 1.17.
Lemma 10.15 Let $\chi$ be a non-principal character modulo $q$, and suppose
that $\delta > 0$ is fixed. Then
\[
L(s, \chi) \ll \big(1 + (q\tau)^{1-\sigma}\big) \min\Big(\frac{1}{|\sigma - 1|}, \log q\tau\Big)
\]
uniformly for $\delta \le \sigma \le 2$.
Landau noted that an estimate relating to the zeta function often has a
‘q-analogue’ in which $n^{-it}$ is replaced by $\chi(n)$ and $\tau$ is replaced by $q$. In
the above we have a ‘hybrid’ of the two, with $\chi(n)n^{-it}$ and $q\tau$ throughout.

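Proof Let $S(u, \chi) = \sum_{0 < n \le u} \chi(n)$. Then for $\sigma > 0$,
\begin{align*}
L(s, \chi) &= \sum_{n \le x} \chi(n)n^{-s} + \int_x^{\infty} u^{-s}\, dS(u, \chi) \\
&= \sum_{n \le x} \chi(n)n^{-s} + \Big[S(u, \chi)u^{-s}\Big]_x^{\infty} - \int_x^{\infty} S(u, \chi)\, du^{-s} \\
&= \sum_{n \le x} \chi(n)n^{-s} - S(x, \chi)x^{-s} + s\int_x^{\infty} S(u, \chi)u^{-s-1}\, du.
\end{align*}
This is analogous to Theorem 1.12. To estimate the sum we use (1.29). For the
remaining terms we use the trivial estimate $S(u, \chi) \ll q$. The stated estimate
then follows by taking $x = q\tau$.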

Now suppose that $\chi$ is a primitive character modulo $q$, $q > 1$. By Stirling’s
formula we see that $\xi(s, \chi) \ll q^{1/2+\sigma} \exp(|s| \log |s|)$ when $\sigma \ge 1/2$ and
$|s| \ge 2$. By the functional equation of Corollary 10.8, it follows that
\[
\xi(s, \chi) \ll \exp(|s| \log q|s|) \tag{10.31}
\]
for all $s$ with $|s| \ge 2$. Hence by Lemma 10.11 we obtain

Theorem 10.16 Let $\chi$ be a primitive character modulo $q$, $q > 1$, and let
$\xi(s, \chi)$ be defined as in Corollary 10.8. There is a constant $B(\chi)$ such that
\[
\xi(s, \chi) = \xi(0, \chi)\, e^{B(\chi)s} \prod_{\rho} \Big(1 - \frac{s}{\rho}\Big) e^{s/\rho} \tag{10.32}
\]
for all $s$. Here the product is extended over all zeros $\rho$ of $\xi(s, \chi)$.

We expect that the zeros of ξ (s, χ ) are all simple, but if a multiple zero is
encountered, then the factor that it contributes to the above product is to be
repeated as many times as its multiplicity. In analogy to Theorem 10.13, we
have

Theorem 10.17 Let $\chi$ be a character modulo $q$. The number of zeros
$\rho = \beta + i\gamma$ of $L(s, \chi)$ in the rectangle $0 \le \beta \le 1$, $T \le \gamma \le T + 1$ is
$\ll \log q(|T| + 2)$.

Proof First suppose that $\chi$ is primitive. We apply Jensen’s inequality
(Lemma 6.1) to $L(s, \chi)$, on a disc with centre $2 + i(T + 1/2)$ and radius
$R = 11/6$. By taking $r = 7/4$, it follows from the estimates of Lemma 10.15
that the number of zeros $\rho$ in the rectangle $1/2 \le \beta \le 1$, $T \le \gamma \le T + 1$ is
$\ll \log q(|T| + 2)$. But $L(\rho, \chi) = 0$ if and only if $L(1 - \overline{\rho}, \chi) = 0$ (except possibly
for a trivial zero at $s = 0$ if $\chi(-1) = 1$), so the rectangle $0 \le \beta \le 1/2$,
$T \le \gamma \le T + 1$ contains the same number of zeros as (or at most one more
than) the former one. Thus we have the result when $\chi$ is primitive.
Suppose now that $\chi$ is induced by a primitive character $\chi^{\star}$ modulo $r$, with
$r \mid q$. Then
\[
L(s, \chi) = L(s, \chi^{\star}) \prod_{\substack{p \mid q \\ p \nmid r}} \Big(1 - \frac{\chi^{\star}(p)}{p^s}\Big).
\]
Here each factor in the product has zeros forming an arithmetic progression
on the imaginary axis with common difference $2\pi i/\log p$. Thus $L(s, \chi)$ has
$\ll \log r(|T| + 2)$ zeros of $L(s, \chi^{\star})$, and additionally has $\ll \sum_{p \mid q} \log p \le \log q$
zeros on the imaginary axis with imaginary part between $T$ and $T + 1$. This
completes the argument.
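
The arithmetic progression of zeros of a single factor $1 - \chi^{\star}(p)p^{-s}$ can be made concrete with a short sketch. It assumes $\chi^{\star}(p) = e^{i\varphi}$ is unimodular, so that the zeros are $s = i(\varphi + 2\pi k)/\log p$; the function names, the parameter `phase` (standing for $\varphi$) and the sample prime are illustrative choices only.

```python
import cmath
import math


def local_factor(s, p, chi_p=1.0):
    """Return 1 - chi_p * p**(-s), one factor of the finite product over p | q."""
    return 1 - chi_p * cmath.exp(-s * math.log(p))


def zero_ordinates(p, T, phase=0.0):
    """Ordinates gamma in [T, T + 1] with 1 - e^{i*phase} * p^{-i*gamma} = 0."""
    log_p = math.log(p)
    step = 2 * math.pi / log_p   # common difference of the progression
    base = phase / log_p         # one zero; the others are base + k * step
    k_min = math.ceil((T - base) / step)
    k_max = math.floor((T + 1 - base) / step)
    return [base + k * step for k in range(k_min, k_max + 1)]


p = 1009  # a prime with log p > 2*pi, so every unit window contains a zero
gammas = zero_ordinates(p, T=100.0)
print(len(gammas), [abs(local_factor(1j * g, p)) for g in gammas])
```

A window of length 1 contains at most $\frac{\log p}{2\pi} + 1$ such ordinates, which is the source of the bound $\ll \log p$ for each factor used in the proof.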

Suppose that $\chi$ is a primitive character modulo $q$. By taking logarithmic
derivatives in the definition (10.18) of $\xi(s, \chi)$, we see that
\[
\frac{\xi'}{\xi}(s, \chi) = \frac{L'}{L}(s, \chi) + \frac{1}{2}\frac{\Gamma'}{\Gamma}((s + \kappa)/2) + \frac{1}{2}\log\frac{q}{\pi}. \tag{10.33}
\]
By taking logarithmic derivatives in the functional equation of Corollary 10.8
we see that
\[
\frac{\xi'}{\xi}(s, \chi) = -\frac{\xi'}{\xi}(1 - s, \overline{\chi}). \tag{10.34}
\]
By logarithmically differentiating the asymmetric form of the functional equation
found in Corollary 10.9, we discover that
\[
\frac{L'}{L}(s, \chi) = -\frac{L'}{L}(1 - s, \overline{\chi}) - \log\frac{q}{2\pi} - \frac{\Gamma'}{\Gamma}(1 - s) + \frac{\pi}{2}\cot\frac{\pi}{2}(s + \kappa). \tag{10.35}
\]

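By taking logarithmic derivatives of both sides of the identity (10.32) we
obtain

Corollary 10.18 Let $\chi$ be a primitive character modulo $q$, $q > 1$, and let
$B(\chi)$ be defined as in Theorem 10.16. Then
\[
\frac{\xi'}{\xi}(s, \chi) = B(\chi) + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big) \tag{10.36}
\]
and
\[
\frac{L'}{L}(s, \chi) = B(\chi) - \frac{1}{2}\frac{\Gamma'}{\Gamma}((s + \kappa)/2) - \frac{1}{2}\log\frac{q}{\pi} + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big). \tag{10.37}
\]
Moreover,
\[
\Re B(\chi) = -\frac{1}{2}\sum_{\rho} \Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big) = -\sum_{\rho} \Re\frac{1}{\rho} \tag{10.38}
\]
and
\[
B(\chi) = -\frac{1}{2}\log\frac{q}{\pi} - \frac{L'}{L}(1, \overline{\chi}) + \frac{1}{2}C_0 + (1 - \kappa)\log 2. \tag{10.39}
\]
As always, multiple zeros are counted multiply.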

Proof The second identity follows from the first by means of (10.33). To
obtain the first identity in (10.38), we take $s = 1$ in (10.36), and apply (10.34)
to see that
\[
B(\chi) + \sum_{\rho} \Big(\frac{1}{1-\rho} + \frac{1}{\rho}\Big) = \frac{\xi'}{\xi}(1, \chi) = -\frac{\xi'}{\xi}(0, \overline{\chi}) = -B(\overline{\chi}) = -\overline{B(\chi)}.
\]
From Theorem 10.17 we know that the number of zeros $\rho$ of $\xi(s, \chi)$ with $|\rho| \le R$
is $\ll R \log qR$ for $R \ge 2$. Hence the sums $\sum_{\rho} \Re\, 1/(1 - \rho)$ and $\sum_{\rho} \Re\, 1/\rho$
are absolutely convergent. As the map $\rho \mapsto 1 - \overline{\rho}$ merely permutes zeros of
$\xi(s, \chi)$, the first of these two sums is unchanged if we replace $\rho$ by $1 - \overline{\rho}$.
Hence the two sums are equal, and we obtain the second part of (10.38).
To derive (10.39) we first take $s = 0$ in (10.36) to see that $B(\chi) = \frac{\xi'}{\xi}(0, \chi)$.
By (10.34) it follows that $B(\chi) = -\frac{\xi'}{\xi}(1, \overline{\chi})$. The stated identity now follows
by taking $s = 1$ in (10.33), in view of (C.11) and (C.14).

10.2.1 Exercises
1. Let $f$ satisfy the hypotheses of Lemma 10.11, and suppose that
\[
\sum_{k=1}^{\infty} \frac{1}{|z_k|} < \infty.
\]
(a) Show that there are numbers $A$ and $B$ and a non-negative integer $K$ such
that $f(z) = z^K e^{A+Bz} g(z)$ where $g(z) = \prod_{k=1}^{\infty} (1 - z/z_k)$.
(b) Observe that for any complex number $w$, $|1 - w| \le e^{|w|}$ and show that
there is a number $C$ such that $|g(z)| \le e^{C|z|}$.
(c) Deduce that $\sum_{\rho} 1/|\rho| = \infty$ where the sum is over all non-trivial zeros
of the zeta function.
2. (a) Let $B$ be the constant given in (10.30). Show that if $\rho = 1/2 + i\gamma$ is a
zero of the zeta function on the critical line, then
\[
|\gamma| \ge (-1/B - 1/4)^{1/2} = 6.5611\ldots.
\]
(b) Let $\gamma$ be given, and put $f(\beta) = \beta/(\beta^2 + \gamma^2)$. Show that if $0 \le \beta \le 1$,
then $f(\beta) \ge \beta/(1 + \gamma^2)$. Deduce that if $0 \le \beta \le 1$, then
$f(\beta) + f(1 - \beta) \ge f(0) + f(1)$.
(c) Show that if $\rho = \beta + i\gamma$ is a non-trivial zero of the zeta function with
$\beta \ne 1/2$, then
\[
|\gamma| \ge (-2/B - 1)^{1/2} = 9.2518\ldots.
\]

3. (Landau 1903) Show that
\[
\limsup_{m \to \infty} \bigg|\frac{1}{m!}\sum_{n=1}^{\infty} \frac{\mu(n)(\log n)^m}{n}\bigg|^{1/m} = \frac{1}{3}.
\]

4. (a) Show that
\[
\sum_{\rho} \frac{1}{\sigma - \rho} = \frac{1}{2}\log \sigma + O(1)
\]
for $\sigma \ge 2$, where the sum is over all non-trivial zeros of the zeta
function.
(b) Deduce that
\[
\sum_{\rho} \Re\Big(\frac{1}{\sigma - \rho} - \frac{3}{4}\cdot\frac{1}{2\sigma - \rho}\Big) = \frac{1}{8}\log \sigma + O(1)
\]
for $\sigma \ge 2$.
(c) Show that each summand above is $\le 1/(\sigma - 1)$.
(d) Show that if $|\gamma| \ge 3\sigma$ and $\sigma$ is large, then the summand arising from $\rho$
in the sum above is $\le 0$.
(e) Conclude that $N(T) \gg T \log T$ when $T$ is large.
5. Put $f(s) = \Re\Big(\dfrac{1}{s+1} - \dfrac{3/4}{s+2}\Big)$.
(a) Show that if $t \ge 2$, then
\[
\sum_{\rho} f(1 + it - \rho) = \frac{1}{8}\log t + O(1)
\]
where the sum is over all non-trivial zeros $\rho$ of $\zeta(s)$.
(b) Show that $f(s) \le 1$ when $\sigma \ge 0$.
(c) Show that if $0 \le \sigma < 2$, then $f(s) \le 0$ when
\[
t^2 \ge \frac{(\sigma + 1)(\sigma + 2)(\sigma + 5)}{2 - \sigma}.
\]
(d) Deduce that $f(s) \le 0$ if $0 < \sigma < 1$ and $|t| \ge 6$.
(e) Show that $N(T + 6) - N(T - 6) \gg \log T$ for all $T > T_0$.

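6. (a) Show that for $s$ near 1 the Laurent expansion of $\frac{\zeta'}{\zeta}(s)$ begins
\[
\frac{\zeta'}{\zeta}(s) = \frac{-1}{s-1} + C_0 + \cdots.
\]
(b) Deduce that
\[
\frac{\zeta'}{\zeta}(1 - s) = \frac{1}{s} + C_0 + O(|s|)
\]
for $s$ near 0.
(c) Show that $\frac{\Gamma'}{\Gamma}(1) = -C_0$.
(d) Show that
\[
\frac{\pi}{2}\cot\frac{\pi s}{2} = \frac{1}{s} + O(|s|)
\]
for $s$ near 0.
(e) Deduce by (10.27) that $\frac{\zeta'}{\zeta}(0) = \log 2\pi$.
(f) Use this to give a second proof that $\zeta'(0) = -\frac{1}{2}\log 2\pi$.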
7. (Taylor 1945) (a) Show that if σ > 1/2, then |ξ (s + 1/2)| > |ξ (s − 1/2)|.
(b) Put f (s) = ξ (s + 1/2) + ξ (s − 1/2). Show that all zeros of f (s) have
real part 1/2.

(c) Assume RH. Show that if c is ﬁxed, c > 0, then all zeros of ξ (s + c) +
ξ (s − c) have real part 1/2.
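8. (Vorhauer 2006) Let $B(\chi)$ denote the constant in Theorem 10.16.
(a) Show that
\[
\frac{1-\beta}{(1-\beta)^2 + \gamma^2} + \frac{\beta}{\beta^2 + \gamma^2} \ge \frac{1}{1 + \gamma^2}
\]
uniformly for $0 \le \beta \le 1$.
(b) Deduce that
\[
\Re B(\chi) \le -\frac{1}{2}\sum_{\gamma} \frac{1}{1 + \gamma^2}.
\]
(c) Show that
\[
\Re\frac{\xi'}{\xi}(2, \chi) = \frac{1}{2}\log q + O(1).
\]
(d) Show that
\[
\Re\frac{\xi'}{\xi}(2, \chi) = \sum_{\rho} \Re\frac{1}{2-\rho}.
\]
(e) Show that
\[
\Re\frac{\xi'}{\xi}(2, \chi) = \frac{1}{2}\sum_{\rho} \Re\Big(\frac{1}{2-\rho} + \frac{1}{1+\rho}\Big).
\]
(f) Show that
\[
\frac{2-\beta}{(2-\beta)^2 + \gamma^2} + \frac{1+\beta}{(1+\beta)^2 + \gamma^2} \le \frac{3}{1 + \gamma^2}
\]
uniformly for $0 \le \beta \le 1$.
(g) Conclude that
\[
\Re B(\chi) \le \frac{-1}{6}\log q + O(1).
\]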
9. Let $K > 0$ be given, and put $E(z) = (1 - z)\exp\big(\sum_{k=1}^{K} z^k/k\big)$.
(a) Show that
\[
E'(z) = -z^K \exp\Big(\sum_{k=1}^{K} \frac{z^k}{k}\Big).
\]
(b) Deduce that the power series coefficients of $E'(z)$ are all $\le 0$.
(c) Write $E(z) = \sum_{m=0}^{\infty} A_m z^m$. Show that $A_0 = 1$, $A_m = 0$ for $1 \le m \le K$,
$A_m < 0$ for $m > K$, and that $\sum_{m > K} A_m = -1$.
(d) Show that if $|z| \le r \le 1$, then $|1 - E(z)| \le 1 - E(r) \le r^{K+1}$.

10.3 Notes
Section 10.1. The case α = 0 of (10.1) was given by Poisson (1823). de la Vallée
Poussin observed that the left-hand side of (10.1) has period 1 with respect to
α, and then computed the Fourier coefﬁcients of this function to obtain (10.1).
This is rather similar to using the Poisson summation formula, as we have done.
Theorem 10.1 is the basis for a very large class of functional equations and was
ﬁrst exploited systematically by Hecke. For the most general version see Tate’s
thesis, reproduced in Tate (1967). Riemann gave two proofs of Corollary 10.3.
Riemann’s second method involved using Theorem 10.1 to establish the formula
of Exercise 10.1.10. This is the case z = 1 of Theorem 10.2, with the order of
summation and integration reversed. Theorem 10.2 is due to Lavrik (1965),
who derived it from Corollary 10.3 in the manner outlined in Exercise 10.1.4.
For further proofs of the functional equation, see Titchmarsh (1986, Chapter 2).
The proof of Theorem 10.1 can be arranged so that one does not depend on
the fact that $\int_{-\infty}^{\infty} e^{-\pi x^2}\, dx = 1$. To see this, let $c$ denote the value of this integral.
Then the proof given establishes (10.1) with the factor $c$ on the right-hand side.
But if $z = 1$ and $\alpha = 0$ the two sides of (10.1) are visibly equal and positive,
so it follows that $c = 1$.
The functional equation for ζ (s) was established by Riemann (1860), and
that for L(s, χ) by de la Vallée Poussin (1896) although it was known in some
special cases earlier. See the commentary of Landau (1909, p. 899).
Section 10.2. The product formula of Theorem 10.12 was established by
Hadamard (1893). The constant B(χ ) in Theorem 10.16 was long considered
to be mysterious; the simple formula (10.39) for it is due to Vorhauer (2006).

10.4 References
Backlund, R. J. (1918). Über die Beziehung zwischen Anwachsen und Nullstellen der
Zetafunktion, Öfv. af ﬁnska vet. soc. förh. 61A, Nr. 9.
Berndt, B. C. (1985). The gamma function and the Hurwitz zeta-function, Amer. Math.
Monthly 92, 126–130.
Hadamard, J. (1893). Étude sur les propriétés des fonctions entières et en particulier
d’une fonction considérée par Riemann, J. Math. Pures Appl. (4) 9, 171–215.
Heilbronn, H. (1938). On Dirichlet series which satisfy a certain functional equation,
Quart J. Math. Oxford Ser. 9, 194–195.
Landau, E. (1903). Über die zahlentheoretische Funktion µ(k), Sitzungsber. Kais. Akad.
Wiss. Wien 112, 537–570; Collected Works, Vol. 2. Essen: Thales Verlag, 1986,
pp. 60–93.
(1907). Bemerkungen zu einer Arbeit des Herrn V. Furlan, Rend. Circ. Mat. Palermo
23, 367–373; Collected Works, Vol. 3. Essen: Thales Verlag, 1986, pp. 316–322.

(1909). Handbuch der Lehre von der Verteilung der Primzahlen, Third edition. New
York: Chelsea, 1974.
Lavrik, A. F. (1965). The abbreviated functional equation for the L-function of Dirichlet,
Izv. Akad. Nauk UzSSR Ser. Fiz.-Mat. Nauk 9, 17–22.
Lerch, M. (1894). Weitere Studien auf dem Gebiete der Malmstén’schen Reihen. Mit
einem Briefe des Herrn Hermite, Rozpravy 3, No. 28, 63 pp.
Littlewood, J. E. (1924). On the zeros of the Riemann Zeta-function, Cambridge Philos.
Soc. Proc. 22, 295–318.
Mallik, A. (1977). If L( 12 , χ) > 0, then L 12 , χ cannot be a minimum, Studia Sci.
Math. Hungar. 12, 445–446.
Poisson, S. D. (1823). Suite de mémoire sur les intégrales définies et sur la sommation
des séries, l’École Royale, J. Polytechnique 12, 404–509.
Riemann, B. (1860). Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse,
Monatsberichte der Königlichen Preussichen Akademie der Wissenschaften zu
Berlin aus dem Jahre 1859, 671–680; Werke. Leipzig: Teubner, 1876, pp. 3–47.
Reprint: New York: Dover, 1953.
Tate, J. T. (1967). Fourier analysis in number fields, and Hecke’s zeta-functions, Alge-
braic Number Theory (Brighton, 1965). Washington: Thompson, pp. 305–347.
Taylor, P. R. (1945). On the Riemann zeta function, Quart. J. Math. Oxford Ser. 16,
1–21.
Titchmarsh, E. C. (1986). The Theory of the Riemann Zeta-function, Second Edition.
Oxford: Oxford University Press.
de la Vallée Poussin, C. (1896). Recherches analytique sur la théorie des nombres pre-
miers. Deuxième partie: Les fonctions de Dirichlet et les nombres premiers de
la forme linéaire M x + N , Annales de la Société scientifique de Bruxelles, 20,
281–342.
Vorhauer, U. M. A. (2006). The Hadamard product formula for Dirichlet L-functions,
to appear.
Walfisz, A. (1931). Teilerprobleme, II, Math. Z. 34, 448–472.
Weil, A. (1967). Über die Bestimmung Dirichletscher Reihen durch
Funktionalgleichungen, Math. Ann. 168, 149–156.
11
Primes in arithmetic progressions: II

11.1 A zero-free region

For a given integer q, the primes not dividing q are distributed in the reduced
residue classes modulo q. As there are no other obvious restrictions on the
primes modulo q, we expect the primes to be uniformly distributed amongst
the reduced residue classes. Let π (x; q, a) denote the number of primes p ≤ x
such that $p \equiv a \pmod{q}$. We anticipate that if $(a, q) = 1$, then
\[
\pi(x; q, a) \sim \frac{x}{\varphi(q)\log x} \quad \text{as } x \to \infty.
\]
This asymptotic estimate is the Prime Number Theorem for arithmetic pro-
gressions; it can readily be established by adapting the methods of Chapters
4 and 6. For many purposes, however, it is important to have a quantitative
form of this, from which one can tell how large x should be, as a function of
q, to ensure that π (x; q, a) is near li(x)/ϕ(q). To obtain such an estimate we
must first derive a zero-free region for the Dirichlet L-functions L(s, χ ) that is
explicit in its dependence on both q and t. For the most part our arguments are
natural generalizations of the analysis in Chapter 6, but we shall encounter a
new difficulty in connection with the possible existence of a real zero β near 1
of L(s, χ ) when χ is a quadratic character.
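
The expected equidistribution is easy to observe numerically. The sketch below counts primes $p \le x$ in each reduced residue class modulo $q$ with a simple sieve; the helper names and the sample values $x = 10^5$, $q = 7$ are illustrative assumptions, and the experiment is no substitute for the quantitative estimates developed in this chapter.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples i*i, i*i + i, ... of the prime i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def prime_counts_by_class(x, q):
    """Return {a: pi(x; q, a)} for the reduced residue classes a modulo q."""
    counts = {a: 0 for a in range(q) if math.gcd(a, q) == 1}
    for p in primes_up_to(x):
        a = p % q
        if a in counts:  # primes dividing q fall in no reduced class
            counts[a] += 1
    return counts


x, q = 100_000, 7
counts = prime_counts_by_class(x, q)
main_term = x / (len(counts) * math.log(x))  # x / (phi(q) log x)
for a, c in sorted(counts.items()):
    print(a, c, round(main_term, 1))
```

At this range the six counts are close to one another, while the leading term $x/(\varphi(q)\log x)$ is visibly smaller than each of them; comparison with $\mathrm{li}(x)/\varphi(q)$ is more favourable, which is one reason the quantitative statements are phrased in terms of $\mathrm{li}(x)$.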

The approximate partial fraction expansion of $\frac{\zeta'}{\zeta}(s)$ (cf. Lemma 6.4) depends
on the upper bound for $|\zeta(s)|$ provided by Corollary 1.17. By using
Lemma 10.15 in a similar manner, we now derive a corresponding approximate
partial fraction formula for $\frac{L'}{L}(s, \chi)$. In order to formulate a unified result for
both the principal and non-principal characters, it is convenient to employ the
notation
\[
E_0(\chi) = \begin{cases} 1 & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise.} \end{cases} \tag{11.1}
\]

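Lemma 11.1 If $\chi$ is a character (mod $q$) and $5/6 \le \sigma \le 2$, then
\[
-\frac{L'}{L}(s, \chi) = \frac{E_0(\chi)}{s-1} - \sum_{\rho} \frac{1}{s-\rho} + O(\log q\tau)
\]
where the sum is over all zeros $\rho$ of $L(s, \chi)$ for which $\big|\rho - \big(\tfrac{3}{2} + it\big)\big| \le 5/6$.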

Proof When $\chi$ is non-principal we apply Lemma 6.3 to the function
\[
f(z) = L\big(z + \tfrac{3}{2} + it, \chi\big)
\]
with $R = 5/6$ and $r = 2/3$. By Lemma 10.15 we may take $M = Cq\tau$ for a
suitable absolute constant $C$, and by the Euler product for $L(s, \chi)$ we see that
\[
|f(0)| = \big|L\big(\tfrac{3}{2} + it, \chi\big)\big| = \prod_{p} \big|1 - \chi(p)p^{-\frac{3}{2}-it}\big|^{-1} \ge \prod_{p} \big(1 + p^{-3/2}\big)^{-1} \gg 1.
\]
Now suppose that $\chi = \chi_0$. The zeros of the function $1 - p^{-s}$ form an arithmetic
progression on the imaginary axis. Hence by (4.22), the zeros of $L(s, \chi_0)$ are
the zeros of $\zeta(s)$ together with the union of several arithmetic progressions on
the imaginary axis. Since these latter zeros all lie at a distance $\ge 3/2$ from the
point $\tfrac{3}{2} + it$, none of them is included in the sum over $\rho$. Moreover, by taking
logarithmic derivatives of both sides of (4.22) we see that
\[
\frac{L'}{L}(s, \chi_0) = \frac{\zeta'}{\zeta}(s) + \sum_{p \mid q} \frac{\log p}{p^s - 1}.
\]
But $(\log p)/(p^s - 1) \ll 1$ for $\sigma \ge 5/6$, so the sum over $p$ is $\ll \omega(q) \ll \log q$
by Theorem 2.10. Hence we obtain the stated identity by appealing to
Lemma 6.4.
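
The lower bound for $|f(0)|$ in the proof rests on the convergence of $\prod_p (1 + p^{-3/2})^{-1}$ to a positive constant. A minimal sketch, with an illustrative function name and cutoff, evaluates a partial product; since $\prod_p (1 + p^{-s})^{-1} = \zeta(2s)/\zeta(s)$ for $s > 1$, the limit here is $\zeta(3)/\zeta(3/2)$, which is roughly $0.46$, so the bound is a statement of the form $\gg 1$ and not $\ge 1$.

```python
def lower_bound_product(limit):
    """Partial product of (1 + p**-1.5)**-1 over primes p <= limit (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    product = 1.0
    for p in range(2, limit + 1):
        if sieve[p]:
            product /= 1 + p ** -1.5
            # Cross out multiples of p; the slice is empty once p*p > limit.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return product


for cutoff in (10, 1_000, 100_000):
    print(cutoff, lower_bound_product(cutoff))
```

The partial products decrease as the cutoff grows, but stay bounded away from zero, uniformly in $t$ and $\chi$.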

The generalization of Lemma 6.5 is straightforward.

Lemma 11.2 If $\sigma > 1$, then
\[
\Re\Big(-3\frac{L'}{L}(\sigma, \chi_0) - 4\frac{L'}{L}(\sigma + it, \chi) - \frac{L'}{L}(\sigma + 2it, \chi^2)\Big) \ge 0.
\]

Proof By the Dirichlet series expansion (4.25) for $\frac{L'}{L}(s, \chi)$ we see that the
left-hand side above is
\[
\sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} \frac{\Lambda(n)}{n^{\sigma}}\, \Re\big(3 + 4\chi(n)n^{-it} + \chi(n)^2 n^{-2it}\big).
\]
The quantity $\chi(n)n^{-it}$ is unimodular when $(n, q) = 1$, so for such $n$ there is a
real number $\theta_n$ such that $\chi(n)n^{-it} = e^{i\theta_n}$. Thus the above is
\[
\sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} \frac{\Lambda(n)}{n^{\sigma}}\, (3 + 4\cos\theta_n + \cos 2\theta_n).
\]
This is non-negative because $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2 \ge 0$ for
all $\theta$.
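
The trigonometric identity at the heart of the proof follows from $\cos 2\theta = 2\cos^2\theta - 1$. The following sketch, with illustrative function names, simply confirms on a grid of angles that the two sides agree and are non-negative.

```python
import math


def trig_polynomial(theta):
    """The weight 3 + 4 cos(theta) + cos(2 theta) appearing in Lemma 11.2."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def squared_form(theta):
    """The equivalent form 2 (1 + cos(theta))^2, which is visibly >= 0."""
    return 2 * (1 + math.cos(theta)) ** 2


# Sample a full period; the minimum value 0 is attained at theta = pi.
grid = [2 * math.pi * k / 1000 for k in range(1001)]
print(max(abs(trig_polynomial(t) - squared_form(t)) for t in grid))
print(min(trig_polynomial(t) for t in grid))
```

The coefficients 3, 4, 1 are what dictate the weights applied to the three inequalities in the proof of Theorem 11.3.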

The groundwork laid above enables us to establish a variant of Theorem 6.6

for Dirichlet L-functions.

Theorem 11.3 There is an absolute constant $c > 0$ such that if $\chi$ is a Dirichlet
character modulo $q$, then the region
\[
R_q = \{s : \sigma > 1 - c/\log q\tau\}
\]
contains no zero of $L(s, \chi)$ unless $\chi$ is a quadratic character, in which case
$L(s, \chi)$ has at most one, necessarily real, zero $\beta < 1$ in $R_q$.

A zero lying in Rq , as described above, is called exceptional. No exceptional

zero is known, and indeed it may be conjectured that if χ is quadratic, then
L(σ, χ ) > 0 for all σ > 0. We give further study to exceptional zeros in the
next section.

Proof The case $\chi = \chi_0$ is immediate from (4.22) and Theorem 6.6, so we
may assume that $\chi$ is non-principal. Also, the Euler product (4.21) for $L(s, \chi)$
is absolutely convergent when $\sigma > 1$, and hence $L(s, \chi) \ne 0$ for such $s$. Thus
it suffices to consider a zero $\rho_0 = \beta_0 + i\gamma_0$ of $L(s, \chi)$ with $12/13 \le \beta_0 \le 1$.
We consider several cases, the first of which parallels the proof of Theorem 6.6
most closely.

Case 1. Complex $\chi$. If $\sigma > 1$ and $\rho$ is a zero of an L-function, then $\Re(s - \rho) > 0$
and hence $\Re(1/(s - \rho)) > 0$. Thus by Lemma 11.1, if $0 < \delta \le 1$, then
\begin{align*}
-\frac{L'}{L}(1 + \delta, \chi_0) &\le \frac{1}{\delta} + c_1 \log q, \\
-\Re\frac{L'}{L}(1 + \delta + i\gamma_0, \chi) &\le \frac{-1}{1 + \delta - \beta_0} + c_1 \log q(|\gamma_0| + 4), \tag{11.2} \\
-\Re\frac{L'}{L}(1 + \delta + 2i\gamma_0, \chi^2) &\le c_1 \log q(2|\gamma_0| + 4)
\end{align*}
for some absolute constant $c_1$. The hypothesis that $\chi$ is complex is needed for
this last inequality, to ensure that $\chi^2 \ne \chi_0$ in the appeal to Lemma 11.1. We
multiply both sides of the first inequality by 3, the second by 4, and sum all
three. By Lemma 11.2, the resulting left-hand side is non-negative. That is,
\[
\frac{3}{\delta} - \frac{4}{1 + \delta - \beta_0} + c_2 \log q(|\gamma_0| + 4) \ge 0
\]
for some constant $c_2$. If $\beta_0 = 1$, then letting $\delta \to 0^+$ gives an immediate
contradiction, so it may be assumed that $\beta_0 < 1$. Then, on taking $\delta = 6(1 - \beta_0)$, it
follows that
\[
1 - \beta_0 \ge \frac{1}{14c_2 \log q(|\gamma_0| + 4)}.
\]
Hence $\rho_0 \notin R_q$ if $c$ is chosen sufficiently small.
This argument also applies with only small changes when χ is quadratic,
provided that |γ0 | is large. We can even allow |γ0 | to be small, as long as it is
large compared with 1 − β0 . We now consider such a case.
Case 2. Quadratic $\chi$, $|\gamma_0| \ge 6(1 - \beta_0)$. By Theorem 4.9, $L(1, \chi) \ne 0$, so $\gamma_0 \ne 0$.
Hence we can proceed as above, except that as $\chi^2 = \chi_0$ the third inequality
in (11.2) must be replaced by the weaker inequality
\[
-\Re\frac{L'}{L}(1 + \delta + 2i\gamma_0, \chi^2) \le \frac{\delta}{\delta^2 + 4\gamma_0^2} + c_1 \log q(2|\gamma_0| + 4).
\]
Again if $\beta_0 = 1$, then taking $\delta \to 0^+$ gives a contradiction. Thus it can be
supposed that $\beta_0 < 1$. Since $|\gamma_0| \ge 6(1 - \beta_0)$, this implies that
\[
-\Re\frac{L'}{L}(1 + \delta + 2i\gamma_0, \chi^2) \le \frac{\delta}{\delta^2 + 144(1 - \beta_0)^2} + c_1 \log q(2|\gamma_0| + 4).
\]
We combine this inequality with the first two inequalities in (11.2) and apply
Lemma 11.2 with $\sigma = 1 + \delta = 1 + 6(1 - \beta_0)$ to see that
\[
\frac{1}{1 - \beta_0}\Big(\frac{3}{6} - \frac{4}{7} + \frac{6}{180}\Big) + c_2 \log q(|\gamma_0| + 4) \ge 0.
\]
The factor in large parentheses above is $-4/105 < -1/27$, so
\[
1 - \beta_0 \ge \frac{1}{27c_2 \log q(|\gamma_0| + 4)}.
\]
Case 3. Quadratic $\chi$, $0 < |\gamma_0| \le 6(1 - \beta_0)$. Since $L(s, \chi)$ is real when $s$ is
real, it follows by the Schwarz reflection principle that $L(\beta_0 - i\gamma_0, \chi) = 0$.
Hence by Lemma 11.1 we see that if $1 < \sigma \le 2$, then
\begin{align*}
-\frac{L'}{L}(\sigma, \chi) &\le -\frac{1}{\sigma - \rho_0} - \frac{1}{\sigma - \overline{\rho_0}} + c_1 \log 4q \\
&= \frac{-2(\sigma - \beta_0)}{(\sigma - \beta_0)^2 + \gamma_0^2} + c_1 \log 4q \\
&\le \frac{-2(\sigma - \beta_0)}{(\sigma - \beta_0)^2 + 36(1 - \beta_0)^2} + c_1 \log 4q. \tag{11.3}
\end{align*}
Rather than apply Lemma 11.2 we simply observe that if $\sigma > 1$, then
\[
-\frac{L'}{L}(\sigma, \chi_0) - \frac{L'}{L}(\sigma, \chi) = \sum_{\substack{n=1 \\ (n,q)=1}}^{\infty} \frac{\Lambda(n)}{n^{\sigma}}\,(1 + \chi(n)) \ge 0. \tag{11.4}
\]
We put $\sigma = 1 + \delta = 1 + a(1 - \beta_0)$ and combine the first inequality in (11.2)
and (11.3) in the above to deduce that
\[
\frac{1}{1 - \beta_0}\Big(\frac{1}{a} - \frac{2(a + 1)}{(a + 1)^2 + 36}\Big) + c_2 \log 4q \ge 0.
\]
The factor in large parentheses is $\sim -1/a$ as $a \to \infty$, so it is certainly possible
to choose a value of $a$ so that this factor is negative. Indeed, when $a = 13$ this
factor is $-33/754 < -1/27$, and hence
\[
1 - \beta_0 \ge \frac{1}{27c_2 \log 4q}.
\]
(We note that our supposition that $\beta_0 \ge 12/13$ implies that
$\sigma = 1 + 13(1 - \beta_0) \le 2$, so that Lemma 11.1 is applicable.)
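Case 4. Quadratic $\chi$, real zeros. If $\beta_0$ is a real zero of $L(s, \chi)$, then $\beta_0 < 1$
by Theorem 4.9. Suppose that $\beta_0 \le \beta_1 < 1$ are two such zeros. Then by Lemma
11.1,
\begin{align*}
-\frac{L'}{L}(\sigma, \chi) &\le -\frac{1}{\sigma - \beta_0} - \frac{1}{\sigma - \beta_1} + c_1 \log 4q \\
&\le -\frac{2}{\sigma - \beta_0} + c_1 \log 4q.
\end{align*}
On combining the first part of (11.2) and the above in (11.4) with
$\sigma = 1 + \delta = 1 + a(1 - \beta_0)$, we find that
\[
\frac{1}{1 - \beta_0}\Big(\frac{1}{a} - \frac{2}{a + 1}\Big) + c_2 \log 4q \ge 0.
\]
On taking $a = 2$ we deduce that
\[
1 - \beta_0 \ge \frac{1}{6c_2 \log 4q}.
\]
This completes the proof.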
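
The numerical constants in the four cases come from elementary rational arithmetic once $\delta$ is written as a multiple of $1 - \beta_0$. The following sketch recomputes the coefficient of $1/(1 - \beta_0)$ in each case with exact fractions; the function and parameter names (`mult` for the multiple defining $\delta$, `gap` for the factor 6 relating $|\gamma_0|$ to $1 - \beta_0$) are illustrative and do not appear in the text.

```python
from fractions import Fraction


def case1_factor(mult=6):
    """3/delta - 4/(1 + delta - beta0) with delta = mult * (1 - beta0)."""
    return Fraction(3, mult) - Fraction(4, mult + 1)


def case2_factor(mult=6, gap=6):
    """Case 1 plus delta/(delta^2 + 4 gamma0^2) at |gamma0| = gap * (1 - beta0)."""
    return case1_factor(mult) + Fraction(mult, mult**2 + 4 * gap**2)


def case3_factor(a, gap=6):
    """1/a - 2(a + 1)/((a + 1)^2 + gap^2), from (11.2), (11.3) and (11.4)."""
    return Fraction(1, a) - Fraction(2 * (a + 1), (a + 1) ** 2 + gap**2)


def case4_factor(a):
    """1/a - 2/(a + 1), the factor for two real zeros."""
    return Fraction(1, a) - Fraction(2, a + 1)


print(case1_factor(), case2_factor(), case3_factor(13), case4_factor(2))
```

Each factor is negative, which is what converts the inequality "factor times $1/(1 - \beta_0)$ plus $c_2 \log(\cdot) \ge 0$" into a lower bound for $1 - \beta_0$; the choices $a = 13$ and $a = 2$ are convenient values, not the only admissible ones.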

In the same way that Theorem 6.7 was derived from Theorem 6.6, we now
derive estimates for $\frac{L'}{L}(s, \chi)$ and $\log L(s, \chi)$ in a portion of the critical strip.

Theorem 11.4 Let $\chi$ be a non-principal character modulo $q$, let $c$ be the
constant in Theorem 11.3, and suppose that $\sigma \ge 1 - c/(2\log q\tau)$. If $L(s, \chi)$ has
no exceptional zero, or if $\beta_1$ is an exceptional zero of $L(s, \chi)$ but
$|s - \beta_1| \ge 1/\log q$, then
\[
\frac{L'}{L}(s, \chi) \ll \log q\tau, \tag{11.5}
\]
\[
|\log L(s, \chi)| \le \log\log q\tau + O(1), \tag{11.6}
\]
and
\[
\frac{1}{L(s, \chi)} \ll \log q\tau. \tag{11.7}
\]
Alternatively, if $\beta_1$ is an exceptional zero of $L(s, \chi)$ and $|s - \beta_1| \le 1/\log q$,
then
\[
\frac{L'}{L}(s, \chi) = \frac{1}{s - \beta_1} + O(\log q) \quad (s \ne \beta_1), \tag{11.8}
\]
\[
|\arg L(s, \chi)| \le \log\log q + O(1) \quad (s \ne \beta_1), \tag{11.9}
\]
and
\[
|s - \beta_1| \ll |L(s, \chi)| \ll |s - \beta_1|(\log q)^2. \tag{11.10}
\]
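Proof If $\sigma > 1$, then by Corollary 1.11 we see that
\[
\Big|\frac{L'}{L}(s, \chi)\Big| \le \sum_{n=1}^{\infty} \Lambda(n)n^{-\sigma} = -\frac{\zeta'}{\zeta}(\sigma) \ll \frac{1}{\sigma - 1}.
\]
Hence (11.5) is obvious if $\sigma \ge 1 + 1/\log q\tau$. Let $s_1 = 1 + 1/\log q\tau + it$.
Then
\[
\frac{L'}{L}(s_1, \chi) \ll \log q\tau.
\]
From this and Lemma 11.1 it follows that
\[
\sum_{\rho} \frac{1}{s_1 - \rho} \ll \log q\tau \tag{11.11}
\]
where the sum is over those zeros of $L(s, \chi)$ for which $|\rho - (3/2 + it)| \le 5/6$.
Hence
\[
\sum_{\rho} \frac{1}{s - \rho} = \sum_{\rho} \Big(\frac{1}{s - \rho} - \frac{1}{s_1 - \rho}\Big) + O(\log q\tau). \tag{11.12}
\]
Suppose that $1 - c/(2\log q\tau) \le \sigma \le 1 + 1/\log q\tau$ and that $|s - \beta_1| \ge 1/\log q$
if $L(s, \chi)$ has an exceptional zero $\beta_1$. Since $|s - \rho| \asymp |s_1 - \rho|$ for
all zeros $\rho$, it follows that
\[
\frac{1}{s - \rho} - \frac{1}{s_1 - \rho} = \frac{1 + 1/\log q\tau - \sigma}{(s - \rho)(s_1 - \rho)} \ll \frac{1}{|s_1 - \rho|^2 \log q\tau} \ll \Re\frac{1}{s_1 - \rho}.
\]
On summing this over $\rho$ and appealing to (11.11) we find that
\[
\sum_{\rho} \frac{1}{s - \rho} \ll \log q\tau, \tag{11.13}
\]
and (11.5) follows by Lemma 11.1.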

To derive (11.6) we first note that if $\sigma > 1$, then
\[
|\log L(s, \chi)| \le \sum_{n=2}^{\infty} \frac{\Lambda(n)}{\log n}\, n^{-\sigma} = \log \zeta(\sigma).
\]
Since $\zeta(\sigma) \le \sigma/(\sigma - 1)$ by Corollary 1.14, we see that (11.6) holds when
$\sigma \ge 1 + 1/\log q\tau$. In particular, (11.6) holds at the point $s_1 = 1 + 1/\log q\tau + it$.
To treat the remaining $s$ it suffices to note that
\[
\log L(s, \chi) - \log L(s_1, \chi) = \int_{s_1}^{s} \frac{L'}{L}(w, \chi)\, dw \ll |s_1 - s| \log q\tau \ll 1
\]
by (11.5). The estimate (11.6) trivially implies (11.7) since $\log 1/|L(s, \chi)| = -\Re \log L(s, \chi)$.
Now suppose that $L(s, \chi)$ has an exceptional zero $\beta_1$ such that $|s - \beta_1| \le 1/\log q$.
Then $1 - c/(2\log 4q) \le \sigma \le 1 + 1/\log q$, so by Lemma 11.1,
\[
\frac{L'}{L}(s, \chi) = \frac{1}{s - \beta_1} + {\sum_{\rho}}' \frac{1}{s - \rho} + O(\log q)
\]
where $\sum_{\rho}'$ denotes a sum over all zeros $\rho$ such that $|\rho - (3/2 + it)| \le 5/6$
except for the exceptional zero $\beta_1$. The proof of (11.13) applies to $\sum_{\rho}'$, so we
have (11.8). Proceeding as in the proof of (11.6), we find that
\[
\log L(s, \chi) = \log\frac{s - \beta_1}{s_1 - \beta_1} + \log L(s_1, \chi) + O(1),
\]
which implies that
\[
\Big|\log L(s, \chi) - \log\frac{s - \beta_1}{s_1 - \beta_1}\Big| \le |\log L(s_1, \chi)| + O(1) \le \log\log q + O(1).
\]
But $\arg(s - \beta_1) \ll 1$, $\arg(s_1 - \beta_1) \ll 1$, and $\log |s_1 - \beta_1| = -\log\log q + O(1)$,
so we have (11.9) and (11.10).

Our methods yield not only a zero-free region, but also enable us to bound
the number of zeros ρ of L(s, χ ) that might lie near 1 + it.

Theorem 11.5 Let $n(r; t, \chi)$ denote the number of zeros $\rho$ of $L(s, \chi)$ in the
disc $|\rho - (1 + it)| \le r$. Then $n(r; t, \chi) \ll r \log q\tau$ uniformly for
$1/\log q\tau \le r \le 3/4$.

Here the constraint r ≥ 1/ log qτ is needed because L(s, χ ) might have

an exceptional zero. If L(s, χ ) has no exceptional zero, then the bound holds
uniformly for 0 ≤ r ≤ 3/4, in view of the zero-free region of Theorem 11.3.

Proof In view of Theorem 6.8, we may suppose that $\chi$ is non-principal. Suppose
first that $1/\log q\tau \le r \le 1/3$. Take $s_1 = 1 + r + it$. Then $\Re(s_1 - \rho)^{-1} \ge 0$
for all zeros $\rho$, and $\Re(s_1 - \rho)^{-1} \gg 1/r$ if $\rho$ is counted by $n(r; t, \chi)$. Hence
\[
\frac{1}{r}\, n(r; t, \chi) \ll \sum_{\rho} \Re\frac{1}{s_1 - \rho}
\]
where the sum is over all zeros $\rho$ such that $|\rho - (3/2 + it)| \le 5/6$. By
Lemma 11.1 we see that the above is $\ll \log q\tau$, since
\[
\Big|\frac{L'}{L}(s_1)\Big| \le -\frac{\zeta'}{\zeta}(1 + r) \ll \frac{1}{r} \ll \log q\tau.
\]
If $1/3 \le r \le 3/4$, then it suffices to apply Jensen’s inequality to $L(s, \chi)$ on a
disc with centre $3/2 + it$, with $R = 4/3$ and $r = 5/4$, in view of the estimates
provided by Lemma 10.15.

11.1.1 Exercises
1. Let $S(x; q)$ denote the number of integers $n$, $0 < n \le x$, such that $(n, q) = 1$,
and put $R(x; q) = S(x; q) - (\varphi(q)/q)x$.
(a) Show that if $\sigma > 0$, $x > 0$, and $s \ne 1$, then
\[
L(s, \chi_0) = \sum_{n \le x} \chi_0(n)n^{-s} + \frac{\varphi(q)}{q}\cdot\frac{x^{1-s}}{s-1} - \frac{R(x; q)}{x^s} + s\int_x^{\infty} R(u; q)u^{-s-1}\, du.
\]
Show that this includes Theorem 1.12 as a special case.
(b) Let $\delta > 0$ be fixed. Show that if $\sigma \ge \delta$, then
\[
L(s, \chi_0) = \frac{\varphi(q)}{q}\cdot\frac{x^{1-s}}{s-1} + \sum_{n \le x} \chi_0(n)n^{-s} + O\big(d(q)|s|x^{-\sigma}\big).
\]

2. Suppose that $\delta$ is fixed, $0 < \delta < 1$. Show that
\[
\sum_{p \mid q} \frac{\log p}{p^s - 1} \ll (\log q)^{1-\delta}
\]
uniformly for $\sigma \ge \delta$. (This improves on the estimate used in the latter part
of the proof of Lemma 11.1.)
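3. (a) Show that if $\sigma > 0$, then
\[
\zeta(s) = \frac{1}{s-1} + \frac{1}{2} - s\int_1^{\infty} (\{x\} - 1/2)x^{-s-1}\, dx.
\]
(b) Show that if $f(x)$ is a monotonically decreasing function, then
\[
\int_0^1 (x - 1/2) f(x)\, dx \le 0.
\]
(c) Show that
\[
\zeta(\sigma) > \frac{1}{\sigma - 1} + \frac{1}{2}
\]
for $\sigma > 0$.
(d) Show that
\[
-\zeta'(s) = \frac{1}{(s-1)^2} + \int_1^{\infty} (\{x\} - 1/2)(1 - s\log x)x^{-s-1}\, dx
\]
for $\sigma > 0$.
(e) Show that if $\sigma > 0$, then
\[
\Big|\zeta'(\sigma) + \frac{1}{(\sigma - 1)^2}\Big| < \frac{1}{2}\int_1^{\infty} |1 - \sigma\log x|\, x^{-\sigma-1}\, dx = \frac{1}{e\sigma}.
\]
(f) Justify the following chain of inequalities for $\sigma > 1$:
\[
-\frac{\zeta'}{\zeta}(\sigma) < \frac{\frac{1}{(\sigma-1)^2} + \frac{1}{e\sigma}}{\frac{1}{\sigma-1} + \frac{1}{2}} = \frac{1}{\sigma - 1}\cdot\frac{1 + \frac{(\sigma-1)^2}{e\sigma}}{1 + \frac{\sigma-1}{2}} < \frac{1}{\sigma - 1}.
\]
(g) Show that if $\chi_0$ is the principal character (mod $q$), then
\[
-\frac{L'}{L}(\sigma, \chi_0) < \frac{1}{\sigma - 1}
\]
for $\sigma > 1$. (This improves on the first inequality in (11.2), in the proof
of Theorem 11.3.)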
4. Let $\chi$ be a character (mod $q$), and suppose that the order $d$ of $\chi$ is odd.
(a) Show that $\Re\chi(n) \ge -\cos \pi/d$ for all integers $n$.
(b) Show that if $\sigma > 1$, then $\log |L(\sigma, \chi)| \ge -(\cos \pi/d)\log \zeta(\sigma)$.
(c) Show that $L(1, \chi) \asymp L(1 + 1/\log q, \chi)$.
(d) Show that $|L(1, \chi)| \gg (\log q)^{-\cos \pi/d}$.
(e) Deduce in particular that if $\chi$ is a cubic character (mod $q$), then
$|L(1, \chi)| \gg 1/\sqrt{\log q}$.
5. Grössencharaktere for $\mathbb{Q}(\sqrt{-1})$, continued from Exercise 10.1.28. For an
ideal $\mathfrak{a} = (a + ib)$ in the ring $\mathcal{O} = \{a + ib : a, b \in \mathbb{Z}\}$ of Gaussian integers, put
$\chi_m(\mathfrak{a}) = e^{4mi\arg(a+ib)}$. The ideal $\mathfrak{a}$ is the set of (Gaussian integer) multiples of
the number $a + ib$, but it can equally well be expressed as the set of Gaussian
integer multiples of $(a + ib)i^k$ for $k = 0, 1, 2, 3$. Note that the stated value
of $\chi_m(\mathfrak{a})$ is independent of the choice of $k$.
(a) Show that
\[
L(s, \chi_m) = \prod_{\mathfrak{p}} \Big(1 - \frac{\chi_m(\mathfrak{p})}{N(\mathfrak{p})^s}\Big)^{-1}
\]
for $\sigma > 1$, where the product is over all prime ideals $\mathfrak{p}$ in the ring.
(b) Let $\Lambda(\mathfrak{a}) = \log(a^2 + b^2)$ if $\mathfrak{a} = (a + ib)^k$ for some positive integer
$k$ and $a + ib$ is a Gaussian prime, and $\Lambda(\mathfrak{a}) = 0$ otherwise. Show
that
\[
\frac{L'}{L}(s, \chi_m) = -\sum_{\mathfrak{a}} \frac{\Lambda(\mathfrak{a})\chi_m(\mathfrak{a})}{N(\mathfrak{a})^s}
\]
for $\sigma > 1$.
(c) Show that there is an absolute constant $c > 0$ such that $L(s, \chi_m) \ne 0$ for
$\sigma > 1 - c/\log m\tau$ for every positive integer $m$.

11.2 Exceptional zeros

Although there is no known quadratic character χ for which L(s, χ) has an
exceptional real zero, the possible existence of such zeros is a recurring issue in
the theory in its current stage of development. The techniques of the preceding
section do not seem to offer a means of eliminating exceptional zeros entirely,
but nevertheless they may be used to show that such zeros occur at most rarely.
To this end we introduce a variant of Lemma 11.5 that allows us to consider
two different quadratic characters.

Lemma 11.6 (Landau) Suppose that χ1 and χ2 are quadratic characters. If σ > 1, then
\[ -\frac{\zeta'}{\zeta}(\sigma) - \frac{L'}{L}(\sigma,\chi_1) - \frac{L'}{L}(\sigma,\chi_2) - \frac{L'}{L}(\sigma,\chi_1\chi_2) \ge 0. \]
Proof It sufﬁces to express the left-hand side as a Dirichlet series and to note
that
1 + χ1 (n) + χ2 (n) + χ1 χ2 (n) = (1 + χ1 (n))(1 + χ2 (n)) ≥ 0

for all n.
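The factorization used in this proof can be checked numerically. The following is only an illustrative sketch: it takes χ1 and χ2 to be Legendre symbols modulo the odd primes 3 and 7 (a choice made here purely for the example), and the helper names are hypothetical.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)   # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def landau_coefficient(n, p1, p2):
    """The Dirichlet series coefficient 1 + chi1(n) + chi2(n) + chi1(n)chi2(n)."""
    c1, c2 = legendre(n, p1), legendre(n, p2)
    return 1 + c1 + c2 + c1 * c2

# Assumed example characters: the Legendre symbols modulo 3 and modulo 7.
coefficients = [landau_coefficient(n, 3, 7) for n in range(1, 22)]
```

Every entry of `coefficients` equals (1 + χ1(n))(1 + χ2(n)), so each is non-negative, which is all that the lemma requires.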

Theorem 11.7 (Landau) There is a constant c > 0 such that if χ1 and χ2

are quadratic characters modulo q1 and q2 , respectively, and if χ1 χ2 is non-
principal, then L(s, χ1 )L(s, χ2 ) has at most one real zero β such that 1 −
c/ log q1 q2 < β < 1.
368 Primes in Arithmetic Progressions: II

Proof Since any given L-function can have at most one such zero, if there
are two zeros, then one of them, say β1 , is a zero of L(s, χ1 ), and the other,
β2 , is a zero of L(s, χ2 ). We may assume that c is so small that 5/6 ≤ βi < 1.
Also, we note that χ1 χ2 is a non-principal character (mod q1 q2 ). Hence by four
applications of Lemma 11.1 we see that if 0 < δ ≤ 1, then
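\[ -\frac{\zeta'}{\zeta}(1+\delta) \le \frac{1}{\delta} + c_1\log 4, \]
\[ -\frac{L'}{L}(1+\delta,\chi_i) \le \frac{-1}{1+\delta-\beta_i} + c_1\log q_i, \]
\[ -\frac{L'}{L}(1+\delta,\chi_1\chi_2) \le c_1\log q_1q_2. \]
We sum these inequalities and apply Lemma 11.4 to see that
\[ \frac{1}{\delta} - \frac{1}{1+\delta-\beta_1} - \frac{1}{1+\delta-\beta_2} + c_2\log q_1q_2 \ge 0. \]
Without loss of generality we may suppose that β1 ≤ β2. Then
\[ \frac{1}{\delta} - \frac{2}{1+\delta-\beta_1} + c_2\log q_1q_2 \ge 0, \]
and by taking δ = 2(1 − β1) we deduce that
\[ 1 - \beta_1 \ge \frac{1}{6c_2\log q_1q_2}. \]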
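The final step is a short piece of algebra: with δ = 2(1 − β1) the two explicit terms combine to −1/(6(1 − β1)). A minimal sketch in exact rational arithmetic (the function name and the sample values of β1 are assumptions made for illustration) confirms this:

```python
from fractions import Fraction

def main_terms(beta1):
    """Evaluate 1/delta - 2/(1 + delta - beta1) at delta = 2(1 - beta1)."""
    delta = 2 * (1 - beta1)
    return Fraction(1) / delta - Fraction(2) / (1 + delta - beta1)

# Sample values of beta1 in [5/6, 1), chosen only for illustration.
samples = [Fraction(5, 6), Fraction(9, 10), Fraction(99, 100)]
values = [main_terms(b) for b in samples]
```

Since `main_terms(beta1)` equals −1/(6(1 − β1)), the inequality displayed before the choice of δ reads c2 log q1q2 ≥ 1/(6(1 − β1)), which is the stated bound.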

The following corollaries are immediate.

Corollary 11.8 (Landau) There is a constant c > 0 such that $\prod_\chi L(s,\chi)$ has at most one zero in the region $\sigma > 1 - c/\log q\tau$. Here the product is over all Dirichlet characters χ (mod q). If such a zero exists then it is necessarily real and the associated character χ is quadratic.

Corollary 11.9 (Landau) For each positive number A there is a c(A) > 0 such that if $\{q_i\}$ is a strictly increasing sequence of natural numbers with the property that for each $q_i$ there is a primitive quadratic character $\chi_i$ (mod $q_i$) for which $L(s,\chi_i)$ has a zero $\beta_i$ satisfying
\[ \beta_i > 1 - \frac{c(A)}{\log q_i}, \]
then
\[ q_{i+1} > q_i^A. \]

Corollary 11.10 (Page) There is a constant c > 0 such that for every Q ≥ 1 the region $\sigma \ge 1 - c/\log Q\tau$ contains at most one zero of the function
\[ \prod_{q\le Q}\ {\prod_{\chi}}^{*} L(s,\chi) \]
where $\prod^*_\chi$ denotes a product over all primitive characters χ (mod q). If such a zero exists, then it is necessarily real and the associated character χ is quadratic.

We now turn to the problem of showing that even an exceptional zero cannot be too close to 1. By taking s = 1 in (11.10) we see that this is equivalent to showing that L(1, χ) cannot be too small. Suppose that χ is a primitive quadratic character modulo q, and let $r(n) = \sum_{d|n}\chi(d)$. Then r(n) ≥ 0 for all n and r(n) ≥ 1 when n is a perfect square. Since $\sum_{n=1}^\infty r(n)n^{-s} = \zeta(s)L(s,\chi)$ for σ > 1, we find that
\[ \sum_{n\le x} r(n)n^{-s} = \frac{L(1,\chi)x^{1-s}}{1-s} + \zeta(s)L(s,\chi) + \text{error terms}. \tag{11.14} \]
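The two properties of r(n) used here are easy to observe in a concrete case. The sketch below assumes, purely for illustration, that χ is the non-principal character modulo 4; the function names are hypothetical.

```python
def chi4(n):
    """The non-principal character modulo 4 (an assumed example of a quadratic chi)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def r(n):
    """r(n) = sum of chi(d) over the positive divisors d of n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)

values = {n: r(n) for n in range(1, 201)}
```

For this χ the values `values[n]` are never negative, and `values[m * m]` is at least 1 for every m with m² ≤ 200, in line with the general statement.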

Here the error terms are small if x is sufﬁciently large in terms of q. Estimates of
this kind can be derived from Corollary 1.15 by the method of the hyperbola, or
else by employing an inverse Mellin transform. Suppose that 0 ≤ s < 1 in the
above. We can give a lower bound for the left-hand side, which yields a lower
bound for L(1, χ) if the second term on the right-hand side does not interfere.
Since ζ (s) < 0 for 0 < s < 1 (cf. Corollary 1.14), this term is harmless if
L(s, χ ) ≥ 0. If this cannot be arranged, we may alternatively eliminate this
term by taking two values of x and differencing. Since the method of the
hyperbola leads to tedious details, we use an inverse Mellin transform to derive
a more precise version of (11.14). To make the estimates easier we introduce
an Abelian weighting of the sum. By (5.23) with x replaced by 1/x we see that
\[ \sum_{n=1}^\infty r(n)e^{-n/x} = \frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}\zeta(s)L(s,\chi)\Gamma(s)x^s\,ds. \]
We move the contour of integration to the line σ = −1/2, which gives rise to residues at the poles at s = 1 and s = 0. Thus the above is
\[ = L(1,\chi)x + \zeta(0)L(0,\chi) + \frac{1}{2\pi i}\int_{-1/2-i\infty}^{-1/2+i\infty}\zeta(s)L(s,\chi)\Gamma(s)x^s\,ds. \]
By Corollary 10.5 we know that $\zeta(-1/2+it) \ll \tau$, by Corollary 10.10 we know that $L(-1/2+it,\chi) \ll q\tau$, and by (C.19) we know that $\Gamma(-1/2+it) \ll \tau^{-1}e^{-\pi\tau/2}$. Hence the integral is $\ll qx^{-1/2}$. By (10.11) we know that ζ(0) = −1/2, and by Corollary 10.9 we know that L(0, χ) ≥ 0. (More precisely, L(0, χ) = 0 if χ(−1) = 1, and $L(0,\chi) \asymp q^{1/2}L(1,\chi)$ if χ(−1) = −1.) Since the perfect squares on the left-hand side contribute an amount $\gg x^{1/2}$, we deduce that
\[ x^{1/2} \ll xL(1,\chi) + qx^{-1/2}. \]
On taking x = Cq with C a large constant we deduce that $L(1,\chi) \gg q^{-1/2}$.

Now consider the possibility that χ is an imprimitive quadratic character. Then there is a primitive quadratic character χ* modulo d, with d|q, that induces χ. Thus
\[ L(1,\chi) = L(1,\chi^*)\prod_{p|q/d}\Big(1 - \frac{\chi^*(p)}{p}\Big) \ge L(1,\chi^*)\,\varphi(q/d)\,d/q \gg d^{-1/2}(\log\log 3q/d)^{-1} \gg q^{-1/2}, \]
by Theorem 2.9, so we have

Theorem 11.11 If χ is a quadratic character modulo q, then $L(1,\chi) \gg q^{-1/2}$.

By (11.10) the following corollary is immediate.

Corollary 11.12 There is an absolute constant c > 0 such that if χ is a quadratic character modulo q and L(s, χ) has an exceptional zero β1, then
\[ \beta_1 \le 1 - \frac{c}{q^{1/2}(\log q)^2}. \]
By elaborating on the above argument we can obtain better lower bounds for
1 − β1 . To facilitate this we ﬁrst establish a convenient inequality that depends
only on the analyticity and size of the relevant Dirichlet series in the immediate
vicinity of the real axis.

Lemma 11.13 (Estermann) Suppose that f(s) is analytic for |s − 2| ≤ 3/2, and that |f(s)| ≤ M for s in this disc. Suppose also that
\[ F(s) = \zeta(s)f(s) = \sum_{n=1}^\infty r(n)n^{-s} \]
for σ > 1, that r(1) = 1, and that r(n) ≥ 0 for all n. If there is a σ ∈ [19/20, 1) such that f(σ) ≥ 0, then
\[ f(1) \ge \frac14(1-\sigma)M^{-3(1-\sigma)}. \]
To put this in perspective, we recall that our proof in Chapter 4 that L(1, χ) ≠ 0 depended on Landau's theorem (Theorem 1.7). The above amounts to a quantitative elaboration of Landau's theorem, for if f(1) were 0, then F(s) would be analytic for s > 1/2, so by Landau's theorem the Dirichlet series would converge when σ > 1/2. This would imply that F(σ) > 0 for σ > 1/2. But ζ(σ) < 0 for 1/2 < σ < 1 (cf. Corollary 1.14), so it would follow that f(σ) < 0 in this interval. Thus the hypothesis above that f(σ) ≥ 0 implies, by Landau's theorem, that f(1) > 0. In the above we obtain not just this qualitative information but a quantitative lower bound for f(1) in terms of the size of σ and the size of f(s) in a surrounding disc.

Proof As in the proof of Landau's theorem we begin by expanding F(s) in powers of 2 − s,
\[ F(s) = \sum_{k=0}^\infty b_k(2-s)^k \tag{11.15} \]
for |s − 2| < 1. By Cauchy's coefficient formula we know that
\[ b_k = \frac{(-1)^k}{k!}F^{(k)}(2) = \frac{1}{k!}\sum_{n=1}^\infty r(n)n^{-2}(\log n)^k. \]
Thus $b_k \ge 0$ for all k, and $b_0 = \sum_{n=1}^\infty r(n)n^{-2} \ge 1$. For |s − 2| < 1 we may write
\[ \frac{1}{s-1} = \frac{1}{1-(2-s)} = \sum_{k=0}^\infty (2-s)^k. \]
On multiplying this by f(1) and subtracting from (11.15) we deduce that
\[ F(s) - \frac{f(1)}{s-1} = \sum_{k=0}^\infty (b_k - f(1))(2-s)^k \tag{11.16} \]
for |s − 2| < 1. But the left-hand side is analytic for |s − 2| ≤ 3/2, so the series converges in this larger disc. In order to estimate the coefficients on the right-hand side we bound the left-hand side when s lies on the circle |s − 2| = 3/2. To this end, we note by (1.24) that
\[ |\zeta(s)| = \Big|1 + \frac{1}{s-1} + s\int_1^\infty \frac{[u]-u}{u^{s+1}}\,du\Big| \le 1 + \frac{1}{|s-1|} + \frac{|s|}{\sigma}. \]
The relation |s − 2| = 3/2 implies that |s − 1| ≥ 1/2, that |s| ≤ 7/2, and that σ ≥ 1/2. Hence |ζ(s)| ≤ 10 for the s under consideration. Since |f(1)/(s − 1)| ≤ 2M, it follows that the left-hand side of (11.16) has modulus ≤ 12M for |s − 2| ≤ 3/2. By the Cauchy coefficient inequalities we deduce that $|b_k - f(1)| \le 12M(2/3)^k$. We apply this bound for all k > K where K is a parameter to be chosen later. Thus from (11.16) we see that if 1/2 < σ ≤ 2, then
\[ \zeta(\sigma)f(\sigma) - \frac{f(1)}{\sigma-1} \ge \sum_{k=0}^{K}(b_k - f(1))(2-\sigma)^k - 12M\sum_{k>K}\Big(\frac23(2-\sigma)\Big)^k. \]
We observe that if 19/20 ≤ σ < 1, then $\frac23(2-\sigma) \le 7/10$. We also recall that $b_0 \ge 1$ and that $b_k \ge 0$ for all k. Hence the above is
\[ \ge 1 - f(1)\,\frac{1-(2-\sigma)^{K+1}}{1-(2-\sigma)} - 40M(7/10)^{K+1}. \]
On cancelling the common term f(1)/(1 − σ) from both sides, and rearranging, we find that
\[ 1 \le \frac{f(1)(2-\sigma)^{K+1}}{1-\sigma} + \zeta(\sigma)f(\sigma) + 40M(7/10)^{K+1}, \]
a relation comparable to (11.14). To ensure that the last term on the right does not overwhelm the left-hand side, we take $K = [(\log 80M)/\log(10/7)]$. Then the last term on the right is ≤ 1/2. Since ζ(σ) < 0 by Corollary 1.14, and f(σ) ≥ 0 by hypothesis, it follows that
\[ f(1) \ge \frac12(1-\sigma)(2-\sigma)^{-K-1} \ge \frac{10}{21}(1-\sigma)(2-\sigma)^{-K}. \tag{11.17} \]
But
\[ (2-\sigma)^K \le (2-\sigma)^{(\log 80M)/\log(10/7)} = (80M)^{(\log(2-\sigma))/\log(10/7)} \le 80^{(\log 21/20)/\log(10/7)}\,M^{(\log(2-\sigma))/\log(10/7)}. \]
Here the first factor is < 13/7. Since log(1 + δ) ≤ δ for any δ ≥ 0, on taking δ = 1 − σ we see that log(2 − σ) ≤ 1 − σ. Also, log(10/7) > 1/3 and it can certainly be supposed that M ≥ 1, so the expression above is $< (13/7)M^{3(1-\sigma)}$. This with (11.17) gives the desired lower bound for f(1).
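The numerical constants in this proof can be confirmed by direct computation. The sketch below is only a floating-point sanity check, not part of the argument; the function name and the sample values of M are assumptions chosen for illustration.

```python
import math

LOG_RATIO = math.log(10 / 7)          # log(10/7), claimed to exceed 1/3

def estermann_K(M):
    """K = [log(80 M) / log(10/7)], the truncation point chosen in the proof."""
    return math.floor(math.log(80 * M) / LOG_RATIO)

def tail_term(M):
    """The last term 40 M (7/10)^(K+1), which the proof needs to be <= 1/2."""
    return 40 * M * (7 / 10) ** (estermann_K(M) + 1)

# The 'first factor' 80^(log(21/20)/log(10/7)), claimed to be < 13/7.
first_factor = 80 ** (math.log(21 / 20) / LOG_RATIO)

# The product of the constants 10/21 and 7/13, which must be at least 1/4.
final_constant = (10 / 21) * (7 / 13)
```

With these values `first_factor` is about 1.82 and `final_constant` is about 0.256, so the constant 1/4 in the lemma has a little room to spare.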

We are now prepared to prove an important strengthening of Theorem 11.11.

Theorem 11.14 (Siegel) For each positive number ε there is a positive constant C(ε) such that if χ is a quadratic character modulo q, then
\[ L(1,\chi) > C(\varepsilon)q^{-\varepsilon}. \]

Proof We assume, as we may, that ε ≤ 1/5. For the present we restrict our attention to primitive characters. We consider two cases, according to whether there exists a primitive quadratic character χ1 such that L(s, χ1) has a real zero β1 in the interval [1 − ε/4, 1), or not. Suppose first that there is no such zero. We take f(s) = L(s, χ), σ = 1 − ε/4. Then f(σ) > 0 and by Lemma 10.15 we may take $M \ll q^{1/2}$. Hence by Lemma 11.13, $f(1) \gg \varepsilon q^{-3\varepsilon/8}$. Thus there is a constant C1(ε) > 0 such that $L(1,\chi) \ge C_1(\varepsilon)q^{-\varepsilon}$.
Now consider the contrary case, in which there is a primitive quadratic character χ1 modulo q1 such that L(s, χ1) has a real zero β1 ≥ 1 − ε/4. Since L(1, χ1) > 0 there is a constant C2(ε) > 0 such that $L(1,\chi_1) \ge C_2(\varepsilon)q_1^{-\varepsilon}$.
Now suppose that χ is a primitive quadratic character, χ ≠ χ1. We apply Lemma 11.13 with f(s) = L(s, χ)L(s, χ1)L(s, χχ1). To see that the Dirichlet series coefficients of ζ(s)f(s) are non-negative, we note first that if g(s) is a Dirichlet series with non-negative coefficients, then exp g(s) is also a Dirichlet series with non-negative coefficients, since the power series coefficients of the exponential function are non-negative. Then it suffices to apply this observation with
\[ g(s) = \log\zeta(s)f(s) = \sum_{n=1}^\infty \frac{\Lambda(n)}{\log n}(1+\chi(n))(1+\chi_1(n))n^{-s}. \]
In view of Lemma 10.15 we may take $M = C_3qq_1$. On taking σ = β1, we find that
\[ f(1) \ge \frac14(1-\beta_1)(C_3qq_1)^{-3(1-\beta_1)} \ge \frac14(1-\beta_1)(C_3qq_1)^{-3\varepsilon/4} \ge C_4(\varepsilon)q^{-\varepsilon}, \]
since q1 and β1 depend only on ε. Now
\[ f(1) = L(1,\chi)L(1,\chi_1)L(1,\chi\chi_1) \ll L(1,\chi)(\log qq_1)^2 \]
by Lemma 10.15, and hence we deduce that
\[ L(1,\chi) \ge C_5(\varepsilon)q^{-2\varepsilon}. \tag{11.18} \]
We may assume that C5 ≤ C1, so that (11.18) holds in either case.
We now extend to imprimitive characters. Suppose that χ is induced by a primitive character χ* (mod d), so that q = dr for some r. Then
\[ L(1,\chi) = L(1,\chi^*)\prod_{p|r}\Big(1 - \frac{\chi^*(p)}{p}\Big) \ge L(1,\chi^*)\frac{\varphi(r)}{r} \ge C_5(\varepsilon)d^{-2\varepsilon}\frac{\varphi(r)}{r}. \]
By Theorem 2.9 the above is
\[ \ge C_6(\varepsilon)(dr)^{-2\varepsilon} = C_6(\varepsilon)q^{-2\varepsilon}, \]
and hence the proof is complete.

We are unable to compute the value of the constant C(ε) in Siegel’s theorem
when ε < 1/2, because we have no way of estimating the size of the small-
est possible q1 when the second case arises in the proof. Such a constant is
called ‘non-effective.’ This is our ﬁrst encounter with a non-effective constant,
so the distinction between effectively computable constants and non-effective
constants arises here for the ﬁrst time.
Corollary 11.15 For any ε > 0 there is a positive number C(ε) such that if χ is a quadratic character modulo q and β is a real zero of L(s, χ), then $\beta < 1 - C(\varepsilon)q^{-\varepsilon}$.

Proof We may certainly suppose that $\beta > 1 - c/\log 4q > 1 - \frac{1}{\log q}$, where c is the number appearing in Theorem 11.3, so that β is an exceptional zero by the criterion following that theorem. By taking s = 1 in (11.10) we see that
\[ L(1,\chi) \ll (1-\beta)(\log q)^2 \]
and the corollary follows easily from the theorem.

11.2.1 Exercises
1. Call a modulus q 'exceptional' if there is a primitive quadratic character χ (mod q) such that L(s, χ) has a real zero β such that β > 1 − c/log q. Show that if c is sufficiently small, then the number of exceptional q not exceeding Q is $\ll \log\log Q$.
2. Use the last part of Theorem 11.4 to show that if L(s, χ) has an exceptional zero β1, then $L'(\beta_1,\chi) \gg 1$.
3. (cf. Mahler 1934, Davenport 1966, Haneke 1973, Goldfeld & Schinzel 1975) Suppose that χ is a quadratic character, and put $r(n) = \sum_{d|n}\chi(d)$.
(a) Show that
\[ \sum_{n\le y}\frac{\chi(n)}{n} = L(1,\chi) + O\big(q^{1/2}y^{-1}\log q\big). \]
(b) Show that
\[ \sum_{n\le y}\frac{\chi(n)\log n}{n} = -L'(1,\chi) + O\big(q^{1/2}y^{-1}(\log qy)^2\big). \]
(c) Verify that
\[ \sum_{n\le x}\frac{r(n)}{n} = \sum_{d\le y}\frac{\chi(d)}{d}\sum_{m\le x/d}\frac1m + \sum_{m\le x/y}\frac1m\sum_{d\le x/m}\frac{\chi(d)}{d} - \sum_{d\le y}\frac{\chi(d)}{d}\sum_{m\le x/y}\frac1m = \Sigma_1 + \Sigma_2 - \Sigma_3, \]
say.
(d) Show that
\[ \Sigma_1 = (\log x + C_0)L(1,\chi) + L'(1,\chi) + O\big(q^{1/2}y^{-1}(\log qy)^2\big) + O(yx^{-1}). \]
(e) Show that
\[ \Sigma_2 = (\log x/y + C_0)L(1,\chi) + O(yx^{-1}\log q) + O\big(q^{1/2}y^{-1}\log q\big). \]
(f) Show that
\[ \Sigma_3 = (\log x/y + C_0)L(1,\chi) + O(yx^{-1}\log q) + O\big(q^{1/2}y^{-1}(\log qx)^2\big). \]
(g) Show that
\[ \sum_{n\le x}\frac{r(n)}{n} = (\log x + C_0)L(1,\chi) + L'(1,\chi) + O\big(q^{1/4}x^{-1/2}(\log qx)^{3/2}\big). \]
(h) Show that for each c < 1/2 there is a constant q0(c) such that if q ≥ q0(c) and L(1, χ) < c/log q, then
\[ L'(1,\chi) \asymp \sum_{n\le q}\frac{r(n)}{n}. \]
(i) Show that $L''(\sigma,\chi) \ll (\log q)^3$ for σ ≥ 1 − 1/log q.
(j) Show that there is an absolute constant c > 0 such that if L(s, χ) has an exceptional zero β1 for which $\beta_1 \ge 1 - c/(\log q)^3$, then
\[ L(1,\chi) \asymp (1-\beta_1)\sum_{n\le q}\frac{r(n)}{n}. \]
4. Use Estermann's lemma (Lemma 11.13) to give a second proof that if L(s, χ) has an exceptional zero β1, then $L(1,\chi) \gg 1 - \beta_1$ (cf. (11.10) of Theorem 11.4).
5. Use Estermann's lemma (Lemma 11.13) to give a second proof that if χ is a cubic character (mod q), then $L(1,\chi) \gg (\log q)^{-1/2}$ (cf. Exercise 11.1.4(e)).
6. (Tatuzawa 1951) Let χ1 and χ2 be distinct primitive quadratic characters, modulo q1 and q2, respectively, and suppose that $L(1,\chi_i) < C\varepsilon q_i^{-\varepsilon}$ for i = 1, 2 where 0 < ε ≤ 1 and C > 0.
(a) Show that $\min_{x>1} x/\log x = e$. By a change of variables, deduce that if ε > 0, then $\min_{x>1} x^\varepsilon/\log x = e\varepsilon$. Use this to show that $\min_{x>1} x^\varepsilon/(\log x)^2 = e^2\varepsilon^2/4$.
(b) Explain why there exists a constant c1 > 0 such that L(1, χ) ≥ c1/log q whenever L(s, χ) has no exceptional zero. Let C1 = ec1. Show that if C < C1, then L(s, χ1) and L(s, χ2) have exceptional zeros, say β1 and β2. (From now on, suppose that C < C1.)
(c) Explain why there is a positive constant c2 such that L(1, χ) ≥ c2(1 − β) whenever β is an exceptional zero of L(s, χ). Let C2 = c2/6. Show that if C < C2, then β > 1 − ε/6. Let C3 = c2/20. Show that if C < C3, then β > 19/20. (From now on, suppose that C < Ci for i = 1, 2, 3.)
(d) Explain why there is a constant c3 > 0 such that at most one of L(s, χ1), L(s, χ2) has a zero in the interval [1 − c3/log q1q2, 1].
(e) Show that L(s, χ1)L(s, χ2) has a zero β that satisfies the three inequalities β ≥ 19/20, β ≥ 1 − ε/6, β ≤ 1 − c3/log q1q2.
(f) Let f(s) = L(s, χ1)L(s, χ2)L(s, χ1χ2). Show that there is an absolute constant c4 > 0 such that $f(1) \ge c_4(\log q_1q_2)^{-1}(q_1q_2)^{-\varepsilon/2}$.
(g) Explain why there is a constant c5 > 0 such that L(1, χ1χ2) ≤ c5 log q1q2.
(h) Show that $C \ge c_4^{1/2}c_5^{-1/2}e/4$.
(i) Conclude that there is a positive effectively computable absolute C such that if 0 < ε ≤ 1, then the inequality $L(1,\chi) > C\varepsilon q^{-\varepsilon}$ holds for all primitive quadratic characters, with at most one exception.
7. (Fekete & Pólya 1912, Pólya & Szegö 1925, p. 44, Heilbronn 1937) Let $S_1(x,\chi) = \sum_{1\le n\le x}\chi(n)$.
(a) Show that if χ is a quadratic character such that $S_1(x,\chi) \ge 0$ for all x ≥ 1, then L(σ, χ) > 0 for all σ > 0.
(b) Let $\chi_d(n) = \left(\frac{d}{n}\right)$. Show that the hypothesis above holds for d = −3, −4, −7, −8, but not for d = 5, 8.
(c) For k > 1 let $S_k(N,\chi) = \sum_{n=1}^N S_{k-1}(n,\chi)$. Show that
\[ S_k(N,\chi) = \sum_{n=1}^N \binom{N-n+k-1}{k-1}\chi(n). \]
(d) Let $\Delta f(x) = f(x+1) - f(x)$ and $\Delta^k f(x) = \Delta(\Delta^{k-1}f(x))$. Show that $\Delta^k f(x) = \sum_{r=0}^k (-1)^r\binom{k}{r}f(x+k-r)$, and that if $f^{(k)}(x)$ is continuous then
\[ \Delta^k f(x) = \int_x^{x+1}\int_{u_1}^{u_1+1}\cdots\int_{u_{k-1}}^{u_{k-1}+1} f^{(k)}(u_k)\,du_k\,du_{k-1}\cdots du_1. \]
(e) Show that if σ > 0, then $(-1)^k\Delta^k(x^{-\sigma}) > 0$ for all x > 0.
(f) Show that $L(s,\chi) = (-1)^k\sum_{n=1}^\infty S_k(n,\chi)\Delta^k(n^{-s})$ for σ > 0.
(g) Show that if χ is a quadratic character and k is an integer such that $S_k(N,\chi) \ge 0$ for all integers N ≥ 1, then L(σ, χ) > 0 for all σ > 0.
(h) For $\chi_5(n) = \left(\frac{n}{5}\right)$ and $\chi_8(n) = \left(\frac{n}{8}\right)$ find the least k such that the hypothesis above is satisfied.
(i) Let $P(z,\chi) = \sum_{n=1}^\infty \chi(n)z^n$ for |z| < 1. Show that $P(z,\chi)(1-z)^{-k} = \sum_{n=1}^\infty S_k(n,\chi)z^n$ for |z| < 1.
(j) Show that if χ is a quadratic character for which $S_k(N,\chi) \ge 0$ for all positive integers N, then P(z, χ) > 0 for 0 < z < 1.
(k) Show that $\sum_{n=1}^{12}\left(\frac{n}{163}\right)(7/10)^n = -0.0483$, and that $\sum_{n=13}^\infty (7/10)^n = 0.0323$. Deduce that $P(0.7,\chi_{-163}) < 0$, and hence that for any k there is an N for which $S_k(N,\chi_{-163}) < 0$.

8. S. Chowla (1972) conjectured that for any primitive quadratic character χ* there is a character χ induced by χ* such that $S_1(x,\chi) \ge 0$ for all x ≥ 1 (in the notation of the preceding exercise). Show that Chowla's conjecture implies that L(σ, χ) > 0 when χ is a quadratic character and σ > 0. See also Rosser (1950).
9. (Bateman & Chowla 1953) Suppose that k is a positive integer such that
\[ \sum_{1\le n\le x}\frac{\lambda(n)}{n}\Big(1 - \frac{n}{x}\Big)^k \ge 0 \tag{11.19} \]
for all x ≥ 1. (It is not known whether there is such a k.)
(a) Show that if χ is a quadratic character, then
\[ \sum_{1\le n\le x}\frac{\chi(n)}{n}\Big(1 - \frac{n}{x}\Big)^k \ge \sum_{1\le n\le x}\frac{\lambda(n)}{n}\Big(1 - \frac{n}{x}\Big)^k \]
for all x ≥ 1.
(b) Show that if there is a k such that (11.19) holds for all x ≥ 1, then L(σ, χ) > 0 when χ is a quadratic character and σ > 0.

11.3 The Prime Number Theorem for arithmetic progressions

The various inequalities for zeros of Dirichlet L-functions established above
are motivated by a desire to imitate for primes in arithmetic progressions the
quantitative form of the Prime Number Theorem achieved in Theorem 6.9. For
(a, q) = 1 we set

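\[ \pi(x;q,a) = \sum_{\substack{p\le x\\ p\equiv a\,(q)}} 1, \qquad \vartheta(x;q,a) = \sum_{\substack{p\le x\\ p\equiv a\,(q)}}\log p, \qquad \psi(x;q,a) = \sum_{\substack{n\le x\\ n\equiv a\,(q)}}\Lambda(n), \tag{11.20} \]
and correspondingly for any Dirichlet character χ we put
\[ \pi(x,\chi) = \sum_{p\le x}\chi(p), \qquad \vartheta(x,\chi) = \sum_{p\le x}\chi(p)\log p, \qquad \psi(x,\chi) = \sum_{n\le x}\chi(n)\Lambda(n). \tag{11.21} \]
By multiplying both sides of (4.27) by Λ(n), and summing over n ≤ x, we see that
\[ \psi(x;q,a) = \frac{1}{\varphi(q)}\sum_\chi \overline{\chi}(a)\psi(x,\chi), \tag{11.22} \]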
and similarly for π(x; q, a) and ϑ(x; q, a). We deal with ψ(x, χ) in much the
same way that we dealt with ψ(x) in Chapter 6.
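The identity (11.22) can be tested numerically for a small modulus. The sketch below assumes q = 5, whose four characters are built from the primitive root 2; the function names and the choice of modulus are illustrative assumptions only.

```python
import cmath
import math

def mangoldt(n):
    """von Mangoldt function: log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:                  # p is the least prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

Q = 5                                   # assumed example modulus
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}         # discrete logarithms to base 2 modulo 5

def chi(k, n):
    """The k-th Dirichlet character modulo 5 (k = 0, 1, 2, 3)."""
    r = n % Q
    if r == 0:
        return 0j
    return cmath.exp(2j * cmath.pi * k * DLOG[r] / 4)

def psi_chi(x, k):
    """psi(x, chi_k) = sum of chi_k(n) Lambda(n) over n <= x."""
    return sum(chi(k, n) * mangoldt(n) for n in range(1, x + 1))

def psi_ap(x, a):
    """psi(x; 5, a) computed directly from the definition (11.20)."""
    return sum(mangoldt(n) for n in range(1, x + 1) if n % Q == a % Q)

def psi_ap_via_characters(x, a):
    """Right-hand side of (11.22) for q = 5, where phi(5) = 4."""
    total = sum(chi(k, a).conjugate() * psi_chi(x, k) for k in range(4))
    return (total / 4).real
```

For each reduced residue a modulo 5, `psi_ap(200, a)` and `psi_ap_via_characters(200, a)` agree up to rounding error, as the orthogonality of characters predicts.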
Theorem 11.16 There is a constant c1 > 0 such that if $q \le \exp(2c_1\sqrt{\log x})$, then
\[ \psi(x,\chi) = E_0(\chi)x + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.23} \]
when L(s, χ) has no exceptional zero, but
\[ \psi(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.24} \]
when L(s, χ) has an exceptional zero β1. Here E0(χ) = 1 if χ = χ0, and E0(χ) = 0 otherwise.
Proof By Theorems 4.8 and 5.2 we see that
\[ \psi(x,\chi) = \frac{-1}{2\pi i}\int_{\sigma_0-iT}^{\sigma_0+iT}\frac{L'}{L}(s,\chi)\frac{x^s}{s}\,ds + R \]
where σ0 > 1 and
\[ R \ll \sum_{x/2<n<2x}\Lambda(n)\min\Big(1, \frac{x}{T|x-n|}\Big) + \frac{(4x)^{\sigma_0}}{T}\sum_{n=1}^\infty\frac{\Lambda(n)}{n^{\sigma_0}} \]
by Corollary 5.3. As in the proof of Theorem 6.9 we suppose that 2 ≤ T ≤ x and set σ0 = 1 + 1/log x. Thus
\[ R \ll \frac{x}{T}(\log x)^2, \]
as before. As in the proof of Theorem 6.9, we let C denote a closed contour
that consists of line segments joining the points σ0 − i T , σ0 + i T , σ1 + i T ,
σ1 − i T , but now the choice of σ1 is a little more complicated, since we want
to ensure that C does not pass too closely to an exceptional zero.
Case 1. There is no exceptional zero. In this case we take σ1 = 1 − c/(5 log qT )
where c is the constant in Theorem 11.3. If χ is non-principal, then the integrand
is analytic on and inside C, but if χ = χ0 , then it has a pole at s = 1 with residue
x. Hence
\[ \frac{-1}{2\pi i}\oint_{\mathcal C}\frac{L'}{L}(s,\chi)\frac{x^s}{s}\,ds = E_0(\chi)x. \tag{11.25} \]
We estimate the integrals from σ0 + iT to σ1 + iT, from σ1 + iT to σ1 − iT, and from σ1 − iT to σ0 − iT as in the proof of Theorem 6.9, using the estimate (11.5) of Theorem 11.4. Thus we find that
\[ \psi(x,\chi) - E_0(\chi)x \ll x(\log x)^2\Big(\frac1T + \exp\Big(\frac{-c\log x}{5\log qT}\Big)\Big). \tag{11.26} \]
Case 2. There is an exceptional zero β1, and it satisfies β1 ≥ 1 − c/(4 log qT). In this case we take σ1 = 1 − c/(3 log qT). The integrand in (11.25) now has a pole inside C at β1, so the left-hand side of (11.25) has the value $-x^{\beta_1}/\beta_1$. Otherwise, the estimates proceed as before, and we find that
\[ \psi(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\Big(x(\log x)^2\Big(\frac1T + \exp\Big(\frac{-c\log x}{5\log qT}\Big)\Big)\Big). \tag{11.27} \]
Case 3. There is an exceptional zero β1, but it satisfies β1 < 1 − c/(4 log qT). We proceed exactly as in Case 1, and so we obtain (11.26). To pass to (11.27) it suffices to note that
\[ \frac{x^{\beta_1}}{\beta_1} \ll x\exp\Big(\frac{-c\log x}{5\log qT}\Big) \]
in the current case.
We have established (11.26) if there is no exceptional zero, and (11.27) if there is one. To complete our argument, we need only observe that if c1 = c/20, if $q \le \exp(2c_1\sqrt{\log x})$, and if $T = \exp(2c_1\sqrt{\log x})$, then (11.26) gives (11.23) and (11.27) gives (11.24).

We are now in a position to prove

Corollary 11.17 (Page) Let c1 be the same constant as in Theorem 11.16. If (a, q) = 1, then
\[ \psi(x;q,a) = \frac{x}{\varphi(q)} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.28} \]
when there is no exceptional character modulo q, and
\[ \psi(x;q,a) = \frac{x}{\varphi(q)} - \frac{\chi_1(a)x^{\beta_1}}{\varphi(q)\beta_1} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.29} \]
when there is an exceptional character χ1 modulo q and β1 is the concomitant zero.

Proof If $q \le \exp(2c_1\sqrt{\log x})$, then we have only to insert the estimates of Theorem 11.16 into (11.22). If q is larger, then the stated estimates are still valid, but are worse than trivial. To see this, note first that the largest term in ψ(x; q, a) is ≤ log x, and the number of terms is ≤ x/q + 1, so it is immediate that
\[ \psi(x;q,a) \le (x/q+1)\log x \ll x\exp(-c_1\sqrt{\log x}) \]
when $q \ge \exp(2c_1\sqrt{\log x})$.

Presumably, exceptional zeros do not exist. However, if such a zero does exist, then we have a second main term in (11.29) that is bigger than the error term when $x < \exp(c_1^2/(1-\beta_1)^2)$. If β1 is extremely close to 1, then one might have β1 ≥ 1 − 1/log x, and in such a situation the second main term is of the same order of magnitude as the first main term, since
\[ x - \frac{x^{\beta_1}}{\beta_1} = (\beta_1-1)x^{\beta_1}/\beta_1 + (\log x)\int_{\beta_1}^1 x^\sigma\,d\sigma \ll (1-\beta_1)x\log x. \tag{11.30} \]
Thus if 1 − β1 is small compared with 1/log x, then the main term is nearly doubled if χ1(a) = −1, and it is nearly annihilated if χ1(a) = 1. Unfortunately, the upper bound provided by the Brun–Titchmarsh theorem (Theorem 3.9) is not quite strong enough to refute such a possibility.
The constants c and c1 in Theorems 11.3, 11.4, 11.16 and Corollary 11.17
are effectively computable. However, if we are willing to accept non-effective
constants, then by Siegel’s theorem (Theorem 11.14), or more precisely by its
corollary (Corollary 11.15), we can eliminate the second main term, provided
that q is more sharply limited.

Corollary 11.18 Let c1 be the same constant as in Theorem 11.16. For any positive A there is an x0(A) such that if $q \le (\log x)^A$, then
\[ \psi(x,\chi) = E_0(\chi)x + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.31} \]
for x ≥ x0(A).

Proof Suppose that χ is quadratic and that L(s, χ) has an exceptional zero β1. Then
\[ x^{\beta_1} = x\exp(-(1-\beta_1)\log x) \le x\exp(-C(\varepsilon)q^{-\varepsilon}\log x) \]
by Siegel's theorem (Corollary 11.15). Since $q \le (\log x)^A$, the above is
\[ \le x\exp\big(-C(\varepsilon)(\log x)^{1-A\varepsilon}\big). \]
In order to reach (11.31) we need to take ε a little smaller than 1/(2A), say ε = 1/(3A). Then the above is
\[ \le x\exp(-c_1\sqrt{\log x}) \]
provided that $x \ge x_0 = \exp((c_1/C(\varepsilon))^6)$.
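The threshold x0 comes from comparing exponents: with ε = 1/(3A) one needs C(ε)(log x)^{2/3} ≥ c1(log x)^{1/2}, which holds exactly when log x ≥ (c1/C(ε))^6. The sketch below illustrates this with made-up numerical values of c1 and C(ε); these values are assumptions for the example only, since the true C(ε) is non-effective.

```python
import math

def siegel_bound_beats_error(log_x, c1, C):
    """True when C * (log x)^(2/3) >= c1 * (log x)^(1/2)."""
    return C * log_x ** (2 / 3) >= c1 * math.sqrt(log_x)

def threshold_log_x(c1, C):
    """The value (c1 / C)^6 of log x0 from the proof."""
    return (c1 / C) ** 6

# Hypothetical constants, chosen only so that the threshold is easy to read.
C1_DEMO, C_DEMO = 0.1, 0.05
log_x0 = threshold_log_x(C1_DEMO, C_DEMO)
```

With these illustrative constants the threshold is log x0 = 64: the comparison fails just below it and holds just above it.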

The constraint $q \le (\log x)^A$ can be rewritten as $x \ge \exp(q^{1/A})$. This implies the constraint x ≥ x0(A) if q is sufficiently large, say q ≥ q0(A). We note also that the implicit constant in (11.31) is absolute. If we were to allow the implicit constant to depend on A, e.g. to be as large as $\exp((c_1/C(\varepsilon))^3)$, then we would obtain an estimate
\[ \psi(x,\chi) \ll_A x\exp(-c_1\sqrt{\log x}) \]
that is valid for all q and all $x \ge \exp(q^{1/A})$, though of course the implicit constant is so large that the bound is worse than the trivial $\psi(x,\chi) \ll x$ when x < x0. By applying (11.22) and (11.28), we obtain

Corollary 11.19 (The Siegel–Walfisz theorem) Let c1 be the constant in Theorem 11.16, and suppose that A is given, A > 0. If $q \le (\log x)^A$ and (a, q) = 1, then
\[ \psi(x;q,a) = \frac{x}{\varphi(q)} + O_A\big(x\exp(-c_1\sqrt{\log x})\big). \]
Pertaining to ϑ(x; q, a) and π(x; q, a) we have estimates similar to those of
Corollary 11.17.

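Corollary 11.20 Let c1 be the constant in Theorem 11.16. If (a, q) = 1, then
\[ \vartheta(x;q,a) = \frac{x}{\varphi(q)} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.32} \]
and
\[ \pi(x;q,a) = \frac{\mathrm{li}(x)}{\varphi(q)} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.33} \]
when there is no exceptional character modulo q, but
\[ \vartheta(x;q,a) = \frac{x}{\varphi(q)} - \frac{\chi_1(a)x^{\beta_1}}{\varphi(q)\beta_1} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.34} \]
and
\[ \pi(x;q,a) = \frac{\mathrm{li}(x)}{\varphi(q)} - \frac{\chi_1(a)\,\mathrm{li}(x^{\beta_1})}{\varphi(q)} + O\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.35} \]
when there is an exceptional character χ1 modulo q and β1 is the concomitant zero.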

Proof Since
\[ 0 \le \psi(x;q,a) - \vartheta(x;q,a) \le \psi(x) - \vartheta(x) \ll x^{1/2}, \]
the assertions concerning ϑ(x; q, a) follow immediately from Corollary 11.17.

As for π(x; q, a), we write
\[ \pi(x;q,a) = \int_{2^-}^x \frac{1}{\log u}\,d\vartheta(u;q,a) = \frac{\mathrm{li}(x)}{\varphi(q)} + \int_{2^-}^x \frac{1}{\log u}\,d\big(\vartheta(u;q,a) - u/\varphi(q)\big). \]
This last integral we integrate by parts (as in the proof of Theorem 6.9), and find that it is
\[ \frac{\vartheta(u;q,a) - u/\varphi(q)}{\log u}\bigg|_{2^-}^{x} + \int_2^x \frac{\vartheta(u;q,a) - u/\varphi(q)}{u(\log u)^2}\,du. \]
If there is no exceptional zero, then the numerator in the integrand is $\ll u\exp(-c_1\sqrt{\log u}) \ll x\exp(-c_1\sqrt{\log x})$, so we obtain (11.33). If there is an exceptional character χ1, then the main term is reduced by χ1(a)/ϕ(q) times the amount
\[ \int_2^x \frac{1}{\log u}\,d\,\frac{u^{\beta_1}}{\beta_1} = \int_2^x \frac{u^{\beta_1-1}}{\log u}\,du = \int_{2^{\beta_1}}^{x^{\beta_1}} \frac{1}{\log v}\,dv = \mathrm{li}(x^{\beta_1}) + O(1). \]
The error term is still treated in the same way, so we obtain (11.35).

By arguing in the same manner from Corollary 11.19, we obtain

Corollary 11.21 Let c1 be the constant in Theorem 11.16, and suppose that A is given, A > 0. If $q \le (\log x)^A$ and (a, q) = 1, then
\[ \vartheta(x;q,a) = \frac{x}{\varphi(q)} + O_A\big(x\exp(-c_1\sqrt{\log x})\big) \tag{11.36} \]
and
\[ \pi(x;q,a) = \frac{\mathrm{li}(x)}{\varphi(q)} + O_A\big(x\exp(-c_1\sqrt{\log x})\big). \tag{11.37} \]

11.3.1 Exercises
1. Suppose that χ is a character modulo q. Explain why
\[ \psi(x,\chi) = \sum_{\substack{a=1\\ (a,q)=1}}^{q}\chi(a)\psi(x;q,a). \]
2. Suppose that $\exp(2c_1\sqrt{\log x}) \le q \le x$. Show that there is a positive constant c2 such that
\[ \psi(x,\chi) = E_0(\chi)x + O\Big(x\exp\Big(\frac{-c_2\log x}{\log q}\Big)\Big) \]
if L(s, χ) has no exceptional zero, and that
\[ \psi(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\Big(x\exp\Big(\frac{-c_2\log x}{\log q}\Big)\Big) \]
if L(s, χ) has the exceptional zero β1.
3. Show that if $q \le \exp(2c_1\sqrt{\log x})$, then
\[ \vartheta(x,\chi) = E_0(\chi)x + O\big(x\exp(-c_1\sqrt{\log x})\big) \]
when L(s, χ) has no exceptional zero, and that
\[ \vartheta(x,\chi) = -\frac{x^{\beta_1}}{\beta_1} + O\big(x\exp(-c_1\sqrt{\log x})\big) \]
when L(s, χ) has an exceptional zero β1.
4. Suppose that $q \le \exp(c_1\sqrt{\log x})$, and put $x_0 = \exp\big(\big(\frac{\log q}{2c_1}\big)^2\big)$.
(a) Explain why $\pi(x_0,\chi) \ll x_0 \le x^{1/4}$.
(b) Treat π(x, χ) − π(x0, χ) as in the proof of Corollary 11.20 to show that
\[ \pi(x,\chi) \ll x\exp(-c_1\sqrt{\log x}) \]
if L(s, χ) has no exceptional zero, and that
\[ \pi(x,\chi) = -\mathrm{li}(x^{\beta_1}) + O\big(x\exp(-c_1\sqrt{\log x})\big) \]
if L(s, χ) has the exceptional zero β1.

5. Suppose that A is given, A > 0. Show that if $q \le (\log x)^A$, then
\[ \vartheta(x,\chi) = E_0(\chi)x + O\big(x\exp(-c_1\sqrt{\log x})\big), \]
and that
\[ \pi(x,\chi) = E_0(\chi)\,\mathrm{li}(x) + O\big(x\exp(-c_1\sqrt{\log x})\big). \]
By analogy with (11.20) we set

(x; q, a) = λ(n), M(x; q, a) = µ(n). (11.38)
n≤x n≤x
n≡a(q) n≡a(q)

Here it is no longer natural to restrict to (a, q) = 1. Correspondingly, if χ is a

character modulo q, we put

(x, χ) = χ (n)λ(n), M(x, χ) = χ (n)µ(n). (11.39)
n≤x n≤x
√
6. Let c1 be the constant of Theorem 11.16, suppose that q ≤ exp(2c1 log x)
and that χ is a character modulo q. Show that

(x, χ) x exp − c1 log x

when L(s, χ) has no exceptional zero, and that

L(2β1 , χ0 )x β1
(x, χ ) =
+ O x exp − c1 log x
L (β1 , χ)β1
when L(s, χ) has an exceptional zero β1 . (Note that in this latter case, the
result of Exercise 11.1.2 is useful.)

√
7. Let c1 be the constant of Theorem 11.16, suppose that q ≤ exp(2c1 log x)
and that χ is a character modulo q. Show that

M(x, χ ) x exp − c1 log x

when L(s, χ ) has no exceptional zero, and that

x β1
M(x, χ) =
+ O x exp − c1 log x
L (β1 , χ)β1
when L(s, χ ) has an exceptional zero β1 .
8. Let c1 be the constant in Theorem 11.16, and suppose that A is given,
A > 0. Show that if q ≤ (log x) A and χ is a character modulo q, then

(x, χ ) A exp − c1 log x ,

and that

M(x, χ) A
x exp − c1 log x .

9. Show that if (a, q) = 1, then

1
(x; q, a) = χ (a)(x, χ ),
ϕ(q) χ

and that
1
M(x; q, a) = χ (a)M(x, χ).
ϕ(q) χ

10. Let c1 be the constant in Theorem 11.16. Show that if (a, q) = 1, then

(x; q, a) x exp − c1 log x

if there is no exceptional χ modulo q, and that

χ1 (a)L(2β1 , χ0 )x β1
(x; q, a) =
+ O x exp − c1 log x
ϕ(q)L (β1 , χ1 )β1
if there is an exceptional character χ1 modulo q with associated zero β1 .
11. Suppose that (a, q) = d, and write a = db, q = dr .
(a) Show that (x; q, a) = λ(d)(x/d; r, b).
(c) Show that
x
(x; q, a) exp − c1 log x/d
d
if no L-function modulo r has an exceptional zero, and that
λ(d)χ1 (b)L(2β1 , χ0 )(x/d)β1 x
(x; q, a) = + O exp − c 1 log x/d
ϕ(r )L (β1 , χ1 )β1 d

if there is an exceptional character χ1 modulo r with associated zero

β1 . Here χ0 is the principal character modulo r .
(d) Show that if q ≤ (log x) A , then

(x; q, a) A x exp − c1 log x

for all a.
12. Suppose that (a, q) = 1. Show that

M(x; q, a) x exp − c1 log x

if there is no exceptional character χ modulo q, and that

χ1 (a)x β1
M(x; q, a) = + O x exp − c1 log x
ϕ(q)L (β1 , χ1 )β1
if there is an exceptional character χ1 modulo q with associated
zero β1 .
13. Suppose that d = (a, q), and write q = dr , a = bd.
(a) Show that if d is not square-free, then M(x; q, a) = 0.
(b) Explain why one does not expect that M(x; q, a) = µ(d)M(x/d; r, b)
is true in general.
(c) Show instead that

M(x; q, a) = µ(d) µ(k)M(x/(dk); r, bk)
k|d
(k,r )=1

where kk ≡ 1 (mod r ).
(d) Show that M(x; q, a) x/q in any case.
√
(e) Deduce that M(x; q, a) x exp(−c log x) if there is no exceptional
character modulo r , and that

µ(d)χ1 (b)(x/d)β1 χ1 ( p)
M(x; q, a) = 1 − + O x exp − c log x
ϕ(r )L (β1 , χ1 )β1 p|d p β1
pr

if there is an exceptional character χ1 with associated zero β1 .

√
(f) Show that if q ≤ (log x) A , then M(x; q, a) A x exp(−c log x) for
all a.
√
14. Grössencharaktere for Q( −1), continued from Exercise 11.1.5. Put
√
ψ(x, χm ) = N (a)≤x (a)χm (a). Show that if 1 ≤ m ≤ exp( log x),
√
then ψ(x, χm ) x exp(−c log x) where c > 0 is a suitable absolute
constant.

11.4 Applications
The fundamental estimates of the preceding section can be applied to a
wide variety of counting problems, of which the following are representative
examples.

Theorem 11.22 (Walﬁsz) Let A > 0 be ﬁxed, and let R(n) denote the number
of ways of writing n as a sum of a prime and a square-free number. Then

R(n) = c(n)li(n) + O n/(log n) A

where

1 1 1
c(n) = 1− = 1+ 2 1− .
pn
p( p − 1) p|n
p − p−1 p p( p − 1)

Proof Clearly
\[ R(n) = \sum_{p<n} \mu(n-p)^2 = \sum_{p<n} \sum_{d^2 \mid (n-p)} \mu(d) \]
by (2.4). Here the divisibility relation is equivalent to asserting that
$p \equiv n \pmod{d^2}$. Hence on inverting the order of summations we see that the above
is
\[ = \sum_{d \le \sqrt{n}} \mu(d)\,\pi(n-1; d^2, n). \]
If $(d, n) > 1$, then the summand is $O(1)$, and hence such $d \le \sqrt{n}$ contribute
an amount that is $O(\sqrt{n})$. We now restrict our attention to those $d$ for which
$(d, n) = 1$. For small $d$, say $d \le y = (\log n)^A$, we can apply the Siegel–Walfisz
theorem (Corollary 11.19). Thus we see that
\[ \sum_{\substack{d \le y \\ (d,n)=1}} \mu(d)\,\pi(n-1; d^2, n) = \operatorname{li}(n) \sum_{\substack{d \le y \\ (d,n)=1}} \frac{\mu(d)}{\varphi(d^2)} + O\Big(n y \exp\big(-c\sqrt{\log n}\big)\Big). \]

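Since $\varphi(d^2) = d\varphi(d)$, we see that the sum in the main term is
\[ \sum_{\substack{d=1 \\ (d,n)=1}}^{\infty} \frac{\mu(d)}{d\varphi(d)} + O\Big(\sum_{d>y} \frac{1}{d\varphi(d)}\Big) = \prod_{p \nmid n}\Big(1 - \frac{1}{p(p-1)}\Big) + O(1/y) \]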

by (1.31). To treat $d > y$ we could appeal to the Brun–Titchmarsh theorem
(Theorem 3.9), but the moduli $d^2$ are increasing so rapidly that the trivial
estimate $\pi(x; q, a) \ll 1 + x/q$ is enough:
\[ \sum_{y < d < \sqrt{n}} \pi(n-1; d^2, n) \ll \sum_{y < d < \sqrt{n}} \frac{n}{d^2} \ll \frac{n}{y}. \]

On combining our estimates we obtain the stated result.
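As a numerical illustration of Theorem 11.22, the following sketch counts $R(n)$ by brute force and compares it with the main term $c(n)\operatorname{li}(n)$. The function names, the truncation of the Euler product at `prime_bound`, and the midpoint-rule evaluation of $\operatorname{li}(n) = \int_2^n du/\log u$ are choices made for this illustration only; they are not part of the text.

```python
import math

def primes_upto(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def is_squarefree(m):
    """True if no square d^2 > 1 divides m (so 1 counts as square-free)."""
    d = 2
    while d * d <= m:
        if m % (d * d) == 0:
            return False
        d += 1
    return True

def R(n):
    """Number of primes p < n such that n - p is square-free."""
    return sum(1 for p in primes_upto(n - 1) if is_squarefree(n - p))

def c(n, prime_bound=10**5):
    """Product of 1 - 1/(p(p-1)) over primes p not dividing n.

    The infinite product is truncated at prime_bound (an assumption of
    this sketch); the neglected tail is 1 + O(1/prime_bound).
    """
    prod = 1.0
    for p in primes_upto(prime_bound):
        if n % p != 0:
            prod *= 1.0 - 1.0 / (p * (p - 1))
    return prod

def li(x, steps=200_000):
    """li(x) = integral from 2 to x of du/log u, by the midpoint rule."""
    h = (x - 2.0) / steps
    return h * sum(1.0 / math.log(2.0 + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    for n in (1000, 1001, 30030):
        print(n, R(n), round(c(n) * li(n), 1))
```

The factor $c(n)$ is larger when $n$ has many small prime factors, since the primes dividing $n$ are omitted from a product whose factors are all less than 1; the printed values for $n = 30030 = 2\cdot3\cdot5\cdot7\cdot11\cdot13$ make this visible.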

In some situations, as below, we ﬁnd it fruitful to use the Prime Number

Theorem for arithmetic progressions in conjunction with sieve estimates.

Theorem 11.23 Let $N(x)$ denote the number of integers $n \le x$ for which
$(n, \varphi(n)) = 1$. Then
\[ N(x) \sim \frac{e^{-C_0}x}{\log\log\log x} \]
as $x \to \infty$.

Proof We note that $(n, \varphi(n)) = 1$ if and only if $n$ has the following two
properties: (i) $n$ is square-free, and (ii) there do not exist prime factors $p$, $p'$ of $n$
such that $p' \equiv 1 \pmod{p}$. Let $p(n)$ denote the least prime factor of $n$. We shall
show that if p(n) is small compared with log log x then n is unlikely to have the
property (ii). We also show that n is likely to have both properties (i) and (ii) if
p(n) is large compared with log log x. Thus N (x) is approximately the number
of integers n ≤ x for which p(n) > log log x.
Let $A_p(x)$ denote the number of $n \le x$ that satisfy (i) and (ii) and for which
$p(n) = p$. Thus
\[ N(x) = \sum_{p \le x} A_p(x). \]

We begin by estimating $A_p(x)$ when $p \le \log\log x$. Let $p$ be given, and suppose
that $n$ is an integer such that $p(n) = p$ and for which (ii) holds. Write $n = pm$;
then $m$ is relatively prime to all prime numbers $< p$ and also to all primes
$\equiv 1 \pmod{p}$. Thus by the sieve estimate (3.20) we see that
\[ A_p(x) \ll \frac{x}{p} \prod_{p' < p}\Big(1 - \frac{1}{p'}\Big) \prod_{\substack{p' \le x/p \\ p' \equiv 1\,(p)}}\Big(1 - \frac{1}{p'}\Big). \]

Here the first product is $\asymp 1/\log p$ by Mertens' estimate (Theorem 2.7(e)).
By Theorem 4.12(d) we know that the second product is $\asymp (\log x)^{-1/(p-1)}$ for
any fixed prime $p$. To derive a bound that is uniform in $p$ we appeal to the
Siegel–Walfisz theorem (Corollary 11.19), by which we see that $\pi(u; p, 1) \gg
u/(p \log u)$ uniformly for $u \ge e^p$. Hence by integrating by parts we deduce
that
\[ \sum_{\substack{e^p \le p' \le x/p \\ p' \equiv 1\,(p)}} \frac{1}{p'} \gg \frac{1}{p}\big(\log\log x/p - \log p\big) \gg \frac{\log\log x}{p} \]

uniformly for $p \le \log\log x$. Hence there is a constant $c > 0$ such that in this
range,
\[ A_p(x) \ll \frac{x}{p \log p} \exp\big(-c(\log\log x)/p\big). \]
Now it is not hard to show that the number of integers $n \le x$ such that $p(n) = p$
is $\asymp x/(p \log p)$ uniformly for $p \le x/2$. Hence the exponential above reflects
the relative improbability that $n$ satisfies condition (ii). On summing, we find
that
\[ \sum_{\frac{1}{2}U < p \le U} A_p(x) \ll \frac{x}{(\log U)^2} \exp\big(-c(\log\log x)/U\big). \]
We take $U = 2^{-k}\log\log x$ and sum over $k$ to see that
\[ \sum_{p \le \log\log x} A_p(x) \ll \frac{x}{(\log\log\log x)^2}. \]

We now consider $n$ for which $p(n)$ is large, say $p(n) \ge y$ where $y$, to be
chosen later, is somewhat larger than $\log\log x$. Let $\Phi(x, y)$ denote the number
of integers $n \le x$ composed entirely of prime numbers $> y$. By the sieve of
Eratosthenes (Theorem 3.1) and Mertens' estimate (Theorem 2.7(e)) we see
that
\[ \sum_{y < p \le x} A_p(x) \le \Phi(x, y) = \frac{e^{-C_0}x}{\log y} + O\Big(\frac{x}{(\log y)^2}\Big) + O\big(e^{y/\log y}\big). \]

To derive a corresponding lower bound for the left-hand side we start with the
numbers counted by $\Phi(x, y)$ and then delete those that do not satisfy (i) or (ii).
If $n$ does not satisfy (i), then there is a prime number $p$ such that $p^2 \mid n$. The
number of such $n \le x$ is not more than $[x/p^2] \le x/p^2$. Hence the total number
of $n$ counted in $\Phi(x, y)$ for which (i) fails is not more than $x \sum_{p>y} p^{-2} \ll
x/(y \log y)$. Similarly, if $n$ does not satisfy (ii), then there exist primes $p$, $p'$
with $pp' \mid n$ such that $p' \equiv 1 \pmod{p}$. If $p$ and $p'$ are given, then the number
of $n \le x$ for which $pp' \mid n$ is $\le x/(pp')$. Hence the total number of $n$ counted in
$\Phi(x, y)$ for which (ii) fails is not more than
\[ x \sum_{y \le p \le \sqrt{x}} \frac{1}{p} \sum_{\substack{p' \le x/p \\ p' \equiv 1\,(p)}} \frac{1}{p'}. \tag{11.40} \]
By the Brun–Titchmarsh inequality (Theorem 3.9) we see that
\[ \sum_{\substack{U < p' \le 2U \\ p' \equiv 1\,(p)}} \frac{1}{p'} \ll \frac{1}{p \log 2U/p} \]
uniformly for $U \ge p$. We take $U = 2^k p$ and sum over $k$ to see that the inner
sum in (11.40) is $\ll (\log\log 4x/p^2)/p$. Hence the expression (11.40) is
\[ \ll x(\log\log x) \sum_{p > y} \frac{1}{p^2} \ll \frac{x \log\log x}{y \log y}. \]

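On combining our estimates we see that
\[ \sum_{y \le p \le x} A_p(x) \ge \frac{e^{-C_0}x}{\log y} - O\Big(\frac{x}{(\log y)^2}\Big) - O\big(e^{y/\log y}\big) - O\Big(\frac{x}{y \log y}\Big) - O\Big(\frac{x \log\log x}{y \log y}\Big). \]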
In order that the last error term above is of a smaller order of magnitude than
the main term, it is necessary to choose $y$ so that $y/\log\log x \to \infty$. Thus there
is necessarily a remaining range $\log\log x < p \le y$ to be treated. By using the
sieve (i.e., (3.20)) as in our treatment of small $p$ we see that the number of
integers $n \le x$ for which $p(n) = p$ is $\ll x/(p \log p)$, uniformly for $p \le \sqrt{x}$.
Hence $A_p(x) \ll x/(p \log p)$, and consequently
\[ \sum_{U \le p \le 2U} A_p(x) \ll \frac{x}{(\log U)^2}. \]
We put $U = 2^k \log\log x$ and sum over $1 \le k \le K$ where $K \ll \log\frac{y}{\log\log x}$ to
see that
\[ \sum_{\log\log x \le p \le y} A_p(x) \ll \frac{x}{(\log\log\log x)^2} \log\frac{y}{\log\log x}. \]

In order that this is a smaller order of magnitude than the main term, it is
necessary to take $y \le (\log\log x)^{1+\varepsilon}$ with $\varepsilon \to 0$ as $x \to \infty$. By taking $y$ to
be of this form with $\varepsilon$ tending to 0 slowly, we obtain the stated result.
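The asymptotic in Theorem 11.23 involves a triple logarithm, so the convergence is extremely slow. The sketch below counts $N(x)$ directly and prints it next to $e^{-C_0}x/\log\log\log x$. The function names and the rounded numerical value of Euler's constant $C_0$ are supplied for this illustration; the main-term function is meaningful only for $x > e^e$, where $\log\log\log x > 0$.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant, written C_0 in the text

def totients_upto(x):
    """Return a list phi with phi[n] = Euler's totient of n for 0 <= n <= x."""
    phi = list(range(x + 1))
    for p in range(2, x + 1):
        if phi[p] == p:  # still untouched, so p is prime
            for m in range(p, x + 1, p):
                phi[m] -= phi[m] // p
    return phi

def N(x):
    """Number of n <= x with gcd(n, phi(n)) = 1."""
    phi = totients_upto(x)
    return sum(1 for n in range(1, x + 1) if math.gcd(n, phi[n]) == 1)

def main_term(x):
    """e^{-C_0} x / log log log x; requires x > e^e (about 15.2)."""
    return math.exp(-EULER_C0) * x / math.log(math.log(math.log(x)))

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5, 10**6):
        print(x, N(x), round(main_term(x)))
```

At these heights one should not expect the ratio of the two columns to be close to 1: the proof shows that the relative error decays only like a power of $1/\log\log\log x$.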

11.4.1 Exercises
1. Let $R(n)$ be defined as in Theorem 11.22.
(a) Show that if there is a primitive quadratic character $\chi_1 \pmod{q_1}$, $q_1 \le
\exp(\sqrt{\log x})$, for which $L(s, \chi_1)$ has a real zero $\beta_1 > 1 - c(\log x)^{-1/2}$,
then
\[ R(n) = c(n)\operatorname{li}(n) - \chi_1(n)c_1(n)\operatorname{li}(n^{\beta_1}) + O\Big(n \exp\big(-c\sqrt{\log n}\big)\Big) \]
where
\[ c_1(n) = \sum_{\substack{d=1 \\ (d,n)=1 \\ q_1 \mid d^2}}^{\infty} \frac{\mu(d)}{d\varphi(d)}. \]
(b) Show that $c_1(n) = 0$ if $8 \mid q_1$.
(c) Show that if $q_1$ is odd, then
\[ c_1(n) = \frac{\mu(q_1)c(q_1 n)}{q_1\varphi(q_1)}. \]
(d) Show that if $4 \,\|\, q_1$, then
\[ c_1(n) = \frac{4\mu(q_1/2)c(q_1 n)}{q_1\varphi(q_1)}. \]
2. In the proof of Theorem 11.23, specify ε as an explicit function of x to show
that

x log log log log x
N (x) = e−C0 + O .
log log log x log log log x
3. Let $a$ be a fixed non-zero integer. Show that the number of primes $p \le x$
such that $p + a$ is square-free is $c(a)\operatorname{li}(x) + O_A(x(\log x)^{-A})$ where $c(a)$ is
defined as in Theorem 11.22.
4. Show that the appeal to the Siegel–Walfisz theorem in the proof of Theorem
11.23 can be replaced by an appeal to Page’s theorem in conjunction with
Corollary 11.12.
5. (Vaughan 1973) Let $A$ and $B$ be positive numbers. Show that
\[ \sum_{p \le x} \Big(\frac{\varphi(p-1)}{p-1}\Big)^{B} = C \operatorname{li}(x) + O_{A,B}\big(x/(\log x)^A\big) \]
where
\[ C = \prod_{p} \Big(1 - \frac{1 - (1 - 1/p)^B}{p-1}\Big). \]
6. (Erdős 1951)
(a) Let $r(n)$ denote the number of solutions of $p + 2^k = n$ with $p$ prime
and $k \ge 1$, and let $y = c\sqrt{\log x}$ where $c$ is a sufficiently small positive
constant. Define $q' = \prod_{2 < p \le y} p$. If there is a primitive character $\chi^*$
modulo $q^*$ with $q^* \mid q'$ for which $L(s, \chi^*)$ has an exceptional zero, then
let $p$ be any prime divisor of $q^*$ and define $q = q'/p$. Otherwise let
$q = q'$. Prove that
\[ \sum_{m \le x/q} r(qm) = \frac{x}{\varphi(q)\log 2} + O\Big(\frac{x}{\varphi(q)\log x}\Big). \]
(b) Show that $r(n) = \Omega(\log\log n)$.


11.5 Notes
Section 11.1. Theorem 11.3 is a combination of work by Gronwall (1913) and
Titchmarsh (1930).
Section 11.2. Lemma 11.6, Theorem 11.7, and Corollaries 11.8, 11.9 originate
in Landau (1918a, b), while Corollary 11.10 is from Page (1935). Theorem
11.11 can also be proved by appealing to the Dirichlet class number formula,
which asserts that if $d$ is a quadratic discriminant and $\chi_d(n) = \left(\frac{d}{n}\right)_K$ is the
associated quadratic character, then
\[ L(1, \chi_d) = \begin{cases} \dfrac{2\pi h}{w\sqrt{-d}} & (d < 0), \\[1.5ex] \dfrac{h \log \varepsilon}{\sqrt{d}} & (d > 0); \end{cases} \]
see Davenport (2000, Section 6). If $d < 0$, then $\chi_d(-1) = -1$, $\mathbb{Q}(\sqrt{d})$ is an
imaginary quadratic field with class number $h$, and $w$ denotes the number of
roots of unity in the field (which is to say that $w = 6$ if $d = -3$, $w = 4$ if
$d = -4$, and $w = 2$ otherwise). If $d > 0$, then $\chi_d(-1) = 1$, $\mathbb{Q}(\sqrt{d})$ is a real
quadratic field with class number $h$ and fundamental unit $\varepsilon$. Since $\varepsilon \gg \sqrt{d}$,
it follows that if $\chi$ is a quadratic character with $\chi(-1) = 1$, then $L(1, \chi) \gg
(\log q)/q^{1/2}$.
Corollary 11.12 has been sharpened by Davenport (1966), Haneke (1973),
and by Goldfeld & Schinzel (1975).
Section 11.3. Let $h(d)$ denote the number of equivalence classes of primitive
binary quadratic forms of discriminant $d$. Gauss (1801, Section 303) conjectured
that $h(d) \to \infty$ as $d \to -\infty$. (The behaviour for $d > 0$ is quite different:
the heuristics of Cohen & Lenstra (1984a, b) predict that $h(p) = 1$ for a positive
proportion of primes $p \equiv 1 \pmod{4}$.) For Gauss, the generic binary quadratic
form was written $ax^2 + 2bxy + cy^2$, which is to say that the middle coefficient
is even. Put $\Delta = b^2 - ac$. In Gauss's notation, Landau (1903) found that if
$\Delta < 0$, then the class number is 1 precisely when $\Delta = -1, -2, -3, -4, -7$.
Binary quadratic forms $ax^2 + bxy + cy^2$ with $d = b^2 - 4ac$ correspond, when
$d$ is a fundamental quadratic discriminant, to ideals in the ring $\mathcal{O}_K$ of integers
in the quadratic number field $K = \mathbb{Q}(\sqrt{d})$. In this notation, $h(d) = 1$ if and
only if $\mathcal{O}_K$ is a unique factorization domain. The problem of determining all
$d < 0$ for which $h(d) = 1$ is now solved, but historically it was enormously
more difficult than the class number 1 problem settled by Landau. Landau
(1918b) recorded Hecke's observation that if $d < 0$ is a quadratic discriminant
and $L(s, \chi_d) > 0$ for $1 - c/\log|d| < s < 1$, then $h(d) \gg_c |d|^{1/2}/\log|d|$. In
view of Dirichlet's class number formula (4.36), we have obtained Hecke's
result, by a different method, in Theorem 11.4. Thus we have a good lower

bound for $h(d)$ when $d < 0$, except for those $d$ for which $L(s, \chi_d)$ has an
exceptional real zero. Deuring (1933) showed that if $h(d) = 1$ has infinitely many
solutions with $d < 0$, then the Riemann Hypothesis is true. Mordell (1934)
showed that the same conclusion can be derived from the weaker hypothesis
that $h(d)$ does not tend to infinity as $d \to -\infty$. Heilbronn (1934) found
that instead of arguing from a hypothetical zero $\rho$ of the zeta function with
$\beta > 1/2$ one could just as well argue from an exceptional zero of a quadratic
$L$-function, and thus proved Gauss's conjecture that $h(d) \to \infty$ as $d \to -\infty$.
Landau (1935) put Heilbronn's theorem in a quantitative form: $h(d) > |d|^{3/8-\varepsilon}$
as $d \to -\infty$. Through a different arrangement of the technical details, Siegel
(1935) sharpened Landau's argument to show that $h(d) > |d|^{1/2-\varepsilon}$, which by
(4.36) is the case $d < 0$ of Theorem 11.14. To achieve his result, Siegel first
generalized to algebraic number fields the formula (found in Exercise 10.1.10) that
Riemann used to prove the functional equation for $\zeta(s)$. Then Siegel applied this
to the quartic number field $K = \mathbb{Q}(\sqrt{d_1}, \sqrt{d_2})$ whose Dedekind zeta function
is $\zeta_K(s) = \zeta(s)L(s, \chi_{d_1})L(s, \chi_{d_2})L(s, \chi_{d_1 d_2})$. It is now recognized that Siegel's
formula arises through the choice of the kernel in a Mellin transform, and that
many other choices work just as well; see Goldfeld (1974). Our exposition is
based on that of Estermann (1948).
It is easy to show that the complex quadratic field of discriminant d < 0
has unique factorization in the nine cases d = −3, −4, −7, −8, −11, −19,
−43, −67, −163. Heilbronn & Linfoot (1934) showed that there could ex-
ist at most one more such discriminant. The ‘problem of the tenth discrimi-
nant’ was solved first by Heegner (1952). However, Heegner’s paper contained
many assertions for which proofs were not provided, and Heegner also used
results from Weber’s Algebra which were known not to be trustworthy. Con-
sequently, for many years Heegner’s paper was thought to be incorrect. Baker
(1966) proved a fundamental lower bound for linear forms in logarithms of
algebraic numbers, which by means of a result of Gel’fond & Linnik (1948)
reduced the class number 1 problem to a finite calculation. Meanwhile, Stark
(1967) showed that there is no tenth discriminant by translating Heegner’s
argument into parallel language where it could be checked. After a reexami-
nation of Heegner’s work, Deuring (1968), Birch (1969), and Stark (1969) all
concluded that Heegner’s paper was after all correct. Gel’fond & Linnik re-
duced the class number problem to a question concerning linear forms in three
logarithms, which Baker treated successfully. However, with a small modifi-
cation of their argument, Gel’fond & Linnik could have reduced the problem
to linear forms in two logarithms, which Gel’fond had already treated. Thus
one could say that Gel’fond & Linnik ‘should’ have solved the problem in
1948.
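The assertion that $h(d) = 1$ for the nine listed discriminants can be checked by direct enumeration. The sketch below counts primitive reduced forms $ax^2 + bxy + cy^2$ of discriminant $d = b^2 - 4ac < 0$, using the classical reduction conditions $-a < b \le a \le c$, with $b \ge 0$ when $a = c$; each equivalence class contains exactly one such form. The function name is chosen for this illustration, and the enumeration is a standard technique rather than a method taken from the text.

```python
import math

def class_number(d):
    """Count primitive reduced forms (a, b, c) with b^2 - 4ac = d < 0.

    Reduced means -a < b <= a <= c, and b >= 0 whenever a == c.
    For a reduced form, |d| = 4ac - b^2 >= 3a^2, which bounds a.
    """
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("d must be a negative integer congruent to 0 or 1 mod 4")
    h = 0
    a = 1
    while 3 * a * a <= -d:
        for b in range(-a + 1, a + 1):       # -a < b <= a
            numerator = b * b - d            # equals 4ac
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue
            if math.gcd(math.gcd(a, abs(b)), c) == 1:
                h += 1
        a += 1
    return h

NINE = [-3, -4, -7, -8, -11, -19, -43, -67, -163]

if __name__ == "__main__":
    for d in NINE:
        print(d, class_number(d))
```

This verifies that each of the nine fields has class number 1; it says nothing about the much harder statement that there is no tenth such discriminant, which is the subject of the work of Heegner, Baker, and Stark described above.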

Baker (1971) and Stark (1971b, 1972) reduced the complete determination
of complex quadratic fields with h(d) = 2 to a finite calculation which was
provided by Bundschuh & Hock (1969), Ellison et al. (1971), Montgomery &
Weinberger (1973), and by Stark (1975).
The effective determination of all quadratic discriminants d < 0 for which
h(d) takes specific larger values became possible only with the addition of
further ideas. Goldfeld (1976) showed that a zero at s = 1/2 of the L-function
of an elliptic curve would be useful if it is of sufficiently high multiplicity.
In particular, if (i) the Birch–Swinnerton-Dyer conjectures are true, and if (ii)
there exist elliptic curves of arbitrarily high rank, then $h(d) \gg_A (\log|d|)^A$ for
arbitrarily large $A$, with an effectively computable implicit constant. Although
these conjectures remain unproved, Gross & Zagier (1986) were able to establish
enough to give an effective lower bound for h(d) tending to infinity. For accounts
of this, see Zagier (1984), Goldfeld (1985), Coates (1986), and finally Oesterlé
(1988), who developed the Goldfeld and Gross–Zagier work to show that
\[ h(d) \ge \frac{1}{55}(\log|d|) \prod_{\substack{p \mid d \\ p < |d|}} \Big(1 - \frac{[2\sqrt{p}]}{p+1}\Big). \]

By means of this inequality, Arno (1992), Wagner (1996), and Arno, Robinson &
Wheeler (1998) treated progressively larger collections of class numbers. Most
recently, Watkins (2004) settled the complete determination of all discriminants
d < 0 for which h(d) ≤ 100.
With regard to Corollary 11.17, Page (1935) states the final conclusion in
a less precise form in which the term corresponding to the exceptional zero is
replaced by O(x β1 /φ(q)).
The deduction of Corollaries 11.18 and 11.19 from Siegel’s theorem was
first recorded by Walfisz (1936).
Section 11.4. Theorem 11.22 is due to Walfisz (1936). In a weaker form it
occurs first in Estermann (1931), and is given in a somewhat refined form but
without the benefit of Siegel's theorem in Page (1935). For similar theorems
see Mirsky (1949).
Theorem 11.23 is due to Erdős (1948).

11.6 References
Arno, S. (1992). The imaginary quadratic ﬁelds of class number 4, Acta Arith. 60,
321–334.
Arno, S., Robinson, M. L., & Wheeler, F. S. (1998). Imaginary quadratic ﬁelds with
small class number, Acta Arith. 83, 295–330.

Baker, A. (1966). Linear forms in the logarithms of algebraic numbers, I, Mathematika
13, 204–216.
(1971). Imaginary quadratic fields with class number 2, Ann. of Math. (2) 94, 139–152.
Bateman, P. T. & Chowla, S. (1953). The equivalence of two conjectures in the theory
of numbers, J. Indian Math. Soc. (N.S.) 17, 177–181.
Birch, B. J. (1969). Weber’s class invariants, Mathematika 16, 283–294.
Buell, D. A. (1999). The last exhaustive computation of class groups of complex
quadratic number fields, Number Theory (Ottawa, 1996), CRM Proc. Lecture Notes
19, Providence: Amer. Math. Soc., pp. 35–53.
Bundschuh, P. & Hock, A. (1969). Bestimmung aller imaginär-quadratischen Zahlkörper
der Klassenzahl Eins mit Hilfe eines Satzes von Baker, Math. Z. 111, 191–204.
Coates, J. (1986). The work of Gross and Zagier on Heegner points and the derivatives
of L-series, Seminar Bourbaki, Vol. 1984/1985, Astérisque No. 133–134, 55–72.
Chowla, S. (1972). On L-series and related topics, Proc. Number Theory Conf. (Boulder,
1972), Boulder: University of Colorado, pp. 41–42.
Cohen, H. & Lenstra, H. (1984a). Heuristics on class groups, Number Theory (New
York, 1982). Lecture Notes in Math. 1052. Berlin: Springer-Verlag, pp. 26–36.
(1984b). Heuristics on class groups of number fields, Number Theory (Noordwijker-
hout, 1983). Lecture Notes in Math. 1068. Berlin: Springer-Verlag, pp. 33–62.
Davenport, H. (1966). Eine Bemerkung über Dirichlets L-Funktionen, Nachr. Akad.
Wiss. Göttingen Math.-Phys. Kl. II, 203–212; Collected Works, Vol. 4. London:
Academic Press, 1977, pp. 1816–1825.
(2000). Multiplicative Number Theory, Third edition, Graduate Texts in Math. 74.
New York: Springer-Verlag.
Deuring, M. (1933). Imaginäre quadratische Zahlkörper mit der Klassenzahl 1, Math.
Z. 37, 405–415.
(1968). Imaginäre quadratische Zahlkörper mit der Klassenzahl Eins, Invent.
Math. 5, 169–179.
Ellison, W. J., Pesek, J., Stall, D. S. & Lunnon, W. F. (1971). A postscript to a paper of
A. Baker, Bull. London Math. Soc. 3, 75–78.
Erdős, P. (1948). Some asymptotic formulas in number theory, J. Indian Math. Soc.
(N. S.) 12, 75–78.
(1951). On some problems of Bellman and a theorem of Romanoff, J. Chinese Math.
Soc. (N. S.) 1, 409–421.
Estermann, T. (1931). On the representations of a number as the sum of a prime and a
quadratfrei number, J. London Math. Soc. 6, 219–221.
(1948). On Dirichlet’s L functions, J. London Math. Soc. 23, 275–279.
Fekete, M. & Pólya, G. (1912). Über ein Problem von Laguerre, Rend. Circ. Mat.
Palermo 34, 1–32.
Gauss, C. F. (1801). Disquisitiones Arithmeticae, Leipzig: Fleischer.
Gel'fond, A. O. & Linnik, Yu. V. (1948). On Thue's method in the problem of
effectiveness in quadratic fields, Dokl. Akad. Nauk SSSR 61, 773–776.
Goldfeld, D. M. (1974). A simple proof of Siegel’s theorem, Proc. Nat. Acad. Sci. U.S.A.
71, 1055.
(1975). On Siegel’s zero, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 2, 571–583.
(1976). The class number of quadratic fields and the conjectures of Birch and
Swinnerton-Dyer, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 3, 624–663.

(1985). Gauss’ class number problems for imaginary quadratic fields, Bull. Amer.
Math. Soc. 13, 23–37.
(2004). The Gauss class number problem for imaginary quadratic fields, Heegner
Points and Rankin L-series, Math. Sci. Res. Inst. Publ. 49. Cambridge: Cambridge
University Press, 25–36.
Goldfeld, D. M. & Schinzel, A. (1975). On Siegel’s zero, Ann. Scuola Norm. Sup. Pisa
Cl. Sci. (4) 2, 571–583.
Gronwall, T. H. (1913). Sur les séries de Dirichlet correspondant à des caractères com-
plexes, Rend. Circ. Mat. Palermo 35, 145–159.
Gross, B. H. & Zagier, D. B. (1986). Heegner points and derivatives of L-series, Invent.
Math. 84, 225–320.
Haneke, W. (1973). Über die reellen Nullstellen der Dirichletschen L-Reihen, Acta Arith.
22, 391–421; Corrigendum, 31 (1976), 99–100.
Heegner, K. (1952). Diophantische Analysis und Modulfunktionen, Math. Z. 56, 227–
253.
Heilbronn, H. (1934). On the class-number in imaginary quadratic fields, Quart. J. Math.
Oxford Ser. 5, 150–160.
(1937). On real characters, Acta Arith. 2, 212–213.
Heilbronn, H. & Linfoot, E. (1934). On the imaginary quadratic corpora of class-number
one, Quart. J. Math. Oxford Ser. 5, 293–301.
Landau, E. (1903). Über die Klassenzahl der binären quadratischen Formen von neg-
ativer Discriminante, Math. Ann. 56, 671–676; Collected Works, Vol. 1. Essen:
Thales Verlag, 1985, pp. 354–359.
(1918a). Über imaginär-quadratische Zahlkörper mit gleicher Klassenzahl, Nachr.
Akad. Wiss. Göttingen, 277–284; Collected Works, Vol. 7. Essen: Thales Verlag,
1986, pp. 142–160.
(1918b). Über die Klassenzahl imaginär-quadratischer Zahlkörper, Nachr. Akad.
Wiss. Göttingen, 285–295; Collected Works, Vol. 7. Essen: Thales Verlag,
pp. 150–160.
(1935). Bemerkungen zum Heilbronnschen Satz, Acta Arith. 1, 1–18; Collected Works,
Vol. 9. Essen: Thales Verlag, 1987, pp. 265–282.
Mahler, K. (1934). On Hecke’s theorem on the real zeros of the L-functions and the
class number of quadratic fields, J. London Math. Soc. 9, 298–302.
Mirsky, L. (1949). The number of representations of an integer as the sum of a prime
and a k-free integer, Amer. Math. Monthly 56, 17–19.
Montgomery, H. L. & Weinberger, P. J. (1973). Notes on small class numbers, Acta
Arith. 24, 529–542.
Mordell, L. J. (1934). On the Riemann Hypothesis and imaginary quadratic fields with
given class number, J. London Math. Soc. 9, 405–415.
Oesterlé, J. (1988). Le problème de Gauss sur le nombre de classes, Enseignement Math.
(2) 34, 43–67.
Page, A. (1935). On the number of primes in an arithmetic progression, Proc. London
Math. Soc. (2) 39, 116–141.
Pólya, G. & Szegö, G. (1925). Aufgaben und Lehrsätze aus der Analysis, Vol. 2, Grundl.
Math. Wiss. 20. Berlin: Springer.
Rosser, J. B. (1950). Real roots of real Dirichlet L-series, J. Research Nat. Bur. Standards
45, 505–514.

Siegel, C. L. (1935). Über die Classenzahl quadratischer Zahlkörper, Acta Arith. 1,
83–86.
(1968). Zum Beweis des Starkschen Satzes, Invent. Math. 5, 180–191.
Stark, H. M. (1967). A complete determination of the complex quadratic fields of class-
number one, Michigan Math. J. 14, 1–27.
(1969). On the “gap” in a theorem of Heegner, J. Number Theory 1, 16–27.
(1971a). Recent advances in determining all complex quadratic fields of a given class-
number, Number Theory Institute (Stony Brook, 1969), Proc. Sympos. Pure Math.
20. Providence: Amer. Math. Soc., pp. 401–414.
(1971b). A transcendence theorem for class-number problems, Ann. of Math. (2) 94,
153–173.
(1972). A transcendence theorem for class-number problems, II, Ann. of Math. (2)
96, 174–209.
(1973). Class-numbers of complex quadratic fields, Modular Functions of One Vari-
able, I (Proc. Internat. Summer School, Univ. Antwerp, Antwerp, 1972), Lecture
Notes in Math. 320. Berlin: Springer-Verlag, pp. 153–174.
(1975). On complex quadratic fields with class-number two, Math. Comp. 29, 289–
302.
Tatuzawa, T. (1951). On a theorem of Siegel, Japan. J. Math. 21, 163–178.
Titchmarsh, E. C. (1930). A divisor problem, Rend. Circ. Mat. Palermo 54, 414–429;
Correction, 57 (1933), 478–479.
Vaughan, R. C. (1973). Some applications of Montgomery’s sieve, J. Number Theory 5,
64–79.
Wagner, C. (1996). Class number 5, 6 and 7, Math. Comp. 65, 785–800.
Walfisz, A. (1936). Zur additiven Zahlentheorie. II, Math. Z. 40, 592–607.
Watkins, M. (2004). Class numbers of imaginary quadratic fields, Math. Comp. 73,
907–938.
Zagier, D. (1984). L-series of elliptic curves, the Birch–Swinnerton-Dyer conjecture,
and the class number problem of Gauss, Notices Amer. Math. Soc. 31, 739–743.
12
Explicit formulæ

12.1 Classical formulæ

When we proved the Prime Number Theorem, we confined the contour of
integration to the zero-free region. If we pull the contour further to the left, then
we encounter a number of poles that leave residues, and thus we can express the
error term in the Prime Number Theorem as a sum over the zeros of $\zeta(s)$. Let
$\psi_0(x) = (\psi(x^+) + \psi(x^-))/2$. By applying Perron's formula (Theorem 5.1) to
the Dirichlet series $-\frac{\zeta'}{\zeta}(s) = \sum_n \Lambda(n)n^{-s}$, we see that
\[ \psi_0(x) = \lim_{T \to \infty} \frac{-1}{2\pi i} \int_{\sigma_0 - iT}^{\sigma_0 + iT} \frac{\zeta'}{\zeta}(s)\,\frac{x^s}{s}\,ds. \]
Here the integrand has a pole at $s = 1$, at zeros $\rho$, at $s = 0$, and at the trivial
zeros $-2k$. Since $x^s$ decays very rapidly as $\sigma \to -\infty$, it is reasonable to expect
that we can pull the contour to the left, and thus show that the above is
\[ = x - \lim_{T \to \infty} \sum_{\substack{\rho \\ |\gamma| \le T}} \frac{x^\rho}{\rho} - \frac{\zeta'}{\zeta}(0) + \sum_{k=1}^{\infty} \frac{x^{-2k}}{2k}. \tag{12.1} \]
Here $\frac{\zeta'}{\zeta}(0) = \log 2\pi$ by (10.11) and (10.14), and the sum over the trivial zeros is
\[ -\frac{1}{2}\log\big(1 - 1/x^2\big), \]
which is continuous and tends to 0 as $x \to \infty$. In order to give a rigorous proof
of the above, we first establish estimates for $\frac{\zeta'}{\zeta}(s)$.
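Two of the ingredients above are easy to check numerically: the definition of $\psi_0(x)$ as the average of the one-sided limits of $\psi$, and the closed form $-\frac12\log(1 - 1/x^2)$ for the sum over the trivial zeros. The sketch below does both; the function names and the stopping tolerance are choices made for this illustration.

```python
import math

def mangoldt_table(n):
    """Return a list lam with lam[m] = Lambda(m) for 0 <= m <= n."""
    lam = [0.0] * (n + 1)
    composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if not composite[p]:
            for m in range(p * p, n + 1, p):
                composite[m] = 1
            pk = p
            while pk <= n:          # Lambda(p^k) = log p
                lam[pk] = math.log(p)
                pk *= p
    return lam

def psi0(x):
    """psi_0(x) = (psi(x+) + psi(x-))/2: a term at n = x counts with weight 1/2."""
    n = math.floor(x)
    lam = mangoldt_table(n)
    total = sum(lam[1 : n + 1])
    if n == x:
        total -= lam[n] / 2
    return total

def trivial_zero_sum(x, tol=1e-17):
    """Sum over k >= 1 of x^(-2k)/(2k), for x > 1, summed until terms drop below tol."""
    y = 1.0 / (x * x)
    total, power, k = 0.0, y, 1
    while power / (2 * k) > tol:
        total += power / (2 * k)
        power *= y
        k += 1
    return total

if __name__ == "__main__":
    x = 3.0
    print(trivial_zero_sum(x), -0.5 * math.log(1 - 1 / x**2))
    print(psi0(10.5), math.log(2520))   # psi(10.5) = log lcm(1, ..., 10)
```

The second line of output uses the identity $\psi(x) = \log \operatorname{lcm}(1, 2, \ldots, [x])$, which gives an independent check on the table of $\Lambda(n)$.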
Lemma 12.1 We have
\[ \frac{\zeta'}{\zeta}(s) = \frac{-1}{s-1} + \sum_{\substack{\rho \\ |\gamma - t| \le 1}} \frac{1}{s - \rho} + O(\log \tau) \tag{12.2} \]
uniformly for $-1 \le \sigma \le 2$.


Here the first term on the right is significant only for |t| ≤ 1. We could prove
the above by the same method that we used to prove Lemma 6.4, but we find it
instructive to argue instead from Corollary 10.14.
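Proof By combining (10.29) and Theorem C.1, it is immediate that
\[ \frac{\zeta'}{\zeta}(s) = \frac{-1}{s-1} + \sum_{\rho} \Big(\frac{1}{s-\rho} + \frac{1}{\rho}\Big) - \frac{1}{2}\log \tau + O(1). \]
On applying this at $\sigma + it$ and at $2 + it$, and differencing, it follows that
\[ \frac{\zeta'}{\zeta}(s) = \frac{-1}{s-1} + \sum_{\rho} \Big(\frac{1}{s-\rho} - \frac{1}{2 + it - \rho}\Big) + O(1). \]
By Theorem 10.13 it is clear that
\[ \sum_{\substack{\rho \\ |\gamma - t| \le 1}} \frac{1}{2 + it - \rho} \ll \sum_{\substack{\rho \\ |\gamma - t| \le 1}} 1 \ll \log \tau. \]
Now suppose that $n$ is a positive integer, and consider those zeros $\rho$ for which
$n \le |\gamma - t| \le n + 1$. Since
\[ \frac{1}{s-\rho} - \frac{1}{2 + it - \rho} = \frac{2 - \sigma}{(s-\rho)(2 + it - \rho)} \ll \frac{1}{n^2}, \]
it follows that such zeros contribute an amount
\[ \ll \frac{N(t+n+1) - N(t+n) + N(t-n) - N(t-n-1)}{n^2} \ll \frac{\log(\tau + n)}{n^2}. \]
On summing over $n$ we obtain the stated estimate.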

Lemma 12.2 For each real number $T \ge 2$ there is a $T_1$, $T \le T_1 \le T + 1$,
such that
\[ \frac{\zeta'}{\zeta}(\sigma + iT_1) \ll (\log T)^2 \]
uniformly for $-1 \le \sigma \le 2$.
Proof By Theorem 10.13, there is a $T_1 \in [T, T+1]$ such that $|T_1 - \gamma| \gg
1/\log T$ for all zeros $\rho$. Since each summand in (12.2) is $\ll \log T$, and there
are $\ll \log T$ summands, the estimate is immediate.

The next lemma is useful in Chapter 14, but we establish it here since it is
also an immediate corollary of Lemma 12.1.
Lemma 12.3 For any real number $t$,
\[ \arg \zeta(\sigma + it) \ll \log \tau \]
uniformly for $-1 \le \sigma \le 2$.

The function $\log \zeta(s)$ has a branch point at $s = 1$, and also at zeros $\rho$ of
the zeta function. To obtain a single branch of the logarithm, we remove from
the complex plane the interval $(-\infty, 1]$, and also intervals of the form $(-\infty +
i\gamma, \beta + i\gamma]$. What remains is simply connected, and in this region we take
that branch of $\log \zeta(s)$ for which $\log \zeta(s) \to 0$ as $\sigma \to \infty$. This is the branch
of the logarithm that we have expanded as a Dirichlet series, for $\sigma > 1$ (cf.
Corollary 1.11). Thus, if $t$ is not the ordinate of a zero, we define $\arg \zeta(s) =
\Im \log \zeta(s)$ by continuous variation from $\infty + it$ to $\sigma + it$, which is to say
that
\[ \arg \zeta(s) = -\int_{\sigma}^{\infty} \Im \frac{\zeta'}{\zeta}(\alpha + it)\,d\alpha. \]
If $t$ is the ordinate of a zero then we set $\arg \zeta(s) = (\arg \zeta(\sigma + it^+) + \arg \zeta(\sigma +
it^-))/2$.

Proof Suppose that $-1 \le \sigma \le 2$, and that $t$ is not the ordinate of a zero.
Then
\[ \arg \zeta(\sigma + it) = \arg \zeta(2 + it) - \int_{\sigma}^{2} \Im \frac{\zeta'}{\zeta}(\alpha + it)\,d\alpha. \]
Here $\arg \zeta(2 + it) \ll 1$ uniformly in $t$, by Corollary 1.11. Thus by Lemma 12.1,
the right-hand side above is
\[ -\sum_{|\gamma - t| \le 1} \int_{\sigma}^{2} \Im \frac{1}{\alpha + it - \rho}\,d\alpha + O(\log \tau). \]
Here the summand is
\[ \arctan \frac{\sigma - \beta}{t - \gamma} - \arctan \frac{2 - \beta}{t - \gamma}. \]
If $t > \gamma$, then this lies between $-\pi$ and 0, while if $t < \gamma$, then the above lies
between 0 and $\pi$. Thus in any case the quantity is bounded, and by Theorem
10.13 the number of summands is $\ll \log \tau$, so we have the result when $t$
is not the ordinate of a zero. Since the ordinates of zeros have no finite limit
point, we obtain the same bound when $t$ is the ordinate of a zero, since in that
case $\arg \zeta(s) = (\arg \zeta(\sigma + it^+) + \arg \zeta(\sigma + it^-))/2$.

Lemma 12.4 Let $\mathcal{A}$ denote the set of those points $s \in \mathbb{C}$ such that $\sigma \le -1$
and $|s + 2k| \ge 1/4$ for every positive integer $k$. Then
\[ \frac{\zeta'}{\zeta}(s) \ll \log(|s| + 1) \]
uniformly for $s \in \mathcal{A}$.

Proof We recall (10.27), in which the first two terms are bounded for $s \in \mathcal{A}$.
Also,
\[ \frac{\Gamma'}{\Gamma}(1 - s) \ll \log(|s| + 1) \]
by Theorem C.1. Finally
\[ \cot \frac{\pi s}{2} = i + \frac{2i}{e^{i\pi s} - 1} \ll 1 \]
since $s$ is bounded away from even integers, so we have the result.
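The identity for the cotangent used in this proof follows from writing $\cot z = i(e^{2iz} + 1)/(e^{2iz} - 1)$ with $z = \pi s/2$. A quick numerical check, with function names chosen for this illustration, is sketched below.

```python
import cmath

def cot(z):
    """Complex cotangent cos(z)/sin(z)."""
    return cmath.cos(z) / cmath.sin(z)

def cot_half_pi_s(s):
    """The right-hand side i + 2i/(e^{i*pi*s} - 1), claimed to equal cot(pi*s/2)."""
    return 1j + 2j / (cmath.exp(1j * cmath.pi * s) - 1)

if __name__ == "__main__":
    for s in (-1.5 + 0.3j, -3 + 2j, -7.25 - 5j):
        print(s, cot(cmath.pi * s / 2), cot_half_pi_s(s))
```

The formula also makes the boundedness transparent: the only way the right-hand side can become large is for $e^{i\pi s}$ to approach 1, which happens only when $s$ approaches an even integer.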

We are now in a position to prove the explicit formula (12.1) in a quantitative

form.
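Theorem 12.5 Let $c$ be a constant, $c > 1$, suppose that $x \ge c$, that $T \ge 2$,
and let $\langle x \rangle$ denote the distance from $x$ to the nearest prime power, other than $x$
itself. Then
\[ \psi_0(x) = x - \sum_{\substack{\rho \\ |\gamma| \le T}} \frac{x^\rho}{\rho} - \log 2\pi - \frac{1}{2}\log\big(1 - 1/x^2\big) + R(x, T) \tag{12.3} \]
where
\[ R(x, T) \ll (\log x) \min\Big(1, \frac{x}{T\langle x \rangle}\Big) + \frac{x}{T}(\log xT)^2. \tag{12.4} \]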
Since $\langle x \rangle > 0$ for all $x$, we obtain (12.1) by letting $T \to \infty$ in the above.
Moreover, if $n_1 < n_2$ are two consecutive prime powers, then from the above
we see that $\sum_{|\gamma| \le T} x^\rho/\rho$ converges uniformly for $x$ in an interval of the form
$[n_1 + \delta, n_2 - \delta]$. This sum, of course, cannot be uniformly convergent for $x$
in a neighbourhood of a prime power, since $\psi_0(x)$ has jump discontinuities
at such points, but we see from the above that it is boundedly convergent in
the neighbourhood of a prime power. The sum over $\rho$ is also convergent when
$x = 1$, but it is not boundedly convergent near 1, since $\log(1 - 1/x^2) \to -\infty$
as $x \to 1^+$.
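To see the explicit formula (12.3) at work, one can truncate the sum over zeros and compare with $\psi(x)$ at a point $x$ that is not a prime power. The sketch below uses the first ten zeros $\rho = \frac12 + i\gamma$ with $\gamma > 0$; the ordinates are standard numerical values hard-coded here as an assumption of the illustration, and each zero is paired with its conjugate so that the pair contributes $2\,\Re(x^\rho/\rho)$. The function names are likewise chosen for this sketch.

```python
import cmath
import math

# Ordinates of the first ten zeros of zeta in the upper half-plane
# (numerical values assumed for this sketch; all lie on Re s = 1/2).
ZERO_ORDINATES = [
    14.134725141734693, 21.022039638771555, 25.010857580145688,
    30.424876125859513, 32.935061587739190, 37.586178158825671,
    40.918719012147495, 43.327073280914999, 48.005150881167160,
    49.773832477672302,
]

def psi(x):
    """Chebyshev's psi(x): sum of log p over prime powers p^k <= x (naive, small x only)."""
    total = 0.0
    for n in range(2, math.floor(x) + 1):
        p = next(d for d in range(2, n + 1) if n % d == 0)  # least prime factor
        m = n
        while m % p == 0:
            m //= p
        if m == 1:               # n is a power of p
            total += math.log(p)
    return total

def explicit_formula(x, ordinates=ZERO_ORDINATES):
    """Right side of (12.3) without R(x, T), using only the listed zeros; needs x > 1."""
    total = x - math.log(2 * math.pi) - 0.5 * math.log(1 - 1 / x**2)
    for gamma in ordinates:
        rho = complex(0.5, gamma)
        # rho and its conjugate together contribute 2 * Re(x^rho / rho)
        total -= 2 * (cmath.exp(rho * math.log(x)) / rho).real
    return total

if __name__ == "__main__":
    for x in (10.5, 20.5, 50.5):
        print(x, psi(x), explicit_formula(x))
```

With so few zeros the agreement is only rough, in keeping with the size of $R(x, T)$ in (12.4) for $T \approx 50$; including more zeros sharpens the approximation away from prime powers.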
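Proof Let $T_1$ be the number supplied by Lemma 12.2. Then by Theorem 5.2
and its Corollary 5.3, with $\sigma_0 = 1 + 1/\log x$, we see that
\[ \psi_0(x) = \frac{-1}{2\pi i} \int_{\sigma_0 - iT_1}^{\sigma_0 + iT_1} \frac{\zeta'}{\zeta}(s)\,\frac{x^s}{s}\,ds + R_1 \]
where
\[ R_1 \ll \sum_{\substack{x/2 < n < 2x \\ n \ne x}} \Lambda(n) \min\Big(1, \frac{x}{T|x - n|}\Big) + \frac{x}{T} \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{\sigma_0}}. \]
Here the second sum is $-\frac{\zeta'}{\zeta}(\sigma_0) \asymp 1/(\sigma_0 - 1) = \log x$. In the first sum, the
terms for which $x + 1 \le n < 2x$ contribute an amount
\[ \ll \sum_{x+1 \le n < 2x} \frac{x \log x}{T(n - x)} \ll \frac{x}{T}(\log x)^2. \]
The terms for which $x/2 < n \le x - 1$ are handled similarly. Finally, any terms
for which $x - 1 < n < x + 1$ contribute an amount
\[ \ll (\log x) \min\Big(1, \frac{x}{T\langle x \rangle}\Big), \]
so
\[ R_1 \ll (\log x) \min\Big(1, \frac{x}{T\langle x \rangle}\Big) + \frac{x}{T}(\log x)^2. \]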
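Let $K$ denote an odd positive integer, and let $\mathcal{C}$ denote the contour consisting
of line segments connecting $\sigma_0 - iT_1$, $-K - iT_1$, $-K + iT_1$, $\sigma_0 + iT_1$. Then
by Cauchy's residue theorem,
\[ \psi_0(x) = x - \sum_{\substack{\rho \\ |\gamma| < T_1}} \frac{x^\rho}{\rho} + \sum_{1 \le k < K/2} \frac{x^{-2k}}{2k} - \frac{\zeta'}{\zeta}(0) + R_1 + R_2 \]
where
\[ R_2 = \frac{-1}{2\pi i} \int_{\mathcal{C}} \frac{\zeta'}{\zeta}(s)\,\frac{x^s}{s}\,ds. \]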
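Since $|\sigma \pm iT_1| \ge T$, we see by Lemma 12.2 that
\[ \int_{-1 \pm iT_1}^{\sigma_0 \pm iT_1} \frac{\zeta'}{\zeta}(s)\,\frac{x^s}{s}\,ds \ll \frac{(\log T)^2}{T} \int_{-1}^{\sigma_0} x^\sigma\,d\sigma \ll \frac{x(\log T)^2}{T \log x} \ll \frac{x(\log T)^2}{T}. \]
Similarly, since $(\log|\sigma \pm iT_1|)/|\sigma \pm iT_1| \ll (\log T)/T$, we see by Lemma
12.4 that
\[ \int_{-K \pm iT_1}^{-1 \pm iT_1} \frac{\zeta'}{\zeta}(s)\,\frac{x^s}{s}\,ds \ll \frac{\log T}{T} \int_{-\infty}^{-1} x^\sigma\,d\sigma \ll \frac{\log T}{xT \log x} \ll \frac{\log T}{T}. \]
As $|-K + it| \ge K$, by Lemma 12.4 we also see that
\[ \int_{-K - iT_1}^{-K + iT_1} \frac{\zeta'}{\zeta}(s)\,\frac{x^s}{s}\,ds \ll \frac{\log KT}{K}\,x^{-K} \int_{-T_1}^{T_1} 1\,dt \ll \frac{T \log KT}{K x^K}. \]
This tends to 0 as $K \to \infty$, so we obtain the stated result.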

Let $\psi_0(x, \chi) = (\psi(x^+, \chi) + \psi(x^-, \chi))/2$. Not surprisingly, our treatment
of $\psi_0(x)$ extends readily to provide explicit formulæ for $\psi_0(x, \chi)$.

Lemma 12.6 Let $\chi$ be a primitive character modulo $q$ with $q > 1$. Then
\[ \frac{L'}{L}(s, \chi) = \sum_{\substack{\rho \\ |\gamma - t| \le 1}} \frac{1}{s - \rho} + O(\log q\tau) \tag{12.5} \]
uniformly for $-1 \le \sigma \le 2$.
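Proof By combining (10.37) and Theorem C.1, it is immediate that
\[ \frac{L'}{L}(s, \chi) = B(\chi) + \sum_{\rho} \Big(\frac{1}{s - \rho} + \frac{1}{\rho}\Big) + O(\log q\tau). \]
On applying this at $\sigma + it$ and $2 + it$, and differencing, it follows that
\[ \frac{L'}{L}(s, \chi) = \sum_{\rho} \Big(\frac{1}{s - \rho} - \frac{1}{2 + it - \rho}\Big) + O(\log q\tau). \]
By Theorem 10.17 it is clear that
\[ \sum_{\substack{\rho \\ |\gamma - t| \le 1}} \frac{1}{2 + it - \rho} \ll \sum_{\substack{\rho \\ |\gamma - t| \le 1}} 1 \ll \log q\tau. \]
Now suppose that $n$ is a positive integer, and consider those zeros $\rho$ for which
$n \le |\gamma - t| \le n + 1$. Since
\[ \frac{1}{s - \rho} - \frac{1}{2 + it - \rho} = \frac{2 - \sigma}{(s - \rho)(2 + it - \rho)} \ll \frac{1}{n^2}, \]
it follows that such zeros contribute an amount
\[ \ll \frac{\log q + \log(|t + n| + 2) + \log(|t - n| + 2)}{n^2} \ll \frac{\log q(\tau + n)}{n^2}. \]
On summing over $n$ we obtain the stated estimate.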

Lemma 12.7 Let $\chi$ be a primitive character modulo $q$, and suppose that
$T \ge 2$. Then there is a $T_1$, $T \le T_1 \le T + 1$, such that
\[ \frac{L'}{L}(\sigma \pm iT_1, \chi) \ll (\log qT)^2 \]
uniformly for $-1 \le \sigma \le 2$.
Proof By Theorem 10.17, there is a $T_1 \in [T, T+1]$ such that both $|T_1 -
\gamma| \gg 1/\log qT$ and $|T_1 + \gamma| \gg 1/\log qT$ for all zeros $\rho$ of $L(s, \chi)$. Since
each summand in (12.5) is $\ll \log qT$, and there are $\ll \log qT$ summands, the
estimate is immediate.
Lemma 12.8 Let $\chi$ be a primitive character modulo $q$, $q > 1$. Then
\[ \arg L(s, \chi) \ll \log q\tau \]
uniformly for $-1 \le \sigma \le 2$.

Proof Suppose that $-1 \le \sigma \le 2$, and that $t$ is not the ordinate of a zero. Then
\[ \arg L(\sigma + it, \chi) = \arg L(2 + it, \chi) - \int_{\sigma}^{2} \Im \frac{L'}{L}(\alpha + it, \chi)\,d\alpha. \]
Here $\arg L(2 + it, \chi) \ll 1$ uniformly in $t$, by Theorem 4.8. Thus by
Lemma 12.6, the right-hand side above is
\[ -\sum_{|\gamma - t| \le 1} \int_{\sigma}^{2} \Im \frac{1}{\alpha + it - \rho}\,d\alpha + O(\log q\tau). \]
Here the summand is
\[ \arctan \frac{\sigma - \beta}{t - \gamma} - \arctan \frac{2 - \beta}{t - \gamma}. \]
If $t > \gamma$, then this lies between $-\pi$ and 0, while if $t < \gamma$, then the above lies
between 0 and $\pi$. Thus in any case the quantity is bounded, and by Theorem
10.17 the number of summands is $\ll \log q\tau$, so we have the result when $t$
is not the ordinate of a zero. Since the ordinates of zeros have no finite limit
point, we obtain the same bound when $t$ is the ordinate of a zero, since in that
case $\arg L(s, \chi) = (\arg L(\sigma + it^+, \chi) + \arg L(\sigma + it^-, \chi))/2$.

Lemma 12.9 Let χ be a primitive character modulo q with q > 1, put κ = 0 or 1 according as χ(−1) = 1 or −1, and let A(κ) denote the set of points s ∈ C such that σ ≤ −1 and |s + 2n − κ| ≥ 1/4 for each positive integer n. Then
\[
\frac{L'}{L}(s,\chi) \ll \log(2q|s|)
\]
uniformly for s ∈ A(κ).
Proof By (10.35) and Theorem C.1 we see that
\[
\frac{L'}{L}(s,\chi) = \frac{\pi}{2}\cot\frac{\pi}{2}(s+\kappa) + O(\log q) + O(\log(|s|+2)).
\]
Here
\[
\cot\frac{\pi}{2}(s+\kappa) = i + \frac{2i}{e^{i\pi(s+\kappa)} - 1} \ll 1
\]
since s is bounded away from integers with the parity of κ.
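The exponential form of the cotangent used in this proof can be checked numerically. The Python sketch below is an illustration only: the helper names `cot`, `cot_exponential_form`, `in_region_A` and `max_abs_cot_on_grid`, and the choice of sampling grid, are assumptions of this example rather than anything in the text. A finite grid can suggest the uniform bound on A(κ) but does not prove it.

```python
import cmath

def cot(z: complex) -> complex:
    """Cotangent computed directly as cos(z) / sin(z)."""
    return cmath.cos(z) / cmath.sin(z)

def cot_exponential_form(s: complex, kappa: int) -> complex:
    """Right-hand side of the identity: i + 2i / (exp(i*pi*(s + kappa)) - 1)."""
    return 1j + 2j / (cmath.exp(1j * cmath.pi * (s + kappa)) - 1)

def in_region_A(s: complex, kappa: int, n_max: int = 8) -> bool:
    """Membership test for A(kappa); only n = 1..n_max is checked,
    which is enough for the range sigma >= -9 sampled below."""
    if s.real > -1:
        return False
    return all(abs(s + 2 * n - kappa) >= 0.25 for n in range(1, n_max + 1))

def max_abs_cot_on_grid(kappa: int, step: float = 0.05) -> float:
    """Largest |cot(pi*(s + kappa)/2)| over grid points of A(kappa)
    with -9 <= sigma <= -1 and |t| <= 5."""
    worst = 0.0
    for i in range(161):
        for j in range(201):
            s = complex(-9 + i * step, -5 + j * step)
            if in_region_A(s, kappa):
                worst = max(worst, abs(cot(cmath.pi * (s + kappa) / 2)))
    return worst

if __name__ == "__main__":
    s = complex(-1.5, 0.3)
    for kappa in (0, 1):
        print(kappa, cot(cmath.pi * (s + kappa) / 2), cot_exponential_form(s, kappa))
        print(kappa, max_abs_cot_on_grid(kappa))
```

On such a grid the largest values occur next to the excluded discs around the points s = κ − 2n, which is where the condition |s + 2n − κ| ≥ 1/4 enters.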

Theorem 12.10 Let c be a constant, c > 1. Suppose that x ≥ c, that T ≥ 2, and that χ is a primitive character modulo q with q > 1. Then
\[
\psi_0(x,\chi) = -\sum_{\substack{\rho \\ |\gamma| \le T}} \frac{x^\rho}{\rho} - \frac12 \log(x-1) - \frac{\chi(-1)}{2}\log(x+1) + C(\chi) + R(x,T;\chi) \tag{12.6}
\]
where
\[
C(\chi) = \frac{L'}{L}(1,\overline{\chi}) + \log\frac{q}{2\pi} - C_0 \tag{12.7}
\]
and
\[
R(x,T;\chi) \ll (\log x)\min\Big(1, \frac{x}{T\langle x\rangle}\Big) + \frac{x}{T}(\log qxT)^2. \tag{12.8}
\]
Here $\langle x\rangle$ denotes the distance from x to the nearest prime power, other than x itself.

Proof Put $\sigma_0 = 1 + 1/\log x$. By arguing as in the proof of Theorem 12.5, we see that
\[
\psi_0(x,\chi) = \frac{-1}{2\pi i}\int_{\sigma_0 - iT_1}^{\sigma_0 + iT_1} \frac{L'}{L}(s,\chi)\frac{x^s}{s}\,ds + R_1
\]
where
\[
R_1 \ll (\log x)\min\Big(1, \frac{x}{T\langle x\rangle}\Big) + \frac{x}{T}(\log x)^2.
\]
Let K be chosen so that K − κ is an odd positive integer, and let $\mathcal{C}$ denote the contour consisting of the line segments connecting $\sigma_0 - iT_1$, $-K - iT_1$, $-K + iT_1$, $\sigma_0 + iT_1$, where $T_1$ is chosen as in Lemma 12.7. Since K and κ have opposite parity, the line segment from $-K - iT_1$ to $-K + iT_1$ lies in the region A(κ) of Lemma 12.9. Thus by Cauchy's residue theorem,
\[
\psi_0(x,\chi) = -\sum_{\substack{\rho \\ |\gamma| < T_1}} \frac{x^\rho}{\rho} + \sum_{1 \le k < (K+\kappa)/2} \frac{x^{\kappa-2k}}{2k-\kappa} + E + R_1 + R_2
\]
where κ = 0 if χ(−1) = 1 and κ = 1 if χ(−1) = −1, E is the residue of
\[
-\frac{L'}{L}(s,\chi)\frac{x^s}{s}
\]
at s = 0, and
\[
R_2 = \frac{-1}{2\pi i}\int_{\mathcal{C}} \frac{L'}{L}(s,\chi)\frac{x^s}{s}\,ds.
\]
By proceeding as in the latter part of the proof of Theorem 12.5, but using now Lemma 12.7 and Lemma 12.9 in place of Lemma 12.2 and Lemma 12.4, we see that
\[
R_2 \ll \frac{x}{T}(\log qT)^2 + \frac{T\log qK}{K x^{K}}.
\]
This last term tends to 0 as K → ∞. Put
\[
R_3 = -\sum_{\substack{\rho \\ T < |\gamma| < T_1}} \frac{x^\rho}{\rho}.
\]
Then $R(x,T) = R_1 + R_2 + R_3$, and $R_3 \ll xT^{-1}\log qT$ by Theorem 10.17.

It remains to compute the residue E. By logarithmic differentiation of the functional equation in the asymmetric form of Corollary 10.9, we find that
\[
\frac{L'}{L}(s,\chi) = -\frac{L'}{L}(1-s,\overline{\chi}) - \log\frac{q}{2\pi} - \frac{\Gamma'}{\Gamma}(1-s) + \frac{\pi}{2}\cot\frac{\pi}{2}(s+\kappa). \tag{12.9}
\]
If χ(−1) = −1, then $\frac{L'}{L}(s,\chi)$ is analytic at s = 0, so
\[
E = -\frac{L'}{L}(0,\chi) = \frac{L'}{L}(1,\overline{\chi}) + \log\frac{q}{2\pi} - C_0,
\]
in view of (C.11). Since cot z is an odd function, its Laurent expansion about z = 0 is of the form $\cot z = 1/z + \sum_{k=1}^{\infty} c_k z^{2k-1}$. Hence if χ(−1) = 1, we see by (12.9) that the Laurent expansion of $\frac{L'}{L}(s,\chi)$ begins
\[
\frac{L'}{L}(s,\chi) = \frac{1}{s} - \frac{L'}{L}(1,\overline{\chi}) - \log\frac{q}{2\pi} + C_0 + \cdots.
\]
Hence
\[
E = -\log x + \frac{L'}{L}(1,\overline{\chi}) + \log\frac{q}{2\pi} - C_0
\]
in this case.
Finally, we note that
\[
\sum_{k=1}^{\infty} \frac{x^{-2k}}{2k} = -\frac12\log\big(1 - x^{-2}\big), \qquad \sum_{k=1}^{\infty} \frac{x^{1-2k}}{2k-1} = \frac12\log\frac{x+1}{x-1}.
\]
This completes the proof.
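The two series evaluated at the end of the proof are easy to confirm numerically. The following Python sketch compares partial sums with the closed forms; the function names and the number of terms are choices made for this illustration, and x must exceed 1 for the series to converge.

```python
import math

def even_power_series(x: float, terms: int = 400) -> float:
    """Partial sum of sum_{k>=1} x^(-2k) / (2k)."""
    return sum(x ** (-2 * k) / (2 * k) for k in range(1, terms + 1))

def odd_power_series(x: float, terms: int = 400) -> float:
    """Partial sum of sum_{k>=1} x^(1-2k) / (2k-1)."""
    return sum(x ** (1 - 2 * k) / (2 * k - 1) for k in range(1, terms + 1))

if __name__ == "__main__":
    for x in (1.5, 2.0, 10.0):
        # closed forms from the end of the proof of Theorem 12.10
        print(x, even_power_series(x), -0.5 * math.log(1 - x ** -2))
        print(x, odd_power_series(x), 0.5 * math.log((x + 1) / (x - 1)))
```

In the proof the first series arises from the trivial zeros when κ = 0 and the second when κ = 1, which is how the terms in log(x − 1) and log(x + 1) of (12.6) appear.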

By letting T → ∞ we immediately obtain

Corollary 12.11 Suppose that χ is a primitive character modulo q, q > 1,
and that x > 1. Then
\[
\psi_0(x,\chi) = -\sum_{\rho} \frac{x^\rho}{\rho} - \frac12\log(x-1) - \frac{\chi(-1)}{2}\log(x+1) + C(\chi). \tag{12.10}
\]
By Theorem 11.4 we see that $C(\chi) \ll \log q$ if L(s, χ) has no exceptional zero, and that
\[
C(\chi) = \frac{1}{1-\beta_1} + O(\log q)
\]
if L(s, χ) has the exceptional zero $\beta_1$. In this latter case, the sum over ρ includes a large term due to $\rho = 1 - \beta_1$. This, however, is largely cancelled by C(χ), since
\[
-\frac{x^{1-\beta_1} - 1}{1-\beta_1} = -\frac{\log x}{1-\beta_1}\int_0^{1-\beta_1} x^{\sigma}\,d\sigma \ll x^{1-\beta_1}\log x. \tag{12.11}
\]
This is quite small compared with the contribution $-x^{\beta_1}/\beta_1$ made by $\rho = \beta_1$, not to mention the contributions of other zeros with β ≥ 1/2.
In principle, we could derive an explicit formula for ψ0 (x, χ ) when χ is
imprimitive, by taking into account the contributions made by zeros on the
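imaginary axis. However, we find it simpler to pass from $\psi_0(x,\chi^\star)$ to $\psi_0(x,\chi)$ by elementary reasoning. Suppose that χ is a character modulo q induced by the primitive character $\chi^\star$ modulo d, where d | q. (The possibility that d = 1 is not excluded here.) Then
\[
\begin{aligned}
\psi_0(x,\chi^\star) - \psi_0(x,\chi) &= \sum_{\substack{p \mid q \\ p \nmid d}} \; \sum_{\substack{k \\ 1 < p^k \le x}} \chi^\star(p^k)\log p \\
&\ll \sum_{\substack{p \mid q \\ p \nmid d}} \Big[\frac{\log x}{\log p}\Big]\log p \\
&\le \omega(q/d)\log x \\
&\ll (\log q/d)(\log x).
\end{aligned} \tag{12.12}
\]
Note that the distinction between $\psi_0(x,\chi)$ and $\psi(x,\chi)$ can be dropped at this point:
\[
\psi(x,\chi) = \psi_0(x,\chi^\star) + O((\log 2q)(\log x)). \tag{12.13}
\]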
This estimate, though somewhat crude, sufﬁces for most purposes.
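To make the passage from a primitive character to an induced one concrete, here is a small Python sketch. It assumes a specific example that is not in the text: the primitive character modulo d = 4 and the character modulo q = 12 that it induces, so that p = 3 is the only prime dividing q but not d. All function names are hypothetical, and ψ is computed by brute force.

```python
import math

def von_mangoldt(n: int) -> float:
    """Return log p if n = p^k for a prime p and k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime

def chi_primitive_mod4(n: int) -> int:
    """The primitive (non-principal) character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)

def chi_induced_mod12(n: int) -> int:
    """The character modulo 12 induced by the character modulo 4."""
    return chi_primitive_mod4(n) if math.gcd(n, 12) == 1 else 0

def psi(x: int, chi) -> float:
    """psi(x, chi) = sum over n <= x of chi(n) * Lambda(n)."""
    return sum(chi(n) * von_mangoldt(n) for n in range(1, x + 1))

def predicted_difference(x: int) -> float:
    """Sum of chi*(3^k) log 3 over powers 3^k <= x, as in the first line of (12.12)."""
    total, power = 0.0, 3
    while power <= x:
        total += chi_primitive_mod4(power) * math.log(3)
        power *= 3
    return total

if __name__ == "__main__":
    for x in (10, 100, 500, 1000):
        diff = psi(x, chi_primitive_mod4) - psi(x, chi_induced_mod12)
        print(x, diff, predicted_difference(x), math.log(x))
```

The sketch works with ψ rather than ψ0, which makes no difference when x is not a prime power. In this example the difference never exceeds log x in absolute value, in line with the bound ω(q/d) log x.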
The explicit formulæ that we have established thus far arise from Perron’s
formula. We may similarly derive other explicit formulæ using other kernels in
the inverse Mellin transform. Examples of such formulæ are found in Exercises
12.1.5–10. In some cases it may not be so easy to apply complex variable
techniques, but for such weighted sums over primes we may use the formulæ
above, with integration by parts. For example, from Theorem 12.5 we see that
\[
\sum_{n \le x} w(n)\Lambda(n) = \int_{2^-}^{x} w(u)\,d\psi(u)
= \int_2^x w(u)\,du - \sum_{\substack{\rho \\ |\gamma| \le T}} \int_2^x w(u)u^{\rho-1}\,du + \text{smaller terms}.
\]

To facilitate the estimation of these ‘smaller terms’ it is useful to record a little

more information concerning the error terms in the truncated explicit formula.

Theorem 12.12 Suppose that c is a constant, c > 1, and let χ be a character

modulo q. For x ≥ c and T ≥ 2 there exist functions E 1 (x, χ) and E 2 (x, T, χ)
with the following properties:
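\[
\psi(x,\chi) = E_0(\chi)x - \sum_{\substack{\rho \\ |\gamma| \le T}} \frac{x^\rho}{\rho} + E_1(x,\chi) + E_2(x,T,\chi); \tag{12.14}
\]
\[
\int_c^x 1\,|dE_1(u,\chi)| \ll (\log xq)^2; \tag{12.15}
\]
\[
E_2(x,T,\chi) \ll \log x + \frac{x}{T}(\log xTq)^2; \tag{12.16}
\]
\[
\int_c^x |E_2(u,T,\chi)|\,du \ll \frac{x^2}{T}(\log xTq)^2. \tag{12.17}
\]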
Proof Suppose first that χ is non-principal. Thus χ is induced by a primitive character $\chi^\star$ (mod d) where 1 < d ≤ q. Put
\[
E_1(x,\chi) = \psi_0(x,\chi) - \psi_0(x,\chi^\star) - \frac12\log(x-1) - \frac{\chi(-1)}{2}\log(x+1) + C(\chi^\star), \tag{12.18}
\]
\[
E_2(x,T,\chi) = \psi(x,\chi) - \psi_0(x,\chi) + R(x,T;\chi^\star) \tag{12.19}
\]
where $R(x,T;\chi^\star)$ is defined by taking $\chi = \chi^\star$ in (12.6). Thus (12.6) gives (12.14). By (12.12) we see that
\[
\int_c^x 1\,\big|d\big(\psi_0(u,\chi) - \psi_0(u,\chi^\star)\big)\big| \ll \sum_{\substack{p \mid q \\ p \nmid d}} \Big[\frac{\log x}{\log p}\Big]\log p \ll (\log x)(\log q).
\]
Thus we have (12.15). It is also clear that (12.8) gives (12.16). To obtain (12.17), we note that
\[
\int_c^x \min\Big(1, \frac{u}{T\langle u\rangle}\Big)\,du \le \frac{x}{T}\sum_{p^k \le 2x}\Big(1 + \int_{x/T}^{x}\frac{1}{u}\,du\Big) \ll \frac{x^2\log T}{T\log x}.
\]
Since $\psi(x,\chi) - \psi_0(x,\chi) = 0$ except for jump discontinuities at the prime powers, this term makes no contribution to the integral (12.17). Thus we have (12.17).
Now suppose that χ is principal. Put
\[
E_1(x,\chi_0) = \psi(x,\chi_0) - \psi_0(x) - \log 2\pi - \frac12\log\big(1 - 1/x^2\big),
\]
\[
E_2(x,T,\chi_0) = \psi(x,\chi_0) - \psi_0(x,\chi_0) + R(x,T)
\]
where R(x, T) is defined by (12.3). Then the desired assertions follow from (12.3) and (12.4) in the same way as in the former case, so the proof is complete.

12.1.1 Exercises
1. Suppose that |s − 1| ≥ 1. Show that
\[
\log\zeta(s) = \sum_{\substack{\rho \\ |\gamma - t| \le 1}} \log(s-\rho) + O(\log\tau)
\]
uniformly for −1 ≤ σ ≤ 2, where log ζ(s) is defined by continuous variation along the ray from σ + it to ∞ + it, with log ζ(∞ + it) = 0, and $|\Im\log(s-\rho)| < \pi$.
2. (a) By using the Brun–Titchmarsh inequality, show that
\[
\sum_{x+1 \le n \le 2x} \frac{\Lambda(n)}{n-x} \ll (\log x)(\log\log x).
\]
(b) Let $R_1$ be defined as in the proof of Theorem 12.5. Show that
\[
R_1 \ll (\log x)\min\Big(1, \frac{x}{T\langle x\rangle}\Big) + \frac{x}{T}(\log x)(\log\log x).
\]
3. Let δ be a small positive number. For a given T ≥ 4, let $S = \{t \in [T, T+1] : \min_\gamma |t-\gamma| \ge \delta/\log T\}$, and for T ≤ t ≤ T + 1 define
\[
f(t) = \log T + \sum_{T-1 \le \gamma \le T+2} \frac{1}{|t-\gamma|}
\]
where the sum is over ordinates γ of zeros of the zeta function.
(a) Show that if T ≤ t ≤ T + 1, then
\[
\max_{-1 \le \sigma \le 2}\Big|\frac{\zeta'}{\zeta}(s)\Big| \ll f(t).
\]
(b) Show that $\operatorname{meas} S \gg 1$ whenever δ is a sufficiently small positive constant.
(c) Show that
\[
\int_S f(t)\,dt \ll (\log T)\log\log T.
\]
(d) Deduce that for every T ≥ 4 there is a $T_1 \in [T, T+1]$ such that
\[
\max_{-1 \le \sigma \le 2}\Big|\frac{\zeta'}{\zeta}(\sigma + iT_1)\Big| \ll (\log T)\log\log T.
\]
4. Show that if $s \ne 1$, and $\zeta(s) \ne 0$, then
\[
\sum_{n \le x} \frac{\Lambda(n)}{n^s} = \frac{x^{1-s}}{1-s} - \frac{\zeta'}{\zeta}(s) - \sum_{\rho} \frac{x^{\rho-s}}{\rho-s} + \sum_{k=1}^{\infty} \frac{x^{-2k-s}}{2k+s}
\]
where it is understood that the term n = x is counted with weight 1/2 if x is a prime power, and the sum over ρ is calculated as $\lim_{T\to\infty}\sum_{|\gamma| \le T}$.
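5. (cf. Ingham 1932, p. 81) By (12.1) we know that
\[
\sum_{\rho} \frac{x^\rho}{\rho} = x - \psi_0(x) - \log 2\pi - \frac12\log\big(1 - 1/x^2\big)
\]
for x > 1. Show that if 0 < x < 1, then
\[
\sum_{\rho} \frac{x^\rho}{\rho} = \sum_{n \le 1/x} \frac{\Lambda(n)}{n} + \log x + C_0 + x + \frac12\log\frac{1-x}{1+x}.
\]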

6. (de la Vallée Poussin 1896) Show that if x > 1, then
\[
\sum_{n \le x} \Lambda(n)(x-n) = \frac12 x^2 - \sum_{\rho} \frac{x^{\rho+1}}{\rho(\rho+1)} - (\log 2\pi)x + \frac{\zeta'}{\zeta}(-1) - \sum_{k=1}^{\infty} \frac{x^{-2k+1}}{2k(2k-1)}.
\]

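7. Show that if x > 1, then
\[
\sum_{n \le x} \Lambda(n)\log(x/n) = x - \sum_{\rho} \frac{x^\rho}{\rho^2} - (\log 2\pi)\log x - \Big(\frac{\zeta'}{\zeta}\Big)'(0) - \frac14\sum_{k=1}^{\infty} \frac{x^{-2k}}{k^2}.
\]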

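8. (Hardy & Littlewood 1918; Wigert 1920) (a) Let k be a non-negative integer. Show that for s near −k, the Laurent expansion of Γ(s) begins
\[
\Gamma(s) = \frac{(-1)^k}{k!\,(s+k)} + \frac{(-1)^k}{k!}\,\frac{\Gamma'}{\Gamma}(k+1) + \cdots.
\]
(b) Let k be a positive integer. Show that for s near −2k, the Laurent expansion of $\frac{\zeta'}{\zeta}(s)$ begins
\[
\frac{\zeta'}{\zeta}(s) = \frac{1}{s+2k} - \frac{\Gamma'}{\Gamma}(2k+1) + \log 2\pi - \frac{\zeta'}{\zeta}(2k+1) + \cdots.
\]
(c) Show that if z > 0, then
\[
\begin{aligned}
\sum_{n=1}^{\infty} \Lambda(n)e^{-n/z} = {} & z - \sum_{\rho} \Gamma(\rho)z^{\rho} - e^{-1/z}\log 2\pi + (-1 + \cosh 1/z)\log z \\
& + \sum_{k=1}^{\infty} (-1)^k\,\frac{\Gamma'}{\Gamma}(k+1)\,\frac{z^{-k}}{k!} - \sum_{k=0}^{\infty} \frac{\zeta'}{\zeta}(2k+2)\,\frac{z^{-2k-1}}{(2k+1)!}.
\end{aligned}
\]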
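9. Suppose that a > 0, that x ≥ 1, and that x is not of the form $e^{2a^2k}$ where k is a positive integer. Show that
\[
\begin{aligned}
\frac{1}{\sqrt{2\pi}\,a}\sum_{n=1}^{\infty} \Lambda(n)\exp\Big(\frac{-(\log x/n)^2}{2a^2}\Big)
= {} & e^{a^2/2}x - \sum_{\rho} e^{a^2\rho^2/2}x^{\rho} + \sum_{0 < k < \frac{\log x}{2a^2}} e^{2a^2k^2}x^{-2k} \\
& - \frac{1}{2\pi}\exp\Big(\frac{-(\log x)^2}{2a^2}\Big)\int_{-\infty}^{\infty} \frac{\zeta'}{\zeta}\big(-(\log x)/a^2 + it\big)e^{-a^2t^2/2}\,dt.
\end{aligned}
\]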

12.2 Weil’s explicit formula

In order to see better the relationship between a sum over zeros and a corre-
sponding sum over primes, we now derive an explicit formula that applies to a
general class of kernels. (The next theorem is not used later, and can be omitted
on a ﬁrst reading.)
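Theorem 12.13 (Weil) Let F(x) be a measurable function such that
\[
\int_{-\infty}^{\infty} e^{(\frac12+\delta_0)2\pi|x|}\,|F(x)|\,dx < \infty, \tag{12.20}
\]
and
\[
\int_{-\infty}^{\infty} e^{(\frac12+\delta_0)2\pi|x|}\,|dF(x)| < \infty \tag{12.21}
\]
where $\delta_0 > 0$ is fixed. Suppose that $F(x) = \frac12\big(F(x^-) + F(x^+)\big)$ for all x, and that F(x) + F(−x) = 2F(0) + O(|x|). Put
\[
\Phi(s) = \int_{-\infty}^{\infty} F(x)e^{-(s-1/2)2\pi x}\,dx
\]
for $-\delta_0 < \sigma < 1 + \delta_0$. Let χ be a primitive character modulo q. Then
\[
\begin{aligned}
\lim_{T\to\infty}\sum_{\substack{\rho \\ |\gamma| \le T}} \Phi(\rho) = {} & E_0(\chi)\big(\Phi(0) + \Phi(1)\big) + \frac{1}{2\pi}\Big(\log q/\pi + \frac{\Gamma'}{\Gamma}(1/4 + \kappa/2)\Big)F(0) \\
& - \frac{1}{2\pi}\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{1/2}}\Big(\chi(n)F\Big(\frac{-1}{2\pi}\log n\Big) + \overline{\chi}(n)F\Big(\frac{1}{2\pi}\log n\Big)\Big) \\
& + \int_0^{\infty} \frac{e^{-(1+2\kappa)\pi x}}{1 - e^{-4\pi x}}\big(2F(0) - F(x) - F(-x)\big)\,dx.
\end{aligned} \tag{12.22}
\]
Here $E_0(\chi) = 1$ if $\chi = \chi_0$, $E_0(\chi) = 0$ otherwise, and κ = 0 if χ(−1) = 1, κ = 1 if χ(−1) = −1.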
We note that if ρ = 1/2 + iγ, then
\[
\Phi(\rho) = \int_{-\infty}^{\infty} F(x)e(-\gamma x)\,dx = \widehat{F}(\gamma).
\]
The values of Γ′/Γ can be evaluated explicitly; from Appendix C we see that
\[
\frac{\Gamma'}{\Gamma}(1/4) = -C_0 - 3\log 2 - \pi/2
\]
and
\[
\frac{\Gamma'}{\Gamma}(3/4) = -C_0 - 3\log 2 + \pi/2.
\]
Here $C_0$ is Euler's constant. Since $\int |d\,fg| \le \int |f|\,|dg| + \int |g|\,|df|$, from (12.20) and (12.21) we see that $e^{a|x|}F(x)$ is of bounded variation for any a, $0 \le a \le (1/2 + \delta_0)2\pi$. Hence $F(x) \ll \exp(-(1/2 + \delta_0)2\pi|x|)$, and Φ(s) is analytic in the strip $-\delta_0 < \sigma < 1 + \delta_0$. For |t| ≤ 1 we note that $\Phi(s) \ll 1$. For |t| ≥ 1 we integrate by parts to see that
\[
\Phi(s) = \frac{1}{2\pi it}\int_{-\infty}^{\infty} e(-tx)\,d\big(F(x)\exp((1-2\sigma)\pi x)\big);
\]
hence $\Phi(s) \ll 1/(|t| + 1)$ uniformly for $-\delta_0 \le \sigma \le 1 + \delta_0$. In these estimates, and in the proof below, implicit constants may depend on F and on $\delta_0$.
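The two special values of Γ′/Γ quoted above can be confirmed numerically. The Python sketch below is one possible approach and is not taken from Appendix C: the function `digamma` is a hypothetical helper that shifts the argument upward with the recurrence Γ′/Γ(x) = Γ′/Γ(x + 1) − 1/x and then applies a truncated asymptotic expansion, and Euler's constant is hard-coded because the standard library does not provide it.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant C0, hard-coded to double precision

def digamma(x: float) -> float:
    """Approximate Gamma'/Gamma(x) for real x > 0.

    The recurrence psi(x) = psi(x + 1) - 1/x shifts x above 20, after which
    psi(x) ~ log x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6) + 1/(240x^8)
    is accurate to roughly double precision.
    """
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    shift = 0.0
    while x < 20.0:
        shift -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 * (1 / 252 - inv2 * (1 / 240))))
    return shift + math.log(x) - 0.5 / x - tail

if __name__ == "__main__":
    print(digamma(0.25), -EULER_C0 - 3 * math.log(2) - math.pi / 2)
    print(digamma(0.75), -EULER_C0 - 3 * math.log(2) + math.pi / 2)
```

These are the values that enter (12.22) through the term Γ′/Γ(1/4 + κ/2), with κ = 0 and κ = 1 respectively.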
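Proof We note that
\[
\sum_{\substack{\rho \\ |\gamma| \le T_1}} \Phi(\rho) = \frac{1}{2\pi i}\int_{\mathcal{C}} \Phi(s)\frac{\xi'}{\xi}(s,\chi)\,ds
\]
where $\mathcal{C}$ is the closed polygonal contour with vertices $-\delta_1 + iT_1$, $-\delta_1 - iT_1$, $1 + \delta_1 - iT_1$, $1 + \delta_1 + iT_1$. Here $0 < \delta_1 < \delta_0$, and $T_1$ is chosen so that $|T - T_1| \le 1$, and so that
\[
\frac{\xi'}{\xi}(\sigma \pm iT_1, \chi) \ll (\log qT)^2
\]
uniformly for −1 ≤ σ ≤ 2. Thus
\[
\sum_{\substack{\rho \\ |\gamma| \le T}} \Phi(\rho) = \frac{1}{2\pi i}\Big(\int_{1+\delta_1-iT}^{1+\delta_1+iT} + \int_{-\delta_1+iT}^{-\delta_1-iT}\Big)\Phi(s)\frac{\xi'}{\xi}(s,\chi)\,ds + O\Big(\frac{(\log T)^2}{T}\Big).
\]
By the functional equation for ξ(s, χ), we see that
\[
\frac{\xi'}{\xi}(s,\chi) = -\frac{\xi'}{\xi}(1-s,\overline{\chi}).
\]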
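Hence the integral above is
\[
\frac{1}{2\pi i}\int_{1+\delta_1-iT}^{1+\delta_1+iT} \Big(\Phi(s)\frac{\xi'}{\xi}(s,\chi) + \Phi(1-s)\frac{\xi'}{\xi}(s,\overline{\chi})\Big)\,ds. \tag{12.23}
\]
From (10.25) and (10.33) we see that
\[
\frac{\xi'}{\xi}(s,\chi) = E_0(\chi)\Big(\frac{1}{s} + \frac{1}{s-1}\Big) + \frac12\log\frac{q}{\pi} + \frac12\,\frac{\Gamma'}{\Gamma}\big((s+\kappa)/2\big) + \frac{L'}{L}(s,\chi). \tag{12.24}
\]
For $1 < \sigma < 1 + \delta_0$,
\[
\begin{aligned}
\Phi(s)\frac{L'}{L}(s,\chi) &= -\Phi(s)\sum_{n=1}^{\infty} \Lambda(n)\chi(n)n^{-s} \\
&= -\sum_{n=1}^{\infty} \Lambda(n)\chi(n)n^{-1/2}\int_{-\infty}^{\infty} F\Big(x - \frac{1}{2\pi}\log n\Big)e^{-(s-1/2)2\pi x}\,dx,
\end{aligned} \tag{12.25}
\]
and similarly
\[
\Phi(1-s)\frac{L'}{L}(s,\overline{\chi}) = -\sum_{n=1}^{\infty} \Lambda(n)\overline{\chi}(n)n^{-1/2}\int_{-\infty}^{\infty} F\Big(-x + \frac{1}{2\pi}\log n\Big)e^{-(s-1/2)2\pi x}\,dx. \tag{12.26}
\]
From the estimate $F(x) \ll e^{-(1/2+\delta_0)2\pi|x|}$ we see that
\[
\begin{aligned}
\sum_{n} \Lambda(n)n^{-1/2} & \int_{-\infty}^{\infty} \Big|F\Big(x - \frac{1}{2\pi}\log n\Big)\Big| e^{-(1/2+\delta_1)2\pi x}\,dx \\
& \ll \sum_{n=1}^{\infty} \Lambda(n)n^{-1/2}\Big(\int_{(\log n)/(2\pi)}^{\infty} e^{-(1+\delta_0+\delta_1)2\pi x}n^{1/2+\delta_0}\,dx + \int_{-\infty}^{(\log n)/(2\pi)} e^{(\delta_0-\delta_1)2\pi x}n^{-1/2-\delta_0}\,dx\Big) \\
& \ll \sum_{n} \Lambda(n)n^{-1-\delta_1} \ll 1.
\end{aligned}
\]
A similar calculation relates to the second term (12.26), and hence for $s = 1 + \delta_1 + it$,
\[
\Phi(s)\frac{L'}{L}(s,\chi) + \Phi(1-s)\frac{L'}{L}(s,\overline{\chi}) = \int_{-\infty}^{\infty} H(x)e(-tx)\,dx = \widehat{H}(t)
\]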
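where
\[
H(x) = -\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^{1/2}}\Big(\chi(n)F\Big(x - \frac{\log n}{2\pi}\Big) + \overline{\chi}(n)F\Big(-x + \frac{\log n}{2\pi}\Big)\Big)e^{-(1/2+\delta_1)2\pi x}.
\]
Now H(x) is of bounded variation, since
\[
\begin{aligned}
\operatorname{Var} H &\le \sum_{n} \frac{\Lambda(n)}{n^{1/2}}\operatorname{Var}\Big(F\Big(x - \frac{\log n}{2\pi}\Big)e^{-(1/2+\delta_1)2\pi x}\Big) + \sum_{n} \frac{\Lambda(n)}{n^{1/2}}\operatorname{Var}\Big(F\Big(-x + \frac{\log n}{2\pi}\Big)e^{-(1/2+\delta_1)2\pi x}\Big) \\
&= 2\sum_{n} \Lambda(n)n^{-1-\delta_1}\operatorname{Var}\Big(F(x)e^{-(1/2+\delta_1)2\pi x}\Big) \ll 1.
\end{aligned}
\]
Moreover, $H(x) = (H(x^+) + H(x^-))/2$, and thus by the Fourier integral theorem,
\[
\lim_{T\to\infty}\int_{-T}^{T} \widehat{H}(t)\,dt = H(0).
\]
That is,
\[
\begin{aligned}
\lim_{T\to\infty}\frac{1}{2\pi i} & \int_{1+\delta_1-iT}^{1+\delta_1+iT} \Big(\Phi(s)\frac{L'}{L}(s,\chi) + \Phi(1-s)\frac{L'}{L}(s,\overline{\chi})\Big)\,ds \\
&= \frac{-1}{2\pi}\sum_{n} \frac{\Lambda(n)}{n^{1/2}}\Big(\chi(n)F\Big(\frac{-\log n}{2\pi}\Big) + \overline{\chi}(n)F\Big(\frac{\log n}{2\pi}\Big)\Big).
\end{aligned}
\]
The remaining terms from (12.24) contribute to the integral (12.23) an amount
\[
\frac{1}{2\pi i}\int_{1+\delta_1-iT}^{1+\delta_1+iT} G(s)\,ds
\]
where
\[
G(s) = \Big(E_0(\chi)\Big(\frac{1}{s} + \frac{1}{s-1}\Big) + \frac12\log\frac{q}{\pi} + \frac12\,\frac{\Gamma'}{\Gamma}\Big(\frac{s+\kappa}{2}\Big)\Big)\big(\Phi(s) + \Phi(1-s)\big).
\]
By Cauchy's theorem this is
\[
\frac{1}{2\pi i}\int_{1/2-iT}^{1/2+iT} G(s)\,ds + E_0(\chi)\big(\Phi(0) + \Phi(1)\big) + O\Big(\frac{\log^2 qT}{T}\Big).
\]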
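To treat this latter integral we note that
\[
\frac{1}{2\pi i}\int_{1/2-iT}^{1/2+iT} \Big(\frac{1}{s} + \frac{1}{s-1}\Big)\big(\Phi(s) + \Phi(1-s)\big)\,ds
= \frac{-4i}{\pi}\int_{-T}^{T} \frac{t}{1+4t^2}\Big(\Phi\Big(\frac12 + it\Big) + \Phi\Big(\frac12 - it\Big)\Big)\,dt = 0.
\]
Now $\Phi(1/2 + it) = \widehat{F}(t)$, and hence
\[
\frac{1}{2\pi i}\int_{1/2-iT}^{1/2+iT} \frac12(\log q/\pi)\big(\Phi(s) + \Phi(1-s)\big)\,ds
= \frac{\log q/\pi}{4\pi}\int_{-T}^{T} \big(\widehat{F}(t) + \widehat{F}(-t)\big)\,dt \longrightarrow (\log q/\pi)\frac{F(0)}{2\pi}
\]
as T tends to infinity. Thus to complete the proof of the theorem it suffices to establish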
Lemma 12.14 Let a > 0 and b > 0 be fixed. If $J \in L^1(\mathbb{R})$, J is of bounded variation on $\mathbb{R}$, and if J(x) = J(0) + O(|x|), then
\[
\lim_{T\to\infty}\int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a \pm ibt)\,\widehat{J}(t)\,dt
= \frac{\Gamma'}{\Gamma}(a)J(0) + \frac{2\pi}{b}\int_0^{\infty} \frac{e^{-2\pi ax/b}}{1 - e^{-2\pi x/b}}\big(J(0) - J(\mp x)\big)\,dx. \tag{12.27}
\]
If G and J are in $L^1(\mathbb{R})$, then
\[
\int_{-\infty}^{\infty} G(t)\widehat{J}(t)\,dt = \int_{-\infty}^{\infty} \widehat{G}(x)J(x)\,dx,
\]
since both sides are
\[
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} G(t)J(x)e(-tx)\,dx\,dt.
\]
We cannot apply this with $G(t) = \frac{\Gamma'}{\Gamma}(a \pm ibt)$, since this function is not in $L^1(\mathbb{R})$. Nevertheless, the right-hand side of (12.27) is a linear functional of J, which thus serves as a surrogate for the Fourier transform of $\frac{\Gamma'}{\Gamma}(a \pm ibt)$, at least when the test function J is sufficiently well-behaved.

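Proof It suffices to consider the + sign on the left-hand side of (12.27), for if K(x) = J(−x) then $\widehat{K}(t) = \widehat{J}(-t)$. We suppose first that J(0) = 0. The integral with respect to t on the left-hand side of (12.27) is
\[
\int_{-\infty}^{\infty} J(x)\int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)e(-xt)\,dt\,dx.
\]
Since $\frac{\Gamma'}{\Gamma}(a + ibt) \ll \log(|t| + 2)$, the inner integral above is $\ll T\log T$, uniformly in x. Put $\delta = T^{-2/3}$. The contribution to the above by those x for which |x| ≤ δ is
\[
\ll \int_{-\delta}^{\delta} |x|\,T\log T\,dx \ll \delta^2 T\log T = T^{-1/3}\log T.
\]
For |x| ≥ δ we appeal to Theorem C.5 to estimate the inner integral. The error term in Theorem C.5 contributes an amount
\[
\ll \int_{\delta}^{\infty} \min(x,1)\,T^{-1}x^{-2}\,dx \ll T^{-1}\log T.
\]
By integrating by parts we see that
\[
\begin{aligned}
\int_{\delta}^{\infty} J(x)\frac{e(-xT)}{x}\,dx &= \frac{J(\delta)e(-\delta T)}{2\pi i\delta T} - \frac{1}{2\pi iT}\int_{\delta}^{\infty} J(x)\frac{e(-xT)}{x^2}\,dx + \frac{1}{2\pi iT}\int_{\delta}^{\infty} \frac{e(-xT)}{x}\,dJ(x) \\
&\ll \frac{1}{T} + \frac{1}{T}\int_{\delta}^{\infty} \min(x,1)x^{-2}\,dx + \frac{1}{\delta T}\int_{\delta}^{\infty} |dJ| \\
&\ll T^{-1/3},
\end{aligned}
\]
and similarly for the three related terms. Hence
\[
\int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)\,\widehat{J}(t)\,dt = \frac{-2\pi}{b}\int_{-\infty}^{-\delta} J(x)\frac{e^{2\pi ax/b}}{1 - e^{2\pi x/b}}\,dx + O\big(T^{-1/3}\log T\big).
\]
On the right-hand side we see that $\int_{-\delta}^{0} \cdots \ll \delta$, so that
\[
\lim_{T\to\infty}\int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)\,\widehat{J}(t)\,dt = \frac{-2\pi}{b}\int_0^{\infty} \frac{e^{-2\pi ax/b}}{1 - e^{-2\pi x/b}}J(-x)\,dx
\]
provided that J(0) = 0. To obtain the general case we apply the above to the function $K(x) = J(x) - J(0)e^{-\pi x^2/A}$ where A > 0 is large. Then $\widehat{K}(t) = \widehat{J}(t) - J(0)\sqrt{A}\,e^{-\pi At^2}$, and hence
\[
\lim_{T\to\infty}\int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)\,\widehat{K}(t)\,dt = \lim_{T\to\infty}\int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)\,\widehat{J}(t)\,dt - J(0)\sqrt{A}\int_{-\infty}^{\infty} \frac{\Gamma'}{\Gamma}(a + ibt)e^{-\pi At^2}\,dt.
\]
This last integral is
\[
\int_{-\infty}^{\infty} \Big(\frac{\Gamma'}{\Gamma}(a) + O(|t|)\Big)e^{-\pi At^2}\,dt = \frac{\Gamma'}{\Gamma}(a)A^{-1/2} + O(A^{-1}).
\]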
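On the other hand,
\[
\begin{aligned}
\frac{-2\pi}{b}\int_0^{\infty} \frac{e^{-2\pi ax/b}}{1 - e^{-2\pi x/b}}K(-x)\,dx
= {} & \frac{2\pi}{b}\int_0^{\infty} \frac{e^{-2\pi ax/b}}{1 - e^{-2\pi x/b}}\big(J(0) - J(-x)\big)\,dx \\
& + \frac{2\pi}{b}J(0)\int_0^{\infty} \frac{e^{-2\pi ax/b}}{1 - e^{-2\pi x/b}}\big(e^{-\pi x^2/A} - 1\big)\,dx.
\end{aligned}
\]
Now $e^{-\alpha} = 1 + O(\alpha)$ for α ≥ 0, and hence this last integral is
\[
\ll \int_0^1 xA^{-1}\,dx + \int_1^{\infty} e^{-2\pi ax/b}x^2A^{-1}\,dx \ll A^{-1}.
\]
On combining these estimates, we see that (12.27) holds apart from an error term $O(A^{-1/2})$, and we obtain the result since A can be arbitrarily large.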

12.3 Notes

Section 12.1. Let $\Pi(x) = \sum_{n \le x} \Lambda(n)/\log n$. Riemann (1859) gave a heuristic proof that if x > 1, and x is not a prime power, then
\[
\Pi(x) = \operatorname{Li}(x) - \sum_{\rho} \operatorname{Li}(x^{\rho}) - \log 2 + \int_x^{\infty} \frac{du}{(u^2-1)u\log u}.
\]

Here the sum over the zeros is conditionally convergent, and it is to be un-
derstood that it is computed as the limit, as T → ∞, of the sum over those
zeros for which |γ | ≤ T . The above formula was first proved rigorously by von
Mangoldt (1895), and additional proofs were subsequently given by Landau
(1908a, b). For further discussion of the explicit formula in the form given by
Riemann, see Edwards (1974, Chapter 1). von Mangoldt (1895) also proved the
explicit formula (12.1). Landau (1909, Section 89) was the first to show that
the limit in (12.1) is attained uniformly for x in a compact interval not con-
taining a prime power. Cramér (1918) showed that (12.1) can be derived from
the above. von Koch (1910) and Landau (1912) estimated the error term that
arises when the explicit formula is truncated, as in Theorem 12.5. The explicit
formula for ψ0 (x, χ) was first established by Landau (1908b), but with not
so much attention to the constant term. In the customary form of this explicit
formula (cf. Davenport (2000, p. 117)), the constant term is expressed in terms
of the constant B(χ ) that arises in the Hadamard product formula for ξ (s, χ ).
Our presentation, which avoids this, is that of Vorhauer (2006).

Section 12.2. Although many specific explicit formulæ were derived by vari-
ous authors for a variety of purposes, it was Guinand (1942) who first suggested
that it would be possible to specify a general class of such formulæ. Guinand
(1948) did this assuming the Riemann Hypothesis, but it seems that he im-
posed RH only in order to obtain a wider class of test functions. Theorem
12.13 is a special case of the main result of Weil (1952), who treats general
L-functions associated with Grössencharaktere χ , which are representations
of the group of idèle-classes of an algebraic number field k into the multiplica-
tive group of non-zero complex numbers. Weil also showed that a necessary
and sufficient condition for the Riemann hypothesis to hold for L is that the
right-hand side corresponding to (12.22) is non-negative for all functions F of a
certain class. Gallagher (1987) widened the class of test functions in Guinand’s
formula and gave several applications. See also Besenfelder (1977a, b),
Yoshida (1992), Jorgenson, Lang & Goldfeld (1994), and Bombieri & Lagarias
(1999).

12.4 References
Barner, K. (1981). On A. Weil’s explicit formula, J. Reine Angew. Math. 323, 139–152.
Besenfelder, H.-J. (1977a). Die Weilsche “Explizite Formel” und temperierte Distribu-
tionen, J. Reine Angew. Math. 293–294, 228–257.
(1977b). Zur Nullstellenfreiheit der Riemannschen Zeta-funktion auf der Geraden
σ = 1, J. Reine Angew. Math. 295, 116–119.
Besenfelder, H.-J. & Palm, G. (1977). Einige Äquivalenzen zur Riemannschen Vermutung,
J. Reine Angew. Math. 293–294, 109–115.
Bombieri, E. & Lagarias, J. C. (1999). Complements to Li’s criterion for the Riemann
hypothesis, J. Number Theory 77, 274–287.
Cramér, H. (1918). Über die Herleitung der Riemannschen Primzahlformel, Arkiv för
Mat. Astr. Fys. 13, no. 24, 7 pp.
Davenport, H. (2000). Multiplicative Number Theory, Third Edition, Graduate Texts
Math. 74. New York: Springer-Verlag.
Edwards, H. M. (1974). Riemann’s Zeta Function, Pure and Applied Math. 58. New
York: Academic Press.
Gallagher, P. X. (1987). Applications of Guinand’s formula, Analytic number theory
and Diophantine problems (Stillwater, 1984), Progress in Math. 70. Boston:
Birkhäuser, pp. 135–157.
Guinand, A. P. (1937). A class of self-reciprocal functions connected with summation
formulæ, Proc. London Math. Soc. (2) 43, 439–448.
(1938). Summation formulæ and self-reciprocal functions, Quart. J. Math. Oxford
Ser. 9, 53–67.
(1939a). Finite summation formulæ, Quart. J. Math. 10, 38–44.
(1939b). Summation formulæ and self-reciprocal functions (II), Quart. J. Math. 10,
104–118.

(1939c). A formula for ζ (s) in the critical strip, J. London Math. Soc. 14, 97–100.
(1941). On Poisson’s summation formula, Ann. of Math. (2) 42, 591–603.
(1942). Summation formulæ and self-reciprocal functions (III), Quart. J. Math. 13,
30–39.
(1948). A summation formula in the theory of prime numbers, Proc. London Math.
Soc. 50, 107–119.
Hardy, G. H. & Littlewood, J. E. (1918). Contributions to the theory of the Riemann
zeta-function and the theory of the distribution of primes, Acta Math. 41, 119–196;
Collected Papers, Vol. 2. Oxford: Clarendon Press, 1967, pp. 20–97.
Ingham, A. E. (1932). The Distribution of Prime Numbers, Cambridge Tract No. 30.
Cambridge: Cambridge University Press.
Jorgenson, J., Lang, S., & Goldfeld, D. (1994). Explicit Formulas. Lecture Notes in
Math. 1593. Berlin: Springer-Verlag.
von Koch, H. (1910). Contributions à la théorie des nombres premiers, Acta Math. 33,
293–320.
Landau, E. (1908a). Neuer Beweis der Riemannschen Primzahlformel, Sitzungsber.
Königl. Preuß. Akad. Wiss. Berlin, 737–745; Collected Works, Vol. 4, Essen: Thales
Verlag, 1986, pp. 11–19.
(1908b). Nouvelle démonstration pour la formule de Riemann sur le nombre des
nombres premiers inférieurs à une limite donnée, et démonstration d’une formule
plus générale pour le cas des nombres premiers d’une progression arithmétique,
Ann. l’École Norm. Sup. (3) 25, 399–442; Collected Works, Vol. 4, Essen: Thales
Verlag, 1986, pp. 87–130.
(1909). Handbuch der Lehre von der Verteilung der Primzahlen. Leipzig: Teubner.
Reprint: New York: Chelsea, 1953.
(1912). Über einige Summen, die von den Nullstellen der Riemannschen Zetafunktion
abhängen, Acta Math. 35, 271–294; Collected Works, Vol. 5. Essen: Thales Verlag,
1986, pp. 62–85.
von Mangoldt, H. (1895). Zu Riemann’s Abhandlung “Ueber die Anzahl der Primzahlen
unter einer gegebenen Grösse”, J. Reine Angew. Math. 114, 255–305.
Riemann, B. (1859). Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse,
Monatsber. Kgl. Preuss. Akad. Wiss. Berlin, 671–680; Werke, Leipzig: Teubner,
1876, pp. 3–47. Reprint: New York: Dover, 1953.
de la Vallée Poussin, C. J. (1896). Recherches analytiques sur la théorie des nombres
premiers, I–III, Ann. Soc. Sci. Bruxelles 20, 183–256, 281–362, 363–397.
Vorhauer, U. M. A. (2006). The Hadamard product formula for Dirichlet L-functions,
to appear.
Weil, A. (1952). Sur les “formules explicites” de la théorie des nombres premiers, Comm.
Sém. Math. Univ. Lund [Medd. Lunds Univ. Mat. Sem.], Tome Supplementaire,
252–265.
Wigert, S. (1920). Sur la théorie de la fonction ζ (s) de Riemann, Ark. Mat. 14, 1–17.
Yoshida, H. (1992). On Hermitian forms attached to zeta functions, Zeta functions in
geometry (Tokyo, 1990), Adv. Stud. Pure Math. 21. Tokyo: Kinokuniya, pp. 281–325.
13
Conditional estimates

13.1 Estimates for primes

From the explicit formula for $\psi_0(x)$ we see that the contribution to the error term $\psi_0(x) - x$ made by a typical zero ρ = β + iγ is $-x^{\rho}/\rho$. This has absolute value $\asymp x^{\beta}/|\gamma|$, which diminishes as |γ| increases, but it depends much more sensitively on the value of β. We recall that if ρ is a zero, then so also is 1 − ρ. Since at least one of these has real part ≥ 1/2, we see that the Riemann Hypothesis represents the best of all possible worlds, in the sense that the error term in the Prime Number Theorem is smallest when the Riemann Hypothesis is true. By Theorem 10.13 we find that
\[
\sum_{\substack{\rho \\ |\gamma| \le T}} \frac{1}{|\rho|} \ll \sum_{1 \le n \le T} \frac{\log 2n}{n} \ll (\log T)^2. \tag{13.1}
\]
Thus by taking T = x in Theorem 12.5, we obtain

Theorem 13.1 Assume RH. Then for x ≥ 2,
\[
\psi(x) = x + O\big(x^{1/2}(\log x)^2\big), \tag{13.2}
\]
\[
\vartheta(x) = x + O\big(x^{1/2}(\log x)^2\big), \tag{13.3}
\]
\[
\pi(x) = \operatorname{li}(x) + O\big(x^{1/2}\log x\big). \tag{13.4}
\]
In Chapter 15 we shall show that these estimates for the error term are within a factor $(\log x)^2$ of being best possible, which is not surprising since each zero individually contributes an amount of the order $x^{1/2}$.
Proof The second assertion follows from the first by Corollary 2.5. By integration by parts we find that
\[
\pi(x) = \int_2^x \frac{1}{\log u}\,du + \frac{\vartheta(x) - x}{\log x} + \frac{2}{\log 2} + \int_2^x \frac{\vartheta(u) - u}{u(\log u)^2}\,du, \tag{13.5}
\]
and so the third assertion follows from the second.
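For orientation, the sizes of the three error terms can be compared numerically with the bounds in Theorem 13.1 at a modest value of x. The following Python sketch is only an illustration at a single x: it proves nothing about RH, and the function names, the sieve and the Simpson-rule approximation to li(x) (taken here as the integral of du/log u from 2 to x) are all choices made for this example.

```python
import math

def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if flags[n]]

def prime_counts(x: int):
    """Return (psi(x), theta(x), pi(x)) for an integer x >= 2."""
    psi = theta = 0.0
    count = 0
    for p in primes_up_to(x):
        log_p = math.log(p)
        theta += log_p
        count += 1
        power = p
        while power <= x:  # each prime power p^k <= x adds log p to psi
            psi += log_p
            power *= p
    return psi, theta, count

def li_from_2(x: float, steps: int = 200_000) -> float:
    """Composite Simpson approximation to the integral of du/log u over [2, x]."""
    h = (x - 2.0) / steps
    total = 1.0 / math.log(2.0) + 1.0 / math.log(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) / math.log(2.0 + i * h)
    return total * h / 3.0

if __name__ == "__main__":
    x = 10 ** 5
    psi, theta, count = prime_counts(x)
    root, log_x = math.sqrt(x), math.log(x)
    print("psi(x) - x     :", psi - x, " scale:", root * log_x ** 2)
    print("theta(x) - x   :", theta - x, " scale:", root * log_x ** 2)
    print("pi(x) - li(x)  :", count - li_from_2(x), " scale:", root * log_x)
```

At this x the observed errors are far smaller than the scales $x^{1/2}(\log x)^2$ and $x^{1/2}\log x$, as one would expect from upper bounds of this kind.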


The factor $(\log x)^2$ in (13.2) can be avoided if we take smoother weights. For example, put
\[
\psi_1(x) = \sum_{n \le x} (x-n)\Lambda(n). \tag{13.6}
\]
Then we have the explicit formula
\[
\psi_1(x) = \frac{x^2}{2} - \sum_{\rho} \frac{x^{\rho+1}}{\rho(\rho+1)} - \frac{\zeta'}{\zeta}(0)x + \frac{\zeta'}{\zeta}(-1) + O\big(x^{-1/2}\big) \tag{13.7}
\]
for x ≥ 2. Assuming RH, it follows easily that
\[
\psi_1(x) = \frac12 x^2 + O\big(x^{3/2}\big). \tag{13.8}
\]
Assuming RH, we can also describe more precisely the relationships between the three standard prime-counting functions ψ(x), ϑ(x), and π(x).

Theorem 13.2 Assume RH. Then
\[
\vartheta(x) = \psi(x) - x^{1/2} + O\big(x^{1/3}\big), \tag{13.9}
\]
and
\[
\pi(x) - \operatorname{li}(x) = \frac{\vartheta(x) - x}{\log x} + O\Big(\frac{x^{1/2}}{(\log x)^2}\Big). \tag{13.10}
\]
Proof By an easy elaboration on Corollary 2.5, we see that
\[
\vartheta(x) = \psi(x) - \psi\big(x^{1/2}\big) + O\big(x^{1/3}\big).
\]
Hence (13.9) follows immediately from (13.2). To obtain (13.10), put
\[
\vartheta_1(x) = \sum_{p \le x} (x-p)\log p = \int_2^x \vartheta(u)\,du.
\]
By (13.8) and (13.9) it follows that $\vartheta_1(x) = x^2/2 + O\big(x^{3/2}\big)$. By integration by parts we see that the final integral in (13.5) is
\[
\begin{aligned}
\Big[\frac{\vartheta_1(u) - u^2/2}{u(\log u)^2}\Big]_2^x & + \int_2^x \frac{\vartheta_1(u) - u^2/2}{(u\log u)^2}\big(1 + 2/\log u\big)\,du \\
& \ll \frac{x^{1/2}}{(\log x)^2} + \int_2^x u^{-1/2}(\log u)^{-2}\,du \\
& \ll \frac{x^{1/2}}{(\log x)^2}.
\end{aligned}
\]
Thus (13.10) follows from (13.5).
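The first step of this proof rests on the elementary relation between ψ and ϑ, which can be observed directly. The Python sketch below computes the remainder ψ(x) − ψ(x^{1/2}) − ϑ(x) for integer x; it uses the standard identity ψ(x) = Σ_{k≥1} ϑ(x^{1/k}), from which this remainder equals the sum of ϑ(x^{1/k}) over odd k ≥ 3. The function names and the constant 2 in the final comparison are assumptions of this example, not statements from the text.

```python
import math

def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if flags[n]]

def theta(y: int, primes: list) -> float:
    """theta(y): sum of log p over primes p <= y."""
    return sum(math.log(p) for p in primes if p <= y)

def psi(y: int, primes: list) -> float:
    """psi(y): sum of log p over prime powers p^k <= y."""
    total = 0.0
    for p in primes:
        if p > y:
            break
        power = p
        while power <= y:
            total += math.log(p)
            power *= p
    return total

def remainder(x: int) -> float:
    """psi(x) - psi(x^(1/2)) - theta(x) for an integer x >= 2."""
    primes = primes_up_to(x)
    # for integers n, n <= sqrt(x) is the same condition as n <= isqrt(x)
    return psi(x, primes) - psi(math.isqrt(x), primes) - theta(x, primes)

if __name__ == "__main__":
    for x in (100, 10 ** 4, 10 ** 6):
        print(x, remainder(x), x ** (1 / 3))
```

For x = 100 the remainder is ϑ(100^{1/3}) + ϑ(100^{1/5}) = log 6 + log 2 = log 12, which the sketch reproduces.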

As for primes in short gaps, we see from (13.4) that
\[
\pi(x+h) - \pi(x) = \int_x^{x+h} \frac{1}{\log u}\,du + O\big(x^{1/2}\log x\big).
\]
Here the main term on the right is larger than the error term if $h \ge Cx^{1/2}(\log x)^2$. We can do slightly better than this by counting primes between x and x + h with a smoother weight.
Theorem 13.3 (Cramér) There is a constant C > 0 such that if the Riemann Hypothesis is true, then for every x ≥ 2 the interval $(x, x + Cx^{1/2}\log x)$ contains at least $x^{1/2}$ prime numbers.
Proof Let h be a parameter to be determined, and put w(u) = 1 − |u − x|/h when |u − x| ≤ h, and w(u) = 0 otherwise. Then by three applications of (13.7) we see that
\[
\begin{aligned}
\sum_{n} \Lambda(n)w(n) &= \frac{1}{h}\big(\psi_1(x+h) - 2\psi_1(x) + \psi_1(x-h)\big) \\
&= h - \frac{1}{h}\sum_{\rho} \frac{(x+h)^{\rho+1} - 2x^{\rho+1} + (x-h)^{\rho+1}}{\rho(\rho+1)} + O\Big(\frac{1}{hx}\Big).
\end{aligned} \tag{13.11}
\]
Assuming RH, we note that the summand here is obviously
\[
\ll \frac{x^{3/2}}{\gamma^2}. \tag{13.12}
\]
Moreover, if γ > x/h, then the three terms in the numerator may have quite different arguments, in which case the above estimate is the best that we can assert in general. On the other hand, if γ is smaller, then some cancellation must occur in the numerator. To see this, note that the summand may be written
\[
\int_{x-h}^{x+h} (h - |x-u|)u^{\rho-1}\,du \ll h^2x^{-1/2} \tag{13.13}
\]
assuming RH. This improves on (13.12) when |γ| < x/h. We use this estimate for the size of the summand together with Theorem 10.13 to see that the sum in (13.11) is $\ll hx^{1/2}\log x/h$. Hence if $h = Cx^{1/2}\log x$, then
\[
\sum_{x-h < n < x+h} \Lambda(n) \ge \frac{h}{2}.
\]
To complete the proof it remains to estimate the contribution made by higher powers of primes on the left-hand side. The number of squares in this interval is $\ll \log x$, so the squares of the primes contribute an amount that is $\ll (\log x)^2$. For each k > 2 there is at most one k-th power in the interval. Moreover, if $p^k$ is in the interval, then $k \ll \log x$. Hence the higher powers contribute an amount $\ll (\log x)^2$, and the proof is complete.
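The first equality in (13.11) is an exact identity and can be checked by direct computation. The Python sketch below does this for integer x and h with 1 ≤ h < x; the function names are hypothetical and Λ(n) is computed by trial division, which is adequate only for small arguments.

```python
import math

def von_mangoldt(n: int) -> float:
    """Return log p if n = p^k for a prime p and k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime

def psi1(x: int) -> float:
    """psi_1(x) = sum over n <= x of (x - n) * Lambda(n), as in (13.6)."""
    return sum((x - n) * von_mangoldt(n) for n in range(2, x + 1))

def tent_weighted_sum(x: int, h: int) -> float:
    """Sum of Lambda(n) * w(n) with the weight w(u) = 1 - |u - x| / h on |u - x| <= h."""
    return sum(von_mangoldt(n) * (1 - abs(n - x) / h) for n in range(x - h + 1, x + h))

def second_difference(x: int, h: int) -> float:
    """(psi_1(x + h) - 2 psi_1(x) + psi_1(x - h)) / h."""
    return (psi1(x + h) - 2 * psi1(x) + psi1(x - h)) / h

if __name__ == "__main__":
    for x, h in ((1000, 50), (5000, 200)):
        print(x, h, tent_weighted_sum(x, h), second_difference(x, h))
```

The two printed values agree up to rounding, and both are close to the main term h of (13.11).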

Although Cramér’s theorem is highly non-trivial, and is significantly stronger than anything that we know how to prove unconditionally, it is nevertheless disappointing that it falls so far short of what we conjecture to be true, namely that for every ε > 0 the interval $[x, x + x^{\varepsilon}]$ contains a prime, for all $x > x_0(\varepsilon)$. In order to understand the weakness in our approach, write
\[
\psi(x+h) - \psi(x) - h = -\sum_{\rho} \frac{(x+h)^{\rho} - x^{\rho}}{\rho} + \cdots. \tag{13.14}
\]
The contribution of zeros with |γ| > x/h can be attenuated by employing a smoother weight, but no amount of smoothing will eliminate the smaller zeros. However, if |γ| ≤ x/h then the argument of $(x+h)^{\rho}$ is near that of $x^{\rho}$, so there is some significant cancellation in the numerators above. Indeed,
\[
\frac{(x+h)^{\rho} - x^{\rho}}{\rho} = \int_x^{x+h} u^{\rho-1}\,du \ll hx^{-1/2}
\]
if 0 ≤ h ≤ x and β = 1/2. Taking this a step further, we see that the above is
\[
= hx^{\rho-1} + O\big(h^2|\gamma|x^{\beta-2}\big).
\]
Thus the left-hand side of (13.14) bears a passing resemblance to
\[
-hx^{-1/2}\sum_{|\gamma| \le x/h} x^{i\gamma}, \tag{13.15}
\]
if we assume RH. Here the sum has $\asymp xh^{-1}\log x/h$ terms, and with sums of independent random variables in mind, we might guess that the above sum is $\ll (x/h)^{1/2+\varepsilon}$, which suggests

Conjecture 13.4 If 2 ≤ h ≤ x, then
\[
\psi(x+h) - \psi(x) = h + O_{\varepsilon}\big(h^{1/2}x^{\varepsilon}\big).
\]
Although we expect there to be considerable cancellation in (13.15), any such cancellation that might occur among the contributions of the zeros is discarded in the proof of Theorem 13.3. Thus it seems that if we are to argue through zeta zeros to obtain an improvement of Theorem 13.3, then we need not just RH but also some deeper information concerning the distribution of the γ, more precisely that the numbers γ log x are approximately uniformly distributed modulo 2π. Although we cannot demonstrate that the desired cancellation occurs for all x, we can show that there is considerable cancellation in mean square.

Theorem 13.5 Assume RH. Then for X ≥ 2,
\[
\int_X^{2X} (\psi(x) - x)^2\,dx \ll X^2.
\]
Note that if we were to use the pointwise bound of Theorem 13.1 to bound the left-hand side above, then we would obtain an estimate that is larger than the above by a factor $(\log X)^4$. From the above we see that $\psi(x) = x + O(x^{1/2})$ on average.

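Proof Take T = X in the explicit formula of Theorem 12.5. Then
\[
\psi(x) = x - \sum_{|\gamma| \le X} \frac{x^{\rho}}{\rho} + R(x)
\]
where
\[
\int_X^{2X} R(x)^2\,dx \ll X(\log X)^4 + \sum_{X/2 < p^k < 3X} \big(\log p^k\big)^2\Big(1 + \int_1^{\infty} u^{-2}\,du\Big) \ll X(\log X)^4.
\]
On the other hand, the sum over zeros contributes
\[
\int_X^{2X} \Big|\sum_{|\gamma| \le X} \frac{x^{\rho}}{\rho}\Big|^2 dx = \sum_{\substack{\gamma_1, \gamma_2 \\ |\gamma_i| \le X}} \frac{1}{\rho_1\overline{\rho_2}}\int_X^{2X} x^{1+i(\gamma_1-\gamma_2)}\,dx
\ll X^2\sum_{\gamma_1, \gamma_2} \frac{1}{|\rho_1\rho_2|\,|2 + i(\gamma_1 - \gamma_2)|}.
\]
To complete the proof it suffices to show that
\[
\sum_{\gamma_1, \gamma_2} \frac{1}{|\gamma_1\gamma_2|\,(1 + |\gamma_1 - \gamma_2|)} < \infty. \tag{13.16}
\]
In view of the symmetry of zeros about the real axis, we may confine our attention to $\gamma_1 > 0$. For each such zero, we consider $\gamma_2$ in various ranges. By Theorem 10.13, the sum over $\gamma_2 < -\gamma_1$ is
\[
\sum_{\gamma_2 < -\gamma_1} \frac{1}{|\gamma_2|\,(1 + |\gamma_1 - \gamma_2|)} \ll \sum_{\gamma_2 < -\gamma_1} \frac{1}{\gamma_2^2} \ll \sum_{n > \gamma_1} \frac{\log n}{n^2} \ll \frac{\log\gamma_1}{\gamma_1}.
\]
Similarly, the sum over those $\gamma_2$ for which $|\gamma_2| \le \frac12\gamma_1$ is
\[
\ll \frac{1}{\gamma_1}\sum_{0 < \gamma_2 \le \gamma_1} \frac{1}{\gamma_2} \ll \frac{1}{\gamma_1}\sum_{1 \le n \le \gamma_1} \frac{\log n}{n} \ll \frac{(\log\gamma_1)^2}{\gamma_1}.
\]
The sum over those $\gamma_2$ for which $\frac12\gamma_1 < \gamma_2 < \frac32\gamma_1$ is
\[
\ll \frac{1}{\gamma_1}\sum_{|\gamma_2 - \gamma_1| \le \gamma_1/2} \frac{1}{1 + |\gamma_1 - \gamma_2|} \ll \frac{\log\gamma_1}{\gamma_1}\sum_{1 \le n \le \gamma_1} \frac{1}{n} \ll \frac{(\log\gamma_1)^2}{\gamma_1},
\]
and finally the sum over $\gamma_2 \ge \frac32\gamma_1$ is
\[
\ll \sum_{\gamma_2 \ge \frac32\gamma_1} \frac{1}{\gamma_2^2} \ll \sum_{n > \gamma_1} \frac{\log n}{n^2} \ll \frac{\log\gamma_1}{\gamma_1}.
\]
We sum these estimates, multiply by $1/\gamma_1$, and sum over $\gamma_1$ to see that the expression (13.16) is
\[
\ll \sum_{\gamma_1 > 0} \frac{(\log\gamma_1)^2}{\gamma_1^2} \ll \sum_{n=1}^{\infty} \frac{(\log n)^3}{n^2} < \infty.
\]
This completes the proof.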
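The mean square in Theorem 13.5 can be computed exactly for small X, since ψ is a step function. The Python sketch below evaluates $X^{-2}\int_X^{2X} (\psi(x) - x)^2\,dx$ for integer X by integrating piecewise over the unit intervals [n, n + 1); the function names are hypothetical, and a computation at small X only illustrates the order of magnitude and says nothing about RH.

```python
import math

def von_mangoldt(n: int) -> float:
    """Return log p if n = p^k for a prime p and k >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime

def mean_square_ratio(X: int) -> float:
    """(1 / X^2) times the integral of (psi(x) - x)^2 over [X, 2X], for an integer X >= 2."""
    psi = sum(von_mangoldt(n) for n in range(2, X + 1))  # psi(X)
    total = 0.0
    for n in range(X, 2 * X):
        # on [n, n + 1) psi(x) is the constant psi(n), so the integral of
        # (x - psi)^2 over this interval is ((n + 1 - psi)^3 - (n - psi)^3) / 3
        a, b = n - psi, n + 1 - psi
        total += (b ** 3 - a ** 3) / 3.0
        psi += von_mangoldt(n + 1)  # advance to psi(n + 1)
    return total / X ** 2

if __name__ == "__main__":
    for X in (500, 2000, 8000):
        print(X, mean_square_ratio(X))
```

The printed ratios stay bounded as X grows through these values, consistent with the estimate $\ll X^2$.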

The oscillations of $x^{i\gamma} = e^{i\gamma\log x}$ become slower as x increases, since $\frac{d}{dx}\log x = 1/x \to 0$ as x → ∞. However, with the change of variable $x = e^u$ we have $x^{i\gamma} = e^{i\gamma u}$, which is a periodic function of u. Put
\[
f(u) = \frac{\psi(e^u) - e^u}{e^{u/2}}. \tag{13.17}
\]
Assuming RH, the explicit formula of Theorem 12.5 gives
\[
f(u) = -\sum_{\rho} \frac{e^{i\gamma u}}{\rho} + o(1)
\]
as u → ∞. This provides a kind of Fourier expansion of f(u). Since
\[
\int_U^{U+1} |f(u)|^2\,du = \int_{e^U}^{e^{U+1}} (\psi(x) - x)^2\,\frac{dx}{x^2} \asymp e^{-2U}\int_{e^U}^{e^{U+1}} (\psi(x) - x)^2\,dx,
\]
Theorem 13.5 is equivalent (assuming RH) to the estimate
\[
\int_U^{U+1} |f(u)|^2\,du \ll 1. \tag{13.18}
\]
By averaging $|f(u)|^2$ over a longer interval we obtain not just an upper bound, but an asymptotic formula.

Theorem 13.6 Assume RH, and let f(u) be defined as in (13.17). Then
\[
\lim_{U\to\infty}\frac{1}{U}\int_0^U |f(u)|^2\,du = \sum_{\text{distinct } \gamma} \frac{m_{\rho}^2}{|\rho|^2}
\]
where $m_{\rho}$ denotes the multiplicity of the zero ρ.

13.1 Estimates for primes 425

Proof Since the explicit formula for ψ0 (x) is uniformly convergent in intervals
free of prime powers, and is boundedly convergent in a neighbourhood of a
prime power, it follows that
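$$\frac{1}{U}\int_1^U |f(u)|^2\,du = \lim_{T\to\infty}\sum_{\substack{\gamma_1,\gamma_2\\ |\gamma_i|\le T}}\frac{1}{\rho_1\overline{\rho_2}}\cdot\frac{1}{U}\int_1^U e^{i(\gamma_1-\gamma_2)u}\,du + o(1)$$

$$= \Bigl(1-\frac{1}{U}\Bigr)\sum_{\substack{\gamma_1,\gamma_2\\ \gamma_1=\gamma_2}}\frac{1}{|\rho_1|^2} + O\Bigl(\sum_{\substack{\gamma_1,\gamma_2\\ \gamma_1\ne\gamma_2}}\frac{1}{|\gamma_1\gamma_2|}\min\Bigl(1,\frac{1}{U|\gamma_1-\gamma_2|}\Bigr)\Bigr) + o(1).$$

Here the sum over $\gamma_1 \ne \gamma_2$ is finite already when U = 1, in view of (13.16). Since each term in this sum tends to 0 as U → ∞, it follows that

$$\lim_{U\to\infty}\frac{1}{U}\int_1^U |f(u)|^2\,du = \sum_{\substack{\gamma_1,\gamma_2\\ \gamma_1=\gamma_2}}\frac{1}{|\rho_1|^2}.$$

Suppose that ρ = 1/2 + iγ is a zero, and that its multiplicity is $m_\rho$. Then the equation $\gamma_i = \gamma$ has $m_\rho$ solutions for i = 1 and for i = 2. Thus there are $m_\rho^2$ pairs $(\gamma_1, \gamma_2)$ such that $\gamma_1 = \gamma_2 = \gamma$, so we have the result.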
We now return to the distribution of primes in arithmetic progressions.
Theorem 13.7 Let q be given, and suppose that GRH holds for all L-functions
modulo q. Then for x ≥ 2,

$$\psi(x,\chi) = E_0(\chi)x + O\bigl(x^{1/2}(\log x)(\log qx)\bigr), \tag{13.19}$$
$$\vartheta(x,\chi) = E_0(\chi)x + O\bigl(x^{1/2}(\log x)(\log qx)\bigr), \tag{13.20}$$
$$\pi(x,\chi) = E_0(\chi)\,\mathrm{li}(x) + O\bigl(x^{1/2}\log qx\bigr) \tag{13.21}$$
where $E_0(\chi) = 1$ or 0 according as $\chi = \chi_0$ or not.
Proof For $\chi_0$ these relations follow from Theorem 13.1 and (12.14). Suppose that χ is non-principal, and that $\chi^\star$ is a primitive character that induces χ. Thus $\chi^\star$ is a character modulo d for some d | q, 1 < d ≤ q. By taking T = x in the explicit formula for $\psi(x,\chi^\star)$, and appealing to Theorem 10.17, we see that
$$\psi(x,\chi^\star) \ll x^{1/2}(\log qx)(\log x),$$
and then by (12.14) we have (13.19). By the triangle inequality, $|\psi(x,\chi) - \vartheta(x,\chi)| \le \psi(x) - \vartheta(x)$. From Corollary 2.5 we know that this latter quantity is $\ll x^{1/2}$, so (13.20) follows from (13.19). On inserting (13.20) into the identity
$$\pi(x,\chi) = \frac{\vartheta(x,\chi)}{\log x} + \int_2^x \frac{\vartheta(u,\chi)}{u(\log u)^2}\,du,$$
we obtain (13.21).

Corollary 13.8 Let q be given, and assume GRH for all L-functions modulo
q. Suppose that (a, q) = 1. Then for x ≥ 2,
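$$\psi(x;q,a) = \frac{x}{\varphi(q)} + O\bigl(x^{1/2}(\log x)^2\bigr), \tag{13.22}$$
$$\vartheta(x;q,a) = \frac{x}{\varphi(q)} + O\bigl(x^{1/2}(\log x)^2\bigr), \tag{13.23}$$
$$\pi(x;q,a) = \frac{\mathrm{li}(x)}{\varphi(q)} + O\bigl(x^{1/2}\log x\bigr). \tag{13.24}$$
Note that trivially,

$$0 \le \psi(x;q,a) \le (\log x)\sum_{\substack{0<n\le x\\ n\equiv a\ (q)}} 1 \le (\log x)(1 + x/q).$$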

Thus we see that the bound (13.22) is worse than trivial if q > x 1/2 . However,
if q is smaller, say q ≤ x θ with θ < 1/2, then (13.22) provides a form of the
Prime Number Theorem for arithmetic progressions with a much better error
term than we were able to prove unconditionally (cf. Corollary 11.17).

Proof In view of the remarks above, we may assume that $q \le x^{1/2}$. By (11.22) we see that
$$\psi(x;q,a) - \frac{x}{\varphi(q)} = \frac{\psi(x,\chi_0)-x}{\varphi(q)} + \frac{1}{\varphi(q)}\sum_{\chi\ne\chi_0}\overline{\chi}(a)\psi(x,\chi). \tag{13.25}$$

Thus by the triangle inequality,

$$\Bigl|\psi(x;q,a) - \frac{x}{\varphi(q)}\Bigr| \le \frac{|\psi(x,\chi_0)-x|}{\varphi(q)} + \frac{1}{\varphi(q)}\sum_{\chi\ne\chi_0}|\psi(x,\chi)|, \tag{13.26}$$

and so (13.22) follows from (13.19). The other relations are proved similarly.
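
To see the shape of (13.22) on actual numbers, here is a small sketch that computes ψ(x; q, a) for every reduced residue a and prints the error scaled by x^{1/2}. The function name `psi_in_progressions` and the sample values x = 10^6, q = 7 are illustrative assumptions; the sketch assumes q ≥ 2, and nothing in it depends on GRH or tests it in any rigorous sense.

```python
import math

def psi_in_progressions(x, q):
    """Return {a: psi(x; q, a)} for every reduced residue a modulo q (q >= 2)."""
    totals = {a: 0.0 for a in range(1, q + 1) if math.gcd(a, q) == 1}
    is_composite = [False] * (x + 1)
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, x + 1, p):
            is_composite[m] = True
        if q % p == 0:
            continue  # powers of p are not coprime to q
        pk = p
        while pk <= x:
            totals[pk % q] += math.log(p)  # Lambda(p^k) = log p
            pk *= p
    return totals

x, q = 10**6, 7
phi_q = sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)
for a, value in sorted(psi_in_progressions(x, q).items()):
    error = value - x / phi_q
    print(f"a = {a}:  psi = {value:12.2f}   error / sqrt(x) = {error / math.sqrt(x):+.3f}")
```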

Since L(s, χ) has $\asymp \log q$ zeros with $\gamma \ll 1$, we expect (assuming GRH) that ψ(x, χ) is usually about $(x\log q)^{1/2}$ in size. Thus the estimates of Theorem 13.7 are close to what we presume would be best possible. On the right-hand side of (13.25), we have ϕ(q) terms. With sums of independent random variables in mind, we would expect therefore that the right-hand side of (13.25) is usually $\ll (x(\log q)/\varphi(q))^{1/2}$. Since we are unable to prove that there is cancellation in (13.25), we have no recourse but to use the triangle inequality, as in (13.26). However, we conjecture that a lot has been lost at this point.

Conjecture 13.9 If (a, q) = 1 and q ≤ x, then

$$\psi(x;q,a) = \frac{x}{\varphi(q)} + O_\varepsilon\bigl(x^{1/2+\varepsilon}/q^{1/2}\bigr).$$

Although we are unable to confirm our speculations concerning cancellation in (13.25) for any individual a, we can show that such cancellation must occur on average.

Corollary 13.10 Assume GRH for all L-functions modulo q. If 2 ≤ q ≤ x, then
$$\sum_{\substack{a=1\\ (a,q)=1}}^{q}\bigl(\psi(x;q,a) - x/\varphi(q)\bigr)^2 \ll x(\log x)^4 .$$

Proof We claim that

$$\sum_{\substack{a=1\\ (a,q)=1}}^{q}\Bigl|\sum_{\chi}c(\chi)\chi(a)\Bigr|^2 = \varphi(q)\sum_{\chi}|c(\chi)|^2 \tag{13.27}$$

for arbitrary complex numbers c(χ). To understand why this holds, expand the left-hand side and take the sum over a inside, to see that it is

$$= \sum_{\chi_1}\sum_{\chi_2}c(\chi_1)\overline{c(\chi_2)}\sum_{\substack{a=1\\ (a,q)=1}}^{q}\chi_1(a)\overline{\chi_2}(a).$$

By the basic orthogonality property of Dirichlet characters (cf. (4.14)), the inner sum here is ϕ(q) if $\chi_1 = \chi_2$, and is 0 otherwise, and this gives (13.27). By taking $c(\chi) = (\psi(x,\overline{\chi}) - E_0(\chi)x)/\varphi(q)$, it follows by (11.22) that

$$\sum_{\substack{a=1\\ (a,q)=1}}^{q}\bigl(\psi(x;q,a) - x/\varphi(q)\bigr)^2 = \frac{1}{\varphi(q)}\sum_{\chi}|\psi(x,\chi) - E_0(\chi)x|^2 .$$

The stated estimate now follows from (13.19).
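
The identity (13.27) can be checked numerically in the simplest case of a prime modulus, where all characters can be written down from a primitive root. The sketch below is only such a check under that restriction; the function names `primitive_root`, `characters`, `both_sides`, the modulus q = 13 and the random coefficients are assumptions made for illustration.

```python
import cmath
import random

def primitive_root(q):
    """Smallest primitive root modulo an odd prime q (brute force)."""
    for g in range(2, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no primitive root found; q must be an odd prime")

def characters(q):
    """All q-1 Dirichlet characters modulo an odd prime q, as dicts a -> chi(a)."""
    g = primitive_root(q)
    index = {pow(g, k, q): k for k in range(q - 1)}  # discrete logarithm table
    return [
        {a: cmath.exp(2j * cmath.pi * j * index[a] / (q - 1)) for a in range(1, q)}
        for j in range(q - 1)
    ]

def both_sides(q, coeffs):
    """Left- and right-hand sides of (13.27) for the given coefficients c(chi)."""
    chis = characters(q)
    lhs = sum(
        abs(sum(c * chi[a] for c, chi in zip(coeffs, chis))) ** 2
        for a in range(1, q)
    )
    rhs = (q - 1) * sum(abs(c) ** 2 for c in coeffs)  # phi(q) = q - 1 for prime q
    return lhs, rhs

random.seed(1)
q = 13
coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(q - 1)]
lhs, rhs = both_sides(q, coeffs)
print(lhs, rhs)
```

The two printed numbers should agree up to floating-point rounding, whatever coefficients are chosen.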

For non-principal χ let n(χ) denote the least character non-residue of χ, which is to say the least positive integer n such that χ(n) ≠ 1 and χ(n) ≠ 0. Since
$$\psi(x,\chi_0) = \psi(x) + O((\log q)(\log x)) \gg x$$
for x ≥ C(log q)(log log q), it follows by taking $x = C(\log q)^2(\log\log q)^2$ in (13.19) that $n(\chi) \ll (\log q)^2(\log\log q)^2$. As was the case with Cramér's theorem (Theorem 13.3), we can do slightly better by using a weighted sum of primes.

Theorem 13.11 Let χ be a non-principal character modulo q, and assume that L(s, χ) ≠ 0 for σ > 1/2. Then $n(\chi) \ll (\log q)^2$.

Proof By taking k = 1 in (5.17)–(5.19), we see that

$$\sum_{n\le x}\chi(n)\Lambda(n)(x-n) = \frac{-1}{2\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty}\frac{L'}{L}(s,\chi)\frac{x^{s+1}}{s(s+1)}\,ds.$$

On pulling the contour to the line σ = 1/4, we see that the above is
$$-\sum_{\rho}\frac{x^{\rho+1}}{\rho(\rho+1)} - \frac{x^{5/4}}{2\pi}\int_{-\infty}^{\infty}\frac{L'}{L}(1/4+it,\chi)\frac{x^{it}}{(1/4+it)(5/4+it)}\,dt.$$

By Theorem 10.17, the sum over ρ is $\ll x^{3/2}\log q$. By Theorem 10.17 with Lemma 12.7, we see that $\frac{L'}{L}(1/4+it,\chi) \ll \log q\tau$. Hence the second term above is $\ll x^{5/4}\log q$. Thus

$$\sum_{n\le x}\chi(n)\Lambda(n)(x-n) \ll x^{3/2}\log q. \tag{13.28}$$

On the other hand,

$$\sum_{n\le x}\chi_0(n)\Lambda(n)(x-n) = \sum_{n\le x}\Lambda(n)(x-n) + O(x(\log x)(\log q)) \gg x^2 \tag{13.29}$$

if x ≥ C(log q)(log log q). If $\chi(n) = \chi_0(n)$ for all prime powers n ≤ x, then the left-hand sides of (13.28) and (13.29) are equal. However, the right-hand sides are inconsistent if we take $x = C(\log q)^2$, so we obtain the stated result.
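
For a concrete feel for n(χ), one can take χ to be the Legendre symbol modulo an odd prime q, in which case n(χ) is the least quadratic non-residue. The following sketch (the helper names, the use of Euler's criterion and the search range 20000 are choices made here, not taken from the text) finds the prime q in that range for which n(χ)/(log q)^2 is largest.

```python
import math

def least_nonresidue(q):
    """Least n >= 1 with Legendre symbol (n/q) = -1, via Euler's criterion."""
    for n in range(2, q):
        if pow(n, (q - 1) // 2, q) == q - 1:
            return n
    raise ValueError("q must be an odd prime")

def odd_primes_up_to(limit):
    """List of odd primes not exceeding limit (simple sieve)."""
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return [n for n in range(3, limit + 1) if flags[n]]

primes = odd_primes_up_to(20000)
worst = max(primes, key=lambda q: least_nonresidue(q) / math.log(q) ** 2)
ratio = least_nonresidue(worst) / math.log(worst) ** 2
print(f"q = {worst}, n(chi) = {least_nonresidue(worst)}, ratio = {ratio:.3f}")
```

A small ratio over a finite range is of course only consistent with Theorem 13.11, not evidence for GRH.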

Weaker hypotheses concerning the zeros of L(s, χ ) also imply bounds for
n(χ ). The argument here depends on a careful selection of the kernel in the
inverse Mellin transform.

Theorem 13.12 Let χ be a non-principal character (mod q), and suppose that δ is chosen, 1/log q ≤ δ ≤ 1/2, so that L(s, χ) ≠ 0 for 1 − δ < σ < 1, $0 < |t| \le \delta^2\log q$. Then $n(\chi) < (A\delta\log q)^{1/\delta}$. Here A is a suitable absolute constant.

Proof First we show that if 1/log q ≤ R ≤ 1, then

$$\sum_{|\rho-1|>R}\frac{1}{|\rho-1|^2} \ll \frac{\log q}{R}. \tag{13.30}$$

To see this, note that

$$\sum_{R<|\rho-1|\le 2R}\frac{1}{|\rho-1|^2} \le \frac{1}{R^2}\,n(2R;0,\chi) \ll \frac{\log q}{R}$$

by Theorems 11.5 and 10.17. On replacing R by $2^k R$, and summing, we deduce that
$$\sum_{R<|\rho-1|\le 1}\frac{1}{|\rho-1|^2} \ll \frac{\log q}{R}.$$

As for zeros farther from 1, we note by Theorem 10.17 that

$$\sum_{|\rho-1|>1}\frac{1}{|\rho-1|^2} \ll \sum_{n=1}^{\infty}\frac{\log 2qn}{n^2} \ll \log q,$$

and so we have (13.30) for all R ≥ 1/log q.

Let x and y be parameters to be chosen later so that $2 < y \le x^{1/3}$. For $x/y^2 \le u \le xy^2$ set $w(u) = (2\log y - |\log(x/u)|)\,x/u$, and put w(u) = 0 otherwise. Then
$$\sum_{n}w(n)\chi(n)\Lambda(n) = \frac{-1}{2\pi i}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty}\frac{L'}{L}(s,\chi)\Bigl(\frac{y^{s-1}-y^{1-s}}{s-1}\Bigr)^2 x^{s}\,ds \tag{13.31}$$
for $\sigma_0 > 1$. We move the contour to the abscissa $\sigma_0 = -1/2$, and find that the above is
$$= -\sum_{\rho}\Bigl(\frac{y^{\rho-1}-y^{1-\rho}}{\rho-1}\Bigr)^2 x^{\rho} - (1-\kappa)(y-1/y)^2 - \frac{1}{2\pi i}\int_{-1/2-i\infty}^{-1/2+i\infty}\frac{L'}{L}(s,\chi)\Bigl(\frac{y^{s-1}-y^{1-s}}{s-1}\Bigr)^2 x^{s}\,ds. \tag{13.32}$$
Here the second term arises because L(s, χ) has a trivial zero at s = 0 if χ(−1) = 1. Suppose that χ is induced by a primitive character $\chi^\star$. Then by (10.20) we see that
$$\frac{L'}{L}(s,\chi) = \frac{L'}{L}(s,\chi^\star) + \sum_{p\mid q}\frac{\chi^\star(p)\log p}{p^{s}-\chi^\star(p)}.$$

When σ = −1/2, the summand above is $\ll \log p$, and so by Lemma 12.9 we see that $\frac{L'}{L}(-1/2+it,\chi) \ll \log q\tau$. Hence the last term in (13.32) is $\ll x^{-1/2}y^{3}\log q$. If χ is imprimitive, then L(s, χ) may have infinitely many zeros on the imaginary axis. Such zeros are to be included in the sums in (13.30) and (13.32). If a zero ρ is real, then its contribution in (13.32) is negative. If ρ is a zero for which β ≤ 1 − δ, then its contribution to (13.32) is
$$\ll \frac{x^{1-\delta}y^{2\delta}}{|\rho-1|^2}.$$
From (13.30) with R = δ we see that the total contribution of such zeros is $\ll x^{1-\delta}y^{2\delta}(\log q)/\delta$.

If ρ is a zero for which β > 1 − δ and ρ is not real, then by hypothesis we have $|\gamma| \ge \delta^2\log q$. The summand in (13.32) is $\ll x/|\rho-1|^2$, so that from (13.30) with $R = \delta^2\log q$ we see that such zeros contribute an amount $\ll x/\delta^2$. On combining these estimates we find that there is an absolute constant $c_1 > 0$ such that

$$\sum_{n}w(n)\chi(n)\Lambda(n) \le c_1\bigl(x^{1-\delta}y^{2\delta}\delta^{-1}\log q + x\delta^{-2}\bigr). \tag{13.33}$$

If we replace χ by $\chi_0$ in (13.31) and argue as in the proof of the Prime Number Theorem, we find that

$$\sum_{n}w(n)\chi_0(n)\Lambda(n) = 4(\log y)^2 x + O\bigl(x\exp(-c\sqrt{\log x})\bigr) + O(y^2\log q). \tag{13.34}$$

Here the second error term reflects the possible contribution of zeros of $L(s,\chi_0)$ on the imaginary axis. If $\chi(n) = \chi_0(n)$ for all n for which w(n) ≠ 0, then the left-hand side in (13.33) is identical with that in (13.34). Thus we wish to show that the right-hand sides cannot be equal, with a choice of x and y for which $xy^2$ is as small as possible. To this end, note that if $x = (C^3\delta\log q)^{1/\delta}$ and $y = C^{1/\delta}$, then the right-hand side of (13.33) is $\ll (1+1/C)x/\delta^2$, while the right-hand side of (13.34) is $\gg (\log C)^2 x/\delta^2$, uniformly for C ≥ 2. Thus if C is a sufficiently large absolute constant, then the left-hand members of (13.33) and (13.34) cannot be identical, and we have the stated result.

13.1.1 Exercises
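1. Let $\Theta = \sup_\rho \beta$ where ρ runs over all non-trivial zeros of ζ(s). Show that

$$\psi(x) = x + O\bigl(x^{\Theta}(\log x)^2\bigr),$$
$$\vartheta(x) = x + O\bigl(x^{\Theta}(\log x)^2\bigr),$$
$$\pi(x) = \mathrm{li}(x) + O\bigl(x^{\Theta}\log x\bigr).$$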

2. Let F(x) be as in the proof of Theorem 13.3. Suppose that 2 ≤ Δ ≤ h ≤ x, and put w(u) = 0 for u ≤ x − Δ, w(u) = (u − x + Δ)/Δ for x − Δ ≤ u ≤ x, w(u) = 1 for x ≤ u ≤ x + h, w(u) = (x + h + Δ − u)/Δ for x + h ≤ u ≤ x + h + Δ, w(u) = 0 for u ≥ x + h + Δ.
(a) Show that
$$\sum_{n}\Lambda(n)w(n) = \frac{1}{\Delta}\bigl(F(x+h+\Delta) - F(x+h) - F(x) + F(x-\Delta)\bigr) = h + \Delta - \frac{1}{\Delta}\sum_{\rho}S(\rho) + O\Bigl(\frac{1}{x}\Bigr)$$
where
$$S(\rho) = \frac{(x+h+\Delta)^{\rho+1} - (x+h)^{\rho+1} - x^{\rho+1} + (x-\Delta)^{\rho+1}}{\rho(\rho+1)}.$$
(b) Show that if RH holds, then $S(\rho) \ll h\Delta x^{-1/2}$ for |γ| ≤ x/h, that $S(\rho) \ll \Delta x^{1/2}/|\gamma|$ for x/h ≤ |γ| ≤ x/Δ, and that $S(\rho) \ll x^{3/2}/\gamma^2$ for |γ| ≥ x/Δ.
(c) Show that if RH holds, then

$$\psi(x+h) - \psi(x) = h + O\Bigl(x^{1/2}(\log x)\log\frac{2h}{x^{1/2}\log x}\Bigr)$$
uniformly for $x^{1/2}\log x \le h \le x$.
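3. Assume RH. Show that
$$\int_2^X (\psi(x)-x)^2\,\frac{dx}{x^2} \sim (\log X)\sum_{\rho}\frac{m_\rho^2}{|\rho|^2}$$
as X → ∞.
4. Assume RH. Suppose that T is given, T ≥ 2, and let f(u) be defined as in (13.17). Show that
$$\lim_{U\to\infty}\frac{1}{U}\int_1^U\Bigl|f(u) + \sum_{|\gamma|\le T}\frac{e^{i\gamma u}}{\rho}\Bigr|^2 du = \sum_{|\gamma|>T}\frac{m_\rho^2}{|\rho|^2}.$$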

5. Assume GRH for all L-functions modulo q. (a) Show that

$$\sum_{n\le x}\chi(n)\Lambda(n)(x-n) = E_0(\chi)x^2/2 + O\bigl(x^{3/2}\log q\bigr),$$
$$\sum_{p\le x}\chi(p)(\log p)(x-p) = E_0(\chi)x^2/2 + O\bigl(x^{3/2}\log q\bigr).$$

(b) Show that if (a, q) = 1, then

$$\sum_{\substack{n\le x\\ n\equiv a\ (q)}}\Lambda(n)(x-n) = \frac{x^2}{2\varphi(q)} + O\bigl(x^{3/2}\log q\bigr),$$
$$\sum_{\substack{p\le x\\ p\equiv a\ (q)}}(\log p)(x-p) = \frac{x^2}{2\varphi(q)} + O\bigl(x^{3/2}\log q\bigr).$$

(c) Deduce that if (a, q) = 1, then the least prime p ≡ a (mod q) is $\ll \varphi(q)^2(\log q)^2$.
6. Assume Conjecture 13.9. Show that if (a, q) = 1, then there is a prime number p ≡ a (mod q) such that $p \ll_\varepsilon q^{1+\varepsilon}$.
7. Let χ be a non-principal character, and let n(χ) denote the least positive integer n such that χ(n) ≠ 1, χ(n) ≠ 0. Show that n(χ) is a prime number.

8. (Montgomery 1971, p. 121) Let χ be a character modulo q, and let d denote the order of χ.
(a) Show that

$$\frac{1}{d}\sum_{k=1}^{d}\chi^{k}(n)e(-ak/d) = \begin{cases}1 & \text{if } \chi(n) = e(a/d),\\ 0 & \text{otherwise.}\end{cases}$$

(b) Assume that GRH holds for the d − 1 L-functions $L(s,\chi^k)$ where 0 < k < d. Show that for each d-th root of unity e(a/d) there is a prime p such that χ(p) = e(a/d), with $p \ll d^2(\log q)^2$.
9. (Montgomery 1971, p. 122) Let $\mathcal{P}(y)$ denote the set of those primes p such that $\bigl(\frac{n}{p}\bigr) = 1$ for all n ≤ y, and let P(y) be the product of all primes not exceeding y. Suppose that 2 ≤ y ≤ x.
(a) Explain why
$$\sum_{\substack{x<p\le 2x\\ p\in\mathcal{P}(y)}}\log p = 2^{-\pi(y)}\sum_{x<p\le 2x}(\log p)\prod_{p_1\le y}\Bigl(1+\Bigl(\frac{p_1}{p}\Bigr)\Bigr).$$
(b) For each m | P(y), m > 1, let $\chi_m$ be the quadratic character determined by quadratic reciprocity so that $\chi_m(p) = \prod_{p_1\mid m}\bigl(\frac{p_1}{p}\bigr)$. Also, let $\chi_1(n) = 1$ for all n. Explain why the above is

$$= 2^{-\pi(y)}\sum_{m\mid P(y)}\bigl(\vartheta(2x,\chi_m) - \vartheta(x,\chi_m)\bigr).$$

(c) Assume GRH for all quadratic L-functions. Show that the above is

$$= 2^{-\pi(y)}x(1+o(1)) + O\bigl(x^{1/2}(\log x)^2\bigr).$$
(d) Show that if $y = \tfrac23(\log x)(\log\log x)$, then the above is positive, for all sufficiently large x.
(e) Let $n_2(p)$ denote the least quadratic non-residue of p, which is to say the least positive integer n such that $\bigl(\frac{n}{p}\bigr) = -1$. Show that if GRH is true for all quadratic L-functions, then there exist infinitely many primes p such that $n_2(p) > \tfrac23(\log p)(\log\log p)$.
10. (Littlewood 1924a; cf. Goldston 1982)
(a) Show (unconditionally) that
$$\psi(x) \le x - \sum_{\rho}\frac{(x+h)^{\rho+1} - x^{\rho+1}}{h\rho(\rho+1)} + O(h)$$
for 2 ≤ h ≤ x/2.
(b) Show (unconditionally) that
$$\psi(x) \ge x - \sum_{\rho}\frac{x^{\rho+1} - (x-h)^{\rho+1}}{h\rho(\rho+1)} - O(h)$$
for 2 ≤ h ≤ x/2.
(c) Now, and in the following, assume RH. Show that

$$\sum_{\substack{\rho\\ |\gamma|>x/h}}\frac{(x+h)^{\rho+1} - x^{\rho+1}}{h\rho(\rho+1)} \ll x^{1/2}\log(x/h).$$

(d) Show that if |γ| ≤ x/h, then

$$\frac{(x+h)^{\rho+1} - x^{\rho+1}}{h\rho(\rho+1)} = \frac{x^{\rho}}{\rho} + O\bigl(x^{-1/2}h\bigr).$$
(e) Show that
$$\sum_{\substack{\rho\\ |\gamma|\le x/h}}\frac{(x+h)^{\rho+1} - x^{\rho+1}}{h\rho(\rho+1)} = \sum_{\substack{\rho\\ |\gamma|\le x/h}}\frac{x^{\rho}}{\rho} + O\bigl(x^{1/2}\log(x/h)\bigr).$$

(f) Show that

$$\psi(x) = x - \sum_{|\gamma|\le\sqrt{x}/\log x}\frac{x^{\rho}}{\rho} + O\bigl(x^{1/2}\log x\bigr).$$

13.2 Estimates for the zeta function

We now show that our estimates of ζ(s) and of $\frac{\zeta'}{\zeta}(s)$ can be improved if we assume RH. To this end, we begin with a useful explicit formula. For x ≥ 2, y ≥ 2, put
$$w(u) = w(x,y;u) = \begin{cases}1 & \text{if } 1\le u\le x;\\[0.5ex] 1-\dfrac{\log u/x}{\log y} & \text{if } x\le u\le xy;\\[0.5ex] 0 & \text{if } u\ge xy.\end{cases}$$
Then by two applications of (5.20) we find that
$$\sum_{n\le xy}w(n)\frac{\Lambda(n)}{n^{s}} = \frac{-1}{2\pi i\log y}\int_{\sigma_0-i\infty}^{\sigma_0+i\infty}\frac{\zeta'}{\zeta}(s+w)\frac{(xy)^{w}-x^{w}}{w^2}\,dw,$$

and on pulling the contour to the left we see that this is

$$= -\frac{\zeta'}{\zeta}(s) + \frac{(xy)^{1-s}-x^{1-s}}{(1-s)^2\log y} - \sum_{\rho}\frac{(xy)^{\rho-s}-x^{\rho-s}}{(\rho-s)^2\log y} - \sum_{k=1}^{\infty}\frac{(xy)^{-2k-s}-x^{-2k-s}}{(2k+s)^2\log y} \tag{13.35}$$

provided that s ≠ 1 and that ζ(s) ≠ 0. This much is true unconditionally, but from now on we assume RH, and show that the sum on the left provides a useful approximation to $-\frac{\zeta'}{\zeta}(s)$ when σ > 1/2.

Theorem 13.13 Assume RH. Then

$$\Bigl|\frac{\zeta'}{\zeta}(s)\Bigr| \le \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{n^{\sigma}} + O\bigl((\log\tau)^{2-2\sigma}\bigr) \tag{13.36}$$

uniformly for 1/2 + 1/log log τ ≤ σ ≤ 3/2, |t| ≥ 1.

Proof If σ ≥ 1/2, then $|y^{\rho-s} - 1| \le 2$. Hence for σ > 1/2, the sum over ρ in (13.35) has absolute value not exceeding
$$\frac{2x^{1/2-\sigma}}{\log y}\sum_{\rho}\frac{1}{|s-\rho|^2}.$$

By (10.29) and (10.30) we see that

$$\sum_{\rho}\frac{\sigma-1/2}{(\sigma-1/2)^2+(t-\gamma)^2} = \Re\frac{\zeta'}{\zeta}(s) + \frac12\Re\frac{\Gamma'}{\Gamma}(s/2+1) - \frac12\log\pi + \frac{\sigma-1}{(\sigma-1)^2+t^2},$$
and by Theorem C.1 this is
$$= \Re\frac{\zeta'}{\zeta}(s) + \frac12\log\tau + O(1).$$
On inserting this in (13.35), we find that
$$\frac{\zeta'}{\zeta}(s) = -\sum_{n\le xy}w(n)\frac{\Lambda(n)}{n^{s}} + \frac{2\theta x^{1/2-\sigma}}{(\sigma-1/2)\log y}\,\Re\frac{\zeta'}{\zeta}(s) + O\Bigl(\frac{x^{1/2-\sigma}\log\tau}{(\sigma-1/2)\log y}\Bigr) + O\Bigl(\frac{(xy)^{1-\sigma}}{\tau^2}\Bigr) + O\Bigl(\frac{y^{1-\sigma}}{\tau^2}\Bigr) \tag{13.37}$$
where θ is a complex number satisfying |θ| ≤ 1. Thus
$$\frac{\zeta'}{\zeta}(s) \ll \sum_{n\le xy}w(n)\frac{\Lambda(n)}{n^{s}} + \frac{x^{1/2-\sigma}\log\tau}{(\sigma-1/2)\log y} + \frac{(xy)^{1-\sigma}}{\tau^2} + \frac{y^{1-\sigma}}{\tau^2} \tag{13.38}$$

provided that
$$\frac{2x^{1/2-\sigma}}{(\sigma-1/2)\log y} \le c < 1. \tag{13.39}$$
We take

$$y = \exp\Bigl(\frac{1}{\sigma-1/2}\Bigr), \qquad x = (\log\tau)^2/y.$$
Then the left-hand side of (13.39) is $2e(\log\tau)^{1-2\sigma}$, and so (13.39) holds with c = 2/e for σ ≥ 1/2 + 1/log log τ. We observe that

$$\sum_{n\le xy}w(n)\frac{\Lambda(n)}{n^{s}} \ll \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{n^{1/2}} \ll \log\tau$$

uniformly for σ ≥ 1/2. On inserting this in (13.38), we find that

$$\frac{\zeta'}{\zeta}(s) \ll \log\tau$$
uniformly for σ ≥ 1/2 + 1/log log τ, |t| ≥ 1. We insert this on the right-hand side of (13.37) to obtain the stated estimate.
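
The choice of parameters in the proof can be sanity-checked numerically: with y = exp(1/(σ − 1/2)) and x = (log τ)^2/y, the left-hand side of (13.39) should equal 2e(log τ)^{1−2σ}, which is 2/e at σ = 1/2 + 1/log log τ. The sketch below does only this arithmetic; the function names and the sample value τ = 10^6 are assumptions made for illustration.

```python
import math

def lhs_13_39(sigma, tau):
    """Left-hand side of (13.39) for y = exp(1/(sigma-1/2)), x = (log tau)^2 / y."""
    y = math.exp(1.0 / (sigma - 0.5))
    x = math.log(tau) ** 2 / y
    return 2 * x ** (0.5 - sigma) / ((sigma - 0.5) * math.log(y))

def closed_form(sigma, tau):
    """The closed form 2e (log tau)^(1 - 2 sigma) stated in the proof."""
    return 2 * math.e * math.log(tau) ** (1 - 2 * sigma)

tau = 1e6
sigma_min = 0.5 + 1.0 / math.log(math.log(tau))
for sigma in (sigma_min, 0.95, 1.2, 1.5):
    print(f"sigma = {sigma:.4f}  lhs = {lhs_13_39(sigma, tau):.6f}  "
          f"closed form = {closed_form(sigma, tau):.6f}")
```

At the smallest admissible σ both columns equal 2/e, and they decrease as σ grows, so the constant c = 2/e works throughout the stated range.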

Corollary 13.14 Assume RH. Then

$$\frac{\zeta'}{\zeta}(s) \ll \bigl((\log\tau)^{2-2\sigma} + 1\bigr)\min\Bigl(\frac{1}{|\sigma-1|},\ \log\log\tau\Bigr)$$
uniformly for 1/2 + 1/log log τ ≤ σ ≤ 3/2, |t| ≥ 1.

Proof By Chebyshev's estimate (Theorem 2.4) we know that

$$\sum_{U\le n<eU}\frac{\Lambda(n)}{n^{\sigma}} \ll U^{1-\sigma}.$$

On summing this over $U = e^{k}$ for 0 ≤ k ≤ 2 log log τ, we obtain the stated bound from Theorem 13.13.

Corollary 13.15 Assume RH. Then

$$|\log\zeta(s)| \le \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{n^{\sigma}\log n} + O\Bigl(\frac{(\log\tau)^{2-2\sigma}}{\log\log\tau}\Bigr) \tag{13.40}$$

uniformly for 1/2 + 1/log log τ ≤ σ ≤ 3/2, |t| ≥ 1.

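Proof Since
$$\log\zeta(\sigma+it) = \log\zeta(3/2+it) - \int_{\sigma}^{3/2}\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha,$$
it follows by the triangle inequality that
$$|\log\zeta(\sigma+it)| \le |\log\zeta(3/2+it)| + \int_{\sigma}^{3/2}\Bigl|\frac{\zeta'}{\zeta}(\alpha+it)\Bigr|\,d\alpha,$$
which by Theorem 13.13 is

$$\le |\log\zeta(3/2+it)| + \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{\log n}\bigl(n^{-\sigma} - n^{-3/2}\bigr) + O\Bigl(\frac{(\log\tau)^{2-2\sigma}}{\log\log\tau}\Bigr).$$

But

$$|\log\zeta(3/2+it)| = \Bigl|\sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log n}n^{-3/2-it}\Bigr| \le \sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log n}n^{-3/2},$$

so it follows that
$$|\log\zeta(\sigma+it)| \le \sum_{n\le(\log\tau)^2}\frac{\Lambda(n)}{\log n}n^{-\sigma} + \sum_{n>(\log\tau)^2}\frac{\Lambda(n)}{\log n}n^{-3/2} + O\Bigl(\frac{(\log\tau)^{2-2\sigma}}{\log\log\tau}\Bigr). \tag{13.41}$$
By the Chebyshev estimate $\psi(x) \ll x$ we see that
$$\sum_{U<n\le 2U}\frac{\Lambda(n)}{\log n}n^{-3/2} \ll U^{-1/2}(\log U)^{-1}.$$

By taking $U = (\log\tau)^2 2^{k}$, and summing over k ≥ 0, we deduce that

$$\sum_{n>(\log\tau)^2}\frac{\Lambda(n)}{\log n}n^{-3/2} \ll (\log\tau)^{-1}(\log\log\tau)^{-1}.$$

Since this is majorized by the error term in (13.41), we have (13.40).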

Corollary 13.16 Assume RH. If |t| ≥ 1, then

$$|\log\zeta(s)| \le \log\frac{1}{\sigma-1} + O(\sigma-1) \tag{13.42}$$
for 1 + 1/log log τ ≤ σ ≤ 3/2,

$$|\log\zeta(s)| \le \log\log\log\tau + O(1) \tag{13.43}$$

for 1 − 1/log log τ ≤ σ ≤ 1 + 1/log log τ, and

$$|\log\zeta(s)| \le \log\frac{1}{1-\sigma} + O\Bigl(\frac{(\log\tau)^{2-2\sigma}}{(1-\sigma)\log\log\tau}\Bigr) \tag{13.44}$$
for 1/2 + 1/log log τ ≤ σ ≤ 1 − 1/log log τ.

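Proof To establish (13.42), we note that if 1 < σ ≤ 3/2, then

$$|\log\zeta(s)| = \Bigl|\sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log n}n^{-s}\Bigr| \le \sum_{n=1}^{\infty}\frac{\Lambda(n)}{\log n}n^{-\sigma} = \log\zeta(\sigma) = \log\Bigl(\frac{1}{\sigma-1} + O(1)\Bigr) = \log\frac{1}{\sigma-1} + O(\sigma-1).$$
As for (13.43), we note first that
$$\sum_{n\le z}\frac{\Lambda(n)}{n\log n} = \log\log z + O(1)$$
by Mertens' estimates (Theorem 2.7). Also, if σ = 1 + O(1/log z), then

$$n^{-\sigma} - n^{-1} = -\int_{1}^{\sigma}n^{-\alpha}\,d\alpha\,\log n \ll |\sigma-1|\,n^{-1}\log n$$

for 1 ≤ n ≤ z, so that
$$\sum_{n\le z}\frac{\Lambda(n)}{\log n}\bigl(n^{-\sigma} - n^{-1}\bigr) \ll |\sigma-1|\sum_{n\le z}\frac{\Lambda(n)}{n} \ll |\sigma-1|\log z \ll 1.$$

On combining these estimates with $z = (\log\tau)^2$, we see that the sum in (13.40) is ≤ log log log τ + O(1), which gives the desired estimate.
Concerning (13.44), we note that
$$\sum_{n\le z}\frac{\Lambda(n)}{\log n}n^{-\sigma} = \int_{2^-}^{z}\frac{1}{u^{\sigma}\log u}\,d\psi(u) = \int_{2}^{z}\frac{du}{u^{\sigma}\log u} + \frac{\psi(z)-z}{z^{\sigma}\log z} + \frac{2^{1-\sigma}}{\log 2} + \int_{2}^{z}\frac{\psi(u)-u}{u^{\sigma+1}\log u}\Bigl(\sigma + \frac{1}{\log u}\Bigr)du. \tag{13.45}$$

By the change of variable $v = u^{1-\sigma}$, the first integral immediately above is $\mathrm{li}(z^{1-\sigma}) - \mathrm{li}(2^{1-\sigma})$. But
$$\mathrm{li}(z^{1-\sigma}) \ll \frac{z^{1-\sigma}}{(1-\sigma)\log z}$$
for σ ≤ 1 − 1/log z, and

$$-\mathrm{li}\bigl(2^{1-\sigma}\bigr) = \int_{2^{1-\sigma}}^{2}\frac{dv}{\log v} = \int_{2^{1-\sigma}}^{2}\Bigl(\frac{1}{v-1} + O(1)\Bigr)dv = -\log\bigl(2^{1-\sigma}-1\bigr) + O(1) = \log\frac{1}{1-\sigma} + O(1).$$
By Theorem 13.1, the second term in (13.45) is $\ll z^{1/2-\sigma}\log z$, and the final integral in (13.45) is
$$\ll \int_{2}^{\infty}u^{-\sigma-1/2}\log u\,du \ll (\sigma-1/2)^{-2}.$$

On combining these estimates, we find that

$$\sum_{n\le z}\frac{\Lambda(n)}{n^{\sigma}\log n} = \log\frac{1}{1-\sigma} + O\Bigl(\frac{z^{1-\sigma}}{(1-\sigma)\log z}\Bigr),$$

uniformly for 1/2 < σ ≤ 1 − 1/log z. On taking $z = (\log\tau)^2$, the desired estimate now follows from (13.40).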

From Corollary 13.16 we see that if RH holds, then

$$\frac{1}{\log\log\tau} \ll |\zeta(1+it)| \ll \log\log\tau$$
for |t| ≥ 1. We can make this more precise by taking a little more care.

Corollary 13.17 Assume RH. Then $|\zeta(1+it)| \le 2e^{C_0}\log\log\tau + O(1)$.

Proof We observe that

$$\sum_{n\le z}\frac{\Lambda(n)}{n\log n} = \sum_{p^{k}\le z}\frac{\Lambda(p^{k})}{p^{k}\log p^{k}} \le \sum_{p\le z}\sum_{k=1}^{\infty}\frac{1}{kp^{k}} = \sum_{p\le z}\log\Bigl(1-\frac{1}{p}\Bigr)^{-1} = C_0 + \log\log z + O(1/\log z)$$

by Mertens' estimate (Theorem 2.7). We take $z = (\log\tau)^2$, insert this in Corollary 13.15, and exponentiate to obtain the stated bound.
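
The Mertens-type estimate used in this proof is easy to observe numerically. The sketch below sums Λ(n)/(n log n) = 1/(k p^k) over prime powers p^k ≤ z and subtracts log log z; the result should be close to Euler's constant C_0. The function name, the hard-coded value of C_0 and the cutoff z = 10^5 are choices made here for illustration.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant, denoted C_0 in the text

def mertens_type_sum(z):
    """Sum of Lambda(n)/(n log n) = 1/(k p^k) over prime powers p^k <= z."""
    is_composite = [False] * (z + 1)
    total = 0.0
    for p in range(2, z + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, z + 1, p):
            is_composite[m] = True
        pk, k = p, 1
        while pk <= z:
            total += 1.0 / (k * pk)
            pk *= p
            k += 1
    return total

z = 10**5
difference = mertens_type_sum(z) - math.log(math.log(z))
print(f"sum - log log z = {difference:.6f},  C_0 = {EULER_C0:.6f}")
```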

To complete the picture, we estimate |ζ (s)| and argζ (s) when σ is near 1/2.
Of these estimates, the upper bound for |ζ (s)| is the most immediate.

Theorem 13.18 Assume RH. There is an absolute constant C > 0 such that

$$|\zeta(s)| < \exp\Bigl(\frac{C\log\tau}{\log\log\tau}\Bigr)$$
uniformly for σ ≥ 1/2, |t| ≥ 1.

Note that this is a quantitative form of the Lindelöf Hypothesis (LH).

Proof Put σ1 = 1/2 + 1/ log log τ . For σ ≥ σ1 , the above is contained in

Here the ﬁrst member on the right-hand side is bounded by Corollary 13.15,
and 0 ≤ σ1 − σ ≤ 1/ log log τ , so we have the stated bound.

To obtain the remaining estimates, we ﬁrst establish two lemmas, which are
of interest in their own right.
Lemma 13.19 Assume RH. Then for T ≥ 4,

$$N(T + 1/\log\log T) - N(T) \ll \frac{\log T}{\log\log T}.$$

Proof Take s = 1/2 + 1/log log T + iT. Then $\frac{\zeta'}{\zeta}(s) \ll \log T$ by Corollary 13.14. Hence by Lemma 12.1 it follows that
$$\sum_{\substack{\rho\\ |\gamma-T|\le 1}}\frac{1}{s-\rho} \ll \log T.$$

Here each summand has positive real part, and for T ≤ γ ≤ T + 1/log log T the real part is $\ge \tfrac12\log\log T$, so we obtain the stated bound.

By mimicking the proof of Lemma 12.1, we obtain

Lemma 13.20 Assume RH. If |σ − 1/2| ≤ 1/log log τ, then
$$\frac{\zeta'}{\zeta}(s) = \sum_{\substack{\rho\\ |\gamma-t|\le 1/\log\log\tau}}\frac{1}{s-\rho} + O(\log\tau).$$

In applying the above, one is free to replace the condition |γ − t| ≤ 1/log log τ by a different condition, say |γ − t| ≤ δ, provided that $\delta \asymp 1/\log\log\tau$. To see why this is so, note that a summand in one sum that is missing in the other has absolute value $\ll \log\log\tau$, and that by Lemma 13.19 there are $\ll (\log\tau)/\log\log\tau$ such summands. Hence the total contribution made by terms in one sum but not the other is $\ll \log\tau$, and a discrepancy of this size may be absorbed in the error term.
Proof Put $\sigma_1 = 1/2 + 1/\log\log\tau$, and set $s_1 = \sigma_1 + it$. We apply Lemma 12.1 at $s_1$ and at s, and difference, to see that

$$\frac{\zeta'}{\zeta}(s) = \frac{\zeta'}{\zeta}(s_1) + \sum_{|\gamma-t|\le 1}\Bigl(\frac{1}{s-\rho} - \frac{1}{s_1-\rho}\Bigr) + O(\log\tau).$$

Here the first term on the right-hand side is $\ll \log\tau$, by Corollary 13.14. Let k be a positive integer, and consider zeros for which k/log log τ ≤ |γ − t| ≤ (k + 1)/log log τ. By the preceding lemma, there are $\ll (\log\tau)/\log\log\tau$ such zeros, each one of which contributes an amount $\ll (\log\log\tau)/k^2$ to the above sum. On summing over k we see that the contribution of zeros for which |γ − t| > 1/log log τ is $\ll \log\tau$. Finally, for the zeros with |γ − t| ≤ 1/log log τ, we observe that $|1/(s_1-\rho)| \le \log\log\tau$, and there are $\ll (\log\tau)/\log\log\tau$ such zeros, so we have the stated result.

If t is not the ordinate of a zero of the zeta function, then we define arg ζ(s) by continuous variation along the ray α + it where α runs from σ to +∞, and arg ζ(+∞ + it) = 0. If t is the ordinate of a zero, then we put $\arg\zeta(s) = \bigl(\arg\zeta(\sigma+it^{+}) + \arg\zeta(\sigma+it^{-})\bigr)/2$.

Theorem 13.21 Assume RH. Then

$$\arg\zeta(s) \ll \frac{\log\tau}{\log\log\tau}$$
uniformly for σ ≥ 1/2, |t| ≥ 1.

Proof We may assume that t is not the ordinate of a zero. Let $\sigma_1$ and $s_1$ be defined as in the preceding proof. If $\sigma \ge \sigma_1$, then the above follows from Corollary 13.16. Suppose now that $1/2 \le \sigma \le \sigma_1$. Then
$$\arg\zeta(s) = \arg\zeta(s_1) - \int_{\sigma}^{\sigma_1}\Im\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha.$$
Since $0 \le \sigma_1 - \sigma \le 1/\log\log\tau$, by Lemma 13.20 the right-hand side above is
$$= -\sum_{|\gamma-t|\le 1/\log\log\tau}\int_{\sigma}^{\sigma_1}\Im\frac{1}{\alpha+it-\rho}\,d\alpha + O\Bigl(\frac{\log\tau}{\log\log\tau}\Bigr).$$

Here the summand is

$$\arctan\frac{\sigma-1/2}{\gamma-t} - \arctan\frac{\sigma_1-1/2}{\gamma-t}.$$
If γ > t, then the above lies between 0 and π/2, while if γ < t, then it lies between −π/2 and 0. In either case, the contribution is bounded, and there are $\ll (\log\tau)/\log\log\tau$ summands by Lemma 13.19, so we have the result.

Although a lower bound for |ζ (s)| at all heights is out of the question, we
can show, assuming RH, that there are heights for which a lower bound can be
established.

Theorem 13.22 Assume RH. There is an absolute constant C such that for every T ≥ 4 there is a t, T ≤ t ≤ T + 1, such that

$$|\zeta(s)| \ge \exp\Bigl(\frac{-C\log T}{\log\log T}\Bigr)$$
uniformly for −1 ≤ σ ≤ 2.

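Proof By Corollary 10.5 we see that if −1 ≤ σ ≤ 1/2, then $|\zeta(s)| \gg |\zeta(1-\sigma+it)|$. Thus we may restrict our attention to 1/2 ≤ σ ≤ 2. Put $\sigma_1 = 1/2 + 1/\log\log T$. From Corollary 13.16 we have the desired lower bound for all heights, for $\sigma_1 \le \sigma \le 2$. For the remaining interval, $I = [1/2, \sigma_1]$, we show that
$$\int_{T}^{T+1}\log\frac{1}{\min_{\sigma\in I}|\zeta(s)|}\,dt \ll \frac{\log T}{\log\log T}. \tag{13.46}$$

Put $s_1 = \sigma_1 + it$. Then

$$\log|\zeta(s)| = \log|\zeta(s_1)| - \int_{\sigma}^{\sigma_1}\Re\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha.$$
By Corollary 13.16 and Lemma 13.20, this is
$$= -\int_{\sigma}^{\sigma_1}\sum_{\substack{\rho\\ |\gamma-t|\le\delta}}\Re\frac{1}{\alpha+it-\rho}\,d\alpha + O\Bigl(\frac{\log T}{\log\log T}\Bigr)$$

where δ = 1/log log T. The summands are non-negative, so the above is

$$\ge -\int_{1/2}^{\sigma_1}\sum_{\substack{\rho\\ |\gamma-t|\le\delta}}\Re\frac{1}{\alpha+it-\rho}\,d\alpha + O\Bigl(\frac{\log T}{\log\log T}\Bigr).$$

Since this lower bound applies for all σ ∈ I, the above provides a lower bound for $\log\min_{\sigma\in I}|\zeta(s)|$. We note that
$$\int_{1/2}^{\sigma_1}\int_{\gamma-\delta}^{\gamma+\delta}\Re\frac{1}{\alpha+it-\rho}\,dt\,d\alpha = \int_{0}^{\delta}\int_{-\delta}^{\delta}\frac{x}{x^2+y^2}\,dy\,dx \le \int_{-\pi/2}^{\pi/2}\int_{0}^{2\delta}\frac{r\cos\theta}{r^2}\,r\,dr\,d\theta = 4\delta.$$
Hence
$$\int_{T}^{T+1}\int_{1/2}^{\sigma_1}\sum_{\substack{\rho\\ |\gamma-t|\le\delta}}\Re\frac{1}{\alpha+it-\rho}\,d\alpha\,dt \ll \delta\sum_{\substack{\rho\\ T-1\le\gamma\le T+2}}1 \ll \frac{\log T}{\log\log T},$$

so we have (13.46), and the proof is complete.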

By Theorem 5.2 and Corollary 5.3 with $\sigma_0 = 1 + 1/\log x$ and 1 ≤ T ≤ x, we see that
$$M(x) = \frac{1}{2\pi i}\int_{\sigma_0-iT}^{\sigma_0+iT}\frac{x^{s}}{\zeta(s)s}\,ds + O\Bigl(\frac{x\log x}{T}\Bigr). \tag{13.47}$$
By Corollary 13.16 we see (assuming RH) that $|\zeta(1/2+\varepsilon+it)| \gg_\varepsilon \tau^{-\varepsilon}$. Hence, by moving the contour to the abscissa 1/2 + ε, we deduce that $M(x) \ll_\varepsilon x^{1/2+\varepsilon}$. This can be made more precise, by determining ε as a function of x, but in order to do so we need a lower bound for |ζ(s)| when 1/2 < σ ≤ 1/2 + 1/log log τ.

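Theorem 13.23 Assume RH. There is a constant C > 0 such that if |t| ≥ 1, then
$$\Bigl|\frac{1}{\zeta(s)}\Bigr| \le \begin{cases}\exp\Bigl(\dfrac{C\log\tau}{\log\log\tau}\Bigr) & \text{for } \sigma \ge 1/2 + 1/\log\log\tau,\\[2ex] \exp\Bigl(\dfrac{C\log\tau}{\log\log\tau}\log\dfrac{e}{(\sigma-1/2)\log\log\tau}\Bigr) & \text{for } 1/2 < \sigma \le 1/2 + 1/\log\log\tau.\end{cases}$$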

Proof The first part follows from Corollary 13.14. Let $\sigma_1$ and $s_1$ be defined as in the proof of Lemma 13.20, and suppose that $1/2 < \sigma \le \sigma_1$. Then
$$\log\zeta(s) = \log\zeta(s_1) - \int_{\sigma}^{\sigma_1}\frac{\zeta'}{\zeta}(\alpha+it)\,d\alpha.$$
Here the first term on the right is $\ll (\log\tau)/\log\log\tau$, by Corollary 13.16. By Lemma 13.19 we know that the sum in Lemma 13.20 has $\ll (\log\tau)/\log\log\tau$ terms. Since each term has absolute value ≤ 1/(σ − 1/2), it follows that
$$\frac{\zeta'}{\zeta}(\alpha+it) \ll \frac{\log\tau}{(\alpha-1/2)\log\log\tau}$$
for $1/2 < \alpha \le \sigma_1$. Hence

$$\log\zeta(s) \ll \Bigl(1 + \log\frac{\sigma_1-1/2}{\sigma-1/2}\Bigr)\frac{\log\tau}{\log\log\tau},$$
which gives the stated bound.

Theorem 13.24 Assume RH. Then there is an absolute constant C > 0 such that

$$M(x) \ll x^{1/2}\exp\Bigl(\frac{C\log x}{\log\log x}\Bigr)$$
for x ≥ 4.
Proof Put $\sigma_1 = 1/2 + 1/\log\log x$, and let $\mathcal{C}$ denote the contour that passes by straight line segments from $\sigma_0 - ix$ to $\sigma_1 - ix$ to $\sigma_1 + ix$ to $\sigma_0 + ix$. Then
$$\int_{\sigma_0-ix}^{\sigma_0+ix}\frac{x^{s}}{\zeta(s)s}\,ds = \int_{\mathcal{C}}\frac{x^{s}}{\zeta(s)s}\,ds,$$
since the integrand is analytic in the rectangle enclosed by these contours. By the first case of Theorem 13.23 we see that
$$\int_{\sigma_1+ix}^{\sigma_0+ix}\frac{x^{s}}{\zeta(s)s}\,ds \ll \exp\Bigl(\frac{C\log x}{\log\log x}\Bigr)\int_{\sigma_1}^{\sigma_0}x^{\sigma-1}\,d\sigma \ll \exp\Bigl(\frac{C\log x}{\log\log x}\Bigr),$$
and the same estimate applies to the integral from $\sigma_1 - ix$ to $\sigma_0 - ix$. Similarly, by the second part of Theorem 13.23 we see that
$$\int_{\sigma_1-ix}^{\sigma_1+ix}\frac{x^{s}}{\zeta(s)s}\,ds \ll x^{\sigma_1}\int_{0}^{x}\exp\Bigl(\frac{C\log\tau}{\log\log\tau}\log\frac{e\log\log x}{\log\log\tau}\Bigr)\frac{dt}{\tau}.$$

By logarithmic differentiation we may confirm that the argument of the exponential is an increasing function of t for 0 ≤ t ≤ x. Thus we obtain the stated bound by taking T = x in (13.47).
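
For orientation, the size of M(x) relative to x^{1/2} can be observed directly for small x. The following sketch tabulates the Möbius function with a sieve and prints M(x)/x^{1/2} at powers of 10; the helper names and the range 10^6 are illustrative choices made here, and a bounded ratio over a finite range says nothing about RH.

```python
import math

def mobius_table(limit):
    """Return a list mu with mu[n] equal to the Moebius function for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                is_composite[m] = True
            mu[m] = -mu[m]          # one sign change per distinct prime factor
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0               # n divisible by a square
    return mu

def mertens_table(limit):
    """Return a list M with M[x] = sum of mu(n) over n <= x."""
    mu = mobius_table(limit)
    M = [0] * (limit + 1)
    for n in range(1, limit + 1):
        M[n] = M[n - 1] + mu[n]
    return M

M = mertens_table(10**6)
for e in range(1, 7):
    x = 10**e
    print(f"x = 10^{e}:  M(x) = {M[x]:5d}   M(x)/sqrt(x) = {M[x] / math.sqrt(x):+.3f}")
```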

13.2.1 Exercises
1. (a) Show (unconditionally) that
$$\Re\frac{\xi'}{\xi}(s) = \sum_{\rho}\Re\frac{1}{s-\rho}$$
whenever ξ(s) ≠ 0.
(b) Show (unconditionally) that
$$\Re\frac{\xi'}{\xi}(1/2+it) = 0$$
for all t such that ξ(1/2 + it) ≠ 0.
(c) Assume RH. Show that
$$\Re\frac{\xi'}{\xi}(s)\ \begin{cases} > 0 & \text{if } \sigma > 1/2,\\ = 0 & \text{if } \sigma = 1/2 \text{ and } \xi(s)\ne 0,\\ < 0 & \text{if } \sigma < 1/2.\end{cases}$$
(d) Assume RH. Show that if ξ′(s) = 0, then ℜs = 1/2.
(e) Assume RH, and let t be any fixed real number. Show that |ξ(σ + it)| is a strictly increasing function of σ for 1/2 ≤ σ < ∞, and that |ξ(σ + it)| is a strictly decreasing function of σ for −∞ < σ ≤ 1/2.
(f) Assume RH, and suppose that t is a fixed real number. Show that $(\sigma-1/2)\,\Re\frac{\xi'}{\xi}(\sigma+it)$ is an increasing function of σ for 1/2 ≤ σ < ∞.
(g) Assume RH. Show that if $1/2 < \sigma_2 \le \sigma_1$, then
$$|\xi(\sigma_2+it)| \ge |\xi(\sigma_1+it)|\cdot\Bigl(\frac{\sigma_2-1/2}{\sigma_1-1/2}\Bigr)^{(\sigma_1-1/2)\Re\frac{\xi'}{\xi}(\sigma_1+it)}.$$
2. (a) Show (unconditionally) that if ξ(s) ≠ 0, then
$$\frac{\xi''}{\xi}(s) - \Bigl(\frac{\xi'}{\xi}(s)\Bigr)^2 = -\sum_{\rho}\frac{1}{(s-\rho)^2}.$$

(b) Show (unconditionally) that if t is real, then ξ′(1/2 + it) ∈ iR.

(c) Show (unconditionally) that if t is real, then ξ″(1/2 + it) ∈ R.
(d) Show (unconditionally) that if t is real, then
$$\sum_{\rho}\frac{1}{(1/2+it-\rho)^2}$$
is real.
(e) Assume RH. Show that if ξ(1/2 + it) ≠ 0, then

$$\frac{\xi''}{\xi}(1/2+it) > \Bigl(\frac{\xi'}{\xi}(1/2+it)\Bigr)^2.$$
(f) Assume RH. Show that if ξ′(1/2 + it) = 0 and ξ(1/2 + it) ≠ 0, then sgn ξ″(1/2 + it) = sgn ξ(1/2 + it).
(g) Assume RH. Show that if ξ′(1/2 + it) = 0 and ξ(1/2 + it) ≠ 0, then
$$\operatorname{sgn}\frac{\partial^2}{\partial t^2}\xi(1/2+it) = -\operatorname{sgn}\xi(1/2+it).$$
(h) Assume RH. Suppose that ξ(1/2 + iγ) = ξ(1/2 + iγ′) = 0, and that ξ(1/2 + it) ≠ 0 for γ < t < γ′. Show that ξ′(1/2 + it) has exactly one zero with γ < t < γ′, and that this zero is necessarily simple.
(i) Assume RH. In the above notation, show that the number of zeros of ξ′(1/2 + it) in the interval [γ, γ′), counting multiplicity, is the same as the number of zeros of ξ(1/2 + it) in the same interval.
(j) Assume RH. Let $N_1(T)$ denote the number of zeros of ξ′(s) with imaginary part in the interval [0, T]. Show that $N_1(T) = N(T) + O(1)$.
3. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that
$$\Bigl|\frac{L'}{L}(s,\chi)\Bigr| \le \sum_{n\le(\log q\tau)^2}\frac{\Lambda(n)}{n^{\sigma}} + O\Bigl(\frac{(\log q\tau)^{2-2\sigma}}{\log\log\tau}\Bigr)$$

uniformly for 1/2 + 1/log log qτ ≤ σ ≤ 3/2.

4. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that

$$\frac{L'}{L}(s,\chi) \ll \bigl((\log q\tau)^{2-2\sigma} + 1\bigr)\min\Bigl(\frac{1}{|\sigma-1|},\ \log\log q\tau\Bigr)$$
uniformly for 1/2 + 1/log log qτ ≤ σ ≤ 3/2.
5. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that

$$|\log L(s,\chi)| \le \sum_{n\le(\log q\tau)^2}\frac{\Lambda(n)}{n^{\sigma}\log n} + O\Bigl(\frac{(\log q\tau)^{2-2\sigma}}{\log\log q\tau}\Bigr)$$

uniformly for 1/2 + 1/log log qτ ≤ σ ≤ 3/2.

6. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2.
(a) Show that
$$|L(s,\chi)| \le \log\frac{1}{\sigma-1} + O(\sigma-1)$$
uniformly for 1 + 1/log log qτ ≤ σ ≤ 3/2.

(b) Show that
$$|L(s,\chi)| \le \log\log q\tau + O(1)$$
uniformly for 1 − 1/log log qτ ≤ σ ≤ 1 + 1/log log qτ.
(c) Show that

$$|L(s,\chi)| \le \log\frac{1}{1-\sigma} + O\Bigl(\frac{(\log q\tau)^{2-2\sigma}}{(1-\sigma)\log\log q\tau}\Bigr)$$
uniformly for 1/2 + 1/log log qτ ≤ σ ≤ 1 − 1/log log qτ.
7. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that $|L(1+it,\chi)| \le 2e^{C_0}\log\log q\tau$.
8. Let χ be a primitive Dirichlet character modulo q with q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that there is an absolute constant C > 0 such that

$$|L(s,\chi)| \le \exp\Bigl(\frac{C\log q\tau}{\log\log q\tau}\Bigr)$$
uniformly for 1/2 ≤ σ ≤ 3/2.
9. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that the number of zeros ρ = 1/2 + iγ of L(s, χ) with T ≤ γ ≤ T + 1/log log qτ is $\ll (\log q\tau)/(\log\log q\tau)$ uniformly in T.
10. Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that if |σ − 1/2| ≤ 1/log log qτ, then
$$\frac{L'}{L}(s,\chi) = \sum_{|\gamma-t|\le 1/\log\log q\tau}\frac{1}{s-\rho} + O(\log q\tau).$$

11. (Selberg 1946b, Section 5) Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that
$$\arg L(s,\chi) \ll \frac{\log q\tau}{\log\log q\tau}$$
uniformly for σ ≥ 1/2.
12. Let χ be a character modulo q, and suppose that χ is induced by a primitive character $\chi^\star$ where $\chi^\star$ is a character modulo d for some d | q. Show that

$$\frac{L'}{L}(s,\chi) - \frac{L'}{L}(s,\chi^\star) \ll \bigl((\log q)^{1-\sigma} + 1\bigr)\min\Bigl(\frac{1}{|\sigma-1|},\ \log\log q\Bigr).$$
13. (Vorhauer 2006) Let χ be a primitive character modulo q, q > 1, and suppose that L(s, χ) ≠ 0 for σ > 1/2. Show that
$$\lim_{T\to\infty}\sum_{|\gamma|\le T}\frac{1}{\rho} = \frac12\log q + O(\log\log q).$$

14. (Axer 1911) Assume RH.

(a) Show that if c = 1/4 + ε, then
$$\int_{c-iT}^{c+iT}\Bigl|\frac{\zeta(s)x^{s}}{\zeta(2s)s}\Bigr|\,|ds| \ll x^{1/4+\varepsilon}T^{1/4+\varepsilon}.$$
(b) Let Q(x) denote the number of square-free integers not exceeding x. Show that if RH is true, then
$$Q(x) = \frac{6}{\pi^2}x + O\bigl(x^{2/5+\varepsilon}\bigr).$$
(A better estimate is obtained in Exercise 16 below.)
15. Assume RH.
(a) Show that if c = 1/2 + ε, then
$$\int_{c-iT}^{c+iT}\Bigl|\frac{\zeta(s)x^{s}}{\zeta(2s)s(s+1)}\Bigr|\,|ds| \ll x^{1/4+\varepsilon}T^{\varepsilon}.$$
(b) Show that if RH is true, then
$$\sum_{n\le x}\mu(n)^2(1-n/x) = \frac{3}{\pi^2}x + O\bigl(x^{1/4+\varepsilon}\bigr).$$

16. (Montgomery & Vaughan 1981)
(a) Show that
\[
  Q(x) = \sum_{\substack{d,\,m \\ d^{2}m \le x}} \mu(d).
\]
Let Σ1 denote the sum of the above terms for which d ≤ y, and let
Σ2 denote the sum of the above terms for which d > y. Here y is a
parameter to be determined later, 1 ≤ y ≤ x^{1/2}.
(b) Put
\[
  S(x,y) = \sum_{d\le y} \mu(d)\,B_{1}\bigl(\{x/d^{2}\}\bigr)
\]
where B1(u) = u − 1/2 is the first Bernoulli polynomial. Show that
\[
  \Sigma_{1} = x \sum_{d\le y} \frac{\mu(d)}{d^{2}} - \frac{1}{2}M(y) - S(x,y).
\]
(c) Assume RH. Show that if σ ≥ 1/2 + 2ε, then
\[
  \sum_{d\le y} \frac{\mu(d)}{d^{s}} = \frac{1}{2\pi i} \int_{C_0} \frac{y^{w-s}}{\zeta(w)(w-s)}\,dw
\]
where C0 is a contour running from σ0 − i∞ to σ0 − iy to 1/2 + ε − iy
to 1/2 + ε + iy to σ0 + iy to σ0 + i∞ and σ0 = 1 + 1/log y. Deduce
that
\[
  \sum_{d\le y} \frac{\mu(d)}{d^{s}} = \frac{1}{\zeta(s)} + O\bigl(y^{1/2-\sigma+\varepsilon}\,\tau^{\varepsilon}\bigr).
\]
(d) Put f_y(s) = 1/ζ(s) − Σ_{d≤y} µ(d)/d^s. Show that
\[
  \Sigma_{2} = \frac{1}{2\pi i} \int_{\sigma_1 - i\infty}^{\sigma_1 + i\infty} \zeta(s)\,f_{y}(2s)\,\frac{x^{s}}{s}\,ds
\]
where σ1 = 1 + 1/log x.
(e) Show (unconditionally) that
\[
  \Sigma_{2} = f_{y}(2)\,x + \frac{1}{2\pi i} \int_{C_1} \zeta(s)\,f_{y}(2s)\,\frac{x^{s}}{s}\,ds
\]
where C1 is a contour running from σ1 − i∞ to σ1 − ix to 1/2 − ix to
1/2 + ix to σ1 + ix to σ1 + i∞.
(f) Assume RH. Show that Σ2 ≪ x^{1/2+ε} y^{−1/2}.
(g) Note that the estimate S(x, y) ≪ y is trivial.
(h) Show that if RH is true, then
\[
  Q(x) = \frac{6}{\pi^{2}}\,x + O\bigl(x^{1/3+\varepsilon}\bigr).
\]
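
As a numerical illustration of the decomposition in Exercise 16(a) with every d allowed (that is, y = x^{1/2}), the following sketch evaluates Q(x) as the sum of µ(d)⌊x/d²⌋ over d ≤ x^{1/2} and compares it with the main term 6x/π². The function names `mobius_sieve` and `squarefree_count` are illustrative choices, not notation from the text, and the code only checks the main term; it says nothing about the conditional error terms.

```python
import math

def mobius_sieve(n):
    """Return a list mu with mu[d] equal to the Moebius function of d, 0 <= d <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            # Each prime factor p flips the sign of mu on its multiples.
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # Multiples of p^2 are not square-free, so mu vanishes there.
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0
    return mu

def squarefree_count(x):
    """Q(x) computed as the sum over d <= sqrt(x) of mu(d) * floor(x / d^2)."""
    root = math.isqrt(x)
    mu = mobius_sieve(root)
    return sum(mu[d] * (x // (d * d)) for d in range(1, root + 1))

if __name__ == "__main__":
    for x in (10, 100, 10**4, 10**6):
        q = squarefree_count(x)
        print(x, q, q - 6 * x / math.pi ** 2)
```

The printed differences Q(x) − 6x/π² stay small compared with x^{1/2} in this range, which is consistent with (though of course no evidence for) the conditional estimates of Exercises 14 and 16.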

13.3 Notes
Section 13.1. Theorem 13.1 is due to von Koch (1901). Theorems 13.3 and
13.5 are due to Cramér (1921). The order of magnitude of the estimate in
Theorem 13.5 is optimal, in view of Theorem 13.6, which is from Cramér
(1922). Wintner (1941) showed (assuming RH) that the function f (u) deﬁned
in (13.17) has a limiting distribution. That is, there is a weakly monotonic
function F(x) with lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1, such that
\[
  \lim_{U\to\infty} \frac{1}{U}\,\operatorname{meas}\{u \in [0,U] : f(u) \le x\} = F(x)
\]
whenever x is a point of continuity of F. The result of Exercise 13.1.4 is
useful in this connection. If in addition to RH, the ordinates γ > 0 are linearly
independent over the field Q of rational numbers, then this distribution function
is the same as the distribution function of the random variable
\[
  X = 2 \sum_{\gamma > 0} \frac{\cos 2\pi X_{\gamma}}{|\rho|}
\]
where the X_γ are independent random variables, each one uniformly distributed
on [0, 1]. It can be shown (unconditionally) that the distribution function F_X of
X satisfies the inequalities
\[
  \exp\bigl(-c_{1}\sqrt{x}\,e^{\sqrt{2\pi x}}\bigr) < 1 - F_{X}(x) < \exp\bigl(-c_{2}\sqrt{x}\,e^{\sqrt{2\pi x}}\bigr) \tag{13.48}
\]
for x ≥ 2 where c1 and c2 are positive absolute constants.

Concerning the mean square distribution of primes in short intervals, Selberg
(1943) showed (assuming RH) that
\[
  \int_{0}^{X} \bigl(\psi((1+\delta)x) - \psi(x) - \delta x\bigr)^{2}\,\frac{dx}{x^{2}} \ll \delta(\log X)^{2}
\]
uniformly for 1/X ≤ δ ≤ 1/log X. Theorem 13.7 and Corollary 13.8 are due
to Titchmarsh (1930). Corollary 13.10 is due to Turán (1937). Theorem 13.11,
in the case of the Legendre symbol, is due to Ankeny (1952), who used deeper
estimates of Selberg (1946b) found in Exercise 13.1.11. Our simpler proof, and
the extension to general non-principal characters, is from Montgomery (1971,
p. 120). Theorem 13.12 is from Montgomery (1994, p. 164). See also Lagarias,
Montgomery & Odlyzko (1979).
Section 13.2. All results here from Theorem 13.13 through Theorem 13.21
are due to Littlewood (1922, 1924b, 1926, 1928), although our proofs are much
simpler than in the original ones. Indeed, referring to Theorem 13.21, Littlewood
commented that, ‘The proof of this theorem is long and difﬁcult, and depends on
a singularly varied set of ideas.’ Precursors to Theorem 13.21 were established
by Bohr, Landau & Littlewood (1913), Cramér (1918), and Landau (1920).
See Titchmarsh (1927) for an alternative proof. Our simpler approach is that
of Selberg (1944). Littlewood (1928) not only established Corollary 13.17, but
also showed (assuming RH) that
\[
  |\zeta(1+it)| \ge \frac{\pi^{2}}{12e^{C_0}\log\log\tau} + O\bigl((\log\log\tau)^{-2}\bigr).
\]
In the opposite direction, Titchmarsh (1928) showed (unconditionally) that
\[
  \limsup_{t\to+\infty} \frac{|\zeta(1+it)|}{\log\log t} \ge e^{C_0}.
\]
Also, Titchmarsh (1933) showed (unconditionally) that
\[
  \liminf_{t\to+\infty} |\zeta(1+it)|\,\log\log t \le \frac{\pi^{2}}{6e^{C_0}}.
\]
Here we see a factor of 2 between the two sets of bounds. The same factor of
2 arises when we consider what is known concerning large values of the zeta
function in the critical strip. Let α(σ) denote the least number such that
\[
  \zeta(\sigma+it) \ll \exp\bigl((\log\tau)^{\alpha(\sigma)+\varepsilon}\bigr)
\]
as t → ∞. From Corollary 13.16 we see that α(σ) ≤ 2 − 2σ, assuming RH.
In the opposite direction, Titchmarsh (1928) showed (unconditionally) that
α(σ) ≥ 1 − σ. More precisely, it is known that if 1/2 ≤ σ < 1, then there is a
c(σ) > 0 such that
\[
  |\zeta(\sigma+it)| = \Omega\Bigl(\exp\Bigl(\frac{c(\sigma)(\log\tau)^{1-\sigma}}{(\log\log\tau)^{\sigma}}\Bigr)\Bigr).
\]
For 1/2 < σ < 1 this is due to Montgomery (1977); the case σ = 1/2 is due
to Balasubramanian & Ramachandra (1977). Opinions as to where the truth
lies between these bounds vary widely among experts. For more on the value
distribution of the zeta function and L-functions, see Titchmarsh (1986), Joyner
(1986), and Laurinčikas (1996).
That the estimate M(x) ≪ x^{1/2+ε} is equivalent to RH was proved by
Littlewood (1912). Theorems 13.22 through 13.24 are due to Titchmarsh (1927).
Theorem 13.24 has been improved upon by Maier & Montgomery (2006), who
showed (assuming RH) that
\[
  M(x) \ll x^{1/2} \exp\bigl((\log x)^{39/61}\bigr).
\]

13.4 References
Ankeny, N. C. (1952). The least quadratic non residue, Ann. of Math. 55, 65–72.
Axer, A. (1911). Über einige Grenzwertsätze, S.-B. Wiss. Wien IIa 120, 1253–1298.
Balasubramanian, R. & Ramachandra, K. (1977). On the frequency of Titchmarsh’s
phenomenon for ζ (s), III, Proc. Indian Acad. Sci. Sect. A 86, 341–351.
Bohr, H., Landau, E., & Littlewood, J. E. (1913). Sur la fonction ζ (s) dans le voisi-
nage de la droite σ = 1/2, Acad. Roy. Belgique Bull. Cl. Sci., 1144–1175; Bohr’s
Collected Works, Vol. 1. København: Dansk Mat. Forening, 1952, B.2; Landau’s
Collected Works, Vol. 6. Essen: Thales Verlag, 1986, pp. 61–93; Littlewood’s Col-
lected Papers, Vol. 2. Oxford: Oxford University Press, 1982, pp. 797–828.
Cramér, H. (1918). Über die Nullstellen der Zetafunktion, Math. Z. 2, 237–241;
Collected Works, Vol. 1. Berlin: Springer-Verlag, 1994, 92–96.
(1921). Some theorems concerning prime numbers, Arkiv för Mat. Astr. Fys. 15, no. 5,
33 pp.; Collected Works, Vol. 1. Berlin: Springer-Verlag, 1994, pp. 138–170.
(1922). Ein Mittelwertsatz der Primzahltheorie, Math. Z. 12, 147–153; Collected
Works, Vol. 1. Berlin: Springer-Verlag, 1994, pp. 229–235.
Goldston, D. A. (1982). On a result of Littlewood concerning prime numbers, Acta Arith.
40, 263–271.
Joyner, D. (1986). Distribution Theorems of L-functions, Pitman Research Notes in
Math. 142. Harlow: Longman.

von Koch, H. (1901). Sur la distribution des nombres premiers, Acta Math. 24, 159–182.
Lagarias, J. C., Montgomery, H. L., & Odlyzko, A. M. (1979). A bound for the least
prime ideal in the Chebotarev density theorem, Invent. Math. 54, 271–296.
Landau, E. (1920). Über die Nullstellen der Zetafunktion, Math. Z. 6, 151–154;
Collected Works, Vol. 7. Essen: Thales Verlag, 1986, pp. 226–229.
Laurinčikas, A. (1996). Limit Theorems for the Riemann Zeta-function, Mathematics
and its Applications 352. Dordrecht: Kluwer.
Littlewood, J. E. (1912). Quelques conséquences de l’hypothèse que la fonction ζ(s) de
Riemann n’a pas de zéros dans le demi-plan ℜ(s) > 1/2, Comptes Rendus Acad. Sci.
Paris 154, 263–266; Collected Papers, Vol. 2. Oxford: Oxford University Press,
1982, pp. 793–796.
(1922). Researches in the theory of the Riemann ζ -function, Proc. London Math. Soc.
(2) 20, xxii–xxviii; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 844–850.
(1924a). Two notes on the Riemann Zeta-function, Proc. Cambridge Philos. Soc.
22, 234–242; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 851–859.
(1924b). On the zeros of the Riemann zeta-function, Proc. Cambridge Philos. Soc.
22, 295–318; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982, pp.
860–883.
(1926). On the Riemann zeta function, Proc. London Math. Soc. (2) 24, 175–201;
Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982, pp. 844–910.
(1928). Mathematical Notes (5): On the function 1/ζ (1 + ti), Proc. London Math.
Soc. (2) 27, 349–357; Collected Papers, Vol. 2, Oxford: Oxford University Press,
1982, pp. 911–919.
Maier, H. & Montgomery, H. L. (2006). On the sum of the Möbius function, to appear,
16 pp.
Montgomery, H. L. (1971). Topics in Multiplicative Number Theory, Lecture Notes in
Math. 227. Berlin: Springer-Verlag.
(1977). Extreme values of the Riemann zeta-function, Comment. Math. Helv. 52,
511–518.
(1994). Ten lectures on the interface between analytic number theory and harmonic
analysis, CBMS 84. Providence: Amer. Math. Soc.
Montgomery, H. L. & Vaughan, R. C. (1981). The distribution of square-free numbers,
Recent Progress in Analytic Number Theory (Durham, 1979), Vol. 1. London:
Academic Press, pp. 247–256.
Selberg, A. (1943). On the normal density of primes in small intervals, Arch. Math.
Natur-vid. 47, 87–105; Collected Papers, Vol. 1, New York: Springer Verlag, 1989,
pp. 160–178.
(1944). On the Remainder in the Formula for N (T ), the Number of Zeros of ζ (s) in
the Strip 0 < t < T . Avhandl. Norske Vid.-Akad. Oslo I. Mat.-Naturv. Kl., no. 1;
Collected Papers, Vol. 1, New York: Springer Verlag, 1989, pp. 179–203.
(1946a). Contributions to the Theory of the Riemann zeta-function, Arch. Math.
Naturvid. 48, 89–155; Collected Papers, Vol. 1, New York: Springer Verlag, 1989,
pp. 214–280.
(1946b). Contributions to the Theory of Dirichlet’s L-functions, Skrifter Norske Vid.-
Akad. Oslo I. Mat.-Naturvid. Kl., no. 3; Collected Papers, Vol. 1, New York:
Springer Verlag, 1989, pp. 281–340.
Titchmarsh, E. C. (1927). A consequence of the Riemann hypothesis, J. London Math.
Soc. 2, 247–254.
(1928). On an inequality satisﬁed by the zeta-function of Riemann, Proc. London
Math. Soc. (2) 28, 70–80.
(1930). A divisor problem, Rend. Circ. Mat. Palermo 54, 414–429.
(1933). On the function 1/ζ (1 + it), Quart. J. Math. Oxford 4, 64–70.
(1986). The Theory of the Riemann Zeta-function, Second edition. Oxford: Oxford
University Press.
Turán, P., (1937). Über die Primzahlen der Arithmetischen Progression, I, Acta Sci.
Szeged 8, 226–235; Collected Papers, Vol. 1. Budapest: Akadémiai Kiadó, 1990,
pp. 64–73.
Vorhauer, U. M. A. (2006). The Hadamard product formula for Dirichlet L-functions,
to appear.
Wintner, A. (1941). On the distribution function of the remainder term of the Prime
Number Theorem, Amer. J. Math. 63, 233–248.
14
Zeros

14.1 General distribution of the zeros

If T > 0 is not the ordinate of a zero of the zeta function, then we let N (T ) denote
the number of zeros ρ = β + iγ of ζ (s) in the rectangle 0 < β < 1, 0 < γ < T .
If T is the ordinate of a zero, then we set N (T ) = (N (T + ) + N (T − ))/2. By
the argument principle we obtain
Theorem 14.1 For any real t, put
\[
  S(t) = \frac{1}{\pi} \arg\zeta(1/2+it). \tag{14.1}
\]
If T > 0, then
\[
  N(T) = \frac{1}{\pi} \arg\Gamma(1/4 + iT/2) - \frac{T}{2\pi}\log\pi + S(T) + 1. \tag{14.2}
\]
Proof Since
\[
  N(T) = \tfrac{1}{2}\bigl(N(T^{+}) + N(T^{-})\bigr), \qquad S(T) = \tfrac{1}{2}\bigl(S(T^{+}) + S(T^{-})\bigr),
\]
it suffices to prove (14.2) when T is not the ordinate of a zero. Let C denote the
contour that proceeds by straight lines from 2 to 2 + iT to −1 + iT to −1 to
2. Then by the argument principle,
\[
  N(T) = \frac{1}{2\pi i} \int_{C} \frac{\xi'}{\xi}(s)\,ds.
\]
Now let C1 denote the contour that proceeds by line segments from 1/2
to 2 to 2 + iT to 1/2 + iT, and let C2 be the contour that proceeds from
1/2 + iT to −1 + iT to −1 to 1/2. Thus C = C1 + C2. For s ∈ C2 we use the
identity
\[
  \frac{\xi'}{\xi}(s) = -\frac{\xi'}{\xi}(1-s),
\]
and thus we see that
\[
  \int_{C_2} \frac{\xi'}{\xi}(s)\,ds = -\int_{C_2} \frac{\xi'}{\xi}(1-s)\,ds = \int_{C_3} \frac{\xi'}{\xi}(s)\,ds
\]
where C3 proceeds from 1/2 − iT to 2 − iT to 2 to 1/2. On adding this to the
integral over C1, we see that the contribution of the interval [1/2, 2] cancels,
and hence
\[
  N(T) = \frac{1}{2\pi i} \int_{C_4} \frac{\xi'}{\xi}(s)\,ds
\]
where C4 runs from 1/2 − iT to 2 − iT to 2 + iT to 1/2 + iT. By (10.25) we
see that the above is
\[
  = \frac{1}{2\pi i} \Bigl[\log s + \log(s-1) + \log\zeta(s) + \log\Gamma(s/2) - \frac{s}{2}\log\pi\Bigr]_{1/2-iT}^{1/2+iT}.
\]
By the Schwarz reflection principle, the real parts cancel and the imaginary
parts reinforce. Thus the above is
\[
  = \frac{1}{\pi} \Bigl(\arg(1/2+iT) + \arg(-1/2+iT) + \arg\zeta(1/2+iT)
      + \arg\Gamma(1/4+iT/2) - \frac{T}{2}\log\pi\Bigr).
\]
Here arg(1/2 + iT) + arg(−1/2 + iT) = π, so we have the stated result.

By Stirling’s formula (Theorem C.1) we know that
\[
  \log\Gamma(s) = (s - 1/2)\log s - s + \frac{1}{2}\log 2\pi + O(1/|s|). \tag{14.3}
\]
By using this, we obtain

Corollary 14.2 For T ≥ 2,
\[
  N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + \frac{7}{8} + S(T) + O(1/T).
\]
Proof Clearly
\[
  \Im\bigl((-1/4 + iT/2)\log(1/4 + iT/2) - (1/4 + iT/2)\bigr)
  = -\frac{1}{4}\arg\Bigl(\frac{1}{4} + \frac{iT}{2}\Bigr) + \frac{T}{4}\log\Bigl(\frac{1}{16} + \frac{T^{2}}{4}\Bigr) - \frac{T}{2}.
\]
But arg(1/4 + iT/2) = π/2 + O(1/T), and log(1/16 + T²/4) = 2 log(T/2) +
O(1/T²), so we obtain the stated result.
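
To get a feeling for how good the smooth part of Corollary 14.2 is, the sketch below compares (T/2π) log(T/2π) − T/2π + 7/8 with the actual zero count for a few small T. The list of the first ten ordinates is a standard tabulation rounded to six decimals and is supplied here as an assumption of the sketch, not taken from the text; the names `main_term` and `count_zeros` are illustrative. The difference between the two quantities is S(T) + O(1/T).

```python
import math

# First ten ordinates gamma of zeros 1/2 + i*gamma of zeta(s), rounded to six
# decimals (standard tabulated values; an assumption of this sketch).
FIRST_ORDINATES = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
                   37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def main_term(T):
    """Smooth part of N(T) in Corollary 14.2: x*log(x) - x + 7/8 with x = T/(2*pi)."""
    x = T / (2 * math.pi)
    return x * math.log(x) - x + 7 / 8

def count_zeros(T):
    """N(T) for T <= 50 (not an ordinate), read off from the tabulated ordinates."""
    return sum(1 for g in FIRST_ORDINATES if g < T)

if __name__ == "__main__":
    for T in (15, 22, 26, 31, 45, 50):
        print(T, count_zeros(T), round(main_term(T), 3))
```

In this range the two columns differ by less than 1, in line with the slow growth of S(T).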

By combining the above with Lemma 12.3 or Theorem 13.20, we obtain

Corollary 14.3 For T ≥ 4,
\[
  N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T).
\]
Corollary 14.4 If the Riemann Hypothesis is true, then
\[
  N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + O\Bigl(\frac{\log T}{\log\log T}\Bigr).
\]
Note that these estimates imply the estimates of Theorem 10.13 and
Lemma 13.18, respectively. In addition, from the first estimate above we see
that there is an absolute constant C > 0 such that
\[
  N(T+h) - N(T) \asymp h\log T \tag{14.4}
\]
uniformly for C ≤ h ≤ T . Similarly, there is an absolute constant C > 0 such
that if RH is true, then (14.4) holds for C/ log log T ≤ h ≤ T , T ≥ 4. By mod-
ifying our method we obtain corresponding estimates for the number of zeros
of a Dirichlet L-function.
Theorem 14.5 Let χ be a primitive character modulo q, with q > 1. For
T > 0, let N (T, χ) denote the number of zeros ρ = β + iγ of L(s, χ ) with
0 < β < 1 and 0 ≤ γ ≤ T . Any zeros with γ = 0 or γ = T should be counted
with weight 1/2. Also, for any real number T, put
\[
  S(T,\chi) = \frac{1}{\pi} \arg L(1/2 + iT, \chi). \tag{14.5}
\]
Then
\[
  N(T,\chi) = \frac{1}{\pi} \arg\Gamma(1/4 + \kappa/2 + iT/2) + \frac{T}{2\pi}\log\frac{q}{\pi} + S(T,\chi) - S(0,\chi)
\]
where κ = 0 or 1 according as χ(−1) = 1 or −1.
There is no need to establish a separate result pertaining to zeros with γ < 0,
since the number of zeros of L(s, χ) with −T ≤ γ ≤ 0 is N(T, χ̄).
Proof We may assume that T is not the ordinate of a zero, for if it were, then
we have only to replace T by T ± , and average. However, we must take some
precautions against the possibility that L(s, χ) has a zero on the real axis in
the interval (0, 1). Let C ± be the contour from 2 ± iε to 2 + i T to −1 + i T to
−1 ± iε to 2 ± iε, let C1± be the contour from 1/2 ± iε to 2 ± iε to 2 + i T to
1/2 + i T , let C2± be the path from 1/2 + i T to −1 + i T to −1 ± iε to 1/2 ± iε,
and let C3± be the path from 1/2 − i T to 2 − i T to 2 ∓ iε to 1/2 ∓ iε. By the
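argument principle, the number of zeros with 0 < γ ≤ T is
\[
  \frac{1}{2\pi i} \int_{C^{+}} \frac{\xi'}{\xi}(s,\chi)\,ds
  = \frac{1}{2\pi i} \int_{C_1^{+}} \frac{\xi'}{\xi}(s,\chi)\,ds + \frac{1}{2\pi i} \int_{C_2^{+}} \frac{\xi'}{\xi}(s,\chi)\,ds.
\]
For s ∈ C2+ we write
\[
  \frac{\xi'}{\xi}(s,\chi) = -\frac{\xi'}{\xi}(1-s,\overline{\chi}),
\]
and thus we find that
\[
  \int_{C_2^{+}} \frac{\xi'}{\xi}(s,\chi)\,ds = -\int_{C_2^{+}} \frac{\xi'}{\xi}(1-s,\overline{\chi})\,ds = \int_{C_3^{+}} \frac{\xi'}{\xi}(s,\overline{\chi})\,ds.
\]
By (10.33), it follows that
\[
\begin{aligned}
  \int_{C_1^{+}} \frac{\xi'}{\xi}(s,\chi)\,ds
  &= \Bigl[\log L(s,\chi) + \log\Gamma((s+\kappa)/2) + \frac{s}{2}\log\frac{q}{\pi}\Bigr]_{1/2+i\varepsilon}^{1/2+iT} \\
  &= \log L(1/2+iT,\chi) - \log L(1/2+i\varepsilon,\chi) \\
  &\quad + \log\Gamma(1/4+\kappa/2+iT/2) - \log\Gamma(1/4+\kappa/2+i\varepsilon/2) \\
  &\quad + i\,\frac{T-\varepsilon}{2}\log\frac{q}{\pi},
\end{aligned}
\]
and that
\[
\begin{aligned}
  \int_{C_3^{+}} \frac{\xi'}{\xi}(s,\overline{\chi})\,ds
  &= \Bigl[\log L(s,\overline{\chi}) + \log\Gamma((s+\kappa)/2) + \frac{s}{2}\log\frac{q}{\pi}\Bigr]_{1/2-iT}^{1/2-i\varepsilon} \\
  &= \log L(1/2-i\varepsilon,\overline{\chi}) - \log L(1/2-iT,\overline{\chi}) \\
  &\quad + \log\Gamma(1/4+\kappa/2-i\varepsilon/2) - \log\Gamma(1/4+\kappa/2-iT/2) \\
  &\quad + i\,\frac{T-\varepsilon}{2}\log\frac{q}{\pi}.
\end{aligned}
\]
When these quantities are added, the real parts cancel and the imaginary parts
are doubled, so after dividing by 2πi we find that the number of zeros with
0 < γ ≤ T is
\[
  \frac{1}{\pi} \arg\Gamma(1/4+\kappa/2+iT/2) + S(T,\chi) - S(0^{+},\chi) + \frac{T}{2\pi}\log\frac{q}{\pi}.
\]
By proceeding similarly with the opposite sign, we find that the number of zeros
with 0 ≤ γ ≤ T is
\[
  \frac{1}{\pi} \arg\Gamma(1/4+\kappa/2+iT/2) + S(T,\chi) - S(0^{-},\chi) + \frac{T}{2\pi}\log\frac{q}{\pi}.
\]
We form the average of these two identities to obtain the stated result.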

Corollary 14.6 Let χ be a primitive character modulo q, with q > 1. Then
for T > 0,
\[
  N(T,\chi) = \frac{T}{2\pi}\log\frac{qT}{2\pi} - \frac{T}{2\pi} + S(T,\chi) - S(0,\chi) - \frac{\chi(-1)}{8} + O\Bigl(\frac{1}{T+1}\Bigr).
\]
Proof If 0 < T ≤ 2, then arg Γ(1/4 + κ/2 + iT/2) ≪ 1 and T log T/2 −
T ≪ 1, so the estimate is immediate in this case. Suppose that T ≥ 2.
Clearly
\[
\begin{aligned}
  &\Im\bigl((-1/4+\kappa/2+iT/2)\log(1/4+\kappa/2+iT/2) - (1/4+\kappa/2+iT/2)\bigr) \\
  &\qquad = (-1/4+\kappa/2)\arg(1/4+\kappa/2+iT/2) + \frac{T}{4}\log\bigl((1/4+\kappa/2)^{2} + T^{2}/4\bigr) - \frac{T}{2}.
\end{aligned}
\]
Here arg(1/4 + κ/2 + iT/2) = π/2 + O(1/T), log((1/4 + κ/2)² + T²/4) =
2 log(T/2) + O(1/T²), and 2κ − 1 = −χ(−1), so the result follows by
Stirling’s formula in the form (14.3).

By combining the above with Lemma 12.8 we obtain

Corollary 14.7 Let χ be a primitive character modulo q, q > 1. Then for
T ≥ 4,
\[
  N(T,\chi) = \frac{T}{2\pi}\log\frac{qT}{2\pi} - \frac{T}{2\pi} + O(\log qT).
\]

14.1.1 Exercise
1. Let χ be a primitive character modulo q with q > 1. Show that if L(s, χ) ≠ 0
for σ > 1/2, then
\[
  N(T,\chi) = \frac{T}{2\pi}\log\frac{qT}{2\pi} - \frac{T}{2\pi} + O\Bigl(\frac{\log qT}{\log\log qT}\Bigr)
\]
for T ≥ 2.

14.2 Zeros on the critical line

At present we are unable to prove the Riemann Hypothesis, which asserts that all
non-trivial zeros of the zeta function lie on the critical line σ = 1/2. However,
we are able to show that inﬁnitely many zeros lie on this line.

Theorem 14.8 (Hardy) There exist inﬁnitely many real numbers γ such that
ζ (1/2 + iγ ) = 0.

For real t, let
\[
  Z(t) = \zeta(1/2+it)\,\frac{\Gamma(1/4+it/2)\,\pi^{-1/4-it/2}}{\bigl|\Gamma(1/4+it/2)\,\pi^{-1/4-it/2}\bigr|}. \tag{14.6}
\]
Thus, as depicted in Figure 14.1, Z(t) is real-valued, |Z(t)| = |ζ(1/2 + it)|,
and Z(t) changes sign at γ if and only if ζ(s) has a zero at 1/2 + iγ of odd
multiplicity.

[Figure 14.1 Graph of Z(t) for 0 ≤ t ≤ 100.]

If T > 0 is a real number such that
\[
  \Bigl|\int_{T}^{2T} Z(t)\,dt\Bigr| < \int_{T}^{2T} |Z(t)|\,dt, \tag{14.7}
\]
then Z(t) is not of constant sign in the interval (T, 2T), which is to say that ζ(s)
has at least one zero 1/2 + iγ of odd multiplicity, with T < γ < 2T. Although
it is possible to show that (14.7) holds for all large T, the requisite arguments
involve technical tools that we have not yet developed. Fortunately, there is a
family of weights W(t) such that the integral ∫ W(t)Z(t) dt can be evaluated
by interpreting it as an inverse Mellin transform with a familiar kernel. Thus we
are able to establish a weighted variant of (14.7), which suffices for our purpose.
In preparation for the main argument, we establish two preliminary results.
Lemma 14.9 If ℜz > 0 and σ0 > 1, then
\[
  \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \zeta(s)\,\Gamma(s/2)\,(\pi z)^{-s/2}\,ds = 2 \sum_{n=1}^{\infty} e^{-\pi n^{2} z}.
\]
This is the inverse of the Mellin transform relationship (10.7) that Riemann
used to establish the functional equation.
Proof By Theorem C.4 we see that if ℜw > 0 and σ0 > 0, then
\[
  \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \Gamma(s/2)\,w^{-s/2}\,ds = 2e^{-w}.
\]
We take w = πn²z, and sum over n, to obtain the desired identity. Here the
exchange of summation and integration is permissible since the Dirichlet series
for ζ(s) is uniformly convergent on the abscissa σ0, and since
\[
  \int_{-\infty}^{\infty} \bigl|\Gamma((\sigma_0 + it)/2)\,(\pi z)^{-s/2}\bigr|\,dt < \infty.
\]


Lemma 14.10 We have
\[
  \int_{1}^{T} \zeta(1/2+it)\,dt = T + O\bigl(T^{1/2}\bigr)
\]
uniformly for T ≥ 2.
Proof Let C denote the rectangular contour with vertices 1/2 + i, 2 + i,
2 + iT, 1/2 + iT. Since ζ(s) is analytic in this rectangle, we have
\[
  \int_{C} \zeta(s)\,ds = 0
\]
by Cauchy’s theorem. The integral from 1/2 + i to 2 + i is an absolute constant,
and by Corollary 1.17 the integral from 1/2 + iT to 2 + iT is
\[
  \ll \int_{1/2}^{2} \bigl(1 + T^{1-\sigma}\bigr)(\log T)\,d\sigma \ll T^{1/2}.
\]
Thus
\[
  \int_{1}^{T} \zeta(1/2+it)\,dt = \int_{1}^{T} \zeta(2+it)\,dt + O\bigl(T^{1/2}\bigr).
\]
This latter integral is
\[
  = \sum_{n=1}^{\infty} n^{-2} \int_{1}^{T} n^{-it}\,dt = T - 1 + \sum_{n=2}^{\infty} \frac{n^{-i} - n^{-iT}}{i\,n^{2}\log n} = T + O(1),
\]
so we have the stated result.
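
The last step of the proof, termwise integration of the Dirichlet series on σ = 2, can be checked numerically. The sketch below truncates the series for ζ(2 + it) at N terms (the cutoff N and the Simpson step count are assumptions made for illustration), evaluates the closed form T − 1 + Σ (n^{−i} − n^{−iT})/(i n² log n) for the same truncation, and compares it with a direct Simpson-rule quadrature. The function names are illustrative.

```python
import cmath
import math

def zeta2_truncated(t, N=100):
    """Partial sum of the Dirichlet series for zeta(2 + it), using n <= N."""
    return sum(n ** (-2 - 1j * t) for n in range(1, N + 1))

def integral_closed_form(T, N=100):
    """Termwise integral over 1 <= t <= T of the truncated series for zeta(2 + it)."""
    total = complex(T - 1)  # the n = 1 term contributes T - 1
    for n in range(2, N + 1):
        log_n = math.log(n)
        total += (cmath.exp(-1j * log_n) - cmath.exp(-1j * T * log_n)) / (1j * n * n * log_n)
    return total

def integral_simpson(T, N=100, steps=2000):
    """Composite Simpson rule for the same integral (steps must be even)."""
    h = (T - 1) / steps
    acc = zeta2_truncated(1.0, N) + zeta2_truncated(T, N)
    for k in range(1, steps):
        acc += (4 if k % 2 else 2) * zeta2_truncated(1 + k * h, N)
    return acc * h / 3

if __name__ == "__main__":
    print(integral_closed_form(10.0), integral_simpson(10.0))
    for T in (10.0, 100.0, 1000.0):
        print(T, abs(integral_closed_form(T) - T))
```

The two evaluations agree to within quadrature error, and the closed form stays within a bounded distance of T, which is the T + O(1) used in the proof.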

Proof of Theorem 14.8 The integrand in Lemma 14.9 has a pole at s = 1
with residue z^{−1/2}, but is otherwise analytic for σ > 0. We move the path of
integration to the line σ = 1/2, and multiply both sides by z^{1/4} to see that
\[
  \frac{1}{2\pi} \int_{-\infty}^{\infty} \zeta(1/2+it)\,\Gamma(1/4+it/2)\,\pi^{-1/4-it/2}\,z^{-it/2}\,dt
  = -z^{-1/4} + 2z^{1/4} \sum_{n=1}^{\infty} e^{-\pi n^{2} z}. \tag{14.8}
\]
Here the left-hand side is of the form ∫_{−∞}^{∞} W(t)Z(t) dt with
\[
  W(t) = \frac{|\Gamma(1/4+it/2)|}{2\pi^{5/4}\,z^{it/2}}.
\]
Write z in polar coordinates, z = re^{iθ}. Then z^{−it/2} = r^{−it/2} e^{θt/2}. For our
approach to work, W(t) must have constant argument. Accordingly, we take r = 1,
and set θ = π/2 − δ where δ is small and positive. By (C.19) we see that
\[
  |\Gamma(s/2)| \asymp \tau^{(\sigma-1)/2}\,e^{-\pi\tau/4}.
\]
Hence
\[
  W(t) \asymp \tau^{-1/4}\,e^{\pi(t-\tau)/4}\,e^{-\delta t/2} \asymp
  \begin{cases}
    \tau^{-1/4}\,e^{-\delta\tau/2} & \text{if } t \ge 0, \\
    \tau^{-1/4}\,e^{-(\pi-\delta)\tau/2} & \text{if } t \le 0.
  \end{cases}
\]
Thus W(t) tends to 0 very rapidly as t → −∞, but relatively slowly as t →
+∞. In particular,
\[
  W(t) \asymp \tau^{-1/4}
\]
uniformly for 0 ≤ t ≤ 1/δ.
By the above and Lemma 14.10 we see that
\[
  \int_{-\infty}^{\infty} W(t)\,|Z(t)|\,dt \gg \delta^{1/4} \int_{1/(2\delta)}^{1/\delta} |Z(t)|\,dt
  = \delta^{1/4} \int_{1/(2\delta)}^{1/\delta} |\zeta(1/2+it)|\,dt \gg \delta^{-3/4}.
\]
In order to exhibit a disparity, we must show that the right-hand side
of (14.8) is o(δ^{−3/4}). To this end it suffices to argue fairly crudely. Since
z = ie^{−iδ} = sin δ + i cos δ, by the triangle inequality the right-hand side of
(14.8) is
\[
  \ll \sum_{n=1}^{\infty} e^{-\pi n^{2}\sin\delta}.
\]
By the integral test this is
\[
  \le \int_{0}^{\infty} e^{-\pi u^{2}\sin\delta}\,du = (\sin\delta)^{-1/2} \int_{0}^{\infty} e^{-\pi v^{2}}\,dv \ll \delta^{-1/2}.
\]
If ζ(s) had only finitely many zeros on the critical line, then we would have
\[
  \Bigl|\int_{-\infty}^{\infty} W(t)Z(t)\,dt\Bigr| = \int_{-\infty}^{\infty} W(t)\,|Z(t)|\,dt + O(1)
\]
uniformly as δ → 0+. On the contrary, we have shown that
\[
  \int_{-\infty}^{\infty} W(t)Z(t)\,dt \ll \delta^{-1/2}, \qquad \int_{-\infty}^{\infty} W(t)\,|Z(t)|\,dt \gg \delta^{-3/4},
\]
so the theorem is proved.

14.2.1 Exercise
1. (a) Show that the right-hand side of (14.8) is
\[
  = -z^{-1/4} - z^{1/4} + z^{1/4}\,\vartheta(z),
\]
in the notation of (10.8).
(b) Show that if z = ie^{−iδ} = sin δ + i cos δ, then
\[
  \vartheta(z) = \sum_{n=-\infty}^{\infty} (-1)^{n}\,\bigl(1 + O(n^{2}\delta^{2})\bigr)\,e^{-\pi n^{2}\sin\delta}.
\]
(c) Show that
\[
  \sum_{n=-\infty}^{\infty} n^{2}\,e^{-\pi n^{2}\sin\delta} \ll \delta^{-3/2}
\]
for 0 < δ ≤ 1.
(d) By taking α = 1/2 in Theorem 10.1, or otherwise, show that
\[
  \sum_{n=-\infty}^{\infty} (-1)^{n}\,e^{-\pi n^{2} x} \ll x^{-1/2}\,e^{-\pi/(4x)}
\]
uniformly for 0 < x ≤ 1.
(e) Show that if z is taken as in (b), then ϑ(z) ≪ δ^{1/2}.
(f) Conclude that the right-hand side of (14.8) is = −2 cos π/8 + O(δ^{1/2}).
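
The conclusion of part (f) is easy to observe numerically. The following sketch evaluates the right-hand side of (14.8) at z = ie^{−iδ} with the principal branch of z^{1/4} and a truncated sum; the truncation point `n_max` is an assumption chosen so that the omitted terms are far below floating-point precision, and the function name `rhs_14_8` is illustrative. This is only a sanity check of the limiting value, not a substitute for the exercise.

```python
import cmath
import math

def rhs_14_8(delta):
    """Right-hand side of (14.8) at z = i*exp(-i*delta), principal branches."""
    z = 1j * cmath.exp(-1j * delta)
    # Terms decay like exp(-pi * n^2 * sin(delta)); beyond n_max they are negligible.
    n_max = int(8 / math.sqrt(math.sin(delta))) + 10
    tail = sum(cmath.exp(-math.pi * n * n * z) for n in range(1, n_max + 1))
    return -z ** (-0.25) + 2 * z ** 0.25 * tail

if __name__ == "__main__":
    target = -2 * math.cos(math.pi / 8)
    for delta in (0.1, 0.02, 0.001):
        print(delta, rhs_14_8(delta), abs(rhs_14_8(delta) - target))
```

The values remain bounded and approach −2 cos π/8 as δ → 0+, so the right-hand side of (14.8) is indeed far smaller than the δ^{−3/4} lower bound for the weighted integral of |Z(t)|.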

14.3 Notes
Section 14.1. Theorem 14.1 and Corollary 14.2 are due to Backlund (1914,
1918), and this gave a shorter proof of Corollary 14.3 which had been ob-
tained by von Mangoldt (1905). Earlier von Mangoldt (1895) had the error
term O((log T )2 ). Riemann (1859) proposed Corollary 14.3 but with no indica-
tion of a proof. It is remarkable that Corollary 14.3 is perhaps the only theorem
on the Riemann zeta function that has not seen some signiﬁcant improvement
in the last 100 years.
Although the maximum order of S(t) is unclear, even assuming the Riemann
Hypothesis, we have considerable (unconditional) knowledge of its moments
and distribution. Selberg (1944) showed that if k is a fixed non-negative even
integer, then
\[
  \int_{0}^{T} S(t)^{k}\,dt = \frac{k!}{(k/2)!\,(2\pi)^{k}}\,T(\log\log T)^{k/2} + O\bigl(T(\log\log T)^{k/2-1}\bigr).
\]
Although Selberg did not mention it, his techniques can also be used to show
that
\[
  \int_{0}^{T} S(t)^{k}\,dt = o\bigl(T(\log\log T)^{k/2}\bigr)
\]

when k is odd. From these estimates it follows that the distribution of S(t) is
asymptotically normal, in the sense that
\[
  \lim_{T\to\infty} \frac{1}{T}\,\operatorname{meas}\bigl\{t \in [0,T] : \sqrt{2}\,\pi S(t) \le c\sqrt{\log\log T}\bigr\}
  = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{c} e^{-t^{2}/2}\,dt
\]

for any given real number c. Similar results apply to the distribution of the real
part of log ζ(1/2 + it), and indeed Selberg (unpublished) showed that the real
and imaginary parts can be treated simultaneously. Specifically,
\[
  \int_{0}^{T} \bigl(\log\zeta(1/2+it)\bigr)^{h}\,\bigl(\log\zeta(1/2-it)\bigr)^{k}\,dt
  = \delta_{h,k}\,k!\,T(\log\log T)^{k} + O_{h,k}\bigl(T(\log\log T)^{(h+k-1)/2}\bigr)
\]
where
\[
  \delta_{h,k} =
  \begin{cases}
    1 & \text{if } h = k, \\
    0 & \text{otherwise.}
  \end{cases}
\]
From this it follows that log ζ(1/2 + it) is asymptotically normally distributed
in the complex plane, in the sense that if Ω is a set in the complex plane with
Jordan content, then
\[
  \lim_{T\to\infty} \frac{1}{T}\,\operatorname{meas}\Bigl\{t \in [4,T] : \frac{\log\zeta(1/2+it)}{\sqrt{\log\log t}} \in \Omega\Bigr\}
  = \frac{1}{\pi} \iint_{\Omega} e^{-|z|^{2}}\,dx\,dy.
\]

Section 14.2. Theorem 14.8 was announced and a proof sketched in Hardy
(1914). Further details are given in Hardy & Littlewood (1917). Let N0 (T )
denote the number of zeros of the form 1/2 + iγ with 0 < γ ≤ T. Hardy
& Littlewood (1921) showed that N0(T) ≫ T. Later Selberg improved this,
first (1942a) to N0(T) ≫ T log log T and then (1942b) to N0(T) ≫ T log T,
so that a positive proportion of the zeros are on the 1/2-line. Levinson (1974)
introduced an alternative method that enabled him to show that at least one-
third of the non-trivial zeros are on the 1/2-line. Selberg’s method detects only
zeros of odd multiplicity. This should not be a handicap, since presumably all
zeros are simple. Heath-Brown (1979) has observed that Levinson’s method
detects only simple zeros. Conrey (1989) used Levinson’s method to show that
N0(T) > (2/5) N(T).
The proof we have given of Hardy’s Theorem 14.8 is but one of several
described by Titchmarsh (1986, Chapter 10).

14.4 References
Backlund, R. J. (1914). Sur les zéros de la fonction ζ (s) de Riemann, C. R. Acad. Sci.
Paris 158, 1979–1981.
(1918). Über die Nullstellen der Riemannschen Zetafunktion, Acta Math. 41, 345–
375.
Conrey, J. B. (1989). More than two ﬁfths of the zeros of the Riemann zeta function are
on the critical line, J. Reine Angew. Math. 399, 1–26.
Hardy, G. H. (1914). Sur les zéros de la fonction ζ (s) de Riemann, C. R. Acad. Sci. Paris
158, 1012–1014; Collected Papers, Vol. 2, Oxford: Oxford University Press, 1967,
pp. 6–8.
Hardy, G. H. & Littlewood, J. E. (1917). Contributions to the theory of the Riemann
Zeta-function and the theory of the distribution of primes, Acta Math. 41, 119–196;
Collected Papers, Vol. 2, Oxford: Oxford University Press, 1967, pp. 20–97.
(1921). The zeros of Riemann’s zeta-function on the critical line, Math. Z. 10, 283–
317; Collected Papers, Vol. 2, Oxford: Oxford University Press, 1967, pp. 115–149.
Heath–Brown, D. R. (1979). Simple zeros of the Zeta-function on the critical line, Bull.
London Math. Soc. 11, 17–18.
Levinson, N. (1974). More than one third of zeros of Riemann’s zeta-function are on
σ = 1/2, Adv. Math. 13, 383–436.
von Mangoldt, H. (1895). Zu Riemann’s Abhandlung “Ueber die Anzahl der Primzahlen
unter einer gegebenen Grösse”, J. Reine Angew. Math. 114, 255–305.
(1905). Zur Verteilung der Nullstellen der Riemannschen Funktion ξ (t), Math. Ann.
60, 1–19.
Riemann, B. (1859). Ueber die Anzahl der Primzahlen unter eine gegebenen Grösse,
Monatsber. Kgl. Preuss. Akad. Wiss. Berlin, 671–680; Werke, Leipzig: Teubner,
1876, pp. 3–47. Reprint: New York: Dover, 1953.
Selberg, A. (1942a). On the zeros of Riemann’s zeta-function on the critical line, Arch.
Math. Naturvid. 45, 101–114; Collected Papers, Vol. 1, New York: Springer Verlag,
1989, pp. 142–155.
(1942b). On the Zeros of Riemann’s Zeta-function, Skr. Norske Vid. Akad. Oslo I.,
no. 10; Collected Papers, Vol. 1, New York: Springer Verlag, 1989, pp. 156–159.
(1944). On the Remainder in the Formula for N (T ), the Number of Zeros of ζ (s) in
the Strip 0 < t < T , Avh. Norske Vid. Akad. Oslo. I, no. 1; Collected Papers, Vol.
1, New York: Springer Verlag, 1989, pp. 179–203.
Titchmarsh, E. C. (1986). The Theory of the Riemann Zeta-function, Second edition.
New York: Oxford University Press.
15
Oscillations of error terms

15.1 Applications of Landau’s theorem

In this section we make repeated use of the following simple analogue of Lan-
dau’s theorem (Theorem 1.7) concerning Dirichlet series with non-negative
coefﬁcients.

Lemma 15.1 Suppose that A(x) is a bounded Riemann-integrable function
in any finite interval 1 ≤ x ≤ X, and that A(x) ≥ 0 for all x > X0. Let σc
denote the infimum of those σ for which ∫_{X0}^{∞} A(x)x^{−σ} dx < ∞. Then the
function
\[
  F(s) = \int_{1}^{\infty} A(x)\,x^{-s}\,dx
\]
is analytic in the half-plane σ > σc, but not at the point s = σc.

Proof Write
\[
  F(s) = \int_{1}^{X_0} A(x)\,x^{-s}\,dx + \int_{X_0}^{\infty} A(x)\,x^{-s}\,dx = F_{1}(s) + F_{2}(s),
\]
say. Then the function F1(s) is entire, and the proof of Theorem 1.7 can be
adapted to F2(s) to give the stated result.

In Exercise 13.1.1 we saw that if Θ denotes the supremum of the real parts of
the zeros of the zeta function, then ψ(x) = x + O(x^Θ (log x)²). Conversely, if
ψ(x) = x + O(x^{α+ε}), then by Theorem 1.3 the Dirichlet series Σ_{n=1}^{∞} (Λ(n) −
1)n^{−s} converges for σ > α, and hence ζ(s) ≠ 0 in this half-plane. That is,
ψ(x) − x = Ω(x^{Θ−ε}). We now sharpen this, by showing that ψ(x) − x must
be large in both signs.


Theorem 15.2 Let Θ denote the supremum of the real parts of the zeros of
the zeta function. Then for every ε > 0,
\[
  \psi(x) - x = \Omega_{\pm}\bigl(x^{\Theta-\varepsilon}\bigr) \tag{15.1}
\]
and
\[
  \pi(x) - \operatorname{li}(x) = \Omega_{\pm}\bigl(x^{\Theta-\varepsilon}\bigr) \tag{15.2}
\]
as x → ∞.

Proof By Theorem 1.3 we have
\[
  -\frac{\zeta'}{\zeta}(s) = s \int_{1}^{\infty} \psi(x)\,x^{-s-1}\,dx
\]
for σ > 1. Hence
\[
  -\frac{\zeta'(s)}{s\zeta(s)} - \frac{1}{s-1} = \int_{1}^{\infty} (\psi(x) - x)\,x^{-s-1}\,dx
\]
for σ > 1. Suppose that
\[
  \psi(x) - x < x^{\Theta-\varepsilon} \quad\text{for all } x > X_0(\varepsilon). \tag{15.3}
\]
Then we apply Lemma 15.1 to the function
\[
  \frac{1}{s-\Theta+\varepsilon} + \frac{\zeta'(s)}{s\zeta(s)} + \frac{1}{s-1} = \int_{1}^{\infty} \bigl(x^{\Theta-\varepsilon} - \psi(x) + x\bigr)\,x^{-s-1}\,dx.
\]
Here the left-hand side has a pole at Θ − ε, but is analytic for real s > Θ − ε,
in view of Corollary 1.14. Hence the above identity holds for σ > Θ − ε,
and both sides are analytic in this half-plane. But by the definition of Θ,
the function ζ′/ζ has poles with real part > Θ − ε. From this contradiction
we deduce that the assertion (15.3) is false. That is, ψ(x) − x = Ω+(x^{Θ−ε}).
To obtain the corresponding Ω− estimate we argue similarly using the
identity
\[
  \frac{1}{s-\Theta+\varepsilon} - \frac{\zeta'(s)}{s\zeta(s)} - \frac{1}{s-1} = \int_{1}^{\infty} \bigl(x^{\Theta-\varepsilon} + \psi(x) - x\bigr)\,x^{-s-1}\,dx.
\]

In contrast to the situation of Corollary 2.5 or Theorem 13.2, it does not seem
possible to derive (15.2) from (15.1) by integrating by parts. Instead, we pursue
an argument modelled on the one just given. First we examine the Mellin
transform of li(x). By integrating by parts we see that
\[
  s \int_{2}^{\infty} \operatorname{li}(x)\,x^{-s-1}\,dx = \int_{2}^{\infty} \frac{dx}{x^{s}\log x} = \int_{(s-1)\log 2}^{\infty} e^{-u}\,\frac{du}{u}.
\]
Clearly this is
\[
  = \int_{1}^{\infty} e^{-u}\,\frac{du}{u} + \int_{(s-1)\log 2}^{1} \frac{e^{-u} - 1}{u}\,du - \log(s-1) - \log\log 2.
\]
By (7.31) we see that this is
\[
  = -\int_{0}^{(s-1)\log 2} \frac{e^{-u} - 1}{u}\,du - C_0 - \log(s-1) - \log\log 2.
\]
Thus we find that
\[
  s \int_{2}^{\infty} \operatorname{li}(x)\,x^{-s-1}\,dx = -\log(s-1) + r(s)
\]
where r(s) is an entire function. Put
\[
  \Pi(x) = \sum_{n\le x} \frac{\Lambda(n)}{\log n}.
\]
By Theorem 1.3 we know that
\[
  s \int_{2}^{\infty} \Pi(x)\,x^{-s-1}\,dx = \log\zeta(s)
\]
for σ > 1. Hence
\[
  \frac{1}{s-\Theta+\varepsilon} - \frac{1}{s}\log\bigl(\zeta(s)(s-1)\bigr) + \frac{r(s)}{s}
  = \int_{2}^{\infty} \bigl(x^{\Theta-\varepsilon} - \Pi(x) + \operatorname{li}(x)\bigr)\,x^{-s-1}\,dx
\]
for σ > 1. We observe that this function is analytic on the real axis for s > Θ −
ε. Thus by Lemma 15.1, if Π(x) − li(x) < x^{Θ−ε} for all sufficiently large x, then the
identity above holds in the half-plane σ > Θ − ε. However, we are assuming
that the zeta function has a zero ρ = β + iγ with β > Θ − ε, and the left-hand
side above has a logarithmic singularity at s = ρ. Thus we have a contradiction,
and so Π(x) − li(x) = Ω+(x^{Θ−ε}). Since π(x) = Π(x) + O(x^{1/2}/log x), and
since Θ ≥ 1/2, it follows that π(x) − li(x) = Ω+(x^{Θ−ε}). For the corresponding
Ω− estimate, we argue similarly from the identity
\[
  \frac{1}{s-\Theta+\varepsilon} + \frac{1}{s}\log\bigl(\zeta(s)(s-1)\bigr) - \frac{r(s)}{s}
  = \int_{2}^{\infty} \bigl(x^{\Theta-\varepsilon} + \Pi(x) - \operatorname{li}(x)\bigr)\,x^{-s-1}\,dx.
\]
Next we show that if there is a zero of ζ(s) on the line σ = Θ, then we may
draw a stronger conclusion.

Theorem 15.3 Suppose that Θ is the supremum of the real parts of the zeros
of ζ(s), and that there is a zero ρ with ℜρ = Θ, say ρ = Θ + iγ. Then
\[
  \limsup_{x\to\infty} \frac{\psi(x) - x}{x^{\Theta}} \ge \frac{1}{|\rho|}, \tag{15.4}
\]
and
\[
  \liminf_{x\to\infty} \frac{\psi(x) - x}{x^{\Theta}} \le -\frac{1}{|\rho|}. \tag{15.5}
\]
Proof Suppose that ψ(x) ≤ x + cx^Θ for all x ≥ X0. Then by Lemma 15.1,
\[
  \frac{c}{s-\Theta} + \frac{\zeta'(s)}{s\zeta(s)} + \frac{1}{s-1} = \int_{1}^{\infty} \bigl(cx^{\Theta} - \psi(x) + x\bigr)\,x^{-s-1}\,dx \tag{15.6}
\]
for σ > Θ. Call this function F(s). Then
\[
  F(s) + \frac{1}{2}e^{i\phi}F(s+i\gamma) + \frac{1}{2}e^{-i\phi}F(s-i\gamma)
  = \int_{1}^{\infty} \bigl(cx^{\Theta} - \psi(x) + x\bigr)\bigl(1 + \cos(\phi - \gamma\log x)\bigr)\,x^{-s-1}\,dx
\]
for σ > Θ. We now consider the behaviour of these two expressions as s tends
to Θ from above through real values. On the right-hand side, the integral from
1 to X0 is uniformly bounded, while the integral from X0 to ∞ is non-negative.
Thus the lim inf of the right-hand side is > −∞ as s → Θ+. On the other hand,
the left-hand side is a meromorphic function that has a pole at s = Θ with
residue
\[
  c + \frac{m e^{i\phi}}{2\rho} + \frac{m e^{-i\phi}}{2\overline{\rho}}
\]
where m ≥ 1 denotes the multiplicity of the zero ρ. We choose φ so that e^{iφ}/ρ =
−1/|ρ|. Then the above is c − m/|ρ|. This quantity must be non-negative, for if
it were negative, then the left-hand side would tend to −∞ as s → Θ+. Hence
c ≥ 1/|ρ|, and we have (15.4). The proof of (15.5) is similar.

Corollary 15.4 As x tends to +∞,
\[
  \psi(x) - x = \Omega_{\pm}\bigl(x^{1/2}\bigr), \tag{15.7}
\]
\[
  \vartheta(x) - x = \Omega_{-}\bigl(x^{1/2}\bigr), \tag{15.8}
\]
and
\[
  \pi(x) - \operatorname{li}(x) = \Omega_{-}\bigl(x^{1/2}(\log x)^{-1}\bigr). \tag{15.9}
\]
The problem of proving Ω+ companions of (15.8) and (15.9) is more difficult,
and is dealt with in the next section.

Proof We first prove (15.7). If RH is false, then Θ > 1/2, and we have a
stronger result by Theorem 15.2. If RH holds, then we have (15.7) by
Theorem 15.3, and the remaining assertions follow by Theorem 13.2.

Many similar results can be proved using the above ideas. For example, for
$M(x) = \sum_{n\le x}\mu(n)$ we find, in the manner of Theorem 15.2, that
\[
M(x) = \Omega_\pm\bigl(x^{\Theta-\varepsilon}\bigr). \tag{15.10}
\]
In analogy to (15.6) we put
\[
G(s) = \frac{1}{s\zeta(s)} - \frac{c}{s-\Theta} = \int_1^\infty \bigl(M(x) - cx^{\Theta}\bigr)x^{-s-1}\,dx.
\]
Then in the manner of the proof of Theorem 15.3, we find that if Θ + iγ is a
zero of ζ(s), then
\[
\limsup_{x\to\infty} \frac{M(x)}{x^{\Theta}} \ge \frac{1}{|\rho\zeta'(\rho)|}, \tag{15.11}
\]
and
\[
\liminf_{x\to\infty} \frac{M(x)}{x^{\Theta}} \le -\frac{1}{|\rho\zeta'(\rho)|}. \tag{15.12}
\]
Here we are assuming that ζ′(ρ) ≠ 0. In the contrary case ρ would be a multiple
zero of ζ(s), and our method would allow us to replace the right-hand side of
(15.11) by +∞ and that of (15.12) by −∞. In fact we can prove still more, by
considering the function
\[
H(s) = \frac{1}{s\zeta(s)} - \frac{c(m-1)!}{(s-\Theta)^m}
= \int_1^\infty \bigl(M(x) - cx^{\Theta}(\log x)^{m-1}\bigr)x^{-s-1}\,dx.
\]
Then our method allows us to deduce that if Θ + iγ is a zero of multiplicity
m ≥ 1, then
\[
M(x) = \Omega_\pm\bigl(x^{\Theta}(\log x)^{m-1}\bigr).
\]
Then in the manner of Corollary 15.4 we find that in any case
\[
M(x) = \Omega_\pm\bigl(x^{1/2}\bigr), \tag{15.13}
\]
and that if ζ(s) has a multiple zero, then
\[
M(x) = \Omega_\pm\bigl(x^{1/2}\log x\bigr). \tag{15.14}
\]
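The size of M(x) relative to $x^{1/2}$ is easy to inspect numerically for small x. The following is a minimal sketch, not part of the original text: the function names `mobius_sieve`, `mertens_values` and `extreme_ratios` are illustrative choices, and a finite computation of this kind says nothing about the Ω-results themselves, which concern the behaviour as x → ∞.

```python
import math

def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Moebius function of n, 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each distinct prime divisor flips the sign once.
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # A square divisor forces the value 0.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu

def mertens_values(limit):
    """Return a list M with M[x] = sum of mu(n) over 1 <= n <= x (and M[0] = 0)."""
    mu = mobius_sieve(limit)
    M = [0] * (limit + 1)
    for n in range(1, limit + 1):
        M[n] = M[n - 1] + mu[n]
    return M

def extreme_ratios(limit):
    """Return (min, max) of M(x)/sqrt(x) over the integers 1 <= x <= limit."""
    M = mertens_values(limit)
    ratios = [M[x] / math.sqrt(x) for x in range(1, limit + 1)]
    return min(ratios), max(ratios)

if __name__ == "__main__":
    lo, hi = extreme_ratios(10**5)
    print(f"min M(x)/sqrt(x) = {lo:.4f}, max M(x)/sqrt(x) = {hi:.4f}")
```

In ranges accessible to such a direct sieve the ratio stays within [−1, 1], which is why the oscillation results of this section cannot be guessed from tables alone.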
In the explicit formula for ψ(x) − x, or for M(x), the arguments of the terms in
the sum over the zeros are governed by the quantities $x^{i\gamma}$. If the ordinates γ > 0
are linearly independent over Q, then these arguments will tend to be statistically
independent as x runs over a long range. Numerical experiments have failed
to disclose any linear dependences, and in the absence of any indication to the
contrary, we presume that the ordinates γ > 0 are linearly independent. Under
this assumption, we can improve on the estimate (15.13).
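Theorem 15.5 Let $0 < \gamma_1 < \gamma_2 < \cdots < \gamma_K$ and γ be ordinates of zeros of
ζ(s). For 1 ≤ k ≤ K let $\varepsilon_k$ take one of the values −1, 0, 1. Suppose that
\[
\sum_{k=1}^{K} \varepsilon_k\gamma_k = 0 \tag{15.15}
\]
for such $\varepsilon_k$ only when $\varepsilon_k = 0$ for all k. Suppose also that the equation
\[
\sum_{k=1}^{K} \varepsilon_k\gamma_k = \gamma \tag{15.16}
\]
has a solution only if γ is one of the $\gamma_k$, say $\gamma = \gamma_{k_0}$, and that in this case the
only solution is obtained by taking $\varepsilon_{k_0} = 1$, $\varepsilon_k = 0$ for $k \ne k_0$. Then
\[
\limsup_{x\to\infty} \frac{M(x)}{x^{1/2}} \ge \sum_{k=1}^{K} \frac{1}{|\rho_k\zeta'(\rho_k)|} \tag{15.17}
\]
and
\[
\liminf_{x\to\infty} \frac{M(x)}{x^{1/2}} \le -\sum_{k=1}^{K} \frac{1}{|\rho_k\zeta'(\rho_k)|}. \tag{15.18}
\]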

Proof In view of (15.10) and (15.14), we may assume that RH holds and that
all zeros of the zeta function are simple. We suppose that $M(x) \le cx^{1/2}$ for all
large x and consider the integral
\[
I(s) = \int_1^\infty \frac{M(x) - cx^{1/2}}{x^{s+1}} \prod_{k=1}^{K}\bigl(1+\cos(\phi_k - \gamma_k\log x)\bigr)\,dx.
\]
With G(s) defined as above (with Θ = 1/2), we multiply out the product to
see that this integral is a linear combination of G at various arguments. More
precisely, we see that
\[
I(s) = G(s) + \frac12\sum_{k=1}^{K}\bigl(e^{i\phi_k}G(s+i\gamma_k) + e^{-i\phi_k}G(s-i\gamma_k)\bigr) + J(s)
\]
where J(s) is a linear combination of G at arguments of the form
\[
s + i\sum_{k=1}^{K}\varepsilon_k\gamma_k
\]
with more than one of the $\varepsilon_k$ non-zero. The function G(s) is analytic in the
half-plane σ > 0, except for poles at s = 1/2 and at the non-trivial zeros ρ.
Hence by Landau's theorem we see that I(s) converges for σ > 1/2, and our
hypotheses (15.15), (15.16) imply that J(s) is analytic at the point s = 1/2.
Thus the integral I(s) has a pole at s = 1/2 with residue
\[
-c + \sum_{k=1}^{K} \Re\,\frac{e^{i\phi_k}}{\rho_k\zeta'(\rho_k)}.
\]
We choose the $\phi_k$ so that each quotient $e^{i\phi_k}/(\rho_k\zeta'(\rho_k))$ is positive real. Since I(s) is
bounded above uniformly for s > 1/2, by letting s tend to 1/2 from above we
deduce that
\[
c \ge \sum_{k=1}^{K} \frac{1}{|\rho_k\zeta'(\rho_k)|}.
\]
This gives (15.17), and the proof of (15.18) is similar.

It is not known whether it is possible to choose zeros ρ in such a way that the
hypotheses (15.15), (15.16) hold, and for which the sum in (15.17) and (15.18)
is large, but at least we are able to establish

Theorem 15.6 Suppose that the Riemann Hypothesis is true and that the zeros
of the zeta function are simple. Then
\[
\sum_{0<\gamma\le T} \frac{1}{|\zeta'(\rho)|} \gg T
\]
as T → ∞.

From this it follows by partial summation that
\[
\sum_{0<\gamma\le T} \frac{1}{|\rho\zeta'(\rho)|} \gg \log T
\]
as T → ∞. Thus by combining Theorems 15.5 and 15.6 we have

Corollary 15.7 If the ordinates γ > 0 of the zeros of the Riemann zeta function are
linearly independent over Q, then
\[
\limsup_{x\to\infty} \frac{M(x)}{x^{1/2}} = +\infty
\]
and
\[
\liminf_{x\to\infty} \frac{M(x)}{x^{1/2}} = -\infty.
\]
Proof of Theorem 15.6 It is enough to prove the inequality with T restricted
to the special sequence of values $T_\nu$ of Theorem 13.21, for which $|\zeta(s)| \gg \tau^{-\varepsilon}$
uniformly for −1 ≤ σ ≤ 2. By the calculus of residues we see that
\[
\sum_{0<\gamma\le T_\nu} \frac{1}{\zeta'(\rho)} = \frac{1}{2\pi i}\int_{C} \frac{1}{\zeta(s)}\,ds
\]
where C is the rectangular contour with vertices $2+i$, $2+iT_\nu$, $-1+iT_\nu$,
$-1+i$. The top of this rectangle contributes an amount $\ll T_\nu^{\varepsilon}$. For s on the
left side of this contour, $|\zeta(s)| \gg \tau^{3/2}$ by Corollary 10.5, so that the integral
along the left-hand side is ≪ 1. The integral along the bottom of the rectangle
is clearly ≪ 1 as well. To estimate the integral along the right-hand side, we
expand 1/ζ(s) in its Dirichlet series, and integrate term by term. The integral
of 1 contributes $T_\nu - 1$, while for n > 1 the integral of $n^{-2-it}$ is $\ll n^{-2}/\log n$.
On summing over n we find that the integral of 1/ζ(s) over the right-hand side
of the rectangle is $T_\nu + O(1)$. On combining these estimates we see that the
sum above is $T_\nu + O(T_\nu^{\varepsilon})$, and this gives the stated result.

15.1.1 Exercises
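1. (a) Suppose that ε is small and positive, and let Li(x) be defined as in
Exercise 6.2.22. Explain why
\[
s\int_{1+\varepsilon}^\infty \mathrm{Li}(x)x^{-s-1}\,dx
= \mathrm{Li}(1+\varepsilon)(1+\varepsilon)^{-s} + \int_{1+\varepsilon}^\infty \frac{dx}{x^{s}\log x} = T_1 + T_2.
\]
(b) Show that Li(1 − ε) = Li(1 + ε) + O(ε).
(c) Show that
\[
\mathrm{Li}(1-\varepsilon) = -\int_{\varepsilon}^\infty e^{-v}\,\frac{dv}{v}.
\]
(d) Show that Li(1 + ε) ≪ log 1/ε.
(e) Deduce that
\[
T_1 = -\int_{\varepsilon}^\infty e^{-v}\,\frac{dv}{v} + O\Bigl(\varepsilon\log\frac{1}{\varepsilon}\Bigr).
\]
(f) Show that
\[
T_2 = \int_{(s-1)\log(1+\varepsilon)}^\infty e^{-v}\,\frac{dv}{v}.
\]
(g) Show that
\[
T_2 = \int_{(s-1)\varepsilon}^\infty e^{-v}\,\frac{dv}{v} + O(\varepsilon).
\]
(h) Show that
\[
T_1 + T_2 = -\log(s-1) - \int_{\varepsilon}^{(s-1)\varepsilon} (e^{-v}-1)\,\frac{dv}{v} + O(\varepsilon\log 1/\varepsilon).
\]
(i) Conclude that
\[
s\int_1^\infty \mathrm{Li}(x)x^{-s-1}\,dx = -\log(s-1)
\]
for σ > 1.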

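2. Let $\psi_1(x) = \sum_{n\le x}\Lambda(n)(x-n)$. Show that $\psi_1(x) - \tfrac12x^2 = \Omega_\pm(x^{3/2})$.
3. Show that $\psi(2x) - 2\psi(x) = \Omega_\pm(x^{1/2})$.
4. (a) Show that as x → ∞,
\[
\sum_{n\le x}(1-n/x)\mu(n) = \Omega_\pm\bigl(x^{1/2}\bigr).
\]
(b) Show that as x → ∞,
\[
\sum_{n\le x}\mu(n)/n = \Omega_\pm\bigl(x^{-1/2}\bigr).
\]
(c) Show that as x → ∞,
\[
\sum_{n=1}^{\infty}\mu(n)e^{-n/x} = \Omega_\pm\bigl(x^{1/2}\bigr).
\]
5. Let Q(x) denote the number of square-free numbers not exceeding x.
(a) Show that
\[
Q(x) - \frac{6}{\pi^2}x = \Omega_\pm\bigl(x^{1/4}\bigr).
\]
(b) Show that
\[
Q(2x) - 2Q(x) = \Omega_\pm\bigl(x^{1/4}\bigr).
\]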
6. (a) Suppose that ζ(1/2 + iγ) = 0 and that ζ(1/2 + 2iγ) ≠ 0. Show that
\[
\limsup_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}} \ge \frac{4}{3|\rho|}
\]
and that
\[
\liminf_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}} \le -\frac{4}{3|\rho|}.
\]
(b) Show that if $\zeta(1/2+i\gamma_1) = \zeta(1/2+i\gamma_2) = 0$ but $\zeta(1/2+i(\gamma_1+\gamma_2)) \ne 0$
and $\zeta(1/2+i(\gamma_1-\gamma_2)) \ne 0$, then
\[
\limsup_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}} \ge \frac{1}{|1/2+i\gamma_1|} + \frac{1}{|1/2+i\gamma_2|}
\]
and that
\[
\liminf_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}} \le -\frac{1}{|1/2+i\gamma_1|} - \frac{1}{|1/2+i\gamma_2|}.
\]

7. Show that $\sum_{n\le x}(-1)^{\omega(n)} \ll x^{1/2+\varepsilon}$ if and only if $(3^s-2)/\zeta(s)$ is analytic
for σ > 1/2.

8. (Ingham 1942; cf. Haselgrove 1958) Let $L(x) = \sum_{n\le x}\lambda(n)$.
(a) Show that if Θ > 1/2, then for every ε > 0, $L(x) = \Omega_\pm(x^{\Theta-\varepsilon})$ as
x → ∞.
(b) Show that $\liminf_{x\to\infty} L(x)/x^{1/2} \le 1/\zeta(1/2)$ (= −0.685 . . . ).
(c) Show that if ζ(s) has a multiple zero, then $L(x) = \Omega_\pm(x^{1/2}\log x)$.
(d) Show that if RH holds and σ is fixed, 1/4 < σ < 1/2, then
$|\zeta(2s)/\zeta(s)| = \tau^{\sigma-1/2+o(1)}$.
(e) Show that if RH holds, then there is a sequence of $T_\nu \to \infty$ in such a
way that $T_{\nu+1} \le T_\nu + 2$, and
\[
\sum_{0<\gamma\le T_\nu}\frac{\zeta(2\rho)}{\zeta'(\rho)} = T_\nu + O\bigl(T_\nu^{3/4+\varepsilon}\bigr).
\]
(f) Show that if RH holds and the ordinates γ > 0 of the zeros of the zeta
function are linearly independent over Q, then
\[
\limsup_{x\to\infty}\frac{L(x)}{x^{1/2}} = +\infty
\]
and
\[
\liminf_{x\to\infty}\frac{L(x)}{x^{1/2}} = -\infty.
\]
9. (Turán 1948; cf. Haselgrove 1958)
(a) Show that if $\sum_{n\le x}\lambda(n)/n \ge 0$ for all x ≥ 1, then the Riemann
Hypothesis is true.
(b) Show that
\[
\sum_{n\le x}\lambda(n)/n = \Omega_+\bigl(x^{-1/2}\bigr)
\]
as x → ∞.
10. Let the positive integer q be fixed. Suppose that if χ is a character (mod
q), then L(σ, χ) ≠ 0 for 0 < σ < 1. Suppose also that a and b are integers
such that (ab, q) = 1 and a ≢ b (mod q).
(a) Let Θ = Θ(q; a, b) denote the supremum of the real parts of the poles
of the function
\[
\sum_{\chi}\bigl(\chi(a)-\chi(b)\bigr)\frac{L'}{L}(s,\chi).
\]
Show that
\[
\psi(x;q,a) - \psi(x;q,b) = \Omega_\pm\bigl(x^{\Theta-\varepsilon}\bigr)
\]
for any ε > 0.
(b) Let r(a) denote the number of solutions of the congruence $x^2 \equiv a$
(mod q). Show that
\[
\vartheta(x;q,a) = \psi(x;q,a) - \frac{r(a)}{\varphi(q)}x^{1/2} + o\bigl(x^{1/2}\bigr).
\]
(c) Show that if Θ(q; a, b) > 1/2, then
\[
\vartheta(x;q,a) - \vartheta(x;q,b) = \Omega_\pm\bigl(x^{\Theta-\varepsilon}\bigr),
\]
\[
\pi(x;q,a) - \pi(x;q,b) = \Omega_\pm\bigl(x^{\Theta-\varepsilon}\bigr)
\]
for any ε > 0.
(d) Show that Θ(q; a, b) ≥ 1/2.
(e) Show that
\[
\psi(x;q,a) - \psi(x;q,b) = \Omega_\pm\bigl(x^{1/2}\bigr).
\]
(f) Show that if r(a) ≥ r(b), then
\[
\vartheta(x;q,a) - \vartheta(x;q,b) = \Omega_-\bigl(x^{1/2}\bigr),
\]
\[
\pi(x;q,a) - \pi(x;q,b) = \Omega_-\bigl(x^{1/2}/\log x\bigr).
\]
(g) Show that if r(a) ≤ r(b), then
\[
\vartheta(x;q,a) - \vartheta(x;q,b) = \Omega_+\bigl(x^{1/2}\bigr),
\]
\[
\pi(x;q,a) - \pi(x;q,b) = \Omega_+\bigl(x^{1/2}/\log x\bigr).
\]
(h) Show that
\[
\pi(x;4,1) - \pi(x;4,3) = \Omega_-\bigl(x^{1/2}/\log x\bigr).
\]
11. (Hardy & Littlewood 1918; Landau 1918a, b) Let $\chi_{-4}(n) = \bigl(\frac{-4}{n}\bigr)$ denote
the non-principal character modulo 4, and let
\[
T_1(x) = \sum_{n\le x}\Lambda(n)\chi_{-4}(n)(x-n).
\]
(a) Show that
\[
T_1(x) = -\sum_{\rho}\frac{x^{\rho+1}}{\rho(\rho+1)} + O(x)
\]
where ρ runs over the non-trivial zeros of $L(s,\chi_{-4})$. In parts (b)–(l)
below, assume that all these zeros lie on the line σ = 1/2.
(b) Show that
\[
\sum_{\rho}\frac{1}{|\rho|^2} = 2\log 2 - \log\pi - C_0 + 2\frac{L'}{L}(1,\chi_{-4}).
\]
(c) Show that $L(1,\chi_{-4}) = \pi/4$.
(d) Show that
\[
L'(1,\chi_{-4}) = \frac{\log 3}{6} + \sum_{k=2}^{\infty}\frac{(-1)^k}{2}\Bigl(\frac{\log(2k-1)}{2k-1} - \frac{\log(2k+1)}{2k+1}\Bigr),
\]
and apply the alternating series test to show that $0.19 < L'(1,\chi_{-4}) < 0.196$.
(e) Deduce that
\[
0.148 < \sum_{\rho}\frac{1}{|\rho|^2} < 0.164.
\]
(f) Show that $|T_1(x)| < (0.165)x^{3/2}$ for all large x.
(g) Show that
\[
\sum_{p\le x^{1/2}}(\log p)(x-p^2) = \frac23x^{3/2} + o\bigl(x^{3/2}\bigr).
\]
(h) Let $T_2(x) = \sum_{2<p\le x}(\log p)(-1)^{(p-1)/2}(x-p)$. Show that
\[
-\frac56x^{3/2} < T_2(x) < -\frac12x^{3/2}
\]
for all large x.

(i) Let $T_3(x) = \sum_{2<p\le x}(-1)^{(p-1)/2}(x-p)$. Show that
\[
T_3(x) = \frac{T_2(x)}{\log x} + \int_3^x \frac{T_2(u)}{u^2(\log u)^2}\Bigl(x + u + \frac{2(x-u)}{\log u}\Bigr)\,du
= \frac{T_2(x)}{\log x} + O\Bigl(\frac{x^{3/2}}{(\log x)^2}\Bigr).
\]

(j) Let $P(x) = \sum_{p>2}(-1)^{(p-1)/2}e^{-p/x}$. Show that
\[
P(x) = \frac{1}{x^2}\int_0^\infty T_3(u)e^{-u/x}\,du.
\]
(k) Show that
\[
\int_2^\infty u^{3/2}(\log u)^{-1}e^{-u/x}\,du = \frac34\sqrt{\pi}\,x^{5/2}(\log x)^{-1} + O\bigl(x^{5/2}(\log x)^{-2}\bigr).
\]
(l) Deduce that
\[
P(x) < -\frac35\,\frac{x^{1/2}}{\log x}
\]
for all large x.
(m) Chebyshev (1853) proposed that P(x) < 0 for all sufficiently large x.
Conclude that Chebyshev's conjecture is equivalent to the assertion
that $L(s,\chi_{-4}) \ne 0$ for σ > 1/2.

15.2 The error term in the Prime Number Theorem

We have seen that ψ(x) − x changes sign infinitely often. We now show that
these sign changes can be localized if there is a zero on the abscissa Θ.

Theorem 15.8 Let Θ denote the supremum of the real parts of the zeros of
ζ(s). If ζ(s) has a zero with real part Θ, then there exists a constant C > 0 such
that ψ(x) − x changes sign in every interval [x, Cx] for which x ≥ 2.

Proof For each integer k ≥ 0, put
\[
R_k(y) = \frac{1}{k!}\sum_{n\le e^y}(y-\log n)^k\Lambda(n) - e^y.
\]
We see easily that $R_k(y)$ is differentiable for k > 1, and that $R_k'(y) = R_{k-1}(y)$.
By the method used to prove explicit formulæ we see also that
\[
R_k(y) = -\sum_{\rho}\frac{e^{\rho y}}{\rho^{k+1}} + O(y^{k+1}).
\]
Suppose that the numbers $\gamma_j$ are determined, $0 < \gamma_1 < \gamma_2 < \ldots$ so that the
numbers $\Theta \pm i\gamma_j$ constitute all the zeros of ζ(s) on the line σ = Θ, and let
$m_j$ denote the multiplicity of the zero $\rho_j = \Theta + i\gamma_j$. Since $\sum_\rho|\rho|^{-\alpha} < \infty$ for
α > 1, we see that if k ≥ 1, then
\[
R_k(y) = -2e^{\Theta y}\,\Re\sum_{j}\frac{m_je^{i\gamma_jy}}{\rho_j^{k+1}} + o\bigl(e^{\Theta y}\bigr) \tag{15.19}
\]
as y → ∞. Let K be the least number for which
\[
\frac{m_1}{|\rho_1|^K} > \sum_{j>1}\frac{m_j}{|\rho_j|^K}.
\]
Choose φ so that $e^{i\gamma_1\phi}/\rho_1^K > 0$. By taking k = K in (15.19) and using the above
inequality, we see that for all large numbers n, $R_K(\phi + \pi n/\gamma_1)$ is positive or
negative according as n is odd or even. Take $C = \exp(\pi(K+2)/\gamma_1)$. Then any
interval $[y_0, y_0 + \log C]$ contains at least K + 2 points of the form $\phi + \pi n/\gamma_1$.
Thus if $y_0$ is large, then such an interval contains K + 2 points at which $R_K(y)$
alternates in sign. By the mean value theorem for derivatives we know that if f is
differentiable on an interval [α, β] and f(α) < 0, f(β) > 0, then there must be
a number ξ, α < ξ < β, such that f′(ξ) > 0. Thus we can choose K + 1 points
in the interval $[y_0, y_0 + \log C]$ at which $R_{K-1}(y)$ alternates in sign. Continuing
in this manner, we conclude that we can find three points in this interval at
which $R_1(y)$ alternates in sign. Now $R_1(y)$ is continuous, and $R_1'(y) = R_0(y)$
in intervals containing no prime power, so that $R_1(y)$ is an indefinite integral of
$R_0(y)$. Thus, although $R_0(y)$ is not everywhere differentiable, it is nevertheless
true that $R_1$ will be monotonic in any interval in which $R_0$ is of constant sign.
Since $R_1$ is not monotonic in the interval in question, we deduce that $R_0$ changes
sign.
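The early sign changes of ψ(x) − x can be located by direct computation. The sketch below is an illustration added here, not part of the proof: the helper names are hypothetical, and the function is sampled only at integer arguments, so a sign change that occurs strictly between two consecutive integers and is reversed before the next one would go unnoticed.

```python
import math

def von_mangoldt_table(limit):
    """Return lam with lam[n] = log p if n is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(2 * p, limit + 1, p):
                is_prime[m] = False
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam

def psi_error_at_integers(limit):
    """Return err with err[n] = psi(n) - n for 1 <= n <= limit (requires limit >= 1)."""
    lam = von_mangoldt_table(limit)
    err = [0.0] * (limit + 1)
    psi = 0.0
    for n in range(1, limit + 1):
        psi += lam[n]
        err[n] = psi - n
    return err

def sign_change_points(limit):
    """Return the integers n <= limit at which psi(n) - n differs in sign from
    the most recent non-zero value at a smaller integer."""
    err = psi_error_at_integers(limit)
    changes = []
    previous = err[1]  # psi(1) - 1 = -1
    for n in range(2, limit + 1):
        if err[n] * previous < 0:
            changes.append(n)
        if err[n] != 0.0:
            previous = err[n]
    return changes

def largest_gap_ratio(limit):
    """Largest ratio of consecutive sign-change points found up to limit."""
    changes = sign_change_points(limit)
    return max(b / a for a, b in zip(changes, changes[1:]))

if __name__ == "__main__":
    print(sign_change_points(200))
    print(largest_gap_ratio(10**5))
```

The quantity returned by `largest_gap_ratio` is only an empirical stand-in for the constant C of Theorem 15.8 over the range examined; the theorem itself gives no numerical value.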

The method used to prove Corollary 15.7 could be applied to ψ(x) − x,
but for this function we have a different approach that succeeds without any
unproved hypothesis. In view of Theorem 15.2 we may assume that the Riemann
Hypothesis is true. By substituting $e^y$ for x in the explicit formula for ψ(x), we
see that
\[
\frac{\psi(e^y) - e^y}{e^{y/2}} = -\sum_{\rho}\frac{e^{i\gamma y}}{\rho} + O\bigl(e^{-y/2}\bigr)
\]
uniformly for y ≥ 1. Since $1/\rho = 1/(i\gamma) + O(1/\gamma^2)$ and $\sum 1/\gamma^2 < \infty$, the
above is
\[
-2\sum_{\gamma>0}\frac{\sin\gamma y}{\gamma} + O(1).
\]
Here each term in the sum is periodic, and if γ is large, then both the period and
the amplitude of the term are small. The sum is not absolutely convergent, but
by suitably averaging this with respect to y we may arrange that the γ beyond
a chosen point make a small contribution. Suppose, for simplicity, that by such
an averaging we could truncate the sum, which would leave us to consider the
partial sum
\[
-2\sum_{0<\gamma\le T}\frac{\sin\gamma y}{\gamma}. \tag{15.20}
\]
Here the sum of the absolute values of the coefficients is $\asymp (\log T)^2$, and the
sum will be of this order of magnitude if we can find a y for which the fractional
parts {γy/(2π)} are approximately 1/4 for all the above γ. This, however, is an
inhomogeneous problem of Diophantine approximation, and in general such a
problem has a solution only if the coefficients γ are linearly independent over Q.
Moreover, in order to obtain a quantitative result it would be necessary to have
quantitative lower bounds for the absolute values of linear forms in the γ. Since
we have no such information, we are confined to homogeneous approximation.
Dirichlet's theorem assures us that there exist large y for which each of the
numbers γy/(2π) is near an integer. That is, $\|\gamma y/(2\pi)\|$ is small for 0 < γ ≤ T,
where $\|\theta\|$ denotes the distance from θ to the nearest integer, $\|\theta\| = \min_{n\in\mathbb{Z}}|\theta - n|$.
However, the sum (15.20) vanishes when y = 0, and will therefore be small
when the numbers $\|\gamma y/(2\pi)\|$ are small. On the other hand, if we take y = π/T
in (15.20), then $\sin\gamma y \gg \gamma/T$, and the sum is $\gg N(T)/T \gg \log T$. While this
is smaller than the $(\log T)^2$ that we might have hoped for, it is definitely large.
This y is small, but by Dirichlet's theorem there exists a large number $y_0$ for
which the numbers $\|\gamma y_0/(2\pi)\|$ are small, and then we may take $y = y_0 \pm \pi/T$
to make the sum (15.20) large in either sign.
The truth of the matter is that the sum (15.20) is not an average of the error
term in the Prime Number Theorem, but we can form a weighted sum that
resembles (15.20).
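To make the heuristic concrete, the truncated sum (15.20) can be evaluated for a short list of zeros. The sketch below rests on assumptions not found in the text: the list `GAMMAS` holds the first ten positive ordinates of zeros of ζ(s), quoted from standard tables and rounded to six decimals, and the names `truncated_sum` and `coefficient_mass` are illustrative.

```python
import math

# First ten positive ordinates of zeros of zeta, rounded to six decimals
# (standard tabulated values, supplied here as an assumption).
GAMMAS = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
          37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def truncated_sum(y, T, gammas=GAMMAS):
    """Evaluate -2 * sum over 0 < gamma <= T of sin(gamma * y) / gamma, as in (15.20)."""
    return -2.0 * sum(math.sin(g * y) / g for g in gammas if 0 < g <= T)

def coefficient_mass(T, gammas=GAMMAS):
    """Sum of the absolute values of the coefficients, 2 * sum of 1/gamma."""
    return 2.0 * sum(1.0 / g for g in gammas if 0 < g <= T)

if __name__ == "__main__":
    T = 50.0
    y = math.pi / T
    print("sum at y = +pi/T:", truncated_sum(y, T))
    print("sum at y = -pi/T:", truncated_sum(-y, T))
    print("trivial bound   :", coefficient_mass(T))
```

With y = π/T every angle γy lies in (0, π), so all terms have the same sign; replacing y by −y reverses the sign, which is the mechanism exploited with $y = y_0 \pm \pi/T$.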

Lemma 15.9 If the Riemann Hypothesis is true, then
\[
\frac{1}{(e^{\delta}-e^{-\delta})x}\int_{e^{-\delta}x}^{e^{\delta}x}(\psi(u)-u)\,du
= -2x^{1/2}\sum_{\gamma>0}\frac{\sin\gamma\delta}{\gamma\delta}\cdot\frac{\sin(\gamma\log x)}{\gamma} + O\bigl(x^{1/2}\bigr)
\]
uniformly for x ≥ 4, 1/(2x) ≤ δ ≤ 1/2.

The first factor in the sum is near 1 if γ is small compared to 1/δ, and then
becomes small for larger γ. Thus, despite its more complicated appearance, the
above sum behaves like the partial sum (15.20) with $T \asymp 1/\delta$.

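Proof We recall that
\[
\int_0^x(\psi(u)-u)\,du = -\sum_{\rho}\frac{x^{\rho+1}}{\rho(\rho+1)} - \frac{\zeta'}{\zeta}(0)\,x + O(1)
\]
for x ≥ 2. We replace x by $e^{\pm\delta}x$ and difference to see that the left-hand side in
the lemma is
\[
-\Bigl(\frac{\delta}{\sinh\delta}\Bigr)\sum_{\rho}\frac{\bigl(e^{\delta(\rho+1)} - e^{-\delta(\rho+1)}\bigr)x^{\rho}}{2\delta\rho(\rho+1)} + O(1). \tag{15.21}
\]
We appeal to RH, and observe that $e^{\pm\delta(\rho+1)} = e^{\pm i\gamma\delta}(1+O(\delta)) = e^{\pm i\gamma\delta} + O(\delta)$.
Since $N(T+1) - N(T) \ll \log T$, we see easily that $\sum_\gamma\gamma^{-2} \ll 1$. Thus
when we replace $e^{\pm\delta(\rho+1)}$ by $e^{\pm i\gamma\delta}$ in (15.21), we introduce an error term that
is $\ll x^{1/2}$. Hence the expression (15.21) is
\[
-ix^{1/2}\Bigl(\frac{\delta}{\sinh\delta}\Bigr)\sum_{\rho}\frac{\sin\gamma\delta}{\delta}\cdot\frac{x^{i\gamma}}{\rho(\rho+1)} + O\bigl(x^{1/2}\bigr).
\]
The factor in parentheses is $1 + O(\delta^2)$, and the sum over ρ is
\[
\ll \sum_{0<\gamma\le 1/\delta}\frac{1}{\gamma} + \frac{1}{\delta}\sum_{\gamma>1/\delta}\frac{1}{\gamma^2} \ll (\log 1/\delta)^2,
\]
so our expression is
\[
-ix^{1/2}\sum_{\rho}\frac{\sin\gamma\delta}{\delta}\cdot\frac{x^{i\gamma}}{\rho(\rho+1)} + O\bigl(x^{1/2}\bigr).
\]
Now $1/\rho = 1/(i\gamma) + O(1/\gamma^2)$, and the first factor in the above sum is ≪ |γ|,
so that if we replace 1/ρ by 1/(iγ), then we introduce an error term that is
$\ll x^{1/2}\sum_\gamma 1/\gamma^2 \ll x^{1/2}$. Similarly we may replace 1/(ρ + 1) by 1/(iγ). Thus
we see that the above sum is
\[
-x^{1/2}\sum_{\rho}\frac{\sin\gamma\delta}{\gamma\delta}\cdot\frac{x^{i\gamma}}{i\gamma} + O\bigl(x^{1/2}\bigr).
\]
We now obtain the stated result by combining the contributions of γ
and −γ.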

We now formulate a simple form of Dirichlet's theorem that is suitable for
our use.

Lemma 15.10 (Dirichlet) If $\theta_1, \ldots, \theta_K$ are real numbers, and N is a positive
integer, then there is a positive integer $n \le N^K$ such that $\|\theta_kn\| < 1/N$ for
1 ≤ k ≤ K.

Proof The point $p(n) = (\{\theta_1n\}, \ldots, \{\theta_Kn\})$ lies in the hypercube $[0,1)^K$. We
partition this hypercube into $N^K$ hypercubes of side length 1/N. We allow n
to take the values $0, 1, \ldots, N^K$, which gives us $N^K + 1$ points. Hence by the
pigeon-hole principle there are two values of n, say $0 \le n_1 < n_2 \le N^K$, for
which the points $p(n_1)$, $p(n_2)$ lie in the same hypercube. Thus
\[
\|\theta_kn_1 - \theta_kn_2\| \le |\{\theta_kn_1\} - \{\theta_kn_2\}| < 1/N
\]
for 1 ≤ k ≤ K. We take $n = n_2 - n_1$ to obtain the desired result.
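For small K and N the integer n promised by Lemma 15.10 can be found by exhaustive search. The following is a minimal sketch under stated assumptions: the function names are illustrative, the search is a brute-force scan over 1 ≤ n ≤ N^K rather than the pigeon-hole argument itself, and floating-point arithmetic is used, so inputs for which some ‖θ_k n‖ is extremely close to 1/N could in principle be misjudged.

```python
import math

def dist_to_nearest_int(t):
    """Distance from the real number t to the nearest integer."""
    return abs(t - round(t))

def dirichlet_search(thetas, N):
    """Return the least positive integer n <= N**K with
    dist_to_nearest_int(theta * n) < 1/N for every theta in thetas,
    where K = len(thetas); return None if the scan finds none."""
    K = len(thetas)
    for n in range(1, N ** K + 1):
        if all(dist_to_nearest_int(theta * n) < 1.0 / N for theta in thetas):
            return n
    return None

if __name__ == "__main__":
    thetas = [math.sqrt(2), math.sqrt(3)]
    N = 10
    n = dirichlet_search(thetas, N)
    print("n =", n, [dist_to_nearest_int(t * n) for t in thetas])
```

The cost of the scan grows like N^K, which is the same exponential dependence on K that limits the strength of the Ω-result obtained from the lemma.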

Theorem 15.11 (Littlewood) As x → ∞,
\[
\psi(x) - x = \Omega_\pm\bigl(x^{1/2}\log\log\log x\bigr), \tag{15.22}
\]
and
\[
\pi(x) - \mathrm{li}(x) = \Omega_\pm\bigl(x^{1/2}(\log x)^{-1}\log\log\log x\bigr). \tag{15.23}
\]
Proof We consider (15.22). If RH is false, then Theorem 15.2 is stronger.
Thus it remains to prove (15.22) if RH holds. Let N be a large integer. We
apply Lemma 15.10 to those numbers γ(log N)/(2π) for which 0 < γ ≤ T =
N log N. Thus in Lemma 15.10 we have $K = N(T) \asymp T\log T$, and there exists
an integer n, $1 \le n \le N^K$ such that
\[
\Bigl\|\frac{\gamma n}{2\pi}\log N\Bigr\| < \frac{1}{N}
\]
for 0 < γ ≤ T. We take $x = N^ne^{\pm1/N}$, δ = 1/N in Lemma 15.9. From the
general inequality $|\sin 2\pi\alpha - \sin 2\pi\beta| \le 2\pi\|\alpha - \beta\|$ we see that
\[
|\sin(\gamma\log x) \mp \sin\gamma/N| \le 2\pi/N.
\]
Since
\[
\sum_{\gamma}\frac{\sin\gamma/N}{\gamma/N}\cdot\frac{1}{\gamma} \ll (\log N)^2
\]
and $\sum_{\gamma>T}1/\gamma^2 \ll T^{-1}\log T \ll 1/N$, we deduce that the right-hand side in
Lemma 15.9 is
\[
\mp2x^{1/2}N^{-1}\sum_{\gamma>0}\Bigl(\frac{\sin\gamma/N}{\gamma/N}\Bigr)^2 + O\bigl(x^{1/2}\bigr).
\]
The sum over γ is $\asymp N\log N$. But $x \le N^{N^K}e^{1/N}$ and $K = N(T) \ll T\log T \ll N(\log N)^2$,
so that
\[
\log\log x \ll N(\log N)^3,
\]
and hence log N ≥ (1 + o(1)) log log log x. The left-hand side in Lemma 15.9
is simply the average of ψ(u) − u over a neighbourhood of x. Since x ≫ N
and N is arbitrarily large, we have (15.22).
As for (15.23), we note that if RH holds, then (15.22) and (15.23) are equivalent,
in view of Theorem 13.2. If RH is false, then Theorem 15.2 gives a stronger
result.

15.2.1 Exercises
1. Show that
\[
\pi(x;4,1) - \pi(x;4,3) = \Omega_\pm\bigl(x^{1/2}(\log x)^{-1}\log\log\log x\bigr)
\]
as x → ∞.
2. (a) Show that if $f^{(k-1)}(x)$ is continuous in [a, a + kh] and if $f^{(k)}(x)$ exists
throughout (a, a + kh), then there exists a ξ ∈ (a, a + kh) such
that
\[
h^kf^{(k)}(\xi) = \sum_{j=0}^{k}(-1)^{k-j}\binom{k}{j}f(a+jh).
\]
(b) Show that there exist constants C > 0, c > 0 such that if RH holds,
then for all x ≥ 2,
\[
\sup_{x\le u\le Cx}(\psi(u)-u) \ge cx^{1/2}
\]
and
\[
\inf_{x\le u\le Cx}(\psi(u)-u) \le -cx^{1/2}.
\]
3. Show that for every C > 1 there is a δ = δ(C) > 0 such that if RH holds,
then
\[
\sup_{x\le u\le Cx}|\psi(u)-u| \ge \delta x^{1/2}
\]
for all x ≥ 2.
4. (Ingham 1936)
(a) Let N be a positive integer, Y a positive real number, and let $\theta_1, \ldots, \theta_K$
be arbitrary real numbers. By using Dirichlet's theorem, or otherwise,
show that there is a real number y, $Y \le y \le YN^K$ such that $\|\theta_ky\| < 1/N$
for 1 ≤ k ≤ K.
(b) Let N be an integer > 1, Y a positive real number. Show that there
exist real numbers $\theta_1, \ldots, \theta_K$ such that $\max_k\|\theta_ky\| \ge 1/N$ uniformly
for all real y in the interval $Y \le y \le Y(N-1)^K$.
(c) Suppose that RH holds. Show that there exists an absolute constant
c > 0 such that for any real numbers X ≥ 2 and Z ≥ 16 there exists
an x, X ≤ x ≤ XZ, for which
\[
\pi(x) - \mathrm{li}(x) > cx^{1/2}(\log x)^{-1}\log\log\log Z,
\]
and an x in the same interval for which
\[
\pi(x) - \mathrm{li}(x) < -cx^{1/2}(\log x)^{-1}\log\log\log Z.
\]
(d) Deduce that there is an absolute constant C > 0 such that if RH holds,
then π(x) − li(x) changes sign in every interval [X, CX] for X ≥ 2.
5. Show that the implicit constant in Littlewood's theorem can be taken to be
1/2. That is,
\[
\limsup_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}\log\log\log x} \ge 1/2,
\]
with similar inequalities for the lim inf and for π(x) − li(x).
6. Suppose that q is an integer such that $\prod_\chi L(\sigma,\chi) \ne 0$ for σ > 1/2. Show
that if (b, q) = 1, b ≢ 1 (mod q), then
\[
\pi(x;q,1) - \pi(x;q,b) = \Omega_\pm\bigl(x^{1/2}(\log x)^{-1}\log\log\log x\bigr).
\]

7. Suppose that $\sum_n|c_n| < \infty$, and put $g(y) = \sum_nc_ne^{i\lambda_ny}$ where the $\lambda_n$ are
real. Show that for any $y_0$ and any ε > 0, there exist arbitrarily large numbers
y such that $|g(y) - g(y_0)| < \varepsilon$.
8. Suppose that $g(y) = \sum_nc_ne^{i\lambda_ny}$ is uniformly convergent for y in a
neighbourhood of $y_0$, and put
\[
M_\delta = \frac{1}{\delta}\int_{-\delta}^{\delta}\Bigl(1 - \frac{|y|}{\delta}\Bigr)g(y_0+y)\,dy.
\]
(a) Show that
\[
M_\delta = \sum_{n}c_ne^{i\lambda_ny_0}\Bigl(\frac{\sin\lambda_n\delta/2}{\lambda_n\delta/2}\Bigr)^2
\]
for all small positive δ.
(b) Show that $M_\delta \to g(y_0)$ as $\delta \to 0^+$.
9. (Jurkat 1973, Anderson 1991) Suppose that there is a constant K such
that $M(x) \le Kx^{1/2}$ for all x ≥ 1, or that there is a constant K such that
$-Kx^{1/2} \le M(x)$ for all x ≥ 1.
(a) Show that the Riemann Hypothesis is true, that the zeros of ζ(s) are
simple, and that $|\zeta'(\rho)| \gg 1/|\rho|$.
(b) Show that there is a sequence of $T_\nu$ tending to infinity such that
\[
M(x) = \lim_{\nu\to\infty}\sum_{|\gamma|\le T_\nu}\frac{x^{\rho}}{\rho\zeta'(\rho)} - 2 + \sum_{n=1}^{\infty}\frac{(-1)^{n-1}(2\pi/x)^{2n}}{(2n)!\,n\,\zeta(2n+1)}
\]
for x > 0, and that the convergence is uniform in intervals that do not
contain a square-free number.
(c) Let
\[
g(y) = \lim_{\nu\to\infty}\sum_{|\gamma|\le T_\nu}\frac{e^{i\gamma y}}{\rho\zeta'(\rho)}.
\]
Show that if g(y) is continuous at $y_0$, then for any ε > 0 there exist
arbitrarily large y such that $|g(y) - g(y_0)| < \varepsilon$.
(d) Show that $g(0^+) - g(0^-) = 1$.
(e) Deduce that $\limsup_{x\to\infty}|M(x)|/x^{1/2} \ge 1/2$.
10. (a) Let $h(x) = (M(2x) - M(x))/x^{1/2}$. Show that $h(1^+) = -1$ and that
$h(1^-) = 1$.
(b) Show that
\[
\limsup_{x\to\infty}\Bigl|\sum_{x<n\le 2x}\mu(n)\Bigr|x^{-1/2} \ge 1.
\]

15.3 Notes
Theorems 15.2 and 15.3, and Corollary 15.4, are due in substance to E. Schmidt
(1903). Mertens (1897) conjectured that $|M(x)| \le x^{1/2}$ for all x ≥ 1. This
'Mertens Hypothesis' was disproved by Odlyzko and te Riele (1984), who
showed that
\[
\limsup_{x\to\infty}\frac{M(x)}{x^{1/2}} \ge 1.06
\]
and that
\[
\liminf_{x\to\infty}\frac{M(x)}{x^{1/2}} \le -1.009.
\]
One would expect that here the lim sup is +∞ and the lim inf is −∞, but
neither of these assertions has been proved. Ingham (1942) proved Theorem
15.5 under the stronger hypothesis that the ordinates γ > 0 are joined by at
most a finite number of linear relations. That one may restrict the coefficients
of the linear relations, and thus in principle verify the hypothesis for the first
several zeros, was shown by Bateman et al. (1971). The product used in the
proof of Theorem 15.5 is very similar to the Riesz products used in the study
of lacunary Fourier series (see Zygmund 1959, pp. 208–212).
The method used to prove Theorem 15.8 was introduced by Littlewood
(1927) for the purpose of providing a simple proof of Theorem 15.3.
Theorem 15.11 was announced by Littlewood (1914), who sketched the
proof. Full details were given later by Hardy and Littlewood (1918). The initial
proofs depended on an appeal to the Phragmén–Lindelöf principle. Ingham
(1936) found that this could be dispensed with. Ingham considered a more
complicated weighted average of ψ(u) − u which led to the simpler weighted
partial sum
\[
\sum_{0<\gamma\le T}(1-\gamma/T)\frac{\sin\gamma y}{\gamma}
\]
of the sum (15.20). The present exposition was inspired by Ingham's editorial
remark in Hardy's Collected Works (1967, p. 99).
The proof given of Theorem 15.11 is non-effective in the sense that it does
not permit one to determine an explicit constant c about which one can assert
that π(x) > li(x) for some x < c. Skewes (1933, 1955) formulated a slightly
different division into cases (RH 'nearly true' vs. RH 'significantly false'),
which permitted him to show that one can take

c = exp(exp(exp(exp(7.705)))).

One of the problems here is to construct a function f(x) about which one can
assert that in any interval $[x_0, f(x_0)]$ there exist x for which the sum over the
non-trivial zeros is not highly cancelling. That is, the conclusion of Theorem 15.2
must be put in a more quantitative, localized form. In this connection, Littlewood
(1937) was led to consider a question concerning a sum of cosines. Turán
(1946) discovered that the theorem formulated by Littlewood is false: the
argument provided establishes a weaker result than claimed. Turán undertook a
detailed study of such power sums. His 'power sum method' has many important
applications to the oscillatory error terms that arise in analytic number theory
(see Turán 1984). In particular, Knapowski (1961) used Turán's method to
show, without need of extensive numerical calculations, that an effective upper
bound for the constant c can be determined. Subsequently, Lehman (1966)
used extensive numerical information concerning the zeros ρ to show that one
can take $c = 1.65 \times 10^{1165}$. Using the same method te Riele (1989) shows that
π(x) > li(x) for at least $10^{180}$ consecutive integers in the interval
$[6.627\ldots \times 10^{370},\ 6.687\ldots \times 10^{370}]$. More recently Bays & Hudson (2000) have given
some new regions where π(x) > li(x), the first of these being around $1.39 \times 10^{316}$.
An extension of Littlewood's theorem to Beurling primes has been given
by Kahane (1999).
Monach & Montgomery (cf. Monach 1980) have conjectured that for every
ε > 0 and every K > 0 there is a $T_0(\varepsilon, K)$ such that
\[
\Bigl|\sum_{0<\gamma\le T}k_\gamma\gamma\Bigr| > \exp\bigl(-T^{1+\varepsilon}\bigr) \tag{15.24}
\]
whenever $T \ge T_0$ and the $k_\gamma$ are integers, not all 0, for which $|k_\gamma| \le K$. From
this they have shown that
\[
\limsup_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}(\log\log\log x)^2} \ge \frac{1}{2\pi}, \tag{15.25}
\]
and that
\[
\liminf_{x\to\infty}\frac{\psi(x)-x}{x^{1/2}(\log\log\log x)^2} \le \frac{-1}{2\pi}. \tag{15.26}
\]
In view of (13.48), it is plausible that equality holds in (15.25) and (15.26).

Let $L(x) = \sum_{n\le x}\lambda(n)$. It was conjectured by Pólya (1919) that L(x) ≤ 0
for all x ≥ 2, and it has been verified that this inequality holds for $2 \le x \le 10^6$.
Pólya's conjecture was disproved by Haselgrove (1958), whose extensive
computer calculations led to the conclusion that
\[
\limsup_{x\to\infty}\frac{L(x)}{x^{1/2}} > 0.
\]
Subsequently Lehman (1960) found that L(906,180,359) = 1.
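The verification of Pólya's inequality in a small range is a routine sieve computation. The sketch below is an illustration only, with a hypothetical function name; it computes λ(n) from a smallest-prime-factor table and checks L(x) ≤ 0 in a range far short of Lehman's counterexample.

```python
import math

def liouville_partial_sums(limit):
    """Return a list L with L[x] = sum of lambda(n) over 1 <= n <= x,
    where lambda is Liouville's function; requires limit >= 1."""
    # spf[n] will hold the smallest prime factor of n.
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    L = [0] * (limit + 1)
    L[1] = 1
    for n in range(2, limit + 1):
        # Removing one prime factor changes the parity of Omega(n).
        lam[n] = -lam[n // spf[n]]
        L[n] = L[n - 1] + lam[n]
    return L

if __name__ == "__main__":
    limit = 10**6
    L = liouville_partial_sums(limit)
    print("L(x) <= 0 for 2 <= x <=", limit, ":",
          all(L[x] <= 0 for x in range(2, limit + 1)))
```

A direct sieve of this kind needs memory proportional to the limit, so reaching the neighbourhood of 906,180,359 would call for a segmented variant, which is not attempted here.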

15.4 References
Anderson, R. J. (1991). On the Möbius sum function, Acta Arith. 59, 205–213.
Bateman, P. T., Brown, J. W., Hall, R. S., Kloss, K. E., Stemmler, R. M. (1971). Linear
relations connecting the imaginary parts of the zeros of the zeta function, Computers
in Number Theory. New York: Academic Press, pp. 11–19.
Bays, C. & Hudson, R. H. (2000). A new bound for the smallest x with π (x) > li(x),
Math. Comp. 69, 1285–1296.
Chebyshev, P. L. (1853). On a new theorem concerning prime numbers of the forms
4n + 1 and 4n + 3, Bull. Acad. Imp. Sci. St. Petersburg, Phys.-Mat. Kl. 11, 208;
Collected Works, Vol. 1. Moscow-Leningrad: Akad. Nauk SSSR.
Hardy, G. H. (1967). Collected Papers of G. H. Hardy, Vol. 2, Oxford: Clarendon Press.
Hardy, G. H. & Littlewood, J. E. (1918). Contributions to the theory of the Riemann
zeta-function and the theory of the distribution of primes, Acta Math. 41, 119–196;
Collected Papers, Vol. 2. Oxford: Clarendon Press, 1967, pp. 20–97.
Haselgrove, C. B. (1958). A disproof of a conjecture of Pólya, Mathematika 5, 141–145.
Ingham, A. E. (1936). A note on the distribution of primes, Acta Arith. 1, 201–211.
(1942). On two conjectures in the theory of numbers, Amer. J. Math. 64, 313–319.
Jurkat, W. B. (1973). On the Mertens Conjecture and Related General Ω-theorems,
Analytic Number Theory (St. Louis, 1972), Proc. Sympos. Pure Math. 24. Providence:
Amer. Math. Soc., pp. 147–158.
Kahane, J.-P. (1999). Un théorème de Littlewood pour les nombres premiers de Beurling,
Bull. London Math. Soc. 31, 424–430.
Knapowski, S. (1961). On sign-changes in the remainder-term in the prime-number
formula, J. London Math. Soc. 36, 451–460.

Landau, E. (1905). Über einen Satz von Tschebyscheff, Math. Ann. 61, 527–550;
Collected Works, Vol. 2. Essen: Thales Verlag, 1986, pp. 206–229; Commentary,
Collected Works, Vol. 3. pp. 72–75.
(1918a). Über einige ältere Vermutungen und Behauptungen in der Primzahlentheorie,
Math. Z. 1, 1–24; Collected Works, Vol. 6. Essen: Thales Verlag, 1986, pp. 469–492.
(1918b). Über einige ältere Vermutungen und Behauptungen in der Primzahlentheorie,
Zweite Abhandlung, Math. Z. 1, 213–219; Collected Works, Vol. 6. Essen: Thales
Verlag, 1986, pp. 506–512.
Lehman, R. S. (1960). On Liouville’s function, Math. Comp. 14, 311–320.
(1966). On the difference π (x) − li(x), Acta Arith. 11, 397–410.
Littlewood, J. E. (1914). Sur la distribution des nombres premiers, C. R. Acad. Sci. Paris
158, 1869–1872; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 829–832.
(1927). Mathematical notes (3): On a theorem concerning the distribution of prime
numbers, J. London Math. Soc. 2, 41–45; Collected Papers, Vol. 2. Oxford: Oxford
University Press, 1982, pp. 833–837.
(1937). Mathematical notes. XII.: An inequality for a sum of cosines, J. London Math.
Soc. 12, 217–221; Collected Papers, Vol. 2. Oxford: Oxford University Press, 1982,
pp. 838–842.
Mertens, F. (1897). Über eine zahlentheoretische Funktion, Sitz. Akad. Wiss. Wien 106,
761–830.
Monach, W. R. (1980). Numerical Investigation of Several Problems in Number Theory,
Doctoral Thesis. Ann Arbor: University of Michigan.
Odlyzko, A. M. & te Riele, H. J. J. (1984). Disproof of the Mertens conjecture, J. Reine
Angew. Math. 357, 138–160.
Pólya, G. (1919). Verschiedene Bemerkungen zur Zahlentheorie, Jahresbericht
Deutsche Math.–Ver. 28, 31–40.
te Riele, H. J. J. (1989). On the sign of the difference π(x) − lix, Math. Comp. 48,
323–328.
Schmidt, E. (1903). Über die Anzahl der Primzahlen unter gegebener Grenze, Math.
Ann. 57, 195–204.
Skewes, S. (1933). On the difference π(x) − lix, J. London Math. Soc. 8, 277–283.
(1955). On the difference π(x) − lix, II, Proc. London Math. Soc. (3) 5, 48–69.
Turán, P. (1946). On a theorem of Littlewood, J. London Math. Soc. 21, 268–275;
Collected Papers, Vol. 1. Budapest: Akad Kiadó, 1990, pp. 284–293.
(1948). On some approximative Dirichlet polynomials in the theory of the zeta-
function of Riemann, Danske Vid. Selsk. Mat.-Fys. Medd. 24, no. 17, 36 pp.;
Collected Papers, Vol. 1. Budapest: Akad Kiadó, 1990, pp. 369–402.
(1984). On a New Method of Analysis and its Applications, New York: Wiley-
Interscience.
Zygmund, A. (1959). Trigonometric Series, Vol. 1. Cambridge: Cambridge University
Press.
Appendix A
The Riemann–Stieltjes integral

We generalize the Riemann integral $\int_a^bf(x)\,dx$ by defining an integral
$\int_a^bf(x)\,dg(x)$ as a limit of Riemann sums $\sum_nf(\xi_n)\,\Delta g(x_n)$. More precisely,
for a < b suppose that we have a partition
\[
a = x_0 \le x_1 \le \cdots \le x_N = b. \tag{A.1}
\]
For $\xi_n$ in the interval $x_{n-1} \le \xi_n \le x_n$ we form the sum
\[
S(x_n,\xi_n) = \sum_{n=1}^{N}f(\xi_n)\bigl(g(x_n) - g(x_{n-1})\bigr).
\]
We say that the Riemann–Stieltjes integral $\int_a^bf(x)\,dg(x)$ exists and has the
value I if for every ε > 0 there is a δ > 0 such that
\[
|S(x_n,\xi_n) - I| < \varepsilon
\]
whenever the $x_n$ and the $\xi_n$ are as above and
\[
\operatorname{mesh}\{x_n\} = \max_{1\le n\le N}(x_n - x_{n-1}) \le \delta.
\]
The values taken on by f and g may be either real or complex. We do not
determine precisely the pairs (f, g) for which the Riemann–Stieltjes integral
exists. For our purposes it is enough to prove

Theorem A.1 The Riemann–Stieltjes integral $\int_a^bf(x)\,dg(x)$ exists if f is
continuous on [a, b] and g is of bounded variation on [a, b].

Proof We recall that by definition

\[ \operatorname{Var}_{[a,b]}(g) = \sup \sum_{n=1}^{N} |g(x_n) - g(x_{n-1})| \]

where the supremum is taken over all $\{x_n\}$ satisfying (A.1). Since $f$ is uniformly continuous on $[a, b]$, there is a $\delta > 0$ such that $|f(\xi) - f(\xi')| < \varepsilon$ whenever $|\xi - \xi'| \le \delta$. We show that

\[ |S(x_n, \xi_n) - S(x'_n, \xi'_n)| \le 2\varepsilon \operatorname{Var}_{[a,b]}(g) \tag{A.2} \]

provided that $\operatorname{mesh}\{x_n\} \le \delta$ and that $\operatorname{mesh}\{x'_n\} \le \delta$. This clearly suffices.

Suppose first that the partition $\{x_n\}$ is a subsequence of a second partitioning $\{x'_m\}$. Let $M(n) = \{m : x_{n-1} < x'_m \le x_n\}$. The sets $M(n)$ partition the set $\{1, 2, \ldots, M\}$, so we may write

\[ S(x_n, \xi_n) - S(x'_m, \xi'_m) = \sum_{n=1}^{N} \Bigl( f(\xi_n)\bigl(g(x_n) - g(x_{n-1})\bigr) - \sum_{m \in M(n)} f(\xi'_m)\bigl(g(x'_m) - g(x'_{m-1})\bigr) \Bigr). \]

Since the sequence $\{x_n\}$ is an increasing subsequence of the increasing sequence $\{x'_m\}$, it follows that

\[ g(x_n) - g(x_{n-1}) = \sum_{m \in M(n)} \bigl(g(x'_m) - g(x'_{m-1})\bigr). \]

On inserting this in the former expression, we find that it is

\[ \sum_{n=1}^{N} \sum_{m \in M(n)} \bigl(f(\xi_n) - f(\xi'_m)\bigr)\bigl(g(x'_m) - g(x'_{m-1})\bigr). \]

Since $|\xi_n - \xi'_m| \le \delta$ for $m \in M(n)$, it follows that

\[ |S(x_n, \xi_n) - S(x'_m, \xi'_m)| \le \varepsilon \sum_{n} \sum_{m \in M(n)} |g(x'_m) - g(x'_{m-1})| = \varepsilon \sum_{m=1}^{M} |g(x'_m) - g(x'_{m-1})| \le \varepsilon \operatorname{Var}_{[a,b]} g. \tag{A.3} \]

We now take $\{x''_m\}$ to be the union of $\{x_n\}$ and $\{x'_n\}$, so that both $\{x_n\}$ and $\{x'_n\}$ are subsequences of $\{x''_m\}$. Since

\[ |S(x_n, \xi_n) - S(x'_n, \xi'_n)| = |S(x_n, \xi_n) - S(x''_m, \xi''_m) + S(x''_m, \xi''_m) - S(x'_n, \xi'_n)| \le |S(x_n, \xi_n) - S(x''_m, \xi''_m)| + |S(x''_m, \xi''_m) - S(x'_n, \xi'_n)| \]

by the triangle inequality, the desired bound (A.2) follows by applying (A.3) twice.
The main negative feature of the Riemann–Stieltjes integral is that $\int_a^b f\,dg$ does not exist if $f$ and $g$ have a common discontinuity in $(a, b)$. However, if $f$ is continuous, the Riemann–Stieltjes integral enables us to express the sum $\sum_{n=1}^{N} a_n f(n)$ in terms of the unweighted partial sums $A(x) = \sum_{1 \le n \le x} a_n$. Indeed,

\[ \sum_{n=1}^{N} a_n f(n) = \int_0^N f(x)\,dA(x). \tag{A.4} \]

There is some freedom in the interval of integration, since the left endpoint can be any number in $[0, 1)$, and the right endpoint can be any number in $[N, N + 1)$ without affecting the value of the integral. Frequently it is useful to integrate from $1^-$ to $N$, i.e. to consider $\lim_{\varepsilon \to 0^+} \int_{1-\varepsilon}^{N}$. Some care must be exercised in choosing the endpoints of integration, since for example

\[ \int_1^N f(x)\,dA(x) = \sum_{n=2}^{N} a_n f(n). \]
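The identity (A.4) and the warning about endpoints can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `rs_sum`, the midpoint choice of tags, and the sample coefficients are all illustrative assumptions.

```python
import math

def rs_sum(f, g, a, b, steps):
    """Riemann-Stieltjes sum of f dg over a uniform partition of [a, b],
    with each tag xi_n taken at the midpoint of [x_{n-1}, x_n]."""
    total = 0.0
    x_prev = a
    for n in range(1, steps + 1):
        x_next = a + (b - a) * n / steps
        xi = 0.5 * (x_prev + x_next)
        total += f(xi) * (g(x_next) - g(x_prev))
        x_prev = x_next
    return total

coeffs = [2.0, -1.0, 3.0]  # hypothetical a_1, a_2, a_3

def A(x):
    """Unweighted partial sum A(x) = sum of a_n over 1 <= n <= x."""
    return sum(coeffs[: max(0, math.floor(x))])

def f(x):
    return x * x

# Over [0, 3] every jump of A is seen: a_1 f(1) + a_2 f(2) + a_3 f(3) = 25.
full = rs_sum(f, A, 0.0, 3.0, 3000)
# Over [1, 3] the jump at x = 1 sits on the left endpoint and is lost: 23.
tail = rs_sum(f, A, 1.0, 3.0, 2000)
```

Because f is continuous, the sums depend on the tags only up to an error that shrinks with the mesh, which is why the two values are only approximately 25 and 23.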
Theorem A.2 If $\int_a^b f\,dg$ exists, then $\int_a^b g\,df$ also exists, and

\[ \int_a^b g\,df = f(b)g(b) - f(a)g(a) - \int_a^b f\,dg. \]

As we see in the above, we lose no information by writing $\int_a^b f\,dg$ instead of the longer $\int_a^b f(x)\,dg(x)$. On combining Theorems A.1 and A.2 we see that $\int_a^b f\,dg$ exists if $f$ is of bounded variation on $[a, b]$ and $g$ is continuous on $[a, b]$.

Proof Put $\xi_0 = a$ and $\xi_{N+1} = b$. Then

\[ \sum_{n=1}^{N} g(\xi_n)\bigl(f(x_n) - f(x_{n-1})\bigr) = f(b)g(b) - f(a)g(a) - \sum_{n=1}^{N+1} f(x_{n-1})\bigl(g(\xi_n) - g(\xi_{n-1})\bigr). \]

Here the sum on the right-hand side is a Riemann–Stieltjes sum $S(\xi_n, x_{n-1})$ approximating $\int_a^b f\,dg$, since $x_{n-1} \in [\xi_{n-1}, \xi_n]$. Moreover, $\operatorname{mesh}\{\xi_n\} \le 2\operatorname{mesh}\{x_n\}$, so that the sum on the right tends to $\int_a^b f\,dg$ as $\operatorname{mesh}\{x_n\}$ tends to 0.

This proof displays the close relation between partial summation and integration by parts. Rather than sum the series $\sum a_n f(n)$ by parts, we can integrate by parts in (A.4) to see that

\[ \sum_{n=1}^{N} a_n f(n) = A(N) f(N) - \int_0^N A(x)\,df(x). \tag{A.5} \]
It is to be expected that if $g$ is differentiable, then $\int_a^b f\,dg$ should resemble $\int_a^b f g'\,dx$. In this direction we establish

Theorem A.3 If $g'$ is continuous on $[a, b]$, then

\[ \operatorname{Var}_{[a,b]} g = \int_a^b |g'(x)|\,dx. \]

If in addition $f$ is Riemann integrable, then

\[ \int_a^b f(x)\,dg(x) = \int_a^b f(x) g'(x)\,dx. \]
Proof By the mean value theorem there is a $\zeta_n \in [x_{n-1}, x_n]$ such that $g(x_n) - g(x_{n-1}) = g'(\zeta_n)(x_n - x_{n-1})$. Hence

\[ \sum_{n=1}^{N} |g(x_n) - g(x_{n-1})| = \sum_{n=1}^{N} |g'(\zeta_n)|(x_n - x_{n-1}), \]

which tends to $\int_a^b |g'|\,dx$ as $\operatorname{mesh}\{x_n\}$ tends to 0. Since $g'(x)$ is uniformly continuous on $[a, b]$, there is a $\delta > 0$ such that $|g'(\xi) - g'(\zeta)| < \varepsilon$ whenever $|\xi - \zeta| < \delta$. Clearly

\[ \sum_{n=1}^{N} f(\xi_n)\bigl(g(x_n) - g(x_{n-1})\bigr) = \sum_{n=1}^{N} f(\xi_n) g'(\zeta_n)(x_n - x_{n-1}) \]
\[ = \sum_{n=1}^{N} f(\xi_n) g'(\xi_n)(x_n - x_{n-1}) + \sum_{n=1}^{N} f(\xi_n)\bigl(g'(\zeta_n) - g'(\xi_n)\bigr)(x_n - x_{n-1}) = \Sigma_1 + \Sigma_2, \]

say. The function $f g'$ is Riemann integrable, and hence $\Sigma_1$ tends to $\int_a^b f g'\,dx$ as $\operatorname{mesh}\{x_n\}$ tends to 0. Suppose that $M$ is chosen so that $|f(x)| \le M$ for all $x \in [a, b]$. If $\operatorname{mesh}\{x_n\} < \delta$, then $|\Sigma_2| \le M\varepsilon(b - a)$. Hence $\int_a^b f\,dg$ exists and has the value $\int_a^b f g'\,dx$.

Continuing from (A.5), we see by Theorem A.3 that if $f'$ is continuous, then

\[ \sum_{n=1}^{N} a_n f(n) = A(N) f(N) - \int_0^N A(x) f'(x)\,dx. \tag{A.6} \]

This useful identity can be verified without mention of Riemann–Stieltjes integration, but its formulation and derivation are most natural through (A.4) and (A.5).
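The partial summation identity (A.6) can be confirmed in exact arithmetic. The sketch below is illustrative only: the function name, the coefficients, and the weight f are assumptions, and the integral is evaluated using the fact that A(x) is constant on each interval [m, m+1).

```python
from fractions import Fraction

def partial_summation_rhs(a, f):
    """Right-hand side of (A.6) for a = [a_1, ..., a_N].
    A(x) is constant on each [m, m+1), so the integral of A(x) f'(x)
    over [m, m+1] equals A(m) * (f(m+1) - f(m)) exactly."""
    N = len(a)
    A = [Fraction(0)]
    for term in a:
        A.append(A[-1] + term)  # A[m] = a_1 + ... + a_m
    integral = sum(A[m] * (f(m + 1) - f(m)) for m in range(N))
    return A[N] * f(N) - integral

a = [Fraction(n * n - 3) for n in range(1, 9)]  # hypothetical coefficients

def f(x):
    return Fraction(1, x + 1)  # hypothetical smooth weight, evaluated at integers

lhs = sum(a[n - 1] * f(n) for n in range(1, len(a) + 1))
rhs = partial_summation_rhs(a, f)
```

With rational inputs the two sides agree exactly, which is the discrete form of summation by parts discussed above.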
Suppose that $f$ is Riemann integrable. A version of the triangle inequality asserts that $\bigl|\int_a^b f\bigr| \le \int_a^b |f|$. We now derive an analogue of this for the Riemann–Stieltjes integral.

Theorem A.4 Suppose that $g$ has bounded variation, and put $g^*(x) = \operatorname{Var}_{[a,x]} g$. Then

\[ \Bigl| \int_a^b f(x)\,dg(x) \Bigr| \le \int_a^b |f(x)|\,dg^*(x) \]

provided that both integrals exist.

Proof Clearly

\[ |S(x_n, \xi_n)| \le \sum_{n=1}^{N} |f(\xi_n)|\,|g(x_n) - g(x_{n-1})| \le \sum_{n=1}^{N} |f(\xi_n)|\bigl(g^*(x_n) - g^*(x_{n-1})\bigr), \]

which gives the result.

The differential $dg^*$ is sometimes abbreviated $|dg|$. From Theorem A.4 we see that if $|f(x)| \le M$ for $a \le x \le b$ and $g$ is of bounded variation, then

\[ \Bigl| \int_a^b f(x)\,dg(x) \Bigr| \le M \operatorname{Var}_{[a,b]} g \tag{A.7} \]

provided that the integral exists. As with Riemann integrals, we set $\int_a^a f\,dg = 0$. If $a > b$ we set $\int_a^b f\,dg = -\int_b^a f\,dg$, so that $\int_a^c + \int_c^b = \int_a^b$ for any real numbers $a, b, c$. Finally, improper Riemann–Stieltjes integrals are defined as limits of proper integrals, e.g.

\[ \int_a^\infty f(x)\,dg(x) = \lim_{b \to \infty} \int_a^b f(x)\,dg(x). \]

Exercises
1. Suppose that $\varphi(t)$ is continuous and strictly increasing for $\alpha \le t \le \beta$, and that $\varphi(\alpha) = a$, $\varphi(\beta) = b$. Put $F(t) = f(\varphi(t))$, $G(t) = g(\varphi(t))$. Show that
\[ \int_a^b f(x)\,dg(x) = \int_\alpha^\beta F(t)\,dG(t) \]
provided that either integral exists.

2. Let $f$ and $g$ be continuous, and $h$ have bounded variation. Put $I(x) = \int_a^x g\,dh$. Show that
\[ \int_a^b f(x) g(x)\,dh(x) = \int_a^b f(x)\,dI(x). \]

3. The proof of Theorem A.2 depends on summation by parts. We now show that, conversely, summation by parts can be recovered from Theorem A.2. Suppose that the numbers $a_1, \ldots, a_N$ and $b_1, \ldots, b_N$ are given. Put $A_n = a_1 + \cdots + a_n$ for $1 \le n \le N$. For $1 \le x < N + 1$ put $A(x) = A_{[x]}$; set $A(x) = 0$ for $x < 1$. For $1/2 \le x \le N + 1/2$ let $B(x) = b_{[x+1/2]}$. (The discontinuities of $B(x)$ are displaced in order to ensure that $A(x)$ and $B(x)$ do not have a common discontinuity.)
(a) Show that
\[ \sum_{n=1}^{N} a_n b_n = \int_{1^-}^{N} B(x)\,dA(x). \]
(b) Show that
\[ \sum_{n=1}^{N-1} A_n (b_n - b_{n+1}) = -\int_{1^-}^{N} A(x)\,dB(x). \]
(c) Use Theorem A.2 to derive Abel's lemma:
\[ \sum_{n=1}^{N} a_n b_n = A_N b_N + \sum_{n=1}^{N-1} A_n (b_n - b_{n+1}). \]

4. Show that
\[ \Bigl| \int_a^b f g\,dh \Bigr|^2 \le \Bigl( \int_a^b |f|^2\,|dh| \Bigr) \Bigl( \int_a^b |g|^2\,|dh| \Bigr) \]
provided that these integrals exist.

5. Suppose that $f$ is non-negative and decreasing, that $g(a) = h(a)$, and that $g(x) \le h(x)$ for $a \le x \le b$. Show that
\[ \int_a^b f\,dg \le \int_a^b f\,dh \]
provided that these integrals exist.

6. (First mean value theorem) Suppose that $f$ and $g$ are real-valued functions with $f$ continuous on $[a, b]$, and $g$ weakly increasing on this interval. Put $m = \min_{x \in [a,b]} f(x)$, $M = \max_{x \in [a,b]} f(x)$.
(a) Show that
\[ m\bigl(g(b) - g(a)\bigr) \le \int_a^b f\,dg \le M\bigl(g(b) - g(a)\bigr). \]
(b) Show that there is an $x_0 \in [a, b]$ such that
\[ \int_a^b f\,dg = f(x_0)\bigl(g(b) - g(a)\bigr). \]

7. (Second mean value theorem) Suppose that $f$ and $g$ are real-valued functions with $f$ weakly increasing on $[a, b]$, and $g$ continuous on this interval. Show that there is an $x_0 \in [a, b]$ such that
\[ \int_a^b f\,dg = f(a)\bigl(g(x_0) - g(a)\bigr) + f(b)\bigl(g(b) - g(x_0)\bigr). \]

8. (Darst & Pollard 1970) Suppose that $f$ and $g$ are real-valued functions with $f$ of bounded variation on $[a, b]$, and $g$ continuous on this interval.
(a) Show that if $\xi \in [a, b]$ and $f(\xi) = 0$, then
\[ \int_\xi^b f\,dg \le \operatorname{Var}_{[\xi,b]}(f) \max_{\xi \le x \le b} \bigl(g(b) - g(x)\bigr), \]
\[ \int_a^\xi f\,dg \le \operatorname{Var}_{[a,\xi]}(f) \max_{a \le x \le \xi} \bigl(g(x) - g(a)\bigr). \]
(b) Show that if $\inf_{a \le x \le b} f(x) = 0$, then
\[ \int_a^b f\,dg \le \operatorname{Var}_{[a,b]}(f) \max_{a \le \alpha \le \beta \le b} \bigl(g(\beta) - g(\alpha)\bigr). \]
(c) Show that in general,
\[ \int_a^b f\,dg \le \bigl(g(b) - g(a)\bigr) \inf_{a \le x \le b} f(x) + \operatorname{Var}_{[a,b]}(f) \max_{a \le \alpha \le \beta \le b} \bigl(g(\beta) - g(\alpha)\bigr). \]

9. Suppose that
\[ f(x) = \begin{cases} 1 & \text{if } 0 < x \le 1, \\ 0 & \text{otherwise;} \end{cases} \qquad g(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1, \\ 0 & \text{otherwise.} \end{cases} \]
Show that $\int_{-1}^{0} f\,dg$ and $\int_0^1 f\,dg$ both exist, but that $\int_{-1}^{1} f\,dg$ does not exist.

A.1 Notes
Our treatment follows that of Ingham in his lectures at Cambridge University.
Several variants of the Riemann–Stieltjes (R-S) integral have been proposed.
The integral as we have defined it is known as the uniform Riemann–Stieltjes
integral. A slightly more powerful variant is the refinement Riemann–Stieltjes integral, in which $\int_a^b f\,dg$ is said to have the value $I$ if for every $\varepsilon > 0$ there is a partition $\{x_n\}$ such that if $\{x'_m\}$ is a refinement of $\{x_n\}$, then $|S(x'_m, \xi'_m) - I| < \varepsilon$ for all choices of $\xi'_m \in [x'_{m-1}, x'_m]$. The refinement Riemann–Stieltjes integral is developed in considerable detail by Apostol (1974, Chapter 9) and Bartle (1964, Section 22), and is used by Bateman & Diamond (2004). If $\int_a^b f\,dg$ exists in the sense of uniform R–S integration, then it also exists in the refinement R–S sense, and has the same value. The refinement integral has the attractive property that if $a < b < c$, and if $\int_a^b f\,dg$, $\int_b^c f\,dg$ both exist, then $\int_a^c f\,dg$ exists and

\[ \int_a^c f\,dg = \int_a^b f\,dg + \int_b^c f\,dg. \]

This is not true for the uniform R–S integral, as we see by the example in
Exercise A.9.
We mention without proof two more advanced properties of the Riemann–Stieltjes integral: If $f$ is continuous on $[a, b]$, and if $g$ is absolutely continuous on the same interval, then

\[ \int_a^b f\,dg = \int_a^b f g' \]

where the integral on the right is a Lebesgue integral. Secondly, the Riesz representation theorem, which is fundamental to functional analysis, asserts that if $G$ is a positive bounded linear functional on the space $C[a, b]$ of continuous functions on $[a, b]$, then there exists a weakly increasing function $g$ on $[a, b]$ such that

\[ G(f) = \int_a^b f\,dg \]

for all f ∈ C[a, b]. An account of this is given in Kestelman (1960, pp. 265–
269).
For more extensive accounts of Riemann–Stieltjes integration, see Apostol
(1974, Chapter 9), Hildebrandt (1938), Kestelman (1960, Chapter 11), Rankin
(1963, Section 29), or Widder (1946, Chapter 1).

A.2 References
Apostol, T. M. (1974). Mathematical Analysis, Second edition. Menlo Park: Addison–
Wesley.
Bartle, R. G. (1964). The Elements of Real Analysis. New York: Wiley.
Bateman, P. T. & Diamond, H. G. (2004). Analytic number theory. An introductory
course, Hackensack: World Scientiﬁc.
Darst, R. & Pollard, H. (1970). An inequality for the Riemann–Stieltjes integral, Proc.
Amer. Math. Soc. 25, 912–913.
Hildebrandt, T. H. (1938). Stieltjes integrals of the Riemann type, Amer. Math. Monthly
45, 265–277.
Kestelman, H. (1960). Modern Theories of Integration. New York: Dover.
Rankin, R. A. (1963). An Introduction to Mathematical Analysis. Oxford: Pergamon.
Widder, D. V. (1946). The Laplace Transform, Princeton: Princeton University
Press.
Appendix B
Bernoulli numbers and the Euler–Maclaurin
summation formula

Suppose that $f$ is a continuous function on an interval $[a, b]$. Then by Theorem A.1,

\[ \sum_{a < n \le b} f(n) = \int_a^b f(x)\,d[x] = \int_a^b f(x)\,dx - \int_a^b f(x)\,d\{x\}, \]

since $[x] = x - \{x\}$. On integrating the last integral by parts (recall Theorem A.2), we find that the right-hand side above is

\[ \int_a^b f(x)\,dx - f(b)\{b\} + f(a)\{a\} + \int_a^b \{x\}\,df(x). \]

The familiar 'integral test' is an immediate corollary of this identity, and indeed the last term on the right gives an explicit representation of the difference between $\sum f(n)$ and $\int f(x)\,dx$. If $f$ has a continuous first derivative then (by Theorem A.3) we may replace $df(x)$ by $f'(x)\,dx$ in the last integral, so that

\[ \sum_{a < n \le b} f(n) = \int_a^b f(x)\,dx - f(b)\{b\} + f(a)\{a\} + \int_a^b \{x\} f'(x)\,dx. \tag{B.1} \]

Of course this elementary identity can be verified easily without reference to Riemann–Stieltjes integration. If $f$ has derivatives of higher order, then the last integral may be repeatedly integrated by parts. In order to systematize this we introduce the Bernoulli polynomials.
We define the Bernoulli polynomials $B_k(x)$ inductively. We begin by setting

\[ B_0(x) = 1. \tag{B.2} \]

If $B_{k-1}(x)$ is given, then $B_k(x)$ is determined, apart from its constant term, by the differential equation

\[ \frac{d}{dx} B_k(x) = k B_{k-1}(x) \qquad (k \ge 1). \tag{B.3} \]

The Bernoulli number $B_k$ is the constant term of $B_k(x)$. Its value is determined by the condition

\[ \int_0^1 B_k(x)\,dx = 0 \qquad (k \ge 1). \tag{B.4} \]

From (B.2) and (B.3) we see that $B_1(x) = x + B_1$, and from (B.4) we deduce that $B_1 = -1/2$. Hence $B_2(x) = x^2 - x + B_2$, and then we find that $B_2 = 1/6$. These polynomials and numbers have many significant properties, a few of which we now investigate.
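The inductive definition (B.2)–(B.4) translates directly into a computation with exact rational coefficients. This is a minimal sketch under the assumption that a polynomial is stored as a list of coefficients, lowest degree first; the function name is illustrative.

```python
from fractions import Fraction

def bernoulli_polynomials(K):
    """Coefficient lists [c_0, c_1, ...] (c_j multiplies x**j) of B_0, ..., B_K,
    built from (B.2), (B.3) and (B.4)."""
    polys = [[Fraction(1)]]  # (B.2): B_0(x) = 1
    for k in range(1, K + 1):
        prev = polys[-1]
        # (B.3): integrate k * B_{k-1}(x) term by term; constant term not yet fixed.
        body = [Fraction(0)] + [k * c / (j + 1) for j, c in enumerate(prev)]
        # (B.4): choose the constant so that the integral over [0, 1] vanishes.
        mean = sum(c / (j + 1) for j, c in enumerate(body))
        body[0] = -mean
        polys.append(body)
    return polys

polys = bernoulli_polynomials(4)
```

The constant term of each list is the Bernoulli number $B_k$, so `polys[1]` and `polys[2]` reproduce $B_1(x) = x - 1/2$ and $B_2(x) = x^2 - x + 1/6$.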

Figure B.1 The Bernoulli polynomials $B_k(x)$ for $k = 0, \ldots, 4$ and $-1 \le x \le 2$.

By using (B.3) inductively it is evident that

\[ B_k(x) = \sum_{j=0}^{k} \binom{k}{j} x^j B_{k-j} \qquad (k \ge 0). \tag{B.5} \]

In view of (B.3), the integral (B.4) is $(B_{k+1}(1) - B_{k+1}(0))/(k + 1)$. Thus (B.4) is equivalent to the assertion that

\[ B_k(0) = B_k(1) \qquad (k \ge 2). \tag{B.6} \]

By taking $x = 1$ in (B.5) it then follows that

\[ B_k = \sum_{j=0}^{k} \binom{k}{j} B_{k-j} \qquad (k \ge 2). \tag{B.7} \]

After subtracting $B_k$ from both sides, this identity provides a formula for $B_{k-1}$ in terms of $B_0, B_1, \ldots, B_{k-2}$.
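The recurrence (B.7) gives a quick way to tabulate Bernoulli numbers. The sketch below is illustrative (the function name is an assumption): after cancelling $B_k$ from both sides of (B.7) and writing $m = k - 1$, the term with $j = 1$ is solved for $B_m$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(K):
    """B_0, ..., B_K from B_0 = 1 and the recurrence (B.7).
    Cancelling B_k in (B.7) and solving for B_{k-1} gives, with m = k - 1,
    B_m = -(1/(m+1)) * sum over i < m of C(m+1, i) * B_i."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        s = sum(comb(m + 1, i) * B[i] for i in range(m))
        B.append(-s / (m + 1))
    return B

B = bernoulli_numbers(20)
```

The entries of `B` with even index, together with `B[1]`, can be compared against the values listed in Table B.1.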
Next we determine a power series generating function for the $B_k$. The function $z/(e^z - 1)$ is analytic except at the points $z = 2\pi k i$, $k \ne 0$. In particular, this function is analytic in the disc $|z| < 2\pi$, and we may write its power series in the form

\[ \frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{c_k}{k!} z^k. \]

After multiplying both sides by $e^z - 1$ and equating power series coefficients, we see not only that $c_0 = 1$ but also that the $c_k$ satisfy the recurrence (B.7). Consequently $c_k = B_k$ for all $k$. That is,

\[ \frac{z}{e^z - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!} z^k \qquad (|z| < 2\pi). \tag{B.8} \]
Theorem B.1 If $k$ is odd, then

\[ B_k = 0 \qquad (k \ge 3), \tag{B.9} \]
\[ B_k(x) = -B_k(1 - x) \qquad (k \ge 1), \tag{B.10} \]
\[ \operatorname{sgn} B_k(x) = (-1)^{(k+1)/2} \qquad (k \ge 1,\ 0 < x < 1/2). \tag{B.11} \]

If $k$ is even, then

\[ (-1)^{k/2} B_k(x) \uparrow \qquad (k \ge 2,\ 0 < x < 1/2), \tag{B.12} \]
\[ B_k(x) = B_k(1 - x) \qquad (k \ge 0), \tag{B.13} \]
\[ \operatorname{sgn} B_k = (-1)^{(k/2)+1} \qquad (k \ge 2). \tag{B.14} \]

From (B.10) and (B.13) we see that $B_k(x + 1/2)$ is an odd function for odd $k$, and an even function for even $k$. From (B.10) it follows that the sign is reversed in (B.11) if the interval $0 < x < 1/2$ is replaced by $1/2 < x < 1$, and similarly from (B.12) and (B.13) we see that $(-1)^{k/2} B_k(x)$ is strictly decreasing for $1/2 \le x \le 1$ when $k$ is even, $k \ge 2$. Such properties are evident in the graphs of Figure B.1.

Proof These assertions are evident for $k = 0, 1, 2$. We proceed by induction.
Case 1. $k$ odd. We integrate by parts in (B.4) and use (B.3) to see that

\[ 0 = B_k - k \int_0^1 x B_{k-1}(x)\,dx. \]
Table B.1

   k    B_k
   0    1/1            =     1.00000 00000
   1    −1/2           =    −0.50000 00000
   2    1/6            =     0.16666 66667
   4    −1/30          =    −0.03333 33333
   6    1/42           =     0.02380 95238
   8    −1/30          =    −0.03333 33333
  10    5/66           =     0.07575 75758
  12    −691/2730      =    −0.25311 35531
  14    7/6            =     1.16666 66667
  16    −3617/510      =    −7.09215 68627
  18    43867/798      =    54.97117 79449
  20    −174611/330    =  −529.12424 24242

From (B.13)$_{k-1}$ we see that this integral is $\frac{1}{2} \int_0^1 B_{k-1}$. By (B.4) this integral vanishes, so we have (B.9). To prove (B.10), let

\[ f_k(x) = B_k(x) + B_k(1 - x). \]

Then (B.3) gives $f_k'(x) = k(B_{k-1}(x) - B_{k-1}(1 - x))$, which vanishes by (B.13)$_{k-1}$. Thus $f_k(x)$ is a constant. To determine its value we note that by (B.6) and (B.9), $f_k(0) = 2B_k = 0$. Thus we have (B.10). To prove (B.11) we first note that $B_k(0) = B_k(1/2) = 0$ by (B.9) and (B.10). Suppose that $k \equiv 1 \pmod 4$. It now suffices to show that $B_k(x)$ is convex for $0 < x < 1/2$. But this follows from (B.3) and (B.12)$_{k-1}$. If $k \equiv 3 \pmod 4$, then $B_k(x)$ is concave for $0 < x < 1/2$, and (B.11) again follows.
Case 2. $k$ even. The assertion (B.12) is immediate from (B.3) and (B.11)$_{k-1}$. To prove (B.13), take

\[ g_k(x) = B_k(x) - B_k(1 - x). \]

Then by (B.3) we have $g_k'(x) = k f_{k-1}(x) = 0$ by (B.10)$_{k-1}$. Thus $g_k(x)$ is a constant. But $g_k(0) = 0$ by (B.6). To prove (B.14) we note by (B.4) and (B.13) that

\[ \int_0^{1/2} B_k(x)\,dx = 0. \]

From this and (B.12) it follows that $(-1)^{k/2} B_k(0) < 0$, $(-1)^{k/2} B_k(1/2) > 0$. Thus we have (B.14), and the proof is complete.

The first Bernoulli numbers are easily calculated; in Table B.1 we display only the non-zero values.
For even $k$, the identity (B.13) contains (B.6) as a special case. For odd $k$, (B.6) is similarly contained in (B.10), in view of (B.9). The identity (B.6) can be generalized in other ways. For example,

\[ \frac{B_{k+1}(x + 1) - B_{k+1}(x)}{k + 1} = x^k \qquad (k \ge 0). \tag{B.15} \]

This is obvious for $k = 0$; to prove this for larger $k$ we argue by induction. By the inductive hypothesis we see that the derivatives of the two sides are equal. Thus the two sides differ by at most a constant. We set $x = 0$ and use (B.6) to see that this constant is 0.
Suppose that $a$ and $b$ are integers with $a < b$. In (B.15) we let $x$ take on the values $a, a + 1, \ldots, b$, and sum, to obtain the important corollary

\[ \sum_{n=a}^{b} n^k = \frac{B_{k+1}(b + 1) - B_{k+1}(a)}{k + 1} \qquad (k \ge 0). \tag{B.16} \]

Apart from the value of the constant term, there can be at most one polynomial with this property. Hence this identity provides a further characterization of the polynomials $B_k(x)$.
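The corollary (B.16) can be tested against direct summation. In this sketch the helper names are illustrative assumptions; Bernoulli numbers come from the recurrence (B.7) and the polynomials are evaluated through (B.5).

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(K):
    """B_0, ..., B_K via the recurrence (B.7)."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, i) * B[i] for i in range(m)) / (m + 1))
    return B

def bernoulli_poly(k, x, B):
    """B_k(x) evaluated through the expansion (B.5)."""
    return sum(comb(k, j) * Fraction(x) ** j * B[k - j] for j in range(k + 1))

def power_sum(k, a, b):
    """Sum of n**k for integers a <= n <= b, using (B.16)."""
    B = bernoulli_numbers(k + 1)
    return (bernoulli_poly(k + 1, b + 1, B) - bernoulli_poly(k + 1, a, B)) / (k + 1)

cubes = power_sum(3, 1, 10)  # 1^3 + ... + 10^3
```

The formula also holds when $a$ is negative, since (B.15) is a polynomial identity in $x$.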
When (B.1) is integrated by parts repeatedly the functions $B_k(\{x\})$ arise. Since these latter functions have period 1, it is natural to consider their expansions in Fourier series. In general, if $f$ has period 1 we define the Fourier coefficient $\hat{f}(m)$ by the formula

\[ \hat{f}(m) = \int_0^1 f(x) e(-mx)\,dx \]

where $e(\theta) = e^{2\pi i \theta}$. From (B.4) we see that $\hat{B}_k(0) = 0$ for all $k \ge 1$. By integrating by parts we find that if $m \ne 0$, then $\hat{B}_1(m) = -1/(2\pi i m)$. If $F$ has period 1 and $F' = f \in L^1(\mathbb{T})$, then $\hat{F}(m) = \hat{f}(m)/(2\pi i m)$ for $m \ne 0$. Hence by (B.3) we see that $\hat{B}_k(m) = k \hat{B}_{k-1}(m)/(2\pi i m)$ and hence that $\hat{B}_k(m) = -k!/(2\pi i m)^k$ for $m \ne 0$. Now $B_1(\{x\})$ has a jump discontinuity at the integers, but since it has bounded variation on $[0, 1]$ the symmetric partial sums of its Fourier series will converge to $B_1(\{x\})$ when $x$ is not an integer. For $k > 1$ the function $B_k(\{x\})$ is continuous and its Fourier series is absolutely convergent, so the series converges uniformly to $B_k(\{x\})$. Thus we have proved

Theorem B.2 If $x \notin \mathbb{Z}$, then

\[ B_1(\{x\}) = -\frac{1}{\pi} \sum_{m=1}^{\infty} \frac{1}{m} \sin 2\pi m x. \tag{B.17} \]

If $k > 1$, then

\[ B_k(\{x\}) = -k! \sum_{m \ne 0} (2\pi i m)^{-k} e(mx) \tag{B.18} \]

uniformly in $x$.
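For $k > 1$ the expansion (B.18) converges fast enough to be checked numerically. The following sketch is an illustration only: the function name and the truncation level `M` are assumptions, and the comparison values use the explicit polynomials $B_2(x) = x^2 - x + 1/6$ and $B_3(x) = x^3 - \tfrac{3}{2}x^2 + \tfrac{1}{2}x$.

```python
import cmath
import math

def fourier_bernoulli(k, x, M):
    """Truncation of the series (B.18) to the terms with 0 < |m| <= M."""
    total = 0j
    for m in range(1, M + 1):
        for mm in (m, -m):
            total += (2j * math.pi * mm) ** (-k) * cmath.exp(2j * math.pi * mm * x)
    return (-math.factorial(k) * total).real

def B2(x):
    return x * x - x + 1.0 / 6.0

def B3(x):
    return x ** 3 - 1.5 * x * x + 0.5 * x

approx2 = fourier_bernoulli(2, 0.3, 2000)
approx3 = fourier_bernoulli(3, 0.3, 200)
shifted = fourier_bernoulli(2, 1.3, 2000)  # same value, since {1.3} = 0.3
```

The truncation error for $k = 2$ is of order $1/M$, so a few thousand terms give only about four decimal places; larger $k$ converges much faster.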
A self-contained proof of (B.17), with particular attention to the rate of convergence, is given in Appendix D.1. Since only the defining properties (B.3) and (B.4) were used in deriving the above, these formulæ provide a second means of proving the earlier assertions (B.6), (B.9), (B.10), (B.13), (B.14). These formulæ have many applications. For example, we may take $x = 0$ in (B.18) to obtain

Corollary B.3 For any integer $k \ge 1$,

\[ \zeta(2k) = (-1)^{k-1} 2^{2k-1} \pi^{2k} B_{2k}/(2k)!. \tag{B.19} \]

Hence $\zeta(2) = \pi^2/6$, $\zeta(4) = \pi^4/90$, $\zeta(6) = \pi^6/945$, and in general $\zeta(2k)$ is a rational multiple of $\pi^{2k}$.
Since $1 < \zeta(2k) < 1 + 2^{2-2k}$ for $k \ge 1$, this gives not only the sign of $B_{2k}$ but also a very precise estimate of its size, namely

\[ 2(2k)!(2\pi)^{-2k} < |B_{2k}| < 2(2k)!(2\pi)^{-2k}(1 + 2^{2-2k}) \qquad (k \ge 1). \tag{B.20} \]

We may similarly derive from Theorem B.2 an estimate for the Bernoulli polynomials in the interval $0 \le x \le 1$.

Corollary B.4 Suppose that $0 \le x \le 1$. Then $|B_1(x)| \le 1/2$, and

\[ |B_k(x)| \le k!\,2^{1-k} \pi^{-k} \zeta(k) \qquad (k \ge 2). \tag{B.21} \]

If $k$ is even, then this takes the simpler form $|B_k(x)| \le |B_k|$, and equality is achieved when $x = 0$ or 1. For odd $k \ge 3$ the inequality can be improved slightly (see Exercise B.5(e)).
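The evaluation (B.19) and the two-sided bound (B.20) are easy to confirm in floating point. This is a sketch with illustrative helper names; the Bernoulli numbers are generated from the recurrence (B.7).

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(K):
    """B_0, ..., B_K via the recurrence (B.7)."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, i) * B[i] for i in range(m)) / (m + 1))
    return B

B = bernoulli_numbers(20)

def zeta_even(k):
    """zeta(2k) computed from (B.19)."""
    return (-1) ** (k - 1) * 2 ** (2 * k - 1) * pi ** (2 * k) * float(B[2 * k]) / factorial(2 * k)

def b20_bounds(k):
    """Lower and upper bounds for |B_{2k}| from (B.20)."""
    low = 2 * factorial(2 * k) * (2 * pi) ** (-2 * k)
    return low, low * (1 + 2 ** (2 - 2 * k))

direct_zeta6 = sum(n ** -6 for n in range(1, 2000))  # partial sum of the defining series
```

For $k = 1$ the bounds are loose (about 0.101 and 0.203 around $B_2 = 1/6$), but they tighten geometrically as $k$ grows.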
We are now in a position to formulate the Euler–Maclaurin summation formula.

Theorem B.5 (Euler–Maclaurin) Suppose that $K$ is a positive integer and that $f$ has continuous derivatives through the $K$th order on the interval $[a, b]$ where $a$ and $b$ are real numbers with $a < b$. Then

\[ \sum_{a < n \le b} f(n) = \int_a^b f(x)\,dx + \sum_{k=1}^{K} \frac{(-1)^k}{k!} \Bigl( B_k(\{b\}) f^{(k-1)}(b) - B_k(\{a\}) f^{(k-1)}(a) \Bigr) - \frac{(-1)^K}{K!} \int_a^b B_K(\{x\}) f^{(K)}(x)\,dx. \]

In most applications the last term is treated as an error term that is only crudely bounded. For example, by Corollary B.4 above we see that the modulus of this term does not exceed

\[ \frac{2\zeta(K)}{(2\pi)^K} \int_a^b |f^{(K)}(x)|\,dx. \tag{B.22} \]

Further observations concerning this term are derived in Exercise B.16.

Proof We induct on $K$. The identity (B.1) gives the case $K = 1$. From (B.4), and then (B.3), we see that

\[ \int_0^x B_K(\{u\})\,du = \int_0^{\{x\}} B_K(u)\,du = \frac{B_{K+1}(\{x\}) - B_{K+1}}{K + 1}. \]

Hence by integrating by parts we find that the last integral in Theorem B.5 is

\[ \frac{1}{K + 1} \Bigl( B_{K+1}(\{b\}) f^{(K)}(b) - B_{K+1}(\{a\}) f^{(K)}(a) \Bigr) - \frac{1}{K + 1} \int_a^b B_{K+1}(\{x\}) f^{(K+1)}(x)\,dx, \]

which gives the inductive step.

The Euler–Maclaurin formula provides a means of deriving useful identities and asymptotic estimates, and it is also important in numerical calculations. We now use Theorem B.5 to derive some interesting formulæ for $\zeta(s)$. We assume initially that $\sigma > 1$, and take $f(x) = x^{-s}$. Then $f^{(k)}(x) = k! \binom{-s}{k} x^{-s-k}$, and on taking $a = 1$ and letting $b$ tend to infinity we find that

\[ \zeta(s) = 1 + \frac{1}{s - 1} - \sum_{k=1}^{K} (-1)^k \binom{-s}{k-1} \frac{B_k}{k} - (-1)^K \binom{-s}{K} \int_1^\infty B_K(\{x\}) x^{-s-K}\,dx. \tag{B.23} \]

Here the second term has a pole at $s = 1$, but the integral converges for $\sigma > 1 - K$, and hence this formula provides an analytic continuation of $\zeta(s)$ into this larger half-plane. Since $K$ can be taken arbitrarily large, it follows that $\zeta(s)$ is analytic in the entire plane, apart from the pole at $s = 1$. Moreover, the factor $\binom{-s}{K}$ has zeros at $s = 0, s = -1, \ldots, s = 1 - K$, and so the last term vanishes when $s$ is a non-positive integer and $K$ is sufficiently large. Let $n$ denote a non-negative integer, and set $s = -n$. If $K \ge n + 2$, then we find that

\[ \zeta(-n) = 1 - \frac{1}{n + 1} - \sum_{k=1}^{K} (-1)^k \binom{n}{k-1} \frac{B_k}{k}. \]

Here the sum may be restricted to $1 \le k \le n + 1$, since the binomial coefficient vanishes when $k > n + 1$. Thus we obtain an expression for $\zeta(-n)$ that is
independent of $K$. Since there are only finitely many terms on the right-hand side above, and since each term is rational, it is at once clear that $\zeta(-n)$ is a rational number. However, by making use of the properties of Bernoulli polynomials we can make this more precise. First we use the identity $(n + 1)\binom{n}{k-1} = k\binom{n+1}{k}$, and then we observe that the second term on the right supplies an amount that would arise if we allowed $k = 0$ in the sum. Thus we see that

\[ \zeta(-n) = 1 - \frac{1}{n + 1} \sum_{k=0}^{n+1} (-1)^k \binom{n+1}{k} B_k. \]

By taking $x = -1$ in (B.5), we see that the above is

\[ = 1 + \frac{(-1)^n}{n + 1} B_{n+1}(-1). \]

By taking $x = -1$ in (B.15) we see that $B_{n+1}(-1) = B_{n+1} - (-1)^n (n + 1)$. Hence we conclude that

\[ \zeta(-n) = (-1)^n \frac{B_{n+1}}{n + 1}. \]

In conjunction with the values provided by Theorem B.1, this may be formulated as follows.

Theorem B.6 Apart from a simple pole at $s = 1$, the zeta function is analytic in the complex plane. Moreover, $\zeta(0) = -1/2$, $\zeta(-2n) = 0$ for $n = 1, 2, \ldots$, and $\zeta(1 - 2n) = -B_{2n}/(2n)$ for $n = 1, 2, \ldots$.
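The passage from the finite sum for $\zeta(-n)$ to the closed form $(-1)^n B_{n+1}/(n+1)$ can be confirmed exactly for small $n$. The helper names in this sketch are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(K):
    """B_0, ..., B_K via the recurrence (B.7)."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, i) * B[i] for i in range(m)) / (m + 1))
    return B

B = bernoulli_numbers(30)

def zeta_neg_from_sum(n):
    """zeta(-n) = 1 - 1/(n+1) - sum_{k=1}^{n+1} (-1)^k C(n, k-1) B_k / k."""
    s = sum((-1) ** k * comb(n, k - 1) * B[k] / k for k in range(1, n + 2))
    return 1 - Fraction(1, n + 1) - s

def zeta_neg_closed(n):
    """zeta(-n) = (-1)^n B_{n+1} / (n+1)."""
    return (-1) ** n * B[n + 1] / (n + 1)

values = [zeta_neg_closed(n) for n in range(0, 4)]  # zeta(0), zeta(-1), zeta(-2), zeta(-3)
```

The first few values are $-1/2$, $-1/12$, $0$ and $1/120$, in agreement with Theorem B.6.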
The functional equation of the zeta function (Corollary 10.3) relates $\zeta(s)$ to $\zeta(1 - s)$, so that for many purposes it suffices to consider $\zeta(s)$ for $\sigma \ge 1/2$. In this half-plane, the formula (B.23) is not very useful, since the terms in the sum are far larger than $\zeta(s)$ when $|s|$ is large. This is due to the fact that in our application of the Euler–Maclaurin summation formula, the numbers $f^{(k)}(1)$ increase rapidly in size with $k$. It is in situations in which the values $f^{(k)}(x)$ decrease rapidly in size as $k$ increases that the Euler–Maclaurin formula provides accurate estimates. With this in mind we break the defining series $\sum n^{-s}$ into two ranges, $n \le N$ and $n > N$, and apply the sum formula only in the second range. Taking $a = N$ and letting $b$ tend to infinity, we find that

\[ \zeta(s) = \sum_{n=1}^{N} n^{-s} + \frac{N^{1-s}}{s - 1} + N^{-s} \sum_{k=1}^{K} \binom{s+k-2}{k-1} B_k N^{-k+1}/k - \binom{s+K-1}{K} \int_N^\infty B_K(\{x\}) x^{-s-K}\,dx. \tag{B.24} \]

The initial derivation of this is carried out under the assumption that $\sigma > 1$, but then one sees that the above provides a valid formula for $\zeta(s)$ throughout the half-plane $\sigma > 1 - K$. The earlier formula (B.23) is recovered by taking $N = 1$. The above formula is useful even in the half-plane $\sigma > 1$, in which the defining series of $\zeta(s)$ is absolutely convergent. Suppose, for example, that we wish to estimate $\zeta(3/2)$ to within $10^{-10}$. If we were to use only the defining series, it would be necessary to sum the first $4 \cdot 10^{20}$ terms. In contrast to this, if we take $s = 3/2$, $N = 5$, $K = 15$ in (B.24), then by (B.22) we find that the last term has modulus $< 0.5 \cdot 10^{-10}$. Since the term $n = N$ in the first sum can be combined with the term $k = 1$ in the second sum, this leaves us only 13 non-zero quantities to evaluate, and we find that $\zeta(3/2) = 2.6123753487$ to 10 decimal places.
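The computation of $\zeta(3/2)$ just described can be reproduced in a few lines. This is a sketch under stated assumptions: the helper names are illustrative, the final integral term of (B.24) is simply dropped, and ordinary double precision is used.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(K):
    """B_0, ..., B_K via the recurrence (B.7)."""
    B = [Fraction(1)]
    for m in range(1, K + 1):
        B.append(-sum(comb(m + 1, i) * B[i] for i in range(m)) / (m + 1))
    return B

def gen_binom(z, m):
    """Generalized binomial coefficient C(z, m) = z (z-1) ... (z-m+1) / m!."""
    out = 1.0
    for i in range(m):
        out *= (z - i) / (i + 1)
    return out

def zeta_euler_maclaurin(s, N, K):
    """Right-hand side of (B.24) with the final integral term dropped."""
    B = bernoulli_numbers(K)
    head = sum(n ** (-s) for n in range(1, N + 1))
    main = N ** (1 - s) / (s - 1)
    corr = sum(gen_binom(s + k - 2, k - 1) * float(B[k]) * N ** (1 - k) / k
               for k in range(1, K + 1))
    return head + main + N ** (-s) * corr

approx = zeta_euler_maclaurin(1.5, 5, 15)
```

Only the terms with $k = 1$ and even $k$ contribute to the correction sum, since $B_k = 0$ for odd $k \ge 3$; the dropped integral is what (B.22) bounds.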
By applying the Euler–Maclaurin formula to $f(x) = \log x$ we obtain an approximation to $n!$. For example, with $a = 1$, $b = n$, $K = 2$, we find that

\[ \log(n!) = n \log n - n + \frac{1}{2} \log n + c + \frac{1}{12n} - \frac{1}{2} \int_n^\infty B_2(\{x\}) x^{-2}\,dx \tag{B.25} \]

where

\[ c = \frac{11}{12} + \frac{1}{2} \int_1^\infty B_2(\{x\}) x^{-2}\,dx. \]

From (B.22) we see that the last term in (B.25) has modulus less than $1/(12n)$. In addition we describe below how it may be shown that $c = \frac{1}{2} \log 2\pi$, so that on exponentiating we obtain Stirling's formula

\[ n! = \Bigl( \frac{n}{e} \Bigr)^n \sqrt{2\pi n}\,\bigl(1 + O(1/n)\bigr). \tag{B.26} \]

More accurate approximations can be derived by using larger values of $K$. The value of $c$ can be determined by appealing to Wallis's formula, which asserts that

\[ \frac{2}{\pi} = \prod_{n=1}^{\infty} \Bigl( 1 - \frac{1}{4n^2} \Bigr). \tag{B.27} \]

Here the product of the first $N$ terms is

\[ \frac{(2N + 1)(2N)!^2}{2^{4N} N!^4}, \]

and on invoking (B.25) we see that this tends to $4e^{-2c}$, so that $e^c = \sqrt{2\pi}$. A simple proof of (B.27) is outlined in Exercise B.17 below. A determination of $c$ by use of an inverse Mellin transform and properties of the zeta function is outlined in Exercise B.23 below. In the next appendix we extend our application of the Euler–Maclaurin summation formula to give an asymptotic estimate of the gamma function in the complex plane.
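The size of the remainder in (B.25) can be observed numerically once $c = \frac{1}{2}\log 2\pi$ is inserted. This sketch relies on the standard library function `math.lgamma` for $\log(n!)$; the function name and the sample values of $n$ are illustrative assumptions.

```python
import math

def stirling_log_factorial(n):
    """n log n - n + (1/2) log n + c + 1/(12 n), with c = (1/2) log(2 pi), as in (B.25)."""
    return (n * math.log(n) - n + 0.5 * math.log(n)
            + 0.5 * math.log(2 * math.pi) + 1 / (12 * n))

# Remaining term of (B.25): log(n!) minus the explicit part.
gaps = {n: math.lgamma(n + 1) - stirling_log_factorial(n) for n in (1, 2, 5, 10, 100)}
```

Each gap should be smaller in modulus than $1/(12n)$, as the bound derived from (B.22) predicts; in practice it is far smaller.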
Exercises
1. Show that $(-1)^k B_k(-x) = B_k(x) + k x^{k-1}$ for all $k \ge 0$.

2. Prove the following generalization of (B.5):
\[ B_k(x + h) = \sum_{j=0}^{k} \binom{k}{j} B_{k-j}(x) h^j \qquad (k \ge 0). \]

3. Show that if $|z| < 2\pi$, then
\[ \frac{z e^{xz}}{e^z - 1} = \sum_{k=0}^{\infty} B_k(x) z^k/k!. \]

4. Show that if $k \ge 3$ is odd, then $B_k(x)$ has simple zeros at 0, 1/2, and 1, and no other zeros in $[0, 1]$. Show that if $k \ge 2$ is even, then $B_k(x)$ has one simple zero in $(0, 1/2)$ and another in $(1/2, 1)$, and no other zeros in $[0, 1]$.

5. (Lehmer 1940)
(a) Show that $\max_{0 \le x \le 1} |B_3(x)| = \sqrt{3}/36 < 3/(2\pi^3)$.
(b) Deduce that
\[ \max_x \sum_{m=1}^{\infty} m^{-3} \sin 2\pi m x = \sqrt{3}\,\pi^3/54 = 0.994527\ldots. \]
(c) Show that
\[ \max_{0 \le x \le 1} |B_5(x)| = \sqrt{1 - \tfrac{2}{15}\sqrt{30}}\;\bigl(2 + \tfrac{2}{3}\sqrt{30}\bigr)/120 < 15/(2\pi^5). \]
(d) Using Theorem B.2, or otherwise, show that if $k$ is odd, $k \ge 3$, then
\[ \max_{0 \le x \le 1} |B_k(x)| = k!\,2^{1-k} \pi^{-k} \bigl(1 - 3^{-k} + O(4^{-k})\bigr). \]
(e) Show that if $k$ is odd, $k \ge 3$, then
\[ \max_{0 \le x \le 1} |B_k(x)| < k!\,2^{1-k} \pi^{-k}. \]

6. Show that if $j \ge 1$ and $k \ge 1$, then
\[ \int_0^1 B_j(x) B_k(x)\,dx = (-1)^{k-1} \frac{j!\,k!}{(j + k)!} B_{j+k}. \]

7. Show that
\[ B_k(1/2) = -(1 - 2^{1-k}) B_k \qquad (k \ge 0). \]

8. Show that
\[ \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m + 1)^3} = \frac{\pi^3}{32}. \]
9. Show that if $k \ge 0$ and $q \ge 1$, then
\[ B_k(qx) = q^{k-1} \sum_{a=0}^{q-1} B_k(x + a/q). \]
(Suggestion: Suppose first that $0 < x < 1/q$, and use Theorem B.2.)

10. Show that if $a$ and $b$ are positive integers, then
\[ \int_0^1 B_1(\{ax\}) B_1(\{bx\})\,dx = \frac{(a, b)^2}{12ab}. \]

11. Using (B.8), or otherwise, show that
\[ z \cot z = \sum_{k=0}^{\infty} (-1)^k \frac{B_{2k}}{(2k)!} (2z)^{2k} \]
for $|z| < \pi$, and that
\[ \tan z = \sum_{k=1}^{\infty} (-1)^{k-1} \frac{B_{2k}}{(2k)!} (2^{4k} - 2^{2k}) z^{2k-1} \]
for $|z| < \pi/2$. Show that all coefficients in the latter series are positive.

12. (a) Suppose that $A(z) = \sum_{n=0}^{\infty} a_n z^n/n!$ and $B(z) = \sum_{n=0}^{\infty} b_n z^n/n!$ are power series with positive radii of convergence, and put $C(z) = A(z)B(z)$. Show that $C(z) = \sum_{n=0}^{\infty} c_n z^n/n!$ has positive radius of convergence, and that
\[ c_n = \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k}. \tag{B.28} \]
(b) Suppose that $B(z) = \sum_{n=0}^{\infty} b_n z^n/n!$ and $C(z) = \sum_{n=0}^{\infty} c_n z^n/n!$ are power series with positive radii of convergence, and that $b_0 \ne 0$. Deduce that $A(z) = C(z)/B(z) = \sum_{n=0}^{\infty} a_n z^n/n!$ has positive radius of convergence, and that (B.28) holds.
(c) In the above situation, suppose that the $b_n$ and $c_n$ are all integers, and that $b_0 = \pm 1$. Deduce that the $a_n$ are all integers.

13. Put
\[ T_k = (-1)^{k-1} \frac{B_{2k}}{2k} (2^{4k} - 2^{2k}). \]
These are called the 'tangent coefficients' because
\[ \tan z = \sum_{k=1}^{\infty} T_k \frac{z^{2k-1}}{(2k - 1)!} \]
for $|z| < \pi/2$ (cf. Exercise 11). By taking $C(z) = \sin z$, $B(z) = \cos z$ in the preceding exercise, or otherwise, show that the $T_k$ are all positive integers.
14. (a) By suitable applications of the identity of Exercise 3, or otherwise, show that
\[ \frac{e^{3z/4} - e^{z/4}}{e^z - 1} = -2 \sum_{k=0}^{\infty} B_{2k+1}(1/4) \frac{z^{2k}}{(2k + 1)!} \]
for $|z| < 2\pi$.
(b) By the substitution $z = 4iw$, show that
\[ \sec w = \sum_{k=0}^{\infty} (-1)^{k+1} 4^{2k+1} \frac{B_{2k+1}(1/4)}{(2k + 1)!} w^{2k} \]
for $|w| < \pi/2$.
(c) Put
\[ E_k = (-1)^{k+1} 4^{2k+1} \frac{B_{2k+1}(1/4)}{2k + 1}. \]
These are called the 'Euler numbers' or 'secant coefficients', since
\[ \sec z = \sum_{k=0}^{\infty} E_k \frac{z^{2k}}{(2k)!} \]
for $|z| < \pi/2$. Show that $E_k > 0$ for all $k \ge 0$.
(d) By taking $C(z) = 1$, $B(z) = \cos z$ in Exercise 12, or otherwise, show that the $E_k$ are all integers.

15. With the Euler numbers defined as above, show that
\[ L(2k + 1, \chi_{-4}) = \frac{E_k}{(2k)!\,2^{2k+2}} \pi^{2k+1} \]
for all non-negative integers $k$.

16. Suppose that $a$ and $b$ are integers and that $K$ is even.
(a) Show that if $f^{(K)}(x)$ is of constant sign in $(a, b)$, then the modulus of the last term in the Euler–Maclaurin formula does not exceed that of the term $k = K$ in the sum.
(b) Show that
\[ \int_a^b B_{K+1}(\{x\}) f^{(K+1)}(x)\,dx = \int_0^{1/2} B_{K+1}(x) g(x)\,dx \]
where
\[ g(x) = \sum_{r=1}^{b-a} \Bigl( f^{(K+1)}(a + r - 1 + x) - f^{(K+1)}(a + r - x) \Bigr). \]
(c) Show that if $f^{(K+1)}(x)$ exists and is monotonically decreasing in $[a, b]$, then
\[ \operatorname{sgn} \int_a^b B_K(\{x\}) f^{(K)}(x)\,dx = -\operatorname{sgn} B_K. \]
(d) Show that if $f^{(K)} < 0$, $f^{(K+1)} > 0$, $f^{(K+2)} < 0$ throughout $[a, b]$, then the last term in the Euler–Maclaurin formula has smaller modulus than, and opposite sign to, the term $k = K$ in the sum.
(e) Show that
\[ 1 < \frac{n!}{(n/e)^n \sqrt{2\pi n}} < e^{1/(12n)}. \]

17. For $n \ge 0$, let $I_n = \int_0^\pi (\sin x)^n\,dx$.
(a) Show that $I_0 = \pi$, $I_1 = 2$.
(b) Show that $I_{n+2} = \frac{n+1}{n+2} I_n$.
(c) Show that $I_n/I_{n+1} \to 1$ as $n \to \infty$.
(d) Deduce the formula (B.27) of Wallis (1656).
18. Show that if 0 < x < 1, then
∞
e(nα) π sin 2π αx − sin 2π(α − 1)x
2 − n2
= · .
n=−∞ x n 1 − cos 2π x
19. Let $C_0$ denote Euler's constant. Show that if $N$ and $K$ are positive integers, then
\[ \sum_{n=1}^{N} \frac{1}{n} = \log N + C_0 + \frac{1}{2N} - \sum_{k=1}^{K-1} \frac{B_{2k}}{2k N^{2k}} - \theta \frac{B_{2K}}{2K N^{2K}} \]
for some $\theta \in (0, 1)$.

20. Let $t$ be real, fixed. Show that $\sum_{n \le x} (-1)^{n-1} n^{-it}$ is boundedly oscillating.

21. (Carlitz 1964)
(a) Choose $\sigma_0 > 1$ so that $\log \zeta(\sigma_0) = 2\pi$. By substituting $z = \log \zeta(s)$ in (B.8), show that
\[ \frac{\log \zeta(s)}{\zeta(s) - 1} = \sum_{k=0}^{\infty} \frac{B_k}{k!} (\log \zeta(s))^k \]
for $\sigma > \sigma_0$.
(b) Choose $\sigma_1 > 1$ so that $\zeta(\sigma_1) = 2$. By writing $\log \zeta(s) = \log(1 + (\zeta(s) - 1))$, show that
\[ \frac{\log \zeta(s)}{\zeta(s) - 1} = \sum_{k=0}^{\infty} (-1)^k \frac{(\zeta(s) - 1)^k}{k + 1} \]
for $\sigma > \sigma_1$.
(c) Show that there exist rational numbers $b(n)$ such that
\[ \frac{\log \zeta(s)}{\zeta(s) - 1} = \sum_{n=1}^{\infty} b(n) n^{-s} \]
is absolutely convergent for $\sigma > \sigma_1$.
508 The Euler–Maclaurin summation formula

(d) Show that b(1) = 1.

(e) Show that b( p k ) = −1/(k(k + 1)) for k ≥ 1.
(f) Show that if n is square-free, then b(n) = Bω(n) .
22. Show that ζ (0) = − 12 log 2π. (Suggestion: Differentiate both sides of
(B.24), set s = 0, and then compare with (B.26).)

23. (a) Let $F_0(x) = \sum_{n \le x} \log n$. Show that
\[ F_0(x) = x \log x - x + c - B_1(x) \log x + O(1/x) \]
for $x \ge 1$, where $c$ is the constant in (B.25).
(b) Let $F_1(x) = \sum_{n \le x} (x - n) \log n = \int_1^x F_0(u) \, du$. Show that
\[ F_1(x) = \frac{1}{2} x^2 \log x - \frac{3}{4} x^2 + cx + O(\log x) \]
for $x \ge 1$.
(c) By (5.19), show that
\[ F_1(x) = \frac{-1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \zeta'(s) \frac{x^{s+1}}{s(s+1)} \, ds. \]
(d) Show that the residue of the above at $s = 1$ is $\frac{1}{2} x^2 \log x - \frac{3}{4} x^2$, and at
$s = 0$ is $-\zeta'(0) x$.
(e) Use Corollary 10.5, and Cauchy’s formula with a circular contour of
radius $1/\log \tau$ to show that $\zeta'(s) \ll \tau^{1/2 - \sigma} \log \tau$ uniformly for $-A \le \sigma \le -\varepsilon$.
(f) Take the contour to the abscissa $-1/2 + \varepsilon$ to show that
\[ F_1(x) = \frac{1}{2} x^2 \log x - \frac{3}{4} x^2 - \zeta'(0) x + O\big(x^{1/2 + \varepsilon}\big). \]
(g) By combining the above with the preceding exercise, show that $c = \frac{1}{2} \log 2\pi$.
24. Show that $1^1 2^{1/2} \cdots n^{1/n} \sim c\, n^{(\log n)/2}$ as $n \to \infty$, where $c > 0$ is an absolute
constant.
25. (Kinkelin 1860) Show that
\[ 1^1 2^2 \cdots n^n = C n^{n^2/2 + n/2 + 1/12} e^{-n^2/4} \big(1 + O(1/n^2)\big) \]
as $n \to \infty$, where $C$ is a positive constant.

26. (Glaisher 1895)
(a) Let $A_0(x) = \sum_{n \le x} n \log n$. Show that
\[ A_0(x) = \frac{1}{2} x^2 \log x - \frac{1}{4} x^2 - B_1(x)\, x \log x + \frac{1}{2} B_2(x)(\log x + 1) + \log C - \frac{1}{12} + O(1/x) \]
for $x \ge 1$, where $C$ is the constant in the preceding exercise.
(b) Put $A_1(x) = \sum_{n \le x} (x - n) n \log n = \int_1^x A_0(u) \, du$. Show that
\[ A_1(x) = \frac{1}{6} x^3 \log x - \frac{5}{36} x^3 - \frac{1}{2} B_2(x)\, x \log x + (\log C - 1/12) x + O(\log x) \]
for $x \ge 1$.
(c) Put $A_2(x) = \frac{1}{2} \sum_{n \le x} (x - n)^2 n \log n = \int_1^x A_1(u) \, du$. Show that
\[ A_2(x) = \frac{1}{24} x^4 \log x - \frac{13}{288} x^4 + \frac{1}{2} (\log C - 1/12) x^2 + O(x \log x) \]
for $x \ge 1$.
(d) By using (5.19), show that
\[ A_2(x) = \frac{-1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \zeta'(s - 1) \frac{x^{s+2}}{s(s+1)(s+2)} \, ds. \]
(e) Show that the residue at $s = 2$ in the above integral is $\frac{1}{24} x^4 \log x - \frac{13}{288} x^4$,
and that the residue at $s = 0$ is $\frac{-1}{2} \zeta'(-1) x^2$.
(f) By taking the contour to the abscissa $\sigma = -1/2 + \varepsilon$, and using the
result of Exercise 23(e), show that
\[ A_2(x) = \frac{1}{24} x^4 \log x - \frac{13}{288} x^4 - \frac{1}{2} \zeta'(-1) x^2 + O\big(x^{3/2 + \varepsilon}\big) \]
for $x \ge 1$.
(g) Show that $\Gamma'(2) = 1 - C_0$.
(h) By differentiating both sides of (10.9), show that
\[ \zeta'(-1) = \frac{\zeta'(2)}{2\pi^2} + \frac{1}{12} (1 - C_0 - \log 2\pi). \]
(i) Conclude that
\[ \log C = \frac{1}{12} \log 2\pi + \frac{1}{12} C_0 - \frac{\zeta'(2)}{2\pi^2} \]
where $C$ is the constant in Exercise 25.
27. (a) Integrate by parts to show that
\[ \int_0^1 x B_k(x) \, dx = \frac{B_{k+1}(1)}{k+1}. \]
(b) Use (B.5) to show that
\[ \int_0^1 x B_k(x) \, dx = \sum_{j=0}^{k} \binom{k}{j} \frac{B_{k-j}}{j+2}. \]
(c) Conclude that if $k > 0$, then
\[ \sum_{j=0}^{k} \binom{k}{j} \frac{B_j}{k - j + 2} = \frac{B_{k+1}(1)}{k+1}. \]

In the next exercise we develop some of the ‘calculus of finite differences’,
which we then use to derive an explicit formula for $B_{k+1}(x)$, and hence
for $B_k$.
28. For a given function $f$ we let $\Delta f$ denote the function $f(x+1) - f(x)$,
and we put $\Delta^{(n)} f = \Delta(\Delta^{(n-1)} f)$.
(a) Show that
\[ \Delta^{(n)} f(x) = \sum_{i=0}^{n} (-1)^i \binom{n}{i} f(x + n - i). \]
(b) Suppose that $f(x)$ is a polynomial expressed in the form
\[ f(x) = \sum_{r=0}^{k} c_r \binom{x}{r} \tag{B.29} \]
where $\binom{x}{r} = x(x-1) \cdots (x-r+1)/r!$ for $r > 0$, and $\binom{x}{0} = 1$.
Show that
\[ \Delta f(x) = \sum_{r=1}^{k} c_r \binom{x}{r-1}. \]
(c) In the above notation, show that
\[ \Delta^{(n)} f(x) = \sum_{r=n}^{k} c_r \binom{x}{r-n}. \]
(d) Deduce that
\[ c_r = \Delta^{(r)} f(x) \Big|_{x=0} = \sum_{i=0}^{r} (-1)^i \binom{r}{i} f(r - i). \]
(e) Suppose that $f$ is defined as in (B.29), and put
\[ F(x) = \sum_{r=0}^{k} c_r \binom{x}{r+1}. \]
Show that $\Delta F = f$.
(f) Let $f$ and $F$ be as above, and suppose that $G$ is a further function such
that $\Delta G = f$. Show that $F - G$ is periodic with period 1, and hence
that if $G$ is a polynomial then $G = F + C$ for some constant $C$.
(g) Let $f$ and $F$ be as above, and suppose that $a$ and $b$ are integers such
that $a \le b$. Show that
\[ \sum_{j=a}^{b} f(x + j) = F(x + b + 1) - F(x + a). \]

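29. Suppose that numbers $a_{rk}$ are chosen so that
\[ x^k = \sum_{r=0}^{k} a_{rk}\, x(x-1) \cdots (x-r+1). \]
(a) Explain why the $a_{rk}$ are integers.
(b) Show that
\[ a_{rk}\, r! = \sum_{i=0}^{r} (-1)^i \binom{r}{i} (r - i)^k. \]
(c) Let $F$ be defined as in Exercise 28(e) with $f(x) = x^k$, so that $c_r = a_{rk}\, r!$.
Show that $F(x+1) - F(x) = x^k$.
(d) Show that $F(0) = 0$.
(e) Deduce that
\[ F(x) = \frac{B_{k+1}(x) - B_{k+1}}{k+1}. \]
(f) Note that the coefficient of $x$ on the right-hand side above is $B_k$.
(g) Show that
\[ \frac{d}{dx} \binom{x}{r+1} \bigg|_{x=0} = \frac{(-1)^r}{r+1}. \]
(h) Conclude that
\[ B_k = \sum_{r=0}^{k} \frac{(-1)^r a_{rk}\, r!}{r+1} = \sum_{r=0}^{k} \frac{1}{r+1} \sum_{i=0}^{r} (-1)^i \binom{r}{i} i^k. \tag{B.30} \]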

30. (a) Show that if $r + 1$ is composite and $r + 1 > 4$, then $(r+1) \mid r!$.
(b) Show that if $k > 0$, then $a_{3k}\, 3! = 3^k - 3 \cdot 2^k + 3$, and that this is a
multiple of 4 if $k$ is even.
(c) Deduce that if $k$ is positive and even, then
\[ B_k \equiv \sum_{p \le k+1} \frac{1}{p} \sum_{i=0}^{p-1} (-1)^i \binom{p-1}{i} i^k \pmod{1}. \]
31. Put $S_k(p) = \sum_{a=1}^{p} a^k$.
(a) Show that $S_0(p) \equiv 0 \pmod{p}$.
(b) Show that if $(p-1) \mid k$ and $k > 0$, then $S_k(p) \equiv -1 \pmod{p}$.
(c) Show that if $(c, p) = 1$, then $c^k S_k(p) \equiv S_k(p) \pmod{p}$.
(d) Show that if $(p-1) \nmid k$, then there is a $c$, $(c, p) = 1$, such that
$c^k \not\equiv 1 \pmod{p}$.
(e) Deduce that if $(p-1) \nmid k$, then $S_k(p) \equiv 0 \pmod{p}$.
(f) Summarize:
\[ S_k(p) \equiv \begin{cases} -1 \pmod{p} & \text{if } (p-1) \mid k,\ k > 0; \\ 0 \pmod{p} & \text{otherwise.} \end{cases} \]
32. (von Staudt 1840, Clausen 1840, cf. Lucas 1891, Carlitz 1960/61) By
combining the preceding two exercises, deduce the von Staudt–Clausen
theorem: If $k$ is positive and even, then
\[ B_k + \sum_{(p-1) \mid k} \frac{1}{p} \]
is an integer.
33. (a) Let $S_k(p)$ be defined as in Exercise 31. Use the binomial theorem to
show that
\[ \sum_{k=0}^{n-1} \binom{n}{k} S_k(p) \equiv 0 \pmod{p}. \]
(b) Deduce that
\[ \sum_{\substack{0 < k < n \\ (p-1) \mid k}} \binom{n}{k} \equiv 0 \pmod{p}. \]

34. (Bartz & Rutkowski 1993)
(a) Suppose that $q$ is a positive integer, and that $a$ is a non-negative integer.
Explain why
\[ q^k B_k((a+1)/q) = \sum_{j=0}^{k} \binom{k}{j} B_j(a/q)\, q^j. \]
(b) Suppose that $k = 1$ or that $k$ is a positive even integer, and let $q$ be a
positive integer. By using the von Staudt–Clausen theorem, or otherwise,
show that
\[ q^k B_k + \sum_{\substack{(p-1) \mid k \\ p \nmid q}} \frac{1}{p} \]
is an integer.
(c) Suppose that $k = 1$ or that $k$ is a positive even integer, and let $q$ be a
positive integer. By inducting on $a$, show that
\[ q^k B_k(a/q) + \sum_{\substack{(p-1) \mid k \\ p \nmid q}} \frac{1}{p} \]
is an integer.
(d) Suppose that $k$ is odd, $k \ge 3$, and that $q$ is a positive integer. By
inducting on $a$, show that $q^k B_k(a/q)$ is an integer, for all non-negative
integers $a$.
35. (Almkvist & Meurman 1991) Suppose that $q$ and $k$ are positive integers.
Show that $q^k (B_k(a/q) - B_k)$ is an integer for all integers $a$.
36. Suppose that $0 < \alpha \le 1$, and recall that the Hurwitz zeta function is defined
to be $\zeta(s, \alpha) = \sum_{n=0}^{\infty} (n + \alpha)^{-s}$ for $\sigma > 1$.
(a) Show that
\[ \zeta(s, \alpha) = \frac{1}{\alpha^s} + \frac{1}{s-1} - \sum_{k=1}^{K} \binom{-s}{k-1} (-1)^k \frac{B_k(1 - \alpha)}{k} - (-1)^K \binom{-s}{K} \int_1^{\infty} B_K(\{x - \alpha\})\, x^{-s-K} \, dx \]
for $\sigma > 1 - K$.
(b) Deduce that $\zeta(s, \alpha)$ is an analytic function of $s$ throughout the complex
plane, except for a simple pole with residue 1 at $s = 1$.
(c) Let $n$ denote a non-negative integer. Show that
\[ \zeta(-n, \alpha) = \alpha^n - \frac{1}{n+1} \sum_{k=0}^{n+1} \binom{n+1}{k} (-1)^k B_k(1 - \alpha). \]
(d) By (B.10), (B.13), (B.15), and Exercise 2, deduce that
\[ \zeta(-n, \alpha) = -\frac{B_{n+1}(\alpha)}{n+1}. \]

B.1 Notes
Although the notation we have adopted here is quite common, other (conﬂicting)
notations for the Bernoulli numbers are to be found in the literature. Thus it is
important to recognize the notational conventions when comparing texts.
The basic facts concerning the Bernoulli numbers and polynomials can be
derived in many ways, so the approach depends on one’s motivation. Other
expositions of note are found in Borevich & Shafarevich (1966, Section 5.8),
Rademacher (1973, Chapters 1, 2), and Boas (1977). The proof of the von
Staudt–Clausen theorem sketched in Exercises B.28–B.32 is due to Lucas
(1891). The critical identity (B.30) can also be derived by using the gener-
ating function (B.8) (cf. Carlitz 1960/61). Borevich & Shafarevich (1966, pp.
384–385) and Cassels (1986, pp. 7–10) give p-adic proofs, the latter of which
is due to Witt. The Bernoulli numbers possess a number of further arithmetic
properties, such as the Kummer congruences, which are best viewed from a
p-adic perspective (cf. Koblitz 1977, p. 44).
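As a small illustration of the identity (B.30) and of the von Staudt–Clausen theorem, the following Python sketch evaluates the double sum in (B.30) in exact rational arithmetic and then adds the reciprocals of the primes $p$ with $(p-1) \mid k$. The function names are hypothetical, chosen for this example only, and the primality test is naive trial division, which is adequate only for small $k$.

```python
from fractions import Fraction
from math import comb


def bernoulli(k: int) -> Fraction:
    """B_k from the double sum (B.30), using exact rationals.

    Python evaluates 0 ** 0 as 1, which gives B_0 = 1 as required.
    """
    total = Fraction(0)
    for r in range(k + 1):
        inner = sum((-1) ** i * comb(r, i) * i ** k for i in range(r + 1))
        total += Fraction(inner, r + 1)
    return total


def is_prime(n: int) -> bool:
    """Naive trial division; fine for the small primes used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def von_staudt_sum(k: int) -> Fraction:
    """B_k plus the sum of 1/p over primes p with (p - 1) | k."""
    primes = [p for p in range(2, k + 2) if is_prime(p) and k % (p - 1) == 0]
    return bernoulli(k) + sum(Fraction(1, p) for p in primes)


if __name__ == "__main__":
    for k in range(2, 21, 2):
        # The last column should always be an integer.
        print(k, bernoulli(k), von_staudt_sum(k))
```

With the convention used here, the sketch returns $B_1 = -1/2$. It is meant only as a numerical check of the statement, not as a substitute for the proof outlined in the exercises.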
The fact that $\zeta(2k)$ is a rational multiple of $\pi^{2k}$ was discovered by Euler.
As reported by Whittaker & Watson (1927, p. 127) and Barnes (1905, p. 253),
the Euler–Maclaurin sum formula was discovered by Euler in 1732, but not
published by him until 1738. Euler (9 June, 1736) wrote to Stirling of his
formula. Stirling (16 April, 1738) responded that Euler’s formula included his
own as a special case, but that the more general formula had been discovered by
Maclaurin. Euler then wrote to Stirling, waiving any claim of priority. Maclaurin
published the formula in 1742. Proofs of the formula have been given by Jacobi
(1834), Kronecker (1889, 1901, pp. 317–319), Wirtinger (1902), Barnes (1903),
Jordan (1922), and Hardy (1949, Chapter 13).
Euler invented a number of methods for accelerating the convergence of
series. Such methods (described in Hardy 1949, pp. 7–8, 23–29, 70–73)
can be applied to the zeta function. For example, the formula of Apéry
(1979),
\[ \zeta(3) = \frac{5}{2} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^3 \binom{2n}{n}}, \]

can be derived in this way. Apéry (cf. van der Poorten (1978/79), (1980),
Beukers (1979), Ball & Rivoal (2001)) used this formula to prove that $\zeta(3)$
is irrational. It still is not known whether $\zeta(2k+1)$ is irrational when $k \ge 2$,
nor is it known whether $\zeta(2k+1)/\pi^{2k+1}$ is irrational. (In this latter
connection see Grosswald (1970) and Terras (1976).) Presumably Euler’s constant
$C_0 = 0.577215664901532\ldots$ and Catalan’s constant
\[ L(2, \chi_{-4}) = \sum_{m=0}^{\infty} (-1)^m/(2m+1)^2 = 0.915965594\ldots \]

are irrational as well, but this has not been proved.
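To make the gain in convergence concrete, here is a minimal Python sketch comparing partial sums of Apéry’s series for $\zeta(3)$ with partial sums of the defining series $\sum n^{-3}$. The function names are illustrative only; the terms of Apéry’s series decay roughly like $4^{-n}$, whereas the tail of the defining series after $N$ terms is of size about $1/(2N^2)$.

```python
from math import comb


def zeta3_apery(terms: int) -> float:
    """Partial sum of (5/2) * sum_{n>=1} (-1)^(n-1) / (n^3 * C(2n, n))."""
    return 2.5 * sum(
        (-1) ** (n - 1) / (n ** 3 * comb(2 * n, n)) for n in range(1, terms + 1)
    )


def zeta3_direct(terms: int) -> float:
    """Partial sum of the defining series sum_{n>=1} n^(-3)."""
    return sum(1.0 / n ** 3 for n in range(1, terms + 1))


if __name__ == "__main__":
    for terms in (5, 10, 20, 30):
        print(terms, zeta3_apery(terms), zeta3_direct(terms))
```

With 30 terms the accelerated series is already accurate to about the limit of double precision, while the direct sum is still off in the fourth decimal place.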

The value of ζ (−n) can be determined in a variety of ways. For example,
the values given in Theorem B.4 can be arrived at by combining the func-
tional equation of the zeta function (Theorem 10.4) with Corollary B.1 above.
Alternatively, by taking $a_n = 1$ in (5.23) we find that
\[ \zeta(s) = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx \]
for σ > 1. Now suppose that the complex plane is slit along the positive real
axis, and that C is the ‘Hankel path’ that starts at +∞ on the positive side of
the slit, and follows the slit to the origin, circles the origin in the positive sense,
and then returns to $+\infty$ along the negative side of the slit. Set
\[ I(s) = \int_C \frac{z^{s-1}}{e^z - 1} \, dz. \]
This integral is uniformly convergent in any compact portion of the plane, and
therefore defines an entire function. Suppose that σ > 1. We shrink the path C
until it coincides with the slit. The integral along the first leg of the path is then
\[ -\int_0^{\infty} \frac{x^{s-1}}{e^x - 1} \, dx. \]
The portion of the path that circles the origin becomes negligible, and the
integral along the second leg is
\[ \int_0^{\infty} \frac{(x e^{2\pi i})^{s-1}}{e^x - 1} \, dx. \]
On combining these results and using the fact that $\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s$
(see Appendix C), we find that
\[ \zeta(s) = e^{-\pi i s}\, \Gamma(1-s)\, I(s)/(2\pi i). \]

Although we have derived this under the assumption that σ > 1, by the unique-
ness of analytic continuation it remains valid throughout the complex plane.
In general the integrand in I (s) has a branch point at the origin, but if s is a
negative integer then the singularity is merely a pole, the residue can then be
calculated using the power series (B.8), and we obtain Theorem B.4 once more.
See Apostol (1951) for a discussion of the values of the Lerch zeta functions.
By means of the Euler–Maclaurin formula one can calculate ζ (s) and its
derivatives, when |s| is not too large. Let S(t) and Z (t) be defined as in Chapter
14. As long as ζ (1/2 + it) is calculated sufficiently accurately to allow the sign
of $Z(t)$ to be determined, one can prove the existence of zeros on the critical
line by detecting changes of sign of $Z(t)$. Let $H(n)$ denote the assertion that
the first n zeros lie on the critical line and are simple. Gram (1903) established
H (10), Backlund (1914) H (79), and Hutchinson (1925) H (138), all using the
Euler–Maclaurin formula. Since the amount of computation to evaluate Z (t)
for a single value of t is comparable to t by this method, it would be slow
work to continue this for larger t. However, in unpublished notes of Riemann,
Siegel (1932) discovered indications of a more rapidly convergent formula,
known today as the Riemann–Siegel formula: Let $\theta = \theta(t) = -\frac{1}{2} t \log \pi + \arg \Gamma(1/4 + it/2)$, $m = [\sqrt{t/(2\pi)}]$. Then
\[ Z(t) = 2 \sum_{n=1}^{m} n^{-1/2} \cos(\theta - t \log n) + R(t) \]
where the remainder $R(t)$ has an asymptotic expansion that is rapidly convergent
when $t$ is large. The most trivial estimate is that $R(t) \ll t^{-1/4}$, but if this is not
sufficient one can write
\[ R(t) = (-1)^{m-1} \frac{h\big(\sqrt{t/(2\pi)} - m\big)}{(t/(2\pi))^{1/4}} + O\big(t^{-3/4}\big) \]
where $h(u) = (\cos 2\pi(u^2 - u - 1/16))/\cos 2\pi u$ for $0 \le u < 1$. Titchmarsh
(1935, 1936) used the above to establish H (1041). All such calculations fall
into two parts. First one calculates Z (t); by detecting sign changes one obtains
a lower bound for N (t). Secondly, one computes S(t), so that N (t) is known
via Theorem 14.1. Titchmarsh argued that if $\Re \zeta(\sigma + it) > 0$ for $\sigma \ge 1/2$, then
$N(t)$ is the integer nearest to
\[ \frac{1}{\pi} \arg \Gamma(1/4 + it/2) - \frac{t}{2\pi} \log \pi + 1. \]
Values of $t$ for which this works are rare when $t$ is large, but Turing (1953)
devised an alternative procedure that depends on the estimate
\[ \int_0^T S(t) \, dt \ll \log T, \tag{B.31} \]

which is due to Littlewood (1924). Turing (1953) was the ﬁrst to employ a
digital computer as an aid to the computation; he achieved H (1104). To be use-
ful in numerical calculations, estimates need to be constructed for the various
implicit constants. For the Riemann–Siegel formula this was done by Titch-
marsh. For (B.31) this was done by Turing. Titchmarsh’s analysis contained
errors that were later corrected by Rosser, Yohe & Schoenfeld (1969). Turing’s
argument also contained errors, which were repaired by Lehman (1970). Sub-
sequently, Lehmer (1956a,b) achieved H (25,000), Meller (1958) H (35,337),
Lehman (1966) H (250,000), Rosser, Yohe & Schoenfeld (1969) H (3,500,000),
Brent (1979) H (81,000,001), Brent, van de Lune, te Riele & Winter (1982a,b)
$H(200{,}000{,}001)$, van de Lune & te Riele (1983) $H(300{,}000{,}001)$, van de Lune,
te Riele & Winter (1986) $H(1{,}500{,}000{,}001)$ and Wedeniwski $H(9 \cdot 10^{11})$
(cf. http://www.zetagrid.net). The evaluation of $\zeta(1/2 + it)$ by means of the
Riemann–Siegel formula involves on the order of $t^{1/2}$ arithmetic operations, which is a big
improvement over the Euler–Maclaurin method. Odlyzko & Schönhage (1988)
have shown that if multiple evaluations are to be made, the amount of calculation
per evaluation can be reduced to $t^{\varepsilon}$. This new algorithm was implemented
by Gourdon & Demichel (2004), who used it to establish $H(10^{13})$.
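The sign-change method described above can be sketched in a few lines of Python. The code below evaluates the Riemann–Siegel main sum together with the first correction term built from $h(u)$ as given in these notes. Two points are assumptions of this sketch rather than statements from the text: the phase $\theta(t)$ is replaced by its standard Stirling-type approximation $\frac{t}{2}\log\frac{t}{2\pi} - \frac{t}{2} - \frac{\pi}{8} + \frac{1}{48t} + \frac{7}{5760t^3}$, and no rigorous error bounds are tracked, so the output indicates sign changes but does not prove them. The function names are illustrative.

```python
import math


def theta(t: float) -> float:
    """Stirling-type approximation to theta(t); an assumption of this sketch,
    used in place of the exact arg Gamma(1/4 + it/2) - (t/2) log pi."""
    return (
        (t / 2) * math.log(t / (2 * math.pi))
        - t / 2
        - math.pi / 8
        + 1 / (48 * t)
        + 7 / (5760 * t ** 3)
    )


def h(u: float) -> float:
    """h(u) = cos(2 pi (u^2 - u - 1/16)) / cos(2 pi u) for 0 <= u < 1.

    The quotient has removable singularities at u = 1/4 and u = 3/4,
    which this naive version does not treat specially.
    """
    return math.cos(2 * math.pi * (u * u - u - 1 / 16)) / math.cos(2 * math.pi * u)


def Z(t: float) -> float:
    """Main sum plus first correction term; the error is O(t^(-3/4))."""
    x = math.sqrt(t / (2 * math.pi))
    m = int(x)  # m = [sqrt(t / (2 pi))]
    th = theta(t)
    main = 2 * sum(
        math.cos(th - t * math.log(n)) / math.sqrt(n) for n in range(1, m + 1)
    )
    remainder = (-1) ** (m - 1) * h(x - m) / math.sqrt(x)  # (t/2pi)^(1/4) = sqrt(x)
    return main + remainder


if __name__ == "__main__":
    for t in (13.0, 14.0, 15.0, 20.0, 21.0, 22.0):
        print(t, Z(t))
```

A change of sign between two values of $t$ suggests a zero of $\zeta(1/2 + it)$ in between; a rigorous computation would need explicit constants for the error terms, as discussed above.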
B.2 References
Almkvist, G. & Meurman, A. (1991). Values of Bernoulli polynomials and Hur-
witz’s zeta function at rational points, C. R. Math. Rep. Acad. Sci. Canada 13,
104–108.
Apéry, R. (1979). Irrationalité de ζ (2) et ζ (3), Astérisque 61, 11–13.
Apostol, T. M. (1951). On the Lerch zeta functions, Pacific J. Math. 1, 161–167.
Backlund, R. (1914). Sur les zéros de la fonction ζ (s) de Riemann, C. R. Acad. Sci.
Paris, 158, 1979–1982.
Ball, K. & Rivoal, T. (2001). Irrationalité d’une infinité de valeurs de la fonction zeta
aux entiers impairs, Invent. Math. 146, 193–207.
Barnes, E. W. (1903). The generalisation of the Maclaurin sum formula, and the range
of its applicability, Quart. J. 35, 175–188.
(1905). The Maclaurin sum-formula, Proc. London Math. Soc. (2) 3, 253–272.
Bartz, K. & Rutkowski, J. (1993). On the von Staudt–Clausen theorem, C. R. Math. Rep.
Acad. Sci. Canada 15, 46–48.
Beukers, F. (1979). A note on the irrationality of ζ (2) and ζ (3), Bull. London Math. Soc.
11, 268–272.
Boas, R. P. (1977). Partial sums of infinite series, and how they grow, Amer. Math.
Monthly 84, 237–258.
Borevich, Z. I. & Shafarevich, I. R. (1966). Number Theory. New York: Academic
Press.
Brent, R. (1979). On the zeros of the Riemann zeta function in the critical strip, Math.
Comp. 33, 1361–1372.
Brent, R. P., van de Lune, J., te Riele, H. J. J., Winter, D. T. (1982a). The first 200,000,001
zeros of Riemann’s zeta function, Computational Methods in Number Theory, Part
II, Math. Centre Tracts 155. Amsterdam: Math. Centrum, 389–403.
(1982b). On the zeros of the Riemann zeta function in the critical strip. II, Math.
Comp. 39, 681–688; Corrigenda, 46 (1986), 771.
Carlitz, L. (1960/1961). The Staudt–Clausen theorem, Math. Mag. 34, 131–146.
(1964). Extended Bernoulli and Eulerian numbers, Duke Math. J. 31, 667–689.
Cassels, J. W. S. (1986). Local Fields, London Math Soc. Student Texts 3, Cambridge:
Cambridge University Press.
Clausen, Th. (1840). Theorem, Astronomische Nachrichten 17, 351.
Euler, L. (1732/33). Comm. Petropol. 6, 68–97; Opera, Vol. 1, 15, pp. 42–72.
Glaisher, J. W. L. (1895). On the constant which occurs in the formula for $1^1 2^2 \cdots n^n$,
Messenger of Math. 24, 1–16.
Gourdon, X. & Demichel, P. (2004). The $10^{13}$ first zeros of the Riemann zeta function,
and zeros computation at very large height,
http://numbers.computation.free.fr/Constants/Miscellaneous/zetazeros1e13-1e24.pdf.
Gram, J. (1903). Sur les zéros de la fonction ζ (s) de Riemann, Acta Math. 27,
289–304.
Grosswald, E. (1970). Die Werte der Riemannschen Zetafunktion an ungeraden
Argumentstellen, Nachr. Akad. Wiss. Göttingen Math.–Phys. Kl. II, 9–13.
Hardy, G. H. (1949). Divergent Series. London: Oxford University Press.
Hutchinson, J. I. (1925). On the roots of the Riemann zeta function, Trans. Amer. Math.
Soc. 27, 49–60.
Jacobi, C. G. J. (1834). De usu legitimo formulae summatoriae Maclaurinianae, J. Reine
Angew. Math. 12, 263–272; Gesammelte Werke, Vol. 6. Berlin: Reimer, 1891,
pp. 64–75.
Jordan, C. (1922). On a new demonstration of Maclaurin’s or Euler’s summation formula,
Tôhoku Math. J. 21, 244–246.
Kinkelin, H. (1860). Ueber eine mit der Gammafunction verwandte Transcendente und
deren Anwendung auf die Integralrechnung, J. Reine Angew. Math. 57, 122–138.
Koblitz, N. (1977). p-adic Numbers, p-adic Analysis, and Zeta Functions, Graduate
Texts Math. 58. New York: Springer-Verlag.
Kronecker, L. (1889). Bemerkungen über die Darstellung von Reihen durch Integrale,
J. Reine Angew. Math. 105, 157–159, 345–354; Werke, Vol. 5. Leipzig: Teubner,
1939, pp. 327–342.
(1901). Vorlesungen über Zahlentheorie, Vol. 1. Leipzig: Teubner.
Lehman, R. S. (1966). Separation of the zeros of the Riemann zeta function, Math.
Comp. 20, 523–541.
(1970). On the distribution of zeros of the Riemann zeta-function, Proc. London Math.
Soc. (3) 20, 303–320.
Lehmer, D. H. (1940). On the maxima and minima of Bernoulli polynomials, Amer.
Math. Monthly 47, 533–538.
(1956a). Extended computation of the Riemann zeta-function, Mathematika 3, 102–
108; MTAC 11 (1957), 273.
(1956b). On the roots of the Riemann zeta-function, Acta Math. 95, 291–298; MTAC
11 (1957), 107–108.
Littlewood, J. E. (1924). On the zeros of the Riemann zeta-function, Proc. Cambridge
Philos. Soc. 22, 295–318.
Lucas, É. (1891). Théorie des Nombres. Paris: Gauthier–Villars.
van de Lune, J. & te Riele, H. J. J. (1983). On the zeros of the Riemann zeta function in
the critical strip. III, Math. Comp. 41, 759–767; Corrigenda, 46 (1986), 771.
van de Lune, J., te Riele, H. J. J., & Winter, D. T. (1981). Rigorous high speed sep-
aration of zeros of Riemann’s zeta function, Afdeling Numerieke Wiskunde 113,
Amsterdam: Mathematisch Centrum.
(1986). On the zeros of the Riemann zeta function in the critical strip. IV, Math.
Comp. 46, 667–681.
Maclaurin, C. (1742). Treatise of Fluxions. Edinburgh, p. 672.
Meller, N. A. (1958). Computation connected with the check of Riemann’s hypothesis,
Dokl. Akad. Nauk SSSR 123, 246–248.
Nielsen, N. (1923). Traité élémentaire des nombres de Bernoulli, Paris: Gauthier–Villars.
Odlyzko, A. M. & Schönhage, A. (1988). Fast algorithms for multiple evaluations of
the Riemann zeta function, Trans. Amer. Math. Soc. 309, 797–809.
van der Poorten, A. (1978/79). A proof that Euler missed . . . Apéry’s proof of the irra-
tionality of ζ (3), An informal report, Math. Intelligencer 1, 195–203.
(1980). Some wonderful formulae . . . footnotes to Apéry’s proof of the irrationality
of ζ (3), Séminaire Delange–Pisot–Poitou, Théorie des nombres, Fasc. 2, Exp. No.
29, Paris: Secrétariat Math. 7 pp.
Rademacher, H. (1973). Topics in Analytic Number Theory. New York: Springer-Verlag.
Rosser, J. B., Yohe, J. M. & Schoenfeld, L. (1969). Rigorous computation and the zeros
of the Riemann zeta-function, Information Processing 68 (Proc. IFIP Congress,
Edinburgh, 1968), Vol. 1: Mathematics, Software, Amsterdam: North-Holland,
pp. 70–76; Errata, Math. Comp. 29 (1975), 243.
Siegel, C. L. (1932). Über Riemanns Nachlaß zur analytischen Zahlentheorie, Quellen
Studien Gesch. Math. Astro. Phys. 2, 45–80; Gesammelte Abhandlungen, Vol. 1.
Berlin: Springer-Verlag, 1966, pp. 275–310.
von Staudt, K. G. C. (1840). Beweis eines Lehresatzes, die Bernoullischen Zahlen be-
treffend, J. Reine Angew. Math. 21, 372–374.
Terras, A. (1976). Some formulas for the Riemann zeta function at odd integer argument
resulting from Fourier expansions of the Epstein zeta function, Acta Arith. 29,
181–189.
Titchmarsh, E. C. (1935). The zeros of the Riemann zeta function, Proc. Royal Soc.
London Ser. A 151, 234–255.
(1936). The zeros of the Riemann zeta function, Proc. Roy. Soc. London Ser. A 157,
261–263.
Turing, A. (1953). Some calculations of the Riemann zeta-function, Proc. London Math.
Soc. (3) 3, 99–117.
Wallis, J. (1656). Arithmetica Inﬁnitorum, Oxford.
Whittaker, E. T. & Watson, G. N. (1927). A Course of Modern Analysis, Fourth edition.
Cambridge: Cambridge University Press.
Wirtinger, W. (1902). Einige Anwendungen der Euler–Maclaurin’schen Summenformel,
insbesondere auf eine Aufgabe von Abel, Acta Math. 26, 255–271.
Appendix C
The gamma function

For any complex number $s$ not equal to a non-positive integer we define the
gamma function by its Weierstrass product,
\[ \Gamma(s) = \frac{e^{-C_0 s}}{s} \prod_{n=1}^{\infty} \frac{e^{s/n}}{1 + s/n}. \tag{C.1} \]
Here $C_0$ is Euler’s constant, and we recall from Corollary 1.14 or Exercise B.15
that this constant is determined by the relation
\[ \sum_{n=1}^{N} \frac{1}{n} = \log N + C_0 + O(1/N). \tag{C.2} \]
From (C.1) it is evident that $1/\Gamma(s)$ is an entire function with simple zeros at the
non-positive integers, which is to say that $\Gamma(s)$ is a non-vanishing meromorphic
function with simple poles at the non-positive integers as depicted in Figure C.1.
On considering the $N$th partial product in (C.1) and appealing to (C.2), we obtain
Gauss’s formula,
\[ \Gamma(s) = \lim_{N \to \infty} \frac{N^s N!}{s(s+1) \cdots (s+N)}. \tag{C.3} \]

By taking $s = 1$ we see that $\Gamma(1) = 1$. Moreover, from (C.3) it is also immediate
that
\[ s\Gamma(s) = \Gamma(s+1). \tag{C.4} \]
Hence by induction we find that
\[ \Gamma(n+1) = n! \tag{C.5} \]
for non-negative integers $n$. As will become apparent, the gamma function not
only interpolates the values of the factorial, but does so quite smoothly.
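Gauss’s formula (C.3) can be tried out numerically. The sketch below, with an illustrative function name, computes the $N$th approximant $N^s N!/(s(s+1)\cdots(s+N))$ for real $s$ as a running product, so that neither $N!$ nor the denominator is formed separately, and compares it with the library value `math.gamma`. Convergence is slow, with relative error of order $1/N$, so this is a demonstration of the limit rather than a practical way to compute the gamma function.

```python
import math


def gauss_partial(s: float, N: int) -> float:
    """N-th approximant N^s * N! / (s (s+1) ... (s+N)) from (C.3), real s."""
    value = N ** s / s
    for n in range(1, N + 1):
        # Multiply by n / (s + n): one factor of N! over one factor of the denominator.
        value *= n / (s + n)
    return value


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.5, 5.0):
        for N in (10, 1000, 100000):
            print(s, N, gauss_partial(s, N), math.gamma(s))
```

For $s = 1$ the approximant is $N/(N+1)$, which shows the $O(1/N)$ rate of convergence directly.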
The function $\Gamma(s)\Gamma(1-s)$ has a simple pole at every integer. Since the same
can be said for $1/\sin \pi s$, it is reasonable to investigate the relation between these
two functions. To this end we let $p_N(s)$ denote the expression on the right in
(C.3), and note that
\[ p_N(s)\, p_N(1-s) = \frac{N}{s(N+1-s)} \prod_{n=1}^{N} \big(1 - (s/n)^2\big)^{-1}. \]

Figure C.1 Graph of $\Gamma(s)$ for $-5 < s \le 5$.

On the other hand, we recall that the Weierstrass product for the sine function
may be written
\[ \sin s = s \prod_{n=1}^{\infty} \Big(1 - \frac{s^2}{(\pi n)^2}\Big). \]
On comparing these formulæ we conclude that
\[ \Gamma(s)\Gamma(1-s) = \frac{\pi}{\sin \pi s}. \tag{C.6} \]
We take $s = 1/2$ to see that $\Gamma(1/2)^2 = \pi$. But from (C.1) it is clear that
$\Gamma(1/2) > 0$, so we have
\[ \Gamma(1/2) = \sqrt{\pi}. \tag{C.7} \]

From (C.1) we see that $\Gamma(s)$ never takes the value 0, and that it has simple
poles at the non-positive integers. Let $k$ be a non-negative integer. Since
$\sin \pi s \sim (-1)^k \pi (s+k)$ as $s \to -k$, and since $\Gamma(k+1) = k!$, it follows from
(C.6) that
\[ \Gamma(s) \sim \frac{(-1)^k}{k!\,(s+k)} \tag{C.8} \]
as $s \to -k$.
Similarly we observe that $\Gamma(s)\Gamma(s+1/2)$ has a simple pole at $0, -1/2, -1,
-3/2, -2, \ldots$, and that the same is true of $\Gamma(2s)$. We now establish a relation
between these two functions by observing that
\[ \frac{p_N(s)\, p_N(s+1/2)}{p_{2N}(2s)} = 2^{1-2s}\, \frac{N+1/2}{N+s+1/2}\, p_N(1/2). \]
On letting $N \to \infty$ and using (C.7) we obtain Legendre’s duplication
formula,
\[ \Gamma(s)\Gamma(s+1/2) = \sqrt{\pi}\, 2^{1-2s}\, \Gamma(2s). \tag{C.9} \]
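The reflection formula (C.6) and the duplication formula (C.9) are easy to spot-check for real arguments with the standard library. The following Python sketch, whose function names are illustrative, returns the difference between the two sides of each identity; `math.gamma` accepts negative non-integer arguments, so the reflection formula can also be sampled outside $0 < s < 1$.

```python
import math


def reflection_gap(s: float) -> float:
    """Gamma(s) Gamma(1 - s) - pi / sin(pi s), for real non-integer s."""
    return math.gamma(s) * math.gamma(1 - s) - math.pi / math.sin(math.pi * s)


def duplication_gap(s: float) -> float:
    """Gamma(s) Gamma(s + 1/2) - sqrt(pi) 2^(1 - 2s) Gamma(2s), for real s > 0."""
    lhs = math.gamma(s) * math.gamma(s + 0.5)
    rhs = math.sqrt(math.pi) * 2 ** (1 - 2 * s) * math.gamma(2 * s)
    return lhs - rhs


if __name__ == "__main__":
    for s in (0.3, 1.7, 2.5):
        print(s, reflection_gap(s), duplication_gap(s))
```

Both differences should be zero up to rounding error; such a check confirms nothing beyond the sampled points, but it is a quick guard against sign or exponent slips when using the formulas.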
On taking logarithmic derivatives in (C.1) we find that the digamma function
$\frac{\Gamma'}{\Gamma}(s)$ can be written
\[ \frac{\Gamma'}{\Gamma}(s) = -\frac{1}{s} - C_0 - \sum_{n=1}^{\infty} \Big(\frac{1}{s+n} - \frac{1}{n}\Big). \tag{C.10} \]
Setting $s = 1$, we see in particular that
\[ \frac{\Gamma'}{\Gamma}(1) = -C_0. \tag{C.11} \]
Since $\Gamma(1) = 1$, this is equivalent to
\[ \Gamma'(1) = -C_0. \tag{C.12} \]
We write $z = r e(\theta)$ in the power series expansion $\log(1-z)^{-1} = \sum_{n=1}^{\infty} z^n/n$,
let $r \to 1^-$, and apply Abel’s theorem to see that
\[ \sum_{n=1}^{\infty} \frac{e(n\theta)}{n} = -\log(1 - e(\theta)) \tag{C.13} \]
provided that $\theta \notin \mathbb{Z}$. By applying this formula for various rational values of $\theta$
we can express the series in (C.10) in closed form, for any rational value of $s$.
For example, by taking $\theta = 1/2$ we find that
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \log 2, \]
which with (C.10) gives
\[ \frac{\Gamma'}{\Gamma}(1/2) = -C_0 - 2 \log 2. \tag{C.14} \]
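Also, since
\[ \frac{-1-i}{4}\, e(n/4) - \frac{1}{2}\, e(n/2) + \frac{-1+i}{4}\, e(3n/4) = \begin{cases} 1 & \text{if } n \equiv 1 \pmod{4}, \\ -1 & \text{if } n \equiv 0 \pmod{4}, \\ 0 & \text{otherwise,} \end{cases} \]
by taking $\theta = 1/4, 1/2, 3/4$ in (C.13) we deduce via (C.10) that
\[ \frac{\Gamma'}{\Gamma}(1/4) = -C_0 - 3 \log 2 - \pi/2. \tag{C.15} \]
Similarly,
\[ \frac{\Gamma'}{\Gamma}(3/4) = -C_0 - 3 \log 2 + \pi/2. \tag{C.16} \]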
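The closed forms (C.14)–(C.16) can be compared with a direct truncation of the series (C.10). The Python sketch below is illustrative only: the function name and the default number of terms are choices made for this example, the numerical value of $C_0$ is hard-coded, and since the tail of the series after $N$ terms is about $s/N$, only five or so decimal places should be expected.

```python
import math

EULER_C0 = 0.5772156649015329  # Euler's constant C_0, hard-coded for this sketch


def digamma_series(s: float, terms: int = 200_000) -> float:
    """Truncation of (C.10): -1/s - C_0 - sum_{n=1}^{terms} (1/(s+n) - 1/n)."""
    total = -1.0 / s - EULER_C0
    for n in range(1, terms + 1):
        total -= 1.0 / (s + n) - 1.0 / n
    return total


if __name__ == "__main__":
    closed_forms = {
        0.5: -EULER_C0 - 2 * math.log(2),                 # (C.14)
        0.25: -EULER_C0 - 3 * math.log(2) - math.pi / 2,  # (C.15)
        0.75: -EULER_C0 - 3 * math.log(2) + math.pi / 2,  # (C.16)
    }
    for s, exact in closed_forms.items():
        print(s, digamma_series(s), exact)
```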

We now consider the asymptotic behaviour of the gamma function.

Theorem C.1 Let $\delta > 0$ be given, and let $R = R(\delta)$ be the set of those complex
numbers $s$ for which $|s| \ge \delta$ and $|\arg s| < \pi - \delta$. Then
\[ \frac{\Gamma'}{\Gamma}(s) = \log s + O(1/|s|) \tag{C.17} \]
and
\[ \Gamma(s) = \sqrt{2\pi}\, s^{s-1/2} e^{-s} \big(1 + O(1/|s|)\big) \tag{C.18} \]
uniformly for $s \in R$.

The second estimate here is Stirling’s formula for the gamma function, which
generalizes his estimate (B.26) for $n!$. From this we see that
\[ |\Gamma(s)| \asymp \tau^{\sigma - 1/2} e^{-\pi\tau/2} \tag{C.19} \]
as $|t| \to \infty$ with $\sigma$ uniformly bounded.

Proof From (C.2) and (C.10) we see that if $N > |s|$, then
\[ \frac{\Gamma'}{\Gamma}(s) = \log N - \sum_{n=0}^{N} \frac{1}{n+s} + O(|s|/N). \]
By the Euler–Maclaurin summation formula (Theorem B.5) with $f(x) = 1/(x+s)$,
$a = 0^-$, $b = N$, $K = 2$ we find that
\[ \sum_{n=0}^{N} \frac{1}{n+s} = \log(N+s) - \log s + \frac{1}{2s} + \frac{1}{2(s+N)} + O(|s|^{-2}). \]
On combining these estimates and letting $N$ tend to infinity we find that
\[ \frac{\Gamma'}{\Gamma}(s) = \log s - \frac{1}{2s} + O(|s|^{-2}). \tag{C.20} \]
This estimate is more precise than (C.17), and still greater accuracy can be
obtained by choosing a larger value of $K$.
To derive (C.18) we begin by taking logarithms in (C.3) and applying the
Euler–Maclaurin summation formula, or we integrate (C.20) from $s$ to $s + \infty$
along a ray parallel to the real axis. In either case we find that
\[ \log \Gamma(s) = s \log s - s - \frac{1}{2} \log s + c + O(1/|s|), \]
and it remains to determine the value of the constant $c$. This may be done
in a number of ways. For example, we could appeal to (C.5) and (B.26).
Alternatively, we can take logarithms in (C.9) and apply the above to see that
$c = (\log 2\pi)/2$. Then (C.18) follows by exponentiating.
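A quick numerical look at (C.18) for real $s$ shows the size of the $O(1/|s|)$ term. The Python sketch below, with an illustrative function name, prints the ratio of `math.gamma(s)` to the main term $\sqrt{2\pi}\, s^{s-1/2} e^{-s}$ alongside $1 + 1/(12s)$. The value $1/(12s)$ for the leading correction is the standard next term of Stirling’s series; it is quoted here as a known fact rather than derived in the text above.

```python
import math


def stirling_main_term(s: float) -> float:
    """Main term sqrt(2 pi) * s^(s - 1/2) * e^(-s) of (C.18), for real s > 0."""
    return math.sqrt(2 * math.pi) * s ** (s - 0.5) * math.exp(-s)


if __name__ == "__main__":
    for s in (5.0, 10.0, 50.0):
        ratio = math.gamma(s) / stirling_main_term(s)
        # The ratio should be close to 1 + 1/(12 s), consistent with 1 + O(1/|s|).
        print(s, ratio, 1 + 1 / (12 * s))
```

The sketch only samples the positive real axis; the theorem itself asserts uniformity throughout the sector $R(\delta)$.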

The gamma function can be expressed as a definite integral in various
ways. We now establish two important integral representations for the gamma
function.

Theorem C.2 (Euler’s integral) If $\Re s > 0$, then
\[ \int_0^{\infty} e^{-x} x^{s-1} \, dx = \Gamma(s). \tag{C.21} \]

Proof By integrating by parts repeatedly it is easy to verify that
\[ \frac{N!}{s(s+1) \cdots (s+N)} = \int_0^1 (1-y)^N y^{s-1} \, dy. \]
We make the change of variable $x = Ny$ and recall Gauss’s formula (C.3) to
find that
\[ \Gamma(s) = \lim_{N \to \infty} \int_0^{\infty} f_N(x) \, dx \]
where
\[ f_N(x) = \begin{cases} (1 - x/N)^N x^{s-1} & \text{for } 0 \le x \le N, \\ 0 & \text{for } x > N. \end{cases} \]
To complete the proof we employ the dominated convergence theorem. Put
$f(x) = e^{-x} x^{\sigma - 1}$. Then $\int_0^{\infty} f(x) \, dx < \infty$ when $\sigma > 0$, and $|f_N(x)| \le f(x)$
uniformly in $N$ and $x$. Since
\[ \lim_{N \to \infty} f_N(x) = e^{-x} x^{s-1} \]
for each fixed $x$, the formula (C.21) now follows.

Let $C(\rho)$ denote the circular arc $\{z = \rho e(\theta) : 0 \le \theta \le 1/4\}$. It is easy to
verify that
\[ \int_{C(\rho)} |e^{-z} z^{s-1}| \, |dz| \to 0 \]
as $\rho \to \infty$. Thus by Cauchy’s theorem the formula (C.21) still holds if $x$ is
replaced by a complex variable $z$ that goes to infinity along a ray from the
origin, $z = \rho e(\theta)$, $0 \le \rho < \infty$, provided that $-1/4 \le \theta \le 1/4$.
For $r > 0$ we let $\mathcal{H} = \mathcal{H}(r)$ denote the Hankel contour, which consists of
a path that passes from $-ir - \infty$ to $-ir$ along the ray $x - ir$, $-\infty < x \le 0$,
and then from $-ir$ to $ir$ along the semicircle $r e(\theta)$, $-1/4 \le \theta \le 1/4$, and then
from $ir$ to $ir - \infty$ along the ray $x + ir$, $-\infty < x \le 0$.

Theorem C.3 (Hankel) For any complex number $s$,
\[ \frac{1}{2\pi i} \int_{\mathcal{H}} e^{z} z^{-s} \, dz = \frac{1}{\Gamma(s)}. \tag{C.22} \]
Here $z^{-s}$ is assumed to have its principal value.

As in the preceding theorem, the contour of integration may be altered
substantially without changing the value of the integral. For example, the ray from
$ir$ to $-\infty + ir$ may be replaced by a ray in the direction $e(\theta)$, provided that
$1/4 < \theta < 1/2$.

Proof It is clear that the left-hand side is an entire function of $s$. Thus it suffices
to prove the identity when $\sigma < 1$. For such $s$ we let $r \to 0^+$, and note that the
integral along the semicircle tends to 0. The remaining integrals tend to
\[ e^{i\pi s} \int_0^{\infty} e^{-x} x^{-s} \, dx - e^{-i\pi s} \int_0^{\infty} e^{-x} x^{-s} \, dx = 2i (\sin \pi s)\, \Gamma(1-s) \]
by (C.21). To complete the proof it suffices to appeal to (C.6).

Euler’s formula asserts that the gamma function is the Mellin transform of
the function $e^{-x}$. We now establish the inverse.

Theorem C.4 (Mellin) If $z > 0$ and $c > 0$, then
\[ \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \Gamma(s) z^{-s} \, ds = e^{-z}. \]

Proof From Stirling’s formula we see that
\[ \int_{-K + iK}^{c + iK} |\Gamma(s) z^{-s}| \, |ds| \longrightarrow 0 \]
as $K \to \infty$, and similarly for the integral from $-K - iK$ to $c - iK$. Moreover,
if we first apply (C.6) and then Stirling’s formula, we find that
\[ \int_{-K - iK}^{-K + iK} |\Gamma(s) z^{-s}| \, |ds| \longrightarrow 0 \]
as $K \to \infty$ through values of the form $K = n + 1/2$, $n \in \mathbb{Z}$. (We are assuming
here that the path of integration is a line segment joining the two endpoints.)
Thus by the calculus of residues
\[ \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \Gamma(s) z^{-s} \, ds = \sum_{k=0}^{\infty} \operatorname*{Res}_{s=-k} \Gamma(s) z^{-s}. \]
From (C.8) we see that the above is
\[ \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} z^k = e^{-z}. \]

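The digamma function can be examined in a similar way. In view of (C.17),
this function is not absolutely integrable on the line $\sigma = c$, and thus we cannot
define its Fourier transform in the classical manner. We now formulate a useful
substitute.

Theorem C.5 Let $a > 0$ and $b > 0$ be fixed. If $x < 0$ and $T \ge 1$, then
\[ \int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)\, e(-xt) \, dt = -\frac{e(-xT)}{2\pi i x} \frac{\Gamma'}{\Gamma}(a + ibT) + \frac{e(xT)}{2\pi i x} \frac{\Gamma'}{\Gamma}(a - ibT) - 2\pi b^{-1} e^{2\pi a x/b} \big(1 - e^{2\pi x/b}\big)^{-1} + O(x^{-2} T^{-1}), \]
while if $x > 0$ and $T \ge 1$, then
\[ \int_{-T}^{T} \frac{\Gamma'}{\Gamma}(a + ibt)\, e(-xt) \, dt = -\frac{e(-xT)}{2\pi i x} \frac{\Gamma'}{\Gamma}(a + ibT) + \frac{e(xT)}{2\pi i x} \frac{\Gamma'}{\Gamma}(a - ibT) + O(x^{-2} T^{-1}). \]

Proof We write the integral as
\[ \frac{1}{i} \int_{-iT}^{iT} \frac{\Gamma'}{\Gamma}(a + bs)\, e^{-2\pi x s} \, ds. \]
Suppose that $x < 0$. Let $C$ be the contour passing by line segment from
$-\infty - iT$ to $-iT$ to $iT$ to $-\infty + iT$. By the calculus of residues and (C.10)
we find that
\[ \int_C \frac{\Gamma'}{\Gamma}(a + bs)\, e^{-2\pi x s} \, ds = -\frac{2\pi i}{b} \sum_{n=0}^{\infty} e^{2\pi x (n+a)/b} = -\frac{2\pi i}{b}\, e^{2\pi a x/b} \big(1 - e^{2\pi x/b}\big)^{-1}. \]
We parametrize the integral $\int_{-\infty - iT}^{-iT}$, and integrate by parts, to see that it is
\[ \int_{-\infty}^{0} \frac{\Gamma'}{\Gamma}(a + b\sigma - ibT)\, e(xT)\, e^{-2\pi x \sigma} \, d\sigma = -\frac{e(xT)}{2\pi x} \frac{\Gamma'}{\Gamma}(a - ibT) + \frac{b\, e(xT)}{2\pi x} \int_{-\infty}^{0} \Big(\frac{\Gamma'}{\Gamma}\Big)'(a + b\sigma - ibT)\, e^{-2\pi x \sigma} \, d\sigma. \]
But
\[ \Big(\frac{\Gamma'}{\Gamma}\Big)'(s) = \sum_{n=0}^{\infty} (n+s)^{-2} \ll 1/|t| \]
for $|t| \ge 1$, and hence the last integral above is $\ll x^{-2} T^{-1}$. Similarly,
\[ \int_{iT}^{-\infty + iT} \frac{\Gamma'}{\Gamma}(a + bs)\, e^{-2\pi x s} \, ds = \frac{e(-xT)}{2\pi x} \frac{\Gamma'}{\Gamma}(a + ibT) + O(x^{-2} T^{-1}). \]
We obtain the stated result on combining these estimates. The case $x > 0$
is treated similarly, but with a contour from $+\infty - iT$ to $-iT$ to $iT$ to
$+\infty + iT$.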

Exercises
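1. Show:
(a) $|\Gamma(it)|^2 = \dfrac{\pi}{t \sinh \pi t}$;
(b) $|\Gamma(1/2 + it)|^2 = \dfrac{\pi}{\cosh \pi t}$;
(c) $\Im \dfrac{\Gamma'}{\Gamma}(s) > 0$ if $t > 0$;
(d) $\dfrac{\partial}{\partial t} \log |\Gamma(s)| < 0$ when $t > 0$;
(e) For any given $\sigma$, $|\Gamma(s)|$ is a strictly decreasing function of $t$ on the
interval $0 < t < \infty$.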
2. (Gauss 1812) Prove Gauss’s multiplication formula:
\[ \prod_{a=0}^{q-1} \Gamma(s + a/q) = (2\pi)^{(q-1)/2}\, q^{1/2 - qs}\, \Gamma(qs). \]

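3. Show:
(a) $\dfrac{\Gamma'}{\Gamma}(1-s) - \dfrac{\Gamma'}{\Gamma}(s) = \pi \cot \pi s$;
(b) $\dfrac{\Gamma'}{\Gamma}(s+1) = \dfrac{1}{s} + \dfrac{\Gamma'}{\Gamma}(s)$;
(c) If $n$ is an integer, $n > 1$, then
\[ \frac{\Gamma'}{\Gamma}(n) = -C_0 + \sum_{k=1}^{n-1} \frac{1}{k}. \]
4. (Gauss 1812) Using additive characters (as discussed in Chapter 4), or
otherwise, show that if $0 < a \le q$, then
\[ \frac{\Gamma'}{\Gamma}(a/q) = -C_0 - \log q + \sum_{h=1}^{q-1} e(-ah/q) \log(1 - e(h/q)). \]
5. Show that $\dfrac{\Gamma'}{\Gamma}(1/3) = -C_0 - \frac{3}{2} \log 3 - \pi\sqrt{3}/6$.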
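6. Show that
\[ \frac{\Gamma'}{\Gamma}(s) = -C_0 + \sum_{n=1}^{\infty} (-1)^{n+1} \zeta(n+1) (s-1)^n \]
for $|s - 1| < 1$.
7. Show:
(a) $\Big(\dfrac{\Gamma'}{\Gamma}\Big)'(s) = \sum_{n=0}^{\infty} (s+n)^{-2}$;
(b) $\dfrac{\Gamma''(s)}{\Gamma(s)} = \dfrac{\Gamma'}{\Gamma}(s)^2 + \sum_{n=0}^{\infty} (s+n)^{-2}$;
(c) The functions $\Gamma(\sigma)$, $\Gamma''(\sigma)$ have the same sign for all real $\sigma$.
8. Show that if $x > 0$ and $y \ge 1$, then
\[ \frac{\Gamma(x+y)}{\Gamma(x)} \ge x^y. \]
9. (Hermite 1881) Let $x_n$ denote the unique critical point of $\Gamma(\sigma)$ in the
interval $(-n, -n+1)$. Show that $x_n = -n + (\log n)^{-1} + O((\log n)^{-2})$ for
$n \ge 2$.
10. Show that $\Big(\dfrac{\Gamma'}{\Gamma}\Big)'(s) = s^{-1} + \frac{1}{2} s^{-2} + O(|s|^{-3})$ uniformly in the region $R$ of
Theorem C.1.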
11. (a) Show that $\int_1^{\infty} e^{-x} x^{s-1}\,dx$ is an entire function.
(b) Show that if $\sigma > 0$, then
\[
\int_0^1 e^{-x} x^{s-1}\,dx = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(s + n)}.
\]
(c) Show that if $s$ is not a non-positive integer, then
\[
\Gamma(s) = \int_1^{\infty} e^{-x} x^{s-1}\,dx + \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(s + n)}.
\]
12. (a) Show that if $\sigma > 0$, then
\[
\Gamma^{(k)}(s) = \int_0^{\infty} e^{-x} x^{s-1} (\log x)^k\,dx.
\]
(b) Show that
\[
\int_0^{\infty} e^{-x} \log x\,dx = -C_0.
\]
13. (Cauchy 1827; Saalschütz 1887, 1888) Show that if $-1 < \sigma < 0$, then
\[
\Gamma(s) = \int_0^{\infty} (e^{-x} - 1) x^{s-1}\,dx.
\]
14. Let $s$ be fixed with $\sigma > 0$, and let $f_N(x)$ be the function defined in the proof
of Theorem C.2. Show that
\[
\int_0^{\infty} f_N(x)\,dx = \Gamma(s) - \Gamma(s + 2)/(2N) + O(N^{-2}).
\]
15. (Mellin 1883a, b) Let $P(z)$ and $Q(z)$ be relatively prime polynomials over
$\mathbb{C}$, with roots $\alpha_1, \ldots, \alpha_m$ and $\beta_1, \ldots, \beta_n$, respectively, and suppose that
none of these roots is a positive integer.
(a) Suppose that $\prod_{k=1}^{\infty} \frac{P(k)}{Q(k)}$ converges. Show:
(i) $m = n$;
(ii) $P$ and $Q$ have the same leading coefficient;
(iii) $\sum \alpha_i = \sum \beta_i$.
(b) Show conversely that if conditions (i)–(iii) hold, then the product converges,
and has the value
\[
\prod_{i=1}^{m} \frac{\Gamma(1 - \beta_i)}{\Gamma(1 - \alpha_i)}.
\]
(c) Show that if $a$ and $b$ are complex numbers such that none of $a$, $b$, $a + b$
is a negative integer, then
\[
\prod_{n=1}^{\infty} \frac{n(n + a + b)}{(n + a)(n + b)} = \frac{\Gamma(a + 1)\Gamma(b + 1)}{\Gamma(a + b + 1)}.
\]
16. (Liouville 1852) Show that if $q$ is an integer, $q > 1$, then
\[
\prod_{n=1}^{\infty} \big(1 - (z/n)^q\big)^{-1} = -z^q \prod_{a=1}^{q} \Gamma(-z e(a/q)).
\]

17. (Mellin 1891, p. 324)
(a) Show that
\[
\frac{\Gamma(\sigma)^2}{|\Gamma(s)|^2} = \prod_{n=0}^{\infty} \Big(1 + \frac{t^2}{(n + \sigma)^2}\Big).
\]
(b) Give a second derivation of the assertion of Exercise 1(e).
18. (Gram 1899) Show that
\[
\prod_{n=2}^{\infty} \frac{n^3 - 1}{n^3 + 1} = \frac{2}{3}.
\]

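19. Show that if $\sigma > 0$, then
\[
\Gamma(s) = \int_0^1 (\log 1/x)^{s-1}\,dx,
\]
and
\[
\Gamma(s) = \int_{-\infty}^{\infty} e^{-e^x} e^{sx}\,dx.
\]
20. (Euler 1794)
(a) Show that if $-1 < \sigma < 1$, then
\[
\int_0^{\infty} (\sin x)\, x^{s-1}\,dx = \Gamma(s) \sin \tfrac{1}{2}\pi s.
\]
(b) Show that if $0 < \sigma < 1$, then
\[
\int_0^{\infty} (\cos x)\, x^{s-1}\,dx = \Gamma(s) \cos \tfrac{1}{2}\pi s.
\]
21. For $a > 0$, $b > 0$ let the beta function $B(a, b)$ be defined to be
\[
B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1}\,dx.
\]
(a) Write
\[
\Gamma(a)\Gamma(b) = \int_0^{\infty}\!\!\int_0^{\infty} e^{-u-v} u^{a-1} v^{b-1}\,du\,dv
\]
and make the change of variables $u = rx$, $v = r(1 - x)$ to show that
\[
B(a, b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)}.
\]
(b) Show that if $a > 0$ and $b > 0$, then
\[
\int_0^{1} x^{2a-1} (1 - x^2)^{b-1}\,dx = \tfrac{1}{2} B(a, b).
\]
(c) Show that if $a > 0$ and $b > 0$, then
\[
\int_0^{\pi/2} (\sin \theta)^{2a-1} (\cos \theta)^{2b-1}\,d\theta = \tfrac{1}{2} B(a, b).
\]
(d) By writing $t = \tan^2 \theta$, or otherwise, show that if $a > 0$ and $b > 0$,
then
\[
\int_0^{\infty} \frac{t^{a-1}}{(1 + t)^{a+b}}\,dt = B(a, b).
\]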
22. (Dirichlet 1839; Liouville 1839) Let $f(x)$ be a continuous function defined
on $[0, 1]$. Let $R$ denote that portion of $\mathbb{R}^n$ for which $x_i \ge 0$ and $\sum x_i \le 1$.
Show that
\[
\int_R f(x_1 + \cdots + x_n)\, x_1^{a_1 - 1} \cdots x_n^{a_n - 1}\,dx_1 \cdots dx_n
= \frac{\Gamma(a_1) \cdots \Gamma(a_n)}{\Gamma(a_1 + \cdots + a_n)} \int_0^1 f(x)\, x^{a-1}\,dx
\]
where $a = \sum a_i$ and $a_i > 0$ for all $i$.
23. (Mellin 1902) Suppose that $z$ lies in the slit plane formed by deleting the
negative real axis. Show that if $0 < c < a$, then
\[
\frac{\Gamma(a)}{(1 + z)^a} = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \Gamma(s)\Gamma(a - s)\, z^{-s}\,ds.
\]
(This is the inverse of the Mellin transform in Exercise 21(d).)
24. (Raabe 1844) Show that if $s$ is not a negative real number or 0, then
\[
\int_s^{s+1} \log \Gamma(z)\,dz = s \log s - s + \tfrac{1}{2} \log 2\pi.
\]
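25. (Barnes 1900) Let
\[
G(s + 1) = (2\pi)^{s/2} \exp\Big(-\tfrac{1}{2}(C_0 + 1)s^2 - \tfrac{1}{2}s\Big) \prod_{n=1}^{\infty} \Big(1 + \frac{s}{n}\Big)^{n} e^{-s + s^2/(2n)}.
\]
Show:
(a) $G(s)$ is an entire function.
(b) $G(1) = 1$.
(c) $G(s + 1) = \Gamma(s) G(s)$.
(d)
\[
G(n + 1) = \frac{(n!)^n}{1^1 2^2 3^3 \cdots n^n}.
\]
26. Show that
\[
\sum_{n=1}^{\infty} \frac{(-1)^n n^2}{n^3 + 1} = \frac{1}{3} \ln 2 - \frac{1}{3} - \frac{\pi}{3 \cosh(\pi\sqrt{3}/2)}.
\]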

C.1 Notes
Euler, in a letter of 1729 to Goldbach (cf. Fuss 1843, p. 3) gave the formula
\[
\Gamma(s) = \frac{1}{s} \prod_{n=1}^{\infty} \Big(1 + \frac{1}{n}\Big)^{s} \Big(1 + \frac{s}{n}\Big)^{-1}.
\]
This is substantially the same as the formula (C.3) that Gauss (1812) took to be
fundamental. Based on the above definition of the gamma function, the formula
(C.1) was proved by Schlömilch (1844) and Newman (1848). Weierstrass (1856)
took (C.1) to be the definition of the gamma function. Euler had given the special
value (C.7) already in his letter to Goldbach. Euler (1771) also discovered the
reflection formula (C.6). The duplication formula (C.9) of Legendre (1809) is
a special case of the multiplication formula of Gauss (1812), given in Exercise
C.3. Stirling (1730, p. 135) gave the series expansion
\[
\log \Gamma(s) = \Big(s - \frac{1}{2}\Big) \log s - s + \frac{1}{2} \log 2\pi + \sum_{n=2}^{\infty} \frac{B_n}{n(n - 1)s^{n-1}}.
\]
This series diverges, but a partial sum provides an asymptotic expansion. The
approximation (C.17) is a weak form of this. To calculate $\Gamma(s)$ numerically, it
suffices to consider $\sigma \ge 1/2$, in view of (C.6). If $|s|$ is small then (C.4) should be
used repeatedly. Thus it remains to evaluate $\Gamma(s)$ when $\sigma \ge 1/2$ and $|s|$ is large,
and this is quickly achieved by using the expansion above. By these means it may
be found that the sole minimum of $\Gamma(\sigma)$ for $\sigma > 0$ is at $\sigma_0 = 1.4616321\ldots$,
and that $\Gamma(\sigma_0) = 0.88560319\ldots$. The convenient estimate (C.19) was noted
by Pincherle (1888). Theorems C.1 and C.2 may be established in several
ways. An instructive collection of such proofs is found in Sections 8.4, 8.5,
11.1, 11.11, and 12.12 of Henrici (1977). Euler (1730) gave the formula of
Theorem C.2, expressed in the form $n! = \int_0^1 (\log 1/y)^n\,dy$, and subsequently
found many other integral formulæ involving the gamma function. Thus Euler
was led in quite a different direction than Gauss (1812), whose independent
investigations were more directly related to Gauss’s formula (C.3). Legendre
(1809) called the formula (C.21) the ‘Euler integral of the second kind’, and
introduced the notation $\Gamma(z)$. The ‘Euler integral of the first kind’ is known
today as the beta function (see Exercise C.21). Theorem C.3 is due to Hankel
(1864), and Theorem C.4 to Mellin (1896, p. 76, 1899, p. 39).
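The numerical recipe described in these notes can be sketched in Python as follows. This is only an illustration under stated assumptions: we take (C.6) to be the reflection formula $\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s$ (as the notes say) and we assume that (C.4) is the recurrence $\Gamma(s+1) = s\Gamma(s)$. The function names, the truncation of Stirling's series at $B_{12}$, and the threshold `shift_to` are choices made here for illustration, not taken from the text.

```python
import cmath
import math

# Bernoulli numbers B_2, ..., B_12 (B_n = 0 for odd n > 1); standard values.
BERNOULLI = {2: 1 / 6, 4: -1 / 30, 6: 1 / 42, 8: -1 / 30, 10: 5 / 66, 12: -691 / 2730}


def log_gamma_stirling(s):
    """Partial sum of Stirling's series; intended for Re s >= 1/2 and |s| large."""
    total = (s - 0.5) * cmath.log(s) - s + 0.5 * math.log(2 * math.pi)
    for n, b in BERNOULLI.items():
        total += b / (n * (n - 1) * s ** (n - 1))
    return total


def gamma(s, shift_to=15.0):
    """Gamma(s) via reflection, then the recurrence, then Stirling's series.

    shift_to is a hypothetical threshold: the recurrence is applied until
    |s| >= shift_to. Poles (s = 0, -1, -2, ...) raise ZeroDivisionError.
    """
    s = complex(s)
    if s.real < 0.5:
        # Reflection formula: Gamma(s) * Gamma(1 - s) = pi / sin(pi * s).
        return math.pi / (cmath.sin(math.pi * s) * gamma(1 - s, shift_to))
    # Assumed recurrence Gamma(s) = Gamma(s + 1) / s, repeated until |s| is large.
    divisor = 1.0 + 0j
    while abs(s) < shift_to:
        divisor *= s
        s += 1
    return cmath.exp(log_gamma_stirling(s)) / divisor


if __name__ == "__main__":
    print(gamma(5).real)            # compare with 4! = 24
    print(gamma(1.4616321).real)    # compare with the minimum value quoted above
```

With these choices the omitted tail of the series is of the size of the first neglected term, which is negligible in double precision once $|s| \ge 15$.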
Simple proofs of Stirling’s formula for n!, using a minimum of tools, have
been given by Robbins (1955) and Feller (1965).
For more extensive expositions of the subject the reader is referred to Artin
(1964), Henrici (1977), Jensen (1916), Nielsen (1906), and to Whittaker &
Watson (1950, Chapter 12). The related Mellin–Barnes integrals are discussed
in Section 8.8 of Henrici (1977).
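Gauss and Binet established several useful formulæ for $\log \Gamma(s)$ and for
$\dfrac{\Gamma'}{\Gamma}(s)$. Kummer (1847) proved that if $0 < \sigma < 1$, then
\[
\log \Gamma(\sigma) = (C_0 + \log 2)\Big(\frac{1}{2} - \sigma\Big) + (1 - \sigma) \log \pi - \frac{1}{2} \log \sin \pi\sigma
+ \sum_{n=1}^{\infty} \frac{\log n}{\pi n} \sin 2\pi n\sigma.
\]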

In conjunction with the analysis of Chapter 9, this gives

q
q √
q
χ (a) log (a/q) = −(C0 + log 2π ) aχ (a) − L (1, χ )
a=1 a=1
π

where χ is a primitive character (mod q) for which χ (−1) = −1.

Artin (1931, 1964; p. 14) showed that if f (x) is positive and log f (x) is
convex for x > 0, if x f (x) = f (x + 1) for all x > 0, and f (1) = 1, then f (x) =
Γ(x).
Hölder (1886) showed that Γ(s) does not satisfy an algebraic differential
equation. Additional proofs of this have been given by Moore (1897), Jensen
(1916, pp. 103–112) and Ostrowski (1919).

C.2 References
Artin, E. (1931). Einführung in die Theorie der Gamma-Funktion. Hamburger math.
Einzelschriften 11. Leipzig: Teubner.
(1964). The Gamma Function. New York: Holt, Rinehart and Winston.
Barnes, E. W. (1900). The theory of the G-function, Quart. J. Math. 31, 264–314.
Cauchy, A. L. (1827). Exercices de Math. Vol. 2. Paris: de Bure Frères, pp. 91–92.
Lejeune–Dirichlet, P. G. (1839). Sur une nouvelle méthode pour la détermination des
intégrales multiples, J. Math. pures appl. 4, 164–168; Werke I, pp. 375–380.
Euler, L. (1730). De progressionibus transcendentibus seu quarum termini generales
algebraice dari nequeunt, Comment. Acad. Sci. Petropolitanae 5, 36–57; Opera
Omnia, Ser 1, Vol. 14, Teubner, 1924, pp. 1–14.
(1771). Evolutio formulae integralis $\int x^{f-1} (\log x)^{m/n}\,dx$ integratione a valore x = 0
ad x = 1 extensa, Novi Comment. Acad. Petropol. 16, 91–139.
(1794). Institutiones calculi integralis, Vol. 4, p. 342.
Feller, W. (1965). A direct proof of Stirling’s formula, Amer. Math. Monthly 74, 1223–1225.
Fuss, P.-H. (1843). Correspondance Mathématique et Physique de quelques célèbres
géomètres du XVIIIème siècle, Vol. 1. St. Petersburg: Acad. Impér. Sci.
Gauss, C. F. (1812). Disquisitiones generales circa seriem inﬁnitam etc., Comment. Gott.
2, 1–46; Werke, Vol. 3. Berlin: Deutsch von H. Simon, 1888, pp. 123–162.
Gram, J. P. (1899). Nyt Tidsskrift Mat. 10B, 96.
Hankel, H. (1864). Die Eulerschen Integrale bei unbeschränkter Variabilität des Argu-
ments, Zeit. Math. Phys. 9, 1–21.
Henrici, P. (1977). Applied and Computational Complex Analysis, Vol. 2. New York:
Wiley.
Hermite, Ch. (1881). Sur l’intégrale Eulérienne de seconde espèce, J. Reine Angew.
Math. 90, 332–338.
Hölder, O. (1886). Über die Eigenschaft der Gammafunktion keiner algebraischen Dif-
ferentialgleichung zu genügen, Math. Ann. 28, 1–13.
Jensen, J. L. W. V. (1916). An elementary exposition of the theory of the Gamma function,
Annals of Math. (2) 17, 124–166.
Kummer, E. E. (1847). Beitrag zur Theorie der Funktion Γ(x), J. Reine Angew. Math.
35, 1–4.
Legendre, A. M. (1809). Recherches sur diverses sortes d’intégrales définies, Mémoires
de l’Institut de France 10, 416–509.
Liouville, J. (1839). Note sur quelques intégrales définies, J. Math. Pures Appl. 4, 225–
235.
(1852). Note sur la fonction gamma de Legendre, J. Math. Pures Appl. 17, 448–453.
Mellin, H. (1883a). Eine Verallgemeinerung der Gleichung Γ(x)Γ(1 − x) = π : sin πx,
Acta Math. 3, 102–104.
(1883b). Über gewisse durch die Gammafunktion ausdrückbare Produkte, Acta Math.
3, 322–324.
(1891). Zur Theorie der linearen Differenzengleichungen erster Ordnung, Acta Math.
15, 317–384.
(1896). Über die fundamentale Wichtigkeit des Satzes von Cauchy für die Theorien
der Gamma- und hypergeometrischen Funktionen, Acta Soc. Fennicae 21, no. 1,
p. 76.
(1899). Über eine Verallgemeinerung der Riemannschen Funktion ζ (s), Acta Soc.
Fennicae 24, 50 pp.
(1902). Über den Zusammenhang zwischen den linearen Differential- und Differen-
zengleichungen, Acta Math. 25, 139–164.
Moore, E. H. (1897). Concerning transcendentally transcendental functions, Math. Ann.
48, 49–74.
Newman, F. W. (1848). On Γa, especially when a is negative, Cambridge and Dublin
Math. J. 3, 57–60.
Nielsen, N. (1906). Handbuch der Theorie der Gammafunktion. Leipzig: Teubner.
Ostrowski, A. (1919). Neuer Beweis des Hölderschen Satzes daß die Gammafunktion
keiner algebraischen Differentialgleichung genügt, Math. Ann. 79, 286–288.
Pincherle, S. (1888). Sulle funzioni ipergeometriche generalizzate, Rend. Reale Accad.
Lincei (4) 4, 694–700; 792–799.
Raabe, J. (1844). Angenäherte Bestimmung der Faktorenfolge n!, wenn n eine sehr
große ganze Zahl ist, J. Reine Angew. Math. 28, 12–14.
Robbins, H. (1955). A remark on Stirling’s formula, Amer. Math. Monthly 62, 26–29.
Saalschütz, L. (1887). Bemerkungen über die Gammafunktionen mit negativem Argu-
ment, Zeit. Math. Phys. 32, 246–250.
(1888). Bemerkungen über die Gammafunktionen mit negativem Argument, Zeit.
Math. Phys. 33, 362–371.
Schlömilch, O. (1844). Über einige merkwürdige bestimmte Integrale, Grunert Archiv
5, 204–212.
Stirling, J. (1730). Methodus differentialis: sive, Tractatus de summatione et
interpolatione serierum infinitarum. London: G. Strahan.
Weierstrass, K. (1856). Über die Theorie der analytischen Fakultäten, J. Reine Angew.
Math. 51, 1–60; Werke, Vol. 1. pp. 153–211.
Whittaker, E. T. & Watson, G. N. (1950). A Course of Modern Analysis, Fourth edition.
Cambridge: Cambridge University Press.
Appendix D
Topics in harmonic analysis

D.1 Pointwise convergence of Fourier series

Let $f \in L^1(\mathbb{T})$, and suppose that
\[
\widehat{f}(k) = \int_{\mathbb{T}} f(x) e(-kx)\,dx \tag{D.1}
\]
are the Fourier coefficients of $f$. Here $e(\theta) = e^{2\pi i\theta}$ is the complex exponential
with period 1. It is a familiar fact in the theory of Fourier series that if $f$ has
bounded variation on $\mathbb{T}$, then
\[
\lim_{K \to \infty} \sum_{k=-K}^{K} \widehat{f}(k) e(k\alpha) = \frac{f(\alpha^+) + f(\alpha^-)}{2}. \tag{D.2}
\]

Less familiar is the strong quantitative version of this that we now derive.
Let $D_K(x) = \sum_{k=-K}^{K} e(kx)$. This is the Dirichlet kernel. We multiply both
sides of (D.1) by $e(k\alpha)$ and sum, to see that
\[
\sum_{k=-K}^{K} \widehat{f}(k) e(k\alpha) = \int_{\mathbb{T}} f(x) D_K(\alpha - x)\,dx = \int_{\mathbb{T}} D_K(x) f(\alpha - x)\,dx.
\]
Since $D_K$ is an even function, the above is
\[
= \int_{\mathbb{T}} D_K(x) f(\alpha + x)\,dx. \tag{D.3}
\]
Clearly $D_K(0) = 2K + 1$. If $x \notin \mathbb{Z}$, then $D_K(x)$ is the sum of a segment of a
geometric progression, which permits us to write $D_K$ in closed form,
\[
D_K(x) = \frac{e((K + 1)x) - e(-Kx)}{e(x) - 1}
= \frac{e\big((K + \tfrac{1}{2})x\big) - e\big(-(K + \tfrac{1}{2})x\big)}{e(x/2) - e(-x/2)}
= \frac{\sin (2K + 1)\pi x}{\sin \pi x}. \tag{D.4}
\]
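As a quick numerical sanity check of the closed form (D.4), the following Python sketch compares the defining sum of the Dirichlet kernel with the quotient of sines. The function names are hypothetical and chosen here only for illustration.

```python
import cmath
import math


def e(theta):
    """The complex exponential e(theta) = exp(2*pi*i*theta), of period 1."""
    return cmath.exp(2j * math.pi * theta)


def dirichlet_kernel_sum(K, x):
    """D_K(x) computed directly from its definition as a sum over |k| <= K."""
    return sum(e(k * x) for k in range(-K, K + 1))


def dirichlet_kernel_closed(K, x):
    """D_K(x) from the closed form (D.4), with the value 2K + 1 at integers."""
    if float(x).is_integer():
        return float(2 * K + 1)
    return math.sin((2 * K + 1) * math.pi * x) / math.sin(math.pi * x)


if __name__ == "__main__":
    for K in (1, 5, 20):
        for x in (0.1, 0.37, -1.25, 2.0):
            gap = abs(dirichlet_kernel_sum(K, x) - dirichlet_kernel_closed(K, x))
            print(K, x, gap)  # the gaps should be at the level of rounding error
```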

Figure D.1 Graph of $s(x)$ and its Fourier approximation $-\sum_{k=1}^{15} \sin 2\pi kx/(\pi k)$.

Our analysis of the pointwise convergence of Fourier series is based on the
behaviour of the Fourier series of one particular function, namely the ‘sawtooth
function’ $s(x)$ given by
\[
s(x) = \begin{cases} \{x\} - \tfrac{1}{2} & (x \notin \mathbb{Z}), \\ 0 & (x \in \mathbb{Z}). \end{cases}
\]

Lemma D.1 Let
\[
E_K(x) = s(x) + \sum_{k=1}^{K} \frac{\sin 2\pi kx}{\pi k}.
\]
Then $|E_K(x)| \le \min\big(1/2,\ 1/((2K + 1)\pi |\sin \pi x|)\big)$.

It is easy to compute the Fourier coefficients of $s(x)$; we find that $\widehat{s}(0) = 0$,
and that $\widehat{s}(k) = -1/(2\pi i k)$ for $k \ne 0$. Thus the above lemma constitutes a
quantitative form of (D.2), for the function s(x). A numerical example of Lemma
D.1 is graphed in Figure D.1.
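The inequality of Lemma D.1 can also be explored numerically. The sketch below evaluates $E_K(x)$ on a grid and reports the largest ratio of $|E_K(x)|$ to the bound of the lemma; the grid size and all function names are assumptions made for this illustration, and a grid check is of course no substitute for the proof.

```python
import math


def sawtooth(x):
    """s(x) = {x} - 1/2 for non-integral x, and 0 at the integers."""
    frac = x - math.floor(x)
    return 0.0 if frac == 0.0 else frac - 0.5


def fourier_error(K, x):
    """E_K(x) = s(x) + sum_{k=1}^{K} sin(2 pi k x) / (pi k)."""
    partial = sum(math.sin(2 * math.pi * k * x) / (math.pi * k) for k in range(1, K + 1))
    return sawtooth(x) + partial


def lemma_bound(K, x):
    """min(1/2, 1 / ((2K + 1) pi |sin pi x|)), read as 1/2 where sin(pi x) = 0."""
    s = abs(math.sin(math.pi * x))
    if s == 0.0:
        return 0.5
    return min(0.5, 1.0 / ((2 * K + 1) * math.pi * s))


def worst_ratio(K, grid=2000):
    """Largest value of |E_K(x)| / bound over the points i/grid, 0 < i < grid."""
    return max(
        abs(fourier_error(K, i / grid)) / lemma_bound(K, i / grid)
        for i in range(1, grid)
    )


if __name__ == "__main__":
    for K in (1, 15, 50):
        print(K, worst_ratio(K))  # each ratio should not exceed 1
```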

Proof All terms comprising $E_K(x)$ are odd, and hence $E_K$ is odd. Thus we
may suppose that $0 \le x \le 1/2$. The case $x = 0$ is clear. We observe that if
$x \notin \mathbb{Z}$, then
\[
E_K'(x) = 1 + 2 \sum_{k=1}^{K} \cos 2\pi kx = D_K(x).
\]
Hence if $0 < x \le 1/2$, then by (D.4) we see that
\[
E_K(x) = -\frac{1}{2} \int_x^{1-x} D_K(z)\,dz
= -\frac{1}{2} \int_x^{1-x} \frac{\sin (2K + 1)\pi z}{\sin \pi z}\,dz
= \frac{i}{2} \int_x^{1-x} \frac{e\big((K + \tfrac{1}{2})z\big)}{\sin \pi z}\,dz.
\]
The integrand is analytic in the rectangle $x \le \Re z \le 1 - x$, $0 \le \Im z \le Y$, so
by letting $Y \to \infty$ and applying Cauchy’s theorem we see that the above
is
\[
= \frac{i}{2} \int_x^{x + i\infty} \frac{e\big((K + \tfrac{1}{2})z\big)}{\sin \pi z}\,dz
- \frac{i}{2} \int_{1-x}^{1-x+i\infty} \frac{e\big((K + \tfrac{1}{2})z\big)}{\sin \pi z}\,dz.
\]

On writing $z = x + iy$ in the first integral, and $z = 1 - x + iy$ in the second,
we see that the above is
\[
= -\frac{1}{2} \int_0^{\infty} \bigg( \frac{e\big((K + \tfrac{1}{2})x\big)}{\sin \pi(x + iy)} + \frac{e\big(-(K + \tfrac{1}{2})x\big)}{\sin \pi(1 - x + iy)} \bigg) e^{-(2K+1)\pi y}\,dy. \tag{D.5}
\]
But $\sin \pi(x + iy) = (\sin \pi x) \cosh \pi y + i(\cos \pi x) \sinh \pi y$, so that
$|\sin \pi(x + iy)| \ge \sin \pi x$ for all real $y$, and likewise
$|\sin \pi(1 - x + iy)| \ge \sin \pi x$. Hence the expression above has absolute
value not exceeding
\[
\frac{1}{\sin \pi x} \int_0^{\infty} e^{-(2K+1)\pi y}\,dy = \frac{1}{(2K + 1)\pi \sin \pi x}.
\]
This gives the second part of the bound. The first bound, $|E_K(x)| \le 1/2$,
is weaker if $1/(2K + 1) \le x \le 1/2$, since $\sin \pi x \ge 2x$ in this range. Thus
it suffices to show that $|E_K(x)| \le 1/2$ when $0 < x < 1/(2K + 1)$. Since
$0 < \sin u < u$ for $0 < u < \pi$, it follows from the definition of $E_K(x)$
that
\[
x - \frac{1}{2} \le E_K(x) \le (2K + 1)x - \frac{1}{2}
\]
for $0 < x \le 1/(2K + 1)$. This gives the desired bound.

We now establish an analogue of Lemma D.1 for arbitrary functions of

bounded variation.
Theorem D.2 If $f$ has bounded variation on $\mathbb{T}$, with $\widehat{f}(k)$ given by (D.1),
then for any $\alpha$,
\[
\bigg| \frac{f(\alpha^+) + f(\alpha^-)}{2} - \sum_{k=-K}^{K} \widehat{f}(k) e(k\alpha) \bigg|
\le \int_{0^+}^{1^-} \min\Big(\frac{1}{2},\ \frac{1}{(2K + 1)\pi \sin \pi x}\Big)\,|df(\alpha + x)|.
\]
Since the right-hand side here tends to 0 as K → ∞, this inequality implies
the qualitative relation (D.2).

Proof As $E_K'(x) = D_K(x)$ when $x \notin \mathbb{Z}$, the integral (D.3) is
\[
\int_{0^+}^{1^-} E_K'(x) f(\alpha + x)\,dx = \int_{0^+}^{1^-} f(\alpha + x)\,dE_K(x),
\]
by Theorem A.3. But $E_K(0^+) = -1/2$, $E_K(1^-) = 1/2$. Hence by integrating
by parts (as in Theorem A.2) we see that the above is
\[
\frac{1}{2} f(\alpha^+) + \frac{1}{2} f(\alpha^-) - \int_{0^+}^{1^-} E_K(x)\,df(\alpha + x).
\]

To complete the proof it sufﬁces to apply the triangle inequality (as in Theorem
A.4) and the bound of Lemma D.1.
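To see Theorem D.2 at work on a concrete example, the sketch below takes $f$ to be the indicator function of $[0, 1/2)$, extended with period 1. This choice of $f$, its Fourier coefficients computed by hand, and all function names are assumptions made for this illustration. Since $f$ is a step function with unit jumps at $0$ and $1/2$, the Stieltjes integral on the right-hand side reduces to a sum of the weight function over the jumps lying strictly inside $(0, 1)$.

```python
import cmath
import math


def f_hat(k):
    """Fourier coefficients of the indicator of [0, 1/2): the integral of e(-kx) over [0, 1/2]."""
    if k == 0:
        return 0.5
    return (1 - cmath.exp(-1j * math.pi * k)) / (2j * math.pi * k)


def partial_sum(K, alpha):
    """Symmetric partial sum of the Fourier series of f at alpha (real-valued)."""
    return sum(f_hat(k) * cmath.exp(2j * math.pi * k * alpha) for k in range(-K, K + 1)).real


def midpoint_value(alpha):
    """(f(alpha+) + f(alpha-)) / 2 for the indicator of [0, 1/2) mod 1."""
    frac = alpha % 1.0
    if frac in (0.0, 0.5):
        return 0.5
    return 1.0 if frac < 0.5 else 0.0


def weight(K, x):
    """min(1/2, 1 / ((2K + 1) pi sin(pi x))) for 0 < x < 1."""
    return min(0.5, 1.0 / ((2 * K + 1) * math.pi * math.sin(math.pi * x)))


def theorem_bound(K, alpha):
    """Right-hand side of Theorem D.2: each unit jump at alpha + x contributes weight(K, x)."""
    total = 0.0
    for jump in (0.0, 0.5):
        x = (jump - alpha) % 1.0
        if x != 0.0:  # the integral runs over 0+ to 1-, so a jump at x = 0 is excluded
            total += weight(K, x)
    return total


if __name__ == "__main__":
    for K in (1, 10, 100):
        for alpha in (0.2, 0.49, 0.5, 0.9):
            err = abs(midpoint_value(alpha) - partial_sum(K, alpha))
            print(K, alpha, err, theorem_bound(K, alpha))
```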

D.2 The Poisson summation formula

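The formula in question asserts that under suitable conditions,
\[
\sum_{n=-\infty}^{\infty} f(n) = \sum_{k=-\infty}^{\infty} \widehat{f}(k) \tag{D.6}
\]
where $f$ is a function of a real variable, and $\widehat{f}$ is its Fourier transform,
\[
\widehat{f}(t) = \int_{\mathbb{R}} f(x) e(-tx)\,dx. \tag{D.7}
\]
To ensure that $\widehat{f}$ is well-defined, we impose the condition $f \in L^1(\mathbb{R})$, i.e., that
the integral $\int_{\mathbb{R}} |f(x)|\,dx$ is finite. Put
\[
F(\alpha) = \sum_{n \in \mathbb{Z}} f(n + \alpha). \tag{D.8}
\]
This sum is absolutely convergent for almost all $\alpha$, since
\[
\int_0^1 \sum_{n \in \mathbb{Z}} |f(n + \alpha)|\,d\alpha = \sum_{n \in \mathbb{Z}} \int_n^{n+1} |f(\alpha)|\,d\alpha = \int_{\mathbb{R}} |f(\alpha)|\,d\alpha < \infty.
\]
Moreover, $F(\alpha)$ has period 1, $\int_{\mathbb{T}} |F(\alpha)|\,d\alpha < \infty$, and $F$ has Fourier coefficients
\[
\widehat{F}(k) = \int_0^1 F(\alpha) e(-k\alpha)\,d\alpha = \sum_{n \in \mathbb{Z}} \int_0^1 f(n + \alpha) e(-k\alpha)\,d\alpha
= \int_{\mathbb{R}} f(x) e(-kx)\,dx = \widehat{f}(k). \tag{D.9}
\]
Here the interchange of the integral and the sum is justified by absolute
convergence. Thus the Fourier expansion of $F$ is
\[
\sum_{k \in \mathbb{Z}} \widehat{f}(k) e(k\alpha).
\]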

The Poisson summation formula (D.6) is simply the assertion that this Fourier
expansion converges to F(α) when α = 0. Our hypotheses thus far do not ensure
this, but in this direction we establish the following two precise results.
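Before turning to precise hypotheses, it may help to see (D.6) and the Fourier expansion of $F$ confirmed numerically in a case where everything converges rapidly. The sketch below uses the Gaussian $f(x) = e^{-\pi a x^2}$ with $a > 0$, whose Fourier transform under the convention (D.7) is the standard $\widehat{f}(t) = a^{-1/2} e^{-\pi t^2/a}$. The choice of test function, the truncation points, and the function names are assumptions made for this illustration.

```python
import cmath
import math


def f(x, a):
    """Test function f(x) = exp(-pi a x^2), with a > 0."""
    return math.exp(-math.pi * a * x * x)


def f_hat(t, a):
    """Its Fourier transform under (D.7): a^(-1/2) exp(-pi t^2 / a)."""
    return math.exp(-math.pi * t * t / a) / math.sqrt(a)


def periodized(alpha, a, N=50):
    """Truncation of F(alpha) = sum over n of f(n + alpha), as in (D.8)."""
    return sum(f(n + alpha, a) for n in range(-N, N + 1))


def fourier_side(alpha, a, K=50):
    """Truncation of the Fourier expansion: sum over k of f_hat(k) e(k alpha)."""
    return sum(
        f_hat(k, a) * cmath.exp(2j * math.pi * k * alpha) for k in range(-K, K + 1)
    )


if __name__ == "__main__":
    for a in (0.3, 1.0, 2.5):
        for alpha in (0.0, 0.3):
            gap = abs(periodized(alpha, a) - fourier_side(alpha, a))
            print(a, alpha, gap)  # alpha = 0 is the Poisson summation formula (D.6)
```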

Theorem D.3 Suppose that $f \in L^1(\mathbb{R})$, and that $f$ is of bounded variation
on $\mathbb{R}$. Then
\[
\sum_{n \in \mathbb{Z}} \frac{f(n^+) + f(n^-)}{2} = \lim_{K \to \infty} \sum_{k=-K}^{K} \widehat{f}(k).
\]

If in addition f is continuous, then we have a result which is close to (D.6),

although it is still necessary to restrict ourselves to symmetric partial sums on
the right-hand side.

Proof We first note that if $n \le \alpha \le n + 1$, then
\[
f(\alpha) = \int_n^{n+1} f(x)\,dx + \int_n^{\alpha} (x - n)\,df(x) + \int_{\alpha}^{n+1} (x - n - 1)\,df(x),
\]
as can readily be seen by integration by parts. Hence
\[
|f(\alpha)| \le \int_n^{n+1} |f(x)|\,dx + \operatorname{var}_{[n,n+1]} f, \tag{D.10}
\]
and it follows from our hypotheses that the sum
\[
\sum_{n \in \mathbb{Z}} f(n + \alpha)
\]
is absolutely convergent for all $\alpha$, and uniformly convergent in compact regions.
Hence $F(\alpha)$ can be taken to be the value of this sum for all $\alpha$, not merely for
almost all $\alpha$. By the triangle inequality, $\operatorname{var}_{\mathbb{T}} F \le \operatorname{var}_{\mathbb{R}} f$, so that $F$ is of bounded
variation on $\mathbb{T}$, and hence the relation (D.2) applies to $F$. Thus we see that the
Fourier series of $F$ converges to $(F(\alpha^+) + F(\alpha^-))/2$ for all $\alpha$. Using the fact that
$f$ is of bounded variation once more, we see that $F(\alpha^+) = \sum_{n \in \mathbb{Z}} f((n + \alpha)^+)$,
and similarly for $F(\alpha^-)$. Hence we have the stated result.

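Theorem D.4 Suppose that $f$ is continuous, and that the series $\sum_{n \in \mathbb{Z}} f(n + \alpha)$
is uniformly convergent for $0 \le \alpha \le 1$. Then
\[
\sum_{n \in \mathbb{Z}} f(n) = \lim_{K \to \infty} \sum_{k=-K}^{K} \Big(1 - \frac{|k|}{K}\Big) \widehat{f}(k).
\]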

Proof Clearly $F(\alpha)$ given in (D.8) is continuous. Since we have not assumed
that $f \in L^1(\mathbb{R})$, the Fourier transform $\widehat{f}(t)$ may not exist. However, if $k$ is an
integer, then $\widehat{f}(k)$ exists as a convergent improper integral. To see this we first
note that $\sum_{n=M}^{N} f(n + \alpha)$ is small if $M$ and $N$ are large integers and $0 \le \alpha \le 1$.
Then
\[
\int_0^1 \sum_{n=M}^{N} f(n + \alpha) e(-k\alpha)\,d\alpha = \int_M^{N+1} f(x) e(-kx)\,dx
\]
is small. The hypothesis that $\sum_n f(n + \alpha)$ converges uniformly implies that
$f(x) \to 0$ as $|x| \to \infty$. Hence $\int_u^v f(x) e(-kx)\,dx \to 0$ as $u$, $v$ tend to infinity
through real values. The calculation of $\widehat{f}(k)$ in (D.9) is still valid, but is now
justified by uniform convergence. Next we appeal to a theorem of Fejér, which
asserts that the Fourier series of a continuous function $F(\alpha)$ with period 1 is
uniformly (C, 1)-summable to $F$ (see Katznelson (2004), p. 19). That is,
\[
\sum_{k=-K}^{K} \Big(1 - \frac{|k|}{K}\Big) \widehat{f}(k) e(k\alpha) \longrightarrow F(\alpha)
\]

uniformly as K → ∞. The stated identity follows on taking α = 0.
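The Fejér-weighted sums of Theorem D.4 can be watched converging in a simple case. The sketch below uses $f(x) = 1/(1 + x^2)$, for which the standard evaluation $\widehat{f}(t) = \pi e^{-2\pi |t|}$ holds under the convention (D.7), and for which both sides of the identity equal $\pi \coth \pi$. The test function, the closed forms quoted for it, and the function names are assumptions made for this illustration rather than anything taken from the text.

```python
import math


def f(x):
    """Test function f(x) = 1 / (1 + x^2); continuous, with sum f(n + alpha) uniformly convergent."""
    return 1.0 / (1.0 + x * x)


def f_hat(t):
    """Fourier transform of f under (D.7): pi * exp(-2 pi |t|)."""
    return math.pi * math.exp(-2.0 * math.pi * abs(t))


def lattice_sum(N):
    """Symmetric truncation of the sum of f(n) over the integers."""
    return sum(f(n) for n in range(-N, N + 1))


def fejer_sum(K):
    """Fejer-weighted sum of f_hat(k) with weights 1 - |k|/K, as in Theorem D.4."""
    return sum((1.0 - abs(k) / K) * f_hat(k) for k in range(-K, K + 1))


# Common value of both sides: pi * coth(pi).
EXACT = math.pi / math.tanh(math.pi)

if __name__ == "__main__":
    for K in (10, 100, 1000):
        print(K, abs(fejer_sum(K) - EXACT))  # the gap shrinks roughly like 1/K
    print(abs(lattice_sum(100000) - EXACT))
```

In this example the weights cost an error of order $1/K$, whereas the unweighted symmetric sums converge exponentially fast; the weights matter when only continuity and uniform convergence are assumed.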

Exercises
1. Show that if $f$ satisfies the hypotheses of Theorem D.2, and $\alpha$ and $\beta$ are
real numbers, then the function $f(x + \alpha) e(\beta x)$ does also. Specify conditions
under which
\[
\sum_{n} f(n + \alpha) e(\beta n) = \sum_{k} \widehat{f}(k - \beta) e((k - \beta)\alpha).
\]
2. Suppose that $f$ has bounded variation on $[-A, A]$, for every $A > 0$. Show
that
\[
\lim_{N \to \infty} \sum_{n=-N}^{N} f(n) = \lim_{T \to \infty} \sum_{k=-\infty}^{\infty} \int_{-T}^{T} f(x) e(-kx)\,dx
\]
provided that either limit exists.

3. Suppose that $f \in L^1(\mathbb{R}^n)$, and for $x \in \mathbb{T}^n$ put
\[
F(x) = \sum_{\lambda \in \mathbb{Z}^n} f(\lambda + x).
\]
(a) Show that the sum $F(x)$ is absolutely convergent for almost all $x$.
(b) Show that $F \in L^1(\mathbb{T}^n)$ and that $\|F\|_{L^1(\mathbb{T}^n)} \le \|f\|_{L^1(\mathbb{R}^n)}$.
(c) Define the Fourier transform of $f$, and the Fourier coefficient
of $F$, respectively, to be $\widehat{f}(t) = \int_{\mathbb{R}^n} f(x) e(-t \cdot x)\,dx$, $\widehat{F}(k) =
\int_{\mathbb{T}^n} F(x) e(-k \cdot x)\,dx$. Show that $\widehat{F}(k) = \widehat{f}(k)$.
4. (a) Suppose that there is a $\delta > 0$ such that $c(k) \ll (1 + |k|)^{-n-\delta}$. Show that
\[
\sum_{k \in \mathbb{Z}^n} c(k) e(k \cdot x)
\]
is a continuous function of $x \in \mathbb{T}^n$.
(b) Suppose that there is a $\delta > 0$ such that $f(x) \ll (1 + |x|)^{-n-\delta}$ for $x \in \mathbb{R}^n$.
Suppose also that $f(x)$ is continuous. Show that
\[
F(x) = \sum_{\lambda \in \mathbb{Z}^n} f(\lambda + x)
\]
is a continuous function for $x \in \mathbb{T}^n$.
(c) Suppose that in addition to the hypotheses in (b), the function $f$ also has
the property that $\widehat{f}(t) \ll (1 + |t|)^{-n-\delta}$. Show that
\[
\sum_{\lambda \in \mathbb{Z}^n} f(\lambda + x) = \sum_{k \in \mathbb{Z}^n} \widehat{f}(k) e(k \cdot x)
\]
for all $x \in \mathbb{T}^n$.
5. A lattice in $\mathbb{R}^n$ is a set of points of the form $A\mathbb{Z}^n$ where $A$ is a non-singular
$n \times n$ matrix. Thus $\mathbb{Z}^n$ is an example of a lattice, called the lattice of integral
points.
(a) Suppose that $\Lambda_1 = A\mathbb{Z}^n$ and $\Lambda_2 = B\mathbb{Z}^n$ are two lattices. Show that $\Lambda_2 \subseteq
\Lambda_1$ if and only if there is an $n \times n$ matrix $K$ with integral entries such
that $B = AK$.
(b) An $n \times n$ matrix $U$ is said to be unimodular if (i) its entries are integers,
and (ii) $\det U = \pm 1$. Show that if $\Lambda_1 = A\mathbb{Z}^n$ and $\Lambda_2 = B\mathbb{Z}^n$ are two
lattices, then $\Lambda_1 = \Lambda_2$ if and only if there is a unimodular matrix $U$
such that $B = AU$.
(c) Let $a_1, \ldots, a_n$ denote the columns of $A$. These vectors are said to form a
basis for $\Lambda_1$, because every member of $\Lambda_1$ has a unique representation in
the form $c_1 a_1 + \cdots + c_n a_n$ where the $c_i$ are integers. If $\Lambda = A\mathbb{Z}^n$, we say
that the determinant of $\Lambda$ is $d(\Lambda) = |\det A|$. Show that the determinant
of a lattice is independent of the basis by which it is presented.
(d) Suppose that $\Lambda = A\mathbb{Z}^n$ is a lattice in $\mathbb{R}^n$. Let $\Lambda^*$ be the set of all those
points $\mu \in \mathbb{R}^n$ such that $\mu \cdot \lambda \in \mathbb{Z}$ for all $\lambda \in \Lambda$. Show that $\Lambda^*$ is a
lattice, and indeed that $\Lambda^* = (A^{-1})^T \mathbb{Z}^n$.
(e) Suppose that $f$ is a continuous function on $\mathbb{R}^n$ such that
\[
f(x) \ll (1 + |x|)^{-n-\delta}, \qquad \widehat{f}(t) \ll (1 + |t|)^{-n-\delta}
\]
for some $\delta > 0$. Let $\Lambda = A\mathbb{Z}^n$ be a lattice. Show that
\[
\sum_{\lambda \in \Lambda} f(\lambda + x) = \frac{1}{d(\Lambda)} \sum_{\mu \in \Lambda^*} \widehat{f}(\mu) e(\mu \cdot x)
\]
for all $x$.

D.3 Notes
Section D.1. The relation (D.2) is the famous Dirichlet–Jordan test, which is
usually derived with much less effort. Theorem D.2 generalizes and reﬁnes an
argument of Pólya (1918), who estimated the rate of convergence of the Fourier
series (9.18). For more on the convergence of Fourier series, see Katznelson
(2004, Chapter 2), Körner (1988, Part I), or Zygmund (2002, Chapter II).
Section D.2. For more on the Poisson summation formula, see Katznelson
(2004, VI.1.15), Körner (1988, Section 27), or Zygmund (2002, Chapter 2,
Section 13). For a discussion of the Poisson summation formula in higher
dimensions, see Stein & Weiss (1971, Chapter VII Section 2). Siegel (1935)
showed that Minkowski’s convex body theorem could be derived by applying
the Poisson summation formula. Cohn & Elkies (2003), Cohn (2002) and Cohn
& Kumar (2004) have applied the Poisson summation formula in Rn to limit
the density of sphere packings.

D.4 References
Cohn, H. (2002). New upper bounds on sphere packings, II, Geom. Topol. 6, 329–353.
Cohn, H. & Elkies, N. (2003). New upper bounds on sphere packings, I, Ann. of Math.
(2) 157, 689–714.
Cohn, H. & Kumar, A. (2004). The densest lattice in twenty-four dimensions, Electron.
Res. Announc. Amer. Math. Soc. 10, 58–67.
Katznelson, Y. (2004). An Introduction to Harmonic Analysis, Third edition. Cambridge:
Cambridge University Press.
Körner, T. W. (1988). Fourier Analysis, Second edition. Cambridge: Cambridge
University Press.
Pólya, G. (1918). Über die Verteilung der quadratischen Reste und Nichtreste, Nachr.
Akad. Wiss. Göttingen, 21–29.
Siegel, C. L. (1935). Über Gitterpunkte in convexen Körpern und ein damit zusammen-
hängendes Extremalproblem, Acta Math. 65, 307–323; Gesammelte Abhandlun-
gen, Vol. I. Berlin: Springer-Verlag, 1966, 311–325.
Stein, E. & Weiss, G. (1971). Introduction to Fourier analysis on Euclidean spaces,
Princeton Math. Series 32. Princeton: Princeton University Press.
Zygmund, A. (2002). Trigonometric Series, Third edition, Vol. I. Cambridge:
Cambridge University Press.
Name index

Abel, N. H., 143, 147 Bernstein, S. N., 321, 323

Addison, A. W., 238, 240 Besenfelder, H.-J., 417
Alladi, K., 211, 241 Beukers, F., 514, 517
Allison, D., 194, 195 Beurling, A., 268, 277, 279
Almkvist, G., 513, 517 Beyer, W. A., 32, 33
Anderson, R. J., 481, 484 Binet, J. P. M., 532
Andrews, G. E., 31, 33 Birch, B. J., 134, 392, 394
Ankeny, N. C., 104, 448, 449 Boas, R. P., 513, 517
Apéry, R., 514, 517 Bohr, H., 18, 31, 33, 160, 163, 164, 448, 449
Apostol, T. M., 163, 164, 292, 323, 493, 515, Bollobas, B., 166
517 Bombieri, E., 41, 71, 103, 104, 106, 277, 279,
Arno, S., 393 322, 417
Artin, E., 532, 533 Borel, E., 192, 195
Aubert, K. E., 106 Borel, J.-P., 279
Axer, A., 247, 276, 279, 446, 449 Borevich, Z. I., 513, 514, 517
Borwein, P., 70, 71
Bach, E., 69, 71 Brauer, A., 240, 241
Bachmann, P., 31, 33 Brent, R. P., 32, 33, 516, 517
Backlund, R. J., 240, 241, 339, 340, 356, 460, Breusch, R., 276, 279
461, 515, 517 Brown, J. W., 482, 484
Baker, A., 134, 392, 393, 394 de Bruijn, N. G., 88, 211, 213ff, 239, 241
Baker, R. C., 323 Brun, V., 78, 90, 95, 101–104
Balanzario, E. P., 279 Buchstab, A. A., 102, 104, 217, 239, 240, 241
Balasubramanian, R., 449 Buell, D. A., 394
Ball, K., 514, 517 Bundschuh, P., 394
Barner, K., 417 Burgess, D. A., 315, 323
Barnes, E. W., 514, 517, 531, 533
Bartle, R. G., 493 Cahen, E., 31, 33, 162
Bartz, K., 512, 517 Cai, J.-Y., 69, 71
Bateman, P. T., v, 63, 64, 71, 80, 103, 104, Carathéodory, C., 192
131, 134, 135, 264, 276, 278, 279, 377, 394, Carlitz, L., 507, 512, 514, 517
482, 484, 493 Carmichael, R., 113, 135
Bays, C., 483, 484 Cassels, J. W. S., 514, 517
Behrend, F. A., 81, 104 Cauchy, A. L., 529, 533
Berlekamp, E., 10 Cesàro, E., 142, 147
Berndt, B. C., 341, 356 Chalk, J. H. H., v


Chang, T.-H., 240, 241 Estermann, T., v, 33, 370, 392, 393, 394
Chebyshev, P. L., 3ff, 46ff, 54, 69, 71, 475, Euler, L., 20, 32, 33, 194, 195, 500, 514, 517,
484 524, 530, 531, 532, 533
Chih, T.-T., 69, 71 Evelyn, C. J. A., 39, 40, 72, 73
Chowla, S. D., 68, 71, 74, 87, 104, 134, 135,
211, 226, 239, 242, 305, 322, 323, 377, Fatou, P., 277, 280
394 Fekete, M., 376, 394
Chudakov, N. G., 193, 195 Feller, W., 44, 72, 532, 533
Chung, K.-L., 81, 104 Fine, N. J., 49
Cipolla, M., 183, 195 Ford, K., 103, 105
Clausen, Th., 512, 514, 517 Fouvry, E., 103, 105
Coates, J., 393, 394 Freud, G., 163, 164
Cochrane, T., 322, 323 Friedlander, J. B. 102–105, 220, 242, 322
Cohen, E., 71 Friedman, A., 112, 135
Cohen, H., 391, 394 Fujii, A., 323
Cohn, H., 542 Fuss, P.-H., 531, 533
Conrey, J. B., 461, 462
Conway, J. H., 303, 323 Gallagher, P. X., 323, 417
van der Corput, J. G., 68, 69, 71, 81, 104, 276, Ganelius, T., 163, 164
279 Gauss, C. F., 5, 9, 32, 133, 134, 294, 300, 391,
Costa Pereira, N., 69, 71 392, 394, 527, 528, 531, 532, 533
Cramér, H., 31, 33, 240, 241, 416, 417, 421, Gegenbauer, L., 68, 72
447, 448, 449 Gel’fand, I. M., 162, 164
Gel’fond, A. O., 69, 134, 135, 392, 394
Darst, R., 492, 493 Glaisher, J. W. L., 508, 517
Davenport, H., v, 31, 33, 63, 71, 134, 135, 374, Goldbach, C., 531, 532
391, 394, 416, 417 Goldberg, R. R., 162, 164
DeKoninck, J.-M., 241 Goldfeld, D. M., 102, 105, 106, 276, 280, 374,
Delange, H., 71, 72, 135, 163, 164 391, 392, 393, 394, 395, 417, 418
Deléglise, M., 31, 33 Goldston, D. A., 432, 449
Demichel, P., 516, 517 Golomb, S., 54, 72
Deuring, M., 392, 394 Goodman, A., 163, 164
Diamond, H. G., 69, 72, 103, 104, 276, 277, Gorshkov, L. S., 70, 72
278, 279, 493 Gourdon, X., 31, 32, 516, 517
Dickman, K., 202, 239, 241 Graham, S. W., 265, 277, 280
Dirichlet, P. G. L., 38, 68, 72, 115, 133–135, Gram, J. P., 515, 517, 529, 533
391, 530, 533 Granville, A., 322, 324
Dodgson, C., 79 Greaves, G., 103, 105, 240, 242
Dressler, R. E., 264, 279 Gronwall, T. H., 193, 195, 391, 395
Duncan, R. L., 39, 72, 241 Gross, B. H., 393, 395
Dusart, P., 69, 72 Grosswald, E., 42, 63, 71, 72, 514, 517
Grytczuk, A., 113, 135
Edwards, D. A., 164 Guinand, A. P., 417
Edwards, H. M., 416, 417
Eggleston, H. G., 163, 164 Hadamard, J., 3, 192, 194, 195, 345, 356
Elkies, N., 542 Halberstam, H., v, 70, 72, 103, 105, 240,
Ellison, W. J., 393, 394 242
Eratosthenes, 76 Hall, R. R., 70, 72
Erdős, P., 43, 68, 69, 72, 100, 101, 103, 104, Hall, R. S., 278, 280, 482, 484
105, 131, 135, 211, 212, 215, 225, 227, 240, Haneke, W., 374, 391, 395
241, 242, 276, 279, 390, 393, 394 Hankel, H., 525, 532, 533

Hardy, G. H., 31, 32, 33, 59, 69, 70, 72, 101, Karamata, J., 163, 165
103, 105, 133, 150, 151, 162, 163, 164, 165, Karatsuba, A. A., 193, 195
185, 186, 193, 195, 242, 409, 418, 456, 461, Kátai, I., 71, 73
462, 473, 482, 484, 514, 517 Katznelson, Y., 540, 542
Hartman, P., 40, 72 Kestelman, H., 493, 494
Haselgrove, C. B., 472, 484 Kinkelin, H., 508, 518
Hasse, H., 321, 324 Kloss, K. E., 482, 484
Hausman, M., 226, 242 Knapowski, S., 483, 484
Heath-Brown, D. R., 70, 461, 462 Knopfmacher, J., 278, 280
Hecke, E., 194, 195, 356, 391 Knuth, D. E., 32, 34
Heegner, K., 392, 395 Knutson, D. E., 183
Heilbronn, H., 81, 105, 335, 356, 376, 392, 395 Koblitz, N., 514, 518
Hejhal, D. A., 278, 280 von Koch, H., 416, 418, 447, 450
Henrici, P., 532, 533 Körner, T. W., 542, 543
Hensley, D., 88, 105, 240, 242 Kojima, T., 157, 163, 165
Hermite, Ch., 528, 533 Kolesnik, G., 69, 73
Hewitt, E., 162, 165 Korevaar, J., 163, 164, 165, 277, 280
Hildebrand, A., 70, 72, 133, 135, 239, 240, Korobov, N. M., 193, 195
242, 322, 324 Kowalski, E., 103, 105
Hildebrandt, T. H., 493, 494 Kronecker, L., 514, 518
Hille, E., 40, 72 Kubilius, I. P., 70, 71, 73, 240, 242
Hock, A., 394 Kuhn, P., 276, 280
Hölder, O., 133, 135, 533 Kumar, A., 542
Hooley, C., 89, 102, 103, 105 Kummer, E. E., 514, 532, 534, 542
Hua, L. K., 193, 195 Kurokawa, N., 33
Hudson, R. H., 483, 484 Kusmin, R. O., 31, 32
Hutchinson, J. I., 515, 517
Huxley, M. N., 69, 73 Lagarias, J. C., 31, 34, 417, 448, 450
Landau, E., 16, 17, 31, 32, 34, 39, 41, 70, 73,
Ikehara, S., 259, 261, 264, 265, 277, 280 134, 135, 160, 163, 165, 166, 178, 182, 183,
Ingham, A., 163 184, 185, 187, 192, 193, 194, 195, 196, 267,
Ingham, A. E., v, 31, 32, 33, 128, 135, 163, 276, 277, 278, 280, 321, 322, 324, 337, 350,
165, 186, 192, 193, 194, 195, 280, 409, 418, 353, 356, 367ff, 391, 392, 395, 416, 418,
472, 480, 482, 483, 484, 494 448, 449, 450, 473, 485
Ivić, A., 215 Lang, S., 417, 418
Iwaniec, H., 69, 73, 104, 105, 322, 323 Laurinčikas, A., 449, 450
Iwaniec, H. 102ff, 102 Lavrik, A. F., 277, 280, 335, 356, 357
Legendre, A. M., 3, 76, 242, 532, 534
Jacobi, C. G. J., 514, 518 Lehman, R. S., 483, 484, 485, 516,
Jacobsthal, E., 220 518
Jarnı́k, V., 41, 73 Lehmer, D. H., 31, 34, 65, 80, 106, 504, 516,
Jensen, J. L. W. V., 31, 34, 192, 195, 532, 533, 518
534 Lenstra, H., 391, 394
Jordan, C., 514, 518 Lerch, M., 341, 357
Jorgenson, J., 417, 418 LeVeque, W. J., 240, 242
Joris, H., 321, 324 Levinson, N., 276, 280, 461, 462
Joyner, D., 449 Lévy, P., 162, 163, 166
Jurkat, W. B., 106, 481, 484 Linfoot, E. H., 39, 40, 72, 73, 392, 395
Linnik, Yu. V., 134, 135, 392, 394
Kac, M., 71, 73, 240, 242 van Lint, J. H., 88, 106
Kahane, J.-P., 277, 278, 280, 483, 484 Liouville, J., 529, 530, 534

Littlewood, J. E., 5, 31, 33, 101, 103, 105, 150, Niven, I., 69, 74
151, 160, 162, 163, 164, 165, 166, 193, 196, Norton, K. K., 239, 242
242, 340, 357, 409, 418, 432, 448, 449, 450, Nowak, W. G., 41, 74
461, 462, 473, 478, 482, 483, 484, 485, 516, Nyman, B., 278, 281
518
Lucas, É., 512, 514, 518 Odlyzko, A. M., 31, 34, 448, 450, 482, 485,
van de Lune, 166, 516, 517, 518 516, 518
Lunnon, W. F., 394 Oesterlé, J., 393, 395
Onishi, H., 104
Maclaurin, C., 500, 514, 518 Orr, R. C., 39, 74
Mahler, K., 374, 395 Ostrowski, A., 533, 534
Maier, H., 240, 242, 449, 450
Ma̧kowski, A., 69, 73 Page, A., 369, 379, 391, 393, 395
Malliavin, P., 278, 280 Paley, R. E. A. C., 312, 322, 324
Mallik, A., 336, 357 Palm, G., 417
von Mangoldt, H., 194, 195, 196, 416, 418, Parry, W., 278, 281
460, 462 Perron, O., 138, 162, 166
Mapes, D. C., 31, 34 Pesek, J., 394
Martin, G., 286, 324 Peyerimhoff, A., 163, 166
Mascheroni, L., 32, 34 Phragmén, E., 160
Massias, J.-P., 69, 73, 184, 196 Pila, J., 41, 71
Mattics, L. E., 293, 324 Pillai, S. S., 68, 74, 226, 242
McMillan, E. M., 32, 33 Pincherle, S., 532, 534
Meissel, E. D. F., 31 Pintz, J., 134, 136, 194, 197, 240, 243
Meller, N. A., 516, 518 Pitt, H. R., 164, 166, 277, 281
Mellin, H., 162, 166, 525, 529, 531, 532, 534 Poisson, S. D., 356, 357
Mertens, F., 46ff, 68, 70, 73, 127, 134, 135, Pollard, H., 492, 493
176, 193, 197, 482, 485 Pollicott, M., 278, 281
Meurman, A., 513, 517 Pólya, G., 190, 197, 307, 309, 322, 324, 376,
Miller, V. S., 31, 34 394, 395, 484, 485, 542, 543
Mirsky, L., 7, 393, 395 Pomerance, C., 65, 74, 131, 135, 240, 242
Mittag-Leffler, M. G., vi van der Poorten, A., 514, 518
Möbius, A. F., 35 Postnikov, A. G., 163, 166
Monach, W. R., 483, 485 Pringsheim, A., 18, 32, 34
Monsky, P., 134, 136 Pritsker, I. E., 70, 74
Montgomery, H. L., 68, 69, 70, 73, 74, 89,
102, 106, 163, 166, 177, 193, 197, 225, 226, Raabe, J., 531, 534
242, 278, 279, 321, 322, 323, 324, 393, 395, Rademacher, H., 513, 518
432, 446, 448, 449, 450, 483 Ramachandra, K., 449
Moore, E. H., 533, 534 Ramanujan, S., 59, 60, 70, 72, 74, 113, 114,
Mordell, L. J., 32, 34, 134, 135, 293, 305, 323, 133, 136
324, 392, 395 Ramaswami, V., 239, 243
Moser, L., 10 Rankin, R. A., 222, 240, 243, 493, 494
Motohashi, Y., 102, 103, 106 Redmond, D., 113, 136
Mozzochi, C. J., 69, 73 Rényi, A., 65, 71, 74, 240, 243
Reznick, B., 112, 136
Narkiewicz, W., 71, 73, 134, 136, 276, 281 Ricci, G., 100, 106, 240, 243
Newman, D. J., 7, 162, 163, 164, 166 Richards, I., 228, 240, 242, 243
Newman, F. W., 532, 534 Richert, H.-E., 69, 70, 72, 74, 88, 103, 105,
Nicolas, J.-L., 70, 73, 184, 196, 212, 242 106, 193, 197, 240, 242
Nielsen, N., 32, 34, 518, 532, 534 te Riele, H. J. J., 482, 483, 485, 516, 517, 518

Riemann, B., 162, 328, 356, 357, 416, 418, Stark, H. M., 392, 393, 396
460, 462, 515 Stás, W., 194, 197
Riesel, H., 31, 34, 106 von Staudt, K. G. C., 512, 514, 519
Riesz, M., 31, 32, 33, 143, 160, 162, 165, 166, Stein, E., 542, 543
277, 281 Steinhaus, H., 163, 166
Rivat, J., 31, 33 Steinig, J., 277, 279
Rivoal, T., 514, 517 Stemmler, R. M., 482, 484
Robbins, H., 532, 534 Stepanov, S. A., 322
Robin, G., 69, 73, 184, 196 Stieltjes, T. J., 27, 29, 34, 41, 75
Robinson, M. L., 393 Stirling, J., 514, 532, 534
Robinson, R. L., 74 Sweeney, D. W., 32, 34
Rogers, K., 39, 74 Swinnerton-Dyer, H. P. F., 393
Rohrbach, H., 81, 106 Sylvester, J. J., 69, 75
Romanoff, N. P., 97, 103, 106 Szegö, G., 190, 197, 376, 395
Rosser, J. B., 69, 74, 182, 183, 197, 377, 395, Szekeres, G., 43, 72
516, 518
Rubel, L., 163, 166 Tate, J. T., 356, 357
Runge, C., 70, 74 Tatuzawa, T., 193, 197, 375, 396
Rutkowski, J., 512, 517 Tauber, A., 150, 160, 163, 166
Taylor, P. R., 354, 357
Saalschütz, L., 529, 534 Teege, H., 134, 136
Saffari, B., 71, 74, 131 Tenenbaum, G., 70, 71, 72, 75, 239, 240, 242
Sampath, A., 277, 281 Terras, A., 514, 519
Sathe, L. G., 240, 243 Titchmarsh, E. C., 90, 102, 107, 162, 163, 166,
Schinzel, A., 163, 166, 243, 374, 391, 395 167, 193, 194, 197, 356, 357, 391, 396, 448,
Schlömilch, O., 532, 534 449, 451, 461, 462, 516, 519
Schmidt, E., 482, 485 Toeplitz, O., 148, 163, 167
Schmidt, P. G., 43, 74 Tornier, E., 44, 72
Schmidt, W. M., 314, 322, 324 Tsang, K. M., 107
Schoenberg, I. J., 160, 166 Turán, P., 58, 64, 70, 75, 103, 105, 194, 197,
Schoenfeld, L., 69, 74, 182, 197, 516, 518 240, 243, 448, 451, 472, 483, 485
Schönhage, A., 240, 243, 516, 518 Turing, A., 516, 519
Schur, I., 148, 163, 166, 321, 324
Schwarz, W., 71, 74, 133, 135, 136, 276, 281 Vaaler, J. D., 265, 277, 280
Sebah, P., 31, 32 de la Vallée Poussin, C. J., 3, 39, 75, 192ff,
Selberg, A., 102, 103, 106, 107, 240, 243, 251, 193, 194, 197, 321, 324, 356, 357, 409, 418
276, 281, 445, 448, 450, 460–462 Vaughan, R. C., 31, 34, 89, 102, 104, 106, 107,
Serre, J.-P., 133, 136 131, 135, 136, 177, 193, 197, 226, 242, 321,
Shafarevich, I. R., 513, 514, 517 322, 324, 325, 390, 396, 446, 450
Shafer, R. E., 29, 34 Vijayaraghavan, T., 80, 107, 211, 239, 241
Shan, Z., 65, 75 Vinogradov, I. M., 31, 193, 197, 307, 309, 322,
Shapiro, H. N., 68, 72, 226, 242 325
Siegel, C. L., 372, 381, 392, 396, 515, 519, Vivanti, G., 18, 32, 34
542, 543 Vorhauer, U. M. A., 278, 279, 325, 355, 356,
Sitaramachandrarao, R., 41, 75 357, 416, 418, 445, 451
Skewes, S., 483, 485 Vorhauer, U. M. A., 286
Sobirov, A. Š., 277, 280 Voronin, S. M., 193, 195
Soundararajan, K., 69, 75, 322, 324 Voronoï, G., 68, 75
Spilker, J., 133, 135
Srinivasan, B. R., 277, 281 Wagner, C., 393, 396
Stall, D. S., 394 Wagon, S., 10, 34

Walfisz, A., 32, 34, 68, 75, 193, 198, 322, 325, Wigert, S., 70, 75, 409, 418
336, 357, 381, 386, 393, 396 Wilf, H., 31, 34
Wallis, J., 507, 519 Williamson, H., 162, 165
Ward, D. R., 43, 75 Wilson, B. M., 71, 75
Waterman, M. S., 32, 33 Winter, D. T., 516, 517, 518
Watkins, M., 393, 396 Wintner, A., 40, 43, 72, 75, 113, 136, 158, 167,
Watson, G. N., 514, 519, 532, 534 447, 451
Weber, H., 392 Wirsing, E. A., 70, 75, 134, 277, 281
Wedeniwski, S., 516 Wirtinger, W., 514, 519
Weierstrass, K., 345, 532, 534 Witt, E., 514
Weil, A., 314, 322, 335, 357, 410, 417, Wrench, W. R., 32, 34
418 Wright, E. M., 276, 281
Weinberger, P. J., 393, 395
Weiss, G., 542, 543 Yohe, J. M., 516, 518
Westzynthius, E., 221, 240, 243 Yoshida, H., 417, 418
Wheeler, F. S., 393
Whittaker, E. T., 514, 519, 532, 534 Zagier, D. M., 393, 395, 396
Widder, D. V., 34, 162, 163, 164, 167, 281, Zeitz, H., 240, 241
493, 494 Zhang, W. B., 278, 281
Wielandt, H., 163, 167 Zolotarev, G., 303
Wiener, N., 162–164, 167, 259, 261, 264–265, Zuckerman, H. S., 69, 74
277, 281 Zygmund, A., 162, 167, 482, 485, 542, 543
Subject index

Abelian weights, 143 critical line, 328

Abel’s theorem, 147 critical strip, 328
abscissa
of absolute convergence, 14 Dedekind zeta function, 194, 321, 343,
of convergence, 11 392
arithmetic semigroup, 278 Dickman function, 200, 201, 210–212
Axer’s theorem, 247, 276 differential–delay equation, 200, 216
digamma function, 522ff
Bernoulli numbers, 495ff Dirichlet character: see Character, Dirichlet
Bernoulli polynomials, 495ff Dirichlet convolution, 38
Bertrand’s postulate, 49 Dirichlet divisor problem, 68
beta function, 530 Dirichlet–Jordan test, 542
Beurling primes, 266ff, 277, 278, 483 Dirichlet kernel, 535
Blaschke product, 192 Dirichlet L-function, 120ff
Birch–Swinnerton-Dyer conjectures, analytic continuation, 121, 332–333
393 distribution of zeros, 351, 454–456
Borel–Carathéodory lemma, 169 Euler product, 120, 121
Brun–Titchmarsh inequality, 90 exceptional zero, 360, 367ff
Buchstab’s function, 216–220 functional equation, 333
non-trivial zeros, 333, 358ff
Catalan’s constant, 514 special values, 337
Catalan numbers, 8 trivial zeros, 333
Cesàro summability, 147, 158 Dirichlet series, 1, 11ff, 137ff
Cesàro weights, 142 formal, 39
character generalized, 31
additive, 108ff ordinary, 31
Dirichlet, 115ff Dirichlet’s theorem
complex, 123 on Diophantine approximation, 478
conductor, 283 on primes in a. p., 123
induced, 282 discriminant, 343
primitive, 282ff quadratic, 296
quadratic, 295ff divisor function, 2, 38, 45–46, 55–56, 60,
real, 123 68–69
group, 133
circle problem, 45–46 Euler numbers, 506
covering congruences, 7 Euler’s constant, 26, 514


Euler–Maclaurin summation formula, 25, 44, Lambert summability, 159

500ff Landau’s theorem, 16, 32, 463
Euler products, 19ff lattice, 541
Euler’s totient function, 27, 36, 55, 68 Lerch zeta function, 515
explicit formulæ, 397ff Lindelöf Hypothesis, 330, 438
Liouville lambda function, 21
Farey fractions, 183, 184 logarithmic integral, 5, 180, 189ff
finite differences, 510
finite Fourier transform, 109 von Mangoldt lambda function, 23
Fourier series, 535ff matrix,
fractional part, 39 unimodular, 541
function, unitary, 112, 119
additive, 21 Mellin transform, 137, 141
arithmetic, 20 inverse, 137, 141
even, 133 Mellin–Barnes integrals, 532
multiplicative, 20 method of the hyperbola, 38
totally additive, 21 Mercer’s theorem, 158
totally multiplicative, 20 Minkowski’s convex body theorem, 542
Möbius mu function, 21
gamma function, 520ff
Artin’s theorem, 520, 535 oscillation of error terms, 463ff
Euler’s integral, 524, 532
Gauss’s formula, 520, 531 Parseval’s identity, 110, 133
Gauss’s multiplication formula, 527, 532 partition, 7
Hankel’s integral, 525 Pell’s equation, 134
incomplete, 327 Perron’s formula, 137ff
Legendre’s duplication formula, 522, 532 Plancherel’s identity, 144, 162
Mellin’s integral, 525, 529 Poisson summation
reflection formula, 521, 532 formula, 538ff
special values of, 520ff Pólya–Vinogradov inequality, 307, 309, 322
Stirling’s formula, 523, 532 power series, 1
Weierstrass product, 520 power-full number, 66
Gauss sum, 286ff Prime Ideal Theorem, 194, 267
generalized prime numbers, Prime k-tuple conjecture, 103, 224
see Beurling primes Prime Number Theorem, 3, 168ff, 244ff, 276,
Generalized Riemann Hypothesis, 333 277
generating function, 1 elementary proof, 250ff
Grössencharakter, 120, 132, 344, 366, 385 for arithmetic progressions, 358ff
group representation, 133
Ramanujan expansion, 133
Hankel path, 515 Ramanujan sum, 110ff, 133, 265, 287
Heisenberg uncertainty principle, 147 regular transformation, 148
Hurwitz zeta function, 30, 340, 513 Riemann Hypothesis, 328, 417
consequences of, 419ff
inclusion–exclusion, 77 Generalized, 333
inversion formula, Riemann–Siegel formula, 515
Möbius, 35 Riemann–Roch theorem, 322
Riemann–Stieltjes integral, 12, 486ff
Jensen’s formula, 168 first mean value theorem for, 491
refinement, 492
Kronecker symbol, 296 second mean value theorem for, 492
Kummer congruences, 514 uniform, 492

Riemann zeta function, 2 square-free number, 36, 183, 186, 225, 446,
analytic continuation, 24–27, 500, 501 471
distribution of zeros, 175, 353–354, von Staudt–Clausen theorem, 512, 514
452ff Stirling’s formula, 503
Euler product, 22 summability, 147–167
functional equation, 326ff Abel, 147
linear independence of zeros, 447ff, Cesàro, 158
467ff Lambert, 159
non-trivial zero, 328 Riesz, 158
special values, 328 sums of two squares, 45, 46, 187, 188, 227,
trivial zeros, 328 228
zero-free region, 168–175, 192–194 symmetric group, 184
zeros on the critical line, 456ff
Riesz product, 482 tangent coefficients, 505
Riesz representation theorem, 493 Tauberian theorem, 150ff
Riesz typical mean, 143 Hardy–Littlewood, 151–155, 163
Hardy’s, 150
saw-tooth function, 536 Karamata’s, 163
secant coefficients, 506 Littlewood’s, 151, 163
sieve, 76ff Tauber’s first, 150
Brun, 78 Tauber’s second, 160–161
combinatorial, 78 Wiener–Ikehara, 259–266, 277
Eratosthenes–Legendre, 76 Wiener’s, 163–164
Selberg, 82ff, 102
sine integral, 139 Wallis’ formula, 503, 507
square-free kernel, 84 Weyl sum, 193
