CHAINS OF LARGE GAPS BETWEEN PRIMES

arXiv:1511.04468v1 [math.NT] 13 Nov 2015

KEVIN FORD, JAMES MAYNARD, AND TERENCE TAO

ABSTRACT. Let $p_n$ denote the $n$-th prime, and for any $k \geq 1$ and sufficiently large $X$, define the quantity
\[ G_k(X) := \max_{p_{n+k} \leq X} \min(p_{n+1} - p_n, \dots, p_{n+k} - p_{n+k-1}), \]
which measures the occurrence of chains of $k$ consecutive large gaps of primes. Recently, with Green and Konyagin, the authors showed that
\[ G_1(X) \gg \frac{\log X \log\log X \log\log\log\log X}{\log\log\log X} \]
for sufficiently large $X$. In this note, we combine the arguments in that paper with the Maier matrix method to show that
\[ G_k(X) \gg \frac{1}{k^2} \frac{\log X \log\log X \log\log\log\log X}{\log\log\log X} \]
for any fixed $k$ and sufficiently large $X$. The implied constant is effective and independent of $k$.

CONTENTS
1. Introduction
2. Siegel zeroes
3. Sieving an interval
4. Sieving a set of primes
5. Using a hypergraph covering theorem
6. Using a sieve weight
References

1. INTRODUCTION

Let $p_n$ denote the $n$th prime, and for any $k \geq 1$ and sufficiently large $X$, let
\[ G_k(X) := \max_{p_{n+k} \leq X} \min(p_{n+1} - p_n, \dots, p_{n+k} - p_{n+k-1}) \]
denote the maximum gap between $k$ consecutive primes less than $X$. The quantity $G_1(X)$ has been extensively studied. The prime number theorem implies that
\[ G_1(X) \geq (1 + o(1)) \log X, \]
with the bound being successively improved in many papers [1], [4], [25], [9], [22], [24], [23], [15], [20], [18], [10], [11]. The best lower bound currently is$^1$
\[ G_1(X) \gg \frac{\log X \log_2 X \log_4 X}{\log_3 X} \]
for sufficiently large $X$ and an effective implied constant, due to [11]. This result may be compared against the conjecture $G_1(X) \asymp \log^2 X$ of Cramér [7] (see also [13]), or the upper bound $G_1(X) \ll X^{0.525}$ of Baker-Harman-Pintz [3], which can be improved to $G_1(X) \ll X^{1/2} \log X$ on the Riemann hypothesis [6].
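For small $X$ the quantity $G_k(X)$ can be computed directly from its definition. The following is a minimal brute-force sketch, intended only to make the definition concrete; the function names `primes_up_to` and `chain_gap` are illustrative choices and not notation from the paper.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            # Cross out the multiples i*i, i*i + i, ... up to limit.
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(is_prime) if flag]


def chain_gap(k, X):
    """G_k(X): the largest value, over all chains of k consecutive prime gaps
    ending at a prime <= X, of the smallest gap in the chain.
    Returns None when there are fewer than k + 1 primes up to X."""
    primes = primes_up_to(X)
    gaps = [b - a for a, b in zip(primes, primes[1:])]
    if len(gaps) < k:
        return None
    return max(min(gaps[i : i + k]) for i in range(len(gaps) - k + 1))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, chain_gap(k, 100))
```

For $X = 100$ this gives $G_1 = 8$ (the gap from 89 to 97), $G_2 = 6$ and $G_3 = 4$, illustrating the trivial monotonicity $G_k(X) \leq G_1(X)$.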
Now we turn to $G_k(X)$ in the regime where $k \geq 1$ is fixed, and $X$ is assumed sufficiently large depending on $k$. Clearly $G_k(X) \leq G_1(X)$, and a naive extension of the probabilistic heuristics of Cramér [7] suggests that $G_k(X) \asymp \frac{1}{k} \log^2 X$ as $X \to \infty$. The first non-trivial bound on $G_k(X)$ for $k \geq 2$ was by Erdős [9], who showed that
\[ G_2(X)/\log X \to \infty \]
as $X \to \infty$. Using what is now known as the Maier matrix method, together with the arguments of Rankin [22] on $G_1(X)$, Maier [14] showed that
\[ G_k(X) \gg_k \frac{\log X \log_2 X \log_4 X}{(\log_3 X)^2} \]
for any fixed $k \geq 1$ and a sequence of $X$ going to infinity. Recently, by modifying Maier's arguments and using the more recent work on $G_1(X)$ in [10], [18], this was improved by Pintz [19] to show that
\[ G_k(X) \Big/ \left( \frac{\log X \log_2 X \log_4 X}{(\log_3 X)^2} \right) \to \infty \]
for a sequence of $X$ going to infinity.
Our main result here is as follows.
Theorem 1. Let $k \geq 1$ be fixed. Then for sufficiently large $X$, we have
\[ G_k(X) \gg \frac{1}{k^2} \frac{\log X \log_2 X \log_4 X}{\log_3 X}. \]
The implied constant is absolute and effective.
Maier's original argument required one to avoid Siegel zeroes, which restricted his results to a sequence of $X$ going to infinity, rather than all sufficiently large $X$. However, it is possible to modify his argument to remove the effect of any exceptional zeroes, which allows us to extend the result to all sufficiently large $X$ and also to make the implied constant effective. The intuitive reason for the $\frac{1}{k^2}$ factor is that our method produces, roughly speaking, $k$ primes distributed "randomly" inside an interval of length about $\frac{\log X \log_2 X \log_4 X}{\log_3 X}$, and the narrowest gap between $k$ independently chosen numbers in an interval of length $L$ is typically of length about $\frac{1}{k^2} L$.
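The $\frac{1}{k^2} L$ heuristic can be checked by simulation. The sketch below is a toy model only, assuming the $k$ points are independent and uniform on $[0, L]$ (the primes produced by the method are of course not literally uniform). For this model the expected smallest spacing between neighbouring points is exactly $L/(k^2 - 1)$, a standard fact about uniform spacings. The function name `mean_min_gap` and its parameters are illustrative.

```python
import random


def mean_min_gap(k, length=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the expected smallest spacing between
    neighbouring points when k >= 2 points are drawn independently and
    uniformly from [0, length]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = sorted(rng.uniform(0.0, length) for _ in range(k))
        total += min(b - a for a, b in zip(pts, pts[1:]))
    return total / trials


if __name__ == "__main__":
    for k in (2, 5, 10, 20):
        # Compare the estimate with length / (k^2 - 1).
        print(k, mean_min_gap(k), 1.0 / (k * k - 1))
```

Doubling $k$ therefore shrinks the typical narrowest gap by a factor of about four, which matches the $k^{-2}$ loss in Theorem 1.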
Our argument is based heavily on our previous paper [11], in particular using the hypergraph covering
lemma from [11, Corollary 3] and the construction of sieve weights from [11, Theorem 5]. The main
difference is in refining the probabilistic analysis in [11] to obtain good upper and lower bounds for certain
sifted sets arising in the arguments in [11], whereas in the former paper only upper bounds were obtained.
$^1$ As usual in the subject, $\log_2 x := \log\log x$, $\log_3 x := \log\log\log x$, and so on. The conventions for asymptotic notation such as $\ll$ and $o()$ will be defined in Section 1.2.

We remark that in the recent paper [2], the methods from [11] were modified to obtain some information about the limit points of tuples of $k$ consecutive prime gaps normalized by factors slightly slower than $\frac{\log X \log_2 X \log_4 X}{\log_3 X}$; see Theorem 6.4 of that paper for a precise statement.

1.1. Acknowledgments. KF thanks the hospitality of the Institute of Mathematics and Informatics of the
Bulgarian Academy of Sciences. The research of JM was conducted partly while he was a CRM-ISM
postdoctoral fellow at the Université de Montréal, and partly while he was a Fellow by Examination at
Magdalen College, Oxford.
KF was supported by NSF grant DMS-1201442. TT was supported by a Simons Investigator grant, the
James and Carol Collins Chair, the Mathematical Analysis & Application Research Fund Endowment, and
by NSF grant DMS-1266164.
The authors thank Tristan Freiberg for some corrections.
1.2. Notational conventions. In most of the paper, $x$ will denote an asymptotic parameter going to infinity, with many quantities allowed to depend on $x$. The symbol $o(1)$ will stand for a quantity bounded in magnitude by $c(x)$, where $c(x)$ is a quantity that tends to zero as $x \to \infty$. The same convention applies to the asymptotic notation $X \sim Y$, which means $X = (1 + o(1))Y$, and $X \lesssim Y$, which means $X \leq (1 + o(1))Y$. We use $X = O(Y)$, $X \ll Y$, and $Y \gg X$ to denote the claim that there is a constant $C > 0$ such that $|X| \leq CY$ throughout the domain of the quantity $X$. We adopt the convention that $C$ is independent of any parameter unless such dependence is indicated, e.g. by subscript such as $\ll_k$. In all of our estimates here, the constant $C$ will be effective (we will not rely on ineffective results such as Siegel's theorem). If we can take the implied constant $C$ to equal 1, we write $f = O_{\leq}(g)$ instead. Thus for instance
\[ X = (1 + O_{\leq}(\varepsilon))Y \]
is synonymous with
\[ (1 - \varepsilon)Y \leq X \leq (1 + \varepsilon)Y. \]
Finally, we use $X \asymp Y$ synonymously with $X \ll Y \ll X$.
When summing or taking products over the symbol $p$, it is understood that $p$ is restricted to be prime.
Given a modulus $q$ and an integer $n$, we use $n \bmod q$ to denote the congruence class of $n$ in $\mathbb{Z}/q\mathbb{Z}$.
Given a set $A$, we use $1_A$ to denote its indicator function, thus $1_A(x)$ is equal to 1 when $x \in A$ and zero otherwise. Similarly, if $E$ is an event or statement, we use $1_E$ to denote the indicator, equal to 1 when $E$ is true and 0 otherwise. Thus for instance $1_A(x)$ is synonymous with $1_{x \in A}$.
We use $\#A$ to denote the cardinality of $A$, and for any positive real $z$, we let $[z] := \{n \in \mathbb{N} : 1 \leq n \leq z\}$ denote the set of natural numbers up to $z$.
Our arguments will rely heavily on the probabilistic method. Our random variables will mostly be discrete (in the sense that they take at most countably many values), although we will occasionally use some continuous random variables (e.g. independent real numbers sampled uniformly from the unit interval $[0, 1]$). As such, the usual measure-theoretic caveats such as "absolutely integrable", "measurable", or "almost surely" can be largely ignored by the reader in the discussion below. We will use boldface symbols such as $\mathbf{X}$ or $\mathbf{a}$ to denote random variables (and non-boldface symbols such as $X$ or $a$ to denote deterministic counterparts of these variables). Vector-valued random variables will be denoted in arrowed boldface, e.g. $\vec{\mathbf{a}} = (\mathbf{a}_p)_{p \in \mathcal{P}}$ might denote a random tuple of random variables $\mathbf{a}_p$ indexed by some index set $\mathcal{P}$.
We write $\mathbb{P}$ for probability, and $\mathbb{E}$ for expectation. If $\mathbf{X}$ takes at most countably many values, we define the essential range of $\mathbf{X}$ to be the set of all $X$ such that $\mathbb{P}(\mathbf{X} = X)$ is non-zero, thus $\mathbf{X}$ almost surely takes values in its essential range. We also employ the following conditional expectation notation. If $E$ is an event of non-zero probability, we write
\[ \mathbb{P}(F|E) := \frac{\mathbb{P}(F \wedge E)}{\mathbb{P}(E)} \]
for any event $F$, and
\[ \mathbb{E}(\mathbf{X}|E) := \frac{\mathbb{E}(\mathbf{X} 1_E)}{\mathbb{P}(E)} \]
for any (absolutely integrable) real-valued random variable $\mathbf{X}$. If $\mathbf{Y}$ is another random variable taking at most countably many values, we define the conditional probability $\mathbb{P}(F|\mathbf{Y})$ to be the random variable that equals $\mathbb{P}(F|\mathbf{Y} = Y)$ on the event $\mathbf{Y} = Y$ for each $Y$ in the essential range of $\mathbf{Y}$, and similarly define the conditional expectation $\mathbb{E}(\mathbf{X}|\mathbf{Y})$ to be the random variable that equals $\mathbb{E}(\mathbf{X}|\mathbf{Y} = Y)$ on the event $\mathbf{Y} = Y$.
We observe the idempotency property
\[ \mathbb{E}(\mathbb{E}(\mathbf{X}|\mathbf{Y})) = \mathbb{E}\mathbf{X} \tag{1.1} \]
whenever $\mathbf{X}$ is absolutely integrable and $\mathbf{Y}$ takes at most countably many values.
We will rely frequently on the following simple concentration of measure result.
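Lemma 1.1 (Chebyshev inequality). Let $\mathbf{X}, \mathbf{Y}$ be independent random variables taking at most countably many values. Let $\mathbf{Y}'$ be a conditionally independent copy of $\mathbf{Y}$ over $\mathbf{X}$; in other words, for every $X$ in the essential range of $\mathbf{X}$, the random variables $\mathbf{Y}, \mathbf{Y}'$ are independent and identically distributed after conditioning to the event $\mathbf{X} = X$. Let $F(\mathbf{X}, \mathbf{Y})$ be an (absolutely integrable) random variable depending on $\mathbf{X}$ and $\mathbf{Y}$. Suppose that one has the bounds
\[ \mathbb{E} F(\mathbf{X}, \mathbf{Y}) = \alpha + O(\varepsilon\alpha) \tag{1.2} \]
and
\[ \mathbb{E} F(\mathbf{X}, \mathbf{Y}) F(\mathbf{X}, \mathbf{Y}') = \alpha^2 + O(\varepsilon\alpha^2) \tag{1.3} \]
for some $\alpha, \varepsilon > 0$ with $\varepsilon = O(1)$. Then for any $\theta > 0$, one has
\[ \mathbb{E}(F(\mathbf{X}, \mathbf{Y})|\mathbf{X}) = \alpha + O_{\leq}(\theta) \tag{1.4} \]
with probability $1 - O\left(\frac{\varepsilon\alpha^2}{\theta^2}\right)$.
Proof. See [11, Lemma 1.2]. □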

2. SIEGEL ZEROES

As is common in analytic number theory, we will have to address the possibility of an exceptional Siegel
zero. As we want to keep all our estimates effective, we will not rely on Siegel’s theorem or its consequences
(such as the Bombieri-Vinogradov theorem). Instead, we will rely on the Landau-Page theorem, which we
now recall. Throughout, χ denotes a Dirichlet character.
Lemma 2.1 (Landau-Page theorem). Let $Q > 100$. Suppose that $L(s, \chi) = 0$ for some primitive character $\chi$ of modulus at most $Q$, and some $s = \sigma + it$. Then either
\[ 1 - \sigma \gg \frac{1}{\log(Q(1 + |t|))}, \]
or else $t = 0$ and $\chi$ is a quadratic character $\chi_Q$, which is unique for any given $Q$. Furthermore, if $\chi_Q$ exists, then its conductor $q_Q$ is square-free apart from a factor of at most 4, and obeys the lower bound
\[ q_Q \gg \frac{\log^2 Q}{\log_2^2 Q}. \]
Proof. See e.g. [8, Chapter 14]. The final estimate follows from the classical bound $1 - \beta \gg q^{-1/2} \log^{-2} q$ for a real zero $\beta$ of $L(s, \chi)$ with $\chi$ of modulus $q$. □
We can then eliminate the exceptional character by deleting at most one prime factor of Q.
Corollary 1. Let $Q > 100$. Then there exists a quantity $B_Q$ which is either equal to 1 or is a prime of size
\[ B_Q \gg \log_2 Q \]
with the property that
\[ 1 - \sigma \gg \frac{1}{\log(Q(1 + |t|))} \]
whenever $L(\sigma + it, \chi) = 0$ and $\chi$ is a character of modulus at most $Q$ and coprime to $B_Q$.
Proof. If the exceptional character $\chi_Q$ from Lemma 2.1 does not exist, then take $B_Q := 1$; otherwise we take $B_Q$ to be the largest prime factor of $q_Q$. As $q_Q$ is square-free apart from a factor of at most 4, we have $\log q_Q \ll B_Q$ by the prime number theorem, and the claim follows. □
Next, we recall Gallagher’s prime number theorem:
Lemma 2.2 (Gallagher's prime number theorem). Let $q$ be a natural number, and suppose that $L(s, \chi) \neq 0$ for all characters $\chi$ of modulus $q$ and $s$ with $1 - \sigma \leq \frac{\delta}{\log(Q(1 + |t|))}$, and some constant $\delta > 0$. Then there is a constant $D > 1$ depending only on $\delta$ such that
\[ \#\{p \text{ prime} : p \leq x;\ p \equiv a \pmod{q}\} \gg \frac{x}{\phi(q) \log x} \]
for all $(a, q) = 1$ and $x \geq q^D$.
Proof. See [14, Lemma 2]. □
This will combine well with Corollary 1 once we remove the moduli divisible by the (possible) exceptional prime $B_Q$.

3. SIEVING AN INTERVAL

We now give the key sieving result that will be used to prove Theorem 1.
Theorem 2 (Sieving an interval). There is an absolute constant $c > 0$ such that the following holds. Fix $A > 1$ and $\varepsilon > 0$, and let $x$ be sufficiently large depending on $A$ and $\varepsilon$. Suppose $y$ satisfies
\[ y = c \frac{x \log x \log_3 x}{\log_2 x}, \tag{3.1} \]
and suppose that $B_0 = 1$ or that $B_0$ is a prime satisfying
\[ \log x \ll B_0 \leq x. \]
Then one can find a congruence class $a_p \bmod p$ for each prime $p \leq x$, $p \neq B_0$ such that the sieved set
\[ T := \{n \in [y] \setminus [x] : n \not\equiv a_p \pmod{p} \text{ for all } p \leq x,\ p \neq B_0\} \]
obeys the following size estimates:
• (Upper Bound) One has
\[ \#T \ll A \frac{x}{\log x}. \tag{3.2} \]
• (Lower Bound) One has
\[ \#T \gg A \frac{x}{\log x}. \tag{3.3} \]
• (Upper bound in short intervals) For any $0 \leq \alpha \leq \beta \leq 1$, one has
\[ \#(T \cap [\alpha y, \beta y]) \ll A(|\beta - \alpha| + \varepsilon) \frac{x}{\log x}. \tag{3.4} \]
We remark that if one lowers $y$ to be of order $\frac{x \log x \log_3 x}{(\log_2 x)^2}$ rather than $\frac{x \log x \log_3 x}{\log_2 x}$, then this theorem is essentially [14, Lemma 6]. It is convenient to sieve $[y] \setminus [x]$ instead of $[y]$ for minor technical reasons (we will use the fact that the residue class $0 \bmod p$ avoids all the primes in $[y] \setminus [x]$ whenever $p \leq x$). The arguments in [11] already can give much of this theorem, with the exception of the lower bound (3.3), which is the main additional technical result of this paper that is needed to extend the results of that paper to longer chains.
We will prove Theorem 2 in later sections. In this section, we show how this theorem implies Theorem 1. Here we shall use the Maier matrix method, following the arguments in [14] closely (although we will use probabilistic notation rather than matrix notation). Let $k \geq 1$ be a fixed integer, let $c_0 > 0$ be a small constant, and let $A > 1$ and $0 < \varepsilon < 1/2$ be large and small quantities depending on $k$ to be chosen later.
We now recall (a slight variant of) some lemmas from [14].
Lemma 3.1. There exists an absolute constant $D > 1$ such that, for all sufficiently large $x$, there exists a natural number $B_0$ which is either equal to 1 or a prime, with
\[ \log x \ll B_0 \leq x, \tag{3.5} \]
and is such that the following holds. If one sets $P := P(x)/B_0$ (where we recall that $P(x)$ is the product of the primes up to $x$), then one has
\[ \#\{z \in [Z] : Pz + a \text{ prime}\} \gg \frac{\log x}{\log Z} Z \tag{3.6} \]
for all $Z \geq P^D$ and $a \in [P]$ coprime to $P$, and
\[ \#\{z \in [Z] : Pz + a,\ Pz + b \text{ both prime}\} \ll \left(\frac{\log x}{\log Z}\right)^2 Z \tag{3.7} \]
for all $Z \geq P^D$ and all distinct $a, b \in [P]$ coprime to $P$.
Proof. We first prove (3.6). We apply Corollary 1 with $Q := P(x)$ to obtain a quantity $B_{P(x)}$ with the stated properties. We set $B_0 = 1$ if $B_{P(x)} > x$, and $B_0 := B_{P(x)}$ otherwise. Then from Mertens' theorem we have (3.5) if $B_0 \neq 1$. From Corollary 1 and Lemma 2.2, we then have
\[ \#\{z \in [Z] : Pz + a \text{ prime}\} \gg \frac{PZ}{\phi(P) \log(PZ)} \]
for any $Z \geq P^D$ and a suitable absolute constant $D > 1$. Note that $\log(PZ) \ll \log Z$. From Mertens' theorem (and (3.5)) we also have
\[ \frac{P}{\phi(P)} \asymp \log x, \tag{3.8} \]
and (3.6) follows.
Finally, the estimate (3.7) follows from standard upper bound sieves (cf. [14, Lemma 3]). □
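The Mertens estimate (3.8) is easy to observe numerically. The sketch below takes $B_0 = 1$, so that $P$ is the full primorial $P(x)$, and compares $P/\phi(P) = \prod_{p \leq x} p/(p-1)$ with $e^{\gamma} \log x$, the asymptotic given by Mertens' third theorem. The names `primes_up_to`, `primorial_ratio` and `EULER_GAMMA` are illustrative; removing a single prime $B_0 \gg \log x$ from the product would change the ratio only by a factor $1 - 1/B_0$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(is_prime) if flag]


def primorial_ratio(x):
    """P(x)/phi(P(x)) for the primorial P(x), computed as the product of
    p/(p-1) over primes p <= x (the primorial itself is never formed)."""
    ratio = 1.0
    for p in primes_up_to(x):
        ratio *= p / (p - 1)
    return ratio


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        # The second column should approach exp(EULER_GAMMA) as x grows.
        print(x, primorial_ratio(x) / math.log(x), math.exp(EULER_GAMMA))
```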
Now set $Z := P^D$ with $x$ and $D$ as in Lemma 3.1, and let $\mathbf{z}$ be chosen uniformly at random from $[Z]$. Let $y$, $T$ and $a_p \bmod p$ be as in Theorem 2. By the Chinese remainder theorem, we may find $m \in [P]$ such that $m \equiv -a_p \pmod{p}$ for all $p \leq x$ with $p \neq B_0$. Thus, $\mathbf{z}P + m + T$ consists precisely of those elements of $\mathbf{z}P + m + [y] \setminus [x]$ that are coprime to $P$. In particular, any primes that lie in the interval $\mathbf{z}P + m + [y] \setminus [x]$ lie in $\mathbf{z}P + m + T$.
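The role of the shift $m$ can be illustrated on a toy example. The sketch below uses small primes and arbitrary residue classes chosen purely for illustration (the classes in the paper come from Theorem 2, and here $B_0 = 1$); `crt_shift` and `sieved_set` are hypothetical helper names. It checks that, for every $z$, the elements of $zP + m + [y] \setminus [x]$ coprime to $P$ are exactly the translates of the sieved set $T$.

```python
from math import gcd, prod


def crt_shift(residues):
    """Given a dict {p: a_p} over distinct primes p, return (m, P) where
    P is the product of the primes, 1 <= m <= P, and m = -a_p (mod p)."""
    P = prod(residues)  # iterating a dict yields its keys, i.e. the primes
    m = 0
    for p, a in residues.items():
        cofactor = P // p
        # cofactor * pow(cofactor, -1, p) is 1 mod p and 0 mod other primes.
        m += (-a % p) * cofactor * pow(cofactor, -1, p)
    m %= P
    return (m if m != 0 else P), P


def sieved_set(x, y, residues):
    """T = {n in (x, y] : n != a_p (mod p) for every prime p in residues}."""
    return [n for n in range(x + 1, y + 1)
            if all(n % p != a % p for p, a in residues.items())]


if __name__ == "__main__":
    residues = {2: 1, 3: 2, 5: 0, 7: 3}  # toy choice of classes a_p mod p
    x, y = 7, 40
    m, P = crt_shift(residues)
    T = sieved_set(x, y, residues)
    for z in range(1, 4):
        survivors = [n for n in range(x + 1, y + 1) if gcd(z * P + m + n, P) == 1]
        print(z, survivors == T)
```

The check works because $n \not\equiv a_p \pmod{p}$ is equivalent to $p \nmid n + m$, and adding the multiple $zP$ of $p$ does not affect divisibility by $p$.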
From (3.6) and Mertens' theorem we have
\[ \mathbb{P}(\mathbf{z}P + m + a \text{ prime}) \gg \frac{\log x}{x} \]
for all $a \in T$ (we allow implied constants to depend on $D$). Similarly, from (3.7) and Mertens' theorem we have
\[ \mathbb{P}(\mathbf{z}P + m + a,\ \mathbf{z}P + m + b \text{ both prime}) \ll \left(\frac{\log x}{x}\right)^2 \tag{3.9} \]
for any distinct $a, b \in T$. If we let $\mathbf{N}$ denote the number of primes in $\mathbf{z}P + m + T$ (or equivalently, in $\mathbf{z}P + m + [y] \setminus [x]$), we thus have from (3.2) and (3.3) that
\[ \mathbb{E}\mathbf{N} \gg A \]
and
\[ \mathbb{E}\mathbf{N}^2 \ll A^2. \]
From this we see that with probability $\gg 1$, we have
\[ A \ll \mathbf{N} \ll A, \tag{3.10} \]
where all implied constants are independent of $\varepsilon$ and $A$. (This is because the contribution to $\mathbb{E}\mathbf{N}$ when $\mathbf{N}$ is much larger than $A$ is much smaller than $A$.)
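The parenthetical remark can be made quantitative as follows. This is a sketch of one standard way to carry out the step, not necessarily the authors' exact argument; the constants $c_1, c_2, K$ are names introduced here, with $\mathbb{E}\mathbf{N} \geq c_1 A$ and $\mathbb{E}\mathbf{N}^2 \leq c_2 A^2$.

```latex
\begin{align*}
\mathbb{E}\,\mathbf{N} 1_{\mathbf{N} > KA}
  &\leq \frac{\mathbb{E}\,\mathbf{N}^2}{KA} \leq \frac{c_2}{K} A
  && \text{(as } \mathbf{N} \leq \mathbf{N}^2/(KA) \text{ when } \mathbf{N} > KA\text{)}, \\
\mathbb{E}\,\mathbf{N} 1_{\mathbf{N} \leq KA}
  &\geq c_1 A - \frac{c_2}{K} A = \frac{c_1}{2} A
  && \text{(taking } K := 2c_2/c_1\text{)}, \\
\mathbf{N} 1_{\mathbf{N} \leq KA}
  &\leq \frac{c_1}{4} A + KA \, 1_{c_1 A/4 < \mathbf{N} \leq KA}, \\
\mathbb{P}\Big(\frac{c_1}{4} A < \mathbf{N} \leq KA\Big)
  &\geq \frac{1}{KA}\Big(\frac{c_1}{2} A - \frac{c_1}{4} A\Big)
   = \frac{c_1^2}{8 c_2}.
\end{align*}
```

The last line follows by taking expectations in the third line and inserting the second. Since $c_1, c_2$ do not depend on $\varepsilon$ or $A$, neither do the implied constants in (3.10).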
Next, if $0 \leq \alpha \leq \beta \leq 1$ and $\beta - \alpha \leq 2\varepsilon$, then from (3.9), (3.4) and the union bound we see that the probability that there are at least two primes in $\mathbf{z}P + m + [\alpha y, \beta y]$ is at most
\[ O\left( \left( A\varepsilon \frac{x}{\log x} \right)^2 \left( \frac{\log x}{x} \right)^2 \right) = O(A^2 \varepsilon^2). \]
Note that one can cover $[0, 1]$ with $O(1/\varepsilon)$ intervals of length at most $2\varepsilon$, with the property that any two elements $a, b$ of $[0, 1]$ with $|a - b| \leq \varepsilon$ may be covered by at least one of these intervals. From this and the union bound, we see that the probability that $\mathbf{z}P + m + [y] \setminus [x]$ contains two primes separated by at most $\varepsilon y$ is bounded by $O(\frac{1}{\varepsilon} A^2 \varepsilon^2) = O(A^2 \varepsilon)$. In particular, if we choose $\varepsilon$ to be a sufficiently small multiple of $\frac{1}{A^2}$, we may find $z \in [Z]$ such that the interval $zP + m + [y] \setminus [x]$ contains $\gg A$ primes and has no prime gap less than $\varepsilon y$. If we choose $A$ to be a sufficiently large multiple of $k$, we conclude that
\[ G_k(ZP + m + y) \geq \varepsilon y \gg \frac{1}{k^2} y. \]
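The covering of $[0, 1]$ used in the union bound above can be realised explicitly. The sketch below uses the overlapping intervals $[j\varepsilon, (j+2)\varepsilon] \cap [0, 1]$, which is one natural choice and is an assumption of this example rather than something specified in the text. Exact rational arithmetic (`fractions.Fraction`) is used so that boundary cases are not affected by rounding, and the function names are illustrative.

```python
import math
import random
from fractions import Fraction


def covering_intervals(eps):
    """Intervals [j*eps, (j+2)*eps] (clipped to [0, 1]) for
    j = 0, ..., ceil(1/eps) - 1; eps should be a Fraction in (0, 1]."""
    count = math.ceil(1 / eps)
    return [(j * eps, min(Fraction(1), (j + 2) * eps)) for j in range(count)]


def is_covered(a, b, intervals):
    """True if some single interval contains both a and b."""
    lo, hi = min(a, b), max(a, b)
    return any(left <= lo and hi <= right for left, right in intervals)


if __name__ == "__main__":
    eps = Fraction(1, 20)
    intervals = covering_intervals(eps)
    rng = random.Random(1)
    ok = True
    for _ in range(1000):
        a = Fraction(rng.randrange(0, 10**6 + 1), 10**6)
        b = min(Fraction(1), a + Fraction(rng.randrange(0, 50001), 10**6))
        ok = ok and is_covered(a, b, intervals)
    print(len(intervals), ok)
```

If the smaller of the two points lies in $[j\varepsilon, (j+1)\varepsilon)$, then the larger one is below $(j+2)\varepsilon$, so the $j$-th interval contains both; this is why an overlap of length $\varepsilon$ between neighbouring intervals suffices.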

By Mertens’ theorem, we have ZP + m + y ≪ exp(O(x)), and Theorem 1 then follows from (3.1).
It remains to prove Theorem 2. This is the objective of the remaining sections of the paper.
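For the reader's convenience, the passage from $y$ to the bound of Theorem 1 can be unpacked as follows. This is a sketch under the notation above, writing $X := ZP + m + y$ (a name introduced here) and using only $Z = P^D$, $m \leq P$, and $\log P \asymp x$.

```latex
\begin{align*}
\log X &= \log(ZP + m + y) = (D+1)\log P + O(1) \asymp x, \\
\log x &= \log_2 X + O(1), \qquad \log_2 x \sim \log_3 X, \qquad \log_3 x \sim \log_4 X, \\
y &= c\,\frac{x \log x \log_3 x}{\log_2 x}
   \asymp \frac{\log X \log_2 X \log_4 X}{\log_3 X}, \\
G_k(X) &\geq \varepsilon y \gg \frac{1}{k^2}\,\frac{\log X \log_2 X \log_4 X}{\log_3 X}.
\end{align*}
```

To pass from these particular values of $X$ to all sufficiently large $X$, one can presumably use that $G_k$ is non-decreasing in $X$ and that $\log X$ changes by a bounded factor as $x$ varies slightly; the text leaves this step implicit.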

4. SIEVING A SET OF PRIMES

Theorem 2 concerns the problem of deterministically sieving an interval $[y] \setminus [x]$ of size (3.1) so that the sifted set $T$ has certain size properties. We use a variant of the Erdős-Rankin method to reduce this problem to a problem of probabilistically sieving a set $\mathcal{Q}$ of primes in $[y] \setminus [x]$, rather than integers in $[y] \setminus [x]$.
Given a real number $x > 1$, and a natural number $B_0$, define
\[ z := x^{\log_3 x/(4 \log_2 x)}, \tag{4.1} \]
and introduce the three disjoint sets of primes
\[ \mathcal{S} := \{s \text{ prime} : \log^{20} x < s \leq z;\ s \neq B_0\}, \tag{4.2} \]
\[ \mathcal{P} := \{p \text{ prime} : x/2 < p \leq x;\ p \neq B_0\}, \tag{4.3} \]
\[ \mathcal{Q} := \{q \text{ prime} : x < q \leq y;\ q \neq B_0\}. \tag{4.4} \]
For residue classes $\vec{a} = (a_s \bmod s)_{s \in \mathcal{S}}$ and $\vec{n} = (n_p \bmod p)_{p \in \mathcal{P}}$, define the sifted sets
\[ S(\vec{a}) := \{n \in \mathbb{Z} : n \not\equiv a_s \pmod{s} \text{ for all } s \in \mathcal{S}\} \]
and likewise
\[ S(\vec{n}) := \{n \in \mathbb{Z} : n \not\equiv n_p \pmod{p} \text{ for all } p \in \mathcal{P}\}. \]
We reduce Theorem 2 to
Theorem 3 (Sieving primes). Let $A > 1$ be a real number, let $x$ be sufficiently large depending on $A$, and suppose that $y$ obeys (3.1). Let $B_0$ be a natural number. Then there is a quantity
\[ A' \asymp A, \tag{4.5} \]
and some way to choose the vectors $\vec{\mathbf{a}} = (\mathbf{a}_s \bmod s)_{s \in \mathcal{S}}$ and $\vec{\mathbf{n}} = (\mathbf{n}_p \bmod p)_{p \in \mathcal{P}}$ at random (not necessarily independent of each other), such that for any fixed $0 \leq \alpha < \beta \leq 1$ (independent of $x$), one has with probability $1 - o(1)$ that
\[ \#(\mathcal{Q} \cap S(\vec{\mathbf{a}}) \cap S(\vec{\mathbf{n}}) \cap (\alpha y, \beta y]) \sim A' |\beta - \alpha| \frac{x}{\log x}. \tag{4.6} \]
The $o(1)$ decay rates in the probability error and implied in the $\sim$ notation are allowed to depend on $A, \alpha, \beta$.
In [11, Theorem 2], a weaker version of this theorem was established in which $B_0$ was not present, and only the upper bound in (4.6) was proven. Thus, the main new contribution of this paper is the lower bound in (4.6).
We prove Theorem 3 in subsequent sections. In this section, we show how this theorem implies Theorem 2 (and hence Theorem 1). The arguments here are almost identical to those in [11, §2].
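Fix $A > 1$, $0 < \varepsilon \leq 1$. We partition $(0, 1]$ into $O(1/\varepsilon)$ intervals $(\alpha_i, \beta_i]$ of length between $\varepsilon/2$ and $\varepsilon$. Applying Theorem 3 with the pairs $(\alpha, \beta) = (\alpha_i, \beta_i)$ and the pair $(\alpha, \beta) = (0, 1)$, and invoking a union bound (and the fact that $\varepsilon$ is independent of $x$), we see that if $x$ is sufficiently large (depending on $A, \varepsilon$), there are $A'$, $y$ obeying (4.5), (3.1) and tuples of residue classes $\vec{a} = (a_s \bmod s)_{s \in \mathcal{S}}$ and $\vec{n} = (n_p \bmod p)_{p \in \mathcal{P}}$ such that
\[ \#(\mathcal{Q} \cap S(\vec{a}) \cap S(\vec{n})) \sim A' \frac{x}{\log x} \]
and
\[ \#(\mathcal{Q} \cap S(\vec{a}) \cap S(\vec{n}) \cap (\alpha_i y, \beta_i y]) \ll A\varepsilon \frac{x}{\log x} \]
for all $i$. A covering argument then gives
\[ \#(\mathcal{Q} \cap S(\vec{a}) \cap S(\vec{n}) \cap [\alpha y, \beta y]) \ll A(|\beta - \alpha| + \varepsilon) \frac{x}{\log x} \]
for any $0 \leq \alpha < \beta \leq 1$. Now we extend the tuple $\vec{a}$ to a tuple $(a_p)_{p \leq x,\, p \neq B_0}$ of congruence classes $a_p \bmod p$ for all primes $p \leq x$ with $p \neq B_0$ by setting $a_p := n_p$ for $p \in \mathcal{P}$ and $a_p := 0$ for $p \notin \mathcal{S} \cup \mathcal{P}$, and consider the sifted set
\[ T := \{n \in [y] \setminus [x] : n \not\equiv a_p \pmod{p} \text{ for all } p \leq x,\ p \neq B_0\}. \]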
The elements of $T$, by construction, are not divisible by any prime in $(0, \log^{20} x]$ or in $(z, x/2]$, except possibly for $B_0$. Thus, each element must either be a $z$-smooth number (i.e. a number with all prime factors at most $z$) times a power of $B_0$, or must consist of a prime greater than $x/2$, possibly multiplied by some additional primes that are all either at least $\log^{20} x$ or equal to $B_0$. However, from (3.1) we know that $y = o(x \log x)$, and by hypothesis we know that $B_0 \gg \log x$. Thus, we see that an element of $T$ is either a $z$-smooth number times a power of $B_0$ or a prime in $\mathcal{Q}$. In the second case, the element lies in $\mathcal{Q} \cap S(\vec{a}) \cap S(\vec{n})$. Conversely, every element of $\mathcal{Q} \cap S(\vec{a}) \cap S(\vec{n})$ lies in $T$. Thus, $T$ only differs from $\mathcal{Q} \cap S(\vec{a}) \cap S(\vec{n})$ by a set $R$ consisting of $z$-smooth numbers in $[y]$ multiplied by powers of $B_0$.
To estimate $\#R$, let
\[ u := \frac{\log y}{\log z}, \]
so from (3.1), (4.1) one has $u \sim 4 \frac{\log_2 x}{\log_3 x}$. The number of powers of $B_0$ in $[y]$ is $O(\log x)$. By standard counts for smooth numbers (e.g. de Bruijn's theorem [5]) and (3.1), we thus have
\[ \#R \ll \log x \times y e^{-u \log u + O(u \log\log(u+2))} = \log x \times \frac{y}{\log^{4+o(1)} x} = o\left(\frac{x}{\log x}\right). \]
Thus the contribution of $R$ to $T$ is negligible for the purposes of establishing the bounds (3.2), (3.3), (3.4), and Theorem 2 follows from (4.6).
It remains to establish Theorem 3. This is the objective of the remaining sections of the paper.

5. USING A HYPERGRAPH COVERING THEOREM

In the previous section we reduced matters to obtaining random residue classes $\vec{\mathbf{a}}, \vec{\mathbf{n}}$ such that the sifted set $\mathcal{Q} \cap S(\vec{\mathbf{a}}) \cap S(\vec{\mathbf{n}})$ is small. In this section we use a hypergraph covering theorem from [11] to reduce the task to that of finding random residue classes $\vec{\mathbf{n}}$ that have large intersection with $\mathcal{Q} \cap S(\vec{\mathbf{a}})$. More precisely, we will use the following result:
Theorem 4. Let $x \to \infty$. Let $\mathcal{P}', \mathcal{Q}'$ be sets of primes in $(x/2, x]$ and $(x, x \log x]$, respectively, with $\#\mathcal{Q}' > (\log_2 x)^3$. For each $p \in \mathcal{P}'$, let $\mathbf{e}_p$ be a random subset of $\mathcal{Q}'$ satisfying the size bound
\[ \#\mathbf{e}_p \leq r = O\left( \frac{\log x \log_3 x}{\log_2^2 x} \right) \quad (p \in \mathcal{P}'). \tag{5.1} \]
Assume the following:
• (Sparsity) For all $p \in \mathcal{P}'$ and $q \in \mathcal{Q}'$,
\[ \mathbb{P}(q \in \mathbf{e}_p) \leq x^{-1/2 - 1/10}. \tag{5.2} \]
• (Uniform covering) For all but at most $\frac{1}{(\log_2 x)^2} \#\mathcal{Q}'$ elements $q \in \mathcal{Q}'$, we have
\[ \sum_{p \in \mathcal{P}'} \mathbb{P}(q \in \mathbf{e}_p) = C + O_{\leq}\left( \frac{1}{(\log_2 x)^2} \right) \tag{5.3} \]
for some quantity $C$, independent of $q$, satisfying
\[ \frac{5}{4} \log 5 \leq C \ll 1. \tag{5.4} \]
Then for any positive integer $m$ with
\[ m \leq \frac{\log_3 x}{\log 5}, \tag{5.5} \]
we can find random sets $\mathbf{e}'_p \subseteq \mathcal{Q}'$ for each $p \in \mathcal{P}'$ such that
\[ \#\{q \in \mathcal{Q}' : q \notin \mathbf{e}'_p \text{ for all } p \in \mathcal{P}'\} \sim 5^{-m} \#\mathcal{Q}' \]
with probability $1 - o(1)$. More generally, for any $\mathcal{Q}'' \subset \mathcal{Q}'$ with cardinality at least $(\#\mathcal{Q}')/\sqrt{\log_2 x}$, one has
\[ \#\{q \in \mathcal{Q}'' : q \notin \mathbf{e}'_p \text{ for all } p \in \mathcal{P}'\} \sim 5^{-m} \#\mathcal{Q}'' \]
with probability $1 - o(1)$. The decay rates in the $o(1)$ and $\sim$ notation are uniform in $\mathcal{P}', \mathcal{Q}', \mathcal{Q}''$.
Proof. See [11, Corollary 3]. □
In view of the above result, we may now reduce Theorem 3 to the following claim.
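Theorem 5 (Random construction). Let $x$ be a sufficiently large real number, let $B_0$ be a natural number and suppose $y$ satisfies (3.1). Then there is a quantity $C$ with
\[ C \asymp \frac{1}{c} \tag{5.6} \]
with the implied constants independent of $c$, and some way to choose random vectors $\vec{\mathbf{a}} = (\mathbf{a}_s \bmod s)_{s \in \mathcal{S}}$ and $\vec{\mathbf{n}} = (\mathbf{n}_p)_{p \in \mathcal{P}}$ of congruence classes $\mathbf{a}_s \bmod s$ and integers $\mathbf{n}_p$, obeying the following axioms:
• For every $\vec{a}$ in the essential range of $\vec{\mathbf{a}}$, one has
\[ \mathbb{P}(q \equiv \mathbf{n}_p \pmod{p} \mid \vec{\mathbf{a}} = \vec{a}) \leq x^{-1/2 - 1/10} \]
uniformly for all $p \in \mathcal{P}$.
• For fixed $0 \leq \alpha < \beta \leq 1$, we have with probability $1 - o(1)$ that
\[ \#(\mathcal{Q} \cap S(\vec{\mathbf{a}}) \cap [\alpha y, \beta y]) \sim 80c |\beta - \alpha| \frac{x}{\log x} \log_2 x. \tag{5.7} \]
• Call an element $\vec{a}$ in the essential range of $\vec{\mathbf{a}}$ good if, for all but at most $\frac{x}{\log x \log_2 x}$ elements $q \in \mathcal{Q} \cap S(\vec{a})$, one has
\[ \sum_{p \in \mathcal{P}} \mathbb{P}(q \equiv \mathbf{n}_p \pmod{p} \mid \vec{\mathbf{a}} = \vec{a}) = C + O_{\leq}\left( \frac{1}{(\log_2 x)^2} \right). \tag{5.8} \]
Then $\vec{\mathbf{a}}$ is good with probability $1 - o(1)$.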

We now show why Theorem 5 implies Theorem 3. By (5.6), we may choose $0 < c < 1/2$ small enough so that (5.4) holds. Let $A > 1$ be a fixed quantity. Then we can find an integer $m$ obeying (5.5) such that the quantity
\[ A' := 5^{-m} \times 80c \log_2 x \]
is such that $A' \asymp A$ with implied constants independent of $A$.
Suppose that we are in the probability $1 - o(1)$ event that $\vec{\mathbf{a}}$ takes a value $\vec{a}$ which is good and such that (5.7) holds. On each sub-event $\vec{\mathbf{a}} = \vec{a}$ of this probability $1 - o(1)$ event, we may apply Theorem 4 (for the random variables $\mathbf{n}_p$ conditioned to this event) to define the random variables $\mathbf{n}'_p$ on this event with the stated properties. For the remaining events $\vec{\mathbf{a}} = \vec{a}$, we set $\mathbf{n}'_p$ arbitrarily (e.g. we could set $\mathbf{n}'_p = 0$). The claim (4.6) then follows from Theorem 4 and (5.7), thus establishing Theorem 3.
It remains to establish Theorem 5. This will be achieved in the next section.

6. USING A SIEVE WEIGHT

If $r$ is a natural number, an admissible $r$-tuple is a tuple $(h_1, \dots, h_r)$ of distinct integers $h_1, \dots, h_r$ that do not cover all residue classes modulo $p$, for any prime $p$. For instance, the tuple $(p_{\pi(r)+1}, \dots, p_{\pi(r)+r})$ consisting of the first $r$ primes larger than $r$ is an admissible $r$-tuple.
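Admissibility is a finite check: a tuple of length $r$ occupies at most $r$ residue classes, so only primes $p \leq r$ can possibly be covered. The sketch below implements this test and verifies the example above for small $r$; the function names are illustrative and the trial-division primality test is only meant for small inputs.

```python
def is_prime(n):
    """Trial-division primality test (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_admissible(h):
    """True if the integers in h are distinct and, for every prime p,
    miss at least one residue class modulo p. Only primes p <= len(h)
    need testing, since len(h) values cannot cover more classes."""
    r = len(h)
    if len(set(h)) != r:
        return False
    for p in range(2, r + 1):
        if is_prime(p) and len({x % p for x in h}) == p:
            return False
    return True


def primes_above(r):
    """The first r primes larger than r."""
    found, n = [], r + 1
    while len(found) < r:
        if is_prime(n):
            found.append(n)
        n += 1
    return found


if __name__ == "__main__":
    print(primes_above(5), is_admissible(primes_above(5)))
    print(is_admissible([0, 2, 4]), is_admissible([0, 2, 6]))
```

The tuple of primes above $r$ is admissible because none of its entries is divisible by a prime $p \leq r$, so the class $0 \bmod p$ is always missed. By contrast $(0, 2, 4)$ covers every class modulo 3 and is not admissible, while $(0, 2, 6)$ is.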
We will establish Theorem 5 by a probabilistic argument involving a certain weight function. More
precisely, we will deduce this result from the following construction from [11].
Theorem 6 (Existence of good sieve weight). Let $x$ be a sufficiently large real number, let $B_0$ be an integer, and let $y$ be any quantity obeying (3.1). Let $\mathcal{P}, \mathcal{Q}$ be defined by (4.3), (4.4). Let $r$ be a positive integer with
\[ r_0 \leq r \leq \log^{c_0} x \tag{6.1} \]
for some sufficiently small absolute constant $c_0$ and sufficiently large absolute constant $r_0$, and let $(h_1, \dots, h_r)$ be an admissible $r$-tuple contained in $[2r^2]$. Then one can find a positive quantity
\[ \tau \geq x^{-o(1)} \tag{6.2} \]
and a positive quantity $u = u(r)$ depending only on $r$ with
\[ u \asymp \log r \tag{6.3} \]
and a non-negative function $w : \mathcal{P} \times \mathbb{Z} \to \mathbb{R}^+$ supported on $\mathcal{P} \times (\mathbb{Z} \cap [-y, y])$ with the following properties:
• Uniformly for every $p \in \mathcal{P}$, one has
\[ \sum_{n \in \mathbb{Z}} w(p, n) = \left( 1 + O\left( \frac{1}{\log_2^{10} x} \right) \right) \tau \frac{y}{\log^r x}. \tag{6.4} \]
• Uniformly for every $q \in \mathcal{Q}$ and $i = 1, \dots, r$, one has
\[ \sum_{p \in \mathcal{P}} w(p, q - h_i p) = \left( 1 + O\left( \frac{1}{\log_2^{10} x} \right) \right) \tau \frac{u}{r} \frac{x}{2 \log^r x}. \tag{6.5} \]
• Uniformly for every $h = O(y/x)$ that is not equal to any of the $h_i$, one has
\[ \sum_{q \in \mathcal{Q}} \sum_{p \in \mathcal{P}} w(p, q - hp) = O\left( \frac{1}{\log_2^{10} x} \tau \frac{x}{\log^r x} \frac{y}{\log x} \right). \tag{6.6} \]
• Uniformly for all $p \in \mathcal{P}$ and $n \in \mathbb{Z}$,
\[ w(p, n) = O(x^{1/3 + o(1)}). \tag{6.7} \]
Proof. See$^2$ [11, Theorem 5]. We remark that the construction of the weights and the verification of the required estimates relies heavily on the previous work of the second author in [17]. □
It remains to show how Theorem 6 implies Theorem 5. The analysis will be based on that in [11, §5], which used a weight with slightly weaker hypotheses than in Theorem 6 to obtain somewhat weaker conclusions than Theorem 5 (in which the condition $q \equiv \mathbf{n}_p \pmod{p}$ was replaced by the stronger condition that $q = \mathbf{n}_p + h_i p$ for some $i = 1, \dots, r$).
Let $x, B_0, c, y, z, \mathcal{S}, \mathcal{P}, \mathcal{Q}$ be as in Theorem 5. Let $c_0$ be a sufficiently small absolute constant. We set $r$ to be the maximum value permitted by Theorem 6, namely
\[ r := \lfloor \log^{c_0} x \rfloor \tag{6.8} \]
and let $(h_1, \dots, h_r)$ be the admissible $r$-tuple consisting of the first $r$ primes larger than $r$, thus $h_i = p_{\pi(r)+i}$ for $i = 1, \dots, r$. From the prime number theorem we have $h_i = O(r \log r)$ for $i = 1, \dots, r$, and so we have $h_i \in [2r^2]$ for $i = 1, \dots, r$ if $x$ is large enough (there are many other choices possible, e.g. $(h_1, \dots, h_r) = (1^2, 3^2, \dots, (2r-1)^2)$). We now invoke Theorem 6 to obtain quantities $\tau, u$ and a weight $w : \mathcal{P} \times \mathbb{Z} \to \mathbb{R}^+$ with the stated properties.
For each $p \in \mathcal{P}$, let $\tilde{\mathbf{n}}_p$ denote the random integer with probability density
\[ \mathbb{P}(\tilde{\mathbf{n}}_p = n) := \frac{w(p, n)}{\sum_{n' \in \mathbb{Z}} w(p, n')} \]
for all $n \in \mathbb{Z}$ (we will not need to impose any independence conditions on the $\tilde{\mathbf{n}}_p$). From (6.4), (6.5) we have
\[ \sum_{p \in \mathcal{P}} \mathbb{P}(q = \tilde{\mathbf{n}}_p + h_i p) = \left( 1 + O\left( \frac{1}{\log_2^{10} x} \right) \right) \frac{u}{r} \frac{x}{2y} \tag{6.9} \]
for every $q \in \mathcal{Q}$ and $i = 1, \dots, r$, and similarly from (6.4), (6.6) we have
\[ \sum_{q \in \mathcal{Q}} \sum_{p \in \mathcal{P}} \mathbb{P}(q = \tilde{\mathbf{n}}_p + hp) \ll \frac{1}{\log_2^{10} x} \frac{x}{\log x} \tag{6.10} \]
for every $h = O(y/x)$ not equal to any of the $h_i$. Finally, from (6.4), (6.7), (6.2) one has
\[ \mathbb{P}(\tilde{\mathbf{n}}_p = n) \ll x^{-1/2 - 1/6 + o(1)} \tag{6.11} \]
for all $p \in \mathcal{P}$ and $n \in \mathbb{Z}$.
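The deductions of (6.9) and (6.11) amount to dividing the estimates of Theorem 6 by the normalising sum (6.4), which is uniform in $p$. A sketch of the computation, using $y \geq x$, $\tau \geq x^{-o(1)}$ and $\log^r x = x^{o(1)}$ (the latter because $r \leq \log^{c_0} x$ with $c_0$ small):

```latex
\begin{align*}
\sum_{p \in \mathcal{P}} \mathbb{P}(q = \tilde{\mathbf{n}}_p + h_i p)
  &= \sum_{p \in \mathcal{P}} \frac{w(p, q - h_i p)}{\sum_{n'} w(p, n')}
   = \frac{\left(1 + O(\log_2^{-10} x)\right) \tau \frac{u}{r} \frac{x}{2 \log^r x}}
          {\left(1 + O(\log_2^{-10} x)\right) \tau \frac{y}{\log^r x}}
   = \left(1 + O\left(\frac{1}{\log_2^{10} x}\right)\right) \frac{u}{r} \frac{x}{2y}, \\
\mathbb{P}(\tilde{\mathbf{n}}_p = n)
  &\ll \frac{x^{1/3 + o(1)}}{\tau\, y / \log^r x}
   \leq \frac{x^{1/3 + o(1)} \log^r x}{x^{1 - o(1)}}
   = x^{-2/3 + o(1)} = x^{-1/2 - 1/6 + o(1)}.
\end{align*}
```

The bound (6.10) follows in the same way from (6.6), the factors $\tau$ and $y/\log^r x$ cancelling against (6.4).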
We choose the random vector $\vec{\mathbf{a}} := (\mathbf{a}_s \bmod s)_{s \in \mathcal{S}}$ by selecting each $\mathbf{a}_s \bmod s$ uniformly at random from $\mathbb{Z}/s\mathbb{Z}$, independently in $s$ and independently of the $\tilde{\mathbf{n}}_p$. The resulting sifted set $S(\vec{\mathbf{a}})$ is a random periodic subset of $\mathbb{Z}$ with density
\[ \sigma := \prod_{s \in \mathcal{S}} \left( 1 - \frac{1}{s} \right). \]

$^2$ The integer $B_0$ was not deleted from the sets $\mathcal{P}$ or $\mathcal{Q}$ in that theorem, however it is easy to see (using (6.7)) that deleting at most one prime from either $\mathcal{P}$ or $\mathcal{Q}$ will not significantly worsen any of the estimates claimed by the theorem.

From the prime number theorem (with sufficiently strong error term), (4.1) and (4.2),
\[ \sigma = \left( 1 + O\left( \frac{1}{\log_2^{10} x} \right) \right) \frac{\log(\log^{20} x)}{\log z} = \left( 1 + O\left( \frac{1}{\log_2^{10} x} \right) \right) \frac{80 \log_2 x}{\log x \log_3 x / \log_2 x}, \]
so in particular we see from (3.1) that
\[ \sigma y = \left( 1 + O\left( \frac{1}{\log_2^{10} x} \right) \right) 80 c x \log_2 x. \tag{6.12} \]
We also see from (6.8) that
\[ \sigma^r = x^{o(1)}. \tag{6.13} \]
We have a useful correlation bound:
Lemma 6.1. Let $t \leq \log x$ be a natural number, and let $n_1, \dots, n_t$ be distinct integers of magnitude $O(x^{O(1)})$. Then one has
\[ \mathbb{P}(n_1, \dots, n_t \in S(\vec{\mathbf{a}})) = \left( 1 + O\left( \frac{1}{\log^{16} x} \right) \right) \sigma^t. \]
Proof. See [11, Lemma 5.1]. □
Among other things, this gives the claim (5.7):
Corollary 2. For any fixed $0 \leq \alpha < \beta \leq 1$, we have with probability $1 - o(1)$ that
\[ \#(\mathcal{Q} \cap [\alpha y, \beta y] \cap S(\vec{\mathbf{a}})) \sim \sigma |\beta - \alpha| \frac{y}{\log x} \sim 80c |\beta - \alpha| \frac{x}{\log x} \log_2 x. \tag{6.14} \]
Proof. See [11, Corollary 4], replacing $\mathcal{Q}$ with $\mathcal{Q} \cap [\alpha y, \beta y]$. □

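For each $p \in \mathcal{P}$, we consider the quantity
\[ X_p(\vec{a}) := \mathbb{P}(\tilde{\mathbf{n}}_p + h_i p \in S(\vec{a}) \text{ for all } i = 1, \dots, r), \tag{6.15} \]
and let $\mathcal{P}(\vec{a})$ denote the set of all the primes $p \in \mathcal{P}$ such that
\[ X_p(\vec{a}) = \left( 1 + O_{\leq}\left( \frac{1}{\log^3 x} \right) \right) \sigma^r. \tag{6.16} \]
In light of Lemma 6.1, we expect most primes in $\mathcal{P}$ to lie in $\mathcal{P}(\vec{\mathbf{a}})$, and this will be confirmed below (Lemma 6.2). We now define the random variables $\mathbf{n}_p$ as follows. Suppose we are in the event $\vec{\mathbf{a}} = \vec{a}$ for some $\vec{a}$ in the range of $\vec{\mathbf{a}}$. If $p \in \mathcal{P} \setminus \mathcal{P}(\vec{a})$, we set $\mathbf{n}_p = 0$. Otherwise, if $p \in \mathcal{P}(\vec{a})$, we define $\mathbf{n}_p$ to be the random integer with conditional probability distribution
\[ \mathbb{P}(\mathbf{n}_p = n \mid \vec{\mathbf{a}} = \vec{a}) := \frac{Z_p(\vec{a}; n)}{X_p(\vec{a})}, \qquad Z_p(\vec{a}; n) = 1_{n + h_j p \in S(\vec{a}) \text{ for } j = 1, \dots, r} \, \mathbb{P}(\tilde{\mathbf{n}}_p = n), \tag{6.17} \]
with the $\mathbf{n}_p$ jointly conditionally independent on the event $\vec{\mathbf{a}} = \vec{a}$. From (6.15) we see that these random variables are well defined.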

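Substituting definition (6.17) into the left hand side of (5.8), and observing that $\mathbf{n}_p \equiv q \pmod{p}$ is only possible if $p \in \mathcal{P}(\vec{\mathbf{a}})$, we see that to prove (5.8), it suffices to show that with probability $1 - o(1)$ in $\vec{\mathbf{a}}$, for all but at most $\frac{x}{\log x \log_2 x}$ primes in $\mathcal{Q} \cap S(\vec{\mathbf{a}})$, we have
\[ \sigma^{-r} \sum_{p \in \mathcal{P}(\vec{\mathbf{a}})} \sum_{h} Z_p(\vec{\mathbf{a}}; q - hp) = C + O\left( \frac{1}{\log_2^3 x} \right). \tag{6.18} \]
We now confirm that $\mathcal{P} \setminus \mathcal{P}(\vec{\mathbf{a}})$ is small with high probability.
Lemma 6.2. With probability $1 - O(1/\log^3 x)$, $\mathcal{P}(\vec{\mathbf{a}})$ contains all but $O\left( \frac{1}{\log^3 x} \frac{x}{\log x} \right)$ of the primes $p \in \mathcal{P}$. In particular, $\mathbb{E} \#\mathcal{P}(\vec{\mathbf{a}}) = \#\mathcal{P} (1 + O(1/\log^3 x))$.
Proof. See [11, Lemma 5.3]. □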
The left side of relation (6.18) breaks naturally into two pieces, a ‘main term’ consisting of summands
where h = hi for some i, and an ‘error terms’ consisting of the remaining summands. We first take care of
the error terms.
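Lemma 6.3. With probability $1 - o(1)$ we have
\[
\sigma^{-r} \sum_{p \in \mathcal{P}(\vec{\mathbf{a}})} \sum_{\substack{h \ll y/x \\ h \notin \{h_1, \dots, h_r\}}} Z_p(\vec{\mathbf{a}}; q - hp) \ll \frac{1}{\log_2^3 x} \tag{6.19}
\]
for all but at most $\frac{x}{2 \log x \log_2 x}$ primes $q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})$.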
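Proof. We first extend the sum over all $p \in \mathcal{P}$. By Markov’s inequality, it suffices to show that
\[
\mathbb{E} \sum_{q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})} \sigma^{-r} \sum_{p \in \mathcal{P}} \sum_{\substack{h \ll y/x \\ h \notin \{h_1, \dots, h_r\}}} Z_p(\vec{\mathbf{a}}; q - hp) = o\left(\frac{x}{\log x \log_2^4 x}\right). \tag{6.20}
\]

The left-hand side of (6.20) equals
\[
\sigma^{-r} \sum_{q \in \mathcal{Q}} \sum_{\substack{h \ll y/x \\ h \notin \{h_1, \dots, h_r\}}} \sum_{p \in \mathcal{P}} \mathbb{P}(q \in S(\vec{\mathbf{a}}),\ q + h_j p - hp \in S(\vec{\mathbf{a}}) \text{ for } j = 1, \dots, r)\, \mathbb{P}(q = \tilde{\mathbf{n}}_p + hp).
\]

We note that for any $h$ in the above sum, the $r + 1$ integers $q, q + h_1 p - hp, \dots, q + h_r p - hp$ are distinct.
Applying Lemma 6.1, followed by (6.10), we may thus bound this expression by
\[
\ll \sigma \sum_{\substack{h \ll y/x \\ h \notin \{h_1, \dots, h_r\}}} \frac{x/\log x}{\log_2^{10} x} \ll \sigma \frac{y}{\log x} \frac{1}{\log_2^{10} x}.
\]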

The claim now follows from (6.12). 
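
The two quantitative steps in this proof can be spelled out as follows. This is a sketch rather than the authors' own wording. It writes $W_q$ and $T$ (names introduced here, not in the paper) for the inner double sum in (6.20) and its total over $q$, and it uses the asymptotic $\sigma y \sim 80 c\, x \log_2 x$ in the form in which it appears in (6.14), on the assumption that this is the content of (6.12) being invoked.

```latex
% Step 1: the bound obtained from Lemma 6.1 and (6.10) implies (6.20).
\sigma \frac{y}{\log x} \frac{1}{\log_2^{10} x}
  \sim \frac{80 c\, x \log_2 x}{\log x \, \log_2^{10} x}
  = \frac{80 c\, x}{\log x \, \log_2^{9} x}
  = o\!\left( \frac{x}{\log x \, \log_2^{4} x} \right).

% Step 2: from (6.20) to the lemma, by Markov's inequality.
% Notation (hypothetical):
%   W_q := \sigma^{-r} \sum_{p \in P} \sum_{h \ll y/x, h \notin \{h_1,...,h_r\}} Z_p(a; q - hp) \ge 0,
%   T   := \sum_{q \in Q \cap S(a)} W_q,      M := x / (\log x \log_2^4 x).
% Since E T = o(M), Markov's inequality gives
\mathbb{P}\!\left( T > \tfrac{1}{2} M \right) \le \frac{2\, \mathbb{E} T}{M} = o(1).
% On the complementary event, counting the q for which W_q is large:
\#\left\{ q \in \mathcal{Q} \cap S(\vec{\mathbf{a}}) : W_q > \frac{1}{\log_2^3 x} \right\}
  \le \log_2^3 x \cdot T
  \le \frac{\log_2^3 x}{2} \cdot \frac{x}{\log x \, \log_2^4 x}
  = \frac{x}{2 \log x \, \log_2 x}.
% The sum over p in P(a) in (6.19) is at most W_q, because every term is non-negative.
```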


Next, we deal with the main term of (6.18), by showing an analogue of (6.9).
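Lemma 6.4. With probability $1 - o(1)$, we have
\[
\sigma^{-r} \sum_{i=1}^{r} \sum_{p \in \mathcal{P}(\vec{\mathbf{a}})} Z_p(\vec{\mathbf{a}}; q - h_i p) = \left(1 + O\left(\frac{1}{\log_2^3 x}\right)\right) \frac{u}{\sigma} \frac{x}{2y} \tag{6.21}
\]
for all but at most $\frac{x}{2 \log x \log_2 x}$ of the primes $q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})$.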
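Proof. We first show that replacing $\mathcal{P}(\vec{\mathbf{a}})$ with $\mathcal{P}$ has negligible effect on the sum, with probability $1 - o(1)$.
Fix $i$ and substitute $n = q - h_i p$. By Markov’s inequality, it suffices to show that
\[
\mathbb{E} \sum_{n} \sigma^{-r} \sum_{p \in \mathcal{P} \setminus \mathcal{P}(\vec{\mathbf{a}})} Z_p(\vec{\mathbf{a}}; n) = o\left(\frac{u}{\sigma} \frac{x}{2y} \frac{1}{r} \frac{1}{\log_2^3 x} \frac{x}{\log x \log_2 x}\right). \tag{6.22}
\]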

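By Lemma 6.1, we have
\[
\mathbb{E} \sum_{n} \sigma^{-r} \sum_{p \in \mathcal{P}} Z_p(\vec{\mathbf{a}}; n) = \sigma^{-r} \sum_{p \in \mathcal{P}} \sum_{n} \mathbb{P}(\tilde{\mathbf{n}}_p = n)\, \mathbb{P}(n + h_j p \in S(\vec{\mathbf{a}}) \text{ for } j = 1, \dots, r) = \left(1 + O\left(\frac{1}{\log^{16} x}\right)\right) \#\mathcal{P}.
\]
Next, by (6.16) and Lemma 6.2 we have
\[
\mathbb{E} \sum_{n} \sigma^{-r} \sum_{p \in \mathcal{P}(\vec{\mathbf{a}})} Z_p(\vec{\mathbf{a}}; n) = \sigma^{-r} \sum_{\vec{a}} \mathbb{P}(\vec{\mathbf{a}} = \vec{a}) \sum_{p \in \mathcal{P}(\vec{a})} X_p(\vec{a}) = \left(1 + O\left(\frac{1}{\log^3 x}\right)\right) \mathbb{E}\#\mathcal{P}(\vec{\mathbf{a}}) = \left(1 + O\left(\frac{1}{\log^3 x}\right)\right) \#\mathcal{P};
\]
subtracting, we conclude that the left-hand side of (6.22) is $O(\#\mathcal{P}/\log^3 x) = O(x/\log^4 x)$. The claim then
follows from (3.1) and (6.1).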
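By (6.22), it suffices to show that with probability $1 - o(1)$, for all but at most $\frac{x}{2 \log x \log_2 x}$ primes
$q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})$, one has
\[
\sum_{i=1}^{r} \sum_{p \in \mathcal{P}} Z_p(\vec{\mathbf{a}}; q - h_i p) = \left(1 + O_{\le}\left(\frac{1}{\log_2^3 x}\right)\right) \sigma^{r-1} u \frac{x}{2y}. \tag{6.23}
\]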

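Call a prime $q \in \mathcal{Q}$ bad if $q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})$ but (6.23) fails. Using Lemma 6.1 and (6.9), we have
\[
\mathbb{E}\left[\sum_{q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})} \sum_{i=1}^{r} \sum_{p \in \mathcal{P}} Z_p(\vec{\mathbf{a}}; q - h_i p)\right] = \sum_{q, i, p} \mathbb{P}(q + (h_j - h_i)p \in S(\vec{\mathbf{a}}) \text{ for all } j = 1, \dots, r)\, \mathbb{P}(\tilde{\mathbf{n}}_p = q - h_i p)
\]
\[
= \left(1 + O\left(\frac{1}{\log_2^{10} x}\right)\right) \frac{\sigma y}{\log x}\, \sigma^{r-1} u \frac{x}{2y}
\]
and
\[
\mathbb{E}\left[\sum_{q \in \mathcal{Q} \cap S(\vec{\mathbf{a}})} \left(\sum_{i=1}^{r} \sum_{p \in \mathcal{P}} Z_p(\vec{\mathbf{a}}; q - h_i p)\right)^{2}\right] = \sum_{\substack{p_1, p_2, q \\ i_1, i_2}} \mathbb{P}(q + (h_j - h_{i_\ell})p_\ell \in S(\vec{\mathbf{a}}) \text{ for } j = 1, \dots, r;\ \ell = 1, 2)
\]
\[
\times\, \mathbb{P}(\tilde{\mathbf{n}}^{(1)}_{p_1} = q - h_{i_1} p_1)\, \mathbb{P}(\tilde{\mathbf{n}}^{(2)}_{p_2} = q - h_{i_2} p_2)
= \left(1 + O\left(\frac{1}{\log_2^{10} x}\right)\right) \frac{\sigma y}{\log x} \left(\sigma^{r-1} u \frac{x}{2y}\right)^{2},
\]
where $(\tilde{\mathbf{n}}^{(1)}_{p_1})_{p_1 \in \mathcal{P}}$ and $(\tilde{\mathbf{n}}^{(2)}_{p_2})_{p_2 \in \mathcal{P}}$ are independent copies of $(\tilde{\mathbf{n}}_p)_{p \in \mathcal{P}}$ over $\vec{\mathbf{a}}$. In the last step we used the
fact that the terms with $p_1 = p_2$ contribute negligibly.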

By Chebyshev’s inequality (Lemma 1.1) it follows that the number of bad $q$ is $\ll \frac{\sigma y}{\log x} \frac{1}{\log_2^3 x} \ll \frac{x}{\log x \log_2^2 x}$
with probability $1 - O(1/\log_2 x)$. This concludes the proof.
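
The Chebyshev step can be unpacked as a second-moment computation. The sketch below introduces the shorthand $W_q$, $\mu$ and $N$, none of which is notation from the paper. It also assumes the zeroth-moment estimate $\mathbb{E}\#(\mathcal{Q} \cap S(\vec{\mathbf{a}})) = (1 + O(1/\log_2^{10} x))\, \sigma y/\log x$, which is not stated in this section and is presumably supplied by Lemma 6.1 and (6.9).

```latex
% Shorthand (hypothetical):
%   W_q := \sum_{i=1}^{r} \sum_{p \in P} Z_p(a; q - h_i p),
%   \mu := \sigma^{r-1} u \, x / (2y),      N := \sigma y / \log x.
% The two displayed moment estimates read
%   E \sum_q W_q   = (1 + O(\log_2^{-10} x)) N \mu,
%   E \sum_q W_q^2 = (1 + O(\log_2^{-10} x)) N \mu^2,
% with q ranging over Q \cap S(a). Expanding the square:
\mathbb{E} \sum_{q} (W_q - \mu)^2
  = \mathbb{E} \sum_{q} W_q^2 - 2\mu\, \mathbb{E} \sum_{q} W_q
    + \mu^2\, \mathbb{E} \#(\mathcal{Q} \cap S(\vec{\mathbf{a}}))
  \ll \frac{N \mu^2}{\log_2^{10} x}.

% Markov's inequality: with probability 1 - O(1/\log_2 x),
\sum_{q} (W_q - \mu)^2 \le \frac{N \mu^2}{\log_2^{9} x}.

% A bad q has |W_q - \mu| > \mu / \log_2^3 x, so it contributes more than
% \mu^2 / \log_2^6 x to the sum above. Hence, on that event,
\#\{ \text{bad } q \} \cdot \frac{\mu^2}{\log_2^{6} x}
  \le \frac{N \mu^2}{\log_2^{9} x}
\quad \Longrightarrow \quad
\#\{ \text{bad } q \} \le \frac{N}{\log_2^{3} x}
  = \frac{\sigma y}{\log x} \frac{1}{\log_2^{3} x}.
```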
We now conclude the proof of Theorem 5. We need to prove (6.18); this follows immediately from
Lemma 6.3 and Lemma 6.4 upon noting that by (6.8), (6.3) and (6.12),
\[
C := \frac{u}{\sigma} \frac{x}{2y} \sim \frac{1}{c}.
\]
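
For completeness, here is one way to read off (6.18) from the two lemmas. It is a sketch of the bookkeeping only, with $M_q$ and $E_q$ (names introduced here) denoting the left-hand sides of (6.21) and (6.19). It assumes that the sum over $h$ in (6.18) ranges over $h \ll y/x$, as in (6.19), and that $c$ is a fixed positive constant, so that $C = O(1)$.

```latex
% For q outside the exceptional sets of Lemma 6.3 and Lemma 6.4:
\sigma^{-r} \sum_{p \in \mathcal{P}(\vec{\mathbf{a}})} \sum_{h} Z_p(\vec{\mathbf{a}}; q - hp)
  = M_q + E_q
  = \left( 1 + O\!\left( \frac{1}{\log_2^3 x} \right) \right) C
    + O\!\left( \frac{1}{\log_2^3 x} \right)
  = C + O\!\left( \frac{1}{\log_2^3 x} \right).

% Each lemma excludes at most x / (2 \log x \log_2 x) primes q, so together they exclude at most
\frac{x}{2 \log x \log_2 x} + \frac{x}{2 \log x \log_2 x} = \frac{x}{\log x \log_2 x}
% primes, which matches the number of exceptions permitted in (6.18).
```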
REFERENCES
[1] R. J. Backlund, Über die Differenzen zwischen den Zahlen, die zu den ersten n Primzahlen teilerfremd sind, Commentationes
in honorem E. L. Lindelöf. Annales Acad. Sci. Fenn. 32 (1929), Nr. 2, 1–9.
[2] R. C. Baker, T. Freiberg, Limit points and long gaps between primes, preprint.
[3] R. C. Baker, G. Harman and J. Pintz, The difference between consecutive primes. II., Proc. London Math. Soc. (3) 83 (2001),
no. 3, 532–562.
[4] A. Brauer, H. Zeitz, Über eine zahlentheoretische Behauptung von Legendre, Sber. Berliner Math. Ges. 29 (1930), 116–125.
[5] N. G. de Bruijn, On the number of positive integers ≤ x and free of prime factors > y, Nederl. Acad. Wetensch. Proc. Ser.
A 54 (1951), 50–60.
[6] H. Cramér, Some theorems concerning prime numbers, Ark. Mat. Astr. Fys. 15 (1920), 1–33.
[7] H. Cramér, On the order of magnitude of the difference between consecutive prime numbers, Acta Arith. 2 (1936), 396–403.
[8] H. Davenport, Multiplicative number theory, 3rd ed., Graduate Texts in Mathematics vol. 74, Springer-Verlag, New York,
2000.
[9] P. Erdős, On the difference of consecutive primes, Quart. J. Math. Oxford Ser. 6 (1935), 124–128.
[10] K. Ford, B. Green, S. Konyagin, T. Tao, Large gaps between consecutive prime numbers, to appear, Ann. Math.
[11] K. Ford, B. Green, S. Konyagin, J. Maynard, T. Tao, Long gaps between primes, preprint.
[12] P. X. Gallagher, A large sieve density estimate near σ = 1, Invent. Math. 11 (1970), 329–339.
[13] A. Granville, Harald Cramér and the distribution of prime numbers, Scandinavian Actuarial J. 1 (1995), 12–28.
[14] H. Maier, Chains of large gaps between consecutive primes, Advances in Mathematics 39 (1981), 257–269.
[15] H. Maier and C. Pomerance, Unusually large gaps between consecutive primes. Trans. Amer. Math. Soc. 322 (1990), no. 1,
201–237.
[16] J. Maynard, Small gaps between primes, Ann. of Math. (2) 181 (2015), no. 1, 383–413.
[17] J. Maynard, Dense clusters of primes in subsets, preprint.
[18] J. Maynard, Large gaps between primes, to appear Ann. Math.
[19] J. Pintz, On the distribution of gaps between consecutive primes, preprint.
[20] J. Pintz, Very large gaps between consecutive primes. J. Number Theory 63 (1997), no. 2, 286–301.
[21] N. Pippenger, J. Spencer, Asymptotic behavior of the chromatic index for hypergraphs, J. Combin. Theory Ser. A 51 (1989),
no. 1, 24–42.
[22] R. A. Rankin, The difference between consecutive prime numbers, J. London Math. Soc. 13 (1938), 242–247.
[23] R. A. Rankin, The difference between consecutive prime numbers. V, Proc. Edinburgh Math. Soc. (2) 13 (1962/63), 331–332.
[24] A. Schönhage, Eine Bemerkung zur Konstruktion grosser Primzahllücken, Arch. Math. 14 (1963), 29–30.
[25] E. Westzynthius, Über die Verteilung der Zahlen, die zu den n ersten Primzahlen teilerfremd sind, Comm. Phys. Math., Soc.
Sci. Fennica 5, no. 25, (1931) 1–37.

DEPARTMENT OF MATHEMATICS, 1409 WEST GREEN STREET, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN,
URBANA, IL 61801, USA
E-mail address: [email protected]

MATHEMATICAL INSTITUTE, RADCLIFFE OBSERVATORY QUARTER, WOODSTOCK ROAD, OXFORD OX2 6GG, ENGLAND
E-mail address: [email protected]

DEPARTMENT OF MATHEMATICS, UCLA, 405 HILGARD AVE, LOS ANGELES CA 90095, USA


E-mail address: [email protected]
