
An Introduction to Quantum Computing for Statisticians and Data Scientists

Anna Lopatnikova∗, Minh-Ngoc Tran∗, Scott A. Sisson†

arXiv:2112.06587v2 [stat.CO] 3 Apr 2022
2nd version: March 2022

Abstract
Quantum computers promise to surpass the most powerful classical supercomputers when
it comes to solving many critically important practical problems, such as pharmaceutical and
fertilizer design, supply chain and traffic optimization, or optimization for machine learning
tasks. Because quantum computers function fundamentally differently from classical comput-
ers, the emergence of quantum computing technology will lead to a new evolutionary branch
of statistical and data analytics methodologies. This review provides an introduction to quan-
tum computing designed to be accessible to statisticians and data scientists, aiming to equip
them with an overarching framework of quantum computing, the basic language and building
blocks of quantum algorithms, and an overview of existing quantum applications in statistics
and data analysis. Our goal is to enable statisticians and data scientists to follow quantum
computing literature relevant to their fields, to collaborate with quantum algorithm designers,
and, ultimately, to bring forth the next generation of statistical and data analytics tools.

Keywords. Quantum statistical methods; machine learning; quantum Monte Carlo; quantum descriptive statistics; quantum linear algebra.

Contents
1 Introduction 3

2 A Birds-Eye View of Quantum Theory 5

3 Basic Concepts of Quantum Computation 6


3.1 The Simplest Quantum System: a Qubit . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2 The Power of Quantum: Superposition, Entanglement, Parallelism, and Interference 7
3.3 Quantum States and Quantum Operators . . . . . . . . . . . . . . . . . . . . . . . . 8
3.4 Properties of Quantum Computers . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

∗ Discipline of Business Analytics, the University of Sydney Business School. The research was partially supported by the Australian Research Council’s Discovery Project DP200103015, the ARC Centre for Data Analytics for Resources and Environments (DARE), the ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS) and a USYD Business School research support scheme. Corresponding to [email protected].

† UNSW Data Science Hub and School of Mathematics and Statistics, University of New South Wales, Sydney. SAS is supported by the ARC through the Discovery Project scheme FT170100079 and ACEMS.

4 Quantum Algorithm Design 15
4.1 Data Encoding in a Quantum State . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.2 Result Postprocessing and Readout . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Computational Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 A Brief Overview of Quantum Algorithms . . . . . . . . . . . . . . . . . . . . . . . 21
4.5 Quantum Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

5 Programming Quantum Computers 25


5.1 Physical Implementation on NISQs and Beyond . . . . . . . . . . . . . . . . . . . . 25
5.2 Quantum Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.3 Quantum Gates and Other Primitives . . . . . . . . . . . . . . . . . . . . . . . . . . 26

6 Grover’s Search and Descriptive Statistics on a Quantum Computer 30


6.1 Grover’s Search Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
6.2 Quantum Amplitude Amplification (QAA) . . . . . . . . . . . . . . . . . . . . . . . 32
6.3 Quantum Amplitude Estimation (QAE) . . . . . . . . . . . . . . . . . . . . . . . . 33
6.4 Estimating the Mean of a Bounded Function . . . . . . . . . . . . . . . . . . . . . . 33
6.5 Minimum (or Maximum) of a Function over a Discrete Domain . . . . . . . . . . . . 35
6.6 Median and kth Smallest Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.7 Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.8 Quantum Monte Carlo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

7 Quantum Markov Chains 37


7.1 Coin Walks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.2 Szegedy Walks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.3 Quantum Markov Chain Monte Carlo . . . . . . . . . . . . . . . . . . . . . . . . . . 40

8 Quantum Linear Systems, Matrix Inversion, and PCA 40


8.1 Quantum Fourier Transform (QFT) . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
8.2 Quantum Phase Estimation (QPE) . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
8.3 Applying a Hermitian Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
8.4 The HHL Linear Systems Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 45
8.5 Fast Gradient Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
8.6 Quantum Principal Component Analysis (QPCA) . . . . . . . . . . . . . . . . . . . 48

9 Hamiltonian Simulation 49
9.1 Overview and Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
9.2 Product Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
9.3 Hamiltonian Simulation by Quantum Walk . . . . . . . . . . . . . . . . . . . . . . . 54
9.4 Linear Combination of Unitaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
9.5 Hamiltonian Simulation by Quantum Signal Processing (QSP) . . . . . . . . . . . . 56

10 Quantum Optimization 57
10.1 Adiabatic Quantum Computing (AQC) . . . . . . . . . . . . . . . . . . . . . . . . . 58
10.2 Quantum Approximate Optimization Algorithm (QAOA) . . . . . . . . . . . . . . . 59
10.3 Hybrid Quantum-Classical Variational Algorithms . . . . . . . . . . . . . . . . . . . 60
10.4 Quantum Gradient Descent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

11 Quantum Eigenvalue and Singular Value Transformations 61
11.1 Quantum Signal Processing (QSP) . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
11.2 Quantum Eigenvalue Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 63
11.3 Quantum Singular Value Transformation (QSVT) . . . . . . . . . . . . . . . . . . . 65
11.4 QSVT and the “Grand Unification” of Quantum Algorithms . . . . . . . . . . . . . 66

12 Conclusion 67

1 Introduction
Quantum computing has emerged as the next computing technology paradigm, currently in the
state of development reminiscent of classical transistor-based computers in the 1950s. Quantum
computers in existence today are small and noisy, but they are capable of proof-of-concept com-
putation, with cloud-based access available to academics and industrial users. The capacity of
quantum computers is expanding every year. The library of quantum algorithms is growing and
currently includes efficient solutions to linear systems of equations, optimization, Markov chain
Monte Carlo, principal component analysis, machine learning models with correlations not possi-
ble to model classically, and other methods of interest to statisticians and data scientists. These
exciting developments will enable us in the next decade and beyond to solve problems we are
currently too computationally constrained to solve.
The development of quantum computers will lead to new statistical methodologies, because
quantum computers behave very differently from classical computers. Modern statistical methods
co-evolved with classical computing technologies. Markov chain Monte Carlo, gradient descent,
and re-sampling are optimized for the computational properties of classical computers, such as
the ability to copy any data structure or to reset individual bits. Quantum computers do not possess these
capabilities, which, through our daily experience with classical computers, we have come to take for
granted. Instead, they offer quantum superposition, quantum parallelism, quantum interference,
and quantum entanglement, unique properties quantum computers exploit as a resource, promising
to deliver game-changing computational power for many classes of important problems.
Because quantum computers differ fundamentally from classical computers, statistical method-
ologies taking full advantage of them will differ from classical statistical methodologies. Develop-
ment of quantum statistical methods will require a collaboration between statisticians with a deep
understanding of the requirements, end goals, and tradeoffs of existing statistical methods and
quantum algorithm designers adept at harnessing the power of quantum computers despite their
considerable quirks. To be successful, these collaborations will require a shared language and a
shared core of frameworks.
The objective of this review is to equip statisticians with the language and basic frameworks
of quantum computing to enable them to understand, at a high level, the rapidly evolving state
of the art of quantum computing and engage in successful collaborations with quantum algorithm
designers. It is our hope that these collaborations will bring forth an exciting next generation of
statistical methods – quantum statistical methods.
The review is designed to be accessible and self-contained, emphasizing quantum theory’s linear
algebra foundations, deeply familiar to statisticians and data scientists. This extended version of
the review discusses quantum algorithms and their building blocks in greater detail.
The first part of the review, Sections 2 through 5, sets out the concepts necessary to understand
the state of the art of quantum computing. Section 2 starts with a birds-eye view of quantum theory

that serves as an overarching framework for the building blocks of quantum computing. Section 3 lays
out the necessary details of quantum theory and the critical properties of quantum computers, with
concrete examples throughout to illustrate theoretical concepts. Section 4 provides a framework
for quantum algorithm design and includes a high-level overview of quantum algorithms (detailed in
Sections 6 through 11) and a review of Quantum Machine Learning, a vibrant area of quantum
algorithm research. Section 5 provides a guide to accessing and programming physical quantum
computers and includes an extended discussion of quantum gates and other quantum programming
primitives.
The second part of the review, Sections 6 through 11, lays out quantum algorithms of interest to
statisticians and data scientists. Quantum computing is a rapidly evolving field where every year
new, more efficient methods surpass the previous year’s cutting edge. To address this problem of
rapid obsolescence, when selecting algorithms for this review, we have sought to provide a balance
between algorithms considered seminal and promising recent algorithms.
Section 6 starts with Grover’s search algorithm – a seminal algorithm, whose core idea gave rise
to many widely-used quantum routines including Quantum Amplitude Amplification and Quantum
Amplitude Estimation. We describe how these routines can be used to speed up calculation of
statistical descriptive quantities such as sample mean and sample median. We then describe the
Quantum Monte Carlo method for estimating the probability expectation of a function. This
method, an application of Quantum Amplitude Estimation, offers a provable quadratic speedup
over the classical Monte Carlo method.
Section 7 presents quantum Markov chains (known in the literature as quantum walks). Similar
to classical Markov chains, they power many statistical applications, including quantum Markov
chain Monte Carlo.
Section 8 reviews quantum algorithms for linear algebra computations including solving a
system of linear equations and Principal Component Analysis, central to statistics and machine
learning.
Section 9 presents Hamiltonian simulation. While Hamiltonian simulation may not be of direct
interest to statisticians, it is a critical subroutine for quantum linear algebra (Section 8) and is at
the heart of potentially transformational applications of quantum computing in vital fields such
as chemistry, agriculture, and energy.
Section 10 discusses quantum algorithms to speed up optimization, which plays a critical
role in quantum machine learning. The section describes Adiabatic Quantum Computation and
the Quantum Approximate Optimization Algorithm (QAOA). QAOA belongs to the class of hy-
brid quantum-classical variational algorithms, an important class of algorithms that combine the
strengths of quantum and classical computers, which may provide the way to harness the quantum
advantage of the current generation of quantum computers. The section also discusses quantum
approaches to gradient descent.
Section 11 describes Quantum Singular Value Transformation (QSVT) – a cutting-edge algo-
rithm at the time of writing, which enables polynomial transformations of singular values. QSVT
is a promising recent framework that encompasses other popular quantum algorithms – such as
search, Hamiltonian simulation, or systems of linear equations – as specific instances. Because
of its flexibility and expressivity, we expect QSVT to give rise to other influential algorithms of
interest to statisticians and data scientists.
Section 12 concludes.

2 A Birds-Eye View of Quantum Theory
Richard Feynman, one of the best-known Nobel Prize-winning physicists, quipped that nobody
understands quantum theory. Because the laws of quantum theory describe what happens at the
atomic scale – i.e. at distances close to the size of a typical atom – they do not comport with
human intuition developed on the basis of our experience with everyday objects. At the atomic
scale, nature is probabilistic, but the objects around us comprise septillions of atoms.¹ Figuratively
speaking, what we experience in our everyday lives are large-sample properties of quantum theory
rather than its “small-sample” properties. But even our probability theory intuitions do not furnish
solid analogies, because quantum theory has properties, such as quantum interference, that do not
fit into classical probabilistic frameworks. Nevertheless, even though quantum theory is highly
counter-intuitive, it describes the behavior of particles at the atomic scale to an astonishing degree
of accuracy. It is humanity’s best-tested scientific theory, withstanding a hundred years of rigorous
experimental tests (see, e.g., Peskin and Schroeder, 1995).
The core of quantum theory is linear algebra on complex vector spaces, called Hilbert spaces,
which can be finite- or infinite-dimensional. Within this framework, any closed quantum system
can be described as a vector in a Hilbert space. The vector, called a pure quantum state (often
called a “quantum state” as a shorthand), is denoted as $|\psi\rangle$, where the object $|\bullet\rangle$ is called a ket:
$$
|\psi\rangle \equiv \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_n \\ \vdots \end{pmatrix}. \tag{1}
$$
The coordinates $\psi_m$ are complex numbers, $\psi_m \in \mathbb{C}$, called amplitudes. The quantum state $|\psi\rangle$ can
be written as a linear combination of basis vectors, which are also quantum states – i.e. vectors in
the Hilbert space. In quantum theory, a linear combination of quantum states is called a quantum
superposition. Denoting the standard basis vectors as $|e_m\rangle$, we have
$$
|\psi\rangle = \psi_1 |e_1\rangle + \psi_2 |e_2\rangle + \ldots + \psi_n |e_n\rangle + \ldots, \tag{2}
$$
i.e. we have expressed the state $|\psi\rangle$ as a superposition – a linear combination – of the basis states $|e_m\rangle$.
The inner product of two vectors $|\phi\rangle$ and $|\psi\rangle$ in a Hilbert space is analogous to the inner
product in Euclidean space and requires the complex conjugation and transposition of the left
vector. The conjugate-transpose of $|\phi\rangle$ is $\langle\phi| = (\phi_1^*, \phi_2^*, \ldots, \phi_n^*, \ldots)$ (the object $\langle\bullet|$ is called a bra),
where the asterisk $*$ indicates complex conjugation. The inner product between two states $|\phi\rangle$ and
$|\psi\rangle$ is denoted as $\langle\phi|\psi\rangle = \sum_m \phi_m^* \psi_m$. Quantum states are normalized, so that, for any $|\psi\rangle$, we
have $\langle\psi|\psi\rangle = \sum_m |\psi_m|^2 = 1$.
The normalization equation $\sum_m |\psi_m|^2 = 1$ suggests the interpretation of the (non-negative)
squared amplitudes $|\psi_m|^2$ as probabilities and, in fact, the squared amplitudes do play this role
in quantum theory. The squared amplitudes $|\psi_m|^2$ determine the probability of the outcomes of
quantum measurement – a critical part of quantum theory, discussed more formally in Section 3.3.
Quantum measurement is the way a classical observer characterizes an unknown quantum state.
For an arbitrary quantum state and reference basis, the outcomes of a quantum measurement are
probabilistic. For example, if the observer measures $|\psi\rangle$ in the standard basis, the measurement
¹ A cup of water contains around $25 \times 10^{24}$ atoms – 25 septillion.

yields an outcome $m$ corresponding to the state $|e_m\rangle$ with probability $|\psi_m|^2$. After the measurement, the quantum state collapses to the basis vector $|e_m\rangle$ corresponding to the outcome $m$. A helpful classical analogy is a closed box containing an object of unknown color, together with a probability distribution over the set of possible colors. Once the box is opened and the color of the object is revealed, the probability distribution for the color of the object in the box collapses to certainty for the observed color. We can characterize the probability distribution over the set of possible colors by performing experiments on identically prepared boxes. Similarly, we can characterize a quantum state by repeatedly preparing it and taking quantum measurements of it.
An important aspect of quantum theory, which makes it both counter-intuitive and powerful for developing efficient computation, is that quantum states evolve according to their amplitudes rather than to the corresponding probabilities, i.e. the squared amplitudes. Amplitudes can take negative and, more generally, complex values and, as a result, can make probability masses cancel out when quantum states undergo transformations. Quantum transformations are linear. A quantum state $|\psi^A\rangle$ transforms into a state $|\psi^B\rangle$ in the same Hilbert space as $|\psi^B\rangle = U |\psi^A\rangle$, where $U$ is a linear (more specifically, unitary) operator (Section 3.3). The cancellation of amplitudes under linear transformations is called quantum interference, a phenomenon that plays a critical role in the efficiency of quantum computers, as discussed in Section 3.2.
A quantum algorithm $\mathcal{A}$ is a series of linear transformations that transform the input state $|\psi^{\text{in}}\rangle$ into an output state $|\psi^{\text{out}}\rangle$:
$$
|\psi^{\text{out}}\rangle = \mathcal{A}\,|\psi^{\text{in}}\rangle. \tag{3}
$$
The quantum state $|\psi^{\text{out}}\rangle$ encodes the desired output, to be passed on to another quantum or classical algorithm. The power of quantum computers is that, in many cases, they can perform the transformation $\mathcal{A}$ very efficiently, promising to dramatically reduce computational time for many classes of problems, as we outline in Section 4.4 and detail in Sections 6 through 11.

3 Basic Concepts of Quantum Computation


Having outlined the high-level framework for quantum algorithms in the previous section, we
now proceed to lay out the basic concepts of quantum computing necessary to understand how
efficient quantum algorithms harness the counter-intuitive properties of quantum theory to deliver
computational power. We start with the most basic unit of a quantum computer – the qubit.

3.1 The Simplest Quantum System: a Qubit


The basic unit of information in a classical computer is a bit, taking either 0 or 1 values. The basic information-storage unit of a quantum computer is a qubit, which is a two-dimensional system
$$
|q\rangle = a\,|0\rangle + b\,|1\rangle, \tag{4}
$$
where $|0\rangle$ and $|1\rangle$ form an orthonormal basis of this two-dimensional space.² The basis state $|0\rangle$ is a shorthand for $\binom{1}{0}$ and $|1\rangle$ is a shorthand for $\binom{0}{1}$. The qubit state $|q\rangle$ in (4) represents $\binom{a}{b}$, a normalized vector in the two-dimensional Hilbert space that describes the state of the single-qubit
² The orthonormality is understood in the usual sense of linear algebra.
quantum system. The coefficients $a$ and $b$ are complex-valued amplitudes, $a, b \in \mathbb{C}$, normalized so that $|a|^2 + |b|^2 = 1$. The state $|q\rangle$ is a linear superposition of the basis states $|0\rangle$ and $|1\rangle$.
A quantum measurement of $|q\rangle$ in the basis $\{|0\rangle, |1\rangle\}$ yields 0 with probability $|a|^2$ and 1 with probability $|b|^2$ (more on quantum measurement in Section 3.3). The state $|q\rangle$ is analogous to a random variable taking the classical bit values 0 and 1 with probabilities $|a|^2$ and $|b|^2$ respectively, but there is a crucial difference: the coefficients $a$ and $b$ can be negative and, more generally, complex. As we discuss below, this property is crucial to the power of quantum computers.
The superposition property extends to collections of multiple qubits, sometimes referred to as quantum registers. For example, a register of three qubits can support quantum states of the form
$$
|\psi\rangle = \psi_{000}|000\rangle + \psi_{001}|001\rangle + \psi_{010}|010\rangle + \psi_{011}|011\rangle + \psi_{100}|100\rangle + \psi_{101}|101\rangle + \psi_{110}|110\rangle + \psi_{111}|111\rangle. \tag{5}
$$
The states $|b_1 b_2 b_3\rangle$, where $b_k \in \{0, 1\}$ for $k = 1, 2, 3$, form an orthonormal basis, and the 3-qubit register supports $2^3 = 8$-dimensional quantum states – superpositions of 8 basis states. The notation $|b_1 b_2 b_3\rangle$ is a shorthand for $|b_1\rangle \otimes |b_2\rangle \otimes |b_3\rangle$, where $\otimes$ represents the tensor product, so that $|b_1 b_2 b_3\rangle$ is an 8-dimensional vector. The tensor form shows that the state $|b_1 b_2 b_3\rangle$ is separable, which means each qubit in the state can be manipulated independently of the other qubits. In general, quantum states formed by multiple qubits cannot be expressed as a tensor product. The state $|\psi\rangle$ in (5) is, in general, not separable. Non-separable states are called entangled.
The superposition property and entanglement enable efficient encoding of information that supports the computational efficiency of quantum computers (see Section 3.2). Generalizing the state $|\psi\rangle$ in (5) to $n$ qubits, we see that an $n$-qubit register can encode a $2^n$-dimensional vector with $2^n - 1$ independent amplitudes, accounting for the normalization of quantum states. In contrast, a separable state on $n$ qubits can only encode $n$ independent amplitudes. Without entanglement, quantum computers lose a significant source of their power (Jozsa and Linden, 2003).³
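Separability can be checked numerically: reshaping a two-qubit state vector into a $2 \times 2$ matrix gives a rank-1 matrix exactly when the state is a tensor product. This NumPy sketch is added for illustration (the Schmidt-rank test used here is a standard linear-algebra tool, not something introduced in the text):

```python
import numpy as np

# Separable two-qubit state: tensor (Kronecker) product of single-qubit states.
q0 = np.array([1, 0], dtype=complex)                  # |0>
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)   # (|0> + |1>)/sqrt(2)
separable = np.kron(q0, plus)                         # |0> ⊗ |+>, a 4-dimensional vector

# Bell state (|00> + |11>)/sqrt(2): entangled, not expressible as a Kronecker product.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

# A 2x2 reshape of a two-qubit state has rank 1 iff the state is separable.
print(np.linalg.matrix_rank(separable.reshape(2, 2)))  # 1
print(np.linalg.matrix_rank(bell.reshape(2, 2)))       # 2
```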

3.2 The Power of Quantum: Superposition, Entanglement, Parallelism, and Interference
A quantum algorithm is a series of linear operations and quantum measurements performed on a set of quantum registers. Like their classical counterparts, quantum algorithms are implemented via a series of simple quantum gates – small units of computation, analogous to the familiar classical logic gates AND, XOR, or NOT. Quantum gates are elementary quantum transformations that act on one, two, or three qubits; we review popular gates in Section 5.3.
As we introduced in Section 2, a quantum algorithm takes a quantum state as an input and transforms it into an output state that encodes the desired result. The superposition property suggests that quantum algorithms might offer exponential speedup over classical algorithms in some cases. Consider an algorithm that checks 100-bit strings, i.e. $2^{100} \approx 10^{30}$ possibilities. A classical computer has to check each of the $2^{100}$ strings. A quantum computer could potentially perform the task massively in parallel, labelling the correct and incorrect strings using 100 operations – one for each of the 100 qubits holding the $2^{100}$ strings in a quantum superposition. This natural ability of quantum computers to perform parallel computation is called quantum parallelism. The trouble is, the output of the naive quantum computation is a superposition of all the $2^{100}$ labelled results. If all results are approximately equally probable, then it would take $\sim 10^{30}$ measurements to extract the
³ Quantum entanglement is not the only source of quantum computing power. Even without entanglement, quantum computers can perform computations beyond the capabilities of a classical computer (Biham et al., 2004).

correct answer, negating any benefit from the quantum computation. To overcome this problem, quantum algorithms leverage quantum interference – the property that quantum amplitudes are complex numbers that can cancel out during computation. A well-designed quantum algorithm uses quantum interference to suppress the amplitudes of the wrong answers and to amplify the amplitudes of the correct answer(s). The output state of an efficient quantum algorithm is a superposition of the desired answers, so that just a few measurements are sufficient to extract them to the required precision.
The quantum properties of superposition, entanglement, parallelism, and interference drive the
next-level computational potential of quantum computers. Superposition and entanglement en-
able highly efficient encoding, supporting massively parallel computation by qubit manipulation.
Quantum interference cancels out undesirable by-products of parallel computation and amplifies
desired results for efficient readout. Finding quantum algorithms able to perform parallel compu-
tation while providing an efficient way to extract results has proven challenging, but the list of
such algorithms is expanding. We overview some of these algorithms in Section 4.4 and outline
their central ideas in Sections 6 to 11. The building blocks of these algorithms can be combined
to solve a variety of practical problems.

3.3 Quantum States and Quantum Operators


Multi-Qubit Quantum States
As we discussed in Section 3.1, a quantum computer stores information using collections of qubits in quantum registers. Equation (5) provides the general form of a quantum state created on a three-qubit register. A convenient shorthand for the expansion of $|\psi\rangle$ in (5) interprets the “bit strings” of individual qubit basis states as integer numbers, so that to represent, e.g., $|101\rangle$, we write $|5\rangle$. In this notation, an $n$-qubit quantum state is written as
$$
|\psi\rangle = \sum_{i=0}^{N-1} \psi_i |i\rangle, \tag{6}
$$
where $N = 2^n$, and $i$ in $|i\rangle$ represents the bit string representation of the integer $i$. The 3-qubit quantum state $|\psi\rangle$ in (5) is fully described by a normalized 8-dimensional vector over complex numbers, $(\psi_0, \psi_1, \ldots, \psi_7)^\top$. The orthonormal basis formed by states $|i\rangle \equiv |b_1 b_2 \ldots b_n\rangle$, where $b_j \in \{0, 1\}$ for all $j = 1, \ldots, n$, is called the computational basis.
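As an illustrative sketch (not from the paper), the integer labeling of computational basis states maps directly onto index positions of a NumPy state vector:

```python
import numpy as np

n = 3
N = 2 ** n
# |i> is the standard basis vector e_i; the ket label is the integer whose
# binary expansion is the bit string b1 b2 b3 (e.g. |5> = |101>).
ket5 = np.zeros(N, dtype=complex)
ket5[5] = 1.0

# A general 3-qubit state: a normalized vector of 8 complex amplitudes.
rng = np.random.default_rng(1)
psi = rng.normal(size=N) + 1j * rng.normal(size=N)
psi /= np.linalg.norm(psi)

print(int(np.argmax(np.abs(ket5))))  # 5, i.e. bit string 101
print(format(5, "03b"))              # '101'
```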

The Vector Space of Quantum States


A quantum register of $n$ qubits supports an $N = 2^n$-dimensional Hilbert space, a generalization of Euclidean space over complex numbers in finite or infinite dimensions, introduced briefly in Section 2. As in Euclidean space, the inner product between two quantum states $|\psi\rangle$ and $|\phi\rangle$ in an $N$-dimensional Hilbert space can be expressed in terms of vector coordinates, $\langle\phi|\psi\rangle = \sum_{i=0}^{N-1} \phi_i^* \psi_i$, where the asterisk $*$ indicates complex conjugation. In quantum notation, the conjugate-transpose of a ket $|\phi\rangle$ is denoted as a bra $\langle\phi|$ and decomposed as $\langle\phi| = \sum_{j=0}^{N-1} \phi_j^* \langle j|$, where $\langle j|$ is the conjugate transpose of $|j\rangle$. For example, in a two-dimensional Hilbert space, if $|0\rangle$ denotes the basis vector $\binom{1}{0}$, then $\langle 0|$ denotes its conjugate transpose $(1\ 0)$. The inner product is expressed as $\langle\bullet|\bullet\rangle$. For example, the inner product between two basis states $|i\rangle$ and $|j\rangle$ is $\langle i|j\rangle$. Because the basis states
are orthonormal:
$$
\langle i|j\rangle = \delta_{ij} \equiv \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j \end{cases}, \tag{7}
$$
where $\delta_{ij}$ is the Kronecker delta function used widely in the quantum computing literature. For two general $N$-dimensional states, the inner product takes the form:
$$
\langle\phi|\psi\rangle = \left( \sum_{j=0}^{N-1} \phi_j^* \langle j| \right) \left( \sum_{i=0}^{N-1} \psi_i |i\rangle \right) = \sum_{j=0}^{N-1} \sum_{i=0}^{N-1} \phi_j^* \psi_i \langle j|i\rangle = \sum_{i=0}^{N-1} \phi_i^* \psi_i, \tag{8}
$$
where (7) helped simplify the expression.
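Equations (7) and (8) can be checked numerically; note that NumPy's `vdot` conjugates its first argument, which matches the bra-ket convention. This is an illustrative sketch using arbitrary random states:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(N):
    """Return a random normalized complex vector in C^N."""
    v = rng.normal(size=N) + 1j * rng.normal(size=N)
    return v / np.linalg.norm(v)

phi, psi = random_state(8), random_state(8)

# <phi|psi> = sum_i phi_i^* psi_i : np.vdot conjugates its first argument.
inner = np.vdot(phi, psi)
assert np.isclose(inner, np.sum(phi.conj() * psi))

# Orthonormality of computational basis states: <i|j> = delta_ij.
I = np.eye(8, dtype=complex)
print(np.vdot(I[3], I[3]))  # 1 (i = j)
print(np.vdot(I[3], I[5]))  # 0 (i != j)
```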

Operators
Operators in quantum theory are linear.⁴ They can be expressed as matrices of complex numbers acting on vectors in the $N$-dimensional Hilbert space. Consider the operator $A$ represented by an $N \times N$ matrix with elements $\{a_{ij}\}_{i,j=0}^{N-1}$ with respect to the computational basis. In bra-ket notation it takes the form
$$
A = \sum_{i,j=0}^{N-1} a_{ij} |i\rangle\langle j|, \tag{9}
$$
so that, when it acts on a state $|\psi\rangle = \sum_{k=0}^{N-1} \psi_k |k\rangle$, the operation yields the expected result
$$
A|\psi\rangle = \left( \sum_{i,j=0}^{N-1} a_{ij} |i\rangle\langle j| \right) \left( \sum_{k=0}^{N-1} \psi_k |k\rangle \right) = \sum_{i,j=0}^{N-1} a_{ij} \psi_j |i\rangle = \sum_{i=0}^{N-1} \left( \sum_{j=0}^{N-1} a_{ij} \psi_j \right) |i\rangle, \tag{10}
$$
where we used the orthonormality of basis vectors, $\langle j|k\rangle = \delta_{jk}$ (as shown in Eq. 7). In effect, quantum operators are linear combinations of outer products $|i\rangle\langle j|$ of basis states in quantum notation.
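The outer-product decomposition (9) and its action (10) can be verified numerically. This is an illustrative sketch with an arbitrary random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))  # arbitrary elements a_ij

I = np.eye(N, dtype=complex)
# Build A as the sum of outer products a_ij |i><j| and check that it
# reproduces the usual matrix-vector action A|psi>.
A_from_outer = sum(A[i, j] * np.outer(I[:, i], I[:, j].conj())
                   for i in range(N) for j in range(N))
assert np.allclose(A_from_outer, A)

psi = rng.normal(size=N) + 1j * rng.normal(size=N)
psi /= np.linalg.norm(psi)
assert np.allclose(A_from_outer @ psi, A @ psi)
print("A = sum_ij a_ij |i><j| verified")
```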
Because of the constraints in quantum theory, not all linear operators are quantum operators. There are two types of quantum operators: unitary transformations and observables. Unitary transformations transform one quantum state into another; observables are related to quantum measurement and will be discussed shortly.
A linear operator $U$ is unitary if $U^{-1} = U^\dagger$, where the dagger $\dagger$ denotes the conjugate transpose. Unitary operators preserve the unit norm of quantum states.⁵
The identity operator $I$ is a unitary operator. It is commonly used in quantum algorithms to re-express a quantum state in an alternative basis. Let $\{|a_i\rangle\}_{i=0}^{N-1}$ be an orthonormal set of basis states of an $N$-dimensional Hilbert space and let $\{|b_j\rangle\}_{j=0}^{N-1}$ be an alternative set of orthonormal basis states of the space. A state $|\psi\rangle$ expressed in terms of basis states $\{|a_i\rangle\}$ can be expressed in terms of states $\{|b_j\rangle\}$ by using the identity operator $I$ written in terms of states $\{|b_j\rangle\}$,
$$
I = \sum_{j=0}^{N-1} |b_j\rangle\langle b_j|, \tag{11}
$$

⁴ Abrams and Lloyd (1998) demonstrate that, if it were possible to construct non-linear quantum operators, then the computational complexity class NP of problems exponentially hard for classical computers would be equal to the complexity class P – the class of problems classical computers can solve efficiently (i.e. in time that scales polynomially with the size of the input to the problem). While no proof exists that NP ≠ P, it is considered highly unlikely that NP = P.
⁵ Let $|\psi\rangle$ be the initial state, and $|\phi\rangle = U|\psi\rangle$. Then $\langle\phi|\phi\rangle = \langle\psi| U^\dagger U |\psi\rangle = \langle\psi|\psi\rangle = 1$.

as follows:
$$
|\psi\rangle = \sum_{i=0}^{N-1} \psi_i |a_i\rangle = \left( \sum_{j=0}^{N-1} |b_j\rangle\langle b_j| \right) \left( \sum_{i=0}^{N-1} \psi_i |a_i\rangle \right) = \sum_{j=0}^{N-1} \left( \sum_{i=0}^{N-1} \psi_i \langle b_j|a_i\rangle \right) |b_j\rangle = \sum_{j=0}^{N-1} \tilde\psi_j |b_j\rangle,
$$
where $\tilde\psi_j = \sum_{i=0}^{N-1} \psi_i \langle b_j|a_i\rangle$ are the coordinates of $|\psi\rangle$ in the basis $\{|b_j\rangle\}$.
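As a concrete instance of this change of basis (an illustrative sketch; the Hadamard basis used below is a standard two-dimensional example, not the paper's choice):

```python
import numpy as np

# Hadamard basis {|+>, |->} as an alternative orthonormal basis of C^2,
# illustrating re-expressing a state via I = sum_j |b_j><b_j|.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
b = [H[:, 0], H[:, 1]]  # |+>, |->

psi = np.array([0.6, 0.8j], dtype=complex)  # normalized: 0.36 + 0.64 = 1

# New coordinates psi~_j = <b_j|psi>; the state itself is unchanged.
psi_tilde = np.array([np.vdot(bj, psi) for bj in b])
reconstructed = sum(c * bj for c, bj in zip(psi_tilde, b))
assert np.allclose(reconstructed, psi)
print(np.round(psi_tilde, 6))
```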

Quantum Measurement
Quantum measurement, introduced informally in Section 2, is the way a classical observer characterizes a quantum state. Quantum measurement is famously probabilistic. The goal of a quantum algorithm is to transform an input quantum state in such a way that a measurement of the resulting state yields the desired result with high probability.
Formally, a quantum measurement is a collection of operators $\{M_m\}$ that correspond to outcomes $m$ (where $m$ can be an outcome or the index of an outcome) that occur with probability $p_m$. A measurement applied to a quantum state $|\psi\rangle$ yields the outcome $m$ with probability $p_m = \langle\psi| M_m^\dagger M_m |\psi\rangle$. The state of the system after the measurement is
$$
\frac{M_m |\psi\rangle}{\sqrt{\langle\psi| M_m^\dagger M_m |\psi\rangle}}.
$$
Because the probabilities of all possible outcomes add up to 1, i.e. $\sum_m p_m = 1$ for all quantum states, the measurement operators satisfy the completeness relation $\sum_m M_m^\dagger M_m = I$.
For example, quantum measurement of the qubit |q⟩ in (4) in the computational basis is a collection of two measurement operators, called projection measurement operators, P_0 = |0⟩⟨0| and P_1 = |1⟩⟨1|, corresponding to outcomes 0 and 1 respectively. These operators satisfy the completeness relation P_0†P_0 + P_1†P_1 = |0⟩⟨0|0⟩⟨0| + |1⟩⟨1|1⟩⟨1| = |0⟩⟨0| + |1⟩⟨1| = I. The probability that the measurement yields 0 is p_0 = ⟨q|P_0†P_0|q⟩ = ⟨q|0⟩⟨0|q⟩ = |⟨0|q⟩|² = |a|², and, similarly, p_1 = |⟨1|q⟩|² = |b|². Because |a|² + |b|² = 1, the relation p_0 + p_1 = 1 is satisfied.
More generally, the measurement of an N-dimensional quantum state |ψ⟩ in the orthonormal basis {|u_i⟩}_{i=1}^N is the set of projection operators P_i = |u_i⟩⟨u_i|. The probability p_i that the measurement of state |ψ⟩ yields the basis state |u_i⟩ is

p_i = |⟨u_i|ψ⟩|² .    (12)
This property is called the Born rule.
One consequence of the Born rule is that a quantum state is determined only up to a complex phase pre-factor e^{iδ}, where δ ∈ ℝ. In other words, the states |ψ⟩ and e^{iδ}|ψ⟩ are equivalent. The pre-factor e^{iδ}, called the global phase, has no physically observable consequences.

Observables
Observables are linear operators that, unlike unitary operators, need not preserve the norm, but have the property that all their eigenvalues are real. These operators are self-adjoint, K† = K, and are called Hermitian.⁶
6. Historically, observables corresponded to physical properties, such as energy or momentum, of quantum states that could be observed in physics experiments.

Let |κ_j⟩ be the eigenvectors of the observable K with (real) eigenvalues κ_j, so that K|κ_j⟩ = κ_j|κ_j⟩. The observable K can then be written in the form

K = Σ_j κ_j |κ_j⟩⟨κ_j| ,    (13)

where the states |κ_j⟩ form an orthonormal (or orthogonalizable⁷) basis. Denoting by P_j = |κ_j⟩⟨κ_j| the projection operator onto the subspace spanned by |κ_j⟩, we can connect the observable K to the quantum measurement defined by the complete set of projection operators {P_j} with outcomes {κ_j}. This quantum measurement is often called the measurement of the observable K.
For a quantum state |ψ⟩, the expectation value of the result of the measurement of K is

E_ψ[K] ≡ ⟨K⟩ = Σ_j p_j κ_j = Σ_j |⟨κ_j|ψ⟩|² κ_j ,    (14)

where ⟨•⟩ denotes the expectation value of an observable and p_j = |⟨κ_j|ψ⟩|² is the probability that the measurement of K in the state |ψ⟩ yields the value κ_j. Since |⟨κ_j|ψ⟩|² = ⟨ψ|κ_j⟩⟨κ_j|ψ⟩, the expectation value is commonly written as

⟨K⟩ = Σ_j ⟨ψ|κ_j⟩⟨κ_j|ψ⟩ κ_j = ⟨ψ| ( Σ_j κ_j |κ_j⟩⟨κ_j| ) |ψ⟩ = ⟨ψ|K|ψ⟩ ,    (15)

as a consequence of the linearity of inner and outer products.


Similar logic applies to the expectation value of higher moments of K, e.g., the variance Var(K). Given a Hermitian K and a positive integer a, the operator K^a is also Hermitian and can serve as a quantum observable. The eigenvectors |κ_j⟩ of the operator K are also eigenvectors of K^a, with eigenvalues κ_j^a, so that higher moments of K are expressed as

⟨K^a⟩ = Σ_j κ_j^a p_j = Σ_j κ_j^a |⟨κ_j|ψ⟩|² = ⟨ψ|K^a|ψ⟩ ,    (16)

with the variance of K equal to Var(K) = ⟨K²⟩ − ⟨K⟩².
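Equations (14)-(16) can be checked numerically for a small example (our own sketch; the observable and state are arbitrary): the spectral-decomposition average Σ_j p_j κ_j must agree with the direct form ⟨ψ|K|ψ⟩, and the variance follows from the second moment.

```python
import numpy as np

# A 2x2 Hermitian observable (the Pauli-X matrix) and a normalized state.
K = np.array([[0.0, 1.0],
              [1.0, 0.0]])
psi = np.array([np.sqrt(0.9), np.sqrt(0.1)])

# Spectral decomposition K = sum_j kappa_j |kappa_j><kappa_j|.
kappas, vecs = np.linalg.eigh(K)               # columns of `vecs` are |kappa_j>
p = np.abs(vecs.conj().T @ psi) ** 2           # Born probabilities p_j

# Expectation value two ways (Eqs. 14-15): sum_j p_j kappa_j and <psi|K|psi>.
exp_spectral = np.sum(p * kappas)
exp_direct = np.real(psi.conj() @ K @ psi)
assert np.allclose(exp_spectral, exp_direct)

# Second moment and variance (Eq. 16): Var(K) = <K^2> - <K>^2.
second_moment = np.real(psi.conj() @ (K @ K) @ psi)
var = second_moment - exp_direct ** 2
assert np.allclose(var, np.sum(p * kappas**2) - exp_spectral ** 2)
print(exp_direct, var)                         # <K> = 0.6, Var(K) = 0.64
```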


Observables play an important role in algorithms of interest to data scientists and statisticians, e.g., those for quantum optimization and quantum machine learning, particularly the hybrid quantum-classical variational algorithms, which harness the strengths of quantum and classical computers (see Sections 4.5 and 10.3). In these algorithms, a quantum mechanical observable, which we denote L, often represents the cost function. A quantum variational state |ψ(θ)⟩ encodes a parameterized model, where θ is the parameter vector. The structure of the quantum variational state⁸ |ψ(θ)⟩ can vary widely depending on the nature of the optimization. For example, the state |ψ(θ)⟩ could represent a machine learning model with quantum correlations (Low et al., 2014; Rebentrost et al., 2018a; Schuld and Killoran, 2019; Schuld et al., 2020; Abbas et al., 2020; Bausch, 2020; Park and Kastoryano, 2020) or a quantum superposition of all possible binary strings of length n for combinatorial optimization (Farhi et al., 2014). The optimization or training problem takes the form

θ* = argmin_θ L(θ) = argmin_θ ⟨ψ(θ)|L|ψ(θ)⟩ .    (17)

7. In the case of degenerate subspaces.
8. Physics literature uses the word ansatz for a fixed-form variational state.
A popular way to encode the parameter vector θ into the quantum state |ψ(θ)i is to parameterize
the quantum gates used to prepare the state |ψ(θ)i (see, e.g., Farhi et al., 2014; Schuld et al., 2020;
Cerezo et al., 2021, and references therein). A classical computer controls the quantum gates; for
example, it sets the angle of rotation in controlled rotation gates. The parameterization of the gates
enables a handoff of information from the classical to the quantum computer. A measurement of the
observable L in the state |ψ(θ)i yields an unbiased estimate of L(θ). The classical computer collects
the results of the repeated measurements of L (using repeated preparations of |ψ(θ)i) and uses this
information to update the parameter vector θ to use in the next iteration, with the ultimate goal
of finding the optimal parameter vector θ∗ . This hybrid quantum-classical approach harnesses the
strengths of quantum computers, such as superposition and entanglement, while leveraging the
strengths of classical computers, such as straightforward resetting of bits or copying of data. By
injecting quantumness into the mature classical computing environment, hybrid quantum-classical
algorithms may deliver efficient optimization on the current generation of quantum computers.
For more on optimization and hybrid quantum-classical algorithms, see Section 10.
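The hybrid loop described above can be caricatured classically (a toy sketch of our own, not an algorithm from the literature): a single qubit rotated by angle θ plays the role of |ψ(θ)⟩ = R_y(θ)|0⟩, repeated simulated measurements of an observable L give an unbiased cost estimate, and a crude classical search updates θ.

```python
import numpy as np

rng = np.random.default_rng(1)

def prepare_state(theta):
    """Variational state |psi(theta)> = R_y(theta)|0> on one qubit."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

def estimate_cost(theta, shots=2000):
    """Unbiased estimate of <psi(theta)|L|psi(theta)> from repeated
    measurements of L = diag(1, -1) (Pauli-Z): sample eigenvalues +1/-1
    with the Born probabilities and average."""
    psi = prepare_state(theta)
    p = np.abs(psi) ** 2                       # probabilities of outcomes +1, -1
    outcomes = rng.choice([1.0, -1.0], size=shots, p=p)
    return outcomes.mean()

# Classical outer loop: coordinate search over theta; the true cost is
# cos(theta), minimized at theta = pi where |psi(theta)> ~ |1> and <L> = -1.
thetas = np.linspace(0, np.pi, 41)
costs = [estimate_cost(t) for t in thetas]
theta_star = thetas[int(np.argmin(costs))]
print(theta_star)                              # close to pi
```

In a real hybrid algorithm the state preparation and measurement run on quantum hardware, while the parameter update (here, the argmin over a grid) runs on the classical computer.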

Unitary and Hermitian operators


For any unitary operator U there exists a Hermitian operator K_U such that U = e^{iK_U}, where i is the imaginary unit. The operator K_U has a set of eigenstates {|u_j⟩} with eigenvalues u_j ∈ ℝ, i.e. K_U|u_j⟩ = u_j|u_j⟩ for every |u_j⟩. The eigenstates {|u_j⟩} are also eigenstates of the operator U, with eigenvalues e^{iu_j}. Conversely, for any Hermitian K, the operator U_K = e^{iK} is unitary. This connection between unitary and Hermitian operators is widely used in quantum algorithms (see, e.g., Sections 8 and 9).
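This correspondence is easy to verify numerically for small matrices (our own sketch; the specific K is a random Hermitian matrix, and the matrix exponential is computed via the spectral decomposition):

```python
import numpy as np

rng = np.random.default_rng(2)

# Build a random Hermitian K = K^dagger.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
K = (A + A.conj().T) / 2

# U = e^{iK} via the spectral decomposition K = V diag(u) V^dagger,
# so that e^{iK} = V diag(e^{iu}) V^dagger.
u, V = np.linalg.eigh(K)
U = V @ np.diag(np.exp(1j * u)) @ V.conj().T

assert np.allclose(U.conj().T @ U, np.eye(4))  # U^dagger U = I: U is unitary
# Eigenvectors of K are eigenvectors of U, with eigenvalues e^{i u_j}.
assert np.allclose(U @ V, V @ np.diag(np.exp(1j * u)))
print("U = exp(iK) is unitary")
```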

Time-evolution of quantum states and the Hamiltonian of a system


Because all reversible transformations of quantum states are unitary, the evolution of a quantum state between time t₁ and time t₂ is a unitary operator U(t₂, t₁):

|ψ(t₂)⟩ = U(t₂, t₁)|ψ(t₁)⟩ .    (18)

For the time-evolution operator U(t₂, t₁), there is a Hermitian operator K_U(t₂, t₁) such that U(t₂, t₁) = e^{−iK_U(t₂,t₁)}. If the system is stationary, that is, its fundamentals do not change over time, then we can write K_U(t₂, t₁) = H × (t₂ − t₁), where H is a Hermitian operator called the Hamiltonian of the system.⁹ For a stationary system we have

|ψ(t)⟩ = e^{−iHt}|ψ(0)⟩ .    (19)

Time evolution of a quantum state plays a central role in quantum algorithms. The first pro-
posed use for quantum computers was the simulation of quantum systems (Benioff, 1980, 1982;
Feynman, 1981). Classical simulations of quantum systems are exponentially hard, quickly run-
ning into limitations of current technology. But quantum computers may be able to simulate
the evolution of quantum systems in polynomial time. The ability to simulate complex physical
systems efficiently would revolutionize engineering, allowing us to design new materials, fertilizers,
superconductors, or pharmaceuticals at the molecular level. Furthermore, the ability to simulate
9. Sometimes the inverse of Planck's constant, 1/ℏ, pre-multiplies H in K_U(t₂, t₁) = (1/ℏ)H × (t₂ − t₁), so that the eigenvalues of H have the units of energy. In quantum computing and much of the quantum physics literature, the assumption ℏ = 1 is made, corrected when it becomes necessary to consider relative energy scales.
Hamiltonians can help us solve problems beyond direct simulations of quantum systems, such as
combinatorial optimization problems or problems that can be cast as linear systems of equations.
For a detailed discussion of Hamiltonian simulation, see Section 9.
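For a small system, the stationary evolution (19) can be simulated classically (our own illustration; H is an arbitrary two-level Hamiltonian, here the Pauli-X matrix, under which the familiar Rabi oscillation between |0⟩ and |1⟩ appears):

```python
import numpy as np

# A two-level Hamiltonian (Pauli-X): H|+> = |+>, H|-> = -|->.
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])

def evolve(psi0, H, t):
    """|psi(t)> = e^{-iHt}|psi(0)>, via the spectral decomposition of H."""
    E, V = np.linalg.eigh(H)                   # H = V diag(E) V^dagger
    return V @ (np.exp(-1j * E * t) * (V.conj().T @ psi0))

psi0 = np.array([1.0, 0.0])                    # start in |0>
# Under H = X the state oscillates between |0> and |1>:
# the probability of measuring |1> at time t is sin(t)^2.
for t in [0.0, np.pi / 4, np.pi / 2]:
    psi_t = evolve(psi0, H, t)
    assert np.allclose(np.linalg.norm(psi_t), 1.0)   # unitarity preserves norm
    assert np.allclose(np.abs(psi_t[1]) ** 2, np.sin(t) ** 2)
print("Rabi oscillation reproduced")
```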

Density matrix formulation of quantum states


So far in this review, we have described quantum states using kets |ψ⟩ – vectors in a Hilbert space. The states that can be represented as vectors in a Hilbert space are called pure quantum states. In this section, we briefly introduce an alternative way to describe quantum states using density operators, also known as density matrices. Density matrices are more general than kets because they enable us to describe not only pure quantum states but also mixed quantum states – classical probabilistic ensembles of pure quantum states. For example, if Alice prepares for Bob the pure state |0⟩ with probability 1/3 and the pure state |1⟩ with probability 2/3, then the resulting state is the mixed state {(1/3, |0⟩), (2/3, |1⟩)}. Note that this state is different from (1/√3)|0⟩ + (√2/√3)|1⟩, which is a pure state, and from (1/3)|0⟩ + (2/3)|1⟩, which is not a valid quantum state as the amplitudes are not normalized. Density matrices, in effect, exist on the continuum between quantum states and classical probability distributions. Because of this, density matrices most commonly appear in the literature concerning noise and de-coherence – the loss of "quantumness" over time – in physical quantum computers. However, some important quantum algorithms, such as the quantum Principal Component Analysis algorithm (Section 8.6), also rely on the density matrix formalism.
The density operator of a pure state |ψ⟩ is defined as ρ_ψ = |ψ⟩⟨ψ|. Consider a system that is in a pure state |ψ_i⟩ with probability p_i. The density operator for the system is

ρ = Σ_i p_i |ψ_i⟩⟨ψ_i| .    (20)

All the quantum postulates can be equivalently reformulated using density operators. For example, given quantum measurement operators {M_m}, the probability of getting outcome m is

p(m) = ⟨ψ|M_m†M_m|ψ⟩ = tr(M_m†M_m ρ_ψ) .

When a unitary operator U is applied to a mixed quantum state ρ comprising pure states |ψ_i⟩, the operator acts on each of the states |ψ_i⟩: U : |ψ_i⟩ → U|ψ_i⟩. The state ρ becomes

U : ρ = Σ_i p_i |ψ_i⟩⟨ψ_i| → Σ_i p_i U|ψ_i⟩⟨ψ_i|U† = UρU† .    (21)

The expectation value of an observable in a mixed quantum state equals the trace of the product of the observable with the density matrix:

⟨K⟩ = tr(Kρ) = Σ_i p_i tr(K|ψ_i⟩⟨ψ_i|) = Σ_i p_i ⟨ψ_i|K|ψ_i⟩ ,    (22)

where we used the linearity and the cyclic property of the trace. For a pure state ρ = |ψ⟩⟨ψ|, the expectation value ⟨K⟩ equals tr(Kρ) = ⟨ψ|K|ψ⟩, just as we have seen in (15).
An important concept often used in the density matrix formulation of quantum computing is the partial trace. If the quantum system, described by a density operator ρ, is defined over a Hilbert space that is a tensor product of two Hilbert spaces H_A ⊗ H_B, we can take a partial trace tr_B(ρ) to obtain a density operator ρ_A of the subsystem on H_A. This concept is similar to marginalizing a joint probability distribution to obtain a marginal distribution. Let the set of states {|u_B⟩} comprise an orthonormal basis of the subspace H_B. Taking the partial trace tr_B of the density matrix ρ over the subspace H_B results in a reduced density matrix ρ_A on H_A:

ρ_A = tr_B ρ = Σ_{u_B} ⟨u_B|ρ|u_B⟩ .    (23)

The density matrix is a Hermitian operator, and we can transform density matrices using the quantum algorithmic building blocks that apply to general Hermitian operators. For example, because ρ is Hermitian, e^{−iρt} is unitary – a property used in the quantum Principal Component Analysis algorithm (Section 8.6).
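The density-matrix machinery above is straightforward to sketch classically (our own illustration): build the mixed state from the Alice-and-Bob example, evaluate tr(Kρ) as in (22), and take the partial trace (23) of an entangled two-qubit state.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

def dm(psi):
    """Density matrix |psi><psi| of a pure state."""
    return np.outer(psi, psi.conj())

# Mixed state of Eq. (20): |0> with probability 1/3, |1> with probability 2/3.
rho = (1/3) * dm(ket0) + (2/3) * dm(ket1)
assert np.allclose(np.trace(rho), 1.0)

# Expectation of an observable, Eq. (22): <K> = tr(K rho), with K = Pauli-Z.
K = np.array([[1.0, 0.0], [0.0, -1.0]])
assert np.allclose(np.trace(K @ rho), 1/3 - 2/3)

# Partial trace, Eq. (23): trace out subsystem B of a two-qubit state.
def partial_trace_B(rho_AB, dA=2, dB=2):
    return np.trace(rho_AB.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

# For the entangled Bell state (|00> + |11>)/sqrt(2), the reduced state
# of qubit A is the maximally mixed state I/2.
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)
rho_A = partial_trace_B(dm(bell))
assert np.allclose(rho_A, np.eye(2) / 2)
print("reduced Bell state is I/2")
```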

3.4 Properties of Quantum Computers


Efficient quantum algorithms exploit the laws of quantum physics as a computational resource,
in ways that can seem strange and unfamiliar. Even simple operations like copying, erasing,
or addition proceed very differently on a quantum vs. on a classical computer (Draper, 2000;
Häner et al., 2018). This section reviews properties of quantum computers that highlight these
differences.

No Cloning Theorem
An important property of quantum computers that sets them apart from classical computers is the No Cloning Theorem (Dieks, 1982; Wootters and Zurek, 1982; Barnum et al., 1996). The theorem states that for a general unknown quantum state |ψ⟩ there is no unitary operator O_C that makes an exact copy of |ψ⟩:

O_C : |ψ⟩ ⊗ |0⟩ → e^{iα(ψ)} |ψ⟩ ⊗ |ψ⟩  does not exist.

Exact copying is possible if |ψ⟩ is known, by repeating the process of creating the state in a different register. However, the No Cloning Theorem states that it is not possible to copy the unknown result of a computation, for example to conduct repeated measurements. When repeated measurements are required – as is most often the case when a quantum result needs to be interpreted classically – the result has to be re-computed after each measurement.
The No Cloning Theorem does not preclude approximate cloning (Buzek and Hillery, 1996) or perfect cloning with some probability of success (Duan and Guo, 1998); however, these methods of cloning are rarely used in quantum algorithms because they have limited fidelity (Bruß et al., 1998; Gisin, 1998). For example, the approximate cloning method developed by Buzek and Hillery (1996), which has been proven optimal by Bruß et al. (1998) and Gisin (1998), has fidelity 5/6 for copying a single-qubit state – i.e., the initial state |ψ_init⟩ and the cloned state |ψ_copy⟩ have a maximum overlap |⟨ψ_init|ψ_copy⟩|² ≤ 5/6. This fidelity limits the practical ability to apply approximate quantum cloning to multi-qubit states used in quantum algorithms.

Reversible computation
Another property of quantum computers is that all computation with unitary gates is reversible;
it neither creates nor destroys information. Common Boolean gates such as AND or OR used in
classical computation are irreversible (Vedral et al., 1996). For example, given the result of a AND
b, it is not possible to recover the values of a and b. An example of a reversible classical gate is the
NOT gate: given NOT a, we can recover a. Because classical computers perform irreversible gates with ease, few classical algorithms rely exclusively on reversible gates, even though, in principle, any classical algorithm can be represented in terms of these gates (Bennett, 1989).
An example of a reversible gate is the Fredkin gate, which has three input bits and three output
bits. One of the bits is the control bit; if the control bit is 1, then the values of the other two bits
are swapped. This logic gate is not only universal – i.e. can be cascaded to simulate any classical
circuit – it is also self-inverse and conservative – i.e. it conserves the number of 0 and 1 bits.
Any classical algorithm can therefore be, in principle, implemented on a quantum computer.
First, the classical algorithm is rewritten using reversible gates, then these gates are translated
into unitary gates. Such direct translations, however, are usually inefficient because they do
not leverage quantum properties and simply replicate classical ideas on the more expensive and
noisy hardware of a quantum computer. Efficient quantum algorithms often have a structure
fundamentally different to that of the classical algorithms designed to achieve similar goals.
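The Fredkin (controlled-SWAP) gate described above is small enough to check exhaustively (our own sketch): enumerating all eight three-bit inputs confirms that it is self-inverse, conservative, and a bijection, i.e. reversible.

```python
from itertools import product

def fredkin(c, a, b):
    """Controlled swap: if the control bit c is 1, swap bits a and b."""
    return (c, b, a) if c == 1 else (c, a, b)

# Tabulate the gate on all 8 three-bit inputs.
table = {bits: fredkin(*bits) for bits in product([0, 1], repeat=3)}

# Self-inverse: applying the gate twice restores the input.
assert all(fredkin(*out) == bits for bits, out in table.items())
# Conservative: the number of 1 bits is preserved.
assert all(sum(out) == sum(bits) for bits, out in table.items())
# Reversible: the map is a bijection (a permutation of the 8 inputs).
assert len(set(table.values())) == 8
print("Fredkin gate: self-inverse, conservative, reversible")
```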

Uncomputing
Computation often results in temporary "garbage" data. During classical computation such data can be discarded, but during a quantum computation these "garbage" data may be entangled with the main result of the computation, making it impossible to reset the supplementary registers (often called auxiliary, see Section 5.3), e.g. using quantum measurement.¹⁰ Uncomputing – running parts of a quantum algorithm "in reverse" – is a way to remove the "garbage" data and clear the auxiliary register. Let algorithm A be such that A|0⟩ = |ψ⟩. The inverse A⁻¹ uncomputes the register containing the state |ψ⟩: A⁻¹|ψ⟩ = A⁻¹A|0⟩ = |0⟩.
Aharonov and Ta-Shma (2007) demonstrate that if quantum computers were able to "forget" information, they could solve the graph isomorphism problem efficiently. The uncomputing requirement is consequential – it limits quantum computers' power.

4 Quantum Algorithm Design


This section presents the general structure of quantum algorithms, and highlights the considera-
tions required for the development of successful quantum algorithms.
A general quantum algorithm often proceeds in three steps:

1. Quantum state preparation.

2. Quantum computation.

3. Postprocessing and readout of the resulting quantum state.

If a quantum algorithm is embedded in another quantum algorithm, then steps 1 and 3 may be
omitted; however, a quantum algorithm that has classical data as its input and delivers a result
for use by classical computers or for human interpretation requires all three steps. We focus the
discussion on steps 1 and 3 in this section, while step 2 is the subject of the subsequent sections.
10. The quantum measurement of the auxiliary register would affect the main result. Consider a state with two registers, where the first register contains the main result and the second register holds the byproducts of the computation: |ψ⟩ = Σ_x a_x |x⟩|y(x)⟩, where the states {|x⟩} span the Hilbert space of the first register, a_x are the quantum amplitudes, and y(x) are the computational byproducts. In general, the state |ψ⟩ is entangled. If we measure the auxiliary register and obtain an outcome y₀, then the state |ψ⟩ collapses to Σ_{x s.t. y(x)=y₀} a_x |x⟩|y(x)⟩ (up to normalization), which can be a dramatically different state.
4.1 Data Encoding in a Quantum State
The first thing that a statistician may like to know when using quantum computing in statistics
is how to import classical data into a quantum computer. Data can be imported into a quantum
computer using quantum state preparation – the process of encoding data into a quantum state
supported by one or more qubit registers. The qubit registers typically start out initially in the
ground state – all qubits are in the 0 state. Quantum state preparation is a quantum routine that
transforms this initial state into a state that encodes the necessary data.
Efficient loading of data onto a quantum computer is an open area of research (Ciliberto et al.,
2018). The data loading step can require significant computational resources and, if not carefully
thought out, can offset the computational efficiency attained via quantum computation. Similarly,
extracting the result from a quantum state can be a resource-consuming task requiring careful
planning as part of the algorithm design. The creation of a general quantum state on n qubits can be computationally taxing and requires, at a minimum, O(2^n) quantum operations (see, e.g., Prakash, 2014; Schuld and Petruccione, 2018). The computational complexity of data preparation
is reduced considerably when it is possible to exploit the structure of the data, such as if the
data has a functional form. For example, if the amplitudes represent probability densities of a
discretized integrable probability distribution function, loading can be achieved more efficiently
(Grover and Rudolph, 2002). Additional proposals include loading pre-compressed data for anal-
ysis; see, e.g., Harrow (2020).
We now describe a few methods for encoding data in a quantum state.

Amplitude Encoding
A quantum state provides several natural ways to encode data. Consider a general n-qubit quantum state in (6). One of the most common ways to encode data in this state is amplitude encoding, where data are encoded in the amplitudes ψ_i and the basis vectors |i⟩ serve as indices. For example, a vector x ∈ ℂ^{2^n}, normalized so that ‖x‖ = 1, can be encoded as

|x⟩ = Σ_{i=0}^{2^n−1} x_i |i⟩ .    (24)

The basis vectors |i⟩ serve as indices and the amplitudes x_i encode the data. This type of encoding is widely used in quantum linear systems of equations (Section 4.4) and related algorithms. The benefit of this encoding is that it is qubit efficient: a vector of length N needs only O(log N) qubits (Prakash, 2014; Adcock et al., 2015). The downside is that it may be difficult to initialize and to read out – i.e., to transfer the information from the output quantum state to a classical computer for processing. Initialization may require an intermediate step, such as quantum Random Access Memory (see below). Readout within error ε generally requires O(N/ε²) measurements. Even though more efficient readout is possible in some cases, for example through compressive sensing methods (see Section 4.2), sometimes alternative ways to transfer information to the classical computer are used, such as distilling the results of classification algorithms to few-bit summaries that are more efficient to read out (Schuld et al., 2016).
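A minimal classical sketch of amplitude encoding (our own illustration): normalize a data vector, interpret its entries as amplitudes on ⌈log₂ N⌉ qubits, and observe that recovering the entries from measurement counts takes many shots.

```python
import numpy as np

rng = np.random.default_rng(3)

def amplitude_encode(x):
    """Return the normalized amplitude vector for data x (Eq. 24), padded
    with zeros to the next power of two, i.e. one amplitude per basis
    state of ceil(log2 N) qubits."""
    dim = 1 << max(1, int(np.ceil(np.log2(len(x)))))
    amps = np.zeros(dim)
    amps[:len(x)] = x
    return amps / np.linalg.norm(amps)

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])
amps = amplitude_encode(x)                     # 8 amplitudes -> 3 qubits
assert np.allclose(np.linalg.norm(amps), 1.0)

# Readout: measurement in the computational basis only reveals |x_i|^2,
# and estimating all N entries to precision eps needs O(N / eps^2) shots.
shots = 100_000
counts = np.bincount(rng.choice(len(amps), size=shots, p=amps**2),
                     minlength=len(amps))
est = np.sqrt(counts / shots)                  # estimate of |amplitudes|
print(np.round(est, 2))
```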

Computational Basis Encoding


An alternative way to encode the data in an n-qubit quantum state is to encode information in the basis vectors |i⟩ (as opposed to the amplitudes as in Eq. 24). Consider a data set of M vectors D = {x^m = (x^m_1, ..., x^m_N)⊤, m = 1, ..., M}, each of dimension N. Suppose each vector x^m has already been represented by a binary string with Nτ bits,

x^m = b_{x^m_1} ... b_{x^m_N} ,

where b_{x^m_j} is the binary representation of x^m_j (a string of τ bits, with τ the precision). There exists a procedure to prepare the data D in the superposition

|D⟩ = (1/√M) Σ_{m=1}^M |x^m⟩ ,

with |x^m⟩ the basis quantum state corresponding to the binary representation of x^m; see Ventura and Martinez (2000) or Schuld and Petruccione (2018), Ch. 5. Technically, |D⟩ can be understood as a superposition state with respect to the computational basis {|0⟩, ..., |2^{Nτ}−1⟩}, where the basis states corresponding to the |x^m⟩ have the amplitude 1/√M and the other basis states have zero amplitude. This data encoding, known as basis or computational basis encoding, requires O(Nτ) qubits and takes O(MN) operations to initialize.
The benefit of computational basis encoding is that it enables quantum algorithms to directly leverage quantum parallelism. For example, let U be an operator that implements a function f(x^m):

U : |x^m⟩|0⟩ ↦ |x^m⟩|f(x^m)⟩ ,

then

U : |D⟩|0⟩ ↦ (1/√M) Σ_{m=1}^M |x^m⟩|f(x^m)⟩ .

That is, a single application of U gives us the M values f(x¹), ..., f(x^M) encoded in a superposition quantum state. Other uses of computational basis encoding include applications in optimization, where quantum optimization algorithms aim to amplify the optimal entry x^m (Section 10).

Qsample Encoding and Quantum Sample States


Another encoding, often called qsample encoding (Aharonov and Ta-Shma, 2003), can be used to encode a probability distribution P on a finite set {x^m, m = 1, ..., M} with probabilities p(x^m):

|P⟩ = Σ_{m=1}^M √(p(x^m)) |x^m⟩ .    (25)

The quantum state |P⟩ in (25) is known as a quantum sample state. This encoding uses the amplitudes √(p(x^m)) to encode the probabilities p(x^m) and the basis vectors to encode the data points x^m. The qsample encoding is appropriate for use in statistics, especially in Monte Carlo methods. For example, as a measurement in the computational basis yields x^m with probability p(x^m), the measurement serves as a sampling technique: it generates samples from the distribution P. Also, it is computationally efficient, compared to classical methods, to estimate the expectation of a function with respect to the probability distribution P if it is encoded in |P⟩; see Section 6.8. As we will see later in Section 7.3, the output of quantum Markov chain Monte Carlo is a quantum sample state in the form of (25).
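In a classical sketch (our own illustration), qsample amplitudes are simply the square roots of the probabilities, and measuring in the computational basis reproduces sampling from P:

```python
import numpy as np

rng = np.random.default_rng(4)

# A distribution P on M = 4 points; qsample amplitudes are sqrt(p(x^m)).
p = np.array([0.1, 0.2, 0.3, 0.4])
amps = np.sqrt(p)                              # Eq. (25)
assert np.allclose(np.linalg.norm(amps), 1.0)  # normalized because sum(p) = 1

# Measuring |P> in the computational basis yields x^m with probability
# p(x^m), so measurement doubles as a sampler for P.
samples = rng.choice(len(p), size=50_000, p=amps**2)
freqs = np.bincount(samples, minlength=len(p)) / len(samples)
assert np.max(np.abs(freqs - p)) < 0.02

# Monte Carlo estimate of E_P[f] for f(m) = m^2 from the samples.
est = (samples.astype(float) ** 2).mean()
exact = float((p * np.arange(len(p)) ** 2).sum())
print(est, exact)                              # estimate is close to exact = 5.0
```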

Data Encoding Using Multiple Qubit Registers
Splitting the collection of n qubits into multiple registers makes further data structures possible. Consider a collection of r registers of n_k qubits each, k = 1, ..., r, such that the total number of qubits is n: Σ_{k=1}^r n_k = n. Each basis vector |i⟩ of the 2^n-dimensional Hilbert space of n qubits can be expressed as a tensor product of basis states of the 2^{n_k}-dimensional subspaces spanned by each n_k-qubit register: |i⟩ = |i_1⟩ ⊗ |i_2⟩ ⊗ ... ⊗ |i_r⟩ ≡ |i_1⟩|i_2⟩...|i_r⟩. In this notation, we can re-express the quantum state in Eq. (6) as a multi-register state:

|ψ⟩ = Σ_{i_1=0}^{2^{n_1}−1} Σ_{i_2=0}^{2^{n_2}−1} ... Σ_{i_r=0}^{2^{n_r}−1} ψ_{i_1,i_2,...,i_r} |i_1⟩|i_2⟩...|i_r⟩ .    (26)

This general form provides a rich set of possibilities for encoding data, including quantum Random Access Memory (QRAM) and quantum Read-Only Memory (QROM) schemes that provide quantum algorithms with efficient access to data.

QRAM
Classical RAM is a scheme which, given an index i, outputs the data element x_i stored at the address indexed with the unique binary label i. QRAM is a scheme that, given a superposition of states corresponding to index values in an input register and an empty output data register, outputs the data elements into the data register:

|ψ^in⟩ = Σ_{i=0}^{N−1} a_i |i⟩^in |0⟩^out  ↦  |ψ^out⟩ = Σ_{i=0}^{N−1} a_i |i⟩^in |x_i⟩^out ,    (27)

where both i and x_i are recorded in the computational basis; N represents the size of the memory; and the coefficients a_i provide the (optional) weights of the various addressed data elements. For
an overview of the method, see Hann et al. (2021) and also Giovannetti et al. (2008b,a).
The QRAM data structure is suitable for use in some algorithms directly; in others, QRAM is
a stepping stone to amplitude encoding, where it is possible to use a controlled rotation to turn
the QRAM encoding into amplitude encoding efficiently.
Query complexity in QRAM encoding is O(log N); however, the method needs O(N) auxiliary
qubits.11 Critics of QRAM point out that QRAM requires unphysically high qubit fidelity to work
(Arunachalam et al., 2015), although Hann et al. (2021) have recently demonstrated that QRAM
is more robust than had been previously thought. Additionally, the approach requires a parallel
gate architecture; if a classical computer leveraged a similar parallel architecture, it would be
able to achieve similar speedups over sequential classical architecture as quantum computers do
(Aaronson, 2015; Steiger and Troyer, 2016; Csanky, 1975).
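Classically, the QRAM map (27) is just an index-to-value lookup applied to every branch of a superposition; a tiny state-vector sketch (our own, for N = 4 one-bit cells):

```python
import numpy as np

# Memory of N = 4 one-bit cells; the output register holds one bit.
memory = [1, 0, 0, 1]
N, out_dim = 4, 2

# Input state: amplitude a_i on |i>^in |0>^out, flattened as index i*out_dim + 0.
a = np.array([0.5, 0.5, 0.5, 0.5])
psi_in = np.zeros(N * out_dim)
psi_in[np.arange(N) * out_dim + 0] = a

# QRAM action, Eq. (27): |i>|0> -> |i>|x_i> on every branch simultaneously.
psi_out = np.zeros(N * out_dim)
for i in range(N):
    psi_out[i * out_dim + memory[i]] = a[i]

assert np.allclose(np.linalg.norm(psi_out), 1.0)
# Branch i now carries the record x_i: e.g., amplitude 0.5 on |0>|x_0> = |0>|1>.
assert psi_out[0 * out_dim + 1] == 0.5
print(psi_out)
```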

QROM
Instead of storing data in a quantum state, it is possible to create a classical data structure
that provides efficient quantum access to the data for use in some algorithms, such as those
based on quantum singular value transformation (Section 11). One such structure, proposed by
Kerenidis and Prakash (2016) and named quantum read-only memory (QROM) by Chakraborty et al.
(2018), can store a matrix A ∈ ℝ^{M×N} in O(w log²(MN)) classical operations, where w is the number of non-zero elements of A. Once the structure is in place, it is possible to perform the following quantum initializations with ε-precision in O(polylog(MN)/ε) time (requiring O(N) gates accessed in parallel), where polylog(•) is a common shorthand for "some polynomial in log(•)":

U : |i⟩|0⟩ ↦ |i⟩ (1/‖A_i‖) Σ_{j=1}^N A_{i,j} |j⟩ = |i⟩|A_i⟩ ,    (28)
V : |0⟩|j⟩ ↦ (1/‖A‖_F) Σ_{i=1}^M ‖A_i‖ |i⟩|j⟩ = |Ã⟩|j⟩ ,    (29)

where |A_i⟩ is the quantum state encoding the normalized i-th row of A, and |Ã⟩ is the quantum state whose inner product with the quantum state |i⟩ corresponding to the row index yields the normalized row norm: ⟨i|Ã⟩ = ‖A_i‖/‖A‖_F.
11. Alternatively, it is possible to reformulate QRAM so that it has query complexity O(N) using O(log N) auxiliary qubits.

4.2 Result Postprocessing and Readout


Reading out the results of a quantum algorithm can be a challenging, resource-consuming step. The result of a quantum algorithm is a quantum state that may be handed off for processing either to a classical or a quantum algorithm. Post-processing by a quantum algorithm is usually more efficient, because the quantum-classical readout may require a number of operations that overwhelms the number of operations required to run the algorithm itself (Zhang et al., 2021). Consider a state |x⟩ = Σ_{i=0}^{N−1} x_i |i⟩ encoding an N-dimensional vector x = (x_0, ..., x_{N−1}). Reading out the elements of vector x within precision ε requires a minimum of O(N/ε²) measurements (O'Donnell and Wright, 2016). When the vector x is the result of a quantum computation that has a polylogarithmic complexity dependence on N, the readout complexity of O(N/ε²) overwhelms the complexity of the quantum computation to obtain x. Full readout of a quantum state is called quantum tomography.
Improvements to readout complexity are only possible assuming prior knowledge of the structure of the quantum state to be read out. Efficient readout methodologies that exploit the quantum state's structure often involve reparametrization in order to reduce the effective dimensionality of the state. Methods include compressed sensing (Gross et al., 2010; Kyrillidis et al., 2018),¹² permutationally invariant tomography (Tóth et al., 2010; Moroder et al., 2012), schemes based on tensor networks (Cramer et al., 2010; Baumgratz et al., 2013; Lanyon et al., 2017), or mapping target states onto highly entangled but structured lower-dimensional models, such as restricted Boltzmann machines (Torlai and Melko, 2018; Torlai et al., 2018).
For many applications, full tomography of the quantum state may not be required. Aaronson (2019) proposed shadow tomography, a method to predict specific properties of the state, called target functions, without fully characterizing it. In order to predict with high probability an exponential number of target functions, it is often sufficient to have only a polynomial number of copies of the quantum state. Huang et al. (2020) improved the efficiency of shadow tomography to reduce its exponential circuit depth requirements. The method involves repeated application of random unitaries drawn from a purposefully constructed ensemble, followed by a measurement of the resulting state in the computational basis. The expectation value of the repeated unitary transformations and measurements is, in itself, a transformation of the quantum state. The inverse of this expectation value acts as a snapshot of the quantum state, its classical shadow. Classical shadows are expressive enough to yield many efficient predictions of the quantum state (Huang et al., 2020): a shadow based on M measurements is sufficient to predict L linear functions O_i, i = 1, ..., L, of the quantum state up to an error ε, provided M exceeds O(log L · max_i ‖O_i‖²_shadow / ε²), where the norm ‖O_i‖²_shadow depends on the distribution of random unitaries used in the construction of the shadow (see Huang et al., 2020, for further details). It has the property ‖O_i‖²_shadow < 4^n ‖O_i‖²_∞, where n is the number of qubits and ‖·‖_∞ denotes the operator norm.
12. Classical compressed sensing is a method of recovering a sparse vector from a small number of measurements. Quantum measurement techniques leveraging compressed sensing aim to recover a pure or mostly pure quantum state efficiently.

Swap Test and Sample Mean Estimation


One of the methods used in post-processing a quantum result state is the swap test (see, e.g., Buhrman et al., 2001), used to estimate the inner product a⊤b of two normalized vectors a and b, encoded in two states |a⟩ and |b⟩:

|a⟩ = Σ_i a_i |i⟩ ,   |b⟩ = Σ_i b_i |i⟩ .

For example, Schuld et al. (2016) use a swap test to perform a prediction by linear regression using a quantum algorithm. The swap test also provides an efficient way of computing a sample mean (see below).
The swap test applies a series of three-qubit controlled swap gates (also known as Fredkin gates), which swap two qubit states conditional on the state of the auxiliary qubit, so that

c-SWAP [ (1/√2)(|0⟩ + |1⟩) ⊗ |a⟩ ⊗ |b⟩ ] = (1/√2)(|0⟩|a⟩|b⟩ + |1⟩|b⟩|a⟩) .    (30)

Applying a Hadamard gate to the auxiliary qubit and rearranging the terms results in

(1/2)[ |0⟩(|a⟩|b⟩ + |b⟩|a⟩) + |1⟩(|a⟩|b⟩ − |b⟩|a⟩) ]
  = (α₁/2) |0⟩ (|a⟩|b⟩ + |b⟩|a⟩)/α₁ + (α₂/2) |1⟩ (|a⟩|b⟩ − |b⟩|a⟩)/α₂ ,

with α₁ and α₂ the norms of (|a⟩|b⟩ + |b⟩|a⟩) and (|a⟩|b⟩ − |b⟩|a⟩), respectively. It is easy to see that α₁ = √(2(1 + |⟨a|b⟩|²)). Hence, the probability of measuring state |0⟩ in the first qubit is

p = α₁²/4 = (1 + |⟨a|b⟩|²)/2 ,  hence  |⟨a|b⟩| = √(2p − 1) .
Estimating p by repeated measurement gives us an estimate of the absolute value |⟨a|b⟩| = |a⊤b|. Its sign can also be determined. Consider two states |ã⟩ and |b̃⟩ that encode the vectors (1/√2)(a₁, ..., a_N, 1)⊤ and (1/√2)(b₁, ..., b_N, 1)⊤, respectively. Applying the swap test to these two states results in |⟨ã|b̃⟩| = √(2p − 1). By noting that |⟨ã|b̃⟩| = a⊤b/2 + 1/2, we have

a⊤b = 2√(2p − 1) − 1 .

Now, suppose that a data vector (x₀, …, x_{N−1}) has been encoded into a quantum state
|x⟩ = Σ_i x_i |i⟩. Applying the swap test to |x⟩ and the uniform superposition state |u⟩ = (1/√N) Σ_{i=0}^{N−1} |i⟩ gives
us an estimate of √N x̄. The algorithm requires O(1/ε²) measurements to achieve an error tolerance
of ε.
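The measurement statistics above are easy to prototype classically. The following plain-Python sketch (no quantum libraries; the function name, data, and shot count are illustrative, not from the paper) draws auxiliary-qubit outcomes with probability p = (1 + |⟨a|b⟩|²)/2, recovers |⟨a|b⟩| = √(2p̂ − 1), and then estimates a sample mean via the swap test with the uniform superposition:

```python
import math
import random

# Simulate the measurement statistics of the swap test (plain Python, not
# a circuit-level simulation). The auxiliary qubit reads 0 with
# probability p = (1 + |<a|b>|^2) / 2, so repeated shots estimate |<a|b>|.

def swap_test_overlap(a, b, shots=100_000, rng=None):
    rng = rng or random.Random(0)                 # fixed seed for reproducibility
    overlap = sum(u * v for u, v in zip(a, b))    # exact <a|b> (real vectors)
    p = 0.5 * (1.0 + overlap ** 2)                # P(auxiliary qubit reads 0)
    zeros = sum(rng.random() < p for _ in range(shots))
    p_hat = zeros / shots
    return math.sqrt(max(2.0 * p_hat - 1.0, 0.0))  # estimate of |<a|b>|

# Sample-mean estimation: for an amplitude-encoded vector |x> and the
# uniform superposition |u>, <u|x> = sqrt(N) * x_bar.
data = [0.1, 0.4, 0.2, 0.3]
norm = math.sqrt(sum(v * v for v in data))
x = [v / norm for v in data]                      # amplitude encoding
N = len(x)
u = [1.0 / math.sqrt(N)] * N                      # uniform superposition
est_abs_mean = swap_test_overlap(x, u) / math.sqrt(N)
print(est_abs_mean)   # close to the mean of the normalized vector x
```

Because p̂ is a binomial proportion, its standard error scales as O(1/√shots), which is the source of the O(1/ε²) measurement cost quoted above.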

4.3 Computational Complexity
Computational complexity quantifies the resources, particularly time and memory, an algorithm
requires to complete a computational task. For quantum algorithms, the most relevant dimensions
of computational complexity are time, the number of qubits (qubit complexity) and the number
of gates (gate complexity) required to complete a computation. Time complexity is the most
commonly cited metric.
Computational complexity is often expressed as a function of the size of the input, N, using Big
O notation (Nielsen and Chuang, 2002). The most popular measure of computational complexity
is the upper bound O(g(·)), which indicates that required resources are bounded from above by a
function g(·) of the relevant parameters, such as the size N. For example, naive matrix multiplication
of two N×N matrices has time complexity of O(N³). Sometimes we use Õ, pronounced
“soft-O”, to indicate that Õ(g(·)) = O(g(·) logᵏ g(·)) for some finite k. Other measures of complexity
that sometimes appear in the quantum computing literature are the lower bound complexity
Ω(·) (pronounced “Big Omega”) and the asymptotically tight complexity Θ(·) (pronounced “Big
Theta”), where the lower and upper bounds coincide.
The set of all problems that a quantum computer can solve in polynomial time – i.e. time
polynomial in the size of the input N – with an error probability of at most 1/3 (by convention)
comprises the complexity class BQP. The class BQP is believed to contain some problems, such
as integer factoring, that a classical computer cannot solve in polynomial time. However, BQP is
not believed to contain the NP-complete problems – the hardest problems in NP, a polynomial-time
solution to any one of which would lead to a polynomial-time solution of all the other
problems in NP.
For an informal (although mathematical) and engaging discussion of topics in computational
complexity of quantum algorithms and the complexity classes relevant to quantum computation,
see Aaronson (2013).

4.4 A Brief Overview of Quantum Algorithms


Quantum computing is a rapidly evolving field, with many new algorithms emerging every year.
For this review, we have assembled a representative selection of algorithms most relevant to statis-
ticians and data scientists. This section provides an overview of these algorithms, with detailed
explanations and references available in Sections 6 through 11.
We start with the Grover search family of algorithms (Section 6). Grover’s search algorithm
finds a labeled item in an unstructured database of size N using O(√N) queries, quadratically
faster than the best classical approach, which requires O(N) queries. The algorithm is provably
optimal – no quantum algorithm can perform unstructured search faster.
The influence of Grover’s algorithm extends far beyond unstructured search. Grover’s algorithm
gave rise to several widely-used subroutines, including Quantum Amplitude Amplification (QAA)
and the closely related Quantum Amplitude Estimation (QAE). These subroutines appear in many
quantum algorithms, for example to amplify the amplitude of the desired result and suppress
undesirable byproducts at the end of a quantum computation. The core ideas of Grover’s search
algorithm power quantum Monte Carlo integration and the efficient estimation of the statistical
properties of a function on an unstructured set, such as its mean, median, minimum, and maximum.
Additionally, Grover’s search algorithm inspired a popular version of quantum walks.
Quantum walks (Section 7) are quantum analogues of classical random walks or Markov chains,
widely used in randomized classical algorithms. Quantum walks mix quadratically faster than
classical random walks. This quadratic speedup can offer efficiencies in applications such as
quantum Markov chain Monte Carlo.
The next important family of quantum algorithms, presented in Section 8, leverages the quantum
Fourier transform (QFT) – the quantum analogue of the classical discrete Fourier transform.
QFT powers many important algorithms of interest to statisticians and data scientists, such as the
quantum linear systems algorithm, quantum matrix inversion, and quantum PCA. QFT exploits
quantum parallelism – the ability to apply a function to multiple elements of a vector in parallel –
to deliver exponential speedups to the quantum algorithms it powers. For a vector of size N, QFT
requires only O(polylog(N)) operations, provided the vector is encoded into a quantum
state using amplitude encoding. The classical fast Fourier transform requires O(N log N) operations
on the classically encoded vector.
Arguably the most famous application of QFT is Shor’s factoring algorithm, one of the first
quantum algorithms with direct potential real-world consequences – the breaking of encryption
systems based on prime factorization. Most encryption systems that today keep financial transactions
and other sensitive information secure rely on the assumption that factoring a large number N
is intractable for classical computers. Shor demonstrated that, using QFT, it is possible
to perform the task in time polylogarithmic in N. Since Shor’s algorithm does not have a
direct application in statistics and many clear reviews of the algorithm are available (see, e.g.,
Nielsen and Chuang, 2002), we omit it in this review. We focus instead on applications of QFT to
fast linear algebra, critical to many data science applications.
Quantum linear algebra applications leverage QFT via another quantum subroutine, quantum
phase estimation (QPE) — an algorithm to record an estimate of a phase, such as the phase
θ in an eigenvalue of the form e^{iθ} of a unitary operator, in a quantum state in the computational
basis. A common use of QPE is to extract the eigenvalues of a Hermitian matrix (or the singular
values of a rectangular matrix embedded in a Hermitian matrix). The quantum linear systems
algorithm, first proposed by Harrow, Hassidim, and Lloyd (HHL) (Harrow et al., 2009) uses QPE to
perform fast matrix inversion, which has a wide range of statistical applications, such as regression
analysis. Other applications of QPE include Quantum Principal Component Analysis (QPCA) and
fast gradient estimation. QFT, at the heart of these algorithms, powers their exponentially faster
runtime (under certain conditions detailed in Section 8) compared with their classical counterparts.
In Section 9, we describe Hamiltonian simulation algorithms, which simulate the evolution of
complex quantum systems, including those of direct practical importance such as biologically active
molecules. Advances in Hamiltonian simulation could change the face of agriculture, materials,
and energy; they are potentially even more consequential than Shor’s factoring algorithm. Many
quantum algorithms rely on Hamiltonian simulation as a building block. For example, the HHL
algorithm for quantum linear systems uses Hamiltonian simulation together with QPE to perform
matrix inversion. The computational complexity of quantum linear systems algorithms has dra-
matically improved since HHL’s first proposal, and most of these advances occurred because of
more efficient Hamiltonian simulation techniques.
Another family of quantum algorithms of interest to statisticians are the quantum optimization
algorithms (Section 10), which find a quantum state that minimizes a cost function. Many quantum
optimization algorithms leverage ideas of Adiabatic Quantum Computation (AQC), a quantum
computing framework based on the Adiabatic Theorem. The Adiabatic Theorem states that a
system in the eigenstate corresponding to the smallest eigenvalue of its Hamiltonian13 will stay
in this state if the system changes sufficiently slowly – i.e. in such a way that a finite eigenvalue
gap between the two lowest eigenvalues persists throughout the evolution. In AQC, an easy-to-initialize
quantum state slowly transforms into the desired state through gradual Hamiltonian
evolution. AQC is, in effect, analog quantum computing; nevertheless, on a noiseless quantum
computer, it is equivalent to circuit-based quantum computing (Aharonov et al., 2008).

13 Physicists refer to the state corresponding to the smallest eigenvalue of its Hamiltonian as the lowest-energy
state or ground state of the system.
The Quantum Approximate Optimization Algorithm (QAOA) is the most famous optimization
algorithm inspired by AQC. QAOA is a hybrid quantum-classical variational algorithm, which
optimizes a quantum state using a classical outer loop. In QAOA, a specific sequence of classically-
controlled parametrized gates prepares a variational quantum state; the cost function takes the
form of a quantum observable. The classical outer loop adjusts the parameters based on the
measurements of the cost function. QAOA can be a powerful method to obtain approximate
solutions to combinatorial optimization problems, such as MaxCut – the NP-hard problem of
cutting a graph in two parts, such that the number of edges between the parts is as large as
possible. QAOA and other hybrid quantum-classical algorithms play to the strengths of quantum
and classical algorithms to deliver efficient optimization (McClean et al., 2016; McArdle et al.,
2019).
Section 11 presents Quantum Singular Value Transformation (QSVT), an algorithm to perform
polynomial transformations of the singular values of a matrix. This versatile method represents the
cutting edge of quantum algorithm development. It serves as a unifying framework encompassing
many existing quantum algorithms, such as Grover search, Hamiltonian simulation, and matrix
inversion, provides a consistent way to develop efficient versions of these algorithms, and enables
the systematic development of new algorithms. For example, the QSVT-based version of quantum
matrix inversion is the most efficient known algorithm for this task. QSVT is a generalization of
Quantum Signal Processing (QSP), a method inspired by signal processing used in nuclear magnetic
resonance – an important and well-developed technology used in many industries from mining to
medicine. QSP embeds a quantum system in a larger system to perform a non-linear/non-unitary
transformation of the subsystem. QSVT extends the method to general rectangular matrices. The
flexible paradigm may result in novel algorithms of interest to statisticians being developed in the
near future.

4.5 Quantum Machine Learning


Quantum machine learning is a very active area of quantum computing research. Results in quan-
tum learning theory point to classes of problems where quantum computing can deliver signifi-
cant advantages (see, e.g. Arunachalam and de Wolf, 2017), including polynomially faster learning
rates. Additionally, evidence suggests that NISQs (Section 5.1) may be able to deliver quantum
advantage in machine learning over classical counterparts. We refer readers interested in a more
detailed review of quantum machine learning to Ciliberto et al. (2018); Schuld and Petruccione
(2018); Adcock et al. (2015); and the informal review by Dunjko and Wittek (2020).
The most direct application of quantum computers to machine learning is the development of
quantum neural networks – artificial neural networks designed to benefit from quantum superposi-
tion and entanglement and encoded in quantum states that are difficult to sample from classically
(see, e.g. Low et al., 2014; Rebentrost et al., 2018a; Schuld and Killoran, 2019; Schuld et al., 2020;
Abbas et al., 2020; Bausch, 2020; Park and Kastoryano, 2020, and references therein). A critical
challenge in development of quantum neural networks is that quantum transformations are fun-
damentally linear, and non-linearity is seen as particularly important in successful classical neural
networks. Schuld et al. (2016) demonstrated that the vast majority of early proposals for quantum
neural networks did not meet the non-linearity requirements of artificial neural networks. But re-
cent models are overcoming this disadvantage using, for example, quantum measurement (see e.g.
Romero et al., 2017; Wan et al., 2017) or kernel functions (Farhi and Neven, 2018; Blank et al.,
2020; Liu et al., 2021) to include non-linearity. Huang et al. (2021) develop a class of kernel mod-
els that can provide rigorously demonstrable speedup over classical models in the presence of noise,
which means they are potentially achievable on the current generation of quantum computers.
Faster optimization is another way to leverage quantum computers in machine learning. Many
machine learning methods use optimization techniques for parameter learning. Quantum Optimiza-
tion (Section 4.4), which includes annealing (Kadowaki and Nishimori, 1998, 2021) and adiabatic
methods, approximate optimization, and hybrid quantum-classical optimization, has the potential
to deliver quadratic or polynomial speedups (Aaronson and Ambainis, 2009; McClean et al., 2021)
for a variety of machine learning tasks. For example, Miyahara and Sughiyama (2018) perform
mean-field VB via quantum annealing, an alternative to gradient descent optimization. Hybrid
quantum-classical variational algorithms, which combine the strengths of quantum and classical
computers, can speed up variational methods (see, e.g., Farhi and Neven, 2018; McClean et al.,
2018; Mitarai et al., 2018). Hybrid approaches are particularly attractive in the near term, because
they may help to harness the power of the current generation of quantum computers.
Bayesian computation has long called for scalable techniques. Markov chain Monte Carlo
(MCMC) has been the main workhorse in Bayesian statistics, but it is also well-known that
MCMC can be too slow in many modern applications. Quantum Markov chains (Section 4.4)
hold the promise to speed up MCMC (see, e.g. Szegedy, 2004; Chowdhury and Somma, 2017;
Orsucci et al., 2018). Quantum computation has also been exploited to speed up Variational
Bayes (Lopatnikova and Tran, 2021) – another popular technique for Bayesian computation.
When fault-tolerant quantum computers become available, machine learning methods may ben-
efit from fast quantum linear algebra. For example, the algorithms to solve systems of linear equa-
tions, such as the HHL algorithm (Harrow et al., 2009) and its updates (Section 8.4), enable fast
matrix inversion used widely in machine learning models (Wiebe et al., 2012; Lloyd et al., 2013;
Rebentrost et al., 2014; Lloyd et al., 2016; Cong and Duan, 2016; Kerenidis and Prakash, 2016;
Childs et al., 2017; Rebentrost et al., 2018b; Wang et al., 2019; Kerenidis and Prakash, 2020). Un-
der certain conditions, such as when quantum access to data is provided and the matrices to be
inverted are sparse or low-rank, quantum computers can deliver exponential speedup relative to
classical computers. Direct applications include linear regression for data fitting (Wiebe et al.,
2012) and prediction (Schuld et al., 2016), ridge (Yu et al., 2021) and logistic (Liu et al., 2019) re-
gression. Lopatnikova and Tran (2021) use quantum matrix inversion to speed up the estimation of
natural gradient for Variational Bayes (VB). Related algorithms such as the quantum PCA and the
QSVT algorithms (Section 4.4) have also been influential; e.g., quantum PCA using parameterized
quantum circuits can support face recognition tasks (Xin et al., 2021).14
Quantum machine learning and classical machine learning have benefited from cross-over ideas.
Classical machine learning algorithms have incorporated quantum-inspired structures (Gilyén et al.,
2018; Tang, 2018; Chia et al., 2020). Similarly, classical machine learning ideas, such as kernel
methods, helped better understand the nature of quantum neural networks and improve their
design (Schuld and Killoran, 2019).

14 Gilyén et al. (2018); Tang (2018); Chia et al. (2020) have recently demonstrated that, if data are made available
to classical computers in structures similar to those required for efficient quantum computations, then some of
the quantum linear systems algorithms can be de-quantized – i.e. significant efficiencies can be obtained using
randomized classical algorithms. These quantum-inspired algorithms can bring intriguing efficiencies to machine
learning; however, currently these algorithms suffer from disadvantaged scaling in critical parameters, such as
condition numbers and sparsity of matrices, which may render them impractical in the near term.

5 Programming Quantum Computers


Quantum algorithm designers have the opportunity to perform proof-of-concept computations on
real quantum computers, available from a number of companies as a cloud-based service. In this
section we review the state of the art of quantum computing technology and outline ways to access
quantum computers.

5.1 Physical Implementation on NISQs and Beyond


Today’s quantum computers are small and noisy, similar to classical computers in the 1950s. The
largest quantum computers today comprise up to 100 qubits, with error rates at best around 1%
and coherence time – the duration of time qubits can represent quantum states with sufficient
accuracy – up to 100 microseconds. In the 1950s, classical computers suffered a similar problem.
Made of vacuum tubes or mechanical relays, the bits in the early classical computers tended
to flip at random, introducing errors. To perform computation, error correction code, based on
redundant bits, was required. Modern classical computing technology is so advanced that bits
are extremely stable without error correction. But quantum computers still require redundant
qubits to compensate for the decoherence errors. As a result, a system of 100 qubits may have an
order of magnitude fewer logical qubits, units acting as qubits for algorithm implementation. The
connectivity between qubits – i.e. our ability to apply two or three-qubit gates with high accuracy
– and also how long it takes to apply a gate also play an important role.15
Most quantum algorithms that offer provable speedups over classical counterparts require much
larger, fault tolerant quantum computers. Two metrics can help identify the quantum resources
an algorithm might require: the size of the qubit registers and the circuit depth of the algorithm.
Circuit depth refers to the number of gates required to implement a quantum algorithm on a
quantum computer sequentially. Higher circuit depth algorithms require higher coherence times.
But even though today’s quantum computers are relatively small and noisy, we may be crossing
over to the era when these computers surpass classical computers for some classes of problems. In
2018, Preskill (2018) called these noisy computers with 50+ qubits NISQs – noisy intermediate-
scale quantum systems. Some algorithms such as the Quantum Approximate Optimization Algo-
rithm, described in Section 4.4, or the Variational Quantum Eigensolver (Peruzzo et al., 2014) – an
algorithm important in simulating quantum physical systems – can be implemented on NISQs. For
a discussion of quantum algorithms on NISQs, see e.g. Bharti et al. (2021) and references therein.
The most popular quantum computing technologies today are based on superconductors and
cold ions. Superconductor-based quantum computers have been able to achieve the highest qubit
counts and fast gate times, but the gates and qubits on these computers are noisy. Ion-based
quantum computers offer slower gate times, but much higher qubit fidelities and greater connec-
tivity. Other qubit types include silicon qubits, nitrogen-vacancy qubits, and optical qubits.
15
For a popular account of the role of noise in quantum computing and the importance of quantum error correction,
see e.g. Cho (2020).

5.2 Quantum Programming
Physical quantum computers are available today through a number of companies, such as IBM,
Amazon AWS, Google, Rigetti, and others, as a cloud-based service. Some of the cloud quantum
computing providers, such as IBM and Rigetti, offer access to their in-house quantum computers;
others, like Amazon AWS and Google, aim to provide access and programming tools to run programs
on third-party quantum computers, such as those built by IonQ, Honeywell, and Rigetti. Addi-
tionally, the quantum computing providers also offer classical simulation of quantum algorithms,
which can be used to test these algorithms. To operate the quantum computers and simulators,
each cloud service offers a software-development kit (SDK).
At the time of writing, by far the most popular quantum SDK is IBM’s Qiskit (pronounced
“kiss-kit”). Qiskit enables quantum software developers to run quantum code on IBM Quantum
Experience, IBM’s cloud quantum computing service. At present, IBM is the only platform that
provides free quantum computing access to the general public. It also provides premium access to
its latest, higher-fidelity, higher-qubit-count quantum computers. Qiskit, like most other quantum
SDKs, is Python-based and open-source. Like all other current quantum programming packages,
Qiskit has an assembly language at its core; it manipulates individual logical qubits, gates, and
quantum circuits. The growing library of higher-level tools and application packages at this stage
are also written in quantum assembly language.16 Qiskit provides extensive documentation and
tutorials, including many quantum programming examples, and has an active developer community.
Statisticians and data scientists who have read this review up to this point should be able to follow
Qiskit programming tutorials at https://qiskit.org/textbook/ with relative ease.
Rigetti Computing also provides a cloud computing service to access quantum computers and
simulators developed in-house. The SDK called Rigetti Forest uses a custom instruction language
Quil, particularly strong at facilitating hybrid quantum/classical computing. Like Qiskit and the
vast majority of other quantum programming tools, Quil is based on open-source Python packages.
The Python library pyQuil is a library of higher-level Quil applications.
The other significant industry players, such as Amazon, Google, and Microsoft, provide cloud
computing access to third-party quantum computers. Each of these players has its own SDK
and cloud quantum computing access tools. Amazon has recently launched Amazon Braket, a
quantum computing service within their broader on-demand computing service Amazon Web Ser-
vices (AWS). Braket comes with its own script, access to third-party quantum computers such
as those made by IonQ, Rigetti, and Oxford Quantum Computing, as well as quantum annealers
by D-Wave, and classical quantum-computing simulators. Google Quantum Computing Service
uses Cirq, its open source Python-based framework, and provides access to third-party quantum
computers, such as those from IonQ and Honeywell. Microsoft’s quantum computing service Azure
Quantum is a part of its Microsoft Azure on-demand computing services. It has its own language
Q#, and provides access to third-party quantum computers.

5.3 Quantum Gates and Other Primitives


Quantum algorithms are implemented physically using quantum circuits – i.e. series of basic quantum
gates – analogous to classical circuits. Quantum gates are unitary transformations applied
to one, two, or three qubits at a time.17 This section provides an overview of some of the most
common gates. For a more extended discussion of quantum circuits, see, e.g. Nielsen and Chuang
(2002) and Kitaev et al. (2002); for a general theory of quantum circuits, see, e.g. Aharonov et al.
(1998).

16 When classical computing technology was less mature, assembly language coding was much more popular.
Today, the vast majority of classical software designers operate at a higher level of abstraction.

Hadamard Gate
The most widely used quantum gate is the Hadamard gate. In the computational basis {|0⟩, |1⟩},
it is represented by the matrix H:

    H = (1/√2) [[1, 1], [1, −1]].    (31)

When applied to a qubit in the basis state |0⟩ it creates a uniform superposition of states |0⟩ and |1⟩,
often denoted as |+⟩: |+⟩ = H|0⟩ = (1/√2)(|0⟩ + |1⟩). Similarly, |−⟩ = H|1⟩ = (1/√2)(|0⟩ − |1⟩). Application
of the Hadamard gate to each qubit of an n-qubit register creates a uniform superposition of the
N = 2^n possible computational basis states:

    H^{⊗n} |0…0⟩ = (1/√N) Σ_{i=0}^{N−1} |i⟩,    (32)

where |i⟩ is a shorthand for the computational basis state of n qubits that can be interpreted as
an n-bit integer i; the notation H^{⊗n} means that the gate H is applied to each of the n qubits once (as
compared with H^n, which means n sequential applications of H to the same qubit). The operator
H^{⊗n} that creates a uniform superposition on n qubits is sometimes called the Walsh-Hadamard
transform.
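As a numerical sanity check of Eq. (32), H^{⊗n} can be assembled as an explicit Kronecker product and applied to |0…0⟩. This plain-Python sketch (the helper functions are illustrative and feasible only for small n, since the matrix has 4^n entries) shows all N = 2^n amplitudes equal to 1/√N:

```python
import math

# Build the Walsh-Hadamard transform H^{(x)n} explicitly and apply it to
# |0...0>, reproducing the uniform superposition of Eq. (32).

H = [[1 / math.sqrt(2),  1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

n = 3
Hn = H
for _ in range(n - 1):          # H (x) H (x) ... (x) H, n factors
    Hn = kron(Hn, H)

zero_state = [1.0] + [0.0] * (2 ** n - 1)   # |00...0>
state = matvec(Hn, zero_state)
print(state)   # all 2^n amplitudes equal 1/sqrt(2^n)
```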

Single-Qubit Rotations
To understand qubit rotations, we reparametrize the qubit in (4) in terms of angular coordinates
of a unit sphere:

    |q⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩.    (33)

The unit sphere representing a qubit is called the Bloch sphere. It is a two-dimensional object
embedded in a three-dimensional space. A single-qubit unitary transformation can be decomposed
into a sequence of basic rotations around the x, y, and z axes:

    R_x(ϕ) = e^{−iXϕ/2} = [[cos(ϕ/2), −i sin(ϕ/2)], [−i sin(ϕ/2), cos(ϕ/2)]],    (34)
    R_y(ϕ) = e^{−iY ϕ/2} = [[cos(ϕ/2), −sin(ϕ/2)], [sin(ϕ/2), cos(ϕ/2)]],    (35)
    R_z(ϕ) = e^{−iZϕ/2} = [[e^{−iϕ/2}, 0], [0, e^{iϕ/2}]],    (36)

where X = [[0, 1], [1, 0]], Y = [[0, −i], [i, 0]], and Z = [[1, 0], [0, −1]] are Pauli matrices (sometimes also written
as σ_x, σ_y, and σ_z).
17 A general quantum transformation requires exponentially many quantum gates (Knill, 1995). The art of writing
quantum algorithms is in finding ways to perform useful transformations efficiently with respect to all resources –
time, qubits, and gates.
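The rotation matrices (34)–(36) are easy to check numerically. The plain-Python sketch below (helper names are illustrative) verifies that R_y(θ)|0⟩ reproduces the Bloch-sphere parametrization (33) with φ = 0, and that R_z only attaches phases:

```python
import cmath
import math

# Numerical check of the rotation matrices (34)-(36); plain Python.

def Ry(phi):
    return [[math.cos(phi / 2), -math.sin(phi / 2)],
            [math.sin(phi / 2),  math.cos(phi / 2)]]

def Rz(phi):
    return [[cmath.exp(-1j * phi / 2), 0],
            [0, cmath.exp(1j * phi / 2)]]

def matvec(M, v):
    # multiply a 2x2 matrix by a 2-vector
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

theta = 2 * math.pi / 3
ket0 = [1, 0]

# R_y(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>, the Bloch state (33)
state = matvec(Ry(theta), ket0)
print(state)

# R_z(phi)|0> changes |0> only by the global phase e^{-i phi/2}
phase_state = matvec(Rz(math.pi / 2), ket0)
print(phase_state)
```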
The NOT Gate
The Pauli matrix X has the effect of a NOT gate: X|0⟩ = |1⟩ and X|1⟩ = |0⟩.

Controlled-NOT Gate
The Controlled-NOT gate is an example of a two-qubit gate. It applies the NOT gate to the
second qubit only if the first qubit is in the state |1⟩. As a matrix in the computational basis of
two qubits, {|00⟩, |01⟩, |10⟩, |11⟩}, the Controlled-NOT (C-X) gate is

    C-X = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]],

so that C-X|00⟩ = |00⟩, C-X|01⟩ = |01⟩, C-X|10⟩ = |11⟩ and C-X|11⟩ = |10⟩. An alternative way to
write down the Controlled-NOT gate is using the bra-ket notation:

    C-X = |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ X.    (37)
When used as a part of an operator, the tensor product ⊗ means that different parts of the
operator apply to different qubits (or, more generally, qubit registers). Consider a two-register
state |ψ⟩|φ⟩ ≡ |ψ⟩⊗|φ⟩; then the operator A⊗B performs the operation A on the state |ψ⟩ in the
first register and the operation B on the state |φ⟩ in the second register:

    (A⊗B)|ψ⟩|φ⟩ = (A|ψ⟩) ⊗ (B|φ⟩).    (38)

It is straightforward to verify that the operator C-X in (37) applied to two-qubit states performs
the desired Controlled-NOT operation.
As an example, we can use the Controlled-NOT gate to create a maximally entangled two-qubit
state, (1/√2)(|00⟩ + |11⟩), starting with two qubits initialized to |00⟩. First, we apply a Hadamard gate
H to the first qubit: H|00⟩ = (1/√2)(|0⟩ + |1⟩)|0⟩ = (1/√2)(|00⟩ + |10⟩). Then, we apply C-X to the register
of two qubits: C-X[(1/√2)(|00⟩ + |10⟩)] = (1/√2)(|00⟩ + |11⟩). The C-X gate flips the second qubit only
when the first qubit is in the state |1⟩, resulting in the desired state.
The three-qubit extension of the Controlled-NOT gate is the Toffoli gate, also known as the
CCNOT gate. It applies NOT to the third qubit conditionally on the state of the first two qubits
being |11⟩.
Basic gates such as rotations, the NOT gate, CNOT gate, or the Hadamard gate can be
implemented on existing quantum computers.
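The entangling construction above can be verified with explicit 4×4 matrices. A plain-Python sketch (no quantum libraries), in the basis {|00⟩, |01⟩, |10⟩, |11⟩}:

```python
import math

# Verify the Bell-state construction with explicit 4x4 matrices in the
# computational basis {|00>, |01>, |10>, |11>} (plain Python sketch).

s = math.sqrt(0.5)
H_I = [[s, 0,  s,  0],    # H on the first qubit, identity on the second
       [0, s,  0,  s],
       [s, 0, -s,  0],
       [0, s,  0, -s]]
CX = [[1, 0, 0, 0],       # Controlled-NOT: swaps |10> and |11>
      [0, 1, 0, 0],
      [0, 0, 0, 1],
      [0, 0, 1, 0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

state = [1, 0, 0, 0]           # |00>
state = matvec(H_I, state)     # (|00> + |10>)/sqrt(2)
state = matvec(CX, state)      # (|00> + |11>)/sqrt(2): a Bell state
print(state)
```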

Auxiliary Qubits
Many algorithms require supplementary qubits to support computation in addition to the qubit
registers encoding data (Section 4.1 discusses data encoding in detail). These qubits are called
auxiliary or ancilla qubits. The addition of a single auxiliary qubit effectively doubles the Hilbert
space: an n-qubit register spans a 2^n-dimensional Hilbert space; the addition of an auxiliary qubit
expands this Hilbert space to 2^{n+1} dimensions. Therefore, the addition of auxiliary qubits embeds
the data registers in a larger space, enabling, for example, non-linear transformation of the data
registers (for an example of a non-linear transformation with the help of an auxiliary qubit, see
Postselection later in this section).

Controlled Rotation
Controlled rotation is the application of a series of gates that act on an auxiliary qubit conditionally
on the state of one or more other qubits. For example, consider a state |x⟩, which encodes an n-bit
binary string x on a register of n qubits. Append an auxiliary qubit to create the state |x⟩|0⟩. A
popular form of controlled rotation is

    C-R_y(f(x)) = |x⟩⟨x| ⊗ e^{−iY f(x)},    (39)

where R_y is the single-qubit rotation operator introduced in (35), and f(x) is a function of x that
is reasonably simple to compute. In matrix notation, the operator e^{−iY φ} has the effect of
e^{−iY φ} = [[cos φ, −sin φ], [sin φ, cos φ]], so that the effect of the operator C-R_y(f(x)) on the state |x⟩ and the auxiliary
qubit is

    C-R_y(f(x)) |x⟩|0⟩ = |x⟩ (cos f(x)|0⟩ + sin f(x)|1⟩).    (40)

A common example of f(x) is the arcsine of x/C, where C is a constant selected so that |x/C| ≤ 0.5.
Arcsine is efficient to compute using an expansion based on the inverse square root (see, e.g.
Häner et al., 2018, and references therein). Quantum algorithms exist to approximate the arcsine
function, inspired by classical reversible algorithms for the inverse square root – used, e.g., in
gaming (Lomont, 2003).
Controlled rotation supports many important quantum algorithms such as quantum Monte
Carlo (Section 6.8), the application of a Hermitian operator (Section 8.3), and the HHL quantum
linear systems algorithm (Section 8.4).
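A small worked example of Eq. (40): for a basis state |x⟩ and f(x) = arcsin(x/C), the controlled rotation writes x/C directly into the amplitude of the auxiliary qubit's |1⟩ component. This plain-Python sketch uses an arbitrary illustrative constant C (not a value from the paper):

```python
import math

# Worked example of the controlled rotation (40) with f(x) = arcsin(x/C):
# on |x>|0> it yields amplitudes cos f(x) on |x>|0> and sin f(x) = x/C on
# |x>|1>. C is chosen so |x/C| <= 0.5 for the x values used below.

C = 16.0

def controlled_ry_amplitudes(x):
    f = math.asin(x / C)                 # rotation angle f(x)
    return math.cos(f), math.sin(f)      # amplitudes of |x>|0>, |x>|1>

for x in [0, 3, 8]:
    a0, a1 = controlled_ry_amplitudes(x)
    # a1 equals x/C exactly, and the state stays normalized
    print(x, a1, a0 ** 2 + a1 ** 2)
```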

Controlled Unitary
Controlled-unitary C-U is an operation applied to a multi-qubit register conditional on the state
of an auxiliary qubit:

    C-U = |0⟩⟨0| ⊗ I + |1⟩⟨1| ⊗ U.    (41)

The controlled-unitary operation exists for some, but not all, unitaries (Lloyd et al., 2014).
When the operator U applies to a single-qubit register, C-U is implemented using a decomposition
of the operator U such that U = AXBXC and ABC = I. Then, substituting CNOT (C-X)
for NOT (X), we get the desired controlled unitary.

Postselection
Another important algorithmic building block is postselection, where a quantum state is kept or
discarded conditionally on the result of a quantum measurement of a part of the state (usually an
auxiliary qubit). Postselection enables nonlinear quantum transformations at the cost of having
to discard quantum states where the measurement did not yield the desired result.
For example, let |ψ⟩ be a quantum state such that |ψ⟩ = Σ_x ψ_x |x⟩, where the states {|x⟩} form
an orthonormal basis. We can apply the controlled rotation in Eq. (40) to the state |ψ⟩ and an
auxiliary qubit. The result of the controlled rotation is

    C-R_y |ψ⟩|0⟩ = Σ_x ψ_x |x⟩ (cos f(x)|0⟩ + sin f(x)|1⟩).    (42)

We now measure the auxiliary qubit. If the measurement yields 1, we keep the state; if 0, we discard
it and then repeat the preparation of the state |ψ⟩, the controlled rotation, and the measurement
until the measurement yields 1. The state we keep will be proportional to Σ_x ψ_x sin f(x) |x⟩ –
a nonlinear transformation of the original state |ψ⟩. Postselection plays an important role in
algorithms such as the application of a Hermitian operator (Section 8.3), the HHL quantum linear
systems algorithm (Section 8.4), and the quantum singular value transformation (Section 11).
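The nonlinearity introduced by postselection can be illustrated numerically. In this plain-Python sketch (the amplitudes ψ_x and angles f(x) are arbitrary illustrative choices), keeping only the runs in which the auxiliary qubit reads 1 produces a state proportional to ψ_x sin f(x):

```python
import math

# Postselection after the controlled rotation (42), simulated classically.
# Keeping only runs where the auxiliary qubit reads 1 leaves a state
# proportional to psi_x * sin f(x); the success probability governs how
# many repetitions are needed on average.

psi = [0.6, 0.8]                        # normalized two-state example
f = [0.3, 1.2]                          # rotation angles f(x) for x = 0, 1

branch1 = [p * math.sin(fx) for p, fx in zip(psi, f)]  # |1> branch amplitudes
p_success = sum(a * a for a in branch1)                # P(auxiliary reads 1)
post_state = [a / math.sqrt(p_success) for a in branch1]

print(p_success)    # probability that a run survives postselection
print(post_state)   # normalized state proportional to psi_x * sin f(x)
```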

Oracles
Oracles are not gates, but they too are important algorithmic building blocks – in both classical
and quantum computation. Oracles are “black box” parts of algorithms that solve certain problems
in a single operation. They are calls to a function that do not exploit the internal structure of
the function.
Quantum algorithms employ two types of oracles – classical oracles, which provide a classi-
cal solution to a given problem, and quantum oracles, which make the solution available to the
quantum computer as a quantum state.

6 Grover’s Search and Descriptive Statistics on a Quantum Computer
One of the most influential quantum algorithms is the search algorithm by Lov Grover (Grover,
1996). Grover considers the problem of searching for a binary string x0 among N strings, provided
that there is an oracle binary function such that f(x) = 0 if x ≠ x0 and f(x0 ) = 1. The classical search
algorithm over an unstructured space requires O(N) oracle calls; Grover’s quantum algorithm
requires O(√N) calls, offering a quadratic improvement.
Grover’s algorithm has been influential because it led to the development of a class of prac-
tically important applications of quantum computers, including those for efficient estimation of
statistical quantities such as the sample mean, median, or minimum (maximum) of a function
over a discrete domain. Quantum Amplitude Amplification, based on Grover’s algorithm, is a
widely used subroutine to many other quantum algorithms. It works by amplifying the amplitude
of the correct result in a quantum state holding the result in a superposition with byproducts of
computation.

6.1 Grover’s Search Algorithm


Assume, without loss of generality, that N = 2^n, where n is an integer. The starting state is a
uniform superposition of all possible n-bit strings x created on n qubits:

|si = (1/√N) Σ_x |xi. (43)

One of these bit strings is x0 – the target. Grover’s algorithm proceeds in a series of iterative
steps, each amplifying the amplitude of the state holding x0 by 2/√N. After O(√N) iterations
this amplitude – and the corresponding measurement probability – becomes of order unity.

To understand Grover’s algorithm, it is helpful to think of the state |si as a superposition of
the target state |x0 i and its orthogonal complement |s′ i:
|s′i = (1/√(N−1)) Σ_{x≠x0} |xi, hs′|x0 i = 0, (44)

so that

|si = √((N−1)/N) |s′i + (1/√N) |x0 i. (45)

When viewed as vectors in a Hilbert space, the starting state |si is very close to |s′i. Let θ/2 be

the angle between |si and |s′i. Then, by Eq. (45), sin(θ/2) = 1/√N.
At each iteration of Grover’s algorithm, the quantum state of the system is pushed slightly
toward |x0 i and away from |s′ i by an angle θ. The state stays in the two-dimensional plane
spanned by |x0 i and |s′ i, and the result of each iteration is a superposition of |x0 i and |s′ i. After
t iterations such that sin²((t + 1/2)θ) ≈ 1, the algorithm results in a quantum state which, when
measured in the computational basis, yields x0 with a probability close to 1. The number of
iterations is t ≈ (π/4)√N or O(√N).
Let UG represent one Grover iteration. Each iteration consists of two unitary operators Us and
Uf : UG =Us Uf . The first operator is Uf =I −2|x0 ihx0 |, a reflection with respect to |s′ i in the plane
spanned by |x0 i and |s′ i as it maps |x0 i into −|x0 i. Application to the initial state |si yields
Uf |si = (I − 2|x0 ihx0 |)|si = |si − (2/√N)|x0 i = √((N−1)/N) |s′i − (1/√N)|x0 i. (46)
The second operator is Us = 2|sihs|−I, a reflection around the state |si which, applied after the
operator Uf , results in an increased amplitude of |x0 i:
UG |si = Us Uf |si = (2|sihs| − I)(|si − (2/√N)|x0 i) = (1 − 4/N)|si + (2/√N)|x0 i. (47)
So, the application of UG to the state |si has amplified the amplitude of the target state |x0 i. Both
operators Uf and Us can be implemented on a quantum computer.
The implementation of operator Uf requires an auxiliary qubit and quantum oracle access Of
to the function f (x). Given a value X encoded in a quantum state, the oracle Of records the value
f (x) in a quantum state in the following way. Let |xi|yi be a quantum system that represents the
value x and, in the auxiliary qubit, some value y. The quantum oracle Of applied to |xi|yi
mod-adds the value f(x) to the value of the auxiliary qubit: Of |xi|yi = |xi|y ⊕ f(x)i, where ⊕ indicates
mod-addition. It is easy to show that, if we start with the auxiliary qubit in the Hadamard |−i
state, we have Of |xi|−i = (Uf |xi)|−i.
The operator Us can be implemented using a decomposition into three parts: an uncomputation
of |si to recover the ground state |0i, a sign-flip on the ground state, and a re-computation of |si.
Let A be the operator that creates the state |si when applied to the ground state: |si=A|0i. In the
case when |si is a uniform superposition of all n-bit strings as in (43), we know that A = H ⊗n, but
we will use the more general form here because it will help us to derive the Quantum Amplitude
Amplification algorithm in Section 6.2. Let U0 = I −2|0ih0|, an operator that changes the sign of
the |0i state. Then,

−AU0 A† = −A(I −2|0ih0|)A† = 2A|0ih0|A† −AIA† = 2|sihs|−I = Us , (48)

where we used the fact that the operator A is unitary, and therefore, A† = A−1 . Equation (48)
demonstrates the implementation of Us using the three parts mentioned above.
With minor modifications, it is possible to apply Grover’s algorithm to the situation with more
than one desired state (Boyer et al., 1998).
Brassard et al. (2002) provide an intuitive explanation for how Grover’s algorithm delivers the
quadratic improvement in computational efficiency. Consider a classical randomized algorithm that
succeeds with probability p, where p≪1. After j repetitions of the algorithm, if j is small enough,
the cumulative probability of success is approximately jp. In this classical case, the probability
of success increases by a constant increment p at each iteration. Grover’s algorithm increases the
amplitude of the desired state |x0 i by an approximately constant increment at each iteration. The
probability that a quantum measurement of the resulting quantum state yields the outcome x0
is proportional to the squared amplitude of |x0 i, resulting in a quadratically faster increase in
probability at every iteration of Grover’s algorithm.
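The iteration UG = Us Uf can be simulated directly as a pair of N × N reflections. The sketch below is a minimal statevector simulation (with n and x0 as arbitrary example choices, not from the text) showing that after roughly (π/4)√N iterations the probability of measuring x0 is close to 1.

```python
import numpy as np

# Minimal statevector simulation of Grover's algorithm for N = 2^n states.
# U_G = U_s U_f is applied directly as dense matrices; x0 is the marked string.
n, x0 = 8, 137
N = 2**n
s = np.full(N, 1/np.sqrt(N))                  # uniform superposition |s>, Eq. (43)

Uf = np.eye(N); Uf[x0, x0] = -1               # U_f = I - 2|x0><x0|
Us = 2*np.outer(s, s) - np.eye(N)             # U_s = 2|s><s| - I

state = s.copy()
t = int(np.round(np.pi/4 * np.sqrt(N)))       # ~ (pi/4) sqrt(N) iterations
for _ in range(t):
    state = Us @ (Uf @ state)

print(t, round(state[x0]**2, 4))              # success probability close to 1
```

A real quantum implementation would never form these matrices; it applies Uf via the oracle and Us via Hadamards and a sign flip on |0i, as described above.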

6.2 Quantum Amplitude Amplification (QAA)


Many quantum algorithms deliver the result of computation in a quantum superposition state
|ψi of “good” and “bad” results. The extraction of the “good” result often requires repeated
measurements of auxiliary qubits. For example, the quantum algorithm in Section 8.3 (application
of a Hermitian operator to a quantum state) produces the desired result only when the measurement
of the auxiliary qubit yields 1. If the probability of measuring 1 is p, then it takes roughly 1/p
measurements on average to extract the desired result. In some cases the probability p may be
quite small, making the measurement step a substantial computational overhead.
Brassard et al. (2002) generalize Grover’s algorithm to mitigate this problem and reduce the

computational burden of repeated measurements from O(1/p) to O(1/√p). Let A be the algorithm
that creates the state |ψi: |ψi = A|0i = Σ_x ax |xi, a superposition of “good” and “bad” results.
Consider a validation function χ that returns 1 if x is a “good” result, and 0 if a “bad” result. The
aim of the QAA algorithm is to amplify the amplitudes of the subspace of “good” results in order
to increase the probability that a measurement yields those results.
The algorithm leverages the fact that the quantum state |ψi can be decomposed as
|ψi = a1 |ψ1 i + √(1 − a1²) |ψ2 i, (49)

where |ψ1 i is a projection of |ψi onto the “good” subspace spanned by the states representing the
“good” results and |ψ2 i is the projection onto its complement – the “bad” subspace. The probability
p of a measurement of |ψi yielding a “good” state is p = |a1 |2 .
An iteration of QAA, represented as an operator Q, increases the amplitude of the “good” state
|ψ1 i. By analogy with Grover’s search algorithm, the operator Q is
Q = −AU0 A† Uχ , (50)
where
U0 = I −2|0ih0|, Uχ = I −2|ψ1 ihψ1 |, (51)
and the algorithm A is assumed to be reversible, i.e. containing no measurements.
Repeated applications of Q gradually increase the amplitude of the “good” |ψ1 i component.
After t applications of Q, where t ≈ π/(4|a1 |) = π/(4√p), the probability that a measurement of Qt |ψi yields a
“good” component is of order unity.

Both QAA and Grover’s algorithms are periodic. Once the minimum error is reached at t ≈ π/(4|a1 |),
repeated applications of the search operator Q start pushing the state of the system away from
the target state, and the error starts to increase until t ≈ π/(2|a1 |). After maximum error is reached,
repeated applications of Q start pushing the system closer to the target state again. In the
absence of prior knowledge of |a1 |, the periodicity makes it difficult to determine the stopping
point optimally. The literature refers to this problem as the “soufflé problem,” referring to the
way the dessert rises during baking, but starts to deflate if baked too long (the analogy would
have been more apt if the soufflé inflated and deflated periodically with baking time). Solutions to
the problem include fixed-point quantum search (Yoder et al., 2014) and variable time amplitude
amplification (Ambainis, 2012).
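The "soufflé" oscillation is easy to visualize numerically. The sketch below uses the standard identity for Grover-type iterations – success probability sin²((2t+1)θ/2) with sin(θ/2) = |a1| – as an assumed shortcut instead of simulating the operator Q itself; a1 = 0.1 is an arbitrary example amplitude.

```python
import numpy as np

# Success probability of QAA as a function of the number of steps t, for
# a1 = 0.1 (so p = 0.01). The probability peaks near t = pi/(4 a1) ~ 7.85
# and then deflates again within one period.
a1 = 0.1
theta = 2*np.arcsin(a1)
t = np.arange(0, 16)                          # one period of the oscillation
p_success = np.sin((2*t + 1) * theta/2)**2

t_best = int(np.argmax(p_success))            # first peak, near pi/(4 a1)
print(t_best, round(p_success[t_best], 3), round(p_success[15], 3))
```

Stopping near the peak matters: a handful of extra iterations past t_best visibly reduces the success probability, which is exactly the stopping-time difficulty that fixed-point quantum search removes.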

6.3 Quantum Amplitude Estimation (QAE)


As discussed in the previous section, the QAA algorithm is periodic with respect to repeated
application of the amplification step Q. When the amplification step Q is applied t times, Qt , the
amplification error cycles between its minimum and maximum with a period of t = π/|a1 |, where a1
is the amplitude of the “good” subspace in (49). Brassard et al. (2002) leverage the fact that the
period of Qt is a function of the absolute value of the amplitude |a1 | to estimate this amplitude
using Quantum Phase Estimation, an influential algorithm we describe in Section 8.2.
The QAE algorithm starts with a register of n qubits containing |ψi = A|0i and an auxiliary
register of n qubits, initialized to a uniform superposition (1/√N) Σ_{j=0}^{N−1} |ji, where N = 2^n. Let ΛN (U)

be a controlled unitary operator that applies multiple copies of a unitary U conditional on the
state of the auxiliary qubit register:
ΛN (U)|ji|ψi = |ji(U j |ψi), (52)
where the state |ji acts as a reference. The operator ΛN (Q) applied to the state (1/√N) Σ_{j=0}^{N−1} |ji|ψi
creates a superposition of states with a range of repeated quantum amplitude amplification steps
Qj . Quantum Phase Estimation enables the extraction of the phase of Qj . Quantum Phase
Estimation records ŷ, an n-bit approximation of y such that |a1 |² = sin²(πy/N), in the computational
basis in the auxiliary register. Measurement of the auxiliary register yields the outcome |ŷi with
probability of at least 8/π². With â1 = sin²(πŷ/N), the estimation error bound after t iterations of
the algorithm is:

|â1 − a1 | ≤ 2π √(a1 (1−a1 ))/t + π²/t². (53)

6.4 Estimating the Mean of a Bounded Function


The QAE algorithm provides an efficient way to estimate the mean of a bounded function (see
Section 4.2 for a method to estimate the mean amplitude of a state that leverages the Swap Test).
Let F : {0,...,N −1} → X, where X ⊂ [0,1], be a black-box function. Brassard et al. (2011)
propose a method to approximate the mean value of F , µ = (1/N) Σ_x F (x); see also Heinrich (2002).
The idea is to create a state of the form:
|ψi = α|ψ0 i+β|ψ1 i, (54)
such that |ψ0 i and |ψ1 i are orthogonal and |β|² = (1/N) Σ_{i=0}^{N−1} F (i) = µ. If the creation of state |ψi
requires O(1) oracle calls to F , then QAE can yield an estimate of |β|² with precision ǫ in O(1/ǫ)
oracle calls regardless of the size N.

Creation of state |ψi in (54) proceeds in a few steps and requires three registers. The first,
n-qubit register |·in encodes index values i in the computational basis; the second, m-qubit register
|·im is an auxiliary register that temporarily holds an m-bit approximation of F (i) – i.e. an
approximation of F (i) written down using m bits; the third, single-qubit register |·i1 is an auxiliary
register that helps create orthogonal states |ψ0 i and |ψ1 i.
Let A be an algorithm that encodes an m-bit approximation of F (i) in an m-qubit auxiliary
register initialized to |0im :

A|iin |0im = |iin |F (i)im . (55)

Using a controlled rotation operator C-R (see Section 5.3), we transfer the values F (i) into the
amplitudes of the quantum state

C-R|F (i)im |0i1 = |F (i)im (√(1−F (i))|0i1 + √(F (i))|1i1 ). (56)

The operator A = (A−1 ⊗I1 )(Im ⊗In ⊗C-R)(A⊗I1 ), where Im , In , and I1 are identity operators
acting on the m- and n-qubit registers and the auxiliary qubit, respectively, applies an extension
of A to all three registers and then uncomputes the stored values of F (i). When applied to a
state where the n-qubit register holds a uniform superposition of n-bit strings |ii, the operator A
produces the desired state |ψi:
|ψi = A ((1/√N) Σ_{i=0}^{N−1} |iin |0im |0i1 ) = (1/√N) Σ_{i=0}^{N−1} |iin |0im (√(1−F (i))|0i1 + √(F (i))|1i1 ). (57)

Discarding the m-qubit auxiliary register and rearranging, we have:


|ψi = (1/√N) Σ_{i=0}^{N−1} |iin (√(1−F (i))|0i1 + √(F (i))|1i1 ) (58)

    = √((N − Σ_j F (j))/N) [ (1/√(N − Σ_j F (j))) Σ_{i=0}^{N−1} √(1−F (i)) |iin |0i1 ]

    + √((Σ_j F (j))/N) [ (1/√(Σ_j F (j))) Σ_{i=0}^{N−1} √(F (i)) |iin |1i1 ]

    = α|ψ0 i + β|ψ1 i,

where the expressions in the square brackets are the properly normalized and orthogonal states
|ψ0 i and |ψ1 i. The coefficients in front of the brackets are α and β respectively, so that |β|² =
(1/N) Σ_{i=0}^{N−1} F (i) = µ can be efficiently approximated by QAE, described in Section 6.3. The creation
of the quantum state |ψi requires only two calls to the quantum oracle A (one to initialize the
m-qubit auxiliary register containing the values |F (i)i and the other to uncompute it).
Montanaro (2015) extends the Brassard et al. method to functions F with non-negative output,
X ⊂ R≥0 . The method decomposes the function F into k components F(xi−1 ,xi ] , for i = 1,...,k, whose
output falls within disjoint intervals (xi−1 ,xi ] ⊂ X of the codomain of F . The mean is then estimated
for each function F(xi−1 ,xi ] , and the overall mean of F is constructed from the means
of F(xi−1 ,xi ] . Montanaro (2015) extends the method further to functions with output that does not
have to be non-negative but has a bounded variance.
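The key identity behind Eqs. (54)-(58) – that the squared amplitude of the auxiliary |1i branch equals the mean of F – can be checked classically. The sketch below uses an arbitrary random F with values in [0,1].

```python
import numpy as np

# Classical check of the construction in Eqs. (54)-(58): after the controlled
# rotation and uncomputation, the squared amplitude of the auxiliary |1>
# branch equals the mean of F.
N = 16
rng = np.random.default_rng(0)
F = rng.uniform(0, 1, size=N)                 # example black-box values in [0,1]

# Amplitudes of the |i>|0> and |i>|1> components in Eq. (58)
amp0 = np.sqrt((1 - F) / N)
amp1 = np.sqrt(F / N)

beta_sq = np.sum(amp1**2)                     # |beta|^2
print(round(beta_sq, 6), round(F.mean(), 6))  # identical: |beta|^2 = mu
```

QAE then estimates |β|² with additive error ǫ using O(1/ǫ) oracle calls, which is the source of the quadratic advantage over classical sampling.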

6.5 Minimum (or Maximum) of a Function over a Discrete Domain
The fastest classical algorithm for finding the minimum (or maximum) value in an unsorted
table of N items requires O(N) oracle calls; with the help of a quantum computer this requirement
decreases to O(√N) through repeated application of QAA (Durr and Hoyer, 1996).
Let F : {0,...,N −1} → X, X ⊂ R be a black-box function. The algorithm outputs y ∗ , the index
corresponding to the minimum value of F . The algorithm starts by selecting uniformly at random
an index value y such that 0 ≤ y ≤ N −1. Assuming without loss of generality that N = 2^n for an
integer n, two registers of n qubits each can support a quantum state of the form (1/√N) Σ_{j=0}^{N−1} |ji|yi.
The first register encodes a uniform superposition of all indices j, the second register encodes the
value y, both in the computational basis (i.e. in the form of binary strings).
Mark every item with F (j) < F (y) using an efficient oracle – i.e. an oracle able to perform the
operation in O(logN) calls to the function F (see subsection Oracles in Section 5.3). The marked
items represent the “good” state in QAA (Section 6.2). Run the algorithm to amplify this “good”
state. Next, uniformly at random select y ′ until F (y ′ ) < F (y) and re-run the algorithm with y ′
instead of y. Durr and Hoyer (1996) show that, after 22.5√N + 1.4 log2² N calls to F , the desired
state is reached with probability at least 1/2.
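The logic of the Durr-Hoyer search can be sketched classically. In the code below the Grover/QAA amplification of the marked items is idealized by a uniform draw over {j : F(j) < F(y)}; the table and seed are arbitrary example choices, and it is the amplified sampling step, not this control flow, that yields the O(√N) quantum scaling.

```python
import numpy as np

# Classical skeleton of the Durr-Hoyer minimum search with an idealized
# sampler standing in for amplitude amplification over the marked items.
rng = np.random.default_rng(1)
F = rng.permutation(100)                      # black-box table of 100 values

y = rng.integers(len(F))                      # random starting index
while True:
    better = np.flatnonzero(F < F[y])         # items marked by the oracle
    if better.size == 0:
        break                                 # F[y] is the minimum
    y = rng.choice(better)                    # idealized amplified sampling

print(y, F[y])                                # index and value of the minimum
```

Because each accepted y strictly decreases F(y), the loop always terminates at the minimum; the quantum analysis bounds the expected number of oracle calls spent inside the amplification steps.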

6.6 Median and kth Smallest Value


Nayak and Wu (1999) generalized the algorithm of Durr and Hoyer (1996) to find not just the
smallest value, but the kth smallest value of a discrete function F , including the N/2th smallest
value which is the median. Denote by rank(F (i)) the rank of F (i) in {F (0),...,F (N −1)} ordered
in non-decreasing order. Let ∆≥1/2 be the accuracy parameter. Given k, 1 ≤k ≤N, the problem
is to find the value F (i) such that rank(F (i)) is the smallest value in (k−∆,k+∆). This is referred
to as the ∆-approximation of the kth smallest element of F .
The algorithm relies on two subroutines. The first subroutine implements a function K(l)
that returns ‘yes’ if F (l) is in the ∆-approximation of the kth smallest element; ‘<’ if the rank
of F (l) is at most k −∆; and ‘>’ if the rank of F (l) is at least k +∆. This subroutine can be
implemented based on the counting algorithm of Brassard et al. (1998) (see Section 6.7). The
second subroutine implements the sampler S(i,j) that chooses an index l uniformly at random
such that F (i) < F (l) < F (j). The sampler can be implemented based on the generalized search
algorithm of Boyer et al. (1998).
The kth smallest value algorithm works as follows: For convenience, define F (−1) = −∞ and
F (N) = ∞,

1. i ← −1; j ← N

2. l ← S(i,j)

3. If K(l) returns ‘yes’, output F (l) (and/or the index l) and stop. Else, if K(l) returns ‘<’,
i ← l, go to step 2. Else, if K(l) returns ‘>’, j ← l, go to step 2.

Nayak and Wu (1999) prove that the expected number of iterations before termination is O(logN).18

18 More precisely, let n = √(N/∆) + √(k(N −k))/∆. The expected number of iterations is O(logn).
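The three-step loop above can be sketched classically with idealized subroutines: K(l) is implemented here by comparing the exact rank of F(l) to k (i.e. ∆ = 1), and S(i, j) by a uniform draw over {l : F(i) < F(l) < F(j)}. The table, k, and seed are arbitrary example choices.

```python
import numpy as np

# Classical skeleton of the Nayak-Wu k-th smallest search with idealized
# subroutines K and S; the quantum versions use counting (Section 6.7)
# and generalized Grover search, respectively.
rng = np.random.default_rng(7)
F = rng.permutation(50).astype(float)
k = 25                                        # find the k-th smallest (1-indexed)

ranks = F.argsort().argsort() + 1             # rank of each F(l) in 1..N
lo, hi = -np.inf, np.inf                      # F(-1) = -inf, F(N) = +inf
while True:
    cand = np.flatnonzero((F > lo) & (F < hi))
    l = rng.choice(cand)                      # idealized sampler S(i, j)
    if ranks[l] == k:                         # K(l) returns 'yes'
        break
    if ranks[l] < k:
        lo = F[l]                             # K(l) returned '<'
    else:
        hi = F[l]                             # K(l) returned '>'

print(l, F[l])                                # index and value with rank k
```

Each rejected draw shrinks the candidate interval while always keeping the rank-k item inside it, which is why the expected number of rounds is logarithmic.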

6.7 Counting
In some applications, the problem of interest is not to find the solution but to count how many
solutions exist. Consider the function F : {0,...,N −1} → X = {0,1}. We are interested in counting
the number of indices x such that F (x) = 1. This count equals N times the mean of F , so the
algorithm for estimating the mean of F (Section 6.4) is effectively a counting algorithm.

6.8 Quantum Monte Carlo


The techniques used in the previous sections provide a way to estimate the mean of a function
with respect to a probability distribution (Low and Chuang, 2017; Rebentrost and Lloyd, 2018).
Classically, the most popular method for estimating an expectation is Monte Carlo simulation. In
this method, samples are drawn from the probability distribution and the function is evaluated
for each sample; the average of these outputs, µ̂, is the Monte Carlo estimate of the true mean µ.
Chebyshev’s inequality guarantees that, for k independent draws from the probability distribution,
the probability that the estimate is far from the real mean µ is bounded
Pr[|µ̂ − µ| ≥ ǫ] ≤ σ²/(kǫ²), (59)
where σ 2 is the variance of the function with respect to the probability distribution. In other words,
in order to estimate µ up to an additive error ǫ, k = O(σ 2/ǫ2 ) samples – and function evaluations
– are required.
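The classical baseline is a few lines of code. The sketch below estimates µ = E_p[f] by sampling; the distribution, function, and sample size are arbitrary illustrative choices. Per Eq. (59), halving the target error ǫ requires roughly four times as many samples.

```python
import numpy as np

# Classical Monte Carlo estimate of mu = E_p[f] for a discrete distribution.
rng = np.random.default_rng(42)
x = np.arange(8)
p = np.array([0.3, 0.2, 0.15, 0.1, 0.1, 0.05, 0.05, 0.05])
f = np.sin(x)**2                              # example function with values in [0, 1]
mu = np.sum(p * f)                            # exact mean, for comparison

k = 10_000
draws = rng.choice(x, size=k, p=p)            # k independent samples from p
mu_hat = f[draws].mean()                      # Monte Carlo estimate

print(round(mu, 4), round(mu_hat, 4))
```

The quantum method described next replaces the k = O(σ²/ǫ²) function evaluations with O(σ/ǫ) evaluations.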
Quantum Amplitude Estimation holds the promise of reducing the required number of function
evaluations to O(σ/ǫ) to achieve the same error bound. That is, Quantum Monte Carlo provides a
quadratic speedup over classical Monte Carlo. The quantum method requires two efficient quantum
oracles. The first oracle P helps to prepare a state that encodes the probability distribution in a
quantum state – called a quantum sample state. The second oracle F applies the function whose
mean is to be estimated with respect to the probability distribution.
To define the quantum sample state, let X be a finite N-dimensional set, and p(x) a probability
distribution over X, such that Σ_{x∈X} p(x) = 1. Let the basis set of an N-dimensional Hilbert space,
{|xi}, represent the elements x of X. The quantum state |pi of the form

|pi = Σ_{x∈X} √(p(x)) |xi (60)

is the quantum sample with respect to the distribution p. The state |pi has the property that, by
the Born rule, the probability that a measurement of all qubits supporting the state yields x is
p(x).
Using these two oracles P and F and an auxiliary qubit we have:
P : |0i 7→ |pi = Σ_{x∈X} √(p(x)) |xi, (61)

F : |pi|0i 7→ Σ_{x∈X} √(p(x)) |xi(√(1−f (x))|0i + √(f (x))|1i). (62)

The quantum state in (62) has a structure parallel to that of the quantum state in (58), and we can
apply to it a similar technique to isolate the expectation value of f with respect to the distribution
p, Ep [f ] = Σ_{x∈X} p(x)f (x). If F is efficiently computable (i.e., with a sub-polynomial number of
operations), then the algorithm yields fˆ, an estimate of Ep [f ] within the error bound ǫ with O(σ/ǫ)
function evaluations.
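The amplitude encoding of Eqs. (60)-(62) can be checked classically: the probability of measuring the auxiliary qubit in |1i is exactly E_p[f]. The distribution and function below are arbitrary example choices.

```python
import numpy as np

# Quantum-sample construction of Eqs. (60)-(62), simulated classically:
# encode p(x) in squared amplitudes, apply the function oracle to an
# auxiliary qubit, and read E_p[f] off as the probability of measuring |1>.
x = np.arange(8)
p = np.exp(-0.5 * x); p /= p.sum()            # example distribution over 8 points
f = (x % 3 == 0).astype(float)                # example function with values in [0, 1]

sample = np.sqrt(p)                           # |p> = sum_x sqrt(p(x)) |x>
amp1 = sample * np.sqrt(f)                    # amplitudes of the |x>|1> branch

print(round(np.sum(amp1**2), 6), round(np.sum(p * f), 6))   # both equal E_p[f]
```

On a quantum computer this probability is not read by repeated measurement but estimated by QAE, which is what produces the O(σ/ǫ) cost.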

Further Discussion on Quantum Sample Preparation
The quantum Monte Carlo method described above delivers a quadratic speedup over classical
methods provided the quantum oracles P and F can be implemented efficiently. If the func-
tion f is easily computable, then F has an efficient implementation (Low and Chuang, 2017;
Rebentrost and Lloyd, 2018). But implementing the oracle P to create a quantum sample can
be challenging. For an arbitrary probability distribution, preparing a quantum sample state is
computationally equivalent to solving the graph isomorphism problem and is exponentially hard
(Plesch and Brukner, 2011; Chakrabarti et al., 2019). For efficiently integrable distributions, such
as the normal distribution, the method proposed by Grover and Rudolph (2002) has been popular.
However, Herbert (2021) recently demonstrated that the Grover and Rudolph (2002) method is
limited to situations simple enough to be solved without the use of the Monte Carlo method –
classical or quantum. Because quantum Monte Carlo delivers quadratic rather than exponential
speedup over classical methods, the modest computation overhead of the Grover and Rudolph
(2002) method negates the quantum gains.
Efficient preparation of quantum samples for distributions of practical interest is, at the time of
writing this review, an open problem in quantum algorithm design. A machine-learning approach
based on empirical data has been proposed by Zoufal et al. (2019). Vazquez and Woerner (2021)
propose to view the probability distribution function p as a function, implemented similarly to
function f , using a controlled rotation of an auxiliary qubit. This method works for simpler dis-
tributions, with an efficiently-computable p(x). For more complex, high-dimensional distributions
Kaneko et al. (2021) propose to create quantum samples using pseudorandom numbers. An et al.
(2021) quantize the classical method of multilevel Monte Carlo to find approximate solutions to
stochastic differential equations (SDEs), particularly for applications in finance.
It may also be possible to achieve relatively efficient quantum sampling using a sequence
of slowly varying quantum walks (see Section 7) – the quantum equivalents of Markov chains
(Wocjan and Abeyesinghe, 2008; Wocjan et al., 2009). The method, called Quantum Markov chain
Monte Carlo and discussed in more detail in Section 7.3, is analogous to classical Markov chain
Monte Carlo (MCMC), and can be used to generate a Markov chain with a given equilibrium
distribution π.

7 Quantum Markov Chains


This section introduces Quantum Markov chains, often called quantum walks in the quantum
computing literature, the quantum equivalents of classical Markov chains which are widely used in
probability and statistics. Quantum walks can provide polynomial speed-ups for a wide variety of
problems from estimating the volume of convex bodies (Chakrabarti et al., 2019) to option pricing
(An et al., 2021), search for marked items (Magniez et al., 2011) and active learning in artificial
intelligence (Paparo et al., 2014).19 However, because of quantum interference, quantum Markov
chains behave substantially differently from their classical counterparts. For example, quantum
Markov chains do not admit an equilibrium distribution, but the time-averaged distribution of the
chain does converge (see below for a formal definition).
in statistics different from those of classical chains, and can lead to new interesting and important
applications.
19 For special classes of problems, such as a subclass of black-box graph traversal problems, quantum walks deliver
exponential speed-ups over any classical algorithm (Childs et al., 2003). Generally, quantum walks are universal for
quantum computation (Childs, 2009) (i.e. any sequence of gates can be expressed as a quantum walk).

Questions of interest to statisticians are: how to quantize a Markov chain, i.e. how to implement
a Markov chain in a quantum computer? What are the properties of the resulting quantum Markov
chain? How can this quantum Markov chain be used in statistical applications such as MCMC
sampling? In this section, we review the two most popular approaches for quantizing a Markov
chain: coin walks and Szegedy walks (Szegedy, 2004; Watrous, 2001). Coin walks quantize Markov
chains on unweighted graphs. Szegedy walks work on weighted directional graphs. We focus
on Markov chains with a discrete state space, but it is also theoretically possible to quantize
a continuous-space Markov chain.20 For a detailed and thorough review of quantum walks, see
Venegas-Andraca (2012).

7.1 Coin Walks


Consider a Markov chain with a discrete state space. It can be represented on a graph G = (V,E):
the vertices V represent the states; after each time step, the chain stays at the current vertex or
jumps to one of its adjacent vertices according to a transition probability. This creates a random
walk on the graph.
Let HV and HE be Hilbert spaces whose basis states encode the vertices in V and the edges
in E, respectively. Define a shift operator S on HV ⊗HE that determines the next vertex u given
the current vertex v and the edge e, i.e., S|ei|vi = |ei|ui. Define a coin operator C to be a unitary
transformation on HE . Then, U = S(C ⊗I) implements one step of the random walk on graph G.
If the initial state is |ψ0 i, the state after t steps is

|ψt i = U t |ψ0 i.

The dynamic of this quantum random walk is governed by the coin operator C. Because of the
quantum interference and superposition effect, the distribution of |ψt i behaves very differently
from the classical Markov chain. Denote by Pt (v|ψ0 ) the probability of finding |ψt i at a node
v ∈ V . The probability distribution Pt (·|ψ0 ) does not converge (Venegas-Andraca, 2012), but its
average does. More precisely, let
P̄t (v|ψ0 ) = (1/t) Σ_{s=1}^{t} Ps (v|ψ0 ),

then P̄t (·|ψ0 ) converges and this stationary distribution can be determined. With a suitable defi-
nition of mixing time, it is shown that the mixing time of a quantum walk is quadratically faster
than that of a classical random walk – a property that attracted the attention of researchers seeking to speed
up algorithms based on Markov chains.
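A concrete instance is the Hadamard coin walk on a cycle. The short simulation below (node count, coin, and start state are arbitrary illustrative choices) evolves the joint node-coin amplitudes and accumulates the time-averaged distribution P̄t, which, unlike the instantaneous Pt, settles down.

```python
import numpy as np

# Hadamard coin walk on a cycle of M nodes: the instantaneous distribution
# P_t does not converge, but its time average does.
M, T = 16, 400
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # coin operator C

psi = np.zeros((M, 2), dtype=complex)         # amplitudes psi[v, coin]
psi[0, 0] = 1.0                               # walker starts at node 0

avg = np.zeros(M)                             # accumulates bar{P}_T(v)
for _ in range(T):
    psi = psi @ H.T                           # apply the coin at every node
    shifted = np.zeros_like(psi)
    shifted[:, 0] = np.roll(psi[:, 0], 1)     # coin 0: step clockwise
    shifted[:, 1] = np.roll(psi[:, 1], -1)    # coin 1: step counterclockwise
    psi = shifted
    avg += (np.abs(psi)**2).sum(axis=1)       # P_s(v | psi_0)
avg /= T

print(round(avg.sum(), 6))                    # a probability distribution over nodes
```

The coin-then-shift structure is exactly U = S(C ⊗ I) from the text; quantum interference between the two coin branches is what makes Pt spread quadratically faster than a classical random walk.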

7.2 Szegedy Walks


Szegedy (2004), based on the earlier work of Watrous (2001), proposed another approach to quan-
tize Markov chains. Consider a Markov chain operating on a bipartite graph. Let X and Y be
two finite sets, and matrices P and Q describe the probabilities of jumps from elements of X to
elements of Y and Y to X, respectively. The elements of P and Q, px,y and qy,x , are transition
probabilities and, as such, are non-negative and normalized so that Σ_{y∈Y} px,y = 1 and Σ_{x∈X} qy,x = 1.

20 In practice, quantum computers have finite precision and can only encode finite (if high-dimensional) sets,
making discrete-space quantum samples most relevant for quantum algorithm applications (Chakrabarti et al.,
2019).
A Markov chain that maps X to X with a transition matrix P is equivalent to a bipartite walk
where qy,x = py,x for every x,y ∈ X or, equivalently, Q = P .
To quantize the bipartite random walk, define a two-register quantum system spanned by |xi|yi
with x ∈ X, y ∈ Y . We start with two unitary operators,
UP : |xi|0i 7→ Σ_{y∈Y} √(px,y ) |xi|yi, VQ : |0i|yi 7→ Σ_{x∈X} √(qy,x ) |xi|yi, (63)

which are quantum equivalents of the transition matrices P and Q. The quantization is based on
the observation that the Grover “diffusion” operator Us (from Section 6.1) is similar to a step of
a random walk over a graph – a transition from each state to all the other N states. In matrix
form, the operator Us (up to an overall negative sign) is such that its off-diagonal elements equal
2/N and the diagonal elements are −1 + 2/N. This unitary operator effectively distributes quantum
probability mass from each node to all the other nodes – a property that led to naming this
operator a “diffusion” operator.
Using the operators UP and VQ we define operators similar to Grover’s diffusion operators:

R1 = 2UP UP† −I, R2 = 2VQ VQ† −I, (64)

where the identity operator I acts on both registers. The quantum walk operator W is defined as
the product of the two diffusion operators

W = R2 R1 . (65)

For a Markov chain from X to X, the expression can be simplified by replacing R2 with SR1 S,
where S is the swap operator, which swaps the two registers:
S = Σ_{x,y} |y,xihx,y|. (66)

This operator is self-inverse, so that S² = I and SR1 S⁻¹ = SR1 S. For a Markov chain we can write

W = S(2UP UP† −I)S(2UP UP† −I), (67)

so that some researchers (see e.g. Chakrabarti et al., 2019) define a step of a quantized Markov
chain as

WM C = S(2UP UP† −I). (68)

A Szegedy quantum walk, similar to a coin walk, is a unitary process and, as such, does not
converge to a stationary distribution. Instead the quantum walk “cycles through” the stationary
distribution π of P , similarly to the way Grover’s search (Section 6.1) or QAA (Section 6.2)
pass through the desired state with a certain period (QAE, described in Section 6.3, exploits this
periodicity). The quantum state analogous to the stationary distribution π, |πi = Σ_x √(π(x)) |xi, is
the highest-eigenvalue eigenstate of the Szegedy quantum walk operator W (Orsucci et al., 2018);
the eigenvalue of eigenstate |πi equals 1, i.e. W |πi = |πi. This important property of the Szegedy
quantum walk can be exploited to develop quantum Markov chain Monte Carlo to sample from π;
see Section 7.3.
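The eigenstate property can be verified numerically for a small chain. The sketch below builds W = S R1 S R1 (Eq. (67), with R2 = SR1 S) for an assumed 3-state reversible chain, using the column-isometry form of the projector onto span{|φx i}; the symmetric P (uniform stationary π) is an arbitrary example.

```python
import numpy as np

# Szegedy walk W for a small reversible Markov chain; check that the lifted
# stationary state sum_{x,y} sqrt(pi(x) p_xy) |x>|y> has eigenvalue 1.
P = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])               # symmetric => reversible, pi uniform
M = P.shape[0]
pi = np.full(M, 1/M)

Phi = np.zeros((M*M, M))                      # columns |phi_x> = |x> (x) sum_y sqrt(p_xy)|y>
for x_ in range(M):
    Phi[x_*M:(x_+1)*M, x_] = np.sqrt(P[x_])
R1 = 2 * Phi @ Phi.T - np.eye(M*M)            # reflection about span{|phi_x>}
S = np.zeros((M*M, M*M))                      # swap |x>|y> -> |y>|x>
for x_ in range(M):
    for y_ in range(M):
        S[y_*M + x_, x_*M + y_] = 1
W = S @ R1 @ S @ R1                           # Eq. (67)

psi_pi = (np.sqrt(pi)[:, None] * np.sqrt(P)).ravel()   # sqrt(pi(x) p_xy)
print(round(np.linalg.norm(W @ psi_pi - psi_pi), 8))   # ~0: eigenvalue 1
```

Reversibility (π(x)p_{x,y} = π(y)p_{y,x}) is what makes the swapped state coincide with itself, so both reflections fix it; this is the fixed point that quantum MCMC algorithms target.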

7.3 Quantum Markov Chain Monte Carlo
Can a quantum walk be used to derive a quantum Markov chain Monte Carlo algorithm for
sampling from a target probability distribution? The answer is yes. Consider a distribution π over
a discrete space X; the problem is to prepare a quantum sample |πi = Σ_{x∈X} √(π(x)) |xi.
Let P be the transition matrix of a classical ergodic Markov chain with the stationary distribution
π; P can be derived based on, e.g., the Metropolis algorithm. Let W (P ) be the Szegedy
quantum walk with respect to P . Then |πi is the unique eigenstate of W (P ) with the eigenvalue
1. All other eigenstates have an eigenphase which is at least quadratically larger than the spectral
gap δ – the difference between the top and the second highest eigenvalues. This property allows
one to use phase estimation (or phase detection) to distinguish |πi from the other eigenstates of
W (P ).
However, the mixing time of the Szegedy quantum walk is O(1/√(δπmin )) steps in general with
πmin = minx∈X π(x) (Aharonov and Ta-Shma, 2007; Montanaro, 2015). This can be problematic
when the size N of X is large. One way to reduce the dependence of mixing time on πmin is
to employ a slowly varying series of quantum walks to reach the desired quantum sample state
(Wocjan and Abeyesinghe, 2008; Wocjan et al., 2009). This is similar to the idea of annealed
sampling. Let P0 ,...,Pr be classical reversible Markov chains with stationary distributions π0 ,...,πr ,
where πr = π, such that each chain has a relaxation time at most τ (Montanaro, 2015). Then given
an easy-to-prepare state |π0 i, e.g. the uniform state (1/√N) Σ_{x∈X} |xi, and the condition that hπi |πi+1 i ≥
p for some p > 0 and ∀i = 1,...,r−1, for any ǫ > 0, there is a quantum algorithm which results in a
quantum sample |π̃r i such that k|π̃r i−|πr ik ≤ ǫ. The algorithm uses O(r√τ log²(r/ǫ)(1/p)log(1/p))
quantum walk steps. Chakrabarti et al. (2019) generalized this approach and used it to design a
quantum MCMC algorithm to speed up evaluation of volume of convex bodies. Further speedup
has been proposed by Magniez et al. (2011) and Orsucci et al. (2018). Additionally, even more
efficient methods exist to reflect about the states |πi i, with a runtime that does not depend on r
(Yoder et al., 2014).
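The two properties quoted above – that |πi is a +1 eigenstate of W (P ) and that the walk's phase gap is quadratically larger than the chain's spectral gap – can be checked numerically on a toy example. The sketch below (an illustrative numerical construction with an arbitrary 4-state target distribution; the walk is built as an explicit 16×16 matrix rather than as a quantum circuit) builds a lazy Metropolis chain and its Szegedy walk W (P ) = (2Π_B − I)(2Π_A − I):

```python
import numpy as np

pi = np.array([0.1, 0.2, 0.3, 0.4])     # target distribution on 4 states
n = len(pi)

# Lazy Metropolis chain with uniform proposal (laziness keeps eigenvalues >= 0)
P = np.zeros((n, n))
for j in range(n):
    for k in range(n):
        if j != k:
            P[j, k] = min(1.0, pi[k] / pi[j]) / (n - 1)
    P[j, j] = 1.0 - P[j].sum()
P = 0.5 * (np.eye(n) + P)

# Szegedy walk W(P) = (2 Pi_B - I)(2 Pi_A - I) on the doubled space C^n (x) C^n
A = np.zeros((n * n, n))
for j in range(n):
    A[:, j] = np.kron(np.eye(n)[j], np.sqrt(P[j]))   # |psi_j> = |j> (x) sum_k sqrt(P_jk)|k>
Pi_A = A @ A.T
S = np.zeros((n * n, n * n))                         # swap of the two registers
for j in range(n):
    for k in range(n):
        S[k * n + j, j * n + k] = 1.0
Pi_B = S @ Pi_A @ S
I2 = np.eye(n * n)
W = (2 * Pi_B - I2) @ (2 * Pi_A - I2)

# (1) |pi> = sum_{j,k} sqrt(pi_j P_jk)|j,k> is fixed by W(P)
psi_pi = np.sqrt(pi[:, None] * P).reshape(-1)
print(np.allclose(W @ psi_pi, psi_pi))               # True

# (2) the phase gap is 2*arccos(1-delta), which exceeds sqrt(delta)
delta = 1.0 - np.sort(np.linalg.eigvals(P).real)[-2]  # classical spectral gap
phases = np.abs(np.angle(np.linalg.eigvals(W)))
phase_gap = phases[phases > 1e-8].min()
print(phase_gap >= np.sqrt(delta))                   # True
```

The fixed point follows from detailed balance (π_j P_jk = π_k P_kj makes psi_pi invariant under the swap), and the quadratic relation from arccos(1−δ) ≥ √(2δ).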
Applications of quantum Monte Carlo and quantum Markov chain Monte Carlo are rich and
varied. They include, for example, speeding up classical annealing approaches to combinatorial
optimization problems (Somma et al., 2008), search (Magniez et al., 2011), speeding up learning
agents (Paparo et al., 2014), derivative pricing (Rebentrost and Lloyd, 2018), and risk analysis
(Woerner and Egger, 2019).
8 Quantum Linear Systems, Matrix Inversion, and PCA
Consider a system of linear equations Ax = b, where A is an M ×N matrix, b is an M ×1 input
vector, and x is an N ×1 solution vector, provided it exists. For a square N ×N well-conditioned
matrix A, the solution to the classical system of linear equations is x= A−1 b. The system of linear
equations powers many applications in statistics and machine learning, especially in high and
ultra-high dimensional settings such as deep learning. The quantum analog of the system of linear
equations takes the form A|xi = |bi, where the quantum states |xi and |bi represent vectors x and
b in amplitude encoding (Section 4.1). The solution is the quantum state |xi = C1 A−1 |bi, where C
is a constant to ensure normalization of the state |xi. Harrow et al. (2009) discovered a quantum
algorithm, known as the HHL algorithm, to solve the quantum system of linear equations in time
that scales with logN, provided that the matrix A is sparse. This offers an exponential speedup
over the best classical algorithms, which require a runtime of at least O(N). The exponential
advantage of the HHL algorithm stems from its use of the Quantum Fourier Transform (QFT) – a
foundational quantum routine and the engine at the core of many quantum algorithms including
Shor’s famous factoring algorithm (Shor, 1994).
The QFT directly exploits quantum parallelism – the ability to apply a function to all elements
of a vector simultaneously if this vector is encoded in a quantum state. The QFT powers the
HHL algorithm through another influential quantum subroutine – Quantum Phase Estimation
(QPE), an algorithm that enables recording of a quantum phase θ, for example in an eigenvalue
eiθ of a unitary matrix, into a computational basis state within an error ǫ with a high probability
(Section 8.2).
This section presents the QFT (Section 8.1) and its many uses in other quantum algorithms,
such as the application of a Hermitian (rather than an unitary) operator to a quantum state
(Section 8.3), finding the solution of a system of linear equations (Section 8.4), fast gradient
computation (Section 8.5), and quantum principal component analysis (Section 8.6). Even though
more efficient ways to perform some of these computations have been discovered recently (see,
e.g. Section 11), the QFT remains an influential and pedagogical quantum subroutine.
8.1 Quantum Fourier Transform (QFT)
The QFT (Coppersmith, 1994) transforms a state |xi = Σ_{m=0}^{N−1} xm |mi, where |mi are binary-encoded basis vectors in the computational basis, so that

QFT: |xi 7→ |yi = Σ_{k=0}^{N−1} yk |ki, (69)

yk = (1/√N) Σ_{m=0}^{N−1} xm e^{i2πkm/N}. (70)
QFT is the quantum equivalent of the classical discrete Fourier transform where a vector x =
(x0 ,..,xN −1 ) is transformed into a vector y = (y0,..,yN −1 ). That is, QFT transforms a superposition
state |xi into a new superposition state |yi whose amplitudes yk are the classical discrete Fourier
transforms of the amplitudes xm .
QFT exploits quantum computers’ ability to encode N-dimensional states using d = ⌈logN⌉
qubits. The structure of the Fourier transform allows the operation to be performed as a series
of O(log^2 N) Hadamard gates, controlled rotations, and swap gates – an exponential improvement in efficiency compared with the O(N logN) operations required by the classical fast Fourier transform.
Because of state preparation and readout, it is difficult to benefit from the quantum speedup of
QFT for estimating the Fourier coefficients. However, QFT serves as a powerful module in other
algorithms, such as Shor’s factoring algorithm, the HHL linear systems algorithm, and many others.
Consider |mi, a basis state of the Hilbert space containing |xi. Assume without loss of generality
that the dimension of the Hilbert space N = 2d . As shown below, it turns out that the QFT of
state |mi is a tensor product of single-qubit states, possible to create in a quantum computer using
a series of one- and two-qubit gates.
The QFT of the state |mi is
QFT|mi = (1/√N) Σ_{k=0}^{N−1} e^{i2πmk/2^d} |ki, (71)
where |ki, k =0,..,N −1 represent basis states of an N =2d -dimensional Hilbert space. These states
can be chosen to be binary numbers from 0 to N −1 expressed in the computational basis such
that, if k is expressed as a binary string k1 k2 ...kd , where kj ∈ {0,1}, each state |ki is a tensor product state of d qubits in state |0i or |1i:

|ki = |k1 i⊗|k2 i⊗...⊗|kd i, (72)

and k = Σ_{j=1}^{d} kj 2^{d−j}. Using this notation, QFT|mi becomes

QFT|mi = (1/2^{d/2}) Σ_{k1 ,k2 ,..,kd} e^{i2πm Σ_{j=1}^{d} kj 2^{d−j}/2^d} |k1 i|k2 i...|kd i = (1/2^{d/2}) Σ_{k1 ,k2 ,..,kd} ⊗_{j=1}^{d} e^{i2πm kj /2^j} |kj i

= (1/√(2^d)) (|0i+e^{i2πm/2} |1i)⊗(|0i+e^{i2πm/2^2} |1i)⊗...⊗(|0i+e^{i2πm/2^d} |1i), (73)
a separable state of d qubits. The exponent e^{i2πm/2^j} effectively extracts the binary “decimal” of m represented as a binary string m = m1 m2 ..md , so that e^{i2πm/2^j} = e^{i2π m1 ..m_{d−j} .m_{d−j+1} ..md} = e^{i2π 0.m_{d−j+1} ..md}, since e^{i2πk} = 1 for any integer k. The QFT of |mi then simplifies to

QFT|mi = (1/√(2^d)) (|0i+e^{i2π 0.m1 m2 ..m_{d−1} md} |1i)⊗(|0i+e^{i2π 0.m2 ..m_{d−1} md} |1i)⊗...⊗(|0i+e^{i2π 0.md} |1i), (74)
a state that can be created via a series of relatively simple single-qubit and two-qubit gates. It is
easy to see that the QFT operator is linear and applies similarly to a linear superposition of states
|mi, i.e. any state |xi.
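The tensor-product form can be verified directly in numpy. The sketch below (a linear-algebra check, not a circuit implementation) builds QFT|mi as the product of d single-qubit factors of Eq. (73) and compares it against the direct definition and against the classical FFT:

```python
import numpy as np

d = 4
N = 2 ** d

def qft_basis_state(m):
    """Build QFT|m> as a tensor product of d single-qubit states, Eq. (73)."""
    state = np.array([1.0 + 0j])
    for j in range(1, d + 1):
        # qubit j carries the relative phase e^{i 2 pi m / 2^j}
        qubit = np.array([1.0, np.exp(2j * np.pi * m / 2 ** j)]) / np.sqrt(2)
        state = np.kron(state, qubit)
    return state

# Direct definition: QFT|m> = (1/sqrt(N)) sum_k exp(i 2 pi m k / N) |k>
for m in range(N):
    direct = np.exp(2j * np.pi * m * np.arange(N) / N) / np.sqrt(N)
    assert np.allclose(qft_basis_state(m), direct)

# Same content via the classical FFT (the inverse FFT matches this sign convention)
e3 = np.zeros(N); e3[3] = 1.0
assert np.allclose(qft_basis_state(3), np.sqrt(N) * np.fft.ifft(e3))
print("ok")
```

The d kron factors replace the N×N transform matrix, which is the source of the exponential gate-count saving described above.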

8.2 Quantum Phase Estimation (QPE)
Let U be a unitary operator with eigenstates |ui. Because the operator U is unitary, its eigenvalues
take the form ei2πθ , where θ ∈ [0,1), and U|ui = ei2πθ |ui, as we discuss in Section 3.3.
Quantum Phase Estimation (QPE) (Kitaev, 1995) is an algorithm to estimate, within a finite
precision, the phase θ of the operator U and record its binary approximation in a quantum state in
the computational basis. QPE is a building block in many algorithms, particularly those requiring
the application of a Hermitian (rather than unitary) operator to a quantum state (Section 8.3).
The linchpin of QPE is the control-unitary gate C-U, which applies the unitary U conditional
on the state of an auxiliary qubit (Section 3.3). Consider the state |0i⊗|ui, where |0i represents the
auxiliary qubit. Applying a Hadamard gate to the auxiliary qubit yields the state √12 (|0i+|1i)⊗|ui.
The controlled-unitary operator C-U acting on the state applies the operator U to state |ui if the
auxiliary qubit is in state |1i and does nothing if the auxiliary qubit is in the state |0i:
C-U (1/√2)(|0i+|1i)⊗|ui = (1/√2)(|0i+e^{i2πθ} |1i)⊗|ui. (75)
Even though the operator U acts on the state |ui, it is the auxiliary qubit state that ends up being
modified because the operator is controlled on the state of this qubit. This effect is called phase
kickback.
To capture the n-bit approximation of θ, θ̃=0.θ1 θ2 ...θn with θj ∈{0,1}, QPE requires n auxiliary
qubits. Each auxiliary qubit is initialized to |0i and then a Hadamard gate is applied to each qubit
to yield the state
(1/2^{n/2}) (|0i+|1i)⊗(|0i+|1i)⊗...⊗(|0i+|1i)⊗|ui. (76)

42
Next, we apply a series of controlled-unitary gates, Cj -U^{2^{j−1}}, which apply the unitary operator U^{2^{j−1}} to state |ui conditional on the state of the auxiliary qubit j. This results in the state

(1/2^{n/2}) (|0i+e^{i2πθ2^0} |1i)⊗(|0i+e^{i2πθ2^1} |1i)⊗...⊗(|0i+e^{i2πθ2^{n−1}} |1i)⊗|ui

= (1/2^{n/2}) (|0i+e^{i2π 2^n θ/2^n} |1i)⊗(|0i+e^{i2π 2^n θ/2^{n−1}} |1i)⊗...⊗(|0i+e^{i2π 2^n θ/2} |1i)⊗|ui

= (1/2^{n/2}) Σ_{k=0}^{2^n −1} e^{i2π(2^n θ) k/2^n} |ki ⊗|ui, (77)
where ks are integers represented as n-bit strings k1 ...kn by the qubits in the auxiliary register as
|ki = |k1 i⊗..⊗|kn i.
If θ is an n-bit number, so that θ = θ̃, then θ2^n is an integer. In this case, the state in the auxiliary register of (77) is the QFT of the n-qubit state |θ2^n i (c.f. Eq. 73 with m = θ2^n ). This state represents θ2^n as a binary integer encoded in the computational basis. However, in general, θ is a number with more than n bits, so that θ ≠ θ̃. In this case, θ2^n is not an integer. Splitting θ into its n-bit approximation θ̃ and a residual δ, δ = θ − θ̃, we can write θ2^n = θ̃2^n + δ2^n , where θ̃2^n is the integer part of θ2^n .
In the last step of QPE, we recover θ̃2n using the inverse QFT applied to the auxiliary register:
QFT^{−1} (1/2^{n/2}) Σ_{k=0}^{2^n −1} e^{i2π(2^n θ) k/2^n} |ki = (1/2^n) Σ_{y=0}^{2^n −1} Σ_{k=0}^{2^n −1} e^{−i2πky/2^n} e^{i2π(2^n θ) k/2^n} |yi

= (1/2^n) Σ_{y=0}^{2^n −1} Σ_{k=0}^{2^n −1} e^{i2π(2^n θ̃−y) k/2^n} e^{i2πδk} |yi (78)
The probability distribution of the superposition state in (78) is peaked around the state |θ̃2^n i, which encodes θ̃2^n in the computational basis.
The last step is to measure the auxiliary register in the computational basis. If δ = 0, i.e. if θ is an n-bit number, then the measurement yields θ2^n with probability 1. If 0 < |δ| ≤ 1/2^{n+1}, then the measurement yields θ̃2^n with probability 4/π^2 or greater (Cleve et al., 1998).21
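The measurement distribution implied by (78) can be computed classically for small n. The sketch below (an illustrative numerical simulation; n and θ are arbitrary choices, with θ not an n-bit fraction) evaluates the outcome probabilities and checks the Cleve et al. (1998) bound:

```python
import numpy as np

n = 6                                 # auxiliary qubits
theta = 9.3 / 2 ** n                  # true phase; delta = 0.3 / 2^n is nonzero

k = np.arange(2 ** n)
# amplitude of outcome y after the inverse QFT, Eq. (78)
amps = np.array([np.mean(np.exp(2j * np.pi * (2 ** n * theta - y) * k / 2 ** n))
                 for y in range(2 ** n)])
probs = np.abs(amps) ** 2
assert np.isclose(probs.sum(), 1.0)   # a valid probability distribution

best = int(np.floor(theta * 2 ** n))  # theta_tilde * 2^n = 9, the n-bit approximation
print(probs[best] >= 4 / np.pi ** 2)  # True: the 4/pi^2 bound holds
```

Increasing n sharpens the peak: the residual δ only redistributes a bounded amount of probability to neighboring outcomes.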

8.3 Applying a Hermitian Operator
Quantum gates are unitary, but it is often useful to apply a Hermitian operator, rather than a
unitary operator, to a quantum state. This can be done in two steps: QPE (Section 8.2) and a
controlled rotation (Section 5.3).
Let H be a Hermitian operator on an N-dimensional Hilbert space, such that N = 2^n for an integer n. The goal is to apply the operator H to an N-dimensional state |ψi. Because the operator H is Hermitian, there exists an orthonormal basis {|ui i}_{i=1}^{N} such that H|ui i = λi |ui i. The scalars λi are the (real) eigenvalues of H. In quantum notation, the expression H = Σ_{i=1}^{N} λi |ui ihui | reflects the fact that H is diagonal in the basis of its eigenvectors. The target state H|ψi then takes the form

H|ψi = Σ_{i=1}^{N} λi |ui ihui |ψi = Σ_{i=1}^{N} λi βi |ui i, (79)
21 By using O(log(1/ǫ)) qubits and discarding the qubits above n, it is possible to increase the probability of measuring θ̃2^n to 1−ǫ.
where the scalar coefficients βi =hui |ψi equal the inner products of the state |ψi with the eigenstates
|ui i.
For any Hermitian operator H, there exists a unitary operator U = e−iHt , where t is a scalar
constant. Because the operator U is unitary, it can be constructed as a series of quantum gates
and applied to a quantum state prepared on n qubits. Alternatively, it is often more efficient to
interpret the operator U as an evolution operator (Section 3.3) by Hamiltonian H over time t and
to apply it using a suitable Hamiltonian simulation algorithm (Section 9). We will use the operator
U to create the target state H|ψi on a quantum computer.
The first step is to express the state |ψi as a linear combination of eigenstates of H using the identity operator I = Σ_{i=1}^{N} |ui ihui |:

|ψi = Σ_{i=1}^{N} |ui ihui |ψi = Σ_{i=1}^{N} βi |ui i. (80)

Note that the state |ψi in (80) does not undergo a transformation; instead, it is simply re-written in the eigenbasis {|ui i}_{i=1}^{N}. The eigenvectors {|ui i}_{i=1}^{N} are also eigenvectors of U, with corresponding eigenvalues e^{−iλi t}.
The algorithm exploits this property to extract the m-bit approximations of eigenvalues λi
using QPE. QPE requires an m-qubit auxiliary register, where m is the desired binary precision of
λi . QPE takes the two-register state |ψi|0i as an input; here, we denote the initialized auxiliary
register as |0i, a shorthand for |0...0i – an m-qubit register with each qubit initialized to |0i. For
each eigenstate |ui i, QPE records λ̃i , the m-bit approximation of λi , in the auxiliary register:
QPE|ψi|0i = Σ_{i=1}^{N} βi QPE|ui i|0i = Σ_{i=1}^{N} βi |ui i|λ̃i i, (81)
exploiting the linearity of QPE.
After applying QPE, the next step is to perform a controlled rotation, as described in Section 5.3. The controlled rotation requires an additional auxiliary qubit and uses the m-qubit register holding λ̃i as the reference register. The arcsin(λ̃i /C) function acts as f (x) in the definition of controlled rotation (Equation 39 in Section 5.3), where C is chosen so that |λ̃i /C| ≤ 1 for all λ̃i (Häner et al. (2018) demonstrated that arcsine is efficiently computable). We obtain:

C-Ry Σ_{i=1}^{N} βi |ui i|λ̃i i|0i = Σ_{i=1}^{N} βi |ui i|λ̃i i (√(1−(λ̃i /C)^2 ) |0i + (λ̃i /C) |1i). (82)
The next step is to measure the auxiliary qubit in the computational basis. If the measurement
yields 0, the quantum state on all registers is discarded and the computation is performed again.
If the measurement yields 1 then the resulting quantum state is:
(1/C1 ) Σ_{i=1}^{N} λ̃i βi |ui i|λ̃i i|1i, (83)

where C1 = (Σ_{i=1}^{N} |λ̃i βi |^2 )^{1/2} is the normalization constant. The number of measurements (and
recomputations) required to achieve the desired state can be reduced using amplitude amplification
(Brassard et al., 2002), described in Section 6.2.
The last step is to uncompute the register |λ̃i i in order to return it to the ground state |0i.
Discarding this register and the auxiliary qubit yields
(1/C1 ) Σ_{i=1}^{N} λ̃i βi |ui i = (1/C1 ) H|ψi, (84)

the desired result up to a normalization constant.
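The linear-algebra content of the routine can be checked without simulating any circuits. The sketch below (an illustrative classical mirror: numpy's eigendecomposition stands in for QPE, and a random Hermitian H and state are arbitrary choices) reproduces the |1i-branch amplitudes λi βi /C and verifies they yield H|ψi after normalization:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
H = rng.normal(size=(N, N)); H = (H + H.T) / 2   # a random Hermitian operator
psi = rng.normal(size=N); psi /= np.linalg.norm(psi)

lam, U = np.linalg.eigh(H)        # H = U diag(lam) U^T
beta = U.T @ psi                  # beta_i = <u_i|psi>, Eq. (80)
C = np.abs(lam).max()             # ensures |lam_i / C| <= 1 in the arcsin rotation

branch1 = (lam / C) * beta        # amplitudes attached to |1>, Eq. (82)
p_success = np.sum(branch1 ** 2)  # probability the auxiliary qubit reads 1
out = U @ branch1                 # rotate back out of the eigenbasis
out /= np.linalg.norm(out)        # state after post-selection, Eq. (84)

target = H @ psi
target /= np.linalg.norm(target)
print(np.allclose(out, target))   # True
```

The success probability p_success shows why amplitude amplification matters: when the |λi βi | are small relative to C, many repetitions would otherwise be needed.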
Similar techniques can be used to implement smooth functions of sparse Hermitian operators
(Subramanian et al., 2019). Rebentrost et al. (2019) used the application of a Hermitian operator
to a quantum state in order to perform gradient descent on a homogeneous polynomial. Homoge-
neous polynomials have the property that the application of a gradient operator is equivalent to
the application of a linear operator. Let f (x) be a homogeneous polynomial of x = (x1 ,...,xN )T .
Then there exists an operator D(x) such that ∇f (x) = D(x)x. Because of this property, it is
possible to estimate the gradient of f (x) using the techniques described in this section.
The gradient descent algorithm starts with an initial guess vector x(0) . Rebentrost et al. (2019) encode this vector in a quantum state |x(0) i and then use the method described in this section to apply the operator D(x(0) ) to the state |x(0) i in order to evaluate the gradient ∇f (x(0) ). For
homogeneous polynomials, the operator D(x(0) ) has a relatively simple structure, which makes
it possible to simulate e−iDt efficiently on a quantum computer using simulation techniques from
the quantum principal component analysis method (Lloyd et al., 2014) described in Section 8.6.
The efficient computation of e−iDt makes it possible to use QPE as in (81) and, therefore, use
the Hermitian operator method described in this section to evaluate gradients for homogeneous
polynomials.

8.4 The HHL Linear Systems Algorithm
The linear systems algorithm by Harrow et al. (2009) exploits the ability to apply a Hermitian
operator to a quantum state in order to solve linear systems of the form Ax= b, where x and b are
N-dimensional vectors and A is an N ×N matrix. Finding the solution x requires the inversion
(or pseudoinversion) of the matrix A, which is computationally expensive for a high-dimensional
matrix.22
To introduce the core of the algorithm, we assume A is a Hermitian matrix and generalize it
at the end. We also assume N = 2^n , where n is an integer. We initialize an n-qubit register and encode the state b in amplitude encoding:

|bi = (1/kbk) Σ_{i=1}^{N} bi |ii, (85)

where the states |ii are n-qubit states in the computational basis; the state of each qubit in the register can (but does not have to) correspond to the binary encoding of the integers i; the amplitudes bi are the elements of vector b; kbk is the normalization constant, kbk = (Σ_{i=1}^{N} |bi |^2 )^{1/2}.
If A is a Hermitian matrix and is invertible, the solution to the quantum linear system A|xi=|bi
is a quantum state |xi such that
|xi = (1/C1 ) A^{−1} |bi, (86)
22 Inversion of an N ×N matrix on a classical computer requires O(N^d ) operations, where 2 < d ≤ 3.
where the constant C1 ensures normalization of |xi. To streamline notation, without loss of
generality, we assume in this section that C1 = 1.
Let {αi }_{i=1}^{N} be the set of eigenvalues of matrix A and {|ai i}_{i=1}^{N} be the set of corresponding eigenstates. The eigenstates of the matrix A^{−1} , which is Hermitian since A is Hermitian, are also {|ai i}_{i=1}^{N} , with eigenvalues {1/αi }_{i=1}^{N} . Using the identity matrix expressed in terms of the eigenvectors of A, I = Σ_{j=1}^{N} |aj ihaj |, we transform the expression for |xi into

|xi = (1/kbk) A^{−1} Σ_{i=1}^{N} bi |ii = (1/kbk) A^{−1} Σ_{i=1}^{N} Σ_{j=1}^{N} bi |aj ihaj |ii = (1/kbk) Σ_{i=1}^{N} Σ_{j=1}^{N} (1/αj ) bi |aj ihaj |ii. (87)

The result in (87) resembles the result of the Hermitian operator routine from Section 8.3 with one difference: in the controlled-rotation of the auxiliary qubit in (82), arcsin(C/α̃i ) replaces arcsin(λ̃i /C). Here, the value α̃i is the approximation of αi obtained by QPE, and the constant C is such that C/|α̃i | ≤ 1. If some values α̃i are very small or 0, regularization techniques, similar to those in classical
matrix inversion, can provide stability (for example, Tikhonov (1963) regularization). The small
values of α̃i can be discarded or collected in a separate state for further analysis. We refer the
reader to Harrow et al. (2009) and Dervovic et al. (2018) for details.
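The eigenvalue-by-eigenvalue inversion in (87) can be mirrored classically. The sketch below (an illustrative check with a random Hermitian A, not the quantum algorithm itself) confirms that inverting in the eigenbasis reproduces the normalized solution of Ax = b:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8
A = rng.normal(size=(N, N)); A = (A + A.T) / 2   # Hermitian system matrix
b = rng.normal(size=N)

alpha, V = np.linalg.eigh(A)               # eigenvalues alpha_j, eigenstates |a_j>
beta = V.T @ (b / np.linalg.norm(b))       # overlaps <a_j|b> / ||b||
x_state = V @ (beta / alpha)               # amplitudes proportional to A^{-1}|b>, Eq. (87)
x_state /= np.linalg.norm(x_state)         # the normalized analog of the state |x>

x_classical = np.linalg.solve(A, b)
x_classical /= np.linalg.norm(x_classical)
print(np.allclose(x_state, x_classical))   # True
```

The division beta / alpha is exactly where small eigenvalues cause the instability discussed above, motivating the regularization or discarding of small α̃i .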
The routine generalizes to the case where A is non-Hermitian. In this case, in place of A, we use the Hermitian matrix IA, where I is the isometry superoperator such that:

IA = [[0, A], [A† , 0]]. (88)

The eigenvectors and eigenvalues of the matrix IA are closely related to the right and left singular vectors and singular values of matrix A – uk , vk , and αk , respectively. Following Harrow et al. (2009), we append an auxiliary qubit and define |a±k i = (1/√2)(|0i|uk i±|1i|vk i), where |uk i = Σ_{i=1}^{N} uik |ii and |vk i = Σ_{j=1}^{M} vjk |ji, k = 1,..,K, and K is the rank of matrix A. The operator IA takes the form IA = |0ih1| Σ_{s=1}^{K} αs |us ihvs | + |1ih0| Σ_{s=1}^{K} αs |vs ihus |.23
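The spectral structure of this Hermitian dilation is easy to verify numerically. The sketch below (an illustrative example with a small random real A, so A† = A^T) checks that IA is Hermitian and that its nonzero eigenvalues are ± the singular values of A, with the zero eigenvalues counted as in footnote 23:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 4, 3
A = rng.normal(size=(M, N))                     # a generic non-Hermitian (rectangular) A

IA = np.block([[np.zeros((M, M)), A],
               [A.T, np.zeros((N, N))]])        # the dilation of Eq. (88)
assert np.allclose(IA, IA.T)                    # IA is Hermitian

sv = np.linalg.svd(A, compute_uv=False)         # singular values of A
ev = np.sort(np.linalg.eigvalsh(IA))
expected = np.sort(np.concatenate([sv, -sv, np.zeros(M + N - 2 * len(sv))]))
print(np.allclose(ev, expected))                # True
```

The eigenvectors pairing ±αk are precisely the states |a±k i defined above, which is why solving the dilated Hermitian system recovers the pseudoinverse action of A.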
P P
The runtime of the algorithm is O(d^4 κ^2 logN/ǫ), where d is the sparsity of the matrix A (the
number of non-zero elements in each row or column of A), κ = αmax /αmin is its condition number
(the ratio of the largest to the smallest singular value), and ǫ is the admissible error. If the matrix
A is sparse and well-conditioned, the algorithm runs exponentially faster than the best classical
linear systems algorithm, with runtime scaling as O(Ndκlog(1/ǫ)) (Saad, 2003).
The result of the quantum linear systems algorithm is a quantum state, which can be passed
onto another quantum subroutine. For example, Schuld et al. (2016) use the building blocks for the
quantum linear systems algorithm embedded in another quantum algorithm to perform prediction
by linear regression. Alternatively, the quantum state can be read out for use by a classical
computer, although the information in the quantum state may need to be compressed in order to
preserve the computational efficiency of the algorithm. Another approach, discussed in Section 8.6, is to characterize the state using quantum PCA.
Since HHL formulated the quantum linear systems problem, |xi = A−1 |bi, many efficient al-
gorithms have been found, with computational complexity gradually improving to reduce the
dependence on the error ǫ, sparsity d, and condition number κ of the matrix A. Ambainis (2012)
23 If K < (N +M )/2, IA has N +M −2K zero eigenvalues, corresponding to the basis states of the orthonormal complement to the 2K-dimensional Hilbert space spanned by |a±k i.
replaced QAA with variable-time amplitude amplification to reduce the dependence on the con-
dition number. Clader et al. (2013) proposed to precondition the matrix A using sparse approx-
imate inverse preconditioning and reduced the dependence on both condition number and error.
Kerenidis and Prakash (2016) and Wossnig et al. (2018) replaced Hamiltonian simulation using
optimized Product Formula (Section 9.2) as in Berry et al. (2007) with a Hamiltonian simulation
based on a quantum walk (Section 9.3), making it possible to deliver exponential speedup to low-
rank (rather than sparse) problems. Childs et al. (2017) decomposed A−1 as a linear combination
of unitaries (Section 9.4) and eschewed QPE, achieving a logarithmic dependence on the error.
Subaşi et al. (2019) and An and Lin (2019) proposed an adiabatic-inspired method. Gilyén et al.
(2019a) used their QSVT method (Section 11) lowering the computational complexity to the mul-
tiplicative lower bounds in all variables and eliminating the dependence on size entirely, so that
the overall complexity of solving the linear systems problem is O(κlog(κ/ǫ)).

8.5 Fast Gradient Computation
The QFT at the core of most algorithms in this section also enables efficient evaluation of gradients.
The quantum gradient algorithm proposed by Jordan (2005) (and refined by Gilyén et al. (2019a))
calculates an approximate N-dimensional gradient ∇f (x) of a function f : RN → R at point x ∈
RN using a single evaluation of f . For comparison, standard classical techniques require N +1
function evaluations. The algorithm is suitable for problems where the function evaluations are
computationally taxing, but the number of dimensions is only moderately high, because it requires
O(N) qubits (and O(N) measurements if the result is to be used in a classical computation).
The algorithm uses a phase oracle (see a brief discussion of oracles in Section 5.3). An oracle is
an algorithmic “black box” assumed to perform a computational task efficiently. A quantum phase
oracle Og adds a phase to a quantum state such that, given a function g(y), we have

Og |yi = e^{ig(y)} |yi. (89)

The algorithm stems from two observations. The first observation is that, if f is twice-
differentiable then, for a vector δ with a sufficiently small norm, the expansion of f (x+ δ) in
the vicinity of x takes the form f (x+δ) = f (x) + ∇f ·δ + O(kδk^2 ). The second observation is that
the phase oracle for f (x+δ) takes a convenient form. To define the phase oracle, we take g =2πDf ,
where D > 0 is a scaling factor necessary for all values of 2πDf on the relevant domain to be
less than 2π. When acting on a quantum state |δi initialized to hold the value of δ, the phase oracle
adds a phase that depends on the value of f : O2πDf : |δi 7→ e^{2πiDf (x+δ)} |δi. Using the expansion of f (x+δ) we can write approximately O2πDf : |δi 7→ e^{2πiDf (x)} e^{2πiD∇f ·δ} |δi. The phase contains the dot
product of the gradient ∇f and the differential δ, enabling the extraction of the gradient using
inverse QFT (Section 8.1). The algorithm works under the assumption that the third and fourth
derivatives of f around x are negligible.
The algorithm starts with a uniform superposition |ψi = (1/√|G^N_x |) Σ_{δ∈G^N_x} |δi over the points of a sufficiently small discretized N-dimensional grid G^N_x around x. Each state |δi reflects coordinates recorded in the computational basis (in an N ×m-qubit register, where m is the binary precision). A single call to the phase oracle O2πDf creates the state

O2πDf |ψi = (e^{2πiDf (x)}/√|G^N_x |) Σ_{δ∈G^N_x} e^{2πiD∇f ·δ} |δi, (90)
ready for the application of the inverse QFT, which extracts ∇̃f , an m-bit approximation of the gradient ∇f . The output of the algorithm is an N ×m-qubit state that records the coordinates of the gradient ∇̃f in the computational basis.
For a step-by-step description of Jordan’s algorithm, we refer the reader to the paper by
Gilyén et al. (2019a) who review the algorithm and modify it to take advantage of central-difference
formulas.
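The phase-to-frequency mechanism behind Jordan's algorithm can be simulated classically in one dimension. In the sketch below (an illustrative simulation; the grid size, spacing h, and phase scale K are arbitrary choices, not values from the papers cited), a single vectorized "oracle call" writes the phase e^{2πiK f(x+lh)} on each grid point, and a Fourier transform, playing the role of the inverse QFT, peaks near the gradient:

```python
import numpy as np

f = np.sin
x = 0.5
G = 2 ** 8               # 2^m grid points (m = 8)
h = 1e-6                 # grid spacing, small enough that f is locally linear
K = 0.9 / h              # phase scaling, chosen so that K*h*|f'(x)| < 1

l = np.arange(G)
# one vectorized "oracle call": phase e^{2 pi i K f(x + l h)} on each grid point
state = np.exp(2j * np.pi * K * f(x + l * h)) / np.sqrt(G)

# the Fourier transform peaks near the bin y = G*K*h*f'(x)
y = np.argmax(np.abs(np.fft.fft(state)))
grad_est = y / (G * K * h)
print(abs(grad_est - np.cos(x)) < 0.01)    # True: recovers f'(0.5) = cos(0.5)
```

The resolution is set by the register size (one part in 2^m here), while curvature terms stay negligible because the whole grid spans only G·h around x.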

8.6 Quantum Principal Component Analysis (QPCA)
Algorithms such as HHL (Section 8.4) yield a result in the form of an unknown quantum state
that has to be characterized. The most straightforward way to characterize the state is to create
multiple copies and take measurements to enable statistical analysis of the state. However, this
approach can be computationally taxing and inefficient, particularly for non-sparse but low-rank
quantum states. Lloyd et al. (2014) propose an alternative method that uses multiple copies of a
quantum system to perform Principal Component Analysis (PCA) of the system, i.e. to extract
its principal components – the eigenstates corresponding to largest eigenvalues. Quantum PCA
(QPCA) performs the task for any unknown low-rank N-dimensional quantum state in O(logN)
runtime, exponentially faster than any existing classical algorithm. The algorithm leverages the
density matrix formalism of quantum theory, an alternative way to describe quantum states,
introduced in Section 3.3.
Let ρ be a density matrix describing a quantum state. The density matrix is Hermitian and has
real eigenvalues rj corresponding to eigenstates |χj i, with j =1,..,N. The goal of the quantum PCA
(QPCA) algorithm is to extract the eigenstates and eigenvalues of ρ. Given |ψi, an N-dimensional
quantum state, and an m-qubit register of auxiliary qubits, QPCA performs the transformation
QPCA|ψi|0i 7→ Σ_j ψj |χj i|r̃j i, (91)

where r̃j are m-bit approximations of the eigenvalues rj and ψj = hχj |ψi. The state in (91) has
a structure similar to that of the intermediate result (83) of the algorithm to apply a Hermitian
operator to quantum state (Section 8.3). This is because the QPCA algorithm treats the density
matrix ρ as a Hermitian operator that can be applied to an arbitrary quantum state. The trouble
is, for the algorithm in Section 8.3 to be efficient, the Hermitian operator H has to be sufficiently
structured for the controlled-unitary with the unitary operator e−iHt to be realized; the density
matrix ρ may lack such structure.
The critical insight at the core of the QPCA is that it is possible to construct the controlled-
unitary C-U with U = e−iρ∆t for any density matrix ρ using a series of swap operations, provided
the increment ∆t is sufficiently small. It is then possible to use Product Formula (Section 9.2)
to develop the controlled unitary with U = e−iρt based on the controlled unitary with U = e−iρ∆t ,
where t = n∆t.
Consider the application of e−iρ∆t to an arbitrary N-dimensional state described by a density
matrix σ:
e^{−iρ∆t} σ e^{iρ∆t} = σ − i∆t[ρ,σ] + O(∆t^2 ). (92)
Lloyd et al. (2014) demonstrate that this operation is equivalent to applying the swap operator to
the tensor product state ρ⊗σ and a subsequent partial trace trP of the first variable:
trP e^{−iS∆t} ρ⊗σ e^{iS∆t} = (cos^2 ∆t)σ + (sin^2 ∆t)ρ − i sin∆t cos∆t [ρ,σ]
= σ − i∆t[ρ,σ] + O(∆t^2 ). (93)
The swap operator S is represented by a sparse matrix and e^{−iS∆t} is computable efficiently.
The derivation of (93) uses the identity (a consequence of S^2 = I)

e^{−iS∆t} = cos(∆t)I − i sin(∆t)S, (94)

so that the expression e^{−iS∆t} ρ⊗σ e^{iS∆t} can be rewritten as:

[cos(∆t)I − i sin(∆t)S] ρ⊗σ [cos(∆t)I + i sin(∆t)S]. (95)

Partial trace over the (cos^2 ∆t) term yields trP Iρ⊗σI = σ. Partial trace over the (sin^2 ∆t) term yields trP Sρ⊗σS = ρ. Partial trace over the sin∆t cos∆t term results in:

trP Sρ⊗σI = Σ_i hi|P Sρ⊗σ|ii_P = Σ_i Σ_{j,k} hi|P Sρ⊗σ|j,kihj,k|ii_P
= Σ_i Σ_{j,k} Σ_{l,m} hi|P |l,mihl,m|Sρ⊗σ|j,kihj,k|ii_P
= Σ_i Σ_{j,k} Σ_{l,m} δ_{i,l} |mihl,m|Sρ⊗σ|j,kihk|δ_{j,i}
= Σ_i Σ_{m,k} |mihi,m|Sρ⊗σ|i,kihk|
= Σ_i Σ_{m,k} |mihm,i|ρ⊗σ|i,kihk|
= Σ_i Σ_{m,k} |mihm|ρ|iihi|σ|kihk|
= Σ_{m,k} |mihm|ρσ|kihk| = ρσ. (96)

The transformation in (93) therefore results in e^{−iρ∆t} σ e^{iρ∆t} , up to O(∆t^2 ). Given multiple copies of ρ, the transformation can be applied repeatedly to provide an ǫ-approximation to e^{−iρt} σ e^{iρt} , which requires n = O(t^2 ǫ^{−1} |ρ−σ|^2 ) ≤ O(t^2 ǫ^{−1} ) steps (a consequence of Suzuki-Trotter theory, see Section 9.2).
The procedure is quite flexible. For example, using matrix inversion in Harrow et al. (2009), it is
possible to implement e−ig(ρ) for any “simply computable” function g(ρ).
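The swap trick of Eqs. (93) and (96) can be checked directly on small explicit matrices. The sketch below (an illustrative numerical check with random single-qubit density matrices, not a quantum simulation) verifies both the partial-trace identity trP S(ρ⊗σ) = ρσ and the first-order evolution:

```python
import numpy as np

def rand_density(n, seed):
    """A random n-dimensional density matrix (Hermitian, PSD, unit trace)."""
    rng = np.random.default_rng(seed)
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = M @ M.conj().T
    return rho / np.trace(rho)

def ptrace_first(X, n):
    """Partial trace over the first n-dimensional register."""
    return np.einsum('ijik->jk', X.reshape(n, n, n, n))

n = 2
rho, sigma = rand_density(n, 0), rand_density(n, 1)

S = np.zeros((n * n, n * n))                 # swap operator on C^n (x) C^n
for j in range(n):
    for k in range(n):
        S[k * n + j, j * n + k] = 1.0

# Eq. (96): tr_P S (rho (x) sigma) = rho sigma
assert np.allclose(ptrace_first(S @ np.kron(rho, sigma), n), rho @ sigma)

# Eq. (93): tr_P e^{-iS dt} (rho (x) sigma) e^{iS dt} = sigma - i dt [rho,sigma] + O(dt^2)
dt = 1e-3
U = np.cos(dt) * np.eye(n * n) - 1j * np.sin(dt) * S   # Eq. (94), since S^2 = I
reduced = ptrace_first(U @ np.kron(rho, sigma) @ U.conj().T, n)
target = sigma - 1j * dt * (rho @ sigma - sigma @ rho)
print(np.abs(reduced - target).max() < 5 * dt ** 2)    # True: residual is O(dt^2)
```

Halving dt quarters the residual, which is the scaling the Suzuki-Trotter step count above relies on.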
Armed with e^{−iρt} , the next step is to apply QPE in order to record m-bit approximations of the eigenvalues ri in the auxiliary register (Section 8.2), transforming any initial state |ψi|0i into the desired state Σ_i ψi |χi i|r̃i i.

9 Hamiltonian Simulation
9.1 Overview and Preliminaries
Introduced briefly in Section 3.3, the Hamiltonian of an N-dimensional quantum system is a
Hermitian operator that guides the evolution of the system over time. Let H be a Hamiltonian.
A system in the initial state |ψi evolves over time according to the unitary operator e−iHt
|ψ(t)i = e^{−iHt} |ψi, (97)
provided the Hamiltonian H stays unchanged during the period of evolution.
The time evolution of the state |ψi can be expressed in terms of eigenvalues and eigenstates
of the Hamiltonian H. Let |λi be the eigenstates of the Hamiltonian H with eigenvalues λ:
H|λi = λ|λi. Then we can express the initial state |ψi in terms of the eigenstates |λi:
|ψi = Σ_λ hλ|ψi|λi ≡ Σ_λ ψλ |λi, (98)

where ψλ ≡ hλ|ψi. The state |ψ(t)i evolved under H over time t takes the form
|ψ(t)i = Σ_λ ψλ e^{−iλt} |λi. (99)
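The eigenbasis picture of Eqs. (98)-(99) can be sanity-checked numerically. The sketch below (an illustrative 4-dimensional example with a random Hamiltonian) evolves a state through the eigenbasis and verifies unitarity, the composition law, and the Schrödinger equation at t = 0:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
H = rng.normal(size=(N, N)); H = (H + H.T) / 2    # a random Hamiltonian
psi = rng.normal(size=N).astype(complex)
psi /= np.linalg.norm(psi)

lam, V = np.linalg.eigh(H)                        # H|lambda> = lambda |lambda>

def evolve(state, t):
    """|psi(t)> = sum_lambda psi_lambda e^{-i lambda t} |lambda>, Eq. (99)."""
    coeff = V.conj().T @ state                    # psi_lambda = <lambda|psi>
    return V @ (np.exp(-1j * lam * t) * coeff)

assert np.isclose(np.linalg.norm(evolve(psi, 1.7)), 1.0)             # unitarity
assert np.allclose(evolve(evolve(psi, 0.8), 0.9), evolve(psi, 1.7))  # composition
dt = 1e-6
# Schrodinger equation at t = 0: d/dt |psi(t)> = -i H |psi>
assert np.allclose((evolve(psi, dt) - psi) / dt, -1j * H @ psi, atol=1e-4)
print("ok")
```

Hamiltonian simulation algorithms approximate exactly this map without ever diagonalizing H, which is infeasible for the exponentially large N of interest.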

Hamiltonian simulation is an approximation of the evolution of a system using the evolution of a different system – usually one that is simpler or easier to control with a basic set of controllable operations. In his classic work on universal quantum simulation, Lloyd (1996) draws a
parallel between Hamiltonian simulation and parallel parking a car, which is possible even though
a car is only able to move backward and forward. Using a simpler, controllable quantum system,
Hamiltonian simulation approximates the result of the evolution over time t of a more complex
quantum system.
The goal of Hamiltonian simulation is to evolve the initial state |ψi in (98) in a way that creates
pre-factors e−iλt in front of each |λi, within ǫ-error. The creation of these pre-factors is equivalent
to the evolution of |ψi under Hamiltonian H.
The ability to simulate Hamiltonian evolution efficiently on a quantum computer will not only
revolutionize molecular engineering but also allow us to tackle computationally hard problems such
as combinatorial optimization and high-dimensional linear systems of equations. Because of its
central role in quantum computing applications, Hamiltonian simulation is an active and rapidly
evolving field. Additionally, Hamiltonian simulation is BQP-complete (Low and Chuang, 2019;
Berry et al., 2015b) (see Section 4.3 for a brief discussion of computational complexity). In the
words of Low and Chuang (2019), Hamiltonian simulation is a “universal problem that encapsulates
all the power of quantum computation.”

Simulatable Hamiltonians and Simulation Limits
A simulatable Hamiltonian is one for which the unitary evolution operator e^{−iHt} can be approximated by a quantum circuit efficiently, i.e. with cost at most polynomial in the precision of the circuit and at most polynomial in the evolution time t (Aharonov and Ta-Shma,
2003). There is no Hamiltonian simulation algorithm able to simulate the evolution under a
general Hamiltonian in poly(kHkt,logN), where kHk is the spectral norm (defined in Eq. 103)
(Childs and Kothari, 2009). Additionally, there is no general algorithm to simulate a general
sparse Hamiltonian in time less than linear in kHkt – a statement known as the No Fast-Forwarding
Theorem (Berry et al., 2007).

Hamiltonian Input Models
Efficient Hamiltonian simulation algorithms exploit the structure of the Hamiltonian.24 Efficient
simulation algorithms have been proposed for time-independent Hamiltonians such as Hamilto-
nians that are linear combinations of local terms – terms acting on a small number of qubits
24 Just as it is not possible to implement an arbitrary unitary efficiently, it is not possible to efficiently simulate an arbitrary Hamiltonian (Childs and Kothari, 2009).
(Lloyd, 1996; Berry et al., 2007), sparse Hamiltonians that have at most polylog(N) entries in
each row (Aharonov and Ta-Shma, 2003; Berry et al., 2007; Berry and Childs, 2012; Berry et al.,
2014), Hamiltonians that comprise a linear combination of unitaries (LCU) (Childs and Wiebe,
2012), and low-rank Hamiltonians (Berry and Childs, 2012; Wang and Wossnig, 2018).
Hamiltonian simulation algorithms specify input models that make the terms of the Hamil-
tonian accessible to the quantum computer. The most popular input models include black-box,
sparse access, QROM, and LCU models.
The black-box input model, proposed by Grover (2000) and refined by Sanders et al. (2019),
uses a unitary oracle O_H that returns matrix elements H_jk for given indices j,k in a binary format:

O_H |j,k⟩|z⟩ = |j,k⟩|z ⊕ H_jk⟩,     (100)

where ⊕ represents the bit-wise XOR. The oracle OH represents a quantum algorithm that performs
the encoding of Hamiltonian terms into a quantum state. For example, Sanders et al. (2019)
propose an algorithm that performs OH using quantum computing primitives, such as Toffoli
gates.
If the Hamiltonian is sparse, with at most d nonzero entries in any row, then the sparse
access input model helps drive further efficiency (Aharonov and Ta-Shma, 2003). This model uses
two unitary oracles, O_H – the black-box Hamiltonian oracle – and O_F – the address oracle:

O_H |j,k⟩|z⟩ = |j,k⟩|z ⊕ H_jk⟩     (101)

O_F |j,l⟩ = |j⟩|f(j,l)⟩,     (102)

where f(j,l) is a function that gives the column index of the lth non-zero element in row j. The
sparse access model has been the most popular input model for sparse Hamiltonian simulation.
Aharonov and Ta-Shma (2003) defined a Hamiltonian as row-sparse if the number of non-zero
entries in each row is O(polylog N) or lower. A Hamiltonian is called row-computable if there exists
an efficient (i.e. requiring O(polylog N) or fewer operations) algorithm – quantum or classical –
that, given a row index i, outputs a list of pairs (j, H_ij) of all non-zero entries in that row. Row-sparse
and row-computable Hamiltonians are simulatable, provided they have a bounded spectral norm
‖H‖ ≤ O(polylog N).^25
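As a classical sketch of what the two sparse-access oracles return (the quantum versions answer the same queries in superposition; the 3×3 Hamiltonian below is a hypothetical example), the compressed-sparse-row layout of a matrix directly supplies both O_H and O_F:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Classical sketch of the sparse-access oracles for a hypothetical Hamiltonian.
H = csr_matrix(np.array([[2.0, 0.0, 1.0],
                         [0.0, 3.0, 0.0],
                         [1.0, 0.0, 4.0]]))

def O_F(j, l):
    """f(j, l): column index of the l-th non-zero element in row j."""
    return int(H.indices[H.indptr[j]:H.indptr[j + 1]][l])

def O_H(j, k):
    """Matrix element H_jk."""
    return float(H[j, k])

assert O_F(0, 1) == 2 and O_H(0, O_F(0, 1)) == 1.0
```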
For Hamiltonians decomposed into a linear combination of unitaries, the input model provides
the constituent unitaries and their weights.
An increasingly popular input model is the QROM input model, based on the classical QROM
structure (Kerenidis and Prakash, 2016; Chakraborty et al., 2018), outlined in Section 4.1, that
provides efficient quantum access to Hamiltonian terms. This input structure is used in Hamilto-
nian simulation based on quantum singular value transformation (Gilyén et al., 2019b), found to
be a unifying framework for the top quantum algorithms of the last two decades (Martyn et al.,
2021).

Matrix Norms
For reference, we include the most common matrix norms used in Hamiltonian simulation literature.
The matrix norms arise in normalizations of quantum states encoding Hamiltonian terms and
in estimations of computational complexity. For rigorous definitions of matrix norms and the
relationships between them see, e.g., Childs and Kothari (2009).
25. This statement is called the sparse Hamiltonian lemma due to Aharonov and Ta-Shma (2003).

The spectral norm ‖H‖ of Hamiltonian H is defined as

‖H‖ = max_{‖v‖=1} ‖Hv‖ = σ_max(H),     (103)

where, for a vector v, ‖v‖ represents the Euclidean 2-norm, and σ_max(H) denotes the largest singular
value. The spectral norm is the matrix 2-norm, induced by the vector 2-norm; it is sometimes denoted
‖H‖_2. Similarly, the matrix 1-norm induced by the vector 1-norm is ‖H‖_1 = max_j Σ_i |H_ij|.
The max norm is

‖H‖_max = max_{i,j} |H_ij|.     (104)

The Frobenius norm, also called the Hilbert-Schmidt norm, is

‖H‖_F = √(Σ_{i,j} |H_ij|²) = √(Σ_k σ_k²(H)),     (105)

where σ_k(H) are the singular values of H.
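The definitions (103)–(105) can be checked numerically on a small symmetric matrix (a NumPy sketch, not part of the original text; the matrix is an arbitrary example):

```python
import numpy as np

# The norms (103)-(105) computed for a small symmetric matrix.
H = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, -2.0]])

spectral = np.linalg.norm(H, 2)           # ||H||: largest singular value
one_norm = np.abs(H).sum(axis=0).max()    # ||H||_1: max absolute column sum
max_norm = np.abs(H).max()                # ||H||_max: largest |H_ij|
frobenius = np.linalg.norm(H, 'fro')      # ||H||_F: sqrt of sum of |H_ij|^2

sing = np.linalg.svd(H, compute_uv=False)
assert np.isclose(spectral, sing[0])                    # eq. (103)
assert np.isclose(frobenius, np.sqrt((sing**2).sum()))  # eq. (105)
assert max_norm <= spectral <= frobenius                # standard ordering
```

The final assertion reflects the general ordering ‖H‖_max ≤ ‖H‖ ≤ ‖H‖_F used in complexity comparisons between input models.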

Overview of Hamiltonian Simulation Algorithms


As a critical potential application of quantum computers and a linchpin of other algorithms, such
as quantum linear systems algorithms, Hamiltonian simulation is an active area of algorithm de-
sign. In this section, we review a few influential methods of Hamiltonian simulation. We start with
a foundational method Product Formula that splits the Hamiltonian into easy-to-simulate parts
and then uses sequences of small subsystem simulations to approximate whole-system Hamiltonian
evolution (Section 9.2). We then overview Hamiltonian simulation by quantum walk, an influential
method that works for general Hamiltonians and is highly efficient for sparse Hamiltonians (Sec-
tion 9.3). We also summarize the method of linear combination of unitaries for Hamiltonians that
can be expressed as linear combinations of unitary operators (Section 9.4). Last, we demonstrate
how to use quantum signal processing in combination with the quantum walk method to
perform Hamiltonian simulation with computational efficiency that matches all known lower bounds
additively (Section 9.5).

9.2 Product Formula


One of the foundational methods of Hamiltonian simulation is the Lie-Trotter-Suzuki Product
Formula (Suzuki, 1990, 1991). This approach works for time-independent local or, more generally,
separable Hamiltonians. Many Hamiltonians^26 H can be expressed as a sum of l Hamiltonians H_j:
H = Σ_{j=1}^{l} H_j. If the time evolution operator e^{−iH_j t} is relatively easy to apply to a quantum state,
for example because it represents a simple sequence of rotation gates, then it is beneficial
to express the time evolution e^{−iHt} under the Hamiltonian H as a function of the time evolution operators
e^{−iH_j t} of the constituent Hamiltonians H_j.
In general e^{−iHt} ≠ Π_{j=1}^{l} e^{−iH_j t}, but we can follow Suzuki (1990, 1991) and adopt the classical
Lie-Trotter formula for exponentiated matrices:

e^{−iHt} = e^{−i Σ_{j=1}^{l} H_j t} = lim_{r→∞} ( Π_{j=1}^{l} e^{−iH_j t/r} )^r.     (106)

26. Particularly local Hamiltonians that usually apply to real physical systems.

Table 1: Hamiltonian simulation algorithms. The table demonstrates the gradual improvement of
query complexity of Hamiltonian simulation with respect to the essential parameters of the simulation:
the dimension of the Hilbert space N, admissible error ε, and the sparsity parameter d, equal
to the number of non-zero elements in each row (column) of the Hamiltonian.

Algorithm | Citation | Method | Query Complexity
Lie-Suzuki-Trotter Product Formula | Lloyd (1996) | Finite sum of l local Hamiltonians (split into r parts) | O(lr)
Adiabatic Hamiltonian evolution | Aharonov and Ta-Shma (2003) | Adiabatic evolution for row-sparse, row-computable Hamiltonians | O(poly(log N, d)(‖H‖t)²/ε)
Optimized Product Formula | Berry et al. (2007) | Efficiently decompose Hamiltonian into a sum of local Hamiltonians | O(d⁴ (log* N ‖Ht‖)^{1+o(1)})
Quantum walk-based Hamiltonian simulation | Berry and Childs (2012) | Combine quantum walk with quantum phase estimation for general Hamiltonians with black-box access | O(d^{2/3} [(log log d)‖H‖t]^{4/3}/ε^{1/3})
Linear combination of unitaries (LCU) | Childs and Wiebe (2012) | Decompose a Hamiltonian into a finite linear combination of unitaries | O(d ‖H‖_max t/√ε)
Taylor-series based | Berry et al. (2015a) | Taylor series expansion of e^{−iHt} | O(τ log(τ/ε)/log log(τ/ε)), where τ = d² ‖H‖_max t
BCK | Berry et al. (2015b) | Combine quantum walk with linear combination of unitaries | O(τ log(τ/ε)/log log(τ/ε)), where τ = d ‖H‖_max t
Quantum signal processing | Low and Chuang (2017) | Combine quantum walk with quantum signal processing | O(τ + log(1/ε)/log log(1/ε)); matches the theoretical additive lower bound of Berry et al. (2015b)

For a finite r, the product method approximates the time evolution under H by a sequence of time
evolutions over short segments of time t/r under the constituent Hamiltonians H_j.
This method is flexible and robust, but it requires O(lr) queries to the Hamiltonians H_j. For
an N-dimensional system, the number l of simple-to-apply constituent Hamiltonian evolution
operators e^{−iH_j t} can be of order O(N). This can be efficient in many situations, but does not yield
exponential speedup relative to classical methods. Additionally, the algorithm scales superlinearly
in simulation time t, above the optimal linear dependence on t. The approach also suffers
from poor scaling, O(d⁴), in the sparsity d.
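The convergence in r behind eq. (106) is easy to verify classically (a NumPy/SciPy sketch on random Hermitian matrices, not part of the algorithm itself):

```python
import numpy as np
from scipy.linalg import expm

# Lie-Trotter sketch (eq. 106): approximate e^{-i(H1+H2)t} by r alternating
# short evolutions; the error shrinks as O(1/r) for this first-order formula.
rng = np.random.default_rng(1)

def rand_herm(n):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

H1, H2, t = rand_herm(4), rand_herm(4), 1.0
exact = expm(-1j * (H1 + H2) * t)

def trotter(r):
    step = expm(-1j * H1 * t / r) @ expm(-1j * H2 * t / r)
    return np.linalg.matrix_power(step, r)

errs = [np.linalg.norm(trotter(r) - exact, 2) for r in (1, 10, 100)]
assert errs[0] > errs[1] > errs[2]   # more Trotter steps, smaller error
```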

9.3 Hamiltonian Simulation by Quantum Walk


Childs (2010) proposed a way to simulate Hamiltonian evolution using a quantum walk (Sec-
tion 7.2). This method became a part of many efficient and influential algorithms, such as the
Berry et al. (2015b) and Low and Chuang (2019) algorithms, considered the state of the art in
Hamiltonian simulation at the time of writing this review.
The method uses two properties of quantum walks. First, it is possible to construct a quantum
walk operator from a Hamiltonian, provided the elements of the Hamiltonian can be encoded in
a quantum state. Second, each eigenvector of the quantum walk unitary corresponds to an eigenvector |λ⟩ of the simulated Hamiltonian, with eigenvalues ±e^{±i arcsin λ}. Childs (2010) demonstrates
that it is possible to combine these two properties of quantum walks with QPE (Section 8.2) to
perform Hamiltonian simulation.
Hamiltonian simulation via a quantum walk proceeds in a duplicated Hilbert space. For a
Hamiltonian acting on an N = 2ⁿ-dimensional space spanned by n qubits, first an auxiliary qubit is appended,
doubling the dimension of the Hilbert space. Then another register of n+1 qubits is appended,
expanding the Hilbert space for the quantum walk. To implement the walk, an operator T is
defined that acts on the doubled Hilbert space:

T = Σ_{j=0}^{N−1} Σ_{b∈{0,1}} (|j⟩⟨j| ⊗ |b⟩⟨b|) ⊗ |ϕ_jb⟩,     (107)

where the first register is the register that spans the Hilbert space of H, the second register is
the auxiliary qubit, and the state |ϕjbi is across the third and fourth registers that duplicate the
Hilbert space of H and the auxiliary qubit. The state |ϕjb i encodes the absolute values of the
non-zero elements of H as follows:

|ϕ_j1⟩ = |0⟩|1⟩     (108)

|ϕ_j0⟩ = (1/√d) Σ_{l∈F_j} |l⟩ ( √(H*_jl/X) |0⟩ + √(1 − |H_jl|/X) |1⟩ ),     (109)

where X ≥ ‖H‖_max and F_j is the set of non-zero elements of H in column j.


The unitary U corresponding to a Szegedy quantum walk (Section 7.2) takes the form

U = iS(2T T † −I), (110)

where the swap operator S swaps the two registers (S|a1 i|a2 i|b1 i|b2 i = |b1 i|b2 i|a1 i|a2 i) and I is
identity that acts on both registers.

The random walk unitary U has properties that make it an effective building block of Hamiltonian
simulation: the eigenstates of U are related to the eigenstates of H, and the eigenvalues of U
are close to the desired prefactor of Hamiltonian simulation e^{−itλ}:

U|µ_±⟩ = µ_± |µ_±⟩     (111)
|µ_±⟩ = (T + iµ_± ST)|λ⟩|0⟩     (112)
µ_± = ±e^{±i arcsin(λ/‖H‖₁)}.     (113)

Childs (2010) was the first to propose using the quantum walk phase e^{±i arcsin λ} to construct
the exponent e^{−itλ} that simulates Hamiltonian evolution. His Hamiltonian simulation algorithm
uses QPE (Section 8.2) and a special transformation F_t |θ,φ⟩ = e^{−it sin φ}|θ,φ⟩ to induce with high fidelity
the phase e^{−iλ̃t}, where λ̃ is the approximation of λ obtained from QPE.
Berry et al. (2015b) further optimized the method by using a linear combination of unitaries
(LCU) to construct the prefactor e^{−itλ} from e^{±i arcsin λ}. Instead of relying on QPE and a functional
transformation, Berry et al. (2015b) decompose e^{−itλ} into a series of powers of e^{±i arcsin λ}:

Σ_{m=−∞}^{∞} J_m(z) µ_±^m = e^{iz λ/‖H‖₁},     (114)

where J_m(z) are Bessel functions of the first kind. As a result, effectively we have:

Σ_{m=−∞}^{∞} J_m(−t) U^m = e^{−itH/‖H‖₁}.     (115)

The sum in (115) truncated to |m| ≤ k is an efficient approximation of the desired phase e^{−itH/‖H‖₁}.
The success probability for applying U^m decays with m, limiting the efficiency of this algorithm. As will be
discussed in the next section, Low and Chuang (2017) circumvent this problem by proposing an
alternative route from the prefactors e^{±i arcsin λ} to e^{−itλ}.
The query complexity of Hamiltonian simulation based on a quantum walk is linear in t and
in d, the sparseness parameter.
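The Bessel-series identity behind eq. (115) can be checked numerically for a single eigenvalue λ (a scalar SciPy sketch with ‖H‖₁ set to 1; the eigenvalue, time, and truncation order are arbitrary choices):

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind J_m

# Scalar check of the expansion behind eq. (115):
# sum_m J_m(-t) mu^m = e^{-i t lambda}, with mu = e^{i arcsin(lambda)}
# (taking ||H||_1 = 1 so that lambda stands in for lambda/||H||_1).
lam, t, k = 0.6, 2.0, 20          # arbitrary eigenvalue, time, truncation order
mu = np.exp(1j * np.arcsin(lam))
approx = sum(jv(m, -t) * mu**m for m in range(-k, k + 1))
exact = np.exp(-1j * t * lam)
assert abs(approx - exact) < 1e-10  # truncation error decays rapidly in k
```

The rapid decay of J_m(−t) in m is exactly what makes the truncation to |m| ≤ k efficient.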
Hamiltonian simulation is also a way to apply a unitary, since for any unitary there is a
corresponding Hamiltonian. The quantum-walk approach enables the implementation of an N×N
unitary in Õ(N^{2/3}) queries, with typical unitaries requiring Õ(√N) queries (Berry and Childs,
2012).

9.4 Linear Combination of Unitaries


The linear combination of unitaries (LCU) method uses a series of controlled unitaries combined
with multi-qubit rotations to apply a complex unitary U such that

U = Σ_j β_j V_j,     (116)

where the weights β_j > 0.

Let B be a unitary operator such that

B|0⟩ = (1/√s) Σ_j √β_j |j⟩,     (117)

where s = Σ_j β_j.
The circuit applies B to an auxiliary register initialized in |0⟩, then the unitaries V_j controlled
on the auxiliary register, and finally B† to the auxiliary register. Applied to |0⟩|ψ⟩, this sequence
yields

(1/s) |0⟩U|ψ⟩ + √(1 − 1/s²) |Ψ⊥⟩,     (118)

where the state |Ψ⊥⟩ lies in the subspace in which the auxiliary register is orthogonal to |0⟩, i.e. it is orthogonal
to any state of the form |0⟩|•⟩, including |0⟩U|ψ⟩.
Oblivious amplitude amplification is then used to amplify the state |0⟩U|ψ⟩. Unlike the original QAA
(Section 6.2), oblivious amplitude amplification (developed by Berry et al. (2014) based on work
by Marriott and Watrous (2005)) provides a way to amplify an a priori unknown state.
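The B, controlled-V_j, B† sequence can be verified with plain linear algebra (a NumPy sketch with hypothetical weights and single-qubit unitaries; the key identity is that the auxiliary-|0⟩ block of the output equals U|ψ⟩/s):

```python
import numpy as np

# Linear-algebra sketch of the LCU circuit for U = beta_0 V_0 + beta_1 V_1.
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
V = [I2, X]                       # hypothetical constituent unitaries
beta = np.array([1.5, 0.5])       # hypothetical positive weights
s = beta.sum()

# B prepares (1/sqrt(s)) sum_j sqrt(beta_j)|j> from |0>, as in eq. (117)
col = np.sqrt(beta / s)
B = np.column_stack([col, [-col[1], col[0]]])   # completed to a unitary

# "Select" operator: apply V_j controlled on the auxiliary state |j>
select = np.block([[V[0], np.zeros((2, 2))],
                   [np.zeros((2, 2)), V[1]]])

psi = np.array([1.0, 0.0])
# B is real here, so B-dagger is just B.T
out = np.kron(B.T, I2) @ select @ np.kron(B, I2) @ np.kron([1.0, 0.0], psi)

# The auxiliary-|0> block carries U|psi>/s, matching eq. (118)
U = beta[0] * V[0] + beta[1] * V[1]
assert np.allclose(out[:2], U @ psi / s)
```

The 1/s amplitude is what oblivious amplitude amplification subsequently boosts.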
Linear combination of unitaries enables Hamiltonian simulation through an expansion of e^{−iHt},
for example as a Taylor series (Berry et al., 2015a) or a series of quantum walk steps (Berry et al.,
2015b). For the Taylor series expansion, the Hamiltonian is expressed as a linear combination
of unitaries H = Σ_l α_l H_l, where each constituent Hamiltonian H_l is unitary. Then the Taylor
expansion of e^{−iHt} is a linear combination of unitaries

e^{−iHt} ≈ Σ_{k=0}^{K} ((−it)^k / k!) Σ_{l_1,…,l_k} α_{l_1} … α_{l_k} H_{l_1} … H_{l_k}.     (119)

Alternatively, Berry et al. (2015b) approximate the evolution operator e−iHt as a series of quan-
tum walk steps as shown in (115), where powers of the quantum walk operator U are unitary.
The expansion of the quantum evolution operator e^{−iHt} as a series of quantum walk steps is
one of the most efficient Hamiltonian simulation methods to date, requiring O(τ log(τ/ε)/log log(τ/ε))
queries to the input oracles O_H and O_F, where τ = d ‖H‖_max t.
The dependence on time is nearly linear, close to the lower bound set by the No Fast-Forwarding
Theorem (Berry et al., 2007). The dependence on the sparsity d is also nearly linear, while the
dependence on the simulation error ε is sublinear. The computational complexity depends on the
dimension of the system only indirectly, through the matrix norm of the Hamiltonian ‖H‖_max. Additionally,
the gate complexity of this method – the number of gates required to implement it – is only
slightly larger than query complexity. A small disadvantage of the method is that, because it is
based on quantum walks, it requires duplicating the register of qubits simulating the quantum
systems. It is possible to extend the method to time-dependent Hamiltonians (Berry et al., 2020).
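The truncated Taylor expansion underlying eq. (119) can be checked classically (a NumPy/SciPy sketch on a random Hermitian matrix rescaled to unit spectral norm; the truncation order K is an arbitrary choice):

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

# Truncated Taylor series of e^{-iHt} (the expansion behind eq. 119),
# checked classically on a random Hermitian matrix rescaled to ||H|| = 1.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2
H /= np.linalg.norm(H, 2)

t, K = 1.0, 12
approx = sum((-1j * t)**k * np.linalg.matrix_power(H, k) / factorial(k)
             for k in range(K + 1))
err = np.linalg.norm(approx - expm(-1j * H * t), 2)
assert err < 1e-9     # remainder is about 1/(K+1)! when ||H|| t = 1
```

The factorial decay of the remainder is what gives the method its logarithmic dependence on 1/ε.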

9.5 Hamiltonian Simulation by Quantum Signal Processing (QSP)


Low and Chuang (2017) proposed an alternative way to perform Hamiltonian simulation using
a method they called “quantum signal processing” (QSP), because some of their methodologies
are analogous to filter design in classical signal processing. (We discuss QSP in greater detail in

Section 11.1.) The time complexity of the method matches the proved lower bounds for Hamilto-
nian simulation of sparse Hamiltonians. The method uses a single auxiliary qubit to encode the
eigenvalues of the simulated Hamiltonian, and then transforms these eigenvalues using a sequence
of rotation gates applied to the auxiliary qubit. Using specialized sequences of rotation gates, the
method makes it possible to perform polynomial functions of degree d on the input using O(d)
elementary unitary operations.
QSP borrows ideas from quantum control – the area of quantum computing that deals with
extending the life of qubits and improving the fidelity of quantum computation. Low and Chuang
(2017) point out that Hamiltonian simulation is a mapping of a physical system onto a different
physical system that can be controlled more precisely. Because of this connection between Hamiltonian
simulation and quantum control, it is possible to create robust quantum simulations using the QSP
framework.
Let W be a quantum walk unitary that encodes the Hamiltonian H as in Section 9.3. Consider
a function e^{ih(θ)} = e^{−it sin θ}, where θ = arcsin(λ/‖H‖₁), so that e^{ih(θ)} = e^{−itλ/‖H‖₁}. QSP provides an efficient
way to calculate a finite approximation to the Fourier transform of e^{ih(θ)}.


Low and Chuang (2017) observe that e^{ih(θ)} splits into real and imaginary parts, A(θ) and C(θ),
respectively:

A(θ) + iC(θ) = e^{ih(θ)} = e^{−it sin θ},     (120)

and find their Fourier transforms using the Jacobi-Anger expansion

cos(τ sin θ) = J_0(τ) + 2 Σ_{k even > 0} J_k(τ) cos(kθ)     (121)

sin(τ sin θ) = 2 Σ_{k odd > 0} J_k(τ) sin(kθ),     (122)

where J_k(z) are Bessel functions of the first kind.


The method turns out to have query complexity that matches all known lower bounds additively.

10 Quantum Optimization
Optimization is the process of finding x such that f(x), where f: ℝⁿ → ℝ, is minimized. Optimization
plays an important role in many statistical methods, including most machine learning
methods. A subset of optimization problems – combinatorial optimization problems of finding an
n-bit string that satisfies a number of Boolean conditions – represents some of the most computationally
difficult tasks for classical computers to solve. These problems, which belong to the
computational complexity class NP, include practical tasks such as manufacturing scheduling or
finding the shortest route for delivery to multiple locations. To solve NP-hard problems, classical
computers need time or memory that scales, in the worst case, exponentially with n. Further, some
of these problems belong to the NP-complete subset of NP-hard problems: they can be
converted into one another in time polynomial in n, so an efficient solution to a single NP-complete
problem would solve all problems in the computational class NP.
While it is not likely that quantum computers can solve NP-complete problems, they can
solve some problems believed to be classically intractable (such as prime factorization) and efficiently deliver controlled approximations
to some NP-complete problems. The Quantum Approximate Optimization Algorithm

(QAOA) is the most famous quantum algorithm for the approximate solution of combinatorial
optimization problems. The relative efficiency of QAOA for a general combinatorial problem com-
pared with the best classical algorithm to solve the problem is not known (see, e.g. Zhou et al.,
2020); however, we do know that, under plausible complexity-theoretic assumptions, it is not possible to efficiently simulate QAOA
on a classical computer (Farhi and Harrow, 2016). Additionally, QAOA, as well as other hybrid quantum-classical varia-
tional algorithms (Section 10.3) may be robust enough to harness the power of NISQs, the noisy
quantum computers available today, and to deliver practical quantum computing advantage in the
near term (McClean et al., 2016).

10.1 Adiabatic Quantum Computing (AQC)


As discussed in Section 3.3, a closed quantum system that evolves according to a Hamiltonian
H with eigenstates |λ⟩ and eigenvalues λ will evolve from a starting state |ψ(0)⟩ into the state
|ψ(t)⟩ = Σ_λ ψ_λ e^{−iλt}|λ⟩, where ψ_λ = ⟨λ|ψ(0)⟩ is the inner product of |ψ(0)⟩ and |λ⟩. This property
implies that, if the starting state |ψ(0)⟩ is one of the eigenstates of H, |ψ(0)⟩ = |λ⟩, then
evolution under the Hamiltonian H leaves the state unchanged, |ψ(t)⟩ = e^{−iλt}|λ⟩ ∝ |ψ(0)⟩,
since the overall phase has no impact on quantum states and is ignored. This is the essence of
the Adiabatic Theorem of Born and Fock (1928): if the Hamiltonian of the system is changed
slowly enough, the system will stay in its ground state (the eigenstate corresponding to the lowest
eigenvalue of the Hamiltonian), provided the spectral gap (the gap between the lowest and the
second lowest eigenvalues) is maintained throughout the evolution.
Adiabatic quantum computing (AQC) algorithms (Farhi et al., 2000) leverage the Adiabatic
Theorem. An AQC algorithm starts in an easy-to-prepare starting state |Si, usually the ground
state of the Hamiltonian HS . The goal of the algorithm is to arrive at the ground state of the
ending Hamiltonian HE . To transform the initial state |Si into the desired state |Ei, the system
is evolved slowly by applying the unitary evolution operator corresponding to a Hamiltonian H(t)
that slowly transitions from HS to HE by increasing the weight β(t):

H(t) = HS (1−β(t))+HE β(t). (123)

If the spectral gap is preserved throughout the transformation from HS to HE , the starting state
evolves into the ground state of HE . The squared inverse of the spectral gap bounds the rate at
which the Hamiltonian can evolve from HS to HE which, in turn, bounds the runtime efficiency
of the AQC algorithm (Reichardt, 2004). Aharonov et al. (2008) proved that, in the absence of
noise and decoherence, adiabatic quantum computation is theoretically equivalent to circuit-based
computation.
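The central role of the spectral gap in eq. (123) can be illustrated classically by diagonalizing the interpolating Hamiltonian along the schedule (a NumPy sketch; H_S and H_E below are hypothetical toy choices for a 2-qubit problem, not from the text):

```python
import numpy as np

# Toy AQC schedule: track the spectral gap of H(t) = (1 - b)H_S + b H_E
# for a hypothetical 2-qubit problem (H_S, H_E chosen for illustration only).
X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)
H_S = -(np.kron(X, I2) + np.kron(I2, X))   # ground state: uniform superposition
H_E = np.diag([3.0, 2.0, 1.0, 0.0])        # diagonal cost; ground state |11>

gaps = []
for b in np.linspace(0.0, 1.0, 101):
    evals = np.linalg.eigvalsh((1 - b) * H_S + b * H_E)
    gaps.append(evals[1] - evals[0])       # gap between two lowest eigenvalues
min_gap = min(gaps)                        # runtime bound scales like 1/min_gap^2
```

The minimum of `gaps` along the schedule is the quantity whose inverse square bounds the admissible evolution rate (Reichardt, 2004).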
In practice, AQC has both advantages and disadvantages compared with circuit-based compu-
tation. An attractive property of adiabatic quantum computation is that, if the spectral gap is
large enough (compared with the inverse of the evolution time 1/t), the computation is robust even
in the presence of noise. A disadvantage is that, for many systems, there are stringent physical
limits on how fast the effective Hamiltonian can change; because of this, the time required for a full
transition from H_S to H_E may be longer than the available system coherence time.
A recent review article by Albash and Lidar (2018) provides more detail about AQC.27 AQC
may contribute most powerfully in chemistry (see, e.g. Babbush et al., 2014; Veis and Pittner,
27. A related way of performing optimization is quantum annealing (Apolloni et al., 1989). Here the optimization starts in an arbitrary initial state and then explores the cost function landscape until it finds the ground state of the system. Quantum annealing is the method deployed by the company D-Wave.

2014) and may unlock applications in other fields, such as machine learning (Neven et al., 2008;
Denchev et al., 2012; Seddiqi and Humble, 2014; Potok et al., 2021) and combinatorial optimiza-
tion (Farhi et al., 2000, 2001; Choi, 2011, 2020).

10.2 Quantum Approximate Optimization Algorithm (QAOA)


The Quantum Approximate Optimization Algorithm (QAOA) proposed by Farhi et al. (2014)
finds computationally-efficient approximations for a class of NP-hard combinatorial optimization
problems. Combinatorial optimization is equivalent to searching for a string z of length n that
satisfies m clauses. The objective function is the number of clauses that string z satisfies:
C(z) = Σ_{k=1}^{m} C_k(z),     (124)

where C_k(z) = 1 if z satisfies clause k and 0 if it does not. The algorithm finds a string z – among
the 2ⁿ possible n-bit strings – such that C(z) is close to max_z C(z).
The core idea of the algorithm is that it is possible to create a parameterized quantum super-
position of 2n states that represent all possible binary strings z such that the expectation of the
objective function C for this state is maximized for a set of parameters. The quantum state corre-
sponding to the maximum expectation of C contains the approximate solution to the combinatorial
optimization with a high probability.
The algorithm starts with a uniform superposition of all possible n-bit binary strings z in the
computational basis

|s⟩ = (1/√2ⁿ) Σ_z |z⟩.     (125)

Two types of parameterized unitary operators are applied to this state. The first unitary
operator has the form

U(C,γ) = e^{−iγC} = Π_{k=1}^{m} e^{−iγC_k},     (126)

where γ is a scalar parameter, γ ∈ [0,2π). The second unitary operator is based on an operator of
the form

B = Σ_{j=1}^{n} σ_j^x,     (127)

where σ_j^x is a Pauli operator (Section 3.3) applied to qubit j, j = 1,…,n. The unitary operator
U(B,β) equals

U(B,β) = e^{−iβB} = Π_{j=1}^{n} e^{−iβσ_j^x},     (128)
where β ∈ [0,π).28
28. The uniform state |s⟩ is the highest-eigenvalue eigenstate of B.

Alternating application of the parameterized unitary operators U(C,γ) and U(B,β) to the
initial state |si creates a parameterized quantum state

|γ,βi = U(B,βp )U(C,γp )...U(B,β1 )U(C,γ1 )|si, (129)

where γ = (γ_1,…,γ_p) and β = (β_1,…,β_p) are parameter vectors. The parameter p reflects the resulting
quantum circuit depth, which, as Farhi et al. (2014) show, controls the quality of the approxima-
tion, converging to the exact result in the limit p → ∞. In effect, QAOA applies Product Formula
(Section 9.2) to the AQC algorithm (Section 10.1).
The optimal approximation is the quantum state |γ*, β*⟩ maximizing the expectation value of
the objective function C:

argmax_{γ,β} ⟨γ,β|C|γ,β⟩.     (130)

The maximization proceeds in a classical outer loop using gradient descent, Nelder-Mead, or other
methods. At step t, the classical computer controls the parameterized unitary gates to create the state
|γ^(t), β^(t)⟩ and collects the results of measuring C in this state. The results of the measurements of
C feed into the parameter update (γ^(t), β^(t)) ↦ (γ^(t+1), β^(t+1)). The unitary gates, updated with
the new set of parameters and applied to the reset initial state |s⟩, create the updated quantum
state |γ^(t+1), β^(t+1)⟩. The algorithm stops when a stopping criterion, such as an increase in the
expectation value of C below a given threshold, is reached.^29
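The whole loop can be mimicked classically for a tiny instance (a NumPy sketch of p = 1 QAOA for MaxCut on a hypothetical triangle graph, with a coarse grid search standing in for the classical optimizer; on a quantum computer the expectation would be estimated from measurements rather than computed exactly):

```python
import numpy as np
from itertools import product

# Classical simulation of p = 1 QAOA for MaxCut on a triangle graph
# (a hypothetical 3-qubit toy instance; the maximum cut value is 2).
n, edges = 3, [(0, 1), (1, 2), (0, 2)]
bits = np.array([[(z >> q) & 1 for q in range(n)] for z in range(2**n)])
C = np.array([sum(b[i] != b[j] for i, j in edges) for b in bits], dtype=float)

X = np.array([[0, 1], [1, 0]], dtype=complex)

def u_b(beta):
    # U(B, beta): a product of single-qubit rotations e^{-i beta X}
    rx = np.cos(beta) * np.eye(2) - 1j * np.sin(beta) * X
    out = rx
    for _ in range(n - 1):
        out = np.kron(out, rx)
    return out

s = np.full(2**n, 2**(-n / 2), dtype=complex)   # uniform superposition |s>

def expectation(gamma, beta):
    # <gamma,beta| C |gamma,beta> for the p = 1 state U(B,beta)U(C,gamma)|s>
    state = u_b(beta) @ (np.exp(-1j * gamma * C) * s)
    return float(np.real(np.conj(state) @ (C * state)))

# Classical outer loop: a coarse grid search over (gamma, beta)
grid = product(np.linspace(0, np.pi, 40), np.linspace(0, np.pi / 2, 40))
g_best, b_best = max(grid, key=lambda gb: expectation(*gb))
val = expectation(g_best, b_best)
```

Sampling the optimized state in the computational basis then yields a high-cut string with high probability.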
Because the objective function C is defined in general terms, the algorithm lends itself to a range
of combinatorial approximation problems. Farhi et al. (2014) use QAOA to find an approximate
solution to MaxCut, the problem of cutting a graph into two parts in a way that maximizes
the number of edges crossing the cut. Proposed practical industrial applications include finance
(Fernández-Lorenzo et al., 2020) and wireless scheduling (Choi et al., 2020).

10.3 Hybrid Quantum-Classical Variational Algorithms


Hybrid quantum-classical variational algorithms introduced in Sections 3.3 (subsection Observ-
ables) and 4.4 play to the strengths of both quantum and classical computers. The structure
of hybrid quantum-classical variational algorithms is similar to that of QAOA: A quantum state
encodes the variational (i.e. trial) solution; a quantum observable represents the cost function
(Rebentrost et al., 2018a; Mitarai et al., 2018; Zoufal et al., 2019; Schuld et al., 2020; Zoufal et al.,
2020). A classical computer stores and updates variational parameters, which it uses to classically
control the quantum gates used create the variational quantum state. The expectation value of
measurements of the quantum observable in the variational quantum state is the cost associated
with the set of variational parameters that define the state. The classical computer collects the
results of these measurements and uses them to update variational parameters. The classical com-
puter then uses the updated variational parameters to reset the quantum gates used to prepare the
next iteration of the variational quantum state. In some cases, direct measurements of the gradient
of the cost function can improve convergence of hybrid variational methods (Harrow and Napp,
2019).
29. Marsh and Wang (2020) observe that QAOA is a form of a quantum walk with phase shifts and use this observation to generalize the algorithm.

10.4 Quantum Gradient Descent
An important method in classical machine learning is gradient descent (and stochastic gradient
descent) – a popular way for training statistical models by optimizing a loss function. When
gradient descent is used for training quantum neural networks, it is often performed on a classical
computer. The reason is that it is difficult to iterate on a quantum computer. The vast majority
of quantum computing algorithms rely on postselection (see Section 5.3) to produce the desired
result. Postselection yields the desired quantum state with a fractional probability (often around
1/2); the rest of the time it yields an incorrect state that has to be discarded. Therefore, in
order to have a sufficient number of desired quantum states available at the end of the iterative
process, multiple copies of the initial state have to be created, with the number of copies increasing
exponentially with the number of expected iterative steps.
For optimization where the number of expected iterations is small, quantum algorithms with
provable speedups have been proposed. These algorithms work for specific simple forms of the loss
function. How to extend these algorithms to alternative classes of loss functions remains an open
area for future research. The most general quantum gradient descent work is by Rebentrost et al.
(2019) who propose a quantum algorithm to find the minimum of a homogeneous polynomial
using gradient descent and Newton’s methods. The proposed quantum algorithm leverages the
matrix exponentiation method used in Quantum PCA (Section 8.6), followed by Quantum Phase
Estimation (Section 8.2) and a controlled rotation (Sections 5.3 and 8.3) of an auxiliary qubit
to achieve each parameter update step. The shortcoming of this method is that it requires, on
average, the destruction of approximately three sets of quantum states for each update step
(this number can be reduced to around two with an optimized sequence of steps), and is therefore
only appropriate for approximate minimization with a small number of iterations.
Sweke et al. (2020) consider stochastic gradient descent for hybrid quantum-classical optimiza-
tion (Section 10.3). They argue that to obtain an unbiased estimator of an integral over a probabil-
ity distribution, it is sufficient to prepare a corresponding quantum state and take a measurement,
repeating the process k times, where k, in some cases, can be as low as 1. Where the gradient can
be expressed as a linear combination of expectation values, a “doubly-stochastic” gradient descent
is possible. The paper considers cases where the cost function can be expressed as an observ-
able that can be readily measured, such as energy (i.e. the expectation value of the Hamiltonian
of the system) in Variational Quantum Eigensolver (Peruzzo et al., 2014). Stokes et al. (2020)
propose a quantum algorithm to estimate the natural gradient for cost functions expressed by a
block-diagonal Hamiltonian in a space spanned by parameterized unitary gates.
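As an illustration of how gradients expressed through expectation values are measured in practice (not from the text: this is the parameter-shift rule, a standard trick for variational circuits, shown on a hypothetical single-qubit cost function cost(θ) = ⟨0|R_x(θ)† Z R_x(θ)|0⟩ = cos θ):

```python
import numpy as np

# Parameter-shift rule: the exact gradient of a variational-circuit cost
# is obtained from two shifted evaluations of the same circuit.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)

def cost(theta):
    # cost(theta) = <0| Rx(theta)^dag Z Rx(theta) |0>, Rx = e^{-i theta X / 2}
    rx = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * X
    psi = rx @ np.array([1.0, 0.0], dtype=complex)
    return float(np.real(np.conj(psi) @ Z @ psi))

theta = 0.7
grad = (cost(theta + np.pi / 2) - cost(theta - np.pi / 2)) / 2
assert np.isclose(grad, -np.sin(theta))   # exact derivative of cos(theta)
```

On hardware, each `cost(...)` call is itself an expectation estimated from repeated measurements, which makes the resulting gradient estimator stochastic in the sense considered by Sweke et al. (2020).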

11 Quantum Eigenvalue and Singular Value Transformations


Quantum Singular Value Transformation (QSVT) by Gilyén et al. (2019b), a generalization of
Quantum Signal Processing (QSP) by Low and Chuang (2017), has recently emerged as a unify-
ing framework, encompassing all the major families of quantum algorithms as specific instances
(Martyn et al., 2021). The algorithms leverage the fact that transformations of quantum subsys-
tems can be non-linear even though the transformations of closed quantum systems have to be
linear – more specifically, unitary – and reversible.
Let H be a Hamiltonian acting on a 2ⁿ-dimensional space spanned by a register of n qubits. Let
|λ⟩ be the n-qubit eigenstates of the Hamiltonian H corresponding to eigenvalues λ: H|λ⟩ = λ|λ⟩.

QSVT enables us to perform a polynomial transformation of the eigenvalues of H:
Poly(H) = Σ_λ Poly(λ) |λ⟩⟨λ|,     (131)

assuming the operator norm of the Hamiltonian obeys ‖H‖ ≤ 1 to enable block-encoding of the
system in a larger system. More generally, QSVT enables polynomial transformations of singular
values of a general (non-Hermitian) matrix A.
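The eigenvalue transformation in eq. (131) has a direct classical analogue via diagonalization (a NumPy sketch; QSVT achieves the same transformation on a quantum computer without ever diagonalizing H):

```python
import numpy as np

# Classical analogue of eq. (131): apply a polynomial to a Hermitian matrix
# by transforming its eigenvalues.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2
H /= np.linalg.norm(H, 2)        # enforce ||H|| <= 1, as block-encoding requires

lam, V = np.linalg.eigh(H)
poly = lambda x: 2 * x**2 - 1    # e.g. the Chebyshev polynomial T_2
poly_H = (V * poly(lam)) @ V.T   # sum_lambda poly(lambda)|lambda><lambda|

assert np.allclose(poly_H, 2 * H @ H - np.eye(4))
```

For a non-Hermitian A, the same construction with the singular value decomposition gives the general singular value transformation.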
In this section, we introduce QSP (Section 11.1), review QSVT for Hermitian matrices (Quan-
tum Eigenvalue Transformation in Section 11.2), and describe the general form of QSVT (Sec-
tion 11.3) and its universality (Section 11.4) as a framework for other quantum algorithms.

11.1 Quantum Signal Processing (QSP)


Quantum Signal Processing (QSP) is a method to enact polynomial transformations of a “signal”
x (assuming x ∈ [−1,1]) embedded in a unitary W acting on a single qubit. QSP transformations
proceed as sequences of simple rotations of the qubit followed by post-selection (Section 5.3) to
perform polynomial transformations of x. To develop QSP, Low et al. (2016) drew inspiration from
the signal processing methods of nuclear magnetic resonance (NMR) – a powerful and important
technology widely used in medicine, chemistry, petroleum industry, materials science, and physics.
A QSP algorithm has four components. The first component is the signal unitary W(x) – the
unitary that encodes the signal x to be transformed. The second component is the signal processing
unitary S(φ_j), usually a simple rotation by an angle φ_j, an element of a tuple φ⃗ = (φ_0, ..., φ_d). The
algorithm proceeds as a sequence of alternating unitaries S(φ_j) and W(x). The third component is
the sequence of rotational angles in the tuple φ⃗, which determines what polynomial transformation
the “signal” x undergoes. The fourth component is the signal basis, e.g. {|+⟩, |−⟩}, used to perform
a measurement at the end of the algorithm.
Let W be the signal unitary acting on a qubit or a more general two-state quantum system
(we will use the generalization in the next section):
    W(x) = [[x, i√(1−x²)], [i√(1−x²), x]] = e^{iX arccos(x)},    (132)

where X is a Pauli matrix and e^{iX arccos(x)} represents a rotation about the x-axis by angle
2 arccos(x). Note that the encoding of x in a unitary can take multiple forms, and the form in (132)
represents a specific choice, convenient for the discussion of the Quantum Eigenvalue Transformation
and QSVT and their applications in the following sections (Martyn et al., 2021).
Let S(φ) be a signal processing operator. A convenient choice is

    S(φ) = e^{iφZ},    (133)

where Z is a Pauli matrix and e^{iφZ} represents a rotation about the z-axis by angle 2φ. In
principle, S(φ) can take other forms, as long as S(φ) does not commute with W(x).
For a specific choice of a parameter tuple φ⃗ = (φ_0, φ_1, ..., φ_d) of d+1 angles, a series of alternating
applications of S(φ_j) and W(x) results in a polynomial transformation of the signal x:

    U_φ⃗(x) = S(φ_0) ∏_{j=1}^{d} W(x) S(φ_j) = [[P(x), iQ(x)√(1−x²)], [iQ*(x)√(1−x²), P*(x)]],    (134)

where P(x) and Q(x) are complex polynomials of degree at most d and d−1, and with parities
(d mod 2) and (d−1 mod 2), respectively. The polynomials satisfy |P(x)|² + (1−x²)|Q(x)|² = 1 for
all x ∈ [−1,1]. Importantly, given polynomials P(x) and Q(x) that satisfy the aforementioned
conditions, it is possible to find a φ⃗ that satisfies (134).
The polynomial P(x) determines the probability p that the state |0⟩ stays unchanged under the
operation U_φ⃗(x): P(x) = ⟨0|U_φ⃗(x)|0⟩ and p = |P(x)|². The choice of φ⃗ determines the polynomial.
For example, the choice φ⃗ = (0,0) results in P(x) = x; φ⃗ = (0,0,0) results in P(x) = 2x² − 1; and so
on, with φ⃗ = (0,0,...,0) of length d+1 yielding the Chebyshev polynomials of the first kind, T_d(x).
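The Chebyshev claim above is easy to verify numerically. The NumPy sketch below (our own illustration; the helper names `W`, `S`, and `U_qsp` are ours, not from the paper) multiplies out the QSP sequence (134) with all-zero phases and checks that ⟨0|U_φ⃗(x)|0⟩ = T_d(x) = cos(d arccos x):

```python
import numpy as np

def W(x):
    """Signal unitary of Eq. (132): W(x) = exp(i * arccos(x) * X)."""
    s = 1j * np.sqrt(1.0 - x**2)
    return np.array([[x, s], [s, x]])

def S(phi):
    """Signal-processing rotation of Eq. (133): S(phi) = exp(i * phi * Z)."""
    return np.diag([np.exp(1j * phi), np.exp(-1j * phi)])

def U_qsp(phis, x):
    """QSP sequence of Eq. (134): S(phi_0) * prod_j [W(x) S(phi_j)]."""
    out = S(phis[0])
    for phi in phis[1:]:
        out = out @ W(x) @ S(phi)
    return out

# An all-zero phase tuple of length d+1 yields P(x) = <0|U|0> = T_d(x).
x = 0.3
for d in range(1, 6):
    P = U_qsp(np.zeros(d + 1), x)[0, 0]
    assert np.isclose(P, np.cos(d * np.arccos(x)))  # T_d(x)
```

Nonzero phase tuples enact other degree-d polynomials, which is what gives the method its expressiveness.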
For greater expressiveness of U_φ⃗(x), consider the matrix element ⟨+|U_φ⃗(x)|+⟩, where
|+⟩ = (|0⟩ + |1⟩)/√2 is a state in the Hadamard basis, chosen in this case as the signal basis. The state
|+⟩ plays the role of the reference state. Post-selection on the reference state yields a polynomial
transformation of x:

    ⟨+|U_φ⃗(x)|+⟩ = Re(P(x)) + i Re(Q(x)) √(1−x²).    (135)

Post-selection relative to the signal basis is one of the critical steps of QSP. After all, we are
able to perform a non-linear quantum transformation because it applies to a subsystem of a larger
system. Post-selection is a way to extract the smaller system from the larger system at the end of
the computation.

11.2 Quantum Eigenvalue Transformation


Gilyén et al. (2019b) generalized QSP (Section 11.1) to perform polynomial transformations of
matrices rather than scalar signals. They combined QSP with block encoding and qubitization.
Block encoding is a way to embed a linear non-unitary operator acting on a quantum system
inside a unitary operator acting on a larger system that contains the smaller quantum system. A
simple way to embed a quantum system inside a larger system is to append an auxiliary qubit.
Consider a quantum system on n qubits. Its Hilbert space is C^(2^n). Appending the auxiliary qubit
doubles the Hilbert space to C^(2^(n+1)). If the operator E to be embedded is unitary, then a
simple block-encoding of E is the controlled-E operator, where E applies to the register of n qubits
conditional on the state of the auxiliary qubit.
Consider a Hermitian operator H acting on the quantum system of n qubits. It is possible to
embed this Hamiltonian in a unitary U acting on the expanded system of the n qubits plus the
auxiliary. For example, let U take the form

    U = I ⊗ H + i X ⊗ √(1 − H²),    (136)

where the identity I and the Pauli operator X act on the auxiliary qubit, and the Hermitian operators
H and √(1 − H²) act on the n-qubit register. The eigenstates of H are n-qubit states |λ⟩ with
eigenvalues λ: H|λ⟩ = λ|λ⟩. The operator √(1 − H²) is Hermitian and shares its eigenstates with the
Hamiltonian H; it is easy to demonstrate using a Taylor expansion that √(1 − H²)|λ⟩ = √(1 − λ²)|λ⟩.
As a consequence, the operator U is unitary, U†U = UU† = I. (Note that U is not the unique way
to embed the Hermitian operator H in a unitary operator acting on a larger system.)

Qubitization is the reduction of a multi-qubit state to a two-state, qubit-like system. For
example, qubitization is at the core of Grover’s search algorithm (Section 6), where the “good”
state and its complement effectively act as a two-state system. They span a (two-dimensional)
plane, invariant under Grover iterations.
Consider the application of the unitary U from (136) to states of the extended system |0⟩|λ⟩
and |1⟩|λ⟩, where |0⟩ and |1⟩ represent the states of the auxiliary qubit:

    U|0⟩|λ⟩ = λ|0⟩|λ⟩ + i√(1 − λ²)|1⟩|λ⟩,    (137)
    U|1⟩|λ⟩ = λ|1⟩|λ⟩ + i√(1 − λ²)|0⟩|λ⟩.    (138)

The space spanned by |0⟩|λ⟩ and |1⟩|λ⟩ is closed under U and acts as a two-state system. The
unitary U has the effect of a rotation on each subspace that corresponds to an eigenstate |λ⟩ of H:

    U = ⊕_λ [[λ, i√(1−λ²)], [i√(1−λ²), λ]] ⊗ |λ⟩⟨λ|    (139)
      = ⊕_λ e^{iX arccos(λ)} ⊗ |λ⟩⟨λ|,    (140)

where e^{iX arccos(λ)} is effectively a rotation by 2 arccos(λ) about the x-axis of the Bloch sphere
defined by |0⟩|λ⟩ and |1⟩|λ⟩ for each λ. In each subspace spanned by |0⟩|λ⟩ and |1⟩|λ⟩, the unitary
operator U acts as the signal unitary in QSP, encoding the eigenvalue λ as the signal.
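The block encoding (136) and the qubitization property (139)–(140) can be checked numerically for a small random Hamiltonian. The sketch below (our own NumPy illustration, not code from the paper) verifies that U is unitary, that its top-left block is H, and that powers of U enact Chebyshev polynomials of H on that block:

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])

def block_encode(H):
    """U = I (x) H + i X (x) sqrt(1 - H^2), Eq. (136), assuming ||H|| < 1."""
    evals, V = np.linalg.eigh(H)
    sq = V @ np.diag(np.sqrt(1.0 - evals**2)) @ V.conj().T
    return np.kron(np.eye(2), H) + 1j * np.kron(X, sq)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2
H /= np.linalg.norm(H, 2) * 1.1          # enforce ||H|| < 1

U = block_encode(H)
assert np.allclose(U.conj().T @ U, np.eye(8))   # U is unitary
assert np.allclose(U[:4, :4], H)                # top-left block encodes H

# Qubitization, Eqs. (139)-(140): in each eigenspace U^d acts as
# e^{i d X arccos(lambda)}, so the top-left block of U^d is T_d(H).
d = 5
Ud = np.linalg.matrix_power(U, d)
evals, V = np.linalg.eigh(H)
Td = V @ np.diag(np.cos(d * np.arccos(evals))) @ V.conj().T
assert np.allclose(Ud[:4, :4], Td)
```

The Chebyshev behavior of plain powers of U is exactly what the interleaved phase rotations then reshape into more general polynomials.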
By analogy with the unitary U that extends the idea of a signal unitary to a multi-qubit state,
we extend the signal processing operator. The signal processing operator is independent of the
signals λ and acts on the auxiliary qubit identically in each subspace spanned by |0⟩|λ⟩ and |1⟩|λ⟩.
Using the identity operator I = Σ_λ |λ⟩⟨λ|, we can express the extended signal processing operator
as

    Π_φ = ⊕_λ e^{iφZ} ⊗ |λ⟩⟨λ|,    (141)

where Z is a Pauli operator and the rotation e^{iφZ} acts on the auxiliary qubit.
Gilyén et al. (2019b) demonstrate that the sequence of alternating extended signal unitaries and
signal processing operators results in an embedded polynomial transformation of the Hermitian
operator H. For an even d:

    U_φ⃗ = ∏_{k=1}^{d/2} [ Π_{φ_{2k−1}} U† Π_{φ_{2k}} U ] = [[Poly(H), ·], [·, ·]],    (142)

and similarly for an odd d:

    U_φ⃗ = Π_{φ_1} U ∏_{k=1}^{(d−1)/2} [ Π_{φ_{2k}} U† Π_{φ_{2k+1}} U ] = [[Poly(H), ·], [·, ·]],    (143)

where, post-selecting on the auxiliary qubit ending in state |0⟩ (if it started in state |0⟩), we obtain
the desired polynomial transformation Poly(H):

    Poly(H) = Σ_λ Poly(λ) |λ⟩⟨λ|.    (144)

The resulting polynomial of H is of degree at most d. Its exact form depends on the sequence of
signal processing rotation angles φ⃗, which can be computed efficiently using a version of the classical
Remez exchange algorithm (Low et al., 2016; Martyn et al., 2021).

11.3 Quantum Singular Value Transformation (QSVT)
In the previous section we considered the polynomial transformation of a Hermitian operator. For
non-Hermitian linear operators, it is possible to perform an analogous polynomial singular value
transformation.
Consider a non-Hermitian operator A, represented by a rectangular matrix. Any general rectangular
matrix can be decomposed into a diagonal matrix of singular values Σ and unitary matrices
W_Σ and V_Σ:

    A = W_Σ Σ V_Σ†.    (145)

The matrix Σ contains the r non-negative real singular values of the matrix A along its diagonal. The
columns of the matrices W_Σ and V_Σ form orthonormal bases composed of left and right singular vectors,
respectively. In quantum notation we denote the left singular vectors {|w_k⟩} and right singular
vectors {|v_k⟩} and express the matrix A as

    A = Σ_{k=1}^{r} σ_k |w_k⟩⟨v_k|.    (146)

Using a quantum computer, we can efficiently perform a polynomial singular value transformation
of A:

    Poly^(SV)(A) = Σ_k Poly(σ_k) |w_k⟩⟨v_k|.    (147)

The matrix A can be embedded in a unitary matrix that applies to a larger quantum system
similarly to the way we embedded the Hermitian matrix H in Section 11.2. Even though the
Hilbert spaces spanned by {|wk i} and {|vk i} in general have different dimensions, we can create
an extended quantum state by appending a single auxiliary qubit. In this case, the Hilbert spaces
spanned by {|wk i} and {|vk i} would be encoded on the same register of n qubits, where n is large
enough to hold the larger of the spaces.
The extended signal unitary U then takes the form

    U = ⊕_k [[σ_k, i√(1−σ_k²)], [i√(1−σ_k²), σ_k]] ⊗ |w_k⟩⟨v_k|.    (148)

When the input and output spaces of A are different, we have two signal processing operators:

    Π_φ = ⊕_k e^{iφZ} ⊗ |v_k⟩⟨v_k|,    (149)
    Π̃_φ = ⊕_k e^{iφZ} ⊗ |w_k⟩⟨w_k|.    (150)

With these definitions we have an expression analogous to the quantum eigenvalue transformation
(Section 11.2); e.g., for an even d:

    U_φ⃗ = ∏_{k=1}^{d/2} [ Π_{φ_{2k−1}} U† Π̃_{φ_{2k}} U ] = [[Poly^(SV)(A), ·], [·, ·]],    (151)

where the polynomial transformation Poly^(SV)(A) is obtained by post-selecting the auxiliary qubit
in the state |0⟩ after the unitary U_φ⃗ is applied.
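Classically, the target transformation (147) is a one-liner given an SVD. The NumPy sketch below (our own illustration; the choice of the Chebyshev polynomial T_3 is arbitrary) computes Poly^(SV)(A) and, since the polynomial is odd, checks it against the equivalent matrix polynomial:

```python
import numpy as np

def svt(A, poly):
    """sum_k poly(sigma_k) |w_k><v_k|, the target of Eq. (147)."""
    W, sigma, Vh = np.linalg.svd(A, full_matrices=False)
    return W @ np.diag(poly(sigma)) @ Vh

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))
A /= np.linalg.norm(A, 2) * 1.1                  # singular values in [0, 1)

T3 = lambda s: 4 * s**3 - 3 * s                  # Chebyshev T_3 (odd polynomial)
B = svt(A, T3)

# Odd polynomials of the singular values preserve the |w><v| structure, so
# T_3^(SV)(A) = 4 A A^T A - 3 A.
assert np.allclose(B, 4 * A @ A.T @ A - 3 * A)
```

QSVT performs this transformation coherently on a block encoding of A, without ever extracting the singular values σ_k.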

11.4 QSVT and the “Grand Unification” of Quantum Algorithms
Quantum Eigenvalue Transformation and its generalization, the Quantum Singular Value Trans-
formation, are powerful and expressive ways to perform a wide range of operations on a quantum
computer. For some quantum problems, such as Hamiltonian simulation (Section 9), QSVT pro-
vides the most efficient known algorithm to date, nearly optimal relative to known lower bounds.
Martyn et al. (2021) point out that the QSVT framework unifies all the existing quantum
algorithms. After all, a quantum algorithm is a transformation of inputs – linear or non-linear,
dimension-preserving or not. The QSVT framework provides a unified way to encode any matrix
transformation that has a polynomial expansion, and the algorithm itself is encoded in a sequence
of real numbers – the phases in the tuple φ⃗.
Consider, for example, the search problem described in Section 6. The problem has two natural
singular vectors: the starting state |ψ₀⟩ and the target state |x₀⟩. The objective is to transform
the small inner product between the starting and target states, c = ⟨x₀|ψ₀⟩, into a scalar of order
unity, ⟨x₀|U|ψ₀⟩ = O(1). Yoder et al. (2014) demonstrate that a sequence of pulses can be tuned
to achieve this transformation efficiently, while avoiding the Grover algorithm’s “soufflé
problem” – the fact that the approximation error in the Grover algorithm is periodic and, if the
minimum error is missed, it rises with every iteration until it peaks at O(1) and starts
decreasing again. The method proposed by Yoder et al. (2014), called fixed-point quantum search,
generalizes the in-plane reflections of Grover’s algorithm to rotations on the three-dimensional
Bloch sphere using the framework of QSP and QSVT. The algorithm converges in O(√N) steps and is
optimal.
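The “soufflé problem” is easy to visualize classically: after k Grover iterations the success probability is sin²((2k+1)θ) with sin θ = 1/√N, so overshooting the optimal iteration count sends the probability back down. A quick numerical check (standard Grover formulas from Section 6, not code from the paper):

```python
import numpy as np

# Success probability after k Grover iterations: sin^2((2k+1) * theta),
# with sin(theta) = 1/sqrt(N).
N = 1024
theta = np.arcsin(1.0 / np.sqrt(N))
k = np.arange(200)
p = np.sin((2 * k + 1) * theta) ** 2

k_opt = int(np.round(np.pi / (4 * theta) - 0.5))  # iteration count nearest the peak
assert p[k_opt] > 0.99          # near-certain success at the right moment...
assert p[2 * k_opt + 1] < 0.02  # ...but overshooting deflates the "souffle"
```

Fixed-point search replaces this periodic behavior with a sequence whose error decreases monotonically with the number of steps.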
The Hamiltonian simulation problem is equivalent to matrix exponentiation. Given a Hamiltonian
H, we seek to apply e^{−iHt}, which is equivalent to applying the sum e^{−iHt} = cos(Ht) − i sin(Ht).
The trigonometric functions cos(Ht) and sin(Ht) have polynomial expansions, called Jacobi-Anger
expansions, and therefore it is possible to cast the Hamiltonian simulation problem as a sum of
two Quantum Eigenvalue Transformations.
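The rapid convergence of the Jacobi-Anger expansion is what makes this practical. The sketch below (our own NumPy-only illustration; the Bessel coefficients J_n are computed from their integral representation) checks the expansion of cos(tx) in Chebyshev polynomials T_{2k}(x) = cos(2k arccos x):

```python
import numpy as np

def bessel_j(n, t, m=20000):
    """Bessel J_n(t) via (1/pi) * integral_0^pi cos(n*tau - t*sin(tau)) d(tau),
    approximated with the midpoint rule."""
    tau = (np.arange(m) + 0.5) * np.pi / m
    return np.mean(np.cos(n * tau - t * np.sin(tau)))

# Jacobi-Anger: cos(t x) = J_0(t) + 2 * sum_{k>=1} (-1)^k J_{2k}(t) T_{2k}(x).
t, K = 2.0, 10
x = np.linspace(-1.0, 1.0, 201)
approx = bessel_j(0, t) + 2.0 * sum(
    (-1) ** k * bessel_j(2 * k, t) * np.cos(2 * k * np.arccos(x))
    for k in range(1, K + 1)
)
assert np.max(np.abs(approx - np.cos(t * x))) < 1e-6
```

Because the coefficients J_{2k}(t) decay super-exponentially in k, a truncation at degree O(t + log(1/ε)) suffices, which underlies the near-optimal gate counts quoted for QSVT-based Hamiltonian simulation.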
QSVT provides an efficient way to solve the quantum linear systems problem (Section 8.4),
A|x⟩ = |b⟩. In the original QLSA algorithm, Harrow et al. (2009) used Quantum Phase Estimation to
extract the singular values of A (or its eigenvalues, if A is Hermitian) and a controlled rotation to
invert these values. The QSVT framework does not require the explicit extraction of the singular
values of A. The singular value inversion is performed in a single step using a polynomial
approximation to A^{−1}. Assuming the singular values of A are bounded from below, e.g. by 1/κ,
it is possible to construct a polynomial expansion of A^{−1}, as shown in Martyn et al. (2021),
Appendix C. The query complexity of the resulting algorithm is O(κ log(κ/ε)); remarkably, it has
no dependence on N, the dimension of A, and depends only logarithmically on the error. (It is
important to note, however, that the block encoding of A may require a number of operations that
scales as a function of N.)
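The flavor of such a polynomial approximation to 1/x can be seen in one standard construction (our illustration, not the specific polynomial of the cited appendix): for b ≈ κ² log(κ/ε), the odd, degree-(2b−1) polynomial (1 − (1 − x²)^b)/x approximates 1/x to relative error ε on 1/κ ≤ x ≤ 1:

```python
import numpy as np

kappa, eps = 4.0, 1e-3
b = int(np.ceil(kappa**2 * np.log(kappa / eps)))
x = np.linspace(1.0 / kappa, 1.0, 200)

# (1 - (1 - x^2)^b) / x is a polynomial in x of odd degree 2b - 1: expanding
# (1 - x^2)^b binomially, the constant term cancels and x divides the rest.
f = (1.0 - (1.0 - x**2) ** b) / x

rel_err = np.max(np.abs(f - 1.0 / x) * x)   # relative error |f - 1/x| / (1/x)
assert rel_err < eps
```

The residual is (1 − x²)^b / x ≤ κ e^{−b/κ²}, which is why b, and hence the polynomial degree driving the query complexity, scales as κ² log(κ/ε) for this simple construction (sharper constructions reach the O(κ log(κ/ε)) scaling quoted above).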
Martyn et al. (2021) demonstrate how to apply the QSVT framework to prime factorization,
phase estimation, and eigenvalue thresholding; Gilyén et al. (2019b) apply QSVT to Gibbs sampling,
quantum walks, and the computation of machine learning primitives. A wide variety of quantum
matrix functions and quantum channel discrimination methods are being developed using QSVT.
Additionally, because QSVT is based on quantum control techniques that are also used in error
correction, it is possible to create naturally robust algorithms using the QSVT framework. For
example, embedding a matrix in a large unitary may enable algorithm designers to “push” errors
out into the outer blocks of the matrix – similarly to the way deep networks distribute errors
through unimportant dimensions of an overparameterized model (Bartlett et al., 2020).

The QSVT framework distills the great variety of quantum algorithms to a (finite) string of real
numbers – the auxiliary rotation phases φ⃗. All the algorithms follow the same circuit, composed
of alternating block-encoded unitaries and auxiliary rotations, which, depending on the sequence
of auxiliary rotations, transform the singular values by a nearly arbitrary polynomial. The specific
sequence of auxiliary rotations delivers the multitude of useful outcomes.
The QSVT framework is powerful but, like other quantum algorithmic frameworks, not without
challenges. It is not trivial to design efficient block encodings – even for systems as simple as
harmonic oscillators. Computationally efficient methods to determine the sequence of phases φ⃗
corresponding to a given polynomial are also an open challenge. Nevertheless, QSVT is a significant
advance towards practical, actionable quantum algorithms.

12 Conclusion
The intersection of quantum computing and statistics has delivered – and is likely to continue
to deliver – powerful breakthroughs that unlock interesting and practical applications. Consider,
for example, advances in quantum MCMC, in particular quantum Metropolis-Hastings. Speeding
up the Metropolis-Hastings algorithm is critical for many important practical applications but,
because of its mathematical structure, the method does not generally apply to high-dimensional
statistical models. For such cases, Sequential Monte Carlo (SMC) can provide an attractive
alternative to MCMC. Quantum SMC is a challenging but potentially highly impactful research
direction, which remains open at the time of writing.
In the classical world, Variational Bayes stands out as a computationally attractive alternative
to MCMC for Bayesian computation in big model and big data settings. Lopatnikova and Tran
(2021) propose a Variational Bayes method based on quantum natural gradient, which can be
implemented on a quantum-classical device. How to implement a quantum Variational Bayes
approach entirely on a quantum computer is an interesting research question.
Potential advances stem not just from applying quantum algorithms to machine learning, but
also from borrowing insights in the other direction – from statistics and machine learning to
quantum computing. For example, as discussed in Section 4.2, reading out a quantum state that
encodes the result of a quantum computation might require so many measurements that the quantum
efficiency is offset. When the result is a quantum sample state, we can interpret it as a probability
distribution. We can then adopt the idea of normalizing flows from machine learning
(Papamakarios et al., 2021) and transform the quantum sample state into a new, manageable
quantum sample state, e.g. the uniform state, using parameterized quantum circuits with the
parameters trained in a classical outer loop. The method would allow us to create as many
(approximate) copies of the original state as needed, via the inverse transformation, for use in
quantum tomography. One important area not reviewed in this article is quantum-inspired
computation, such as quantum-inspired linear algebra (Gilyén et al., 2018; Chia et al., 2020).
Quantum-inspired algorithms run on classical computers, but are designed based on quantum ideas
and can still offer significant speed-ups. We leave this topic for future work.

References
Aaronson, S. (2013). Quantum computing since Democritus. Cambridge University Press.

Aaronson, S. (2015). Read the fine print. Nature Physics, 11(4):291–293.

Aaronson, S. (2019). Shadow tomography of quantum states. SIAM Journal on Computing,
49(5):STOC18–368.

Aaronson, S. and Ambainis, A. (2009). The need for structure in quantum speedups. arXiv preprint
arXiv:0911.0996.

Abbas, A., Sutter, D., Zoufal, C., Lucchi, A., Figalli, A., and Woerner, S. (2020). The power of
quantum neural networks. arXiv preprint arXiv:2011.00027.

Abrams, D. S. and Lloyd, S. (1998). Nonlinear quantum mechanics implies polynomial-time solution
for NP-complete and #P problems. Physical Review Letters, 81(18):3992.

Adcock, J., Allen, E., Day, M., Frick, S., Hinchliff, J., Johnson, M., Morley-Short, S., Pallister,
S., Price, A., and Stanisic, S. (2015). Advances in quantum machine learning. arXiv preprint
arXiv:1512.02900.

Aharonov, D., Kitaev, A., and Nisan, N. (1998). Quantum circuits with mixed states. In Proceed-
ings of the 30th Annual ACM Symposium on Theory of Computing, pages 20–30.

Aharonov, D. and Ta-Shma, A. (2003). Adiabatic quantum state generation and statistical zero
knowledge. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, pages
20–29.

Aharonov, D. and Ta-Shma, A. (2007). Adiabatic quantum state generation. SIAM Journal on
Computing, 37(1):47–82.

Aharonov, D., Van Dam, W., Kempe, J., Landau, Z., Lloyd, S., and Regev, O. (2008). Adia-
batic quantum computation is equivalent to standard quantum computation. SIAM Review,
50(4):755–787.

Albash, T. and Lidar, D. A. (2018). Adiabatic quantum computation. Reviews of Modern Physics,
90(1):015002.

Ambainis, A. (2012). Variable time amplitude amplification and quantum algorithms for linear
algebra problems. In STACS’12 (29th Symposium on Theoretical Aspects of Computer Science),
volume 14, pages 636–647. LIPIcs.

An, D. and Lin, L. (2019). Quantum linear system solver based on time-optimal adiabatic quantum
computing and quantum approximate optimization algorithm. arXiv preprint arXiv:1909.05500.

An, D., Linden, N., Liu, J.-P., Montanaro, A., Shao, C., and Wang, J. (2021). Quantum-accelerated
multilevel Monte Carlo methods for stochastic differential equations in mathematical finance.
Quantum, 5:481.

Apolloni, B., Carvalho, C., and De Falco, D. (1989). Quantum stochastic optimization. Stochastic
Processes and their Applications, 33(2):233–244.

Arunachalam, S. and de Wolf, R. (2017). Guest column: A survey of quantum learning theory.
ACM SIGACT News, 48(2):41–67.

Arunachalam, S., Gheorghiu, V., Jochym-O’Connor, T., Mosca, M., and Srinivasan, P. V. (2015).
On the robustness of bucket brigade quantum RAM. New Journal of Physics, 17(12):123010.

Babbush, R., Love, P. J., and Aspuru-Guzik, A. (2014). Adiabatic quantum simulation of quantum
chemistry. Scientific Reports, 4(1):1–11.

Barnum, H., Caves, C. M., Fuchs, C. A., Jozsa, R., and Schumacher, B. (1996). Noncommuting
mixed states cannot be broadcast. Physical Review Letters, 76(15):2818.

Bartlett, P. L., Long, P. M., Lugosi, G., and Tsigler, A. (2020). Benign overfitting in linear
regression. Proceedings of the National Academy of Sciences, 117(48):30063–30070.

Baumgratz, T., Gross, D., Cramer, M., and Plenio, M. B. (2013). Scalable reconstruction of
density matrices. Physical Review Letters, 111(2):020401.

Bausch, J. (2020). Recurrent quantum neural networks. Advances in Neural Information Processing
Systems, 33.

Benioff, P. (1980). The computer as a physical system: A microscopic quantum mechanical Hamiltonian
model of computers as represented by Turing machines. Journal of Statistical Physics,
22(5):563–591.

Benioff, P. (1982). Quantum mechanical Hamiltonian models of Turing machines. Journal of
Statistical Physics, 29(3):515–546.

Bennett, C. H. (1989). Time/space trade-offs for reversible computation. SIAM Journal on Com-
puting, 18(4):766–776.

Berry, D. W., Ahokas, G., Cleve, R., and Sanders, B. C. (2007). Efficient quantum algorithms for
simulating sparse Hamiltonians. Communications in Mathematical Physics, 270(2):359–371.

Berry, D. W. and Childs, A. M. (2012). Black-box Hamiltonian simulation and unitary implemen-
tation. Quantum Information and Computation, 12(1-2):29–62.

Berry, D. W., Childs, A. M., Cleve, R., Kothari, R., and Somma, R. D. (2014). Exponential
improvement in precision for simulating sparse hamiltonians. In Proceedings of the 46th Annual
ACM Symposium on Theory of Computing, pages 283–292.

Berry, D. W., Childs, A. M., Cleve, R., Kothari, R., and Somma, R. D. (2015a). Simulating
Hamiltonian dynamics with a truncated Taylor series. Physical Review Letters, 114(9):090502.

Berry, D. W., Childs, A. M., and Kothari, R. (2015b). Hamiltonian simulation with nearly opti-
mal dependence on all parameters. In 2015 IEEE 56th Annual Symposium on Foundations of
Computer Science, pages 792–809. IEEE.

Berry, D. W., Childs, A. M., Su, Y., Wang, X., and Wiebe, N. (2020). Time-dependent Hamiltonian
simulation with l1-norm scaling. Quantum, 4:254.

Bharti, K., Cervera-Lierta, A., Kyaw, T. H., Haug, T., Alperin-Lea, S., Anand, A., Degroote, M.,
Heimonen, H., Kottmann, J. S., Menke, T., et al. (2021). Noisy intermediate-scale quantum
(NISQ) algorithms. arXiv preprint arXiv:2101.08448.

Biham, E., Brassard, G., Kenigsberg, D., and Mor, T. (2004). Quantum computing without
entanglement. Theoretical Computer Science, 320(1):15–33.

Blank, C., Park, D. K., Rhee, J.-K. K., and Petruccione, F. (2020). Quantum classifier with
tailored quantum kernel. npj Quantum Information, 6(1):1–7.

Born, M. and Fock, V. (1928). Beweis des adiabatensatzes. Zeitschrift für Physik, 51(3-4):165–180.

Boyer, M., Brassard, G., Høyer, P., and Tapp, A. (1998). Tight bounds on quantum searching.
Fortschritte der Physik: Progress of Physics, 46(4-5):493–505.

Brassard, G., Dupuis, F., Gambs, S., and Tapp, A. (2011). An optimal quantum algorithm to
approximate the mean and its application for approximating the median of a set of points over
an arbitrary distance. arXiv preprint arXiv:1106.4267.

Brassard, G., Hoyer, P., Mosca, M., and Tapp, A. (2002). Quantum amplitude amplification and
estimation. Contemporary Mathematics, 305:53–74.

Brassard, G., Høyer, P., and Tapp, A. (1998). Quantum counting. In International Colloquium on
Automata, Languages, and Programming, pages 820–831. Springer.

Bruß, D., DiVincenzo, D. P., Ekert, A., Fuchs, C. A., Macchiavello, C., and Smolin, J. A. (1998).
Optimal universal and state-dependent quantum cloning. Physical Review A, 57(4):2368.

Buhrman, H., Cleve, R., Watrous, J., and De Wolf, R. (2001). Quantum fingerprinting. Physical
Review Letters, 87(16):167902.

Buzek, V. and Hillery, M. (1996). Quantum copying: Beyond the no-cloning theorem. Physical
Review A, 54(3):1844.

Cerezo, M., Arrasmith, A., Babbush, R., Benjamin, S. C., Endo, S., Fujii, K., McClean, J. R.,
Mitarai, K., Yuan, X., Cincio, L., et al. (2021). Variational quantum algorithms. Nature Reviews
Physics, 3(9):625–644.

Chakrabarti, S., Childs, A. M., Hung, S.-H., Li, T., Wang, C., and Wu, X. (2019). Quantum
algorithm for estimating volumes of convex bodies. arXiv preprint arXiv:1908.03903.

Chakraborty, S., Gilyén, A., and Jeffery, S. (2018). The power of block-encoded matrix
powers: improved regression techniques via faster Hamiltonian simulation. arXiv preprint
arXiv:1804.01973.

Chia, N.-H., Gilyén, A., Li, T., Lin, H.-H., Tang, E., and Wang, C. (2020). Sampling-based
sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning.
In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages
387–400.

Childs, A. M. (2009). Universal computation by quantum walk. Physical Review Letters,
102(18):180501.

Childs, A. M. (2010). On the relationship between continuous- and discrete-time quantum walk.
Communications in Mathematical Physics, 294(2):581–603.

Childs, A. M., Cleve, R., Deotto, E., Farhi, E., Gutmann, S., and Spielman, D. A. (2003). Ex-
ponential algorithmic speedup by a quantum walk. In Proceedings of the 35th Annual ACM
Symposium on Theory of Computing, pages 59–68.

Childs, A. M. and Kothari, R. (2009). Limitations on the simulation of non-sparse Hamiltonians.
arXiv preprint arXiv:0908.4398.

Childs, A. M., Kothari, R., and Somma, R. D. (2017). Quantum algorithm for systems of linear
equations with exponentially improved dependence on precision. SIAM Journal on Computing,
46(6):1920–1950.

Childs, A. M. and Wiebe, N. (2012). Hamiltonian simulation using linear combinations of unitary
operations. Quantum Information & Computation, 12(11-12):901–924.

Cho, A. (2020). The biggest flipping challenge in quantum computing. Science.
https://www.sciencemag.org/news/2020/07/biggest-flipping-challenge-quantumcomputing
(retrieved Aug. 3, 2020).

Choi, J., Oh, S., and Kim, J. (2020). Quantum approximation for wireless scheduling. Applied
Sciences, 10(20):7116.

Choi, V. (2011). Different adiabatic quantum optimization algorithms for the NP-complete exact
cover and 3SAT problems. Quantum Information & Computation, 11(7-8):638–648.

Choi, V. (2020). The effects of the problem hamiltonian parameters on the minimum spectral gap
in adiabatic quantum optimization. Quantum Information Processing, 19(3):1–25.

Chowdhury, A. N. and Somma, R. D. (2017). Quantum algorithms for Gibbs sampling and hitting-
time estimation. Quantum Information & Computation, 17(LA-UR-16-21218).

Ciliberto, C., Herbster, M., Ialongo, A. D., Pontil, M., Rocchetto, A., Severini, S., and Wossnig,
L. (2018). Quantum machine learning: a classical perspective. Proceedings of the Royal Society
A: Mathematical, Physical and Engineering Sciences, 474(2209):20170551.

Clader, B. D., Jacobs, B. C., and Sprouse, C. R. (2013). Preconditioned quantum linear system
algorithm. Physical Review Letters, 110(25):250504.

Cleve, R., Ekert, A., Macchiavello, C., and Mosca, M. (1998). Quantum algorithms revisited.
Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering
Sciences, 454(1969):339–354.

Cong, I. and Duan, L. (2016). Quantum discriminant analysis for dimensionality reduction and
classification. New Journal of Physics, 18(7):073011.

Coppersmith, D. (1994). An approximate Fourier transform useful in quantum factoring. IBM
Research Report RC 119642.

Cramer, M., Plenio, M. B., Flammia, S. T., Somma, R., Gross, D., Bartlett, S. D., Landon-
Cardinal, O., Poulin, D., and Liu, Y.-K. (2010). Efficient quantum state tomography. Nature
Communications, 1(1):1–7.

Csanky, L. (1975). Fast parallel matrix inversion algorithms. In 16th Annual Symposium on
Foundations of Computer Science (sfcs 1975), pages 11–12. IEEE.

Denchev, V. S., Ding, N., Vishwanathan, S., and Neven, H. (2012). Robust classification with
adiabatic quantum optimization. arXiv preprint arXiv:1205.1148.

Dervovic, D., Herbster, M., Mountney, P., Severini, S., Usher, N., and Wossnig, L. (2018). Quantum
linear systems algorithms: a primer. arXiv preprint arXiv:1802.08227.

Dieks, D. (1982). Communication by EPR devices. Physics Letters A, 92(6):271–272.

Draper, T. G. (2000). Addition on a quantum computer. arXiv preprint quant-ph/0008033.

Duan, L.-M. and Guo, G.-C. (1998). Probabilistic cloning and identification of linearly independent
quantum states. Physical Review Letters, 80(22):4999.

Dunjko, V. and Wittek, P. (2020). A non-review of quantum machine learning: trends and explo-
rations. Quantum Views, 4:32.

Durr, C. and Hoyer, P. (1996). A quantum algorithm for finding the minimum. arXiv preprint
quant-ph/9607014.

Farhi, E., Goldstone, J., and Gutmann, S. (2014). A quantum approximate optimization algorithm.
arXiv preprint arXiv:1411.4028.

Farhi, E., Goldstone, J., Gutmann, S., Lapan, J., Lundgren, A., and Preda, D. (2001). A quantum
adiabatic evolution algorithm applied to random instances of an NP-complete problem. Science,
292(5516):472–475.

Farhi, E., Goldstone, J., Gutmann, S., and Sipser, M. (2000). Quantum computation by adiabatic
evolution. arXiv preprint quant-ph/0001106.

Farhi, E. and Harrow, A. W. (2016). Quantum supremacy through the quantum approximate
optimization algorithm. arXiv preprint arXiv:1602.07674.

Farhi, E. and Neven, H. (2018). Classification with quantum neural networks on near term pro-
cessors. arXiv preprint arXiv:1802.06002.

Fernández-Lorenzo, S., Porras, D., and García-Ripoll, J. J. (2020). Hybrid quantum-classical
optimization for financial index tracking. arXiv preprint arXiv:2008.12050.

Feynman, R. P. (1981). Simulating physics with computers. International Journal of Theoretical
Physics, 21(6/7).

Gilyén, A., Arunachalam, S., and Wiebe, N. (2019a). Optimizing quantum optimization algorithms
via faster quantum gradient computation. In Proceedings of the Thirtieth Annual ACM-SIAM
Symposium on Discrete Algorithms, pages 1425–1444. SIAM.

Gilyén, A., Lloyd, S., and Tang, E. (2018). Quantum-inspired low-rank stochastic regression with
logarithmic dependence on the dimension. arXiv preprint arXiv:1811.04909.

Gilyén, A., Su, Y., Low, G. H., and Wiebe, N. (2019b). Quantum singular value transformation
and beyond: exponential improvements for quantum matrix arithmetics. In Proceedings of the
51st Annual ACM SIGACT Symposium on Theory of Computing, pages 193–204.

Giovannetti, V., Lloyd, S., and Maccone, L. (2008a). Architectures for a quantum random access
memory. Physical Review A, 78(5):052310.

Giovannetti, V., Lloyd, S., and Maccone, L. (2008b). Quantum random access memory. Physical
Review Letters, 100(16):160501.

Gisin, N. (1998). Quantum cloning without signaling. Physics Letters A, 242(1-2):1–3.


Gross, D., Liu, Y.-K., Flammia, S. T., Becker, S., and Eisert, J. (2010). Quantum state tomography
via compressed sensing. Physical Review Letters, 105(15):150401.

Grover, L. and Rudolph, T. (2002). Creating superpositions that correspond to efficiently integrable
probability distributions. arXiv preprint quant-ph/0208112.

Grover, L. K. (1996). A fast quantum mechanical algorithm for database search. In Proceedings
of the 28th Annual ACM Symposium on Theory of Computing, pages 212–219.
Grover, L. K. (2000). Synthesis of quantum superpositions by quantum computation. Physical
Review Letters, 85(6):1334.
Häner, T., Roetteler, M., and Svore, K. M. (2018). Optimizing quantum circuits for arithmetic.
arXiv preprint arXiv:1805.12445.

Hann, C. T., Lee, G., Girvin, S., and Jiang, L. (2021). Resilience of quantum random access
memory to generic noise. PRX Quantum, 2(2):020311.
Harrow, A. and Napp, J. (2019). Low-depth gradient measurements can improve convergence in
variational hybrid quantum-classical algorithms. arXiv preprint arXiv:1901.05374.
Harrow, A. W. (2020). Small quantum computers and large classical data sets. arXiv preprint
arXiv:2004.00026.

Harrow, A. W., Hassidim, A., and Lloyd, S. (2009). Quantum algorithm for linear systems of
equations. Physical Review Letters, 103(15):150502.

Heinrich, S. (2002). Quantum summation with an application to integration. Journal of Complexity, 18(1):1–50.
Herbert, S. (2021). The problem with Grover-Rudolph state preparation for quantum Monte-Carlo. arXiv preprint arXiv:2101.02240.
Huang, H.-Y., Broughton, M., Mohseni, M., Babbush, R., Boixo, S., Neven, H., and McClean,
J. R. (2021). Power of data in quantum machine learning. Nature Communications, 12(1):1–9.

Huang, H.-Y., Kueng, R., and Preskill, J. (2020). Predicting many properties of a quantum system
from very few measurements. Nature Physics, 16(10):1050–1057.
Jordan, S. P. (2005). Fast quantum algorithm for numerical gradient estimation. Physical Review
Letters, 95(5):050501.
Jozsa, R. and Linden, N. (2003). On the role of entanglement in quantum-computational speed-up.
Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering
Sciences, 459(2036):2011–2032.
Kadowaki, T. and Nishimori, H. (1998). Quantum annealing in the transverse Ising model. Physical
Review E, 58(5):5355.

Kadowaki, T. and Nishimori, H. (2021). Greedy parameter optimization for diabatic quantum
annealing. arXiv preprint arXiv:2111.13287.

Kaneko, K., Miyamoto, K., Takeda, N., and Yoshino, K. (2021). Quantum speedup of Monte Carlo
integration with respect to the number of dimensions and its application to finance. Quantum
Information Processing, 20(5):1–24.
Kerenidis, I. and Prakash, A. (2016). Quantum recommendation systems. arXiv preprint
arXiv:1603.08675.

Kerenidis, I. and Prakash, A. (2020). Quantum gradient descent for linear systems and least
squares. Physical Review A, 101(2):022316.
Kitaev, A. Y. (1995). Quantum measurements and the abelian stabilizer problem. arXiv preprint
quant-ph/9511026.
Kitaev, A. Y., Shen, A., and Vyalyi, M. N. (2002). Classical and quantum computation, volume 47. American Mathematical Society.

Knill, E. (1995). Approximation by quantum circuits. arXiv preprint quant-ph/9508006.


Kyrillidis, A., Kalev, A., Park, D., Bhojanapalli, S., Caramanis, C., and Sanghavi, S. (2018).
Provable compressed sensing quantum state tomography via non-convex methods. npj Quantum
Information, 4(1):1–7.
Lanyon, B., Maier, C., Holzäpfel, M., Baumgratz, T., Hempel, C., Jurcevic, P., Dhand, I.,
Buyskikh, A., Daley, A., Cramer, M., et al. (2017). Efficient tomography of a quantum many-
body system. Nature Physics, 13(12):1158–1162.
Liu, H.-L., Yu, C.-H., Wu, Y.-S., Pan, S.-J., Qin, S.-J., Gao, F., and Wen, Q.-Y. (2019). Quantum
algorithm for logistic regression. arXiv preprint arXiv:1906.03834.
Liu, Y., Arunachalam, S., and Temme, K. (2021). A rigorous and robust quantum speed-up in
supervised machine learning. Nature Physics, 17(9):1013–1017.

Lloyd, S. (1996). Universal quantum simulators. Science, 273(5278):1073–1078.


Lloyd, S., Garnerone, S., and Zanardi, P. (2016). Quantum algorithms for topological and geometric
analysis of data. Nature Communications, 7(1):1–7.

Lloyd, S., Mohseni, M., and Rebentrost, P. (2013). Quantum algorithms for supervised and
unsupervised machine learning. arXiv preprint arXiv:1307.0411.
Lloyd, S., Mohseni, M., and Rebentrost, P. (2014). Quantum principal component analysis. Nature
Physics, 10(9):631–633.
Lomont, C. (2003). Fast inverse square root. Technical Report, 32.
Lopatnikova, A. and Tran, M.-N. (2021). Quantum natural gradient for Variational Bayes. arXiv
preprint arXiv:2106.05807.
Low, G. H. and Chuang, I. L. (2017). Optimal Hamiltonian simulation by quantum signal processing. Physical Review Letters, 118(1):010501.

Low, G. H. and Chuang, I. L. (2019). Hamiltonian simulation by qubitization. Quantum, 3:163.

Low, G. H., Yoder, T. J., and Chuang, I. L. (2014). Quantum inference on Bayesian networks.
Physical Review A, 89(6):062315.

Low, G. H., Yoder, T. J., and Chuang, I. L. (2016). Methodology of resonant equiangular composite
quantum gates. Physical Review X, 6(4):041067.

Magniez, F., Nayak, A., Roland, J., and Santha, M. (2011). Search via quantum walk. SIAM
Journal on Computing, 40(1):142–164.

Marriott, C. and Watrous, J. (2005). Quantum Arthur-Merlin games. Computational Complexity, 14(2):122–152.

Marsh, S. and Wang, J. B. (2020). Combinatorial optimization via highly efficient quantum walks.
Physical Review Research, 2(2):023302.

Martyn, J. M., Rossi, Z. M., Tan, A. K., and Chuang, I. L. (2021). A grand unification of quantum
algorithms. arXiv preprint arXiv:2105.02859.

McArdle, S., Jones, T., Endo, S., Li, Y., Benjamin, S. C., and Yuan, X. (2019). Variational ansatz-
based quantum simulation of imaginary time evolution. npj Quantum Information, 5(1):1–6.

McClean, J. R., Boixo, S., Smelyanskiy, V. N., Babbush, R., and Neven, H. (2018). Barren plateaus
in quantum neural network training landscapes. Nature Communications, 9(1):1–6.

McClean, J. R., Harrigan, M. P., Mohseni, M., Rubin, N. C., Jiang, Z., Boixo, S., Smelyanskiy,
V. N., Babbush, R., and Neven, H. (2021). Low-depth mechanisms for quantum optimization.
PRX Quantum, 2(3):030312.

McClean, J. R., Romero, J., Babbush, R., and Aspuru-Guzik, A. (2016). The theory of variational
hybrid quantum-classical algorithms. New Journal of Physics, 18(2):023023.

Mitarai, K., Negoro, M., Kitagawa, M., and Fujii, K. (2018). Quantum circuit learning. Physical
Review A, 98(3):032309.

Miyahara, H. and Sughiyama, Y. (2018). Quantum extension of variational Bayes inference. Phys-
ical Review A, 98(2):022330.

Montanaro, A. (2015). Quantum speedup of Monte Carlo methods. Proceedings of the Royal
Society A: Mathematical, Physical and Engineering Sciences, 471(2181):20150301.

Moroder, T., Hyllus, P., Tóth, G., Schwemmer, C., Niggebaum, A., Gaile, S., Gühne, O., and
Weinfurter, H. (2012). Permutationally invariant state reconstruction. New Journal of Physics,
14(10):105001.

Nayak, A. and Wu, F. (1999). The quantum query complexity of approximating the median and
related statistics. In Proceedings of the 31st Annual ACM Symposium on Theory of Computing,
pages 384–393.

Neven, H., Denchev, V. S., Rose, G., and Macready, W. G. (2008). Training a binary classifier
with the quantum adiabatic algorithm. arXiv preprint arXiv:0811.0416.

Nielsen, M. A. and Chuang, I. L. (2002). Quantum Computation and Quantum Information. Cambridge University Press.
O’Donnell, R. and Wright, J. (2016). Efficient quantum tomography. In Proceedings of the 48th
Annual ACM Symposium on Theory of Computing, pages 899–912.
Orsucci, D., Briegel, H. J., and Dunjko, V. (2018). Faster quantum mixing for slowly evolving
sequences of Markov chains. Quantum, 2:105.

Papamakarios, G., Nalisnick, E., Rezende, D., Mohamed, S., and Lakshminarayanan, B. (2021).
Normalizing flows for probabilistic modeling and inference. Journal of Machine Learning Re-
search, 22(57):1–64.
Paparo, G. D., Dunjko, V., Makmal, A., Martin-Delgado, M. A., and Briegel, H. J. (2014). Quan-
tum speedup for active learning agents. Physical Review X, 4(3):031002.

Park, C.-Y. and Kastoryano, M. J. (2020). Geometry of learning neural quantum states. Physical
Review Research, 2(2):023232.
Peruzzo, A., McClean, J., Shadbolt, P., Yung, M.-H., Zhou, X.-Q., Love, P. J., Aspuru-Guzik, A., and O'Brien, J. L. (2014). A variational eigenvalue solver on a photonic quantum processor. Nature Communications, 5:4213.
Peskin, M. E. and Schroeder, D. V. (1995). An Introduction to Quantum Field Theory. Addison-Wesley, Reading, USA.
Plesch, M. and Brukner, Č. (2011). Quantum-state preparation with universal gate decompositions. Physical Review A, 83(3):032302.
Date, P. and Potok, T. (2021). Adiabatic quantum linear regression. Scientific Reports, 11(1):1–10.
Prakash, A. (2014). Quantum algorithms for linear algebra and machine learning. PhD thesis, University of California, Berkeley.
Preskill, J. (2018). Quantum computing in the NISQ era and beyond. Quantum, 2:79.
Rebentrost, P., Bromley, T. R., Weedbrook, C., and Lloyd, S. (2018a). Quantum Hopfield neural
network. Physical Review A, 98(4):042308.
Rebentrost, P. and Lloyd, S. (2018). Quantum computational finance: quantum algorithm for
portfolio optimization. arXiv preprint arXiv:1811.03975.

Rebentrost, P., Mohseni, M., and Lloyd, S. (2014). Quantum support vector machine for big data
classification. Physical Review Letters, 113(13):130503.
Rebentrost, P., Schuld, M., Wossnig, L., Petruccione, F., and Lloyd, S. (2019). Quantum gradient
descent and Newton’s method for constrained polynomial optimization. New Journal of Physics,
21(7):073023.
Rebentrost, P., Steffens, A., Marvian, I., and Lloyd, S. (2018b). Quantum singular-value decom-
position of nonsparse low-rank matrices. Physical Review A, 97(1):012327.
Reichardt, B. W. (2004). The quantum adiabatic optimization algorithm and local minima. In
Proceedings of the 36th Annual ACM Symposium on Theory of Computing, pages 502–510.

Romero, J., Olson, J. P., and Aspuru-Guzik, A. (2017). Quantum autoencoders for efficient
compression of quantum data. Quantum Science and Technology, 2(4):045001.

Saad, Y. (2003). Iterative methods for sparse linear systems. SIAM.

Sanders, Y. R., Low, G. H., Scherer, A., and Berry, D. W. (2019). Black-box quantum state
preparation without arithmetic. Physical Review Letters, 122(2):020502.

Schuld, M., Bocharov, A., Svore, K. M., and Wiebe, N. (2020). Circuit-centric quantum classifiers.
Physical Review A, 101(3):032308.

Schuld, M. and Killoran, N. (2019). Quantum machine learning in feature Hilbert spaces. Physical
Review Letters, 122(4):040504.

Schuld, M. and Petruccione, F. (2018). Supervised learning with quantum computers. Springer.

Schuld, M., Sinayskiy, I., and Petruccione, F. (2016). Prediction by linear regression on a quantum
computer. Physical Review A, 94(2):022342.

Seddiqi, H. and Humble, T. S. (2014). Adiabatic quantum optimization for associative memory
recall. Frontiers in Physics, 2:79.

Shor, P. W. (1994). Algorithms for quantum computation: discrete logarithms and factoring. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, pages 124–134. IEEE.

Somma, R. D., Boixo, S., Barnum, H., and Knill, E. (2008). Quantum simulations of classical
annealing processes. Physical Review Letters, 101(13):130504.

Steiger, D. S. and Troyer, M. (2016). Racing in parallel: quantum versus classical. In APS March
Meeting Abstracts, volume 2016, pages H44–010.

Stokes, J., Izaac, J., Killoran, N., and Carleo, G. (2020). Quantum natural gradient. Quantum,
4:269.

Subaşi, Y., Somma, R. D., and Orsucci, D. (2019). Quantum algorithms for systems of linear
equations inspired by adiabatic quantum computing. Physical Review Letters, 122(6):060504.

Subramanian, S., Brierley, S., and Jozsa, R. (2019). Implementing smooth functions of a Hermitian matrix on a quantum computer. Journal of Physics Communications, 3(6):065002.

Suzuki, M. (1990). Fractal decomposition of exponential operators with applications to many-body theories and Monte Carlo simulations. Physics Letters A, 146(6):319–323.

Suzuki, M. (1991). General theory of fractal path integrals with applications to many-body theories
and statistical physics. Journal of Mathematical Physics, 32(2):400–407.

Sweke, R., Wilde, F., Meyer, J. J., Schuld, M., Fährmann, P. K., Meynard-Piganeau, B., and Eis-
ert, J. (2020). Stochastic gradient descent for hybrid quantum-classical optimization. Quantum,
4:314.

Szegedy, M. (2004). Quantum speed-up of Markov chain based algorithms. In 45th Annual IEEE
Symposium on Foundations of Computer Science, pages 32–41. IEEE.

Tang, E. (2018). Quantum-inspired classical algorithms for principal component analysis and
supervised clustering. arXiv preprint arXiv:1811.00414.

Tikhonov, A. N. (1963). On the solution of ill-posed problems and the method of regularization.
Doklady Akademii Nauk, 151(3):501–504.

Torlai, G., Mazzola, G., Carrasquilla, J., Troyer, M., Melko, R., and Carleo, G. (2018). Neural-
network quantum state tomography. Nature Physics, 14(5):447–450.

Torlai, G. and Melko, R. G. (2018). Latent space purification via neural density operators. Physical
Review Letters, 120(24):240503.

Tóth, G., Wieczorek, W., Gross, D., Krischek, R., Schwemmer, C., and Weinfurter, H. (2010).
Permutationally invariant quantum tomography. Physical Review Letters, 105(25):250403.

Vazquez, A. C. and Woerner, S. (2021). Efficient state preparation for quantum amplitude esti-
mation. Physical Review Applied, 15(3):034027.

Vedral, V., Barenco, A., and Ekert, A. (1996). Quantum networks for elementary arithmetic
operations. Physical Review A, 54(1):147.

Veis, L. and Pittner, J. (2014). Adiabatic state preparation study of methylene. The Journal of
Chemical Physics, 140(21):214111.

Venegas-Andraca, S. E. (2012). Quantum walks: a comprehensive review. Quantum Information Processing, 11(5):1015–1106.

Ventura, D. and Martinez, T. (2000). Quantum associative memory. Information Sciences, 124:273–296.

Wan, K. H., Dahlsten, O., Kristjánsson, H., Gardner, R., and Kim, M. (2017). Quantum general-
isation of feedforward neural networks. npj Quantum Information, 3(1):1–8.

Wang, C. and Wossnig, L. (2018). A quantum algorithm for simulating non-sparse Hamiltonians. arXiv preprint arXiv:1803.08273.

Wang, D., Higgott, O., and Brierley, S. (2019). Accelerated variational quantum eigensolver.
Physical Review Letters, 122(14):140504.

Watrous, J. (2001). Quantum simulations of classical random walks and undirected graph connec-
tivity. Journal of Computer and System Sciences, 62(2):376–391.

Wiebe, N., Braun, D., and Lloyd, S. (2012). Quantum algorithm for data fitting. Physical Review
Letters, 109(5):050505.

Wocjan, P. and Abeyesinghe, A. (2008). Speedup via quantum sampling. Physical Review A,
78(4):042336.

Wocjan, P., Chiang, C.-F., Nagaj, D., and Abeyesinghe, A. (2009). Quantum algorithm for ap-
proximating partition functions. Physical Review A, 80(2):022340.

Woerner, S. and Egger, D. J. (2019). Quantum risk analysis. npj Quantum Information, 5(1):1–8.

Wootters, W. K. and Zurek, W. H. (1982). A single quantum cannot be cloned. Nature,
299(5886):802–803.

Wossnig, L., Zhao, Z., and Prakash, A. (2018). Quantum linear system algorithm for dense
matrices. Physical Review Letters, 120(5):050502.

Xin, T., Che, L., Xi, C., Singh, A., Nie, X., Li, J., Dong, Y., and Lu, D. (2021). Experimental
quantum principal component analysis via parametrized quantum circuits. Physical Review
Letters, 126(11):110502.

Yoder, T. J., Low, G. H., and Chuang, I. L. (2014). Fixed-point quantum search with an optimal
number of queries. Physical Review Letters, 113(21):210501.

Yu, C.-H., Gao, F., and Wen, Q. (2021). An improved quantum algorithm for ridge regression. IEEE Transactions on Knowledge and Data Engineering, 33(3):858–866.

Zhang, K., Hsieh, M.-H., Liu, L., and Tao, D. (2021). Quantum Gram-Schmidt processes and their application to efficient state readout for quantum algorithms. Physical Review Research, 3(4):043095.

Zhou, L., Wang, S.-T., Choi, S., Pichler, H., and Lukin, M. D. (2020). Quantum approximate
optimization algorithm: Performance, mechanism, and implementation on near-term devices.
Physical Review X, 10(2):021067.

Zoufal, C., Lucchi, A., and Woerner, S. (2019). Quantum generative adversarial networks for
learning and loading random distributions. npj Quantum Information, 5(1):1–9.

Zoufal, C., Lucchi, A., and Woerner, S. (2020). Variational quantum Boltzmann machines. arXiv
preprint arXiv:2006.06004.
