Solving the Quantum Many-Body Problem with Artificial Neural Networks
The wave function Ψ is a fundamental object in quantum physics and possibly the hardest to grasp in the classical world. Ψ is a monolithic mathematical quantity that contains all of the information on a quantum state, be it a single particle or a complex molecule. In principle, an exponential amount of information is needed to fully encode a generic many-body quantum state. However, wave functions representing many physical many-body systems can be characterized by an amount of information much smaller than the maximum capacity of the corresponding Hilbert space. A limited amount of quantum entanglement and a small number of physical states in such systems enable modern approaches to solve the many-body Schrödinger’s equation with a limited amount of classical resources.

[Figure: network schematic showing a hidden layer with units h1, h2, h3, …, hM]

Numerical approaches directly relying on the wave function can either sample a finite number of physically relevant configurations or perform an efficient compression of the quantum state. Stochastic approaches, like quantum Monte Carlo (QMC) methods, belong to the first category and rely on probabilistic frameworks typically demanding a positive semidefinite wave function (1–3). Compression approaches instead rely on efficient representations of the wave function, such as in terms of matrix product states (MPS) (4–6) or more general tensor networks (7–9). However, examples of systems in which […] a large number of unexplored regimes exist, including many open problems. These encompass fundamental questions ranging from the dynamical properties of high-dimensional systems (11, 12) to the exact ground-state properties of strongly interacting fermions (13, 14). At the heart of this lack of understanding lies the difficulty in finding a general strategy to reduce the exponential complexity of the full many-body wave function down to its most essential features (15).

In a much broader context, the problem resides in the realm of dimensional reduction and feature extraction. Among the most successful […]sical phenomena have been introduced (19–23). These have so far focused on the classification of complex phases of matter, when exact sampling of configurations from these phases is possible. The challenging goal of solving a many-body […] present a stochastic framework for reinforcement learning of the parameters W, allowing for the best possible representation of both ground-state and time-dependent physical states of a given quantum Hamiltonian H. The parameters of the neural network are then optimized (trained, in the language of neural networks), either by static variational Monte Carlo (VMC) sampling (24) or time-dependent VMC (25, 26), when dynamical properties are of interest. We validate the accuracy of this approach by studying the Ising and Heisenberg models in both one and two dimensions. The power of the neural-network quantum states (NQS) is demonstrated,
obtaining state-of-the-art accuracy in both ground-state and out-of-equilibrium dynamics.

Neural-network quantum states
Consider a quantum system with N discrete-valued degrees of freedom S = (S_1, S_2, …, S_N), which may be spins, bosonic occupation numbers, or similar. The many-body wave function is a mapping of the N-dimensional set S to (exponentially many) complex numbers that fully specify the amplitude and the phase of the quantum state. The point of view we take here is to interpret the wave function as a computational black box which, given an input many-body configuration S, returns a phase and an amplitude according to Ψ(S). Our goal is to approximate this computational black box with a neural network, trained to best represent Ψ(S). Different possible choices for the artificial neural-network architectures […]

where

$$F_i(S) = 2\cosh\Big[ b_i + \sum_j W_{ij}\,\sigma^z_j \Big].$$

The network weights are, in general, to be taken complex-valued to provide a complete description of both the amplitude and the phase of the wave function.

The mathematical foundations for the ability of NQS to describe intricate many-body wave functions are the established representability theorems (27–29), which guarantee the existence of network approximates of sufficiently smooth and regular high-dimensional functions. If these conditions are satisfied by the many-body wave function, we can reasonably expect the NQS form to be a sensible choice. One of the practical advantages of this representation is that its quality can, in principle, be systematically improved by increasing the number of hidden variables. The number M (or, equivalently, the density α = M/N) then plays a role analogous to the bond dimension for the MPS. However, the correlations induced by the hidden units are intrinsically nonlocal in space and are therefore well suited to describe quantum systems in arbitrary dimension. Another convenient point of the NQS representation is that it can be formulated in a way that conserves some specific symmetries. For example, lattice translation symmetry can be used to reduce the number of variational parameters of the NQS ansatz, in the spirit of shift-invariant RBMs (30, 31). Concretely, for integer hidden-variable density α = 1, 2, …, the weight matrix takes the form of feature filters W_j^(f) for f ∈ [1, α]. These filters have a total of αN variational elements in lieu of the αN² elements of the asymmetric case (see supplementary materials).

Given a general expression for the quantum many-body state, we are now left with the task of solving the many-body problem by using machine learning to optimize the network parameters W. In the most interesting applications, the exact many-body state is unknown, and it is typically found upon solving either the static Schrödinger equation H|Ψ⟩ = E|Ψ⟩ or the time-dependent one H|Ψ(t)⟩ = i (d/dt)|Ψ(t)⟩ for a given Hamiltonian H. In the absence of samples drawn according to the exact wave function, supervised learning of Ψ is therefore not a viable option. Instead, we derive a consistent reinforcement learning approach in which either the ground-state wave function or the time-dependent one is learned on the basis of feedback from variational principles.

Ground state
To demonstrate the accuracy of the NQS in the description of complex many-body quantum states, we first focus on the goal of finding the […]

Fig. 2. Neural-network representation of the many-body ground states. Results for prototypical spin models in one and two dimensions are shown. In the top group of panels, we show the feature maps for the 1D transverse-field Ising (TFI) model at the critical point h = 1, as well as for the antiferromagnetic Heisenberg (AFH) model. In both cases, the hidden-unit density is α = 4 and the lattices comprise 80 sites. Each horizontal colormap shows the values that the fth feature map W_j^(f) takes on the jth lattice site (horizontal axis, broadened along the vertical direction for clarity). In the bottom group of panels, we show the feature maps for the 2D Heisenberg model on a square lattice, for α = 16. In this case, the horizontal (or vertical) axis of the colormaps corresponds to the x (or y) coordinates on a 10-by-10 square lattice. Each of the feature maps acts as an effective filter on the spin configurations, capturing the most important quantum correlations. [Panel titles: Ising 1D, Heisenberg 1D; feature-map labels W^(5) through W^(16).]
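As a minimal sketch of how the quantities in the section on neural-network quantum states can be evaluated, the Python snippet below computes the hidden-unit factors F_i(S) = 2 cosh[b_i + Σ_j W_ij σ_j^z] for a single spin configuration and counts the weight-matrix entries with and without translation symmetry (αN versus αN²). The product form of the amplitude used in `log_psi`, the visible bias vector `a`, and all function and variable names are assumptions made for illustration and are not taken from the text above. The random parameters are placeholders, not optimized weights.

```python
import numpy as np


def hidden_factors(spins, b, W):
    """F_i(S) = 2 cosh(b_i + sum_j W_ij sigma_j), one value per hidden unit."""
    return 2.0 * np.cosh(b + W @ spins)


def log_psi(spins, a, b, W):
    """log Psi(S), ASSUMING Psi(S) = exp(sum_j a_j sigma_j) * prod_i F_i(S)."""
    return a @ spins + np.sum(np.log(hidden_factors(spins, b, W)))


def n_weights(n_sites, alpha, translation_symmetric):
    """Entries of W: alpha*N with shared filters, alpha*N^2 without symmetry."""
    return alpha * n_sites if translation_symmetric else alpha * n_sites ** 2


def small_complex(rng, shape, scale=0.01):
    """Small random complex numbers, used only as placeholder parameters."""
    return scale * (rng.normal(size=shape) + 1j * rng.normal(size=shape))


rng = np.random.default_rng(0)
n_sites, alpha = 6, 2
n_hidden = alpha * n_sites          # M = alpha * N hidden units

# Complex-valued parameters, so that Psi(S) carries amplitude and phase.
a = small_complex(rng, n_sites)
b = small_complex(rng, n_hidden)
W = small_complex(rng, (n_hidden, n_sites))

spins = np.array([1, -1, 1, -1, 1, -1])      # one configuration, sigma_j = +/-1
amplitude = np.exp(log_psi(spins, a, b, W))
print("modulus:", abs(amplitude), "phase:", np.angle(amplitude))

# Weight counts for the 80-site, alpha = 4 case quoted for Fig. 2.
print(n_weights(80, 4, True), n_weights(80, 4, False))   # 320 25600
```

Working with log Ψ(S) rather than Ψ(S) is a design choice of this sketch: the product over M hidden units can span many orders of magnitude, and the logarithm keeps the evaluation numerically stable.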
[…] with respect to the network weights W. In the stochastic setting, this is achieved with an iterative scheme. At each iteration k, a Monte Carlo sampling of |Ψ_M(S; W_k)|² is realized for a given set of parameters W_k. At the same time, stochastic estimates of the energy gradient are obtained. These are then used to propose a next set of weights W_{k+1} with an improved gradient-descent optimization (32). The overall computational cost of this approach is comparable to that of standard ground-state QMC simulations (see supplementary materials).

Fig. 3. Finding the many-body ground-state energy with neural-network quantum states (NQS). The error of the NQS ground-state energy relative to the exact value is shown for several test cases. Arbitrary precision on the ground-state energy can be obtained upon increasing the hidden-unit density α. (A) Accuracy for the 1D TFI model, at a few values of the field strength h and for an 80-spin chain with periodic boundary conditions (PBCs). Points below 10^−8 are not shown to enhance readability. (B) Accuracy for the 1D AFH model, for an 80-spin chain with PBCs, compared with the Jastrow ansatz (horizontal dashed line). (C) Accuracy for the AFH model on a 10-by-10 square lattice with PBCs, compared with the precision obtained by EPS [upper dashed line (35)] and PEPS [lower dashed line (36)]. For all cases […]

To validate our scheme, we consider the problem of finding the ground state of two prototyp[…] and

$$H_{\mathrm{AFH}} = \sum_{\langle i,j \rangle} \left( \sigma^x_i \sigma^x_j + \sigma^y_i \sigma^y_j + \sigma^z_i \sigma^z_j \right),$$

respectively, where σ^x, σ^y, and σ^z are Pauli matrices.
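To make the AFH Hamiltonian above concrete, the following Python sketch builds it as a dense matrix on a small ring with periodic boundary conditions and obtains a reference ground-state energy by exact diagonalization. Exact diagonalization is a stand-in used here only because it is feasible for a handful of spins; it is not the stochastic scheme described in the text. The helper names (`site_operator`, `afh_ring`, `variational_energy`) and the choice of a Néel product state as a trial state are illustrative assumptions.

```python
import numpy as np

# Pauli matrices entering H_AFH.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def site_operator(op, site, n_sites):
    """Embed a single-site operator at `site` into the n_sites-spin space."""
    full = np.array([[1.0 + 0.0j]])
    for k in range(n_sites):
        full = np.kron(full, op if k == site else I2)
    return full


def afh_ring(n_sites):
    """Dense H_AFH on a ring: sum over nearest-neighbor bonds (i, i+1 mod N)."""
    dim = 2 ** n_sites
    ham = np.zeros((dim, dim), dtype=complex)
    for i in range(n_sites):
        j = (i + 1) % n_sites
        for op in (SX, SY, SZ):
            ham += site_operator(op, i, n_sites) @ site_operator(op, j, n_sites)
    return ham


def variational_energy(ham, psi):
    """Energy expectation <psi|H|psi> / <psi|psi> of an unnormalized state."""
    return float(np.real(np.vdot(psi, ham @ psi) / np.vdot(psi, psi)))


n_sites = 4
ham = afh_ring(n_sites)
e_exact = float(np.linalg.eigvalsh(ham)[0])   # lowest eigenvalue

# Deliberately crude trial state: the Neel product state |up, down, up, down>.
neel = np.zeros(2 ** n_sites, dtype=complex)
neel[0b0101] = 1.0
e_trial = variational_energy(ham, neel)

# Relative energy error (E_trial - E_exact) / |E_exact|.
rel_error = (e_trial - e_exact) / abs(e_exact)
print(e_exact, e_trial, rel_error)
```

Any trial state gives an energy at or above the exact ground-state value, which is why the relative energy error is a natural accuracy measure for a variational ansatz. The dense matrix has 2^N rows, so this check is limited to very small N.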
In the following, we consider the case of both one- and two-dimensional (1D and 2D) lattices with periodic boundary conditions (PBCs). In Fig. 2, we show the optimal network structure of the ground states of the two spin models for hidden-variable density α = 4 and with imposed translational symmetries. We find that each filter f = [1, …, α] learns specific correlation features emerging in the ground-state wave function. For example, in the 2D case (Fig. 2, rightmost panels) the neural network learns patterns corresponding to antiferromagnetic correlations. The general behavior of the NQS is completely analogous to that observed in convolutional neural networks, where different layers learn specific structures of the input data.

In Fig. 3, we show the accuracy of the NQS, quantified by the relative error on the ground-state energy Δ_rel = (E_NQS(α) − E_exact)/|E_exact|, for several values of α and model parameters. In Fig. 3A, we compare the variational NQS energies with the exact result obtained by the fermionization of the TFI model, on a 1D chain with PBCs. The most notable result is that NQS achieve a controllable and arbitrary accuracy that is compatible with a power-law behavior in α. The hardest-to-learn ground state is at the quantum critical point h = 1, where nonetheless a notable accuracy of one part per million can be easily achieved with a relatively modest density of hidden units. The same accuracy is obtained for the more complex 1D AFH model (Fig. 3B). In this case, we also observe a systematic drop in the ground-state energy error, which, for a small α = 4, attains the same high precision obtained for the TFI model at the critical point. The accuracy of our model is several orders of magnitude higher than the spin-Jastrow ansatz (dashed line in Fig. 3B). It is also interesting to compare the value of α with the MPS bond dimension M needed to reach the same level of accuracy. For example, on the AFH model with PBCs, we find that with a standard density matrix renormalization group (DMRG) implementation (33), we need M ~ 160 to reach the accuracy NQS have at α = 4. This points toward a more compact representation of the many-body state in the NQS case, which features about three orders of magnitude fewer variational parameters than the corresponding MPS ansatz.

We next studied the AFH model on a 2D square lattice (for a comparison with QMC results, see Fig. 3C) (34). As expected from entanglement considerations, the 2D case proves harder for the NQS. Nonetheless, we always find a systematic improvement of the variational energy upon increasing α, qualitatively similar to the 1D case. The increased difficulty of the problem is reflected in a slower convergence. We still obtain results at the level of existing state-of-the-art methods or better. In particular, with a relatively small hidden-unit density (α ~ 4), we already obtain results at the same level as the best-known variational ansatz for finite clusters [the entangled plaquette states (EPS) of (35) and the projected entangled pair states (PEPS) of (36)]. Further increasing α then leads to a sizable improvement and, consequently, yields the best variational results reported to date for this 2D model on finite lattices.

Fig. 4. Many-body unitary time evolution with NQS. NQS results (solid lines) for the time evolution induced by a quantum quench in the microscopic parameters of the models we study (the transverse field h for the TFI model and the coupling constant J_z in the AFH model) are shown. (A) Time-dependent transverse spin polarization in the TFI model, compared to exact results (dashed lines). (B) Time-dependent nearest-neighbors spin correlations in the AFH model, compared to exact numerical results obtained with t-DMRG (dashed lines). All results refer to 1D chains representative of the thermodynamic limit, with finite-size corrections smaller than the line widths.

Unitary dynamics
NQS are not limited to ground-state problems but can be extended to the time-dependent Schrödinger equation. For this purpose, we define complex-valued and time-dependent network weights W(t) that, at each time t, are trained to best reproduce the quantum dynamics, in the sense of the Dirac-Frenkel time-dependent variational
principle (37, 38). In this context, the variational residuals

$$R(\dot{W}(t)) = \mathrm{dist}\left( \partial_t \Psi,\; -iH\Psi \right)$$

are the objective functions to be minimized as a function of the time derivatives of the weights Ẇ(t) (see supplementary materials). In the stochastic framework, this is achieved by a time-dependent VMC method (25, 26), which samples |Ψ_M(S; W(t))|² at each time and provides the best stochastic estimate of the Ẇ(t) that minimizes R²(t), with a computational cost O(αN²). Once the time derivatives are determined, these can be conveniently used to obtain the full time evolution after time integration.

To demonstrate the effectiveness of the NQS in the dynamical context, we consider the unitary dynamics induced by quantum quenches in the […]

Many paths for research can be envisaged in the near future. For example, the most recent advances in machine learning, like deep network architectures and convolutional neural networks, can constitute the basis of more advanced NQS and therefore have the potential for increasing their expressive power. Furthermore, the extension of our approach to treat quantum systems other than interacting spins is, in principle, straightforward. In this respect, applications to answer the most challenging questions concerning interacting fermions in two dimensions can already be anticipated. Finally, at variance with tensor network states, the NQS feature intrinsically nonlocal correlations, which can lead to substantially more compact representations of many-body quantum states. A formal analysis of the NQS entanglement properties might therefore bring about substantially new concepts in quan[…]

27. A. N. Kolmogorov, Dokl. Akad. Nauk SSSR 108, 179–182 (1961).
28. K. Hornik, Neural Netw. 4, 251–257 (1991).
29. N. Le Roux, Y. Bengio, Neural Comput. 20, 1631–1649 (2008).
30. K. Sohn, H. Lee, “Learning invariant representations with local transformations,” in Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, 26 June to 1 July 2012 (Omnipress, 2012), pp. 1311–1318.
31. M. Norouzi, M. Ranjbar, G. Mori, “Stacks of convolutional restricted Boltzmann machines for shift-invariant feature learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 20 to 25 June 2009 (IEEE, 2009), pp. 2735–2742.
32. S. Sorella, M. Casula, D. Rocca, J. Chem. Phys. 127, 014105 (2007).
33. M. Dolfi et al., Comput. Phys. Commun. 185, 3430–3440 (2014).
34. A. W. Sandvik, Phys. Rev. B 56, 11678–11690 (1997).
35. F. Mezzacapo, N. Schuch, M. Boninsegni, J. I. Cirac, New J. Phys. 11, 083026 (2009).
36. M. Lubasch, J. I. Cirac, M.-C. Bañuls, Phys. Rev. B 90, 064425 (2014).
http://science.sciencemag.org/content/355/6325/602
Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week
in December, by the American Association for the Advancement of Science, 1200 New York
Avenue NW, Washington, DC 20005. Copyright 2016 by the American Association for the
Advancement of Science; all rights reserved. The title Science is a registered trademark of AAAS.