SciPost Submission Page
Accuracy of Restricted Boltzmann Machines for the one-dimensional $J_1-J_2$ Heisenberg model
by Luciano Loris Viteritti, Francesco Ferrari, Federico Becca
This is not the latest submitted version.
This Submission thread is now published as
Submission summary
Authors (as registered SciPost users): | Francesco Ferrari · Luciano Loris Viteritti |
Submission information | |
---|---|
Preprint Link: | https://arxiv.org/abs/2202.07576v1 (pdf) |
Date submitted: | 2022-02-21 17:23 |
Submitted by: | Viteritti, Luciano Loris |
Submitted to: | SciPost Physics |
Ontological classification | |
---|---|
Academic field: | Physics |
Specialties: |
|
Approach: | Computational |
Abstract
Neural networks have been recently proposed as variational wave functions for quantum many-body systems [G. Carleo and M. Troyer, Science 355, 602 (2017)]. In this work, we focus on a specific architecture, known as Restricted Boltzmann Machine (RBM), and analyse its accuracy for the spin-1/2 $J_1-J_2$ antiferromagnetic Heisenberg model in one spatial dimension. The ground state of this model has a non-trivial sign structure, especially for $J_2/J_1>0.5$, forcing us to work with complex-valued RBMs. Two variational Ansätze are discussed: one defined through a fully complex RBM, and one in which two different real-valued networks are used to approximate modulus and phase of the wave function. In both cases, translational invariance is imposed by considering linear combinations of RBMs, giving access also to the lowest-energy excitations at fixed momentum $k$. We perform a systematic study on small clusters to evaluate the accuracy of these wave functions in comparison to exact results, providing evidence for the supremacy of the fully complex RBM. Our calculations show that this kind of Ansätze is very flexible and describes both gapless and gapped ground states, also capturing the incommensurate spin-spin correlations and low-energy spectrum for $J_2/J_1>0.5$. The RBM results are also compared to the ones obtained with Gutzwiller-projected fermionic states, often employed to describe quantum spin models [F. Ferrari, A. Parola, S. Sorella and F. Becca, Phys. Rev. B 97, 235103 (2018)]. Contrary to the latter class of variational states, the fully-connected structure of RBMs hampers the transferability of the wave function from small to large clusters, implying an increase of the computational cost with the system size.
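As an illustration of the construction described in the abstract, the following is a minimal sketch of a complex-valued RBM amplitude and of a momentum-$k$ linear combination over lattice translations. It is not the authors' code: the function names, the parameter scale, the random initialization, and the phase convention $e^{-ikr}$ are assumptions made for this example, and the amplitude uses the standard form $\Psi(\sigma) = e^{\sum_i a_i \sigma_i} \prod_j 2\cosh(b_j + \sum_i W_{ji}\sigma_i)$.

```python
import numpy as np

def rbm_amplitude(sigma, a, b, W):
    """Unsymmetrized complex RBM amplitude for spins sigma with entries +1/-1."""
    theta = b + W @ sigma                      # complex input to each hidden unit
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

def symmetrized_amplitude(sigma, a, b, W, k):
    """Linear combination of translated RBM amplitudes with phases exp(-i k r).

    Assumed convention: the translation by r sites is np.roll(sigma, r).
    """
    n_sites = len(sigma)
    return sum(
        np.exp(-1j * k * r) * rbm_amplitude(np.roll(sigma, r), a, b, W)
        for r in range(n_sites)
    )

# Hypothetical setup: 8 sites, hidden-to-visible ratio alpha = 2.
rng = np.random.default_rng(0)
n_sites, alpha = 8, 2
n_hidden = alpha * n_sites
scale = 0.1
a = scale * (rng.normal(size=n_sites) + 1j * rng.normal(size=n_sites))
b = scale * (rng.normal(size=n_hidden) + 1j * rng.normal(size=n_hidden))
W = scale * (rng.normal(size=(n_hidden, n_sites))
             + 1j * rng.normal(size=(n_hidden, n_sites)))

sigma = rng.choice([-1, 1], size=n_sites)
k = 2.0 * np.pi * 3 / n_sites                  # an allowed momentum on 8 sites

psi = symmetrized_amplitude(sigma, a, b, W, k)
psi_shifted = symmetrized_amplitude(np.roll(sigma, 1), a, b, W, k)
```

With this convention, translating the configuration by one site multiplies the symmetrized amplitude by $e^{ik}$, which is what makes the state an eigenstate of the translation operator at fixed momentum.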
Reports on this Submission
Report #2 by Anonymous (Referee 3) on 2022-3-25 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2202.07576v1, delivered 2022-03-25, doi: 10.21468/SciPost.Report.4771
Report
Developing accurate variational methods is one of the central challenges in computational science and physics. Recently, Carleo and Troyer introduced variational wave functions based on artificial neural networks. Given that this research field is still at an early stage, it is an important task to perform systematic benchmark calculations to confirm the accuracy of neural-network variational Ansätze.
In the paper, the authors systematically investigate the accuracy of the restricted Boltzmann machine (RBM) wave function for the spin 1/2 J1-J2 Heisenberg model in one spatial dimension.
First, the authors show that complex RBM (cRBM) performs better than phase-modulus RBM (pmRBM). Then, the study focuses on the cRBM.
The authors pay attention to the sign structure of the wave function. They then show that the Marshall-sign rule helps the optimization, especially when J2 is small. The best cRBM results show better accuracy in the calculation of the ground state, both in energy and correlation functions, compared to the Gutzwiller-projected fermionic states, albeit with a much larger number of variational parameters.
By considering linear combinations of cRBMs, excited states can also be investigated (the accuracy of the ground-state is also improved). In this paper, it is shown that cRBMs give highly accurate results for excited states as well.
Finally, the authors discuss size consistency and several possibilities for improving size-consistent behavior.
The paper is clearly written, and the accuracy of the RBM variational ansatz is carefully and systematically investigated. I believe that the present work is one of the important pieces of recent intensive investigations of neural-network quantum states. Thus, I recommend that this paper be accepted for publication in SciPost.
Below, I list several points (all are minor).
In Fig. 6, the definition of N_MC is not very clear. In particular, what is the difference between N_opt in Fig. 4 and N_MC in Fig. 6?
In Fig. 8 and on page 12, the overlap $\langle \Psi_0 | \Psi_{\rm cRBM} \rangle$ should be a complex number. Do the authors show the absolute value?
According to the inset of Fig. 11 right panel, the relative error of the pBCS at k=0 is smaller than that of the cRBM. However, looking at the right panel, the error of the pBCS appears to be considerably larger than that of the cRBM.
Report #1 by Anonymous (Referee 4) on 2022-3-17 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2202.07576v1, delivered 2022-03-17, doi: 10.21468/SciPost.Report.4716
Strengths
1- Very timely
2- Well and clearly written
3- Thorough performance analysis of restricted Boltzmann machines in a well-studied example
4- Good discussion of advantages and drawbacks of RBMs and comparison to other numerical methods
Weaknesses
1- No new physics
2- No comparison with DMRG attempted here, DMRG being the state-of-the-art method.
Report
Neural networks have emerged as a novel route to study quantum phases
of many-body systems. Here, specifically, the authors are interested in
evaluating the performance of so-called restricted Boltzmann machines (RBM) in characterizing ground-state properties of quantum magnets. Since
the sign structure of the ground state wave function of frustrated
systems hampers the applicability of QMC techniques, the authors decide
to study a one-dimensional model of frustrated magnetism, the J1-J2 chain.
The sign structure necessitates the use of complex-valued RBMs, for which
two Ansätze are compared. A fully complex network is shown to be superior.
Moreover, translational symmetry is enforced via projection operators
and a certain sign structure can either be pre-imposed on the RBM
or not.
The authors compute variational energies, spin correlation functions
and the energies of momentum-resolved excited states. These results
are compared to those of Lanczos calculations (which can be considered
an exact benchmark) and a Gutzwiller type variational ansatz (pBCS).
The authors demonstrate good quantitative agreement of the RBM results with exact diagonalization.
They discuss the important question of scaling up the simulations to larger
systems where the considered RBMs do not appear to be a promising route. The
physically motivated variational ansatz, however, can be extended to larger
systems in a straightforward manner. Alternative neural networks are identified that may be better suited for larger systems. Moreover, the variational parameters of the RBMs lack a physical interpretation. Further, the number of variational parameters for the pBCS (projected Gutzwiller) is much lower than for the RBMs.
The paper further contains very interesting results on the behavior of the
RBMs during the training phase, e.g., for the phases or the average sign.
The paper is well written and accessible, also to people who do not work
with neural networks/machine learning. It does not provide new insights into
the physics of the studied model, but the performance of the RBMs is very
carefully evaluated and discussed. It appears that RBMs are not the most
promising route for frustrated magnets. The technical details of training
the networks will presumably be of interest to machine-learning practitioners.
Overall, I conclude that this is certainly very good research and publishable
science that will be of interest to its target community. The fact that RBMs are very critically evaluated is an important piece of information; follow-up work
may be triggered along the directions laid out in the conclusions section.
Requested changes
Necessary revisions:
1- The Gutzwiller projected wave functions are mentioned in the abstract,
but neither in the introduction nor in the methods section. I may have missed it,
but the acronym is never defined. The authors should add a paragraph on this method
in the Methods section, make sure that acronyms are properly introduced and
mention the method in the introduction as well.
2- Please correct a few typos: (page 1) ".. wve functions has been defined ...",
(page 14): "trasparent"
Some optional questions:
3- Could the authors make a more definite statement about the usefulness
of RBMs for 2d frustrated quantum magnets?
4- The method of choice for 1D quantum magnets is still DMRG. Are there
any prospects of RBMs or other neural networks becoming competitive for models of frustrated magnetism?
Author: Luciano Loris Viteritti on 2022-04-15 [id 2388]
(in reply to Report 1 on 2022-03-17)
We thank the referee for her/his positive report. In the following we reply to the requested changes:
- The acronym is defined at the beginning of section 3, just before subsection 3.1, and the definition of Gutzwiller-projected wave functions is reported in the Appendix. The acronym is not used before its definition. The reason for not describing Gutzwiller-projected states in the body of the paper is that they are only used as a comparison for RBM states.
- We corrected the typos.
- 3 & 4. We added a couple of sentences in the introduction to comment on these points. As far as the first question is concerned, generic neural-network states can, in principle, describe quantum systems in arbitrary dimension. However, from a practical point of view, higher values of the complexity α could be needed to reach the same accuracy as in the one-dimensional case. Concerning the second question, DMRG represents the best method to solve quantum problems in one dimension, while in two or more dimensions its accuracy deteriorates. In recent years, there have been several attempts to improve both DMRG (by considering tensor-network states) and RBMs (by defining more refined architectures). Today, the latter have reached an accuracy that, in several cases, is competitive with DMRG [see for example K. Choo, T. Neupert, G. Carleo, Phys. Rev. B 100 (12), 125124 (2019)].
Author: Luciano Loris Viteritti on 2022-04-15 [id 2389]
(in reply to Report 2 on 2022-03-25)
We thank the referee for her/his positive report. Here we reply to the minor points: