SciPost Submission Page

Accuracy of Restricted Boltzmann Machines for the one-dimensional $J_1-J_2$ Heisenberg model

by Luciano Loris Viteritti, Francesco Ferrari, Federico Becca

This is not the latest submitted version.

This Submission thread is now published as

Submission summary

Authors (as registered SciPost users): Francesco Ferrari · Luciano Loris Viteritti
Submission information
Preprint Link:  (pdf)
Date submitted: 2022-02-21 17:23
Submitted by: Viteritti, Luciano Loris
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
  • Condensed Matter Physics - Computational
Approach: Computational


Neural networks have been recently proposed as variational wave functions for quantum many-body systems [G. Carleo and M. Troyer, Science 355, 602 (2017)]. In this work, we focus on a specific architecture, known as Restricted Boltzmann Machine (RBM), and analyse its accuracy for the spin-1/2 $J_1-J_2$ antiferromagnetic Heisenberg model in one spatial dimension. The ground state of this model has a non-trivial sign structure, especially for $J_2/J_1>0.5$, forcing us to work with complex-valued RBMs. Two variational Ansätze are discussed: one defined through a fully complex RBM, and one in which two different real-valued networks are used to approximate modulus and phase of the wave function. In both cases, translational invariance is imposed by considering linear combinations of RBMs, giving access also to the lowest-energy excitations at fixed momentum $k$. We perform a systematic study on small clusters to evaluate the accuracy of these wave functions in comparison to exact results, providing evidence for the supremacy of the fully complex RBM. Our calculations show that this kind of Ansatz is very flexible and describes both gapless and gapped ground states, also capturing the incommensurate spin-spin correlations and low-energy spectrum for $J_2/J_1>0.5$. The RBM results are also compared to the ones obtained with Gutzwiller-projected fermionic states, often employed to describe quantum spin models [F. Ferrari, A. Parola, S. Sorella and F. Becca, Phys. Rev. B 97, 235103 (2018)]. Contrary to the latter class of variational states, the fully-connected structure of RBMs hampers the transferability of the wave function from small to large clusters, implying an increase of the computational cost with the system size.

Current status:
Has been resubmitted

Reports on this Submission

Anonymous Report 2 on 2022-3-25 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2202.07576v1, delivered 2022-03-25, doi: 10.21468/SciPost.Report.4771


Developing accurate variational methods is one of the central challenges in computational science and physics. Recently, Carleo and Troyer introduced variational wave functions based on artificial neural networks. Given that this research field is still at an early stage, it is an important task to perform systematic benchmark calculations to confirm the accuracy of neural-network variational Ansätze.

In the paper, the authors systematically investigate the accuracy of the restricted Boltzmann machine (RBM) wave function for the spin 1/2 J1-J2 Heisenberg model in one spatial dimension.

First, the authors show that complex RBM (cRBM) performs better than phase-modulus RBM (pmRBM). Then, the study focuses on the cRBM.
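To make the distinction between the two Ansätze concrete, here is a minimal sketch, not the authors' implementation. It assumes the standard RBM form of Carleo and Troyer, $\log\Psi(\sigma) = \sum_i a_i\sigma_i + \sum_j \log[2\cosh(b_j + \sum_i W_{ji}\sigma_i)]$, and assumes that the phase-modulus state is built as $\exp[A(\sigma) + i\Phi(\sigma)]$ with two real RBMs; the function names are hypothetical.

```python
import numpy as np

def rbm_log(sigma, a, b, W):
    """a.sigma + sum_j log(2 cosh(b_j + (W sigma)_j)) for spins sigma = +/-1."""
    theta = b + W @ sigma
    return a @ sigma + np.sum(np.log(2.0 * np.cosh(theta)))

def log_psi_crbm(sigma, a, b, W):
    """Fully complex RBM: a, b and W are complex arrays (assumed form)."""
    return rbm_log(sigma, a, b, W)

def log_psi_pmrbm(sigma, mod_params, phase_params):
    """Two real RBMs: one for log|psi|, one for the phase (assumed form)."""
    return rbm_log(sigma, *mod_params) + 1j * rbm_log(sigma, *phase_params)
```

In the cRBM a single set of complex parameters determines modulus and phase together, whereas in the pmRBM the two are parametrized independently.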

The authors pay attention to the sign structure of the wave function. They then show that the Marshall-sign rule helps the optimization, especially when J2 is small. The best cRBM results show better accuracy in the calculation of the ground state, both in energy and correlation functions, compared to the Gutzwiller-projected fermionic states, albeit with a much larger number of variational parameters.
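For reference, the Marshall sign rule assigns to each spin configuration the sign $(-1)^{N_\uparrow(A)}$, where $N_\uparrow(A)$ is the number of up spins on one sublattice; it gives the exact ground-state signs of the unfrustrated chain ($J_2 = 0$). The sketch below takes sublattice $A$ to be the even sites and encodes spins as $\pm 1$; both choices, and the function name, are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def marshall_sign(sigma):
    """(-1)^(number of up spins on even sites); sigma entries are +1 or -1."""
    sigma = np.asarray(sigma)
    n_up_even = int(np.sum(sigma[::2] == 1))
    return -1 if n_up_even % 2 else 1
```

Multiplying a variational amplitude by this factor is one way to pre-impose the sign structure, leaving the network to learn only the deviations from it.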

By considering linear combinations of cRBMs, excited states can also be investigated (the accuracy of the ground-state is also improved). In this paper, it is shown that cRBMs give highly accurate results for excited states as well.
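A linear combination with fixed momentum can be sketched as $\Psi_k(\sigma) = \sum_{r=0}^{N-1} e^{-ikr}\,\psi(T_r\sigma)$, where $T_r$ translates the configuration by $r$ sites. This is a standard construction and only an assumption about the form used in the paper; the sign convention of the phase and the names below are hypothetical, and `psi` stands for any amplitude function.

```python
import numpy as np

def project_momentum(psi, sigma, k):
    """Sum over r of exp(-i k r) * psi(T_r sigma), with periodic boundaries."""
    sigma = np.asarray(sigma)
    n = sigma.size
    return sum(np.exp(-1j * k * r) * psi(np.roll(sigma, r)) for r in range(n))
```

For $k$ a multiple of $2\pi/N$, translating the input by one site multiplies the result by $e^{ik}$, which is the defining property of a momentum eigenstate. Each evaluation costs $N$ evaluations of `psi`.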

Finally, the authors discuss size consistency and several possibilities to improve size-consistent behavior.

The paper is clearly written, and the accuracy of the RBM variational ansatz is carefully and systematically investigated. I believe that the present work is one of the important pieces of recent intensive investigations of neural-network quantum states. Thus, I recommend that this paper be accepted for publication in SciPost.

Below, I list several points (all are minor).

In Fig. 6, the definition of N_MC is not very clear. In particular, what is the difference between N_opt in Fig. 4 and N_MC in Fig. 6?

In Fig. 8 and on page 12, the overlap < Psi_0 | Psi_cRBM > should be a complex number. Do the authors show the absolute value?
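The point can be illustrated with a short sketch: the overlap of two states changes by a phase factor when either state is multiplied by a global phase, so only its modulus is well defined. The sketch assumes that both wave functions are available as full amplitude vectors, which is feasible only on small clusters; the function name is hypothetical.

```python
import numpy as np

def overlap_modulus(psi_exact, psi_var):
    """|<psi_exact|psi_var>| for two amplitude vectors, normalized internally."""
    a = np.asarray(psi_exact, dtype=complex)
    b = np.asarray(psi_var, dtype=complex)
    # np.vdot conjugates its first argument, as required for <a|b>.
    return abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))
```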

According to the inset in the right panel of Fig. 11, the relative error of the pBCS at k=0 is smaller than that of the cRBM. However, in the main plot of the right panel, the error of the pBCS appears to be considerably larger than that of the cRBM.

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -

Author:  Luciano Loris Viteritti  on 2022-04-15  [id 2389]

(in reply to Report 2 on 2022-03-25)

We thank the referee for her/his positive report. Here we reply to the minor points:

  1. One Monte Carlo step consists of $O(N)$ Metropolis moves (two-spin flips). With fixed variational parameters, we perform $O(10^3)$ Monte Carlo steps and then update the variational parameters. Therefore, an optimization step corresponds to $O(N \times 10^3)$ Metropolis moves. We added a sentence to clarify this point and improved the captions of the figures.
  2. Yes, we thank the referee for pointing out this issue. We corrected it in the text.
  3. In the main panel of Fig. 11, we report the variational gap, which is defined as $\Delta E_k = E_k - E_0$, where both energies are variational. In the inset, we report the accuracy of $E_k$, namely $\varepsilon_{rel}$. A lower value of $\varepsilon_{rel}$ does not necessarily imply a better accuracy of the gap, because the latter also depends on the accuracy of $E_0$. At $k = 0$, the accuracy of the pBCS excitation is much higher than that of the pBCS ground state, giving a very small $\varepsilon_{rel}$ but a relatively inaccurate gap.
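The counting in point 1 can be sketched as follows. This is only an illustration under stated assumptions: the pair of sites is drawn uniformly at random, only antiparallel pairs are flipped (so that the total $S^z$ is conserved), and `log_psi` is a hypothetical function returning $\log\Psi(\sigma)$; the reply specifies none of these details.

```python
import numpy as np

def mc_step(sigma, log_psi, rng):
    """One Monte Carlo step: N attempted two-spin flips with Metropolis acceptance."""
    n = sigma.size
    for _ in range(n):
        i, j = rng.choice(n, size=2, replace=False)
        if sigma[i] == sigma[j]:
            continue  # flipping two equal spins would change the total S^z
        proposal = sigma.copy()
        proposal[i], proposal[j] = sigma[j], sigma[i]
        # Accept with probability min(1, |psi(new) / psi(old)|^2).
        log_ratio = 2.0 * np.real(log_psi(proposal) - log_psi(sigma))
        if rng.random() < np.exp(min(0.0, log_ratio)):
            sigma = proposal
    return sigma
```

Calling `mc_step` about $10^3$ times between two parameter updates gives the $O(N \times 10^3)$ attempted moves per optimization step quoted above.

For point 3, writing the exact energies as $E_k^{\rm exact}$ and $E_0^{\rm exact}$ (notation introduced here), the error of the variational gap is the difference of the two energy errors:

$$\Delta E_k - \Delta E_k^{\rm exact} = \left(E_k - E_k^{\rm exact}\right) - \left(E_0 - E_0^{\rm exact}\right).$$

A very accurate $E_k$ combined with a less accurate $E_0$ therefore leaves the error of $E_0$ uncancelled in the gap.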

Anonymous Report 1 on 2022-3-17 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2202.07576v1, delivered 2022-03-17, doi: 10.21468/SciPost.Report.4716


1- Very timely
2- Well and clearly written
3- Thorough performance analysis of restricted Boltzmann machines in a well-studied example
4- Good discussion of advantages and drawbacks of RBMs and comparison to other numerical methods


1- No new physics
2- No comparison with DMRG attempted here, DMRG being the state-of-the-art method.


Neural networks have emerged as a novel route to study quantum phases of many-body systems. Here, specifically, the authors are interested in evaluating the performance of so-called restricted Boltzmann machines (RBMs) in characterizing ground-state properties of quantum magnets. Since the sign structure of the ground-state wave function of frustrated systems hampers the applicability of QMC techniques, the authors decide to study a one-dimensional model of frustrated magnetism, the J1-J2 chain.

The sign structure necessitates the use of complex-valued RBMs, for which two Ansätze are compared. A fully complex network is shown to be superior. Moreover, translational symmetry is enforced via projection operators, and a certain sign structure can either be pre-imposed on the RBM or not.

The authors compute variational energies, spin correlation functions and the energies of momentum-resolved excited states. These results are compared to those of Lanczos calculations (which can be considered an exact benchmark) and a Gutzwiller-type variational ansatz (pBCS). The authors demonstrate good quantitative agreement of the RBM results with exact diagonalization.

They discuss the important question of scaling up the simulations to larger systems, where the considered RBMs do not appear to be a promising route. The physically motivated variational ansatz, however, can be extended to larger systems in a straightforward manner. Alternative neural networks are identified that may be better suited for larger systems. Moreover, the variational parameters of the RBMs lack a physical interpretation. Further, the number of variational parameters for the pBCS (projected Gutzwiller) is much lower than for the RBMs.

The paper further contains very interesting results on the behavior of the RBMs during the training phase, e.g., for the phases or the average sign.

The paper is well written and accessible, also to people who do not work with neural networks/machine learning. It does not provide new insights into the physics of the studied model, but the performance of the RBMs is very carefully evaluated and discussed. It appears that RBMs are not the most promising route for frustrated magnets. The technical details of training the networks will presumably be of interest to machine-learning practitioners.

Overall, I conclude that this is certainly very good research and publishable science that will be of interest to its target community. The fact that RBMs are very critically evaluated is an important piece of information; follow-up work may be triggered along the directions laid out in the conclusions section.

Requested changes

Necessary revisions:

1- The Gutzwiller-projected wave functions are mentioned in the abstract, but neither in the introduction nor in the Methods section. I may have missed it, but the acronym is never defined. The authors should add a paragraph on this method in the Methods section, make sure that acronyms are properly introduced, and mention the method in the introduction as well.

2- Please correct a few typos: (page 1) ".. wve functions has been defined ...",
(page 14): "trasparent"

Some optional questions:

3- Could the authors make a more definite statement about the usefulness of RBMs for 2D frustrated quantum magnets?

4- The method of choice for 1D quantum magnets is still DMRG. Are there
any prospects of RBMs or other neural networks becoming competitive for models of frustrated magnetism?

  • validity: top
  • significance: high
  • originality: good
  • clarity: high
  • formatting: perfect
  • grammar: good

Author:  Luciano Loris Viteritti  on 2022-04-15  [id 2388]

(in reply to Report 1 on 2022-03-17)

We thank the referee for her/his positive report. In the following, we reply to the requested changes:

  1. The acronym is defined at the beginning of section 3, just before subsection 3.1, and the definition of Gutzwiller-projected wave functions is reported in the Appendix. The acronym is not used before its definition. We do not describe Gutzwiller-projected states in the body of the paper because they are used only as a comparison for RBM states.
  2. We corrected the typos.
  3. & 4. We added a couple of sentences in the introduction to comment on these points. As far as the first question is concerned, generic neural-network states can, in principle, describe quantum systems in arbitrary dimension. However, from a practical point of view, higher values of the complexity α could be needed to reach the same accuracy as in the one-dimensional case. Concerning the second question, DMRG represents the best method to solve quantum problems in one dimension, while in two or more dimensions its accuracy deteriorates. In recent years, there have been several attempts to improve both DMRG (by considering tensor-network states) and RBMs (by defining more refined architectures). Today, the latter have reached an accuracy that, in several cases, is competitive with DMRG [see for example K. Choo, T. Neupert, G. Carleo Phys. Rev. B 100 (12), 125124 (2019)].
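To give a sense of what a higher complexity α costs, the following sketch counts the parameters of a fully connected RBM. It assumes that α is the ratio of hidden to visible units, as in Carleo and Troyer, and that no symmetry is used to reduce the number of independent weights; both are assumptions, and the function name is hypothetical.

```python
def rbm_parameter_count(n_sites, alpha, visible_bias=True):
    """Number of (complex) parameters of a fully connected RBM with alpha * N hidden units."""
    n_hidden = alpha * n_sites
    n_params = n_hidden + n_hidden * n_sites  # hidden biases + weights
    if visible_bias:
        n_params += n_sites
    return n_params
```

At fixed α the count grows as $\alpha N^2$, so doubling the number of sites roughly quadruples the number of parameters to optimize.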