SciPost Submission Page
Learning the ground state of a non-stoquastic quantum Hamiltonian in a rugged neural network landscape
by Marin Bukov, Markus Schmitt, Maxime Dupont
As Contributors: Marin Bukov · Markus Schmitt
arXiv Link: https://arxiv.org/abs/2011.11214v2 (pdf)
Date submitted: 2021-04-13 08:51
Submitted by: Bukov, Marin
Submitted to: SciPost Physics
Strongly interacting quantum systems described by non-stoquastic Hamiltonians exhibit rich low-temperature physics. Yet, their study poses a formidable challenge, even for state-of-the-art numerical techniques. Here, we investigate systematically the performance of a class of universal variational wave-functions based on artificial neural networks, by considering the frustrated spin-$1/2$ $J_1-J_2$ Heisenberg model on the square lattice. Focusing on neural network architectures without physics-informed input, we argue in favor of using an ansatz consisting of two decoupled real-valued networks, one for the amplitude and the other for the phase of the variational wavefunction. By introducing concrete mitigation strategies against inherent numerical instabilities in the stochastic reconfiguration algorithm we obtain a variational energy comparable to that reported recently with neural networks that incorporate knowledge about the physical system. Through a detailed analysis of the individual components of the algorithm, we conclude that the rugged nature of the energy landscape constitutes the major obstacle in finding a satisfactory approximation to the ground state wavefunction, and prevents learning the correct sign structure. In particular, we show that in the present setup the neural network expressivity and Monte Carlo sampling are not primary limiting factors.
Published as SciPost Phys. 10, 147 (2021)
Author comments upon resubmission
Thank you for considering our manuscript for review. We have revised the manuscript, taking into consideration all points of critique raised by the referees.
We hope that with the revisions made, our work meets the publication criteria of SciPost.
List of changes
- clarified sentences throughout the text
- added new Fig 4 to explain the occurrence of a potential instability with holomorphic neural networks, and a corresponding paragraph as requested by Ref B
- rewrote Sec 4: (i) clearly stated its purpose in the introductory paragraph, (ii) visually simplified the text layout through bullet points, (iii) relegated extra information to a footnote, (iv) added a new figure, Fig 7 d, at the request of Ref B, with a corresponding discussion in the text, (v) formulated a clear take-home message for the section
- moved Fig 9 to Sec 5.2 as requested by Ref B
- re-organized Sec 6
- added Fig 14 to Sec 7.2 to provide a visualization to clarify the text
- added new Sec C3 in appendix
- added references as suggested by the referees and following private communication with colleagues working in the field
- responded to all points of critique raised by the referees
Reports on this Submission
Report 2 by Giuseppe Carleo on 2021-5-29 (Invited Report)
I thank the authors for having addressed my comments. I feel that the work is now publishable as it is.
Anonymous Report 1 on 2021-5-9 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2011.11214v2, delivered 2021-05-09, doi: 10.21468/SciPost.Report.2890
I have read the new version of the manuscript and the responses to my previous questions. I think that the presentation has been improved, even though it is not optimal. I still have a few points that are not completely clear:
1) According to the text (page 10), Fig. 2 was obtained with a single-layer NN. However, in App. H, it seems that the NN has 2 layers. Am I wrong?
2) I am not sure I understand the whole construction of Fig. 4: why are we looking at $z_i^{(1)}$? By the way, how many layers are there? In addition, I do not understand the location of the poles in Fig. 4: is there a way to obtain them?
3) I do not fully understand the discussion in Sec. 5: according to the results of Sec. 4, the 50% of configurations that do not obey the Marshall sign rule make only a tiny contribution to the ground-state wave function. It may be that this is the best that can be done with the chosen number of parameters and that, in order to improve the signs, a larger number of parameters is necessary. Could the authors comment on that?
In summary, I think that the paper is interesting and could be published once these additional points have been addressed.