SciPost Submission Page
Scalable Imaginary Time Evolution with Neural Network Quantum States
by Eimantas Ledinauskas, Egidijus Anisimovas
This is not the latest submitted version. This Submission thread has since been published.
Submission summary
Authors (as registered SciPost users): Egidijus Anisimovas · Eimantas Ledinauskas
Submission information
Preprint Link: https://arxiv.org/abs/2307.15521v2 (pdf)
Date submitted: 2023-08-02 05:54
Submitted by: Ledinauskas, Eimantas
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
Approaches: Theoretical, Computational
Abstract
The representation of a quantum wave function as a neural network quantum state (NQS) provides a powerful variational ansatz for finding the ground states of many-body quantum systems. Nevertheless, due to the complex variational landscape, traditional methods often employ the computation of the quantum geometric tensor, which complicates the optimization. We introduce an approach that bypasses the computation of the metric tensor and instead relies exclusively on first-order gradient descent with the Euclidean metric. This allows for the application of larger neural networks and the use of more standard optimization methods from other machine learning domains. Our approach leverages the principle of imaginary time evolution by constructing a target wave function derived from the Schrödinger equation and then training the neural network to approximate this target. Through iterative optimization, the approximated state converges progressively towards the ground state. We demonstrate the benefits of our method via numerical experiments with the 2D J1-J2 Heisenberg model, which showcase enhanced stability and energy accuracy in comparison to direct energy loss minimization. Importantly, our approach displays competitiveness with the well-established density matrix renormalization group method and with NQS optimization using stochastic reconfiguration.
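The scheme summarized in the abstract admits a compact illustration. The sketch below is not the authors' implementation: it is a minimal reading of the supervised imaginary-time step in JAX/optax, assuming a toy 4-site Heisenberg ring (in place of the 2D J1-J2 model), a small complex log-amplitude ansatz, and arbitrary hyperparameters. Real NQS calculations sample configurations stochastically rather than enumerating the Hilbert space, and the paper additionally uses an adaptive step size with the target held fixed over several updates, which this sketch omits.

```python
# Illustrative sketch only (not the authors' code): one possible reading of the
# supervised imaginary-time step on a tiny exact Hilbert space.
import jax
import jax.numpy as jnp
import optax

L = 4  # chain length; Hilbert-space dimension 2**L stays tiny

# Dense Heisenberg ring H = sum_i S_i . S_{i+1}, built from Pauli matrices.
sx = jnp.array([[0.0, 1.0], [1.0, 0.0]])
sy = jnp.array([[0.0, -1.0j], [1.0j, 0.0]])
sz = jnp.array([[1.0, 0.0], [0.0, -1.0]])
eye = jnp.eye(2)

def two_site(op, i, j):
    ops = [eye] * L
    ops[i], ops[j] = op, op
    out = ops[0]
    for o in ops[1:]:
        out = jnp.kron(out, o)
    return out

H = sum(
    0.25 * (two_site(sx, i, (i + 1) % L)
            + two_site(sy, i, (i + 1) % L)
            + two_site(sz, i, (i + 1) % L))
    for i in range(L)
)

dim = 2 ** L
# Spin configurations (+1/-1 per site) for every basis state, used as network input.
configs = jnp.array(
    [[1.0 - 2.0 * ((n >> k) & 1) for k in range(L)] for n in range(dim)]
)

def init_params(key):
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "w1": 0.1 * jax.random.normal(k1, (L, 16)),
        "w_re": 0.1 * jax.random.normal(k2, (16,)),
        "w_im": 0.1 * jax.random.normal(k3, (16,)),
    }

def log_psi(params, s):
    # Complex log-amplitude so the ansatz can represent sign structure.
    h = jnp.tanh(s @ params["w1"])
    return h @ params["w_re"] + 1.0j * (h @ params["w_im"])

def wavefunction(params):
    logs = jax.vmap(lambda s: log_psi(params, s))(configs)
    psi = jnp.exp(logs - jnp.max(logs.real))  # stabilised amplitudes
    return psi / jnp.linalg.norm(psi)

def euler_target(psi, dt):
    # One explicit Euler step of imaginary-time evolution, |phi> ~ (1 - dt*H)|psi>.
    phi = psi - dt * (H @ psi)
    return phi / jnp.linalg.norm(phi)

def loss(params, phi):
    # Negative log-fidelity between the network state and the fixed target.
    psi = wavefunction(params)
    return -jnp.log(jnp.abs(jnp.vdot(phi, psi)) ** 2 + 1e-12)

params = init_params(jax.random.PRNGKey(0))
optimizer = optax.adam(1e-2)
opt_state = optimizer.init(params)
dt = 0.05

for step in range(2000):
    # The target is recomputed every step here; the paper keeps it fixed over
    # several updates and adapts dt, which this sketch omits.
    phi = euler_target(wavefunction(params), dt)
    grads = jax.grad(loss)(params, phi)  # gradient w.r.t. the network parameters only
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)

psi = wavefunction(params)
print("variational energy:", float(jnp.vdot(psi, H @ psi).real))
```

The fixed point of this loop requires the network state to satisfy psi proportional to (1 - dt H) psi, i.e. to be an eigenstate, which is why repeated supervised steps drive the variational state towards the ground state.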
Reports on this Submission
Report #1 by Anonymous (Referee 3) on 2023-10-02 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2307.15521v2, delivered 2023-10-02, doi: 10.21468/SciPost.Report.7880
Strengths
The paper contains a good analysis of a supervised approach to finding ground states that does not require computing the quantum geometric tensor. A connection between the gradient of the overlap and the standard energy gradient is presented, which is also of general interest.
Weaknesses
1. The presentation conveys the idea that the main technique presented in the manuscript (supervised learning of one step of imaginary-time evolution) is new. However, this is not the case (see Report), and the main novelty is instead the adaptive step and keeping the target state fixed for a few steps. The manuscript should reflect this.
Report
I would recommend publication of this work in SciPost, since the work contains a nice analysis of the supervised (approximate) imaginary-time evolution and introduces an interesting adaptive method.
However, I believe the paper should be rewritten in key places in order to convey more clearly that the key novelty here is the adaptive scheme and not the supervised approach. To be clear, this fact, by itself, does not diminish the importance of the work, but must be addressed in the rewriting phase.
Requested changes
1. As mentioned in the report, the authors should clarify that the supervised learning of the imaginary-time evolution (or of the power-method / Euler approximation) was pre-existing, and that the main novelty here is the adaptive step / keeping the target fixed for a few steps. The authors do briefly acknowledge the existence of previous works exploring the same idea: "We observe that several studies have relied on similar techniques to the one presented in this section within the context of unitary dynamics [28,47–49], as well as in non-unitary dynamics or power iteration [50,51]." However, this mention is absent from both the abstract and the introduction. Also, it is not mentioned that the novelty is the adaptive step.
In the introduction they also write: "In the present paper, we propose a ground-state determination ... " They also mention that the method is inspired by Ref. 28; however, the more directly related methods are arguably Refs. 50 and 51, as well as H. Atanasova et al., Nature Communications 14 (2023), doi:10.1038/s41467-023-39244-4.
2. The authors compare their approach to standard gradient descent based on the gradient of the energy and see a nice improvement. Do the authors use standard SGD or a better optimizer such as Adam? It would be important to include at least something beyond bare SGD.
3. Related to the previous point, and in order to show that their method converges to an energy at least similar to what is obtained by using the more commonly adopted stochastic reconfiguration (SR), it would be important that the authors add an SR-based curve in Figure 3(b). SR is implemented in standard open-source packages (NetKet, for example; a minimal example setup is sketched after this list) and does not require major implementation steps on their side.
4. When discussing the comparison to DMRG, it is important to mention that state-of-the-art optimization techniques and architectures can significantly outperform DMRG, also beyond 8x8. I would encourage the authors, for example, to consider these works: https://arxiv.org/pdf/2302.01941.pdf, https://arxiv.org/abs/2104.05085, https://arxiv.org/abs/2301.06788, and Ref. 66, also considering that Refs. 56 and 71 are rather "old" and do not reflect the state of the art anymore.
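As a pointer for requested change 3: an SR baseline of the kind described above can indeed be set up with a few lines of NetKet 3. The snippet below is only an illustrative sketch, not the configuration used in the paper; it assumes a nearest-neighbour Heisenberg model on a 4x4 lattice as a stand-in for the frustrated J1-J2 Hamiltonian, an RBM ansatz, and arbitrary hyperparameters, and the call signatures should be checked against the current NetKet documentation.

```python
import netket as nk

# 4x4 square lattice with periodic boundaries; a nearest-neighbour Heisenberg
# model stands in here for the J1-J2 Hamiltonian of the paper.
g = nk.graph.Hypercube(length=4, n_dim=2, pbc=True)
hi = nk.hilbert.Spin(s=1 / 2, N=g.n_nodes)
H = nk.operator.Heisenberg(hilbert=hi, graph=g)

# Simple RBM ansatz and an exchange sampler that preserves total magnetization.
model = nk.models.RBM(alpha=1)
sampler = nk.sampler.MetropolisExchange(hi, graph=g)
vstate = nk.vqs.MCState(sampler, model, n_samples=1024)

# Plain SGD preconditioned with stochastic reconfiguration (SR).
optimizer = nk.optimizer.Sgd(learning_rate=0.01)
sr = nk.optimizer.SR(diag_shift=0.01)

driver = nk.driver.VMC(H, optimizer, variational_state=vstate, preconditioner=sr)
driver.run(n_iter=300, out="sr_baseline")  # writes the energy history to disk
```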
Author: Eimantas Ledinauskas on 2023-10-17 [id 4042]
(in reply to Report 1 on 2023-10-02)
Thank you for your comments and the suggested papers.
We have made revisions to the third paragraph of the introduction and have also adjusted the abstract and conclusions to clearly acknowledge that the concept of supervised learning of imaginary time evolution was pre-existing. In addition, we made minor modifications throughout the text to ensure consistency.
Regarding the comment about SGD and Adam, we already employ Adam, as indicated in Sec. 4.2. To enhance its visibility, we have added additional references to Adam in other sections.
In response to your suggestion, we have incorporated the SR curve into Fig. 3(b) and provided commentary on it in Sec. 4.3.
Furthermore, as recommended, we have included in Sec. 4.4 more results for the 10x10 lattice that better represent the current state of the art. These energy results are now summarized in Table 2.