SciPost Submission Page
Scalable Imaginary Time Evolution with Neural Network Quantum States
by Eimantas Ledinauskas, Egidijus Anisimovas
This is not the latest submitted version.
This Submission thread is now published as
Submission summary
Authors (as registered SciPost users): | Egidijus Anisimovas · Eimantas Ledinauskas |
Submission information | |
---|---|
Preprint Link: | https://arxiv.org/abs/2307.15521v3 (pdf) |
Date submitted: | 2023-10-17 08:20 |
Submitted by: | Ledinauskas, Eimantas |
Submitted to: | SciPost Physics |
Ontological classification | |
---|---|
Academic field: | Physics |
Approaches: | Theoretical, Computational |
Abstract
The representation of a quantum wave function as a neural network quantum state (NQS) provides a powerful variational ansatz for finding the ground states of many-body quantum systems. Nevertheless, due to the complex variational landscape, traditional methods often employ the computation of the quantum geometric tensor, consequently complicating optimization techniques. Contributing to efforts aiming to formulate alternative methods, we introduce an approach that bypasses the computation of the metric tensor and instead relies exclusively on first-order gradient descent with the Euclidean metric. This allows for the application of larger neural networks and the use of more standard optimization methods from other machine learning domains. Our approach leverages the principle of imaginary time evolution by constructing a target wave function derived from the Schrödinger equation, and then training the neural network to approximate this target. We make this method adaptive and stable by determining the optimal time step and keeping the target fixed until the energy of the NQS decreases. We demonstrate the benefits of our scheme via numerical experiments with the 2D J1-J2 Heisenberg model, which showcase enhanced stability and energy accuracy in comparison to direct energy loss minimization. Importantly, our approach displays competitiveness with the well-established density matrix renormalization group method and NQS optimization with stochastic reconfiguration.
Author comments upon resubmission
We have made revisions to the third paragraph of the introduction and have also adjusted the abstract and conclusions to clearly acknowledge that the concept of supervised learning of imaginary time evolution was pre-existing. In addition, we made minor modifications throughout the text to ensure consistency.
Regarding the comment about SGD and ADAM, we already employ ADAM, as indicated in Sec. 4.2. To enhance its visibility, we have added additional references to ADAM in other sections.
In response to Referee 1's suggestion, we have incorporated the SR curve into Fig. 3(b) and provided commentary on it in Sec. 4.3.
Furthermore, as recommended, we have included more results for the 10x10 lattice, which better represent the current state of the art in Sec. 4.4. These energy results are now summarized in Table 2.
Reports on this Submission
Report #2 by Anonymous (Referee 1) on 2023-11-14 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2307.15521v3, delivered 2023-11-14, doi: 10.21468/SciPost.Report.8117
Report
In this manuscript the authors introduce a novel way to realize imaginary time evolution for a recently emerging numerical method called neural quantum states. Such imaginary time evolution is crucial when aiming to find the ground state of quantum matter, one key goal in computational physics. In this context the current manuscript is very timely. Further, the manuscript is also very well written and easily accessible also to non-experts. I particularly appreciate the care and extensive discussion on the numerical aspects such as the inclusion of all the hyperparameters for their simulations.
Overall, I therefore recommend publication in SciPost.
I just have a few suggestions and questions, which might be worthwhile to address:
1. If I understand correctly, Eq. (7) describes how the authors perform the time evolution before they do the optimization based on the loss function in Eq. (8). Is the current wave function obtained through iterating Eq. (7) for a few time steps? If yes, is this something which can be controlled for large system sizes? I'm asking because the wave function amplitudes psi(s) should typically scale exponentially with system size, which might make Eq. (7) impossible to handle because of over- or underflow errors due to finite accuracy. Would the authors agree with this, or have they perhaps already observed it?
2. The computational effort of the present imaginary time approach is proportional to N^2 with N the number of degrees of freedom because higher-order moments of the energy need to be computed. Do the authors also know to which extent this also means that the number of Monte Carlo samples has to be increased? I guess that higher-order moments require more samples for the same accuracy.
Author: Eimantas Ledinauskas on 2023-11-17 [id 4125]
(in reply to Report 2 on 2023-11-14)
Thank you for your comments and suggestions. Below are our replies:
The target wave function is calculated using only a single time step. Performing a second time step is not practical for large systems, as it would necessitate an additional application of the Hamiltonian to the resulting target wave function. This would change the scaling from linear to quadratic with respect to system size (as similarly described in Eqs. 16-17). Additionally, as you have pointed out, the increasing number of summation terms could pose problems with numerical accuracy. However, this issue is not relevant in our case, because we only require a single time step. We added this clarification in the text after Eq. 7.
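As an illustration of the single-step construction and of the over- and underflow concern, here is a minimal sketch on a toy dense Hamiltonian. It assumes the target has the generic first-order form (1 - tau*H) psi; Eq. 7 of the manuscript is not reproduced on this page and may differ in detail. The names `local_energy`, `log_target` and `energy`, the random 8-state Hamiltonian, and the choice of `tau` are all hypothetical and chosen only for illustration. The sketch works with log-amplitudes, so only amplitude ratios are ever exponentiated.

```python
import numpy as np

def local_energy(H, log_psi):
    # E_loc(s) = sum_s' H[s, s'] * psi(s') / psi(s), built from amplitude ratios only
    ratios = np.exp(log_psi[None, :] - log_psi[:, None])
    return (H * ratios).sum(axis=1)

def log_target(H, log_psi, tau):
    # Assumed first-order step: ((1 - tau*H) psi)(s) = psi(s) * (1 - tau * E_loc(s))
    return log_psi + np.log(1.0 - tau * local_energy(H, log_psi))

def energy(H, psi):
    # Rayleigh quotient <psi|H|psi> / <psi|psi>
    return np.real(np.vdot(psi, H @ psi) / np.vdot(psi, psi))

rng = np.random.default_rng(0)
dim = 8                                  # toy Hilbert-space dimension (assumption)
A = rng.normal(size=(dim, dim))
H = (A + A.T) / 2                        # random real symmetric "Hamiltonian"
psi = rng.normal(size=dim)               # real trial state with mixed signs
log_psi = np.log(psi.astype(complex))    # complex log keeps the sign as a phase
tau = 0.5 / np.linalg.norm(H, 2)         # illustrative step with tau * ||H|| < 1

log_psi_t = log_target(H, log_psi, tau)
psi_t = np.exp(log_psi_t)
direct = psi - tau * (H @ psi)           # the same step applied to raw amplitudes

# Amplitudes of order exp(-1000) underflow to zero in float64,
# but the log-domain step depends only on differences of log-amplitudes.
shifted = log_target(H, log_psi - 1000.0, tau)

print("energy before:", energy(H, psi))
print("energy after :", energy(H, psi_t))
```

In this form one step needs a single evaluation of the local energy per configuration, whereas a second step would require the local energy of the target itself, that is, a second application of H.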
This is an interesting point. We have not investigated in detail whether higher-order moments require more samples to achieve the same level of accuracy. However, the accuracy of moment estimations can always be monitored by calculating the standard error of the sample. Additionally, this error diminishes as the variational wave function approaches the ground state (where, with the exact ground state, every sample point would have the same local energy value). Consequently, in practice, we have not encountered any problems related to the accuracy of higher-order moments. We note that the computational complexity for all three energy moments utilized in this work actually scales linearly with the system's size. This linear scaling is due to the fact that the average of E^2 can be calculated from a sample of first-order H_loc values (as shown in Eq. 13 and derived in Eq. 23). For the average of E^3, a sample of second-order H_loc values (Eq. 14) would be required, leading to quadratic scaling. However, as described at the end of Sec. 2.4, we instead use an approximate estimate (Eq. 18) which also scales linearly. While this estimate might be biased, our numerical experiments suggest that it is likely sufficient. We added this clarification in the text after Eq. 18.
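The monitoring described here can be sketched on a toy problem. The example below assumes a real symmetric Hamiltonian and a real wave function, for which the average of H^2 equals the average of the squared local energy over samples drawn from |psi|^2; the manuscript's Eq. 13 is not reproduced on this page and may be written differently. The third-moment estimate of Eq. 18 is not attempted. The function names, the random 8-state Hamiltonian and the sample size are hypothetical.

```python
import numpy as np

def local_energy(H, psi):
    # E_loc(s) = (H psi)(s) / psi(s) for every basis state s (toy systems only)
    return (H @ psi) / psi

def sampled_moments(H, psi, n_samples, rng):
    # Draw basis states from |psi|^2, then average E_loc and E_loc^2
    prob = psi**2 / np.sum(psi**2)
    states = rng.choice(len(psi), size=n_samples, p=prob)
    e_loc = local_energy(H, psi)[states]
    result = {}
    for name, values in (("E", e_loc), ("E2", e_loc**2)):
        std_err = values.std(ddof=1) / np.sqrt(n_samples)
        result[name] = (values.mean(), std_err)   # (estimate, standard error)
    return result

rng = np.random.default_rng(1)
dim = 8                                  # toy Hilbert-space dimension (assumption)
A = rng.normal(size=(dim, dim))
H = (A + A.T) / 2                        # random real symmetric "Hamiltonian"
trial = rng.normal(size=dim)             # generic trial state
eigvals, eigvecs = np.linalg.eigh(H)
ground = eigvecs[:, 0]                   # exact ground state

trial_est = sampled_moments(H, trial, 20_000, rng)
ground_est = sampled_moments(H, ground, 20_000, rng)

# Deterministic check of the identity behind the second-moment estimator
prob = trial**2 / np.sum(trial**2)
exact_E2_from_eloc = np.sum(prob * local_energy(H, trial) ** 2)
exact_E2 = (H @ trial) @ (H @ trial) / (trial @ trial)

print("trial  E, E2 (estimate, std. error):", trial_est["E"], trial_est["E2"])
print("ground E, E2 (estimate, std. error):", ground_est["E"], ground_est["E2"])
```

For the exact ground state every sample has the same local energy, so both standard errors collapse to rounding noise, which matches the zero-variance behaviour mentioned in the reply.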