SciPost Submission Page
Role of stochastic noise and generalization error in the time propagation of neural-network quantum states
by D. Hofmann, G. Fabiani, J. H. Mentink, G. Carleo, M. A. Sentef
This is not the current version.
As Contributors: Giuseppe Carleo · Damian Hofmann · Michael Sentef
arXiv link: https://arxiv.org/abs/2105.01054v2
Date submitted: 2021-05-07 11:49
Submitted by: Hofmann, Damian
Submitted to: SciPost Physics
Neural-network quantum states (NQS) have been shown to be a suitable variational ansatz to simulate out-of-equilibrium dynamics in two-dimensional systems using time-dependent variational Monte Carlo (t-VMC). In particular, stable and accurate time propagation over long time scales has been observed in the square-lattice Heisenberg model using the restricted Boltzmann machine architecture. However, achieving similar performance in other systems has proven to be more challenging. In this article, we focus on the two-leg Heisenberg ladder driven out of equilibrium by a pulsed excitation as a benchmark system. We demonstrate that unmitigated noise is strongly amplified by the nonlinear equations of motion for the network parameters, which by itself is sufficient to cause numerical instabilities in the time evolution. As a consequence, the achievable accuracy of the simulated dynamics is a result of the interplay between network expressiveness and the regularization required to remedy these instabilities. Inspired by machine learning practice, we propose a validation-set based diagnostic tool to help determine the optimal regularization hyperparameters for t-VMC based propagation schemes. For our benchmark, we show that stable and accurate time propagation can be achieved in regimes of sufficiently regularized variational dynamics.
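The validation-set idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used t-VMC form of the equation of motion, S θ̇ = -i F, where S and F are sample covariances of the log-derivatives O and the local energy E_loc, and it regularizes with a pseudo-inverse cutoff. The function names, the synthetic Gaussian data, and the particular residual are illustrative assumptions, not the definitions used in the manuscript.

```python
import numpy as np

def qfm_and_force(O, E_loc):
    """Sample estimates S = Cov(O, O) and F = Cov(O, E_loc) (assumed convention)."""
    dO = O - O.mean(axis=0)
    dE = E_loc - E_loc.mean()
    n = O.shape[0]
    S = dO.conj().T @ dO / n
    F = dO.conj().T @ dE / n
    return S, F

def solve_tdvp(S, F, rcond):
    """Solve S @ theta_dot = -1j * F with a relative eigenvalue cutoff rcond."""
    return np.linalg.pinv(S, rcond=rcond, hermitian=True) @ (-1j * F)

def relative_residual(S, F, theta_dot):
    """Relative residual of the linear equation on a given sample set."""
    return np.linalg.norm(S @ theta_dot + 1j * F) / np.linalg.norm(F)

def synthetic_samples(rng, n, w, noise):
    """Hypothetical stand-in for Monte Carlo samples of O and E_loc."""
    p = w.size
    O = (rng.normal(size=(n, p)) + 1j * rng.normal(size=(n, p))) / np.sqrt(2)
    eps = noise * (rng.normal(size=n) + 1j * rng.normal(size=n)) / np.sqrt(2)
    return O, O @ w + eps

rng = np.random.default_rng(0)
w = rng.normal(size=40) + 1j * rng.normal(size=40)

# Small training set (used to solve) and large independent validation set.
O_tr, E_tr = synthetic_samples(rng, 60, w, noise=5.0)
O_va, E_va = synthetic_samples(rng, 5000, w, noise=5.0)
S_tr, F_tr = qfm_and_force(O_tr, E_tr)
S_va, F_va = qfm_and_force(O_va, E_va)

results = {}
for rcond in (1e-10, 1e-2, 1e-1, 3e-1):
    theta_dot = solve_tdvp(S_tr, F_tr, rcond)
    results[rcond] = (
        relative_residual(S_tr, F_tr, theta_dot),
        relative_residual(S_va, F_va, theta_dot),
    )
    print(f"rcond={rcond:g}  train={results[rcond][0]:.3e}  "
          f"validation={results[rcond][1]:.3e}")
```

With an almost unregularized solve, the training residual is close to machine precision while the residual on independent samples is not; a gap of this kind is what a validation set is meant to expose. Scanning the cutoff and comparing the two residuals is one way to pick a regularization strength in this toy setting.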
Reports on this Submission
Anonymous Report 2 on 2021-7-5 (Invited Report)
Strengths
1- Detailed analysis of the origin of instabilities of t-VMC with neural-network quantum states.
2- Presentation of a novel diagnostic tool to detect overfitting in t-VMC.
3- Careful separation of the potential sources of numerical issues.
4- Clear presentation.
Weaknesses
1- Although the origin of instabilities was identified, no suggestions on how to overcome the problems are provided.
2- Regularization techniques considered do not include latest developments in the field.
In their manuscript “Role of stochastic noise and generalization error in the time propagation of neural-network quantum states” the authors systematically investigate the origin of numerical instabilities in the simulation of real-time dynamics of a quantum spin model. They find that simulations on a ladder geometry are hindered by the occurrence of “jump instabilities” due to overfitting to the stochastic noise that is inherent to the t-VMC method. For their analysis the authors introduce a new diagnostic tool that is based on cross-validation of the solution of the TDVP equation. They study the effect of simulation hyperparameters on the cross-validation error and demonstrate that the overfitting indicated by the cross-validation error occurs especially for the quenches where t-VMC suffers from instabilities.
This research addresses a timely subject, and the manuscript presents valuable insights into what hinders the straightforward application of neural-network quantum states to problems of physical interest. The cross-validation test introduced in this work can help to illuminate this issue, and the identification of the Heisenberg ladder as a “drosophila” provides a suitable basis for methodological advancements in future work. Unfortunately, the question of whether and how the problems discussed could be resolved remains open, partly because the thorough analysis focuses on the simplest regularization techniques and a network architecture that is known to have built-in instabilities.
In general, I find the manuscript suitable for publication in SciPost Physics. However, the authors should address the issues listed below.
1- Comparing MCMC to EMC, the authors show that residual autocorrelation in the MCMC simulation seems to result in more severe instability of the algorithm. However, correlated samples can in principle be avoided by suitably increasing the number of proposed updates between two samples or by using more sophisticated techniques. Are uncorrelated samples out of reach with MCMC in this case?
2- The maximum number of MC samples used is 2.8e4. Is this the limit of reasonable compute time (e.g., the numbers given in Ref.  differ by an order of magnitude)? The data in Fig. 11 give the impression that the overfitting issue might in most cases vanish around 100k samples. It would be helpful to add a comment about the feasibility of simulations with more samples.
3- In Appendix C it is not entirely clear to me which optimizer is being used. The authors mention “stochastic reconfiguration”, but it is not clear to me how this is used for supervised learning. With a potential comparison to Ref.  in mind, it would also be worth clarifying this point.
4- Below Eq. 5 it would be helpful to give the formulas for the quantum Fisher matrix and the energy gradient.
5- On page 6 the authors discuss the step size used and mention that smaller step sizes were tested. What was the smallest step size tested and why wasn’t an integrator with adaptive step size used?
6- The axis labels in Fig. 9 seem misleading. They appear to denote a ratio, but after some thought I believe that the “/” is meant to denote an “or”. This should be fixed.
7- The authors refer to the quantity "1-fidelity" as the “inverse fidelity”. I think the more commonly used term is “infidelity”. The authors might consider changing this.
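Regarding item 1 above, the effect of increasing the number of updates between recorded samples can be illustrated on a toy chain. The sketch below uses an AR(1) process with correlation ρ as a stand-in for a Markov chain; it is not the spin-configuration sampler of the manuscript, and the function names and parameters are illustrative assumptions. For such a process, keeping every k-th sample reduces the lag-1 autocorrelation from ρ to ρ^k.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Normalized lag-1 autocorrelation of a 1D series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.dot(d[:-1], d[1:]) / np.dot(d, d))

def ar1_chain(rng, n, rho):
    """Stationary AR(1) chain x[t] = rho * x[t-1] + noise, unit variance."""
    x = np.empty(n)
    x[0] = rng.normal()
    noise = rng.normal(size=n) * np.sqrt(1.0 - rho**2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + noise[t]
    return x

rng = np.random.default_rng(1)
chain = ar1_chain(rng, 200_000, rho=0.9)

rho_full = lag1_autocorrelation(chain)           # close to 0.9
rho_thinned = lag1_autocorrelation(chain[::20])  # close to 0.9**20

print(f"lag-1 autocorrelation, all samples:      {rho_full:.3f}")
print(f"lag-1 autocorrelation, every 20th sample: {rho_thinned:.3f}")
```

Thinning discards samples, so the cost per retained sample grows with the thinning interval. Whether this is affordable for the ladder system is exactly the question the referee raises, and the toy model does not answer it.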
Anonymous Report 1 on 2021-6-25 (Invited Report)
Strengths
1) High-quality, in-depth analysis of a timely topic, namely the potential of simulating quantum many-body dynamics with variational wavefunctions derived from artificial neural networks.
2) Clarity of presentation and figures.
3) Disentangling of various computational challenges in implementing the time-dependent variational principle with stochastic sampling.
4) A validation error to quantify over-fitting issues is introduced, and its relation to physical observables is demonstrated. While such tools are familiar in the machine learning community, their application to the context of quantum many-body physics is a main merit of this work.
5) The two-leg ladder Heisenberg model is identified as a simple and promising benchmark model for validating the numerical stability of a time-dependent variational ansatz.
Weaknesses
1) Originality claims as well as the relation to previous work could have been worked out more clearly.
2) The high potential of neural network quantum states (NQS) for simulating non-equilibrium dynamics is mentioned repeatedly and prominently. However, the results of this manuscript and previous work on the subject have by no means led to a breakthrough in the computational physics community. Limitations of NQS, as for example reported in Ref.  of the manuscript (Lin and Pollmann) are not explicitly mentioned (only cited in bulk).
3) It remains largely open how model-specific the quality of the numerical results is.
See also above strengths and weaknesses.
In their manuscript "Role of stochastic noise and generalization error in the time propagation of neural-network quantum states", Hofmann et al. report on numerical challenges and instabilities in the simulation of non-equilibrium time-evolution of lattice spin models with neural network quantum states. In this context, the authors make an effort to clearly distinguish different sources of errors, ranging from the representational limitations of the chosen variational wavefunction ansatz to the amplification of noise introduced by stochastic sampling in the parameter updates.
In my view, the two main original directions of this manuscript are the following. First, a validation error quantifying over-fitting problems in NQS approaches is introduced and linked to errors in physical observables. Second, a simple two-leg ladder Heisenberg spin model is identified as a promising benchmark model for studying the numerical stability of time-dependent variational methods. Compared to similar 2D models, it not only has the advantage of revealing possible instabilities more clearly, but its simplicity also allows for better comparison to exact data for reasonable system sizes.
Overall, the presentation is clear and accessible. However, there is room for improvement regarding the relation to previous work, including clearer originality claims, and a fair assessment of the advantages and disadvantages of the NQS approach compared to other computational methods.
1) In Eq. (1) it would be better to directly introduce a model with couplings J_\mu, as in the actual simulations the couplings are not only isotropic but partly time-dependent. Hence, it would be better to have as Eq. (1) a model that is closer to what the manuscript is actually concerned with.
2) In Eq. (5), the order of the two expressions should be exchanged, which would make the meaning of "where" clearer. As written, the first expression seems to be a definition of something that occurs in the second one.
3) In the discussion around Eq. (6) it would be great to discuss how the influence of the white noise scales with the size of the time-step.
4) In my understanding \kappa \le \sigma_1/\lambda should read \kappa \le 1/\lambda .
5) It would be better to rename the eigenvalues of the QFM, which are now called sigma, i.e. the same as the Pauli matrices associated with the input spin variables.
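The two bounds contrasted in item 4 can both arise from an eigenvalue cutoff, depending on whether the threshold λ is absolute or relative to the largest eigenvalue σ_1. The sketch below illustrates this on a synthetic spectrum. It assumes a truncation-type regularization in which eigenvalues below the threshold are discarded; which convention the manuscript uses is not stated here, and the function name and spectrum are illustrative assumptions.

```python
import numpy as np

def kept_condition_number(eigvals, cutoff, relative):
    """Condition number of the retained spectrum after an eigenvalue cutoff.

    relative=False keeps eigenvalues >= cutoff (absolute threshold);
    relative=True keeps eigenvalues >= cutoff * largest eigenvalue.
    """
    s = np.sort(np.abs(eigvals))[::-1]
    threshold = cutoff * s[0] if relative else cutoff
    kept = s[s >= threshold]
    return float(kept[0] / kept[-1])

# Synthetic Hermitian matrix with a spectrum spanning 1e1 down to 1e-8.
rng = np.random.default_rng(2)
spectrum = np.logspace(1, -8, 30)
Q, _ = np.linalg.qr(rng.normal(size=(30, 30)))
S = (Q * spectrum) @ Q.T           # Q @ diag(spectrum) @ Q.T
S = 0.5 * (S + S.T)                # symmetrize against rounding

eigvals = np.linalg.eigvalsh(S)
sigma_1 = float(eigvals.max())
lam = 1e-3

kappa_abs = kept_condition_number(eigvals, lam, relative=False)
kappa_rel = kept_condition_number(eigvals, lam, relative=True)

print(f"sigma_1 = {sigma_1:.3g}, lambda = {lam:g}")
print(f"absolute cutoff: kappa = {kappa_abs:.4g} <= sigma_1/lambda = {sigma_1 / lam:.4g}")
print(f"relative cutoff: kappa = {kappa_rel:.4g} <= 1/lambda       = {1 / lam:.4g}")
```

With an absolute threshold, the smallest retained eigenvalue is at least λ, so κ ≤ σ_1/λ. With a relative threshold, it is at least λσ_1, so κ ≤ 1/λ. The two statements coincide only when σ_1 = 1.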