SciPost Submission Page
Investigating ultrafast quantum magnetism with machine learning
by G. Fabiani, J. H. Mentink
This is not the latest submitted version.
Submission summary
Authors (as registered SciPost users): Giammarco Fabiani

Submission information
Preprint Link: https://arxiv.org/abs/1903.08482v2 (pdf)
Date submitted: 2019-03-22 01:00
Submitted by: Fabiani, Giammarco
Submitted to: SciPost Physics

Ontological classification
Academic field: Physics
Specialties:
Approach: Theoretical
Abstract
We investigate the efficiency of the recently proposed Restricted Boltzmann Machine (RBM) representation of quantum many-body states to study both the static properties and quantum spin dynamics in the two-dimensional Heisenberg model on a square lattice. For static properties we find close agreement with numerically exact Quantum Monte Carlo results in the thermodynamical limit. For dynamics and small systems, we find excellent agreement with exact diagonalization, while for larger systems close consistency with interacting spin-wave theory is obtained. In all cases the accuracy converges fast with the number of network parameters, giving access to much bigger systems than feasible before. This suggests great potential to investigate the quantum many-body dynamics of large scale spin systems relevant for the description of magnetic materials strongly out of equilibrium.
Reports on this Submission
Report #3 by Anonymous (Referee 3) on 2019-4-27 (Invited Report)
- Cite as: Anonymous, Report on arXiv:1903.08482v2, delivered 2019-04-27, doi: 10.21468/SciPost.Report.922
Strengths
- the RBM results are benchmarked against both numerical methods (ED, QMC) and theoretical predictions (spin-wave theory).
- the results are new and will contribute to developing the field.
- the paper is well written and easily accessible to non-experts.
Weaknesses
- the RBM results for dynamics are not as good as one would like them to be (meaning that even at $\alpha=6$ there is a visible mismatch with ED in Fig 2c).
Report
The paper "Investigating ultrafast quantum magnetism with machine learning" by Fabiani and Mentink applies the Restricted Boltzmann Machine variational ansatz for quantum many-body states to study static and dynamic properties of the two-dimensional Heisenberg model on a square lattice: for static properties the authors report agreement with Quantum Monte Carlo results in the thermodynamical limit; for dynamic properties, they "find excellent agreement with exact diagonalization" for small systems, and compare the performance of RBMs to interacting spin-wave theory for larger systems. A major message of the paper is that RBMs allow one to access to much bigger systems than feasible before, which hints at the potential to study quantum many-body dynamics of large spin systems, important for understanding magnetic materials, and strongly-correlated out of equilibrium setups. An open source version of the code termed “ULTRAFAST” is provided by the authors.
The reported results are solid, since the authors benchmarked them against various theoretical and numerical approaches. I believe the paper constitutes an interesting study which will be of good use to the community.
Requested changes
Suggestions:
- the authors write: "For dynamics we adopt a time dependent variational scheme where at each time-step the angle between the variational evolved state and the exactly evolved state is minimized.": at this stage the reader may wonder: if no exact simulation is available due to large system sizes, how can this method work? Could the authors elaborate on this? (A sketch of the working equations, as I understand them, is given after this list.)
- "and the optimization routine becomes equivalent to minimizing the expectation value of the energy for normalized wavefunctions": the word `normalized` may misleadingly suggest that the normalization of the wavefunciton is required, which is not the case.
- "In both Eq. (7) and Eq. (8), we retain only first order terms.": I assume `order` here refers to the expansion in $L^{-1}$? Does this mean that the term proportional to $a$ in Eq(7) is disregarded?
- "In particular we checked that for α ≤ 10 and αN ≤ 104 [26], system sizes above 30 × 30 spins are feasible in reasonably accessible CPU time on our local cluster nodes. Such system sizes are far beyond the capabilities of exact diagonalization.": could the authors show some of the data they have (maybe in the Appendix) to back up these statements? One thing the authors may want to comment on explicitly is that this is feasible because the RBM ansatz does not have to return properly normalized amplitudes [otherwise there would be an issue with the amplitudes being smaller than machine precision at such Hilbert space sizes].
Misc:
- "represented by means of a Restricted Boltzmann Machine (RBM), which is a two-layer Artificial Neural Network (ANN)": in the ML community, there is a formal difference between a feed-forward deep neural network and an RBM. For the benefit of the readers, I suggest the authors to make the statement more precise in order to avoid confusion.
- in the paragraph after Eq. (1): "Following [12], the wavefunction of the quantum spin system is identified with the probability amplitude Eq. (1), namely $\langle S|\psi_M\rangle \equiv \psi_M(S) = P(S)$ ...": the wavefunction (even when real-valued) does not represent a probability distribution, but its square does. The authors should comment on how the ansatz (2) can capture negative [or even complex] probability amplitudes [I guess for this, it is essential that the weights and biases are complex-valued even if the Hamiltonian, and thus its eigenstates, are all real]; see the sketch after this list.
- "the set of network parameters Wk = {ai,bi,Wij} is trained via a reinforcement learning algorithm": I am aware that this terminology was first employed in Ref.[12] (presumably due to the presence of a feedback loop), yet I believe it is more appropriate to set the training procedure apart from Reinforcement Learning, which is a whole independent branch of ML per se, the formulation of which is based on Markov decision processes. I only have a few minor suggestions for the authors to consider:
I only have a few minor suggestions for the authors to consider:

Remarks:
- some recent works on using RBMs for variational wavefunctions try to learn $\log(\psi)$ instead. This offers an advantage because the state amplitudes for large systems can differ by a few orders of magnitude, and taking the log is expected to mitigate this effect (a minimal sketch of what I mean is given below).
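A minimal sketch of what I mean, assuming the RBM form written above (function and variable names are mine, not taken from the authors' code):

```python
import numpy as np

def log2cosh(theta):
    # Overflow-safe log(2*cosh(theta)) for complex theta:
    # for Re(theta) >= 0, log(2 cosh t) = t + log(1 + exp(-2t)), and symmetrically otherwise.
    sign = np.where(theta.real >= 0, 1.0, -1.0)
    return sign * theta + np.log(1.0 + np.exp(-2.0 * sign * theta))

def log_psi(spins, a, b, W):
    # log psi(S) = sum_i a_i s_i + sum_j log 2cosh(b_j + sum_i W_ij s_i),
    # with complex parameters a (N,), b (M,), W (N, M) and spin values s_i = +/-1.
    theta = b + W.T @ spins
    return a @ spins + np.sum(log2cosh(theta))
```

Only differences of log-amplitudes enter the Monte Carlo acceptance ratios and the local estimators, so working in log space keeps everything representable even when $|\psi|$ spans many orders of magnitude across configurations.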
Figures:
- caption to Fig 2:
a) "At times t 2 the RBM dynamics matches well the dynamics from ED even with α = 2"
b) "For large simulation times rapid convergence with α is found."
Could the authors quantify a bit what they mean by "matches well" and "rapid convergence"?
c) could the authors add the system size L to the caption?
- Fig 4: axis label font needs to be adjusted
Typos:
- Fig 1, caption: "The extrapolations from the fits for several alpha are shown": alpha --> $\alpha$
- in the caption to Table 2, it is said: "Evaluation of the correlations is done by sampling 10^6 states", but in the code snippet above Table 1, I read "nsweeps_obs = 10000 #number of samples for the evaluation of <S_i S_j>". The numbers of samples do not match :)
- "Results dynamics": the title of this Sec sounds somewhat cryptic.
- "Hence, the dominant contribution of this peak originates from modes with large wave numbers that can be well captured on finite systems.": on --> in
Report #2 by Anonymous (Referee 2) on 2019-4-19 (Invited Report)
- Cite as: Anonymous, Report on arXiv:1903.08482v2, delivered 2019-04-19, doi: 10.21468/SciPost.Report.915
Strengths
1- The results presented are new.
2- The technicalities are well-covered in the paper.
Weaknesses
1 - In addition to the energy and the magnetization, the authors could consider other ground state static observables, e.g., magnetic susceptibility, to check the accuracy of the method.
Report
The paper by Fabiani et al. investigates the accuracy of a recently proposed quantum many-body variational method based on the restricted Boltzmann Machine (RBM) neural network. As pointed out in the introduction, this method has the potential to efficiently simulate static and dynamic properties of many-body wave functions in any dimension. However, the efficiency of the RBM method in simulations of dynamical properties was not tested in dimensions higher than one. The introduction motivates this point well.
In the paper, the RBM method is applied to study static and some dynamical properties of the prototypical two-dimensional Heisenberg model (HM); the results are then validated with other exact (or approximate) methods.
The results presented by the authors provide relevant information about the efficiency of the restricted Boltzmann Machine in two dimensions.
Requested changes
1 - The authors mention that for larger systems, already for $\alpha = 4$ "convergence is reached within Monte Carlo error". Is this a general feature of the method, i.e., do larger systems require smaller $\alpha$ for convergence, or is it just a numerical observation for this specific case? I think the authors should comment on this in the manuscript.
Report #1 by Anonymous (Referee 1) on 2019-4-5 (Invited Report)
- Cite as: Anonymous, Report on arXiv:1903.08482v2, delivered 2019-04-05, doi: 10.21468/SciPost.Report.900
Strengths
1- Timely investigation of RBM states for strongly-correlated systems.
2- Accurate numerical calculations
3- Well written
Weaknesses
1- Some minor points should be addressed (see report)
Report
The paper is nice and well written. Just a few comments
1) I do not like the nomenclature of ``reinforcement learning'' for the optimization technique. Even though this is a fancy name, the optimization is just standard, according to the Monte Carlo community (see Ref. [26]).
For the real-time evolution a VMC approach was proposed in Scientific Reports 2, 243 (2012) for a Bose-Hubbard model. I think that this paper should be cited.
2) Are the variational parameters real or complex for the static calculations?
Is the Marshall sign imposed?
3) The standard way to define the energy accuracy is to normalize $|E_{\rm vmc} - E_0|$ by $E_0$ and not by $E_{\rm vmc}$.
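That is, I would quote the relative error
$$
\varepsilon = \frac{|E_{\rm vmc} - E_0|}{|E_0|}.
$$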
Requested changes
1) Do not use ``reinforcement learning'' and add the reference.
2) Specify if W is real or complex for the static calculations.
3) Change the normalization in the accuracy.