SciPost Submission Page
On the descriptive power of Neural-Networks as constrained Tensor Networks with exponentially large bond dimension
by Mario Collura, Luca Dell'Anna, Timo Felser, Simone Montangero
This is not the current version.
As Contributors: Mario Collura · Timo Felser
Arxiv Link: https://arxiv.org/abs/1905.11351v3 (pdf)
Date submitted: 2020-05-18 02:00
Submitted by: Collura, Mario
Submitted to: SciPost Physics
In many cases, neural networks can be mapped into tensor networks with an exponentially large bond dimension. We show that, when used to study the ground state of short-range Hamiltonians, the tensor network resulting from this mapping is highly constrained and thus it does not deliver the naive expected drastic improvement of the description, with respect to what is obtained via state-of-the-art tensor network methods. We explicitly show this result in two paradigmatic examples, the 1D ferromagnetic Ising model and the 2D antiferromagnetic Heisenberg model.
Reports on this Submission
Anonymous Report 2 on 2020-7-15 (Invited Report)
- Cite as: Anonymous, Report on arXiv:1905.11351v3, delivered 2020-07-15, doi: 10.21468/SciPost.Report.1827
Strengths
1- Comparison of a variety of variational states using various extrapolation schemes
2- Mapping between RBM and coMPS
Weaknesses
1- The comparison is done between states with very different numbers of variational parameters
2- Different methods were used for optimising the MPS states compared to the neural network states. This makes it difficult to compare 'descriptive power', which the authors attempt to do.
3- Slightly misleading title/abstract/conclusion, considering that only a small class of neural network states (RBM and uRBM) and tensor network states was investigated.
Report
The authors present a class of neural network states (RBM/uRBM) as constrained tensor network states and attempt to compare them with standard unconstrained tensor networks in terms of 'descriptive power'. It is an interesting work, considering the increasing popularity of such variational states and the lack of a detailed comparison of the expressiveness of the various ansätze.
However, it is not easy to evaluate the quality of their comparison for two main reasons:
1) the states being compared have very different numbers of free variational parameters
2) the different states are optimised with different methods, making it difficult to disentangle the expressiveness of the ansatz from the effectiveness of the optimisation.
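To make the first point concrete, the parameter counts of the two families can be tallied directly. The sketch below is illustrative only: it assumes a fully connected RBM with N visible and M hidden units, and an open-boundary MPS with uniform bond dimension chi, counted as an upper bound that ignores gauge freedom and the smaller bonds near the edges. The function names and the example sizes are hypothetical and are not taken from the manuscript.

```python
def rbm_parameter_count(n_visible: int, n_hidden: int) -> int:
    """Visible biases + hidden biases + dense weight matrix (assumed fully connected)."""
    return n_visible + n_hidden + n_visible * n_hidden


def mps_parameter_count(n_sites: int, bond_dim: int, phys_dim: int = 2) -> int:
    """Upper bound: one (phys_dim x bond_dim x bond_dim) tensor per site."""
    return n_sites * phys_dim * bond_dim ** 2


# Hypothetical sizes, chosen only to illustrate the mismatch in parameter counts.
n_rbm = rbm_parameter_count(n_visible=20, n_hidden=20)
n_mps = mps_parameter_count(n_sites=20, bond_dim=16)
print(n_rbm, n_mps)
```

Under these assumptions the RBM count grows linearly in M at fixed N, while the MPS count grows quadratically in the bond dimension, so a comparison at fixed bond dimension and a comparison at fixed parameter count can lead to different conclusions.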
Moreover, considering that only a very narrow class of neural networks (RBM/uRBM) and tensor networks (MPS/TTN) has been considered, it would perhaps be prudent to adjust the title/abstract/conclusion to reflect the scope of the paper more accurately.
Despite the above points, I think the paper does highlight some important aspects of comparing different variational ansätze. While I do not think the paper should be published in SciPost Physics, I think that, with some revision, it could be published in SciPost Physics Core.
Anonymous Report 1 on 2020-6-24 (Invited Report)
- Cite as: Anonymous, Report on arXiv:1905.11351v3, delivered 2020-06-24, doi: 10.21468/SciPost.Report.1781
Strengths
1) Systematic comparison of the accuracy of the coMPS representation of (unrestricted) Boltzmann machines against canonical MPS for prototypical condensed matter physics models.
Weaknesses
1) Use of an unfavorable metric and parameters (w.r.t. the neural networks) for the comparison.
2) Results are based on a very specific method to optimize the variational parameters, without checking alternative methods/minimizers.
3) The authors optimize the coMPS representation of the (u)RBM rather than the parameters of the (u)RBM itself, yet they compare against literature results from an RBM.
Report
The authors aim to present a systematic comparison of constrained MPS (coMPS) representations of (u)RBM neural networks (NN) with canonical MPS as ansätze for quantum many-body wave functions of prototypical condensed matter physics models. I consider this a valuable objective in the booming field of applications of neural networks in quantum many-body physics.
Their main metric for comparison is the effective bond dimension of the coMPS representation of the uRBM rather than the number of d.o.f. of the uRBM itself, which is a significantly different number of parameters to be optimized. Despite the very different representations, the authors compare results from RBM, coMPS and canonical MPS simulations on the basis of the bond dimension. I do not consider this metric appropriate when RBMs are included. Comparing coMPS and MPS bond dimensions only is certainly better, yet even then it is unclear which of the two can be optimized more easily in practice.
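The gap between bond dimension and number of free parameters can be seen in a small numerical check. The sketch below uses a generic textbook rewriting, not necessarily the coMPS construction of the manuscript: summing over the M hidden units of an RBM with real parameters gives an MPS whose site matrices are diagonal with dimension 2^M, even though only N + M + N*M numbers are free. All names and sizes are hypothetical.

```python
import itertools
import math
import random


def rbm_amplitude(s, a, b, W):
    """Unnormalised RBM amplitude with hidden units h_j = +/-1 traced out analytically."""
    visible = math.exp(sum(a_i * s_i for a_i, s_i in zip(a, s)))
    hidden = 1.0
    for j, b_j in enumerate(b):
        theta = b_j + sum(W[i][j] * s[i] for i in range(len(s)))
        hidden *= 2.0 * math.cosh(theta)
    return visible * hidden


def diagonal_mps_amplitude(s, a, b, W):
    """Same amplitude as a product of diagonal matrices indexed by the hidden configuration."""
    n_hidden = len(b)
    bond_states = list(itertools.product((-1, 1), repeat=n_hidden))  # 2**M bond values
    # Left boundary vector carries the hidden biases.
    vec = [math.exp(sum(b_j * h_j for b_j, h_j in zip(b, h))) for h in bond_states]
    for i, s_i in enumerate(s):
        # Diagonal of the site matrix A^{s_i}; off-diagonal entries vanish.
        diag = [
            math.exp(s_i * (a[i] + sum(W[i][j] * h[j] for j in range(n_hidden))))
            for h in bond_states
        ]
        vec = [v * d for v, d in zip(vec, diag)]
    return sum(vec)  # right boundary vector of ones


rng = random.Random(0)
n_visible, n_hidden = 4, 3  # hypothetical toy sizes
a = [rng.uniform(-0.5, 0.5) for _ in range(n_visible)]
b = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
W = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_visible)]

bond_dim = 2 ** n_hidden
n_free = n_visible + n_hidden + n_visible * n_hidden
max_deviation = max(
    abs(rbm_amplitude(s, a, b, W) - diagonal_mps_amplitude(s, a, b, W))
    for s in itertools.product((-1, 1), repeat=n_visible)
)
print(bond_dim, n_free, max_deviation)
```

In this toy case the bond dimension is exponential in M while the number of free parameters stays polynomial, which is why results plotted against bond dimension alone do not reflect how many parameters were actually optimized.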
Optimization results are highly dependent on the ansatz/representation AND the methods employed to optimize the auxiliary degrees of freedom. I am not aware, though, of a systematic comparison between the accuracy that can be practically reached within the coMPS representation and the uRBM itself, nor of whether coMPS and MPS can be optimized similarly well. This manuscript certainly does not provide this information. The different representations have different constraints, such that optimization in one of them might be less prone to local minima than in the others. While the authors employ a randomized procedure to improve the minimum search with NMinimize, they do not consider different minimization algorithms. A different choice of solver can also significantly alter the outcome.
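The dependence of the outcome on the optimization procedure can be illustrated with a toy problem. The sketch below is not the authors' procedure and does not use NMinimize: it assumes a hypothetical one-dimensional cost function with two minima and plain gradient descent, once from a single start and once with random restarts.

```python
import random


def f(x):
    """Toy non-convex cost with a local and a global minimum (hypothetical)."""
    return x ** 4 - 3 * x ** 2 + x


def grad_f(x):
    return 4 * x ** 3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain fixed-step gradient descent; converges to the minimum of the starting basin."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


# Single start on the right: gets trapped in the local minimum.
x_single = gradient_descent(2.0)

# Random restarts with a fixed seed: keep the best of 20 runs.
rng = random.Random(0)
starts = [rng.uniform(-2.0, 2.0) for _ in range(20)]
x_best = min((gradient_descent(x0) for x0 in starts), key=f)

print(x_single, f(x_single))
print(x_best, f(x_best))
```

The same cost function yields two different "optimal" values depending only on the search strategy, which is the sense in which the quality of an ansatz and the quality of its optimizer are hard to separate.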
Concerning the points above, the authors need to reformulate the manuscript in such a way that they clearly and transparently state from the beginning what they actually compare, their methodology and its limitations.
In Sec. 3.1 the authors mention that algorithms based on the coMPS representation are more efficient in terms of CPU time as they are based on matrix-matrix multiplications. To my knowledge, variational Monte Carlo is based on matrix-vector multiplications and is therefore even more efficient? I think a direct comparison of the 'time-to-solution' is more appropriate here.
In Sec. 3.2 the authors compare results from the TTN on finite-size clusters against the expectation values of the AFM Heisenberg model in the thermodynamic limit. The displayed errors are thus not the relative errors with respect to the exact solution of a given finite-size system. Since the reference values are obtained from sampling, error bars should be provided or at least mentioned, e.g., if they are smaller than the symbol size. In Fig. 6 the legend key reads N = 8, 10, but it should probably read L = 8, 10, where L is the linear dimension such that N = L^2?
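The distinction between the two error measures can be made explicit with a short sketch. The energies below are hypothetical placeholders, not values from the manuscript or from the literature; the helper names are likewise assumptions introduced for illustration.

```python
def relative_error(value: float, reference: float) -> float:
    """Relative deviation of `value` from a nonzero `reference`."""
    return abs(value - reference) / abs(reference)


def sites_on_square_lattice(linear_size: int) -> int:
    """Number of sites N = L^2 for an L x L cluster."""
    return linear_size ** 2


# Hypothetical placeholder energies per site, NOT taken from the manuscript.
e_thermodynamic = -1.000   # reference in the thermodynamic limit
e_exact_finite = -1.020    # exact ground state of the finite cluster
e_variational = -1.015     # variational result on the same finite cluster

err_vs_thermodynamic = relative_error(e_variational, e_thermodynamic)
err_vs_finite = relative_error(e_variational, e_exact_finite)
finite_size_shift = relative_error(e_exact_finite, e_thermodynamic)

print(sites_on_square_lattice(8), sites_on_square_lattice(10))
print(err_vs_thermodynamic, err_vs_finite, finite_size_shift)
```

With these placeholder numbers the error measured against the thermodynamic limit (1.5%) mostly reflects the finite-size shift (2%) rather than the variational error on the cluster itself (about 0.5%), so the two quantities should not be read interchangeably.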
The stated lesson that the bond dimension of the RBM (should be coMPS) is not guaranteed to be able to encode power-law correlations is not substantiated by the very limited range of \alpha = 1, 2 presented. The lesson that, for the same energy accuracy, the TTN is more accurate cannot be generalized from the very limited results.
On the point raised that accounting for additional symmetries would allow one to dramatically increase the TTN accuracy: additional symmetry constraints can also be implemented in the method of Ref. 13.
The authors mention twice that the coMPS representation should be used to evaluate expectation values for higher accuracy and lower computational time. I would stress, though, that (to my knowledge) the evaluation of expectation values, even with stochastic methods, is not the problem: while perhaps computationally more expensive, arbitrary precision can be reached. The challenge and the computational cost lie in the optimization of the wave function. In this regard, I suggest rephrasing the authors' statement on p. 7, last paragraph.
I do not consider the manuscript in its current form suitable for the high standards of SciPost Physics. If the editor decides to advance the manuscript after revision, I think it may be more suited for SciPost Physics Core.