SciPost Submission Page
Rapid detection of phase transitions from Monte Carlo samples before equilibrium
by Jiewei Ding, Ho-Kin Tang, Wing Chi Yu
This is not the latest submitted version.
As Contributors: Wing Chi Yu
Date submitted: 2022-03-21 09:51
Submitted by: Yu, Wing Chi
Submitted to: SciPost Physics
We found that Bidirectional LSTM and Transformer can classify different phases of condensed matter models and determine the phase transition points by learning features in the Monte Carlo raw data before equilibrium. Our method can significantly reduce the time and computational resources required for probing phase transitions as compared to the conventional Monte Carlo simulation. We also provide evidence that the method is robust and the performance of the deep learning model is insensitive to the type of input data (we tested spin configurations of classical models and Green's functions of a quantum model), and it also performs well in detecting Kosterlitz–Thouless phase transitions.
Reports on this Submission
Anonymous Report 2 on 2022-4-29 (Invited Report)
1- New deep learning based method that allows for phase transition probing with less computational effort than the traditional Monte Carlo approaches.
2- Versatility of the method, which works well on a variety of models with rather distinct phase transitions.
3- The authors discuss several "internal" parameters of the method that are important in controlling its performance.
4- Good contextualization of the work.
5- The figures are generally good, clear and informative.
1- Absence of a performance comparison against other established methods.
2- Some details, knobs and performance of the methodology could have been explored/presented more thoroughly.
3- Raw Monte Carlo data and deep learning scripts are not available to the readers.
4- Several misspellings across the text and occasional unclear wording.
In this manuscript the authors estimate the location of phase transitions of several model Hamiltonians by employing supervised machine learning algorithms tailored for time domain analysis. Unlike previous works, the Monte Carlo data fed to the algorithms was not collected at equilibrium, but was instead extracted from the initial portion of the Monte Carlo time series before equilibrium is reached. This procedure should decrease the computational cost needed for estimating critical temperatures and couplings of physical models. This work is original, technically sound, the obtained results are interesting and of practical relevance for the field. I suggest the acceptance of this manuscript provided the following concerns are addressed by the authors.
1- The authors state that this methodology allows for cheaper estimates of the position of phase transitions of model Hamiltonians. Although this is a credible statement, it would be useful for readers to know how much one can save with this method. Even a semi-quantitative comparison of the cost of this method (for a fixed accuracy of the prediction) against the usual Monte Carlo estimate based on expectation values, and against the supervised learning methods using equilibrium samples (e.g., Carrasquilla et al., Ref. ), would be useful.
2- It is unclear from the text whether the Monte Carlo time series (for a given site) that is fed into the deep learning algorithms is sequential (i.e., each Metropolis step/spin-flip attempt corresponds to an element of that site's time series), or whether there is some sort of reblocking to decorrelate samples (e.g., a new element is added to the time series of a given site only after a sweep over all the spins). For the benefit of the reader, this should be clarified in the text.
3- It is a well-known fact that the estimate of the phase transition depends on the size of the simulated system. With finite-size scaling techniques one can account for that and obtain reliable thermodynamic-limit estimates. The methodology presented in this manuscript should not be immune to this issue, since it originates from the cutting-off of correlations by the finite size of the system. In the manuscript there is no study or discussion of how the predicted Tc changes with the system size. The simulated classical models are large, so the Tc prediction should be rather close to the thermodynamic-limit Tc. Still, I believe that the average reader would benefit from this issue being discussed in the text. In particular, it would be useful to have a sense of how small one can make the simulations and still obtain reliable results with this method (given that the authors are randomly selecting 20 sites to feed into the algorithm). Would problems arise if these random sites are too close to each other?
4- The authors could fit the data in Fig. 2 with a function that better describes the data (e.g., tanh(a x - b)) instead of using a linear fit between 0.1 and 0.9. I believe this would enable a more controlled estimate of the transition and its associated error. If there is some strong reason for using a linear fit between 0.1 and 0.9 instead of a function better adapted to the data, then that could be addressed in the text. In any case, the fits could be shown in the appendix or at least made available online.
5- It is not clear from the text how the error bars in Fig. 3 are estimated. For the benefit of the reader, please clarify this.
6- Fig. 3 is rather useful to assess the convergence of the different algorithms with the number of steps of the time series. However, I find it could convey more information if, for instance, more points at a larger number of steps were included (note that data for m=20,30,40 are included in Fig. 2). From the text it is unclear whether there is a finite optimal number of steps for making a prediction, or whether one should instead take the number of steps towards infinity to controllably approach the exact result. The readers would benefit from such a discussion.
7- The authors mention that they chose to randomly select 20 elements/sites because selecting more sites did not improve the deep learning model. It could be useful to have a figure conveying this information, together with a discussion of why 20 sites is the right number. I wonder whether that optimal value of 20 sites depends on the system size and/or model.
8- In the last paragraph before the conclusion the text reads: "...exhibit a significant increase from 0 to 1 as temperature increases, as shown in Fig. 4(d3).". I believe this to be a typo since the authors are discussing the Hubbard model, which has a quantum phase transition upon tuning of the Hubbard interaction.
9- The presented method gave a reasonable estimate of the critical coupling Uc of the honeycomb lattice Hubbard model, although not as good as the estimates of Tc for the classical models. It is not entirely clear to me why that is so. I would nevertheless like to remark that this also seems to occur with the supervised deep learning methods that use equilibrium samples: application to the Hubbard model (Broecker et al., Ref. ) seems to result in a biased estimate of the critical coupling, while the Tc of the classical models (Carrasquilla et al., Ref. ) is rather well estimated. Related to this, the authors mention that "the large fluctuation of the output in the close vicinity of the transition point may attribute to the fact that the Green functions fluctuate a lot (as compared to the classical model) due to the quantum fluctuations when they are far from equilibrium, resulting in the misclassification of some test samples.". Can this issue be improved with a longer time series? Alternatively, does it help to feed more elements of the Green's function to the algorithm? Please discuss these issues.
10- In appendix A the authors state that "not using positional embedding or shuffling the data in the time dimension will not change the performance of the deep learning model.". That is demonstrated for the square lattice Ising model. Could the authors discuss/demonstrate if this statement holds for the other models (namely the XY model and Hubbard model)?
11- I believe that the temperature label on Fig. 4(b2) might be wrong, since the Tc in that system is 3.69 and that label reads T=3.5. Please check this possible issue.
12- The community would benefit from the raw Monte Carlo data, the deep learning, the fitting and the figure-generating scripts being made available online.
13- There are several misspellings across the text that could hinder the understanding of the concepts being discussed in the manuscript. I suggest the authors go through the text again in order to fix these typos.
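To make the distinction raised in point 2 concrete, the following is a minimal sketch of the two ways a single-site time series could be recorded from the early part of a Metropolis run. It is not the authors' implementation: the function name `site_time_series`, the 8 x 8 lattice, the temperature, the random-site update order and the random initial state are all assumptions chosen only for illustration.

```python
import math
import random


def site_time_series(L=8, T=2.0, n_records=20, per_sweep=True, site=(0, 0), seed=0):
    """Record one site's spin during the early part of a Metropolis run
    on an L x L Ising lattice with periodic boundaries (J = 1).

    per_sweep=True  -> one record after every L*L flip attempts (one "sweep")
    per_sweep=False -> one record after every single flip attempt
    """
    rng = random.Random(seed)
    # Random initial state, so the run starts far from equilibrium (assumption).
    spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    attempts_per_record = L * L if per_sweep else 1
    series = []
    for _ in range(n_records):
        for _ in range(attempts_per_record):
            i, j = rng.randrange(L), rng.randrange(L)
            neighbours = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
                          + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
            delta_e = 2 * spins[i][j] * neighbours  # energy change if this spin flips
            if delta_e <= 0 or rng.random() < math.exp(-delta_e / T):
                spins[i][j] = -spins[i][j]
        series.append(spins[site[0]][site[1]])
    return series


sequential = site_time_series(per_sweep=False)  # element = one flip attempt
reblocked = site_time_series(per_sweep=True)    # element = one full sweep
```

Both calls return 20 values, but the sequential series covers only 20 flip attempts on a 64-site lattice, so the chosen site is rarely even visited, whereas the reblocked series spans 20 sweeps. Stating which convention the manuscript uses would remove this ambiguity.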
Anonymous Report 1 on 2022-4-27 (Invited Report)
1- The authors introduce a way to reduce the calculation time of Monte Carlo methods by using different neural network architectures which leads to more efficient calculations.
2- It is very good that the authors check whether the performance of the model is independent of the selection of the training data.
3- The authors check different models.
4- Good graphical representation of the results and the architecture.
1- The text structure is sometimes unclear. The authors repeat some information across different paragraphs.
2- The fit used to find the phase transition from the output of the neural networks (see requested changes).
3- The concrete implementation/code of the neural network architectures is not publicly available.
In the paper "Rapid detection of phase transitions from Monte Carlo samples before equilibrium" by Ding et al., the authors use different machine learning techniques, namely a Transformer and a Recurrent Neural Network, to detect phase transitions from Monte Carlo samples before reaching equilibrium. They test their new technique on the Ising model, the Hubbard model and the XY model.
The authors give a good introduction to the field which addresses a broad audience. They motivate the importance of their work well. They give clear visualizations of the techniques and the data. Furthermore, they tested their technique with different physical and machine learning models. In the appendix they show the stability for different training data/training ranges, which is required for a sophisticated machine learning algorithm.
The contribution is of general interest since it can reduce the number of Monte Carlo steps needed to find phase transitions, which reduces the amount of computational resources involved.
I recommend publication in SciPost Physics if the authors address the questions/changes.
1- At the end of section 1 (e.g. first paragraph on page 3) the authors describe in detail how the method works. In section 2 this is done again. I kindly ask the authors to reduce the repetition here.
2- The data for figure 3 are generated by doing a linear regression on the data from figure 2 in the interval of 0.1 to 0.9. I would recommend fitting a tanh-like function to the complete data to get a more stable point, since the choice of 0.1 and 0.9 as boundaries is arbitrary. I do understand that no model is available that suits the data from a theoretical point of view; however, a tanh fit often appears in publications where a phase transition needs to be found from a supervised trained network. As an alternative, it would be helpful to see the linear fits either in the main text or in the appendix. The authors could plot the fitted function instead of the guide to the eye in figures 2 and 4.
3- How do the authors calculate the error bars in figure 2 and figure 4?
4- I do not completely understand paragraph 2 on page 5, even after reading the reference in the appendix. Why and how do the authors map the information in the different dimensions? Could the authors comment on what they are doing here?
5- The authors comment that for the Ising model normally 1000 MC steps are required to reach equilibrium. To train the neural networks some training data is required. Could the authors directly compare (for example with an overall factor) how much less/more data the methods need to find the phase transition? How big is the actual reduction of computational requirements, taking the complete process into account?
6- Please review minor grammar mistakes and typos (typos especially in the figures, for example fig 1: "menory" -> I think the authors' intention is "memory").
7- I would like to draw attention to the formal acceptance criteria, which state that publication of the code/notebooks used to create the figures is required. I was not able to find if and where it is available. To check whether the models are under- or overfitting, I would also recommend publishing the training progress and the concrete implementation of the neural networks.
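As an illustration of the alternative suggested in point 2, the sketch below compares a tanh fit over the complete data with a straight-line fit restricted to outputs between 0.1 and 0.9. Everything here is an assumption made for the example: the parametrization 0.5 (1 + tanh(a (x - xc))), the small hand-written Levenberg-Marquardt loop, the function names, and the noiseless synthetic data with a crossing placed at x = 2.27. It does not reproduce the authors' data or fitting code.

```python
import math


def tanh_model(x, a, xc):
    """Sigmoid rising from 0 to 1 that crosses 0.5 at x = xc."""
    return 0.5 * (1.0 + math.tanh(a * (x - xc)))


def fit_tanh(xs, ys, a=1.0, n_iter=200):
    """Least-squares fit of tanh_model with a small Levenberg-Marquardt loop."""
    # Start xc at the data point whose output is closest to 0.5.
    xc = min(zip(xs, ys), key=lambda p: abs(p[1] - 0.5))[0]

    def sse(a_, xc_):
        return sum((y - tanh_model(x, a_, xc_)) ** 2 for x, y in zip(xs, ys))

    lam, best = 1e-3, sse(a, xc)
    for _ in range(n_iter):
        h_aa = h_ac = h_cc = g_a = g_c = 0.0
        for x, y in zip(xs, ys):
            t = math.tanh(a * (x - xc))
            slope = 0.5 * (1.0 - t * t)
            d_a, d_c = slope * (x - xc), -slope * a  # model derivatives w.r.t. a, xc
            r = y - 0.5 * (1.0 + t)                  # residual
            h_aa += d_a * d_a
            h_ac += d_a * d_c
            h_cc += d_c * d_c
            g_a += d_a * r
            g_c += d_c * r
        # Damped normal equations (J^T J + lam * diag) step = J^T r, solved as 2 x 2.
        A, D = h_aa * (1.0 + lam), h_cc * (1.0 + lam)
        det = A * D - h_ac * h_ac
        if det <= 0.0:
            break
        step_a = (D * g_a - h_ac * g_c) / det
        step_c = (A * g_c - h_ac * g_a) / det
        trial = sse(a + step_a, xc + step_c)
        if trial < best:  # accept the step and relax the damping
            a, xc, best, lam = a + step_a, xc + step_c, trial, lam / 10.0
        else:             # reject the step and increase the damping
            lam *= 10.0
            if lam > 1e10:
                break
    return a, xc


def linear_window_crossing(xs, ys, lo=0.1, hi=0.9):
    """Straight-line fit to points with lo <= y <= hi; returns x where the line hits 0.5."""
    pts = [(x, y) for x, y in zip(xs, ys) if lo <= y <= hi]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    slope = (sum((x - mx) * (y - my) for x, y in pts)
             / sum((x - mx) ** 2 for x, _ in pts))
    return mx + (0.5 - my) / slope


# Synthetic, noiseless stand-in for network output versus temperature (illustrative only).
xs = [1.5 + 0.05 * k for k in range(31)]
ys = [tanh_model(x, 4.0, 2.27) for x in xs]
a_fit, xc_fit = fit_tanh(xs, ys)
x_lin = linear_window_crossing(xs, ys)
```

The tanh fit uses every data point and returns the crossing `xc_fit` directly as a fit parameter, while `x_lin` depends on which points happen to fall inside the 0.1 to 0.9 window. On real, noisy outputs the two estimates and their uncertainties could be reported side by side.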