SciPost Submission Page
Supervised learning of few dirty bosons with variable particle number
by Pere Mujal, Àlex Martínez Miguel, Artur Polls, Bruno Juliá-Díaz, Sebastiano Pilati
This is not the latest submitted version.
Submission summary
Authors (as registered SciPost users): | Pere Mujal · Sebastiano Pilati |
Submission information | |
---|---|
Preprint Link: | https://arxiv.org/abs/2010.03875v2 (pdf) |
Data repository: | https://doi.org/10.5281/zenodo.4058492 |
Date submitted: | 2021-01-22 11:31 |
Submitted by: | Mujal, Pere |
Submitted to: | SciPost Physics |
Ontological classification | |
---|---|
Academic field: | Physics |
Approaches: | Theoretical, Computational |
Abstract
We investigate the supervised machine learning of few interacting bosons in optical speckle disorder via artificial neural networks. The learning curve shows an approximately universal power-law scaling for different particle numbers and for different interaction strengths. We introduce a network architecture that can be trained and tested on heterogeneous datasets including different particle numbers. This network provides accurate predictions for all system sizes included in the training set and, by design, is suitable to attempt extrapolations to (computationally challenging) larger sizes. Notably, a novel transfer-learning strategy is implemented, whereby the learning of the larger systems is substantially accelerated and made consistently accurate by including in the training set many small-size instances.
List of changes
1. We have reformulated the abstract, following the second Referee’s comment about the accuracy of the extrapolations to larger system sizes.
2. In the introduction, we have added citations to new Refs. [29] and [30], which were mentioned by the second Referee in his/her report.
3. We have modified the introduction according to the second Referee’s report comments on the extrapolations. The claim on the extrapolation accuracy is substantially scaled down.
4. At the end of Sec. 2.3, we have extended the discussion on the use of regularization techniques and on the procedure we adopted to inspect for the possible occurrence of overfitting (see comment by the second Referee).
5. In Secs. 4.1, 4.2, and 4.3, we have modified the claims and discussion about the extrapolations, emphasising the specific conditions where reasonable accuracy is obtained, and mentioning the need for further analysis on larger systems.
6. We have added the right panels in Figs. 3 and 4 to better visualize the discrepancies in the extrapolations and in the outcomes of accelerated learning (see comment by the second Referee).
7. In the conclusions, we have added a sentence about the approximately universal behaviour of the learning curve, speculating that a different neural network architecture might provide faster learning (see comment by the second Referee).
8. In the conclusions, we have mentioned the possibility of computing other physical quantities (see comment by the second Referee).
9. In the conclusions, we expand the discussion on cold-atom quantum simulators and on three-body recombinations (see comment by the first Referee). We cite new Refs.[46] and [47], which report cold-atom experiments on the deterministic preparation of few-body systems with controlled atom numbers.
10. In the conclusions, we have pointed out the possibility of using quantum machine learning, citing new Refs. [48-52], following a comment by the first Referee.
Reports on this Submission
Report #2 by Anonymous (Referee 4) on 2021-02-15 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2010.03875v2, delivered 2021-02-15, doi: 10.21468/SciPost.Report.2558
Strengths
see report
Weaknesses
see report
Report
Although the authors have addressed all of my comments, their response is rather disappointing. Most of the changes made are in more careful wording of the claims. Of course, this is welcome, but my suggestions, most of which were politely declined by the authors, were meant to provide the reader with a better understanding of the result. For example, point 1 in my original review could have shown the limitations of extrapolation and could easily have been done with their already existing data. Unfortunately, the authors did not accept my suggestion. The same goes for points 2, 3 and 4. In point 5, the authors again do not take my suggestion, but instead, present a histogram of the absolute error. This addition is not helpful, in my opinion. My original proposal was to subtract the linear term and still plot the data as a function of E. That way, one could see the systematic deviation with the energy. The authors choose not to do that; I suspect because the result is not favorable. My overall feeling is that the authors decided to make only small "cosmetic" changes. However, as already noted in my first review, the paper's two established claims justify its publication, and since in this version the authors softened the third claim, I can recommend its acceptance.
Report #1 by Anonymous (Referee 3) on 2021-01-26 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2010.03875v2, delivered 2021-01-26, doi: 10.21468/SciPost.Report.2468
Report
The authors have properly enhanced the manuscript following my remarks. I can recommend it for publication in current form.
Author: Pere Mujal on 2021-02-28 [id 1271]
(in reply to Report 1 on 2021-01-26)
THE REFEREE WRITES:
The authors have properly enhanced the manuscript following my remarks. I can recommend it for publication in current form.
OUR ANSWER:
We thank the Referee for the positive assessment of our manuscript and for recommending its publication.
Author: Pere Mujal on 2021-02-28 [id 1272]
(in reply to Report 2 on 2021-02-15)
THE REFEREE WRITES:
Although the authors have addressed all of my comments, their response is rather disappointing. Most of the changes made are in more careful wording of the claims. Of course, this is welcome, but my suggestions, most of which were politely declined by the authors, were meant to provide the reader with a better understanding of the result. For example, point 1 in my original review could have shown the limitations of extrapolation and could easily have been done with their already existing data. Unfortunately, the authors did not accept my suggestion.
OUR RESPONSE:
As requested by the Referee, in this second revision of the manuscript we include data and comments on the extrapolation from N=1 and N=2 instances to the particle number N=4. This analysis is reported for the real-case scenario discussed in Section 4.3 and in Table 3. As expected, the extrapolation accuracy is reduced compared to the extrapolation from N=1,2,3, by approximately 10%-20%. While this analysis further highlights the limitations of the extrapolation procedure and the need to use sufficiently large sizes in the training stage (as, in fact, already discussed in the previous version of the manuscript), it does not affect our two main claims. We hope that the additional data and the further discussion of the limitations of the extrapolation procedure can be considered an adequate response to the main concern raised by the Referee.
THE REFEREE WRITES:
The same goes for points 2, 3 and 4.
OUR RESPONSE:
It seems to us that these three points have been given appropriate consideration, at least within the framework set by the model Hamiltonian we address.
Concerning point 2: we do state that we observe the same approximately universal behavior in all physical regimes, from weakly interacting to the Tonks-Girardeau limit. Phases such as the Bose glass cannot be identified in the few-body model we consider.
Concerning point 3: we mention the possibility of considering different physical quantities. However, this analysis is clearly beyond the scope of our work, and it would not necessarily add significant information concerning the two main claims we report on.
Concerning point 4: we do state that the learning curve is not affected by training parameters such as the learning rate.
THE REFEREE WRITES:
In point 5, the authors again do not take my suggestion, but instead, present a histogram of the absolute error. This addition is not helpful, in my opinion. My original proposal was to subtract the linear term and still plot the data as a function of E. That way, one could see the systematic deviation with the energy. The authors choose not to do that; I suspect because the result is not favorable.
OUR RESPONSE:
We believe that a histogram of the discrepancies is a legitimate and effective way to visualize the magnitude of these discrepancies. One can directly read the value of the maximum discrepancy, and also obtain quantitative information on the probability of finding a given value. Indeed, the importance of using accelerated learning instead of simple extrapolations is quite evident from these figures. As the Referee implies, the information on the energy dependence of the discrepancy is not visible in the histograms. However, this information is provided by the scatter plot of predicted versus exact energies. For full transparency, we make use of the SciPost feature of making authors' responses publicly available, attaching here the plots showing the data as suggested by the Referee, i.e., plotting the discrepancy between predicted and exact energies versus the exact energies. It seems to us that the plots reported in the manuscript provide readers with all the information necessary to justify our claims, and also to understand the limitations of the extrapolation procedure.
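To illustrate the two views discussed in this exchange, here is a minimal Python sketch that computes both a histogram of the discrepancies and the mean discrepancy in bins of exact energy. It runs on synthetic data generated purely for illustration, not on the data of the manuscript; the function name `discrepancy_summary`, its parameters, and the assumed noise level and energy-dependent drift are all hypothetical.

```python
import numpy as np


def discrepancy_summary(e_exact, e_pred, n_bins=10):
    """Summarize prediction errors in two ways (hypothetical helper)."""
    e_exact = np.asarray(e_exact, dtype=float)
    e_pred = np.asarray(e_pred, dtype=float)
    delta = e_pred - e_exact  # discrepancy: predicted minus exact energy

    # View 1: histogram of the discrepancies (magnitude and frequency).
    counts, edges = np.histogram(delta, bins=n_bins)

    # View 2: mean discrepancy in bins of the exact energy, which makes
    # a systematic energy dependence visible.
    e_edges = np.linspace(e_exact.min(), e_exact.max(), n_bins + 1)
    n_in_bin, _ = np.histogram(e_exact, bins=e_edges)
    sum_in_bin, _ = np.histogram(e_exact, bins=e_edges, weights=delta)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_delta = np.where(n_in_bin > 0, sum_in_bin / n_in_bin, np.nan)

    return {
        "counts": counts,
        "edges": edges,
        "max_abs": np.abs(delta).max(),
        "energy_edges": e_edges,
        "mean_delta": mean_delta,
    }


# Synthetic example (assumption): random noise plus a small drift that
# grows with the exact energy. None of these numbers come from the paper.
rng = np.random.default_rng(0)
e_exact = rng.uniform(1.0, 5.0, size=2000)
e_pred = e_exact + 0.02 * (e_exact - 3.0) + rng.normal(0.0, 0.01, size=2000)

summary = discrepancy_summary(e_exact, e_pred)
print("maximum |discrepancy|:", summary["max_abs"])
print("mean discrepancy per energy bin:", summary["mean_delta"])
```

In this synthetic case the histogram gives the maximum discrepancy and its distribution, while the binned means change sign across the energy range and so expose the drift that the histogram alone cannot show.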
THE REFEREE WRITES:
My overall feeling is that the authors decided to make only small "cosmetic" changes. However, as already noted in my first review, the paper's two established claims justify its publication, and since in this version the authors softened the third claim, I can recommend its acceptance.
OUR RESPONSE:
We hope that the additional data included in the manuscript, the (publicly available) graphs included in this response, and the above discussions will be considered an adequate response to the Referee's concerns. The additional information further highlights the limitations of the extrapolation procedure, and the need to use sufficiently copious training sets including several system sizes. The necessity of exploring even larger sizes to establish the applicability of direct extrapolations was already emphasized in the manuscript. Since the Referee already agreed that the two main claims, i.e., the implementation of a flexible neural network for variable particle numbers and the efficiency of accelerated learning, had been established, we hope that the Referee will find that his/her suggestions have been exhaustively addressed.
Attachment:
Figures_Response.pdf