SciPost Submission Page
Learning tensor trains from noisy functions with application to quantum simulation
by Kohtaroh Sakaue, Hiroshi Shinaoka, Rihito Sakurai
Submission summary
Authors (as registered SciPost users): Kohtaroh Sakaue

Submission information
Preprint Link: scipost_202405_00037v1 (pdf)
Date submitted: 2024-05-23 16:49
Submitted by: Sakaue, Kohtaroh
Submitted to: SciPost Physics

Ontological classification
Academic field: Physics
Specialties:
Approach: Computational
Abstract
Tensor cross interpolation (TCI) is a powerful technique for learning a tensor train (TT) by adaptively sampling a target tensor based on an interpolation formula. However, when the tensor evaluations contain random noise, optimizing the TT is more advantageous than interpolating the noise. Here, we propose a new method that starts with an initial guess of the TT and optimizes it by non-linear least squares, fitting it to measured points obtained from TCI. We use quantics TCI (QTCI) in this method and demonstrate its effectiveness on sine and two-time correlation functions, each evaluated with random noise. The resulting TT exhibits increased robustness against noise compared to the QTCI method. Furthermore, we employ this optimized TT of the correlation function in quantum simulation based on pseudo-imaginary-time evolution, resulting in a ground-state energy with higher accuracy than the QTCI or Monte Carlo methods.
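Schematically, the optimization at the heart of the method is a least-squares fit of the TT cores to the noisy samples, roughly of the form below (a paraphrase based on the abstract; the precise objective and parametrization are defined in the paper):

\[
\min_{\{A_\ell\}} \sum_k \bigl| F_{\{A_\ell\}}(x_k) - f_{\mathrm{noisy}}(x_k) \bigr|^2 ,
\qquad
F_{\{A_\ell\}}(x) = A_1[x_1]\, A_2[x_2] \cdots A_N[x_N],
\]

where $A_\ell[x_\ell]$ is the matrix slice of the $\ell$-th TT core and $\{x_k\}$ are the measured points selected by (Q)TCI.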
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Current status:
Reports on this Submission
Strengths
- Proposes a novel procedure that improves the TCI interpolation for fitting noisy functions
- Showcases benchmarks and applications of the method
Report
The authors propose a method to encode noisy functions efficiently in the tensor train (TT) format. Based on the benchmarks provided by the authors, the idea looks promising, and I think it fits the SciPost requirements for publication.
While the authors explain the steps of the algorithm (with the exception of Step (c)), what I find a bit missing is some discussion of why these steps work, especially for readers who are less familiar with the TCI algorithm.
As an example, the authors state repeatedly that the measured points in the initial TCI interpolation* of Step (a) capture the global structure of the noise-free function f(x); this seems reasonable, but I fail to see how it can be rigorously justified.
* As a matter of fact, in the discussion in Sec. 4.1 the authors state "this measured point captures..." as if they were referring to a single point. I guess this is a typo, since both in the introduction and in the initial discussion in Sec. 4 the authors say that "the measured points capture...".
Regarding step (b), it is not clear to me what the advantage is of (over)fitting the function with a larger \tilde{\chi} and then subsequently compressing it, rather than simply imposing a maximum \chi on the TCI directly in step (a). Is it in order to obtain more function evaluations for the later fitting done in step (c)? In that case, I imagine one could also pick points on some regular grid, so I suspect there is another reason, which I hope the authors can clarify. In general, it would be interesting to compare what happens if one does overfitting+compression versus just fitting with a smaller \chi before performing step (c). For concreteness, what I have in mind by "compressing" is sketched below.
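What I mean by "compressing" is the standard SVD-based rank truncation of a TT, along the lines of the following numpy sketch (my own illustration of the generic technique, not the authors' code; their step (b) may, e.g., truncate by tolerance rather than by a fixed rank):

```python
# Minimal sketch of SVD-based rank truncation of a tensor train.
# Each core has shape (r_left, d, r_right).
import numpy as np

def left_orthogonalize(cores):
    """Sweep left to right, QR-factorizing each core."""
    cores = [c.copy() for c in cores]
    for i in range(len(cores) - 1):
        rl, d, rr = cores[i].shape
        q, r = np.linalg.qr(cores[i].reshape(rl * d, rr))
        cores[i] = q.reshape(rl, d, q.shape[1])
        cores[i + 1] = np.einsum("ij,jkl->ikl", r, cores[i + 1])
    return cores

def truncate(cores, chi_max):
    """Sweep right to left, truncating every bond to at most chi_max by SVD."""
    cores = left_orthogonalize(cores)
    for i in range(len(cores) - 1, 0, -1):
        rl, d, rr = cores[i].shape
        u, s, vt = np.linalg.svd(cores[i].reshape(rl, d * rr),
                                 full_matrices=False)
        keep = min(chi_max, len(s))
        cores[i] = vt[:keep].reshape(keep, d, rr)
        # absorb the retained factors into the neighboring core
        cores[i - 1] = np.einsum("ijk,kl->ijl",
                                 cores[i - 1], u[:, :keep] * s[:keep])
    return cores

# Example: a random rank-8 TT on 6 binary (quantics) indices, cut to rank 4.
cores = [np.random.randn(1 if i == 0 else 8, 2, 1 if i == 5 else 8)
         for i in range(6)]
compressed = truncate(cores, chi_max=4)
```

Comparing such an overfit-then-truncate result against a direct rank-\chi fit before step (c) is the benchmark I am asking for.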
Finally, I agree with the other referee that step (c) of the algorithm is not well explained, and I wish the authors could expand on it, since to my understanding it is a crucial part of the method.
As a matter of fact, I would also be curious to see how much this final step reduces the error, compared to steps (a)+(b) alone.
In summary, I think the method is definitely interesting, but I would like the authors to discuss the intuition behind it in more detail, provide more details on the final fitting, and present some basic benchmarking of (i) the relevance of the intermediate step (b) versus step (a) alone with a restricted \chi, and (ii) the improvement given by the final fitting (c), even if only for the "basic" benchmark of Sec. 4.2.
Requested changes
- extend the discussion of the intuition behind the method (see my comments above)
- provide a more detailed description of step (c) of the procedure
- discuss the relevance of steps (b) and (c) in a more detailed benchmark (see my comments above)
- improve Fig. 5, which is not very clear (I do not understand whether those are some kind of error bars or heavily fluctuating lines, and whether the green bands are narrow or simply covered by the red ones)
Recommendation
Ask for minor revision
Strengths
1. The paper is very clearly written, and the main algorithm presented is straightforward to understand.
2. The algorithm works well at reducing the errors, by about an order of magnitude, over the baseline approach to learning noisy functions with the TCI algorithm.
Weaknesses
Possible straightforward extensions of the method, while helpfully outlined in the conclusions, are not explored.
Report
This paper meets the SciPost Physics acceptance criteria, primarily in that it provides novel links between research areas (here quantum computing, tensor network algorithms, and machine learning approaches) and opens up research into learning noisy functions or data with tensor networks, and into using tensor networks and machine learning within quantum computing approaches.
I think the paper should be accepted once the authors address the changes and questions below.
Requested changes
1. in Section 3.1, the authors should explain or ideally give a citation for the statement that TCI is quasi-optimal in the maximum norm
2. at the beginning of page 10, there is the statement "where <\bar{O}>(t,t') is denoted as <\bar{O}>(t,t')". What is the meaning of this statement, and is it a mistake? Otherwise it seems to have no content, since it says something is equal to itself.
3. Figure 5 should be improved by not using a bar-chart format, unless the authors can explain a good reason for it. I think a line plot would be clearer. (Or is the chart actually a line that is very dense and ranges over the t values?)
4. do the authors provide a reference or explanation for the algorithm that performs the least-squares minimization of a tensor train (step (c) on page 7)? If not, the authors should ideally provide a reference (perhaps from the MPS machine-learning literature, such as Phys. Rev. X 8, 031012, if that seems relevant enough) or provide a short explanation of the steps in an Appendix; the kind of procedure I have in mind is sketched below.
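For concreteness, one standard scheme is an alternating least-squares sweep over the cores, exploiting the fact that a TT is linear in each individual core. The numpy sketch below is my own illustration under that assumption, not necessarily the authors' step (c):

```python
# Sketch of a sweeping least-squares fit of a TT to sampled points,
# assuming an ALS-style scheme; hypothetical helper names.
import numpy as np

def fit_core(cores, i, xs, ys):
    """Update core i by linear least squares, all other cores held fixed."""
    rl, d, rr = cores[i].shape
    A = np.zeros((len(xs), rl * d * rr))
    for k, x in enumerate(xs):
        L = np.ones((1, 1))                       # contraction left of core i
        for c, xi in zip(cores[:i], x[:i]):
            L = L @ c[:, xi, :]
        R = np.ones((1, 1))                       # contraction right of core i
        for c, xi in zip(cores[i + 1:][::-1], x[i + 1:][::-1]):
            R = c[:, xi, :] @ R
        # The TT value is linear in core i:
        #   F(x) = sum_{a,b} L[0, a] * core_i[a, x_i, b] * R[b, 0]
        design = np.zeros((rl, d, rr))
        design[:, x[i], :] = np.outer(L[0], R[:, 0])
        A[k] = design.ravel()
    sol, *_ = np.linalg.lstsq(A, np.asarray(ys, dtype=float), rcond=None)
    cores[i] = sol.reshape(rl, d, rr)

def sweep_fit(cores, xs, ys, n_sweeps=10):
    """Alternating sweeps; each core update is an ordinary linear solve."""
    for _ in range(n_sweeps):
        for i in list(range(len(cores))) + list(range(len(cores) - 2, 0, -1)):
            fit_core(cores, i, xs, ys)
    return cores
```

Since the paper calls its step "non-linear" least squares, the authors may instead optimize all cores jointly (e.g. by gradient descent, as in the MPS machine-learning literature), but an explanation at roughly this level of explicitness would already help.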
Minor (optional) changes:
1. in the introduction to Section 4, the authors should write out "sine" instead of "sin".
2. at the end of the Acknowledgements, it says "IUse of Julia...". Should it be "Use of Julia..."?
Recommendation
Ask for minor revision
Kohtaroh Sakaue on 2025-03-15 [id 5292]
Thank you very much for your careful review of our manuscript and for your valuable comments and suggestions. We sincerely appreciate the time and effort you have put into evaluating our work. Your constructive feedback has been very helpful in improving the clarity and completeness of our paper. We address each of your comments below in detail.
We added a citation to [arXiv:2407.02454] in Section 3.1 to support the statement that TCI is quasi-optimal in the maximum norm. This reference provides a detailed analysis of the quasi-optimality properties of TCI and strengthens the theoretical foundation of our discussion.
We agree that the original statement was unclear and could be misleading. To clarify, we revised it to: "where $\bra{\Psi(0)} e^{i\bar{H}t'} O e^{-i\bar{H}t} \ket{\Psi(0)}$ is denoted as $\overline{\expval{O}}(t,t')$." This revision explicitly defines the notation and removes the redundant statement.
Initially, we used a line plot for Figure 5. However, due to the high density of the data points, the plot became difficult to interpret. To improve clarity, we downsampled the displayed data to 1/50 of the original resolution. Additionally, we adjusted the transparency of the $\tilde{F}_{\mathrm{opt}}$ curve to enhance readability; these plotting choices are illustrated in the sketch below. The modifications make the figure clearer while preserving the key information. Please refer to the revised Figure 5 attached for your review.
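Concretely, the revised figure is produced along the lines of the following matplotlib sketch (with dummy data and hypothetical array names, shown only to illustrate the downsampling and transparency choices; the actual figure uses our numerical results):

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 10, 10_000)                         # dense time grid (dummy)
err_qtci = 1e-3 * (1 + 0.5 * np.abs(np.sin(50 * t)))   # stand-in error curves
err_opt = 1e-4 * (1 + 0.5 * np.abs(np.cos(50 * t)))

step = 50  # keep every 50th point (1/50 of the original resolution)
plt.semilogy(t[::step], err_qtci[::step], color="tab:green", label="QTCI")
plt.semilogy(t[::step], err_opt[::step], color="tab:red", alpha=0.6,
             label=r"$\tilde{F}_{\mathrm{opt}}$")
plt.xlabel("t")
plt.ylabel("absolute error")
plt.legend()
plt.show()
```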
We added references to both [arXiv:1906.06329] and Phys. Rev. X 8, 031012 to support the least-squares minimization of a tensor train in step (c) on page 7. These references provide relevant background on the optimization techniques used in our approach and strengthen the theoretical foundation of our discussion.
Thank you for your careful attention to detail. We made the suggested corrections: "sin" has been changed to "sine" in the introduction to Section 4, and "IUse of Julia..." has been corrected to "Use of Julia..." in the Acknowledgements. We kindly ask you to verify these changes.
Attachment:
abs_err_1d_step_length_50.pdf