# Time Dependent Variational Principle for Tree Tensor Networks

### Submission summary

• As Contributors: Daniel Bauernfeind
• Arxiv Link: https://arxiv.org/abs/1908.03090v2
• Date submitted: 2019-09-10
• Submitted by: Bauernfeind, Daniel
• Submitted to: SciPost Physics
• Discipline: Physics
• Subject area: Quantum Physics
• Approach: Computational

### Abstract

We present a generalization of the Time Dependent Variational Principle (TDVP) to any finite sized loop-free tensor network. The major advantage of TDVP is that it can be employed as long as a representation of the Hamiltonian in the same tensor network structure that encodes the state is available. Often, such a representation can be found also for long-range terms in the Hamiltonian. As an application we use TDVP for the Fork Tensor Product States tensor network for multi-orbital Anderson impurity models. We demonstrate that TDVP allows to account for off-diagonal hybridizations in the bath which are relevant when spin-orbit coupling effects are important, or when distortions of the crystal lattice are present.

###### Current status:
Editor-in-charge assigned

### Submission & Refereeing History

Submission 1908.03090v2 on 10 September 2019

## Reports on this Submission

### Strengths

1- clearly written, pedagogic description of an algorithm which should prove useful for the community.

### Weaknesses

1- the extension of the time dependent variational principle for matrix product states to tree tensor networks was already employed in the literature, though without an explicit description.

### Report

The main aim of this manuscript is to describe the time dependent variational principle (TDVP) procedure for simulating the time evolution of a quantum state decomposed as a tree tensor network. The variant on a tree has not been (to the best of my knowledge) explicitly described until now. As such, I believe a pedagogic and accurate description of the algorithm should be a useful addition to the literature.

The overall presentation is clear and well written. There are, however, several small details that are incorrect or imprecise and should be fixed. I list the issues I noticed below, together with some comments which I hope will help to improve the presentation further.

### Requested changes

1- The introduction gives a short list of time-evolution methods for tensor networks.
For completeness, in the 1d case the authors should also mention Krylov-based methods (Zaletel et al., Phys. Rev. B 91, 165112 (2015)), which, similarly to TDVP, can be used for long-range Hamiltonians, in particular on a tree.
For PEPS, the current state of the art for the infinite system is given by Czarnik et al., Phys. Rev. B 99, 035115 (2019) and Hubig et al., SciPost Phys. 6, 031 (2019). [If the full update with recycling of the environments, cited as [33], is a minor update of TEBD, then the algorithms presented in those two works are as well.]

2- In Sec. 2.2, it is better (faster) to use QR decomposition in place of SVD.
Also, the statement: "One of the major advantages of orthogonality center is that they ... speed up calculation of local observables" is hard to agree with. To calculate a single local observable in this way, one has to orthogonalize the whole network, which is quite costly, and then re-orthogonalize to calculate the expectation value at another site.
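To illustrate the QR remark in item 2, here is a minimal NumPy sketch (not taken from the manuscript) of moving the orthogonality center out of a tensor through one of its legs. The function name `shift_center_qr` and the dense-array layout are assumptions made for illustration only.

```python
import numpy as np

def shift_center_qr(C, leg):
    """Move the orthogonality center out of tensor C through index `leg`.

    Hypothetical helper: C is a dense ndarray with an arbitrary number of legs.
    Returns (Q, R), where Q replaces C and is an isometry with respect to `leg`,
    and R is the small matrix to be absorbed into the neighbour attached to `leg`.
    """
    moved = np.moveaxis(C, leg, -1)            # bring the chosen leg to the back
    rest_shape = moved.shape[:-1]
    M = moved.reshape(-1, moved.shape[-1])     # matrix C_{rest, (leg)}
    Q, R = np.linalg.qr(M)                     # reduced QR, no singular values needed
    Q = np.moveaxis(Q.reshape(*rest_shape, Q.shape[1]), -1, leg)
    return Q, R
```

Since no truncation happens in this step, the singular values are not needed, which is why QR is sufficient here.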

3- In Sec. 2.3 in order to truncate leg q_k, one should use SVD on C_{(q_k), rest}
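The reshaping meant in item 3 can be sketched as follows. This is an illustrative NumPy example under the assumption that `C` is the orthogonality center stored as a dense array; `truncate_leg` and its return convention are hypothetical names, not the authors' implementation.

```python
import numpy as np

def truncate_leg(C, leg, chi):
    """Truncate index `leg` of the orthogonality center C to at most chi states.

    Hypothetical helper. Returns (U, C_new, discarded): U maps the old leg to the
    truncated one and is absorbed into the neighbouring tensor, C_new is the
    truncated center, and discarded is the sum of the dropped squared singular values.
    """
    moved = np.moveaxis(C, leg, 0)
    rest_shape = moved.shape[1:]
    M = moved.reshape(moved.shape[0], -1)      # the matrix C_{(q_k), rest}
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    keep = min(chi, len(s))
    discarded = float(np.sum(s[keep:] ** 2))
    C_new = (s[:keep, None] * Vh[:keep]).reshape(keep, *rest_shape)
    return U[:, :keep], np.moveaxis(C_new, 0, leg), discarded
```

Because `C` is assumed to be the orthogonality center, the singular values of this matrix are the Schmidt values of the cut at leg q_k, so the discarded weight equals the squared error of the truncated state.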

4- In Sec. 3.1, the authors write: "In order to arrive at a similar result, we first need to define a fixed starting and end-point with the restriction that these have to be a leave."
Is this a necessary condition in this section? I don't see the first site appearing in the derivation, nor the fact that the end-point is a leaf. [not a leave, please double-check the spelling in the article]

5- In Sec. 3.2, the authors start by mentioning the second-order Suzuki-Trotter decomposition, then describe the first-order one in detail. I find this slightly confusing. The second-order decomposition is described later as a combination of the first-order one and its reversal. Additionally, having a second-order decomposition allows one to form decompositions of 4th order or higher -- it might be worth mentioning this. Also, please fix the white spaces in Eq. 16.

On page 12: The sentence "Importantly, this also means ... the order of a single and local update is reversed" -- is confusing. There are two single gates connected by a local link, one is going to be applied before, and the second after the local gate; It might not be obvious which single gate the authors refer to.
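The composition mentioned in item 5 can be illustrated on a toy problem. The sketch below assumes a Hamiltonian split into two Hermitian matrices `A` and `B` (a stand-in for the projector splitting, not the TDVP sweep itself) and uses the standard three-step Suzuki-Yoshida composition for the fourth-order step; all function names are hypothetical.

```python
import numpy as np

def exp_herm(H, t):
    """exp(-i t H) for a Hermitian matrix H via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * t * w)) @ V.conj().T

def s1(A, B, dt):
    """First-order step."""
    return exp_herm(A, dt) @ exp_herm(B, dt)

def s1_reversed(A, B, dt):
    """The same first-order step with the order of the factors reversed."""
    return exp_herm(B, dt) @ exp_herm(A, dt)

def s2(A, B, dt):
    """Second-order step: first-order half step followed by its reversal."""
    return s1(A, B, dt / 2) @ s1_reversed(A, B, dt / 2)

def s4(A, B, dt):
    """Fourth-order step composed of three second-order steps."""
    g = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
    return s2(A, B, g * dt) @ s2(A, B, (1 - 2 * g) * dt) @ s2(A, B, g * dt)

def one_step_error(step, A, B, dt):
    """Spectral-norm distance between one splitting step and the exact propagator."""
    return np.linalg.norm(step(A, B, dt) - exp_herm(A + B, dt), 2)
```

Note that the middle sub-step of `s4` runs backward in time, which is harmless for unitary real-time evolution but would matter for imaginary-time evolution.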

6- In Sec. 3.3, the authors write: "(QR-decomposition suffices)." Here one should only use SVD. Otherwise, the bond dimension of the link would grow exponentially with time. It would also be useful to explicitly mention that one performs truncation of smallest Schmidt values here.
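As a sketch of the truncation asked for in item 6 (and of the kind of information requested in item 9), the following NumPy example splits a two-site tensor, already reshaped into a matrix, and keeps the largest Schmidt values up to a relative discarded weight `eps` and a maximal bond dimension `chi_max`. Both parameters and the function names are assumptions for illustration; the manuscript does not state which criterion was used.

```python
import numpy as np

def num_kept(s, eps, chi_max):
    """Number of singular values to keep so that the relative discarded weight
    is at most eps, capped at chi_max (and never fewer than one)."""
    w = s ** 2 / np.sum(s ** 2)
    # tail[k] is the weight discarded when only the first k values are kept
    tail = np.concatenate((np.cumsum(w[::-1])[::-1], [0.0]))
    k = int(np.argmax(tail <= eps))
    return max(1, min(k, chi_max))

def split_two_site(theta, eps, chi_max):
    """Split the two-site matrix theta by SVD and truncate the new bond."""
    U, s, Vh = np.linalg.svd(theta, full_matrices=False)
    k = num_kept(s, eps, chi_max)
    discarded = float(np.sum(s[k:] ** 2) / np.sum(s ** 2))
    return U[:, :k], s[:k], Vh[:k], discarded
```

A QR decomposition in this place would return the full bond without any ordering of the states, so there would be nothing to truncate.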

7- In Sec. 4, please define H_{int}, so that the article is self-contained.
In Fig 8, it would also be helpful to add some arrows pointing in the direction in which l and k are changing. The statement 'connected to the impurity below' (bottom of page 14) only makes sense with respect to the specific layout of that figure [I would suggest making it more figure-independent].

8- Combining 1-site TDVP and 2-site TDVP (bottom of page 14): the single-site gates propagating the impurity sites backward (the second part of a 2-site step) and forward (the first part of a 1-site step) cancel each other. It is only necessary to evolve the links between impurity sites backward in this case -- it might be good to mention this.

9- I find the error discussion around Fig 10 highly lacking. The authors do not state the bond dimension, nor the strategy they apply to truncate it (in the 2-site part of the scheme, is there some maximal bond dimension, or are the authors truncating the Schmidt values to some precision/discarded weight?) -- without such information the results cannot be reproduced.
Also, since this is mostly a discussion of the method, I would appreciate a demonstration that this error (of 1e-5 in Fig 10) can be further controlled (i.e., that everything is working as expected). Which simulation parameter ultimately controls this error? (3-4 lines showing the errors as this parameter is changed would be sufficient in my opinion.)
Also, the error would be more natural to read if the authors use a logarithmic scale in lower panels of Fig 10.

10- Figures; I like the idea of using the same small network for illustration of all the steps of derivation. However, having to refer to Fig 1 (a couple of pages back in some cases) is not optimal. I am wondering if some insets showing the layout of the whole network and site's numbering would help here?

11- Figure 7 is illustrating 1-site TDVP very nicely. I am wondering if a similar figure for the 2-site scheme (e.g., a second panel of fig 7) would not provide a good illustration of 3.3?

• validity: high
• significance: good
• originality: ok
• clarity: high
• formatting: good
• grammar: good

### Strengths

1- well-presented work, introduction with the right context
2- technical details, every step can be followed by a reader who is familiar with tensor networks
3- benchmarking of the method is sound

### Weaknesses

1- the discussion of the results is a bit rudimentary
2- the physical significance/interpretation of the results is not given

### Report

A few remarks:

- The authors note that TEBD is hard for more complicated Hamiltonians, especially long-range ones; here, the work of Zaletel et al. should be mentioned as a possibility for extending the method to long-range interactions (Phys. Rev. B 91, 165112)
- One of the major advantages (both numerical and conceptual) of TDVP over other methods is the fact that the energy (and, possibly, other conserved quantities) is preserved; is this the case for TTN as well? It would be good for the authors to comment on that, and check numerically that this is indeed the case (especially when comparing with the TEBD results).
- Bond dimensions are nowhere mentioned in the numerical part; it would be good to have an idea of how large these have to be to attain reasonable timescales. Also, how do the different TDVP steps scale with the bond dimension?
- The authors say that "non-interacting systems are already entangled"; this is a confusing statement, since the amount of entanglement depends on what kind of partition one takes. It would be better to refer to real-space correlations that are non-trivial despite the system being non-interacting.
- Do the errors accumulate as time progresses? For spin chains, the bipartite entanglement entropy increases linearly as a function of time; is something similar happening here, and do the bond dimensions of the tensor network have to increase exponentially to keep a decent accuracy?
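The numerical energy-conservation check suggested in the remarks above could be as simple as the following sketch. It assumes dense state vectors and a dense Hermitian matrix `H` purely for illustration (in a tensor-network code the expectation value would be evaluated by contracting the network); `energy` and `max_energy_drift` are hypothetical names.

```python
import numpy as np

def energy(psi, H):
    """Normalized expectation value <psi|H|psi> / <psi|psi>."""
    return float(np.real(np.vdot(psi, H @ psi) / np.vdot(psi, psi)))

def max_energy_drift(states, H):
    """Largest deviation of the energy from its initial value along a trajectory."""
    e = np.array([energy(psi, H) for psi in states])
    return float(np.max(np.abs(e - e[0])))
```

Plotting this drift for the TDVP and TEBD trajectories would make the comparison quantitative.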

### Requested changes

1- energy conservation and bond dimension as a function of time should be discussed (not necessarily new plots)
2- include references as discussed in the report

• validity: ok
• significance: ok
• originality: low
• clarity: high
• formatting: perfect
• grammar: excellent

### Strengths

Well-organized, clear, user-friendly for programmers.

### Weaknesses

The method and usage are not entirely original.

### Report

The authors present a well-organized picture for running TDVP on TTN architectures.
The technical part is clearly written, with in-depth details, such as the explanation of gauging and truncation. While the derivation of the tangent space projector is indeed technical, it follows mainly the original TDVP paper by Haegeman et al., so it should feel familiar to experts in the topic.
The algorithms are clearly presented, which is a hands-on approach to programmers aiming to implement it, without the need to sift through all the maths.
In the following we report a few remarks/questions that the authors should consider:

- In the introduction the authors state:
"One of the major reasons behind the success of tensor networks are the celebrated area laws of entanglement [20] stating that the entanglement of ground states of gapped Hamiltonians is proportional to the surface area connecting the two region"
The paper they cite (Ref. 20) states for the 1D case: "If a system is local and gapped, an area law always holds rigorously. In many specific models, prefactors can be computed. In contrast, if the interactions may be long ranged, area laws may be violated."
Therefore, the authors may want to stress that such a Hamiltonian also needs to be short-ranged.

- The authors mention that TEBD is difficult to use for off-diagonal hybridizations. In particular they state:
"For more involved baths on the other hand, TEBD can become difficult to formulate as discussed next."
However, it is not clearly explained why it is more difficult to use TEBD for off-diagonal hybridizations, since diagonal hybridization already requires long-range terms to be included. Could the authors briefly explain why off-diagonal hybridizations complicate the situation when using TEBD?

- Could the authors provide the bond dimensions they used?
We assume the authors converged their results with respect to the bond dimension.
Did they see any differences in the convergence for sites where they use the one-site/two-site update?
Since the one-site update does not allow for dynamical adjustment of the
bond dimension, did the authors use an initial bond dimension larger than
required for the ground state in order to have some overhead that might be needed by the time-evolution?

- Since the authors implemented the second order integration,
did they compare it to the first order method? Did they analyze the error as
a function of the time step and find the expected scaling?

- In addition to the references in the paper we deem the following papers to be relevant:
SIAM J. Numer. Anal., 53(2), 917–941, (for TDVP in general)
SIAM J. Matrix Anal. Appl., 34(2), 470–494 (for the derivation of the tangent space projector of a TTN)

- When showing the single site TDVP the authors state:
"Since each term in the projection operator keeps all but one tensor fixed, the integration can be performed locally."
Could the authors explain briefly why the projection operator keeps all but one tensor fixed?
After the Trotter breakup the projector and all tensors in the network are still time dependent, correct?
If we understand correctly, it is shown in SIAM J. Numer. Anal., 53(2), 917–941 (Section 4) and mentioned in the authors' Ref. [29]
that the integration can indeed be performed locally, but we think this is not obvious after the Trotter splitting of Eq.(6).
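Regarding the time-step scaling question raised above, a minimal way to extract the observed order from a set of runs is a log-log fit. The sketch below assumes that errors at a fixed final time are available for several time steps; the function name `estimate_order` is hypothetical.

```python
import numpy as np

def estimate_order(dts, errs):
    """Slope of log(err) versus log(dt), i.e. the observed convergence order."""
    slope, _intercept = np.polyfit(np.log(dts), np.log(errs), 1)
    return float(slope)
```

For a second-order scheme one would expect a slope close to 2 at fixed final time, provided the truncation error is small compared to the time-step error.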

Finally, some minor remarks, such as typos we spotted:

- The authors use both "loop-free" and "loop free", which should be homogenized.

- The FTPS abbreviation has not been introduced explicitly.

- Fig.4 caption "Each of these links correspond" -> corresponds.

- In section 2.2 the authors introduce notations for the orthogonality center (C) and
for orthogonalized tensors. Maybe the authors want to consider using this
notation also in the following to improve the presentation, for example:
After Eq.(17), second $\bullet$, the tensor being evolved is an orthogonality center
and could be called C instead of T.

• validity: high
• significance: good
• originality: low
• clarity: high
• formatting: good
• grammar: good