Studying dynamics in two-dimensional quantum lattices using tree tensor network states

We analyze and discuss convergence properties of a numerically exact algorithm tailored to study the dynamics of interacting two-dimensional lattice systems. The method is based on the application of the time-dependent variational principle in a manifold of binary and quaternary Tree Tensor Network States. The approach is found to be competitive with existing matrix product state approaches. We discuss issues related to the convergence of the method, which could be relevant to a broader set of numerical techniques used for the study of two-dimensional systems.


I. INTRODUCTION
The exact simulation of the non-equilibrium dynamics of interacting quantum lattice systems is generally an unsolved challenge, due to the exponential growth of the Hilbert space with the size of the system. Tensor network state (TNS) methods allow for a significant extension of accessible length scales by trading in the exponential cost in system size for an exponential cost in time. This becomes possible due to a reduction of the exact Hilbert space in terms of a structured product of low-order tensors, referred to as a tensor network. The set of the states expressible by a given tensor network spans only a small region in the full Hilbert space, but the coverage can be improved systematically by increasing the number of variational parameters, i.e. the bond dimension. The logarithm of the latter quantity gives an upper bound to the entanglement entropy of any bipartition of the lattice. Since the entanglement of a generic system grows linearly with time 1 , the accessible timescales are limited. In onedimensional systems, these timescales are often comparable to those attainable in experimental realizations 2 , however describing to higher spatial dimensions becomes extremely challenging due to a number of reasons.
While in one-dimensional systems, matrix product states (MPS) are known to efficiently represent area-law entangled states (which includes ground-states of gapped one-dimensional systems), this does not hold in two spatial dimensions [3][4][5] . The generalization of MPS to twodimensional lattices is called Projector-Entangled Pair States (PEPS) 6 , which provides an efficient representation of two-dimensional area-law entangled states 7 , but PEPS are challenging to manipulate numerically 8 (see also Ref. 9 for a recent review). Uncontrolled approximations are typically used in PEPS algorithms in order to tame the computational effort. Even with such apa) Electronic mail: bk2576@columbia.edu b) Electronic mail: ybarlev@bgu.ac.il proximations, the computational scaling is usually unfavorably steep. Nonetheless, PEPS-derived methods are state-of-the art numerical techniques for computing ground-states of two-dimensional systems 10 . Extensions of PEPS methods to the time-domain have been recently developed, however the accessible timescales are extremely limited [11][12][13][14] . An alternative approach is to use tensor network structures, which are more numerically tractable. One way to achieve this is to map the twodimensional lattice into a one-dimensional chain and apply MPS methods, which are adjusted to handle the longranged interactions that arise from the mapping [15][16][17][18] . Ref. 16 , for example, introduced an algorithm which expresses the propagator as a matrix product operator (MPO) acting on the states encoded as MPS. Their application of this approach to two-dimensional lattices shares the very limited timescale of the more recent approaches based on PEPS, since the advantages in the computational scaling of simpler tensor networks are balancing out the disadvantages in non-optimal representation of entanglement by the tensor network structure for the problem at hand.
It is important to explore computationally tractable tensor network structures other than MPS, since these structures may enable progress in the computation of the exact dynamics of interacting two-dimensional systems. For this purpose, in this work we propose to employ Tree Tensor Network States (TTNS), which encompass all loop-free tensor network states. In particular, hierarchical, tree-like tensor network structures may be more flexible in capturing entanglement in more than one spatial dimensions, with a moderate increase in the computational scaling with the bond dimension. TTNS are rarely used in the context of interacting lattice systems 19-23 , but they feature prominently in applications like electronic structure methods [24][25][26] or molecular quantum dynamics in the chemical physics literature. In this context they are called the Multi-Configuration Time-Dependent Hartree (MCTDH) method and its multi-layer generalization (ML-MCTDH) [27][28][29] . In ML-MCTDH, the timedependent variational principle 30,31 (TDVP) is applied to a TTNS as a variational ansatz for the wavefunction. Up to differences in the numerical integrations scheme, these methods are similar to the more recent applications of the TDVP tailored specifically to matrix product states 32,33 . In this work, we generalize the algorithms of Refs. [32][33][34] to general TTNS. We note that similar versions of this algorithm were reported in Refs. 35,36 , which we became aware of during the preparation of this manuscript. While in our work we focus on two-dimensional systems, Ref. 35 showcases a promising application of a TTNS as an impurity-solver, which is an effectively zero-dimensional problem. On the other hand, Ref. 36 proves the algorithm's exactness property as well as a linear error-bound for the total time evolution in the timestep.
The purpose of this work is to investigate the performance of TTNS as a numerically exact method to study the dynamics of two-dimensional systems. In Sec. II, we introduce the main concepts of TTNS along with the TDVP before presenting the algorithm and commenting on some caveats which are relevant to the applications of the TDVP. We benchmark the method on an exactly solvable, non-interacting two-dimensional system, and compare our approach to previously published results for two-dimensional interacting hardcore bosons in section §III. Notably, we identify the reachable timescales and investigate convergence properties of the algorithm alongside with practical considerations regarding how to assess the accuracy of the results. We conclude by placing the results in the context of existing techniques and recent developments in section §IV.

II. THEORY
Tensor network states represent a pure state, |Ψ = s1...s N Ψ s1...s N |s 1 . . . s N , of a lattice system as a product of tensors {T }. Each tensor T i may have a number of indices corresponding to physical degrees of freedom and also auxiliary indices which do not correspond to physical degrees of freedom. Consider the Schmidt decomposition, corresponding to a bipartition of the lattice into a set of sites A and its complement B, Ψ s1...
This expression can be also understood as a product of three tensors, where a single auxiliary index of each tensor is shared with the diagonal matrix of the Schmidt coefficients (or singular values) λ i . In a general tensor network any auxiliary index will appear on two tensors, and summation over the common index implies contraction of the two tensors. Tensor networks can be represented diagrammatically, see Fig. 1a, where the nodes correspond to tensors and the links, dubbed legs in the following, indicate a shared index between the two tensors. Any tensor network for which the legs do not form closed loops is considered a Tree Tensor Network (TTN), with matrix product states (MPS) serving as a prominent example for one-dimensional lattices. Here, we focus on more general TTNS with a simple hierarchical structure: n- ary TTNS in which every node has one parent node and n child nodes, except for those in the top and bottom layers. We group all physical degrees of freedom into the bottom layer such that all layers above the bottom layer contain only nodes with auxiliary legs. We will further limit the discussion to binary TTNS, although the method described in this work applies to any TTNS. In a binary TTNS, a general node represents a third order tensor Λ [l,i] ,where l denotes the layer of the tree to which the node belongs and i enumerates the nodes in that layer. Each such node has two child nodes. Due to the lack of loops in the tensor network, the physical degrees of freedom separate naturally into two groups from the perspective of a node Λ [l,i] : those reachable by only descending in the tree towards the bottom layer, i.e. those in the subtree of Λ [l,i] , and their complement. We define the number of non-zero singular values of the Schmidt decomposition along this bipartition as the rank r of node Λ [l,i] . For a state with volume law entanglement, the exact rank r will generally scale exponentially with the system size. Thus we introduce a cutoff in the number of kept singular values, namely the bond dimension of the tree χ. In the following, we consider a TTNS of rank χ, which implies that all its tensors Λ We next present a method for time-propagation on the manifold M χ of tree tensor networks with tree rank χ using a time-dependent variational principle (TDVP) 30,31 . We start by introducing a few properties and manipulations of TTNS and then describe TDVP and its application to TTNS. We finally highlight important technical details in the use of the TDVP.  Figure 2. A TTNS isometrized about node [1,1], a), and its shorthand notation, b). The thick black bars on the environment tensors represent the set of physical sites belonging to each of the environment tensors.

A. TTNS -Basics
A TTNS of a rank χ is unique up to unitary transformations. This can be seen by inserting a unit matrix between two linked nodes of the tree where repeated indices are summed over, I represents a χ × χ unit matrix and U * indicates complex conjugation of the corresponding tensor. This property can be exploited to isometrize the tree around any of its nodes 21 , which is a generalization of the mixed canonical representation of MPS. To illustrate this concept, consider the isometrization about the top-node. In this perspective, every tensor in the tree, except the top-node, represents a truncated orthonormal basis in the space of the bases of children nodes, called isometry in the language of real-space or tensor RG. Through recursion, a structured, incomplete basis for the physical lattice sites is obtained. The coefficients for these basis functions are contained in the top node. Any general TTNS can be brought into this form using a sequence of QR decompositions. Practically, one applies QR factorization Λ γα2α3 = δ β,γ , for each of the nodes proceeding layer by layer from bottom to top and absorbing the matrices R into the parent node after each factorization (see also Fig. 1b). Graphically, the direction along which the tensors are orthogonalized is indicated by an arrow on the linking leg. Isometrization around a specific node in the tree translates into arrows pointing in the direction of this node on any (direct) path between the node and a physical site, see Fig.  2. We may rewrite the TTNS in the following manner: Here, we take the TTNS to be isometrized about node [l, i], indicated as Ψ[l, i], and an environment tensor V is the contraction of all tensors between the legs of node Λ [l,i] , labeled by α j , and the physical sites s j , linked by paths from leg α j that do not cross node Λ [l,i] . c j (i) and p(i) are placeholders for the child and parent of node Λ [l,i] , respectively. We note in passing that similarly to MPS methods, such a contraction is never explicitly carried out, and we only use it for notational convenience. For future reference, we define projectors onto environment tensors of the lower and upper levels in the hierarchy: and . A useful property of the environment tensors is their orthogonality, which allows for facile computation of certain physical quantities. For example, if the state is isometrized about node [l, i], the norm of the state is given by To improve the readability of the presentation, in the following we will omit the indices specifying the tensors' elements.

B. TDVP
The time-dependent variational principle generates classical dynamics in the space of variational parameters, α, described by the Lagrangian The associated action is minimized along a path on a certain variational manifold, which in our case is the manifold M χ of TTNS with tree rank χ. The least-action principle gives the following equation of motion, where P T (Ψ[α]) is the projector onto the tangent space of the manifold M χ at the point Ψ[α]. An expression for P T (Ψ[α]) was derived for general binary TTNS in Refs. 37,38 . Here, we will use an additive splitting of P T (Ψ[α]), in an analogy to those presented for TTNS with only two layers, i.e. Tucker tensors 34 and matrix product states 32,33 , respectively. Note that the latter two TNS are subclasses of a general TTNS and that the expressions for the projector, P T (Ψ[α]), are not restricted to binary TTNS but valid for any TTNS with straightforward modifications. In particular, with with the effective Hamiltonian environment We choose a convenient gauge in which the timederivative of any tensor of the TTNS representation is orthogonal to itself. This must be done to avoid overcompleteness of the basis of the tangent space. In this gauge the time derivative simplifies to and for R [l,i] , which results from the action of P Time-evolution is obtained by integrating the linear differential equations Eqs. (11) and (12) using the projector splitting integrator. The procedure simplifies for Hamiltonians which can be expressed as a sum of rank-1 terms, which is possible for any lattice Hamiltonian composed of a sum of products of spin operators. In this case, the tensors of the effective Hamiltonian H [l±1,p(i)/c(i)] ef f reduce to matrices, and similarly to the standard DMRG procedure, H ef f is obtained by contracting the Hamiltonian with the set of environment tensors of the current site, see Fig. 3. Note that the number of effective Hamiltonian terms to be evaluated for a given site can be reduced by combining terms in the rank-1 decomposition of the Hamiltonian during the recursive contraction. If the Hamiltonian is not a sum of rank-1 terms but, for example, a matrix product operator (MPO) or a tree tensor network operator (TTNO), the result of the contraction of the Hamiltonian with the environment tensors is not a set of matrices but a compact tensor network.

C. Splitting integrator
Formally, the splitting integrator is obtained using a Trotter splitting applied to the additive decomposition of the tangent-space-projected evolution operator. Practically, it consists of a forward walk on the tree, propagation of the top-level tensor Λ [0,1] for a full time step, and a backward walk on the tree. A pseudo-code is given in a) b) Figure 3. Graphical representation of the RHS of Eq. (11). b) is a short-hand notation for a) valid when the Hamiltonian is a sum of rank-1 terms. algorithms 1-3. During the walks on the tree, isometrization of the TTNS is alwaysmaintained about the currently visited node and the effective Hamiltonian matrices are updated when going from one node to another along the direction of the step. The forward walk (backward walk) starts from the top-level node and proceeds from the current node to the adjacent node in a clockwise direction (in a counter-clockwise direction) closest to the previous/incoming node and propagation for half a time step is performed while ascending (descending). A walk on the tree is finished once the top-node is reached after visiting all physical sites, i.e. after each tensor (and the associated matrix R) is propagated savethose of the top node.

D. Remarks
The algorithm introduced above reduces to a previously published projector-splitting integrator when applied to TTNS with a single-layer 34,39 . Furthermore, a modification of the presented algorithm in which propagation is carried out while descending and ascending on the tree renders the algorithm identical to the singlesite algorithm presented in Ref. 33 for the application of TDVP to MPS as also to the algorithm of Ref. 35 , when applied to respective classes of TTNS. On the other hand, Ref. 36 describes an algorithm for general TTNS, which is identical to the above algorithm with a single (either forward or backward) walk per timestep, for which a linear error-bound in the timestep for the total time-evolution is provided. While the choice of concatenating forward and backward walks on the tree in this work as well as in Ref. 35 is motivated by the similar second-order construction for MPS 33 and TTNS with a single layer 34 , it is unlikely that the quadratic error bound in the time-step for the total time-evolution of the latter carries over.
The choice of n-ary TTNS in this work is motivated by their relation to a coarse graining procedure in systems of dimension larger than one, which makes them a   natural choice for the study of two-dimensional lattices. While the TDVP applied to MPS has been demonstrated to be capable of simulating dynamics in two-dimensional systems 17 , a detailed analysis and comparison with other tensor network structures is absent in the literature. In particular, the numerical stability of the TDVP cannot be taken for granted, especially when interactions between sites are long-ranged and not smoothly decaying, as discussed in the following. The application of TDVP formally requires the TTNS corresponding to the initial condition to possess a full tree rank of r. However, many physical initial conditions of interest can be represented with a low rank TTNS or even as a product state. If the initial condition is not contained in the manifold of TTNS with tree rank of r due to rank deficiency, the TDVP doesn't provide a prescription for how to choose and evolve the redundant parameters, which will gain weight in the wavefunction representation at later times. Stability and exactness of the dynamics under such circumstances is then dependent on details of the implementation and the model. For the projector splitting integrator, the initial rank-deficiency translates into non-uniqueness of the matrix decompositions employed in the change of isometrization. While the algorithm is not guaranteed to be exact in this case, numerical experiments and prior applications of the algorithm in one-dimensional systems indicate that it is generally reliable even for product state initial conditions. As a check, one may choose to regularize the initial condition by the addition of weak noise, and test for invariance of the resulting dynamics at short times. The initial evolution of redundant variational paramaters depends on arbitrary choices such as their initialization, the choice of regularization (if applied), as well as the details of the linear algebra routines used. Thus, different initializations of the same physical state may not converge to the same solution 40,41 . This issue has been rectified by the introduction of a scheme to optimally initialize redundant parameters 41 . However, this scheme requires the evaluation of an effective Hamiltonian matrices for H 2 and its compatibility with the integration scheme employed in this work is an open question.
Practically, we observe that the dependence of our results on non-optimal initializations of redundant parameters systematically decreases with increasing bonddimension, which is also expected from the derivation of the optimization scheme mentioned above. The dependence on initialization becomes noticeable only when the wavefunction markedly departs from the exact result, which provides an additional handle to access the convergence of the method. This intuition, however, is largely based on systems with local, or at least smoothly decaying interactions, and can break down for interactions which result from mapping a 2D lattice to a 1D chain, for example. We note in passing, that while this discussion applies to the 1-site version of the TDVP employed in this work, the two-site variant of TDVP may have similar issues for interactions that go beyond nearest-neighbours in one spatial dimension.

III. RESULTS
We first benchmark the method developed in this work by comparison with exact results obtained for noninteracting fermions on a 2D lattice. In the second stage we propagate a 2D system of hard-core bosons with nearest neighbor interaction and compare our results to propagation using MPS 16 . All calculations employ a regularization of the initial product state, which consists of addition of white noise sampled uniformly from the interval [−10 −20 , 10 −20 ] and subsequent renormalization of the TTNS.

A. Non-interacting fermions
We compute the dynamics of non-interacting fermions on a 2D lattice with on-site disorder where h i is drawn from a uniform distribution [−W, W ] and J = 1. All simulations use an initial state which is a random product state of empty and occupied sites, and use a time step dt = 0.01. The tensor network state calculations employ the Jordan-Wigner transform of (13) Different paths along which the sites are enumerated can be chosen, and this choice potentially influences the performance of the TNS algorithm. Here, we choose the path such that the Jordan-Wigner strings span a minimal distance on the graph of the tree tensor network structure. While solving the non-interacting problem in the fermionic representation is trivial, the presence of Jordan-Wigner strings renders its solution with tensor network states just as difficult as that of an interacting problem. Fig. 4 shows the average of the absolute value deviation of the expectation value of the local density, n i =ĉ † iĉ i from the exact result for a clean and a disordered system (W = 10). Without disorder, convergence of the local density within 2 is obtained for both quaternary and binary TTNS only for t ≤ 1. For moderately strong disorder and sufficiently high bond dimensions, longer times are attainable, due to the fact that the system is Anderson localized with a moderate localization length. For times beyond the convergence time we also observe a strong influence on the initialization procedure of the redundant parameters of the TTNS. Increasing the bond dimension systematically reduces this effect.

B. Hard-Core Bosons (XXZ model in 2D)
We consider the dynamics of hard-core bosons on a 2D lattice,Ĥ  system and initial condition have been studied previously in Ref. 16 using matrix product states, where results up to tJ = 2.0 were presented. In contrast to the non-interacting model discussed above, no exact results are available for this interacting system. Therefore, the convergence of the results is assessed by comparing the deviation of the local density between different bond dimensions. Furthermore, since the Hamiltonian and the initial condition are isotropic, closeness to the exact solution can also be assessed by the anisotropy A(t) = 1 xy ρxy(t) xy (ρ xy (t) − ρ yx (t)) of the bosonic density. We note however, that the isotropy of the numerical solution is required, but not sufficient condition for the solution to be numerically exact. Both convergence criteria are shown in Fig. 6 as a function of time t. For quaternary and binary TTNS, small anisotropies (< 4%) are obtained until tJ = 2.5, similarly to the anisotropy reported in Ref. 16 at t = 2.0 using matrix product states. Generally, the quaternary TTNS has less anisotropic error since the partitioning of the lattice through the tree structure is isotropic. The deviation of the results for different bond dimensions (see right panel of Fig. 6) shows a different trend: while results for binary TTNS at intermediate and largest bond dimensions are quite close this is not the case for quaternary TTNS. While for both tree structures and t < 2, the average deviation per site is well below 1%, the deviation between the best results of either tree type saturates this threshold for t > 2. This indicates that assessing convergence using only anisotropy overestimates the quality of the results, since the results of both quaternary and binary trees may show slow convergence over the range of accessible bond dimensions. However, given the systematic improvement in the convergence with respect to the bond dimension combined with reduction of the anisotropy of the results, we consider the latter to achieve numerically exact results for t < 2.

IV. DISCUSSION
In this work we have assessed the performance of TTNS for simulating the dynamics of two-dimensional many-body lattice systems. We introduced an algorithm based on the time-dependent variational principle for arbitrary TTNS and benchmarked it on systems of noninteracting fermions and interacting hard-core bosons in two dimensions, comparing the performance to previously published results using matrix product states. During the preparation of the manuscript we became aware of a recent complementary work introducing a similar versions of the algorithm, which were applied in rather different settings (as an impurity solver 35 , and in a more formal derivation of the algorithm 36 ).
Currently, no efficient technique exists for exactly simulating the non-equilibrium dynamics of interacting, twodimensional quantum systems. Despite recent progress, the timescales accessible by tensor network techniques for such systems are generally extremely short. We have found tree tensor networks to perform at least as well as matrix product state techniques, with binary TTNS generally providing a more robust performance than their quaternary counterparts. The issue of analyzing the con-  vergence, and thus ensuring the numerical exactness of the computed result, was discussed and potential issues of slow convergence were highlighted. We believe the availability of an alternative to matrix product states in the form of more general TTNS is important and can offer additional insight on situations when slow convergence is observed.
Our analysis has been mostly qualitative and a promising future avenue is the exploration of the entanglement structure of out-of-equilibrium states in 2D lattices . This will aid in the identification of optimal tensor network structures in order to best exploit the increased flexibility of TTNS, which already has proven to be important in applications for zero-dimensional systems, such as impurity models and also for molecular quantum dynamics 35,[42][43][44][45] . The dynamics of critical onedimensional systems is another application where such an increased flexibility may be of advantage. For critical systems in equilibrium, the multi-scale entanglement renormalization (MERA) 46,47 ansatz provides an efficient tensor network structure, which bears resemblance with the n-ary tree structures employed here. However, since a time-evolution approach for MERA is missing, it is interesting to compare the performance of MPS and n-ary TTNS for critical systems out-of-equilibrium. We leave such an investigation to a future work.