Efficient numerical simulations with Tensor Networks: Tensor Network Python (TeNPy)

Tensor product state (TPS) based methods are powerful tools to efficiently simulate quantum many-body systems in and out of equilibrium. In particular, the one-dimensional matrix-product (MPS) formalism is by now an established tool in condensed matter theory and quantum chemistry. In these lecture notes, we combine a compact review of basic TPS concepts with the introduction of a versatile tensor library for Python (TeNPy) [https://github.com/tenpy/tenpy]. As concrete examples, we consider the MPS based time-evolving block decimation and the density matrix renormalization group algorithm. Moreover, we provide a practical guide on how to implement abelian symmetries (e.g., a particle number conservation) to accelerate tensor operations.

The interplay of quantum fluctuations and correlations in quantum many-body systems can lead to exciting phenomena. Celebrated examples are the fractional quantum Hall effect [2,3], the Hadane phase in quantum spin chains [4,5], quantum spin liquids [6], and high-temperature superconductivity [7]. Understanding the emergent properties of such challenging quantum many-body systems is a problem of central importance in theoretical physics. The main difficulty in investigating quantum many-body problems lies in the fact that the Hilbert space spanned by the possible microstates grows exponentially with the system size.
To unravel the physics of microscopic model systems and to study the robustness of novel quantum phases of matter, large scale numerical simulations are essential. In certain systems where the infamous sign problem can be cured, efficient quantum Monte Carlo (QMC) methods can be applied. In a large class of quantum many-body systems (most notably, ones that involve fermionic degrees of freedom or geometric frustration), however, these QMC sampling techniques cannot be used effectively. In this case, tensor-product state based methods have been shown to be a powerful tool to efficiently simulate quantum many-body systems.
The most prominent algorithm in this context is the density matrix renormalization group (DMRG) method [8] which was originally conceived as an algorithm to study ground state properties of one-dimensional (1D) systems. The success of the DMRG method was later found to be based on the fact that quantum ground states of interest are often only slightly entangled (area law), and thus can be represented efficiently using matrix-product states (MPS) [9][10][11].
More recently it has been demonstrated that the DMRG method is also a useful tool to study the physics of two-dimensional (2D) systems using geometries such as a cylinder of finite circumference so that the quasi-2D problem can be mapped to a 1D one [12]. The DMRG algorithm has been successively improved and made more efficient. For example, the inclusion of Abelian and non-Abelian symmetries, [13][14][15][16][17], the introduction of single-site optimization with density matrix perturbation [18,19], hybrid real-momentum space representation [20,21], and the development of real-space parallelization [22] have increased the convergence speed and decreased the requirements of computational resources. An infinite version of the algorithm [23] has facilitated the investigation of translationally invariant systems. The success of DMRG was extended to also simulate real-time evolution allowing to study transport and non-equilibrium phenomena, [24][25][26][27][28][29]. However, the bipartite entanglement of pure states generically grows generically linearly with time, leading to a rapid exponential growth of the computational cost. This limits time evolution to rather short times. An exciting recent development is the generalization of DMRG to obtain highly excited states of many-body localized systems [30][31][32]. Tensor-product states (TPS) or equivalently projected entangled pair states (PEPS), are a generalization of MPS to higher dimensions [33,34]. This class of states is believed to efficiently describe a wide range of ground states of two-dimensional local Hamiltonians. TPS serve as variational wave functions that can approximate ground states of model Hamiltonians. For this several algorithms have been proposed, including the Corner Transfer Matrix Renormalization Group Method [35], Tensor Renormalization Group (TRG) [36], Tensor Network Renormalization (TNR) [37], and loop optimizations [38].
A number of very useful review articles on different tensor network related topics appeared over the past couple of years. Here we mention a few: Ref. [11] provides a pedagogical introduction to MPS and DMRG algorithms with detailed discussions regarding their implementation. In Ref. [39], a practical introduction to tensor networks including MPS and TPS is given. Applications of DMRG in quantum chemistry are discussed in Ref [40].
In these lecture notes we combine a pedagogical review of basics MPS and TPS based algorithms for both finite and infinite systems with the introduction of a versatile tensor library for Python (TeNPy) [1]. In the following section, we motivate the ansatz of TPS with the area law of entanglement entropy. In section 3 we introduce the MPS ansatz for finite systems and explain the time evolving block decimation (TEBD) [24] and the DMRG method [8] as prominent examples for algorithms working with MPS. In section 4 we explain the generalization of these algorithms to the thermodynamic limit. For each algorithm, we give a short example code show how to call it from the TeNPy library. Finally, we provide a practical guide on how to implement abelian symmetries (e.g., a particle number conservation) to accelerate tensor operations in section 5.

Entanglement in quantum-many body systems
Entanglement is one of the fundamental phenomena in quantum mechanics and implies that different degrees of freedom of a quantum system cannot be described independently. Over the past decades it was realized that the entanglement in quantum many-body system can give access to a lot of useful information about quantum states. First, entanglement related quantities provide powerful tools to extract universal properties of quantum states. For example, scaling properties of the entanglement entropy help to characterize critical systems [41][42][43][44], and entanglement is the basis for the classification of topological orders [45,46]. Second, the understanding of entanglement helped to develop new numerical methods to efficiently simulate quantum many-body systems [11,47]. In the following, we give a short introduction to entanglement in 1D systems and then focus on the MPS representation.
Let us consider the bipartition of the Hilbert space H = H L ⊗ H R of a 1D system as illustrated in Fig. 1(a), where H L (H R ) describes all the states defined on the left (right) of a given bond. In the so called Schmidt decomposition, a (pure) state |Ψ ∈ H is decomposed as where the states {|α L(R) } form an orthonormal basis of H L (H R ) and Λ α ≥ 0. The Schmidt decomposition is unique up to degeneracies and for a normalized state |Ψ we find that α Λ 2 α = 1. An important aspect of the Schmidt decomposition is that it gives direct insight into the bipartite entanglement (i.e., the entanglement between degrees of freedom in H L and H R ) of a state, as we explain in the following. The amount of entanglement is measured by the entanglement entropy, which is defined as the von-Neumann entropy S = − Tr ρ R log(ρ R ) of the reduced density matrix ρ R . The reduced density matrix of an entangled (pure) quantum state |ψ is the density matrix of a mixed state defined on the subsystem, A simple calculation shows that it has the Schmidt states |α R as eigenstates and the Schmidt coefficients are the square roots of the corresponding eigenvalues, i.e., ρ R = α Λ 2 α |α R α| R (equivalently for ρ L ). Hence, the entanglement entropy can be expressed in terms of the If there is no entanglement between the two subsystems, S = 0, the Schmidt decompositions consists only of a single term with Λ 1 = 1. The entanglement spectrum { α } [48] is defined in terms of the spectrum {Λ 2 α } of the reduced density matrix by Λ 2 α = exp(− α ) for each α.

Area law
A "typical" state in the Hilbert space shows a volume law, i.e., the entanglement entropy grows proportionally with the volume of the partitions. In particular, it has been shown in Ref. [49] that in a system of N sites with on-site Hilbert space dimension d, a randomly drawn state |ψ random has an entanglement entropy of S ≈ N/2 log d − 1/2 for a bipartition into two parts of N/2 sites. In contrast, ground states |ψ 0 of gapped and local Hamiltonians follow instead an area law, i.e., the entanglement entropy grows proportionally with the area of the cut [50]. For a cut of an N-site chain as shown in Fig. 1(a) this implies that S(N ) is constant for N ξ (with ξ being the correlation length). This can be intuitively understood from the fact that a gapped ground state contains only fluctuations within the correlation length ξ and thus only degrees of freedom near the cut are entangled, as schematically indicated in Fig. 1(b). A rigorous proof of the area law in 1D is given in Ref. [10]. In this respect, ground states are very special states and can be found within a very small corner of the Hilbert space, as illustrated in Fig. 1(c).
In slightly entangled states, only a relatively small number of Schmidt states contribute significantly. This is demonstrated in Fig. 1(d) by comparing the largest 20 Schmidt values of an area law and a volume law state for a bipartition of an N = 16 chain into two half chains.
As an example of an area law state, we considered here the ground state of the transverse field Ising model with σ x n and σ z n being the Pauli operators and g > 0. This Z 2 symmetric model with a quantum phase transition at g c = 1 has two very simple limits. For g = 0, the ground state is twofold degenerate and given by the ferromagnetic product state (symmetry broken), and at g → ∞ the ground state is a product state in which all spins are polarized by the transverse field in x-direction (symmetric). For intermediate values of g, the ground states are area law type entangled states (except at the critical point). As shown in Fig. 1(d) for a representative example of g = 1.5, the ground state has essentially the entire weight contained in a few Schmidt states. Generic states fulfilling the area law show a similar behavior and thus the above observation provides an extremely useful approach to compress quantum states by truncating the Schmidt decomposition. In particular, for all > 0 we can truncate the Schmidt decomposition at some finite χ (independent of the system size) such that This particular property of area law states is intimately related to the MPS representation of 1D quantum states, as we will discuss in the next chapter. The situation is very different for a highly entangled (volume law) random state: All the Schmidt values are roughly constant for all 2 N/2 states and thus only little weight is contained in the 20 dominant states (assuming an equal weight, we find Λ 2 α ≈ 1/2 N/2 per Schmidt state).

Finite systems in one dimension
In this chapter, we consider a chain with N sites. We label the local basis on site n by |j n with j n = 1, . . . , d, e.g., for the transverse field Ising model we have spin-1/2 sites with the (d = 2) local states |↑ , |↓ . A generic (pure) quantum state can then be expanded as |ψ = j 1 ,j 2 ,...j N ψ j 1 j 2 ···j N |j 1 , j 2 , . . . , j N .
Before we proceed with the definition of MPS, we introduce a diagrammatic notation, which is very useful for representing tensor networks and related algorithms and has been established in the community. In this notation, a tensor with n indices is represented by a symbol with n legs. Connecting two legs among tensors symbolizes a tensor contraction, i.e., summing over the relevant indices. This is illustrated in Fig. 2.

Matrix Product States (MPS)
The class of MPS is an ansatz class where the coefficients ψ j 1 ,...,jn of a pure quantum state are decomposed into products of matrices [9,51,52]: Spin-1 AKLT state.
Affleck, Kennedy, Lieb, and Tasaki (AKLT) constructed an exactly solvable Hamiltonian which reads where S are spin S = 1 operators and P S=2 j,j+1 is a projector onto the S = 2 sector of the spins on sites j and j + 1. This model is in a topologically nontrivial phase with remarkable properties of the ground state. To construct the ground state, we note that the projector P S=2 j,j+1 does not give a contribution if we decompose the S = 1 spins on each site into two S = 1 2 spins and form singlets between spins on neighboring sites, as illustrated in Fig. 3(d) [53]. While the ground state is unique on a ring with periodic boundary conditions, in a chain with open boundary conditions the S = 1 2 spins on the edges do not contribute to the energy and thus lead to a 4-fold degeneracy of the ground state. Given the structure of the ground state, we can construct the corresponding MPS as shown in Fig. 3(e): We start by writing the product of singlets with the matrices of eq. 11 and add arbitrary spin-1 2 states φ L and φ R on the left and right. We apply the projectors P S=1 to map the two spin-1 2 onto the physical spin-1 site, and contract the three tensors on each site to obtain the MPS structure. For sites 1 < n < N in the bulk, we obtain Here, we included the factor 4 3 to normalize the MPS.
In general, any state in a finite system can be decomposed exactly into the MPS form of eq. (7). The caveat is that for a generic state (with a volume law entanglement) the required bond dimension χ max := max n χ n increases exponentially with the number of sites N . However, by linking the MPS representation with the Schmidt decomposition (1), we will see that we can approximate area law states very well (in the sense of eq. (5)) by MPS with a finite bond dimension χ max [54,55]. This link is given by the so-called canonical form of an MPS, which we introduce now.

Canonical form
The MPS representation (7) is not unique. Consider the bond between sites n and n + 1, which defines a bipartition into L = { 1, . . . , n } and R = { n + 1, . . . , N }. Given an invertible χ n+1 × χ n+1 matrix X, we can replace and still represent the same state |ψ , see Fig. 4(a). This freedom can be used to define a convenient "canonical form" of the MPS, following Ref. [56,57]. α n+1 on the diagonal. Performing partial contractions gives a However, for general M andΓ [n] , the states |α n+1 L/R will not be orthonormal. Note that we can interpret the X in eq. (14) as a basis transformation of the states |α n+1 R in eq. (17). The idea of the canonical form is to choose the X in eq. (14) such that it maps |α n+1 R to the Schmidt states |α n+1 R . Using the Schmidt values Λ , we find that eq. (15) indeed gives the Schmidt decomposition. Repeating this on each bond yields the canonical form depicted in Fig. 4(b), Here, we have introduced trivial 1 × 1 matrices Λ [1] ≡ Λ [N +1] ≡ 1 multiplied to the trivial legs of the first and last tensor, again with the goal to achieve a uniform bulk. While the canonical form is useful as it allows to quickly read off the Schmidt decomposition on any bond, in practice we usually group each Γ with one of the Λ matrices and define If we write an MPS entirely with A tensors (B tensors), it is said to be in left (right) canonical form. In fact, all the examples given in eq. (8)- (13) are in right-canonical form. If we consider the bond between sites n and n + 1, we can write the MPS in a "mixed" canonical form with A tensors up to site n and B tensors starting from site n + 1, as depicted in Fig. 4(c) for n = 3. The A and B tensors transform the Schmidt basis from one bond to the next: Therefore, the orthonormality conditions α n | L |ᾱ n L = δ αnᾱn = α n | R |ᾱ n R translate into the very useful relations shown in Fig. 4(d).
One great advantage of the canonical form is that these relations allow to evaluate expectation values of local operators very easily. As shown in Fig. 5, this requires only the contraction of a few local tensors. If needed, we can easily convert the left and right canonical forms into each other, e.g., are diagonal matrices, their inverses are simply given by diagonal matrices with the inverse Schmidt values 1 .
As mentioned above, we can represent any state in a finite system if we allow an arbitrary bond dimension χ max ; but to avoid a blowup of the computational cost (exponentially in N ), we need to truncate the matrices to a moderate bond dimension χ max . Consider the bond between sites n and n+1. It turns out that the simple truncation of the Schmidt decomposition is optimal in the sense of minimizing the error in eq. (5). In the (mixed) canonical form, we can therefore simply discard 2 some rows of A [n]jn , diagonal entries of Λ [n+1] and columns of B [n+1]j n+1 (namely the ones corresponding to the smallest Schmidt values). To preserve the norm of the wave function, we renormalize the Schmidt values on the diagonal of α n+1 = 0 for some αn+1, we can remove the corresponding columns of B [n] and rows of B [n+1] before taking the inverse, as they do not contribute to the wave function.
2 Strictly speaking, this changes the Schmidt values and vectors on other bonds and thus destroys the canonical form! This is the reason why TEBD does not even preserve the energy at large times when we need to truncate strongly. However, if the discarded weight α>χ Λ [n] αα is small, this error might be ignored. An improved algorithm based on the time dependent variational principle (TDVP) was introduced in in Refs. [28,29] which performs a unitary evolution in the space of MPS with given bond dimension χmax.

Time Evolving Block Decimation (TEBD)
In the TEBD algorithm [24], we are interested in evaluating the time evolution of a quantum state: The time evolution operator U can either be U (t) = exp(−itH) yielding a real time evolution, or an imaginary time evolution U (τ ) = exp(−τ H). The latter is used to find ground states of the Hamiltonian H through the relation To achieve this, one makes use of the Suzuki-Trotter decomposition [58], which approximates the exponent of a sum of operators with a product of exponents of the same operators. For example, the first and second order expansions read Here X and Y are operators, and δ is a small parameter. To make use of these expressions, we assume that the Hamiltonian is a sum of two-site operators of the form where h [n,n+1] acts only on sites n and n + 1, and decompose it as a sum Each term H odd and H even consists of a sum of commuting operators, therefore e H odd δ = n odd e h [n,n+1] δ and similar for H even . We now divide the time into small time slices δt 1 (the relevant time scale is in fact the inverse gap) and consider a time evolution operator U (δt). Using, as an example, the first order decomposition (23), the operator U (δt) can be expanded into products of two-site unitary operators where n+1] . This decomposition of the time evolution operator is shown pictorially in Fig. 6(a). The successive application of these two-site unitary operators to an MPS is the main part of the algorithm and explained in the following.
Local unitary updates of an MPS. One of the advantages of the MPS representation is that local transformations can be performed efficiently. Moreover, the canonical form discussed above is preserved if the transformations are unitary [56]. A one-site unitary U simply transforms the tensors Γ of the MPS Figure 6: (a) In TEBD each time step δt of a time evolution is approximated using a Suzuki-Trotter decomposition, i.e., the time evolution operator is expressed as a product of two-site operators. (b) Update to apply a two-site unitary U and recover the MPS form, see main text for details.
In such a case the entanglement of the wave-function is not affected and thus the values of Λ do not change.
The update procedure for a two-site unitary transformation acting on two neighboring sites n and n + 1 is shown in Fig. 6(b). We first find the wave function in the basis spanned by the left Schmidt states |α n+2 L , the local basis |j n and |j n+1 on sites n and n + 1, and the right Schmidt states |α n+2 R , which together form an orthonormal basis { | α n L ⊗ | j n ⊗ | j n+1 ⊗ | α n+2 R }. Calling the wave function coefficients Θ, the state is expressed as |ψ = αn,jn,j n+1 ,α n+2 Θ jnj n+1 αnα n+2 |α n L |j n |j n+1 |α n+2 R .
Using the definitions of |α L/R shown in Fig. 3(d), Θ is given by Writing the wave function in this basis is useful because it is easy to apply the two-site unitary in step (ii) of the algorithm:Θ jnj n+1 Next we have to extract the new tensorsB [n] ,B [n+1] andΛ [n+1] from the transformed tensor Θ in a manner that preserves the canonical form. We first "reshape" the tensorΘ by combining indices to obtain a dχ n × dχ n+2 dimensional matrix Θ jnαn;j n+1 α n+2 . Because the basis { | α n L ⊗ | j n } is orthonormal, as for the right, it is natural to decompose the matrix using the singular value decomposition (SVD) in step (iii) intõ is a diagonal matrix. Indeed, the suggestive notation that the new tensors are in mixed canonical form is justified, since the SVD yields a Schmidt decomposition of the wave function for a bipartition at the bond between sites n and n + 1.
After the update, the new MPS is still in the canonical form. The entanglement at the bond n, n + 1 has changed and the bond dimension increased to dχ. Thus the amount of information in the wave function grows exponentially if we successively apply unitaries to the state. To overcome this problem, we perform an approximation by fixing the maximal number of Schmidt terms to χ max . In each update, only the χ max most important states are kept in step (iii), i.e., if we order the Schmidt states according to their size we simply truncate the range of the index α n+1 in eq. (31) to be 1 . . . χ max . This approximation limits the dimension of the MPS and the tensors B have at most a dimension of χ max × d × χ max . Given that the truncated weight is small, the normalization conditions for the canonical form will be fulfilled to a good approximation. In order to keep the wave function normalized, one should divide by the norm after the truncation, i.e., divide by N = jn,j n+1 ,αn,α n+2 Θ jnj n+1 αnα n+2 2 .
If we perform an imaginary time evolution of the state, the operator U is not unitary and thus it does not conserve the canonical form. It turns out, however, that the successive Schmidt decompositions assure a good approximation as long as the time steps are chosen small enough. One way to obtain very accurate results is to decrease the size of the time steps successively [57].
The simulation cost of this algorithm scales as O(d 3 χ 3 max ) and the most time consuming part of the algorithm is the SVD in step (iii). Numerically, the algorithm can become unstable when the values of Λ become very small since the matrix has to be inverted in order to extract the new tensors in step (iv) of the algorithm. This problem can be avoided by applying a slightly modified version of this algorithm as introduced by Hastings in Ref. [59]. The following example code uses the TeNPy library [1] to perform an imaginary time evolution, finding the ground state of the transverse field Ising model (4). For a real time evolution, one can use eng.run() instead of eng.run_GS(). Note that the TEBD algorithms is rather slow in finding the ground state (especially near critical points) and can moreover only be applied to Hamiltonians with nearest-neighbor couplings. The DMRG algorithm discussed in the following represents a more efficient and versatile algorithm to study ground state properties.
from tenpy . networks . mps import MPS from tenpy . models . t f i s i n g import TFIChain from tenpy . a l g o r i t h m s import tebd

Matrix Product Operators (MPO)
The DMRG algorithms explained in the next section relies on expressing the Hamiltonian in the form of a matrix product operator (MPO). An MPO is a natural generalization of an MPS to the space of operators, given by where W [n]jnj n are D × D matrices, and |j n , |j n represent the local basis states at site n, as before. At the boundaries we initiate and terminate the MPO by the left and right vectors v L , v R . A diagrammatic representation of an MPO is given in Fig. 7(a). The advantage of the MPO is that it can be applied efficiently to a matrix product state as shown in Fig. 7(b). All local Hamiltonians with only short range interactions can be represented exactly using an MPO of a small dimension D. Let us consider, for example, the MPO of the anisotropic Heisenberg (XXZ) model in the presence of an field h n which can vary from site to site. The

Hamiltonian is
where S γ n , with γ = x, y, z, is the γ-component of the spin-S operator at site n, ∆ is the XXZ anisotropic interaction parameter. Expressed as a tensor product, the Hamiltonian takes the following form: The corresponding MPO has a dimension D = 5 and is given by where the entries of this "matrix" are operators acting on site n, corresponding to the indices j n , j n , and v L = 1, 0, 0, 0, 0 , v R = 0, 0, 0, 0, 1 T .
By multiplying the matrices (and taking tensor products of the operators), one can easily see that the product of the matrices does in fact yield the Hamiltonian (35). Further details of the MPO form of operators can be found in Refs. [11,60].
To derive the form of the matrices for a more complicated Hamiltonian, it can be useful to view the MPO as a finite state machine [61]. Using this concept, the generation of an MPO for models with finite-range (two-body) interactions is automated in TeNPy [1]. The following example code creates a model representing eq. (34). Moreover, various models (including the Heisenberg bosonic and fermionic models on cylinders and stripes) are already predefined under tenpy.models and can easily be generalized; in fact, the model defined below is a special case of the more general spin chain model in tenpy.models.spin.
from tenpy . models . l a t t i c e import Chain from tenpy . networks . s i t e import S p i n S i t e from tenpy . models . model import

Density Matrix Renormalization Group (DMRG)
We now discuss the Density Matrix Renormalization Group (DMRG) algorithm [8]. Unlike TEBD, the DMRG is a variational approach to optimize the MPS, but the algorithms have many steps in common. One advantage of the DMRG is that it does not rely on a Suzuki-Trotter decomposition of the Hamiltonian and thus applies to systems with longer range interactions. We assume only that the Hamiltonian has been written as an MPO. Secondly, the convergence of the DMRG method to the ground state is in practice much faster. This is particularly the case if the gap above the ground state is small and the correlation length is long. The schematic idea for the DMRG algorithm is as follows (see Fig. 8). Like in TEBD, the state at each step is represented by an MPS. We variationally optimize the tensors of two neighboring sites (say n and n + 1) to minimize the ground state energy ψ|H|ψ , while keeping the rest of the chain fixed. To do so, at each step we represent the initial wave function |ψ using the two site tensor Θ jnj n+1 αnα n+2 (as previously defined in eq. (29) the TEBD section), project the Hamiltonian into the space spanned by the basis set { | α n L ⊗ | j n ⊗ | j n+1 ⊗ | α n+2 R }, and use an iterative algorithm (e.g. Lanczos) to lower the energy. Repeating this two-site update for each pair of neighboring sites, the wave function converges to the ground state. While the Trotter decomposition requires to update first all even bonds and then odd bonds, see eq. (26), in the DMRG we perform the two-site updates in a sequential order 3 , starting from the left and proceeding to the right, n = 1, 2, 3, . . . , L − 2, L − 1, and then back from right to left, n = L − 1, L − 2, . . . , 3, 2, 1. This sequence is called a "sweep" from left to right and back.
Two-site update. We start by describing the update of the tensors on two neighboring sites n and n + 1. Let us assume that we have the MPS in mixed canonical form as depicted in Fig. 8(a). We now want to find new while keeping all other tensors fixed.
Step (i) of the update is identical to the first step in the TEBD method: We contract the tensors for two neighboring sites to obtain the initial two-site wave function Θ jnj n+1 αnα n+2 . The orthonormal basis { | α n L ⊗ | j n ⊗ | j n+1 ⊗ | α n+2 R } spans the variational space |ψ = αn,jn,j n+1 ,α n+2Θ jnj n+1 αnα n+2 |α n j n j n+1 α n+2 of the update, in which we must minimize the energy E = ψ |H eff |ψ in order to determine the new optimalΘ. Here, H eff is the Hamiltonian projected onto the variational space. Recall from Fig. 4(c) that the product A [1] A [2] · · · A [n−1] gives exactly the projection from |i 1 i 2 . . . i n−1 to |α n L , and similarly B [n+2] · · · B [L] maps |i n+2 . . . i N to |α n+1 R . Hence, H eff is given by the network shown in Fig. 8(b). For convenience, we have contracted the tensors strictly left of site n to form L [n] , and the ones to the right of site n + 1 into R [n+1] , respectively. We call these partial contractions L [n] and R [n+1] the left and right "environments". Each environment has three open legs, e.g., L [n] has an MPO bond index γ n and the two bond indices α n , α n of the bra and ket MPS. For now let us assume that we already performed these contractions; we will later come back to the initialization of them.
Grouping the indices on the top and bottom, we can view H eff as a matrix with dimensions up to χ 2 max d 2 × χ 2 max d 2 . Minimizing the energy E = ψ |H eff |ψ thus means to find the the χ 2 max d 2 dimensional ground-state vectorΘ of the effective Hamiltonian. Since this is the computationally most expensive part of the DMRG algorithm, it is advisable to use an iterative procedure like the Lanczos algorithm instead of a full diagonalization of H eff . If the previous two-site wave function Θ obtained in step (i) is already a good approximation of the ground state, the Lanczos algorithm typically converges after a few steps and thus requires only a few "matrix-vector" multiplications, i.e., contractions of H eff with Θ. Note that the scaling of such a matrix-vector multiplication is better ( d 4 )). This update step can be compared to the TEBD update where we obtain a new wave-functionΘ after applying an time-evolution operator. As with TEBD, we split the newΘ using an SVD in step (iii), and must truncate the new index α n+1 to avoid a growth χ → dχ of the bond dimension. It is important that the left and right Schmidt basis |α n L , |α n+2 R are orthonormal to ensure that the truncation at the given bond is optimal. Assuming that this is the case, the isometry properties of the SVD matrices imply that the orthonormality conditions also hold for the updated Schmidt states |α n L/R defined about the central bond.
At this point, we have improved guesses for the tensorsÃ [n] ,Λ [n+1] ,B [n] (after a reshaping into the desired form) and can move on to the next bond. Note that we moved the center of the mixed canonical form to the central bond n : n + 1. If we move to the right, the next two-site wave function Θ for step (i) is thus again given byΛ [ . Moreover, we need to find the next environments.
The starting environments on the very left and right are simply given by (see Fig. 8(a)) Here, the δ α 1 α 1 and δ α N +1 α N +1 are trivial since α 1 and α N +1 are dummy indices which take only a single value. The other environments can be obtained from a simple recursion rule shown as step (iv) of Fig. 8(d). Using this recursion rule, R [2] required for the first update of the sweep can be obtained by an iteration starting from the right-most R [ N ]. Note that the update on sites n, n + 1 does not change the right environments R k for k > n + 1. Thus it is advisable to keep the environments in memory, such that we only need to recalculate the left environments when sweeping from left to right, and vice versa in the other direction. As noted above, DMRG is usually faster and more stable than an imaginary time evolution with TEBD. Adapting the TeNPy code from the TEBD section to run DMRG instead of TEBD is very simple [1]: from tenpy . networks . mps import MPS The procedure described above optimizes always two sites. Ref. [18] introduced a way to perturb the density matrices during the algorithm in order to avoid getting stuck in local minima, which allows to perform DMRG optimizing only a single site at once. Especially for models with long-range interactions (as appear when mapping a quasi-2D cylinder to a 1D chain) or models with topological phases, such a perturbation can be necessary to converge towards the correct ground state. In TeNPy, this perturbation can be activated with the parameter 'mixer'. An alternative method was introduced recently [19].

Infinite systems in one dimension
For translation invariant systems, we can take the thermodynamic limit in which the number of sites N → ∞, generalizing (7)  The paramagnetic product state |· · · ←← · · · with the tensors of eq. (9) is trivial example for such a translation invariant state; another example is the AKLT state given in eq. (13). In general, we might only have a translation invariance by shifts of (multiples of) L sites. In this case we introduce a repeating unit cell of L sites with L different tensors, M [n] = M [n+L] in eq. (39). For example, the Neel state |· · · ↑↓↑↓ · · · is only invariant under a translation by (multiples of) L = 2 sites, with the tensors on even and odd sites as given in eq. (10) for the finite case, illustrated in Fig. 9(a). The length L of the unit cell should be chosen compatible with the translation symmetry of the state to be represented, e.g. for the Neel state L should be a multiple of 2. Choosing L larger than strictly necessary allows to check the translation invariance explicitly. At first sight, it might seem that we need to contract an infinite number of tensors to evaluate expectation values of local operators, as the corresponding network consists of an infinite number of tensors. However, as shown in Fig. 9(b) for a unit cell of L = 2 sites, the network has a repeating structure consisting of the so-called transfer matrix T defined as A state is called pure if the largest (in terms of absolute value) eigenvalue of T is unique and mixed if it is degenerate. In the following, we will always assume that the state is pure (in fact every mixed state can be uniquely decomposed into a sum of pure ones). We renormalize the iMPS such that the largest eigenvalue of T is 1. The eigenvector depends on the gauge freedom of eq. (14), which we can use to bring the iMPS into the convenient canonical form defined by the Schmidt decomposition on each bond, see Fig. 9(c). An algorithm to achieve this is described in Ref. [62]. , the orthonormality condition of the Schmidt vectors depicted in Fig. 3(e) applied to the whole unit cell implies that δ γγ is a right eigenvector of T with eigenvalue 1. Note that T is not symmetric and hence left and right eigenvectors differ; the left eigenvector to the eigenvalue 1 is (Λ [1] α ) 2 δ αα . All other eigenvalues of the transfer matrix have magnitude smaller than 1. Therefore, the repeated application of the transfer matrix in the network of the expectation value projects onto these dominant left and right eigenvectors, and the infinite network of the expectation value ψ|O n |ψ simplifies to a local network as in the finite case, see Fig. 5.
A similar reasoning can be used for the correlation function ψ|O n O m |ψ . Projecting onto the dominant eigenvectors left of O n and right O m , we arrive at the network of Fig. 10(a). In between the operators O n and O m , the transfer matrix T appears N = |m−n| L − 1 times, where · denotes rounding down to the next integer. Formally diagonalizing the transfer matrix to take the N -th power shows that the correlation function is a sum of exponentials, Here, η i labels the i-largest eigenvalue corresponding to the left and right eigenvectors η n ) denotes the remaining parts of the network shown in Fig. 10, and we identified the C 1 = ψ|O n |ψ ψ|O m |ψ in the term of the dominant eigenvalue η 1 = 1 . The decay of the correlations is thus determined by the second largest eigenvalue η 2 , which yields the correlation length Numerically, it is readily obtained from a sparse algorithm finding extremal eigenvalues of T .

Infinte Time Evolving Block Decimation (iTEBD)
Generalizing TEBD to infinite systems is very simple and requires only minor modifications in the code [57]. Without loss of generality we assume that the Hamiltonian is translation invariant by L sites as the iMPS; otherwise we enlarge the unit cells. As in the finite case, we use a Suzuki-Trotter decomposition to obtain the expression of the time evolution operator U (t) given in eq. (26), but now the index n runs over all integer numbers, n ∈ Z. If we apply the two-site unitary U [n,n+1] = e ih [n,n+1] δt on the iMPS to update the matrices B [n] and B [n+1] as illustrated in Fig. 6(b), this corresponds due to translation invariance to the action of U [n,n+1] on the sites (n + mL, n + 1 + mL) for any m ∈ Z. Therefore, we can use the same two-site update as in the finite case; the only difference is that the matrices of the iMPS represent only the unit cell with nontrivial left and right bonds, and compared to a finite system with L sites we have an additional term h [L,L+1] ≡ h [L,1] accross the boundary of the unit cell. Note that the iTEBD algorithm is different from a time evolution in a finite system of N = L sites with periodic boundary conditions. For analytical calculations with MPS in systems with periodic boundary conditions, it can be usefull to change the definition of an MPS from eq. (7) to which has at first sight the same tensor network structure as an iMPS. However, cutting a single bond of such a finite MPS with periodic boundary conditions does not split it into two parts. Therefore, the canonical form (which relies on the Schmidt decomposition) is not well defined in a system with periodic boundary conditions (or in general for any tensor network state in which the bonds form loops) 4 . Since the two-site update scheme of iTEBD implicitly uses the canonical form, it implements the time evolution in the infinite system with open boundary conditions. This becomes also evident by the fact that the bond dimension χ max -in other words the number of Schmidt states take into account -can get larger than the Hilber space dimension d L inside one unit cell.
In TeNPy one only needs to change the parameter bc_MPS from 'finite' to 'infinite' to switch from TEBD to iTEBD [1]. In addition, we can choose a smaller unit cell of just L = 2 sites and calculate the energy density per site in the end.

Infinite Density Matrix Renormalization Group (iDMRG)
While iTEBD works directly in the thermodynamic limit N → ∞ by employing translation invariance, for the infinite Density Matrix Renormalization Group (iDRMG) one should think of a finite system with a growing number of sites -the "renormalization group" in the name refers to this. Let us assume that the Hamiltonian is given as an MPO with a translation invariant unit cell consisting of W [n] , n = 1, · · · , L, which we can terminate with the boundary vectors v L , v R to obtain the Hamiltonian of a finite system with a multiple of L sites. We initialize the environments and perform two-site updates during a sweep exactly like in finite DMRG. The crucial difference is that we increase the system size between the sweeps as illustrated in Fig. 11(b): assuming translation invariance, we redefine the left and right environments L → L and R → R to include additional unit cells. Moreover, we need to extend the sweep to include an update on the the sites (L, L + 1) ≡ (L, 1). With each unit cell inserted, the described finite system growth by L sites, where we focus only only on the central L sites. Full translation invariance is only recovered when the iDMRG iteration of sweeps and growing environments converges to a fix point, at which the environments describe infinite half-chains.
One subtlety of the above prescription lies in the interpretation of the energy E obtained during the diagonalization step. Is it the (infinite) energy of the infinite system? Keeping track of the number of sites R/L included into each of the environments, we see that the energy E corresponds to a system of size N = L + L + R . By monitoring the change in Figure 11: (a) For iDMRG (here with a unit cell of L = 2 sites), we initialize the environments and perform updates like in DMRG of a finite system with L sites. (b) Between the sweeps, we increase the system size by inserting a unit-cell of L sites into each of the environments (assuming translation invariance of the iMPS).
E with increased N , we can extract the energy per site. This is convenient for problems in which there is no few-site Hamiltonian with which to evaluate the energy.
When symmetry breaking is expected, it is helpful to initialize the environments by repeatedly performing the iDMRG update without performing the Lanczos optimization, which builds up environments using an initial symmetry broken MPS.
Like for iTEBD, the switch from DMRG to iDMRG in TeNPy is simply accomplished by a change of the parameter bc_MPS from 'finite' to 'infinite', a minimal example is given below. and matrix or tensor products. While exploiting the block structure does not change the scaling of the considered algorithm with the total dimension of the tensors, the gained speedup is often significant and allows more precise simulations with larger bond dimensions at the same computational cost.
For tensor networks, the basic idea is that we can ensure a block structure of each tensor individually. One can argue based on representation theory of groups that the tensors can be decomposed in such a block structure [14,15]. However, here we present a bottom-up approach which is more closely to the implementation. Motivated by an example, we will state a simple "charge rule" which fixes the block structure of a tensor by selecting entries which have to vanish. We explain how to define and read off the required charge values. Then we argue that tensor network algorithms (like TEBD or DMRG) require only a few basic operations on tensors, and that these operations can be implemented to preserve the charge rule (and to exploit the block strucure for the speedup).

Definition of charges
For concreteness, let us now consider two spin-1 2 sites coupled by where we have represented H in the basis { | a } ≡ { | ↑↑ , | ↑↓ , | ↓↑ , | ↓↓ } and omitted zeros. Indeed, we clearly see a block-diagonal structure in this example, which stems from the conservation of the magnetization 5 S z = S z 1 +S z 2 . We can identify the blocks if we note that the considered basis states are eigenstates of S z and inspect their eigenvalues: |↑↑ corresponds to the eigenvalue , the two states |↑↓ , |↓↑ form a block to the eigenvalue 0, and |↓↓ corresponds to − . To avoid floating point errors we rescale the "charges" to take only integer values; clearly, whenever S z is conserved, so is q := 2S z / , but the latter takes the simple values 2, 0 and −2 for the four basis states |a considered above. We have thus associated one charge value to each index a, which we can summarize in a vector q [a] := (2, 0, 0, −2). Using this definition, we can formulate the conservation of S z as a condition on the matrix elements: How does this generalize to tensors with a larger number of indices? To stay with the example, we can also write H = s 1 s 2 t 1 t 2 H s 1 s 2 t 1 t 2 |s 1 |s 2 t 1 | t 2 | as a tensor with 4 indices s 1 , s 2 , t 1 , t 2 corresponding to the single-site basis { | s } ≡ { | ↑ , | ↓ }. The charge values q [s] = (1, −1) for this basis are obvious from the definition q = 2S z / (and the reason why we 5 We call this a U (1) symmetry since H commutes with U = exp(iφ j S z j ) = j e iφS z j which has a U (1) group structure. If one thinks of particles (e.g., Fermions using a Jordan-Wigner transformation) it corresponds to the particle number conservation. Note that this Hamiltonian also has an SU (2) ∼ = SO(3) symmetry of spin rotations. Yet, the latter is non-abelian, and exploiting it requires a change of the computational basis and is much more difficult to implement. Here, we focus exclusively on local, abelian symmetries and refer to Refs. [13,14,16,17] for the non-abelian case.
Example ζ [1] q [1] ζ [2] q [2] ζ [3] q [3] ζ [4] q [4] Table 1: Examples for charge definitions such that the tensors fullfill the charge rule (47). We consider spin- included the factor 2 in the rescaling). Since S z is additive, its conservation now implies that Note that the indices corresponding to a ket appear on the left hand side of the inequality, while the ones corresponding to a bra appear on the right. For an arbitrary tensor, we therefore define one sign ζ = ±1 for each leg, where we choose the convention ζ = +1 (ζ = −1) for a ket (bra); for the above example ζ [1] = ζ [2] = +1 for the first two indices s 1 , s 2 and ζ [3] = ζ [4] = −1 for the legs of t 1 , t 2 . In diagrams, we can illustrate this sign by an arrow pointing into (for ζ = +1) or out of (for ζ = −1) the tensor, see Fig. 12. Finally, we also introduce an offset Q, which we call the "total charge" of a tensor. The general charge rule for an arbitrary n-leg tensor M then reads ∀a 1 , a 2 · · · a n : ζ [1] q [1] a 1 + ζ [2] q [2] a 2 + ζ [3] q [3] a 3 + · · · + ζ [n] q [n] an = Q ⇒ M a 1 a 2 ···an = 0 (47) Note that the signs ζ [i] and the total charge Q introduce some ambiguity: the charge rule (47)  . We can therefore fix the charge vectors q of physical legs in the very beginning of the algorithm. Since also the signs ζ are fixed by conventions, for tensors with only physical legs one can solve the charge rule (47) for Q (by inspecting which entries of a tensor are non-zero). Examples of this kind are given in Tab. 1.
On the other hand, if the total charge Q and the charges q [i] of all but one leg j of a tensor are fixed, one can also solve the charge rule (47) for the missing q [j] : ∀a 1 , a 2 · · · a n : M a 1 a 2 ···an = 0 This allows to determine the charges on the virtual legs of an MPS. As an example, let us Here, l and r are trivial indices l ≡ r ≡ 1, and only introduced to turn the Γ [i] into matrices instead of vectors. For trivial legs, we can (usually) choose trivial charges q [triv] := (0) which do not contribute to the charge rule. Moreover, we choose the convention that ζ = +1 for left virtual legs, ζ = −1 for right virtual legs and Q = 0 (see Fig. 12). Then we can use the charge rule (48) of Γ [1] solved for q [c] and obtain: We use the same q [c] = (1, −1) for the left virtual leg of Γ [2] ; one can easily check that it also fulfills the charge rule (47) for Q = 0. Strictly speaking, an operator with a non-zero total charge Q does not preserve the charge of the state it acts on. However, it still preserves the block structure, because it changes the charge by exactly Q, e.g. S + increases it by 2. In contrast, S x (and similarly S y ) can both increases or decreases the charge, thus it can not be written as tensors satisfying eq. (47); only the combination S x 1 S x 2 + S y 1 S y 2 = 1 2 (S + 1 S − 2 + S − 1 S + 2 ) preserves the charge. When writing H as a charge conserving MPO, one can only use single-site operators with a well-defined Q.

Basic operations on tensors
Above, we motivated the form of the charge rule (47) and explained how to define the charges for various tensors. Thus, we can write both the initial state and the Hamiltonian in terms of tensors satisfying eq. (47) Now, we argue that tensor network algorithms require just a few basic operations on the tensors, namely (a) transposition, (b) conjugation, (c) combining two or more legs, (d) splitting previously combined legs (e) contraction of two legs, (f) matrix decompositions, and (g) operations on a single leg. These operations are depicted in Fig. 13. As we will show in the following, all of them can be implemented to preserve the charge rule (47) and thus the block structure of the tensors. Thus, any algorithm using (only) these basic operations preserves the charges. Transposition is by definition just a reordering of the legs. Clearly, (47) is then still valid if we reorder the charge vectors q and signs ζ in the same way. Examples for the conjugation are already given in Tab. 1; beside the complex conjugation of the entries this includes exchanging bra and ket, i.e., a sign flip of all ζ. The charge rule is then preserved if we also flip the sign of the total charge Q. For hermitian operators like H the combination of complex conjugation and appropriate transposition changes neither the entries nor the charges of a tensor.
Another operation often needed is to combine two (or more) legs, e.g., before we can do an SVD, we need to view the tensor as a matrix with just two indices. In other words, we group some legs into a "pipe". The pipe looks like an ordinary leg, i.e., we define a sign ζ and charge vector q for it. However, it has the internal structure that it consists of multiple smaller legs. Thus, we can later split it, e.g., after we did an SVD. For concreteness, let us again consider the above example H s 1 s 2 t 1 t 2 → H ab , i.e., we want to combine the indices s 1 , s 2 into a pipe a (and t 1 , t 2 into a pipe b). In this case, we map the indices as a(s 1 , s 2 ) := 2s 1 + s 2 and b(t 1 , t 2 ) := 2t 1 + t 2 . The charge rule is then preserved if we define the charge vectors q of the pipes as ζ [a] q  [4] = +1 are the signs of the indices s 1 , s 2 , t 1 , t 2 , and ζ [a] = 1, ζ [b] = −1 are the desired signs of the pipes. One can easily check that these definitions coincide with the previous ones, q [a] = (2, 0, 0, −2) = q [b] . Since the mapping of indices is one to one, one can also split a pipe into the smaller legs it consists of. However, note that this requires the q vectors and signs ζ of these legs; the pipe should thus store a copy of them internally.
One of the most important (and expensive) operations on tensors is the contraction of legs. Let us consider two tensors A a 1 a 2 and A contraction means to identify two indices and sum over it. Two indices can be identified if they represent the same basis, thus we require them to have the same charge vector q and opposite signs ζ. For example for the usual matrix product C a 1 b 2 := c A a 1 c B cb 2 we require q [a 2 ] = q [b 1 ] and ζ A [2] = −ζ B [1] . The charge rule (47) for C then follows from the charge rules of A and B, if we define Q C := Q A + Q B and just copy the signs ζ and charge vectors q for the free, remaining indices. Moreover, the cost of the contraction is reduced if we exploit the block structure of A and B, which becomes most evident if we have a block diagonal structure as in H ab , eq. (44). On the other hand, we can also contract two legs of the same tensor, i.e., take a (partial) trace. The contributions of these two indices to the charge rule (47) then simply drop out and the rule again stays the same for the remaining indices of the tensor.
We collect linear algebra methods that take a matrix as input and decompose it into a product of two or three matrices under the name matrix decomposition. Examples include the diagonalization of a matrix H = U † EU , QR-decomposition M = QR and SVD M = U SV † . Here, we focus exemplary on the SVD, other decompositions can be implemented analogously. Let us first recap the properties of the SVD: it decomposes an arbitrary m × n matrix into a product M lc = c U lc S c (V † ) cr , where S c are the k = min(m, n) positive singular values, and U and V are isometries, i.e., U † U = 1 = V † V . The charge rule (47) for the matrix elements M lc implies a block structure: assuming that the basis states of the index l are sorted by charge (which we discuss in the next paragraph), we can group indices with the same charge values together to form a block. Moreover, for each block of l with a charge value q [l] l , there is at most one block of the index r with compatible charges, i.e., we have some kind of pseudo block-diagonal form (even if the blocks are not strictly on the diagonal). Therefore, we can apply the SVD to each of the (non-zero) blocks separately and simply stack the results, which yields again a (pseudo) block-diagonal form for U and V † with the required properties. To define the charges of the new matrices we can ignore S, since it is only a trivial rescaling of one leg. Similar as for the contraction, we keep the charge vectors q and signs ζ for the indices l and r. Finally, the remaining operations needed for tensor networks are operations on a single leg of a tensor. One examples is a permutation of the indices of the leg, for example required to sort a leg by q as mentioned above. Clearly, this preserves the charge rule if we apply the same permutation to the corresponding charge vector q. Simliarly, if we discard some of the indices of the leg, i.e., if we truncate the leg, we just apply the same truncation to the charge vector q. Lastly, we might also want to slice a tensor by plugging in a certain index of a leg, e.g., taking a column vector of a matrix. This requires to update the total charge Q to preserve the charge rule, as one can show by viewing it as a contraction with a unit vector.
Above we explained how to define the charges for the U (1) symmetry of charge conservation. In general, one can have multiple different symmetries, e.g., for spinfull fermions we might have a conservation of both the particle numbers and the magnetization. The generalization is straight-forward: just define one q for each of the symmetries. Another simple generalization is due to another type of symmetry, namely Z n , where all the (in)-equalities of the charge rules are taken modulo n. An example for such a case is the parity conservation of a superconductor.
In TeNPy, the number and types of symmetries are specified in a ChargeInfo class [1]. We collect the q-vectors and sign ζ of a leg in a LegCharge class. The Array class represents a tensor satisfying the charge rule (47). Internally, it stores only the non-zero blocks of the tensor along with one LegCharge for each of its legs. If we combine multiple legs into a single larger "pipe" as explained above, the resulting leg will have a LegPipe, which is derived from the LegCharge and stores all the information necessary to later split the pipe.
All these classes can be found in the tenpy.linalg.np_conserved, which also contains functions for all the basic operations on tensors represented by an Array class, with an interface very similar to that of the NumPy (and SciPy) library [64]. Moreover, the module tenpy.networks.site contains classes which pre-define the charges and local operators for the most commonly used models. For example the class SpinHalfSite defines the operators S + , S − , and S z (called Sp, Sm, and Sz) as instances of Array. The following code snippet uses them to generate and diagonalize the two-site hamiltonian (44); it prints the charge vector q

Conclusion
In these lecture notes we combined a pedagogical review of MPS and TPS based algorithms for both finite and infinite systems with the introduction of a versatile tensor library for Python (TeNPy) [1]. While there exits by now a huge arsenal of tensor-product state based algorithms, we focused here on the time evolving block decimation (TEBD) [24] and the density-matrix renormalization group (DMRG) method [8]. For both algorithms, we provided a basic introduction and showed how to call them using the TeNPy package. The TeNPy package contains in the folder toycodes/ minimal working codes implementing finite and infinite TEBD and DMRG based on standard Python libraries. These "toy codes" are intended as a pedagogical introduction, to give a flavor of how the algorithms work.
We did not cover genuine 2D tensor-product state methods. However, we note that the tensor tools build into TeNPy allow for a simple implementation of tensor networks also in higher dimensions.
Contributions to the TeNPy library are very welcome.