Automated construction of $U(1)$-invariant matrix-product operators from graph representations

We present an algorithmic construction scheme for matrix-product-operator (MPO) representations of arbitrary $U(1)$-invariant operators whenever there is an expression of the local structure in terms of a finite-states machine (FSM). Given a set of local operators as building blocks, the method automatizes two major steps when constructing a $U(1)$-invariant MPO representation: (i) the bookkeeping of auxiliary bond-index shifts arising from the application of operators changing the local quantum numbers and (ii) the appearance of phase factors due to particular commutation rules. The automatization is achieved by post-processing the operator strings generated by the FSM. Consequently, MPO representations of various types of $U(1)$-invariant operators can be constructed generically in MPS algorithms reducing the necessity of expensive MPO arithmetics. This is demonstrated by generating arbitrary products of operators in terms of FSM, from which we obtain exact MPO representations for the variance of the Hamiltonian of a $S=1$ Heisenberg chain.


Introduction
The density-matrix renormalization group (DMRG) [1,2,3,4,5] has proven to be one of the most powerful tools to treat low-dimensional problems in strongly correlated quantum systems. Inspired by quantum information theory, a formulation in terms of matrix-product states (MPS) [6] and matrix-product operators (MPO) [5] boosted the development of related algorithms (e.g., [7,8,9,10]). Numerous different techniques have been developed in the framework of MPS, most of which require the representation of the Hamiltonian of the system and of further observables in terms of MPOs [11,12,13]. Even though finding such a representation is a well-known and generally solved problem [14,15,16], it can become arbitrarily complicated as systems may incorporate complex lattice geometries, site-dependent interaction strengths, or transformations of local operators due to their commutation relations.
It is possible to build generic construction schemes for MPO representations of operators by means of finite-state machines (FSM) [15]. However, the numerical realization of these FSMs can be quite involved, especially when exploiting global symmetries [14,17,18].
In this paper, we use the fact that FSMs have an underlying graph structure to obtain a generic algorithmic construction scheme for MPO representations of operators conserving U (1) symmetries. Consequently, most of the complexity can be unwrapped into tensor-network manipulations of strings of local operators, which can be mapped one-to-one to graph representations. The obtained construction scheme can be automatized so that an implementation is capable to efficiently create MPO representations. In particular, this also allows us to include commutation rules as well as conservation laws. We demonstrate how to map the operator arithmetics to operations on graph representations, leading to an algorithm for computing the sum as well as the product of two arbitrary operators. We use this algorithm to demonstrate how to compute the variance of a Hamiltonian var Ĥ = Ĥ − Ĥ 2 with very high accuracy. This is achieved by reducing the effects of catastrophic cancellation [19] and the total amount of required floating-point operations by using exact operator arithmetics in terms of FSM graphs. In addition, this also minimizes the overall computational costs.

U (1)-invariant tensor networks and application of two-site gates
Consider an operatorÔ acting on a Hilbert space H, which decomposes into a tensor product of L-identical d-dimensional local Hilbert spaces H d (d, L ∈ N), and assume that this operator is invariant under a global U (1) symmetry generated by local observablesn i , i ∈ {1, . . . , L} 1 , A state |ψ ∈ H can be expanded into an MPS with local basis states {|n i } spanned by the generators {n i } of the global U (1) symmetry |ψ = n 1 ···n L n 1 · · · n L |ψ |n 1 · · · n L ≡ n 1 ···n L T n 1 · · · T n L |n 1 · · · n L , with T n i ≡ T n i α i−1 α i being rank-3 tensors acting on site i, which carry a physical bond index n i and two virtual bond indices α i−1 , α i . The coefficients n 1 · · · n L |ψ can be reobtained by contracting the corresponding tensors T n 1 · · · T n L over all their virtual indices. In a graphical notation, a tensor T is represented by drawing a shape (circles, squares, . . . ) with as many legs attached to it as there are tensor indices (see fig. 1). Contractions over indices are denoted by connecting the corresponding edges. This yields a tensor network.
Due to conservation of the global quantum numberN , these rank-3 site tensors T n i are irreducible representations of the global U (1) symmetry. The physical indices n i label the basis states of the local generators. As a consequence, the site tensors can be decomposed into blocks that are subject to an on-site conservation law [17,18] with the block indices (n i , α i−1 , α i ) labeling the irreducible representations of the global quantum numberN on the particular site i. These tensor blocks in general are complex matrices Hence, thinking in terms of block-diagonal MPS, each site matrix M n i is decomposed into the block diagonal form M n i = a T n i a with the non-vanishing blocks T n i a = T n i α i−1 ,α i having block dimensions d α i−1 × d α i and a labeling the irreducible representations at site i.
There is a simple graphical representation for these local conservation laws acting on the tensor blocks, which is derived from the tensor-network framework [17]. Whenever there are indices with a (hidden) plus sign in the δ function, the corresponding bonds in the network are labeled by an ingoing arrow, whereas, for indices with a minus sign, an outgoing arrow is placed on the respective bond (see fig. 1).
Next, we take a more elaborate look at the action of a two-site gate on the symmetry conserving state, ignoring the common framework of irreducible representations and block diagonal structures for a moment (later we generalize the considerations to arbitrary operator expressions). Introducing operatorsÂ (i) ,B (j) : H acting only on the sites i, j, we start from a generic U (1)-invariant expression of the form A (i)B(j) |ψ ≡ n 1 ···n L n i ,n j T n 1 · · · A n i n i T n i · · · B n j n j T n j · · · T n L × δ(n i + n j − (n i + n j )) |n 1 · · · n L , where the δ function ensures that the total quantum numberN is conserved when applying the operator. A dummy index τ with −(d − 1) ≤ τ ≤ (d − 1) is introduced to factorize the δ function. Thus, suppressing the sums over the physical indices n i for a moment, we obtain which nearly restores the factorized shape of generic MPS, even though the physical sites i, j are still connected by the sum over the dummy index τ . It is desirable to restore a tensornetwork form so that the action of an operator pair can be written as contraction of tensors acting on the local Hilbert spaces. Therefore, we need to take a closer look at the matrix elements A n i n i γ i−1 γ i , B n j n j γ j−1 γ j of the local operatorsÂ (i) ,B (j) . If the total operator expression is U (1)-invariant, then the local operators themselves have to be one-dimensional representations in the physical indices (acting on the local Hilbert space) so that each operator carries a unique mapping between states |n i → |n i . In other words, for U (1)-invariant operator pairs each local operator is non-vanishing only for a certain value of the dummy index τ = ∆. For instance, in case of spin-ladder operators Ŝ ± (i) the total value of S z is locally changed by ±1. Hence, where we introduced the change of local quantum numbers ∆. Having this in mind, blockindex conservation laws for the local operators can be realized viâ The case of two-site gates is obtained by setting τ i−1 = τ j ≡ 0 as well as τ i = τ j−1 ≡ τ ; the non-vanishing operator blocks A n j n j τ j−1 τ j ∈ C dτ i−1,j−1 ×dτ i,j now contain the reduced operator matrix elements acting on the local basis states n i,j . The contraction of the block index τ over the intermediate site tensors T n k , i < k < j can be recast into a matrix contraction using the matrix identity Therefore, the application of the above two-site gate is factorized completely if we define intermediate shift tensors that act on the sites k with i < k < j as identity, but are contracted over auxiliary bond indices τ k , Hence, the action of the U (1)-invariant operator pair can conveniently be written as tensor network, with the definitions When further local operators to the left and right ofÂ (i) ,B (i) are absent, the auxiliary indices are shrunk to dummy indices τ i−1 = τ j ≡ 0. Thus, the sum over all remaining auxiliary block indices τ i≤l≤j restores the global conservation law. Note that we have implicitly fixed a gauge freedom carried by the auxiliary indices τ i to τ 0 = τ L ≡ 0 (we can even go further and permit any global change δ of the overall quantum numberN by setting τ L = δ). Even though these expressions look a bit tedious, they can be represented compactly in form of a tensor network, which also reveals how useful this decoupling turns out to be (see fig. 2a). For example, it is possible to identify the well-known transformation law for U (1)invariant MPO site tensors in the expressions above [17], yet the block indices τ k are related to the change of the quantum number, captured by the shift tensors R (k) ∆ . Generally, contractions over physical bond indices n k , n k of such shift tensors correspond to a mapping from one block a of an irreducible representation U (k) (N ) = a U (k) a (N ) on a site k to another block a+∆. Hence, given a decomposition into the different quantum-number sectors of an U (1)-invariant MPS or MPO tensor the contraction along a chain of l shift tensors over physical indices (with ∆ 1 . . . ∆ l the changes in the local quantum numbers) is given viâ ∆ mediating the shift in the auxiliary-bond quantum numbers. (b) Network of various shift tensors applied to the U (1)-invariant site tensorT (k) before and after performing the contractions, respectively. After contraction, the auxiliary block indices are given viã α k−1,k = α k−1,k + ∆ 1 + · · · + ∆ l .
From this point on it is clear how to generalize the considerations to arbitrary strings of local operators. In a given expression of local operators, we need to identify pairs of operators that conserve the global U (1) quantum number but locally change the on-site quantum numbers n i 1 → n i 1 , n i 2 → n i 2 . For each pair, shift tensors then have to be inserted acting on the intermediate sites i 1 < k < i 2 . Note that in a code there is no need to explicitly implement the shift tensors. It suffices to implement only their action on the virtual bonds of either the MPS or of another MPO, i.e., to collect all vertical strings of shift tensors R (k) ∆r 1≤r≤l in the network and to apply the shifting in the auxiliary block indices α k−1,k →α k−1,k = α k−1,k + ∆ 1 + · · · + ∆ l (see fig. 2b).

U (1)-invariant MPO representation from FSMs
As discussed in [15,16], the MPO representation of an operator on a tensor-product space H = H ⊗L d can be obtained from the transition amplitudes of FSMs. In the following, the underlying graph structure of FSMs is used extensively. Thus, we first give a brief review on how to identify FSMs with MPO representations, following [15].
Note that the initial and final states are highlighted with a green and red background, respectively while intermediate states A, B, A 1 . . . A r are left white.
The set of all lattice-ordered n-point r + 1-ranged operator strings Σ = ĥ (i) ν,r ν,r defines a language of a FSM, i.e., there is a set of states L and a transition function δ : L × K −→ L so that with a proper initial and final state I, F ∈ L, the FSM M (K, L, δ, I, F ) is obtained with a generated language Σ. The corresponding graph is then a representation ofĤ and is denoted by Λ(Ĥ). An example for the graph representation of the XX model is given in fig. 3a and in the following, we shortly explain how to obtain this graph. At first, we have to rewrite the global operator in terms of lattice-ordered n-point r + 1-ranged operator strings 3 i.e., we have two distinct 2-point 2-ranged operator strings. Then, we define a default set of states L 0 = {I, F } with I, F being the initial and final state of the FSM, respectively. For convenience, we will highlight them in the corresponding graphs with green (I) and red (F ) background. In general, FSMs are capable to generate sequences of symbolsô ∈ K by transitioning between states a ∈ L via permitted transitions, i.e., those with non-vanishing transition function δ(ô, a). Thus, we define the initial/final state by demanding that all the previous/subsequent transitions have to be identities. Next, we build the graph fig. 3a by constructing each operator string separately, so we may start withĥ +− (i) . We therefore add r = 1 intermediate states (here only one state to the set of states: L = L 0 ∪ {A} and define permitted transitions δ(I,Ŝ + ) = A and δ(A,Ŝ − ) = F . Thus, the FSM can transition from the initial state (after placing an arbitrary number of identities) into state A by appending an operatorŜ + onto the so far constructed operator string. Being in state A, the FSM has no other choice than to transition into state F by appending a subsequent operatorŜ − . A generalization of the construction of such local operator strings is shown in fig. 3b for the case of a general 2-point r-ranged operator string.
From this point on, we will call graphs that generate only one distinct type of n-point r + 1-ranged operator string single-branched graphs and identify them with the corresponding contribution in the global operatorĤ ν,r .
Returning to the XX model, we can finish the construction of the FSM by adding another single-branched graph transitioning from the initial state into an additional state B via a local operatorŜ − and a transition into the final state via a local operatorŜ + .
Finally, we can construct the MPO representation, i.e., the operator-valued matriceŝ W n i n i , by assigning matrices W , those with at least one local operator blockô Figure 4a sketches a general FSM and fig. 4b shows the generated MPO bulk tensor, in which the rows and columns are labeled by the states of the FSM. The corresponding boundary tensors are obtained by projecting out (a) the transition from the initial state into the bulk for i = 1 and (b) the transitions from the bulk into the final state for i = L. We emphasize that the site-dependent coefficients c (i) ab ∈ C are free parameters and therefore can be chosen independently for every site. Figure 5: Realization of the operator sumĤ 1 +Ĥ 2 in terms of graph representations Λ( . Graph representations of operatorsĤ 1,2 are illustrated by transitions from the initial state into the graph's bulk (ô 11 . . .ô 1n andô 21 . . .ô 2j ) and from the graph's bulk to the final state ((ô r1 . . .ô rm andô s1 . . .ô sk )). Blue boxes denote the bulk of the graph representations Λ(Ĥ 1,2 ) and Λ(Ĥ 1 +Ĥ 2 ).

Maximally branched representation and local transformations on graphs
As already mentioned, every global operator on H can be formulated as a sum over latticeordered n-point r + 1-ranged operator stringŝ Note that the graph representation via FSM is not unique; for every operatorĤ, there is a set of corresponding FSMs Λ(Ĥ) Λ . Therefore, we are free to choose one representation Λ(Ĥ), which makes it easier to perform operator arithmetics and then switch to another representationΛ(Ĥ) to find the most compact MPO. Referring to eq. (21), a natural translation to the graph representations ofĤ can be obtained by introducing a commutative map ⊕ between graph representations of sums of The realization of ⊕ in terms of graphs is obtained by taking the graph representations of the operatorsĤ 1 ,Ĥ 2 and by merging the initial and final states as depicted in fig. 5. Next, we can define the notion of a maximally branched graph representation, which is given by the graph Λ max (Ĥ) satisfying the conditions: a) the initial state I is the only state with more than one child state, b) the final state F is the only state with more than one parent state. Λ max (Ĥ) satisfies the equation This representation has several advantages; most importantly for our discussion, any local transformation of the operator can be mapped one-to-one to the transitions of the graph representation. Here local transformations are those transformations that map a latticeordered n-point r + 1-ranged operator string into another lattice-ordered m-point r + 1-ranged Figure 6: (a) Transformation of local operatorsŜ ± ,Îd to conserve U (1) quantum numbers in the graph representation Λ(Ĥ 1 ) for a single-branch string operatorĤ 1 = iŜ Transformation of local operatorsâ,â † ,Îd to conserve U (1) quantum numbers in the graph representation Λ(Ĥ 2 ) for a single-branch string operator, which describes fermionic next-tonearest-neighbor hoppingĤ 2 operator stringĤ ν,r →Ĥν ,r without changing r. To clarify what is meant by this mapping we emphasize that each branchĤ ν,r generates exactly one type of local-operator stringô ν 1 . . .ô νr and therefore the tensor representation of these strings factorizes on the local Hilbert spaces. Hence, if we can give a factorization of the local transformationÛ in terms of tensors acting on the local Hilbert spaces (e.g.,û ν 1 . . .û νr ), then we can represent the transformation directly by contracting the transformation tensors with the local operators over their physical indiceŝ Note that the transformations in eq. (13) forcing conservation of U (1) quantum numbers for two-site gates are of exactly this kind and so is their generalization to arbitrary strings of local operators. Thus, conservation of U (1) quantum numbers can be implemented by transforming the graph representation of an operator into its maximally branched representation. The local operator strings in each branch are transformed by applying shift tensorsR ∆ . For example, let us consider the transformation for a next-to-nearest-neighbor spin-flip term withŜ ± being the usual angular-momentum ladder operators with lower site indices. The transformed graph representation is given in fig. 6a. Conveniently, these rules can be extended to anticommuting operators by applying a Jordan-Wigner transformation 4 . For U (1)-invariant operators, the Jordan-Wigner transformation is also local, as there has to be a corresponding annihilation operator for every fermionic creation operator appearing in a string operator -and vice versa -because of quantum-number conservation. The Jordan-Wigner transformation can be implemented as a product of parity operators viâ withâ ( †) (i) annihilation (creation) operators for hard-core bosons at site i. Then we find that, for any U (1)-conserving product of fermionic creation and annihilation operators, the transformations act only within the operator strings. For instance, for a next-to-nearest-neighborhopping term we find Again, these transformations have a simple graph representation, see fig. 6b. To sum up, we have derived a construction scheme for MPO representations of generic U (1)-invariant operators that takes a FSM as input. Specifying the phase factor e iφ for the commutation relations of the local operators (e.g., φ = 2π for bosons, π for fermions), the scheme automatizes the construction of the MPO site tensors so that we can identify the graph representation of the FSM with the MPO. This permits us to take advantage of the graph representation to improve operator arithmetics, which will be discussed in the following section.

Graph arithmetics and MPO representation of the variance of operators
Having derived a construction scheme that permits us to automatize the generation of MPO representations for U (1)-invariant operators, we now make use of the established connection between graphs and operators by replacing operator arithmetics with graph manipulations. We then demonstrate the power of this approach by employing the graph algebra to derive various expressions for the variance of a HamiltonianĤ. As we will show, this not only allows for more efficient calculations, but also addresses the problem of catastrophic cancellation that comes up in a naïve evaluation of var(Ĥ) = Ĥ 2 − Ĥ 2 due to the need to subtract large numbers, which scale roughly as O(L 2 ) in the system size L [19]. Let us consider the product of two global operatorsĤ 1 ,Ĥ 2 in terms of their maximally branched representationsĤ and in particular a single summand that is the product of two lattice-ordered string operators 5 Note that we have introduced superscripts ν 1,2 , r 1,2 to distinguish the index sets of the global operator 6 . A representation of this product in terms of a FSM and therefore its graph repre- 5 We use the notation introduced in eq. (19). 6 Expanding the indices one would haveĤi = ν i (a) sentation requires a reformulation in terms of lattice-ordered string operators. Although the product of two lattice-ordered operatorsĤ r 1 ,ν 1 ·Ĥ ν 2 ,r 2 is no longer lattice-ordered, a careful inspection of the terms violating the lattice order reveals how to build a graph representation generating the productĤ r 1 ,ν 1 ·Ĥ r 2 ,ν 2 . It turns out to be useful to define a non-commutative ∧ product, which maps two single-branched graphs to a single-branched graph via A graph realization of ∧ is obtained by identifying the final state of Λ(Ĥ ν 1 ,r 1 ) with the initial state of Λ(Ĥ ν 2 ,r 2 ). For instance, see fig. 7a for an exemplary evaluation of Λ( iô ). As carried out in appendix A, an algorithm can be constructed that yields the following graph representation with the symmetrized wedge product and sgn(ĥ ν 1 ,r 1 ,ĥ ν 2 ,r 2 ) the sign of the commutation relation between local operatorsĥ ν 1 ,r 1 and h ν 2 ,r 2 acting on different sites. Employing linearity of the graph representation for addition, the product of two general operators can then be formulated via Despite the compact form, we emphasize that eq. (31) and eq. (33) describe a graph representation that is maximally expanded, so that there is no branching below the initial node, and the bond dimension of the generated MPO is very large. However, the size of the graph can be reduced very efficiently by shrinking it into its most compact form. The idea behind the shrinking is best demonstrated with a concrete example. Consider an expanded graph generated from merging two branches The number of nodes can be reduced by fusing edges sharing one node and carrying the same transitions. For example, in fig. 7b these are the edges connecting the states I → A 1 and I → A 2 . In the same way, entire branches that are completely equal with respect to their transitions can be fused together by employing the linearity of ⊕ But we can go even further by defining which local operator pairs acting on the same lattice site vanish identically. For example, this is the case forŜ +Ŝ+ ≡ 0 for S = 1/2 models. Generated branches that contain this type of transitions can be discarded completely. For more complex graphs, these shrinking procedures can be applied iteratively, so that only the reduced graph is stored and used in the actual calculations. An illustrative example demonstrating the advantages of the approach above is to construct two different expressions for the variance of the HamiltonianĤ of a system, where the goal is to avoid catastrophic cancellation as best as possible while keeping calculations as cheap as possible. For instance, the variance can be used as a control parameter in numerical simulations to test whether a state |ψ is close to an eigenstate ofĤ. A naïve evaluation is obtained by directly calculating the expectation values in which eventually vanishes, if |ψ is an exact eigenstate. However, as the expectation value of H 2 scales as L 2 , explicit evaluation of eq. (36) has major drawbacks when it comes to numerical calculations with finite-precision arithmetics such as catastrophic cancellation. Every (exact) number z is numerically represented up to a certain precision [20], which is usually measured in orders p of magnitudes so that we can mimic limited numerical precision by replacing z → z(1 + × 10 −p ) with a random variable ∈ (−1, 1). Let z 1,2 be numbers represented with the same numerical precision p and 0 < z 1 −z 2 = 10 −δ their exact difference. In finite-precision arithmetics, we then obtain For δ > 0, the second term cannot be represented due to finite precision. In our case, the values for z 1,2 are obtained from expectation values of operators acting on the whole system. Hence, if we estimate them by their leading-order contribution z 1,2 ∼ L q with magnitude q, with γ ≡ log 10 (L), and ≡ 2 − 1 , we obtain where is a random variable of order ±1. It follows that we need δ < p − γ · q in order to have a reasonable numerical outcome. In case of the variance, i.e., q = 2, with double-precision arithmetics, p = p num = 16, the naïve evaluation, δ < 16 − 2γ, yields an upper bound of a maximally possible precision of 10 −12 for a lattice with L = 100 sites. However, the numerical precision in general is not only bound by the exact numerical precision p num , but it is also subject to round-off errors of preceding calculations. Thus, in actual calculations, we have an effective precision that depends on simulation parameters p(L, χ, ...) ≤ p num . Returning to eq. (36), we realize that the calculated variance is not only bound, but may even become negative (as in eq. (38) can be negative).
The problem can be addressed by a minimization of γ · q, for instance by constructing an MPO representation for the operatorV = (Ĥ − ψ|Ĥ |ψ ) 2 , (e.g., by distributing the expectation value ψ|Ĥ |ψ over the lattice sites). Unfortunately, this comes at the cost of having constructed an operator that is dependent on its expectation value. Hence, its MPO representation has to be rebuilt for every new state |ψ , which requires an efficient way of obtaining MPO representations of powers of operators while keeping the numerical effort low.
To analyze both problems, we investigate two MPO representations Λ 1 (var(Ĥ)) (see fig. 8a for the corresponding graph) and Λ 2 (var(Ĥ)), which are obtained from the graph representations Let us briefly discuss the properties of the two graph representations and their generated MPO. We start with the numerical implementation and its costs. Both representations are obtained by loading pre-computed graph products at runtime. Then, we explicitly calculate Ĥ with numerical costs scaling as O(LdD 2 D 2 W ) with D and D W being the maximal matrix dimensions of the MPS and MPO site tensors, respectively. Construction of the MPO representation for Λ 1,2 (var(Ĥ)) then only requires allocation of the site tensors with costs scaling as O(L[dD W 1,2 ] 2 ). Here, D W 1,2 denotes the maximal bond dimensions of the MPO site-tensor matrices generated from the graphs Λ 1,2 (var(Ĥ)). Subsequently, the MPO tensors of the graphs are compressed by using an SVD with numerical costs scaling as O(L · [d 2 · D W 1,2 ] 3 ).
To sum up, the numerical costs for obtaining the MPO representation of the graphs Λ 1,2 (var(Ĥ)) are to leading order governed by the expenses when calculating the expectation value Ĥ , as long as the MPS bond dimensions are the cost-determining factor. This demonstrates a major advantage of mapping the MPO arithmetics onto the graph representations: As the expectation value Ĥ , in general, is already available, the additional numerical costs for constructing the MPO representation are not significant. Figure 8: (a) Graph representation Λ 1 (var(Ĥ)), compare eq. (39). The representation ofĤ 2 and of Ĥ 2 are considered independently: For the former, we depict transitions from the initial state to the subgraph Λ(Ĥ 2 ) and then to the final state indicated by the operatorŝ o i , (Λ(Ĥ 2 ) represents the graph ofĤ 2 ). For the latter, we depict transitions from the initial state to the state J 1 of the FSM, and then to the final state via identity operators. Those identities carry the weights − 0 and 1 , respectively, which after performing the multiplication yields − Ĥ 2 , see eq. (39). (b) MPO site tensor obtained from the graph representation Λ 1 (var(Ĥ)). Matrix entries with blue background denote the entries obtained from the sitetensor representation of Λ(Ĥ 2 ).
Next, we take a look at the numerical stability. The graph Λ 2 (var(Ĥ)) generates the MPO representation of the operator Ĥ − Ĥ multiplied by itself. Thus, the graph ofĤ is expanded so that the expectation value Ĥ is equally distributed over all lattice sites with an on-site value of Ĥ /L. As operator arithmetics are represented exactly by means of the constructed graph Λ 2 (var(Ĥ)), the only relevant source of catastrophic cancellation is the evaluation of Ĥ − Ĥ along the lattice that compares terms of order O(L), hence q = 1. Thus, we expect the variance to be bound from below by 10 16−γ . Yet, the graph Λ 1 (var(Ĥ)) in general also suffers from catastrophic cancellation with q = 2, as in the naïve evaluation of the variance. This is best seen by evaluating the structure of the generated matrix representation for non-vanishing tensor blocks W n i n i τ i−1 τ i (see fig. 8b). As the latter contains the complete matrix representation ofĤ 2 , we end up at the final site by comparing numbers of order O(L 2 ) when performing the tensor contractions to evaluate the variance. Therefore, q = 2 yields the variance to be bound from below by 10 16−2γ .
In addition, graph arithmetics are exact, whereas MPO arithmetics need a vast amount of numerical operations completed before expectation values can be calculated. Therefore, using MPO arithmetics is much more prone to collecting round-off errors, which, in drastic cases, reduces the numerical precision p by orders of magnitudes [5]. In this section, we test the behavior of the variance obtained from site-tensor representations of Λ 1,2 (var(Ĥ Heisenberg )) for S = 1 antiferromagnetic Heisenberg chains [21,22] withŜ i the vector of S = 1 spin operators on lattice site i. For this purpose, we exploit the total magnetization of a system with L lattice sites as the conserved U (1) quantity. For the ground state search, we sweeps through the system (using open boundary conditions) and optimize the site tensors via a standard Lanczos algorithm, see [23,24,25]. All MPS contractions are formulated in two-site representation [5]. The largest observed bond dimensions D W 1,2 of the site-tensor matrices W n i n i τ i−1 τ i for the MPO representations of Λ 1,2 (var(Ĥ Heisenberg )) are D W 1 ,max = 12 and D W 2 ,max = 10, respectively. Note that these bond dimensions are generally smaller than MPS bond dimensions used during the simulation. We conclude from D W 2 ,max < D W 1 ,max that the distribution of the constant energy term Ĥ Heisenberg over the lattice sites in Λ 2 (var(Ĥ Heisenberg )) ensures a more efficient compression of the resulting MPO representation. Consistently, for common MPS bond dimensions, we find that Λ 2 (var(Ĥ Heisenberg )) is evaluated faster than Λ 1 (var(Ĥ Heisenberg )). In addition, fig. 9 displays the build time of the MPO representation from the graphs Λ 1,2 (var(Ĥ Heisenberg )) for various system sizes. We find a perfect linear scaling ∼ O(L) for the dependency of the build time, which moreover has no impact on the overall computation time.
We have performed simulations, in which we varied either the total bond dimension per 10 −15 10 −13 10 −11 10 −9 10 −7 10 −13 10 −10 MPS site tensor χ max or the discarded weight w while keeping the respective other fixed. 7 Figures 10a and 10b show results for various values of χ max and w. An important criterion for consistency of the calculations is the independence of the threshold at which catastrophic cancellation sets in. We find that varying both parameters χ max and w yields a constant value of which corresponds to the value at which the graph representation Λ 1 (var(Ĥ Heisenberg )) saturates. Employing eq. (38), these results suggest for the actual numerical precision an ansatz of the form p = p num − p r (L) with a correction p r (L), which to first order depends only on the lattice size L. Repeating the calculations for S = 1 Heisenberg chains with various system sizes, we can thus extract p r from eq. (38) by estimating δ = log 10 (var(Ĥ Heisenberg )) from the saturated value for the variances. For the two graph representations we consider the estimator for the actual numerical precision For both graphs, we perform a linear fit obtaining which is shown in Fig. 11. We emphasize that the observed behavior is perfectly consistent with the ansatz above for constant double-precision arithmetics p num = 16 and residual numeric precision p r (L) ≈ γ(L) = log 10 (L). We hence find that, aside from catastrophic cancellation, the dominating contribution to the loss of numerical precision is proportional to the lattice size, which can be associated with inevitable rounding errors generating an error O(10 −pnum ) per lattice site.
To complete this section, we now turn to the method discussed by Hubig et al. in [14], which suggests a complementary access for introducing operator arithmetics for MPOs. In short, Hubig et al. present a scheme to build MPO representations numerically via in-code evaluation of direct sums and Kronecker products of local operator strings acting on individual lattice sites. To account for the growth in the MPO bond dimension, they discuss various numerical compression schemes and benchmark the obtained MPOs for a 100-site S = 1 Heisenberg chain. This scheme is somewhat more straightforward, as it only involves elementary matrix operations.
However, the FSM approach has additional advantages when it comes to numerical calculations. In particular, when comparing our results for the variance with those obtained by Hubig et al. for the same numerical simulation parameters 8 , we find that the maximally obtained precision is two orders of magnitudes larger when using the MPOs obtained from graph arithmetics. The reason lies in the above observation that the graph representations only require a minimal number of floating-point operations. In fact, the MPO representation for var(Ĥ) constructed from numerical MPO arithmetics as suggested by Hubig et al. already requires at least L Kronecker products and sums to be evaluated for the advantageous MPO representation of the variance. In contrast, the FSM is generated from abstract graph operations and is therefore an exact representation.
Thus, before the actual variance calculation (namely the contraction of the tensor network representation of ψ| var(Ĥ) |ψ ) can begin, the MPO representation obtained from numerical MPO arithmetics already picked up numerical round-off errors of magnitude O ∼ L = 10 γ . Therefore, in the example of the 100-site S = 1 Heisenberg chain, the decrease in precision can be estimated to be p loss ∼ γ = 2 magnitudes, which is exactly what we found in our calculations.
Another nice side effect is that performing graph compression as described in section 4 already covers the deparallelisation compression method which was first introduced in [26].
Finally, note that implementing local transformations on an abstract level as discussed in section 3 maps the whole complexity of index shifting, as required by quantum-number conservation or in an implementation of fermionic anticommutation rules, from the code to an input level. In other words, there is no need for the programmer to hard-code these features when MPOs are generated from FSMs. Instead, they can be incorporated by designing the corresponding graphs, which in turn enormously increases the flexibility of the code.

Conclusions
We have formulated an optimized algorithmic construction scheme for efficient MPO representations of U (1)-invariant operators generated by FSMs. This scheme allows implementations to automatize the application of local transformations of operators in MPO representations, e.g., while exploiting U (1) symmetries by propagating local changes in quantum numbers or by tackling the fermionic sign via a Jordan-Wigner transformation. As a consequence, graph representations for FSMs can be interpreted directly as representations of operators. Based on this, operator arithmetics are then mapped to transformations of the underlying graphs to generate exact MPO representations of operator sums and products. This permits us to exactly calculate operator arithmetics, which can be stored and quickly loaded in the course of simulations. We demonstrated the effectiveness of our approach by considering two graph representations Λ 1,2 (var(Ĥ)) of the variance of a system's HamiltonianĤ. Investigating their numerical properties in a ground-state calculation for a S = 1 Heisenberg chain with L = 100 lattice sites, both representations behave numerically consistent and stable with a resolution for the variances var 1,2 (Ĥ Heisenberg ) of the graph representations of var 1 (Ĥ Heisenberg ) ∼ = O(10 −10 ) and var 2 (Ĥ Heisenberg ) ∼ = O(10 −12 ), respectively. Investigating the dependence of the numerical breakdown on the lattice size shows that the graph representations achieve a numerical precision of p ∼ = O(p num − log 10 (L)) in the calculations. We conclude that this high numerical precision is due to the exact graph representation of the operator arithmetics and comes without any significant additional computational costs during runtime.
Finally, we note that wrapping operator arithmetics into re-usable graph representations helps to obtain efficient and exact MPO representations, which are useful for various applications. For instance, consider variational problems of the form min |ψ ψ| (Ĥ − E) 2 |ψ − λ ψ |ψ to find highly excited eigenstates and eigenvalues. Solving this minimization problem benefits significantly from the increased numerical precision and gains in efficiency of the calculation of the expectation value of (Ĥ − E) 2 .
Further applications of the introduced approach cover the efficient representation of longranged swap gates, and the concepts can be generalized to higher-dimensional tensor networks (e.g. PEPS) [27]. In this case, the transition amplitudes get additional degrees of freedom ('color'), corresponding to either transversal or longitudinal auxiliary indices. Similar benefits are to be expected as in 1D MPS, allowing for more precise and more flexible implementations of tensor network methods in higher dimensions.
Nevertheless, the generalization to arbitrary n-point string operators is straightforward: Simply replace identities with additional local operators. Decomposing the operator product, we findĤ Next, the commutation relation of local operators acting on different sites i 1 = i 2 fulfills with ν 1 ν 2 the sign according to the commutation relation. Then, we can decompose the product into lattice-ordered sums by commuting local operators acting on strictly unequal sites.
The first lattice-ordered contribution is given viaĤ A . The corresponding diagram is a single-branch graph obtained by identifying the final state of the graph Λ(Ĥ ν 1 ,r 1 ) with the initial state of the graph Λ(Ĥ ν 2 ,r 2 ) by introducing an intermediate state E (see fig. 12a). We now make use of the wedge product ∧ for single-branched graphs as defined in eq. (30) to rewriteĤ A in short as Λ(Ĥ A ) = Λ(Ĥ r 1 ,ν 1 ) ∧ Λ(Ĥ r 2 ,ν 2 ). (52) Swapping the operators, we obtain another lattice-ordered sum by commuting all local operator contributions, so that the corresponding graph picks up two factors, ν 1