Tensor network representations from the geometry of entangled states

Tensor network states provide successful descriptions of strongly correlated quantum systems with applications ranging from condensed matter physics to cosmology. Any family of tensor network states possesses an underlying entanglement structure given by a graph of maximally entangled states along the edges that identify the indices of the tensors to be contracted. Recently, more general tensor networks have been considered, where the maximally entangled states on edges are replaced by multipartite entangled states on plaquettes. Both the structure of the underlying graph and the dimensionality of the entangled states influence the computational cost of contracting these networks. Using the geometrical properties of entangled states, we provide a method to construct tensor network representations with smaller effective bond dimension. We illustrate our method with the resonating valence bond state on the kagome lattice.

Introduction
Over the last two decades, tensor networks have proven to be a very successful approach to quantum and classical many-body systems. Starting from the density matrix renormalization group (DMRG) [37], subsequently reformulated and generalized to matrix product states (MPS) [13,24], projected entangled pair states (PEPS) [26,31] and other tensor network ansatz classes [29,34,35], tensor network methods constitute the main numerical tool for the investigation of quantum many-body systems. They provide an efficient parametrization of quantum states that satisfy an area law with respect to the entanglement entropy [22]. In step with the development of numerical methods, the tensor network formalism has also been successfully applied as an analytical tool, for example to describe low-energy eigenstates of gapped [25,26] and disordered systems [5,14,15], to classify quantum phases [6,27], and to study critical systems [34] and the AdS/CFT correspondence [23].
Given a quantum many-body state $T$ on $L$ finite-dimensional quantum systems, we can expand it with respect to a product basis:
$$T = \sum_{i_1,\dots,i_L} T_{i_1,\dots,i_L}\,|i_1,\dots,i_L\rangle .$$
A matrix product state representation of $T$ can be seen as a decomposition of the coefficient tensor $T_{i_1,\dots,i_L}$ of the form $\operatorname{tr}\big(M^{[1]}_{i_1}\cdots M^{[L]}_{i_L}\big)$, where each $M^{[j]}_i$ is a matrix of dimension $D\times D$ with $D$ denoting the so-called bond dimension. For a fixed site $j$, we can regard the list of matrices $(M^{[j]}_i)_i$ as a linear map $A_j$ from the virtual space $\mathbb{C}^D\otimes\mathbb{C}^D$ to the physical space at site $j$. The state $T$ is then obtained by applying these maps to a virtual quantum state, consisting of a network of maximally entangled states $\Omega_D=\sum_{i=1}^{D}|ii\rangle$ of dimension $D$ shared by neighbouring lattice sites:
$$T = (A_1\otimes\cdots\otimes A_L)\,(\Omega_D\otimes\cdots\otimes\Omega_D).$$
We call this a matrix product state (MPS) representation of the state $T$ with bond dimension $D$. Note that in this expression, the two tensor products are shifted with respect to each other by half a physical lattice site (see Figure 1). This procedure can then be generalized to higher-dimensional lattices, where maximally entangled states are shared among vertices, leading to the notion of projected entangled pair states (PEPS). Even more generally, we can consider the case of arbitrary graphs, in which the states obtained in this fashion are known as tensor network states. From this point of view, a tensor network state is constructed by applying local linear operators to an underlying state $\Phi_D=\bigotimes_{i\sim j}\Omega_D^{(i,j)}$, where the lattice sites connected by a maximally entangled state $\Omega_D$ are determined from the edges of an underlying fixed graph. We will call these states $\Phi_D$ entanglement structures.
Figure 1: Matrix product states can be seen as a network of maximally entangled states $\Omega_D$ shared between physical sites of the 1D lattice, to which we have applied local operations $A_j$ on each site.
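The trace formula for the MPS coefficients can be made concrete with a small sketch. The example below is illustrative only: the matrices `M0` and `M1` (a bond dimension 2 representation of a GHZ-type state) and the helper names are assumptions chosen for the demonstration, not taken from the text.

```python
from itertools import product

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def mps_coefficient(matrices, indices):
    """Return tr(M[1]_{i1} ... M[L]_{iL}) for the given physical indices."""
    dim = len(matrices[0][0])
    acc = [[int(r == c) for c in range(dim)] for r in range(dim)]  # identity
    for site, i in enumerate(indices):
        acc = matmul(acc, matrices[site][i])
    return sum(acc[r][r] for r in range(dim))

# Hypothetical example: site-independent matrices with D = 2 and d = 2.
M0 = [[1, 0], [0, 0]]
M1 = [[0, 0], [0, 1]]
L = 4
ghz_mps = [[M0, M1] for _ in range(L)]

# Expand the full coefficient tensor T_{i1,...,iL} and list its non-zero entries.
coeffs = {idx: mps_coefficient(ghz_mps, idx)
          for idx in product(range(2), repeat=L)}
nonzero = sorted(idx for idx, c in coeffs.items() if c != 0)
print(nonzero)
```

Only the all-zeros and all-ones index strings survive, so these matrices describe the state |0000⟩ + |1111⟩ with periodic boundary conditions.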
To extend this procedure, we can also consider tensor network states which arise from more general entanglement structures based on multipartite entangled states shared between several sites. This approach has for example been employed in the construction of model systems that exhibit symmetry-protected topological order such as the CZX model, where four-party Greenberger-Horne-Zeilinger (GHZ) states are shared around each plaquette in a two-dimensional square lattice [7]. Other recent examples include projected entangled simplex states [38] and quasi-injective PEPS [21] (see Figure 2). A general discussion on entanglement structures is presented in Appendix B. The paper is structured as follows. In the section following this introduction, we will discuss how transformations of the states of the plaquettes give rise to transformations of entanglement structures and tensor network states. We then come to describe our main result: the construction of novel exact tensor network representations by use of geometric tools from the study of multiparticle entanglement.
Subsequently, we will give details of our construction and conclude by a discussion of the resulting savings in computational cost.
Transformations of entanglement structures
Summarizing our discussion so far, we say that a state $T$ on $L$ sites admits a tensor network representation in terms of an entanglement structure $\Phi$ if we can find local maps $A_j$ which, when applied to the combined virtual space at each lattice site, transform the entanglement structure $\Phi$ to the target state $T$. More generally, we will say that a state $T\in(\mathbb{C}^d)^{\otimes L}$ restricts to $T'\in(\mathbb{C}^{d'})^{\otimes L}$, and write $T\geq T'$, if there are local linear maps $A_j$ such that $\big(\bigotimes_{j=1}^{L}A_j\big)T=T'$ [30]. Note that this is equivalent to requiring that $T$ can be converted to $T'$ via stochastic local operations and classical communication [12]. $T$ can then be represented by the entanglement structure $\Phi$ iff $\Phi\geq T$. If $T$ admits a representation by $\Phi$, and there exists another entanglement structure $\Psi$ such that $\Psi\geq\Phi$, then by composing the local maps we see that $\Psi\geq T$, i.e. $T$ also has a representation in terms of $\Psi$. Moreover, if the linear maps giving the restriction $\Phi\geq T$ are invertible, then it also holds that $T\geq\Phi$. In the tensor network literature, a state $T$ with this property (possibly after grouping multiple sites together) is called injective. In this case, if we find a different entanglement structure $\Phi'$ that directly restricts to $T$, then it also restricts to $\Phi$, meaning that $\Phi$ is essentially optimal as far as restrictions are concerned.
In order to find restrictions between different entanglement structures, it is sufficient to find restrictions of the entangled states they are made up of. For simplicity of exposition, let us consider the case where the entanglement structure $\Phi$ can be obtained by tensoring many copies of a single plaquette state $\phi$ according to a lattice (the more general case is discussed in Appendix B). Examples of such plaquette states are cyclically shared maximally entangled pairs in the case of PEPS, and GHZ states in the case of quasi-injective PEPS, for which we borrow a graphical notation from [10] (see Figure 2). Let $\kappa L$ be the number of copies of $\phi$ which are required to obtain $\Phi$. If $\phi$ is an $m$-partite state, then $\phi^{\otimes\kappa L}$ will be an $m\kappa L$-partite state, and therefore we obtain the $L$-partite state $\Phi$ by grouping $m\kappa$ vertices together. In the case of a regular lattice with coordination number $r$, we see that $\kappa$ is equal to $r/m$ (and thus $L$ has to be chosen accordingly in order to make $\kappa L$ an integer). We now consider a different entanglement structure $\Psi$, composed of the same number of plaquette tensors $\psi$. If $\psi$ restricts to $\phi$, then this immediately implies that one entanglement structure restricts to the other, i.e. $\Psi\geq\Phi$, and we have already observed that this also implies that $T$ can be represented by $\Psi$.
In order to show how the theory of restrictions can lead to improved tensor network representations, we consider the resonating valence bond (RVB) state [2]. A tensor network representation of this state was introduced in [28]: using our language, the authors show that the RVB state on the kagome lattice can be represented by an entanglement structure constructed from a 3-party entangled state $\lambda\in(\mathbb{C}^3)^{\otimes 3}$, shared among the triangular plaquettes in a kagome lattice¹, where $\lambda$ is given by
$$\lambda=\sum_{i,j,k=0}^{2}\varepsilon_{i,j,k}\,|ijk\rangle+|222\rangle$$
and $\varepsilon_{i,j,k}$ denotes the antisymmetric tensor with $\varepsilon_{0,1,2}=1$.
Therefore, the RVB state naturally fits in our framework of entanglement structures: from the plaquette tensor $\lambda$ we can build a large lattice entanglement structure $\Lambda$ by tensoring $\tfrac{2}{3}L$ copies of $\lambda$ and grouping pairs of vertices to form a kagome lattice (see Figure 3).
In [28], the state $\lambda$ was obtained as a restriction from the triangle of maximally entangled pairs with bond dimensions (3, 3, 3), obtaining a PEPS representation of the RVB state with bond dimension 3. It turns out that this representation is sub-optimal: the tensor $\lambda$ can also be obtained as a restriction from the triangle with bond dimensions (3, 2, 2), using an explicit MPS representation with these bond dimensions. This leads to a PEPS representation of the RVB state where the bond dimension is reduced from 3 to 2 on two of the edges of each triangle of the kagome lattice. In Appendix A.4, we prove that this representation is optimal, i.e. that $\lambda$ cannot be obtained as a restriction from the triangle with bond dimensions (2, 2, 2). The choice of how to distribute the reduced bonds inside a plaquette is arbitrary and can also be changed from one plaquette to another. In conclusion, this example illustrates that the systematic study of restrictions can lead to more efficient tensor network representations.
Figure 3: The entanglement structure $\Lambda$ of the RVB state. The triangles represent the $\lambda$ tensor, and the entanglement structure $\Lambda$ is obtained by tensoring $\tfrac{2}{3}L$ copies of it and arranging the vertices according to the kagome lattice.
¹ In [28] this state is denoted $|\varepsilon\rangle$.

Main result
Our main contribution is to show that it is possible to obtain even more efficient exact representations, starting from an approximate conversion of the plaquette entangled states. To this end, let us say that a state $T\in(\mathbb{C}^d)^{\otimes L}$ degenerates to $T'\in(\mathbb{C}^{d'})^{\otimes L}$, denoted by $T\trianglerighteq T'$, if there exists a sequence of local linear maps $(A_j(n))_{j=1}^{L}$, $n\in\mathbb{N}$, such that
$$\lim_{n\to\infty}\Big(\bigotimes_{j=1}^{L}A_j(n)\Big)T=T'. \tag{4}$$
In other words, $T$ degenerates to $T'$ if we can find a sequence of restrictions that approximate $T'$ to arbitrary precision. We first note that the existence of a restriction implies a degeneration, while in many important cases the converse is not true: a degeneration from one state to another can exist even if a restriction does not. A well-known example is the degeneration of the GHZ state on three parties, $|000\rangle+|111\rangle$, to the W state $|001\rangle+|010\rangle+|100\rangle$ [1,12,33]. Even more striking is the conversion from a GHZ state on $L$ parties with $k$ levels, $\mathrm{GHZ}_k(L)=\sum_{i=0}^{k-1}|i\rangle^{\otimes L}$, to the W state on $L$ parties, $W(L)=|10\cdots0\rangle+|01\cdots0\rangle+\dots+|0\cdots01\rangle$. With restrictions, a GHZ state of $L$ levels is required (the same as the number of parties), but with a degeneration only two levels are sufficient (independently of $L$), i.e. $\mathrm{GHZ}_2(L)\trianglerighteq W(L)$ but $\mathrm{GHZ}_{L-1}(L)\not\geq W(L)$. Similar examples were obtained in [19], in the case of states with physical dimension exceeding the bond dimension.
Let us consider the $k$-level GHZ state on three parties with the graphical notation introduced in (2), and let $\mathrm{MaMu}_{k_1,k_2,k_3}$ denote maximally entangled pairs of dimensions $k_1,k_2,k_3$ shared around a triangle (see Appendix A.1). In Appendix A.2, we observe that $\mathrm{MaMu}_{2,2,2}\trianglerighteq\mathrm{GHZ}_3$ although $\mathrm{MaMu}_{2,2,2}\not\geq\mathrm{GHZ}_3$. In other words, the 3-level GHZ state on three parties can be obtained as a limit of an MPS with periodic boundary conditions and bond dimension equal to 2, which is not sufficient to construct an exact representation. Moreover, as we show in Appendix A, it is possible to get an even higher saving by using maximally entangled states with different bond dimensions: $\mathrm{MaMu}_{2,2,3}\trianglerighteq\mathrm{GHZ}_4$ and $\mathrm{MaMu}_{2,3,3}\trianglerighteq\mathrm{GHZ}_5$. While the notion of degeneration is weaker than the one of restriction, if we can find a degeneration to the plaquette tensor $\phi$, then we can still obtain an exact representation of the target state $T$, as follows:
Theorem. Let $\Psi$ and $\Phi$ be the entanglement structures obtained by placing $\psi$ and $\phi$, respectively, on the faces of a lattice with $L$ sites. Assume $T$ can be represented by $\Phi$, i.e. $\Phi\geq T$. If now $\psi\trianglerighteq\phi$ with error degree $e$, then
$$T=\sum_{i=0}^{e\kappa L}\gamma_i\,W_i, \tag{7}$$
where each $W_i$ can be represented by $\Psi$, i.e. $\Psi\geq W_i$. Moreover, for any observable $O$,
$$\langle T,OT\rangle=\sum_{i=0}^{2e\kappa L}\tilde\gamma_i\,\langle V_i,OV_i\rangle, \tag{8}$$
where each $V_i$ can be represented by $\Psi$ and the coefficients $\gamma_i$, $\tilde\gamma_i$ are obtained from Lagrange interpolation. Note that the number of terms in the sum scales linearly with the system size $L$, instead of an exponential dependence which would be naively expected from a reduction in the bond dimension.
As an illustration of the theorem, we improve the PEPS representation of the RVB state on the kagome lattice (see Figure 4). We reduce the bond dimension to 2, instead of the bond dimension 3 which was considered in [28]: in other words, we reduce the local virtual dimension at each vertex from $3^4=81$ to $2^4=16$. We are able to do this at the cost of considering a linear combination of tensor network states, where the number of terms scales linearly in the system size. We obtain this reduction in the effective bond dimension by showing that the triangle of maximally entangled pairs with bond dimensions (2, 2, 2) degenerates to $\lambda$, and then the result immediately follows from the theorem. The degeneration is realized with the help of MPS matrices for $j=1,2,3$ that depend polynomially on $\varepsilon$ and represent the state $\varepsilon^2\lambda+\varepsilon^4\,|2\rangle\otimes\big(\tfrac14|00\rangle-|11\rangle\big)$. Dividing the matrices at one site by $\varepsilon^2$ and taking the limit $\varepsilon\to0$ then yields $\lambda$. Hence, building on these two optimal representations in terms of restrictions and degenerations, we find a clear separation between the two descriptions, with a reduction in bond dimension for the degeneration that cannot be attained with restrictions.
Figure 4: Graphical representation of the Theorem. a) A local degeneration $(A(\varepsilon),B(\varepsilon),C(\varepsilon))$ depending polynomially on $\varepsilon$ from one plaquette state (pairwise entangled states between three parties) to another ($\lambda$ state) gives rise to a global degeneration between a collection of plaquette states. b) Evaluating the degeneration at $eL+1$ points $\varepsilon_i$, we can express the full entanglement structure built from the second plaquette state (here $\lambda$ states) as a superposition of $eL+1$ states that arise as restrictions from the first entanglement structure (here pairwise entangled states between three parties). The parameter $e$ is a scaling factor depending on the polynomial degree of the local degeneration, the prefactor $\gamma_i$ is obtained by evaluating the $i$th Lagrange polynomial $\ell_i$ at 0, and $L$ is the number of plaquettes in the lattice.

Mathematical methods
In the following we present a proof of the theorem. It is known (see e.g. [4]) that the definition of degeneration as given in (4) is equivalent to the following statement: $\psi\trianglerighteq\phi$ if there exist linear maps $A_i(\varepsilon):\mathbb{C}^{d_i}\to\mathbb{C}^{d'_i}$, depending polynomially on the parameter $\varepsilon$, such that
$$\big(A_1(\varepsilon)\otimes\cdots\otimes A_m(\varepsilon)\big)\,\psi=\varepsilon^{d}\phi+\sum_{l=1}^{e}\varepsilon^{d+l}\phi_l \tag{9}$$
for some tensors $\phi_l$ and some integers $d$ and $e$.
In this case $e$ is called the error degree, and we write $\psi\trianglerighteq^{e}\phi$ to specify it. Note that, by dividing the local maps $A_i(\varepsilon)$ by a combined factor of $\varepsilon^{d}$, this amounts to a sequence of restrictions converging to $\phi$, where the matrix entries of the local linear maps are Laurent polynomials in $\varepsilon$. The presented degeneration from the triangle with bond dimensions (2, 2, 2) to $\lambda$, for instance, has $d=e=2$.
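We observe that the plaquette degeneration $\psi\trianglerighteq\phi$ immediately implies that $\psi^{\otimes\kappa L}\trianglerighteq\phi^{\otimes\kappa L}$, as can be seen by taking the tensor product of the local operators given by the degeneration $\psi\trianglerighteq\phi$. As was already observed in [9, Prop. 4], the error degree grows only linearly in the number of copies of the degeneration maps, and therefore we see that the product of $\kappa L$ copies of $\psi$ degenerates to $\kappa L$ copies of $\phi$ with error degree $e\kappa L$:
$$\underbrace{\psi\otimes\psi\otimes\cdots\otimes\psi}_{\kappa L\ \text{copies}}\ \trianglerighteq^{e\kappa L}\ \underbrace{\phi\otimes\phi\otimes\cdots\otimes\phi}_{\kappa L\ \text{copies}}. \tag{10}$$
This degeneration is possible when all $m$ parties of each of the $\kappa L$ copies are considered independently, i.e. when the states in (10) are regarded as $m\kappa L$-partite states. In [9] this was derived in order to show that tensor rank is strictly submultiplicative. Note that the degeneration resulting from grouping all the $\kappa L$ copies of $\psi$ and $\phi$ into an $m$-tensor was already considered in [3], and led to faster algorithms for matrix multiplication. In order to prove the theorem, we will consider instead a different consequence of this argument: grouping the $m\kappa L$ tensor factors according to the underlying lattice, we obtain $\Psi$ and $\Phi$ as $L$-partite states respectively, which means that
$$\Psi\ \trianglerighteq^{e\kappa L}\ \Phi. \tag{11}$$
Similarly to [3,9], we now apply Lagrange interpolation [17, p. 260] in order to transform the degeneration into a restriction. From (11), we can write
$$\Big(\bigotimes_{l=1}^{L}A_l(\varepsilon)\Big)\Psi=\varepsilon^{d}\Phi+\sum_{k=1}^{e\kappa L}\varepsilon^{d+k}\Phi_k$$
for some integer $d$ and some tensors $\Phi_k$, where the linear maps $A_l(\varepsilon)$, depending polynomially on $\varepsilon$, are given by copies of the degeneration maps of $\psi\trianglerighteq\phi$ grouped with respect to the lattice. Let $(B_l)_l$ be the local operators given by the restriction $\Phi\geq T$, i.e. $\big(\bigotimes_{l}B_l\big)\Phi=T$. Composing the expansion above with the $(B_l)_l$ and dividing by $\varepsilon^{d}$, we define
$$T(\varepsilon)=\varepsilon^{-d}\Big(\bigotimes_{l=1}^{L}B_lA_l(\varepsilon)\Big)\Psi=T+\sum_{k=1}^{e\kappa L}\varepsilon^{k}\Big(\bigotimes_{l=1}^{L}B_l\Big)\Phi_k.$$
Considering the right-hand side, we immediately see that $T(\varepsilon)$ depends polynomially on $\varepsilon$ with degree $e\kappa L$, and that $T(0)=T$. Moreover, for each $\varepsilon>0$, $T(\varepsilon)$ is a restriction from $\Psi$. Evaluating $T(\varepsilon)$ at $e\kappa L+1$ distinct points $(\varepsilon_i)_{i=0}^{e\kappa L}$, we can obtain the value at $\varepsilon=0$ via Lagrange interpolation,
$$T=T(0)=\sum_{i=0}^{e\kappa L}\gamma_i\,T(\varepsilon_i),\qquad \gamma_i=\ell_i(0)=\prod_{j\neq i}\frac{\varepsilon_j}{\varepsilon_j-\varepsilon_i},$$
which is (7) with $W_i=T(\varepsilon_i)$.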
In order to prove (8), we observe that any expectation value with respect to $T(\varepsilon)$ is also given by a polynomial in $\varepsilon$, this time of degree at most $2e\kappa L$. Similarly as before, computing $\langle T(\varepsilon),OT(\varepsilon)\rangle$ for a fixed $\varepsilon>0$ amounts to computing an expectation value for a state $T(\varepsilon)$ which has a representation in terms of $\Psi$. Computing $2e\kappa L+1$ such expectation values is sufficient to obtain the value at $\varepsilon=0$ by interpolation, which proves (8).
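The interpolation step used in the proof can be sketched on a toy example. The code below is an illustration under stated assumptions: the vectors in `vectors` are made-up data standing in for $T$ and the error terms, and the function names are hypothetical. It only demonstrates that the value of a vector-valued polynomial at $\varepsilon=0$ is an explicit linear combination of its values at degree + 1 non-zero points, with weights $\gamma_i=\ell_i(0)$.

```python
from fractions import Fraction

def lagrange_weights_at_zero(points):
    """gamma_i = l_i(0) = prod_{j != i} eps_j / (eps_j - eps_i)."""
    weights = []
    for i, ei in enumerate(points):
        w = Fraction(1)
        for j, ej in enumerate(points):
            if j != i:
                w *= Fraction(ej) / (Fraction(ej) - Fraction(ei))
        weights.append(w)
    return weights

def t_of_eps(eps, coefficient_vectors):
    """Toy T(eps) = sum_k eps^k * v_k, a vector-valued polynomial in eps."""
    size = len(coefficient_vectors[0])
    return [sum(Fraction(eps) ** k * v[n]
                for k, v in enumerate(coefficient_vectors))
            for n in range(size)]

# Hypothetical data: v_0 plays the role of T, v_1..v_3 are error terms.
vectors = [[1, 0, 2], [5, -1, 3], [0, 4, 4], [7, 7, -2]]
degree = len(vectors) - 1
points = [Fraction(k, 10) for k in range(1, degree + 2)]  # degree + 1 points

gammas = lagrange_weights_at_zero(points)
samples = [t_of_eps(e, vectors) for e in points]  # each sample has eps != 0
recovered = [sum(g * s[n] for g, s in zip(gammas, samples)) for n in range(3)]
print(recovered)
```

Exact rational arithmetic is used here so that the reconstruction of `vectors[0]` is exact; in floating point the weights $\gamma_i$ can become large for many closely spaced points, which is one reason why the choice of evaluation points matters in practice.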
Discussion
In this work we have shown that the geometry of entangled states and transformations between general entanglement structures provide a new framework for the construction of more efficient tensor network representations of quantum states. More precisely, starting from local improvements on the level of plaquette states, we have shown how to construct optimized tensor network representations on the entire lattice. We provide two methods to obtain such local improvements: restrictions and degenerations.
We illustrate this approach with the RVB state and its PEPS representation. From the representation with bond dimension equal to 3 given in [28], we obtain a first improvement by considering bonds of different dimensions, obtaining a representation where two out of three bonds on each triangle of the kagome lattice have bond dimension 2 instead of 3. In Appendix C we present the details of the computational cost of contracting a PEPS with unequal bonds on the kagome lattice: if the bond dimensions around a triangular plaquette satisfy $D_1=D_3\leq D_2$, then the computational cost of approximately contracting the PEPS network is a sum of a term proportional to $\chi^3$ and a term proportional to $\chi^2d$, with prefactors that depend on the $D_i$ and on constants $C_i$, where $\chi$ denotes the bond dimension of the boundary MPS (this generalizes the well-known scaling of $C_1\chi^3D^4+C_2\chi^2D^6d$ for the case $D_i=D$ [18]). Hence, our optimized tensor network representation of the entanglement structure underlying the RVB state in terms of restrictions reduces the prefactor of $\chi^3$ from $81C_1$ to $36C_1$ and that of $\chi^2d$ from $729C_2$ to $216C_2$. In addition, due to the reduced bond dimension, the error caused by the truncation of the boundary MPS is also reduced. Note that this improvement of the runtime bound applies to the contraction of all tensor networks based on this entanglement structure on the kagome lattice. The same entanglement structure representing the RVB state is used in [28] to construct a family of quantum states which interpolates between the RVB state and a dimer state, which are believed to lie in different quantum phases. Since we have improved the PEPS representation of the entanglement structure behind all these states, the saving we have obtained for the RVB state applies to all of them. Note further that there are ways to optimize the contraction cost for specific tensor networks. In [28], for instance, the kagome lattice is first transformed to a square lattice for which an RVB-specific improved double-layer bond dimension is derived.
We would like to emphasize that our general contraction method still obtains an improvement when compared to this specific method (for details see Appendix C.2). In Appendix A.4, we also show that this representation of the RVB state in terms of the triangle with bond dimensions (2, 2, 3) is optimal for restrictions.
If we consider the more general case of approximating the plaquette state in terms of degenerations, we can even construct a bond dimension 2 representation of the RVB state, which again is optimal in terms of this effective bond dimension. Using geometrical tools, our main result then allows us to lift this local approximate conversion on the level of the plaquette states to an exact representation of the RVB state on the entire kagome lattice in terms of a superposition of a linear number of tensor network states with bond dimension 2. More generally, we present a recipe to obtain such improvements via degenerations between different entanglement structures.
In addition, our result gives a prescription of how to leverage this optimized bond dimension in order to reduce the computational cost of computing expectation values. More precisely, we describe a parallel contraction algorithm to compute physical expectation values $\langle T,OT\rangle$ of the original state as
$$\langle T,OT\rangle=\sum_{i=0}^{2e\kappa L}\tilde\gamma_i\,\langle V_i,OV_i\rangle,$$
where $L$ is the system size, the coefficients $\tilde\gamma_i$ are obtained from Lagrange interpolation, and each of the vectors $V_i$ is given as an MPS with the reduced bond dimension. Furthermore, we can explicitly construct the local maps giving the representation of each $V_i$ in terms of the underlying entanglement structure $\Psi$. Computing each $\langle V_i,OV_i\rangle$ then requires computing one contraction of $\Psi$, which will have a smaller bond dimension than the original entanglement structure $\Phi$ and thus can potentially be done over larger lattice sizes. The exact expectation value of $T$ can then be reconstructed from the independently computed $\langle V_i,OV_i\rangle$ via (8). The superposition of tensor network states arises in our result as a way to evaluate a polynomial expression at zero using Lagrange interpolation. We note that, due to the reduced bond dimension for each of the $V_i$, the error caused by approximate contraction will also be smaller, and that by oversampling the number of evaluation points in the degeneration, there is an additional potential for improving the accuracy of the contraction. In the case of the RVB state or any other tensor network state with an entanglement structure on the kagome lattice, the presented degeneration reduces the prefactors for the computational effort for the contraction of each $\langle V_i,OV_i\rangle$ to $16C_1$ for $\chi^3$ and $64C_2$ for $\chi^2d$, as compared to $36C_1$ and $216C_2$ for the unbalanced optimal restriction with bond dimensions (2, 2, 3).
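The prefactors quoted in this discussion can be checked with a few lines of arithmetic. The sketch below takes the equal-bond scaling $C_1\chi^3D^4+C_2\chi^2D^6d$ from the text and uses the values 36 and 216 for bond dimensions (2, 2, 3) exactly as quoted; the unequal-bond formula of Appendix C is not reproduced, and the function names are illustrative assumptions.

```python
from fractions import Fraction

def equal_bond_prefactors(D):
    """Prefactors of chi^3 and chi^2*d in C1*chi^3*D^4 + C2*chi^2*D^6*d."""
    return D ** 4, D ** 6

peps_d3 = equal_bond_prefactors(3)   # original representation of [28]
peps_d2 = equal_bond_prefactors(2)   # each term of the degeneration
restriction_223 = (36, 216)          # quoted in the text for bonds (2, 2, 3)

def speedup(old, new):
    """Ratio of prefactors (old / new) for the chi^3 and chi^2*d terms."""
    return tuple(Fraction(o, n) for o, n in zip(old, new))

print(peps_d3, peps_d2)
print(speedup(peps_d3, restriction_223))
print(speedup(peps_d3, peps_d2))
```

These ratios refer to a single contraction. For the degeneration, the theorem requires a number of independent contractions that grows linearly with the system size, so the per-term saving has to be weighed against the number of terms.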
Furthermore, in Appendix A.3 we show that the EPR pairs on a square with $k$ levels degenerate to a GHZ state on the square with $\lceil k^2/2\rceil$ levels. Hence, on the level of plaquette states, we can degenerate from pairwise maximally entangled states on four parties with $\lceil\sqrt{2D}\rceil$ levels to a GHZ state on four parties with $D$ levels. Taking into account that in a two-dimensional square lattice the bond dimensions of neighbouring plaquette states have to be combined (see Figure 2 (c) and (e)), this means that quasi-injective PEPS on the two-dimensional square lattice based on GHZ states with bond dimension $D$, as introduced in [21], can be represented as a normal PEPS of bond dimension $2D$. By our theorem, expectation values for these generalized PEPS can hence be computed from expectation values of normal PEPS, for which highly optimized numerical codes exist.
More generally, given an entanglement structure Φ built from locally distributed multi-partite entangled states, our result allows to characterize the variational class given by the set of states obtained by applying local maps {A i (ε)} L i=0 which are polynomial of degree e in ε, and then taking the limit ε to zero. Each state obtained in this fashion is specified by a polynomial number of parameters. We have shown that such states belong to the span of a linear number of states represented by Φ, and that their expectation values can be efficiently computed by interpolation. While this class of states can be represented by a PEPS with a sufficiently large bond dimension, this alternative description allows to compute contractions on larger system sizes.

A Plaquette conversions
In this section, we present general strategies and examples for optimized conversion between plaquette states in terms of degenerations. To this end, we consider $m$-tensors, i.e. elements of $\bigotimes_{i=1}^{m}\mathbb{C}^{d_i}$ for some positive integers $(d_i)_i$, which can be equivalently seen as unnormalized pure $m$-partite quantum states. We will usually consider $m$ to be a small integer (often $m$ will be equal to 3 or 4), as these $m$-tensors will be the building blocks of the entanglement structures we will consider in Appendix B. After some definitions and examples that set the scene, we will study the conversion between maximally entangled states shared around circles and GHZ states, which is the basis for the conversion between PEPS and more general tensor network states. To do this, we utilize the correspondence between entangled pairs on the circle and the matrix multiplication tensor (see e.g. [8]). This will first be done for 3-party tensors and subsequently for tensors of $m$ parties. In addition, we prove in Appendix A.4 that the MPS representation with bond dimensions (2, 2, 3) for the state $\lambda$, which is the basis for the PEPS representation of the RVB state, is optimal.

A.1 Definitions and Examples
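Let us start by recalling the definitions of tensor restriction and degeneration.
Definition 1 (Restriction). Let $\psi\in\bigotimes_{i=1}^{m}\mathbb{C}^{d_i}$ and $\phi\in\bigotimes_{i=1}^{m}\mathbb{C}^{d'_i}$. We say that $\psi$ restricts to $\phi$, and write $\psi\geq\phi$, if there exist linear maps $A_i:\mathbb{C}^{d_i}\to\mathbb{C}^{d'_i}$ such that $(A_1\otimes\cdots\otimes A_m)\,\psi=\phi$.
Definition 2 (Degeneration). We say that $\psi$ degenerates to $\phi$ with error degree $e$, and write $\psi\trianglerighteq^{e}\phi$, if there exist linear maps $A_i(\varepsilon):\mathbb{C}^{d_i}\to\mathbb{C}^{d'_i}$, depending polynomially on $\varepsilon$, such that
$$\big(A_1(\varepsilon)\otimes\cdots\otimes A_m(\varepsilon)\big)\,\psi=\varepsilon^{d}\phi+\sum_{l=1}^{e}\varepsilon^{d+l}\phi_l$$
for some tensors $\phi_l$ and some integer $d$. We simply write $\psi\trianglerighteq\phi$ if $\psi\trianglerighteq^{e}\phi$ for some error degree $e$.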
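As a concrete instance of a degeneration, the following sketch checks numerically the well-known conversion of the three-party GHZ state $|000\rangle+|111\rangle$ to the W state mentioned in the main text. The particular local map used here ($|0\rangle\mapsto|0\rangle+\varepsilon|1\rangle$, $|1\rangle\mapsto-|0\rangle$) is one standard choice and is an assumption of this example; with it, the image is $\varepsilon W+\varepsilon^2(\dots)+\varepsilon^3|111\rangle$, i.e. $d=1$ and $e=2$ in the notation above.

```python
from fractions import Fraction
from itertools import product

def apply_local_maps(maps, tensor, dims_out):
    """Apply A_1 x A_2 x A_3 to a 3-party tensor {(i, j, k): amplitude}."""
    out = {}
    for idx_out in product(*(range(d) for d in dims_out)):
        total = Fraction(0)
        for idx_in, amp in tensor.items():
            term = Fraction(amp)
            for A, o, i in zip(maps, idx_out, idx_in):
                term *= A[o][i]  # matrix element <o| A |i>
            total += term
        out[idx_out] = total
    return out

ghz = {(0, 0, 0): 1, (1, 1, 1): 1}
w_state = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}

def degeneration_map(eps):
    # Columns are the images of |0> and |1>: |0> -> |0> + eps|1>, |1> -> -|0>.
    return [[Fraction(1), Fraction(-1)],
            [Fraction(eps), Fraction(0)]]

def max_deviation(eps):
    """Largest entry of (A x A x A) GHZ / eps - W."""
    A = degeneration_map(eps)
    image = apply_local_maps([A, A, A], ghz, (2, 2, 2))
    return max(abs(image[idx] / eps - w_state.get(idx, 0)) for idx in image)

errors = [max_deviation(Fraction(1, 10 ** n)) for n in (1, 2, 3)]
print(errors)
```

The deviation from W shrinks linearly with $\varepsilon$ but never vanishes for $\varepsilon\neq0$, which is the sense in which a degeneration is an approximate conversion.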
Let us denote by $\mathrm{GHZ}_k(m)$ the $k$-level Greenberger-Horne-Zeilinger (GHZ) state on $m$ parties:
$$\mathrm{GHZ}_k(m)=\sum_{i=0}^{k-1}|i\rangle^{\otimes m}.$$
We note that $\mathrm{GHZ}_k$ agrees with the unit tensor in algebraic complexity theory, usually denoted as $\langle k\rangle$.
In the cases when $m$ is small, as in the case $m=3$ which we will study extensively, we will use a graphical notation to represent the GHZ state. When the number of parties is clear from the context, we will simply write $\mathrm{GHZ}_k$. In algebraic complexity theory, the GHZ state plays a special role, which leads us to define the following quantities.
Definition 3 (Rank and border rank). For $\phi\in\bigotimes_{i=1}^{m}\mathbb{C}^{d_i}$ we define the rank and border rank of $\phi$ as
$$R(\phi)=\min\{k\,:\,\mathrm{GHZ}_k(m)\geq\phi\}\quad\text{and}\quad\underline{R}(\phi)=\min\{k\,:\,\mathrm{GHZ}_k(m)\trianglerighteq\phi\},$$
respectively.

Remark 4.
Both the rank and the border rank depend on the tensor product structure of the space where $\phi$ lives: if we regroup the tensor product differently, the rank might change. It is easy to see that if we group factors together, i.e. we see $\phi$ not as an $m$-partite state but as an $m'$-partite state with $m'<m$, then both the rank and the border rank will not increase. This is due to the fact that after regrouping, the state $\mathrm{GHZ}_k(m)$ becomes the state $\mathrm{GHZ}_k(m')$, so if a restriction or degeneration to $\phi$ was possible before grouping, it will still be possible after grouping. Moreover, if $m=2$, then both rank and border rank of $\phi$ coincide with the Schmidt rank across the bipartition. Therefore, the maximal Schmidt rank of $\phi$ over all bipartitions of the $m$ parties is a lower bound to $\underline{R}(\phi)$, and hence also to $R(\phi)$.
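The Schmidt-rank lower bound of Remark 4 is straightforward to evaluate for small tensors. The sketch below is a minimal illustration, restricted for simplicity to the cuts that separate a single party from the rest (for three parties these are all bipartitions); the helper names and the dictionary representation of tensors are assumptions of this example.

```python
from fractions import Fraction
from itertools import product

def matrix_rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def schmidt_rank(tensor, dims, party):
    """Schmidt rank across the cut {party} | rest for {index tuple: amplitude}."""
    rest = [p for p in range(len(dims)) if p != party]
    cols = list(product(*(range(dims[p]) for p in rest)))
    rows = []
    for a in range(dims[party]):
        row = []
        for c in cols:
            idx = list(c)
            idx.insert(party, a)  # rebuild the full index tuple
            row.append(tensor.get(tuple(idx), 0))
        rows.append(row)
    return matrix_rank(rows)

ghz3 = {(i, i, i): 1 for i in range(3)}                  # GHZ_3(3)
w = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}           # W state

ghz_bound = max(schmidt_rank(ghz3, (3, 3, 3), p) for p in range(3))
w_bound = max(schmidt_rank(w, (2, 2, 2), p) for p in range(3))
print(ghz_bound, w_bound)
```

For the W state the bound evaluates to 2, which matches its border rank (a two-level GHZ state degenerates to W) but not its rank, so the bound can be tight for the border rank while being strict for the rank.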
Another relevant example is the so-called iterated matrix multiplication tensor, which is the $m$-tensor given by maximally entangled states of dimensions $k_1,k_2,\dots,k_m$ arranged in a cycle, which we will denote by $\mathrm{MaMu}_{k_1,\dots,k_m}$:
$$\mathrm{MaMu}_{k_1,\dots,k_m}=\sum_{i_1=1}^{k_1}\cdots\sum_{i_m=1}^{k_m}|i_1,i_2\rangle\otimes|i_2,i_3\rangle\otimes\cdots\otimes|i_m,i_1\rangle.$$
This tensor is often denoted by $\langle k_1,\dots,k_m\rangle$ in algebraic complexity. We write $\mathrm{MaMu}_k(m)$ if $k=k_1=\dots=k_m$, a case which is often denoted by $\mathrm{IMM}^{m}_{k}$ in the literature. As in the case of the GHZ state, we will write $\mathrm{MaMu}_k$ without the parameter $m$ when this does not cause any ambiguity.
In the cases where $m$ is fixed and small, as for example when $m=3$, we will use a graphical notation. Note that the fact that $\mathrm{MaMu}_k(m)$ restricts to an $m$-tensor $\phi$ is equivalent to the fact that $\phi$ has an MPS representation of bond dimension $k$ with periodic boundary conditions. More generally, since PEPS and other tensor network states are defined in terms of networks of maximally entangled states, we will be very interested in having results regarding conversions between $\mathrm{MaMu}_k(m)$ and other states. This leads us to define, in analogy to the rank and border rank, the bond dimension and border bond dimension of $\phi$ as
$$\mathrm{bond}(\phi)=\min\{k\,:\,\mathrm{MaMu}_k(m)\geq\phi\}\quad\text{and}\quad\underline{\mathrm{bond}}(\phi)=\min\{k\,:\,\mathrm{MaMu}_k(m)\trianglerighteq\phi\},$$
respectively.

A.2 From MaMu k (m) to GHZ k (m): the case m = 3
The aim of this section is to investigate restrictions and degenerations from $\mathrm{MaMu}_{k_1,k_2,k_3}$ to $\mathrm{GHZ}_k(3)$: this will be relevant as these tensors can be used to construct entanglement structures of triangular lattices. In particular, we will prove the following proposition.
for some fixed positive $c$. In other words, Hence, $\mathrm{bond}(\mathrm{GHZ}_3(3))>2$, whereas $\underline{\mathrm{bond}}(\mathrm{GHZ}_3(3))=2$. Before giving the proof, we discuss a non-symmetric extension of this result, i.e. degenerations from $\mathrm{MaMu}_{k_1,k_2,k_3}$ with different values of $k_1,k_2,k_3$. Following [36], we consider the local diagonal operator depending on an integer $g$ which we will fix later. This leads to the transformation The leading-order term in $\varepsilon$ corresponds to a GHZ state, because fixing any pair of $i_1,i_2,i_3$ determines the third one uniquely. Hence, we only have to determine the number of solutions to the equation $i_1+i_2+i_3=g$ for given $k_i$ and inhomogeneity $g$. Choosing $k_1=2$, $k_3=3$ and $k_2=2$ or $k_2=3$, together with $g=5$, then directly leads to $\mathrm{MaMu}_{2,2,3}\trianglerighteq\mathrm{GHZ}_4$ and $\mathrm{MaMu}_{2,3,3}\trianglerighteq\mathrm{GHZ}_5$.
These degenerations are optimal, both in the sense that the corresponding restrictions are not possible, and in the sense that we cannot obtain GHZ states with more levels from a degeneration of these MaMu tensors. It is also not possible to obtain the same GHZ states from MaMu tensors where one of the bond dimensions is smaller than the ones we have considered.
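The counting argument behind these degenerations can be reproduced by brute force. The sketch below assumes that the bond indices run over $1,\dots,k_l$ (a convention chosen here because it reproduces the level counts stated above for $g=5$); the function names are hypothetical. It counts the solutions of $i_1+i_2+i_3=g$ and checks that any two indices of a solution determine the third, which is the property that makes the leading-order term a GHZ state.

```python
from itertools import product

def ghz_terms(k1, k2, k3, g):
    """Index triples (i1, i2, i3), 1 <= i_l <= k_l, with i1 + i2 + i3 = g."""
    ranges = (range(1, k1 + 1), range(1, k2 + 1), range(1, k3 + 1))
    return [t for t in product(*ranges) if sum(t) == g]

def pairs_determine_third(terms):
    """Check that any two entries of a solution fix the remaining one."""
    for drop in range(3):
        seen = set()
        for t in terms:
            key = t[:drop] + t[drop + 1:]
            if key in seen:
                return False
            seen.add(key)
    return True

def best_level(k1, k2, k3):
    """Largest number of solutions over all values of the inhomogeneity g."""
    return max(len(ghz_terms(k1, k2, k3, g))
               for g in range(3, k1 + k2 + k3 + 1))

levels_223 = len(ghz_terms(2, 2, 3, 5))
levels_233 = len(ghz_terms(2, 3, 3, 5))
print(levels_223, levels_233, best_level(2, 2, 2), best_level(3, 3, 3))
```

For equal bond dimensions $n=2$ and $n=3$ the maximal counts are 3 and 7, in agreement with the value $\lceil 3n^2/4\rceil$ quoted from [30] later in this appendix.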
We will now turn to the proof of Proposition 7. We will first introduce two definitions and prove a lemma. We will denote by dim H the dimension of the orthogonal representation. Let K 0 n,n be the graph obtained by removing the edge (b 0 , c 0 ) from K n,n : Lemma 10. With the notation defined above, let π : K 0 n,n → H be an orthogonal representation such that dim H 2(n − 1). Then at least one of the following holds . . , n − 1} and C = span{π(c i ) | i = 1, . . . , n − 1}. Since π(b i ) is orthogonal to π(c j ) for every i, j = 1, . . . , n − 1, we have that B ⊥ C. If B ⊕ C is not equal to H, which has dimension 2(n − 1), then at least one of the two has to have dimension strictly smaller than n − 1, so that either 1. or 2. holds. If not, then H = B ⊕ C. Since π(b 0 ) is orthogonal to every π(c i ) for i = 1, . . . , n − 1, it is orthogonal to C, and therefore π(b 0 ) ∈ B. Similarly, π(c 0 ) is orthogonal to B and therefore lies in C. But then π(b 0 ) and π(c 0 ) live in orthogonal subspaces and they are themselves orthogonal.
We are now ready to prove Proposition 7.
Proof (Proposition 7). We will start by proving the lower bound of (24) as well as the first part of (25), since they are equivalent, as can be seen by setting $k=n^2-n+1$. Let us assume that $\mathrm{GHZ}_{n^2-n+1}$ has an MPS representation with bond dimension $D\leq n$, and let us show how to derive a contradiction from this fact. To fix notation, let
$$\mathrm{GHZ}_{n^2-n+1}=\sum_{i,j,k=0}^{n^2-n}\operatorname{tr}(A_iB_jC_k)\,|ijk\rangle$$
with matrices $A_i,B_j,C_k\in M_D$. We start by showing that if $D\leq n$ we can without loss of generality assume that $A_0$ is non-singular. To show this, we will use the following fact: any linear subspace of $M_D$ containing only singular matrices has dimension at most $D^2-D$ [11]. Consider $S=\operatorname{span}\{A_i\mid i=0,\dots,n^2-n\}\subset M_D$. $S$ is the span of $n^2-n+1$ matrices: if it contains only singular matrices, then its dimension can be at most $D^2-D$. So if $D\leq n$, either there is a matrix of full rank in $S$, or $\dim S\leq D^2-D\leq n^2-n$, which implies that the matrices $(A_i)_i$ are not linearly independent. The latter is impossible, because the Schmidt rank of $\mathrm{GHZ}_{n^2-n+1}$ across the cut separating the first party equals $n^2-n+1$.
Let W = (w ij ) ∈ U (n 2 − n + 1) a unitary matrix such that n 2 −n i=0 w 0i A i is either zero or full rank. Then by denoting ϕ i = W |i the rotated basis, we see that (W ⊗ W ⊗ W ) GHZ n 2 −n+1 = i ϕ i ⊗ ϕ i ⊗ ϕ i has an MPS representation with matrices and A 0 is either zero or full-rank. The first case we can exclude, because tr A 0 B 0 C 0 = 1. This shows that up to a local unitary on the physical level, we can assume without loss of generality that A 0 is not singular.
Let $A_0=U\Sigma V^*$ be the singular-value decomposition of $A_0$. Then $\Sigma>0$ defines a scalar product $\langle\cdot|\cdot\rangle_\Sigma$ on $M_D$, with respect to which we obtain an orthogonal representation $\pi$ of the graph $K^0_{n^2-n+1,\,n^2-n+1}$ (defined in Lemma 10) on $M_D$. If $D\leq n$, then $\dim M_D=D^2\leq n^2<n^2+(n-1)^2+1=2(n^2-n+1)$, which implies that we can apply Lemma 10 and at least one of the conditions stated in it must hold true. If 1. or 2. holds, then either $\operatorname{span}\{B_i\}$ or $\operatorname{span}\{C_i\}$ has dimension strictly smaller than $n^2-n+1$, but we have already seen that this leads to a contradiction. Therefore 3. must hold, but this also leads to a contradiction: on the one hand, condition 3. gives $\langle\pi(b_0)|\pi(c_0)\rangle_\Sigma=0$, but on the other hand we know that $\langle\pi(b_0)|\pi(c_0)\rangle_\Sigma=\operatorname{tr}A_0B_0C_0=1$.
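We will now prove the upper bound of (24). Our starting point is the following result [30]: $\mathrm{MaMu}_n \trianglerighteq^{\gamma n^2} \mathrm{GHZ}_{\lceil 3n^2/4 \rceil}$ (27), i.e. a degeneration with error degree $\gamma n^2$, for some constant $\gamma > 0$. Let $\alpha$ be an integer to be determined later, and consider the tensor product of $\alpha$ copies of (27). To simplify notation, we set $k = \lceil \tfrac{3}{4} n^2 \rceil^{\alpha}$, so that we get $\mathrm{MaMu}_{n^\alpha} \trianglerighteq^{\alpha\gamma n^2} \mathrm{GHZ}_k$.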
As discussed previously, it is a well-known result in algebraic complexity theory that a degeneration can be turned into a restriction by interpolation, paying a price in terms of a direct sum (see e.g. [4]). In the present context, this means that we can turn the degeneration into a restriction by supplementing it with a GHZ state whose number of levels equals the error degree plus one (see e.g. [9]). Therefore we obtain $\mathrm{GHZ}_{\alpha\gamma n^2+1} \otimes \mathrm{MaMu}_{n^\alpha} \ge \mathrm{GHZ}_k$, from which it follows that $\operatorname{bond}(\mathrm{GHZ}_k) \le n^\alpha \operatorname{bond}(\mathrm{GHZ}_{\alpha\gamma n^2+1})$.
Now we see that $n^2 \le \tfrac{4}{3} k^{1/\alpha}$, and hence $n^\alpha \le (4/3)^{\alpha/2} k^{1/2}$, since $k \ge (\tfrac{3}{4} n^2)^\alpha$. We now want to choose $\alpha$ in order to minimize the right-hand side. We will instead simply minimize $(4/3)^{\alpha/2} k^{1/\alpha}$, as this already gives the right asymptotic scaling. Since this function diverges to infinity when $\alpha$ tends to zero or to infinity, we find the minimum by setting the derivative of $\frac{\alpha}{2}\log\frac{4}{3} + \frac{1}{\alpha}\log k$ to zero, which gives $\alpha = \sqrt{2\log k/\log(4/3)}$. We can improve this bound by minimizing the right-hand side of (29) instead, obtaining bond(GHZ k ) 8γ 3 log(4/3) k

Note that the asymptotic scaling of this bound is the same as the one we had obtained by minimizing $(4/3)^{\alpha/2} k^{1/\alpha}$, as we claimed. To get the second part of (25), let instead α+2 . Then (29) implies that

Again we would like to take the maximum over $\alpha$ to obtain the best lower bound. We approximate the optimal value by maximizing the function given by the value of $\alpha$ satisfying

again since the function is smaller than or equal to zero for $\alpha$ equal to zero or tending to infinity. Since both

for a positive constant c.
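The simplified optimization over $\alpha$ can be checked numerically. The following is a minimal sketch, assuming that $\alpha$ is treated as a real parameter (in the proof it is an integer) and that $k > 1$; the function names `log_objective`, `log_objective_derivative` and `optimal_alpha` are hypothetical and only serve this illustration.

```python
import math


def log_objective(alpha: float, k: float) -> float:
    """Logarithm of (4/3)^(alpha/2) * k^(1/alpha)."""
    return 0.5 * alpha * math.log(4 / 3) + math.log(k) / alpha


def log_objective_derivative(alpha: float, k: float) -> float:
    """Derivative of log_objective with respect to alpha."""
    return 0.5 * math.log(4 / 3) - math.log(k) / alpha**2


def optimal_alpha(k: float) -> float:
    """Stationary point of log_objective: alpha = sqrt(2 log k / log(4/3))."""
    return math.sqrt(2 * math.log(k) / math.log(4 / 3))


k = 10**6
alpha_star = optimal_alpha(k)
# At the stationary point the logarithm of the objective equals
# sqrt(2 * log(k) * log(4/3)), so the objective grows sub-polynomially in k.
minimum_value = log_objective(alpha_star, k)
```

Rounding `alpha_star` to a neighbouring integer changes the value only slightly, which is why the real-valued optimum already captures the asymptotic scaling.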

A.3 From MaMu k (m) to GHZ k (m): the general case
We will now generalize the results for the 3-party case from the previous section to $m$ parties. Hence, let us consider the state $\mathrm{MaMu}_k(m)$ given by a network of maximally entangled states with $k$ levels each, shared between neighbouring parties arranged on a circle, with a total of $m$ parties. We want to find local linear transformations $A_l(\varepsilon)$, depending polynomially on $\varepsilon$, for each vertex such that the leading contribution in $\varepsilon$ of the resulting state is an $m$-party GHZ state with $k'$ levels, where the kets $|i_l, i_{l+1}\rangle$ indicate the grouping of parties. Following [36], we choose the operators $A_l(\varepsilon)$ diagonal in the local product basis, such that the leading-order contribution in $\varepsilon$ is given by those vectors $|i_1, i_2\rangle \cdots |i_m, i_1\rangle$ that satisfy a certain system of linear equations, i.e. $\sum_l c_l i_l = g$, with coefficients $c_l$ and inhomogeneity $g$ elements of $\mathbb{Z}^\nu$ for some integer $\nu$. This last condition is equivalent to the requirement that the vector $\sum_l c_l i_l - g$ is the zero vector, which in turn leads to the norm condition $\|\sum_l c_l i_l - g\|^2 = \sum_l \|c_l\|^2 i_l^2 + 2\sum_{l<l'} \langle c_l|c_{l'}\rangle\, i_l i_{l'} - 2\sum_l \langle c_l|g\rangle\, i_l + \|g\|^2 = 0$ (32).

We have to ensure that this expression can be generated by a product of local degenerations, each acting on the pair of indices held at a single vertex. This can always be achieved for all the terms in (32) that depend on at most a single index $l$. For the cross-terms, however, this requires $\langle c_l|c_{l'}\rangle = 0$ if $|l - l'| > 1$, making the vectors $c_l$ into an orthogonal representation of the cycle graph (which gives a lower bound on $\nu$), in which case only the cross-terms $\langle c_l|c_{l+1}\rangle\, i_l i_{l+1}$ between neighbouring indices remain. Furthermore, we have to ensure that the leading contribution, given by $\sum_{i \,:\, \sum_l c_l i_l = g} |i_1, i_2\rangle \cdots |i_m, i_1\rangle$ (33), is indeed locally unitarily equivalent to a GHZ state, i.e. consists of an equal-weight superposition of product states $\psi_r = \psi_{r,1} \otimes \cdots \otimes \psi_{r,m}$ such that $\langle \psi_{r,l}|\psi_{r',l}\rangle = \delta_{r,r'}$. Since (33) is a superposition of vectors of the form $|i_1, i_2\rangle \cdots |i_m, i_1\rangle$, this means that, fixing a pair of indices $i_{l'}, i_{l'+1}$ at any vertex $l'$, the linear equation $\sum_l c_l i_l = g$ must have at most one solution in the remaining $i_l$.
One way of ensuring this is to choose the vectors $c_l$, $c_{l'}$ linearly independent whenever $|l - l'| > 1$. In other words, we have to choose the vectors $(c_l)_l$ in such a way that if we remove any subset of vectors that share a vertex, the remaining ones are linearly independent. The maximal dimension of the GHZ state we can extract is then given by the number of integer solutions to the equation $\sum_l c_l i_l = g$ with $0 \le i_l \le k - 1$ (34), where we optimize over the inhomogeneity $g$. One can get a bound on the number of these solutions by a probabilistic argument with respect to the inhomogeneity $g$. However, in order to treat the case of finite $m$, we are going to write down an explicit expression for (34) that satisfies all the necessary properties, i.e. $\langle c_l|c_{l'}\rangle = 0$ for $l' \notin \{l-1, l, l+1\}$ and $\{c_l\}_{l=0}^{m} \setminus \{c_j, c_{j+1}\}$ linearly independent for all $j$. We define the equations inductively, starting from the four-party case. Adding a new vertex and edge into the cycle between $i_4$ and $i_1$ means that $c_4$ now has to be orthogonal to $c_1$, and the new $c_5$ should be orthogonal to all vectors except $c_1$ and $c_4$, which can be achieved by an appropriate choice of $c_5$. This procedure can be repeated, leading to a linear system for the k-cycle. In order to find the integer solutions to this problem, we employ the Smith normal form of the matrix on the left-hand side, which gives the general solution vector (38), where $z_1, z_2$ are arbitrary integers and the constants $(A_l)$ depend on the choice of $g$ through a simple linear integer transformation given by the Smith normal form. In order to obtain the relevant solutions for our specific problem, we have to impose the lower and upper bounds $0$ and $k - 1$ on each entry of the solution vector $i$, if the original maximally entangled states are of dimension $k$.
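The two requirements on the vectors $c_l$ (orthogonality of non-neighbouring vectors on the cycle, and linear independence of what remains after removing any neighbouring pair) are easy to test mechanically. The sketch below is only an illustration under stated assumptions: indices are taken modulo $m$, the function names are hypothetical, and the example family `example_cs` is a hypothetical choice for $m = 4$, not the explicit family used in the construction above.

```python
from fractions import Fraction
from itertools import combinations


def rank(vectors):
    """Rank over the rationals, computed by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    if not rows:
        return 0
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_valid_cycle_family(cs):
    """Check both conditions for integer vectors cs[0], ..., cs[m-1] on an m-cycle."""
    m = len(cs)
    # Condition 1: vectors at cyclic distance > 1 must be orthogonal.
    for l, lp in combinations(range(m), 2):
        cyclic_distance = min(lp - l, m - (lp - l))
        inner = sum(a * b for a, b in zip(cs[l], cs[lp]))
        if cyclic_distance > 1 and inner != 0:
            return False
    # Condition 2: removing any neighbouring pair leaves independent vectors.
    for j in range(m):
        rest = [c for i, c in enumerate(cs) if i not in (j, (j + 1) % m)]
        if rank(rest) != len(rest):
            return False
    return True


# Hypothetical example for m = 4 with nu = 2.
example_cs = [(1, 1), (1, 0), (1, -1), (0, 1)]
# A family that violates the orthogonality condition (c_0 is not orthogonal to c_2).
bad_cs = [(1, 0), (0, 1), (1, 0), (0, 1)]
```

Since $m - 2$ vectors in $\mathbb{Z}^\nu$ can only be linearly independent if $\nu \ge m - 2$, the second condition also makes the lower bound on $\nu$ visible.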

A.3.1 The case m = 4
In the case $m = 4$, (38) leads to inequalities of the form $0 \le i_l \le k - 1$ on the entries of the solution vector. Choosing $g_2 = g_1 \in \{\frac{k}{2}, \frac{k-1}{2}\}$, depending on whether $k$ is even or odd, leads to a lower bound on the number of solutions of the form $\frac{k^2+1}{2}$ for odd $k$ and $\frac{k^2}{2}$ for even $k$. This shows $\mathrm{MaMu}_k(4) \trianglerighteq \mathrm{GHZ}_{\lceil k^2/2 \rceil}(4)$, i.e. that we can locally degenerate from a cycle of four maximally entangled states with $k$ levels to a four-party GHZ state with $\lceil k^2/2 \rceil$ levels.

A.4 Bond dimension of λ is strictly larger than 2
In [28] the PEPS representation of the RVB state is obtained via the multipartite entangled state $\lambda = \sum_{i,j,k=0}^{2} \varepsilon_{i,j,k}\,|i,j,k\rangle + |2,2,2\rangle$, with $\varepsilon$ denoting the completely antisymmetric tensor such that $\varepsilon_{0,1,2} = 1$. In this section, we show that $\lambda$ cannot be represented as a matrix product state of bond dimension 2.
We first note that the trace on the left-hand side of the MPS condition $\operatorname{tr}(A_i B_j C_k) = \lambda_{i,j,k}$ (40) gives rise to the usual MPS gauge freedom, where we can insert invertible matrices and their inverses between neighbouring matrices. In addition, the antisymmetric tensor $\varepsilon$ is invariant under $M \otimes M \otimes M$ for every $M \in SL(3)$, the special linear group. Hence, restricting to matrices of the form $M = R \oplus |2\rangle\langle 2|$ with $R \in SL(2)$, which in addition leave $|2,2,2\rangle$ invariant, we also have $(M \otimes M \otimes M)\lambda = \lambda$. Thus, taking this physical symmetry together with the $Y, Z$ gauge transformation and restricting for the moment to the 2×2×2 tensor $B = (B_0, B_1)$, we see that we can apply any operator $K_1 \otimes K_2 \otimes K_3$ with $K_i \in GL(2)$ to $B$ without changing (40), provided we transform $(A_i)_i$ and $(C_k)_k$ accordingly. The $GL(2)^{\times 3}$ orbits of 2×2×2 tensors are known explicitly [12], and we can use this freedom to reduce $B_0$ and $B_1$ to one of seven normal forms, for each of which we have to obtain a contradiction. In addition to the null tensor and the product state, these seven classes encompass bipartite entanglement between only two parties, the W state and the GHZ state. We will now go through all the cases.
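The physical symmetry $(M \otimes M \otimes M)\lambda = \lambda$ for $M = R \oplus |2\rangle\langle 2|$ with $R \in SL(2)$ can be verified directly for small integer examples. The following sketch assumes that $\lambda$ is the completely antisymmetric tensor on three levels plus $|2,2,2\rangle$; the helper names and the particular matrices are hypothetical choices for illustration.

```python
from itertools import product


def levi_civita(i, j, k):
    """Completely antisymmetric tensor on indices 0, 1, 2 with eps(0,1,2) = 1."""
    return (j - i) * (k - i) * (k - j) // 2


def build_lambda():
    """Coefficients of lambda = sum_ijk eps_ijk |ijk> + |222> as a dictionary."""
    lam = {idx: levi_civita(*idx) for idx in product(range(3), repeat=3)}
    lam[(2, 2, 2)] += 1
    return lam


def apply_local(M, T):
    """Apply M (x) M (x) M to the three-index tensor T."""
    return {
        (a, b, c): sum(
            M[a][i] * M[b][j] * M[c][k] * T[(i, j, k)]
            for i, j, k in product(range(3), repeat=3)
        )
        for a, b, c in product(range(3), repeat=3)
    }


lam = build_lambda()
# M = R (+) |2><2| with R in SL(2): a shear and a second determinant-one example.
M_shear = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
M_other = [[2, 1, 0], [1, 1, 0], [0, 0, 1]]
# A block with determinant 2 is not in SL(2) and rescales the antisymmetric part.
M_scaled = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]
```

The check relies on the identity $\sum_{i,j,k} M_{ai} M_{bj} M_{ck}\, \varepsilon_{i,j,k} = \det(M)\, \varepsilon_{a,b,c}$, so invariance holds exactly when the $2 \times 2$ block has determinant one.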
Null tensor. In this case, both $B_0$ and $B_1$ are equal to the zero matrix, which leads for example to $\operatorname{tr}(A_i B_1 C_k) = 0$, which clearly contradicts Equation (40).
Product state. In this case, $B$ can be chosen as $|0\rangle|0\rangle|0\rangle$, which implies $B_0 = |0\rangle\langle 0|$ and $B_1$ equal to the zero matrix. Hence, $\operatorname{tr}(A_i B_1 C_k) = 0$ for all $i, k$ leads to the same contradiction as for the null tensor.
In all the cases which we have not immediately discarded, $B_0$ can be chosen as $|0\rangle\langle 0|$, while $B_1$ can be either $|1\rangle\langle 1|$, $|1\rangle\langle 0|$, $|0\rangle\langle 1|$ or $|0\rangle\langle 1| + |1\rangle\langle 0|$. We now want to show that none of these cases is possible. We start by decomposing the matrices $A_i$ and $C_k$ in terms of vectors $|a_i\rangle$ and $|c_k\rangle$. Since we have reduced the problem to the case $B_0 = |0\rangle\langle 0|$, we have that $\langle c_k|a_i\rangle = \operatorname{tr}(A_i B_0 C_k) = \lambda_{i,0,k}$. In particular, we have that $\langle c_1|a_2\rangle = 1$ and $\langle c_2|a_1\rangle = -1$, implying that none of these vectors can be the zero vector. Together with $\langle c_2|a_2\rangle = 0$, this means that $\operatorname{span}\{|a_1\rangle, |a_2\rangle\} = \mathbb{C}^2$, and thus necessarily $|c_0\rangle$ has to be 0, since the trace condition forces it to be orthogonal to both $|a_1\rangle$ and $|a_2\rangle$. Similarly, we have that $\operatorname{span}\{|c_1\rangle, |c_2\rangle\} = \mathbb{C}^2$ and that $|a_0\rangle = 0$.

Let us denote the matrix entries of $B_2$ as $b_{i,j} = \operatorname{tr}(B_2 |j\rangle\langle i|)$ for $i, j = 0, 1$, and let us consider the vectors $|a'_i\rangle$ and $|c'_k\rangle$. In particular, $\langle c'_k|a'_i\rangle = 0$ for $(i, k) \in \{(0, 0), (0, 2), (2, 0)\}$. Therefore, they define an orthogonal representation of $K^0_{2,2}$: by Lemma 10, either $\langle c'_2|a'_2\rangle = 0$, or one of $|a'_0\rangle$ and $|c'_0\rangle$ is zero. We can exclude the latter case, since this would imply that either $A_0$ or $C_0$ is zero, which we already know leads to a contradiction. Therefore $\langle c'_2|a'_2\rangle = b_{1,1} = 0$. In the same way, defining

We will now consider the four possibilities we have for $B_1$, driving each one of them to a contradiction, and therefore showing that no MPS representation of $\lambda$ with bond dimension 2 is possible.

1,k , and in particular tr(A 1 B 1 C 0 ) = c 0 |a 1 since |c 0 = 0. From this equation it follows that so we obtain a contradiction. 1,k , so reasoning in the same way as before we see that | a 1 = | c 1 = 0 and that | a 0 , | a 2 , | c 0 and | c 2 are non-zero, therefore reducing to the case where and since b 0,1 = 0 and b 1,0 = 0 we see that necessarily c 0 |a 2 = c 2 | a 0 = 0. Therefore |a 2 is proportional to | a 0 and similarly |c 2 is proportional to | c 0 , and so it follows that This leads to a contradiction since

B From plaquettes to entanglement structures
In this section, we present the general theory of entanglement structures underlying tensor network ansatz classes based on multipartite entangled plaquette operators and their conversions. In particular, we show how to lift conversions of plaquette states to conversions of entire entanglement structures. We start by defining entanglement structures on graphs and hypergraphs, and then discuss their conversion.

B.1 Entanglement structures on graphs and hypergraphs
In this section, we introduce the formal definitions of entanglement structure and representability of a state in terms of them. It will be natural to talk about entanglement structures defined on hypergraphs, but we will first of all restrict to graphs, as they correspond to the situation which is mostly considered in tensor network models. The following definitions should then just be seen as rephrasing the well-known concept of tensor network states in a slightly different language.
Definition 12 (Entanglement Structure (Graph)). Let $G = \{V, E\}$ be a graph with vertex set $V$ and edge set $E$, and let $w$ be an integer-valued weight function on the edge set, $w : E \to \mathbb{N}$. For each $e \in E$, let $\Omega_e \in \mathbb{C}^{w(e)} \otimes \mathbb{C}^{w(e)}$ be the maximally entangled state of Schmidt rank $w(e)$. An entanglement structure or contraction scheme w.r.t. $G$ is then given by $\Psi(G) = \bigotimes_{e \in E} \Omega_e$. We define the local virtual dimension of a vertex $v \in V$ as $D_v = \prod_{e : v \in e} w(e)$ and call $\max_{e \in E} w(e)$ the bond dimension of $\Psi(G)$. For a fixed integer $D$ we denote by $\Psi_D(G)$ the entanglement structure obtained by setting a constant weight $w(e) = D$ on the graph (which then has bond dimension $D$). We also say that a state $\phi \in \bigotimes_{v \in V} \mathbb{C}^{d_v}$ is representable by $\Psi(G)$ if there exist local linear maps $A_v : \mathbb{C}^{D_v} \to \mathbb{C}^{d_v}$ such that $\phi = (\bigotimes_{v \in V} A_v)\Psi(G)$.

We will now generalize the concept of contraction schemes to representations of hypergraphs, where the underlying entanglement structure is given by multipartite entangled states shared among all vertices that are connected by a hyperedge.

Definition 14 (Entanglement Structure (Hypergraph)). Let $G = \{V, E\}$ be a hypergraph with vertex set $V$ and hyperedge set $E$. For each $e \in E$, let $\Omega_e \in \bigotimes_{v \in e} \mathbb{C}^{D_{v,e}}$ be a pure state. An entanglement structure or contraction scheme w.r.t. $G$ is then given by $\Psi(G) = \bigotimes_{e \in E} \Omega_e$. We define the local virtual dimension at vertex $v \in V$ as $D_v = \prod_{e : v \in e} D_{v,e}$ and the bond dimension of $\Psi(G)$ as $\max_{e \in E,\, v \in e} D_{v,e}$.

Note that, contrary to the graph case, the hypergraph entanglement structure is not simply defined by weights on the hyperedges but also by the choice of multipartite entangled states $\Omega_e$ (since there exist non-equivalent multipartite entangled states, we cannot simply specify the bond dimension as in the case of graphs). As an example, note that $\mathrm{GHZ}_k(m)$ can be written as an entanglement structure on the hypergraph $H_m$ with $m$ vertices and a single hyperedge containing all vertices. In analogy to the graph case, we can still consider a hypergraph entanglement structure $\Psi(G)$ as a contraction scheme, with a state $\phi$ being representable by $\Psi(G)$ iff we can find local maps $A_v$ such that $\phi = (\bigotimes_{v \in V} A_v)\Psi(G)$.

An example of a hypergraph entanglement structure is the one considered in [21], which is used to construct quasi-injective PEPS. The vertex set is given by the vertices of the two-dimensional square lattice on $L \times L$ sites (i.e. $C_L \times C_L$), but instead of having an edge for each pair of neighbouring sites, there is a hyperedge containing the 4 vertices of each plaquette: $E = \{\{(x, y), (x+1, y), (x, y+1), (x+1, y+1)\} : (x, y) \in C_L \times C_L\}$. Finally, for each hyperedge $e$, we choose a GHZ state on 4 parties as $\Omega_e$, so that the resulting entanglement structure is given by $\Phi = \bigotimes_{e \in E} \mathrm{GHZ}_k(4)$.
The bond dimension of Φ is then simply given by the number of GHZ levels k.
Trivially, every graph is also a hypergraph, so we will state the following results in terms of hypergraphs, but the reader should be aware that they apply immediately to graph entanglement structures as well.
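To make the local virtual dimension concrete, the plaquette example can be enumerated explicitly. The sketch below assumes periodic boundary conditions (reading $C_L$ as the cyclic group of order $L$), takes the local virtual dimension to be the product of the dimensions $D_{v,e}$ over all hyperedges containing $v$, and uses hypothetical function names.

```python
from math import prod


def plaquette_hyperedges(L):
    """One hyperedge per plaquette of the periodic L x L square lattice."""
    return [
        frozenset({
            (x, y),
            ((x + 1) % L, y),
            (x, (y + 1) % L),
            ((x + 1) % L, (y + 1) % L),
        })
        for x in range(L)
        for y in range(L)
    ]


def local_virtual_dimensions(vertices, hyperedges, dim):
    """D_v = product of dim(v, e) over all hyperedges e containing v."""
    return {
        v: prod(dim(v, e) for e in hyperedges if v in e)
        for v in vertices
    }


L, k = 3, 2
vertices = [(x, y) for x in range(L) for y in range(L)]
hyperedges = plaquette_hyperedges(L)
# Every plaquette carries a GHZ state with k levels on each of its 4 parties.
dims = local_virtual_dimensions(vertices, hyperedges, lambda v, e: k)
```

On this lattice every vertex belongs to four plaquettes, so the local virtual dimension is $k^4$ even though each GHZ state only has $k$ levels.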

B.2 Conversions between entanglement structures
In order to find representations of physical states with optimal bond dimension, we will analyze how well a given contraction scheme can be expressed in terms of another. To this end, we consider the following definition, which specializes Definition 1 to the particular case of entanglement structures.
Let $G$ and $G'$ be two graphs or hypergraphs with the same vertex set $V$. Given two entanglement structures $\Psi(G)$ and $\Psi(G')$, we say that $\Psi(G)$ restricts to $\Psi(G')$, and we write $\Psi(G) \ge \Psi(G')$, if there exist linear maps $A_v : \mathbb{C}^{D_v} \to \mathbb{C}^{D'_v}$ such that $(\bigotimes_{v \in V} A_v)\Psi(G) = \Psi(G')$, where $D_v$ and $D'_v$ are the local dimensions at vertex $v$ of $\Psi(G)$ and $\Psi(G')$, respectively. The notion of degeneration specializes to the case of entanglement structures in exactly the same way as restrictions (i.e. by allowing local maps to act according to the tensor product structure defined by the vertex set $V$).

Remark 15.
Note that in the case of a path graph $L_L$ on $L$ sites and a graph entanglement structure $\Psi_k(L_L)$ (i.e. the entanglement structure of an open boundary condition MPS of bond dimension $k$), the existence of a degeneration implies the existence of a restriction. More concretely, if $\Psi_k(L_L) \trianglerighteq T$ for some $L$-partite quantum state $T$, then also $\Psi_k(L_L) \ge T$. This is due to the fact that, by sequential singular value decompositions (see [24, Theorem 1] and [22, pp. 18-20]), it is possible to construct $T$ with a bond dimension equal to the maximal Schmidt rank across any cut, $\mathrm{Sr}_{\mathrm{cut}}(T)$ (see (23)): equivalently, $\Psi_k(L_L) \ge T$ for $k = \mathrm{Sr}_{\mathrm{cut}}(T)$. Conversely, we can repeat the argument of Remark 6 for $\Psi_k(L_L)$, taking into account that we now have open boundary conditions: after grouping neighbouring sites we can convert $\Psi_k(L_L)$ to $\Psi_k(L_{L'})$ with $L' < L$, so that if $\Psi_k(L_L) \trianglerighteq T$ then necessarily $k \ge \mathrm{Sr}_{\mathrm{cut}}(T)$. As soon as there are cycles in the graph, however, this argument breaks down, and we have already seen in Appendix A that some degenerations are possible when the corresponding restriction is not.
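The quantity that governs the open boundary condition case, the maximal Schmidt rank across the cuts of the chain, can be computed as the rank of the corresponding flattenings of the coefficient tensor. The following is a minimal sketch under the assumption that the amplitudes are rational, so that ranks can be computed exactly; the function names and the example states are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank over the rationals, computed by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    if not rows:
        return 0
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def cut_schmidt_ranks(coeffs, dims):
    """Schmidt ranks across the cuts (1..j | j+1..L) for j = 1, ..., L-1.

    coeffs maps index tuples to amplitudes (missing entries are zero),
    dims lists the local dimensions of the L sites.
    """
    ranks = []
    for j in range(1, len(dims)):
        left = list(product(*[range(d) for d in dims[:j]]))
        right = list(product(*[range(d) for d in dims[j:]]))
        flattening = [[coeffs.get(a + b, 0) for b in right] for a in left]
        ranks.append(matrix_rank(flattening))
    return ranks


ghz = {(0, 0, 0, 0): 1, (1, 1, 1, 1): 1}
w_state = {(1, 0, 0): 1, (0, 1, 0): 1, (0, 0, 1): 1}
# Two maximally entangled pairs connecting sites (1, 3) and (2, 4).
crossed_pairs = {(a, b, a, b): 1 for a in range(2) for b in range(2)}
```

For `crossed_pairs` the middle cut has Schmidt rank 4, so an open boundary condition MPS for this state needs bond dimension 4 even though each entangled pair only has two levels.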

C Computational complexity of tensor-contractions
In this section, we derive estimates on the computational cost of approximately contracting PEPS networks for the two-dimensional square and the kagome lattice. We will subsequently discuss a specialized contraction strategy for the RVB state from the literature.

C.1 PEPS contraction on the kagome and square lattice
We now turn to the contraction of PEPS networks on the kagome and square lattice. In contrast to the results commonly stated in the literature, we explicitly deal with the case of non-equal bond dimensions for different virtual degrees of freedom, and in the case of the kagome lattice we also take into account different distributions of the legs in the two layers of the network. In all cases, we consider a boundary-MPS approach, where the PEPS tensors at the boundary of the network are considered as an MPS of fixed bond dimension $\chi$, to which the internal PEPS tensors, regarded as MPOs, are applied successively. All bounds are based on the estimates $C_{MM}\, D_1 D_2 D_3$ for the computation of the product of two rectangular matrices of dimensions $D_1 \times D_2$ and $D_2 \times D_3$, and $C_{SVD}\, \chi D_1 D_2$ for the truncated singular value decomposition (SVD) of a $D_1 \times D_2$ matrix to its largest $\chi$ singular values [16], with $C_{MM}$ and $C_{SVD}$ constants.

Figure 5: Approximate contraction of a PEPS network on the two-dimensional square lattice with the boundary-MPS method. The first row shows the initial step at the boundary (I) and the bulk step (II), which is repeated until the right boundary of the network is reached. For simplicity, only a single layer of the two-layer PEPS network is shown here, but each red circle in the upper row represents the two local PEPS tensors that have to be contracted along the invisible physical dimension. In the second row, the detailed contractions of both PEPS layers that are carried out in each step are depicted with their corresponding computational cost. Lines that terminate in a tensor at a given sub-step (a)-d)) represent the contractions carried out at this point, whereas lines not connected to a tensor at that level correspond to free indices. In total, the scaling is given by

Two-dimensional square lattice
Starting from one boundary of the lattice, the next row of the double-layer network is contracted into the boundary-MPS; the steps of the contraction are depicted in Figure 5.
Starting from the left boundary, the first MPO tensor (red circle) of the next row is contracted into the boundary-MPS and its bond dimension is subsequently reduced to $\chi$ via an SVD (step (I)). The cost of each step in this contraction is indicated in the second row of Figure 5. In each of the sub-steps a), b) and c), the contractions performed are indicated by lines that terminate in a tensor at that level; all other lines count as free indices. In sub-step I.a), for example, the only contraction performed is over the gray line connecting the yellow square and the red circle, whereas the remaining lines (two gray, one black, one orange) are free indices. Hence, this contraction can be seen as a multiplication of a $\chi D_1 \times D_1$ matrix (yellow square) with a $D_1 \times D_1 D_2 d$ matrix (red circle), leading to a cost of $C_{MM} \cdot \chi D_1^3 D_2 d$. The two red circles correspond to the two layers of the PEPS network. Hence, the overall cost for contracting the MPO into the boundary-MPS at the boundary is given by $C_{SVD}\, \chi^2 D_1^2 D_2^2 + C_{MM}\, \chi D_1^3 (D_2^2 + D_2) d$.
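The cost bookkeeping for the boundary step can be written down as a small cost model. The sketch below only encodes the two estimates and the formulas quoted in the text; the function names are hypothetical, and the constants $C_{MM}$ and $C_{SVD}$ are set to 1 by default as an assumption, since their values are not specified.

```python
def matmul_cost(d1, d2, d3, c_mm=1):
    """Estimate for multiplying a d1 x d2 matrix with a d2 x d3 matrix."""
    return c_mm * d1 * d2 * d3


def truncated_svd_cost(chi, d1, d2, c_svd=1):
    """Estimate for a truncated SVD of a d1 x d2 matrix to chi singular values."""
    return c_svd * chi * d1 * d2


def step_Ia_cost(chi, D1, D2, d, c_mm=1):
    """Sub-step I.a): (chi*D1 x D1) matrix times (D1 x D1*D2*d) matrix."""
    return matmul_cost(chi * D1, D1, D1 * D2 * d, c_mm)


def boundary_step_cost(chi, D1, D2, d, c_mm=1, c_svd=1):
    """Overall cost of step (I) as stated in the text."""
    svd_part = c_svd * chi**2 * D1**2 * D2**2
    matmul_part = c_mm * chi * D1**3 * (D2**2 + D2) * d
    return svd_part + matmul_part


# Example with unequal bond dimensions D1 = 2, D2 = 3.
cost_Ia = step_Ia_cost(chi=4, D1=2, D2=3, d=2)
cost_boundary = boundary_step_cost(chi=4, D1=2, D2=3, d=2)
```

Keeping $D_1$ and $D_2$ as separate arguments makes it easy to see which bond dimension dominates the cost when they differ.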
In step (II), the sub-steps b) to d) are essentially the same as the sub-steps a) to c) of step (I); however, we first have to take care of the violet tensor resulting from the SVD performed in sub-step I.c). Because the overall computational cost of step (II) upper bounds the contraction cost at the boundary, the cost of contracting each MPO tensor into the boundary-MPS tensor can be upper bounded by the cost of step (II), which agrees with the estimate for uniform bond dimension $O(\chi^2 D^6 d) + O(\chi^3 D^4)$ found in the literature [20,31,32].

Kagome lattice
The situation for the kagome lattice is very similar to that of the two-dimensional square lattice, except that more care has to be taken in how the local tensors are associated with the boundary-MPS tensors. The procedure we adopt here is depicted in Figure 6. In order to make the procedure more transparent, we first split the boundary vertices at the tip of each triangle into two lattice sites before we start the contraction procedure. Fixing the three bond dimensions in each triangle for the full lattice, we can nevertheless distinguish their distribution for upwards ($K_1, K_2, K_3$) and downwards ($D_1, D_2, D_3$) pointing triangles. In comparison to the two-dimensional square lattice, we have to distinguish three different contraction steps, depending on whether we are contracting a tensor on the top right (I), the top left (II) or in the middle (III) of a hexagon. These three steps are then repeated until the right boundary of the kagome lattice is reached.

Figure 6: Approximate contraction of a PEPS network on the kagome lattice with the boundary-MPS method. Depending on the position in the kagome lattice of the local tensor to be contracted into the boundary-MPS, three different contractions have to be performed. We allow for different bond dimensions for upwards ($K_i$) and downwards ($D_i$) pointing triangles.

Figures 7 to 9 depict the details of these three steps, breaking down every step into the explicit