Entanglement Dynamics of Random GUE Hamiltonians

In this work, we consider a model of a subsystem interacting with a reservoir and study dynamics of entanglement assuming that the overall time-evolution is governed by non-integrable Hamiltonians. We also compare to an ensemble of Integrable Hamiltonians. To do this, we make use of unitary invariant ensembles of random matrices with either Wigner-Dyson or Poissonian distributions of energy. Using the theory of Weingarten functions, we derive universal average time evolution of the reduced density matrix and the purity and compare these results with several physical Hamiltonians: randomized versions of the transverse field Ising and XXZ models, Spin Glass and SYK model. Along the way, we find general expressions for exponential n-point correlation functions in the gas of GUE eigenvalues.


Introduction
The question of emergence of thermal behavior in isolated quantum systems has attracted considerable attention since the birth of quantum mechanics [1]. Recently intensive interest from the community has been reawakened. In part it is motivated by the availability of new experimental systems (see e.g. [2][3][4][5][6][7][8][9][10][11] for recent cold atomic experiments) that enable us to probe thermalization or its absence through precise control of microscopic conditions. Studies of these experiments also deepened our theoretical understanding of the universal laws governing dynamics of generic many-body systems and constraints in the contrasting case of integrability.
There are several roads leading to thermalization in quantum systems: one is based on the eigenstate thermalization hypothesis (ETH) [12][13][14][15][16] and is supported by a body of numerical evidence. It asserts that sufficiently complex quantum system (in particular ones that are chaotic in the classical limit) have eigenstates that are essentially indistinguishable from thermal states with the same average energy. A similar claim holds for local operator averages of almost all observables. In other words, a global pure quantum state is apparently indistinguishable from a mixed, globally-entropic thermal ensemble. In this respect it is interesting to understand dynamics of the process of thermalization in generic closed quantum systems.
A different path to accessing universal features of thermalization is to assume that dynamics are governed by a random Hamiltonian which smears out all microscopic detail. Assuming some overall symmetries it can be postulated further that the randomized Hamiltonian is described by one of the traditional random matrix ensembles (GUE, GOE, or GSE). This type of setup has been implemented in several recent works [17][18][19][20][21][22][23][24][25][26][27], and most notably, [28]. In all deference, this last paper predates the current work and shares the same perspective. In it, a number of our results were already independently established, along with some interesting supplementary ones.
Among all of the recent works, quantum states converge dynamically to agree with the predictions of a thermal ensemble, and the universal quantities accompanied by this convergence have been established. Most of the above papers focus however on the asymptotic limit of very large or even infinite Hilbert spaces. The perspective of the present work is on explicit dependence on the finite system size.
While one can argue that the Hamiltonians drawn from one of the random matrix ensembles correspond to somewhat unphysical situation, we believe that our results are applicable to a large class of physically relevant models where long-range interactions dominate. For example, this is realized in central-spin type models whose Hamiltonians H CS = k J k S 0 I k describe interaction between a "central spin" S 0 and the nuclear spins I k with arbitrarily distributed couplings J k . This Hamiltonian, while integrable for arbitrary J k 's, is used to model quantum dots and NV centers [29,30] reduced BCS models and Dicke-type systems [31]. These integrable models represent a subclass of more elaborate non-integrable counterparts which can be used to describe more realistic setups [32]. The latter then are used to describe, for example, dipole-dipole interacting spin models [33] used to describe e.g. ions in a trap or nitrogen vacancy centers [34,35] which were instrumental in observing time crystals. Recently, the random Hamiltonian setup or random quantum circuits have been used for studying universal features of the out-of-time correlation function [23,26,36,37], entanglement features [27], unitary design [25] and spectral decoupling [23], as well as entanglement tsunami [38], and measurement induced phase transitions [39].
Finally, we mention the setup used in our paper can also be employed as a particular form of a model for studying stability of many-body localized (MBL) systems [40,41]. In that case our subsystem would represent an ergodic island in a non-ergodic sea (Anderson, or MBL phase).
Having in mind all these recent developments, here we consider a model of a subsystem interacting with a reservoir. We study dynamics of information transfer in this system assuming that the overall system is described by non-integrable or integrable Hamiltonians with uniformly random spectral basis, which are modeled by RMT from either Wigner-Dyson or Poissonian distributions.
The paper is organized as follows. In the next section, the technical setup of the subsystem/reservoir is explained, as well as the mathematical notation. The random matrix models provide a distribution of systems, over which we must average. This averaging is done by integrating over the degrees of freedom of the matrices, and similar to a polar coordinate system on the plane, this integral is split into two parts. The angular part is treated in section 3, and the radial part in section 4. Along the way, we find quite general procedures to average functions of reduced density matrices, and also for vertex operator-esque correlators in the field theory of random matrix eigenvalues. Then in section 5, the results of these averages are discussed, their properties highlighted, and notably they are compared to numerics by sampling the studied matrix ensembles, and to ensembles of existing famous models from (one-dimensional) condensed matter theory. In the final section, some outlook and discussion is offered.

Setup
We are interested in using Random Matrix Theory (RMT), specifically the techniques of unitary integrals and correlation functions, to gain insights in the statistics of bipartite discrete quantum systems, induced by a Gaussian ensemble of Hamiltonians.

Bipartite Systems
The setup is as follows. We consider a discrete finite full quantum system A+B consisting of a subsystem A, and a bath B, with a tensor product Hilbert space On the full space, a constant Hamiltonian, a d × d Hermitian operator or matrix, governs time evolution. Notably, this Hamiltonian is thought to be a general random interaction on H, and has no knowledge of locality or the division into A and B. The partition simply arises because the experimenter has access to A via local observables O: Hermitian operators that act as the identity on B: O = O A ⊗ 1 B . B may be thought of as much larger than A, but this is not crucial. We will repeatedly make use of the mapping of a Hermitian matrix to its eigenvalues and eigenvectors found by diagonalizing H. In this context, this is known as the radial-angular-decomposition [42]. Here E = {E j } is a vector of real eigenvalues (energies) and V ∈ U(d) is a unitary matrix encoding the spectral basis 1 .
One assumption made in this work is that at time t = 0 the subsystem and bath are brought into contact with each other, resulting in a full system product state |1 = |1 A ⊗ |1 B = |1 A ; 1 B : the constituent systems are initially pure 2 . The choices of |1 A and |1 B are arbitrary, but our notation will use a basis in which |1 is the first basis vector 3 .
At later times, by the Schrödinger equation, the state is V also diagonalizes the time-evolution operator, so |t is polynomial in V , V † . In component notation, with completed basis {|k } of H, it looks like 1 This mapping is not exactly bijective, H is overcounted because right-multiplying V by a permutation matrix or an element of U(1) ⊗d results in the same Hamiltonian. However, in the results below this will be compensated by normalization. This symmetry is larger for degenerate systems but these are measure zero in the full space, so they do not affect the statistics either.
2 Or we could imagine performing a selective measurement on A at t = 0 to the same effect. 3 |1 is not an eigenstate of H.
Tr B : h B Figure 1: Diagram of expression (5). Lines indicate matrix multiplication, with summed index below. The outer indices are split into a subsystem and bath double-index, the trace sums over the latter.
From here, we can obtain the mixed state description of the subsystem A by tracing out the bath, Then, implicitly ρ A is thus also a function of V and E. In the basis {|h A } of H A , the elements of this matrix are given by The expression is visualized in a diagram in figure 1. The summation over h B results from tracing out the bath. This is done by expanding the outer indices of ρ = |t t| into a doubleindex, not unlike the digits of a decimal number: a bath state tensored with a subsystem state |h g| = |h A ; h B g A ; g B |, and contracting with δ h B ,g B . Others, namely j, k, are internal full indices from the matrix multiplication of the diagonal decomposition, ranging over the full dimensionality of H. In what follows, we will use the Einstein summation convention on repeated indices 4 . In doing so, we remember that indices with a subscript A (e.g. g A ) are summed from 1 to d A , subscript B summed to d B and unsubscripted to d.
Local observables O A on A take the form of Hermitian operators on H A , and an expectation value can be found:ō = Tr A (O A ρ A ). This is the main utility of ρ A .
Another important quantity in this work is the purity [43]: It contains information about the entanglement between A and B: The lower the purity, the higher the entanglement. A product state like |1 gives ρ A = |1 A 1 A |, for which γ = 1. On the other hand, an entangled state cannot be written in this product form and will have γ < 1. The lowest purity is 1/d A , corresponding to the maximally mixed Though less studied than the Von Neumann entropy, it is computationally favorable because it does not involve diagonalization of ρ A or an infinite power series (the logarithm).

GUE Distribution and Measure
In random matrix theory, one treats matrices as random variables. They can be integrated, and thus averaged over, if one takes care constructing the measure and probability density function. We will use the Gaussian Unitary Ensemble (GUE) in the following. The weight on the eigenvalues will be multivariate Gaussian, With C a normalization constant and λ set to 1 in this work. The measure is given by is the Vandermonde determinant, the Jacobian of the transformation to angular-radial coordinates. DE is the product of Lebesgue measures on the eigenvalues, and dV is the Haar measure on the Unitary group U(d). Some explanatory background information, as well as the theory needed to integrate out these measures, is collected in appendix A.

Angular Integral: Unitary Average over Eigenstates
Now we make our first steps towards non-standard calculations. In the setup, we have cited a probability density function (PDF) and measure P(E)∆ 2 (E)DV DE on M, the space of possible Hamiltonians. The next step will be to choose a suitable integrand to average over this PDF. The choice we make, is the reduced density matrix ρ A (t). In this section, we will perform the equivalent of the angular integral in polar coordinates: we will average over the compact unitary group U(d). A note on notation: when an average of some f has been carried out over dV or DE we will write f . If both have been done, we will use double brackets: f . The authors would like to be clear that although the results of this section were achieved independently, most if not all were already established some years earlier in [28].

Reduced Density Matrix
Returning to expression (5) and taking time t as a fixed parameter, we first wish to average (integrate) ρ A (t) over the eigenbasis V ∈ U(d). We will average element-wise, so the result is again a (d A ×d A ) matrix. The salient observation is that each element of this matrix ρ A is polynomial in the elements of V and V † . In fact it is second order in both. As we vary V , we are interested in individual terms of expression (5) . This is handled using the theory of Weingarten functions, see appendix A for details.
In the language of equation (56), the sets of indices are: The inner product is linear, so we may write Here, S 2 is the symmetric group on two symbols, and δ I,σ(I ) := l δ I l ,I σ l . Pulling this all together, a consistent pattern emerges: the I, I -terms decouple from the J, J -terms into a product structure: they do not depend on each other, neither in the contributions due to equation (56), nor in the multiplicative factors that appear from the Schrödinger equation. This is a feature that appears repeatedly in our unitary average calculations, so it warrants some notation.
The objects R σ and Q τ can be seen as vectors of length q!, with elements indexed by the permutations of S q . In the present case, ρ A is quadratic in V and V † , so q = 2, and the former are 5 : They are determined by checking, for each σ or τ , which Kronecker-delta's are satisfied. The four possibilites are listed below. Then, Where we also define auxilliary functions which will reappear numerous times, and which holds all the time and energy dependence of the average. The latter function will be prominent in the next section.
We can take this average in (11) as a type of inner product of vectors defined through a real symmetric 6 q! × q! matrix Wg(d, στ −1 ), that only depends on d and q.
As an example, for q = 2 the Weingarten matrix takes the form: 5 As ρA is a matrix quantity, also these elements must in turn be expressed in bra and kets. Again here summation is implied, regardless of some indices being inside bra's/kets, and some inside the definition of the multi-indices. 6 Wg only has as many unique elements as there are partitions φ q, for instance, the diagonal elements are all given by Wg(Id, d), because the Weingarten function only depends on the conjugacy class of στ −1 .
Performing the inner product: The initial condition is visible in the first term and it competes with the maximally mixed state in the second term. This expression satisfies a number of consistency checks: at t = 0, χ(0) = d 2 so ρ A (t) is the inital state, and Tr A ρ A (t) = 1 ∀t, which is to be expected: integrating and tracing are linear operations and should commute, and trivially dV Tr A ρ A = 1dV = 1. A more general form, valid for entangled initial states, is found in [28].

Purity
A larger, but similar calculation can be done for an average over all eigenbases of the purity, defined in (6). For the technical details, see appendix B.
The expression for γ is again a polynomial, so we can integrate it over the whole unitary group. In this case, as ρ A was second order in V, V † , γ is fourth order, and q = 4. This means the vectors R σ , Q τ are 24-dimensional. However, the approach is the same and the result is strikingly compact 7 : Here ξ(t) is again a real function accounting for the energy and time dependence. In terms of ι(t) from (14): As a consistency check, indeed ξ(0) = d 2 (d − 1)(d + 3) for any spectrum, so γ(0) = 1, the initial state is pure. Also (19) is symmetric in A ↔ B, which is to be expected: the nonzero eigenvalues of the Schmidt decompositions of ρ A and ρ B are equal, so the purities of their reduced density matrices agree [44]. Also note, for a trivial bath d B = 1, d = d A , we have γ(t) = 1. This makes sense: as nothing is traced out, no information is lost and the state remains pure. Combining the last two observations, a trivial subsystem (d A = 1) is also always pure.

The Functions χ(t) and ξ(t)
The time and energy dependence in both main expressions of this section collect neatly into two functions, χ(t) for the density matrix, and ξ(t) for the purity. These functions deserve some attention.
Though they administer the competition between the initial information remaining in A, and it being swept into correlations with B, the χ(t) and ξ(t) remarkably don't depend on the partition A + B, only on the product d A d B = d. Examples of χ(t) and ξ(t), for spectra of Hamiltonians drawn randomly from the (λ = 1) GUE are plotted in figures 2 and 3. Physically, at t = 0, the system is pure. This coincides with all the phasors in definitions of   (19), for the spectrum of a Hamiltonian drawn from the d = 4, 5, 6 GUE, respectively. χ(t) and ξ(t) in (15), (19) being evaluated at zero: they are in phase. A high value of χ(t) and ξ(t) is thus associated with the state being pure. After some time, we expect the phasors to decohere. Then the functions drop, and we associate this with a transition to a mixed 'phase'. More quantitative statements will be made in the next sections, when eigenvalue distribution (and thus dynamics) is treated.
The spectra used to plot the figures 2 and 3 exhibit level repulsion [42], so energies are nondegenerate. Few of the oscillatory terms in χ, ξ are then coherent for long. By comparing to more erratic figures (not shown) plotted with uncorrelated energies, this level repulsion appears to be an important feature driving the rapid and sustained dying down of the functions to a stable value as t → ∞.

Radial Integral: Correlators in Energy
In the previous section, we computed the average over U(d) of a number of expressions involving the reduced density matrix. In this section, we continue this job by also performing the weighted average over the matrix ensemble of χ(t) in (15) and ξ(t) in (19). These hold the energy dependence of ρ A (t) and γ(t) , respectively. After this we have a full average over the space M of GUE Hamiltonians. This is analogous to performing the radial part of an integral in polar coordinates. In a later section, we will also consider Poissonian (uncorrelated) energies.
In fact, these calculations take the form of 2-, 3-and 4-point exponential correlators, when we interpret the energies as the positions of the particles of a gas living in one dimension [42]. As explained in subsection A.2, we will make use of the theory of Orthogonal Polynomials. (15) is a sum of d 2 terms. The strategy will be to average them each separately, to obtain χ(t) . An important simplification stems from the invariance of χ(t) under exchange of any two variables E j , E k . Specifically, it consists of a sum over j, k of terms e i(E k −E j )t , which, upon integrating out the energies, are of course all identical as long as k = j. The d remaining terms with j = k are each constant unit and average to one. Let us therefore set k = 1, j = 2 without loss of generality:

Calculation of Two-Point Correlator
In appendix B, the technique of these integrals is explained, allowing us to integrate out directly all E j with j > 2 by introducing the symmetric kernel K d . See equations (58) and (7), we evaluate the integral leaving the natural prefactor d(d − 1), Here we have defined a new symmetric matrix-valued function F (t). Its elements are given for indices 0 ≤ µ, ν ≤ d − 1: It may be noted that Tr µ a generalized Laguerre polynomial. The derivation is found in appendix C.
It appears χ(t) is a function of t 2 . For an example, in the simplest case 8 of two coupled qubits, the function looks like 9 : We have plotted χ(t) for a number of dimensions. See figure 4. The plot begins at its global maximum of χ(0) = d 2 , drops precipitously, oscillates a number of times somewhat proportional to the dimension, and slowly climbs to its stable value of d at infinity. The positions of the extrema satisfy d χ dt = 0 which amounts to finding the roots of a highdimensional polynomial. This is done via Powell's method. Solving for the position of the global minimum, the first local minimum, it appears more quickly as d increases and neatly follows the fit t min ≈ 1.93/ √ d + 0.45, found on d ∈ [2, . . . , 84]. Despite the initial maximum scaling as d 2 , the global minimum converges from below to a constant as we increase d, empirically: lim d→∞ χ(t min ) ≈ 1.908.
These techniques readily generalize to larger correlators.

Calculation of Higher Correlators
Continuing the approach above, we will evaluate the integrals needed to find ξ(t) . As a more general application, seen as a field theory of an eigenvalue gas [45], these techniques allow one to calculate vertex operators and n-point correlators in the GUE distribution. All terms in ξ(t) are of the form e i(E k +Em−E j −E l )t , however some j, k, l, m may coincide. Again, after it has been determined which indices are distinct, the actual value of each index is irrelevant for the expectation value. A different correlator will appear for choosing k = m than for k = j etc., due to signs. Upon careful inspection of expression (19), we decompose We recognize the two-point correlator from the previous section in the last pair of brackets. Also, the first term is simply the same correlator with double time. What remains are the three-and four-point correlators. In fact it will turn out that all these expressions are real, and as both three-point correlators are each others complex conjugate, they are also equal. Rather than subsituting everything into a final, cumbersome expression, we will explain how to evaluate general n-point functions, with more detail in appendix C.
The first characterization is in terms of an exponential generating function G({a m }) of dummy variables a m , enumerated by the possible phase multiples m ∈ Z. Formally: Indeed this determinant of a symmetric matrix is always real, and therefore also the resulting correlators. For any n-point correlator characterized by integers {c 1 , . . . , c n }: This coincides, in index notation, with the following 10 : Going from expression (21) to (28), we expand the (n × n) determinant using the Leibniz formula into a sum over permutations σ ∈ S n . In order to interpret this result, it is best to try an example. For instance, in the case of The term corresponding to σ = Id, with sign +1, will result in tracing each instance of F with itself: Tr F (2t) · Tr F (−t) · Tr F (−t). The term in σ = (12) will couple the first index to the second, the second to the first, and the third to itself: − Tr[F (2t)F (−t)] · Tr F (−t), just as σ = (13). σ = (123) contributes Tr[F (2t)F (−t)F (−t)], etc. All these contractions can be found by expanding the determinant in expression (27) as det = exp • Tr • log and collecting the suitable factors of a m . In the current example, we would be interested in any and all terms multiplied by exactly a 2 a 2 −1 . Replacing a 2 a 2 −1 → 2 due to the derivative ∂ ac −1 results in the answer. Take note that from n = 4, the order inside the trace will begin to matter.
The explicit forms of all needed correlators are found in appendix C. Again, ξ(t) will take the form of polynomials times exponents of t 2 . For instance, for d = 4, or 2 qubits, it works out to For d = 4, 5, 6 we have plotted the shape of ξ(t) , in figure 5. It is clear that the function begins at its global maximum ξ(0) = d 2 (d − 1)(d + 3), drops close to zero, and slowly climbs to its steady state value of 2d(d − 1). The positions of the global minima also follow an inverse square root fit, with slightly different constants than in the χ(t) case. We find, on d ∈ [4, . . . , 20]: t min ≈ 1.95/ √ d + 1.69 minimizes ξ(t) .

Final Results and Discussion
Here we state the full expressions for the two main results of this paper, and discuss their properties.

Dynamical GUE-averaged Reduced Density Matrix
The first result is the time-dependent, GUE average reduced density matrix of subsystem A. When a bipartite system A + B is coupled by a λ = 1 GUE Hamiltonian, the average local state of A is given by: with The definition of the the matrix F (t) is given in (23). This can be thought of as a GUEinduced, time-dependent ensemble on the basis states of H A . It satisfies some consistency checks, for instance Tr ρ A (t) = 1, as it should: the integrand always has unit trace. Also, ρ A (0) = |1 A 1 A |, before time-evolution, the average is always in the initial state. Also, as it appears χ(t) > 0, it is easy to see ρ A (t) > 0: it is positive semi-definite and is thus a well defined density matrix [43]. Besides the normalization (to unit trace) of the projectors, it is remarkable that this expression does not depend on the partition of (d A , d B ), only on their product.
It is interesting to visualize the competition between the projectors |1 A 1 A | and 1 A /d A in equation (31). In order to be specific, we will consider coupling a single qubit as subsystem A, to a bath with H B of d B = 2, 3, 4. See figure 6. As the bath size increases, indeed the mixed component becomes more dominant. We pointed out in the previous sections that χ(t) administers the competition between the pure and mixed states of A. When the phases that comprise χ(t) in (15) are coherent, χ(t) is large and A is pure. As they decohere, A mixes. We observe that shortly (t min ≈ 2/ √ d) following coupling A to B, the mixing becomes approximately complete. After this dip, the initial condition |1 A 1 A | resurfaces and then stabilizes to a degree. Initial information disappears rapidly, then trickles back in. At high d, the coefficient of |1 A 1 A | at t min falls off as 0.908/d 2 . However, the late time limit is Also taking d B → ∞, the mixing becomes complete instantaneously. This is to be expected: increasing the degrees of freedom of the bath without decreasing the interaction to each of them.
It is almost futile to contrast the analytic results of figure 6 with numerical simulations, so good is the agreement. See figures 7 and 8, in which three times N = 10000 randomly generated GUE (λ = 1) Hamiltonians were used to couple a qubit to the same three baths as in the analytic example. The initial product state was drawn randomly according to the Haar measure from H A and H B . For each system, ρ A (t) was calculated, and then averaged over the set of N . Of this average, the |1 A 1 A | occupation is plotted, as well as that of the other state 11 , |2 A 2 A |. To an accuracy ≈ 1/N , the cross-components vanish.
Using the same techniques, other polynomial functions of the reduced density matrix can be averaged over the GUE. We have simply treated the most obvious candidates. On that note, by linearity, the GUE-average of the expectation valueō(t) of any constant observable O A on A can immediately be found from the average reduced density matrix. Integrating and 11 The difference in range of the plots is due to the normalization of the projectors, and the fact that element occupations are the sum and difference of the projectors coefficients. In numerics, we have no access to the exact decomposition of equation (31).  tracing are linear operators, and thus commute, The matrix-valued variance of the density matrix has also been averaged over the unitary group. This is nowhere zero, as expected, with smaller elements along the diagonal and a larger ones on the first row and column. In the large d limit, all go to zero at least at the same rate as the density matrix elements. This adds credence to the average. However, the explicit expression is cumbersome and not very insightful, so it has been omitted from this work.

Dynamical GUE-averaged Subsystem Purity
The second result of this paper is the Dynamical GUE-average subsystem purity. It is the average purity of A entangled to B under collective evolution of a λ = 1 GUE Hamiltonian.
Where ξ(t) is defined in (25). This expression also satisfies a number of consistency checks: γ(0) = 1 by the setup, and also if d A = 1 or d B = 1, γ(t) = 1, as a trivial system is always pure and a trivial bath cannot entangle. In agreement with the behavior of ρ A (t) , the purity quickly moves from a pure state to its most mixed value, and then recovers somewhat slowly. We can visualize expression (35), for qubit and qutrit subsystems A coupled to varying baths. See figures 9 and 10.  The late time behaviour is readily given. Remember that in expression (25), the exponents will eventually shrink to zero, their prefactors are polynomial.
The plots teach us convergence to this limit is strong and uniform. Compare this to the trace measure average purity obtained by drawing a random vector from a d = d B · d A dimensional Hilbert space according to the Haar measure, and tracing out the d B dimensions of the bath [46], Figure 11: Numerical average purity: a qubit-qubit system for growing samples of GUE-Hamiltonians. In black the analytic γ(t) . Figure 12: Numerical average purity: a qubit-qutrit system for growing samples of GUE-Hamiltonians. In black the analytic γ(t) .
The limit of the dynamical case, (36), exhibits an extra positive term: a remnant of the initial purity.

Oscillations: Comparison to Poissonian Ensembles
We will take a moment to spotlight a peculiar feature of our dynamical averages. They exhibit oscillations, as shown in figures 4 and 6. We may count oscillations by the number of extrema. For χ(t) , containing the dynamics of ρ A (t) , the time at which extrema occur is plotted against d. See figure 13. Note that they all have a maximum at t = 0, which is omitted. Until d = 5, there is just one minimum, and in steps a new 'fold' with a maximum and minimum is added.  Each individual Hamiltonian will drive eternal oscillations, and there is always Poincaré recurrence. Their averages however are characterized by finite dimensional Laguerre poly-nomials, which form a landscape with a finite amount of 'features'. They will settle down eventually. This allows us to classify a period of pre-equilibration rigorously, as the period between the first and last extremum. After this, the system is on its way to equilibrium. This plot also portrays characteristic frequencies in the period of pre-equilibration, which stem from the Gaussian distribution of the energies. The appearance of the Laguerre polynomials can be traced back to the Vandermonde determinant: the repulsion of eigenvalues giving rise to Wigner-Dyson statistics, causes the phases to decohere in a specific way. To illustrate, consider a a very artificial ensemble where the eigenstates are still uniform in the Haar measure, but we replace the distribution of energy differences such that all levels are decoupled. This results in Poisson statistics between neighboring levels. Such level spacing is seen to describe integrable systems more accurately. Specifically, all the above calculations can be repeated with factorized eigenvalue distribution and then each absolute energy gap |E 1 − E 2 | will also follow an exponential distribution 12 , which is the hallmark of Poisson statistics for neighboring levels.
We may thus average χ(t) and even ξ(t) with respect to this, taking care to scale the distribution sensibly with dimension. The considerably simpler calculation can be found in Appendix D. The result is This function has the same limiting behavior, for t = 0 and t → ∞, but at intermediate times, decays more slowly, and exhibits no oscillations. See figure 14. In Integrable Systems the degrees of freedom are more independent, or are thought to be less highly coupled, and do not exhibit energy level repulsion. Instead the largest probability occurs at zero gap. [42]. In such systems, we expect entanglement to grow more slowly.

Bessel Function Scaling Limit
In a related work, discovered after most of our computations had been completed, we encountered a certain scaling limit of what is essentially the upper left coefficient in ρ A (t) in expression (31) [28]. This function does not diverge or vanish as d → ∞, so long as we take care to scale time by a factor √ d, equivalent to setting λ = d in the GUE distribution, decreasing interaction strength as we increase dimension. This agrees with the square root 12 The exponential distribution describes a random variable measuring the interval between instances of a Poisson process. This can be thought of as the waiting time before a random event with a constant probability. The difference of two such variables is then the time between two independent events. Assume an ordering, without loss of generality. After the first has occurred, the distribution of the time until the second is independent of the past, and is again exponentially distributed. It is true that the energies themselves are artificially all positive, however that is of no concern as in this work, and generally in physics, we are only interested in energy differences.
behaviour of the first minima. In a new time coordinate, J 1 is a Bessel function of the first kind. Additionally, we infer the limit of what we will call the 'Arbiter' of the purity: the time dependent factor that mediates between the trace average purity and its complement in (35).
Deviations from the limit are of order 1/d. The convergence is apparent in figures 15 and 16. Figure 15: Simultaneous plot of the first coefficient of ρ A (t) for various d against scaled time. Also the limit to which they converge, in black. Figure 16: Simultaneous plot the arbiter of γ(t) for various d against scaled time. Also the limit to which they converge, in black. These results were found by Fourier transforming the d = ∞ eigenvalue distribution, which is characterized by the semicircle law [42]. This invites a conjecture: after tentative calculations of Tr ρ n A for n = 1, 2, 3, we see that in the limit d A , d B → ∞, d A /d B fixed, the only term that does not scale to zero is (χ(t)/d 2 ) n , with the exception of n = 1, where due to normalization the trace is unit. Therefore we expect that Tr ρ n A (τ ) = (J 1 (2t)/t) 2n up to O(1/d) corrections, for n > 1. This could be useful for series expansions.

Numerical Comparison to Physical Models
The philosophy of this subsection is the following: we would like to compare entanglement generation in our random matrix ensembles to ensembles of more physical models. Each element of the latter type of ensemble is a well studied model, but it is characterized by unique coordinates and coupling strengths. This way, we may hope their statistical moments equilibrate to a constant state over time, while each individual instance of the model obviously does not, but instead oscillates forever. This is necessary in order to compare to the GUE and Poissonian ensembles, which show this behavior. To this end, we will need to introduce randomness. We begin from the Transverse Field Ising Model (TFIM), Transverse Field XXZchain (XXZ), Spin Glass (SG), and Sachdev-Ye-Kitaev (SYK), because they are all integrable in the certain implementations. Contrary to what is common practice, we will be looking at (extremely) low dimensional versions of these models, as that is the focus of the article in general.
For each integrable ensemble, we construct a 'Disordered twin', where the local interactions are the same, but they are no longer globally coordinated, breaking integrability and introducing more randomness. The models all feature couplings h, g which are drawn from standard normal distributions, and are in places scaled by external parameters labeled by J, B ∈ R. More randomness is introduced by rotating the set of axes in the TFIM and its disordered twin, the DTFIM, and the XXZ and its own disordered twin, DXXZ. This is achieved by choosing random elements X from the 3-dimensional representation of SO(3), and using the first column of this as the new x-axis, etc. Exact implementations are below.
When stochastic variables are given a subscript j, it means that a new element is drawn for each site j, inside one Hamiltonian of the ensemble. As always, the σ a , a = 1, 2, 3, are Pauli spin matrices, and j runs over the sites. Moving on, f are spinless Majorana Fermions. They are implemented by Jordan-Wigner transforming Pauli matrices, in a one-to-two mapping [47]. We use them in the SYK model. 13 .
The Spin glass model used, is given by: For an example of the gap statistics, a common lithmus test for the nature of the ensemble, see figure 17. Pictured are the subsequent energy gaps for the XXZ/DXXZ, as well as SYK and the GUE for comparison. It is clear that the introduced disorder has succesfully broken the Poissonian character of the XXZ, as its spectrum has changed to more resemble that of the GUE. For other ensembles, the results follow the same trend from the original to the disordered. The lower graph is the ratio of subsequent gaps, a naturally scale-invariant and robust measure for the same distinction [48]. Wherever applicable, periodic boundary conditions apply: j is taken modulo s, the number of sites, each having a 2-dimensional Hilbert space. In order to compare time evolution quantitatively, after the ensemble is sampled, the energies of each system are scaled such that the ensemble has numerical average (E i − E j ) 2 = 2(d + 1), i = j. The system is initialized in a tensor product of a Haar-measure random state from H A and one from H B . Then time evolution begins, according to H, and at each time, ρ A is computed, and finally its elements are averaged over 10000 samples from each ensemble.
As stated, the random variables are drawn from the following distributions: uniformly according to the Haar measure, and For each of these systems, we compare their evolution to the GUE and Poisson ensembles, described earlier. See figures 18. This allows us to get a sense of how well these mathematical In thick lines, also the GUE and Poissonian predictions. How well they agree varies wildly per model.
predictions work for physical models. All systems initially entangle quadratically 14 . There seems to be a general trend that the disordered versions of each ensemble have more mixing, and conversely some integrable systems do not seem to equilibrate at all, which is to be expected. For example, the J = 0 TFIM is even free, characterized by perfect harmonic occupation.
We hypothesize the reason the physical systems often do not adhere to the predictions of either average more faithfully, is that the requirement of unitary invariance of the basis is very strong, even for the most exotic ensembles of systems. When the bases of the integrable ensembles are 'scrambled', by multiplying them with a random unitary from U(d), their behavior very strongly resembles the Poissonian curves in figure 18 (Not shown). Finally, these systems still exhibit a number of symmetries leading to exact degeneracies, which may further influence the results. Notably, the SYK model is quite faithful to the GUE in many ways, hinting at a deeper connection. Such ties have been investigated, see e.g. [49].

Conclusions and Outlook
In a nutshell, this work has shown how to average functions of the reduced density matrix over the GUE, as a general approach to understanding thermalization and entanglement generation. Along the way, we have gained understanding of calculation of general vertex operators and higher point correlators in the GUE field theory. Summarizing the experience in numerical works, we arrive at the following conclusion: information contained eigenvalue statistics are not enough to classify thermalization properties, we can show that ensembles may have same energy statistics as GUE, but don't thermalize in the same way. Notably, a model such as SYK does. A natural following step would be to focus on eigenfunction and matrix elements statistics using similar tools.
A fair critique of this research is that the averages are more of mathematical than physical significance. Most of the sets of eigenvectors V ∈ U(d) are highly non-local, exotic interactions. This criticism is confirmed by the results of section 5.5. It would be insightful to take averages over ensembles that are non-uniform on U(d). Attempts have been made in this direction, using an additional weight ∝λ Tr [V, Ω][V, Ω] † , which rewards V that are in some sense 'close' to a favored Ω ∈ U(d). However, the constraints on probability distributions leave us with little power to weigh systems too undemocratically. This means in particular that as d becomes large, the corrections induced by this weight compared to the flat average vanish. All the while, the calculation becomes dramatically enlarged. More thought on this matter is welcome. Also, besides exponential and Gaussian, other distributions of energy may be interesting if they can be physically motivated.
For anyone wishing to reproduce this work, or related questions, the required Python code will be made available with the publication. Most notably, the script to perform all the exact unitary integrals is free for examination. This includes a subroutine that produces symbolic Weingarten functions corresponding to any conjugacy class, at any dimension using the Murnaghan-Nakayama rule and hook-length formula, which may be useful in general [50].
There are several points we would like to mention at the end. First, to compute time evolution of the entanglement entropy a common strategy is to employ a replica symmetry trick, S = − Tr(ρ log ρ) = −(∂/∂n)(log Tr(ρ n )| n=1 . In order to carry this out, as it requires an analytic continuation, we would need to analytically evaluate an average of an arbitrary power of the density matrix. This would require control of unitary integrals of non-integer polynomial degree. This may be within reach by means of a generating function technique. Similar perhaps to a matrix model of the Gross-Witten type [51,52]. In future work, it would be worthwhile to consider this and other techniques to perform unitary integrals. It is the authors' hope generating functions for Weingarten functions might allow us to push past polynomials of V to more general forms.

Acknowledgements
We are endebted to J-S Caux, the promotor of the first author, for his patience and flexibility allowing this child to be brought to term under his aegis. Besides this, we must mention fruitful discussion with O. Gamayun, of the same group, from whose mind specifically expressions (26) and (27) sprang, as if by divine inspiration. Finally, we are thankful for the input of Maris Ozols, to whom the first author went for advice early in this project. He made the original suggestion to us to consider the unitary integrals such as they stand in section 3. That this suggestion was in fact, in light of [28] not actually original, none of us were aware at the time.

Funding Information
The authors gratefully acknowledge the support of European Research Council under ERC Advanced grant No 743032 (DC). Additionally, this work is part of the Delta-ITP consortium, a program of the Netherlands Organization for Scientific Research (NWO) that is funded by the Dutch Ministry of Education, Culture and Science (OCW).

Appendix A Random Matrix Theory
In this appendix, we will expound upon Random Matrix Theory (RMT) and the techniques of performing matrix integrals. As stated in the setup, the idea is to model Hamiltonians as Hermitian random matrices. A Hermitian matrix ensemble can be sampled by choosing matrix elements independently at random, subject to the hermicity condition. The degrees of freedom are the real parts of the diagonal, as well as the real and imaginary parts of the superdiagonal elements.
However, in Quantum Mechanics we are not so much interested in the individual elements of H, but rather its eigenvectors and eigenvalues. We wish to change coordinates. This is analogous to converting a 2d Cartesian integral into a polar coordinates, hence the reference to angular and radial parts. It is conventional to split a uniform measure dH = d(V ΛV † ) over (d × d) Hermitian matrices H ∈ M into a product of the radial-angular form [53]: In the middle expression, each factor is the familiar Lebesgue measure on the the independent variables of the matrix element. Then on the RHS, DE := d l=1 dE l with each dE l the Lebesgue measure on an eigenvalue. dV is the symbolic Haar measure on the unitary group. The Haar measure is the unique uniform, translationally invariant normalized measure on any compact Lie group, and by definition satisfies by the transitive nature of the group. In particular, using this measure, every eigenbasis V is equally probable.
Finally, ∆(E) is the Vandermonde determinant. Its square is the Jacobian of the coordinate transformation H → (V, E), defined in (8).
We see that for uniformly distributed elements of H, the eigenvalues are spaced apart: they repel. It is quite remarkable that such correlated eigenvalues result from uncorrelated simple elements. It can be unerstood by analogy. In polar coordinates the size of the orbit of the angle scales with the radius, which is reflected in the Jacobian. Analogously, the orbit in M of action of the unitary group through the eigenvectors of H scales with ∆ 2 (E), and becomes measure zero when two eigenvalues are degenerate [42], and dH vanishes.
If we consider a weight on the matrices, we desire it to be basis independent, just like the measure. One way to accomplish this is as follows: Then if we collapse the constants to λ m = λδ m,2 , we recover the Gaussian Unitary Ensemble (GUE). This weight is particularly useful, because it is easy to calculate averages with scalar Gaussian distributions, which is now essentially what we have on the energies. With suggestive notation, we see the weight only depends on the energies, P(H) = P(E), leading to expression (7).
We demand for a well defined distribution It is known in this ensemble that the normalization C = C(d, λ) is fixed by [45] because dV is normalized by definition 15 . In practice, we can obtain matrices from the GUE as follows. The real and imaginary parts of the superdiagonal elements can each be drawn independently from a N 0, 1 2 λ −1 distribution, and the real diagonals from a N 0, λ −1 distribution, and the matrix completed by the hermicity demand. The product of their independent distributions will result in a joint probability distribution proportional to exp(− λ 2 Tr H T H ) as desired 16 . The stage is set: we have constructed a distribution and are still free to choose any function to average over it. To simplify equations in this work, we will fix λ = 1. We may do this 15 Technically, expression (50) and below should also incorporate the overcounting factor (d!) −1 (2π) −d [42] from the decomposition in (1), but this would simply be canceled again in the definition of C. 16 Computationally, one can also generate two auxiliary real matrices A1, A2 of (d × d) i.i.d. N (0, λ −1 ) random variables, and construct H = 1 2 (A1 + A T 1 ) + i(A2 − A T 2 ) and it will have the desired statistical properties. [54] because λ merely rescales time. We can understand this in the following way. The functions whose expectation value we seek come from the Schrödinger equation and will always have time multiplying differences of energy. These yield calculations of the form for f some function in general depending on all t(E k − E j ), j, k ∈ {1, . . . , d}. Rescaling the standard deviation of energies, √ λ for the GUE, is thus equivalent to dilating time. To restore the spread, one can substitute t → t/ √ λ in the final result 17 . From this observation, the only truly 'free' parameters in this research of any consequence are d A and d B .

A.1 Nontrivial Unitary Integrals
Moving on to solution techniques, the integrals over energy (eigenvalues) can be performed in a standard way from complex analysis. By contrast, the integrals over the unitary group will require more modern machinery. Viewed as d × d matrices, there are standard identities for integrals of polyvariate monomials of matrix elements of V ∈ U(d) of degree q and their Hermitian conjugates. They involve Weingarten functions [55] [56].
Here we have made the identification of the ordered set I = {i 1 , . . . , i q } and similar for I , J, J , and the delta function δ I,σ(I ) is only satisfied when the full sets agree element-wise. From this equality, any polynomial integral is found by linearity.
We have also referenced the symmetric group of permutations of q symbols, S q . We will use notation that S q consists of all bijective maps {1, 2, . . . q} → {1, 2, . . . q}. It is implied that any integral such as V 1,1 dV that does not contain an equal number of factors of V and V † will vanish by symmetry, but they will also not arise in this work.
The Weingarten functions Wg(d, σ), which can be calculated to any dimension using representation theory of S q , in fact depend on the dimension d, and only on the conjugacy class of the permutation σ, not the specific element. As can be learned from any text on finite group theory, the conjugacy classes in S q are determined by cycle type, characterized by a partition φ = {φ 1 , . . . , φ l } of the integer q, satisfying φ 1 ≥ φ 2 ≥ . . . φ l ≥ 1 and l j=1 φ j = q. Then we write φ(σ) q. This means, for each q, there are finitely many Weingarten functions, which are quotients of polynomials in d. In this paper, we will only need the functions for q = 2 and q = 4. The denominator is q−1 z=0 (d 2 − z 2 ) for both cases 18 . The numerators 17 Rigorously, first substitute t = √ λτ in (55), and then absorb √ λ in the integrand by substituting √ λEj = xj. λ will drop from the Gaussian, and the powers of λ from the measure and Vandermonde determinant will cancel against those from the modified normalization constant C. τ is now rescaled time.
18 Caution: this structure does not persist exactly in higher q.

A.2 Orthogonal Polynomials and Symmetric Kernels
In section 4, it will be imperative to calculate the expectation value of functions that depend on energy over the distribution P(E)∆ 2 (E)DE from the GUE. We will be interested in large d behaviour, so straightforward evaluation of the d-dimensional energy integral is prohibitively complicated due to proliferation of factors in ∆(E). Luckily we will in practice be in need of f of functions f (E 1 , E 2 , . . . , E n ), with n ≤ d, where n does not scale with system size. Such an average is naturally evaluated in the basis of Orthogonal Polynomials (OPs). This is an indexed family of univariate polynomials p µ (x), µ ∈ {0, 1, . . .} together with a weight function w(x). OPs obey a form of orthogonality according to the inner product We cite from Forrester [45], due originally to Dyson [58]: for n ≤ d, where the symmetric kernel K d (x, y) is given by The final equality is known as the Christoffel-Darboux formula [45]. This allows us to work with a normalized expression where all but n dimensions have already been integrated out. It reduces the d dimensional integral to an n dimensional one, which will prove crucial. One more ingredient is needed for the determinant above. When taking the limit y → x, expression (59) gives In our case, the independent variables become {x j } → {E j } and the weights are Gaussian, w(x j ) → exp (− 1 2 E 2 j ). Indeed, then the denominator of (58) is just the normalization defined in (54). The OPs corresponding to the standard Gaussian distribution are the probabilist's Hermite polynomials, where we will keep the notation {p µ } instead of the conventional {H µ } to avoid confusion with the Hamiltonian. p µ is a monic polynomial of degree µ. Elementarily, these satisfy and A related set of polynomials are the generalized Laguerre polynomials, which have two indices.
They will become relevant through the useful identity [59]: Here w(x) is still the standard Gaussian. Of the Laguerre polynomials, these identities will be needed [59]: (65)
From this expression, we populate the multi-indices needed for expression (56).
As before, the terms dependent of I, I decouple from those of J, J . We can construct new R σ , Q τ , which are similar to those used averaging the density matrix, although now they are indexed by permutations of S 4 , and are thus 4! = 24 dimensional. Also, for scalar purity, these are all scalar expressions. Then our final answer will take the form with For the results, see table 2: Table 2: We will go through the derivation and definition of terms briefly. Starting with R σ , a look at (68) tells us any permutation σ taking the pairs {1, 3} → {1, 3} ⊂ {1, 2, 3, 4} forces all indices to 1, and the sum in (70) is trivially unit, Next, as g B is left free. By the same token, any σ that maps 1 → 2 ∨ 3 → 4 will leave a single free bath index.
Moving on to Q τ , we note that J = J , so δ J,τ (J ) = δ J,J = δ j,j δ k,k δ l,l δ m,m , and the first element factorizes.
Where we are reminded of the definition of χ(t) in (15).
with τ producing δ j,m ∨ δ l,k ∨ δ l,m , the same results: Making use of definition (14), we have the pair On to the first three-cycle, τ = (123). Here, as with all three-cycles, 3 out of 4 indices are equated, such as j = k = l, and the 4 th , h is left free. Always, two of the three equated indices kill their energies in the exponent, i.e. E k − E j − E l → −E j , and the 4 th , +E m has opposite sign. All three-cycles contribute a factor: Analogously, all four-cycles equate all indices j = k = l = m. Three summations kill four Kronecker-deltas, as the last is redundant (cyclic).
Continuing, in τ = (12)(34), two Kronecker delta's cancel the whole exponent, leaving two free summations: And finally, We notice that exchanging indices 2 ↔ 4 in σ has the effect of exchanging A ↔ B in R σ , or if the permutation is invariant, R σ = 1. Conversely, this exchange in τ always leaves Q τ invariant. This advises us that any multiple of d A in the final expression will also be present in d B . The straightforward way to find the final expression, is by constructing the Weingarten matrix.
In the basis index order Id, (123), (132), . . . , (1432) defined by table 4, the first few rows and columns of the Weingarten matrix look like Performing the inner product was done symbolically by computer.
In the last equality, we have defined another function ξ(t) to hold the time and energy dependence.
Another identity for ξ(t), which is computationally favorable, is printed in (19). At present the origin of the terms (d ± z), z ∈ R in expressions such as (86) is unclear to the authors, however their appearance is fortuitous: they cancel neatly with the denominator. The final result is then equation (18).

B.1 Third power of the Density Matrix
Using similar techniques, any power of the density matrix is obtainable in principle. However, already for the third power, (6!) 2 = 518400 terms must be calculated and added. To this end, a Python routine was created that constructs R σ and Q τ , and performs the inner product. For a copy of this routine, please contact the first author. The result is printed below without derivation.
Recalling ι(t) from (14), Using this, the trace of the third power of the reduced density matrix averages to . (89)

Appendix C Derivation of n-Point Correlation Functions
In this appendix, we elaborate on the details of deriving expression (22) and what follows.
To begin, we investigate the emergence of the generalized Laguerre polynomial L (23). An important component is the Fourier transform of the (diagonal element of the) symmetric kernel in (60).
Here we have completed the squares in the exponent and made the substitution of integration variableẼ 1 := E 1 − it. Then technically the integration domain is shifted up by a distance t into the complex plane, to a line parallel to the real line. But we observe that the integrand dies at infinity and moreover has no poles. Then the domain can be deformed continuously back to its original position. Employing the first of expression (62) and subsequently (64) which expresses an integral of shifted Hermite polynomials in terms of generalized Laguerre polynomials, we arrive at In the last equality we made use of the two standard identities (65). This expression is manifestly a function of (t 2 ), thus the sign of time doesn't matter. We continue by observing This calculation will also prove useful in the larger correlators needed for the average purity. But first we move on to the origin of F (t). Consider completing the square again and using the same substitution ofẼ 1 1 We have also substituted the first equality of (59). From here, we construct by again invoking (64), leading immediately to definition (23). From there, and we have a full derivation of expression (22). Incidentally, this is the squared Frobenius norm of F (t). Let us examine the matrix function. F µ,ν (t) ∈ R for µ + ν even, and F µ,ν (t) ∈ I for µ + ν odd. However its trace and traces of products of F (t) will turn out to be all real.
Miraculously, this definition of F (t) allows us to reconcile the diagonal elements of the symmetric kernel with the off-diagonal, seeing as immediately from (95) and (66), Tr F (t) = . Moving on, F (t) is defined in such a way that This allows us to quickly check the normalization. For t = 0, a look at expressions (22) and (58) tells us that the integral in the numerator should equal that in the denominator. Indeed, setting t = 0 there gives as desired.

C.1 Three-and Four-Point Functions
In this subsection, we will continue the work above, and explain the derivation of the correlators that comprise the expectation value ξ(t) . Before we start though, with specific integrals, we will emphasize the pattern distilled from the previous calculation. The determinant of kernels in (58), by the Leibniz expansion, is a sum of products. Each product has factors of the form K d (E j , E k ), which can be seen to 'couple' E j and E k . There are also factors of the form e ic l E l t , for c l ∈ {±1, ±2} coming from the integrand. Each energy appears exactly twice in a kernel and once in an exponent. If j = k, energy E j is coupled to itself, and after integration results in a factor If j = k, then energies E j and E k are coupled to each other, resulting in factors We again view these F µ,ν (c j t) factors as symmetric (d × d) matrices 21 , µ, ν ∈ {0, 1, . . . , d − 1}, and the coupling as matrix multiplication. The string of coupled matrices closes on itself: it forms a loop. This is accounted for by tracing over the matrix product. So dropping the indices, for instance and the generalization to larger loops is evident. Trivially, from (22) For higher correlators, all that is left to do is to expand the determinant and collect like terms 22 . The three-point correlator was included in the main text above, in expression (29).
For the four-point correlator, at times the order of the non-commutative coupling will be important.
In general, this procedure is described by equation (28).

C.2 Derivation of the Generating Function
In this subsection, we will prove that the generating function G({a m }) in equation (26) indeed procedurally generates equation (28). In the latter, let us say for concreteness we are looking for an n-point correlator characterized by the sequence {c 1 , c 2 , . . . c n }, which in turn consists of l < n distinct elements {m k } with respective multiplicities {n k }. I.e. l k=1 n k = n. This is done using the Leibniz formula, notationally making use of the totally antisymmetric Levi-Civita tensor .
Anticipating the final result, we will focus our attention on the terms in this large product that have exactly a prefactor j a c j = k (a m k ) n k . For simplicity, first assume all c j distinct, or all n k = 1. We will later relax this assumption.
As a combinatoric exercise, we can imagine drawing the term containing a c j in the infinite series from any of the many factors (δ+ m a m . . .), i.e. corresponding to any ν. Once we have drawn one term from every factor, their product is one final term in the expanded product. Let us define ν j to be the index of the factor where we select a c j . Then for any n distinct values {ν 1 , . . . , ν n } ∈ {0, . . . , d − 1} ⊗n , we will obtain one term with the correct prefactors.
This second product of delta functions, after contraction, has the effect of setting the ν th index of the Levi-Civita tensor to ν. For example, if n = 2, d = 5, ν 1 = 0, ν 2 = 3, the corresponding term would be µ 0 ,1,2,µ 3 ,4 a c 1 F 0,µ 0 (c 1 t)a c 2 F 3,µ 3 (c 2 t). Aside from the dummy variables a c j , this is simply the antisymmetrization over the remaining free indices, reducing the Levi-Civita symbol to n effective dimensions, or the term to an n × n determinant.
We can now generalize to degenerate c j : if some c j = c j , j = j , then in fact some of the terms in (106) correspond to the same combination in (105), and we must compensate for overcounting by dividing by l k=1 n k !. And finally, any determinant of a matrix with duplicate columns vanishes, so we may promote the sum to include also sets of nondistinct {ν j }: these terms vanish regardless. Then we have arrived at the most general which agrees with expression (27).

Appendix D Poisson Statistics Comparison
To contrast GUE-, also called Wigner-Dyson statistics of eigenvalues, there are Poisson statistics. Either is characterized in terms of the distribution of gaps between consecutive energy levels. For the former, as we have seen, levels repel due to the Vandermonde determinant and so the probability to find any E j approach E j + 1 vanishes. The exact distribution of the gap |E 1 − E 2 | is not known 23 to arbitrary d > 2. By contrast in Poisson statistics, we don't necessarily know the distribution of single energies, but the gaps are weighted by their size according to the exponential distribution: The distribution is defined on R >0 and µ > 0 is a free parameter. We are curious how the behavior of χ(t) P compares to that of χ(t) GUE , averaged under different distributions of energy gap. Luckily χ(t) only depends on differences of energy. An exponential distribution of gaps emerges if the probability to encounter a level is constant over R and does not depend on the levels before or after it: all levels are uncorrelated. Therefore we posit that this exponential behavior also persists between any two levels, not just adjacent ones, albeit with a different µ. Then all terms in χ(t) have the same average, and analogous to the previous calculations, we simplify This gap distribution can be achieved by taking a product form joint PDF of exponential distributions on each separate energy.
As µ > 0 by construction. In order to interpret this result fairly, we must choose a value for µ. Of the exponential distribution, it is known that average is 1/µ and the variance is 1/µ 2 . These readily yield This is useful, because we will set the parameter µ = µ(d) such that this second moment of the gaps agrees between the exponential distribution and the GUE, where the statistics also depend on d. Ideally, we would equate the first moment, but E 1 − E 2 GUE = 0 trivially and |E 1 − E 2 | GUE is not known.
A straightforward correlator calculation tells us, We know dxK d (x, x) = d and it is also clear by symmetry that dxK d (x, x)x = 0, by comparison to d · E 1 GUE . Using these observations and substituting an alternative form of the kernel [45], we transform to In the last term, due to the orthogonal polynomials, the cross terms of the square vanish. We are left to use the inner product in expression (61): Furthermore, from (60), we use properties of Hermite polynomials in (62) repeatedly to modify In the penultimate equality, we used orthogonality of the polynomials to discard most of the terms, and integrate the rest. Putting together (115), (116) and (117), we find We scale the Poisson statistics to agree on this value 24 , 2/µ 2 = 2(d + 1) ⇔ µ = (d + 1) −1/2 , so finally the Poisson average χ(t) is given by equation (39).
For sufficiently idealized ensembles of integrable systems, meaning specifically the model's eigenbases are uniformly Haar distributed over the Unitary group, we expect the subsystem dynamics to follow this time-evolution.
This can be used to describe the time-dependent purity of ensembles of integrable systems.