Beyond Fermi’s golden rule with the statistical Jacobi approximation

Many problems in quantum dynamics can be cast as the decay of a single quantum state into a continuum. The time-dependent overlap with the initial state, called the fidelity, characterizes this decay. We derive an analytic expression for the fidelity after a quench to an ergodic Hamiltonian. The expression is valid for both weak and strong quenches

1 Introduction There are a broad range of problems in quantum dynamics which can be framed as the decay of a single state into a continuum of other states.The archetypal example is the emission or absorption of radiation by an atom [1,2].However, formally equivalent scenarios arise in the study of thermalization in isolated many-body quantum systems [3,4], heating in driven systems [5], quantum scars [6,7], quantum information [8,9], and Loschmidt echoes used to characterize quantum chaos [9].The decay of a single state can be captured by the state fidelity (also referred to as the return probability or survival probability), P 0 (t) = |⟨ψ 0 |ψ(t)⟩| 2 . (1) We consider the isolated quantum dynamics of a many-body eigenstate |ψ 0 ⟩ of H 0 under the ergodic Hamiltonian H = H 0 + JV , such that |ψ(t)⟩ = e −iHt |ψ 0 ⟩ (ℏ = 1).Many analytical methods have already been developed to estimate the fidelity in this context [9][10][11][12][13].The earliest is a result of first order perturbation theory known as Fermi's golden rule (FGR) [2] (though it was first derived by Dirac [1]).For small J and long times, FGR predicts the exponential decay of P 0 (t) with a rate where |f V (ω)| 2 is the spectral function of the perturbation V .
In this work, we use the statistical Jacobi approximation (SJA) to relax both the requirements of small J and long times.We predict fidelities in well-thermalizing quantum systems for times shorter than a cutoff when finiteness of the Hilbert space limits the fidelity decay, and for perturbation strengths J which are smaller than the total bandwidth σ E of H 0 .The perturbation may be large compared to local energy scales.For concision, we denote local energy scales by a single parameter σ ω .The log-fidelity P 0 (t) of an H 0 eigenstate decays when the Hamiltonian is quenched to H = H 0 + JV .The statistical Jacobi approximation (SJA) correctly predicts all features of the time-dependent fidelity with no free parameters both in (a) random matrix models and (b) local Hamiltonians.The leading order of time-dependent perturbation theory (TDPT, black dashed) predicts exponential decay at the FGR rate Γ GR (2).In both (a) and (b), Eq. (3) (red dashed) correctly predicts a significantly larger decay rate than FGR.Parameters: (a) As in the random matrix model of Fig. 8(b).(b) As in the mixed field Ising model of Fig. 10.The TDPT and SJA formulae are calculated using L = 16.
Our main result is a concise integral formula relating the log-fidelity log P 0 (t) to an autocorrelation function of the perturbation, Here, C + Jac (τ ) = C + V (τ ) + O(J/σ ω ) 2 is an autocorrelation function defined through the SJA.At leading order in J/σ ω , it is given by the symmetric connected autocorrelation function of the perturbation in H 0 , C + V (τ ) = 1 2 ⟨ψ 0 |{V (τ ), V (0)}|ψ 0 ⟩ c .Neglecting higher order contributions to C + Jac and replacing it with C + V in Eq. ( 3) correctly reproduces the prediction of conventional time-dependent perturbation theory (TDPT) at leading order in J/σ ω [14].
Equation (3) has no fit parameters, and accounts for both non-universal early-time dynamics, and the regime of exponential decay-where it accurately predicts corrections to the FGR decay rate (Fig. 1).The formula is derived assuming a continuous density of states, and so does not capture any feature which is due to discreteness of the spectrum.
Numerically, we test Eq.(3) in two classes of models.The first is an ensemble of random matrices, where the spectral function |f V (ω)| 2 may be chosen arbitrarily, and log P 0 (t) is finite in the limit of infinite Hilbert space dimension, N → ∞ (Fig. 1(a)).Second, we apply Eq. (3) to several well studied spin chains (Fig. 1(b)).In this context, the log-fidelity is extensive (as is the FGR decay rate Γ GR ), and, for a spin chain of length L, it is [log P 0 (t)]/L which has a finite thermodynamic limit.(In d dimensions, [log P 0 (t)]/L d .)Equation (3) shows excellent quantitative agreement with numerics in both cases, outperforming TDPT for some models.(Left) For weak to intermediate perturbations the log-fidelity initially decays quadratically, log P 0 (t) ∼ −J 2 t 2 L d , followed by linear decay, log P 0 (t) ∼ −ΓtL d (where L is the linear extent of the system).In the actual dynamics, this decay is cut off at a time t * when the log-fidelity density reaches the entropy density at the energy of the initial state, s. (Right) For strong perturbations, the cutoff time may precede the onset of linear decay, Jt * ≲ 1.Then decay of the log-fidelity appears quadratic (Gaussian for the fidelity) until the cutoff time.
The fidelity has several generic features which are also reproduced by Eq. (3) (Fig. 2).At early times, log P 0 (t) decays quadratically.For late times and weak perturbations, the decay of P 0 (t) is exponential, with a rate predicted by FGR.In any system with a finite Hilbert space, the decay of the fidelity is eventually cut off.In many-body Hamiltonians, the cutoff time can be estimated by equating the log-fidelity density and the entropy density at the energy density of the initial state.The cutoff time t * is O(1) with system size, as both the log-fidelity density and the entropy density have finite thermodynamic limits.As such, strong perturbations can push t * below the onset of exponential decay; P 0 (t) then appears Gaussian up to the cutoff time [10] (Fig. 2).The SJA formula Eq. ( 3) captures fidelity dynamics up to this cutoff time, regardless of whether decay appears exponential or Gaussian.
At the cutoff time t * and beyond, outside of the regime where Eq. (3) applies, P 0 (t) shows a dip below its limit plateau value, and a slow linear growth (or ramp) towards that limit.If the density of states contains nonanalyticities due to, for instance, a finite ground state energy, then P 0 (t) decays as a power law [10,15].
Each of these individual features can be predicted by other methods [9-11, 16, 17].Low orders of time-dependent perturbation theory accurately predict the quadratic decay of log P 0 (t).Indeed, the perturbative series can be cast as an integral transformation of the perturbation's autocorrelation function [9], as in Eq. (3).For later times and weak perturbations, few methods improve qualitatively on FGR [9,10,13,14,18].Random matrix techniques can account for late time features [9,16], including the dip and ramp after the cutoff time (also called a correlation hole) [13,19,20].To our knowledge, previous methods which simultaneously capture all time scales up to the cutoff time only exist for special models, including some random matrix models [12,13], or weak quenches [14].In contrast, our method should apply to any ergodic final Hamiltonian, and both weak and strong quenches.
The SJA provides the conceptual and analytical framework for our calculations.This method is a statistical treatment of the Jacobi matrix diagonalization algorithm, which iteratively diagonalizes the Hamiltonian by repeatedly diagonalizing 2 × 2 submatrices [21,22].The method descends from the study of resonances in disordered single-particle systems through strong disorder renormalization groups (SDRG) [23][24][25][26][27][28], and many-body localization [29][30][31][32][33][34][35][36][37][38][39][40][41].The SJA may be regarded as a kind of SDRG on Hilbert space which diagonalizes the Hamiltonian while maintaining the number of degrees of freedomsimilar to Wegner-Wilson flows [36], though different in the specific RG flow it generates.Reference [42] both framed resonances in terms of the Jacobi algorithm, and extended the method to provide dynamical predictions in prethermal systems.Reference [28] defined a similar method in the context of SDRG in long-range single particle systems, but did not identify a relation to the Jacobi algorithm.This work further extends the SJA to characterize the decay of an initial state into a continuum, which is relevant to ergodic many-body systems.That the same SJA framework may, with distinct approximations, predict dynamics in both prethermal and well-thermalizing systems indicates its broad applicability.
In Sec. 2 we summarize the Jacobi diagonalization algorithm, and related quantities which are central to our analysis.By statistically describing the action of this algorithm on an initial state |ψ 0 ⟩, we formulate a continuum flow equation for the local density of states (LDOS), the Fourier transform of which gives the fidelity (Sec.3).We solve the flow equation in Sec. 4 to leading order in the system volume, and assuming that the total density of states (DOS) may be treated as constant.The solution shows excellent agreement with numerics both in random matrix models and more realistic interacting many-body Hamiltonians (Sec.5).We summarize and discuss applications of our results in Sec. 6.
We defer a technical discussion of the J/σ E dependent corrections in Eq. ( 3) to Appendix A, where we also demonstrate a connection between the SJA and Dyson Brownian motion [43,44].

Statistical Jacobi approximation
The statistical Jacobi approximation (SJA) [42] is the primary analytical tool in this work.This approximation makes a statistical description of the Jacobi diagonalization algorithm to characterize the statistical properties of eigenstates, and hence the dynamics of a quantum system.
In Sec.2.1 we review the Jacobi diagonalization algorithm, including standard worstcase bounds on its convergence (13).In Sec.2.2 we identify and characterize two distinct regimes in which the algorithm may operate-a dense regime in which the worst-case bounds are close to being saturated, and a sparse regime where they are far from being saturated.These two regimes for the algorithm reflect distinct dynamical behaviors of quantum systems.

Jacobi diagonalization algorithm
Originally introduced in 1846 [21], Jacobi's matrix diagonalization algorithm provides a simple iterative procedure to diagonalize an N × N Hermitian matrix, H.While this algorithm is far from the state of the art in modern numerical linear algebra [22], it provides a powerful framework for the description of quantum dynamics [42].As the diagonalization procedure addresses fast degrees of freedom first, it is possible to relate the action of the algorithm to real time dynamics.
The input for the algorithm is the matrix of H in an arbitrary computational basis the model has a sparse regime [42], where w = O(1) is much larger than typical elements in its row and column (inset).Consequently, the distribution of decimated elements (15) is O(N ).(c) For late flow times (small w) a dense regime emerges, where all matrix elements are of the same scale (inset), w = O N −1/2 .Thus, the distribution of decimated elements becomes O N 2 .Rescaling by these powers of N produces data collapse in the density of log w. {|j 0 ⟩}.The algorithm diagonalizes H using a sequence of 2 × 2 rotations which zero-or decimate-large off diagonal matrix elements (Fig. 3(a)).The algorithm proceeds as follows.
1. Identify the largest (in absolute value) off diagonal matrix element of H, The (|a 0 ⟩ , |b 0 ⟩) submatrix containing this element is where E j 0 = ⟨j 0 |H|j 0 ⟩, and ϕ 0 is the complex phase of the off diagonal element.
2. Update the computational basis {|j 0 ⟩} → {|j 1 ⟩} by applying a 2 × 2 rotation between the |a 0 ⟩ and |b 0 ⟩ states which diagonalizes the (|a 0 ⟩ , |b 0 ⟩) submatrix.That is, calculating the rotation angle η 0 by the update to the basis is Other basis elements are not affected.The full update to the basis is thus where we defined a unitary rotation R 0 which implements the update.
After n iterations, the updated Jacobi basis states are As the iterations are continued, n → ∞, the basis {|j n ⟩} converges to the eigenbasis of H. Indeed, the total norm in the off diagonal of H, parameterized as (the right hand side is the mean squared Frobenius norm of a row in the off diagonal) strictly decreases in each iteration The element ⟨a n+1 |H|b n+1 ⟩ is set to zero, while the sum of squares of other elements in the off diagonal is preserved by Eq. (9).As w n is the largest off diagonal matrix element, it is at least as large as the root-mean-square, w 2 n ≥ β −2 n /N .Thus, The norm of the off diagonal converges exponentially to zero with a rate which is at least 1/N 2 (Fig. 4).If implemented numerically, this means that the matrix is diagonalized with O N 3 floating point operations, similar to other diagonalization algorithms, including the standard QR algorithm.The elements w n which are zeroed by the rotations play an important role both in the algorithm, and in our analysis.In the computer science community, these elements are called the pivotal elements.In order to emphasize a conceptual link with the renormalization group, we refer to them as decimated elements.
Indeed, the algorithm is conceptually similar to the renormalization group (RG): fast degrees of freedom (states with large matrix elements) are updated to eliminate the fast dynamics (zero the matrix element) [28].However, the Jacobi algorithm differs from the usual formulation of RG as it does not eliminate degrees of freedom.Its action on H is unitary.3(b-c), with N = 1024.While w n (grey) does not decrease monotonically with n, its typical scale does decrease (exponentially).Either reshuffling decimated elements so that they decrease monotonically (red), or averaging over a fixed number of rotations (the blue dashed curve shows an average over 2000 rotations) causes monotonic decrease by removing small fluctuations.
The quantity β (11) can be viewed as a flow time for the algorithm.In units where ℏ = 1, it has units of time, indicating a direct relationship to real time dynamics.When calculating dynamical quantities, such as fidelities, the Jacobi basis at flow time β n , {|j n ⟩}, defines an approximate Lehmann decomposition, Thus, if a controlled description of the algorithm can be maintained up to a flow time β, dynamical quantities can be predicted up to a similar timescale.In certain cases, the operation of the Jacobi algorithm may also admit physical interpretation.In Ref. [42] large rotation angles η in the Jacobi algorithm were interpreted as an indicator of resonances between different many-body states.The algorithm also generates dynamics for the energy levels E an , similar to the Dyson Brownian motion of random matrix theory [43] (Appendix A).
It will be convenient to use the size of the decimated element, w n , to parameterize the flow time.Strictly, this reparameterization requires that w n is monotonic with β n , which is not true.However, the average of w n over several rotations is a smooth, decreasing function of β n (Fig. 4).Thus, if dealing with average quantities, it is possible to treat w n as a reparameterization of the flow time.

Distribution of decimated elements
The central object characterizing the statistical properties of the Jacobi algorithm is the distribution of decimated elements, Here, |a n ⟩ and |b n ⟩ are the states involved in the nth rotation, and from the definition ρ dec (w, E, E ′ ) = ρ dec (w, E ′ , E).If we ignore the effect of the Jacobi rotations updating the elements ⟨j|H|a⟩ and ⟨j|H|b⟩ (and their transposed partners), then ρ dec is the joint distribution of Hamiltonian elements |⟨j|H|k⟩| and their associated energies E j and E k .However, because the Jacobi algorithm also affects elements other than the one it is decimating (Fig. 3), the distribution ρ dec differs from the bare distribution of matrix elements.It still satisfies That is, the second moment of ρ dec (w) is the same as the bare distribution of matrix elements.This is because all of the norm of the Hamiltonian in the off diagonal is eventually decimated, and the second moment of ρ dec (w) is the total decimated norm.
The distribution ρ dec distinguishes between two qualitatively different regimes of dynamics.In the sparse regime (Fig. 3(b)), the largest element in a row of the Hamiltonian, w, makes an O(1) contribution to the total squared Frobenius norm.Decimating one element per row decreases β −2 by an O(1) value (11), and, when still in the sparse regime, the new largest element is also smaller by an O(1) amount.The number of elements decimated in a finite interval of w values thus scales with the Hilbert space dimension, However, the worst case bounds on the Jacobi algorithm indicate that it can take as many rotations as there are matrix elements to reduce the norm by an O(1) factor, This is the dense regime (Fig. 3(c)).It emerges when all Hamiltonian matrix elements are of the same scale, and so w 2 = O(1/N ).
Ref. [42] studied the sparse regime in the context of prethermal many-body localization, and found that resonances-large rotation angles η-accounted for most observable dynamics.Other models which show slower than exponential relaxation-including power-law random banded matrices [45], the Anderson model on the random regular graph [46], and the log normal and Lévy Rosenzweig-Porter models [12,18,47]-all possess a sparse regime to which an analysis similar to that in Ref. [42] should apply.(The Rosenzweig-Porter models would require modification of the analysis in Ref. [42], as in those models the ratio of the Frobenius norm of the off-diagonal to the norm of the diagonal may vanish for large system sizes.) In this work, we focus on the dense regime.As the decimated elements w must decrease as 1/ √ N (more correctly, 1/ ν(E), where ν(E) is the density of states) the large matrix elements responsible for resonances are rarely encountered.Instead, the dense regime reflects the irreversible decay of fidelities and correlators into a continuum of states.This is the usual scenario at play when using Fermi's golden rule (FGR), and we will find that decay is always asymptotically exponential when the spectral function |f V (0)| 2 is finite.At a technical level, the large number of Jacobi rotations scrambles correlations between Hamiltonian elements, and justifies random-matrix-style approximations, in which we neglect such correlations.
In the context of predicting fidelity decay in the many-body Hamiltonian H = H 0 + JV , the initial Jacobi basis {|j 0 ⟩} is the eigenbasis of H 0 .If H 0 satisfies the eigenstate thermalization hypothesis (ETH) [4,[48][49][50][51][52], then the perturbation V is generically dense in this basis, Here, |f V ( Ē, ω)| 2 is the spectral function for V in the Hamiltonian H 0 at mean energy Ē and frequency ω, is the density of states in H 0 , and X jk = O(1) is a random number with variance equal to one.In particular, all matrix elements at a particular energy are of the same scale, implying that the Jacobi algorithm will operate in the dense regime.
If H 0 is diagonal in a local basis and the perturbation V has only few-body interactions, then the initial Hamiltonian matrix is sparse.This is the case for (prethermal) MBL [31,32,35,41,42,53].Regardless, the dense regime should emerge at large flow times β whenever H = H 0 + JV satisfies ETH.As the Jacobi algorithm comes close to diagonalizing H, the Jacobi states present almost as random vectors.The remaining off diagonal of the matrix should then be dense.

Flow of the local density of states
The Jacobi algorithm, when implemented numerically, diagonalizes the Hamiltonian to within numerical precision.An exact description of all the eigenstates is out of reach analytically, and contains more information than is required to compute fidelities in any case.Rather, only the distribution of initial wavefunction amplitudes in the eigenbasis is required.Treating the Jacobi diagonalization algorithm at the level of the statistical distribution is the statistical Jacobi approximation (SJA).
We have where |E j ⟩ is an eigenstate of the full Hamiltonian H = H 0 + JV of energy E j , and has been called the local density of states (LDOS) [10], or the strength function [3].Without loss of generality, we assume V to be off diagonal in the eigenbasis of H 0 , {|j 0 ⟩}.With some simplifying assumptions (discussed below and summarized in Table 1), it is possible to calculate the LDOS from the decimated distribution ρ dec in the dense regime.In Sec.3.1, we show that the LDOS in the Jacobi basis, is determined by its initial condition p(w 0 , E) and the flow equation  The rotations performed by the Jacobi algorithm change the basis {|j n ⟩} (top), and thus the LDOS (bottom), causing a redistribution of probability.The final LDOS, which is related to the fidelity P 0 (t) by Fourier transform, is found by running the Jacobi algorithm until w → 0. The approach to this final distribution can be described by a continuum flow equation, Eq. ( 23) is the rotation angle and we have introduced Sec. 3.1 contains several technical details, and can be skipped in a first reading of this paper.We find the solution to Eq. ( 23) in Sec. 4.

Derivation of the flow equation
The action of a single rotation performed by the Jacobi algorithm is easy to characterize.If the Jacobi basis is {|j n ⟩}, and the element w n = max jk |⟨j n |H|k n ⟩| = |⟨a n |H|b n ⟩| is to be decimated, then recall that the nontrivial update to the Jacobi basis is ( 7) where tan η n = 2w n /(E an − E bn ), and the phase ϕ n will not be important.
The corresponding nontrivial update to the probabilities which, upon replacing cos 2 ηn 2 = 1 − sin 2 ηn 2 and rearranging gives a formula for the change in the probability  53) Table 1: The assumptions and approximations made in the technical derivation of the flow equation, Eq. ( 23), and its solution, Eq. ( 54).The physical interpretation of each approximation is provided in the second column.All approximations are expected to be valid in the large volume limit of ergodic and local many-body Hamiltonians.The continuum formulation of the LDOS also limits our analysis to times shorter than the cutoff t * .
Making dn Jacobi rotations, the total update to |⟨j n |ψ 0 ⟩| 2 is given by the sum over the rotations which affect the state with label j.Indexing such rotations by m, this is The expression ( 29) is too unwieldy for an analytical treatment.It can be made tractable by using w to parameterize flow time, converting the sum on the right hand side to an integral over energy, and replacing the probabilities |⟨j m |ψ 0 ⟩| 2 with the probability density p(w m , E j ).Each of these steps requires that E jn and |⟨j n |ψ 0 ⟩| 2 be slowly varying with n to be valid.Being in the dense regime of Jacobi should imply that this condition is met-many rotations are required to appreciably change the Hamiltonian, so each rotation typically changes little.Thus E jn and |⟨j n |ψ 0 ⟩| 2 will vary slowly.

Decimated element as flow time
The decimated elements w n do not strictly decrease (Fig. 4).Treating w as a continuous parameter controlling the flow time is only possible if w n is averaged over several rotations.Thus, we must assume that |⟨j n |ψ 0 ⟩| 2 and E jn vary slowly with n.
Making this assumption, and supposing that the dn rotations reduce the decimated element from w to w − dw, the left hand side of Eq. ( 29) becomes (30) where we take dw to be infinitesimal, and |j(w n )⟩ = |j n ⟩.

Replacement of the sum by an integral
In replacing the sum with an integral, we must account for the different number of times each pair of states is rotated.This information is encoded in ρ dec .The number of rotations which are made between states of energy To find the average number of rotations which affect a single state of energy E jm in the interval [E, E + dE), we divide by the number of states in this shell, ν(E)dE, which gives ρ dwdE ′ as in Eq. ( 25).The resulting replacement of the sum in Eq. ( 29) with an integral is (32)

Replacement of probabilities with probability density
Finally, we replace the probabilities |⟨j(w)|ψ 0 ⟩| 2 with averages over small energy windows, We assume that the cross term in Eq. ( 29) is negligible compared to the p(w, E) terms, This term does not have a definite sign, so when neglecting correlations between these terms for different m, we expect their average to be a factor 1/ ν(E)dE smaller compared to the p(w, E) terms.Thus, we assume it may be ignored in the limit of infinite system size.Making all of the substitutions Eqs. ( 30), ( 32), (33), and (34), we have Multiplying both sides by ν(E) and changing variables to ω = E − E ′ in the integral recovers Eq. (23).Additionally, the energies E jn are affected by the perturbation, and are altered by the Jacobi evolution.This effect will not be important in the limit we consider in the main text.We leave a derivation and discussion of this addition to the flow equation to Appendix A.

Basic properties
The flow equation Eq. ( 23) is a linear integro-differential equation.It possesses several expected properties, which we briefly list in this section.

Conservation of probability
The flow equation conserves total probability.The rate of change of the total probability is Exchanging integration variables E ↔ E ′ in the right hand side does not change the value of the integral, but multiplies the integrand by −1.This implies that the rate of change of the total probability is zero.Indeed, we have sin

2
, while so that new form of the integral after exchanging variables is This shows as claimed.Thus and the total probability is conserved.

Large volume limit
In neglecting the cross-term in the microscopic flow equation Eq. ( 29) we have already specialized to the limit of large system volume.(More correctly, the limit of large DOS.)In this limit, and in the dense regime, the maximum matrix element w 0 should decrease exponentially with the volume.Indeed, for a spatially local H 0 which satisfies the ETH, we have for some energy E, and where S(E) is the (extensive) entropy at energy E.
The flow equation can be simplified in this limit of small w 2 0 .From we have the small angle expansion For the flow equation, this gives The combination w 2 ρ has a finite integral (Sec.4), so the right hand side makes a non-zero contribution to the final value of the LDOS, p(0, E).
The flow equation predicts a leading order correction to the LDOS which scales as w 2 /ω 2 .This structure should be familiar from the first order of perturbation theory in J, which is being recovered in this limit of the flow equation.
Note that, in making the small angle approximation for η, we have neglected any resonances.That is, large rotation angles as a result of an atypically small value of the energy difference ω.Such rotations are always present for any w 0 , but they make up a vanishing fraction of all rotations, and so may be neglected in the w 0 → 0 limit.

Uniform fixed point
The flow equation ( 23) has a fixed point given by This corresponds to the wavefunction ⟨j n |ψ 0 ⟩ having equal probability in every Jacobi basis state, so that |ψ 0 ⟩ appears as a Haar random vector.However, this fixed point lies outside the regime of applicability of the flow equation for an initially peaked LDOS.When the LDOS flows to a width comparable to the many-body DOS, we can no longer neglect the effect of the perturbation on the DOS.Further, any microcanonical state of a local Hamiltonian H 0 has a vanishing energy density variance in H, making this fixed point unphysical.As such, the existence of this fixed point solution will not be relevant to our analysis.

Solution of the flow equation 4.1 Solution for broad density of states
The flow equation for the LDOS is difficult to solve in closed form for arbitrary initial LDOS.However, upon specializing to narrow LDOS (discussed below), it can be solved exactly.We leave a discussion of corrections for broad LDOS to Appendix A.
Identifying a characteristic width in E, σ E , for ρ(w, E, ω), we solve the flow equation at the leading order in σ −1 E (which is σ 0 E ).The solution holds when the magnitude of the perturbation (J), any characteristic width of ρ(w, E, ω) in ω (σ ω ), and the initial width of the LDOS are all much smaller than σ E (Fig. 6).In many-body Hamiltonians, this is usually the case.The total energy E is an extensive variable, so the variation of ρ(w, E, ω)-which we will see is related to a spectral function of the perturbation-in E is suppressed compared to its variation in the energy difference ω.In random matrix models, this regime must be engineered as part of the model definition.(Outside of particular models, the widths σ ω,E will be used at the level of dimensional analysis, and so we do not define them precisely.) At leading order in the ratios σ ω /σ E and J/σ E , we may make the replacements where p(w 0 , E) is peaked at E = E 0 .This assumption is recovered in a more systematic expansion in derivatives of ρ(w, E, ω) with respect to E in Appendix A. With this assumption, the flow equation becomes This equation only involves convolutions of p(w, E) with a w-dependent kernel, and can be solved by Fourier transform.We define the Fourier transforms ρ(w, E 0 , τ ) = dω ρ(w, E 0 , ω)e −iωτ , and where we have abused notation by using ρ for both the density of Jacobi decimations and its Fourier transform, and similarly for p.In Eq. (48a), the Fourier transform of sin Expressing the integral over ω in terms of Fourier transforms gives where Fourier transforming Eq. ( 49) with respect to E reduces the flow equation to an ordinary differential equation: The solution to this equation is an exponential, log p(w, t) p(w 0 , t) = In the limit of large volume, the density of states diverges, ν(E 0 ) → ∞.In the dense regime, where w 0 = O ν(E 0 ) −1/2 , this implies that the initial size of the decimated element vanishes, w 0 → 0. In this limit, Eq. ( 52) simplifies further.We have (ignoring resonances as in Sec.3.2.2) Taking the initial state to be an eigenstate of H 0 , so that p(w 0 , E) = δ(E − E 0 ), we have p(w 0 , t) = e −iE 0 t .This phase does not affect the fidelity, log P 0 (t) = 2Re log p(0, t).
where we defined the symmetric Jacobi autocorrelation function and have arrived at Eq. (3), our main result.Note that we replaced the upper limit of the w integral, w 0 , with infinity.As w 0 is the largest decimated element, the density of decimated elements above this cutoff is zero.That is, ρ(w, E 0 , τ ) = 0 for w ∈ (w 0 , ∞).The integral should be read as a summation over all the Jacobi decimations.

Jacobi spectral function and autocorrelation function
The identification of C + Jac with an autocorrelation function is well motivated, as we show below.Indeed, the integral of w 2 ρ is the total norm decimated by the Jacobi algorithm between states of a given energy, where we defined a Jacobi spectral function |f Jac (E, ω)| 2 .From its definition, the Jacobi spectral function is defined as a sum of squares of matrix elements (up to factors of ν(E)), which makes its interpretation as a spectral function natural.Indeed, if we ignore the rotation of Hamiltonian matrix elements by the Jacobi algorithm, and assume the distribution of decimated elements is the same as the distribution of matrix elements in the original Hamiltonian, we have (Fig. 7) where is the usual spectral function for the operator V in H 0 , with an initial state |ψ 0 ⟩ of energy E 0 .The rotation of the Hamiltonian matrix elements causes the deviation between the Jacobi spectral function and bare spectral function.The error term in Eq. ( 58) is found by estimating the effect of these rotations at leading order.In one rotation connecting energy differences ω and ω ′ , the correction to the spectral function is sin 2 ) (as in Eq. ( 28)), and the total correction is a sum of terms like this from all rotations.In a small angle approximation, sin 2 η 2 ≈ w 2 /ω 2 .Replacing ω by its typical scale σ ω and performing the sum gives w 2 /σ ω ≈ J 2 /σ 2 ω .Estimating the scale of the initial spectral function by σ −1 ω gives the O J 2 /σ 3 ω estimate of the total error.As the Jacobi algorithm eventually decimates the entire off diagonal, the sum of squares of all decimated elements becomes the norm-squared of the initial perturbation.This implies a sum rule for the Jacobi spectral function, where ∥ • ∥ F is the Frobenius norm, and we used the definition of the flow time β (11).The symmetric Jacobi autocorrelation function is the (real part of the) Fourier transform of the Jacobi spectral function, so where is the usual connected symmetric autocorrelation function for the perturbation V in the H 0 Hamiltonian.(Recall that V was defined to be off diagonal in the H 0 eigenbasis, so the appropriate correlation function is the connected one.) In Appendix A we show that the antisymmetric part of the correlation function is related to a shift in the mean energy of the LDOS.This appears as a phase in the Fourier transform p(w, t), and does not affect the fidelity at leading order in J/σ E .

Agreement with time-dependent perturbation theory
The log-fidelity can be perturbatively computed using time-dependent perturbation theory (TDPT).1 Equation (3) correctly reproduces the leading order of TDPT when the Jacobi autocorrelation function, C + Jac (τ ), is replaced by the bare autocorrelation function, C + V (τ ).Indeed, substituting Eq. ( 61) into Eq.(3), we have Equation ( 63b) is the prediction of TDPT for the log-fidelity.Replacing the spectral function by a sum of delta functions, produces a potentially more familiar formula in terms of a sum over matrix elements,

Exponential decay at long times
For times t much longer than the decay time of C + Jac , the solution Eq. ( 3) predicts exponential decay with t.
When t is much larger than the characteristic value of τ in the integral, we may approximate and thus find provided that |f Jac (E 0 , 0)| 2 is finite.This result is reminiscent of FGR.Indeed, using Eq. ( 58) we have where Γ GR is the FGR prediction for the decay rate, Eq. ( 2).The SJA formula recovers FGR at the leading order, and accounts for all corrections in J/σ ω through the definition of the Jacobi spectral function and autocorrelation function.
If |f Jac (E 0 , ω)| 2 diverges as ω → 0, then the integral in Eq. ( 67) does not converge.This scenario should arise when the perturbation V overlaps with a slow hydrodynamic mode, and so decays asymptotically as a nonintegrable power law in time.If the integral in Eq. ( 67) is regularized by taking its principal value-that is, by using symmetric bounds on the τ integral-then the result is always finite by Eq. ( 69) in the next section.However, in this case the decay of P 0 (t) can be nonexponential asymptotically.
Note that, for a system with a finite Hilbert space dimension, the exponential decay cannot hold for all times (Sec.4.3.4,Fig. 2).

Quadratic decay at short times
The solution Eq. ( 3) shows the expected quadratic decrease in fidelity predicted by timedependent perturbation theory at early times.
To see this, we rewrite the integration kernel in the solution as As C + Jac (τ ) is even in τ , the integrals for τ < −|t| and τ > |t| cancel.Then, the nonvanishing part of the integral is for |τ | < |t|, and for short times we can expand where thus the energy variance of the LDOS.One can similarly compute higher order corrections order-by-order from the short time expansion of C + Jac .For instance, the next order term is −(J2 t 4 /12)∂ 2 τ C + Jac (E 0 , 0).

Exponential dependence on volume
In a local many-body system, the fidelity should decrease exponentially in the volume.For example, in a spin system, initial product states are orthogonal to states differing by a single spin flip.Thus, the fidelity should decrease proportionally to the probability of a spin flip occurring on any site.Equation (3) reproduces this behavior.Indeed, writing the system volume as v = O L d (where L is the linear size of the system, and d is the dimension), the perturbation V should be extensive, where V x is a (quasi)local operator, and there are O(v) terms in the sum.In turn, this implies that the connected autocorrelator of V with itself is also extensive, Because log P 0 (t) depends linearly on C + Jac (E 0 , τ ), the fidelity P 0 (t) decays exponentially in the volume, log This implies that there is a cutoff time t * which is O(1) in the volume beyond which Eq. ( 3) no longer holds.Indeed, fidelity decay will be cut off in a system with a finite density of states at a time t * such that log P 0 (t * ) ≈ −S(E 0 ), (74) where S(E) ≈ log(Jν(E)) is the entropy at energy E (with k B = 1).We have used the coupling J to fix units in the logarithm for S(E) as this roughly corresponds to the width of the LDOS (Sec.4.3.3),so that S(E) the the log of the multiplicity, as usual. 2  In contrast, our analysis assumed a continuum density of states, and so does not recover the saturation of P 0 (t).Dividing both sides of Eq. ( 74) by the volume, we have where s(E) is the (intensive) entropy density.Both sides are O(1) with the volume in a local many-body system, and so t * will also be finite in the thermodynamic limit.
The cutoff time may still be very long for weak perturbations, but for sufficiently strong perturbations may even be small enough to preempt the regime of exponential decay.The result in this case is that log P 0 (t) decays quadratically until the cutoff time t * , so that P 0 (t) decays as a Gaussian [10].

Higher order corrections
Equation ( 3) can be viewed as the leading order term in an expansion of log P 0 (t) in the small parameter ϵ = J/σ E at a fixed time t.The parameter σ E is the typical energy scale over which the Jacobi autocorrelator C + Jac (E 0 , τ ) varies in E 0 .As E 0 is extensive, we expect that ϵ will be very small in a large many-body system.
The higher order corrections in ϵ are considered in Appendix A. The leading correction is of the form where f (x, y) is quadratic in x for small x, and linear in x for large x.

Numerical verification
Numerical calculations of the fidelity P 0 (t) show excellent agreement with the SJA result in Eq. ( 3) in both random matrix models (Sec.5.1) and in local many-body Hamiltonians (Sec.5.2).
In both cases, the numerical procedure is to find an eigenstate of the unperturbed Hamiltonian H 0 at a given target energy E 0 , and then compute log P 0 (t) using exact time evolution under H = H 0 + JV .We then compare this to the prediction of the SJA, using the Jacobi algorithm to compute C + Jac (τ ).We also compare to the TDPT prediction, using the bare autocorrelator C + V (τ ).The agreement between the prediction Eq. ( 3) and numerics seems to improve upon averaging the autocorrelation function C + Jac (τ ) either over the random matrix ensemble or over eigenstates |ψ 0 ⟩.Inspecting Eq. ( 3), the average autocorrelation function should be related to an average log-fidelity, which we denote with square brackets, [log P 0 (t)].

Random matrix model
By explicitly engineering a target spectral function |f V (ω)| 2 in a random matrix model, we can obtain log-fidelity curves log P 0 (t) which show nontrivial features beyond early time quadratic decay and late time exponential decay.Equation (3) reproduces these features, in addition to the more generic early and late time behavior.
We consider N × N random Hamiltonians H = H 0 + JV with matrix elements in a computational basis (H 0 ) jk = E j δ jk , and where E j are random numbers drawn independently from a distribution with probability density function ν(E)/N , |f V (E, ω)| 2 is the target spectral function, and X jk = X kj are drawn from a standard normal distribution.The elements V jk are given by the ETH ansatz for the matrix elements of a local operator in the eigenbasis of a thermalizing Hamiltonian [4,50].The squared Frobenius norm of both H 0 and V scale proportionally to the Hilbert space dimensions N , while the bandwidth of the model is O(1).
In this section, and in Fig. 1(a), we take ν(E)/N to be uniform on [−σ E /2, σ E /2], and to be an E-independent sum of two Gaussians of width σ ω separated by 2ω 0 .For the autocorrelation function, this gives These functions are shown in Fig. 7 as black dashed curves.Our choice of ν(E) minimizes the effects of the finite σ E on initial states in the middle of the spectrum.Our choice of ω , which can be made small even for large J 2 /σ ω by increasing ω 0 /σ ω .This makes relative deviation from FGR more visible.
A state |ψ 0 ⟩ with E = 0 in H 0 is prepared by fixing one of the H 0 diagonal elements to zero.To compute the fidelity of this state in a quench to the full Hamiltonian H = H 0 +JV , we diagonalize H and compute P 0 (t) = |⟨ψ 0 |e −iHt |ψ 0 ⟩| 2 using a Lehmann expansion.We average log P 0 (t) over the random matrix ensemble.The result is shown in Fig. 1(a) and Fig. 8 for a sequence of Hilbert space dimensions N .In these random matrix models, the norm of the perturbation is proportional to N , and so [log P 0 (t)] approaches a finite limit as N → ∞ at fixed t.This limit curve shows nontrivial behavior, including damped oscillations around the overall decay of the fidelity.This is a consequence of the damped oscillations in the perturbation's correlation function, Eq. ( 79), which also appear in the logarithm of the fidelity dynamics, Eq. ( 3).
Indeed, the leading order prediction (which is reproduced by TDPT) may be evaluated in closed form.Defining where erf is the error function, and one can check that the expression is real.This prediction for log P 0 (t) reproduces the generic features expected from Sec. 4.3, and additionally shows the predicted damped oscillations (in the second line).
Even at this order, the nontrivial features of the fidelity are reproduced (Fig. 1 and Fig. 8, black dashed).However, this order of the calculation does not accurately reproduce the rate of asymptotic exponential decay-the numerically observed rate of decay differs from Fermi's golden rule.
To evaluate the higher order corrections, we numerically compute using the Jacobi algorithm.We bin the energies E an and E bn to compute the dependence on E, and average C + Jac (E, τ ) over the random matrix ensemble.The integral transform Eq. ( 3) then provides a prediction for [log P 0 (t)].(Note that using an average C + Jac (E, τ ) requires that we average the log of the fidelity.) We find that this procedure quantitatively reproduces the asymptotic exponential decay, including both the decay rate and the overall scale (which appears in the logarithm as a constant offset).The early time damped oscillations are also reproduced, though with a small error for large perturbation strengths.
Numerically implementing the Jacobi algorithm is more expensive than state of the art exact diagonalization.However, evaluating Eq. (3) using C + Jac (E, τ ) computed through the Jacobi algorithm produces curves with smaller error bars for a given number of random matrix samples.Further, the asymptotic exponential decay of P 0 (t) is visible with smaller Hilbert space dimensions, as Eq. ( 3) does not reproduce the asymptotic limit of P 0 (t) ≳ 1/N ; it allows for P 0 (t) to decay to zero.

Local Hamiltonians
In this section, we compare the SJA formula Eq. ( 3) to the numerically computed fidelity in several standard local spin chains.We find quantitative agreement, even when the unperturbed spin chain is integrable.This indicates that Eq. ( 3) is predictive for local Hamiltonians, and not only for random matrix models.While the log-fidelity is also well captured by TDPT in many models, some choices of model show significant deviation from TDPT.These deviations are still captured well by Eq. ( 3).
We consider two different one-dimensional spin Hamiltonians with periodic boundary conditions: the mixed field Ising model, and XXZ model, (where σ α i is a Pauli operator on site i) and work in the zero momentum sector of each.While the model ( 83) is believed to be chaotic for almost all parameter values [56], model ( 84) is integrable [57][58][59][60].
We diagonalize these Hamiltonains for system sizes up to L = 16 sites, and average the log-fidelity under the time evolution generated by H = H i + JV α,β for 200 states in the middle of the spectrum.We use two different perturbations: For the nonintegrable Hamiltonian H 1 , any perturbation should behave similarly, and the choices of V α and V β are not important.For H 2 , we have chosen V α such that it preserves the integrability of the model [61], and V β such that integrability is broken [62].
The results for the fidelity decay are shown in Fig. 9.They show the typical features of the fidelity: they initially decay quadratically, but then become exponential with a decay rate set by FGR.With the extensive perturbations of Eq. (85b), the FGR rate is itself extensive, and it is the log-fidelity density [log P 0 (t)]/L which has a finite thermodynamic limit (Sec.4.3.4).However, we note that we do not see convergence in [log P 0 (t)]/L for the system sizes in Fig. 9 when the final Hamiltonian is integrable, even before the saturation time t * .This is likely due to more severe finite size effects in integrable and nearly integrable models.
To compare these numerical results to the SJA prediction, we calculate the symmetric, connected autocorrelation function J 2 C + Jac (E 0 , τ ) of each perturbation in its respective initial Hamiltonian H i using Eq. ( 82).We compute the E 0 dependence by binning the energies.Equation (3) then gives the prediction for [log P 0 (t)].We additionally compare the exact dynamics to the leading order prediction of TDPT (which is reproduced by the leading order of the Jacobi prediciton).We compute the TDPT expression by explicitly performing the sum over matrix elements given by Eq. (65).For both SJA and TDPT, we use a system size of L = 16.
The initial decay of the fidelity and its subsequent exponential decay tend to be well captured by TDPT for these models and perturbations (up to the cutoff time).The SJA formula performs as well as, or better than, TDPT in each case.However, the discrepancy between the TDPT and SJA formulae is not very large.
This agreement also holds in the model H 2 + JV α at early times, which is integrable even after the perturbation is introduced.Here, it is important that we are averaging over many states in the middle of the spectrum.The behavior of an arbitrary initial state in an integrable model is not determined just by its energy density, as Eq.(3) would predict.However, sufficient averaging in a small energy window makes [log P 0 (t)] a smooth function of energy [63,64].Additionally, the log-fidelity density has a sharp feature, which may be an indication of a dynamical quantum phase transition [54,55], and beyond which neither TDPT nor the SJA appear to capture the fidelity decay.
Taking lessons from Sec. 5.1, we can engineer models where the discrepancy between TDPT and the SJA is more severe.In Sec.5.1, the autocorrelation function of the perturbation was oscillatory, which resulted in a strong renormalization of |f V (0)| 2 to |f Jac (0)| 2 .In Fig. 10, we quench the transverse field in H 1 from a small value g x = 0.3 to a slightly larger value, g x + J/K = 0.4.That is, we consider H = H 1 + JV γ with and J/K = 0.1.The initial eigenstates tend to be polarized along the z direction, due the relatively large longitudinal field and ferromagnetic coupling.Then the perturbing x field tends to precess, and produces an oscillatory autocorrelation function.As was the case The SJA formula (3) (red dashed) continues to accurately predict the log-fidelity in this case, even for local Hamiltonians.Parameters: g x = 0.3, h z = 0.8090 in the mixed field Ising model H 1 (83), with perturbation V γ = L i=1 σ x i , J/K = 0.1.The log-fidelity is averaged over 100 states in the middle of the spectrum.As finite size effects are still large at the shown values of L in this model, the TDPT and SJA formula should only be compared to the L = 16 system, for which they were calculated.
for the random matrix model, Fig. 10 shows a large discrepancy between the predictions of TDPT at leading order (equivalently, SJA at leading order) and the full SJA formula, Eq. ( 3).The latter agrees with the numerically exact evolution up to Jt ≈ 4.
This model is close to an integrable point: H 1 with g x = 0.As such, we see large finite size effects, and no convergence of [log P 0 (t)]/L with increasing L. The TDPT and SJA calculations are performed at L = 16, and SJA correctly reproduces the fidelity for that system size.TDPT seems to approximate the fidelity for L = 12, but this is a coincidence.TDPT performed at L = 12 disagrees with the L = 12 fidelity.
At all system sizes in Fig. 10, the fidelity seems to deviate from the SJA prediction well before the fidelity saturates.We leave accounting for this feature open to future work.

Discussion
A statistical description of Jacobi's matrix diagonalization algorithm-the statistical Jacobi approximation (SJA)-relates the distribution of decimated elements to observable dynamical properties.Previous work demonstrated how many-body resonances in prethermal many-body localization (MBL) were naturally captured by the SJA, with a corresponding prediction for infinite temperature autocorrelation functions [42].The finite size scaling of the distribution of decimated elements in prethermal MBL shows that it is part of the sparse regime of the SJA description, where the decimated matrix element is much larger than typical matrix elements.
Here, we have provided a statistical description of the opposite dense regime, which characterizes the long time dynamics of systems which satisfy the eigenstate thermalization hypothesis (ETH) [4,[48][49][50][51][52]. Solving a master equation for the flow of the local density of states (LDOS) under the action of the Jacobi algorithm allows time-dependent fidelities to be calculated after a quench of the Hamiltonian.These results extend beyond the regime of validity of Fermi's golden rule (FGR)-they are applicable for a wider range of time scales, and for larger quench magnitudes.Numerical results show excellent agreement with our predictions.
A natural and useful extension of our results is to predict the behavior of correlation functions in the dense regime using the SJA.Correlation functions of local observables in many-body systems tend to have more structure than the decay of fidelities, making them more difficult to calculate.For example, they reflect the locality of the system's dynamics through light cones [65] and hydrodynamics.As autocorrelation functions continue to decay when the fidelity has saturated, a statistical Jacobi description must maintain control beyond the cutoff time t * to describe their asymptotic behavior [66].
As many systems may be described in terms of the decay from an initial state to a continuum of final states, there are several experimental contexts in which our results may be useful.The decay of correlations in Loschmidt echoes can be framed in terms of fidelity [9], with experiments having probed fidelity decay in, for example, nuclear magnetic resonance experiments [75][76][77][78], trapped ions [79,80], and classical waves [81].With new techniques of efficiently measuring fidelity [82], the variety of systems and initial states for which the fidelity can be obtained experimentally is likely to increase.
A particularly interesting context for fidelity is in dynamical quantum phase transitions [54,55], where P 0 (t) develops nonanalyticities in t.The application of our formalism to such dynamical transitions and other systems where P 0 (t) is not analytic [83,84] is an open direction for future research.
Our methods and results can likely be adapted straightforwardly to periodicallydriven (Floquet) systems [85][86][87].Floquet systems show dynamical features on several well-separated time scales [5,14,88], which indicate that the SJA could be useful in understanding the behavior of state fidelities in these systems.Further, with their lack of a conserved energy, Floquet systems represent a minimal setting for understanding the behavior of correlation functions.
Our current analysis involves computing C + Jac (τ ) by implementing the Jacobi algorithm.As such, it has the same scaling of numerical effort with system size as exact diagonalization.While we have observed that C + Jac (τ ) and the SJA prediction Eq. ( 3) are more stable with increasing system size than an exact computation of log P 0 (t) at the same size, having an efficient method to compute the Jacobi autocorrelation function would represent a very useful extension to our work.
There appears to be a connection between the SJA and more conventional techniques of random matrix theory.Appendix A indicates a similarity between the Jacobi algorithm's action on diagonal elements of H and the motion of energy levels in Dyson Brownian motion [43,44,89].A precise relationship between the two techniques is currently not clear, but establishing such a connection more concretely may allow some random matrix techniques to be incorporated into the analysis of the Jacobi algorithm.For instance, the joint distribution of eigenvalues and eigenvectors has been calculated for several random matrix ensembles [43,44].If the SJA could also be adapted to track such a joint distribution, one could predict state fidelities even beyond the cutoff time t * at which our current continuum description becomes invalid.Conversely, the application of the SJA to random matrix theory is also an open direction for research.
While both the sparse [42] and dense regimes have been characterized with the SJA, interpolating between these two regimes is an open problem.In prethermal MBL [29-33, 41, 42] or near other dynamical transitions [46] systems may begin in the sparse regime, but then cross over into the dense regime.Understanding this process may reveal important insights into the nature of dynamical transitions in isolated quantum systems.

A The flow equation with broad local density of states
In this section we show a more complete derivation of the SJA flow equation for the local density of states (LDOS), including the effects of energy level motion-the changes to the energies E jn produced by the Jacobi algorithm.The complete flow equation admits an expansion in inverse powers of the energy bandwidth σ E , and allows for corrections to the result in Eq. ( 3) to be computed systematically order-by-order.This extends our results to LDOS which are broader in energy.

A.1 Energy level motion
In the derivation of the flow equation ( 23), we neglected the fact that the Jacobi algorithm alters the energy levels E jn .This effect is important when finding corrections to the flow equation.
Upon the decimation of the Hamiltonian matrix element ⟨a n |H|b n ⟩, the energy levels E an and E bn are repelled from each other, Here, ω = E an − E bn and w 2 n = |⟨a n |H|b n ⟩| 2 .The energy update can be expressed as where the last equality holds in the limit w → 0, to which we specialize, as in the main text.Recall that this is effectively a large volume limit (Sec.3.2.2).Equation (88) shows that the Jacobi algorithm induces repulsive dynamics between energy levels with kicks that are 1/ω strong.Treating the rotations as stochastic, the Jacobi algorithm thus reproduces Dyson Brownian motion [43,Chapter 4.3].Rather than stochastically modelling the kicks, we will use the distribution ρ dec (w, E, E ′ ) to encode the statistics of the rotations.
Passing to the continuum from the discrete step Eq. ( 88), as we did for the updates to the wavefunction elements, the sum over energy updates between rotation indices n and n + dn, becomes a differential update to E jn .Following an essentially identical procedure to the derivation of Eq. ( 35) gives the nonlinear ordinary differential equation for E jn .E(w n ) = E jn is then determined by the decimated element w n and the initial energy E(w 0 ) = E j 0 through the differential equation where now involves ν(w, E), the density of states (DOS) when the decimated element is w, In writing a differential equation for E(w), we are assuming that each change in energy E j n+1 − E jn is infinitesimally small.From Eq. ( 88), this assumption is valid in the w → 0 limit.
In principle, the differential equation Eq. ( 90) can be solved self-consistently for E(w) by requiring that ν(w 0 , E 0 ) be related to ν(w, E) by a change of variables from E 0 to E(w).Alternatively, a flow equation for ν(w, E) can be solved directly.This flow equation is the continuity equation implied by the conservation of energy levels, which simplifies to The right hand side is independent of ν(w, E), and so may be directly integrated to find the solution for the DOS.

A.2 The flow equation with energy level motion
As the energy levels flow under the action of the Jacobi algorithm, the LDOS flows with them.That is, as the energies E jn appearing in are moving, the distribution p(w n , E) must also move so that the wavefunction weights |⟨j n |ψ 0 ⟩| 2 are being binned in the correct energy interval.As an extreme example, if all the E jn values move one unit towards positive E, then p(w n , E) must also translate one unit in the same direction.This feature can be accounted for in the flow equation Eq. ( 23) by replacing the partial derivative with respect to w by a comoving derivative, similar to that appearing in the continuity equation for the DOS (93).The complete flow equation is

A.3 Solution to the modified flow equation
We have not found an exact solution to Eq. ( 98), but the equation admits a systematic expansion of the solution in derivatives of the spectral function with respect to E.
As before, we Fourier transform this equation to obtain an equation directly for p(w, t), which is the quantity of interest.The transformed equation is The width of the kernel K(w, ξ, t) in ξ is assumed to be narrow.In a many-body Hamiltonian, it relates to the variation in the spectral function in the energy E. As energy is an extensive variable, this variation should be suppressed in the volume.Thus, we can expand p(w, t − ξ) in the small parameter ξ.In fact, l(w, t − ξ) = log p(w, t − ξ) will be better behaved.
In terms of a typical energy scale σ E , we introduce the O(1) dimensionless time ζ = σ E ξ.Further, we introduce the rescaled kernel The solution to this equation can be calculated order-by order in ϵ, as the right hand side is smaller by a factor ϵ than the left.In more detail, define a sequence l k (w, t) for k ≥ −1 by setting the first member of the sequence to the initial condition for l(w, t), l −1 (w, t) = l(w 0 , t) = −iE 0 t, and iteratively define subsequent l k (w, t) as the solution to the initial value problem − ∂ w/J l k+1 (w, t) = ϵ dζ 2π K ′ (w, ζ, t)e l k (w,t−ζ/σ E )−l k (w,t) with initial conditions l k+1 (w 0 , t) = l(w 0 , t). (107) We show below that this implies l k+1 (w, t) − l k (w, t) = O ϵ k+1 .For small ϵ and k → ∞, we have l k+1 (w, t) ≈ l k (w, t), which implies that l k+1 (w, t) approximately satisfies Eq. (105).That is, the sequence l k (w, t) gives an asymptotic sequence approaching the actual solution.
The l 0 (w, t) member of the sequence is the solution found in the main text.Substituting Eq. (106) into Eq.(107) to find l 0 (w, t) produces the result from the main text, Eq. ( 54), but with additional terms which accounts for the drift of energy levels.
First, we observe that the energy level motion caused by the Jacobi algorithm produces an O(ϵ) correction to the density of states: ν(w, E) = ν 0 (E) + ϵν 1 (w, E), where ν 0 is the DOS of H 0 .Integrating Eq. (94), we find ν(w, E) = ν 0 (E) + ϵ We introduced typical scaling factors σ ω for ω and σ E for E to emphasize that the correction is smaller by a factor of ϵ than the leading term (assuming J and σ ω are similar).Note that the integral of correction over E is zero, as it is the total derivative of a function which decays to zero at large and small E.
The term which appears in the solution for l(w, t) is ρ(w, E, ω), which we can expand in terms of ρ0 (w, E, ω) = ρ dec (w, E, E − ω)/ν 0 (w, E) as Observe that the higher order corrections to ρ(w, E, ω) are independent of ω.This makes including the correction to l 0 (w, t) straightforward.Expanding Eq. ( 109), we have The additional integral terms are complicated, but independent of τ .Thus, all the properties from Sec. 4.3 continue to hold for the corrected solution l 0 (0, t) + l 0 (0, t) * .The early time expansion will be quadratic, and the late time behavior will be linear, though the coefficients in these expansions will be slightly modified.We also need to account for the O(ϵ) correction in ∆l 1 (w, t).Working to O(ϵ) in Eq. (115b), we have Making a short time expansion for the term in square brackets and using Eq.(108), we have = −iϵ∂ E 0 /σ E K(w, E 0 , t)

(120c)
The last line is written in terms of dimensionless combinations (the integral of K with respect to w is dimensionless).
One may expand the kernels in Eq. (120c) to obtain an expression in terms of ρ0 , but the expression as written is already sufficient to deduce the short and long time behavior of ∆l 1 (0, t).The kernel K(w, E 0 , t) has a quadratic real part and linear imaginary part at short times.At long times, both its real and imaginary parts are linear in t.It is then straightforward to check from Eq. (120c) that ∆l 1 (0, t) shares these properties.Its real part is quadratic at short times and linear at long times, while its imaginary part is linear at both short and long times.In principle, the coefficients in these expansions can be calculated from Eq. (120c) in terms of an integral involving ρ0 .
Figure 1: The log-fidelity P 0 (t) of an H 0 eigenstate decays when the Hamiltonian is quenched to H = H 0 + JV .The statistical Jacobi approximation (SJA) correctly predicts all features of the time-dependent fidelity with no free parameters both in (a) random matrix models and (b) local Hamiltonians.The leading order of time-dependent perturbation theory (TDPT, black dashed) predicts exponential decay at the FGR rate Γ GR (2).In both (a) and (b), Eq. (3) (red dashed) correctly predicts a significantly larger decay rate than FGR.Parameters: (a) As in the random matrix model of Fig. 8(b).(b) As in the mixed field Ising model of Fig. 10.The TDPT and SJA formulae are calculated using L = 16.

Figure 2 :
Figure 2: Schematic of the log-fidelity density obtained from Eq. (3) (red dashed) and exact dynamics (blue) in thermalizing d-dimensional many-body Hamiltonians.(Left)For weak to intermediate perturbations the log-fidelity initially decays quadratically, log P 0 (t) ∼ −J 2 t 2 L d , followed by linear decay, log P 0 (t) ∼ −ΓtL d (where L is the linear extent of the system).In the actual dynamics, this decay is cut off at a time t * when the log-fidelity density reaches the entropy density at the energy of the initial state, s. (Right) For strong perturbations, the cutoff time may precede the onset of linear decay, Jt * ≲ 1.Then decay of the log-fidelity appears quadratic (Gaussian for the fidelity) until the cutoff time.

Figure 3 :
Figure3: (a) Sketch of the Jacobi diagonalization algorithm.In each iteration, the largest (in absolute value) off diagonal element, w, is identified and zeroed (decimated ) by a 2×2 rotation.Both (b) and (c) show the same average distribution of log w, ρ dec (log w), calculated for the sum of an N × N random matrix drawn from the Gaussian orthogonal ensemble (GOE) and a random symmetric sparse matrix with 20 nonzero elements per row.(b) For early flow times (larger w) the model has a sparse regime[42], where w = O(1) is much larger than typical elements in its row and column (inset).Consequently, the distribution of decimated elements(15) is O(N ).(c) For late flow times (small w) a dense regime emerges, where all matrix elements are of the same scale (inset), w = O N −1/2 .Thus, the distribution of decimated elements becomes O N 2 .Rescaling by these powers of N produces data collapse in the density of log w.

Figure 4 :
Figure4: The decimated elements w n for the model in Fig.3(b-c), with N = 1024.While w n (grey) does not decrease monotonically with n, its typical scale does decrease (exponentially).Either reshuffling decimated elements so that they decrease monotonically (red), or averaging over a fixed number of rotations (the blue dashed curve shows an average over 2000 rotations) causes monotonic decrease by removing small fluctuations.

Figure 5 :
Figure 5: The local density of states (LDOS) p(w n , E) is the probability density for the initial state |ψ 0 ⟩ to be in a Jacobi basis state |j n ⟩ with energy E jn = E.The rotations performed by the Jacobi algorithm change the basis {|j n ⟩} (top), and thus the LDOS (bottom), causing a redistribution of probability.The final LDOS, which is related to the fidelity P 0 (t) by Fourier transform, is found by running the Jacobi algorithm until w → 0. The approach to this final distribution can be described by a continuum flow equation, Eq.(23)

Figure 7 :
Figure 7: The symmetric (a) Jacobi autocorrelation function C + Jac and (b) Jacobi spectral function |f Jac | 2 (blue, red) are corrections to the bare autocorrelation function C + V (62) and spectral function |f V | 2 (59) (black dashed) of the perturbation V in H 0 .For perturbation strengths much less than the characteristic width of the bare spectral function, J/σ ω ≪ 1, the difference between the Jacobi autocorrelator and the bare autocorrelator ∆C + Jac (and the difference between the spectral functions ∆|f Jac | 2 ) is small (c-d).The difference becomes significant for large perturbations.Parameters: As in the random matrix model of Fig. 8.

Figure 8 :
Figure 8: The ensemble averaged log-fidelity, [log P 0 (t)], for the random matrix model (77) is quantitatively reproduced by the SJA formula for large Hilbert space dimension N .Even relatively weak perturbations (a) can produce asymptotic decay rates which differ from FGR (black dashed), and stronger perturbations (b) have substantially altered decay rates.These corrections are captured with no free parameters by the SJA formula (red dashed), though there are small errors at early times for large perturbations (c, d).This model exhibits damped oscillations in the log-fidelity (b and inset in a).Parameters: ω 0 /σ E = 0.14, σ ω /σ E = 0.06, (a, c) J/σ ω = 1/3, (b, d) J/σ ω = 4/3.Logarithms of the fidelity are averaged over 10 4 random matrix samples, and C + Jac used in the SJA formula (3) is computed using N = 1024 and 10 3 samples.

Figure 9 :
Figure 9: The (intensive) log-fidelity density [log P 0 (t)]/L, averaged over 200 states in the middle of the spectrum of a local Hamiltonian(83,84), is reproduced by the leading order of TDPT, and by the full SJA formula.The SJA formula holds in one-dimensional chaotic models including (a, b) the mixed field Ising model, but also in integrable models such as (c, d) the XXZ model, provided an average over states is made.(c) Both TDPT and SJA only reproduce the fidelity for L = 16, which is the system size at which those calculations were performed.The presence of sharp features in the log-fidelity may indicate a dynamical quantum phase transition[54,55].Parameters: J/K = 0.2, (a, b) g x = 0.9045, h z = 0.8090, (c, d) ∆ = 0.5.

Figure 10 :
Figure10: When the perturbing operator V has an oscillatory autocorrelation function in H 0 , the log-fidelity may fail to be captured by TDPT (black dashed).The SJA formula (3) (red dashed) continues to accurately predict the log-fidelity in this case, even for local Hamiltonians.Parameters: g x = 0.3, h z = 0.8090 in the mixed field Ising model H 1(83), with perturbation V γ = L i=1 σ x i , J/K = 0.1.The log-fidelity is averaged over 100 states in the middle of the spectrum.As finite size effects are still large at the shown values of L in this model, the TDPT and SJA formula should only be compared to the L = 16 system, for which they were calculated.