Dynamical Instantons and Activated Processes in Mean-Field Glass Models

We focus on the energy landscape of simple mean-field models of glasses and analyze activated barrier-crossing by combining the Kac-Rice method for high-dimensional Gaussian landscapes with dynamical field theory. In particular, we consider Langevin dynamics at low temperature in the energy landscape of the spherical $p$-spin model. We select as initial condition for the dynamics one of the many unstable index-1 saddles in the vicinity of a reference local minimum. We show that the associated dynamical mean-field equations admit two solutions: one corresponds to falling back to the original reference minimum, and the other to reaching a new minimum past the barrier. By varying the saddle we scan and characterize the properties of such minima reachable by activated barrier-crossing. Finally, using time-reversal transformations, we construct the two-point function dynamical instanton of the corresponding activated process.


Introduction
Rough high-dimensional energy landscapes are central in many different contexts. In physics, they are one of the key ingredients of the theory of glasses, and more generally of disordered systems [1]. In computer science, they are studied to characterize algorithmic phase transitions for inference and signal processing [2], and they have attracted a lot of attention in the field of deep neural networks [3]. In biology, they appear in analysis of evolution and in the study of protein folding [4,5]. In all these disparate contexts the system under study explores a rough landscape by stochastic dynamics, and the main aim is to characterize the complex dynamical behavior that ensues from it. The mean-field theory of glasses and spin-glasses has been instrumental in this respect. It provided the first quantitative analysis of rough high-dimensional landscapes [6], in particular of the number and the properties of the critical points, and of the associated stochastic dynamics [7]. It was shown theoretically that starting from a random high-energy initial condition mean-field glass models very slowly approach metastable states, which are typically the most numerous ones and are marginally stable, meaning that their Hessian matrix is characterized by arbitrary small eigenvalues [8]. The intuitive explanation of this phenomenon is that these metastable states, called the threshold states, are the most numerous and the wider ones, and hence naturally correspond to the largest basins of attraction. This paradigmatic behavior has been applied and transposed with success in a variety of contexts in the last twenty years, in particular to explain the glassy dynamics of three-dimensional interacting particle systems (super-cooled liquids, colloidal glasses, etc.) [1]. Calling N the dimension of the energy landscape, which in physics contexts is proportional to the number of degrees of freedom, mean-field glass models display two dynamical regimes: a slow descent regime that corresponds to time-scales that do not diverge with N in which the system approaches (or more precisely ages toward) the threshold states. An activated regime in which the systems jumps over increasingly larger barriers and is able to explore fully the energy landscape. To observe such activated processes in mean-field models one has to probe the dynamics on time-scales which are exponentially large in N since the barriers between low energy metastable states scale as N [9,10,11,12]. Whereas a theory of the first dynamical regime has been progressively developed in the last twenty years, constructing a theoretical framework to understand the second one remains an open problem-a central one in many of the contexts in which rough energy landscapes play a role. The main reason for this state of affairs is that activated dynamics is well understood mainly in low dimensional cases, where the number of minima and of saddles connecting them is finite and possibly small. The standard methods to tackle this problem were developed quite independently in statistical physics to analyze the phenomenon of nucleation at firstorder phase transitions [13], in quantum field-theory for tunneling between degenerate vacua (where Planck's constant plays the role of temperature) [14], and in probability theory [15]. On the contrary, rough high-dimensional energy landscapes are characterized by diverging (exponentially in N ) number of paths that connect a diverging number of metastable states. In this case standard frameworks are not adapted, and new ideas and methods are needed. The main technical difficulty in establishing a theory of activated processes for mean-field glassy systems and high-dimensional rough landscapes is that the correct order parameter that describes glassy dynamics is the correlation function between two different times [7]. This is different with respect to standard phase transitions where one instead focuses on onetime (or point) functions. In consequence, contrary to known situations in which to describe an activated process one has to find the rare trajectory, called instanton, that connects two minima and that corresponds to the optimal change of the one-point function corresponding to the order parameter [16], in this case one has to find the instanton on a two-point function. This is a quite different, less intuitive and more complex mathematical object. Henceforth, in order to highlight this difference, we will call it dynamical instanton. Although some results have been given in the past literature [17,18,19,20], the problem of finding the dynamical instanton corresponding to the activated jump out of a given minimum of the energy landscape of a mean-field glass model remains a largely unsolved challenge. Here we provide the first computation of such dynamical instanton and characterize the properties of the new minima reached after barrier-crossing. In order to achieve this goal we make use of the results we obtained recently on the number of the stationary points constrained to be at fixed overlap q (or distance d) from a given minimum in a prototypical energy landscape [21,22] (see also [23] for a related analysis). These studies showed that, given an arbitrary local minimum s 1 of the energy landscape with energy density 1 , the landscape in its vicinity is populated by rank-1 saddles, that constitute available escape states when the system is trapped in s 1 . By extending to dynamics the theoretical framework developed for the high-dimensional Kac-Rice method (see also [24]), we derive dynamical equations describing the evolution of the system conditioned to start from such unstable saddles as initial states. By analyzing these equations analytically and integrating them numerically, we show that they admit two solutions which are associated to the descents toward the two minima reachable from the saddle. In this way, we map out the first geometrical properties of the Morse complex, i.e. we characterize all the local minima that are connected to the original reference one through rank-1 saddles (as illustrated in Fig. 1). We then resort to dynamical field theory and to the time-reversal property of stochastic dynamics to construct the dynamical instantons for the two-point functions. The two dynamical solutions discussed above are used as building blocks: the part of the dynamical instanton associated to the ascent of the system from the original minimum to the nearby saddle is obtained through the time reversal of the relaxation path from the saddle down to the minimum [18,19]. We then combine this contribution with the one corresponding to the descent from the saddle to the new minimum to finally obtain the shape of the dynamical instanton, see Fig. 4.
In the following section a summary of the state-of-the-art and our main contributions is presented, we will then expose in details our analysis.

Model and state-of-the-art
We focus on the energy landscape associated to the p-spin spherical model: defined at each point s = (s 1 , · · · , s N ) of an N -dimensional sphere, s · s = N . The couplings J i 1 ···ip are independent Gaussian random variables with zero average and variance J 2 i 1 ···ip = 1/2p!N p−1 , and are symmetric under permutations of the indexes. The functional (1) has been the subject of an extensive amount of research devoted to understanding its statistical properties, which started with the earlier investigations [25,26,27,28,29,30] and culminated in the most recent results [31,32,21,22]. These works highlighted a peculiar organization of the landscape stationary points in terms of their energy density = E/N and of their stability: while at large value of the energy the landscape is dominated by saddles with a huge index (i.e., number of unstable directions), the local minima and low-index saddles concentrate at the bottom of the landscape, below a critical threshold value of the energy density th . Their number N k ( ) (k being the index) is exponentially large in N , its typical value being governed by a positive complexity Σ k ( ) = lim N →∞ log N k ( ) /N that decreases with k. The high-dimensionality of configuration space entails that most of these low-energy minima and saddles are orthogonal to each others on the sphere, i.e. they normalized overlap q(s, s ) = lim N →∞ s · s /N is typically equal to zero. In order to find the escape paths from a given minimum, one needs to perform a more thorough analysis. In particular, it is important to scan the landscape in the vicinity of any of its stationary points. This information is accessible via large deviation techniques by computing the complexity of the stationary points constrained to be at fixed overlap q = 0 from the reference stationary point [23,21,22]. In Ref. [21] we computed the complexity of the typical stationary points that are found in the vicinity of a reference minimum, extracted uniformly from the ensemble of minima having a given energy density larger than the ground state gs and smaller than the threshold th . Henceforth we denote with s 1 the reference minimum, and with s 2 a stationary point at overlap s 1 ·s 2 = N q from it. We let 1 , 2 be the corresponding energy densities, and Σ( 2 , q| 1 ) the complexity of the stationary points at energy 2 . The results for a representative value of energy 1 of the reference minimum are summarized in Fig. 2. The stationary points that are closer to the reference minimum (i.e., at larger overlap) typically appear at an overlap that we denote with q M , and are at high energy density (equal to th in the case of Fig. 2). At each overlap smaller than q M we find an exponentially large number of stationary points (Σ > 0), with energy densities 2 > (q| 1 ). The lower bound (q| 1 ) corresponds to the energy of the deepest stationary points at overlap q: their complexity is exactly zero. For any fixed 2 smaller than th (see for instance the dashed arrow in Fig. 2), the closest stationary points with that energy are found at an overlap q m ( 2 ), and are rank-1 saddles: their Hessian has an eigenvalue density with a positively supported bulk (like for minima), plus an isolated eigenvalue that is separated from the bulk, and negative. The eigenvector associated to that eigenvalue has a macroscopic projection along the direction connecting s 2 and s 1 in configuration space, indicating that the saddle s 2 is unstable in a direction that 'points' towards the minimum s 1 . This remains true decreasing the overlap, up to a value q ms ( 2 ) where a transition to minima occurs. This is the overlap at which the curve ms (q| 1 ) intersects the given 2 : the points at this overlap are marginally stable rank-1 saddles, with one flat mode (the isolated eigenvalue is exactly equal to zero). For smaller overlaps the stationary points are minima; the closest ones are still correlated to the minimum s 1 (dashed gray region in Fig. 2), since their Hessian still exhibits an isolated eigenvalue, that is nevertheless positive.
Eventually, for even smaller q the points become minima that are totally uncorrelated to s 1 . At q = 0, we recover the unconstrained complexity of local minima. Therefore, all the stationary points enclosed in the violet region of Fig. 2 are typically saddles that are geometrically connected to the reference minimum in configuration space. Their complexity is shown in the same figure. Among them, the deepest ones have parameters q * ( 1 ), * 2 ( 1 ) that correspond to the intersection between the curves ms (q| 1 ) and (q| 1 ).

Figure 2:
Left. The colored regions identify the range of energy densities 2 of the stationary points found at overlap q with a minimum of energy density 1 = −1.167: the violet area corresponds to rank-1 saddles, the gray one to minima that are either correlated (dashed area) or uncorrelated to the reference one. The eigenvalue density ρ(λ) of the Hessian matrices at the stationary points is sketched at the bottom of the plot. Right. Color plot of the complexity of the rank-1 saddles as a function of their energy and overlap q with the reference minimum of energy 1 = −1.167.

The landscape in the vicinity of a local minimum
All the saddles lying in the vicinity of the reference minimum s 1 , and corresponding to the violet region in Fig. 2, represent possible escape states for the system trapped in the local minimum. In this work we study where gradient descent dynamics starting from these saddles land in the energy landscape. By developing a theoretical framework that combines the Kac-Rice method and dynamical mean-field theory, we obtain the equations characterizing the minima that are connected to the original reference minimum through the saddles, see Fig. 1: these are the states that the system can reach if it manages to escape from s 1 through one of the surrounding saddles. The connectivity of s 1 in configuration space can thus be characterized by studying the energy density ∞ and the overlap N x ∞ = s ∞ · s 1 of the minima s ∞ reached asymptotically, as a function of the saddle parameters q, 2 . Fig. 3 shows the resulting distributions, for a representative value of 1 as above. Interesting correlations emerge between minima and saddles: at fixed energy 2 of the saddle, those at higher overlap (i.e., those closer to the reference minimum) are more optimal, as they allow to reach minima that lie deeper in the landscape, and at furthest distance from the reference minimum. Upon changing the energy of the saddle, one discovers that there exists a trade-off between energy and overlap: the saddles that connect the reference minimum to the deeper ones are not the same ones that connect it to the further ones, thus allowing to explore a larger portion of configuration space. In particular, the saddles at q * , * 2 that correspond to the minimal energy barrier are optimal in terms of energy of s ∞ , but not in terms of its overlap x ∞ . Overall, we see that both the range in energy and in overlap of the connected minima is rather limited: escaping through these saddles, the system reaches minima that are highly correlated from the reference one. We comment on the implications of this on the dynamics in Sec. 7, and refer to Sec. 4 for a more detailed analysis of the asymptotic solutions of the dynamical equations. Figure 3: Left. Energy density ∞ of the minimum reached asymptotically by the dynamics starting from an index-1 saddle at energy 2 and overlap q with the reference minimum having energy 1 = −1.167. Right. Overlap x ∞ between the reference minimum and the one reached asymptotically by the dynamics starting from the saddle at energy 2 and overlap q with the reference minimum.

Activated processes and dynamical instantons
As already stressed in the introduction, one of the main aim of this work is to obtain the dynamical instanton that corresponds to the activated process associated to the escape from a given minimum s 1 towards a new minimum s ∞ , see Fig. 1. Instantons are in general obtained as special extremal solutions of a large deviation functional [33,15]. In the case of mean-field spin glasses the corresponding mathematical object is a functional of the two-point functions [34,35]. Although, in principle one could look for dynamical solutions by extremizing this functional and imposing suitable boundary conditions in time, in practice analyzing the corresponding equations represents a formidable challenge. No numerical solution has been obtained yet. On the analytic side, despite the results in [17,18,19,20] the problem remains largely open. The main obstacle is the lack of intuition on the kind of solution one is looking for -an information that is missing so far. The only case in which a dynamical instanton has been fully worked out is in the study of finite-time metastable states where periodic boundary condition in time are enforced [35]. This is quite a different situation with respect to the one we are interested in here.
In the following we show how to obtain the dynamical instanton corresponding to the activated process sketched in Fig. 1. The resulting shape of the two-point correlation function is shown in Fig. 4. It displays three time regimes: the first one corresponding to the ascent from s 1 , the second one corresponding to the approach and the departure from the saddle, and the final one associated to the descent towards the new minimum. Since the basic objects is a symmetric two-time functions, C(t, t ), this leads to six different time-sectors and six different behaviors for the correlation function (depending on which of the three regimes the times t and t belong to). The explicit form of the dynamical instanton associated to the simple activated process in Fig. 1 will be instrumental in finding instantons associated to more complex relaxation processes, in particular to equilibrium relaxation. We shall get back to this issue in the conclusion. instantonic solution that links a reference minimum (s 1 ) at energy 1 = −1.167 to a neighboring minimum (s ∞ ) reached through a saddle (s 2 ) at energy 2 = −1.1555 and with overlap q = 0.75 with s 1 . The plot shows correlation equal to one on the diagonal and plateaux on other three different levels, corresponding to the overlaps q = 0.75 between s 1 and s 2 , c ∞ = 0.957 between s 2 and s ∞ , and x ∞ = 0.619 between s 1 and s ∞ .
3 Self-consistent dynamical equations describing the escape from a saddle In this section we derive the equations describing the system evolution with specific initial conditions, that correspond to being in a saddle at a fixed distance of a given local minimum of the energy landscape. We remark that dynamical equations with constrained initial conditions for the p-spin spherical model have been derived in simpler settings, see for instance Refs. [36,37,24]. In particular, Ref. [36] studies the exponential relaxation of the system initialized within one of the metastable states that contribute to the Boltzmann measure in the so called dynamical phase, at temperatures between the static and the dynamic transition temperatures. In Ref. [37] and in the more recent [24] the overlap between the initial condition of the dynamics and a thermalized condition in the same temperature range is also enforced to take a fixed, non-zero tunable value. The approach we present below goes one step further since we condition on the initial con-dition s 2 to be itself a stationary point, beside conditioning on its energy density and on the overlap with the reference minimum s 1 . From the technical point of view our approach combines the Kac-Rice method developed to study critical points of high-dimensional rough landscapes [38,31] with dynamical field theory [33,34].

The dynamical action with constrained initial conditions
Let s(t) denote the spin configuration at time t, and let E[s t ] be the time-dependent energy field evaluated at s(t): The vector s(t) is obtained as a solution of the Langevin equation: where ξ i (t) is white noise with correlations z(t) is a Lagrange multiplier that enforces the spherical constraint s(t) · s(t) = N , α is equal to twice the temperature T , and With this normalization, typically E[s t ] ∼ √ N . We assume that the dynamics has a specified initial condition s(0) = s 2 , with corresponding energy: The dynamical generating functional corresponding to the stochastic evolution (3) and obtained integrating over the noise reads: whereŝ i (t) are auxiliary fields 1 and we highlighted the dependence on the initial condition of the dynamical evolution.

Implementing the initial conditions
We aim at averaging the above dynamical functional over all possible initial conditions s 2 that are stationary points found in the vicinity of some local minimum s 1 of the energy landscape, having energy: We assume that the two stationary points are at overlap N q = s 1 · s 2 , and define the following average over the initial conditions: where the integrals are over the sphere of radius √ N . The explicit form of the measure is given by: This measure encodes the spherical constraint, since e α [s] with α = 1, · · · , N − 1 are unit vectors spanning the tangent plane of the sphere at the point s, and H[s] is the Hessian matrix of the energy functional at the point s, which is also projected onto the tangent plane and has components: The normalization N s 1 ( 2 , q| 1 ) denotes the total number of stationary points of energy density 2 that are found at overlap q with a fixed stationary point s 1 having energy density 1 , whereas N ( 1 ) is the total number of stationary points of energy density 1 . We thus define the generating functional averaged over the initial conditions as: The interpretation of (9) is as follows: the measure (10) weights uniformly all stationary points at a given value of energy density 1 ; for energies below the threshold energy, these stationary points are typically local minima, since the number of saddles of finite index is exponentially suppressed in the dimension N (with respect to the number of minima). Therefore, in the large-N limit the point s 1 extracted with such measure will be a local minimum with a high probability, that converges to one as N → ∞. Similarly, the analysis of Ref. [21] reveals that for each choice of q, 2 , the energy landscape is dominated by one specific type of stationary points that are either local minima or index-1 saddles, see Fig. 2. More precisely, any local minimum s 1 is typically surrounded by an exponentially large (in the dimension N ) population of stationary points distributed over a finite range of overlaps q; among them, the ones that are at larger overlap q (and thus closer to the minimum) are typically index-1 saddles. This implies that if a stationary point is selected at random among those at large-enough overlap, with probability that converges to one in the large-N limit this point will be an index-1 saddle. By suitably changing the parameters q, 2 at fixed energy 1 , we can select the initial condition s 2 to be either an index-1 saddle or a local minimum. Of course we are particularly interested in the regime of parameters in which the initial condition is an unstable stationary point, i.e. a saddle.

Averaging over the random couplings
The generating functional (12) can be averaged over the quenched random couplings J i 1 ,··· ,ip . We denote the corresponding average with Computing this average is a priori non-trivial because of the normalizations in the denominator, which are explicit functions of the random couplings. The computation can be performed exploiting the identity x −1 = lim n→0 x n−1 . This naturally leads to a replica calculation, in which higher moments of the quantities N ( 1 ) and N s 1 ( 2 , q| 1 ) have to be determined. As it follows from the results of [21], however, such a replica calculation reproduces the results obtained within the so called annealed approximation, which in this case corresponds to averaging separately the numerators and the denominator of (13): We can therefore focus on the average of the numerator, which we denote with: where and To perform the average over the random couplings, we make use of the following trick: because the energy functional is homogeneous, it can be expressed (together with its gradient) in terms of the time-dependent symmetric matrix field M ij (t) defined as: Indeed, we can write: The matrix field (18) is symmetric and random, with a Gaussian statistics induced by the couplings. The covariance of the field evaluated along two fixed different dynamical trajectories s a (t), s b (t ) is given by: As a consequence, the average over the random couplings can be equivalently re-written as an average over this matrix field. This is convenient as it allows us to account for the constraints in the initial condition of the dynamics (encoded in the measure (10)) in a straightforward way, given that for both s a with a = 1, 2 we can write the energy and the gradient in terms of the matrix field: We define the initial conditions of the matrix field M [s a (t = 0)] = m a . As we show in Appendix B, by averaging over the matrix field and implementing the constraints (21), we can re-write (15) in the form: where the proportionality factors do not depend on the dynamical variables s t ,ŝ t , and thus do not matter for the derivation of the dynamical equations. In this formula S[s t ,ŝ t ] is the dynamical action, whereas the term K[s t ,ŝ t ] is given by an integral over the initial conditions of the matrix field (18). We describe the structure of the two terms in the following subsection, and refer to the Appendices for the detailed derivations.

Order parameters and the role of causality
We have: where V 0 [s t ,ŝ t ] is given in (17). The action S 0 encodes the dynamical evolution given by (5), and is generic. The term S B instead accounts for the peculiar initial conditions of the dynamics: it arises when imposing that the initial condition s 2 is a stationary point of energy density 2 , at overlap q from a local minimum of energy 1 . Both actions depend on the dynamical variables only through the two-point functions contained in the 2×2 matrix Q(t, t ) with components: In addition, we introduce the 2 × 2 matrices containing the one-point functions: and the vector: With this notation, it holds: This term thus reproduces the dynamical action obtained when starting from random initial conditions [25]. All the non-trivial information on the initial condition of the dynamical evolution is contained in S B . This term depends explicitly on the overlap q, as well as on the energy densities 1 , 2 . It contains two contributions: which are derived in Appendix D. The first contribution S (1) is generated by conditioning s a to be stationary points. We can write it as: and where, for arbitrary 2 × 2 matrices with components x ab with a, b ∈ {1, 2}, we have introduced the form: The second contribution S x] follows from conditioning both on the gradient and on the energy density of the points s a , and reads: where and A is a 4 × 4 matrix given in Appendix D.
Let us now comment on the term K[s t ,ŝ t ]. This term is obtained as an integral over the (N −1)×(N −1) matrices m a , which denote (up to a shift) the projection of M [s a (t = 0)] = m a on the tangent plane at s a . Its explicit form reads: (34) From this expression we see that the Hessian matrices m a are Gaussian distributed, with inverse covariances Ω * = [Σ * ] −1 that are given explicitly in Appendix C. The term Φ a [s t ,ŝ t ] inside the determinant denotes a matrix whose components can be written as: where we introduced the functions with: Thus, these matrices are a sum of a rank-one projector and of a second matrix φ a which depends in principle on the dynamical variables s 2 (t),ŝ 2 (t), and can not be expressed compactly in terms of the order parameters (24) and (25). It might therefore seem that the determinants in (34) give a contribution to the action that depends explicitly on the whole time evolution, and that therefore has to be taken into account when deriving the dynamical equations. However, as it appears from the analysis performed in Appendix C, the components of φ a vanish when the dynamical average is restricted to trajectories that fulfill the requirement of causality. As a consequence, when the dynamical evolution is causal the matrices Φ a reduce to rank-1 projectors, that depend explicitly only on the parameters q, 1 and 2 that characterize the initial condition. This is consistent with the natural expectation that the terms appearing in the measure (9), that select the initial condition of the dynamics, are not affected by the subsequent dynamical evolution of the system. Inspecting the distribution of the entries of the matrix m a and the explicit form of the functions µ a (q, 1 , 2 ), one can easily show that the integrand in K[s t ,ŝ t ] reproduces exactly the flat measure over critical points at overlap q with each others, see Appendix C for details. Therefore, accounting for the causality of the dynamical evolution we recover which cancels precisely with the denominator in (14). As it follows from this simplification, all the information on the initial conditions s 2 enters in the boundary terms of the dynamical action (23) only. These terms turn out to encode the statistical properties of the Hessian at the initial condition s 2 , as we show explicitly in Sec. 3.3.1.

Variation of the action and dynamical equations
To finally obtain the dynamical equations, we focus on the remaining term: where we introduced the order parameters (24) and (25), and: where s (1) a (t) =ŝ a (t) and the product over α, β is implicit. The integral over the dynamical variables s 2 (t),ŝ 2 (t) is Gaussian, with kernel: A standard calculation gives: with: Substituting (42) into (39) and taking the variation with respect to Λ and Q we get: Combining these equations with the one obtained taking the variation of the action with respect to x we obtain the coupled equations: where we used the notation . A lengthy (but straightforward) calculation of the functional variations of the action leads to the dynamical equations reported below (see [34] for details of the derivation in the simplified case in which the boundary terms are absent). We stress that the equations are given under the assumption that the resulting typical dynamical trajectories are causal, meaning that we assume that the saddle-point solution satisfies: which implies in particular r(0, t) = 0 for any t > 0. The remaining equations are for the correlation function c(t, t ), the response function r(t, t ) for t > t , and the overlap x(t) ≡ x 1 (t) = c 12 (t) with the minimum s 1 . We report them in the following, and refer to Appendix E for the explicit expression of the constants involved.

Dynamical equation for the overlap with the nearby minimum
The equation for x(t) reads: and G ,q depends linearly on the energies, and reads: and the constants G a i (q) are functions of x(0) = q, and are reported in Appendix E.

Dynamical equations for the correlation function and the Lagrange multiplier
The equation for the correlation c(t, t ) reads: where the energy-dependent part is a linear combination of 1 , 2 given by: and the constants are given in Appendix E. Setting t = t we obtain the equation for the multiplier z(t) enforcing the spherical constraint during the dynamics 2 : and F at equal times reduces to: (53) When t = 0, setting x(t = 0) = q and c(t = 0) = 1 we get: which for α = 0 reduces to the correct value of the Lagrange multiplier enforcing the spherical constraint at a stationary point of energy N 2 .

Dynamical equation for the response function
The equation for the response r(t, t ) is formally unaltered by the coupling to the initial conditions, and reads: In this equation the information on the initial conditions is implicitly encoded in the Lagrange multiplier z(t).

Two limiting cases: simplifications and checks
We now consider two interesting limits of the above equations, which we remind are derived under the assumption that the initial condition s(t = 0) = s 2 is a stationary point of energy density 2 , at overlap q with a local minimum s 1 of energy density 1 .
The first case we focus on consists in the limit α = 2T → 0, when the noise in the Langevin equation vanishes and the dynamics reduces to gradient descent starting from a stationary point. As we shall see, and as expected, if this point is a minimum then the system remains stuck there, otherwise if this point is a saddle a dynamical instability takes place. The second case corresponds to the limit q → 0, when the initial condition decouples from s 1 , and one samples uniformly all stationary points at a given energy. For this reason we will refer to it as "microcanonical initial conditions". This limit is useful to check our equations since it can be connected to the one analyzed in [36].
3.3.1 Gradient descent from a stationary point: the "static" solution and its stability In the noiseless limit α → 0, the dynamical equations must admit a solution in which the system does not move away from the initial condition, given that the latter is a stationary point. We refer to this as the "static" solution. It is easy to check using the explicit form of the constants given in Appendix E that x(t) = q and c(t, t ) = 1 solve the above equations in this limit. Indeed, plugging this ansatz into (47) we get: which rightly gives the value of the zero-time multiplier z(0) = z 0 = −p 2 . The same identity is obtained from (49). The equation (55) for the response becomes: Assuming time-translation invariance, this is equivalent to The Laplace transform of this equation is simply: where we used that the Laplace transform of the derivative is ωR(ω) − R(0 − ), and R(0 − ) = 0. The equation admits the shifted GOE resolvent as a solution, i.e., where the function G is given in (113). The inverse Laplace transform is proportional to a Bessel function, This result coincides with the one of stochastic dynamics in purely quadratic landscapes [39,40], as in the noiseless limit the non-linear part of the potential is not explored.
The initial condition s 2 has a Hessian whose statistics depends on the parameters q, 1 and 2 , as recalled in Appendix A. Its eigenvalue density is almost entirely positive definite (and GOE-like), with the exception of possibly one negative eigenvalue that appears for certain values of the parameters (given by the condition (111)). When the initial condition s 2 is a saddle with one single negative eigenvalue, the "static" solution must be dynamically unstable, since there exist a direction in configuration space in which the landscape has negative curvature, allowing the system to escape from the stationary point, see Fig. 1. For fixed q, this happens whenever the initial condition s 2 is chosen to have energy 2 ∈ [ ms , ], see Fig. 2. In order to check this instability from the dynamical equations, we consider the linearization of Eq. (47) around the static solution x(t) = q. Setting x(t) = q + δx(t), we get: where the operator O acts as: and R(s) is the response in the stationary point with energy density 2 , z 0 = −p 2 andG ,q reads:G with the a i (q) given in (37). The static solution becomes unstable when the linear operator O has eigenvalues that becomes positive. We assume that δx(s) is slowly varying, which is correct close to the transition where the instability is small. As a consequence, we can extract it from the integration in (63). Taking t → ∞ we get: with Using that the integral is the Laplace transform evaluated at zero, which is related to the GOE resolvent G σ with σ 2 = p(p − 1)/2 as: see Eq. (60), one finds that Using (64) we get that the instability condition λ ∞ = 0 reads: This equation is precisely equivalent to the one corresponding to the isolated eigenvalue of the Hessian at s 2 being equal to zero. Indeed, as we recall in Appendix A the isolated eigenvalue of the Hessian is given by: where λ min solves the equation Multiplying (69) by √ 2 and using that G σ which corresponds to λ min = √ 2p 2 and thus λ 0 (q, 1 , 2 ) = 0. Therefore, the dynamical solution c(t, t ) = 1 and x(t) = q becomes unstable exactly at the values of parameters at which s 2 transitions from being a minimum to being a saddle, as expected.

Microcanonical initial conditions
We now consider the case in which q → 0, where the initial condition of the dynamics s 2 decorrelates from the minimum s 1 . In this limit, the only non-vanishing G a k (q) constant is G 1 2 (q) → −p, while the non-vanishing F a k (q) constants are F 2 1 (q), F 1 4 (q) → −p. The equation for x(t) reduces to: which is homogeneous and thus admits the solution x(t) ≡ 0 for x(0) = q = 0. The equation for the correlation when x(t) = 0 for any t reduces to: while the Lagrange multiplier reads: These equations give the evolution of the correlation function for a dynamics conditioned to start from a typical stationary point of energy density 2 , which therefore will be a local minimum for 2 < th . The first two terms in the second line of (75) and the last term in (76) are generated by conditioning on the stationarity of the initial condition: setting them to zero, we get the dynamical equations conditioned to start from a point extracted with uniform measure from the manifold at a given energy density 2 . This is a case that has been already considered in the literature: it is the microcanonical equivalent of the one analyzed in [36], where the initial condition is extracted with a Bolzmann measure at a temperature T between the statical and the dynamical transition temperatures. It provides a useful check of our method, which is different from the one followed in [36]. In fact, we recover the same dynamical equations, in particular the same boundary terms 3 .
4 Where does the system fall when escaping from the saddle?
The aim of this section is to study where the system falls after escaping from the saddle illustrated in Fig.1. One (trivial) possibility is to come back to the original reference minimum. The other possibility-the interesting one-is that the system lands in a different basin. In order to analyze this case, we consider the large time limit in which the system equilibrates within the basin. This allows us to obtain closed equations describing the properties of the basin, or more precisely the minimum since we consider the small-noise case.
4.1 Equations for the minimum: asymptotic analysis of the dynamics after the fall from the saddle When the initial condition s 2 is an unstable saddle, in presence of weak thermal fluctuations (α = 2T = 0) the system eventually escapes from the saddle (even though this might require extremely large times). In this section we study the asymptotic solutions of the dynamical equations representing the dynamics within the basin that has been reached. We therefore assume that after a finite time t eq a stationary limit is reached (see [37] for a similar computation), meaning that: Moreover, we assume that in the asymptotic limit the dynamics equilibrates into some local minimum of the energy landscape, and that the fluctuation-dissipation relation holds at large times: In the limit of zero temperature, if the dynamics ends up asymptotically in a minimum then C(τ ) → 1 (and notice that C(0) = 1). To capture the dynamical evolution it is necessary to 3 For a comparison, one needs to keep in mind that the Lagrange multiplier µ(t) in [36] and the z(t) in this work are related by introduce the scaling variable φ(τ ) = β(1 − C(τ )), with φ ∞ = β(1 − A ∞ ). Assuming that the initial transient decouples from the long-time dynamics: the equation for z(t) becomes for t, t → ∞: and The equation for the correlation with these assumptions is and using (82) we get: Setting φ(x) = β(1 − C(x)) and we finally obtain: which has a finite limit when T → 0 because the correlation of the noise is α = 2T ; integrating the equation for the response from t to t reproduces (84). In the limit τ → ∞ we get: and φ 0 = 0. The fluctuation-dissipation relation implies that φ ∞ coincides with the static susceptibility in the minimum reached asymptotically by the dynamics. The equation above is indeed consistent with this interpretation since φ ∞ satisfies the same equation of G σ (z ∞ ), where G σ is the GOE resolvent which is directly related to the static susceptibility, see Eq. (113), as well as (60).
Additional relations between the parameters x ∞ , c ∞ , z ∞ and φ ∞ are obtained from the t → ∞ limit of the equations for x(t), z(t) and c(t), which gives the following coupled equations: and and The solution of these four coupled equations gives information on the minima reached asymptotically by the dynamics, as a function of the parameters q and a that specify the initial conditions. In particular, the energy of the minimum reached asymptotically by the dynamics can be read out from z ∞ . As we anticipated, we expect two kinds of solutions for these equation when the initial condition s 2 is an unstable saddle. One solution should correspond to the trajectory that escapes from the saddle and goes back to the original minimum s 1 . In fact, we do find that this set of equations admits the solution x ∞ = 1 and c ∞ = q, and substituting these values into (88), (89) and (90) we obtain the identity z ∞ = −p 1 . This indicates that the two stationary points s 1 and s 2 are not only geometrically connected (meaning that the unstable direction of the saddle s 2 is oriented towards the minimum s 1 in configuration space) but also dynamically connected, since there exists a solution of the dynamical equations that corresponds to the relaxation from the saddle to the reference minimum, see also Sec. 5. The other (less-trivial) solution of the above system of equations instead corresponds to the system relaxing to another local minimum s ∞ that is connected to the reference one through the rank-1 saddle s 2 . We focus on this second solution in the following.

Geometrical arrangement of the minimum past the barrier and the saddle in configuration space
We now want to discuss the correlations between the pairs of minima connected by the index-1 saddles. In order to do so, we solve the asymptotic Eqs. (88), (89) and (90) for the parameters c ∞ , x ∞ and z ∞ . Given z ∞ , the response φ ∞ is then readily obtained solving the quadratic equation (87). We choose a representative energy of the reference minimum, equal to 1 = −1.167 as in Fig. 2 (recall that gs ≈ −1.172 and th ≈ −1.1547). From the study of the constrained complexity [21], we know that index-1 saddles are the dominant stationary points in a range of energies and overlaps corresponding to the violet region in the figure: for any 2 ∈ [ * 2 , th ] we find that the typical stationary points are saddles if q ∈ [q ms ( 2 ), q m ( 2 )]. For the chosen 1 , the deepest energy of these saddles is * 2 ≈ −1.158, and the corresponding stationary points are found at q * ≈ 0.677; the range of allowed overlaps is maximal for the saddles that are at the threshold energy, where q m ( th ) = q M ≈ 0.757. Beyond this value of the overlap, the landscape is typically devoid of stationary points (besides the reference minimum).
As shown in Sec. 3.3.1, in this regime of 2 , q the static solution of the dynamical equations (corresponding to c ∞ = 1 and x ∞ = q) is unstable, and another solution of the asymptotic equations is found, with c ∞ < 1. We denote with ∞ the energy density of the minimum that is reached asymptotically by the dynamics. Fig. 3 shows the values of this asymptotic energy density as a function of the energy of the saddle 2 and of its overlap q with the reference minimum, as well as the values of the asymptotic overlaps x ∞ with the reference minimum. The following features are observed: • At fixed energy 2 of the saddle, the asymptotic energy ∞ decreases with q, meaning that the saddles that are closer to the reference minimum connect the latter to minima that lie deeper in the landscape; the same holds true for the overlap x ∞ for sufficiently small values of 2 (see the caption of Fig. 5 for more details). Therefore, among the with q, at fixed energy 2 of the saddle. Middle and Right. The behavior of the asymptotic overlap x ∞ with q depends on 2 : for 2 sufficiently small, x ∞ decreases with q, i.e., the closer is the saddle to the reference e minimum, the farther is the one reached asymptotically; for energies 2 closer to the threshold, the behavior of x ∞ is non-monotonic.
saddles at the same depth in the landscape, the ones that are closer to the minima lead to a more efficient exploration of configuration space, as they allow to explore farther regions and to reach deeper minima. We recall that increasing q corresponds to selecting saddles that are less numerous (have lower complexity) and that are in general steeper along the direction connecting to the reference minimum (as they have a smaller isolated eigenvalue).
• At fixed overlap q with the reference minimum, deeper saddles connect the latter with local minima with smaller energy. In particular, the deepest minimum that can be reached through this family of index-1 saddles is connected to the reference one through the lowest saddle of energy * 2 . However, this is not the farthest point that can be reached through this family of index-1 saddles.
In Fig. 6 we focus on the closest saddles to the minimum for each 2 (i.e., on those at overlap q M ( 2 ) with the minimum, having zero complexity), and plot the asymptotic energy and overlaps reached from these saddles, which show an almost linear dependence on 2 . We see that moving along the curve corresponding to zero complexity of the saddles, the ones having lower energy lead to lower energy minima, but that are at larger overlap with the original minimum. Thus, there is a competition between energy and overlap of the asymptotic states: the saddles leading to lowest energies are not those leading to the farthest stationary points.
More generally, the asymptotic analysis shows that the minima that are reached through this family of saddles have a distribution in energy concentrated around values that are much higher than 1 (the energy of the reference minimum), and are rather close to the energy 2 of the saddles. Moreover, the asymptotic correlation with the initial condition (the saddle) remains quite close to one, as we show in Fig. 7. This suggest that the minima reached asymptotically are close to the saddles in configuration space. Moreover, we find that they are correlated to the reference minimum 4 . Indeed, the corresponding parameters (x ∞ , ∞ ) lie in a region of configuration space that is dominated by minima having an Hessian that feels the presence of the reference minimum through a single (positive) isolated eigenvalue, see

Numerical solution and free-fall dynamics
The purpose of this section is to present a full numerical solution of the equations (47), (49), (52), (55). We shall show that after escaping from the selected saddle the system displays a relaxation dynamics towards the connected minima, thus validating the assumptions behind the asymptotic solution obtained in Sec. 4 (in particular we exclude the existence of aging dynamics and trapping in spurious minima). The numerical solution of the free-fall dynamics from the saddle will be instrumental in reconstructing the shape of the dynamical instanton in the next section, see  asymptotically by the dynamics and the saddle chosen as initial condition. The flatter is the negative direction of the saddle (i.e., the closer is 2 to ms ), the closer is the minimum reached asymptotically. Right. The red points represent the parameters of the saddles chosen as initial conditions for the dynamics, while the black ones are the parameters of the minima reached asymptotically from the saddles with the same symbol. The inset is a zoom of these points. The saddles that have a flatter unstable direction (those at smaller q) lead to closer local minima. All minima reached asymptotically lie in the region of configuration space that is dominated by minima correlated to the reference one, having one positive isolated eigenvalue (dashed gray area).

Kicking the system out of the saddle
As already discussed in the previous section, when the initial condition is on the saddle the system remains stuck there even though this is an unstable point. The reason is that this unique unstable direction is one out of N , so in the large N limit the system does not escape from the saddle in any finite time. By linearizing the dynamics around the unstable saddle is easy to establish that the escape time equals ln (N/α) /|λ 0 |, i.e. it increases logarithmically with N (λ 0 is the negative eigenvalue of the Hessian corresponding to the unstable direction). In the following, since we are interested in the free-fall dynamics, we bypass this slow process by introducing a small perturbation aligned, or counter-aligned, with the unique unstable direction of the saddle. We implement this perturbation in the form of an impulse, a kick, of infinitesimal amplitude and duration in the direction of s 1 which has a finite projection on the unstable direction [21,22], i.e. along the vector s 1 − s 2 . However since the component along s 2 is compensated anyway by the spherical constraint we simplify and consider a kick in the direction s 1 − qs 2 (see Fig. 8) perpendicular to s 2 . This leads to the modified dynamical equations: with initial condition s(t = 0) = s 2 chosen as usual. For ε > 0 (< 0), the kick pushes the system towards (away from) the minimum s 1 . In the second case the convergence to the other minimum s ∞ is favored. The equations for x(t), c(t, t ) and z(t) change in a very simple way that can be read from (91) and only affects the contributions coming from the initial condition Kicks with amplitudes of opposite signs allow numerical integration of dynamical paths from the saddle towards the original minimum and from the saddle away from the original minimum. The second path is P do . The time reversal of the first path is P up . The dynamical instanton path P I is obtained by joining P up and P do .
G ,q [c, x] and F ,q [c, x] in the following way: The equation for r(t, t ) that is not explicitly affected by initial conditions would change uniquely through the Lagrange multiplier z(t), which has itself a null contribution δ(t)[x(t) − qc(t)] = 0 from this kick by construction. The simplest form for the new system equations can then be rewritten using (52), (55), and From the last new equation it becomes evident that, if for ε = 0 x(t) = q ∀t, setting ε > 0 (< 0) leads to an initial increase (decrease) of x(t) from q and therefore a consequent relaxation towards (away from) s 1 , as pictorially represented in Fig. 8.

Numerical integration scheme
The algorithm used to integrate the dynamical equations is a modification of the code developed for the Cugliandolo-Kurchan equations on a fixed time-grid used in [41,42] and available at https://github.com/sphinxteam/spiked_matrix-tensor. We introduced two modifications to it. The first one consists in adding the terms of the equations derived in Sec. 3 that enforce the initial condition of the dynamics. The second one is due to the presence of the kick. As it emerges from Eqs. (93) and (94), while introducing the effect of the kick for one time quantity is straightforward, two point functions should incorporate at any t > 0 the effect of the kick from t = 0. However the standard numerical approach (see an example of source code at https://github.com/sphinxteam/spiked_matrix-tensor) obtains the two point correlation function c(t, t ) with t > t from the integration of {c(t − dt, s)} with s ∈ [0, t − dt] (example in yellow in the Fig. 9). In this scheme the singular contribution coming from the impulse at t = 0 would be only included in c(dt, 0). A solution to this issue has been implemented by evaluating separately c(dt, t) with dt < t through a modified integration routine on {c(0, s)} with s ∈ [0, t] and adding the contribution from the kick. The result, by symmetry, gives c(t, dt) (in red in Fig. 9) to be used in the subsequent integration step for c(t + dt, t ).

Free-fall dynamics from the saddle and asymptotic solution
We now present the full numerical solution with initial condition on the saddle s 2 . The results shown in this section refer to a reference minimum s 1 at energy 1 = −1.167 and initial condition on a saddle s 2 at overlap q = 0.75 from s 1 and at energy 2 = −1.1555. We have taken α = 0, i.e. zero temperature. A first check of our numerical scheme is that without the kick the numerically integrated dynamics is stuck on the saddle, which is indeed what we find, as anticipated in Sec. 3.3.1.
We then implement the kick as explained above and find the results reported in Fig. 10 in terms of the overlap x(t) with the original minimum s 1 and the energy (t), for a positive and negative kick of amplitude ε = 10 −3 . We observe that the dynamics on both sides of the saddle lead to a finite time relaxation towards the two neighboring minima. We validate the prediction for the long time energies and correlation obtained in Sec. 4 under the TTI (Time  Translation Invariance) hypothesis, see the perfect correspondence in Fig. 10 with the long time limit of numerical integration for the corresponding quantities. We have also verified explicitly that TTI holds for correlation and response asymptotically (only the latter has a non-trivial TTI dynamics since α = 0).

The shape of the dynamical instanton
In this section we focus on the dynamical instanton, which corresponds to the activated process that allows the system to escape from the minimum s 1 to the new minimum s ∞ by crossing the barrier associated to s 2 . In order to obtain the dynamical instanton, we combine the results on free-fall dynamics derived above with time-reversal transformations.
In fact, the theory of activated process at low temperature developed in theoretical physics and mathematics (referred to Freidlin and Wentzell in probability theory) established that an activated process can be decomposes in two parts: first an upward trajectory to the saddle, which is the time-reversal of the free-fall descent (in our case from s 2 to s 1 ), and then the free-fall descent from s 2 to the new minimum. In the following we recall the time-reversal field transformations that will allow us to reconstruct the dynamical instanton.

Time reversal
The time reversal c R (t, t ), r R (t, t ) of the correlation c(t, t ) and the response function r(t, t ) for t > t follows from the relation between the time reversal fields s R (t),ŝ R (t) and the original field s(t) and auxiliary fieldŝ(t). Let us recall them [33,43] in a simplified setting where with an action and with R[s t ] = E[s t ] + z(t)s(t)/2. The single path time-reversal is as follows This choice is self-explanatory for s R (t). The non trivial transformation of the auxiliary field is obtained instead by imposing the invariance under time inversion of the action in Eq. (96), except from the production of boundary terms at s(0) = s I = s R (τ ) and s(τ ) = s F = s R (0) that assure detailed balance all along the dynamical path: The transformations under time reversal for correlation and response functions, as defined in Eq. (24), are therefore inherited from the single field transformations as follows as d(t, t ) = lim N →∞ŝ (t) ·ŝ(t )/N = 0.

Reconstruction of the dynamical instanton
As schematically shown in Fig. 8, since we know by direct numerical integration the correlation and response function along the free-fall dynamics s 2 → s 1 , we can obtain their time-reversed counterparts using the relations above. We shall denote the corresponding correlation function c up (t, t ) and the associated dynamical path P up . In order to construct the dynamical instanton, the time-reversed path thus obtained is merged with the forward dynamical path P do from the saddle to the new minimum s ∞ . Accordingly, the correlation functions c do (t, t ) for this process is obtained by direct numerical integration along the free-fall dynamics s 2 → s ∞ . The probability rate of such dynamical instanton equals at leading order e −2(E 2 −E 1 )/α , with E 2 and E 1 the energy of the saddle and the original minimum, respectively, as it follows from the results recalled in the previous section. Since the difference in energy between the saddle and the original minimum is extensive, this implies that the activated process associated to the dynamical instanton typically takes place on a time-scale that diverges exponentially with N . We wish to describe the reconstructed dynamical instanton in terms of a global two time correlation function c(t, t ) defined on the entire time span t ∈ [0, τ f ] and t ∈ [0, τ f ] where τ f = τ up + τ do , and τ up , τ do are the time span of the dynamical paths respectively towards (P up ) and from (P do ) the saddle. Finally τ s = τ up is the time at which the saddle is visited. However, a reconstruction based on a junction at time τ s of these two distinct dynamical paths lacks the off-diagonal sectors where t ∈ [τ s , τ f ] and t ∈ [0, τ s ], and viceversa. To fill this gap we propose an approximated interpolation of the correlation function c(t, t ) in these dynamical sectors based on the following decomposition for t > τ s and t < τ s with v and v two vectors on the sphere, perpendicular to s 2 . For t and t approaching τ s both vectors correspond to the saddle s 2 . The above decomposition corresponds to fixing the projection of the dynamical variables s(t), s(t ) along the direction of the saddle to its typical value, which is given by the solution of the dynamical equations. The projection along the orthogonal direction is then automatically fixed by the spherical constraint. The directions v and v are in principle varying with time during the dynamical evolution, and so is their overlap. We neglect this time dependence and set: This condition ensures that the boundary conditions are verified: at t = τ f , t = 0, where it is expected s(τ f ) = s ∞ , s(0) = s 1 , we have that their scalar product is x ∞ as it should, since The resulting expression for the correlation function for t ∈ [τ s , τ f ] and t ∈ [0, τ s ] then reads Finally we get c(t , t) = c(t, t ) by symmetry. We are now in position to completely reconstruct the dynamical instanton corresponding to barrier crossing in mean-field glassy systems. Its shape is shown in Fig. 4.

Conclusion
The main outcome of this work is the identification for mean-field models of glasses of the simplest activated processes, which correspond to the escape from a given minimum through saddles of index one. By combining the Kac-Rice method and dynamical field theory, we have constructed explicitly the dynamical instanton associated to the jump over the barrier, and characterized the new minima that the system can reach after the jump. This represents a first step towards a general classification and analysis of dynamical instantons in rough high-dimensional energy landscapes. In particular, the dynamical equations derived in this work allow us to describe escapes from local minima passing through a particular family of rank-1 saddles, those that are closer to the reference minimum in configuration space. The reason for this is that these rank-1 saddles are the typical stationary points (i.e., those that are exponentially more numerous than any other type of stationary points) found at highoverlap with the minimum. We have found that the minima reached dynamically through these saddles are correlated to the reference one, being quite close to it in configurations space; moreover, they are at higher energy. Other saddles geometrically connected to the reference minimum exist at higher distance (smaller value of the overlap) but are atypical [22], i.e. they are still exponentially numerous in N , but their number is subleading with respect to that of local minima, which are instead the typical critical points for that value of the overlap. Initializing the dynamics in one of these saddles requires to condition explicitly on the properties of the Hessian of the initial condition, thus generating additional terms in the dynamical equations. Deriving the corresponding dynamical equations and characterizing their asymptotic solutions is potentially interesting, since these saddles might connect the reference minimum to other local minima that are less correlated with the reference one, being at smaller overlap with it or having lower energy. These saddles can provide more direct escape paths, whereas the ones analysed in this work are more likely to give rise to back and forth motions with frequent returns to the original minimum. We leave this interesting open problem to future work. More broadly, it is worth examining the extremization equation of the large deviation dynamical functional by leveraging on the special solution we constructed. Generalizing such solution (numerically or analytically) provides a new way to obtain the dynamical instantons which correspond to more complex activated processes, and in particular the ones leading to thermal relaxation.
Finally, the diagonal element M N −1 N −1 has yet another variance (that we do not report since it is not relevant in the following), and a non-zero average equal to: with the constants a i (q) already defined in (37) in the main text. Therefore, M is a GOE matrix modified by finite-rank additive and multiplicative perturbations that alter the statistics of the entries in the last line and column, that single out the direction connecting s 1 and s 2 in configuration space. The bulk of the eigenvalue density of M is given by a semicircle, and it is insensitive to the modified statistics of the elements outside the (N − 2) × (N − 2) invariant block. As argued in [21,22], the perturbations to the GOE statistics can nevertheless generate a sub-leading correction to this density, in the form of a single isolated eigenvalue λ min (q, , 0 ) that lies outside the support of the semicircle. This eigenvalue exists whenever [22] µ and it solves the equation The solution to this equation can be compactly written as: When (111) holds and when the smallest eigenvalue of the matrix √ 2H, is negative, the point s 2 is a rank-1 saddle. The eigenvector associated to this eigenvalue has a macroscopic projection along the direction in configuration space connecting the saddle s 2 to the minimum s 1 (see [22] for the explicit calculation of the magnitude of this projection). This is what happens for parameters that correspond to the violet region in figure 2.

B Derivation of Eq. 22
In this Appendix, we derive Eq. 22. We introduce the shorthand notation M [s t a ] ≡ M a (t) and enforce the initial conditions as: We can therefore re-write the average (16) as (117) where DM a denotes the joint Gaussian measure: and given that M ij (t) is symmetric we have restricted the covariance matrix to i ≤ j and k ≤ l: where χ is an indicator function. The matrix at the exponent in (117) reads and: where m a is the projection of the matrix m a onto the tangent plane at s a , and 1 is the identity matrix. Notice that the fields in (120) are exactly at equal time: this will be relevant for the discussion in Appendix C. The integration over the matrix field and over the auxiliary variables λ a ij gives for (15) the following expression: with an action where The proportionality is due to the fact that we are neglecting the functional determinant arising from the integration over the matrix field, as well as the determinant resulting from the Gaussian integration over λ ij . These terms can be disregarded as they do not depend explicitly on the spin variables, and therefore will not matter when deriving the dynamical equations from the optimization of the dynamical action. The expression for V 0 is given in (17).
We now focus on the integration over the initial conditions m a ij . In order to implement the constraints in (121), it is convenient to express the components of the matrices m a in the bases e α [s a ] in which the constraints are given, which span the tangent planes to the sphere at s a . To this aim, we introduce the rescaled unit vectors σ a = s a / √ N for a = 1, 2. We introduce a first set of unit vectors B 1 = {e 1 , · · · , e N −2 , w N −1 , w N } such that: and the remaining e α for α ≤ N − 2 span the region of space that is orthogonal to both s 1 and s 2 . Analogously, we introduce a second set These two sets are related by: The vectors e α [s 1 ] in (121) spanning the tangent plane at s 1 can be chosen to be equal to B 1 \ {w N }, while the vectors e α [s 2 ] can be identified with B 2 \ {v N }. It is convenient to determine the covariances (119) between the matrix elements M a αβ expressed in the corresponding bases B a . For K = N − 2, let us collect the matrix elements M a αβ into the following vectors: It is easy to check that at t = 0 = t the covariance matrix Σ ≡ Σ(0, 0), and thus its inverse Ω, have a block-diagonal structure with respect to this decomposition: Let us determine the explicit form of Σ(0, 0). The first block has a particularly simple structure: where we introduced the compact notation Σ N N for the 2 × 2 matrices with components Σ ab αN,αN , which are equal for any α ≤ N − 2, and similarly for the other blocks. Notice that this reduces to a diagonal matrix for q → 0, when the initial condition s 2 of the dynamics is orthogonal (and thus uncorrelated) to the minimum s 1 6 . Finally, the correlations of the components of M 1 form a 6 × 6 matrix with the following block structure: where each block is a 2 × 2 matrix with components Σ ab xy,zξ = M a xy M b zξ and x, y, z, ξ ∈ {N − 1, N }. The various block read: This general structure allows to decompose the sum in (123) in the following way: where: The constraints in (121) correspond to setting m a αN = 0 for α < N , and m a N N = p(p−1) a . Notice that the term U 1/2 couples the matrix elements m a αN , that have to be put to zero, with the elements m a αN −1 , on which the integration is free. Similarly, the integration on the elements m a N −1 N −1 in U 1 is free, while the elements m a N −1 N and m a N N are constrained to take a given value. To decouple the constrained matrix elements from the unconstrained ones, we make use of Gaussian conditioning 7 . Introducing the vector notation Ξ αβ = (Ξ 1 αβ , Ξ 2 αβ ) T and imposing m a αN = 0, we obtain: The second term in the sum (139) depend on some shifted 2-dimensional vectors Ξ * α1 and on a modified 2 × 2 correlation matrix Σ * 1/2 given by: We find: With an analogous reasoning, setting = ( 1 , 2 ) T , we see that once the conditioning on m a N −1 N = 0 and m a N N = p(p − 1) a are implemented the sum U 1 takes the form: 7 We make use of the following identity holding for two generic vectors x1, x2: where Σ is a generic correlation matrix with blocks Σij and: In this case Σ {1,0} is a shorthand notation for the 4 × 4 matrix with block structure: and: , By defining: and which coincides with Eq. (22) with the identification (23). The term K[s t ,ŝ t ] contains all the terms depending on the components m a αβ : its structure is described in detail in the following Appendix.

C The integral over the Hessian matrices
After shifting the integration variables m a αβ and implementing the constraints, we see that the term K[s t ,ŝ t ] in (22) can be compactly written as (151) Each of two matrices m a is therefore made of an (N − 2) × (N − 2) block of entries having a GOE statistics that is basis invariant; every entry m a αβ in this block is correlated only with itself, and with the analogous entry m b αβ of the other matrix. This remains true also for the entries belonging to the last line and column of the matrices m a : their correlations, however, are different; moreover, even their variance depends explicitly on the overlap q 8 .
A simple calculation gives: where we used the notation c ab (t , t) = s a (t ) · s b (t)/N and r ab (t , t) = s a (t ) ·ŝ b (t)/N . We recall that the components of Ξ 1 αβ are given in the basis B 1 , and those of Ξ 2 αβ in the basis B 2 . Performing the necessary algebra we find: where and c a x , d a xy are constants (depending on the overlap parameter q). Therefore the matrices Φ a for a = 1, 2 have components given by: with L a linear functions of their arguments. The second term takes the explicit form: with the same functions as in (37). This implies that as stated in Eq. 35, where φ a αβ [s t ,ŝ t ] = L a Ξ a α β is a linear combination of the integrals (153).
Equipped with these explicit expression, we can discuss the role of causality in the simplification of this term. The integrals (153) involve either products of the spin variable s 2 (t) and of the response fieldŝ 2 (t) evaluated exactly at the same time, or terms proportional to the response function r a2 (0, t). When the dynamical evolution is causal, these terms will typically be equal to zero: therefore, when the average over dynamical trajectories is restricted to causal ones, we can set φ = 0. This simplifies considerably the shifts Φ a , that reduce to simple rank-1 projectors. Exploiting this crucial observation, we finally obtain: By direct comparison with the the results recalled in Appendix A, we see that the matrix m 2 in (158) reproduces exactly the statistics as the conditional Hessian matrices at a stationary point s 2 at fixed overlap q from a reference minimum s 1 , as expected. More precisely, √ 2N m 2 = M with M defined in (107). The symmetric statement holds for m 1 . This allows us to conclude that (38) holds true.

D Derivation of the boundary terms in the action
In this Appendix we derive the boundary terms in (28). The first term is given by This term arises from conditioning the points s 1 and s 2 to be stationary points: in fact, it emerges from the constraint m a α N = 0, which corresponds to setting the gradients to zero. From (133) we find that: Moreover, with the notation introduced in (25) we find: Combining everything we get the expression (29). The second contribution to the boundary terms is given by: .