Unveiling the anatomy of mode-coupling theory

The mode-coupling theory of the glass transition (MCT) has been at the forefront of fundamental glass research for decades, yet the theory's underlying approximations remain obscure. Here we quantify and critically assess the effect of each MCT approximation separately. Using Brownian dynamics simulations, we compute the memory kernel predicted by MCT after each approximation in its derivation, and compare it with the exact one. We find that some often-criticized approximations are in fact very accurate, while the opposite is true for others, providing new guiding cues for further theory development.

Even though MCT is by no means an exact theory, it is all-pervasive in theories of the glass transition and arises naturally from many different theoretical approaches and perspectives. For example, it is central to the scenario sketched by random first order transition theory [50], and provides a clear connection between the theories of spin-glasses and structural glasses [51-53]. In particular, schematic MCT is exact for some spin-glasses [51]. In a field-theoretic setting, MCT can be derived as a self-consistent one-loop resummation [52,54-56], and has been likened to a Landau theory [52,57] as well as a mean-field theory [46,50,58] for the glass transition. Various kinetic-like approaches have also led to the same MCT equations [54,59].
Despite its successes and ubiquity, MCT is also criticized for failing to capture several key qualitative and quantitative features of glassy dynamics. In particular, it typically overestimates the glassiness of a material, to a degree that depends on the specific system studied [7,60]. Additionally, the theory does not account for the so-called dynamic crossover [61,62]. Specifically, MCT predicts that the structural relaxation time scales as a power law with temperature and ultimately diverges at an ideal glass transition. In many simulations and experiments of glass-forming liquids, this power law is indeed observed, but typically only at mildly supercooled temperatures; instead of diverging, the experimental relaxation time eventually crosses over into an Arrhenius (exponential) scaling [63-66]. The temperature at which this crossover occurs is usually referred to as the mode-coupling temperature $T_{\mathrm{MCT}}$. The inability of MCT to predict the crossover to Arrhenius behavior renders the theory generally applicable only at relatively weak degrees of supercooling [7,62].
Interestingly, the crossover temperature $T_{\mathrm{MCT}}$, like the theory it is named after, emerges repeatedly as an important and almost universal characteristic temperature for liquid dynamics near the glass transition. In fact, apart from the Kauzmann temperature $T_K$ and the experimental glass transition temperature $T_g$ itself, the mode-coupling temperature is the only one to appear so consistently. Physically, $T_{\mathrm{MCT}}$ is often interpreted as the temperature below which structural reorganizations become dominated by collective 'activated' or 'hopping' events instead of non-cooperative relaxation [1,66-68]. Relatedly, around this crossover temperature, the potential energy landscape manifestly loses all its delocalized unstable modes, suggesting that the crossover is caused by a localization transition [69]. In the random first order transition theory scenario, this is interpreted as a transition to a 'mosaic' of local metastable states [50]. The above observations lead to the belief that the breakdown of MCT at $T_{\mathrm{MCT}}$ coincides with a physical change in the behavior of glassy liquids. Consequently, a clear understanding of this breakdown is vital in order to advance towards a more accurate, and ultimately exact, theoretical description of the glass transition.
Many attempts have been made to rectify MCT, but these have either been largely fruitless, at least in a qualitative sense, or they have abandoned the first-principles approach in favor of ad hoc corrections to change the predicted scenario. These efforts include (but are not limited to) extended MCT [70,71], generalized MCT [72-75] and its off-diagonal cousin [76,77], and more formal theories [78,79]. There is a large divide between these different approaches, not only in the method by which they attempt to improve upon the theory, but also in the choice of the specific MCT approximations they seek to address. The reason for this disunity of the field is mainly that the various approximations made in the derivation of MCT are notoriously unintuitive, technical, and uncontrolled, rendering it difficult to decide which approximation should be improved in the first place. Given that there is no consensus on which of the MCT approximations should be addressed, it is not surprising that there also exists no agreement on the method by which to do so. Moreover, during the conception of the theory it has been suggested that the MCT approximations should be treated as one entangled set [80]; while this may be understandable from a purely technical point of view, it makes it even more obscure how to move forward.
Here we present a fundamentally new approach to investigate the validity and failures of MCT. Instead of heuristically comparing its predictions with experimental and simulation observations (which has been the approach thus far [32,60,61,81-88]), we rigorously disentangle the different MCT approximations made in the derivation of the theory to reveal its inner workings. Arguably this approach has been deemed too challenging in the past due to the technicalities involved, yet here we show that it is now clearly within computational reach. Specifically, we identify and critically assess five different approximations within the MCT derivation: (i) neglecting projected dynamics; (ii) the projection on density doublets; (iii) the diagonalization approximation; (iv) the factorization approximation; and (v) the convolution approximation. We compute the relevant terms before and after applying each of these approximations directly from simulations of a frequently used model liquid, unambiguously judging their validity. Our approach thus exposes the anatomy of microscopic MCT, allowing us to rule out a complete class of MCT improvements and providing much-needed guidance for the development of a more accurate first-principles theory of the glass transition.

Exact theory of the dynamics of colloidal liquids
Let us first specify our system of interest. We consider the dynamics of a three-dimensional colloidal fluid of $N$ particles. The position $\mathbf{r}_i$ of particle $i$ evolves according to the overdamped Langevin equation [89]

$$\dot{\mathbf{r}}_i = \zeta^{-1}\left[\mathbf{F}_i + \boldsymbol{\xi}_i(t)\right], \tag{1}$$

in which $\zeta$ is the friction coefficient, $\mathbf{F}_i$ is the potential force acting on particle $i$, and $\boldsymbol{\xi}_i(t)$ is a random force that satisfies $\langle\boldsymbol{\xi}_i(t)\rangle = 0$ and

$$\langle\boldsymbol{\xi}_i(t)\,\boldsymbol{\xi}_j(t')\rangle = 2 k_B T \zeta\,\mathbf{I}\,\delta_{ij}\,\delta(t-t').$$

We denote the thermal energy by $k_B T$. For the interaction term we use the repulsive Weeks-Chandler-Andersen potential (see Appendix B.1 for details). In order to assess the MCT approximations in the cleanest possible test case, we avoid any potentially confounding effects due to polydispersity or non-additivity [81], and hence we focus on a simple monodisperse system in the liquid regime. The particle trajectories generated by Eq. (1) contain in principle the full dynamics of the system, but one can equivalently consider the joint probability distribution $P(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)$, which specifies the probability density of finding each particle $i$ in a volume $d\mathbf{r}_i$ centered around $\mathbf{r}_i$ at time $t$. In equilibrium, when the probability density is time-independent, $P(\mathbf{r}_1, \ldots, \mathbf{r}_N)$ defines an ensemble average of some observable $A$ as

$$\langle A\rangle = \int d\mathbf{r}_1 \ldots d\mathbf{r}_N\, A(\mathbf{r}_1, \ldots, \mathbf{r}_N)\,P(\mathbf{r}_1, \ldots, \mathbf{r}_N). \tag{2}$$

The probability density function formally evolves in time according to the Smoluchowski equation $\dot{P}(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) = \Omega P(\mathbf{r}_1, \ldots, \mathbf{r}_N, t)$, in which $\Omega$ is the Smoluchowski operator

$$\Omega = \frac{k_B T}{\zeta}\sum_{i=1}^{N}\nabla_i\cdot\left(\nabla_i - \frac{\mathbf{F}_i}{k_B T}\right).$$

This operator will become important when defining the MCT approximations below.
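To make the equation of motion concrete, the following is a minimal sketch of a single Euler-Maruyama integration step for overdamped Langevin dynamics. It is not the simulation code used in this work: the function name `brownian_step`, the callable `force_fn`, and the default parameter values are illustrative assumptions, and the actual integrator and its parameters are those described in Appendix B.

```python
import numpy as np

def brownian_step(r, force_fn, dt, zeta=1.0, kT=1.0, rng=None):
    """Advance positions r (shape (N, 3)) by one Euler-Maruyama step.

    force_fn is a hypothetical callable returning the potential forces
    (same shape as r); zeta is the friction coefficient, kT the thermal energy.
    """
    rng = np.random.default_rng() if rng is None else rng
    D0 = kT / zeta                                  # bare diffusion coefficient
    drift = force_fn(r) / zeta * dt                 # deterministic displacement
    noise = np.sqrt(2.0 * D0 * dt) * rng.standard_normal(r.shape)
    return r + drift + noise

# Usage: free diffusion of 1000 particles for a single step.
r0 = np.zeros((1000, 3))
r1 = brownian_step(r0, lambda r: np.zeros_like(r), dt=1e-3,
                   rng=np.random.default_rng(0))
```

With the forces switched off, each Cartesian displacement has variance $2 D_0\,\Delta t$, which provides a quick sanity check of the noise amplitude.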
In the context of dense liquids and the glass transition, we are mainly interested in the structural relaxation dynamics of the liquid. A standard probe for such structural relaxation is the intermediate scattering function

$$F(k,t) = \frac{1}{N}\left\langle \rho^*_{\mathbf{k}}\, e^{\Omega^\dagger t} \rho_{\mathbf{k}} \right\rangle.$$

Here $\rho_{\mathbf{k}} = \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j}$ is a density mode, i.e. the Fourier transform of the microscopic density at wave vector $\mathbf{k}$ ($k = |\mathbf{k}|$). The operator $\Omega^\dagger$ is the Hermitian conjugate of the Smoluchowski operator, which does not act on the probability distribution $P$ in the definition of the ensemble average. The initial condition of the intermediate scattering function is the static structure factor $S^{(2)}(k) \equiv F(k, t=0)$, where we have added the superscript (2) to clarify that this is a two-point density correlation function. Note that for isotropic liquids such as the one considered in this work, $F(k,t)$ and $S^{(2)}(k)$ depend only on the magnitude $k$ of the wave vector. Both $F(k,t)$ and $S^{(2)}(k)$ can be readily obtained from scattering experiments or computer simulations and are therefore also widely studied in theories and experiments of dense liquids [90].
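As an illustration of how the intermediate scattering function can be estimated from stored particle trajectories, a minimal sketch follows. The array layout `(frames, particles, 3)`, the function names, and the averaging over time origins are assumptions made for this example; in a periodic simulation box the wave vector should additionally be commensurate with the box, which is not enforced here.

```python
import numpy as np

def density_mode(positions, k):
    """rho_k = sum_j exp(i k . r_j) for positions (..., N, 3) and k of shape (3,)."""
    return np.exp(1j * (positions @ k)).sum(axis=-1)

def intermediate_scattering(traj, k, max_lag):
    """Estimate F(k, t) = <rho_k^*(0) rho_k(t)> / N from traj of shape (T, N, 3).

    The average runs over all available time origins; max_lag must be
    smaller than the number of frames T.
    """
    n_frames, n_particles, _ = traj.shape
    rho = density_mode(traj, np.asarray(k, dtype=float))    # shape (T,)
    F = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        corr = np.conj(rho[: n_frames - lag]) * rho[lag:]
        F[lag] = corr.real.mean() / n_particles
    return F
```

The zero-lag entry `F[0]` is the corresponding estimate of the static structure factor $S^{(2)}(k)$ at this single wave vector; in practice one would also average over wave vectors of equal magnitude.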
In order to obtain an exact equation of motion for the density modes $\rho_{\mathbf{k}}(t)$ and their associated correlation function $F(k,t)$, we use the operator formalism of Mori and Zwanzig [91,92]. The basic principle is to decompose the space of dynamical variables into a resolved subspace, which is spanned by the density modes, and an unresolved subspace containing all other dynamical variables. Briefly, we perform this decomposition by introducing a projector $\mathcal{P} = \rho_{\mathbf{k}}\rangle\langle\rho^*_{\mathbf{k}}\rho_{\mathbf{k}}\rangle^{-1}\langle\rho^*_{\mathbf{k}}$ that projects onto the space spanned by the density modes, and the associated orthogonal projector $\mathcal{Q} = 1 - \mathcal{P}$. For technical reasons unique to Brownian systems (elaborated in Appendix A), we also need a second exact projection step with the projectors $\mathcal{P}'$ and $\mathcal{Q}'$. The framework enables us to write a generalized Langevin equation for the density modes such that the only coupling between the resolved and the unresolved space is contained in the so-called fluctuating force $R_{\mathbf{k}}(t)$ and a memory kernel $K(t)$ that describes the time-autocorrelation function of the fluctuating force. By multiplying the resulting equation with $\rho^*_{\mathbf{k}}(t)$ and taking an ensemble average, we find the following equation of motion for the intermediate scattering function [8,93,94]:

$$\frac{\partial F(k,t)}{\partial t} + \frac{D_0 k^2}{S^{(2)}(k)}F(k,t) + \int_0^t d\tau\, K(k,t-\tau)\,\frac{\partial F(k,\tau)}{\partial \tau} = 0, \tag{5}$$

in which $D_0 = k_B T/\zeta$ is the bare diffusion coefficient. Here, the memory kernel $K(k,t)$ is the time-autocorrelation function of the fluctuating force $R_{\mathbf{k}}$, propagated with the projected operator $e^{\Omega^\dagger \mathcal{Q}'' t}$, where $\mathcal{Q}'' = \mathcal{Q}\mathcal{Q}'$; the fluctuating force itself involves the direct correlation function $c^{(2)}(k)$ [90]. Equation (5) provides an exact description of the dynamics of a dense liquid. Explicitly, if we were able to compute the exact memory kernel $K(k,t)$, the exact density correlation function $F(k,t)$ could be obtained. However, the kernel poses a major theoretical bottleneck, since there exists no general theory that allows for an exact prediction of $K(k,t)$. It is the aim of mode-coupling theory to approximate this memory kernel such that it can be evaluated in a self-consistent manner. The first difficulty in treating $K(k,t)$ lies in the fact that the exact kernel evolves according to a different differential equation than standard observables, that is, it evolves with $e^{\Omega^\dagger \mathcal{Q}'' t}$ instead of the standard $e^{\Omega^\dagger t}$. This means that it is non-trivial to compute it either from theory or from standard particle-based simulations [95,96].
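For orientation, the way the direct correlation function enters the fluctuating force can be sketched in a few lines. The derivation below assumes the common convention $R_{\mathbf{k}} = \mathcal{Q}\,\Omega^\dagger\rho_{\mathbf{k}}$, with $D_0 = k_BT/\zeta$, $\beta = 1/k_BT$, and $\rho$ the number density; the definitions actually used in this work are those of Appendix A and may differ in prefactors or sign conventions.

```latex
\begin{align}
\Omega^\dagger &= D_0 \sum_{j=1}^{N} \left(\nabla_j + \beta\,\mathbf{F}_j\right)\cdot\nabla_j, \\
\Omega^\dagger \rho_{\mathbf{k}} &= D_0 \sum_{j=1}^{N}
  \left( i\beta\,\mathbf{k}\cdot\mathbf{F}_j - k^2 \right) e^{i\mathbf{k}\cdot\mathbf{r}_j}, \\
\mathcal{P}\,\Omega^\dagger \rho_{\mathbf{k}} &=
  \rho_{\mathbf{k}}\,
  \frac{\langle \rho^*_{\mathbf{k}}\,\Omega^\dagger \rho_{\mathbf{k}} \rangle}
       {\langle \rho^*_{\mathbf{k}}\,\rho_{\mathbf{k}} \rangle}
  = -\frac{D_0 k^2}{S^{(2)}(k)}\,\rho_{\mathbf{k}}, \\
R_{\mathbf{k}} = \mathcal{Q}\,\Omega^\dagger \rho_{\mathbf{k}} &=
  D_0 \sum_{j=1}^{N}
  \left( i\beta\,\mathbf{k}\cdot\mathbf{F}_j - \rho\,k^2 c^{(2)}(k) \right)
  e^{i\mathbf{k}\cdot\mathbf{r}_j}.
\end{align}
```

The third line uses $\langle\rho^*_{\mathbf{k}}\Omega^\dagger\rho_{\mathbf{k}}\rangle = -N D_0 k^2$ and $\langle\rho^*_{\mathbf{k}}\rho_{\mathbf{k}}\rangle = N S^{(2)}(k)$, and the last line uses $1/S^{(2)}(k) = 1 - \rho c^{(2)}(k)$. Under these assumptions, a kernel of the form $\langle R^*_{\mathbf{k}}\, e^{\Omega^\dagger\mathcal{Q}''t} R_{\mathbf{k}}\rangle/(N D_0 k^2)$ has units of inverse time, as needed for a kernel that multiplies the time derivative of $F(k,t)$ in the equation of motion.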
Nonetheless, it is possible to write an exact integral equation for the memory kernel, Eq. (8), by using the Dyson operator identity [97]. In this equation, both $K_{\Omega^\dagger}(k,t)$ and $W(k,\tau)$ are functions that evolve with standard Brownian dynamics and can thus be measured directly from simulations. Their precise definitions are given in Appendix A. Once these functions are measured, we can numerically solve the integral equation of Eq. (8) to find the exact memory kernel $K(k,t)$ governing the full dynamics. This exact kernel will serve as our benchmark in order to assess the quality of the various MCT approximations.
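The numerical solution of such an integral equation can be sketched as follows. The example assumes a Volterra equation of the second kind of the form $K(t) = K_{\Omega^\dagger}(t) + \int_0^t d\tau\, W(\tau) K(t-\tau)$ on a uniform time grid; this form, including the sign in front of the integral, is an assumption made for illustration, and the precise equation and the definition of $W$ are those of Appendix A. The function name `solve_volterra` is likewise hypothetical.

```python
import numpy as np

def solve_volterra(K0, W, dt):
    """Solve K(t) = K0(t) + int_0^t W(tau) K(t - tau) dtau by the trapezoidal rule.

    K0 and W are sampled on the same uniform grid with spacing dt.
    The assumed form of the equation is illustrative only.
    """
    K0 = np.asarray(K0, dtype=float)
    W = np.asarray(W, dtype=float)
    K = np.empty_like(K0)
    K[0] = K0[0]                       # the integral vanishes at t = 0
    for n in range(1, len(K0)):
        # Interior trapezoid points plus the known end point at tau = t_n.
        conv = np.dot(W[1:n], K[n - 1:0:-1]) + 0.5 * W[n] * K[0]
        # The end point at tau = 0 contains the unknown K[n]; solve for it.
        K[n] = (K0[n] + dt * conv) / (1.0 - 0.5 * dt * W[0])
    return K
```

Because the equation is causal, the kernel is obtained in a single forward sweep over the time grid; the cost scales quadratically with the number of time points.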
As a consistency check, we have verified our procedure for obtaining the exact memory kernel by inserting the calculated $K(k,t)$ from Eq. (8) back into Eq. (5). The resulting $F(k,t)$ can then be compared to $F(k,t)$ measured directly from the same simulations. This comparison is made in Fig. 1, where the full lines are the direct measurement and the dashed lines are the solutions of Eqs. (8) and (5) at the location of the main peak of the structure factor. These results show that the intermediate scattering function is indeed very faithfully recovered by our procedure, confirming that the obtained $K(k,t)$ is a very accurate reconstruction of the exact memory kernel and thus an accurate benchmark for MCT. Numerical details of this procedure are presented in Appendix B.2.
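The consistency check described above requires solving the equation of motion for a given memory kernel. A minimal sketch of such a solver is given below, assuming the equation of motion has the form $\dot{F}(k,t) + [D_0 k^2/S^{(2)}(k)]F(k,t) + \int_0^t d\tau\,K(k,t-\tau)\dot{F}(k,\tau) = 0$ and using a simple first-order explicit scheme. This is not the algorithm of Appendix B.2; the function name `solve_gle` and the discretization are illustrative choices.

```python
import numpy as np

def solve_gle(kernel, S_k, k, D0, dt):
    """Integrate dF/dt + (D0 k^2 / S_k) F + int_0^t K(t - tau) dF/dtau dtau = 0.

    kernel holds K(k, t) on a uniform grid with spacing dt; F(k, 0) = S_k.
    """
    kernel = np.asarray(kernel, dtype=float)
    n_steps = len(kernel)
    gamma = D0 * k**2 / S_k
    K_mid = 0.5 * (kernel[:-1] + kernel[1:])   # kernel averaged over each interval
    F = np.empty(n_steps)
    F[0] = S_k
    dF = np.zeros(n_steps - 1)                 # increments F[m+1] - F[m]
    for n in range(n_steps - 1):
        # Memory integral: past increments weighted by the time-shifted kernel.
        memory = np.dot(K_mid[:n][::-1], dF[:n])
        F[n + 1] = F[n] - dt * (gamma * F[n] + memory)
        dF[n] = F[n + 1] - F[n]
    return F
```

Setting the kernel to zero recovers the exponential decay of a liquid without memory, while a positive kernel slows the relaxation down.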

Approximations of the memory kernel
Having established the exact equation of motion and the exact memory kernel, we now proceed to assess the validity of various approximations made to the memory kernel within the framework of MCT. In order to do so, we follow the standard MCT derivation [3,8] and evaluate the memory kernel after each main step in the derivation. For completeness we also present the full derivation of MCT in Appendix A. Our key result is presented in Fig. 2, which shows the obtained approximate memory kernels, as well as the corresponding intermediate scattering functions, for the highest and lowest temperatures considered in this work. All comparisons are made at the wave number corresponding to the main peak of the static structure factor, i.e. $k = 7.0$. This wave number is chosen as it corresponds to the typical distance between nearest neighbors, and hence to the typical cage size; within MCT, this length scale governs the cage effect and is deemed the most important for structural relaxation [98]. In the next subsections, we discuss each of the MCT approximations in order of appearance in the derivation.

Figure 1: The solid lines correspond to $F(k,t)$ obtained from direct simulation measurements, while the black dashed lines correspond to the numerical solutions of the exact equations (5) and (8). This comparison serves to validate our numerical procedure to extract the exact memory kernel via Eq. (8).

Neglecting projected dynamics
In the exact memory kernel, the fluctuating force $R_{\mathbf{k}}$ is propagated in time using the operator $e^{\Omega^\dagger \mathcal{Q}'' t}$. Unfortunately, the presence of the orthogonal projector $\mathcal{Q}''$ renders the time evolution of the memory kernel physically non-intuitive and mathematically intractable, since it does not behave in accordance with the same physical laws that underlie the normal Brownian dynamics of microscopic observables (which evolve with $e^{\Omega^\dagger t}$). There exists some analytical work for simple systems expanding $e^{\Omega^\dagger \mathcal{Q}'' t}$ in polynomials of $\mathcal{Q}''$, which thus can be applied to provide increasingly accurate expressions for the memory kernel [99,100]. However, within mode-coupling theory, the approximation $e^{\Omega^\dagger \mathcal{Q}'' t} = e^{\Omega^\dagger (1-\mathcal{P}'') t} \approx e^{\Omega^\dagger t}$ is employed to keep the theory tractable. We refer to this approximation as neglecting the projected dynamics. Note that the neglect of $\mathcal{P}''$ in the propagator is also trivially required after the final MCT approximation is made (see Sec. 3.4) [3]. In the present work, however, we treat the neglect of $\mathcal{P}''$ as the first explicit MCT approximation, as it can be imposed separately from later MCT approximations. This first step implies that $K(k,t) \approx K_{\Omega^\dagger}(k,t)$, where $K_{\Omega^\dagger}(k,t)$ is the same function that appears in the integral equation of Eq. (8) for the exact memory kernel.
Figure 2 shows both the exact memory kernel $K(k,t)$ and the approximate kernel without projected dynamics, $K_{\Omega^\dagger}(k,t)$, at the wave number corresponding to the main peak of the static structure factor. It is clear that when $t \to 0$, the two kernels become equal. While this is mathematically trivial, it can also be physically understood by the realization that the fluctuating force $R_{\mathbf{k}}$ resides in the subspace orthogonal to the density modes, implying that projections onto the orthogonal subspace have no effect when applied directly. However, as $t$ increases, the influence of the part of the fluctuating force that evolves into the space spanned by density modes grows, resulting in a slower initial decay of the true memory kernel compared to the one with standard Smoluchowski dynamics. The result is that $K_{\Omega^\dagger}(k,t)$ corresponds to more liquefied short-time dynamics than $K(k,t)$.

Figure 2: The memory kernel and associated intermediate scattering functions for our colloidal liquid at several steps in the derivation of mode-coupling theory. The top panels show the memory kernels at $T = 1.0$ (left) and $T = 3.0$ (right) as a function of time. Here, $K$ is the exact memory kernel, $K_{\Omega^\dagger}$ is the kernel with unprojected dynamics, and $K_{\mathrm{offdiag}}$, $K_{\mathrm{gmct}}$, $K_{\mathrm{mct}}$, and $K^{(2)}_{\mathrm{mct}}$ are the memory kernels after the doublet projection, diagonalization, factorization, and convolution approximations, respectively. For the meaning of the error bar in the off-diagonal kernel, we refer to Appendix B.2. The inset is a zoom-in of the final relaxation behavior. The bottom panels show the intermediate scattering functions at the same temperatures as a function of time according to each of the different memory kernels, obtained by solving the corresponding generalized Langevin equation. The black dashed line indicates the exponential relaxation that corresponds to a liquid with no memory. In the bottom left panel, the inset shows the intermediate scattering functions obtained from $K(t)$, from $K^{(2)}_{\mathrm{mct}}$, and from a fully self-consistent solution of the mode-coupling equations using only structure as input, denoted as $F^{(2)}_{\mathrm{scmct}}$.
Interestingly, the relaxation of the kernel $K_{\Omega^\dagger}$ at long times is in fact slower than that of the exact memory kernel, developing a very low shoulder that delays the final relaxation process. This can be seen most clearly in the insets of Fig. 2. The corresponding intermediate scattering functions $F_{\Omega^\dagger}(k,t)$ reveal the same pattern, i.e. after an initially faster decay, they ultimately relax over a longer time scale than the exact $F(k,t)$ at both temperatures. This clearly implies that the orthogonal dynamics are not simply "rescaled" regular dynamics, and that the neglect of $\mathcal{P}''$ thus introduces a non-trivial modification of the full time-dependent dynamics.

Projection on density doublets
The next step in the derivation of mode-coupling theory is to project the memory kernel on the space spanned by doublets of density modes. The motivation for this projection is that, next to the singlet density modes $\rho_{\mathbf{k}}$ which we have explicitly included as our resolved variables in the Mori-Zwanzig formalism, the most important dynamic observables for structural relaxation are products of two density modes $\rho_{\mathbf{k}}\rho_{\mathbf{k}'}$ [101]. As we have not included them directly in the theory, their effects must still be contained within the memory kernel, and can thus be extracted by projecting it on the space of all density doublets

$$Q^{(2)}_{\mathbf{q},\mathbf{q}'} = \rho_{\mathbf{q}}\rho_{\mathbf{q}'} - \left\langle \rho^*_{\mathbf{q}+\mathbf{q}'}\,\rho_{\mathbf{q}}\rho_{\mathbf{q}'} \right\rangle \left\langle \rho^*_{\mathbf{q}+\mathbf{q}'}\,\rho_{\mathbf{q}+\mathbf{q}'} \right\rangle^{-1} \rho_{\mathbf{q}+\mathbf{q}'}.$$

The quantity $Q^{(2)}_{\mathbf{q},\mathbf{q}'}$ contains only the part of the doublets that is orthogonal to the space of density singlets [102]. This procedure is formally exact [74]. However, instead of the exact transformation, we opt to project on doublets $\rho_{\mathbf{q}}\rho_{\mathbf{q}'}$ that are not orthogonalized with respect to the singlet subspace. This procedure is approximate (see Appendix A for additional details), and we refer to it as the doublet projection approximation. After carrying out the projection, we obtain what we call the off-diagonal memory kernel $K_{\mathrm{offdiag}}(k,t)$ of Eq. (9), which contains only contributions originating from the space of density doublets. Note that here the density modes evolve in time with normal dynamics $e^{\Omega^\dagger t}$. Equation (9) involves a vertex, which can be interpreted as a static coupling constant for wave vectors $\mathbf{k}$ and $\mathbf{q}$. The inverse four-point structure factor appears here as the normalization of the density-doublet projector such that it is idempotent. While the time-dependent off-diagonal four-point density correlation function in Eq. (9) can be evaluated numerically, its static inverse is unfortunately more problematic. The reason is that it is defined only implicitly, as the inverse of the static four-point structure factor with respect to its wave-vector arguments, rendering the problem of finding it an intractably large linear algebra problem, if it exists at all [102,103]. To proceed, we therefore simplify the vertices by neglecting the off-diagonal terms, retaining only the terms in the sum for which $\mathbf{q}'' = \mathbf{q}$ and $\mathbf{q}'' = \mathbf{k} - \mathbf{q}$. This also causes the normalization term to factorize, yielding the approximated vertices of Eq. (14), which are expressed in terms of the direct correlation functions $c^{(2)}$ and $c^{(3)}$. These can be related to the corresponding structure factors: for the two-point functions, $1/S^{(2)}(k) = 1 - \rho c^{(2)}(k)$, and an analogous relation connects $c^{(3)}$ to the three-point structure factor $S^{(3)}$. It is important to note that we have now made two independent approximations in this step: the projection on density doublets and the diagonalization of the inverse four-point structure factor in the vertex. We currently have no direct means to separate the effects of these two approximations. Fortunately, however, there is a way in which we can indirectly estimate the validity of the static diagonalization approximation in isolation, namely by considering the $t = 0$ limit of the dynamic four-point correlations. We shall revisit this point in Sec. 4.
To assess the overall quality of this step, we measure the off-diagonal kernel of Eq. (9) with the approximated vertices of Eq. (14) from our simulation data. The results in Fig. 2 clearly show that this step causes a significant overestimation of the memory, resulting in an error in the relaxation time of an order of magnitude. The size of this error seems to increase as the temperature is lowered, suggesting that the discrepancy might become more severe as the glass transition is approached. Moreover, the shoulder that is already visible in the memory kernel $K_{\Omega^\dagger}$ is much more pronounced in the off-diagonal kernel. The presence of this shoulder suggests that the relaxation of the off-diagonal memory kernel consists of a slow and a fast relaxation process, where the slow process spuriously causes an overestimation of the structural relaxation time.

Diagonalization
Up to this point, we have expressed the approximate memory kernel in terms of static system properties $\rho$, $D_0$, $S^{(2)}(k)$, and $S^{(3)}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)$, and a time-dependent part given by the off-diagonal four-point correlation function $F^{(4)}$. However, the off-diagonal form makes it inherently difficult to deal with this function [104-106]. To proceed, there are two possible approaches. Firstly, one may construct a separate equation of motion for $F^{(4)}$, which can be solved self-consistently with Eq. (5). This approach is called off-diagonal generalized mode-coupling theory [76,77]. The main drawback of this idea is that the integrals involved are difficult to evaluate numerically within reasonable computational time due to the large combinatorial space of wave vector arguments. The second approach is much more common [107] and is also used in classical mode-coupling theory: from the four-point function $F^{(4)}$, all off-diagonal terms are neglected. Specifically, we keep only two diagonal terms in the sum of Eq. (9), that is, the terms where $\mathbf{q}' = \mathbf{q}$ and $\mathbf{q}' = \mathbf{k} - \mathbf{q}$. Thus, upon diagonalization only those terms remain and all other terms vanish, which yields the memory kernel $K_{\mathrm{gmct}}(k,t)$. We stress that, similar to the diagonalization of the inverse four-point structure factor $[S^{(4)}]^{-1}$ in the vertices, this technical approximation is uncontrolled. The reason we choose to denote this memory kernel with the subscript 'gmct' is that this is the same kernel that appears in the first equation of the Generalized Mode-Coupling Theory (GMCT) hierarchy [73-75]. GMCT attempts to improve on MCT by retaining the diagonal four-point function explicitly and constructing a new equation of motion for it (i.e. avoiding factorization, which shall be treated in the next step, Section 3.4). Briefly, GMCT proceeds by re-applying the Mori-Zwanzig formalism using density doublets as the resolved variables and projecting the new fluctuating force on density triplets, yielding a six-point density correlation function in the new memory kernel. In principle this scheme can be continued for arbitrarily many density modes, creating an infinite hierarchy which can be truncated or solved self-consistently at arbitrary finite order. In this way, GMCT seeks to delay the factorization approximation of MCT. We note that in some works, the dynamic diagonalization and factorization are collectively referred to as "the factorization approximation", because the factorization of a four-point function implies its diagonalization. For clarity we keep them separate here.
Figure 2 shows that the diagonalization approximation of the dynamic four-point density correlation has a major effect on the predicted memory kernel: compared to the off-diagonal kernel of the previous step, the kernel is reduced by approximately a factor of two, and the relaxation is about an order of magnitude faster. This large effect of the diagonalization approximation can be seen as a confirmation that the approximation is inherently uncontrolled, but at the same time it also partially corrects for the significant overestimation error introduced in the previous step. Note also that the clear shoulder present in the off-diagonal kernel has disappeared, suggesting that the two relaxation processes seen in the decay of the off-diagonal memory can in fact be identified as a fast process characterized by diagonal density decorrelations and a slow off-diagonal contribution.

Factorization
The next and sometimes last step in the derivation of classical mode-coupling theory is to factorize the diagonal four-point function in terms of a product of two-point functions, which yields the memory kernel $K_{\mathrm{mct}}(k,t)$ of Eq. (18). We denote this memory kernel with the subscript 'mct' since it is the standard kernel that is widely used in microscopic mode-coupling theory [5]. Notably, this kernel is expressed in terms of the intermediate scattering functions $F$, and hence Eq. (5) now becomes a self-consistent equation. In order to test the factorization approximation, however, we do not solve Eqs. (5) and (18) self-consistently, but rather we evaluate the kernel of Eq. (18) directly from simulation data, similar to how the previous memory kernels were computed.
From Fig. 2 it can be seen that the data of $K_{\mathrm{gmct}}$ are identical to those of $K_{\mathrm{mct}}$ within our error margins. This forces us to conclude that, at least in the liquid regime, the factorization approximation of diagonal density correlations is very accurate and can be employed without caution. We have confirmed that ρ * show similar agreement in our simulations.
The validity of the factorization approximation is, in fact, not very surprising, since there exists a host of literature (e.g. [108]) showing that the four-point dynamic susceptibility $\chi^{(4)}$, i.e. the difference between the diagonal four-point correlation function and its factorized form, is of lower order in $N$ than these two terms themselves, which scale as $O(N^2)$. The notion that relative fluctuations are vanishingly small in the thermodynamic limit is typical in statistical physics. Since the fluctuations captured in $\chi^{(4)}$ are a direct measure for the error of the factorization approximation, one can readily infer that in the thermodynamic limit, the factorization approximation becomes exact. This statement holds throughout the supercooled phase as long as $\chi^{(4)}$ remains finite, which simulations indicate it does [108-110] (mode-coupling theory predicts it to diverge only at the ideal glass transition, but to remain finite everywhere else [35]). In this light it is hard to justify attempts to avoid or delay the factorization of four-point density correlations in cases where one is willing to diagonalize them.
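The kind of check behind this argument can be sketched with a toy calculation. The example below compares a static ($t = 0$) diagonal four-point average with the product of two-point averages for an ideal gas of uniformly distributed particles, which serves only as a stand-in for the interacting liquid studied here; the function name, the box size, and the choice of wave vectors are assumptions made for illustration.

```python
import numpy as np

def factorization_ratio(n_particles, n_samples, q, p, box=2 * np.pi, seed=0):
    """Ratio <|rho_q|^2 |rho_p|^2> / (<|rho_q|^2> <|rho_p|^2>) for an ideal gas.

    q and p should be commensurate with the periodic box and satisfy
    q != +-p; a ratio close to one means the four-point average factorizes.
    """
    rng = np.random.default_rng(seed)
    r = rng.uniform(0.0, box, size=(n_samples, n_particles, 3))
    rho_q = np.exp(1j * (r @ q)).sum(axis=1)     # one density mode per sample
    rho_p = np.exp(1j * (r @ p)).sum(axis=1)
    four_point = np.mean(np.abs(rho_q) ** 2 * np.abs(rho_p) ** 2)
    product = np.mean(np.abs(rho_q) ** 2) * np.mean(np.abs(rho_p) ** 2)
    return four_point / product

ratio = factorization_ratio(32, 20000,
                            q=np.array([1.0, 0.0, 0.0]),
                            p=np.array([0.0, 2.0, 0.0]))
```

For the ideal gas the ratio equals one up to sampling noise. In an interacting liquid the same estimator, evaluated with time-displaced density modes, quantifies the deviation from factorization discussed above.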

Convolution approximation
The last approximation usually employed in the derivation of MCT is the convolution approximation for the vertices [111], which simplifies the required static input for the theory. Although there are analytical results for the three-point direct correlation function $c^{(3)}$ of hard particles [112], the theory becomes more tractable if it only requires two-point functions as input. To this end, the convolution approximation is often made, setting $c^{(3)} = 0$ [113] (see [34,114,115] for notable exceptions). We add a superscript (2) to the mode-coupling theory memory kernel, writing $K^{(2)}_{\mathrm{mct}}(k,t)$, to indicate that structural triplet correlations are neglected. Note that we could also have made this approximation at any earlier point in the theory, but we conjecture that its effect is insensitive to when it is actually employed.
We show in Fig. 2 that the neglect of triplet correlations in the vertices only has a small quantitative effect on the dynamics, very weakly increasing or decreasing the predicted memory kernel and intermediate scattering functions depending on the temperature. Note that the lines for $K_{\mathrm{gmct}}(k,t)$, $K_{\mathrm{mct}}(k,t)$, and $K^{(2)}_{\mathrm{mct}}(k,t)$ lie very close together and are therefore hard to distinguish. This is clear evidence that the inclusion of triplet correlations is unnecessary for our colloidal liquid at the density and temperatures considered in this work.
The fact that the triplet correlations have no significant influence on the predicted dynamics indicates that, at least in the liquid regime studied in this work, the microscopic structure is still well described by two-point correlation functions only. In general, however, the validity of the convolution approximation depends highly on the material and state point studied. For example, the incorporation of triplet correlations is known to be more important for strong network-forming systems than for fragile models such as the one studied here [115], and supercooling to lower temperatures may also give rise to non-trivial higher-order structural features [116-121].
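For reference, the structure of the vertex once triplet correlations are dropped can be sketched as follows. The expression $V = \hat{\mathbf{k}}\cdot\mathbf{q}\,c^{(2)}(q) + \hat{\mathbf{k}}\cdot(\mathbf{k}-\mathbf{q})\,c^{(2)}(|\mathbf{k}-\mathbf{q}|)$ used here is the form commonly quoted for standard MCT; its normalization, and therefore its exact correspondence to Eq. (14) of this work, is an assumption, as are the function name and the callable `c2` supplying the direct correlation function.

```python
import numpy as np

def vertex_convolution(k_vec, q_vec, c2):
    """Standard MCT vertex with c3 = 0 (convolution approximation).

    c2 is a hypothetical callable returning the two-point direct
    correlation function at a given wave number.
    """
    k_vec = np.asarray(k_vec, dtype=float)
    q_vec = np.asarray(q_vec, dtype=float)
    p_vec = k_vec - q_vec                        # second wave vector, k - q
    k_hat = k_vec / np.linalg.norm(k_vec)
    return (k_hat @ q_vec) * c2(np.linalg.norm(q_vec)) \
        + (k_hat @ p_vec) * c2(np.linalg.norm(p_vec))

# Usage with a purely illustrative (not physical) direct correlation function.
v = vertex_convolution([0.0, 0.0, 7.0], [1.0, 2.0, 3.0], lambda q: -np.exp(-q))
```

The vertex is symmetric under the exchange of $\mathbf{q}$ and $\mathbf{k}-\mathbf{q}$, so each pair of wave vectors contributes twice to the mode-coupling sum, and only two-point structural input is required.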

Discussion
In this work we have explicitly resolved each of the main approximations comprising the standard mode-coupling theory of the glass transition, allowing us to unveil the effect of each consecutive approximation on the predicted memory kernel for a model colloidal liquid. In all cases, the approximate kernel could be benchmarked against the exact result, providing an unambiguous test for the theory's validity. Let us now discuss the relative importance of each MCT approximation step. Our main results, summarized in Fig. 2, clearly show that, apart from the factorization and convolution approximations, all approximations have a significant and non-trivial effect on the memory kernel. Specifically, the first three approximations affect both the absolute magnitude of the kernel and the time scales of decay, biasing the predictions either towards more liquid-like or more glassy dynamics and imposing qualitatively different decay patterns. Curiously, after the final MCT approximation is made, the predicted intermediate scattering function is closest to the exact dynamics, at least in the regime of dense (yet not supercooled) liquids studied in this work. We also point out that the MCT approximations which are virtually exact, i.e. factorization and convolution, are ironically the ones for which the most attempts have been made in the past to circumvent them.
Our work identifies two key approximations that manifestly impact the memory kernel the most, both of which involve neglecting off-diagonal four-point density correlations. Explicitly, when going from $K_{\Omega^\dagger}$ to $K_{\mathrm{offdiag}}$ we neglect the static off-diagonal terms in the vertices, and when going from $K_{\mathrm{offdiag}}$ to $K_{\mathrm{gmct}}$ we neglect the dynamic off-diagonal terms. These two steps coincide with a significant increase and decrease of the approximate memory kernel, respectively. However, recall that the diagonalization of the static four-point function only applies to its inverse $[S^{(4)}]^{-1}$ [see Eq. (14)], whereas for the dynamic case the diagonalization approximation is applied to the standard correlation function. We surmise that this causes the two diagonalization approximations to have, in effect, opposite signs. The combined result of both approximations is thus a cancellation of errors, either fortuitous or by design, which we believe also underlies, at least in part, the success of standard MCT.
As already mentioned in the introduction, it is well known that standard MCT has the general tendency to overestimate the glassiness of a system. Our results of Fig. 2 now allow us to expose precisely which step in the MCT derivation is responsible for this overestimation, namely $K_{\mathrm{offdiag}}$. Recall that at this step the projection onto density doublets is introduced, combined with our diagonalization of $[S^{(4)}]^{-1}$. Unfortunately, we are currently unable to directly separate the effects of these two approximations due to the great computational difficulties associated with evaluating the off-diagonal version of $[S^{(4)}]^{-1}$. However, we can provide indirect evidence that the static diagonalization, rather than the projection on doublets itself, is the more likely cause of the overestimation error of $K_{\mathrm{offdiag}}$. Briefly, we find that the diagonalization of the dynamic four-point function introduces a change in the predicted memory kernel of around 50% at $t \to 0$ (comparing $K_{\mathrm{offdiag}}$ with $K_{\mathrm{gmct}}$ at $T = 1.0$). We expect that applying this same approximation to the inverse four-point correlation function in each vertex should introduce at least a similar error in the opposite direction (going from $K_{\Omega^\dagger}$ to $K_{\mathrm{offdiag}}$). This is also consistent with what we observe in Fig. 2, leading us to believe that the main source of error in MCT ultimately stems from the neglect of off-diagonal density correlations.
In contrast to the method by which we have evaluated the MCT memory kernel, which is to use the intermediate scattering function $F$ obtained from simulations, the usual way to solve MCT is to do so self-consistently. That is, the two-point correlation function $F$ that appears in $K_{\mathrm{mct}}$ is chosen such that it satisfies Eq. (5) with $K_{\mathrm{mct}}$ as memory kernel. This self-consistency effectively magnifies the error made by MCT, since any small error propagates iteratively through both the kernel and $F$ itself. From the main results in Fig. 2 we can already infer that self-consistent MCT should yield an overestimation of the intermediate scattering function as compared to our directly calculated $F_{\mathrm{mct}}$. To see this, we write $K_{\mathrm{mct}} \equiv K_{\mathrm{mct}}[F]$ and note from our results that $F_{\mathrm{mct}} > F$ for all times, at least at low temperatures. It follows that $K_{\mathrm{mct}}[F_{\mathrm{mct}}] > K_{\mathrm{mct}}[F]$, and hence the overestimation error will be further increased in subsequent self-consistent iterations until convergence is reached. To numerically confirm that this is indeed the case, we show the self-consistent MCT solution $F^{(2)}_{\mathrm{scmct}}$ in the inset of the bottom left panel of Fig. 2. Explicitly, for our system at $T = 1$, self-consistent MCT predicts a relaxation time of $\tau_\alpha = 0.6$, whereas our measurements give $\tau_\alpha = 0.3$ for MCT, and the true relaxation time is only $\tau_\alpha = 0.2$. In addition to this overestimation, the self-consistency property also gives rise to the prediction of a spurious divergence of the relaxation time. Overall, our main results thus understate the severity of the errors made by self-consistent MCT.
Using our results, we have argued that there is little reason to attempt to improve MCT by delaying or avoiding the factorization approximation specifically. Nevertheless, such attempts have had significant success in recent years in the form of (diagonal) generalized mode-coupling theory (GMCT), as it typically improves upon the quantitative predictions of MCT [73-75,122,123]. In hindsight, we believe that these improvements are again fortuitous consequences of another cancellation-of-errors effect. In order to solve the equation of motion for the diagonalized four-body correlator $\langle \rho^*_k \rho^*_q \rho_k(t) \rho_q(t) \rangle$, several additional approximations are made within GMCT, whose effects seem to partially cancel the errors made by standard MCT, yielding quantitatively improved results. If, within GMCT, an exact equation of motion for diagonal four-body correlators in terms of the intermediate scattering function were employed, the theory would reproduce the factorization approximation in the thermodynamic limit, and the results of GMCT would therefore be equivalent to those of MCT.

Conclusion and outlook
In conclusion, we have unveiled the effect of each of the approximations that enter the mode-coupling theory derivation. Our results explicitly show that the success of standard MCT is rooted in a remarkable cancellation of errors, as conjectured earlier from a different perspective [124]. We have found that the diagonalization approximation in the statics and dynamics has the most significant impact on the predicted dynamics, as alluded to earlier [106]. It is clear from our results that any attempt to improve this approximation by including off-diagonal density correlations should treat both the statics and dynamics on an equal footing, lest the predictions of the theory be worsened.
In future research we aim to apply our methods to a more supercooled system in order to evaluate whether our conclusions hold when the glass transition is approached. Preliminary results suggest that they do. Similarly, it is still an open question to what degree our findings depend on the type of dynamics studied and on the fragility of the material in question. We believe that more work in this direction should inform a more systematic approach to improving one of the most promising theories of the glass transition.
One can envision several routes toward a more quantitative dynamical theory of the glass transition based on the Mori-Zwanzig approach. Firstly, the effects of the projected dynamics can be dealt with rigorously in a self-consistent manner. Indeed, it is not hard to express $W(k, t)$ directly in terms of $F(k, t)$ and its derivatives, which can then be used in conjunction with Eq. (8) to include the effects of projected dynamics. Secondly, in order to include off-diagonal four-point correlations into the theory, novel numerical schemes may be employed, especially efficient integration and inversion routines, to evaluate the off-diagonal memory kernel [106]. To alleviate the computational costs, one may also seek to restrict the full wave vector space to only the most important off-diagonal correction terms. In this regard, including e.g. only the wave vectors corresponding to the main peaks of $S^{(4)}$ [121] might already provide a reasonable improvement. Finally, formal diagrammatic corrections to MCT have been derived [54,78]. Numerical integration or analytic analysis of the diagrams involved may provide invaluable new insights into the microscopic dynamics of the glass transition.

and the correlation function between the fluctuating force and Since the latter two quantities evolve in accordance with the standard evolution operator $e^{\Omega^\dagger t}$, we can measure them directly from particle-resolved Brownian dynamics simulations. The results are shown in Fig. 3. The first of the two can be rewritten as in which we have introduced $h_k = i \sum_j (\mathbf{k} \cdot \mathbf{F}_j) e^{i \mathbf{k} \cdot \mathbf{r}_j}$. We denote this quantity as $K_{\Omega^\dagger}$ seeing that it is equivalent to the irreducible memory kernel $K$ evolving with the Smoluchowski operator The latter two equations show that these functions have many terms in common (in fact, their definitions differ only by a $Q$), which is reflected in their remarkably similar decay as presented in Fig. 3(c,d). Additionally, it can be shown that both the memory kernel itself and its time integral should scale with $k^2$ for small $k$ [3,125]. While Eq. (A.10) does show the correct scaling for the kernel itself, at first glance it does not for its time integral (each term scales with $k^0$). Nevertheless, together with Eqs. (A.11) and (A.7), the correct scaling should be recovered. Thus we expect that the $k^0$-scalings of each of the terms in the time integral of Eq. (A.10) cancel to yield an overall correct $k^2$ proportionality. The long-wavelength (small-$k$) limit is an interesting regime, but we leave it for future study. Its investigation in more supercooled systems could shed light on the failure of MCT to produce a breakdown of the Stokes-Einstein relation.

Now that we have measured the functions $W(k, t)$ and $K_{\Omega^\dagger}(k, t)$, we can solve the integral equation Eq. (8) numerically, and insert the result into Eq. (5). The intermediate scattering function found by this method can be compared to the one directly measured from the same simulations. This comparison is made in Fig. 1.
In order to derive the mode-coupling equation, we follow the main text and start with the irreducible memory kernel As a first step, we replace the orthogonal dynamics with standard dynamics, yielding The second step is to project on density doublets, in which the projection operator $P_2$ projects on the space of all density doublets, i.e.
The prefactor is included to prevent over-counting. Substituting (A.14) into (A.13), we find Inserting the definition of the Smoluchowski operator and integrating by parts, it is not hard to show that (A.16) This leads us to define the vertex so that the memory kernel reduces to Next, we neglect all the off-diagonal elements of the inverse four-point structure factor in the vertex, keeping only $p = q$ and $p = k - q$: (2) (q) − k S (3) (−q, q − k, k) S (2) (k) . (A.19)
This approximation is fully uncontrolled. We then factorize the diagonal inverse structure factor into the product of two two-point functions, yielding As we discuss in the main text, this step is exact in the thermodynamic limit. Lastly, we apply a convolution approximation to $S^{(3)}$, neglecting $c^{(3)}$, and find where we have used the definition of the direct correlation function $1/S^{(2)}(k) = 1 - \rho c^{(2)}(k)$. To arrive at the final mode-coupling theory equation, the dynamical off-diagonal memory kernel can be diagonalized and factorized as indicated in the main text.
If, instead, we had projected on the set of doublets orthogonal to the space of singlets $\rho_{q+q'}$, we would have found in which the vertices contain the normalization factor of the orthogonalized projection operator. When the diagonalization approximation is applied to this expression, the last three terms become subdominant relative to the diagonal four-point intermediate scattering function, and can be neglected, simplifying the expression to that given in Eq. (17). The omission of the last three terms, together with our use of a factorized normalization of the projection operator, mathematically constitutes the error of what we have referred to as the doublet projection approximation.

B Numerical details

B.1 Brownian dynamics simulations
The results from this work are obtained from trajectories of Brownian dynamics simulations performed with the LAMMPS software package [126]. We use a purely repulsive single-component system of Weeks-Chandler-Andersen type, characterized by the pair interaction potential $U(r) = 4\varepsilon\left[(\sigma/r)^{12} - (\sigma/r)^{6} + 1/4\right]$ for $r/\sigma < 2^{1/6}$ and $U(r) = 0$ for all other $r$. Here, $r$ is the inter-particle distance, $\sigma = 1$ describes the particle size and $\varepsilon = 1$ is the interaction strength. All results are presented in terms of these units. The simulations contain $N = 2000$ particles confined within a cubic box with number density $\rho = 0.95$ and periodic boundary conditions. At the lowest temperature studied, $k_B T = 1$, this system has a liquid-solid coexistence region for $\rho \in (0.96, 1.03)$ [127], which we just stay below.
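As an illustration of the interaction described above, the following minimal sketch evaluates the standard Weeks-Chandler-Andersen pair potential and the corresponding radial force in reduced units. The function names `wca_potential` and `wca_force` are hypothetical and are not taken from LAMMPS or from the simulation scripts used in this work.

```python
import math

# WCA cutoff in units of sigma: the position of the Lennard-Jones minimum.
R_CUT = 2.0 ** (1.0 / 6.0)


def wca_potential(r, sigma=1.0, epsilon=1.0):
    """Lennard-Jones potential shifted up by epsilon and truncated at its minimum."""
    if r >= R_CUT * sigma:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6) + epsilon


def wca_force(r, sigma=1.0, epsilon=1.0):
    """Radial force -dU/dr; positive values are repulsive."""
    if r >= R_CUT * sigma:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


if __name__ == "__main__":
    for r in (0.9, 1.0, 1.1, 1.2):
        print(r, wca_potential(r), wca_force(r))
```

Both the potential and the force vanish continuously at the cutoff, which is what makes the interaction purely repulsive.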
We integrate the Brownian equations of motion, Eq. (1), using a time step of $\Delta t = 10^{-5}$ and friction coefficient $\zeta = 1$. First, we equilibrate the system for $10^7$ time steps and we subsequently run an equal number of steps for production. During the production run, we save the particle positions on a quasi-logarithmic grid in order to compute the time-dependent quantities. For each of the 4 temperatures studied, we run 50 independent simulations, allowing us to take proper ensemble averages.
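To make the integration step concrete, the sketch below shows a single Euler-Maruyama update for overdamped Brownian dynamics, assuming Eq. (1) has the usual form $\zeta \dot{\mathbf{r}}_j = \mathbf{F}_j + \boldsymbol{\xi}_j$ with Gaussian white noise of variance $2 k_B T \zeta$. This is only an illustration: the integrator actually used by LAMMPS may differ in detail, periodic boundary conditions are omitted, and the name `brownian_step` is hypothetical.

```python
import math
import random


def brownian_step(positions, forces, dt=1e-5, zeta=1.0, kT=1.0, rng=random):
    """One Euler-Maruyama step of overdamped Langevin dynamics.

    positions, forces: lists of (x, y, z) tuples in reduced units.
    Returns a new list of positions (no periodic wrapping is applied).
    """
    # Standard deviation of the random displacement per Cartesian component.
    noise_amp = math.sqrt(2.0 * kT * dt / zeta)
    new_positions = []
    for r, f in zip(positions, forces):
        new_positions.append(tuple(
            r[a] + f[a] * dt / zeta + noise_amp * rng.gauss(0.0, 1.0)
            for a in range(3)
        ))
    return new_positions


if __name__ == "__main__":
    rng = random.Random(1)
    pos = [(0.0, 0.0, 0.0)]
    for _ in range(5):
        pos = brownian_step(pos, [(0.0, 0.0, 0.0)], rng=rng)
    print(pos)
```

With the parameters quoted above, the typical random displacement per step is $\sqrt{2 \Delta t} \approx 4.5 \times 10^{-3}\sigma$ at $k_B T = 1$, small compared with the particle size.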

B.2 The exact memory kernel
In order to solve the integral equation (8) to find the exact memory kernel, we first obtain $K_{\Omega^\dagger}(k, t)$ and $W(k, t)$ at $k = 7.0$ from the simulation trajectories. To do so, we straightforwardly evaluate their definitions, Eqs. (A.10) and (A.11), whereby we average over all 50 independent simulation trajectories, over all allowed wave vectors in the range $k \in (7.0 \pm 0.1)$, and over a small number of time origins. Because the evaluation of Eq. (8) is highly sensitive to noise, we additionally apply a locally estimated scatterplot smoothing (LOESS) filter with polynomial degree 2 and a smoothing parameter of 0.1 [128]. The resulting smoothed functions are inserted in a discretized version of Eq. (8), for which we have used a non-equidistant Simpson's rule on a logarithmic grid [129]. The memory kernel is subsequently found by solving the resulting system of equations.
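The quadrature ingredient of this procedure can be sketched as follows. The function below implements the textbook composite Simpson's rule for irregularly spaced samples (a parabola through each consecutive triple of points) and applies it to a logarithmically spaced time grid. It is a generic illustration under the assumption of an even number of intervals, not necessarily the exact scheme of Ref. [129], and the name `simpson_nonuniform` is hypothetical.

```python
import math


def simpson_nonuniform(x, f):
    """Composite Simpson's rule for samples f[i] at strictly increasing, unevenly spaced x[i].

    Requires an even number of intervals (odd number of points).
    """
    n = len(x) - 1  # number of intervals
    if n < 2 or n % 2 != 0:
        raise ValueError("need an even number of intervals (odd number of points)")
    total = 0.0
    for i in range(0, n, 2):
        h0 = x[i + 1] - x[i]
        h1 = x[i + 2] - x[i + 1]
        hsum = h0 + h1
        # Exact integral of the parabola through the three points.
        total += hsum / 6.0 * (
            (2.0 - h1 / h0) * f[i]
            + hsum * hsum / (h0 * h1) * f[i + 1]
            + (2.0 - h0 / h1) * f[i + 2]
        )
    return total


if __name__ == "__main__":
    # Logarithmic grid from 1e-5 to 10 with ten points per decade.
    t = [1e-5 * 10.0 ** (j / 10.0) for j in range(61)]
    values = [math.exp(-ti) for ti in t]
    print(simpson_nonuniform(t, values), math.exp(-1e-5) - math.exp(-10.0))
```

On such a grid the spacing grows with $t$, so the rule resolves the fast initial decay of the correlation functions without wasting points at long times.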
To validate the obtained memory kernel, we insert it into Eq. (5), which is solved by the method presented by Fuchs and coworkers [130]. The resulting intermediate scattering function is compared with the measured one in Fig. 1.

B.3 Off-diagonal memory kernel
To simplify the computation of the off-diagonal memory kernel, Eq. (9), we write it in terms of the function $B(k, t) = \sum_q \rho_q(t) \rho_{k-q}(t) V_{k,q}$. We compute this function $B(k, t)$ at the peak of the static structure factor for all $t$ by explicitly performing the sum over all allowed wave vectors $q$ up to some cutoff $k_c$. The auto-correlation function of the result yields the off-diagonal memory kernel. We have found that the cutoff required for convergence of this memory kernel is much larger than that needed for diagonal memory kernels. In particular, the cutoff chosen for this memory kernel is temperature dependent and given by $k_c = 83.0, 84.7, 86.2, 88.3$ for $T = 1.0, 1.5, 2.0, 3.0$, respectively. These values are obtained by computing the off-diagonal memory kernel as a function of this cutoff from one set of simulation trajectories and performing a sinusoidal fit to the data. This procedure is illustrated in Fig. 4. The one-sided error bars in Fig. 2 are set equal to the amplitude of the fitted sine wave, providing an overestimation of the convergence error in the off-diagonal memory kernel.

The triplet correlation function $c^{(3)}(k, q)$ appearing in the vertices contributes significantly only for small values of $k$ and $q$ [113]. Therefore we set it equal to zero for values of $q > 10.0$, saving computation time and decreasing the amount of noise. For simulations closer to the glass transition, where triplet correlations may play a more dominant role, it might be necessary to increase this cutoff, or forgo it completely. We use the same cutoff for the triplet correlations in the diagonalized kernels.
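The sum over "allowed" wave vectors refers to the reciprocal lattice of the periodic simulation box. The sketch below enumerates these vectors up to a cutoff and evaluates a collective density mode $\rho_q = \sum_j e^{i \mathbf{q} \cdot \mathbf{r}_j}$, which is the basic ingredient of $B(k, t)$. The vertex $V_{k,q}$ is not included, and the function names are hypothetical rather than taken from the analysis code of this work.

```python
import cmath
import math


def allowed_wave_vectors(box_length, k_cut):
    """All nonzero q = (2*pi/L) * (nx, ny, nz) with |q| <= k_cut for a cubic periodic box."""
    dq = 2.0 * math.pi / box_length
    n_max = int(k_cut / dq)
    vectors = []
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                if (nx, ny, nz) == (0, 0, 0):
                    continue
                if dq * math.sqrt(nx * nx + ny * ny + nz * nz) <= k_cut:
                    vectors.append((dq * nx, dq * ny, dq * nz))
    return vectors


def density_mode(q, positions):
    """Collective density mode rho_q = sum_j exp(i q . r_j)."""
    return sum(
        cmath.exp(1j * (q[0] * x + q[1] * y + q[2] * z)) for x, y, z in positions
    )


if __name__ == "__main__":
    box = (2000 / 0.95) ** (1.0 / 3.0)  # box length for N = 2000, rho = 0.95
    qs = allowed_wave_vectors(box, 2.0)
    print(len(qs), abs(density_mode(qs[0], [(0.0, 0.0, 0.0), (1.0, 0.5, 0.2)])))
```

For $N = 2000$ and $\rho = 0.95$ the box length is $L \approx 12.8$, so the wave vector spacing is $2\pi/L \approx 0.49$ and a cutoff of $k_c \approx 83$ corresponds to roughly $2 \times 10^7$ wave vectors, which illustrates why the off-diagonal kernel is so much more expensive than the diagonal ones.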

B.4 GMCT and MCT memory kernels
The diagonal memory kernels $K_{\mathrm{gmct}}$, $K_{\mathrm{mct}}$, and $K^{(2)}_{\mathrm{mct}}$ converge faster than the off-diagonal one. This allows us to decrease the cutoff wave number of the sums in their definitions to $k_c = 40.0$, equal to that used in many numerical implementations of the standard MCT equations. The convergence error decreases as $t$ increases, because the intermediate scattering function, and thereby the integrand, decays as $F(k, t) \sim e^{-D_0 k^2 t}$ for small $t$ and large $k$. We estimate that at $t = 10^{-3}$ and $t = 10^{-2}$, the relative convergence errors are at most 5% and 0.2%, respectively. For smaller $t$, the error is larger, but the influence of the memory kernel on the dynamics at such short time scales is negligible.

Figure 1: Intermediate scattering functions $F(k, t)$ for our colloidal liquid as a function of time $t$ for different temperatures, evaluated at the location of the main peak of the static structure factor ($k = 7.0$). The solid lines correspond to $F(k, t)$ obtained from direct simulation measurements, while the black dashed lines correspond to the numerical solutions of the exact equations (5) and (8). This comparison serves to validate our numerical procedure to extract the exact memory kernel via Eq. (8).

Figure 3: Time-correlation functions as a function of time $t$ at the peak of the static structure factor $k\sigma = 7.0$ obtained by direct simulation measurement. Panel (a) shows the correlation between the density modes $\rho_k$ and the momentum averaged longitudinal stress $h_k$, and (b) displays the auto-correlation function of that stress. In (c) and (d) we show $W(k, t)$ and $K_{\Omega^\dagger}(k, t)$, which together determine the irreducible memory kernel as expressed by Eq. (8) of the main text. The data are extracted from Brownian dynamics simulations of a Weeks-Chandler-Andersen system at number density $\rho = 0.95$ for four different temperatures above the crystallization transition.

Figure 4: The off-diagonal memory kernel as a function of the cutoff value in the sum from the definition of $B(k, t)$ for different values of $t$ at $T = 1.0$. The high-$k_c$ data are fitted with a sinusoid to estimate the maximal convergence error. The vertical dashed line indicates the cutoff used in this work.