Classicality with(out) decoherence: Concepts, relation to Markovianity, and a random matrix theory approach

Answers to the question of how a classical world emerges from underlying quantum physics are revisited, connected and extended as follows. First, three distinct concepts are compared: decoherence in open quantum systems, consistent/decoherent histories and Kolmogorov consistency. Second, the crucial role of quantum Markovianity (defined rigorously) to connect these concepts is established. Third, using a random matrix theory model, quantum effects are shown to be exponentially suppressed in the measurement statistics of slow and coarse observables despite the presence of a large amount of coherence. This is also numerically exemplified, and it highlights the potential and importance of non-integrability and chaos for the emergence of classicality.


Introduction
It is an obvious everyday fact that the world around us does not show direct quantum effects: we can safely disregard the wave-like behaviour of matter and do not need to worry about the effects of measurement backaction. But this causes a conundrum because our everyday world is built out of particles that are fundamentally quantum. Studying the emergence of classicality from underlying quantum physics is thus of foundational importance, but also has great practical relevance for the realization of quantum technologies. Yet, the question of how to prove the emergence of classicality, and the prior question of what needs to be proved, are not fully settled. The present paper aims to clarify and extend the discussion, and it is divided into two parts. The first part (Sec. 2) is about clarifications and makes two contributions. The first contribution is to give a focused overview of three different approaches to the question of what needs to be proved. These are the decoherence approach in open quantum systems (OQS) [1-4], the consistent/decoherent histories formalism [5,6], and a recent approach based on the notion of Kolmogorov consistency [7-15]. The second contribution emphasizes the crucial role played by the rigorous definition of (multi-time) quantum Markovianity [16-18] (for introductions see Refs. [13,14]), which connects the decoherence approach in OQS to the other two approaches.
The second part (Sec. 3) is about how to prove classicality by extending recent research results [7,8,15]. Therein it has been observed that isolated many-body systems can behave classically even in the presence of a large amount of coherence, and that non-integrability and chaos might be the key to understanding this behaviour. In particular, Ref. [15] argued that this is a generic effect for a large class of observables (specified in greater detail later on) and estimated that deviations from classical behaviour are exponentially small in the system size. Here, we confirm this estimate and provide an alternative derivation of it in Sec. 3.1, which is inspired by the idea of modelling complex isolated quantum systems by random matrix theory [19-24]. Moreover, a simple model is used to illustrate important features of this new approach to classicality in Sec. 3.2, before concluding in Sec. 4.
For completeness, we remark that there are alternative explanations of classicality, which we will not study here. For instance, classicality is sometimes explained by gravity as the fundamental cause for decoherence [25] or by collapse theories that directly modify Schrödinger's equation [26]. However, we are here interested in explanations within the conventional framework of non-relativistic quantum mechanics, where gravity or collapse theories cannot play any role. Moreover, the semiclassical limit of large action $S/\hbar \gg 1$ (formalized by taking $\hbar \to 0$ [27], among other limiting procedures related to temperature, mass, angular momentum, etc.) is often invoked to explain classicality. While this provides an important consistency check, we here avoid such limiting procedures (after all, decoherence is a major obstacle to building a quantum computer, and one can certainly not claim that a quantum computer operates in the high temperature limit). Finally, for fundamental criticism of decoherence we refer the reader to Leggett [28], the importance of decoherence for macroscopic objects was discussed in Refs. [29-34], and for a general criticism of prevailing notions of classicality in the cosmological context see Ref. [35]. Furthermore, a topic which we will only briefly touch on is quantum Darwinism [36-38], which further refines the notion of decoherence in OQS.

Definition used in this work
For any discussion of the quantum-to-classical transition, it is important to precisely define what "classical" behaviour means. One here hits the first obstacle as the boundary between the quantum and classical world is not one-dimensional: depending on the problem different boundaries can be drawn.
For instance, one legitimate way to define classicality could be to ask whether a state of a bipartite quantum system obeys all Bell inequalities or not. This definition, however, only reveals static quantum features of a state, and does not allow one to draw any conclusion about how classicality emerges from an underlying quantum description.
Here, we are precisely interested in this emergence and ask whether a process, i.e., an experimentally well-defined procedure to access the time evolution of a quantum system [13,14], can look classical. To be precise, consider an isolated quantum system with (time-independent) Hamiltonian $H$ prepared in some state (density matrix) $\rho_0$. Let $X = \sum_{x=1}^{M} \lambda_x \Pi_x$ be some observable with eigenvalues $\lambda_x$, eigenprojectors $\Pi_x$ and $M$ denoting the total number of distinct eigenvalues (measurement outcomes). The probability to measure $x_n, \dots, x_1$ at times $t_n > \dots > t_1$ is
$$p(x_n, \dots, x_1) = \mathrm{tr}\{\Pi_{x_n} U_n \cdots \Pi_{x_1} U_1 \rho_0 U_1^\dagger \Pi_{x_1} \cdots U_n^\dagger \Pi_{x_n}\} \qquad (1)$$
with $U_k \equiv e^{-iH(t_k - t_{k-1})}$ the unitary time evolution operator between two times ($\hbar \equiv 1$, with $t_0$ the preparation time). Note that Eq. (1) can be experimentally reconstructed by performing $n$ repeated measurements on a quantum system and by repeating this procedure many times to create sufficient statistics. Next, pick some $k \in \{1, \dots, n-1\}$ and define $p(x_n, \dots, \cancel{x}_k, \dots, x_1)$ to be the same probability as in Eq. (1) except that no measurement is performed at time $t_k$ (and thus no outcome $x_k$ is recorded), which is indicated by the notation $\cancel{x}_k$ and obtained from Eq. (1) by dropping the two projectors $\Pi_{x_k}$. Then, the process is classical if the following "probability sum rule" is satisfied for all $k < n$ and all $n > 1$ (up to some error much smaller than the considered probabilities):
$$\sum_{x_k} p(x_n, \dots, x_k, \dots, x_1) = p(x_n, \dots, \cancel{x}_k, \dots, x_1). \qquad (2)$$
In words, a process is classical if marginalizing over measurement results is identical to not measuring at any given time $t_k$. Since measurements can be disturbing in quantum mechanics, even on average, the validity of Eq. (2) signifies the absence of quantum effects from the perspective of measuring $X$. An example violating Eq. (2) is the famous double slit experiment, see Fig. 1. The following facts further support the idea that this is a good definition of classicality (though, as emphasized above, not the only one). First, observe that Eq. (2) defines a classical stochastic process [39], where it is also known as the Kolmogorov consistency condition. Classicality as considered here therefore has a clear operational meaning, which was also used in Refs. [7-15]: a process is classical if (at least in principle) a classical stochastic process can be used to generate the same measurement statistics. The idea of defining classicality in this way is rooted in a "blackbox-mentality": there might be some very expensive quantum computer in front of you, but if the available measurement statistics can be simulated, or emulated, with a classical stochastic process, then the measurement statistics alone do not allow you to draw the conclusion that there is anything quantum going on in the computer. Furthermore, this definition of classicality also has a clear practical motivation because classical stochastic processes are much easier to analyse and simulate than quantum stochastic processes.
Figure 1: The double slit experiment, where a coherent source of particles $\rho_0$ hits a detection screen at position $x_2$ after passing a wall with two holes (the double slit). (a) The particle's location $x_1$ is also measured at the double slit, which allows one to decide through which slit it passed (corresponding trajectories indicated by dashed lines). No interference pattern is seen on the detection screen, not even after averaging over $x_1$. (b) There is no measurement of the particle's location at the double slit; it thus retains its coherent wave-like properties and an interference pattern emerges. Clearly, Eq. (2) is violated: the dynamics is non-classical.
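The probability sum rule of Eq. (2) can be checked numerically in a two-level analogue of the double slit. The following is only a minimal sketch under assumptions that are not part of the original text: a single qubit, projectors onto the $\sigma_z$ eigenstates, two measurement times ($n = 2$, $k = 1$), and the hypothetical helper names `evolve`, `joint_prob` and `sum_rule_violation`. Outcomes are passed in chronological order $(x_1, x_2)$.

```python
import numpy as np

def evolve(H, t):
    """Return exp(-i H t) for Hermitian H (hbar = 1) via eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

def joint_prob(rho0, H, projectors, times, outcomes):
    """Eq. (1) in chronological order; an outcome of None means 'not measured'."""
    rho, t_prev = np.asarray(rho0, dtype=complex), 0.0
    for t, x in zip(times, outcomes):
        U = evolve(H, t - t_prev)
        rho = U @ rho @ U.conj().T
        if x is not None:
            rho = projectors[x] @ rho @ projectors[x]
        t_prev = t
    return float(np.real(np.trace(rho)))

def sum_rule_violation(rho0, H, projectors, t1, t2):
    """Largest |sum_{x1} p(x2, x1) - p(x2, no x1)| over x2, i.e. Eq. (2) for n = 2."""
    M = len(projectors)
    worst = 0.0
    for x2 in range(M):
        marginal = sum(joint_prob(rho0, H, projectors, (t1, t2), (x1, x2))
                       for x1 in range(M))
        unmeasured = joint_prob(rho0, H, projectors, (t1, t2), (None, x2))
        worst = max(worst, abs(marginal - unmeasured))
    return worst

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
projectors = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]  # eigenprojectors of sz
rho_up = np.diag([1.0, 0.0])      # |0><0|
rho_plus = np.full((2, 2), 0.5)   # |+><+|, maximally coherent in the sz basis

# H = sx rotates between the two outcomes: measuring at t1 disturbs the statistics.
v_quantum = sum_rule_violation(rho_up, sx, projectors, np.pi / 4, np.pi / 2)
# H = sz commutes with the projectors: Eq. (2) holds despite the initial coherence.
v_classical = sum_rule_violation(rho_plus, sz, projectors, np.pi / 4, np.pi / 2)
print(v_quantum, v_classical)
```

In this toy setting the first case violates Eq. (2) by an amount of order one, whereas the second case satisfies it even though the initial state is a coherent superposition. This illustrates that classicality in the sense of Eq. (2) is a statement about the process with respect to the chosen observable, not about the state alone.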
Moreover, Eq. (2) implies the validity of Leggett-Garg inequalities [40,41] and it is closely related but not equivalent to the conditions imposed in the consistent or decoherent histories formalism [5,6], which we review below in Sec. 2.3. Importantly, however, the definition of classicality used here does not hinge on any specific interpretation of quantum mechanics. Confirming Eq. (2) experimentally only requires measurements of $X$; no further hidden assumption is contained in its definition. Clearly, classicality is defined with respect to some observable $X$, i.e., a system that behaves classically with respect to $X$ can behave non-classically with respect to a different observable $Y$. Finally, notice that Eq. (2) could also be violated in a classical context, for instance, whenever an external agent (e.g., some observer or experimenter) actively intervenes in the process, e.g., by performing feedback control operations [13,14,42]. We exclude these scenarios here by definition of the probabilities in Eq. (1), which in the classical limit (replacing projectors on Hilbert space by characteristic functions on phase space) clearly obey Eq. (2).
In the remainder of this section, we first review the well known decoherence approach and ask whether it explains classicality according to the definition used here (Sec. 2.2). Afterwards, we comment on the relation to the perhaps less well known consistent or decoherent histories approach (Sec. 2.3). Finally, Sec. 2.4 concludes with an intuitive explanation why Eq. (2) can be satisfied for an isolated quantum system.

The decoherence approach for open quantum systems
We consider an open quantum system (OQS) $S$ coupled to some environment or bath $B$. The total Hilbert space is thus a tensor product $\mathcal{H}_S \otimes \mathcal{H}_B$ of the system and bath Hilbert spaces $\mathcal{H}_S$ and $\mathcal{H}_B$, respectively. The dynamics in the full system-bath space is unitary and generated by a Hamiltonian $H_{SB} = H_S + H_B + V_{SB}$ with $H_S$ ($H_B$) the system (bath) Hamiltonian alone (suppressing tensor products with the identity in the notation) and $V_{SB}$ their interaction. The reduced system state $\rho_S(t) = \mathrm{tr}_B\{\rho_{SB}(t)\}$ is obtained from a partial trace of the full system-bath state over the bath degrees of freedom. In contrast to $\rho_{SB}(t)$, $\rho_S(t)$ does not evolve in a unitary way. Decoherence happens whenever it is possible to identify a fixed special basis $\{|\psi_x\rangle\}$ in $\mathcal{H}_S$, which is called the pointer basis. The special role of this basis is to ensure that any initial OQS state $\rho_S(0)$ becomes diagonal in that basis after a characteristic (and typically very short) decoherence time $t_{\rm dec}$. In equations,
$$\rho_S(0) = \sum_{x,y} c_{x,y}(0)|\psi_x\rangle\langle\psi_y| \;\longrightarrow\; \rho_S(t) = \sum_x p_x(t)|\psi_x\rangle\langle\psi_x| \quad \text{for } t \ge t_{\rm dec}. \qquad (3)$$
Here, the $c_{x,y}(0) = \langle\psi_x|\rho_S(0)|\psi_y\rangle$ are complex numbers, which ensure positivity and normalization of $\rho_S(0)$ but are otherwise arbitrary, and the $p_x(t)$ are the probabilities to be in state $|\psi_x\rangle$ at time $t$. Equation (3) is indeed a remarkably robust prediction of OQS theory [1-4, 43, 44]. In particular, we repeat that the pointer basis is fixed, i.e., it does not depend on the initial system state, but it is determined by the system-bath Hamiltonian and the initial bath state (though the dependence on the latter should be mild in realistic situations). The pointer basis is also often described as "stable", "robust" or "objective" [1-4] and we come back to these properties below. We further add some clarifications.
First, for the sake of generality one should stress that the pointer basis might not be a basis of pure states $|\psi_x\rangle$, but rather a complete set of orthogonal projectors $\{\Pi^S_x\}$ acting on $\mathcal{H}_S$, where certain projectors can have a rank greater than one [45]. In that case, there exist "decoherence-free subspaces" (caused, e.g., by additional conservation laws), but they do not change the fundamental point of our discussion and we continue to call $\{\Pi^S_x\}$ the pointer basis for simplicity. Moreover, we here assume pointer states to be orthogonal, which is typically the case for finite dimensional OQS, but pointer states can be non-orthogonal (e.g., coherent states of a harmonic oscillator [46]). Second, in Eq. (3) we allowed the probabilities $p_x(t)$ to be time-dependent. Their change, however, typically happens on a time scale much slower than the decoherence time scale (see, e.g., Ref. [47]) and is called "dissipation", a phenomenon already known from classical open systems. Third, we remark that a more nuanced presentation of decoherence is possible. For instance, in order to determine the measurement basis, Zurek in his seminal paper was actually interested in the decoherence of the measurement apparatus, which was in turn coupled to the system to be measured and an environment [48]. However, the measurement apparatus is also an OQS, and for the remainder of this paper it is not necessary to explicitly distinguish between system and measurement apparatus. In the following we call the phenomenology explained above OQS decoherence to distinguish it from the decoherent histories mentioned later.
Next, we ask whether decoherence explains the emergence of classicality according to Eq. (2) if applied to a system observable $X_S = \sum_{x=1}^{M} \lambda_x \Pi^S_x$, i.e., an observable commuting with the pointer basis and acting trivially on the bath space. To this end, we first confirm that for all times larger than the decoherence time $t_{\rm dec}$ we have
$$\mathcal{D}\rho_S(t) \equiv \sum_x \Pi^S_x \rho_S(t) \Pi^S_x = \rho_S(t), \qquad (4)$$
where $\mathcal{D}$ is a dephasing operation in the pointer basis. Thus, measuring and averaging is identical to not measuring. Next, let us additionally assume that Eq. (4) holds for the full system-bath state:
$$\sum_x \Pi^S_x \rho_{SB}(t) \Pi^S_x = \rho_{SB}(t). \qquad (5)$$
If that is the case, one can confirm our definition of classicality, i.e., the validity of Eq. (2). However, Eq. (4) does not imply Eq. (5), even though the converse is true. Thus, OQS decoherence does not imply classical measurement statistics according to Eq. (2).
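That Eq. (4) can hold while Eq. (5) fails is already visible in a static two-qubit example. The sketch below is an illustration under our own assumptions (one system qubit, one bath qubit, a maximally entangled state, rank-one pointer projectors in the computational basis); the helper names `dephase_system` and `trace_out_bath` are hypothetical.

```python
import numpy as np

def dephase_system(rho, dim_s, dim_b):
    """Apply sum_x Pi_x rho Pi_x with rank-one Pi_x = |x><x| acting on the system only."""
    r = rho.reshape(dim_s, dim_b, dim_s, dim_b).copy()
    for x in range(dim_s):
        for y in range(dim_s):
            if x != y:
                r[x, :, y, :] = 0.0  # delete coherences between pointer states x and y
    return r.reshape(dim_s * dim_b, dim_s * dim_b)

def trace_out_bath(rho, dim_s, dim_b):
    """Partial trace over the bath (second tensor factor)."""
    return np.einsum('ibjb->ij', rho.reshape(dim_s, dim_b, dim_s, dim_b))

# System-bath state (|0>|0> + |1>|1>)/sqrt(2); ordering is system (x) bath.
psi = np.zeros(4, dtype=complex)
psi[0] = psi[3] = 1 / np.sqrt(2)
rho_sb = np.outer(psi, psi.conj())
rho_s = trace_out_bath(rho_sb, 2, 2)

eq4_holds = np.allclose(dephase_system(rho_s, 2, 1), rho_s)     # reduced state
eq5_holds = np.allclose(dephase_system(rho_sb, 2, 2), rho_sb)   # full state
print(eq4_holds, eq5_holds)
```

Here the reduced state is diagonal in the pointer basis, so dephasing it has no effect, while dephasing the system part of the full state destroys the system-bath coherences. This is the gap between Eq. (4) and Eq. (5) referred to in the text.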
It is thus worth thinking about which condition on top of decoherence could imply classicality, and it seems that two different strategies are conceivable.
The first strategy takes a more detailed look at the environmental degrees of freedom. Indeed, the validity of Eq. (5) is equivalent to having vanishing quantum discord [49] in the pointer basis, and testing Eq. (4) has been suggested as a tool to probe non-classical system-bath correlations [50]. However, deciding whether the system-bath state has zero quantum discord or not requires knowledge of the full system-bath state. This knowledge is unavailable experimentally and therefore the condition of zero quantum discord is inaccessible from an operational perspective of OQS theory. Moreover, the idea of having zero quantum discord is problematic from the perspective of having a unitarily evolving "universe" consisting of the system and the bath as explained later in Sec. 2.3.
However, a refinement of this idea is possible and has led to the recently much studied approach of quantum Darwinism, see Refs. [36-38] and references therein. In a nutshell, quantum Darwinism starts by dividing the bath into many different "fragments" $F \subset B$ and asserts that most fragments, even those of small size, have (close to) zero quantum discord with respect to the pointer basis, i.e., Eq. (5) applies to most states $\rho_{SF}(t_k) = \mathrm{tr}_{B\setminus F}\{\rho_{SB}(t_k)\}$, where $B \setminus F$ denotes all bath degrees of freedom except those of the fragment. The resulting classical correlations between the system $S$ and most fragments $F$ allow external observers to learn about the system state even by only looking at a small fragment $F$ of the bath, and different observers looking at different small fragments will agree about the state of $S$. Thus, an objective world emerges.
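The fragment condition can be made concrete with a GHZ-like "branching" state, which is a standard toy example and an assumption of this sketch rather than a model taken from the text. One system qubit is correlated with three bath qubits, and the fragment $F$ is a single bath qubit. The check used is the invariance under system dephasing, Eq. (5), applied to $\rho_{SF}$ and to the full state; the function name `dephase_first_qubit` is hypothetical.

```python
import numpy as np

def dephase_first_qubit(rho):
    """Remove coherences between |0> and |1> of the first qubit (the system)."""
    d = rho.shape[0] // 2
    out = rho.copy()
    out[:d, d:] = 0.0
    out[d:, :d] = 0.0
    return out

# |psi> = a |0>|000> + b |1>|111>, tensor indices (system, fragment, rest1, rest2).
a, b = np.sqrt(0.3), np.sqrt(0.7)
psi = np.zeros((2, 2, 2, 2), dtype=complex)
psi[0, 0, 0, 0] = a
psi[1, 1, 1, 1] = b

# Full system-bath state and the system-fragment state (rest of the bath traced out).
rho_sb = np.outer(psi.ravel(), psi.ravel().conj())
rho_sf = np.einsum('sfab,tgab->sftg', psi, psi.conj()).reshape(4, 4)

fragment_ok = np.allclose(dephase_first_qubit(rho_sf), rho_sf)
global_ok = np.allclose(dephase_first_qubit(rho_sb), rho_sb)
print(fragment_ok, global_ok)
```

In this example every proper fragment satisfies Eq. (5) and carries a perfect classical record of the pointer state, while the global pure state does not satisfy Eq. (5). Whether realistic interacting baths behave similarly is exactly the open point discussed in the text.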
For the important mechanism of photons scattered off some material object the idea of quantum Darwinism is indeed intuitively appealing because the scattered photons allow different observers, by looking at the object from different narrow angles, to infer, e.g., the same colour or position of it. Moreover, since photons are non-interacting and scatter off to infinity, it becomes clear that their detection does not change the future evolution of the object. As a consequence, Eq. (2) follows.
For other environments, however, the applicability of quantum Darwinism is less clear and whether it guarantees the emergence of classical measurement statistics is unknown at present. In condensed matter, for instance, the bath does not split into non-interacting fragments and perturbations might not be able to escape to infinity. In this case, quantum Darwinism will generically hold at most for transient times [51], yet objectivity and Kolmogorov consistency might nevertheless arise, as this work will indeed confirm, first within and later also without OQS decoherence.
Within the paradigm of OQS decoherence, this brings us to the second strategy implying classical behaviour. This strategy differs from the first by rejecting the idea that information about (fragments of) the bath is directly accessible or known. Instead, solely the degrees of freedom of the OQS are deemed operationally accessible, and it then becomes necessary to think about how one could locally decide whether the pointer states are stable, robust or objective. Historically, Zurek introduced for this purpose the "predictability sieve" [52], which requires computing the change in von Neumann entropy of the OQS state $\rho_S(t)$ as a function of a pure initial state $\rho_S(0) = |\psi(0)\rangle\langle\psi(0)|_S$. If it changes very slowly, the dynamics are predictable as the state remains approximately pure, but if it changes very rapidly, the dynamics are unpredictable as the state becomes very mixed. Now, from what we said above, we see that initial pointer states are characterized by a slow change in von Neumann entropy, whereas superpositions of pointer states quickly decohere into a mixture on a time scale $t_{\rm dec}$. The predictability sieve thus selects out the pointer states.
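As a minimal sketch of how the predictability sieve discriminates between initial states, consider a pure-dephasing toy model that we assume here for illustration only: populations in the pointer basis stay constant and all coherences decay as $e^{-t/t_{\rm dec}}$. The function names `von_neumann_entropy` and `dephased_state` are hypothetical.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S = -tr(rho ln rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop zeros, using 0 ln 0 = 0
    return float(-np.sum(evals * np.log(evals)))

def dephased_state(psi0, t, t_dec):
    """Toy model (assumption): coherences in the pointer basis decay as exp(-t/t_dec)."""
    rho = np.outer(psi0, psi0.conj())
    n = len(psi0)
    mask = np.eye(n) + np.exp(-t / t_dec) * (1 - np.eye(n))
    return rho * mask

pointer = np.array([1.0, 0.0], dtype=complex)                    # pointer state |0>
superposition = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)  # (|0> + |1>)/sqrt(2)

t_dec = 1.0
S_pointer = von_neumann_entropy(dephased_state(pointer, 10 * t_dec, t_dec))
S_super = von_neumann_entropy(dephased_state(superposition, 10 * t_dec, t_dec))
print(S_pointer, S_super, np.log(2))
```

In this model the pointer state stays pure, whereas the equal superposition approaches the maximal entropy $\ln 2$ within a few decoherence times, so ranking initial states by entropy production singles out the pointer states.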
Although the predictability sieve has appealing properties, it is ultimately not satisfactory for the following reason. If we want to find out whether something is predictable (or stable, robust or objective), it is best to really "take a look at it". For instance, the memories in our computers are stable because we can repeatedly read them out without changing their state. How can this idea be formalized mathematically? Clearly, one way to test this property is to measure the OQS in the pointer basis, say at some time $t_1 \ge t_{\rm dec}$, and then to check whether this measurement influences the future evolution of the OQS at some time $t_2 > t_1$, for instance, by checking whether the future probabilities of the pointer states depend on the measurement at time $t_1$. We now notice that this way of checking for predictability is exactly the same as testing our definition of classicality in Eq. (2) for the pointer basis $\{\Pi^S_x\}$. Clearly, other ways are possible, but within this second strategy they should always be related to watching the response to some form of external perturbation or intervention on the OQS: do the pointer states remain stable if we shake them a bit?
After having spelled out the basic idea, it remains mostly a technical problem to realize that OQS decoherence in the form of Eq. (3) plus the condition of Markovianity as defined in Ref. [16] (for introductions see Refs. [13,14]) is sufficient to imply classical measurement statistics. In short, this definition of Markovianity is based on the idea that local operations on the system performed by an external agent do not influence the OQS dynamics generated by the environment. Importantly, the property of Markovianity can be checked by local system operations only (no knowledge of the bath state is required) [13,14,16,17]. However, since the definition of Markovianity requires checking multi-time correlations (in complete analogy to the classical definition), knowledge of the time evolution of $\rho_S(t)$ alone is insufficient to check for Markovianity (a discussion focused on this point can be found in Ref. [18]).
The connection to Markovianity now becomes transparent by realizing that "shaking the system a bit" is an external intervention that will sooner or later also influence the environment. Can this influence on the environment cause a different behaviour of the system? If the answer is no, then this precisely means that the dynamics is Markovian. In that case, we can conclude the following. First, we found above that OQS decoherence implies that $\mathcal{D}\rho_S(t_1) = \rho_S(t_1)$ for $t_1 \ge t_{\rm dec}$. Obviously, one also has $\mathcal{I}\rho_S(t_1) = \rho_S(t_1)$, where $\mathcal{I}$ is the identity operation which, operationally speaking, literally means "do nothing!" Now, according to the definition of Markovianity explained above, the dynamics induced by the environment is insensitive to local operations on the system performed by an external agent. Since the two operations $\mathcal{D}$ and $\mathcal{I}$ both leave the OQS state unchanged, the future dynamics is insensitive to the dephasing operation, that is: the pointer states are stable and the dynamics is classical. Formal definitions and a proof are given in Appendix A.1.
This important message together with various other notions (some of which are only introduced in Sec. 2.3) is summarized in Fig. 2. Notably, the role of multi-time statistics to probe the stability of pointer states and its connection to Markovianity is not the focus of the OQS decoherence approach [1-4] and also not of quantum Darwinism [36-38], although, within the histories framework introduced below, connections (with varying degree of generality and rigour) have been made [53-56]. Moreover, a clear-cut consensus from numerical studies about the relation between non-Markovianity and quantum Darwinism has not yet emerged [57-59]. Thus, it seems worthwhile in the future to look for a closer connection between (non-)Markovianity and quantum Darwinism in physically relevant situations.
Finally, we make two more important observations. First of all, the question "what is the pointer basis?" is non-trivial and has been only answered in certain limiting cases (e.g., very strong or very weak system-bath coupling) [1-4]. In general, for a complex open many-body system coupled to a complex many-body environment the pointer basis is not known. It is an advantage of the approach presented in Secs. 2.4 and 3 of this paper that no pointer basis needs to be identified. Second, all that we said above was restricted to the OQS paradigm, i.e., local observables defined on a system-bath tensor product structure, whose identification can be non-trivial [60]. This restriction is also lifted in the present approach, which makes it appealing for questions usually studied within the formalism reviewed next.
Figure 2: [...] (7) it is currently not known whether also the inverse implication holds). Comments: (1) is explained at the end of Sec. 2.3 (and "happens only trivially" is the only term not defined rigorously in this diagram), (2) assumes a local system observable (discord is undefined without a system-bath tensor product structure) and is here understood as applying repeatedly for all times $t_k$ considered in Eq. (2), (3) is discussed around Eq. (5), (4) is easily shown with the definition in Sec. 2.3, (5) is explained in Sec. 2.2 and proven in Sec. A.1 (the strict implication follows from the existence of classical processes that are non-Markovian), (6) and (7) are shown in Sec. 2.3, (8) follows because the derivation of Leggett-Garg inequalities assumes Kolmogorov consistency [40,41].

Consistent and decoherent histories
The consistent or decoherent histories formalism is an attempt to explain how standard reasoning based on classical logic can be applied in an isolated quantum system in general, and in the cosmological Universe in particular [5,6,61-66]. As we will see, it is closely related to the Kolmogorov consistency criterion. It has also been viewed as a new interpretation of quantum mechanics [6,61,63], but this view has been fiercely debated [67]. We here prefer to remain agnostic about this question, and simply point out again that the mathematical definitions and relations introduced below make sense without reference to any particular interpretation of quantum mechanics. The approach starts by introducing a decoherence functional $D$ for two histories $x \equiv (x_n, \dots, x_2, x_1)$ and $y \equiv (y_n, \dots, y_2, y_1)$ "happening" at times $t_n > \dots > t_1$:
$$D(x; y) \equiv \mathrm{tr}\{\Pi_{x_n} U_n \cdots \Pi_{x_1} U_1 \rho_0 U_1^\dagger \Pi_{y_1} \cdots U_n^\dagger \Pi_{y_n}\}. \qquad (6)$$
Here, the projectors and unitary time evolution operators have the same meaning as in Eq. (1) and we immediately confirm that the diagonal elements of the decoherence functional correspond to our previously introduced joint probabilities: $p(x_n, \dots, x_1) = D(x; x)$. Depending on the precise reference, different notions of "consistency", "decoherence" or "(quasi)classicality" have been introduced based on the decoherence functional. It is beyond the scope of this article to review them all here, so we restrict the discussion to the two most commonly employed definitions. First, Griffiths originally proposed what we here call the consistent histories condition [61]:
$$\mathrm{Re}[D(x; y)] = 0 \quad \text{for all } x \neq y, \qquad (7)$$
i.e., the vanishing of the real part of the decoherence functional for different histories. Gell-Mann and Hartle, among others (see, e.g., Refs. [65,66] and references therein), prefer to use the following condition, which we call the decoherent histories condition:
$$D(x; y) = 0 \quad \text{for all } x \neq y. \qquad (8)$$
Three immediately obvious remarks follow. First, condition (8) implies Eq. (7). Second, $D(x; y) = 0$ always if $x_n \neq y_n$, i.e., the final "measurement results" cannot be different. Third, Eq. (7) implies the Kolmogorov consistency condition (2) (and hence so does Eq. (8)). Confirming the last result requires a few lines of algebra, but it was already shown by Griffiths [61] and many others and will thus not be repeated here.
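The decoherence functional is straightforward to evaluate numerically for small systems. The following sketch assumes, for illustration only, a single qubit with Hamiltonian $\sigma_x$, projectors onto the $\sigma_z$ eigenstates and two times; the helper names are hypothetical, and histories are passed in chronological order $(x_1, x_2)$, i.e., reversed relative to the notation in the text.

```python
import numpy as np

def evolve(H, t):
    """Return exp(-i H t) for Hermitian H (hbar = 1) via eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w * t)) @ v.conj().T

def decoherence_functional(rho0, H, projectors, times, hist_x, hist_y):
    """tr{Pi_{x_n} U_n ... Pi_{x_1} U_1 rho0 U_1^dag Pi_{y_1} ... U_n^dag Pi_{y_n}}."""
    op, t_prev = np.asarray(rho0, dtype=complex), 0.0
    for t, x, y in zip(times, hist_x, hist_y):
        U = evolve(H, t - t_prev)
        op = projectors[x] @ U @ op @ U.conj().T @ projectors[y]
        t_prev = t
    return complex(np.trace(op))

sx = np.array([[0, 1], [1, 0]], dtype=complex)
projectors = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
rho0 = np.diag([1.0, 0.0])           # |0><0|
times = (np.pi / 4, np.pi / 2)

d_diag = decoherence_functional(rho0, sx, projectors, times, (0, 0), (0, 0))
d_off = decoherence_functional(rho0, sx, projectors, times, (0, 0), (1, 0))
d_final = decoherence_functional(rho0, sx, projectors, times, (0, 0), (0, 1))
print(d_diag, d_off, d_final)
```

For these parameters the diagonal element reproduces the joint probability, the element with different final outcomes vanishes identically, and the off-diagonal element with equal final outcomes is real and non-zero. The latter means that already the weaker condition of vanishing real part is violated, in line with the violation of the probability sum rule for coherent qubit oscillations.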
A not so obvious conclusion is that the decoherent histories condition is strictly stronger than the consistent histories condition. This is explained with a result of Diósi [68], who considered two decoupled quantum systems $A$ and $B$ prepared in a decorrelated state $\rho_0 = \rho_A \otimes \rho_B$. In this situation, one immediately confirms that the joint decoherence functional factorizes as
$$D(x, x'; y, y') = D_A(x; y)\, D_B(x'; y'),$$
where unprimed (primed) histories refer to subsystem $A$ ($B$). Now, suppose that $A$ and $B$ separately satisfy the decoherent histories condition. Then, this is also the case for the non-interacting composite $AB$, as one would intuitively expect. However, this conclusion does not hold for the consistent histories condition, so the consistent histories condition cannot imply the decoherent histories condition. Diósi's argument is therefore typically invoked to say that Eq. (8) is a more meaningful condition than Eq. (7). Now, if we consider the probabilities for such a decoupled system, they factorize as expected: $p(x, x') = p_A(x)\, p_B(x')$. Interestingly, if $A$ and $B$ separately satisfy the Kolmogorov consistency condition, then this is also true for the composite $AB$. Thus, Diósi's argument cannot be invoked to refute our definition of classicality based on Eq. (2). Two further statements are noteworthy: First, what we said above implies that the decoherent histories condition is strictly stronger than the Kolmogorov consistency condition. Second, confirming the decoherent histories condition experimentally is obviously much harder than confirming the Kolmogorov consistency condition.
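The three composition statements above can be verified in a few lines. The following worked steps are our own elementary spelling-out of the argument, using only the factorization of the decoherence functional and of the probabilities for decoupled, decorrelated subsystems.

```latex
% Consistent histories, Eq. (7), is not preserved under composition:
% take x \neq y and x' \neq y' with Re D_A(x;y) = Re D_B(x';y') = 0. Then
\mathrm{Re}\big[D_A(x;y)\,D_B(x';y')\big]
  = \mathrm{Re}\,D_A\,\mathrm{Re}\,D_B - \mathrm{Im}\,D_A\,\mathrm{Im}\,D_B
  = -\,\mathrm{Im}\,D_A(x;y)\,\mathrm{Im}\,D_B(x';y'),
% which is non-zero in general.

% Decoherent histories, Eq. (8), is preserved:
% (x,x') \neq (y,y') means x \neq y or x' \neq y', so at least one factor vanishes,
D_A(x;y)\,D_B(x';y') = 0 .

% Kolmogorov consistency, Eq. (2), is preserved:
\sum_{x_k,\,x'_k} p_A(x_n,\dots,x_k,\dots,x_1)\,p_B(x'_n,\dots,x'_k,\dots,x'_1)
  = p_A(x_n,\dots,\cancel{x}_k,\dots,x_1)\,p_B(x'_n,\dots,\cancel{x}'_k,\dots,x'_1) .
```

The first line shows that purely imaginary off-diagonal elements in both subsystems combine into a real off-diagonal element of the composite, which is the origin of the asymmetry between Eqs. (7) and (8).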
Next, we turn to the relation between decoherent histories and OQS decoherence, which obviously has been already the topic of previous works, see, e.g., the above references and in particular also Refs. [69,70]. Quite intuitively, one would expect that histories defined by measurements in the pointer basis naturally satisfy the decoherent histories condition, i.e., that OQS decoherence generates decoherent histories, but it has been recognized that this relation is not that easy [69,70]. Indeed, since OQS decoherence alone is not sufficient to imply Kolmogorov consistency, it also cannot be sufficient to imply decoherent histories. Interestingly, the extra condition of Markovianity is again sufficient to show that OQS decoherence implies decoherent histories. The proof of this statement is given in Appendix A.1 (see also Refs. [53][54][55][56] for related work).
Finally, we turn to the question whether decoherence in a stronger, global sense can explain the emergence of decoherent histories. Here, global decoherence means that the unitarily evolving state is block diagonal with respect to the projectors $\{\Pi_x\}$, i.e., $\rho(t) = \sum_x \Pi_x \rho(t) \Pi_x$ (note that this implies zero quantum discord, Eq. (5), for local system projectors). If that is the case for all times $t_k$ appearing in definition (6) of the decoherence functional, one immediately confirms that the histories satisfy the decoherent histories condition. Unfortunately, however, if $\rho(t)$ is unitarily evolving this can in general only be the case for trivial situations. To see this, we restrict the discussion to pure states $|\psi(t)\rangle$, which is sufficient for isolated systems. Now, from $\sum_x \Pi_x = I$ we infer that every pure state can be written as $|\psi(t)\rangle = \sum_x \sqrt{p_x(t)}\, e^{i\varphi_x(t)} |\psi_x(t)\rangle$ with $p_x(t)$ the probability to measure outcome $x$ and $\Pi_y |\psi_x(t)\rangle = \delta_{x,y} |\psi_x(t)\rangle$. Next, notice that the only states $|\psi(t)\rangle$ that are block diagonal or globally decohered are states with $p_x(t) = \delta_{x,x^*}$ for some $x^*$, i.e., these states are fully localized in one subspace or, with respect to the measurement outcomes, we can say that they are deterministic. Now, this can certainly happen in some cases, for instance, if the dimension of the subspace $x^*$ dominates by far all other subspaces (which corresponds to the usual criterion of $x^*$ describing an equilibrium state in statistical mechanics), or if the times $t_k$ are carefully chosen such that $|\psi(t_k)\rangle$ is localized in one subspace. Moreover, if $\Pi_x$ commutes with the Hamiltonian, its probability remains constant and always generates classical statistics.
However, excluding globally conserved quantities, considering interesting nonequilibrium dynamics and rejecting the unrealistic idea that we are able to carefully choose the times $t_k$, the state $|\psi(t)\rangle$ cannot remain block diagonal. For instance, if the state has a high initial probability $p(x_0) \lesssim 1$ for some $x_0$ and a high probability for some final state $p(x_n) \lesssim 1$ with $x_n \neq x_0$, then there must be some intermediate time $t$ at which the state passes from $x_0$ to $x_n$ such that $p_{x_0}(t) = 1/2$. Thus, global decoherence can only happen under trivial or unrealistic circumstances.
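The statement that a pure state is block diagonal only if it is localized in a single subspace can be quantified. For $\rho = |\psi\rangle\langle\psi|$ the Frobenius norm of $\rho - \sum_x \Pi_x \rho \Pi_x$ equals $\sqrt{1 - \sum_x p_x^2}$, which vanishes only for $p_x = \delta_{x,x^*}$. The sketch below checks this identity numerically for a four-dimensional example with two rank-two projectors; the example states and function names are our own illustrative choices.

```python
import numpy as np

def block_diagonal_defect(psi, projectors):
    """Frobenius norm of rho - sum_x Pi_x rho Pi_x for the pure state rho = |psi><psi|."""
    rho = np.outer(psi, psi.conj())
    dephased = sum(P @ rho @ P for P in projectors)
    return float(np.linalg.norm(rho - dephased))

def outcome_probabilities(psi, projectors):
    """p_x = <psi| Pi_x |psi> for each projector."""
    return np.array([np.real(np.vdot(psi, P @ psi)) for P in projectors])

projectors = [np.diag([1.0, 1.0, 0.0, 0.0]), np.diag([0.0, 0.0, 1.0, 1.0])]
psi_spread = np.array([0.6, 0.0, 0.8, 0.0], dtype=complex)     # p = (0.36, 0.64)
psi_localized = np.array([0.6, 0.8, 0.0, 0.0], dtype=complex)  # p = (1, 0)

p = outcome_probabilities(psi_spread, projectors)
defect_spread = block_diagonal_defect(psi_spread, projectors)
defect_localized = block_diagonal_defect(psi_localized, projectors)
print(defect_spread, np.sqrt(1 - np.sum(p**2)), defect_localized)
```

Note that the localized state still carries coherences inside its subspace. Global decoherence in the sense used here only concerns coherences between different subspaces.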
A summary of this and the last section can be found in Fig. 2.

The new approach: General picture
We now discuss the general picture behind the new approach from Refs. [7,8,15]. It is claimed to be "new" for two reasons. First, as discussed above, it defines classicality in terms of the Kolmogorov consistency condition. This differs from the basic question in OQS decoherence ("What is the measurement/pointer basis?") and it is close to but still different from the histories approach. In particular, Kolmogorov consistency can be independently well motivated by asking the question "When can a quantum process be simulated by a classical process?", as also done recently in Refs. [9-12]. Second, emphasis is put on the following two physical aspects. First, the focus is not on OQS: even global observables of an isolated system can behave classically. Second, non-integrability is regarded as essential, or at least very helpful, to derive classicality. While the relation to chaos has also been studied in OQS decoherence (see, e.g., Refs. [71-79]), it has not been regarded as essential: the traditional workhorse model of OQS theory uses an integrable bath of harmonic oscillators ("Caldeira-Leggett model") prepared in a canonical Gibbs ensemble. This is avoided in the following by considering pure states.
To the best of the author's knowledge, the basic physical picture behind this emergence of classicality was already explained by van Kampen in 1954 [80] (without, however, receiving any attention from the community working on the quantum-to-classical transition). Three basic ingredients, which can hardly count as assumptions, plus one major assumption are necessary to see the emergence of classicality. The three ingredients are: (i) the system has a well-defined overall energy, i.e., the energy spread $\Delta E$ of the initial wavefunction is sufficiently narrow (see footnote 3); (ii) the system has many particles $N \gg 1$, i.e., the Hilbert space dimension of the aforementioned energy shell is exponentially large: $D \equiv \dim \mathcal{H} \sim \exp(N)$; (iii) the system is non-integrable or, more precisely, it should obey the eigenstate thermalization hypothesis (ETH). Given the success of the ETH this is considered a mild assumption for realistic many-body systems found in nature [23,24].
The major assumption concerns the observable $X$ that one is probing: according to Refs. [15,80] it should be coarse and slow. Coarseness means that the number of potential measurement outcomes is much smaller than the Hilbert space dimension: $M \ll D$. Again, this can hardly count as an assumption. In particular, observe that an observable $X_S$ defined for an OQS is a coarse observable in the full system-bath space. Slowness instead is the crucial assumption and it has been discussed in detail (together with various subtleties) in Ref. [15]. Intuitively, it means that the time scale $\tau_X$ on which $\langle X\rangle(t) = \mathrm{tr}\{X U_t \rho_0 U_t^\dagger\}$ evolves is much longer than the microscopic evolution time scale $\hbar/\Delta E$. This is equivalent to saying that the matrix with elements $X_{km} = \langle k|X|m\rangle$ with respect to an ordered energy eigenbasis $\{|k\rangle\}$ is narrowly banded.
It is interesting to contrast this approach with previous work done within the consistent or decoherent histories formalism, where slow (or quasi-conserved) observables also played an essential role [63,65,66,81]. Without noting the work of van Kampen, the focus in these works was to derive the consistent or decoherent histories condition by arguing that the wave packet remains strongly localized along some trajectory, which is described by a classical deterministic equation, i.e., it was argued that the wave function $|\psi(t)\rangle$ should remain approximately localized in one (time-dependent) subspace $\Pi_{x(t)}$ throughout the dynamics. It is questionable whether this is always an adequate idea, but, importantly, the assumption of remaining localized around some classical trajectory is also not necessary. As will become clear below, the pure state $|\psi(t)\rangle$ is allowed to have an abundance of coherences (even maximal coherences) and can still behave classically. This marks another and perhaps the most important novel aspect.
It is also interesting to connect the assumptions above to the OQS decoherence approach. To this end, consider an OQS and let $X = H_S$ be the system Hamiltonian. Since $H_S$ is locally conserved ($[H_S, H_S] = 0$), $H_S$ is a slow observable provided that the coupling $V_{SB}$ is weak enough. Furthermore, it is also coarse since $1 \ll \dim \mathcal{H}_B$. Thus, in the weak coupling regime local measurements of the energy should give rise to classical statistics obeying the Kolmogorov consistency condition (2). This is in unison with the predictions of the pointer basis in the decoherence approach, but we repeat that the identification of a pointer basis is not necessary in the present approach. However, it should be emphasized that the notion of slowness is subtle and it does not seem to be a sufficient criterion for classicality: by precisely tuning the "fine-structure" of $X_{km}$ it appears that one can generate arbitrary exceptions to the "rule" [82], albeit those might not be generic. In any case, it provides a different perspective on the problem and it gives an immediate intuitive explanation why the world around us appears classical: human senses are simply too slow and coarse to resolve the evolution of fast observables that could potentially show quantum effects.
So how can it be that decoherence is not necessary to generate classical measurement statistics? The following picture lacks rigour, but gives some intuition.
To start with, consider a two-level system with Hamiltonian $\frac{\Delta}{2}\sigma_z = \frac{\Delta}{2}(|1\rangle\langle 1| - |0\rangle\langle 0|)$ and as the observable choose $X = \sigma_x$. Moreover, let the initial state $\rho_0$ be parametrized with respect to the eigenbasis $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$ of $\sigma_x$ as in Eq. (9). Here, the parameter $r$ quantifies the "strength" of the coherences in the $\sigma_x$ basis. The time-evolved state is given in Eq. (10).
Footnote 3: If there are further conserved quantities, the same argument has to be also applied to them.
The diagonal elements equal the probabilities to measure spin up |+⟩ or spin down |−⟩ with respect to the x-direction. Their time evolution is strongly influenced by the coherences and therefore the dynamics is not classical. Of course, this is not a counterexample since the system is neither a non-integrable many-body system nor is the observable coarse and slow.
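The dependence of the populations on the coherences can be made concrete with a few lines of NumPy. This is only a sketch under stated assumptions: the parametrization of the initial state (populations `p`, `1 - p` and coherence `r * exp(i * phi)` in the $\sigma_x$ basis), the function name `p_plus` and the choice $\hbar = 1$ are illustrative and are not taken from the source.

```python
import numpy as np

def p_plus(t, p, r, phi, delta=1.0):
    """Probability of outcome |+> at time t for H = (delta/2) * sigma_z, hbar = 1."""
    # In the sigma_x eigenbasis (|+>, |->), sigma_z = |1><1| - |0><0| becomes
    # -[[0, 1], [1, 0]], because sigma_z|+> = -|-> and sigma_z|-> = -|+>.
    hamiltonian = -0.5 * delta * np.array([[0.0, 1.0], [1.0, 0.0]])
    # Assumed parametrization of the initial state (hypothetical, for illustration).
    rho0 = np.array([[p, r * np.exp(1j * phi)],
                     [r * np.exp(-1j * phi), 1.0 - p]])
    energies, vecs = np.linalg.eigh(hamiltonian)
    # U = exp(-i H t), built from the eigendecomposition of H.
    unitary = (vecs * np.exp(-1j * energies * t)) @ vecs.conj().T
    rho_t = unitary @ rho0 @ unitary.conj().T
    return rho_t[0, 0].real

def p_plus_closed_form(t, p, r, phi, delta=1.0):
    """Closed form under the same assumptions; the last term stems from the coherence."""
    half = 0.5 * delta * t
    return p * np.cos(half) ** 2 + (1 - p) * np.sin(half) ** 2 \
        + r * np.sin(phi) * np.sin(delta * t)
```

With `p = 0.5` the populations stay constant for `r = 0`, while for `r > 0` and `phi` not a multiple of $\pi$ they oscillate with frequency `delta`, which is the sensitivity to coherences described above.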
So what happens for a coarse observable in a non-integrable many-body system? The single elements of Eq. (10) now become blocks of many elements and the probability $p_x(t)$ to find the system in some state $x$ becomes the trace over block $x$. It will typically contain a sum of contributions from many coherences $\langle i|\rho_0|j\rangle = r_{ij} e^{i\phi_{ij}}$ of the initial state, schematically written as Eq. (11). Now, observe the following facts. First, for a coarse observable of a many-body system the number of terms contributing to the sum is huge (of the order $e^N$ with $N$ the particle number). Second, for a non-integrable system the energy differences $\Delta_{ij}$ are incommensurable (apart from rare accidental degeneracies) and effectively random. Thus, unless the $\phi_{ij}$ are precisely tuned or $r_{ij} = 0$ for all but a few pairs $(i, j)$, Eq. (11) is a sum of many terms of random sign and small magnitude (see footnote 4). Thus, the enormous amount of coherences cannot add up to a significant contribution and therefore it effectively does not matter whether coherences are present or not, i.e., as long as one only asks questions about the measurement statistics in Eq. (1) we can set $r_{ij} = 0$ for all $(i, j)$. This explanation for classical behaviour is essentially statistical and similar in spirit to the explanation of the second law. Yes, it is possible that the positions and momenta of all the molecules in the air surrounding you conspire such that they can all be found in one corner of the room in the next second, yet this possibility is extremely unlikely. Similarly, it is possible that all microscopic coherences of a coarse observable align in phase to give rise to a strong contribution and, consequently, a strong violation of Kolmogorov consistency, yet this is again extremely unlikely. In essence, this also underlies the decoherence approach. Yes, it is possible that a qubit in contact with a bath suddenly "recoheres", yet this would again require a very unlikely, because precisely tuned, cooperation of many phases of the system-bath state. Thus, the emergence of classical behaviour is related to the general phenomenon of irreversibility, which is extremely hard to avoid given our coarse human senses.
Finally, one might wonder where the assumption of slowness enters above. This is indeed not directly visible here. However, our argument why Eq. (11) is small was based on assumptions about the number of coherences $r_{ij}$ and the absence of correlations within and between the coherences $r_{ij}$, phases $\phi_{ij}$ and frequencies $\Delta_{ij}$. Unfortunately, and in particular for states prepared out of equilibrium, these assumptions become questionable. Indeed, a state drawn at random can be shown to give rise, with overwhelming probability, to equilibrium statistics [83][84][85][86]. Nonequilibrium states thus cannot be completely random and must possess a sufficient amount of correlations. Slowness now helps because the microscopic state evolves on a much shorter time scale than the observable, effectively randomizing many phases before any noticeable change in $p_x(t)$ occurs. Thus, from the perspective of the slow observable $X$, the system looks locally equilibrated and the precise microstate no longer matters [15,80].
Classicality: Derivation and numerical verification

Derivation using random matrix theory
To show the approximate validity of the Kolmogorov consistency condition (2) one needs a model that, ideally, is as general as possible to cover a wide range of scenarios while being at the same time also specific enough to permit explicit calculations. Unfortunately, these two desiderata are often mutually exclusive. In Ref. [15] the model was assumed to obey the ETH, which is currently considered to be a mild assumption for most realistic many-body systems found in nature [23,24]. The drawback of this generality was that some plausible but in the end unproven assumptions entered the derivation.
Here, a random matrix theory approach is used, which has been successful in modelling a variety of generic properties of complex systems [19][20][21][22][23][24]. Indeed, our current understanding of the ETH is largely based on random matrix theory although the ETH has been shown to be valid for a much larger class of models. The model considered here is therefore more restrictive than the model of Ref. [15], but it comes with the benefit that we need fewer unproven assumptions (albeit we still need some).
To capture the relevant physics of a non-integrable many-body system we follow Deutsch [87] and others [88][89][90][91][92][93][94][95][96] and consider a Hamiltonian of the form $H = H_0 + \epsilon H_1$. Here, $H_0$ is some "baseline" Hamiltonian, $\epsilon$ a small parameter and $H_1$ a banded random Hermitian matrix chosen, e.g., from the Gaussian orthogonal or unitary ensemble. For instance, $H_0 = H_S + H_B$ could be the bare system and bath Hamiltonian and $\epsilon H_1 = V_{SB}$ their (weak) interaction, but many more examples are imaginable. Note that $H_0$ does not need to be integrable, it is only assumed to describe a many-body system with an extremely dense spectrum in the considered energy interval (recall ingredients (i) and (ii) in Sec. 2.4). Moreover, the model does not literally assume that the perturbation is random. Instead, the basic idea of random matrix theory is that some property holds for the overwhelming majority of random perturbations and that the real physical (and non-random) Hamiltonian then also belongs to this overwhelming majority. Finally, the smallness of $\epsilon$ implies that the range of the spectrum and the mean level spacing $\delta e$ of $H_0$ and $H$ are comparable, but their eigenvectors are still strongly perturbed as long as $\epsilon$ is larger than the extremely small level spacing $\delta e$. Let $|\mu\rangle$ and $|m\rangle$ be the eigenvectors of $H_0$ and $H$, respectively. A central role in the following is played by the unitary matrix with elements $V^m_\mu$, Eq. (13), which relates the two eigenbases, and by the ensemble-averaged property of its elements, Eq. (14), which holds for both the Gaussian orthogonal and the unitary ensemble (and even beyond strict Gaussianity) and whose detailed justification is left to the literature [87][88][89][90][91][92][93][94][95][96][97][98][99]. The important point is that the overlap between eigenvectors of $H$ and $H_0$ is exponentially small in the particle number $N$, and for our estimate below we will set for notational simplicity $\max_n u(n) = D^{-1}$, with $u$ the function appearing in Eq. (14) and $D$ the Hilbert space dimension of the energy shell (which is exponentially large in $N$).
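The statement that a perturbation exceeding the level spacing strongly mixes the eigenvectors can be checked in a small numerical sketch. The choices below are assumptions made for illustration only: an equidistant spectrum for $H_0$ on $[0, 1]$, a full (not banded) real symmetric Gaussian matrix for $H_1$, and the function name `overlap_matrix`.

```python
import numpy as np

def overlap_matrix(dim, eps, seed=0):
    """Return |<mu|m>|^2 between eigenvectors of H0 (rows) and H0 + eps*H1 (columns)."""
    rng = np.random.default_rng(seed)
    # Assumed baseline: equidistant levels, so the level spacing is 1/(dim - 1).
    h0 = np.diag(np.linspace(0.0, 1.0, dim))
    # Assumed perturbation: GOE-like matrix with unit off-diagonal variance.
    a = rng.normal(size=(dim, dim))
    h1 = (a + a.T) / np.sqrt(2.0)
    _, eigvecs = np.linalg.eigh(h0 + eps * h1)
    # H0 is diagonal, so its eigenvectors are the standard basis vectors and
    # eigvecs[mu, m] is directly the overlap <mu|m>.
    return np.abs(eigvecs) ** 2
```

For `eps` well above the level spacing every eigenvector of the perturbed Hamiltonian spreads over many unperturbed ones, so the largest entry of the returned matrix is much smaller than one, while unitarity keeps all row and column sums equal to one.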
Next, as an observable we allow any coarse Hermitian operator $X = \sum_{x=1}^{M} \lambda_x \Pi_x$ that commutes with the unperturbed Hamiltonian, $[H_0, X] = 0$, but not with the perturbation $H_1$ (the case $[H_1, X] = 0$ trivially gives rise to classical dynamics for $X$). Coarseness means that $M \ll D$ and the smallness of $\epsilon$ implies that $X$ evolves on a slow time scale. Note that also observables $X$ with $[H_0, X] \neq 0$ can behave classically [7,8,15], but the above assumption turns out to be very convenient for the calculation below.
Then, we consider the joint probability to measure $x_1$ at time $t_1$ and $x_2$ at time $t_2$ given an arbitrary initial state (perhaps far from equilibrium) $\rho_0$. We further introduce the single-time probability and consider the difference $Q$ defined in Eq. (17). The goal in the following is to show that $Q$ is extremely small. This then implies that the Kolmogorov consistency condition (2) is satisfied at the level of arbitrary two-time probabilities given any initial state $\rho_0$. An extension of the derivation to arbitrary $n$-time probabilities with $n > 2$ is certainly desirable, but it is a very complicated problem, likely requiring novel techniques. However, abstracting from Refs. [100][101][102][103], where theorems about $n$-time correlation functions for large $n$ were proven under different circumstances, it appears likely that approximate consistency continues to hold also for $n > 2$.
To make analytical progress, we first need some assumption about the initial state $\rho_0$ and at this stage we follow Ref. [15]. In summary, Ref. [15] has described the initial state by a preparation procedure using some completely positive map $\mathcal{M}$ such that $\rho_0 = \mathcal{M}\psi_0 = \sum_\alpha K_\alpha \psi_0 K_\alpha^\dagger$, where $\psi_0$ is the state prior to the preparation. So far, this is completely general [13,14,104], but now two assumptions are introduced. First, by polar decomposition we write $K_\alpha = \sqrt{P_\alpha} V_\alpha$, where $V_\alpha$ is a unitary and $P_\alpha$ a positive operator. It was now assumed that the $P_\alpha = P_\alpha(X)$ are functionally dependent on $X$. Translated into an experimental context, this means that the experimentalist has control over $X$ (for instance, by measuring it), but they are not in control over the precise microstate within the subspaces of $\Pi_x$. Second, it was assumed that the prior state $\psi_0$ is at equilibrium or, technically speaking, Haar randomly distributed in the energy shell. Both assumptions thus express the idea that the initial state preparation can bring a system, which is at equilibrium from a macroscopic point of view (note that $\psi_0$ is a pure state), arbitrarily far from equilibrium with respect to $X$. Then, using a measure concentration inequality in the form of Levy's lemma [86], it was shown that smallness of Eq. (17) is (in almost all cases) equivalent to showing smallness of the quantity $q(x_2, x_0)$ defined in Eq. (18). Here, $D_{x_0} \equiv \mathrm{tr}\{\Pi_{x_0}\}$ is the dimension of the subspace associated to measurement outcome $x_0$. Thus, in essence the term $Q$, which is a three-point correlation function for the projectors $\Pi_x$ with unknown correlations with respect to the initial state $\rho_0$, got transformed into the term $q(x_2, x_0)$, which is a four-point correlation function for the projectors $\Pi_x$ without any initial state dependence. We evaluate Eq. (18) in the energy eigenbasis of $H$ and introduce the following convention, which is perhaps unconventional but useful for later considerations.
Since there will be many terms indexed by many quantities $m_1, m_2, \dots$ and $\mu_1, \mu_2, \dots$, where $m_i$ ($\mu_i$) labels energy eigenvalues of $H$ ($H_0$), we decide to write labels $m_i$ ($\mu_i$) as superscripts (subscripts) as in Eq. (13) and take the freedom to simply replace them by the number $i$ whenever appropriate. Thus, Eq. (18) then becomes Eq. (19), where $\omega_{12} = E_1 - E_2$ denotes the difference between two eigenenergies of $H$. Next, we recall that $X$ is a narrowly banded operator (due to its slowness) and this also implies that $\Pi_x$ is narrowly banded (due to the coarseness of $X$) [15]. Thus, in an ordered energy eigenbasis we can safely assume $\Pi_x^{mn} = 0$ if $|m - n| \geq d$ for a sufficiently large number $d$. Importantly, while $d \gg 1$ can be enormous in realistic applications, a central feature of slowness and coarseness is that still $d \ll D$. Thus, $d/D$ serves as a small parameter in the following and the corresponding restricted summation is denoted as $\sum_{1\approx 2\approx 3\approx 4}$. Finally, notice that $q(x_2, x_0) = 0$ if $t_2 = t_1$ or $t_1 = 0$, which allows us to turn Eq. (19) into Eq. (20). We continue by using Eq. (13) and $[H_0, X] = 0$ to obtain Eq. (21). Here, $\chi_\mu(x)$ is the indicator function which is one if and only if $\Pi_x|\mu\rangle = |\mu\rangle$ and zero otherwise. Furthermore, note that we use an overbar to denote the complex conjugate. Inserting Eq. (21) into Eq. (20), we arrive at Eq. (22). We have now reached a point where we can try to evaluate $q(x_2, x_0)$ using random matrix theory. However, since $q(x_2, x_0) \in \mathbb{R}$ can be positive or negative, showing smallness of $q(x_2, x_0)$ on average is only an indicator, but not a guarantee, that $q(x_2, x_0)$ is small in general (because it could also strongly fluctuate for different realizations of the random Hamiltonian). Thus, we will actually show that $[q(x_2, x_0)]^2$ is small, which establishes smallness of $q(x_2, x_0)$ and its variance, and which is one point where we go beyond the treatment of Ref. [15].
Thus, we aim to evaluate the ensemble average $\mathbb{E}\{[q(x_2, x_0)]^2\}$, Eq. (23). Note that the ensemble average is only performed over the matrix elements $V^m_\mu$, but excludes the frequencies $\omega_{mn}$. In principle, these should be included in the ensemble average as well, but the smallness of the random perturbation and the extremely small mean energy level spacing $\delta e$ suggest that the behaviour of $q(x_2, x_0)$ is insensitive to small perturbations of $\omega_{mn}$ for times much smaller than the extremely long Heisenberg time $\hbar/\delta e$ (for further justification see Refs. [93][94][95]105]).
Evaluation of Eq. (23) is facilitated by the fact that ten constraints apply. First, due to the factors $e^{i\omega_{ij}t} - 1$ we infer the four constraints $m_1 \neq m_2$, $m_3 \neq m_4$, $m_5 \neq m_6$ and $m_7 \neq m_8$. Second, due to the fact that $x_1 \neq y_1$, $x'_1 \neq y'_1$ and $x_2 \neq x_0$ (see Eq. (18)) we find the six constraints $\mu_1 \neq \mu_2$, $\mu_3 \neq \mu_4$, $\mu_5 \neq \mu_6$, $\mu_7 \neq \mu_8$, $\mu_3 \neq \mu_8$ and $\mu_4 \neq \mu_7$. Nevertheless, evaluation of Eq. (23) remains challenging even under the simplest approximation that we will employ here (though we discuss corrections later on). This approximation assumes that the $V^m_\mu$ are independent zero-mean Gaussian random numbers obeying Eq. (24), with $\delta^{mn}$ and $\delta_{\mu\nu}$ denoting the standard Kronecker symbol with super- or subscripts, respectively. The ensemble average in Eq. (23) can then be evaluated using Isserlis' theorem, which turns an expectation value of $2n$ random variables into sums over "pairings" where each pairing is a product of $n$ pairs. As an example, consider Eq. (25). Quite discomfortingly, the ensemble average in Eq. (23) involves sixteen random numbers. In Appendix A.2 a numerical code is detailed that generates all pairings using Isserlis' theorem while respecting the ten constraints mentioned above. Then, from the total amount of 40,320 pairings, 347 distinct pairings (no multiplicity) survive. It has been found too demanding to write a programme that automatically estimates Eq. (23) and since it is very tiring to investigate 347 cases manually, we look for the most dominant contributions. This is justified because we are only interested in an order-of-magnitude estimate of $\mathbb{E}\{[q(x_2, x_0)]^2\}$, not its exact value.
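The total number of pairings quoted above can be reproduced with a short enumeration. This is not the code of Appendix A.2: it is a minimal sketch that assumes the sixteen random numbers consist of eight factors $V$ and eight complex-conjugated factors, and that only pairs of one $V$ with one conjugate have a non-zero average, so that every pairing corresponds to a permutation. The ten constraints are not implemented, and the function names are hypothetical.

```python
from itertools import permutations

def complex_wick_pairings(n):
    """Yield all ways to pair n factors V with n conjugated factors.

    Each pairing is a tuple of n pairs (i, j), meaning that the i-th factor V
    is averaged together with the j-th conjugated factor.
    """
    for perm in permutations(range(n)):
        yield tuple((i, perm[i]) for i in range(n))

def count_pairings(n):
    """Count the pairings by explicit enumeration (equals n factorial)."""
    return sum(1 for _ in complex_wick_pairings(n))
```

Under these assumptions `count_pairings(8)` returns 8! = 40,320, in agreement with the number stated in the text; filtering this list with the ten index constraints would be the next step towards the surviving pairings.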
To find the leading order contribution, we observe that each pairing gives rise to a different number of distinct Kronecker deltas. In general, the fewer the Kronecker deltas, the larger the contribution because each Kronecker delta "kills" a high dimensional sum. Some care, however, is required because the sums run over spaces with potentially very different dimension. Specifically, every lower subscript runs over a subspace with dimension equal to the rank of some projector Π x , which is always smaller than D but could still be comparable to it. In contrast, six out of the eight superscripts run over subspaces with dimension d ≪ D. Therefore, the leading order contributions are given by the terms that have the fewest Kronecker deltas in total or the fewest Kronecker deltas with respect to the subscripts.
Starting with the latter, the programme from Appendix A.2 shows that the pairing with the fewest subscript-Kronecker deltas is the one given in Eq. (26). Here, all $u(m - \mu) \geq 0$ were replaced for notational simplicity with their maximum value $D^{-1}$ in agreement with the comment below Eq. (14). Then, inserting this term into Eq. (23) and killing all the sums, one obtains the contribution in Eq. (27). Now, since $|e^{i\omega} - 1| \leq 2$ for all $\omega \in \mathbb{R}$ the first line is estimated by setting $e^{i\omega} - 1 = O(1)$ such that $\sum_{1\approx 2} O(1) \approx Dd$. The second line can be exactly evaluated by introducing the Hilbert subspace dimension $D_x \equiv \mathrm{tr}\{\Pi_x\} = \sum_1 \chi_1(x)$ associated to the projector $\Pi_x$. Thus, one obtains Eq. (28). We see that even in the worst case scenario, assuming $D_{x_0} \approx 1$ and $D_{x_2} \approx D$, the right hand side scales at least as $d/D^2$, which is certainly negligibly small. We continue by considering the terms with the fewest Kronecker deltas in total. The programme from Appendix A.2 shows that these terms have six Kronecker deltas in total and that there are four of them, listed in Eq. (29) (neglecting the universal prefactor $D^{-8}$). Let us look at the first one. Inserting it into Eq. (23) gives rise to a contribution for which, using the same reasoning as above, the summation over the superscripts is approximated as $D^2 d^4$. Together with the summation over the subscripts this gives a result which is still negligibly small, though potentially considerably larger than Eq. (28). Using the same strategy, it turns out that also the remaining contributions in Eq. (29) have the same scaling. Thus, to summarize, it was shown that the dominant contributions to $\mathbb{E}\{[q(x_2, x_0)]^2\}$ scale like $(d/D)^4$, which is negligibly small due to the slowness and coarseness of the considered observable $X$. Remarkably, this scaling holds for all times $t_1$ and $t_2$ (and is thus clearly applicable out of equilibrium). Several assumptions concerning the ETH ansatz as detailed in Ref. [15] could be overcome due to the fact that we employed a more transparent but also more restrictive random matrix theory approach from the beginning. Clearly, also the random matrix theory approach is not without assumptions, but recalling its enormous success in dealing with non-integrable or chaotic many-body systems [19][20][21][22][23][24] most assumptions should turn out to be mild in practice.
Nevertheless, a critical point concerns the approximation that the $V^m_\mu$ are Gaussian and uncorrelated. Both cannot be strictly true. First, unitarity implies $|V^m_\mu|^2 \leq 1$, which is not satisfied by a Gaussian distribution. Second, unitarity also implies that $\sum_\mu V^m_\mu \bar{V}^n_\mu = \delta^{mn}$ and $\sum_m V^m_\mu \bar{V}^m_\nu = \delta_{\mu\nu}$, which is satisfied on average, see Eqs. (14) and (24), but not for a single realization if the $V^m_\mu$ are taken uncorrelated. While one might expect corrections to be negligible in many cases due to the huge Hilbert space dimension and the smallness of $V^m_\mu$, Dabelow and Reimann have shown that they can be important [93,95]. Yet, their goal was to determine the exact time-dependent behaviour of expectation values, instead of the rough order-of-magnitude estimate that we were interested in here. Nevertheless, Appendix A.3 confirms that at least the leading order correction does not give rise to a different scaling. Unfortunately, the author found the calculation of higher order corrections to be intractable, so the present derivation might be best interpreted as strong evidence, but no proof, that slow and coarse observables imply classical measurement statistics in random matrix models and, likely, also beyond.
Finally, we remark that the above derivation made use of the property $x_1 \neq y_1$ in Eq. (18), but the evaluation of the sum $\sum_{x_1 \neq y_1}$ has not been crucial. Thus, we have indeed not only shown the Kolmogorov consistency condition but also the strictly stronger decoherent histories condition.

Numerical verification
To illustrate the main features of the present approach to classicality, we use a simple toy model. This model cannot compete with the simulations of more realistic, non-random models in Refs. [7,8,15]. Yet, the results clearly support our general findings and they are also used to point out interesting features that were not yet investigated.
The toy model describes energy exchanges between two energy bands and has been studied in detail in Ref. [106]. Each band is described by $N$ equidistant energy levels, which defines the baseline (unperturbed) Hamiltonian $H_0$. In this Hamiltonian, $\delta E$ sets an overall energy scale, which we choose in our numerics equal to $\delta E = 0.5$. Moreover, $|i\rangle$ describes an energy eigenstate of the first (second) band if $i \in \{0, \dots, N-1\}$ ($i \in \{N, \dots, 2N-1\}$). The random coupling between the bands is mediated by a perturbation with coefficients $v_{ij}$, where the $v_{ij}$ are independent zero mean unit variance Gaussian random numbers. The observable $X$ we consider quantifies the energy imbalance between the two bands and is defined in terms of $\Pi_1$ and $\Pi_2$, where $\Pi_1$ ($\Pi_2$) is the projector on the first (second) energy band. Clearly, this is a coarse observable with two eigenspaces of dimension $N$. Its prefactor $(2N)^{-1/2}$ is pure convention, but bears the advantage that the observable $X$ has the same "size" (meaning that $\mathrm{tr}\{X\} = 0$ and $\mathrm{tr}\{X^2\} = 1$ always) for different $N$. The condition for $X$ to be slow was worked out in Ref. [106] and is given in Eq. (35), i.e., weak coupling gives rise to a slow observable as usual (note that $[H_0, X] = 0$). In the numerical simulations we choose $\epsilon = \epsilon(N)$ such that the left hand side of Eq. (35) becomes 0.01 unless otherwise stated. Moreover, the relaxation time-scale of $X$ is given by $\tau_X = (4\pi\epsilon^2 N)^{-1}\delta E$ [106]. We first consider the structure of the observable $X$ in more detail as it plays an important role in our study. For this purpose Fig. 3 shows matrix elements of the observable $X$ and the projector $\Pi_1$ and we can immediately confirm that $X$ and $\Pi_1$ are narrowly banded. Besides the observation of narrow bandedness, we also note that the off-diagonal elements of $\Pi_1$ appear random and vary erratically. Furthermore, they decay with system size. More specifically, the black circles are roughly one order of magnitude larger than the blue circles, which suggests a scaling $1/\sqrt{D}$ for the off-diagonal elements.
These points are in complete agreement with the general predictions of the ETH [23,24].
Let us now consider the dynamics and test whether the time evolution of $X$ is sensitive to a dephasing operation. To this end, we plot and compare in Fig. 4 two quantities, namely the probability to find the system in the first energy band at time $t$ with or without dephasing operation at time $s < t$, respectively. Note that Fig. 4 plots these quantities for a single realization of the random matrix Hamiltonian and for a single Haar-randomly chosen initial state confined to the first energy band $\Pi_1$, but (importantly) the results were found to be representative as different realizations of the Hamiltonian or initial state give rise to a similar picture. We also note that the dephasing in Fig. 4 clearly happens before the system equilibrates and it becomes immediately evident that the process becomes classical with increasing $N$. This confirms our main result, but Fig. 4 contains two more important pieces of information. First, the circles and crosses in Fig. 4 are generated for the same Hamiltonian and initial state, but by using truncated projectors, which are obtained from $\Pi_x$ by setting all off-diagonal elements with a distance to the diagonal greater than $d/2$ to zero. The choice for $d$ was $d = D^{0.7}$ and the reason for choosing this precise value of the exponent becomes clear later. For now it only matters that we confirm that $d/D = D^{-0.3}$ is very small for large $D$ and that we see in Fig. 4 that even for small $D$ the dynamics is unchanged. This provides numerical evidence that truncating the sums, which was a crucial step to arrive at Eq. (20) in our general derivation, is justified.
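The dephasing test described above can be sketched in a few lines of NumPy. The block is an illustration under several assumptions that are not taken from the source: both bands are assumed to span the same energy interval $[0, \delta E]$, the couplings are taken real, $\hbar = 1$, the dephasing happens at time `s` and the measurement at a later time `t_final`, and all function names are hypothetical. The value of `eps` has to be supplied by hand; the slowness condition of Eq. (35) is not reproduced here.

```python
import numpy as np

def band_model(n, eps, d_e=0.5, seed=0):
    """Return (H, P1) for two bands of n equidistant levels coupled by eps * H1."""
    rng = np.random.default_rng(seed)
    levels = np.linspace(0.0, d_e, n)      # assumption: both bands span [0, d_e]
    h0 = np.diag(np.concatenate([levels, levels]))
    v = rng.normal(size=(n, n))            # zero mean, unit variance couplings
    h1 = np.zeros((2 * n, 2 * n))
    h1[:n, n:] = v                         # couples band 1 to band 2 only
    h1[n:, :n] = v.T
    p1 = np.diag(np.concatenate([np.ones(n), np.zeros(n)]))
    return h0 + eps * h1, p1

def random_band_state(n, seed=1):
    """Random normalized pure state confined to the first band."""
    rng = np.random.default_rng(seed)
    psi = np.zeros(2 * n, dtype=complex)
    psi[:n] = rng.normal(size=n) + 1j * rng.normal(size=n)
    return psi / np.linalg.norm(psi)

def first_band_probability(h, p1, psi0, s, t_final, dephase):
    """Probability of band 1 at t_final, optionally dephasing at time s < t_final."""
    energies, vecs = np.linalg.eigh(h)

    def evolve(rho, tau):
        u = (vecs * np.exp(-1j * energies * tau)) @ vecs.conj().T
        return u @ rho @ u.conj().T

    rho = evolve(np.outer(psi0, psi0.conj()), s)
    if dephase:
        p2 = np.eye(len(h)) - p1
        rho = p1 @ rho @ p1 + p2 @ rho @ p2    # remove inter-band coherences
    rho = evolve(rho, t_final - s)
    return np.trace(p1 @ rho).real
```

Comparing `first_band_probability(..., dephase=True)` with `dephase=False` for growing `n` gives a rough counterpart of the comparison in Fig. 4; the dense matrices used here limit the sketch to moderate dimensions.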
Second, another important piece of information is revealed by the number $\Delta$ in the top right corner of each plot. It equals the trace norm distance $\Delta(\rho, \mathcal{D}\rho)$ between the states with and without dephasing, where $\mathcal{D}\rho = \sum_x \Pi_x \rho \Pi_x$ denotes the dephased state. The trace norm is a distance measure characterizing the distinguishability of two quantum states and it has a wide range of applications and favorable properties [104], including that $(1 + \Delta)/2$ is the maximum success probability to distinguish between $\rho$ and $\mathcal{D}\rho$ in an unbiased mixture given unlimited measurement power. Thus, $\Delta$ seems to be well suited to measure the amount of coherences in $\rho$. Interestingly, we show in Appendix A.4 that $\max_\rho \Delta(\rho, \mathcal{D}\rho) = 1/2$. Now, observing the values for the trace norm in Fig. 4 we see that they are very close to the maximum possible value. In that sense, the dephasing operation is (almost) maximally invasive and a lot of coherences are destroyed. This demonstrates that (global) decoherence is not needed to explain classical behaviour and even maximally coherent states can show classical behaviour. This is not in conflict with OQS decoherence, where decoherence happens locally but usually not globally.
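For a pure state and two projectors the distance to the dephased state can be computed directly. The sketch below assumes the normalization $\Delta = \frac{1}{2}\|\rho - \mathcal{D}\rho\|_1$, chosen so that $(1 + \Delta)/2$ is the success probability mentioned above; this normalization, the restriction to two blocks and the function name are assumptions of the example. For a pure state with weight $p$ in the first block one then finds $\Delta = \sqrt{p(1-p)}$, which is at most 1/2.

```python
import numpy as np

def dephasing_distance(psi, n_first):
    """Half the trace norm of rho - D(rho) for a pure state and two blocks.

    The first block consists of the first n_first basis states, the second
    block of all remaining ones.
    """
    rho = np.outer(psi, psi.conj())
    dim = len(psi)
    p1 = np.diag((np.arange(dim) < n_first).astype(float))
    p2 = np.eye(dim) - p1
    dephased = p1 @ rho @ p1 + p2 @ rho @ p2
    # The difference is Hermitian, so its trace norm is the sum of the
    # absolute values of its eigenvalues.
    eigenvalues = np.linalg.eigvalsh(rho - dephased)
    return 0.5 * np.abs(eigenvalues).sum()
```

A state with equal weight in both blocks reaches the maximum value 1/2, while a state confined to a single block gives zero.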
Next, Fig. 5 shows the scaling behaviour of $Q$, Eq. (17), as a function of the Hilbert space dimension. To this end, we plot the time average $\langle Q\rangle$ for $s = \tau_X$ and $t + s = 3\tau_X$, characterizing the (average) distance between the black solid and pink dashed curves in Fig. 4. Moreover, to minimize the risk of statistical outliers, this is done for three different realizations of the random matrix Hamiltonian and three Haar-randomly chosen initial states, thus giving nine realizations for each $N$ as indicated by black circles in Fig. 5. To extract the scaling, we average these nine points for each $N$ and fit a power law $\propto D^{-\alpha}$, which is inspired by Ref. [15]. By looking at Fig. 5 (note the logarithmic scale), one might wonder whether it is a good idea to fit all the data by a straight line (pink dashed line with exponent $\alpha = 0.9$) because the behaviour for $N \lesssim 400$ clearly deviates from the straight line fit obtained for $N \geq 600$ (blue solid line with exponent $\alpha = 0.6$). It is not completely clear to the author what causes the discrepancy, but in Ref. [106] it was observed that the weak coupling approximation requires the side constraint $8\pi^2 N^2 \epsilon^2/\delta E^2 > 1$, which for our choice of $\epsilon$ implies $N > 200$. This might explain the "anomalous" behaviour for small $N$. Also note that the exponent $\alpha = 0.6$ roughly fits the scaling behaviour observed in Ref. [15] (where $\alpha$ was observed to be in the range [0.25, 0.6]).
In any case, we use the fit to determine the number $d$ at which we truncated the projectors to generate the pink crosses and black circles in Fig. 4. Namely, our main result predicted a scaling of the form $(d/D)^4$ for $[q_{t,s}(x_0)]^2$ defined in Eq. (23). This suggests that $\langle Q\rangle$ should scale as $(d/D)^2$. Comparing with $D^{-\alpha}$ for $\alpha = 0.6$ gives $d = D^{0.7}$ as used in Fig. 4.
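The two steps just described, extracting the exponent from a straight-line fit on a logarithmic scale and converting it into a truncation width, can be sketched as follows. The function names are hypothetical and the fit form $c\,D^{-\alpha}$ with a free prefactor $c$ is an assumption of this example.

```python
import numpy as np

def fit_power_law(dims, values):
    """Least-squares fit of values ~ c * dims**(-alpha) on a log-log scale."""
    slope, intercept = np.polyfit(np.log(dims), np.log(values), 1)
    return -slope, np.exp(intercept)

def truncation_exponent(alpha):
    """Exponent beta with d = D**beta such that (d / D)**2 = D**(-alpha)."""
    # (D**beta / D)**2 = D**(2 * beta - 2), hence beta = 1 - alpha / 2.
    return 1.0 - alpha / 2.0
```

For `alpha = 0.6` the second function returns 0.7 (up to floating-point rounding), which matches the choice $d = D^{0.7}$ quoted above.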
Finally, we challenge the present approach by relaxing certain assumptions. First, we ask what happens if the initial state is not randomly chosen within the first energy band. Figure 6(a) shows the breakdown of classicality for a highly atypical initial state $|\psi_0\rangle = |i\rangle$ for some randomly selected $i \in \{0, \dots, N-1\}$ for $N = 6000$. Experimentally, preparing such an initial state requires precise microscopic control over the eigenstates of each energy band, which clearly violates the agreement made above Eq. (18). Nevertheless, classicality is quickly restored for more realistic states as demonstrated in Fig. 6(b) for random superpositions $\sum_i c_i |i\rangle$, where the sum runs over a randomly chosen subset with $0.005N$ many elements (i.e., 0.5% of the energy levels in the first band are initially populated) and the $c_i$ are zero mean unit variance Gaussian random numbers (pink triangles). Despite quite large fluctuations, Fig. 6(b) indicates a scaling law and the pink triangles are (on average) clearly below the black circles, showing the emergence of classicality even for moderately atypical states with a small fraction of populated levels. Finally, Fig. 6(c) shows by way of example what happens for two dephasing operations and an initial random state as used also in Fig. 4. While this is certainly not conclusive, it indicates that for sufficiently large dimensions the concept of classicality introduced here is robust also for $n \geq 3$ measurements.
Figure 6: (a) Breakdown of classicality for a highly atypical initial state (compare Fig. 4). (b) Scaling plot as in Fig. 5 for the just mentioned highly atypical states (black circles) and for random superpositions of $0.005N$ such highly atypical states (pink triangles). (c) Influence of two dephasing operations (indicated by the vertical black lines) for $N = 6000$ and a typical nonequilibrium initial state as considered in Fig. 4.
Last but not least, Fig. 7 investigates the impact of the coupling strength on classicality, which is directly related to the slowness of X. Here, weak, medium or strong coupling means that the right hand side of Eq. (35) was fixed to 0.01, 0.1 or 1, respectively. One sees that classicality is well satisfied up to medium coupling strength, but fails in the strong coupling regime. This is not a deficit of the present theory because clearly not all observables can behave classically. For strong coupling the eigenenergies of the total Hamiltonian can no longer be approximated by the local eigenenergies of the two bands, but are strongly hybridized, and it is questionable to what extent X describes any meaningful energy difference in this case.

Conclusion
The first half of this paper compared and contrasted well established and important approaches to classicality, namely decoherence in OQS and consistent/decoherent histories, with recent abstract research [9][10][11][12] as well as numerical evidence [7,8,15] and general derivations [15] of classicality based on the Kolmogorov consistency condition. Arguably, the difference between the consistent/decoherent histories condition and the Kolmogorov consistency condition is small. However, Kolmogorov consistency is easier to verify experimentally than consistent/decoherent histories and it can be independently well motivated from an operational perspective. Moreover, we established that quantum Markovianity is a key concept to relate decoherence in OQS to both the consistent/decoherent histories and the Kolmogorov consistency condition. Figure 2 summarizes the first part.

Figure 7: […] Fig. 4 for N = 6000 and by choosing ϵ such that the right hand side of Eq. (35) equals 0.01 (weak coupling, as before (a)), 0.1 (medium coupling (b)) and 1 (strong coupling (c)). Note that the relaxation time scale changes inversely proportional to it.
The second half of the paper gave, for an experimentally relevant class of initial states, an independent derivation of the Kolmogorov consistency and the decoherent histories condition based on a random matrix theory model. We carefully checked the correctness of the involved approximations numerically. Remarkably, it was explicitly shown that even maximally coherent states can give rise to classical dynamics for global observables.
Several interesting research avenues open up for the future. For instance, classicality could only be established here for "mini-histories" with two measurement results, and extending the derivation to longer histories, as done in a different context in Refs. [100][101][102][103], is highly desirable. Indeed, recent numerical results for up to five-time histories have confirmed that the emergence of classicality is a robust phenomenon [107]. Moreover, various fundamental questions might appear in a new light, for instance, the relationship between quantum Darwinism [36][37][38] and quantum Markovianity, applications of decoherent histories to OQS theory [108,109], or the implications of the present findings for quantum cosmology [35]. Finally, the central quantity investigated in Eq. (18) can be more generally seen as a particular example of a Kirkwood-Dirac quasiprobability [110]. The behaviour of these quasiprobabilities for chaotic systems has also attracted attention in relation to out-of-time-ordered correlators [111][112][113], which might open up interesting possibilities for fruitful connections between different fields.

[…] of OQS decoherence exists and the notion is used rather conceptually (what follows is, however, closely related but not identical to the treatment of Refs. [9,10,12]). However, if one wants to prove things mathematically, one has to start with a definition. For this purpose, we use the dephasing operation $\mathcal{D}$ in the pointer basis as introduced in Sec. 2.2. Then, for a quantum Markov process we define OQS decoherence by requiring that

$$[\mathcal{D}, \mathcal{E}_{\ell,k}] = 0 \quad \text{for all } \ell > k, \qquad (42)$$

where $[A, B] = AB - BA$ is the commutator in superoperator space. Is this a good definition of OQS decoherence? At least it implies that the dynamics induced by the environment is not able to create coherences in the pointer basis. To see this, we introduce the superoperator $\mathcal{P}_{x,y}\rho \equiv \Pi^S_x \rho \Pi^S_y$. Next, suppose that $\rho_S = \mathcal{D}\rho_S$ is some system state without coherences. Then, Eq. (42) implies that $\mathcal{P}_{x,y}\mathcal{E}\rho_S = 0$ for all $x \neq y$, i.e., it is not possible to create coherences in the pointer basis when starting from a decohered state. Clearly, this captures a key aspect of the OQS decoherence concept, but one could naturally impose further constraints. For instance, Eq. (42) makes no statement about the decoherence time $t_{\rm dec}$ and, since the short time dynamics of OQS is complex, one might additionally require that Eq. (42) is only valid on a coarse time scale, i.e., for $t_\ell - t_k$ not too small. In any case, the minimal definition given here turns out to be sufficient to prove that histories in the sense of Eq. (8) are decoherent.
To see this, we conveniently write the decoherence functional for a quantum Markov process using superoperators:

$$D(x; y) = \mathrm{tr}_S\{\mathcal{P}_{x_n,y_n}\mathcal{E}_{n,n-1}\mathcal{P}_{x_{n-1},y_{n-1}}\mathcal{E}_{n-1,n-2}\cdots\mathcal{P}_{x_1,y_1}\mathcal{E}_{1,0}\rho_S(t_0)\}. \qquad (43)$$

Next, note that the decoherence functional does not change when subjecting it to a final dephasing operation $\mathcal{D}$ in the pointer basis (in fact, the decoherence functional does not change under any final dephasing):

$$D(x; y) = \mathrm{tr}_S\{\mathcal{D}\mathcal{P}_{x_n,y_n}\mathcal{E}_{n,n-1}\mathcal{P}_{x_{n-1},y_{n-1}}\mathcal{E}_{n-1,n-2}\cdots\mathcal{P}_{x_1,y_1}\mathcal{E}_{1,0}\rho_S(t_0)\}.$$
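Now, let $k$ be the largest index for which $x_k \neq y_k$ (i.e., the first one encountered when reading Eq. (43) from the left), such that $x_\ell = y_\ell$ for all $\ell > k$. By the definition of OQS decoherence, we can then permute $\mathcal{D}$ through until we hit the time $t_k$:

$$D(x; y) = \mathrm{tr}_S\{\mathcal{P}_{x_n,y_n}\mathcal{E}_{n,n-1}\cdots\mathcal{E}_{k+1,k}\mathcal{D}\mathcal{P}_{x_k,y_k}\mathcal{E}_{k,k-1}\cdots\mathcal{P}_{x_1,y_1}\mathcal{E}_{1,0}\rho_S(t_0)\}.$$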
Finally, elementary algebra shows that $\mathcal{D}\mathcal{P}_{x_k,y_k}\rho = 0$ whatever the input state $\rho$ is. QED. It is interesting to note that strictly weaker conditions suffice to show Kolmogorov consistency for quantum Markov processes [9,10,12], but they seem insufficient to show the decoherent histories condition.
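The two algebraic facts used in the proof can be spelled out explicitly. The following is a short worked check, assuming (as in Sec. 2.2) that the dephasing operation acts as $\mathcal{D}\rho = \sum_z \Pi^S_z \rho\, \Pi^S_z$ with mutually orthogonal projectors $\Pi^S_z$; the summation index $z$ is introduced here only for illustration.

```latex
\begin{align}
\mathcal{D}\mathcal{P}_{x,y}\rho
  &= \sum_z \Pi^S_z \Pi^S_x\, \rho\, \Pi^S_y \Pi^S_z
   = \delta_{x,y}\, \Pi^S_x\, \rho\, \Pi^S_x , \\
\mathcal{P}_{x,x}\mathcal{D}\rho
  &= \sum_z \Pi^S_x \Pi^S_z\, \rho\, \Pi^S_z \Pi^S_x
   = \Pi^S_x\, \rho\, \Pi^S_x
   = \mathcal{D}\mathcal{P}_{x,x}\rho .
\end{align}
```

The first line vanishes for $x \neq y$, which is the final step of the proof. The second line shows that $\mathcal{D}$ commutes with the diagonal superoperators $\mathcal{P}_{x_\ell,x_\ell}$ appearing for $\ell > k$, so that only the commutation with the dynamical maps from Eq. (42) is needed to move $\mathcal{D}$ to the time $t_k$.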

A.2 Numerical implementation
This appendix includes some details about how to numerically facilitate the evaluation of the expectation values appearing in Eq. (23) using Mathematica [115].
We are interested in expectation values of the form […] with an equal number of $V$- and complex conjugate $\bar V$-terms. Since each pair in Isserlis' theorem requires one $V$- and one $\bar V$-term, not all permutations of ($V^{m_1}$ […]

The lowest-level curly bracket {…} contains one specific pair with always two Latin and two Greek indices. The middle-level curly bracket contains the product of all pairs, which form one specific "pairing". In general, Pairings will contain many forbidden pairings due to constraints such as $m_1 \neq m_2$, $\mu_1 \neq \mu_2$, etc. To filter them out, we map each pairing to a graph with vertices $(m_1, m_2, \dots, \mu_1, \mu_2, \dots)$ and edges created by the Kronecker symbols of each pair, e.g., the pair $\{m_1, m_2, \mu_4, \mu_4\}$ creates an edge between $m_1$ and $m_2$ and a (redundant) edge between $\mu_4$ and $\mu_4$. Then, e.g., to respect the constraint $m_1 \neq m_2$, a pairing is only accepted if there exists no path in the graph from $m_1$ to $m_2$, see Fig. 8 for a sketch. As an example, the following code creates a list accepted that stores the numbers $i$ for which the $i$th element of Pairings satisfies the constraint $m_1 \neq m_2$ (further constraints can be easily included):

[…]

Here, the graph G for each pairing $i$ is created using a set of edges stored in network, where each edge is symbolized by •−• (typeset as "Esc ue Esc" in Mathematica).
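As a rough illustration of the graph-based filter described above, the following Mathematica sketch applies it to two toy pairings. It is not the listing used for the paper: the toy data and the helper names edgesOf and connectedQ are assumptions made here, and the symbols mu1, mu2, … stand in for the Greek indices. Instead of searching for a path explicitly, the sketch tests whether both vertices lie in the same connected component, which is equivalent for undirected graphs.

```mathematica
(* Toy input, not the paper's data: each pair {a, b, c, d} stands for
   KroneckerDelta[a, b] KroneckerDelta[c, d]. *)
Pairings = {
  {{m1, m1, mu1, mu3}, {m2, m3, mu2, mu4}},  (* m1 and m2 stay independent *)
  {{m1, m3, mu1, mu1}, {m3, m2, mu2, mu4}}   (* path m1 - m3 - m2 forces m1 == m2 *)
};

(* Hypothetical helper: one undirected edge per Kronecker delta.
   Duplicate edges and the redundant self-loops are dropped. *)
edgesOf[pairing_] := DeleteCases[
  DeleteDuplicates[Sort /@ Flatten[
    {UndirectedEdge[#[[1]], #[[2]]], UndirectedEdge[#[[3]], #[[4]]]} & /@ pairing]],
  UndirectedEdge[a_, a_]];

(* Hypothetical helper: True if a and b lie in the same connected component. *)
connectedQ[pairing_, a_, b_] := With[
  {G = Graph[Union @@ pairing, edgesOf[pairing]]},
  AnyTrue[ConnectedComponents[G], SubsetQ[#, {a, b}] &]];

(* Positions of all pairings that respect the constraint m1 != m2. *)
accepted = Select[Range[Length[Pairings]],
  ! connectedQ[Pairings[[#]], m1, m2] &]
```

For the toy data, the first pairing passes and the second is rejected, so accepted should evaluate to {1}. Further constraints such as $\mu_1 \neq \mu_2$ can be added as additional conditions inside Select.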
Next, we create a list Rem that simply contains all remaining pairings that satisfy the constraints above. This can be done by

[…]

Note that Sort brings the elements of the pairing into a standard form again. As hinted at already above, the structure of the pairings can be nicely illustrated with a graph whose edges indicate Kronecker deltas. For further manipulation, we now convert Rem into a list of graphs:

[…]
con = Map[Sort]@*ConnectedComponents /@ graphs;

The final output con contains the connectivity of each graph associated with each accepted pairing. For instance, if $\{\{m_1, m_1, \mu_1, \mu_3\}, \{m_2, m_3, \mu_2, \mu_4\}, \{m_2, m_4, \mu_4, \mu_4\}\}$ is one accepted pairing, then its connectivity is $\{\{m_2, m_3, m_4\}, \{\mu_1, \mu_3\}, \{\mu_2, \mu_4\}, \{m_1\}\}$. This format has nice properties as it directly reveals the "structure" of each pairing in terms of (multi-valued) Kronecker deltas. To find out whether there are multiple pairings with the same structure, one can run Tally[con]. Given the connectivity list con, it is also straightforward to count the total number of sums that get killed due to Kronecker deltas. In the following, the function kills computes this number for a given pairing and final stores these numbers for each element of con:

[…]

Clearly, the above procedure can also be applied to the graphs formed by Greek indices only, as we needed to do to find the terms with the fewest subscript-Kronecker deltas around Eq. (26).
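The connectivity and counting steps can be sketched as follows for the single accepted pairing quoted above. This is an illustrative sketch, not the paper's listing: the helper edgesOf and the counting rule inside kills (a connected component of $s$ identified indices removes $s - 1$ sums) are assumptions made here, and mu1, mu2, … again stand in for the Greek indices.

```mathematica
(* Toy input: the accepted pairing quoted in the text. *)
Rem = {
  {{m1, m1, mu1, mu3}, {m2, m3, mu2, mu4}, {m2, m4, mu4, mu4}}
};

(* Hypothetical helper: one undirected edge per Kronecker delta.
   Duplicate edges and the redundant self-loops are dropped. *)
edgesOf[pairing_] := DeleteCases[
  DeleteDuplicates[Sort /@ Flatten[
    {UndirectedEdge[#[[1]], #[[2]]], UndirectedEdge[#[[3]], #[[4]]]} & /@ pairing]],
  UndirectedEdge[a_, a_]];

(* All indices are kept as vertices, so isolated ones such as m1 survive. *)
graphs = Graph[Union @@ #, edgesOf[#]] & /@ Rem;
con = (Map[Sort] @* ConnectedComponents) /@ graphs;

(* Assumed counting rule: s identified indices remove s - 1 sums. *)
kills[components_] := Total[Map[Length[#] - 1 &, components]];
final = kills /@ con
```

For the quoted pairing, the components have sizes 3, 2, 2 and 1, so final should evaluate to {4}. Running Tally[con] on a longer list Rem then groups pairings that share the same connectivity.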

A.3 Higher order corrections
In Ref. [93] (see also Sec. 3.4.4 of Ref. [105]) Dabelow and Reimann developed a systematic way to take into account correlations among matrix elements, which we briefly summarize here. Unfortunately, it will turn out that this procedure quickly becomes intractable because we already start with an expectation value over a product of sixteen random numbers. Moreover, the method does not take into account correlations with respect to both the perturbed and unperturbed bases $|m\rangle$ and $|\mu\rangle$, but only with respect to one of them. Therefore, while that method was found to work well in Refs. [93,95,105], it still does not provide an exact treatment of the problem.
To start with, notice that the matrix element $V^m_\mu$ can be seen as the $m$th component of a $D$-dimensional vector $V_\mu$. To each $V_\mu$ we associate two vectors $v_\mu$ and $w_\mu$, where the components of $v_\mu$ are assumed to be independent Gaussian random variables with statistical properties equal to those of Eq. (24). Then, what was effectively done in the main text was to approximate […] (49) and obtain the vectors $\{w_\mu\}$ from $\{v_\mu\}$ using a Gram-Schmidt procedure, which orthonormalizes the set $\{v_\mu\}$, thereby taking into account constraints imposed by the unitarity of $V^m_\mu$. Thus, starting from $w_1 = v_1$, we set$^5$

$$w_\mu = v_\mu - \sum_{\nu=1}^{\mu-1} \langle w_\nu|v_\mu\rangle\, w_\nu \qquad (50)$$

for all $\mu \geq 2$, where $\langle w|v\rangle$ denotes the standard complex scalar product. Inserting the $w_\mu$ in Eq. (49) gives an explicit expression in terms of the independent Gaussian variables $v^m_\mu$ that can be calculated using Isserlis' theorem and takes into account correlations. Unfortunately, we would need to do this for eight vectors in Eq. (49) (and their complex conjugates) and the total number of $v^m_\mu$-terms, and consequently the number of pairings in Isserlis' theorem, quickly grows to astronomically large numbers, even when respecting the constraints identified in the main text ($m_1 \neq m_2$, $\mu_1 \neq \mu_2$, etc.). To see this, it might be helpful to explicitly write down the components obtained via the Gram-Schmidt procedure. Clearly, everything is simple for the first vector: $w^m_1 = v^m_1$. The second vector is also still manageable: $w^m_2 = v^m_2 - \sum_n \bar v^n_1 v^n_2 v^m_1$. The third vector, however, already contains six terms with up to seven $v$-components: […] and it does not get simpler for the remaining vectors. However, recall that we are only interested in an order-of-magnitude estimate. Each added pair of $v$-terms comes with an extra $D$-dimensional summation, but also contributes a factor of the order $D^{-1}$ due to Eq. (24). In general, one therefore expects that these contributions roughly cancel each other in an order-of-magnitude estimate provided that the minimum number of Kronecker deltas as identified in the main text remains the same (if one term in Isserlis' theorem gives rise to fewer Kronecker deltas than before, then an additional sum appears, potentially contributing a huge factor).
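To make the growth in complexity concrete, the third vector can be expanded by hand. The following is a worked sketch under the assumption that the Gram-Schmidt step is used without normalization, $w_\mu = v_\mu - \sum_{\nu<\mu}\langle w_\nu|v_\mu\rangle w_\nu$, which is the form consistent with the second vector quoted above; vector notation is used instead of components for brevity.

```latex
\begin{align}
w_2 &= v_2 - \langle v_1|v_2\rangle v_1 , \\
\langle w_2|v_3\rangle
    &= \langle v_2|v_3\rangle - \langle v_2|v_1\rangle\langle v_1|v_3\rangle , \\
w_3 &= v_3 - \langle w_1|v_3\rangle w_1 - \langle w_2|v_3\rangle w_2 \nonumber \\
    &= v_3 - \langle v_1|v_3\rangle v_1 - \langle v_2|v_3\rangle v_2
       + \langle v_2|v_3\rangle\langle v_1|v_2\rangle v_1 \nonumber \\
    &\quad + \langle v_2|v_1\rangle\langle v_1|v_3\rangle v_2
       - \langle v_2|v_1\rangle\langle v_1|v_3\rangle\langle v_1|v_2\rangle v_1 .
\end{align}
```

This reproduces the six terms, the last of which contains seven $v$-components. Every scalar product hides one $D$-dimensional sum, which the factor of order $D^{-1}$ from Eq. (24) compensates only as long as the minimum number of Kronecker deltas stays the same.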
Whether this is the case has been explicitly checked to lowest order in the Gram-Schmidt procedure. This means one first sets

$$w_\mu \approx v_\mu - \sum_{\nu=1}^{\mu-1} \langle v_\nu|v_\mu\rangle\, v_\nu$$

for $\mu \in \{2, 3, \dots, 8\}$, which follows from Eq. (50) by replacing $w_\mu$ by $v_\mu$ on the right hand side. This approximation is then inserted into Eq. (49) and only terms with a single additional sum are kept. There are $2 \cdot (1 + 2 + \cdots + 7) = 56$ such single sum contributions, where the factor two arises because one has to take into account $w_\mu$ and