Lecture notes on Diagrammatic Monte Carlo for the Fröhlich polaron

These notes are intended as a detailed discussion on how to implement the diagrammatic Monte Carlo method for a physical system which is technically simple and where it works extremely well, namely the Fröhlich polaron problem. Sampling schemes for the Green function as well as the self-energy in the bare and skeleton (bold) expansion are disclosed in full detail. We discuss the Monte Carlo updates, possible implementations in terms of common data structures, as well as techniques on how to perform the Fourier transforms for functions with discontinuities. Control over the variety of parameters, especially in the bold scheme, is demonstrated. Sample codes are made available online along with extensive documentation. Towards the end, we discuss various extensions of the method and their applications. After working through these notes, the reader will be well equipped to explore the richness of the diagrammatic Monte Carlo method for quantum many-body systems.


Introduction
These notes originate from a series of lectures taught at international summer schools intended for researchers interested in numerical methods and strongly correlated systems. They introduce the diagrammatic Monte Carlo (DiagMC) method, a quantum Monte Carlo method for strongly correlated systems in which one, simply put, samples over all Feynman diagrams. Feynman diagrams are versatile and employ a universal language used in high-energy as well as in condensed matter physics. DiagMC is one of the most promising methods still under active development to deal with generic fermionic models in high dimensions. The goal is to give an introduction and flavor of this method.
Prerequisities for a thorough understanding of this text are familiarity with the basics of quantum mechanics and elementary quantum field theory (notions such as the interaction picture, Wick's theorem, Green function formalism, etc.), statistical mechanics (partition function, solving two-level systems, etc.), and undergraduate computational physics (curve fitting, root solving, interpolation techniques, etc.) including classical Monte Carlo methods (notion of detailed balance, Markov chain Monte Carlo, Metropolis algorithm, etc.).
Let us summarize the main idea of the method, and how it differs from other quantum Monte Carlo schemes -admittedly, different researchers use the notion of diagrammatic Monte Carlo in quite different contexts. To this end, we must discuss the type of expansion, the sampling space, and the nature of the sampled series. Newcomers may skip the remainder of this paragraph in a first reading.
The starting point is a very general perturbative expansion of the form, F ( y) = n x 1 ,...x n D(x 1 , . . . , x n ; y).
We compute a function F depending on external coordinates y (for example, the Green function G(k, τ) with momentum k and imaginary time τ) which has a perturbative expansion. At every order n there are internal coordinates x 1 , . . . , x n (these are the internal momenta and imaginary times) which can be discrete or continuous, and are summed or integrated over. [1] for a recent review). By contrast, a "strong-coupling" expansion is used in path-integral Monte Carlo simulations [2], in the worm algorithm [3] and the fermionic impurity solver CT-HYB [1]. In these schemes one perturbs in the kinetic hopping term whereas the solvable system (the potential energy term) is diagonal in the chosen Fock or real-space basis but not quadratic -it corresponds to the atomic limit. An expansion of the partition function Z at inverse temperature β = 1/T and volume V in the sense of a weak-coupling expansion and in the spirit of Eq. 1 reads where in the second line we worked out the time-ordering operator τ of the first line. This expansion leads to nothing but a Taylor expansion in the interaction H 1 , namely Z = k c k g k with g the coupling strength amplitude of H 1 and c k the coefficients that can be determined by evaluating all the integrals in Eq. 2 order by order, and which remain independent of g. Methods such as lattice determinant Monte Carlo and the impurity solvers CT-INT and CT-AUX (but also the Monte Carlo methods referred to as strong-coupling expansions) evaluate physical quantities in thermodynamic equilibrium as and give it the following statistical meaning: Sample configurations c are obtained, which are distributed according to the partition function Z with respective weights p c , and in which the quantity Q is evaluated. Hence, The unbiased estimator for the expectation value of the quantity Q is then to sum up Q c over all independent configurations and divide by the number of independent measurements. The normalization through the partition function is here manifest. As long as the system volume V and its inverse temperature β are finite, the Eq. 2 is an expansion in an entire function and hence always convergent (with the finiteness of the system we explicitly exclude all possible UV divergences that may still arise as e.g. in Sec. 8.2). The finiteness of the system ensures that no true spontaneous symmetry breaking can occur, which is at the heart of such methods as finite size scaling. When physicists use the term DiagMC in the sense of the expression "sampling over all Feynman diagrams" it implies a number of differences compared to the previous paragraph: The thermodynamic limit is taken from the start, the partition function is usually not used for normalization (instead, the lowest order diagram is often chosen (see below)), nor does the sampling necessarily take place in the space of the partition function diagrams: The method (usually) relies on the cancellation of disconnected diagrams when computing correlation functions as can be found in standard textbooks [4][5][6][7][8]. This can equivalently be considered an expansion of the free energy F ∼ log Z.
These differences allow us to sketch some of the key properties of Feynman diagrams, which can be considered its advantages: All diagrams are topologically distinct and the magnitude of the prefactor is always 1 [6]. The language of Feynman diagrams is universal in all fields of physics. Feynman diagrams factorize over internal building blocks, such as particle propagators (single particle Green functions), interactions, and vertices. Consequently, the diagram weight also factorizes, which is a prerequisite for successfully developing a Markov chain Monte Carlo method. Analytical treatments of low orders or limiting cases can be built in analytically. In DiagMC one does not attempt to write down all diagrams explicitly (since the number of diagrams grows factorially with expansion order, this is only possible for the lowest expansion orders anyway) but one instead develops algorithmic rules that allow one to sample over all diagrams. This implies changing the internal integration variables, but also the topology and the expansion order. Non-perturbative features are accessible via skeleton series [9] and (partial) resummations of a certain class of diagrams. This takes us away from the bare expansion, and we will also see how this works for the Fröhlich polaron. In fact, any analytical treatment known from the literature can be built in. Ideally, the Monte Carlo sampling should only deal with featureless functions originating from high-dimensional integrals whereas any intricacy related to the field theory is dealt with analytically a priori.
The aforementioned differences bring us at the same time to the first main difficulty in the development of the DiagMC method, which is the series convergence: It is usually unknown whether a series converges or not. The series is guaranteed to diverge at a phase transition, but it may happen sooner. In fact, most series in physics are asymptotic, which can be established rigorously in a number of cases. A well known argument, first formulated in the context of quantum electrodynamics, is Dyson's collapse argument [10]: When rotating the electric charge from e to ie in the complex plane around the origin, one sees that the system is unstable to collapse (the potential energy scales quadratically with the number of particles, which is faster than the kinetic energy), rendering the convergence radius zero. The same holds for any interacting bosonic field theory: No matter how small in magnitude the attraction in the potential energy is, it beats the kinetic energy for large enough particle numbers, leading to a collapse. The asymptotic nature of the series can sometimes be dealt with using resummation methods [8], but, in general, the issue of a non-convergent series is an open problem and in our view the most difficult one that DiagMC faces.
The second main difficulty in the development of DiagMC is the sign problem. Sign alternations are often inherent (and necessary) to the issue of convergence -without sign alternations the factorial growth in the number of diagrams could never lead to a meaningful result for an asymptotic series. Nor is the sign extensive in the system volume, as in path integral Monte Carlo simulations, which would prohibits us from finding the full solution [11]. Nevertheless, the sign problem puts in practice a limit on the expansion orders that can be reached. Di-agMC features hence a tacit assumption that the sign problem is sufficiently weak such that sufficiently high expansion orders can be reached in order to extrapolate in a reliable way to infinite expansion orders (often in combination with a resummation scheme that is powerful enough). Unfortunately, this assumption can only a posteriori be checked.
The third difficulty is dealing with multi-dimensional objects such as a multi-legged vertex in the Bethe-Salpeter equation. Despite active research in the fields of self-adaptive grids and concise data storage formats, this is equally an unsolved problem. However, only in cases that an explicit expression for the whole object (or a high-dimensional subpart) is required (such as in self-consistency schemes) can this be considered a problem; otherwise one can just sample over such an object without ever evaluating it in full.
In these notes we consider a model where these three problems do not occur: the Fröhlich polaron model is sign-positive in the imaginary time formalism and the Green function convergent for all finite values of the imaginary time. Due to the rotational symmetry of free space can the Green function be stored as a two-dimensional object, which is easy to histogram and manipulate. There are other simplifying factors, which are related to the absence of vacuum polarization diagrams, or, equivalently, the observation that Feynman diagrams for polaron (and impurity) problems can be mapped onto path integrals (cf. the structure of a backbone line in Fig. 5 below). Indeed, the analytical properties of mesoscopic systems such as polarons and impurity systems appear to be much simpler than those of true many-body problems. Furthermore, for almost all problems of this type very accurate variational approaches (and wavefunctions) are known. The Fröhlich polaron is hence ideal to get acquainted with the DiagMC method. Not suprisingly, it was also the first model to which the method was applied 20 years ago [12,13].
This text is structured as follows. After discussing perturbative expansions with continuous variables in Sec. 2, the main body of this text deals with the Fröhlich polaron problem, whose Green function is obtained from a bare expansion in Sec. 3, the self-energy from a bare expansion in Sec. 4 and from a bold expansion in Sec. 5. The source codes are made publicly available as discussed in Sec. 6. In Sec. 8 some related physical systems (of the polaron or impurity type) are listed where the acquired techniques can (and have been) applied without going into detail about the physics. For completeness, we mention that the method has also been successfully applied to a number of problems that cannot be considered of the polaron or impurity-type leading to deeper insight in notoriously hard problems. We mention resonant fermions [14][15][16], frustrated magnetism [17][18][19], and physics found in the Hubbard model [20][21][22][23], among others.

Continuous-time Monte Carlo
It is quite common to have discrete as well as continuous variables in quantum field theory. In this first section we explain, by means of the celebrated two-level system, how continuous variables and variable expansion orders can be dealt with in a Monte Carlo sampling. We employ the path integral representation here.

Model
Consider a two-level system with Hamiltonian, where σ x = 0 1 1 0 and σ z = 1 0 0 −1 are the usual Pauli matrices in the z-basis with basis states |↑〉 = 1 0 and |↓〉 = 0 1 . The h-field tries to orient the spin along the −z-axis which is countered by the Γ -field which tries to orient the spin along the −x-axis. This system can be solved exactly, with the solutions (shown in Fig, 1) which makes this system a good model to get acquainted with continuous-time Monte Carlo. From the symmetry of the Hamiltonian we see that we can swap x ↔ z if we also swap Γ ↔ h.

Perturbative expansion
Starting from the partition function we notice that the σ z operator is diagonal in this basis. In order to prepare for a perturbative expansion in the Γ σ x term, we introduce the Heisenberg operators and rewrite the partition function as This is an explicit formulation of Eq. 2, Z = Tr τ exp(−β H 0 ) exp(− β 0 H 1 (τ)dτ). To lowest order (n = 0) there are just 2 contributions, Z 0 = exp(−βh) + exp(βh). Graphically, this can be depicted as a continuous worldline from τ = 0 to τ = β (see panel (a) in Fig. 2). We use a full line for spin-up and a dashed line for spin-down. Note that worldlines are continuous and periodic in β because of the cyclical properties of the trace. For this reason, there are no non-zero contributions for n = 1, nor for any odd value of n. This means that the term (−Γ ) n in Eq. 10 is always positive and we do not have to worry about a sign problem. In second order (n = 2) (see panel (b) in Fig. 2) we have Note that there is no factor of 1/2 because it cancelled with the number of equivalent contributions from the time ordering operator and the corresponding changes in the time integration boundaries [6]. Although the higher order terms can be written in the same fashion, the integrals quickly become too complicated to evaluate explicitly. We therefore switch to a stochastic approach, for which it is easiest to think in terms of a graphical depiction, as shown in panel (c) of Fig. 2. To a vertex we attribute a factor Γ , and to each segment of length τ (measured taking the periodic boundary conditions in β into account) we attribute a weight exp(±τh) with the sign depending on the spin state.
Analyzing the limiting cases, we expect to find, with almost equal probability, worldlines that are dominated by one of the spin states with few vertices at high temperatures. At low temperatures, we expect a dashed line with few kinks for Γ h, whereas for h Γ the spin wants to orient along the −x-direction, which graphically translates into having many vertices, and for which our chosen basis along the z-direction is a poor choice. Our main task when designing a Monte Carlo scheme is hence to reach high expansion orders at low temperature with good efficiency.

Monte Carlo updates
There exist many equivalent ways to sample this system. The choice we make here resembles the updates later used in the Fröhlich polaron code, with similar design criteria. A minimal ergodic set of updates consists of the pair INSERT/REMOVE. If the INSERT update is chosen, we attempt to insert a new pair of vertices as shown in Fig. 3. We therefore select a random time τ 1 chosen uniformly over τ 1 ∈ [0, β[. Looking in the direction of positive imaginary times, we determine the time interval ∆ counted from τ 1 over which the spin occupation does not change. The second vertex is placed at a time chosen uniformly over the interval ∆. For the reverse update, the pair to be removed consists of randomly selecting a vertex and taking the subsequent one in the direction of positive time. The weight of the diagram segment between τ 1 and τ 2 in the old (i.e., before the INSERT update) configuration X is W (X ) = e −(τ 2 −τ 1 )hn 0 , with n 0 the spin occupation at time τ 1 in the old configuration. The weight of the corresponding segment W (X )P(X →Y ) . For the REMOVE update the acceptance factor is 1/r. The differentials dτ 1 and dτ 2 enter the formulas for W (Y ) and P(X → Y ) as a consequence of working with a continuous variable τ but they drop out in the acceptance factor r.

Estimators
The observables of interest are the expectation value of the spin magnetization along the xand the z-axis. There are two ways to measure the magnetization along the z-axis (up to an irrelevant minus sign). The first one consists of evaluating the magnetization at a fixed time τ = 0, which is an integer number. The second one evaluates the integral 1 β β 0 σ z (τ)dτ, which is a floating point number. Perhaps the reader thinks that the second way is far superior because it contains information from all times, but we will see that this is not true: the second way of measuring has only slightly lower error bars for the same runtime, whereas it is a considerably more expensive operation to perform, scaling linearly in the number of interaction vertices (even if done "on the fly" after every update). The magnetization along the x-direction can be measured as 〈σ x 〉 = − 〈N V 〉 βΓ as can be seen from Eq. 10. It is equally straightforward to obtain estimators for quantities such as 〈σ x (τ)σ(0)〉, but we will not discuss this further.

Results
Let us start at low temperature with a strong magnetic field in the x-direction. We take as parameters β = 10, Γ = 0.4, h = 0.05. After an initial thermalization phase of one million updates, we perform 10,000,000 updates, measuring after each one. After just a few seconds we see that we reproduce the exact result with error bars between 0.0001 and 0.001. The integrated autocorrelation times are about 3. We spend about 4% of the time in the zeroth order diagram, and close to 40% of the time in fourth order, although the code has occasionally gone to 16th order. So we sampled over quite a large Hilbert space and the code performed very well. There is no reason to optimize further.
At low temperature and strong magnetic field in the z-direction ( β = 10, Γ = 0.05, h = 0.4 and same runtime parameters as before) the autocorrelation times are also about 3. The error bars on 〈σ z 〉 are typically an order of magnitude smaller than in the case Γ h, which is explained by the fact that our basis is better suited. The error bars on 〈σ x 〉 are only slightly larger than before. In other words, the code is still behaving as expected. At high temperature ( β = 0.1, Γ = 0.2, h = 0.2 and same runtime parameters as before) the magnetization along either direction is about −0.08 and hence very weak. Perhaps surprisingly, we see that the error bar on 〈σ z 〉 is of the order of 0.04, which is 50 times larger than the error bar on 〈σ x 〉 and one to two orders of magnitude larger than what we had at low temperatures -whereas low temperatures should be much more difficult to simulate. This is also reflected in the integrated autocorrelation times, which are about 8, 000 (it could well be worse because it is not clear if the code has converged) for the magnetization along the z-axis and only ∼ 1 along the x-axis. Physically, the system has rotational symmetry in spin space, but this is clearly not respected in our updating procedure. As expected, the code spends 99.98% of the time in the zeroth order diagram and the acceptance ratio for our INSERT-REMOVE updates is 0.02%. What could be the reason for such bad autocorrelation times in an essentially non-interesting regime? The world-lines are 99.98% of the time straight world-lines but the up and down orientations are almost equally probable because of the high temperature. Our current update scheme only allows one to change the orientation of the magnetization via the insertion of kinks, which is highly inefficient at high temperature. To cure this problem, we add another update SPIN-FLIP which, for simplicity, is only allowed in the zero-vertex sector and which attempts to swap between the up and down orientations of the spin. Adding this update cures the problem. It is good practice to keep the code as simple (and local) as possible, and to optimize or write extra updates only in case problems pop up.
With this we close the discussion on sampling continuous variables and different expansion orders and proceed to the main part.

Fröhlich polaron: bare expansion for the Green function
The Fröhlich polaron problem describes the interaction between an itinerant electron and longitudinal, optical phonons. Historically, it was the first problem to which diagrammatic Monte Carlo was applied [12,13,24] for which it could provide definite answers regarding the polaron spectrum and arbitrarily precise polaron energies for any coupling strength. The Hamiltonian for a system in a volume V is given by The operators a k and b q are annihilation operators for electrons of mass m with momentum k and phonons with momentum q, respectively. The phonon frequency ω q = ω ph can be taken momentum-independent for optical, longitudinal phonons. The dimensionless coupling constant is α. Typical values for α vary from 0.023 for InSb over 0.29 for CdTe to 1.84 for AgCl (and are thus rather weak) [25]. We will work in units ħ h = m = ω ph = 1 and take the It is not the purpose of these notes to give an overview of the physics of the Fröhlich polaron, whose thermodynamics is now well understood (but questions remain for transport). We refer to the lecture notes by J. Devreese [25] for a pedagogical introduction. The basic competition in the model is between the electron kinetic energy trying to delocalize the particle and the phonons trying to localize it. The system can lower its energy by dressing the electron with phonons, resulting in the formation of a polaron. Its residue can be very low and the effective mass very high, but the polaron is never fully localized or fully self-trapped; there is hence no transition in this model.
For historical importance and to illustrate the connection with path integrals, let us remark that the Hamiltonian is quadratic in the phonon propagators, which can hence be integrated out. This results in a retarded one-particle propagator for the electron, where |r, τ〉 is in the basis of position and imaginary time. Thus, Eq. 13 conveys the intuitive idea of obtaining the probability amplitude for an electron to return to its initial position after an imaginary time evolution up to inverse temperature β by integrating over all possible trajectories ('paths') through imaginary time. This path integral expression served as the basis of Feynman's variational ansatz [26] which is remarkably accurate for the polaron energy for all coupling strengths. This path integral is, because of the retarded self-interaction, not as easy to simulate as the two-level system of the previous section, and will hence not be used for actual computations.
The structure of this section is as follows: We start with reviewing the necessary fieldtheoretical formulas to study quasi-particle properties, followed by the description of the algorithm used to simulate the polaronic Green function using a bare expansion. Next, we show some results that can be obtained with this code. In the following section the self-energy is computed using the bare expansion, with special emphasis on Fourier transforms and an illustration for the first-order diagram. Finally, the bold expansion of the self-energy is introduced, again splitting the discussion between the first-order diagram and higher order ones.

Digest of many-body theory
The central object of our analysis is the full single-particle Green function, which is related to the bare Green function G 0 and the self-energy Σ via the Dyson equation as For the polaron problem, we will work at zero temperature. To avoid instabilities due to poles, it is more convenient to work in imaginary time than with Matsubara frequencies ω n in the sampling. For impurity problems, the bare Green function is just with θ (·) the Heaviside function, ε k = k 2 2m the dispersion, and µ an energy shift which is used as a tuning parameter (see below). In Matsubara representation the bare Green function takes the form .
The full Green function will have a pole at iω n = E k −µ where E k is the self-consistent solution to given that the imaginary part of Σ(k, E k − µ) vanishes. We may then expand the self-energy around the pole position, allowing us to rewrite the full Green function approximately as with the quasi-particle residue The approximation Eq. 19 holds as long as the quasi-particle pole is sufficiently far away from the dissipative continuum, the separation to which we call ∆. Transforming back to imaginary time, the quasi-particle energy E k and residue Z k (which is the modulus squared of the overlap between the quasi-particle state and the free electron state) can be extracted from the large τ behavior of the full Green function under the same assumptions, We will solve the problem of obtaining Σ(k, τ) for fixed µ by diagrammatic Monte Carlo and are left with the task of finding E k such that Eq. 17 is satisfied. This can be done by a root-solving algorithm in combination with one-dimensional integration. When E k is found self-consistently, Eq. 20 determines the corresponding residue. The dispersion of the quasiparticle is given by analyzing E(k) as a function of k.

Algorithm
The simplest way to solve the Fröhlich polaron problem is by considering the bare expansion of the full green function by using the expansion elements shown in Fig. 4. This was also presented in the original solution by Prokof'ev and Svistunov [12]. Wick's theorem tells us that there can be no unpaired phonon creation and annihiliation operators, i.e., all phonon operators pair into 'arcs', the number of vertices is always even and V (q) in Eq. 12 only enters as a product with its complex conjugate. Graphically, the expansion is illustrated in Fig. 5. We can label the expansion order by counting the number of phonon propagators. In order n there are n phonon propagators, 2n vertices and 2n + 1 impurity Green functions. Our task consists of sampling over all possible diagrams for the Green function G(k, τ), i.e., sample over all possible expansion orders n, all allowed topologies, and integrate over all internal momenta q i , i = 1, . . . , n, and vertex times τ j , j = 1, . . . , 2n. Only connected diagrams are considered. This diagram is a typical example of a 'configuration' whose weight is the product of the weights of the composing Green functions, vertices and phonon propagators.
Every Feynman diagram is a valid Monte Carlo configuration, with a weight that factorizes into the product of the individual electron propagators G 0 (k, τ), phonon propagators and vertices. It is convenient to absorb the 1/q vertex dependence into the phonon propagators D(q, τ) = D(q, τ)/q 2 and constants into the coupling constantα 2 = 2πα 2, see Eq. 12 and our choice of units.
As an example, the full expression for the weight W (2) c of the second order diagram with crossing phonon lines (one of the three possible topologies in second order) reads Here p and τ are the external momentum and time, respectively. The independent momenta are chosen as p 1 and p 2 , whereas q 1 = p−p 1 , q 2 = p 1 −p 2 , and p 3 = p 2 +p−p 1 = p−q 2 follow through momentum conservation. The factorization of the weight into bare Green functions and phonon propagators is now manifest. The extension of such explicit analytical formulas to higher order is however cumbersome in comparison to drawing the Feynman diagrams. We proceed therefore to how the diagrammatic Monte Carlo sampling can be performed. The updating scheme discussed below differs from the one introduced originally by Prokof'ev and Svistunov. Using the freedom which every designer of a Monte Carlo procedure has, we seek the simplest set of updates that is ergodic and remains as local as possible. By locality we mean that the number of changes to the current configuration is minimal and only involves one diagrammatic element plus its adjacent elements.
External variables -Because of the spherical symmetry of the Hamiltonian, we can choose the orientation of the external momentum k to be along the x-axis as (k, 0, 0), in which case the full Green function is a two-dimensional object in (k, τ) space. We can predefine a set of external momenta k j for which we compute G. The simplest choice is a uniform grid, k j = j∆ k with ∆ k = k max /N k where k max the momentum cutoff and N k the number of momentum points. Figure 6: Illustration of the INSERT-REMOVE pair of updates.
Other choices for the grid are possible and perhaps even better because for low momenta we expect a dispersion akin to k 2 /(2m * ) with m * the effective mass of the polaron, which suggests that a quadratic grid might be better. We will not go into further detail. The same procedure can be applied for the external time τ. Let us follow here another approach and consider τ to be a continuous coordinate between 0 and τ max , which we bin into a uniform grid of bin length ∆ τ at the expense of a small systematic discretization error (∆ 2 τ ) if we use the discretized τ j = ( j + 1/2)∆τ. A logarithmic grid might be a better choice given Eq. 21 (from experience we know that it does not really matter at this stage).
Normalization -We choose the zeroth order diagram for normalization, which is just the bare propagator G 0 . Because of the updates CHANGE-P and CHANGE-TAU (see below) the total normalization integral is All quantities in the Monte Carlo sampling can be normalized by multiplying with /C 0 where C 0 counts how often we are in the zeroth order diagram. This can be seen as follows: The estimator for the normalization diagram is The estimator for the full Green function is where we used that G(p j , τ i ) = p k τ max 0 G(p k , τ)δ(τ − τ i )δ p j ,p k dτ and all τ residing in the bin i are taken together in entry τ i (it is possible to improve on this by computing the ratio G(p j , τ i )/G(p j , τ) , although there is seldomly a need for that). The proportionality constant drops out when normalizing with ∆ i the volume of the time-bin i. The same normalization can be applied to other quantities of interest such as the bare Green function and the first-order Green function. CHANGE-P -This update is only allowed if the expansion order is 0. In this update, which is its own reverse, we uniformly select a new p j from the set of allowed external momenta and accept it according to the Metropolis algorithm as min [1, r] with acceptance factor . We can also opt to keep the external momentum fixed in a single run.
CHANGE-TAU -This update is only allowed if the expansion order is 0. In this update, which is its own reverse, we select a new external time τ using an exponential distribution. If the dispersion is ξ p = ε p −µ with p the external momentum and u a random number uniformly chosen between [0, 1[, then we construct τ = − ln u/abs(ξ p ) and accept it as the new external time of the diagram if τ < τ max .
INSERT -This update attempts to increase the number of phonon propagators by one (its reverse is REMOVE, see below and Fig. 6) and is constructed as follows: Select a ran- Figure 7: Illustration of the SWAP update. The phonon propagators carrying momentum q 1 and q 2 are understood to connect back to the fermion line at vertices at times τ A and τ B , respectively. τ A and τ B may each be either before or after τ 1 and τ 2 . After the update, the central fermion propagator carries a momentum k = k − q 1 − q 2 .
dom electron propagator and identify its left and right endpoints. Let us call this propa- The three components of the momentum q are obtained from the Box-Müller algorithm as a gaussian random number with mean 0 and variance m/(τ 2 −τ 1 ). The acceptance factor is given by where X stands for the old configuration and Y for the new one, W (·) for their respective weights, and P(·) for their a priori transition probabilities. They are given by Here, p INS and p REM are the probabilities to select the INSERT and REMOVE probabilities, respectively. Note in particular that all differentials cancel in the acceptance factor r. The reader notices that further cancellations occur in the acceptance factor such as the electron propagators between τ 1 − τ L and τ R − τ 2 , as well as any µ-dependence. Those cancellations are only exact if function calls are used; for tabulated objects in combination with interpolation techniques there are tiny deviations from these cancellations. REMOVE -This is the reverse update of INSERT. We uniformly select a phonon arc and check if its vertices are consecutive elements in the time ordered confiugration (see P(Y → X ) above). If this is not the case, the update is rejected. The acceptance factor is the inverse of the one determined above for the INSERT update.
SWAP -The INSERT and the REMOVE update allow to change the expansion order but are insufficient to generate all possible topologies because they do not allow phonon arcs to cross. The SWAP update allows one to change the topology within a given expansion order n ≥ 2. With the notation of Fig. 7 we randomly select a vertex excluding the last one. If it has a time τ 1 and the next one a time τ 2 , we attempt to swap the end points of their respective phonon propagators. In order to conserve momentum at every vertex, the momentum of the electron propagator between τ 1 and τ 2 changes -in line with our design criterion of finding updates that are local. The acceptance factor is given EXTEND -Although this update is not needed for ergodicity, it is a useful one to improve the sampling. It changes the duration of the rightmost electron propagator in a similar fashion as the CHANGE-TAU update.

Implementation
The number of diagrams grows as (2n − 1)!! = (2n − 1)(2n − 3) . . .. Only the lowest expansion orders can be evaluated explicitly, but the Monte Carlo algorithm manages to sample over the most important contributions in any order. The parameter µ requires some finetuning: Its magnitude needs to be sufficiently large such that the full Green function decays exponentially. The closer µ is chosen to the unknown polaron energy E 0 (i.e., |µ| |E 0 |) the less rapid this decay will be and the more accurate the fit (cf. Eq. 21) can be performed. For sufficiently strong α, and µ chosen closely to E 0 , the expansion order can be 100 or more.
Other authors prefer the use of a cyclical implementation [13] instead of a backbone line. The aim is to treat the electrons and the phonons on equal footing. It is also the structure that naturally arises at finite temperature. At zero temperature, we see little advantages for polaron problems and have not used cyclical diagrams in our codes.

Data structure
Let us now discuss the data structure. There are various equivalent ways to store the diagram. E.g., one may either (i) store the intervals between the emission and absorption of a single phonon along with its momentum, or (ii) one opts to store the vertices. We choose the latter approach. The necessary information needed to specify a vertex are its time, a pointer to the vertex that it connects to via the phonon propagator, the phonon momentum and at least one momentum interacting at the vertex such that all momenta can be inferred from momentum conservation. If we choose, say, to store only the phonon momenta, all electron momenta in the diagram can be computed from the given external electron momentum and by invoking momentum conservation at every vertex, but this is obviously a costly operation scaling linearly with the number of vertices. In the present implementation we decided to redundantly store all three momenta at each vertex for reasons of simplicity and memory-locality. A configuration is then specified by a time-ordered collection of such vertex objects.
When choosing the data structure, one should be conscious of the operations required by the update scheme and their respective complexity. Obviously, the ability to INSERT and REMOVE vertices efficiently while retaining the time ordering as well as the ability to seek forward and backward along the electronic backbone line are crucial, thus ruling out plain contiguous array-like data structures. Likewise, the INSERT update needs to randomly pick an electron backbone segment, the REMOVE update randomly picks a phonon propagator, and the SWAP update randomly selects a pair of adjacent vertices. All three of these ultimately draw a vertex uniformly from the set of all vertices (or in case of SWAP from all but one).
We implemented a number of different data structures to meet these requirements to varying degrees and gauge their impact.

1.
A doubly-linked list as provided in C++ by std::list satisfies the first criterion with (1) insertion and removal but requires one to start at the beginning and iterate through the list to reach a randomly picked vertex, thus resulting in (N ) scaling (with N the number of vertices).  2. A self-balancing binary search tree, e.g. an AVL or red-black tree, provides (log(N )) insertion and removal and in principle also allows for true random access of an ordered sequence in (log(N )) when nodes keep track of the number of nodes in their subtrees. Search trees will automatically enforce ordering which we however do not benefit from as the update scheme is designed in a way that retains time ordering anyway. While std::map is usually implemented in terms of binary search trees, it cannot be used off-the-shelf here as it hides its tree implementation and does not allow for the kind of additional bookkeeping required to achieve fast random access. For testing, we implemented an AVL search tree with a function to randomly access elements by index.

3.
A doubly-linked list may be combined with a contiguously stored array (a std::vector) of iterators to the list elements that serves as a lookup table. Upon insertion, an iterator to the newly created list element is pushed to the back of the array. The list element is likewise tagged with the index of its iterator in the array. When removing a list element, its iterator in the array swaps places with the last one (updating the tag of its list element) before it is popped. This procedure retains the (1) complexity of insertion and removal operations and keeps an up-to-date array containing iterators to all the list elements contiguously, albeit not in time order. Thus, we do not get proper random access but gained the ability to pick a random element in (1). Care has to be taken when applying this to the SWAP update.
The performance impact of the choice of data structure depends on the average order that is reached in the course of the simulation which in turn depends on the system parameters. In our benchmark, Fig. 8, we decided to keep α = 1 fixed and vary the chemical potential to probe a number of mean expansion orders.
In situations where the average order was below 10, the search tree (implemented as an AVL tree) performed badly compared to the list-based data structures due to the added overhead. It would only become a feasible alternative outperforming the plain list when orders beyond 40 were reached as can be seen from Fig. 8. In contrast, the list-array combination barely shows any scaling with the diagram order and was consistently faster than the plain list indicating that the overhead added due to the lookup array is very light. For models with a sign problem where only low expansion orders can be reached, it does not matter how the data structure is implemented.

Error bars
The estimation of the error bars on the Green function is complicated by the fact that the normalization itself is estimated from the same simulation. We employ the jackknife resampling technique to account for that. This requires knowledge of the time series. Sampling after every single update would result in excessive memory demand and post-processing time due to many highly-correlated samples and negate the efficiency of the local update scheme. Thus, we group updates into bunches of N loop elementary updates. After each elementary update, we increment histograms for the Green function, the zeroth order counter used for normalization, etc. After N loop elementary updates, the histograms are measured, i.e. recorded in the time series, and subsequently reset. The choice of N loop can be guided by the estimated autocorrelation time obtained from a binning analysis.
Within the framework provided by the ALPSCore library (cf. Sec. 6), we chose to rely on the FullBinningAccumulator to perform the above binning analysis for us. Further, any derived quantities calculated from the observables are automatically resampled using the Jackknife method.

Results
For α = 1.0, µ = −1.2 and a runtime of about 1 minute on a single core laptop one can extract the polaron energy with an accuracy better than a percent. Stronger couplings are a little bit harder to simulate: We show the Green function for α = 5, µ = −6 and zero momentum in Fig. 9a. By fitting the exponential tail according to Eq. 21, one can extract the polaron energy (E 0 = −5.55) and residue (Z = 0.032). Here, the fit took data from τ fit = 5 onwards into account. However, for µ = −6 the Green function decays rather quickly. This limits the maximum time that can be accessed with reasonable accuracy. Additionally, when taking the error bars into account in the fit, the data close to τ fit impact the fit more strongly due to their higher accuracy. This leads to a heightened sensitivity towards systematic errors from the non-asymptotic fast initial decay with respect to the choice of τ fit . In order to get a more reliable estimate of the polaron energy, we tuned the chemical potential to achieve longer imaginary times along with a less severe growth of the error bars. Choosing µ = −5.6 ( Fig. 9b), τ = 40 was accessible, allowing us to probe the asymptotic regime over time scales many times that of the initial fast decay. The inset in Fig. 9b displays results for the polaron energy estimated from fits with different τ fit . This has been done for the data from each of the 28 independent MPI processes to yield an error on that estimate. It can be seen that after an initial influence from the non-asymptotic onset, the results beyond τ fit = 5 stay consistent within their error bars. Our final result reads E 0 = −5.5498 ± 0.0021 and Z = 0.03215 ± 0.00084. The polaron energy was thus found with a relative accuracy of 0.04 % at modest computational effort.
The polaron energy is remarkably close to the value predicted by Feynman's variational ansatz despite the rather strong coupling α = 5. Feynman's trial action is parametrized by parameters v and w, the latter of which was assumed to have only a mild influence on the end result [26]. Feynman then optimized for v at fixed w, treating parts of the integrals approximately. From the expressions he gave in the strong-coupling regime, we find E 0 = −5.33 (for w = 1) and E 0 = −5.39 (for w = 3). With today's readily available numerical integration and optimization tools, we also optimized for v and w simultaneously without taking any approximations to the integrals. This results in an improved variational energy of E 0 = −5.44 (v = 4.03, w = 2.14). Thus, our Monte Carlo estimate is 2 % lower, in accordance with the variational principle. The dispersion for α = 1 is shown in Fig. 10. In this calculation we recalculated one of the first hallmarks of DiagMC [12]. It shows that the perturbative result incorrectly predicts an endpoint to the dispersion, whereas the DiagMC results show that the binding energy can be seen up to zero energy. In passing we note that other formula for the computation of the effective mass and the group velocity exist, see Ref. [13,24]. The histogram over the expansion orders for α = 5 is shown in Fig. 11. For large enough expansion orders n, H[n] decays exponentially. The acceptance factors are about 5% for INSERT and REMOVE and 29% for SWAP. Those numbers are acceptible. If deemed too low, or if the frequency of visiting the normalization diagram is too low, reweighting and flat histogram techniques should be used.

Fröhlich polaron: self-energy
It is often advantageous to compute the self-energy instead of the full Green function and resort to the Dyson equation (Eq. 14) to obtain the latter. However, a Fourier transform from imaginary times to (Matsubara) frequencies is needed to cast the Dyson equation in algebraic form; otherwise, it is a convolution. Below we first discuss how to perform such Fourier transforms by considering the first-order diagram, and then proceed with the diagrammatic Monte Carlo computation of the full self-energy. In this text, the self-energy is always understood as the one-particle irreducible self-energy [5].    Figure 14: Same as in Fig. 13 but instead of the self-energy we compute the convolution (Σ G 0 )(τ) for p = 0, which behaves as ∼ τ for τ → 0. Compared to Fig. 13, treating the divergence of the self-energy more carefully allowed us to substantially reduce the required number of Matsubara frequencies.

Fourier transforms explained for the first-order self-energy
The first-order self-energy is shown in Fig. 12 together with the Green function related to this diagram via the Dyson equation. The first-order self-energy can be computed analytically for zero external momentum, When applying Eq. 17 and using a root solver, we find that the polaron energy is given by E (1) 0 = −2.6258286 for α = 5 in our units. The first-order self-energy for non-zero external momenta can be computed as, where F is the Dawson function, F (x) = e −x 2 x 0 e y 2 d y.
For small values of the argument, it behaves as F (x) ≈ x − 2x 3 /3. For large values of x, it behaves as F (x) ≈ 1/(2x) + 1/(4x 3 ). This formula for Σ (1) (p = 0, τ) is the Fourier transform of the self-energy in real space andD which can be seen as a retarded Coulombic-like potential.
In principle, all we need to do is apply the Dyson equation (Eq. 14) and Fourier transform G (1) (p, ω) back to the imaginary time domain. However, as we will see, this must be done very carefully.
Shape of the self-energy -The most important observation is that the self-energy diverges as 1/ τ for τ → 0 for any p. This is in fact quite a common situation in continuous space, originating from the momentum integral over the bare electron Green function. The first-order selfenergy very often has limits that need to be analyzed analytically. Is this divergence a problem? On the one hand, any integral ε 0 Σ (1) (p, τ)dτ is convergent; in particular, there is no problem with the existence of a proper Fourier transform. On the other hand, a Taylor expansion of the self-energy around the middle of the τ-bin shows that the binning process has uncontrollable systematic error bars for sufficiently small values of τ → 0. The solution is not difficult: One can choose to refrain from sampling and compute the self-energy analytically for fixed, discrete τ j values, thereby circumventing the binning issue. If this is not an option and sampling remains essential, then one should make a measurement ofΣ (1) (k, τ) =Σ (1) (k, τ) τ, which is a featureless function of τ. Whenever the discretized Σ (1) (k, τ j ) is needed, it is obtained fromΣ (1) (k, τ j )/ τ j . Unfortunately, this is not the only problem associated with the 1/ τ divergence as we will see in the discussion on the Fourier transforms.
Fourier transforms -One of the most fundamental differences between classical mechanics and quantum mechanics is the occurrence of non-commuting operators in the latter, which in turn leads to the time ordering inherent to quantum field theory. This is already apparent from the Heaviside θ (·) function in Eq. 15. It has the following frequency representation It is this behavior which explains the structure in Eq. 16. Note that the coefficient of the 1/(iω n ) term in Eq. 16 is exactly 1 (which is identical to the jump in the Green function for τ = 0 + and τ = 0 − ) thanks to the (anti-)commutation relations of bosonic (fermionic) annihilation and creation operators [5]. Therefore, the same asymptotic behavior 1/(iω n ) holds not only for G 0 but for any Green function G. In the inverse Fourier transform G(k, ω n ) → G(k, τ), brute force summing (or integrating) over all frequencies in the hope of restoring the Heaviside θ (·) function is hopeless. One therefore needs to treat the large-frequency tails carefully. At finite temperature, Fourier transforms assume that the function can be periodically continued, which is likewise violated. The easiest solution is to (i) only use the analytic formulations Eq. 15 and Eq. 16 for the bare Green function, and (ii) never perform a Fourier transform on any full Green function G but only on differences δG = G − G 0 . In doing so, the leading asymptotic frequencies are compensated as well as the discontinuity in imaginary time taken care of. It is also possible to treat the 1/(iω n ) 2 in the same fashion: Its coefficient in frequency space corresponds to the (sum of the) slope(s) of the Green function at τ = 0 + (and τ = 0 − ) in imaginary time.
In the literature one can also find formulas for the 1/(iω n ) 3 term, but we have never seen a case where this is necessary for the success of the calculation.
When we rely explicitly on the jump being 1 in G between τ = 0 + and τ = 0 − , we need to make sure in the Monte Carlo sampling that we choose τ max large enough such that G(k, τ max ) ≈ 0 since by construction we have G(k, 0) = −1. As we have seen before, this means that τ max must be quite large when µ is chosen close to the polaron energy E 0 . The region where Σ (1) (p = 0, τ) is sizable may well appear smaller than this. Let us now use the imaginary-time and Matsubara formalisms for the Fourier transforms; specifically, with the Matsubara frequencies ω n = 2nπ β for bosons and ω n = (2n+1)π β for fermions. Here, we have introduced a fictitious inverse temperature β = τ max to impose the discretization in frequency space. A single electron obviously has no statistics, so we can use either bosonic or fermionic frequencies, or just ω n = nπ β -it should not matter as long as we use a transformation that turns the Dyson equation into an algebraic equation. From the Dyson equation δG(p, ω n ) can be written as Observing the decay of G(p, τ) over many decades (as a function of τ) requires in turn a huge number of Matsubara frequencies. The naive implementation of the Fourier transform scales as (N 2 ) where N is the number of points in the time/frequency domain and becomes too costly. Although alternative approaches exist [27], let us explain here how fast Fourier transforms (FFT) can be used, with a scaling as (N log N ). For simplicity and efficiency, we rely on the open source package FFTW [28]. Although our input data is real (impyling that G(p, ω n ) = G * (p, −ω n )) and we could use the function calls r2c and c2r (cf. the FFTW documentation; we could save a factor 2 in storage) we consider this advantage negligible and use instead the easier function call dft for complex input and output. At this point the reader should keep in mind that FFT is only a tool for solving the equations Eq. 34 and Eq. 35.

It performs
between input data X and output data Y , and this is not identical to Eqs. 34  FFT assumes an equidistant grid where the input data are located exactly on the grid points. If we need more Matsubara frequencies than we have grid points in imaginary time, or if we use a non-uniform grid, we need interpolation methods. In practice, quadratic or spline interpolation is used. After binning the data for the self-energy, we have discretized values τ j = ( j + 1/2)∆τ (in order to avoid τ = 0 the same has to be done with Eq. 29); i.e., we do not have the self-energy evaluated precisely on the grid points as FFT requires. If we choose bosonic Matsubara frequencies ω n = 2πn/β, then this problem does not pop up for the frequencies. Dropping the diagonal momentum index k to make the notation lighter, the discretized form of Eq. 34 reads with ∆τ = β/N . This is almost identical to what FFT can compute for us: Apart from multiplying the input Σ[ j] with ∆τ (the integration measure), the output self-energy of the FFT must be multiplied by e inπ/N for each positive frequency n = 0, . . . , N /2, and by −e inπ/N for each negative frequency n = N /2 + 1, . . . , N − 1. This operation needs to be undone when δG(p, ω n ) → δG(p, τ) (and we should not forget the factor 1/β). In case of fermionic Matsubara frequencies, a similar phase multiplication on the data in the time domain can be derived (we leave the exact phase as an exercise). The Green function corresponding to the first-order self-energy at zero momentum is shown in Fig. 13. We see that the required number of Matsubara frequencies is prohibitively large before agreement with the bare result is found; i.e., the systematic error of the truncation in Matsubara frequencies dominates over the statistical error of the unbiased Monte Carlo sampling of the bare Green function. The reason is that the nasty ∼ 1/ τ divergence of the first-order self-energy leads by dimensional arguments to a 1/ ω n behavior, which decays even slower than a Green function for large frequencies. One could treat this divergence analytically (e.g., by Taylor-expanding the self-energy), or pursue the following approach: First, notice that the 1/ τ divergence in Eq. 30 is to leading order independent of the momentum p (and hence identical to the p = 0 result). Second, notice from the Dyson equation Eq. 36 that it suffices to compute Σ(p, ω n )G 0 (p, ω n ) corresponding to the convolution (Σ(p) G 0 (p))(τ) = τ 0 Σ(p, τ )G 0 (p, τ − τ )dτ in imaginary time and which hence behaves as ∼ τ for τ → 0. This cures the divergence but still has a divergent slope: When binning data over the sampled continuous variable τ we should still measure (Σ G 0 )(τ) τ, as mentioned before. Before performing the Fourier transform, we subtract In this approach only (a few) thousand Matsubara frequencies are needed, mostly to accommodate the decay of the Green function over many decades, see Fig. 14.

Computation of the full self-energy
Compared to the code for the computation of the full Green function in the bare expansion, a few minor modifications are needed for the evaluation of the self- energy Σ(k, τ). First, the initial and final bare electron propagator have to be removed; the first vertex must have time 0 and the last one time τ. We can still use a bare propagator for normalization purposes, but it obviously does not contribute to the self-energy measurement and is hence considered a fake diagram. The transition between the fake sector and the first-order diagram is best done with a separate update pair FROM-FAKE and TO-FAKE. INSERT and REMOVE allow one then to switch between expansion orders n and n + 1 for n ≥ 1, but do not need further modifications. The EXTEND update also needs to be slightly modified because the duration of the final phonon propagator changes. Finally, in the SWAP update we need to check for one-particle reducibility: in case an electron propagator is not covered by any phonon propagator, the diagram is oneparticle reducible, i.e., it would fall into two pieces when cutting this propagator line. A simple check for one-particle reducibility is to compare the momentum of an electron propagator with the external one. Because in the SWAP update only one electron momentum changes, this is a very cheap test to perform. The required changes to the code are left as an exercise for the reader. Note that, although the above is sufficient, the actual implementation that accompanies this document uses elements (such as DRESS-VERTEX) of Sec. 5.3 for efficiency reasons.
The self-energy is shown in Fig. 15. One sees that the first-order self-energy diverges as 1/ τ for τ → 0 but decays rapidly for large τ. The higher order terms have a vanishing contribution at τ = 0 but develop an exponential tail for large τ which is important for the energy of the polaron. This figure shows that computing the convolution of the self-energy with the bare Green function is optional for the higher order terms.
In Fig. 16 we show the estimation of the polaron energy from Eq. 17. The intersection point of the right-hand side integral φ(E) = ∞ 0 Σ(0, τ)e (E−µ)τ dτ with the identity line E = E determines the polaron ground state energy. We have sampled φ(E) in the Monte Carlo simulation directly on a non-uniform grid of E-values centered around the polaron energy estimate from the analysis of the Green function in the bare expansion (cf. Sec. 3.6). One could also calculate φ(E) from the discretized self-energy in post-processing and find the intersection point using a bisection scheme, albeit without the benefit of reliable error bars.
Note that the data are strongly correlated amongst different values of E resulting in a relatively smooth appearance of φ(E) whereas the error analysis reveals that the errors are indeed substantial. The integral over the first-order self-energy may be calculated analytically, yielding φ (1) (E) = −α m/(ω ph − E), thereby treating the 1/ τ divergence exactly, while the higher orders are integrated numerically from QMC data. However, we did not find any significant discrepancy compared to taking the full self-energy.
φ(E) should be independent of the choice of µ and, indeed, the curves calculated from simulations carried out at µ = −6 and µ = −5.6 coincide within their errors. However, like in the bare expansion, choosing µ close to (but below) the polaron energy significantly reduces the error as longer times can be accessed before the self-energy (Green function) decays. The locus of the intersection point has been interpolated. Any interpolation errors are bound to be negligible compared to the statistical one. Propagating the error onto the intersection point, we find E 0 = −5.5497 ± 0.0042 or a relative error of 0.08%, in agreement with the analysis from the bare expansion, Sec. 3.6. In contrast, for µ = −6, the result is E 0 = −5.546 ± 0.016 which is significantly less accurate.

Fröhlich polaron: bold diagrammatic Monte Carlo
The sampling space can be further reduced when skeleton techniques are used. Graphically, this corresponds to the notion of 2-particle irreducibility: The bold diagrams for the self-energy do not fall apart when cutting any two electron propagator lines. Originally demonstrated for a (linear) scattering problem [9], it was believed that non-perturbative physics can be incorporated this way and that the series convergence could be better than for the bare series.
In case the bare series is absolutely convergent, the bare and the bold series must converge to the same answer. The bold series for the Fröhlich Hamiltonian has merely a demonstrative character: in case of a convergent sign-free sampling of the bare series, it makes little sense to use anything more complicated.

First-order self-consistent diagram: non-crossing approximation
Let us first illustrate the method by considering the self-consistent approach to first order, i.e., the diagram shown in the upper panel of Fig. 17. One sees that the self-energy depends on the full Green function G(k, τ), which itself is a function of the self-energy via the Dyson equation    (see the lower panel in Fig. 17). This is a self-consistency problem, The self-consistency problem is usually solved by iteration (note that this iteration is not a Markov process). Given an initial guess for G (1) B , the self-energy is computed (numerically or stochastically for the higher orders) in imaginary time and subsequently Fourier transformed to Matsubara representation, from which a new Green function is extracted via the Dyson equation and brought back to imaginary time representation by an inverse Fourier transform, and this procedure is repeated until convergence is reached. It is often needed to introduce a damping factor. As can be seen in Fig. 17 the diagrams thus summed correspond in the bare series to all possible diagrams in which the phonon lines do not cross (the so-called noncrossing approximation (NCA)).
Given the previous experience with numerical instabilities in the first-order self-energy using the bare expansion (see Sec. 4.1), we anticipate the same problem. We split hence that is, we subtract the bare propagator from the bold propagator and evaluate the corresponding contributions to the self-energy separately. The first part is simply the first-order self-energy Σ (1) ≡ Σ The integral can be split as where the first and third integral can be evaluated numerically and the middle integral vanishes in the limit ∆p → 0, The first-order contribution Σ (1) [G 0 ] in Eq. 44 singles out the 1/ τ divergence of the self-energy. As before, it should be treated carefully, for instance as the convolution (Σ (1) (p = 0) G 0 (p))(τ) which we suggested before. The corresponding Green function is shown in Fig. 18 for zero momentum. The momentum dependence of the Green function can be seen in Fig. 19.

Grid and momentum cutoff
It is seen in Fig. 17 that even the computation of Σ B (p, ω n ) for all momenta p. We hence need a two-dimensional grid for G and Σ in (p, τ) space, and biquadratic (or bicubic if affordable) interpolation. We also need to introduce a momentum cutoff to store the measurement of Σ. We now analyze if (and when) the momentum cutoff matters.
The bare Green function decays with momentum as a gaussian for fixed values of τ. Large values of τ are usually unimportant because of the µ dependence in the propagator, providing a cutoff in time (even for p = 0 this term is present). If large values of τ occur, they provide a cutoff for the momenta of the order of p ∼ 2m/τ. However, for very small values of τ there is no restriction on the values that p can take.
To see what influence large momenta have in practice, we show in Fig. 20 the histogram of the logarithm of the modulus of all electron momenta contributing to the self-energy (using the bare G 0 expansion) when the external momentum is p = 0. The main contribution is at low momenta, as expected, and large momenta are suppressed as a power law with exponent 4.25 (5); contributions for p > 4 are seen to be already very small. But given that our requirement on precision is extremely high (just recall the reported very small values of the Green function for large values of τ -surely we need our systematic error to be much smaller than the signal), it is not a priori clear that the cutoff dependence is going to be negligible. This is likewise reflected in the first-order (non-bold) self-energy. Choosing a very small τ, we see that for p 2 2m/τ the first-order self-energy is to leading order momentum independent (namely, a large constant because of the ∼ 1/ τ divergence) but for p 2 2m/τ it behaves as ∼ p 2 . This follows from the asymptotic expansions of the Dawson function. The momentum dependence in the NCA approximation is unfortunately not identical to the one in the bare first-order self-energy and hard to grasp analytically. Figure 23: Illustration of the VERTEX-DRESS / VERTEX-UNDRESS pair of updates.
Higher order diagrams should be better behaved: When phonon lines cross, then phase space arguments for τ show that small values of τ are not giving important contributions to the self-energy. We believe it is indispensable to numerically check the influence of the cutoff, which we show for the Green function in the NCA approximation in Fig. 21. Note that we used bare propagators whenever momenta whose magnitude is higher than the cutoff were requested, but it makes little to no difference compared to a hard cutoff. This is an approximation but would lead to a correct evaluation of the first-order bare diagram in all circumstances. It is seen in Fig. 21 that the cutoff dependence is not pronounced.
However, quantities such as the energy converge rather slowly with the cutoff parameter (since the energy corresponds to the asymptotic decay of the Green function one can appreciate this aspect from Fig. 21) even though the approximate values are very close to the final one. Furthermore, precise energies remain sensitive to the discretization. Getting control beyond the ∼ 0.5% level of accuracy on such quantities is not an easy task in a self-consistent approach, and a systematic study of all parameters is warranted. We show a characeristic example in Fig. 22, where we modify the cutoff as well as the momentum spacing. We chose the time discretization such that it is about two orders of magnitude weaker than the momentum discretization in this plot. However, choosing a smaller starting value for the fitting interval over which the energy is determined leads to fluctuations of the same order of magnitude as in Fig. 21. Since the tail should be fitted for the energy, we believe that our starting value (τ fit = 5) is conservative. A striking feature in Fig. 22 is the non-monotonous behavior for large cutoffs, indicating that too few momenta might well have led to erroneous extrapolations. At the moment we see no other option than a purely numerical analysis of this dependence, and argue that it is indispensable to perform such checks.

Code
Bold DiagMC requires only a couple of changes to the code for the self-energy: Boldification -This step has been described already in Sec. 5.1.
New updates -The current implementation of the INSERT update automatically leads to a two-particle reducible diagram. One possibility is to keep the updating scheme as is supplemented with introducing a flag signalling two-particle reducibility, and making sure that the self-energy is measured only in the irreducible space. There exists however a way to add a phonon arc such that it always leads to an irreducible diagram. It works as follows (see Fig. 23): first a random vertex at time τ V is chosen, excluding the one at τ = 0. Then we propose to insert a new phonon propagator that dresses this (but only this) vertex. This is also a local update, and since it increases the number of phonon line crossings by one, this update always leads to a physical diagram. We leave the derivation of detailed balance for this VERTEX-DRESS/VERTEX-UNDRESS pair as an exercise, but note that choosing the last vertex requires special care.
Irreducibility checks in SWAP -In a bold code we need to make sure that no subpiece of a diagram can be identified with a lower order diagram already taken into account (which is the same as the requirement of two-particle irreducibility). Fortunately, there exists a simple check: if no 2 momenta are identical then the diagram is bold irreducible. If one uses the VERTEX-DRESS/VERTEX-UNDRESS updates, then reducibility can again only happen during the SWAP update, and one only needs to check the new momentum k between τ 1 and τ 2 against the incoming and outgoing momenta of τ A and τ B (for the notation see Fig. 7). If this simple check cannot be done (as in any system not dealing with polarons), then one keeps track of the momenta in the diagram in the form of a hash

Results
For our standard example α = 5 we see in Fig. 24 that the full Green function agrees with the one obtained from the bare expansion (cf. Fig. 9). We also see that only few iteration steps are needed before convergence is reached. In particular, the inset reveals that after six boldification iterations, it stays well within the error bars of the bare Green function over the whole range of imaginary times considered. We do not have any rigorous means of obtaining error bars on the Green function in the bold scheme, however the norm of the change δG B due to boldification may be considered indicative of convergence. Since we do not reset our observable for the selfenergy after boldification but just keep on sampling with respect to the new bold propagator, we have an easy way to systematically improve our results and diminish the statistical Monte Carlo errors by just carrying out further bold iterations. Otherwise, one would need to employ some sort of heuristic for increasing the number of Monte Carlo sweeps within each iteration step to reflect the higher demand on statistical accuracy as self-consistency is reached. If the desired accuracy is not reached a comprehensive analysis of the sources of all systematic errors is necessary, which combined with the statistical errors from the sampling can be a cumbersome task. The data presented in Fig. 24 took 840 CPU-hours (30 hours on one 28-core Broadwell node @ 2.4 GHz) to gather. Each of the six iteration steps consisted of 7.5 × 10 12 elementary Monte Carlo updates. The time taken by the actual boldification step is negligible in comparison, at only a few minutes each. The bold scheme thus requires significantly more computational effort to achieve results that are comparable to the bare approach. However, one should be mindful that the bold scheme has a somewhat larger configuration space to cover as it samples the self-energy at different momenta while the momentum was kept fixed at zero in Fig. 9.

Open source codes
We provide our C++ implementations of the DiagMC method for the systems discussed in parts 2 through 5 under an open source license (GPL v3). They are available through the Git repository at https://gitlab.lrz.de/Lode.Pollet/LecturesDiagrammaticMonteCarlo .
Our codes make use of the ALPSCore library [29], based on the original ALPS project [30]. ALPSCore employs the HDF5 data format [31], as well as the Boost C++ libraries [32]. Further, we rely on the FFTW3 library [28] for the Fourier transform necessary for the Dyson equation in the self-energy formalism. Finally, the Faddeeva package implementation of the Dawson function [33] is used in the calculation of the first order of the self-energy.
Please refer to the README files in the code repository for technical details on how to build and run the codes. We also provide parameter files which allow you to reproduce the data depicted in Figs. 9, 15, 16, and 24.

Outlook
In these notes we only discussed the concepts of the (irreducible) self-energy and skeleton diagrams for the Green function propagator. In a many-body context, the (irreducible) polarization and the effective interaction can be treated in the same way and give rise to such effects as screening (a well-known example is the electron gas model [5]). Graphically, the interaction is also a two-point line-object: The interaction corresponds to the propagation of a single boson. For polaron and impurity-like problems, the medium is considered an infinite bath and can hence not be renormalized. In practice, bold DiagMC schemes rely on the G 2 W scheme [34].
More generally bold diagrammatic elements can also be introduced at the two particle level. The full system of non-perturbative self-consistent equations are known as the Hedin equations [35]. The central object of the 5 Hedin equations is the 3-point irreducible vertex; Green functions and effective interactions are related via their respective Dyson equations to the self-energy and the polarization, whereas the vertex can be expressed in terms of bold propagators and the irreducible vertex. There is however no closed form for the right hand side in the self-consistent equation for the 3-point vertex (in the language of functional integrals, it is possible to write down the right hand side as a functional derivative, but this remains impractical for an actual numerical computation). Thus far, the self-consistent treatment of the 3-point vertex has not been attempted in diagrammatic Monte Carlo because of the curse of "dimensions": already for the Fröhlich polaron in 3 dimensions with rotational symmetry it is a 5-dimensional object whose storage, interpolation and stochastic evaluation are non-trivial.

Extensions
In this final section we very briefly discuss a number of systems that can rather straightforwardly be studied with the techniques outlined in this manuscript. Our goal is to show the similarities between these systems from the algorithmic point of view rather than a full discussion of the physics of these models, which is beyond the scope of these lecture notes. Wherever possible, we will provide references to reviews.

Acoustic phonons
In contrast to the optical branch relevant for the model discussed in the main part of the text, acoustic phonons have a linear dispersion, ω q = cq with c the speed of sound. The derivation of the large polaron in the Fröhlich model results in a coupling V (q) ∼ q between the impurity and the phonon bath [36]. Consequently, a UV-cutoff is needed as can easily be seen by writing down the first-order self-energy expression. The polaron properties were computed using diagrammatic Monte Carlo in Ref. [37]. Just as for the optical phonons, Feynman's variational Ansatz [26] is in excellent agreement with the full results, but predicts a transition between a quasi-free and a self-trapped polaron, which may be continuous or discontinuous depending on the value of the cutoff [36]. This transition was also observed in a path-integral Monte Carlo study [38], seen as a jump in the potential energy. It cannot be ruled out that this jump is cutoff dependent, and this jump is not a proof of any (strong or weak) localization. The authors of Ref. [37] did not elaborate but noted that the structure of the diagrams which contribute significantly differs in the quasi-free and self-trapped regimes.

Bose polaron
The Bose polaron describes an impurity immersed in a weakly interacting Bose-Einstein condensate (BEC). The system is usually modelled with δ-pseudopotentials in continuum space, after which the interactions of the BEC are linearized using the Bogoliubov prescription, resulting in a Fröhlich type of Hamiltonian, with phonon dispersion ω(k) = kc 1 + (kξ) 2 /2, where c is the speed of sound and ξ the healing length of the BEC. The coupling of the impurity to the BEC phonons is given by V (q) ∼ (ξq) 2 (ξq) 2 +1 1/4 . This derivation breaks down however when the impurity is able to sufficiently deplete the condensate: As soon as the impurity gets dressed by two (or more) phonons, one must also take the full density-density repulsion of the bosons into account for stability reasons, but this lies outside the Fröhlich BEC polaron model. One therefore expects substantial differences between experiments and the predictions of this model in the strongly interacting regime. An excellent review of the physics of the Bose polaron can be found in Ref. [39]. The Fröhlich-type BEC polaron model is, just like the previously discussed acoustic phonons, UV divergent and a renormalization scheme is essential. In the diagrammatic Monte Carlo study of Ref. [37] it was found that the momentum cutoff had to be orders of magnitude larger than any physical parameter before the (renormalized) ground state energies could be reliably extrapolated in the inverse of the cutoff. Fluctuations over many orders of magnitude make the simulations inefficient. It would hence be interesting to revisit this problem with a different renormalization scheme, and/or utilizing a partial resummation of diagrams to cure the sensitiveness to the cutoff.

Fermi polaron
When an impurity is immersed in a dilute, non-interacting Fermi sea, the ground state can either be a polaron or a molecule when the impurity forms a bound state with just one fermion.
Like for the Bose polaron, the interactions between impurity and bath originate from a typical cold atom setup with all their benefits: The scattering lengths can be tuned, even made infinitely strong, but the interactions remain of zero range. The Fermi polaron was one of the first hallmarks of modern diagrammatic Monte Carlo simulations [40,41], firmly establishing the polaron-to-molecule transition. Note that the presence of fermionic propagators leads to a sign-problem, which makes the simulations much harder than for bosonic problems and limits the reachable expansion orders typically to 8-12, depending on the dimensionality, interaction strength, species mass, etc [42][43][44][45]. A particularly elegant way to deal with the UV divergence and resonant interactions simultaneously is by introducing the T-matrix [40,41]. There exist excellent reviews on the topic of the Fermi polaron, such as Refs. [46,47].

Multi-polaron systems
A finite density of electrons coupled to optical phonons within the Holstein model (i.e., the electron density couples locally to the displacement operator via a coupling of the form g i a † i a i (b † i + b i ) where the amplitude of the coupling is g and a i the electron/impurity annihilation operator on site i and b i the phonon one) was considered in Ref. [48]. In this case, the presence of a Fermi surface for the electrons leads again to a sign problem (since the particle and the hole propagators have opposite sign). In Ref. [48] the bold series was sampled up to orders 4-6, high enough to observe convergence. It was found that the effective mass increases and the residue decreases with increasing electron density at fixed coupling strength for typical metals. The authors found that approximating the self-energy with a purely local one is accurate to 2%.

Spin-boson models
The spin-boson Hamiltonian is the prototypical model for a quantum-mechanical system embedded in a dissipative bath [49], describing the coupling of a two-level system to an infinite bath. It is defined as where σ x and σ z are the Pauli spin-1/2 operators, b i and b † i are boson creation and annihilation operators, ω i are harmonic oscillator frequencies, and ∆ is the tunneling matrix element. The coupling between the spin and the bosonic bath is determined via the λ i by the spectral function J(ω) = π i λ 2 i δ(ω−ω i ) = 2παω 1−s c ω s for ω < ω c and zero otherwise. The parameter α describes the coupling strength to the dissipative bath. The parameter s distinguishes between a sub-ohmic bath (s < 1) and an ohmic bath (s = 1). At zero temperature and for s ≤ 1 the system undergoes a phase transition at finite coupling strength α c between a delocalized and a localized state, in which the system is no longer able to tunnel and behaves essentially classically. There was controversy about the nature of the phase transition in the sub-ohmic case. On general grounds one expects the transition to fall in the universality class of the classical Ising model with long-range interactions with mean-field critical exponents for s < 1/2. The first numerical group renormalization studies observed however different exponents obeying hyperscaling for s < 1/2 and argued the breakdown of the quantum-to-classical mapping.
The continuous time Monte Carlo simulations of Ref. [50] are free of systematic errors and could establish the exactness of the quantum-to-critical mapping by observing the expected mean-field exponents. The discrepancies had thus to be found in the truncation of the bosonic Hilbert space in the numerical renormalization group approach. The Monte Carlo sampling of this system resembles Sec. 2 but needs to be augmented with a cluster update for the retarded spin-spin interactions, see Ref. [50], resulting from integrating out the bath modes.
For a comprehensive review of the physics of spin-boson models, see Ref. [51].

Anderson localization
When free fermions can hop on a lattice subject to disorder in the chemical potential, they will always localize in 1D and 2D and for strong enough disorder in 3D. For quenched disorder drawn from a Gaussian distribution, the diagrammatic technique is simplest to derive. The diagrammatic structure is in fact very similar to Sec. 8.4: The electron propagator is dressed with arcs, but those arcs have no time-dependence (in contrast to the exponential decay for the polarons, see Eq. 13). The Green function at zero temperature on a 3D lattice was computed in real time in Ref. [52]. While unable to locate the transition (which requires the computation of the conductivity and analyzing it for low frequencies and momenta), it showed the very strong local character of the self-energy (cf. Sec. 8.4).

Impurity models
Models such as Anderson's impurity model occur as auxiliary problems in dynamical meanfield theory, when one seeks to sum over all skeleton diagrams for the self-energy built with purely local Green functions. The important point is that this sum is not accomplished directly but through the impurity problem, for which a variety of Monte Carlo solvers have been developed in continuous time, see Ref.

Real-time phenomena
The spectral function [13,24] and the optical conductivity [55] have been determined from the corresponding imaginary time correlation functions for the Fröhlich polaron using analytic continuation methods. The optical conductivity of the Holstein model was studied in Ref. [56,57], as well as its mobility [58]. To date, no polaron studies have been published directly for real time following the approach of Ref. [52] for the Anderson model. By contrast, impurity models have also been studied to address out-of-equilibrium phenomena, see Refs. [59][60][61][62][63][64][65][66][67][68]. One is typically interested in the transport of quantum dot like systems coupled to external leads, and attempts to monitor the time evolution for a long enough period of time such that a steady state sets in.

Conclusion
The purpose of these notes is to provide a pedagogical overview of the technical aspects of diagrammatic Monte Carlo simulations, lowering the barrier for newcomers, and giving a flavor of its power to experienced researchers acquainted with other numerical techniques. With the techniques outlined here interesting physics has been discovered and established unambiguously in the past. With only minor changes open, challenging problems can still be attacked, and we gave a number of examples in the previous section. To study the complexity of strongly interacting problems a few more steps are needed, such as resummation techniques, more updates, and sign alternations. The series will in general not be convergent, which we consider to be the greatest challenge for diagrammatic Monte Carlo simulations, and the diagrammatic structure is more complicated than the diagrams considered here, which all have a backbone line for the impurity propagator. Just as for the Fröhlich polaron, it is imperative to treat as much as possible of the physics in an analytical way. Having gone through this tutorial the reader can understand better the technical aspects of the method, appreciate the efforts described in the literature, or start coding and exploring on their own.