MatsubaraFunctions.jl: An equilibrium Green's function library in the Julia programming language

The Matsubara Green's function formalism stands as a powerful technique for computing the thermodynamic characteristics of interacting quantum many-particle systems at finite temperatures. In this manuscript, our focus centers on introducing MatsubaraFunctions.jl, a Julia library that implements data structures for generalized n-point Green's functions on Matsubara frequency grids. The package's architecture prioritizes user-friendliness without compromising the development of efficient solvers for quantum field theories in equilibrium. Following a comprehensive introduction of the fundamental types, we delve into a thorough examination of key facets of the interface. This encompasses avenues for accessing Green's functions, techniques for extrapolation and interpolation, as well as the incorporation of symmetries and a variety of parallelization strategies. Examples of increasing complexity serve to demonstrate the practical utility of the library, supplemented by discussions on strategies for sidestepping impediments to optimal performance.


Motivation
In condensed matter physics, strongly correlated electrons emerge as paradigmatic examples of quantum many-body systems that defy a description in terms of simple band theory, due to their strong interactions with each other and with the atomic lattice.Their study has led to a cascade of discoveries, ranging from high-temperature superconductivity in copper oxides (cuprates) [1,2] to the Mott metal-insulator transition in various condensed matter systems such as, e.g., transition metal oxides or transition metal chalcogenides [3][4][5] and the emergence of quantum spin liquids in frustrated magnets [6,7], to name but a few.
The study of correlated electron systems is equally exciting and challenging, not only because the construction of accurate theoretical models requires the consideration of many different degrees of freedom, such as spin, charge, and orbital degrees of freedom, as well as disorder and frustration, but also because of the scarcity of exactly solvable reference Hamiltonians.The single-band Hubbard model in more than one dimension, for example, has remained at the forefront of computational condensed matter physics for decades, although it in many respects can be regarded as the simplest incarnation of a realistic correlated electron system [8,9].It is therefore not surprising that a plethora of different numerical methods have been developed to deal with these models [10].
However, no single algorithm is capable of accurately describing all aspects of these complex systems: each algorithm has its strengths and weaknesses, and the choice of algorithm usually depends on the specific problem under investigation.For example, some algorithms, such as exact diagonalization (ED) [11][12][13] or the density matrix renormalization group (DMRG) [14,15] are better suited for studying ground-state properties, while others (quantum Monte Carlo (QMC) simulations [16][17][18][19], functional renormalization group (fRG) calculations [20][21][22], ...) perform better when one is interested in dynamic properties such as transport or response functions.
Another popular method, dynamical mean-field theory (DMFT) has been immensely successful; in particular it correctly predicts the Mott transition in the Hubbard model [23].By approximating the electron self-energy to be local, it however disregards non-local correlation effects, leading to a violation of the Mermin-Wagner theorem [24,25] as well as a failure to predict the pseudo-gap in the Hubbard model [10].Non-local (e.g.cluster [26][27][28][29] or diagrammatic [30]) extensions of DMFT improve on that front, but are computationally much more expensive.Ultimately, the choice of algorithm is guided by the computational resources available and the trade-off between accuracy and efficiency, as well as by physical insights into which approximations may be justified more than others.
A common motif of many of these algorithms is that they rely on the computation of nparticle Green's functions, where usually n = 1, 2. Roughly speaking, these functions describe correlations within the physical system of interest, such as its response to an external perturbation.In thermal equilibrium, Green's functions are usually defined as imaginary-time-ordered correlation functions, which allows the use of techniques and concepts from statistical mechanics, such as the partition function and free energy.In Fourier space, the corresponding frequencies take on discrete and complex values.This Matsubara formalism is widely used to study strongly correlated electron systems, where it provides a powerful tool for calculating thermodynamic quantities, such as the specific heat and magnetic susceptibility, as well as dynamical properties, such as the electron self-energy and optical conductivity [31,32].
In this manuscript, we present MatsubaraFunctions.jl,a software package written in Julia [33] that implements containers for Green's functions in thermal equilibrium.More specifically, it provides a convenient interface for quickly prototyping algorithms involving multivariable Green's functions of the form G i 1 ...i n (ω 1 , ..., ω m ), with lattice/orbital indices i k (k = 1, ..., n) and Matsubara frequencies ω l (l = 1, ..., m).In an attempt to mitigate monilithic code design and superfluous code reproduction, our goal is to promote a common interface between algorithms where these types of functions make up the basic building blocks.We implement this interface in Julia, since some more recently developed methods, such as the pseudofermion [34][35][36][37][38][39][40][41] and pseudo-Majorana fRG [42][43][44][45], seem to have been implemented in Julia as the preferred programming language.In the spirit of similar software efforts, such as the TRIQS library for C++ [46], this package therefore aims to provide a common foundation for these and related codes in Julia that is fast enough to facilitate large-scale computations on high-performance computing architectures [47], while remaining flexible and easy to use.

Equilibrium Green's functions
In this section, we give a brief introduction to equilibrium Green's functions and their properties.In its most general form, an imaginary time, n-particle Green's function G (n) is defined as the correlator [48] where T is the imaginary-time-ordering operator and 〈 Ô〉 = 1 Z Tr(e −β Ĥ Ô) denotes the thermal expectation value of an operator Ô with respect to the Hamiltonian Ĥ at temperature T = 1/β.Note that natural units are used throughout, in particular we set k B ≡ 1.Here, c ( †)  are fermionic or bosonic creation and annihilation operators and Z = Tr(e −β Ĥ) is the partition function.The indices i k represent additional degrees of freedom such as lattice site, spin and orbital index.In order for the right-hand side in Eq. ( 1) to be well defined, it is necessary to restrict the τ arguments to an interval of length β, as can be seen, for example, from a spectral (Lehmann) representation of the expectation value [48].Furthermore, the cyclicity of the trace implies that the field variables are anti-periodic in β for fermions, or periodic in β for bosons, respectively.This allows us to define their Fourier series expansion where 2k , with k ∈ are the fermionic or bosonic Matsubara frequencies. 1hese definitions carry over to the n-particle Green's function G (n) , giving 3 Code structure jl is an open-source project distributed via Github [49] and licensed under the MIT license.Using Julia's built-in package manager, the code can be easily installed using 1 $ julia 2 julia> ] 3 pkg> add https://github.com/dominikkiese/MatsubaraFunctions.jl from the terminal.Here, ] activates the package manager from the Julia REPL, where add downloads the code and its dependencies.The following is an overview of the functionality of the package, starting with a discussion of its basic types and how to use them.A full documentation of the package is available from the github repository.

Basic types
The package evolves around three concrete Julia types: MatsubaraFrequency, MatsubaraGrid and MatsubaraFunction.A Matsubara frequency can be either fermionic or bosonic, that is, For a given temperature T = 1/β and Matsubara index k they can be constructed using Basic arithmetic operations on these objects include addition, subtraction and sign reversal, each of which creates a new MatsubaraFrequency instance.Note that the bosonic Matsubara frequency at zero is included in the positive frequency count.Grid instances are iterable and can be evaluated using either a MatsubaraFrequency or Float64 as input.
Here, we first select a random linear index idx and then evaluate g1 using either the corresponding Matsubara frequency g1[idx] or its value.In the former case, g1(g1[idx]) returns the corresponding linear index of the frequency in the grid, whereas g1(value(g1[idx])) finds the linear index of the closest mesh point. 2 The package supports storage of grid instances in H5 file format.
1 using HDF5 2 file = h5open("save_g1.h5","w") 3 save_matsubara_grid!(file, "g1", g1) 4 g1p = load_matsubara_grid(file, "g1") Finally, a MatsubaraFunction is a collection of Matsubara grids with an associated tensor structure G i 1 ...i n for each point (ν 1 , ..., ν m ) in the Cartesian product of the grids.The indices i k could, for example, represent lattice sites or orbitals.To construct a MatsubaraFunction users need to provide a tuple of MatsubaraGrid objects, as well as the dimension of each i k .In addition, a floating point type can be passed to the constructor, which fixes the data type for the underlying multidimensional array. 3Similar to the grids, MatsubaraFunctions can be conveniently stored in H5 format.

Accessing and assigning Green's function data
The library provides two possible ways to access the data of a MatsubaraFunction, using either the bracket ([]) or parenthesis (()) operator.While the notion of the former is that of a Base.getindex, the latter evaluates the MatsubaraFunction for the given arguments in such a way that its behavior is well-defined even for out-of-bounds access.The bracket can be used with a set of MatsubaraFrequency instances and tensor indices i k , as well as with Cartesian indices for the underlying data array.It returns the value of the data exactly for the given input arguments, throwing an exception if they are not in bounds.In addition, the bracket can be used to assign values to a MatsubaraFunction as shown in the following example.
1 y = 0.5 2 T = 1.0 3 N = 128 4 g = MatsubaraGrid(T, N, Fermion) 5 f = MatsubaraFunction(g, 1) When f is evaluated using Matsubara frequencies within its grid, it returns the same result as if a bracket was used.However, if the frequencies are replaced by Float64 values, a multilinear interpolation within the Cartesian product of the grids is performed.If the frequency / float arguments are out of bounds, MatsubaraFunctions falls back to extrapolation.The extrapolation algorithm distinguishes between one-dimensional and multidimensional frequency grids.In the 1D case, an algebraic decay is fitted to the high-frequency tail of the MatsubaraFunction, which is then evaluated for the given arguments.The functional form of the asymptote is currently restricted to f (ν) = α 0 + 4which is motivated by the linear or quadratic decay that physical Green's functions typically exhibit.For multidimensional grids, a constant extrapolation is performed from the boundary.Different modes of evaluation are illustrated in an example below.

Extrapolation of Matsubara sums
A common task when working with equilibrium Green's functions is the calculation of Matsubara sums 1 β ν f (ν), where we have omitted additional indices of f for brevity.However, typical Green's functions decay rather slowly (algebraically) for large frequencies, which presents a technical difficulty for the accurate numerical calculation of their Matsubara sums: they may require some regulator function to control the convergence 5 (difficult to implement) and a large number of frequencies to sum over (expensive).In contrast, there exist analytical results for simple functional forms of f even in cases where a straightforward numerical summation fails.MatsubaraFunctions provides the sum_me function, which can be used to calculate sums over complex-valued f (ν), if f (z) (with z ∈ ) decays to zero for large |z| and is representable by a Laurent series in an elongated annulus about the imaginary axis (see App.A for details).An example for its use is shown below.Note that this feature is experimental and its API as well as the underlying algorithm might change in future versions.

Padé approximants
Although the Matsubara formalism provides a powerful tool for the calculation of thermodynamic quantities, it lacks the ability to directly determine, for example, dynamic response functions or transport properties associated with real-frequency Green's functions, which facilitate comparison with experiments.There have been recent advances in the use of real-frequency quantum field theory [50][51][52][53], yet the calculation of dynamic real-frequency Green's functions remains a technically challenging endeavor.In many applications, therefore, one resorts to calculations on the imaginary axis and then performs an analytic continuation in the complex upper half-plane to determine observables on the real-frequency axis.The analytic continuation problem is ill-conditioned, because there may be significantly different real-frequency functions describing the same set of complex-frequency data within finite precision.Nevertheless, there has been remarkable progress in the development of numerical techniques such as the maximum entropy method [54][55][56] or stochastic analytical continuation [57,58].These methods are particularly useful when dealing with stochastic noise induced by Monte Carlo random sampling.A corresponding implementation in Julia is, for example, provided by the ACFlow toolkit [59].On the other hand, if the input data are known with a high degree of accuracy (as in the fRG and related approaches), analytic continuation using Padé approximants is a valid alternative.Here, one first fits a rational function to the complex frequency data which is then used as a proxy for the Green's function in the upper half-plane.If the function of interest has simple poles this procedure can already provide fairly accurate results, see e.g.Ref. [60].In MatsubaraFunctions we implement the fast algorithm described in the appendix of Ref. [61], which computes an N -point Padé approximant for a given set of data points {(x i , y i )|i = 1, ..., N }.A simple example of its use is shown below.Note that it might be necessary to use higher precision floating-point arithmetic to cope with rounding errors in the continued fraction representation used for calculating the Padé approximant.

Automated symmetry reduction
In many cases, the numerical effort of computing functions in the Matsubara domain can be drastically reduced by the use of symmetries.For one-particle fermionic Green's functions , relating positive and negative Matsubara frequencies.Our package provides an automated way to compute the set of irreducible MatsubaraFunction components, 6 given a list of one or more symmetries as is illustrated in the following example 6 These are all elements of the underlying data array which cannot be mapped to each other by symmetries.
Here, one first constructs an instance of type MatsubaraSymmetry by passing a function that maps the input arguments of f to new arguments extended by a MatsubaraOperation.The latter specifies whether the evaluation of f on the mapped arguments should be provided with an additional sign or complex conjugation.Next, the irreducible array elements are computed and an object of type MatsubaraSymmetryGroup 7 is constructed from a vector of symmetries provided by the user.Here, the length of the vector is one (we only considered complex conjugation), but the generalization to multiple symmetries is straightforward (see Ref. [62] for more examples).A MatsubaraSymmetryGroup can be called with a MatsubaraFunction and an initialization function. 8This call will evaluate the MatsubaraInitFunction for all irreducible elements of the symmetry group of f, writing the result into the data array of h.Finally, all symmetry equivalent elements are determined without additional calls to the (costly) initialization function.Symmetry groups can be stored in H5 format as shown below.

Running in parallel
To simplify code parallelization when using MatsubaraFunctions.jl, the package has some preliminary MPI support based on the MPI.jl wrapper and illustrated in an example below.
1 using MatsubaraFunctions 2 using MPI 7 A MatsubaraSymmetryGroup contains all groups of symmetry equivalent elements and the operations needed to map them to each other. 8A MatsubaraInitFunction takes a tuple of Matsubara frequencies and tensor indices and returns a floating point type.
Calls of MatsubaraSymmetryGroup with an initialization function have an opt-in switch (mode) to enable parallel evaluation of the MatsubaraInitFunction (by default mode = :serial).If mode = :polyester, shared memory multithreading as provided by the Polyester Julia package [63] is used. 9This mode is recommended for initialization functions that are cheap to evaluate and are unlikely to benefit from Threads.@threads due to the overhead from invoking the Julia scheduler.For more expensive functions, users can choose between mode = :threads, which simply uses Threads.@threads, and mode = :hybrid.The latter combines both MPI and native Julia threads and can therefore be used to run calculations on multiple compute nodes.

Performance note
By default, types in MatsubaraFunctions.jlperform intrinsic consistency checks when they are invoked.For example, when computing the linear index of a MatsubaraFrequency in a MatsubaraGrid, we make sure that the particle types and temperatures match between the two.While this ensures a robust modus operandi, it unfortunately impacts performance, especially for larger projects.To deal with this issue, we have implemented a simple switch, MatsubaraFunctions.sanity_checks(),which, when turned off10 disables @assert expressions.It is not recommended to touch this switch until an application has been thoroughly tested, as it leads to wrong results on improper use.For the MBE solver discussed in Sec.4.3.2,we found runtime improvements of up to 10% when the consistency checks were disabled.

Hartree-Fock calculation in the atomic limit
As a first example of the use of MatsubaraFunctions.jlwe consider the calculation of the one-particle Green's function G using the Hartree-Fock (HF) approximation in the atomic limit of the Hubbard model, i.e., we consider the Hamiltonian where U denotes the Hubbard interaction and nσ are the density operators for spin up and down.In the following, we fix the chemical potential to µ = 0, i.e., we consider the system in the strongly hole-doped regime.The Hartree-Fock theory [64-66] is a widespread method in condensed matter physics used to describe, e.g., electronic structures and properties of materials [67,68].It is a meanfield approximation as it treats the electrons in a solid as independent particles being subject to an effective background field due to all the other particles.
In an interacting many-body system, the bare Green's function G 0 has to be dressed by self-energy insertions, here denoted by Σ, in order to obtain G, which is summarized in the Dyson equation where G 0 , G and Σ in general are matrix-valued.In HF theory one only considers the lowest order term contributing to the self-energy, which is linear in the interaction potential.For the spin-rotation invariant single-site system at hand, Σ = Σ σ = Σ and the HF approximation for the self-energy reads where n is the density per spin.The Dyson equation then takes the simple form Below, we demonstrate how to set up and solve Eqs. ( 8) & ( 9) self-consistently for the density n using Anderson acceleration [69,70] as provided by the NLsolve Julia package [71] in conjunction with MatsubaraFunctions.jl.
1 using MatsubaraFunctions 2 using NLsolve Here, we first build the MatsubaraFunction container for G and initialize it to G 0 (ν) = 1 iν .This container is then passed to the fixed-point equation using an anonymous function, which mutates G on each call to incorporate the latest estimate of n.11 Fig. 1 shows exemplary results for the full Green's function and HF density.As can be seen from Fig. 1(b) the latter deviates from its bare value n 0 = 1 2 when the temperature is decreased and approaches n = 0 for T → 0, as expected.

GW calculation in the atomic limit
In this section, we extend our Hartree-Fock code to include bubble corrections 12 in the calculation of the self-energy.The resulting equations, commonly known as the GW approximation, allow us to exemplify the use of more advanced library features, such as extrapolation of the single-particle Green's function and the implementation of symmetries.Therefore, they present a good starting point for the more involved application discussed in Sec.4.3.1.
The GW approximation is a widely used method in condensed matter physics and quantum chemistry for calculating electronic properties of materials [72][73][74].In addition to the Hartree term Σ H = U n, which considers only the bare interaction, the mutual screening of the Coulomb interaction between electrons is partially taken into account.For spin-rotation invariant systems it is common practice to decouple these screened interactions η13 into a density (or charge) component η D and a magnetic (or spin for the atomic limit Hamiltonian.The η are computed by summing a series of bubble diagrams in the particle-hole channel, i.e., where the polarization bubble P is given by A diagrammatic representation of these relations is shown in Fig. 2. Finally, the set of equations is closed by computing G from the Dyson equation.Since the Green's function transforms as G * (ν) = G(−ν) under complex conjugation [48], we also have that and likewise Σ * (ν) = Σ(−ν).Thus, the numerical effort for evaluating Eqs. ( 10) and ( 12) can be reduced by a factor of two using a MatsubaraSymmetryGroup object.To structure the GW code, we first write a solver class which takes care of the proper initialization of the necessary MatsubaraFunction instances.As a second step, we implement the self-consistent equations, which we solve using Anderson acceleration.Note that we have rewritten the GW equation for the self-energy as which is beneficial since the product of G with the constant contributions to η D/M simply shifts the real part of the self-energy by Here, we make use of the flatten! and unflatten!functions which allow us to parse MatsubaraFunction data into a one dimensional array. 14The fixed-point can now easily be computed with, for example, 1 const T = 0.3 # temperature 2 const U = 0.9 # interaction 3 const N = 1000 # no.frequencies In Fig. 3 we show exemplary results for the self-energy and density obtained in GW .In contrast to the Hartree-Fock calculations in the previous section, Σ is now a frequency-dependent quantity, whose real part asymptotically approaches U n GW .As can be seen from Fig. 3(b), these GW densities agree quantitatively with the HF result for weak interactions U/T ≲ 1 2 , but yield larger densities for higher values of U as expected when the local interaction is partially screened.

Multiboson exchange solver for the single impurity Anderson model
Note: Readers who are not interested in the formal discussion presented below should feel free to skip this section and proceed directly to Section 5 on future directions.
In the following, we extend upon the previous computations for the Hubbard atom by coupling the single electronic level to a bath of non-interacting electrons.Specifically, we consider the single-impurity Anderson model, a minimal model for localized magnetic impurities in metals introduced by P.W. Anderson to explain the physics behind the Kondo effect [75].It is defined by the Hamiltonian where n σ (τ) = dσ (τ)d σ (τ).Integrating over the only quadratically occurring Grassmann variables for the bath electrons, one formally obtains Z = σ D dσ D (d σ ) e −S red with the reduced action given by Switching to Matsubara frequencies as described in section 2, the non-interacting Green's function for the localized d electrons reads Following [77] we choose an isotropic hybridization strength V k ≡ V and a flat density of states with bandwidth 2D for the bath electrons, leading to the hybridization function15 In the following, we set V = 2, measure energy in units of V /2 = 1 and set the half bandwidth to D = 10.In the context of this work, we focus on the particle-hole symmetric model, setting ε d = −U/2.Then, the Hartree term of the self-energy, Σ H = U/2 is conveniently absorbed into the bare propagator, Consequently, the Hartree propagator is used instead of the bare propagator throughout.

Single boson exchange decomposition of the parquet equations
Following [78], we now reiterate the single-boson exchange (SBE) decomposition of the fourpoint vertex and, subsequently, of the parquet equations.The starting point for the SBE decomposition, which was originally developed in [79][80][81][82][83][84], is the unambiguous classification of vertex diagrams according to their U-reducibility in each channel.In order to introduce this concept in the context of the parquet equations, we first have to discuss the similar concept of two-particle reducibility, which provides the basis for the parquet decomposition of the vertex, This decomposition states that all diagrams which contribute to the two-particle vertex Γ can be classified as being part of one of four disjoint contributions: γ r with r ∈ {a, p, t} collects those diagrams which are two-particle reducible (2PR) in channel r, i.e., they can be disconnected by cutting a pair of propagator lines, which can either be aligned in an antiparallel (a), parallel (p) or transverse antiparallel (t) way.All remaining diagrams, which are not 2PR in either of the three channels, contribute to Λ 2PI , the fully two-particle irreducible (2PI) vertex.One can equally well decompose Γ w.r.t.its two-particle reducibility in one of the three channels,  antiparallel (a), parallel (p) or transverse antiparallel (t) way.All remaining diagrams, which are not 2PR in either of the three channels, contribute to ⇤ 2PI , the fully two-particle irreducible (2PI) vertex.One can equally well decompose w.r.t.its two-particle reducibility in one of the three channels, = I r + r , which defines I r = ⇤ 2PI + P r 0 6 =r r 0 , collecting all diagrams that are 2PI in channel r.The Bethe-Salpeter equations (BSEs) then relate the reducible diagrams to the irreducible ones, This short-hand notation introduces the ⇧ r bubble, i.e., the propagator pair connecting the two vertices, see [77] for their precise channel-dependent definition, as well as for the connector symbol , which channel-dependently denotes summation over internal frequencies and quantum numbers.The self-energy ⌃, which enters the propagator via the Dyson equation where U is the bare interaction and the symbol • denotes the contraction of two vertex legs with a propagator.Together, equations ( 20), ( 21) and ( 22) are known as the parquet equations [84,85] and can be solved self-consistently, if the 2PI vertex ⇤ 2PI is provided [86][87][88][89].Unfortunately, ⇤ 2PI is the most complicated object, as its contributions contain nested contractions over internal arguments.Often, the parquet approximation (PA) is therefore employed, which truncates ⇤ 2PI beyond the bare interaction U .In the context of the SBE decomposition relevant to this work, U -reducibility is an alternative criterion to the concept of two-particle reducibility for the classification of vertex diagrams.A diagram that is 2PR in channel r is also said to be U -reducible in channel r if it can be disconnected by removing one bare vertex that is attached to a ⇧ r bubble, as illustrated in Fig. 4. Furthermore, the bare vertex U is defined to be U -reducible in all three channels.The U -reducible diagrams in channel r are in the following denoted r r and are said to describe single-boson exchange processes, as the linking bare interaction U , which would disconnect the diagram if cut, mediates just a single bosonic transfer frequency.The diagrams which are 2PR in channel r but not U -reducible in channel r are called multi-boson exchange diagrams and denoted M r .With these classifications, the two-particle reducible vertices can be written as r = r r U + M r , making sure to exclude U , which is contained in r r but not in r .The parquet decomposition (20) yields in this language, = ' U irr + P r r r 2U, This short-hand notation introduces the Π r bubble, i.e., the propagator pair connecting the two vertices, see [78] for their precise channel-dependent definition, as well as for the connector symbol •, which channel-dependently denotes summation over internal frequencies and quantum numbers.The self-energy Σ, which enters the propagator via the Dyson equation where U is the bare interaction and the symbol • denotes the contraction of two vertex legs with a propagator.Together, equations ( 20), ( 21) and ( 22) are known as the parquet equations [85,86] and can be solved self-consistently, if the 2PI vertex Λ 2PI is provided [87][88][89][90].
Unfortunately, Λ 2PI is the most complicated object, as its contributions contain nested contractions over internal arguments.Often, the parquet approximation (PA) is therefore employed, which truncates Λ 2PI beyond the bare interaction U.In the context of the SBE decomposition relevant to this work, U-reducibility is an alternative criterion to the concept of two-particle reducibility for the classification of vertex diagrams.A diagram that is 2PR in channel r is also said to be U-reducible in channel r if it can be disconnected by removing one bare vertex that is attached to a Π r bubble, as illustrated in Fig. 4. Furthermore, the bare vertex U is defined to be U-reducible in all three channels.The U-reducible diagrams in channel r are in the following denoted ∇ r and are said to describe single-boson exchange processes, as the linking bare interaction U, which would disconnect the diagram if cut, mediates just a single bosonic transfer frequency.The diagrams which are 2PR in channel r but not U-reducible in channel r are called multi-boson exchange diagrams and denoted M r .With these classifications, the two-particle reducible vertices can be written as γ r = ∇ r − U + M r , making sure to exclude U, which is contained in ∇ r but not in γ r .The parquet decomposition (20) yields in this language, where ϕ Uirr is the fully U-irreducible part of Γ .For a diagrammatic illustration of the first equation, see Fig. 8 in [78].The channel-dependent decomposition of the vertex Γ = I r + γ r = ∇ r + T r can also be split into U-reducible and U-irreducible parts in channel r, defining the U-irreducible remainder T r = I r − U + M r in channel r.Inserting all these definitions into the BSEs (21) and separating U-reducible and U-irreducible contributions gives the two sets of equations, for each channel r.From equation ( 24) one can derive (see [78] for details) that the singleboson exchange terms can be written as ∇ r = λr • η r • λ r , where λr , λ r denote the Hedin vertices [72] and η r the screened interaction in channel r.The former are related to the U-irreducible vertex in channel r via λr = 1 r + T r • Π r • 1 r = 1 r + 1 r • Π r • T r and can be understood as U-irreducible, amputated parts of three-point functions, as they depend on only two frequencies.In contrast to the GW approximation discussed in Sec.4.2, the screened interaction η r is now defined in terms of a Dyson equation, • λr dressed by vertex corrections.In the previous expressions, the connector • denotes an internal summation similar to •, the only difference being that summation over frequencies is excluded.The corresponding unit vertex is denoted 1 r .Lastly, one can rewrite the SDE in terms of the screened interaction and the Hedin vertex in channel r which yields, for example, In summary, the SBE-equations to be solved read As before, they require only the fully two-particle irreducible vertex Λ 2PI as an input.Notably, if one employs the so-called SBE approximation [80], which amounts to setting Λ 2PI = U as in the parquet approximation and neglecting multi-boson exchange contributions M r = 0, all objects involved depend on at most two frequencies.This scheme is therefore numerically favorable compared to the PA if the SBE approximation can be justified [91] .In the context of this paper, we do not employ the SBE approximation, but include multi-boson exchange (MBE) contributions.

Implementation in MatsubaraFunctions.jl
In this section, we present the implementation of the PA in its MBE formulation using MatsubaraFunctions.jl.In doing so, we build upon the code structure developed in Sec.4.2, i.e. we first define a solver class for which we later implement the self-consistent equations, as well as an interface to solve for the fixed point using Anderson acceleration, see Fig. 5.In order to keep the discussion concise, we refrain from showing all of the code and, instead, focus on computational bottlenecks and point out tricks to circumvent them.For completeness, however, we also make the entire code available via an open-source repository on Github, see Ref. [62] and provide additional implementation details in App.B.  Extending the GWsolver from Sec. 4.2 to the MBEsolver needed here is a straightforward endeavor, since we just have to add containers and symmetry groups for the Hedin and multiboson vertices.Furthermore, we extend the solver to include buffers which store the result of evaluating Eqs.(26c), (26d) and (26h), such that repetitive allocations of the multidimensional data arrays for λ r and M r are avoided.Note that, due to the symmetries of the SIAM studied here, it suffices to include either λ r or λr , since λ r = λr .In addition, all containers can be implemented as real-valued. 16 function calc_T(   Profiling the MBE code reveals that most of the time is spent calculating the irreducible vertices T r , which are needed to compute both λ r and M r .In the former case, two legs of T r are closed with a propagator bubble, while in the latter case, T r enters both to the left and to the right of the respective (Bethe-Salpeter-like) equation.When optimizing the code, it is therefore crucial to find an efficient way to evaluate Eq. (26e).In the example above, an exemplary implementation of T r in the density channel is shown.Here, we make use of the grid_index_extrp function, which given a Matsubara frequency and a grid g finds the linear index of the frequency in g or, if it is out of bounds, determines the bound of g that is closest.This function is normally used internally to perform constant extrapolation for MatsubaraFunction objects with grid dimension greater than one. 17Here, however, it can be used to precompute multiple linear indices at once, allowing us to exclusively use the [] operator and thus avoid unnecessary boundary checks.Note that we could have used tailfits for the screened interactions η r but opt to utilize constant extrapolation instead. 18 Furthermore, when T r is inserted into the equations for the Hedin and multiboson vertices, it is summed up along one fermionic axis.Therefore, some frequencies, e.g.w1 = w + v + vp in line 23 of the code snippet above, will assume the same value for many different external arguments.Hence, to circumvent repeated (but superfluous) grid_index_extrp calls, it is beneficial to precompute T r on a finite grid, which needs to be large enough to maintain the desired accuracy.To this end, we add buffers for the irreducible vertices to our solver class, such that we can compute e.g. the density T D and magnetic contributions T M inplace and in parallel, as shown in the example below. 17Therefore it is not exported into the global namespace. 18Since η r depends only on one frequency argument, it can be stored on a rather large grid, such that its asymptotic behavior is well-captured even without polynomial extrapolation.Here, we also make use of the fact that many frequency arguments (and their respective linear indices) are shared between different channels, which speeds up the calculation of T even further.The implementation of, say, Eq. ( 26h) is now rather straightforward.M D , for example, can be computed as shown below.Here, we utilize the corresponding MatsubaraSymmetryGroup object with the hybrid MPI + Threads parallelization scheme.In addition, we make use of views for the bubble and vertices to avoid repeated memory lookups in the Matsubara summation.

Benchmark results
In this section, we benchmark the presented implementation of the MBE parquet solver against an independent implementation in C++.Our motivation for this comparison is twofold: Firstly, we want to verify the overall correctness of both implementations and, secondly, we want to test how robust the multiboson formalism is to implementation details.This regards, for example, the treatment of correlation functions at the boundaries of their respective frequency grids.While the Julia code relies on (polynomial or constant) extrapolation, the C++ code replaces correlators with their asymptotic value instead.Ideally, these details should be irrelevant, except in the most difficult parameter regimes.Both codes used the physical parameters as stated after Eq. ( 18) and the frequency parameters according to Tab. 1.We begin by examining the static properties of the model including the quasiparticle residue Z given by as well as the susceptibilities in the density (D) and magnetic (M ) channels.The latter can be obtained from the screened interactions analogous to Ref. [92], that is The corresponding results are summarized in Fig. 6.Both codes are in quantitative agreement and predict a strong enhancement of magnetic fluctuations at low temperatures.However, as Table 1: Frequency parameters for the benchmark results in Figs.6-9.We show the total number of frequencies used for the various Matsubara functions.Since the boxes are symmetric around zero there is an even (odd) number of Matsubara frequencies along fermionic (bosonic) directions.
total no.frequencies G 4096 Σ 512 η 1023 λ 575 × 512 M 383 × 320 × 320 has been noted in Ref. [77], the characteristic signature for the formation of a local magnetic moment at the impurity, a decrease of χ D for temperatures T ≲ 2 (for the specific choice of numerical parameters used here), is not captured by the parquet approximation.Instead, χ D increases monotonically over the entire range of temperatures considered and the system remains in a metallic state with well-defined quasiparticles (i.e.0 < Z < 1).
Figure 7 shows a direct comparison of the MBE vertices and their evolution with decreasing temperature within both codes.As can be seen from the middle column, showing the screened interaction, Hedin and multiboson vertex in the magnetic channel, most of the long-lived magnetic correlations are already captured by the screened interaction itself and thus by the corresponding single-boson exchange diagrams.In contrast, low-energy scattering processes mediated by multiple bosons seem to be less relevant, as indicated by a comparatively small M M contribution.This picture is somewhat reversed in the other channels (left and right column in Fig. 7).In the density channel, for example, the largest contribution originates from short and also long-lived multiboson fluctuations, especially at low temperatures.
Figure 8 presents further results for M X as a function of its two fermionic frequencies ν and ν ′ (with fixed Ω = 0).Remarkably, the structure of these objects is dominated by cross-like structures similar to those discussed in Ref. [92], which become more pronounced when T is decreased.A comparison of the data obtained with both codes (shown in the second row of Fig. 8), reveals that it is precisely these structures that seem difficult to capture in numerical calculations, and where small differences in the implementation can have a significant effect.However, the relative difference between the results from both codes is still small (≲ 0.01).
As a final benchmark of the codes, we have considered their respective serial and parallel performance for a single evaluation of the parquet equations in SBE decomposition (see Fig. 9).Surprisingly, the Julia code based on MatsubaraFunctions.jloutperforms the C++ implementation by about a factor of four when run in production mode (i.e., with internal sanity checks disabled).We would like to note that this is most likely not due to a fundamental performance advantage of Julia over C++, but simply the result of several optimizations (such as those presented in Sec.4.3.2) that were more easy to implement using MatsubaraFunctions.jl.Note that we approximated the derivative in Eq. ( 27) by a fourth order finite differences method.The upper panels show the data for different temperatures, the lower panels the absolute deviation between the Julia and the C++ implementation, respectively.For lower temperatures the features in the data require the computation and storage of a larger number of frequency points.The agreement of the data persists to the lowest temperature shown in this paper.

Future directions
We have presented a first version of the MatsubaraFunctions.jllibrary and its basic functionality.Although the library already offers many features, most notably an automated interface for implementing and exploiting symmetries when working with Green's functions (including several options for parallel evaluation), as well as high performance for larger projects (see Sec. 4.3.1 and the discussions therein), several generalizations of the interface and further optimizations are currently under development.In addition, we will add more support for generic grid types other than just Matsubara frequency grids.These could include, for example, momentum space grids and support for continuous variables (such as real frequencies).Note, however, that calculations in momentum or real space are already feasible with the current state of the package, if a suitable mapping from, say, wavevectors to indices is provided.Accuracy improvements for fitting high-frequency tails and more advanced extrapolation schemes for Matsubara sums are also in the works.
In the future, it will be very valuable to extend the ecosystem surrounding MatsubaraFunctions.jl.For example, many state-of-the-art diagrammatic solvers rely on the efficient evaluation of similar diagrams such as vertex-bubble-vertex contractions, which are a common feature of Bethe-Salpeter-type equations.These operations could be developed independently of the main library, providing even more quality-of-life options for the user.Moreover, such a toolkit would allow for the swift deployment of different types of solvers, including fRG solvers for quantum spin systems and self-consistent impurity solvers such as the MBE code presented in Sec.4.3.2, to name but a few.With many new and exciting correlated materials becoming available, fast and flexible solvers are of utmost importance to facilitate scientific progress, and we strongly believe that a package like MatsubaraFunctions.jl could be a useful tool for their rapid development.
the Munich Quantum Valley, supported by the Bavarian state government with funds from the Hightech Agenda Bayern Plus.N.R. acknowledges funding from a graduate scholarship from the German Academic Scholarship Foundation (Studienstiftung des deutschen Volkes) and additional support from the "Marianne-Plehn-Programm" of the state of Bavaria.The numerical simulations were, in part, performed on the Linux clusters and the SuperMUC cluster (project 23769) at the Leibniz Supercomputing Center in Munich.The Flatiron Institute is a division of the Simons Foundation.

B Implementation details for the MBE solver
In this section we provide additional information on the implementation of the MBE equations, which were introduced on a general basis in Sec.4.3.1 of the main text.As for any application involving many-body Green's functions, it is crucial to choose an appropriate parametrization of the self-consistent equations that reflects the symmetries of the field theory under consideration.Here, we deal with the implementation of SU( 2) symmetry (spin rotation invariance) as well as time translation invariance (energy conservation) for the MBE equations of the impurity model defined in Sec.4.3.

B.1 SU(2) symmetry
Consider an SU(2) transformation U = e iεσ , where ε ∈ 3 and σ is the vector of Pauli matrices.Under U, the fermionic creation and annihilation operators transform into where we have omitted all indices except the spin s = {↑, ↓}.For SU(2) symmetric actions it can be shown that single-particle Green's functions G (1) ss ′ are diagonal and also invariant under spin flips, i.e.G (1) ss ′ = G (1) δ ss ′ [48].Two-particle correlators G (2) , on the other hand, can be decomposed into two components G (2)|= and G (2)|× , which preserve the total spin between incoming and outgoing particles G (2) Furthermore, the Bethe-Salpeter-like equations ( 24) can be diagonalized by introducing a singlet (S) and a triplet (T ) component in the t channel.Moreover, this change of basis has the advantage that physical response functions can be obtained directly from the screened interaction in the respective channel.The spin susceptibility χ M , for example, is simply given by −U 2 χ M = η M + U for a local Hubbard U.For this reason, the {S, T, D, M } basis is sometimes called the physical spin basis, whereas the decomposition into parallel (=) and crossed terms (×) is known as the diagrammatic spin basis [48].In the implementation of the MBE solver, the former is used.

B.2 Time translation invariance
The interacting part of the impurity action from Sec. 4.3 is static, i.e. the bare interaction U is τ-independent.Consequently, one and two-particle Green's functions are invariant under translations in imaginary time, which implies conservation of the total Matsubara frequency between incoming and outgoing legs [48] and, thus, G (1) (ν, ν ′ ) = G (1) (ν) × βδ ν|ν ′ ,

3 4 11 12 for v in g 13 G 21 G 23 24 # calculate the residue 25 FFigure 1 :
Figure 1: Exemplary Hartree-Fock results.(a) Comparison of the bare Green's function G 0 with the HF result G HF for T /U = 1 3 .(b) Hartree-Fock density n as a function of temperature.

Figure 2 :
Figure 2: Diagrammatic representation of spin-conserving GW equations in the atomic limit.Wavy lines denote the screened interactions in the density (red) and magnetic (blue) channel.They are obtained by dressing the respective bare interactions with a series of bubble diagrams P(Ω), as illustrated in the second and third line from the top.

FFigure 3 :
Figure 3: Exemplary GW results.(a) The complex-valued self-energy Σ GW with its real part offset by the Hartree shift Σ H = U n GW for T /U = 1 3 .(b) GW and Hartree-Fock densities as a function of U/T .
T, U, N) 6 init = zeros(ComplexF64, length(S.Sigma)) 7 res = nlsolve((F, x) -> fixed_point!(F,x, S), init, method = :anderson, m = 8, beta = 0.5, show_trace = true) → describing an impurity d level ε d , hybridized with conduction electrons of the metal via a matrix element V k .The electrons in the localized d state, where n d,σ = d † σ d σ , interact according to the interaction strength U, whereas the c electrons of the bath are non-interacting.Following[76], in a path-integral formulation for the partition function Z = σ D dσ D (d σ ) D ck,σ D c k,σ e −S with the action S = β 0 L(τ)dτ, the Lagrangian for the model is given by

2 Figure 4 :
Figure 4: Illustration of U -reducibility in the three two-particle channels a, p and t.The Figure is analogous to Fig. 4 of [79] and adapted from [77]. 1 and 2 can be any vertex diagram or the unit vertex.

Figure 4 :
Figure 4: Illustration of U-reducibility in the three two-particle channels a, p and t .The Figure is analogous to Fig. 4 of [80] and adapted from [78].Γ 1 and Γ 2 can be any vertex diagram or the unit vertex.

Figure 5 :
Figure5: Structure of the MBE code.First, an instance S of type MBEsolver is constructed by passing the SIAM parameters T , U, V and D and the sizes for the Matsubara grids.The self-energy Σ is initialized using second order perturbation theory (PT2), while all other MatsubaraFunctions are set to their bare values.In an optional step, MatsubaraSymmetryGroups for λ r and M r (here denoted by SG) can be precomputed.Next, the solve!function is used to find the fixed-point of the MBE equations using Anderson acceleration.To interface with NLsolve, the fields Σ, η r , λ r and M r of S (which are sufficient to determine all other involved quantities) are flattened into a single one-dimensional array.After convergence, S is finally written to disk in H5 file format.

Figure 6 :
Figure6: Results for the quasi-particle residue Z and the density/magnetic susceptibility χ D/M .The comparison shows good agreement between the two codes.Note that we approximated the derivative in Eq. (27) by a fourth order finite differences method.

Figure 7 :
Figure 7: Benchmark of vertex quantities between the Julia and C++ code.We show frequency slices through various SBE ingredients (top to bottom: screened interactions, Hedin vertices, multiboson vertices) at different temperatures and channels.The comparison shows good agreement between both codes.

Figure 8 :
Figure 8: Slice through multi-boson contributions M D , M M and M S .The upper panels show the data for different temperatures, the lower panels the absolute deviation between the Julia and the C++ implementation, respectively.For lower temperatures the features in the data require the computation and storage of a larger number of frequency points.The agreement of the data persists to the lowest temperature shown in this paper.

Figure 9 :
Figure9: Performance benchmark between the Julia and C++ code.We show the time taken for a single evaluation of the parquet equations in SBE decomposition.Note that the runtimes have been normalized to the serial result of the C++ code.

G ( 2
the p channel, and a density (D) and magnetic (M ) contribution G