divERGe implements various Exact Renormalization Group examples

We present divERGe, an open source, high-performance C/C++/Python library for functional renormalization group (FRG) calculations on lattice fermions. The versatile model interface is tailored to real materials applications and seamlessly integrates with existing, standard tools from the ab-initio community. The code fully supports multi-site, multi-orbital, and non-SU(2) models in all of the three included FRG variants: TU$^2$FRG, N-patch FRG, and grid FRG. With this, the divERGe library paves the way for widespread application of FRG as a tool in the study of competing orders in quantum materials.


Introduction
High-performance computing has brought many insights into physical problems deemed unsolvable by analytic means. In particular, the ab-initio treatment of condensed matter systems in the form of density functional theory, GW, and dynamical mean-field approaches is a remarkable success story [1][2][3][4][5][6][7]. Besides the versatility of the methods mentioned above, a key ingredient for this success is the availability of optimized libraries, such as ABINIT [8], Quantum Espresso [9], Vasp [10], BerkeleyGW [11], YAMBO [12], TRIQS [13] and many more [14], alleviating the need to develop complex codes over and over again. If no such public code is available, each researcher has to implement the method themselves, creating a lot of redundant work with, most likely, sub-optimal outcome. Therefore, it is essential for the broad applicability of a method to have a public code available, as well as documented knowledge about best practices in implementations.
In the study of correlated materials, the ab-initio-based treatments mentioned above already incorporate many important features. However, superconducting orders from electronic interactions and other effects of long-range interactions generated during the approximate solution of the many-body Schrödinger equation are only partially, or not at all, included. This creates the need for methods and codes that allow us to connect to these developments and extend the status quo by adding the missing pieces. In computational condensed matter physics, it has proven efficient to start from an effective low-energy description of the material, keeping only a few relevant bands. Arriving at such a down-folded model poses the first obstacle. Subsequently, we still have to solve a model including a few bands with complicated interactions. To tackle such problems, we often need to introduce approximations, which should be well controlled. For a broad class of materials, we can utilize perturbative approaches, such as the random phase approximation (RPA), the parquet approximation [15] and FRG [16,17]. While the first includes only specific diagrammatic channels, the latter two are diagrammatically unbiased, making them prime candidates for the extension of the ab-initio machinery; the issue is that implementations of the full equations, incorporating all their dependencies, are beyond our current reach.
In this paper we present divERGe, an open source, high-performance (multi-node CPU & multi-GPU) C/C++/Python library (available at [18]) that implements different flavors of the FRG [16,17]. The library is based on a general model interface (cf. Section 3) and three different computation backends: (i) grid-FRG [19,20], (ii) truncated unity FRG (TU$^2$FRG) [21][22][23], and (iii) orbital space N-patch FRG [24][25][26]. Each performs different approximations of the central equations, resulting in different numerical complexity, as detailed in Appendix D. This paper is designed as a hands-on introduction to the usage of divERGe. We therefore briefly summarize FRG as a numerical method in Section 2, introduce the model structure in Section 3, explain how the flow equations are solved in Section 4 and how the results are analyzed in Section 5.
In the following we briefly explain the type of models that can be studied with divERGe, and thereafter detail the equations that are solved and give an introduction to the analysis of results.

General setup
In general, we aim to study arbitrary fermionic models on arbitrary lattices, with a kinetic and a two-body interaction contribution. The Hamiltonian (in second quantization) for such a model reads [equation missing], with numbers denoting a collection of all quantum numbers specifying an electronic single-particle state. The model initialization from within the code is described in detail in Section 3.

Flow equations
Once the model is implemented, we solve the flow equations for the three different diagrammatic channels and (optionally) the static self-energy (primed variables are summed over): [equations missing], where Matsubara frequency summations over $ik'_0$ are implicit. The full two-particle vertex $F$ is given as [equation missing], with $U$ the fully irreducible vertex. The scale derivative of the loop is given in terms of the single-particle Green's function and the single-scale propagator $S = G\,(\partial_\Lambda (G_0)^{-1})\,G$: [equation missing]. The non-interacting Green's function $G_0$ is modified by a multiplicative regulator $f(\Lambda)$: [equation missing], and the interacting Green's function is given as [equation missing]. Within divERGe, we employ a sharp cutoff as regulator, i.e., $f(\Lambda) = \Theta(|\omega| - \Lambda)$. This choice significantly reduces numerical complexity in several parts of the code (see Ref. [52] and Appendix D).
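To make the sharp cutoff concrete, the following minimal Python sketch evaluates $f(\Lambda) = \Theta(|\omega| - \Lambda)$ on fermionic Matsubara frequencies $\omega_n = (2n+1)\pi T$. It is an illustration only and not part of divERGe; the function names and the convention $\Theta(0) = 0$ are assumptions made for this example.

```python
import math

def matsubara(n, temperature):
    """Fermionic Matsubara frequency omega_n = (2n + 1) * pi * T."""
    return (2 * n + 1) * math.pi * temperature

def sharp_cutoff(omega, lam):
    """Sharp regulator f(Lambda) = Theta(|omega| - Lambda).

    Assumption for this sketch: Theta(0) = 0, i.e. a frequency exactly
    at the cutoff scale is still suppressed.
    """
    return 1.0 if abs(omega) > lam else 0.0

def active_frequencies(n_max, temperature, lam):
    """Indices n in [-n_max, n_max) whose frequency passes the regulator."""
    return [n for n in range(-n_max, n_max)
            if sharp_cutoff(matsubara(n, temperature), lam) == 1.0]

if __name__ == "__main__":
    T = 0.1
    for lam in (2.0, 1.0, 0.5, 0.0):
        kept = active_frequencies(4, T, lam)
        print(f"Lambda = {lam:3.1f}: {len(kept)} of 8 frequencies kept")
```

Lowering $\Lambda$ lets more and more frequencies pass the regulator, until at $\Lambda = 0$ every mode contributes.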
From the above equations, we immediately find the connections between FRG and other diagrammatic approaches, such as the RPA and the parquet approximation.If we were to neglect the flow of all but one channel, we restore the RPA series specific to that channel [28], while when comparing with parquet, we find that we miss the multi-loop class of diagrams [53][54][55].These observations motivate a nomenclature as "RPA+" of the FRG flow -we include the RPA diagrams of all channels and most of the simultaneous cross-insertions of those ladders as feedback (missing only the multi-loop corrections).

Analysis of results
The differential equations for the vertices (flow equations) are solved numerically until a divergence of one of the vertex components occurs or the minimal scale is reached. A divergence is indicative of a phase transition to an ordered state. In that case, the flow is stopped, and from an analysis of the vertex, susceptibilities, and linearized gap equations we can extract information about the ordered state; see Ref. [52].

Creating a model
The central object in a momentum space FRG calculation is a model that includes information on kinetics and interactions. Together with a few additional parameters, these two pieces of information are all that is required to set up a simulation. To define a model from a Hamiltonian, we first have to choose our basis states [equation missing], with $o_1$ the combined orbital-site index within a unit cell and $R$ the real-space lattice vector which connects a site to the reference unit cell. The spin index in the $z$ basis is denoted by $s_1$.
While the code technically allows for arbitrary spins, the spin symmetric variant of the flow equations is only implemented for spin-1/2 particles.

Required structures
We choose a formulation of kinetics based on real-space hopping parameters. In the same spirit, the vertices are also passed as real-space descriptions in the three distinct interaction channels. Both must be supplied by the user. The following section explains these three key ingredients of the code, starting with the overarching diverge_model_t structure, whose listing begins with the field

    index_t nk[3];

We will detail the use of all fields within the structure in the following. For initialization of the diverge_model_t struct, we provide a function (called diverge.model_init() from Python) that returns a pointer (i.e., a handle) to a diverge_model_t structure. It furthermore ensures a sensible default for all optional parameters, but does not set the required ones.
For the destruction of the structure, the function

    void diverge_model_free(diverge_model_t* model);

is provided. The resulting skeleton that initializes the library and a model handle is given in Example 1.

Example 1
The skeleton of a simple example program using the divERGe C(++) library is formed by: including the function declarations (ll. 1-2), initializing the library (ll. 5-6), allocating a diverge_model_t structure (l. 7), and freeing resources (ll. 97-98).

    ...
    diverge_model_free(model);
    diverge_finalize();
    }

In the diverge_model_t structure, the following data fields must be set:
• nk[3]: number of k-points for the vertex in the direction of each reciprocal lattice vector. If non-zero, periodicity along that direction is assumed. To simulate a 2D model, we have to set nk[2]=0.
• nkf[3]: number of fine k-points around each coarse one in each direction for the loop integrals. Dimensionality has to match nk[3].
• n_orb: number of orbitals and sites in the unit cell.
• positions[MAX_N_ORBS][3]: positions of each site and orbital; the first n_orb entries will be used as the positions.
• SU2: if set to true, the model is SU(2) symmetric and therefore spins are implicit.
• n_spin: number of spin quantum numbers, $n_\mathrm{spin} = (S + 1/2) \cdot 2$, with the exception that for SU(2) symmetric systems n_spin should be set to one. If the model is not SU(2) symmetric, the interactions are forcibly symmetrized by the crossing relations (ensuring that Pauli's principle holds).

Notes: In general, the Python wrappers discard the diverge_ prefix present in the C functions. We solely describe the C interface in the following, as the Python interface follows it closely. In addition to releasing internally allocated data and the handle, diverge_model_free takes care of releasing all user-allocated data attached to the diverge_model_t handle.
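The bookkeeping behind n_spin, nk and nkf can be sketched in a few lines of Python. The helpers below are hypothetical and not part of divERGe; in particular, the treatment of nk[i] == 0 as "non-periodic, contributes a factor of one" and of nkf as a per-direction refinement factor is an assumption based on the field descriptions above.

```python
from fractions import Fraction

def n_spin(spin, su2):
    """n_spin = (S + 1/2) * 2 = 2S + 1, or 1 for SU(2)-symmetric models."""
    if su2:
        return 1
    value = (Fraction(spin) + Fraction(1, 2)) * 2
    if value.denominator != 1:
        raise ValueError("spin must be integer or half-integer")
    return int(value)

def mesh_sizes(nk, nkf):
    """Total number of coarse and fine momentum points.

    Assumption for this sketch: a direction with nk[i] == 0 is
    non-periodic and contributes a factor of one; nkf[i] counts fine
    points per coarse point in direction i.
    """
    coarse, fine = 1, 1
    for k, kf in zip(nk, nkf):
        if k == 0:
            continue
        coarse *= k
        fine *= k * max(kf, 1)
    return coarse, fine

if __name__ == "__main__":
    print(n_spin(Fraction(1, 2), su2=False))   # spin-1/2, explicit spin
    print(n_spin(Fraction(1, 2), su2=True))    # spin implicit
    print(mesh_sizes((24, 24, 0), (5, 5, 0)))  # 2D model
```

For a spin-1/2 model this gives n_spin = 2 without SU(2) symmetry and n_spin = 1 with it.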
From C(++), setting all nonzero parameters results in Example 2.

Example 2. Setting simple model parameters: A triangular lattice (ll. 8-12) is written directly to the arrays, whereas the basis of a honeycomb lattice (ll. 13-16) is calculated using the lattice vectors. Other required and optional parameters are set in lines 18-28.

    ...

To access a field ("field") from Python using the handle ("model") returned by the routine diverge.model_init(), one must use model.contents.field. To facilitate interfacing with numpy arrays, the diverge.view_array function is provided. It returns an array view of the chunk of memory that is passed as its parameter. Usage is detailed in the Python examples (see the divERGe repository [18]). Beyond these simple fields, the diverge_model_t struct contains more complicated members, explained in greater detail in the following.
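The model.contents.field idiom is the standard way of dereferencing a struct pointer with Python's ctypes module. The following self-contained sketch mimics it with a stand-in structure; ToyModel is hypothetical, contains only two fields, and does not reproduce the real diverge_model_t layout (the 64-bit integer type is likewise an assumption).

```python
import ctypes

class ToyModel(ctypes.Structure):
    """Hypothetical stand-in; NOT the real diverge_model_t layout."""
    _fields_ = [
        ("nk", ctypes.c_int64 * 3),
        ("n_orb", ctypes.c_int64),
    ]

# A C library would return a pointer to a struct it allocated; here we
# create the struct in Python and take a pointer to mimic such a handle.
storage = ToyModel()
model = ctypes.pointer(storage)

# The handle itself has no struct fields; .contents dereferences it.
model.contents.nk[0] = 6
model.contents.nk[1] = 6
model.contents.n_orb = 2

print(list(model.contents.nk), model.contents.n_orb)
```

Writes through model.contents act on the underlying memory, which is why the same pattern works for a handle whose memory is owned by a C library.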

Kinetics: rs_hopping_t
In a spirit similar to Wannier90 [56][57][58], we define a single hopping parameter through the following structure: [listing missing]. The number of such hopping parameters given in diverge_model_t is n_hop. From C, the struct is allocated using plain malloc, while from Python, the provided function diverge.alloc_array(shape, dtype) must be used in order to bypass Python's garbage collector. The value of the hopping parameters can be read off directly from the kinetic part of the Hamilton operator $T$ as the matrix elements [equation missing], where $o_i$ are the orbitals/sites as specified by positions (and n_orb), $s_i$ are the spin quantum numbers and R[3] ...

A call to diverge_read_W90_C allocates and fills the rs_hopping_t array and sets *len to the number of elements that were read. nspin describes the spin quantum numbers in the _hr.dat file. A default value of nspin = 0 amounts to an SU(2) symmetric model. If nspin ≠ 0, it must be set as |nspin| = 2S + 1, with S the physical spin (i.e., for S = 1/2 we have |nspin| = 2). The sign determines whether the spin index is the one which increases in memory slowly (negative) or fast (positive) in the _hr.dat file. Note that the divERGe convention is always (s, o), i.e., spin indices walking through memory faster than orbital indices ('outer indices'). In Example 3, instead of explicitly looping over hopping parameters, we could substitute lines 30-40 with the following function call (given that graphene_hr.dat contains the hoppings):

    model->hop = diverge_read_W90_C("graphene_hr.dat", 0, &model->n_hop);
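The rules for the nspin argument can be summarized in a small decoding helper. This is a sketch of the convention as stated in the text, written as a hypothetical Python function; it does not call into divERGe.

```python
from fractions import Fraction

def interpret_nspin(nspin):
    """Decode the nspin argument as described in the text.

    nspin == 0 : SU(2)-symmetric file, spin is implicit
    nspin != 0 : |nspin| = 2S + 1; the sign selects whether the spin
                 index is the slow (negative) or fast (positive) one
                 in the _hr.dat file.
    """
    if nspin == 0:
        return {"su2": True, "spin": None, "spin_index": None}
    spin = Fraction(abs(nspin) - 1, 2)
    order = "slow" if nspin < 0 else "fast"
    return {"su2": False, "spin": spin, "spin_index": order}

if __name__ == "__main__":
    for value in (0, 2, -2):
        print(value, interpret_nspin(value))
```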

Interactions: rs_vertex_t
In general, any static two-particle interaction can be written as [equation missing]. Thus, in all generality, we can specify the two-particle interaction by its dependence on four orbitals, four spins and three momenta. This scaling with the third (or even fourth) power of the system size discourages users from a direct initialization of the four-point object (it is possible, however; see Appendix A.4). Fortunately, a fair subclass of possible interactions can be formulated as inter-orbital bilinear vertices in one of the three inequivalent interaction channels [59]. The simple interface therefore restricts the possible input to vertices that can be efficiently represented in real space. Similar to the hopping parameter definition (cf. Section 3.1.1), we define a structure for vertices, rs_vertex_t, whose listing contains the line

    index_t o1, o2, s1, s2, s3, s4;

The allocation of the structure is analogous to that of rs_hopping_t. It differs from the hopping parameters in only two points: the interaction channel is given as a character that can be 'C' (for crossed particle-hole, i.e., the C-channel), 'D' (for direct particle-hole / D) or 'P' (for particle-particle / P), and the user can supply four spin indices instead of two. In the three interaction channels, the single vertices thus represent the following terms in the interaction part $V$ of the Hamiltonian: [equations missing]. Notably, each term on the right-hand side corresponds to a single rs_vertex_t. Moreover, a default spin configuration can be given for non-SU(2) systems as a special case: for $s_1 = -1$, the spin dependence of a given rs_vertex_t is initialized such that ...
where $I$ is any of the three channels. This ensures that the initialization of a non-SU(2) Hubbard-Kanamori interaction is identical to the one in an SU(2) system when crossing is enforced (see examples). We stress here that if a non-crossing-symmetric interaction is initialized and crossing symmetry enforcement is turned off, the different backends do not have to give compatible results, as there is an arbitrary freedom of parametrization in the flow equations. Furthermore, the results of such a simulation are, in general, unphysical (violating Pauli's principle).
We are aware that with this interface, some two particle interactions cannot be encoded.To circumvent this shortcoming, the code offers the possibility to provide a custom vertex generation function for the full four point vertex; full_vertex_generator_t (cf.Appendix A.4).
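The unfavorable scaling of the full four-point object (three momenta, four orbital indices, four spin indices) is easy to quantify. The Python sketch below counts entries and converts them to memory, assuming double precision complex numbers (16 bytes per entry); the helper names and the example mesh sizes are illustrative and not taken from divERGe.

```python
def full_vertex_elements(n_k, n_orb, n_spin):
    """Number of entries of a vertex depending on three momenta,
    four orbital indices and four spin indices."""
    return n_k**3 * n_orb**4 * n_spin**4

def gibibytes(n_elements, bytes_per_element=16):
    """Memory for complex double precision entries (16 bytes each)."""
    return n_elements * bytes_per_element / 2**30

if __name__ == "__main__":
    # Example values chosen for illustration only.
    for mesh in (12, 24, 48):
        n_k = mesh * mesh            # two-dimensional mesh
        n = full_vertex_elements(n_k, n_orb=2, n_spin=1)
        print(f"{mesh}x{mesh} mesh: {gibibytes(n):10.1f} GiB")
```

Doubling the linear mesh size of a two-dimensional model multiplies the memory by 64, which is why a real-space, channel-based input is the default route.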

Optional but recommended: Point group symmetries
Once a diverge_model_t structure has been allocated and all the variables described above have been set, your code will already run. However, we recommend also providing divERGe with the symmetries of the model. Depending on the backend, this will reduce the runtime and memory footprint or make the results symmetry preserving [52].
The point-group symmetries of the model can be attached to a handle (diverge_model_t) via two arrays and their length (cf. ll. 15-17 in the listing of the diverge_model_t structure), i.e.,

    index_t n_sym;
    ...

where rs_symmetries stores the real-space transformations (3×3 matrices $M$, with $r' = Mr$) and orb_symmetries stores the transformation behavior of the combined state of spin and orbital. n_sym is the number of symmetries present in the model. The symmetries can be provided directly by the user; however, this can become cumbersome for multi-orbital and/or multi-site models. Hence we provide

    diverge_generate_symm_trafo( ..., const site_descr_t* orbs, index_t n_orbs, const sym_op_t* syms, index_t n_syms, double* rs_trafo, complex128_t* orb_trafo );

to generate both the real-space transformation (rs_trafo) and the orbital/sublattice/spin space transformation (orb_trafo) for a symmetry operation specified by the list of symmetry operators stored in syms and for orbitals specified by the site descriptors stored in orbs. We will explain these two structs in the following. First, a single symmetry operator is defined by the structure

    typedef struct sym_op_t {
        char type;
        double normal_vector[3];
        double angle;
    } sym_op_t;

where type specifies the type of symmetry, with possible values 'R' (rotation), 'M' (mirror), 'I' (inversion), 'S' (spin rotation), 'F' (spin flip), and 'E' (identity). normal_vector[3] encodes the normal vector to the plane in which the symmetry operation acts, and angle is the angle of the rotation in degrees (ignored if the operation does not need an angle). To encode, for example, a mirror in the yz-plane, we can set the structure's fields as follows: [listing missing]. In the call to diverge_generate_symm_trafo, the parameter n_syms specifies the number of elementary symmetry operations needed to describe the current symmetry. The order of application of these elementary operations $O_i$ to some vector is [equation missing]. As the symmetry generator also works for tight-binding orbitals with non-trivial angular dependence, we require the site_descr_t structure, which describes the content of real spherical harmonics of the tight-binding basis function for each orbital and site index, passed as orbs together with the number of elements n_orbs. The structure is defined as [listing missing]. Again, the interface is inspired by the orbital interface of Wannier90 [56][57][58]. n_functions gives the number of basis functions required for constructing the orbital and functions gives the type of spherical harmonic, while amplitude is the complex weight of each of the basis functions. For example, to generate a $p_+$ orbital, we require a $p_x$ and a $p_y$ orbital with weights of $1/\sqrt{2}$ and $i/\sqrt{2}$, respectively. In case the coordinate system of the real spherical harmonics does not align with the coordinate system of the lattice, we also allow for the arguments xaxis and zaxis, which are used to define the x and z axes of the real harmonics; these are [text missing]. We stress here that the value n_orbs passed to the symmetry generation routine does not have to equal n_orb from the model; it should only do so if you plan to use the resulting symmetry transformations directly in the model. The examples include models of various symmetries using both the Python and the C/C++ interfaces.
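For orientation, the real-space matrices $M$ (with $r' = Mr$) of a mirror and a rotation can be written down with textbook formulas. The Python sketch below uses a Householder reflection and the Rodrigues formula; it is not the library's implementation, the helper names are made up, and the sense of rotation is an assumption that may differ from the convention used by divERGe.

```python
import math

def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("normal vector must be non-zero")
    return [x / norm for x in v]

def mirror_matrix(normal):
    """Householder reflection M = 1 - 2 n n^T for a mirror plane with normal n."""
    n = _normalize(normal)
    return [[(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(3)]
            for i in range(3)]

def rotation_matrix(axis, angle_deg):
    """Rodrigues rotation by angle_deg (degrees) about the given axis."""
    n = _normalize(axis)
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    cross = [[0.0, -n[2], n[1]], [n[2], 0.0, -n[0]], [-n[1], n[0], 0.0]]
    return [[c * (1.0 if i == j else 0.0) + s * cross[i][j]
             + (1.0 - c) * n[i] * n[j] for j in range(3)] for i in range(3)]

def apply(matrix, vector):
    """r' = M r."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]

# p_+ orbital as a combination of p_x and p_y with weights 1/sqrt(2), i/sqrt(2)
P_PLUS_AMPLITUDES = [1 / math.sqrt(2), 1j / math.sqrt(2)]

if __name__ == "__main__":
    # Mirror in the yz-plane: the normal vector points along x.
    print(apply(mirror_matrix((1, 0, 0)), [1.0, 2.0, 3.0]))
    # Rotation by 90 degrees about z.
    print(apply(rotation_matrix((0, 0, 1), 90.0), [1.0, 0.0, 0.0]))
```

The two amplitudes of the $p_+$ example are normalized, $|1/\sqrt{2}|^2 + |i/\sqrt{2}|^2 = 1$.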

Preparing the model for a flow
After the model struct is initialized and filled with user data, internal structures must be generated. In addition to common internal structures, each backend defines its own internal structure. Since users may want to work with the Hamiltonian array, the energies, the momentum meshes, etc. (common internal structures), the common internal structures are initialized separately from the backend-specific ones. Before starting these potentially computationally expensive operations, it is advisable to check for obvious errors in the model. These two steps result in the following lines of code: [listing missing]. After the validity check and initialization of the common internals, many codes will in practice call one of the following functions: [listing missing]. [...] contains the diagonalization of the Hamiltonian. This also implies that the Hamiltonian cannot be changed after calling this function. Changing the default Hamiltonian generated by divERGe is most easily accomplished by writing a custom Hamiltonian generator that, as a first step, calls the default Hamiltonian generator (see Appendix A.5).

Some convenience functions
Often, one wishes to perform simulations of a model at several values of the chemical potential µ or filling ν (where, in our convention, ν = 0 corresponds to a completely empty model and ν = 1 to a completely filled one). Three convenience functions are defined in divERGe to perform tasks related to changing µ or ν of a given model: [listing missing]. For all of them, the energy buffer (E) may be NULL (or None from Python), allowing for usage of the internally constructed energy arrays (and number of bands) instead of E and nb. The function diverge_model_set_filling, aside from setting µ correctly, returns the value of µ that was needed to fill the model to ν (at T = 0). Note that in advanced use cases (when, e.g., the standard Hamiltonian generator is overridden), the first three of those functions are not guaranteed to return meaningful results.
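The relation between filling and chemical potential at T = 0 can be illustrated with a short Python sketch that works on a flat list of band energies. It only mirrors the convention ν = 0 (empty) to ν = 1 (full); the function names are hypothetical, degenerate levels at the Fermi energy are not treated specially, and this is not how divERGe implements diverge_model_set_filling.

```python
import math

def filling(energies, mu):
    """Fraction of states with energy below mu (T = 0)."""
    return sum(1 for e in energies if e < mu) / len(energies)

def mu_for_filling(energies, nu):
    """Chemical potential that fills the fraction nu of all states at T = 0."""
    if not 0.0 <= nu <= 1.0:
        raise ValueError("filling must lie in [0, 1]")
    levels = sorted(energies)
    n_occ = round(nu * len(levels))
    if n_occ == 0:
        return -math.inf
    if n_occ == len(levels):
        return math.inf
    # Place mu halfway between the highest occupied and lowest empty level.
    return 0.5 * (levels[n_occ - 1] + levels[n_occ])

if __name__ == "__main__":
    energies = [-1.0, 0.0, 1.0, 2.0]   # toy spectrum, all k-points and bands
    mu = mu_for_filling(energies, 0.5)
    print(mu, filling(energies, mu))
```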
Lastly, the user can access point group symmetrization routines for use on arbitrary arrays of certain shape and data type: [listing missing]. The _fine (_coarse) functions act on arrays where the momentum dimension is equivalent to the fine (coarse) momentum mesh. In case the auxiliary buffers (aux) are provided, they must be of the same shape as the main buffer (buf). If they are NULL, internal allocations are performed. We offer symmetrization of 2-point functions [_2pt, shape $(n_k, n_b, n_b)$], and diagonal functions [_mom, e.g., energy arrays, shape $(n_k, \mathrm{sub})$].

Model output
Before performing the flow (see Section 4), which usually presents the computationally most expensive task, it is advised to check whether the model is implemented correctly. Notably, this step is often computationally cheap enough to be performed on a local machine. We allow for models to be written to disk via [listing missing]. Using the second function, users can achieve fine-grained control over optional output such as the momentum meshes, or the dispersion and orbital-to-band matrices in the primitive zone.
As return value, the above functions give the MD5 checksum of the output file as a string.
Details on the binary file format and optional output parameters are given in Appendix C.1. As part of divERGe, we ship the Python library diverge.output that reads all divERGe output files, including the model file and postprocessing files (see Section 5). Plotting the band structure of a model requires merely a few lines of Python; Listing 1 contains all the code that produces the band structure plot of a three-band Emery model shown in Fig. 1.
The steps described in Sections 3.3 to 3.5 are practically illustrated as a code snippet in Example 5.
Example 5 Preparing a model for the flow: Validation (l.62) precedes setting the common internals (l.67), the filling (l.68), and the TUFRG specific internals (l.69).We also showcase model output to a file setting some of the non-default parameters (ll.70-74).

Performing the flow

A single flow step
Given that the user has allocated, filled, and initialized (the internals of) a diverge_model_t structure, the next step towards performing an FRG flow is through the opaque structure diverge_flow_step_t (serving as a handle). Its allocation requires a fully initialized diverge_model_t structure and must be performed via one of the following two functions: [listing missing]. The second function is valid if and only if a single set of internal structures (corresponding to a single backend) has been initialized in the model. As users may want to initialize multiple backends, we provide the first of the two functions for precise control via the mode parameter (that can be "grid", "patch", or "tu"). In both cases, channels is passed as a string and encodes which interaction channels are included in the FRG flow. If the character 'P' ('C', 'D') is found in the channels string, the respective diagrams are calculated, allowing easy access to RPA calculations in any of the interaction channels. For systems without SU(2) symmetry, one must not include the D-channel without the C-channel and vice versa. Otherwise, Graßmann symmetry would be broken. In addition to the interaction channels, the character 'S' stands for the inclusion of (static) self-energies. Currently, these are only supported in the TUFRG backend, which is subject to change in future releases.
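The rules for the channels string can be written as a small validation helper. The Python function below is a hypothetical sketch of the constraints stated above; it is not part of divERGe, and rejecting unknown characters is a choice made for this example rather than documented library behavior.

```python
def parse_channels(channels, su2):
    """Split a channels string into flags and apply the rule from the text.

    'P', 'C', 'D' select interaction channels, 'S' the static self-energy.
    For models without SU(2) symmetry, 'C' and 'D' must appear together.
    """
    allowed = set("PCDS")
    unknown = set(channels) - allowed
    if unknown:
        raise ValueError(f"unknown channel characters: {sorted(unknown)}")
    flags = {name: (name in channels) for name in "PCDS"}
    if not su2 and flags["C"] != flags["D"]:
        raise ValueError("non-SU(2) models need both 'C' and 'D' or neither")
    return flags

if __name__ == "__main__":
    print(parse_channels("PCD", su2=False))   # full flow
    print(parse_channels("P", su2=False))     # RPA in the P-channel
    print(parse_channels("D", su2=True))      # allowed with SU(2) symmetry
```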
After the initialization, the diverge_flow_step_t handle serves the purpose of performing Euler integration steps of the FRG flow [Eqs. (2) to (5 ...]. We stress that a negative dΛ is required to flow from high Λ to zero. Since users often wish to loop over flow steps, we provide a simple adaptive integrator. Its usage is explained below.

Integrating the flow equations
Under the approximations and assumptions taken within the scope of divERGe, the flow equations are integrated from high scales Λ = ∞ to low scales Λ = 0 until a phase transition (divergence of a vertex element) is encountered or a minimal Λ is hit. In practice, we "flow" using a construction similar to the loop shown in Example 6, where the function call to diverge_euler_defaults_CPP() returns sensible defaults for many models (cf. Appendix B.1). This code snippet will integrate the flow equations until a stopping criterion is met. After each integration step, the snippet outputs the channel and vertex maxima, as well as the current value of the integration scale Λ. The step width is adjusted to keep the error manageable while remaining efficient (cf. Appendix B.1). Note that when selecting only a specific channel in the flow step initialization, the FRG flow amounts to an RPA calculation in that channel (with formfactors accounting for non-channel-specific long-range interactions).

Example 6. Flow step initialization (l. 78) and cleanup (l. 96), integrator setup (the default values are reasonable for most models; we change them to showcase the mechanism; ll. 79-82), and flow loop idiom (ll. 83-88).

    ...

Listing 2: Python code to produce Figure 2 from the output generated by examples/c_cpp/honeycomb.cpp [18]. As the vertex eigenvectors for the honeycomb lattice Hubbard model are complex in general, we must declare a helper function to plot these complex arrays (complex_cmap). We select the eigenvectors at indices 1 and 2; those are the ones corresponding to $d_{xy}$- and $d_{x^2-y^2}$-wave superconducting gaps.
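The structure of such a flow loop (Euler steps with negative dΛ, an adaptive step width, and termination on divergence or at a minimal scale) can be mimicked with a toy problem. The Python sketch below integrates the single-coupling equation $dV/d\Lambda = V^2/\Lambda$, whose exact solution $V(\Lambda) = V_0/(1 - V_0 \ln(\Lambda/\Lambda_0))$ diverges at $\Lambda_c = \Lambda_0 e^{1/V_0}$ for $V_0 < 0$. The model, the step-size rule and all parameter names are assumptions for illustration; they do not reproduce the divERGe integrator or its defaults.

```python
import math

def toy_flow(v0, lam_start=1.0, lam_min=1e-3, v_max=50.0,
             max_rel_step=0.01, max_rel_change=0.05):
    """Explicit Euler integration of dV/dLambda = V**2 / Lambda.

    Toy single-channel model, not a divERGe backend. The step d_lambda is
    negative (flow from high to low scales) and is shrunk when the
    coupling changes quickly.
    """
    lam, v = lam_start, v0
    while True:
        if abs(v) > v_max:
            return lam, v, "divergence"
        if lam <= lam_min:
            return lam, v, "min_scale"
        # Relative step in Lambda: at most max_rel_step, and small enough
        # that |dV| stays below max_rel_change * |V|.
        rel = max_rel_step
        if v != 0.0:
            rel = min(rel, max_rel_change / abs(v))
        d_lambda = -rel * lam
        v += d_lambda * v * v / lam
        lam += d_lambda

if __name__ == "__main__":
    for v0 in (-0.5, 0.5):
        lam, v, reason = toy_flow(v0)
        print(f"V0 = {v0:+.1f}: stopped at Lambda = {lam:.4f} ({reason})")
    print(f"exact critical scale for V0 = -0.5: {math.exp(-2.0):.4f}")
```

An attractive initial coupling runs into a divergence close to the exact critical scale, whereas a repulsive one flows down to the minimal scale without any instability.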
We specifically chose to leave control to the user regarding the flow loop, as it is never performance critical and there are many things that can be done at each step of performing the flow.For example, vertices may be accessed through diverge_flow_step_vertex.
The indices etc. of the returned array are, of course, backend dependent. Nonetheless, one can use it to, e.g., track specific components of the vertex. If the self-energy is included, we additionally provide functions to adjust the filling after each step by calling diverge_flow_step_refill, leading to a quasi-canonical description. Note that this quasi-canonical description may be ill-defined in many cases and is therefore not enabled by default. For further usage, see the online documentation as well as the examples.

Figure 2: ... [18] or Appendix E.2). The plot is generated with the small Python script presented in Listing 2. We encode the complex phase as color and the magnitude as opacity. Notice how most of the orbital weight of the gap function is on the off-diagonal elements $\Delta_{12}(k)$ and $\Delta_{21}(k)$.

Post processing and output
After integration of the flow equations and reaching a stopping criterion of the FRG flow, susceptibilities, vertex eigenvectors or solutions to linearized gap equations may provide physical insights to the system [20,28,40,51,52].Depending on the backend, the functions ...

Conclusion
In this paper, we presented the usage of the divERGe library in detail. We focused on explaining the interface for input and output as well as the backend implementations. For applications of this framework to physical systems, see Refs. [19,20,23,26,35,48,51,60]. We believe that the flavors of FRG realized in divERGe represent a sophisticated drop-in replacement for RPA in many applications in which qualitative predictions of correlated phases are wanted. As such, the divERGe library has a promising future as an extension of the ab-initio pipeline. We believe that the library presented here significantly improves the usability of FRG as a method to study competing orders in solid state systems and hence can considerably increase the reach and popularity of FRG in the scope of ab-initio as well as model calculations. In the future, we hope that the tight connection of divERGe to Wannier90 will allow high-throughput materials databases [61][62][63] to be extended to FRG, paving the way for systematic characterization of competing electronic orders in quantum materials. Moreover, we plan to further integrate divERGe with the existing ecosystem by, e.g., providing wrappers for interaction parameters obtained from first-principles codes [64].
The publication of this framework can, however, only be seen as a first step, and many future extensions are conceivable. First and foremost, the code in its current form does not treat the frequency dependence of the vertex; including it would allow us to introduce retardation effects on the two-particle level. This will require not only "just implementing" the frequency dependence; smarter representations of the frequency content of the high-dimensional vertex functions also have to be found. First steps in this direction have recently been taken [65][66][67][68]. Once this has been achieved, the usage as a post-DMFT tool becomes available through the DMF$^2$RG route and related proposals [69][70][71][72]. Secondly, while a significant amount of time has been spent on the optimization of the backends, reducing computational effort remains a continuous challenge that we strive to address.

A Advanced initialization
The following fields of the diverge_model_t are not required to be filled by the user.In fact, we highly recommend to only use these if you know what you are doing.No guarantees for correctness of the results can be given anymore as some of these options override key functions.

A.1 Fermi surfaces: mom_patching_t
The N-patch FRG backend of divERGe relies on the assumption that the model is defined on a fixed momentum mesh given by nk (chosen to sufficient accuracy). Patching of the Fermi surface as well as momentum integration of the loops happens on this momentum grid. We can therefore guarantee, e.g., general operation for nontrivial Hamiltonians or usage with Green's functions instead of Hamiltonians. In practice, the requirement of a fixed momentum mesh simplifies many aspects of an N-patch FRG calculation. For example, the structure to define a patching is as simple as follows:

    struct mom_patching_t {
    ...

It includes the total number of patches (n_patches), the indices of these patches referencing the coarse momentum mesh (patches), the weights assigned to each of the patches for Brillouin zone integrals (weights), and a detailed description of the refinement for each of the patches (p_count, p_displ, p_map, p_weights). The first two of these arrays, p_count and p_displ, serve as descriptors for the third and fourth: for a given patch index p, the slice p_displ[p]:p_displ[p]+p_count[p] of the arrays p_map and p_weights describes all indices corresponding to the refinement of patch p as well as their weights, respectively.

In the codebase, several convenience functions that simplify usage of the mom_patching_t struct are defined. For example, Fermi surfaces can be found automatically using the following function: [listing missing]. Notice how the routine operates on arbitrary energies and number of bands; similar to other divERGe routines, passing NULL for the energy array makes the library operate on the model internals. The resulting vector is allocated and written to *fs_pts_ptr, with its size saved in *n_fs_pts_ptr. To generate the patching struct from the Fermi surface indices (which can of course be modified before the struct generation), we provide [listing missing]. In addition, automatically refining the integration regions and re-symmetrizing those refined integration meshes is available via

    void diverge_patching_autofine(diverge_model_t* mod, mom_patching_t* patch, const double* E, index_t nb, index_t ngroups, double alpha, double beta, double gamma);
    void diverge_patching_symmetrize_refinement(diverge_model_t* mod, mom_patching_t* patch);

We advise users to consult the documentation and/or source code in case they wish to use those functions.
To automate the procedure of generating a patching with the energies calculated from the hopping parameters supplied, the default examples leave model->patching untouched and instead only call

    void diverge_model_internals_patch(diverge_model_t* m, index_t np_ibz);

Note that this call requires the common internal structures (cf. Section 3.3) to be set.
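The descriptor arrays p_count and p_displ define a compact "offsets and counts" layout for the per-patch refinement data. The Python sketch below shows how the slice p_displ[p]:p_displ[p]+p_count[p] is used to look up the refinement of one patch; the helper name and the toy numbers are made up for illustration.

```python
def patch_refinement(p, p_count, p_displ, p_map, p_weights):
    """Return (indices, weights) of the refinement of patch p."""
    start = p_displ[p]
    stop = start + p_count[p]
    return p_map[start:stop], p_weights[start:stop]

# Toy data (made up for illustration): three patches with 2, 3 and 1
# refinement points, stored back to back in p_map / p_weights.
p_count = [2, 3, 1]
p_displ = [0, 2, 5]
p_map = [10, 11, 20, 21, 22, 30]
p_weights = [0.5, 0.5, 0.25, 0.5, 0.25, 1.0]

for p in range(len(p_count)):
    idx, w = patch_refinement(p, p_count, p_displ, p_map, p_weights)
    print(p, idx, w)
```

Storing all refinements in two flat arrays avoids nested allocations, which is convenient for passing the data between C and Python.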

A.2 Formfactor expansions: tu_formfactor_t
The tu_formfactor_t structure is filled automatically when calling

    void diverge_model_internals_tu(diverge_model_t* mod, double dist);

This call ensures that all form-factors for each site/orbital on the lattice that have a length smaller than dist are included. In particular, global and local point-group symmetries are respected by this type of truncated unity [23]. The structure that is filled is exposed to the user. Its members are R[3], ofrom, oto, d, and ffidx, where R[3] is the Bravais-lattice vector corresponding to the form-factor. The bond connects the orbital/site indexed by ofrom with the one indexed by oto, which is located in the unit cell shifted by R[3]. d is the absolute distance between ofrom and oto+R[3]. ffidx is always constructed by the code itself; it enumerates all possible exponentials generated by the different form-factors, which are usually far fewer than the form-factors themselves.
For fine-grained control over form-factors, the generation step can be bypassed by setting a nonzero value for n_tu_ff. This transfers the responsibility of allocating and filling the tu_formfactor_t struct to the user. The code will then only sort it into the standardized form and set ffidx. As a minimal requirement, each site/orbital must possess the on-site form-factor, i.e., ([0,0,0], $o_1$, $o_1$, 0.0, 0).
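The minimal requirement can be made concrete with a small Python sketch. It represents each form-factor as a plain tuple (R, ofrom, oto, d, ffidx), following the order of the on-site example above. The helper names are hypothetical and the tuples only mimic the C struct; this is not the divERGe Python interface.

```python
def onsite_formfactors(n_orb):
    """Minimal user-supplied list: one on-site form-factor per site/orbital.

    Each entry mimics (R, ofrom, oto, d, ffidx). ffidx is left at 0 because,
    according to the text, the library sets it itself.
    """
    return [((0, 0, 0), o, o, 0.0, 0) for o in range(n_orb)]


def has_all_onsite(formfactors, n_orb):
    """Check the minimal requirement: every orbital has its on-site entry."""
    onsite = {ofrom for (R, ofrom, oto, d, _ffidx) in formfactors
              if R == (0, 0, 0) and ofrom == oto and d == 0.0}
    return onsite == set(range(n_orb))


ffs = onsite_formfactors(3)
print(has_all_onsite(ffs, 3))
```

A user-defined list would typically extend this minimal set by further bonds before handing it to the library.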

A.3 Custom channel based vertex generation
To allow for user-defined channel-based vertex input, the model can be equipped with a function pointer that will be called instead of the default channel generation. The pointer is expected to be of type channel_vertex_generator_t. Its return value tells whether the buffer has been touched (1) or not (0). If it has been touched, it is expected to contain the corresponding vertex channel in the order (nk, n_spin, n_spin, n_orb, n_spin, n_spin, n_orb). Note that this function does not offer more versatility than the default vertex generator, but it can be used to implement interaction profiles that have a natural representation in momentum space rather than real space.
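When filling such a buffer by hand, the multi-index has to be mapped to a flat offset. The sketch below does this in Python under the assumption that the buffer is laid out in row-major (C) order, i.e., with the last index running fastest. This layout is an assumption made for illustration, and channel_offset is a hypothetical helper.

```python
def channel_offset(idx, shape):
    """Row-major (C-order) offset of a multi-index; the last index is fastest."""
    offset = 0
    for i, n in zip(idx, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        offset = offset * n + i
    return offset


# Toy sizes (made up): 4 momenta, 2 spin states, 3 orbitals.
nk, n_spin, n_orb = 4, 2, 3
shape = (nk, n_spin, n_spin, n_orb, n_spin, n_spin, n_orb)
print(channel_offset((1, 0, 0, 0, 0, 0, 0), shape))
```

With these sizes, one momentum slice holds (n_spin * n_spin * n_orb)^2 = 144 complex numbers, so the second momentum starts at offset 144.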

A.4 Custom full vertex generation
In cases where the user requires more versatility for the vertex initialization, they can attach a full vertex generator function of the form

    void (*full_vertex_generator_t)(const diverge_model_t* model,

to the model structure. The function is expected to return the full vertex at a specific momentum combination in buf, as the full two-particle interaction is in general far too large to store. We expect the user not to make use of parallelism within their full vertex generator, as the function is called in parallel by divERGe. The index order is expected to be (n_spin, n_orb, n_spin, n_orb, n_spin, n_orb, n_spin, n_orb). We do not encourage users to employ a custom full vertex generator.

A.5 Custom Hamiltonian generation
The custom Hamiltonian generator can be useful if, e.g., the Hamiltonian is present as data rather than hopping elements. The user must provide a generator function and attach it to the model handle as the custom Hamiltonian generator hfill.

A.6 Custom Green's function generation
Analogously, the Green's function generator is of the structure

    greensfunc_op_t (*greensfunc_generator_t)(const diverge_model_t* model,

The function is expected to provide the Green's function at $\Lambda$ and $\Lambda^*$ in buf in the index order ($\pm\Lambda$, prod(nk*nkf), n_spin, n_orb, n_spin, n_orb). If gfill is set, the Green's function is generated with the user-defined Green's function generator, which can be useful for the simulation of models where only a subspace of orbitals is correlated. In addition to changing gfill, the user may also set another Green's function generator, gproj. If this one is set, divERGe calculates $L = L(G) - L(G^{\mathrm{proj}})$. This functionality is used to remove certain subspaces from the kinetics.$^{20}$ It can also be used to isolate the effects of individual bands in the calculation. We note that when using a sharp frequency cutoff, we formally replace the regulator $f(\Lambda)$ by one in which $G_0^{\mathrm{proj}} \equiv T$ is the propagator in the target space and $G_0 - G_0^{\mathrm{proj}} \equiv R$ the propagator in the remote space (everything except the target space). This choice of the cutoff restricts the two-particle propagator to the high-energy sector, while the Green's function is not restricted [59,74]. One can easily prove that this regulator generates the loop contributions that enter the FRG flow, for which we implement the difference form $L(G) - L(G^{\mathrm{proj}})$ for in-code simplicity.
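To see which contributions survive the subtraction, one can expand the loop schematically. The following is only a sketch under the assumption that the loop is bilinear in the propagator, $L(G) \propto G\,G$, with all indices, momenta, and prefactors suppressed:

```latex
\begin{align}
L(G_0) - L(G_0^{\mathrm{proj}})
  &\propto (T + R)(T + R) - T\,T \\
  &= T\,R + R\,T + R\,R .
\end{align}
```

Every remaining term contains at least one remote propagator $R$, so contributions built purely from the target space are removed from the loop.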

B Additional structures

B.1 Controlling the flow: diverge_euler_t
We include an adaptive Euler integrator in the library that allows fine-grained control over the integration of the flow equations. In practice, the flow loop is often a do-while loop in which the next $\Lambda$ and $d\Lambda$ are generated by the function diverge_euler_next within the loop condition: it returns 1 if a next step should be performed, and 0 if the flow should be terminated. We offer control over the termination conditions and the integrator through the diverge_euler_t structure. The meaning of each of its parameters is explained in the following:

$^{20}$ Example found in examples/c_cpp/t2g_cfrg.cpp [18].
• Lambda: starting scale $\Lambda$ of the flow; should be larger than the bandwidth (default: …)
• dLambda: starting step-width $d\Lambda$ of the flow; has to be negative. A rough estimate is given by $d\Lambda \approx -0.1\Lambda$ (default: $d\Lambda = -5$)
• Lambda_min: stopping value $\Lambda_{\min}$ of $\Lambda$ if no divergence is hit; has to be nonzero due to the finite energy resolution of the simulation (default: $\Lambda_{\min} = 10^{-5}$)
• dLambda_min: minimal allowed step-width $d\Lambda_{\min}$, serving as a lower bound $b$ on the step-width (default: $d\Lambda_{\min} = 10^{-6}$)
• dLambda_fac: defines an upper bound for the step-width as $B_{\mathrm{fac}} = d\Lambda_{\mathrm{fac}} \cdot \Lambda$ (default: …)
• dLambda_fac_scale: additional scaling factor $d\Lambda_{\mathrm{fac-sc}}$ for the calculation of the width of the next step, used as an upper bound $B_{\mathrm{fac-sc}}$ for the step-width
• maxvert: stopping condition such that the flow is halted when $V_{\max} >$ maxvert, i.e., the maximal element of the vertex reaches maxvert (default: maxvert = 50)
• maxvert_hard_limit: hard limit for the stopping condition; especially important if consider_maxvert_iter_start is set (default: $10^4$)
• niter: current number of performed flow steps (gets updated with each call to the step-width function diverge_euler_next)
• maxiter: maximal number of performed flow steps (default: −1, thus ignored)
• consider_maxvert_iter_start: number of steps for which the stopping condition is ignored, useful when starting with long-range interactions (default: −1, thus ignored)
• consider_maxvert_lambda: $\Lambda$ starting from which the stopping condition is active (default: −1.0, thus ignored)
After performing an Euler step from $\Lambda$ to $\Lambda + d\Lambda$, the next step-width $d\Lambda$ is calculated as $d\Lambda = \max[\min(B_{\mathrm{fac}}, B_{\mathrm{fac-sc}}), b]$, i.e., taking into account all the upper and lower bounds defined above. Moreover, the multiple halting conditions are checked and the return value of diverge_euler_next is chosen accordingly.
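The interplay of the bounds and halting conditions can be illustrated with a toy flow in Python. This is a sketch under several assumptions and not the divERGe implementation: the bounds are treated as magnitudes and the step is returned with a negative sign, dLambda_fac = 0.1 is an assumed value motivated by the rough estimate $d\Lambda \approx -0.1\Lambda$, the second upper bound is passed in as a plain number because its definition is not reproduced here, and the flow equation $dV/d\Lambda = -V^2/\Lambda$ is a made-up single-number model.

```python
import math


def next_step_width(Lambda, dLambda_fac=0.1, dLambda_min=1e-6, B_fac_sc=math.inf):
    """Next step as -max(min(B_fac, B_fac_sc), b), with bounds as magnitudes."""
    B_fac = dLambda_fac * Lambda      # upper bound proportional to the scale
    b = dLambda_min                   # lower bound on the step magnitude
    return -max(min(B_fac, B_fac_sc), b)


def toy_flow(V0=0.5, Lambda=10.0, Lambda_min=1e-5, maxvert=50.0, maxiter=10000):
    """Explicit Euler integration of the toy flow dV/dLambda = -V^2 / Lambda."""
    V, niter = V0, 0
    dLambda = next_step_width(Lambda)
    while niter < maxiter:
        V += dLambda * (-V * V / Lambda)   # Euler step from Lambda to Lambda + dLambda
        Lambda += dLambda
        niter += 1
        # halting conditions: vertex too large, or minimal scale reached
        if abs(V) > maxvert or Lambda <= Lambda_min:
            break
        dLambda = next_step_width(Lambda)
    return Lambda, V, niter


Lambda_end, V_end, steps = toy_flow()
print(Lambda_end, V_end, steps)
```

The toy vertex grows as the scale is lowered and the loop stops once it exceeds maxvert, well before the scale reaches Lambda_min, which mirrors a flow that hits a divergence.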

B.2 Encoding of real harmonics: real_harmonics_t
The enumeration real_harmonics_t encodes each real harmonic up to the g-shell such that the orbitals can easily be set up for use as function in site_descr_t. Following Refs. [75,76], we define real harmonics in terms of the spherical harmonics $Y_{l,m}$. Our naming convention is orb_ followed by the letter of the shell (s, p, d, f, g) and the value of m, where negative values are indicated by an 'm' and positive values by an underscore '_'. Alternatively, all harmonics up to the d-shell are accessible by their name.

C File formats
divERGe uses binary files as output for the model and postprocessing data. Their file formats are defined below. Python classes that read divERGe output files are distributed in the diverge.output library. Note that we are well aware of the HDF5 library [77], but decided against using it for the sake of high portability and a reduction of dependencies. Instead, we implemented a very lightweight I/O based on the C standard library functions fopen(), fwrite(), fclose() (or their MPI counterparts).

C.1 Model output file
To control the contents of the model file beyond their defaults, we provide the structure diverge_model_output_conf_t. No output is generated for flow information, as users themselves are responsible for looping over integration steps, i.e., flowing. The individual elements control optional storage of the following items (and are treated as boolean variables internally unless specified otherwise):
• kc: store coarse momentum mesh
• kf: store fine momentum mesh
• kc_ibz_path: if an IBZ-path has been set, stores the points in the coarse mesh on the path
• kf_ibz_path: if an IBZ-path has been set, stores the points in the fine mesh on the path
• H: store the Hamiltonian on the fine mesh
• U: store the orbital-to-band transformations on the fine mesh
• E: store energies on the fine mesh
• npath: integer value. If nonzero, use this value as if diverge_output_set_npath had been called, but with precedence. Allows control over the number of points on the band structure path (if > 0), or use of the internal fine k-mesh and the path constructed from there (cf. kf_ibz_path, if −1).

The data is preceded by a header (written in binary form) consisting of 128 64-bit (signed) integer numbers (index_t). Unless specified otherwise, all header variables are of the index_t type. The data section following the header contains all arrays specified through their displacement (displ) and size (given in bytes).
The binary model output file consists of a 128×64 bit header followed by data.A graphical representation of the file format for model output (i.e. its header) is documented in Fig. 3. To read an array from the output file, one must look up the displacement and size from the header.A simple C example program reading an array from a divERGe model file is shown in Listing 3.
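The lookup of displacement and size can also be sketched in a few lines of Python using only the standard library. The header length of 128 signed 64-bit integers and the byte units are taken from the text; the little-endian byte order, the float64 element type, and the header slot numbers used in the usage note are assumptions for illustration. The actual slot positions are defined in Fig. 3, and the helper names are hypothetical.

```python
import struct

HEADER_LEN = 128  # number of signed 64-bit integers in the model file header


def read_header(path):
    """Read the 128 x int64 header (assumes little-endian byte order)."""
    with open(path, "rb") as f:
        raw = f.read(8 * HEADER_LEN)
    if len(raw) != 8 * HEADER_LEN:
        raise ValueError("file too short for a header")
    return struct.unpack("<%dq" % HEADER_LEN, raw)


def read_doubles(path, displ_bytes, size_bytes):
    """Read a float64 array that starts at byte offset displ_bytes."""
    with open(path, "rb") as f:
        f.seek(displ_bytes)
        raw = f.read(size_bytes)
    return list(struct.unpack("<%dd" % (size_bytes // 8), raw))
```

If, hypothetically, slots 10 and 11 of the header held the displacement and size of the energy array, it could be read with header = read_header("model.dvg") followed by read_doubles("model.dvg", header[10], header[11]). In practice, the Python classes shipped in diverge.output already handle this.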

C.2 Postprocessing output files
The heterogeneous nature of postprocessing options for the different backends suggests specific file formats for each of them. A definition of the respective file headers is given in the following subsections. To control what will be stored, we provide the diverge_postprocess_conf_t struct, whose members include

    ...
    bool patch_q_matrices_use_dV;
    int patch_q_matrices_nv;
    double patch_q_matrices_max_rel;
    char patch_q_matrices_eigen_which;
    ...

The members have the following meaning:
• patch_q_matrices: assemble a list of all possible momentum transfers $q_X$ in each of the interaction channels X and save the vertices $V(q_X, k_X, k_X')$ in this representation (default: true)
• patch_q_matrices_use_dV: use the differential vertex instead of the vertex for generation of the q_matrices (default: false)
• patch_q_matrices_nv: how many eigenvectors to store in the q_matrices. If <0, do not perform an eigendecomposition; if ==0, store all eigenvectors (default: 0)
• patch_q_matrices_max_rel: restrict the analysis to those $q_X$ where the relative vertex norm is greater than this value (default: 0.9)
• patch_q_matrices_eigen_which: choose which eigenvalues and eigenvectors to store: 'M'agnitude, 'P'ositive, 'N'egative, or 'A'lternating (default: 'M')
• patch_V: include the full vertex in the output (default: false)
• patch_dV: include the differential vertex in the output (default: false)
• patch_Lp: include the particle-particle loop in the output (default: false)
• patch_Lm: include the particle-hole loop in the output (default: false)
• grid_lingap_vertex_file_P: store the $q_P = 0$ vertex to this file if not "" (default: "")
• grid_lingap_vertex_file_C: store the $q_C = 0$ vertex to this file if not "" (default: "")
• grid_lingap_vertex_file_D: store the $q_D = 0$ vertex to this file if not "" (default: "")
• grid_n_singular_values: number of singular values to be stored from the linearized gap solution (default: 20)
• grid_use_loop: solve the linearized gap equation for vertex times loop (default: true)
• grid_vertex_file: name of the file into which the vertex should be stored. Requires a lot of disk space! (default: "", which means nothing will be saved)
• grid_vertex_chan: if grid_vertex_file is non-empty, chooses the channel in which the vertex is stored (options are 'P', 'C', 'D', 'V'; default: '0')
• tu_which_solver_mode: specifies the routine used to diagonalize a channel. Possible values are 'e' (force eigensolver), 's' (force SVD), and 'a' (auto). The default is 'a', which checks whether the channel at q is Hermitian and then decides whether to use an eigenvalue decomposition or an SVD
• tu_storing_threshold: absolute value above which eigenvalues/singular values are stored (default: 50, should be smaller than or equal to maxvert)
• tu_storing_relative: consider tu_storing_threshold as a relative maximum instead of an absolute value, i.e., store all eigen/singular values and vectors if they are above the product of tu_storing_threshold and the maximum eigen/singular value for the channel over all q (default: false)
• tu_n_singular_values: number of singular values stored from the solution of the linearized gap equation (default: 1)
• tu_lingap: evaluate the linearized gap equation in each channel (default: true)
• tu_susceptibilities_full: calculate the susceptibility in the orbital basis $(q, o_{1-4}, s_{1-4})$ (default: false)
• tu_susceptibilities_ff: calculate the susceptibility in the TU-native mixed orbital/bond basis $(q, o_{1,3}, b_{1,3}, s_{1-4})$ (default: false)
• tu_selfenergy: store the self-energy if included in the calculation (default: false)
• tu_channels: store full channels from the TU simulation. Requires a lot of disk space! (default: false)
• tu_symmetry_maps: store the symmetry transformation for one vertex leg (i.e., eigenvector; default: false)
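The difference between the absolute and the relative storing threshold can be sketched in Python as follows. The container (a dictionary mapping a momentum label q to a list of values) and the function name are hypothetical, and comparing magnitudes rather than signed values is an assumption about how the rule is applied.

```python
def select_values(values_by_q, threshold, relative=False):
    """Return {q: [kept values]} for an absolute or relative storing threshold.

    With relative=True the cutoff is threshold times the largest magnitude
    found in the channel over all q.
    """
    cutoff = threshold
    if relative:
        largest = max(abs(v) for vals in values_by_q.values() for v in vals)
        cutoff = threshold * largest
    return {q: [v for v in vals if abs(v) > cutoff]
            for q, vals in values_by_q.items()}


# Made-up eigenvalues at two transfer momenta.
channel = {0: [60.0, -10.0], 1: [-100.0, 85.0]}
print(select_values(channel, 50.0))
print(select_values(channel, 0.8, relative=True))
```

With the relative setting, the cutoff follows the scale of the leading instability, which avoids having to guess an absolute value before the flow has been run.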

C.2.1 Grid FRG backend
The (binary) grid output file consists of a 64 × 64 bit header that is described in Fig. 4.This header is followed by the output data.

C.2.2 N-patch backend
The (binary) N-patch output file consists of a 128 × 64 bit header that is described in Fig. 5. This header is followed by the output data.

C.2.3 TUFRG backend
The (binary) TUFRG output file consists of a 128 × 64 bit header that is described in Fig. 6.This header is followed by the output data.

D Implementational details
This chapter specifies design choices for the critical parts of each of the backends.We do not discuss the implementation itself, but rather the basic ideas.Furthermore, the parallelization strategy is explained for each of the backends.

D.1 Grid FRG backend
Conceptually, the process of solving the FRG flow equations on a fixed momentum Bravais grid for all vertex indices is simple. Achieving sufficient performance for simulations on nontrivial two- and three-band models requires splitting the vertex buffers across more than one computation node (via MPI). The main bottleneck is then given by the available memory size, but also by the computation of the contractions. For parallelization, the vertices are reordered such that a channel-native representation is obtained and the contractions reduce to many matrix products (along $q$) that are distributed across all compute nodes with MPI, then branching to cuBLAS (GPU) or parallel BLAS (CPU) calls. The reordering procedure, while simple on one node, involves multiple communication steps on a multi-node architecture.
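Fundamentally, the grid backend scales with O(N ) in memory, with bottlenecks usually being communication, memory size, and batched matrix-matrix products.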

D.2 N-patch backend
N-patch FRG inherently breaks momentum conservation and thereby rotational symmetries of the system, which is due to the structure of the method itself: if the three momenta $k_{1,2,3}$ are fixed to the Fermi surface, the fourth momentum does not always lie on it. One approximates the vertex element with one momentum away from the Fermi surface by fixing that momentum to the closest point on the Fermi surface.
The patching procedure and pinning of momentum points to a grid is discussed in Appendix A.1.
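The pinning of the fourth momentum can be illustrated with a small Python sketch. It assumes a two-dimensional Brillouin zone $[-\pi, \pi)^2$, four made-up patch points, and plain Euclidean distance with periodic wrapping. The function names are hypothetical, and the actual patching in divERGe operates on mesh indices rather than on coordinates.

```python
import math


def wrap(x):
    """Map a momentum component into [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def closest_patch(k, patch_points):
    """Index of the patch point closest to k under periodic distance."""
    def dist2(p):
        return sum(wrap(a - b) ** 2 for a, b in zip(k, p))
    return min(range(len(patch_points)), key=lambda i: dist2(patch_points[i]))


def fourth_momentum_patch(i1, i2, i3, patch_points):
    """Patch index assigned to k4 = k1 + k2 - k3 (momentum conservation)."""
    k1, k2, k3 = (patch_points[i] for i in (i1, i2, i3))
    k4 = tuple(wrap(a + b - c) for a, b, c in zip(k1, k2, k3))
    return closest_patch(k4, patch_points)


# Four made-up patch points on a circular Fermi surface of radius 1.
patches = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
print(fourth_momentum_patch(0, 1, 2, patches))
```

For the combination (0, 1, 2) the conserved momentum is k4 = (2, 1), which lies off the Fermi surface and is therefore assigned to the nearest patch point. This assignment is the step that breaks exact momentum conservation.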
Fundamentally, the N-patch backend scales with O(N ) in memory, with $N_p$ the number of momentum patches. Since the objects involved are moderate in size, they are copied onto all MPI ranks of a calculation. The computational work is distributed using MPI and OpenMP on CPUs as well as CUDA on GPUs, with the main issue on both architectures being memory locality (since the vertex products cannot be written as matrix multiplications).

D.3 TUFRG backend
For the TUFRG backend, the most expensive components for usual models are the loop calculation and the projections. For the projections, the implementation follows Ref. [23] with a slight improvement: the calculation of the expression is split into two parts. Taking the projection from C to P as an example, the first step is technically a Fourier transformation onto a restricted mesh. In a second step, we multiply the vertex in real space with the prefactors. These are calculated in advance and require only little memory. In the standard form this is parallelized with OpenMP; when enabling MPI, the calculation is split along different $q$, and at the end of step one a call to MPI_Allreduce is performed. If GPUs are available, the second step is performed with a hand-written kernel.
For the calculation of the loop we utilize a Fourier transformation trick [55,78,79] in combination with a sharp frequency cutoff. Again, the implementation uses OpenMP for parallelization over orbitals, spins, and bonds. An optimized version for MPI (based on FFTW-MPI) is provided at compile time. In case of GPU usage, a hand-written kernel is used for the calculation of the Green's function products and combined with calls to cuFFT. The flow product is evaluated using GEMM calls, which are distributed along the coarse momentum index and offloaded to GPUs if available. Depending on the chosen model and parameters, the leading scaling is either the loop, $O(N_q N_k \log(N_q N_k) N_{ff}^2 N_o^2 N_s^4)$, the flow step matrix products, $O(N_q N_{ff}^3 N_o^3 N_s^6)$, or the interchannel projections, $O(N_o^2 N_{ff}^3 N_s^4 N_q)$. Furthermore, the memory usage of a calculation scales with $O(N_q N_o^2 N_{ff}^2 N_s^4)$. Analogously, we use Fourier transformations for the calculation of the static self-energy.
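The idea behind such a Fourier transformation trick is the convolution theorem: a momentum sum over a product of two propagators becomes a pointwise product after transformation. The Python sketch below checks this on a periodic one-dimensional toy mesh for a particle-particle-like sum $L(q) = \sum_k g(k)\, g(q-k)$. The mesh, the values of g, and the function names are made up, and the naive $O(N^2)$ transform only keeps the example free of dependencies; it stands in for the FFT libraries (FFTW, cuFFT) named in the text.

```python
import cmath


def dft(x, sign=-1):
    """Naive discrete Fourier transform; sign=+1 gives the unnormalized inverse."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def loop_direct(g):
    """L(q) = sum_k g(k) g(q - k) on a periodic 1D mesh, evaluated directly."""
    n = len(g)
    return [sum(g[k] * g[(q - k) % n] for k in range(n)) for q in range(n)]


def loop_fourier(g):
    """Same quantity: transform, multiply pointwise, transform back."""
    n = len(g)
    g_r = dft(g)
    product = [a * a for a in g_r]
    return [v / n for v in dft(product, sign=+1)]


g = [1 + 0j, 2 - 1j, 0.5j, -1 + 0j, 3 + 2j]
err = max(abs(a - b) for a, b in zip(loop_direct(g), loop_fourier(g)))
print(err)
```

With a fast Fourier transform in place of the naive one, the cost of the momentum sum drops from quadratic to $N \log N$, which is the origin of the logarithmic factor in the loop scaling quoted above.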

D.4 Scaling and runtime
To give potential users a better feeling for what can and cannot be done with divERGe, we present scaling plots and give reference runtimes for a three-band Emery model. The form-factor cutoff distance is set to 1.2 and the number of coarse and fine momentum points is varied, see Fig. 7. We observe a significant speedup when computing on GPUs, though the GPU speedup is not optimal (bottlenecks: host-device communication and parts of the algorithms that run on CPUs). The CPU algorithm has its main bottleneck in communication, more precisely in the calculation of the loop (via FFTW-MPI). Note that the momentum resolutions treated in Fig. 7 are far beyond what is necessary for most calculations. A fully converged standard calculation (3 orbitals, 127 form-factors, $32^2$ momentum points, and $(32 \times 55)^2$ integration points for the loop) reaching $\Lambda = 10^{-3}$ on a single JURECA DC-GPU node (4 NVIDIA A100 GPUs) takes around 5 minutes.

E Example codes used in this manuscript
The data used in Fig. 1 (and Listing 1) were generated using the Python code given in Appendix E.1.Figure 2 (plotted via Listing 2) contains data generated using the C++ code from
    if (diverge_model_validate(model))
        printf("something went wrong!\n");
    diverge_model_internals_common(model);

    1 void diverge_model_internals_grid(diverge_model_t* m);
    2 void diverge_model_internals_patch(diverge_model_t* m, index_t np_ibz);
    3 void diverge_model_internals_tu(diverge_model_t* m, double maxdist);

which initialize the backend-specific internals for the grid (l. 1), N-patch (l. 2), or TUFRG (l. 3) backend. Be aware that in the case of N-patch FRG, a Fermi surface patching is generated automatically. This renders the above function sensitive to the model's chemical potential $\mu$. Users are thus expected to adjust the value of $\mu$ (see Section 3.4) before the backend-specific internals are set, but after the common ones, since the call to

    void diverge_model_internals_common(diverge_model_t* m);

        const double* E, index_t nb);
    double diverge_model_set_filling(diverge_model_t* model,
        double* E, index_t nb, double nu);
    void diverge_model_set_chempot(diverge_model_t* model,
        double* E, index_t nb, double mu);

    double diverge_symmetrize_2pt_coarse(diverge_model_t* model,
        complex128_t* buf, complex128_t* aux);
    double diverge_symmetrize_2pt_fine(diverge_model_t* model,
        complex128_t* buf, complex128_t* aux);
    double diverge_symmetrize_mom_coarse(diverge_model_t* model,
        double* buf, index_t sub, double* aux);
    double diverge_symmetrize_mom_fine(diverge_model_t* model,
        double* buf, index_t sub, double* aux);

    char* diverge_model_to_file(diverge_model_t* m, const char* name);
    char* diverge_model_to_file_finegrained(diverge_model_t* m, const char* name,
        const diverge_model_output_conf_t* cfg);

Figure 1: Band structure of a three-band Emery model plotted with the script given in Listing 1. The model parameters (and the file model.dvg) are those from the Emery example: examples/python_tutorial/emergent-3-model-details.py in the git repository of divERGe [18] (or Appendix E.1).

    // check!
    if (diverge_model_validate(model))
        diverge_mpi_exit(EXIT_FAILURE);
    // finalize model and save to disk
    double filling = 0.6,
           ffdist = 2.1;
    diverge_model_internals_common(model);
    diverge_model_set_filling(model, NULL, -1, filling);
    diverge_model_internals_tu(model, ffdist);
    diverge_model_output_conf_t cfg = diverge_model_output_conf_defaults_CPP();
    cfg.E = true;
    cfg.npath = -1;
    cfg.kf_ibz_path = 1;
    diverge_model_to_file_finegrained(model, "graphene_model.dvg", &cfg);

    diverge_flow_step_t* diverge_flow_step_init(diverge_model_t* model,
        const char* mode, const char* channels);
    diverge_flow_step_t* diverge_flow_step_init_any(diverge_model_t* model,
        const char* channels);

One integration step starting at $\Lambda$ and going to $\Lambda + d\Lambda$ is calculated by calling

    void diverge_flow_step_euler(diverge_flow_step_t* step, double Lambda,
        double dLambda);

    diverge_flow_step_t* step = diverge_flow_step_init_any(model, "PCD");
    diverge_euler_t eu = diverge_euler_defaults_CPP();
    ...
        diverge_flow_step_euler(step, eu.Lambda, eu.dLambda);
        diverge_flow_step_vertmax(step, &vmax);
        diverge_flow_step_chanmax(step, cmax);
        mpi_printf("%.5e %.5e %.5e %.5e %.5e\n", eu.Lambda, cmax[0], cmax[1], cmax[2], vmax);
    } while (diverge_euler_next(&eu, vmax));
    ...
    diverge_flow_step_free(step);

Figure 2: Visualization of the $d_{xy}$ and $d_{x^2-y^2}$ superconducting gap functions of the honeycomb lattice Hubbard model calculated in N-patch FRG (examples/c_cpp/honeycomb.cpp [18] or Appendix E.2). The plot is generated with the small Python script presented in Listing 2. We encode the complex phase as color, and the magnitude as opacity. Notice how most of the orbital weight of the gap function is on the off-diagonal elements $\Delta_{12}(k)$ and $\Delta_{21}(k)$.

    void diverge_postprocess_and_write(diverge_flow_step_t* s,
        const char* name);
    void diverge_postprocess_and_write_finegrained(diverge_flow_step_t* s,
        const char* name, const diverge_postprocess_conf_t* cfg);

perform an optimized set of these post-processing tasks and write the results to disk. Note that we take the same approach as for diverge_model_t (cf. Section 3.5) in how the default behavior can be changed by calling the fine-grained function with an additional configuration structure. Details on the available parameters and file formats are found in Appendix C.2; a practical example is given in Example 7. All post-processing and model (cf. Section 3.5) files can be read into Python classes using the library diverge.output. The file type is automatically recognized. As a demonstration, we plot the (negative) leading P-channel eigenvectors of a honeycomb lattice Hubbard model close to van Hove doping (the parameters are equivalent to those given in Fig. 4 (b) of Ref. [52], at filling $\nu = 0.6$). The few lines of Python in Listing 2 (mostly matplotlib) result in the visual representation of the two degenerate superconducting $d_{xy}$- and $d_{x^2-y^2}$-wave states shown in Fig. 2.
Example 7: Doing post-processing given a flow step instance. The default parameters (l. 90) can be changed (ll. 91-93) and passed to the post-processing routine (l. 94). The code obtained from merging the snippets given in Examples 1 to 7 can be found in examples/c_cpp/honeycomb_tutorial.cpp [18].

    90 diverge_postprocess_conf_t out = diverge_postprocess_conf_defaults_CPP();
    91 out.tu_storing_relative = true;
    92 out.tu_storing_threshold = 0.8;
    93 out.tu_which_solver_mode = 's';
    94 diverge_postprocess_and_write_finegrained(step, "graphene_post.dvg", &out);

RTG 1995, within the Priority Program SPP 2244 "2DMP" - 443273985 and under Germany's Excellence Strategy - Cluster of Excellence Matter and Light for Quantum Computing (ML4Q) EXC 2004/1 - 390534769. LK gratefully acknowledges support from the DFG through FOR 5249 (QUAST, Project No. 449872909, TP5).

    mom_patching_t* diverge_patching_from_indices(diverge_model_t* m,
        const index_t* fs_pts, index_t n_fs_pts);

    int (*channel_vertex_generator_t)(diverge_model_t* model, char channel,
        complex128_t* buf);

    diverge_euler_t euler = diverge_euler_defaults; // or change the defaults
    double vmax = 0.0;
    do {
        diverge_flow_step_euler(step, euler.Lambda, euler.dLambda);
        diverge_flow_step_vertmax(step, &vmax);
        // do something with the vertices here
    } while (diverge_euler_next(&euler, vmax));

    ...
        ... consider_maxvert_iter_start;
        double consider_maxvert_lambda;
    };

    typedef struct diverge_model_output_conf_t {
        ...
    } diverge_model_output_conf_t;

Figure 3: File format header for the files written by diverge_model_output.

Figure 4: Specification of the file format for grid-FRG postprocessing output files. The header (64 signed 64-bit integers) is followed by data indexed by the displacement/size information. Note that, unlike model output files (cf. Fig. 3), displacement and size information is given in units of 64 bits, i.e., 8 bytes. For grid-FRG, the user has control over saving the susceptibilities for each interaction channel as well as the solutions of a linearized gap equation at $q_X = 0$.

Figure 5: Specification of the file format for N-patch FRG postprocessing output files. The header (128 signed 64-bit integers) is followed by data indexed by the displacement/size information. Note that, unlike model output files (cf. Fig. 3), displacement and size information is given in units of 64 bits, i.e., 8 bytes. For N-patch FRG, the user has control over saving the vertex ($V$), the loops ($L_{PP}$, $L_{PH}$), the differential vertex ($dV$), and a channel-native representation of the vertices, optionally eigendecomposed. As the N-patch representation of channel-native interactions is non-trivial, the data is further serialized, with a sketch on how to read the individual matrices or eigenvalues/eigenvectors on the bottom right. The Python library diverge.output automatically deserializes those objects.
Listing 3: C example program that reads the dispersion array from a divERGe model output file (model.dvg) to illustrate how to deal with the file format from languages that are not Python.
Unless specified otherwise, all header variables are of the index_t type. The data section following the header contains all arrays specified through their displacement (displ, given in bytes) and size (given in number of elements of the respective type). All arrays are of complex128_t type, except for mi_to_ofrom, mi_to_oto, mi_to_R, bond_sizes, bond_offsets, idx_ibz_in_fullmesh, mi_map_idx, and all arrays containing off or len in their name, which are of type index_t. More specific information on the array shapes is found in the diverge.output library.
) in memory, with bottlenecks usually being communication, memory size, and batched matrix matrix products.