SmoQyDQMC.jl: A flexible implementation of determinant quantum Monte Carlo for Hubbard and electron-phonon interactions

We introduce the SmoQyDQMC.jl package, a Julia implementation of the determinant quantum Monte Carlo algorithm. SmoQyDQMC.jl supports generalized tight-binding Hamiltonians with on-site Hubbard and generalized electron-phonon interactions, including non-linear $e$-ph coupling and anharmonic lattice potentials. Our implementations use hybrid Monte Carlo methods with exact forces for sampling the phonon fields, enabling efficient simulation of low-energy phonon branches, including acoustic phonons. The SmoQyDQMC.jl package also uses a flexible scripting interface, allowing users to adapt it to different workflows and interface with other software packages in the Julia ecosystem. The code for this package can be downloaded from our GitHub repository at https://github.com/SmoQySuite/SmoQyDQMC.jl or installed using the Julia package manager. The online documentation, including examples, can be obtained from our document page at https://smoqysuite.github.io/SmoQyDQMC.jl/stable/.


Overview & scope
This paper introduces SmoQyDQMC.jl,a user-friendly Julia implementation of the determinant quantum Monte Carlo (DQMC) algorithm [1,2], and its associated auxiliary packages.
The SmoQyDQMC.jlpackage is capable of simulating a broad class of tight-binding Hamiltonians with on-site Hubbard and/or generalized electron-phonon (e-ph) interactions.The model Hamiltonians can be defined on arbitrary lattice geometries and a wide variety of measurements can be performed for these simulations, including density-density, spin-spin, currentcurrent correlation functions and pairing for different symmetries.The code has been designed with a scripting interface inspired by packages like ITensor.jl [3], rather than a configuration file-based workflow commonly found in open-source implementations of DQMC [4,5].This design allows users to easily incorporate SmoQyDQMC.jlinto complex workflows and directly interface it with the Julia programming language's rich ecosystem of scientific computing and machine learning packages.Crucially, our implementation accomplishes this flexibility without sacrificing performance, with SmoQyDQMC.jlachieving ideal O(β N 3 ) scaling in computational complexity in the system size N and inverse temperature β.
This document provides detailed descriptions of SmoQyDQMC.jl'sdesign philosophy and underlying algorithms.It is also intended to serve as a cite-able document when using this code for research.This document is not intended to serve as a detailed user manual for the package.Instead, we encourage readers to consult our online documentation, which will be maintained as a living document with new examples and reference material added continuously over the package's lifetime.

Background and motivation
Quantum Monte Carlo (QMC) algorithms are a powerful family of methods for performing numerically exact nonperturbative simulations of quantum many-body condensed matter systems .These methods come in many flavors, including zero temperature projection and variational Monte Carlo methods, to finite-temperature auxiliary field methods like DQMC or continuous-time QMC and beyond.
Several open-source implementations of the DQMC algorithm have been developed over the years with support for different classes of Hamiltonians.Perhaps the most popular and well-known are the Algorithms for Lattice Fermions (ALF) [5] and QUantum Electron Simulation Toolbox (QUEST) [4] projects.The QUEST package supports single-and multi-orbital Hubbard Hamiltonians defined on arbitrary lattice geometries.The ALF package provides support for a much broader class of Fermionic interactions, as well as coupling to classical or quantum bosonic fields, which enables simulations of standard e-ph Hamiltonians like the Holstein model.Both QUEST and ALF are currently implemented in Fortran90, which hinders their integration with modern machine learning and scientific computing packages. 2he SmoQyDQMC.jlpackage is a Julia implementation of the DQMC algorithm, with support for tight-binding Hubbard Hamiltonians with and without e-ph interactions.The code leverages hybrid Monte Carlo (HMC) methods [91][92][93] to sample the phonon fields, which allows it to treat a much broader class of e-ph interactions.The package also includes additional features that are not currently included in existing DQMC implementations.Specifically, the initial release of this package includes support for: 1. Arbitrary lattices and bases in zero-, one-, two-, and three-dimensions, 2. intra-orbital Hubbard interactions in any site/orbital in the lattice, 3. real or complex hopping parameters to enable the introduction of magnetic fields or boundary condition averaging to reduce finite size effects, 4. Hamiltonians with spin-dependent tight-binding and e-ph interaction parameters, 5. fully momentum-dependent e-ph interactions, including long-range Holstein-and SSHlike couplings, 6. coupling to multiple phonon branches, either via the same or different microscopic coupling mechanisms, 7. low-energy optical and acoustic phonon branches, 8. nonlinear e-ph interactions and anharmonic lattice potentials up to fourth order in the atomic displacements, 9. dynamical tuning of the chemical potential to archive a targeted density 〈n〉, 10. spatial disorder in any Hamiltonian parameter, and 11. support for measuring a wide range of common observables on arbitrary lattice geometries.
Importantly, SmoQyDQMC.jl is user friendly and utilizes a scripting interface rather than more traditional workflows based on input configuration files.This design allows users to implement straightforward parallelization at the script level, adapt the package to their existing workflows, and more readily interface SmoQyDQMC.jlwith the rich ecosystem of scientific computing packages being actively developed in the Julia programming language.For example, it can be readily coupled to existing machine learning and artificial intelligence packages to enable new research in this direction [94].
If the SmoQyDQMC.jlpackage is unable to meet the needs of a particular user, we have also provided two lower-level supporting packages.The first package, JDQMCFrameworks.jl,implements the core computational kernel of the DQMC algorithm.The second package, JDQMCMeasurements.jl, implements a set of functions for measuring various correlation functions for arbitrary lattice geometries in a DQMC simulation.This package also exports several additional utility functions for transforming measurements from position space to momentum space, and also measuring susceptibilities by integrating correlation functions over the imaginary time axis.These two lower-level packages were used to develop the SmoQyDQMC.jlpackage, and provide the tools necessary for a user to develop their own specialized implementations of the DQMC algorithm.

Relevant links, documentation, and reporting
The source code for the SmoQyDQMC.jlpackage and its associated auxiliary packages can be found on the SmoQy Suite's GitHub page [95].As a package registered with Julia programming language's General registry, the package can also be installed using the Julia package manager by issuing the command

julia> ] pkg> add SmoQyDQMC
The package's documentation, including examples, can be found in our online documenation [96].It includes full API and a growing list of example scripts that can be used to run several model Hamiltonians of interest.The source code for the JDQMCFrameworks.jl[97] and JDQMCMeasurements.jl [98] packages can also be found on the SmoQy Suite's GitHub page.

Supported hamiltonians 2.1 Specifications
This section describes the class of Hamiltonians currently supported by the SmoQyDQMC.jlpackage, and how the various terms appearing in the Hamiltonian are parameterized within the code base.Throughout this section, we use bold roman indices (e.g., i, j, . . . ) to index the unit cell within the cluster, Greek symbols (e.g., ν, γ, µ, . . . ) to index different orbitals within the atomic basis, and n i,ν to index different phonon species on orbital ν of unit cell i.Throughout, we normalize ħ h = 1 and denote N = N • n as the total number of orbitals in the lattice, where N is the number of unit cells, and n is the number of orbitals per unit cell.
For our purposes, it is convenient to partition the full Hamiltonian into three terms where Û describes the non-interacting lattice (phonon) degrees of freedom and K and V are the total electron kinetic and potential energies, respectively.Note that both K and V can depend on the dynamical lattice coordinates, leading to an electron-phonon coupling that is either diagonal or off-diagonal in the orbital basis.V also includes any contributions from the intra-orbital Hubbard repulsion on a given site.The non-interacting lattice terms are further subdivided into the sum of three terms The first term describes the placement of local quantum harmonic oscillator (QHO) modes on sites in the lattice, i.e. an Einstein solid, while the second term introduces anharmonic contributions to the oscillator potential.The third term introduces coupling (or dispersion) between the QHO modes.The sums over i (j) and ν (γ) run over unit cells in the lattice and basis orbitals within each unit cell, respectively.The sum over n i,ν (n j,γ ) runs over the different flavors of QHO modes placed on a given orbital in the lattice.
The position and momentum operators for each QHO mode are Xn i,ν and Pn i,ν , respectively, with corresponding phonon mass M n i,ν .The spring constant is , with Ω 0,n i,ν specifying the phonon frequency.Ûanh then introduces an anharmonic X 4 n i,ν contribution to the QHO potential energy that is controlled by the parameter Ω a,n i,ν .Similarly, Ω0,n i,α ,n j,γ ( Ωa,n i,α ,n j,γ ) is the coefficient controlling harmonic (anharmonic) dispersion between QHO modes.
The electron kinetic energy is conveniently expressed as The first term describes the non-interacting spin-σ electron kinetic energy K0,σ = − i,ν j,γ t σ,(i,ν),(j,γ) ĉ † σ,i,ν ĉσ,j,γ + h.c., (7) where t σ,(i,ν),(j,γ) is the spin-σ hopping integral from orbital γ in unit cell j to orbital ν in unit cell i, and may be real or complex.The second term describes the interaction between the lattice degrees of freedom and the spin-σ electron kinetic energy via a Su-Schrieffer-Heeger (SSH)-like coupling mechanism [99,100] Kssh,σ = i,ν j,γ Here, the modulations of the spin-σ hopping integrals to m th (= 1 − 4) order in displacement are controlled by the parameters α σ,m,n i,ν ,n j,γ .Lastly, the electron potential energy is expressed as is the non-interacting spin-σ electron potential energy.Here, µ is the chemical potential and ε σ,i,ν is the spin-σ on-site energy for orbital ν in unit cell i.

The second term
Vσ,hol = is the contribution to the spin-σ electron potential energy that results from a Holstein-or Fröhlich-like coupling to the lattice degrees of freedom.The parameter ασ,m,n i,ν ,(j,γ) controls the strength this coupling in the Vσ,hol term.It is important to note that the two available parametrizations shown in Eq. ( 11) that are available in SmoQyDQMC.jlare inequivalent.
Finally, the third term is the on-site Hubbard interaction contribution, where U i,ν is the on-site Hubbard interaction strength.Note that SmoQyDQMC.jlallows the user to parameterize the Hubbard interaction using either functional form for Vhub .The top-most is particle-hole symmetric and is often useful at half-filling.

A flexible approach to electron-phonon coupling
The class of Hamiltonians supported by SmoQyDQMC.jl is flexible enough to accommodate the simulation of most standard e-ph model Hamiltonians.For example, the canonical single-band Holstein model with coupling to a single Einstein phonon branch can be obtained by placing a single QHO mode on each site (n i,ν = 1) and setting Ûanh = Ûdisp = Kssh = 0, while retaining the Vhol term [101].Similarly, the optical single-band SSH model in D-dimensions [64] can be arrived at by setting Ûanh = Ûdisp = Vhol = 0 but requires D QHO modes per site.
The bond SSH model can also be expressed by coupling pairs of QHO modes, one with finite mass and the other with infinite mass, effectively associating the finite mass QHO mode with a bond [64,102].The acoustic SSH model can be expressed by introducing QHO modes with zero frequency (Ω 0,n i,ν = 0), that are then coupled together with the Ûdisp term [100].Importantly, SmoQyDQMC.jlallows users to define Hamiltonians that combine these various models, enabling the simulation of systems with multiple phonon branches that can each couple to the electrons in different ways.

Algorithm details
This section provides details on the various algorithms used in the SmoQyDQMC.jlpackage.
Here, we have taken an axiomatic approach and focused on describing what the algorithms do rather than providing detailed derivations or justifications for their correctness.Instead, we have provided several references throughout the text for any reader who is interested in the relevant derivations.

Formulation of DQMC algorithm
DQMC is an auxiliary field QMC method for simulating systems of itinerant fermions on a lattice in the grand canonical ensemble [1,2].For an inverse temperature β = 1/T (k B = 1), the algorithm adopts a discrete imaginary time grid τ = ∆τ • l, where l = 1, . . ., L τ indexes the imaginary time-slice and ∆τ = β/L τ .Using this grid, DQMC then expresses the partition function as a path-integral in imaginary time where B = e −∆τ Ĥ is the imaginary-time propagator over the discretization interval ∆τ.
Next, the Suzuki-Trotter (ST) approximation is applied to all imaginary-time slices l to factorize the various terms appearing in the exponentiated Hamiltonian [103].The resulting form for the partition function is where the B ≈ B + O(∆τ 3 ) approximation is used.By applying cyclic property of the trace it is straightforward to see that Eq. ( 14) is left unchanged if the lower order B ≈ B + O(∆τ 2 ) approximation is used instead.SmoQyDQMC.jlcan run simulations based on applying either Eq. ( 15) or (16).
In cases where the kinetic energy operator K is static, a formulation based on Eq. ( 16) is usually preferable as the computational overhead is lower.However, when e-ph interactions modulate the hopping amplitudes and the checkerboard approximation is applied, it is better to use a formulation based on Eq. ( 15) as it improves the efficacy of this additional approximation; this aspect is discussed in greater detail in Sec.3.5.At this point, the only term in the Hamiltonian that is not quadratic in Fermion creation and annihilation operators is Vhub .This term can be rendered quadratic using the discrete Ising Hubbard-Stratonovich (HS) transformation [104,105] (or other variants [106][107][108]) where mσ = 2n σ − 1, η = U/|U|, and cosh α = e 1 2 ∆τ|U| .This decoupling is valid for both a repulsive (U > 0) and attractive (U < 0) local Hubbard interaction.Applying the Ising HS transformation at each imaginary-time introduces Ising HS variables s l,i,ν at every orbital with a finite Hubbard interaction in the lattice, for every imaginary time slice l.
Next, we evaluate the trace over the lattice degrees of freedom.This task is best accomplished in the position basis where we can analytically integrate out the phonon momentum operators Pn i,ν and replace the phonon position operators Xn i,ν by scalar phonon fields x l,n i,ν for all phonon modes n i,ν and imaginary-time slices l [56,61,63,67].
After these operations are performed, each term appearing in the exponentiated Hamiltonian is quadratic in Fermionic operators, and may be written in the form where ĉ † σ = [ĉ † σ,1,1 , . . ., ĉ † σ,N ,n ] is a row-vector of creation operators for each of the N orbitals in the system, and ĉσ is the corresponding column vector of annihilation operators.Therefore, O σ,l is a N × N Hermitian matrix.
The resulting expression for the partition function is with s and Dx denoting the path integral over all Ising HS fields s l,i,ν and all phonon fields x l,n i,ν , respectively.The propagator operators now explicitly depend on the spin σ and imaginary-time slice l and are given by Bσ,l = e − ∆τ 2 Kσ,l e −∆τ Vσ,l e − ∆τ 2 Kσ,l , or Bσ,l = e −∆τ Vσ,l e −∆τ Kσ,l , ( depending on whether the approximation from Eq. (15) or Eq. ( 16) is used, respectively.The dependence on the fields s l,i,ν and x l,n i,ν appear as Kσ,l = K0,σ + Kssh,σ,l = K0,σ + Kssh,σ (x l,n i,ν ) , (22) in the electron kinetic energy, and Vσ,l = V0,σ + Vhol,σ,l + Vhub,σ,l = V0,σ + Vhol,σ (x l,n i,ν ) + Vhub,σ (s l,i,ν ) , (23) in the electron potential energy, respectively.Employing the Blankenbecler, Scalapino and Sugar (BSS) relation [1], we integrate out the Fermionic degrees of freedom to arrive at our final expression for the partition function where S B (x) is the strictly bosonic action associated with the lattice degrees of freedom and M σ (τ) is the spin-σ Fermion determinant matrix.The latter is given by where I is the identity matrix, and B σ,l are the spin-σ propagator matrices for imaginary-time slice l.Note that det M σ (τ) is the same for all values of τ.The propagator matrices take a similar form to the propagator operators appearing in Eq. ( 20) and Eq. ( 21), with or where K σ,l and V σ,l are the spin-σ electron kinetic and potential energy matrices for imaginarytime slice l, respectively.Each K σ,l is strictly off-diagonal and Hermitian, while the V σ,l matrices are diagonal.The total bosonic action S B (x) appearing in Eq. ( 24) can be conveniently expressed as a sum of four terms The first term is the contribution to the bosonic action arising from the QHO term Ûqho in the Hamiltonian, as defined by Eq. ( 3), and the term corresponds to the anharmonic lattice potential term Vanh defined in Eq. ( 4).The third term arises from dispersive couplings between QHO modes described by Ûdisp , defined in Eq. ( 5).The final term arises due to the manner in which the Holstein-like interactions appearing in Vhol are parameterized in Eq. ( 11).
The high-dimensional integral appearing in Eq. ( 24) is not analytically tractable, but lends itself to a numerical solution.In particular, DQMC simulations perform a Monte Carlo sampling of the HS fields s and phonon fields x, using as Monte Carlo weights the argument of the integral in Eq. ( 24), We note, however, that the Monte Carlo weights as determined by Eq. ( 33) are not strictly positive.They can take on both positive and negative values in many cases, and can become complex if other types of HS transformations are used.This aspect leads to the well-known Fermion sign problem [109][110][111].To circumvent the negative weights, DQMC instead takes the absolute value of Eq. ( 33) as the Monte Carlo weights Specifically, applying the Metropolis-Hastings criteria, the acceptance probability for updating the field configuration from (s, x) to (s ′ , x ′ ) is given by However, using W (s, x) necessitates employing a reweighting procedure to recover unbiased measurements from a simulation; this will be discussed more in Sec.3.9.

DQMC simulation overview
This section briefly outlines the overarching structure of a DQMC simulation.By design, this structure closely mirrors how scripts using SmoQyDQMC.jlare written to perform simulations.
We have provided numerous examples of such scripts in the online documentation, and Algorithm (1) provides an overview of what this structure might look like.The goal of a DQMC simulation is to faithfully sample the relevant fields according to the probability distribution described by Eq. ( 34).In the case of SmoQyDQMC.jl,these fields correspond to either the Ising HS fields that result from decoupling the Hubbard interaction or the phonon fields.The fields are initialized to a random configuration at the beginning of a simulation.Then, updates to the field configurations are proposed using various methods, which are either accepted or rejected with a probability given by Eq. ( 35).Sec.3.6 outlines an efficient procedure for updating the Ising HS fields, whereas sections 3.7 and 3.8 describe methods for efficiently updating the phonon fields.
Another important part of the simulation is making measurements as field configurations are sampled.However, measurements should not be performed at the start of a simulation as the initial field configuration is typically far from the target equilibrium distribution described by Eq. (34).Therefore, the fields are first updated N therm.times during an initial thermalization period to equilibrate the field configurations to the target distribution.The remainder of the simulation is then broken into N bin intervals, with n bin updates performed per interval, as shown in Algorithm (1).After each set of updates to the field configurations, measurements are made and recorded.Electronic correlation measurements are made using the single-particle electron Green's function (which is defined and discussed in Sec.3.3).At the end of each interval, the average of the previous n bin measurements are written to the file.At the end of the simulation, these interval-averaged measurements are processed to generate final estimates for the expectation value of measured observables; the details of this analysis are discussed in Sec.3.9.Algorithm (1) also shows how the chemical potential can be updated during a simulation to achieve a target electron density.Sec.3.10 gives more details on this functionality.
Finally, the algorithms for updating fields and measuring electronic correlation functions require reliably calculating the single-particle electron Green's function matrices defined in Sec.3.3.A naive approach to performing these calculations often results in numerical instabilities, rendering the result unreliable.Sec.3.4 describes efficient algorithms for suppressing these numerical instabilities that would otherwise prevent the DQMC algorithm from functioning correctly.

Single-particle electron Green's functions
A central quantity in DQMC simulations is the imaginary-time single-particle Green's function.This section provides a brief review of the definitions of this quantity and its properties.
The spin-σ electron Green's function in the orbital basis is given by where T is the imaginary-time ordering operator.Eq. ( 36) describes the creation of a spin-σ electron at orbital γ in unit cell j at imaginary-time τ ′ , and annihilation at orbital ν in unit cell i at imaginary time τ.The imaginary-time axis spans the interval 0 ≤ (τ, τ ′ ) < β, and the electron Green's function is additionally subject to the aperiodic boundary condition when τ ̸ = τ ′ .Given a fixed set of field configurations s and x, the equal-time Green's can be related to the matrix elements of the Fermion determinant matrix by where we have used the short-hand notation with B σ (τ, τ − ∆τ) = B σ,l and B σ (τ, τ) = I.Therefore, subject to the boundary condition The equal-time Green's function matrix G σ (τ, τ) can be propagated to adjacent imaginary-time slices in the forward and reverse directions using the relations and respectively.These relationships between the Green's functions on adjacent time-slices are the basis for an efficient updating scheme when performing Metropolis-Hastings or Heat-bath sampling [2], as discussed in Sec.3.6.The time-displaced (or unequal-time) Green's functions G ν,γ σ,i,j (τ, 0) and G ν,γ σ,i,j (0, τ) correspond to the matrix elements of and respectively, where we have applied the boundary condition In similar fashion to the equal-time Green's function matrices, the time-displaced Green's function matrices can be propagated in the forward and reverse imaginary-time directions using the relations and Specific boundary conditions arise as a result of the aperiodic boundary condition set by Eq. (37), and the definition of the Fermionic Green's function in Eq. (36).In particular, the time-displaced Green's function matrices satisfy where it is again assumed that τ > 0.
Lastly, again assuming a fixed HS and phonon field configuration, higher order correlation functions can be measured by applying Wick's theorem to express them as sums of products of the single-particle electron Green's functions [12,112].

Numerically stable framework for DQMC simulations
The procedure for updating the HS (s l,i,ν ) and phonon (x l,n i,ν ) fields requires calculating the equal-time Green's function matrices G σ (τ, τ) for all imaginary time slices l ∈ [1, L τ ].Similarly, performing measurements of any imaginary time-displaced correlation functions requires calculating the imaginary time-displaced Green's functions G σ (τ, 0) and G σ (0, τ).
A straightforward approach for computing these quantities is outlined in Algorithm 2. Unfortunately, this naive approach fails due to well-documented [2,[113][114][115][116] numerical instabilities associated with evaluating the ill-conditioned products of B σ,l matrices.Specifically, repeated matrix multiplication by the propagator matrices B σ,l accumulates numerical errors that quickly become severe.These numerical errors appear both when attempting to evaluate the B σ (β, 0) term appearing in Eq. (40), and also as a result of repeated applications of Eqs. ( 42), (47), and (48).The errors are then further amplified when matrix inversions are performed, as in Eq. (40).
Practical implementations of the DQMC algorithm have to overcome these numerical instabilities by introducing stable matrix factorizations [2,[113][114][115][116].The SmoQyDQMC.jlpackage uses stabilization procedures based on those introduced in Ref. [114] and further discussed in Ref. [116].Our package also stores intermediate matrix products to improve the algorithm's efficiency [5,117,118], an approach described in greater detail below.To outline this procedure, we first introduce the L DR matrix factorization, which represents products of propagator matrices B σ,l and is based on the column-pivoted QR factorization.For a non-singular square matrix A, its column-pivoted QR factorization is given by where Q is a unitary matrix, R ′ is an upper-triangular matrix, and P is a permutation matrix.
The corresponding A = L DR factorization is then defined as a product of matricies Algorithm 2 Numerically Unstable Forward Propagation Framework for DQMC Simulations Calculate G σ (0, 0) using Eq. ( 40): Apply Eq. ( 47): [Insert Updates to Fields or Time-Displaced Correlation Measurements Here.]

end for
Here, both the unitary matrix L and matrix R are well-conditioned matrices, and D is a diagonal matrix.To further improve numerical stability, it is useful to factorize D as where D max = max(D, 1) and D min = min(D, 1) [114].
Next, the imaginary-time axis of length L τ is split into N s = L τ /n s intervals of length n s .The relative position within interval n ∈ [1, N s ] is given by l n ∈ [1, n s ], where the corresponding imaginary-time slice is l = l n + (n − 1)n s .For simplicity's sake, here we have assumed mod(L τ , n s ) = 0.While this will not be the case in general, it is relatively straightforward to generalize the algorithms outlined below to the case that this assumption does not hold.
The composite propagator matrix associated with an interval n is given by with the product of Bσ,n matrices represented by where n > n ′ and Bσ,n,n−1 = Bσ,n .The corresponding L DR factorization for Bσ,n,n ′ is denoted by Fσ,n,n Finally, Fσ denotes an array of L DR factorizations of length N s to store sequential Fσ,n,n ′ factorizations.
With these definitions in hand, we now turn our attention to Algorithms (3a) and (3b) for propagating in imaginary-time the equal-time and time-displaced Green's function matrices in the forward (τ = 0 → τ = β) and reverse (τ = β → τ = 0) directions, respectively.In particular, note that the input state of that array F in Algorithm (3b) matches the output state from Algorithm (3a).Likewise the input state of F in Algorithm (3a) matches the output state from Algorithm (3b).Therefore, it is necessary to alternate the application of Algorithms (3a) and (3b).This results in a DQMC updating procedure that alternates sweeping forward and backward in imaginary time rather than cyclically.The advantage of this approach is that it allows the DQMC algorithm to retain a computational cost that scales linearly with β, while remaining numerically stable [117].The parameter n s is now understood to be the period in imaginary-time with which the Green's function matrices are re-computed using a more expensive, but numerically stable procedure.
In practice, it is important to monitor the numerical stability of a simulation to determine whether n s needs to be decreased to increase numerical stability, or increased to reduce computational overhead.To accomplish this task, let G stable σ (τ, τ) correspond to the equal-time Green's matrix that is recomputed using a numerically stable procedure, while G naive σ (τ, τ) represents the same matrix, but generated via simple propagation in imaginary time using Eq.(42) or Eq.(43).We then keep track of the maximum element-wise difference during a simulation.We would like δG σ remain below some maximum threshold δG σ < δG max , with δG max ≈ 10 −6 a sufficient upper bound in most cases.Typical values for the period of numerical stabilization that satisfy this stability condition are n s ∼ 5 − 10, but depend on the

end if end if end for
Output: Fσ = Fσ,1,0 , Fσ,2,0 , . . ., Fσ,N s ,0 strength of the interactions, inverse temperature β, and ∆τ.When this condition is being violated in the course of a DQMC simulation, it is an indication that n s may need to be reduced.Conversely, if δG σ ≪ δG max throughout the simulation, then n s can often be increased to reduce the simulation's run time.SmoQyDQMC.jloffers useful functionality for reducing n s dynamically during a simulation if instances of δG σ > δG max are detected too frequently.However, it is worth noting that DQMC simulations performed with SmoQyDQMC.jlretain a computational cost that scales linearly with β for any n s .Rather, reducing n s simply increases the computational prefactor associated with the linear dependence on β.

The checkerboard approximation
The SmoQyDQMC.jlpackage makes heavy use of the checkerboard approximation [119].For example, this approximation is necessary for efficiently simulating optical SSH e-ph interactions, where the hopping integrals are modulated by the atomic displacements of phonon Evaluate Eq. (44b) using routine A.5: Evaluate Eq. (45b) using routine A.5:

end if end if end for
Output: Fσ = Fσ,N s ,0 , Fσ,N s ,1 , . . ., Fσ,N s ,N s −1 modes defined to live on the sites of the lattice [102].Note that SmoQyDQMC.jldoes allow users to run simulations without the checkerboard approximation, but this will slow the code down significantly.
The checkerboard approximation is motivated by the observation that the exact exponentiated kinetic energy matrices Γ σ,l (∆τ) = e −∆τK σ,l , ( appearing in the definition of the propagator matrices B σ,l are dense N × N matrices.As a result, evaluating the product of B σ,l with another dense matrix scales as O(N 3 ), and constitutes one of the leading computational costs in DQMC simulations.Moreover, when SSH-like e-ph interactions are present (arising from the Kssh term in the Hamiltonian), updating a phonon field x l,n i,ν necessitates diagonalizing the corresponding spin-σ kinetic energy matrix K σ,l in order to re-exponentiate it.If this diagonalization is performed at every update, it would introduce an exorbitant additional computational cost and significantly slow the simulation.
To ameliorate these computational hurdles, we introduce the order O(∆τ 2 ) checkerboard approximation, which replaces the Γ σ,l (∆τ) dense matrices with sparse matrix representations.Applying this approximation reduces the cost of multiplying a dense matrix by B σ,l from O(N 3 ) to O(N 2 ).But more importantly, it reduces the cost of exponentiating the K σ,l matrices when a phonon field is updated from O(N 3 ) to O(N ).
Consider a spin-σ kinetic energy matrix K σ,(i, j) = −t σ,i, j , where t σ,i, j is the total hopping amplitude associated with the bond connecting orbitals i and j in the lattice.This matrix can be expressed as a sum of bond matrices K σ = b k σ,b .Here, b indexes each bond in the lattice, and the corresponding bond matrix is of the form where a hopping amplitude may in general be complex, t σ,i b , Next, the bonds must be sorted into groups, or colors, such that the bonds of a given color do not overlap or touch.This condition corresponds to the bond operators kb (and corresponding matrices k b ) of the same color all mutually commuting with one another.The task of constructing these groups can be reduced to the edge coloring problem in graph theory.It is important that the minimum number of colors are used, as this improves the accuracy of the checkerboard approximation.However, the precise composition of each color is not unique, and some coloring schemes are better than others.In this code, the colors are assigned by systematically iterating over the unit cells in the lattice, and assigning a color to each bond with a site contained in the current unit cell; for more information, we refer the reader to our Checkerboard.jlpackage [120].
Having assigned a color to each bond in the lattice, the total kinetic energy matrix may be expressed as where K σ,c is the kinetic energy matrix associated with just the bonds b assigned the color c, with N c the number of colors.Absent any approximation, the exponential of a single color matrix K σ,c is given by With these definitions in hand, the checkerboard approximation is given by where is the checkerboard matrix.A simple and efficient method for multiplying a dense matrix by Γσ (∆τ) is given by Algorithm 2 in Ref. [119].
The efficacy of the checkerboard approximation deteriorates in models with longer range hopping.This occurs due to the increase in the average coordination number with distance, increasing the number of color groups the bonds need to be partitioned into.Moreover, while Γ σ (∆τ) is Hermitian, the checkerboard matrix Γσ (∆τ) is not.In order to mitigate concerns regarding the accuracy of the checkerboard approximation, a symmeterized form Γ † σ ( ∆τ 2 )• Γσ ( ∆τ 2 ) can be used instead.This form results in a Hermitian checkerboard matrix and significantly improves the overall accuracy of the checkerboard approximation [121].In a DQMC simulation, this corresponds to using the checkerboard approximation in conjunction with the symmetric definition for the propagator matrices B σ,l , as defined by Eq. ( 26).
SmoQyDQMC.jl has support for both the asymmetric or symmetric checkerboard approximation.When performing simulations models that include SSH interactions, it is recommended that the symmetric checkerboard approximation be used in order to help ensure the efficacy of the checkerboard approximation.

Efficient local updates of Hubbard-Stratonovich fields
In this section, we outline a well-known method for efficiently updating the Ising HS fields introduced in Sec.3.1 to decouple a local Hubbard interaction [2].A naive, or brute-force, approach to updating the HS fields consists of re-calculating the fermion determinants from scratch each time an update to a field s l,i,ν is proposed.Updating a single field would then scale as O(N 3 ), with the resulting computational cost to update all HS fields in space and imaginary time scaling as O(βN 4 ).It is possible, however, to reduce the cost of updating a single HS field from O(N 3 ) to O(N 2 ) by taking advantage of the fact that the e −∆τV σ,l matrices that appear in the definition of the propagator matrices are diagonal [2].This property allows one to reduce the computational complexity of updating all HS fields from O(βN 4 ) to O(βN 3 ).For a derivation of the method outlined in this section for updating Ising HS fields, and a discussion on how to define and update HS transformations for other types of fermionic interactions, we refer readers to Ref. [12] and Ref. [112].
We will assume an asymmetric form for the B σ,l matrices in the following discussion, as defined in Eq. (27).Given an update to a single HS field (s l,i,ν → s ′ l,i,ν ), the corresponding change in the propagator matrix may be expressed as where is a matrix with a single non-zero matrix element at index i on the diagonal.Employing the Sherman-Morrison matrix identity, one can show that the change in the corresponding fermion matrix determinant is given by the scalar equation The updated Green's function matrix can then be efficiently calculated from the old one using the rank-1 update which in terms of matrix elements is Additional gains in efficiency can also be made by accumulating these updates in a delayed updating scheme [122].This local updating scheme forms the basis for efficiently sampling HS fields across all spatial sites and imaginary times.One begins by proposing and accepting/rejecting updates to the HS fields for a single time slice, requiring O(N 3 ) operations.After all of the updates have been proposed for this imaginary time, the Green's function is advanced to the next time slice using Eq. ( 42), and the process is repeated.In practice, the imaginary time axis is sequentially iterated over, and Eq. ( 67) and Eq. ( 69) are used to efficiently update the HS fields at each imaginary time slice.While algorithms (3a) and (3b) are used to iterate over the imaginary time slices, the order the HS fields are iterated over at each time-slice is randomized to help reduce autocorrelation times.Note that a similar approach may be used to sample the fields when the symmetric definition for the B σ,l matrices [Eq.(26)] is used.However, in this case the Green's function matrix appearing in eqs.( 67) to ( 69) is replaced by Finally, we note that the same updating scheme can be used to sample the phonon fields in certain types of electron-phonon models (e.g. the Holstein model).However, it is not straightforward, or even necessarily possible, to formulate an efficient local updating scheme for arbitrary types of electron-phonon interactions.Moreover, even when local updates can be used they are inefficient in de-correlating the phonon fields, particularly when the phonon energy is much smaller than the electronic hopping (Ω/t ⪅ 0.5) [67,123,124].Various alternative sampling methods have been proposed, including self-learning Monte Carlo [123,124] and Langevin and HMC methods [92,[125][126][127], which can reduce the autocorrelation time associated with sampling the phonon fields.SmoQyDQMC.jluses an optimized HMC method for sampling the phonon fields, as outlined in the next section.

HMC updates of phonon fields
In the SmoQyDQMC.jlpackage, specialized hybrid Monte Carlo (HMC) updates are used to sample the phonon fields in DQMC simulations of e-ph models.The HMC method, also frequently referred to as Hamiltonian Monte Carlo, was first developed by the lattice gauge theory community [92], and has since become a widely used tool for sampling continuous random variables more broadly [127].In an HMC update, the phonon fields evolve according to a fictitious Hamiltonian dynamics to construct proposed global updates to every phonon field simultaneously.
To define the fictitious Hamiltonian dynamics used to evolve the phonon fields, the DQMC Monte Carlo weight defined in Eq. ( 34) is re-expressed as where defines an effective action.The first term S B (x), given by Eq. ( 28), is the purely bosonic contribution to the total action.The second term S F (s, x) describes the fermionic contribution, and is given by valid for all τ.Moving forward, we shall suppress reference to the HS fields s, as they will be treated as a constant while proposing updates to the phonon fields x.
Next, a conjugate momentum p is introduced for each phonon field x, which allows us to define the effective Hamiltonian which may be interpreted as the sum of "potential" and "kinetic" energies, where M is a positive-definite dynamical mass matrix.The corresponding Hamiltonian equations of motion are defining a symplectic, time-reversible, and energy-conserving dynamics.
To perform an HMC update, the first step is to directly sample the momentum according to the equilibrium Boltzmann distribution exp(−p T M −1 p/2) according to where each element of the random vector R is drawn from a standard normal distribution.
In the simplest approach, the dynamical mass matrix is set to the identity, M = I.Next, the Hamiltonian dynamics are evolved for N t time-steps using the leapfrog integration method where ∆t is the integration step size and is the force driving the dynamics.Like the underlying Hamiltonian dynamics, the leapfrog integration method is symplectic and time-reversible.As a result, absent numerical instabilities, the total energy H(x) will be conserved to O(∆t 2 ) for arbitrarily long trajectories.Lastly, the final state (x f , p f ) of the HMC trajectory replaces the initial state (x i , p i ) with a probability given by the Metropolis-Hastings criteria where ∆H = H(x f , p f ) − H(x i , p i ).Crucially, the HMC method exactly satisfies detailed balance as a result of the leapfrog integration method being time-reversible and symplectic.The most expensive part of performing an HMC update is evaluating the derivative of the action More specifically, evaluating the fermionic contribution to the derivative ∂ x is the dominant cost, scaling as O(βN 3 ); Sec.3.7.1 discusses how to evaluate this derivative.
Unfortunately, applying the basic HMC method outlined in this section still results in long autocorrelation times that stem from the disparate timescales introduced to the Hamiltonian dynamics by the bosonic action S B (x). Sec.3.7.2introduces a refined HMC method that utilizes two complementary methods for addressing this issue.However, while HMC updates are a powerful method for sampling phonon fields, Sec.3.8 discusses intermittently supplementing the HMC updates with other types of global updates to help further reduce autocorrelation times and mitigate ergodicity concerns in certain situations.Lastly, while one might consider decoupling the Hubbard interaction using continuous HS fields which could then be sampled with HMC updates, the absence of a bosonic contribution coupling those fields in the imaginary time direction is known to make the approach ineffective [91,93].

Evaluating the derivative of the action
The most computationally expensive aspect of performing a HMC update is calculating the derivative of the action at each time step during the HMC trajectory.Evaluating the derivative of the bosonic action S B (x) is straightforward and fast to evaluate, with a computational cost that scales as O(βN ).
Taking the derivative of the fermionic action for a single spin species S F (x) is both more involved, and more computationally expensive, scaling as O(βN 3 ).The derivative of the fermionic action for just a single spin species S F,σ (x), as defined in Eq. ( 73), is where l is the imaginary time slice and n specifies the phonon mode.
In the case that the propagator matrices B σ,l are defined using the asymmetric definition given in Eq. ( 27), then Eq. ( 81) becomes where Λ σ,l = exp(−∆τV σ,l ) and Γ σ,l = exp(−∆τK σ,l ).If the B σ,l matrices are instead defined using the symmetric definition given in (26), then Eq. ( 81) instead becomes where Λ σ,l = exp(−∆τV σ,l ) and Γ σ,l = exp(−∆τK l /2).At this point, the simplification can be applied to eqs. ( 82) and ( 83), where A is an N × N matrix.Also, when the exact form for the exponentiated kinetic energy matrix is used, and not the checkerboard approximation, then the substitution can also be applied, and Γ † σ,l = Γ σ,l .However, if the checkerboard approximation is being used, then Eq. ( 85) is no longer correct.Rather, applying the chain rule and taking advantage of the cyclic property of the trace, the relationship should be used instead, where Γσ,l is the checkerboard approximation for Γ σ,l as defined in Eq. ( 64).In similar fashion, the simplification can be applied as well.
In practice, to evaluate the derivative of the action we sequentially iterate over the imaginary time axis using Algorithm (3a) or (3b), generating each equal-time Green's function matrix G σ (τ, τ).This matrix is then used to calculate the derivative of the action with respect to the phonon fields for the current imaginary time slice.Doing this for each imaginary time slice, we evaluate the derivative of the action with respect to all the phonon fields while retaining O(βN 3 ) scaling in the DQMC simulations.

Resolving disparate timescales in the bosonic action
In this section, we consider an isolated QHO with mass M ; generalizing this discussion to a collection of QHOs is straightforward.
An important cause of long autocorrelation times in DQMC simulations of e-ph models is the QHO action S qho (x), defined in Eq. (29).The action associated with a single QHO is with the resulting forces in a corresponding Hamiltonian dynamics given by Next, defining the discrete Fourier transform in imaginary time as and applying it to f qho,l results in for n ∈ [0, L τ ) and τ = l • ∆τ.The dynamical modes xn are the Fourier transform of the phonon fields in imaginary time x l .
The resulting Hamiltonian equations of motion in frequency space associated with S qho (x) are given by ṗn = − kn xn , which describes a system of L τ independent harmonic oscillators with spring constants The resulting ratio of the magnitude of the forces for the fastest (x L τ /2 ) and slowest demonstrating that S qho (x) introduces disparate timescales to the dynamics, especially as one must choose ∆τ small to preserve the accuracy of the Trotter approximation.This results in standard HMC updates needing to use a small integration time-step ∆t to resolve the high frequency dynamical modes, generating long autocorrelation times for the low frequency dynamical modes.Similarly, when performing local updates, in order to attain a reasonable acceptance rate, the proposed changes to the phonon field need to be very small relative to the QHO characteristic length scale ∆X = 1/ 2M Ω, once again giving rise to long autocorrelation times.
One successful approach for addressing this issue is the Fourier acceleration method, whereby carefully selected values for the dynamical mass Mn appearing in Eq. ( 92) are selected so as to reduce autocorrelation times [59,125,126,128,129]. Specifically, the dynamical masses are given by where Ω reg = 1 + η 2 reg Ω and η reg ∈ ≥0 acts as a regularization parameter.Transforming back to imaginary time, this corresponds to a dynamical mass matrix, as appears in Eq. ( 75), given by where M is a diagonal matrix with the values along the diagonal given by Eq. ( 95).In the case that η reg = 0, the dynamical frequencies ωn = kn / Mn for the harmonic oscillators described by Eq. ( 92) all equal the QHO frequency Ω, resulting in all the dynamical modes xn evolving at the same rate.In opposite limit that η reg = ∞, the dynamical mass matrix simply reduces to the scalar value M = ∆τM .Intuitively, decreasing η reg from infinity to zero should be thought of as increasing the mass of the high-frequency dynamical modes, thereby slowing them down so that they evolve at the same rate as the low-frequency dynamical modes.
In SmoQyDQMC.jlwe use a slightly modified version of the exact Fourier acceleration hybrid Monte Carlo (EFA-HMC) method that was recently introduced in Ref. [130].This approach takes advantage of the fact that the equations of motion in Eq. ( 92) are analytically integrable using the solution xn (t) = xn (0) cos( ωn t) + pn (0) sin( ωn t) / ( Mn ωn ) , pn (t) = pn (0) cos( ωn t) − xn (0) sin( ωn t) × ( Mn ωn ) , (97) given the initial conditions xn (0) and pn (0) for each dynamical mode.This analytic integration of the phonon fields and conjugate momentum, referred to as exact Fourier acceleration  4).The full EFA-HMC method used in SmoQyDQMC.jl is then presented in Algorithm (5).An important detail in Algorithm ( 5) is that the time-step is randomized at the start of each HMC trajectory, with the amount of randomization controlled by the parameter δ ∈ (−1, 1).This practice helps avoid ergodicity concerns that can arise as the result of quasi-periodic behavior, an issue that can occur when the electrons only weakly couple to the high frequency modes in the dynamics [131].
It should be noted that that when η reg = ∞ and δ = 0, Algorithm (5) becomes equivalent to the version of the EFA-HMC method originally introduced in Ref. [130].In practice, a good starting place for performing EFA-HMC updates is to set N t • ∆t ≈ π/(2Ω), with N t = 4, η reg = 0.0 and δ = 0.05 [132].If the acceptance rate is low (≲ 60%), the first thing to try is decreasing ∆t and increasing N t such that their product remains constant.If the acceptance rate is very large (≳ 95%) it may be worth decreasing N t while increasing ∆t, or increasing η reg .

Reflection and swap updates
While HMC updates help decorrelate the phonon fields, other factors can increase the autocorrelation time.In the case of the Holstein model, the e-ph interaction induces an effective phonon mediated electron-electron attraction, giving rise to heavy bipolaron physics at moderate to large coupling, also resulting in long autocorrelation times.Likewise, DQMC simulations of the repulsive Hubbard with strong interactions (U/t ≫ 1) contend with low acceptance rates for local updates as the HS fields develop a tendency to align in the imaginary time direction [133].
Another potential issue that can arise is a formal ergodicity problem if only HMC updates are used to sample the phonon fields.In particular, the Fermion determinant det M σ going to zero corresponds to the action S(x) diverging.Thus contours of det M σ = 0 in the phase space of phonon configurations describe nodal surfaces that separate regions of det M σ > 0 and det M σ < 0 by an infinite potential energy barrier S(x) → ∞.A typical HMC update cannot cross these surfaces [93,134], which introduces a formal ergodicity problem that is both difficult to predict a priori and challenging to diagnose.There is no general guarantee that prevents this from occurring, and has been observed in simulations of e-ph models in the anti-adiabatic limit (Ω/t ≥ 1) [129].For these reasons, SmoQyDQMC.jlincludes two additional types of updates, termed reflection and swap updates.In the case of a reflection update, the phonon fields of a randomly chosen phonon mode in the lattice are reflected about the origin for all imaginary times simultaneously.In a swap update, two phonon modes in the lattice are randomly chosen, and their phonon fields are interchanged, or "swapped", for all imaginary-time slices [126].Both these updates can also be used with Ising HS fields in the same way.The utility of these types of updates lies in the fact that they propose large, non-continuous changes to multiple degrees of freedom simultaneously, allowing simulations to cross regions of phase space that would otherwise be inaccessible using smaller, incremental updates.

Error estimation and reweighting
This section will review how SmoQyDQMC.jlcomputes the error associated with measured observables using the binning method and how the sign problem is addressed with reweighting [135] and the Jackknife algorithm [12].
To reliably calculate the error associated with the sample mean for a measured observable, effectively independent samples are required.However, the sequence of states generated by Markov chain Monte Carlo (MCMC) algorithms like DQMC are highly correlated.A standard approach to addressing this issue is the binning, or blocking method [12].In this approach, the sequence of measurements generated in a MCMC simulation is partitioned into equally sized bins, or intervals, and the average value is computed for each bin.Once the bins become sufficiently large, containing a number of sequential measurements larger than the autocorrelation time, the average values associated with each bin may be treated as statistically independent samples.To calculate the error, one then calculates the sample standard deviation of the mean associated with the binned averages.
A SmoQyDQMC.jlDQMC simulation is structured such that N meas measurements are made during the simulation, which are aggregated and written to binary file N bins times during the simulation using the JLD2.jlpackage [136].Therefore, each set of measurements written to file is the average of N binsize = (N meas /N bins ) individual measurements, where it is assumed that mod(N meas , N bins ) = 0. Once the simulation is complete, and as long as the binary data files persist, the mean and error for any measurement can be calculated using n bins bins, where n bins ≤ N bins and mod(N bins , n bins ) = 0.
The binning method outlined above is a generic method for calculating the error associated with estimates generated by a MCMC simulation.However, the situation in a generic DQMC simulation is more complicated.The expectation value of an observable Ô is given by 〈O〉 = Tr e −β Ĥ Ô Tr e −β Ĥ , (98) which, in the context of a DQMC simulation, is reformulated as where W (s, x) is defined in Eq. ( 33).Ideally, this expression would be directly evaluated by performing a MCMC simulation with W (s, x) used as the Monte Carlo weights.Unfortunately, the sign problem prevents this direct approach, which manifests as W (s, x) not remaining a strictly positive real number.As a result, SmoQyDQMC.jlinstead uses W (s, x) as the Monte Carlo weight in DQMC simulations, as defined in Eq. (34).The severity of the sign problem is then characterized by the average sign S = 〈s〉, where and we have adopted the notation In the absence of a sign problem S = 1.The sign problem then becomes progressively worse as S approaches zero.The origin and behavior of the sign problem is not entirely understood, but it is known to be particularly sensitive to increasing the system size, inverse temperature and Hubbard interaction strength [110].
In this scenario, the reweighting method is used to extract the correct expectation value for an given observable according to At this point the Jackknife algorithm is used to correctly propagate errors.The Jackknife algorithm is a method for evaluating functions of expectation values, as in Eq. ( 102), and is used in conjunction with the binning method.For more information we refer the reader to Ref. [12] for a thorough description and derivation.

Chemical potential tuning
The DQMC method is formulated in the grand canonical ensemble, where the average charge density 〈n〉 is determined by the chemical potential µ.The SmoQyDQMC.jlpackage provides two modes of operation to control the average particle number.The first mode is the traditional approach, where the chemical potential µ is fixed during the simulation, and 〈n〉 converges to its equilibrium value.Using this approach, one typically performs several runs at different values of µ to determine the value needed to produce the desired charge density 〈n〉.The second mode automates this process, dynamically adjusting the chemical potential µ to obtain a target value for the charge density 〈n〉 specified by the user.This automation is achieved using the algorithm described in Ref. [137] and has been implemented as a stand-alone package MuTuner.jl[138] that SmoQyDQMC.jluses to incorporate this functionality.
The µ tuning algorithm can be utilized in simulations with and without a Fermion sign problem; however, we have found that the chemical potential tuning can become unstable if the average value of the Fermion sign becomes too small.When this occurs, we recommend reverting to the fixed µ mode of operation.

Performance
Figure 1 assesses the performance of the SmoQyDQMC.jlpackage for three representative models, namely the single-band Hubbard, Holstein, and optical SSH models, defined on 1D chains of length N .The dimension of the system is unimportant with respect to measuring the scaling of the DQMC algorithm, which nominally scales as the cube of the total number of orbitals in the system N 3 , independent of the dimension.
Figure 1 reports the simulation run time, including the time needed to perform measurements of the time-displaced Green's function and normalized by the number of updates that were performed.All simulations were performed at half-filling (µ = 0) with fixed ∆τ = 0.1 and adopting only nearest neighbor hopping t.For the Hubbard model, we set U = 4t to place the system in the Mott insulating regime in one dimension.For the Holstein and optical SSH simulations, we set the phonon energy Ω = t and e-ph coupling α = t, giving rise to charge ordered states.We note, however, that we have obtained similar performance measures when simulating low-energy optical and acoustic phonon modes [64], which are traditionally very challenging for conventional QMC approaches [123,124].Finally, the asymmetric form for the propagator matrices was used in the Holstein and Hubbard simulations without the checkerboard approximation.For the simulations of the optical SSH model, we adopted the symmetric propagator definition and checkerboard approximation.
Figure 1 demonstrates that SmoQyDQMC.jlachieves the nominal O(βN 3 ) scaling of the DQMC algorithm for all three cases.This result confirms that our HMC updates for the phonon fields are efficient and bypass the typical increase in computational complexity normally associated with performing global or block updates of the phonon fields [67].
It is also straightforward to parallelize SmoQyDQMC.jlsimulations with MPI using the MPI.jl package [139].SmoQyDQMC.jldoes not list this package as a dependency, but rather the parallelization is introduced at the script level.Examples demonstrating this functionality are included in the online documentation.Additionally, when simulations are parallelized using MPI.jl, final estimates for measured observables are obtained by averaging results over all walkers simulated in parallel.

Summary & future directions
The SmoQyDQMC.jlpackage, implemented in the Julia programming language, exports a user-friendly implementation of the DQMC method for simulating Hubbard and e-ph interactions without sacrificing performance.By adopting a scripting interface, SmoQyDQMC.jl is unique relative to similar DQMC software packages, opening the door to integrating the exported functionality into more complicated workflows that leverage the growing Julia ecosystem of scientific computing and machine learning packages.With extensive online documentation that includes an ever-growing list of examples, SmoQyDQMC.jlwill help make the DQMC method accessible to a broader community of researchers.
Moving forward, one obvious direction for future development is expanding the class of Hamiltonians that SmoQyDQMC.jlcan simulate.Adding support for inter-orbital Hubbard interactions, or user-defined Fermion interactions, and couplings to classical degrees are all planned for future releases.Adding support for an arbitrary number of Fermion spin species, and asymmetric couplings to each spin sector is also planned.Longer term goals include developing a suite of companion packages that export other QMC variants, including the dynamical cluster approximation that uses DQMC as a solver [24,140], linear-scaling QMC methods for simulating e-ph models [126], the zero-temperature projector QMC method [6,113], and constrained path QMC algorithms for tackling the sign problem [30,141].

Figure 1 :
Figure 1: The average time per Monte Carlo update sweep, including the time needed for measurements of the time-displaced Green's function, for the one-dimension (1D) Hubbard (black •), Holstein (red □), and optical SSH (blue △) chains.The left panel plots results as a function of the chain length N at a fixed β = 4/t.The right panel shows results as a function of inverse temperature for a fixed chain length N = 256.All results have been normalized to the largest time and the dashed line shows the ideal O(βN 3 ) curve.

Algorithm 1
Overview of DQMC simulation structureInitialize Ising HS fields s and phonon fields x to random initial configuration.
for i ∈ [1, N therm.] do Update Ising HS field s using algorithm from Sec. 3.6.Update phonon fields x using algorithms from sections 3.7 and 3.8.Update chemical potential; see Sec. 3.10 end for for bin ∈ [1, N bins ] do for i ∈ [1, n bin ] do Update Ising HS field s using algorithm from Sec. 3.6.Update phonon fields x using algorithms from sections 3.7 and 3.8.Make measurements; see Sec. 3.3.Update chemical potential; see Sec. 3.10 end for Write average value of measurements in current bin to file.end for Average binned measurements to get final estimates for measured observables; see Sec. 3.9.