DCore: Integrated DMFT software for correlated electrons

We present a new open-source program, DCore, that implements dynamical mean-field theory (DMFT). DCore features a user-friendly interface based on text and HDF5 files. It allows DMFT calculations of tight-binding models to be performed on predefined lattices as well as \textit{ab initio} models constructed by external density functional theory codes through the Wannier90 package. Furthermore, DCore provides interfaces to many advanced quantum impurity solvers such as quantum Monte Carlo and exact diagonalization solvers. This paper details the structure and usage of DCore and shows some applications.


Introduction
Dynamical mean-field theory (DMFT) has become a standard theoretical tool for studying strongly correlated electronic systems [1]. In a DMFT calculation, an original lattice model is mapped to an effective Anderson impurity problem whose bath degrees of freedom are selfconsistently determined. Although the DMFT formalism was originally proposed for models such as Hubbard models, it can be combined with density functional theory (DFT) based on ab initio calculations to describe the electronic properties of strongly correlated materials [2,3]. This composite framework, called DFT+DMFT, has been applied to various types of materials, such as cuprates [4], Fe-based superconductors [5][6][7], and f -electron materials [8][9][10].
To make DFT+DMFT available to more users and to promote the development of a community of users and developers, there is a strong need for an open-source program package with a user-friendly interface. Some open-source software packages for performing DFT+DMFT calculations have been developed. The TRIQS (Toolbox for Research on Interacting Quantum Systems) project [11] aims to provide the necessary components to implement DFT+DMFT codes, including a quantum Monte Carlo (QMC) impurity solver [12] and an interface to DFT codes [13]. The user can implement their own DMFT code using the building blocks in Python. The ALPS project provides a DMFT code for simple Hubbard models [14] and implementations of several continuous-time QMC (CT-QMC) impurity solvers [15][16][17]. w2dynamics [18] provides a state-of-the-art implementation of a QMC algorithm and includes a Python program for performing simple DFT+DMFT calculations. DMFTwDFT features a user-friendly interface and has interfaces to various DFT codes and the CT-QMC impurity solver [19]. DFT+embedded DMFT Functional (eDMFT) [20] features a full set of DFT+DMFT functionalities but it is not open-source software.
DCore is an open-source program package that implements (DFT+)DMFT calculations for multi-orbital systems. This package is built on top of existing TRIQS Python libraries and provides interfaces to external impurity solvers from the ALPS [15,16] and TRIQS projects [12,21] as well as exact-diagonalization solvers based on pomerol [22] and HΦ [23]. DCore provides a flexible interface based on text and HDF5 files and does not require extensive expert knowledge. It also provides a set of command line tools to analyze a self-consistent solution and evaluate physical quantities such as momentum-resolved spectral functions.
The remainder of this paper is organized as follows. In Sec. 2, the methods used for implementing DCore are described. In Sec. 3, the structure of DCore and a flowchart of simulations are shown. In Sec. 4, the formats of input and output files are explained. In Sec. 5, brief instructions on the installation of DCore are provided. In Sec. 6, several run examples of DCore from simple Hubbard models to DFT+DMFT calculations for a correlated material are demonstrated. In Sec. 7, a summary and the conclusions of this paper are given.

Methodology
In this section, a brief explanation of the methodology used for implementing DCore is provided. For more technical details of the DFT+DMFT method, we refer the reader to a review article [2]. The current version of DCore implements only one-shot DFT+DMFT calculations based on Wannier90 1 . DCore does not support charge self-consistent calculations [9], which are supported by other programs such as eDMFT and DMFTwDFT. In a charge selfconsistent calculation, the DFT effective potential is updated using the density matrix obtained by DMFT calculations. This is essential especially in performing lattice optimization. DCore specializes in the analysis of multi-orbital Hubbard models.

One-shot DFT+DMFT calculations and tight-bonding models
In one-shot DFT+DMFT calculations, one first performs "band calculations" usually using the local density approximation (LDA), where correlation effects in strongly correlated orbitals (e.g., partially filled d orbitals) are not described accurately. One projects the effective onebody LDA Hamiltonian onto some tight-binding basis, which yields a multi-orbital tightbinding model. Then, one introduces local electron interactions for correlated orbitals into the tight-binding model. The resultant multi-orbital Hubbard model is solved using DMFT. We will discuss how to subtract the contributions of electron correlations at the LDA level from DMFT results (double-counting corrections) in a later section. One may not introduce additional interactions at the DMFT level for weakly correlated orbital for which LDA is accurate enough (e.g., deep oxygen orbitals).

Model
DCore is designed to solve a tight-binding model with periodic boundary conditions in second quantization. We denote the primitive vectors of the lattice by a 1 , a 2 , a 3 . Then, the coordinates of a unit cell can be specified by an integer array of length 3, R = (R 1 , R 2 , R 3 ), as 3 i=1 R i a i . Each unit shell involves shells, which are sets of spin orbitals. We define a creation operator and an annihilation operator for the α-th spin orbital in the s-th shell as c † Rsα and c Rsα , respectively. In the reciprocal space, we define the corresponding creation and annihilation operators by where momentum index k = (k 1 , k 2 , k 3 ) and k i ∈ Z. N denotes the number of k points. The Hamiltonian of a tight-binding model can be written as follows: where the first term in Eq. (3) denotes a non-interacting Hamiltonian. In the above equations, s and s run either over correlated (crsh) "shells" or over all "shells" (i.e., the correlated shells and a non-interacting shell 2 ). Hereafter, the number of the correlated shells is denoted by ncor. α and β denote the spin-orbital index in each shell. The second term H int (R, s) denotes an interaction at the s-th correlated shell. This Hamiltonian is given as where U αβγδ (s) denotes a spin-full four-rank Coulomb tensor. Note that H int is limited to intra-shell interactions and local interactions in each unit cell.

DMFT formalism
DCore solves the following DMFT self-consistent equations, where the lattice Green's function G, the local Green's function G loc , the noninteracting impurity-model Green's function G 0 , and the impurity/lattice self-energy Σ imp /Σ lattice are defined in the spin-orbital space: Σ σσ (iω n , s) ← Σ imp σσ (iω n , s).
Here, ω n ≡ (2n + 1)πT is a fermionic Matsubara frequency at temperature T , · · · k denotes an average over the momentum space, s indexes correlated shells, and P s is the projector to the s-th correlated shell. Double-counting corrections are denoted by Σ DC in Eq.   (10). The most numerically expensive step is the solution of the impurity problem, i.e., the calculation of the self-energy for a given G −1 0 . . . . k denotes an average over the momentum space and µ is the chemical potential.
All Green's functions and self-energies in Eqs. (6)- (10) are matrices at each Matsubara frequency, and their inversions are regarded as inverse matrices. The full self-energy Σ(iω n ) in Eq. (6) is defined for all spin and shell components, and has the spin-block structure where each spin-component has the internal shell structure where the last column and row correspond to the non-interacting shell. The block structures of the Green's functions are determined consistently with those of the self-energies in Eqs. (6)- (10). The Green's functions and the self-energies are assumed to be either spin-diagonal or dense matrices in the spin space with both diagonal and off-diagonal components. The former is sufficient for computing a paramagnetic state or a collinear magnetic state polarized along the z axis. The latter allows Σ(iω n ), G(iω n ), and H(k) to have spin-off-diagonal elements. This must be used when H has spin-orbit coupling that mixes the spin-up/down sectors or when one computes non-collinear magnetic states.
As illustrated in Fig. 2, by default, all correlated shells are assumed to be inequivalent shells that can have different self-energies. As described in the following sections, the user can assume that several correlated shells have the same self-energy. In this case, these correlated shells define one inequivalent shell (see Case 2 in Fig. 2).
The chemical potential µ is either adjusted to obtain a desired number of electrons or is fixed at a given value. In the former case, the value of µ is determined by using a bisection algorithm, which requires solving Eq. (6) multiple times at each self-consistent iteration. Note that the number of electrons per unit cell is given by −Tr G(τ = β − 0 + ), where Tr denote inverse temperature and matrix trace, respectively).
In Eq. (10), the new self-energy is computed by solving an impurity model for each correlated shell. The non-interacting and interacting parts of the impurity model are defined by G 0 (iω n , s) and U αβγδ (s), respectively. Some QMC impurity solvers accept the hybridization function defined below as an input: After computing impurity self-energies Σ imp (iω n ) for all correlated shells, we update Σ(iω n , s) by using simple linear mixing as and go back to Eq. (6). Here, σ mix (0 < σ mix ≤ 1) is a mixing parameter. DCore implements three different algorithms for computing the double-counting correction Σ DC : • HF DFT (default): Hartree-Fock (HF) contribution is subtracted from the DFT input.
• HF imp: HF contribution computed from the local Green's function G loc [Eq. (7)] is subtracted from the impurity self-energy.
For the default algorithm (HF DFT), the double-counting correction is computed as follows: where · · · 0 indicates the expectation value at the initial (Kohn-Sham) state with k-summation included, and {α, β, γ, δ} index spin orbitals.

MPI parallelization
DCore uses MPI parallelization for performing numerically expensive parts of DMFT calculations. For instance, k-average of G(k, iω n ) in Eq. (6)   DCore contains a set of programs that perform DMFT calculations for models and materials. The structure of the programs and data flow is summarized in Fig. 3.
DCore consists of four layers: (i) interface layer, (ii) DMFT self-consistent loop, (iii) convergence check and (iv) post-processing. These are respectively performed by the executables dcore_pre, dcore, dcore_check and dcore_post. Input parameters are provided by a single text file, which is read by all three programs. Data generated by dcore_pre and dcore are separately stored in an HDF5 file and passed to the next process.
(i) Interface layer: dcore_pre dcore_pre generates the HDF5 file necessary for the DMFT loop. Users specify parameters that define a model, such as the hopping parameters on a certain lattice, and interactions. The hopping parameters are given either for preset models (e.g., square lattice, Bethe lattice) or using the Wannier90 format.
(ii) DMFT self-consistent loop: dcore dcore is the main program for the DMFT calculations. The effective impurity problem is solved repeatedly to satisfy the self-consistency condition of the DMFT. For solving the impurity problem, dcore calls an external program. As an external program, we can select ALPS/CT-HYB [15], ALPS/CT-HYB-SEGMENT [16], or TRIQS/cthyb [12] as a CT-QMC solver, TRIQS/Hubbard-I [21] as a Hubbard-I solver, or pomerol [22] or HΦ [23] as an exact diagonalization (ED) solver.
(iii) Convergence check: dcore_check dcore_check reads the results of self-consistent calculations (e.g., the self-energy) from the output HDF5 file of dcore and outputs several figures that are useful for checking the convergence of the results.
(iv) Post-processing: dcore_post dcore_post computes some physical quantities from the converged solution of the DMFT loop. Currently, the (projected) density of states A(ω) and correlated band structures (momentum-resolved single-particle excitation spectrum) A(k, ω) can be calculated. The analytic continuation of the self-energy from the Matsubara frequencies to real frequencies is performed using the Padé approximation.

Format of input and output files
In this section, the format of input and output files for DCore is briefly explained.

Input file format
The

[model] block
This block includes parameters for defining a model to be solved. The parameter types are divided into four parts: (i) Basic parameters, (ii) Lattice parameters, (iii) Interaction parameters, and (iv) Local potential parameters. In the following, we describe the parameters defined in each part.
(i) Basic parameters There are four basic parameters, namely seedname, nelec, norb, and spin_orbit. seedname determines the name of the model HDF5 file. nelec and norb specify the number of electrons per unit cell and the number of orbitals, respectively. spin_orbit specifies whether the model has spin-orbit interactions.
(ii) Lattice parameters In DCore, chain, square, cubic, and bethe lattices are prepared as predefined models, as shown in Fig. 4. We can select a lattice type using the lattice parameter and set values of transfer integrals using parameters t and t'. For treating more realistic cases, such as DFT+DMFT calculations, hopping parameters in the Wannier90 format can be imported by selecting wannier90. In this mode, we can set a number of correlated shells in a unit cell and a mapping from correlated shells to equivalent shells using the ncor and corr_to_inequiv parameters, respectively. Figure 5 shows a schematic diagram of the shell structure in wannier90 mode. For experts, the lattice data can be custom prepared using the external mode. In this mode, all necessary data should be directly made in the dft_input group of the model HDF5 file. For details, see the reference  manual of DFTTools [13]. The information of reciprocal lattice vectors and the number of wave vectors can be also specified.
(iii) Interaction parameters The interaction part of the Hamiltonian is defined as Eq. (5). The interaction matrix U αβγδ (i) is specified by the parameter interaction. In DCore, three types of interaction, namely (a) kanamori, (b) slater_f, and (c) slater_uj, are defined.
(a) If interaction = kanamori, the Kanamori-type interaction is used; i.e., (b) If interaction = slater_f, the interaction matrix is constructed by the effective Slater integrals F 0 , F 2 , F 4 , F 6 [25]. These Slater integrals and the angular momentum l at each correlated shell are specified by the parameter slater_f as It is noted that F 0 , F 2 , F 4 , F 6 must all be specified.
(c) If interaction = slater_uj, the Slater-type interaction is used. When U, J, and the angular momentum l are set, the effective Slater integral is calculated internally in DCore according to the formula defined in Table 2. The U, J, and l at each correlated shell are specified by the parameter slater_uj as (iv) Local potential parameters An arbitrary local potential can be implemented using parameters local_potential_matrix Corr. shell1 (inequivalent shell 0)

Uncorrelated orbitals
Index of Wannier functions 0 a H 1 6 9 a s l s l X l c d Q 7 d 1 n 9 8 r s K 5 g D J 0 v L f u K w u g W 2 + r v P W h a 6 9 C 6 9 5 0 W 9 z S c a r h B z d K M V 3 4 1 E L R L p L M p w g r H G o 7 y k + S t t 1 W 5 r K B X W n P 3 a A d n t 8 R b z z 1 6 j D e Z H e Y 1 8 0 a H 1 6 9 a s l s l X l c d Q 7 d 1 n 9 8 r s K 5 g D J 0 v L f u K w u g W 2 + r v P W h a 6 9 C 6 9 5 0 W 9 z S c a r h B z d K M V 3 4 1 E L R L p L M p w g r H G o 7 y k + S t t 1 W 5 r K B X W n P 3 a A d n t 8 R b z z 1 6 j D e Z H e Y 1 8 0 a H 1 6 9 a s l s l X l c d Q 7 d 1 n 9 8 r s K 5 g D J 0 v L f u K w u g W 2 + r v P W h a 6 9 C 6 9 5 0 W 9 z S c a r h B z d K M V 3 4 1 E L R L p L M p w g r H G o 7 y k + S t t 1 W 5 r K B X W n P 3 a A d n t 8 R b z z 1 6 j D e Z H e Y 1 8 0 a H 1 6 9 a s l s l X l c d Q 7 d 1 n 9 8 r s K 5 g D J 0 v L f u K w u g W 2 + r v P W h a 6 9 C 6 9 5 0 W 9 z S c a r h B z d K M V 3 4 1 E L R L p L M p w g r H G o 7 y k + S t t 1 W 5 r K B X W n P 3 a A d n t 8 R b z z 1 6 j D e Z H e Y 1 8 0 a H 1 6 9 a s l s l X l c d Q 7 d 1 n 9 8 r s K 5 g D J 0 v L f u K w u g W 2 + r v P W h a 6 9 C 6 9 5 0 W 9 z S c a r h B z d K M V 3 4 1 E L R L p L M p w g r H G o 7 y k + S t t 1 W 5 r K B X W n P 3 a A d n t 8 R b z z 1 6 j D e Z H e Y 1 8 0 a H 1 6 9 a s l s l X l c d Q 7 d 1 n 9 8 r s K 5 g D J 0 v L f u K w u g W 2 + r v P W h a 6 9 C 6 9 5 0 W 9 z S c a r h B z d K M V 3 4 1 E L R L p L M p w g r H G o 7 y k + S t t 1 W 5 r K B X W n P 3 a A d n t 8 R b z z 1 6 j D e Z H e Y 1 8 As an example, we consider cases with atoms in a unit cell (ncor=2). For the right panels of (a) and (b), the two atoms are treated equivalently (i.e., the self-energy is assumed to be identical).

Output-file format
Output files are generated by each program. The output files for each program are described below.
1. dcore_pre dcore_pre generates seedname.h5, where seedname is defined in a model block in the input file. It has two groups, namely dft_input and Dcore. See the DFTTools website [13] for details on the data structure in the dft_input group. In the Dcore group, the values of interaction matrix U αβγδ (s) for each equivalent shell , where α, β, γ, δ denote the spin-orbital indices at each correlated shell, and local potential V i s,o1,o2 , where s denotes the spin and o1, o2 denote orbitals, are output.
2. dcore dcore generates seedname.out.h5. All data are stored in the dmft_out group. In this group, the input parameters, the total number of iteration steps, and the physical quantities, such as the local self-energy in the imaginary frequency domain and the chemical potential, at each iteration step are output. The data structure of the self-energy is described in the online document [32]. For instance, the self-energy at the first iteration for the 0-th inequivalent shell will be stored below /dmft_out/Sigma_iw/ite1/sh0 in seedname.out.h5. An external impurity solver is invoked as a subprocess from the main process of dcore. For each iteration and each inequivalent shell, dcore generates temporary input and output files in the working directory, e.g., work/imp_shell0_ite1 for the first iteration for the 0-th inequivalent shell. The user can check these temporary files to diagnose problems.
4. dcore_post dcore_post generates three text files, namely seedname_dos.dat, which includes the total spectral function, seedname_akw.dat, which includes the single-particle excitation spectrum A(k, w), and seedname_momdist.dat, which includes the momentum distribution function. A script (seedname_akw.gp) for displaying A(k, w) is also generated at the same time. All files and figures are output in the post directory.

Installation
DCore is a collection of pure Python programs depending on TRIQS [11] and TRIQS/DFTTools [13]. DCore supports TRIQS 3.0.x and Python3. Once these prerequisites are installed, the installation of DCore is quick and efficient: $ pip3 install dcore A pure-Python package will be automatically downloaded, being installed into the system site-packages directory. If you do not have root privileges, you can install DCore into the user site-packages directory: $ pip3 install dcore --user You can find the locations of the installed files as $ pip3 show -f dcore Please make sure that the PATH environment variable includes the directory containing the DCore executable files such as dcore, dcore pre. The detailed installation instructions can be found online [33]. In addition, at runtime, one of the external impurity solvers shown below must be available.

Examples
The examples below show how to define a model, how to compute a self-consistent solution, and how to compute physical quantities.

First example: Square-lattice Hubbard model
As the first example, we consider a single-band Hubbard model on a square lattice. Without the shell index, the Hamiltonian in Eq. (3) is reduced to where n iσ = c † iσ c iσ is the local number operator. The energy dispersion k is given by k = 2t(cos k x + cos k y ) + 4t sin k x sin k y .
Note that the nearest-neighbor hopping parameter t should be negative to minimize k at the Γ point. This is an effective model of cuprate superconductors and is one of the most studied models in strongly correlated electron systems. We explain below the steps for obtaining the DMFT solution with the input file dmft square pomerol.ini, which is available in the online tutorial [35].

Model construction
We first set up the model using dcore pre. Parameters for defining a model are provided in the [model] block of the input file. The square-lattice Hubbard model can be constructed as shown below. The square lattice is predefined in DCore and can be invoked simply by setting lattice = square. For more complicated models that are not predefined in DCore, this parameter is replaced with lattice = wannier90 and the lattice data are described in a separate file with the Wannier90 format. Sec. 6.1.5 describes its details. We consider half-filling, which is specified by nelec = 1.0. The value of transfer integral t is specified by t. In this case, t is set as −1, and the absolute value of t is taken as the unit of energy. The interaction parameter U is input by the first component of kanamori. The second and third components correspond to U and J, respectively, but are meaningless for single-band models. The system size is determined by nk = 8, which means that the k-average in Eq. (6) is evaluated by summing up 8 × 8 k-points.
With this input file, dcore pre is executed as This program generates an HDF5 file named square.h5, which includes all the information on the model Hamiltonian, such as H(k).

Self-consistent calculations
We now proceed to the main program dcore. For executing the program, we need three additional blocks in the parameter file, namely [system], [impurity solver], and [control].
The example below shows a parameter set for a square lattice: [system] T = 0.1 n_iw = 1000 fix_mu = True mu = 2.0 [impurity_solver] name = pomerol exec_path{str} = pomerol2dcore n_bath{int} = 3 fit_gtol{float} = 1e-6 [control] max_step = 100 sigma_mix = 0.5 converge_tol = 1e-5 The [system] block includes T for temperature T and n iw for the number of positive Matsubara frequencies. One can use either the parameter β (inverse temperature) or T (temperature) to set the simulation temperature. We fix µ at 2 by specifying fix mu = True and mu = 2.0 because the condition for half-filling is known. If fix mu is not activated, µ is adjusted every time the impurity problem is solved to make the occupation number equal to nelec.
The [impurity solver] block specifies which impurity solver is used and its configuration. In this example, we use the ED solver pomerol [22], which is specified by name = pomerol. Other parameters in this block are solver-dependent and require the type specification, such as {str} and {int}. exec path{str} is a full/absolute path to the executable (including its name). Here, we assume that the executable pomerol2dcore is in the PATH. In ED-based solvers, the hybridization function ∆ αβ σ (iω n ) in Eq. (13) is approximated by the function composed of N bath poles: Hereafter, we drop the shell index s for simplicity. The energy level E b σ and the hybridization parameter V αb σ are determined by fitting∆ αβ σ (iω n ) to ∆ αβ σ (iω n ) [1]. The fitting tolerance is specified by fit gtol. The resultant finite-size system consisting of the correlated site plus N bath bath sites is solved using numerical diagonalization.
The [control] block includes parameters that control the self-consistent calculations. The impurity problem is solved max step = 100 times at maximum, but the loop is terminated if a convergence criterion is satisfied. The criterion we use is where O[i] denotes a quantity at the i-th iteration. The quantities O include the chemical potential µ and the renormalization factor z [see Eq. (21)]. sigma mix specifies the mixing parameter σ mix in Eq. (14). The optimal value of sigma mix depends on models and parameters. Generally, a larger value of sigma mix (1 at the largest) can lead to a faster convergence, but may result in a bad convergence (e.g., oscillation and discontinuous change). A recommended strategy is that one starts from a large value of sigma mix such as 1 and 0.5 and decreases it if the convergence graph (see below) does not show a tendency of convergence. The main program dcore is executed as Here, the number of MPI processes that are internally invoked is given after the --np option. Note that dcore itself is launched as a single process. The results are stored in an HDF5 file named square.out.h5. After dcore is executed, one must check the convergence of the self-consistent calculations. For this purpose, the auxiliary program dcore check is provided. dcore check is called without the --np option as $ dcore_check dmft_square_pomerol.ini Figure 6 shows one of the figures generated by dcore check. The renormalization factor z defined by Then, dcore uses the final result for Σ(iω n ) in the previous run as the initial state and executes additional max step iterations. We note that the default is restart=False, which means the self-consistent loop is started over even though the previous result exists in the current directory. Other parameters in [control] block can be changed in the second run. For example, sigma mix may be reduced if the convergence graph does not show a converging behavior. This procedure is repeated until convergence is reached.
At this point, the DMFT solutions have been obtained for static quantities and G(k, iω n ) in the Matsubara domain. For example, the magnetization and the renormalization factor z can be discussed. Figure 7 shows the convergence of z versus the number of bath sites N bath . Convergence to the CT-QMC result (explained in Sec. 6.1.4) is observed for N bath ≥ 3.

Dynamical quantities
After the self-consistent calculations are finished, the post calculation dcore post is conducted to compute dynamical quantities on the real frequency axis such as the density of states A(ω) and the single-particle excitation spectrum A(k, ω). The parameters for this step are provided in the [tool] block as follows:  The k-path is specified by the parameter knode. In this example, k starts from Γ and comes back to Γ through X and M. For each interval, the path is divided into nk line = 100 points, on which A(k, ω) is computed. The ω mesh is generated using the parameters omega min, omega max, and Nomega. The parameter broadening specifies the extent of an artificial broadening δ in the spectrum. This is done by replacing ω by ω + iδ in the analytical continuation. broadening is set to a value of the order of T to obtain a smooth spectrum with the ED solver. For other solvers that treat the thermodynamic limit, one should set broadening = 0 to avoid artificial broadening.
dcore post is executed with the --np option because MPI parallel computation is used: After the computation is finished, numerical results are stored in post directory. The spectrum A(k, ω) on a given k-path, for example, is saved in a text file square akw.dat together with a plotting script square akw.gp in gnuplot format. One can plot the data by $ gnuplot square_akw.gp Figure 8(a) shows A(k, ω) plotted with this command. We note that the range of the colorbar needs to be changed manually to obtain a distinct spectrum. Although the ED solver yielded reasonable estimations for z, as shown in Fig. 7, the dynamical quantities clearly show artificial features. Because ∆(ω) is composed only of n bath = 3 poles located at ω = 0 and ±1.73, A(k, ω) exhibits features of a hybridized band around ω = ±1.73. However, the hybridization should take place equally in the whole energy region. A larger n bath is necessary to discuss dynamical properties. We note that the Padé approximation used for the analytical continuation does not ensure the causality, namely, A(k, ω) may becomes negative at some (k, ω). One can partly avoid this unphysical behavior by increasing the value of broadening. Another risk is involved in the analytical continuation from noisy Σ(iω n ), which will be described in 6.1.4.

CT-QMC solver
The CT-QMC method offers an unbiased simulation of general interacting models [36,37]. In particular, efficient calculations for the effective impurity problem are achieved by its hybridization-expansion formulation (CT-HYB), which performs Monte Carlo sampling in the expansion with respect to hybridization between the correlated atom and the bath as perturbation [38]. In special cases without exchange interactions, an even more efficient algorithm based on a segment picture can be used [39]. Here, we use ALPS/CT-HYB-SEGMENT, which is the ALPSCore [40] version of the implementation developed by Hafermann et al. [41]. A detailed documentation with overview of parameters and sample scripts is provided in the subdirectory Documentation in the repository [16].
The segment algorithm can handle only the density-density type interactions. This is the case with the single-orbital Hubbard model and the multi-orbital Hubbard model with J = 0. When the segment solver is attempted to use, DCore check if the model includes non-densitydensity interactions and if yes, the program stops before the impurity solver is invoked. Even in such cases, one can still apply the segment solver by activating the density density option as Then, the interactions generated by dcore pre are restricted to density-density type. Note that the density density option changes the model (e.g. when J is finite) and hence the results may differ from those obtained by other solvers such as ALPS/CT-HYB.
The simulation finishes either when cthyb.SWEEPS{int} Monte Carlo sweeps are finished or when the elapsed time reaches MAX TIME seconds. In this example, cthyb.SWEEPS{int} is so large that the computation time is controlled by MAX TIME{int}. cthyb.N MEAS{int} is the number of updates per measurement, and here we use 50 following the example in the official documentation [16]. cthyb.THERMALIZATION{int} specifies the number of updates before the measurement starts. cthyb.THERMALIZATION{int} = 100000 should be large enough to properly discard irrelevant samples. However, this value may not be large enough at lower temperatures. This should be checked by, for example, plotting the histogram of the expansion order [41]. The whole list of optional parameters can be found by running the command

$ alps_cthyb --help
Note that solver-dependent parameters should be input with the type specification, e.g., {int} and {str}, as in the above example. Because the QMC solver is time-consuming, it is important to use a proper initial guess for Σ(iω n ) to reduce the number of iterations. For example, we can use the converged result obtained for similar parameters or the result obtained for the same parameters with a different solver. This can be done by assigning the path of sigma.dat generated by dcore check to the initial self energy parameter as Here, /path/to/sigma.dat should be changed according to the environment. The parameter time reversal = True activates the spin average of Σ σ (iω n ). This is particularly useful for the QMC solver to improve statistics. Unlike the previous example using the ED solver, converge tol is not specified here, because statistical errors in the QMC method make the automatic convergence check less reliable. The convergence is checked visually using the graphs generated by dcore check.
With this input file, all programs, namely dcore pre, dcore, dcore check, and dcore post, are executed as in Sec. 6.1.1. Figure 9 shows graphs generated by dcore check. The right panel shows that the statistical error is on the order of 10 −3 and that convergence is reached at around the 10th iteration. The result for A(k, ω) is shown in Fig. 8(b). Note that artificial broadening is turned off by broadening = 0.0. The spectrum exhibits low-energy quasiparticle excitations and high-energy broadening due to correlations.
We note that spectra computed using the Padé approximation is extremely sensitive to statistical errors. For this reason, Fig. 8(b) might not be reproduced even with the same input as above. In such cases, one can try improving the QMC accuracy by increasing the number of MPI processes or by increasing MAX TIME{int} parameter. An alternative way for analytical continuation is discussed in Sec. 7 as a future development.
Here, a comment on the convergence check is in order. Although the automatic convergence check has been turned off in the above example, there is a way to employ the convergence check even for QMC solvers. In the case with Fig. 9, the following configuration works: The DMFT loop is terminated when the convergence criterion (20) is satisfied n converge times consecutively. converge tol is set larger than the magnitude of statistical errors, and the other parameter n converge=5 prevents the loop from being terminated prematurely. This way of automatic convergence check can be attempted to use if the expected magnitude of statistical errors is known in advance.
As discussed above, one could alternatively use a CT-HYB solver based on the matrix algorithm, such as ALPS/CT-HYB [see Sec. 6.2]. To obtain the same results with comparable accuracy, one may have to run a CT-HYB solver based on the matrix algorithm longer in runtime, typically by one order of magnitude.

Wannier90 interface
DCore provides simple lattice models such as a one-dimensional chain, a two-dimensional square lattice, and a three-dimensional cubic lattice. Arbitrary lattices that are not predefined in DCore can be implemented using the Wannier90 format [42]. In the following, we show how to construct the square lattice using Wannier90 input.
The only difference in dmft square pomerol.ini is in the [model] block, which includes The difference between this block and that in Sec. 6.1.1 is lattice = wannier90. When this is given, DCore reads a wannier90 file named seedname hr.dat (square hr.dat in the present case) to set up the one-body part of the Hamiltonian, H(k). square hr.dat describing the square lattice is given by

Antiferromagnetic state
With the Wannier90 format, one can construct arbitrary lattices. For example, we can enlarge the unit cell to allow an antiferromagnetic (AFM) solution. Let us see how this is done for the square lattice example. The total number of orbitals contained in the unit cell is specified by ncor = 4. nelec is the total number of electrons in the unit cell, and hence needs to be multiplied by 4. In the AFM state, we expect a staggered ordered state, where site 1 and site 4 are equivalent (e.g., spin-up state), and site 2 and site 3 are equivalent (spin-down state). This constraint can be imposed by setting corr to inequiv = 0, 1, 1, 0, which indicates that there are only two inequivalent sites: site 1 and site 4 are labeled with 0, and site 2 and site 3 are labeled with 1.
Two independent impurity problems are solved in each step of the self-consistent calculations. For each impurity model, the number of orbitals and the strength of interactions are specified by the norb and kanamori parameters, respectively. To obtain an AFM solution, we need to start the self-consistent calculation with a brokensymmetry initial guess. To this end, we set the initial self-energy to using local potential v σ αβ (s), which depends on the inequivalent site s. The [control] block contains an additional parameter that specifies the initial self-energy as The parameters in [system] and [impurity solver] do not change from the paramagnetic calculations. Figure 11 shows the convergence of the spin moment m ξ (s) (ξ = x, y, z) for inequivalent site s = 0. A convergence of m z to a non-zero value was obtained around the 30th iteration. The moment m z = 0.283 is about half the full moment (1/2). We confirmed that m z (s) for s = 1 converges to the negative side, namely m z (0) = −m z (1) (no figure). The parameters U = 4, n = 1, and T = 0.1 are the same as those in the previous calculations in Fig. 8, meaning that the paramagnetic solution was not a true solution of the DMFT equation. The phase diagram is shown, for example, in Ref. [44].
6.2 Solving multi-orbital model using QMC solver: t 2g model on a Bethe lattice In this subsection, we show how to solve a three-orbital model on a Bethe lattice by using DCore. In this model, an interesting phenomenon called spin-freezing transition occurs [45]. Spin-freezing is signaled by a peculiar frequency dependence of the self-energy: ImΣ(iω n ) ∝ ω 0.5 n . We will reproduce this result using DCore.

Model definition
We first make the input file of dcore_pre for generating an HDF5 file that is necessary for DMFT calculations. The Hamiltonian in Ref. [45] is defined as where α = 1, 2, 3 is the orbital index. We use a Kanamori interaction with U = 8, U = U − 2J = 5.3333333, and J = 1.33333. We use the same model parameters as those used in the paper [45]: n = 1.6, t = 1.0, U/t = 8.0, J/U = 1/6. The input file for dcore_pre is as follows: [model] lattice = bethe seedname = bethe nelec = 1. After running dcore_pre, an HDF5 file named bethe.h5 is generated. DCore discretizes the density of states of the Bethe lattice, a semicircular density of states on [−2t, 2t], along a virtual one-dimensional k axis with nk = 1000 points.

DMFT calculation
Next, we make the input file of dcore. The calculation in Ref. [45] was done using the matrix formalism of the CT-HYB method. To make a direct comparison, we use an implementation of the same algorithm, ALPS/CT-HYB, which was developed by one of the authors. An example input file of dcore is shown below.
[mpi] command = '$MPIRUN -np #' [system] beta = 50.0 [impurity_solver] name = ALPS/cthyb timelimit{int} = 300 exec_path{str} = hybmat [control] max_step = 40 sigma_mix = 1.0 restart = False In this example file, we omit the model block, which is the same as that in the input file of dcore_pre. The inverse temperature β is set to 50t. We also define a command for MPI parallelization using the mpi block. Here, # in command of the mpi block is replaced by the number of processes specified at runtime. In the above example, we define the environment variable MPIRUN and run the program with 24 MPI processes as follows.
$ export MPIRUN="mpirun" $ dcore dmft_bethe.ini --np 24 After running dcore, the results of the self-energy and Green's functions in each iteration are accumulated in an HDF5 file named *seedname*.out.h5 (bethe.out.h5 in the present case) and a text file that contains the self-energies (sigma.dat) is output in the check directory. Before analyzing the computed results such as the self-energy, the user may check the convergence of CT-QMC sampling. The impurity solver runs for 300 seconds at each iteration step, and the standard output and error of the solver are redirected to the standard output of dcore. The last part of the output of the solver at the last iteration looks as follows. The perturbation orders measured just before and after the measurement steps are close to each other ( 58), indicating that the thermalization time was long enough: ALPS/CT-HYB used 10 % of the total simulation time (timelimit) for thermalization. One can see that the average sign is close to 1, indicating that there is no severe sign problem. After checking the convergence of the self-consistent iterations using dcore_check, one can plot the self-energy stored in sigma.dat as shown in Fig. 12. The solid line shows the results taken from [45]. In the low-frequency region, ImΣ(iω n ) is proportional to ω 1/2 n . The three symbols represent the orbital diagonal components of ImΣ(iω n ) computed by DCore. The results obtained by dcore match those of [45]. −ImΣ(iω n ) Figure 12: Matsubara frequency dependence of ImΣ(iω n ). The solid line indicates the numerical results reported in Ref. [45]. The three symbols indicate ImΣ(iω n ) of d xy , d yz , and d zx orbitals, respectively, obtained by DCore.

DFT+DMFT example: SrVO 3
In this section, we demonstrate how to perform a multi-orbital DFT+DMFT calculation for a realistic band structure. In particular, we use SrVO 3 which has been studied by DMFT as a testbed in many previous studies.

Construction of Wannier functions
We fist construct maximally localized Wannier functions for the t 2g manifold using Quantum ESPRESSO [46] and Wannier90 [42]. Any DFT programs that support Wannier90 may be used alternatively. For our DFT calculation, we take a lattice constant of 7.29738 a.u and construct a three-orbital tight-binding model following a common procedure using the local density approximation (LDA). We provide a more detailed description on how to construct Wannier functions using Quantum ESPRESSO or another DFT program, OpenMX [47] online [48]. There, we can download input files as well as the data file of the resultant tightbinding model.

Model definition
We show the model block of our input file for performing a DMFT calculation using the Wannier functions below. We adopted the Kanamori interaction parameters (U = 3.44 eV, U = 2.49 eV, J = 0.46 eV) estimated in Ref. [49]. The number of grid points in the first Brillouin zone along each reciprocal vector is set to 10. This number does not have to match the one used in the DFT calculation. The total number of k points is 10 3 .
[model] We set input parameters for ALPS/CT-HYB in the impurity_solver block. The parameters timelimit and exec_path are passed to ALPS/CT-HYB. The impurity solver runs for 100 seconds with 96 MPI processes at each self-consistent iteration. We set an initial guess for the chemical potential to an appropriate value and inverse temperature to β = 20 eV −1 580 K. In the input file shown below, we specify the temperature by using the parameter beta instead of the parameter T for the sake of brevity in appearance.
[mpi] command = '$MPIRUN -np #' [system] beta = 20.0 mu = 12.290722 [impurity_solver] name = ALPS/cthyb timelimit{int} = 400 exec_path{str} = hybmat [control] max_step = 20 time_reversal = True sigma_mix = 0.8 The tool block includes input parameters for post processing. In this example, we compute A(k, ω) using dcore_post along the k-paths specified by symmetry points given in knode. As set by the parameter omega_pade, the data of the self-energy in the energy window of 0 < ω n ≤ 2 is used for analytic continuation to the real-frequency axis. Here, we introduce a small broadening factor for computing spectral functions.

Results
In Fig. 13, one can see that the t 2g bands are fitted very well by the Wannier functions. The t 2g manifold is separated in energy space from the e g bands above 1.5 eV and oxygen bands below −2 eV. Since we constructed the t 2g model, the e g bands and oxygen bands are not taken into account in the present DMFT calculation. Figure 14 shows A(k, ω) computed by DMFT Compared to the DFT band structure in Fig. 13, one can see that the t 2g band is substantially renormalized by strong correlation effects as expected.
There are several previous DMFT studies for this compounds [50][51][52]. These calculations aimed at a quantitative description of the spectrum of the real material and therefore took into account other bands such as the e g bands and oxygen bands. Thus, a direct comparison with our result is not possible.
For DCore, one could take into account those uncorrelated bands by constructing Wannier functions for the full 3d orbitals and oxygen p orbitals in a wider energy window. In this case, we may have to carefully adjust (manually) the level splitting between t 2g and the uncorrelated orbitals because the current version of DCore does not support intra-shell (site) interactions between the d and p shells.
We need extra care for quantum Monte Carlo simulations since they often suffer from a sign problem and many other technical issues. If the reader faces such a problem, it is recommended to create an issue on GitHub at the issues page [53] or directly contact the developers of the solver.

Conclusion
We introduced the open-source software package DCore, which is integrated DMFT software for correlated electrons. We described the algorithm, the structure of the package, installation, and file format, and demonstrated usage for the single-/multi-orbital Hubbard models and the Wannier90 model for SrVO 3 . We are planning to implement more functionality in a future version. The current version supports only the Padé approximation for analytic continuation. However, the Padé approximation is sensitive to statistical noise. A future version will support external analytic continuation solvers based on more stable methods such as SpM [54] and Maxent [55]. Furthermore, we will implement recently proposed efficient methods for computing static/dynamical lattice susceptibilities based on the Bethe-Salpeter equation [44,56]. DCore will support Python 3 as soon as TRIQS is upgraded to Python 3. There are also plans to support more external impurity solvers such as iQIST [57] and w2dynamics [18].