A supervised learning algorithm for interacting topological insulators based on local curvature

Topological order in solid state systems is often calculated from the integration of an appropriate curvature function over the entire Brillouin zone. At topological phase transitions where the single particle spectral gap closes, the curvature function diverges and changes sign at certain high symmetry points in the Brillouin zone. These generic properties suggest the introduction of a supervised machine learning scheme that uses only the curvature function at the high symmetry points as input data. We apply this scheme to a variety of interacting topological insulators in different dimensions and symmetry classes, and demonstrate that an artificial neural network trained with the noninteracting data can accurately predict all topological phases in the interacting cases with very little numerical effort. Intriguingly, the method uncovers a ubiquitous interaction-induced topological quantum multicriticality in the examples studied.


I. INTRODUCTION
Topological order is typically quantified by an integer-valued topological invariant that is often calculated from the momentum-space integration of a certain curvature function, whose precise form depends on the dimension and symmetry class of the system. [1][2][3] Though the profile of the curvature function in a topological phase varies with the system parameters, the topological invariant remains unchanged. Across topological phase transitions (TPTs) where the topological invariant jumps discretely, the curvature function displays a rather universal feature: [4][5][6][7] it gradually diverges at certain high-symmetry points (HSPs) in momentum space, and the divergence changes sign as the system crosses the TPT, causing the discrete jump in the topological invariant. By analyzing the divergence of the curvature function, various statistical aspects of Landau second-order phase transitions can be transposed to TPTs, including the notions of critical exponents, scaling laws, universality classes, and correlation functions. This forms the basis of the curvature renormalization group (CRG) method, which can capture the TPTs solely based on the renormalization of the curvature function near the HSP, 8 regardless of whether the system is noninteracting [9][10][11][12][13] or interacting 14,15 or periodically driven. [16][17][18][19] The CRG method demonstrates that, although topology is a global property of the entire manifold of the D-dimensional Brillouin zone (BZ), the knowledge about topology can be entirely encoded in the curvature function near an HSP. Motivated by this intuition, in this paper we present a supervised machine learning (ML) scheme that utilizes only the curvature function at the HSPs as input data to predict TPTs. The proposed ML scheme answers an important question regarding the application of ML to topological phases: what is the minimal amount of data that is sufficient to distinguish topological phases?
Various ML strategies have been suggested to address this issue, including the concept of quantum loop topography, 20,21 and using either the wave function, [22][23][24] Hamiltonian, 25-28 electron density, 29 system parameters, 30,31 transfer matrix, 32 or density matrix 33 as the input data. In contrast to these methods, we present a simple ML scheme based on input data comprising at most D + 1 real numbers in D dimensions, applicable to different symmetry classes and to weakly interacting systems. We train a simple fully-connected artificial neural network with a single hidden layer on data from prototypical noninteracting TIs whose topological phases are well known, and then use the trained network to predict the topological phase diagram when many-body interactions are adiabatically turned on, such that the single-particle curvature function gradually evolves into its many-body version. We demonstrate that the ML scheme accurately captures the topological phases and the interaction-driven phase transitions with very little numerical effort, and that it simultaneously uncovers interaction-driven multicritical points.
The article is organized in the following manner. In Sec. II A, we first review the generic features of the curvature function and the proposed supervised ML scheme based upon it. We then apply this scheme to predict the topological phase diagram of the Su-Schrieffer-Heeger model under the influence of nearest-neighbor interaction in Sec. II B as a concrete example. In Sec. II C 1, we apply the ML scheme to 2D Chern insulators with nearest-neighbor interaction, and in Sec. II C 2 to Chern insulators with electron-phonon interaction, elaborating on the quantum multicriticality caused by the interactions.

II. MACHINE LEARNING SCHEME

A. Curvature function and supervised learning scheme

The topological systems we consider are those whose topological invariant C is given by a D-dimensional momentum-space integration

C = ∫ d^D k F(k, M),   (1)

where F(k, M) is referred to as the curvature function or local curvature, and M = (M_1, M_2, ..., M_{D_M}) is a set of tuning parameters in the Hamiltonian. This form of the topological invariant has been proved to hold for any noninteracting system described by Dirac models in any dimension and symmetry class. 34 The points k_0 in momentum space satisfying k_0 = −k_0 (up to a reciprocal lattice vector) are referred to as the high symmetry points (HSPs). For a D-dimensional cubic system, there are D + 1 distinguishable HSPs, such as k_0 = (0, 0), (π, 0), and (π, π) in 2D; note that (0, π) and (π, 0) are indistinguishable in the sense that the curvature function takes the same value at these two points. As the system approaches the TPT, F(k_0, M) generally diverges, and the divergence flips sign as the system crosses the critical point M_c,

lim_{M→M_c^+} F(k_0, M) = −lim_{M→M_c^−} F(k_0, M) = ±∞.   (2)

Our aim is to construct a supervised ML scheme to identify the critical points M_c of TPTs in the D_M-dimensional parameter space. Certainly we could use the entire profile of the curvature function F(k, M) as the input data for ML, but this would be numerically expensive. The question then amounts to finding the minimal amount of data that can accurately predict M_c with the smallest numerical effort.
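To make the invariant-as-integral picture of Eq. (1) concrete, the following minimal sketch evaluates a 1D winding-number invariant as the Brillouin-zone integral of a curvature function, and exhibits the divergence of the curvature at the HSP k_0 = π near criticality. It assumes an SSH-type off-diagonal element Q_k = (t + δt) + (t − δt) e^{−ik} (the model introduced in Sec. II B); the function names are ours.

```python
import numpy as np

def curvature(t, dt, n=4096):
    """Curvature function F(k) = d(alpha_k)/dk for a two-band 1D model
    whose off-diagonal element is Q_k = |Q_k| e^{-i alpha_k}."""
    k = np.linspace(-np.pi, np.pi, n)
    Q = (t + dt) + (t - dt) * np.exp(-1j * k)
    alpha = -np.unwrap(np.angle(Q))          # continuous branch of alpha_k
    return k, alpha, np.gradient(alpha, k)   # F(k) by finite differences

def invariant(t, dt):
    """Topological invariant C = (1/2pi) * integral of F(k) over the BZ,
    i.e. the total winding of alpha_k, cf. Eq. (1)."""
    k, alpha, F = curvature(t, dt)
    return (alpha[-1] - alpha[0]) / (2 * np.pi)
```

For t = 1 the sketch gives C = 1 for δt < 0 and C = 0 for δt > 0, while the value of F at the HSP k_0 = π grows without bound and changes sign as δt crosses zero, mirroring the critical behavior of Eq. (2).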
Since the critical behavior described by Eq. (2) is a defining feature of the TPT, it motivates us to design an ML scheme that uses only the curvature function at the D + 1 distinguishable HSPs as input data. Our investigation suggests a supervised ML scheme that consists of the following steps: (1) In the training step, we seek a subspace M of the parameter space in which all the critical points M_c, and the corresponding HSPs k_0 at which the curvature function diverges, are known.
(2) We generate F(k_0, M) for several points in M and label them according to the value of the corresponding topological invariant. We use these data to train the neural network.
(3) Once the neural network is trained, for an unexplored point in the parameter space M, we generate F(k_0, M) at the same HSP as input data and ask the neural network to predict which phase this point belongs to. The procedure may be repeated to scan through the M space.
(4) Choose a different HSP to repeat the same procedure to ensure that all TPTs in the larger parameter space M have been captured.
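The four steps above can be sketched end to end on a toy version of the problem. The snippet below is an illustrative stand-in rather than the paper's actual pipeline: it uses the analytic curvature value at the HSP of a noninteracting SSH-type chain, F(π, δt, 0) = −(t − δt)/(2δt) (our assumption for this sketch), as the sole input feature, and a single logistic neuron in place of the full network of Appendix A.

```python
import numpy as np

rng = np.random.default_rng(0)

def F_pi(dt, t=1.0):
    # Curvature at the HSP k0 = pi; diverges and flips sign at the TPT dt = 0.
    return -(t - dt) / (2.0 * dt)

# Steps (1)-(2): labeled training data from the known subspace (here dt < 0
# is the nontrivial phase with C = 1). The raw curvature diverges at the TPT,
# so we squash it with tanh before feeding it to the classifier.
dt_train = rng.uniform(-1.0, 1.0, 512)
dt_train = dt_train[np.abs(dt_train) > 0.01]   # keep away from the singularity
phi = np.tanh(F_pi(dt_train))
label = (dt_train < 0).astype(float)

# Train a one-neuron logistic classifier by gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * phi + b)))
    w -= 0.1 * np.mean((p - label) * phi)
    b -= 0.1 * np.mean(p - label)

# Step (3): classify an unexplored parameter point from F(k0) alone.
def predict(F_value):
    p = 1.0 / (1.0 + np.exp(-(w * np.tanh(F_value) + b)))
    return int(p > 0.5)
```

Step (4) would repeat the same loop with the curvature at other HSPs, so that transitions occurring at different k_0 are not missed.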
This supervised ML scheme can easily be extended to studying interacting TIs. We choose the noninteracting limit, whose topology is often easier to determine, as the subspace M used to train the neural network, and ask the trained network to predict the situation when the interaction is turned on. Our approach assumes that the noninteracting system can indeed manifest nontrivial topology in parts of its parameter space, and that the interacting system is adiabatically connected to the same topological class as the noninteracting one, i.e., the interactions do not change the underlying nonspatial symmetries. Because the scheme relies only on the curvature function at the D + 1 distinguishable HSPs, it circumvents the tedious integration in Eq. (1) for the interacting cases and consequently serves as a very efficient tool for obtaining the phase diagram in the vast parameter space M. We now demonstrate the efficiency of our method by studying different interacting TIs.

B. Su-Schrieffer-Heeger model with nearest-neighbor interaction
We study the 1D Su-Schrieffer-Heeger (SSH) model in the presence of nearest-neighbor interaction to demonstrate the efficiency of the proposed supervised ML scheme. The noninteracting part of the Hamiltonian is given by

H_0 = Σ_i [ (t + δt) c†_{Ai} c_{Bi} + (t − δt) c†_{A,i+1} c_{Bi} + h.c. ] = Σ_k ( Q_k c†_{Ak} c_{Bk} + h.c. ),   (3)

where c_{Ii} is the spinless fermion annihilation operator on sublattice I = {A, B} at site i, t + δt and t − δt are the hopping amplitudes on the even and the odd bonds, respectively, and Q_k = (t + δt) + (t − δt) e^{−ik} after a Fourier transform. We consider the nearest-neighbor interaction 14,35

H_int = V Σ_i ( n_{Ai} n_{Bi} + n_{Bi} n_{A,i+1} ),   (4)

where n_{Ii} ≡ c†_{Ii} c_{Ii}, and the corresponding momentum-space vertex is V_q = V (1 + cos q). In the limit of weak interaction, the changes to the topology of the model can be described by renormalizing the Hamiltonian with self-energies calculated from Dyson's equation. 14 To one-loop order, the self-energies separate into Hartree and Fock contributions, where the phase α_k is defined by Q_k ≡ |Q_k| e^{−iα_k}. The Hartree terms Σ_AA and Σ_BB introduce a finite chemical potential that shifts the entire spectrum by −V. The Fock terms Σ_AB and Σ_BA modify the off-diagonal elements of the 2×2 Hamiltonian matrix in the sublattice space. The phase α̃_k of the modified off-diagonal element then reads

Q_k + Σ_AB(k) ≡ |Q_k + Σ_AB(k)| e^{−iα̃_k},   (6)

and the topological invariant is simply the winding number of this phase,

C = ∫_{−π}^{π} (dk/2π) ∂_k α̃_k.   (7)

The curvature function is thus F(k, δt, V) = ∂_k α̃_k, with the parameter space M = (δt, V). Note that in the noninteracting limit V = 0, the curvature function recovers the more familiar Berry connection. 14 To realize the ML scheme proposed in Sec. II A: since the noninteracting SSH model is known to go through a TPT via gap closing at k_0 = π, we use the curvature function F(π, δt, 0) at k_0 = π as the input data to train a neural network that consists of a single dense hidden layer, as indicated in Fig. 1 (a). The noninteracting V = 0 subspace M = (δt, 0) is used to train the neural network, as indicated by the colored lines in Fig. 1 (c). The details of the training procedure are given in Appendix A.
In accordance with the usual notation, the δt < 0 data is labeled as nontrivial with C = 1, and δt > 0 as trivial with C = 0. After the neural network is trained, we use it to predict the topology in the interacting case V ≠ 0 in the larger M = (δt, V) parameter space. For each M (darker colored areas in Fig. 1 (c)), we feed the curvature function at the same HSP, F(π, δt, V), to the network to obtain C. The resulting phase diagram shown in Fig. 1 (c) correctly captures the phase boundary between the C = 1 and the C = 0 phases, as can be verified against the results obtained from the curvature renormalization group (CRG) approach. 14 A comparison with Eq. (7) immediately points to the advantage of this ML scheme: it does not require an explicit calculation of the highly cumbersome integral in Eq. (8).

C. Interacting Chern insulators in 2D
We now apply our algorithm to study interacting TIs in two dimensions. To illustrate the power of the methodology, we consider two kinds of interactions: electronic interactions and electron-phonon interactions. In both cases, we find that the ML scheme predicts a complex phase diagram and the emergence of interaction-driven multicriticality.

1. Chern insulator with nearest-neighbor electronic interaction
The noninteracting Hamiltonian matrix of the Chern insulator takes the form H_0 = d(k) · σ in the (A, B) sublattice space for every momentum k, where

d(k) = ( sin k_x, sin k_y, M + cos k_x + cos k_y ).   (9)

For concreteness, we will examine a nearest-neighbor interaction of a form analogous to Eq. (4), with a corresponding momentum-space vertex V_q. The effect of the interaction is to modify the Green's function,

G(k, iω) = [ iω − H_0(k) − Σ(k) ]^{−1} ≡ [ iω − d′(k) · σ ]^{−1},   (11)

where the d-vector is renormalized,

d(k) → d′(k),   (12)

by the intra- and inter-sublattice self-energies, which generally depend on both momentum and energy. The precise form of the self-energies has been discussed previously in detail in Ref. 14. The topological invariant in terms of the full Green's function in this case reads 36,37

C = (ε^{abc}/24π²) ∫ dω d²k Tr[ G ∂_a G^{−1} G ∂_b G^{−1} G ∂_c G^{−1} ],   (13)

where ε^{abc} is the Levi-Civita tensor with {a, b, c} = {ω, k_x, k_y}, and G ≡ G(k, iω) is the interaction-dressed single-particle Green's function. Because the lowest-order self-energy is frequency-independent, Eq. (13) greatly simplifies to

C = (1/4π) ∫ d²k  d̂′ · ( ∂_{k_x} d̂′ × ∂_{k_y} d̂′ ),   (14)

where d̂′ = d′/|d′|. This form is similar to that of noninteracting 2D class A models, where it simply counts the skyrmion number of the self-energy-renormalized d′-vector. The integrand in Eq. (14) is then treated as the curvature function F(k, M) = F(k_x, k_y, M, V), with the mass term and interaction strength M = (M, V) forming a 2D parameter space. We again use a neural network with a single hidden layer to determine the topology in the interacting case, as indicated in Fig. 2 (a), where the curvature function at the three distinguishable HSPs is used as the input data. The noninteracting subspace M = (M, 0) is used to train the neural network; it hosts three critical points, corresponding to the divergence of the curvature function at the three distinguishable HSPs. 38 Once the neural network is trained, we use it to predict the interacting case V ≠ 0 in the larger parameter space M = (M, V), yielding the phase diagram shown in Fig. 2 (c), which correctly captures the three topological phases, as can be checked against the CRG result that has previously solved part of the phase diagram. 14 This again suggests that our ML scheme is a very efficient numerical tool, since it circumvents the cumbersome integration of Eq. (14).
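As a sanity check of the skyrmion-number form in Eq. (14), the short routine below evaluates it on a momentum grid in the noninteracting limit d′ = d. It assumes the standard Chern-insulator d-vector d(k) = (sin k_x, sin k_y, M + cos k_x + cos k_y) (our assumption for the model of this section); the grid size and function name are ours.

```python
import numpy as np

def chern_number(M, n=200):
    """Skyrmion number C = (1/4pi) * integral of d_hat . (dkx d_hat x dky d_hat)
    over the BZ, evaluated with periodic central differences on an n x n grid."""
    k = np.linspace(-np.pi, np.pi, n, endpoint=False)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    d = np.stack([np.sin(kx), np.sin(ky), M + np.cos(kx) + np.cos(ky)])
    dhat = d / np.linalg.norm(d, axis=0)             # unit d-vector field
    dk = k[1] - k[0]
    ddx = (np.roll(dhat, -1, axis=1) - np.roll(dhat, 1, axis=1)) / (2 * dk)
    ddy = (np.roll(dhat, -1, axis=2) - np.roll(dhat, 1, axis=2)) / (2 * dk)
    berry = np.einsum("ixy,ixy->xy", dhat, np.cross(ddx, ddy, axis=0))
    return berry.sum() * dk * dk / (4 * np.pi)
```

On this model the routine returns C ≈ 0 for |M| > 2 and |C| ≈ 1 for 0 < |M| < 2, with the sign flipping between the two nontrivial phases; the interacting case amounts to replacing d by the renormalized d′.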
An unexpected result unveiled by our ML method is the prediction of an interaction-driven multicritical point between the C = 1 and C = −1 phases, as indicated by the red star in Fig. 2 (c) where four regions meet. Although the precise location of this multicritical point and the phase boundaries surrounding it can be altered by higher-order self-energy corrections, our result suggests that many-body interactions can be a mechanism for the generation of multicritical TPTs. Such a feature has also been seen in the 1D Creutz model with Hubbard-type interaction. 39

FIG. 2. (a) The curvature function at zero frequency ω = 0 and the three inequivalent high-symmetry points k_0 = (0, 0), (0, π), (π, π) used as input data for the neural network, whose architecture is depicted in (b). The ML-predicted topological phase diagram for the interacting Chern insulators: (c) electron-electron interactions and (d) electron-phonon interactions. In both cases, the neural network is trained with noninteracting data (shaded lines at V = 0 in (c) and (d)), corresponding to the three inequivalent topological phases with C = 0, ±1. The method unveils the existence of multicritical points between the C = 1 and C = −1 phases, indicated by the red stars.

2. Chern insulator with electron-phonon interaction
As electron-phonon interactions are ubiquitous in real materials and can affect properties such as the transport of surface states, we now consider the impact of such interactions on the Chern insulator. [40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56] In particular, we consider the deformation potential coupling between an acoustic phonon mode and the spinless fermions of the form 57

H_ep = Σ_q ω_q a†_q a_q + Σ_{k,q} M_q c†_{k+q} c_k ( a_q + a†_{−q} ),   (15)

where a_q is the phonon annihilation operator, ω_q = v_s q is the phonon dispersion with sound velocity v_s, and M_q = u √q, with u a phenomenological coupling constant determined by the sound velocity, the electron-ion potential, and the ion density. The noninteracting part is that given by Eq. (9). The results for the corresponding self-energies are presented in Appendix B; they extend the calculation of the one-loop self-energies for optical phonons detailed in Ref. 14 to the case of acoustic phonons. We treat the mass term M and the electron-phonon coupling u in Eq. (15) as tuning parameters M = (M, u), and aim to find the TPTs in this 2D parameter space. A crucial difference from the case of electron-electron interaction in Sec. II C 1 is that here, even at the one-loop level, the self-energy depends on both momentum and frequency, K = (ω, k_x, k_y), and so does the curvature function F(K, M) (the integrand of the multidimensional integral in Eq. (13)). Consequently, the numerical integration of the topological invariant in Eq. (13) becomes even more tedious, especially given the unbounded frequency integration. Nevertheless, we find that close to TPTs, the curvature function at ω = 0 diverges and flips sign at the HSPs of the momentum k. In other words, the appropriate HSPs in this problem are given by K_0 = (ω, k_x, k_y) = (0, 0, 0), (0, π, 0), and (0, π, π), at which the critical behavior of F(K_0, M) follows that discussed in Sec. II A.
This critical behavior at zero frequency is reminiscent of the gap closure at the Fermi energy at typical quantum critical points, which is manifested in the spectral function A(k, ω) (detailed in Appendix B).
Our ML scheme becomes a powerful tool in this case, since it circumvents the momentum-frequency integration in Eq. (13), relying instead on the divergence of the curvature function. As in Sec. II C 1, we use the noninteracting limit in the absence of phonons, u = 0, as training data, and apply the ML scheme as illustrated in Fig. 2 (a)-(b). Fig. 2 (d) shows the phase diagram obtained by our ML scheme using the three distinct HSPs K_0 as input data. To check the validity of the results, we plot in Fig. 3 the spectral function across two representative TPTs (driven by either the mass term M or the electron-phonon coupling u) predicted by the ML scheme. The corresponding spectral functions clearly display a continuous closure and reopening of the gap at ω = 0, consistent with a continuous phase transition. This implies that the TPTs driven by the electron-phonon interaction and by the mass are both second-order transitions. To summarize, the phase diagram correctly captures all phases and phase boundaries, and moreover indicates the appearance of a multicritical point as a function of the coupling u around M = −2.0, indicating that electron-phonon interaction can also serve as a mechanism to induce multicriticality. Thus, many-body interactions are added to the list of recently uncovered mechanisms that can trigger topological multicriticality, including periodic driving or quantum walk protocols, [17][18][19]58 long-range hopping or pairing, 12,13,59 spin-orbit coupling, 11,60 topological insulator/topological superconductor hybridization, 61 as well as more complicated mechanisms in spin liquid 62 and toric code models. 63 We close this section by making a comparison between the CRG [8][9][10][11][12][13][14][15]17,18 and the ML scheme proposed here.
Though both methods have their advantages and disadvantages, the ML scheme is more efficient than the CRG for obtaining the phase diagram and the related invariants, while the CRG is more useful for extracting the critical exponents associated with the TPTs.

III. CONCLUSIONS
In summary, we propose a supervised machine learning scheme based on the divergence of the curvature function at high-symmetry points to rapidly identify different topological phases in interacting systems, thereby circumventing costly multidimensional integrations. The machine learning scheme consists of an artificial neural network that utilizes as input data D + 1 real numbers, representing the values of the curvature function at the D + 1 distinguishable HSPs in either momentum or momentum-frequency space. The strategy is to train the neural network with data from a subspace where the topological phases are known (typically the noninteracting case) and then use the trained neural network to predict the topology in a larger parameter space. Because the machine learning scheme circumvents the tedious multidimensional integration of topological invariants, especially in interacting systems, it is a highly efficient tool to map out the topology in a large parameter space regardless of the type of interaction and the dimension of the system, as demonstrated for several examples. The efficiency of this ML scheme also helps to quickly uncover the multicriticality caused by both the electron-electron and the electron-phonon interactions, where multiple topological phases join at a single point of the phase diagram, indicating that these many-body interactions serve as new mechanisms to generate multicritical TPTs. Though the results presented were based on first-order self-energy corrections, a valid approximation for weakly interacting systems, the proposed ML scheme can straightforwardly be extended to higher-order self-energy terms. The scheme is widely applicable to topological materials in any dimension and symmetry class, provided the topological invariant is defined from the integration of a local curvature.
Future directions include the study of strongly interacting TIs within the paradigm presented here in conjunction with numerical methods like exact diagonalization, 15 as well as the interplay of topology and symmetry-broken phases in interacting topological systems.

ACKNOWLEDGMENTS
The authors would like to thank Lode Pollet for useful discussions. This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 895439 'ConQuER'. W. C. acknowledges the financial support from the productivity in research fellowship of CNPq. P. M. acknowledges funding from the EPSRC Grant No. EP/P009565/1.

Appendix A: Neural network architecture and training

In this appendix, we give a brief overview of the neural network architecture and the training procedure used to obtain the phase diagrams of the interacting topological insulators discussed in the main text. The construction, training, and evaluation of the neural networks were implemented using TensorFlow. 64,65 For all of the results shown in the main text, we employed a neural network with a single fully-connected hidden layer and varying input and output layers depending on the dimensionality of the system and the number of phases in the topological phase diagram (input: a single neuron for the 1D SSH model, three neurons for the 2D Chern insulators; output: two neurons for the 1D SSH model, three neurons for the 2D Chern insulators). We employed a hidden layer with 10 neurons to generate the results presented in the main text, but we empirically found that the width of the hidden layer can be reduced to 2-3 neurons without significant performance reduction. As activation functions, we used a sigmoid for the hidden layer and a softmax for the output layer to obtain classification probabilities. To train the network, we used noninteracting data: 4096 points randomly distributed between δt/t = −1.0 and δt/t = 1.0 for the SSH model, and between M = −12.0 and M = 12.0 for the 2D Chern insulators, fed in batches of size 32. The training lasted for 50 epochs. The optimizer used during training was ADAM, and the loss function was the categorical cross-entropy.
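For readers without a TensorFlow installation, the network described above is small enough to reproduce in a few lines of NumPy. The sketch below mirrors the stated architecture for the 1D SSH case (one input neuron, a 10-neuron sigmoid hidden layer, a two-neuron softmax output, categorical cross-entropy, batches of 32, 50 epochs), but substitutes plain SGD for ADAM and uses the analytic curvature F(π, δt, 0) = −(t − δt)/(2δt) as input data; both are our simplifications.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noninteracting SSH training data: curvature at the HSP k0 = pi,
# squashed by tanh since it diverges at the critical point dt = 0.
dt = rng.uniform(-1.0, 1.0, 4096)
dt = dt[np.abs(dt) > 1e-3]
X = np.tanh(-(1.0 - dt) / (2.0 * dt))[:, None]     # inputs, shape (N, 1)
Y = np.eye(2)[(dt < 0).astype(int)]                # one-hot labels [C=0, C=1]

# Single fully-connected hidden layer (10 sigmoid units) + softmax output.
W1 = rng.normal(0.0, 0.5, (1, 10)); b1 = np.zeros(10)
W2 = rng.normal(0.0, 0.5, (10, 2)); b2 = np.zeros(2)

def forward(x):
    h = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))       # sigmoid hidden layer
    z = h @ W2 + b2
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return h, e / e.sum(axis=1, keepdims=True)     # softmax probabilities

lr, epochs, batch = 0.5, 50, 32                    # plain SGD instead of ADAM
for _ in range(epochs):
    order = rng.permutation(len(X))
    for i in range(0, len(X), batch):
        j = order[i:i + batch]
        h, p = forward(X[j])
        dz = (p - Y[j]) / len(j)                   # softmax cross-entropy grad
        dh = dz @ W2.T * h * (1.0 - h)
        W2 -= lr * (h.T @ dz); b2 -= lr * dz.sum(0)
        W1 -= lr * (X[j].T @ dh); b1 -= lr * dh.sum(0)

def predict(F_value):
    """Phase label from the curvature at the HSP: 1 means topological."""
    _, p = forward(np.array([[np.tanh(F_value)]]))
    return int(p[0].argmax())
```

After training, feeding the network the curvature value of an unexplored parameter point returns its predicted phase label, exactly as in step (3) of the scheme in Sec. II A.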
Appendix B: Self-energy of Chern insulator with electron-phonon interaction

For the Chern insulator with electron-phonon interactions discussed in Sec. II C 2, in the zero-temperature limit T → 0, taking the Bose distribution N_0 = 0 and the Fermi distribution n_F(x) = θ(−x), the self-energies depend on both momentum and the Matsubara frequency iω_n. We then replace the Matsubara frequency by a continuous one, iω_n → iω, in the calculation of the curvature function. The topological invariant is again given by Eq. (13), whose integrand can be expressed in terms of the d-vector of Eq. (12), with K = (ω, k_x, k_y). Note that in the noninteracting limit d′ → d, only the first term in Eq. (B2) survives, which recovers the Berry connection F(k, M) in the integrand of Eq. (14) after a frequency integration. On the other hand, when calculating the spectral function A(k, ω) = −(1/π) Im Tr G_ret(k, ω), we use the retarded version G_ret(k, ω) of the interacting Green's function in Eq. (11), obtained through the analytical continuation iω → ω + iη.
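The spectral function used in this appendix can be illustrated with a minimal two-band computation. The sketch below evaluates A = −(1/π) Im Tr G_ret for a d·σ Bloch Hamiltonian at a single HSP, assuming the QWZ-type d-vector (sin k_x, sin k_y, M + cos k_x + cos k_y) for Eq. (9) (our assumption), so that at k_0 = (π, π) the gap in A(k_0, ω) closes as M → 2; the broadening η and the function names are ours.

```python
import numpy as np

# Pauli matrices.
SIG = (np.array([[0, 1], [1, 0]], complex),
       np.array([[0, -1j], [1j, 0]], complex),
       np.array([[1, 0], [0, -1]], complex))

def spectral_function(dvec, omega, eta=0.05):
    """A(k, w) = -(1/pi) Im Tr G_ret(k, w), with the retarded Green's
    function G_ret = [(w + i eta) - d . sigma]^{-1}."""
    H = sum(d * s for d, s in zip(dvec, SIG))
    G = np.linalg.inv((omega + 1j * eta) * np.eye(2) - H)
    return -np.trace(G).imag / np.pi

def A_at_pipi(M, omega):
    # At the HSP (pi, pi) the assumed d-vector reduces to (0, 0, M - 2):
    # quasiparticle peaks sit at w = +/- |M - 2| and merge at w = 0 when M = 2.
    return spectral_function((0.0, 0.0, M - 2.0), omega)
```

Scanning M through 2 at fixed ω = 0 reproduces the continuous gap closure and reopening visible in the spectral functions of Fig. 3.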