SciPost Submission Page
A supervised learning algorithm for interacting topological insulators based on local curvature
by Paolo Molignini, Antonio Zegarra, Evert van Nieuwenburg, R. Chitra, and Wei Chen
This is not the latest submitted version.
Submission summary
Authors (as registered SciPost users): Paolo Molignini · Everard van Nieuwenburg

Submission information
Preprint Link: scipost_202105_00007v1 (pdf)
Date submitted: 2021-05-05 20:24
Submitted by: Molignini, Paolo
Submitted to: SciPost Physics

Ontological classification
Academic field: Physics
Specialties:
Approaches: Theoretical, Computational
Abstract
Topological order in solid state systems is often calculated from the integration of an appropriate curvature function over the entire Brillouin zone. At topological phase transitions where the single particle spectral gap closes, the curvature function diverges and changes sign at certain high symmetry points in the Brillouin zone. These generic properties suggest the introduction of a supervised machine learning scheme that uses only the curvature function at the high symmetry points as input data. We apply this scheme to a variety of interacting topological insulators in different dimensions and symmetry classes, and demonstrate that an artificial neural network trained with the noninteracting data can accurately predict all topological phases in the interacting cases with very little numerical effort. Intriguingly, the method uncovers a ubiquitous interaction-induced topological quantum multicriticality in the examples studied.
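For orientation, here is a minimal sketch of the scheme the abstract describes, applied to a toy SSH-like chain. The model, labels, and all hyperparameters are illustrative assumptions, not taken from the manuscript: the curvature function is evaluated only at the high-symmetry points k = 0 and k = π, labelled by the winding number, and fed to a deliberately small network.

```python
import numpy as np
import torch
import torch.nn as nn

def curvature(k, t, tp):
    # Berry-connection integrand of a toy SSH-like chain with intra-cell
    # hopping t and inter-cell hopping tp; its Brillouin-zone integral over
    # 2*pi gives the winding number (1 for tp > t, else 0). Near the
    # transition tp = t it diverges at k = pi, the signature the scheme uses.
    return (tp**2 + t * tp * np.cos(k)) / (t**2 + tp**2 + 2 * t * tp * np.cos(k))

# Training data: curvature at the two HSPs for random noninteracting
# couplings, labelled directly by the known topological invariant.
rng = np.random.default_rng(0)
t, tp = rng.uniform(0.1, 2.0, (2, 2000))
X = torch.tensor(np.stack([curvature(0.0, t, tp),
                           curvature(np.pi, t, tp)], axis=1), dtype=torch.float32)
y = torch.tensor((tp > t).astype(np.int64))

# A deliberately small network: two inputs (one per HSP), one hidden layer.
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(300):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

# In the paper's setting, the trained network would then be fed HSP curvature
# values from a (more expensive) interacting calculation to predict the phase.
print(f"final training loss: {loss.item():.4f}")
```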
Reports on this Submission
Report #2 by Anonymous (Referee 1) on 2021-6-13 (Invited Report)
- Cite as: Anonymous, Report on arXiv:scipost_202105_00007v1, delivered 2021-06-13, doi: 10.21468/SciPost.Report.3055
Strengths
1 - A novel and computationally efficient approach to phase classification of topological insulators (qualifies as an exciting computational discovery)
2 - Uses the generalisation that neural networks provide in a meaningful way: training on computationally accessible data (the noninteracting case) and validating and testing on computationally non-trivial data (the interacting cases)
3 - The method is presented in a clear and understandable way and exemplified on a range of different examples
Weaknesses
1 - The authors advertise their method as the most 'minimal' in terms of input data. I believe this invites potential discussion and criticism: the size of the input they use is indeed small, but one has to know a lot about a specific model to calculate such inputs. One could easily argue that post-processed experimental data inputs are more 'minimal', since one can compute, say, a small number of relevant correlations directly from the data without knowing anything about the model. The presented method has a lot of strengths, but the 'minimality' as currently formulated is, in my opinion, questionable and context-dependent.
2 - The authors use rather vague language when describing the training procedure in Section IIA; it is extremely hard to understand how specifically they label the data and create the training set (concrete suggestions below)
3 - I did not find a link to an open-access repository with the code. Given that the results here are purely computational, I believe it is appropriate to provide the code (ideally the model itself, pre-trained checkpoints, and a small data set to test the model on)
Report
In "A supervised learning algorithm for interacting topological insulators based on local curvature" the authors build the phase classification algorithm for topological insulators. The authors use only the value of the curvature function at the high-symmetry points in momentum space and show that this information is sufficient to distinguish the topological phases. Moreover, they show that algorithm trained on easily computable non-interacting data generalises to interacting cases where the curvature function would be expensive to calculate.
I believe the paper meets the Acceptance Expectation 1: computational discovery. Authors formulate novel approach to phase classification with an emphasis of using physics knowledge to compress the training set and the size of the model. Such ventures are particularly meaningful detour from trying to format physics data as input to the traditional image recognition classifiers. Finding minimal representations and effective models yields computational advantage in certain cases (as authors also show) and more readily interpretable models.
Regarding the general acceptance criteria:
1 - The paper is nicely written, though, as pointed out above, there are specific points that require further explanation for the sake of clarity
2 - The abstract and introduction clearly situate the problem the authors are trying to solve in the context of the field, and both are readable and clear (modulo one overly long abstract sentence I point out below)
3 - The authors do provide sufficient details (modulo specific formulations in the algorithm structure); I believe I would be able to reproduce the results from the description given. However, I would encourage the authors to support their verbal descriptions with the code.
4 - The citation list appears to be exhaustive and complete
5 - Again, the description is mostly sufficient (though some hyperparameters, such as the training step size and the activation functions, are missing), but code would be better
6 - The conclusion, summary and outlook are clear.
Overall, I find that this paper will meet the SciPost acceptance criteria if the modifications I list below are made. It is very concisely written, and the method presented is a smart solution to a complex problem. The results provide interesting and relevant insights into topological phase classification, and I believe they are transferable to other scientists who are building efficient classifiers.
Requested changes
1 - Abstract: The sentence starting "We apply this scheme to a variety..." is too long and hard to follow. Please split it in two.
2 - Introduction, paragraph 1: In the part starting "Through analysing the divergence...", there are two follow-up sentences starting with "This includes..." and "This forms..."; as a reader, I lost track of what "this" refers to at that point in the text. Please reformulate.
3 - Introduction, paragraph 2: I have reservations about the "minimal amount of data" statement. As mentioned above, in different contexts this means different things, and one could argue that a minimal amount of data is, for example, something you can extract from experiment without having to calculate it from exhaustive knowledge of the Hamiltonian and wave function. I suggest you continue to emphasise that your input has a small size (which is a great merit) but do not claim it is in some sense minimal compared to other approaches (the "In contrast to these methods" sentence). I would also suggest that the simplicity of training and the follow-up generalisation are much more important and should be more front and centre in this paragraph than the input size.
4 - Machine Learning Topological..., Section A, paragraph 2: Please make your point-by-point summary of the training process significantly more concrete. "We seek a subspace": do you seek it in order to create the training set, or is that what the NN does during training? (I assume the former, but it is not clear from the current statement.) "We label them according to": is the label the value of the topological invariant itself, or some function of it? "According to" may mean many things; please be very specific here. I think in point (3) you describe how you generate the validation set; please make that explicit and do not talk about "asking the neural network". How is point (4) different from (3)? Is it a continuation of the validation-set generation, or are you doing something conceptually different? Finally, may the interactions in principle already enter during points (3) and (4)? If yes, please specify that here, since the generalisation is an important point of your paper. (One hypothetical concrete reading of these steps is sketched in the first code block after this list.)
5 - Machine Learning Topological..., Section B, below Eq. (8): since there are no space limitations, I suggest you write out F(k, \delta t, V) explicitly. You spend a lot of time discussing these functions in the theoretical description of the algorithm; now that there is a concrete example, it is helpful to be really explicit about what this function is. (It is indeed visible from Eqs. (7) and (8), but it still requires the reader to notice it without any help.)
6 - Figure 2: I find the red-star notation a bit aggressive and not particularly useful for the message of the plot. I imagine that something more modest, such as a diagonal black cross, would be equally expressive but more visually appealing.
7 - Machine Learning Topological..., Section B, page 5, left column, paragraph 2: again, you say that $\tilde{F}(K,M)$ is the integrand of a large integral from two pages earlier; I believe it would be better to write the expression out.
8 - Figure 3: I am a bit confused by the very thick boundary; it makes the ticks hard to identify. Is there a way to make it thinner? Does it make sense to colour-distinguish the curves, or is there a specific reason why all masses have the same colour? I would suggest a non-rainbow colouring (for example, shades of blue) so that the curves are easier to distinguish. (A possible implementation is given in the second code block after this list.)
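To illustrate what a fully concrete description could look like, here is one hypothetical reading of the training steps questioned in point 4, reusing the toy chain from the sketch near the top of this page. None of these choices are confirmed by the manuscript; they simply show the level of specificity being requested.

```python
import numpy as np

# Toy curvature function and its two high-symmetry points, as above.
curv = lambda k, t, tp: (tp**2 + t*tp*np.cos(k)) / (t**2 + tp**2 + 2*t*tp*np.cos(k))
hsps = (0.0, np.pi)
rng = np.random.default_rng(1)

# (1) "Seek a subspace": choose, by hand, a noninteracting parameter region
#     that contains both phases; this defines the training set and is not
#     something the network discovers on its own.
train_params = rng.uniform(0.1, 2.0, size=(1000, 2))      # rows: (t, tp)

# (2) "Label according to": here the label is literally the value of the
#     topological invariant (the winding number), not a function of it.
X_train = np.array([[curv(k, t, tp) for k in hsps] for t, tp in train_params])
y_train = np.array([int(tp > t) for t, tp in train_params])

# (3) Validation set: fresh noninteracting parameter points the network has
#     never seen, labelled the same way.
val_params = rng.uniform(0.1, 2.0, size=(200, 2))

# (4) Test set: this is where interactions could already enter; the HSP
#     curvature would come from an interacting (e.g. Green's-function)
#     calculation, while the trained network itself stays unchanged.
```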
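A possible implementation of the colouring suggested in point 8, with dummy curves and placeholder mass values (nothing here is taken from the manuscript's actual data):

```python
import numpy as np
import matplotlib.pyplot as plt

masses = np.linspace(0.5, 2.0, 6)                          # placeholder masses
colors = plt.cm.Blues(np.linspace(0.4, 1.0, len(masses)))  # sequential blues

x = np.linspace(0, 1, 200)
fig, ax = plt.subplots()
for m, c in zip(masses, colors):
    ax.plot(x, np.exp(-m * x), color=c, label=f"M = {m:.1f}")  # dummy curves
for spine in ax.spines.values():   # thin frame so the ticks stay readable
    spine.set_linewidth(0.8)
ax.legend()
plt.show()
```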
Report #1 by Anonymous (Referee 2) on 2021-6-3 (Invited Report)
- Cite as: Anonymous, Report on arXiv:scipost_202105_00007v1, delivered 2021-06-03, doi: 10.21468/SciPost.Report.3017
Strengths
1- The authors demonstrate how a trained machine learning algorithm can be used to predict the topological invariant of interacting models
2- The authors show that a sparse amount of data suffices to successfully predict the topological invariant, making the method computationally very efficient
3- A minimal neural-network architecture is shown to have great efficiency for the task presented
Report
The authors propose an algorithm based on supervised learning to predict the topological invariant of both single-particle and interacting systems. In particular, they show that by using a sparse amount of data, namely the curvature function at high-symmetry points, as the training set for a neural network, the trained algorithm is capable of computing the topological invariant of the system. The authors further show that such a trained algorithm can also be applied to many-body data, providing a highly accurate prediction of the topological invariant.
I found the authors' manuscript very interesting and a very nice demonstration of how a supervised learning algorithm can be used to substantially speed up expensive calculations, such as those of interacting topological invariants. I believe that their results are correct and of great interest to the wide communities of topological matter and machine learning. For those reasons, I strongly recommend the publication of their manuscript in SciPost Physics once the minor comment below is addressed.
In their manuscript, the authors focus on models in which the topological phase transition happens via a gap closing at the high-symmetry points. While this is often the case, I would like to point out that gap closings can also happen elsewhere in the Brillouin zone. Perhaps a simple example would be a single-orbital model on the honeycomb lattice in which the hoppings in the x-direction are weaker than the others. In this situation, the Dirac points appear somewhere between the K and M points, and thus small perturbations opening trivial/topological gaps would lead to curvature functions peaked away from the high-symmetry points. I think it could be interesting for the authors to comment on whether their algorithm, trained with HSP data, would work in this case, or whether one would instead need to retrain it with a more diverse set of k-points. This would, of course, not be an issue at all, yet I think it could be of great interest to readers to know how large the training set should be for a generic case. The authors do not need to show any new calculations in this regard, but rather just comment on what the result would be based on the findings of their manuscript. (A small numerical illustration of such a shifted gap closing is given at the end of this report.)
To summarize, I believe that their work is of great interest to the readership of SciPost Physics. Therefore, I strongly recommend the publication of their manuscript once the authors briefly comment on the point mentioned above.
Requested changes
1- It could be interesting to mention how important it is that the gap closings happen at the high-symmetry points for this algorithm to be successful.
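To make the honeycomb example above concrete, the sketch below numerically locates the gap-closing momentum of a single-orbital honeycomb model with one weakened hopping. The conventions and parameter values are assumptions chosen for illustration, not taken from the report or the manuscript.

```python
import numpy as np

def gap(kx, ky, t1, t2, t3):
    # Direct gap of the two-band honeycomb model with nearest-neighbour
    # vectors d1 = (0, 1), d2 = (-sqrt(3)/2, -1/2), d3 = (sqrt(3)/2, -1/2)
    # (standard graphene conventions, bond length set to 1).
    deltas = ((0.0, 1.0), (-np.sqrt(3) / 2, -0.5), (np.sqrt(3) / 2, -0.5))
    f = sum(t * np.exp(1j * (kx * dx + ky * dy))
            for t, (dx, dy) in zip((t1, t2, t3), deltas))
    return 2 * np.abs(f)

# Scan momentum space on a grid with one hopping weakened (t1 = 0.6).
kx, ky = np.meshgrid(np.linspace(-np.pi, np.pi, 801),
                     np.linspace(-np.pi, np.pi, 801))
g = gap(kx, ky, t1=0.6, t2=1.0, t3=1.0)
i = np.unravel_index(np.argmin(g), g.shape)
print(f"minimum gap {g[i]:.3e} at k = ({kx[i]:.3f}, {ky[i]:.3f})")
# For t1 = t2 = t3 the gap closes at the K points; as t1 deviates from the
# other hoppings, the gap-closing momenta move away from K, so HSP-only
# input data would miss the critical behaviour near this shifted closing.
```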