SciPost Submission Page
Detecting Nematic Order in STM/STS Data with Artificial Intelligence
by Jeremy B. Goetz, Yi Zhang, Michael J. Lawler
- Published as SciPost Phys. 8, 087 (2020)
As Contributors: Michael Lawler
Arxiv Link: https://arxiv.org/abs/1901.11042v3
Date submitted: 2020-05-13 02:00
Submitted by: Lawler, Michael
Submitted to: SciPost Physics
Subject area: Condensed Matter Physics - Computational
Detecting the subtle yet phase defining features in Scanning Tunneling Microscopy and Spectroscopy data remains an important challenge in quantum materials. We meet the challenge of detecting nematic order from local density of states data with supervised machine learning and artificial neural networks for the difficult scenario without sharp features such as visible lattice Bragg peaks or Friedel oscillation signatures in the Fourier transform spectrum. We train the artificial neural networks to classify simulated data of isotropic and anisotropic two-dimensional metals in the presence of disorder. The supervised machine learning succeeds only with at least one hidden layer in the ANN architecture, demonstrating it is a higher level of complexity than nematic order detected from Bragg peaks which requires just two neurons. We apply the finalized ANN to experimental STM data on CaFe2As2, and it predicts nematic symmetry breaking with 99% confidence (probability 0.99), in agreement with previous analysis. Our results suggest ANNs could be a useful tool for the detection of nematic order in STM data and a variety of other forms of symmetry breaking.
Author comments upon resubmission
We thank the referee for their suggestions and comments. We have revised the manuscript accordingly and respond as follows:
(1) Upon coarse-graining the pixels, the authors write that the confidence of the ANN gradually wanes. Is this also seen in the data from the non-interacting model? Or does one have to include disorder that changes over a scale of a dozen lattice sites to reproduce this?
Following the referee's question, we tested this and indeed observed a similar waning of confidence on the non-interacting-model data once the coarse-graining length scale deviates significantly from that of the training set. We have included the corresponding ANN results in the Appendix and added a sentence connecting them to the observation in Fig. 4.
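To make the coarse-graining operation concrete, here is a minimal Python sketch, assuming it amounts to averaging non-overlapping blocks of pixels (the manuscript's exact procedure may differ). The function name `coarse_grain` and the toy 4x4 map are hypothetical.

```python
def coarse_grain(ldos, block):
    """Average non-overlapping block x block patches of a 2D map (list of rows)."""
    rows, cols = len(ldos), len(ldos[0])
    if rows % block or cols % block:
        raise ValueError("map dimensions must be multiples of the block size")
    out = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            # Collect every pixel of the current patch and take its mean.
            patch = [ldos[r + i][c + j] for i in range(block) for j in range(block)]
            out_row.append(sum(patch) / len(patch))
        out.append(out_row)
    return out


# Hypothetical toy map: stripes that alternate from row to row.
ldos = [
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
coarse = coarse_grain(ldos, 2)
print(coarse)
```

In this toy map the stripes alternate every pixel along one direction, so a 2x2 block average returns a uniform map: the directional information is gone once the block size reaches the stripe period. This only illustrates how coarse-graining can erase anisotropic detail; it does not reproduce the Appendix results.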
(2) One of the main conclusions of this work is that the ANN remains successful in distinguishing nematic from symmetric samples even when standard approaches fail, say in case of strong disorder. This is remarkable. Imagine a sample with very strong disorder. Then, and this is a typical argument for disordered systems, it is impossible to say whether this sample was generated from phase A or from phase B in case the disorder distribution is generic (in which case one can continuously change the parameters of the disorder distribution, and hence generate the sample from either phase with equal probability). I, therefore, do not understand that the ANN can learn something that mathematically is not supposed to exist. And so it is fair to ask if the disorder modeling is sufficiently generic in this paper: it is written that δt and δμ are constant and isotropic, i.e., they do not make a further distinction between symmetric and nematic phases. If one makes these distributions broad and anisotropic, do the conclusions of this work still hold?
Indeed, in the ultra-strong-disorder limit, no physical distinction whatsoever remains between the phases with different symmetries, and the ANN will not help. On the other hand, the distinctions may persist when the disorder strength is below a certain threshold, though the information is likely blurred and hidden, as in the example of Figs. 7 and 8, where averaging over a large number of configurations reveals the existing yet hidden information. Such data is where the machine-learning-based method shines.
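As a toy illustration of why averaging over many configurations can reveal information that a single noisy sample hides, the sketch below assumes each disorder configuration yields a scalar anisotropy estimate equal to a small true value plus Gaussian noise. This noise model, the numbers, and the name `measured_anisotropy` are assumptions for illustration and are not taken from the manuscript.

```python
import random
import statistics


def measured_anisotropy(true_anisotropy, noise_sigma, rng):
    """One disorder configuration: true signal plus Gaussian noise (toy assumption)."""
    return true_anisotropy + rng.gauss(0.0, noise_sigma)


rng = random.Random(0)      # fixed seed so the run is reproducible
true_anisotropy = 0.1       # hypothetical weak signal
noise_sigma = 1.0           # hypothetical disorder-induced scatter, 10x the signal

# Estimates from many independent disorder configurations.
samples = [measured_anisotropy(true_anisotropy, noise_sigma, rng) for _ in range(10000)]
averaged = statistics.mean(samples)
standard_error = statistics.stdev(samples) / len(samples) ** 0.5

print(f"single sample: {samples[0]:+.3f}")
print(f"average over {len(samples)} samples: {averaged:+.3f} +/- {standard_error:.3f}")
```

The standard error shrinks as one over the square root of the number of configurations, so a signal ten times smaller than the single-sample scatter becomes resolvable after about 10^4 samples. The sketch does not model the ultra-strong-disorder limit, where the distinction itself disappears and no amount of averaging helps.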
In the revised manuscript, we have added results in the Appendix on ANN outputs in the presence of even stronger disorder, where the distinctions between the phases indeed vanish. We have also included a discussion on the generality of the proposed AI perspective in the concluding remarks.
Also, we did not include anisotropy in the disorder distributions, since doing so would make it difficult to attribute the origin of any detected anisotropy to the underlying symmetry of the pristine model, which labels the physical phases and is our original and essential target.
(3) On p8 the authors write: "In comparison, machine learning approaches are base upon the original real-space LDOS data and thus may look beyond merely the Fermi wave vectors." (note the typo: base --> based). But what do the authors think that distinguishes the two cases? Looking at Fig 9d (please add the color scale to fig 9c), Fig 7d, and Fig 6d, it seems that there is a difference between the x and y direction for k-values in the range 2 to 3. Is this true, and if so, can it be explained? -- this amounts to the fact that the ANN output is difficult to interpret for humans.
We thank this referee for pointing out our typo. We have corrected the typo and added a color scale to Fig. 9c. We also agree with the referee on the existence of various differences between the x and y directions.
However, it is still hard to formulate a universal criterion that applies more generally, especially since the isotropic cases (e.g., Figs. 6c and 7c) also show, in principle, differences between the two directions due to the randomness of the quenched disorder. Also, although the peaks are relatively well-defined in Fig. 6d (where Friedel oscillations are present), they are much less apparent in Fig. 7d due to limited resolution and cleanness. Fig. 9d adds a further complication: the peaks lie in nearly identical locations because k_F is similar in both directions, while the anisotropy is in v_F instead.
Honestly, we do not yet have a solid grasp on what the ANN uses to distinguish the two phases. Understanding how ANNs work is an evolving frontier, and at this moment there is no well-controlled method to interpret an ANN analytically in general.
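For comparison, a hand-crafted measure of the x/y difference discussed in point (3) could look like the following Python sketch, which compares Fourier power at the same wave number along the two axes. The functions `fourier_power` and `anisotropy`, the normalization, and the 8x8 test map are assumptions for illustration; this is not the ANN's decision rule.

```python
import cmath
import math


def fourier_power(ldos, kx, ky):
    """|F(kx, ky)|^2 of a 2D map by direct DFT; rows index y, columns index x."""
    ny, nx = len(ldos), len(ldos[0])
    total = 0j
    for y in range(ny):
        for x in range(nx):
            phase = -2j * math.pi * (kx * x / nx + ky * y / ny)
            total += ldos[y][x] * cmath.exp(phase)
    return abs(total) ** 2


def anisotropy(ldos, k):
    """Normalized difference of Fourier power at wave number k along x and along y."""
    px = fourier_power(ldos, k, 0)
    py = fourier_power(ldos, 0, k)
    if px + py == 0.0:
        return 0.0  # no weight at this wave number in either direction
    return (px - py) / (px + py)


# Hypothetical clean map: a modulation along x only, two periods across 8 pixels.
n = 8
ldos = [[math.cos(2 * math.pi * 2 * x / n) for x in range(n)] for y in range(n)]
transposed = [list(row) for row in zip(*ldos)]  # same modulation, now along y

a_x = anisotropy(ldos, 2)
a_y = anisotropy(transposed, 2)
print(a_x, a_y)
```

On a clean stripe pattern the measure saturates at plus or minus one depending on the stripe direction. As the response notes, however, quenched disorder gives isotropic samples nonzero values too, and an anisotropy in v_F at nearly equal k_F does not shift the peak positions, so a single-wave-number comparison of this kind cannot serve as a general criterion.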
(4) It is not entirely clear to me whether the modeling by a non-interacting model is sufficient in case no Bragg peaks or Friedel oscillations are observed.
We agree with this referee that, ideally, the training should consist of more generic models, including various interacting models and non-interacting models with broader settings for more rigorous and practical applications, which is beyond the scope of the current work. Instead, we focus on simple illustrations that the ANN indeed offers a useful perspective in certain situations, e.g., no Bragg peaks or Friedel oscillations, which were previously found rather difficult.
We have included a discussion on generality and future roadmap in the last section.
List of changes
* We have included the new ANN results on scaling in the Appendix and added a sentence connecting them to the observation in Fig. 4.
* We added new ANN results on disorder effects in the Appendix and a discussion on the generality of the proposed AI perspective in the concluding remarks.
* We corrected the typo (base --> based) on p8 and added a color scale to Fig. 9c.
* We have included a discussion on generality of our approach and future roadmap in the last section.
Reports on this Submission
Anonymous Report 1 on 2020-5-18 (Invited Report)
The authors answered my questions in a satisfactory way. I see no further reason to hold up this paper.