SciPost Submission Page
A Normalized Autoencoder for LHC Triggers
by Barry M. Dillon, Luigi Favaro, Tilman Plehn, Peter Sorrenson, Michael Krämer
This is not the latest submitted version.
This Submission thread is now published.
Submission summary
Authors (as registered SciPost users): Barry Dillon · Luigi Favaro · Tilman Plehn

Submission information
Preprint Link: https://arxiv.org/abs/2206.14225v1 (pdf)
Date submitted: 2022-07-14 13:23
Submitted by: Favaro, Luigi
Submitted to: SciPost Physics

Ontological classification
Academic field: Physics
Specialties:
Abstract
Autoencoders are the ideal analysis tool for the LHC, as they represent its main goal of finding physics beyond the Standard Model. The key challenge is that out-of-distribution anomaly searches based on the compressibility of features do not apply to the LHC, while existing density-based searches lack performance. We present the first autoencoder which identifies anomalous jets symmetrically in the directions of higher and lower complexity. The normalized autoencoder combines a standard bottleneck architecture with a well-defined probabilistic description. It works better than all available autoencoders for top vs QCD jets and reliably identifies different dark-jet signals.
Reports on this Submission
Strengths
1- Well-written, pedagogic introduction to autoencoders
2- First (to the best of my knowledge) practical application to jet physics and QCD
Weaknesses
1- Examples are too simple
Report
Given the long time this paper has been stalled, I suggest publication.
Requested changes
None
Report #1 by Anonymous (Referee 4) on 2022-12-31 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2206.14225v1, delivered 2022-12-31, doi: 10.21468/SciPost.Report.6410
Report
This paper proposes a new application of unsupervised machine learning to anomaly detection at the LHC. In particular, the Normalized Autoencoder is introduced and studied in a few cases. The topic is important/timely, the work presented in the paper is serious, and the manuscript is well written. I would be happy to recommend publication in SciPost Physics following some comments below (one major and the rest minor).
Major
- It seems like the main motivation for this work is a method that is symmetric in the following sense: if I have datasets A and B, then p_A(x) is small if and only if p_B(x) is small. I'm not sure this is a necessary motivation and I do not fully understand it. The authors claim that it is a problem that other methods find that QCD jets do not have a low p_top while top jets have a low p_QCD. I don't see the issue - it could be that top jets look much less like a typical QCD jet than QCD jets look like typical top jets. In fact, since the NAE is symmetric, I am suspicious that it has not found the true data probability density.
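(As a toy illustration of the asymmetry being debated here, not part of the report: take two nested one-dimensional Gaussians standing in for a broad class A and a narrow class B. The widths, threshold, and sample sizes below are arbitrary illustrative choices.)

```python
import numpy as np
from scipy.stats import norm

# Toy densities: p_A is broad, p_B is narrow and nested inside A.
p_A = norm(loc=0.0, scale=5.0)
p_B = norm(loc=0.0, scale=1.0)

rng = np.random.default_rng(0)
x_A = p_A.rvs(size=100_000, random_state=rng)  # typical A events
x_B = p_B.rvs(size=100_000, random_state=rng)  # typical B events

# How often is a typical event of one class very unlikely under the other?
thr = 1e-6
print("A events with p_B < thr:", (p_B.pdf(x_A) < thr).mean())  # sizeable fraction: many A events look nothing like B
print("B events with p_A < thr:", (p_A.pdf(x_B) < thr).mean())  # essentially zero: every B event stays plausible under A
```

In such a configuration "p_A(x) is small iff p_B(x) is small" fails by construction, which is essentially the point above that top jets can look far less QCD-like than QCD jets look top-like.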
Minor
Abstract:
- "Autoencoders are the ideal analysis tool for the LHC" -> "Autoencoders are an effective analysis tool for the LHC"? (see also a similar line in the Outlook)
- "its main goal of finding physics beyond the Standard Model" -> "one of its main"?
Introduction:
- "that no assumption should" -> "that few assumptions should" ? Even in your case, you have a preselection and use certain features and these choices build in some assumptions.
- "The problem with these studies is that it is not clear what the anomalous property of jets or events actually means...." -> I did not understand this paragraph. If a compression algorithm is doing its job, shouldn't it tag as anomalous events with low density? (e.g. should assigm more capacity to common events and less capcity = poor reconstruction to rare events?)
- "which is smaller enough to run on an LHC trigger" -> I don't think you actually address this? (see also a similar line in the outlook)
- "Although the jets we study in the paper would pass trigger selection cuts already, the results still demonstrate the utility of this technique on a trigger." -> seems to contradict itself?
Network and dataset:
- "By using the reconstruction error as the energy, the model will learn to poorly reconstruct
inputs not in the training distribution" -> this is perhaps the most important paragraph in the whole paper and I think it could be improved for clarity. Do I understand that the point you are making is that the NAE training sees examples from the model as well as from the data, so it has some "experience" with out of distribution examples (e.g. when the model starts far from the data)? If that is correct, why not try to do this more directly instead of relying on the model being bad? I'm sorry if I have not understood correctly (and in that case, please help me by rewording!)
- Preprocessing: does this remove important/useful physics information? (I know you study the performance for n later, but this is unsupervised, so it is possible that performance gets better/worse "by accident"?) See e.g. https://arxiv.org/abs/1511.05190.
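(For concreteness, the sketch below shows the kind of training loop the question above refers to, in the general spirit of a normalized autoencoder: the reconstruction error defines an energy, and the loss contrasts data with negative samples drawn from the model itself. The layer sizes, Langevin sampler, and hyperparameters are illustrative assumptions, not the authors' implementation.)

```python
import torch
import torch.nn as nn

# Illustrative bottleneck autoencoder; input dimension and layer sizes are arbitrary.
encoder = nn.Sequential(nn.Linear(1600, 128), nn.ReLU(), nn.Linear(128, 8))
decoder = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 1600))

def energy(x):
    # Reconstruction error used as the energy E(x).
    return ((decoder(encoder(x)) - x) ** 2).mean(dim=1)

def sample_model(batch_size, steps=20, step_size=10.0, noise=0.005):
    # Negative samples from the model density ~ exp(-E(x)), via short-run Langevin dynamics.
    x = torch.rand(batch_size, 1600, requires_grad=True)
    for _ in range(steps):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = (x - step_size * grad + noise * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)

def training_step(x_data):
    x_model = sample_model(x_data.shape[0])
    # Pull the energy down on data and push it up on the model's own samples,
    # so inputs away from the training distribution end up poorly reconstructed.
    loss = energy(x_data).mean() - energy(x_model).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

The negative phase is what gives the network some "experience" with off-distribution inputs during training, without ever using labelled anomalies.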
QCD vs top jets:
- Fig. 4: black line is unlabeled (presumably, it is the random line).
Outlook:
- "However, density-based autoencoders have not been shown to work properly and have a massive depen- dence on data preprocessing" -> citation?