SciPost Submission Page
Deep Set Auto Encoders for Anomaly Detection in Particle Physics
by Bryan Ostdiek
This is not the latest submitted version.
Submission summary
Authors (as registered SciPost users): Bryan Ostdiek

Submission information
Preprint link: https://arxiv.org/abs/2109.01695v1 (pdf)
Date submitted: 2021-09-17 12:31
Submitted by: Ostdiek, Bryan
Submitted to: SciPost Physics

Ontological classification
Academic field: Physics
Specialties:
Approaches: Experimental, Computational
Abstract
There is increasing interest in model-agnostic search strategies for physics beyond the Standard Model at the Large Hadron Collider. We introduce a Deep Set Variational Autoencoder and present results on the Dark Machines Anomaly Score Challenge. We find that the method attains the best anomaly detection ability when there is no decoding step for the network and the anomaly score is based solely on the representation within the encoded latent space. This method was one of the top-performing models in the Dark Machines Challenge, for both the open and the blinded data sets.
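The abstract's key finding is that the anomaly score can be computed solely from the encoded latent representation, with no decoder. A common latent-space score for a variational autoencoder is the KL divergence of the encoded Gaussian from the unit-normal prior; the sketch below illustrates that idea in a minimal, self-contained way. The function name and the closed-form KL expression are standard VAE machinery, not necessarily the exact score used in the paper.

```python
import numpy as np

def latent_anomaly_score(mu, logvar):
    """KL divergence of the encoded Gaussian q(z|x) = N(mu, sigma^2)
    from the standard-normal prior N(0, I), summed over latent
    dimensions. Events whose encodings sit far from the prior
    receive high scores. (Illustrative; not the paper's exact score.)"""
    mu = np.asarray(mu, dtype=float)
    logvar = np.asarray(logvar, dtype=float)
    # Closed form: 0.5 * sum(sigma^2 + mu^2 - 1 - log sigma^2)
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

# A background-like event encoded near the prior scores low,
# while an event pushed far into the tails scores high.
background = latent_anomaly_score(mu=[0.1, -0.2], logvar=[0.0, 0.0])
signal = latent_anomaly_score(mu=[3.0, -4.0], logvar=[-2.0, 1.0])
```

Because the score depends only on the encoder outputs `mu` and `logvar`, no reconstruction (and hence no decoder pass) is needed at evaluation time, which is consistent with the paper's observation that the decoding step can be dropped.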
Reports on this Submission
Report #1 by Anonymous (Referee 1) on 2021-10-18 (Invited Report)
- Cite as: Anonymous, Report on arXiv:2109.01695v1, delivered 2021-10-18, doi: 10.21468/SciPost.Report.3702
Strengths
1. Highly topical: the use of autoencoders and their latent spaces for anomaly detection is of significant interest
2. Clear description of the methodology, datasets, architecture used
3. Presents multiple interesting results and avenues for future research
Weaknesses
1. The conclusions largely suggest that the decoding portion of the network is unnecessary for anomaly detection, yet significant space in the paper is still devoted to the design and tuning of that portion. Some modification to the presentation of the network design and results could make the paper even more compelling.
2. Data presentation in plots can be difficult to understand
3. Some grammatical errors, undefined jargon
Report
This paper presents a new network architecture for anomaly detection and reports the surprising result that the decoder portion is largely unnecessary. With some minor revisions for clarity and presentation this paper should be published, and some larger revisions would further strengthen the findings and their impact.
Requested changes
1. Some of the plots are difficult to interpret (specifically Figs. 4 and 5; I suggest combining the rarer backgrounds into one histogram).
2. Jargon should be defined and grammatical errors fixed.
3. Since the decoder portion is concluded to be unnecessary for anomaly detection, I suggest either a clear separation between the results with and without $\beta=1$ or a discussion of their relative performance on specific signals. I would also suggest expanding the results in Fig. 6 to show the full set of TI values for a set of models (both with and without $\beta=1$): since anomaly detection means we do not know which model is realized in nature, the distribution across a range of models is more informative than the min/median/max.