SciPost Submission Page

A Normalized Autoencoder for LHC Triggers

by Barry M. Dillon, Luigi Favaro, Tilman Plehn, Peter Sorrenson, Michael Krämer

Submission summary

Authors (as registered SciPost users): Barry M. Dillon · Luigi Favaro · Michael Krämer · Tilman Plehn
Submission information
Preprint Link: https://arxiv.org/abs/2206.14225v3
Date accepted: Oct. 4, 2023
Date submitted: June 23, 2023, 11:06 a.m.
Submitted by: Luigi Favaro
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology

Abstract

Autoencoders are an effective analysis tool for the LHC, as they represent one of its main goals of finding physics beyond the Standard Model. The key challenge is that out-of-distribution anomaly searches based on the compressibility of features do not apply to the LHC, while existing density-based searches lack performance. We present the first autoencoder which identifies anomalous jets symmetrically in the directions of higher and lower complexity. The normalized autoencoder combines a standard bottleneck architecture with a well-defined probabilistic description. It works better than all available autoencoders for top vs QCD jets and reliably identifies different dark-jet signals.

Author comments upon resubmission

We thank the referee for the feedback. We carefully reviewed the manuscript and rephrased statements on the effect of preprocessing and trigger applications.
However, we are not stating that our network is insensitive to changes in the probability distribution (preprocessing).
In Section 4, we discuss the changes induced by the reweighting in the data space and the response of the NAE samples and reconstructions.
We provide clarification on statements about preprocessing below. Please let us know if your concerns arise from more specific claims.

List of changes

Preprocessing:

  • Although the jets we study in the paper would pass trigger selection cuts already, the results still demonstrate how our approach limits the assumptions made on BSM signals to data preprocessing rather than latent space structure, in favor of a more model-agnostic network architecture

  • For phase space regions with such modes, the NAE training adjusts the energy as the underlying structure of the latent space, such that the autoencoder becomes a robust OOD detector.

  • This way, the NAE training increases the energy for Aachen jets, because the normalization forces the main prong to a lower value. These effects are inevitable when training likelihood-based networks: a different preprocessing will change the density and, therefore, the anomaly scores.

  • Once again focusing on the $p_T$-reweightings, we show the energy distributions for the QCD training data and the two signals in Fig.~\ref{fig:djmse}. We see that, unlike in our earlier study~\cite{Buss:2022lxw}, the effect of the preprocessing on the whole distribution is limited. A shift in performance at low signal efficiency can be seen by varying $n$, with the ordering between the two datasets being switched around $n = 0.2~...~0.3$. The energy distribution of the Heidelberg dataset has a shifted main peak at $n=0.5$, which is washed out by smaller reweighting factors, while the QCD distribution undergoes a slight shift and develops a longer high-energy tail. For the Aachen dataset, lowering $n$ moves the mean away from the QCD background and at the same time increases the width of the distribution. These patterns will affect the ROC curves at low signal efficiency and large background suppression.

  • Removed: However, already looking at the AUC as a performance measure this changes, because the performance ordering as a function of $n$ changes towards larger signal efficiencies.

  • For the more challenging Aachen and Heidelberg dark jets, the NAE works for a reasonable single choice of preprocessing. The performance gain from using different reweighting factors on the two datasets is explainable in terms of the changes induced on the features of the two signals.

  • Autoencoders are ML-analysis tools which effectively represent the idea behind LHC searches.
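
The effect of the $p_T$-reweighting described in the items above can be illustrated with a minimal sketch. It assumes that the preprocessing maps every non-zero pixel of a jet image as $p_T \to p_T^n$ and then normalizes the image to unit sum. This definition, the function name `reweight_image`, and the toy image are assumptions made for illustration only and are not taken from the manuscript.

```python
import numpy as np


def reweight_image(image, n):
    """Hypothetical preprocessing: p_T -> p_T^n on non-zero pixels,
    followed by normalization of the image to unit sum."""
    image = np.asarray(image, dtype=float)
    out = np.zeros_like(image)
    mask = image > 0
    out[mask] = image[mask] ** n
    total = out.sum()
    # An empty image is returned unchanged (all zeros).
    return out / total if total > 0 else out


# Toy jet image (assumed values): one hard prong and two soft pixels.
toy = np.zeros((4, 4))
toy[1, 1] = 100.0
toy[0, 2] = 4.0
toy[3, 0] = 1.0

for n in (1.0, 0.5, 0.2):
    main_prong = reweight_image(toy, n).max()
    print(f"n = {n}: normalized intensity of the hardest pixel = {main_prong:.3f}")
```

In this toy setup, lowering $n$ enhances the soft pixels, so the normalized intensity of the hardest pixel decreases. Any density-based score evaluated on such images will change accordingly.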


Triggers: We highlight the use of autoencoders in the spirit of triggers: not necessarily hardware/online triggers, but tools to extract "interesting" events. Hardware applications are left for future work.

  • One of the goals of this work is to develop an autoencoder which is a robust anomalous-jet tagger. We explore the concept of using autoencoders as triggers, i.e. tools that can extract interesting events from a given background with as little bias as possible.

  • The next step will be to benchmark our architecture on more realistic datasets, optimize it for performance, and study the possibility of implementing the network on hardware for online triggering.
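
The notion of extracting interesting events from a given background can be sketched as a cut on an anomaly score at a fixed background acceptance. The example below is only a sketch under stated assumptions: the Gaussian toy scores stand in for an anomaly score such as the NAE energy, and the helper names `threshold_at_background_rate` and `tag` are hypothetical.

```python
import numpy as np


def threshold_at_background_rate(bkg_scores, bkg_rate):
    """Score cut that accepts a fraction `bkg_rate` of the background."""
    return np.quantile(bkg_scores, 1.0 - bkg_rate)


def tag(scores, threshold):
    """Flag events whose anomaly score lies above the threshold."""
    return np.asarray(scores) > threshold


# Toy anomaly scores (assumption): Gaussian background and shifted signal.
rng = np.random.default_rng(0)
bkg = rng.normal(0.0, 1.0, 100_000)
sig = rng.normal(2.0, 1.0, 10_000)

# Fix the accepted background fraction, then read off the signal efficiency.
cut = threshold_at_background_rate(bkg, bkg_rate=0.01)
eps_b = tag(bkg, cut).mean()
eps_s = tag(sig, cut).mean()
print(f"cut = {cut:.3f}, background acceptance = {eps_b:.4f}, "
      f"signal efficiency = {eps_s:.3f}")
```

Scanning `bkg_rate` towards smaller values probes the regime of large background suppression, where small changes in the tails of the score distributions matter most.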

Published as SciPost Phys. Core 6, 074 (2023)


Reports on this Submission

Report #1 by Anonymous (Referee 1) on 2023-9-27 (Invited Report)

Report

Thank you to the authors for their patience in this prolonged review process. At this point, I am fine with publishing the current version in SciPost.

The only change I would strongly suggest to the authors is to consider changing the title and any mention of "trigger". From the last response, it sounds like the authors really don't mean "trigger" as it is usually meant at the LHC. You say "not necessarily hardware/online triggers but tools to extract 'interesting' events." If it is not online, then it is not a trigger! *Tagging* anomalous events is probably a better way of describing your work. Otherwise, I fear that the title is rather misleading.