SciPost Submission Page

Semi-visible jets, energy-based models, and self-supervision

by Luigi Favaro, Michael Krämer, Tanmoy Modak, Tilman Plehn, and Jan Rüschkamp

This is not the latest submitted version.

Submission summary

Authors (as registered SciPost users): Luigi Favaro · Tilman Plehn
Submission information
Preprint Link: scipost_202312_00024v1  (pdf)
Code repository: https://github.com/luigifvr/dark-clr
Date submitted: 2023-12-14 15:42
Submitted by: Favaro, Luigi
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology

Abstract

We present DarkCLR, a novel framework for detecting semi-visible jets at the LHC. DarkCLR uses a self-supervised contrastive learning approach to create observables that are approximately invariant under relevant transformations. We use background-enhanced data to create a sensitive representation and evaluate the representations using a normalized autoencoder as a density estimator. Our results show a remarkable sensitivity for a wide range of semi-visible jets and are more robust than a supervised classifier trained on a specific signal.

Current status:
Has been resubmitted

Reports on this Submission

Report #2 by Anonymous (Referee 2) on 2024-5-17 (Invited Report)

Strengths

1- New state-of-the-art technique to search for new signals producing semi-visible jets
2- Method robust against model parameters of the signal
3- Test the sensitivity of the method to hyperparameter choices
4- Method potentially insensitive to simulation choice

Weaknesses

1- Difficult to follow if the reader is not familiar with the subject
2- The anomaly augmentation is done by randomly dropping some constituents. That may lead to a potential difference with respect to a real signal, for which the constituents that are not visible are correlated with each other (or rather I should say the constituents that are visible are correlated with each other, as they are likely to originate from the same part of the shower history). There was no attempt to evaluate the potential bias.
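
The distinction raised in point 2 can be illustrated with a minimal NumPy sketch. It is not taken from the paper or its code repository: the function names, the (pT, eta, phi) column layout, the drop probability, and the cone radius are all assumptions made for illustration. The first function drops each constituent independently, while the second removes a spatially connected group, which is one simple way to mimic correlated invisible constituents.

```python
import numpy as np

def drop_random(constituents, drop_prob, rng):
    """Drop each constituent independently with probability drop_prob."""
    keep = rng.random(len(constituents)) >= drop_prob
    return constituents[keep]

def drop_correlated(constituents, radius, rng):
    """Drop every constituent within `radius` in (eta, phi) of a random seed constituent."""
    seed = constituents[rng.integers(len(constituents))]
    d_eta = constituents[:, 1] - seed[1]
    # Wrap the azimuthal difference into [-pi, pi)
    d_phi = (constituents[:, 2] - seed[2] + np.pi) % (2 * np.pi) - np.pi
    keep = np.hypot(d_eta, d_phi) > radius
    return constituents[keep]

# Toy jet with hypothetical columns (pT, eta, phi); values are illustrative only
rng = np.random.default_rng(0)
jet = np.column_stack([
    rng.exponential(10.0, 30),
    rng.normal(0.0, 0.3, 30),
    rng.normal(0.0, 0.3, 30),
])

random_aug = drop_random(jet, 0.3, rng)
corr_aug = drop_correlated(jet, 0.2, rng)
print(len(jet), len(random_aug), len(corr_aug))
```

Comparing a tagger's response to the two kinds of augmented jets would be one way to estimate the bias mentioned above, under the assumption that a cone-based removal is a reasonable proxy for correlated invisible constituents.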

Report

The paper introduces an innovative method for tagging semi-visible jets.
The results are shown to be insensitive to variations in BSM parameters compared to supervised or unsupervised learning techniques.
However, some textual enhancements are necessary to improve the paper's accessibility.

Requested changes

1- Better define the architecture of your NNs, so they can be understood without reading Refs. [21, 35, 36]. In particular, Section 3.3 should be extended, and the NAE section could be summarized so the reader can get the point without going into the complexity of this algorithm.
2- Section 4: Can you explain how you can say "We find that although the representation before the head network is more informative, the output of the head network encodes useful information for out-of-distribution detection"?
Also, it is not clear why the norm should be more discriminating than the direction for s_CLR.
3- NAE description: if I understand correctly, you plug the NAE on the output of the CLR network and train it only on background samples. To help the reader, you should clarify that there is a second training step.
4- The text of "Impact of Anomaly augmentation" refers to Figure 4, saying that the performance of JetCLR is compared to that of DarkCLR, but Figure 4 is not clear about what is JetCLR and what is DarkCLR (even if I can guess it). Also for Figure 4, why not use the same benchmark (epsilon_S=0.2) as for Table 2?
5- I suspect there are typos in the description of the feed-forward encoder of the NAE: the dimensions should go from 128 to 8, and not the opposite.
6- What are the "(1)" after the numbers in Table 2?
7- No need for a "2.1" section if there is no "2.2" section.
8- Section 2.1: why is there an upper cut on the jet pT? The description of this cut should be moved to the end of Section 2.1, otherwise the reader may think it is related to the reconstruction.
9- Section 3.2: why is it important to rescale the pT of the jet? I guess the direction can be changed as well; could that impact the results?
10- Equations 6-9: What is the meaning of "~" in "x~p_d"?
11- Define the acronyms before using them (CD, KL, LCT, AUC, QCD, SM, ...).

Recommendation

Ask for minor revision

  • validity: good
  • significance: high
  • originality: good
  • clarity: ok
  • formatting: excellent
  • grammar: excellent

Report #1 by Anonymous (Referee 1) on 2024-5-7 (Invited Report)

Strengths

The paper describes a new tagging algorithm for identifying semi-visible jets based on a contrastive learning representation.
1 - The algorithm presented is interesting and relevant, since it only relies on minimal physical features of the signal and it is not trained on specific signal data.
2 - In addition, the background rejection is superior to supervised classifiers and more stable with respect to changes in the model parameters.
3 - The text is well written and the results presented strongly support the paper claims.

Weaknesses

1 - The paper mostly focuses on the technical aspects of the algorithm, but it is not very accessible for readers who are not familiar with current machine learning developments.
2 - The paper's results stop at the ROC curves and do not address the physics impact of the proposed algorithm, i.e. how much the sensitivity of current searches can be improved by the algorithm.

Report

The paper contains relevant results and illustrates a novel approach for tagging semi-visible jets. In addition, the ROC curves obtained are more stable with respect to variations of the BSM parameters than supervised learning methods.
However, a few improvements in the text are needed in order to make the paper more accessible.

Requested changes

1 - I suggest including in Figure 1 an example of positive augmentation, so it is easier to compare the effect of positive and anomalous augmentations.

2 - The authors should comment on whether including momentum smearing for the jet constituents would have any relevant impact on the results.

3 - The dimensionality of the head network's input and output should be made explicit in Sec. 3.3.

4 - I suggest including a schematic figure showing the main steps of the network architecture along with the dimensions of the input and output of each step. It would help the reader who is not familiar with the CLR setup.

5 - The definitions of the AUC and LCT acronyms are missing.

Recommendation

Ask for minor revision

  • validity: high
  • significance: high
  • originality: top
  • clarity: good
  • formatting: excellent
  • grammar: excellent
