SciPost Submission Page

Trials Factor for Semi-Supervised NN Classifiers in Searches for Narrow Resonances at the LHC

by Benjamin Lieberman, Andreas Crivellin, Salah-Eddine Dahbi, Finn Stevenson, Nidhi Tripathi, Mukesh Kumar, Bruce Mellado

Submission summary

Authors (as registered SciPost users): Benjamin Lieberman
Submission information
Preprint Link: https://arxiv.org/abs/2404.07822v1
Date submitted: 2024-04-22 16:26
Submitted by: Lieberman, Benjamin
Submitted to: SciPost Physics Core
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Computational, Phenomenological

Abstract

To mitigate the model dependencies of searches for new narrow resonances at the Large Hadron Collider (LHC), semi-supervised Neural Networks (NNs) can be used. Unlike fully supervised classifiers these models introduce an additional look-elsewhere effect in the process of optimising thresholds on the response distribution. We perform a frequentist study to quantify this effect, in the form of a trials factor. As an example, we consider simulated $Z\gamma$ data to perform narrow resonance searches using semi-supervised NN classifiers. The results from this analysis provide substantiation that the look-elsewhere effect induced by the semi-supervised NN is under control.

Current status:
In refereeing

Reports on this Submission

Anonymous Report 1 on 2024-5-6 (Invited Report)

Report

The manuscript introduces an innovative approach for detecting anomalies indicative of the production and decay of a resonance beyond the Standard Model. Employing classifiers constructed within a semi-supervised neural network framework, the method ensures assessment of look-elsewhere effects. A demonstration of the approach is provided in the context of potentially observing a signal within the $Z\gamma$ final state.

While the subject matter holds significant interest and the manuscript could certainly warrant consideration for publication within SciPost Physics Core, enhancing its comprehensiveness through further clarifications and elaborations would be beneficial. Specifically, it would be advisable for the authors to explore one or two additional illustrations of their method, including for instance conventional resonance searches with final states comprising a pair of jets, leptons, or photons, and to assess the impact of background systematics.

I now proceed with a list of comments that should be addressed by the authors.

1 - The introduction of the article could benefit from a discussion on experimental new physics searches utilising unsupervised machine learning methods, as well as a more comprehensive explanation of the distinctions/similarities between the proposed semi-supervised technique and other (semi-supervised) methods used for anomaly detection. The conclusions should be rewritten to reflect these considerations.

Furthermore, clarification is needed regarding the assertion that the proposed method is less model-dependent than other methods, especially semi-supervised or unsupervised ones.

2 - The manuscript illustrates its methodology through resonance searches in the $Z\gamma$ final state. While the choice of this example is motivated by existing anomalies in LHC data, care should be taken to streamline the referencing and ensure clarity and conciseness in the justification. Consideration should be given to replacing the second paragraph on page 3 with a succinct statement detailing the rationale behind choosing the $Z\gamma$ analysis as an illustrative example of the proposed methodology. The heavy reliance on 16 self-citations out of 21 references, which exclude many relevant experimental papers, is in my opinion unnecessary in light of the actual topic of the present manuscript.

Furthermore, it is essential to accurately characterise the origin of these anomalies. Properly distinguishing between those confirmed by LHC collaborations and those proposed by phenomenological works, which may lack access to comprehensive statistical treatment, is crucial.

In addition, as written above, including other illustrative examples based on standard resonance searches in dijet, diphoton or dilepton final states, would be beneficial for readers.

3 - Section 2.1 lacks sufficient information on the simulation toolchain used. Event generation for the $pp\to Z\gamma \to \ell\ell\gamma$ process seems to enforce the intermediate $Z$-boson to be on-shell. However, since the mass window in $m_{\ell\ell\gamma}$ is large enough, off-shell $Z$ contributions, virtual-photon contributions, and their interference are relevant. It remains unclear whether they have been properly accounted for.

Furthermore, the discussion on the chosen parton density set is unclear. It is essential to clarify whether next-to-leading-order matrix elements have been consistently convolved with next-to-leading-order parton densities, and not leading-order ones.

Finally, the text does not clearly distinguish between generator-level cuts and reconstructed-level cuts that are implemented in the simulation. Providing a clear delineation between these sets of cuts is crucial. Additionally, Section 2.1 should include details on preselection criteria, like cuts on the number of leptons and photons, that are currently not discussed.

4 - The manuscript should define central jets and specify the associated pseudo-rapidity cut.

5 - Figures 1 and 2 should be adjusted to improve readability. The missing transverse energy spectrum could be presented with a log scale or a reduced domain to enhance clarity. Additionally, in figure 2, all eight lower insets should indicate whether they refer to the sideband or signal mass window. In fact, consideration should be given to showing both these curves.

6 - The caption of figure 4 should define the acronym 'BR' for clarity.

7 - In Section 3.3, the manuscript should avoid using the term 'centre of mass' to refer to the 'centre of the signal mass window', as 'centre of mass' has a different well-defined meaning.

8 - It would be instructive to assess the impact of background systematics on the calculation of local significance, especially considering the incomplete background modeling acknowledged by the authors in Section 2. Equation (3) should be generalised accordingly, and the results of Section 4 updated subsequently.

9 - The bibliography should be carefully proofread to correct any errors. Specifically, attention should be given to identifying and correcting duplicate references (like references [2] and [3]), updating references that are now published (like reference [23]), and ensuring insertion of complete references (like [45] and [46]).
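As an illustration of comment 8 only, the following is a minimal sketch of how a background systematic can dilute a local significance. It assumes a simple counting experiment and uses the commonly quoted approximation $Z \approx s/\sqrt{b + \sigma_b^2}$ with $\sigma_b$ a Gaussian uncertainty on the background yield. This is not the manuscript's Equation (3); the function name, its parameters and the yields are hypothetical.

```python
import math


def local_significance(s, b, rel_bkg_unc=0.0):
    """Approximate counting significance Z = s / sqrt(b + sigma_b^2).

    s           -- expected signal yield in the mass window (hypothetical)
    b           -- expected background yield in the mass window (hypothetical)
    rel_bkg_unc -- relative systematic uncertainty on b (assumed Gaussian)
    """
    if b <= 0:
        raise ValueError("background yield b must be positive")
    sigma_b = rel_bkg_unc * b  # absolute background uncertainty
    return s / math.sqrt(b + sigma_b ** 2)


# Illustrative numbers only, not taken from the manuscript.
z_stat = local_significance(50.0, 400.0)        # statistical uncertainty only
z_syst = local_significance(50.0, 400.0, 0.05)  # with a 5% background systematic

print(f"Z (stat only)    = {z_stat:.2f}")
print(f"Z (5% bkg syst)  = {z_syst:.2f}")
```

With rel_bkg_unc set to zero the expression reduces to $s/\sqrt{b}$, so the comparison isolates the effect of the systematic term. A full treatment would typically use a profile-likelihood fit with nuisance parameters instead of this approximation.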

Recommendation

Ask for major revision

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -