SciPost Submission Page

Kicking it Off(-shell) with Direct Diffusion

by Anja Butter, Tomas Jezo, Michael Klasen, Mathias Kuschick, Sofia Palacios Schweitzer, Tilman Plehn

This is not the latest submitted version.

This Submission thread is now published as

Submission summary

Authors (as registered SciPost users): Tomas Jezo · Sofia Palacios Schweitzer · Tilman Plehn
Submission information
Preprint Link: https://arxiv.org/abs/2311.17175v1  (pdf)
Date submitted: 2024-01-05 11:59
Submitted by: Palacios Schweitzer, Sofia
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Computational, Phenomenological

Abstract

Off-shell effects in large LHC backgrounds are crucial for precision predictions and, at the same time, challenging to simulate. We show how a generative diffusion network learns off-shell kinematics given the much simpler on-shell process. It generates off-shell configurations fast and precisely, while reproducing even challenging on-shell features.

Current status:
Has been resubmitted

Reports on this Submission

Report #3 by Anonymous (Referee 3) on 2024-3-1 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2311.17175v1, delivered 2024-03-01, doi: 10.21468/SciPost.Report.8646

Strengths

1) Generic, novel and powerful method to transform high-dimensional distributions
2) No pairing of events needed
3) Very promising proof of concept for an example LHC application

Report

The criteria for publication are met with some requested additions, see below.

The study opens a new pathway for fast surrogate modeling, with clear potential for multipronged follow-up work. In particular, off-shell effects in dominant LHC background processes are modelled to high accuracy from the corresponding on-shell processes.

It provides a novel and synergetic link between particle physics and machine learning.

Requested changes

1) Optimal transport prescription is mentioned but details are missing
2) Two redundant degrees of freedom are said to increase the precision - the reader has difficulty understanding why
3) Define CFM the first time it is used
4) Fig. 4 caption "It illustrates the optimal transport algorithm chosen by the network training" - this needs to be explained.
5) Speed improvements are claimed but quantification is missing
6) Amortisation is mentioned but it is not clear how this will be used at scale: a given process will need training data; and it is not clear to what extent a training for one process can be reutilised for another process. The reader would appreciate a discussion of these issues to judge production readiness
7) Add reference to related work such as https://arxiv.org/pdf/2312.10130.pdf

  • validity: high
  • significance: high
  • originality: high
  • clarity: high
  • formatting: excellent
  • grammar: perfect

Report #2 by Anonymous (Referee 2) on 2024-2-26 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2311.17175v1, delivered 2024-02-26, doi: 10.21468/SciPost.Report.8621

Strengths

1. The usage of generative surrogate networks to obtain off-shell configurations from given on-shell ones.
2. The idea to implement this in the future in the ML-enhanced release of the MADGRAPH event generator. In this way, it will be available to a larger group of researchers and potentially used at the LHC.

Weaknesses

1. Study has been performed at the LO level only.
2. Only a small set of observables was analysed.

Report

The manuscript deals with the generation of off-shell kinematics for top-quark pair production in the di-lepton decay channel at the LHC. In detail, it uses generative surrogate networks to obtain off-shell configurations from given on-shell ones. On the technical level this is achieved with the help of Direct Diffusion networks, which in principle can sample from any given distribution to produce another distribution. With this method the authors are able to reproduce full off-shell effects for various differential cross-section distributions at the 10% level. Furthermore, using classifier reweighting the authors are able to improve the precision to well below 10%.

This is a very interesting and important study, but it is not complete. It looks more like the first step of the study that should have been carried out. It is not entirely understandable why the authors chose to conclude the study before analysing the real problems that arise at the next perturbative order, namely at NLO in QCD. At NLO in QCD full off-shell effects are "smeared out" by the contributions from real emission. Therefore, it would be beneficial for the quality of the manuscript to understand to what extent the proposed methods/algorithms would work in the presence of NLO QCD corrections. Indeed, it is not clear at all that the same outcome could be achieved in the presence of higher-order QCD corrections.

Furthermore, the authors write that the fast generative surrogate is ready to be implemented, using the Les Houches Event format, in the ML-enhanced release of the MADGRAPH event generator. Again, it is unclear to what extent this would be useful for various phenomenological applications at the LHC if it were only available at LO level.

Finally, one of the observables used in the manuscript, namely the invariant mass of the top quark, which is constructed from the top-quark decay products, can also depend on an additional light jet (if resolved by the jet algorithm). This is a configuration that is not present in the LO prediction. It should be examined to what extent this would affect the top-quark reconstruction using ML methods.

All of these subtle issues are very important to examine in detail in order to ascertain the correctness / usefulness of the proposed methods. Therefore, I would like to encourage the authors to go one step further and conduct this study at the NLO QCD level. This should be feasible in a relatively short period of time, taking into account the fact that one of the authors is also a co-author of bb4l predictions, where full off-shell NLO QCD predictions for the pp -> tt +X process in di-lepton top-quark decay channel have been matched to parton shower programs using methods that allow for consistent treatment of resonances.

Consequently, I cannot recommend the manuscript for publication in its current form.

  • validity: low
  • significance: low
  • originality: poor
  • clarity: high
  • formatting: good
  • grammar: excellent

Report #1 by Anonymous (Referee 1) on 2024-2-25 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2311.17175v1, delivered 2024-02-24, doi: 10.21468/SciPost.Report.8613

Report

The authors describe a method to generate off-shell top-quark pair kinematics
from on-shell ones using a generative diffusion network. The article provides
an application of the ideas presented in arXiv:2305.10465 (Jet Diffusion versus
JetGPT - Modern Networks for the LHC) by a subset of the current authors. In
the current article the authors demonstrate that the direct diffusion method is
able to provide an accurate proxy to describe off-shell kinematics, which is
demonstrated using the example of off-shell top-quark pair production
including leptonic decays at leading-order accuracy. Nonetheless, I cannot
recommend this paper for publication for the following reasons:

First, the authors seem to equate off-shell effects simply with kinematical
edges in differential distributions. However, off-shell effects have an impact
beyond the larger available phase space. For instance, the inclusion of
top-quark decays provides spin correlations between the charged leptons that
also have to be accurately described. Furthermore, interference contributions
between top-quark pair events and non-resonant continuum contributions cannot
be differentiated, although this will be necessary for accurate simulations.
However, the authors fail to discuss these effects at all.

Secondly, the motivation of the authors was to provide a method that could be
employed to speed up the event generation for future LHC simulations. The
study presented here is performed with unweighted events at leading-order
accuracy without taking into account further effects from parton showers. In
consequence, the generative direct diffusion network is employed to generate
hard events with off-shell kinematics, which still need to be interfaced with
parton shower programs. To this end, additional information besides the momenta
is necessary. Foremost, color-flow configurations and resonance histories need
to be supplied. Providing the latter information solely based on momentum
information is a challenging task and might diminish the accuracy of the
predictions. For example, in arXiv:1607.04538 it has been reported that the
proper choice of resonance history is essential to obtain accurate predictions.
Guessing from the final momentum configurations can lead to disastrous results.
The authors do not address these issues at all.

Finally, the precursor article (arXiv:2305.10465) already includes a
more complete study. In the previous article, hadronic Z-boson production up
to 3 jets including parton shower effects, hadronization and multi-jet merging
has been considered. It clearly demonstrates that it is possible to push the
direct diffusion method all the way to the end of the simulations. I wonder why
the authors decided to work at the level of the hard scattering events instead
of the 'final product' as it would have eliminated the problem of parton shower
initial conditions completely.

Given these facts, I do not think that this article provides major new results
that merit publication in SciPost.

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -
