Jet Diffusion versus JetGPT -- Modern Networks for the LHC

Anja Butter; Nathan Huetsch; Sofia Palacios Schweitzer; Tilman Plehn; Peter Sorrenson; Jonas Spinner

SciPost Submission Page


This is not the latest submitted version.

This Submission thread is now published as

SciPost Phys. Core 8, 026 (2025)

Submission summary

Authors (as registered SciPost users):

Nathan Huetsch · Sofia Palacios Schweitzer · Tilman Plehn · Jonas Spinner

Submission information
Preprint Link:	https://arxiv.org/abs/2305.10475v2 (pdf)
Date submitted:	June 26, 2023, 11:09 a.m.
Submitted by:	Palacios Schweitzer, Sofia
Submitted to:	SciPost Physics

Ontological classification
Academic field:	Physics
Specialties:	High-Energy Physics - Phenomenology
Approaches:	Computational, Phenomenological

Abstract

We introduce two diffusion models and an autoregressive transformer for LHC physics simulations. Bayesian versions allow us to control the networks and capture training uncertainties. After illustrating their different density estimation methods for simple toy models, we discuss their advantages for Z plus jets event generation. While diffusion networks excel through their precision, the transformer scales best with the phase space dimensionality. Given the different training and evaluation speed, we expect LHC physics to benefit from dedicated use cases for normalizing flows, diffusion models, and autoregressive transformers.

Current status:

Has been resubmitted

Reports on this Submission

Report #2 by Anonymous (Referee 2) on 2024-5-31 (Invited Report)

Cite as: Anonymous, Report on arXiv:2305.10475v2, delivered 2024-05-31, doi: 10.21468/SciPost.Report.9163

Strengths

The paper tries to apply some modern methods, namely transformer networks and diffusion models, to LHC event generation. Event generation is important and a computationally expensive bottleneck in many experimental and theoretical analyses (although in reality it's the detector simulation that is slow, not the event generation).

Weaknesses

All their plots show pretty good agreement, but only qualitatively. It's hard to tell quantitatively how well the models are working or where they fail. The paper reads like they just wanted to get these models to work, and they do work ok, but they are not trying to find out what their relative strengths and weaknesses are. All the comparisons are vague and qualitative.

Similarly, they do not clearly state what exactly the problem with event generation is that they are trying to solve. Is it just speeding things up? The bottleneck is usually in the tails of high-multiplicity events, not the peaks. Are they getting those tails right?

Report

The paper is ok. Implementing models like this is not easy and the authors have gotten them to work, which is commendable. But they do not spend very much time on actually critically testing and evaluating the models and defending their use case.

Requested changes

1) Better motivation: give a quantitative performance goal related to a real limitation of current software and explain how these generative models will improve on it.
2) Add some quantitative measure of performance.
3) Discuss whether the generative models are getting the tails of distributions right and where the failure modes are.

Recommendation

Ask for minor revision

validity: high
significance: ok
originality: ok
clarity: good
formatting: excellent
grammar: excellent

Report #1 by Anonymous (Referee 1) on 2024-5-9 (Invited Report)

Cite as: Anonymous, Report on arXiv:2305.10475v2, delivered 2024-05-09, doi: 10.21468/SciPost.Report.9017

Strengths

1) The paper introduces two novel diffusion models and an autoregressive transformer, specifically tailored for LHC physics simulations, expanding the toolkit available for high-energy physics research.
2) By adapting Bayesian versions of these models, the paper enhances the ability to control the learning processes and to capture training uncertainties, providing a more robust analytical framework.
3) The models are tested against Z plus jets events, which are critical for LHC studies, demonstrating their applicability and effectiveness in real-world scenarios within particle physics.
4) The study effectively contrasts the precision capabilities of diffusion models with the scalability advantages of the transformer, providing valuable insights into the suitability of each model depending on the simulation needs.
5) The paper provides practical guidance on how to leverage the strengths of each model type, suggesting dedicated use cases within the LHC physics domain, which could guide future applications and optimizations.
6) The presentation of the paper is very clear and detailed.

Weaknesses

1) Although the paper introduces novel models, it might lack a thorough comparative analysis with more traditional or established methods in LHC physics simulations, which would help in understanding the relative improvements in performance.

2) The research focuses predominantly on Z plus jets events, which, while important, may not fully demonstrate the models' effectiveness across a broader range of LHC physics scenarios.

3) The trade-offs between model precision and computational efficiency are not deeply explored, which could be critical for resource allocation in large-scale experiments.

Report

This paper investigates the development and application of two diffusion models and an autoregressive transformer designed for simulations in LHC physics. It introduces these models, applies Bayesian versions for enhanced control and uncertainty management during learning, and assesses their effectiveness in generating Z plus jets events, a crucial aspect of LHC physics simulations.
The paper's key findings show that diffusion models stand out for their precision, whereas the transformer is particularly effective at scaling with the phase space dimensionality. The differences in training and evaluation speeds between the models suggest potential for specialised applications within LHC physics simulations, with each model offering distinct advantages depending on the specific requirements of the simulation task.
Moreover, incorporating Bayesian methodologies allows for managing uncertainties in model training, enhancing the models' reliability for physics simulations. The study concludes that integrating these advanced machine-learning techniques could significantly benefit the LHC physics community by improving both the speed and precision of their analyses.
In my opinion, the paper meets the criteria for a speedy publication in SciPost. Addressing the perceived weaknesses outlined above might go beyond the scope of this publication and could be the basis for future research.

Recommendation

Publish (easily meets expectations and criteria for this Journal; among top 50%)

validity: top
significance: high
originality: high
clarity: top
formatting: perfect
grammar: excellent
