SciPost Submission Page

How to GAN LHC Events

by Anja Butter, Tilman Plehn, Ramon Winterhalder

Submission summary

As Contributors: Tilman Plehn · Ramon Winterhalder
Arxiv Link: https://arxiv.org/abs/1907.03764v4
Date accepted: 2019-11-13
Date submitted: 2019-11-07
Submitted by: Winterhalder, Ramon
Submitted to: SciPost Physics
Discipline: Physics
Subject area: High-Energy Physics - Phenomenology
Approach: Theoretical

Abstract

Event generation for the LHC can be supplemented by generative adversarial networks, which generate physical events and avoid highly inefficient event unweighting. For top pair production we show how such a network describes intermediate on-shell particles, phase space boundaries, and tails of distributions. In particular, we introduce the maximum mean discrepancy to resolve sharp local features. It can be extended in a straightforward manner to include for instance off-shell contributions, higher orders, or approximate detector effects.

Current status:
Publication decision taken: accept

Editorial decision: For Journal SciPost Physics: Publish
(status: Editorial decision fixed and (if required) accepted by authors)




Reports on this Submission

Anonymous Report 2 on 2019-11-11 Invited Report

Strengths

1- Clarity of writing
2- Novelty of approach

Weaknesses

1- The case for the importance of this work should be strengthened

Report

The paper presents a study of using GANs to generate simulated particle collisions, including heavy particle production and multi-step decays. The authors are able to map the high-dimensional phase space with impressive accuracy (for a GAN), which to my knowledge has not been done before, especially in the context of tricky kinematic effects such as narrow resonances and phase-space boundaries.

The paper is also very well written, with clear, crisp prose and a good balance of references and introductory text to guide the reader.

There are two important questions in my mind: is the paper correct, and is it important? I address those below, and then comment on the discussion of the other referees.

(1) Is the work correct?

The paper is technically impressive, and to my reading the central claims made are well supported.

I have one worry. It would be nice to ensure the generator isn't memorizing the dataset, because this would just be unweighting a bunch of weighted events with a lot of extra steps involved. In image analyses, one often sees pictures of generated samples juxtaposed with the nearest neighbor in the true data to make sure that they are sufficiently different. (See the discussion in https://arxiv.org/pdf/1706.08224.pdf about the support size of a generator; Figure 1 is the type of figure we have in mind). Alternatively, one could measure the expected distance from a true data point to its nearest neighbor within the true dataset, and then the expected distance from a generated sample to its nearest neighbor in the true dataset, and check that they are comparable. If generated samples are typically much closer to something in the true dataset, then the generator is clearly just memorizing data points.
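The referee's distance check can be sketched in a few lines of NumPy. Everything below is illustrative: random Gaussians stand in for the true and generated event samples, and the function name is my own, not anything from the paper.

```python
import numpy as np

def mean_nn_distance(query, reference, exclude_self=False):
    """Mean Euclidean distance from each query point to its nearest
    neighbor in the reference set (brute force, fine for toy sizes)."""
    d2 = ((query[:, None, :] - reference[None, :, :]) ** 2).sum(-1)
    if exclude_self:
        # only valid when query IS the reference array: mask the
        # zero distance of each point to itself
        np.fill_diagonal(d2, np.inf)
    return np.sqrt(d2.min(axis=1)).mean()

rng = np.random.default_rng(0)
true = rng.normal(size=(500, 4))        # stand-in for true events
generated = rng.normal(size=(500, 4))   # stand-in for GAN output
memorized = true[rng.integers(0, 500, size=500)]  # a "memorizing" generator

baseline = mean_nn_distance(true, true, exclude_self=True)
print(mean_nn_distance(generated, true) / baseline)  # close to 1: independent
print(mean_nn_distance(memorized, true) / baseline)  # 0: pure memorization
```

The single number per sample is crude compared with the nearest-neighbor image grids of arXiv:1706.08224, but it scales trivially to millions of events with a k-d tree instead of the brute-force distance matrix.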

(2) Is the work important/relevant?

Not every correctly-done study is worthy of publication. It needs to add something new and relevant.

What they have done is new IMO, but the case for the importance and relevance of the work presented here needs clarification.

The authors correctly point out that simulation tools are essential for likelihood-free inference in HEP, because we do not have good theoretical control of showers/hadronization/detector response and are forced to use simulations. They are also correct that the detector response is by far the slowest and least-well controlled element of this chain, and that alternative approaches like GANs offer valuable speed-ups. They give a fair summary of recent progress in these areas.

But this paper does not focus on those areas that critically need to be sped up, and where theoretical knowledge is limited. Instead, this paper focuses on the simulation of the hard scattering, the piece that is both very well theoretically controlled and already extremely fast. They do not argue why such a tool is useful or important.

Why would HEP want to replace a procedure which is deeply connected to the underlying theory (drawing events from the PDF calculated by the matrix element) and can be automatically generated for an arbitrary new theory with a more-black-box GAN? They claim their GAN is fast (footnote on pg 15), but do not provide a straightforward comparison with existing tools, so this seems unlikely to be their primary argument. They emphasize that their GAN produces unweighted events rather than weighted events, but this hardly seems enough to motivate this complex procedure. They suggest that their GAN could also be generalized to learn the more useful steps of showers/hadronization/detector response, but do not show any results there, and seem to dismiss it as less interesting?

Perhaps there is an argument to be made for the importance and relevance of this work, but to my reading the authors do not make it. My best guess is that the authors found this task to be unsolved and a fun intellectual challenge.

I don’t think this should prevent the publication of this work, but my advice to the authors is to consider adding a paragraph to the introduction making a stronger case for the relevance of this work. To wit: why should someone use this tool rather than MG5?

(3) Have the authors addressed technical concerns raised by the other reviewers?

- I find the authors’ replies to be satisfactory. On the most significant questions, I think they have clarified their claims of statistical uncertainty (as being relative to the generator sample, not the truth) and demonstrated that they do not have holes in their generated phase space.

Requested changes

1- I recommend (but do not require) adding discussion to the introduction making a stronger case for the relevance/importance of this work.
2- I would like to see more evidence that the generator isn't memorizing the dataset.

  • validity: high
  • significance: ok
  • originality: good
  • clarity: high
  • formatting: perfect
  • grammar: excellent

Anonymous Report 1 on 2019-11-10 Invited Report

Report

Thank you for your response! I would like to quickly follow up on both points - hopefully it won't take much time to do one last iteration before acceptance.

=====

You wrote

My question: Fig. 4: I still don't understand how the GAN can do better (closer to the true distribution) than the stat. uncertainty on the training dataset. Please explain.

Your answer: We don't say that the GAN does better than the stat. uncertainty. If the stat. uncertainty is 20%, the GAN obviously can only be equally precise. However, stating the GAN is correct within 10% means that the ratio of GAN/True is 0.9. Considering the stat. uncertainty of the training data, the GAN agrees with the true events within this uncertainty!

My response: This doesn't make sense to me - if your stat. uncertainty is 20%, then on average, the GAN cannot agree with the truth to within 10%. The fluctuations in GAN/True should be comparable to or larger than the stat. uncertainty. Either I am completely missing the point or what is stated is not correct. In either case, please clarify in the text!
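The referee's point can be illustrated with a toy Monte Carlo (all numbers below are made up for illustration): even an ideal generator that draws from exactly the true distribution shows bin-by-bin GAN/True fluctuations no smaller than the training sample's statistical uncertainty.

```python
import numpy as np

rng = np.random.default_rng(1)
expected = 25.0   # expected events per bin -> 1/sqrt(25) = 20% stat. uncertainty
n_bins = 10_000

true_counts = rng.poisson(expected, n_bins)
gan_counts = rng.poisson(expected, n_bins)  # an "ideal", statistically independent GAN

ratio = gan_counts / np.maximum(true_counts, 1)
print(ratio.std())  # roughly sqrt(2) * 20%, never below the 20% floor
```

Since both numerator and denominator carry their own Poisson noise, the ratio spread is roughly sqrt(2) times the per-sample uncertainty; a ratio consistently tighter than the training-sample uncertainty would indeed signal a problem.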

=====

v1 comment: Can you please demonstrate that your GAN is really able to generate statistically independent examples...

Your answer: We now show a 2D correlation plot of phi_j1 vs phi_j2 for 1 million true events and, next to it, for 1/10/50 million generated events with very fine binning. This shows that the GAN truly populates all phase space regions beyond the training data and does not produce any holes. Further, statistical independence is also a priori enforced by sampling from random noise.

My response: Thank you for this test! However, I don't think this exactly answers my question. First of all, your samples need not be statistically independent from the training sample just because you are starting from random noise. The GAN could literally be memorizing and randomly picking one of the events from the training set. I think what you want to do is to also have a sample with 50M true events and show that the GAN with 50M looks like that. It is nice to see that there are no obvious holes, though there is clearly some feature in the bottom right plot of Fig. 6 between the yellow bands that looks to not be in the original. I won't insist too much on this point, so maybe you can at least partially convince the reader with some slight modifications to the text.

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -
