SciPost Submission Page
How to GAN Event Subtraction
by Anja Butter, Tilman Plehn, Ramon Winterhalder
As Contributors: Tilman Plehn · Ramon Winterhalder
Arxiv Link: https://arxiv.org/abs/1912.08824v3
Submitted by: Winterhalder, Ramon
Submitted to: SciPost Physics
Subject area: High-Energy Physics - Phenomenology
Subtracting event samples is a common task in LHC simulation and analysis, and standard solutions tend to be inefficient. We employ generative adversarial networks to produce new event samples with a phase space distribution corresponding to added or subtracted input samples. We first illustrate for a toy example how such a network beats the statistical limitations of the training data. We then show how such a network can be used to subtract background events or to include non-local collinear subtraction events at the level of unweighted 4-vector events.
Author comments upon resubmission
1) It is not at all clear and not discussed in the text, how the
reconstruction error is influenced by those choices. In a more
realistic set-up, in which one can not compute the error from a binned
analysis, one would not know a priori the error associated to the GAN
reconstruction. This might be an issue if one wants to use these
techniques for physics analyses, in which all the sources of
systematic errors should be carefully estimated and taken into
account. Is there a way to get such an estimate for the GAN approach?
-> What should we say - we are not aware of a serious study of
uncertainties in generative networks, but we are working on it. As
a matter of fact, we are starting a more serious collaboration on
this crucial LHC question with local ML experts...
Notice that, at the end of the Outlook section, there is the sentence
“we have shown how to use a GAN to manipulate event samples avoiding
binning”. Therefore it seems clear that this method is proposed as an
alternative to binning. As such, a proper treatment of errors would be
-> We completely agree, this paper is really meant as another
motivation for the major effort of studying errors in GAN
output. We added a comment along this line to Sec.2.
2) The examples discussed in the paper do not seem to be particularly
useful from the LHC point of view (“We are aware of the fact that our
toy examples are not more than an illustration of what a subtraction
GAN can achieve”, taken from the Outlook section). Although the hope
that the method is used for actual LHC analyses is expressed (“we hope
that some of the people who do LHC event simulations will find this
technique useful”), there is no mention of possible “useful”
applications. Do the Authors know any example of “useful” application
of the GAN technique?
-> We changed the introduction, the respective sections, and the
outlook accordingly. Now there should be a clearer picture of where
such an event subtraction might come in handy.
-> We fear there is a misunderstanding in our problem statement - our
goal is to construct a network that can generate events according
to the difference of two probability distributions. The referee's
network does an excellent job in constructing the distribution
corresponding to the difference of two event samples, but it cannot
be extended to generate statistically independent events.
1. I do not think Ref. uses a generative network. They use a DNN
and show that they perform better than Ref. which uses a GAN. The
authors can maybe take a deeper look into these papers.
-> Thank you for pointing this out. We had wanted to cite Ref.
alongside the generative phase-space studies; we have now removed it.
2. The authors do not provide the code that they use or any details
about it or what framework they used (PyTorch/Sci-Kit Learn/
TensorFlow etc.). The authors also do not provide the data they used
for the training. While this is not necessary, it is useful to have
it if someone wants to reproduce their results. I would suggest the
authors provide all these details (possibly in a public repository) and
also an example code since the work is primarily computational.
-> We added a footnote clarifying that our code and our test data are
available upon request. We also added details on our software.
3. The authors do not make explicit the training times and the
hardware used for training the GANs. This is useful to benchmark it
against other regression methods.
-> As mentioned above we have doubts that this helps in comparing with
regression networks, given that we do not actually do a regression
:) In any case, we find that quoting such numbers is not helpful
in a field with collaborative spirit, but we have a track record of
happily participating in proper comparison studies.
4. The authors do not describe how they get the error-bars in the
left panels of Fig.2, Fig.3, etc. Are they from Eq.(1)?
-> They are, and we clarified this in the text.
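For orientation, the binned baseline that such error bars refer to can be sketched as follows. Eq.(1) is not reproduced on this page, so the sketch assumes the standard case of independent Poisson counts per bin, where the error on the difference is sqrt(N_B + N_S). The function name `binned_subtraction` and the exponential toy samples are hypothetical and are not taken from the paper.

```python
import numpy as np


def binned_subtraction(base, sub, bins, value_range):
    """Histogram two unweighted samples and subtract them bin by bin."""
    n_b, edges = np.histogram(base, bins=bins, range=value_range)
    n_s, _ = np.histogram(sub, bins=bins, range=value_range)
    diff = n_b - n_s
    # Assumption: independent Poisson counts, so the errors add in
    # quadrature and the error on the difference is sqrt(N_B + N_S).
    err = np.sqrt(n_b + n_s)
    return edges, diff, err


# Hypothetical toy samples, for illustration only.
rng = np.random.default_rng(0)
base = rng.exponential(scale=1.0, size=20000)
sub = rng.exponential(scale=0.5, size=10000)
edges, diff, err = binned_subtraction(base, sub, bins=20, value_range=(0.0, 5.0))
```

Under this assumption the error in each bin is at least as large as the error of either input histogram, even where the difference itself is small. This is the statistical limitation of the binned approach.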
1- It is unclear what happens if "B-S" does not have definite
sign. Namely if the event density distribution P_B(x) is larger than
P_S(x) in some region of "x", and smaller than P_S(x) in some other
region. In this case, neither P_S-P_B nor P_B-P_S are densities, and
the problem seems ill-defined. Since this can happen in potential
applications (see below), one should ask what the GAN would return if
trained on a problem of this type, and if the method would at least
allow one to recognise that there is an issue or it would instead
produce wrong results.
-> We expanded the discussion of signs and the zero function in
Sec.2.3. As a matter of fact, our CS-like example already has such
a sign problem, which we solve with an offset.
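The sign issue and a constant offset can be illustrated with a minimal sketch. It assumes that both densities are evaluated on a common one-dimensional grid and that the offset is a single constant added to the difference. How the offset actually enters the paper's setup is not described on this page, so the function `offset_difference` and the Gaussian toy densities are purely hypothetical.

```python
import numpy as np


def offset_difference(p_b, p_s, dx):
    """Return (sign_problem, offset, density) for the pointwise difference p_b - p_s."""
    diff = np.asarray(p_b, dtype=float) - np.asarray(p_s, dtype=float)
    # A sign problem means the difference is positive in some region
    # and negative in another.
    sign_problem = bool(np.any(diff > 0) and np.any(diff < 0))
    # Smallest constant shift that makes the difference non-negative everywhere.
    offset = max(0.0, -float(diff.min()))
    shifted = diff + offset
    norm = shifted.sum() * dx
    if norm <= 0:
        raise ValueError("shifted difference vanishes everywhere")
    return sign_problem, offset, shifted / norm


def gaussian(x, mean, width):
    """Normalised Gaussian density, used here only as a toy input."""
    return np.exp(-0.5 * ((x - mean) / width) ** 2) / (width * np.sqrt(2.0 * np.pi))


# Two normalised densities whose difference changes sign (toy choice).
x = np.linspace(-5.0, 6.0, 1101)
dx = x[1] - x[0]
sign_problem, offset, density = offset_difference(
    gaussian(x, 0.0, 1.0), gaussian(x, 1.0, 1.0), dx
)
```

The shifted and renormalised function is a valid density, but it is no longer the bare difference. The known constant has to be taken into account when interpreting events generated from it.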
2- The first class of applications mentioned in the manuscript are
referred to as "background subtraction". However I could not find a
discussion of what this should be concretely useful for. The example
worked out in the manuscript (photon background subtracted from
Drell-Yan, in section 3.1) does not shed light on this aspect because
it is not clear why one might want to perform such subtraction.
Maybe the method is supposed to help for problems such as extracting
the new physics contribution from a simulation containing also the
standard model, for instance in cases where the new physics effect is
small and the approach based on bins becomes computationally
unfeasible. If this is the case, it should be clearly stated in the
manuscript. However one should also take into account that performing
a subtraction would be needed only if simulating the new physics
contribution separately is not feasible. This is the case in the
presence of quantum-mechanical interference between SM and new
physics. However in the presence of interference, "B-S" does not have
definite sign in general. So the feasibility and the usefulness of the
approach in this domain depends on point "1)".
-> Again, we admit that we only work with a toy model. We now add a
brief discussion of an appropriate problem, namely the kinematics
of a GANned 4-body-decay signal from signal-plus-background and
3- The second class of applications are "subtraction" (see section
3.2). Also in this case, the final goal is not clearly stated in the
paper. A short paragraph at the end of page 11 alludes to the fact
that this could help MC@NLO event generation. If this is the case, it
should be clearly stated and extensively explained. Also, it is found
in section 3.2 that the required task of subtracting the collinear
contribution cannot be accomplished because the method cannot deal
with "B-S" distributions that are very small. Would this prevent the
method to work, eventually?
-> We added some more discussion, including the subtraction of
on-shell events as a combination of the two examples. However, we
admit that we are not MC authors with a clear vision where exactly
such a tool would enter which MC code. We also improved the
numerics in Sec.3.2 to show that given some more optimization and
enough training time we do not expect precision to be an immediate
-> Altogether, we would like to thank the three referees and everyone
who has discussed with us since the first version of the paper came
out. We have changed the paper in many places, including abstract,
introduction, physics discussions, and outlook. This is why we are
confident that the current version is significantly improved over
the original draft and hope that SciPost agrees with that