SciPost Submission Page
How to GAN Event Subtraction
by Anja Butter, Tilman Plehn, Ramon Winterhalder
This is not the current version.
As Contributors: Tilman Plehn · Ramon Winterhalder
Arxiv Link: https://arxiv.org/abs/1912.08824v2
Submitted by: Winterhalder, Ramon
Submitted to: SciPost Physics
Subject area: High-Energy Physics - Phenomenology
Subtracting and adding event samples are common problems in LHC simulations. We show how generative adversarial networks can produce new event samples with a phase space distribution corresponding to added or subtracted input samples. We illustrate some general features using toy samples and then show explicit examples of background and non-local collinear subtraction events in terms of unweighted 4-vector events. This event sample manipulation reflects the excellent interpolation properties of neural networks.
Reports on this Submission
Anonymous Report 3 on 2020-3-22 Invited Report
The application of GAN techniques discussed in the manuscript is new, as far as I can tell, and it is quite clearly explained in the manuscript.
The actual domain of applicability of the method to concrete collider problems is not clear.
The manuscript reports on the idea of using GANs to perform the following operation. Using two sets of events (say, "B" and "S", possibly generated by two Monte Carlo generators), one can train a GAN to produce a generator of events that follow the difference between the distributions of the two original sets (i.e., "B-S"). The method is applied to toy problems, with the aim of outlining potential applications to collider phenomenology.
This application of GAN techniques is new, as far as I can tell, and it is quite clearly explained in the manuscript. The manuscript, however, leaves a number of open questions concerning the actual domain of applicability of the method to concrete collider problems.
In light of this, I believe that the manuscript is not ready for publication in its current form. If the authors can address the points reported below and clarify the potential of the method to address concrete problems (and convincingly support its competitive advantages), one could then turn to other, less crucial aspects related to the presentation of the GAN algorithm and the comparison with other strategies that might be employed for the same task, which should perhaps be slightly extended.
The following points should be addressed before the manuscript can be considered for publication:
1- It is unclear what happens if "B-S" does not have a definite sign, namely if the event density distribution P_B(x) is larger than P_S(x) in some region of "x" and smaller than P_S(x) in some other region. In this case, neither P_S-P_B nor P_B-P_S is a density, and the problem seems ill-defined. Since this can happen in potential applications (see below), one should ask what the GAN would return if trained on a problem of this type, and whether the method would at least allow one to recognise that there is an issue or would instead produce wrong results.
On page 11 it is mentioned that the sign of "B-S" is not a problem because one can always learn "S-B" instead of "B-S". However, this seems to assume that "B-S" has a fixed sign on the entire feature (x) space, so this comment is not sufficient to address the question above.
2- The first class of applications mentioned in the manuscript is referred to as "background subtraction". However, I could not find a discussion of what this should be concretely useful for. The example worked out in the manuscript (photon background subtracted from Drell-Yan, in section 3.1) does not shed light on this aspect because it is not clear why one might want to perform such a subtraction.
Maybe the method is supposed to help with problems such as extracting the new physics contribution from a simulation that also contains the standard model, for instance in cases where the new physics effect is small and the approach based on bins becomes computationally unfeasible. If this is the case, it should be clearly stated in the manuscript. However, one should also take into account that performing a subtraction would be needed only if simulating the new physics contribution separately is not feasible. This is the case in the presence of quantum-mechanical interference between SM and new physics. However, in the presence of interference, "B-S" does not have a definite sign in general. So the feasibility and the usefulness of the approach in this domain depend on point 1.
3- The second class of applications is "subtraction" (see section 3.2). Also in this case, the final goal is not clearly stated in the paper. A short paragraph at the end of page 11 alludes to the fact that this could help MC@NLO event generation. If this is the case, it should be clearly stated and extensively explained. Also, it is found in section 3.2 that the required task of subtracting the collinear contribution cannot be accomplished because the method cannot deal with "B-S" distributions that are very small. Would this ultimately prevent the method from working?
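The sign question raised in point 1 can be made concrete with a simple binned cross-check. The following is a minimal sketch, not taken from the manuscript: the function names `binned_difference` and `sign_status`, the 3-sigma threshold, and the Gaussian toy samples are all assumptions made for illustration. It also assumes that the two sample sizes reflect the relative rates of "B" and "S".

```python
import numpy as np

def binned_difference(sample_b, sample_s, bins):
    """Per-bin estimate of N_B - N_S with its Poisson uncertainty."""
    n_b, edges = np.histogram(sample_b, bins=bins)
    n_s, _ = np.histogram(sample_s, bins=edges)  # reuse identical bin edges
    diff = n_b - n_s
    sigma = np.sqrt(n_b + n_s)  # independent Poisson errors added in quadrature
    return diff, sigma, edges

def sign_status(diff, sigma, n_sigma=3.0):
    """Report whether B-S is significantly positive, negative, or both."""
    has_pos = bool(np.any(diff > n_sigma * sigma))
    has_neg = bool(np.any(diff < -n_sigma * sigma))
    if has_pos and has_neg:
        return "indefinite"    # B-S changes sign: neither B-S nor S-B is a density
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "inconclusive"      # no bin deviates significantly from zero

# Hypothetical toy input: two shifted Gaussians, so B-S changes sign near x = 0.5.
rng = np.random.default_rng(0)
toy_b = rng.normal(0.0, 1.0, 100_000)
toy_s = rng.normal(1.0, 1.0, 100_000)
toy_diff, toy_sigma, _ = binned_difference(toy_b, toy_s, np.linspace(-4.0, 5.0, 19))
print(sign_status(toy_diff, toy_sigma))
```

A check of this kind only diagnoses the input samples; it says nothing about what a GAN trained on such samples would return.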
Anonymous Report 1 on 2020-3-7 Invited Report
A machine-learning method is proposed to construct new event samples that follow a distribution obtained by summing or subtracting given input samples. The method is based on the use of generative adversarial networks (GANs).
The GAN architecture takes two (or more) sets of input events following given distributions and trains a generator whose output follows a distribution corresponding to a linear combination of the given inputs. Typical cases correspond to the sum or subtraction of distributions.
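In formula form, the target distribution described above could be written as follows. The symbols $P_i$, $P_G$, the weights $a_i$ and the normalisation $\mathcal{N}$ are notation introduced here for illustration and are not taken from the manuscript.

```latex
P_G(x) = \frac{1}{\mathcal{N}} \sum_i a_i \, P_i(x) ,
\qquad
\mathcal{N} = \int \mathrm{d}x \, \sum_i a_i \, P_i(x) > 0 ,
\qquad
\sum_i a_i \, P_i(x) \ge 0 \quad \text{for all } x .
```

Subtraction of two samples corresponds to $a_B = 1$, $a_S = -1$, and addition to $a_B = a_S = 1$. The last condition is needed for $P_G$ to be a valid density.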
Simple one-dimensional toy examples are considered to explain the general GAN architecture and to test its applicability. In these cases it is shown that the GAN approach can correctly reproduce the subtracted distribution, within the 1-sigma error band derived by a binned analysis.
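For reference, the binned 1-sigma band mentioned above can be sketched with standard Poisson error propagation. This is a minimal illustration under stated assumptions, not the manuscript's code: the helper names `subtraction_band` and `fraction_within_band` and all bin counts are hypothetical.

```python
import numpy as np

def subtraction_band(n_b, n_s):
    """Binned difference N_B - N_S and its 1-sigma Poisson uncertainty."""
    n_b = np.asarray(n_b, dtype=float)
    n_s = np.asarray(n_s, dtype=float)
    diff = n_b - n_s
    sigma = np.sqrt(n_b + n_s)  # sqrt(sigma_B^2 + sigma_S^2) with sigma^2 = N
    return diff, sigma

def fraction_within_band(candidate, diff, sigma, n_sigma=1.0):
    """Fraction of bins in which a candidate histogram lies inside the band."""
    candidate = np.asarray(candidate, dtype=float)
    inside = np.abs(candidate - diff) <= n_sigma * sigma
    return float(np.mean(inside))

# Hypothetical bin counts for the two input samples.
counts_b = [10000, 8000, 6000]
counts_s = [9000, 7500, 5900]
diff, sigma = subtraction_band(counts_b, counts_s)

# Hypothetical histogram of generated events, scaled to the same normalisation.
generated = [1100, 520, 400]
print(diff, sigma, fraction_within_band(generated, diff, sigma))
```

In the last bin of this toy input the difference is 100 events while the uncertainty is about 109, so the relative error of the binned subtraction exceeds 100% once the two samples nearly cancel.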
Two examples with actual LHC Monte Carlo simulations are also considered. The first one is the subtraction of the photon continuum from the p p -> e+ e- process at the LHC. The second one is the subtraction of the collinear radiation part from the p p -> Z g process. In both cases the GAN method seems able to perform the desired subtraction.
There are a couple of points that are not fully discussed in the paper, but are crucial to understand the usefulness of the proposed GAN method.
1) The first point regards the reconstruction error associated with the GAN approach. In the toy examples it is shown that the GAN approach is able to reproduce the target distribution within the 1-sigma error band obtained from a binned analysis. This result, however, might be strongly influenced by the hyper-parameters, i.e. the neural network structure, the training parameters and the training algorithm. In all the examples the hyper-parameters are carefully adjusted to obtain a good performance.
It is not at all clear, and not discussed in the text, how the reconstruction error is influenced by those choices. In a more realistic set-up, in which one cannot compute the error from a binned analysis, one would not know a priori the error associated with the GAN reconstruction. This might be an issue if one wants to use these techniques for physics analyses, in which all the sources of systematic errors should be carefully estimated and taken into account.
Is there a way to get such an estimate for the GAN approach?
Notice that, at the end of the Outlook section, there is the sentence “we have shown how to use a GAN to manipulate event samples avoiding binning”. Therefore, it seems clear that this method is proposed as an alternative to binning. As such, a proper treatment of errors would be needed.
2) The examples discussed in the paper do not seem to be particularly useful from the LHC point of view (“We are aware of the fact that our toy examples are not more than an illustration of what a subtraction GAN can achieve”, taken from the Outlook section). Although the hope that the method is used for actual LHC analyses is expressed (“we hope that some of the people who do LHC event simulations will find this technique useful”), there is no mention of possible “useful” applications. Do the Authors know of any example of a “useful” application of the GAN technique?