SciPost Submission Page
How to GAN Event Unweighting
by Mathias Backes, Anja Butter, Tilman Plehn, Ramon Winterhalder
This is not the current version.
- As Contributors: Tilman Plehn · Ramon Winterhalder
- Arxiv Link: https://arxiv.org/abs/2012.07873v2 (pdf)
- Date submitted: 2021-01-21 09:23
- Submitted by: Winterhalder, Ramon
- Submitted to: SciPost Physics
Event generation with neural networks has seen significant progress recently. The big open question is still how such new methods will accelerate LHC simulations to the level required by upcoming LHC runs. We target a known bottleneck of standard simulations and show how their unweighting procedure can be improved by generative networks. This can, potentially, lead to a very significant gain in simulation speed.
Submission & Refereeing History
- Report 3 submitted on 2021-02-23 20:18 by Anonymous
- Report 2 submitted on 2021-02-22 17:17 by Anonymous
- Report 1 submitted on 2021-02-19 00:20 by Anonymous
Reports on this Submission
Anonymous Report 3 on 2021-02-23 (Invited Report)
The paper explores a new approach to unweighting, an important part of MC event generation that is crucial to the analysis of collider experiments.
Some points need to be clarified - see report.
This paper continues the exploration of GANs as a tool to improve Monte Carlo simulations of high-energy particle collisions. The general idea is to train a GAN on a weighted event sample produced by a standard first-principles event generator, and then use the GAN to produce an unweighted event sample which can be directly compared with experimental data. This approach is an alternative to the standard unweighting procedure, which may discard a significant fraction of events in the weighted sample, leading to inefficiencies.
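For readers less familiar with the procedure under discussion: standard "hit-or-miss" unweighting keeps each weighted event with probability w/w_max, so the survivors carry unit weight. The following minimal sketch (illustrative only; the function name and toy data are hypothetical, not from the paper) shows why a broad weight distribution leads to the inefficiency the referee mentions — the expected acceptance rate is ⟨w⟩/w_max.

```python
import numpy as np

def hit_or_miss_unweight(events, weights, seed=0):
    """Standard hit-or-miss unweighting: accept each event with
    probability w / w_max, so accepted events have unit weight."""
    rng = np.random.default_rng(seed)
    w_max = weights.max()
    keep = rng.random(len(weights)) < weights / w_max
    return events[keep]

# Toy example: a sample with a broad weight spread retains only a
# small fraction of events, since the efficiency is <w>/w_max.
rng = np.random.default_rng(0)
events = rng.normal(size=(100_000, 4))      # dummy 4D "events"
weights = rng.exponential(size=100_000)     # broad weight distribution
unweighted = hit_or_miss_unweight(events, weights)
efficiency = len(unweighted) / len(events)
```

With an exponential weight spectrum the largest of 100k weights is an order of magnitude above the mean, so only a few percent of events survive — exactly the bottleneck the paper targets.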
Overall, this is an interesting idea to try, and as far as I know it has not been tried before, making the paper worthy of publication. However, I think a couple of points require clarification.
1. Concerning the 500k Drell-Yan events shown in Figs. 4 and 5 and used in the discussion of sec. 3: are these events uniformly distributed in the 4D phase space defined by eq. (17)? If not, what is the distribution? The reason this is important is that any standard MC algorithm applied to DY would involve a phase space mapping which would greatly improve the unweighting efficiency of the standard unweighting procedure compared to uniform sampling. So, when the authors claim at the end of Sec. 3 that “a factor 100 between standard unweighting and the uwGAN method”, it is important to be more explicit about the precise meaning of “standard unweighting”.
2. While the GAN algorithm can be used to generate large samples fast, fundamentally the GAN is an extrapolation of the training sample, so at some level its power must be limited by the size and quality of the training sample. If a physical observable is computed using the GAN-generated sample, in addition to the purely statistical uncertainty there will also be an uncertainty associated with training (see, for example, arXiv:2002.06307). If the training sample is inefficient in terms of phase space coverage (e.g. contains too many events in areas with low cross section and not enough in areas with high cross section), it seems inevitable that this will ultimately limit the precision that can be obtained with the GAN. Please comment.
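The phase-space-mapping concern raised in point 1 can be illustrated with a one-dimensional toy (a sketch under assumed toy densities, not taken from the paper): a sharply peaked target sampled uniformly yields widely spread weights and a poor hit-or-miss efficiency ⟨w⟩/w_max, while a Breit-Wigner mapping that absorbs the peak leaves nearly flat weights and a high efficiency.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, gamma, n = 0.5, 0.01, 100_000

def target(x):
    # Narrow Breit-Wigner resonance times a mild linear modulation
    return gamma / ((x - mu) ** 2 + gamma ** 2) * (1.0 + 0.5 * x)

def efficiency(w):
    # Hit-or-miss unweighting efficiency of a weight sample
    return w.mean() / w.max()

# (a) Uniform sampling on [0, 1]: the weights are the target itself,
#     so a few events near the peak dominate and efficiency is poor.
x_uni = rng.random(n)
eff_uniform = efficiency(target(x_uni))

# (b) Breit-Wigner mapping x = mu + gamma * tan(t): the proposal
#     density tracks the peak, so the weights are nearly flat.
t1, t2 = np.arctan(-mu / gamma), np.arctan((1 - mu) / gamma)
t = rng.uniform(t1, t2, n)
x_map = mu + gamma * np.tan(t)
g = gamma / ((x_map - mu) ** 2 + gamma ** 2) / (t2 - t1)  # proposal pdf
eff_mapped = efficiency(target(x_map) / g)
```

In this toy the mapped sampler is more than an order of magnitude more efficient than uniform sampling, which is why the baseline against which any "factor 100" gain is quoted matters so much.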
The points listed in the report need to be addressed.
Anonymous Report 2 on 2021-02-22 (Invited Report)
The authors discuss the application of GANs to unweighting of event samples and perform a comparative study. I think this work is interesting and could inform the next generation of Monte Carlo event generation and their processing. I would encourage the authors to address one point before publication.
Drell-Yan is a very simple process, and while their work shows that the GAN unweighting sufficiently reproduces the results of their event generator, the scalability of their results to more complex (and perhaps phenomenologically more relevant) final states is not clear. Usually GANs (like other neural networks) can turn out to be fragile constructs when degrees of freedom change and complexity increases. Could the authors comment on the scalability of their framework? Do they expect any runtime improvements in more complex final states? Would this stand in tension with (perhaps excessive) process-dependent hyperparameter tuning?
Anonymous Report 1 on 2021-02-19 (Invited Report)
In this paper, the authors apply generative adversarial networks to the unweighting problem, a computational bottleneck in the production of high-precision calculations for the LHC experiments using Monte Carlo event generators.
The authors benchmark their setup against the mainstream VEGAS algorithm but also construct a deliberately challenging unweighting problem using Drell-Yan production as an example.
The study is very interesting and has clear relevance to the success of the LHC programme.
Would the authors have any insights as to what might be the origin of the small bias causing the uwGAN curve to be horizontally shifted with respect to the true distributions in Figure 1, at least where unweighted events are involved (left and right, but seemingly not in the centre)? A similar slope is also visible in the uwGAN/Truth ratio for the 2D Gaussian (Figure 3, bottom left), so it appears to be a systematic trend. It seems odd that the GAN would do so well with the complex toy models yet fail to correct for such a comparatively straightforward shift.
The paper is overall rather well written, except for occasional machine-learning jargon that puts unnecessary obstacles in the path of readers not too well versed in the respective lingo. For example, it would improve the readability if abbreviations like "ReLU" or "ELU" were appropriately introduced in order to avoid the average reader having to consult Google. Similarly, spelling out "MMD" once would probably be beneficial.