SciPost Submission Page
GANplifying Event Samples
by Anja Butter, Sascha Diefenbacher, Gregor Kasieczka, Benjamin Nachman, Tilman Plehn
- Published as SciPost Phys. 10, 139 (2021)
As Contributors: Sascha Diefenbacher · Tilman Plehn
Arxiv Link: https://arxiv.org/abs/2008.06545v3
Date submitted: 2021-03-26 11:47
Submitted by: Diefenbacher, Sascha
Submitted to: SciPost Physics
A critical question concerning generative networks applied to event generation in particle physics is whether the generated events add statistical precision beyond the training sample. We show for a simple example with increasing dimensionality how generative networks indeed amplify the training statistics. We quantify their impact through an amplification factor or equivalent numbers of sampled events.
Author comments upon resubmission
We are thankful to the referees for their insightful and helpful comments and suggestions. We integrated the suggestions into the text, with any modifications indicated in the list of Changes. Additionally we would like to clarify some of the raised points:
- I think that the factor x25 of expected statistics increase should actually be 19. My math is: LHC data amount to 160 fb^-1. HL-LHC should deliver 3000 fb^-1. So 3000/160 ~ 19. To get your 25, one would need LHC data to amount to 120 fb^-1.
-> We had based this on the CERN website (https://home.cern/resources/faqs/high-luminosity-lhc), which states that the HL-LHC target is 4000 fb^-1.
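The two factors in this exchange follow from simple ratios of integrated luminosities. The short check below uses only the numbers quoted above; the variable names are illustrative and not taken from the paper.

```python
# Integrated luminosities in fb^-1, as quoted in the exchange above.
lhc_lumi = 160.0             # referee's figure for the current LHC data set
hl_lhc_referee = 3000.0      # HL-LHC figure assumed by the referee
hl_lhc_cern_faq = 4000.0     # HL-LHC target quoted by the authors from the CERN FAQ

# Expected increase in statistics under each assumption.
factor_referee = hl_lhc_referee / lhc_lumi
factor_authors = hl_lhc_cern_faq / lhc_lumi

print(f"referee: {factor_referee:.2f}, authors: {factor_authors:.2f}")
```

With the 4000 fb^-1 target, the factor of 25 follows from the same 160 fb^-1 baseline the referee uses.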
- I am concerned by the fact that 100 points are not a lot, and the fit in Fig. 1 shows some bias (within statistical uncertainties, of course). Did you verify that the fit is unbiased on average? Did you try more options than your 100-point experiments? I am worried that the results of Fig. 1 might be biased by the low statistics of the fit sample.
-> We have generated a sizeable number of versions of Fig. 1 without observing a common bias. For larger training samples we see a similar effect, but of course less dramatic. We now also mention this in the text.
- You say on page 4 that your uncertainty evaluation for the fit is equivalent to using the full covariance matrix returned by the fit. Did you verify that? Why not just use the covariance matrix? I would be interested to see the comparison. I agree with you that this is true in the absence of fit biases, for perfectly chi-sq distributions. But are you in that situation? Your likelihood should have a two-fold ambiguity, and the two minima might overlap if the parameter determination is not precise enough [here I am assuming that the fit parameters are the two mu and sigma + the relative fraction]
-> Thanks for pointing this out. We have amended the sentence to be clearer, as it is indeed not just the fit uncertainty, but also the variation stemming from the individual dataset. The main reason we chose this method is to make sure we are consistent between our three approaches.
- How was the GAN architecture chosen? Was any optimization performed? Did you try alternative architectures?
-> We performed some small by-hand optimization, mostly varying the learning rates and the depth of the networks. We also tested other setups, such as W-GANs and GANs without gradient penalty. However, once we found a method that produced consistent and stable training results, we did not optimize the setup further, so as to keep our message as general as possible.
- In the 2D example, why did you centre one of the two Gaussians at a negative r value? Doing so, you reduce this contribution to a tail below the other Gaussian, so the sense of the camel back shape is gone. I was puzzled by this choice. I would have used some other positive value of mu for the second Gaussian. The same comment applies to the N-dim case.
-> The reason we use this specific formula is somewhat involved. In principle, what we wanted is a Gaussian located at +4 that we can use as our radius. We sample from this Gaussian, then multiply the radii with vectors sampled from an N-dimensional unit sphere, and the end result is our N-dimensional data. However, a simple Gaussian at +4, N(4,1), with a cutoff at 0 is not normalized. This led us to use the absolute value |N(4,1)|. This is, however, equivalent to the expression N(4,1) + N(-4,1), as our dataset is by design symmetric around the origin: it is irrelevant whether half of the unit-sphere vectors are assigned a positive radius and the other half a negative one, or whether all are assigned a positive radius; the resulting distributions are equivalent. We went with this expression over |N(4,1)| because we hoped it would make it easier to see the similarities between 1-D and N-D.
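The sampling procedure described in this reply can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `sample_shell_abs` and `sample_shell_signed`, the seed handling, and the use of normalized Gaussian vectors to obtain uniform directions on the unit sphere are all choices made here for the example.

```python
import numpy as np


def _unit_directions(rng, n_events, n_dim):
    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    vecs = rng.normal(size=(n_events, n_dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)


def sample_shell_abs(n_events, n_dim, mu=4.0, sigma=1.0, seed=0):
    """Radius drawn from |N(mu, sigma)|, multiplied by a random unit vector."""
    rng = np.random.default_rng(seed)
    radii = np.abs(rng.normal(mu, sigma, size=n_events))
    return radii[:, None] * _unit_directions(rng, n_events, n_dim)


def sample_shell_signed(n_events, n_dim, mu=4.0, sigma=1.0, seed=0):
    """Radius drawn from an equal mixture of N(mu, sigma) and N(-mu, sigma)."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=n_events)
    radii = rng.normal(signs * mu, sigma)  # signed radii, centered at +mu or -mu
    return radii[:, None] * _unit_directions(rng, n_events, n_dim)


points_abs = sample_shell_abs(1000, 5)
points_signed = sample_shell_signed(1000, 5, seed=1)
print(points_abs.shape, points_signed.shape)
```

Because the directions are symmetric under a sign flip, a negative radius times a unit vector is just a positive radius times the opposite unit vector, so both functions should populate the same shell around the origin.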
- In the Introduction, the authors state that NNs go beyond naive interpolation due to the structure of the network, which e.g. introduces some level of smoothness in the functions it tries to represent. That statement seems intuitive, but one is left with the question of to what extent it holds for different NN architectures and datasets. For example, could the authors comment on why they chose this particular camel back example, and whether they have tested their procedure with other functional forms?
-> We have played with other functional forms. The main issue was to avoid Gaussian-like functions, because they can be trivially learned. The camel back seemed as un-Gaussian as possible, and its topology with a central hole adds a challenge to the network.
List of changes
- Added reference to 'challenging the current simulation tools' statement
- Added a brief discussion to the introduction about the trade-off between statistical and systematic precision for GANs.
- Added mention in text that amplification is also observable for larger sample sizes
- Made descriptions of the 3 compared approaches (Sample, fit, GAN) more visible
- Updated the figures to be more readable, and updated the caption to explain the dotted lines.
- Amended the sentence on fit uncertainties to be clearer
- Clarified sentence regarding the effects of increasing the number of quantiles
- Added a brief explanation on how we prevent mode collapse in the GAN training.
- Added a brief comment why we increase the number of samples from 100 to 500 when going to 5D
Reports on this Submission
Anonymous Report 2 on 2021-5-19 (Invited Report)
I am satisfied with the answers provided by the authors and I recommend that the work be accepted for publication.
Report 1 by Veronica Sanz on 2021-4-27 (Invited Report)
I am happy with the changes made in the text after referee 1 and I made recommendations. I recommend publication in its current form.