SciPost Submission Page
Exploring phase space with Neural Importance Sampling
by Enrico Bothmann, Timo Janßen, Max Knobbe, Tobias Schmale, Steffen Schumann
This is not the latest submitted version.
This Submission thread is now published as
|Authors (as registered SciPost users):||Enrico Bothmann · Timo Janßen · Max Knobbe · Steffen Schumann|
|Preprint Link:||https://arxiv.org/abs/2001.05478v2 (pdf)|
|Date submitted:||2020-01-30 01:00|
|Submitted by:||Schumann, Steffen|
|Submitted to:||SciPost Physics|
|Approaches:||Theoretical, Computational, Phenomenological|
We present a novel approach for the integration of scattering cross sections and the generation of partonic event samples in high-energy physics. We propose an importance sampling technique capable of overcoming typical deficiencies of existing approaches by incorporating neural networks. The method guarantees full phase space coverage and the exact reproduction of the desired target distribution, in our case given by the squared transition matrix element. We study the performance of the algorithm for a few representative examples, including top-quark pair production and gluon scattering into three- and four-gluon final states.
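As an illustration of the idea summarized in the abstract, the following is a minimal sketch of importance sampling through a bijective variable transform. It is not the authors' method: a fixed analytic map stands in for the trained neural network, and the names `target`, `mapping`, `estimate` and the parameter `A` are hypothetical, as is the toy integrand. Because the map is a bijection of the unit interval with strictly positive density, every point of the integration domain can be reached, and dividing by the density keeps the estimate unbiased.

```python
import numpy as np

A = 0.01  # hypothetical parameter controlling how sharp the peak is

def target(x):
    """Toy integrand on [0, 1], strongly peaked near x = 0 (assumed example)."""
    return (1.0 + 0.5 * x) / (x + A)

def mapping(u):
    """Bijective map u in [0, 1] -> x in [0, 1] with known density g(x)."""
    r = (1.0 + A) / A
    x = A * r**u - A                  # inverse CDF of g
    g = 1.0 / ((x + A) * np.log(r))   # g(x) = |du/dx|, strictly positive
    return x, g

def estimate(weights):
    """Monte Carlo mean of the weights and its standard error."""
    return weights.mean(), weights.std(ddof=1) / np.sqrt(weights.size)

rng = np.random.default_rng(0)
u = rng.random(100_000)

flat_mean, flat_err = estimate(target(u))   # uniform sampling, g = 1
x, g = mapping(u)
is_mean, is_err = estimate(target(x) / g)   # importance sampling, weight f/g

# Closed-form value of the toy integral, used only as a cross-check.
exact = 0.5 + (1.0 - 0.5 * A) * np.log((1.0 + A) / A)
print(f"exact {exact:.4f}  flat {flat_mean:.4f} +- {flat_err:.4f}  "
      f"mapped {is_mean:.4f} +- {is_err:.4f}")
```

Both estimates target the same integral; the mapped one has a much smaller uncertainty because the weights `target(x) / g` are nearly constant.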
Reports on this Submission
Report 2 by Tilman Plehn on 2020-3-2 (Invited Report)
- Cite as: Tilman Plehn, Report on arXiv:2001.05478v2, delivered 2020-03-02, doi: 10.21468/SciPost.Report.1548
The paper is one of the first applications of machine learning to MC generation and one of the first applications of normalizing flow networks. It is very well done, technically state of the art, and indicates serious potential.
Just little stuff, plus why does the bloody 4-jet case not really work???
I really like the paper, it should be published with minor changes. Requested changes are largely suggestions concerning the presentation.
1- conceptually, I am not sure I understand the two figures of merit and how they are linked. Any chance you could explain more about this on p.3, where they are introduced?
2- I am sorry, but I find the beginning of Sec.2.3 hard to read for NN-non-experts. Any chance you could illustrate it better? Do you have an image illustrating your network structure?
3- I am not super-happy with the coverage arguments, which are partly linked to mapping an infinite weight space on a finite phase space. What exactly is not possible there? We did look into this problem in our Bayesian classification paper, and it is painful with the sigmoid function, but I do not understand the fundamental issues you are implying;
4- looking at Fig.1 (lower) I would argue that all you see is a lack of training data in the tails, which, as we know, causes problems for GANs universally, for instance in event generation or in unfolding. Why do you consider this a fundamental problem?
5- concerning top decays - why is VEGAS not good? Or is flat just unusually good, because you mapped out the BW?
6- please explain Fig.2 more carefully, like what does it show, and what do the peaks imply? By the way, why not use a linear weight scale for Fig.2? The choice does not seem obvious given the scale-less latent space;
7- at the latest towards the bottom of p.11 I am wondering if the definition of the loss function is related to the two figures of merit, please say something about this;
8- in Sec. 4.2 the elephant sitting in the room is the lack of improvement in the 4-jet case. Please say more about this, including maybe what the reason might be and what you tried to improve it. Do you really think it can be improved with better GPUs? It looks like a problem of multi-channel naively combined with NNs, also after reading Ho-Chi's paper;
9- On with some comments concerning the presentation: I have to admit that I do not like switching back and forth between integral representations like Eq.(1) and sum representations like Eq.(2). Why do you do that? In any case, if you do this please make sure everything is well defined...
10- please explain the phase space coverage for non-experts, because it is the main constraint on applying NNs to phase space generation, c.f. p.3, point (i);
11- at the beginning of Sec.2.1 it might be useful to mention that all of this is just a variable transform where we only really care about the Jacobian?
12- please explain Eq.(11) more carefully, and how it relates to Eq.(7);
13- what does the \sim before Eq.(12) mean?
14- the symbols in Eq.(23) are not clearly defined there, I think;
15- why the funny E_N notation in Tab.1?
16- moving Eq.(24) to somewhere in the introduction might make the introduction a little less dry/formal;
17- I am sorry, but the argument in the second paragraph on p.13 (The event weight distribution) is not clear to me. Can you have a look at this paragraph, please?
18- for applications of flow networks in physics it would be nice to cite the paper(s) by Ulli Koethe. They are not particle physics, but they are the first physics applications I know of.
- Cite as: Anonymous, Report on arXiv:2001.05478v2, delivered 2020-02-26, doi: 10.21468/SciPost.Report.1537
1 - The paper provides a thorough review of existing methods for adaptive Monte-Carlo integration. It outlines clearly how known techniques based on Neural Networks can fail to fill the full phase space and therefore yield a biased integration result.
2 - The authors construct a novel adaptive integration algorithm based on Neural Networks and Normalizing Flows. They apply this algorithm to various test cases, including a three-body decay, top-quark pair production and decay at a lepton collider, and partonic three- and four-jet production.
3 - They present a comprehensive comparison between one of the best existing adaptive integrators (Vegas) and their new technique for weight distributions, event generation efficiencies and integration uncertainties on physical observables of potential experimental interest.
The paper provides a thorough review of existing methods for adaptive Monte-Carlo integration. It outlines clearly how some techniques based on Neural Networks can fail to fill the full phase space and therefore yield a biased integration result.
The authors then proceed to construct a novel adaptive integration algorithm based on Neural Networks and Normalizing Flows. They apply this algorithm to various test cases, including a three-body decay, top-quark pair production and decay at a lepton collider, and partonic three- and four-jet production.
They present a comprehensive comparison between one of the best existing adaptive integrators (Vegas) and their new technique for weight distributions, event generation efficiencies and integration uncertainties on physical observables of potential experimental interest. They conclude that the efficiency of their novel integrator is superior to Vegas at low final-state multiplicity, but drops at higher multiplicity due to a lower compute efficiency.
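The quantities compared in such studies can be computed directly from a sample of event weights. The sketch below assumes the commonly used definitions, namely the unweighting efficiency as mean weight over maximum weight and the relative standard error of the mean as the integration uncertainty. These definitions, the function names `figures_of_merit` and `unweight`, and the toy weights are assumptions for illustration and are not quoted from the paper.

```python
import numpy as np

def figures_of_merit(weights):
    """Return (unweighting efficiency, relative integration error)."""
    w = np.asarray(weights, dtype=float)
    mean = w.mean()
    efficiency = mean / w.max()                         # in (0, 1]
    rel_err = w.std(ddof=1) / (np.sqrt(w.size) * mean)  # sigma / (sqrt(N) * mean)
    return efficiency, rel_err

def unweight(events, weights, rng):
    """Hit-or-miss rejection: keep each event with probability w / w_max."""
    w = np.asarray(weights, dtype=float)
    accept = rng.random(w.size) * w.max() < w
    return events[accept]

rng = np.random.default_rng(1)
events = rng.random(50_000)   # toy one-dimensional "events"
weights = 1.0 + events        # toy weights in [1, 2)

eff, rel_err = figures_of_merit(weights)
kept = unweight(events, weights, rng)
print(f"efficiency {eff:.3f}, relative error {rel_err:.2e}, "
      f"accepted fraction {kept.size / events.size:.3f}")
```

The accepted fraction of the rejection step fluctuates around the unweighting efficiency, which is why a narrow weight distribution translates directly into cheaper generation of unit-weight events.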
I highly recommend this preprint for publication. I would suggest a minor modification, which is to include a reference to Nucl.Phys. B9 (1969) 568-576 in the last paragraph of Sec.2, where channel construction is discussed.