SciPost Submission Page

The MadNIS Reloaded

by Theo Heimel, Nathan Huetsch, Fabio Maltoni, Olivier Mattelaer, Tilman Plehn, Ramon Winterhalder

This is not the latest submitted version.

Submission summary

Authors (as registered SciPost users): Theo Heimel · Nathan Huetsch · Tilman Plehn · Ramon Winterhalder
Submission information
Preprint Link: https://arxiv.org/abs/2311.01548v2  (pdf)
Date submitted: 2023-11-17 10:04
Submitted by: Winterhalder, Ramon
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approach: Computational

Abstract

In pursuit of precise and fast theory predictions for the LHC, we present an implementation of the MadNIS method in the MadGraph event generator. A series of improvements in MadNIS further enhance its efficiency and speed. We validate this implementation for realistic partonic processes and find significant gains from using modern machine learning in event generators.

Current status:
Has been resubmitted

Reports on this Submission

Report #2 by Anonymous (Referee 2) on 2024-2-8 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2311.01548v2, delivered 2024-02-08, doi: 10.21468/SciPost.Report.8525

Strengths

Well-documented application of ML techniques to the problem of importance sampling in Monte Carlo integration.

Weaknesses

The benefits achieved with the new approach are less well described.

Report

This submission documents the application of machine learning techniques, in particular invertible neural networks, for generating the importance-sampling functions used in Monte Carlo integration. The ML part of the paper contains a thorough description of the approach taken.

I would request a few clarifications from section 3 onwards before this study could be published:
A: Section 3.1 lists a few processes to which the approach is applied. The first two seem rather simple in terms of phase-space complexity. The W+jets processes are examples of W+2,3,4 jets, but all cases are gluon-initiated processes, which contribute much less to the cross section than the dominant qg channels. So I would like a description of why these processes can be considered a realistic sample in terms of complexity. Is it even relevant to discuss these processes at Born level (I presume this is the level of complexity chosen; it is not detailed anywhere), where calculations have long been considered automatic?

B: There is no description of the baseline benchmark, i.e. how the non-INN sampling is performed.

C: The presentation of the improvements is insufficiently specific. An example: "For instance, the unweighting efficiency for VBS now reaches 20%, up from a few per cent from the standard method and by more than a factor ten". How can the standard method give "a few per cent" and yet differ by more than a factor ten from 20%? Unweighting efficiency in itself is a poor measure. How well are distributions in the tails described? Does the $\chi^2$ increase? These questions need addressing too.
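
For reference, the unweighting efficiency discussed here is conventionally estimated as the mean event weight divided by a (possibly reduced) maximum weight. A minimal sketch of that textbook definition, assuming a plain array of sample weights and using the sample maximum rather than any reduced-maximum prescription:

    import numpy as np

    def unweighting_efficiency(weights, w_max=None):
        # Textbook estimate <w> / w_max; if no maximum is supplied,
        # the sample maximum is used. The paper instead quotes a
        # bootstrapped median maximum, which is generally smaller
        # and therefore yields a larger efficiency.
        weights = np.asarray(weights, dtype=float)
        if w_max is None:
            w_max = weights.max()
        return weights.mean() / w_max

    # Toy example with a long-tailed weight distribution
    rng = np.random.default_rng(0)
    w = rng.lognormal(mean=0.0, sigma=1.5, size=100_000)
    print(f"unweighting efficiency: {unweighting_efficiency(w):.3f}")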

Requested changes

As described above

  • validity: good
  • significance: good
  • originality: good
  • clarity: high
  • formatting: good
  • grammar: good

Report #1 by Anonymous (Referee 3) on 2024-1-16 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2311.01548v2, delivered 2024-01-15, doi: 10.21468/SciPost.Report.8411

Strengths

1. The authors continue their work on applying modern Machine Learning in the form of Neural Importance Sampling to the relevant computational problem of phase-space sampling. They take additional steps towards integrating the approach with the MadGraph5_aMC@NLO matrix element generator, which will make the approach available to its wide experimental and theoretical user base.

2. The reported efficiency improvements are significant, also with respect to the previous MadNIS implementation. In particular, the improvement in the unweighting efficiency indeed translates directly into speed-ups in overall parton-level event generation. The speed-ups exceed what has been reported in earlier studies with similar approaches, and I would agree with the statement in the introduction that this work goes beyond a proof-of-principle study (although one would always like to see even more results for further processes and multiplicities).

3. The authors give sufficient details on their methods and parameters etc. to ensure that their results are reproducible.

Weaknesses

1. The work could be contextualised better with respect to earlier Neural Importance Sampling applications to phase-space sampling (see "Requested changes").

2. The speed-ups are reported relative to the standard VEGAS approach, which makes sense. It is important to note, though, that the number of points in the VEGAS-based trainings seems to be orders of magnitude lower than the number of points used in the MadNIS trainings. Table 2 seems to suggest that the VEGAS trainings use O(100k) points in total, while the MadNIS trainings use $\text{min}(200 \times 0.8^{n_c}, 10\mathrm k) \cdot 88k$, i.e. between 0 and O(10M) points (?) (it is unclear whether the first argument of the min function is a typo, as this would suggest very small batch sizes for sizable channel counts $n_c$, which can be as large as 72; in particular, the first argument of the min function would always be smaller than the second argument; also, naively one would assume that the batch size should increase with $n_c$, not vice versa; is this a misunderstanding of mine? See also the short numerical sketch after this list.). The authors state that "VEGAS […] converges much faster" and that they "optimize VEGAS to the best of knowledge"; however, this is not further quantified, which would be interesting for the reader in order to judge the robustness of the comparison for themselves.

3. The implementation and results are restricted to individual partonic channels and are thus not yet applicable to the most demanding large-scale simulations, which almost always require a summation over all partonic channels. This is, however, clearly stated by the authors.
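
To make the arithmetic in weakness 2 explicit, the batch-size expression quoted from Table 2 can be evaluated at face value; a minimal sketch (assuming the expression is read literally, which is exactly what that point questions):

    # Evaluate the quoted batch-size expression min(200 * 0.8**n_c, 10k)
    # at face value. For any positive channel count n_c the first argument
    # stays below 200, so the min never selects the 10k branch.
    for n_c in (1, 2, 4, 8, 16, 32, 72):
        batch = min(200 * 0.8 ** n_c, 10_000)
        print(f"n_c = {n_c:3d}  ->  batch size = {batch:12.4e}")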

Report

This submission reports an important step in the ongoing development of a practical Neural Importance Sampling application to the phase-space sampling problem in modern event generators.

While the individual additional improvements are to some degree incremental with respect to the previously published MadNIS, they are in totality a significant step forward in terms of speed-ups and in terms of the integration with the MadGraph5_aMC@NLO framework.

Therefore, the submission is a relevant contribution. It is of a high quality and definitely worth publishing in SciPost Physics. However, I include a few points in "Requested changes" which were confusing to me and/or that I think could be improved, and I would ask the authors to address them in a minor revision.

Requested changes

1. In the last sentence of the introduction, it is stated that the gains are "orthogonal to the development of faster amplitude evaluations and hardware acceleration using highly parallelized GPU farms." This statement should cite at least the very recent and ongoing efforts in this direction, namely the effort to port MadGraph5 to GPUs, and also [arXiv:2311.06198].

2. Earlier works on NIS in the context of phase-space generation, i.e. the proof-of-principle study [arXiv:2001.05478], the application within Sherpa [arXiv:2001.10028], the i-Flow package [arXiv:2001.05486], and the application of MadNIS to the new phase-space generator Chili [arXiv:2302.10449], are all cited, but only among other more general ML-for-phase-space efforts, after the part reading "we need to improve the phase-space integration and sampling". Only the MadNIS reference [arXiv:2212.06172] is then cited again when focusing on the application of NIS to phase-space sampling in the last paragraph. This way, the manuscript does not seem to be properly contextualised in the field, given the amount of shared DNA with the works mentioned above. In conclusion, I would suggest amending the beginning of the last paragraph to briefly mention that NIS has been applied by these other groups to the same problem, instead of relying on the reader to work this out on their own.

3. As described in the "Weaknesses", I'm puzzled by the "Batch size" value in Tab. 2. The number of channels, $n_c$, is a positive integer, such that $0.8^{n_c}$ would certainly be smaller than 1. So why is there a min function applied, which would never select the second argument, i.e. the always greater value of 10k? Is this a typo? Please comment or amend.

4. Depending on how the previous point is resolved, the number of points in the MadNIS training might be much larger than the number of points for the VEGAS training (or not?). In any case, it is important to ensure that the VEGAS optimization is indeed fully converged to serve as a robust and fair baseline. The statement that the authors "optimize VEGAS to the best of knowledge" addresses this somewhat, but remains vague. I would ask the authors to elaborate (or even quantify, if possible with reasonable effort), e.g. is it indeed ensured that the unweighting efficiency resulting from the VEGAS optimization is fully stabilized? Especially for higher multiplicities, a large number of points is usually required to map out the right-hand tail of the weight distribution sufficiently well.

5. I would ask the authors to elaborate on the statement that "the maximum weight is determined by bootstrapping and taking the median" etc.; a possible reading is sketched below. This would also be helpful to better understand the issue of convergence discussed in the previous point.
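
For concreteness, a minimal sketch of one possible reading of "bootstrapping and taking the median" for the maximum weight, namely resampling the weights with replacement and taking the median of the per-resample maxima; whether this matches the authors' actual prescription is precisely what the requested elaboration should clarify:

    import numpy as np

    def bootstrap_max_weight(weights, n_resamples=100, seed=0):
        # One possible reading: resample the weights with replacement,
        # record the maximum of each resample, and return the median of
        # those maxima as a "reduced" maximum weight.
        rng = np.random.default_rng(seed)
        weights = np.asarray(weights, dtype=float)
        maxima = [
            rng.choice(weights, size=weights.size, replace=True).max()
            for _ in range(n_resamples)
        ]
        return np.median(maxima)

    # Toy example with a long-tailed weight distribution
    rng = np.random.default_rng(1)
    w = rng.lognormal(mean=0.0, sigma=1.5, size=100_000)
    print("sample maximum:     ", w.max())
    print("bootstrapped median:", bootstrap_max_weight(w))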

  • validity: high
  • significance: high
  • originality: good
  • clarity: high
  • formatting: perfect
  • grammar: perfect
