
SciPost Submission Page

Variance Reduction via Simultaneous Importance Sampling and Control Variates Techniques Using Vegas

by Prasanth Shyamsundar, Jacob L. Scott, Stephen Mrenna, Konstantin T. Matchev, Kyoungchul Kong

This is not the latest submitted version.

Submission summary

Authors (as registered SciPost users): Kyoungchul Kong · Konstantin Matchev · Jacob Scott · Prasanth Shyamsundar
Submission information
Preprint Link: https://arxiv.org/abs/2309.12369v1  (pdf)
Code repository: https://github.com/crumpstrr33/covvvr
Date submitted: 2023-10-02 16:51
Submitted by: Scott, Jacob
Submitted to: SciPost Physics Codebases
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Computational, Phenomenological

Abstract

Monte Carlo (MC) integration is an important calculational technique in the physical sciences. Practical considerations require that the calculations are performed as accurately as possible for a given set of computational resources. To improve the accuracy of MC integration, a number of useful variance reduction algorithms have been developed, including importance sampling and control variates. In this work, we demonstrate how these two methods can be applied simultaneously, thus combining their benefits. We provide a python wrapper, named CoVVVR, which implements our approach in the Vegas program. The improvements are quantified with several benchmark examples from the literature.
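As a rough illustration of the idea summarized in the abstract, below is a minimal numpy sketch of an importance-sampling estimate corrected by a control variate of known integral. It is not CoVVVR's implementation: the integrand f, the sampling density q, and the control variate g are placeholders chosen for simplicity (in the paper's approach the control variate is built from the Vegas grid itself).

# Minimal sketch of a combined importance-sampling + control-variate
# estimator. Illustrative only: f, q and g below are placeholders, not
# CoVVVR's actual setup (CoVVVR uses the Vegas grid density as the CV).
import numpy as np

rng = np.random.default_rng(0)

def f(x):                       # integrand on [0, 1]
    return np.exp(-50.0 * (x - 0.5) ** 2)

def q(x):                       # normalized sampling density on [0, 1]
    return 0.5 + 3.0 * x * (1.0 - x)

def g(x):                       # control variate with known integral I_g = 1
    return 6.0 * x * (1.0 - x)

def sample_q(n):                # accept-reject sampling from q (envelope 1.25 >= q)
    out = np.empty(0)
    while out.size < n:
        x = rng.random(n)
        out = np.concatenate([out, x[rng.random(n) * 1.25 < q(x)]])
    return out[:n]

N = 100_000
x = sample_q(N)
w_f = f(x) / q(x)               # plain importance-sampling weights
w_g = g(x) / q(x)               # control-variate weights, E_q[w_g] = I_g = 1

c = np.cov(w_f, w_g)[0, 1] / np.var(w_g, ddof=1)   # optimal CV coefficient
combined = w_f - c * (w_g - 1.0)                   # CV-corrected weights

print(f"IS only : {w_f.mean():.6f} +- {w_f.std(ddof=1) / np.sqrt(N):.6f}")
print(f"IS + CV : {combined.mean():.6f} +- {combined.std(ddof=1) / np.sqrt(N):.6f}")

Both estimators are unbiased for the integral of f; the CV-corrected one has lower variance whenever the two weight sets are correlated, which is the effect the paper quantifies on its benchmark functions.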

Current status:
Has been resubmitted

Reports on this Submission

Report #2 by Anonymous (Referee 1) on 2023-12-14 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2309.12369v1, delivered 2023-12-14, doi: 10.21468/SciPost.Report.8273

Strengths

The paper is clearly written and presents a novel idea of using Vegas grids as Control Variates in high-dimensional Monte Carlo integration.

Weaknesses

The chosen application examples are rather academic, and an actual real-life problem is missing.

Report

I am very sorry for the delay in reviewing the paper.

The paper is rather clearly written and presents the neat idea of employing Vegas grids as Control Variates when integrating high-dimensional functions by means of importance sampling based on Vegas. For a set of functions the authors illustrate the potential of the method to effectively reduce the variance of the MC integral estimate.

The obtained results indicate that in certain cases a variance reduction is indeed achieved, though at the price of increased computational effort that could alternatively have been invested in more samples in the standard Vegas integration. Nevertheless, I personally find it relevant and important to present the technique in a SciPost article.

However, prior to publication, the authors should address an additional aspect of their approach. Importance sampling through, e.g., Vegas serves two purposes: (i) the direct evaluation of an integral estimate, and (ii) the generation of event samples that follow the fully differential target distribution, possibly subject to an unweighting procedure.

While the authors clearly address the first case, they do not comment on how their method can be applied, and possibly needs to be generalised, when aiming at 'event generation'! According to Eq. (20) this seems straightforward, but I am wondering what happens when the phase space gets constrained through cuts, such that only a fraction of the generated events is accepted. Would this require a re-evaluation of $E_p$? The authors should discuss the applicability of their technique as a generative method.
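For readers unfamiliar with point (ii), the sketch below illustrates the standard accept-reject unweighting step the report refers to. It is a generic illustration with hypothetical toy inputs (the unweight helper, the placeholder weights and the toy cut are all assumptions), not part of CoVVVR.

# Generic accept-reject unweighting sketch (the standard procedure, not
# CoVVVR-specific): weighted events (x_i, w_i) drawn from the sampling
# density are turned into an unweighted sample by keeping each event with
# probability w_i / w_max. Events failing analysis cuts simply get w_i = 0.
import numpy as np

rng = np.random.default_rng(1)

def unweight(events, weights):
    """Return the events accepted with probability weights / weights.max()."""
    keep = rng.random(len(weights)) < weights / weights.max()
    return events[keep]

# Hypothetical toy input: 2D phase-space points with placeholder weights.
events = rng.random((10_000, 2))
weights = np.exp(-10.0 * (events[:, 0] - 0.5) ** 2)
weights[events[:, 1] > 0.9] = 0.0          # toy 'cut' zeroing some events

print(f"unweighting efficiency: {len(unweight(events, weights)) / len(events):.3f}")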

Requested changes

1) The authors should consider adding more references at certain places, in particular concerning ML methods for integration in the introduction, where currently only Ref. [6] is cited. There are many more papers that should be cited here.

2) Also in the introduction, the authors state that the 'change of variables ... is difficult, if not impossible, to accomplish ...'. It is certainly difficult, but this is precisely the task that normalising flows, or more generally invertible neural networks (INNs), have recently been applied to, and they have proven quite versatile. This should clearly be mentioned here.

3) Similarly, in the conclusion the authors should cite papers supporting the statements about inference techniques. Furthermore, citing only Ref. [24] for the use of domain knowledge about the integrand is by no means adequate. This is a standard technique used in ALL event generators, not just MadGraph!

  • validity: good
  • significance: good
  • originality: high
  • clarity: high
  • formatting: good
  • grammar: good

Report #1 by Anonymous (Referee 2) on 2023-10-17 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2309.12369v1, delivered 2023-10-17, doi: 10.21468/SciPost.Report.7938

Strengths

1. The paper is very well written and explained, making it accessible also to non-experts.

2. The authors follow a list of examples from another paper, which helps avoid bias in the presentation of their results.

3. The idea of the paper is original and important for the field of high-energy physics, in view of the future High-Luminosity run.

Weaknesses

1. The main results of the paper are not very exciting, in the sense that the gains over simply using Vegas are often quite marginal.

2. The reasons why the method sometimes works and sometimes does not are not investigated or explained in the paper.

3. The drawbacks and potential issues of control variates, importance sampling, and Vegas are not covered in the paper, which can give a false impression to non-expert users.

Report

This paper presents a code implementing an interesting way to combine two well-known methods of numerical integration. While the paper is well written, the results are not impressive. Consequently, I have some doubts that the method/code will be significantly used in the future.

In terms of the acceptance criteria for SciPost Physics Codebases, all criteria are passed with flying colors except the one on "added value", which, as said above, is more borderline (but certainly present in some specific cases). In conclusion, my recommendation is to publish this paper.

Given the weaknesses reported above, I suggest below some modifications and clarifications to the paper, in the hope of improving its clarity and impact. However, I do not consider these changes mandatory for publication.

Requested changes

As said in the report above, the requested changes are mainly suggestions on points that I would find interesting for the authors to comment on.

1. Given the importance in our field of event generator codes (Pythia, MadGraph, Sherpa, ...), which many readers will have in mind when reading this paper, it would be important to comment on the impact on event generation and, in particular, on the unweighting efficiency.

2. It would be good to give more details on references [10] and [11] in the introduction, in order to provide a better picture of how novel your approach is compared to those two papers.

3. I have some doubts about the validity of Equation (19) if $p(x)$ and $g(x)$ are not normalised to one. My issue is understanding why the last term is inside the integral rather than outside it. While not critical for the paper, I would kindly ask the authors to double-check that formula (for reference, the standard control-variate estimator is written out after this list).

4. It may be interesting to comment, in either Section 2.3 or Section 2.5, on the situation where the CV function is constant (which in Section 2.5 corresponds to the case where the grid has converged).

5. From the toy functions used in the paper, I would claim that only two classes of functions provide a significant gain in the case with one CV, namely the Gaussian and the polynomial cases. I would have liked the authors to comment on, or investigate, the underlying reason for the absence (or presence) of such gains. My (perhaps naive) guess is that this is directly related to whether or not the function is separable dimension by dimension.

6. In Table 1, the authors do not comment on the odd RMS value for the Gaussian case with 8 dimensions. In this case the RMS is significantly larger with CV, yet according to Table 2 the variance is reduced by 20%. Could the authors check whether there is a typo in one of the tables, or explain the reason for this apparent mismatch?

7. In Figure 2, it is not clear what "normalised variance" means, i.e. whether it is the normalised variance with or without CV for that iteration. The authors could also consider putting both on the plot, which could be instructive for understanding the figure (see next point). This should also help to comment further on the reason for the correlations observed in that figure, which are also not explained.
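Regarding point 3 above, and purely for reference, the textbook control-variate-corrected importance-sampling estimator reads (this is not necessarily the paper's Eq. (19), whose exact form is not reproduced here):

$$\hat I \;=\; \frac{1}{N}\sum_{i=1}^{N}\left[\frac{f(x_i)}{p(x_i)} \;-\; c\left(\frac{g(x_i)}{p(x_i)} - I_g\right)\right],
\qquad x_i \sim p,\quad I_g = \int g(x)\,\mathrm{d}x,\quad
c = \frac{\operatorname{Cov}\!\left(f/p,\; g/p\right)}{\operatorname{Var}\!\left(g/p\right)}.$$

Here $p$ must be a properly normalised density, whereas $g$ only needs a known integral $I_g$ (equal to one if $g$ is itself normalised); the correction term is evaluated sample by sample, i.e. inside the Monte Carlo sum.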

  • validity: top
  • significance: low
  • originality: high
  • clarity: top
  • formatting: perfect
  • grammar: -
