SciPost Submission Page

Phase Space Sampling and Inference from Weighted Events with Autoregressive Flows

by Rob Verheyen, Bob Stienen

This is not the latest submitted version.

This Submission thread has since been published.

Submission summary

Authors (as registered SciPost users): Bob Stienen · Rob Verheyen
Submission information
Preprint Link: https://arxiv.org/abs/2011.13445v1  (pdf)
Code repository: https://github.com/rbvh/PhaseSpaceAutoregressiveFlow
Date submitted: 2020-11-30 15:05
Submitted by: Stienen, Bob
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Computational, Phenomenological

Abstract

We explore the use of autoregressive flows, a type of generative model with tractable likelihood, as a means of efficient generation of physical particle collider events. The usual maximum likelihood loss function is supplemented by an event weight, allowing for inference from event samples with variable, and even negative event weights. To illustrate the efficacy of the model, we perform experiments with leading-order top pair production events at an electron collider with importance sampling weights, and with next-to-leading-order top pair production events at the LHC that involve negative weights.
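The weighted loss described in the abstract amounts to multiplying each event's log-likelihood term by its weight, so that the model infers the distribution the weighted sample represents. As a minimal illustration of that idea (not the authors' code, and using a toy one-dimensional Gaussian model rather than an autoregressive flow), the weighted negative log-likelihood and its closed-form minimiser can be sketched as:

```python
import numpy as np

# Toy sketch of weighted maximum-likelihood inference:
# the loss is L(theta) = -sum_i w_i * log p_theta(x_i),
# where w_i are (possibly variable) event weights.

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)   # stand-in "events"
w = rng.uniform(0.5, 1.5, size=x.size)            # stand-in variable weights

def weighted_nll(x, w, mu, sigma):
    """Weighted negative log-likelihood of a 1D Gaussian model."""
    logp = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    return -np.sum(w * logp)

# For a Gaussian, the minimiser of the weighted NLL has a closed form:
# weights simply enter as relative importances of the events.
mu_hat = np.sum(w * x) / np.sum(w)
sigma_hat = np.sqrt(np.sum(w * (x - mu_hat) ** 2) / np.sum(w))

print(mu_hat, sigma_hat)  # should recover roughly the true (2.0, 1.5)
```

For a flow model there is no closed form, so the same weighted loss would instead be minimised by gradient descent; negative weights enter the sum in exactly the same way, which is what permits training on next-to-leading-order samples.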

Current status:
Has been resubmitted

Reports on this Submission

Report #2 by Tilman Plehn (Referee 2) on 2021-1-14 (Invited Report)

  • Cite as: Tilman Plehn, Report on arXiv:2011.13445v1, delivered 2021-01-14, doi: 10.21468/SciPost.Report.2411

Strengths

1- the paper tackles some very serious challenges in LHC simulations
2- it employs new concepts and ideas
3- it represents the technical state of the art
4- the new ideas can easily be used by the Monte Carlo community

Weaknesses

1- some room for improvement in the presentation, as discussed below

Report

The paper is excellent, timely, technically state of the art, and should definitely be published in a leading physics journal.

Requested changes

Just front to back...
1- p.2, the dimensionality of phase space is less of a problem for VEGAS than for almost any other numerical tool, and I am not sure if phase space is really the leading bottleneck;
2- in the discussion of NFs as generative networks, I think our GAN and INN unfolding papers provide a useful benchmark in terms of mathematical foundation and performance;
3- p.6 I am not sure I got that right, is it true that their autoregressive flow is much more easily trained in one than in the other direction?
4- also p.6, I am sorry, but I find the discussion of the splines not very clear. What role do they play exactly?
5- somewhere at the end of Sec. 2.2 I would say clearly that the output of the flow network is a set of unweighted events, or not?
6- on p.10 it is not 100% clear what the different samplings mean. Flat is just a grid on the unit cube on one side of the NF, I assume? Where does the grid for the VEGAS sampling come from?
7- Sorry, but it is not clear to me how you compute the efficiencies in Tab.2 from the generated events, I am a little confused...
8- Fig. 4, any chance you could include the training data in those plots? I assume the 'True' is the training data before unweighting? And I find the flat case a little useless, given that we have mass peaks which we know we cannot map out;
9- also in Fig.4, the y-range of the secondary panels is good only for one of the six panels;
10- in Fig.5, LHS, how are the curves normalized? Maybe include the integral in the caption to get a feel for things?
11- Fig.6, it would be nice to also see the training data. Why is the green mW peak so smooth and so off? That worries me in terms of error control. In the RHS it might be nice to also show a log scale, so the deviations in the tail are easier to spot? In the caption there is a typo green <-> blue;
12- Not sure if it is useful, but I was wondering how the trained network looks on the latent space side;
13- is ignoring the event weight really a good reference? I think what people sometimes do in real life is ignore the negative events and hold their breath;

  • validity: top
  • significance: top
  • originality: top
  • clarity: good
  • formatting: excellent
  • grammar: perfect

Report #1 by Anonymous (Referee 1) on 2021-1-6 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2011.13445v1, delivered 2021-01-06, doi: 10.21468/SciPost.Report.2373

Strengths

The presented work contributes to the current topic of employing machine learning techniques, here in particular autoregressive flows, to sample the phase space of particle collision final states. The authors study the capabilities of the flow method to infer the target distribution of top-quark production processes from pre-generated event samples.

1- The paper nicely introduces the subject and provides extensive reference to the relevant literature.

2- The authors employ cutting-edge machine learning technology, i.e., the autoregressive flow method, to address the problem of event generation in high energy physics.

3- They consider relevant and non-trivial examples to test and benchmark their approach.

4- The paper is carefully and clearly written.

Weaknesses

To my understanding, the authors should describe more clearly the scope of their approach and the differences with respect to Refs. [66-68]. In contrast to the work presented there, by inferring the target distribution from a limited set of pre-generated events, their generator does not necessarily provide events distributed according to the correct target distribution. The discussion of the results of their experiments in Secs. 3 and 4 misses this perspective and is therefore somewhat misleading.

Report

The paper describes a novel approach to employ normalising flows as a generative model for sampling the phase space of HEP collision events. The authors consider the autoregressive flow method to infer the phase space distribution from pre-generated training sets of collision events. They use recent, state-of-the-art machine learning technology that promises significant potential to solve a pressing problem in contemporary high energy physics, namely the efficient sampling of complex, multi-dimensional phase spaces.

The proposed technique is applied to top-quark pair production at leading and next-to-leading order, thereby confronting the ansatz with varying event weights, which in the latter case can be negative.

The authors convincingly show that their method captures broad features of the underlying distribution from the training samples provided, thereby in parts outperforming the well-known VEGAS sampler.

My general criticism concerns the classification of the proposed approach as a potential alternative to traditional Monte Carlo techniques. Even though the flow method provides bijective phase space maps, in their setup the sampler is not guaranteed to produce events that follow the correct target distribution, since that distribution is inferred from a limited set of training events. Their generative model, as apparent from Figs. 4 and 5, nicely approximates the true distribution but exhibits significant deviations in certain regions of phase space.

The authors should more clearly specify and limit the scope of their approach and more carefully separate it from the work presented in Refs. [66-68]. This concerns in particular a more detailed discussion and interpretation of the results obtained, but also the claims made in the introduction and conclusions.

Requested changes

Given my comments from above, I would like to urge the authors to adjust the introduction, results discussion and conclusions accordingly. In particular:

1- Explain more clearly what is in fact shown in Figs. 4 & 5, if these are differential cross sections, they would be 'wrongly' predicted by the studied samplers. This is not compatible with the claim made in the conclusions: "first evidence for the use of autoregressive flows as a potential alternative for traditional Monte Carlo techniques".

2- How could the remaining deficiencies of the sampler systematically be cured?

3- For the flow and VEGAS approaches, do the authors expect/see an improvement with an increase in training-data statistics?

  • validity: high
  • significance: good
  • originality: high
  • clarity: high
  • formatting: excellent
  • grammar: excellent
