SciPost Submission Page

A standard convention for particle-level Monte Carlo event-variation weights

by Enrico Bothmann, Andy Buckley, Christian Gütschow, Stefan Prestel, Marek Schönherr, Peter Skands, Jeppe Andersen, Saptaparna Bhattacharya, Jonathan Butterworth, Gurpreet Singh Chahal, Louie Corpe, Leif Gellersen, Matthew Gignac, Deepak Kar, Frank Krauss, Jan Kretzschmar, Leif Lönnblad, Josh McFayden, Andreas Papaefstathiou, Simon Plätzer, Steffen Schumann, Michael Seymour, Frank Siegert, Andrzej Siódmok

Submission summary

Authors (as registered SciPost users): Enrico Bothmann · Andy Buckley · Jonathan Butterworth · Louie Corpe · Leif Gellersen · Christian Gutschow · Deepak Kar · Frank Krauss · Leif Lönnblad · Andreas Papaefstathiou · Simon Plätzer · Steffen Schumann · Frank Siegert
Submission information
Preprint Link: https://arxiv.org/abs/2203.08230v4
Date accepted: 2022-10-20
Date submitted: 2022-10-11 12:43
Submitted by: Buckley, Andy
Submitted to: SciPost Physics Core
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Computational, Phenomenological

Abstract

Streams of event weights in particle-level Monte Carlo event generators are a convenient and immensely CPU-efficient approach to express systematic uncertainties in phenomenology calculations, providing systematic variations on the nominal prediction within a single event sample. But the lack of a common standard for labelling these variation streams across different tools has proven to be a major limitation for event-processing tools and analysers alike. Here we propose a well-defined, extensible community standard for the naming, ordering, and interpretation of weight streams that will serve as the basis for semantically correct parsing and combination of such variations in both theoretical and experimental studies.

Author comments upon resubmission

We have addressed the issues identified by the reviewer; our thanks for their efforts, which have again improved the presentation and fixed some ambiguities and minor errors.

List of changes

1. The duplication and confusing sentence in Section 2.3 have been resolved, and we have expanded on the meanings of different prefixes and explicitly stated that streams not intended for physical reweighting interpretation *must* be marked as such with a suitable prefix and vector position.

2. Searching for the EBNF ISO reference highlighted that this form is only semi-openly available, and has been critiqued for some cryptic constructions that certainly affected the use-case here. We have taken the opportunity to switch to the more compact, familiar, and accessible variant by W3C, which is now cited.

3. Minor corrections to the EBNF specification of allowed number formats.

Published as SciPost Phys. Core 6, 007 (2023)


Reports on this Submission

Report #2 by Anonymous (Referee 1) on 2022-10-16 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2203.08230v4, delivered 2022-10-16, doi: 10.21468/SciPost.Report.5909

Strengths

1) It covers an important software application domain
2) It is clear

Weaknesses

1) It is missing a section, or at least a statement, about possible ambiguities

Report

I think that the authors' response missed my previous comments, and my concerns remain valid.
Reweighting techniques have been known and well established for many decades. They rely on mathematical
principles closely related to those of changes of variables in multidimensional integration.
To prevent biases and faulty applications, some rules need to be followed:
the modified distribution needs to be known and explicit. Usually this is not the case, and the approximations
and resulting ambiguities require evaluation. Who should take responsibility: Monte Carlo authors, end users,
or somebody else?

The existence of the ambiguity question should at least be stated in the submission. A clear and easy-to-spot warning
that somebody needs to take responsibility and do the hard work is needed. Good candidates would be
Monte Carlo authors, who best know the necessary details of the distributions to be modified by weights.
In general, the evaluation will depend on the Monte Carlo and the approximations used in it. The difficult evaluation
of ambiguities can be delegated to users. This will substantially reduce the size of the user community.

Let me now go back to my previous concerns and the authors' response.

-- I think that in its previous form, the submission is not a physics paper but a technical note addressed
-- to MC authors, maybe only of the MCnet community. I have not much to add to the previous referee's opinion.
-- I agree with it.

>> This is undoubtedly not just of interest to the MCnet community: it is relevant to the myriad of
>> experiment and phenomenology users who need to form MC-systematic uncertainties from their
>> multi-weighted event samples. The mess resulting from the absence of a standard has already led
>> to proliferations of unscalable machinery in experiments to try and hide such details from physics
>> users. This change to make the weights predictable and meaningful is relevant to all users of MC
>> events as well as their producers.

I agree with the statement ``it is relevant to the myriad of experiment and phenomenology users'',
but to be useful, the physics and approximation details of the weight definitions must be given in the submission,
or it must be pointed out where they can be found, or at least that the user must look for this information.

-- On the other hand, a standard for the definition of weights used in Monte Carlo programs would be beneficial
-- for the community. The technical side of such a definition is present, but the essential physics side is missing.
-- If it is not provided, the document will be, in my opinion, nearly useless for the community of Monte Carlo
-- users, and another failed attempt at standardisation will take place.

>> We disagree, but have added some further physics context to the introduction, to explain typical
>> applications of variation weights and the required combination rules. On the assertion that a standard
>> introduced without a physics component will be a failure, we refer to the Les Houches Event Format paper
>> (Comput.Phys.Commun. 176 (2007) 300-304, https://arxiv.org/abs/hep-ph/0609017, 510 citations), one of
>> the most successful community standards in high-energy physics, whose text contains no reference
>> whatsoever to physics applications! Out of the available SciPost journals, Physics Core seems the
>> closest match to Comp Phys Commun.

The usefulness of weights is not only about formats/standards/citations; it is about their physics (and mathematical)
meaning. That depends on the particular project. It cannot be just `examples'; it is a precondition for use.

-- If the authors can provide (and agree among themselves on) a section on ``limitation/(applicability
-- domain) for the weight use'', the paper can include the necessary physics content and its publication may then be
-- justified. My suggestions follow:

-- For each program it should be explained what the limitations in weight use are, that is, what its
-- applicability domain is. Alternatively, detailed references to manuals with page numbers etc. can be given.
-- This is not easy, as it touches on the approximations used in program design. In my opinion that is essential
-- for the readers. Finally, a template of explanations (what is required from the authors for the weight to
-- be used) would be beneficial for the documentation of future programs (and program versions) too.

>> We explicitly choose not to discuss issues of weight-applicability in this document: its purpose is not
>> to advocate for weighting methods or to claim relevances that they do not have. We are specifying a
>> convention for weight-stream organisation, to address issues in their already well-established domains
>> of validity. We have extended the introduction to make clear some basic technical essentials for use of
>> reweighting, such as the need for common support between the source and target distributions, but
>> a context-specific discussion on what physics can and cannot be effectively unweighted would be
>> boundless and off-topic.

I suspect there is a misunderstanding at this point. Weighting methods do not need explanation: they simply
reduce to a redefinition of the function which is to be integrated over a massively multidimensional set of
variables, and that is rather trivial from the point of view of mathematical
principles. My concern is that, to apply reweighting techniques, one needs to understand the physics
content of the original event sample and then of the weights.
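
As a minimal illustration of the point above (a toy sketch, not taken from the submission or from any generator), the following Python example reweights a one-dimensional sample drawn from a known nominal density to a varied target density. The function names, the exponential densities, and the rate value 1.5 are all assumptions chosen for illustration. The weight is well defined here only because the nominal density is explicitly known and is nonzero wherever the target density is nonzero.

```python
import math
import random


def nominal_pdf(x):
    """Toy nominal density (assumption): exponential with rate 1."""
    return math.exp(-x)


def target_pdf(x, rate):
    """Toy varied density (assumption): exponential with the given rate."""
    return rate * math.exp(-rate * x)


def variation_weight(x, rate):
    """Per-event weight: ratio of target to nominal density at x.

    This requires the nominal density to be known explicitly and to be
    nonzero wherever the target is nonzero (common support).
    """
    return target_pdf(x, rate) / nominal_pdf(x)


def weighted_mean(values, weights):
    """Self-normalised weighted estimate of the mean of `values`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Generate a single "nominal" sample with a fixed seed for reproducibility.
rng = random.Random(12345)
events = [rng.expovariate(1.0) for _ in range(200_000)]

# One variation stream: the same events, reweighted to rate 1.5.
weights = [variation_weight(x, 1.5) for x in events]

nominal_mean = sum(events) / len(events)      # estimates 1 / 1.0
varied_mean = weighted_mean(events, weights)  # estimates 1 / 1.5
print(nominal_mean, varied_mean)
```

The same event sample yields both predictions; only the weights differ. If the nominal density were known only approximately, the ratio in `variation_weight` would inherit that approximation, which is the kind of ambiguity at issue here.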

Side remark: Applications imprinting events into HepMC are plagued by stretching of the standard
(history entries hard process kinematic configurations different from the actual one, energy-momentum
non-conservation in vertices). In fact, such stretchings are understandable, needed, and contribute to HepMC's success,
but often confuse users. Your proposal will undoubtedly suffer from similar problems.

In my opinion, the introduction of a warning that the validity (ambiguity evaluation) of re-weighting requires
details of the assumptions/approximations used in event sample generation is the absolute minimum
before the submission can be accepted. It must be stated that the ambiguity depends on the Monte Carlo, the
initialization variant, and the user application.

One may say that the topic is out of the scope of the submission, but this nonetheless
needs to be stated, at least. I am not asking for much. One or two sentences in the summary and/or introduction
may be sufficient.

Requested changes

One may say that the topic is out of the scope of the submission, but this nonetheless
needs to be stated, at least. I am not asking for much. One or two sentences in the summary and/or introduction
may be sufficient.

  • validity: good
  • significance: ok
  • originality: ok
  • clarity: good
  • formatting: reasonable
  • grammar: reasonable

Report #1 by Anonymous (Referee 2) on 2022-10-11 (Invited Report)

Report

The revised version addresses all concerns in my previous report.

Section 2.3 is substantially improved. The present version makes clear when to use which prefixes.

Using the W3C reference for EBNF is fine. Also, the grammar has become more concise by employing character classes.

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -
