SciPost Submission Page
Accurate Surrogate Amplitudes with Calibrated Uncertainties
by Henning Bahl, Nina Elmer, Luigi Favaro, Manuel Haußmann, Tilman Plehn, Ramon Winterhalder
Submission summary
| Submission information | |
|---|---|
| Authors (as registered SciPost users): | Henning Bahl · Nina Elmer · Luigi Favaro · Tilman Plehn · Ramon Winterhalder |
| Preprint Link: | scipost_202412_00037v2 (pdf) |
| Code repository: | https://github.com/heidelberg-hepml/arlo |
| Date accepted: | Sept. 25, 2025 |
| Date submitted: | Aug. 12, 2025, 4:24 p.m. |
| Submitted by: | Luigi Favaro |
| Submitted to: | SciPost Physics |
| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approaches: | Theoretical, Computational |
Abstract
Neural networks for LHC physics have to be accurate, reliable, and controlled. Using neural surrogates for the prediction of loop amplitudes as a use case, we first show how activation functions are systematically tested with Kolmogorov-Arnold Networks. Then, we train neural surrogates to simultaneously predict the target amplitude and an uncertainty for the prediction. We disentangle systematic uncertainties, learned by a well-defined likelihood loss, from statistical uncertainties, which require the introduction of Bayesian neural networks or repulsive ensembles. We test the coverage of the learned uncertainties using pull distributions to quantify the calibration of cutting-edge neural surrogates.
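To make the construction concrete, the following minimal sketch (our own illustration in PyTorch, not taken from the paper or the arlo repository; tensor names and shapes are assumptions) shows a heteroscedastic Gaussian likelihood loss in which the network predicts the amplitude together with a log-variance, and the pull used to test calibration:

```python
import torch

def heteroscedastic_nll(pred_amp, pred_logvar, true_amp):
    """Gaussian negative log-likelihood with a learned, per-point variance."""
    inv_var = torch.exp(-pred_logvar)
    return 0.5 * ((pred_amp - true_amp) ** 2 * inv_var + pred_logvar).mean()

def pull(pred_amp, pred_sigma, true_amp):
    """Pull; it should follow a standard normal if the uncertainties are calibrated."""
    return (pred_amp - true_amp) / pred_sigma
```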
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Author comments upon resubmission
We thank the referees for their time, careful consideration, and positive evaluation of our manuscript. We list below the changes we have made in response to their helpful suggestions.
General remarks:
We conclude the KAN section by showing that the learned activation functions provide useful guidance for choosing the best fixed activation function in standard neural networks. Fig. 4 shows that the GELU activation, which is approximately reproduced by the learned GroupKAN activation, is preferred over the standard ReLU.
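As an illustration of this selection criterion (a sketch under our own assumptions, not the paper's GroupKAN code; `learned_act` is a hypothetical handle to the trained activation, evaluated pointwise), one can compare the learned activation to the fixed candidates on a grid and pick the closest one:

```python
import torch
import torch.nn.functional as F

def closest_fixed_activation(learned_act, x=torch.linspace(-3.0, 3.0, 200)):
    """Return the fixed activation closest (in MSE on a grid) to a learned one."""
    fixed = {"ReLU": F.relu, "GELU": F.gelu, "leakyReLU": F.leaky_relu}
    with torch.no_grad():
        y = learned_act(x)
        return min(fixed, key=lambda name: torch.mean((fixed[name](x) - y) ** 2).item())
```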
We share our implementation of KANs and Bayesian networks with the community. Our code is available at https://github.com/heidelberg-hepml/arlo .
We hope that with these changes, our article can now be accepted for publication in its present form.
Sincerely,
H. Bahl, N. Elmer, L. Favaro, M. Haußmann, T. Plehn, R. Winterhalder
List of changes
Referee 1:
We state the relation to uncertainties as defined in the ML literature at the end of Section 2. As the ML definition is often vague and has to be characterised on a case-by-case basis, we prefer an operational definition based on the behaviour in the limit of large training statistics. We also prefer to keep the derivations of Sections 2 and 4 in the main body of the paper. They allow the reader to follow the logical progression from the underlying Bayesian theory to the practical implementation of learned uncertainties, and they ensure that all assumptions and approximations are discussed, providing an instructive reference for future work building on this study.
Abstract: (General) I believe the abstract should describe what the goal of the paper is, what tools are used, and what the result is. Currently, the sentences are not logically connected unless the reader has already read the paper.
We made the abstract more logically coherent.
L3 KANs were not introduced.
When introducing the KANs in the introduction, we added additional motivation for studying them (a promise for better accuracy and scalability). This is mentioned again at the beginning of Sec. 4.
L5 comprehensive"ly" "precision surrogates" is unclear p2 Paragraph 3. "amplifying" "essentially interpretable" are jargon and unclear Paragraph 5. Where does the percent-level requirement come from? This is not true in general and highly application specific. Paragraph 6. "tested assumption" what does this mean?
We updated the introduction, taking into account these comments. In particular, we clarified the “amplification” argument. We agree that the percent-level requirement is application-specific and therefore removed the sentence. We rephrased the end of paragraph 6 to clarify our expectations for the statistical uncertainties.
p5 After Eq. 13 sentence not finished.
Fixed.
p7 "In reality" is too colloquial and does not add information p8 "Realistically, ..." same. p9 "define the underlying problem." A problem is set or solved, but not defined.
We removed several colloquial terms from the paper, including the ones above, and removed the unnecessary sentence in the scaled pull section.
Sec 4. The choice for KANs is not sufficiently explained, here or in the introduction.
When introducing the KANs, we added additional motivation for studying them (a promise for better accuracy and scalability). This is mentioned again at the beginning of Sec. 4.
p13, one but last paragraph. This is a typical example of my comments. It is not clear to the reader or explained (as far as I see) why a flatter activation function is desirable. This leaves the reader with the impression they are not on the same page as the author.
Flatter activation functions are not the focus of the study. In the final paragraphs of the KAN section, we discuss the general behaviour of the learned activation functions and observe that they can serve as a baseline for finding the best option among the fixed activations of a standard neural network. See the general remarks above.
Fig. 4. The imperfections at low f_smear are interesting. Please add general remarks on implications or limitations in general. Clearly, this is a hint of a ceiling for any such methods and of general interest.
The imperfections for small smearing applied to the data originate from the intrinsic uncertainties of the network architectures: no architecture is expressive enough to learn the data distribution exactly. This residual systematic effect can be seen as the deviation from the "noise only" curve. We added this discussion to Section 5.1.
p18 "Bayesianize" is entirely unclear
We agree with the referee and now specify that only a subset of the weights has a learnable variance, while the rest of the neural network is deterministic.
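A minimal sketch of what this means in practice (our own illustration, assuming a PyTorch setup and the reparameterization trick; layer sizes are arbitrary and not those of the paper):

```python
import torch
import torch.nn as nn

class BayesianLinear(nn.Module):
    """Linear layer whose weights carry a learnable mean and variance."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(n_out, n_in))
        self.logvar = nn.Parameter(torch.full((n_out, n_in), -6.0))
        self.bias = nn.Parameter(torch.zeros(n_out))

    def forward(self, x):
        # Sample the weights at every forward pass (reparameterization trick)
        weight = self.mu + torch.exp(0.5 * self.logvar) * torch.randn_like(self.mu)
        return nn.functional.linear(x, weight, self.bias)

# Only the last layer is Bayesian; the rest of the network stays deterministic.
model = nn.Sequential(
    nn.Linear(20, 128), nn.GELU(),
    nn.Linear(128, 128), nn.GELU(),
    BayesianLinear(128, 1),
)
```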
We revised the manuscript following the referee’s suggestion and made these additional changes:
- At the beginning of Section 2, we clarify that the space $x$ is the space of four-momenta of the scattering particles.
- In the same section, we state that the derivations of the various loss functions generalise beyond amplitude regression.
- On p. 4, we clarify that choosing a Gaussian ansatz for the variational posterior does not imply that the true posterior is Gaussian.
- We removed the derivation of Eq.8.
- In Section 2.3, we clarified the role of $t$ as the iteration during training, and we discuss in more detail the pitfalls of a simple ensemble of networks.
- In the “Function-space density” section, we rephrased the motivation for introducing a function space repulsion term over the weight-space one.
Referee 2:
We believe that multi-layer perceptron networks and activation functions are well known and have been widely discussed in the field. An introduction is therefore unnecessary and would dilute the novel methodologies presented in the manuscript. We reformulated the introduction of Section 3 (and the text on KANs in the introduction), making it logically more connected to the rest of the text.
Minor comments:
Page 2, second paragraph: one reference appears as "?".
Fixed.
Section 3: The full name for the abbreviations MLP and GATr can be given as well.
The full names are now given.
Section 4: ReLU, GELU, leakyReLU: these are standard functions for machine learning, but for readers from particle physics the full name will be helpful. The authors might also consider providing the definitions in an appendix.
We now specify that ReLU, GELU, and leakyReLU belong to the rectified linear unit family of fixed activation functions.
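For reference, the standard definitions of these activations are (a textbook note added here, not a quote from the manuscript; $\alpha$ is the small leak parameter and $\Phi$ the standard normal CDF):

```latex
\mathrm{ReLU}(x) = \max(0, x), \qquad
\mathrm{leakyReLU}(x) =
\begin{cases} x & x \ge 0 \\ \alpha x & x < 0 \end{cases}, \qquad
\mathrm{GELU}(x) = x\,\Phi(x).
```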
Published as SciPost Phys. Core 8, 073 (2025)
Reports on this Submission
Report
My previous report contained suggestions on how to improve the manuscript for a wider audience. The authors didn't really pick up on that.
In the revised version, the authors try to keep the modifications to a minimum. The scientific content is worth a publication, but since the authors give the impression that they are only interested in addressing readers in their particular field of research, SciPost Physics Core may be the better place for this manuscript.
I leave this decision to the editor.
Recommendation
Accept in alternative Journal (see Report)
