SciPost Submission Page

Data-parallel leading-order event generation in MadGraph5_aMC@NLO

by Stephan Hageböck, Daniele Massaro, Olivier Mattelaer, Stefan Roiser, Andrea Valassi, Zenny Wettersten

Submission summary

Authors (as registered SciPost users): Zenny Wettersten
Submission information
Preprint Link: https://arxiv.org/abs/2507.21039v2
Code repository: https://github.com/madgraph5/madgraph4gpu
Date submitted: Aug. 4, 2025, 2:28 p.m.
Submitted by: Wettersten, Zenny
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Computational, Phenomenological

Abstract

The CUDACPP plugin for MadGraph5_aMC@NLO aims to accelerate leading-order tree-level event generation by providing the MadEvent event generator with data-parallel helicity amplitudes. These amplitudes are written in templated C++ and CUDA, allowing them to be compiled for CPUs supporting SSE4, AVX2, and AVX-512 instruction sets as well as CUDA- and HIP-enabled GPUs. Using SIMD instruction sets, CUDACPP-generated amplitude routines are shown to speed up linearly with SIMD register size, and GPU offloading is shown to provide acceleration beyond that of SIMD instructions. Additionally, the resulting speed-up in event generation perfectly aligns with predictions from measured runtime fractions spent in amplitude routines, and proper GPU utilisation can speed up high-multiplicity QCD processes by an order of magnitude when compared to optimal CPU usage in server-grade CPUs.

Author indications on fulfilling journal expectations

  • Provide a novel and synergetic link between different research areas.
  • Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
  • Detail a groundbreaking theoretical/experimental/computational discovery
  • Present a breakthrough on a previously-identified and long-standing research stumbling block
Current status:
In refereeing

Reports on this Submission

Report #1 by Anonymous (Referee 1) on 2025-9-26 (Invited Report)

Strengths

1 - Detailed account of developments of the MadGraph code suite for new CPU and GPU architectures
2 - Relevant benchmarks of the event generator in realistic use cases
3 - Reproducible runtimes and scaling tests

Weaknesses

1 - Some important references to developments outside the MadGraph collaboration are missing
2 - Some data only contained in figures should be made publicly available
3 - The GPU development targets only NVidia hardware

Report

This manuscript provides a detailed account of recent developments in the MadGraph code suite that target vectorized CPUs as well as NVidia GPUs. MadGraph is one of the leading event generators for collider physics and related fields, and as such an important tool for the high-energy physics community as a whole. The developments described in this manuscript are needed to allow the code to run on both CPUs and GPUs, providing both accelerated event generation and a broader spectrum of host systems for simulations at the LHC and beyond.

Requested changes

The manuscript is well written and most of the data are presented in a reproducible fashion. Only a few minor changes are needed:

1 - In addition to Refs.[13,14], I would ask the authors to cite similar developments from collaborations other than their own, in particular arXiv:1905.05120, arXiv:2107.06625, arXiv:2109.11964, arXiv:2112.09588, arXiv:2203.07460, arXiv:2209.00843, arXiv:2302.10449, arXiv:2309.13154, arXiv:2506.06203, arXiv:2505.13608.
2 - In addition to Refs.[10-12], arXiv:2110.15211 should be cited.
3 - Page 3, 3rd paragraph: "[..] never reached production" should read either "[..] never reached production quality" or "[..] was never used in production", depending on which scenario the sentence is supposed to describe.
4 - Page 4, 1st paragraph: "gridpacks" is technical jargon and should not be used in the introduction without an explanation.
5 - Page 11, Sec. 4, 2nd paragraph: It would be helpful if the authors could discuss or at least mention the possible effects of roundoff error on higher-order calculations, and whether their FP32 code base could still be of use as a component of the MadGraph NLO event generation framework.
6 - Page 12, 4th paragraph: It would be helpful if the authors would briefly explain why large gauge cancellations arise in the VBF process.
7 - Page 14, 1st paragraph: It is not clear what the statement on the washing out of roundoff errors means. I would argue that the roundoff error should always be smaller than the statistical precision of the event sample, and in many cases much smaller. This statement is entirely independent of the parametric precision of the calculation. Take for example the production of Z+b at the LHC. Even though the theory precision is no better than 10%, a 10% roundoff error on the mass of the b-quark in the final state would be detrimental, as it would change dead-cone effects and the spectrum of the B hadrons, which can be resolved through vertexing.
8 - Figs.11 and 22 should be made available in their original format, i.e. the searchable and clickable flame graph, which allows the entire call stack to be investigated.

Recommendation

Ask for minor revision

  • validity: high
  • significance: high
  • originality: good
  • clarity: high
  • formatting: excellent
  • grammar: excellent
