SciPost Submission Page

Fast, accurate, and precise detector simulation with vision transformers

by Luigi Favaro, Andrea Giammanco, Claudius Krause

This is not the latest submitted version.

Submission summary

Authors (as registered SciPost users): Luigi Favaro · Claudius Krause
Submission information
Preprint Link: https://arxiv.org/abs/2509.25169v1
Code repository: https://github.com/luigifvr/vit4hep
Data repository: https://calochallenge.github.io/homepage/
Date submitted: Sept. 30, 2025, 10:15 a.m.
Submitted by: Luigi Favaro
Submitted to: SciPost Physics Proceedings
Proceedings issue: The 2nd European AI for Fundamental Physics Conference (EuCAIFCon2025)
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approach: Computational

Abstract

The speed and fidelity of detector simulations in particle physics pose compelling questions about LHC analysis and future colliders. The sparse high-dimensional data, combined with the required precision, provide a challenging task for modern generative networks. We present a comparison between solutions with different trade-offs, including accurate Conditional Flow Matching and faster coupling-based Normalising Flows. Vision Transformers allow us to emulate the energy deposition from detailed Geant4 simulations. We evaluate the networks using high-level observables, neural network classifiers, and sampling timings, showing minimal deviations from Geant4 while achieving faster generation. We use the CaloChallenge benchmark datasets for reproducibility and further development.

Current status:
Has been resubmitted

Reports on this Submission

Report #1 by Anonymous (Referee 1) on 2025-11-5 (Invited Report)

Disclosure of Generative AI use

The referee discloses that the following generative AI tools have been used in the preparation of this report:

I used ChatGPT-4 on 4/11/2025 to pass the review text through a grammar check.

Report

The authors explore generative models for calorimeter response simulations using a vision transformer-based architecture. The generation process of a single shower is factorized into two parts: one network (with two architectures explored: normalizing flow and conditional flow matching) generates the total energy deposition in each calorimeter layer, while another network (a 3D Vision Transformer) generates the normalized voxel energy distribution. The architectural details are described in the referenced articles. Both setups are evaluated using the CaloChallenge datasets. The performance is analyzed in terms of the generation time for a single calorimeter shower and the AUC scores from a classifier trained to distinguish GEANT4 from generated showers.
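
To make the factorization described above concrete, the following is a minimal sketch of how the two outputs could be recombined into a full shower. It is not taken from the authors' code: the function name `combine_shower`, the per-layer normalization convention, and the random stand-ins for the two networks are all assumptions made for illustration.

```python
import random


def combine_shower(layer_energies, voxel_patterns):
    """Scale each layer's voxel pattern so that it sums to the layer energy.

    layer_energies: one total energy per calorimeter layer.
    voxel_patterns: one list of non-negative voxel weights per layer.
    Assumption: normalization is done layer by layer.
    """
    shower = []
    for energy, pattern in zip(layer_energies, voxel_patterns):
        total = sum(pattern)
        if total > 0:
            # Normalize the pattern to unit sum, then rescale by the layer energy.
            shower.append([energy * v / total for v in pattern])
        else:
            # An empty pattern carries no shape information; keep the layer at zero.
            shower.append([0.0 for _ in pattern])
    return shower


# Hypothetical stand-ins for the two networks: random draws, not trained models.
random.seed(0)
layer_energies = [random.uniform(0.0, 10.0) for _ in range(5)]
voxel_patterns = [[random.random() for _ in range(16)] for _ in range(5)]

shower = combine_shower(layer_energies, voxel_patterns)
```

By construction, each layer of the combined shower sums to the energy produced by the first stage, so the second stage only has to model the shape of the deposition within a layer.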

The article is well written and logically structured, presenting a comprehensive and technically sound study. Given that the primary motivation for exploring generative models in this context is to develop faster alternatives to GEANT4 Monte Carlo simulations, it might be beneficial to include a reference to the typical time required by GEANT4 to simulate a single calorimeter shower, even if it is generally executed on CPUs rather than GPUs as in the case of the generative approaches.

Recommendation

Publish (easily meets expectations and criteria for this Journal; among top 50%)

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -