
SciPost Submission Page

Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion

by Alexander Shmakov, Kevin Greif, Michael James Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson

Submission summary

Authors (as registered SciPost users): Michael James Fenton · Kevin Greif
Submission information
Preprint Link: https://arxiv.org/abs/2404.14332v2
Code repository: https://github.com/Alexanders101/LVD
Data repository: https://zenodo.org/records/13364827
Date submitted: 2024-10-31 15:41
Submitted by: Greif, Kevin
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Experiment
  • High-Energy Physics - Phenomenology
Approaches: Experimental, Computational

Abstract

The measurements performed by particle physics experiments must account for the imperfect response of the detectors used to observe the interactions. One approach, unfolding, statistically adjusts the experimental data for detector effects. Recently, generative machine learning models have shown promise for performing unbinned unfolding in a high number of dimensions. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic top quark pair production at the Large Hadron Collider.
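A minimal conceptual sketch of the variable-length idea described above (illustrative PyTorch; all module and parameter names here are placeholders, not the implementation in the linked repository): a multiplicity head first picks how many particle-level objects to generate, and a conditional generator then produces exactly that many feature vectors.

```python
# Conceptual sketch only (hypothetical names, not code from the linked LVD
# repository): variable-length generative unfolding. A multiplicity head
# predicts how many particle-level objects to generate, and a conditional
# generator then produces exactly that many feature vectors.
import torch
import torch.nn as nn

class VariableLengthUnfolder(nn.Module):
    def __init__(self, cond_dim=64, max_particles=20, feat_dim=4):
        super().__init__()
        self.feat_dim = feat_dim
        # Distribution over possible particle-level multiplicities.
        self.multiplicity_head = nn.Linear(cond_dim, max_particles + 1)
        # Toy stand-in for the conditional generative model (the paper uses
        # a latent diffusion model in this role).
        self.generator = nn.Sequential(
            nn.Linear(cond_dim + feat_dim, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, detector_embedding):
        # Sample a multiplicity N per event from the predicted distribution...
        logits = self.multiplicity_head(detector_embedding)
        n = torch.distributions.Categorical(logits=logits).sample()
        events = []
        for b in range(detector_embedding.shape[0]):
            # ...then generate exactly N particle-level feature vectors.
            noise = torch.randn(int(n[b]), self.feat_dim)
            cond = detector_embedding[b].expand(int(n[b]), -1)
            events.append(self.generator(torch.cat([cond, noise], dim=-1)))
        return events  # list of (N_i, feat_dim) tensors, one per event
```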

Author indications on fulfilling journal expectations

  • Provide a novel and synergetic link between different research areas.
  • Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work.
  • Detail a groundbreaking theoretical/experimental/computational discovery.
  • Present a breakthrough on a previously-identified and long-standing research stumbling block.

Author comments upon resubmission

Dear editors and reviewers,

Apologies for the delay in resubmission. Our lead author was away on a summer internship and has only recently returned. We have already responded to the reviewers in the comments section below; those responses refer to version 2 of the paper, now on the arXiv. We hope the new version addresses all of the reviewers' concerns.

Sincerely,
Kevin for the team

List of changes

1. Section 4.2, 2nd paragraph: Addition of discussion of the ability to sample a single detector-level event multiple times.
2. Section 4.2, 7th paragraph: Discussion of the corner plots presented in Appendix E.
3. Section 6, 2nd paragraph: Remove sentence “This lack of prior dependence strongly motivates the use of VLD for unfolding”.
4. Section 6, 5th paragraph: Add statements on data and code availability.
5. Appendix B: Add definitions of the distance metrics used.
6. Appendix E: Add corner plots.

Current status:
Awaiting resubmission

Reports on this Submission

Report #2 by Anonymous (Referee 1) on 2024-12-17 (Invited Report)

Strengths

Making the VLD approach to unfolding flexible enough to accommodate unfolding problems with varying dimensionality is an excellent proposal. Moreover, the paper is well structured and clearly written.

Weaknesses

The description of the algorithm and some of the arguments sometimes lack clarity.

Report

Once the requested changes are addressed, I recommend the paper for publication.

Requested changes

1. Typos and similar
- "They have been been"
- "D_{DENOISE}, This"
- Eqs. 10/11: the tilde is off.
- Fig. 3 should include $\bar{t}$, $\bar{b}$, etc.
- Fig. 5: Some labels (a, b, c) are only half visible.
- Fig. 13c seems to be missing the SM truth line?
- Empty page 33.


2. You state "Because generative approaches require only synthetic data during training, they do not suffer from this limitation." This is only half correct. Indeed the initial unfolding algorithm can be trained on unlimited statistics of the MC. Once you observe a prior dependence are you have to iterate, the limited amount of available data does become a limiting factor for the iteration. Maybe it is possible to argue that the unfolding is still less affected though.

3. Fig. 1: This figure is very hard to follow. While it becomes clearer when reading the paper, a few design choices could be altered to ensure better readability.
1. Fix a main direction that conveys the main task you want to illustrate.
2. Avoid edges when possible, even if the figure becomes bigger.
3. The execution of the unfolding should lead to an arrow from $(x_0, \dots, x_N)$ to the particle decoder, which I would consider the main direction. However, that direction is not indicated.

4. Related to Fig. 1 and Fig. 2, several choices are not clear:
1. The role of $y_0$ as a learnable parameter. It seems to be independent of any input, so is it a learnable constant? Why is this not covered by the Multiplicity Predictor itself? (A sketch of this and of point 4 follows this list.)
2. Is the sampling of the latent space in the VAE considered part of the encoder or the decoder, and at which point does the output of the Denoising Network enter?
3. Is the transformer encoder in Fig. 2 an encoder or the denoising network in Fig. 1?
4. The relative factor between the losses of the diffusion process and the VAE is chosen to be one. Have you considered different relative factors? Could this improve the reconstruction?
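To make points 1 and 4 above concrete, a minimal sketch (hypothetical PyTorch, not taken from the paper or its repository) of a learnable constant token $y_0$ and a tunable relative weight between the two losses:

```python
# Illustrative sketch of the two design choices asked about above
# (hypothetical code, not from the LVD repository). A "learnable constant"
# y_0 is simply an nn.Parameter with no input dependence, and the relative
# factor between the diffusion and VAE losses is a scalar hyperparameter
# that could in principle be tuned away from 1.
import torch
import torch.nn as nn

class InitialToken(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        # Trained by backpropagation, but identical for every event:
        # effectively a learnable constant.
        self.y0 = nn.Parameter(torch.zeros(1, dim))

    def forward(self, batch_size):
        return self.y0.expand(batch_size, -1)  # (batch_size, dim)

def total_loss(loss_diffusion, loss_vae, weight=1.0):
    # weight = 1 reproduces the paper's choice; scanning it would trade
    # denoising quality against reconstruction quality.
    return weight * loss_diffusion + loss_vae
```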

5. Data representation:
1. You state that “Both the mass and energy are included [for the particle-level information] in the representation to improve robustness to numerical issues.” Assuming that in the end these are not exactly compatible and $p_x^2 + p_y^2 + p_z^2 + m^2$ won’t equal $E^2$, which observable do you choose to present your results? (See the numerical illustration after this list.)
2. At a later point you state that no lepton mass is learned at all. How is this possible given your general choice of particle-level observables?
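One plausible reading of the robustness remark in point 1 (an assumption, not the authors' stated reasoning): at single precision, the precision networks typically train at, recovering a light particle's mass from $E$ and the momentum alone suffers catastrophic cancellation, which would motivate carrying both $E$ and $m$ in the representation.

```python
# Numerical illustration (assumed motivation, not quoted from the paper):
# in float32, the mass of a highly boosted light particle cannot be
# recovered from E and the momentum via m^2 = E^2 - |p|^2, because the
# subtraction cancels almost all significant digits.
import numpy as np

m_true = np.float32(0.000511)   # electron mass in GeV
pz = np.float32(500.0)          # a 500 GeV electron along the beam axis
E = np.sqrt(pz**2 + m_true**2)  # still float32: rounds to exactly 500.0

m2_recovered = E**2 - pz**2     # catastrophic cancellation
print(m2_recovered)             # 0.0, instead of ~2.6e-7 GeV^2
```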


6. Uncertainties and sampling:
You state “However in this application, the uncertainties obtained from sampling the model are strictly larger than the statistical uncertainties in the distributions.” Can you explain why this is the case? Have you checked the calibration of the distributions obtained from sampling multiple times for the same event? Do the migration matrices between reco and unfolded particle-level observables reproduce the truth?
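For reference, the two diagnostics suggested here could be checked along the following lines (hypothetical NumPy sketch; array names and shapes are assumptions, not from the paper):

```python
# Hypothetical sketch of the two suggested diagnostics: a per-event
# calibration check over repeated samples, and a migration matrix between
# reco-level and unfolded particle-level values of one observable.
import numpy as np

def coverage(truth, samples, q=0.68):
    # truth: (n_events,); samples: (n_events, n_samples) obtained by
    # re-sampling the model on the same detector-level events. If the
    # per-event posteriors are calibrated, roughly a fraction q of the
    # truths falls inside the central q-interval of their event's samples.
    lo = np.quantile(samples, (1 - q) / 2, axis=1)
    hi = np.quantile(samples, (1 + q) / 2, axis=1)
    return np.mean((truth >= lo) & (truth <= hi))

def migration_matrix(reco, unfolded, bins=20):
    # Row-normalized migration from reco bins to unfolded bins; compare
    # against the same matrix built with the true particle-level values.
    counts, _, _ = np.histogram2d(reco, unfolded, bins=bins)
    return counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)
```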

7. Deviations:
You state “It is then unsurprising that the network tends to return the mean value of $\eta_\nu$ in events that are particularly difficult to unfold.” While the argument seems logical, it is surprising when looking at the plot: the unfolded distribution overshoots at 0, yet one can observe a deficit directly next to it. Following the argument, the events that are particularly difficult to unfold would be right in the middle of the bulk. Can you expand on the argument?

Recommendation

Ask for minor revision

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -

Report #1 by Anonymous (Referee 2) on 2024-12-03 (Invited Report)

Report

The reply of the authors has addressed all comments I made before. I recommend the improved manuscript for publication in SciPost Physics.

Requested changes

There is just one minor thing I noticed that the authors might want to adjust: towards the end of Section 4.1, the variable $M$ does not refer to the previously used multiplicity at detector level, but rather to the mass. Maybe they could use $m$ instead?

Recommendation

Publish (easily meets expectations and criteria for this Journal; among top 50%)

  • validity: high
  • significance: top
  • originality: top
  • clarity: top
  • formatting: perfect
  • grammar: perfect
