SciPost Submission Page
Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion
by Alexander Shmakov, Kevin Greif, Michael James Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson
Submission summary
Authors (as registered SciPost users): Michael James Fenton · Kevin Greif

Submission information
Preprint Link: https://arxiv.org/abs/2404.14332v2 (pdf)
Code repository: https://github.com/Alexanders101/LVD
Data repository: https://zenodo.org/records/13364827
Date submitted: 2024-10-31 15:41
Submitted by: Greif, Kevin
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
Approaches: Experimental, Computational
Abstract
The measurements performed by particle physics experiments must account for the imperfect response of the detectors used to observe the interactions. One approach, unfolding, statistically adjusts the experimental data for detector effects. Recently, generative machine learning models have shown promise for performing unbinned unfolding in a high number of dimensions. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable-dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic top quark pair production at the Large Hadron Collider.
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Author comments upon resubmission
Apologies for the delay in re-submission. Our lead author was away on a summer internship and has only recently returned. We have already responded to the reviewers in the comments section below. These comments refer to version 2 of the paper, now on the arXiv. We hope the new version addresses all of the reviewers' concerns.
Sincerely,
Kevin for the team
List of changes
1. Section 4.2, 2nd paragraph: Add discussion of the ability to sample a single detector-level event multiple times.
2. Section 4.2, 7th paragraph: Add discussion of the corner plots presented in Appendix E.
3. Section 6, 2nd paragraph: Remove the sentence “This lack of prior dependence strongly motivates the use of VLD for unfolding”.
4. Section 6, 5th paragraph: Add statements on data and code availability.
5. Appendix B: Add definitions of the distance metrics used.
6. Appendix E: Add corner plots.
Reports on this Submission
Strengths
Making the VLD approach to unfolding flexible enough to accommodate unfolding problems with varying dimensionality is an excellent proposal.
Moreover, the paper is well structured and clearly written.
Weaknesses
The description of the algorithm, as well as some of the arguments, sometimes lacks clarity.
Report
Once the requested changes are addressed, I recommend the paper for publication.
Requested changes
1. Typos and similar:
- "They have been been"
- "D_{DENOISE}, This"
- Eqs. 10/11: the tilde is off.
- Fig. 3 should include $\bar{t}$, $\bar{b}$, etc.
- Fig. 5: Some labels (a, b, c) are only half visible.
- Fig. 13c seems to be missing the SM truth line?
- Empty page 33.
2. You state "Because generative approaches require only synthetic data during training, they do not suffer from this limitation." This is only half correct. Indeed, the initial unfolding algorithm can be trained on unlimited MC statistics. But once you observe a prior dependence and have to iterate, the limited amount of available data does become a limiting factor for the iteration. Maybe it is possible to argue that the unfolding is still less affected, though. A typical iteration scheme is sketched below.
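For concreteness, one common iteration scheme (a sketch for illustration only; the details may differ from whatever the authors would implement) reweights the simulation prior by the ratio of the current unfolded estimate to that prior,
$$w^{(i)}(x) = \frac{\hat{p}^{\,(i)}_{\text{unfolded}}(x)}{p_{\text{MC}}(x)}, \qquad p^{(i+1)}_{\text{prior}}(x) = w^{(i)}(x)\, p_{\text{MC}}(x),$$
where $\hat{p}^{\,(i)}_{\text{unfolded}}$ must be estimated from the finite measured dataset, so its statistical fluctuations propagate into every subsequent iteration.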
3. Fig. 1: This figure is very hard to follow. While it becomes clearer when reading the paper, a few design choices could be altered to ensure better readability.
1. Fix a main direction that conveys the main task you want to illustrate.
2. Avoid edges when possible, even if the figure becomes bigger.
3. The execution of the unfolding should lead to an arrow from $(x_0, \dots, x_N)$ to the particle decoder, which I would consider the main direction. However, that direction is not indicated.
4. Related to Fig. 1 and Fig. 2 several choices are not clear:
1. The role of $y_0$ as a learnable parameter: it seems to be independent of any input. So is it a learnable constant? Why is this not covered by the Multiplicity Predictor itself?
2. Is the sampling of the latent space in the VAE considered part of the encoder or of the decoder, and at which point does the output of the Denoising Network enter?
3. Is the transformer encoder in Fig. 2 an encoder or the denoising network in Fig. 1?
4. The relative factor between the losses of the diffusion process and the VAE is chosen to be one. Have you considered different relative factors, and could this improve the reconstruction? (See the sketch below.)
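Schematically, the total objective can be written with an explicit weighting factor (the symbol $\lambda$ is introduced here purely for illustration; in the paper it is effectively fixed to one):
$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{VAE}} + \lambda\,\mathcal{L}_{\text{diffusion}}, \qquad \lambda = 1,$$
and the question is whether scanning $\lambda$ away from unity could improve the reconstruction.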
5. Data representation:
1. You state that “Both the mass and energy are included [for the particle level information] in the representation to improve robustness to numerical issues.” Assuming that in the end these are not exactly compatible, so that $p_x^2 + p_y^2 + p_z^2 + m^2$ does not equal $E^2$, which observable do you choose when presenting your results?
2. At a later point you state that no lepton mass is learned at all. How is this possible, given your general choice of particle-level observables?
6. Uncertainties and sampling:
You state “However in this application, the uncertainties obtained from sampling the model are strictly larger than the statistical uncertainties in the distributions.” Can you explain why this is the case? Have you checked the calibration of the distributions obtained from sampling multiple times for the same event? Do the migration matrices between reco-level and unfolded particle-level observables reproduce the truth?
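For reference, the migration matrix in question follows the standard definition (not specific to this paper): for a given observable binned at both levels,
$$M_{ij} = P(\text{detector-level bin } i \mid \text{particle-level bin } j),$$
and the proposed check is whether $M$ computed with the unfolded particle-level values agrees with $M$ computed with the simulated truth.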
7. Deviations
You state “It is then unsurprising that the network tends to return the mean value of $\eta_\nu$ in events that are particularly difficult to unfold.” While the argument seems logical, it is surprising when looking at the plot: the unfolded distribution overshoots at 0, yet one can observe a deficit directly next to it. Following the argument, the events that are particularly difficult to unfold would be right in the middle of the bulk. Can you expand on the argument?
Recommendation
Ask for minor revision
Report
The authors' reply addresses all of the comments I made previously. I recommend the improved manuscript for publication in SciPost Physics.
Requested changes
There is just one minor thing I noticed that the authors might want to adjust: towards the end of Section 4.1, the variable $M$ does not refer to the previously used detector-level multiplicity, but rather to the mass. Maybe they could use $m$ instead?
Recommendation
Publish (easily meets expectations and criteria for this Journal; among top 50%)