SciPost Submission Page
Towards Universal Unfolding of Detector Effects in High-Energy Physics using Denoising Diffusion Probabilistic Models
by Camila Pazos, Shuchin Aeron, Pierre-Hugues Beauchemin, Vincent Croft, Zhengyan Huan, Martin Klassen, Taritree Wongjirad
Submission summary
Authors (as registered SciPost users): Camila Pazos
Submission information
Preprint Link: scipost_202410_00060v1 (pdf)
Code repository: https://gitlab.cern.ch/cpazosbo/cddpm-unfolder
Data repository: https://zenodo.org/records/13993067
Date submitted: 2024-10-29 12:24
Submitted by: Pazos, Camila
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
Approaches: Experimental, Computational
Abstract
Correcting for detector effects in experimental data, particularly through unfolding, is critical for enabling precision measurements in high-energy physics. However, traditional unfolding methods face challenges in scalability, flexibility, and dependence on simulations. We introduce a novel approach to multidimensional object-wise unfolding using conditional Denoising Diffusion Probabilistic Models (cDDPM). Our method utilizes the cDDPM for a non-iterative, flexible posterior sampling approach, incorporating distribution moments as conditioning information, which exhibits a strong inductive bias that allows it to generalize to unseen physics processes without explicitly assuming the underlying distribution. Our results highlight the potential of this method as a step towards a "universal" unfolding tool that reduces dependence on truth-level assumptions, while enabling the unfolding of a wide range of measured distributions with improved adaptability and accuracy.
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Current status:
Reports on this Submission
Strengths
1. New method to solve the bias problem in unfolding
2. A variety of different processes is included
3. For the dataset used in this study, the results of the generalized unfolder look promising
Weaknesses
1. Dataset lacks exciting correlations
2. Not clear how one-dimensional metrics can quantify performance in a way meaningful to experimental analyses
3. Lack of comparison to ML methods that correct for prior dependence
4. Missing distinction between the authors' own contribution and existing work
Report
An existing challenge of classical and ML-based unfolding algorithms is their lack of generalisability. This paper demonstrates some interesting follow-up work to existing machine-learning-based unfolding algorithms. It provides a method to unfold individual QCD jets in a generalized and unbiased way without retraining.
There are many interesting physics processes that suffer from prior-dependent unfolding results, especially when looking at resonant phase spaces. The chosen dataset also shows diverse pT distributions for different physics processes; however, it lacks exciting correlations. Instead of choosing jets from interesting physical processes, the study was done on QCD jets, a seemingly easy case study.
The authors tried to quantify the improvement of the generalized unfolder using one-dimensional metrics; it is unclear how this translates to an improvement in downstream tasks.
The authors only compare against a dedicated generative unfolder without iterative corrections. However, there exist both generative and discriminative unfolding techniques that iteratively correct for prior dependence; these would provide a more meaningful benchmark.
Lastly, in its current form the paper does not clearly distinguish the authors' own contribution from existing work.
Although I think the conceptual idea of the paper can be a good fit for this journal, I would recommend revision and resubmission.
Requested changes
1. Introduction:
1.1. The generative unfolding pipeline is already established in many previous works, interchangeably using various generative networks. Using a different generative network should therefore not by itself be the selling point of this work, and I would ask the authors to stress this when talking about “our contribution”. The crucial novelty of the approach is the inclusion of additional information, which produces unbiased results. Also, training-data augmentation has been shown to be beneficial in 2105.09923, which should therefore be cited.
1.2. There has been an iterative approach to generative unfolding in [11] to mitigate prior dependence. Although cited in Table 1, it is listed as a non-iterative method, which is incorrect.
1.3. Table 1 also omits the OmniFold contribution to generalisability (2105.09923).
1.4. Throughout the text there was some confusion between object-wise and event-wise unfolding. As far as I understood, the authors claim to unfold individual jets rather than entire events. Is this assumption correct? This generally differs from existing ML-based unfolding studies, which unfold event-wise. Yet in Table 1 the authors' method is listed as event-wise unfolding. Could this be clarified?
1.5. Although Fig. 1 is intended to motivate the success of the method, it awkwardly disrupts the flow, and the reader does not yet have enough information to interpret it. I would therefore recommend that the authors keep the results in Section 3.
2. Methods:
2.1. As the authors never actually use unconditional DDPMs, it may be sufficient to stick to conditional DDPMs.
2.2. Again, conditional DDPMs have been used both in and outside of HEP. I would therefore ask the authors to cite the respective papers, e.g. 2303.05376, 2307.06836, 2305.10475, etc.
2.3. Why restrict the study to QCD jets only? Please clarify.
2.4. The plot legends are rather small, and the ratio plots could be restricted to the range 0.9–1.1 to visualize deviations better.
3. Application to Particle Physics Data:
3.1. As far as I understand, the authors over-constrain the physical phase space, unfolding (E, px, py, pz, pT, phi, eta) for each jet plus the first six moments of the pT distribution. What happens to the additional degrees of freedom after unfolding? Is pT calculated from the unfolded px and py distributions or taken directly from the network output? Please clarify in the text.
3.2. The first six moments of the pT distribution are unfolded as well and discarded after unfolding. Out of curiosity, do they match the truth moments?
3.3. In Figure 8a, it seems that the uncertainty of the dedicated unfolder is within the allowed 3–5% uncertainty region and in a similar region as that of the generalized unfolder. Do you only observe differences in energy and pT?
3.4. Figure 8b shows an event-wise correlation between two jets. Although the authors' method is based on unfolding individual jets, the unseen correlation is unfolded to high precision. I would guess this means that the detector effects of the dijet system factorize onto the two individual jets? Is this true for every correlation? And would this be different for other physical objects?
3.5. I would also suggest that the authors define the hadronic recoil in the text.
3.6. What detector simulation produced the unknown sample of Figure 1a? Please clarify.
B.3 Detector Smearing and Jet Matching
B.3.1. Which settings were used to cluster the jets? Maybe clarify in the text.
Recommendation
Ask for major revision
Strengths
1. Novel idea for improving generative unfolding
2. Well-chosen test cases to present the new method
3. Well and clearly written text with excellent grammar
Weaknesses
1. Too much focus on the already-explored concept of generative unfolding, distracting from the real innovation that is presented
2. Hard-to-read and questionably placed figures
Report
The authors of this work present a novel addition to the field of generative, machine learning-based unfolding. The new method conditions the generative unfolding model on the moments of some of the to-be-unfolded observables in addition to the observables themselves. This effectively gives the unfolding model access to dataset-level information, in addition to the standard event-level information. The authors further demonstrate that this added information results in a more generalizable unfolding model, which can be applied to a range of different physics processes.
I view this as a new step in the use of machine learning in particle physics, and in my view, it represents a significant step in building more generalized unfolding models. As such, I believe the concept and results presented in this work are a good fit for this journal. However, I also feel that the presentation of both method and results can be notably improved, which is why I would recommend minor revisions before publication. A list of recommended changes can be found below.
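To make the dataset-level conditioning discussed above concrete, the idea can be sketched as follows. This is a minimal numpy illustration, not the authors' code: the function names, the feature layout, and the use of raw (non-central) moments are assumptions for the sake of the example.

```python
import numpy as np

def moment_features(pt, n_moments=6):
    """First n_moments raw moments of a pT sample.

    Illustrative assumption: the paper conditions on distribution
    moments, but the exact moment definition/normalization used
    there may differ from this sketch.
    """
    return np.array([np.mean(pt ** k) for k in range(1, n_moments + 1)])

def condition_inputs(detector_jets, pt_index=0, n_moments=6):
    """Append the same dataset-level moment vector to every per-jet
    feature vector, turning event-level inputs into inputs that also
    carry distribution-level information."""
    moments = moment_features(detector_jets[:, pt_index], n_moments)
    tiled = np.tile(moments, (detector_jets.shape[0], 1))
    return np.concatenate([detector_jets, tiled], axis=1)

# Toy usage: 1000 jets with 4 features each -> 4 + 6 = 10 inputs per jet.
jets = np.random.default_rng(0).normal(50.0, 10.0, size=(1000, 4))
conditioned = condition_inputs(jets)
assert conditioned.shape == (1000, 10)
```

Because every jet in a dataset receives the same moment vector, the generative model can in principle distinguish datasets drawn from different priors, which is the mechanism behind the claimed generalizability.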
Requested changes
For the sake of readability, I will try to break this list down into major and minor requested changes.
Major:
1. L123 Section 2 "Methods". This section describes both the concept of including moment-conditioning in the unfolding, and the DDPM/cDDPM. However, it briefly mentions the moment-conditioning and then only goes into more depth about the conditioning after a longer derivation of the cDDPM. In my view, moment-conditioning is a smart and novel approach to generalizing generative unfolding methods, while the cDDPM is notably less new and exciting. Moreover, I fail to see why the moment-conditioning is presented as so strongly tied to the cDDPM, when it should be applicable to any generative unfolding model. Therefore I would request a reordering of section 2, placing greater and earlier emphasis on the moment-conditioning (moving some of the details from 2.3 into 2.1) and adding a paragraph or two on the transferability of the moment-conditioning concept to other generative approaches.
2. L144 Section "2.2 Denoising Diffusion Probabilistic Models" does a commendable job of presenting the cDDPM; however, I fail to see how the cDDPM fundamentally differs from other diffusion-based conditional unfolding methods, such as the ones mentioned in the introduction. I would request an added discussion of the differences between the cDDPM and other diffusion-based unfolders at the end of Section 2.2.
3. L275: "Part 2: Generalized Unfolder" This subsection describes the process of training the generalized cDDPM in quite a significant amount of detail, but never seems to specify how the conditional moments are derived during evaluation on a real dataset. The context indicates that one would calculate the moments over the whole evaluation set, but the note that moments are calculated on a process-by-process basis is at odds with this, as this would not be possible on a real dataset. An additional segment clarifying how the evaluation is performed in detail is needed here.
4. L365: "The dedicated unfolder is trained using data pairs (x, y), excluding in the distributional moments. In contrast, the generalized unfolder is trained on multiple simulated physics processes" and L375: "the generalized unfolder learns to model multiple posteriors from the diverse physics processes in its training data, whereas the dedicated unfolder captures only a single posterior represented by its specific training set." It appears to me that there are two main differences between the two approaches. On the one hand, one model is conditioned on the moments, while the other is not, and on the other, one model is trained on a wide range of processes, while the other is trained only on one. These two effects appear non-trivial to disentangle. While a non-conditional model trained on a wide range of priors could, intuitively, be similar to an approach simply trained on the average prior, the more varied training set can still impact the non-conditional model's training behavior. I would request a toy-case test where the conditional and non-conditional models are trained on the exact same data, to better quantify the relative contributions of the two effects.
5. L441: "In Figure 8, the model’s efficacy is further demonstrated with two tests: (1) reconstructing jet mass from unfolded results, indicating well-preserved correlations among jet vector components, and (2) reconstructing event-level observables from unfolded quantities, achieved by tracking event numbers through object-wise unfolding. The successful reconstruction of jet mass, which is not directly unfolded but derived from". Reconstructing higher-level observables is a physically well-motivated method of checking how well correlations are learned. However, classifier-based tests, especially those using classifiers with inputs consisting of both detector-level and event-level information, have become the gold standard for testing correlation correctness. As correctly modeling correlations is a major hurdle for high-dimensional unfolding, I would strongly suggest adding such a classifier test to the paper to better quantify the generalized cDDPM's ability to correctly learn generalized correlations.
6. Figures 3, 5, 6, and to a lesser extent Figures 1, 2, 4, 7, 8: The font on these figures is very small and challenging to read. Especially for Fig. 3, 5, and 6, the label and legend font is around 30% of the height of the main text. This means that on a standard 1080p screen, viewing all three panels in these plots side by side already results in significant pixelation of the text and subsequent difficulty in reading it. Two-panel figures are notably better than the three-panel ones but are still challenging to read in printed format. I would request increasing the font size in all figures and restructuring Fig. 3, 5, and 6 to only have two panels (potentially adding an additional figure).
7. Both Table 1 and the conclusion compare the presented method to other unfolding methods; however, there is never a quantitative performance comparison between the presented method and other ML unfolding approaches. While I would love to see either other unfolding methods tested on the benchmarks used in the paper, or the generalized cDDPM tested on e.g. the OmniFold dataset, I understand that this is not a reasonable request at this point. In lieu of this, I would request an added point in the conclusion more explicitly addressing this lack of a quantified comparison, with a bigger focus on the novelty of the moment-conditioning concept.
Minor
8. L59 "Related Work" currently is missing a discussion of ML unfolding that already sees application to real experimental data, such as at ATLAS (2405.20041), LHCb (2208.11691), and especially H1 (2108.12376, 2303.13620). Adding this would better show the relevance of the work to current HEP experiments.
9. L90 "Such a method would enable the unfolding of distributions for a wide range of processes, including those involving yet-undiscovered particles in new physics searches at high-energy colliders." This sentence strikes me as somewhat misleading, as even a perfectly general SM-trained unfolding model still would not be guaranteed to be able to unfold all BSM particles. I would suggest a change to "Such a method would enable the unfolding of distributions for a wide range of processes, including a subset of those involving yet-undiscovered particles in new physics searches at high-energy colliders."
10. L124. The section title "Our Unfolding Approach" strikes me as too colloquial. Maybe something like "Generalizing Unfolding Approaches" would be more fitting.
11. L203: "This bias is undesirable for the unfolding task" -> "This bias can prove detrimental for the unfolding task"
12. L263 "is acceptable or even desirable" -> "is acceptable or even desired".
13. L314 "While both approaches showed promising results, the latter demonstrated marginally better performance and was therefore adopted for all results presented in this work." The fact that the inclusion of the conditional moments in the output results in a performance increase strikes me as counterintuitive, as it appears to increase the difficulty of the generative task. A sentence or two discussing possible reasons for this behavior would make this a lot clearer.
14. Figure 8: The left-hand panel seems to have a different format compared to all other figures of this type for no apparent reason. This is no issue if intentional, but I wanted to raise it in case this is a case of an old figure that should have been updated.
15. General: the positioning of the figures across the paper should be improved. Especially having the conclusion broken up by two pages of results plots breaks the flow of the paper.
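As an illustration of the classifier-based correlation test requested in point 5: one trains a classifier to separate truth-level samples from unfolded samples, and an area under the ROC curve (AUC) near 0.5 indicates the two samples are statistically indistinguishable. Below is a minimal numpy sketch under that assumption; the tiny logistic-regression model and in-sample AUC are illustrative stand-ins for the deep classifiers and held-out evaluation used in practice.

```python
import numpy as np

def classifier_auc(x_true, x_unfolded, epochs=200, lr=0.1):
    """Two-sample classifier test: fit a linear logistic-regression
    classifier to separate truth from unfolded samples and return its
    AUC. AUC ~ 0.5 means the samples are indistinguishable to this
    (deliberately simple) classifier."""
    x = np.concatenate([x_true, x_unfolded])
    y = np.concatenate([np.ones(len(x_true)), np.zeros(len(x_unfolded))])
    # Standardize features for stable gradient descent.
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted probabilities
        g = p - y                               # logistic-loss gradient
        w -= lr * (x.T @ g) / len(y)
        b -= lr * g.mean()
    scores = x @ w + b
    # AUC from the Mann-Whitney rank statistic (rank-based, scale-free).
    ranks = scores.argsort().argsort() + 1
    n1 = int(y.sum())
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)
```

For identically distributed samples this yields an AUC close to 0.5, while a shifted "unfolded" sample is flagged with an AUC well above 0.5; a high-capacity classifier would additionally be sensitive to mismodeled correlations, which is the point of the requested test.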
Recommendation
Ask for minor revision