SciPost Submission Page

Class Imbalance Techniques for High Energy Physics

by Christopher W. Murphy

Submission summary

As Contributors: Christopher W. Murphy
Preprint link: scipost_201907_00004v4
Date accepted: 2019-12-02
Date submitted: 2019-11-24 01:00
Submitted by: Murphy, Christopher W.
Submitted to: SciPost Physics
Academic field: Physics
  • Learning
  • High-Energy Physics - Experiment
  • High-Energy Physics - Phenomenology
Approaches: Theoretical, Computational


A common problem in a high energy physics experiment is extracting a signal from a much larger background. Posed as a classification task, there is said to be an imbalance in the number of samples belonging to the signal class versus the number of samples from the background class. In this work we provide a brief overview of class imbalance techniques in a high energy physics setting. Two case studies are presented: (1) the measurement of the longitudinal polarization fraction in same-sign $WW$ scattering, and (2) the decay of the Higgs boson to charm-quark pairs.

Published as SciPost Phys. 7, 076 (2019)

List of changes

I have made the following changes based on the referee report of the anonymous referee:
1.) I am now decaying the $W$ bosons
2.) The threshold on $m_{jj}$ has been raised to 850 GeV

I have made the following changes based on the referee report of Tilman Plehn:
1.) I revised a paragraph in Section 2 discussing the precision-recall curve so that it goes into more detail.
2.) I added several paragraphs to Section 2 going into greater detail on focal loss and what it is useful for. In a nutshell, it down-weights easy-to-classify examples in the loss function so that training focuses more on the hard-to-classify examples.
3.) What was footnote 1 has been removed
4.) Feynman diagrams have been added, see Figure 1
5.) Figure 2 has been added, which gives both the precision-recall curve and the ROC curve for the models in Table 1.
6.) The cuts and feature engineering have been updated to match Ref. [46] as closely as possible. Good agreement is found when a fair comparison can be made. I find the AUC for $\Delta\phi_{jj}$ to be higher than in Ref. [46]. However, this is due to my treatment of partons from the hard scattering process as jets.
7.) There is no problem with the signal-background separation in (what is now) Figure 3. The mean predicted probability for a classifier with an unweighted loss function trained on an imbalanced dataset is $r$, the imbalance ratio. I would take complete signal-background separation in the training dataset to be a sign of overfitting if such behavior was not also observed in the validation dataset, which it’s not in this case. Balancing the training set moves the mean value to 0.5. This can be seen in the upper right panel from the balanced random forest. Weighting the loss function with the inverse of the class frequencies also moves the mean value to 0.5. Focal loss is intermediate between these two scenarios, $r$ and 0.5, as can be seen in the bottom right panel.
8.) Performance is reported as the mean $\pm$ the standard deviation of the five folds of the cross validation. I added this to the caption of the table. It was already in the text. My apologies for the confusion around the use of the word significant. This is certainly not a Higgs boson level discovery, nor even 3-sigma evidence for something. Apparently I’ve been out of physics for too long. I was approaching this from the point of view of how large an effect would need to be observed to convince my business stakeholders it would be worth rolling this out. The manuscript has been revised accordingly.
9.) The global fit reference now includes citations for 9 papers, Refs. [56-64], and I added a footnote about 0904.3866.
10.) I added a paragraph to the end of Section 4.1 attempting to clarify these questions.
11.) The precision-recall curves for the tests in Table 2 have been added in Figure 3. The curves are consistent with the findings in Table 2, and provide another way of demonstrating that class imbalance techniques can lead to an improvement in charm tagging efficiency.
12.) A direct comparison against (what is now) Ref. [76] is not possible given the backgrounds considered in the two works are different.
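
To illustrate the focal loss summarized in item 2 of the list above, here is a minimal sketch in Python. It assumes the standard binary form $\mathrm{FL}(p_t) = -(1 - p_t)^\gamma \log p_t$, where $p_t$ is the predicted probability of the true class. The function names and the default $\gamma = 2$ are illustrative choices and are not taken from the manuscript.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, eps=1e-12):
    """Per-example focal loss for a binary label (1 = signal, 0 = background).

    Assumed form: FL = -(1 - p_t)**gamma * log(p_t), with p_t the
    probability the model assigns to the true class.
    """
    p = min(max(p_pred, eps), 1.0 - eps)  # clip to avoid log(0)
    p_t = p if y_true == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(y_true, p_pred):
    """Ordinary binary cross-entropy, i.e. focal loss with gamma = 0."""
    return binary_focal_loss(y_true, p_pred, gamma=0.0)


# Ratio of focal loss to cross-entropy for a signal example that is
# easy (p_t = 0.9) versus hard (p_t = 0.1) to classify.
easy_ratio = binary_focal_loss(1, 0.9) / cross_entropy(1, 0.9)
hard_ratio = binary_focal_loss(1, 0.1) / cross_entropy(1, 0.1)
print(f"easy example weight: {easy_ratio:.2f}")
print(f"hard example weight: {hard_ratio:.2f}")
```

With $\gamma = 2$ the modulating factor $(1 - p_t)^\gamma$ suppresses the contribution of the well-classified example far more strongly than that of the poorly classified one, which is the down-weighting behavior described above. Setting $\gamma = 0$ recovers the unweighted cross-entropy.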

Additional Notes:
- 3 figures have been added
- 10 references have been added

Reports on this Submission

Report 2 by Tilman Plehn on 2019-11-28 (Invited Report)


Thank you for considering all my comments. Let's publish!

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -

Anonymous Report 1 on 2019-11-24 (Invited Report)


The manuscript was revised, addressing the points previously raised by the referee. I believe that the manuscript is ready for publication.

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -
