Layne Bradshaw, Rashmish K. Mishra, Andrea Mitridate, Bryan Ostdiek
SciPost Phys. 8, 011 (2020) · published 24 January 2020
Searching for new physics in large data sets requires balancing two competing effects: signal identification and background distortion. In this work, we perform a systematic study of both single-variable and multivariate jet tagging methods that aim for this balance. The methods preserve the shape of the background distribution by augmenting either the training procedure or the data itself. We consider multiple quantitative metrics to compare the methods for tagging 2-, 3-, or 4-prong jets against the QCD background. This is the first study to show that the data augmentation techniques of Planing and PCA-based scaling deliver performance similar to that of the augmented training techniques of Adversarial NN and uBoost, while being both easier to implement and computationally cheaper.
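
As a rough illustration of what augmenting "the data itself" can look like, the sketch below shows a planing-style reweighting: each event receives a weight inversely proportional to the population of its histogram bin, so that the weighted distribution of a chosen variable becomes flat. This is a minimal sketch under stated assumptions, not the implementation used in the paper. The choice of jet mass as the protected variable, the function name `planing_weights`, the bin count, and the toy exponential spectrum are all hypothetical and chosen only for illustration.

```python
import numpy as np


def planing_weights(values, n_bins=50):
    """Return per-event weights that flatten the histogram of `values`.

    Hypothetical helper: the name, binning, and normalisation are
    illustrative assumptions, not taken from the paper.
    """
    counts, edges = np.histogram(values, bins=n_bins)
    # Bin index of each event; clip so the maximum value lands in the last bin,
    # matching np.histogram's inclusive right edge.
    idx = np.clip(np.digitize(values, edges) - 1, 0, n_bins - 1)
    # Weight each event by the inverse population of its bin.
    weights = 1.0 / counts[idx]
    # Normalise so the average weight is one.
    return weights / weights.mean()


# Toy background sample with a steeply falling spectrum (assumed shape).
rng = np.random.default_rng(0)
background_mass = rng.exponential(scale=50.0, size=10_000)

w = planing_weights(background_mass)
flat_hist, _ = np.histogram(background_mass, bins=50, weights=w)
```

In a sketch like this, the weights would be passed to the classifier as per-sample training weights, typically computed separately for each class, so that the classifier cannot gain discriminating power from the flattened variable alone.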