T. Aarrestad, M. van Beekveld, M. Bona, A. Boveia, S. Caron, J. Davies, A. De Simone, C. Doglioni, J. M. Duarte, A. Farbin, H. Gupta, L. Hendriks, L. Heinrich, J. Howarth, P. Jawahar, A. Jueid, J. Lastow, A. Leinweber, J. Mamuzic, E. Merényi, A. Morandini, P. Moskvitina, C. Nellist, J. Ngadiuba, B. Ostdiek, M. Pierini, B. Ravina, R. Ruiz de Austri, S. Sekmen, M. Touranakou, M. Vaškevičiūte, R. Vilalta, J. R. Vlimant, R. Verheyen, M. White, E. Wulff, E. Wallin, K. A. Wozniak, Z. Zhang
SciPost Phys. 12, 043 (2022), published 28 January 2022
We describe the outcome of a data challenge conducted as part of the Dark
Machines Initiative and the Les Houches 2019 workshop on Physics at TeV
colliders. The challenge aims to detect signals of new physics at the LHC
using unsupervised machine learning algorithms. First, we propose how an
anomaly score could be implemented to define model-independent signal regions
in LHC searches. We define and describe a large benchmark dataset, consisting
of more than 1 billion simulated LHC events corresponding to $10~\rm{fb}^{-1}$ of
proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a
wide range of anomaly detection and density estimation algorithms, developed in
the context of the data challenge, and we measure their performance in a set of
realistic analysis environments. We draw a number of useful conclusions that
will aid the development of unsupervised new physics searches during the third
run of the LHC, and provide our benchmark dataset for future studies at
https://www.phenoMLdata.org. Code to reproduce the analysis is provided at
https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
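
As an illustration of how an anomaly score could define a model-independent signal region, the following minimal sketch places a cut on the score so that a fixed fraction of background events is retained, then selects every event above that cut. It is not the procedure used in the paper: the function names, the toy Gaussian scores, and the 1% background efficiency are all assumptions chosen for the example.

```python
import random


def threshold_at_background_efficiency(background_scores, efficiency):
    """Return the score cut that keeps roughly `efficiency` of background events."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be in (0, 1)")
    ordered = sorted(background_scores)
    # Index of the first event that falls inside the retained upper tail.
    index = min(round(len(ordered) * (1.0 - efficiency)), len(ordered) - 1)
    return ordered[index]


def select_signal_region(scores, threshold):
    """Return indices of events whose anomaly score is at or above the cut."""
    return [i for i, score in enumerate(scores) if score >= threshold]


rng = random.Random(0)
# Toy anomaly scores (hypothetical): stand-ins for the output of any
# unsupervised model trained on background-like events.
background = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
signal = [rng.gauss(3.0, 1.0) for _ in range(200)]

cut = threshold_at_background_efficiency(background, efficiency=0.01)
bkg_kept = len(select_signal_region(background, cut)) / len(background)
sig_kept = len(select_signal_region(signal, cut)) / len(signal)
print(f"cut={cut:.3f}  background eff={bkg_kept:.4f}  signal eff={sig_kept:.3f}")
```

Fixing the background efficiency rather than the score value keeps the definition of the signal region comparable across algorithms whose scores live on different scales.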