SciPost Submission Page
Strength in numbers: optimal and scalable combination of LHC new-physics searches
by Jack Y. Araz, Andy Buckley, Benjamin Fuks, Humberto Reyes-González, Wolfgang Waltenberger, Sophie L. Williamson, Jamie Yellen
This is not the latest submitted version.
This Submission thread is now published as
Submission summary
Authors (as registered SciPost users):  Jack Araz · Andy Buckley · Benjamin Fuks · Humberto Reyes-González · Wolfgang Waltenberger
Submission information  

Preprint Link:  https://arxiv.org/abs/2209.00025v2 (pdf) 
Code repository:  https://gitlab.com/taco/taco_code 
Date submitted:  2022-11-23 17:10
Submitted by:  Reyes-González, Humberto
Submitted to:  SciPost Physics 
Ontological classification  

Academic field:  Physics 
Specialties: 

Abstract
To gain a comprehensive view of what the LHC tells us about physics beyond the Standard Model (BSM), it is crucial that different BSM-sensitive analyses can be combined. But in general, search analyses are not statistically orthogonal, so performing comprehensive combinations requires knowledge of the extent to which the same events co-populate multiple analyses' signal regions. We present a novel, stochastic method to determine this degree of overlap and a graph algorithm to efficiently find the combination of signal regions with no mutual overlap that optimises expected upper limits on BSM-model cross-sections. The gain in exclusion power relative to single-analysis limits is demonstrated with models with varying degrees of complexity, ranging from simplified models to a 19-dimensional supersymmetric model.
Current status:
Reports on this Submission
Report 1 by Andrew Fowlie on 2022-11-30 (Invited Report)
Cite as: Andrew Fowlie, Report on arXiv:2209.00025v2, delivered 2022-11-30, doi: 10.21468/SciPost.Report.6233
Strengths
See first report.
Weaknesses
See first report.
Report
I have spent about 2.5 hours examining the changes and thinking about them. My first report made a number of minor suggestions. I would like to thank the authors for their patience working through my long report. For the most part, I was satisfied by the changes. I didn't see my comment 2.1 (a) addressed though.
My main concern was the lack of clarity over the notion of power used in the paper. This isn't particular to this paper: in my personal opinion, there is a lot of confusion and contradiction over the way 'power' is used in the collider pheno literature, compared to how it would be used (i.e., quite precisely) in a statistical context. The authors address this by adding a comment to the introduction and in the numbered points in sec. 3.2. Unfortunately, I don't think the issue here has been fully resolved. In particular,
i) In the collider literature and in the Asimov approach, we are often making the approximation that median = expected, and using 'median' and 'expected' interchangeably. In this setting, I agree that the maximum expected significance may be found by maximizing the expected LLR.
ii) But this says nothing about power; it only says something about expected significance. I don't understand the 'power corresponds to ... ' part of the authors' response or the 'This [power] is equivalent to maximising ...' part of point 3 in sec. 3.2. Why is maximizing expected significance equivalent to maximizing power?
As I commented in my first report, as far as I can tell, in this setting and under the Wald approximation, the power depends on the expectation of the LLR under the null *and* the alternative model (as well as a choice of fixed type-1 (T1) error rate). Under each model, the test statistic q = -2 LLR follows a normal distribution with a mean and variance related by variance = 4 * mean. To find the power, you need to know the distribution of q under each model. Use the distribution under H0 to fix the T1 error. Use the distribution under H1 to compute the power at that fixed T1 error.
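A minimal numerical sketch of the two-step calculation just described, assuming the sign convention LLR = ln[L(H1)/L(H0)], so that q has a positive mean under H0 and a negative mean under H1 and H0 is rejected for small q. The variance is taken as 4 * |mean| so that the stated relation holds for either sign. The function name `wald_power` and the input values are illustrative only and are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def wald_power(mean_q_h0, mean_q_h1, alpha=0.05):
    """Power of a test that rejects H0 when q = -2 LLR is small.

    Assumption: q is normally distributed under each hypothesis,
    with variance = 4 * |mean| (Wald approximation).
    """
    std_normal = NormalDist()
    sigma_h0 = 2.0 * sqrt(abs(mean_q_h0))
    sigma_h1 = 2.0 * sqrt(abs(mean_q_h1))
    # Step 1: use the H0 distribution to fix the T1 error rate,
    # i.e. choose q_crit such that P(q < q_crit | H0) = alpha.
    q_crit = mean_q_h0 + sigma_h0 * std_normal.inv_cdf(alpha)
    # Step 2: use the H1 distribution to compute the power,
    # i.e. P(q < q_crit | H1).
    return std_normal.cdf((q_crit - mean_q_h1) / sigma_h1)


# Same expectation of q under H1, different expectation under H0.
power_a = wald_power(mean_q_h0=4.0, mean_q_h1=-4.0)
power_b = wald_power(mean_q_h0=1.0, mean_q_h1=-4.0)
print(power_a, power_b)
```

In this sketch the two calls share the same expectation under H1 but return different powers, which is the sense in which the expectation under the alternative alone does not determine the power.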
Perhaps there are some extra assumptions that the authors are making? The response hints at this (e.g., 'given some fairly reasonable approximations'; what approximations?). On the other hand, since it is common in the literature and since I expect power and expected significance under H1 to be quite closely related, it is reasonable to use it as a proxy, which is particularly useful in this setting since LLR is additive and so we have a sense in which power is additive which is required for the algorithm developed here. I think that would be perfectly reasonable so long as the logic is clarified.
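The additivity mentioned above can be illustrated with a short sketch. It assumes independent single-bin Poisson counting regions with known backgrounds and no nuisance parameters, and uses the standard Asimov expression for the expected discovery test statistic. The helper names `asimov_q` and `combined_significance` and the yields are hypothetical and are not taken from the paper or its code.

```python
from math import log, sqrt


def asimov_q(s, b):
    """Asimov (expected) -2 LLR for signal yield s on background b
    in one Poisson counting region, with b known exactly."""
    return 2.0 * ((s + b) * log(1.0 + s / b) - s)


def combined_significance(regions):
    """Expected significance for a set of non-overlapping regions.

    The log-likelihood ratios of independent regions add, so the
    per-region q values are summed before taking the square root.
    """
    return sqrt(sum(asimov_q(s, b) for s, b in regions))


# Illustrative (signal, background) yields for two disjoint regions.
regions = [(3.0, 10.0), (5.0, 20.0)]
print(combined_significance(regions))
```

Because the summed quantity is the expected LLR, an additive per-region weight of this kind is what a combination algorithm can optimise; whether it also tracks power depends on the additional assumptions discussed above.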
Finally, there are typos in the changes in the text below eq. 3 (',,' and 'through by').
Requested changes
1. Address outstanding item from first report (2.1 (a))
2. Clarify relation between power and expected LLR
3. Fix typos
Author: Humberto Reyes-González on 2022-12-23 [id 3187]
(in reply to Report 1 by Andrew Fowlie on 2022-11-30)
Apologies for the missed point, and for what we suspect was a wrong inference on our part about the connection between power and expected significance. We have addressed these points in the latest version, and respond to the review text below.
The corresponding paragraph has been rephrased.
We used a simplified-model approach with the intention of obtaining robust conclusions about potential signal overlaps for arbitrary scenarios. In certain cases, such as simplified-model-based reinterpretation, the assessment of overlap between two analyses must be generalised for any scenario (within the convex hulls), since such an approach doesn't involve MC event generation and thus overlaps can't be determined on the fly. Nonetheless, in the case of so-called full (MC-event-based) recasting, we can certainly determine the orthogonality between LHC analyses for each specific scenario under consideration. In fact, we did this for Example 3 in the paper.
You are quite right! Our intention was always to maximise the significance, and we had been using the word "power" informally to represent that... and then erroneously concluded that they were interchangeable here (we are not sure why, so thank you for insisting) and we could continue with that nomenclature.
You have clarified that it is indeed inaccurate, and probably confusing for more statistically minded readers, so we have replaced the word "power" with "significance" or equivalent throughout the paper.
Fixed! Thanks again.