SciPost Submission Page

Signal region combination with full and simplified likelihoods in MadAnalysis 5

by Gaël Alguero, Jack Y. Araz, Benjamin Fuks, Sabine Kraml

This is not the latest submitted version.

This Submission thread is now published as

Submission summary

Authors (as registered SciPost users): Jack Araz · Benjamin Fuks · Sabine Kraml
Submission information
Preprint Link: https://arxiv.org/abs/2206.14870v1  (pdf)
Code repository: https://github.com/MadAnalysis/madanalysis5
Date submitted: 2022-07-01 20:09
Submitted by: Kraml, Sabine
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • High-Energy Physics - Phenomenology
Approaches: Theoretical, Computational, Phenomenological

Abstract

The statistical combination of disjoint signal regions in reinterpretation studies uses more of the data of an analysis and gives more robust results than the single signal region approach. We present the implementation and usage of signal region combination in MadAnalysis 5 through two methods: an interface to the pyhf package making use of statistical models in JSON-serialised format provided by the ATLAS collaboration, and a simplified likelihood calculation making use of covariance matrices provided by the CMS collaboration. The gain in physics reach is demonstrated 1.) by comparison with official mass limits for 4 ATLAS and 5 CMS analyses from the Public Analysis Database of MadAnalysis5 for which signal region combination is currently available, and 2.) by a case study for an MSSM scenario in which both stops and sbottoms can be produced and have a variety of decays into charginos and neutralinos.

Current status:
Has been resubmitted

Reports on this Submission

Anonymous Report 1 on 2022-8-2 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2206.14870v1, delivered 2022-08-02, doi: 10.21468/SciPost.Report.5482

Strengths

1. Pedagogical description of the implementation method.
2. Validation summaries for 9 searches.

Weaknesses

1. Some of the validation results are problematic in terms of accuracy.

Report

The paper "Signal region combination with the full and simplified likelihoods in MADANALYSIS5" by G. Alguero, J. Araz, B. Fuks, and S. Kraml is a welcome development in the field of recasting at the LHC. It provides a tool, as part of the MadAnalysis package, for the automatic combination of signal regions in the searches published by ATLAS and CMS. Several methods for combination are implemented, depending on the data provided by the collaborations. A number of recent ATLAS and CMS searches make use of multi-bin signal regions, where the exclusion is decided on the basis of a fit to histograms of certain kinematic variables. This approach has advantages over the "best signal region" approach, which used to be more common in the past (although it is still widely used). As the authors demonstrate, the method has a profound impact in the CMS searches that define a large number of signal regions (~100).

In the second section of the paper, the authors provide a detailed description of the implementation of the statistical methods and of the interfaces to the various data sources. For the ATLAS searches they use the pyhf package and the JSON-format input files provided by the experiment. For the CMS searches they use the covariance matrices provided by the collaboration. I find the examples showing the details of the implementation particularly useful.
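To illustrate what a covariance-based combination does, here is a minimal sketch in plain Python. It is not the MadAnalysis 5 implementation: it replaces the simplified likelihood (Poisson terms with Gaussian-constrained background nuisances) by a Gaussian chi-square approximation, and all yields and the covariance matrix are invented toy numbers. The function names `solve` and `chi2` are hypothetical.

```python
def solve(a, y):
    """Solve the linear system a x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [yi] for row, yi in zip(a, y)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def chi2(n_obs, bkg, sig, cov_bkg, mu):
    """Gaussian approximation: r^T (V_bkg + diag(expected))^-1 r, with r = n - b - mu*s."""
    expected = [b + mu * s for b, s in zip(bkg, sig)]
    resid = [n - e for n, e in zip(n_obs, expected)]
    size = len(resid)
    # Total covariance: background covariance plus Poisson variance on the diagonal.
    total = [[cov_bkg[i][j] + (expected[i] if i == j else 0.0)
              for j in range(size)] for i in range(size)]
    return sum(r * x for r, x in zip(resid, solve(total, resid)))


# Toy inputs (assumptions, not taken from any analysis): two signal regions.
n_obs = [25.0, 20.0]            # observed counts, equal to the background here
bkg = [25.0, 20.0]              # expected background
sig = [10.0, 9.0]               # signal yields for signal strength mu = 1
cov = [[9.0, 3.0], [3.0, 4.0]]  # background covariance matrix

combined = chi2(n_obs, bkg, sig, cov, mu=1.0)
single = [chi2([n_obs[i]], [bkg[i]], [sig[i]], [[cov[i][i]]], mu=1.0)
          for i in range(2)]
print(combined, single)
```

With these toy numbers the combined chi-square for the signal hypothesis is larger than that of either region alone, which is the qualitative gain the combination is meant to capture. A real limit would use the CLs prescription rather than a chi-square threshold.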

The next section is devoted to validation of the searches. While some of them show a clear advantage, in other cases one can have doubts about their validity. I will specifically list my questions here:
ATLAS-SUSY-2018-31: I am not sure what the purpose of showing the LO analysis is. It is shown only for this search and is clearly off, which is not surprising since the missing K-factors can easily be of order 1.5. Otherwise the agreement is excellent (although the gain from the combination is rather small).
ATLAS-SUSY-2018-04: The implementation inherits the problems of the original implementation, i.e. overshooting the acceptance rate by some 30%. What puzzles me is why the authors can reproduce the expected limit but fail with the observed one. I would naively expect a similar level of (dis)agreement. This question applies equally to ATLAS-SUSY-2019-08, CMS-SUS-16-039 and CMS-SUS-19-006.
CMS-SUS-16-039: While the agreement in the observed limit is clearly impressive in the high mass region, there is clearly a problem in the 3-body decay region. Is it clear why? In any case the improvement with respect to the best SR is impressive.
CMS-SUS-19-006: The authors downplay the over-exclusion, but to me it seems quite dangerous. If I read the plot correctly, for the expected limit at 0 LSP mass this shifts the bound from 2170 GeV to 2350 GeV. This means the upper limit on the cross section is a factor of 2.4 too strong. Perhaps it does not look that bad in the plot, but numerically it is not negligible. I would like to see, if possible, how the upper limits on the cross section compare in the exclusion plot (at least in a reasonable vicinity of the exclusion line).
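The step from a shift in the mass bound to a factor on the cross-section limit can be spelled out as follows. This is a sketch of the reasoning, assuming that the upper limit varies only slowly with mass between the two points; the factor 2.4 is the referee's figure and is not recomputed here.

```latex
\begin{align}
  \sigma_{95}^{\rm CMS}(m_1) &= \sigma_{\rm th}(m_1), &
  \sigma_{95}^{\rm MA5}(m_2) &= \sigma_{\rm th}(m_2), &
  m_1 &= 2170~{\rm GeV},\quad m_2 = 2350~{\rm GeV}, \\
  \frac{\sigma_{95}^{\rm CMS}}{\sigma_{95}^{\rm MA5}}
    &\approx \frac{\sigma_{\rm th}(m_1)}{\sigma_{\rm th}(m_2)} \approx 2.4 .
\end{align}
```

Here \sigma_{95} denotes the 95% CL upper limit on the cross section and \sigma_{\rm th} the theory prediction; the exclusion contour sits where the two are equal.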

The fourth section provides a toy example of a realistic MSSM setup (in contrast to the simplified models from the previous section). The advantage of the improved statistical treatment is clearly demonstrated.

Requested changes

My main request, apart from the minor questions in the main report, is to show how the upper limits on the cross section compare between the experiment and the recast. This is certainly more interesting than the LO analysis presented for one of the searches. I am aware that these data are not always available, but as far as I remember ATLAS used to provide such information in the auxiliary plots. I think this would give more confidence in the validity of the presented approach.

  • validity: high
  • significance: top
  • originality: high
  • clarity: top
  • formatting: perfect
  • grammar: perfect

Author:  Sabine Kraml  on 2022-08-31  [id 2776]

(in reply to Report 1 on 2022-08-02)
Category:
answer to question

We thank the referee for the careful assessment of our work. To the questions regarding section 3 (the validation of individual analyses), we reply as follows:

*) ATLAS-SUSY-2018-31, purpose of showing LO analysis: it is not uncommon that (exploratory) phenomenological studies use LO cross sections, as computing higher-order ones is CPU-time consuming and, depending on the model, not always straightforward (e.g., when the available calculations are for certain limits only). Also, in the case of the MSSM, the reference cross sections provided by the LHC SUSY Cross Sections WG are for the particular simplified-model assumptions and do not apply to more general MSSM scenarios. We think it is useful to illustrate the effect in one example. Showing LO results for all analyses discussed in section 3 would clearly be an unwarranted proliferation of plots, but one example can be useful to our mind.

*) ATLAS-SUSY-2018-04, and why we can reproduce the expected limit but fail with the observed one: in fact the agreement of the expected limit curves is somewhat accidental. As explained in the paper, the acceptance × efficiency values from the MadAnalysis5 implementation of this analysis are roughly 30% higher than those from ATLAS. Therefore, the excluded cross section is smaller, and the observed limit on the stau mass stronger than in the official ATLAS result. For the expected limit, however, this effect is compensated by the fact that we report "apriori" (pre-fit) expected values, while ATLAS seemingly reports "aposteriori" (post-fit) expected values. The difference between pre-fit and post-fit is discussed in the last paragraph of section 2.1. Incidentally, for this particular analysis, the difference between pre-fit and post-fit background numbers is just of the right size to bring the ATLAS and MA5 expected limit curves into agreement. Had we used post-fit expected values, we would see roughly the same over-exclusion as for the observed limit. For better illustration, the attached "supplement.pdf" contains a concrete example.
We have added a remark in the paper (at the end of the last paragraph discussing ATLAS-SUSY-2018-04) to make this clear.
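The effect of the acceptance × efficiency overshoot on the excluded cross section can be estimated with a rough scaling. This is a sketch that assumes the upper limit on the number of signal events, N_{95}, is the same in both cases and that the overshoot is a uniform 30%; L denotes the integrated luminosity.

```latex
\begin{align}
  \sigma_{95} &= \frac{N_{95}}{L \,(A \times \varepsilon)}, \\
  \frac{\sigma_{95}^{\rm MA5}}{\sigma_{95}^{\rm ATLAS}}
    &\approx \frac{(A \times \varepsilon)^{\rm ATLAS}}{(A \times \varepsilon)^{\rm MA5}}
     \approx \frac{1}{1.3} \approx 0.77 .
\end{align}
```

Under these assumptions the recast excludes cross sections roughly 23% smaller than ATLAS does, which translates into a stronger observed mass limit.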

*) CMS-SUS-16-039: why in the 3-body decay region the limit derived with MA5 doesn't match the CMS one. In fact we had not explicitly simulated the 3-body decays with off-shell W and Z bosons. We thought this was clear from the figure caption and plot title, but agree that it is confusing that the limit curves don't match. We have now updated Figure 5 properly including also the off-shell region. The CMS exclusion contour is now reproduced well in both the high mass and the 3-body decay regions.

*) CMS-SUS-19-006, over-exclusion in particular in the expected limit: In reply to the referee's concern, we modified the last sentence of the relevant paragraph (see resubmission letter).

The requested plots showing the ratio of MA5/CMS excluded cross sections, for observed and expected limits, are attached. We agree that a detailed comparison of the upper limits on a cross section obtained in a recast with those provided officially is a very good way to assess the quality of an implementation in any specific recasting program. However, this kind of validation lies outside the scope of the present paper, which focuses only on the statistical modelling. We therefore decided not to include such plots and the related discussion in our manuscript.

As an aside, we note the information provided by CMS is sometimes a mix of pre-fit and post-fit background numbers (e.g., background numbers of events being pre-fit, but the covariance matrix being post-fit, or vice-versa) and the distinction is not always made clear. We are discussing this problem with the CMS collaboration, but do not want to enter into this discussion in the present paper.

Attachment:

supplement.pdf


Comments

Anonymous on 2022-07-07  [id 2643]

Category:
remark

Combining signal regions with covariance matrices is much better than the common option of choosing the signal region with the best expected exclusion, which can be rather problematic and can lead to flip-flopping between signal regions when MC samples are limited. It's great that experimental collaborations are now providing the covariance matrices required to do this, and I strongly support the aims of this work in using them properly.
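The flip-flopping mentioned in this comment can be illustrated with a small toy in Python. Everything here is an assumption made for illustration: the s/sqrt(b) figure of merit, the quadrature sum (which ignores correlations between regions), the function names, and the yields, which mimic two limited MC samples of the same signal point.

```python
import math


def expected_z(sig, bkg):
    """Toy per-region sensitivity s / sqrt(b)."""
    return [s / math.sqrt(b) for s, b in zip(sig, bkg)]


def best_region(sig, bkg):
    """Index of the region with the highest toy sensitivity."""
    z = expected_z(sig, bkg)
    return max(range(len(z)), key=z.__getitem__)


def combined_z(sig, bkg):
    """Quadrature sum of the per-region sensitivities (no correlations)."""
    return math.sqrt(sum(z * z for z in expected_z(sig, bkg)))


bkg = [25.0, 20.0]       # expected background in two signal regions
sample_a = [10.0, 9.0]   # signal yields estimated from one MC sample
sample_b = [10.5, 8.8]   # same signal point, statistically independent sample

for name, sig in (("A", sample_a), ("B", sample_b)):
    print(name, "best region:", best_region(sig, bkg),
          "combined:", round(combined_z(sig, bkg), 3))
```

In this toy the best region switches between the two samples, while the combined sensitivity changes only slightly, so a limit based on the combination is less sensitive to MC fluctuations than one based on the selected region.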