SciPost Submission Page
Fast simulation of detector effects in Rivet
by Andy Buckley, Deepak Kar, Karl Nordstrom
This is not the latest submitted version.
This Submission thread is now published as
|Authors (as Contributors):||Andy Buckley · Deepak Kar|
|Arxiv Link:||https://arxiv.org/abs/1910.01637v1 (pdf)|
|Date submitted:||2019-10-04 02:00|
|Submitted by:||Buckley, Andy|
|Submitted to:||SciPost Physics|
We describe the design and implementation of detector-bias emulation in the Rivet MC event analysis system. Implemented using C++ efficiency and kinematic smearing functors, it allows detector effects to be specified within an analysis routine, customised to the exact phase-space and reconstruction working points of the analysis. A set of standard detector functions for the physics objects of Runs 1 and 2 of the ATLAS and CMS experiments is also provided. Finally, as jet substructure is an important class of physics observable usually considered to require an explicit detector simulation, we demonstrate that a smearing approach, tuned to available substructure data and implemented in Rivet, can accurately reproduce jet-structure biases observed by ATLAS.
Submission & Refereeing History
You are currently on this page
Reports on this Submission
Anonymous Report 3 on 2019-10-31 (Invited Report)
- Cite as: Anonymous, Report on arXiv:1910.01637v1, delivered 2019-10-31, doi: 10.21468/SciPost.Report.1277
1- The authors present an extension to the extremely widely used Rivet analysis tool to approximate the most important detector effects and include them in newly built analysis codes.
2- The structure and paradigms of the implementation are described in detail, and seem to be kept minimal, yet easily extendable.
1- See requested changes below, in general only very minor clarifications are required.
This paper greatly improves the current situation of the experiments publishing unusuable (or barely usable) data which is not corrected for proprietary and publicly unknown detector effects. Existing solutions such as Delphes have known issues and inaccuracies sometimes of the same size as the detector effects themselves. It is thus a very much appreciated service for the theory community and (I suppose) the LHC collaborations.
1- in how far is the problem of detector unfolding "ill-posed". In my understanding there should be clear, in the authors terms, forward transfer function that should be invertible, even if maybe multiple solutions exist.
2- first paragraph, end of last sentence, the "/" probably should be replace with ".".
3- I am not sure I fully understand Fig. 1. What is the difference and (even schematic) meaning between "Reco ??" and "Reco/analysis". Why are "Reco/analysis" objects further from "MC truth" than "Detector hits"? Please expand on the definitions used in that graph and its discussion.
4- Dublicated citation  to MadAnalysis.
5- Section 2, Implementation. In the first line "is" should probably be "in". Also this sentence needs some overhaul for logic.
6- Same section, bottom of the page: The "The SmearedJets ..." sentence is a bit convoluted and uses a superfluous semicolon.
7- Sec. 4.2, please defineghost-associated or provide a reference.
8- Sec. 4.2, should by any chance MV2c20 be MC2c20?
9- Sec. 5. Has any validation of the photon treatment been performed?
10- ".., with Rivet and Delphes seen to stick closely together ..". Could this be formulated less colloquially?
11- Sec. 6. The discussion of the substructure observables is carried out on the basis of calorimeter modelling, how about tracking information? Is tracking info modelled/used as well?
12- Sec. 6. The authors find that the fit of the parameters of their expected detector response varies greatly with the used data input. I would like to ask the authors to somewhat expand on their discussion of the implications of this finding, possibly also including that the expected functional form may be incorrect. As it stands, neither of the fitted parameter values inspires much confidence in actually being used in unmeasured regions/observables.
Report 2 by Tilman Plehn on 2019-10-30 (Invited Report)
- Cite as: Tilman Plehn, Report on arXiv:1910.01637v1, delivered 2019-10-30, doi: 10.21468/SciPost.Report.1273
- the authors present a new tool which improves the modelling of detector effects especially for subet physics;
- the tool should be numerically efficient;
- the description is nice and physics-oriented.
- see requested changes, nothing that cannot be fixed (or argued away)
The paper has the potential to fill a known whole in LHC simulations, which too often rely on default Delphes, even though everybody knows that there are issues.
From the front to the back
1- define `ill-posed problem' as in unfolding. That accusation is a little too unspecific;
2- in the introduction it would be nice to mention that smearing is a very old way to describe detector effects. I learned this from Dieter Zeppenfeld in the late 90s, Tao Han had his famous hanlib with smearing functions. So while the presented approach is very useful, it is totally not new. Please make that clear and cite some old papers, for example Dieter's WBF papers should do.
3- I am sorry, but I do not get Fig.1. Why does the step from Det to Reco get us closer to MC? Is that guaranteed or hoped for?
4- global EFT analyses like we do them in SFitter are probably amongst the most sensitive users of detector simulation, and it's more actual physics than BSM stuff. Same for Glasgow's own TopFitter, just citing GAMBIT can be considered an insult here.
5- in 4.1, what about forward/tagging jets?
6- in 4.3 I do not understand what the authors are saying.
7- concerning 5, there is an ATLAS member on the paper and validation is all with Delphes, no data here? I am surprised about this lack of experimental honor!
8- for instance in Fig.2 the labels are too small to read on a laptop.
9- in Sec.6, the central limit theorem does not apply to profile likelihoods (as far as I understand), to that statement is a little pompous.
10- all over Sec.6 I am missing particle flow. Substructure tends to use lots of track information. At least comment and admit calorimeter defeat, please.
11- Eq.(2) is missing an error bar, so we can compare with Eq.(3).
12- in Sec.6.1, for instance, what is the effect of UE, pile-up, etc?
13- I learned that consistently writing in passive voice is bad style.
Report 1 by Jonathan Butterworth on 2019-10-7 (Invited Report)
- Cite as: Jonathan Butterworth, Report on arXiv:1910.01637v1, delivered 2019-10-07, doi: 10.21468/SciPost.Report.1212
1. Presents a useful, widely-applicable and well-designed software tool for particle physics
2. Presents a reasonable selection of demonstration results that the tool is performant
3. Presents enough of a guide that users should be able make use of the tool and adapt it to their needs.
1. Does not, in itself, contain original physics results (though this is not the intention, this is an enabling paper).
2. Contains a few unsubstantiated claims (see report).
The smearing tools presented should broaden the usefulness of the (already widely used) Rivet library to allow the inclusion of detector-level/reconstruction-level results which have not been corrected/unfolded for detector resolution and efficiency.
In section 1, the claim is made that "sound unfolding" adds very considerable time and effort. It is not really clear that this needs to intrinsically be the case. If the reco-level distributions are well enough understood for a publication (including systematic uncertainties etc) then the final unfolding step can be relatively trivial. I think the more serious problem is that reco-level distributions are often *not* this well understood, but are nevertheless published. Once can always imagine "pathological" exotic cases where the unfolding is unreliable, but in such cases the parameterised approach used here (and in some cases even the full detector simulation and/or the detector calibrations!) would also be unreliable.
(The desire of searches to use sparsely-populated bins does seem to be an intrinsic limitation on unfolding, however.)
In Section 2 the authors' approach is described as "a priori less accurate" than a DELPHES like detector model. I am actually convinced by the authors that in fact their approach is a priori *more* accurate in many cases, since the functions are tailored to specific analyses. Any generic detector simulation based on efficiency maps must surely contain compromises which will not be equally accurate for all event topologies?
The authors also state that "most calibrations are highly dependent on MC modelling". They should clarify what they mean by this. I don't think calibrations should be (or are) in the end highly dependent on the event generation, since they are validated using in situ measurements. However, they are dependent on the MC model of the detector (which is validated in various ways using data).
In section 2, what justification is there for saying the 10-20% accuracy would lead to "conservative" bounds? Couldn't they just as well be over-aggressive, depending upon the sign of the error made?
In section 3 the authors mentuon "unconscious factors". I don't think unconscious is the right word here?
Section 4.1 It seems odd, though probably justifiable, that Energy smearing is applied to momentum and that the mass is left unchanged. I presume the energy is recalculated so the result is a valid four vector? If the mass is to be then subsequently smeared, would the energy again be recalculated automatically? Some clarification would help, I think.
Section 4.2 "ghost-associated" needs description/reference
4.8 Jets have no efficiency calculated either? So it is not just MET...
Section 5 was the 10GeV pT applied to truth or smeared value?
1. Please address the comments in the report, some will imply changes.
2. Section 3 double reference for MadAnalysis should be removed. The word "implemention"appears too often in one sentence.
3. Page numbers would be nice.