SciPost Submission Page
Deep-Learning Jets with Uncertainties and More
by Sven Bollweg, Manuel Haussmann, Gregor Kasieczka, Michel Luchmann, Tilman Plehn, Jennifer Thompson
Authors (as registered SciPost users): Manuel Haussmann · Michel Luchmann · Tilman Plehn · Jennifer Thompson
Preprint Link: https://arxiv.org/abs/1904.10004v2
Date submitted: 2019-08-19 02:00
Submitted by: Luchmann, Michel
Submitted to: SciPost Physics
Bayesian neural networks allow us to keep track of uncertainties, for example in top tagging, by learning a tagger output together with an error band. We illustrate the main features of Bayesian versions of established deep-learning taggers. We show how they capture statistical uncertainties from finite training samples, systematics related to the jet energy scale, and stability issues through pile-up. Altogether, Bayesian networks offer many new handles to understand and control deep learning at the LHC without introducing a visible prior effect and without compromising the network performance.
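As a rough illustration of what "a tagger output together with an error band" can mean in practice, the sketch below draws weights from an assumed Gaussian posterior and reports the mean and spread of the resulting outputs for one jet. It is a toy logistic classifier, not the taggers studied in the paper; the function name `predict_with_uncertainty`, the weight values, and the two-feature jet are all hypothetical.

```python
import math
import random
import statistics


def sigmoid(x):
    """Logistic function mapping a logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_with_uncertainty(features, weight_means, weight_stds,
                             n_samples=200, seed=0):
    """Sample weights from independent Gaussians (an assumed, toy posterior)
    and return the mean and standard deviation of the classifier output."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # One weight draw corresponds to one "network" from the posterior.
        weights = [rng.gauss(m, s) for m, s in zip(weight_means, weight_stds)]
        logit = sum(w * x for w, x in zip(weights, features))
        outputs.append(sigmoid(logit))
    return statistics.fmean(outputs), statistics.stdev(outputs)


# Hypothetical posterior parameters and a hypothetical two-feature jet.
weight_means = [1.5, -0.8]
weight_stds = [0.3, 0.2]
jet = [0.9, 0.4]

mu, sigma = predict_with_uncertainty(jet, weight_means, weight_stds)
print(f"tagger output: {mu:.3f} +/- {sigma:.3f}")
```

The spread obtained this way reflects only the width of the assumed weight distribution, that is, an uncertainty tied to the training. It is not a calibration uncertainty derived from data/MC scale factors.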
Published as SciPost Phys. 8, 006 (2020)
Reports on this Submission
- Cite as: Anonymous, Report on arXiv:1904.10004v2, delivered 2019-11-27, doi: 10.21468/SciPost.Report.1342
Thank you for your responses and for v2! I have just two residual comments; hopefully they will be quick to address:
Comment on v1: Fig. 13b: I did not follow why there is a kink for the B-LoLa for 40 constituents.
Your answer: We looked into this in more detail. The sharp kink is a statistical effect. Training several times and including more points around 40 constituents leads to a smoother curve. We believe that the improvement starts at around 40 constituents because this is the point where the top and QCD jets of our samples start to differ significantly from the ones without pile-up. We haven't included this explanation in the paper.
My response: This seems like a reasonable answer to me - why not add a note about it in the paper?
Comment: Can you please explain how the error bar will be used in practice? Of course, the tagger can be calibrated using data and the uncertainty on the calibration will be the uncertainty on the NN - not the uncertainty from the BNN itself. It is interesting that you now have some measure of "confidence" in the NN decision, but it is not related to the tagger uncertainty.
Your answer: We added a sentence to the conclusion: "The standard way classifiers in HEP analyses are currently employed is to determine a working point and then assess the uncertainty on its signal efficiency and background rejection. A classifier with per-jet or per-event uncertainties automatically provides these as well but with additional information provided by the uncertainty which could be potentially included in a statistical analysis". This doesn't answer the question; rather, it states that we currently don't know. However, this doesn't mean there is no way to include the additional information we get from the jet-level uncertainty in an actual analysis. But this is probably beyond the scope of this paper.
My response: I think it is fine that you don't know how it will be used, but I don't agree with what you wrote. The uncertainty from the calibration on a tagger comes from the precision on the data/MC scale factors. What you have computed is a statistical "uncertainty" related to the training. These are not the same thing. Please reword.
Please see the report.