SciPost Submission Page

A guide for deploying Deep Learning in LHC searches: How to achieve optimality and account for uncertainty

by Benjamin Nachman

Submission summary

As Contributors: Benjamin Nachman
arXiv Link: https://arxiv.org/abs/1909.03081v1
Date submitted: 2019-10-02
Submitted by: Nachman, Benjamin
Submitted to: SciPost Physics
Discipline: Physics
Subject area: High-Energy Physics - Phenomenology
Approach: Experimental

Abstract

Deep learning tools can incorporate all of the available information into a search for new particles, thus making the best use of the available data. This paper reviews how to optimally integrate information with deep learning and explicitly describes the corresponding sources of uncertainty. Simple illustrative examples show how these concepts can be applied in practice.

Current status:
Editor-in-charge assigned


Submission & Refereeing History

Submission 1909.03081v1 on 2 October 2019

Reports on this Submission

Anonymous Report 1 on 2019-11-05 (Invited Report)

Strengths

1-Addresses relevant systematics related to the application of neural networks to (high energy) physics data analysis and hypothesis testing.

2-Provides simple yet clear examples for all the concepts introduced and uses these to illustrate and compare traditional and supervised machine learning approaches to likelihood estimation.

Weaknesses

1-The main concepts of "Deep Learning" and "modern machine learning" are not clearly and unambiguously defined. In the paper they are used synonymously with neural networks used for likelihood approximation, which is a much narrower concept. A more accurate approach would be to use the term "neural networks" consistently throughout the paper, including in the title.

2-In the discussion of the uncertainties on the neural network inputs on page 12, it is claimed that if these are well modeled by the Monte Carlo simulation, there is no residual source of systematic uncertainty when such inputs are used to train a neural network likelihood approximator. However, in practice there is always some degree of mismodeling of the input uncertainty distributions. From the existing discussion it is not clear what impact such mismodeling can have on the neural network outputs, how to estimate it, or how it might be mitigated. For comparison, in the case of a simple likelihood ratio, the impact of an over- or underestimated input uncertainty is transparent and can be evaluated using e.g. nuisance parameter estimation.

Report

In the submitted manuscript the author addresses relevant systematics in the application of neural networks to (high energy) physics data analysis and hypothesis testing, in particular those related to optimality and systematic uncertainty. Using simple yet clear examples for all the concepts introduced, the paper illustrates and compares traditional and supervised machine learning approaches to likelihood estimation in high energy physics, highlighting potential sources of sub-optimality and systematic bias.

The paper is timely and very relevant in light of the recent advances in, and proliferation of, machine learning approaches to likelihood estimation in high energy physics.

However, the paper could be improved in terms of clarity and conciseness. In particular, the main concepts of "Deep Learning" and "modern machine learning" are not clearly and unambiguously defined. In the paper they are used synonymously with neural networks used for likelihood approximation, which is a much narrower concept. A more accurate approach would be to use the term "neural networks" consistently throughout the paper, including in the title.

Perhaps a more important weakness of the present discussion concerns the uncertainties on the neural network inputs. On page 12 it is claimed that if these are well modeled by the Monte Carlo simulation, there is no residual source of systematic uncertainty when such inputs are used to train a neural network likelihood approximator. However, in practice there is always some degree of mismodeling of the input uncertainty distributions. From the existing discussion it is not clear what impact such mismodeling can have on the neural network outputs, how to estimate it, or how it might be mitigated. For comparison, in the case of a simple likelihood ratio, the impact of an over- or underestimated input uncertainty is transparent and can be evaluated using e.g. nuisance parameter estimation.
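To make the comparison concrete, here is a minimal sketch of the standard profile likelihood construction alluded to above; it is not taken from the manuscript, and $\theta_0$ and $\sigma_\theta$ are illustrative placeholders for the nominal value and assumed width of the uncertain input:

$$
\lambda(\mu) = \frac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(\hat{\mu}, \hat{\theta}\big)},
\qquad
L \supset \exp\!\left(-\frac{(\theta - \theta_0)^2}{2\sigma_\theta^2}\right),
$$

where $\mu$ is the signal strength, $\theta$ is a nuisance parameter encoding the uncertain input, $\hat{\hat{\theta}}(\mu)$ maximizes $L$ at fixed $\mu$, and $(\hat{\mu}, \hat{\theta})$ is the global maximum. An over- or underestimated input uncertainty enters transparently through the assumed constraint width $\sigma_\theta$; no analogous handle is immediately available for a trained neural network approximator.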

Finally, there are several minor issues with referencing and definitions:

-Reference [13] lacks scientific rigor. The author should at least provide a general reference to a recent review or a set of foundational papers.

-$CL_{S+B}$ and $CL_{B}$ below Eq. (3.2) on page 4 are not defined (the standard definitions are sketched after this list for reference).

-In the sentence starting on line 2 of page 13 with "One could apply...", the term "toys" is not defined.
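
For reference, the standard definitions behind the two quantities flagged above (quoted here from the usual $CL_S$ convention, not from the manuscript) are, for a test statistic $q$ with observed value $q_{\rm obs}$ and an ordering in which larger $q$ is more background-like:

$$
CL_{S+B} = P(q \geq q_{\rm obs} \mid s+b),
\qquad
CL_{B} = P(q \geq q_{\rm obs} \mid b),
\qquad
CL_{S} = \frac{CL_{S+B}}{CL_{B}}.
$$

In practice these tail probabilities are often estimated with "toys", i.e. pseudo-experiments sampled from the corresponding hypothesis, which is also the sense in which the term appears on page 13.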

Requested changes

1-The main concepts of "Deep Learning" and "modern machine learning" should be clearly and concisely defined within the narrower scope used in the present paper, i.e. neural networks as likelihood approximators, addressing weakness 1.

2-Reference [13] lacks scientific rigor. The author should at least provide a general reference to a recent review or a set of foundational papers.

3-$CL_{S+B}$ and $CL_{B}$ below Eq. (3.2) on page 4 are not defined.

4-The discussion of the uncertainties on the neural network inputs and their impact on the likelihood estimation should be extended, addressing weakness 2.

5-In the sentence starting on line 2 of page 13 with "One could apply...", the term "toys" is not defined.

  • validity: good
  • significance: high
  • originality: good
  • clarity: high
  • formatting: good
  • grammar: excellent
