CapsNets Continuing the Convolutional Quest

Sascha Diefenbacher; Hermann Frost; Gregor Kasieczka; Tilman Plehn; Jennifer M. Thompson

SciPost Submission Page

This Submission thread is now published as

SciPost Phys. 8, 023 (2020)

Submission summary

Authors (as registered SciPost users):

Sascha Diefenbacher · Tilman Plehn · Jennifer Thompson

Submission information
Preprint Link:	https://arxiv.org/abs/1906.11265v3 (pdf)
Date accepted:	Dec. 2, 2019
Date submitted:	Nov. 4, 2019, 1 a.m.
Submitted by:	Diefenbacher, Sascha
Submitted to:	SciPost Physics

Ontological classification
Academic field:	Physics
Specialties:	High-Energy Physics - Phenomenology
Approaches:	Theoretical, Experimental

Abstract

Capsule networks are ideal tools to combine event-level and subjet information at the LHC. After benchmarking our capsule network against standard convolutional networks, we show how multi-class capsules extract a resonance decaying to top quarks from both the QCD di-jet and the top continuum backgrounds. We then show how its results can be easily interpreted. Finally, we use associated top-Higgs production to demonstrate that capsule networks can work on overlaying images to go beyond calorimeter information.

Author comments upon resubmission

We would like to thank the referee for their interest and careful reading of our paper, as well as their useful comments. We have clarified some points in our paper and addressed the points the referee has raised. Please find attached our list of changes.

List of changes

1 - Add discussion of differences between original capsule implementation and the current method. - Both implementations are, by design, identical. To clarify this, we added "Analogous to the original capsule paper, we transition between convolutional and capsule part by re-shaping..." in Section 4.

2 - Add information about how many routings were used. - We used 3 routings, as was shown to be optimal in other studies. We have rephrased a sentence to reflect this: "We repeated this for a chosen number of routings, where three iterations have in other studies given the best results" now reads "We repeated this for 3 routings, which has been shown in other studies to give the best results".

3 - More information about the preprocessing. The authors mention that CapsNets need less preprocessing, and it sounds like only scaling the images so the most intense pixel has a value of 1.0 was done. Is this the same for the benchmarks of the Rutgers DeepTop Taggers as done here? - We have added: "In contrast to the minimal pre-processing we use for the event image capsule network, for the Rutgers tagger and the jet image capsule network we employ the full pre-processing for each jet as described in Ref. [10]. The jets are selected and centered around the $p_T$ weighted centroid of the jet, and rotated such that the major principal axis is vertical. The image is then flipped to ensure that the maximum activity is in the upper-right-hand quadrant. Finally, the images are pixelated and normalized."

4 - Compare CapsNets to networks of similar architecture but without the Capsules for the "Pooling CapsNets" architecture of Figure 6. - We have now made this comparison, and have included the plot in a response to the report. We observe a small but persistent increase in performance by using capsule networks over a dense network with a similar architecture. This comes along with the advantage of having the capsule vectors themselves, which provide a window into how the network is making decisions.

5 - Consider adding a W' signal to see how CapsNets deal with a signal which has some substructure signals and similar kinematics. While pre-selection should be able to deal with some of the differences here, it would be interesting for the study of the CapsNets themselves. - A W' analysis would be an interesting new application for our network, but it would be a whole project in itself and falls outside the scope of our current publication.

6 - Some mention of how the results of [43] compare to the $t\bar{t}H$ classifier used here. - We have added "For this set-up we find comparable performance to Ref. [43], with an AUC of 0.715, which is slightly above their upper limit."

7 - [Optional] Publicly available code or code snippets. - Unfortunately, we are unable to dedicate the time to make a public code useful to the community.
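The jet pre-processing steps quoted in item 3 (centering, rotation, flipping, pixelation, normalization) can be illustrated with a minimal NumPy sketch. This is not the authors' code, which was not released. The function name `preprocess_jet`, the 40x40 grid, the image half-width of 1.0, and the choice to normalize so that the most intense pixel equals 1 are all assumptions made for illustration, and the sketch further assumes that `phi` has already been unwrapped relative to the jet axis.

```python
import numpy as np


def preprocess_jet(eta, phi, pt, n_pixels=40, half_width=1.0):
    """Hypothetical helper: build a jet image from constituent (eta, phi, pT)."""
    eta, phi, pt = (np.asarray(a, dtype=float) for a in (eta, phi, pt))

    # 1) Centre on the pT-weighted centroid of the jet.
    x = eta - np.average(eta, weights=pt)
    y = phi - np.average(phi, weights=pt)

    # 2) Rotate so that the major principal axis points along y (vertical).
    coords = np.vstack([x, y])                    # shape (2, n_constituents)
    cov = (coords * pt) @ coords.T / pt.sum()     # pT-weighted 2x2 covariance
    _, eigvecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    major = eigvecs[:, -1]                        # direction of largest spread
    angle = np.arctan2(major[0], major[1])        # angle measured from the y axis
    c, s = np.cos(angle), np.sin(angle)
    x, y = np.array([[c, -s], [s, c]]) @ coords

    # 3) Flip so that the quadrant with the most pT becomes the upper-right one.
    best = max(
        ((sx, sy) for sx in (1, -1) for sy in (1, -1)),
        key=lambda f: pt[(f[0] * x >= 0) & (f[1] * y >= 0)].sum(),
    )
    x, y = best[0] * x, best[1] * y

    # 4) Pixelate: pT-weighted 2D histogram, indexed as image[x_bin, y_bin].
    edges = [[-half_width, half_width]] * 2
    image, _, _ = np.histogram2d(x, y, bins=n_pixels, range=edges, weights=pt)

    # 5) Normalize (assumed convention: most intense pixel set to 1).
    peak = image.max()
    return image / peak if peak > 0 else image
```

Calling `preprocess_jet(eta, phi, pt)` on the constituents of one jet returns a square array that could be fed to an image-based tagger. Constituents falling outside the assumed image window are simply dropped by the histogram.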

Published as SciPost Phys. 8, 023 (2020)

Reports on this Submission

Report #1 by Anonymous (Referee 1) on 2019-11-14 (Invited Report)

Report

The authors have performed the checks which I asked for. I am now satisfied and happy to recommend this paper for publication.

