We explicitly construct the quantum field theory corresponding to a general class of deep neural networks encompassing both recurrent and feedforward architectures. We first consider the mean-field theory (MFT) obtained as the leading saddle point of the action, and derive the condition for criticality via the largest Lyapunov exponent. We then compute the loop corrections to the correlation function in a perturbative expansion in the ratio of depth T to width N, and find a precise analogy with the well-studied O(N) vector model, in which the variance of the weight initializations plays the role of the 't Hooft coupling. In particular, we compute both the O(1) corrections quantifying fluctuations from typicality in the ensemble of networks, and the subleading O(T/N) corrections due to finite-width effects. These provide corrections to the correlation length that controls the depth to which information can propagate through the network, and thereby sets the scale at which such networks are trainable by gradient descent. Our analysis provides a first-principles approach to the rapidly emerging NN-QFT correspondence, and opens several interesting avenues to the study of criticality in deep neural networks.
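As a rough numerical illustration of the criticality condition (a sketch of the standard mean-field analysis of random networks, not the paper's own code): for a random tanh network at initialization, the susceptibility chi = sigma_w^2 E[phi'(h)^2], evaluated at the fixed point of the preactivation-variance map, exponentiates the largest Lyapunov exponent of the MFT, and chi = 1 marks the critical point. The tanh nonlinearity and the weight/bias scales sigma_w, sigma_b below are illustrative assumptions.

```python
import numpy as np

# Hedged sketch, not the paper's code: the standard mean-field recursion
# for a random tanh network at initialization. The susceptibility
# chi = sigma_w^2 * E[phi'(h)^2], evaluated at the fixed-point preactivation
# variance q*, exponentiates the largest Lyapunov exponent of the MFT:
# chi < 1 is the ordered phase, chi > 1 the chaotic phase, chi = 1 criticality.

def fixed_point_q(sigma_w, sigma_b, n_iter=200, n_mc=200_000, seed=0):
    """Iterate the variance map q -> sigma_w^2 E[tanh(sqrt(q) z)^2] + sigma_b^2."""
    z = np.random.default_rng(seed).standard_normal(n_mc)
    q = 1.0
    for _ in range(n_iter):
        q = sigma_w**2 * np.mean(np.tanh(np.sqrt(q) * z) ** 2) + sigma_b**2
    return q

def chi(sigma_w, sigma_b):
    """chi = sigma_w^2 E[sech^4(h)] at the fixed point (phi = tanh, so phi' = sech^2)."""
    z = np.random.default_rng(1).standard_normal(200_000)
    q = fixed_point_q(sigma_w, sigma_b)
    return sigma_w**2 * np.mean(np.cosh(np.sqrt(q) * z) ** -4)
```

For sigma_b = 0 this recovers the known critical weight variance sigma_w = 1 for tanh networks: chi(0.5, 0) falls in the ordered phase (chi < 1), while chi(2.5, 0) lands in the chaotic phase (chi > 1). The correlation length discussed in the abstract diverges as chi approaches 1.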
Affiliations:
- 1 Max Planck Institute for the Physics of Complex Systems
- 2 Lorentz Institute (Instituut Lorentz)
- 3 Nordic Institute for Theoretical Physics [NORDITA]
Funding:
- Deutsche Forschungsgemeinschaft [DFG] (German Research Foundation)
- Horizon 2020 (European Commission [EC])