SciPost Phys. Core 2, 005 (2020) · published 15 April 2020
We develop a statistical mechanical approach based on the replica method to
study the design space of deep and wide neural networks constrained to meet a
large number of training data. Specifically, we analyze the configuration space
of the synaptic weights and neurons in the hidden layers in a simple
feed-forward perceptron network for two scenarios: a setting with random
inputs/outputs and a teacher-student setting. As the constraints are
strengthened, i.e. as the number of training data increases, successive
second-order glass transitions (random inputs/outputs) or second-order
crystalline transitions (teacher-student setting) take place layer by layer,
starting next to the input/output boundaries and proceeding deeper into the
bulk, with the thickness of the solid phase growing logarithmically with the
data size. This implies that the
typical storage capacity of the network grows exponentially fast with the
depth. In a deep enough network, the central part remains in the liquid phase.
We argue that in systems of finite width $N$, a weak bias field can remain in
the center and play the role of a symmetry-breaking field that connects the
opposite sides of the system. The successive glass transitions bring about a
hierarchical free-energy landscape with ultrametricity, which evolves in space:
it is most complex close to the boundaries but is renormalized into
progressively simpler landscapes in deeper layers. These observations provide clues
to understand why deep neural networks operate efficiently. Finally, we present
some numerical simulations of learning which reveal spatially heterogeneous
glassy dynamics truncated by an effect of the finite width $N$.
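
The two training scenarios named in the abstract above can be illustrated with a minimal sketch. It is not the paper's code. The sign activation, the $1/\sqrt{N}$ scaling, the Gaussian weights, the equal width in every layer, and all function names are assumptions made only for illustration.

```python
import numpy as np

def sign(x):
    # Map pre-activations to +/-1 (zero is treated as +1; an arbitrary choice).
    return np.where(x >= 0, 1, -1)

def forward(weights, s0):
    # Propagate the columns of s0 (one pattern per column) through all layers.
    s = s0
    for J in weights:
        n = J.shape[1]
        s = sign(J @ s / np.sqrt(n))  # assumed 1/sqrt(N) scaling
    return s

def random_network(depth, width, rng):
    # Hypothetical helper: `depth` layers of width x width Gaussian weights.
    return [rng.standard_normal((width, width)) for _ in range(depth)]

def random_io_data(width, n_patterns, rng):
    # Scenario 1: inputs and outputs are independent random +/-1 patterns.
    x = rng.choice([-1, 1], size=(width, n_patterns))
    y = rng.choice([-1, 1], size=(width, n_patterns))
    return x, y

def teacher_student_data(teacher, width, n_patterns, rng):
    # Scenario 2: outputs are produced by a fixed "teacher" network.
    x = rng.choice([-1, 1], size=(width, n_patterns))
    return x, forward(teacher, x)

def fraction_satisfied(network, x, y):
    # Fraction of output bits that the network reproduces on the data set.
    return float(np.mean(forward(network, x) == y))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = random_network(depth=4, width=16, rng=rng)
    student = random_network(depth=4, width=16, rng=rng)
    x, y = teacher_student_data(teacher, 16, 32, rng)
    print("teacher on its own data:", fraction_satisfied(teacher, x, y))
    print("untrained student:", fraction_satisfied(student, x, y))
```

In this sketch, raising `n_patterns` at fixed `width` corresponds to strengthening the constraints. In the teacher-student setting the teacher satisfies every constraint by construction, whereas in the random inputs/outputs setting no such solution is guaranteed to exist.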
SciPost Phys. 4, 040 (2018) · published 26 June 2018
We construct and analyze a family of $M$-component vectorial spin systems
which exhibit glass transitions and jamming within supercooled paramagnetic
states without quenched disorder. Our system is defined on lattices with
connectivity $c=\alpha M$ and becomes exactly solvable in the limit of a large
number of components $M \to \infty$. We consider generic $p$-body interactions
between the vectorial Ising/continuous spins with linear/non-linear potentials.
The existence of self-generated randomness is demonstrated by showing that the
random energy model is recovered from an $M$-component ferromagnetic $p$-spin
Ising model in the $M \to \infty$ and $p \to \infty$ limit. In our systems the
quenched disorder, if present, and the self-generated disorder act additively.
Our theory provides a unified mean-field theoretical framework for glass
transitions of rotational degrees of freedom such as the orientation of molecules
in glass forming liquids, color angles in continuous coloring of graphs and
vector spins of geometrically frustrated magnets. The rotational glass
transitions are accompanied by various types of replica symmetry breaking. In
the case of continuous spins with repulsive hardcore interactions in the spin
space, the criticality of the jamming or SAT/UNSAT transition becomes the same
as that of hard spheres.
Prof. Yoshino: "In my response to << report 1>..."
in Submissions | submission on From complex to simple : hierarchical free-energy landscape renormalized in deep neural networks by Hajime Yoshino