SciPost Submission Page

Proliferation of non-linear excitations in the piecewise-linear perceptron

by Antonio Sclocchi, Pierfrancesco Urbani

This is not the latest submitted version.

This Submission thread is now published as

Submission summary

Authors (as registered SciPost users): Antonio Sclocchi
Submission information
Preprint Link: https://arxiv.org/abs/2010.10253v1  (pdf)
Date submitted: 2020-10-27 13:46
Submitted by: Sclocchi, Antonio
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • Statistical and Soft Matter Physics
Approaches: Theoretical, Computational

Abstract

We investigate the properties of local minima of the energy landscape of a continuous non-convex optimization problem, the spherical perceptron with piecewise linear cost function, and show that they are critical and marginally stable and display a set of pseudogaps, singularities and non-linear excitations whose properties appear to be in the same universality class as jammed packings of hard spheres. The piecewise linear perceptron problem appears as an evolution of the purely linear perceptron optimization problem that has been recently investigated in [1]. Its cost function contains two non-analytic points where the derivative has a jump. Correspondingly, in the non-convex/glassy phase, these two points give rise to four pseudogaps in the force distribution and this induces four power laws in the gap distribution as well. In addition one can define an extended notion of isostaticity and show that local minima appear again to be isostatic in this phase. We believe that our results generalize naturally to more complex cases with a proliferation of non-linear excitations as the number of non-analytic points in the cost function is increased.

Current status:
Has been resubmitted

Reports on this Submission

Report #2 by Anonymous (Referee 2) on 2020-12-06 (Invited Report)

  • Cite as: Anonymous, Report on arXiv:2010.10253v1, delivered 2020-12-06, doi: 10.21468/SciPost.Report.2256

Strengths

1. The authors study an interesting variant of the spherical perceptron problem in a non-convex / glassy regime, as a model for jamming of hard spheres in finite/low dimension.

2. They reinforce a previous discovery that several criticality properties associated with the jamming (SAT/UNSAT) transition are carried by the system in the entire jammed non-convex phase.

3. They investigate the link between the number of singularity points of the cost function and the form of isostaticity of the model in the jammed phase, and identify power laws in the gap and force distributions in the vicinity of the singularities.

Weaknesses

1. The study is not accompanied by a supporting theory. The authors claim that their results can be backed by replica computations, but they do not perform said computations. This being said, their claim is credible given earlier theoretical work on the linear perceptron in ref [1].

2. In absence of convexity, the properties of the local minima found are in principle very algorithm-specific. A skeptic would claim that all the results in the paper are specific to the BFGS algorithm (the algorithm used by the authors), and are far from describing ground states...

Report

Summary:

The authors study an interesting variant of the spherical perceptron problem in a non-convex / glassy regime where the cost function to be minimized is piecewise linear with two singularities. Previous work (ref [1]) studied the 'linear' perceptron model where the cost function is max(-x,0), i.e., has one singularity at zero, and discovered that the system is (almost) isostatic in the entire RSB region of the jammed phase, in the sense that the number of marginally satisfied constraints (i.e., where the gaps are equal to zero) is N(1 - o_N(1)).

This paper considers a piecewise linear cost function with two singularities: one at 0 and one at -H_0. The authors make a similar empirical observation: there are about N gaps equal to either -H_0 or 0 in the RSB region of the jammed phase. An interpretation is that the system is isostatic only globally. (Note that the fluctuations in the above two groups must then be anticorrelated.)
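As a concrete picture of the two cost functions compared above, here is a minimal sketch. Eq. (2) of the paper is not reproduced on this page, so the form below is an assumption: it is chosen only to match the kink positions 0 and -H_0 and the ramp slopes -1 and -2 mentioned in Report #1, with H_0 = 0.3 as quoted there. The names `cost` and `force` are hypothetical.

```python
H0 = 0.3  # assumed kink position, taken from the Fig. 1 description in Report #1


def cost(h, h0=H0):
    """Assumed piecewise-linear cost: 0 for h >= 0, slope -1 on (-h0, 0), slope -2 below -h0.

    Setting h0 to infinity recovers the purely linear case max(-h, 0).
    """
    return max(-h, 0.0) + max(-h - h0, 0.0)


def force(h, h0=H0):
    """Force -v'(h) for the assumed cost; it jumps by 1 at h = 0 and at h = -h0.

    Exactly at a kink the force is not fixed by the slope (it can take any
    value in the jump interval); this helper arbitrarily returns the
    left-hand value there.
    """
    if h > 0.0:
        return 0.0
    if h > -h0:
        return 1.0
    return 2.0
```

For instance, a gap of -0.5 lies below both kinks, so it costs 0.5 from the first ramp plus 0.2 from the second. The freedom of the force at each kink is what allows a finite fraction of gaps to sit exactly at 0 or -H_0.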

Moreover, the distribution of gaps, in addition to having a Dirac delta at each singularity, is found to exhibit four power laws, one on each side of each singularity. Experiments show that the exponent governing these power laws is the same, and is given by the power law found at the jamming transition! The authors also study the distribution of forces at the contact points (i.e., the Lagrange multipliers of the gaps) and find similar power laws.
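To illustrate how such an exponent can be read off from gap data, the sketch below draws synthetic distances from a kink with density proportional to x^(-gamma) on (0, 1] and recovers gamma by maximum likelihood. This is not the authors' procedure or data: the samples are artificial, the function names are hypothetical, and the value gamma ≈ 0.41269 is the commonly quoted mean-field jamming exponent, used here as an assumption since this page does not state it.

```python
import math
import random

GAMMA_JAMMING = 0.41269  # assumed reference value, not taken from this page


def sample_power_law_gaps(n, gamma, rng):
    """Draw n values on (0, 1] with density (1 - gamma) * x**(-gamma) by inverse-CDF sampling."""
    # 1 - rng.random() lies in (0, 1], which avoids log(0) in the estimator below.
    return [(1.0 - rng.random()) ** (1.0 / (1.0 - gamma)) for _ in range(n)]


def estimate_exponent(distances):
    """Maximum-likelihood estimate of gamma for the density above.

    Setting the derivative of the log-likelihood to zero gives
    gamma = 1 + n / sum(log x), where the sum of logs is negative.
    """
    total_log = sum(math.log(x) for x in distances)
    return 1.0 + len(distances) / total_log


rng = random.Random(0)  # fixed seed so the synthetic experiment is reproducible
samples = sample_power_law_gaps(200_000, GAMMA_JAMMING, rng)
gamma_hat = estimate_exponent(samples)
```

Applying the same estimator separately to the distances on each side of each kink would give four estimates, which is the comparison the statement above refers to.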

Comments:

This paper is an interesting empirical study on the nature of jamming criticality. The authors use the perceptron model as a good mean-field proxy for hard spheres in low dimension (this model is also worth studying for its own sake, as a mathematical model of high-dimensional random geometry).

The paper makes several interesting findings, all of which deserve further study. The finding that the system distributes an isostatic number of gaps on the singularity points of the cost function seems to be new, and is quite intriguing.

Requested changes

1. As a non-expert in this area, I found the paper relatively easy to read and understand. However, I did not understand the claims about 'proliferation of non-linear excitations', 'avalanches and crackling noises' and their relationship to the power laws found in the gap distribution. Perhaps the authors can be more pedantic and expand further on this.

2. The notion of isostaticity deserves to be defined more precisely. The claims in the paper seem to be only approximate: c_0 + c_{-H_0} = 1 - o_N(1), as opposed to the stronger claim that the number of marginally satisfied constraints is *exactly* equal to N (or N-1). The authors can perhaps clarify which form they are claiming.

3. See also points raised in Weaknesses (derivation of critical exponents from replica theory, the algorithmic issue).
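The distinction raised in point 2 can be made concrete with a small counting sketch. It assumes that c_0 and c_{-H_0} denote the numbers of gaps sitting at 0 and at -H_0, each divided by N, and that a gap counts as sitting at a kink when it lies within a numerical tolerance of it. The function names and the tolerance are hypothetical and are not taken from the paper.

```python
def contact_fractions(gaps, n_dim, h0, tol=1e-9):
    """Return (c_0, c_{-H_0}): the number of gaps at each kink, divided by N."""
    at_zero = sum(1 for h in gaps if abs(h) <= tol)
    at_minus_h0 = sum(1 for h in gaps if abs(h + h0) <= tol)
    return at_zero / n_dim, at_minus_h0 / n_dim


def isostaticity_defect(gaps, n_dim, h0, tol=1e-9):
    """Return 1 - (c_0 + c_{-H_0}).

    Exact isostaticity would give 0 (or 1/N if the count is N - 1) at every N;
    the approximate form only requires this quantity to vanish as N grows.
    """
    c_zero, c_minus_h0 = contact_fractions(gaps, n_dim, h0, tol)
    return 1.0 - (c_zero + c_minus_h0)
```

Reporting this defect as a function of N would show directly which of the two forms holds in the simulations.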

  • validity: high
  • significance: high
  • originality: high
  • clarity: high
  • formatting: good
  • grammar: good

Report #1 by Thibaud Maimbourg (Referee 1) on 2020-11-19 (Invited Report)

  • Cite as: Thibaud Maimbourg, Report on arXiv:2010.10253v1, delivered 2020-11-19, doi: 10.21468/SciPost.Report.2213

Strengths

1- The results deepen the understanding of jamming criticality in the overcompressed phase due to discontinuous forces.
2- The paper is clear and succinct.
3- The numerical results are convincing.
4- A plausible application to non-mean-field models (interacting spheres in 2D or 3D) is discussed.

Weaknesses

1- A weakness contained in the related strength. The paper builds on previous studies. Theoretical arguments are given as a transposition of the purely linear case without details and the numerical procedure is explained in previous works. As a consequence, the paper requires knowledge of the related works, which arguably is a right choice here, provided minor adjustments.
2- The figures' legibility and relation to the text could be improved.

Report

In this paper, the authors build on studies from recent years, by themselves and others, to investigate the consequences of non-analyticities in the cost function of the spherical perceptron optimization problem. Previous results have indeed shown that for a case of continuous forces $-v'(h)$, jamming criticality is somewhat fragile (due to null forces at contact) but its peculiar properties (related to isostaticity) get extended considering a linear potential (i.e. a finite sustaining force at zero gap). The paper studies the effects of additional such discontinuities of the forces (i.e. a piecewise linear potential at negative gap). In previous studies the non-analyticity of the potential was only located at zero gap, and contacts could be defined as constraints that are marginally unsatisfied. Here additional force discontinuities inside the region of unsatisfiability of the constraints lead to a definition of several "contact" locations (at each discontinuity), a notion of stability represented by the sum rule Eq. (8), corresponding to isostaticity when marginally satisfied, and leading to several delta peaks of the gap distribution surrounded by power laws, as well as four pseudogaps in the contact forces. The paper ends with a qualitative discussion of the corresponding emergent excitations, generalizations of the model, a plausible replica solution and application to interacting spheres.
The main tools are numerical simulations (using an algorithm developed in Ref.10 for a purely linear cost function at negative gaps) and theoretical arguments (a transposition of the pure linear case previously studied).

I found the paper well written and interesting. It makes a stronger point for non-convexity and non-analyticity of the cost function as ingredients for jamming phenomenology, which suggests a direction for future research. Considering the paper's strengths and weaknesses, minor modifications are required, mostly of informative character to the reader and to improve readability of the paper.

Requested changes

1- What is meant by convexity should be defined early on (convexity of the cost function with respect to certain variables).
Similarly, isostaticity should be defined early in the text. The first appearance of minima in the text (properly defined in the abstract however) should be (re)defined as minima of the cost function / energy landscape.

2- As was correctly done for Fig. 4, the other figures should indicate the parameters of the model instead of only numerical values. Namely, in Fig. 1, there should be a label $-H_0$ on the $h$ axis at -0.3 with a vertical dashed line indicating the discontinuity and labels indicating the slopes -1 or -2 of the ramps. This is to give a visual representation to Eq. (2) as was intended by the authors. In Fig. 2 a $\sigma_{\rm convex}$ or $\Delta\sigma_{\rm convex}=\sigma_{\rm convex}-\sigma_J$ (or any better name that the authors see fit) should be defined, its numerical value in the present case given and labelled on the horizontal axis. This could be used in the text when talking about the topology trivialization transition. In Fig. 3 this label should be used again on the horizontal axis, perhaps with a vertical dashed line to indicate the isostaticity breakdown. In Fig. 5, minus signs are missing in the legends (they should read $h\sim -H_0$), and the horizontal axis should be labelled as some $\Delta h$, which could be defined in the caption.

3- In the second paragraph (p.1), what is meant by excitations "richer in nature"?

4- In the next paragraph below Eq.(5) about the sign of $\mu$, a connection to Eq.(7) should be made.

5- Information about the number of patterns numerically sampled must be included.

6- Concerning Eq. (7), and referring to Eq.(3) of Ref.1, for the reader I guess it would be clearer to substitute $\cal H_{ij}$ by $\varepsilon\frac{\partial^2\cal L_\varepsilon}{\partial x_i\partial x_j}$ where $\cal L_\varepsilon$ could be explained as the regularized version of Eq.(3) of the present paper. Appendix B of Ref. 9 could be cited.

7- Could the authors comment on the fact that the exponents $\gamma$ and $\theta$ in Eqs. (11) are the same on the four sides of the contacts? Besides, in Fig. 5, is there a physical reason for the blue and violet curves to almost coincide? Could it be a consequence of the fact that these gaps belong to $\cal O_=$? Does it imply the relation $A_0^-=A_{H_0}^+$ for the prefactors? (The very small $\Delta h$ points may not support this, but it is hard to tell due to the error bars.) Same question for the red and black curves. Is the discontinuity in the violet curve related to starting to sample gaps around $\Delta h=h+H_0=H_0=0.3$, i.e. $h=0$, meaning gaps closer to the other contact at $h=0$? Similarly, Fig. 7 seems to indicate equal prefactors in Eq. (11); would there be a physical reason for this?

8- In the discussion, reference(s) should be added concerning the possibility of localized non-linear excitations in spheres, and related works should be cited in Ref.18 (if the authors agree on their relevance), such as
M. Geiger, S. Spigler, S. d’Ascoli, L. Sagun, M. Baity-Jesi, G. Biroli and M. Wyart, Phys. Rev. E 100, 012115 (2019)
H. Yoshino, SciPost Phys. Core 2, 005 (2020)

9- In the final sentence it is unclear which other types of non-analyticities are referred to. It seems the present phenomenology is linked to discontinuities in the forces, i.e. a discontinuity in the first-order derivative. All higher-order singularities seem irrelevant. Maybe the authors have in mind something like a square root singularity (infinite force in a point)?

10- Some typos should be fixed by careful reading. Examples are:
In Eq. (4) the optimization on $f_\mu$ should display $h_\mu=-H_0$ in $\mathcal C_{H_0}$ as well.
The first sentence in second column of p.2 can be simplified.
The authors can search for occurrences of "contatcs", "oder", "incresing".

11- This comment is not properly speaking a requested change, although the answers could be included in the paper if the authors deem it relevant.
Do the authors expect any influence of the inner discontinuity at $-H_0$ on the phase diagram of the model with respect to the phase diagram derived in the pure linear case? It seems not as far as I can tell in the text. Is there in this respect a role played by the unsatisfied constraints which are not "contacts", i.e. in $\cal O_{<,=}$ ? They are present in the optimization of Eq. (4).

  • validity: top
  • significance: high
  • originality: high
  • clarity: high
  • formatting: excellent
  • grammar: good
