SciPost Submission Page

Nonequilibrium physics of generative diffusion models

by Zhendong Yu, Haiping Huang

Submission summary

Authors (as registered SciPost users): Haiping Huang
Submission information
Preprint Link: https://arxiv.org/abs/2405.11932v1
Date submitted: 2024-05-22 11:50
Submitted by: Huang, Haiping
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
  • Statistical and Soft Matter Physics
Approaches: Theoretical, Computational

Abstract

Generative diffusion models apply the concept of Langevin dynamics in physics to machine learning, attracting a lot of interest from industrial applications, but a complete picture of the inherent mechanisms is still lacking. In this paper, we provide a transparent physics analysis of the diffusion models, deriving the fluctuation theorem, entropy production, and Franz-Parisi potential to understand the intrinsic phase transitions discovered recently. Our analysis is rooted in non-equilibrium physics and concepts from equilibrium physics, i.e., treating both forward and backward dynamics as Langevin dynamics, and treating the reverse diffusion generative process as statistical inference, where the time-dependent state variables serve as the quenched disorder studied in spin glass theory. This unified principle is expected to guide machine learning practitioners to design better algorithms and theoretical physicists to link machine learning to non-equilibrium thermodynamics.

Author indications on fulfilling journal expectations

  • Provide a novel and synergetic link between different research areas.
  • Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
  • Detail a groundbreaking theoretical/experimental/computational discovery
  • Present a breakthrough on a previously-identified and long-standing research stumbling block
Current status:
In refereeing

Reports on this Submission

Anonymous Report 2 on 2024-7-22 (Invited Report)

Strengths

It may contain some original results.

Weaknesses

Not clear.

Technical language not always appropriate.

Report

The innovative results of this paper are not clear, and because of
that I do not believe it should be published as it stands. A complete
revision and rewriting could maybe lead to an interesting note, but
that should be checked.

The language of the paper is very "mixed", and sometimes even symbols
that are not of common usage in physics are used without clear
definitions and exhaustive explanations.

Many concepts are brought together here, and they do not always mix
well. Has anyone discussed in detail how GDMs work for spin glasses? The
introduction of an environment including spin glasses, non-equilibrium
thermodynamics, and machine learning is at best, I believe, confusing.

It is not clear what the original, innovative results of this
research are.

Recommendation

Ask for major revision

  • validity: low
  • significance: low
  • originality: low
  • clarity: poor
  • formatting: good
  • grammar: good

Anonymous Report 1 on 2024-7-10 (Invited Report)

Strengths

1. The paper provides a rather clear exposition of some non-equilibrium physics of stochastic processes, focusing mostly on Ornstein-Uhlenbeck processes and the time-reversed process thereof, which naturally arise in diffusion-based machine learning models.

2. The authors illustrate some of the derived quantities (such as the effective potential, entropy flux, entropy production) on a binary Gaussian mixture model. The accompanying plots (mostly in the one-dimensional case) are in general clear.

3. The authors collect some insights on spontaneous symmetry breaking in the reverse process, as explored in a number of previous investigations in the literature, and discuss consequences for diffusion generative models.
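
For readers unfamiliar with the setting these points refer to, the following is a minimal sketch, not the authors' code. It assumes a one-dimensional forward Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW and a symmetric binary Gaussian mixture with modes at ±mu and width sigma as the data distribution; the parameter values and the function names `score` and `reverse_sample` are illustrative choices. Under these assumptions the score of the noised distribution is known in closed form, so the time-reversed process can be integrated with a plain Euler-Maruyama scheme.

```python
import numpy as np


def score(x, t, mu=1.0, sigma=0.2):
    """Exact score d/dx log p_t(x) of the noised binary Gaussian mixture.

    Assumes data ~ 0.5 N(+mu, sigma^2) + 0.5 N(-mu, sigma^2) and the forward
    OU process dX = -X dt + sqrt(2) dW, so that p_t is again a symmetric
    two-component mixture with means +/- m and common variance s2.
    """
    m = mu * np.exp(-t)                                   # mode position at time t
    s2 = sigma**2 * np.exp(-2 * t) + 1 - np.exp(-2 * t)   # component variance at time t
    return (-x + m * np.tanh(m * x / s2)) / s2


def reverse_sample(n=5000, T=5.0, steps=1000, mu=1.0, sigma=0.2, seed=0):
    """Integrate the time-reversed process from t = T down to t close to 0.

    In backward time tau = T - t the reverse dynamics read
        dY = [Y + 2 * score(Y, T - tau)] dtau + sqrt(2) dW,
    discretised here with Euler-Maruyama.
    """
    rng = np.random.default_rng(seed)
    dt = T / steps
    # For large T the forward process is close to its stationary law N(0, 1).
    y = rng.standard_normal(n)
    for k in range(steps):
        t = T - k * dt                                    # forward time of the current state
        drift = y + 2.0 * score(y, t, mu, sigma)
        y = y + drift * dt + np.sqrt(2 * dt) * rng.standard_normal(n)
    return y
```

Calling `reverse_sample()` returns samples that should cluster around the two modes, with each trajectory committing to one sign as the backward time advances; this is the symmetry breaking in the reverse process mentioned in point 3. The step size matters near the end of the integration, where the drift becomes stiff for small sigma.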

Weaknesses

1. A large portion of the derivations discussed is devoted to the non-equilibrium thermodynamics of Ornstein-Uhlenbeck processes and their time reversal, as it appears in previously cited works such as [Seifert, 2012], with somewhat insufficient discussion devoted to connecting it to diffusion generative models, beyond a review of the main insights of [Biroli and Mézard, 2023] or [Ambrogioni, 2023].

2. Many of the illustrative examples are provided for one-dimensional cases. While this is fine, no takeaway is clearly provided as to how the phenomenology is expected to carry over (or not) to higher dimensions, and to diffusion models. For instance, the one-dimensional Franz-Parisi potential is computed, but the results are rather insufficiently discussed and connected to the broader narrative.

3. [Minor] The paper is in general clear. Some parts nevertheless suffer from slightly hard-to-follow syntax. I am also providing a list of typos below, in "requested changes".

Report

The paper reviews the non-equilibrium thermodynamics (through the lens of stochastic entropies, probability currents, effective potentials) of Ornstein-Uhlenbeck (OU) processes, giving additional illustrations in simple cases for the various thermodynamic quantities.

The connection to generative diffusion models is, however, to the best of my understanding, not very developed beyond the fact that OU processes arise in the latter, and beyond a review of the main insights of [Biroli and Mézard, 2023] or [Ambrogioni, 2023]. The paper does not, in particular, establish any new connection, to the best of my reading. While the discussion on the Franz-Parisi potential seems novel, it is only evaluated in the one-dimensional case, and not discussed or connected to diffusion models. The paper therefore constitutes more of a (rather detailed and interesting) review of some recent non-equilibrium physics analyses of diffusion-based models and of the broader physics backdrop.

I therefore do not believe that the paper, in its current state, provides "a novel and synergetic link between different research areas" beyond previous literature. In particular, the objective stated in the abstract of "guid[ing] ML practitioners to design better algorithms" is slightly overclaimed. If a novel connection is indeed established in the paper and escaped my scrutiny, it would need to be greatly strengthened and much further discussed.

Requested changes

I am giving below a list of typos. Some of them are questions, and are possibly due to a misunderstanding on my side. Most of them are related to syntax, grammar or notations. The science is, to the best of my knowledge, sound and correct.

1- Abstract "analsysis of the diffusion models" : extra "the"
2- p.2 "centra", "likelihodd"
3- p.5 "the stochastic integral..." : incomplete sentence
4- p.5 "Ito convection"
5- p.5 "Stratonovitch convection"
6- "We make no difference between \textbf{X}, X, X_t, X(t)" : is this inconsistent notation really needed or could it be harmonized?
7- Sentence around (11) incomplete
8- p. 11 "of volume for transformed" missing "the"
9- Similarly, changing from t to \tau around (25) to denote time does not seem really justified.
10- Stochastic and ensemble entropies (25, 28) are both denoted S.
11- Extra ) in (33)
12- p.13 "it is proven that in a previous work that" extra "that"
13- p.25 "research filed"
14- in (13), shouldn't t' be t+dt?
15- What is \mathcal{N} in (17) or (59)? Is it a typo?
16- In (43) : is it for d=1? If not, why is the vector X_s squared?
17- The connection between the effective potential and the free energy is nice, but does one really need the Gaussian mixture assumption? Can't it be derived in more generality from (49) that F=U?
18- Overall, some passages (notably the abstract) suffer from slightly awkward syntax, which would benefit from being adjusted.

Recommendation

Ask for major revision

  • validity: good
  • significance: low
  • originality: low
  • clarity: high
  • formatting: excellent
  • grammar: below threshold
