Aurélien Decelle, Cyril Furtlehner, Alfonso De Jesús Navas Gómez, Beatriz Seoane
SciPost Phys. 16, 095 (2024) ·
published 5 April 2024
|
· pdf
Generative models offer a direct way of modeling complex data. Energy-based models attempt to encode the statistical correlations observed in the data at the level of the Boltzmann weight associated with an energy function in the form of a neural network. We address here the challenge of understanding the physical interpretation of such models. In this study, we propose a simple solution by implementing a direct mapping between the Restricted Boltzmann Machine and an effective Ising spin Hamiltonian. This mapping includes interactions of all possible orders, going beyond the conventional pairwise interactions typically considered in the inverse Ising (or Boltzmann Machine) approach, and allowing the description of complex datasets. Earlier works attempted to achieve this goal, but the proposed mappings were inaccurate for inference applications, did not properly treat the complexity of the problem, or did not provide precise prescriptions for practical application. To validate our method, we performed several controlled inverse numerical experiments in which we trained the RBMs using equilibrium samples of predefined models with local external fields, 2-body and 3-body interactions in different sparse topologies. The results demonstrate the effectiveness of our proposed approach in learning the correct interaction network and pave the way for its application in modeling interesting binary variable datasets. We also evaluate the quality of the inferred model based on different training methods.
Nicolas Béreux, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane
SciPost Phys. 14, 032 (2023) ·
published 14 March 2023
|
· pdf
Restricted Boltzmann Machines are simple and powerful generative models that can encode any complex dataset. Despite all their advantages, in practice the trainings are often unstable and it is difficult to assess their quality because the dynamics are affected by extremely slow time dependencies. This situation becomes critical when dealing with low-dimensional clustered datasets, where the time required to sample ergodically the trained models becomes computationally prohibitive. In this work, we show that this divergence of Monte Carlo mixing times is related to a phenomenon of phase coexistence, similar to that which occurs in physics near a first-order phase transition. We show that sampling the equilibrium distribution using the Markov chain Monte Carlo method can be dramatically accelerated when using biased sampling techniques, in particular the Tethered Monte Carlo (TMC) method. This sampling technique efficiently solves the problem of evaluating the quality of a given trained model and generating new samples in a reasonable amount of time. Moreover, we show that this sampling technique can also be used to improve the computation of the log-likelihood gradient during training, leading to dramatic improvements in training RBMs with artificial clustered datasets. On real low-dimensional datasets, this new training method fits RBM models with significantly faster relaxation dynamics than those obtained with standard PCD recipes. We also show that TMC sampling can be used to recover the free-energy profile of the RBM. This proves to be extremely useful to compute the probability distribution of a given model and to improve the generation of new decorrelated samples in slow PCD-trained models. The main limitations of this method are, first, the restriction to effective low-dimensional datasets and, second, the fact that the Tethered MC method breaks the possibility of performing parallel alternative Monte Carlo updates, which limits the size of the systems we can consider in practice.