SciPost Submission Page

Spiking neuromorphic chip learns entangled quantum states

by Stefanie Czischek, Andreas Baumbach, Sebastian Billaudelle, Benjamin Cramer, Lukas Kades, Jan M. Pawlowski, Markus K. Oberthaler, Johannes Schemmel, Mihai A. Petrovici, Thomas Gasenzer, Martin Gärttner

This is not the current version.

Submission summary

As Contributors: Andreas Baumbach · Stefanie Czischek
Arxiv Link: https://arxiv.org/abs/2008.01039v4
Date submitted: 2021-08-04 03:02
Submitted by: Czischek, Stefanie
Submitted to: SciPost Physics
Academic field: Physics
Specialties:
  • Neural and Evolutionary Computing
  • Quantum Physics
Approaches: Experimental, Computational

Abstract

The approximation of quantum states with artificial neural networks has gained a lot of attention during the last years. Meanwhile, analog neuromorphic chips, inspired by structural and dynamical properties of the biological brain, show a high energy efficiency in running artificial neural-network architectures for the profit of generative applications. This encourages employing such hardware systems as platforms for simulations of quantum systems. Here we report on the realization of a prototype using the latest spike-based BrainScaleS hardware allowing us to represent few-qubit maximally entangled quantum states with high fidelities. Extracted Bell correlations for pure and mixed two-qubit states convey that non-classical features are captured by the analog hardware, demonstrating an important building block for simulating quantum systems with spiking neuromorphic chips.

Current status:
Has been resubmitted


Author comments upon resubmission

We thank the editor for considering our manuscript and for sending us the very constructive reports of the referees. We are convinced that we can resolve the referees' criticism in our response and hope to meet the publication criteria of SciPost Physics with our revised manuscript. Below we provide a point-by-point response to the referees' comments.

Reply to Report 1

We thank the referee for the detailed consideration of our manuscript and the constructive feedback, which we would like to answer point by point in the following.

  • Reply to 1):
    We thank the referee for this interesting comment. Our approach in the manuscript is based on a probabilistic formulation of quantum mechanics, in which a quantum state is represented by the probability distribution over the outcomes of a tomographically complete measurement. Formally, measurements in quantum mechanics are described by sets {M_a} of positive operators (in our case projection operators). This allows us to map a quantum state represented by the density matrix ρ to a positive definite probability distribution via P(a)=tr(M_a ρ). Under certain conditions (see reference [15] in the manuscript), this mapping is invertible, so by knowing P(a) one can reconstruct ρ. In order to make it represent the quantum state ρ, we configure our neuromorphic chip to sample from the probability distribution P(a), which entails the complete information contained in the quantum state. The fact that any density matrix (or wave function, in the case of pure states) can be represented by a probability distribution is indeed crucial for our approach. In the context of phase-space representations, the equivalent of the employed positive operator-valued measure (POVM) is the Husimi distribution, which is defined in terms of projections of the state onto a set of coherent states. In this case the coherent-state projectors constitute the set {M_a}. The Husimi distribution is non-negative and gives a complete representation of the state.

We have added a sentence in the last paragraph on page 4 to clarify this for the reader:

“Hence, any density matrix ρ can be mapped to a probability distribution P(a), and the information contained in the quantum state can be retrieved from that distribution.”
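
The mapping P(a)=tr(M_a ρ) and its inversion can be illustrated with a minimal numerical sketch. It assumes the single-qubit tetrahedral POVM, applied as a product measurement on two qubits, which is one common tomographically complete choice and not necessarily the measurement used in the manuscript. The helper names (single_qubit_povm, product_povm, probabilities, reconstruct) are hypothetical.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

# Assumption: tetrahedral directions on the Bloch sphere (they sum to zero).
S = np.array([
    [0.0, 0.0, 1.0],
    [2 * np.sqrt(2) / 3, 0.0, -1 / 3],
    [-np.sqrt(2) / 3, np.sqrt(2 / 3), -1 / 3],
    [-np.sqrt(2) / 3, -np.sqrt(2 / 3), -1 / 3],
])

def single_qubit_povm():
    """Four positive operators M_a = (1 + s_a . sigma) / 4 that sum to the identity."""
    return [0.25 * (I2 + np.einsum("k,kij->ij", s, PAULIS)) for s in S]

def product_povm(n_qubits):
    """Tensor products of the single-qubit elements (4**n_qubits outcomes)."""
    ops = []
    for combo in itertools.product(single_qubit_povm(), repeat=n_qubits):
        op = combo[0]
        for m in combo[1:]:
            op = np.kron(op, m)
        ops.append(op)
    return ops

def probabilities(rho, povm):
    """P(a) = tr(M_a rho) for every POVM element."""
    return np.real(np.array([np.trace(m @ rho) for m in povm]))

def reconstruct(p, povm):
    """Invert the map using the overlap matrix T_ab = tr(M_a M_b)."""
    overlap = np.real(np.array([[np.trace(a @ b) for b in povm] for a in povm]))
    coeffs = np.linalg.solve(overlap, p)
    return sum(c * m for c, m in zip(coeffs, povm))

# Bell state (|00> + |11>) / sqrt(2) as a test case.
bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())

povm = product_povm(2)
p_bell = probabilities(rho_bell, povm)   # 16 non-negative numbers summing to 1
rho_back = reconstruct(p_bell, povm)     # recovers rho_bell up to rounding
```

In this sketch the entangled state is fully described by an ordinary 16-outcome probability distribution, and a sampler that reproduces p_bell carries the same information as rho_bell.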

  • Reply to 2):
    We thank the referee for pointing out this ambiguity. It is correct that software-coded restricted Boltzmann machines (RBMs) have been used successfully for encoding quantum many-body states (Ref. [30] in the manuscript provides an overview). One of the motivations of our work is in fact to leverage the fast and energy-efficient sampling capabilities of the neuromorphic chip for eventually speeding up such approaches.

We would like to stress, however, that the neuromorphic hardware should not be considered as a mere emulator of a Boltzmann machine. Most importantly, it is a dynamical system, which, in a non-trivial and approximative manner, develops spike-time correlations that with the chosen network parameterization allow us to approximate a Boltzmann distribution. For that reason, we prefer to avoid the term “Restricted Boltzmann machine”.

We hence prefer not to regard our approach as equivalent to variational methods in quantum many-body physics using artificial neural networks, even though these approaches certainly provide motivation to expect scalability for certain classes of states. The hardware model shows the approximate behavior of an RBM, but in general no exact closed-form expression for the complete distribution underlying the neuron dynamics can be given.

Concerning the representation of mixed quantum states by means of RBMs, we remark that purifications of density operators have been demonstrated [G. Torlai, R. Melko, Latent Space Purification via Neural Density Operators, PRL, 2018; M.J. Hartmann, G. Carleo, Neural-Network Approach to Dissipative Quantum Many-Body Dynamics, PRL, 2019] which result in complex-valued wave functions that cannot directly be translated to our sample-based neuromorphic approach. The probabilistic approach used here has been pioneered by Carrasquilla et al. (see Ref. [15] in the manuscript), who used RBMs as well as more complex network architectures.

We added a sentence after Eq. (3) in the revised manuscript, emphasizing that the spiking neural network does not emulate a restricted Boltzmann machine and hence is different from software encodings of artificial neural networks representing quantum many-body systems:

“While the dynamical behavior of the spiking hardware approximates this probability distribution, the exact relation between the network parameters and the encoded distribution cannot be given in a closed form [18].”

  • Reply to 3):
    We agree with the referee and now explicitly state in the last sentence of the introduction and in the first sentence of the conclusion that the spiking network is classical.

  • Reply to 4):
    In order to extend the discussion of the operation mode of the neuromorphic chip and of the spike-based sampling framework, we reworked the second paragraph on page 4 in the main text.
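
The spike-based state convention of the reworked paragraph (a neuron encodes z = 1 while it is refractory after a spike and z = 0 at all other times) can be sketched as follows. This is a software illustration only, not the readout used on the chip. The function name spikes_to_states, the time units, and the choice of sample times are assumptions.

```python
import numpy as np

def spikes_to_states(spike_times, tau_ref, sample_times):
    """Binary network states z at the given sample times.

    spike_times: one sequence of spike times per neuron (hypothetical input format)
    tau_ref: refractory period, in the same time unit as the spike times
    Returns an array of shape (len(sample_times), n_neurons) with entries 0 or 1.
    """
    sample_times = np.asarray(sample_times, dtype=float)
    states = np.zeros((len(sample_times), len(spike_times)), dtype=int)
    for j, times in enumerate(spike_times):
        times = np.sort(np.asarray(times, dtype=float))
        if times.size == 0:
            continue  # a silent neuron stays in z = 0
        # Index of the most recent spike at or before each sample time.
        idx = np.searchsorted(times, sample_times, side="right") - 1
        has_spike = idx >= 0
        last = times[np.clip(idx, 0, None)]
        # z = 1 while the time since the last spike is shorter than tau_ref.
        states[:, j] = has_spike & (sample_times - last < tau_ref)
    return states

# Two neurons: the first spikes at t = 1 and t = 5, the second never spikes.
z = spikes_to_states([[1.0, 5.0], []], tau_ref=2.0,
                     sample_times=[0.5, 1.5, 3.5, 5.0, 6.9, 7.1])
```

Counting how often each row of z occurs gives an estimate of the sampled distribution over network states.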

Reply to Report 2

We thank the referee for the detailed consideration of our manuscript and the constructive feedback, which we would like to answer point by point in the following.

It is correct that the hardware can generate exact samples, which allows us to go beyond the contrastive divergence scheme. The statement in the main text addresses the issue that wake-sleep learning, the scheme we employ, requires a large number of samples when used in classical machine learning. Contrastive divergence instead estimates the gradient based on a single sample. For the neuromorphic hardware, sample generation is not the bottleneck, and thus it is feasible to employ wake-sleep learning directly. This is advantageous compared to contrastive divergence because it improves the quality of the approximate determination of observables.

In Appendix B we concentrate on comparing runtimes for the task of sampling from a given network, independent of the learning algorithm which this sampling is employed for. In the revised manuscript we emphasize in the last paragraph of the conclusion section that the comparison only refers to sampling neuron configurations rather than to the training of the network.

To make it clearer for the reader, we reformulated and extended the statement quoted by the referee, in the last paragraph of Section 2. It now reads:

“The performance characteristics of the neuromorphic hardware make computing additional samples cheap compared to reconfiguration and reinitialization. Hence we can take into account the complete sampled distribution for the update calculation, rather than relying on few-sample approximations as in contrastive divergence [13]. This enables a much better estimation of the D_KL gradient and does not rely on layer-wise conditional independence, allowing the exploration of network topologies other than bipartite graphs. See App. D for an extended discussion of the learning scheme.”

In addition, we reworked the last paragraph of Appendix D on pages 18 and 19. We extended this paragraph with a discussion of contrastive divergence, its benefits for software simulations, and its disadvantages when used for accelerated neuromorphic hardware.
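
The idea of using the complete sampled distribution for the update can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the update rule of the manuscript, whose equations are not reproduced on this page. The function name weight_update, the learning rate, and the dictionary format of the target distribution are hypothetical. The model correlations are plain averages over all samples, and the data correlations are obtained by reweighting each sample so that every visible state carries its target probability p*(v).

```python
from collections import Counter
import numpy as np

def weight_update(samples_v, samples_h, target, lr=0.1):
    """Correlation-mismatch update lr * (<v h>_data - <v h>_model).

    samples_v: (N, n_v) binary array of sampled visible states
    samples_h: (N, n_h) binary array of the hidden states sampled alongside
    target: dict mapping a visible state (tuple of ints) to p*(v)
    Visible states with p*(v) > 0 that never appear in the samples are
    ignored in this sketch, so the reweighted term can lose some weight.
    """
    samples_v = np.asarray(samples_v)
    samples_h = np.asarray(samples_h)
    n = len(samples_v)
    # Model term: average over the full sampled distribution.
    model_corr = samples_v.T @ samples_h / n
    # Data term: each visible state gets total weight p*(v), shared equally
    # among the samples in which it occurs.
    keys = [tuple(int(x) for x in v) for v in samples_v]
    counts = Counter(keys)
    weights = np.array([target.get(k, 0.0) / counts[k] for k in keys])
    data_corr = (samples_v * weights[:, None]).T @ samples_h
    return lr * (data_corr - model_corr)

# One visible and one hidden neuron, four samples.
v = np.array([[1], [1], [0], [0]])
h = np.array([[1], [0], [1], [1]])
dw_matched = weight_update(v, h, {(1,): 0.5, (0,): 0.5})
dw_shifted = weight_update(v, h, {(1,): 1.0})
```

When the sampled visible marginal already matches the target, the two terms coincide and the update vanishes. When the target shifts weight toward v = 1, the update becomes positive.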

List of changes

- Added "classical" in the last sentence of Section 1 (page 2):
"With this substrate, we demonstrate an approximate representation of quantum states with classical spiking neural networks that is sufficiently precise for encoding genuine quantum correlations. "

- Rewrote the second paragraph of Section 2 (page 4):
"The BrainScaleS-2 system, depicted in Fig. 1b, features 512 LIF neuron circuits interacting through a configurable weight matrix [11]. Similar to biological neurons in the human brain, LIF neurons communicate via spikes. Each neuron can be viewed as a capacitor integrating the currents it receives from its synaptic inputs to generate a membrane potential. Whenever this membrane potential crosses a threshold from below, the neuron sends a spike to the synaptic inputs of its efferent partners (Fig. 1c, top panel). After sending a spike, the neuron is set to an inactive state, in which no additional spike can be triggered for a certain time, referred to as the refractory period τ_ref. In the spike-based sampling framework, neurons in this refractory state encode the state z = 1, and z = 0 at all other times (Fig. 1c, lower panel). The stochasticity required for sampling is induced by adding a random component to the generation of spikes; for LIF networks, this can be ensured by sufficiently noisy membrane potentials [16,18]. To this end, we used on-chip sources to inject pseudo-Poisson spike trains into the network (see App. A)."

- Added a sentence in the fourth paragraph of Section 2 (page 4):
"Hence, any density matrix ρ can be mapped to a probability distribution P(a), and the information contained in the quantum state can be retrieved from that distribution. "

- Rewrote and extended the last paragraph of Section 2, after Eq. (3) (page 5):
"In each training epoch, the synaptic weights were updated along the gradient of the D_KL (see App. D), which is derived assuming that the distribution p(v;W) is given by Eq.(1). While the dynamical behavior of the spiking hardware approximates this probability distribution, the exact relation between the network parameters and the encoded distribution cannot be given in a closed form [18]. Instead, pairwise correlations ⟨v_i h_j⟩_{model} in the network were measured from the sampled distribution p(v,h;W)."

- Added "classical" in the first sentence of Section 6 (page 9):
"We have shown that a spiking neural network implemented on a classical neuromorphic chip can approximate entangled quantum states of few particles with high quantum fidelity. "

- Rewrote the second sentence of the last paragraph in Section 6 (page 9):
"We illustrate this property by comparing the sampling time on a neuromorphic chip with sampling times achieved in CPU implementations in App. B showing a gain through neuromorphic sampling already at moderate system sizes."

- Rewrote and extended the last paragraph of Appendix D (pages 18 and 19):
"In general, Hebbian training algorithms are based on minimizing the correlation mismatch between data and model distributions. The traditional way for estimating this mismatch is contrastive divergence [13,41], where the target and model distributions are approximated by a single layer-wise network update (CD-1). However, the improved performance of contrastive divergence relies on the fact that preparing the software network in a defined state is cheap compared to calculating updates of neuron configurations. Our neuromorphic hardware, as a physical dynamical system, implicitly calculates the neuron updates and the actual sampling run is cheap compared to the cost of the network initialization with the given performance characteristics. An implementation of the preparation-dominated contrastive divergence scheme on the spiking neuromorphic hardware hence does not provide any of the benefits observed in software simulations. In contrast we take advantage of these hardware characteristics by using the full model distribution to calculate network parameter updates, which improves the quality of the stochastic gradient estimation. We further optimize the hardware training implementation by reconstructing the correlations between visible and hidden layers from the encoded distribution by reweighting all samples p(v) according to the target probability p^∗(v), see Eq.(15) and Eq.(16). This is in contrast to contrastive divergence learning where the distribution of the visible layer is explicitly enforced to match the target distribution and only the correlations with the hidden layer are being sampled. Beyond the optimized implementation on the spiking neuromorphic system, our proposed training algorithm can be used to obtain network parameter updates for arbitrarily connected networks, while contrastive divergence is limited to strictly layered network structures. "

Submission & Refereeing History

Resubmission 2008.01039v5 on 26 October 2021

Reports on this Submission

Anonymous Report 2 on 2021-10-21 (Invited Report)

Strengths

The work links research areas that have so far developed independently: the characterization of quantum states and the representation of neural networks on neuromorphic chips.

Weaknesses

A particular weakness is that the description of the result is not as explicit as it should be. There are claims that the work is "demonstrating that intrinsic quantum features can be captured by a classical spiking network". Yet, my understanding is that the classical spiking network just learns two distributions, the real and imaginary parts of the matrix element of the density matrix.

Report

This is an interesting work as it explores applications of neuromorphic chips in representing quantum states. A particular weakness of the paper, however, is that the description of the result is not as explicit as it should be. There are claims that the work is "demonstrating that intrinsic quantum features can be captured by a classical spiking network". Yet, my understanding is that the classical spiking network just learns two distributions, the real and imaginary parts of the matrix element of the density matrix. It is no surprise at all that a classical computing device can compute representations of quantum states, even if they are highly correlated and non-classical. Every numerical simulation of a quantum system does this.

Requested changes

I suggest that the presentation of the material should be changed to clearly say that the network is able to approximate the matrix elements of the density matrix representing certain quantum states.
Any statements that seem to suggest that a classical device can show quantum correlations should be removed. I understand that such things are not explicitly claimed, but it is also not stated clearly enough that these claims are not made.

  • validity: good
  • significance: good
  • originality: high
  • clarity: low
  • formatting: good
  • grammar: excellent

Anonymous Report 1 on 2021-9-7 (Invited Report)

Report

I am satisfied with the authors' reply to my comments and I have no further recommendations. I believe that the article meets the expectations and criteria for publication in SciPost Physics.

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -
