QuCumber: wavefunction reconstruction with neural networks

As we enter a new era of quantum technology, it is increasingly important to develop methods to aid in the accurate preparation of quantum states for a variety of materials, matter, and devices. Computational techniques can be used to reconstruct a state from data, however the growing number of qubits demands ongoing algorithmic advances in order to keep pace with experiments. In this paper, we present an open-source software package called QuCumber that uses machine learning to reconstruct a quantum state consistent with a set of projective measurements. QuCumber uses a restricted Boltzmann machine to efficiently represent the quantum wavefunction for a large number of qubits. New measurements can be generated from the machine to obtain physical observables not easily accessible from the original data.


Introduction
Current advances in fabricating quantum technologies, as well as in reliable control of synthetic quantum matter, are leading to a new era of quantum hardware where highly pure quantum states are routinely prepared in laboratories.With the growing number of controlled quantum degrees of freedom, such as superconducting qubits, trapped ions, and ultracold atoms [1][2][3][4], reliable and scalable classical algorithms are required for the analysis and verification of experimentally prepared quantum states.Efficient algorithms can aid in extracting physical observables otherwise inaccessible from experimental measurements, as well as in identifying sources of noise to provide direct feedback for improving experimental hardware.However, traditional approaches for reconstructing unknown quantum states from a set of measurements, such as quantum state tomography, often suffer the exponential overhead that is typical of quantum many-body systems.
Recently, an alternative path to quantum state reconstruction was put forward, based on modern machine learning (ML) techniques [5][6][7][8][9].The most common approach relies on a powerful generative model called a restricted Boltzmann machine (RBM) [10], a stochastic neural network with two layers of binary units.A visible layer v describes the physical degrees of freedom, while a hidden layer h is used to capture high-order correlations between the visible units.Given a set of neural network parameters λ, the RBM defines a probabilistic model described by the parametric distribution p λ (v).RBMs have been widely used in the ML community for the pre-training of deep neural networks [11], for compressing high-dimensional data into lower-dimensional representations [12], and more [13].More recently, RBMs have been adopted by the physics community in the context of representing both classical and quantum many-body states [14][15][16].They are currently being investigated for their representational power [17][18][19], their relationship with tensor networks and the renormalization group [20][21][22][23][24], and in other contexts in quantum many-body physics [25][26][27].
In this post, we present QuCumber: a quantum calculator used for many-body eigenstate reconstruction.QuCumber is an open-source Python package that implements neural-network quantum state reconstruction of many-body wavefunctions from measurement data.Some examples include magnetic spin projections, orbital occupation number, polarization of photons, or the logical state of qubits.Given a training dataset of measurements, QuCumber discovers the most likely quantum state compatible with the measurements by finding the optimal set of parameters λ of an RBM.A properly trained RBM is an approximation of the unknown quantum state underlying the data.It can be used to calculate various physical observables of interest, including measurements that may not be possible in the original experiment.
This post is organized as follows.In Section 2, we introduce the reconstruction technique for the case where all coefficients of the wavefunction are real and positive.We discuss the required format for input data, as well as training of the RBM and the reconstruction of typical observables.In Section 3, we consider the more general case of a complex-valued wavefunction.We illustrate a general strategy to extract the phase structure from data by performing appropriate unitary rotations on the state before measurements.We then demonstrate a practical reconstruction of an entangled state of two qubits.A list of useful terms and equations can be found at the end of the post in the Glossary.

Positive wavefunctions
We begin by presenting the application of QuCumber to reconstruct many-body quantum states described by wavefunctions |Ψ with positive coefficients Ψ(x) = x|Ψ ≥ 0, where |x = |x 1 , . . ., x N is a reference basis for the Hilbert space of N quantum degrees of freedom.The neural-network quantum state reconstruction requires raw data D = (x 1 , x 2 , . . . ) generated through projective measurements of the state |Ψ in the reference basis.These measurements adhere to the probability distribution given by the Born rule, P (x) = |Ψ(x)| 2 .Since the wavefunction is strictly positive, the quantum state is completely characterized by the measurement distribution, i.e.Ψ(x) = P (x).
The positivity of the wavefunction allows a simple and natural connection between quantum states and classical probabilistic models.QuCumber employs the probability distribution p λ (x) of an RBM to approximate the distribution P (x) underlying the measurement data.Using contrastive divergence (CD) [11], QuCumber trains the RBM to discover an optimal set of parameters that minimize the Kullback-Leibler (KL) divergence between the two distributions.Upon successful training (p λ (x) ∼ P (x)), we obtain a faithful representation of the target quantum state, In the following, we demonstrate the application of QuCumber for the reconstruction of the ground-state wavefunction of the one-dimensional transverse-field Ising model (TFIM).The Hamiltonian is where σx/z i are spin-1/2 Pauli operators acting on site i, and we assume open boundary conditions.For this example, we consider a chain with N = 10 spins at the quantum critical point J = h = 1.

Setup
Given the small size of the system, the ground state |Ψ can be found with exact diagonalization.The training dataset D is generated by sampling the distribution P (σ z ) = |Ψ(σ z )| 2 , obtaining a sequence of N S = 10 5 independent spin projections in the reference basis x = σ z .1Each data point in D consists of an array σ z j = (σ z 1 , . . ., σ z N ) with shape (N,) and should be passed to QuCumber as a numpy array or torch tensor.For example, σ z j = np.array([1,0,1,1,0,1,0,0,0,1]),where we use σ z j = 0, 1 to represent a spin-down and spin-up state respectively.Therefore, the entire input data set is contained in an array with shape (N_S, N).
Aside from the training data, QuCumber also allows us to import an exact wavefunction.This can be useful for monitoring the quality of the reconstruction during training.In our example, we evaluate the fidelity between the reconstructed state ψ λ (x) and the exact wavefunction Ψ(x).The training dataset, train_data, and the exact ground state, true_psi, are loaded with the data loading utility as follows: import qucumber .utils .data as data train_path = " tfim1d_data .txt " psi_path = " tfim1d_psi .txt " train_data , true_psi = data .load_data ( train_path , psi_path ) If psi_path is not provided, QuCumber will load only the training data.
Next, we initialize an RBM quantum state ψ λ (x) with random weights and zero biases using the constructor PositiveWavefunction: from qucumber .nn_states import P o s i t i v e W a v e f u n c t i o n state = P o s i t i v e W a v e f u n c t i o n ( num_visible =10 , num_hidden =10) The number of visible units (num_visible) must be equal to the number of physical spins N , while the number of hidden units (num_hidden) can be adjusted to systematically increase the representational power of the RBM.
The quality of the reconstruction will depend on the structure underlying the specific quantum state and the ratio of visible to hidden units, α = num_hidden/num_visible.In practice, we find that α = 1 often leads to good approximations of positive wavefunctions [5].However, in the general case, the value of α required for a given wavefunction should be explored and adjusted by the user.

Training
Once an appropriate representation of the quantum state has been defined, QuCumber trains the RBM through the function PositiveWavefunction.fit.Several input parameters need to be provided aside from the training dataset (train_data).These include the number of training iterations (epochs), the number of samples used for the positive/negative phase of CD (pos_batch_size/neg_batch_size), the learning rate (lr) and the number of sampling steps in the negative phase of CD (k).The last argument (callbacks) allows the user to pass a set of additional functions to be evaluated during training.
As an example of a callback, we show the MetricEvaluator, which evaluates a function log_every epochs during training.Given the small system size and the knowledge of the true ground state, we can evaluate the fidelity between the RBM state and the true ground-state wavefunction (true_psi).Similarly, we can calculate the KL divergence between the RBM distribution p λ (x), and the data distribution P (x), which should approach zero for a properly trained RBM.For the current example, we monitor the fidelity and KL divergence (defined in qucumber.utils.training_statistics):from qucumber .callbacks import MetricEvaluator import qucumber .utils .t ra i n in g _ st a t is t i cs as ts log_every = 10 space = state .g e n e r a t e _ h i l b e r t _ s p a c e (10) callbacks = [ MetricEvaluator ( log_every , { " Fidelity " : ts .fidelity , " KL " : ts .KL } , target_psi = true_psi , space = space , verbose = True ) ] With verbose=True, the program will print the epoch number and all callbacks every log_every epochs.Now that the metrics to monitor during training have been chosen, we can invoke the optimization with the fit function of PositiveWavefunction.
The network parameters λ, together with the callbacks, can be saved (or loaded) to a file: With this we have demonstrated to most basic aspects of QuCumber regarding training a model and verifying its accuracy.We note that in this example the evaluation utilized the knowledge of the exact ground state and the calculation of the KL divergence, which is tractable only for small system sizes.However, we point out that QuCumber is capable of carrying out the reconstruction of much larger systems.In such cases, users must rely on other estimators to evaluate the training, such as expectation values of physical observables (magnetization, energy, etc).In the following, we show how to compute diagonal and offdiagonal observables in QuCumber.

Reconstruction of physical observables
In this section, we discuss how to calculate the average value of a generic physical observable Ô from a trained RBM.We start with the case of observables that are diagonal in the reference basis where the RBM was trained.We then discuss the more general cases of off-diagonal observables and entanglement entropies.

Diagonal observables
We begin by considering an observable with only diagonal matrix elements, σ| Ô |σ = O σ δ σσ where for convenience we denote the reference basis x = σ z as σ unless otherwise stated.The expectation value of Ô is given by The expectation value can be approximated by a Monte Carlo estimator, where the spin configurations σ k are sampled from the RBM distribution p λ (σ).This process is particularly efficient given the bipartite structure of the network which allows the use of block Gibbs sampling.
A simple example for the TFIM is the average longitudinal magnetization per spin, σz = j σz j /N , which can be calculated directly on the spin configuration sampled by the RBM (i.e., the state of the visible layer).The visible samples are obtained with the sample function of the RBM state object: which takes the total number of samples (num_samples) and the number of iterations (k) of a block Gibbs step as input.Once these samples are obtained, the magnetization can be calculated simply as magnetization = samples .mul (2.0).sub (1.0).mean () where we converted the binary samples of the RBM back into ±1 spins before taking the mean.

Off-diagonal observables
We turn now to the case of off-diagonal observables, where the expectation value assumes the following form Ô = 1 This expression can once again be approximated with a Monte Carlo estimator of the so-called local estimator of the observable: As long as the matrix representation O σσ is sufficiently sparse in the reference basis, the summation can be evaluated efficiently.
As an example, we consider the specific case of the transverse magnetization for the j-th spin, σx j , with matrix elements Therefore, the expectation values reduces to the Monte Carlo average of the local observable evaluated on spin configurations σ k sampled from the RBM distribution p λ (σ).QuCumber provides an interface for sampling off-diagonal observables in the ObservableBase class.Thorough examples are available in the tutorial section in the documentation. 2 As an example, σ x can be written as an observable class with The value of the observable is computed with SigmaX .s t a t i s t i c s _ f r o m _ s a m p l e s ( state , samples ) Similarly, the user can define other observables like the energy.The reconstruction of two magnetic observables for the TFIM is shown in Fig. 2, where a different RBM was trained for each value of the transverse field h.In the left plot, we show the average longitudinal magnetization per site, which can be calculated directly from the configurations sampled by the RBM.In the right plot, we show the off-diagonal observable of transverse magnetization.In both cases, QuCumber successfully discovers an optimal set of parameters λ that accurately approximate the ground-state wavefunction underlying the data.

Entanglement entropy
A quantity of significant interest in quantum many-body systems is the amount of entanglement between a subregion A and its complement Ā. Numerically, measurement of bipartite entanglement entropy is commonly accessed through the computation of the second Rényi entropy [28].When one has access to a pure state wavefunction ψ λ (x), Rényi entropies can be estimated through the "Swap" operator.It is a type of an off-diagonal observable that acts on an extended product space consisting of independent copies of the wavefunction, ψ λ (x) ⊗ ψ λ (x), referred to as "replicas".
As the name suggests, the action of the Swap operator is to swap the spin configurations in region A between the replicas, Here the subcript of the ket indicates the replica index, while the two labels inside a ket, such as σ A , σ Ā, describe the spins configurations within the subregion and its complement.In QuCumber, the Swap operator is implemented as a routine within the entanglement observable unit: where s1 and s2 are batches of samples produced from each replica.While ideally those samples should be entirely independent, in order to save computational costs, QuCumber just splits a given batch into two equal parts and treats them as if they were independent samples.This is implemented within the RenyiEntropy observable, The rest of the implementation is very similar to that for the transverse magnetization observable from the last section, once the amplitude of a sample is substituted with the product of amplitudes drawn from each replica.psi_ket1 = nn_state .psi ( samples1 ) psi_ket2 = nn_state .psi ( samples2 ) psi_ket = cplx .elementwise_mult ( psi_ket1 , psi_ket2 )

Complex wavefunctions
For positive wavefunctions, the probability distribution underlying the outcomes of projective measurements in the reference basis contains all possible information about the unknown quantum state.However, in the more general case of a wavefunction with a non-trivial sign or phase structure, this is not the case.In this section, we consider a target quantum state where the wavefunction coefficients in the reference basis can be complex-valued, Ψ(σ) = Φ(σ)e iθ(σ) .We continue to choose the reference basis as σ = σ z .We first need to generalize the RBM representation of the quantum state to capture a generic complex wavefunction.To this end, we introduce an additional RBM with marginalized distribution p µ (σ) parameterized by a new set of network weights and biases µ.We use this to define the quantum state as: where φ µ (σ) = log(p µ (σ)) [5].In this case, the reconstruction requires a different type of measurement setting.It is easy to see that projective measurements in the reference basis do not convey any information on the phases θ(σ), since The general strategy to learn a phase structure is to apply a unitary transformation U to the state |Ψ before the measurements, such that the resulting measurement distribution P (σ) = |Ψ (σ)| 2 of the rotated state Ψ (σ) = σ| U |Ψ contains fingerprints of the phases θ(σ) (Fig. 3).In general, different rotations must be independently applied to gain full information on the phase structure.We make the assumption of a tensor product structure of the rotations, U = and the resulting measurement distribution is 2 qubits To clarify the procedure, let us consider the simple example of a quantum state of two qubits: and rotation U = Ĥ0 ⊗ Î1 , where Î is the identity operator and is called the Hadamard gate.This transformation is equivalent to rotating the qubit σ 0 from the reference σ z 0 basis the the σ x 0 basis.A straightforward calculation leads to the following probability distribution of the projective measurement in the new basis |σ x 0 , σ 1 : where ∆θ = θ 0σ 1 −θ 1σ 1 .Therefore, the statistics collected by measuring in this basis implicitly contains partial information on the phases.To obtain the full phases structure, additional transformations are required, one example being the rotation from the reference basis to the σ y j local basis, realized by the elementary gate

Setup
We now proceed to use QuCumber to reconstruct a complex-valued wavefunction.For simplicity, we restrict ourselves to two qubits and consider the general case of a quantum state with random amplitudes Φ σ 0 σ 1 and random phases θ σ 0 σ 1 .This example is available in the online tutorial. 3We begin by importing the required packages: from qucumber .nn_states import C om p l ex W a ve f u nc t i on import qucumber .utils .unitaries as unitaries import qucumber .utils .cplx as cplx Since we are dealing with a complex wavefunction, we load the corresponding module ComplexWavefunction to build the RBM quantum state ψ λµ (σ).Furthermore, the following additional utility modules are required: the utils.cplxbackend for complex algebra, and the utils.unitariesmodule which contains a set of elementary local rotations.By default, the set of unitaries include local rotations to the σ x and σ y basis, implemented by the Ĥ and K gates respectively.We continue by loading the data in QuCumber, which is done using the load_data function of the data utility: train_path = " qubits_train .txt " train_bases_path = " qu bi ts _tr ai n_ ba se s .txt " psi_path = " qubits_psi .txt " bases_path = " qubits_bases .txt " train_samples , true_psi , train_bases , bases = data .load_data ( train_path , psi_path , train_bases_path , bases_path ) As before, we may load the true target wavefunction from qubits_psi.txt, which can be used to calculate the fidelity and KL divergence.In contrast with the positive case, we now have measurements performed in different bases.Therefore, the training data consists of an array of qubits projections (σ b 0 0 , σ b 1 1 ) in qubits_train_samples.txt, together with the corresponding bases (b 0 , b 1 ) where the measurement was taken, in qubits_train_bases.txt.Finally, QuCumber loads the set of all the bases appearing the in training dataset, stored in qubits_bases.txt.This is required to properly configure the various elementary unitary rotations that need to be applied to the RBM state during the training.For this example, we generated measurements in the following bases: Finally, before the training, we initialize the set of unitary rotations and create the RBM state object.In the case of the provided dataset, the unitaries are the Ĥ and K gates.
The required dictionary can be created with unitaries.create_dict().By default, when unitaries.create_dict() is called, it will contain the identity, the Ĥ gate, and the K gate, with the keys Z, X, and Y, respectively.It is possible to add additional gates by specifying them as where re_part, im_part, and unitary_name are to be specified by the user.
We then initialize the complex RBM object with state = C o m pl e x Wa v e fu n c ti o n ( num_visible =2 num_hidden =2 , unitary_dict = unitary_dict ) The key difference between positive and complex wavefunction reconstruction is the requirement of additional measurements in different basis.Despite this, loading the data, initializing models, and training the RBMs are all very similar to the positive case, as we now discuss.

Training
Like in the case of a positive wavefunction, for the complex case QuCumber optimizes the network parameters to minimize the KL divergence between the data and the RBM distribution.When measuring in multiple bases, the optimization now runs over the set of parameters (λ, µ) and minimizes the sum of KL divergences between the data distribution P (σ b ) and the RBM distribution |ψ λµ (σ b )| 2 for each bases b appearing in the training dataset [5].For example, if a given training sample is measured in the basis (x, z), QuCumber applies the appropriate unitary rotation U = Ĥ0 ⊗ Î1 to the RBM state before collecting the gradient signal.
Similar to the case of positive wavefunction, we generate the Hilbert space (to compute fidelity and KL divergence) and initialize the callbacks

Conclusion
We have introduced the open source software package QuCumber, a quantum calculator used for many-body eigenstate reconstruction.QuCumber is capable of taking input data representing projective measurements of a quantum wavefunction, and reconstructing the wavefunction using a restricted Boltzmann machine (RBM).
Once properly trained, QuCumber can produce a new set of measurements, sampled stochastically from the RBM.These samples, generated in the reference basis, can be used to verify the training of the RBM against the original data set.More importantly, they can be used to calculate expectation values of many physical observables.In fact, any expectation value typically estimated by conventional Monte Carlo methods can be implemented as an estimator in QuCumber.Such estimators may be inaccessible in the reference basis, for example.Or, they may be difficult or impossible to implement in the setup for which the original data was obtained.This is particularly relevant for experiments, where it is easy to imagine many possible observables that are inaccessible, due to fundamental or technical challenges.
Future versions of QuCumber, as well as the next generation of quantum state reconstruction software, may explore different generative models, such as variational autoencoders or generative adversarial networks.In addition, future techniques may include hybridization between machine learning and other well-established methods in computational quantum many-body physics, such as variational Monte Carlo and tensor networks.

A Glossary
This section contains an overview of terms discussed in the document which are relevant for RBMs.For more detail we refer the reader to the code documentation on https://qucumber.readthedocs.io/en/stable/,and References [11,29].
• Batch: A subset of data upon which the gradient is computed and the network parameters are adjusted accordingly.A smaller batch size often results in a more stochastic trajectory, while a large batch size more closely approximates the exact gradient and has less variance.
• Biases: Adjustable parameters in an RBM, denoted by b j and c i in Eq. ( 19).
• Contrastive divergence: An approximate maximum-likelihood learning algorithm for RBMs [11].CD estimates the gradient of the effective energy (19) with respect to model parameters by using Gibbs sampling to compare the generated and target distributions.
• Energy: The energy of the joint configuration (v, h) of a RBM is defined as follows: • Effective energy: Obtained from the energy by tracing out the hidden units h; often called the "free energy" in machine learning literature.
• Epoch: A single pass through an entire training set.For example, with a training set of 1,000 samples and a batch size of 100, one epoch consists of 10 updates of the parameters.
• Hidden units: There are n h units in the hidden layer of the RBM, denoted by the vector h = (h 1 , . . ., h n h ).The number of hidden units can be adjusted to increase the representational capacity of the RBM.
• Hyperparameters: A set of parameters that are not adjusted by a neural network during training.Examples include the learning rate, number of hidden units, batch size, and number of training epochs.
• Joint distribution: The RBM assigns a probability to each joint configuration (v, h) according to the Boltzmann distribution: • KL divergence: The Kullback-Leibler divergence, or relative entropy, is a measure of the "distance" between two probability distributions P and Q, defined as: The KL divergence between two identical distributions is zero.Note that it is not symmetric between P and Q.
• Learning rate: The step size used in the gradient descent algorithm for the optimization of the network parameters.A small learning rate may result in better optimization but will take more time to converge.If the learning rate is too high, training might not converge or will find a poor optimum.
• Marginal distribution: Obtained by marginalizing out the hidden layer from the joint distribution via • QuCumber : A quantum calculator used for many-body eigenstate reconstruction.
• Parameters: The set of weights and biases λ = {W , b, c} characterizing the RBM energy function.These are adjusted during training.
• Partition function: The normalizing constant of the Boltzmann distribution.It is obtained by tracing over all possible pairs of visible and hidden vectors: • Restricted Boltzmann Machine: A two-layer network with bidirectionally connected stochastic processing units."Restricted" refers to the connections (or weights) between the visible and hidden units: each visible unit is connected with each hidden unit, but there are no intra-layer connections.
• Visible units: There are n v units in the visible layer of the RBM, denoted by the vector v = (v 1 , . . ., v nv ).These units correspond to the physical degrees of freedom.In the cases considered in this paper, the number of visible units is equal to the number of spins N .
• Weights: W ij is the symmetric connection or interaction between the visible unit v j and the hidden unit h i .

Figure 1 :
Figure 1: The fidelity (left) and the KL divergence (right) during training for the reconstruction of the ground state of the one-dimensional TFIM.

Figure 2 :
Figure 2: Reconstruction of the magnetic observables for the TFIM chain with N = 10 spins.We show the average longitudinal (left) and transverse (right) magnetization per site obtained by sampling from a trained RBM.The dashed line denotes the results from exact diagonalization.

Nj=1
Ûj .This is equivalent to a local change of basis from |σ to {|σ b = |σ b 1 1 , . . ., σ b N N }, where the vector b identifies the local basis b j for each site j.The target wavefunction in the new basis is given by

PFigure 3 :
Figure 3: Unitary rotations for two qubits.(left) Measurements on the reference basis.(right) Measurement in the rotated basis.The unitary rotation (the Hadamard gate on qubit σ 0 ) is applied after state preparation and before the projective measurement.
state .space = nn_state .g e n e r a t e _ h i l b e r t _ s p a c e ( nv ) callbacks = [ MetricEvaluator ( log_every , { " Fidelity " : ts .fidelity , " KL " : ts .KL } , target_psi = true_psi , bases = bases , verbose = True , space = state .space , ) ]The training is carried out by calling the fit function of ComplexWavefunction, given the set of hyperparameters , callbacks = callbacks , )In Fig.4we show the total KL divergence and the fidelity with the true two-qubit state during training.After successfully training a QuCumber model, we can once again compute expectation values of physical observables, as discussed in Section 2.3.

Figure 4 :
Figure 4: Training a complex RBM with QuCumber on random two-qubit data.We show the fidelity (left), and KL divergence (right), as a function of the training epochs.

Funding information
This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canada Research Chair program, and the Perimeter Institute for Theoretical Physics.We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used in this work.Research at Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of Ontario through the Ministry of Research & Innovation.P. H. acknowledges support from ICFOstepstone, funded by the Marie Sklodowska-Curie Co-funding of regional, national and international programmes (GA665884) of the European Commission, as well as by the Severo Ochoa 2016-2019' program at ICFO (SEV-2015-0522), funded by the Spanish Ministry of Economy, Industry, and Competitiveness (MINECO).