QGOpt: Riemannian optimization for quantum technologies

Many theoretical problems in quantum technology can be formulated and addressed as constrained optimization problems. The most common quantum-mechanical constraints, e.g., orthogonality of isometric and unitary matrices, the CPTP property of quantum channels, and conditions on density matrices, can be seen as quotient or embedded Riemannian manifolds. This allows one to use Riemannian optimization techniques for solving quantum-mechanical constrained optimization problems. In the present work, we introduce QGOpt, a library for constrained optimization in quantum technology. QGOpt relies on the underlying Riemannian structure of quantum-mechanical constraints and permits the application of standard gradient-based optimization methods while preserving these constraints. Moreover, QGOpt is written on top of TensorFlow, which enables automatic differentiation to calculate the gradients necessary for optimization. We show two application examples: quantum gate decomposition and quantum channel tomography.


Introduction
Many quantum-mechanical problems can be solved using optimization methods, as illustrated by the following examples. The ground state of a quantum system with Hamiltonian H can be found using the variational method, which is akin to an optimization problem [3]:

|Ω⟩ = argmin_{|ψ⟩} ⟨ψ|H|ψ⟩ / ⟨ψ|ψ⟩,

where |ψ⟩ is a non-normalized trial state and |Ω⟩ is the non-normalized ground state. This formulation of the ground-state search problem was successfully used for the study of many-body quantum systems [4,5]. In particular, the ground state of a correlated spin system can be found in one of the following forms: matrix product states [6][7][8], projected entangled pair states [9,10] or neural networks [11][12][13]. To perform variational energy optimization one can utilize optimization algorithms such as the density matrix renormalization group [14,15] and the time-evolving block decimation [16][17][18] for tensor-network architectures, the quantum natural gradient [19], and adaptive first-order optimization methods like the Adam optimizer [20] for neural-network-based quantum parametrizations. Problems of reconstruction of quantum states, quantum channels and quantum processes from measured data can also be formulated as optimization problems. For example, the state of a many-body quantum system can be reconstructed with neural networks by maximization of the logarithmic likelihood function on a set of measurement outcomes [21][22][23][24]. The Choi matrix of an unknown quantum channel can be reconstructed in a tensor-network form via the minimization of the Kullback-Leibler divergence [25]. Non-Markovian quantum dynamics can be reconstructed from measured data in different ways [26,27] by use of optimization algorithms.
Some quantum-mechanical problems require keeping certain constraints while minimizing or maximizing an objective function. For example, quantum phase transitions can be described using the entanglement renormalization technique, which requires an optimization over matrices with orthogonality constraints, i.e. isometric matrices. To solve this problem, Vidal and Evenbly suggested an algorithm [28][29][30] that does not have analogs in standard optimization theory. Another example of a constrained optimization problem emerging in quantum mechanics is quantum channel tomography. It requires preservation of natural "quantum" constraints, i.e. the completely positive and trace preserving (CPTP) property of quantum channels [31]. Constraint preservation here can be achieved by using a particular parametrization or by adding regularizers that ensure that the constraints are satisfied.
Adding regularizers to a loss function merely provides approximate preservation of constraints, and a naive parametrization may lead to over-parametrization and slow down the optimization. One therefore needs a universal approach to optimization in quantum technology. As many natural "quantum" constraints can be seen as Riemannian manifolds, Riemannian optimization is a well-suited candidate for the role of a universal framework for constrained optimization in quantum mechanics. In the present work, we introduce QGOpt (Quantum Geometric Optimization) [32], a library for Riemannian optimization in quantum mechanics and quantum technologies. It allows one to perform optimization with all typical constraints of quantum mechanics.
This article is organized as follows. In Sec. 2, we give an overview of Riemannian optimization. We then turn to Riemannian manifolds in quantum mechanics in Sec. 3. In Sec. 4, we present the QGOpt application programming interface (API), and we illustrate its use in Sec. 5, with two examples: quantum gate decomposition and quantum channel tomography. In Sec. 6, we also show that QGOpt can handle optimization over an arbitrary Cartesian product of manifolds.

Overview of the Riemannian optimization
While optimizing an objective function defined on a Euclidean space, one performs a sequence of elementary operations such as transporting points and vectors. We call these elementary operations optimization primitives. For example, one iteration of the simplest gradient descent method involves an update of the current estimate x_t of the optimal point as follows:

x_{t+1} = x_t + v_t,

where v_t = −η∇f(x_t) is a vector tending to improve the current estimate, t is the number of previous iterations, and η is the step size. This update can be seen as a transportation of a point x_t along a vector v_t. More sophisticated algorithms may require keeping additional information about the optimization landscape in terms of vectors {m_t} associated with the current point x_t. These vectors should be transported together with x_t to a new point and then updated according to a particular algorithm. However, as transportation of vectors in a Euclidean space is the trivial identity transformation, it may be safely skipped. Optimization on curved spaces requires a generalization of these optimization primitives. As an example of an optimization algorithm we consider gradient descent with momentum [33] and its Riemannian generalization [34,35]. We keep our overview simple; for an in-depth introduction to the topic, we recommend references [36,37].
Let us assume that we aim to minimize the value of a function f : R^n → R, and that we have access to its gradient ∇f(x). In the Euclidean space R^n, gradient descent with momentum consists of the following steps wrapped into a loop:

1. Updating the momentum vector: m_{t+1} = β m_t + ∇f(x_t),

2. Taking a step along the direction of the momentum vector: x_{t+1} = x_t − η m_{t+1},

where the initial momentum vector m_0 is the null vector, β is a hyperparameter whose value is usually taken around β ≈ 0.9, and η is the size of the optimization step.
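The two steps above can be sketched in a few lines of NumPy; the quadratic test function and all parameter values here are illustrative choices, not taken from the text.

```python
import numpy as np

# Minimal sketch of Euclidean gradient descent with momentum. The quadratic
# test function f(x) = ||x - x_star||^2 / 2 is an illustrative choice.
def momentum_descent(grad_f, x0, eta=0.1, beta=0.9, n_steps=500):
    x, m = x0, np.zeros_like(x0)
    for _ in range(n_steps):
        m = beta * m + grad_f(x)   # 1. update the momentum vector
        x = x - eta * m            # 2. step along the momentum direction
    return x

x_star = np.array([1.0, -2.0, 3.0])
grad_f = lambda x: x - x_star      # gradient of the quadratic above
x_opt = momentum_descent(grad_f, np.zeros(3))   # converges to x_star
```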
Let us now assume that a function f is defined on a Riemannian manifold M embedded in a Euclidean space: f : M → R. Then we can no longer apply the standard scheme of gradient descent with momentum, because it clearly takes x_t out of the manifold M. This scheme can be generalized step by step. First, we have to generalize the notion of a gradient. The standard Euclidean gradient is not a tangent vector to a manifold and does not take the metric of a manifold into account. One may instead introduce the Riemannian gradient, which can be constructed from the standard gradient ∇f(x). The Riemannian gradient lies in the tangent space at a point x and properly takes the metric of the tangent space into account. However, even if an optimization algorithm takes a step along a vector tangent to a manifold, it still takes the point out of the manifold. In order to fix this issue, one can replace the straight-line step with a proper curved-line step. In Riemannian geometry the generalization of the straight-line step is given by the notion of the exponential map, which reads

x_out = Exp_{x_in}(v) = γ(1),

where γ(t) is a geodesic [38] such that γ(0) = x_in and dγ(t)/dt|_{t=0} = v, x_in is an initial point on the manifold, and x_out is a final point. However, in practice the calculation of a geodesic is often computationally too inefficient. In these cases, one can use a retraction instead of the exponential map, which is a first-order approximation of a geodesic [38]:

x̃_out = R_{x_in}(v),

where x̃_out also lies in the manifold and ‖x̃_out − x_out‖ = O(‖v‖²). A retraction is not unique, so it can be chosen to be computationally efficient. Gradient descent with momentum also requires transporting the momentum vector at each iteration from the previous point to the new one. The Euclidean version of gradient descent with momentum does not have an explicit step with transportation of the momentum vector, because in a Euclidean space transportation of a vector is trivial.
However, this step is necessary in the Riemannian case, where the trivial Euclidean vector transportation takes a vector out of the tangent space. A vector transport τ_{x,w}(v) is the result of transporting a vector v along a vector w; it takes into account that the tangent space varies from one point of the manifold to another in the Riemannian case. The overall Riemannian generalization of gradient descent with momentum can be summarized as follows:

1. Updating the momentum vector: m_{t+1} = β m_t + ∇_R f(x_t), where ∇_R f(x_t) is the Riemannian gradient,

2. Taking a step along the new direction of the momentum: x_{t+1} = R_{x_t}(−η m_{t+1}),

3. Transporting the momentum vector to the new point x_{t+1}: m_{t+1} = τ_{x_t, −η m_{t+1}}(m_{t+1}).
Note that other first-order optimization methods can be generalized in a similar fashion.
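To make the three Riemannian primitives concrete, here is a minimal NumPy sketch on the unit sphere, the simplest embedded manifold; the sphere, the normalization retraction, the projection-based vector transport and the Rayleigh-quotient test function are all illustrative choices rather than parts of the QGOpt API.

```python
import numpy as np

# Sketch of Riemannian gradient descent with momentum on the unit sphere.
# The test function f(x) = x^T H x (Rayleigh quotient) attains its minimum
# over the sphere at the eigenvector of the smallest eigenvalue of H.
def proj(x, v):
    # orthogonal projection onto the tangent space at x
    return v - np.dot(x, v) * x

def retract(x, v):
    # normalization retraction: a first-order approximation of the geodesic
    y = x + v
    return y / np.linalg.norm(y)

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 5))
H = (H + H.T) / 2                      # symmetric test matrix
egrad = lambda x: 2 * H @ x            # Euclidean gradient of x^T H x

x = rng.normal(size=5)
x /= np.linalg.norm(x)
m, eta, beta = np.zeros(5), 0.02, 0.9
for _ in range(3000):
    m = beta * m + proj(x, egrad(x))   # 1. momentum update with the Riemannian gradient
    x_new = retract(x, -eta * m)       # 2. retraction step along the momentum
    m = proj(x_new, m)                 # 3. vector transport: re-project the momentum
    x = x_new
# x now approximates the eigenvector of the smallest eigenvalue of H
```

Here the vector transport is simply the orthogonal projection onto the new tangent space, a common cheap choice for embedded manifolds.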

Riemannian manifolds in quantum mechanics
Many objects of quantum mechanics can be seen as elements of smooth manifolds. However, their mathematical description, suitable for numerical algorithms, may involve some abstract constructions that should be clarified. In this section we consider an illustrative example of the set of Choi matrices and describe this set as a smooth quotient manifold. We restrict our consideration to a plain description of all necessary mathematical concepts. At the end of the section, we also list all the manifolds implemented in the QGOpt library and describe their possible use.
The evolution of any quantum system that interacts with its environment can be described by a quantum channel. Here, we consider quantum channels defined as CPTP linear maps Φ : C^{n×n} → C^{n×n}. Any quantum channel can be represented through its Choi matrix [31]. A Choi matrix is a positive semi-definite operator C ∈ C^{n²×n²} subject to the constraint Tr_p(C) = 1, where Tr_p is the partial trace over the first subsystem and 1 is the identity matrix. To make the notion of the partial trace less abstract, let us consider a piece of TensorFlow code that computes the partial trace of a Choi matrix. First, we apply a reshape operation that changes the shape of the matrix as follows:

C_resh = tf.reshape(C, (n, n, n, n))

The tensor C_resh ∈ C^{n×n×n×n} is an alternative representation of the Choi matrix. Further in the text, we distinguish two equivalent representations of a Choi matrix: C and C_resh. The partial trace of a Choi matrix can be calculated using C_resh as [Tr_p(C)]_{i1 i2} = Σ_j [C_resh]_{i1 j i2 j}. Practically it can be done by running

tf.einsum('ikjk->ij', C_resh)

which means that we take a trace over the first and third indices (with numeration of indices starting from 0).
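The same computation can be sketched in plain NumPy and checked on the Choi matrix of the identity channel, whose partial trace must equal the identity matrix (the identity-channel example is ours, chosen because the expected result is known in closed form).

```python
import numpy as np

# Partial trace via reshape + einsum, as in the text but in NumPy.
# Test case: Choi matrix of the identity channel, C = |Psi+><Psi+|,
# with |Psi+> = sum_i |i> ⊗ |i>; its partial trace is the identity.
n = 3
psi = np.eye(n).reshape(n * n)        # components of |Psi+>: delta_{ij}
C = np.outer(psi, psi.conj())         # Choi matrix, shape (n^2, n^2)

C_resh = C.reshape(n, n, n, n)        # alternative representation C_resh
Tr_p = np.einsum('ikjk->ij', C_resh)  # trace over indices 1 and 3
```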
The Choi–Jamiołkowski isomorphism [39] establishes a one-to-one correspondence between quantum channels and Choi matrices. One can calculate the Choi matrix of a known quantum channel as

C = (Φ ⊗ Id)(|Ψ⁺⟩⟨Ψ⁺|),   (4)

where |Ψ⁺⟩ = Σ_{i=1}^n |i⟩ ⊗ |i⟩ and {|i⟩}_{i=1}^n is an orthonormal basis in C^n. In order to show that the Choi matrix is a quantum channel itself, we consider the representation of Eq. (4) in terms of tensor diagrams [40,41]. The reshaped version of a Choi matrix [C_resh]_{i1 j1 i2 j2} is shown in Fig. 1. The tensor diagrams in Fig. 1 show that |Ψ⁺⟩ and 1 in the definition of the Choi matrix lead only to a relabeling of multi-indices.
The set of all Choi matrices of size n² × n² (the corresponding quantum channel acts on density matrices of size n × n) is the following subset of C^{n²×n²}:

C_n = {C ∈ C^{n²×n²} | C ≥ 0, Tr_p(C) = 1},   (5)

where C ≥ 0 and Tr_p(C) = 1 correspond to the CPTP property of the corresponding quantum channel. This subset can be described as a Riemannian manifold, which admits different Riemannian optimization algorithms. In order to describe C_n as a Riemannian manifold, we may parametrize the Choi matrix with an auxiliary matrix A ∈ C^{n²×n²}:

C = AA†.   (6)

Figure 1: a) Diagrammatic representation of the Choi matrix. The block denoted by 1 represents the identity map in the definition of the Choi matrix. b) One can note that the state of a two-component quantum system |Ψ⁺⟩ can be seen as the identity matrix. c) Finally, we note that the Choi matrix is a quantum channel itself.

The matrix C is positive semi-definite by construction. We also distinguish A ∈ C^{n²×n²} and its reshaped version A_resh ∈ C^{n×n×n²}, which are connected by the reshape operation. The condition on the partial trace of a Choi matrix transforms into the following equality:

Σ_{k,j} [A_resh]_{k i1 j} ([A_resh]_{k i2 j})* = δ_{i1 i2},   (7)

and its diagrammatic form is given in Fig. 2. One can see that if in Eq. (7) we recast the two indices k and j into one index q, we end up with the following relation:

Σ_q [A_resh]_{q i1} ([A_resh]_{q i2})* = δ_{i1 i2},   (8)

which means that [A_resh]_{qi} is an isometric matrix and the corresponding tensor [A_resh]_{kij} is a reshaped isometric matrix. We call such a tensor, obtained by reshaping an isometric matrix, an isometric tensor. The set of all complex isometric matrices of fixed size forms a Riemannian manifold called the complex Stiefel manifold [42], which we denote St. Equations (7) and (8) and the diagram in Fig. 3 show that the set of tensors A_resh can be seen as a complex Stiefel manifold.
At first glance, it looks like we have shown that the set of Choi matrices can be seen as a Stiefel manifold, but there is a problem that invalidates this statement: the matrices A and AQ, where Q is an arbitrary unitary matrix, correspond to the same Choi matrix; in other words, we have an equivalence relation AQ ∼ A. Indeed,

AQ(AQ)† = AQQ†A† = AA† = C.   (9)

A diagrammatic version of Eq. (9) is depicted in Fig. 4. It shows that for any A there is a family of equivalent matrices [A] = {AQ | Q ∈ C^{n²×n²}, Q†Q = QQ† = 1}, which is the equivalence class of A and leads to the same Choi matrix. One can eliminate this symmetry by turning to the quotient manifold St/Q = {[A] | A ∈ St}, which consists of equivalence classes. This rather abstract construction can be imagined as a projection of a manifold along surfaces representing equivalence classes (see Fig. 5). Having the map π(A) = [A] and a horizontal lift [36] that connects tangent spaces of St/Q with tangent spaces of St, one can describe the abstract manifold St/Q through St. The quotient manifold St/Q can be further identified with the set of Choi matrices C_n. This allows one to perform Riemannian optimization on C_n by using the parametrization C = AA†. Mathematical details of this construction are given in Appendix A. This example of a quotient manifold, representing Choi matrices through their parametrization, shows all the necessary steps that emerge while building a mathematical description of quantum-mechanical manifolds. The set of all manifolds implemented in the QGOpt library is listed below.
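The invariance C = AA† = (AQ)(AQ)† is easy to verify numerically; the following NumPy sketch (our illustration, not QGOpt code) confirms that A and AQ represent the same Choi matrix although they are different points of the Stiefel manifold.

```python
import numpy as np

# A and AQ (Q unitary) give the same Choi matrix C = A A^†, so the
# parametrization descends to equivalence classes [A] on the quotient.
rng = np.random.default_rng(1)
d = 4
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# random unitary Q from the QR decomposition of a random complex matrix
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

C1 = A @ A.conj().T
C2 = (A @ Q) @ (A @ Q).conj().T
# C1 == C2 even though A != AQ
```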
• The complex Stiefel manifold St_{n,p} = {V ∈ C^{n×p} | V†V = 1} is the set of all isometric matrices of fixed size. A particular case of this manifold is the set of all unitary matrices of fixed size; therefore, this manifold can be used for different tasks related to quantum control. Some architectures of tensor networks may include isometric matrices as building blocks [43,44]; thus, one can use this manifold to optimize such tensor networks.
• The manifold of density matrices of fixed rank ϱ_{n,r} = {ϱ ∈ C^{n×n} | ϱ = ϱ†, Tr(ϱ) = 1, ϱ ≥ 0, rank(ϱ) = r} is the set of all fixed-rank Hermitian positive semi-definite matrices with unit trace. Since density matrices represent states of quantum systems, one can use this manifold to perform state tomography and optimization of initial quantum states in different quantum circuits. This manifold is implemented through a parametrization with a quotient structure on top of it.

Figure 5: The red curve represents a particular equivalence class F, which is also called a fiber.
• The manifold of Choi matrices of fixed rank C_{n,r} = {C ∈ C^{n²×n²} | C = C†, Tr_p(C) = 1, C ≥ 0, rank(C) = r} is the set of all fixed-rank Hermitian positive semi-definite matrices with an auxiliary linear constraint (equality of the partial trace to the identity matrix). Choi matrices are used as representations of quantum channels; hence, one may use this manifold to perform quantum channel tomography and optimization of quantum channels in different quantum circuits. This manifold is implemented through a parametrization with a quotient structure on top of it.
• The manifold of Hermitian matrices H_n = {H ∈ C^{n×n} | H = H†} is a linear subspace of the space C^{n×n}. Since Hermitian matrices represent measurable physical operators in quantum theory, one can use this manifold to search for optimal measurable physical operators in different problems.
• The manifold of Hermitian positive definite matrices S_n^{++} = {S ∈ C^{n×n} | S = S†, S ≻ 0} is the set of all positive definite matrices of fixed size. One can use it to search for the optimal non-normalized quantum state in different tasks.
• The manifold of positive operator-valued measures (POVMs) with full-rank elements {E_i}_{i=1}^m, where E_i ≥ 0, Σ_{i=1}^m E_i = 1 and rank(E_i) = n, can be considered as a tensor with Hermitian positive semi-definite full-rank slices that sum to the identity matrix. Since POVMs describe generalized measurements in quantum theory, one can use this manifold to search for the optimal measurements that give the largest information gain. This manifold is implemented through a parametrization with a quotient structure on top of it.
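As a hedged illustration of how such a parametrization can work (a sketch of the general idea, not necessarily QGOpt's internal parametrization): a POVM with m elements on C^n can be obtained from an isometry, since splitting an isometric matrix V ∈ C^{mn×n} into n×n blocks V_i and setting E_i = V_i†V_i yields E_i ≥ 0 with Σ_i E_i = V†V = 1.

```python
import numpy as np

# Build a random POVM from an isometry (illustrative parametrization).
rng = np.random.default_rng(2)
n, m_el = 2, 4
X = rng.normal(size=(m_el * n, n)) + 1j * rng.normal(size=(m_el * n, n))
V, _ = np.linalg.qr(X)                 # isometry: V^† V = 1 (reduced QR)

blocks = V.reshape(m_el, n, n)         # blocks V_i of shape (n, n)
# E_i = V_i^† V_i: Hermitian, PSD, and summing to the identity
E = np.einsum('kij,kil->kjl', blocks.conj(), blocks)
```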
Mathematical details of the implementation of manifolds are given in Appendix A.

Manifolds API
In this section we discuss the API of version 1.0.0 of the QGOpt library. The central class of the QGOpt library is the manifold base class; all particular manifold types are inherited from it, and all manifold subclasses admit working with the direct product of several manifolds. The optimization primitives of each particular manifold are implemented as methods of the corresponding class. These methods free the user from paying particular attention to the details of the underlying Riemannian geometry. Let us consider basic illustrative examples. First, one needs to import all the necessary libraries and create a manifold instance. As an example we consider the complex Stiefel manifold.
import QGOpt as qgo
import tensorflow as tf

m = qgo.manifolds.StiefelManifold()

Here, m is an instance of the complex Stiefel manifold that contains all the necessary information on the manifold's geometry. Some manifolds also allow one to specify a type of metric and retraction. Using this manifold instance, one can sample a random point from the manifold:

u = m.random((4, 3, 2), dtype=tf.complex128)

Here, we sample a random tensor u, a complex-valued TensorFlow tensor of size 4 × 3 × 2. This tensor represents a point from the direct product of four complex Stiefel manifolds. The first index of this tensor enumerates a manifold and the last two indices are matrix indices; therefore, the tensor u can be seen as a set of four isometric matrices. One can generate a random tangent vector at u:

v = m.random_tangent(u)

Here, v is a complex-valued TensorFlow tensor of the same size and type as u, representing a random vector tangent at u. Now let us assume that we have a random vector w of the same size and type, but not tangent at u. One can orthogonally project this vector onto the tangent space at u:

w = m.proj(u, w)

The updated vector w is now an element of the tangent space at u. The projection method of a quotient manifold performs the projection onto the horizontal space. To get the scalar product of two tangent vectors one can use the following line of code:

m.inner(u, v, w)

Here we pass u to the inner-product method to specify the tangent space in which we compute the inner product, because in Riemannian geometry the metric, and hence the inner product, is point-dependent in general.
To implement first-order Riemannian optimization methods on a manifold, one needs to be able to move points and vectors along the manifold. The retraction and vector-transport methods serve this purpose. As an example, let us move a point u along a tangent vector v via the retraction map:

u_tilde = m.retraction(u, v)

The new point u_tilde is the result of transporting u along the vector v. To transport a vector along some other vector one can run:

v_tilde = m.vector_transport(u, v, w)

Here we start from the point u and transport a tangent vector v along a tangent vector w, obtaining v_tilde, the result of the vector transport.
The last important method converts the Euclidean gradient of a function into the Riemannian gradient, which takes into account the metric of the manifold and the tangent space at a given point. To calculate the Riemannian gradient one can use:

r = m.egrad_to_rgrad(u, e)

where we denote the Euclidean gradient by e and the Riemannian gradient by r.
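For the complex Stiefel manifold with the Euclidean metric, this conversion reduces to the orthogonal projection of the Euclidean gradient onto the tangent space; the NumPy sketch below illustrates the idea (it mirrors, but is not, QGOpt's implementation).

```python
import numpy as np

# Euclidean-to-Riemannian gradient conversion on the complex Stiefel
# manifold with the Euclidean metric: projection onto the tangent space
# T_u St = {xi : u^† xi + xi^† u = 0}.
def herm(x):
    return (x + x.conj().T) / 2

def egrad_to_rgrad(u, egrad):
    return egrad - u @ herm(u.conj().T @ egrad)

rng = np.random.default_rng(3)
u, _ = np.linalg.qr(rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3)))
e = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
r = egrad_to_rgrad(u, e)
# r is tangent at u: u^† r + r^† u = 0
```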
The numerical complexity of each optimization primitive varies from one manifold to another. The complexity of all primitives is summarized in Appendix B.

Optimizers
The Riemannian optimizers implemented in QGOpt are inherited from TensorFlow optimizers and hence have the same API. The main difference is that one should also pass a manifold instance while defining an optimizer; it guides the optimizer and preserves the manifold's constraints. Two optimizers are implemented, which are among the most popular in machine learning: Riemannian versions of Adam [20] and SGD [45].
If m is a manifold instance and lr is a learning rate (optimization step size), then the Adam and SGD optimizers can be initialized as:

opt = qgo.optimizers.RAdam(m, lr)
opt = qgo.optimizers.RSGD(m, lr)

Note that some other attributes, like the momentum value of the SGD optimizer or the AMSGrad modification of the Adam optimizer, can also be specified.

Auxiliary functions
It is important to keep in mind that TensorFlow optimizers work well only with real variables. Therefore, one cannot use complex variables to represent a point on a manifold that is being tuned during optimization. The simplest way of representing a point of a complex manifold through real tensors is to introduce an additional index that enumerates the real and imaginary parts of a tensor. For example, a complex-valued tensor of shape (a, b, c) can be represented as a real-valued tensor of shape (a, b, c, 2). During calculations, we need to convert tensors from their real representation to their complex representation and back.
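This convention can be sketched in a few lines of NumPy (a minimal illustration of the representation described above; QGOpt's own converters are real_to_complex and complex_to_real).

```python
import numpy as np

# Real <-> complex conversion: the trailing index of size 2
# enumerates the real and imaginary parts of the tensor.
def real_to_complex(w):
    return w[..., 0] + 1j * w[..., 1]

def complex_to_real(wc):
    return np.stack([wc.real, wc.imag], axis=-1)

w = np.random.default_rng(4).normal(size=(4, 3, 2))
wc = real_to_complex(w)   # complex tensor of shape (4, 3)
wr = complex_to_real(wc)  # back to the real representation, wr == w
```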
Let us initialize a complex-valued tensor as a point from a manifold using the method "random". In order to make this tensor a variable suitable for an optimizer, one needs to convert it to the real representation. Then, while building a computational graph, one may need the complex form of the tensor again:

# a random real tensor; the last index enumerates
# real and imaginary parts
w = tf.random.normal((4, 3, 2), dtype=tf.float64)
# corresponding complex tensor of shape (4, 3)
wc = qgo.manifolds.real_to_complex(w)
# corresponding real tensor (wr = w)
wr = qgo.manifolds.complex_to_real(wc)

Examples of application of QGOpt

Quantum gate decomposition
In this subsection we consider an illustrative example of quantum gate decomposition. It is known that any two-qubit quantum gate U can be decomposed as [46]

U = (ũ_{41} ⊗ ũ_{42}) U_CNOT (ũ_{31} ⊗ ũ_{32}) U_CNOT (ũ_{21} ⊗ ũ_{22}) U_CNOT (ũ_{11} ⊗ ũ_{12}),

where U_CNOT is the CNOT gate and {ũ_ij}_{i,j=1}^{4,2} is a set of unknown one-qubit gates. Since the set {ũ_ij}_{i,j=1}^{4,2} can be seen as the direct product of 8 complex Stiefel manifolds, one can use Riemannian optimization methods to find all ũ_ij. First, we randomly initialize a trial set {u_ij}_{i,j=1}^{4,2} that will be tuned by Riemannian optimization methods. For simplicity, we denote the decomposition introduced above by D({u_ij}). The optimal set of one-qubit gates can then be expressed as

{ũ_ij} = argmin_{u_ij} ‖D({u_ij}) − U‖²_F,

where each u_ij obeys the unitarity constraint and ‖·‖_F is the Frobenius norm. Before considering the main part of the code that solves the above problem, we need to introduce a function that calculates the Kronecker product of two matrices. We use a randomly generated target gate that we want to decompose,

U = m.random((4, 4), dtype=tf.complex128)

and initialize the trial set {u_ij}_{i,j=1}^{4,2} randomly as a rank-4 tensor,

u = m.random((4, 2, 2, 2), dtype=tf.complex128)

The first two indices of this tensor enumerate a particular one-qubit gate; the last two indices are the matrix indices of a gate. We turn this tensor into its real representation in order to make it suitable for an optimizer and wrap it into a TensorFlow variable. After initializing the CNOT gate U_CNOT, the final step is to minimize the loss function L = ‖D({u_ij}) − U‖²_F. We calculate the gradient of L with respect to the set {u_ij}_{i,j=1}^{4,2} using automatic differentiation,

grad = tape.gradient(L, u)

and pass the gradient to the optimizer:

opt.apply_gradients(zip([grad], [u]))
The Adam optimizer performs one optimization step while keeping the orthogonality constraints. We repeat the forward pass, the gradient calculation and the optimization step several times, wrapping them into a for loop, until convergence, and end up with a proper decomposition of the gate U. The optimization result is given in Fig. 6. One can see that at the end of the optimization process the error is completely negligible. This section is available in the form of a tutorial in the QGOpt online documentation [47].
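The structure of the decomposition map D can be sketched in NumPy as follows; the specific ordering of the single-qubit layers relative to the CNOT gates is our illustrative assumption, and the check below only verifies that the composed matrix is a valid two-qubit unitary.

```python
import numpy as np

# Sketch of the decomposition map D({u_ij}): four layers of two one-qubit
# gates interleaved with three CNOT gates (layer ordering is illustrative).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def rand_u2(rng):
    # random 2x2 unitary from the QR decomposition of a random complex matrix
    q, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
    return q

def D(u):
    # u has shape (4, 2, 2, 2): four layers of two one-qubit gates
    out = np.kron(u[0, 0], u[0, 1])
    for i in range(1, 4):
        out = np.kron(u[i, 0], u[i, 1]) @ CNOT @ out
    return out

rng = np.random.default_rng(5)
u = np.array([[rand_u2(rng) for _ in range(2)] for _ in range(4)])
# D(u) is a two-qubit gate built from 3 CNOTs and 8 one-qubit gates
```

In the optimization described above, exactly such a map (written in TensorFlow) is evaluated at each forward pass, and the unitarity of each u_ij is maintained by the Riemannian optimizer rather than by the map itself.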

Quantum tomography
Another typical problem that can be addressed by Riemannian optimization is the quantum tomography of states [48,49] and channels [50,51]. Here, we consider an example of quantum tomography of channels, because it involves a more complicated structure than quantum tomography of states.

Figure 6: Frobenius distance between a gate and its decomposition. One can see that the distance rapidly decreases with the number of iterations towards nearly zero within machine precision.

Let H = ⊗_{i=1}^n C² be the Hilbert space of a system consisting of n qubits. Let us assume that one has a set of input states {ρ_i}_{i=1}^N, where N is the total number of states and each ρ_i is a density matrix on H. One passes the initial states through an unknown quantum channel Φ_true and observes a set of measurement outcomes {M^tetra_{k_i}}_{i=1}^N, where each M^tetra_k is an element of the tetrahedral POVM [52]:

M^tetra_k = (1/4)(1 + s_k · σ),  σ = (σ_x, σ_y, σ_z),
s_0 = (0, 0, 1),  s_1 = (2√2/3, 0, −1/3),  s_2 = (−√2/3, √(2/3), −1/3),  s_3 = (−√2/3, −√(2/3), −1/3).

One can estimate the unknown channel by maximizing the logarithmic likelihood of the measurement outcomes:

Φ_est = argmax_Φ Σ_{i=1}^N log Tr[Φ(ρ_i) M^tetra_{k_i}].

For simplicity, we assume that the many-body tetrahedral POVM M is already predefined and has the shape (2^{2n}, 2^n, 2^n), where the first index enumerates the POVM elements. We also assume that we have a data set consisting of a set of initial density matrices of shape (N, 2^n, 2^n) and a set of POVM elements of the same shape that came true after measurements.
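The single-qubit tetrahedral POVM can be built and checked in a few lines of NumPy (our illustration; the paper's code constructs the many-body POVM in TensorFlow).

```python
import numpy as np

# Single-qubit tetrahedral POVM: M_k = (1/4)(1 + s_k . sigma), with Bloch
# vectors s_k pointing at the vertices of a regular tetrahedron.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = np.array([sx, sy, sz])

s = np.array([[0, 0, 1],
              [2 * np.sqrt(2) / 3, 0, -1 / 3],
              [-np.sqrt(2) / 3, np.sqrt(2 / 3), -1 / 3],
              [-np.sqrt(2) / 3, -np.sqrt(2 / 3), -1 / 3]])

M = np.array([(np.eye(2) + np.einsum('a,aij->ij', sk, sigma)) / 4 for sk in s])
# the four elements are positive semi-definite and sum to the identity
```

Since the Bloch vectors sum to zero and have unit length, each M_k has eigenvalues 0 and 1/2, and the four elements resolve the identity.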
In our experiments, the unknown channel has Kraus rank 2 and is generated randomly; the initial density matrices are pure and also generated randomly. Let us proceed with the practical implementation. First, we define an instance of the quotient manifold equivalent to the manifold of Choi matrices:

m = qgo.manifolds.ChoiMatrix()

Elements of this manifold are connected with Choi matrices via the relation (6). We randomly initialize a point from the quotient manifold:

# random initial parametrization
A = m.random((2**(2*n), 2**(2*n)), dtype=tf.complex128)
# the variable should be real
# to make an optimizer work correctly
A = qgo.manifolds.complex_to_real(A)

Inside the training loop the parametrization is converted back to its complex form Ac and reshaped,

# reshape the parametrization
# (2**2n, 2**2n) --> (2**n, 2**n, 2**2n)
Ac = tf.reshape(Ac, (2**n, 2**n, 2**(2*n)))

and then used to compute the logarithmic likelihood. The complexity of the code above can be reduced by choosing the optimal order of tensor contractions; however, the code becomes more complicated in that case, which is not suitable for a tutorial. Finally, we calculate the gradient of the logarithmic likelihood with respect to the parametrization of the Choi matrix,

grad = tape.gradient(L, A)

and apply the optimizer to make an optimization step that does not violate the CPTP constraints:

opt.apply_gradients(zip([grad], [A]))
We repeat the calculation of the logarithmic likelihood, the gradient calculation and the optimization step several times, wrapping them into a for loop, until convergence is reached. To evaluate the quality of the estimation of the unknown quantum channel, we calculate the Jamiołkowski process distance [53]:

J(Φ_true, Φ_est) = (1/2n) ‖C_true − C_est‖_tr,

where Φ_true (Φ_est) is the true (estimated) quantum channel, C_true (C_est) is the corresponding Choi matrix, ‖·‖_tr is the trace norm, and 0 ≤ J(Φ_true, Φ_est) ≤ 1. One can see in Fig. 7 that the Jamiołkowski process distance converges to a small value with the number of iterations, and we end up with a reasonable estimation of the unknown quantum channel. This section is available in the QGOpt online documentation in the form of a tutorial [47].
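A NumPy sketch of this figure of merit for two unitary channels is given below; the normalization J = ‖C_true − C_est‖_tr / (2n) is our assumption for this illustration (with Tr C = n, as follows from Tr_p(C) = 1), chosen so that identical channels give 0 and orthogonal unitary channels give 1.

```python
import numpy as np

# Jamiolkowski-style distance between two channels via their Choi matrices.
# The 1/(2n) normalization is an assumption made for this illustration.
def choi_of_unitary(U):
    n = U.shape[0]
    psi = U.reshape(n * n)          # vectorization of U plays the role of (U ⊗ 1)|Psi+>
    return np.outer(psi, psi.conj())

def j_dist(C1, C2):
    n = int(round(np.sqrt(C1.shape[0])))
    sv = np.linalg.svd(C1 - C2, compute_uv=False)   # trace norm = sum of singular values
    return sv.sum() / (2 * n)

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
# identical channels are at distance 0; the identity and bit-flip
# channels (orthogonal Choi matrices) are at distance 1
```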
Optimization over an arbitrary Cartesian product of manifolds
In general, it is possible to perform optimization over the Cartesian product of different manifolds. The QGOpt library allows one to find

x_opt = argmin_{x ∈ M} f(x),

where M is an arbitrary Cartesian product of manifolds implemented in the QGOpt library, and f is a function that can be evaluated within the TensorFlow framework.

Discussion and concluding remarks
The range of application of the QGOpt library to problems of quantum technology is not limited to quantum gate decomposition and quantum tomography. The six manifolds implemented in QGOpt give rise to different interesting scenarios of constrained optimization in quantum technology. For example, the complex Stiefel manifold can be used to address different control problems [54][55][56] where one needs to find an optimal set of unitary gates driving a quantum system to a desirable quantum state. It is also possible to use the complex Stiefel manifold to perform entanglement renormalization [43,44], machine learning with unitary tensor networks [57] or non-Markovian quantum dynamics identification [26]. In addition, the quotient manifolds of density matrices and Choi matrices can be used to perform quantum tomography or to maintain natural quantum constraints in different tensor-network architectures. The quotient manifold of POVMs can be used to search for an optimal generalized measurement scheme with maximum information gain. Finally, all these manifolds can be combined in one optimization task, which allows one to address multi-component problems.
Although as of yet the QGOpt library includes only first-order optimization methods, we plan to extend the list of optimizers by including quasi-Newton methods such as the Riemannian BFGS [58], and the recently developed quantum natural gradient descent [19] generalized to the case of embedded and quotient manifolds.
To conclude, we have introduced the QGOpt library aimed at solving constrained optimization problems with natural quantum constraints. We have introduced and discussed the abstract concepts, such as quotient manifolds, that work under the hood of QGOpt. We have gone through the QGOpt API, covered its most important features, and worked through code examples solving illustrative quantum technology problems. We thank the referees for their useful reports on the manuscript.
Funding information I.A.L. and S.N.F. thank the Foundation for the Advancement of Theoretical Physics and Mathematics "BASIS" for support under Project No. 19-1-2-66-1.

A Underlying geometry of manifolds implemented in the QGOpt library
In this appendix, we consider some mathematical aspects of the implementation of manifolds in the QGOpt library. First, we discuss how one can identify complex matrices, which are elements of all manifolds implemented in the QGOpt library, with real matrices. Any complex matrix A = X + iY, with X and Y real, can be represented as the real matrix

( X  −Y ; Y  X ).

The following correspondences between operations with complex matrices and operations with their real representations allow us to work with certain sets of complex matrices as with Riemannian manifolds of real matrices [59]. The QGOpt library contains six manifolds: three implemented as embedded manifolds and three implemented as quotient manifolds. Table 1 summarizes the geometry lying under the hood of the high-level description of the embedded manifolds in the QGOpt library.
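This identification is easy to verify numerically; the sketch below uses the standard block convention A = X + iY ↦ ( X −Y ; Y X ) (the library's exact convention may differ by a basis permutation) and checks that matrix multiplication and Hermitian conjugation commute with it.

```python
import numpy as np

# Realification of a complex matrix: A = X + iY  ->  [[X, -Y], [Y, X]].
def realify(A):
    X, Y = A.real, A.imag
    return np.block([[X, -Y], [Y, X]])

rng = np.random.default_rng(6)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
# the identification is a homomorphism:
# realify(A @ B) == realify(A) @ realify(B)
# and Hermitian conjugation maps to transposition:
# realify(A.conj().T) == realify(A).T
```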
We begin with the three embedded manifolds, whose geometry is summarized in Table 1.

Complex Stiefel manifold. Two types of inner product are available in the QGOpt library: the Euclidean and the canonical inner products [42]. These two types of inner product induce the same orthogonal projection onto the tangent space [42]. The Riemannian gradient for the Euclidean inner product takes the form $\mathrm{grad}\,f = E - V\,\mathrm{herm}(V^\dagger E)$, while for the canonical inner product it takes the form $\mathrm{grad}\,f = E - V E^\dagger V$, where $V$ is a point on the manifold, $E$ is the Euclidean gradient at $V$, and $\mathrm{herm}(M) = \frac{1}{2}(M + M^\dagger)$ [42]. Three types of retraction are available in the QGOpt library: the SVD-based retraction [37], the QR-based retraction [37], and the Cayley retraction [35,37]. Vector transport is induced by a retraction: it is implemented as the orthogonal projection of a vector onto the tangent space at the point obtained via the retraction [35,37].

Manifold of Hermitian positive-definite matrices. Two types of inner product are available in the QGOpt library: the Log-Cholesky inner product [43,60] and the Log-Euclidean inner product [43]. Both inner products keep the manifold complete, and the Riemannian gradient for both inner products is derived in [43]. Instead of a retraction, one uses the exponential map for both inner products; its closed form can be found in [43] for the Log-Euclidean inner product and in [43,60] for the Log-Cholesky inner product. Instead of a vector transport, one uses the parallel transport for both inner products; its closed form can be found in [43] for the Log-Euclidean metric and in [43,60] for the Log-Cholesky metric.

Manifold of Hermitian matrices $H_n$. This is an embedded manifold of Hermitian matrices that is also a linear subspace of the ambient space. Only the Euclidean inner product is available in the QGOpt library; it reads $\langle v, w \rangle = \mathrm{Re}\,\mathrm{Tr}(v^\dagger w)$. Instead of a vector transport, one uses the parallel transport, which is the identity transformation.

We now summarize the geometry of the manifolds that are implemented as quotient manifolds. In our summary, we follow the book by Nicolas Boumal [36], which provides a very instructive presentation of optimization on quotient manifolds.

Having an optimization problem on a quotient manifold, one works with two sets $\overline{\mathcal{M}}$ and $\mathcal{M}$ that are connected as
$$\mathcal{M} = \overline{\mathcal{M}}/\!\sim\; = \{[x] \mid x \in \overline{\mathcal{M}}\},$$
where $[x] = \{y \mid y \in \overline{\mathcal{M}},\ y \sim x\}$ is the equivalence class of $x$, $\overline{\mathcal{M}}$ is some Riemannian manifold (the total manifold), and $\mathcal{M}$ is its quotient. We call a map $\pi$ the canonical projection if it maps any $x$ from $\overline{\mathcal{M}}$ to its equivalence class: $\pi(x) = [x]$. For $\mathcal{M}$ to be a manifold, one requires $\pi$ to be smooth and its differential $\mathrm{D}\pi(x)$ to have constant rank; then one can decompose the tangent space $T_x\overline{\mathcal{M}}$ at a point $x$ as
$$T_x\overline{\mathcal{M}} = V_x \oplus H_x, \qquad V_x = \ker \mathrm{D}\pi(x),$$
where $V_x$ is called the vertical space and $H_x$, the orthogonal complement of $V_x$, is called the horizontal space. Any tangent vector $\xi \in T_{[x]}\mathcal{M}$ then has a unique representation $v = \mathrm{lift}_x(\xi) \in H_x$, called the horizontal lift of $\xi$.

Having introduced all the objects above, one can construct all primitives for optimization algorithms on a quotient manifold through the same primitives on the total manifold (see Table 2), where $P_S$ denotes the orthogonal projection operator onto a subspace $S$. In particular, a retraction $R$ on the total manifold descends to a well-defined retraction on the quotient if
$$\pi\!\left(R_x(\mathrm{lift}_x(\xi))\right) = \pi\!\left(R_y(\mathrm{lift}_y(\xi))\right) \quad \text{whenever } x \sim y,$$
where $\sim$ denotes the equivalence relation between elements of $\overline{\mathcal{M}}$. It is also worth noting that in practice there is no need to go back from $\overline{\mathcal{M}}$ to $\mathcal{M}$ after the application of each primitive. Instead, one can work only with objects from $\overline{\mathcal{M}}$, which makes optimization algorithms on $\mathcal{M}$ almost identical to algorithms on $\overline{\mathcal{M}}$. The manifolds $\varrho_{n,r}$, $C_{n,r}$ and $\mathrm{POVM}_{m,n}$ in the QGOpt library are implemented using the above idea. The quotient geometry of the real version of the manifold $\varrho_{n,r}$ is described in [61] and is also implemented in the Manopt library [62]. An alternative approach to optimization on $\mathrm{POVM}_{m,n}$ is considered in [63] and implemented in Manopt. To the best of the authors' knowledge, the manifold $C_{n,r}$ has not previously been considered from the Riemannian optimization point of view.
Let us consider the total manifolds that are used to build the quotient manifolds implemented in the QGOpt library. They are
$$\overline{\varrho}_{n,r} = \left\{A \in \mathbb{C}^{n\times r}_* \;\middle|\; \mathrm{Tr}(AA^\dagger) = 1\right\}, \qquad \overline{C}_{n,r} = \left\{A \in \mathbb{C}^{n^2\times r}_* \;\middle|\; \mathrm{Tr}_{\mathrm{out}}(AA^\dagger) = \mathbb{1}\right\},$$
$$\overline{\mathrm{POVM}}_{m,n} = \left\{\{A_i\}_{i=1}^m \;\middle|\; A_i \in \mathbb{C}^{n\times n}_*,\ \textstyle\sum_{i=1}^m A_i A_i^\dagger = \mathbb{1}\right\},$$
where $\mathbb{C}^{p\times q}_*$ is the set of complex full-rank matrices of size $p\times q$ and $\mathrm{Tr}_{\mathrm{out}}$ denotes the partial trace over the output subsystem. One can note that the manifold $\overline{\varrho}_{n,r}$ is a sphere with an additional condition on the rank of $A$, while the manifolds $\overline{C}_{n,r}$ and $\overline{\mathrm{POVM}}_{m,n}$ are complex Stiefel manifolds with additional conditions on the ranks of $A$ and $A_i$. Any element of the total manifolds above corresponds to either a density matrix, a Choi matrix, or a POVM. Indeed,
$$\varrho = AA^\dagger, \qquad C = AA^\dagger, \qquad E_i = A_i A_i^\dagger,$$
where $\varrho$ is some density matrix, $C$ is some Choi matrix and $E_i$ is an element of some POVM. However, there is an ambiguity:
$$AA^\dagger = (AQ)(AQ)^\dagger, \qquad A_i A_i^\dagger = (A_i Q_i)(A_i Q_i)^\dagger,$$
where $Q$ is unitary and $\{Q_i\}_{i=1}^m$ is a set of unitary matrices of the appropriate size. In order to lift the ambiguity, we introduce the equivalence classes $[A] = \{AQ \mid Q \in \mathbb{C}^{r\times r},\ QQ^\dagger = \mathbb{1}\}$ for $\overline{\varrho}_{n,r}$ and $\overline{C}_{n,r}$, the analogous classes $[\{A_i\}_{i=1}^m]$ for $\overline{\mathrm{POVM}}_{m,n}$, and the corresponding quotient manifolds. To identify these quotient manifolds with those introduced in the main text, we introduce the maps
$$\varphi : \overline{\varrho}_{n,r}/\!\sim\ \to \varrho_{n,r} : [A] \mapsto AA^\dagger, \qquad \varphi_C : \overline{C}_{n,r}/\!\sim\ \to C_{n,r} : [A] \mapsto AA^\dagger,$$
$$\varphi_{\mathrm{POVM}} : \overline{\mathrm{POVM}}_{m,n}/\!\sim\ \to \mathrm{POVM}_{m,n} : [\{A_i\}_{i=1}^m] \mapsto \{A_i A_i^\dagger\}_{i=1}^m.$$
These three maps are bijections, which follows from Proposition 2.1 in [64]; the proposition is proved for real matrices, but its generalization to the complex case is straightforward. The maps, as well as their inverses, are also differentiable, which implies that they are diffeomorphisms. This is enough to identify the quotient manifolds with those introduced in the main text and to turn to optimization on quotient manifolds. Now, to perform optimization on $\varrho_{n,r}$, $C_{n,r}$ and $\mathrm{POVM}_{m,n}$, it is sufficient to introduce appropriate primitives for $\overline{\varrho}_{n,r}$, $\overline{C}_{n,r}$ and $\overline{\mathrm{POVM}}_{m,n}$ that additionally satisfy Eqs. (25)-(27), together with the projection onto the horizontal space. The total manifolds equipped with the inner products induced by those of the ambient spaces, $\langle u, v \rangle = \mathrm{Re}\,\mathrm{Tr}(u^\dagger v)$ for $\overline{\varrho}_{n,r}$ and $\overline{C}_{n,r}$, and $\langle \{u_i\}, \{v_i\} \rangle = \sum_{i=1}^m \mathrm{Re}\,\mathrm{Tr}(u_i^\dagger v_i)$ for $\overline{\mathrm{POVM}}_{m,n}$, satisfy the condition (25), as shown in [36].
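The ambiguity and the quotient construction are easy to illustrate numerically. In the sketch below (plain NumPy, not the QGOpt API; variable names are ours), two representatives of the same equivalence class, A and AQ, yield the same density matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 4, 2

# A point on the total manifold: a full-rank A with Tr(A A^†) = 1.
a = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))
a = a / np.sqrt(np.trace(a @ a.conj().T).real)

# A random unitary gauge Q (unitary factor of a QR decomposition).
q, _ = np.linalg.qr(rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r)))

rho1 = a @ a.conj().T               # density matrix from the representative A
rho2 = (a @ q) @ (a @ q).conj().T   # density matrix from the representative AQ
```

Here `rho1` and `rho2` coincide, and both are Hermitian, positive semi-definite, rank-r matrices of unit trace, i.e. valid density matrices.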
The projections onto the horizontal spaces of the total manifolds are obtained as $P_{H_A} = P_{T_A} - P_{V_A}$ (applied elementwise for $\overline{\mathrm{POVM}}_{m,n}$), where the projections $P_{T_A}$ onto the tangent spaces are known for the total manifolds, which are the sphere and complex Stiefel manifolds, and the projections $P_{V_A}$ onto the vertical spaces can be found by solving a Sylvester equation [65]. One can introduce several different retractions for the total manifolds that, however, may not satisfy the condition of Eq. (26). Since the manifolds $\overline{C}_{n,r}$ and $\overline{\mathrm{POVM}}_{m,n}$ are complex Stiefel manifolds, we can use the SVD-based retraction for them; one can show that the SVD-based retraction satisfies the condition of Eq. (26) (see [36]). For the manifold $\overline{\varrho}_{n,r}$ one can use the retraction on a sphere (see Example 4.1.1 in [37]); this retraction also satisfies the condition of Eq. (26). The vector transports (see Table 2) induced by the retractions above also satisfy the condition of Eq. (27). The Riemannian gradients for $\overline{\varrho}_{n,r}$, $\overline{C}_{n,r}$ and $\overline{\mathrm{POVM}}_{m,n}$ are known and can be used without modification for optimization on the quotient versions of these manifolds.
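Both ingredients can be prototyped in a few lines. The sketch below (plain NumPy; helper names are ours, and we solve the Sylvester equation via the eigendecomposition of A†A rather than a library routine) computes the vertical projection at a point A of a Stiefel-like total manifold, where vertical vectors have the form AΩ with Ω anti-Hermitian, and checks that the SVD-based retraction is equivariant under the unitary gauge, which is the content of the condition of Eq. (26):

```python
import numpy as np

def vertical_projection(a, xi):
    # Vertical vectors at A are A @ Omega with Omega anti-Hermitian.
    # The orthogonal projection of xi onto the vertical space solves the
    # Sylvester equation  B Omega + Omega B = A^† xi - xi^† A,  B = A^† A,
    # solved here via the eigendecomposition of the Hermitian matrix B.
    b = a.conj().T @ a
    rhs = a.conj().T @ xi - xi.conj().T @ a
    lam, u = np.linalg.eigh(b)
    omega = u @ ((u.conj().T @ rhs @ u) / (lam[:, None] + lam[None, :])) @ u.conj().T
    return a @ omega

def svd_retraction(y):
    # Polar factor of y via thin SVD: the closest isometry.
    u, _, vh = np.linalg.svd(y, full_matrices=False)
    return u @ vh

rng = np.random.default_rng(3)
n, r = 5, 3
a = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))
xi = rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r))

# The component of xi left after removing the vertical part is orthogonal
# to every vertical direction A @ omega.
h = xi - vertical_projection(a, xi)
omega = rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r))
omega = omega - omega.conj().T        # a random anti-Hermitian gauge direction
overlap = np.trace((a @ omega).conj().T @ h).real  # should be ~0

# SVD retraction is gauge-equivariant, R(AQ + hQ) = R(A + h) Q,
# so it descends to a well-defined retraction on the quotient.
q, _ = np.linalg.qr(rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r)))
lhs = svd_retraction((a + h) @ q)
rhs_retracted = svd_retraction(a + h) @ q
```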
We thus have all the optimization primitives for the total manifolds, the quotient manifolds, and the equivalence between the quotient manifolds and the manifolds from the main text, which allows us to perform optimization on $\varrho_{n,r}$, $C_{n,r}$ and $\mathrm{POVM}_{m,n}$.

B Complexity of algorithms and comparison with other libraries
In this appendix, we discuss the scalability of the optimization algorithms available in the QGOpt library and compare the QGOpt library with other frameworks. To address scalability, one needs to estimate the asymptotic complexity of the primitives used in those algorithms. Table 3 shows the complexity of the optimization primitives for all manifolds. Let us compare the complexity of algorithms from the QGOpt library with some state-of-the-art algorithms in quantum technologies. For example, let us consider quantum channel tomography, which can be implemented via optimization on $C_{n,r}$. Under the assumption that a particular algorithm uses all the optimization primitives, one step of an optimization algorithm scales as $O(n^3 r)$, where $n$ is the dimension of the Hilbert space and $r$ is the Kraus rank (see Table 3). In general, the Kraus rank $r$ is equal to $n^2$, which means that the maximal complexity is $O(n^5)$; however, if we have prior information that $r$ is small, one can significantly reduce the complexity of the algorithm. One can compare the one-step complexity of Riemannian-optimization-based algorithms for quantum channel tomography with that of the algorithm suggested in [50], which is based on the orthogonal projection onto the set of CPTP maps. This orthogonal projection is implemented through repeated averaged projections onto the CP and TP sets of maps. The projection onto the CP set alone has complexity $O(n^6)$, which is larger than the complexity of Riemannian-optimization-based algorithms.

(For each manifold, Table 3 lists the complexity of the retraction, vector transport, Riemannian gradient, inner product, and projection.) Let us also compare the QGOpt library with other libraries for Riemannian optimization. Table 4 shows a list of some related libraries. One can see that, in terms of the number of implemented quantum manifolds, QGOpt is the best suited to problems of quantum technologies.