SciPost logo

SciPost Submission Page

DanceQ: High-performance library for number conserving bases

by Robin Schäfer, David J. Luitz

Submission summary

Authors (as registered SciPost users): Robin Schäfer
Submission information
Preprint Link: https://arxiv.org/abs/2407.14591v1  (pdf)
Code repository: https://gitlab.com/DanceQ/danceq
Date submitted: 2024-08-08 11:06
Submitted by: Schäfer, Robin
Submitted to: SciPost Physics Codebases
Ontological classification
Academic field: Physics
Specialties:
  • Condensed Matter Physics - Theory
  • Condensed Matter Physics - Computational
Approaches: Theoretical, Computational

Abstract

The complexity of quantum many-body problems scales exponentially with the size of the system, rendering any finite size scaling analysis a formidable challenge. This is particularly true for methods based on the full representation of the wave function, where one simply accepts the enormous Hilbert space dimensions and performs linear algebra operations, e.g., for finding the ground state of the Hamiltonian. If the system satisfies an underlying symmetry where an operator with degenerate spectrum commutes with the Hamiltonian, it can be block-diagonalized, thus reducing the complexity at the expense of additional bookkeeping. At the most basic level, required for Krylov space techniques (like the Lanczos algorithm) it is necessary to implement a matrix-vector product of a block of the Hamiltonian with arbitrary block-wavefunctions, potentially without holding the Hamiltonian block in memory. An efficient implementation of this operation requires the calculation of the position of an arbitrary basis vector in the canonical ordering of the basis of the block. We present here an elegant and powerful, multi-dimensional approach to this problem for the $U(1)$ symmetry appearing in problems with particle number conservation. Our divide-and-conquer algorithm uses multiple subsystems and hence generalizes previous approaches to make them scalable. In addition to the theoretical presentation of our algorithm, we provide DanceQ, a flexible and modern - header only - C++20 implementation to manipulate, enumerate, and map to its index any basis state in a given particle number sector as open source software under https://DanceQ.gitlab.io/danceq.

Current status:
In refereeing

Reports on this Submission

Anonymous Report 3 on 2024-9-12 (Invited Report)

Strengths

1- Efficient and elegant numerical library to generate and access qubit configurations with a fixed U(1) charge (e.g. magnetization for spin system)
2- Open source code
3- Very generic code able to tackle any local Hilbert space

Report

This paper provides a numerical library to generate and access generic qubit configurations for any quantum model with a conserved U(1) quantity. This is very often the case, e.g. total Sz magnetization for a spin system, and hence relevant to most numerical studies. Due to the exponentially large Hilbert space, generating the proper configurations (among the $q^N$ where $q$ is the local dimension and $N$ the number of sites) is not straightforward. The reverse question, regarding the index number of a given configuration, is needed when computing a matrix-vector multiplication on-the-fly, as done in state-of-the-art exact diagonalization (ED) methods. Again it is nontrivial and requires lookup tables that can become quite large and limit the computations.

This numerical library is very generic and provides an implementation for both mappings between physical configurations and indices in the full list. Thus, it is a very useful tool for any ED-like code. As reminded in the paper, a solution was provided by Lin long ago by dividing the system in two parts. The main achievement here is to generalize this idea to an arbitrary decomposition as well as local Hilbert space $q$.
Basically, it is possible to divide the $N$ sites into various subsystems such that the lookup tables can fit in memory (obviously if one uses less subsystems, computations will be more efficient but this is more costly in memory).

I find the paper very well written and the results convincing.
For all these reasons, I recommend the publication in SciPost Physics Codebases.

Requested changes

1- In many applications, $q=2$ (qubit or spin 1/2) and it is possible to generate valid spin configurations using a Gray-like code, which should be mentioned. Moreover, since they are ordered by construction, finding the index of a given configuration can be done with a simple binomial search. Why is it written that it is "prohibitevely expensive" ? It seems to me that it could be done for large systems as well ?

2- Minor changes: there are missing capital letters in some names, e.g. "bose", "fermi", "hubbard" etc.

Recommendation

Publish (surpasses expectations and criteria for this Journal; among top 10%)

  • validity: top
  • significance: top
  • originality: high
  • clarity: top
  • formatting: excellent
  • grammar: excellent

Anonymous Report 2 on 2024-9-11 (Invited Report)

Report

I think the presented manuscript could be published in SciPost Physics Codebases after revision. Overall I think that the manuscript provides valuable insight for numerically oriented research groups dealing with spin or Hubbard systems. I have a few recommendations:

1. The authors use "particle number" in order to name the deviation from some
basis state that could be termed vacuum. However, since they use spin models as examples most of the time I suggest to explain this a bit more carefully in the abstract and introduction.

2. In the introduction the authors mention state of the art numerical work in Refs. 30-32. I suggest to add some work of Kristel Michielsen who for a long time
has been holding records in terms of number of spins for dynamical problems, that by the way can be treated in similar ways.

3. A proof of Eq. (7) was presented in [51], not only the result. I leave it to the authors whether they want to present their proof in Sec. A; maybe it is more insightful or more easily accessible.

4. Either above Algorithm 1 or in the discussion it should be mentioned that the scheme developed in [50] applies to mixed spin systems not just Q=2.

5. Discussion on page 14: The use of "half filling" for spin systems is again (see 1) unusual or misleading. At least, the authors should add that this is equivalent to M=0.

6. If the author want to refer to one of the most potent free programs in the field, they could refer to spinpack by Schulenburg that uses the method of Lin. It was e.g. used for finite temperature Lanczos calculations for a kagome system with N=42 spins addressing the full Hilbert space as well as for N up to 72 addressing few magnon spaces in order to describe magnon condensation related to flat bands.
spinpack works in MPI/openMP hybrid mode.

Recommendation

Ask for minor revision

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -

Anonymous Report 1 on 2024-9-7 (Invited Report)

Strengths

1 - Provides rich context of the lexicographic problem and clearly explains the proposed solution and its implementation.
2 - The solution proposed in the basis enumeration algorithm has immediate and wide application.

Weaknesses

1 - An explanation of the titular library is lacking.
2 - The sectioning is sometimes unclear.

Report

"DanceQ: High-performance library for number conserving bases" proposes a scheme for enumerating the computational basis for a generic quantum system, such as spin chains, with possibly larger local Hilbert space. My general impression is that the proposed algorithm represents a valuable contribution to the community and probably deserves to have a software paper associated with it available in a publication, such as SciPost Physics Codebases. However, the submitted article mainly concerns the basis construction algorithm while being presented as being about a library.

The majority of the paper, section III, is dedicated to outlining an algorithm for constructing lookup tables which is well-explained. I appreciate the connection to the most common sectioning of the system into two subsystems as a limit of the algorithm.

I question whether, assuming the paper's topic is the library, the submitted paper meets point 4 of the Scipost Physics Codebases acceptance criteria. A description of the "logic of its workings and its added value" is provided for the algorithm (in great detail in Section III and Appendix B), however, I had to spend several hours trying to understand the library itself from the otherwise rich and detailed documentation available on GitLab. For instance, even an example program for the code is not provided. Section III F exemplifies this problem. There, the authors briefly introduce the Operator class, notably in a tone disconnected from the rest of the paper. This section clearly concerns the library design and not the basis construction algorithm. Further, the State, Basis, Vector, or variants of SparseMatrix and ShellMatrix classes are never mentioned but appear important to the design of the library.

As a consequence of this tension, I see some minor internal inconsistencies. For example: at the beginning of section IV, the authors write that "for optimal performance, the required memory should not exceed [the L3 cache] limit" while in section IV-A they conclude that "the lookup and retrieval of the indices plays only a secondary role during the full matrix-vector multiplication." This observation to me somewhat invalidates the conclusions drawn from Fig. 4.

I ran some example programs provided in the repository and found that the installation process is generally straightforward however the library has important dependencies which need to be configured separately. The instructions are provided however CMake is not configured to attempt to find some libraries automatically. Some further minor observations are included in the following section.

Requested changes

1 - An explanation of the core design philosophy of the package and some example code should probably be added.

2 - I found the discussion about matrix-vector multiplication (MVM) very insightful albeit short, focusing on the memory requirements of the lookup tables. I would be interested in, e.g., more precise timing for a single MVM which the authors characterize as "only slightly dependent on $N$". The "several MPI threads" should also be changed to a number. It would also be interesting to know how MVM with fixed $L$ and $N$ scales with increasing the number of cores. An important example could be comparing the $L=46$ spin system for $N=2$ and $N=3$ regarding memory requirement and time consumption.

3 - In the discussion of Fig. 5, the authors note "a clear trend that larger systems and larger filling fractions require mode subsystems for ideal [lookup table] performance." Yet some outliers are noticed in particular for $Q>1$ for larger $L=32$ in disagreement with this general trend. Can this be explained somehow?

4- Sections III E and IV could be improved by clearer subsectioning. I.e. a single subsection does not make sense to me.

5 - The symbols used in equations should be checked, e.g., the pairs $q-Q$ or $c_0-c_{P_0}$ are sometimes used interchangeably. The caption of Fig. 6 contains several typos.

6 - The authors are aware that Krylow space algorithms are useful beyond (imaginary) time evolution and ground state searches even without storing additional wavefunctions (e.g., Phys. Rev. B 102, 054408), so I am surprised by this omission, e.g., when they write "a Krylov subspace which can used to perform time evolution or to compute ground states and excitations..., or other eigenvectors using spectral transformations."

Recommendation

Ask for minor revision

  • validity: -
  • significance: good
  • originality: -
  • clarity: high
  • formatting: reasonable
  • grammar: excellent

Login to report or comment