SciPost Submission Page
Unraveling Complexity: Singular Value Decomposition in Complex Experimental Data Analysis
by Judith F. Stein, Aviad Frydman, Richard Berkovits
This Submission thread is now published as
Submission summary
Authors (as registered SciPost users):  Richard Berkovits 
Submission information  

Preprint Link:  https://arxiv.org/abs/2407.16267v1 (pdf) 
Date accepted:  2024-09-03 
Date submitted:  2024-07-24 16:50 
Submitted by:  Berkovits, Richard 
Submitted to:  SciPost Physics Core 
Ontological classification  

Academic field:  Physics 
Approaches:  Experimental, Computational, Phenomenological 
Abstract
Analyzing complex experimental data with multiple parameters is challenging. We propose using Singular Value Decomposition (SVD) as an effective solution. This method, demonstrated through real experimental data analysis, surpasses conventional approaches in understanding complex physics data. Singular values and vectors distinguish and highlight various physical mechanisms and scales, revealing previously challenging elements. SVD emerges as a powerful tool for navigating complex experimental landscapes, showing promise for diverse experimental measurements.
Author comments upon resubmission
We have followed all of the first reviewer's requests for changes, which were also endorsed by the second reviewer (please see our response and a detailed list of changes in the following section). We believe that the manuscript is now ready for publication.
List of changes
Below we list the first referee's requested changes point by point, each followed by the changes we made.
Requested changes
1. Recent advances in data analysis using SVD should be mentioned, especially the seminal work of Gavish and Donoho on the recovery of low-rank matrices from noisy data:
 Gavish, M., & Donoho, D. L. (2014). The optimal hard threshold for singular values is 4/\sqrt{3}. IEEE Transactions on Information Theory, 60(8), 5040-5053.
 Gavish, M., & Donoho, D. L. (2017). Optimal shrinkage of singular values. IEEE Transactions on Information Theory, 63(4), 2137-2152.
 We now cite these works in the introduction (line 13) as new Refs. [7,8].
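As a rough illustration of the hard-thresholding rule in the first reference, the sketch below applies it to a synthetic matrix. It assumes i.i.d. Gaussian noise with a known standard deviation `sigma`; the function names, matrix size, and signal strengths are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def hard_threshold_coefficient(beta):
    """Gavish-Donoho coefficient lambda*(beta) for known noise, 0 < beta <= 1."""
    return np.sqrt(2 * (beta + 1)
                   + 8 * beta / ((beta + 1) + np.sqrt(beta**2 + 14 * beta + 1)))

def truncate_by_hard_threshold(X, sigma):
    """Keep only singular values above lambda*(beta) * sqrt(n) * sigma."""
    m, n = sorted(X.shape)            # m <= n, aspect ratio beta = m / n
    tau = hard_threshold_coefficient(m / n) * np.sqrt(n) * sigma
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > tau
    return (U[:, keep] * s[keep]) @ Vt[keep], int(keep.sum())

# Synthetic test case (assumed, not experimental data): rank-2 signal plus noise.
rng = np.random.default_rng(0)
size, sigma = 200, 0.1
left, _ = np.linalg.qr(rng.normal(size=(size, 2)))
right, _ = np.linalg.qr(rng.normal(size=(size, 2)))
signal = (left * np.array([40.0, 20.0])) @ right.T
noisy = signal + sigma * rng.normal(size=(size, size))

denoised, rank = truncate_by_hard_threshold(noisy, sigma)
print("coefficient for a square matrix:", hard_threshold_coefficient(1.0))
print("estimated rank:", rank)
```

For a square matrix the coefficient reduces to 4/\sqrt{3}, the value quoted in the title of the 2014 paper. When the noise level is unknown, that paper gives a separate rule based on the median singular value, which is not shown here.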
2. One important missing contextual remark is the use of SVD in statistics. Performing PCA on a dataset is equivalent to performing SVD on the centered data matrix, with the principal components and directions derived directly from the SVD's components. However, it should be explained that the point of view adopted in the paper differs significantly from the traditional interpretation of rows and columns in a data matrix.
 We thank the referee for raising this point and we have now emphasised it in the first paragraph of the manuscript (lines 17-21).
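The PCA equivalence raised by the referee can be checked numerically. The following is a minimal sketch on a hypothetical random dataset, in the traditional statistical layout (rows as samples, columns as features); it does not represent the layout of the experimental matrices in the manuscript.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset: 50 samples (rows) of 4 correlated features (columns).
data = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))

centered = data - data.mean(axis=0)          # subtract each column's mean
U, s, Vt = np.linalg.svd(centered, full_matrices=False)

# PCA via the covariance matrix, eigenvalues sorted in descending order.
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Variance along each principal direction follows from the singular values.
explained_variance = s**2 / (data.shape[0] - 1)
print(np.allclose(explained_variance, eigvals))
# Principal directions agree up to an overall sign per component.
print(np.allclose(np.abs(Vt), np.abs(eigvecs.T), atol=1e-6))
```

The rows of `Vt` play the role of the principal directions, and the squared singular values divided by the number of samples minus one give the variances along them.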
3. The rationale for assuming that 'V is swept (or interpolated) at equidistant increments' is not explicitly stated and is not mentioned when analyzing the experimental data. The authors should clarify this assumption and indicate what happens when it is not satisfied.
In the experiment, the voltage is swept at equidistant increments of $5 \times 10^{-5}$ V. The manuscript now clarifies that this matrix representation does not require the $V_j$ to be equidistant; it does require that the $V_j$ are the same for all measurements appearing in the same column. Similarly, the values of $U_i$ must be the same for all measurements in the same row. If this is not the case in a particular experiment, it can sometimes be rectified by extrapolating the measurements to the same value of $V_j$ or $U_i$ (see lines 65-70).
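One possible way to bring sweeps onto shared column values is sketched below, under stated assumptions. It uses linear interpolation with `np.interp` onto a common grid chosen inside every sweep's range, so no extrapolation is needed. The grids, the sinusoidal "measurement", and all variable names are hypothetical and only stand in for real data.

```python
import numpy as np

# Shared voltage grid for every column (values are illustrative only).
common_v = np.linspace(0.1, 0.9, 17)

rows = []
for i in range(5):
    # Hypothetical sweep i: its own number of voltage points, hence its own grid.
    v_i = np.linspace(0.0, 1.0, 80 + 7 * i)
    measured_i = (i + 1) * np.sin(2 * np.pi * v_i)
    # np.interp needs increasing sample positions; common_v lies inside
    # [v_i[0], v_i[-1]], so no extrapolation takes place here.
    rows.append(np.interp(common_v, v_i, measured_i))

X = np.vstack(rows)   # X[i, j]: sweep i evaluated at common_v[j]
print(X.shape)
```

After this step every column of `X` refers to a single voltage value, which is the condition stated in the response; the spacing of `common_v` itself is free to be non-uniform.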
4. The term 'singular values amplitudes' seems atypical. In most mathematical and data analysis literature, 'singular values' is the preferred and widely accepted term.
 The term 'singular values amplitudes' was changed to 'singular values' throughout the manuscript.
5. Figures:
 Figure 1: The plot and text in Figure 1 are too small compared to the text, which complicates its comprehension.
 Figure 2 (e,f): This is not "FT analysis" but rather the "power spectrum (or power spectral density) of the data obtained by FT analysis."
 The text in Figure 1 was enlarged.
 The figure caption for figure 2 (e,f) was changed accordingly.
6. There is inconsistency in the shape of vectors. The authors should mention when the vectors are rows and when they are columns, which would avoid any confusion when writing U_i V_j. Only the lengths of U_i and V_j are mentioned while clearly they are interpreted as matrices in the paper. Note that the "outer product," mentioned on line 67 and displayed on line 166, is most often taken between two column vectors.
 It is now clearly stated that the outer product is taken between two vectors ${\vec U}^{(k)}_{i}$ (a column of length $M$) and ${\vec V}^{(k)}_{j}$ (a row of length $P$) (see line 77).
7. Line 73: It might be better to mention that this is a series of rank-1 matrices, better stressing that the matrices X^(k) are basic elements analogous to the Fourier modes.
 We now mention that X^(k) is a rank-1 matrix (line 82).
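The decomposition into rank-1 matrices discussed in points 6 and 7 can be illustrated with a short sketch. A random $M \times P$ matrix stands in for measured data here, and the names `modes`, `M`, and `P` are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
M, P = 6, 8
X = rng.normal(size=(M, P))               # stand-in for a measured M x P matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# X^(k) = sigma_k * (column U[:, k]) outer (row Vt[k]); each term has rank 1.
modes = [s[k] * np.outer(U[:, k], Vt[k]) for k in range(len(s))]

print([int(np.linalg.matrix_rank(mode)) for mode in modes])
print(np.allclose(sum(modes), X))
```

Each element of `modes` is the outer product of a length-$M$ column and a length-$P$ row, weighted by the corresponding singular value, and summing them recovers the original matrix.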
8. Line 97: The optimality is a classical result in linear algebra, often called the "Eckart-Young theorem."
 We now mention the Eckart–Young–Mirsky theorem (line 107) and provide references (new Refs. [26-28]).
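The optimality statement can be spot-checked numerically. The sketch below uses a random matrix and an arbitrary truncation rank `r`, both chosen for illustration. It verifies that the error of the truncated SVD equals the norm of the discarded singular values, and compares it against one random rank-`r` competitor, which is a sanity check rather than a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))
U, s, Vt = np.linalg.svd(X, full_matrices=False)

r = 5
X_r = (U[:, :r] * s[:r]) @ Vt[:r]          # truncated SVD, rank r

frobenius_error = np.linalg.norm(X - X_r, "fro")
spectral_error = np.linalg.norm(X - X_r, 2)

# Eckart-Young-Mirsky: these errors are set by the discarded singular values.
print(np.isclose(frobenius_error, np.sqrt(np.sum(s[r:] ** 2))))
print(np.isclose(spectral_error, s[r]))

# Any other rank-r matrix does no better; here, one random rank-r competitor.
competitor = rng.normal(size=(30, r)) @ rng.normal(size=(r, 20))
print(np.linalg.norm(X - competitor, "fro") >= frobenius_error)
```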
Published as SciPost Phys. Core 7, 061 (2024)