Sivers extraction with Neural Network

Pseudo-data with simulated experimental errors can be generated to train an ensemble of Artificial Neural Networks (ANNs) that performs a regression to extract Transverse Momentum Dependent distributions (TMDs). A preliminary analysis is presented of the reliability of extracting the Sivers function imposed in the pseudo-data, given the bounds on the experimental errors, the sparsity of the data, and the complexity of the phase space.


Introduction
Transverse Momentum Dependent Parton Distribution Functions (TMD PDFs) can be extracted from processes involving multiple kinematic scales, such as Drell-Yan (DY), Semi-Inclusive Deep Inelastic Scattering (SIDIS), and e+e− annihilation. The cross-sections (or differential cross-sections) measured in these processes are sensitive to the transverse momentum of partons, in particular to the region where the magnitude of that momentum corresponds to non-negligible non-perturbative interactions. The original formulation of TMD PDFs was introduced by Collins, Soper, and Sterman [1-3]. Collinear PDFs provide f(x), the parton density in terms of the light-cone momentum fraction x, whereas TMD PDFs provide f(x, k⊥), the parton density as a function of both the light-cone momentum fraction and the transverse momentum. There are eight TMD PDFs at leading twist (the twist-2 approximation, i.e. neglecting corrections of O(1/Q²)), which can be classified in terms of quark polarization and nucleon polarization (see Figure 1). Among these eight TMD PDFs there are two time-reversal-odd TMDs, the Sivers function and the Boer-Mulders function, which encode correlations between the transverse momentum of the quark and the spin of the quark or of the hadron: the Sivers function applies to a transversely polarized hadron, and the Boer-Mulders function to an unpolarized hadron.
Sivers [5,6] suggested that the k⊥ distribution of quarks could acquire an azimuthal asymmetry when the initial hadron is transversely polarized. Naively, however, such an asymmetry appears to be forbidden by the parity and time-reversal (PT) invariance of QCD: under a naive application of time reversal, the corresponding correlation vanishes [7]. The Sivers function describes this correlation between the transverse momentum of unpolarized quarks and the spin of a transversely polarized nucleon.

Sivers asymmetry from SIDIS
In the Semi-Inclusive Deep Inelastic Scattering (SIDIS) process, the differential cross-section depends on both the collinear parton distribution functions f_{q/p}(x) and the fragmentation functions D_{h/q}(z), where q is the quark flavor, p the target proton, h the hadron type produced in the process, and z the momentum fraction of the final-state hadron with respect to the virtual photon. A simplified version of the SIDIS differential cross-section can be written as

dσ^{ℓ p↑ → ℓ h X} ∝ K(x, p_hT, Q²) Σ_q e_q² f_{q/p↑}(x, k⊥) ⊗ D_{h/q}(z),

where K(x, p_hT, Q²) represents the factorized kinematical factor. Here f_q(x, k⊥) is the unpolarized quark distribution with transverse momentum k⊥ inside a transversely polarized (with spin S) proton with 3-momentum P, entering through

f_{q/p↑}(x, k⊥) = f_q(x, k⊥) + (1/2) Δ^N f_{q/p↑}(x, k⊥) S · (P̂ × k̂⊥),

and Δ^N f_{q/p↑}(x, k⊥) is the Sivers function, which contains the effect on the quarks of the spin polarization of the proton. The Sivers asymmetry in the SIDIS process (see Fig. 1) can be written in terms of the cross-sections as

A^{sin(φ_h − φ_S)}_{UT} = [dσ(φ_h, φ_S) − dσ(φ_h, φ_S + π)] / [dσ(φ_h, φ_S) + dσ(φ_h, φ_S + π)],

which can be parameterized as [8]

A^{sin(φ_h − φ_S)}_{UT} ∝ [Σ_q N_q(x) e_q² f_q(x) D_{h/q}(z)] / [Σ_q e_q² f_q(x) D_{h/q}(z)],

where φ_S and φ_h are the azimuthal angles of the transverse polarization vector of the nucleon and of the final-state hadron with respect to the lepton plane, and P_hT is the transverse momentum

of the final-state hadron with respect to the virtual photon, in the center-of-mass frame of the virtual photon and the nucleon. The TMD fragmentation functions are parameterized as

D_{h/q}(z, p⊥) = D_{h/q}(z) e^{−p⊥²/⟨p⊥²⟩} / (π⟨p⊥²⟩),

where D_{h/q}(z) is the collinear fragmentation function for a quark of flavor q into a hadron of type h.
Assuming the Gaussian factorized form [8,9], the unpolarized TMDs and the Sivers function can be parameterized as

f_q(x, k⊥) = f_q(x) e^{−k⊥²/⟨k⊥²⟩} / (π⟨k⊥²⟩),
Δ^N f_{q/p↑}(x, k⊥) = 2 N_q(x) h(k⊥) f_q(x, k⊥),

where

N_q(x) = N_q x^{α_q} (1 − x)^{β_q} (α_q + β_q)^{(α_q + β_q)} / (α_q^{α_q} β_q^{β_q}),
h(k⊥) = √(2e) (k⊥ / M_1) e^{−k⊥²/M_1²}.

The Gaussian widths ⟨k⊥²⟩ and ⟨p⊥²⟩ were fixed to 0.57 ± 0.08 GeV² and 0.12 ± 0.01 GeV², respectively, as in [8], by fitting the multiplicities from HERMES data. One can therefore simplify the asymmetry to

A^{sin(φ_h − φ_S)}_{UT}(x, y, z, p_hT) = A_0(z, p_hT, M_1) [Σ_q N_q(x) e_q² f_q(x) D_{h/q}(z)] / [Σ_q e_q² f_q(x) D_{h/q}(z)],

where f_q(x) is the collinear parton distribution function for flavor q, obtained from the CTEQ6L grids via LHAPDF [10]; the fragmentation functions for π^{±,0} and K^± are taken from the NNFF10_nlo grids, also available through LHAPDF [11]. There are 13 fitting parameters: M_1; N_q, α_q, β_q for q = u, d, s; and N_q̄ for q̄ = ū, d̄, s̄. The fitting routine is iminuit (the Python version of MINUIT) [12]. Two main differences of this work compared to [8] are: (1) LHAPDF grids (NNFF10_nlo) were used for the FFs instead of the DSS implementation, and (2) the s and s̄ quark flavors were included. Sivers asymmetry data on the SIDIS process from HERMES 2009 [13] and 2020 [14] were used in this analysis, with the plan of extending the effort to include all available data from different experiments. Furthermore, a step forward has been taken to remove the model dependence of N_q(x) using a neural-network approach.
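As an illustration of this parameterization, the N_q(x) factor and the flavor-sum ratio can be written out in a few lines of Python (a minimal sketch; the function names, the flavor dictionary, and the treatment of A_0 as a precomputed prefactor are our own choices, not the published fit code):

```python
def Nq(x, N, alpha, beta):
    """Sivers N_q(x) parameterization:
    N_q(x) = N x^alpha (1-x)^beta (alpha+beta)^(alpha+beta) / (alpha^alpha beta^beta),
    normalized so that its extremum at x = alpha/(alpha+beta) equals N."""
    norm = (alpha + beta) ** (alpha + beta) / (alpha ** alpha * beta ** beta)
    return N * x ** alpha * (1.0 - x) ** beta * norm

def sivers_asymmetry(x, collinear, params, A0=1.0):
    """A ~ A0 * [sum_q N_q(x) e_q^2 f_q(x) D_{h/q}(z)] / [sum_q e_q^2 f_q(x) D_{h/q}(z)].
    `collinear[q]` holds the precomputed product e_q^2 f_q(x) D_{h/q}(z)."""
    num = sum(Nq(x, *params[q]) * collinear[q] for q in collinear)
    den = sum(collinear[q] for q in collinear)
    return A0 * num / den
```

The normalization factor in N_q(x) keeps the fitted magnitudes N_q directly interpretable as the peak size of the x-dependence.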

Modeling quark contribution with Neural Network
The lack of a satisfactory formulation for the quark contribution leads us to seek an unbiased, trainable model for this part of the Sivers asymmetry equation. Thus, in this work N_q(x)

is treated in a model-independent fashion. By the universal approximation theorem [15], neural networks can approximate arbitrary functions, and they are thus well suited to this application. In this study, we used multilayer feed-forward neural-network models to approximate the quark contribution.
Such a network operates by receiving inputs, in this case the kinematic variable x. These inputs are multiplied by a trainable matrix of "weights" to form a hidden layer. To enable the network to approximate non-linear functions, a non-linear activation is applied to the outputs of this intermediate layer. These outputs are then passed to the next hidden layer, and the process is repeated for an arbitrary number of layers. In the last step of this "forward pass," the outputs of the penultimate layer are passed to the output layer, and the estimate of the target value is obtained. No activation function is applied to the final layer because this is a regression problem.
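The forward pass just described can be sketched in a few lines (a dependency-light NumPy sketch; the function name and the weight/bias layout are our own):

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass of a feed-forward regression network: each hidden layer is
    an affine map followed by a non-linear ReLU activation; the final layer is
    linear, since this is a regression problem."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)   # hidden layer + ReLU
    return h @ weights[-1] + biases[-1]  # output layer: no activation
```

With all-zero weights the output reduces to the final bias, which makes the linear output layer easy to verify by hand.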
In training the network, the "backward pass" is used to update the weights with respect to a loss function, which measures how well the network fits the data. The partial derivative of the loss function with respect to each weight in the network is computed, and the weights are then adjusted to reduce the loss. In this problem, we have no direct experimental values for the quark contribution at different values of x. Thus, we propagate the outputs of the network through a computational graph (displayed in Fig. 2) that computes the entire Sivers asymmetry. The weights of the neural network for each quark flavor are then updated using the experimental observations of Sivers asymmetries in different kinematic ranges.
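This backward pass through the full graph can be illustrated with a deliberately simplified toy (a sketch only: each flavor's network output is stood in by a single trainable weight times x, and the partial derivatives are taken by finite differences, whereas the actual analysis lets TensorFlow differentiate the full graph analytically):

```python
import numpy as np

def predict(weights, x, collinear):
    # toy stand-in: the per-flavor network output N_q(x) is modeled here by a
    # single trainable weight times x; `collinear[q]` is the precomputed
    # e_q^2 f_q(x) D_{h/q}(z) for each flavor
    num = sum(weights[q] * x * collinear[q] for q in collinear)
    den = sum(collinear[q] for q in collinear)
    return num / den

def chi2(weights, x, collinear, data, err):
    r = (predict(weights, x, collinear) - data) / err
    return float(np.mean(r ** 2))

def backward_step(weights, x, collinear, data, err, lr=0.05, eps=1e-6):
    """One backward pass: numerical partial derivative of the loss with respect
    to each weight, followed by a gradient-descent update that reduces the loss."""
    new = dict(weights)
    for q in weights:
        up = dict(weights); up[q] += eps
        dn = dict(weights); dn[q] -= eps
        g = (chi2(up, x, collinear, data, err) -
             chi2(dn, x, collinear, data, err)) / (2 * eps)
        new[q] = weights[q] - lr * g
    return new
```

A single `backward_step` strictly decreases the chi-squared-like loss for a sufficiently small learning rate, which is the property the training loop relies on.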
The inputs of the computational graph are x, z, and p_hT. The PDFs and FFs are generated for the corresponding kinematic values using LHAPDF, outside the computational graph, and are then fed into the graph along with the kinematics. Specifically, e_q² f_q(x) D_{h/q}(z) is calculated for each kinematic setting and supplied as an input to the graph. It is necessary to calculate these quantities independently of the graph because neither the PDFs nor the FFs, as taken directly from LHAPDF, are automatically differentiable with TensorFlow. These inputs are then propagated through the Sivers asymmetry expression as usual, except that the quark contribution for each flavor is modeled by a neural network. Each of these networks consists of two dense layers with 32 nodes and ReLU activations, with a separate network for each quark flavor.
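The per-flavor architecture can be sketched as follows (a NumPy stand-in for the tf.keras models; the initialization scheme and all names here are our own):

```python
import numpy as np

def make_flavor_net(rng, hidden=32):
    """Weights for one flavor's network: input x -> two hidden layers of
    32 ReLU nodes -> a single linear output N_q(x)."""
    sizes = [1, hidden, hidden, 1]
    weights = [rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases

def nq_forward(x, weights, biases):
    """Evaluate N_q at an array of x values."""
    h = np.atleast_2d(np.asarray(x, dtype=float)).T
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ W + b, 0.0)
    return (h @ weights[-1] + biases[-1]).ravel()

# a separate, independently trained network for each quark flavor
rng = np.random.default_rng(0)
nets = {q: make_flavor_net(rng) for q in ("u", "d", "s", "ubar", "dbar", "sbar")}
```

Keeping one small network per flavor, rather than one large shared network, lets each N_q(x) be updated independently through the asymmetry graph.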

Results
The fits to the kinematic sets have been performed using iminuit (see Table 1). The neural network has been trained on the HERMES2009 data set; the HERMES2020 asymmetry plots for the NN are therefore predictions, which have been compared with the real data. Table 2 summarizes the χ²/dof values, comparing how well the Sivers asymmetries (with respect to x, z, P_hT) are described by the iminuit fits and by the neural network.
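The χ²/dof figure of merit used for this comparison can be computed as (a straightforward sketch; the function and argument names are ours):

```python
def chi2_per_dof(pred, data, err, n_params=0):
    """chi^2 per degree of freedom: error-weighted squared residuals summed
    over data points, divided by (number of points - number of fit parameters)."""
    chi2 = sum(((p - d) / e) ** 2 for p, d, e in zip(pred, data, err))
    return chi2 / (len(data) - n_params)
```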
Table 1: Individual fit (iminuit) results with HERMES2009 & HERMES2020 data. Note: the acceptance in x for HERMES2020 was 0.023 < x < 0.6, whereas the corresponding range in HERMES2009 was 0.023 < x < 0.4. Also, HERMES2020 used three-dimensional kinematic binning, compared to one-dimensional binning in x, z, or P_h⊥ in HERMES2009.


Conclusions & Future work
The fit results to HERMES2009 data using iminuit and the neural-network model are consistent; therefore, the NN-trained model was used to generate the Sivers asymmetries for the HERMES2020 kinematics, which were compared with the actual HERMES2020 data. It was observed that the inclusion of the strange-quark contribution not only improves the fits but also yields a consistent behavior of the Sivers function. It is important to note that the HERMES2020 results from the neural network serve as a test set, as the model was trained only on HERMES2009 data. Discrepancies in these results could indicate over-fitting to the HERMES2009 data, which could be addressed in future work by incorporating further data sets. Global fits with HERMES and COMPASS (SIDIS) data are currently ongoing and will be published. The results indicate that the neural-network representation of the quark contribution offers a promising, model-independent alternative and could provide a path toward reducing the uncertainties in the Sivers functions. There remains a clear need for additional data, in particular from the Drell-Yan process, to constrain both the Sivers and Boer-Mulders functions.

Figure 1: Left: leading-twist TMDs categorized according to the polarization of the quarks and the nucleons. Right: semi-inclusive hadron production in DIS processes [4].

Figure 2: The entire TensorFlow computational graph, with the kinematics as inputs and the Sivers asymmetry as output. Each N_q is a neural-network model.

Figure 3: The extracted Sivers functions (at Q² = 2.4 GeV²) for the valence and sea quarks in the SU(3) flavor case, from the iminuit fits (first two columns: HERMES2009 and HERMES2020, respectively) and from the neural-network fit to HERMES2009 data (third column).

