An Overview of Lattice Results for Parton Distribution Functions

Following a ground-breaking proposal by Ji [1], numerical simulations of Quantum Chromodynamics (QCD) on a Euclidean lattice have provided new, valuable information on the structure of hadrons. In this talk, we briefly review the lattice approach to the reconstruction of parton densities, highlighting the connection between lattice observables and parton densities, with a focus on theoretical issues. Since parton distributions are extracted from lattice data by solving an inverse problem, we discuss some of the difficulties that affect these determinations and how they can be formulated in a Bayesian framework.


Introduction
A detailed understanding of the structure of hadrons, and of nucleons in particular, is central to the current and future experimental explorations of nature at particle accelerators. Two notable examples are the forthcoming runs of the Large Hadron Collider (LHC) at CERN and the experimental programme at the Electron Ion Collider (EIC), which is under construction in the US. As far as the LHC is concerned, the focus for future runs will be on increasing the luminosity and therefore the precision of the experimental measurements. Since our current understanding of proton collisions is based on factorization theorems, a detailed knowledge of Parton Distribution Functions (PDFs) is necessary in order to exploit the increase in statistical precision. The EIC, on the other hand, is designed specifically to investigate the internal structure of nucleons by probing the dynamics of quarks and gluons in head-on collisions of protons (or heavier nuclei) with a beam of electrons. The interplay between theory and experiment will ultimately provide the most accurate description of hadronic matter.
As a result, we expect significant progress in the near future in the determination of PDFs and in our understanding of the strong force in general, leading to a description of nucleons, and possibly nuclei and atoms, from first principles, i.e. from the QCD Lagrangian. Lattice QCD is by now a mature tool for accurate QCD predictions and will play an active role in this area.
In these proceedings, we start by reviewing the field-theoretical definition of parton distributions, explaining how the nucleon structure is encoded in field correlators that are defined on the light-cone and can be computed in quantum field theories. We will then introduce a simple toy model in order to highlight the relation between these light-cone quantities and Euclidean correlators that are amenable to Monte Carlo simulations. We will see that the Euclidean correlators, after being properly renormalized, can be treated exactly like experimental data for which there exists a factorization theorem.
As a consequence of these factorization theorems, PDFs can only be extracted by solving an inverse problem. In the final part of this review, we discuss some features of inverse problems and their solution in a Bayesian framework.

Theoretical background
The current understanding of inelastic scattering processes that involve hadrons is based on factorization theorems. In kinematical regimes where there is a separation of scales, factorization theorems allow us to separate soft, non-perturbative contributions from high-energy partonic dynamics, up to corrections given by powers of the ratio of the scales of the problem. The typical example of such processes is Deep Inelastic Scattering (DIS), where a lepton scatters off a nucleon through the exchange of a photon with a very large transferred momentum, as shown in Fig. 1.
The hadronic contribution to the cross section is encoded in the hadronic tensor, which contains the matrix element of the electromagnetic current between the nucleon in the initial state $|p\rangle$ and a generic hadronic state $|X\rangle$ in the final state. Using covariance under Lorentz transformations, the non-perturbative dynamics due to the strong interactions is encoded in two form factors, the structure functions $F_i$.

Figure 1: DIS process, where a lepton with momentum $k$ scatters off a nucleon with momentum $p$. In the deep inelastic regime, the four-momentum squared $Q^2 = -q^2$ is much larger than the typical hadronic scale $\Lambda^2$.
In the deep inelastic regime, the norm of the transferred four-momentum is large compared to the typical hadronic scale, $\Lambda^2/Q^2 \ll 1$, and the form factors can be factorized as in Eq. (1), where the coefficient function $C_i$ is related to a partonic cross section that can be computed in perturbation theory, while the effect of the strong interactions in the nucleon is encoded in the PDF $f_R$. The suffix $R$ indicates that the PDFs are physical, finite quantities, defined in a particular renormalization scheme. Corrections to the factorized expression are suppressed by powers of $\Lambda^2/Q^2$. The core of the argument that underlies factorization is the fact that the hadronic matrix elements are dominated by diagrams like the one shown in Fig. 2. See e.g. Ref. [2] for a concise discussion in the context of a scalar field theory.
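For orientation, the factorized form of the structure functions described above is usually written as a convolution in the momentum fraction. The expression below is a sketch in one common convention; the sum over parton species $a$, the measure $d\xi/\xi$, the factorization scale $\mu$ and the arguments of $C_i$ are assumptions and may differ from the conventions adopted in Eq. (1).

```latex
F_i(x, Q^2) = \sum_a \int_x^1 \frac{d\xi}{\xi}\,
  C_i^a\!\left(\frac{x}{\xi}, \frac{Q^2}{\mu^2}\right) f_R^a(\xi, \mu^2)
  + \mathcal{O}\!\left(\frac{\Lambda^2}{Q^2}\right)
```

The data constrain the left-hand side at a finite number of points $(x, Q^2)$, while the unknown function $f_R$ sits inside an integral on the right-hand side.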
Looking at the structure of the diagram, it is clear that the non-perturbative contribution to the hadronic tensor is given by the matrix element of a fermion bilinear (corresponding to the two fermion lines that connect the two halves of the diagram) between nucleon states. When all the details are worked out, the PDF is obtained as the integral of the matrix element of the fermion bilinear along the light-cone direction $z^-$, Eq. (2), where $\Gamma$ determines the spin structure of the bilinear, $\lambda^A$ is a matrix in flavor space and $U$ is a Wilson line from $-z^-/2$ to $z^-/2$, which guarantees gauge invariance. The expression in Eq. (2) yields the bare PDF. It is UV-divergent and needs to be renormalized in the desired scheme in order to match the function $f_R(x)$ that appears in the factorization formula, Eq. (1). Similar arguments apply to less inclusive quantities, which lead to the introduction of Wigner functions, the basis for the definition of Generalized Parton Distribution Functions (GPDFs) and Transverse Momentum Dependent Parton Distribution Functions (TMDs). Wigner functions are defined as off-diagonal elements of field bilinears, with the possibility of injecting a transverse momentum $k_\perp$.
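As a reminder of what such a light-cone definition looks like, the block below writes the bare PDF in one common convention. The normalization $1/(4\pi)$, the sign of the phase, the light-cone momentum $p^+$ and the placement of the fields at $\pm z^-/2$ are assumptions made for illustration and need not coincide with those of Eq. (2).

```latex
f(x) = \int \frac{dz^-}{4\pi}\, e^{-i x p^+ z^-}\,
  \langle p |\, \bar\psi\!\left(\tfrac{z^-}{2}\right) \Gamma\, \lambda^A\,
  U\!\left(\tfrac{z^-}{2}, -\tfrac{z^-}{2}\right)
  \psi\!\left(-\tfrac{z^-}{2}\right) | p \rangle
```

In this convention the choice $\Gamma = \gamma^+$ corresponds to the unpolarized quark distribution.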

Lattice QCD and PDFs: A toy model
At first sight, it is difficult to imagine that lattice QCD, defined in Euclidean space, can provide information on the PDFs, since the latter are extracted from correlators evaluated along the light-cone. Indeed, for several decades, the only quantities that were amenable to Monte Carlo simulations were the moments of the PDFs, which are defined from the OPE as matrix elements of local operators, see e.g. Refs. [3,4]. The seminal work in Ref. [1] introduced Euclidean matrix elements, where the fields in the fermion bilinear are separated in a purely spatial direction. These matrix elements allow the extraction of the PDFs using a factorization formula. In this respect, they are on the same footing as physical observables, and indeed they can be incorporated in phenomenological fits, as discussed in Refs. [5,6]. It is instructive to study these Euclidean correlators in a simple scalar field theory, along the lines of the work originally presented in Ref. [2]. In this simple framework, the conceptual steps can be understood without being distracted by complex calculations. Hence, we are going to consider the matrix elements of the field bilinear in a scalar $\phi^3$ field theory in $D$ dimensions. In order to establish the necessary factorization theorem, we need to:
1. Renormalize the bilinear for $z$ along the light-cone and for a Euclidean separation.
2. Find a relation between these two renormalized quantities, in the form of a factorization theorem, with IR-finite coefficient functions.
The results summarised here were originally discussed in Ref. [7].

Renormalization
In order to address the renormalization of the bilinears, we are going to consider their matrix elements between partonic states (i.e. elementary fields $\phi$), which we denote $M(\nu, z^2)$. Lorentz invariance implies that the matrix element can only depend on the scalar quantities $\nu = p \cdot z$ and $z^2$. We use dimensional regularization and MS for renormalization. The tree-level and one-loop diagrams that are needed to renormalize the field bilinear are summarised in Fig. 3. At tree level, as shown in diagram (a), the matrix element only depends on $\nu$. Note that there is no difference between the light-cone and the Euclidean expression at tree level. In both cases the result only depends on the variable $\nu$.
For a light-cone separation, $z^2 = 0$, the bilinear at one-loop is logarithmically divergent. In the renormalized matrix element, the one-loop corrections are proportional to the coupling $\alpha = \lambda^2/(64\pi^3)$. They consist of a multiplicative term, which is the usual renormalization of the field and appears in diagram (b), and a convolution, which comes from diagram (c) in Fig. 3 and is the typical divergence that appears for such bilinears. Note that we have suppressed the dependence on $z^2$, since in the light-cone case we always have $z^2 = 0$. The renormalized light-cone matrix element is the quantity that is directly related to the PDF.
For a Euclidean separation, $z^2 = -z_3^2$, the finite value of $z^2$ acts as a regulator and the one-loop calculation is finite after the usual renormalization of the field $\phi$. The result involves the modified Bessel function $K_0$, which is singular only at the origin, $z_3 = 0$.
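The singularity of $K_0$ at the origin is only logarithmic, which is what makes a short-distance expansion of the Euclidean correlator possible. The following minimal sketch checks this numerically with the standard library alone. The argument `x` stands for a generic dimensionless combination of order $m z_3$ (an assumption: the precise argument of $K_0$ in the one-loop result is not reproduced here), and the helper names `k0` and `k0_small_x` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def k0(x, n=20000):
    """Modified Bessel function K_0(x) for x > 0.

    Uses the integral representation K_0(x) = int_0^inf exp(-x cosh t) dt,
    truncated where the integrand has dropped below exp(-50), and evaluated
    with the composite Simpson rule (n must be even).
    """
    if x <= 0:
        raise ValueError("K_0(x) is only evaluated here for x > 0")
    t_max = math.log(100.0 / x + 1.0)  # guarantees x * cosh(t_max) >= 50
    h = t_max / n
    total = math.exp(-x) + math.exp(-x * math.cosh(t_max))
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.exp(-x * math.cosh(i * h))
    return total * h / 3.0


def k0_small_x(x):
    """Leading small-argument behaviour: K_0(x) ~ -log(x/2) - gamma_E."""
    return -math.log(x / 2.0) - EULER_GAMMA


if __name__ == "__main__":
    for x in (1.0, 0.1, 0.01, 0.001):
        print(f"x = {x:<6} K_0 = {k0(x):.6f}   -log(x/2) - gamma_E = {k0_small_x(x):.6f}")
```

As `x` decreases, the two columns approach each other, with a difference that vanishes like $x^2 \log x$. Corrections of this type are the analogue of the power corrections in $m z_3$ mentioned in the discussion of factorization.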

Factorization
In order to obtain a factorization theorem, we expand the expression for the Euclidean correlator at short distances, $m z_3 \ll 1$. We see explicitly that the function in the convolution is IR-divergent when $m \to 0$. However, this divergence matches exactly the divergence in the convolution for the light-cone quantity.
When relating the two renormalized expressions, the dependence on $m^2$ cancels and we obtain a relation that is IR-finite, as required in a factorization theorem. By inverting the relation between the light-cone renormalized correlator and the renormalized PDF $f_R$, we can write the renormalized Euclidean correlator as a convolution of $f_R$ with a coefficient function of the form $C(\xi\nu, \mu^2 z_3^2) = e^{i\xi\nu} + O(\alpha)$, Eq. (7). The left-hand side of this equation is a Euclidean correlator, which can be computed using standard techniques in Monte Carlo simulations of QCD. The right-hand side is a convolution that involves the PDF as defined from the light-cone correlator. It is interesting to notice that the dependence on $z^2$ only appears at $O(\alpha)$. This is the equivalent of the well-known violations of Bjorken scaling in DIS. Note also that the factorization formula receives corrections that are powers of $m z_3$, i.e. factorization works at small distances, with power-suppressed corrections, just like in DIS, with small distances replacing large momentum transfer.
There is a multitude of lattice observables that have been defined starting from the Euclidean correlator in position space discussed here, see e.g. Refs. [8,9] for recent reviews and exhaustive references. All of these lattice observables are related to light-cone correlators, and therefore to PDFs, following the steps outlined above. First the lattice quantities need to be properly renormalized, and then the renormalized observable, extrapolated to the continuum limit, is matched to PDFs via some factorization theorem. After this procedure has been completed, the lattice observables are exactly on the same footing as experimental observables, such as the structure functions $F_i$ introduced above in the context of DIS [7,10].
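To make the forward map in the factorization formula concrete, the sketch below evaluates it at leading order, where the coefficient function reduces to $e^{i\xi\nu}$ and the Euclidean correlator is simply a Fourier-type transform of the PDF. The toy PDF $6\xi(1-\xi)$, its support on $[0, 1]$ and the function names are assumptions chosen for illustration; they are not taken from the references.

```python
import cmath


def toy_pdf(xi):
    """Hypothetical toy PDF on 0 <= xi <= 1, normalised to unit integral."""
    return 6.0 * xi * (1.0 - xi)


def tree_level_correlator(nu, pdf=toy_pdf, n=2000):
    """Leading-order forward map M(nu) = int_0^1 dxi exp(i xi nu) pdf(xi).

    Evaluated with the composite Simpson rule (n must be even).
    """
    h = 1.0 / n
    total = pdf(0.0) + cmath.exp(1j * nu) * pdf(1.0)
    for i in range(1, n):
        xi = i * h
        weight = 4.0 if i % 2 else 2.0
        total += weight * cmath.exp(1j * xi * nu) * pdf(xi)
    return total * h / 3.0


if __name__ == "__main__":
    for nu in (0.0, 1.0, 2.0, 5.0, 10.0):
        m = tree_level_correlator(nu)
        print(f"nu = {nu:5.1f}   Re M = {m.real:+.6f}   Im M = {m.imag:+.6f}")
```

At $\nu = 0$ the correlator returns the normalization of the PDF, and its slope at the origin gives the first moment. A simulation provides this function only at a limited set of values of $\nu$ and $z_3$, which is why recovering the PDF from it is an inverse problem.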

Bayesian inverse problems
Determining the PDF from a finite set of lattice data using Eq. (7) is a typical example of an inverse problem. Lattice simulations can provide a finite number of input points, which are used to constrain a function, i.e. an element of an infinite-dimensional space. Such a problem is clearly ill-posed, and the solution depends on a number of assumptions that need to be made. The formulation of the problem is sketched in Fig. 4. We aim to determine a function $f$, in this particular case a function of one real variable $x$. The model space, denoted $M$ in the figure, is an infinite-dimensional space. The data set $A$ is a discrete set, and each point in the data set is some functional of the function $f$, $z = G(f)$. Clearly the system is underdetermined, since it would require an infinite amount of data in order to uniquely determine the function $f$. A more constructive way to phrase the problem can be formulated in a Bayesian framework [11]. Our knowledge, or prejudice, about the function $f$ is encoded in some prior $p(f)$, which is then updated using the dataset, $p(f|A) \propto p(A|f)\, p(f)$, where on the right-hand side we have introduced the likelihood function $p(A|f)$. In principle one could try to work in infinite-dimensional spaces and define a probability measure. While this is a possibility, it is easier to reduce the problem by parametrizing the function $f$ and working with a probability density in the finite-dimensional space of the parameters that define the model. Clearly the choice of the parametrization induces a bias that needs to be taken into account.
Figure 4: Schematic representation of an inverse problem. On the left of the figure we see the space of data, with some dataset $A$. On the right we have the space of measurable functions $M(X, Y)$ and the subset $F$ that we explore with a given parametrization.
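The statement that a finite data set cannot determine $f$ uniquely can be illustrated with a short exact computation. In the sketch below the data functional $G$ is taken to be the first few moments of $f$ on $[0, 1]$, and functions are represented by polynomial coefficients. Both choices, and the names `moment` and `forward_map`, are assumptions made for illustration only.

```python
from fractions import Fraction


def moment(coeffs, n):
    """Exact n-th moment int_0^1 x^n f(x) dx of f(x) = sum_k coeffs[k] x^k."""
    return sum(Fraction(c) / (n + k + 1) for k, c in enumerate(coeffs))


def forward_map(coeffs, n_data):
    """Hypothetical data functional G(f): the first n_data moments of f."""
    return [moment(coeffs, n) for n in range(n_data)]


# Two different model functions on [0, 1]:
f1 = [1]          # f1(x) = 1
f2 = [2, -6, 6]   # f2(x) = f1(x) + (6x^2 - 6x + 1)

if __name__ == "__main__":
    print("two data points:  ", forward_map(f1, 2), forward_map(f2, 2))
    print("three data points:", forward_map(f1, 3), forward_map(f2, 3))
```

The added polynomial $6x^2 - 6x + 1$ is orthogonal to both $1$ and $x$ on $[0, 1]$, so the two functions reproduce the same two data points exactly and only a third data point tells them apart. With any finite number of data, a similar direction in function space is left unconstrained and must be fixed by the prior or by the parametrization.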

NNPDF fits of lattice data
As explained in the previous section, lattice data are on the same footing as experimental input into PDF fits. In a series of papers, some of the lattice data have been incorporated in the general fitting framework developed by the NNPDF collaboration [5,6]. It is worthwhile to emphasise that the lattice data are handled like any other dataset in NNPDF, with no need to adjust the methodology. The only input needed is a robust estimate of the statistical covariance and the systematic errors. This is modeled by considering the data, $z$, as stochastic variables distributed according to a multi-dimensional Gaussian distribution, centred at the value of the experimental measurement $Z$, with a covariance $C$, which we denote as $z \sim \mathcal{N}(Z, C)$.
Parametrization. In the NNPDF formalism, the parametrization of the function $f$ is provided by neural networks, see Ref. [12] for the details of the latest implementation. A sufficiently large architecture provides a parametrization that is flexible enough to minimize the functional bias. We denote the neural net parametrization as $g[\theta]$, where $\theta$ is the set of parameters (biases and weights of the neural network).
Posterior distribution. The posterior distribution in the space of functions (i.e. in the space of functions that are parametrized by the neural networks) is described by a Monte Carlo set of replicas. The replicas implement a bootstrap propagation of the statistical fluctuations of the data into the space of functions [13]. Each replica $z^{(k)}$ is obtained by generating a set of pseudo-data, $z^{(k)} = Z + \epsilon^{(k)}$, where the $\epsilon^{(k)}$ are distributed according to $\mathcal{N}(0, C)$. The set of replicas yields an ensemble of pseudo-data that reproduces the statistical distribution of the experimental data as encoded in the covariance matrix. For each replica, we determine the parameters $\theta^{(k)}$ by fitting the pseudo-data $z^{(k)}$. Hence for each replica, we obtain a parametrization of the PDFs. This set of fitted parameters, $\theta^{(k)}$; $k = 1, \ldots, N_{\mathrm{rep}}$, yields the desired posterior distribution in the space of functions. Note that this ensemble of replicas is an example of importance sampling of the posterior distribution in the space of functions parametrized by the neural networks.
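The replica procedure can be sketched in a few lines. The example below is a toy version under strong assumptions: the neural network $g[\theta]$ is replaced by a one-parameter linear model $\theta\, b_i$, so that each fit reduces to an analytic $\chi^2$ minimization, and the data `Z`, basis `BASIS` and covariance `COV` are invented numbers. It is not the NNPDF code, and all names are hypothetical.

```python
import math
import random


def cholesky(c):
    """Lower-triangular L with L L^T = c, for a symmetric positive-definite c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_spd(c, rhs):
    """Solve c x = rhs via Cholesky: forward, then back substitution."""
    low = cholesky(c)
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_theta(z, basis, cov):
    """Minimise chi^2 = (z - theta b)^T C^{-1} (z - theta b) over theta."""
    w = solve_spd(cov, basis)  # w = C^{-1} b
    return sum(wi * zi for wi, zi in zip(w, z)) / sum(wi * bi for wi, bi in zip(w, basis))


def replica_fits(central, basis, cov, n_rep, seed=0):
    """Generate pseudo-data z_k = Z + eps_k, eps_k ~ N(0, C), and fit each replica."""
    rng = random.Random(seed)
    low = cholesky(cov)
    n = len(central)
    thetas = []
    for _ in range(n_rep):
        normal = [rng.gauss(0.0, 1.0) for _ in range(n)]
        eps = [sum(low[i][k] * normal[k] for k in range(i + 1)) for i in range(n)]
        z_k = [zc + e for zc, e in zip(central, eps)]
        thetas.append(fit_theta(z_k, basis, cov))
    return thetas


# Invented toy inputs: central values, model basis and covariance matrix.
Z = [1.0, 2.1]
BASIS = [1.0, 2.0]
COV = [[0.04, 0.01], [0.01, 0.09]]

thetas = replica_fits(Z, BASIS, COV, n_rep=4000)
mean = sum(thetas) / len(thetas)
variance = sum((t - mean) ** 2 for t in thetas) / (len(thetas) - 1)

if __name__ == "__main__":
    print(f"fit to central data: {fit_theta(Z, BASIS, COV):.4f}")
    print(f"replica mean: {mean:.4f}   replica std: {math.sqrt(variance):.4f}")
```

For this linear model the spread of the fitted parameters over the replicas reproduces the analytic variance $(b^T C^{-1} b)^{-1}$. With a neural network the same ensemble is used in the same way, but the minimization has to be performed numerically for every replica.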
Results of the fits to lattice data were presented in Refs. [5,7]. A few examples are reported in Figs. 5 and 6. It is important to remark that the lattice error in these analyses is dominated by systematic errors.
Despite the large errors, it is remarkable that a fairly small number of lattice data points provides a powerful constraint for the PDFs. It is clear that improved lattice simulations, with better control of systematic errors, could target the kinematical regions where the current uncertainties are large and have a significant impact.
Figure 7: Unpolarized gluon PDF from Ref. [14]. Lattice results are compared with the results of fitting experimental data.

Recent fits
Since the analyses in Refs. [5,7], a large number of new simulations have appeared that have measured a broader variety of observables, and have improved both the statistical and systematic errors. These more recent simulations have also focused on helicity distributions, TMDs and GPDFs, which are less precisely determined from fits to experimental data. These are areas where the lattice simulations are most likely to have a lasting impact. We do not have time to make an exhaustive review of contemporary results, for which we refer the reader to the recent proceedings of the Lattice conference [9]. In order to entice the audience, we report here a few results from recent publications. This is not an exhaustive list of results, but it does give an idea of the potential of lattice QCD.
Unpolarized gluon. The HadStruc collaboration has performed a careful study of the unpolarized gluon distribution [14]. The study is interesting in many respects. In particular, it takes into account the mixing of the gluon and the singlet bilinears under renormalization and introduces a new parametrization of the PDFs for the fits. The result is reported here in Fig. 7. Note that the pion mass in this work is still relatively large, $m_\pi \simeq 358$ MeV.
Another unpolarized gluon. An independent determination of the unpolarized gluon distribution is presented in Ref. [15]. Despite the fact that a wider range of pion masses is explored, there are still systematic discrepancies between these determinations and the fits to experimental data, which deserve to be clarified. The fits to experimental data are the benchmarks that the lattice results need to reproduce in order to show that systematic errors are under control. It would be interesting to perform a combined analysis of these results using the NNPDF methodology explained in the previous section. Results are shown in Fig. 8.

Helicity and transversity. Ref. [16] is one example of a study that goes beyond the unpolarized distributions and addresses the determination of helicity and transversity parton distributions. These distributions are less constrained by experimental data, so that the impact of lattice simulations here is likely to be highly significant. Results are reported in Fig. 9.

Conclusion
Since the original work in Ref. [1], there has been intense activity in the lattice community aimed at extracting PDFs from numerical simulations of lattice QCD. Following a series of well-founded criticisms in Refs. [17,18], the theoretical status of these studies has been carefully examined, and it is now clear that lattice data in Euclidean spacetime can be related to light-cone PDFs by factorization theorems after the lattice observables have been properly renormalized and extrapolated to the continuum limit.
In this respect, lattice data can be treated exactly like any other experimental data. The extraction of the PDFs requires the solution of an ill-posed inverse problem, whose result depends on the choice of priors. This is a delicate point, which needs to be analysed with great care. A Bayesian formulation makes the assumptions underlying the prior distribution explicit. Having clarified the prior, the posterior distribution is given by Bayes' theorem, and can be sampled e.g. by Monte Carlo methods, as suggested in the NNPDF approach.
As the high-energy physics community is driven towards precision analyses at hadronic colliders, a faithful determination of the errors on the Parton Distribution Functions becomes increasingly important. We expect to see significant progress in this area, especially from the synergy of different approaches.

Figure 2: Leading contribution to the hadronic tensor in the deep inelastic limit. Hard momenta flow in the lower part of the diagram, which yields the perturbative coefficient function $C_i$, while the upper part contains all the soft lines. The leading contribution is characterized by having only two fermionic lines connecting the upper and the lower part of the diagram. Diagrams with more lines between these two subdiagrams are suppressed by powers of $Q^2$. Diagram from Ref. [2].

Figure 3: Tree level and one-loop diagrams needed for the renormalization of the field bilinear φ(z)φ(0).

Figure 5: PDFs extracted from lattice data using the NNPDF methodology. Results from Ref. [5]. The NNPDF fit to experimental data is shown in blue for comparison.

Figure 6: PDFs extracted from lattice data using the NNPDF methodology. Results from Ref. [7]. The NNPDF fit to experimental data is shown in green in this plot.

Figure 8: Unpolarized gluon PDF from Ref. [15]. Lattice results are compared with the results of fitting experimental data.

Figure 9: Helicity and transversity distributions from Ref. [16]. Lattice results are compared with the results of fitting experimental data.