SciPost Submission Page

Probing Proton Structure at the Large Hadron electron Collider

by Rabah Abdul Khalek, Shaun Bailey, Jun Gao, Lucian Harland-Lang, Juan Rojo

This is not the current version.

Submission summary

As Contributors: Lucian Harland-Lang
Arxiv Link:
Date submitted: 2019-07-03
Submitted by: Harland-Lang, Lucian
Submitted to: SciPost Physics
Discipline: Physics
Subject area: High-Energy Physics - Phenomenology
Approach: Theoretical


For the foreseeable future, the exploration of the high-energy frontier will be the domain of the Large Hadron Collider (LHC). Of particular significance will be its high-luminosity upgrade (HL-LHC), which will operate until the mid-2030s. In this endeavour, for the full exploitation of the HL-LHC physics potential an improved understanding of the parton distribution functions (PDFs) of the proton is critical. The HL-LHC program would be uniquely complemented by the proposed Large Hadron electron Collider (LHeC), a high-energy lepton-proton and lepton-nucleus collider based at CERN. In this work, we build on our recent PDF projections for the HL-LHC to assess the constraining power of the LHeC measurements of inclusive and heavy quark structure functions. We find that the impact of the LHeC would be significant, reducing PDF uncertainties by up to an order of magnitude in comparison to state-of-the-art global fits. In comparison to the HL-LHC projections, the PDF constraints from the LHeC are in general more significant for small and intermediate values of the momentum fraction x. At higher values of x, the impact of the LHeC and HL-LHC data is expected to be of a comparable size, with the HL-LHC constraints being more competitive in some cases, and the LHeC ones in others. Our results illustrate the encouraging complementarity of the HL-LHC and the LHeC in terms of charting the quark and gluon structure of the proton.

Ontology / Topics


Large Hadron Collider (LHC) Parton distribution functions (PDFs)
Current status:
Has been resubmitted

Reports on this Submission

Anonymous Report 3 on 2019-8-4 Invited Report

  • Cite as: Anonymous, Report on arXiv:1906.10127v1, delivered 2019-08-04, doi: 10.21468/SciPost.Report.1096


See attachment.


  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -

Author Lucian Harland-Lang on 2019-09-05
(in reply to Report 3 on 2019-08-04)

Our response is found in the attachment.



Anonymous Report 2 on 2019-7-26 Invited Report

  • Cite as: Anonymous, Report on arXiv:1906.10127v1, delivered 2019-07-26, doi: 10.21468/SciPost.Report.1075


See details in extended report.




The paper tries to address the interesting and important question of how precision
deep-inelastic electron-proton data from a future 1.3 TeV collider (LHeC) could change our
knowledge about the structure of the proton. This is an exercise based on projected
data generated within the LHeC study group and using what the authors call a
'state-of-the-art' ansatz for the determination of parton distribution functions (PDFs).

The paper presents interesting results and is rather well written. However, it has
severe problems which require changes prior to publication. These regard the
relation of the authors to the origin of the pseudo-data, and they concern the
distinction of using global and especially LHC data from what the LHeC provides.

Hence I would like to request changes, motivated in three general points below.

1) The generated data and Section 2.1 are in fact the intellectual property of the
LHeC study group with Refs. [25] and [26]. Without such an input a paper of the
present type could not have been written. It is considered inappropriate that the
reference to this essential input is just a webpage which is cited only once. This would
be resolved, and mistakes (such as that on the positron luminosity) easily corrected, if,
as I sincerely propose, the current authors contacted the authors of the ep
pseudo-data generation and invited them to be co-authors of this paper. This would be
beneficial for the paper quality, and help future collaboration. It would then justify
the current wording of Section 2.1 and maintain a proper scientific standard w.r.t.
intellectual property. Furthermore, details of the pseudo-data as shown in various
presentations by the LHeC study group members could be added and thus the quality
of the input clearly described. This would enhance the credibility of the paper
considerably. A clarification of this point is of utmost importance for publication in a
peer-reviewed journal.

2) The authors claim that they are using a so-called 'state-of-the-art global fit'. This
is insufficient for the scope of the paper since it misses the very important distinction
between the clean theoretical basis of ep PDF-sensitive data and the plethora of global
and especially pp data. The big advantage of such new, high energy and high
luminosity (w.r.t. HERA) DIS data would be the self-consistent determination of all
proton PDFs for the first time, and at utmost precision, including the discovery
potential for new QCD phenomena such as at small x. The ground-breaking idea as
presented by the LHeC study group (c.f. LHeC CDR from 2012) would be to use the
independent ep data to test e.g. QCD factorisation in pp data and empower new
physics searches at LHC. In contrast to this understanding, the paper in its present
form presents the current (PDF4LHC) and future (HL-LHC) global PDFs as one way to
determine PDFs and only asks whether LHeC would improve the precision. For
theoretical, experimental and conceptual reasons, however, these are two systems
which cannot just be compared without a very clear discussion of their relative merits.
Briefly, the LHeC comes with the prospect of replacing the global PDFs, for only then will
one come back to indeed test and develop QCD theory; the comparison of HL-LHC
based PDFs with LHeC would then be made as an interesting check for consistency of
the theory and phenomenological assumptions. Technically, and in contrast to
statements in the paper, there is no need for an LHeC analysis to follow that
state-of-the-art; in particular, it will have different parameterisations and can be expected to
be internally consistent, all at N3LO, which in a global fit may be out of reach for quite
some time.

3) The authors refer mainly to their Ref. [9], which illustrates, in a most optimistic
scenario, the anticipated 'ultimate' PDF precision from LHC pp data. This reference is
insufficient for the scope of this paper since it neglects theoretical uncertainties and
current / future inconsistencies of the various pp data. Already now, we know that the
current status of fitting pp data is ambiguous, which is one of the main reasons why
global fits require large tolerance criteria to accommodate them (known for many
years already e.g. from the Tevatron jet data). For the scope of a paper which deals
with this kind of new ep data, it is mandatory to formulate exactly the
parameterisations used, which should be adjusted to the input sets used, i.e. also using the
given polarisation information would open up sensitivities to different PDFs and
thus require a different set of chosen parameterisations. The aforementioned
self-consistency of ep data, i.e. taken with one detector at once and analysed consistently
with all correlations clearly known, demands the use of a tolerance criterion of 1. In
addition to the careful description of the parameterisations used, other details like the
influence of the strong coupling constant uncertainty and more details on the heavy
flavour scheme are needed in the paper (including the influence of knowing the masses).
As also shown in the original LHeC CDR, such precise ep data could measure the
strong coupling and the charm and bottom masses with very high precision (see e.g.
Table 3.6 in Ref. [10] for charm mass).

Some initial, more detailed comments are listed underneath. Those should hopefully
help to improve the paper further and assume that my three general points have
been addressed in a positive way.

The abstract is missing the important difference in the theoretical grounds of ep vs
pp. It also reflects the results of studying part of the expected outputs from an LHeC
only, i.e. as also stated on page 20, adding ep jet data is expected to constrain the
gluon distribution considerably further over a broad x range.
Please modify the abstract to reflect those differences (c.f. also general comments
above) and the illustrative and limited character of your study. The 'complementarity'
of HL-LHC and LHeC comment is misleading since you need to compare T=1 in ep
with T=3 (or any other T>1 value).

Section 1
The introduction has to reflect in a more critical way the original contributions that
already exist for LHeC PDF studies, e.g. Refs. [10, 26] and further updates, e.g.
arXiv:1802.04317, also at workshops and conferences.
Add those and other references.

It is also important to discuss critically the so-called 'ultimate' HL-LHC data and
effects that new physics may have, see e.g. the recent ZEUS publication about joint PDF
and contact interaction fits [arXiv:1902.03048] discussing how PDFs may be modified
allowing new physics contributions. This is obviously a much more critical point for pp
data where new physics could be absorbed into PDFs. So, the aim to study LHeC data
in addition to HL-LHC is not really relevant, as explained above, since the goal of an
LHeC would be to deliver complete PDFs independently from LHC to empower pp

Further on page 3 you discuss the methodology of the profiling. This is a limited
methodology, however, in the context outlined above (no new physics, no new QCD
dynamics), it may be used. Please add the original literature for the profiling, in
particular in Chapter 3. The Ref. [9] is insufficient here and not an original source in
my view.

Section 2
As stated above, the paragraph 2.1 is only valid in the present form if the colleagues
who did the pseudo-data generation appear as co-authors. In principle the
whole paragraph belongs to Refs. [25,26]!
Then also mistakes appearing in the text can be corrected and the pseudo-data
could be explained in detail.
Correct mistakes such as:
The pseudo-data do not have an eta_l cut.
Table 2.1 belongs to Ref. [25]. What is F2_c,cc? The e+p data are expected to have
max. luminosity of 0.3 ab^-1, please respect the given inputs here. The strange
contribution is generated, as you can easily check.
Figure 2.1 also needs a reference such as 'based on Ref. [25]'.
The last sentence on page 6 is an inappropriate statement since the LHeC simulation
framework and assumptions evolved from the 2012 CDR towards a much more flexible
framework, see e.g. Ref. [19]. As the authors may know, the goal of the CDR2012 was
to study the precision in the limited HERA framework, the goal of the ongoing update
is to add b,c,s data and free the ubar=dbar assumption and thus demonstrate the
ability to unfold the partonic contents completely. In this regard the present paper
adds indeed useful and independent information.

The part 2.2 needs to be rewritten, for it is essential to understand the flavour
decomposition used. For experts to judge the numerical results, section 2.2 has to
explain in detail all the assumptions and parametric forms for the QCD fits. The
PDF4LHC ansatz appears not suitable for an LHeC pseudo-data fit. As discussed
earlier, the choice of parameterisations in the PDF4LHC fit is driven by the T>1
tolerance criterion needed to accommodate inconsistencies in the very many data sets used. An
LHeC would make such an ansatz obsolete. Furthermore, any new QCD dynamics
needs to be discussed more critically, i.e. HERA data could have hints for the onset of
e.g. saturation or BFKL dynamics / low-x resummation. However, an LHeC would
measure such novel effects precisely and hence would give the important input on how
QCD theory has to be developed further.
So, if the assumptions are made clearer to the reader, I think that it is a valid
approach for this study to restrict itself to testing DGLAP-like inputs.

Section 3
Please cite clearly the original contributions so that the reader is guided to the direct
sources (and not secondary sources).

On page 9, second paragraph, you introduce the T=3 criterion. As discussed, this is
irrelevant for precision ep data. For an apples-to-apples comparison, one would need
T=1 for ep and, if that is your choice, T=3 for the global fit.
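
To make the role of the tolerance concrete, the following is a minimal sketch of a generic Hessian profiling step with a tolerance T. It assumes a chi-squared of the form (d - t - Gamma beta)^T C^-1 (d - t - Gamma beta) + T^2 beta^T beta; this form, the function name `profile`, and all numbers are illustrative assumptions and are not taken from the paper or from the LHeC pseudo-data.

```python
import numpy as np

def profile(data, theory, gamma, cov, tolerance):
    """Toy Hessian profiling (illustrative assumption, not the authors' code).

    gamma[i, k] is the shift of prediction i when eigenvector k moves by
    beta_k = 1, i.e. by one prior uncertainty defined at Delta chi^2 = T^2.
    Returns the best-fit shifts and the reduced covariance of beta.
    """
    cov_inv = np.linalg.inv(cov)
    n_eig = gamma.shape[1]
    # Second-derivative matrix of chi^2 / 2 with respect to beta.
    hessian = gamma.T @ cov_inv @ gamma + tolerance**2 * np.eye(n_eig)
    beta = np.linalg.solve(hessian, gamma.T @ cov_inv @ (data - theory))
    # Uncertainties are again defined at Delta chi^2 = T^2.
    post_cov = tolerance**2 * np.linalg.inv(hessian)
    return beta, post_cov

def uncertainty_ratio(gamma, post_cov):
    """Profiled over prior uncertainty for each prediction."""
    prior = np.sqrt(np.sum(gamma**2, axis=1))
    post = np.sqrt(np.einsum("ik,kl,il->i", gamma, post_cov, gamma))
    return post / prior

# Hypothetical toy inputs: three predictions, two eigenvectors.
theory = np.array([1.0, 2.0, 3.0])
gamma = np.array([[0.10, 0.02], [0.05, 0.08], [0.02, 0.15]])
cov = np.diag([0.02**2, 0.03**2, 0.05**2])
data = theory.copy()  # pseudo-data generated at the central prediction

beta1, cov1 = profile(data, theory, gamma, cov, tolerance=1.0)
beta3, cov3 = profile(data, theory, gamma, cov, tolerance=3.0)
r1 = uncertainty_ratio(gamma, cov1)
r3 = uncertainty_ratio(gamma, cov3)
```

In this toy setup the same pseudo-data reduce the uncertainties more strongly for T=1 than for T=3, which is why the T value used for each data set has to be stated alongside any comparison.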

Section 4
Re-phrase the use of T=1 for ep data.

Fig. 4.1 Here it has to be noted that at high x, the gluon distribution is very small and
currently known to no better than about two orders of magnitude. Any delta g/g in
the range of 0.5 to 1 would be a massive improvement. It would be interesting, since
xg varies so dramatically, to also show xg itself, at large x in a linear x - log (xg)

In the text below Fig. 4.1 you discuss 'some tensions' in HERA ep data. I am not sure
why this is relevant for this study since all HERA fits use T=1 regardless of that. Then
you would need to discuss the many more and much more obvious tensions in the
data sets used in global fits as well, and, well, this would be another paper, wouldn't it?
In my view, it would be much clearer and sufficient for the scope of this paper if you
state the assumptions, the details of the parameterisations and then show the results.

Fig. 4.2 and discussions of T values: Please make clearer that this is more like a
technical illustration only. ep data will use T=1 and in case new QCD dynamics is
discovered at LHeC, new QCD theory will be used (which goes much beyond this
study).

Figs. 4.3 and 4.4 Please add the used T values in the captions. How are the relative
errors normalised? E.g. if each 'fit' gives a new central value, the fractional
uncertainties may shift artificially to lower/smaller values than the reference value.
Please also add that no theoretical uncertainties and other effects are considered for
HL-LHC, e.g. the scale uncertainties at low and high masses are considerable.
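
The normalisation question can be illustrated with a minimal sketch. All numbers and names below are hypothetical and only show how the same absolute uncertainty gives a different fractional uncertainty depending on whether it is divided by the fit's own central value or by a common reference value.

```python
import numpy as np

def relative_uncertainty(sigma, central):
    """Fractional uncertainty sigma / |central|."""
    return sigma / np.abs(central)

# Hypothetical values of a PDF at three x points (not from the paper).
reference_central = np.array([2.0, 1.0, 0.10])  # baseline set
fit_central = np.array([2.1, 0.9, 0.15])        # central value after the fit
fit_sigma = np.array([0.10, 0.05, 0.03])        # absolute uncertainty after the fit

# Normalised to the fit's own central value.
rel_own = relative_uncertainty(fit_sigma, fit_central)
# Normalised to the common reference central value.
rel_ref = relative_uncertainty(fit_sigma, reference_central)
```

Where the fitted central value lies above the reference, dividing by the fit's own central value gives a smaller fractional uncertainty for the same absolute error, so the normalisation used should be stated in the captions.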

For the discussion of LHeC vs HL-LHC: Please also repeat the assumptions: not all
LHeC data fitted, no new physics in HL-LHC observed, all pp data consistent within
T=3 while ep uses T=1.

The statement on page 14 on low mass Drell-Yan and inclusive D meson pp data
needs a more critical discussion. Those data are prone to large theoretical
uncertainties and may be finally of limited use w.r.t. improvements of our low-x

Section 5
This section is much too general and too much focussed on current paradigms
followed up in the PDF4LHC combination. It also neglects the more recent
developments of the fits performed within the LHeC study group. Please rewrite it and
make it specific using a rewritten section 2.2.
LHeC would be a totally new machine going much beyond HERA limitations in the
allowed flexible parameters. It would allow all PDFs to be unfolded: u,d,c,b,s (no f_s
needed), ubar and dbar.

The summary of this section also needs to be rewritten.

Section 6
The sentence is unclear: "In the large-x region, which is of course crucial for BSM
searches, the LHeC and HL-LHC impact is broadly found to be comparable in size, with
the HL-LHC resulting in a somewhat larger reduction in the gluon and strangeness
uncertainty, while the LHeC has a somewhat larger impact for the ..."
-> In the case of pp, we want to find new (non-resonant like contact interaction)
physics at high masses/high x, how shall we use the same data for PDF fits? This
would be only possible with much more complicated fits a la the recent ZEUS paper.
Please clarify what you really mean. The point to be made here is not the similar
precision (which as you indicate is only due to the current neglect of jet LHeC data)
but the complete change of the analysis: LHeC would deliver external, reliable
precision PDF input to enable new physics searches at high mass/x, i.e. not using
those data for pp PDF analyses.

Overall, the summary needs to be adjusted and appropriate references to the
LHeC study group have to be added.

Requested changes

Please address points 1 to 3 in the extended report and further requests therein.

  • validity: good
  • significance: high
  • originality: good
  • clarity: good
  • formatting: excellent
  • grammar: excellent

Author Lucian Harland-Lang on 2019-09-05
(in reply to Report 2 on 2019-07-26)

Our response is found in the attachment.



Anonymous Report 1 on 2019-7-23 Invited Report

  • Cite as: Anonymous, Report on arXiv:1906.10127v1, delivered 2019-07-23, doi: 10.21468/SciPost.Report.1069


1. Gives a reasonable estimate of the effect of a future LHeC on the uncertainties of Parton Distribution Functions
2. Relates the work to other studies with critical comment
3. Relates the work to BSM and SM cross sections of current interest
4. Most points are made very clearly


1. Does not deal with realistic systematic uncertainties, but to be fair this is close to impossible with future projections
2. LHeC pseudo-data are used which may be superseded soon, but that is not anything the authors could control
3. The odd point is obscure


This paper makes an assessment of the likely impact of a future LHeC on the uncertainties on parton distribution functions (PDFs), using similar procedures to a recent paper which assessed the impact of the HL-LHC (by an overlapping set of authors). It is a valuable contribution to the debate. It also addresses most of the criticisms that could be made of such a study within the paper itself and as such it is a pleasure to read. I particularly enjoyed the discussion of appropriate tolerances.
Even if the LHeC pseudo-data used are superseded, the paper makes a contribution in comparing methods currently used to assess their significance, which will be useful in future studies.
I recommend publication and have only a few minor comments to make under requested changes, which the authors may wish to consider.

Requested changes

1. 1 TeV proton data are mentioned in the context of extending the kinematic range. Would these not also be useful for the measurement of FL and hence the gluon PDF? Please comment.
2. Add to references 31 and 32 a reference to arXiv:1906.01884 on the same topic.
3. Reference is made to the use of open charm and beauty production and exclusive vector meson production to constrain the low-x gluon. The theoretical status of these processes is not on the same level as the cross sections used and I think this should be pointed out.
4. On page 15 it becomes a little confusing when it is stated that 'when the LHeC pseudo-data are generated with this more restrictive HERAPDF2.0 parametrisation one is making strong assumptions about the future'. Surely the LHeC pseudo-data are whatever they are, independent of whether PDF4LHC15 or HERAPDF2.0 is being used for the profiling exercise. The strong assumption is that the HERAPDF parametrisation will be adequate to describe future LHeC data. Can the authors please clarify, or re-phrase?
5. In Fig 5.5 an attempt is made to separate the role of the input data set and the parametrisation. This is quite interesting but the model and parametrisation variations for the HERAPDF are not used. It would be interesting to see this figure when they are used. However since this is not the most important point of the paper, I do not insist.

  • validity: high
  • significance: high
  • originality: high
  • clarity: high
  • formatting: excellent
  • grammar: excellent

Author Lucian Harland-Lang on 2019-09-05
(in reply to Report 1 on 2019-07-23)

Our response is found in the attachment.


