SciPost Submission Page
A benchmark study of dioxygen complexes based on coupled cluster and density functional theory
by Marcel Swart, Marc Reimann
This is not the latest submitted version.
This Submission thread is now published as
Submission summary
Authors (as registered SciPost users): | Marcel Swart |
Submission information | |
---|---|
Preprint Link: | https://doi.org/10.26434/chemrxiv-2023-56kdv-v2 (pdf) |
Date submitted: | 2024-06-24 23:14 |
Submitted by: | Swart, Marcel |
Submitted to: | SciPost Chemistry |
Ontological classification | |
---|---|
Academic field: | Chemistry |
Specialties: |
|
Approach: | Computational |
Abstract
A set of five compounds containing peroxo, superoxo or bis-mu-oxo moieties has been studied in the gas phase using CCSD(T)/aug-cc-pVQZ, also in combination with Goodson’s continued fraction approach. The corresponding analytical frequencies corroborate assignments of bands from experiments, and thus provide a consistent set of reference data that can be used for benchmarking a range of density functional approximations. A total of 100 density functionals have been checked for the general bond lengths, the specific peroxo/superoxo bond lengths, angles, vibrational frequencies and electronic energies. There is not one density functional that performs equally well for all of these properties, not even within one class of density functionals.
Author comments upon resubmission
The analysis of all statistics has been redone to take into account the updated reference data for compound 2, which affects mainly the vibrational frequencies (Table S4). We have also added more density functionals, including local and range-separated local hybrid functionals.
Similarly, anharmonic vibrational frequencies were obtained for the three compounds for which experimental data are available, which as expected indeed bring the coupled cluster results closer to experiment. This has been added with an additional table, and discussion in the main text.
Finally, some figures for statistical measures have been added to the main text; all six tables in the supporting information now include such a figure.
List of changes
General response to reviewers
After having published the manuscript on ChemRXiv, it was brought to my attention by Marc Reimann that the underlying coupled cluster data for one compound (2) was probably wrong for the frequencies. This relates back to the ‘well-known’ Hartree-Fock instability, which for some unexpected reasons only shows up in the vibrational frequencies. Marc suggested to redo the calculations for compound 2 with coupled cluster based on DFT orbitals, which I asked him to do and added him as co-author on the manuscript. Indeed, this solves the issue, and there is now excellent consistency between coupled cluster and DFT.
The analysis of all statistics has been redone to take into account the updated reference data for compound 2, which affects mainly the vibrational frequencies (Table S4). We have also added more density functionals, including local and range-separated local hybrid functionals.
Similarly, anharmonic vibrational frequencies were obtained for the three compounds for which experimental data are available, which as expected indeed bring the coupled cluster results closer to experiment. This has been added with an additional table, and discussion in the main text.
Finally, some figures for statistical measures have been added to the main text; all six tables in the supporting information now include such a figure.
Response to reviewer comments
Reviewer 1
Strengths
1. The study reports on an interesting topic, and rather systematic benchmarks are (sadly) seldom published as they take a lot of time to do, and are sometimes difficult to get accepted in journals ("what is the new chemical insight?"). This methodological work should be encouraged.
2. The reporting is done in a rather concise way, with data published as supporting information. This makes for an easy read, not getting down too much into the details in the main paper.
3. The methodology is reasonably well described, and the comparison with experimental data allows this to go beyond a simple "X vs Y" computational study.
Response:
We thank the reviewer for the positive comments.
After having published the manuscript on ChemRXiv, it was brought to my attention by Marc Reimann that the underlying coupled cluster data for one compound (2) was probably wrong for the frequencies. This relates back to the ‘well-known’ Hartree-Fock instability, which for some unexpected reasons only shows up in the vibrational frequencies. Marc suggested to redo the calculations for compound 2 with coupled cluster based on DFT orbitals, which I asked him to do and added him as co-author on the manuscript. Indeed, this solves the issue, and there is now excellent consistency between coupled cluster and DFT.
The analysis of all statistics has been redone to take into account the updated reference data for compound 2, which affects mainly the vibrational frequencies (Table S4). We have also added more density functionals, including local and range-separated local hybrid functionals.
Similarly, anharmonic vibrational frequencies were obtained for the three compounds for which experimental data are available, which as expected indeed bring the coupled cluster results closer to experiment. This has been added with an additional table, and discussion in the main text.
Finally, some figures for statistical measures have been added to the main text; all six tables in the supporting information now include such a figure.
Weaknesses
1. The author has gather a lot of data, but I do not think they exploit it sufficiently. In particular, there is no real graphic representation or statistical analysis of the trends. I encourage the author to add some further analysis, showing distributions of values, correlations between different functionals (within the same family and outside), etc. This would allow to give a more quantitative conclusion than "there is no single best performing functional" (which is somewhat expected). Maybe synthesizing the performance of every function on each metric (geometry, frequencies, etc.) with a "figure of merit"? I don't want to be too prescriptive, but I think further statistical analysis would strongly improve the paper. Right now, it lacks a bit in the depth of analysis and conclusions.
1bis. Please provide graphical representation of the data, so it is easier to read than tables.
We have added now figures that show a graphical representation of the data, which have been added to Tables S1-S6, and three of these have been added in the main text. For instance, Figure 1 has been added:
Figure 1. Mean absolute deviations (Å) of all bonds (blue, Table S1) and peroxo/superoxo-bonds alone (red, Table S2) for the 100 density functionals compared to the CCSD(T)/aug-cc-pVQZ reference data. Shown from left to right are the classes of composites, LDA, GGA, MGGA, hybrid, MGGA-hybrid, range-separated hybrid, local hybrid, range-separated local hybrid and double hybrid.
Table 4 (previously 3) already contains additional information on the spread of errors (minimum, maximum, and average for all five properties).
2. The data gathered is published only inside a PDF file, with page breaks, page numbers, comments. It is not machine readable, and very far from the FAIR principles for research data publication. Please publish the data sets gathered in an open, machine readable format (CSV, JSON, etc). This would allow other groups to reuse it easily, make their own analysis, complement the work and have a lot more value to the community.
The data have been prepared in CSV and added to the supporting information.
3. For full reproducibility of the work for others, it would be necessary to publish the input files for the calculations. Even though the methodology is described, every calculation has tons of small "hidden parameters" which can be important for reproducing existing work. Input files, optimized geometries should be published (again, as machine-readable files in a ZIP archive or an online repository like Zenodo). Otherwise, the promise of the author ("The resulting data could be used as reference in the development of new density functionals, or used to benchmark existing density functionals.") is somewhat empty.
The input-files for all types of calculations are included now in the SI.
Report
I think this study can be improved and, at a later stage, accepted for publication. With some extension, I expect it will meet the criteria for publication.
We thank the reviewer for the positive feedback.
Requested changes
See above.
Reviewer 2
In this manuscript, Swart studied the performance of 75 density functionals against CCSD(T) for 5 compounds. However, due to the fatal mistake identified (see questions 5) and the contents, I am concerned that it may not meet the publication requirements.
Response:
We thank the reviewer for the positive comments.
After having published the manuscript on ChemRXiv, it was brought to my attention by Marc Reimann that the underlying coupled cluster data for one compound (2) was probably wrong for the frequencies. This relates back to the ‘well-known’ Hartree-Fock instability, which for some unexpected reasons only shows up in the vibrational frequencies. Marc suggested to redo the calculations for compound 2 with coupled cluster based on DFT orbitals, which I asked him to do and added him as co-author on the manuscript. Indeed, this solves the issue, and there is now excellent consistency between coupled cluster and DFT.
The analysis of all statistics has been redone to take into account the updated reference data for compound 2, which affects mainly the vibrational frequencies (Table S4). We have also added more density functionals, including local and range-separated local hybrid functionals.
Similarly, anharmonic vibrational frequencies were obtained for the three compounds for which experimental data are available, which as expected indeed bring the coupled cluster results closer to experiment. This has been added with an additional table, and discussion in the main text.
Finally, some figures for statistical measures have been added to the main text; all six tables in the supporting information now include such a figure.
1. The author used the CFOUR package to perform the first and second derivative calculations of coupled cluster with singles and doubles and perturbative triples (CCSD(T)) for geometry optimization and Hessian. However, in the paper, the author only cites the classic quantum chemistry textbook, ref. [15], instead of detailed CCSD(T) works. As far as I know, the book only contains chapters about the first derivative for closed-shell CCSD. However, in the paper, the author used analytical gradients and Hessian of CCSD(T) for both closed- and open-shell references. The correct references should be cited, e.g.,
J.D. Watts, J. Gauss, and R.J. Bartlett, Open-shell analytical energy gradients for triple excitation many-body, coupled-cluster methods: MBPT(4), CCSD+T(CCSD), CCSD(T),and QCISD(T), Chem. Phys. Lett. 200, 1-7 (1992)
J. Gauss and J.F. Stanton, Analytic CCSD(T) second derivatives, Chem. Phys. Letters 276, 70-77 (1997)
P.G. Szalay, J. Gauss, and J.F. Stanton, Analytic UHF-CCSD(T) second derivatives: implementation and application to the calculation of the vibration-rotation interaction constants of NCO and NCS, Theor. Chim. Acta 100, 5-11 (1998)
Moreover, in the paper, the author mentioned that the 'ECC program' is used. Please clarify whether 'ECC' refers to a conventional UCCSD(T) or a spin-adapted UCCSD(T).
The references have been updated, and the ECC (conventional) has been clarified.
2. It is essential to explicitly state the charge and spin (multiplicity) used in the calculations for all five compounds under investigation. Regarding the three compounds with available experimental data, could the authors clarify whether the spin information is also derived from the experiments? To enhance the understanding of the remaining two compounds without experimental data, I recommend that the author conducts CCSD(T) calculations with different multiplicities. This will help in distinguishing between ground and excited states in the study.
Three of the five compounds have singlet state and neutral charge, with complexes 2 (charge +1, doublet) and 4 (charge -1, singlet) as exception. This was already clearly indicated in Scheme 2 and explicitly stated in the main text where the results were discussed. E.g. in main text 2[FAlO2]•+ or [FAlO2]0. We have now added the charges and multiplicities explicitly in the figure caption of Scheme 2 as well.
Indeed, the singlet ground state is inferred from experiment. The current contribution does not deal with excited states, for which other approaches might need to be used. Therefore, this is beyond the scope of the manuscript.
3. The paper and its supplementary information mention an issue when parallelizing the calculation of open-shell CCSD(T) energies. Upon reviewing reference 24, it appears that the parallel implementation for open-shell second derivatives has not been developed. Could the authors provide clarification on this matter? To my understanding, analytical gradient and Hessian algorithms tend to be more complex than energy calculations. I'm curious if the parallel versions of the first and second-order derivative algorithms also encounter issues at the open-shell CCSD(T) level. Could the authors provide clarification on the presence of any such bugs in these derivative algorithms?
The parallel implementation only affects the MP2 energy part. All other properties (first order, second order) are not affected.
The parallel implementation is working properly, and might have been added after the reference 24 was published.
4. Open-shell systems often exhibit spin contaminations in their UCCSD(T) wave functions. It is recommended that the authors report the value of <S^2> for the UCCSD wave function. To address spin contaminations, Szalay and colleagues introduced a spin-restricted method for open-shell systems, as described in their work (J. Chem. Phys. 107, 9028–9038 (1997)). If the calculated value of <S^2> for the open-shell UCCSD differs significantly from the ideal value, it is advisable to consider the use of a spin-adapted UCCSD(T) approach.
The <S2> value is what was to be expected (added to SI):
----------------------------------------------
Projected spin multiplicities:
----------------------------------------------
Projected <0|S^2 exp(T)|0> = 0.7528990194.
Approximate spin mult. = 2.0028969214.
----------------------------------------------
5. If I comprehend this correctly, in addition to exploring various second-order properties, the author has compared absolute energies obtained from different DFT methods with those from CCSD(T) or cc-cf calculations, which, in my view, appears illogical. Absolute energies derived from various single-reference or multi-reference CI methods could be valuable, considering the variation principle. However, the absolute MPn or CC energies should be approached with caution unless their wave functions have been properly normalized. To the best of my knowledge, comparing absolute energies obtained from DFT calculations is discouraged, and the focus should be on relative energies. Consequently, the results presented in Table S5 may not provide meaningful insights. Rather than absolute energies, it would be more meaningful if the author computed atomization energies (relative energies) using both CCSD(T) and DFT methods. Given that absolute energies are not considered meaningful in this context, it follows that the extrapolated FCI energies obtained via the Goodson method may also lack significance. However, it would still be valuable for the paper to report atomization energies based on the extrapolated CI energies
Atomization energies for many compounds can be found in many contributions from the literature. Moreover, because this involves calculations on atoms, rather than molecules, the result is prone to interpretation (and use of) of atomic term symbols, symmetries, and how to distribute electrons over the different orbitals and shells, when computed with a molecular program. Most likely we would need to run spin-orbit coupling on top of the CCSD(T)-cf for consistency. This is clearly beyond the scope of the current manuscript. Furthermore, it would deviate from the purpose of the current contribution, which is on dioxygen bonds.
To summarize the criticism in parts:
a) the CCSD(T)-cc is not a good representation of the exact energy.
It is the closest we could get to the exact values. Therefore we think we can use it as reference. Note that since we have studied the compounds as well with KS orbitals, we also have energy values for CCSD(T)-cc based on KS. These differ from the values based on HF orbitals by ca. 0.2 eV. This is most likely a good estimate of the accuracy we can achieve with CCSD(T)-cc.
b) the DFT absolute energies cannot be used.
This is exactly the reason why the comparison is meaningful. It is a reminder that unlike wavefunction approaches that (often) systematically improve the wavefunction towards the full-CI exact energy, within DFAs, some approximations are made that affect the convergence towards the same exact energy to be more random-like. The purpose of the fifth property is to highlight just that: how far have we (as DFT community) gone away from the exact energy.
Therefore, we think that the fifth property is valid and useful. We have, however, added some statements that make it clearer to the reader that this comparison is only an approximate one.
6. In my view, instead of starting with benchmarking the first (geometry) and second-order derivatives of energies for oxo compounds, the author should consider commencing with energy-related benchmarks. Atomization energies would be a suitable choice for such a benchmark.
As mentioned before, and clearly shown in the manuscript, there is a problem with (su)peroxo bonds. This is the main focus and topic of the manuscript. It would not be useful to study atomization energies for these bonds.
7. In the abstract, the author mentions, 'There is not one density functional that performs equally well for all of these properties, not even within one class of density functionals.' However, I couldn't find any analysis comparing different types of DFT, such as LDA or GGA. It would be beneficial to identify at least one type of DFT that demonstrates relatively better performance in terms of predicting geometry or vibrational frequencies. Without such analysis, the paper lacks valuable information beyond the CCSD(T) results.
The manuscript now includes visual proof of accuracy for different properties. For instance, Figure 3 provides for each class of DFT visual data. It would be unfair to average these out into different classes because the number of DFAs taken into account for the different classes is not the same. Also, the deviation within a single class of functionals is typically very large (as can also be seen in our data) and actually much larger than the difference between average values. The SI also has a Figure for each of Tables S1-S6, that clearly identifies the accuracy per class.
Figure 3. Overall assessment (Å2·°·cm-1·eV) for the 100 density functionals compared to the CCSD(T)/aug-cc-pVQZ reference data. Shown from left to right are the classes of composites, LDA, GGA, MGGA, hybrid, MGGA-hybrid, range-separated hybrid, local hybrid, range-separated local hybrid and double hybrid.
Reviewer 3
The work by Swart reports a benchmark study of 75 density functionals against CCSD(T) for five Al/Si-oxygen complexes. For bond length, bond angles, vibrational frequencies and electronic energies tested in this study, it was concluded that the none of DFT functionals can performs equally well for all of these properties.
Response:
We thank the reviewer for the positive comments.
After having published the manuscript on ChemRXiv, it was brought to my attention by Marc Reimann that the underlying coupled cluster data for one compound (2) was probably wrong for the frequencies. This relates back to the ‘well-known’ Hartree-Fock instability, which for some unexpected reasons only shows up in the vibrational frequencies. Marc suggested to redo the calculations for compound 2 with coupled cluster based on DFT orbitals, which I asked him to do and added him as co-author on the manuscript. Indeed, this solves the issue, and there is now excellent consistency between coupled cluster and DFT.
The analysis of all statistics has been redone to take into account the updated reference data for compound 2, which affects mainly the vibrational frequencies (Table S4). We have also added more density functionals, including local and range-separated local hybrid functionals.
Similarly, anharmonic vibrational frequencies were obtained for the three compounds for which experimental data are available, which as expected indeed bring the coupled cluster results closer to experiment. This has been added with an additional table, and discussion in the main text.
Finally, some figures for statistical measures have been added to the main text; all six tables in the supporting information now include such a figure.
(1) I am not sure if the electronic energies are comparable between CCSD(T) and different functionals and how meaningful to do this test?
To summarize the criticism in parts:
a) the CCSD(T)-cc is not a good representation of the exact energy.
It is the closest we could get to the exact values. Therefore we think we can use it as reference. Note that since we have studied the compounds as well with KS orbitals, we also have energy values for CCSD(T)-cc based on KS. These differ from the values based on HF orbitals by ca. 0.2 eV. This is most likely a good estimate of the accuracy we can achieve with CCSD(T)-cc.
b) the DFT absolute energies cannot be used.
This is exactly the reason why the comparison is meaningful. It is a reminder that unlike wavefunction approaches that (often) systematically improve the wavefunction towards the full-CI exact energy, within DFAs, some approximations are made that affect the convergence towards the same exact energy to be more random-like. The purpose of the fifth property is to highlight just that: how far have we (as DFT community) gone away from the exact energy.
Therefore, we think that the fifth property is valid and useful. We have, however, added some statements that make it clearer to the reader that this comparison is only an approximate one.
(2) I think it is generally recognized that DFT functionals performs well for the ground-state geometries, especially for the complexes composed of main group elements, as selected in this study.
Typically we would expect an accuracy of 0.01 Å for bonds. However, Figure 1, Table 4, Table S1-S2 show a totally different picture. There are few DFAs that meet this target of 0.01 Å, which is getting worse for the (su)peroxo bonds (Table S2), as discussed in the text. Roughly speaking, the deviation more than doubles on average, with some classes like double-hybrids showing deviations between 0.08 and 0.12 Å. Therefore, the "generally recognized view" does not hold for these dioxygen bonds. Indeed, this is one of the main conclusions.
Figure 1. Mean absolute deviations (Å) of all bonds (blue, Table S1) and peroxo/superoxo-bonds alone (red, Table S2) for the 84 density functionals compared to the CCSD(T)/aug-cc-pVQZ reference data. Shown from left to right are the classes of composites, LDA, GGA, MGGA, hybrid, MGGA-hybrid, range-separated hybrid, local hybrid, range-separated local hybrid and double hybrid.
(3) I feel it would be more meaningful to evaluate the performance of DFT in terms of transition state geometry or kinetic barriers, which could be more instructional.
The focus of the manuscript is on dioxygen bonds as in superoxo and peroxo complexes. We have taken from the literature experimental data, computed the CCSD(T) results which match experiment (when anharmonic frequencies were computed). This corroborates the good performance of CCSD(T) with large basis sets for these compounds. Therefore, the resulting bond lengths and vibrational frequencies provide an excellent benchmark set for validation of DFT. This is the purpose of the current manuscript.
Reviewer 4
Strengths
1. A new set of benchmark data for dioxygen complexes is reported.
2. A large number of density functional approximations are tested against the data.
3. The supporting information contains enough "input file" information to easily reproduce the coupled cluster calculations.
4. GIF files are provided as supporting information to visualise the vibrational normal modes.
Response:
We thank the reviewer for the positive comments.
After having published the manuscript on ChemRXiv, it was brought to my attention by Marc Reimann that the underlying coupled cluster data for one compound (2) was probably wrong for the frequencies. This relates back to the ‘well-known’ Hartree-Fock instability, which for some unexpected reasons only shows up in the vibrational frequencies. Marc suggested to redo the calculations for compound 2 with coupled cluster based on DFT orbitals, which I asked him to do and added him as co-author on the manuscript. Indeed, this solves the issue, and there is now excellent consistency between coupled cluster and DFT.
The analysis of all statistics has been redone to take into account the updated reference data for compound 2, which affects mainly the vibrational frequencies (Table S4). We have also added more density functionals, including local and range-separated local hybrid functionals.
Similarly, anharmonic vibrational frequencies were obtained for the three compounds for which experimental data are available, which as expected indeed bring the coupled cluster results closer to experiment. This has been added with an additional table, and discussion in the main text.
Finally, some figures for statistical measures have been added to the main text; all six tables in the supporting information now include such a figure.
Weaknesses
1. It is not clear how useful the comparison of absolute total energies is.
2. There is a known deficiency in the basis set selected and it is unclear what impact this may have on the results.
Report
In producing a new benchmarking data set at a high-level of theory, this work has clear potential for follow up work, particularly in the development and testing of new density functional approximations.
In general it is clearly written with suitable abstract and introduction. The combination of the computational details and supporting information means that the work is easily reproducible and there are appropriate citations.
To further address the highlighted weaknesses:
1. It is not clear that the total energy comparison is useful, particularly as the coupled cluster method is non-variational. Some form of relative energies would seem more appropriate and presumably remove the large outlier results that are evident in Table 3. Total atomisation energies would be a possibility here and should be relatively easy to compute as an extension of the calculations already performed.
Atomization energies for many compounds can be found in many contributions from the literature. Moreover, because this involves calculations on atoms, rather than molecules, the result is prone to interpretation (and use of) of atomic term symbols, symmetries, and how to distribute electrons over the different orbitals and shells, when computed with a molecular program. Most likely we would need to run spin-orbit coupling on top of the CCSD(T)-cf for consistency. This is clearly beyond the scope of the current manuscript. Furthermore, it would deviate from the purpose of the current contribution, which is on dioxygen bonds.
To summarize the criticism in parts:
a) the CCSD(T)-cc is not a good representation of the exact energy.
It is the closest we could get to the exact values. Therefore we think we can use it as reference. Note that since we have studied the compounds as well with KS orbitals, we also have energy values for CCSD(T)-cc based on KS. These differ from the values based on HF orbitals by ca. 0.2 eV. This is most likely a good estimate of the accuracy we can achieve with CCSD(T)-cc.
b) the DFT absolute energies cannot be used.
This is exactly the reason why the comparison is meaningful. It is a reminder that unlike wavefunction approaches that (often) systematically improve the wavefunction towards the full-CI exact energy, within DFAs, some approximations are made that affect the convergence towards the same exact energy to be more random-like. The purpose of the fifth property is to highlight just that: how far have we (as DFT community) gone away from the exact energy.
Therefore, we think that the fifth property is valid and useful. We have, however, added some statements that make it clearer to the reader that this comparison is only an approximate one.
2. The inclusion of "tight" d functions in the basis set is known to be important for second row elements such as Al and Si (see DOI: 10.1063/1.1367373 ) but not used in this investigation. For a benchmark study such as this, the impact of including such functions should be known.
The compounds had already been studied with aug-cc-pVTZ and def2-TZVPD as well, which showed similar results. We have now included the CCSD(T) results for aug-cc-pVTZ, aug-cc-pV(T+d)Z, aug-cc-pV(Q+d)Z in the SI as well (Tables S8 and S9). From these results it can be seen that the effect of the "tight" d functions is smaller than the difference between aug-cc-pVTZ and aug-cc-pVQZ. Therefore, the effect is small. Note that all DFA calculations used aug-cc-pVQZ, therefore we keep using the aug-cc-pVQZ basis set for CCSD(T) as reference values.
I also note that the continued fraction method is not that well known outside of the high-accuracy thermochemistry community. I would recommend giving slightly more information on this, including the key equation, as part of the manuscript.
This has been added to the manuscript now.
Requested changes
1. Replace the comparison of total electronic energies with total atomisation energies.
2. Calibrate the effect of including tight d functions in the basis for Al and Si. This could potentially be explored at a lower-zeta level [for example comparing aug-cc-pV(D+d)Z with aug-cc-pVDZ] to determine if it is significant for these particular systems.
3. Briefly expand the description of the continued fraction method.
Validity: Good
Significance: Good
Originality: Good
Clarity: High
Formatting: Excellent
Grammar: Excellent
Current status:
Reports on this Submission
Report #1 by Anonymous (Referee 2) on 2024-6-25 (Invited Report)
- Cite as: Anonymous, Report on arXiv:10.26434/chemrxiv-2023-56kdv-v2, delivered 2024-06-25, doi: 10.21468/SciPost.Report.9297
Report
Although the authors have significantly improved the manuscript, the fatal mistake (question 5 from my last report) has not been addressed. I believe the authors have a misunderstanding regarding "electronic energies" or "absolute energies." "Absolute energies" depend not only on the wave function or density, but also on the Hamiltonian. In some cases, the "absolute energies" of full CI may not serve as an "exact reference". This is because you have not considered scalar relativistic effect, finite nuclear effect, etc. All these changes in the Hamiltonian could result in significantly different "absolute energies." Moroever, even wave function methods, such as some MRCC, rely on an "effective Hamiltonian", which may not produce exact "absolute energies" as FCI. However, we consider these methods to be high-accuracy because we primarily care about "relative energies."
Moreover, the atomization calculations may not be as difficult as described in the authors' reply. The spin-orbit coupling is not critical in these systems. Considering the scalar relativistic effect, such as spin-free X2C or DKH, is sufficient. Since the authors did not consider the scalar relativistic effect initially, the absence of any relativistic effects should not introduce large errors in the atomization energies. This can be confirmed by the works on the basis set development. If the relativistic effect were significant for Al and Si, basis set developers would consistently recommend using the aug-cc-pVXZ-DK basis set over the aug-cc-pVXZ.
In summary, I cannot recommend the publication of this paper if the authors continue to make such misleading comparisons.
Recommendation
Ask for major revision