2019-nCoV vs. SARS-CoV: Which Truly Has a Higher ACE2 Affinity? A Quantum Chemical Perspective on Virus-Receptor Noncovalent Interactions

While the title question is still a matter of ongoing debate, current hypotheses suggest that a few particular amino acid substitutions are responsible for the ACE2 affinity differences between the two viruses (the spike proteins of which exhibit 76% sequence identity, and are therefore assumed to adopt similar folded structures). In the present paper, noncovalent interaction energetics associated with said substitutions are assessed by means of B3LYP/TZVP electronic structure calculations on a representative set of geometries, chosen to reflect different biochemically-significant spatial organizations of the amino acid residues at hand. We found that the sum and (unbiased) average of the calculated interaction energetics under consideration are larger in SARS-CoV than in 2019-nCoV. Thus, we challenge earlier predictions that claim a higher ACE2 affinity for 2019-nCoV on the basis of “chemical-intuition”-based analyses of said substitutions alone. We also demonstrate that the latter predictions are potentially biased, being based on a SARS-CoV–ACE2 crystal structure, which should not be expected to represent a bound spike-protein–receptor complex in physiological conditions. By comparing electronic-structure-based results with ones obtained using the widely-used MMFF94 molecular mechanics force field, we show that, despite being specifically parametrized for van der Waals interactions, such classical force fields might prove inadequate in cases where only a few well-defined noncovalent binding factors are assumed to play crucial roles in biochemically-significant binding events.


Introduction
2019-nCoV is a newly emerged, human-infectious coronavirus (CoV) that has recently spread to become a global epidemiological problem. As of 12.03.20, there have been more than 134,509 diagnosed cases and more than 4,981 confirmed deaths worldwide (ABC News, USA). Because the pathogenesis of this virus has not been thoroughly investigated, healthcare professionals are forced to rely on ad hoc practical treatment options.
A long-term, economically-sustainable battle against 2019-nCoV, however, would require the development of effective medicinal measures (e.g., drugs, vaccines and the like).
One important starting point towards such development is the virus' genome sequence, which is currently available (GenBank ID: MN908947.3). 1 It has been realized that 2019-nCoV shares 82% sequence identity with severe acute respiratory syndrome-related coronavirus (SARS-CoV). Thus, past findings from medicinal chemistry studies on SARS-CoV may be used directly in current research efforts on 2019-nCoV.
It is well known that CoVs rely on their spike proteins for binding to a host cell surface receptor, which, in turn, leads to cell invasion. As with SARS-CoV, the particular receptor to which 2019-nCoV binds is angiotensin-converting enzyme 2 (ACE2). 2 Since the spike proteins of the two viruses exhibit no less than 76% sequence identity, they are broadly assumed to adopt similar folded structures. In SARS-CoV, a receptor binding domain responsible for high-affinity interactions with ACE2 has already been identified in region S1 of the spike protein (see Ref. 2 for the appropriate nomenclature).
The current hypothesis is that this receptor binding domain is also employed by 2019-nCoV. 3 That being said, non-conserved mutations (that is, deviations from the known SARS-CoV amino acid sequence in these regions) were recognized in structural regions that interact directly with ACE2. 4 Thus, it was suggested that some particular amino acid substitutions are responsible for overall cell-affinity differences between the two viruses. Identifying particular amino acids that contribute significantly to ACE2 binding in 2019-nCoV is expected to lead to the development of effective vaccines. 5,6 Based mostly on "chemical intuition", Morse et al. have predicted that several amino acid substitutions would result in a greater affinity for 2019-nCoV, due to stronger noncovalent interactions with specific, corresponding residues in ACE2. 3 In this context, the following substitutions were addressed (SARS-CoV/2019-nCoV): V404/K417, Y484/Q498, L472/F486, and C474/Q492. However, reliable assessments of the relevant noncovalent interactions, which may arise from subtle electronic effects that must be explicitly considered, are needed in order to confirm or refute predictions of this sort.
In this context, electronic structure methods constitute a precious (and almost exclusive) source of information on biochemically-relevant noncovalent interactions between individual molecules, which, in turn, is often used for the parametrization and calibration of more approximate computational modeling techniques (such as classical molecular mechanics force fields). [7][8][9][10][11] Having been involved in a few recent quantum chemical investigations of noncovalent complexes, [12][13][14][15][16] we shall now attempt to employ B3LYP/TZVP calculations, which have been shown to be capable of providing reliable noncovalent interaction energetics, 17 for the purpose of calculating energy differences associated with the aforementioned substitutions. In this manner, the energetic "gains/losses" that result from these particular substitutions may be quantitatively assessed, so that important practical conclusions regarding the above prediction may be drawn. This general strategy is concisely summarized in Figure 1.
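To make the "sequence identity" figures quoted above concrete, the following is a minimal sketch of one common way to compute percent identity from two already-aligned sequences. The function name, the gap-handling convention (columns containing a gap are skipped) and the toy fragments are assumptions made for illustration; the original text does not state which definition or alignment tool was used, and the fragments are not real spike sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: identity is normalized by the number of gap-free columns;
    other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments, for illustration only.
toy_a = "VKYLC-QF"
toy_b = "KKQFQAQF"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because several normalizations are in use, identity values from different sources are only comparable when the same convention is applied.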

[Figure 1. Panel labels: 2019-nCoV, SARS-CoV.]
representative discussions on protein flexibility in Refs. 18,19). Thus, crystal structure geometries should not be considered as an ultimate source of structural information on the systems at hand (this fact seems to have been overlooked in recent research efforts). We will therefore mostly restrict our discussion to particular spatial organizations of the relevant amino acid dimers/clusters, chosen to represent "upper bounds" for said interactions. In this manner, we will be able to use statistical inference tools and critically assess whether said substitutions actually lead to a substantial potential increase in interaction energies or, perhaps, suggest that their implications are not quite as straightforward. Still, in order to compare our results with those of Morse et al., we will also consider geometries derived directly from the crystal structure in Ref. 2.
Since some recent research ventures have involved indirect assessments of the aforementioned noncovalent binding factors by means of classical molecular mechanics force fields, [20][21][22] we shall also compare our electronic-structure-based energetics with ones obtained using the widely-applicable MMFF94 force field, which has been specifically parametrized for van der Waals (vdW) interactions 23 and is often used for investigations of organic molecules and polypeptides. 24,25

Computational Methods
Electronic structure calculations were performed using the Gaussian16 26 software suite and run on the Faculty of Chemistry HPC cluster at the Weizmann Institute of Science. The B3LYP functional, 27 which was recommended for biochemically-relevant noncovalent systems in Ref. 17, as well as Grimme's DFT-D3 atom-pairwise dispersion corrections 28 and the TZVP basis set 29 were used throughout.
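As an illustration of how a noncovalent interaction energy can be obtained from total electronic energies, here is a minimal sketch of the standard supermolecular difference (energy of the complex minus the sum of the fragment energies). That this is the exact protocol used in this work is an assumption, since the text does not spell it out; the function name and the toy energies are hypothetical, and no basis-set superposition correction is included.

```python
from typing import Sequence

HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard unit conversion factor


def interaction_energy(e_complex: float, e_fragments: Sequence[float]) -> float:
    """Supermolecular interaction energy in kcal/mol.

    Inputs are total energies in hartree. With this sign convention a
    negative result means the complex is stabilized relative to the
    isolated fragments.
    """
    return (e_complex - sum(e_fragments)) * HARTREE_TO_KCAL_PER_MOL


# Hypothetical total energies (hartree), NOT results from this work.
e_int = interaction_energy(-100.010, [-60.000, -40.000])
print(f"Interaction energy: {e_int:.2f} kcal/mol")
```

When comparing "larger" or "smaller" interactions between viruses, it matters whether signed energies or their magnitudes are being compared, so the convention should be fixed before any sums or averages are taken.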
In addition, MMFF94 23,30 geometry optimizations and energy calculations were performed using the Avogadro software (Macintosh version 1.2). 31 The sixteen geometries considered in this work were obtained in the following manner: first,

Results and Discussion
Before we begin our analysis, a few notes about nomenclature are in order. As can be seen in Table 1, we labeled all noncovalent interactions under consideration based on the participating residues in ACE2. For instance, the particular interaction involving residues D30 and H34 in ACE2, as well as residue V404 in SARS-CoV, will be denoted 1-SARS-n (n denotes a particular molecular structure representing the latter interaction; specifically, 1-SARS-a corresponds to the SARS-CoV-ACE2 crystal structure from Ref. 2). For the sake of good order, structure illustrations and names are also presented in Figure 2.

Table 1. Shorthand notation used for interactions considered in this work (see also Figure 2).
Interaction number (n)    ACE2    SARS/nCoV
4    Q24    C474/Q492
Notes: a, b, c are used to denote different (chemically-significant) structures, which differ in their noncovalent nature (see Table 2). n-SARS-a (n=1-4) correspond to interactions derived from the SARS-ACE2 crystal structure in Ref. 2.

Our calculated B3LYP/def2-TZVP noncovalent interaction energetics, as well as their deviation from ones calculated using the classical MMFF94 force field, are presented in Table 2. We can now move on and focus on our B3LYP/def2-TZVP results. Excluding interaction 1,
Again, based on the present interaction space, and considering that both receptor binding domains in the virus' spike protein and their corresponding binding sites in ACE2 may exhibit significant conformational flexibility, we may now come up with a few possible interaction "scenarios" for the two viruses, which will, in turn, help us establish important practical conclusions regarding the latter's overall ACE2 affinities (Table 3). First, if we assume that all interactions may indeed reach their maximum value for both viruses (scenario I), then SARS-CoV can be expected to bind to ACE2 with a slightly greater (~6%) overall affinity; however, since this scenario should clearly not be perceived as having exclusive biochemical significance, additional scenarios must be considered as well. In scenario II, minimum values are considered for all investigated interactions. 2019-nCoV, in this case, is clearly the "winner", as the sum of its interactions amounts to ~350% of the corresponding sum for SARS-CoV. That being said, it should be stressed that interactions derived from the crystal structure of SARS-CoV-ACE2 (that is, n-SARS-a), which can be expected to represent lower bounds for interactions 1, 2 and 4, are likely to bias this scenario in favor of 2019-nCoV. Still, since the statistical dispersion measures are larger for SARS-CoV even when crystal structure geometries are not considered (see Table 2), 2019-nCoV may be expected to bind to ACE2 with higher affinity in the current scenario.
If we consider the average value of all investigated interactions for each virus as a measure for overall ACE2 affinity (scenario III), then 2019-nCoV is shown to have the upper hand (larger by ~26% than the average interaction for SARS-CoV). Similarly, summing the average value for each individual interaction (scenario IV) gives 2019-nCoV an advantage of ~34%. However, once the aforementioned, somewhat-biased crystal-structure-derived energetics are removed from our averaging (scenarios V and VI), SARS-CoV clearly exhibits superior statistics (values are larger than those for 2019-nCoV by ~20% and ~18%, respectively). The above findings lead us to suspect that previous
Before ending this discussion, we would like to clarify that we do not at all wish to imply that realistic ACE2 affinities can be trivially reduced (in the statistical sense) to the four noncovalent interactions considered above. What we do mean to demonstrate is that (a) hypotheses regarding implications of particular amino acid substitutions on overall receptor affinity must be confirmed or refuted by means of appropriate information on relevant noncovalent binding factors, and that (b) in the absence of such information, classical molecular dynamics simulations, as well as insights drawn from "chemical intuition", are prone to lead to erroneous conclusions. Alas, claims according to which 2019-nCoV has, in fact, a higher affinity towards ACE2 compared to SARS-CoV should not be based upon shots in the dark and questionable expectations.
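The six scenarios discussed above reduce to a handful of aggregate statistics over a table of interaction strengths. The sketch below shows one way they could be computed: scenario I as the sum of per-interaction maxima, II as the sum of per-interaction minima, III as the mean of all values, IV as the sum of per-interaction means, and V and VI as III and IV with crystal-structure-derived entries excluded. The data layout, the function names and all numbers are hypothetical placeholders, not values from Table 2 or Table 3.

```python
from statistics import mean

# HYPOTHETICAL interaction strengths (kcal/mol, larger = stronger binding).
# Layout: interaction number -> structure label -> value. Label "a" marks a
# crystal-structure-derived geometry, following the n-SARS-a notation.
toy_sars = {
    1: {"a": 1.0, "b": 9.0},
    2: {"a": 2.0, "b": 8.0},
    3: {"a": 5.0, "b": 7.0},
    4: {"a": 1.5, "b": 6.0},
}


def scenarios(data, exclude=frozenset()):
    """Aggregate statistics for one virus, skipping structure labels in `exclude`."""
    kept = {
        n: [v for label, v in structures.items() if label not in exclude]
        for n, structures in data.items()
    }
    kept = {n: vals for n, vals in kept.items() if vals}  # drop emptied interactions
    all_values = [v for vals in kept.values() for v in vals]
    return {
        "sum_of_maxima": sum(max(vals) for vals in kept.values()),  # scenario I
        "sum_of_minima": sum(min(vals) for vals in kept.values()),  # scenario II
        "mean_of_all": mean(all_values),                            # scenario III (V if excluding)
        "sum_of_means": sum(mean(vals) for vals in kept.values()),  # scenario IV (VI if excluding)
    }


def percent_larger(x: float, y: float) -> float:
    """By how many percent x exceeds y."""
    return 100.0 * (x - y) / y


with_crystal = scenarios(toy_sars)
without_crystal = scenarios(toy_sars, exclude={"a"})
print(with_crystal)
print(without_crystal)
```

In this toy data the mean rises from 4.9375 to 7.5 once the weak "a" entries are dropped, which mirrors how the exclusion of crystal-structure-derived values can change which virus appears to bind more strongly.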

Summary and Conclusions
Based on our analysis of the noncovalent interaction space under consideration, we were able to draw the following conclusions:
• The sum of the strongest calculated noncovalent interactions is larger in SARS-CoV than in 2019-nCoV. Thus, the former is, in principle, capable of stronger binding to ACE2, contrary to former predictions.
• If only significantly interacting structures are considered (that is, if extremely weak interactions derived from the SARS-ACE2 crystal structure are removed from the investigated sample), then the average of all calculated noncovalent interaction energies and the sum of average energies for interactions 1-4 are also larger in SARS-CoV than in 2019-nCoV (see scenarios V and VI in Table 3). This indicates that earlier predictions (suggesting the opposite) are potentially biased towards crystal structure geometries, which might indeed be unrepresentative of bound structures in physiological conditions.
• The MMFF94 force-field is clearly inadequate for calculating realistic noncovalent energetics for the interactions at hand. Classical molecular mechanics force-fields of this sort are thus prone to produce unreliable results in cases where a small number of noncovalent factors are considered as biochemically-significant.
In addition, we suggest that the following considerations should generally be taken into account in future investigations of biochemically-significant noncovalent binding factors, and particularly in virus-receptor affinity estimates:
• Conformational flexibility of the considered proteins in physiological conditions cannot trivially be inferred from crystal-structure geometries (a fact that seems to have been overlooked in previous investigations); thus:
o Virus-receptor crystal structures should not be regarded as an unquestionable source of structural information on the bound structure at hand.
o A biochemically-significant interaction space should be considered, such that practical conclusions may statistically be inferred from it.
• Hypotheses suggesting biochemically-significant implications of particular noncovalent binding factors on macromolecular binding events should be confirmed or refuted using appropriate means and techniques, such as electronic structure methods proven reliable for the systems considered. In this manner, some (unfortunately) flawed inferences may be averted.
As a final remark, we would like to express our hopes for a growing number of practical, problem-solving quantum chemical research attempts. Indeed, even technically unsophisticated applications of electronic structure methods can be used for providing surprisingly useful answers to important, large-scale (bio)chemical questions, as long as the latter are phrased and examined carefully and critically.