A simple fragment-based method for van der Waals corrections over density functional theory

Modelling intermolecular noncovalent interactions between large molecules remains a challenge for electron structure theory community. This is due to the high cost of calculating electron correlation energy. Fragment-based methods usually fare well in reducing the cost of computations in such systems while quantum Drude oscillators turn out to be a good model for van der Waals interactions. In this paper, we have developed a simple yet eﬀective method based on oscillator methods for calculating van der Waals interactions between molecular fragments as a correction to low-cost DFT functional PBE. We have tested our method on S66X8 with signiﬁcant success.


Introduction
van der Waals (vdW) is a crucial ingredient in understanding the non-covalent interactions between molecular fragments. [1][2][3][4] It originates from instantaneous and correlated quantum fluctuation of electron density electronic systems. As a result, they are ubiquitous and essential to compute with high accuracy. Being long-range part of electron correlation energy, an accurate estimation of the vdW interactions can be obtained from post-Hartree-Fock methods. These methods including coupled-cluster (CC) and configuration interaction (CI) calculations provide a systematic and accurate results. However, these methods are computationally too expensive to be used for large molecules particularly for molecular dynamics simulations. On the contrary, density functional theory-based approximations (DFTA) usually fail to describe the vdW interactions properly due to inadequate description of electron correlations. As a result, dispersion corrections are usually employed on top of DFT methods. In this paper, we aim to describe one such correction scheme which is an amalgamation of quantum oscillator based models for quantum fluctuations 2,5,6 and fragment-based models (see 7 for a comprehensive review).
The usual vdW corrections on top of DFT-based methods can be obtained by several strategies. From the perturbation theory, the dispersion energy can be written as a sum of pair-wise terms as, 6,8 The i, j are pair of atoms/systems, R ij is the distance between i and j, C n are dispersion coefficients arising from dipole-dipole (n=6), dipole-quadruple (n=8), quadruple-quadruple (n=10) interactions. In general, the dispersion energy is not pair-wise additive because of the many-body effects. Widely utilized methods for dispersion corrections as Grimme's -D2, 9 -D3 10 uses semi-empirical parameters. On the other hand, Tkatchenko-Scheffler method 11 captures the electronic many-body effects to some extent by defining the electron density based Hirshfeld screening of polarizabilities. Both of these methods has been employed successfully for large molecular systems. 12,13 The other way of introducing the vdW correction is through quantum Drude oscillator (QDO) model which is a quantum harmonic oscillators with two equal and opposite charges tagged at two ends. As a result, two QDOs interact between themselves through Coulomb interactions. This model has been successfully employed to capture long-range interaction between atoms. 6,[14][15][16][17] In particular, dispersion interaction in anionic water-clusters, 18,19 vibrational spectroscopy in charges water clusters, 20 as a precursor in force-field development, 21-23 simulating polarizable molecular ionic-liquid 24 and water-lipid interface 25 have been investigated using QDO. Many-Body Dispersion (MBD) method 5 has been another direction where QDO finds its use. In this method, the long-range quantum fluctuations of each atom is modeled by a QDO. The oscillators' polarizabilities have been set to the screened polarizabilities. The dipole-coupled interaction energy between these oscillators therefore provides the interaction energy which is equivalent to random phase approximation (RPA) of correlation energy. 26 In this article, we used an amalgamation of the two above-mentioned methods where after fragmenting the molecule based on simple chemical intuitions, we calculated the longrange vdW interaction energies between them via dipole-coupled quantum Drude oscillators.
Such coarse-graining of dispersion interaction helps reduce the computational costs for large molecules. As a result, it is less expensive than that of Dn methods of Grimme as well as traditional oscillator-based methods. We then tested the efficacy of our method on S66x8 dataset. 45 We found that even this simple procedure provides very good results for most of the systems.
This article is organized as follows. We discussed the theoretical motivations behind using QDO model as well as computational methodologies in section 2. Next, we analysed the fragmentation procedure carefully in section 3.1 followed by the performance of our method for S66X8 database in section 3.2. Then we discussed the performance of other exchange-correlation functionals other than PBE in section 3.3. Finally, we conclude the paper in section 4.

QDO model for dispersion energy
We start with the definition of electron correlation energy derived from adiabatic connection fluctuation dissipation theorem (ACFDT). 5 According to this formulation, the electron correlation can be written in terms of density response function χ and the interaction v. 26 Since we are interested in long range correlation energy only, RPA 46,47 can be employed to compute the vdW interactions. In RPA interacting response function χ λ is approximated as: In earlier studies, 26 it has been shown that the molecules can be divided into effective atomic fragments. Atomic response functions then can be represented by a set of quantum harmonic oscillators. 48 Therefore in Eq. (2) can be expressed as where A is a 3N ×3N matrix (N is number of atoms). For isotropic QHO A lm = −δ lm α l (iw) and T is the dipole-dipole interaction tensor.
We can, however, use a unitary operator U such that Consequently, A is no longer a diagonal matrix, although leading to If A can be constructed in a block-diagonal form, these blocks can be interpreted as fragment polarizabilities and therefore Eq.(4) can be written, in principle, as where A is the fragment polarizability matrix. However, in this paper we have chosen fragments out of simple chemical intuitions. As a result, our fragments are insensitive to chemical environments. Note that A defined through Eq.(5) depends on chemical ambiance.
More rigorous way of defining polarizabilty fragments based on unitary transformation will be taken up in future. Here we also approximate T by a simple dipole-dipole interaction matrix between the center-of-masses of two fragments. We modeled the quantum fluctuations of the molecular fragments by anisotropic oscillators. 14,26,49 The dispersion energies by perturbation theory given by second-order energy. 50 Here m i = (m x , m y , m z ) is an excited state of i oscillator. This expression gives the dispersion energy of electronic fragments A and B. H is interaction energy operator (see SI) and The detailed derivations and expressions for the dispersion energy using anisotropic oscillators are given in supplementary information (SI).

Computational Methodologies
To benchmark our method, we used S66X8 dataset for its diversity of interaction types. The interaction energy curves from this data-set also help analyzing the effects of fragmentation as well as many-body effects. We have computed all single-point energies using def2-QZVPPD 51 basis set. The underlying DF interaction energies are calculated as the difference between dimer and monomer energies. In this study, several exchange-correlation (XC) functionals including PBE, 52 PBE0, 53 B3LYP 54,55 and BHLYP 54,55 are used. These XC functionals provide good approximations for short-range interactions. The mid-range and long-range part of correlation energy which constitutes vdW interaction can be computed as, Here ∆E CCSD(T )/CBS and ∆E DF are interaction energies computed by CCSD(T) and approximated density functionals, respectively. Here ∆E DF contains no dispersion correction. After performing single point energy calculation of the systems at all inter-monomer distances by DF methods, we have spliced the larger monomers into smaller fragments. Detailed fragmentation procedure is mentioned in results and discussion section.
To compute the vdW correction over DF energies, each fragment is replaced by an anisotropic QDO. The dipole-coupled dispersion energies between these oscillators (E ij QDO (r ij ) between i th and j th fragments) have been used as the vdW correction energy. The total dispersion energy is obtained by Eq. (11), A Fermi-like damping function, 56 is used to avoid the near-singuarity at very small distance. Parameter B is taken as 6 for optimal performance of the method. r ij is the distance between the centre of mass of the fragments i and j. R ij is the sum of vdW radii of the fragments computed with Turbomole v7.5 58 and static polarizability tensor at CCSD are computed with Psi4. 59 3 Results and discussion

Fragmentation of Monomers
We fragmented the molecular moieties purely based on chemical intuitions (e.g. along the functional groups). Apart from ethene, ethyne and water, all monomers are subjected to fragmentation. As an example, the partitioning of AcNH 2 results in two fragments, -CH 3 and -CONH 2 ( Fig.1(a)). The fragments are then capped at the vacant site with H atoms to keep the charge and spin multiplicity unaltered. The polrizability tensor is computed for these fragments subsequently. Note that a balance between number of fragments and accuracy is imperative from the computational point-of-view. Later in this section, effect of fragmentation and the associated errors are discussed in details. The monomer fragmentation is done as follows.
• Non-cyclic or non-branched systems are cut and masked with H atom along the midpoints of bonds (see SI). Fragmenting these molecules in such a manner helps in parametrizing the associated QDOs and therefore provide a scope for transferibility of oscillator parameters across molecules. On the flip side, such fragmentations do not reflect the chemical environment for different molecules. The fragmentation of AcNH 2 is shown in Fig.1(a).
• For cyclic molecules, atom-centric fragmentation are performed. As an example, benzene is fragmented along the alternate C-H bond (as shown in Fig.1(b) ), thus obtaining three systems of propene. Note two fragments share a bridging atom, thereby making our fragmentation scheme less accurate. However, since the bonds in these systems are electron-rich, the bond-centric fragmentation turns out to be even less accurate. The effects of double-counting atoms will be analyzed and corrected in future works.

Performance in S66x8
The S66x8 dataset can be categorized into three broad sections, electrostatically bound complexes (named Hydrogen bond by Rezac et al 45 ), dispersion-bound and other different interactions. In the following sections, we discuss the role of dispersion in the complexes and how effective our model is to capture this interaction.

Electrostatically bound complexes
Electrostatically bound complexes can be characterized by two properties, inter-monomer distance and very low dispersion/electrostatic ratio (hereafter D/E) as computed from SAPT analysis. In most cases, the D/E < 0.5. 45 Therefore the qualitative pictures for the interaction energies are essentially obtained from DF methods(see Fig.3). The -D2, -D3 and QDO correction on the DF interaction energies provide more accurate PEC. To analyze further, the deviation of interaction energy by PBE-D2 (∆E P BE−D2 ) and PBE-QDO (∆E P BE−QDO ) from benchmark coupled-cluster energies are plotted in Fig.2(a)) as obtained via

Dispersion complexes
Complexes with large D/E ratio (1.3 < D/E < 5.42) are categorized as dispersion complexes. These complexes include stacked dimers. Indeed Fig.5(b) shows relatively large QDO correction compared to DF interaction energies. It is evident that PBE functionals, by themselves do not predict any binding for these complexes. As expected for such cases, QDO correction is the most significant source of binding energies. In fact, for most of the complexes QDO contribution is very prominent.
As a representative of these complexes, we are exhibiting the interaction energy curve of cyclopentane dimer in fig.4. We can clearly see that the QDO contribution is very prominent near equilibrium distance. Moreover, our method consistently agree well with benchmark interaction energies at all distance. It is interesting to note that while D2 correction fares well in long-range, near equilibrium our method is far superior than it. In dispersion bound Tr (α µ ) − Tr (α monomer ) (15) and their ratio r α = n µ=1 Tr (α µ ) Tr (α monomer ) (16) Clearly, an optimum fragmentation is expected to provide the r α that is close to unity.
We can see from the table 2 that the systems with large deviations from the benchmark numbers are also associated with r α >> 1. Therefore for stacking complexes the error in QDO interaction energies can be attributed to errors in fragmentations. For rest of the complexes, small under-estimatation of vdW interaction energy is observed. Notably, the interaction energy for pentane dimer and benzene-ethene complexes are underestimated by 1 kcal/mol. Overall, these large deviations in dispersion interactions can be attributed to the lack of Axilrod-Teller-Muto (ATM) 60,61 correction and higher order attractive correction terms. Note that most of our corrections has opposite signs than that of PBE-D2 in contrast to electrostatically bound cases discussed before. While it is difficult to pinpoint the reasons for this observation, we can guess that the inclusion of higher order terms may eventually correct for this discrepancy. From statistical viewpoint, both -D2 and QDO dispersion correction reduces the error of PBE (Table -3), while -D2 method is slightly better than QDO corrections.

Other interactions
Complexes within this category interacts mostly via π-electrons. Here we show the interaction energy curve of Benzene-water dimer as our representative system( fig.7). As expected, the magnitude of QDO correction here is in between those from previous two sections. While the error due to D2 correction is not much, our correction is signficantly better than D2 here. However, here the D2 approximates the long-range correction much better compared to our method. This may be due to the change of polarizability of fragments at different intermonomer distances which is not considered in our method. In future, therefore, such information should be incorporated in our method. Similar to previously mentioned systems, interaction energy by DF approximations gets better with QDO correction. The deviations of PBE-QDO and PBE-D2 from the CCSD(T) are presented in Fig.6(a). Our results deviate from CCSD(T) by more than 1 kcal/mol only for ethyne-AcOH and peptide-ethene. For majority of complexes, PBE-QDO predicts under-binding. This feature has also been observed for dispersion-dominated complexes. As expected, for systems with large dispersion energy component, the QDO correction becomes more significant (Fig.6(b)). We find that for most of these systems dispersion correction constitutes < 40% of total binding energies. Statistically, (Table-4) we find that both D2 and QDO improves PBE binding energies. However, our method appears to work slightly better than D2.   Interaction energy curve (in kcal/mol) for Benzene· · · Water OH−π complex plotted against the scaled distance between fragments.

Overall performance
The overall error in terms of MAE, MRE and MRAE for all the complexes are tabulated in Table-5. In terms of accuracy, the interaction energy of PBE-QDO is rather accurate. In electrostatic and mixed interaction type complexes, the accuracy is even surpassed than that of the the -D2 correction. Although the accuracy was decreased in dispersion complexes, mostly due to the overestimation of stacking complexes, it is still comparable to -D2. In 66 complexes, only 8 complexes show deviations from CCSD(T) energy by more than 1 kcal/mol.

Conclusion
In summary, we have developed a method to calculate intermolecular dispersion interaction based on a fragmentation technique. We represented the quantum fluctuations of electron densities for these fragments by anisotropic quantum Drude oscillators. It turns out that the method's success heavily depends on the type of fragmentation used. This aspect has been tested with comparing the monomer polarizability with fragment polarizability. We applied our methodology on S66X8 database to understand its efficacy as well as drawbacks. We find that the method works less accurately only for stacked dimers where the electron density is spreadout and therefore highly polarizable. In general, however, this method provides a simple yet effective approximation of dispersion interactions in large molecular systems with accuracy at least at par with popular D2 method. Particularly, we find that except 8 systems, QDO-corrected interaction energy for all other complexes lies within 1 kcal./mol of CCSD(T) energy. We also find that our correction due to QDO is maximum for dispersionbound complexes. For such systems, the QDO energy almost provides 100% binding energy.
The mean absolute error also shows that our method works ≈ 20% better than PBE-D2 methods. Another important aspect of the present work is the use of functional group-based fragmentation scheme. This fragmentation scheme while insensitive to different chemical environments in different systems, is more amenable to application in large systems such as proteins. We also compared the efficacy of different exchange-correlation energies to be used in conjunction to our correction. It turns out that our result, can be used for multiple exchange correlation functionals. However, for B3LYP the correction does not improve the binding energy which is at par with existing literature. The major limitations of our method are twofold: (1) the ad-hoc fragmentation procedure and (2) the lack of screening effects due to local environment. In future, we will investigate ways to include such information effectively in our scheme.

Supporting Information Available
The calculation for the expression of dispersion energy, fragmentation scheme applied for monomers and interaction energy curves for all 66 complexes have been provided.