Abstract
Extended tight-binding (xTB) methods, a series of semi-empirical QM Hamiltonians, have shown considerable promise in treating chemical systems and biomolecules. End-point free energy calculation, which has many variants differing in sampling protocol and post-processing details, is recognized as a crucial component of hierarchical screening. The current work explores the combination of the two techniques, using a series of receptor-ligand datasets accumulated in our recent papers on host-guest binding. A thorough exploration of parameter combinations is presented. Specifically, we consider both the popular but naïve single-trajectory sampling protocol and the three-trajectory realization, which incorporates the conformational changes of the individual components upon binding. The sampled configurations are post-processed with an extensive combination of xTB Hamiltonians (GFN0, GFN1 and GFN2) and implicit solvent models (PB, GB and the most recent CPCM-X). The host-guest datasets involve macrocyclic hosts from three families (cucurbiturils, octa-acids and pillararenes), and each host is paired with a number of guest molecules to ensure coverage of chemical space and stability of the performance statistics. The xTB implicit-solvent estimates correlate to some extent with the experimental binding thermodynamics, with the three-trajectory GFN2-xTB/PB ΔH selection delivering top-tier performance across the different host families. A head-to-head comparison between the xTB implicit-solvent estimates and the top-performing MM/GBSA selections suggests that the three-trajectory MM/GBSA regimes outperform the xTB variants in most cases. However, in situations where the MM/GBSA Hamiltonian fails severely (e.g., the difficult-to-handle sulfur-substituted pillararene dataset), switching the post-processing Hamiltonian to the multiscale xTB implicit-solvent treatment could improve the screening power.
This observation agrees with our recent report on the unexpectedly high performance of DFTB/GBSA on another difficult dataset (the SAMPL9 carboxylated pillararene), hinting at the potential applicability of tight-binding implicit-solvent treatments.
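The difference between the single-trajectory and three-trajectory protocols mentioned above can be illustrated with a minimal sketch. It assumes that per-frame end-point energies (Hamiltonian plus implicit-solvent term) have already been computed and stored as plain lists; the function names and all numerical values are hypothetical and serve only as an illustration, not as the workflow used in this work.

```python
from statistics import mean


def single_trajectory_binding(complex_e, host_e, guest_e):
    """Single-trajectory estimate.

    Host and guest energies are evaluated on geometries extracted from the
    SAME frames of the complex simulation, so the three lists are paired
    frame by frame and the difference is averaged.
    """
    if not (len(complex_e) == len(host_e) == len(guest_e)):
        raise ValueError("single-trajectory energies must be paired per frame")
    per_frame = [c - h - g for c, h, g in zip(complex_e, host_e, guest_e)]
    return mean(per_frame)


def three_trajectory_binding(complex_e, host_e, guest_e):
    """Three-trajectory estimate.

    Complex, free host and free guest are sampled in three independent
    simulations, so each ensemble is averaged separately and the lists
    may differ in length.
    """
    return mean(complex_e) - mean(host_e) - mean(guest_e)


# Hypothetical per-frame energies (arbitrary units), for illustration only.
complex_frames = [-110.0, -112.0]
host_in_complex = [-60.0, -61.0]    # host geometry taken from complex frames
guest_in_complex = [-40.0, -41.0]   # guest geometry taken from complex frames
host_free = [-62.0, -63.0, -64.0]   # independent simulation of the free host
guest_free = [-42.0, -43.0]         # independent simulation of the free guest

dE_single = single_trajectory_binding(
    complex_frames, host_in_complex, guest_in_complex
)
dE_three = three_trajectory_binding(complex_frames, host_free, guest_free)
print(dE_single, dE_three)
```

In this toy example the free host and guest relax to lower energies than their bound-state geometries, so the three-trajectory estimate is less favorable than the single-trajectory one; this is the conformational-change contribution that the single-trajectory protocol neglects.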