State of the Art Iterative Docking with Logistic Regression and Morgan Fingerprints

Lewis Martin

doi:10.26434/chemrxiv.14348117.v1

The advent of ultra-large-scale virtual libraries has renewed interest in docking campaigns for ligand discovery. At this scale, brute-force search requires highly parallelized compute to avoid years-long computations. This paper reports a re-analysis of docking data from an ultra-large docking campaign at the D4 receptor and AmpC beta-lactamase, and demonstrates large reductions in the computation time needed to identify the top-ranked ligands. A search of ‘baseline’ featurizations shows that logistic regression on Morgan fingerprints with pharmacophoric atom invariants can match the reported performance of message-passing networks on the same task. With this approach, an ultra-large docking campaign could be performed in a matter of weeks using consumer-grade CPUs with RDKit and scikit-learn. All code and figures are available at https://github.com/ljmartin/dockop
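The combination named in the abstract, logistic regression on Morgan fingerprints with pharmacophoric (feature) atom invariants, can be sketched with RDKit and scikit-learn as below. This is a minimal illustration and not the code from the dockop repository: the SMILES strings, the docking scores, the score cutoff, the fingerprint radius and size, and the helper name `featurize` are all assumptions chosen for the example.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator
from sklearn.linear_model import LogisticRegression


def featurize(smiles_list, radius=2, n_bits=2048):
    """Hypothetical helper: Morgan bit fingerprints using feature
    (pharmacophoric) atom invariants. Radius and size are assumed values."""
    gen = rdFingerprintGenerator.GetMorganGenerator(
        radius=radius,
        fpSize=n_bits,
        atomInvariantsGenerator=rdFingerprintGenerator.GetMorganFeatureAtomInvGen(),
    )
    rows = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        fp = gen.GetFingerprint(mol)
        arr = np.zeros((n_bits,), dtype=np.uint8)
        DataStructs.ConvertToNumpyArray(fp, arr)
        rows.append(arr)
    return np.vstack(rows)


# Toy "already docked" set. The scores are invented for illustration only;
# lower (more negative) is treated as better.
docked_smiles = ["CCO", "CCN", "c1ccccc1", "c1ccccc1O",
                 "CC(=O)O", "CCCC", "c1ccncc1", "CC(C)N"]
docked_scores = np.array([-3.1, -3.4, -6.2, -6.8, -2.9, -2.5, -6.0, -3.0])

# Label the best-scoring ligands as positives (the cutoff is an assumption).
cutoff = -5.0
y = (docked_scores <= cutoff).astype(int)

X = featurize(docked_smiles)
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# Rank ligands that have not been docked yet by predicted probability of
# being a top scorer; the highest-ranked would be docked next.
undocked_smiles = ["c1ccccc1N", "CCCO"]
proba = model.predict_proba(featurize(undocked_smiles))[:, 1]
ranking = [undocked_smiles[i] for i in np.argsort(-proba)]
```

In an iterative setting, the ligands at the top of `ranking` would be docked, added to the training set, and the model refit. The batch sizes and stopping rule for that loop are not specified in the abstract and are left out here.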
