Conformal Selection for Efficient and Accurate Compound Screening in Drug Discovery

01 November 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

In drug discovery, compound screening based on manual assessment is vulnerable to bias, while existing computational methods lack robust risk control. To address these challenges, we introduce conformal selection as an approach to compound screening that balances risks and benefits. Leveraging conformal inference, our approach constructs a p-value for each candidate molecule to quantify the statistical evidence for selecting it. The final set of molecules is determined by comparing these p-values against thresholds derived from multiple testing principles. Our approach offers rigorous control of the false discovery rate, with validity that holds regardless of dataset size and under minimal assumptions. Because it avoids the estimation of prediction errors required by previous approaches, our method achieves higher power, that is, it identifies a larger share of the truly promising candidates. Our method is also more computationally efficient. We validate these advantages through numerical experiments on real-world datasets.
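The two steps described in the abstract (conformal p-values, then a multiple-testing threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the nonconformity score or the multiple-testing rule, so the sketch assumes the residual score V(x, y) = y - mu(x), a goal of selecting molecules whose unknown activity exceeds a cutoff `c`, and the Benjamini-Hochberg procedure at level `q`. The function names, and the arguments `calib_pred`, `calib_y`, `test_pred`, `c`, and `q`, are hypothetical. Any validity guarantee would also rely on the calibration molecules being held out from model training and exchangeable with the test molecules.

```python
import numpy as np

def conformal_p_values(calib_scores, test_scores):
    """p_j = (1 + #{i : V_i < V_hat_j}) / (n + 1) for each test score V_hat_j."""
    calib_sorted = np.sort(np.asarray(calib_scores, dtype=float))
    test_scores = np.asarray(test_scores, dtype=float)
    n = calib_sorted.size
    # side="left" counts calibration scores strictly below each test score.
    n_below = np.searchsorted(calib_sorted, test_scores, side="left")
    return (1.0 + n_below) / (n + 1.0)

def benjamini_hochberg(p_values, q):
    """Return sorted indices selected by the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    if passed.size == 0:
        return np.array([], dtype=int)
    k = passed[-1] + 1  # largest rank whose p-value is under its threshold
    return np.sort(order[:k])

def conformal_select(calib_pred, calib_y, test_pred, c, q):
    """Select test molecules whose unobserved activity is claimed to exceed c.

    Assumed score (not taken from the source): V(x, y) = y - mu(x).
    Calibration scores use observed activities; test scores plug in the cutoff c,
    so a high prediction mu(x) yields a low score and a small p-value.
    """
    calib_scores = np.asarray(calib_y, dtype=float) - np.asarray(calib_pred, dtype=float)
    test_scores = c - np.asarray(test_pred, dtype=float)
    p = conformal_p_values(calib_scores, test_scores)
    return benjamini_hochberg(p, q), p
```

In use, `calib_pred` and `test_pred` would come from any fitted activity model, and `conformal_select(calib_pred, calib_y, test_pred, c, q)` returns the indices of the selected test molecules together with their p-values. Note that the sketch needs only the point predictions themselves; no separate model of prediction error is fitted.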
