Populating Chemical Space with Peptides using a Genetic Algorithm

In drug discovery, chemical space is used as a concept to organize molecules according to their structures and properties. It is often desirable to generate new candidate molecules at a specific location in chemical space marked by a molecule of interest. Herein we report the peptide design genetic algorithm (PDGA, code available at https://github.com/reymondgroup/PeptideDesignGA), a computational tool capable of producing peptide sequences of various chain topologies (linear, cyclic/polycyclic, or dendritic) in the proximity of any molecule of interest in a chemical space defined by MXFP, an atom-pair fingerprint describing molecular shape and pharmacophores. We show that PDGA generates high-similarity analogs of bioactive peptides, including, in selected cases, known active analogs, as well as analogs of non-peptide targets. We illustrate the chemical space accessible by PDGA with an interactive 3D map of the MXFP property space available at http://faerun.gdb.tools/. PDGA should be generally useful for generating peptides at any location in chemical space.
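
The general idea of steering a genetic algorithm toward a target location in a descriptor space can be illustrated with a minimal sketch. This is not the PDGA implementation: it handles linear sequences only, and the descriptor `toy_fingerprint` (residue counts plus length) is a hypothetical stand-in for MXFP that ignores residue order, shape, and pharmacophores. The city-block distance, the mutation and crossover operators, and all parameter values are likewise assumptions chosen for illustration.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 proteinogenic residues


def toy_fingerprint(seq):
    """Hypothetical stand-in for MXFP: residue counts plus sequence length."""
    counts = Counter(seq)
    return [counts.get(a, 0) for a in AMINO_ACIDS] + [len(seq)]


def cityblock(a, b):
    """City-block (Manhattan) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mutate(seq, rng):
    """Apply one random point substitution, insertion, or deletion."""
    op = rng.choice(["substitute", "insert", "delete"])
    pos = rng.randrange(len(seq))
    if op == "substitute":
        return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]
    if op == "insert":
        return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos:]
    if len(seq) > 1:  # delete, but never produce an empty sequence
        return seq[:pos] + seq[pos + 1:]
    return seq


def crossover(a, b, rng):
    """Join a non-empty prefix of one parent to a suffix of the other."""
    cut_a = rng.randrange(1, len(a) + 1)
    cut_b = rng.randrange(len(b))
    return a[:cut_a] + b[cut_b:]


def run_ga(target, pop_size=50, generations=200, seed=0):
    """Evolve sequences toward the target's location in the toy descriptor space."""
    rng = random.Random(seed)
    target_fp = toy_fingerprint(target)

    def distance(seq):
        return cityblock(toy_fingerprint(seq), target_fp)

    # Random starting population of linear sequences, 5 to 15 residues long.
    population = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(rng.randint(5, 15)))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=distance)
        if distance(population[0]) == 0:
            break
        parents = population[: pop_size // 5]  # keep the closest 20% unchanged
        children = []
        while len(children) < pop_size - len(parents):
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.5:
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
    population.sort(key=distance)
    return population[0], distance(population[0])


if __name__ == "__main__":
    # Arbitrary example target, not taken from the paper.
    best, dist = run_ga("KLAKLAKKLAKLAK")
    print(best, dist)
```

Because the closest sequences are carried over unchanged in each generation, the best distance never increases from one generation to the next. A descriptor that encodes atom pairs, as MXFP does, would additionally distinguish sequences that share a composition but differ in residue order or chain topology.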