Abstract
Peptides have re-emerged as a strategy to fight a wide range of diseases, and their utility has expanded to new areas. Sequence-based peptide design now opens up new possibilities for developing peptidic molecular entities. However, its methodological limitations (e.g., its inefficiency in designing large peptides and its inability to account for post-translational modifications) restrict its applicability domain. In contrast, ligand-based molecular design approaches have demonstrated an extensive applicability domain, although peptide design based on these methods remains unexploited. The main limitation has been the complex molecular structure of peptides, which has not been amenable to study with classical fingerprints tuned for small organic compounds. To address this, MAP4, a recently developed universal fingerprint, allows the sequence/structure diversity of natural products or peptides to be quantified. As part of peptide design, there is a current trend to develop predictive models founded on the available structure-activity data. Before developing such models, it is essential to characterize the structure-activity relationship in detail and to identify any activity cliffs, that is, subtle structural modifications that have a large and unexpected effect on biological activity. In this study, we map the structure-activity landscape of an exemplary data set of 165 anti-methicillin-resistant Staphylococcus aureus peptides using a similarity metric based on the MAP4 fingerprint. Specifically, we characterize the activity landscape of this data set and identify the amino acids (AAs) and structural motifs that play a key role in the activity of these peptides. To the best of our knowledge, this is the first chemoinformatics approach to systematically explore the activity landscape of peptides with an emphasis on quantifying structural similarity.
The approach is general and can be extended to analyze the presence of activity cliffs in any set of peptides. Identifying activity cliffs has practical implications during the development of predictive models.
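
To make the notion of an activity cliff concrete, the following is a minimal sketch of how such pairs might be flagged: two peptides that are highly similar yet differ strongly in activity. It is an illustration under stated assumptions, not the procedure used in this study. In particular, a Jaccard similarity over sequence dipeptides stands in for the MAP4-based similarity, and the function names, cutoffs, sequences, and activity values are all hypothetical.

```python
from itertools import combinations


def kmers(seq, k=2):
    """Set of overlapping k-residue fragments; a simple stand-in for a fingerprint."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity between two feature sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_activity_cliffs(peptides, sim_cutoff=0.7, act_cutoff=1.0):
    """Return pairs that are similar (>= sim_cutoff) but differ in activity (>= act_cutoff).

    peptides maps a name to (sequence, activity), with activity on a log scale.
    Both cutoffs are illustrative assumptions, not values from the study.
    """
    cliffs = []
    for (n1, (s1, a1)), (n2, (s2, a2)) in combinations(peptides.items(), 2):
        sim = jaccard(kmers(s1), kmers(s2))
        diff = abs(a1 - a2)
        if sim >= sim_cutoff and diff >= act_cutoff:
            cliffs.append((n1, n2, sim, diff))
    return cliffs


# Hypothetical toy data: pep_a and pep_b differ by a single terminal residue.
toy = {
    "pep_a": ("KLAKLAKKLAKLAK", 6.0),
    "pep_b": ("KLAKLAKKLAKLAR", 4.5),
    "pep_c": ("GIGAVLKVLT", 6.1),
}

for name1, name2, sim, diff in find_activity_cliffs(toy):
    print(f"{name1} / {name2}: similarity={sim:.2f}, activity difference={diff:.2f}")
```

In practice the similarity function would be replaced by one computed from MAP4 fingerprints, and the cutoffs would be chosen from the distribution of pairwise similarities and activity differences in the data set.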