Towards exploring the activity landscape of peptide datasets using MAP4 fingerprint

04 April 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Peptides have re-emerged as a strategy to fight a plethora of diseases, and their utility has expanded to new areas. Sequence-based peptide design now opens up new possibilities for developing peptidic molecular entities. However, its methodological limitations (e.g., its inefficiency in designing large peptides and its inability to analyze post-translational modifications) restrict its applicability domain. In contrast, ligand-based molecular design approaches have demonstrated an extensive applicability domain, although peptide design based on these methods remains unexploited. The main limitation has been the complex molecular structure of peptides, which has not been studied with classical fingerprints tuned for small organic compounds. To this end, MAP4 is a recently developed universal fingerprint that makes it possible to quantify the sequence/structure diversity of natural products or peptides. As part of peptide design, there is a current trend to develop predictive models built on the available structure-activity data. Before developing such models, it is essential to characterize the structure-activity relationship in detail and to identify any activity cliffs, that is, subtle structural modifications that have a large and unexpected effect on biological activity. In this study, we map the structure-activity landscape of an exemplary dataset of 165 anti-methicillin-resistant Staphylococcus aureus peptides using a similarity metric based on the MAP4 fingerprint. Specifically, we characterize the activity landscape of this dataset and identify amino acids (AAs) and structural motifs that play a key role in the activity of these peptides. To the best of our knowledge, this is the first chemoinformatics approach to systematically explore the activity landscape of peptides with an emphasis on quantifying structural similarity.
The approach is general and can be extended to analyze the presence of activity cliffs in any set of peptides. Identifying activity cliffs has practical implications during the development of predictive models.
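
As an illustration of the idea of an activity cliff described above, the following minimal Python sketch flags pairs of peptides that are highly similar yet differ strongly in activity. It is not the authors' implementation. It assumes that each peptide is already encoded as a fixed-length MinHash-style integer vector (as a MAP4-like fingerprint would be) and that similarity can be estimated as the fraction of matching positions. The function names, the thresholds (0.8 and 1.0), and the toy fingerprints and activity values are all hypothetical.

```python
from itertools import combinations


def minhash_similarity(fp_a, fp_b):
    """Estimate Jaccard similarity as the fraction of matching MinHash positions."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    matches = sum(1 for a, b in zip(fp_a, fp_b) if a == b)
    return matches / len(fp_a)


def find_activity_cliffs(fingerprints, activities,
                         sim_threshold=0.8, act_threshold=1.0):
    """Return (i, j, similarity, activity_difference) for candidate cliff pairs.

    A pair is flagged when it is structurally similar (>= sim_threshold)
    but differs in activity by at least act_threshold. Both thresholds
    are illustrative assumptions, not values taken from the study.
    """
    cliffs = []
    for i, j in combinations(range(len(fingerprints)), 2):
        sim = minhash_similarity(fingerprints[i], fingerprints[j])
        diff = abs(activities[i] - activities[j])
        if sim >= sim_threshold and diff >= act_threshold:
            cliffs.append((i, j, sim, diff))
    return cliffs


# Toy data: three hypothetical peptides with 10-position fingerprints.
fingerprints = [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 99],  # differs from the first at one position
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
]
activities = [6.5, 4.0, 6.4]  # hypothetical log-scale activity values

cliffs = find_activity_cliffs(fingerprints, activities)
for i, j, sim, diff in cliffs:
    print(f"peptides {i} and {j}: similarity={sim:.2f}, activity gap={diff:.2f}")
```

In this toy example only the first two peptides form a candidate cliff: they share nine of ten fingerprint positions but differ by 2.5 activity units. In practice the thresholds would have to be chosen for the dataset at hand.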


Activity landscape modeling
activity cliffs
chemical space
drug discovery
peptide design
structure-property (activity) relationships
Staphylococcus aureus

