ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

Random Forest Refinement of the KECSA2 Knowledge-based Scoring Function for Protein Decoy Detection

preprint
submitted on 19.10.2018 and posted on 22.10.2018 by Jun Pei, Zheng Zheng, Kenneth M. Merz Jr.
In this work, using the ‘comparison’ concept, we generated Random Forest (RF) models from unbalanced data sets; these models assign different importance factors to atom pair potentials to improve their ability to distinguish native proteins from decoy proteins. Individual and combined data sets consisting of twelve decoy sets were used to test the performance of the RF models. We find that RF models increase the recognition of native structures without affecting their ability to identify the best decoy structures. We also created models using scrambled atom types, which produce physically unrealistic probability functions, to test whether the RF algorithm can build useful models from scrambled input probability functions. In this test we were unable to create models of similar quality to those built from the unscrambled probability functions. Next we created uniform probability functions in which the peak positions are the same as in the original, but each interaction has the same peak height. Using these uniform potentials, we were able to recover models as good as those built with the full potentials, suggesting that all that matters in these models is the experimental peak positions.
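
The abstract does not spell out how the ‘comparison’ concept handles unbalanced data. One plausible reading, offered here only as a sketch, is that each native structure is compared with each of its decoys, and the classifier learns from the difference between the two score vectors which member of the pair is the native. The function name `comparison_pairs`, the three-component score vectors, and the labeling convention below are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Vector = List[float]


def comparison_pairs(native: Vector, decoys: List[Vector]) -> Tuple[List[Vector], List[int]]:
    """Turn one native and many decoys into a balanced two-class data set.

    Assumption: each structure is described by a vector of per-atom-pair
    scores, and a "comparison" is the element-wise difference of two vectors.
    """
    features: List[Vector] = []
    labels: List[int] = []
    for decoy in decoys:
        # native minus decoy: the first structure is the native, label 1
        features.append([n - d for n, d in zip(native, decoy)])
        labels.append(1)
        # decoy minus native: the first structure is the decoy, label 0
        features.append([d - n for n, d in zip(native, decoy)])
        labels.append(0)
    return features, labels


# Hypothetical scores for one native structure and three decoys.
native = [-2.0, -1.5, -0.5]
decoys = [[-1.0, -1.0, -0.5], [-0.5, -2.0, 0.0], [0.0, 0.0, 0.0]]
X, y = comparison_pairs(native, decoys)
```

Under this construction a set with one native and many decoys yields equal numbers of samples in each class, whatever the decoy count. The resulting `X` and `y` could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`.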

Email Address of Submitting Author

peijun0730@gmail.com

Institution

Michigan State University, Department of Chemistry

Country

United States

ORCID For Submitting Author

0000-0002-0204-0896

Declaration of Conflict of Interest

The authors declare no competing financial interest.

Read the published paper in Journal of Chemical Information and Modeling
