Data Science Enables Chemical Interpretation of HBC-Catalyzed Polymerization for Poly(disulfide)s

11 March 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


H-bonding catalysts (HBCs) have achieved widespread success in improving the selectivity of organic synthesis. The complexity of the noncovalent interactions potentially involved has made it difficult to decipher the factors that contribute to these catalytic reactions. Herein, we present the use of interpretable machine learning on HBCs applied in poly(disulfide) synthesis, wherein epiquinine-derived thiourea HBCs enable simultaneous regioselectivity control and rate acceleration over the otherwise unselective and already rapid polymerization. A training set of 28 catalysts with varied phenyl substituents was synthesized and tested to gather experimental observables: the apparent rate (kp) and regioselectivity (Pss). Considering the limited data size, we applied the feature-sensitive XGBoost algorithm for supervised machine learning. Upon screening over 64 potentially relevant descriptors, a reasonable fit of the observables was established for kp (R² = 0.76) and Pss (R² = 0.91), and the key catalyst features necessary for achieving high reaction rates and regioselectivity were deconvoluted. The model suggests that sterically hindered, heavy-atom-containing substituents or strongly electron-withdrawing groups on the HBC adversely affect the reaction rate. Substituents that enhance the electrostatic potential of the aromatic N atom are beneficial for achieving high regioregularity, while the presence of ortho-substituents on the phenyl ring is unfavorable.
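For readers unfamiliar with the workflow summarized above, the following is a minimal sketch of boosted-tree regression on a small descriptor table, together with the R² statistic used to report fit quality. It is not the authors' code: it replaces XGBoost with a hand-written gradient-boosting loop over one-split trees (stumps) so that it runs with the Python standard library alone, and the data are synthetic placeholders rather than the measured kp or Pss values. All function names, the three-descriptor layout, and the hyperparameters are assumptions made for illustration. Only the sample count of 28 is taken from the abstract.

```python
# Illustrative sketch only: synthetic data and hypothetical names.
# This is plain gradient boosting with stumps, NOT XGBoost and NOT the authors' model.
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both sides of every split are non-empty.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def predict_stump(stump, row):
    j, thr, lmean, rmean = stump
    return lmean if row[j] <= thr else rmean


def fit_boosted(X, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def split_counts(model, n_features):
    """Count how often each descriptor is chosen for a split (a crude importance proxy)."""
    counts = [0] * n_features
    for j, _thr, _lmean, _rmean in model[1]:
        counts[j] += 1
    return counts


# Synthetic stand-in for a 28-catalyst table with three hypothetical descriptors.
random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(28)]
# The placeholder response depends strongly on descriptor 0, weakly on 1, not on 2.
y = [2.0 * row[0] - 0.5 * row[1] + random.gauss(0, 0.1) for row in X]

model = fit_boosted(X, y)
y_fit = [predict(model, row) for row in X]
r2_fit = r_squared(y, y_fit)
counts = split_counts(model, 3)
print("training R^2:", round(r2_fit, 3))
print("splits per descriptor:", counts)
```

The R² printed here is computed on the training data itself. The abstract does not say whether its reported values are training or cross-validated scores, so the two should not be compared directly. Split counts are only a rough stand-in for the SHAP analysis listed in the Supporting Information, which attributes each prediction to individual descriptors.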


hydrogen-bonding catalyst
data science

Supplementary materials

Supporting Information
Experimental and computational procedures, synthesis and characterization of new compounds, GPC and NMR data for polymerization under different conditions, descriptor screening and SHAP analysis, and XYZ coordinates

