A Model Ensemble Approach Enables Data-Driven Property Prediction for Chemically Deconstructable Thermosets in the Low Data Regime

17 January 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Data science can accelerate materials discovery by learning composition-processing-performance models from pre-existing data sets, which then feed into active learning cycles in the laboratory. Thermoset polymer waste is a pressing environmental challenge that may be addressed by the accelerated discovery of new deconstructable variants; however, the combinatorial space of possible monomers, crosslinkers, additives, and manufacturing conditions is vast, and Edisonian experimentation may struggle to find optimal designs. Moreover, data-driven strategies are limited for complex (co)polymers like thermosets because the training data are scarce and sourced from heterogeneous experimental approaches, resulting in overfit models that transfer poorly. Here, we introduce a novel closed-loop approach to the predictive design of chemically deconstructable thermosets that leverages experimental synthesis and characterization, machine learning, and virtual screening. Our computational model learns to map the molecular features of bifunctional silyl ether (BSE)-based cleavable comonomers to the thermal properties of the industrial thermoset polydicyclopentadiene (pDCPD). We address the challenges of limited data and overfitting by relying on both structural and information-rich domain-specific molecular features as inputs and by thoroughly quantifying model uncertainty. By training an ensemble of predictive models that mixes multiple model architectures and parametrizations, our approach predicts a key thermoset parameter, the glass transition temperature, with an error of less than 15 °C over a wide temperature range using only 101 data points. The trained models were used to screen new possible BSE comonomer compositions and synthesis conditions, and promising combinations were successfully validated experimentally. This work offers a closed-loop design process that we expect to be widely applicable to the discovery of deconstructable polymeric materials.
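To make the ensemble idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single numeric input feature, two illustrative model families (a least-squares line and k-nearest-neighbour regressors with several values of k), and bootstrap resampling; all function names and the toy data are hypothetical. The spread of the member predictions serves as a simple stand-in for model uncertainty.

```python
import random
import statistics


def fit_linear(xs, ys):
    """Least-squares fit of y = a*x + b on one feature; returns a predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample (all x equal): predict the mean
        return lambda x: my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return lambda x: a * x + b


def fit_knn(xs, ys, k):
    """k-nearest-neighbour regressor on one feature; returns a predictor."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean([y for _, y in nearest])

    return predict


def train_ensemble(xs, ys, n_boot=20, ks=(1, 3, 5), seed=0):
    """Train mixed model types on bootstrap resamples of a small data set."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        models.append(fit_linear(bx, by))
        for k in ks:  # several parametrizations of the same architecture
            models.append(fit_knn(bx, by, k))
    return models


def predict_with_uncertainty(models, x):
    """Ensemble mean and standard deviation across member predictions."""
    preds = [m(x) for m in models]
    return statistics.fmean(preds), statistics.stdev(preds)


if __name__ == "__main__":
    # Toy, made-up data for illustration only (not from the paper).
    toy_x = [i / 10 for i in range(11)]
    toy_y = [100 - 50 * x for x in toy_x]
    ensemble = train_ensemble(toy_x, toy_y)
    mean, spread = predict_with_uncertainty(ensemble, 0.5)
    print(f"prediction = {mean:.1f}, spread = {spread:.1f}")
```

In a virtual screening setting, candidates with a favourable ensemble mean and a small spread would be the natural ones to prioritise for experimental validation, whereas a large spread flags regions where the models disagree and more data may be needed.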


Data science
Machine learning
Degradable plastics

Supplementary materials

Supplementary Information for A Model Ensemble Approach Enables Data-Driven Property Prediction for Chemically Deconstructable Thermosets in the Low Data Regime
Synthesis and characterization procedures and data, as well as elaboration on machine learning approaches and results.

