ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Evaluating Polymer Representations via Quantifying Structure-Property Relationships

preprint
submitted on 30.04.2019 and posted on 02.05.2019 by Ruimin Ma, Zeyu Liu, Quanwei Zhang, Zhiyu Liu, Tengfei Luo
Machine learning techniques are being applied to quantify structure-property relationships for a wide variety of materials, where properly representing the materials plays a key role. Although algorithms for representation learning have been extensively studied, their applications to domain-specific areas, such as polymers, are limited, largely due to the lack of benchmark databases. In this work, we investigate different types of polymer representations, including Morgan fingerprint (MF), molecular embedding (ME) and molecular graph (MG), based on a benchmark database drawn from a subset of PolyInfo. We evaluate the quality of each representation by quantifying the relationships between the representations and polymer properties, including density, melting temperature and glass transition temperature. Different representation learning schemes, such as supervised learning (SL), semi-supervised learning (SSL) and transfer learning (TL), are investigated. It is found that ME outperforms the other representations for structure-property relationship quantification in all cases studied, and MG is shown to be much inferior to ME and MF, likely due to the relatively small volume of training data available. For MEs, it is found that the similarities between substructure MEs are estimated differently under different learning schemes (e.g., SL, SSL and TL), which leads to different performance scores in structure-property relationship quantification. Several ME mixtures are shown to outperform the single MEs in the corresponding regression tasks, which is attributed to the information gained when mixing different MEs.

Email Address of Submitting Author

rma4@nd.edu

Institution

University of Notre Dame

Country

United States

ORCID For Submitting Author

0000-0003-1527-9289

Declaration of Conflict of Interest

no conflict of interest
