Abstract
Billions of dollars have been invested in recent years to build national scattering facilities around the world with more advanced configurations and faster data collection for small-angle scattering (SAS), a technique that enables in-situ structural analysis of nanoparticles (NPs) under stringent sample environments. However, interpreting experimental SAS data, which are collected in reciprocal space, to determine the corresponding real-space morphology is typically a slow process that requires significant domain expertise. As a result, high-throughput scattering facilities such as synchrotron scattering centers collect large quantities of data that may be left unanalyzed. Here, we report a fast and data-efficient machine learning (ML) framework for identifying NP morphologies and their corresponding structural parameters from both theoretical and experimental SAS data. The developed classification and regression models, which take raw scattering curves as input with minimal pre-processing, identify the morphology and structural dimensions from experimental scattering curves with accuracy comparable to that of human experts. Critically, we discuss design choices that facilitate the practical application of ML frameworks in scattering facilities. Our ML framework is designed to be easy to train, to work well when extrapolating to structural parameters outside the range the models were trained on, and to enable verification of ML predictions. The improved data analysis efficiency enabled by applying these ML models to real-time in-situ analysis of SAS data has the potential to revolutionize the utilization of synchrotron and neutron scattering facilities for probing nanostructures.