Abstract
The work presented in this study highlights an interesting and growing area in the field of biochemical separations. Separating 2,3-butanediol from a fermentation broth is a non-trivial, multifaceted task that requires careful consideration and a deep understanding of the solvent-solute interactions present in the mixture. In this study, we have developed a database of 39,397 entries, each containing a unique solvent-solute pair, its SMILES string representations, and the solubility of the pair. We further partitioned these entries into organic and aqueous subsets, which were used to predict a solvent's affinity for 2,3-butanediol and for water. Regression models were rigorously trained on both subsets and then used to make predictions. We validated these predictions systematically using a small experimental subset, density functional theory calculations, and classical molecular dynamics simulations to ensure agreement among the available data and methods.