A consistent set of thermophysical properties of methane curated with machine learning

15 November 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Accurate prediction of thermophysical properties across different states of matter is essential for industrial and scientific applications. However, experimental data often suffer from variability and noise, among other problems, so robust modeling approaches are needed to build consistent data sets. In this work, we employ machine learning (ML) models to predict multiple thermophysical properties of methane in the liquid, vapor, and supercritical phases, including isobaric and isochoric heat capacities, density, volume, Joule-Thomson coefficients, enthalpies, sound speed, and viscosities. We explored different ML algorithms, including Adaptive Boosting, Bagging, Decision Trees, Extra Trees, Gradient Boosting, Histogram-based Gradient Boosting Regression Tree, K-Nearest Neighbors, Light Gradient Boosting Machine, Nu-Support Vector Regression, Random Forest, Extreme Gradient Boosting, and Artificial Neural Networks, across regions of the phase diagram. Applying ML techniques to previously available raw experimental data shows that the ML models give predictions closer to the statistically treated National Institute of Standards and Technology (NIST) data than the original experimental datasets are. Our approach therefore demonstrates the potential of ML to identify and generalize complex patterns, smooth inherent data noise, and manage variability. The results indicate that ML models, particularly the Extra Trees and Gradient Boosting models, are a scalable alternative for thermophysical property predictions, offering consistency and efficiency over traditional methods in data processing and curation.
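The noise-smoothing idea in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline or data. It assumes a synthetic reference curve (ideal-gas density of methane at an assumed fixed pressure of 1 bar), artificial Gaussian noise standing in for experimental scatter, and a hand-written K-Nearest Neighbors regressor as a simple representative of the algorithms listed. All function names, the noise level, and the choice `k=7` are hypothetical, and the error is evaluated in-sample for brevity.

```python
import random

R = 8.314462618   # J/(mol K), molar gas constant
M_CH4 = 0.01604   # kg/mol, molar mass of methane
P = 1.0e5         # Pa, fixed pressure (assumption for this toy example)

def reference_density(T):
    """Ideal-gas density in kg/m^3; a smooth stand-in for a reference curve."""
    return P * M_CH4 / (R * T)

def knn_predict(T_query, T_train, y_train, k=7):
    """Average the targets of the k training points nearest to T_query."""
    order = sorted(range(len(T_train)), key=lambda i: abs(T_train[i] - T_query))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / k

def mean_abs_error(a, b):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed so the run is reproducible
temperatures = [200.0 + 2.0 * i for i in range(101)]  # 200 K to 400 K
reference = [reference_density(T) for T in temperatures]

# Synthetic "measurements": reference values plus Gaussian scatter (assumed).
noisy = [rho + rng.gauss(0.0, 0.02) for rho in reference]

# KNN predictions at the same temperatures (in-sample, for illustration only).
smoothed = [knn_predict(T, temperatures, noisy) for T in temperatures]

mae_raw = mean_abs_error(noisy, reference)
mae_knn = mean_abs_error(smoothed, reference)
print(f"MAE raw vs reference: {mae_raw:.4f} kg/m^3")
print(f"MAE KNN vs reference: {mae_knn:.4f} kg/m^3")
```

In this synthetic setting the averaged predictions lie closer to the smooth curve than the noisy points do, because averaging over neighbors cancels part of the random scatter. A real study would use held-out data and tuned hyperparameters, and would compare against curated reference values rather than an idealized formula.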

Keywords

Methane
thermophysical properties
machine learning
data noise reduction

Supplementary materials

Supporting materials for methane
The Supporting Information is organized into four main sections. Section S1 (“Number of Data and Machine Learning Models”) provides a summary of the amount of experimental data sourced from the literature and also presents the ML models applied to each thermophysical property. The other sections are structured according to the aggregate state of matter: liquid (Section S2), vapor (Section S3), and supercritical phases (Section S4). Each of these sections first presents the performance metrics of the best ML models for predicting various thermophysical properties, followed by a detailed evaluation of the performance metrics for all ML models used for each property.
