Integrating data imputation and augmentation with interpretable machine learning for efficient strength prediction of fly ash-based alkali-activated concretes, Journal of Building Engineering


Journal of Building Engineering (IF 6.7), Pub Date: 2024-11-08, DOI: 10.1016/j.jobe.2024.111248
Nausad Miyan, N.M. Anoop Krishnan, Sumanta Das

Fly ash-based alkali-activated concrete (AAC) is renowned for its superior mechanical performance and sustainability, presenting an attractive alternative to traditional Portland cement concrete. Despite these advantages, the broad compositional range of AACs presents challenges in precisely tailoring material properties. In this context, machine learning (ML) offers promising prospects to streamline and fast-track the development of advanced materials design strategies by predicting mechanical properties from compositional variations. Effective ML model development, however, hinges on the availability of a comprehensive, high-quality dataset. Previous studies often relied on literature-derived datasets, which typically include outliers, noise, and missing values, potentially leading to biased predictions. Moreover, limited dataset sizes could undermine the robustness of the models. Traditional ML methods applied to AACs also tend to lack interpretability. To address these issues, this paper utilizes several data imputation methods and Generative Adversarial Networks (GANs) for data augmentation, effectively doubling the dataset size. Following this, ML algorithms such as Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Neural Networks (NNs) are leveraged to predict compressive strength. The NN model, especially when enhanced by k-nearest neighbors (kNN) imputation (k = 5), demonstrated superior predictive accuracy compared to RF and XGBoost models. Further, SHAP (SHapley Additive exPlanations) analysis reveals key determinants of compressive strength, such as water content, SiO2, and curing conditions. Visualizations such as SHAP violin and river flow plots further elucidated feature contributions and property distributions. Overall, this study provides a robust framework for exploring composition-strength relationships in AACs, advancing the design of these environment-friendly materials.
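To make the kNN imputation step (k = 5) in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation, which the abstract does not describe. The function names `nan_euclidean` and `knn_impute`, the use of `None` for missing entries, the unweighted mean over neighbors, and the toy mix-design rows (water ratio, SiO2 content, curing temperature) are all assumptions made for illustration.

```python
import math


def nan_euclidean(a, b):
    """Euclidean distance over the features observed in both rows.

    The squared sum is rescaled by (total features / shared features) so
    that rows with fewer shared features are not favored artificially.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf  # no shared features: the rows cannot be compared
    squared = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(squared * len(a) / len(pairs))


def knn_impute(rows, k=5):
    """Return a copy of rows with each None replaced by the mean of that
    feature over the k nearest rows in which the feature is observed."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: every other row that has feature j and is comparable.
            donors = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                dist = nan_euclidean(row, other)
                if dist != math.inf:
                    donors.append((dist, other[j]))
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:  # leave the gap if no donor exists
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical toy data: [water ratio, SiO2 content, curing temperature].
# Features are left unscaled here; real data would normally be standardized
# first so that no single feature dominates the distance.
rows = [
    [0.30, 50.0, 60.0],
    [0.32, 52.0, 60.0],
    [0.35, None, 80.0],
    [0.40, 55.0, 80.0],
    [0.45, 58.0, 25.0],
    [0.50, 60.0, 25.0],
]
imputed = knn_impute(rows, k=5)
```

With only five donor rows available, k = 5 averages all of them. A smaller k restricts the average to the mixes most similar to the incomplete one, which is the trade-off the choice of k controls.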


Updated: 2024-11-08
