Comparative Analysis of Conventional Machine Learning and Graph Neural Network Models for Perovskite Property Prediction
The Journal of Physical Chemistry C (IF 3.3), Pub Date: 2024-09-24, DOI: 10.1021/acs.jpcc.4c03212
Jirui Jin, Somayeh Faraji, Bin Liu, Mingjie Liu

Perovskite materials, renowned for their versatility and remarkable properties, pose challenges in discovering optimal candidates due to the vast compositional space. Data-driven machine learning (ML) offers promise in expediting material discovery; however, the trade-off between accuracy and efficiency across different ML models for predicting perovskite properties is not well understood. In this study, we conducted a comprehensive assessment of various ML models for predicting the formation energy (Ef) and band gap (Eg) of perovskites. We designed a protocol to extract perovskite structures from three databases based on the stoichiometry, octahedral lattice motif, and alignment with established perovskite prototype structures. Benchmarking conventional ML algorithms (CML) against graph neural network (GNN) models across three data sets, we identified LGBM and GATGNN models as the top performers for CML and GNN, respectively, balancing exceptional prediction accuracy and computational efficiency. We further investigated the impact of the data size on model performance, emphasizing the need for over 1000 data points for optimal prediction accuracy. Additionally, through SHAP analysis, we provided valuable insights into the interpretation of CML models in predicting Ef and Eg. Our study establishes a standardized benchmark for evaluating various ML models across diverse data sets of perovskite materials, facilitating future applications in materials science, particularly in model selection and advancement of perovskite materials.
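The abstract names stoichiometry as the first of three screening criteria but does not spell out the rule. As a rough illustration only, the sketch below assumes the textbook single-perovskite ratio ABX3 (1:1:3) and checks it on plain formula strings. The function names `parse_formula` and `has_abx3_stoichiometry` are hypothetical, the parser handles only simple formulas without parentheses or fractional occupancies, and the authors' actual protocol may well admit other perovskite families. The octahedral-motif and prototype-matching steps need structural data and are not covered here.

```python
import re
from functools import reduce
from math import gcd


def parse_formula(formula: str) -> dict[str, int]:
    """Count elements in a simple formula such as 'CaTiO3'.

    Assumption: integer counts only, no parentheses or hydrates.
    """
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def has_abx3_stoichiometry(formula: str) -> bool:
    """Return True if the formula reduces to a 1:1:3 ratio over three elements.

    This is only the stoichiometry screen; it says nothing about whether
    the crystal structure actually contains corner-sharing octahedra.
    """
    counts = parse_formula(formula)
    if len(counts) != 3:
        return False
    # Divide by the greatest common divisor so that Sr2Ti2O6 reduces to SrTiO3.
    divisor = reduce(gcd, counts.values())
    reduced = sorted(n // divisor for n in counts.values())
    return reduced == [1, 1, 3]


if __name__ == "__main__":
    for candidate in ["CaTiO3", "CsPbI3", "Sr2Ti2O6", "TiO2", "Ca2TiO4"]:
        print(candidate, has_abx3_stoichiometry(candidate))
```

A ratio check like this is cheap enough to run over an entire database before any structure-based filtering, which is presumably why a stoichiometry criterion would come first in such a pipeline. It does let non-perovskite ABX3 polymorphs through, so the later structural criteria remain necessary.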

Updated: 2024-09-25