High-Throughput Screening and Prediction of Nucleophilicity of Amines Using Machine Learning and DFT Calculations
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2024-08-08, DOI: 10.1021/acs.jcim.4c00724
Xu Li, Haoliang Zhong, Haoyu Yang, Lin Li, Qingji Wang
The nucleophilic index (NNu) is a key parameter in screening amine catalysts. Amines are numerous and structurally diverse, yet only a limited number exhibit an NNu value exceeding 4.0 eV, which makes them potential nucleophiles in chemical reactions. To address this, we propose a computational method that quickly identifies amines with high NNu values by combining machine learning (ML) with high-throughput density functional theory (DFT) calculations. We began by training ML models, exploring molecular fingerprint methods, and developing quantitative structure–activity relationship (QSAR) models for well-known amines, using NNu values derived from DFT calculations. Using explainable Shapley Additive Explanation (SHAP) plots, we identified five critical substructures that significantly affect the NNu values of amines. These findings were then used to generate 4920 novel hypothetical amines designed to have high NNu values. The QSAR models were used to predict the NNu values of 259 well-known and 4920 hypothetical amines, which led to the identification of five novel hypothetical amines with exceptional NNu values (>4.55 eV). The enhanced NNu values of these amines were validated by DFT calculations. One novel hypothetical amine, H1, exhibits an NNu value of 5.36 eV, surpassing the maximum value (5.35 eV) observed among the well-established amines. This strategy accelerates the discovery of highly nucleophilic amines by combining ML predictions with DFT calculations.
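The final screening step described above (predict the nucleophilic index for every candidate, keep those above a cut-off, then pass the survivors to DFT for validation) can be sketched in a few lines. This is a minimal illustration and not the authors' code: the two thresholds are the ones quoted in the abstract, while the `Candidate` class, the `shortlist_for_dft` helper, the amine labels, and all predicted values are hypothetical placeholders.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract, in eV.
NUCLEOPHILE_THRESHOLD_EV = 4.0    # above this, an amine is a potential nucleophile
EXCEPTIONAL_THRESHOLD_EV = 4.55   # cut-off used for the top hypothetical amines


@dataclass(frozen=True)
class Candidate:
    name: str             # hypothetical label, not a compound from the paper
    predicted_nnu: float  # model-predicted nucleophilic index in eV (placeholder)


def shortlist_for_dft(candidates, threshold=EXCEPTIONAL_THRESHOLD_EV):
    """Keep candidates whose predicted index strictly exceeds the threshold,
    sorted from highest to lowest predicted value."""
    hits = [c for c in candidates if c.predicted_nnu > threshold]
    return sorted(hits, key=lambda c: c.predicted_nnu, reverse=True)


# Placeholder predictions for illustration only; NOT data from the paper.
candidates = [
    Candidate("amine_A", 3.20),
    Candidate("amine_B", 4.10),
    Candidate("amine_C", 4.60),
    Candidate("amine_D", 5.10),
    Candidate("amine_E", 4.55),
]

shortlist = shortlist_for_dft(candidates)
for c in shortlist:
    print(f"{c.name}: {c.predicted_nnu:.2f} eV -> queue for DFT validation")
```

The comparison is strict, matching the ">4.55 eV" wording, so a candidate predicted at exactly the cut-off is not shortlisted. In a real workflow the predicted values would come from the trained QSAR model rather than a hand-written list.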
Updated: 2024-08-08