Integrating Machine Learning and Large Language Models to Advance Exploration of Electrochemical Reactions
Angewandte Chemie International Edition (IF 16.1), Pub Date: 2024-12-03, DOI: 10.1002/anie.202418074
Zhiling Zheng, Federico Florit, Brooke Jin, Haoyang Wu, Shih-Cheng Li, Kakasaheb Y. Nandiwale, Chase A. Salazar, Jason G. Mustakis, William H. Green, Klavs F. Jensen

Electrochemical C-H oxidation reactions offer a sustainable route to functionalize hydrocarbons, yet identifying suitable substrates and optimizing synthesis remain challenging. We report an integrated approach combining machine learning (ML) and large language models (LLMs) to streamline the exploration of electrochemical C-H oxidation reactions. Using a batch rapid-screening electrochemical platform, we evaluated a wide range of reactions and first classified substrates by their reactivity, while LLMs text-mined literature data to augment the training set. The resulting ML models for reactivity prediction achieved high accuracy (>90%) and enabled virtual screening of a large set of commercially available molecules. To optimize reaction conditions for selected substrates, LLMs were prompted to generate code that iteratively improved yields. This human-AI collaboration proved effective, efficiently identifying high-yield conditions for 8 drug-like substances or intermediates. Notably, we benchmarked the accuracy and reliability of 10 different LLMs, including LLaMA, Claude, and GPT-4, on generating and executing ML-related code from natural language prompts given by chemists, showcasing their tool-making (code generation) and tool-use (function calling) capabilities and their potential for accelerating research across four diverse tasks. We also collected an experimental benchmark dataset comprising 1071 reaction conditions and yields for electrochemical C-H oxidation reactions.

Updated: 2024-12-03