A Data-Driven Reaction Discovery Strategy Based on Large Language Models
ChemRxiv Pub Date : 2025-01-03 , DOI: 10.26434/chemrxiv-2025-pnjg7
Jingyang, Zhang

The discovery of novel reactions and optimization of reaction conditions are fundamental challenges in organic synthesis, with significant implications for retrosynthetic analysis and condition selection. This work proposes a data-driven strategy for reaction discovery, integrating high-throughput experimentation (HTE) with insights derived from large language models (LLMs). By leveraging LLMs to process chemical information from extensive literature, the method enables hypothesis-driven design and experimental validation, minimizing reliance on serendipity. Taking cross-electrophile coupling (XEC) as a case study, this research extracts key trends, substrate combinations, and reaction conditions from 520 relevant publications. The methodology identifies unexplored substrate pairs and designs reaction plates for HTE, facilitating systematic discovery. Additionally, the concept of directed evolution in chemical catalysis is explored, hypothesizing that catalytic conditions can evolve systematically based on structural and reactivity similarities. The findings demonstrate the utility of combining LLMs with HTE for reaction discovery and catalysis research. This approach emphasizes methodology development, prioritizing the generation of hypotheses and protocols over isolated reaction discoveries, offering a scalable framework for advancing chemical innovation.
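The step of identifying unexplored substrate pairs and laying them out on a reaction plate can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the electrophile class names, the set of "reported" pairs (standing in for literature data extracted by an LLM), the condition labels, the 96-well plate format, and the function names `unexplored_pairs` and `plate_layout`.

```python
from itertools import combinations

# Hypothetical electrophile classes; the paper's actual substrate list is not given in the abstract.
ELECTROPHILES = [
    "aryl bromide",
    "aryl chloride",
    "alkyl bromide",
    "alkyl tosylate",
    "vinyl triflate",
]

# Hypothetical placeholder for substrate pairs extracted from the literature.
REPORTED = {
    frozenset({"aryl bromide", "alkyl bromide"}),
    frozenset({"aryl chloride", "alkyl bromide"}),
    frozenset({"aryl bromide", "alkyl tosylate"}),
}


def unexplored_pairs(electrophiles, reported):
    """Return unordered pairs of distinct electrophile classes absent from `reported`."""
    return [
        pair
        for pair in combinations(electrophiles, 2)
        if frozenset(pair) not in reported
    ]


def plate_layout(pairs, conditions, rows="ABCDEFGH", n_cols=12):
    """Assign every (pair, condition) combination to a well, filling row by row."""
    experiments = [(pair, cond) for pair in pairs for cond in conditions]
    capacity = len(rows) * n_cols
    if len(experiments) > capacity:
        raise ValueError(f"{len(experiments)} experiments exceed {capacity} wells")
    layout = {}
    for index, experiment in enumerate(experiments):
        row, col = divmod(index, n_cols)
        layout[f"{rows[row]}{col + 1}"] = experiment
    return layout


if __name__ == "__main__":
    candidates = unexplored_pairs(ELECTROPHILES, REPORTED)
    # Condition labels are placeholders, not conditions from the paper.
    plate = plate_layout(candidates, ["condition A", "condition B"])
    for well, (pair, condition) in plate.items():
        print(well, pair, condition)
```

Storing each pair as a `frozenset` makes the lookup order-independent, so a pair reported in the literature as (A, B) also excludes (B, A). A real workflow would presumably rank or filter the candidates by feasibility before committing wells to them.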

Updated: 2025-01-03