Navigating Ultralarge Virtual Chemical Spaces with Product-of-Experts Chemical Language Models
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2024-10-16, DOI: 10.1021/acs.jcim.4c01214
Shuya Nakata, Yoshiharu Mori, Shigenori Tanaka

Ultralarge virtual chemical spaces have emerged as a valuable resource for drug discovery, providing access to billions of make-on-demand compounds with high synthetic success rates. Chemical language models can potentially accelerate the exploration of these vast spaces through direct compound generation. However, existing models are not designed to navigate specific virtual chemical spaces and often overlook synthetic accessibility. To address this gap, we introduce product-of-experts (PoE) chemical language models, a modular and scalable approach to navigating ultralarge virtual chemical spaces. This method allows for controlled compound generation within a desired chemical space by combining a prior model pretrained on the target space with expert and anti-expert models fine-tuned using external property-specific data sets. We demonstrate that the PoE chemical language model can generate compounds with desirable properties, such as those that favorably dock to dopamine receptor D2 (DRD2) and are predicted to cross the blood–brain barrier (BBB), while ensuring that the majority of generated compounds are present within the target chemical space. Our results highlight the potential of chemical language models for navigating ultralarge virtual chemical spaces, and we anticipate that this study will motivate further research in this direction. The source code and data are freely available at https://github.com/shuyana/poeclm.
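
The abstract does not give the combination rule, so the following is only a minimal sketch of one common product-of-experts formulation for autoregressive models, not the authors' implementation. It assumes that the next-token distribution is proportional to p_prior * (p_expert / p_anti_expert)^alpha, where the function names, the weight `alpha`, and the toy three-token vocabulary are all hypothetical and chosen for illustration. The actual method is in the linked repository.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def poe_next_token_log_probs(prior_logits, expert_logits, anti_expert_logits, alpha=1.0):
    """Combine three next-token distributions as a product of experts.

    Assumed rule (not taken from the paper):
        p(token) proportional to p_prior * (p_expert / p_anti_expert) ** alpha
    alpha is a hypothetical weight; alpha = 0 recovers the prior.
    """
    prior = log_softmax(prior_logits)
    expert = log_softmax(expert_logits)
    anti = log_softmax(anti_expert_logits)
    # Multiplying probabilities is adding log-probabilities; the anti-expert
    # enters with a negative sign so that tokens it favors are suppressed.
    combined = [p + alpha * (e - a) for p, e, a in zip(prior, expert, anti)]
    # Renormalize so the result is a valid log-probability distribution.
    return log_softmax(combined)


# Toy example over a hypothetical vocabulary ["C", "N", "O"].
prior_logits = [0.0, 0.0, 0.0]   # prior is indifferent between tokens
expert_logits = [2.0, 0.0, 0.0]  # expert prefers "C"
anti_logits = [0.0, 2.0, 0.0]    # anti-expert prefers "N"

log_probs = poe_next_token_log_probs(prior_logits, expert_logits, anti_logits)
probs = [math.exp(lp) for lp in log_probs]
```

In this toy case the combined distribution shifts mass toward "C" (favored by the expert) and away from "N" (favored by the anti-expert), while the prior keeps the output anchored to what the pretrained model considers plausible. In the setting the abstract describes, that prior is what keeps generated compounds within the target chemical space.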

Updated: 2024-10-16