Pretraining graph transformer for molecular representation with fusion of multimodal information
Information Fusion (IF 14.7) Pub Date: 2024-11-14, DOI: 10.1016/j.inffus.2024.102784
Ruizhe Chen, Chunyan Li, Longyue Wang, Mingquan Liu, Shugao Chen, Jiahao Yang, Xiangxiang Zeng

Molecular representation learning (MRL) is essential in applications such as drug discovery and the life sciences. Despite advances in multiview and multimodal learning for MRL, existing models have explored only a limited range of perspectives, and the fusion of different views and modalities in MRL remains underexplored. Moreover, obtaining geometric conformers of molecules is infeasible in many tasks due to the high computational cost. Designing a general-purpose pretraining model for MRL is therefore worthwhile yet challenging. This paper proposes MolGT, a novel graph Transformer pretraining framework that fuses node and graph views along with the 2D topology and 3D geometry modalities of molecules. MolGT integrates node-level and graph-level pretext tasks on 2D topology and 3D geometry, leveraging a customized modality-shared graph Transformer that offers parameter efficiency and knowledge sharing across modalities. Moreover, MolGT can produce implicit 3D geometry through contrastive learning between the 2D topological and 3D geometric modalities. We provide extensive experiments and in-depth analyses, verifying that MolGT can (1) leverage multiview and multimodal information to represent molecules accurately, and (2) infer nearly identical results from 2D molecules alone, without the expensive computation of generating conformers. Code is available on GitHub: https://github.com/robbenplus/MolGT.
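The abstract's key mechanism, aligning 2D-topology and 3D-geometry embeddings via contrastive learning, is commonly realized with a symmetric InfoNCE objective. The sketch below illustrates that general idea in NumPy; it is not the authors' implementation, and the function name, temperature value, and embedding shapes are illustrative assumptions, not details from the paper.

```python
import numpy as np

def info_nce_loss(z_2d, z_3d, temperature=0.1):
    """Symmetric InfoNCE-style loss aligning two modality embeddings.

    z_2d, z_3d: (batch, dim) arrays of per-molecule embeddings from the
    2D-topology and 3D-geometry branches; matching rows are positive pairs.
    (Illustrative sketch only, not MolGT's actual objective.)
    """
    # L2-normalize so dot products become cosine similarities
    z_2d = z_2d / np.linalg.norm(z_2d, axis=1, keepdims=True)
    z_3d = z_3d / np.linalg.norm(z_3d, axis=1, keepdims=True)
    logits = z_2d @ z_3d.T / temperature       # (batch, batch) similarity matrix
    labels = np.arange(len(z_2d))              # diagonal entries are positives

    def xent(l):
        # numerically stable cross-entropy with the diagonal as targets
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average over both retrieval directions (2D -> 3D and 3D -> 2D)
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
# Perfectly aligned embeddings should incur a lower loss than random ones
print(info_nce_loss(z, z), info_nce_loss(z, rng.normal(size=(8, 16))))
```

In training, minimizing such a loss pulls each molecule's 2D embedding toward its own 3D embedding and away from other molecules', which is one standard way a 2D branch can acquire implicit geometric information.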

Updated: 2024-11-14