Nature Communications (IF 14.7), Pub Date: 2023-05-04, DOI: 10.1038/s41467-023-38192-3
Zhenxing Wu, Jike Wang, Hongyan Du, Dejun Jiang, Yu Kang, Dan Li, Peichen Pan, Yafeng Deng, Dongsheng Cao, Chang-Yu Hsieh, Tingjun Hou
Graph neural networks (GNNs) have been widely used in molecular property prediction, but explaining their black-box predictions remains a challenge. Most existing explanation methods for GNNs in chemistry attribute model predictions to individual nodes, edges or fragments that are not necessarily derived from a chemically meaningful segmentation of molecules. To address this challenge, we propose a method named substructure mask explanation (SME). SME is based on well-established molecular segmentation methods and provides interpretations that align with the understanding of chemists. We apply SME to elucidate how GNNs learn to predict aqueous solubility, genotoxicity, cardiotoxicity and blood–brain barrier permeation for small molecules. Beyond agreeing with chemical intuition, these interpretations alert chemists to unreliable model performance and guide them in structural optimization for target properties. Hence, we believe that SME empowers chemists to confidently mine structure-activity relationships (SAR) from reliable GNNs through a transparent inspection of how GNNs pick up useful signals when learning from data.
Title: Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking
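The masking idea behind SME can be illustrated with a minimal sketch. It assumes that a substructure's attribution is the difference between the model's prediction for the intact molecule and its prediction with that substructure's atoms masked; the abstract does not spell out the formula, so this is an assumption rather than the authors' exact procedure. The `toy_model` function, the per-atom numbers and the fragment names are all hypothetical stand-ins for a trained GNN and a chemically meaningful segmentation.

```python
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical toy "molecule": one scalar feature per atom.
AtomFeatures = Sequence[float]
Model = Callable[[AtomFeatures, Set[int]], float]


def toy_model(atom_features: AtomFeatures, masked: Set[int]) -> float:
    """Stand-in for a trained GNN readout: sums the features of unmasked atoms."""
    return sum(x for i, x in enumerate(atom_features) if i not in masked)


def substructure_attributions(
    model: Model,
    atom_features: AtomFeatures,
    substructures: Dict[str, List[int]],
) -> Dict[str, float]:
    """Assumed attribution: full prediction minus prediction with the substructure masked."""
    full = model(atom_features, set())
    return {
        name: full - model(atom_features, set(atoms))
        for name, atoms in substructures.items()
    }


# Made-up per-atom contributions and fragment assignment, for illustration only.
features = [0.5, 0.25, -1.0, -0.5, 2.0]
fragments = {"ring": [0, 1], "alkyl": [2, 3], "hydroxyl": [4]}

scores = substructure_attributions(toy_model, features, fragments)

# Rank substructures by the magnitude of their effect on the prediction.
for name, score in sorted(scores.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {score:+.2f}")
```

In this toy setting the hydroxyl fragment receives the largest positive attribution and the alkyl fragment a negative one. In a realistic setting the fragment dictionary would come from a molecular segmentation method (such as BRICS fragments, Murcko scaffolds or functional groups), and `model` would wrap a trained GNN whose masking is applied to the atom representations of the chosen substructure.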