Multi-domain knowledge graph embeddings for gene-disease association prediction
Journal of Biomedical Semantics (IF 1.6), Pub Date: 2023-08-14, DOI: 10.1186/s13326-023-00291-x
Susana Nunes, Rita T Sousa, Catia Pesquita

Predicting gene-disease associations typically requires exploring diverse sources of information as well as sophisticated computational approaches. Knowledge graph embeddings can help tackle these challenges by creating representations of genes and diseases based on the scientific knowledge described in ontologies, which can then be explored by machine learning algorithms. However, state-of-the-art knowledge graph embeddings are produced over a single ontology or over multiple but disconnected ontologies, ignoring the impact that multiple interconnected domains can have on complex tasks such as gene-disease association prediction. We propose a novel approach to predict gene-disease associations using rich semantic representations based on knowledge graph embeddings over multiple ontologies linked by logical definitions and compound ontology mappings. The experiments showed that considering richer knowledge graphs significantly improves gene-disease prediction and that different knowledge graph embedding methods benefit more from distinct types of semantic richness. This work demonstrated the potential of knowledge graph embeddings across multiple interconnected biomedical ontologies to support gene-disease prediction. It also paved the way for considering other ontologies or tackling other tasks where multiple perspectives on the data can be beneficial. All software and data are freely available.
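
The general idea of scoring a gene-disease pair from precomputed embeddings can be illustrated with a minimal sketch. This is not the authors' implementation: the entity names, the 4-dimensional vectors, the Hadamard-product pair representation, and the cosine-similarity ranking are all assumptions chosen for illustration. In practice the vectors would come from a knowledge graph embedding method run over the linked ontologies, and the pair features would be passed to a trained classifier.

```python
import math

# Toy 4-dimensional embeddings. Names and values are hypothetical placeholders,
# standing in for vectors learned from an ontology-based knowledge graph.
gene_emb = {
    "GENE_A": [0.9, 0.1, 0.0, 0.3],
    "GENE_B": [0.0, 0.8, 0.5, 0.1],
}
disease_emb = {
    "DISEASE_X": [0.8, 0.2, 0.1, 0.4],
    "DISEASE_Y": [0.1, 0.7, 0.6, 0.0],
}


def hadamard(u, v):
    """Element-wise product: one common way to merge two entity vectors
    into a single pair vector that a classifier could consume."""
    return [a * b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_features(gene, disease):
    """Feature vector for a candidate gene-disease pair (assumed operator)."""
    return hadamard(gene_emb[gene], disease_emb[disease])


def rank_diseases(gene):
    """Rank diseases for a gene by embedding similarity, an unsupervised
    baseline used here in place of a trained classifier."""
    scores = {d: cosine(gene_emb[gene], vec) for d, vec in disease_emb.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    print(pair_features("GENE_A", "DISEASE_X"))
    print(rank_diseases("GENE_A"))
```

The similarity ranking only shows that nearby vectors score higher; a supervised setup would instead train a model on `pair_features` outputs labeled with known associations.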

Updated: 2023-08-15