GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model
Cell Research (IF 28.1), Pub Date: 2024-10-08, DOI: 10.1038/s41422-024-01034-y
Xiaodong Yang, Guole Liu, Guihai Feng, Dechao Bu, Pengfei Wang, Jie Jiang, Shubai Chen, Qinmeng Yang, Hefan Miao, Yiyang Zhang, Zhenpeng Man, Zhongming Liang, Zichen Wang, Yaning Li, Zheng Li, Yana Liu, Yao Tian, Wenhao Liu, Cong Li, Ao Li, Jingxi Dong, Zhilong Hu, Chen Fang, Lina Cui, Zixu Deng, Haiping Jiang, Wentao Cui, Jiahao Zhang, Zhaohui Yang, Handong Li, Xingjian He, Liqun Zhong, Jiaheng Zhou, Zijian Wang, Qingqing Long, Ping Xu, Hongmei Wang, Zhen Meng, Xuezhi Wang, Yangang Wang, Yong Wang, Shihua Zhang, Jingtao Guo, Yi Zhao, Yuanchun Zhou, Fei Li, Jing Liu, Yiqiang Chen, Ge Yang, Xin Li

Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive dataset of over 120 million human and mouse single-cell transcriptomes. After data preprocessing, we obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model, named GeneCompass. During pre-training, GeneCompass effectively integrated four types of prior biological knowledge to enhance our understanding of gene regulatory mechanisms in a self-supervised manner. By fine-tuning for multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse applications for a single species and unlocked new realms of cross-species biological investigations. We also employed GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce the differentiation of human embryonic stem cells into the gonadal fate. Overall, GeneCompass demonstrates the advantages of using artificial intelligence technology to decipher universal gene regulatory mechanisms and shows tremendous potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.




Updated: 2024-10-08