Nature Machine Intelligence (IF 18.8), Pub Date: 2024-10-23, DOI: 10.1038/s42256-024-00915-6
Jialin He, Lei Xiong, Shaohui Shi, Chengyu Li, Kexuan Chen, Qianchen Fang, Jiuhong Nan, Ke Ding, Yuanhui Mao, Carles A. Boix, Xinyang Hu, Manolis Kellis, Jingyun Li, Xushen Xiong
Gene expression involves transcription and translation. Despite large datasets and increasingly powerful methods devoted to calculating genetic variants’ effects on transcription, discrepancy between messenger RNA and protein levels hinders the systematic interpretation of the regulatory effects of disease-associated variants. Accurate models of the sequence determinants of translation are needed to close this gap and to interpret disease-associated variants that act on translation. Here we present Translatomer, a multimodal transformer framework that predicts cell-type-specific translation from messenger RNA expression and gene sequence. We train the Translatomer on 33 tissues and cell lines, and show that the inclusion of sequence improves the prediction of ribosome profiling signal, indicating that the Translatomer captures sequence-dependent translational regulatory information. The Translatomer achieves accuracies of 0.72 to 0.80 for the de novo prediction of cell-type-specific ribosome profiling. We develop an in silico mutagenesis tool to estimate mutational effects on translation and demonstrate that variants associated with translation regulation are evolutionarily constrained, both in the human population and across species. In particular, we identify cell-type-specific translational regulatory mechanisms independent of the expression quantitative trait loci for 3,041 non-coding and synonymous variants associated with complex diseases, including Alzheimer’s disease, schizophrenia and congenital heart disease. The Translatomer accurately models the genetic underpinnings of translation, bridging the gap between messenger RNA and protein levels as well as providing valuable mechanistic insights for uninterpreted disease variants.
Title: Deep learning prediction of ribosome profiling with Translatomer reveals translational regulation and interprets disease variants
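The abstract mentions an in silico mutagenesis tool for estimating mutational effects on translation. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below scores every single-nucleotide substitution in a sequence as the change in a model's prediction relative to the reference. The names `in_silico_mutagenesis` and `toy_predict` are hypothetical, and `toy_predict` (GC fraction) is merely a stand-in for a trained predictor such as the Translatomer, whose real interface is not described here.

```python
from typing import Callable, Dict, Tuple

BASES = "ACGT"


def in_silico_mutagenesis(
    seq: str, predict: Callable[[str], float]
) -> Dict[Tuple[int, str], float]:
    """Score each single-nucleotide substitution as predict(mutant) - predict(reference).

    Keys are (0-based position, alternative base); values are the predicted change.
    """
    seq = seq.upper()
    ref_score = predict(seq)
    effects: Dict[Tuple[int, str], float] = {}
    for pos, ref_base in enumerate(seq):
        for alt in BASES:
            if alt == ref_base:
                continue  # skip the reference base itself
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[(pos, alt)] = predict(mutant) - ref_score
    return effects


def toy_predict(seq: str) -> float:
    """Hypothetical stand-in for a trained model: fraction of G/C bases."""
    return sum(base in "GC" for base in seq) / len(seq)


effects = in_silico_mutagenesis("ATGC", toy_predict)
# Rank substitutions by the magnitude of their predicted effect.
ranked = sorted(effects.items(), key=lambda item: abs(item[1]), reverse=True)
```

In practice the predictor would be the trained model evaluated in a given cell type, and the per-variant scores could then be compared against constraint or disease-association data, as the abstract describes.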