Zero-shot prediction of mutation effects with multimodal deep representation learning guides protein engineering
Cell Research (IF 28.1), Pub Date: 2024-07-05, DOI: 10.1038/s41422-024-00989-2
Peng Cheng 1 , Cong Mao 2 , Jin Tang 3 , Sen Yang 1 , Yu Cheng 2 , Wuke Wang 3 , Qiuxi Gu 2 , Wei Han 3 , Hao Chen 2 , Sihan Li 2 , Yaofeng Chen 1 , Jianglin Zhou 1 , Wuju Li 1 , Aimin Pan 3 , Suwen Zhao 4, 5 , Xingxu Huang 3, 5 , Shiqiang Zhu 3 , Jun Zhang 2 , Wenjie Shu 1 , Shengqi Wang 1

Mutations in amino acid sequences can provoke changes in protein function. Accurate and unsupervised prediction of mutation effects is critical in biotechnology and biomedicine, but remains a fundamental challenge. To resolve this challenge, here we present Protein Mutational Effect Predictor (ProMEP), a general and multiple sequence alignment-free method that enables zero-shot prediction of mutation effects. A multimodal deep representation learning model embedded in ProMEP was developed to comprehensively learn both sequence and structure contexts from ~160 million proteins. ProMEP achieves state-of-the-art performance in mutational effect prediction and accomplishes a tremendous improvement in speed, enabling efficient and intelligent protein engineering. Specifically, ProMEP accurately forecasts mutational consequences on the gene-editing enzymes TnpB and TadA, and successfully guides the development of high-performance gene-editing tools with their engineered variants. The gene-editing efficiency of a 5-site mutant of TnpB reaches up to 74.04% (vs 24.66% for the wild type); and the base editing tool developed on the basis of a TadA 15-site mutant (in addition to the A106V/D108N double mutation that renders deoxyadenosine deaminase activity to TadA) exhibits an A-to-G conversion frequency of up to 77.27% (vs 69.80% for ABE8e, a previous TadA-based adenine base editor) with significantly reduced bystander and off-target effects compared to ABE8e. ProMEP not only showcases superior performance in predicting mutational effects on proteins but also demonstrates a great capability to guide protein engineering. Therefore, ProMEP enables efficient exploration of the gigantic protein space and facilitates practical design of proteins, thereby advancing studies in biomedicine and synthetic biology.
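The two headline comparisons in the abstract can be restated as relative gains. The snippet below is a minimal sketch, not part of the paper: it takes only the four percentages quoted above and computes a fold change and a percentage-point difference for each pair. The dictionary `REPORTED` and the helper `compare` are hypothetical names introduced here for illustration. Because the abstract reports "up to" values, the results describe best-case figures rather than averages.

```python
# Percentages copied from the abstract; labels and names are illustrative only.
REPORTED = {
    "TnpB 5-site mutant vs wild type": (74.04, 24.66),
    "TadA 15-site editor vs ABE8e": (77.27, 69.80),
}


def compare(variant_pct, baseline_pct):
    """Return (fold change, percentage-point gain) of variant over baseline."""
    return variant_pct / baseline_pct, variant_pct - baseline_pct


for label, (variant, baseline) in REPORTED.items():
    fold, gain = compare(variant, baseline)
    print(f"{label}: {fold:.2f}-fold, +{gain:.2f} percentage points")
```

On these figures the TnpB variant is roughly threefold more efficient than the wild type, while the TadA-based editor gains about 7.5 percentage points over ABE8e; the abstract places more weight on the latter's reduced bystander and off-target effects than on the conversion frequency itself.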




Updated: 2024-07-05