Accurate and robust protein sequence design with CarbonDesign
Nature Machine Intelligence (IF 18.8), Pub Date: 2024-05-23, DOI: 10.1038/s42256-024-00838-2
Milong Ren, Chungong Yu, Dongbo Bu, Haicang Zhang

Protein sequence design is critically important for protein engineering. Despite recent advances in deep learning-based methods, achieving accurate and robust sequence design remains a challenge. Here we present CarbonDesign, an approach that draws inspiration from successful ingredients of AlphaFold and has been developed specifically for protein sequence design. At its core, CarbonDesign introduces two components: Inverseformer, which learns representations from backbone structures, and an amortized Markov random field model for sequence decoding. Moreover, we incorporate other essential AlphaFold concepts into CarbonDesign: an end-to-end network recycling technique to leverage evolutionary constraints from protein language models, and a multitask learning technique for generating side-chain structures alongside designed sequences. CarbonDesign outperforms other methods on independent test sets, including the 15th Critical Assessment of protein Structure Prediction (CASP15) dataset, the Continuous Automated Model Evaluation (CAMEO) dataset and de novo proteins from RFDiffusion. Furthermore, it supports zero-shot prediction of the functional effects of sequence variants, making it a promising tool for applications in bioengineering.
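
To make the idea of decoding a sequence from a Markov random field more concrete, here is a minimal sketch in Python. It is not CarbonDesign's implementation. It assumes a generic pairwise MRF with single-site potentials `h` and pairwise potentials `J`, which in an amortized setting would be predicted by a network from the backbone structure. Here they are filled with random numbers. The decoder shown is iterated conditional modes (ICM), a simple greedy stand-in chosen for illustration; the abstract does not say which decoding procedure the authors use. The names `mrf_score` and `icm_decode` are hypothetical.

```python
import numpy as np

def mrf_score(seq, h, J):
    """Unnormalized log-probability of `seq` under a pairwise MRF.

    h: (L, A) single-site potentials; J: (L, L, A, A) pairwise potentials,
    assumed symmetric in the sense J[i, j, a, b] == J[j, i, b, a].
    """
    L = len(seq)
    score = sum(h[i, seq[i]] for i in range(L))
    for i in range(L):
        for j in range(i + 1, L):
            score += J[i, j, seq[i], seq[j]]
    return float(score)

def icm_decode(h, J, n_sweeps=10):
    """Greedy decoding by iterated conditional modes (illustrative only)."""
    L, _ = h.shape
    seq = h.argmax(axis=1)  # start from the best residue at each site
    for _ in range(n_sweeps):
        changed = False
        for i in range(L):
            # Score of every residue at site i given the current neighbours.
            cond = h[i].copy()
            for j in range(L):
                if j != i:
                    cond += J[i, j, :, seq[j]]
            best = int(cond.argmax())
            if best != seq[i]:
                seq[i] = best
                changed = True
        if not changed:  # reached a local optimum
            break
    return seq

# Toy potentials standing in for network outputs (assumption: 20 residue types).
rng = np.random.default_rng(0)
L, A = 6, 20
h = rng.normal(size=(L, A))
J = rng.normal(scale=0.1, size=(L, L, A, A))
J = 0.5 * (J + J.transpose(1, 0, 3, 2))  # enforce J[i,j,a,b] == J[j,i,b,a]

init = h.argmax(axis=1)
decoded = icm_decode(h, J)
```

Because each ICM update picks the highest-scoring residue at one site while holding the others fixed, the decoded sequence never scores lower than the site-wise starting point. It only reaches a local optimum, so sampling-based or exact decoders may be preferable in practice.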




Updated: 2024-05-23