Accurate de novo peptide sequencing using fully convolutional neural networks
Nature Communications (IF 14.7), Pub Date: 2023-12-02, DOI: 10.1038/s41467-023-43010-x
Kaiyuan Liu 1, Yuzhen Ye 1, Sujun Li 1,2, Haixu Tang 1

De novo peptide sequencing, which does not rely on a comprehensive target sequence database, provides a way to identify novel peptides from tandem mass spectra. However, current de novo sequencing algorithms suffer from low accuracy and coverage, which hinders their application in proteomics. In this paper, we present PepNet, a fully convolutional neural network for high-accuracy de novo peptide sequencing. PepNet takes an MS/MS spectrum (represented as a high-dimensional vector) as input and outputs the optimal peptide sequence along with its confidence score. The PepNet model is trained on a total of 3 million high-energy collisional dissociation MS/MS spectra from multiple human peptide spectral libraries. Evaluation results show that PepNet significantly outperforms the current best-performing de novo sequencing algorithms (e.g., PointNovo and DeepNovo) in both peptide-level and positional-level accuracy. PepNet can sequence a large fraction of the spectra that were not identified by database search engines, and thus could serve as a complementary tool to database search engines for peptide identification in proteomics. In addition, PepNet runs around 3x and 7x faster on GPUs than PointNovo and DeepNovo, respectively, making it more suitable for analyzing large-scale proteomics data.
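To make the input representation concrete, here is a minimal sketch of how an MS/MS peak list might be turned into a fixed-length vector by binning along the m/z axis. It is an illustration only, not PepNet's actual preprocessing: the function name `spectrum_to_vector`, the 2000 m/z upper limit, the bin width, keeping the strongest peak per bin, and scaling to a maximum of 1 are all assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def spectrum_to_vector(
    peaks: Sequence[Tuple[float, float]],
    max_mz: float = 2000.0,   # assumed upper m/z limit (hypothetical)
    bin_width: float = 0.1,   # assumed bin width in m/z units (hypothetical)
) -> List[float]:
    """Bin (m/z, intensity) peaks into a fixed-length intensity vector."""
    n_bins = int(round(max_mz / bin_width))
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        # Ignore peaks outside the covered m/z range.
        if mz < 0 or mz >= max_mz:
            continue
        idx = int(mz / bin_width)
        if idx >= n_bins:  # guard against floating-point edge cases
            continue
        # Assumption: keep the strongest peak that falls into each bin.
        vec[idx] = max(vec[idx], intensity)
    # Assumption: scale so the most intense bin equals 1.0.
    top = max(vec)
    if top > 0:
        vec = [v / top for v in vec]
    return vec
```

With the default settings this yields a 20,000-dimensional vector per spectrum, which is the kind of high-dimensional input a convolutional network can consume directly; a finer bin width trades a longer vector for better mass resolution.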




Updated: 2023-12-02