Nature Communications (IF 14.7), Pub Date: 2023-04-18, DOI: 10.1038/s41467-023-37462-4
Bo Wen, Bing Zhang
We present PepQuery2, which leverages a new tandem mass spectrometry (MS/MS) data indexing approach to enable ultrafast, targeted identification of novel and known peptides in any local or publicly available MS proteomics datasets. The stand-alone version of PepQuery2 allows directly searching more than one billion indexed MS/MS spectra in the PepQueryDB or any public datasets from PRIDE, MassIVE, iProX, or jPOSTrepo, whereas the web version enables users to search datasets in PepQueryDB with a user-friendly interface. We demonstrate the utilities of PepQuery2 in a wide range of applications including detecting proteomic evidence for genomically predicted novel peptides, validating novel and known peptides identified using spectrum-centric database searching, prioritizing tumor-specific antigens, identifying missing proteins, and selecting proteotypic peptides for targeted proteomics experiments. By putting public MS proteomics data directly into the hands of scientists, PepQuery2 opens many new ways to transform these data into useful information for the broad research community.
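The abstract does not describe how the MS/MS index is built. As a rough illustration of the general idea behind peptide-centric lookup, the sketch below sorts spectra by neutral precursor mass and uses binary search to pull out the spectra whose precursor mass matches a query peptide within a ppm tolerance. This is an assumption-laden toy, not PepQuery2's actual index or API: the names `PrecursorIndex`, `peptide_mass`, and `candidates` are hypothetical, modifications and charge states are ignored, and the later step of scoring candidate spectra against the peptide is omitted.

```python
from bisect import bisect_left, bisect_right

# Standard monoisotopic residue masses in daltons (unmodified amino acids only).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the peptide termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


class PrecursorIndex:
    """Hypothetical toy index: spectra sorted by neutral precursor mass."""

    def __init__(self, spectra):
        # spectra: iterable of (spectrum_id, neutral_precursor_mass) pairs
        self._entries = sorted(spectra, key=lambda entry: entry[1])
        self._masses = [mass for _, mass in self._entries]

    def candidates(self, peptide, tol_ppm=10.0):
        """Return IDs of spectra whose precursor mass is within tol_ppm of the peptide."""
        mass = peptide_mass(peptide)
        delta = mass * tol_ppm * 1e-6
        lo = bisect_left(self._masses, mass - delta)
        hi = bisect_right(self._masses, mass + delta)
        return [spectrum_id for spectrum_id, _ in self._entries[lo:hi]]


index = PrecursorIndex([("s1", 146.0691), ("s2", 146.2000), ("s3", 500.0000)])
matches = index.candidates("GA", tol_ppm=10.0)
```

Because the masses are sorted once, each peptide query costs two binary searches rather than a scan over every spectrum, which is the kind of saving that makes targeted searching of very large spectral collections practical.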
Title: PepQuery2 democratizes public MS proteomics data for rapid peptide searching