Refining the pool of RNA-binding domains advances the classification and prediction of RNA-binding proteins
Nucleic Acids Research (IF 16.6), Pub Date: 2024-06-25, DOI: 10.1093/nar/gkae536
Elsa Wassmer 1, Gergely Koppány 1, Malte Hermes 1, Sven Diederichs 2, Maïwen Caudron-Herger 1

From transcription to decay, RNA-binding proteins (RBPs) influence RNA metabolism. Using the RBP2GO database, which combines proteome-wide RBP screens from 13 species, we investigated the RNA-binding features of 176 896 proteins. By compiling published lists of RNA-binding domains (RBDs) and RNA-related protein family (Rfam) IDs with lists from the InterPro database, we analyzed the distribution of the RBDs and Rfam IDs in RBPs and non-RBPs to select RBDs and Rfam IDs that were enriched in RBPs. We also examined the proteins' content of intrinsically disordered regions (IDRs) and low complexity regions (LCRs). We found a strong positive correlation between IDRs and RBDs and a co-occurrence of specific LCRs. Our bioinformatic analysis indicated that RBDs/Rfam IDs were strong indicators of the RNA-binding potential of proteins and helped predict new RBP candidates, especially in less investigated species. By further analyzing RBPs without RBDs, we predicted new RBDs that were validated by RNA-bound peptides. Finally, we created the RBP2GO composite score by combining the RBP2GO score with new quality factors linked to RBDs and Rfam IDs. Based on the RBP2GO composite score, we compiled a list of 2018 high-confidence human RBPs. The knowledge collected here was integrated into the RBP2GO database at https://RBP2GO-2-Beta.dkfz.de.
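
The abstract does not say which statistic was used to decide that a domain is "enriched in RBPs". As an illustration only, the sketch below assumes a one-sided hypergeometric (Fisher-type) test together with a simple fold-enrichment ratio. The function names and the toy counts are hypothetical and are not taken from the paper or from RBP2GO.

```python
from math import comb


def fold_enrichment(k, n_with_domain, n_rbps, n_total):
    """Ratio of the RBP fraction among domain carriers to the RBP fraction overall."""
    return (k / n_with_domain) / (n_rbps / n_total)


def enrichment_p_value(k, n_with_domain, n_rbps, n_total):
    """One-sided hypergeometric P(X >= k).

    k             -- proteins that carry the domain AND are RBPs
    n_with_domain -- all proteins that carry the domain
    n_rbps        -- all RBPs in the proteome
    n_total       -- all proteins in the proteome
    """
    max_k = min(n_with_domain, n_rbps)
    favourable = sum(
        comb(n_rbps, i) * comb(n_total - n_rbps, n_with_domain - i)
        for i in range(k, max_k + 1)
    )
    return favourable / comb(n_total, n_with_domain)


# Toy counts (hypothetical): 20 proteins, 8 of them RBPs;
# a domain occurs in 5 proteins, 4 of which are RBPs.
p = enrichment_p_value(k=4, n_with_domain=5, n_rbps=8, n_total=20)
fe = fold_enrichment(k=4, n_with_domain=5, n_rbps=8, n_total=20)
print(f"fold enrichment = {fe:.2f}, one-sided p = {p:.4f}")
```

In a proteome-scale analysis, a test of this kind would be run once per domain, so the resulting p-values would need a multiple-testing correction before any cutoff is applied.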

Updated: 2024-06-25