Enhanced detection of RNA modifications and read mapping with high-accuracy nanopore RNA basecalling models
Genome Research (IF 6.2), Pub Date: 2024-11-01, DOI: 10.1101/gr.278849.123
Gregor Diensthuber 1 , Leszek P Pryszcz 2 , Laia Llovera 2 , Morghan C Lucas 2 , Anna Delgado-Tejedor 1 , Sonia Cruciani 1 , Jean-Yves Roignant 3 , Oguzhan Begik 2 , Eva Maria Novoa 4

In recent years, nanopore direct RNA sequencing (DRS) has become a valuable tool for studying the epitranscriptome, owing to its ability to detect multiple modifications within the same full-length native RNA molecules. Although RNA modifications can be identified as systematic basecalling “errors” in DRS data sets, N6-methyladenosine (m6A) modifications produce relatively weak “error” signals compared with other RNA modifications, which limits this approach to m6A sites that are modified at high stoichiometries. Here, we demonstrate that the use of alternative RNA basecalling models, trained with fully unmodified sequences, increases the “error” signal of m6A, leading to enhanced detection and improved sensitivity even at low stoichiometries. Moreover, we find that high-accuracy alternative RNA basecalling models can reach up to 97% median basecalling accuracy, outperforming currently available RNA basecalling models, which show 91% median basecalling accuracy. Notably, the use of high-accuracy basecalling models is accompanied by a significant increase in the number of mapped reads, especially in shorter RNA fractions, and by increased basecalling error signatures at pseudouridine (Ψ)- and N1-methylpseudouridine (m1Ψ)-modified sites. Overall, our work demonstrates that alternative RNA basecalling models can be used to improve the detection of RNA modifications, read mappability, and basecalling accuracy in nanopore DRS data sets.

Updated: 2024-11-01