Detecting m6A RNA modification from nanopore sequencing using a semisupervised learning framework
Genome Research (IF 6.2), Pub Date: 2024-11-01, DOI: 10.1101/gr.278960.124
Haotian Teng, Marcus Stoiber, Ziv Bar-Joseph, Carl Kingsford

Direct nanopore-based RNA sequencing can be used to detect posttranscriptional base modifications, such as N6-methyladenosine (m6A) methylation, based on the electric current signals produced by the distinct chemical structures of modified bases. A key challenge is the scarcity of adequate training data with known methylation modifications. We present Xron, a hybrid encoder–decoder framework that delivers a direct methylation-distinguishing basecaller by training on synthetic RNA data and immunoprecipitation (IP)-based experimental data in two steps. First, we generate data with more diverse modification combinations through in silico cross-linking. Second, we use this data set to train an end-to-end neural network basecaller followed by fine-tuning on IP-based experimental data with label smoothing. The trained neural network basecaller outperforms existing methylation detection methods on both read-level and site-level prediction scores. Xron is a standalone, end-to-end m6A-distinguishing basecaller capable of detecting methylated bases directly from raw sequencing signals, enabling de novo methylome assembly.
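
The abstract mentions fine-tuning with label smoothing but gives no details. As a minimal sketch of the general technique only (not Xron's actual implementation), the Python below softens a hard class label into a target distribution and computes the cross-entropy against a predicted distribution. The two-class setup (index 0 = unmodified A, index 1 = m6A), the function names, and the value `epsilon=0.1` are all assumptions made for illustration.

```python
import math


def smooth_labels(label, num_classes, epsilon=0.1):
    """Return a smoothed one-hot target: (1 - epsilon) * one_hot + epsilon / K."""
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    # Spread epsilon uniformly over all classes, then add the remaining
    # probability mass to the labeled class.
    target = [epsilon / num_classes] * num_classes
    target[label] += 1.0 - epsilon
    return target


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# Hypothetical example: a read position labeled as m6A (class 1).
# epsilon = 0.1 is an illustrative value, not one taken from the paper.
target = smooth_labels(1, num_classes=2, epsilon=0.1)
loss = cross_entropy(target, [0.2, 0.8])
```

With smoothed targets the loss no longer rewards pushing the predicted probability all the way to 1, which is a common reason to use the technique when labels may be imperfect.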

Updated: 2024-11-01