EASTR: Identifying and eliminating systematic alignment errors in multi-exon genes
Nature Communications (IF 14.7), Pub Date: 2023-11-09, DOI: 10.1038/s41467-023-43017-4
Ida Shinder 1, 2 , Richard Hu 2, 3 , Hyun Joo Ji 2, 3 , Kuan-Hao Chao 2, 3 , Mihaela Pertea 2, 3, 4, 5

Accurate alignment of transcribed RNA to reference genomes is a critical step in the analysis of gene expression, which in turn has broad applications in biomedical research and in the basic sciences. We reveal that widely used splice-aware aligners, such as STAR and HISAT2, can introduce erroneous spliced alignments between repeated sequences, leading to the inclusion of falsely spliced transcripts in RNA-seq experiments. In some cases, the ‘phantom’ introns resulting from these errors make their way into widely used genome annotation databases. To address this issue, we present EASTR (Emending Alignments of Spliced Transcript Reads), a software tool that detects and removes falsely spliced alignments or transcripts from alignment and annotation files. EASTR improves the accuracy of spliced alignments across diverse species, including human, maize, and Arabidopsis thaliana, by detecting sequence similarity between intron-flanking regions. We demonstrate that applying EASTR before transcript assembly substantially reduces false positive introns, exons, and transcripts, improving the overall accuracy of assembled transcripts. Additionally, we show that EASTR’s application to reference annotation databases can detect and correct likely cases of mis-annotated transcripts.
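
The core idea of comparing intron-flanking regions can be illustrated with a toy sketch. This is not EASTR's implementation: the abstract does not say how similarity is measured, so the sketch substitutes a simple position-wise identity score. The function names, the 15-base flank, and the 0.9 threshold are all assumptions chosen for illustration.

```python
def window_identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length windows agree."""
    if len(a) != len(b) or not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def looks_repeat_induced(genome: str, intron_start: int, intron_end: int,
                         flank: int = 15, min_identity: float = 0.9) -> bool:
    """Flag an intron whose two splice-site windows are near-identical.

    Coordinates are 0-based and half-open: genome[intron_start:intron_end]
    is the intronic sequence. Both parameters and the threshold are
    hypothetical values, not settings taken from EASTR.
    """
    # Window centred on the 5' splice site (exon bases, then intron bases).
    donor = genome[max(0, intron_start - flank):intron_start + flank]
    # Window centred on the 3' splice site (intron bases, then exon bases).
    acceptor = genome[max(0, intron_end - flank):intron_end + flank]
    return window_identity(donor, acceptor) >= min_identity


# Toy reference: the same 30-base element appears twice, 50 bases apart.
repeat = "ACGTTGCAAGGCTTACCGATCAGGTACCAT"
genome = "T" * 20 + repeat + "G" * 50 + repeat + "C" * 20

# An "intron" running from the middle of one copy to the middle of the other
# is flagged; one ending in the unique spacer is not.
print(looks_repeat_induced(genome, 35, 115))
print(looks_repeat_induced(genome, 35, 75))
```

An intron whose two ends sit in copies of the same repeat can be explained by a read aligning contiguously to either copy, which is why high flank similarity is treated here as a warning sign. A real tool would need a proper alignment step rather than fixed-offset identity, since repeats are rarely perfectly in register.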




Updated: 2023-11-09