Challenges in identifying mRNA transcript starts and ends from long-read sequencing data
Genome Research (IF 6.2), Pub Date: 2024-11-01, DOI: 10.1101/gr.279559.124
Ezequiel Calvo-Roitberg, Rachel F. Daniels, Athma A. Pai

Long-read sequencing (LRS) technologies have the potential to revolutionize scientific discoveries in RNA biology through the comprehensive identification and quantification of full-length mRNA isoforms. Despite great promise, challenges remain in the widespread implementation of LRS technologies for RNA-based applications, including concerns about low coverage, high sequencing error, and the robustness of computational pipelines. Although much focus has been placed on defining mRNA exon composition and structure with LRS data, the ability to assess the terminal ends of isoforms, specifically transcription start and end sites, has been characterized less carefully. Such characterization is crucial for completely delineating full mRNA molecules and their regulatory consequences. However, there are substantial inconsistencies in both the start and end coordinates of LRS reads spanning a gene, such that LRS reads often fail to accurately recapitulate annotated or empirically derived terminal ends of mRNA molecules. Here, we describe the specific challenges of identifying and quantifying mRNA terminal ends with LRS technologies and how these issues influence biological interpretations of LRS data. We then review recent experimental and computational advances designed to alleviate these problems, with ideal use cases for each approach. Finally, we outline anticipated developments and necessary improvements for the characterization of terminal ends from LRS data.

Updated: 2024-11-01