Leveraging the T2T assembly to resolve rare and pathogenic inversions in reference genome gaps
Genome Research (IF 6.2), Pub Date: 2024-11-01, DOI: 10.1101/gr.279346.124
Kristine Bilgrav Saether, Jesper Eisfeldt, Jesse D. Bengtsson, Ming Yin Lun, Christopher M. Grochowski, Medhat Mahmoud, Hsiao-Tuan Chao, Jill A. Rosenfeld, Pengfei Liu, Marlene Ek, Jakob Schuy, Adam Ameur, Hongzheng Dai, Undiagnosed Diseases Network, James Paul Hwang, Fritz J. Sedlazeck, Weimin Bi, Ronit Marom, Josephine Wincent, Ann Nordgren, Claudia M.B. Carvalho, Anna Lindstrand

Chromosomal inversions (INVs) are particularly challenging to detect due to their copy-number-neutral state and association with repetitive regions. Inversions represent about 1/20 of all balanced structural chromosome aberrations and can lead to disease by disrupting genes or altering regulatory regions of dosage-sensitive genes in cis. Short-read genome sequencing (srGS) can only resolve ∼70% of cytogenetically visible inversions referred to clinical diagnostic laboratories, likely due to breakpoints in repetitive regions. Here, we study 12 inversions by long-read genome sequencing (lrGS) (n = 9) or srGS (n = 3) and resolve nine of them. In four cases, the inversion breakpoint region was missing from at least one of the human reference genomes (GRCh37, GRCh38, T2T-CHM13) and a reference-agnostic analysis was needed. One of these cases, an INV9 mappable only in de novo assembled lrGS data using T2T-CHM13, disrupts EHMT1, consistent with a Mendelian diagnosis (Kleefstra syndrome 1; MIM#610253). Next, by pairwise comparison between T2T-CHM13, GRCh37, and GRCh38, as well as the chimpanzee and bonobo, we show that hundreds of megabases of sequence are missing from at least one human reference, highlighting that primate genomes contribute to genomic diversity. Aligning population genomic data to these regions indicated that they are variable between individuals. Our analysis emphasizes that T2T-CHM13 is necessary to maximize the value of lrGS for optimal inversion detection in clinical diagnostics. These results highlight the importance of leveraging diverse and comprehensive reference genomes to resolve unsolved molecular cases in rare diseases.

Updated: 2024-11-01