A landscape of complex tandem repeats within individual human genomes
Nature Communications (IF 14.7), Pub Date: 2023-09-14, DOI: 10.1038/s41467-023-41262-1
Kazuki Ichikawa, Riki Kawahara, Takeshi Asano, Shinichi Morishita

Markedly expanded tandem repeats (TRs) have been correlated with ~60 diseases. TR diversity has been considered a clue toward understanding missing heritability. However, haplotype-resolved long TRs remain mostly hidden or blacked out because their complex structures (TRs composed of various units and minisatellites containing >10-bp units) make them difficult to determine accurately with existing methods. Here, using a high-precision algorithm to determine complex TR structures from long, accurate reads of PacBio HiFi, an investigation of 270 Japanese control samples yields several genome-wide findings. Approximately 322,000 TRs are difficult to impute from the surrounding single-nucleotide variants. Greater genetic divergence of TR loci is significantly correlated with more events of younger replication slippage. Complex TRs are more abundant than single-unit TRs, and a tendency for complex TRs to consist of <10-bp units and single-unit TRs to be minisatellites is statistically significant at loci with ≥500-bp TRs. Of note, 8909 loci with extended TRs (>100 bp longer than the mode) contain several known disease-associated TRs and are considered candidates for association with disorders. Overall, complex TRs and minisatellites are found to be abundant and diverse, even in genetically small Japanese populations, yielding insights into the landscape of long TRs.
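
The "extended TR" criterion in the abstract (an allele more than 100 bases longer than the mode) can be illustrated with a small sketch. This is not the authors' pipeline: the function name `extended_alleles`, the input (a plain list of allele lengths in bp at a single locus), and the tie-breaking rule (the smallest of equally frequent lengths is taken as the mode) are all assumptions made for illustration.

```python
from collections import Counter


def extended_alleles(lengths_bp, margin_bp=100):
    """Return (mode, alleles) where alleles are strictly more than
    margin_bp longer than the modal length at one locus.

    Hypothetical helper: the input layout and tie-breaking are assumptions.
    """
    counts = Counter(lengths_bp)
    top = max(counts.values())
    # Assumption: if several lengths are equally frequent, use the smallest.
    mode = min(length for length, c in counts.items() if c == top)
    return mode, [x for x in lengths_bp if x - mode > margin_bp]


# Made-up allele lengths (bp) observed across samples at one locus.
lengths = [120, 120, 120, 126, 132, 220, 221, 480]
mode, extended = extended_alleles(lengths)
print(mode, extended)  # 120 [221, 480]
```

The comparison is strict, so the 220-bp allele (exactly 100 bp above the mode) is not flagged, while 221 bp and 480 bp are.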




Updated: 2023-09-14