Motif distribution in genomes gives insights into gene clustering and co-regulation
Nucleic Acids Research (IF 16.6), Pub Date: 2024-12-11, DOI: 10.1093/nar/gkae1178
Atreyi Chakraborty, Sumant Chopde, Mallur Srivatsan Madhusudhan
We read the genome as proteins in the cell would: by studying the distributions of 5–6 base DNA motifs across the whole genome or across smaller stretches, such as whole chromosomes or parts of them. This led us to several findings about motif clustering and chromosome organization. The motif distribution in genomes is clearly not random at the length scales we examined, from 1 kb to entire chromosomes. The observed-to-expected (OE) ratios of motif distributions show strong correlations in pairs of chromosomes that are susceptible to translocations. With the aid of examples, we suggest that similarity in motif distributions in the promoter regions of genes could imply co-regulation. A simple extension of this idea makes it possible to construct gene regulatory networks. Further, we could make inferences about the spatial proximity of genomic fragments using these motif distributions. Spatially proximal regions, as deduced by Hi-C or pcHi-C, were ∼3.5 times more likely to have correlated motif distributions than non-proximal regions. These correlations had strong contributions from the motifs recognized by the CTCF protein, which are known markers of topologically associating domains. In general, correlating genomic regions by comparing motif distributions alone yields a wealth of functional information.
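To make the idea of comparing observed-to-expected (OE) motif ratios concrete, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not say how the expected counts are computed, so this sketch assumes a simple background model in which the bases are independent and drawn at the region's own single-base frequencies. The function names (`kmer_counts`, `oe_ratios`, `motif_correlation`) and the choice of Pearson correlation are likewise illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping windows that contain non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(b in BASES for b in kmer):
            counts[kmer] += 1
    return counts


def oe_ratios(seq, k=5):
    """Return {k-mer: observed / expected} for every possible k-mer.

    ASSUMPTION: the expected count comes from an independent-base model
    using the region's own base frequencies. The paper may use a
    different background model.
    """
    seq = seq.upper()
    base_counts = Counter(b for b in seq if b in BASES)
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    counts = kmer_counts(seq, k)
    n_windows = sum(counts.values())
    ratios = {}
    for kmer in map("".join, product(BASES, repeat=k)):
        p = 1.0
        for b in kmer:
            p *= base_counts[b] / total
        expected = p * n_windows
        ratios[kmer] = counts[kmer] / expected if expected > 0 else 0.0
    return ratios


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def motif_correlation(seq_a, seq_b, k=5):
    """Correlate the OE-ratio vectors of two genomic regions (hypothetical helper)."""
    ra, rb = oe_ratios(seq_a, k), oe_ratios(seq_b, k)
    kmers = sorted(ra)  # both dicts hold the same 4**k keys
    return pearson([ra[m] for m in kmers], [rb[m] for m in kmers])
```

Under these assumptions, `motif_correlation(region_a, region_b, k=5)` gives a single score per pair of regions, for example two promoter sequences or two 1 kb fragments. Scores for many pairs could then be compared against contact data such as Hi-C, in the spirit of the comparison the abstract describes.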
Updated: 2024-12-11